I’ve played around with the Matasano Crypto Challenges several times in the last few years. I finally decided to just sit down and go through them all - partly to prove to myself that I could actually understand crypto problems, if only I tried. It also seemed like a great opportunity to brush up on my Ruby, as a long-time Python advocate.
This challenge should have been simple, but since I don’t know much Ruby, I ended up spending a lot of time muddling through Ruby’s confusing unpack syntax. While it is documented fairly thoroughly here, I’ve yet to find a reason why pack('H*') converts a hex string to a byte string, while pack('m') converts a byte string to base64. To me, it seems like one of these should be unpack, but then, I’m coming from Python where this would be
I ended up writing short utility functions so that I didn’t have to think about
which way I wanted to go anymore. I’ve used
m0 for encoding base64 values,
so that they all appear on one line, but
m for decoding, as that’s the format
all challenges in this set are given in.
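A minimal sketch of what such utility functions could look like. The helper names are hypothetical and chosen for illustration; they are not taken from the actual solutions.

```ruby
# Hypothetical helper names, for illustration only.

# Hex string -> raw byte string.
def hex_to_bytes(hex)
  [hex].pack('H*')
end

# Raw byte string -> hex string.
def bytes_to_hex(bytes)
  bytes.unpack('H*').first
end

# Raw byte string -> base64 on a single line ('m0' adds no line feeds).
def bytes_to_base64(bytes)
  [bytes].pack('m0')
end

# Base64 (possibly wrapped over several lines) -> raw byte string.
def base64_to_bytes(b64)
  b64.unpack('m').first
end
```

With these in place, converting hex to base64 is just `bytes_to_base64(hex_to_bytes(hex))`, and the pack/unpack direction only has to be worked out once.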
I originally solved this challenge by converting the hex strings to integers with to_i(16); however, this led to problems later - I needed to convert byte strings to hex before xoring them, and I also ended up hackily padding the result back to its original length with zeros whenever the leading bytes xored to null. I think xoring char by char is probably a more natural way to think about it.
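As a rough sketch of the char-by-char approach, assuming both inputs are raw byte strings of equal length (the function name `xor_bytes` is made up for this example):

```ruby
# Hypothetical helper: XOR two equal-length byte strings byte by byte.
def xor_bytes(a, b)
  raise ArgumentError, 'inputs must be the same length' unless a.bytesize == b.bytesize

  # Pair up the bytes, XOR each pair, and pack the results back into a string.
  a.bytes.zip(b.bytes).map { |x, y| x ^ y }.pack('C*')
end
```

Because the result is built one byte at a time, leading null bytes survive and no zero-padding is needed afterwards.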
The pertinent part of this challenge is figuring out how to score a string. All samples up to this point were in ASCII, which only uses 7 bits, so I rejected strings containing bytes with the high bit set. I also rejected strings with bytes lower than \x10, since those are ASCII control characters. Finally, we’ll define a set of good characters, and score the string on the percentage of those it has: alphanumerics and whitespace.
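A scoring function along these lines might look like the sketch below. The names `GOOD_BYTES`, `MIN_BYTE` and `score` are assumptions for illustration. Note that with the \x10 cut-off taken literally, tab and newline are rejected as well, so the only whitespace that can actually count here is the space character.

```ruby
# Hypothetical names throughout; thresholds follow the description in the text.

# Bytes counted as "good": ASCII letters, digits and the space character.
GOOD_BYTES = [*('a'..'z'), *('A'..'Z'), *('0'..'9'), ' '].map(&:ord).freeze

# Anything below this is treated as a control character and rejected.
MIN_BYTE = 0x10

# Returns a score between 0.0 and 1.0; 0.0 means "rejected outright".
def score(candidate)
  bytes = candidate.bytes
  return 0.0 if bytes.empty?
  # Reject non-ASCII bytes (high bit set) and low control characters.
  return 0.0 if bytes.any? { |b| b >= 0x80 || b < MIN_BYTE }

  bytes.count { |b| GOOD_BYTES.include?(b) }.to_f / bytes.length
end
```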
This challenge is fairly simple - find the best single-xor result for each string, then find the best result from among those results.
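The two-level search can be sketched roughly as follows. The scoring function is passed in as a block so the sketch stays independent of any particular scorer; `Candidate`, `best_single_xor` and `detect_single_xor` are hypothetical names.

```ruby
# Hypothetical names; the scorer is supplied by the caller as a block.
Candidate = Struct.new(:score, :key, :plaintext)

# Try all 256 single-byte keys and keep the highest-scoring decryption.
def best_single_xor(ciphertext, &scorer)
  (0..255).map do |key|
    plaintext = ciphertext.bytes.map { |b| b ^ key }.pack('C*')
    Candidate.new(scorer.call(plaintext), key, plaintext)
  end.max_by(&:score)
end

# Find the best candidate for each ciphertext, then the best of those.
def detect_single_xor(ciphertexts, &scorer)
  ciphertexts.map { |ct| best_single_xor(ct, &scorer) }.max_by(&:score)
end
```

Returning the key and plaintext together with the score means the outer search does not have to redo any decryption.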
Again, fairly trivial - the hardest part of this challenge is figuring out how to pad the key to the right length. I’m not completely satisfied with my solution in the end, but it does work.
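One way to sidestep the padding question entirely is to index into the key modulo its length rather than building a padded copy. This is a sketch of that alternative, not necessarily the approach described above; `repeating_key_xor` is a made-up name.

```ruby
# Hypothetical helper: XOR each plaintext byte with the key byte at the
# same position, wrapping around when the key runs out.
def repeating_key_xor(plaintext, key)
  raise ArgumentError, 'key must not be empty' if key.empty?

  key_bytes = key.bytes
  plaintext.bytes.each_with_index.map do |byte, i|
    byte ^ key_bytes[i % key_bytes.length]
  end.pack('C*')
end
```

Since xor is its own inverse, calling the same function on the ciphertext with the same key gives the plaintext back.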
The warning on this page is correct - this challenge was definitely the most frustrating of this set. Writing the edit distance function was not particularly difficult, though I did debate the efficiency of converting the integer to a binary string. I originally wrote this using an extra nested loop that bit-shifted the xored result, but decided that converting to a binary string was clearer.
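The binary-string version of the edit distance can be sketched in a few lines. The name `hamming_distance` is an assumption, and the inputs are assumed to be byte strings of equal length.

```ruby
# Hypothetical helper: number of differing bits between two byte strings.
def hamming_distance(a, b)
  raise ArgumentError, 'inputs must be the same length' unless a.bytesize == b.bytesize

  # XOR each byte pair, render it in binary, and count the 1 bits.
  a.bytes.zip(b.bytes).sum { |x, y| (x ^ y).to_s(2).count('1') }
end
```

The test pair given in the challenge, "this is a test" and "wokka wokka!!!", should come out at 37.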
Figuring out the best key size was slightly fiddly, but mostly just because I
wasn’t familiar with Ruby’s slicing syntax. I’m finding the
trick very frustrating - I wish Ruby had a
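For reference, the key size search the challenge suggests (average the edit distance between the first few key-size blocks, normalised by key size, and keep the smallest results) can be sketched like this. All names and default values are assumptions for illustration, and `byteslice` is used here for the slicing.

```ruby
# Hypothetical names and defaults; a sketch of normalised-distance ranking.

# Number of differing bits between two equal-length byte strings.
def hamming_distance(a, b)
  a.bytes.zip(b.bytes).sum { |x, y| (x ^ y).to_s(2).count('1') }
end

# Returns the `top` most likely key sizes, best first.
def rank_key_sizes(ciphertext, sizes: (2..40), blocks: 4, top: 3)
  scored = sizes.map do |size|
    # Take the first `blocks` consecutive chunks of `size` bytes.
    chunks = (0...blocks).map { |i| ciphertext.byteslice(i * size, size) }
    # Skip sizes the ciphertext is too short to evaluate.
    next nil if chunks.any? { |c| c.nil? || c.bytesize < size }

    pairs = chunks.combination(2).to_a
    average = pairs.sum { |a, b| hamming_distance(a, b) }.to_f / pairs.length
    [size, average / size]
  end
  scored.compact.sort_by { |_, distance| distance }.first(top).map(&:first)
end
```

Multiples of the true key size tend to score well too, which is one reason to keep several candidates rather than only the best one.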
The final step of determining which of the top N key sizes was valid was also a bit annoying - I forgot to discount any blocks with invalid characters the first time, which led to some weird edge cases. There was also a bit of mucking around with Ruby arrays required to get .transpose working correctly.
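One likely source of trouble is that Array#transpose raises an IndexError when the rows differ in length, which happens whenever the ciphertext is not a multiple of the key size. A sketch of one workaround, padding the last row with nil and stripping it again afterwards (`transpose_blocks` is a hypothetical name):

```ruby
# Hypothetical helper: group the ciphertext bytes by key position, so that
# column i holds every byte that was xored with key byte i.
def transpose_blocks(ciphertext, key_size)
  rows = ciphertext.bytes.each_slice(key_size).to_a
  return [] if rows.empty?

  # transpose needs rows of equal length, so pad the final row with nil...
  rows[-1] = rows[-1] + [nil] * (key_size - rows[-1].length)
  # ...and drop the nils again once the columns are built.
  rows.transpose.map { |column| column.compact.pack('C*') }
end
```

Each resulting column can then be attacked as an independent single-byte xor.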
This challenge was like the first - it should have been easy, but Ruby made it hard. In this case, the culprit was that the Ruby OpenSSL implementation of AES-ECB automatically adds padding. This would be fine, were it well-documented, but I ended up scouring the Internet, confused by why Python was giving me a different ciphertext in what seemed to be the same conditions.
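The difference is easy to see with a small wrapper around OpenSSL::Cipher that makes padding an explicit choice. This is only a sketch, and the function names are made up; the one relevant knob is `padding = 0`, which turns the default padding off.

```ruby
require 'openssl'

# Hypothetical wrappers around OpenSSL::Cipher for AES-128 in ECB mode.
# Padding is on by default in Ruby's OpenSSL; `padding: false` disables it.
def aes_ecb_encrypt(plaintext, key, padding: true)
  cipher = OpenSSL::Cipher.new('AES-128-ECB')
  cipher.encrypt
  cipher.key = key
  cipher.padding = 0 unless padding
  cipher.update(plaintext) + cipher.final
end

def aes_ecb_decrypt(ciphertext, key, padding: true)
  cipher = OpenSSL::Cipher.new('AES-128-ECB')
  cipher.decrypt
  cipher.key = key
  cipher.padding = 0 unless padding
  cipher.update(ciphertext) + cipher.final
end
```

Encrypting exactly one 16-byte block with padding enabled gives 32 bytes of output, because a full block of padding is appended; with padding disabled the output is 16 bytes, and the input length must then be a multiple of the block size.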
This challenge didn’t feel like it really had an end - unlike the others, it was very difficult to tell if I’d successfully solved it. I picked a ciphertext, but as far as I know, there’s no way for me to tell if it’s the right one.
My solution here was to check for ciphertexts with the same blocks repeated, as this seemed to be what was hinted at. There was also only a single ciphertext with this property.
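A sketch of the repeated-block check, assuming the ciphertexts have already been decoded to raw bytes and that the block size is 16 (function names are hypothetical):

```ruby
# Hypothetical helpers for spotting ECB by its repeated ciphertext blocks.

# Counts how many blocks are duplicates of an earlier block.
def repeated_blocks(ciphertext, block_size = 16)
  blocks = ciphertext.bytes.each_slice(block_size).map { |b| b.pack('C*') }
  blocks.length - blocks.uniq.length
end

# Returns the ciphertext with the most repeated blocks.
def detect_ecb(ciphertexts, block_size = 16)
  ciphertexts.max_by { |ct| repeated_blocks(ct, block_size) }
end
```

This works because ECB encrypts identical plaintext blocks to identical ciphertext blocks, so a nonzero count is a strong hint, though it only shows up when the plaintext itself repeats.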
Putting it all together
This set was primarily frustrating - it’s the least fun set in my opinion, and can be pretty boring if you already know some basic crypto. I felt like I was fighting with language details, rather than interesting problems. However, I was writing in an unfamiliar language, and things definitely get better from here on in.
My solutions for this challenge are up on GitHub here, along with test cases, as best as I could determine a correct solution.