A while ago, we had a look at a (simple) substitution cipher. Recall, this cipher proceeds by, as its name suggests, substituting one letter for another. For the cipher to be reversible, the substitution table is in fact a permutation of the alphabet. This permutation is the cipher “key”.
We surmised that frequency analysis can help us decode a message enciphered with a substitution, but that does not help us with atypical or very short messages. What if the message is somewhat skewed, say it doesn’t even have the letter E? Can we still use the (expected) letter frequencies to start decoding it? Well, maybe. Could we use some other approach? What if we had a language model that helps us decide if a tentative deciphering makes sense? Oh, right, we do have one. And since we can use it as a “fitness” function, we can apply genetic programming to find the code!