Breaking the Substitution Cipher (Substitution Cipher, part II)

A while ago, we had a look at a (simple) substitution cipher. Recall, this cipher proceeds by, as its name suggests, substituting one letter for another. For the cipher to be reversible, the substitution table is in fact a permutation of the alphabet. This permutation is the cipher “key”.

cipher-coin

We surmised that frequency analysis can help us decode a message enciphered with a substitution, but that does not help us with atypical or very short messages. What if the message is somewhat skewed, say it doesn’t even have the letter E? Can we still use the (expected) letter frequencies to start decoding it? Well, maybe. Could we use some other approach? What if we had a language model that helps us decide if a tentative deciphering makes sense? Oh, right, we do have one. And since we can use it as a “fitness” function, we can apply genetic programming to find the code!

*
* *

In classical optimizations problems, one devises an algorithm to adjust the parameters of an objective function evaluated on a series of data. Genetic programming, on the contrary, has an unchanging objective function (the “fitness” function) and changes the data—each datum is an individual in a population—so to produces data that maximizes (or minimizes) the objective function. Therefore, genetic programming is dual to classical optimization theory.

Changing the data, in genetic programming, proceeds randomly. The vocabulary is fraught with evolutionary theory metaphors: mutations, crossover, etc. The population “evolves” as follows: you take one datum, and you produce a number of “offsprings” by making copies of the datum with random changes. The random changes can be random flipping of bits, or it can involve more or less complicated operations like exchanging a subset of the bits with other members of the population, somewhat mimicking sexual reproduction.

Then each “generation” (a new bunch of offspring created from “parents”) is selected: only the so-or-so% best, according to the fitness function “survive” and are allowed to “reproduce”. We have generation succeeding one another until we arrive to a local minima and that we have a good fitness.

For more on genetic programming, have a look at A Field Guide to Genetic Programming (which also comes with a free/open PDF) and An Introduction to Genetic Programming for Scientists and Engineers

*
* *

Applied to our problem, an individual is a substitution (all 26 letters permuted in some random order), and the fitness function measures how English-looking the resulting deciphering is. We will use the language model we have discussed earlier, a simple order-1 Markov model. What are mutations in our case? Well, substitutions that are “near” the parent’s substitution, say, substitutions created by swapping randomly 2 letters in the substitution (and in the experiments, I allowed 1 to 10 such random swaps, creating 100 offspring for every individual selected).

In early experiment, I applied the beginner’s method: generate a population of N, kept, say, 10% of the best, then replenished the population back to N with mutants. This approach is susceptible to fall very rapidly into local minima. Instead, I switch to a strategy where I keep 2% of the best individuals, 1.3% of the worst, and 3.3% of new, random individuals (the numbers are looking weird in %, but on a population of 150000 individuals, I kept 3000 best, 2000 worst, and injected 5000 new). Why? I had no good reasons to choose this particular mix, but it worked well, experimentally, so, why not?

*
* *

The program is rather simple conceptually: you start with an initial population, and an enciphered message. For each individual, you have a forward (enciphering) and inverse (deciphering) substitution; you apply the inverse to the enciphered message, you compute its score using the language model. You sort the list of individual from best to worst, you “kill” everything between the first b best and the last w worst individual, inject n new individual, you proceed to the mutations of those kept+new individuals until you replenish your population. You repeat until the best score stagnates for a few tens of generations.

In the experiments, the program does not use any information of the original text, it is only fed the cipher-text. Here’s one example (each line represent the best individual of its generation):

ryf yubrndl ng zynt sed odeluis aeirub teb rdeibaurrfh ndekkl tury kurrkf bjoondruis hnzjafireruni jiruk int --- score: -3.67719
rbd bisroft ok pboh caf jfatinc uanris has rfansuirrdw ofallt hirb lirrld smjjofrinc wopmudnrarion mnril noh --- score: -3.48384
rbd bisroct ok pbom fac jcatief uaeris mas rcaesuirrdw ocallt mirb lirrld shjjocrief wophuderarioe heril eom --- score: -3.37573
ias aryioke oc maof hlk bklerth dltiry fly ikltydriisp oklgge fria griigs ynbbokirth pomndstilirot ntirg tof --- score: -3.32334
lda dislouy oh vdor teu pueyint menlis res luensmillag ouebby rild billba skppoulint govkmanlelion knlib nor --- score: -3.26811
cla liscotm ok plow det utemind hencis wes ctenshiccaj oterrm wicl riccra sguuotcind jopghancecion gncir now --- score: -3.158
sho hegsiny iz phic wan bnayerw larseg cag snarglessom inatty cesh tessto gubbinserw mipulorsaseir urset ric --- score: -3.03343
sho hegsiny iz phic wan bnayerw larseg cag snarglessom inatty cesh tessto gubbinserw mipulorsaseir urset ric --- score: -3.03343
sho hegsimy if phic wam bmayerw larseg cag smarglesson imatty cesh tessto gubbimserw nipulorsaseir urset ric --- score: -3.02616
tse sidtory ok vsom far prayinf gantid mad trandgitteb orally mits little dupportinf bovugentation until nom --- score: -2.89792
tse sidtory ok wsom far prayinf gantid mad trandgitteb orally mits little dupportinf bowugentation until nom --- score: -2.87348
tse sidtory ok gsom far prayinf vantid mad trandvitteb orally mits little dupportinf boguventation until nom --- score: -2.86868
tse sidtory ok gsom far prayinf vantid mad trandvitteb orally mits little dupportinf boguventation until nom --- score: -2.86868
tse sidtory ok csom far prayinf gantid mad trandgitteb orally mits little dupportinf bocugentation until nom --- score: -2.84646
the hidtory of whom kar prayink gantid mad trandgitteb orally mith little dupportink bowugentation until nom --- score: -2.80786
the histord ov khof yar pradiny mantis fas transmitteg oralld fith little supportiny gokumentation until nof --- score: -2.77897
the hictord of ghow kar pradink mantic wac trancmittey oralld with little cupportink yogumentation until now --- score: -2.77454
the history oc ghow dar brayind mantis was transmittep orally with little subbortind pogumentation until now --- score: -2.6702
the history of chom dar brayind pantis mas transpittew orally mith little subbortind wocupentation until nom --- score: -2.60388
the history of chom dar brayind pantis mas transpittew orally mith little subbortind wocupentation until nom --- score: -2.60388
the history of chom dar brayind pantis mas transpittew orally mith little subbortind wocupentation until nom --- score: -2.60388
the history of chom dar brayind pantis mas transpittew orally mith little subbortind wocupentation until nom --- score: -2.60388
the history of chom dar brayind pantis mas transpittew orally mith little subbortind wocupentation until nom --- score: -2.60388
the history of chom dar brayind pantis mas transpittew orally mith little subbortind wocupentation until nom --- score: -2.60388
the history of chom dar brayind pantis mas transpittew orally mith little subbortind wocupentation until nom --- score: -2.60388
the history of chom dar prayind bantis mas transbittew orally mith little supportind wocubentation until nom --- score: -2.60051
the history of ghow dar prayind mantis was transmittec orally with little supportind cogumentation until now --- score: -2.58955
the history of ghow dar prayind mantis was transmittec orally with little supportind cogumentation until now --- score: -2.58955
the history of ghow dar prayind mantis was transmittec orally with little supportind cogumentation until now --- score: -2.58955
the history of ghom dar prayind cantis mas transcittew orally mith little supportind wogucentation until nom --- score: -2.58914
the history of chow dar prayind mantis was transmitteg orally with little supportind gocumentation until now --- score: -2.585
the history of chow gar praying mantis was transmitted orally with little supporting documentation until now --- score: -2.57303
the history of chow gar praying mantis was transmitted orally with little supporting documentation until now --- score: -2.57303

Quite surprisingly, it arrives rather quickly (a few tens of generations, and about one minute of wall-clock time) to the correct deciphered message. Is this merely a lucky instance? Well, let’s see with some other message:

andegroun oh ri mserseg hse hsiawl yaghae o dogeeg un yskhudh, pahud, mgurunf ig hyode hduende, hse orrenlel rse igefin unhrurare it pogune quiwifk --- score: -3.34799
andegroun ot ri mserseg tse tsiahl yagtae o dogeeg un ysktudt, patud, mgurunf ig tyode tduende, tse orrenlel rse igefin untrurare iw pogune quihifk --- score: -3.27679
tndegroun oa ri xserseg ase asithy ltgate o dogeeg un lskauda, ptaud, xgurunf ig alode aduende, ase orrenyey rse igefin unarurtre im pogune quihifk --- score: -3.20751
lndefroun oa ri msersef ase asilwh ylfale o dofeef un yskauda, plaud, mfurung if ayode aduende, ase orrenheh rse ifegin unarurlre it pofune quiwigk --- score: -3.12531
andergoun os gi thegher she shiawl barsae o doreer un bhksuds, pasud, trugunf ir sbode sduende, she oggenlel ghe irefin unsgugage im porune quiwifk --- score: -3.06081
uncerstin ta so pbesber abe aboujd muraue t ctreer in mbzaica, luaic, prisinw or amtce acience, abe tssended sbe orewon inasisuse oy ltrine viojowz --- score: -2.97102
pnlersoin ot sa whesher the thapby mprtpe o loreer in mhftilt, kptil, wrising ar tmole tlienle, the ossenyey she aregan intsispse au korine ciabagf --- score: -2.89008
anlersuin ut so whesher the thoamy fartae u lureer in fhktilt, patil, wrising or tfule tlienle, the ussenyey she oregon intsisase oz purine ciomogk --- score: -2.87027
anlersuin ut so phesher the thoamy wartae u lureer in whktilt, batil, prising or twule tlienle, the ussenyey she oregon intsisase of burine ciomogk --- score: -2.76881
ancersuin ut so phesher the thoamy wartae u cureer in whktict, batic, prising or twuce tcience, the ussenyey she oregon intsisase of burine liomogk --- score: -2.74981
uncersain at so phesher the thould murtue a career in mhytict, jutic, prisinb or tmace tcience, the assended she orebon intsisuse ow jarine qioloby --- score: -2.70316
uncertain as to bhether she shouyd pursue a career in phlsics, music, briting or space science, she attended the oregon institute ok marine fioyogl --- score: -2.66362
uncersain at so whesher the thoumy purtue a career in phltict, futic, wrising or tpace tcience, the assenyey she oregon intsisuse oz farine biomogl --- score: -2.64927
uncertain as to whether she shouyd pursue a career in phlsics, music, writing or space science, she attended the oregon institute ok marine fioyogl --- score: -2.62567
untercain as co ghecher she shoulm pursue a tareer in physits, busit, gricind or spate stiente, she accenmem che oredon inscicuce ok barine fiolody --- score: -2.60964
uncertain as to whether she should pursue a career in physics, music, writing or space science, she attended the oregon institute of marine kiology --- score: -2.54929
uncertain as to whether she should pursue a career in physics, music, writing or space science, she attended the oregon institute of marine biology --- score: -2.53944

Even faster! More:

pt shece le auj ceferj tic she uenoegs aur pnuicauge hece zbideu it, ps fwzs le tiwur pu she fice ettegswao seaghpun it eunopzh ncaffac --- score: -3.57572
ed mtoco po arh cofonh dic mto roysoum arn eyricaruo toco bgizor id, em fwbm po diwrn er mto fico oddoumwas moautery id orysebt ycaffac --- score: -3.43762
ld rgeme he ack mepeyk dum rge ceisebr acy licumacbe geme ofutec ud, lr pnor he duncy lc rge pume eddebrnas reabglci ud ecislog imappam --- score: -3.19446
tm huese fe ind seregd mos hue nealebh ing tanosinbe uese cyopen om, th rwch fe mowng tn hue rose emmebhwil heibutna om enaltcu asirris --- score: -3.12987
af huese me ind seregd fos hue netlebh ing atnosinbe uese cyopen of, ah rwch me fowng an hue rose effebhwil heibuant of entlacu tsirris --- score: -3.06921
tm huele fe ind leregd mol hue neasech ing tanolince uele bjopen om, th rwbh fe mowng tn hue role emmechwis heicutna om enastbu alirril --- score: -2.9975
on stere be aly repewy nir ste lefheds alw ofliralde tere ugivel in, os pcus be niclw ol ste pire ennedscah seadtolf in elfhout frappar --- score: -2.95955
op there ve und remegd par the nesyect ung osnarunce here lbawen ap, ot milt ve paing on the mare eppectiuy teuchons ap ensyolh srummur --- score: -2.87312
om thele ve und lepegd mal the nesyect ung osnalunce hele rbawen am, ot pirt ve maing on the pale emmectiuy teuchons am ensyorh sluppul --- score: -2.80463
if these we ary sepemy fos the redgect arm idrosarce hese nkoler of, it pbnt we fobrm ir the pose effectbag teachird of erdginh dsappas --- score: -2.79246
os there ve und remeld sar the negyect unl ognarunce here pbawen as, ot mipt ve sainl on the mare essectiuy teuchong as engyoph grummur --- score: -2.7496
os there ve und remeld sar the negyect unl ognarunce here pbawen as, ot mipt ve sainl on the mare essectiuy teuchong as engyoph grummur --- score: -2.7496
if these we ary sepely fos the redmect arl idrosarce hese nzoger of, it punt we fourl ir the pose effectuam teachird of erdminh dsappas --- score: -2.71263
if these we ary sepely fos the redmect arl idrosarce hese nzoger of, it punt we fourl ir the pose effectuam teachird of erdminh dsappas --- score: -2.71263
if there ve alm reneym for the ledgest aly idloralse here czobel of, it nuct ve fouly il the nore effestuag teashild of eldgich drannar --- score: -2.68283
if there ve alm reneym for the ledgest aly idloralse here czobel of, it nuct ve fouly il the nore effestuag teashild of eldgich drannar --- score: -2.68283
il theme be ons mefeds lam the negrect ond ignamonce heme pyaven al, it fupt be laund in the fame ellectuor teoching al engriph gmoffom --- score: -2.64887
il theme be ons mefeds lam the negrect ond ignamonce heme pyaven al, it fupt be laund in the fame ellectuor teoching al engriph gmoffom --- score: -2.64887
if theme xe ary meledy fom the rengect ard inromarce heme spover of, it lust xe fourd ir the lome effectuag teachirn of erngish nmallam --- score: -2.63798
if there je any rebely for the nedgect anl idnorance here skoven of, it bust je founl in the bore effectuag teachind of endgish drabbar --- score: -2.61719
im there je any rebely mor the nedgect anl idnorance here spoken om, it bust je mounl in the bore emmectuag teachind om endgish drabbar --- score: -2.61248
if there we any repemy for the neglect anm ignorance here skoden of, it pust we founm in the pore effectual teaching of english grappar --- score: -2.59336
if theme be ary melegy fom the rendect arg inromarce heme spover of, it lust be fourg ir the lome effectuad teachirn of erndish nmallam --- score: -2.57601
if theme be ary melegy fom the rendect arg inromarce heme spover of, it lust be fourg ir the lome effectuad teachirn of erndish nmallam --- score: -2.57601
if there je any remely for the nedgect anl idnorance here spoven of, it must je founl in the more effectuag teachind of endgish drammar --- score: -2.56807
if there je any remely for the nedgect anl idnorance here spoven of, it must je founl in the more effectuag teachind of endgish drammar --- score: -2.56807
if there be any remely for the nedgect anl idnorance here swoven of, it must be founl in the more effectuag teachind of endgish drammar --- score: -2.54228
if there be any remely for the nedgect anl idnorance here spoven of, it must be founl in the more effectuag teachind of endgish drammar --- score: -2.52841
if there we any remedy for the neglect and ignorance here skopen of, it must we found in the more effectual teaching of english grammar --- score: -2.49834
if there we any remedy for the neglect and ignorance here skopen of, it must we found in the more effectual teaching of english grammar --- score: -2.49834
if there we any remedy for the neglect and ignorance here spoken of, it must we found in the more effectual teaching of english grammar --- score: -2.46701
if there we any remedy for the neglect and ignorance here spoken of, it must we found in the more effectual teaching of english grammar --- score: -2.46701
if there we any remedy for the neglect and ignorance here spoken of, it must we found in the more effectual teaching of english grammar --- score: -2.46701
if there we any remedy for the neglect and ignorance here spoven of, it must we found in the more effectual teaching of english grammar --- score: -2.45515
if there we any remedy for the neglect and ignorance here spoven of, it must we found in the more effectual teaching of english grammar --- score: -2.45515
if there we any remedy for the neglect and ignorance here spoven of, it must we found in the more effectual teaching of english grammar --- score: -2.45515
if there we any remedy for the neglect and ignorance here spoven of, it must we found in the more effectual teaching of english grammar --- score: -2.45515
if there we any remedy for the neglect and ignorance here spoven of, it must we found in the more effectual teaching of english grammar --- score: -2.45515

Hmm. In this last example, it mistakes w for b, and v for k, but otherwise it’s still very legible. Maybe a couple of different runs (all with different random seeds, evidently) would yield a correct decode?

*
* *

The main weakness of the method is the language model. If the model isn’t very good, then you may get weird tententive deciphered text that still have a good score. For example:

the aitw, bhfch iae grctuagip, keed rg the marugd rg iatharlrdw, eiathbraow, kauft, igd weedw --- score: -3.40505
mer agmo, wesle gar uilmdaugh, brrt iu mer zaidut iu gameaicito, rgamewiako, badsm, gut orrto --- score: -3.25731
our itoy, suglu tir calonictf, vrre ac our kiance ac tiouiaxaey, rtiousaiby, vingo, tce yrrey --- score: -3.05336
oue itoy, suglu tie calonictf, veer ac oue kiancr ac tiouiaxary, etiousaiby, vingo, tcr yeery --- score: -2.97708
ous itoy, ruglu tis calonictf, wsse ac ous kiance ac tiouiaxaey, stiouraiby, wingo, tce yssey --- score: -2.9052
oul tios, bungu itl regoatriw, flld er oul cteard er itoutekeds, litoubeths, ftano, ird sllds --- score: -2.86778
rol atry, foupo tal siprdastw, blle is rol maidse is taroainiey, ltarofiavy, badur, tse ylley --- score: -2.81227
oul tios, mungu itl regoatrid, flly er oul bteary er itouteveys, litoumeths, ftano, iry sllys --- score: -2.78855
son itse, coubo tin rabspirty, wnnd ar son miaprd ar tisoialade, ntisocaixe, wipus, trd ennde --- score: -2.72981
son itse, coubo tin rabspirty, wnnd ar son hiaprd ar tisoialade, ntisocaixe, wipus, trd ennde --- score: -2.72339
son itse, coubo tin rabspirth, wnnd ar son liaprd ar tisoiamade, ntisocaike, wipus, trd ennde --- score: -2.7117
son itse, foubo tin rabspirty, wnnd ar son miaprd ar tisoialade, ntisofaice, wipus, trd ennde --- score: -2.67649
the rots, which ore lacturlom, peed al the brauld al orthrazads, eorthwarns, pruit, old seeds --- score: -2.66222
son itse, foubo tin rabscirth, wnnd ar son piacrd ar tisoialade, ntisofaive, wicus, trd ennde --- score: -2.65864
the rots, which ore lacturlon, peed al the brauld al orthrazads, eorthwarms, pruit, old seeds --- score: -2.65579
son itse, foumo tin ramscirty, wnnd ar son hiacrd ar tisoialade, ntisofaive, wicus, trd ennde --- score: -2.64746
the rots, which ore lacturlon, peed al the brauld al orthravads, eorthwarms, pruit, old seeds --- score: -2.6443
the rots, which ore nacturnol, peed an the braund an orthrazads, eorthwarms, pruit, ond seeds --- score: -2.61706
the rots, which ore nacturnof, peed an the braund an orthrazads, eorthwargs, pruit, ond seeds --- score: -2.61351
the rots, which ore nacturnol, peed an the braund an orthragads, eorthwarms, pruit, ond seeds --- score: -2.60997
the rots, which ore nacturnof, peed an the braund an orthravads, eorthwargs, pruit, ond seeds --- score: -2.60203
the rots, which ore nacturnof, peed an the braund an orthralads, eorthwargs, pruit, ond seeds --- score: -2.58454
the rots, which ore nacturnof, peed an the braund an orthralads, eorthwarks, pruit, ond seeds --- score: -2.57494
the rots, which ore nacturnom, peed an the braund an orthralads, eorthwarys, pruit, ond seeds --- score: -2.57109
the rots, which ore nacturnom, peed an the braund an orthralads, eorthwarys, pruit, ond seeds --- score: -2.57109
the rots, which ore nacturnof, peed an the braund an orthralads, eorthwarys, pruit, ond seeds --- score: -2.57052
the rots, which ore nacturnof, peed an the braund an orthralads, eorthwarys, pruit, ond seeds --- score: -2.57052
the rots, which ore nacturnof, peed an the braund an orthralads, eorthwarys, pruit, ond seeds --- score: -2.57052
the rats, which are nocturnal, peed on the bround on arthromods, earthworks, pruit, and seeds --- score: -2.55168
the rats, which are nocturnal, peed on the bround on arthromods, earthworks, pruit, and seeds --- score: -2.55168
the rats, which are nocturnal, peed on the bround on arthromods, earthworks, pruit, and seeds --- score: -2.55168

…generates a somewhat bizarre deciphering. The model seems to think “arthromods” and “earthworks” are cromulent words, that is, it thinks that “earthworks” is more likely that earthworms, simply because “rk” is more frequent in English than “rm”. A higher-order model would learn longer word parts, eventually entire words, even parts of sentences, and would declare unlikely that the rats peed on the bround on arthromods.

Another limitation of this naïve approach is that the mutation considered here are not very rich, and that sometimes specific mutations have to happen in a specific order to get to the correct deciphering. In the last example, to get “feed” and “arthropods” we have to exchange b and f and m and p to get to the correct deciphering. If they do not happen jointly, then the score may not go up enough to be selected for the next generation.

*
* *

The method therefore approximately works. We can decipher a (simple) substitution without using statistics drawn from the message itself (but we do use some from the language through the language model) just by starting from a small number of random substitutions, and, using genetic programming, generate rounds of mutant substitution which gets closer and closer to a good, likely, deciphering.

While I had no doubt that the method would work, I really expected to have to examine a truly huge number of substitutions. Indeed, we have that the total number of distinct permutations on the 26 letters is 26!\approx 4\times10^{26}\approx 2^{88.3}, which is really large. However,—and that was a surprise!—each generation only had 150000 individuals (a reasonable value for debugging, methought initially), and since there’s only a few tens of generations before getting a good (maybe still partial) deciphering, we only end up examining a few millions.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: