Caesar’s Cipher

Julius Caesar, presumably sometimes during the war in Gaul, according to Suetonius, used a simple cipher to ensure the privacy of his communications.

Caesar’s method can hardly be considered anything close to secure, but it’s still worthwhile to have a look at how you can implement it, and break it, mostly because it’s one of the simplest substitution ciphers.

Caesar’s cipher is a symmetric cipher, where the encryption and decryption keys are the same, and is very simple. Let $k$ be the key, an integer. Let $\{x_i\}_{i=1}^n$, the sequence of letters to encode. Let us suppose for now that the alphabet is limited to A to Z, with no punctuation, nor diacritics, nor numbers.

The encryption is given by:

$c_i = (x_i + k)~\mathrm{mod}~26$.

That is, each letter is “shifted” $k$ positions: A shifted 3 positions becomes D. The decryption is the inverse operation:

$x_i = (c_i - k)~\mathrm{mod}~26$,

where $\mathrm{mod}$ always return a non-negative number.

If you encrypt a piece of text with $k=3$:

Tomorrow, we attack the Gauls

becomes:

tomorrowweattackthegauls

before getting fed to the algorithm, which transforms it into:

wrpru urzzh dwwdf nwkhj dxov

which may superficially look completely and securely encrypted.

But stumbling into a cryptogram, say

igttu zgzzg iqmga ryutc kyzhg tq,

it suffices to try (at most) 25 keys:

$k=1$ : hfsst yfyyf hplfz qxtsb jxygf sp
$k=2$ : gerrs xexxe gokey pwsra iwxfe ro
$k=3$ : fdqqr wdwwd fnjdx ovrqz hvwed qn
$k=4$ : ecppq vcvvc emicw nuqpy guvdc pm
$k=5$ : dboop ubuub dlhbv mtpox ftucb ol
$k=6$ : canno tatta ckgau lsonw estba nk
$k=7$ : bzmmn szssz bjfzt krnmv drsaz mj

and $k=6$ yields cannot attack Gauls on west bank, which solves the cryptogram unambiguously.

*
* *

Caesar’s cipher is weak, and even with a pen-and-paper solving strategy, it cannot be very long before one cracks the code. For one thing, you don’t have to decipher the whole message before figuring out that the key you’re trying’s only yielding gibberish, and you don’t even have to start at the beginning of the message.

Still, if one wants to automate the process (let’s suppose we want to), the only missing piece is a language model that gives a score to each clear-text candidate, and allows us to pick the most probable one. Maybe we could do just that in a follow-up entry?

*
* *

So let us now look at the implementation. It assumes that the letters are lowercase. A direct implementation would give something like:

////////////////////////////////////////
//
// Encrypts string (in-place) using
// Ceasar's method
//
void encrypt( int key,
std::string & message )
{
for (size_t i=0;i<message.size();i++)
message[i]='a'+(ord(message[i])+key) % 26;
}

////////////////////////////////////////
//
// Decripts string (in-place) using
// Ceasar's medthod
//
void decrypt( int key,
std::string & message )
{
for (size_t i=0;i<message.size();i++)
message[i]='a'+(ord(message[i])+(26-key)) % 26;
}


where ord is an helper function that returns 0 for a and 25 for z. There’s just a small complication: as C++’s modulo % returns negative remainder on negative numbers, it is necessary to compensate for this by adding 26, safely performing the arithmetic on non-negative numbers.

*
* *

We will be back on (simple) language models that will help us automate the process of cracking Caesar’s cipher.

To be continued…

2 Responses to Caesar’s Cipher

1. […] the last installment of this series, we had a look at Caesar’s cipher, an absurdly simple encryption technique […]

2. […] time ago, we presented the Caesar Cipher, developed a simple language model that allowed us to break the cipher relatively easily. This […]