## Monday, July 22, 2013

### Encoding two messages simultaneously with a book code

Book codes are a convenient type of homophonic substitution cipher from the days before computation.  The basic idea is that a book (or other text) serves as the key, and the code consists of a list of indexes of words in the key.  One letter, usually the first, is taken from each indexed word to build the text.

A famous example is the second Beale cipher, which uses a version of the Declaration of Independence as a key.  The key consists of 1322 words, any of which may be used to encode the 26 letters of the alphabet.  Since the key contains many alternatives for most letters, the sender may choose randomly among those options in order to make frequency analysis difficult.
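As a rough illustration of the basic scheme, here is a minimal Python sketch.  The key is only the opening words of the Declaration, used as a toy; it does not reproduce the word numbering of Beale's version, and the names `build_index`, `encode`, and `decode` are made up for this example.

```python
import random
from collections import defaultdict

# Toy key: the opening words of the Declaration of Independence.
# (Assumption: a real key would be far longer and cover every letter.)
KEY = ("when in the course of human events it becomes necessary "
       "for one people to dissolve the political bands")

def build_index(key_text):
    """Map each initial letter to the 1-based positions of words starting with it."""
    index = defaultdict(list)
    for position, word in enumerate(key_text.split(), start=1):
        index[word[0].lower()].append(position)
    return index

def encode(plaintext, index, rng=random):
    """For every letter, pick at random one word number whose word starts with it."""
    codes = []
    for ch in plaintext.lower():
        if not ch.isalpha():
            continue  # spaces and punctuation are dropped
        if not index.get(ch):
            raise ValueError(f"key has no word starting with {ch!r}")
        codes.append(rng.choice(index[ch]))
    return codes

def decode(codes, key_text):
    """Read off the first letter of each numbered word."""
    words = key_text.split()
    return "".join(words[n - 1][0].lower() for n in codes)

index = build_index(KEY)
codes = encode("hide it", index)
print(codes, decode(codes, KEY))
```

Because `encode` chooses randomly among the alternatives, running it twice will usually give different numbers for the same plaintext, which is what blunts frequency analysis.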

If the key text is short and the sender is lazy, there may be some weaknesses in the cipher, but I won't dwell on those here.  Instead, let's look at how much information is contained in the cipher text compared to the plain text, if the Declaration of Independence is used:

Expansion factor = ln 1322 / ln 26 ≈ 2.206

Each code number can carry just over twice as much information as a plaintext letter.  Normally, the extra information would be random noise, but in fact you could send a second message in that bandwidth if you wanted to.
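The arithmetic can be checked in a few lines of Python.  This is only a sketch; the variable names are mine, and it treats every key word and every letter as equally likely, which is the same simplification the ratio above makes.

```python
import math

KEY_WORDS = 1322   # words in the key
ALPHABET = 26      # letters in the plaintext alphabet

bits_per_code = math.log2(KEY_WORDS)    # information one code number can carry
bits_per_letter = math.log2(ALPHABET)   # information one plaintext letter needs
expansion = bits_per_code / bits_per_letter
spare_bits = bits_per_code - bits_per_letter

print(f"expansion factor: {expansion:.3f}")
print(f"spare bits per code number: {spare_bits:.2f}")
```

The ratio does not depend on the base of the logarithm, so base 2 gives the same factor as the natural log.  The spare capacity per code number is larger than `bits_per_letter`, which is why a whole second letter fits.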

Imagine how the 19th century sender might prepare his message:  First, he reads through the key and prepares a list of alternative encodings for each letter of the plaintext.  So far, it's just a normal book cipher.  But then, he turns the list on its side and makes it a grid: each row holds the codes for one letter, and each column stands for a letter of a second alphabet.  So the first possible code for m can represent an a, and the 13th possible code for a can represent an m.
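The grid idea can be sketched in Python as follows.  Everything here is illustrative: the key is a synthetic one built so that every letter has exactly 26 alternatives (an assumption no real text will satisfy), and `build_grid`, `encode_pair`, and `decode_pair` are hypothetical names.

```python
import string
from collections import defaultdict

ALPHABET = string.ascii_lowercase

def build_grid(key_words):
    """Row for each initial letter: the 1-based positions of words starting with it."""
    grid = defaultdict(list)
    for position, word in enumerate(key_words, start=1):
        grid[word[0].lower()].append(position)
    return grid

def encode_pair(cover, hidden, grid, hidden_alphabet=ALPHABET):
    """The word's initial carries the cover letter; its rank in the row carries the hidden one."""
    if len(cover) != len(hidden):
        raise ValueError("both messages must have the same number of letters")
    codes = []
    for c, h in zip(cover, hidden):
        column = hidden_alphabet.index(h)
        row = grid.get(c, [])
        if column >= len(row):
            raise ValueError(f"too few words starting with {c!r} to hide {h!r}")
        codes.append(row[column])
    return codes

def decode_pair(codes, key_words, grid, hidden_alphabet=ALPHABET):
    cover, hidden = [], []
    for n in codes:
        c = key_words[n - 1][0].lower()
        cover.append(c)                                   # ordinary book-code reading
        hidden.append(hidden_alphabet[grid[c].index(n)])  # rank within the row
    return "".join(cover), "".join(hidden)

# Synthetic key (assumption): one-letter "words", the alphabet repeated 26 times.
key_words = [letter for _ in range(26) for letter in ALPHABET]
grid = build_grid(key_words)

codes = encode_pair("noactiontilldecember", "thearmydepartsatdawn", grid)
print(decode_pair(codes, key_words, grid))
```

Someone who knows only the ordinary book code recovers the cover text; recovering the hidden text also requires knowing that the rank within each row matters and which alphabet order the columns use.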

With a small key, the sender will have bottlenecks around low-frequency letters.  The Declaration of Independence only gives you four words starting with k, for example, so you will be better off if you rearrange your alternate alphabet in order of descending frequency, so your four variants of k represent e, t, a, o.  Even then, of course, there will be problems.  A large key is definitely better if both messages are important.
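A small Python sketch of the reordering, assuming an approximate English frequency ranking (the `FREQUENCY_ORDER` string below is one common ordering, not an authoritative one) and a hypothetical helper `reachable`:

```python
import string

ALPHABET = string.ascii_lowercase
# Approximate descending letter frequency in English (assumption; rankings vary by corpus).
FREQUENCY_ORDER = "etaoinshrdlucmfwypvbgkjqxz"

def reachable(row_length, hidden_alphabet):
    """Hidden letters that a row with `row_length` alternative codes can represent."""
    return hidden_alphabet[:row_length]

# A letter with only four key words, like k in the Declaration:
print(reachable(4, ALPHABET))         # alphabetical columns
print(reachable(4, FREQUENCY_ORDER))  # frequency-ordered columns
```

With alphabetical columns the four k-words can only hide a, b, c, or d; with frequency-ordered columns they hide e, t, a, or o, which come up far more often in real messages.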

But if one of the messages is truly important, and the other message is just a cover, then the sender can probably work out a plausible message using the simple book code that will adequately conceal the true message.  For example, using the same key as the second Beale cipher:

High value message: THE ARMY DEPARTS AT DAWN

Cover message: NO ACTION TILL DECEMBER

Coded message: 44, 132, 24, 195, 39, 319, 298, 269, 3, 286, 234, 334, 52, 89, 195, 33, 210, 231, 511, 96

Don't get your hopes up, though.  This was not done with the Beale cipher.