In my last two posts, I first suggested that Voynich Manuscript 81R might contain a poem in Latin dactylic hexameter, but then I argued that the lines only convey about half of the information necessary to encode such a poem. In this post I'll try to reconcile those two arguments by showing that a late medieval/early Renaissance cipher system could have produced this effect.
The pages of the VM have been carbon-dated to between 1404 and 1438. If the text is not a hoax, and it was written within a century or so of the production of the vellum, then what cryptographic techniques might the author plausibly have known, and how would they impact the total bits per line of an enciphered poem?
According to David Kahn's The Codebreakers, the following methods might have been available to someone in Europe during that period. For most of these, I created simulations using the Aeneid as the plaintext, and measured the effect on bits per line using the formula for Pbc from my last post.
- Writing backwards (0.2% increase)
- Substituting dots for vowels (28.5% decrease)
- Foreign alphabets (little or no change, depending on how well the foreign alphabet maps to the plaintext alphabet)
- Simple substitution (no change)
- Writing in consonants only (45.6% to 49% decrease, depending on whether v and j are treated as vowels)
- Figurate expressions (impractical to test, but likely to increase bits per line)
- Exotic alphabets (no change, same as simple substitution)
- Shorthand (impractical to test, but likely to decrease bits per line)
- Abbreviations (impractical to test, but certain to decrease bits per line)
- Word substitutions (did not test, but likely to cause moderate increase or decrease to bits per line)
- Homophones for vowels (increase bits per line, but the exact difference depends on the number of homophones per vowel. With two homophones for each vowel, there was a 19.5% increase)
- Nulls (increase bits per line, but the exact difference depends on the number of distinct nulls used and the number of nulls inserted per line)
- Homophones for consonants (increase bits per line, but the exact difference depends on the number of homophones per consonant)
- Nomenclators (impact depends on the type of nomenclator. I tested with a large nomenclator and got a 44.5% decrease in bits per line)
Of the methods above, only two come close to producing the roughly 50% reduction in bits per line that my last post suggested:
- Writing in consonants only
- Using a large nomenclator
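A simulation of the strongest of these effects, writing in consonants only, can be sketched roughly as follows. Note that the entropy measure here is a simple per-line Shannon total computed from the line's own character frequencies, a crude stand-in for the Pbc formula from the previous post, and the sample line (the opening of the Aeneid) is just an illustration:

```python
from collections import Counter
from math import log2

def shannon_bits(text):
    """Total Shannon information of a string, using character
    frequencies estimated from the string itself (a rough proxy
    for the Pbc measure described in the previous post)."""
    counts = Counter(text)
    total = len(text)
    return sum(-n * log2(n / total) for n in counts.values())

def consonants_only(line, vowels="aeiou"):
    """Simulate the 'writing in consonants only' cipher by
    dropping every vowel from the line."""
    return "".join(c for c in line if c.lower() not in vowels)

line = "arma virumque cano troiae qui primus ab oris"
print(shannon_bits(line))                   # bits in the plain line
print(shannon_bits(consonants_only(line)))  # fewer bits once vowels are gone
```

Treating v and j as vowels (as in some Latin orthographies) is just a matter of changing the `vowels` argument, which is how the 45.6% vs. 49% spread above was produced.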
Hi Brian,
David Kahn tried really hard, but his book only really offers a narrow window onto 15th century ciphers and codes.
For example, Alberti reports a long conversation he had with a transposition cipher expert: but our net knowledge of transposition (strewing) ciphers in use in the 15th century is close to zero.
There are plenty of good reasons to think there is something quite atypical about Voynichese: to me, it feels like an elegant marriage between Milanese praxis and Tuscan brainpower - but I would say that, wouldn't I? ;-p
Cheers, Nick
Thanks for the insight, Nick! I completely overlooked transposition ciphers, but of course they have been in use for a long time. I would guess most good transposition ciphers would increase entropy, but I can imagine some lossy ones that would decrease it as well. I'll have to try some simulations with those.
The problem is that we have no idea how transposition ciphers were used in the 15th century, because we have zero examples to work with. Which is precisely why Alberti's report of a conversation with a highly skilled transposition cipher expert is so annoying. :-(
The low-level transposition cipher I mentioned in Curse was the one used by Filarete in his treatise on architecture, where he transposed a load of names (e.g. Galeazzo -> Zogalia, Averlino -> Nolivera, etc). Something as simple as that may not affect the entropy hugely? A century later this was mentioned as a Florentine schoolboy game (basically like Pig Latin), so I get the impression that transposition was more of a Florentine big-brain thing than an empirical Milanese thing.
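Filarete's name-scrambling can be sketched as reversing the order of a name's syllables. The syllabification below is supplied by hand, and it is my guess at the scheme: Averlino -> Nolivera works out exactly this way, though Galeazzo -> Zogalia evidently used a less mechanical split.

```python
def reverse_syllables(syllables):
    """Filarete-style name transposition: reverse the order of the
    (hand-supplied) syllables and rejoin them into one name."""
    return "".join(reversed(syllables)).capitalize()

print(reverse_syllables(["a", "ver", "li", "no"]))  # → Nolivera
```

Since this only permutes letters within the name, single-character frequencies are untouched; any entropy change would only show up in measures sensitive to letter order.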
What I find interesting about those transposed names is that this method is based on syllables, so it results in a speakable cipher. I suppose, if two speakers trained themselves, they could learn to speak this code fluently.
You're right, though, it doesn't have much impact on entropy as I am measuring it here.
We *do* know that scytales existed long before the medieval period, so I ran some tests simulating a scytale and (as one would expect) the entropy increased. I would expect similar results from all of the transposition ciphers we know about from later periods, because the whole point of those methods is to make the text appear noisier.
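The scytale simulation amounts to a columnar read-off: write the text in rows of a fixed width and read it back column by column, as if the strip were wound around a rod. The rod "circumference" below is an arbitrary choice for illustration:

```python
def scytale(text, circumference):
    """Read the text off in columns, as if wound around a rod that
    holds `circumference` letters per turn: every circumference-th
    letter lines up along the rod."""
    return "".join(text[i::circumference] for i in range(circumference))

print(scytale("inprincipiocreauitdeus", 4))
```

The output is a pure permutation of the input, so letter frequencies are preserved; the scrambling shows up in order-sensitive statistics, which is why the measured entropy went up.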
So is it possible that there was a 15th century transposition cipher that reduced entropy as I'm measuring it here? If so, how could it work?
In modern cryptography we are concerned about remaining faithful to the original plaintext, so in decipherment we want to be able to reproduce the original text exactly. But medieval cryptographers didn't seem to share that concern as strongly. Some of the non-transposition methods we know were in use (such as abbreviation and removing vowels) actually degrade the text a little, and the assumption is that the reader would be able to resolve the resulting ambiguities through familiarity with the language and the subject matter.
So maybe a 15th century transposition cipher could allow for some loss of fidelity, resulting in a certain number of ambiguities. As a crude example of what I'm thinking about, suppose we sort all of the letters in each word alphabetically, so "in principio creauit deus caelum et terram" becomes "in ciiinoppr aceirtu desu acelmu et aemrrt". That introduces noise, but reduces entropy, at the cost of creating ambiguities.
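That letter-sorting idea is a one-liner to simulate:

```python
def sort_letters(text):
    """Lossy transposition: alphabetize the letters within each word.
    Distinct words with the same letters collapse onto one ciphertext,
    so the result is lower-entropy but ambiguous to decipher."""
    return " ".join("".join(sorted(word)) for word in text.split())

print(sort_letters("in principio creauit deus caelum et terram"))
# → in ciiinoppr aceirtu desu acelmu et aemrrt
```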
It really is too bad we don't know more about these.
Medieval cryptographers and code-breakers had no useful concept of entropy, or indeed just about anything in our modern toolbox.
As I wrote in "Fifteenth Century Revisited" ( https://www.academia.edu/33813775/Fifteenth_Century_Cryptography_Revisited ), even the 'homophone trick' seems to have emerged primarily as a technique for visually concealing the tell-tale vowels at the end of Italian words, and then moving on to Latin. With all that in mind, the suggestion put forward by David Kahn that homophones were introduced to flatten out the stats looks a lot like a spurious back-projection.
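The final-vowel concealment described here can be sketched as replacing each word-final vowel with one of several interchangeable symbols, alternating within each set. The symbol alphabet below is invented purely for illustration:

```python
# Hypothetical homophone sets: each word-final vowel can be written
# as any symbol from its set (digits stand in for cipher glyphs).
HOMOPHONES = {"a": "12", "e": "34", "i": "56", "o": "78", "u": "90"}

def conceal_final_vowels(text):
    """Replace each word-final vowel with the next homophone in its
    set, hiding the tell-tale Italian word endings while keeping the
    cipher symbols' frequencies flat."""
    seen = {v: 0 for v in HOMOPHONES}
    out = []
    for word in text.split():
        if word and word[-1] in HOMOPHONES:
            v = word[-1]
            word = word[:-1] + HOMOPHONES[v][seen[v] % len(HOMOPHONES[v])]
            seen[v] += 1
        out.append(word)
    return " ".join(out)

print(conceal_final_vowels("la guerra e la pace"))  # → l1 guerr2 3 l1 pac4
```

Note that only the word endings are disguised; the rest of each word is left in clear, which fits the idea that this was concealment rather than statistical flattening.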
My overall position is therefore that Voynich-era cryptography was much more about concealment than transformation. As such, the way Voynichese works makes almost no sense... unless you accept that its words are brutally abbreviated (and also that the text was written more to remind than to encode), at which point all the binomial word length observations start to make sense.