It has often been observed that Voynich characters have relatively low entropy (cf. this discussion on René Zandbergen's site). This is a serious problem for the proposal I made in my last post, where I suggested that page 81R of the Voynich Manuscript might contain a poem in Latin dactylic hexameter.
Suppose you calculate the bits of information conveyed by a character c of a text T using a formula like the following:
Sc = (ln(fT) - ln(fc)) / ln(2)
where
Sc is the number of bits conveyed by the single character c
fT is the number of characters in the text
fc is the number of times the character c appears in the text
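As a rough illustration, the formula above might be computed in Python along the following lines. The function names char_bits and line_bits are made up for this sketch, and it counts every character in the string it is given, spaces included; whether spaces and punctuation should be stripped first is an assumption left to the reader.

```python
import math
from collections import Counter


def char_bits(text):
    """Return a dict mapping each character c to S_c = (ln(f_T) - ln(f_c)) / ln(2)."""
    counts = Counter(text)   # f_c for every character c
    total = len(text)        # f_T, the number of characters in the text
    return {
        c: (math.log(total) - math.log(n)) / math.log(2)
        for c, n in counts.items()
    }


def line_bits(line, bits):
    """Sum the per-character bits over one line (hypothetical helper).

    Every character in the line must also occur in the text used to build `bits`.
    """
    return sum(bits[c] for c in line)
```

Averaging line_bits over all the lines of a text gives a per-line figure of the kind quoted in this post.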
Using this formula we find that the lines on 81R carry, on average, 121.4 bits of information. In contrast, lines of the Aeneid carry an average of 156.1 bits of information. This is a real problem, which becomes even more severe if you look at the incremental information conveyed by the second character in a pair. That is, for a character c appearing immediately after a character b:
Pbc = (ln(fb) - ln(fbc)) / ln(2)
where
Pbc is the number of bits conveyed by character c when it appears in the pair bc
fb is the number of times the character b appears in the text
fbc is the number of times the sequence bc appears in the text, which may also be expressed as the number of times that the character c appears immediately after b.
This second approach to measuring information tells us, for example, that the character "u" in a Latin text conveys no additional information when it follows "q". Since the total frequency of "qu" is the same as the total frequency of "q", the numerator is zero, and the number of bits is likewise zero.
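The pair-based measure can be sketched the same way. This is only a minimal version under my own assumptions: pair_bits is a hypothetical name, pairs are taken across the whole string (including across spaces), and fb is counted over the entire text exactly as defined above.

```python
import math
from collections import Counter


def pair_bits(text):
    """Return a dict mapping each pair (b, c) to P_bc = (ln(f_b) - ln(f_bc)) / ln(2)."""
    char_counts = Counter(text)                 # f_b for every character b
    pair_counts = Counter(zip(text, text[1:]))  # f_bc for every adjacent pair bc
    return {
        (b, c): (math.log(char_counts[b]) - math.log(n)) / math.log(2)
        for (b, c), n in pair_counts.items()
    }
```

On the toy string "quo quid", every "q" is followed by "u", so the pair ("q", "u") comes out at zero bits, while "o" and "i" after "u" each carry one bit.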
Hi Brian!
I can't find your results of the Codex Rohonc transcription. You have done a great job, too bad you keep it to yourself. Maybe I don't know how to search? In any case, I wish you a good continuation!
I can't find it either! At some point moving from one server to another, I misplaced it. I still have the code to generate it, though, so I'll rebuild it and post it somewhere. (Maybe on github, so it's easier for other people to work with.)