Part of the algorithm that produces the images is supposed to sort the lexicon so that similar words are grouped together, creating the islands and bands of brightness that the images show. However, the sorting algorithm did not handle the case in which subsets of the lexicon have absolutely no similarity to each other. As a result, the Voynich image shows only one corner of the full lexicon (about 10%).
The full image is much darker, like a starry sky, implying a text that is less meaningful. That lends a little weight to the theory that the text may be (as Gordon Rugg has suggested) a meaningless hoax. However, I thought I should also test the possibility that it could be polyglot, since my samples so far have all been monoglot. With a polyglot text, my sorting algorithm would be hard pressed to group similar words together, because the lexicon would contain subsets of unrelated words.
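To make the problem concrete, here is a minimal sketch of one way a lexicon sort can cope with unrelated subsets. It is an illustration only, not the code behind these images: the similarity measure (Python's `difflib.SequenceMatcher`), the 0.5 threshold, and the function names `components`, `chain`, and `sort_lexicon` are all assumptions. The idea is to split the lexicon into groups of words that are linked by similarity, order each group separately, and then concatenate the groups, so that no unrelated subset is dropped.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Hypothetical similarity score in [0, 1]; a stand-in for the real measure."""
    return SequenceMatcher(None, a, b).ratio()


def components(words, threshold=0.5):
    """Split words into groups linked by similarity >= threshold (union-find)."""
    parent = list(range(len(words)))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(words)):
        for j in range(i + 1, len(words)):
            if similarity(words[i], words[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, word in enumerate(words):
        groups.setdefault(find(i), []).append(word)
    return list(groups.values())


def chain(group):
    """Greedy ordering: repeatedly append the word most similar to the last one."""
    remaining = list(group)
    ordered = [remaining.pop(0)]
    while remaining:
        nxt = max(remaining, key=lambda w: similarity(ordered[-1], w))
        remaining.remove(nxt)
        ordered.append(nxt)
    return ordered


def sort_lexicon(words, threshold=0.5):
    """Order each similarity group on its own, largest group first, keeping every word."""
    ordered = []
    for group in sorted(components(words, threshold), key=len, reverse=True):
        ordered.extend(chain(group))
    return ordered


if __name__ == "__main__":
    print(sort_lexicon(["wine", "amoris", "xyz", "amor", "wines", "amore"]))
```

In a polyglot lexicon, each language would tend to form its own group (or many small ones), which is exactly the case a single-pass sort can miss.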
After fixing my sorting algorithm, I produced a similarity map from the polyglot medieval collection Carmina Burana, then compared it to a similar-sized monoglot English text, and a similar-sized piece of the Voynich Manuscript. Here are the results.
First, the English text:
And now, the Carmina Burana. The image is actually much larger, because the lexicon is larger. Here, you can see the "starry sky" effect.
And last, the Voynich Manuscript. Again, the "starry sky" effect.
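The "starry sky" impression can also be put into a rough number. The sketch below is an assumption-laden illustration, not the method used for the images above: it builds a square map in which each cell holds the similarity of two words (again using `difflib.SequenceMatcher` as a stand-in measure) and reports the average brightness away from the diagonal. The names `similarity_map` and `mean_brightness` are hypothetical.

```python
from difflib import SequenceMatcher


def similarity_map(ordered_words):
    """Square matrix: cell (i, j) is the similarity of word i and word j, in [0, 1]."""
    return [
        [SequenceMatcher(None, a, b).ratio() for b in ordered_words]
        for a in ordered_words
    ]


def mean_brightness(matrix):
    """Average off-diagonal value; the diagonal (each word against itself) is skipped."""
    n = len(matrix)
    if n < 2:
        return 0.0
    total = sum(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


if __name__ == "__main__":
    related = ["amor", "amore", "amoris"]
    unrelated = ["amor", "wine", "xyz"]
    print(mean_brightness(similarity_map(related)))
    print(mean_brightness(similarity_map(unrelated)))
```

A lexicon of closely related words gives a higher average than a lexicon of unrelated ones, which corresponds to the brighter bands versus the darker, speckled field.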
Of the images I have produced so far, the Voynich Manuscript looks most like the Carmina Burana. However, I have the following caveats:
- Carmina Burana is much smaller than the VM. To do a better comparison, I will need a medieval polyglot/macaronic text around 250 KB in size. So far I have not found one.
- I should attempt to produce the same type of text using Cardan grilles, to test Gordon Rugg's hypothesis.
Lastly, I want to add a note about why I generally don't think Gordon Rugg's theory will turn out to be correct, and about the narrow circumstances under which I think he may be right.
I don't think the VM was generated using Cardan grilles, because the work required to generate a document the size of the VM would be significant, and there would be an easier way to do it. It would be much easier to invent a secret alphabet and simply babble along in a vernacular language. The Rugg hypothesis needs to address the question of why the extra effort would have been worth it to the con artist who produced the work.
The narrow circumstance under which this would make sense, to me, is if the VM were generated by someone using something like Cardan grilles as an oracular device, believing that he was thereby receiving a divine message.
However, if the VM is a hoax, I think we will one day decipher it, and we will find that it says something like "the king is a sucker. I can't wait till I'm done with this thing. I'm so sick of vellum...."