Saturday, March 2, 2019

Numbers and Words, Six and Eleven

There is a numerical sequence on page 23 that I have long taken for granted but, as far as I can tell, have never posted about. Here is the page:

There is a repeated formula in which a number is embedded. It starts out with the fairly obvious 4 and 5, represented by hash marks, then 6 by the glyph CY, 7 and 8 by CY I and CY I I, then 9 with LT, 10 with T, and 11 with IX.

Given the similarity between the number 6 and the first glyph of the phrase "King of the Jews", and the fact that the glyph for 11 could also represent the word "and", we can come up with some more fuzzy arguments about the underlying language of the text.

Hypothesis A: The glyph CY represents the number "six" and is phonologically similar to the glyph CX as part of the phrase "King of the Jews"

Language | King of the Jews | Value for CX | Value for CY
Koine Greek | βασιλεὺς τῶν Ἰουδαίων | b[a] | ἕξ
Hebrew | מלך היהודים | m | שישה
Albanian | o mbret i Judenjve | m[bre] | gjashtë
Old Church Slavonic | Царю Иудейский | ts[a] | šestĭ
Latin | rex Judaeorum | r[e] | sex
Romanian | Împăratul iudeilor | i[mp] | șase
Hungarian | zsidók[nak] királya | zs[i] | hat
Kalderash Romani | zhidovengo thagar | zh[i] | shov
Dutch | De koning van de Joden | k[o] | zes

Hypothesis B: The glyph IX represents the number "eleven" as well as the word "and", and the two words have some similarity. In this table I include also the number "one", in case the connection is that "eleven" is derived from "one", and "and" sounds like "one".

Language | Eleven | One | And
Koine Greek | ἕνδεκα | ἕνας | και
Hebrew | אחת עשרה | אחד | ו
Albanian | njëmbëdhjetë | një | dhe
Old Church Slavonic | ѥдинъ на десѧте | ѥдинъ | и
Latin | ūndecim | ūnus | et, -que
Romanian | unsprezece, unșpe, unsprăd̦ațe | unu | și
Hungarian | tizenegy | egy | és
Kalderash Romani | dešujek | jek | khaj, thaj
Dutch | elf | een | en

I added Dutch into the list because of the similarity between "one" and "and".
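
One rough way to put a number on "similarity" here is to compare spellings. The sketch below is only an illustration: it uses Python's standard `difflib.SequenceMatcher` on the written forms, which is a crude stand-in for a real phonological comparison, and it covers only the Latin-script rows of the table. Where the table lists several forms, the first is used. The names `ROWS`, `spelling_similarity` and `rank_by_one_and` are made up for this example.

```python
import unicodedata
from difflib import SequenceMatcher

# (eleven, one, and) for the Latin-script rows of the table above.
# Where the table gives several forms, only the first one is used here.
ROWS = {
    "Albanian": ("njëmbëdhjetë", "një", "dhe"),
    "Latin": ("ūndecim", "ūnus", "et"),
    "Romanian": ("unsprezece", "unu", "și"),
    "Hungarian": ("tizenegy", "egy", "és"),
    "Kalderash Romani": ("dešujek", "jek", "khaj"),
    "Dutch": ("elf", "een", "en"),
}


def spelling_similarity(a, b):
    """Return a 0..1 spelling similarity (not a phonological measure)."""
    # NFC normalization keeps accented letters as single code points,
    # so "é" is never matched against a plain "e".
    a = unicodedata.normalize("NFC", a.lower())
    b = unicodedata.normalize("NFC", b.lower())
    return SequenceMatcher(None, a, b).ratio()


def rank_by_one_and(rows):
    """Rank languages by how alike the words for 'one' and 'and' are spelled."""
    scores = {
        language: spelling_similarity(one, and_word)
        for language, (_eleven, one, and_word) in rows.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for language, score in rank_by_one_and(ROWS):
        print(f"{language:18s} {score:.2f}")
```

A spelling-based score like this ignores sound changes and shared etymology, so it can only flag candidates for a closer look.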

Thursday, February 28, 2019

Crazy Idea: What if the Rohonc Codex is in Romani?

In my last post I looked at a pair of glyphs that I think should stand for the phrase "King of the Jews." To clarify, I had landed on the following idea (which I probably didn't explain very well):

1. The glyph C represents the sound [ʃ] in the names for Christ (Krisztus, [kristuʃ]) and Pilate (Pilátus, [pilaːtuʃ])

2. In the glyph CX, the dot indicates an abbreviation for a word starting with the glyph C, or else possibly the sound [ʃi]

3. In the phrase King of the Jews, the glyph CX stands for the sound [ʒ] or [ʒi] in the name for the Jews (Modern Hungarian Zsidóknak, Middle Hungarian Sidóknac, [ʒidoːknak]).

Yesterday I realized that the Romani language should also be a candidate language in my analysis, and that I had completely left it out. Romani is interesting because it has not traditionally had its own writing system, and the spoken language has often been carefully guarded by its speakers, so it might be a natural subject for an innovative and secret writing system. In addition, the Roma have been present in Hungary and neighboring areas for centuries. To top it off, the phonological argument above could work just as well for Romani, where the phrase King of the Jews could be (depending on dialect) zhidovengo thagar, zhidovengo krai, etc.

Saturday, February 23, 2019

CX XCX: King of the Jews, and a moderate argument for Hungarian

The following pair of glyphs, which I transcribe as CX.XCX, is apparently an abbreviation for the phrase "King of the Jews":

The evidence for this is first found in the depictions of the titulus crucis, which reads IGHA CX XC C I. (The XC is missing the dot which would make it XCX).

The titulus crucis is said to contain, according to different accounts:

  • Mark: The King of the Jews
  • Luke: This is the King of the Jews
  • Matthew: This is Jesus, the King of the Jews
  • John: Jesus the Nazarene, King of the Jews
From the titulus crucis alone we can't tell which glyphs are supposed to represent what, but the matter is clarified somewhat by a depiction of the scene where the crown of thorns is placed on Jesus' head:


According to the gospels, Pilate's soldiers mocked Jesus by putting a crown of thorns on his head, a purple robe on his body, and a reed in his hands, and hailing him as the King of the Jews. Pilate himself was not present at this scene, having scourged Jesus and sent him away, but the RC has Pilate here kneeling before Jesus. The dots on Pilate's robe and on the robe worn by Christ in this scene presumably represent the royal purple.

Between Pilate and Jesus is the glyph XA, which could be a verb like "call", "mock" or "ask", or else an expression like "Hail!" But for this post, I'll focus on the phrase king of the Jews.

The glyphs CX and XCX each include a dot, which could indicate an abbreviation. We also see the dot in the glyph QX, which could be an abbreviation for the word "chapter". In that case, we might guess that the glyphs C and XC (without the dot) should represent the initials of the words for king and Jews.

The following table shows the phrase king of the Jews in a number of candidate languages, and the resulting readings for the glyphs C and XC:

Language | Phrase | C | XC
Koine Greek | βασιλεὺς τῶν Ἰουδαίων | b[a] | i[ou]
Hebrew | מלך היהודים | m | y
Albanian | o mbret i Judenjve | m[bre] | j[u]
Old Church Slavonic | Царю Иудейский | ts[a] | j[u]
Latin | rex Judaeorum | r[e] | j[u]
Romanian | Împăratul iudeilor | i[mp] | i[u]
Hungarian | zsidók[nak] királya | zs[i] | k[i]
Middle Hungarian (Gáspár Heltai, 1565) | Sidóknac királlya

In this table the most interesting possibility to me is Middle Hungarian, where the glyph C would be read as an s. We see the glyph C at the end of the name of Pilate (Pilátus in Hungarian) and in the title Christ (CUTB.C, Hungarian Krisztus). This glyph also looks like a mirror image of the Greek letter sigma as it is written in uncial script (also the Cyrillic s).

This is the best argument I've put together so far for a specific language underlying the text. It would be even better if we could explain the C.I at the end of this phrase.

If I'm quiet on the blog for a while now, it's because I'm learning Hungarian on Duolingo. I don't have much free time, so one hobby generally pushes out another...

Wednesday, February 20, 2019

Nouns in the Rohonc Codex

I have a word-splitting algorithm and a context similarity algorithm. If I feed my Rohonc transcription into the word-splitting algorithm, then feed the resulting words into the context similarity algorithm, I can get a sense of which words in the Rohonc Codex are similar to each other in meaning or function.

As an example of how the context similarity algorithm works, if I use the King James version of the book of Matthew as a source and I look for words similar to "Jesus", I get the following top five words:


These are all nouns and pronouns referring to human beings.
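
A context similarity measure of this general kind can be sketched as follows. This is not necessarily the algorithm used for the results in this post. It is a minimal, generic version that assumes each word is described by counts of its immediate left and right neighbours and that two words are compared by the cosine of those count vectors. The function names and the `window` parameter are invented for the example.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(tokens, window=1):
    """Map each word to a Counter of (offset, neighbouring word) pairs."""
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                # Keep the offset so left and right contexts stay distinct.
                vectors[word][(j - i, tokens[j])] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = sqrt(sum(count * count for count in u.values()))
    norm_v = sqrt(sum(count * count for count in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(target, tokens, top_n=5, window=1):
    """Return up to top_n (word, score) pairs whose contexts resemble target's."""
    vectors = context_vectors(tokens, window)
    if target not in vectors:
        return []
    scores = [
        (word, cosine(vectors[target], vector))
        for word, vector in vectors.items()
        if word != target
    ]
    # Highest score first; ties are broken alphabetically for stable output.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]
```

On a toy sentence such as "jesus said unto them and peter said unto him and jesus went forth and peter went forth", the word ranked closest to "jesus" is "peter", because both follow "and" and precede "said" or "went".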

What do we get if we perform the same steps on the Rohonc codex? I have previously identified the word below as possibly meaning "Jesus":

Here are the words my algorithm identifies as most similar in the RC:

CUT.C: tentatively "Pilate"
CQ.B1CU: not previously identified
N: first glyph of "Jesus"
IGHA: second glyph of "Jesus"
RT.A.CO.D: tentatively a saint's name

It is rewarding to see that the word I previously identified as "Pilate" appears at the top of the list, and a saint's name appears at the bottom of the list.

The wholly new information is the second word, CQ.B1CU, which now looks like a noun referring to a human being.

The appearance of N and IGHA as similar to N.IGHA is also interesting and deserves investigation. Indeed, in the titulus crucis only the second glyph is used to represent the name Jesus. Presumably this is really a two-word phrase, and only the second glyph actually represents the name "Jesus," with the first glyph N representing the title "Lord" or something similar.

Tuesday, February 12, 2019

Word Breaks in the Voynich Manuscript

I have been reworking my word break algorithm, and it is now much more accurate than it was before. For example, in the Vulgate Genesis, I get word breaks like the following:


In this sample text there were 40 genuine word breaks, and my algorithm correctly identified 31 of them, with four false positives and eight genuine breaks missed.

Similarly with the King James Genesis:


Here the sample text contained 41 genuine word breaks, and the algorithm correctly identified 29 of them, with one false positive and 11 genuine breaks missed.

So what about the Voynich Manuscript? Do the word breaks identified by my algorithm match up to the spaces in the manuscript? Here are the first four lines of the VM in EVA transcription:

fachy sy kal ar ataiin sholshory cthresy kor sholdy
sory ckhar or y kairchtaiin sharasecthar cthar dan
sy aiirsheky ory kaiin shodcthoary cthesdar aiin sy
soiin oteey oteosrol oty cthiar daiin okaiin or okan

In this case, the VM had 39 apparent word breaks, and my algorithm identified 28 of them with five false positives.

From this, it appears word breaks in the Voynich Manuscript act like word breaks in the other sample texts.
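
To compare the three samples on a common scale, the counts quoted above can be turned into precision and recall. The sketch below uses only the stated totals, correct identifications and false positives. For the Voynich Manuscript it treats the spaces in the transcription as the "genuine" breaks, which is an assumption. The function name `break_scores` and the `SAMPLES` table are made up for this example.

```python
def break_scores(correct, false_positives, genuine_breaks):
    """Return (precision, recall, f1) for a set of predicted word breaks."""
    predicted = correct + false_positives
    precision = correct / predicted       # share of predicted breaks that are real
    recall = correct / genuine_breaks     # share of real breaks that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# (correct, false positives, genuine breaks) as quoted in the text above.
SAMPLES = {
    "Vulgate Genesis": (31, 4, 40),
    "King James Genesis": (29, 1, 41),
    "Voynich Manuscript (EVA)": (28, 5, 39),
}

if __name__ == "__main__":
    for name, counts in SAMPLES.items():
        precision, recall, f1 = break_scores(*counts)
        print(f"{name:26s} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

With these inputs all three samples have a recall between roughly 0.7 and 0.8, which matches the impression that the Voynich spaces behave like word breaks in the known texts.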

Friday, February 8, 2019

Word Breaks in the Rohonc Codex

So far I have come up with two ways to identify word breaks in the RC. One is based on the relative frequency of initial and final glyphs as inferred from the presence or absence of hyphens at the ends of lines, and the other is based on the space between glyphs on the page.

The image below shows the identified word breaks on page 50, where the blue lines indicate word breaks based on the space between glyphs and the red lines indicate word breaks based on the formula above. The placement of some of the blue lines is odd, and I think that probably goes back to problems with how the glyph recognition algorithm counts the spaces between glyphs.

It's a good sign that the two algorithms often identify the same word breaks, giving weight to the idea that spaces do indicate word breaks in the text. Many of these word divisions seem intuitively right, while others are surprising and worth investigating.
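
How often the two methods agree could be measured along the following lines. This is a minimal sketch that assumes each method yields a set of break positions (glyph indices after which a break is placed). The function name `break_agreement` and the example positions are hypothetical and do not come from page 50.

```python
def break_agreement(breaks_a, breaks_b):
    """Compare two collections of word-break positions.

    Returns the shared positions, the positions unique to each method,
    and the Jaccard index (shared / all distinct positions).
    """
    a, b = set(breaks_a), set(breaks_b)
    shared = a & b
    union = a | b
    # Two empty sets agree trivially.
    jaccard = len(shared) / len(union) if union else 1.0
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        "jaccard": jaccard,
    }


if __name__ == "__main__":
    # Hypothetical positions, for illustration only.
    spacing_breaks = [3, 7, 12, 18]
    frequency_breaks = [3, 7, 13, 18, 22]
    print(break_agreement(spacing_breaks, frequency_breaks))
```

The positions found by only one method are the "surprising" divisions worth a closer look.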

Tuesday, February 5, 2019

What if the Rohonc Codex describes a cycle of religious plays?

Here is a theory I am currently entertaining: that the Rohonc Codex describes a cycle of religious plays enacting events in the life of Christ, with each episode describing an act to be performed on a particular day. The text served as a guideline to how the scenes should be set and acted out, with a rough skeleton of the dialog to be embellished by the actors. It is written in code in order to protect the trade secrets of a particular company of players from their competitors.

This could explain, for example, why the names of saints and other figures occur so much more frequently in the text than they do in the scriptures on which the text is apparently based. If you are going to entertain your audience with the encounter between Pilate and Christ for the duration of a scene, the two of them will need to say somewhat more to each other than they do in the scriptures.

Here is how I arrived at that theory:

When I am looking at a problem with many unknown factors, I build a probability tree populated with my estimates of the likelihood of different elements. Here is what my probability tree for the Rohonc Codex looks like:

1. The RC text is intentionally secret (85%)
    1.1 The secrecy of the text protects the reader from persecution (20%, since most heretical texts are not written in code, and a text in a secret script would be immediately suspicious anyway)
    1.2 The secrecy of the text simply protects the content from becoming publicly known (80%)
        1.2.1 The images in the text faithfully represent the content (90%)
        1.2.2 The images in the text are intended to mislead the uninitiated (10%)
2. The RC text is accidentally secret (10%)
    2.1 The script was invented by an otherwise illiterate person (60%)
    2.2 The script was invented to write a language that didn't have a script (40%)
3. The RC text is meaningless (5%)
    3.1 The text is the output of some kind of religious or spiritual process (30%)
    3.2 The text is a hoax intended to fool someone (70%)

The odds that I assign here are based on my impressions of the nature of existing encoded documents, but could certainly be improved by a thorough survey of texts. When the odds are multiplied out, the most likely case in my estimation is 1.2.1, where 85% x 80% x 90% = 61.2%. That is, I think it is most likely that this text contains something religious that is intended to be kept secret, but not kept secret to protect the reader from persecution.
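
The multiplication can be carried out for every leaf of the tree at once. The sketch below encodes the estimates listed above as nested tuples and multiplies the conditional probabilities along each path. The data layout and the names `TREE` and `leaf_probabilities` are choices made for this example only.

```python
# Each node is (conditional probability, children); leaves have no children.
TREE = {
    "1": (0.85, {
        "1.1": (0.20, {}),
        "1.2": (0.80, {
            "1.2.1": (0.90, {}),
            "1.2.2": (0.10, {}),
        }),
    }),
    "2": (0.10, {
        "2.1": (0.60, {}),
        "2.2": (0.40, {}),
    }),
    "3": (0.05, {
        "3.1": (0.30, {}),
        "3.2": (0.70, {}),
    }),
}


def leaf_probabilities(tree, parent=1.0):
    """Multiply probabilities down each branch and return {leaf label: joint probability}."""
    leaves = {}
    for label, (probability, children) in tree.items():
        joint = parent * probability
        if children:
            leaves.update(leaf_probabilities(children, joint))
        else:
            leaves[label] = joint
    return leaves


if __name__ == "__main__":
    leaves = leaf_probabilities(TREE)
    for label, joint in sorted(leaves.items(), key=lambda item: -item[1]):
        print(f"{label:6s} {joint:.1%}")
```

The leaf probabilities sum to 100%, and 1.2.1 comes out on top at 61.2%, well ahead of the next case, 1.1, at 17%.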

I have not been able to come up with a genre of texts that cleanly fits into 1.2.1, since the images do not suggest anything especially secret in the content. As a result, I have been looking for genres where the secret to be kept might lie in the way that the publicly known information is conveyed.