Sektu: Rohonc


Saturday, March 2, 2019

Numbers and Words, Six and Eleven

There is a numerical sequence on page 23 that I have long taken for granted but, as far as I can tell, have never posted about. Here is the page:

There is a repeated formula in which a number is embedded. It starts out with the fairly obvious 4 and 5, represented by hash marks, then 6 by the glyph CY, 7 and 8 by CY I and CY I I, then 9 with LT, 10 with T, and eleven with IX.

Given the similarity between the number 6 and the first glyph of the phrase "King of the Jews", and the fact that the glyph for 11 could also represent the word "and", we can come up with some more fuzzy arguments about the underlying language of the text.

Hypothesis A: The glyph CY represents the number "six" and is phonologically similar to the glyph CX as part of the phrase "King of the Jews"

Language	King of the Jews	Value for CX	Value for CY
Koine Greek	βασιλεὺς τῶν Ἰουδαίων	b[a]	ἕξ
Hebrew	מלך היהודים	m	שישה
Albanian	o mbret i Judenjve	m[bre]	gjashtë
Old Church Slavonic	Царю Иудейский	ts[a]	šestĭ
Latin	rex Judaeorum	r[e]	sex
Romanian	Împăratul iudeilor	i[mp]	șase
Hungarian	zsidók[nak] királya	zs[i]	hat
Kalderash Romani	zhidovengo thagar	zh[i]	shov
Dutch	De koning van de Joden	k[o]	zes

Hypothesis B: The glyph IX represents the number "eleven" as well as the word "and", and the two words have some similarity. In this table I include also the number "one", in case the connection is that "eleven" is derived from "one", and "and" sounds like "one".

Language	Eleven	One	And
Koine Greek	ἕνδεκα	εἷς	καί
Hebrew	אחת עשרה	אחד	ו
Albanian	njëmbëdhjetë	një	dhe
Old Church Slavonic	ѥдинъ на десѧте	ѥдинъ	и
Latin	ūndecim	ūnus	et, -que
Romanian	unsprezece, unșpe, unsprăd̦ațe	unu	și
Hungarian	tizenegy	egy	és
Kalderash Romani	dešujek	jek	khaj, thaj
Dutch	elf	een	en

I added Dutch into the list because of the similarity between "one" and "and".

Thursday, February 28, 2019

Crazy Idea: What if the Rohonc Codex is in Romani?

In my last post I looked at a pair of glyphs that I think should stand for the phrase "King of the Jews." To clarify, I had landed on the following idea (which I probably didn't explain very well):

1. The glyph C represents the sound [ʃ] in the names for Christ (Krisztus, [kristuʃ]) and Pilate (Pilátus, [pilaːtuʃ])

2. In the glyph CX, the dot indicates an abbreviation for a word starting with the glyph C, or else possibly the sound [ʃi]

3. In the phrase King of the Jews, the glyph CX stands for the sound [ʒ] or [ʒi] in the name for the Jews (Modern Hungarian Zsidóknak, Middle Hungarian Sidóknac, [ʒidoːknak]).

Yesterday I realized the Romani language should also be a candidate language in my analysis, and I have completely left it out. Romani is interesting because it has not traditionally had its own writing system, and the spoken language has often been carefully guarded by its speakers, so it might be a natural subject for an innovative and secret writing system. In addition, the Roma have been present in Hungary and neighboring areas for centuries. To top it off, the phonological argument above could work just as well for Romani, where the phrase King of the Jews could be (depending on dialect) zhidovengo thagar, zhidovengo krai, etc.

Saturday, February 23, 2019

CX XCX: King of the Jews, and a moderate argument for Hungarian

The following pair of glyphs, which I transcribe as CX.XCX, is apparently an abbreviation for the phrase "King of the Jews":

The evidence for this is first found in the depictions of the titulus crucis, which reads IGHA CX XC C I. (The XC is missing the dot which would make it XCX).

The titulus crucis is said to contain, according to different accounts:

Mark: The King of the Jews
Luke: This is the King of the Jews
Matthew: This is Jesus, the King of the Jews
John: Jesus the Nazarene, King of the Jews

From the titulus crucis alone we can't tell which glyphs are supposed to represent what, but the matter is clarified somewhat by a depiction of the scene where the crown of thorns is placed on Jesus' head:

HX.H.HK CUT.C XA IGHA CX XCX C I

According to the gospels, Pilate's soldiers mocked Jesus by putting a crown of thorns on his head, a purple robe on his body, and a reed in his hands, and hailing him as the King of the Jews. Pilate himself was not present at this scene, having scourged Jesus and sent him away, but the RC has Pilate here kneeling before Jesus. The dots on Pilate's robe and on the robe worn by Christ in this scene presumably represent the royal purple.

Between Pilate and Jesus is the glyph XA, which could be a verb like "call", "mock" or "ask", or else an expression like "Hail!" But for this post, I'll focus on the phrase king of the Jews.

The glyphs CX and XCX each include a dot, which could indicate an abbreviation. We also see the dot in the glyph QX, which could be an abbreviation for the word "chapter". In that case, we might guess that the glyphs C and XC (without the dot) should represent the initials of the words for king and Jews.

The following table shows the phrase king of the Jews in a number of candidate languages, and the resulting readings for the glyphs C and XC:

Language	Phrase	C	XC
Koine Greek	βασιλεὺς τῶν Ἰουδαίων	b[a]	i[ou]
Hebrew	מלך היהודים	m	y
Albanian	o mbret i Judenjve	m[bre]	j[u]
Old Church Slavonic	Царю Иудейский	ts[a]	j[u]
Latin	rex Judaeorum	r[e]	j[u]
Romanian	Împăratul iudeilor	i[mp]	i[u]
Hungarian	zsidók[nak] királya	zs[i]	k[i]
Middle Hungarian (Gáspár Heltai, 1565)	Sidóknac királlya	s[i]	k[i]

In this table the most interesting possibility to me is Middle Hungarian, where the glyph C would be read as an s. We see the glyph C at the end of the name of Pilate (Pilátus in Hungarian) and in the title Christ (CUTB.C, Hungarian Krisztus). This glyph also looks like a mirror image of the Greek letter sigma as it is written in uncial script (also the Cyrillic s).

This is the best argument I've put together so far for a specific language underlying the text. It would be even better if we could explain the C.I at the end of this phrase.

If I'm quiet on the blog for a while now, it's because I'm learning Hungarian on Duolingo. I don't have much free time, so one hobby generally pushes out another...

Wednesday, February 20, 2019

Nouns in the Rohonc Codex

I have a word-splitting algorithm and a context similarity algorithm. If I feed my Rohonc transcription into the word-splitting algorithm, then feed the resulting words into the context similarity algorithm, I can get a sense of which words in the Rohonc Codex are similar to each other in meaning or function.

As an example of how the context similarity algorithm works, if I use the King James version of the book of Matthew as a source and I look for words similar to "Jesus", I get the following top five words:

Similarity	Word
0.37	HE
0.25	THEY
0.22	PILATE
0.22	PETER
0.18	SHE

These are all nouns and pronouns referring to human beings.
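The post does not spell out how the context similarity score is computed, so the following is only a minimal sketch of one common approach: represent each word by counts of its immediate neighbours and compare words by cosine similarity. The function names, the window size, and the toy sentence are all assumptions made for illustration, not the algorithm actually used here.

```python
from collections import Counter
from math import sqrt

def context_vectors(tokens, window=1):
    """Map each word to a Counter of (offset, neighbour) pairs seen around it."""
    vectors = {}
    for i, word in enumerate(tokens):
        ctx = vectors.setdefault(word, Counter())
        for offset in range(-window, window + 1):
            j = i + offset
            if offset != 0 and 0 <= j < len(tokens):
                ctx[(offset, tokens[j])] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def most_similar(target, vectors, top=5):
    """Return the `top` (score, word) pairs most similar to `target`."""
    scores = [(cosine(vectors[target], vec), word)
              for word, vec in vectors.items() if word != target]
    return sorted(scores, reverse=True)[:top]

# Toy input (hypothetical, not a quotation from any source text).
tokens = "JESUS SAID UNTO THEM AND PETER SAID UNTO HIM AND PILATE SAID UNTO THEM".split()
vectors = context_vectors(tokens)
ranked = most_similar("JESUS", vectors)
```

With a real corpus the same functions would be run over the word list produced by the word-splitting step; words that occur in the same slots (for example, before a verb of speaking) end up with high scores.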

What do we get if we perform the same steps on the Rohonc codex? I have previously identified the word below as possibly meaning "Jesus":

N.IGHA

Here are the words my algorithm identifies as most similar in the RC:

Similarity	Word	Transcription	Notes
0.27		CUT.C	tentatively "Pilate"
0.24		CQ.B1CU	unknown
0.24		N	first glyph of "Jesus"
0.22		IGHA	second glyph of "Jesus"
0.22		RT.A.CO.D	tentatively a saint's name

It is rewarding to see that the word I previously identified as "Pilate" appears at the top of the list, and a saint's name appears at the bottom of the list.

The wholly new information is the second word, CQ.B1CU, which now looks like a noun referring to a human being.

The appearance of N and IGHA as similar to N.IGHA is also interesting and deserves investigation. Indeed, in the titulus crucis only the second glyph is used to represent the name Jesus. Presumably this is really a two-word phrase, and only the second glyph actually represents the name "Jesus," with the first glyph N representing the title "Lord" or something similar.

Friday, February 8, 2019

Word Breaks in the Rohonc Codex

So far I have come up with two ways to identify word breaks in the RC. One is based on the relative frequency of initial and final glyphs as inferred from the presence or absence of hyphens at the ends of lines, and the other is based on the space between glyphs on the page.

The image below shows the identified word breaks on page 50, where the blue lines indicate word breaks based on space between glyphs and the red lines indicate word breaks based on the formula above. The placement of some of the blue lines is odd, and I think that probably goes back to problems with how the glyph recognition algorithm counts the spaces between glyphs.

It's a good sign that the two algorithms often identify the same word breaks, giving weight to the idea that spaces do indicate word breaks in the text. Many of these word divisions seem intuitively right, while others are surprising and worth investigating.
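The hyphen-based statistics can be gathered in a few lines. This is a minimal sketch under a loudly stated assumption: that a line ending without a hyphen ends a word (so its last glyph is a word final and the first glyph of the next line is a word initial), while a hyphenated line tells us nothing about a boundary. The `HYPHEN` marker, the list-of-glyph-codes format, and the sample lines are hypothetical.

```python
from collections import Counter

HYPHEN = "-"  # assumed marker for a line-final hyphen in the transcription

def boundary_counts(lines):
    """Count glyphs at line breaks that are known word boundaries.

    Each line is a list of glyph codes. A break after an unhyphenated
    line is treated as a word break; a hyphenated line is skipped.
    """
    initials, finals = Counter(), Counter()
    for current, following in zip(lines, lines[1:]):
        if not current or not following or current[-1] == HYPHEN:
            continue  # empty line, or the word continues on the next line
        finals[current[-1]] += 1
        initials[following[0]] += 1
    return initials, finals

# Made-up lines, for illustration only.
lines = [
    ["CO", "D"],
    ["N", "IGHA", "C", HYPHEN],
    ["I", "D"],
    ["C", "O"],
]
initials, finals = boundary_counts(lines)
```

Dividing each count by the glyph's overall frequency would give a rough estimate of how strongly that glyph prefers the start or end of a word.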

Tuesday, February 5, 2019

What if the Rohonc Codex describes a cycle of religious plays?

Here is a theory I am currently entertaining: That the Rohonc Codex describes a cycle of religious plays enacting events in the life of Christ, with each episode describing an act to be performed on a particular day. The text served as a guideline to how the scenes should be set and acted out, with a rough skeleton of the dialog to be embellished by the actors. It is written in code in order to protect the trade secrets of a particular company of players from their competitors.

This could explain, for example, why the names of saints and other figures occur so much more frequently in the text than they do in the scriptures on which the text is apparently based. If you are going to entertain your audience with the encounter between Pilate and Christ for the duration of a scene, the two of them will need to say somewhat more to each other than they do in the scriptures.

Here is how I arrived at that theory:

When I am looking at a problem with many unknown factors, I build a probability tree populated with my estimates of the likelihood of different elements. Here is what my probability tree for the Rohonc Codex looks like:

1. The RC text is intentionally secret (85%)

1.1 The secrecy of the text protects the reader from persecution (20%, since most heretical texts are not written in code, and a text in a secret script would be immediately suspicious anyway)

1.2 The secrecy of the text simply protects the content from becoming publicly known (80%)

1.2.1 The images in the text faithfully represent the content (90%)

1.2.2 The images in the text are intended to mislead the uninitiated (10%)

2. The RC text is accidentally secret (10%)

2.1 The script was invented by an otherwise illiterate person (60%)

2.2 The script was invented to write a language that didn't have a script (40%)

3. The RC text is meaningless (5%)

3.1 The text is the output of some kind of religious or spiritual process (30%)

3.2 The text is a hoax intended to fool someone (70%)

The odds that I assign here are based on my impressions of the nature of existing encoded documents, but could certainly be improved by a thorough survey of texts. When the odds are multiplied out, the most likely case in my estimation is 1.2.1, where 85% x 80% x 90% = 61.2%. That is, I think it is most likely that this text contains something religious that is intended to be kept secret, but not kept secret to protect the reader from persecution.
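Multiplying out the tree can be sketched as follows. The percentages are the ones listed above; the nested-dictionary layout, the shortened labels, and the function name are assumptions chosen for illustration.

```python
# Each node is (probability given its parent, children).
tree = {
    "1 intentionally secret": (0.85, {
        "1.1 protects reader from persecution": (0.20, {}),
        "1.2 protects content from publicity": (0.80, {
            "1.2.1 images faithful to content": (0.90, {}),
            "1.2.2 images meant to mislead": (0.10, {}),
        }),
    }),
    "2 accidentally secret": (0.10, {
        "2.1 invented by an otherwise illiterate person": (0.60, {}),
        "2.2 invented for a language without a script": (0.40, {}),
    }),
    "3 meaningless": (0.05, {
        "3.1 output of a religious or spiritual process": (0.30, {}),
        "3.2 hoax": (0.70, {}),
    }),
}

def leaf_probabilities(tree, prior=1.0):
    """Multiply conditional probabilities down each branch to the leaves."""
    leaves = {}
    for label, (p, children) in tree.items():
        joint = prior * p
        if children:
            leaves.update(leaf_probabilities(children, joint))
        else:
            leaves[label] = joint
    return leaves

leaves = leaf_probabilities(tree)
best = max(leaves, key=leaves.get)

for label, p in sorted(leaves.items(), key=lambda item: -item[1]):
    print(f"{p:6.1%}  {label}")
```

Because the branches at each level sum to 100%, the leaf probabilities also sum to 100%, which is a quick sanity check on the tree.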

I have not been able to come up with a genre of texts that cleanly fits into 1.2.1, since the images do not suggest anything especially secret in the content. As a result, I have been looking for genres where the secret to be kept might lie in the way that the publicly known information is conveyed.

Sunday, February 3, 2019

A later reference to the Three Tablets

Page 14 shows three tablets with writing on them, and two figures standing among them. Each tablet is numbered and contains a variation on a common formula.

An apparently expanded version of the text of the tablets occurs on page 209. In the image below, the red boxes show text from tablet 1, the green boxes tablet 2, and the blue boxes tablet 3.

Friday, February 1, 2019

Musical notation in the Rohonc Codex

Page 212 has a pattern of dots over the glyphs in one line of text:

CO D QVB N CX C XCAA CVD OD DP IL HX H H N C F1XX

This is reminiscent of liturgical musical notation:

Suppose the dots on page 212 represent the notes of a tune that the words are to be sung to. What could that tell us?

Looking at the first two glyphs, CO D, the dots seem to suggest that each would be sung with a rising melody, which in turn implies that each glyph represents at least a syllable. The number of dots over the other glyphs suggests a minimally syllabic reading for most of them as well.

Based on the occurrence of sequences elsewhere in the text, this line appears to break down into words as follows:

[CO.D] [QVB.N.CX] [C] [XCAA.CVD.OD.DP.IL] [HX.H.H.N] [C.F1XX]

If you had a missal (or similar text), and only one line was going to be marked specifically for singing, what would that line be?

The page opposite has an image of two men in a boat talking to Jesus, who is not in the boat, and a scriptural reference to the sixth chapter of HF.HS (tentatively John). If the musical line is associated with the scriptural reference, then it seems like it could be a line from the 78th psalm, which is quoted in John 6: Panem de caelo dedit eis manducare, "He gave them bread from Heaven to eat."

Monday, January 28, 2019

How much information is in the Rohonc Codex?

I'm getting back to the Rohonc codex after a few years away, and I'm wondering how much information is stored in the text. That is, if the text were efficiently translated into another language, how long a text would it be?

Since I don't know where word breaks fall, I can only base the calculation on symbols in the text. Using Shannon's information entropy formula, I can calculate the entropy of some sample texts as follows:

King James Bible (letters):	4.1
King James Bible (words):	8.9
Latin Vulgate book of Genesis (letters):	4.1
Latin Vulgate book of Genesis (words):	10.3
Voynich Manuscript (letters):	4.0
Voynich Manuscript (words):	10.5
Rohonc Codex (glyphs):	5.9

These values make intuitive sense to me. A word in Latin conveys more information than a word in English because it carries more inflectional morphemes. (The Voynich numbers are just thrown in for fun, since we don't know what they convey).
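For reference, Shannon entropy over any sequence of symbols (letters, words, or glyph codes) can be computed as in this minimal sketch. How the sample texts were tokenised (case, spaces, punctuation) is not stated in the post, so the preprocessing shown is an assumption.

```python
from collections import Counter
from math import log2

def shannon_entropy(symbols):
    """Entropy in bits per symbol: H = -sum(p * log2(p)) over symbol frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())

# Illustrative input only; a real measurement needs a full text.
text = "in the beginning was the word and the word was with god"
letter_entropy = shannon_entropy(text.replace(" ", ""))  # per letter, spaces dropped
word_entropy = shannon_entropy(text.split())             # per word
```

The same function works on a list of Rohonc glyph codes, which is all that is needed for the glyph-level figure.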

From these values, it looks like a glyph from the Rohonc codex conveys more information than a character in English or Latin. That seems like the right answer given the number of symbols in the system. But how do we use that information to calculate the size of the text?

Letters convey phonological information, but words convey meaning, so the size of a text in letters means something different from the size in words, and we have to do some conversion to get from one to the other.

If we assume the Rohonc script is primarily phonological and roughly as efficient as English and Latin, then the size of the text in phonological terms would be about:

5.9 bits/glyph x 60,142 glyphs = 354,837.8 bits

This would be equivalent to a Latin text of length:

354,837.8 bits ÷ 4.1 bits/letter = 86,545.8 letters

If the underlying language were Latin, then given the number of letters per Latin word (5.2), we could calculate the expected number of Latin words. That would be:

86,545.8 letters ÷ 5.2 letters/word = 16,643.4 words

Given this, if the language were Latin, then each word would consume about 3.6 Rohonc glyphs. That is 60,142 glyphs ÷ 16,643.4 words = 3.6 glyphs/word. If the language were English, each word would consume 2.8 glyphs/word. Probably it is safe to assume Rohonc words are 3-4 glyphs long, on average.
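The Latin estimate above can be reproduced step by step. This sketch uses only the figures quoted in the post; the variable names are mine, and the English case is left out because the letters-per-word figure used for English is not given.

```python
# Figures quoted in the post.
glyph_entropy = 5.9            # bits per Rohonc glyph
glyph_count = 60_142           # glyphs in the transcription
latin_letter_entropy = 4.1     # bits per Latin letter
latin_letters_per_word = 5.2   # average Latin word length

total_bits = glyph_entropy * glyph_count
latin_letters = total_bits / latin_letter_entropy
latin_words = latin_letters / latin_letters_per_word
glyphs_per_word = glyph_count / latin_words

print(f"{total_bits:,.1f} bits")
print(f"{latin_letters:,.1f} letters")
print(f"{latin_words:,.1f} words")
print(f"{glyphs_per_word:.1f} glyphs/word")
```

The glyph count cancels out of the last step, so glyphs per word is simply (4.1 × 5.2) ÷ 5.9; the estimate depends only on the three entropy and word-length figures.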

Monday, March 23, 2015

A Weak Argument for the Rohonc Codex as a Bogomil Text

I often privately come back to the question of whether the Rohonc Codex could be a Bogomil text. I think that could only really be proven by decipherment, but here let me lay out a weak argument in favor of that idea.

We don't know much about Bogomil theology, but of the small amount that we do know, I think there are three things that may be reflected in the imagery of the RC.

1. Adoptionism: This is a theological teaching that Jesus was born a man, but was adopted by God as His son. The circumstantial evidence for this in the RC is the fact that Jesus is normally portrayed in a way similar to other saintly people (with a simple turban-halo), but at some point (after the crucifixion?) appears to be transformed into a giant luminous being. Does this represent the adoption of Jesus by God?

Jesus riding into Jerusalem

Luminous Jesus

2. Absence of Mary: One of the implications of adoptionism is that Mary would have become pregnant in the normal earthly manner, rather than through divine intervention. As a result, adoptionists like the Bogomils and Paulicians did not revere Mary in the same way that the Catholic and Eastern Orthodox churches did. The circumstantial evidence for this in the RC is the fact that Mary is rarely depicted, and never given a special status.

The family of Jesus receiving the Magi

3. Dualism: In addition to being adoptionist, the Bogomils were apparently dualist, believing in the existence of an evil demiurge as a counterpart to God. Some dualists, like the Marcionites, held that the demiurge was none other than the God of the Old Testament. The main circumstantial evidence for this in the RC is the absence of clear Old Testament imagery.

Make of that what you will.

Saturday, March 21, 2015

Whose Universe is This?

I've been thinking about the diagram of the universe that appears on page 83 of the Rohonc Codex:

At the center, we see the earth, with some cities and maybe a ship on the sea. Underneath the world is apparently Hell, which we also see on page 79:

The circle immediately surrounding the earth appears to hold the sun and moon, and beyond that are other complex shapes whose meaning is not clear to me.

I am wondering if this represents a description of the universe from a particular source. Perhaps from one of the three books of Enoch, or maybe another source that I am not familiar with.

The two figures in the highest heaven are reminiscent of something from the Talmud related to the apostate rabbi Elisha ben Abuyah, who is said to have gone to heaven and seen two figures seated there (understood to be God and Metatron). Since no one was allowed to sit in the presence of God, he exclaimed "Perhaps there are -- God forbid! -- two powers in heaven!"

Thursday, December 11, 2014

The Three Tablets

Page 14 shows three tablets with writing on them, and two figures standing among them.

Delia Huegel takes this to be Moses, Aaron, and the tables of the law, but she notes the odd fact that there are three tables instead of two.

The text on the second and third tablets begins with the formulae I I CQ IGV and I I I CQ IGV, which I expect probably means something like "the second X" and "the third X". Since the first tablet has Q1C in the place of the number "one", I expect Q1C probably represents a word that is semantically equivalent to "first", but etymologically unrelated to the number "one" (e.g. English first, Latin primus, etc.)

Since the tables are numbered one, two and three, I don't think they represent the tables of the law, but rather a set of three things or ideas. The obvious candidate is the holy trinity.

The usual order of the three persons (or hypostases) of the trinity is established in Matthew 28:19 as the Father, the Son and the Holy Spirit, in which case the third tablet would represent the Holy Spirit.

The text of the third tablet contains the glyph RT, which I have previously read as "saint". In a number of languages, the word for "saint" is just a nominal form of the adjective "holy", as it is in Latin: Spiritus Sanctus. This suggests that the word either before or after RT on the third tablet might be "spirit", forming the phrase "holy spirit".

The glyphs of the first and third tablets could be nearly the same, though in a different order, up until the final phrase:

First: Q1C CQ IGV [?] K1A1A I I RAA O X2 O C I RAA C F O R CO [?]
Third: I I I CQ IGV O IGDA O O X2 O K1A1A I I RAA C I RAA RT CUNSAR I IX O O

The word C.F.O.R.CO is interesting because one of my earlier algorithms (about which I did not write) identified it as a likely alphabetic word. Is it possible to read C.F.O.R.CO as "father", and RT CUNSAR.I.IX.O.O as "holy spirit"? If so, how do I test that reading?

Tuesday, December 9, 2014

Heresy? Apocryphy?

I've been running a bunch of simulations to see how the Rohonc data match against alphabetic data from Old Hungarian, Old Albanian, Latin and Old Church Slavonic (in Glagolitic and Cyrillic). My plan was to take predictions based on this data and test them against the names of the evangelists, to see which prediction best explained those names.

There are a number of scenarios where D could be read as t, so my reading of XDC.D as Matthew could work (e.g. as ma-t, mat-t or something similar). However, nothing in my simulations seems to let me read CO.IH.D as Luke.

I agonized over this for a while, then went back to the page where I tried to relate scriptural references to images. I realized that the only strong evidence for reading CO.IH.D as Luke was the triple scriptural reference, and that reading was based on the assumption that the references were to canonical gospels. But if that assumption was wrong, then there was really very little reason to read CO.IH.D as Luke.

Indeed, an image accompanying a reference to CO.IH.D chapter 6 is problematic, since it features an angel (or perhaps a winged Christ) appearing to a man lying on the ground, and I could not match that to anything in the sixth chapter of Luke.

So what would it mean if CO.IH.D is not Luke?

Looking at the images corresponding to CO.IH.D, each one involves a figure with a striped turban and a beard, twice with wings, usually with one other person, though sometimes alone outside a city. Chronologically (according to chapter) the images can be arranged as follows:

Chapter 1

Chapter 6

Chapter 7

Chapter 8

Chapter 9

Chapter 11

Chapter 17

Elsewhere (e.g. in the image of Jesus entering Jerusalem) the figure with the striped turban and pointed beard is Christ, so whatever CO.IH.D is, it would seem to contain a gospel-like narrative of the life or ministry of Christ.

There are any number of candidates among known apocrypha, but I suppose I should start with anything that mentions angels in the first and sixth chapters, or else mentions Christ appearing in the form of an angel.

Friday, November 28, 2014

Old Church Slavonic, Revisited

After a circuitous route of reading, I decided to revisit Old Church Slavonic. It started when I was looking at some of the contested inscriptions in the Basarabi Cave Complex, where I saw the following sequence:

With the bar over it, it looks like either an abbreviation or a nomen sacrum. It made me think of the Rohonc word I tentatively read as "Christ":

From there, I started reading about Glagolitic, and I realized that I should probably look at letter frequencies in Old Church Slavonic as it was written in both Glagolitic and Cyrillic, so I reanalyzed OCS using the Codex Marianus (Glagolitic) and Codex Suprasliensis (Cyrillic).

Codex Marianus

	Initial	Final	All
izhe	9.1%	14.8%	8.3%
jest	5.0%	16.1%	8.0%
big jer	0.0%	23.4%	7.7%
on	5.3%	8.1%	6.5%
az	2.1%	7.9%	6.2%
tverdo	3.4%	0.0%	5.9%

Codex Suprasliensis

	Initial	Final	All
izhe	11.3%	18.1%	8.7%
az	2.4%	10.6%	7.4%
on	4.1%	9.1%	7.3%
jest	1.2%	11.4%	6.3%
big jer	0.1%	15.4%	5.8%
tverdo	4.1%	1.0%	6.8%

One of the interesting things about Glagolitic is that some of the forms of these letters resemble the forms of the most common Rohonc letters. But I can only steal 10 minutes away today, so I'll have to get back to that in another post.

Wednesday, November 12, 2014

Comparison with Old Church Slavonic and Koine Greek

In this post, I'll complete my initial attempt to compare the initial and final frequencies of the three most common Rohonc glyphs with the three most common letters in some candidate languages.

First, Old Church Slavonic:

	Initial	Final	All
И	6.8%	12.5%	7.8%
Є	4.1%	13.5%	7.4%
Ъ	0.0%	19.91%	7.2%

Old Church Slavonic differs from Rohoncian in that the most common letter is more frequent as a final than as an initial, and there is a wide disparity between the frequency of the second letter as initial and final.

Second, Koine Greek:

	Initial	Final	All
α	12.8%	6.2%	11.0%
ε	15.4%	6.9%	10.1%
ο	10.1%	5.9%	10.1%

Koine Greek differs from Rohoncian in that the third most common letter is more common as an initial than as a final (though if the iota ended up in third place, it would fit well, with initial and final frequencies of 2.7% and 8.6%, respectively).

Suppose we give each of these languages a location in six-dimensional space, indicated by the relative frequencies of the top three symbols as initials and finals...what would their distances from each other be in this space? And which would be closest to Rohoncian?

Interestingly, the two languages that are closest to each other by this measurement are Rohoncian and Latin, with a distance of 0.115. Next closest to Rohoncian is Old Hungarian, with a distance of 0.130. The languages that are most distant from each other are Koine Greek and Old Albanian, with a distance of 0.293.
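The post does not say which distance measure was used. The sketch below assumes plain Euclidean distance between six-element vectors (initial and final frequency of each of the top three symbols, written as fractions), using the percentages from the tables in these posts. Because those tables are rounded, the results may not match the quoted distances exactly.

```python
from math import dist

# (initial, final) frequencies of the 1st, 2nd and 3rd most common symbols,
# taken from the tables in these posts and written as fractions.
profiles = {
    "Rohoncian":    (0.129, 0.049, 0.050, 0.064, 0.015, 0.128),
    "Latin":        (0.156, 0.116, 0.090, 0.078, 0.049, 0.197),
    "Koine Greek":  (0.128, 0.062, 0.154, 0.069, 0.101, 0.059),
    "Old Albanian": (0.164, 0.230, 0.032, 0.067, 0.023, 0.248),
}

def distance(a, b):
    """Euclidean distance between two languages in the six-dimensional space."""
    return dist(profiles[a], profiles[b])

# Print every language's distance from Rohoncian, nearest first.
others = [name for name in profiles if name != "Rohoncian"]
for name in sorted(others, key=lambda n: distance("Rohoncian", n)):
    print(f"{name:13s} {distance('Rohoncian', name):.3f}")
```

Other measures (for example, Manhattan distance or a rank-based comparison) would be equally easy to substitute and might order the candidates differently, so the ranking should be read as suggestive rather than decisive.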

Overall, I am weakly inclined to think that Rohoncian is some kind of Latin or Hungarian. Not only does this particular measurement favor these two languages, but there are graphical similarities between the Rohonc C, I and the Latin e, i.

(In case you are wondering, I also looked at Voynichese, just for the fun of it. It differs significantly from all of the other languages I have looked at, in that the second and third most common letters occur infrequently as initials or finals.)

Tuesday, November 11, 2014

Comparison with Old Hungarian

So far I've compared the relative frequencies of initials and finals for the three most common glyphs in Rohoncian with Latin and Old Albanian. Today I'll do Old Hungarian.

For Old Hungarian, I used the four gospels from the Hussite Bible, and I counted long and short vowels together. The top three letters break down as follows:

	Initial	Final	All
e, é	16.5%	6.1%	16.6%
a, á	10.1%	8.5%	10.7%
t	6.1%	13.6%	8.0%

As in Latin and Rohoncian, the most common letter in Old Hungarian is more frequent as an initial than as a final, while the third most common letter is more frequent as a final than as an initial. As in Latin, these two letters are e and t, respectively.

You might wonder why, if there are around 1,000 glyphs in Rohonc, I am comparing it statistically to alphabets instead of syllabaries or ideographic systems. The reason is that the most frequent glyphs in Rohonc are roughly as frequent as letters ought to be. Among Latin syllables, for example, the most common syllable in the Vulgate version of Genesis is et, but it only accounts for 3.29% of syllables. The most frequent Rohonc glyph is C, and it accounts for 12.9% of glyphs, putting it in the same ballpark as the most frequent letters of alphabetic systems.

Monday, November 10, 2014

Comparison with Old Albanian

A couple of days ago I looked at the relative frequency of the three most common Rohonc symbols as initials and finals, and compared that to the relative frequency of the three most common Latin letters.

Today I'll do the same with Old Albanian. My sample text for Old Albanian is Gjon Buzuku's Meshari, the three most common letters of which are e, i and h:

	Initial	Final	All
e	16.4%	23.0%	19.9%
i	3.2%	6.7%	8.7%
h	2.3%	24.8%	8.3%

In some respects, Old Albanian fits better than Latin. Latin initial i is far more common than Rohonc I (9% > 5%), while Old Albanian i shares with Rohonc I that both are far more frequent as finals than as initials. However, Albanian e occurs more frequently as a final than as an initial.

In order for this to work, the names of two of the evangelists would have to end in h. In fact, the names of two of the evangelists do end in h in the Meshari:

Maξeh: Matthew

March: Mark

Furthermore, Luke is also written with an h: Lucha.

Saturday, November 8, 2014

Relative frequency of initials and finals

In a previous post, I argued that we could use the presence or absence of hyphens at the end of a line to generate some basic statistics about word initials and word finals. At the time I was thinking of using this information to divide the text into words, but over the last few busy months I have been thinking about another use for this data.

In most (or all?) languages, the frequency ranking of initials differs somewhat from the ranking of finals and medials. For example, in Latin, the letter t occurs about four times more often at the end of a word than at the beginning, whereas u occurs about 3.5 times more frequently as an initial than as a final.

Rohoncian is no different from known languages in that respect. For example, the glyph D occurs 8.5 times more frequently as a final than as an initial. If Rohoncian is a known real language, then the difference between frequency ranking in initial, medial and final positions could be used to help narrow it down.

For example, using the three most common glyphs in Rohoncian, we could construct a kind of litmus test. The relative frequencies of those glyphs are:

	Initial	Final	All
C	12.9%	4.9%	10.2%
I	5.0%	6.4%	9.7%
D	1.5%	12.8%	7.8%

If we wanted to test the theory that Rohoncian is Latin and those three glyphs are alphabetic, then we would match them up to the most common three Latin letters:

	Initial	Final	All
e	15.6%	11.6%	12.9%
i	9.0%	7.8%	11.0%
t	4.9%	19.7%	8.6%

In broad terms, this correspondence seems to work out well. C shares in common with e that both are ranked first in overall frequency and somewhat more frequent as initials. Similarly, D and t share the third position and are significantly more frequent as finals than initials.
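Initial, final, and overall percentages of this kind can be computed from any list of words. This is a minimal sketch; the function name and the tiny Latin word list are assumptions for illustration, and a word may be either a string of letters or a list of glyph codes.

```python
from collections import Counter

def positional_frequencies(words):
    """Return (initial, final, overall) relative frequencies of each symbol."""
    def share(counter):
        total = sum(counter.values())
        return {symbol: n / total for symbol, n in counter.items()}

    words = [w for w in words if w]  # ignore empty words
    initials = Counter(w[0] for w in words)
    finals = Counter(w[-1] for w in words)
    overall = Counter(symbol for w in words for symbol in w)
    return share(initials), share(finals), share(overall)

# Illustrative sample only; a real comparison needs a full text.
words = ["et", "in", "terra", "erat"]
initial, final, overall = positional_frequencies(words)
top_three = sorted(overall, key=overall.get, reverse=True)[:3]
```

For Rohoncian the word list would come from whatever word-break method is in use, so the resulting percentages are only as reliable as the word breaks behind them.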

The main problem with this, as far as earlier proposals go, is that two of the evangelists have names ending in D (i.e. CO IH D and XDC D). However, this is already a problem because it seems to work best to read those names as Luke (or Mark) and Matthew, and it is not clear what those names share in common that would lead them to be written with the same final.

On the positive side, I had previously proposed reading the word K O A D CX as "nights". If this is the word noctes, then the D falls in the right place, and CX could be read es. (The glyph CX looks like C, but with a dot).

Part of me wonders what we would get if we looked at initials, medials and finals in the Voynich manuscript. But that carcass has been picked over by smarter minds than mine, and yielded almost nothing.

Thursday, November 6, 2014

Quick (Tironian) Note

I've been very busy for a few months, and some of my projects have languished, including working on the Rohonc codex. (How do you prioritize a project that may not succeed?)

However, I happened to see a page from the Old Irish Book of Leinster that got me thinking about this again. I don't have much time (it's my lunch break) but I thought I could write a quick note about it.

The thing that caught my eye was the use of Tironian notes for their phonetic values. An example is the name "Conchobar", written (among other ways) as follows:

The first glyph in this name looks like a backwards C, but it is none other than the Tironian note for con:

Except, instead of representing the Latin morpheme con, it represents only the phonetic value. The same note is used in the name Conall. The RC does not look like a text that is written fully in Tironian notation, but it would be interesting to try to transcribe a sample of the Rohonc codex as though it were a subset of Tironian notation and see what it sounds like.

If you've ever wondered what a text written completely in Tironian notation looks like, here is a piece from the psalms, given at the end of a 9th century work titled Commentarii notarum tironianarum:

Thursday, September 18, 2014

Change of focus on the Rohonc project

The Rohonc transcription has reached a point where I think it is "good enough for now". I have identified about 93% of the glyphs (including 100% of glyphs in the first 50 pages), corrected some problems with the order of lines, and identified those lines that are damaged either at the head or the tail.

If a solution is possible, I doubt it would rely on the 7% of glyphs that I haven't identified yet. Now I'm going to change my focus to word breaks, since I think the data is good enough to apply the formula I mentioned in my last post for identifying word breaks.

In the meantime, I have started a few other projects, so hopefully I will soon be able to post about some other interesting stuff in addition to the Rohonc Codex.