Tuesday, August 8, 2017

How many symbols are in a script?

Given Zipf's law, it seems like you should be able to calculate the number of symbols in a script based on the number of symbols that occur only once in a sample of text.

I worked out the math, but the code I've written to do the calculation is very slow when it comes to scripts with a large number of symbols (like Debosnys' cipher), so I also wrote a Monte Carlo simulation that can come up with an approximate answer much more quickly.

The Debosnys ciphers contain 1188 glyphs, of which 277 occur only once. For a text like this we would expect a total glyph inventory of around 1500 symbols.

Here is what the distribution looks like for texts of 1188 symbols. The x-axis is the inventory of symbols, and the y-axis is the number that would appear only once in a distribution that conforms to Zipf's law.
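
A minimal sketch of this kind of Monte Carlo estimate might look like the following. It is not the code used for the figures above. It assumes a Zipf exponent of 1 (the symbol of rank r has weight 1/r), and the function names `mean_hapaxes` and `estimate_inventory` are made up for illustration.

```python
import random
from collections import Counter
from itertools import accumulate


def mean_hapaxes(inventory, n_tokens, trials=200, seed=0):
    """Average number of symbols seen exactly once in simulated Zipfian texts."""
    rng = random.Random(seed)
    # Assumption: the symbol of rank r has weight 1/r (Zipf exponent 1).
    cum_weights = list(accumulate(1.0 / r for r in range(1, inventory + 1)))
    total = 0
    for _ in range(trials):
        # Draw one text of n_tokens symbols and count how often each appears.
        sample = rng.choices(range(inventory), cum_weights=cum_weights, k=n_tokens)
        counts = Counter(sample)
        total += sum(1 for c in counts.values() if c == 1)
    return total / trials


def estimate_inventory(n_tokens, observed_hapaxes, candidates, trials=200):
    """Pick the candidate inventory whose simulated mean is closest to the observation."""
    return min(
        candidates,
        key=lambda inv: abs(mean_hapaxes(inv, n_tokens, trials) - observed_hapaxes),
    )
```

Calling `estimate_inventory(1188, 277, range(1000, 2001, 100))` scans inventories from 1000 to 2000 in steps of 100 and returns the one whose simulated texts come closest to 277 single-occurrence symbols. A finer step or more trials trades speed for precision.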

Monday, August 7, 2017

Another note on N-Glyphs

To test the hypothesis that the subglyph N represents nasalization of a vowel, I looked at the frequency of nasalized syllables in Baudelaire's Fleurs du Mal and compared it to the cipher poem.

I have a copy of Fleurs du Mal containing 3182 Alexandrine lines. (One poem in this copy is not an Alexandrine.) Among these lines, I count 6536 nasalized syllables, for an average of 2.05 nasalized syllables per line.

If the Debosnys cipher poem is a French Alexandrine, and the N subglyph represents nasalization (and is the only representation of nasalization), then we should expect to find a similar distribution of N subglyphs in the cipher poem.

Of the 20 lines of the cipher poem, I counted a total of 30 n-glyphs, so an average of 1.5 n-glyphs per line. More specifically, the number of n-glyphs per line was distributed as follows:

0 n-glyphs: 2 lines, 10%
1 n-glyph: 6 lines, 30%
2 n-glyphs: 9 lines, 45%
3 n-glyphs: 2 lines, 10%

Among the Alexandrine lines of Fleurs du Mal we have the following distribution:

0 nasalized vowels: 366 lines, 11.5%
1 nasalized vowel: 816 lines, 25.6%
2 nasalized vowels: 897 lines, 28.2%
3 nasalized vowels: 658 lines, 20.7%
4 nasalized vowels: 312 lines, 9.8%
5 nasalized vowels: 101 lines, 3.2%
6-7 nasalized vowels: 32 lines, 1.0%

This looks like a promising match, but obviously more work needs to be done.
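
The arithmetic behind this comparison can be reproduced with a short sketch like the one below. The counts are the ones reported above. Percentages are taken against the stated line totals (20 and 3182), and the means use the reported totals of 30 and 6536, because the 6-7 bucket does not give exact per-line values. The helper name `percentages` is made up for illustration.

```python
def percentages(line_counts, total_lines):
    """Share of lines in each bucket, as a percentage rounded to one decimal."""
    return {k: round(100 * v / total_lines, 1) for k, v in line_counts.items()}


# Counts as reported in the post: marks per line -> number of lines.
cipher_lines = {"0": 2, "1": 6, "2": 9, "3": 2}
baudelaire_lines = {"0": 366, "1": 816, "2": 897, "3": 658, "4": 312, "5": 101, "6-7": 32}

# Assumption: denominators are the stated totals (20 cipher lines, 3182 Alexandrines).
cipher_pct = percentages(cipher_lines, 20)
baudelaire_pct = percentages(baudelaire_lines, 3182)

# Means from the reported totals of n-glyphs and nasalized syllables.
cipher_mean = 30 / 20
baudelaire_mean = 6536 / 3182

# Side-by-side listing; buckets absent from the cipher poem show as 0.0.
for bucket in baudelaire_lines:
    print(bucket, cipher_pct.get(bucket, 0.0), baudelaire_pct[bucket])
print(round(cipher_mean, 2), round(baudelaire_mean, 2))
```

With only 20 cipher lines, each line is worth five percentage points, so small differences between the two columns should not be over-read.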

A look at N-Glyphs

In this post I'll take a look at a single class of Debosnys glyphs that I call "N-Glyphs", in hopes of ferreting out some details on how the cipher works.

N-glyphs are characterized by having a subglyph at the top that looks like a tilde (~), which I transliterate as N. The N subglyph has the following properties:
  • It cannot occur on its own, but only in combination with other subglyphs
  • It can only occur at the top of a glyph, or else directly under another N subglyph
  • Though N cannot occur on its own, N.N frequently occurs on its own
The following are examples of all of the types of N-glyphs that I have identified:

If we assume that these glyphs represent syllables, then the observed properties of the N subglyph may give us a clue as to what it represents.

The greatest challenge is to explain why N cannot occur on its own, but N.N can. However, many subglyphs can occur as pairs, and I think it is possible that pairs such as N.N, I.I and O.O may represent different subglyphs from the corresponding singles N, I and O. If we can accept that explanation, then two possibilities suggest themselves:

1. N is a consonant that can only occur in syllable-initial position.
  • It cannot occur on its own because it must be accompanied by a vowel to make a syllable
  • It can only occur at the top of a glyph because it is a consonant (such as French b or d) that can only occur in syllable-initial position.
2. N is a marker of a vocalic feature
  • It cannot occur on its own because it is a feature of another subglyph (in this case a vowel) which must be present.
  • It only occurs at the top of a glyph because it is used as a suprasegmental mark
At the moment I'm favoring the idea that the N is a marker of nasalization, directly influenced by the use of the tilde in certain languages as a suprasegmental mark of nasalization (e.g. ã, ẽ, ĩ, õ, ũ). To test this theory, I will look at the frequency of the N subglyph in the cipher poem, and compare it to the frequency of nasalized syllables in a large set of French Alexandrine lines.

Friday, August 4, 2017

Debosnys Cipher Transcription Revision

My initial transcription of the Debosnys cipher texts allocated one transcription to one glyph, where I have defined a glyph as a cluster of graphemes bounded to the left and right by white space. So, for example, the "signature" line is analyzed as six glyphs:


With the Debosnys material, this yields a text of 1188 instances of 425 glyphs. That means a lot of the text will consist of glyphs that only occur once, which makes contextual analysis difficult. I thought it would be useful to be able to do some analysis on deconstructed glyphs as well. So I created a second transcription that looks at the internal structure of the glyphs:

 <C2 B2> <X DOT> <N U> <O Z O> <O2RNO> <CROSSB>

There is an order to the internal structures of glyphs. For example, using Backus-Naur form, you could describe a whole set of glyphs as follows:

<n-glyph> ::= N <n-tail>
<n-tail> ::= <n-medial> | <n-medial> <n-final>
<n-medial> ::= N | U | X | O
<n-final> ::= X | O

I am currently exploring the idea that these structures correspond to syllable structures, with subglyphs representing letters or phonemes.
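
As an illustration, the grammar above can be turned into a small recognizer. This is a sketch that assumes a glyph is given as a list of subglyph names read top to bottom; the names `is_n_glyph` and `all_n_glyphs` are made up for this example.

```python
# Terminal sets taken directly from the grammar.
N_MEDIAL = {"N", "U", "X", "O"}
N_FINAL = {"X", "O"}


def is_n_glyph(subglyphs):
    """Return True if the sequence matches <n-glyph> ::= N <n-tail>."""
    # An n-glyph is N plus a medial, optionally followed by a final.
    if len(subglyphs) not in (2, 3) or subglyphs[0] != "N":
        return False
    if subglyphs[1] not in N_MEDIAL:
        return False
    return len(subglyphs) == 2 or subglyphs[2] in N_FINAL


def all_n_glyphs():
    """Enumerate every sequence the grammar generates."""
    two = [("N", m) for m in sorted(N_MEDIAL)]
    three = [("N", m, f) for m in sorted(N_MEDIAL) for f in sorted(N_FINAL)]
    return two + three
```

The grammar generates twelve forms in all: four of length two and eight of length three. For example, `is_n_glyph(["N", "U"])` accepts the third glyph of the signature line, while `is_n_glyph(["N"])` rejects a bare N.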

The distribution of subglyphs follows Zipf's law, with the subglyph O being most common. In French, the most common letter is e, and there is a favorable comparison between the frequency of the O subglyph in the cipher poem and the vowel e in a comparable number of lines of Baudelaire's poetry.

More on that when I have time.

Wednesday, July 19, 2017

Back to the cipher

I've been doing a lot of background research in order to get an idea of the context in which Debosnys produced his cipher. To be honest, it's been pretty depressing to read about 19th century French prisons, the Paris Commune, and the Franco-Prussian War. I have built a picture of Debosnys as a plagiarist, a liar, and a murderer, the product of a brutal period in history and a brutal prison culture.

For that reason, I am eager to get away from the background research and into the features of the cipher. I have completed a transcription of all of the cipher text in the images in Farnsworth's book, and started analysis.

In my transcription I count 1188 total glyphs in the text, from an inventory of 425 separate types.
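
Counts of this kind (total glyphs, distinct types, glyphs that occur once, and adjacent pairs that repeat) can be computed with a sketch like the one below. It assumes the transcription is stored as text with each glyph wrapped in angle brackets, as in the revised transcription, and that the whole text is treated as a single sequence, so pairs may span line breaks. The function names and the sample string are hypothetical.

```python
import re
from collections import Counter


def parse_glyphs(transcription):
    """Extract glyphs from text like '<C2 B2> <X DOT>', normalizing inner spaces."""
    return [" ".join(inner.split()) for inner in re.findall(r"<([^<>]*)>", transcription)]


def summarize(glyphs):
    """Token count, type count, hapax count, and adjacent pairs seen more than once."""
    counts = Counter(glyphs)
    # Pair each glyph with its right-hand neighbor.
    pairs = Counter(zip(glyphs, glyphs[1:]))
    return {
        "tokens": len(glyphs),
        "types": len(counts),
        "hapaxes": sum(1 for c in counts.values() if c == 1),
        "repeated_pairs": {p: c for p, c in pairs.items() if c > 1},
    }


# Hypothetical sample, not a line from the cipher.
sample = "<N U> <O Z O> <N U> <O Z O> <X DOT>"
print(summarize(parse_glyphs(sample)))
```

Dividing a pair's count by the number of tokens gives relative frequencies of the kind quoted below.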

There are 65 pairs of glyphs that repeat in the text. The most common repeated sequence in my transcription is this one:
frequency = 8 (0.67%)

The following two sequences are represented differently in my transcription, but if they are the same, they would represent the most common sequence:
 frequency = 5 (0.42%)
frequency = 4 (0.34%)

Also frequent are these:
frequency = 5 (0.42%)

frequency = 4 (0.34%)

The top five most frequent glyphs are these:


The 19th Century Criminal Handshake

According to Farnsworth, when Debosnys died his body was found to be covered with shocking tattoos. This was not an uncommon practice among criminals in 19th century France and Italy, and the criminal tattoos I have found are strongly reminiscent of the style of art in Debosnys' manuscripts.

For example, Debosnys drew this handshake, which Farnsworth identifies as a Masonic grip called Boaz:

But this is also a common motif on 19th century criminal tattoos, according to Lombroso, who presents some examples in his L'uomo delinquente, such as this simple one:

The following example is reminiscent of the ritual of blood-brotherhood, where the hands are cut and the cuts are pressed together to symbolically join two people by blood.

This one looks like it might symbolize the union of two people, LH and EL. The flower suggests a romantic relationship, but it may mean something else.

Lombroso says this one shows a preference for pederasty, taken in context with other tattoos on the prisoner's body:

Whatever the specific meaning of a handshake, in general it symbolizes some kind of close connection between two people. In this case, between Debosnys and LMF.

Wednesday, July 5, 2017


We already know that the name "Henry Deletnack Debosnys" was a pseudonym, but his middle name is curious enough to warrant examination. One could imagine something like de l'Etnac, but the consonant cluster tn is especially awkward and very low-frequency in European languages.

I originally thought his full name must be an anagram, with the inconvenient left-over letters tossed into his middle name. However, now I think his middle name is a literary allusion, of sorts.

In 1870 someone named J. Cantel published a book titled Souvenirs et Impressions de Voyage en Italie under the pseudonym M. le vicomte de Letnac, where Letnac is transparently Cantel spelled backwards.

The book is written as a set of letters from "Arthur", a 15-year-old student, to his older brother, describing his travels in Italy in 1869. I can't find any other works by the same J. Cantel, or any clue about what the J. stands for.

What connection, if any, this has to the identity of Henry Debosnys is anyone's guess at this point, but it seems unlikely to be a coincidence. I'm adding Cantel to my list of names to watch for.