Sektu: Word Breaks in the Voynich Manuscript

Tuesday, February 12, 2019

I have been reworking my word break algorithm, and it is now much more accurate than it was before. For example, in the Vulgate Genesis, I get word breaks like the following:

IN PRINCIPIOCREAVIT DEUS CÆLUM ET TE RRAM TERRAAUTEM ERAT
INANIS ET VACUAET TE NEBRÆ ERANT SUPER FACIEM ABYSSIET
SPIRIT US DEI FEREBAT UR SUPER AQUASDIXIT QUE DEUS FIATLUXET
FACTAEST LUXET VIDIT DEUS LUCEM QUOD ESSET BONA

In this sample text there were 40 genuine word breaks, and my algorithm correctly identified 31 of them, with four false positives and eight genuine breaks missed.
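As a rough illustration of how counts like these can be tallied, here is a minimal Python sketch that compares a reference segmentation against a predicted one. It is not the word break algorithm itself, only a scoring helper, and the names `break_positions` and `score_breaks` are made up for this example. It also assumes that line ends are not counted as breaks, which may differ from how the figures above were counted.

```python
def break_positions(segmented):
    """Return the set of offsets, counted in non-space characters,
    at which the segmented text places a word break."""
    positions = set()
    offset = 0
    words = segmented.split()
    # A break follows every word except the last one.
    for word in words[:-1]:
        offset += len(word)
        positions.add(offset)
    return positions


def score_breaks(gold, predicted):
    """Compare predicted breaks with the genuine (gold) breaks."""
    # Both strings must contain the same letters once spaces are removed.
    if "".join(gold.split()) != "".join(predicted.split()):
        raise ValueError("texts differ once spaces are removed")
    gold_breaks = break_positions(gold)
    predicted_breaks = break_positions(predicted)
    return {
        "hits": len(gold_breaks & predicted_breaks),
        "false_positives": len(predicted_breaks - gold_breaks),
        "missed": len(gold_breaks - predicted_breaks),
    }


result = score_breaks("IN PRINCIPIO CREAVIT DEUS", "IN PRINCIPIOCREAVIT DEUS")
print(result)
```

In this small case the break between PRINCIPIO and CREAVIT is missed, while the other two breaks are found and no false breaks are introduced.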

Similarly with the King James Genesis:

INTHE BEGINNING GOD CREATED THE HEAVENAND THE EARTH

AND THE EARTH WASWITHOUT FORMAND VOIDAND DARK NESS

WASUPON THE FACEOF THE DEEPAND THE SPIRIT

OFGODMOVED UPON THE FACEOF THE WATERS AND

Here the sample text contained 41 genuine word breaks, and the algorithm correctly identified 29 of them, with one false positive and 11 genuine breaks missed.

So what about the Voynich Manuscript (VM)? Do the word breaks identified by my algorithm match the spaces in the manuscript? Here are the first four lines of the VM in EVA transcription, with the word breaks my algorithm identified:

fachy sy kal ar ataiin sholshory cthresy kor sholdy

sory ckhar or y kairchtaiin sharasecthar cthar dan

sy aiirsheky ory kaiin shodcthoary cthesdar aiin sy

soiin oteey oteosrol oty cthiar daiin okaiin or okan

In this case, the VM had 39 apparent word breaks, and my algorithm identified 28 of them with five false positives.

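The algorithm recovered 28 of the 39 apparent breaks in the VM (about 72%), compared with 31 of 40 (about 78%) in the Vulgate and 29 of 41 (about 71%) in the King James text. From this, it appears that word breaks in the Voynich Manuscript act like word breaks in the other sample texts.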
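To make the comparison easier to check, the counts reported above can be turned into recall (the share of genuine breaks that were found) and precision (the share of predicted breaks that were genuine). The sketch below uses only the numbers given in this post. Framing the results as precision and recall is an assumption of this example, and for the VM the "genuine" breaks are simply the spaces in the manuscript.

```python
# Counts taken from the three samples discussed in this post.
samples = {
    "Vulgate Genesis": {"genuine": 40, "hits": 31, "false_positives": 4},
    "King James Genesis": {"genuine": 41, "hits": 29, "false_positives": 1},
    "Voynich Manuscript": {"genuine": 39, "hits": 28, "false_positives": 5},
}


def rates(genuine, hits, false_positives):
    """Return (recall, precision) for one sample."""
    recall = hits / genuine
    precision = hits / (hits + false_positives)
    return recall, precision


for name, counts in samples.items():
    recall, precision = rates(**counts)
    print(f"{name}: recall {recall:.1%}, precision {precision:.1%}")
```

With these counts, the VM's recall falls between the two known-language samples, while its precision is somewhat lower than both.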
