I mostly write about Voynich images, and must often conclude that multiple explanations remain plausible, and that it’s impossible to know the full intentions of figures this unusual and complex without understanding the accompanying text. So while I believe the imagery is of great historical value and its study is useful, I agree with the general sentiment that understanding the text should remain our ultimate goal.
In this post I will discuss three types of attack on the text and their respective flaws and strengths. It won’t contain much of my own research, but rather a synthesis of and commentary on what others have published.
The methods are the following:
- Label reading
- Computer attacks
- Block paradigm
1. Label reading
By “label reading” I mean the attempted reading of single words, whether or not they are formatted as labels. Attempting to read a label next to an image or the first word of a paragraph is essentially the same strategy. The most famous example of label reading is probably Stephen Bax’s work, but others are walking similar paths (see for example Ruby Novacna’s blog).
It’s the technique I used to rely on myself when I first started writing about the VM, but I see that my latest post with an attempted label reading is already one and a half years old (and still I feel quite new to this): Monkey Business. Since then I haven’t paid much attention to label reading, but I plan to revisit my initial attempts with a fresh look soon.
The idea is to start small. You select an image and a word which is supposed to name the image, and take it from there. Let’s say we think the plant pictured below is broccoli. Did broccoli exist at the time and place this image could have been made? (It did, apparently.) What were the names for broccoli in various languages? Which of those names could the label encode?
The main advantage of label reading is that we can use the images and labels in the manuscript without the absolute need for external sources. By which I mean that, even if the Voynich is unique and has no relation to surviving manuscripts, we still have an avenue of attack. If the text is related to the images and the manuscript can be read, label reading might eventually get us there.
Oh, where to begin? Even though I am a fan of the strategy myself, I must admit that there are so many problems associated with it. I think I’ll need a bullet list…
- Uncertainty whether the selected word actually describes the image.
- Are labels really labels? Or do they add some property of the item? Or something else?
- When selecting a word from a paragraph, you don’t even know whether you’ve picked the right one.
- Uncertainty about the language. Many people presume Latin, but there are a thousand other possibilities, and even in the major European languages plant names vary or are loanwords.
- Uncertainty about what the image depicts. If my broccoli is actually a tree, the entire exercise is futile and even misleading, and it can set you on the wrong path for months.
- Requires lots of research to do properly, for just one word.
- And finally, no attempt so far has proved scalable.
The last point deserves some more explanation. Several researchers have attempted to read labels and assign sound values to individual glyphs, but if you use those readings on a whole paragraph the result is gibberish in every single language. In other words, it’s possible to propose a reading of one word, or selected words throughout the manuscript, but this has never allowed any researcher to convincingly read even a single sentence. This is a criticism of others’ attempts at label reading as well as my own past efforts.
However, the apparent lack of scalability is in my opinion not inherent to the method. It may just mean that so far nobody has succeeded in reading the glyphs the correct way. Hence, I don’t believe criticism of label reading itself is justified; it should be possible to correctly identify an image, read its label, and gradually increase your scope. The criticism should instead focus on potential misinterpretation of the image and unlikely glyph readings.
2. Computer attacks
I’ll keep this segment short since computer-assisted strategies are not my area of expertise. It must be mentioned, however, since they are often employed. A recent example is the news reports on AI being used to “uncover the mystery of an ancient manuscript”. This mystery-uncovering has been debunked by Nick Pelling before it even got up to speed in mainstream media, as such things go.
I don’t have to talk about the advantages of using computers as assistants in manuscript studies, code breaking etc. Some might say advanced statistics and computing power are indispensable, and, who knows, it may one day be an AI that solves the Voynich text.
But as things stand now, there are still many problems. For example, Marco Ponzi recently compared various algorithms for sorting out which Voynich glyphs are vowels (based on expected alternation patterns). However, a reaction by JKP highlights what I also believe to be the most common difficulty in this matter: parsing. In order to feed the VM text to a computer, we must transcribe it somehow, and that is where we must inevitably make choices.
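To give a flavour of what such algorithms do, here is a minimal sketch of Sukhotin’s classic vowel-identification algorithm (my own choice of example; I don’t know exactly which algorithms Ponzi compared). It exploits the tendency of vowels to alternate with consonants rather than cluster together:

```python
from collections import defaultdict

def sukhotin_vowels(text):
    """Guess which letters are vowels: the letters that most often
    sit next to *other* letters are assumed to be vowels."""
    # Count adjacencies between distinct neighbouring letters.
    adj = defaultdict(lambda: defaultdict(int))
    for word in text.split():
        for a, b in zip(word, word[1:]):
            if a != b:
                adj[a][b] += 1
                adj[b][a] += 1
    score = {c: sum(adj[c].values()) for c in adj}
    vowels = set()
    while score:
        c = max(score, key=score.get)
        if score[c] <= 0:          # no plausible vowels left
            break
        vowels.add(c)
        del score[c]
        # Letters adjacent to a known vowel become less vowel-like.
        for other in score:
            score[other] -= 2 * adj[c][other]
    return vowels

# Tiny made-up sample; on real text the method needs far more data.
print(sukhotin_vowels("banana bandana cabana"))  # → {'a'}
```

The catch, of course, is that this only works once you have decided what counts as a “letter” in the first place, which is exactly the parsing problem above.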
And there’s our catch-22: before we can properly enlist the help of computers, we must know how to feed the text to them. But before we can do that, we must understand the text better, which is what we’d like the computers’ help with in the first place.
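To make the parsing problem concrete, here is a toy sketch (my own simplification, using EVA-style transcription characters) of how the same transcribed word yields different glyph sequences depending on whether character pairs like the “bench” ch are treated as single glyphs:

```python
import re

def parse(word, multigraphs):
    """Tokenize a transcribed word, greedily matching any listed
    multigraph first and falling back to single characters."""
    pattern = "|".join(sorted(multigraphs, key=len, reverse=True) + ["."])
    return re.findall(pattern, word)

# Same transcription, two parsing decisions:
print(parse("chedy", ["ch", "sh"]))  # → ['ch', 'e', 'd', 'y']
print(parse("chedy", []))            # → ['c', 'h', 'e', 'd', 'y']
```

The choice between these tokenizations changes every downstream statistic: word lengths, glyph frequencies, the very alternation patterns the vowel-detection algorithms rely on.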
I do believe that perhaps a true AI can help us a lot, but this AI does not exist yet, and making it would first require a ton of human effort.
We, as “interpreters”, stand between the VM and computers, but the VM itself may resist an all too rigid machine approach as well. My intuitive impression of the script two years ago was that it contains plenty of abbreviations which must be expanded based on one’s intuitive understanding of the sentence and familiarity with the vocabulary. Think of the “bench” character, which has various possible meanings in normal medieval texts, or of generic ending abbreviations. These are easy for trained humans to resolve, but can prove extremely problematic for machines.
3. Block paradigm
The block paradigm, based on established cryptological practices, was introduced to Voynich studies by Nick Pelling. We recently did an interview with him, where he explains in detail how it works (click the link for his full explanation in the video).
In code breaking there are generally two types of attack: a backward attack, where you try to break the system itself, and a forward attack, where you first find out what the plaintext is and then work forwards from there to the ciphertext.
In other words, if you have the plaintext and the corresponding section in the Voynich, you can discover the method that was used to turn one into the other. Nick discusses quire 20 as an example. Since each page consists of a number of star-labelled paragraphs, one could try to find other medieval sources with similar lists.
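As a toy illustration of the forward attack (my own example, with a made-up Caesar-shifted “plaintext”; nothing Voynich-specific): once you have an aligned plaintext–ciphertext pair, even recovering a simple substitution table is mechanical:

```python
def recover_mapping(plaintext, ciphertext):
    """Toy known-plaintext attack: given aligned plain/cipher text,
    recover a simple substitution table (or spot inconsistencies)."""
    mapping = {}
    for p, c in zip(plaintext, ciphertext):
        if p == " ":
            continue
        # If p was already mapped to a different letter, the system
        # is not a simple one-to-one substitution.
        if mapping.setdefault(p, c) != c:
            raise ValueError(f"not a simple substitution: {p!r} "
                             f"maps to both {mapping[p]!r} and {c!r}")
    return mapping

# Hypothetical pair enciphered with a Caesar shift of one.
table = recover_mapping("herbal recipe", "ifscbm sfdjqf")
print(table["h"])  # → 'i'
```

Of course the VM’s system, if there is one, is unlikely to be this simple; the point is only that possessing both sides of the transformation turns a guessing game into a solvable puzzle.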
For a while I was convinced that I was on the trail of a 15th-century recipe book that would match quire 20. I pursued it, but in the end I don’t believe a good version of it exists.
Like all strategies, the block paradigm has advantages and disadvantages.
As Nick says, the advantage of the block paradigm is that everybody should agree the method will work. Imagine that something really unusual was done to the Voynich text, something so unique or far-fetched that we will never understand it by ourselves. Even in such a case, the block paradigm will help us out. We’d have the starting text and the end result (the corresponding VM “block”). All that would be left to do is figure out how they went from A to B, and then see if this method applies to the whole manuscript.
Even if the VM were uncrackable in isolation, the block paradigm would still be able to hand us a key. It’s immensely powerful and an almost guaranteed Voynich (block)buster.
The problem with the block paradigm is that it’s like looking for a needle in a haystack which may not have any needles in it. Countless medieval texts have been lost, and like Nick says in the interview, others are incomplete, scattered, unreliably copied… Yes, the ancients were eager copyists, but they were also pragmatic copyists, especially when “scientific” subjects were involved. Texts would often undergo some form of alteration to suit the user’s needs.
And that’s assuming we even find a block to compare a piece of VM text to. Nick tried using the layout as a guideline, without much success so far. Another route is of course to look for parallels for the imagery, which brings us back to the problems mentioned in the “label reading” method. The block paradigm also assumes that the VM is an altered or encrypted version of an existing text, but there is always the possibility that people were writing in Voynichese directly.
Bottom line: if the contents of the VM text are unique in the record, we will not be able to use the block paradigm, and our block will remain a holy grail in more senses than one (in that it doesn’t exist). In order to run a successful test based on the block paradigm, you first have to win the proverbial lottery, if that is possible at all.
Brute-force computer analysis can help us gain insights into the text, and it has its uses as an auxiliary tool. I have great admiration for people like Marco Ponzi, who produce one fascinating graph after another. But before AI can really solve the Voynich for us (which is our ultimate goal), lots of human effort will be required.
On the opposite end of the scale there is label reading, which might work but demands a decent understanding of the imagery. The same can be said about the block paradigm, which presents the additional condition that a suitable text exist in the first place. Still, I think Pelling is right to imply that we should all have our feelers out for a suitable block, since finding one would make our job a lot easier.
But for now, since I can’t build AIs, I keep studying the imagery. It’s likely our best basis for understanding the text.