Regarding the deviant TTR values of Quire 13 (see previous post), people asked whether there is a difference between the two types of Q13 pages. This question was first raised by VViews on the forum, and Nick raised it again in the comments.
While Rene’s graph suggested that Q13 was low overall compared to the rest of the VM, it still showed considerable fluctuation in the affected area. Therefore, I thought it would be clearest to visualize both Q13 subsections in line with the last post.
To avoid confusion, I’ll use “P” for those pages with central Pools, and “M” for those pages with nymphs in the Margins.
P = 75, 78, 81, 84
M = 76, 77, 79, 80, 82, 83
To measure each folio, I combined the recto and verso sides of the same folio number, since word counts per file are otherwise too low. With this method, my smallest file had 456 words, so I took 456 as the largest window instead of 500 or 1000. For the mid-range window, I kept the crucial m50.
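As a rough illustration of the measurement, here is a minimal sketch, assuming that m50 and m456 denote a moving-average type-token ratio (TTR averaged over every window of 50 or 456 consecutive words). The function name `mattr` and the whitespace tokenization in the usage note are illustrative assumptions, not the tool actually used for this post.

```python
from collections import Counter


def mattr(tokens, window):
    """Average the type-token ratio over every window of `window` consecutive tokens."""
    n = len(tokens)
    if window <= 0:
        raise ValueError("window must be positive")
    if n < window:
        raise ValueError("text is shorter than the window")

    # Word counts inside the current window; len(counts) is the number of distinct types.
    counts = Counter(tokens[:window])
    total_types = len(counts)

    # Slide the window one token at a time, updating the counts incrementally.
    for i in range(window, n):
        outgoing = tokens[i - window]
        counts[outgoing] -= 1
        if counts[outgoing] == 0:
            del counts[outgoing]
        counts[tokens[i]] += 1
        total_types += len(counts)

    num_windows = n - window + 1
    return total_types / (window * num_windows)
```

With a transcription file split on whitespace, something like `mattr(text.split(), 50)` would give the m50 figure for that file under these assumptions. A file of exactly 456 words has a single window at m456, so that value is simply its plain TTR.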
In the figure below, the green dots at the top right are Herbal sections A and B. Q13 Pool pages are yellow, Marginal nymph pages are blue.
To my surprise, it looks like there is a cluster of blue dots in the higher region.
One blue dot is very low: f77. And one yellow dot is particularly high: f81.
File | m50 | m456 |
VM_HA.txt | 0.8525882069 | 0.5984450253 |
VM_HB.txt | 0.8734397358 | 0.6181549737 |
VM_M.txt | 0.8140009381 | 0.5161150641 |
VM_M_f76.txt | 0.8251002227 | 0.5209510452 |
VM_M_f77.txt | 0.7630220713 | 0.4671610431 |
VM_M_f79.txt | 0.8445402299 | 0.5227413615 |
VM_M_f80.txt | 0.8152368421 | 0.5060495559 |
VM_M_f82.txt | 0.8104562738 | 0.5018696115 |
VM_M_f83.txt | 0.812 | 0.5454719235 |
VM_P_all.txt | 0.7815717256 | 0.4826530502 |
VM_P_f75.txt | 0.7604671533 | 0.4498845082 |
VM_P_f78.txt | 0.7877735849 | 0.4891818607 |
VM_P_f81.txt | 0.8021634615 | 0.5204301075 |
VM_P_f84.txt | 0.7731419458 | 0.4717792656 |
The next chart uses normalized data, comparing (top to bottom) Herbal B, Herbal A, Q13 marginal and Q13 pool. Here the difference is at least as clear, if not more so.
I don’t have more time today, but this looks like something that needs further investigation. Full data are available in my share file, which is also basically my work file, so you’ll have to deal with some mess.
Any illuminating opinions are highly appreciated 🙂
EDIT: I made a graph for m50 alone. This has turned out to be a critical value in previous tests, and might be more reliable here than m500 given the small text samples per page. It shows that m50 “predicts” image type in all cases but one:
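The "all cases but one" observation can be checked against the per-folio m50 values in the table. The sketch below is only an illustration: the single-cutoff rule (a folio is called Marginal if its m50 lies above the cutoff) and the name `best_threshold` are my assumptions, not something taken from the graph.

```python
# Per-folio m50 values copied from the table; labels: M = marginal nymphs, P = pools.
M50 = {
    "f75": ("P", 0.7604671533),
    "f76": ("M", 0.8251002227),
    "f77": ("M", 0.7630220713),
    "f78": ("P", 0.7877735849),
    "f79": ("M", 0.8445402299),
    "f80": ("M", 0.8152368421),
    "f81": ("P", 0.8021634615),
    "f82": ("M", 0.8104562738),
    "f83": ("M", 0.812),
    "f84": ("P", 0.7731419458),
}


def best_threshold(samples):
    """Return (cutoff, misclassified folios) for the cutoff with the fewest errors.

    Assumed rule: a folio is predicted "M" when its value is above the cutoff.
    Candidate cutoffs are the midpoints between neighbouring sorted values.
    """
    values = sorted(value for _, value in samples.values())
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(cutoff):
        return [
            name
            for name, (label, value) in samples.items()
            if (value > cutoff) != (label == "M")
        ]

    return min(((c, errors(c)) for c in candidates), key=lambda pair: len(pair[1]))


cutoff, misclassified = best_threshold(M50)
print(f"cutoff = {cutoff:.4f}, misclassified = {misclassified}")
```

On these ten values the best cutoff falls between f81 and f82 (roughly 0.806), and the only folio on the wrong side is f77, the very low blue dot mentioned above. With so few data points this is a description of the sample, not a tested predictor.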
One other quick comment on one of the Pool pages. Torsten Timm once pointed out a particularly intriguing sequence pair on f84r:
shedy qokedy qokeedy qokedy chedy okain chey
shedy qokedy qokeedy qokeedy chedy raiin chey
I noted back in 2014 that this might well be as close as we’ll ever get to a “Gillogly sequence” in Voynichese, i.e. a sequence that could prove to be particularly telling about the underlying writing system.
https://ciphermysteries.com/2014/08/15/voynichs-infuriating-liminality
(Sorry, WordPress lost a comment I put in just now, so I’ll try to reconstruct it as best I can.)
Note that the Pool pages “P = 75, 78, 81, 86” should be “P = 75, 78, 81, 84” (i.e. 84, not 86).
It’s interesting that f81 is an outlier here, because f81r is the “ragged right” poem page. I wonder whether f81r is abbreviated slightly less than other pages, because the source lines are shorter. There are notably a good number of free-standing “ol” and “oly” words here, which I’m wondering might be helping to drive Koen’s statistics. So might it be that the pressure of abbreviation elsewhere typically causes “ol + word” pairs to often get collapsed into “olword”?
It’s also interesting that f75v and f84r are in the same Pool pages group, because they share a 4-word repeated sequence between them: “ol shedy qokedy qokeedy” (note that “shedy qokedy qokeedy” is the start of the second Torsten Timm repeat noted in the previous comment).
So there is, I think, good reason to suspect that the Pool pages (which Glen Claston called “Q13B”) may be the lowest-hanging fruit on the Voynichese tree. That is, Q13B has a greater regularity or consistency to its text than other parts of the manuscript have.
Well, I can’t explain the quire behaving like that. To me the Quire 13B segments are simply out of context; for me they fit into the entire story in various places. But that of course does not mean the text is not done differently on those pages, just as you have picked out a different subset of artistry on these same pages.
Maybe it has to do with what is represented, regardless of what it may be? I.e., is it actually describing roughly circular objects versus describing tubes and such, and does that make the difference to the language behaviour? It would make a difference if it were mathematical descriptions.
At the same time I don’t always see the shapes as drawn matching those which appear to me to be represented, without some tweaking. The double quote Nick speaks of is in the first section of the text and likely has something to do with the top diagram. I take part of the visuals as saying to mirror itself, which is another instance of doubling; I wonder if there is a relationship there. Slightly bigger on the west side, with a chunk out of itself: is that why the addition of a character, and then truncation? Just considering the possibilities.
I’m thinking, maybe take a number of texts in a single language; some poetry, some religious stuff, some science, and then measure (somehow quantifiably?) the difference between them. Then try the same for other languages, and see if the differences are always similar, no matter which language is used. If this holds, then one may speculate on what is the nature of the Voynichese text. Although even if any of this vague suggestion of mine is possible, I can expect the results to be as ambiguous as always 🙂
Hi Guy
This is what I’m trying to achieve with the language clouds. They show the full extent of a language’s possibilities (obviously outliers exist in any language).
When I build a corpus for a language I always try to include texts of different genres. The difficulty is finding enough (about 20) copy-pastable medieval texts in less accessible languages…
Spanish cloud is done, will be included in next post.
I also processed two early Basque texts, but this language is problematic. Basque has never been an extensively written language, and early sources are rare. I collected two of the earliest sources, which are 16th century – and even those are rare. These two will also be included in the next update.