The Shire and Mirkwood compared to random text grabs.

From earlier this week: the Shire text uses 11,119 words, of which 1,484 do not appear in Mirkwood. This counts every word used, so a word like “yes” that shows up six times counts as six words. That’s 13.3% Shire words.
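For anyone who wants to try this kind of count outside Lexos, here is a minimal Python sketch. It is not how Lexos works internally: the tokenizer (lowercase, letters and apostrophes only) and the function names are my own assumptions, and the two sample strings are made up rather than taken from the real texts.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())

def tokens_not_in(source_text, other_text):
    """Count every token of source_text whose word never occurs in other_text.

    Repeats count each time: a word used six times adds six to the total.
    Returns (missing_tokens, total_tokens).
    """
    other_vocab = set(tokenize(other_text))
    source_tokens = tokenize(source_text)
    missing = sum(1 for tok in source_tokens if tok not in other_vocab)
    return missing, len(source_tokens)

# Made-up sample strings, not the actual Shire or Mirkwood texts.
shire = "yes yes the pipe and the party"
mirkwood = "the spiders and the dark"

missing, total = tokens_not_in(shire, mirkwood)
share = 100 * missing / total  # percentage of Shire tokens absent from Mirkwood
print(f"{missing} of {total} tokens ({share:.1f}%)")
```

With the real texts pasted in, `missing` and `total` would be the counterparts of the 1,484 and 11,119 above, though a different tokenizer will shift the numbers a little.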

What we learned today: comparing the Shire text to a random word grab of the same sample size, 1,339 Shire words do not match my random text. That’s 12%, basically indistinguishable from the 13.3% Mirkwood difference. Hmm, fascinating! Yet most of our Lexos graphs which show both regions paint them as very different from one another at the word level. Hold on…

Oho! The Mirkwood text has more words (16,400), and only 1,265 of them are different from a random grab of 16,400 words from the whole novel. That’s 7.7%. Very different, my friends!
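A random grab of a given size could be sketched like this in Python. This is only one way to do it and an assumption on my part: it samples individual words without replacement from a token list, and `random_grab` and the stand-in token list are hypothetical names and data, not the tool or text actually used here.

```python
import random

def random_grab(novel_tokens, sample_size, seed=0):
    """Draw sample_size tokens at random, without replacement, from the whole novel."""
    rng = random.Random(seed)  # fixed seed so the grab can be repeated
    return rng.sample(novel_tokens, sample_size)

# Made-up stand-in for the tokenized novel.
novel_tokens = "one two three four five six seven eight nine ten".split()

grab = random_grab(novel_tokens, 5)
print(grab)
```

For the comparisons above, `sample_size` would be 11,119 to match the Shire text and 16,400 to match Mirkwood.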

Let’s clean that up a bit:

  • Shire text: 11,119 words
  • Shire words not appearing in Mirkwood: 13.3%
  • Shire words not appearing in Random text: 12%
  • Mirkwood text: 16,400 words
  • Mirkwood words not appearing in the Shire text: 14.6%
  • Mirkwood words not appearing in Random text: 7.7%
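As a quick sanity check on the arithmetic, the percentages can be recomputed from the raw counts given in this post. The dictionary labels are just mine for illustration. The raw count behind the Mirkwood-vs-Shire figure isn't stated here, so that one is left out.

```python
# Counts reported in the post: (tokens not matched, total tokens).
counts = {
    "Shire vs Mirkwood": (1484, 11119),
    "Shire vs random": (1339, 11119),
    "Mirkwood vs random": (1265, 16400),
}

# Percentage of each text's tokens that do not appear in the comparison text.
percentages = {
    label: round(100 * missing / total, 1)
    for label, (missing, total) in counts.items()
}

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```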

Well, well, well. Time to poke at Mirkwood a bit, friends. Also, it’s time to use the newly discovered Lexos feature “how many of these words are unique”! See you soon!

 

 
