Backward

I am rolling through my task of eliminating those words from the concordance which are inflected forms of The Ten Thousand.  I am trying to be ruthless, although my heart hurts as we lose some beauties like “clad”, an elder past participle form of “clothe”.  “Clothes” – the noun – is one of the Ten Thousand.  Does that eliminate the verb “clothe”?  Frankly I might put back “clad” when all is said and done by means of an argument about the archaicness and beauty of its form.

But really, can I afford to keep all the lovely words?  Does that not bias my method?  Does that not leave me with a boatload more words to work with than might be wise for a project of such limited time and resource?  Alas.  For now I will at least try to be ruthless.  Fortunately, I can write a little swan-song here for them.  For a regular present-tense verb, if I see the third-person singular, such as “knits”, I take notice, check The Ten Thousand, find “knit” there, and eliminate “knits” from our consideration.  I’m alert now to the -s ending.  But what about the lack of it?

Tonight’s observation, Hobbit fans, is that “backwards” is among The Ten Thousand, but “backward” is not.  I learn that “backward” as an adjective (I shot him a backward glance) is the usual (but not exclusive) spelling, and that “backward” as an adverb (… and then I fell backwards) is sometimes spelled with the s (but not exclusively).  The -s is more common in British than American writing.  Well, bless.

“Backward”. OED Online. Oxford University Press, March 2015. Web.

Poetry

Poetry in The Hobbit is definitely front-loaded: 12 of the 17 poems come before the beginning of Chapter 10.  Hmmm.  My expectations about higher register after 10.020 did not predict that, because I immediately associate “poetry” with “high register”.  Let’s examine them more closely:

01.063: Chip the Glasses
01.072: Far Over the Misty Mountains Cold
01.142: Far Over the Misty Mountains Cold (reprise)
03.014: Tra-la-la-lally
04.019: Clap! Snap! the black crack!
05.026 through 05.063: Riddles
06.074: A Horrible Song
07.099: The Dwarves’ Wind Song
08.096: Attercop!
08.100: Lazy Lob and crazy Cob
09.049: Wood-Elves’ Barrel Rolling Song
09.053: Song round the River-Door
10.035: Snatches of Old Songs
15.036: Music to Soften Thorin’s Mood
19.002: Tra-la-la-lally Refrain
19.011: A Song Loud and Clear on the Banks of the Stream
19.029: Roads Go Ever On and On

I would submit that poetry in chapter ten and after is all heroic, nostalgic, high-register poetry except for the 19.002 Tra-la-la-lally refrain.  But what about the Tra-la-la-lally refrain?  Is it, in fact, as silly and light as we observed the earlier, 03.014 Tra-la-la-lally?  I suggest that it is not.  Our singers go nowhere near “Tra-la” in their song until after the dragon is withered and his splendour is humbled.  We get rusted swords, perished thrones, and trusted strength betrayed.  Only then do we hear that grass is still growing, that nature can be in its proper order, and the weary traveler is welcomed back to The Last Homely House.

The 19.002 Tra-la-la-lally Refrain heals.  It is the closing parenthesis on Bilbo’s experience of war, an invitation to encapsulate the pain and fear and reach a landmark in his return to wholeness.  “Stand down, little fellow who has gone to war.  The world you knew is shifting back into its place, wider of margin and more precious for its cost.”  I place this poem firmly in “high purpose” although the familiar words of the refrain may be of silly register.

Eyebrows

“Eyebrows” appear five times in The Hobbit, and four of those times they are bushy.

  • 01.008 But Gandalf looked at him from under long bushy eyebrows
  • 01.100 and stuck out his bushy eyebrows,
  • 06.024 He gave Bilbo a queer look from under his bushy eyebrows,
  • 07.038 with his bushy black eyebrows.
  • 08.107 and eyebrows,

Three times they are Gandalf’s eyebrows, once Beorn’s and once Fili’s.  I find it interesting that eyebrows are mentioned only in the first half of the book – surely Gandalf looked gruffly out from under them when he was camped with Bard, surely someone’s got singed in the dragon-attack.  I’m having an idea.

I’ve read general agreement that the tone of The Hobbit changes around chapter 10.  I hypothesize that the moment when Thorin says

  • 10.020: “I am Thorin, son of Thrain, son of Thror”

is the inflection point in the theoretical graph of changes in diction in this work.  I believe that eyebrows are funny, particularly bushy ones, and their comic value keeps them unmentioned in the higher-register second half of the book. Not that Tolkien sat himself down with a list of funny words and said, “None of you shall appear after Barrels Out of Bond!” but that they simply weren’t the right tools for the job after that point.  I take it upon myself to make the theoretical graph a reality – those who know me know I am unable to resist!
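As a first step toward that graph, here is a minimal sketch that sorts a word's paragraph references onto either side of 10.020.  It assumes only the chapter.paragraph numbering used in the lists above; the names parse_ref and split_at_inflection are made up for illustration and do not come from the project's scripts.

```python
INFLECTION = (10, 20)  # paragraph 10.020, the hypothesized inflection point

def parse_ref(ref):
    """Turn a paragraph reference like '06.024' into (chapter, paragraph)."""
    chapter, paragraph = ref.split(".")
    return int(chapter), int(paragraph)

def split_at_inflection(refs, inflection=INFLECTION):
    """Count references before, versus at or after, the inflection point."""
    counts = {"before": 0, "after": 0}
    for ref in refs:
        side = "before" if parse_ref(ref) < inflection else "after"
        counts[side] += 1
    return counts

# The five "eyebrows" references listed above.
eyebrows = ["01.008", "01.100", "06.024", "07.038", "08.107"]
tally = split_at_inflection(eyebrows)
print(tally["before"], tally["after"])  # prints: 5 0
```

Running the same tally over every candidate "funny" word would give the raw numbers for a graph; a lopsided count for a single word proves nothing by itself.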

Lemmatizing woes: bid v bid

My goal right now is to lemmatize my list of eighty-five hundred uncommon words from The Hobbit.  In other words, if “knit” is in The Ten Thousand most common words, then I should remove “knits, knitted, knitting” from my list of words under examination.  These inflected forms are still “knit” in fancy clothes.

In the course of doing this, we will lose some of the gems.  The Ten Thousand list doesn’t distinguish between “bid, bid, bidden” (offer, as bid at an auction) and “bid, bade, bidden” (entreat, as [06.092] “The Lord of the Eagles bids you”).  I must settle for eliminating those words whose stems match a stem in The Ten Thousand most frequent.
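The stem-matching rule can be sketched in a few lines of Python.  This is only an illustration under loud assumptions: the suffix-stripping in candidate_stems is a deliberately naive stand-in for real lemmatizing, both function names are hypothetical, and the tiny "common" set below stands in for the real Ten Thousand list.

```python
def candidate_stems(word):
    """Return plausible base forms for a regularly inflected English word."""
    stems = {word}
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            base = word[: -len(suffix)]
            stems.add(base)        # knits -> knit
            stems.add(base + "e")  # covers stems that end in a silent e
            if len(base) >= 2 and base[-1] == base[-2]:
                stems.add(base[:-1])  # knitted -> knitt -> knit
    return stems

def remove_inflected(uncommon, common):
    """Keep only the words with no candidate stem among the common words."""
    common = set(common)
    return [w for w in uncommon if not (candidate_stems(w) & common)]

common = {"knit", "bid"}  # tiny stand-in for The Ten Thousand
uncommon = ["knits", "knitted", "knitting", "bebother", "bade", "clad"]
print(remove_inflected(uncommon, common))
# prints: ['bebother', 'bade', 'clad']
```

Note that irregular forms such as “bade” and “clad” slip straight past a rule like this, so they still have to be judged by hand.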

Tonight we must say farewell to arms (weapons), bid adieu to bid (entreat), and blow fair winds to blow (strong hit) in service of certainty in the specialness of the words we end up with.

See you at the other end of the alphabet!

How did we get to this list?

If I were to publish a standard concordance (software freely available) of all the words, each entry would have a few words before and a few after my Entry word.  If your computer is clever enough, it could put together the text of The Hobbit from all those overlapping words as you can see:

  • in a HOLE in the ground
  • a hole IN the ground there
  • hole in THE ground there lived.

This is my understanding of scholarly fair use:  I may chop up the words and write about them, but not in a way that your computer could put the text back together.  My idea was to chop up the text approximately into phrases with no overlap between them.  You may know that “in a hole” and “in the ground” are both in paragraph [01.001], but you don’t know in what order.  I marked up my hand-typed copy with [paragraph number] xx at the start of each paragraph and xx where I wanted to chop apart phrases.  Chopping apart phrases was a story in itself; I’m sure a post will come later.

Given that text preparation, my son wrote a Python script to make the concordance and index.  For your own copy of the script, which he publishes under a Lesser General Public License, click here.  You’ll find a Read Me, instructions, the concordance script, and others which he created for this project.
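To make the idea concrete, here is a minimal sketch of a non-overlapping concordance.  It is not the published script.  It assumes one paragraph per line in the form "[01.001] xx phrase xx phrase", which is a guess at the markup described above, and build_concordance is a hypothetical name.

```python
import re
from collections import defaultdict

# Assumed markup: "[01.001] xx first phrase xx second phrase"
PARAGRAPH = re.compile(r"\[(\d{2}\.\d{3})\]\s*xx\s+(.*)")

def build_concordance(marked_up_text):
    """Map each lowercase word to a list of (paragraph id, phrase) entries."""
    concordance = defaultdict(list)
    for line in marked_up_text.splitlines():
        match = PARAGRAPH.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed markup
        paragraph_id, body = match.groups()
        phrases = [p.strip() for p in re.split(r"\s+xx\s+", body) if p.strip()]
        # Sort the phrases so the output does not reveal their original order.
        for phrase in sorted(phrases, key=str.lower):
            for word in set(re.findall(r"[a-z'-]+", phrase.lower())):
                concordance[word].append((paragraph_id, phrase))
    return dict(concordance)

sample = "[01.001] xx in the ground xx in a hole"
concordance = build_concordance(sample)
print(concordance["hole"])  # prints: [('01.001', 'in a hole')]
```

Because each phrase is stored whole and the phrases within a paragraph are sorted alphabetically, no entry shares words with its neighbours, and the original sequence cannot be read back out of the index.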

Inspired by Richard E. Blackwelder

Richard Blackwelder, Tolkien scholar and entomologist (insect-scientist, not etymologist) and relation of my wife, gave us a copy of his book The Tolkien Thesaurus and some adjunct materials as a wedding gift with wishes for a happy life of book-loving together.  In the introduction to that work, he himself describes it as “a concordance” of The Lord of the Rings (the title Thesaurus baffles me).  He wrote thousands and thousands of lemmatized concordance-style entries with book and page number and enough of a phrase that a reader with just a word or two of a quotation tickling their memory can find the full passage easily.

Now we have Kindle and other electronic formats; we can solve in seconds the problem which Blackwelder put so many years into solving in the 1980s.  How can I carry on Blackwelder’s inspiring work?  In his companion work Tolkien Phraseology, he writes:

Among the 40,000 or so passages quoted in the Thesaurus itself there are ones of great beauty and ones that speak only of filth and darkness, ones that bring us victorious action or ones taken from folk-songs, ones representing a wide variety of poetic forms or ones conveying only some slangy command.

We may assume that a reader is following the story and the characters and may sometimes fail to notice the unusual words, phrases, or even passages.  Some appear on re-reading, but the compiler has found that many slip by repeatedly and appear only when the sentences are analyzed and the individual words singled out.

His short list of “unusual words, phrases, and passages” from The Lord of the Rings is only a list, without analysis.  This project was born: we will find the unusual words of The Hobbit and, by learning more about these words and how Tolkien uses them, become better readers of the work.

Blackwelder, Richard E. A Tolkien Thesaurus. Garland Publishing, New York, 1990.

Blackwelder, Richard E. Tolkien Phraseology: A Companion to A Tolkien Thesaurus. Tolkien Archives Fund, Marquette University, 1990.

So Many Editions! or all the pretty paragraphs

Many, many editions of The Hobbit abound – hooray that this story is dear to millions of readers!  With many editions, using a page number for a quote or idea reference can be problematic.  In my Hobbit-word study, I’ve made an index of the paragraphs of the work and given each paragraph a unique number.  When you see a quotation here or in the concordance which is my goal, you can just zip to the index to help you find it in your own edition to get context.

1951 Hobbit Paragraph Index

In the future, I’ll be exploring the 1937 edition; here’s the paragraph index of the 1942 edition’s Chapter V.  As far as I know, this one is identical to the 1937 text, and John Rateliff kindly helped me to locate it from the Children’s Book Club.

You can also find these paragraph index links on the About page.

Bread and Cheese: overlooking the most common words

While it is possible to write a story without “the be of and a in to have it I”, these top ten most frequently used words and their close neighbors form the “bread and cheese” of the corpus of written work in Modern English.  To examine Tolkien’s special way with words, I wanted to skip past the ten thousand most common words, the words which just anyone might use.  I have in time come to call them The Ten Thousand in my idiolect.

The Hobbit has about 96,000 words.  After eliminating The Ten Thousand common words, which account for all but 2 to 5 per cent of the British National Corpus (depending on whom you ask and how they measure), there remain 7,172.  Less-common words make up about 7.5 per cent of The Hobbit.  Ahem.  Now, over a hundred of those words are “hobbit”.  But only one is “bebother”.
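For anyone who wants to check the arithmetic, this short snippet recomputes the share from the two figures quoted above.  The variable names are only illustrative, and the 96,000 total is itself an approximation.

```python
total_words = 96_000    # approximate length of The Hobbit, as stated above
uncommon_words = 7_172  # words left after removing The Ten Thousand

share = uncommon_words / total_words * 100
print(f"{share:.1f} per cent")  # prints: 7.5 per cent
```

That is well above the 2 to 5 per cent that less-common words occupy in the British National Corpus.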

Strap on your goggles; it’s going to be quite an adventure.

Please see the Works Cited page for full information on our sources.

Leech, Rayson, and Wilson. Word Frequencies in Written and Spoken English.

Good Morning!

And I mean it!

The Tolkien Professor has suggested that I record the pleasing patterns of wonderful words in The Hobbit by J. R. R. Tolkien.  The story sings because the words dance.  Come dance with them!

On this adventure, we use this edition of the work:

Tolkien, J.R.R. (2012-02-15). The Hobbit: 75th Anniversary Edition. Houghton Mifflin Harcourt. Kindle Edition.

We begin by saying Thank you:

  • Corey Olsen, president of Signum University and my thesis advisor.
  • Robin Reid, for advice on fair use and assistance with using the text.
  • John Rateliff, for assistance with the 1937 text.
  • Doug Anderson, for advice on paragraph enumeration.
  • Daroc Alden, coder and data-moosher.
  • Grace Alden, my wife.