In yesterday’s 7172 post, I planned to tackle next the words that carry a special tag in the OED, such as obsolete, archaic, rare, dialectal, or jocular words. Overnight, my plan jelled. My goal from the beginning was to distinguish between high-register and low-register words, ignoring the middle ground for now.
I’m going to use the tag “high” on the very few obsolete words, the archaic words, the rare words, and any other words which contribute to high register, as labeled by the OED.
The tag “low” is going on parochial words, dialectal words, regional words, cuss words, and jocular words.
By making tagged entries for each of these, we’ll be ready to move forward with our lexomics analysis while still making entries for other words.
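For anyone who wants to try the scheme in code, here is a minimal sketch of it as a lookup table. The label strings, the `REGISTER_BY_LABEL` table, and the `register_tag` function are my own hypothetical stand-ins, not the OED’s actual label names, and treating a word with both kinds of label as untagged is an assumption rather than a settled rule.

```python
# Hypothetical mapping from usage labels to register tags.
# The label strings are illustrative stand-ins, not the OED's exact wording.
REGISTER_BY_LABEL = {
    "obsolete": "high",
    "archaic": "high",
    "rare": "high",
    "parochial": "low",
    "dialectal": "low",
    "regional": "low",
    "cuss": "low",
    "jocular": "low",
}


def register_tag(labels):
    """Return "high", "low", or None for one word's list of usage labels."""
    tags = {REGISTER_BY_LABEL[label] for label in labels if label in REGISTER_BY_LABEL}
    if len(tags) == 1:
        return tags.pop()
    # Assumption: no recognized label, or a mix of high and low labels,
    # leaves the word in the untagged middle ground.
    return None


if __name__ == "__main__":
    print(register_tag(["archaic"]))
    print(register_tag(["regional", "jocular"]))
    print(register_tag(["archaic", "jocular"]))
```

Words with no recognized label come back as `None`, which matches ignoring the middle ground for now.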
Update 2015.06.20: tagging things “high” or “low” is an even less exact science than picking cherries… I have thrown in the towel on this one for now.