BookCovers

We covered a fair amount of natural language processing in my Computational Humanities course this spring. As our running example text, I used selections from the epic fantasy series The Wheel of Time. This proved to be a rich source of material for exploring how to quantify textual meaning and writing style with computational tools.

After visualizing letter frequency for The Eye of the World, the first book in the series, we set out to determine whether the word distribution in the text matched the distribution predicted by Zipf’s Law, under which a word’s frequency is roughly inversely proportional to its rank.

Word Frequency
the 19672
and 8132
to 7382
a 6807
he 6614
of 6383
his 4617
in 4132
was 3838
it 3519

Frequency of words in The Eye of the World.
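
To make the Zipf comparison concrete, here is a minimal sketch in Python. It is not the code from the course notebook: the tokenizing regular expression, the function names, and the toy sentence are all assumptions made for illustration. It counts words and pairs each top word with the count a simple 1/rank law would predict.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count lowercase word tokens, keeping apostrophes inside words."""
    # Assumed tokenizer: runs of letters, optionally joined by ' or ’.
    words = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
    return Counter(words)

def zipf_comparison(counts, top=10):
    """Return (word, observed, predicted) for the top-ranked words.

    Under a simple form of Zipf's law, frequency is proportional to
    1 / rank, so the prediction for rank r is top_count / r.
    """
    ranked = counts.most_common(top)
    top_count = ranked[0][1]
    return [(word, count, top_count / rank)
            for rank, (word, count) in enumerate(ranked, start=1)]

# Toy sentence, invented for this example (not taken from the book).
sample = "the cat and the dog and the bird saw the cat"
for word, observed, predicted in zipf_comparison(word_frequencies(sample), top=3):
    print(f"{word:>6} {observed:>4} {predicted:8.1f}")
```

Applied to the table above, the same rule would predict 19672 / 2 = 9836 for the second-ranked word (observed: 8132) and roughly 6557 for the third (observed: 7382), so the fit is approximate rather than exact.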

For a given text, we calculated the usage frequency of each word. As is typical, the top words revealed little about the content of the document, since they are common function words such as the, and, and to. Removing these highly used English words, known as stop words, let us uncover more of the content. From there we moved on to understanding the algorithms behind drawing word clouds, where words are plotted in an image with their size proportional to their frequency in the document.
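
The two steps described here, dropping stop words and scaling size by frequency, can be sketched as follows. The stop-word list is a tiny stand-in built from the top ten words in the table above, the counts in the usage example are made up, and the function names and the 72-point maximum are assumptions rather than the notebook's choices.

```python
from collections import Counter

# Tiny illustrative stop-word list; a real list would be much longer.
STOP_WORDS = {"the", "and", "to", "a", "he", "of", "his", "in", "was", "it"}

def remove_stop_words(counts, stop_words=STOP_WORDS):
    """Return a new Counter without the stop words."""
    return Counter({w: c for w, c in counts.items() if w not in stop_words})

def font_sizes(counts, top=50, max_size=72.0):
    """Map each of the top words to a font size proportional to its count.

    The most frequent word gets max_size; a word used half as often
    gets half that size.
    """
    ranked = counts.most_common(top)
    if not ranked:
        return {}
    biggest = ranked[0][1]
    return {word: max_size * count / biggest for word, count in ranked}

# Made-up counts, for illustration only.
toy_counts = Counter({"the": 10, "and": 6, "rand": 4, "sword": 2})
print(font_sizes(remove_stop_words(toy_counts)))
```

Strictly linear scaling lets one dominant word dwarf the rest, so a square-root or logarithmic scale is a common alternative when the top count is far larger than the others.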

In the end, we developed a first approximation to the Wordle algorithm, using a monospaced font and ignoring the possibility of nesting words inside the nooks and crannies of other letters. Using a wordlist of English words, we could also highlight in red the words missing from that list, which typically denote characters or locations. You can follow along with the development and code in this Jupyter Notebook.
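
As a rough illustration of this kind of approximation (a sketch under stated assumptions, not the notebook's implementation), the code below treats every word as a plain rectangle, which a monospaced font makes easy to compute. It places words largest first, walking each one outward along a spiral until its rectangle no longer collides with anything already placed. The glyph aspect ratio, the spiral parameters, and all names are hypothetical. Words missing from the supplied English wordlist are colored red.

```python
import math

CHAR_ASPECT = 0.6  # assumed width-to-height ratio of one monospaced glyph

def bounding_box(word, size, x, y):
    """Rectangle (left, top, right, bottom) for a word centred at (x, y)."""
    width = len(word) * size * CHAR_ASPECT
    return (x - width / 2, y - size / 2, x + width / 2, y + size / 2)

def overlaps(a, b):
    """True if two rectangles share any interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def layout(sizes, english_words, step=0.1, max_steps=100_000):
    """Place words largest first; return a list of (word, box, color)."""
    placed = []
    for word, size in sorted(sizes.items(), key=lambda kv: -kv[1]):
        for i in range(max_steps):
            # Walk outward along an Archimedean spiral from the centre.
            angle = step * i
            radius = 2.0 * angle
            box = bounding_box(word, size,
                               radius * math.cos(angle),
                               radius * math.sin(angle))
            if not any(overlaps(box, other) for _, other, _ in placed):
                color = "black" if word in english_words else "red"
                placed.append((word, box, color))
                break
        # A word that never finds a free spot within max_steps is skipped.
    return placed

# Made-up sizes and a tiny stand-in wordlist, for illustration only.
cloud = layout({"rand": 72.0, "sword": 36.0, "said": 30.0}, {"sword", "said"})
for word, box, color in cloud:
    print(word, color, [round(v, 1) for v in box])
```

Because whole rectangles are tested for collisions, small words can never tuck into the gaps between letters of larger ones, which is exactly the simplification described above.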

I’m including a word cloud that I generated for each book in the series. A few things to note: the main character of the series, Rand al’Thor, is prominent in each of the clouds, although you can see when the attention shifts from him to the side stories of other characters. Also, the system of magic in the world is very gendered, hence the high frequency of man and woman in the books. I’ll focus on the shifting cast of characters in a later post, and after that pick up on the rise of contractions like he’d, I’ve, and you’re.

Spoiler Alert

While these clouds don’t convey any information about the plot, they might give away some relevant details.

The Eye of the World

EyeOfTheWorld

The Great Hunt

GreatHunt

The Dragon Reborn

DragonReborn

The Shadow Rising

ShadowRising

The Fires of Heaven

FiresOfHeaven

Lord of Chaos

LordOfChaos

A Crown of Swords

CrownOfSwords

The Path of Daggers

PathOfDaggers

Winter’s Heart

WintersHeart

Crossroads of Twilight

CrossroadsOfTwilight

Knife of Dreams

KnifeOfDreams

The Gathering Storm

GatheringStorm

Towers of Midnight

TowersOfMidnight

A Memory of Light

MemoryOfLight