An Incredibly Dumb Way to Make Sentences
For a long time, Markov chains seemed unapproachable to me. I’m not exactly sure why; perhaps they were always brought up in the context of artificial intelligence, or academic math, or their impossible-to-understand Wikipedia page.
I’m here to tell you: Markov chains are easy to understand, especially the first-order ones. Let’s take a look.
Take the following sentence fragment:
the cat and the dog
Look at each word and count the words that come after it:
the -> cat: 1
-> dog: 1
cat -> and: 1
and -> the: 1
Let’s go through each word in “the cat and the dog”. “the” has two words that follow it: “cat” (“the cat”) and “dog” (“the dog”). “cat” is followed by “and”, and “and” is followed by “the”. With those pairings, the whole sentence is accounted for: we’ve already mapped the “the” after “and”, and “dog” has no words that follow it. The numbers next to each “follower word” are the number of times it appears after the first word. For example, “cat” appears only once after “the”.
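That counting is easy to code up. Here’s a minimal sketch in Swift (the language I ended up using for my own version; the function name and shape are my own choices, not anything standard). It splits the text on spaces and, for each word, counts how often each follower appears:

```swift
// Map each word to the counts of the words that follow it.
// e.g. "the cat and the dog" -> ["the": ["cat": 1, "dog": 1], ...]
func buildChain(from text: String) -> [String: [String: Int]] {
    let words = text.split(separator: " ").map(String.init)
    var chain: [String: [String: Int]] = [:]
    // Pair every word with the word that comes right after it.
    for (word, follower) in zip(words, words.dropFirst()) {
        chain[word, default: [:]][follower, default: 0] += 1
    }
    return chain
}

let chain = buildChain(from: "the cat and the dog")
// chain["the"] is ["cat": 1, "dog": 1]
```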
This is the basic structure of the words in this sentence. It allows us to make new, similar sentences. We can do that by picking a word at random, then picking one of its follower words at random, and repeating. For example, we can randomly start with “cat”:
cat and the cat and the dog
We stopped at “dog” because it has no followers. As you can see, this is a semi-plausible sentence, and it looks quite a bit like our original sentence. Even with the randomness, there just aren’t many followers to choose from.
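In code, that random walk might look something like this, reusing buildChain from the sketch above (again, just one way to do it):

```swift
// Start from a word and keep appending a randomly chosen follower
// until we reach a word with no followers (like "dog" above).
func generate(from chain: [String: [String: Int]], startingAt start: String) -> String {
    var words = [start]
    var current = start
    while let next = chain[current]?.keys.randomElement() {
        words.append(next)
        current = next
    }
    return words.joined(separator: " ")
}

print(generate(from: chain, startingAt: "cat"))
// e.g. "cat and the cat and the dog"
```

Note that this version picks each distinct follower with equal probability.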
More words
Here’s an excerpt of the word structure from Shakespeare’s Much Ado About Nothing. It’s massive, so I’ll only show you some of the structure for “a”:
a -> man: 22
-> good: 17
-> very: 7
-> young: 3
-> kind: 3
-> lord: 3
-> messenger: 2
-> lady: 2
-> new: 2
-> lion: 1
-> headborough: 1
-> sexton: 1
-> lamb: 1
-> badge: 1
-> constable: 1
-> victory: 1
-> stuffed: 1
-> skirmish: 1
-> difference: 1
-> reasonable: 1
-> boy: 1
...
That’s a lot more to work with, and it goes on for quite some time! Here are some examples of the sentences we can generate with all that text:
scholar would modestly examine your answer
husband have you welcome my place
purse such matter there’s a rougher task in your friend
Now, I know what you’re thinking: “Wow! That’s literally as good as Shakespeare!” I know. I used a little something extra to pull off this convincing effect: weighting.
Notice how in the word chain above, “man” appears 22 times after “a”, but a word like “headborough” appears only once. That means that if we come to the word “a” and we’re picking a follower word, “man” should be 22 times more likely to follow than “headborough”. This is a weighted algorithm, and it creates more plausible sentences.
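Here’s one way that weighted pick might look in Swift (a sketch under the same assumptions as before): roll a random number up to the total count, then walk through the counts until the roll is used up.

```swift
// Pick a follower where a word counted 22 times is 22 times more
// likely to be chosen than a word counted once.
func weightedPick(from followers: [String: Int]) -> String? {
    let total = followers.values.reduce(0, +)
    guard total > 0 else { return nil }
    var roll = Int.random(in: 1...total)
    for (word, count) in followers {
        roll -= count
        if roll <= 0 { return word }
    }
    return nil // not reached when total > 0
}
```

Swap weightedPick(from:) in for keys.randomElement() in the generator above, and the common followers start to dominate.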
Second-order Markov chains
So far we’ve only been tracking one word of context before each follower. This is called a first-order Markov chain. But if we extend that context to two words, we get a second-order Markov chain:
the cat -> and: 1
cat and -> the: 1
and the -> dog: 1
The really bad example sentence that I picked is… well, really bad at showing this, but now we have two words to look at when choosing a follower word. That is, if we generate “the cat”, we can see that it’s always followed by “and”. This gives us more realistic sentences because each choice has more context.
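Here’s a sketch of how the bookkeeping changes for a second-order chain. The key becomes the previous two words instead of one; joining them with a space is just my choice of key format:

```swift
// Second-order: key on the previous *two* words, joined with a space.
func buildSecondOrderChain(from text: String) -> [String: [String: Int]] {
    let words = text.split(separator: " ").map(String.init)
    var chain: [String: [String: Int]] = [:]
    guard words.count >= 3 else { return chain }
    for i in 0..<(words.count - 2) {
        let key = words[i] + " " + words[i + 1]        // e.g. "the cat"
        chain[key, default: [:]][words[i + 2], default: 0] += 1
    }
    return chain
}

let chain2 = buildSecondOrderChain(from: "the cat and the dog")
// chain2["the cat"] is ["and": 1]
```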
Learning a new programming language
A Markov chain is now my go-to project for learning a new programming language. It’s relatively simple and gets me working with basic data structures, file loading, randomness, and loops. I recently started learning Swift and made one.
Pun generator
However, programming a Markov chain won’t expose you to all parts of a language. My friend Gabe wrote a blog post about learning a new programming language that outlines the steps to make a pun generator. It’s a fantastic post that ends with you having a program that generates the lamest of puns.
I went through it and created a Swift version.
I’m writing this post as sort of a prequel to Gabe’s. I found the pun generator to be a large undertaking compared to a Markov chain. If you’re looking for something between “hello world” and generating puns, I suggest you give Markov chains a try.