Edward Loveall

An Incredibly Dumb Way to Make Sentences

For a long time, Markov chains seemed unapproachable to me. I’m not exactly sure why; perhaps they were always brought up in the context of artificial intelligence, or academic math, or their impossible-to-understand Wikipedia page.

I’m here to tell you: Markov chains are easy to understand, especially the first-order ones. Let’s take a look.

Take the following sentence fragment:

the cat and the dog

Look at each word and count the words that come after it:

the -> cat: 1
    -> dog: 1
cat -> and: 1
and -> the: 1

Let’s go through each word in the cat and the dog. the has two words that follow it: cat (the cat) and dog (the dog). cat is followed by and. and is followed by the. With those pairings, the whole sentence is accounted for: we’ve already mapped the the after and, and dog has no words that follow it. The numbers next to each “follower word” are the number of times it appears after the first word. For example, cat appears only once after the.
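Counting followers like this takes only a few lines of code. Here's a minimal sketch in Python (the function name `follower_counts` is mine, just for illustration):

```python
from collections import defaultdict

def follower_counts(text):
    """For each word, count how often each other word follows it."""
    words = text.split()
    counts = defaultdict(lambda: defaultdict(int))
    # Walk over consecutive pairs: (the, cat), (cat, and), ...
    for first, second in zip(words, words[1:]):
        counts[first][second] += 1
    return counts

counts = follower_counts("the cat and the dog")
# counts["the"] is {"cat": 1, "dog": 1}; "dog" has no entry at all
```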

This is the basic structure of the words in this sentence. It allows us to make new, similar sentences: pick a word at random, then pick one of its follower words at random, and repeat. For example, we can randomly start with cat:

cat and the cat and the dog

We stopped at dog because it has no followers. As you can see, this is a semi-plausible sentence, and it looks quite a bit like our original. Even with the randomness, there just aren't many followers to choose from.
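The pick-a-follower-and-repeat loop might look like this in Python. This is a sketch, not the author's actual program; I've added a word cap so the walk can't loop forever through the cat and the cat and…:

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to a list of the words that follow it."""
    words = text.split()
    chain = defaultdict(list)
    for first, second in zip(words, words[1:]):
        chain[first].append(second)
    return chain

def generate(chain, start, max_words=20):
    """Random-walk the chain until a word has no followers (or we hit the cap)."""
    word = start
    sentence = [word]
    while word in chain and len(sentence) < max_words:
        word = random.choice(chain[word])  # pick a follower at random
        sentence.append(word)
    return " ".join(sentence)

chain = build_chain("the cat and the dog")
print(generate(chain, "cat"))  # e.g. "cat and the dog"
```

Storing followers as a plain list (with one entry per occurrence) keeps the code short, and it pays off again later when we want weighting.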

More words

Here’s an excerpt of the word structure from Shakespeare’s Much Ado About Nothing. It’s massive so I’ll only show you some of the structure for “a”:

a -> man: 22
  -> good: 17
  -> very: 7
  -> young: 3
  -> kind: 3
  -> lord: 3
  -> messenger: 2
  -> lady: 2
  -> new: 2
  -> lion: 1
  -> headborough: 1
  -> sexton: 1
  -> lamb: 1
  -> badge: 1
  -> constable: 1
  -> victory: 1
  -> stuffed: 1
  -> skirmish: 1
  -> difference: 1
  -> reasonable: 1
  -> boy: 1

That’s a lot more to work with, and it goes on for quite some time! Here are some examples of the sentences we can generate with all that text:

scholar would modestly examine your answer

husband have you welcome my place

purse such matter there’s a rougher task in your friend

Now, I know what you’re thinking: “Wow! That’s literally as good as Shakespeare!” I know. I used a little something extra to pull off this convincing effect: weighting.

Notice how in the word chain above, man appears 22 times after a, but a word like headborough only appears once. That means, if we come to the word a and we’re picking a follower word, man is 22 times more likely to follow than headborough. This is a weighted algorithm and it will create more plausible sentences.
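One simple way to implement the weighted pick (again, a sketch, not the author's code) is Python's random.choices, which accepts per-item weights. The follower dictionary below is a small excerpt of the "a" structure shown above:

```python
import random

# Excerpt of the follower counts for "a" from Much Ado About Nothing
followers = {"man": 22, "good": 17, "very": 7, "headborough": 1}

def weighted_follower(counts):
    """Pick a follower, weighted by how often it appeared in the source text."""
    words = list(counts)
    weights = [counts[w] for w in words]
    return random.choices(words, weights=weights, k=1)[0]
```

Note that if you store followers as a list with one entry per occurrence (22 copies of man, 1 of headborough), a plain random.choice over that list is already weighted and no extra code is needed.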

Second-order Markov chains

We're just tracking a single follower after each word. This is called a first-order Markov chain. But if we were to extend the key to two words, we'd have a second-order Markov chain:

the cat -> and: 1
cat and -> the: 1
and the -> dog: 1

The really bad example sentence that I picked is… well, really bad at showing this, but now we have two words to look at when choosing a follower word. That is, if we generate the cat, we can see that it's always followed by and. This can give us more realistic sentences because there's more context.
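The only change from the first-order version is keying the chain on a pair of words instead of one. A sketch, with illustrative names of my own choosing:

```python
import random
from collections import defaultdict

def build_second_order(text):
    """Map each pair of consecutive words to the words that follow the pair."""
    words = text.split()
    chain = defaultdict(list)
    # Walk over consecutive triples: (the, cat, and), (cat, and, the), ...
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        chain[(w1, w2)].append(w3)
    return chain

def generate(chain, w1, w2, max_words=20):
    """Random-walk, sliding the two-word window along as we go."""
    sentence = [w1, w2]
    while (w1, w2) in chain and len(sentence) < max_words:
        w1, w2 = w2, random.choice(chain[(w1, w2)])
        sentence.append(w2)
    return " ".join(sentence)

chain = build_second_order("the cat and the dog")
# chain[("the", "cat")] is ["and"]
```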

Learning a new programming language

A Markov chain is now my go-to project for learning a new programming language. It’s relatively simple and gets me working with basic data structures, file loading, randomness, and loops. I recently started learning Swift and made one.

Pun generator

However, programming a Markov chain won't expose you to all parts of a language. My friend Gabe wrote a blog post about learning a new programming language that outlines the steps to make a pun generator. It's a fantastic post that will leave you with a program that generates the lamest of puns.

I went through it and created a Swift version.

I'm writing this post as a sort of prequel to Gabe's. I found the pun generator to be a large undertaking compared to a Markov chain. If you're looking for something between "hello world" and generating puns, I suggest you give Markov chains a try.