Chains and ladders

This week’s Numberphile video features the BBC’s favorite mathematician, Marcus du Sautoy, who explains how the game Snakes and Ladders1 is governed by the mathematics of Markov chains. Despite some experience analyzing Markov chains in grad school, I had trouble understanding one part of the video, so I pulled out my old textbook to clear things up.

But before we get into the math, a short digression. One of my favorite episodes of Melvin Bragg’s In Our Time radio show was the “Zeno’s Paradoxes” show from back in 2016. I distinctly remember being infuriated by the show because du Sautoy’s fellow guests, a philosopher and a classicist, clearly thought that Zeno’s reasoning on paradoxes like Achilles and the tortoise and the flying arrow was still valuable. Not because it gives us insight into the historical development of thinking on infinite series or state space representation—which it absolutely does—but because it really is puzzling that Achilles passes the tortoise. What made the show so entertaining was listening to du Sautoy’s obvious frustration with what he considered their obtuseness with regard to solved problems and his need to suppress that frustration for the sake of politeness.

OK, onto the math. Du Sautoy starts by setting up a small Snakes and Ladders game with just ten squares—labeled 0 through 9—one ladder, and one snake.

Small snakes and ladders game board

You move across the board by rolling a single die and advancing your marker accordingly. The goal is to reach Square 9 with an exact roll. If you roll a number that is more than you need to reach Square 9, you stay where you are until the next roll.

He then goes through the construction of a transition matrix, in which each element is the probability of moving from the square represented by the row to the square represented by the column. It looks like this when he’s done:

du Sautoy transition matrix

The row and column of numbers outside the matrix are the squares on the game board. You’ll note that three of the positions are missing:

P transition matrix

I’m calling this matrix PP to avoid confusing it with du Sautoy’s transition matrix, which he calls QQ. We’ll return to this matrix soon.

It’s after construction his transition matrix that du Sautoy does what I don’t understand. He wants to calculate the expected number of turns it will take to reach Square 9 from Square 0. The equation he comes up with is this infinite sum,

I+Q+Q2+Q3+Q4+I + Q + Q^2 + Q^3 + Q^4 + \ldots

where II is the identity matrix and the powers represent matrix multiplication. The sum of the top row of the resulting matrix is the expected number of rolls to get to Square 9. I understand everything du Sautoy says as he explains what each of the terms in this series is, but I don’t get why it leads to the expected number of rolls.

I even understand the next part of the video, where some clever algebra shows that this infinite series converges to the non-infinite matrix2

(IQ)1\left( I - Q \right)^{-1}

So the sum of the elements of the top row of this inverted matrix is the expected number of rolls to get to Square 9.

As it happens, you can get to this final answer by a method I understand. We’ll start by returning to the PP matrix I defined earlier,

P transition matrix

As you can see, the first seven rows and columns are the same as QQ. The first seven elements of the last column consists of the probabilities of getting to Square 9 from each of the earlier squares. These are the probabilities that were skipped over in the video, and they’re all either zero or 1/61/6 because the roll has to be exactly the number of squares away from 9 you are. Note also that the sum of all the terms in each row must be 1 because you have to land somewhere after each row.

The last row is special. It says that once you’re on Square 9, you can’t go anywhere else. In the study of Markov chains, this is known as an absorbing state, and the properties of absorbing states are pretty well known. In particular, there’s an established formula for the expected number of rolls to get from Square ii to Square 9, the absorbing state.3

We’ll call the expected number of rolls to get from Square ii to Square 9 mim_i. One thing we can say right away is that

m9=0m_9 = 0

because you don’t need to roll to get to the end when you’re already there. For all the other values of ii, we use some lateral thinking.

First, we know that if we’re on Square 1, it takes m1m_1 rolls to get to the end—that’s by definition. So if we start on Square 0 and go to Square 1 on our first roll, it will take, on average, 1+m11 + m_1 rolls to get to the end. But of course we may not go from Square 0 to Square 1; we might go to Square 2 or 3 or whatever. Each of these possibilities for the first roll has a probability assigned to it in the transition matrix, PP. So the expected value of the number of rolls to get from Square 0 to Square 9 is

m0=(1+m1)p01+(1+m2)p02+(1+m3)p03m_0 = (1 + m_1)\,p_{01} + (1 + m_2)\,p_{02} + (1 + m_3)\,p_{03} +(1+m5)p05+(1+m6)p06+(1+m7)p07\quad + \:(1 + m_5)\,p_{05} + (1 + m_6)\,p_{06} + (1 + m_7)\,p_{07}

where the pp values are taken from the top row of the PP matrix. It may look like we’ve gone backwards here, introducing all the other mim_i into an equation for m0m_0. But let’s see what happens when we generalize this to any starting square and make the equation more compact with the summation symbol.

mi=allj(1+mj)pijm_i = \sum\limits_{\mathrm{all}\: j} (1 + m_j) \; p_{ij}

Doing some algebra, we get

mi=alljpij+alljmjpijm_i = \sum\limits_{\mathrm{all}\: j} p_{ij} + \sum\limits_{\mathrm{all}\: j} m_j \; p_{ij}

Note that the first sum in this equation is the sum of all the elements of PP in row ii, and that is 1 for each row. Combining that with m9=0m_9 = 0, we can simplify this equation to

mi=1+allj9mjpijm_i = 1 + \sum\limits_{\mathrm{all}\: j \neq 9} m_j \; p_{ij}

This works for all values of i9i \neq 9. So what we have here are seven linear equations in seven unknowns, which is hard to solve by hand, but easy to solve with a computer. If we put it in matrix form, it becomes interesting:

m=1+Qmm = 1 + Q\, m

Here, the mm is a column vector of all the mim_i terms, the 11 is a column vector of ones, and QQ is the matrix PP without its last row and column—yes, it’s du Sautoy’s transition matrix that ignored Square 9. Since m=Imm = I\,m, we can do the following manipulation:

m=Im=1+Qmm = I\, m = 1 + Q \, m (IQ)m=1(I - Q)\, m = 1 m=(IQ)11m = \left( I - Q \right)^{-1}\, 1

Well, this should look familiar. Because we’re multiplying by a column vector of ones, each element of mm is equal to the sum of the corresponding row of the inverted matrix. And therefore the expected number of rolls to get from Square 0 to Square 9 is the sum of the top row of

(IQ)1\left( I - Q \right)^{-1}

In solving this problem, you wouldn’t explicitly invert the matrix, because that’s more computationally intensive than solving a set of linear equations. But I wrote it out this way to show that it’s the same result as du Sautoy’s.

One thing that isn’t the same as du Sautoy’s is the numerical answer. I got the following:

m=[]Tm = [ 8.6 \quad 8.4 \quad 8.4 \quad 7.2 \quad 7.2 \quad 7.2 \quad 7.2 ]^T

As you can see, my answer for m0m_0 is 8.6, not 10 as du Sautoy says in the video. I checked my expression for QQ and it matched his. I went through all the transition probabilities again and agreed with his. Then I looked in the comments and found that several Numberphile followers also got 8.6, which made me feel better about my answer. While it’s possible we all made the same mistake, given that this is a very simple calculation, I think it’s more likely du Sautoy had a typo somewhere in his work. If you see an error here, though, I’d like to know about it.

  1. Probably known to most Americans as Chutes and Ladders, thanks to Hasbro

  2. Shades of Achilles! 

  3. In standard Markov chain terminology, there are states and steps. In Snakes and Ladders, these correspond to squares and rolls, so I’ll be using that terminology.