Better heads

As I mentioned in the update to last night’s post, Mark Berger wrote two scripts, one in Groovy and the other in Ruby, that use a much smarter approach to generating the coin flip sequences. He asked me what his approach would look like in Python, so here goes.

The cleverness in Mark’s scripts comes from his use of binary notation. Each sequence of eight coin flips is represented by an eight-digit binary number, with ones representing heads and zeros representing tails. There are 256 sequences, ranging from 00000000 to 11111111. Instead of using a recursive function to generate the sequences, he just loops from 0 to 255, converting each number to a binary string.

My Python rewrite of his scripts is this:

python:
 1:  #!/usr/bin/python
 2:  
 3:  f = 8
 4:  fmt = '{:0>%ds}' % f
 5:  seqs = []
 6:  
 7:  for num in range(2**f):
 8:    flips = fmt.format(bin(num)[2:])
 9:    heads = flips.count('1')
10:    circle = flips[-1] + flips
11:    if circle.find('11') == -1:
12:      seqs.append('%d: %s' % (heads, flips))
13:  
14:  seqs.sort()
15:  print '%d sequences' % len(seqs)
16:  print '\n'.join(seqs)

Line 8 is the key to the script. It uses the built-in bin function to convert each number to a binary string representation. Because bin prefixes each string with 0b, that had to be removed. Also, because the binary representation created by bin doesn’t include leading zeros, I had to add those back in with format. The format string is defined in Line 4 to have a length of f and be right justified with leading zeros.

As an example of what Line 8 does, let’s assume that num is 5. Step by step, the transformations go like this:

                 bin(5)      => '0b101'
                 bin(5)[2:]  => '101'
'{:0>8s}'.format(bin(5)[2:]) => '00000101'

With the addition of a few more lines, I could return to showing the coin flips as H and T instead of 1 and 0:

python:
 1:  #!/usr/bin/python
 2:  
 3:  from string import maketrans
 4:  
 5:  f = 8
 6:  fmt = '{:T>%ds}' % f
 7:  trans = maketrans('01', 'TH')
 8:  seqs = []
 9:  
10:  for num in range(2**f):
11:    flips = fmt.format(bin(num)[2:].translate(trans))
12:    heads = flips.count('H')
13:    circle = flips[-1] + flips
14:    if circle.find('HH') == -1:
15:      seqs.append('%d: %s' % (heads, flips))
16:  
17:  seqs.sort()
18:  print '%d sequences' % len(seqs)
19:  print '\n'.join(seqs)

This is still shorter than my previous script and the loop is easier to understand than the recursive function.

This got me thinking about using different number bases to enumerate sequences of other random events. For example, base 6 would work for standard cubic dice. Unfortunately, while Python has bin for binary, oct for octal, and hex for hexadecimal, it doesn’t have a conversion function for other number systems (or for an arbitrary base). But it’s not too hard to write one.

Base 6 is called a senary number system, so the function that acts analogously to bin, oct, and hex is sen:

python:
 1:  def sen(n):
 2:    d, m = divmod(n, 6)
 3:    s = str(m)
 4:    while d > 5:
 5:      d, m = divmod(d, 6)
 6:      s = str(m) + s
 7:    if d == 0:
 8:      return s
 9:    else:
10:      return str(d) + s

I didn’t know what prefix to use for a senary string, so I didn’t include one. Examples of sen in action are:

sen(5)   => '5'
sen(7)   => '11'
sen(52)  => '124'
sen(121) => '321'

The problem with using sen for dice is that it’s more natural to have each die reported as a number from 1 to 6 instead of from 0 to 5. That function’s pretty easy to write, too.

python:
1:  def senp1(n):
2:    d, m = divmod(n, 6)
3:    s = str(m+1)
4:    while d > 5:
5:      d, m = divmod(d, 6)
6:      s = str(m+1) + s
7:    return str(d+1) + s

I called it senp1 for sen plus one to match the expm1 function from the math library.

Here’s a script with senp1 simulating a Yahtzee roll of five dice and determining the probability of getting three of a kind:

python:
 1:  #!/usr/bin/python
 2:  
 3:  from collections import Counter
 4:  
 5:  dice = 5
 6:  N = 6**dice
 7:  n = 0
 8:  
 9:  def senp1(n):
10:    d, m = divmod(n, 6)
11:    s = str(m+1)
12:    while d > 5:
13:      d, m = divmod(d, 6)
14:      s = str(m+1) + s
15:    return str(d+1) + s
16:  
17:  for num in range(N):
18:    fmt = '{:1>%ds}' % dice
19:    roll = fmt.format(senp1(num))
20:    count = Counter()
21:    for die in roll:
22:      count[die] += 1
23:    top = count.most_common(2)
24:    if top[0][1] == 3 and top[1][1] != 2:
25:      n += 1
26:  
27:  print '%d out of %d or %.3f%%' % (n, N, n*100.0/N)

It uses the Counter class from the collections library to track the number of matching dice on each roll, and the most_common method of that class to pull out the most common matches. Lines 24–25 increment a counter only if the roll is exactly three of a kind—four and five of a kind are excluded, as is a full house.1

The output is

1200 out of 7776 or 15.432%

which means getting three of a kind on your first roll isn’t especially uncommon. Not that this particular result was likely to be a big concern of yours on a cold February night, but if you play dice games, you could use similar scripts to work out the best strategy for different situations.

Thanks to Mark for pointing out a better approach to enumeration.


  1. If you don’t know the Holy Grail quote that goes with this sentence, I don’t know what to do with you. ↩︎