Basketball trivia, editing distances, and derangement

October 8, 2025 at 2:40 PM by Dr. Drang

My older son and I have been playing a daily basketball trivia game. Here’s the question from yesterday with my answers handwritten on the sheet.

Trivia question

Here’s the text of the question. This is easier than trying to put it in the image’s alt text.

Match the Suns great with his retired uniform number.

1. Tom Chambers A. 42

2. Steve Nash B. 24

3. Dan Majerle C. 13

4. Connie Hawkins D. 9

The correct answer, in player order, is BCDA, not my answer of ADCB. I won’t blame anyone who scoffs at my failure to get Nash’s number right. My excuse is that I thought it was so obvious for a smart playmaking guard to wear #13 that it had to be wrong. So I chose Majerle for #13 instead. This sort of mistake is often called “outsmarting” oneself, but I think it’s more “outstupiding.”

Anyway, I got every number wrong, but I noticed that two switches, Majerle/Nash and Chambers/Hawkins, would’ve made my answer correct. I knew there were various methods for measuring edit distances of text strings and wondered if any of them would provide guidance on how “close” my answer was.

The simplest is the Hamming distance, which is just a count of how many characters are wrong. It’s limited to comparing strings of equal length, which is fine for this situation. Since every character in my answer is wrong, its Hamming distance from the correct answer is 4.

I’ve seen lots of references to the Levenshtein distance but have never used it or read up on it before now. It’s a count of the number of single-character insertions, deletions, and substitutions necessary to go from one string to another. The Levenshtein distance from my answer to the correct one is also 4. Since switching letters isn’t a valid edit in a Levenshtein calculation, you can’t go from ADCB to BCDA in fewer steps than 4 substitutions.

There’s also a Damerau-Levenshtein distance, which extends the Levenshtein idea by allowing the transposition of adjacent characters as one of the editing steps. The Damerau-Levenshtein distance from my answer to the correct one is 3: substitution of the first character, substitution of the last character, and transposition of the two interior characters.

After doing these calculations, I started thinking about how these last two distances are better for evaluating actual words than strings like the trivia answer. For example, in the Damerau-Levenshtein calculation, we’re allowed to switch only adjacent characters, not characters that are two or more positions apart. This makes sense when dealing with words. We’ve all typed teh or hte, and some of us have text substitutions set up to transform them automatically into the. Also, there are words that don’t follow the “I before E” rule—Levenshtein, to pick a random example—which makes them easy to misspell by switching adjacent letters.

But the answer to this trivia question isn’t like a word. The question would be effectively the same if it were worded this way:

Match the Suns great with his retired uniform number.

1. Connie Hawkins A. 42

2. Tom Chambers B. 24

3. Steve Nash C. 13

4. Dan Majerle D. 9

I’ve kept the uniform numbers in the same order but rotated Connie Hawkins up to the top of the list. This is the same question as before, but now the correct answer is ABCD and my incorrect answer would be BADC. The Damerau-Levenshtein distance between these strings is just 2: transpose the first two characters, then the last two characters. Also, the Levenshtein distance would be 3:

BADC→ADC→ABC→ABCD

We’ve reduced both of these edit distances without really changing the question. So these measures aren’t a good way to see how bad my answer was.

Both Mathematica and Python have functions for calculating these distances. Mathematica has the built-in functions HammingDistance, EditDistance (for Levenshtein), and DamerauLevenshteinDistance. There are a few ways to import similar functions for Python. I used the textdistance module, but jellyfish is apparently quite a bit faster.

There are 24 (4!) different ways to answer the trivia question:

ABCD   BACD   CABD   DABC
ABDC   BADC   CADB   DACB
ACBD   BCAD   CBAD   DBAC
ACDB   BCDA   CBDA   DBCA
ADBC   BDAC   CDAB   DCAB
ADCB   BDCA   CDBA   DCBA

These are the permutations of the correct answer BCDA.¹ Of these, 9 get every player wrong:

CBAD   DBAC   ABCD
CDAB   DABC   ADBC
CABD   DACB   ADCB

My wrong answer is the one in the lower left corner. These are known as the derangements of the correct answer, so called because they’re the permutations of BCDA with none of the letters in their original (correct) position. I wrote a post on calculating the number of derangements back in 2020.

Since my answer can be made correct by flipping two pairs of letters, I wondered which of the other derangements can be similarly transformed. It’s not hard to go through the 9 strings above and work it out by hand, but I wanted to write a little script to do it. Here it is:

python:
 1:  #!/usr/bin/env python3
 2:  
 3:  from itertools import permutations
 4:  
 5:  def hasmatches(s1, s2):
 6:    return sum(x == y for x, y in zip(s1, s2)) > 0
 7:  
 8:  def twoflips(s1, s2):
 9:    if hasmatches(s1, s2):
10:      return False
11:    else:
12:      count = 0
13:      for i1, c1 in enumerate(s1):
14:        for i2, c2 in enumerate(s2):
15:          if c1 == c2 and s1[i2] == s2[i1]:
16:            count += 1
17:      return count == 4
18:  
19:  correct = 'BCDA'
20:  perms = [ ''.join(x) for x in permutations(correct) ]
21:  allbad = [ x for x in perms if not hasmatches(correct, x) ]
22:  flippers = [ x for x in allbad if twoflips(x, correct) ]
23:  
24:  print(flippers)

The hasmatches function uses a generator to loop through a pair of strings, summing up how many of their characters match by position. If the sum is greater than zero, it returns True.

The twoflips function is more complicated, but it does a similar thing. It loops through both strings and checks whether the ith character of one string matches the jth character of the other and vice versa. As it proceeds, it counts all of these occurrences. Because I didn’t try to be clever in setting up the nested loops, the count variable is incremented twice for every pair of characters that can be flipped. That’s why twoflips returns True when count equals 4.

The rest of the program is more obvious. perms gets the list of all 24 permutations; allbad filters them down to just the 9 that have no matches; and finally, flippers filters allbad down to the 3 permutations that can be turned into the correct answer by performing two letter exchanges. The result is

CBAD   DABC   ADCB

which includes my answer.

Looking through the remaining derangements, we see

CDAB and ABCD, which are one-character rotations of the correct answer; and
DBAC, ADBC, and CABD, whose connection to the correct answer isn’t clear to me.

I’m glad this question had only four players. There’d be 44 derangements in a 5-player question.

Or of ABCD. Or ADCB. Permutations cover all the ways of arranging the four characters. I think it’s easiest to understand the arrangements when they’re laid out like this. ↩

And now it’s all this

I just said what I said and it was wrong
Or was taken wrong

Basketball trivia, editing distances, and derangement

Site search

Meta

Recent posts

Credits

1. Tom Chambers	A. 42
2. Steve Nash	B. 24
3. Dan Majerle	C. 13
4. Connie Hawkins	D. 9

1. Connie Hawkins	A. 42
2. Tom Chambers	B. 24
3. Steve Nash	C. 13
4. Dan Majerle	D. 9

And now it’s all this

I just said what I said and it was wrong Or was taken wrong

Basketball trivia, editing distances, and derangement

Site search

Meta

Recent posts

Credits

I just said what I said and it was wrong
Or was taken wrong