Basketball trivia, editing distances, and derangement
October 8, 2025 at 2:40 PM by Dr. Drang
My older son and I have been playing a daily basketball trivia game. Here’s the question from yesterday with my answers handwritten on the sheet.
Here’s the text of the question. This is easier than trying to put it in the image’s alt text.
Match the Suns great with his retired uniform number.
1. Tom Chambers A. 42 2. Steve Nash B. 24 3. Dan Majerle C. 13 4. Connie Hawkins D. 9
The correct answer, in player order, is BCDA
, not my answer of ADCB
. I won’t blame anyone who scoffs at my failure to get Nash’s number right. My excuse is that I thought it was so obvious for a smart playmaking guard to wear #13 that it had to be wrong. So I chose Majerle for #13 instead. This sort of mistake is often called “outsmarting” oneself, but I think it’s more “outstupiding.”
Anyway, I got every number wrong, but I noticed that two switches, Majerle/Nash and Chambers/Hawkins would’ve made my answer correct. I knew there were various methods for measuring edit distances of text strings and wondered if any of them would provide guidance on how “close” my answer was.
The simplest is the Hamming distance, which is just a count of how many characters are wrong. It’s limited to comparing strings of equal length, which is fine for this situation. Since every character in my answer is wrong, its Hamming distance from the correct answer is 4.
I’ve seen lots of references to the Levenshtein distance but have never used it or read up on it before now. It’s a count of the number of single-character insertions, deletions, and substitutions necessary to go from one string to another. The Levenshtein distance from my answer to the correct one is also 4. Since switching letters isn’t a valid edit in a Levenshtein calculation, you can’t go from ACDB
to BDCA
in fewer steps than 4 substitutions.
There’s also a Damerau-Levenshtein distance, which extends the Levenshtein idea by allowing the transposition of adjacent characters as one of the editing steps. The Damerau-Levenshtein distance from my answer to the correct one is 3; substitution of the first character, substitution of the last character, and transposition of the two interior characters.
After doing these calculations, I started thinking about how these last two distances are better for evaluating actual words than strings like the trivia answer. For example, in the Damerau-Levenshtein calculation, we’re allowed to switch only adjacent characters, not characters that are two or more positions apart. This makes sense when dealing with words. We’ve all typed teh or hte, and some of us have text substitutions set up to transform them automatically into the. Also, there are words that don’t follow the “I before E” rule—Levenshtein, to pick a random example—which makes them easy to misspell by switching adjacent letters.
But the answer to this trivia question isn’t like a word. The question would be effectively the same if it were worded this way:
Match the Suns great with his retired uniform number.
1. Connie Hawkins A. 42 2. Tom Chambers B. 24 3. Steve Nash C. 13 4. Dan Majerle D. 9
I’ve kept the uniform numbers in the same order but rotated Connie Hawkins up to the top of the list. This is the same question as before, but now the correct answer is ABCD
and my incorrect answer would be BADC
. The Damerau-Levenshtein distance between these strings is just 2: transpose the first two characters, then the last two characters. Also, the Levenshtein distance would be 3:
BADC→ADC→ABC→ABCD
We’ve reduced both of these edit distances without really changing the question. So these measures aren’t a good way to see how bad my answer was.
Both Mathematica and Python have functions for calculating these distances. Mathematica has the built-in functions HammingDistance
, EditDistance
(for Levenshtein), and DamerauLevenshteinDistance
. There are a few ways to import functions for Python. I used the textdistance
module, but jellyfish
is apparently quite a bit faster.
There are 24 (4!) different ways to answer the trivia question:
ABCD BACD CABD DABC
ABDC BADC CADB DACB
ACBD BCAD CBAD DBAC
ACDB BCDA CBDA DBCA
ADBC BDAC CDAB DCAB
ADCB BDCA CDBA DCBA
These are the permutations of the correct answer BCDA
.1 Of these, 9 get every player wrong:
CBAD DBAC ABCD
CDAB DABC ADBC
CABD DACB ADCB
My wrong answer is the one in the lower left corner. These are known as the derangements of the correct answer, so called because they’re the permutations of BCDA
with none of the letters in their original (correct) position. I wrote a post on calculating the number of derangements back in 2020.
Since my answer can be made correct by flipping two pairs of letters, I wondered which of the other derangements can be similarly transformed. It’s not hard to go through the 9 strings above and work it out by hand, but I wanted to write a little script to do it. Here it is:
python:
1: #!/usr/bin/env python3
2:
3: from itertools import permutations
4:
5: def hasmatches(s1, s2):
6: return sum(x == y for x, y in zip(s1, s2)) > 0
7:
8: def twoflips(s1, s2):
9: if hasmatches(s1, s2):
10: return False
11: else:
12: count = 0
13: for i1, c1 in enumerate(s1):
14: for i2, c2 in enumerate(s2):
15: if c1 == c2 and s1[i2] == s2[i1]:
16: count += 1
17: return count == 4
18:
19: correct = 'BCDA'
20: perms = [ ''.join(x) for x in permutations(correct) ]
21: allbad = [ x for x in perms if not hasmatches(correct, x) ]
22: flippers = [ x for x in allbad if twoflips(x, correct) ]
23:
24: print(flippers)
The hasmatches
function uses a generator to loop through a pair of strings, summing up how many of their characters match by position. If the sum is greater than zero, it returns True
.
The twoflips
function is more complicated, but it does a similar thing. It loops through both strings and checks whether the ith character of one string matches the jth character of the other, and vice versa. As it proceeds, it counts all of these occurrences. Because I didn’t try to be clever in setting up the nested loops, the count
variable is incremented twice for every pair of characters that can be flipped. That’s why twoflips
returns True when count
equals 4.
The rest of the program is more obvious. perms
gets the list of all 24 permutations; allbad
filters them down to just the 9 that have no matches; and finally, flippers
filters allbad
down to the 3 permutations that can be turned into the correct answer by performing two letter exchanges. The result is
CBAD DABC ADCB
which includes my answer.
Looking through the remaining derangements, we see
ADCB
, which is the reverse of the correct answer.CDAB
andABCD
, which are one-character rotations of the correct answer,DBAC
,ADBC
, andCABD
, whose connection to the correct answer isn’t clear to me.
I’m glad this question had only four players. There’d be 44 derangements in a 5-player question.
-
Or of
ABCD
. OrADCB
. Permutations cover all the ways of arranging the four characters. I think it’s easiest to understand the arrangements when they’re laid out like this. ↩