I don’t have the sort of encyclopedic Mac knowledge the draft participants—Shelly Brisbin, Stephen Hackett, Dan Moren, Jason Snell, and the aforementioned Johns—have, but I do want to mention what my picks would be for two of the categories: first Mac and best/favorite Mac software.
My first Mac slips in between Siracusa’s original 128k Mac and Shelly’s Mac Plus.^{1} It was the 512k Mac, popularly known as the Fat Mac because it had so much memory. I bought it in early 1985, when I was in grad school, and wrote my thesis on it. I took advantage of a discount through the University of Illinois, and added on an external diskette drive, an ImageWriter printer, and the cool canvas carrying bag, in which I hauled the Mac between my apartment and office—just like in Apple’s marketing literature of the time.
I tend to think of the Fat Mac as the first truly useable Mac, mainly because it had enough RAM for apps to have some breathing room. Even better, Switcher, which allowed you to run multiple apps at once before MultiFinder, gave the Fat Mac a giant advantage over the original. I was teaching a course in the spring semester of ’85 and made up my tests on the Mac, making the illustrations in MacPaint and then switching over instantly to paste them into a MacWrite document.
As for my best/favorite Mac software, I’m going to do the traditional podcast draft thing and talk about how I was torn between two apps, thus giving me effectively two picks. But before that, I need to point out that the choices made by those on the podcast are all objectively better than mine. I’m deliberately choosing apps that are weird but were very useful to me in the early years. They also showed off the Mac’s user interface to great advantage.
The app I nearly chose was Macintosh Pascal, written by THINK Technologies (who went on to publish Lightspeed Pascal and Lightspeed C, which had the greatest programmer slogan ever: “Make mistakes faster”) and distributed by Apple.
This was an interpreted Pascal, so the code ran relatively slowly, but the lack of a compilation step made the edit/debug/edit/debug cycle go quickly. On the editing side, Macintosh Pascal was my introduction to syntax highlighting, which you can see (sort of) in the screenshot above. This was the Mac’s font styles used to great effect. Debugging was also done visually by dragging a stop sign to the line of code that needed a breakpoint. These are things we take for granted now, but they were a revelation to me in 1985.
So with my not-quite-choice out of the way, here’s my fave: Claris CAD.
Am I kidding? No. This was a fantastic program for the kind of drawing I was doing back in the early ’90s and am still doing today. It wasn’t drafting, per se, but it did involve the kinds of construction typically done on a drafting table: lines tangent to circles, circles tangent to lines, lines perpendicular to other lines, and so on. The cursor would snap to features of the drawing and show a preview of how the next item would be drawn. It nearly always did exactly what I wanted.
And this is not just hazy nostalgia; I thought Claris CAD was an amazing program at the time. It was simply more in tune with the drawing of manufactured objects than were (and are) apps more obviously based on Bezier curves.
These apps were not only great, they are perfect for a draft. No one would snipe them.
I also had a Plus, but it was my second Mac and was bought for me by my employer. ↩
It was made, as most of my graphs are, using Python and Matplotlib. Here’s the code that did it:
python:
1: #!/usr/bin/env python
2:
3: import matplotlib.pyplot as plt
4: from matplotlib.ticker import MultipleLocator, AutoMinorLocator
5: import numpy as np
6: import sys
7:
8: # Array of angles in degrees
9: theta = np.arange(0.5, 90.0, 0.5)
10:
11: # Arrays of friction coefficients
12: muFloor = 1/(2*np.tan(theta*np.pi/180))
13: muBoth = np.sqrt(1 + np.tan(theta*np.pi/180)**2) - np.tan(theta*np.pi/180)
14:
15: # Create the plot with a given size in inches
16: fig, ax = plt.subplots(figsize=(6, 4))
17:
18: # Add the lines
19: ax.plot(theta, muFloor, '-', color='#1b9e77', lw=2, label="Floor only")
20: ax.plot(theta, muBoth, '-', color='#d95f02', lw=2, label="Wall and floor")
21:
22: # Set the limits
23: plt.xlim(xmin=0, xmax=90)
24: plt.ylim(ymin=0, ymax=2)
25:
26: # Set the major and minor ticks and add a grid
27: ax.xaxis.set_major_locator(MultipleLocator(15))
28: ax.xaxis.set_minor_locator(AutoMinorLocator(3))
29: ax.yaxis.set_major_locator(MultipleLocator(.5))
30: ax.yaxis.set_minor_locator(AutoMinorLocator(2))
31: # ax.grid(linewidth=.5, axis='x', which='major', color='#dddddd', linestyle='-')
32: # ax.grid(linewidth=.5, axis='y', which='major', color='#dddddd', linestyle='-')
33:
34: # Title and axis labels
35: plt.title('Leaning ladder problem')
36: plt.xlabel('Ladder angle (degrees)')
37: plt.ylabel('Friction coefficient')
38:
39: # Make the border and tick marks 0.5 points wide
40: [ i.set_linewidth(0.5) for i in ax.spines.values() ]
41: ax.tick_params(which='both', width=.5)
42:
43: # Add the legend
44: ax.legend(loc=(.58, .62), frameon=False)
45:
46: # Save as SVG
47: plt.savefig('20240110-Friction comparison graph.svg', format='svg')
In broad outline, this is how nearly all of my Matplotlib graphs are made, because I have a Typinator snippet that inserts generic plot-making code that I modify to set the limits, tick marks, legend, etc. that are appropriate for the graph.
By the way, I don’t think I’ve mentioned here that I’ve switched to Typinator. Ergonis was having a sale at the tail end of 2022, and I decided to give it a go. I had been using Keyboard Maestro for my snippets for several years, but KM isn’t truly meant for text expansion, and I wanted to move to something more built-for-purpose. TextExpander would, I suppose, be the obvious choice, but I didn’t want another subscription. Typinator looked to have all the features I needed, and I haven’t regretted the choice in the year I’ve been using it.
Anyway, here’s the Typinator definition of my plotting snippet:
As you can see, the abbreviation is ;plot. As you can’t see, at least not fully, the expansion is this:
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator, AutoMinorLocator
# Create the plot with a given size in inches
fig, ax = plt.subplots(figsize=(6, 4))
# Add a line
ax.plot(x, y, '-', color='blue', lw=2, label='Item one')
# Set the limits
# plt.xlim(xmin=0, xmax=100)
# plt.ylim(ymin=0, ymax=50)
# Set the major and minor ticks and add a grid
# ax.xaxis.set_major_locator(MultipleLocator(20))
# ax.xaxis.set_minor_locator(AutoMinorLocator(2))
# ax.yaxis.set_major_locator(MultipleLocator(10))
# ax.yaxis.set_minor_locator(AutoMinorLocator(5))
# ax.grid(linewidth=.5, axis='x', which='both', color='#dddddd', linestyle='-')
# ax.grid(linewidth=.5, axis='y', which='major', color='#dddddd', linestyle='-')
# Title and axis labels
plt.title('{{?Plot title}}')
plt.xlabel('{{?X label}}')
plt.ylabel('{{?Y label}}')
# Make the border and tick marks 0.5 points wide
[ i.set_linewidth(0.5) for i in ax.spines.values() ]
ax.tick_params(which='both', width=.5)
# Add the legend
# ax.legend()
# Save as PDF
plt.savefig('{{?File name}}.pdf', format='pdf')
Typinator uses special syntax for placeholder text. In this case, the placeholders are
{{?Plot title}}
{{?X label}}
{{?Y label}}
{{?File name}}
I don’t have to remember this syntax. The {…} popup menu has a bunch of placeholder options and will insert the correct code when you define the placeholder.
When I type ;plot, this dialog box appears so I can enter the text for each placeholder:
Typinator remembers the last text I entered for each placeholder, which makes things go faster if I’m doing a series of similar graphs.
If you go through the snippet, you’ll see that many of the lines of Matplotlib code are commented out, in particular those that set the formatting of the graph. That’s because I usually like to see what Matplotlib comes up with by default. If I don’t like the defaults, I uncomment the appropriate lines and start tinkering. You’ll also notice that this produces PDFs by default; that’s because the plots I make for work have always been PDFs. As I expect to be writing very few work reports from now on, I’ll probably change the file format in the last line from PDF to SVG or PNG.
Rhett Allain, who’s the physics columnist for Wired and a professor of physics at Southeastern Louisiana University, solved a funny problem a couple of weeks ago involving a ladder leaning against a wall. You can see his solution on YouTube or Medium. I think of it as an oddball problem because it’s very different from the “ladder leaning against a wall” problems I’ve been seeing for over 40 years.
It started as a calculus problem in which you are to assume that Point A moves down the wall with a constant speed and then determine the speed of Point B. Prof. Allain thought (rightly) that Point A of a real ladder wouldn’t fall at a constant speed, so he decided to turn it into a physics problem with gravity acting to accelerate the ladder’s downward movement. His goal in this altered problem was to determine the point at which Point A loses contact with the wall. A key feature in Allain’s problem was zero friction between the wall and the ladder and the floor and the ladder. This certainly makes the solution easier.
Both of these problems are different from the ladder/wall problems that are commonly given to engineers in their introductory mechanics courses. Ladders and walls are typically found in the friction section of the study of statics. In these problems, the idea is to determine either

1. the minimum friction coefficient for which the ladder doesn’t slip at a given angle, or
2. the minimum angle at which the ladder doesn’t slip for a given friction coefficient.
The key word in these problems is doesn’t. These are statics problems, in which things don’t move, and we look for the conditions under which the ladder stays in place. Here are some images from example problems in textbooks I have:
From Seely & Ensign’s Analytical Mechanics for Engineers (Wiley, 1941):
From Den Hartog’s Mechanics (McGraw-Hill, 1948, but I have the Dover reprint):
From Synge & Griffith’s Principles of Mechanics (McGraw-Hill, 1949):
From McGill & King’s Engineering Mechanics: Statics (PWS-Kent, 1989):
They use different loading and different friction conditions, but basic idea is the same in all of them.
In addition to the laws of statics—Newton’s Second Law with no acceleration—these problems make use of Coulomb’s^{1} static friction relationship:
$$f \le \mu N$$where $f$ is the friction force, which runs parallel to the contact surface in the direction opposite that of impending motion; $N$ is the normal force; and $\mu$ is the friction coefficient. Let’s talk about each of these.
In common speech, “impending” means “about to happen,” but that isn’t quite what the authors of engineering textbooks mean when they use it in this context. These are, after all, statics problems, so there is no movement about to happen. The meaning here has been stretched to “what would happen if there were no friction.”
The “normal” in “normal force” means “perpendicular.” The normal force is perpendicular to the contact surface.
The friction coefficient, $\mu $, is a marvel of engineering simplification. The magnitude of the friction force depends on many things: which two materials are in contact, their surface roughness, any lubrication that might be present, even the temperature. But to get practical solutions, we simplify these many conditions down to a single number. Well, maybe not a single number, because when you look up the friction coefficient for a particular set of materials, you’ll often find a range of values. Still, boiling friction down to a number, even if we don’t know the number exactly, is a great way to think about friction and get reasonable answers.
In the simplest version of the ladder problem, we assume there is friction between the ladder and the floor (Point B), but no friction between the ladder and the wall (Point A). If you look carefully, you’ll see that this is the problem Synge & Griffith were exploring. Here’s a free-body diagram of the ladder and all the forces acting on it:
We’ve taken the mass center of the ladder to be at its geometric center.
The three equations of statics are
$$\sum F_x = N_A - f_B = 0$$ $$\sum F_y = N_B - mg = 0$$ $$\sum M_B = mg\left(\frac{L}{2}\cos\theta\right) - N_A (L\sin\theta) = 0$$Here, the x and y directions are horizontal and vertical as usual, and the positive direction for moments about Point B is counterclockwise.
Because we were smart in choosing Point B to take moments about, the second and third equations have only one unknown variable each and can be solved directly:
$${N}_{B}=\mathrm{mg}$$ $${N}_{A}=\frac{\mathrm{mg}}{2}\mathrm{cot}\theta $$We substitute the second of these solutions into the first statics equation to get
$${f}_{B}=\frac{\mathrm{mg}}{2}\mathrm{cot}\theta $$Here’s where the friction coefficient comes in. We know that ${f}_{B}\le \mu {N}_{B}$, so substituting in the solutions from statics, this inequality turns into
$${f}_{B}=\frac{\mathrm{mg}}{2}\mathrm{cot}\theta \le \mu {N}_{B}=\mu \mathrm{mg}$$or, after rearranging,
$$\mu \ge \frac{1}{2}\mathrm{cot}\theta $$So this gives us the minimum friction coefficient for the ladder to stay up at a given angle. A little algebra tells us the lowest angle at which the ladder will stay up for a given coefficient of friction:
$$\theta \ge {\mathrm{tan}}^{-1}\frac{1}{2\mu}$$If you’re worried about trig functions changing signs and messing up these inequalities, recall that the ladder angle is always between 0° and 90°, so there are no sign changes for tangent or cotangent.
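These two results are easy to check numerically. Here’s a quick Python sketch of the floor-only formulas (the function names are mine, just for illustration):

```python
import math

def min_mu_floor(theta_deg):
    """Minimum friction coefficient for a ladder at the given angle:
    mu >= (1/2) cot(theta)."""
    return 1 / (2 * math.tan(math.radians(theta_deg)))

def min_angle_floor(mu):
    """Lowest angle (in degrees) at which the ladder stays up:
    theta >= atan(1/(2 mu))."""
    return math.degrees(math.atan(1 / (2 * mu)))

# A ladder at 60 degrees needs mu of at least (1/2) cot 60 ≈ 0.289
print(round(min_mu_floor(60), 3))      # 0.289
# With mu = 0.5, the ladder stays up at 45 degrees or more
print(round(min_angle_floor(0.5), 1))  # 45.0
```

The two functions are inverses of each other, which is just the algebraic rearrangement above.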
Now let’s add friction between the ladder and the wall. To keep the number of variables to a minimum, we’ll assume the ladder/wall friction coefficient is the same as the ladder/floor friction coefficient.
Here’s the free-body diagram:
The addition of ${f}_{A}$ to the system means we no longer have a statically determinate system. Now we have four unknowns to go with the three equations of statics:
$$\sum F_y = f_A + N_B - mg = 0$$ $$\sum F_x = N_A - f_B = 0$$ $$\sum M_B = mg\left(\frac{L}{2}\cos\theta\right) - N_A (L\sin\theta) - f_A (L\cos\theta) = 0$$And we have two inequalities of Coulomb friction:
$$f_A \le \mu N_A\,, \quad f_B \le \mu N_B$$We can combine these five relationships, and if we’re very careful in keeping the inequalities pointed in the right directions, we’ll come up with an inequality that relates the friction coefficient to the angle of the ladder. But most engineers wouldn’t solve the problem that way. Instead, they’d use the following physical reasoning to simplify the algebra:
1. If the ladder slips, it must slip at the wall and the floor at the same time.
2. When slip is impending, both friction forces are at their maximum values, so the inequalities become equalities.
3. Solving with these maximum values gives the minimum friction coefficient (for a given angle) or the minimum angle (for a given friction coefficient) at which the ladder stays up.

Using these three bits of reasoning, we can say
$$f_A = \mu N_A\,, \quad f_B = \mu N_B$$and combine these equations with the statics equations as follows:
The second equation of statics tells us that
$${f}_{B}={N}_{A}$$so
$${N}_{A}=\mu {N}_{B}$$The first equation of statics tells us
$${f}_{A}=\mathrm{mg}-{N}_{B}$$so
$$\mathrm{mg}-{N}_{B}=\mu {N}_{A}={\mu}^{2}{N}_{B}$$Therefore,
$${N}_{B}=\frac{\mathrm{mg}}{1+{\mu}^{2}}$$ $${N}_{A}=\frac{\mu \mathrm{mg}}{1+{\mu}^{2}}$$and
$${f}_{A}=\frac{{\mu}^{2}\mathrm{mg}}{1+{\mu}^{2}}$$Plugging the expressions for ${N}_{A}$ and ${f}_{A}$ into the third equation of statics gives
$$\mathrm{mg}\left(\frac{L}{2}\mathrm{cos}\theta \right)-\frac{\mu \mathrm{mg}}{1+{\mu}^{2}}(L\mathrm{sin}\theta )-\frac{{\mu}^{2}\mathrm{mg}}{1+{\mu}^{2}}(L\mathrm{cos}\theta )=0$$which can be rearranged to
$$\left(\frac{1}{2}-\frac{{\mu}^{2}}{1+{\mu}^{2}}\right)\mathrm{cos}\theta =\frac{\mu}{1+{\mu}^{2}}\,\mathrm{sin}\theta $$This can be simplified in the following steps,
$$\frac{1+{\mu}^{2}-2{\mu}^{2}}{2(1+{\mu}^{2})}\mathrm{cos}\theta =\frac{\mu}{1+{\mu}^{2}}\,\mathrm{sin}\theta $$ $$\frac{1-{\mu}^{2}}{2}\mathrm{cos}\theta =\mu \mathrm{sin}\theta $$to
$$\mathrm{tan}\theta =\frac{1-{\mu}^{2}}{2\mu}$$So if we’re given $\mu $, the minimum angle for which the ladder won’t slip is
$$\theta ={\mathrm{tan}}^{-1}\left(\frac{1-{\mu}^{2}}{2\mu}\right)$$If $\mu >1$, this gives us a negative $\theta $, which is outside the range of ladder angles (0° to 90°) for which the statics equations were written. So a friction coefficient greater than one means the ladder will stay up at any angle.
If we’re given the ladder angle, the minimum friction coefficient to prevent slipping is
$$\mu =\sqrt{1+{\mathrm{tan}}^{2}\theta}-\mathrm{tan}\theta $$This is one of the two solutions to a quadratic equation. The other gives a negative friction coefficient, which we don’t care about. As the ladder angle approaches 0°, the minimum friction coefficient approaches one. This is consistent with the finding above that a friction coefficient above one means the ladder will stay up at any angle.
Note that our claims that these are the minimum angle and friction coefficient don’t come out of the algebra; they come out of Item 3 of our physical reasoning.
By the way, just because we went through this solution using physical reasoning to simplify the math, that doesn’t mean we couldn’t get the same result using the inequalities and carrying out the algebra. I have a few pages in my notebook in which I did that (including a couple of mistakes along the way) just to prove to myself that I could.
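Along the same lines, a few lines of Python (again, just an illustrative sketch, separate from the plotting code) confirm that this formula really is a root of the quadratic it came from and that it goes to one at low angles:

```python
import math

def min_mu_both(theta_deg):
    """Minimum friction coefficient with both wall and floor friction:
    mu = sqrt(1 + tan^2(theta)) - tan(theta)."""
    t = math.tan(math.radians(theta_deg))
    return math.sqrt(1 + t**2) - t

# The formula came from tan(theta) = (1 - mu^2)/(2 mu), i.e. the
# quadratic mu^2 + 2 mu tan(theta) - 1 = 0. Check that it's a root:
theta = 35
mu = min_mu_both(theta)
t = math.tan(math.radians(theta))
print(abs(mu**2 + 2*mu*t - 1) < 1e-12)  # True

# As theta approaches 0, the minimum mu approaches one
print(round(min_mu_both(0.001), 4))     # 1.0
```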
One last thing we can do is plot the results of the two problems and see how they compare. Here are the minimum friction coefficients necessary to keep the ladder from slipping over the full range of angles.
There’s not much difference between the two problems when the ladder angle is high. That’s because the ladder isn’t pressing hard against the wall at those angles, so most of the work of keeping the ladder up in the second problem is being done by the floor/ladder friction. On the other hand, at low ladder angles, the difference between the two problems is huge; the wall/ladder friction in the second problem is doing a lot of work.
Simple problems like this aren’t especially important in the everyday working life of an engineer (unless you work for Werner), but the principles they teach are applicable across many disciplines. That’s why they appear in so many textbooks.
This is Charles-Augustin Coulomb, the guy who’s most famous for his law of electrical charges. In accordance with Stigler’s Law of Eponymy, the friction relationship that bears his name was written about by others before him. ↩
So you don’t have to go through the terrible burden of reading an earlier blog post, here’s the source code of the command I use from the Terminal (or iTerm) to open man pages in a new BBEdit window:
bash:
1:  #!/bin/bash
2:  
3:  # Interpret the arguments as command name and section. As with `man`,
4:  # the section is optional and comes first if present.
5:  if [[ $# -lt 2 ]]; then
6:      cmd=${1}
7:      sec=''
8:  else
9:      cmd=${2}
10:     sec=${1}
11: fi
12: 
13: # Get the formatted man page, filter out backspaces and convert tabs
14: # to spaces, and open the text in a new BBEdit document. Set the title
15: # of the window and scroll to the top.
16: man $sec $cmd | col -bx | bbedit --view-top --clean -t $cmd
This is slightly different from the code I originally posted. It incorporates options suggested by readers to the bbedit command in the pipeline of Line 16:

- --view-top puts the cursor and the scroll position at the top of the document.
- --clean sets the state of the document to unmodified so you can close it without getting the “do you want to save this?” warning.
- -t $cmd sets the title of the document to the command name.

I’ve also changed the name of the command from bman to bbman to better fit the naming pattern set by bbedit, bbfind, and bbdiff. So if I type
bbman ls
at the command line, a new BBEdit window will open^{1} with the text of the ls man page. The bold characters that I’d see if I ran man ls don’t appear in the BBEdit window, but I’ve never gotten any value out of that limited sort of text formatting, so I don’t miss it.
Although most of what I do in a man page is search and read, sometimes I like to use the hints within the text to open a new, related man page. So I wrote an AppleScript (BBEdit has great AppleScript support) that uses the cursor position or the selected text to open a new page. Here’s how it works:
Say I’m in the SEE ALSO section of the ls man page, and I want to open the chmod man page that’s referred to there. I can either double-click to select chmod or just single-click to put the cursor within the word. I then run the Man Page script, and a new BBEdit window opens with the chmod man page.
Here’s the Man Page AppleScript:

applescript:
1:  use AppleScript version "2.4" -- Yosemite (10.10) or later
2:  use scripting additions
3:  
4:  -- This script is expected to be run either with the command name selected (as if
5:  -- by double-clicking) or with the cursor within the command name. The section
6:  -- (in parentheses) may be immediately after the command name.
7:  
8:  -- Function for getting the man page section from between parentheses.
9:  -- Input is the character position of the opening parenthesis (if present).
10: -- Returns the section or an empty string.
11: on getSection(parenPos)
12:     tell application "BBEdit"
13:         if (character parenPos of front document as text) is "(" then
14:             set secStart to parenPos + 1
15:             set secEnd to find ")" searching in front document
16:             set secEnd to (characterOffset of secEnd's found object) - 1
17:             return characters secStart through secEnd of front document as text
18:         else
19:             return ""
20:         end if
21:     end tell
22: end getSection
23: 
24: -- Start by selecting the word the cursor is in (via ⌥←, ⌥⇧→) if there isn't already a selection.
25: tell application "BBEdit"
26:     if length of selection is 0 then
27:         tell application "System Events"
28:             key code 123 using option down
29:             delay 0.125
30:             key code 124 using {option down, shift down}
31:             delay 0.125
32:         end tell
33:     end if
34: end tell
35: 
36: -- Set the command name according to the selection and the section according to
37: -- whatever may be in parentheses immediately after the command name. This is
38: -- in a new tell block to ensure that the selection has been updated by the
39: -- previous tell block.
40: tell application "BBEdit"
41:     set cmdName to selection as text
42:     set parenPos to (characterOffset of selection) + (length of selection)
43:     set manSection to my getSection(parenPos)
44: end tell
45: 
46: -- Get the man page and pipe it through col to delete backspaces and expand tabs. Then
47: -- pipe that to bbedit with appropriate options. The --clean option means the new
48: -- document is treated as unmodified so it can be closed without a confirmation dialog.
49: set manCmd to "man " & manSection & " " & cmdName & " | col -bx "
50: set manCmd to manCmd & "| /usr/local/bin/bbedit --view-top --clean -t " & cmdName
51: do shell script manCmd
I think it’s pretty well commented. You can see that Lines 48–51 invoke the same shell pipeline used in the bbman
script. The main features that make this different from the shell script are:
- The word-selection code in Lines 25–34, which uses System Events keystrokes to select the word the cursor is in when there’s no selection.
- The getSection function in Lines 11–22, which uses the character position of the end of the command name to start its search.

Over the years, I’ve seen lots of ways to turn man pages into HTML, which would make the linking more natural. But they’ve all seemed more trouble than they’re worth. The trick of passing the man output through col -bx to turn it into plain text is something I’ve always come back to, whether my preferred text editor has been BBEdit, TextMate, or (on Linux) NEdit.
On my computer, BBEdit is always running. ↩
I used the Knowledgebase last year in my post about orbital curvature. Things like Entity["Planet", "Earth"]["AverageOrbitDistance"] and Entity["Planet", "Earth"]["Mass"] pulled information out of the Knowledgebase so I didn’t have to look it up outside of Mathematica and paste it into my code. Very convenient.
But I learned a few months ago that another part of the Knowledgebase was missing data, which could get in the way of other types of calculation. I was testing out the kind of state- and county-level information I could access, and my initial explorations focused on where I live: DuPage County, Illinois.
To be sure, the Knowledgebase has lots of info on DuPage County. It knows, for example, the area, the population, the per capita income, and the number of annual births and deaths. But it doesn’t know the county seat, which I would think is easier to determine and enter into the Knowledgebase than most of the other stuff—not to mention more stable than transient figures like population and income.
Broadening my exploration to all the counties in Illinois, I learned that of our 102 counties, Wolfram knew the capitals of all of them except DuPage and DeKalb counties. So this command
AdministrativeDivisionData[
Entity["AdministrativeDivision", {"CookCounty", "Illinois",
"UnitedStates"}], "CapitalName"]
returns
Chicago
as expected, while both of these commands,
AdministrativeDivisionData[
Entity["AdministrativeDivision", {"DuPageCounty", "Illinois",
"UnitedStates"}], "CapitalName"]
and
AdministrativeDivisionData[
Entity["AdministrativeDivision", {"DeKalbCounty", "Illinois",
"UnitedStates"}], "CapitalName"]
return
Missing["NotAvailable"]
This is not exactly obscure information, and there are reliable sources from which to get it. Here, for example, is a map from the Illinois Blue Book, an official publication of the state.
As you (and the folks at Wolfram) can see, the DuPage and DeKalb county seats are Wheaton and Sycamore, respectively.
I sent an email to Wolfram about the missing county seat data and got a boilerplate reply saying their development team would review it. That was in August; the Knowledgebase still returns Missing["NotAvailable"].
Recently, I decided to look for missing county seats in every state. Here are all the counties—or administrative divisions that Wolfram treats like counties—that are missing their capitals in the Knowledgebase:
County w/o seat | State |
---|---|
DeKalb County | Alabama |
Aleutians West | Alaska |
Bethel | Alaska |
Chugach | Alaska |
Copper River | Alaska |
Dillingham | Alaska |
Hoonah-Angoon | Alaska |
Nome | Alaska |
Prince of Wales-Hyder | Alaska |
Petersburg | Alaska |
Skagway | Alaska |
Southeast Fairbanks | Alaska |
Kusilvak Census Area | Alaska |
Wrangell | Alaska |
Yukon-Koyukuk | Alaska |
Mono County | California |
Sierra County | California |
Conejos County | Colorado |
Wakulla County | Florida |
Columbia County | Georgia |
Crawford County | Georgia |
DeKalb County | Georgia |
Echols County | Georgia |
Kalawao County | Hawaii |
Owyhee County | Idaho |
DeKalb County | Illinois |
DuPage County | Illinois |
DeKalb County | Indiana |
LaPorte County | Indiana |
Plaquemines Parish | Louisiana |
St. James Parish | Louisiana |
Keweenaw County | Michigan |
Lake of the Woods County | Minnesota |
DeSoto County | Mississippi |
Franklin County | Mississippi |
DeKalb County | Missouri |
McPherson County | Nebraska |
Esmeralda County | Nevada |
Eureka County | Nevada |
Lincoln County | Nevada |
Storey County | Nevada |
Burlington County | New Jersey |
Mora County | New Mexico |
Rio Arriba County | New Mexico |
Bronx County (The Bronx) | New York |
Broome County | New York |
Kings County (Brooklyn) | New York |
New York County (Manhattan) | New York |
Queens County (Queens) | New York |
Richmond County (Staten Island) | New York |
Camden County | North Carolina |
Currituck County | North Carolina |
Hyde County | North Carolina |
Dunn County | North Dakota |
Bristol County | Rhode Island |
Kent County | Rhode Island |
Buffalo County | South Dakota |
DeKalb County | Tennessee |
Borden County | Texas |
Glasscock County | Texas |
Kenedy County | Texas |
King County | Texas |
Loving County | Texas |
McMullen County | Texas |
Montague County | Texas |
Palo Pinto County | Texas |
Young County | Texas |
Rich County | Utah |
Alexandria (independent city) | Virginia |
Amelia County | Virginia |
Bath County | Virginia |
Bland County | Virginia |
Bristol (independent city) | Virginia |
Buckingham County | Virginia |
Buena Vista (independent city) | Virginia |
Charles City County | Virginia |
Charlottesville (independent city) | Virginia |
Chesapeake (independent city) | Virginia |
Colonial Heights (independent city) | Virginia |
Covington (independent city) | Virginia |
Cumberland County | Virginia |
Danville (independent city) | Virginia |
Dinwiddie County | Virginia |
Emporia (independent city) | Virginia |
Fairfax (independent city) | Virginia |
Falls Church (independent city) | Virginia |
Fluvanna County | Virginia |
Franklin (independent city) | Virginia |
Fredericksburg (independent city) | Virginia |
Galax (independent city) | Virginia |
Goochland County | Virginia |
Hampton (independent city) | Virginia |
Hanover County | Virginia |
Harrisonburg (independent city) | Virginia |
Hopewell (independent city) | Virginia |
Isle of Wight County | Virginia |
King and Queen County | Virginia |
King George County | Virginia |
King William County | Virginia |
Lancaster County | Virginia |
Lexington (independent city) | Virginia |
Lunenburg County | Virginia |
Lynchburg (independent city) | Virginia |
Manassas (independent city) | Virginia |
Manassas Park (independent city) | Virginia |
Martinsville (independent city) | Virginia |
Mathews County | Virginia |
Middlesex County | Virginia |
Nelson County | Virginia |
New Kent County | Virginia |
Newport News (independent city) | Virginia |
Norfolk (independent city) | Virginia |
Northumberland County | Virginia |
Norton (independent city) | Virginia |
Nottoway County | Virginia |
Petersburg (independent city) | Virginia |
Poquoson (independent city) | Virginia |
Portsmouth (independent city) | Virginia |
Powhatan County | Virginia |
Prince George County | Virginia |
Radford (independent city) | Virginia |
Richmond County | Virginia |
Roanoke County | Virginia |
Salem (independent city) | Virginia |
Stafford County | Virginia |
Staunton (independent city) | Virginia |
Suffolk (independent city) | Virginia |
Sussex County | Virginia |
Virginia Beach (independent city) | Virginia |
Waynesboro (independent city) | Virginia |
Williamsburg (independent city) | Virginia |
Winchester (independent city) | Virginia |
Quite a list. Now there are legitimate (or at least arguable) reasons some of these counties are missing their county seat:

- The Alaska census areas are parts of the state’s Unorganized Borough,^{1} which has no borough government of its own.
- Virginia’s independent cities aren’t parts of any county,^{2} so they have no county seats.
- The five boroughs of New York City are governed by the city as a whole.
But most of the counties with missing county seats are like DuPage and DeKalb counties—regular counties with regular county seats that are just not included in the Knowledgebase, despite them being easy to look up and verify. There are fewer than 100 of them. I don’t know why they’re missing, but filling in missing values like this is a pretty standard data cleaning operation. And as I said earlier, this is pretty much a one-time operation; counties just don’t change seats very often.
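Here’s a toy illustration in Python of how routine that fill is (the miniature table is made up, but Wheaton and Sycamore are the real seats):

```python
# A miniature version of the county table with two missing seats
counties = [
    {"county": "Cook", "state": "Illinois", "seat": "Chicago"},
    {"county": "DuPage", "state": "Illinois", "seat": None},
    {"county": "DeKalb", "state": "Illinois", "seat": None},
]

# Fill the gaps from a lookup table of known county seats
known_seats = {"DuPage": "Wheaton", "DeKalb": "Sycamore"}
for row in counties:
    if row["seat"] is None:
        row["seat"] = known_seats.get(row["county"])

print([row["seat"] for row in counties])  # ['Chicago', 'Wheaton', 'Sycamore']
```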
I haven’t sent this list off to Wolfram. If the people on its development team can’t be bothered to clean the data in their own home state, how likely is it that they’ll fill in all the other states’ data? But they should.
Update 4 Dec 2023 11:45 AM
Chon Torres on Mastodon informed me that the two California counties, Mono and Sierra, do have county seats, but they’re unincorporated, and that might explain why they’re missing from the Knowledgebase. That’s a good explanation, but I would argue with Wolfram that it’s a poor reason for excluding a capital. A county seat should have county government offices—Chon mentioned that he’s been at the Sierra County Courthouse in Downieville—but I don’t see why it needs a municipal government.
Adding to the madness of Virginia government, Sam Davies told me that an independent city can also be the county seat of a county that it’s been carved out of. The example he gave was Charlottesville, which is both an independent city and the capital of Albemarle County. To me, a more disturbing example is Fairfax, which is an independent city but also the county seat of—yes, that’s right—Fairfax County.
Thanks to Chon and Sam for the local government expertise.
Which is apparently not actually a borough itself, despite its name. This is more than I wanted to know about Alaska’s government. ↩
As with the Unorganized Borough in Alaska, this is more than I wanted to know about Virginia’s government. ↩
The numbers in today’s date—11, 28, and 23—make up the sides of a triangle. This doesn’t always happen; the two smaller numbers have to add up to more than the larger number.
He went on to figure out the angles of the plane triangle with side lengths of 11, 28, and 23 and then extended his analysis to triangles on a sphere and a pseudosphere. But I got hung up on the quoted paragraph. Which days can and can’t be the sides of a triangle? And how does the number of such “triangle days” change from year to year?
So I wrote a little Python to answer these questions.
python:
 1: from datetime import date, timedelta
 2: 
 3: def isTriangleDay(dt):
 4:     "Can the year, month, and day of the given date be the sides of a triangle?"
 5:     y = dt.year % 100
 6:     m = dt.month
 7:     d = dt.day
 8:     sides = sorted(int(x) for x in (y, m, d))
 9:     return sides[0] + sides[1] > sides[2]
10: 
11: def allDays(y):
12:     "Return a list of all days in the given year."
13:     start = date(y, 1, 1)
14:     end = date(y, 12, 31)
15:     numDays = (end - start).days + 1
16:     return [ start + timedelta(days=n) for n in range(numDays) ]
17: 
18: def triangleDays(y):
19:     "Return a list of all the triangle days in the given year."
20:     return [x for x in allDays(y) if isTriangleDay(x) ]
isTriangleDay is a Boolean function that implements the test Cook described for a datetime.date object. Note that Line 5 extracts just the last two digits of the year, which is what Cook intends. You could, I suppose, change Line 9 to
python:
 9:     return sides[0] + sides[1] >= sides[2]
if you want to accept degenerate triangles, where the three sides collapse onto a single line. I don’t.
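If you're curious how much difference that makes, here's a quick count (a sketch, independent of the functions above) of the 2023 dates where the two smaller numbers sum to exactly the largest—the dates that >= would admit but > rejects:

```python
from datetime import date, timedelta

# Count the degenerate-triangle days in 2023: dates where the two
# smaller of (year % 100, month, day) sum to exactly the largest.
d = date(2023, 1, 1)
count = 0
while d.year == 2023:
    s = sorted((d.year % 100, d.month, d.day))
    if s[0] + s[1] == s[2]:
        count += 1
    d += timedelta(days=1)
print(count)  # 20
```

Twenty days: the twelve where month + day = 23 and the eight where day = month + 23.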
The allDays function uses a list comprehension to return a list of all the days in a given year, and triangleDays calls isTriangleDay to filter the results of allDays down to just triangle days. I think both of these functions are self-explanatory.
With these functions defined, I got all the triangle days for 2023 via
python:
print('\n'.join(x.strftime('%Y-%m-%d') for x in triangleDays(2023)))
which returned this list of dates (after reshaping into four columns):
2023-01-23 2023-06-27 2023-09-19 2023-11-17
2023-02-22 2023-06-28 2023-09-20 2023-11-18
2023-02-23 2023-07-17 2023-09-21 2023-11-19
2023-02-24 2023-07-18 2023-09-22 2023-11-20
2023-03-21 2023-07-19 2023-09-23 2023-11-21
2023-03-22 2023-07-20 2023-09-24 2023-11-22
2023-03-23 2023-07-21 2023-09-25 2023-11-23
2023-03-24 2023-07-22 2023-09-26 2023-11-24
2023-03-25 2023-07-23 2023-09-27 2023-11-25
2023-04-20 2023-07-24 2023-09-28 2023-11-26
2023-04-21 2023-07-25 2023-09-29 2023-11-27
2023-04-22 2023-07-26 2023-09-30 2023-11-28
2023-04-23 2023-07-27 2023-10-14 2023-11-29
2023-04-24 2023-07-28 2023-10-15 2023-11-30
2023-04-25 2023-07-29 2023-10-16 2023-12-12
2023-04-26 2023-08-16 2023-10-17 2023-12-13
2023-05-19 2023-08-17 2023-10-18 2023-12-14
2023-05-20 2023-08-18 2023-10-19 2023-12-15
2023-05-21 2023-08-19 2023-10-20 2023-12-16
2023-05-22 2023-08-20 2023-10-21 2023-12-17
2023-05-23 2023-08-21 2023-10-22 2023-12-18
2023-05-24 2023-08-22 2023-10-23 2023-12-19
2023-05-25 2023-08-23 2023-10-24 2023-12-20
2023-05-26 2023-08-24 2023-10-25 2023-12-21
2023-05-27 2023-08-25 2023-10-26 2023-12-22
2023-06-18 2023-08-26 2023-10-27 2023-12-23
2023-06-19 2023-08-27 2023-10-28 2023-12-24
2023-06-20 2023-08-28 2023-10-29 2023-12-25
2023-06-21 2023-08-29 2023-10-30 2023-12-26
2023-06-22 2023-08-30 2023-10-31 2023-12-27
2023-06-23 2023-09-15 2023-11-13 2023-12-28
2023-06-24 2023-09-16 2023-11-14 2023-12-29
2023-06-25 2023-09-17 2023-11-15 2023-12-30
2023-06-26 2023-09-18 2023-11-16 2023-12-31
That’s 136 triangle days for this year. To see how this count changes from year to year, I ran
python:
for y in range(2000, 2051):
    print(f'{y} {len(triangleDays(y)):3d}')
which returned
2000 0
2001 12
2002 34
2003 54
2004 72
2005 88
2006 102
2007 114
2008 124
2009 132
2010 138
2011 142
2012 144
2013 144
2014 144
2015 144
2016 144
2017 144
2018 144
2019 144
2020 144
2021 142
2022 140
2023 136
2024 132
2025 127
2026 120
2027 113
2028 104
2029 93
2030 82
2031 72
2032 61
2033 51
2034 41
2035 33
2036 25
2037 19
2038 13
2039 8
2040 5
2041 2
2042 1
2043 0
2044 0
2045 0
2046 0
2047 0
2048 0
2049 0
2050 0
I knew there was no point in checking on years later in the century—it was obvious that every year after 2042 would have no triangle days, since the largest possible month-plus-day sum is 12 + 31 = 43, and that can never be greater than a two-digit year of 43 or more. As you can see, the 2010s were the peak decade for triangle days. We’re now in the early stages of a 20-year decline.
After doing this, I looked back at my code and decided that most serious Python programmers wouldn’t have done it the way I did. Instead of functions that returned lists, they would build allDays and triangleDays as iterators.^{1} Not because there’s any need to save space—the space used by 366 datetime.date objects is hardly even noticeable—but because that’s more the current style.
So to make myself feel more like a real Pythonista, I rewrote the code like this:
python:
 1: from datetime import date, timedelta
 2: 
 3: def isTriangleDay(dt):
 4:     "Can the year, month, and day of the given date be the sides of a triangle?"
 5:     y = dt.year % 100
 6:     m = dt.month
 7:     d = dt.day
 8:     sides = sorted(int(x) for x in (y, m, d))
 9:     return sides[0] + sides[1] > sides[2]
10: 
11: def allDays(y):
12:     "Iterator for all days in the given year."
13:     d = date(y, 1, 1)
14:     end = date(y, 12, 31)
15:     while d <= end:
16:         yield d
17:         d = d + timedelta(days=1)
18: 
19: def triangleDays(y):
20:     "Iterator for all the triangle days in the given year."
21:     return filter(isTriangleDay, allDays(y))
isTriangleDay is unchanged, but allDays now works its way through the days of the year with a while loop and the yield statement, and triangleDays uses the filter function to iterate through just the triangle days.
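Incidentally, the filter call is interchangeable with a generator expression—both produce a lazy iterator over just the matching items:

```python
# filter(pred, it) and (x for x in it if pred(x)) yield the same items lazily
def is_even(n):
    return n % 2 == 0

evens_f = filter(is_even, range(10))
evens_g = (n for n in range(10) if is_even(n))
print(list(evens_f))  # [0, 2, 4, 6, 8]
print(list(evens_g))  # [0, 2, 4, 6, 8]
```

Which spelling you prefer is mostly a matter of taste; filter reads well when the predicate already has a name.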
Using these functions is basically the same as using the list-based versions, except that you can’t pass an iterator to len. So determining the number of triangle days over a range of years can be done either by converting the iterator to a list before passing it to len,
python:
for y in range(2000, 2051):
    print(f'{y} {len(list(triangleDays(y))):3d}')
or by using the sum command with an argument that produces a one for each element of triangleDays,
python:
for y in range(2000, 2051):
    print(f'{y} {sum(1 for x in triangleDays(y))}')
The former sort of defeats the purpose of using an iterator, so I guess it’s better practice to use the latter, even though I find it weird looking.
It may well be that my perception of “real” Python programmers is wrong and they wouldn’t bother with yield and filter in such a piddly little problem as this. But at least I got some practice with them.
A confession: I find it hard to distinguish between the proper use of the terms generator and iterator. My sense is that generators provide a way of creating iterators. So once the function is written, do you have a generator, an iterator, or both? ↩
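For what it's worth, Python will answer that question directly: a function containing yield is a generator function, calling it returns a generator object, and every generator is also an iterator. A quick check:

```python
from collections.abc import Generator, Iterator

def countdown(n):
    "A generator function: calling it returns a generator object."
    while n > 0:
        yield n
        n -= 1

g = countdown(3)
print(isinstance(g, Generator), isinstance(g, Iterator))  # True True
print(list(g))  # [3, 2, 1]
```

So the answer to the footnote's question is "both": the function is a generator function, and what it returns is a generator, which is a kind of iterator.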
Last week, I mentioned that I needed a new starting guess for Wordle, and I wrote about how I used some simple command-line tools to see if the word I was considering was an appropriate choice. In a nutshell, I had been using IRATE as my first guess, but because it was the answer recently I wanted to try a new initial guess that might be the answer in the future. ALTER seemed like a good choice, but I wanted to make sure it hadn’t already been used. I had a list of all the answers in chronological order in a file named answers.txt, and the most recent answer was DWELT.
My solution involved three separate commands. All three involved grep and one included head in a pipeline. Reader Leon Cowle was unsatisfied with this and began casting about for a cleaner solution. Here’s the single command he came up with to replace the three I used:
egrep 'alter|dwelt' answers.txt
The output was
dwelt
alter
which told me that ALTER was an answer and that it would come after DWELT. This was everything I needed to know in one step. Beautiful!
You might argue that for a one-off like this, the solution that occurs to you first is the best because it takes the least amount of your time. Generally speaking, I agree with that, and I’m not unhappy with my three-command solution. But Leon has given me the best of both worlds. I got to have my own inefficient solution that I thought of quickly and I got to learn from his more elegant solution. Knowing which of two text strings appears first in a file is something I’m pretty sure I’ve had to do before and will have to do again. Now I have a simple and effective solution to pull out of my toolbox. Thanks, Leon!
As for Conlextions, I’ve been playing it for a few weeks and recommend it to anyone who likes the NY Times Connections game but is finding it a little too easy. I’ve been playing Connections since early summer and while it has gotten more difficult, the sense of accomplishment I get in solving it with no mistakes is wearing off. Conlextions is more diabolical and more satisfying when you solve it in four guesses.
I have only three gripes with Conlextions:
In protest of this prejudice against the excessively careful, I’ve recently taken to deleting the solve time from my posts on Mastodon. And to avoid the tedium of backspacing, I made this Shortcut that does the deleting for me:
It searches the clipboard for a linefeed, the word “Solve,” and all the text after that. It replaces that with the empty string and puts the updated text back onto the clipboard.
So now when I want to post my Conlextions result to Mastodon, I tap the “Share these results” button, press and hold the side button on my phone to activate Siri, and say “Delete Time.” The name of the Shortcut seems to be distinct enough that Siri hasn’t misinterpreted it yet.
I would not normally use Shortcuts for an automation like this. Keyboard Maestro would allow me to invoke it with a keystroke and could also do the pasting. But since I always play Conlextions on my phone, Shortcuts was the best option.
Before we go any further, I should mention, for those of you who don’t commit every utterance of mine to memory and are thinking “There was no IRATE last week,” that I don’t play the NY Times version of Wordle. Back in early 2022, right after the Times bought Wordle, I downloaded the original version and set it up on a server I control. That’s the game my family and I have been playing ever since. Overall, I’d say this was unnecessary. The Times hasn’t screwed up the game the way I thought it would,^{1} but it’s too late now for us to change.
Based on letter frequency tables, I figured ALTER would be a good new initial guess. But because I was so delighted when IRATE came up, I’d like to choose a word that
And I want to see if ALTER passes these two tests without spoiling myself with other answers, especially those that may be coming up soon.
I have the data to do this if I’m careful. In order to build my wordle script, I needed to pull out all the answers and acceptable guesses from the original Wordle JavaScript source code. I was able to do this nearly blind—I saw just the first and last couple of words in each list and have long since forgotten them—and save them to three files: guesses.txt, answers.txt, and wordle.txt, the last of which is a blending of the two others.
Some simple shell commands got me what I wanted. First, the simple part:
grep alter answers.txt
returned alter, which confirmed my belief that ALTER is an answer and didn’t reveal any other answers. Checking whether it had already been an answer was only a little trickier.
The answers.txt file has the answer words in chronological order, one per line. To figure out where I currently am in that list, I got the line number of yesterday’s word, DWELT.
grep -n dwelt answers.txt
The -n option causes the line number of the found text to be printed along with the text:
878:dwelt
With this information, I can now see if ALTER has already been the answer:
head -n 878 answers.txt | grep alter
The head command outputs just the first 878 lines of answers.txt and the grep command looks for alter in that text. It returned nothing, so ALTER has not been an answer so far. It passes my test and will be my first guess from now on.
What I like about this solution is that I was able to run my test quickly without seeing any other answer words and without tipping myself off to when ALTER will appear, which I could have easily done with
grep -n alter answers.txt
I recognize that some of you will read this and think I’ve crossed the line of Wordle ethics into outright cheating. You can keep your opinions to yourself.
Although I recall some outrage about a year ago when FEAST was the too-on-the-nose word for Thanksgiving Day. ↩
Earlier today, I talked about sticking with a problem longer than I probably should because I can’t stop. Let’s apply that pathology to the Taskmaster coin flip subtask and think about a slightly different problem.^{1} Suppose we flip a fair coin and stop when we’ve flipped either five consecutive heads or five consecutive tails. What is the expected number of flips?
As we did with the simpler game, we can imagine this as a board game to help us configure the Markov chain transition matrix. Here, we treat consecutive heads as positives and consecutive tails as negatives.
Update 11 Nov 2023 11:12 PM
I screwed this up the first time through and should’ve seen the error before I published. The transformation matrices are correct now and give a result that makes more sense. Sorry about that.
We start with our marker on Square 0 and start flipping. A head moves us one square to the right; a tail moves us one square to the left. If we’re on a positive square, a tail takes us to Square –1. If we’re on a negative square, a head takes us to Square 1. The game ends when we reach either Square 5 or Square –5.
Here’s the game’s transformation matrix, where we’ve set the squares at the two ends of the board to be absorbing states:
$$P=\left[\begin{array}{ccccccccccc}1& 0& 0& 0& 0& 0& 0& 0& 0& 0& 0\\ \frac{1}{2}& 0& 0& 0& 0& 0& \frac{1}{2}& 0& 0& 0& 0\\ 0& \frac{1}{2}& 0& 0& 0& 0& \frac{1}{2}& 0& 0& 0& 0\\ 0& 0& \frac{1}{2}& 0& 0& 0& \frac{1}{2}& 0& 0& 0& 0\\ 0& 0& 0& \frac{1}{2}& 0& 0& \frac{1}{2}& 0& 0& 0& 0\\ 0& 0& 0& 0& \frac{1}{2}& 0& \frac{1}{2}& 0& 0& 0& 0\\ 0& 0& 0& 0& \frac{1}{2}& 0& 0& \frac{1}{2}& 0& 0& 0\\ 0& 0& 0& 0& \frac{1}{2}& 0& 0& 0& \frac{1}{2}& 0& 0\\ 0& 0& 0& 0& \frac{1}{2}& 0& 0& 0& 0& \frac{1}{2}& 0\\ 0& 0& 0& 0& \frac{1}{2}& 0& 0& 0& 0& 0& \frac{1}{2}\\ 0& 0& 0& 0& 0& 0& 0& 0& 0& 0& 1\end{array}\right]$$The rows and columns represent the squares in numerical order from –5 to 5. The lower right half is similar to the transformation matrix we used in the earlier post, and the upper left half is sort of the upside-down mirror image of the lower right half. The main difference is that we never go to Square 0 after a flip; if the coin comes up opposite the run we were on, we go to the first square on the opposite side of Square 0.
As we’ve done in the past, we create a $Q$ transformation matrix by eliminating the rows and columns associated with the absorbing states from the $P$ matrix:
$$Q=\left[\begin{array}{ccccccccc}0& 0& 0& 0& 0& \frac{1}{2}& 0& 0& 0\\ \frac{1}{2}& 0& 0& 0& 0& \frac{1}{2}& 0& 0& 0\\ 0& \frac{1}{2}& 0& 0& 0& \frac{1}{2}& 0& 0& 0\\ 0& 0& \frac{1}{2}& 0& 0& \frac{1}{2}& 0& 0& 0\\ 0& 0& 0& \frac{1}{2}& 0& \frac{1}{2}& 0& 0& 0\\ 0& 0& 0& \frac{1}{2}& 0& 0& \frac{1}{2}& 0& 0\\ 0& 0& 0& \frac{1}{2}& 0& 0& 0& \frac{1}{2}& 0\\ 0& 0& 0& \frac{1}{2}& 0& 0& 0& 0& \frac{1}{2}\\ 0& 0& 0& \frac{1}{2}& 0& 0& 0& 0& 0\\ \end{array}\right]$$Then we proceed as before, forming the matrix equation
$$(I-Q)\phantom{\rule{thinmathspace}{0ex}}m=1$$where $m$ is the column vector of the expected number of flips to get to an absorbing state from Squares –4 through 4, and $1$ is a nine-element column vector of ones. Solving this^{2} we get
$$m=\left[\begin{array}{c}16\\ 24\\ 28\\ 30\\ 31\\ 30\\ 28\\ 24\\ 16\end{array}\right]$$So the expected number of flips to get either five consecutive heads or five consecutive tails is ${m}_{0}=31$, the value in the middle of this vector. This is half the value we got last time, which makes sense.
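As a sanity check, the same answer drops out of a few lines of plain Python. Rather than inverting $I - Q$ directly, this sketch solves $m = 1 + Q\,m$ by fixed-point iteration, which converges here because the chain is eventually absorbed:

```python
# Sanity check of the expected-flips vector by fixed-point iteration
# on m = 1 + Q m for the nine transient squares -4 through 4.
squares = list(range(-4, 5))
idx = {s: i for i, s in enumerate(squares)}

def successors(s):
    "The two squares reachable in one flip (head, tail)."
    head = s + 1 if s >= 0 else 1    # a head extends or starts a heads run
    tail = s - 1 if s <= 0 else -1   # a tail extends or starts a tails run
    return head, tail

m = [0.0] * len(squares)
for _ in range(1000):
    m = [1 + sum(0.5 * m[idx[t]] for t in successors(s) if -4 <= t <= 4)
         for s in squares]

print(round(m[idx[0]], 3))  # 31.0, matching the middle entry of the m vector
```

The full iterated vector matches the post's result, 16 at the ends rising to 31 at Square 0.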
Am I done with Markov chains now? I hope so, but you never know.
By the way, you can now see the whole episode on YouTube. ↩
Which I did with Mathematica, but you could do with any number of programs. Excel, for example. ↩
Earlier this week, Michael Glotzer on Mastodon reminded me of a little script I wrote about 5 years ago that made this sunrise/sunset plot for Chicago:
As I said at the time, there were several bespoke aspects to the script that built this plot. I did a lot of hand-editing of the US Naval Observatory data that was the script’s input. Also, the “Sunrise,” “Sunset,” and “Daylight” curve labels were set at positions and angles that were specific to the plot; they’d have to be redone if I wanted to make a plot for another city. To generalize the script to handle any set of USNO sunrise/sunset data, I’d have to alter the code to do the following:
My goal was to feed the data to my script through standard input. I’d choose the location and year on the USNO site, copy the text that appears on the followup page,
and then pipe the text on the clipboard to my script via pbpaste:
pbpaste | sunplot
The result would be a PNG file in the current working directory, named according to the location and year—in this case, Chicago, IL-2023.png.
As you can see, I’ve added the name and year as a title in the upper left corner and the latitude and longitude in the lower right corner. Also, the curves are labeled at their peaks or valleys, as appropriate. As before, the main curves are given in Standard Time (as the USNO presents it in its tables) and the curves associated with the darker yellow are given in Daylight Saving Time.
Here’s the code for sunplot:
python:
  1: #!/usr/bin/env python
  2: 
  3: import sys
  4: import re
  5: from dateutil.parser import parse
  6: from datetime import datetime
  7: from datetime import timedelta
  8: from matplotlib import pyplot as plt
  9: import matplotlib.dates as mdates
 10: from matplotlib.ticker import MultipleLocator, FormatStrFormatter
 11: 
 12: 
 13: # Functions
 14: 
 15: def headerInfo(header):
 16:     "Return location name, coordinates, and year from the USNO header lines."
 17: 
 18:     # Get the place name from the middle of the top line
 19:     left = 'o , o ,'
 20:     right = 'Astronomical Applications Dept.'
 21:     placeName = re.search(rf'{left}(.+){right}', header[0]).group(1).strip()
 22: 
 23:     # If the place name ends with a comma, a space, and a pair of capitals,
 24:     # assume it's in location, ST format and capitalize the location while
 25:     # keeping the state as all uppercase. Otherwise, capitalize all the words.
 26:     if re.match(r', [A-Z][A-Z]', placeName[-4:]):
 27:         placeParts = placeName.split(', ')
 28:         location = ', '.join(placeParts[:-1]).title()
 29:         state = placeParts[-1]
 30:         placeName = f'{location}, {state}'
 31:     else:
 32:         placeName = placeName.title()
 33: 
 34:     # The year is at a specific spot on the second line
 35:     year = int(header[1][80:84])
 36: 
 37:     # The latitude and longitude are at specific spots on the second line
 38:     longString = header[1][10:17]
 39:     latString = header[1][19:25]
 40: 
 41:     # Reformat the latitude into d° m′ N format (could be S)
 42:     dir = latString[0]
 43:     degree, minute = latString[1:].split()
 44:     lat = f'{int(degree)}° {int(minute)}′ {dir}'
 45: 
 46:     # Reformat the longitude into d° m′ W format
 47:     dir = longString[0]
 48:     degree, minute = longString[1:].split()
 49:     long = f'{int(degree)}° {int(minute)}′ {dir}'
 50: 
 51:     return placeName, lat, long, year
 52: 
 53: def bodyInfo(body, isLeap):
 54:     "Return lists of sunrise, sunset, and daylight length hours from the USNO body lines."
 55: 
 56:     # Initialize
 57:     sunrises = []
 58:     sunsets = []
 59:     lengths = []
 60: 
 61:     # Lengths of monthly columns
 62:     if isLeap:
 63:         daysInMonth = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
 64:     else:
 65:         daysInMonth = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
 66: 
 67:     # Rise and set character start positions for each month
 68:     risePos = [ 4 + 11*i for i in range(12) ]
 69:     setPos = [ 9 + 11*i for i in range(12) ]
 70: 
 71:     # Collect data from each day
 72:     for m in range(12):
 73:         for d in range(daysInMonth[m]):
 74:             riseString = body[d][risePos[m]:risePos[m]+4]
 75:             hour, minute = int(riseString[:2]), int(riseString[-2:])
 76:             sunrise = hour + minute/60
 77:             setString = body[d][setPos[m]:setPos[m]+4]
 78:             hour, minute = int(setString[:2]), int(setString[-2:])
 79:             sunset = hour + minute/60
 80:             sunrises.append(sunrise)
 81:             sunsets.append(sunset)
 82:             lengths.append(sunset - sunrise)
 83: 
 84:     return(sunrises, sunsets, lengths)
 85: 
 86: def dstBounds(year):
 87:     "Return the DST start and end day indices according to current US rules."
 88: 
 89:     # Start DST on second Sunday of March
 90:     d = 8
 91:     while datetime.weekday(dstStart := datetime(year, 3, d)) != 6:
 92:         d += 1
 93:     dstStart = (dstStart - datetime(year, 1, 1)).days
 94: 
 95:     # End DST on first Sunday of November
 96:     d = 1
 97:     while datetime.weekday(dstEnd := datetime(year, 11, d)) != 6:
 98:         d += 1
 99:     dstEnd = (dstEnd - datetime(year, 1, 1)).days
100: 
101:     return dstStart, dstEnd
102: 
103: def centerPeak(data):
104:     "Return the maximum value and the index of its central position."
105: 
106:     peak = max(data)
107:     # Average (to the nearest integer) the first and last peak indices
108:     peakPos = (data.index(peak) + (len(data) - data[-1::-1].index(peak))) // 2
109:     return peak, peakPos
110: 
111: def centerValley(data):
112:     "Return the minimum value and the index of its central position."
113: 
114:     valley = min(data)
115:     # Average (to the nearest integer) the first and last valley indices
116:     valleyPos = (data.index(valley) + (len(data) - data[-1::-1].index(valley))) // 2
117:     return valley, valleyPos
118: 
119: 
120: # Start processing
121: 
122: # Read the USNO data from stdin into a list of lines.
123: # Text should come from https://aa.usno.navy.mil/data/RS_OneYear
124: usno = sys.stdin.readlines()
125: 
126: # Get location and year from header
127: placeName, lat, long, year = headerInfo(usno[:2])
128: 
129: # Is it a leap year?
130: isLeap = (year % 400 == 0) or ((year % 4 == 0) and not (year % 100 == 0))
131: 
132: # Get sunrise, sunset, and sunlight length lists from body
133: sunrises, sunsets, lengths = bodyInfo(usno[9:], isLeap)
134: 
135: # Generate list of days for the year
136: currentDay = datetime(year, 1, 1)
137: lastDay = datetime(year, 12, 31)
138: days = [currentDay]
139: while (currentDay := currentDay + timedelta(days=1)) <= lastDay:
140:     days.append(currentDay)
141: 
142: # The portion of the year that uses DST
143: dstStart, dstEnd = dstBounds(year)
144: dstDays = days[dstStart:dstEnd]
145: dstRises = [ x + 1 for x in sunrises[dstStart:dstEnd] ]
146: dstSets = [ x + 1 for x in sunsets[dstStart:dstEnd] ]
147: 
148: # Plot the data
149: fig, ax = plt.subplots(figsize=(10,6))
150: 
151: # Shaded areas
152: plt.fill_between(days, sunrises, sunsets, facecolor='yellow', alpha=.5)
153: plt.fill_between(days, 0, sunrises, facecolor='black', alpha=.25)
154: plt.fill_between(days, sunsets, 24, facecolor='black', alpha=.25)
155: plt.fill_between(dstDays, sunsets[dstStart:dstEnd], dstSets, facecolor='yellow', alpha=.5)
156: plt.fill_between(dstDays, sunrises[dstStart:dstEnd], dstRises, facecolor='black', alpha=.1)
157: 
158: # Curves
159: plt.plot(days, sunrises, color='k')
160: plt.plot(days, sunsets, color='k')
161: plt.plot(dstDays, dstRises, color='k')
162: plt.plot(dstDays, dstSets, color='k')
163: plt.plot(days, lengths, color='#aa00aa', linestyle='--', lw=2)
164: 
165: # Curve annotations centered on the peaks and valleys
166: # To get these labels near the middle of the plot, we need to use
167: # different functions for the northern and southern hemispheres
168: riseFcn = {'N':centerValley, 'S':centerPeak}
169: setFcn = {'N':centerPeak, 'S':centerValley}
170: lengthFcn = {'N':centerPeak, 'S':centerValley}
171: 
172: # Sunrise
173: labeledRise, labeledRiseIndex = riseFcn[lat[-1]](sunrises)
174: ax.text(days[labeledRiseIndex], labeledRise - 1, 'Sunrise', fontsize=12, color='black', ha='center')
175: 
176: # Sunset
177: labeledSet, labeledSetIndex = setFcn[lat[-1]](sunsets)
178: ax.text(days[labeledSetIndex], labeledSet - 1, 'Sunset', fontsize=12, color='black', ha='center')
179: 
180: # Daylight length
181: labeledLight, labeledLightIndex = lengthFcn[lat[-1]](lengths)
182: ax.text(days[labeledLightIndex], labeledLight + .75, 'Daylight', fontsize=12, color='#aa00aa', ha='center')
183: 
184: # Place name and year in upper left; coordinates in lower right
185: ax.text(datetime(year, 1, 20), 22, f'{placeName} – {year}', fontsize=16, color='black', ha='left')
186: ax.text(datetime(year, 10, 10), 2, f'{lat}, {long}', fontsize=12, color='black', ha='left')
187: 
188: # Background grids
189: ax.grid(which='major', color='#cccccc', ls='-', lw=.5)
190: ax.grid(which='minor', color='#cccccc', ls=':', lw=.5)
191: 
192: # Horizontal axis shows month abbreviations between ticks
193: ax.tick_params(axis='both', which='major', labelsize=12)
194: plt.xlim(datetime(year, 1, 1), datetime(year, 12, 31))
195: m = mdates.MonthLocator(bymonthday=1)
196: mfmt = mdates.DateFormatter(' %b')
197: ax.xaxis.set_major_locator(m)
198: ax.xaxis.set_major_formatter(mfmt)
199: 
200: # Vertical axis labels formatted like h:mm
201: plt.ylim(0, 24)
202: ymajor = MultipleLocator(4)
203: yminor = MultipleLocator(1)
204: tfmt = FormatStrFormatter('%d:00')
205: ax.yaxis.set_major_locator(ymajor)
206: ax.yaxis.set_minor_locator(yminor)
207: ax.yaxis.set_major_formatter(tfmt)
208: 
209: # Tighten up the white border and save
210: fig.set_tight_layout({'pad': 1.5})
211: plt.savefig(f'{placeName}-{year}.png', format='png', dpi=150)
For me, a 200-line script is pretty long. A lot of that is due to the continual tweaking I did to make it more general, more accommodating of different inputs.
The sunrise and sunset data is read in from standard input on Line 122 and stored in a list of lines. Here’s an example:
o , o , CHICAGO, IL Astronomical Applications Dept.
Location: W087 41, N41 51 Rise and Set for the Sun for 2023 U. S. Naval Observatory
Washington, DC 20392-5420
Zone: 6h West of Greenwich
Jan. Feb. Mar. Apr. May June July Aug. Sept. Oct. Nov. Dec.
Day Rise Set Rise Set Rise Set Rise Set Rise Set Rise Set Rise Set Rise Set Rise Set Rise Set Rise Set Rise Set
h m h m h m h m h m h m h m h m h m h m h m h m h m h m h m h m h m h m h m h m h m h m h m h m
01 0718 1630 0703 1706 0626 1741 0534 1816 0447 1849 0418 1919 0419 1930 0444 1909 0516 1824 0547 1733 0623 1645 0658 1621
02 0718 1631 0702 1707 0624 1742 0532 1817 0446 1850 0418 1920 0420 1929 0445 1908 0517 1823 0549 1731 0624 1644 0700 1620
03 0718 1632 0701 1708 0623 1743 0530 1818 0445 1851 0417 1921 0420 1929 0446 1907 0518 1821 0550 1729 0625 1643 0701 1620
04 0718 1633 0700 1710 0621 1744 0529 1819 0443 1852 0417 1921 0421 1929 0447 1906 0519 1819 0551 1728 0627 1641 0702 1620
05 0718 1634 0659 1711 0619 1746 0527 1820 0442 1853 0417 1922 0422 1929 0448 1904 0520 1818 0552 1726 0628 1640 0703 1620
06 0718 1635 0658 1712 0618 1747 0525 1822 0441 1854 0416 1923 0422 1928 0450 1903 0521 1816 0553 1724 0629 1639 0704 1620
07 0718 1636 0657 1713 0616 1748 0524 1823 0440 1856 0416 1923 0423 1928 0451 1902 0522 1814 0554 1722 0630 1638 0704 1620
08 0718 1637 0656 1715 0615 1749 0522 1824 0438 1857 0416 1924 0424 1928 0452 1901 0524 1813 0555 1721 0631 1637 0705 1620
09 0718 1638 0654 1716 0613 1750 0520 1825 0437 1858 0416 1925 0424 1927 0453 1859 0525 1811 0556 1719 0633 1636 0706 1620
10 0718 1639 0653 1717 0611 1751 0519 1826 0436 1859 0415 1925 0425 1927 0454 1858 0526 1809 0557 1718 0634 1635 0707 1620
11 0717 1640 0652 1719 0610 1753 0517 1827 0435 1900 0415 1926 0426 1926 0455 1857 0527 1807 0558 1716 0635 1634 0708 1620
12 0717 1641 0651 1720 0608 1754 0516 1828 0434 1901 0415 1926 0426 1926 0456 1855 0528 1806 0559 1714 0636 1633 0709 1620
13 0717 1642 0649 1721 0606 1755 0514 1829 0433 1902 0415 1927 0427 1925 0457 1854 0529 1804 0601 1713 0638 1632 0710 1620
14 0716 1644 0648 1722 0605 1756 0512 1830 0432 1903 0415 1927 0428 1925 0458 1852 0530 1802 0602 1711 0639 1631 0710 1620
15 0716 1645 0647 1724 0603 1757 0511 1831 0431 1904 0415 1927 0429 1924 0459 1851 0531 1800 0603 1709 0640 1630 0711 1620
16 0715 1646 0645 1725 0601 1758 0509 1833 0430 1905 0415 1928 0430 1924 0500 1850 0532 1759 0604 1708 0641 1629 0712 1621
17 0715 1647 0644 1726 0559 1759 0508 1834 0429 1906 0415 1928 0430 1923 0501 1848 0533 1757 0605 1706 0642 1628 0712 1621
18 0714 1648 0642 1727 0558 1800 0506 1835 0428 1907 0415 1929 0431 1922 0502 1847 0534 1755 0606 1705 0644 1628 0713 1621
19 0714 1649 0641 1729 0556 1802 0505 1836 0427 1908 0415 1929 0432 1921 0503 1845 0535 1753 0607 1703 0645 1627 0714 1622
20 0713 1651 0640 1730 0554 1803 0503 1837 0426 1909 0416 1929 0433 1921 0504 1844 0536 1752 0609 1702 0646 1626 0714 1622
21 0713 1652 0638 1731 0553 1804 0501 1838 0425 1910 0416 1929 0434 1920 0505 1842 0537 1750 0610 1700 0647 1625 0715 1623
22 0712 1653 0637 1732 0551 1805 0500 1839 0425 1911 0416 1929 0435 1919 0506 1841 0538 1748 0611 1659 0648 1625 0715 1623
23 0711 1654 0635 1734 0549 1806 0459 1840 0424 1912 0416 1930 0436 1918 0507 1839 0539 1746 0612 1657 0650 1624 0716 1624
24 0710 1656 0634 1735 0548 1807 0457 1841 0423 1913 0417 1930 0437 1917 0508 1837 0540 1745 0613 1656 0651 1624 0716 1624
25 0710 1657 0632 1736 0546 1808 0456 1843 0422 1913 0417 1930 0438 1916 0509 1836 0541 1743 0614 1655 0652 1623 0717 1625
26 0709 1658 0631 1737 0544 1809 0454 1844 0422 1914 0417 1930 0439 1915 0510 1834 0542 1741 0616 1653 0653 1623 0717 1626
27 0708 1659 0629 1738 0542 1811 0453 1845 0421 1915 0418 1930 0440 1914 0511 1833 0543 1740 0617 1652 0654 1622 0717 1626
28 0707 1701 0627 1740 0541 1812 0451 1846 0420 1916 0418 1930 0441 1913 0512 1831 0544 1738 0618 1650 0655 1622 0718 1627
29 0706 1702 0539 1813 0450 1847 0420 1917 0418 1930 0442 1912 0513 1829 0545 1736 0619 1649 0656 1621 0718 1628
30 0705 1703 0537 1814 0449 1848 0419 1918 0419 1930 0442 1911 0514 1828 0546 1734 0620 1648 0657 1621 0718 1629
31 0704 1704 0536 1815 0419 1918 0443 1910 0515 1826 0622 1646 0718 1629
Add one hour for daylight time, if and when in use.
It’s a wide table, so you’ll have to scroll sideways to see everything.
The location and year are in the first two lines of the header. They’re collected with the headerInfo function defined in Lines 15–51. It’s kind of a long function because the name of the location is given in uppercase (no matter how you specify the name in the input field), and I want mixed case in the title. So there’s some messing around in the code to make sure the state or territory abbreviation, if present, is maintained as uppercase but the rest is converted to title case. Like this:
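The case-fixing logic boils down to a few lines. Here's a sketch of it (the function name is mine, not the script's):

```python
import re

def fix_case(placeName):
    "Title-case a USNO place name, keeping a trailing two-letter state code uppercase."
    if re.match(r', [A-Z][A-Z]', placeName[-4:]):
        parts = placeName.split(', ')
        return ', '.join(parts[:-1]).title() + ', ' + parts[-1]
    return placeName.title()

print(fix_case('CHICAGO, IL'))  # Chicago, IL
print(fix_case('HOME'))         # Home
```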
As the last example suggests, you can enter the coordinates of your house and label it “Home,” but the web site will unhelpfully turn that into “HOME” in the output. The headerInfo function will reconvert that back to “Home.”
You’ll note also that the latitude and longitude come with the direction letter first, the degrees (with a leading zero if necessary), a space, and then the minutes. I prefer a more conventional expression, so headerInfo does conversions like this:
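The coordinate reformatting is equally short. A sketch (again, the helper name is mine):

```python
def fix_coord(s):
    "Convert a USNO coordinate like 'N41 51' to degrees-and-minutes form."
    direction = s[0]
    degree, minute = s[1:].split()
    return f'{int(degree)}° {int(minute)}′ {direction}'

print(fix_coord('N41 51'))   # 41° 51′ N
print(fix_coord('W087 41'))  # 87° 41′ W
```

The int() conversions strip the leading zeros the USNO puts on the degree fields.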
This fussiness to fit my taste takes up lines of code, but I get what I want.
The body of the data, which starts on the tenth line of the input and continues more or less to the end, contains all the sunrise and sunset times at specific character positions. The bodyInfo function, defined in Lines 53–84, loops through the days of the year, month by month and day by day, extracting the data and converting it into a floating point number that represents the time of day in hours. It then simply subtracts the sunrise from the sunset to get the number of hours of daylight. A trick used later in the code—a holdover from the original version of the script—is that the vertical axis is plotting these floating point numbers. It only looks like it’s plotting a time object because the tick labels are formatted with a colon followed by two zeros.
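Each four-character rise or set field becomes a decimal hour. A minimal sketch of that conversion (the helper name is mine, not the script's):

```python
def to_hours(hhmm):
    "Convert a USNO time field like '0718' to hours as a float."
    hour, minute = int(hhmm[:2]), int(hhmm[-2:])
    return hour + minute / 60

print(to_hours('0718'))  # 7.3
# daylight on Jan 1 in the Chicago table: 16:30 minus 07:18, about 9.2 hours
print(to_hours('1630') - to_hours('0718'))
```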
Because the monthly data are in columns and the number of days in February is inconsistent, we have to pass a Boolean value, isLeap, to bodyInfo to tell it how many rows of the February columns to read. The isLeap variable is set in Line 128 using the standard Gregorian rules. I’m sure I could have used an existing library function to do this, but it was fun to do it in a single line of code. And yes, I tested that line against every type of year, even though I’ll be long dead before the simple but incorrect year % 4 == 0 rule fails in 2100.
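I don’t know the script’s exact line, but the full Gregorian rule fits in one line like this:

```python
def is_leap(year):
    # Divisible by 4, except century years, except every fourth century.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# Common year, leap year, century non-leap, century leap:
print([is_leap(y) for y in (2023, 2024, 2100, 2000)])  # [False, True, False, True]
```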
The dstBounds function in Lines 86–101 returns the start and end dates for DST using the current rules for the United States. I decided I didn’t need a more global approach because the USNO site is really geared to US locations. When it comes to weak points in the code, this is #1, because these rules could be changed at any time. As with leap years, I’m sure I could have used a library to work out these dates, but I liked the idea of writing the code myself. If the rules change, I’ll know what to do.
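Under the rules in effect since 2007, US DST runs from the second Sunday in March to the first Sunday in November. Here’s a sketch of how dstBounds might compute those dates (the function name comes from the post, but the body and return format are my reconstruction):

```python
from datetime import date, timedelta

def dst_bounds(year):
    """Return the start and end dates of US daylight saving time:
    the second Sunday in March and the first Sunday in November."""
    def nth_sunday(month, n):
        first = date(year, month, 1)
        # date.weekday(): Monday is 0, Sunday is 6
        offset = (6 - first.weekday()) % 7
        return first + timedelta(days=offset, weeks=n - 1)
    return nth_sunday(3, 2), nth_sunday(11, 1)

print(dst_bounds(2023))  # (datetime.date(2023, 3, 12), datetime.date(2023, 11, 5))
```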
(By the way, I’d say the #2 weak point is the assumed format of the data in the body of the table. The USNO could change that at any time, and I’d have to go through bodyInfo to update the character positions. Same with headerInfo. But since the table layout has been consistent for years, I think those functions will survive a while.)
To me, the most interesting new code is the stuff that centers the curve labels under the peaks and valleys. Matplotlib has an optional argument to the Axes.text function, horizontalalignment (or ha for short), that knows how to center text at a particular coordinate, but that in itself isn’t good enough. Because the USNO data reports the sunrises and sunsets to the nearest minute, the minimum value of sunrise and the maximum value of sunset last for several days. I want the labels to be centered within those stretches.
So I wrote the functions centerPeak and centerValley, defined in Lines 103–108 and 110–115, respectively, to return both the peak [valley] value and the middle location of the run of that peak [valley] value. Python’s max and min functions get the values, and the index method returns the index of the first occurrence of those values. What centerPeak and centerValley also do is reverse the given list to find the index of the last occurrence of the max or min. They then average (to the nearest integer) the first and last indices to get the index in the middle. This could be off by half a day from the true middle, but that’s not enough to notice on the graph.
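Here’s a sketch of the centerPeak idea (not the script’s actual code; centerValley is the same with min in place of max):

```python
def center_peak(values):
    """Return the maximum of values and the middle index of the
    run of days over which that maximum occurs."""
    peak = max(values)
    first = values.index(peak)
    # Search a reversed copy to find the last occurrence of the max.
    last = len(values) - 1 - values[::-1].index(peak)
    return peak, round((first + last) / 2)

# A flat-topped curve: the max of 3 runs from index 2 through 4.
print(center_peak([1, 2, 3, 3, 3, 2, 1]))  # (3, 3)
```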
Another interesting thing about these labels is that while the earliest sunrise, latest sunset, and longest daylight occur near the middle of the year for most of the US, there’s a place where that is reversed. American Samoa is in the southern hemisphere, so if I used the same label-positioning rules for it as I used for the rest of the country, the labels would be off at one end of the year or the other and might run off the end of the plot. So Lines 165–182 account for that by determining whether we search for peaks or valleys depending on the N or S at the end of the latitude string. For example, here’s a plot for Pago Pago:
They don’t use DST in Pago Pago; it’s too close to the equator for DST to be useful, and the US rules would make the adjustments go in the wrong direction. But I’m plotting it anyway—just as I plot it for Phoenix, Honolulu, and other places that don’t use DST—to show how the rules would work if they were applied. I’m showing Standard Time in the summer for places that switch to DST, so it seems consistent to show DST even in places where it doesn’t apply.
I’ve thought about adding a dashed line showing where DST would be in the winter months. Last year, the Senate passed a bill for year-round DST (the House let it lie), so it might be nice to show the consequences of that graphically. But for now, I’m going to put this away and move on to other ways of wasting my time.
Update 11 Nov 2023 4:32 PM
Based on a tip from Marcos Huerta, I watched this video from Jacqueline Nolis and realized I have more work ahead if I expect to handle places in Alaska. Go to the USNO site and request data for Nome, AK. What a mess! Starting in late May, the sunsets are so late they roll over past midnight into the next day. From then until mid-July, a naive reading of the data (which is what my code does) says that sunset comes before sunrise. This year, and probably most years, there’s a day in mid-July with two sunsets, one just after the midnight at the beginning of the day and another just before the midnight at the end of the day. July 17, the day in question, has two lines associated with it.
Ms. Nolis, who’s pulling this same data from another online source and is using R to plot it, suggests using three days of data for each day being plotted—the day before, the day of, and the day after—and then truncating the plot to show only the day of. That seems like a reasonable approach, but it may not be so reasonable if I continue to use the USNO data. I’ll have to give it some thought.
One thing I’m sure of is that I want to stick with Python and Matplotlib instead of switching to R. I’ve done enough with R to know that its syntax is not for me, no matter how appealing Kieran Healy makes it seem.
In the latest episode of Taskmaster, of which Channel 4 has posted a sneak peek, there’s a subtask related to my last post on Markov chains. The contestants are required to flip a coin and get five consecutive heads before moving on to the next part of the task. As you might expect, there are a few double-headed coins available from earlier in the task, and that makes this subtask very easy for the contestants who notice them. But the subtask can also be done by brute force with a regular coin, which raises the question: how many flips would it take, on average, to accomplish this subtask with a fair coin?
It might be easier to see how Markov chains can be used to answer this question if we build a little board game—an even easier version of Snakes and Ladders—that matches the logic of the coin flip subtask. Imagine a six-square board with the squares labeled from 0 through 5.
You start with your marker on Square 0 and start flipping a coin. If it comes up heads, you move your marker forward one square; if it comes up tails, you go back to zero. You finish the game when your marker reaches Square 5, which is the same as flipping five heads in a row.
As we did in the earlier post, we’ll take the end square to be an absorbing state and build a transition matrix that reflects the probabilities of moving from a given square to the others. The rows represent the current square and the columns represent the square after the next flip.
As you can see, the chance of moving forward one square is always ½, as is the chance of moving back to Square 0. The only square to which this doesn’t apply is Square 5 because that’s where the game ends.
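Those probabilities can be set down in NumPy like this (a sketch of the matrix described above, not code from the post):

```python
import numpy as np

# 6x6 transition matrix for the coin-flip game. From Squares 0-4,
# heads (probability 1/2) moves you forward one square and tails
# (probability 1/2) sends you back to Square 0. Square 5 absorbs.
P = np.zeros((6, 6))
for i in range(5):
    P[i, 0] = 0.5      # tails: back to the start
    P[i, i + 1] = 0.5  # heads: forward one square
P[5, 5] = 1.0          # once you finish, you stay finished

print(P)
```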
We’ve worked out all the math on this before. We define another transition matrix, $\mathbf{Q}$, as the same as $\mathbf{P}$ but with the row and column associated with the absorbing state removed:
We now define ${m}_{0}$ as the expected number of flips to get from Square 0 to the absorbing state at Square 5. Similar definitions apply to ${m}_{1}$, ${m}_{2}$, ${m}_{3}$, and ${m}_{4}$. Then
$$(\mathbf{I} - \mathbf{Q})\,\mathbf{m} = \mathbf{1}$$

where $\mathbf{m}$ is the column vector of the $m$ terms and $\mathbf{1}$ is a five-element column vector of ones.^{1} Solving this yields
$$\mathbf{m} = \left[\begin{array}{c}62\\ 60\\ 56\\ 48\\ 32\end{array}\right]$$

so the expected number of flips to get five consecutive heads is 62. Certainly doable but also frustrating—a common Taskmaster state of affairs.
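The matrix equation is easy to check numerically with NumPy:

```python
import numpy as np

# Q: the coin-flip transition matrix with the absorbing
# Square 5 row and column removed.
Q = np.zeros((5, 5))
for i in range(5):
    Q[i, 0] = 0.5          # tails: back to Square 0
    if i < 4:
        Q[i, i + 1] = 0.5  # heads: forward (heads from Square 4 absorbs)

# Solve (I - Q) m = 1 for the expected flips from each square.
m = np.linalg.solve(np.eye(5) - Q, np.ones(5))
print(m)  # [62. 60. 56. 48. 32.]
```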
You might be looking at the 32 at the end of $\mathbf{m}$ and thinking it’s too high to be the expected number of flips needed to get from four consecutive heads to the fifth consecutive head. While it’s true that you could get from Square 4 to Square 5 in one flip, the problem is that it’s equally likely that that flip will take you back to Square 0. So the expected number of flips to get from Square 4 to Square 5 is
$$(1)\,\frac{1}{2} + (62 + 1)\,\frac{1}{2} = 32$$

just as we calculated via the matrix equation.
I’m using bold for the vectors and matrices because that’s the common typographical convention. I didn’t use bold in the previous post because Marcus du Sautoy didn’t, and I wanted to be consistent with him. ↩
This week’s Numberphile video features the BBC’s favorite mathematician, Marcus du Sautoy, who explains how the game Snakes and Ladders^{1} is governed by the mathematics of Markov chains. Despite some experience analyzing Markov chains in grad school, I had trouble understanding one part of the video, so I pulled out my old textbook to clear things up.
But before we get into the math, a short digression. One of my favorite episodes of Melvyn Bragg’s In Our Time radio show was the “Zeno’s Paradoxes” show from back in 2016. I distinctly remember being infuriated by the show because du Sautoy’s fellow guests, a philosopher and a classicist, clearly thought that Zeno’s reasoning on paradoxes like Achilles and the tortoise and the flying arrow was still valuable. Not because it gives us insight into the historical development of thinking on infinite series or state space representation—which it absolutely does—but because it really is puzzling that Achilles passes the tortoise. What made the show so entertaining was listening to du Sautoy’s obvious frustration with what he considered their obtuseness with regard to solved problems and his need to suppress that frustration for the sake of politeness.
OK, onto the math. Du Sautoy starts by setting up a small Snakes and Ladders game with just ten squares—labeled 0 through 9—one ladder, and one snake.
You move across the board by rolling a single die and advancing your marker accordingly. The goal is to reach Square 9 with an exact roll. If you roll a number that is more than you need to reach Square 9, you stay where you are until the next roll.
He then goes through the construction of a transition matrix, in which each element is the probability of moving from the square represented by the row to the square represented by the column. It looks like this when he’s done:
The row and column of numbers outside the matrix are the squares on the game board. You’ll note that three of the positions are missing:
I’m calling this matrix $P$ to avoid confusing it with du Sautoy’s transition matrix, which he calls $Q$. We’ll return to this matrix soon.
It’s after constructing his transition matrix that du Sautoy does what I don’t understand. He wants to calculate the expected number of turns it will take to reach Square 9 from Square 0. The equation he comes up with is this infinite sum,
$$I + Q + Q^2 + Q^3 + Q^4 + \dots$$

where $I$ is the identity matrix and the powers represent matrix multiplication. The sum of the top row of the resulting matrix is the expected number of rolls to get to Square 9. I understand everything du Sautoy says as he explains what each of the terms in this series is, but I don’t get why it leads to the expected number of rolls.
I even understand the next part of the video, where some clever algebra shows that this infinite series converges to the non-infinite matrix^{2}
$$(I - Q)^{-1}$$

So the sum of the elements of the top row of this inverted matrix is the expected number of rolls to get to Square 9.
As it happens, you can get to this final answer by a method I understand. We’ll start by returning to the $P$ matrix I defined earlier,
As you can see, the first seven rows and columns are the same as $Q$. The first seven elements of the last column consist of the probabilities of getting to Square 9 from each of the earlier squares. These are the probabilities that were skipped over in the video, and they’re all either zero or $1/6$ because the roll has to be exactly the number of squares away from 9 you are. Note also that the sum of all the terms in each row must be 1 because you have to land somewhere after each roll.
The last row is special. It says that once you’re on Square 9, you can’t go anywhere else. In the study of Markov chains, this is known as an absorbing state, and the properties of absorbing states are pretty well known. In particular, there’s an established formula for the expected number of rolls to get from Square $i$ to Square 9, the absorbing state.^{3}
We’ll call the expected number of rolls to get from Square $i$ to Square 9 ${m}_{i}$. One thing we can say right away is that
$$m_9 = 0$$

because you don’t need to roll to get to the end when you’re already there. For all the other values of $i$, we use some lateral thinking.
First, we know that if we’re on Square 1, it takes ${m}_{1}$ rolls to get to the end—that’s by definition. So if we start on Square 0 and go to Square 1 on our first roll, it will take, on average, $1+{m}_{1}$ rolls to get to the end. But of course we may not go from Square 0 to Square 1; we might go to Square 2 or 3 or whatever. Each of these possibilities for the first roll has a probability assigned to it in the transition matrix, $P$. So the expected value of the number of rolls to get from Square 0 to Square 9 is
$$m_0 = (1 + m_1)\,p_{01} + (1 + m_2)\,p_{02} + (1 + m_3)\,p_{03}$$
$$\qquad +\, (1 + m_5)\,p_{05} + (1 + m_6)\,p_{06} + (1 + m_7)\,p_{07}$$

where the $p$ values are taken from the top row of the $P$ matrix. It may look like we’ve gone backwards here, introducing all the other ${m}_{i}$ into an equation for ${m}_{0}$. But let’s see what happens when we generalize this to any starting square and make the equation more compact with the summation symbol.
$$m_i = \sum_{\text{all } j} (1 + m_j)\,p_{ij}$$

Doing some algebra, we get
$$m_i = \sum_{\text{all } j} p_{ij} + \sum_{\text{all } j} m_j\,p_{ij}$$

Note that the first sum in this equation is the sum of all the elements of $P$ in row $i$, and that is 1 for each row. Combining that with $m_9 = 0$, we can simplify this equation to
$$m_i = 1 + \sum_{\text{all } j \ne 9} m_j\,p_{ij}$$

This works for all values of $i \ne 9$. So what we have here are seven linear equations in seven unknowns, which is hard to solve by hand, but easy to solve with a computer. If we put it in matrix form, it becomes interesting:
$$m = 1 + Q\,m$$

Here, the $m$ is a column vector of all the ${m}_{i}$ terms, the $1$ is a column vector of ones, and $Q$ is the matrix $P$ without its last row and column—yes, it’s du Sautoy’s transition matrix that ignored Square 9. Since $m = I\,m$, we can do the following manipulation:
$$m = I\,m = 1 + Q\,m$$
$$(I - Q)\,m = 1$$
$$m = (I - Q)^{-1}\,1$$

Well, this should look familiar. Because we’re multiplying by a column vector of ones, each element of $m$ is equal to the sum of the corresponding row of the inverted matrix. And therefore the expected number of rolls to get from Square 0 to Square 9 is the sum of the top row of
$$(I - Q)^{-1}$$

In solving this problem, you wouldn’t explicitly invert the matrix, because that’s more computationally intensive than solving a set of linear equations. But I wrote it out this way to show that it’s the same result as du Sautoy’s.
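To see that the two routes give the same answer, here’s a quick check with a small made-up transient matrix $Q$ (the numbers are arbitrary, not du Sautoy’s board):

```python
import numpy as np

# A small made-up transient matrix Q: two transient states,
# each with some probability of absorption (rows sum to < 1).
Q = np.array([[0.2, 0.5],
              [0.4, 0.3]])
I = np.eye(2)

# Route 1: sum each row of the explicit inverse.
m_inv = np.linalg.inv(I - Q).sum(axis=1)

# Route 2: solve the linear system (I - Q) m = 1 directly.
m_solve = np.linalg.solve(I - Q, np.ones(2))

print(np.allclose(m_inv, m_solve))  # True
```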
One thing that isn’t the same as du Sautoy’s is the numerical answer. I got the following:
$$m = [\,8.6 \quad 8.4 \quad 8.4 \quad 7.2 \quad 7.2 \quad 7.2 \quad 7.2\,]^T$$

As you can see, my answer for ${m}_{0}$ is 8.6, not 10 as du Sautoy says in the video. I checked my expression for $Q$ and it matched his. I went through all the transition probabilities again, and they agreed with his. Then I looked in the comments and found that several Numberphile followers also got 8.6, which made me feel better about my answer. While it’s possible we all made the same mistake, given that this is a very simple calculation, I think it’s more likely du Sautoy had a typo somewhere in his work. If you see an error here, though, I’d like to know about it.