Dates, triangles, and Python

Tuesday morning, which was November 28, John D. Cook started a post with

The numbers in today’s date—11, 28, and 23—make up the sides of a triangle. This doesn’t always happen; the two smaller numbers have to add up to more than the larger number.

He went on to figure out the angles of the plane triangle with side lengths of 11, 28, and 23 and then extended his analysis to triangles on a sphere and a pseudosphere. But I got hung up on the quoted paragraph. Which days can and can’t be the sides of a triangle? And how does the number of such “triangle days” change from year to year?

So I wrote a little Python to answer these questions.

python:
 1:  from datetime import date, timedelta
 2:  
 3:  def isTriangleDay(dt):
 4:    "Can the year, month, and day of the given date be the sides of a triangle?"
 5:    y = dt.year % 100
 6:    m = dt.month
 7:    d = dt.day
 8:    sides = sorted(int(x) for x in (y, m, d))
 9:    return sides[0] + sides[1] > sides[2]
10:  
11:  def allDays(y):
12:    "Return a list of all days in the given year."
13:    start = date(y, 1, 1)
14:    end = date(y, 12, 31)
15:    numDays = (end - start).days + 1
16:    return [ start + timedelta(days=n) for n in range(numDays) ]
17:  
18:  def triangleDays(y):
19:    "Return a list of all the triangle days in the given year."
20:    return [x for x in allDays(y) if isTriangleDay(x) ]

isTriangleDay is a Boolean function that implements the test Cook described for a datetime.date object. Note that Line 5 extracts just the last two digits of the year, which is what Cook intends. You could, I suppose, change Line 9 to

python:
 9:    return sides[0] + sides[1] >= sides[2]

if you want to accept degenerate triangles, where the three sides collapse onto a single line. I don’t.

The allDays function uses a list comprehension to return a list of all the days in a given year, and triangleDays calls isTriangleDay to filter the results of allDays down to just triangle days. I think both of these functions are self-explanatory.
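To make the strict versus degenerate distinction concrete, here is a minimal sketch of the same test with the choice exposed as a flag. The snake_case name and the `allow_degenerate` parameter are assumptions for illustration and are not part of the code above.

```python
from datetime import date

def is_triangle_day(dt, allow_degenerate=False):
    """Triangle test on (two-digit year, month, day). Illustrative variant only."""
    a, b, c = sorted((dt.year % 100, dt.month, dt.day))
    if allow_degenerate:
        return a + b >= c   # the three sides may collapse onto a line
    return a + b > c        # strict inequality, as in the code above

print(is_triangle_day(date(2023, 11, 28)))                        # True: 11 + 23 > 28
print(is_triangle_day(date(2023, 1, 22)))                         # False: 1 + 22 == 23
print(is_triangle_day(date(2023, 1, 22), allow_degenerate=True))  # True
```

January 22 is the kind of date the `>=` version would let in: the sides 1, 22, and 23 lie flat along a line.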

With these functions defined, I got all the triangle days for 2023 via

python:
print('\n'.join(x.strftime('%Y-%m-%d') for x in triangleDays(2023)))

which returned this list of dates (after reshaping into four columns):

2023-01-23     2023-06-27     2023-09-19     2023-11-17
2023-02-22     2023-06-28     2023-09-20     2023-11-18
2023-02-23     2023-07-17     2023-09-21     2023-11-19
2023-02-24     2023-07-18     2023-09-22     2023-11-20
2023-03-21     2023-07-19     2023-09-23     2023-11-21
2023-03-22     2023-07-20     2023-09-24     2023-11-22
2023-03-23     2023-07-21     2023-09-25     2023-11-23
2023-03-24     2023-07-22     2023-09-26     2023-11-24
2023-03-25     2023-07-23     2023-09-27     2023-11-25
2023-04-20     2023-07-24     2023-09-28     2023-11-26
2023-04-21     2023-07-25     2023-09-29     2023-11-27
2023-04-22     2023-07-26     2023-09-30     2023-11-28
2023-04-23     2023-07-27     2023-10-14     2023-11-29
2023-04-24     2023-07-28     2023-10-15     2023-11-30
2023-04-25     2023-07-29     2023-10-16     2023-12-12
2023-04-26     2023-08-16     2023-10-17     2023-12-13
2023-05-19     2023-08-17     2023-10-18     2023-12-14
2023-05-20     2023-08-18     2023-10-19     2023-12-15
2023-05-21     2023-08-19     2023-10-20     2023-12-16
2023-05-22     2023-08-20     2023-10-21     2023-12-17
2023-05-23     2023-08-21     2023-10-22     2023-12-18
2023-05-24     2023-08-22     2023-10-23     2023-12-19
2023-05-25     2023-08-23     2023-10-24     2023-12-20
2023-05-26     2023-08-24     2023-10-25     2023-12-21
2023-05-27     2023-08-25     2023-10-26     2023-12-22
2023-06-18     2023-08-26     2023-10-27     2023-12-23
2023-06-19     2023-08-27     2023-10-28     2023-12-24
2023-06-20     2023-08-28     2023-10-29     2023-12-25
2023-06-21     2023-08-29     2023-10-30     2023-12-26
2023-06-22     2023-08-30     2023-10-31     2023-12-27
2023-06-23     2023-09-15     2023-11-13     2023-12-28
2023-06-24     2023-09-16     2023-11-14     2023-12-29
2023-06-25     2023-09-17     2023-11-15     2023-12-30
2023-06-26     2023-09-18     2023-11-16     2023-12-31

That’s 136 triangle days for this year. To see how this count changes from year to year, I ran

python:
for y in range(2000, 2051):
  print(f'{y}   {len(triangleDays(y)):3d}')

which returned

2000     0
2001    12
2002    34
2003    54
2004    72
2005    88
2006   102
2007   114
2008   124
2009   132
2010   138
2011   142
2012   144
2013   144
2014   144
2015   144
2016   144
2017   144
2018   144
2019   144
2020   144
2021   142
2022   140
2023   136
2024   132
2025   127
2026   120
2027   113
2028   104
2029    93
2030    82
2031    72
2032    61
2033    51
2034    41
2035    33
2036    25
2037    19
2038    13
2039     8
2040     5
2041     2
2042     1
2043     0
2044     0
2045     0
2046     0
2047     0
2048     0
2049     0
2050     0

I knew there was no point in checking on years later in the century—it was obvious that every year after 2042 would have no triangle days. As you can see, the 2010s were the peak decade for triangle days. We’re now in the early stages of a 20-year decline.
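The cutoff follows from simple arithmetic: the largest possible month plus day is 12 + 31 = 43, so once the two-digit year is the longest side it has to be smaller than 43. As a quick check, here is a self-contained sketch; `count_triangle_days` is a hypothetical helper that repeats the same test without building lists.

```python
from datetime import date, timedelta

def count_triangle_days(year):
    """Count the dates in `year` that pass the strict triangle test."""
    d, count = date(year, 1, 1), 0
    while d.year == year:
        a, b, c = sorted((d.year % 100, d.month, d.day))
        if a + b > c:
            count += 1
        d += timedelta(days=1)
    return count

print(count_triangle_days(2042))   # 1: only December 31, since 12 + 31 = 43 > 42
print(count_triangle_days(2043))   # 0: no month + day exceeds 43
print(max(count_triangle_days(y) for y in range(2043, 2100)))   # 0
```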

After doing this, I looked back at my code and decided that most serious Python programmers wouldn’t have done it the way I did. Instead of functions that returned lists, they would build allDays and triangleDays as iterators.[1] Not because there’s any need to save space (the space used by 366 datetime.date objects is hardly even noticeable) but because that’s more the current style.

So to make myself feel more like a real Pythonista, I rewrote the code like this:

python:
from datetime import date, timedelta

def isTriangleDay(dt):
  "Can the year, month, and day of the given date be the sides of a triangle?"
  y = dt.year % 100   # two-digit year
  m = dt.month
  d = dt.day
  sides = sorted(int(x) for x in (y, m, d))
  # Strict inequality rejects degenerate (collinear) triangles.
  return sides[0] + sides[1] > sides[2]

def allDays(y):
  "Iterator for all days in the given year."
  d = date(y, 1, 1)
  end = date(y, 12, 31)
  while d <= end:
    yield d
    d = d + timedelta(days=1)

def triangleDays(y):
  "Iterator for all the triangle days in the given year."
  return filter(isTriangleDay, allDays(y))

isTriangleDay is unchanged, but allDays now works its way through the days of the year with a while loop and the yield statement, and triangleDays uses the filter function to iterate through just the triangle days.
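For what it's worth, the standard library can report what each of these objects is. The sketch below uses a stand-in `all_days` function (an assumption, written to mirror allDays) and a throwaway lambda. As I read the Python glossary, a function containing yield is a generator function, calling it returns a generator object, and every generator object is an iterator. A filter object is also an iterator, but not a generator.

```python
import inspect
from collections.abc import Iterator
from datetime import date, timedelta

def all_days(y):
    """Stand-in for allDays: yield each date in year y."""
    d = date(y, 1, 1)
    while d.year == y:
        yield d
        d += timedelta(days=1)

days = all_days(2023)
print(inspect.isgeneratorfunction(all_days))   # True: the function itself
print(inspect.isgenerator(days))               # True: the object it returns
print(isinstance(days, Iterator))              # True: generators are iterators
print(next(days))                              # 2023-01-01, produced on demand

even_days = filter(lambda d: d.day % 2 == 0, days)
print(isinstance(even_days, Iterator))         # True
print(inspect.isgenerator(even_days))          # False: an iterator, but not a generator
```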

Using these functions is basically the same as using the list-based versions, except that you can’t pass an iterator to len. So determining the number of triangle days over a range of years can be done either by converting the iterator to a list before passing it to len,

python:
for y in range(2000, 2051):
  print(f'{y}   {len(list(triangleDays(y))):3d}')

or by using the sum function with a generator expression that produces a one for each element of triangleDays,

python:
for y in range(2000, 2051):
  print(f'{y}    {sum(1 for x in triangleDays(y))}')

The former sort of defeats the purpose of using an iterator, so I guess it’s better practice to use the latter, even though I find it weird looking.

It may well be that my perception of “real” Python programmers is wrong and they wouldn’t bother with yield and filter in such a piddly little problem as this. But at least I got some practice with them.


  1. A confession: I find it hard to distinguish between the proper uses of the terms generator and iterator. My sense is that generators provide a way of creating iterators. So once the function is written, do you have a generator, an iterator, or both?


A couple of game followups

Here are some little tricks associated with Wordle and Conlextions, the Connections-like puzzle put out by Lex Friedman.

Last week, I mentioned that I needed a new starting guess for Wordle, and I wrote about how I used some simple command-line tools to see if the word I was considering was an appropriate choice. In a nutshell, I had been using IRATE as my first guess, but because it was the answer recently I wanted to try a new initial guess that might be the answer in the future. ALTER seemed like a good choice, but I wanted to make sure it hadn’t already been used. I had a list of all the answers in chronological order in a file named answers.txt, and the most recent answer was DWELT.

My solution involved three separate commands. All three involved grep and one included head in a pipeline. Reader Leon Cowle was unsatisfied with this and began casting about for a cleaner solution. Here’s the single command he came up with to replace the three I used:

egrep 'alter|dwelt' answers.txt

The output was

dwelt
alter

which told me that ALTER was an answer and that it would come after DWELT. This was everything I needed to know in one step. Beautiful!
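The reason this works is that grep prints matching lines in file order, so the order of the output is the chronological order of the two words. Here is a rough Python equivalent as a sketch. The function name is made up, and the sample list is invented filler around the two words, not the real contents of answers.txt.

```python
import re

def matches_in_order(lines, *words):
    """Return the lines containing any of `words`, in the order they appear."""
    pattern = re.compile("|".join(re.escape(w) for w in words))
    return [line for line in lines if pattern.search(line)]

# Invented sample data standing in for answers.txt
sample = ["filler1", "dwelt", "filler2", "filler3", "alter", "filler4"]
print(matches_in_order(sample, "alter", "dwelt"))   # ['dwelt', 'alter']
```

The order of the arguments doesn't matter; only the order of the lines does.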

You might argue that for a one-off like this, the solution that occurs to you first is the best because it takes the least amount of your time. Generally speaking, I agree with that, and I’m not unhappy with my three-command solution. But Leon has given me the best of both worlds. I got to have my own inefficient solution that I thought of quickly and I got to learn from his more elegant solution. Knowing which of two text strings appears first in a file is something I’m pretty sure I’ve had to do before and will have to do again. Now I have a simple and effective solution to pull out of my toolbox. Thanks, Leon!

As for Conlextions, I’ve been playing it for a few weeks and recommend it to anyone who likes the NY Times Connections game but is finding it a little too easy. I’ve been playing Connections since early summer and while it has gotten more difficult, the sense of accomplishment I get in solving it with no mistakes is wearing off. Conlextions is more diabolical and more satisfying when you solve it in four guesses.

I have only three gripes with Conlextions:

  1. I don’t think the t belongs in its name.
  2. Lex’s tendency to define groups as “words that begin with S,” or some other letter, is frustrating as hell because it’s both ridiculously easy and a connection I almost never see.
  3. The info that you share with others when you solve the puzzle includes the time it took. I’m a slow and methodical player, certain that Lex has laid so many traps that I need to think through all the possibilities three or four times before committing myself to any grouping. Including the time I spend is embarrassing, especially when I see the solution times of people like Dan Moren and Greg Pierce.

[Image: Conlextions solution]

In protest of this prejudice against the excessively careful, I’ve recently taken to deleting the solve time from my posts on Mastodon. And to avoid the tedium of backspacing, I made this Shortcut that does the deleting for me:

[Image: Shortcut for deleting solution time text from clipboard]

It searches the clipboard for a linefeed, the word “Solve,” and all the text after that. It replaces that with the empty string and puts the updated text back onto the clipboard.
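The same search-and-replace can be sketched in Python. The pattern here is my guess at a regular expression matching that description, not a copy of what's in the Shortcut, and the sample text is an invented stand-in for the game's share text.

```python
import re

def strip_solve_time(text):
    """Delete a linefeed, the word 'Solve', and everything after it."""
    # DOTALL lets '.' match newlines, so the match runs to the end of the text.
    return re.sub(r"\nSolve.*", "", text, flags=re.DOTALL)

# Invented sample; the real share text may look different.
sample = "Conlextions result\n(grid of squares)\nSolve time: 12:34"
print(strip_solve_time(sample))
```

If "Solve" never appears at the start of a line, the text comes back unchanged.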

So now when I want to post my Conlextions result to Mastodon, I tap the “Share these results” button, press and hold the side button on my phone to activate Siri, and say “Delete Time.” The name of the Shortcut seems to be distinct enough that Siri hasn’t misinterpreted it yet.

I would not normally use Shortcuts for an automation like this. Keyboard Maestro would allow me to invoke it with a keystroke and could also do the pasting. But since I always play Conlextions on my phone, Shortcuts was the best option.


Nearly cheating

Last week I solved Wordle in a single guess. My go-to first guess, IRATE, finally came in after months (over a year, I think) of use. So what do I do now? Seems like a good time to switch my starting guess.

Before we go any further, I should mention, for those of you who don’t commit every utterance of mine to memory and are thinking “There was no IRATE last week,” that I don’t play the NY Times version of Wordle. Back in early 2022, right after the Times bought Wordle, I downloaded the original version and set it up on a server I control. That’s the game my family and I have been playing ever since. Overall, I’d say this was unnecessary. The Times hasn’t screwed up the game the way I thought it would,[1] but it’s too late now for us to change.

Based on letter frequency tables, I figured ALTER would be a good new initial guess. But because I was so delighted when IRATE came up, I’d like to choose a word that

  1. Is among the list of 2315 words that are answers.
  2. Hasn’t been an answer already.

And I want to see if ALTER passes these two tests without spoiling myself with other answers, especially those that may be coming up soon.

I have the data to do this if I’m careful. In order to build my wordle script, I needed to pull out all the answers and acceptable guesses from the original Wordle JavaScript source code. I was able to do this nearly blind—I saw just the first and last couple of words in each list and have long since forgotten them—and save them to three files: guesses.txt, answers.txt, and wordle.txt, the last of which is a blending of the two others.

Some simple shell commands got me what I wanted. First, the simple part:

grep alter answers.txt

returned alter, which confirmed my belief that ALTER is an answer and didn’t reveal any other answers. Checking whether it had already been an answer was only a little trickier.

The answers.txt file has the answer words in chronological order, one per line. To figure out where I currently am in that list, I got the line number of yesterday’s word, DWELT.

grep -n dwelt answers.txt

The -n option causes the line number of the found text to be printed along with the text:

878:dwelt

With this information, I can now see if ALTER has already been the answer:

head -n 878 answers.txt | grep alter

The head command outputs just the first 878 lines of answers.txt and the grep command looks for alter in that text. It returned nothing, so ALTER has not been an answer so far. It passes my test and will be my first guess from now on.
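The two-step check can also be folded into one function that returns only a yes or no, so nothing else in the list is ever displayed. This is a sketch under assumptions: `used_so_far` is a hypothetical name, the sample list is invented filler, and it compares whole words where grep would match substrings.

```python
from itertools import islice

def used_so_far(lines, candidate, latest):
    """True if `candidate` appears at or before `latest` in the chronological list."""
    words = [line.strip() for line in lines]
    cutoff = words.index(latest) + 1              # plays the role of the 878 from grep -n
    return candidate in islice(words, cutoff)     # plays the role of head -n ... | grep

# Invented sample data standing in for answers.txt
sample = ["filler1", "dwelt", "filler2", "alter"]
print(used_so_far(sample, "alter", "dwelt"))     # False: alter comes after dwelt
print(used_so_far(sample, "filler1", "dwelt"))   # True
```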

What I like about this solution is that I was able to run my test quickly without seeing any other answer words and without tipping myself off to when ALTER will appear, which I could have easily done with

grep -n alter answers.txt

I recognize that some of you will read this and think I’ve crossed the line of Wordle ethics into outright cheating. You can keep your opinions to yourself.


  1. Although I recall some outrage about a year ago when FEAST was the too-on-the-nose word for Thanksgiving Day. 


Consecutive heads or tails

Earlier today, I talked about sticking with a problem longer than I probably should because I can’t stop. Let’s apply that pathology to the Taskmaster coin flip subtask and think about a slightly different problem.[1] Suppose we flip a fair coin and stop when we’ve flipped either five consecutive heads or five consecutive tails. What is the expected number of flips?

As we did with the simpler game, we can imagine this as a board game to help us configure the Markov chain transition matrix. Here, we treat consecutive heads as positives and consecutive tails as negatives.

[Image: Game board for heads and tails]

Update 11 Nov 2023 11:12 PM
I screwed this up the first time through and should’ve seen the error before I published. The transformation matrices are correct now and give a result that makes more sense. Sorry about that.

We start with our marker on Square 0 and start flipping. A head moves us one square to the right; a tail moves us one square to the left. If we’re on a positive square, a tail takes us to Square –1. If we’re on a negative square, a head takes us to Square 1. The game ends when we reach either Square 5 or Square –5.

Here’s the game’s transformation matrix, where we’ve set the squares at the two ends of the board to be absorbing states:

P=[1 0 0 0 0 0 0 0 0 0 0 12 0 0 0 0 0 12 0 0 0 0 0 12 0 0 0 0 12 0 0 0 0 0 0 12 0 0 0 12 0 0 0 0 0 0 0 12 0 0 12 0 0 0 0 0 0 0 0 12 0 12 0 0 0 0 0 0 0 0 12 0 0 12 0 0 0 0 0 0 0 12 0 0 0 12 0 0 0 0 0 0 12 0 0 0 0 12 0 0 0 0 0 12 0 0 0 0 0 12 0 0 0 0 0 0 0 0 0 0 1]

The rows and columns represent the squares in numerical order from –5 to 5. The lower right half is similar to the transformation matrix we used in the earlier post, and the upper left half is sort of the upside-down mirror image of the lower right half. The main difference is that we never go to Square 0 after a flip; if the coin comes up opposite the run we were on, we go to the first square on the opposite side of Square 0.
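The matrix can also be generated straight from the rules of the board game, which is a handy way to check the entries. What follows is a minimal sketch in plain Python using exact fractions; the names `next_square` and `build_P` are made up for this illustration.

```python
from fractions import Fraction

N = 5                                # run length that ends the game
SQUARES = list(range(-N, N + 1))     # Squares -5 through 5, in matrix order

def next_square(s, flip):
    """Square reached from a non-absorbing square s after a flip, 'H' or 'T'."""
    if flip == "H":
        return s + 1 if s >= 0 else 1     # extend a heads run or start a new one
    return s - 1 if s <= 0 else -1        # extend a tails run or start a new one

def build_P():
    """Build the transition matrix as a list of rows of Fractions."""
    idx = {s: i for i, s in enumerate(SQUARES)}
    P = [[Fraction(0)] * len(SQUARES) for _ in SQUARES]
    for s in SQUARES:
        if abs(s) == N:
            P[idx[s]][idx[s]] = Fraction(1)           # absorbing end squares
        else:
            for flip in "HT":
                P[idx[s]][idx[next_square(s, flip)]] += Fraction(1, 2)
    return P

for row in build_P():
    print(" ".join(f"{str(x):>3}" for x in row))
```

Every row should sum to 1, and the column for Square 0 should be all zeros, since no flip ever lands there.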

As we’ve done in the past, we create a Q transformation matrix by eliminating the rows and columns associated with the absorbing states from the P matrix:

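$$
\mathbf{Q} = \left[\begin{array}{ccccccccc}
0 & 0 & 0 & 0 & 0 & \tfrac12 & 0 & 0 & 0 \\
\tfrac12 & 0 & 0 & 0 & 0 & \tfrac12 & 0 & 0 & 0 \\
0 & \tfrac12 & 0 & 0 & 0 & \tfrac12 & 0 & 0 & 0 \\
0 & 0 & \tfrac12 & 0 & 0 & \tfrac12 & 0 & 0 & 0 \\
0 & 0 & 0 & \tfrac12 & 0 & \tfrac12 & 0 & 0 & 0 \\
0 & 0 & 0 & \tfrac12 & 0 & 0 & \tfrac12 & 0 & 0 \\
0 & 0 & 0 & \tfrac12 & 0 & 0 & 0 & \tfrac12 & 0 \\
0 & 0 & 0 & \tfrac12 & 0 & 0 & 0 & 0 & \tfrac12 \\
0 & 0 & 0 & \tfrac12 & 0 & 0 & 0 & 0 & 0
\end{array}\right]
$$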

Then we proceed as before, forming the matrix equation

$$(\mathbf{I} - \mathbf{Q})\,\mathbf{m} = \mathbf{1}$$

where $\mathbf{m}$ is the column vector of the expected number of flips to get to an absorbing state from Squares –4 through 4, and $\mathbf{1}$ is a nine-element column vector of ones. Solving this,[2] we get

$$\mathbf{m} = \begin{bmatrix} 16 & 24 & 28 & 30 & 31 & 30 & 28 & 24 & 16 \end{bmatrix}^{\mathsf{T}}$$

So the expected number of flips to get either five consecutive heads or five consecutive tails is $m_0 = 31$, the value in the middle of this vector. This is half the value we got last time, which makes sense.
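For anyone without Mathematica handy, the system is small enough to solve exactly in plain Python. The sketch below is one way to do it, not the method used for the numbers above: it builds the augmented matrix for the nine non-absorbing squares from the game rules and runs Gauss-Jordan elimination with fractions. The function name `expected_flips` is an assumption.

```python
from fractions import Fraction

def expected_flips(n=5):
    """Solve (I - Q) m = 1 for Squares -(n-1)..(n-1); return {square: expected flips}."""
    states = list(range(-(n - 1), n))
    idx = {s: i for i, s in enumerate(states)}
    size = len(states)
    half = Fraction(1, 2)

    # Augmented matrix [I - Q | 1], starting from [I | 1]
    A = [[Fraction(int(i == j)) for j in range(size)] + [Fraction(1)]
         for i in range(size)]
    for s in states:
        heads = s + 1 if s >= 0 else 1
        tails = s - 1 if s <= 0 else -1
        for t in (heads, tails):
            if abs(t) < n:                 # moves into absorbing squares are not in Q
                A[idx[s]][idx[t]] -= half

    # Gauss-Jordan elimination with exact arithmetic
    for col in range(size):
        pivot = next(r for r in range(col, size) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(size):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]

    return {s: A[idx[s]][-1] for s in states}

m = expected_flips(5)
print([int(v) for v in m.values()])   # [16, 24, 28, 30, 31, 30, 28, 24, 16]
print(m[0])                           # 31
```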

Am I done with Markov chains now? I hope so, but you never know.


  1. By the way, you can now see the whole episode on YouTube. 

  2. Which I did with Mathematica, but you could do with any number of programs. Excel, for example.