Snow and memory
November 9, 2025 at 5:01 PM by Dr. Drang
Two things came together over the past few days. First was my longstanding belief that winters here in the Chicago area were snowier when I was a kid than they are now. I’m not sure why that popped back into my head, but it did. Second was my concern that I’ve been losing my Pandas skills since I retired and stopped having data analysis work projects. With both of these things floating around in my head at the same time, I decided to put together a small project to address them: an analysis of Chicago snowfall data over several decades.
The first problem was getting the data. You can download lots of historical weather data from NOAA, but I’ve never found its website easy to search. I’m sure the terminology NOAA uses makes sense to regular weather analysts, but as a dilettante, I’ve never felt comfortable with it. Still, I was able to find this page, which presented a clickable map with local forecast offices across the US.

Clicking on northern Illinois brought me to another page that let me choose the data to download. They call it NOWData,1 and I selected monthly summarized snowfall data for the Chicago area.

After clicking Go, this overlay appeared:

This is in nearly perfect form for the analysis I want to do. Each row represents a 12-month period centered on New Year’s Day, so the total at the end of the row is that entire winter’s snowfall.
I selected all the rows of the table (apart from some summary rows at the bottom) and used a combination of BBEdit and Numbers to clean things up and convert it to a CSV file. Here are the first several rows:
Season,Jul,Aug,Sep,Oct,Nov,Dec,Jan,Feb,Mar,Apr,May,Jun,Total
1883,,,,,,,,,,,,0.0,
1884,0.0,0.0,0.0,0.0,0.5,8.8,20.2,19.0,3.6,1.9,0.0,0.0,54.0
1885,0.0,0.0,0.0,0.0,0.7,14.6,26.7,6.0,1.9,1.0,0.0,0.0,50.9
1886,0.0,0.0,0.0,0.0,2.6,9.8,17.7,4.2,6.2,0.0,0.0,0.0,40.5
1887,0.0,0.0,0.0,0.0,2.5,9.9,11.9,2.2,3.5,2.0,0.1,0.0,32.1
1888,0.0,0.0,0.0,0.0,0.5,3.2,6.0,7.9,5.1,0.0,0.0,0.0,22.7
1889,0.0,0.0,0.0,0.0,1.3,0.0,2.7,8.3,9.4,0.0,0.0,0.0,21.7
1890,0.0,0.0,0.0,0.0,0.0,7.1,3.5,1.1,7.7,2.2,0.0,0.0,21.6
1891,0.0,0.0,0.0,0.0,6.8,5.5,15.3,2.8,3.1,0.0,0.0,0.0,33.5
1892,0.0,0.0,0.0,0.0,0.8,2.1,15.2,11.8,1.0,0.6,0.0,0.0,31.5
1893,0.0,0.0,0.0,0.0,7.5,12.1,6.5,12.9,5.4,0.0,0.0,0.0,44.4
1894,0.0,0.0,0.0,0.0,2.5,10.1,15.4,14.0,5.2,0.0,0.0,0.0,47.2
1895,0.0,0.0,0.0,0.0,14.5,3.4,2.0,27.8,5.9,0.0,0.0,0.0,53.6
1896,0.0,0.0,0.0,0.0,4.2,1.3,,,,,,,
As you can see, I turned all the Ms into explicit missing values and all the Ts into zeros. I also deleted the “second half” of the first column, identifying each season by the calendar year in which it starts.
Although the table goes all the way back to 1883, I wasn’t interested in the early figures. Part of that came from what I saw as questionable data—trace amounts of snow in the Julys of 1902–1904?—and part was just disinterest in snowfall more than about a generation before I was born. So I decided to review just the data since 1932. Not an important year in weather, as far as I know, but a significant one in American history.
Here’s a plot of the raw data and a curve fit through it via local regression analysis. So yes, my youth was spent with more snowfall than the generation before or after. And the last three years have been nearly snowless.

You might question drawing a curve through data with this much year-to-year variation, but even without the blue line, I think it’s clear that the 60s and 70s had both the highest highs and the highest lows.
Here’s the Python code that produced the graph:
python:
1: #!/usr/bin/env python3
2:
3: import pandas as pd
4: import numpy as np
5: import matplotlib.pyplot as plt
6: from matplotlib.ticker import MultipleLocator, AutoMinorLocator, FixedLocator
7: import statsmodels.api as sm
8:
9: # Read in the data, putting only the Season and Total columns in the dataframe.
10: df = pd.read_csv('area.csv', usecols=[0, 13])
11:
12: # Retain rows since 1932 and when the Total isn't missing.
13: df = df[(df.Season >= 1932) & df.Total.notna()]
14:
15: # Smooth the yearly snowfall with local regression.
16: smooth = sm.nonparametric.lowess(df.Total, df.Season, frac=.25, it=0)
17:
18: # Create the plot with a given size in inches.
19: fig, ax = plt.subplots(figsize=(6, 6))
20:
21: # Plot the raw and smoothed values.
22: ax.plot(df.Season, df.Total, '--', color='#bbbbbb', lw=1)
23: ax.plot(df.Season, df.Total, '.', color='black', ms=4)
24: ax.plot(smooth[:, 0], smooth[:, 1], '-', color='blue', lw=2)
25:
26: # Set the limits
27: plt.xlim(xmin=1930, xmax=2025)
28: plt.ylim(ymin=0, ymax=100)
29:
30: # Set the major and minor ticks and add a grid.
31: ax.xaxis.set_major_locator(MultipleLocator(10))
32: ax.xaxis.set_minor_locator(AutoMinorLocator(2))
33: ax.yaxis.set_major_locator(MultipleLocator(10))
34: ax.yaxis.set_minor_locator(AutoMinorLocator(2))
35: ax.grid(linewidth=.5, axis='x', which='major', color='#dddddd', linestyle='-')
36: ax.grid(linewidth=.5, axis='y', which='major', color='#dddddd', linestyle='-')
37:
38: # Title and notes.
39: plt.title('Chicago area snowfall (inches)')
40: plt.text(1952, 86.5, 'NWS NOWData', ha='center', va='center',
41: fontsize='small', backgroundcolor='white')
42: plt.text(1952, 83, 'www.weather.gov/wrh/climate?wfo=lot', ha='center',
43: va='center', fontsize='x-small', backgroundcolor='white')
44: plt.text(2015, 3.5, 'Calendar year at start of winter season', ha='right',
45: va='center', fontsize='x-small', backgroundcolor='white')
46:
47: # Make the border and tick marks 0.5 points wide.
48: [ i.set_linewidth(0.5) for i in ax.spines.values() ]
49: ax.tick_params(which='both', width=.5)
50:
51: # Save as PNG.
52: plt.savefig('20251108-Chicago area snowfall.png', format='png', dpi=200, bbox_inches='tight')
There’s really not much Pandas in this script. The only interesting bits are in Lines 10 and 13. The usecols parameter in Line 10 limits the dataframe to just the two fields we care about in the CSV. Then the filter in Line 13 gets rid of both the years before 1932, as discussed above, and a few other years in which one or more months of missing data prevent a legitimate snowfall total.
Eliminating those missing value rows was necessary to run the lowess function in Line 16. I chose the frac and it parameters based on an “eye test” of the resulting curve. I’d try to come up with a better justification if this were more than just a blog post.
In addition to keeping my skills up in Pandas, I’d like to learn more about Polars and R. Since Polars’s main claim to fame is being faster than Pandas, it doesn’t make much sense to use it on a data set with only about 150 records. So I wrote a little R script to produce this less-than-satisfying plot:

The bulk of the plot is fine, but I didn’t stick with it long enough to get the details looking the way I like. The tick marks and major grid lines are too thick, and I’d rather not have the minor grid lines at all. I don’t understand why R created them without any associated ticks. In any event, here’s my amateurish R code:
1: #!/usr/bin/env Rscript
2:
3: library(tidyverse)
4:
5: # Import the data, including only the Season and Total columns.
6: df = read_csv('area.csv', col_select=c(1, 14), show_col_types=FALSE)
7:
8: # Retain rows since 1932 and when the Total isn't missing.
9: df <- filter(df, Season >= 1932, !is.na(Total))
10:
11: # Plot the raw and smoothed data using the bw theme.
12: p1 <- ggplot(data=df, mapping=aes(x=Season, y=Total)) +
13: theme_bw() +
14: geom_line(color='#bbbbbb', linetype='dashed', linewidth=.25) +
15: geom_point(size=.5) +
16: geom_smooth(method='loess', se=FALSE, span=.4, linewidth=.75, color='blue')
17:
18: # Set the scale and tick spacing.
19: p2 <- p1 +
20: scale_x_continuous(name=NULL, limits=c(1930, 2025), expand=0,
21: breaks=seq(1930, 2020, 10)) +
22: scale_y_continuous(name=NULL, limits=c(0, 100), expand=0,
23: breaks=seq(0, 100, 10))
24:
25: # Set the title.
26: p3 <- p2 +
27: ggtitle('Chicago area snowfall (inches)') +
28: theme(plot.title=element_text(size=10, hjust=.5))
29:
30: # Save it as a PNG file.
31: ggsave('20251109-Chicago snowfall ggplot.png', p3,
32: width=1200, height=1200, units='px')
I do like R’s layering approach of adding features to a plot step by step. And it’s cool that I can add a smoothing line without explicitly creating smoothed data series first. I’m not sure of the internals of R’s loess algorithm, other than it’s clearly not the same as the lowess function I used in the Python script. I had to use different parameters to get a curve that looked similar.
One thing I found absolutely infuriating was the need to add the expand=0 parameter to Lines 20 and 22. To me, when I set the limits on an axis, the plotting software should not extend that axis by an extra 5% or so by default. It should just do what I tell it to do and nothing more.
Anyway, I found the R approach to plotting interesting enough to give it another try the next time a problem like this comes up. I really should crack open Kieran Healy’s book, which Amazon says I bought nearly seven years ago. I have more time now.
-
The NOW part is one of those annoying acronyms of an acronym: NOAA Online Weather. ↩
The wait for new Siri continues
November 4, 2025 at 4:25 PM by Dr. Drang
Yesterday, I drove to Champaign-Urbana for the Illinois men’s home basketball opener. Because I’m going to the CSO concert tonight, I stayed here overnight. When I started my car this morning, CarPlay connected to my phone and offered to give me directions home. Snort.
It’s common for CarPlay to do this when I’m away from home, and under most circumstances it’s a reasonable suggestion. A couple of years ago, when my wife was having day-long chemo infusions at the University of Chicago Medical Center, it was nice to have our rush hour route automatically plotted from Hyde Park home to Naperville. But I’m obviously not going home today until after the concert.
Should Siri know this? Well, the concert ticket is in my Wallet, the event (with location) is in my Calendar, and the email with a link to the ticket is archived somewhere in Mail. Last year’s Apple Intelligence ads would lead anyone—anyone who didn’t keep track of Siri’s actual capabilities, that is—to think that suggestions given by an Apple device would be precise and personalized. But even people who don’t follow Apple closely know that last year’s ads were lies.
We’re now hoping that Mark Gurman is right and a bespoke version of Google Gemini will be the savior of personalized Siri, giving us an assistant that’s aware of our calendar, contacts, and so on, but doesn’t transmit all our data to the don’t be evil empire in Mountain View.
Or maybe we shouldn’t bother. If Apple hadn’t made promises it couldn’t keep, today’s suggestion to drive home wouldn’t have struck me as funny. I’d be happy when suggestions like that are right and would dismiss them without a second thought when they’re wrong. It’s the hope that kills you.
The problem with dollars
November 1, 2025 at 9:08 AM by Dr. Drang
I stopped charting Apple’s quarterly results back in 2019 and don’t intend to return to it, but after seeing the recent posts at Six Colors and TidBITS, I thought I’d try out a new graph.
I got out of the Apple charting business shortly after Apple stopped reporting unit sales with the Q1 2019 figures.1 I care more about how many products Apple is putting into people’s hands than about how much money is passing from those hands into Apple’s pocket. I mentioned this in my penultimate quarterly charting post, and I also pointed out a problem with charting dollar sales instead of unit sales:
One unaddressed problem with [Apple’s revenue figures] is that they don’t account for inflation—something I didn’t have to worry about when I was plotting unit sales. Apple doesn’t account for inflation either, of course, but that doesn’t mean I shouldn’t. If I’m going to keep doing this, I’ll have to decide on an inflation index and a basis year.
Of course, I didn’t decide on how to handle inflation. I chickened out and stopped doing quarterly posts after just one more set of charts. And Apple is happy to ignore inflation, because doing so allows it to report most quarters as “the best ever” in one form or another.
But let’s adjust Apple’s numbers for inflation and see what happens. For simplicity, I’m going to use the CPI-U Index, a common measure of inflation. It looks like this from 2018 to the present:

I downloaded the CPI-U monthly values from the Bureau of Labor Statistics and adjusted Apple’s revenue figures to their equivalent in Q1 2019 dollars. Here’s how that looks compared to the unadjusted revenue:

After adjustment, the climb after the big work-from-home jump isn’t as impressive, although the non-Q1 quarters of fiscal 2025 are quite good. The just-reported Q4 revenue truly is the best Q4 ever, even after adjustment.
You might say that CPI-U isn’t the right inflation measure to use. Also—as Jason Snell pointed out when I showed him this graph—Apple is a worldwide company, and there are different inflation rates in every country. But some adjustment should be made, especially if you want to graph results over several years and some of those years include the COVID inflation period. I’ve chosen the CPI-U because Apple reports its earnings in US dollars, and that’s the inflation metric most Americans are used to seeing.
This isn’t to say that it’s wrong to report the unadjusted numbers, just that accounting for inflation gives some added perspective. Does this mean I’m suggesting additional work for others that I myself don’t intend to do? Yes.
-
In case you’ve forgotten, Apple’s fiscal year ends on the last Saturday of September, so Q1 2019 covers roughly October through December of 2018. ↩
Vectors and weathervanes
October 29, 2025 at 12:51 PM by Dr. Drang
Before going out for a bike ride this morning, I checked the Weather app to see the speed and direction of the wind. On days with a decent wind, I prefer to ride out against the wind and come back with it. Here’s what the app showed:

So I left my house and went east, but my ride is not what this post is about. It’s about the image in the Wind section of the screenshot and why I have to think carefully before making any decisions based on it.
The wind image is an arrow on top of a compass, and the arrow is oriented to point in the direction that the wind is blowing. It’s like a vector representation of the wind, except that the wind speed isn’t given by the length of the arrow, it’s written explicitly at the center of the compass. Apple is like many other weather apps in showing the wind with an arrow like this.
The problem I have—and I realize this is my problem, not Apple’s—is that when I see an image like this, my first thought is of a weathervane.

Good Directions weathervane image from Amazon.
The arrow on a weathervane points into the wind, not with it, because the arrowhead is smaller than the fletching. If Apple’s wind image were a stylized weathervane rather than a vector, it would be pointing in the opposite direction. This is why I have to think twice about wind direction.
The terminology we use for wind direction is clearly based on weathervanes. This morning’s wind is described as east-northeast (ENE)—the direction a weathervane would be pointing. It’s possible that Apple puts a dot at the tail of the wind vector to have something that matches the text. Unfortunately for me, the dot makes the arrow look even more like a weathervane, forcing me to think harder. It would be better if I could just read the text in the left half of the Wind section, but I can’t. The image draws my attention.
Which reminds me of something. About a week ago, I was riding my bike along the Centennial Trail from Romeoville to Willow Springs. The wind was strong and gusty from the south. As I went along the Chicago Sanitary and Ship Canal by the Lemont Shipyard, the boom of a tower crane extended over the bike path.

Obviously, the tower crane wasn’t moving any cargo onto or off of the bike path. It had swung into that position because of the wind. Tower cranes aren’t supposed to be used in high winds, and operators are trained to disengage the gearing to allow them to swing with the wind, reducing the force that would overturn them. This is sometimes called “weathervaning,” even though the business end of the boom points with the wind (like Apple’s arrow) instead of into the wind.
By the way, here’s where that photo was taken, and you can see why a south wind would push the boom into the position shown. This portion of the bike path runs between the Des Plaines River and the canal. Another canal, the historic (and much smaller) Illinois and Michigan Canal, is south of the Sanitary and Ship Canal.

Don’t get the I&M Canal confused with the Hennepin Canal. Unlike the Hennepin Canal, the I&M Canal was an economic success, but very little of it can be kayaked nowadays—too shallow and too filled with downed trees. Its towpath has been turned into a great bike path, though. A few miles north of Lemont, I switched from the Centennial Trail to the John Husar Trail, crossing over the Sanitary and Ship Canal to finish my ride along the I&M Canal.
Today’s ride was nice, but distinctly chillier than last week’s. I’ll soon be hanging up the bike until spring.