Accidents and estimates

I came very close to being in a car accident on Friday, which got me thinking about kinematics and estimation in engineering calculations.

I was stopped in a line of cars at a traffic light when something—probably the squeal of brakes, although I may have heard that only afterward—made me look up into my rear view mirror. A car going way too fast was coming up behind me. I leaned back in my seat, put my hands on my head, and closed my eyes, waiting for the impact.

Which never came.

After a couple of seconds, I opened my eyes and looked in the mirror. There was a car back there, slanted out toward the shoulder, but it was much further away than I expected. Then I noticed a car on the shoulder to my right and ahead of me. That was the car I had expected to hit me. The driver had managed to swerve to the right and avoid me.

That led to some conflicting feelings. I was pleased he was skillful enough to steer out of the accident but angry at his stupidity in needing to exercise that skill. Then the engineer in me took over. Given where he came to a stop ahead of me, how fast would he have hit me if he hadn’t veered to the right?

It’s a pretty simple calculation, the kind you learn in high school physics. There are two equations of kinematics we need:

[d = v_0 t - \frac{1}{2} \alpha g t^2]

and

[v_0 = \alpha g t]

These cover the period from when his front bumper passed my rear bumper to when he came to rest. The distance traveled is [d], his speed at the beginning of this period is [v_0], the duration is [t], and the deceleration (assumed constant) is [\alpha g]. It’s common in situations like this to express the acceleration or deceleration as a fraction of the acceleration due to gravity; [\alpha] is a pure number.

We don’t really care about [t], so with a little algebra we can turn these into a single formula with only the variables of interest:

[v_0 = \sqrt{2 \alpha g d}]

Based on where the car ended up, I’d say [d] is about 25 feet. The deceleration factor, [\alpha], is a bit more dicey to estimate, but it’s likely to be somewhere around 0.6 to 0.8. And since we’re using feet for distance, we’ll use 32.2 [\mathrm{ft/s^2}] for [g]. That gives us a range of values from 31 to 36 [\mathrm{ft/s}] for [v_0]. Converting to more conventional units for car speeds, that puts him between 21 and 24 mph. That would have been a pretty good smack. Not only would my trunk have been smashed in, I likely would have had damage to my front bumper from being pushed into the car ahead of me.
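If you want to check the arithmetic, here’s the calculation as a few lines of Python (the 25-foot distance and the 0.6–0.8 range for [\alpha] are just my estimates from above):

from math import sqrt
g = 32.2                          # acceleration due to gravity, ft/s^2
d = 25                            # estimated stopping distance, ft
for alpha in (0.6, 0.8):
    v0 = sqrt(2 * alpha * g * d)  # speed at the start of braking, ft/s
    print(f"alpha = {alpha}: v0 = {v0:.0f} ft/s = {v0 * 3600 / 5280:.0f} mph")

That prints 31 ft/s (21 mph) for the low end of [\alpha] and 36 ft/s (24 mph) for the high end, matching the figures above.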

This was a simple calculation, but it illustrates an interesting feature of estimation. Despite starting with a fairly wide range in our estimate of [\alpha] (0.8 is 33% higher than 0.6), we ended up with a much narrower range in the estimate of [v_0] (24 is only about 15% higher than 21). For this we have the square root to thank: it cuts the relative error roughly in half.

Why? Let’s say we have this simple relationship:

[a = \sqrt{b}]

We can express our uncertainty in the value of [b] by saying our estimate of it is [b (1 + \epsilon)], where [\epsilon] is the relative error in the estimate. We can then say

[\sqrt{b(1 + \epsilon)} = \sqrt{b}\sqrt{1 + \epsilon}]

and using the Taylor series expansion of the second term about [\epsilon = 0], we get

[\sqrt{b} \left( 1 + \frac{1}{2}\epsilon + \mathrm{h.o.t} \right)]

If the absolute value of [\epsilon] is small, the higher order terms (h.o.t) won’t amount to much, and the relative error of [a] will be about half that of [b].
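Here’s a quick numerical check of that approximation; the value of [b] and the sample errors are arbitrary:

from math import sqrt
b = 50.0                                      # arbitrary true value
for eps in (0.05, 0.10, 0.33):
    a_err = sqrt(b * (1 + eps)) / sqrt(b) - 1 # relative error in a
    print(f"eps = {eps:.2f}: error in a = {a_err:.3f}, eps/2 = {eps/2:.3f}")

Even at [\epsilon = 0.33], the relative error in [a] comes out to about 0.15, close to the predicted [\epsilon / 2].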

Lots of engineering estimates aren’t as forgiving as this one, so it’s important to know when your inputs have to be known precisely and when you can be a little loose with them.

Speaking of forgiving, I searched for rear end crash test results for my car to see how much damage it would have taken. I came up empty, but here’s a more recent model undergoing an impact at twice the speed.


I don't want nobody nobody sent

I have been known to complain bitterly about Apple’s decline in software quality. Sometimes the complaints have been typed into Twitter; more often they’ve been spit out between clenched teeth as yet another damned thing doesn’t do what it should. But iOS 13 has added one new feature that is both incredibly valuable and actually works.

It’s Silence Unknown Callers, an option you can find in the Phone section of the Settings app.

Silence unknown callers

In my experience, it does exactly what it says and has already saved me lots of time and frustration. It’s not that I often answered spam calls; I had already trained myself to almost never pick up a call that my phone didn’t associate with a contact. But I still had to stop what I was doing and look at my phone or my watch whenever one came in. Now that’s a thing of the past.

I’ve seen many people say they can’t use Silence Unknown Callers because they often need to take cold calls. I pity those people. I did wonder myself whether it was OK to silence business calls from prospective new clients who aren’t yet in my contacts list, but a little thought led me to the conclusion that those callers always leave messages and aren’t offended by having to do so.

Oddly enough, my first and only bad experience with Silence Unknown Callers was the exact opposite of missing an important call. A day or two after I had turned it on, a spam call rang on my phone. My initial reaction was that Apple had screwed up (yet again), but no. The call rang because the number was in my Contacts. Over several years I had collected spam numbers into a special contact—called AAASpammer to put it at the top of the list—that was blocked. I had apparently mistakenly unblocked him,1 and because the caller was reusing a number associated with that now-unblocked contact, the call rang through. I deleted AAASpammer from Contacts and have not been bothered by a spam call since.

If you have any sense of the history of Chicago machine politics, you will recognize the source of the post’s title. Spam callers are nobodies that nobody sent.


  1. In my experience, when adding a new number to a blocked contact, you had to unblock the caller and then reblock him to get the newly added number to “take” (whether this was a bug or an idiotic design choice by Apple, I never knew). I must have missed a step in this dance the last time I added a number and left the contact unblocked. 


Data cleaning from the command line

If you’ve been reading John D. Cook’s blog recently, you know he’s been writing about “computational survivalism,” using the basic Unix text manipulation tools to process data. I know Kieran Healy has also written about using these tools instead of his beloved R for preprocessing data; this article talks a little about it, but I feel certain there’s a longer discussion somewhere else. You should probably just read all his posts. And John Cook’s, too. I’ll be here when you’re done.

I use Unix tools to manipulate data all the time. Even though a lot of my analysis is done via Python and Pandas, those old command-line programs just can’t be beat when it comes to wrangling a mass of text-formatted data into a convenient shape for analysis. I have a particular example from work I was doing last week.

I needed to analyze dozens of data files from a series of tests run on a piece of equipment. The data consisted of strain gauge and pressure gauge readings taken as the machine was run. After a brief but painful stint of having the test data provided as a set of Excel files, I got the technician in charge of the data acquisition to send me the results as CSV. That’s where we’ll start.

Each data file starts with 17 lines of preamble information about the software, the data acquisition channels, the calibration, and so on. This is important information for documenting the tests—and I make sure the raw files are saved for later reference—but it gets in the way of importing the data as a table.

I figured a simple sed command would delete these preamble lines, but I figured wrong. For God knows what reason, the CSV file that came from the computer that collected the data was in UTF-16 format (is this common in Windows?), even though there wasn’t a single non-ASCII character in the file. UTF-16 is not something sed likes.

So I took a quick look at the iconv man page and wrote this one-liner to get the files into a format I could use:

for f in *.csv; do iconv -f UTF-16 -t UTF-8 "$f" > "$f.new"; done

I suppose I could have chosen ASCII as the “to” (-t) format, but I’m in the habit of calling my files UTF-8 even when there’s nothing outside of the ASCII range.

After a quick check with BBEdit to confirm that the files had indeed converted, I got rid of the UTF-16 versions and replaced them with the UTF-8 versions:

rm *.csv
rename 's/\.new//' *.new

The rename command I use is my adaptation of Larry Wall’s old Perl script. The first argument is a Perl-style substitution command.

With the files in the proper format, it was time to delete the 17-line preamble. The easiest way to do this is with the GNU version of sed, which can be installed through Homebrew:

gsed -i '1,17d' *.csv

The sed that comes with macOS requires a bit more typing:

for f in *.csv; do sed -i.bak '1,17d' "$f"; done

The built-in sed forces you to do two things:

1. Include an extension with the `-i` switch so you have backups of the original files.
2. Use a loop to go through the files. A command like

    `sed -i.bak '1,17d' *.csv`

would concatenate all the CSV files together and delete the first 17 lines of that. The upshot is that only the first CSV file would have its preamble removed.

Hence my preference for gsed.

With the preambles gone, I now had a set of real CSV files. Each one had a header line followed by many lines of data. I could have stopped the preprocessing here, but there were a couple more things I wanted to change.

First, the data acquisition software inserts an “Alarm” item after every data channel, effectively giving me twice as many columns as necessary. To get rid of the Alarm columns, I needed to know which ones they were. I could have opened one of the files in BBEdit and started counting through the header line, but it was more conveniently done via John Cook’s header numbering one-liner:

head -1 example.csv | gsed 's/,/\n/g' | nl

Because all the CSVs have the same header line, I could run this on any one of them. The head -1 part extracts just the first line of the file. That gets piped to the sed substitution command, which converts every comma into a newline. Finally, nl prefixes each of the lines sent to it with its line number. What I got was this:

     1  Scan
     2  Time
     3  101 <SG1> (VDC)
     4  Alarm 101
     5  102 <SG2> (VDC)
     6  Alarm 102
     7  103 <SG3> (VDC)
     8  Alarm 103
     9  104 <SG4> (VDC)
    10  Alarm 104
    11  105 <SG5> (VDC)
    12  Alarm 105
    13  106 <SG6> (VDC)
    14  Alarm 106
    15  107 <SG7> (VDC)
    16  Alarm 107
    17  108 <SG8> (VDC)
    18  Alarm 108
    19  109 <Pressure> (VDC)
    20  Alarm 109

With the headers laid out in numbered rows, it was easy to construct a cut command to pull out only the columns I needed:

for f in *.csv; do cut -d, -f 1-3,5,7,9,11,13,15,17,19 "$f" > "$f.new"; done

I’d like to tell you I did something clever like

seq -s, 3 2 19 | pbcopy

to get the list of odd numbers from 3 to 19 and then pasted them into the cut command, but I just typed them out like an animal.

Once again, I ran

rm *.csv
rename 's/\.new//' *.new

to get rid of the old versions of the files.

The last task was to change the header line to a set of short names that I’d find more convenient when processing the data in Pandas. For that, I used sed’s “change” command:

gsed -i '1cScan,Datetime,G1,G2,G3,G4,G5,G6,G7,G8,P' *.csv

With this change, I could access the data in Pandas using simple references like

df.G1

instead of long messes like

df['101 <SG1> (VDC)']
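For what it’s worth, here’s a minimal sketch of reading one of the cleaned files in Pandas (the filename is a stand-in; the column names come from the rewritten header):

import pandas as pd
df = pd.read_csv("test01.csv")   # "test01.csv" is a stand-in filename
print(df.G1.describe())          # summary statistics for strain gauge 1
print(df.P.max())                # peak pressure channel reading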

All told, I used nine commands to clean up the data. If I had just a few data files, I might have done the editing by hand. It is, after all, quite easy to delete rows and columns of a spreadsheet. But I had well over a hundred of these files, and even if I had the patience to edit them all by hand, there’s no way I could have done it without making mistakes here and there.

Which isn’t to say I wrote each command correctly on the first try. But after testing each command on a single file, I could then apply it with confidence to the full set.1 The tests were run in several batches over a few days. Once I had the commands set up, I could clean each new set of data in no time.
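If you’re worried about a stray file slipping through, a sanity check along these lines (a sketch, not one of the nine commands above) confirms that every cleaned file got the new header:

import glob
import pandas as pd
expected = ["Scan", "Datetime", "G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8", "P"]
for f in sorted(glob.glob("*.csv")):
    cols = list(pd.read_csv(f, nrows=0).columns)   # read just the header line
    if cols != expected:
        print(f"{f}: unexpected columns: {cols}")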


  1. And it took a lot less time to develop and test the commands than it did to write this post. 


Old bugs never die or fade away

A few days ago, I ran into an odd and very old bug that isn’t even Apple’s fault. Imagine that!

My discovery did start with an Apple bug, though. I was working on my larger iPad Pro and saved a couple of email attachments to a new folder in iCloud Drive. At least, that’s what I thought I’d done. When I opened Files on my smaller iPad a few minutes later, neither the files nor their enclosing folder, which I had created on the spot while saving the files, was visible. Probably just a brief delay in syncing, I thought, but I was wrong. When I looked again half an hour later, nothing had synced. I could see the new files and folder on the large iPad but not the small one.

I wondered if the files had synced to my Macs. So I opened up Transmit on one of the iPads and tried to SFTP into my home iMac to look for the files. Couldn’t connect. Tried to SFTP into my office iMac and couldn’t connect there, either. I wondered if maybe iOS 13 was finally the end of Transmit for iOS, but a quick test showed I was able to connect to the server that hosts this blog. Also, I was able to use Prompt to log into both iMacs via SSH, so it was clear that their SSH server daemons were up and running.

Several unfruitful Googlings later, I sat down at my home iMac and tried to diagnose the problem. When I entered

sftp localhost

into Terminal, I got this response:

Received message too long 1113682792
Ensure the remote shell produces no output for non-interactive sessions.

I didn’t understand how the remote shell could be set up to produce no output—shells are supposed to produce output, aren’t they?—but I figured this sort of raw, system-level error message would be a better search term than the Transmit error messages, which were written by Panic.

And sure enough, I quickly found this Stack Exchange question, for which the first answer was the solution. And it led to this OpenSSH FAQ, which has the official answer:

sftp and/or scp may fail at connection time if you have shell initialization (.profile, .bashrc, .cshrc, etc) which produces output for non-interactive sessions. This output confuses the sftp/scp client.

So the problem is not that the shell produces output, it’s that it produces output upon initialization. And I knew immediately why my shells were doing that.

A month or so ago, I was messing around with different shells and different versions of shells, and to keep track of what was running, I had added these lines to the top of my .bashrc:

echo "Loading .bashrc..."
echo $BASH_VERSION

These were the culprits. They were fine in an interactive SSH session, but were triggering a long-standing bug in an SFTP session. I commented out the lines and could make SFTP connections via Transmit (and ShellFish and FileBrowser) to both iMacs again.
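As an aside, the absurd number in the error message isn’t random. SFTP messages begin with a four-byte big-endian length field, so the client reads the first four bytes of whatever the shell prints as a packet length. A couple of lines of Python decode the number from the error above:

import struct
# turn the reported "message length" back into the four bytes the
# sftp client actually read from the shell's startup output
print(struct.pack(">I", 1113682792))   # prints b'Bash'

So the mysterious 1113682792 is just ASCII text being read as an integer.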

The OpenSSH people don’t seem to think of this as a bug. The Stack Exchange answer says it’s been around at least ten years, and the FAQ says the fix is “you need to modify your shell initialization.”

In other words, “you’re initializing it wrong.”

By the way, once I got this SFTP stuff straightened out, I learned that the new files and folder (remember those? this is a song about Alice) weren’t on either of my Macs, so some combination of Files and iCloud Drive had screwed up. While still on my Mac, I created a new folder and saved the email attachments into it. They appeared in Files on both iPads immediately, except that the new folder was called untitled folder on the larger iPad.

I hope it takes Apple less than ten years to recognize and clean up these iCloud/Files bugs.