Man alive again

Yesterday, I discovered that Apple had changed the URLs of all its online man pages. Without, I should add, creating redirects so old links would continue to work. This broke all the man page links I had here at ANIAT and undoubtedly broke links across the internet. Tonight, I fixed my broken links with a long but not especially complex shell command.

As I was writing yesterday’s post on toggling Desktop icons, I tried to link to the online man pages for the commands I was writing about, like defaults and killall. I couldn’t find them, though. Even Google returned links that led to errors (although I’m sure Google will get caught up with the new URLs soon).

[Image: Apple man page error message]

I complained on Twitter (as you do), and soon got an answer from Arvid Gerstmann:

@drdrang They seem to have moved them. You can still access the legacy pages, fortunately. developer.apple.com/legacy/library…
Arvid Gerstmann (@ArvidGerstmann) Apr 29 2016 9:27 AM

What this means is that a link that worked just a week or so ago, like

https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/wc.1.html

is now at

https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man1/wc.1.html

Most of the URL is the same, but the “mac” part is gone, and the whole library has been moved to the “legacy” subdirectory, which is kind of ominous. I suppose there could be another copy of the library outside the “legacy” subdirectory, but I haven’t found it, and so far neither has Google.

Apart from thwarting my attempt to add man page links to yesterday’s post, this change meant that all my previous links to man pages were dead. Great. To figure out how many that was, I used a pipeline of ack and wc:

ack 'developer\.apple\.com\/.*\/ManPages\/.*\.html' */*/*.md | wc -l

which told me there were 162 links to Apple online man pages in my Markdown source files that would have to be fixed. Actually, I thought it would be more, but it’s still way too many to fix manually.
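One caveat on that count: `wc -l` counts matching lines, so a line carrying two links is counted once. The sketch below shows the difference on a made-up sample file. It uses `grep` rather than `ack` only so it runs where `ack` is not installed, and the tightened pattern (`[^ >)]*` instead of `.*`) is an assumption about how the URLs are delimited, not something from the original pipeline.

```shell
# Hypothetical sample standing in for one Markdown source file.
sample=$(mktemp "${TMPDIR:-/tmp}/manlinks.XXXXXX")
cat > "$sample" <<'EOF'
See [wc][1] for details.
[1]: https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/wc.1.html
Inline: <http://developer.apple.com/documentation/Darwin/Reference/ManPages/man1/sed.1.html> and <http://developer.apple.com/documentation/Darwin/Reference/ManPages/man1/xargs.1.html>
EOF

# Number of lines containing at least one man page link
# (the same thing the ack | wc -l pipeline measures).
line_count=$(grep -cE 'developer\.apple\.com/.*/ManPages/.*\.html' "$sample")

# Number of individual links: -o prints every match on its own line.
# The bracket expressions stop each match at a space, ">" or ")".
url_count=$(grep -oE 'developer\.apple\.com/[^ >)]*/ManPages/[^ >)]*\.html' "$sample" \
  | wc -l | tr -d ' ')

printf 'lines: %s, links: %s\n' "$line_count" "$url_count"
```

If every link sits on its own line, as with reference-style Markdown links, the two numbers agree.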

By the way, while exploring the ack results, I learned this wasn’t the first time Apple’s changed the man page URLs—it’s just the first time I’ve noticed it. In 2006, URLs looked like this:

http://developer.apple.com/documentation/Darwin/Reference/ManPages/man1/wc.1.html

Back then, there was no library/mac/ portion to the URL. That seemed to last into 2009.

Capitalization wasn’t always consistent on the documentation portion. Sometimes it was DOCUMENTATION. Sometimes a hash mark would sneak in, making it #documentation, which is kind of weird.

To fix these problems, I decided to use sed for in-place editing of the Markdown source. I am not a sed expert, and I’m absolutely certain this is not the most elegant way to use it, but it was an efficient use of my time, which is what I cared about most. Here’s the pipeline, which I’ve split over two lines:

ack -l 'developer\.apple\.com\/.*\/ManPages\/.*\.html' */*/*.md \
| xargs sed -i '' -E 's/(developer\.apple\.com\/).*(\/ManPages\/.*\.html)/\1legacy\/library\/documentation\/Darwin\/Reference\2/'

The ack command that starts it off is basically the same as what I showed before, but it uses the -l switch to give me just the list of file names that have matches, not the matches themselves.

The list of filenames is then passed to sed via xargs, which we talked about a couple of weeks ago. The sed switches are -i '', which tells it to do its editing in place with no backup (see below), and -E, which tells it to use “extended” regular expression syntax, not the “basic” (i.e., shitty) syntax sed originally used back in the ’70s. No one should use basic regex syntax.
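A portability note: `-i ''` with a separate empty argument is the BSD form that ships with macOS. GNU sed expects any backup suffix to be attached directly to `-i`. The sketch below uses `-i.bak`, which both variants accept and which also keeps a backup. The scratch file is hypothetical, standing in for a real Markdown source.

```shell
# Hypothetical scratch file holding one old-style link.
scratch=$(mktemp "${TMPDIR:-/tmp}/manlinks.XXXXXX")
printf '%s\n' 'https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/wc.1.html' > "$scratch"

# -i.bak (suffix attached, no space) edits in place and leaves the
# original content in "$scratch.bak".
sed -i.bak -E 's/(developer\.apple\.com\/).*(\/ManPages\/.*\.html)/\1legacy\/library\/documentation\/Darwin\/Reference\2/' "$scratch"

edited=$(cat "$scratch")
backup=$(cat "$scratch.bak")
printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"
```

The cost is a `.bak` file next to every edited file, which has to be cleaned up afterward.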

The sed command itself is a substitution that maintains the developer.apple.com domain and the part from ManPages to the end, but changes everything in between to conform to the current address pattern.
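To see the substitution at work without touching any files, the sketch below runs it on standard input against the two URL shapes quoted earlier. The function names `fix_links` and `fix_links_strict` are made up for illustration. The second function is not part of the original pipeline: it is a variant that assumes URLs end at a space, `)` or `>`, which matters only when one line holds more than one link, because the greedy `.*` tends to swallow everything between the first domain and the last `/ManPages/` on the line.

```shell
# The substitution from the pipeline, applied to stdin.
fix_links() {
  sed -E 's/(developer\.apple\.com\/).*(\/ManPages\/.*\.html)/\1legacy\/library\/documentation\/Darwin\/Reference\2/'
}

# Hypothetical stricter variant: matches cannot cross a space, ")" or ">",
# and the g flag rewrites every link on the line.
fix_links_strict() {
  sed -E 's|(developer\.apple\.com/)[^ )>]*(/ManPages/[^ )>]*\.html)|\1legacy/library/documentation/Darwin/Reference\2|g'
}

old_2016='https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/wc.1.html'
old_2006='http://developer.apple.com/documentation/Darwin/Reference/ManPages/man1/wc.1.html'

new_2016=$(printf '%s\n' "$old_2016" | fix_links)
new_2006=$(printf '%s\n' "$old_2006" | fix_links)
printf '%s\n%s\n' "$new_2016" "$new_2006"

# Made-up line with two inline links, to compare the two variants.
two_links='[wc](https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/wc.1.html) and [sed](https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/sed.1.html)'
greedy=$(printf '%s\n' "$two_links" | fix_links)
strict=$(printf '%s\n' "$two_links" | fix_links_strict)
printf '%s\n%s\n' "$greedy" "$strict"
```

With one link per line, the two functions give the same result, so the simpler pattern is enough for sources laid out that way.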

There is a redundancy in this pipeline. Ack, after all, found all these URLs in the first place, so sed shouldn’t have to go back through the files and find them again. But avoiding that redundancy would have required cleverness, which I’m in short supply of, and it would have been a waste of my time anyway. As it was, the pipeline completed its work in the blink of an eye. Nothing I could have done to eliminate the redundant processing would have made it perceptibly faster.

So what about that “in-place editing with no backup”? Isn’t that stupid? Of course it is, but I tested the pipeline on a copy of the source tree and checked the results before running it on the original. I didn’t want this command anywhere near my original source files until I knew it worked.
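A test-on-a-copy workflow might look roughly like the sketch below. The directory layout, file names, and temporary paths are all made up for illustration. `grep -l` stands in for `ack -l`, and `-i.bak` replaces `-i ''` so the example runs with both BSD and GNU sed.

```shell
# Hypothetical stand-in for the source tree (year/month/post.md).
src=$(mktemp -d "${TMPDIR:-/tmp}/source.XXXXXX")
mkdir -p "$src/2016/04"
printf '%s\n' '[1]: https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/wc.1.html' \
  > "$src/2016/04/post.md"

# Work on a copy so the originals stay untouched.
copy=$(mktemp -d "${TMPDIR:-/tmp}/copy.XXXXXX")
cp -R "$src/." "$copy"

# Run the pipeline inside the copy only, then drop sed's backup files.
( cd "$copy" && grep -lE 'developer\.apple\.com/.*/ManPages/.*\.html' */*/*.md \
  | xargs sed -i.bak -E 's/(developer\.apple\.com\/).*(\/ManPages\/.*\.html)/\1legacy\/library\/documentation\/Darwin\/Reference\2/' )
find "$copy" -name '*.bak' -delete

# Review every change before running anything on the real tree.
# diff exits with status 1 when files differ, hence the "|| true".
changes=$(diff -r "$src" "$copy" || true)
printf '%s\n' "$changes"
```

Reading the `diff -r` output is the check: each changed line should differ only in the part of the URL between the domain and `/ManPages/`.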

So now I have man page links that work again, but that doesn’t change the fact that Apple is a terrible citizen of the web. Links, especially to programming documentation, should be maintained. If a directory structure has to be changed, redirects should be used to send visitors to the new locations. This is Web 101.