Wil Shipley and ISBNs
August 21, 2009 at 11:10 PM by Dr. Drang
Wil Shipley recently posted an article on his blog that suggests he doesn’t understand how ISBNs work. That can’t be right, can it?
The purpose of the post is, apparently, to make three points:
- Wil Shipley is a programming god.
- Wil Shipley has had sex.
- When a program can’t cover every possibility, it’s worth writing it to cover common cases and then fail gracefully.
I won’t bother addressing the first two points. Shipley seems to include these in everything he writes, so I’m sure others have already covered them. The third point is interesting, though I’m surprised that a professional programmer would consider it worth making. Even as a non-professional programmer, and one whose programs are mainly for his own use, I try to make my programs fairly tolerant of disparate input, especially those I expect to use repeatedly and over a long time.
Even more surprising is that the example he gives has mistakes. Here’s the background: Delicious Library, Shipley’s flagship product, has a single input field for the user to enter search terms. The user can enter an author’s name, title words, keywords, or an ISBN to search on. Because
- Delicious Library does its search at Amazon,
- Amazon likes its ISBNs to come without dashes or spaces, and
- users will often include dashes or spaces when typing an ISBN (because that’s the way they usually appear on books),
Shipley wanted a routine that tolerates a variety of ISBN formats. As he says, simply stripping the dashes and spaces won’t do, because titles and author names are likely to have dashes and spaces.
Here’s what he comes up with:
    - (IBAction)findMatchingItems:(id)sender;
    {
        NSString * noSpacesOrDashesString = [[self.keywordsString
            stringByReplacingOccurrencesOfString:@"-" withString:@""]
            stringByReplacingOccurrencesOfString:@" " withString:@""];
        BOOL containsOnlyDigits = YES;
        BOOL containsOnlyDigits = YES;
        for (NSUInteger characterIndex = 0; characterIndex <
            noSpacesOrDashesString.length; characterIndex++) {
            containsOnlyDigits &= [[NSCharacterSet decimalDigitCharacterSet]
                characterIsMember:[noSpacesOrDashesString characterAtIndex:characterIndex]];
            if (!containsOnlyDigits)
                break;
        }
        if (containsOnlyDigits) {
            switch (noSpacesOrDashesString.length) {
                case LIISBNDigitCount: case LIUPCDigitCount: case LIEANDigitCount:
                    self.keywordsString = noSpacesOrDashesString;
                default:
                    break;
            }
        }
        [...search...]
    }
Let’s assume that the doubled declaration and initialization of containsOnlyDigits is a typo that crept in during a copy and paste. The real problem I have is with the for loop that goes through a copy of the input string (a copy that’s been stripped of its dashes and spaces) and checks to make sure that every character is a digit. This will exclude legitimate ISBNs because the last “digit” of a ten-digit ISBN can be an X.
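As a rough sketch of what a more tolerant check might look like, here is the same strip-and-test idea with an allowance for a trailing X. It's written in Python rather than Shipley's Objective-C, and it is not his code. The function name normalize_search_terms is made up for illustration, and the lengths 10, 12, and 13 are assumptions about the values of LIISBNDigitCount, LIUPCDigitCount, and LIEANDigitCount, which his post doesn't show.

```python
import re

# Assumed values; the post does not show what Shipley's LIISBNDigitCount,
# LIUPCDigitCount, and LIEANDigitCount constants actually are.
ISBN10_LENGTH = 10
UPC_LENGTH = 12
EAN_LENGTH = 13

# Nine ASCII digits followed by either a digit or an X (either case).
ISBN10_PATTERN = re.compile(r"[0-9]{9}[0-9Xx]")
# One or more ASCII digits and nothing else.
DIGITS_PATTERN = re.compile(r"[0-9]+")


def normalize_search_terms(terms):
    """Return a compact identifier if `terms` looks like one.

    Anything that doesn't look like an ISBN, UPC, or EAN is returned
    unchanged, so titles and author names keep their dashes and spaces.
    """
    candidate = terms.replace("-", "").replace(" ", "")

    # Ten-character ISBN: the last character may be X.
    if len(candidate) == ISBN10_LENGTH and ISBN10_PATTERN.fullmatch(candidate):
        return candidate.upper()

    # UPC or EAN (including 13-digit ISBNs): digits only.
    if len(candidate) in (UPC_LENGTH, EAN_LENGTH) and DIGITS_PATTERN.fullmatch(candidate):
        return candidate

    # Fall through: treat the input as ordinary keywords.
    return terms
```

With this version, "0-13-770819-X" comes back as "013770819X", while "Catch-22" and "1984" are left alone because they don't have the right length. It only checks the shape of the input; it makes no attempt to verify the check digit.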
(I knew about the final X because several years ago I wrote a little script that collected MARC record data from the Library of Congress. The script would accept
- the output of a CueCat scanner [remember them? too bad I don’t have a PS/2 port anymore],
- an ISBN,
- a Library of Congress control number, or
- title words and/or author names
and had to distinguish among them to make the appropriate query to the LoC database. See? Even non-god programmers can write code that distinguishes between various input types.)
Is the final X rare? Not on my bookshelves, it isn’t. Within a minute I found 5 books with final Xs:
- Reinforced Concrete : Mechanics and Design (ISBN 013770819X)
- Probability Concepts in Engineering Planning and Design (ISBN 047103200X)
- Theory of Vibration with Applications (ISBN 013914532X)
- Digital Filters (ISBN 048665088X)
- Design of Concrete Structures (ISBN 007071116X)
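The X isn't decoration: it stands for a check value of 10 in the ISBN-10 checksum, where the ten characters are weighted 10 down to 1 and the total must be divisible by 11. Here's a minimal Python sketch of that check, applied to the five books above. The function name isbn10_is_valid is my own invention, not anything from Delicious Library.

```python
def isbn10_is_valid(isbn):
    """Check a ten-character ISBN using the weighted sum modulo 11."""
    isbn = isbn.replace("-", "").replace(" ", "").upper()
    if len(isbn) != 10:
        return False

    total = 0
    for position, char in enumerate(isbn):
        if char == "X" and position == 9:
            value = 10  # X means 10 and is allowed only in the last place
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        # Weights run from 10 for the first character down to 1 for the last.
        total += (10 - position) * value

    return total % 11 == 0


# The five ISBNs listed above.
BOOKS = [
    "013770819X",
    "047103200X",
    "013914532X",
    "048665088X",
    "007071116X",
]

for isbn in BOOKS:
    print(isbn, isbn10_is_valid(isbn))
```

All five pass the check. Swapping the X for a 0 in the first one (0137708190) makes it fail, so a routine that rejects the X is throwing away a character the number needs.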
So, is the code in Delicious Library wrong? Or did Shipley present us with a simplified version, something that wasn’t really used in production code? The first is hard to believe. The second seems like a cheat, especially since his purpose was to show us the complexity necessary to handle a variety of inputs.
Update 8/22/09
Shipley informs me via Twitter (here and here) that the code, which he agrees is wrong with regard to trailing Xs in ISBNs, is pre-beta, not even close to production code yet. So the answer to both my questions above is “No.” He’s giving us a peek into a work in progress, so a bug or two isn’t surprising.