Wil Shipley and ISBNs

Wil Shipley recently posted an article on his blog that suggests he doesn’t understand how ISBNs work. That can’t be right, can it?

The purpose of the post is, apparently, to make three points:

  1. Wil Shipley is a programming god.
  2. Wil Shipley has had sex.
  3. When a program can’t cover every possibility, it’s worth writing it to cover common cases and then fail gracefully.

I won’t bother addressing the first two points. Shipley seems to include these in everything he writes, so I’m sure others have already covered them. The third point, though, is interesting, although I’m surprised that a professional programmer would consider it a point worth making. Even as a non-professional programmer, and one whose programs are mainly for his own use, I try to make my programs fairly tolerant of disparate input, especially those I expect to use repeatedly and over a long time.

Even more surprising is that the example he gives has mistakes. Here’s the background: Delicious Library, Shipley’s flagship product, has a single input field for the user to enter search terms. The user can enter an author’s name, title words, keywords, or an ISBN to search on. Because

Shipley wanted a routine that tolerates a variety of ISBN formats. As he says, simply stripping the dashes and spaces won’t do, because titles and author names are likely to have dashes and spaces.

Here’s what he comes up with:

- (IBAction)findMatchingItems:(id)sender;
{
  NSString * noSpacesOrDashesString = [[self.keywordsString
    stringByReplacingOccurrencesOfString:@"-" withString:@""]
    stringByReplacingOccurrencesOfString:@" " withString:@""];

  BOOL containsOnlyDigits = YES;
  BOOL containsOnlyDigits = YES;
  for (NSUInteger characterIndex = 0; characterIndex <
      noSpacesOrDashesString.length; characterIndex++) {
    containsOnlyDigits &= [[NSCharacterSet decimalDigitCharacterSet]
        characterIsMember:[noSpacesOrDashesString characterAtIndex:characterIndex]];
    if (!containsOnlyDigits)
      break;
  }
  if (containsOnlyDigits) {
    switch (noSpacesOrDashesString.length) {
      case LIISBNDigitCount: case LIUPCDigitCount: case LIEANDigitCount:
        self.keywordsString = noSpacesOrDashesString;
      default:
        break;
    }
  }

  [...search...]
}

Let’s assume that the doubled declaration and initialization of containsOnlyDigits is a typo that crept in during a copy and paste. The real problem I have is with the for loop that goes through a copy of the input string—a copy that’s been stripped of its dashes and spaces—and checks to make sure that every character is a digit. This will exclude legitimate ISBNs because the last “digit” of a ten-digit ISBN can be an X.
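A check that tolerates the final X doesn't have to be complicated. Here's a minimal sketch in plain C, which also compiles inside an Objective-C source file. The name has_isbn10_shape is hypothetical, not anything from Delicious Library, and the function only tests the shape of the input; it doesn't verify the check digit.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper (not Shipley's code): returns true when `input`,
   ignoring spaces and dashes, is nine digits followed by a digit or an X. */
bool has_isbn10_shape(const char *input)
{
    int count = 0;                    /* significant characters seen so far */

    for (const char *p = input; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;

        if (c == ' ' || c == '-')
            continue;                 /* separators are ignored */

        count++;
        if (count > 10)
            return false;             /* too long for a ten-digit ISBN */

        if (isdigit(c))
            continue;                 /* digits are fine in any position */

        if (count == 10 && (c == 'X' || c == 'x'))
            continue;                 /* X is allowed only as the last character */

        return false;                 /* anything else rules out an ISBN-10 */
    }

    return count == 10;
}
```

With this, "0-8044-2957-X" and "0 306 40615 2" pass, while "Moby-Dick" doesn't, so a title with a dash in it is still treated as a title.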

(I knew about final X because several years ago I wrote a little script that collected MARC record data from the Library of Congress. The script would accept

and had to distinguish among them to make the appropriate query to the LoC database. See? Even non-god programmers can write code that distinguishes between various input types.)
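For context, the X isn't an oddity; it falls out of how the ten-digit ISBN's check digit is defined. The ten characters, weighted 10 down to 1, must sum to a multiple of 11, so the check value can be anything from 0 to 10, and a 10 is written as X. A small sketch in plain C, with isbn10_check_char as a hypothetical name:

```c
#include <ctype.h>

/* Hypothetical helper: given the first nine digits of an ISBN-10, returns
   the check character ('0'-'9' or 'X'). Returns '\0' if any of the first
   nine characters is not a digit (including a string that ends early). */
char isbn10_check_char(const char *first_nine)
{
    int sum = 0;

    for (int i = 0; i < 9; i++) {
        unsigned char c = (unsigned char)first_nine[i];
        if (!isdigit(c))
            return '\0';              /* also stops at a premature terminator */
        sum += (10 - i) * (c - '0');  /* weights run 10, 9, ..., 2 */
    }

    /* The check value (weight 1) brings the total to a multiple of 11. */
    int check = (11 - sum % 11) % 11;

    return check == 10 ? 'X' : (char)('0' + check);
}
```

If check values are spread more or less evenly, roughly one ten-digit ISBN in eleven should end in X.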

Is the final X rare? Not on my bookshelves, it isn’t. Within a minute I found 5 books with final Xs:

So, is the code in Delicious Library wrong? Or did Shipley present us with a simplified version, something that wasn’t really used in production code? The first is hard to believe. The second seems like a cheat, especially since his purpose was to show us the complexity necessary to handle a variety of inputs.

Update 8/22/09
Shipley informs me via Twitter (here and here) that the code, which he agrees is wrong with regard to trailing Xs in ISBNs, is pre-beta, not even close to production code yet. So the answer to both my questions above is “No.” He’s giving us a peek into a work in progress, so a bug or two isn’t surprising.
