Two problems

As I read Jason Snell’s commentary on my Patterns post, I realized that no one—not on Twitter, not in email—had sent me Jamie Zawinski’s famous saying about regular expressions:

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.

Zawinski was making tweet-sized aphorisms long before Twitter.

Jeffrey Friedl, he of the definitive book on regular expressions, wrote a much-updated post on the origin of this saying and its antecedents. I recommend it if you have any interest in regexes, programming, or Unix. The comments are nearly as fun as the post itself.

Literal-minded people take Zawinski’s gibe too seriously and either parrot it as great wisdom or get into long arguments on how it’s wrong. I agree with Friedl and Jeff Atwood that Zawinski’s real target was not regexes themselves, but their overuse, which he saw as a bad programming practice encouraged by Perl. It may be hard to believe now, but back in the ’90s, Perl was being used everywhere. In his post, Friedl includes this other Zawinski quote:

The heavy use of regexps in Perl is due to them being far and away the most obvious hammer in the box.

This strikes me as exactly right and not hyperbolic in the least. When I switched from Perl to Python, the hardest thing to get used to was Python’s more verbose way of handling regexes. Regular expressions are so easy to use in Perl, and so intimately woven into the language, that I had gotten into the habit of using them everywhere. It wasn’t until I had been programming for a while in Python, where I preferred not to import re unless necessary, that I realized how much I had been overusing regexes. Judicious slicing, combined with literal string searches and splits, commonly works just as well as regexes on simple input, and proper parsing into data structures is almost always the better choice for highly organized data formats.
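As a rough illustration of the slicing-and-splitting point, here is a minimal sketch that pulls the same fields out of a simple line two ways. The log line, its layout, and the function names are all invented for the example; real input may well be messy enough to justify the regex.

```python
import re

# A made-up log line with a fixed-width date at the front (an assumption
# for this sketch, not a real log format).
LINE = "2013-02-17 08:15:42 WARNING disk usage at 91%"


def parse_with_regex(line):
    """Pull out the fields with a single regular expression."""
    m = re.match(r"(\d{4})-(\d{2})-(\d{2}) (\S+) (\w+) (.*)", line)
    year, month, day, time, level, message = m.groups()
    return int(year), int(month), int(day), time, level, message


def parse_with_strings(line):
    """Pull out the same fields with slicing and split, no import needed."""
    date = line[:10]                  # the date is always 10 characters
    time, level, message = line[11:].split(" ", 2)
    year, month, day = date.split("-")
    return int(year), int(month), int(day), time, level, message


# A literal search needs no regex at all.
is_warning = "WARNING" in LINE

print(parse_with_regex(LINE))
print(parse_with_strings(LINE))
print(is_warning)
```

Both functions return the same tuple for this input. The string version depends on the date being a fixed width and the fields being separated by single spaces, which is exactly the sort of simple input where a regex is more than the job requires.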

Even so, regexes are so often exactly what’s needed that I’m glad I have Patterns in my toolbox.