Where modules go to die
April 30, 2012 at 10:20 PM by Dr. Drang
When I wrote my “Python doesn’t play well with others” post a couple of weeks ago, I didn’t know that I was treading on ground already well-traveled by Kenneth Reitz. I should have known that the programmer behind the envoy library—which I mentioned in that post and which I’ve adopted as my go-to library for running external processes—would have an overarching interest in making Python libraries simpler and more obvious for common uses.
In addition to envoy, Reitz is the lead developer of
- requests, a library that simplifies HTTP communication;
- clint, a library for building command line tools; and
- tablib, a library for handling tabular data that can generate output in many formats.
He’s prepared a set of slides for a talk entitled “Python for Humans,” a shortened version of which he delivered at a Django conference earlier this month.
In this abbreviated talk, he spent most of his time on his requests library and the problems with urllib2 that he was trying to solve. Reitz’s position is that several of the modules in the standard library violate one or more of the Zen of Python rules, especially
Beautiful is better than ugly.
If the implementation is hard to explain, it’s a bad idea.
and
There should be one—and preferably only one—obvious way to do it.
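To give a rough sense of the ceremony at issue, here is a sketch of setting up HTTP basic authentication with the standard library alone. It uses urllib.request, the Python 3 descendant of urllib2. The function name, URL, and credentials are placeholders made up for illustration, and nothing is sent over the network.

```python
import urllib.request


def build_basic_auth_opener(url, user, password):
    """Return an opener that sends basic-auth credentials for `url`.

    Hypothetical helper, named only for this sketch.
    """
    # A password manager has to be created and filled in first...
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, url, user, password)
    # ...then wrapped in a handler...
    handler = urllib.request.HTTPBasicAuthHandler(manager)
    # ...which is finally installed in an opener.
    return urllib.request.build_opener(handler)


# Placeholder URL and credentials; no request is made here.
opener = build_basic_auth_opener("https://example.com/api", "user", "secret")
# opener.open("https://example.com/api") would perform the actual request.

# For comparison, the third-party requests library does the same in one call:
#     requests.get("https://example.com/api", auth=("user", "secret"))
```

Three objects are built before a single byte goes out, which is the kind of layering the "hard to explain" rule warns about.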
In the talk, Reitz describes several possibilities a new Python programmer would have to sift through if she were trying to write a program that communicated via HTTP. We saw a similar thing in my subprocess post: the standard library contains at least three ways to call an external program from within Python.
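For concreteness, here is a minimal sketch of three standard-library routes to running an external program, written for Python 3 and assuming a POSIX shell. The function names are invented for this example, and the "external program" is just the running interpreter asked for its version.

```python
import os
import shlex
import subprocess
import sys

# The command is the current interpreter with --version, chosen only so
# the sketch does not depend on any other program being installed.
ARGS = [sys.executable, "--version"]
# os.system and os.popen take one shell string (POSIX quoting assumed).
SHELL_CMD = " ".join(shlex.quote(arg) for arg in ARGS)


def via_os_system():
    # Output goes straight to the terminal; only a status code comes back.
    return os.system(SHELL_CMD)


def via_os_popen():
    # Returns the command's standard output as a string.
    with os.popen(SHELL_CMD) as pipe:
        return pipe.read()


def via_subprocess():
    # Takes an argument list, so no shell quoting is needed.
    result = subprocess.run(ARGS, capture_output=True, text=True, check=True)
    return result.stdout
```

Each route takes its command in a different form and hands back something different, so a newcomer has to learn all three just to decide which one to use.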
I suspect that some of the standard libraries are convoluted because there’s a sense that they have to cover all cases. The implementation then gets weighed down by the need to handle every edge condition imaginable, and the simple, common uses are buried under layers of objects and method parameter lists.
Reader Carl said, in a comment to another of my Python library complaints:
The problem with the “batteries included” philosophy of Python is that a lot of batteries got written back in the Python 2.0 days and haven’t been updated since, but because they already exist, no third party libraries catch on to serve the same niche.
This is exactly right. The standard libraries constitute one of Python’s great features, but they tend to suck up all the oxygen. Programmers are reluctant to write libraries that duplicate their functions, so poor libraries in the standard set persist. Only a few people, like Reitz, are willing to write modules that compete with the standards.
Near the end of his talk, Reitz says, “The standard library is where modules go to die.” An overstatement, certainly, but with more than a germ of truth. Once a library is enshrined in the standard set, it can’t change radically because too many programs rely on it—and its bugs, idiosyncrasies, and complications—remaining stable.
Several Python programmers on Hacker News thought my complaints about the subprocess module were unfounded. I was glad to see that core developer Nick Coghlan agreed with me, as did Reitz:
An awesome article on the frustrations of using the subprocess module: leancrew.com/all-this/2012/…
— Kenneth Reitz (@kennethreitz) Tue Apr 17 2012
The “awesome” is hyperbole, but it’s nice to see people with real influence in the Python community recognizing the value of making the language and its libraries accessible to lesser programmers.