New SciPy Superpack
November 28, 2012 at 9:17 PM by Dr. Drang
Before I go any further, I should stress that the “new” in the title of the post means “new for me,” not “new to the rest of the world.” Because I’m still running Lion I need to use a version of the Superpack that’s a little behind the times. But it’s more up-to-date than the one I installed back in June.
The Superpack I’m talking about is Chris Fonnesbeck’s SciPy Superpack, a set of Python modules and a shell script that ties everything together and does the installation. The modules you get in the Superpack are:
- IPython (version 0.14.dev) This is the interactive Python shell that offers more than simply running python from the command line. In the past I’ve thought it not worth the effort, but after reading more about it in Wes McKinney’s Python for Data Analysis, I’ve decided to give it a whirl.
- NumPy (version 1.8.0.dev_436a28f_20120710) This is the base library upon which all other numerical and scientific Python modules are built. It provides classes and methods for fast matrix computations.
- SciPy (version 0.10.1) This is a set of modules for performing common scientific and engineering computations: numerical integration, optimization, linear algebra, Fourier transforms, statistics, and so on.
- matplotlib (version 1.2.x) The plotting library I’ve been using in place of Gnuplot.
- pandas (version 0.8.1.dev_8cc9826_20120717) A data analysis package for reading, writing, and manipulating data files. Given how much time I’ve spent in the past manipulating files to get them in shape for analysis, pandas seems like a godsend. But I haven’t had a good project to use it on yet.
- pymc (version 2.2) A module for Markov Chain Monte Carlo sampling. I haven’t done anything with Markov chains in ages, so I don’t see myself using this one.
- scikit_learn (version 0.12_git) A module for machine learning, something I’ve never dabbled in and don’t ever expect to.
- Statsmodels (version 0.5.0) A module for (surprise!) the statistical modeling of data. I can see myself using this, but I haven’t so far.
It also installs gFortran because some parts of these modules need to be compiled.
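If you want to confirm which versions actually ended up on your machine, a short script along these lines should do it. This is only a sketch: the import names in the list (in particular sklearn for scikit_learn) and the helper installed_versions are my own assumptions, not something taken from the Superpack installer.

```python
import importlib

# Import names for the Superpack modules. These are assumptions on my
# part; note that scikit_learn is imported as "sklearn".
SUPERPACK_MODULES = ["IPython", "numpy", "scipy", "matplotlib",
                     "pandas", "pymc", "sklearn", "statsmodels"]


def installed_versions(names):
    """Map each module name to its version string, to "unknown" if the
    module has no __version__ attribute, or to None if the import fails."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # A missing module raises ImportError; a broken binary build
            # can fail in other ways, so treat any failure as "not there".
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    found = installed_versions(SUPERPACK_MODULES)
    for name in SUPERPACK_MODULES:
        print("%s: %s" % (name, found[name] or "not installed"))
```

Running it before and after an upgrade gives a quick way to see which packages changed.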
When I first installed the Superpack in June, everything worked fine, but there was one small annoyance. Every time I imported something from the scipy.stats library, I’d get this warning:
RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility
I soon learned that this was not a warning I needed to worry about, but I still didn’t like seeing it. My hope was that more recent versions of these packages would suppress the warning. Luckily, that turned out to be true, and now I have the modules I want with no extraneous warnings when my scripts run.
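If you’re stuck with an older build and the warning bothers you, one option is to filter it out with Python’s standard warnings module before the offending import. This is a minimal sketch, not anything the Superpack does for you; the helper name silence_ndarray_size_warning is made up for illustration.

```python
import warnings


def silence_ndarray_size_warning():
    """Ignore only the 'numpy.ndarray size changed' RuntimeWarning."""
    warnings.filterwarnings(
        "ignore",
        # Regular expression, matched against the start of the message.
        message=r"numpy\.ndarray size changed",
        category=RuntimeWarning,
    )


# Call this before the import that triggers the warning, for example
# before "from scipy import stats".
silence_ndarray_size_warning()
```

Because the filter matches on the message text, other RuntimeWarnings still get through, which is safer than ignoring the whole category.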
The newer versions of the modules are installed as .egg files alongside the older versions in /Library/Python/2.7/site-packages. I assume I can just delete the older versions with no ill effect, but I haven’t bothered and probably won’t. Based on my experience in upgrading from Snow Leopard to Lion, I fully expect all of my site-packages libraries to get wiped out. There’s no sense in doing any maintenance on a directory that’s likely to need a complete rebuild in a month or so.
If you’re running Mountain Lion (and I suspect most of you are), you should get the fully up-to-date Superpack for 10.8. It has more recent versions of almost all of the packages.