Wgetting the Octave documentation

I realized today that I didn’t have a local copy of the Octave manual on either of my computers. This probably isn’t a big deal on the desktop machine at work, where an internet connection is pretty much assured, but it is on my MB Air, which I often use away from WiFi. Because there’s no zipped package with the full Octave manual in HTML form available, I downloaded it using wget and did a little file manicuring with sed to make the files work better offline.

Wget is, like curl, a command line tool for downloading files via HTTP, HTTPS, and FTP. The advantage of using wget to download a large set of files, as we need to do with Octave, is that it works recursively; you can tell it to download a particular HTML page and all the files linked on that page and all the files linked on the linked pages, etc. (To avoid downloading the entire internet, there’s an option to limit how deep the linking goes.) Unfortunately, wget isn’t included by default with OS X, but it’s easy to install if you have the Developer Tools—and I have to assume that if you read this blog, you have the Developer Tools installed.

The latest version of wget is 1.13, but that requires a library OS X doesn’t have, so I suggest you download the source of version 1.12. Once it’s in your Downloads folder and expanded, open a Terminal window, cd into the directory and run

./configure
make
sudo make install

This will put a copy of wget in your /usr/local/bin directory, and as long as that directory is in your $PATH, you’re good to go.
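If you’re not sure whether /usr/local/bin is in your $PATH, a quick check along these lines should tell you. This is only a sketch: path_contains is a helper name made up for this example, not a standard command.

```shell
#!/bin/sh
# path_contains DIR [PATHSTRING]
# Hypothetical helper: succeeds if DIR is one of the colon-separated
# entries in PATHSTRING (which defaults to the current $PATH).
path_contains() {
  case ":${2-$PATH}:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if path_contains /usr/local/bin; then
  echo "/usr/local/bin is in PATH"
else
  echo "/usr/local/bin is not in PATH"
fi

# Show which wget (if any) the shell will actually run.
command -v wget || echo "wget not found in PATH"
```

If the directory turns out to be missing, adding a line like export PATH="/usr/local/bin:$PATH" to your shell startup file is the usual fix.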

Update 10/2/11
Thanks to the comments by John and Dexter, I’ve learned that you can compile version 1.13 without installing another library. Just use

./configure --with-ssl=openssl
make
sudo make install

I’m surprised—and more than a little annoyed—that ./configure --help doesn’t mention this option.

To download the Octave manual, cd to an appropriate directory (I use ~/doc) and issue this command:

wget -r -l 2 http://www.gnu.org/software/octave/doc/interpreter/index.html 

The -r option tells wget to recurse through the linked pages, and the -l 2 option says to go only two levels deep. When the command is done running (which will take a couple of minutes unless you have a really fast connection), you’ll see a new folder called www.gnu.org in your directory. It has this set of subdirectories:

|-www.gnu.org
|---licenses
|---philosophy
|---software
|-----octave
|-------doc
|---------interpreter
|-----texinfo

We want to pull the interpreter folder up to our level, so run this command:

cp -R www.gnu.org/software/octave/doc/interpreter ./

Now we need to grab the CSS file and put it with all the HTML files.

cp www.gnu.org/software/octave/octave.css interpreter/

Time to say good-bye to all the stuff we don’t need.

rm -rf www.gnu.org

Now let’s do a little editing of the HTML files.

cd interpreter
sed -i .bak 's|\.\./\.\./octave|octave|;s|doc%2|doc_002|g' *.html

The first sed command changes all the stylesheet links from

<link rel="stylesheet" type="text/css" href="../../octave.css">

to

<link rel="stylesheet" type="text/css" href="octave.css">

and the second fixes a weird typo in all the files. Even on the web site, in-page links are screwed up because every place there should be a doc_002 in a link URL, there’s a doc%2 instead.
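To see both substitutions at work without touching the real files, you can try them on a throwaway file. The two sample lines below are made up for illustration (the anchor name in particular is hypothetical), and the sketch writes -i.bak with no space, a form that both the BSD sed on OS X and GNU sed should accept.

```shell
#!/bin/sh
# Build a scratch file with one stylesheet link and one broken anchor link.
workdir=$(mktemp -d)
cat > "$workdir/sample.html" <<'EOF'
<link rel="stylesheet" type="text/css" href="../../octave.css">
<a href="Sample-Page.html#doc%2dexample">example</a>
EOF

# The same two substitutions, given as separate -e expressions.
sed -i.bak -e 's|\.\./\.\./octave|octave|' -e 's|doc%2|doc_002|g' "$workdir/sample.html"

# Keep the result in a variable, show it, and clean up.
fixed=$(cat "$workdir/sample.html")
printf '%s\n' "$fixed"
rm -rf "$workdir"
```

The stylesheet link should come out pointing at octave.css, and the anchor should end in #doc_002dexample.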

The last step is to delete the backup files that sed made.

rm *.bak

Now we have a self-contained directory with the entire Octave manual. All the links work and point to other files in the directory. You’ll probably want to change the name of the directory to something more descriptive than interpreter; I use (here’s a surprise) octave.

If you don’t want the fun of doing every command individually, you can save the whole set of commands in a single file and run it. Here they are:

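#!/bin/bash

# Fetch the manual and everything it links to, two levels deep.
wget -r -l 2 http://www.gnu.org/software/octave/doc/interpreter/
# Pull the manual up to the current directory and add its stylesheet.
cp -R www.gnu.org/software/octave/doc/interpreter ./
cp www.gnu.org/software/octave/octave.css interpreter/
# Discard the rest of the download.
rm -rf www.gnu.org
# Fix the stylesheet links and the doc%2 anchors, then remove sed's backups.
# Stop if the cd fails so sed and rm never run in the wrong directory.
cd interpreter || exit 1
sed -i .bak 's|\.\./\.\./octave|octave|;s|doc%2|doc_002|g' *.html
rm *.bak
cd ..
mv interpreter octave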

I keep a link to the ~/doc/octave/index.html file in my bookmarks bar so I can quickly look up Octave commands in a Safari window while I’m programming.