Range rage

I understand why Python’s range function works the way it does, and I usually use it correctly, but I still tend to mess up the three-parameter version. Even worse, I mess up its NumPy cousin, arange, which I find more useful than range itself, almost every time I use it. Today I decided to take action.

The root of the problem is list indices. Python inherited C’s zero-based indexing scheme. The first five items of an array named a are

a[0], a[1], a[2], a[3], a[4]

not

a[1], a[2], a[3], a[4], a[5]

as they would be in a language built for scientists and engineers, like, say, Fortran.[1]

C does this because it’s close to the metal,[2] and the index really represents an offset from the address of the start of the array. Thus the memory address of a[0] is the same as the address of a itself, a[1] sits one element past the address of a, and so on.
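The offset idea can be poked at from Python, too. Here's a minimal sketch that uses a NumPy array as a stand-in for a C array. The 8-byte integer dtype is my assumption, chosen only to make the arithmetic easy to read.

```python
import numpy as np

a = np.arange(5, dtype=np.int64)   # five 8-byte integers in one contiguous buffer
base = a.ctypes.data               # address of the buffer, which is also the address of a[0]

for i in range(len(a)):
    addr = a[i:].ctypes.data       # a[i:] is a view that starts at element i
    print(i, addr - base)          # byte offset from the start: i * a.itemsize
```

Element i lives i items past the start, which is exactly what a zero-based index says.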

I don’t know why Guido decided Python, which is decidedly not close to the metal, should use the same indexing scheme as C, but I suspect it has something to do with C being the mother tongue of most computer science types of his generation.

The list of numbers generated by range fits in with this zero-based mindset. The single-parameter version, range(5), returns

[0, 1, 2, 3, 4]

which are the indices of a five-element list. The default starting value of range is zero.
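A quick sketch of that correspondence, written for Python 3, where range returns a lazy sequence rather than a list, so it's wrapped in list() to show the values. The five-letter list is made up for illustration.

```python
a = ["p", "q", "r", "s", "t"]      # any five-element list will do

indices = list(range(len(a)))      # list() forces the lazy Python 3 range into a list
print(indices)

for i in range(len(a)):
    print(i, a[i])                 # every index from range(len(a)) is valid for a
```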

The two-parameter version allows you to set the starting value, so range(1, 5) returns

[1, 2, 3, 4]

which maintains the same end value. This is a little tricky, because the second parameter represents neither the end value nor the number of elements, but there is a consistency of sorts with the one-parameter version.

The three-parameter version allows you to set the step value, so

range(0, 10, 2)

returns

[0, 2, 4, 6, 8]

As with the one- and two-parameter versions, the second parameter, which the documentation calls the “stop” value, never appears in the list. To get 10 in the list, we have to use range(0, 11, 2) or range(0, 12, 2).
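One way to keep this straight is to count the values instead of guessing the last one. The sketch below assumes Python 3 and a positive step, and range_len is a hypothetical helper of mine, not part of the standard library.

```python
import math

def range_len(start, stop, step):
    """Number of values in range(start, stop, step), assuming step > 0."""
    return max(0, math.ceil((stop - start) / step))

for args in [(0, 10, 2), (0, 11, 2), (0, 12, 2)]:
    values = list(range(*args))
    # The stop value never appears; the last value is start + (count - 1) * step.
    print(args, values, len(values) == range_len(*args))
```

Both (0, 11, 2) and (0, 12, 2) give six values ending in 10, because the count only grows when stop passes a multiple of the step.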

As I said, I usually get this wrong, but since I seldom use range, my cognitive deficiency doesn’t hurt me too often. I do, on the other hand, use the NumPy version, arange, quite often. When I want to plot a function over a uniformly spaced set of x values, arange is just the ticket.

Or it would be, if I didn’t keep mistaking the stop value for where the generated array actually stops. I can’t tell you how often I’ve written arange(0, 1, .1) and been disappointed when it creates

array([ 0. ,  0.1,  0.2,  0.3,  0.4,  0.5,  0.6,  0.7,  0.8,  0.9])

and doesn’t include the 1.
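Here's the disappointment as a small check, assuming only that NumPy is installed.

```python
import numpy as np

x = np.arange(0, 1, 0.1)

print(len(x))      # ten values, not eleven
print(x[-1])       # the last value is about 0.9
print(x[-1] < 1)   # the stop value itself is never generated
```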

If you’re familiar with NumPy, you might think linspace would be my salvation. But while linspace does stop on the stop value, I still have an off-by-one issue with its third parameter, which I keep thinking should be the number of intervals, not the number of generated values. So I do linspace(0, 1, 10) and am disappointed when the result is

array([ 0.        ,  0.11111111,  0.22222222,  0.33333333,  0.44444444,
        0.55555556,  0.66666667,  0.77777778,  0.88888889,  1.        ])

instead of

array([ 0. ,  0.1,  0.2,  0.3,  0.4,  0.5,  0.6,  0.7,  0.8,  0.9,  1. ])

which requires linspace(0, 1, 11).
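The fencepost arithmetic is simple enough to wrap up: n intervals need n + 1 points. The name linspace_by_intervals below is hypothetical, just a thin sketch around the real np.linspace.

```python
import numpy as np

def linspace_by_intervals(start, stop, intervals):
    """Hypothetical wrapper: ask for intervals, get intervals + 1 points."""
    return np.linspace(start, stop, intervals + 1)

x = linspace_by_intervals(0, 1, 10)
print(len(x))   # 11 points bound 10 intervals
print(x)
```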

Today I decided to combat this problem by writing an array-generating function that works the way my brain does. It’s called fromtoby, and it always takes three parameters: the start value, the end value (which is included in the result), and the step.

Here’s fromtoby.py:

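#!/usr/bin/python

# True division, so b/2 is half a step even when b is an integer.
from __future__ import division
from numpy import arange

def fromtoby(f, t, b):
  # Pad the stop value by half a step so that t itself lands in the array.
  return arange(f, t + b/2, b)

if __name__ == "__main__":
  # The parenthesized print works under both Python 2 and Python 3.
  print(fromtoby(0, 1, .1))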

By saving fromtoby.py in a directory on my $PYTHONPATH, I can

from fromtoby import fromtoby

and say things like

x = fromtoby(0, 1, .01)

to get x equal to

array([ 0.  ,  0.01,  0.02,  0.03,  0.04,  0.05,  0.06,  0.07,  0.08,
        0.09,  0.1 ,  0.11,  0.12,  0.13,  0.14,  0.15,  0.16,  0.17,
        0.18,  0.19,  0.2 ,  0.21,  0.22,  0.23,  0.24,  0.25,  0.26,
        0.27,  0.28,  0.29,  0.3 ,  0.31,  0.32,  0.33,  0.34,  0.35,
        0.36,  0.37,  0.38,  0.39,  0.4 ,  0.41,  0.42,  0.43,  0.44,
        0.45,  0.46,  0.47,  0.48,  0.49,  0.5 ,  0.51,  0.52,  0.53,
        0.54,  0.55,  0.56,  0.57,  0.58,  0.59,  0.6 ,  0.61,  0.62,
        0.63,  0.64,  0.65,  0.66,  0.67,  0.68,  0.69,  0.7 ,  0.71,
        0.72,  0.73,  0.74,  0.75,  0.76,  0.77,  0.78,  0.79,  0.8 ,
        0.81,  0.82,  0.83,  0.84,  0.85,  0.86,  0.87,  0.88,  0.89,
        0.9 ,  0.91,  0.92,  0.93,  0.94,  0.95,  0.96,  0.97,  0.98,
        0.99,  1.  ])

which is, finally, exactly what I want on the first try.
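The half-step padding in fromtoby is what pulls the end value into the array: the stop handed to arange sits halfway between the last value I want and the first one I don't. As a sanity check, here's a sketch comparing it with an alternative built on linspace. The name fromtoby_linspace is hypothetical, and it assumes the span from f to t is close to a whole number of steps.

```python
import numpy as np

def fromtoby(f, t, b):
    # Same half-step padding as fromtoby.py.
    return np.arange(f, t + b / 2, b)

def fromtoby_linspace(f, t, b):
    # Hypothetical alternative: count the whole steps, then let linspace place the points.
    n = int(round((t - f) / b))
    return np.linspace(f, f + n * b, n + 1)

for step in (0.1, 0.01):
    x1 = fromtoby(0, 1, step)
    x2 = fromtoby_linspace(0, 1, step)
    print(step, len(x1), len(x2), np.allclose(x1, x2))
```

The two agree for these steps. The linspace version might be the safer choice when the step is tiny compared to the endpoints, but I haven't tested that.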


  1. Don’t mock Fortran, especially if you never programmed in it. It’s a product of its time, and it’s still a valuable tool when raw numerical speed is of the essence.

  2. Hi, Merlin!