Formatting MultiMarkdown tables with NumPy and tabulate
January 21, 2016 at 6:55 PM by Dr. Drang
I’ve been spending a lot of time lately making lists of random samples for testing. In the past, I’ve used Python’s random module to generate and print the list of samples from a population list, and then I’ve reformatted the list into a MultiMarkdown table in BBEdit for presentation in a report. But now I do everything in Python by using NumPy to manipulate the list and the tabulate module to format it as a MultiMarkdown table.
Let’s say I have a bunch of components available for testing, all with serial numbers.[1] I don’t need to test all of them, only a sample, but I want to make sure that I don’t subconsciously cherry-pick the best ones (or the worst ones, for that matter). To ensure that my prejudices don’t play any role in the selection of components to test, I write a Python script to do the selecting for me. In general form, this is what it looks like:
python:
 1:  #!/usr/bin/env python
 2:  # coding=utf-8
 3:
 4:  import numpy as np
 5:  from tabulate import tabulate
 6:  import random
 7:
 8:  # Define the population. Serial numbers normally aren't
 9:  # this simplistic, but this is just an example.
10:  population = range(5668, 7023)
11:
12:  # Draw a sample of 28 from the population.
13:  sample = random.sample(population, 28)
14:
15:  # Pad the list out with zeros to fill a 10x3 table.
16:  sample = np.append(sample, [0, 0])
17:
18:  # Turn the list into a 10x3 table.
19:  table = np.reshape(sample, (10, 3), 'F')
20:
21:  # Print the table.
22:  print '| 1–10 | 11–20 | 20–30 |'
23:  print tabulate(table, tablefmt='pipe')
It starts on Line 10 by creating a list of the serial numbers for all the available components. For this example, I’m using a nonsense range of numbers from 5668 through 7022. This is the population. Then, on Line 13, I use the sample function from the random module to generate a new list of 28 items chosen at random, without repetition, from the population.
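A quick way to confirm that sample never picks the same serial number twice is to check the result. The sketch below is written for Python 3. The helper name draw_sample and its optional seed parameter are illustrative assumptions, not part of the script above; the seed is only there so a selection can be regenerated later if a report needs it.

```python
import random


def draw_sample(population, k, seed=None):
    """Return k distinct items chosen at random from population.

    draw_sample and seed are hypothetical names used for illustration.
    """
    # A private generator, so the module-level random state is left alone.
    rng = random.Random(seed)
    return rng.sample(list(population), k)


population = range(5668, 7023)   # serial numbers 5668 through 7022
sample = draw_sample(population, 28, seed=2016)

# The list length and the number of distinct values should match.
print(len(sample), len(set(sample)))   # prints "28 28"
```

Calling draw_sample again with the same seed returns the same list; leaving seed as None gives a fresh draw each time.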
That’s the easy and obvious part. The part that saves me a lot of editing time is what comes next. First, I use NumPy’s append function on Line 16 to add zeros to the end of the list. I want to end up with a 10×3 table of serial numbers, which has 30 cells, so I need two more items to fill out the 28-item list. Then the reshape function on Line 19 turns the flat list into a 10×3 matrix. The 'F' argument asks for column-major (Fortran) order, so the serial numbers run down each column instead of across each row.
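The padding count doesn't have to be worked out by hand. Here's a minimal sketch for Python 3 that computes it from the sample size. The names to_columns and pad are hypothetical, and np.full is used only to build the padding so the array keeps its integer type even when no padding is needed.

```python
import numpy as np


def to_columns(items, rows, pad=0):
    """Arrange items down the columns of a table with the given row count.

    to_columns and pad are hypothetical names used for illustration.
    """
    items = np.asarray(list(items))
    cols = -(-len(items) // rows)            # ceiling division
    shortfall = rows * cols - len(items)     # empty cells in the last column
    padding = np.full(shortfall, pad, dtype=items.dtype)
    padded = np.append(items, padding)
    # order='F' fills column by column, as in the script above.
    return np.reshape(padded, (rows, cols), order='F')


table = to_columns(range(1, 29), rows=10)   # 28 items into a 10x3 table
print(table[:, 0])     # first column holds items 1 through 10
print(table[-2:, 2])   # last two cells of the third column are padding
```

With 28 items and 10 rows, the function works out that three columns and two padding cells are needed, which matches the numbers hard-coded on Lines 16 and 19.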
Finally, Line 22 prints the header row of the MultiMarkdown table (those are en dashes between the numbers, which is why you see the coding=utf-8 directive at the top of the file), and Line 23 uses tabulate to print the format line and the body of the table. Here’s the output:
| 1–10 | 11–20 | 20–30 |
|-----:|-----:|-----:|
| 6940 | 5839 | 6007 |
| 6615 | 6957 | 6314 |
| 6169 | 6877 | 6224 |
| 6142 | 6324 | 6210 |
| 6492 | 6685 | 6961 |
| 6908 | 5964 | 6475 |
| 6604 | 6387 | 6192 |
| 6189 | 6860 | 6090 |
| 6444 | 6162 |    0 |
| 5812 | 6950 |    0 |
And here’s what it looks like after processing, where I’ve edited out those padding zeros to avoid any confusion.
There’s not much to this, I know, but by using this as my template and changing the individual parts to fit the particular problem at hand, I save myself a lot of time and can concentrate on the real work and not the fiddly formatting.
[1] Or I could assign serial numbers if they don’t have them already.