Formatting MultiMarkdown tables with NumPy and tabulate
January 21, 2016 at 6:55 PM by Dr. Drang
I’ve been spending a lot of time lately making lists of random samples for testing. In the past, I’ve used Python’s random module to generate and print the list of samples from a population list, and then I’ve reformatted the list into a MultiMarkdown table in BBEdit for presentation in a report. But now I do everything in Python by using NumPy to manipulate the list and the tabulate module to format it as a MultiMarkdown table.
Let’s say I have a bunch of components available for testing, all with serial numbers.
1 #!/usr/bin/env python
2 # coding=utf-8
3
4 import numpy as np
5 from tabulate import tabulate
6 import random
7
8 # Define the population. Serial numbers normally aren't
9 # this simplistic, but this is just an example.
10 population = range(5668, 7023)
11
12 # Draw a sample of 28 from the population.
13 sample = random.sample(population, 28)
14
15 # Pad the list out with zeros to fill a 10x3 table.
16 sample = np.append(sample, [0, 0])
17
18 # Turn the list into a 10x3 table.
19 table = np.reshape(sample, (10, 3), 'F')
20
21 # Print the table.
22 print '| 1–10 | 11–20 | 20–30 |'
23 print tabulate(table, tablefmt='pipe')
The script starts on Line 10 by creating a list of serial numbers for all the available components. For this example, I’m using a nonsense range of numbers from 5668 through 7022. This is the population. Then I use the sample function from the random module on Line 13 to generate a new list of items chosen at random from the population.
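As a minimal sketch of that sampling step on its own (written for Python 3, unlike the script above), the following draws 28 items and checks that none repeats, since random.sample draws without replacement. The seed value is an assumption added here only to make the run repeatable; the script above does not seed.

```python
import random

# Serial numbers mirroring the example range, 5668 through 7022.
population = list(range(5668, 7023))

# Assumption: a fixed seed, used only so the sketch is repeatable.
rng = random.Random(2016)

# Draw 28 distinct items from the population.
sample = rng.sample(population, 28)

print(len(sample))       # number of items drawn
print(len(set(sample)))  # same number, so no serial number appears twice
```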
That’s the easy and obvious part. The part that saves me a lot of editing time is what comes next. First, I use NumPy’s append function on Line 16 to add zeros to the end of the list. I want to end up with a 10×3 table of serial numbers, so I need two more items to bring the 28 samples up to 30. Then the reshape function on Line 19 turns the flat list into a 10×3 matrix. The 'F' argument fills the matrix in column-major (Fortran) order, so the samples run down each column in turn rather than across the rows.
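The padding count and the column count can also be computed rather than worked out by hand. Here is a sketch in Python 3, where to_columns is a hypothetical helper name (not part of NumPy) and the numbers 1 through 28 stand in for real serial numbers so the column-by-column fill is easy to see.

```python
import numpy as np

def to_columns(items, nrows, fill=0):
    """Pad items with fill and arrange them column by column in nrows rows.

    Hypothetical helper for illustration; not part of NumPy.
    """
    arr = np.asarray(items)
    ncols = -(-len(arr) // nrows)        # ceiling division
    pad = nrows * ncols - len(arr)       # how many fill values are needed
    # Build the padding with the same dtype so integers stay integers.
    padded = np.append(arr, np.full(pad, fill, dtype=arr.dtype))
    # order='F' fills the first column top to bottom, then the second, and so on.
    return np.reshape(padded, (nrows, ncols), order='F')

table = to_columns(list(range(1, 29)), 10)
print(table.shape)   # (10, 3)
print(table[:, 0])   # first column: items 1 through 10
print(table[9])      # last row: [10 20  0], the 0 being padding
```

With the default row-major order, the same call would instead put items 1, 2, 3 across the first row.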
Finally, Line 22 prints the header row of the MultiMarkdown table (those are en dashes between the numbers, which is why you see the coding=utf-8 directive at the top of the file), and Line 23 uses tabulate to print the format line and the body of the table. Here’s the output:
| 1–10 | 11–20 | 20–30 |
|-----:|-----:|-----:|
| 6940 | 5839 | 6007 |
| 6615 | 6957 | 6314 |
| 6169 | 6877 | 6224 |
| 6142 | 6324 | 6210 |
| 6492 | 6685 | 6961 |
| 6908 | 5964 | 6475 |
| 6604 | 6387 | 6192 |
| 6189 | 6860 | 6090 |
| 6444 | 6162 | 0 |
| 5812 | 6950 | 0 |
And here’s what it looks like after processing,
1–10 | 11–20 | 20–30 |
---|---|---|
6940 | 5839 | 6007 |
6615 | 6957 | 6314 |
6169 | 6877 | 6224 |
6142 | 6324 | 6210 |
6492 | 6685 | 6961 |
6908 | 5964 | 6475 |
6604 | 6387 | 6192 |
6189 | 6860 | 6090 |
6444 | 6162 | |
5812 | 6950 | |
where I’ve edited out those padding zeros to avoid any confusion.
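The hand edit could be avoided by blanking the padding in code. This is only a sketch in Python 3, and it rests on a loud assumption: 0 is never a real serial number, so every 0 in the table is padding. The numbers 1 through 28 are stand-ins for a real sample.

```python
import numpy as np

# Stand-in for the sample: 1 through 28, padded with two zeros to fill 10x3.
padded = np.append(np.arange(1, 29), [0, 0])
table = np.reshape(padded, (10, 3), order='F')

# Assumption: 0 is never a real serial number, so every 0 is padding.
cells = [['' if value == 0 else str(value) for value in row] for row in table]

# Build the body rows by hand in pipe format, right-aligning each cell.
for row in cells:
    print('| ' + ' | '.join(cell.rjust(4) for cell in row) + ' |')
```

The cells list of lists could also be handed to tabulate in place of the NumPy array, since tabulate accepts a list of rows.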
There’s not much to this, I know, but by using this as my template and changing the individual parts to fit the particular problem at hand, I save myself a lot of time and can concentrate on the real work and not the fiddly formatting.
A footnote on the serial numbers: I could also assign them myself if the components don’t have them already.