Reshaping output

I often need to take a long list of character strings—one per line—and reformat them into a table so they’ll be easier to read. In the past, I’ve done this by pasting the data into a spreadsheet and moving the cells around to build the table by hand. Today, I decided there had to be a Unix command line tool that would do this for me, and I set out looking for it. I found two.

The data set I was dealing with was a list of devices, each identified by serial number and having a set of characteristics. I wrote a short Python script that filtered the data, printing out only the serial numbers of the devices that fit particular criteria. The easiest way to write the script was to have the serial numbers come out one per line, but that’s not how I wanted to present them in an email to my client. I wanted them laid out in a nice table.

I found a Python module, pycolumnize, that will do this, but I wanted to learn a more general tool, one that I could use on any set of textual data, not just a list within a Python program.

The first command I ran across was column, which is pretty easy to use and doesn’t have many options. Here’s an example:

$ jot 36 | column -c 32
1       10      19      28
2       11      20      29
3       12      21      30
4       13      22      31
5       14      23      32
6       15      24      33
7       16      25      34
8       17      26      35
9       18      27      36

The jot command generates the numbers 1–36, one per line. Then column rearranges them into columns. The number of columns is determined by the value of the -c option; column packs as many columns as it can into an output block that many characters wide.[1] While I can understand why someone might like this way of specifying the output, I’d rather specify the number of rows or columns directly.
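A rough model of that packing, sketched in Python. It assumes each column occupies the longest entry's length rounded up to the next 8-character tab stop, and that column fits as many such cells as the -c width allows. That is an assumption rather than something confirmed from column's source, and `columns_that_fit` is a hypothetical helper name, but it is consistent with the four columns shown above and with the three columns the footnote reports for -c 28.

```python
def columns_that_fit(entries, width, tab=8):
    """Estimate how many columns fit in `width` characters.

    Assumption: every cell is the longest entry padded out to the next
    multiple of `tab`, so a cell is never narrower than one tab stop.
    """
    longest = max(len(entry) for entry in entries)
    cell = (longest // tab + 1) * tab  # round up, always leaving some padding
    return max(1, width // cell)


numbers = [str(n) for n in range(1, 37)]  # same values as `jot 36`
print(columns_that_fit(numbers, 32))  # prints 4 (32 // 8)
print(columns_that_fit(numbers, 28))  # prints 3 (28 // 8)
```

Under this model the width budget is spent in whole 8-character cells, which would explain why a line that is only 26 characters long still needs -c 32.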

Also, column puts a tab character between columns.[2] Again, I can see why some people might like that, but I prefer spaces between the columns.

My dissatisfaction with column led me to rs, the reshape command. This is a much more flexible command than column. In its most basic form, you give it the number of rows and columns (in that order) you want in the output.

$ jot 36 | rs 4 9
1   2   3   4   5   6   7   8   9
10  11  12  13  14  15  16  17  18
19  20  21  22  23  24  25  26  27
28  29  30  31  32  33  34  35  36

Having to specify both rows and columns shouldn’t be necessary, and it isn’t. You can use 0 as a dummy value for either the row or column count, and rs will figure out what that value ought to be.

$ jot 36 | rs 0 4
1   2   3   4
5   6   7   8
9   10  11  12
13  14  15  16
17  18  19  20
21  22  23  24
25  26  27  28
29  30  31  32
33  34  35  36
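The arithmetic behind the 0 placeholder is presumably a ceiling division: enough rows (or columns) to hold every entry, with the last one possibly left partly empty. A minimal Python sketch, where `resolve_shape` is a hypothetical helper and not part of rs:

```python
def resolve_shape(count, rows, cols):
    """Replace a 0 row or column count with the smallest value that fits."""
    if rows == 0 and cols == 0:
        # Limitation of this sketch: it does not guess both dimensions.
        raise ValueError("this sketch needs at least one dimension")
    if rows == 0:
        rows = -(-count // cols)  # ceiling division
    elif cols == 0:
        cols = -(-count // rows)
    return rows, cols


print(resolve_shape(36, 0, 4))  # (9, 4)
print(resolve_shape(36, 4, 0))  # (4, 9)
```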

As you can see, by default rs enters the values row by row instead of column by column. That can be changed with the -t (transpose) option:

$ jot 36 | rs -t 0 4
1   10  19  28
2   11  20  29
3   12  21  30
4   13  22  31
5   14  23  32
6   15  24  33
7   16  25  34
8   17  26  35
9   18  27  36
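The difference between the two fill orders comes down to which row each entry lands in. The following Python sketch (with `fill` as a hypothetical helper, and the row count supplied by the caller) assigns entry i to row i // cols when filling row by row and to row i % rows when filling column by column:

```python
def fill(items, rows, cols, transpose=False):
    """Distribute items over `rows` rows, row by row or column by column.

    Assumes rows * cols >= len(items); the last row or column may be short.
    """
    grid = [[] for _ in range(rows)]
    for i, item in enumerate(items):
        # Row by row: every `cols` items start a new row.
        # Column by column: items cycle down the rows, then start a new column.
        row = i % rows if transpose else i // cols
        grid[row].append(item)
    return grid


numbers = [str(n) for n in range(1, 37)]
print(fill(numbers, 9, 4)[0])                  # ['1', '2', '3', '4']
print(fill(numbers, 9, 4, transpose=True)[0])  # ['1', '10', '19', '28']
```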

Each column of rs output is as wide as the widest entry. Other values are padded with spaces. By default, the entries are left justified within that width, but you can make them right justified by using the -j option:

$ jot 36 | rs -tj 0 6
 1   7  13  19  25  31
 2   8  14  20  26  32
 3   9  15  21  27  33
 4  10  16  22  28  34
 5  11  17  23  29  35
 6  12  18  24  30  36

The “gutter” is the space between the columns, and rs uses a two-space gutter by default. That can be changed with the -g option:

$ jot 36 | rs -tj -g4 0 4
 1    10    19    28
 2    11    20    29
 3    12    21    30
 4    13    22    31
 5    14    23    32
 6    15    24    33
 7    16    25    34
 8    17    26    35
 9    18    27    36
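Justification and the gutter are both just padding. A minimal Python sketch, assuming every cell is padded to the width of the longest entry in the whole table and that trailing spaces are trimmed (both are assumptions; `render` is a hypothetical helper, not rs's own code):

```python
def render(grid, gutter=2, right=False):
    """Format rows of strings as aligned text lines."""
    # Assumption: one shared width, set by the longest entry in the table.
    width = max(len(cell) for row in grid for cell in row)
    lines = []
    for row in grid:
        cells = [cell.rjust(width) if right else cell.ljust(width) for cell in row]
        # Cells are joined by `gutter` spaces; trailing padding is dropped.
        lines.append((" " * gutter).join(cells).rstrip())
    return lines


rows = [["1", "10", "19", "28"], ["9", "18", "27", "36"]]
for line in render(rows, gutter=4, right=True):
    print(line)
```

With gutter=4 and right=True, the two printed lines match the first and last rows of the -tj -g4 output above.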

Be careful with -g. The width has to come immediately after it, as in -g4. If you put a space between the g and the number, rs will read the number as a separate argument instead of as the gutter width.

The rs command has, perhaps unfortunately, many more options, but -t, -j, and -g are probably the most useful.

By the way, I don’t want to leave the impression that you need to have a full table. If you ask for 36 entries to be arranged in 5 columns, you’ll get this:

$ jot 36 | rs -tj -g4 0 5
 1     9    17    25    33
 2    10    18    26    34
 3    11    19    27    35
 4    12    20    28    36
 5    13    21    29
 6    14    22    30
 7    15    23    31
 8    16    24    32

Or this if you don’t transpose:

$ jot 36 | rs -j -g4 0 5
 1     2     3     4     5
 6     7     8     9    10
11    12    13    14    15
16    17    18    19    20
21    22    23    24    25
26    27    28    29    30
31    32    33    34    35
36

For the data I was dealing with today, the serial numbers were all four digits long and there were, after filtering, about 200 of them, so you can see why I wanted to reshape them into a table. I chose six columns and a four-space gutter to get a table that was relatively easy to read. I piped the output of my Python script through rs -t -g4 0 6 and got output that looked something like this:

1013    2337    3908    5808    7374    8919
1021    2358    3962    5819    7384    8932
1095    2419    3980    5843    7467    8936
1125    2481    4007    5843    7494    8960
1173    2485    4059    6028    7497    8998
1194    2501    4076    6110    7510    9118
1250    2595    4119    6128    7600    9168
1255    2603    4260    6206    7623    9196
1314    2724    4315    6233    7688    9208
1326    2867    4363    6260    7766    9307
1330    2904    4372    6310    7850    9308
1346    2905    4376    6352    7914    9394
1355    3021    4379    6407    7935    9411
1404    3032    4502    6450    7990    9435
1408    3196    4503    6457    7991    9442
1449    3234    4572    6501    8022    9473
1462    3246    4699    6652    8078    9536
1477    3246    4767    6660    8127    9544
1477    3293    4819    6665    8158    9563
1529    3293    4988    6724    8222    9603
1537    3355    5035    6747    8299    9603
1551    3356    5042    6765    8333    9630
1552    3372    5101    6783    8402    9742
1809    3421    5111    6845    8436    9749
1930    3458    5261    6861    8474    9783
1996    3499    5315    6862    8507    9892
2045    3569    5522    6875    8519    9906
2047    3575    5525    6897    8524    9944
2070    3597    5526    7036    8586    9948
2078    3686    5569    7050    8664    9959
2124    3691    5583    7077    8720    
2146    3722    5592    7109    8751    
2158    3739    5617    7310    8891    
2336    3810    5755    7333    8905    

Yes, it’s long, but it’s probably as easy to read as 200 serial numbers can be.


  1. Sort of. In practice, I’ve found that you usually need to give -c a greater number than is absolutely necessary. For example, in the table above, each line is only 26 characters long, but column -c 28 would result in only three columns of data. I don’t know why.

  2. HTML doesn’t like tab characters, so to match the look of the output in Terminal, I’ve shown the output of column with spaces between the columns.