TextExpander sparkline snippet

After seeing this Daring Fireball entry, this Kottke article, and then Alex Kerin’s original post, I had to write a script for generating Twitter sparklines via Unicode block characters.

Sparklines are an Edward Tufte idea, little inline graphs that let you visualize data within the text you’re reading. Twitter sparklines aren’t exactly true to Tufte’s vision, because they’re chunky and low resolution, but they are clever and fun.

You can see the Unicode block characters by opening the Character Viewer and clicking on the Geometrical Shapes category.

Block characters in Character Viewer

There are eight of them, running from Unicode code point U+2581 through U+2588. Here’s what they look like inline: ▁▂▃▄▅▆▇█.
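If you’d rather not hunt through the Character Viewer, the same eight characters can be built from their code points. This is a small Python 3 sketch, separate from the sparkline script; the variable name is only for illustration.

```python
# Build the eight block characters from their code points,
# U+2581 (lower one eighth block) through U+2588 (full block).
blocks = ''.join(chr(cp) for cp in range(0x2581, 0x2589))
print(blocks)
```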

What I wanted was to take a space-separated set of numbers, like

15 21 30 23 47 41 49 33 41 41 62

(Chicago Bulls regular season victories since the 2000-2001 season), and quickly turn them into a sparkline: ▁▂▃▂▆▅▆▄▅▅█. My model was the character flipping TextExpander snippet I wrote a couple of years ago: a Python script that generates a sparkline from standard input driven by a TextExpander shell snippet that assumes the data are on the clipboard.

Here’s the Python script, called sparkline:

 1:  #!/usr/bin/python
 2:  # -*- coding: UTF-8 -*-
 3:  
 4:  from sys import stdin, stdout
 5:  
 6:  blocks = u'▁▂▃▄▅▆▇██'
 7:  
 8:  def spark(data):
 9:    line = ''
10:    lo = float(min(data))
11:    hi = float(max(data))
12:    incr = (hi - lo)/8 or 1.0  # 1.0 avoids dividing by zero when all values are equal
13:    for n in data:
14:      line += blocks[int((float(n) - lo)/incr)]
15:    return line
16:  
17:  stdout.write(spark([float(x) for x in stdin.read().split()]).encode('utf8'))

Line 17 splits the input on whitespace and turns it into a list of floating point numbers. It then feeds that list to the spark function, which divides the data range into eight equal divisions and assigns a block character to each data point according to the division it falls in.
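To make the arithmetic concrete, here’s a quick Python 3 sketch that computes just the division index for each of the Bulls numbers, using the same formula as the spark function. It isn’t part of the script, and the variable names are mine.

```python
# Bulls regular season victories, as in the example above.
data = [15, 21, 30, 23, 47, 41, 49, 33, 41, 41, 62]

lo, hi = min(data), max(data)
incr = (hi - lo) / 8          # width of each of the eight divisions

# Index of the division each value falls in.
indices = [int((n - lo) / incr) for n in data]
print(indices)
```

The minimum, 15, lands in division 0, and every other value except the maximum lands in divisions 1 through 7. The maximum, 62, comes out as 8.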

The only tricky thing in sparkline is the ninth character in the blocks string defined in Line 6. Why is the tallest block repeated? Normally, the int function in Line 14 will return an integer in the range of 0 to 7, perfect as an index for an eight-character string. But when it’s calculating the value for the maximum data point it will return an 8. The repeated tall block is there to handle that special case.
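An alternative to repeating the tallest block is to clamp the index so it never exceeds 7. Here’s a minimal sketch of that idea in Python 3 (the script above was written for Python 2). It’s a variation, not the script I actually use, and it also guards against the case where every value is the same, which would otherwise mean dividing by zero.

```python
BLOCKS = '▁▂▃▄▅▆▇█'   # eight characters, no repeated tall block

def spark(data):
    """Return a sparkline string for a non-empty list of numbers."""
    lo, hi = min(data), max(data)
    # Fall back to 1.0 when all values are equal, so we never divide by zero.
    incr = (hi - lo) / len(BLOCKS) or 1.0
    top = len(BLOCKS) - 1
    # min(..., top) clamps the maximum data point from index 8 down to 7.
    return ''.join(BLOCKS[min(int((n - lo) / incr), top)] for n in data)

print(spark([15, 21, 30, 23, 47, 41, 49, 33, 41, 41, 62]))
```

The clamp makes the special case explicit in the code rather than hiding it in the string, at the cost of an extra function call per data point.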

The TextExpander snippet itself is this shell script,

#!/bin/bash
pbpaste | ~/bin/sparkline

which pipes the contents of the clipboard to sparkline (the ~/bin/ part says that I keep sparkline stored in the bin folder in my home folder). I gave the snippet the abbreviation ;spark.

The procedure for using the snippet is:

  1. Get the data in space-separated form.
  2. Select the data and copy it onto the clipboard.
  3. Type ;spark to create the sparkline.

The sparkline script gives the lowest value the lowest bar and the highest value the highest bar. If you don’t want that, you can put dummy values at the end of the list and delete those bars after the sparkline is generated. For example, with the Bulls data I could have used

15 21 30 23 47 41 49 33 41 41 62 0 82

to provide the full range of possible victories. That would generate a sparkline of ▂▃▃▃▅▅▅▄▅▅▇▁█, for which the last two bars should be deleted to give ▂▃▃▃▅▅▅▄▅▅▇.
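If adding and deleting dummy values gets tedious, the range could be passed in directly. The following Python 3 sketch is a hypothetical variation on spark, not what the snippet does; the lo and hi parameters are my own invention for illustration.

```python
BLOCKS = '▁▂▃▄▅▆▇█'

def spark(data, lo=None, hi=None):
    """Sparkline for data, scaled to lo..hi (defaults: the data's own range)."""
    lo = min(data) if lo is None else lo
    hi = max(data) if hi is None else hi
    incr = (hi - lo) / len(BLOCKS) or 1.0   # avoid dividing by zero
    top = len(BLOCKS) - 1
    # Clamp to 0..top so values at or beyond the limits still get a block.
    return ''.join(BLOCKS[max(0, min(int((n - lo) / incr), top))] for n in data)

bulls = [15, 21, 30, 23, 47, 41, 49, 33, 41, 41, 62]
print(spark(bulls, lo=0, hi=82))
```

With a range of 0 to 82, this should give the same eleven bars as the dummy-value approach, with nothing to delete afterward.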

I suspect I won’t use this snippet any more often than I use the pǝddılɟ ɹǝʇɔɐɹɐɥɔ snippet, but it was fun to write as I watched the boring OK City Thunder blowout of the Memphis Grizzlies.

Update 5/12/11
Well, that was disappointing. The sparklines look fine in my text editor, fine here in the blog, fine in Dr. Twoot, but like shit on the Twitter website because some of the block characters don’t align vertically in some fonts.

Sparkline in Twitter

Also, the half-height block character near the middle of the sparkline is narrower than all the others. I wonder if there’s some mixing of fonts going on.

That’s a screenshot, showing what it looks like on the Twitter website. Here’s what it looks like when I grab the tweet info directly and display it with the CSS I use here (which you won’t see if you’re reading the RSS feed):

Bulls regular season victories since 2000:
▁▂▃▂▆▅▆▄▅▅█

Twitter sparkline via a @TextExpander snippet: http://xrl.us/bkii52

11:07 PM Wed May 11, 2011

Wish I could get it to look like that everywhere. Oh well. I did say I was unlikely to use this snippet very much.

Update 5/12/11 (again)
Unsurprisingly, others have seen the baseline problem, which may be confined to Macs. The solution is, to my mind, worse than the problem. Zach Seward of the Wall Street Journal, which has been using these sparklines recently, suggests just dropping the two problem blocks and going with six divisions of the data range. Dropping the tallest block doesn’t bother me, but dropping the middle one does; it makes the height distribution between adjacent blocks uneven (▁▂▃▅▆▇), which kind of defeats the purpose.
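Here’s a rough Python 3 sketch of what the six-block approach looks like, along with a check of the unevenness. It’s my own illustration, not Seward’s code. The heights rely on the fact that the blocks run from U+2581 (one eighth) to U+2588 (full), so a block’s height in eighths is its code point minus 0x2580.

```python
SIX = '▁▂▃▅▆▇'   # the eight blocks with ▄ and █ dropped

# Height of each remaining block, in eighths of a full block.
heights = [ord(c) - 0x2580 for c in SIX]
# Height difference between each adjacent pair of blocks.
steps = [b - a for a, b in zip(heights, heights[1:])]
print(heights, steps)

def spark6(data):
    """Sparkline using six divisions of the data range."""
    lo, hi = min(data), max(data)
    incr = (hi - lo) / len(SIX) or 1.0      # avoid dividing by zero
    top = len(SIX) - 1
    return ''.join(SIX[min(int((n - lo) / incr), top)] for n in data)

print(spark6([15, 21, 30, 23, 47, 41, 49, 33, 41, 41, 62]))
```

The steps come out as 1, 1, 2, 1, 1: the jump from the third block to the fourth is twice the size of the others.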

As for the baseline misalignment problem being exclusive to the Mac, I can only say that I’m seeing it on the Mac and I’m also not seeing it on the Mac; it depends on how I’m viewing the tweet. Based on my experiments with Dr. Twoot’s CSS file, I think it’s related to both the specified font and the fallback fonts. For example:

  • If I specify Arial or Helvetica, I see the problem. With Arial, both blocks are misaligned; with Helvetica only the tallest block is misaligned.
  • If I specify Times or Georgia or Lucida Grande, I don’t see the problem.
  • If I specify Lucida Grande and a fallback font of Sans-Serif, I see the problem, but only with the tallest block.
  • If I specify Lucida Grande and a fallback font of Serif, I don’t see the problem.

I’m sure there’s a good explanation for this, but I don’t know what it is.

Update 5/12/11 (again again)
John Gruber has now linked to a criticism of sparktweets by Than Tibbetts. Tibbetts points out, as both Seward and I did, that sparktweets violate Tufte’s definition of sparklines as high-resolution graphics. He also complains about these obvious deficiencies:

The data it purportedly represents is worthless, at best confusing. There’s no baseline, no scale, no way to tell the bounds of the upper or lower limits of the “chart.” The tallest block might as well be 873 percent.

Lighten up, Francis. This is Twitter we’re talking about, not an article for an ASA journal. Sparktweets are just a fun way to show data—data which may well be properly analyzed and presented in a linked article.

As for the problem with block characters that don’t line up properly, here are a few more observations:

  • All the blocks align properly in the official Twitter client for the Mac.
  • The tallest block doesn’t align in the official Twitter client for the iPhone.
  • The tallest block doesn’t align in Tweetbot for the iPhone.
  • The tallest and half-height blocks don’t align in Mobile Safari on the iPhone. I learned this when reading this post on my iPhone.

There’s always been something goofy with iOS’s fonts. Long ago, I pointed out that Courier on the iPhone wasn’t monospaced.