December 31, 2015 at 12:49 PM by Dr. Drang
This is the traditional time for bloggers to look back over the year and give statistics on pageviews, most popular posts, and things like that. I don’t keep track of stuff like that (this experiment with Piwik didn’t last long), but because ANIAT is a static site organized through a hierarchical folder structure and built from Markdown files, it’s easy to summarize the posts themselves though Unix shell tools.
The Markdown source files are kept in a Dropbox directory called
source, with subdirectories for each year and sub-subdirectories for each month. For example, the source for this post is in a file called
where the tilde is shell-speak for my home directory.
With this organization scheme, I
cd‘d into the
2015 directory and used this command to get the total number of posts for the year, which is 133:
ls */* | wc -l
ls command lists all the files that are one subdirectory deep, one per line. These are the Markdown source files. The list is then piped to
wc, which, when given the
-l switch, prints out just the number of lines of its input.
This is nice for the total, but it doesn’t say much for how the posts were distributed throughout the year. For that, I ran this command:
for d in `ls`; do echo -n $d; ls $d | wc -l; done
which looped through all the month directories, got the number of files in each, and printed them out, each month on its own line. The output was this:
01 17 02 11 03 9 04 8 05 9 06 14 07 11 08 20 09 11 10 6 11 5 12 12
October and November were a real dropoff.
In this command, the backticks around the first
ls executed that command, which gave a list of the month directories. The
for command looped through that list, assigning the name of a month directory to the variable
d each time through the loop. The
echo -n $d command printed the name of the current month directory, suppressing the newline that
echo normally adds to the end of what it prints. The
ls $d | wc -l composite command listed the contents of the current month directory and then printed the line count of the list. The
done act as brackets for the commands executed each time through the loop.
Finally, I wanted to know how many words I wrote for the blog. Counting words is the main job of
wc, and I could string together all the posts via
cat. The resulting pipeline was very simple:
cat */* | wc -w
It gave me an answer of 106,967 words. You could argue that this is a bit of an overestimate, as it includes hidden text, like a few header lines at the top of each post and the URLs of links and images. I don’t think it’s worth the effort to tease out those differences.
The obvious next step is to figure out how many words I wrote each month. I did that by combining parts of the commands I’d already written:
for d in `ls`; do echo -n $d; cat $d/* | wc -w; done
The result was
01 13732 02 9914 03 6415 04 6529 05 8148 06 10955 07 10134 08 16169 09 7929 10 4690 11 4032 12 8320
I wouldn’t say this was a particularly illuminating exercise, but it’s nice to keep your shell skills in tune.