Your Feedburner loss is my gain

Sometimes things just work out. If Google hadn’t decided to pull the plug on Feedburner, Marco Arment wouldn’t have written his clever subscriber counting shell script, and I’d have been forced to write one of my own. But by putting off writing my own script, I saved myself a lot of time; all I had to do was change a few lines of Marco’s.

I’ve always used Google Reader’s statistics as a rough guide to how many subscribers I have. I know there are people who use other services, but I figured Reader, because it acts as the server for so many readers, would account for the lion’s share of subscribers.

Google Reader statistics

But since I started using PubSubHubbub a couple of weeks ago, the subscriber count returned by Reader has been stuck on 2,733. It’s not that I was expecting a big jump, but that count has always fluctuated. I downloaded my access log file and did a little searching. Here are some of the entries I found:

72.14.199.116 - - [21/Sep/2012:23:20:14 -0400] "GET /all-this/feed/ HTTP/1.1" 304 - "-" "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 2736 subscribers; feed-id=9141626367700991551)"
72.14.199.116 - - [21/Sep/2012:23:56:24 -0400] "GET /all-this/feed/ HTTP/1.1" 200 81346 "-" "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 3 subscribers; feed-id=10244119563422606966)"
72.14.199.116 - - [22/Sep/2012:00:10:11 -0400] "GET /all-this/feed/ HTTP/1.1" 304 - "-" "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 24 subscribers; feed-id=14210996709684013136)"
72.14.199.116 - - [22/Sep/2012:00:11:44 -0400] "GET /all-this/feed/ HTTP/1.1" 304 - "-" "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 5 subscribers; feed-id=5544250358980535650)"
72.14.199.116 - - [22/Sep/2012:00:18:18 -0400] "GET /all-this/feed/ HTTP/1.1" 304 - "-" "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 1 subscribers; feed-id=10534473730017088018)"
72.14.199.116 - - [22/Sep/2012:00:28:46 -0400] "GET /all-this/feed/ HTTP/1.1" 304 - "-" "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 1 subscribers; feed-id=9683323205805256385)"

Not only was the log file showing a few more subscribers for that particular feed ID, there were, for reasons I cannot fathom, other feed IDs associated with the same feed URL. Obviously, the right way to get the subscriber count was to search for all lines like this and add up the values.
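Here’s a minimal sketch of that add-it-up idea. It is not Marco’s code, just an illustration; the function name `count_google_subscribers` is made up, and it assumes the most recent log line for each feed ID carries that ID’s current count.

```shell
# Hypothetical helper (not from Marco's script): total the Google
# Feedfetcher subscriber counts reported for one feed path.
# Usage: count_google_subscribers LOG_FILE FEED_PATH
count_google_subscribers() {
    log_file=$1
    feed_path=$2
    # Keep only Feedfetcher requests for this exact feed path.
    grep -F "GET $feed_path HTTP" "$log_file" |
        grep -F 'Feedfetcher-Google' |
        # Reduce each line to "feed-id count".
        sed -n 's/.*; \([0-9][0-9]*\) subscribers; feed-id=\([0-9][0-9]*\)).*/\2 \1/p' |
        # Later lines overwrite earlier ones, so each feed ID is counted once.
        # The "" forces the ID to stay a string; the IDs are too long to be
        # safe as floating point numbers.
        awk '{ count[$1 ""] = $2 }
             END { total = 0; for (id in count) total += count[id]; print total }'
}
```

Run against a file containing just the six entries above with `/all-this/feed/` as the path, this prints 2770.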

As it seemed like a simple enough problem, I did a little Googling to see if someone had done it before. I found Ken Varnum’s blog post on the subject and the web page he made for doing the calculation. I didn’t want to provide access to my stats to an unknown host (no disrespect to Ken meant, just a general principle), so I figured I’d have to do it myself.

Then along came Marco’s gist. His script, like Ken’s, did more than just count Google Reader subscribers. Better still, I could run it on my own server, and I could review and change the source code to my heart’s content. A perfect solution.

Near the top of the script are these lines:

bash:
# Required variables:
RSS_URI="/rss"
MAIL_TO="your@email.com"
LOG_FILE="/var/log/httpd/access_log"

I changed the value of LOG_FILE to the path to my log file (I’ll explain the MAIL_TO variable in a bit). Instead of hard-coding the feed URL into the script, I changed that line to

bash:
# Required variables:
RSS_URI=$1

so it would get the feed URL from the command line. For historical reasons, this blog has both an RSS feed and an Atom feed, and I wanted to use the same script to get statistics on both.
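One thing to watch with this change is that a forgotten argument leaves RSS_URI empty. A small guard like the following sketch would catch that; the `set_rss_uri` function and its usage message are my own illustration, not something in Marco’s script.

```shell
# Hypothetical guard (not in Marco's script): refuse to continue
# when no feed path is given on the command line.
set_rss_uri() {
    if [ -z "$1" ]; then
        echo "usage: subscribers.sh FEED_PATH" >&2
        return 1
    fi
    RSS_URI=$1
}
```

Inside the script it would be called as `set_rss_uri "$@" || exit 1` in place of the plain assignment.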

Because my server will email me the output of any cron job, I also commented out these last few lines:

bash:
# echo "Also emailed to $MAIL_TO."
# 
# echo "$REPORT " | mail -s "[$HUMAN_FDATE] $MAIL_SUBJECT" "$MAIL_TO"

which is why I didn’t need to change the MAIL_TO variable at the top.

The cron job runs every morning and looks like this:

bash subscribers.sh '/all-this/feed/'; bash subscribers.sh '/all-this/feed/atom/'

This gives me the statistics of both feeds in the same email.

Thanks, Marco!