My RSS failure

Consider this a small blow against publication bias—a post about a failed script. Or a failed idea, anyway.

Shortly after Google announced it would be shutting down Reader in July, I started thinking about ways I could sync my feeds without relying on a third-party syncing service. I’m not averse to such services in principle, but I do wonder about their longevity. If the service I pick doesn’t gain enough popularity to make a go of it, I’ll be back trying to choose a better one in a year or two.

Self-hosted solutions like Fever and Tiny Tiny RSS are, of course, one way around the service longevity problem, but this post by Shaun Inman made me leery of Fever, and TTRSS doesn’t want to be on a shared host, which is where I’d want to put it.

So I had this clever idea:

If there isn’t an obvious replacement for Google Reader by June, I’m going to convert my feeds into email. IMAP solved syncing long ago.
Dr. Drang (@drdrang) Fri Mar 22 2013 7:12 PM CDT

And with the feedparser module for Python, writing up a simple script for syncing feeds and turning them into emails to myself wasn’t all that difficult. Here’s a little script called mailfeeds that works fairly well, although it still needs some debugging:

python:
#!/usr/bin/python
# Written for Python 2 (it relies on dict.iteritems and byte strings).

import feedparser
from time import gmtime
from datetime import datetime
import pytz
import os
import sys
import socket
import smtplib


# Parameters
me = 'myrssemailaddress@gmail.com'
pw = 'seekret'
gmail = 'smtp.gmail.com:587'
subfile = os.environ['HOME'] + '/Dropbox/rss/subscriptions.txt'
homeTZ = pytz.timezone('US/Central')
utc = pytz.utc

# Don't wait too long for nonresponsive sites.
# socket.setdefaulttimeout(30)

# Convert post date and time from UTC to home time zone.
# The struct's fields are copied straight into a datetime so the
# machine's own time zone and DST setting stay out of the conversion.
def convertTZ(struct):
  dt = datetime(*struct[:6])
  return utc.localize(dt).astimezone(homeTZ)

# Item and message formats.
itemfmt = '''\
<h1>{0}</h1>
<h1><a href="{1}">{2}</a></h1>
<p>by {3} on {4}</p>
{5}'''
msgfmt ='''\
From: {0}
To: {1}
Subject: {2}
Content-Type: text/html

{3}'''

# Connect to the mail server.
server = smtplib.SMTP(gmail)
server.starttls()
server.login(me, pw)


# Initialize subscription dictionary.
subdict = {}

with open(subfile) as subs:
  for sub in subs:
    feedURL, lastlink = sub.split()
    subdict[feedURL] = lastlink

    # Parse the feed. If it comes back empty, keep the old last link
    # and move on to the next feed.
    f = feedparser.parse(feedURL)
    if len(f.entries) == 0:
      continue

    # The blog name.
    bname = f.feed.get('title', 'Unnamed Blog').encode('utf8')
    # print bname

    # This will be the last link when we're done.
    subdict[feedURL] = f.entries[0].link

    # Go through the entries more recent than the last link.
    for entry in f.entries:
      link = entry.link
      if link == lastlink:
        break
      else:
        # Collect the parts we need.
        title = entry.get('title', 'No title')
        title = title.replace('\n', ' ').encode('utf8')
        # Use the current time if the entry has no usable date.
        date = entry.get('updated_parsed') or gmtime()
        date = convertTZ(date).strftime('%b %-d, %Y at %-I:%M %p')
        author = entry.get('author', 'Anonymous').encode('utf8')
        try:
          article = entry.content[0].value.encode('utf8')
        except AttributeError:
          article = entry.summary_detail.value.encode('utf8')

        # Build the email body.
        body = itemfmt.format(bname, link, title, author, date, article)

        # Build the message. bname is already UTF-8 bytes, so the
        # sender string needs no further encoding.
        sender = '%s <%s>' % (bname, me)
        msg = msgfmt.format(sender, me, title, body)

        server.sendmail(sender, [me], msg)
        # print '  ' + title
    # print

# Clean up.
server.quit()
with open(subfile, 'w') as subs:
  for k, v in subdict.iteritems():
    subs.write('%s %s\n' % (k, v))

It keeps track of my subscriptions in a text file. Each line of the text file contains the URL of a feed and the URL of the most recently synced article from that feed. When the script runs, it collects the entries in each feed that were posted after the most recently synced one, turns each entry into an HTML email, and mails it to an address I set up specifically for these RSS emails. Finally, it updates the subscription list text file.
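
The bookkeeping described above can be tried on its own, away from the network and the mail server. What follows is a minimal Python 3 sketch, not the mailfeeds code itself: the function names parse_subscriptions, links_to_send, and build_message are made up for illustration, the addresses are placeholders, and build_message swaps the hand-formatted headers for the standard library’s EmailMessage class, which is an assumption about how one might do it rather than what the script does.

```python
from email.headerregistry import Address
from email.message import EmailMessage


def parse_subscriptions(lines):
    """Map each feed URL to the last link synced for it."""
    subs = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            feed_url, last_link = parts
            subs[feed_url] = last_link
    return subs


def links_to_send(links, last_link):
    """Return the links listed before last_link, in feed order.

    Assumes the feed lists its newest entry first. If last_link
    never appears, every link is returned.
    """
    fresh = []
    for link in links:
        if link == last_link:
            break
        fresh.append(link)
    return fresh


def build_message(blog, me, title, body_html):
    """Build an HTML message addressed from the blog name to me."""
    msg = EmailMessage()
    msg['From'] = Address(display_name=blog, addr_spec=me)
    msg['To'] = me
    msg['Subject'] = title
    msg.set_content(body_html, subtype='html')
    return msg


if __name__ == '__main__':
    subs = parse_subscriptions(['http://a.example/feed http://a.example/2\n'])
    print(subs)
    print(links_to_send(['http://a.example/3', 'http://a.example/2'],
                        subs['http://a.example/feed']))
    print(build_message('Example Blog', 'me@example.com',
                        'A post', '<p>Hello</p>'))
```

Keeping the link comparison in its own function makes the stopping rule easy to check: a feed whose last link is the placeholder none, or whose last synced article has scrolled off the feed, sends everything it currently lists.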

While the script still makes mistakes, it isn’t the reason my idea was a failure. I’m pretty confident I could shake out the bugs if I kept at it. But I’m not going to keep at it because feed reading via email sucks.

You’d think, or at least I thought, that the fit between RSS and email would be pretty good. Both consist of individual items that can be read, sorted, archived, or deleted. Both accept limited HTML. Both can be accessed through browser- and app-based interfaces. Hell, one of the reasons people raved about Google Reader in its early days was that it worked a lot like GMail.

But it’s the little things that make a big difference. For example, in the RSS readers I use, unread articles are typically all I see; the ones I’ve read are automatically swept into a temporary archive in which older articles are eventually deleted. Mail clients don’t work that way, and it’s frustrating to have to go back and delete or archive bunches of messages after I’ve read them.

Can’t I just delete the article-messages as I read them? Yes, but in browser-based GMail, the iOS GMail app, and the much-praised Mailbox app, deleting a message doesn’t take you to the next message. It takes you back to the inbox, which adds another step to the feed reading experience.

Also, HTML email is a tricky business, and a lot of posts—especially those with large images—look like hell in the GMail and Mailbox apps. Large images cause the text to shrink to an unreadable size. Here’s an example in GMail

[Image: RSS in GMail app]

and in Mailbox.

[Image: RSS in Mailbox app]

I suppose this is my fault for naively thinking the HTML in a feed would be handled well by email clients.
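
One untested mitigation would be to rewrite the img tags in each article before mailing it, dropping fixed width and height attributes and adding a max-width style. The sketch below is only an illustration under that assumption: constrain_images is a hypothetical helper, the regular expressions handle simple tags rather than every kind of HTML a feed can contain, and whether the GMail or Mailbox apps would honor the inline style is not something this post establishes.

```python
import re

# Matches an <img ...> tag and captures its attribute text.
IMG_TAG = re.compile(r'<img\b([^>]*)>', re.IGNORECASE)

# Matches a width or height attribute, quoted or bare.
SIZE_ATTR = re.compile(
    r'''\s(?:width|height)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)''',
    re.IGNORECASE)

STYLE = 'max-width:100%;height:auto'


def constrain_images(html):
    """Strip fixed sizes from img tags and add a max-width style.

    A tag that already has a style attribute ends up with two,
    which a fuller version would need to merge.
    """
    def fix(match):
        attrs = SIZE_ATTR.sub('', match.group(1))
        # Drop the trailing slash of a self-closing tag, if any.
        attrs = attrs.rstrip().rstrip('/').rstrip()
        return '<img{} style="{}">'.format(attrs, STYLE)
    return IMG_TAG.sub(fix, html)


if __name__ == '__main__':
    print(constrain_images(
        '<p><img src="a.png" width="1200" height="800"></p>'))
```

In a script like mailfeeds, a helper of this kind would run on the article text just before the email body is built.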

I don’t regret giving this a try. It didn’t take long to write and would have been a permanent, portable solution if it had worked out. But I’ve come to expect more from an RSS reader than an email client can provide. Back to reading Gabe for advice.