Updating LinkShare affiliate links

If you’ve been using affiliate links to items in the iTunes Store, the App Store, or the Mac App Store, you learned about a month and a half ago that Apple was moving its affiliate program from LinkShare to the Performance Horizon Group. Up through today, LinkShare links still worked, but as of tomorrow they won’t.

I changed the script that creates my affiliate links back in August, following the prescription of Underscore David Smith. But that fixed only the links from that time forward. What about affiliate links in older posts?

Well, there’s the Auto Link Maker, a chunk of JavaScript that’s supposed to scan your posts for links to the iTunes/App/Mac App Stores and turn them into proper PHG affiliate links. But I didn’t want to use it because

  1. I don’t really know everything it’s doing.
  2. I don’t want yet another chunk of JavaScript loading and running with every page.
  3. I wanted the Markdown source code of my posts, which I have stored on my computer in addition to being in the blog’s WordPress database, to have the PHG links in case I ever use that source to generate a static blog.
  4. I just wanted to do it myself.

So early this month I wrote a couple of scripts to transform the LinkShare-style links into PHG-style links.

Most of my old affiliate links followed the format that Undie recommended back in December of 2011. These links looked like this:

[1]: http://itunes.apple.com/us/app/notesy-for-dropbox/
id386095500?mt=8&partnerId=30&siteID=L4JhWyGwYTM

I wanted to change them to look like this:

[1]: https://itunes.apple.com/us/app/notesy-for-dropbox/
id386095500?mt=8&at=10l4Fv

(I’ve inserted line breaks in both of these to make them easier to read without scrolling.)

The number in brackets before the URL is there because I always use Markdown reference-style links, which is the style all right-thinking people use. This helped in constructing the regular expression I used to search for and update the URLs. Another thing that helped is Patterns, a nifty little tool for testing out regexes on different types of input. I’m generally pretty good at regex construction, but Patterns does a nice job of showing exactly what is matched instead of what I think is matched.


The script I wrote, update-applelinks, takes an argument list of Markdown source files, looks for the old-style links, changes them to new-style links, and writes them out to files with the extension .new added. Here’s the script:

python:
    #!/usr/bin/python

    import re, sys, os

    # Markdown source files to process.
    files = sys.argv[1:]

    # A reference-style link line that contains the old LinkShare parameters.
    applelink = re.compile(r'(^\[\d+\]: .+)(partnerId=30&siteID=L4JhWyGwYTM).*$')
    # Keep everything before the old parameters and append the PHG token.
    phglink = r'\1at=10l4Fv'

    for oldfile in files:
      newfile = oldfile + '.new'
      unchanged = True
      # Copy the file line by line, rewriting any old-style links.
      with open(newfile, 'w') as new:
        with open(oldfile, 'r') as old:
          for line in old:
            newline = applelink.sub(phglink, line)
            new.write(newline)
            if newline != line:
              unchanged = False
      # Nothing changed, so the copy is redundant.
      if unchanged:
        os.remove(newfile)

The idea was to run

update-applelinks *.md

in a directory of Markdown source files and end up with several new files with the .new extension. These were the files with the updated affiliate links. As you can see from the code, update-applelinks actually creates a .new file for every file in the folder, but it deletes the ones that didn’t have old-style affiliate links and were therefore unchanged from the original. This might seem terribly inefficient, but doing it this way made the script easy to write, and it ran in the blink of an eye despite all the extra creation and deletion.
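To see what the regular expression does to a single line, here is a minimal sketch that applies it to the sample link from above (rejoined onto one line) and to a line with no affiliate parameters. It is written for Python 3, touches no files, and the `example.com` line is made up purely for illustration.

```python
import re

# Same pattern and replacement as in update-applelinks.
applelink = re.compile(r'(^\[\d+\]: .+)(partnerId=30&siteID=L4JhWyGwYTM).*$')
phglink = r'\1at=10l4Fv'

# The old-style sample link, with the readability line break removed.
old = ('[1]: http://itunes.apple.com/us/app/notesy-for-dropbox/'
       'id386095500?mt=8&partnerId=30&siteID=L4JhWyGwYTM\n')
# A hypothetical non-affiliate reference link.
plain = '[2]: http://example.com/some-other-page\n'

new = applelink.sub(phglink, old)
print(new, end='')

# Lines without the old parameters pass through untouched.
print(applelink.sub(phglink, plain) == plain)
```

The first group captures everything up to and including the `&` before `partnerId`, so the replacement only has to append the new token. The trailing newline survives because `.` doesn't match it. Note that the substitution leaves the scheme alone, so a link that started with `http` still starts with `http` afterward.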

After the .new files were created, I published them to the blog using this script, which I normally run as a Text Filter from within BBEdit and to which I have a symlink, publish-post, stored in my ~/Dropbox/bin folder. Finally, I’d rename all the .new files to get rid of the .new extension, overwriting their older versions in the process. After a few tests to make sure it worked, I could just cd into a directory of post files and execute

update-applelinks *.md; publish-new *.new; rename -f 's/\.new//' *.new

publish-new is a simple shell script that calls publish-post, dumps its output to /dev/null, and waits a bit between files to avoid confusing WordPress.

bash:
    #!/bin/bash

    # Publish each file in turn, discarding publish-post's output.
    for f in "$@"; do
      publish-post < "$f" > /dev/null
      # Pause between posts to avoid confusing WordPress.
      sleep 3
    done

I keep the source of my posts in a year/month folder hierarchy, so it took only a few minutes to work my way through all directories back to January 2012, when I started using Undie’s system for affiliate links.
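With a hierarchy like that, one way to confirm nothing was missed is to walk the whole tree afterward and list any files that still contain an old-style link. The following is a sketch, not part of the workflow described here: the function name `find_stale`, the `.md` filter, and the two marker strings are assumptions based on the link formats shown in this post.

```python
import os

# Substrings that identify the two old link styles discussed in this post.
MARKERS = ('siteID=L4JhWyGwYTM', 'click.linksynergy.com')

def find_stale(root):
    """Return sorted paths of .md files under root that still contain old links."""
    stale = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith('.md'):
                continue
            path = os.path.join(dirpath, name)
            # errors='replace' keeps an oddly encoded file from stopping the scan.
            with open(path, 'r', errors='replace') as f:
                text = f.read()
            if any(marker in text for marker in MARKERS):
                stale.append(path)
    return sorted(stale)
```

Calling `find_stale('.')` from the top of the posts folder should return an empty list once every directory has been processed.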

I did, unfortunately, have a few months’ worth of affiliate links that had the awful format suggested by LinkShare:

http://click.linksynergy.com/fs-bin/stat?id=L4JhWyGwYTM
&offerid=146261&type=3&subid=0&tmpid=1826
&RD_PARM1=http%253A%252F%252Fitunes.apple.com%252Fus
%252Fapp%252Fnotesy-for-dropbox%252Fid386095500
%253Fmt%253D8%2526uo%253D4%2526partnerId%253D30

At first, I figured I’d just leave these alone. I didn’t use that format very long and, comparatively speaking, there weren’t many links of that type. After a few days, though, the wrongness of leaving those old links there got to me and I put together a script to fix them:

python:
    #!/usr/bin/python

    # unquote lives in urllib.parse on Python 3 and in urllib on Python 2.
    try:
      from urllib.parse import unquote
    except ImportError:
      from urllib import unquote
    import re, sys, os

    # Markdown source files to process.
    files = sys.argv[1:]

    phgSuffix = 'at=10l4Fv'
    # A reference-style link line pointing at LinkShare's redirector.
    synergyLink = re.compile(r'^(\[\d+\]:) (https?://click\.linksynergy\.com.*)$')
    # The iTunes URL buried in the redirect, with its optional mt parameter.
    itunesLink = re.compile(r'(https?://itunes\.apple\.com[^\?]+)\?(mt=\d+)?')

    for oldfile in files:
      newfile = oldfile + '.new'
      unchanged = True
      with open(newfile, 'w') as new:
        with open(oldfile, 'r') as old:
          for line in old:
            msynergy = synergyLink.search(line)
            if msynergy:
              ref = msynergy.group(1)
              # The embedded URL is percent-encoded twice, so decode it twice.
              synergy = unquote(unquote(msynergy.group(2)))
              mitunes = itunesLink.search(synergy)
              if mitunes:
                # Rebuild the link, keeping the mt parameter if there was one.
                if mitunes.group(2):
                  newline = '%s %s?%s&%s\n' % (ref, mitunes.group(1),
                                             mitunes.group(2), phgSuffix)
                else:
                  newline = '%s %s?%s\n' % (ref, mitunes.group(1), phgSuffix)
                unchanged = False
              else:
                newline = line
            else:
              newline = line
            new.write(newline)
      # Remove the redundant copy only after both files are closed.
      if unchanged:
        os.remove(newfile)

This script, clicksynergy-links, is structured like update-applelinks but is much messier because the links it’s fixing are much messier. It has to find the click.linksynergy.com links, then extract the itunes.apple.com parts out of them to construct the clean Undie-style URL. There are several pieces and a couple of optional parts, so Patterns was even more helpful with this script.
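The least obvious step is the double call to `unquote`. The destination URL inside a LinkShare link is percent-encoded twice, so `%253A` has to be decoded to `%3A` and then to a colon before the iTunes regex can find anything. Here is a minimal sketch of that extraction on the sample link from above (rejoined onto one line), written for Python 3; the variable names `once`, `twice`, and `clean` are just for illustration.

```python
import re
from urllib.parse import unquote

# Same pattern as in clicksynergy-links.
itunesLink = re.compile(r'(https?://itunes\.apple\.com[^\?]+)\?(mt=\d+)?')

# The LinkShare sample link, with the readability line breaks removed.
synergy = ('http://click.linksynergy.com/fs-bin/stat?id=L4JhWyGwYTM'
           '&offerid=146261&type=3&subid=0&tmpid=1826'
           '&RD_PARM1=http%253A%252F%252Fitunes.apple.com%252Fus'
           '%252Fapp%252Fnotesy-for-dropbox%252Fid386095500'
           '%253Fmt%253D8%2526uo%253D4%2526partnerId%253D30')

once = unquote(synergy)   # %253A becomes %3A, and so on
twice = unquote(once)     # %3A becomes a colon, and so on

# Pull out the iTunes URL and its mt parameter, then append the PHG token.
m = itunesLink.search(twice)
clean = '%s?%s&at=10l4Fv' % (m.group(1), m.group(2))
print(clean)
```

After the second decoding, the iTunes URL is sitting in plain sight at the end of the string, and the regex picks up the base URL and the `mt=8` parameter while ignoring the `uo` and `partnerId` parameters that follow.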

After the links were fixed in the .new files, the publishing and renaming went as before:

clicksynergy-links *.md; publish-new *.new; rename -f 's/\.new//' *.new

You’ll note that there are no comments in these scripts. I’m not proud of that fact, but I’m not ashamed of it, either. These are one-time scripts meant to solve a particular problem—comments would use up time that I’d never get back.

Scripting was really the only practical way to tidy up the Markdown source. The alternative—hunting down and fixing every affiliate link in a few hundred posts—was something I would never have done, even with the help of grep and BBEdit.