TextExpander, AppleScript, and Unicode

One of my favorite and most-used TextExpander snippets displayed a bug a couple of days ago. I suspect the bug is due to the way TextExpander handles Unicode, and I haven’t figured out a workaround.

The bug appeared as I was writing this post and needed a link to this older post about Mac screen resolutions. I had the older post up as the current tab in Safari and used the ;furl snippet (which you can find in this GitHub repository) to insert the URL of the older post. The URL is

http://www.leancrew.com/all-this/2010/10/new-macs’-resolutions/

which has a curly apostrophe. It’s this non-ASCII character that’s the source of the trouble.

The ;furl snippet is an AppleScript snippet. The nut of the code is this line:

tell application "Safari" to get URL of front document  

If I run that code in the AppleScript Editor—with the older post showing in Safari’s front tab, of course—it returns an encoded version of the URL, in quotes, down in ASE’s Results panel.

[Screenshot: AppleScript results]

http://www.leancrew.com/all-this/2010/10/new-macs%E2%80%99-resolutions/

If I take this and paste it into Safari’s address field, it translates the percent-encoded stuff into a curly apostrophe and takes me to the correct page. This is as it should be, because “E2 80 99” is the three-byte UTF-8 encoding of the curly apostrophe (U+2019).
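As a quick sanity check on that byte sequence, here is a minimal Python sketch. Python is only an illustration language here (the snippets in this post are AppleScript), the variable names are made up for the example, and it relies on nothing beyond the standard library's urllib.parse.

```python
from urllib.parse import quote, unquote

# The percent-encoded URL as AppleScript returned it.
encoded = "http://www.leancrew.com/all-this/2010/10/new-macs%E2%80%99-resolutions/"

# unquote() treats percent escapes as UTF-8 bytes by default.
decoded = unquote(encoded)
print(decoded)

# The curly apostrophe is U+2019; show its UTF-8 bytes in hex.
apostrophe = "\u2019"
utf8_hex = " ".join(f"{b:02X}" for b in apostrophe.encode("utf-8"))
print(utf8_hex)

# Re-encoding the decoded URL (leaving ':' and '/' alone) round-trips.
round_trip = quote(decoded, safe=":/")
print(round_trip == encoded)
```

The decoded string contains the curly apostrophe, and re-encoding it gives back the same percent-encoded URL, so the two forms carry the same information.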

But if I have that same code saved as an AppleScript snippet in TextExpander, and I type ;furl in TextMate, it expands to

http://www.leancrew.com/all-this/2010/10/new-macs?2-resolutions/  

which is how the link in my post got screwed up. I was typing away and didn’t notice the mistake.

It’s not just in TextMate that the problem appears. If I type ;furl in TextEdit, or Terminal, or any editable text field, I get the same bad output.

Interestingly, if I run this at the command line or in TextMate (via the Execute Line Inserting Result [⌃R] command in the Text menu)

osascript -e 'tell application "Safari" to get URL of front document'

I get the correct percent-encoded output. So it’s not the application the snippet is invoked in that’s causing the problem.

TextExpander has no problem with Unicode in standard snippets. I defined a plain text snippet with

http://www.leancrew.com/all-this/2010/10/new-macs’-resolutions/

as the content and ;ddd as the trigger, and it works just fine. It doesn’t do the percent encoding; it just inserts the curly apostrophe. But if I define a plain text snippet with

http://www.leancrew.com/all-this/2010/10/new-macs%E2%80%99-resolutions/

as the content, it inserts the messed-up version

http://www.leancrew.com/all-this/2010/10/new-macs?2-resolutions/

when triggered.

Conversely, if I define an AppleScript snippet with a content of

get "http://www.leancrew.com/all-this/2010/10/new-macs’-resolutions/"

it returns the correct

http://www.leancrew.com/all-this/2010/10/new-macs’-resolutions/

when triggered. And as you’re probably guessing, an AppleScript snippet with a content of

get "http://www.leancrew.com/all-this/2010/10/new-macs%E2%80%99-resolutions/"  

returns the incorrect

http://www.leancrew.com/all-this/2010/10/new-macs?2-resolutions/

when triggered. So it looks like Unicode itself isn’t TextExpander’s problem; the problem is percent-encoded strings, which TextExpander seems to be trying and failing to decode. It’s a weird bug: you’d think a percent-encoded string would just pass through unchanged.
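The reasoning behind that conclusion can be laid out as a small Python sketch. It only tabulates the four experiments described above; the helper names are hypothetical, and nothing here touches TextExpander itself.

```python
import re

CURLY = "http://www.leancrew.com/all-this/2010/10/new-macs\u2019-resolutions/"
ENCODED = "http://www.leancrew.com/all-this/2010/10/new-macs%E2%80%99-resolutions/"

# (snippet type, snippet content, expanded correctly?) as reported in this post.
observations = [
    ("plain text", CURLY, True),
    ("plain text", ENCODED, False),
    ("AppleScript", CURLY, True),
    ("AppleScript", ENCODED, False),
]

def has_non_ascii(text):
    return not text.isascii()

def has_percent_escape(text):
    # A percent sign followed by two hex digits, e.g. %E2.
    return re.search(r"%[0-9A-Fa-f]{2}", text) is not None

for kind, content, ok in observations:
    print(f"{kind:12} non-ASCII={has_non_ascii(content)!s:5} "
          f"percent-escape={has_percent_escape(content)!s:5} ok={ok}")

# In every case, the expansion failed exactly when the content had percent escapes.
failure_tracks_escapes = all(has_percent_escape(c) != ok for _, c, ok in observations)
print(failure_tracks_escapes)
```

In these four cases the snippet type makes no difference, and the non-ASCII content is the one that works, which is why the percent escapes look like the culprit.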

As I said at the top, I don’t know a way to work around this problem; it seems like something Smile will have to fix. I’m sending them a bug report this morning.

Update 1/26/11
Here’s the response from Smile support:

Thanks for the report. I was able to reproduce it and have filed it with our engineering team for further investigation. Hopefully we can get it resolved in an upcoming release.

So the bug is in the pipeline for repair. Until it’s fixed, I’m not sure what to do with the ;furl snippet other than be on the lookout for URLs that will give it trouble.

Interestingly, my ;surl snippet (in the same repository) doesn’t have this problem because it encodes the URL before sending it off to Metamark for shortening, and the short URL that comes back is pure ASCII with no percent escapes. For example, running ;surl with the resolution post in Safari’s frontmost tab yields

http://xrl.us/bh5vva

which redirects correctly.