No audio dynamite

I’ve been screwing around with audio files pretty much all day today, and I’m not happy about it. What I’d like is a nice dictation/transcription system, but I’ve decided that speech recognition is not the way to go.

The Mac seems to be limited to iListen (no link, you don’t want it). I spent a good deal of time reading the training texts, then started dictating my own stuff. I guess the many, many errors are to be expected at this early stage; correcting the errors within the iListen environment should teach it the peculiarities of my voice and the sorts of words and phrases I commonly use. Unfortunately, the user interface for corrections is awful: the cursor moves soooooooo sloooooooowly that if you’re correcting a big chunk of text (which is what the iListen people recommend) you’ll spend most of your time navigating back and forth to the errors. And correcting every time you notice a mistake in the transcription is certain to make your sentences disjointed and raise your blood pressure. So iListen is out.

What about just typing up what I’ve dictated? I don’t expect to dictate great chunks of text, so even my modest typing skills should suffice. And since I already have some keyboard shortcuts for starting and pausing iTunes, wouldn’t it be great if I imported the audio files from my digital recorder into iTunes and played them back as I typed them up? Yes it would, and I’m still expecting to be able to do it, but the process hasn’t been smooth.

My digital recorder stores files in WMA (Windows Media) format. It came with software that is supposed to convert to AIFF on the fly as I transfer the files from the recorder to my Mac, but

  1. The transfer software that came with the recorder isn’t working on my iBook; for some reason, it won’t accept the registration code.
  2. I really don’t want to use the stupid transfer software. The Mac treats the recorder like a USB drive, so I can just drag files onto my hard disk. Why use some half-assed software from Olympus when the Finder works?

Of course, the Finder doesn’t convert from WMA into something I can use in iTunes. I have a copy of ffmpegX, which works, but takes a long time to open. I’ve been trying to use the command-line version of ffmpeg downloaded through DarwinPorts, but it keeps giving errors during the decoding process; I’m probably missing some codec. EasyWMA looks promising, even though I’d rather have a command-line tool that I could chain together with sox (see below). Like ffmpegX, it’s a Mac-packaged version of ffmpeg; unlike ffmpegX, it opens quickly. It will also pop the converted file into a user-defined playlist in iTunes, which is quite nice. I wish the demo were a bit more full-featured; it only converts the first 12 seconds of audio until you pay the $10 shareware fee. It’s not that $10 is too much, it’s that I can’t get a good sense of the conversion speed when it stops short like that.

One last thing (which may make me change my mind about the Olympus transfer software). Transcription at full speed is not something I can do. I need to start and stop so frequently, I spend more time doing that than typing. But sox allows me to slow down the audio (without changing the pitch) to a pace that I can just about keep up with. Of course, sox—which is famous for handling just about any audio format—can’t do WMA files. And because its tempo-changing operation would have to come between the conversion from WMA and the import into iTunes, the cool iTunes integration of EasyWMA is lost.
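For what it's worth, the convert-then-slow-down step can be sketched in a few lines of Python that simply shell out to the two tools. This is only a sketch under assumptions: it presumes an ffmpeg build that can actually decode the recorder's WMA files (exactly the part that has been failing so far) and a sox build that has the `tempo` effect, which changes speed without changing pitch. The function names, the `-slow.wav` naming, and the 0.75 factor are all made up for illustration.

```python
import subprocess
from pathlib import Path


def ffmpeg_to_wav(src, dst):
    """Build the ffmpeg command that decodes src (e.g. a WMA file) to dst.

    -y overwrites dst if it already exists; -i names the input file.
    """
    return ["ffmpeg", "-y", "-i", str(src), str(dst)]


def sox_slow_down(src, dst, factor=0.75):
    """Build the sox command that changes tempo but keeps the pitch.

    A factor below 1 slows the audio down; -s asks the tempo effect to
    use settings tuned for speech. Assumes a sox with the tempo effect.
    """
    if not 0 < factor <= 1:
        raise ValueError("factor should be in (0, 1] to slow the audio down")
    return ["sox", str(src), str(dst), "tempo", "-s", str(factor)]


def convert_and_slow(wma, factor=0.75, run=subprocess.run):
    """Decode a WMA file to WAV, then write a slowed-down copy next to it.

    Returns the path of the slowed-down file (hypothetical '-slow.wav' name).
    """
    wma = Path(wma)
    wav = wma.with_suffix(".wav")
    slow = wma.with_name(wma.stem + "-slow.wav")
    run(ffmpeg_to_wav(wma, wav), check=True)
    run(sox_slow_down(wav, slow, factor), check=True)
    return slow
```

Calling `convert_and_slow("memo.wma")` would leave `memo.wav` and `memo-slow.wav` beside the original. The `run` parameter is there only so the commands can be inspected or swapped out without ffmpeg and sox being installed.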

See, what I’m looking for is a smooth workflow for getting the audio off the recorder and into iTunes. Ideally, I’d like to drag the files into a folder (or onto an icon) that converts them, slows them down, and imports them into iTunes. Command-line tools are usually more amenable to this sort of thing because they can be piped together into a single shell script that can then be invoked from the dreaded AppleScript. Maybe I can just download ffmpeg and compile it directly, without DarwinPorts (possibly) getting in the way.
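The "drop folder" half of that workflow might look something like the sketch below. It is a guess at a shape, not a tested solution: it scans a folder for recordings that have not been processed yet, hands each one to whatever conversion step is supplied, and asks iTunes to add the result through a one-line AppleScript run by `osascript`. The function names and the `-slow.wav` marker are hypothetical, and the script assumes iTunes's `add` command will take a `POSIX file` reference.

```python
import subprocess
from pathlib import Path


def pending_recordings(folder, suffix=".wma"):
    """List recordings in folder that have no '-slow.wav' sibling yet."""
    folder = Path(folder)
    return sorted(
        p for p in folder.iterdir()
        if p.suffix.lower() == suffix
        and not p.with_name(p.stem + "-slow.wav").exists()
    )


def itunes_import_command(audio):
    """Build an osascript command asking iTunes to add one audio file."""
    path = str(Path(audio).absolute())
    # Escape backslashes and double quotes for the AppleScript string literal.
    escaped = path.replace("\\", "\\\\").replace('"', '\\"')
    script = f'tell application "iTunes" to add POSIX file "{escaped}"'
    return ["osascript", "-e", script]


def process_folder(folder, convert, run=subprocess.run):
    """Convert every pending recording and import the results into iTunes.

    convert is any callable that takes the recording's path and returns
    the path of the file to import (for example a WMA-to-slowed-WAV step).
    """
    imported = []
    for recording in pending_recordings(folder):
        slowed = convert(recording)
        run(itunes_import_command(slowed), check=True)
        imported.append(slowed)
    return imported
```

Keeping the conversion step as a plain argument means the folder scan and the iTunes import do not care whether ffmpeg, EasyWMA, or the Olympus software ends up doing the decoding.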

I guess this post is more me talking to myself than anything else. Maybe I’ll be able to post my solution in a few days.