August 1, 2009 at 2:50 PM by Dr. Drang
Shortly after writing this script for showing radio show song lists in a popup window, I thought it would be nice if the songs in my iTunes library included lyrics along with the title, artist, and album information. I downloaded GetLyrical and set it up to scan my Music library and install lyrics on all the tracks it could. GetLyrical uses LyricWiki.org, a community-edited repository, as its source for lyrics.
When I returned the next day, GetLyrical was done, having added lyrics to about 60% of the tracks. The 40% that didn’t get lyrics added fell into three categories:
- Tracks that just don’t have lyrics. Most of my jazz and classical tracks fall into this category, as do lectures from iTunes U.
- Tracks that LyricWiki.org doesn’t have in its repository.
- Tracks that LyricWiki.org has, but which are stored under a title or artist that differs from what’s in my iTunes library.
Obviously, there’s nothing to be done about the first category. I’ll have to look at other sites to find lyrics in the second category, but GetLyrical could work for tracks in the third category if I’m willing to change some of the metadata in my library.
As an example of the third category, I had some Buddy Holly songs that GetLyrical didn’t add lyrics to. Not believing that LyricWiki.org wouldn’t have “Peggy Sue,” for example, I looked at my library and found that several Holly songs were saved with “The Crickets” in the Artist field. After I changed that to “Buddy Holly,” GetLyrical added lyrics to all those songs.
How did I know which songs were still missing lyrics after running GetLyrical? With the following script, written in Python and using the appscript.py library.
     1:  #!/usr/bin/python
     2:
     3:  import appscript
     4:
     5:  lyricless = []
     6:
     7:  songs = appscript.app('iTunes').playlists['Music'].tracks.get()
     8:  for song in songs:
     9:    if (song.lyrics.get() == '') and (song.genre.get() != 'Classical') and (song.genre.get() != 'Jazz'):
    10:      lyricless.append((song.artist.get().encode('utf-8'), song.name.get().encode('utf-8')))
    11:
    12:  lyricless.sort()
    13:  print '\n'.join(['%s: %s' % (a,t) for a,t in lyricless])
I call the script lyricless.py, and I’ve saved its output into a file that I can scan for songs that need lyrics. You’ll note in Line 9 that I’ve tried to prevent jazz and classical tracks from cluttering the output. I’ve also sorted the list by artist (Line 12), because I thought it would be easier to scan the list that way.
The output has lines that look like this:
    Spencer Davis Group: Stevie's Blues
    Spencer Davis Group: The Hammer Song
    Spencer Davis Group: Trampoline
    Spencer Davis Group: Waltz For Lumumba
    Spencer Davis Group: When I Come Home
    Spinal Tap: Saucy Jack
    Squires: Going All The Way
    Staple Singers: When Will We Be Paid
    Stephen Lynch: Taxi Ride
    Stephen Malkmus & Lee Ranaldo: Can't Leave Her Behind
    Stephen Malkmus & The Million Dollar Bashers: Maggie's Farm
    Steve Cropper, Albert King, Pop Staples: Water
    Steve Earle & Reckless Kelly: Reconsider Me
    Stevie Ray Vaughan & Double Trouble: Flood Down In Texas
    Stevie Wonder: Fingertips Pt.2
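Because every output line has the form "artist: title", the saved file is easy to summarize. The following is a minimal sketch, not part of the original script, that counts lyricless tracks per artist. The function name count_by_artist is hypothetical, the code targets Python 3 rather than the Python 2 of the script above, and it assumes no artist name contains a colon followed by a space.

```python
from collections import Counter


def count_by_artist(lines):
    """Count tracks per artist from lines shaped like 'Artist: Title'."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first ': ' only, so colons inside titles are kept.
        artist, sep, _title = line.partition(': ')
        if sep:  # ignore lines that don't match the expected shape
            counts[artist] += 1
    return counts
```

Passing an open file of saved output to count_by_artist and then calling most_common() on the result would list the artists with the most missing lyrics first, which may help in deciding where a metadata fix would pay off most.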
Scanning through the list, I’ve noticed several things:
- LyricWiki.org is light on soul and blues songs, which is unfortunate, because my library is heavy on them.
- I have many Louis Armstrong tracks that haven’t been given the Jazz genre.
- My Band on the Run tracks are attributed to “Paul Mccartney & Wings.” At first I thought GetLyrical failed because of the lowercase “c,” but after checking LyricWiki.org, I see that it considers the artist to be just “Wings.” I’ll fix Paul’s name, but I won’t remove it from the metadata—those songs will have their lyrics added by hand.
- I have a Dylan track titled “It Takes A Lot To Laugh, It Ta,” which is an unfortunate truncation.
- I have a lot of instrumental tracks that aren’t jazz or classical: Ventures, Booker T & the MGs, Aphex Twin.
- LyricWiki.org has something against pub rock: none of my Nick Lowe, Dave Edmunds, or Rockpile tracks have lyrics.
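One way to act on the instrumental-track observation is to widen the test in Line 9 so that tracks can be skipped by artist as well as by genre. This is only a sketch under stated assumptions: the function name needs_lyrics and the two sets are hypothetical, not part of the original script, and the artist names are copied from the list above, so they would have to be adjusted to match the Artist field in the library exactly.

```python
# Genres and artists to leave out of the report.
# Both sets are illustrative assumptions, not taken from a real library.
SKIP_GENRES = {'Classical', 'Jazz'}
SKIP_ARTISTS = {'Ventures', 'Booker T & the MGs', 'Aphex Twin'}


def needs_lyrics(artist, genre, lyrics):
    """Return True for a track with no lyrics that probably should have some."""
    if lyrics:
        return False  # lyrics already present
    if genre in SKIP_GENRES:
        return False  # mostly instrumental genres
    if artist in SKIP_ARTISTS:
        return False  # known instrumental artists
    return True
```

In the loop of lyricless.py, the condition could then become needs_lyrics(song.artist.get(), song.genre.get(), song.lyrics.get()). Keeping the skip lists as data means a newly noticed instrumental artist is a one-word addition rather than another clause in the if statement.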