Adapting BBC Radio recording scripts
October 13, 2009 at 5:06 PM by Dr. Drang
Reader Chris Nelms sent me a nice email over the weekend. He’s a fan of several BBC Radio 4 programs and had been recording them with Audio Hijack Pro for listening to in his car—similar to my time-shifting of Radio 2 shows, except that Chris’s recordings probably have more legitimacy, as he’s a UK resident and has presumably paid the BBC license fee. When the BBC changed its program URL scheme, it broke his AHP setup. He found my scripts for recording Radio 2 shows (first discussed in this post, then updated and put in a GitHub repository) and adapted them for Radio 4.
The most important changes were to my Python radio2.py module. That module starts like this:
 1: import datetime
 2: import urllib
 3: import BeautifulSoup
 4: import re
 5:
 6: # The particulars for the shows we're interested in.
 7: showinfo = {'jukebox': (5, 'Mark Lamarr'),
 8:             '70s': (6, re.compile(r'Sounds of the 70s')),
 9:             '60s': (5, re.compile(r'Sounds of the 60s')),
10:             'soul': (2, 'Trevor Nelson')}
11:
12:
13: def recentScheduleURL(showday, day=datetime.date.today()):
14:     'Return the schedule URL for the most recent showday (0=Mon, 6=Sun) on or before day.'
15:
16:     backup = datetime.timedelta((day.weekday() - showday) % 7)
17:     programDay = day - backup
18:     return 'http://www.bbc.co.uk/radio2/programmes/schedules/%d/%02d/%02d' % (programDay.year, programDay.month, programDay.day)
19:
20:
21: def programCode(show):
22:     'Return the code of the program page for showname on the most recent showday.'
23:     try:
24:         schedHTML = urllib.urlopen(recentScheduleURL(showinfo[show][0])).read()
25:         schedSoup = BeautifulSoup.BeautifulSoup(schedHTML)
26:         return schedSoup.find(name='span', text=showinfo[show][1]).parent.parent['href'].split('/')[-1]
27:     except KeyError:
28:         return None
To adapt it to Radio 4, Chris changed the schedule URL string in Line 18 to Radio 4’s URL and changed the showinfo dictionary in Lines 7-10 to reflect the shows he listens to and the days of the week on which they air. Because the BBC’s site overhaul made its URLs more uniform in structure, those were the only changes needed.
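To make the adaptation concrete, here is a minimal sketch of the date arithmetic and URL formatting on their own, with no network access. The two URL patterns are copied from the listings in this post; the SCHEDULE_URLS dictionary, its station keys, and the mostRecentShowday and scheduleURL helpers are names invented for this illustration and are not part of radio2.py.

```python
import datetime

# URL patterns copied from the listings in this post.
# The dictionary and its keys are hypothetical, for illustration only.
SCHEDULE_URLS = {
    'radio2': 'http://www.bbc.co.uk/radio2/programmes/schedules/%d/%02d/%02d',
    'radio4': 'http://www.bbc.co.uk/radio4/programmes/schedules/fm/%d/%02d/%02d',
}


def mostRecentShowday(showday, day):
    'Return the latest date on or before day that falls on weekday showday (0=Mon, 6=Sun).'
    # The modulo keeps the step back in the range 0-6 days.
    backup = datetime.timedelta((day.weekday() - showday) % 7)
    return day - backup


def scheduleURL(station, showday, day):
    'Return the schedule URL for station on the most recent showday on or before day.'
    programDay = mostRecentShowday(showday, day)
    return SCHEDULE_URLS[station] % (programDay.year, programDay.month, programDay.day)


# October 13, 2009 was a Tuesday, so a Saturday show (5) backs up three days.
tuesday = datetime.date(2009, 10, 13)
print(scheduleURL('radio2', 5, tuesday))
print(scheduleURL('radio4', 5, tuesday))
```

Both calls land on October 10, 2009; only the station-specific part of the URL differs, which is why swapping the URL string and the showinfo entries was enough.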
I didn’t ask Chris which shows he records, but I did notice that, unlike the Radio 2 programs I listen to, many Radio 4 shows are daily rather than weekly. This makes crafting a workable showinfo dictionary a bit more challenging. For example, the Today program runs in the morning, Monday through Saturday. If you wanted to record each episode in the afternoon or evening of the day it aired, there are a couple of ways to go about it.
The brute force technique would be to create a showinfo entry for each day the show airs:
 6: # The particulars for the shows we're interested in.
 7: showinfo = {'montoday': (0, re.compile(r'^Today$')),
 8:             'tuetoday': (1, re.compile(r'^Today$')),
 9:             'wedtoday': (2, re.compile(r'^Today$')),
10:             'thutoday': (3, re.compile(r'^Today$')),
11:             'fritoday': (4, re.compile(r'^Today$')),
12:             'sattoday': (5, re.compile(r'^Today$'))}
13:
14: def recentScheduleURL(showday, day=datetime.date.today()):
15:     'Return the schedule URL for the most recent showday (0=Mon, 6=Sun) on or before day.'
16:
17:     backup = datetime.timedelta((day.weekday() - showday) % 7)
18:     programDay = day - backup
19:     return 'http://www.bbc.co.uk/radio4/programmes/schedules/fm/%d/%02d/%02d' % (programDay.year, programDay.month, programDay.day)
20:
21:
22: def programCode(show):
23:     'Return the code of the program page for showname on the most recent showday.'
24:     try:
25:         schedHTML = urllib.urlopen(recentScheduleURL(showinfo[show][0])).read()
26:         schedSoup = BeautifulSoup.BeautifulSoup(schedHTML)
27:         return schedSoup.find(name='span', text=showinfo[show][1]).parent.parent['href'].split('/')[-1]
28:     except KeyError:
29:         return None
The search criterion for the show name had to be a regular expression with start and end anchors (^ and $), because Radio 4 also has a daily show called Farming Today, the name of which would interfere with a search that didn’t have the anchors. Note that I’ve also made the necessary change to the schedule URL string in Line 19 (née 18).
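The effect of the anchors is easy to check in isolation. This is only a sketch: the titles list is an illustrative stand-in for the text found in the schedule page, and it assumes the pattern is applied with search semantics, as re's search method does.

```python
import re

anchored = re.compile(r'^Today$')
unanchored = re.compile(r'Today')

# Illustrative stand-ins for show titles on a schedule page.
titles = ['Farming Today', 'Today']

# Without anchors, any title containing "Today" matches.
print([t for t in titles if unanchored.search(t)])  # ['Farming Today', 'Today']

# With anchors, the whole string must be exactly "Today".
print([t for t in titles if anchored.search(t)])    # ['Today']
```

Since find returns the first match it comes across, an unanchored pattern could pick up Farming Today whenever it appears earlier in the schedule than Today.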
The problem with the brute force approach is that it requires six different AHP sessions and six different audio source AppleScripts, one for each day of the week. A better approach would be this:
 6: # The particulars for the shows we're interested in.
 7: showinfo = {'today': ((0,1,2,3,4,5), re.compile(r'^Today$'))}
 8:
 9: def recentScheduleURL(showday, day=datetime.date.today()):
10:     'Return the schedule URL for the most recent showday (0=Mon, 6=Sun) on or before day.'
11:
12:     if isinstance(showday, tuple):
13:         backups = [ (day.weekday() - d) % 7 for d in showday ]
14:         backup = datetime.timedelta(min(backups))
15:     else:
16:         backup = datetime.timedelta((day.weekday() - showday) % 7)
17:     programDay = day - backup
18:     return 'http://www.bbc.co.uk/radio4/programmes/schedules/fm/%d/%02d/%02d' % (programDay.year, programDay.month, programDay.day)
19:
20:
21: def programCode(show):
22:     'Return the code of the program page for showname on the most recent showday.'
23:     try:
24:         schedHTML = urllib.urlopen(recentScheduleURL(showinfo[show][0])).read()
25:         schedSoup = BeautifulSoup.BeautifulSoup(schedHTML)
26:         return schedSoup.find(name='span', text=showinfo[show][1]).parent.parent['href'].split('/')[-1]
27:     except KeyError:
28:         return None
Here, I’ve generalized the recentScheduleURL function to accept either a tuple or an integer for the showday argument; either way, it finds the most recent episode. Taking advantage of this, I’ve used a tuple to define the days of the week on which Today runs. Now a single AHP session and a single AppleScript will suffice to record all the episodes in the week.
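To see what the tuple branch does over a full week, here is a small sketch that isolates the "how many days back" calculation. The daysBack helper is a hypothetical name introduced for this example; the arithmetic is the same as in the listing, but nothing here is fetched from the BBC.

```python
import datetime


def daysBack(showday, day):
    'Return how many days before day the most recent showday (int or tuple of ints) fell.'
    if isinstance(showday, tuple):
        # The smallest step back across all airing days is the most recent episode.
        return min((day.weekday() - d) % 7 for d in showday)
    return (day.weekday() - showday) % 7


todayDays = (0, 1, 2, 3, 4, 5)       # Monday through Saturday
monday = datetime.date(2009, 10, 12)  # a Monday

for offset in range(7):
    day = monday + datetime.timedelta(offset)
    programDay = day - datetime.timedelta(daysBack(todayDays, day))
    print('%s -> %s' % (day.isoformat(), programDay.isoformat()))
```

Monday through Saturday each map to themselves, and Sunday the 18th falls back one day to Saturday the 17th, the last day the show aired.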
I like this approach so much, I’ve incorporated the tuple option into radio2.py. Although I don’t need it now, I can foresee a time when I’d want to record a daily show. My thanks to Chris Nelms for his help.
The updated version is available at the GitHub repository.