# A double library transplant

With a new EXIF library in place, I rewrote my canonize photo renaming utility to take advantage of it. Canonize was my motivation for finding a new EXIF library in the first place. It’s a command-line program that renames photos based on the date and time they were taken. It does so by reading the EXIF metadata in the photo file and extracting the DateTimeOriginal field. The name is a bad pun on the idea of a canonical filename for the photos and the fact that I use Canon cameras.

Until now, canonize relied on the pyexif library, which works fine but can only read EXIF data, not write it. Canonize doesn’t need to write EXIF data, but I plan to write other scripts that do, and I want to standardize on a single library for all my EXIF work.

Substituting pyexiv2 methods for pyexif methods was really easy—only 3-4 lines needed to be changed and the changes themselves were obvious. I’ll point them out in a bit. But since I had the hood open, it seemed like a good time to switch out another library, the options parsing library.

Before today, canonize used the optparse library, which was the standard high-level library for handling command-line options in Python 2.6. It was deprecated in 2.7 in favor of the argparse library. You might think I’d upgrade to argparse, but I decided to move instead to the simpler getopt library, which doesn’t have all the bells and whistles of the other libraries but is plenty capable for my elementary needs and is unlikely to be deprecated because it’s written to mimic the venerable getopt() C function.

So here’s the new source code:

python:
1:  #!/usr/bin/env python
2:
3:  import pyexiv2
4:  import getopt
5:  import os
6:  import os.path
7:  import sys
8:
9:  # Options and help messages.
10:  usage = """Usage: canonize [options] [list of files]
11:
12:  Options:
13:    -s sss    optional suffix
14:    -f        get filenames from STDIN instead of command line
15:    -t        show the renaming but don't do it
16:    -h        show this help message
17:
18:  Rename a list of photo files (JPEGs) according to the date
19:  on which they were taken. The format for the file name is
20:  yyyymmddsss-nnn.jpg, where yyyy is the year, mm is the month
21:  number, dd is the day, sss is the optional suffix (which can
22:  be any length), and nnn is the (zero-padded) photo number
23:  for that day. By default, the original file names are given
24:  on the command line; if the -f option is used, the original
25:  file names are taken from STDIN."""
26:
27:  # Handle the command line options.
28:  try:
29:    options, filenames = getopt.getopt(sys.argv[1:], 's:fth')
30:  except getopt.GetoptError, err:
31:    print str(err)
32:    sys.exit(2)
33:
34:  filtrate = False    # default for -f
35:  suffix = ''         # default for -s
36:  test = False        # default for -t
37:  for o, a in options:
38:    if o == '-s':
39:      suffix = a
40:    elif o == '-f':
41:      filtrate = True
42:    elif o == '-t':
43:      test = True
44:    else:
45:      print usage
46:      sys.exit()
47:
48:  # Get the file list and create a list of (filedate, filename) tuples.
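49:  if filtrate:
50:    filenames = [x.strip() for x in sys.stdin]   # assumed: one name per line
51:  filedates = []
52:  for f in filenames:
53:    info = pyexiv2.ImageMetadata(f)   # assumed pyexiv2 0.2+ API
54:    try:                              # skip over files without EXIF info
55:      info.read()                     # assumed pyexiv2 0.2+ API
56:      d = info['Exif.Photo.DateTimeOriginal'].raw_value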
57:      filedates.append((d, f))
58:    except KeyError:
59:      continue
60:
61:  # Don't bother going on if there aren't any files in the list.
62:  if len(filedates) == 0:
63:    sys.exit()
64:
65:  # Some background info:
66:  # DateTimeOriginal is a string in the form 'yyyy:mm:dd hh:mm:ss'.
67:  # All the numbers use leading zeros if necessary; the hours use a
68:  # 24-hour clock format. An alphabetic sort on strings in this form
69:  # also sorts on date and time. Running split() on this string yields
70:  # a (date, time) tuple.
71:
72:  # Sort the files according to date and time taken.
73:  filedates.sort()
74:
75:  # Create a list of (oldfilename, newfilename) tuples.
76:  newnames = []
77:  i = 0                               # initialize the sequence number
78:  prev = filedates[0][0].split()[0]   # initialize the date
79:  for date, old in filedates:
80:    current = date.split()[0]
81:    if current == prev:               # still on same date
82:      i += 1
83:    else:                             # starting new date
84:      i = 1
85:      prev = current
86:    path = os.path.dirname(old)
87:    new = os.path.join(path,
88:      "%s%s-%03d.jpg" % (current.replace(':', ''), suffix, i))
89:    if new in filenames:
90:      sys.stderr.write("Error: %s is already being used\n" % new)
91:      sys.exit()
92:    else:
93:      newnames.append((old, new))
94:
95:  # Rename the files or print out how they would be renamed.
96:  if test:
97:    for o,n in newnames:
98:      print "%s -> %s" % (o, n)
99:  else:
100:    for o,n in newnames:
101:      os.rename(o,n)
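
The naming logic in Lines 75-93 can be tried out without any photos or EXIF library at all. What follows is a minimal sketch, not part of canonize itself: the function name `plan_names` and the sample timestamps and filenames are made up for illustration, and it's written to run under Python 3.

```python
import os.path

def plan_names(filedates, suffix=''):
    """Return (old, new) pairs for a list of (DateTimeOriginal, filename) tuples.

    DateTimeOriginal strings look like 'yyyy:mm:dd hh:mm:ss', so sorting
    the tuples alphabetically also sorts them by date and time.
    """
    newnames = []
    prev = None      # date of the previous photo
    i = 0            # sequence number within the current date
    for stamp, old in sorted(filedates):
        current = stamp.split()[0]          # the 'yyyy:mm:dd' part
        if current == prev:                 # still on the same date
            i += 1
        else:                               # starting a new date
            i = 1
            prev = current
        new = os.path.join(os.path.dirname(old),
                           "%s%s-%03d.jpg" % (current.replace(':', ''), suffix, i))
        newnames.append((old, new))
    return newnames

# Hypothetical sample data, deliberately out of order.
sample = [
    ('2011:04:17 09:15:02', 'IMG_0003.JPG'),
    ('2011:04:16 18:30:45', 'IMG_0001.JPG'),
    ('2011:04:16 18:31:10', 'IMG_0002.JPG'),
]

for old, new in plan_names(sample, suffix='zoo'):
    print("%s -> %s" % (old, new))
```

The two photos from April 16 get sequence numbers 001 and 002, and the numbering starts over at 001 for April 17. Unlike canonize, this sketch doesn't check whether a new name collides with an existing file.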


One of the nice things about getopt is that it doesn’t cobble together a usage message from help strings scattered across the code, as the other option-parsing libraries do, so it encourages you to write a single, monolithic usage message. The user gains nothing from this, but the programmer gets to see the code’s raison d’être in one spot.
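
For contrast, here's a rough sketch of how the same options might be declared with argparse. Canonize doesn't use this; the `dest` names and the one-line description are my own choices for illustration. Notice how the help text ends up spread across the individual `add_argument` calls, and the library assembles the usage message for you (it also adds `-h` on its own).

```python
import argparse

# Each option carries its own help string; argparse stitches them together.
parser = argparse.ArgumentParser(
    prog='canonize',
    description='Rename photo files according to the date they were taken.')
parser.add_argument('-s', dest='suffix', default='', metavar='sss',
                    help='optional suffix')
parser.add_argument('-f', dest='filtrate', action='store_true',
                    help='get filenames from STDIN instead of command line')
parser.add_argument('-t', dest='test', action='store_true',
                    help="show the renaming but don't do it")
parser.add_argument('filenames', nargs='*', help='photo files to rename')

# Parse a hypothetical command line instead of sys.argv.
args = parser.parse_args(['-s', 'zoo', '-t', 'a.jpg', 'b.jpg'])
print(args.suffix, args.filtrate, args.test, args.filenames)

# The generated usage message, built from the pieces above.
print(parser.format_help())
```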

The command-line handling is done in Lines 27-46 and it’s pretty obvious what’s going on. The most important things to know are:

• The second argument to getopt.getopt is a string with the letters of all the command-line options. Letters followed by a colon take an argument.
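• getopt.getopt returns a pair. The first item is a list of tuples in the form

[(option, value), (option, value), (option, value), ...]


where each option includes its leading hyphen (e.g., '-s'). Options that don’t take arguments have an empty string as their second element. The second item of the pair is the list of arguments left over after the options, which canonize treats as the filenames.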
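
Here's a small sketch of what that looks like in practice, using the same option string as canonize but a made-up argument list in place of `sys.argv[1:]`. It's written to run under Python 3, though the getopt calls are the same in Python 2.

```python
import getopt

# Hypothetical command line: canonize -s zoo -t IMG_0001.JPG IMG_0002.JPG
argv = ['-s', 'zoo', '-t', 'IMG_0001.JPG', 'IMG_0002.JPG']

# 's:' means -s takes an argument; f, t, and h are simple flags.
options, filenames = getopt.getopt(argv, 's:fth')
print(options)     # [('-s', 'zoo'), ('-t', '')]
print(filenames)   # ['IMG_0001.JPG', 'IMG_0002.JPG']

# An option that isn't in the string raises GetoptError.
try:
    getopt.getopt(['-x'], 's:fth')
    unknown_rejected = False
except getopt.GetoptError as err:
    unknown_rejected = True
    print(err)
```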

The new EXIF library is called in Lines 53, 55, and 56. If you look back to an earlier version of canonize, you’ll see that these lines are nearly one-for-one replacements of lines that called the previous library. That’s why it was so easy to make the switch.

Now that I’m comfortable with pyexiv2, I’ll start putting it into scripts that do more interesting things. My first thoughts are