Incrementing file version numbers with Python and Perl
November 9, 2024 at 4:57 PM by Dr. Drang
Yesterday my old vaudeville partner,[1] Dan Sturm, wrote a post describing a Keyboard Maestro macro he wrote that increments the version number in a file name. It’s the sort of thing I could have used years ago when I was writing reports and analysis scripts that often had to be updated (but not overwritten) as new data came in. I’m not sure I’d have much use for it now, but it’s nice to know where to go if I need it.
The thing that caught my interest, though, was how Dan, with the help of Peter Lewis via the Keyboard Maestro forum, adjusted the macro to handle filenames with version numbers of different widths. If you give it a filename of
blue-image 03.jpg
the incremented version will be
blue-image 04.jpg
But if you start with
blue-image 0003.jpg
the increment will be to
blue-image 0004.jpg
The version numbers are always zero-padded integers, but the macro is smart enough to figure out how many digits the original name has so it can use the same number of characters in the incremented name.
Thinking about the post last night, I wanted to see if I could write a little code that was as smart as Dan’s macro. I didn’t want to reproduce his macro’s functionality; I just wanted to see if I could do the part that increments the version number while maintaining its character width. As it turns out, both Python and Perl have relatively simple ways to do it.
Here’s my Python script:
python:
 1:  #!/usr/bin/env python3
 2:
 3:  from sys import stdin
 4:  from re import fullmatch
 5:
 6:  def increment(fn):
 7:      'Return the given filename with an incremented version number'
 8:      try:
 9:          front, vers, ext = fullmatch(r'(.+?)v?(\d+)(\..+)', fn).group(1, 2, 3)
10:          return f'{front}{int(vers)+1:0{len(vers)}d}{ext}'
11:      except AttributeError:
12:          return f'!!!{fn}!!!'
13:
14:  # Print the incremented filename
15:  print(increment(stdin.read().rstrip('\n')))
The important part is the increment function. Because I wasn’t looking at Dan’s post when I wrote this, I didn’t use his regular expression to parse the filename and break it into parts. But I remembered that he split it into three parts:
- Everything in front of the version number. This can include path components.
- The version number itself, possibly with a leading “v” and padded with zeros. The leading “v” should be removed from the filename in the incremented version.
- Everything after the version number, which should be just a period and the extension.
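The regex that does this parsing is in Line 9 and works this way:
- (.+?) captures everything in front of the version number. The match is lazy, so it stops as soon as the rest of the pattern can account for the remainder of the string, leaving any “v” and all the digits for the parts that follow.
- v? matches an optional “v”. It sits outside the parentheses, so it isn’t captured and drops out of the incremented name.
- (\d+) captures the version number, leading zeros included.
- (\..+) captures a literal period and everything after it.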
Looking back on it, I see that I was more restrictive than Dan about what comes after the version number, but that’s OK.
I used Python’s fullmatch function from the re library. There are other ways to do it, but since my regex covered the entire string, fullmatch seemed like the best choice. If there is a match, fullmatch returns a match object. The group function then pulls out the three parenthesized groups, which are assigned to the front, vers and ext variables.
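To see what ends up in the three groups, here’s a minimal sketch that runs the same pattern over a few names. The split_name helper and the sample filenames are made up for illustration; they aren’t part of the script above.

```python
from re import fullmatch

PATTERN = r'(.+?)v?(\d+)(\..+)'   # same regex as Line 9 of the script

def split_name(fn):
    'Hypothetical helper: return (front, vers, ext), or None if nothing matches'
    m = fullmatch(PATTERN, fn)
    return m.group(1, 2, 3) if m else None

# Sample names: plain version, path plus leading "v", and no version at all
for name in ['blue-image 03.jpg', 'reports/blue-image v0003.jpg', 'blue-image.jpg']:
    print(repr(name), '->', split_name(name))
```

The second name shows both that path components land in the front group and that the “v” is matched but not captured.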
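The successful return value of increment is defined by the f-string in Line 10. The replacement fields (the parts inside curly braces) of the f-string are defined this way:
- {front} inserts the front of the filename unchanged.
- {int(vers)+1:0{len(vers)}d} inserts the incremented version number. The part before the colon is the value; the part after it is the format.
- {ext} inserts the period and extension unchanged.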
What’s great about f-strings is that you can use expressions in the replacement fields, not just variable names, and you can nest them. The expression
int(vers)+1
does the incrementing. The formatting applied to that value is
0{len(vers)}d
where the len(vers) expression is evaluated at run time to determine the character width of the version number in the input string.
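A small sketch of the nested field on its own may make this easier to see. The variable names here are just for illustration, and the call to format() is an equivalent spelling I’m adding for comparison, not something the script uses.

```python
vers = '0003'          # version string as it would come out of the regex
width = len(vers)

# Nested replacement field: the inner {width} is filled in first, giving
# the format spec '04d', which is then applied to the incremented number.
nested = f'{int(vers) + 1:0{width}d}'

# The same result with the spec built separately and handed to format()
by_hand = format(int(vers) + 1, f'0{width}d')

# The width is a minimum, so a rollover simply gains a digit
rolled = f'{int("99") + 1:0{len("99")}d}'

print(nested, by_hand, rolled)   # 0004 0004 100
```

The last case is worth knowing about: incrementing 99 gives 100, not a truncated or failed result.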
Of course, we get that return value only if the fullmatch is successful. If it isn’t, fullmatch returns None, the call to group raises an AttributeError, and the return value is defined by the except block. In this case, the original filename surrounded by exclamation points is returned. So, for example, if we give increment an argument without a version number,
blue-image.jpg
we’ll get something out that should alert us that something went wrong:
!!!blue-image.jpg!!!
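The same fallback could also be written without try/except by testing the match object directly. This is only a sketch of an alternative; increment_checked is a hypothetical name, and the behavior is meant to be the same as the function in the script.

```python
from re import fullmatch

def increment_checked(fn):
    'Hypothetical variant of increment that tests for a failed match explicitly'
    m = fullmatch(r'(.+?)v?(\d+)(\..+)', fn)
    if m is None:                       # no version number found
        return f'!!!{fn}!!!'
    front, vers, ext = m.groups()
    return f'{front}{int(vers) + 1:0{len(vers)}d}{ext}'

print(increment_checked('blue-image 03.jpg'))   # blue-image 04.jpg
print(increment_checked('blue-image.jpg'))      # !!!blue-image.jpg!!!
```

Which form to prefer is a matter of taste; the try/except version keeps the successful path on two lines.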
Line 15 handles both the collection of input from stdin and the writing of output to stdout. The rstrip('\n') function makes sure that any trailing newlines, which are common in stdin, are removed before we run the string through increment.
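Here’s a quick sketch of why the stripping matters. I’m using StringIO as a stand-in for stdin so it runs without piped input; that substitution and the variable names are assumptions for the example.

```python
from io import StringIO
from re import fullmatch

PATTERN = r'(.+?)v?(\d+)(\..+)'   # same regex as Line 9

# StringIO stands in for stdin so the sketch runs without piped input
fake_stdin = StringIO('blue-image 03.jpg\n')
raw = fake_stdin.read()
cleaned = raw.rstrip('\n')

# "." does not match a newline by default, so the raw string fails to match
print(fullmatch(PATTERN, raw))                # None
print(fullmatch(PATTERN, cleaned).group(2))   # 03
```

Without the rstrip, a filename piped in with its usual trailing newline would come back wrapped in exclamation points.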
As you might expect, the Perl code is shorter and a little more cryptic:
perl:
1: #!/usr/bin/env perl -nl
2:
3: # Extract the individual parts of the filename from stdin
4: ($front, $vers, $ext) = /^(.+?)v?(\d+)(\..+)$/;
5:
6: # Print the incremented filename
7: printf("%s%0*d%s\n", $front, length($vers), $vers+1, $ext)
The -nl options in the shebang line get the input string from stdin and strip any trailing newlines. Line 4 is basically the same as Line 9 in the Python code; the ^ and $ anchors are needed here to make sure we’re matching the entire string.
The trick in the Perl code is the asterisk in the printf statement on Line 7. It acts as a placeholder for the second argument after the format string, which is length($vers). The rules for this are covered in the minimum width section of the sprintf documentation.
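Python’s older %-style formatting accepts the same asterisk, so the Perl approach can be mirrored almost character for character. This is a sketch for comparison only, with the three parts hard-coded rather than parsed from a filename.

```python
# Parts as they would come out of the regex (hard-coded for this sketch)
front, vers, ext = 'blue-image ', '0003', '.jpg'

# %0*d consumes two values: first the minimum width, then the number
name = '%s%0*d%s' % (front, len(vers), int(vers) + 1, ext)

print(name)   # blue-image 0004.jpg
```

As in Perl, the width value comes before the number it applies to in the argument list.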
Part of the reason the Perl code is shorter is that I didn’t give it any error handling. But Perl is just naturally more terse than Python. “Explicit is better than implicit” is not part of the Perl mantra.
My thanks to Dan for giving me something to puzzle over. I don’t want anyone to think I’m trying to “improve” his code. I just wanted to see if I could do dynamic formatting in languages I use regularly.
[1] You remember our act, Sturm und Drang, don’t you?