Deposition transcripts update
June 7, 2011 at 7:39 PM by Dr. Drang
My back went out this morning as I was loading my bike’s panniers for the ride to work, so I spent the day at home reading material for work and trying to find a comfortable position. I was more successful with the former than the latter.
What I was reading were deposition transcripts. These are Q and A sessions in which one or more lawyers ask a witness an interminable series of questions. It’s like courtroom testimony except that it’s done out of court and before trial. It’s part of what’s called discovery, where all the parties to a legal matter get to learn what the other sides are up to so they can evaluate their cases and plan their arguments. TV dramas always show big surprises during courtroom testimony; that doesn’t happen if the lawyers do their jobs in discovery.
Anyway, depositions are conducted in the presence of a court reporter and are typed up shortly thereafter. I sometimes act as an expert witness in lawsuits involving machinery and structural failures; in those projects, in addition to reviewing drawings and test results and calculations, I have to read transcripts of the eyewitness depositions. Only patents are a more boring reading matter than depositions.
The depositions I read today were in the form of plain text files that had been emailed to me. I didn’t want to read them on my computer screen, so I printed them out, four-up and double-sided, using a little Perl script. This script goes back about 10 years and was written during my Linux years, when PostScript files were my preferred final copy. I wrote about it way back in the early days of this blog.
For some reason, I’d never changed the script’s output from PostScript to PDF. Today, as a way of delaying the reading of the transcripts while still seeming productive, I changed the script to produce four-up PDFs that look like this
I also added the ability to set the output font size from the command line. Here’s the script, called dep24up:
1: #!/usr/bin/perl
2:
3: use strict;
4: use warnings;
5: use Getopt::Std;
6:
7: my $usage = <<'USAGE';
8: dep24up -- Create a 4-up PDF version of a deposition transcript
9: so it can be printed out nicely.
10:
11: usage: dep24up [options] file
12:
13: options:
14: -h : print this message
15: -b n : number of blank spaces to remove from beginning
16: of each line (default: 0)
17: -p sss : Perl regexp that defines a page boundary
18: (default: '^(\s{10,}(Page\s+)*(\d+))\s*$')
19: -e : page boundary regexp is at end of page, rather than
20: at beginning
21: -f n : font size before shrinking to 4-up (default: 18)
22: -d : create a troff version of the transcript for
23: debugging, but don't process it into a PDF
24: notes:
25: Page boundaries are processed after blank lines and initial
26: blank spaces are removed. Only $1 from the -p regexp is
27: preserved. The output file has the same base name as the
28: input file, with a '-4up.pdf' extension (or '.rf' if the -d
29: option is used).
30: USAGE
31:
32: # Handle command line
33: my %opt;
34: getopts('b:def:hp:', \%opt);
35: my $file = shift;
36:
37: die $usage if ($opt{h} || ! $file);
38: my $blanks = $opt{b} || 0;
39: my $page = $opt{p} || q(^(\s{10,}(Page\s+)*(\d+))\s*$);
40: my $atend = $opt{e};
41: my $fsize = $opt{f} || 18;
42: my $debug = $opt{d};
43:
44: # Slurp in whole file
45: open(TS, $file) or die "No $file: $!\n";
46: undef $/;
47: my $ts = <TS>; # ts = transcript
48:
49: # Simple cleanup
50: $ts =~ s/\n(\s*\n)+/\n/g; # weed out blank lines
51: $ts =~ s/^ {$blanks}//mg; # strip beginning blank spaces
52:
53: # Add codes at page boundaries
54: if ($atend) {
55: print STDERR "At end! $page \n";
56: $ts =~ s/$page/$1\n.bp\n\.sp |.5i/mg; # $1 goes before .bp
57: } else {
58: $ts =~ s/$page/.bp\n\.sp |.5i\n$1/mg; # $1 goes after .bp
59: $ts =~ s/\.bp\n\.sp \|\.5i\n//; # delete inadvertent .bp before 1st page
60: }
61:
62: # Add overall formatting commands at beginning
63: my $prolog = <<PROLOG;
64: .ft BMR
65: .ps $fsize
66: .vs 27
67: .po .5i
68: .ll 7.5i
69: .sp |.5i
70: .nf
71: .na
72: PROLOG
73:
74: $ts = $prolog . $ts;
75:
76: # Output
77: my ($base, undef) = split(/\./, $file, 2);
78: if ($debug) {
79: open OUT, ">$base.rf" or
80: die "Can't open $base.rf for writing: $!\n";
81:
82: }
83: else {
84: open OUT, "| groff -P-pletter| psnup -4 -cq -m18 | ps2pdf - $base-4up.pdf" or
85: die "Can't run pipeline: $!\n";
86: }
87:
88: print OUT $ts;
It uses the external programs groff, psnup, and ps2pdf. Groff comes with OS X; the other two don’t, but are part of most TeX distributions.
The idea behind the script is:
1. Insert troff formatting codes into the file.
2. Run the file through groff to generate a one-page-per-sheet PostScript stream.
3. Pipe the PostScript through psnup to turn it into a four-pages-per-sheet PostScript stream.
4. Pipe that PostScript through ps2pdf to create a PDF file.
Item 1 takes up the bulk of the script. Items 2-4 are basically all on Line 84.
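For anyone adapting the pipeline, here is a minimal sketch of Items 2-4 pulled out into two small subroutines. The names pipeline_for and send_through are mine, not part of dep24up, and the explicit close check is an addition: with a piped open, a failure in a later stage tends to show up when the handle is closed rather than when it is opened, so that is where I would look for trouble.

```perl
use strict;
use warnings;

# Hypothetical helper: build the shell pipeline used on Line 84.
sub pipeline_for {
    my ($base) = @_;
    my @stages = (
        'groff -P-pletter',         # troff codes -> one-up PostScript
        'psnup -4 -cq -m18',        # one-up -> four-up PostScript
        "ps2pdf - $base-4up.pdf",   # PostScript -> PDF
    );
    return join ' | ', @stages;
}

# Hypothetical helper: write text to a shell pipeline and check both ends.
sub send_through {
    my ($cmd, $text) = @_;
    open(my $out, '|-', $cmd) or die "Can't start pipeline: $!\n";
    print {$out} $text;
    # close waits for the pipeline and returns false on a nonzero exit
    close($out) or die "Pipeline failed (exit status " . ($? >> 8) . ")\n";
    return 1;
}

print pipeline_for('smith'), "\n";
```

In a plain shell pipeline the status seen at close is normally that of the last stage, so this would catch a ps2pdf failure but not necessarily one in groff or psnup.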
The trickiest thing is getting the pagination right. The plain text format for depositions isn’t standardized, and different court reporters use different headers and footers. The script uses a regular expression to locate the page boundaries; the default regex (Line 39) is one that works most of the time, but it can be changed at the command line.
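To get a feel for what the default pattern accepts, here is a small sketch that runs it against a few made-up lines. The sample lines, the is_boundary helper, and the footer-style pattern at the end are all my inventions for illustration; none of them comes from a real transcript.

```perl
use strict;
use warnings;

# Default page-boundary pattern, copied from Line 39 of the script.
my $page = q(^(\s{10,}(Page\s+)*(\d+))\s*$);

# Hypothetical helper: does this line look like a page boundary?
sub is_boundary {
    my ($line) = @_;
    return $line =~ /$page/ ? 1 : 0;
}

# Made-up sample lines; real layouts differ between court reporters.
my @samples = (
    "                    Page 12",    # indented "Page" plus a number
    "                         7  ",   # indented bare number, trailing blanks
    "12   Q. Where were you?",        # an ordinary numbered transcript line
    "   Page 3",                      # fewer than 10 leading spaces
);

for my $line (@samples) {
    printf "%-8s |%s|\n", (is_boundary($line) ? 'boundary' : 'text'), $line;
}

# A centered "- 14 -" style footer would need its own pattern, passed
# with -p (and -e). This pattern is an illustration only.
my $footer = q(^(\s*-\s*(\d+)\s*-)\s*$);
print "footer pattern matches\n" if "            - 14 -" =~ /$footer/;
```

The first two samples should be flagged as boundaries and the last two as ordinary text. Because the -b stripping happens before the page pattern is applied, a large -b value can eat the leading spaces the default pattern depends on.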
As I said back when I first wrote about this script, I think the code is pretty straightforward Perl. There’s a header comment for each section of the program; only a few lines seemed to merit their own comments. Here are a few additional notes.
- I like having a usage string at the beginning of my programs; it documents the program like an introductory comment, and pulls double duty as the help message.
- Most of my utility programs written in Perl have a command-line handling section very much like this one. I prefer single-dash options and always use -h for help.
- On Line 46, I undefine the input record separator, $/, so the file input operator, <>, reads the whole file at once instead of one line at a time. This is a standard Perl idiom.
- Once I decided to turn the page boundary definition into a pair of user-defined options, the section that adds the troff page-break commands pretty much wrote itself. The .sp |.5i part puts a half-inch margin at the top of each page.
- The prologue in Lines 63-72 sets the font to Bookman (BMR = Bookman Roman), a font I associate with children’s books. It’s standard on PostScript printers and is legible at small sizes. The pre-reduction font size is set by the .ps $fsize command to whatever the -f option is set to (it’s 18 points by default and should never need to be changed by more than a point or two); it will end up less than half that after the 4-up reduction. The 27-point leading (.vs 27) gives me something between single- and double-spacing, which I have found to be easy to read. The .nf and .na commands turn off “filling” and “adjusting,” which is necessary to preserve the numbered-line structure of the transcript.
- The output file is determined by the presence or absence of the -d option. I suppose I could have used the File::Basename module, but it seemed like overkill for this program. The options to psnup tell it to do a 4-up transformation (-4) with column-major page numbering (-c) and an extra 18 points of margin (-m18). Normally, psnup spits out the page numbers of the new file as they are created, but -q suppresses that.
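For the curious, here is a sketch of how the split on Line 77 compares with File::Basename on a few file names. The subroutine names and the sample file names are made up for the comparison; the split version is what the script does, and the fileparse version is just one way it might be done, not something dep24up uses.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# What Line 77 does: keep everything before the first dot.
sub base_by_split {
    my ($file) = @_;
    my ($base, undef) = split(/\./, $file, 2);
    return $base;
}

# Alternative sketch: strip only the final extension, keep the directory.
sub base_by_fileparse {
    my ($file) = @_;
    my ($name, $dir, $suffix) = fileparse($file, qr/\.[^.]*/);
    return $dir . $name;
}

# Made-up file names that exercise the difference.
for my $file ('smith.txt', 'smith.vol2.txt', 'depos/smith.txt', './smith.txt') {
    printf "%-16s split: %-14s fileparse: %s\n",
        $file, "'" . base_by_split($file) . "'", base_by_fileparse($file);
}
```

The two agree on the simple case. They part ways when the name has more than one dot, and the split version comes back empty when the path starts with ./, so the script is happiest when run from the directory that holds the transcript.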
Unfortunately, today’s script changes didn’t take long and worked the first time (who says Perl is impossible to read?). I soon had the transcripts printed and had to read the damned things.
This script isn’t likely to be useful to more than a handful of people, but I’m posting it anyway because it’s another example of how scripting allows you to be the boss of your computer instead of the other way around. There’s no way I could buy a program that does this;[1] I had to write it myself.
[1] Yes, there are deposition readers, but they don’t run on OS X.