Deposition transcripts update

My back went out this morning as I was loading my bike’s panniers for the ride to work, so I spent the day at home reading material for work and trying to find a comfortable position. I was more successful with the former than the latter.

What I was reading were deposition transcripts. These are Q and A sessions in which one or more lawyers ask a witness an interminable series of questions. It’s like courtroom testimony except that it’s done out of court and before trial. It’s part of what’s called discovery, where all the parties to a legal matter get to learn what the other sides are up to so they can evaluate their cases and plan their arguments. TV dramas always show big surprises during courtroom testimony; that doesn’t happen if the lawyers do their jobs in discovery.

Anyway, depositions are conducted in the presence of a court reporter and are typed up shortly thereafter. I sometimes act as an expert witness in lawsuits involving machinery and structural failures; in those projects, in addition to reviewing drawings and test results and calculations, I have to read transcripts of the eyewitness depositions. Only patents make for more boring reading than depositions.

The depositions I read today were in the form of plain text files that had been emailed to me. I didn’t want to read them on my computer screen, so I printed them out, four-up and double-sided, using a little Perl script. This script goes back about 10 years and was written during my Linux years, when PostScript files were my preferred final copy. I wrote about it way back in the early days of this blog.

For some reason, I’d never changed the script’s output from PostScript to PDF. Today, as a way of delaying the reading of the transcripts while still seeming productive, I changed the script to produce four-up PDFs that look like this:

[Image: four-up depo page]

I also added the ability to set the output font size from the command line. Here’s the script, called dep24up:

 1:  #!/usr/bin/perl
 3:  use strict;
 4:  use warnings;
 5:  use Getopt::Std;
 7:  my $usage = <<'USAGE';
 8:  dep24up -- Create a 4-up PDF version of a deposition transcript
 9:             so it can be printed out nicely.
11:  usage:   dep24up [options] file
13:  options:
14:      -h     : print this message
15:      -b n   : number of blank spaces to remove from beginning 
16:               of each line (default: 0)
17:      -p sss : Perl regexp that defines a page boundary
18:               (default: '^(\s{10,}(Page\s+)*(\d+))\s*$')
19:      -e     : page boundary regexp is at end of page, rather than
20:               at beginning
21:      -f n   : font size before shrinking to 4-up (default: 18)
22:      -d     : create a troff version of the transcript for
23:               debugging, but don't process it into a PDF
24:  notes:
25:       Page boundaries are processed after blank lines and initial
26:       blank spaces are removed. Only $1 from the -p regexp is
27:       preserved. The output file has the same base name as the
28:       input file, with a '-4up.pdf' extension (or '.rf' if the -d
29:       option is used).
30:  USAGE
32:  # Handle command line
33:  my %opt;
34:  getopts('b:def:hp:', \%opt);
35:  my $file = shift;
37:  die $usage if ($opt{h} || ! $file);
38:  my $blanks = $opt{b} || 0;
39:  my $page   = $opt{p} || q(^(\s{10,}(Page\s+)*(\d+))\s*$);
40:  my $atend  = $opt{e};
41:  my $fsize  = $opt{f} || 18;
42:  my $debug  = $opt{d};
44:  # Slurp in whole file
45:  open(TS, $file) or die "No $file: $!\n";
46:  undef $/;
47:  my $ts = <TS>;                          # ts = transcript
49:  # Simple cleanup
50:  $ts =~ s/\n(\s*\n)+/\n/g;               # weed out blank lines
51:  $ts =~ s/^ {$blanks}//mg;               # strip beginning blank spaces
53:  # Add codes at page boundaries
54:  if ($atend) {
55:    print STDERR "At end! $page \n";
56:    $ts =~ s/$page/$1\n.bp\n\.sp |.5i/mg; # $1 goes before .bp
57:  } else {
58:    $ts =~ s/$page/.bp\n\.sp |.5i\n$1/mg; # $1 goes after .bp
59:    $ts =~ s/\.bp\n\.sp \|\.5i\n//;       # delete inadvertent .bp before 1st page
60:  }
62:  # Add overall formatting commands at beginning
63:  my $prolog = <<PROLOG;
64:  .ft BMR
65:  .ps $fsize
66:  .vs 27
67:  .po .5i
68:  .ll 7.5i
69:  .sp |.5i
70:  .nf
71:  .na
72:  PROLOG
74:  $ts = $prolog . $ts;
76:  # Output
77:  my ($base, undef) = split(/\./, $file, 2);
78:  if ($debug) {
79:    open OUT, ">$base.rf" or
80:      die "Can't open $base.rf for writing: $!\n";
82:  } 
83:  else {
84:    open OUT, "| groff -P-pletter| psnup -4 -cq -m18 | ps2pdf - $base-4up.pdf" or
85:      die "Can't run pipeline: $!\n";
86:  }
88:  print OUT $ts;

It uses the external programs groff, psnup, and ps2pdf. Groff comes with OS X; the other two don’t, but are part of most TeX distributions.
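If you're not sure whether all three programs are installed, a quick check along these lines can help. This is only a sketch and not part of dep24up; the `find_in_path` helper is a hypothetical name of my own, and it simply looks for an executable file of the given name in each directory on your `PATH`.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not in dep24up): return the full path of $prog if an
# executable file by that name exists in a PATH directory, undef otherwise.
sub find_in_path {
    my ($prog) = @_;
    for my $dir (File::Spec->path) {
        my $candidate = File::Spec->catfile($dir, $prog);
        return $candidate if -f $candidate && -x _;
    }
    return;
}

# The three external programs the script pipes through.
my @missing = grep { !defined(find_in_path($_)) } qw(groff psnup ps2pdf);
if (@missing) {
    print "Missing: @missing\n";
} else {
    print "All three programs found.\n";
}
```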

The idea behind the script is:

  1. Insert troff formatting codes into the file.
  2. Run the file through groff to generate a one-page-per-sheet PostScript stream.
  3. Pipe the PostScript through psnup to turn it into a four-pages-per-sheet PostScript stream.
  4. Pipe that PostScript through ps2pdf to create a PDF file.

Item 1 takes up the bulk of the script. Items 2-4 are basically all on Line 84.
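To get a feel for the first part of Item 1, here is a small sketch of the cleanup step run on a made-up fragment. The two substitutions are the ones from the "Simple cleanup" section of the script; the `clean_transcript` wrapper and the sample text are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the two cleanup substitutions in dep24up.
sub clean_transcript {
    my ($ts, $blanks) = @_;
    $ts =~ s/\n(\s*\n)+/\n/g;     # weed out blank (or whitespace-only) lines
    $ts =~ s/^ {$blanks}//mg;     # strip $blanks leading spaces from each line
    return $ts;
}

# Made-up sample: two transcript lines separated by blank lines,
# each indented by five spaces.
my $raw = "     1  Q. Name?\n\n   \n     2  A. Doe.\n";

print clean_transcript($raw, 4);
```

With this sample, the blank lines between the question and the answer disappear and each remaining line loses four of its five leading spaces, which is what `-b 4` would do on the command line.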

The trickiest thing is getting the pagination right. The plain text format for depositions isn’t standardized, and different court reporters use different headers and footers. The script uses a regular expression to locate the page boundaries; the default regex (Line 39) is one that works most of the time, but it can be changed at the command line.
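Here is a sketch of how the default pattern behaves when the page number sits at the top of each page. The pattern and the two substitutions are taken from the script; the `mark_pages` wrapper and the sample transcript are made up for the example, so treat it as an illustration rather than a test of any real court reporter's format.

```perl
use strict;
use warnings;

# Default page-boundary pattern from dep24up: a line made of at least ten
# whitespace characters, an optional "Page ", and a number.
my $page = q(^(\s{10,}(Page\s+)*(\d+))\s*$);

# Hypothetical wrapper: put a troff page break before each page-number line.
sub mark_pages {
    my ($ts) = @_;
    $ts =~ s/$page/.bp\n.sp |.5i\n$1/mg;   # $1 (the number line) goes after .bp
    $ts =~ s/\.bp\n\.sp \|\.5i\n//;        # delete the break before the 1st page
    return $ts;
}

# Made-up sample: two pages, each headed by an indented page number.
my $sample = join "\n",
    "                    Page 1",
    " 1  Q. State your name.",
    " 2  A. John Doe.",
    "                    Page 2",
    " 1  Q. Where do you work?",
    "";

print mark_pages($sample);
```

With this sample, the output contains a single `.bp`, just before the "Page 2" line. The numbered question and answer lines are left alone because they don't start with ten or more spaces.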

As I said back when I first wrote about this script, I think the code is pretty straightforward Perl. There’s a header comment for each section of the program; only a few lines seemed to merit their own comments. Here are a few additional notes.

Unfortunately, today’s script changes didn’t take long and worked the first time (who says Perl is impossible to read?). I soon had the transcripts printed and had to read the damned things.

This script isn’t likely to be useful to more than a handful of people, but I’m posting it anyway because it’s another example of how scripting allows you to be the boss of your computer instead of the other way around. There’s no way I could buy a program that does this;[1] I had to write it myself.

  1. Yes, there are deposition readers, but they don’t run on OS X.