September 13, 2014 at 9:25 PM by Dr. Drang
When I switched to Linux in the late ’90s, I needed a way to write reports and correspondence for work. At the time, there weren’t any open source word processors worth mentioning, and I was done with word processors, anyway. So I set up a report-writing workflow based on SGML, HTML’s big brother, and groff, the GNU version of the ancient Unix text formatter, troff.
I actually enjoyed writing in SGML. Creating a DTD for my reports forced me to think hard about how they ought to be structured. Although my current workflow is different, and I write my reports in Markdown, I still structure them according to the rules I had to formalize back in 1997. And SGML isn’t the straightjacket that XML is; you don’t need closing tags—or even opening tags—if there’s no way to misinterpret an element.
I kind of went SGML-happy in the late ’90s, creating DTDs for every type of structured document I wrote, including my CV. The workflow for generating a PostScript version of my CV was basically the same as the one for reports. Here’s my CV DTD:
    <!ELEMENT cv - - (name, pos, intro?, s+)>
    <!ELEMENT name - O (#PCDATA)>
    <!ELEMENT pos - O (#PCDATA)>
    <!ELEMENT intro - O (#PCDATA)>
    <!ELEMENT s - O (h, (item|ditem)*)>
    <!ELEMENT h O O (#PCDATA)>
    <!ELEMENT item - O (#PCDATA | cite | br)*>
    <!ELEMENT ditem - O (#PCDATA | cite | br)*>
    <!ATTLIST ditem date CDATA #REQUIRED>
    <!ELEMENT br - O EMPTY>
    <!ELEMENT cite - - (#PCDATA)>
The structure isn’t too hard to work out. The CV as a whole consists of my name, my position with the company, an optional introductory paragraph, and then one or more sections. Each section consists of a header followed by some number of items or dated items. Dated items must have a date attribute; otherwise they’re identical to regular items. Items of either sort can contain citations and line breaks.
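One way to see the content model at a glance is to restate it as a data structure. What follows is only a sketch in Python, not part of the original workflow. The class names (CV, Section, Item) are invented for illustration, and the inline cite and br elements are flattened to plain text to keep it short.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    """An <item> or <ditem>; cite/br markup is flattened to plain text here."""
    text: str
    date: Optional[str] = None  # a value means <ditem date="...">, None means <item>


@dataclass
class Section:
    """An <s>: one header (<h>) followed by zero or more items."""
    header: str
    items: List[Item] = field(default_factory=list)


@dataclass
class CV:
    """A <cv>: name, pos, optional intro, then one or more sections."""
    name: str
    pos: str
    sections: List[Section]
    intro: Optional[str] = None

    def __post_init__(self):
        # The DTD says s+, so an empty list of sections is invalid.
        if not self.sections:
            raise ValueError("a cv needs at least one section")


education = Section("Education", [
    Item("Ph.D. in Civil Engineering", date="1985"),
    Item("M.S. in Civil Engineering", date="1982"),
])
cv = CV("Dr. Drang, Ph.D., P.E.", "Engineering Mechanics", [education])
print(len(cv.sections), cv.sections[0].items[0].date)
```

The required date attribute on dated items has no direct equivalent here; a single Item class with an optional date is just the shortest way to show that the two item types are otherwise identical.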
Here’s an example:
    <!DOCTYPE cv SYSTEM "/Users/drang/dtd/cv.dtd">
    <cv>
    <name>
    Dr. Drang, Ph.D., P.E.
    <pos>
    Engineering Mechanics
    <s>
    Employment
    <ditem date="1991-present">
    Principal, Drang Engineering, Inc.
    <ditem date="1985-1990">
    Assistant Professor, Small Big Ten University
    <s>
    Education
    <ditem date=1985>
    Ph.D. in Civil Engineering; University of Illinois at Urbana-Champaign<br>
    Thesis: <cite>An Approach To Structural Analysis That No One Uses</cite>
    <ditem date=1982>
    M.S. in Civil Engineering; University of Illinois at Urbana-Champaign
    <ditem date=1981>
    B.S. in Civil Engineering; University of Illinois at Urbana-Champaign
    <s>
    Professional societies
    <item>
    American Society of Civil Engineers
    <item>
    American Institute of Steel Construction
    <item>
    American Concrete Institute
    <s>
    Professional licenses and registrations
    <item>
    Professional Engineer, State of Illinois
    <item>
    Professional Engineer, State of Indiana
    <item>
    Professional Engineer, State of Ohio
    </cv>
Note that, apart from the required </cv> at the end, the only closing tags are for the <cite> elements. If you look in the DTD, you’ll see - O in most of the element definitions. That means the opening tag is required but the closing tag is optional. Both the opening and closing tags are optional for the <h> element; because it’s always the first element within an <s> and it’s always followed by either an <item> or a <ditem>, there’s no need for tags. The SGML processor will know that things like “Employment” and “Education” are <h> elements.
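To make the tag-omission idea concrete, here is a toy normalizer in Python that fills in the missing tags for a token stream. It is not how a real SGML processor works: a real parser tracks order and occurrence in the content model, while this sketch only checks which elements may appear inside which. The ALLOWED, EMPTY, and IMPLIED_START tables are hand-transcribed from the DTD for illustration, and attributes are ignored.

```python
# Which children each element may contain; "#text" stands for #PCDATA.
ALLOWED = {
    "cv": {"name", "pos", "intro", "s"},
    "name": {"#text"},
    "pos": {"#text"},
    "intro": {"#text"},
    "s": {"h", "item", "ditem"},
    "h": {"#text"},
    "item": {"#text", "cite", "br"},
    "ditem": {"#text", "cite", "br"},
    "cite": {"#text"},
}
EMPTY = {"br"}               # declared EMPTY: no content, no end tag
IMPLIED_START = {"s": "h"}   # <h> has an omissible start tag inside <s>


def normalize(tokens):
    """Insert omitted tags into a list of (kind, value) tokens.

    kind is "start", "end", or "text". Returns a list of strings.
    """
    stack, out = [], []

    def close_top():
        out.append(f"</{stack.pop()}>")

    for kind, value in tokens:
        if kind == "end":
            # Close anything still open inside the element being ended.
            while stack and stack[-1] != value:
                close_top()
            if stack:
                close_top()
            continue
        child = "#text" if kind == "text" else value
        # Open an element whose start tag was omitted, if that helps.
        if stack and child not in ALLOWED[stack[-1]]:
            implied = IMPLIED_START.get(stack[-1])
            if implied and child in ALLOWED[implied]:
                stack.append(implied)
                out.append(f"<{implied}>")
        # Close open elements until the new content is allowed.
        while stack and child not in ALLOWED[stack[-1]]:
            close_top()
        if kind == "text":
            out.append(value)
        else:
            out.append(f"<{value}>")
            if value not in EMPTY:
                stack.append(value)
    while stack:
        close_top()
    return out


tokens = [
    ("start", "cv"), ("start", "name"), ("text", "Dr. Drang"),
    ("start", "pos"), ("text", "Engineering Mechanics"),
    ("start", "s"), ("text", "Education"),
    ("start", "ditem"), ("text", "Ph.D."), ("start", "br"),
    ("text", "Thesis: "), ("start", "cite"), ("text", "A Title"),
    ("end", "cite"), ("end", "cv"),
]
result = "".join(normalize(tokens))
print(result)
```

In the printed result, “Education” ends up wrapped in an h element even though no h tag appears in the input, which is the behavior the paragraph above describes.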
For several years I kept my CV in this form, updating it as necessary. Sometime after switching back to the Mac, I stopped maintaining the SGML version, updating only the troff version. Even though troff isn’t the easiest markup language to write in, adding an item to my CV was pretty simple. I’d just copy a chunk of formatting code from one item, paste it in, and then add the new text.
Yesterday, though, I needed to update a few items in the CV and had the bright idea to return to the SGML form. I still had an old SGML version, so it wasn’t too hard to add the stuff necessary to bring it up to date. But I soon realized I didn’t have an SGML processor—I’d never installed it on my iMac at work.
Back when I was using SGML regularly, the standard processor was nsgmls, part of James Clark’s SP suite of programs. I couldn’t find a precompiled version for OS X, so I decided to download the source and build it myself. Unfortunately, some of the commands in the makefile threw errors; something in either OS X’s compiler or its libraries wasn’t what the makefile expected. So I started a little yak-shaving adventure.
Installing gcc via Homebrew so I can compile an SGML processor so I can run a Perl program I wrote in 1996.
As you do.
— Dr. Drang (@drdrang) Sep 12 2014 9:47 AM
Homebrew, it turned out, has a package for OpenSP. After

    brew install open-sp

I was in business and was able to stop the installation of gcc and delete the dependencies that had already been built. I generated my CV just as I had in the ’90s, with only two differences:
- The SGML processor of the OpenSP project is called onsgmls instead of nsgmls.
- I had to convert the PostScript generated by groff to PDF. I don’t print my CV very often anymore. I usually email the PDF to prospective clients.
Neither of these was a big deal. The pipeline looked like this:
onsgmls drangCV.sgml | cv2roff | groff | ps2pdf - > drangCV.pdf
The cv2roff part is a Perl script that converts the ESIS output of onsgmls into a troff document. I won’t be showing it here because it’s embarrassing. I had been programming Perl for less than a year when I wrote it, and it’s a mess. Worse, even, than my early Perl is the mixture of tabs and spaces in the source code. I’m sure I was using Emacs at the time and must not have known how to configure it yet. Ick.
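For readers who haven’t seen ESIS, here is a rough idea of what a converter of this kind does. This is a Python sketch and emphatically not the cv2roff script. It assumes the usual ESIS line format (an A line for each attribute, ( and ) for element start and end, - for data, with element names folded to uppercase), and the mapping to bare troff requests like .ce, .ft, and .br is invented for illustration.

```python
import re

# Hypothetical mapping from element names to plain troff requests.
START = {
    "NAME": [".ce 1", ".ft B"],   # center the name, in bold
    "POS": [".ce 1"],
    "INTRO": [".sp"],
    "S": [".sp"],                 # blank line before each section
    "H": [".ft B"],
    "ITEM": [".br"],
    "DITEM": [".br"],
    "BR": [".br"],
    "CITE": [".ft I"],
}
END = {"NAME": [".ft P"], "H": [".ft P"], "CITE": [".ft P"]}


def decode(data):
    # ESIS escapes: \n is a record end, \\ a backslash, \| an SDATA
    # bracket, and \nnn an octal character code.
    def repl(match):
        esc = match.group(1)
        if esc == "n":
            return " "
        if esc == "\\":
            return "\\"
        if esc == "|":
            return ""
        return chr(int(esc, 8))
    return re.sub(r"\\(n|\\|\||[0-7]{3})", repl, data)


def troff_text(text):
    # Keep troff from treating data as an escape or a request.
    text = text.replace("\\", "\\e")
    if text.startswith((".", "'")):
        text = "\\&" + text
    return text


def esis_to_troff(lines):
    out, attrs = [], {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":                      # attribute for the next start tag
            parts = rest.split(" ", 2)       # name, type, value
            if len(parts) == 3:
                attrs[parts[0].upper()] = decode(parts[2])
        elif code == "(":                    # element start
            name = rest.upper()
            out.extend(START.get(name, []))
            if name == "DITEM" and "DATE" in attrs:
                out.append(troff_text(attrs["DATE"]))
            attrs = {}
        elif code == ")":                    # element end
            out.extend(END.get(rest.upper(), []))
        elif code == "-":                    # character data
            text = decode(rest).strip()
            if text:
                out.append(troff_text(text))
    return out


sample = [
    "(CV", "(NAME", "-Dr. Drang", ")NAME",
    "(S", "(H", "-Education", ")H",
    "ADATE CDATA 1985", "(DITEM", "-Ph.D. in Civil Engineering",
    "(BR", ")BR", "-Thesis: ", "(CITE", "-A Title", ")CITE",
    ")DITEM", ")S", ")CV", "C",
]
troff = esis_to_troff(sample)
print("\n".join(troff))
```

To use something like this as a filter in the pipeline, the function could be handed sys.stdin instead of the sample list. The point is only that ESIS is a line-oriented event stream, so a converter amounts to a loop with a lookup table.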
Was it worth the trouble? I think so. Because of increased continuing education requirements to maintain my professional engineering licenses, and because I expect to be getting licensed in more states, I’ll be updating my CV more often. Having it in a concise SGML form will make it easier to edit. And even though my old Perl code is ugly, it’s fun to still be able to use a script I wrote over 15 years ago.