Highlighting with Highlights and LiquidText
May 23, 2020 at 10:07 AM by Dr. Drang
I decided to get a copy of Highlights after reading this John Voorhees review in MacStories. After trying it out on a few PDF documents that needed to be summarized for my job, I learned that it wasn’t going to work for me. Ironically, it was very bad at highlighting text in the kinds of documents I deal with. But my experiment with Highlights led me to give LiquidText another try, and with a new perspective on how to use it, LiquidText fits my highlighting needs pretty well.
Highlights is a focused app with a straightforward user interface for highlighting and commenting on text in a PDF, and it can export the highlighted text as plain text.[1] It seems like a perfect fit for how I want to work. I subscribed to the Pro version (necessary to get plain text exporting) so I could give it a try.
As luck would have it, I had just finished a couple of projects in which I had done a fair amount of document summarizing. So I put copies of those documents in a new folder and started going through them in Highlights to see how much more effective it would be than my current system.[2]
On the very first document I tried, Highlights wouldn’t select the text I wanted it to, overselecting in certain areas and underselecting in others. (I can’t show screenshots of it because that would expose the client’s work product.) The trouble seemed to be most commonly associated with footnotes. The selection jumped around in unpredictable ways whenever a footnote or endnote was within the desired selection or nearby.
My history with Microsoft products leads me to believe that this PDF, which I know started its life as a Word document, has a convoluted internal structure, and that may be part of the reason Highlights had so much trouble with it. But I don’t have any control over the history of PDFs I need to summarize, and documents with footnotes that were written in Word make up a large enough percentage of the material I get from clients to make Highlights effectively useless to me. I cancelled my Pro subscription.
Shortly after my Highlights experiment, this thread about PDF note-taking apps appeared on the Mac Power Users forum. It reminded me of that copy of LiquidText I got a long time ago and decided not to use. Maybe it was worth another shot.
The signature feature of LiquidText is the ability to grab excerpts from PDFs and combine them into new documents that show the linkages between different PDFs and different sections of the same PDF. It’s very impressive but struck me as more a presentation tool than a research tool. What I learned from the forum (and some new testing of my own) was that I don’t have to use LiquidText’s cool linking feature; I can just highlight and comment on text as I read along and generate a summary of the highlights and comments when I’m done. And, most important, LiquidText is much better at selecting the text I want than Highlights is. Not perfect—I have found a couple of glitches—but definitely good enough that I expect the average PDF to give me no trouble at all.
After the text is highlighted, it needs to come out, and if you’ve looked through LiquidText’s sharing options, you might think it’s not possible to get a plain text summary out of it.
But if you choose the “Notes Outline” option, which is intended to create a Word file, you’ll see another window appear that lets you put the notes onto the clipboard instead of into a DOCX file.
Copying those notes into Drafts makes for a pretty decent summary.
As with the Markdown export from Highlights, I’m not thrilled with the way the notes are formatted, but I’ve written a Drafts action that cleans them up, distinguishing between highlights and comments and getting rid of the extra spaces that often appear in selections from fully justified text.
(I should point out that this particular PDF had several equations that I had highlighted. They’re hard to express in plain text and will need to be cleaned up.)
The Drafts action that does the cleanup consists of just one JavaScript step:
javascript:
1: var summary = editor.getText();
2:
3: // Inexplicably, LiquidText uses CRs as line endings.
4: var reformatted = summary.replace(/\r/g, '\n');
5:
6: // Get rid of the second header.
7: reformatted = reformatted.replace(/\nNotes in Document \n[^\n]+\n/, '');
8:
9: // Reformat the comments and highlights.
10: function noteReplace(full, m1, m2, m3, offset, string) {
11:   if (m1 == "Highlight") {
12:     return 'Page ' + m3 + ' quote:\n' + m2;
13:   }
14:   else {
15:     return 'Page ' + m3 + ' summary:\n' + m2;
16:   }
17: }
18:
19: reformatted = reformatted.replace(/(Highlight|Comment):\s?:?\s?(.+)\u2028\(.+p.(\d+)\)/g, noteReplace);
20:
21: // Extra spaces are probably from full justification.
22: reformatted = reformatted.replace(/ +/g, ' ');
23:
24: editor.setText(reformatted);
A couple of comments on the script:
- As you can see in Lines 3–4, the notes from LiquidText use carriage return (CR) characters as line endings. Not linefeed (LF) characters, as would be expected on iOS. Not the CRLF combination common to Microsoft products. No, it’s the bare CR that Macs used to use back in the pre-OS X days. Bizarre.
- Nearly as odd is the use of the Unicode LINE SEPARATOR character (U+2028) within each note. That gets cleaned up in the large regex in Line 19.
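Both quirks are easy to check outside Drafts. The following is a minimal sketch, not part of the action itself: it wraps the same replacements in a plain function and adds a small counter for the different line terminators. The names `cleanNotes`, `countTerminators`, and `sample` are made up for this illustration, and the sample string is an assumption about what the clipboard text looks like, inferred from the regexes in the script rather than copied from a real LiquidText export.

```javascript
// Count each kind of line terminator in a string.
function countTerminators(text) {
  const count = (re) => (text.match(re) || []).length;
  const crlf = count(/\r\n/g);
  return {
    crlf: crlf,                   // Windows-style pairs
    cr: count(/\r/g) - crlf,      // bare CRs (classic Mac OS)
    lf: count(/\n/g) - crlf,      // bare LFs (Unix, iOS)
    ls: count(/\u2028/g)          // Unicode LINE SEPARATOR
  };
}

// The same replacements as the Drafts step, but as a pure function
// that takes and returns a string instead of touching the editor.
function cleanNotes(summary) {
  // Normalize bare CRs to LFs.
  let out = summary.replace(/\r/g, '\n');

  // Drop the second header and the line that follows it.
  out = out.replace(/\nNotes in Document \n[^\n]+\n/, '');

  // Turn each note into "Page N quote:" or "Page N summary:" plus its text.
  out = out.replace(
    /(Highlight|Comment):\s?:?\s?(.+)\u2028\(.+p.(\d+)\)/g,
    (full, kind, text, page) =>
      'Page ' + page + (kind === 'Highlight' ? ' quote:\n' : ' summary:\n') + text
  );

  // Collapse runs of spaces left over from full justification.
  return out.replace(/ {2,}/g, ' ');
}

// Hypothetical input: the layout is assumed, not taken from a real export.
const sample = [
  'Sample Report',
  '',
  'Notes in Document ',
  'sample-report.pdf',
  '',
  'Highlight: The  beam   deflects under load.\u2028(sample-report.pdf, p.3)',
  '',
  'Comment: Check the units here.\u2028(sample-report.pdf, p.7)',
  ''
].join('\r');

console.log(countTerminators(sample));
console.log(cleanNotes(sample));
```

On this made-up sample the counter reports eight bare CRs, two line separators, and no LFs or CRLFs. Pasting a real export into `sample` would show whether it behaves the same way.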
I don’t expect the summary that comes out of this process to be in final form, but my early experiments have shown that there’s less editing needed in these summaries than in those I dictate.
Overall, I like the experience of summarizing a document with LiquidText. I can still sit with my iPad on my lap, and I prefer swiping with the Pencil to reading text into my phone. In those places where I need to make a comment instead of a highlight, I can still dictate—I just dictate into the iPad instead of my phone. I no longer have to wake up a phone that’s gone to sleep between comments. Unless I run into a showstopper, LiquidText is how I’ll be summarizing from now on.
[1] The exported plain text is Markdown with a header structure that I wouldn’t want to use, but I don’t consider that a significant problem. It’s easy to write a filter that reformats well-structured text to get the output I want.
[2] Which is to dictate quotes and comments into Drafts on my phone as I read a document on my iPad.