I was shocked—shocked!—to see people disagree with my last post. I was even more shocked to learn about a bizarre omission in the HomePod software. I decided to dig into the many ways you can set timed alerts on your Apple devices and how the alert systems vary from device to device. It is, you will not be surprised to learn, a mess.

Let’s start with the summary. In the table below, I’m comparing the features of the three alert types on iOS: Timers, Alarms, and Reminders. Included in the comparison is how certain features work (or don’t work) on the iPhone, iPad, Watch, Mac,^{1} and HomePod. Most of the entries for the HomePod are empty because I don’t have one to test, but I’ve included it because it was the device that got me started down this path. Also, there’s that software omission I want to talk about.

| | Timer | Alarm | Reminder |
|---|---|---|---|
| Number | 1 | ∞ | ∞ |
| Name/Description | No | Yes | Yes |
| Autodelete | Yes | No | Yes |
| **Shared** | | | |
| iPhone | Yes | Yes | Yes |
| iPad | No | No | Yes |
| Watch | Yes | Yes | Yes |
| Mac | No | No | Yes |
| HomePod | ? | ? | No |
| **Time left** | | | |
| iPhone | Yes | No | No |
| iPad | Yes | No | No |
| Watch | Yes | No | No |
| Mac | No | No | No |
| HomePod | ? | ? | ? |
| **Time of** | | | |
| iPhone | No | Yes | Yes |
| iPad | No | Yes | Yes |
| Watch | No | Yes | Yes |
| Mac | No | No | Yes |
| HomePod | ? | ? | ? |

Many of the entries in this table have caveats, so let’s go through it.

The number of alerts that can be set was the starting point for the last post. People want multiple timers in their HomePods. That’s great, but Apple’s never had multiple timers in any iOS device, which is why I’ve always used reminders instead. “Reminders aren’t a substitute for timers!” I’ve been told by several people. I admire your steadfast adherence to your principles, but I need a solution, not a manifesto. (We’ll get to the deficiencies of using reminders as a substitute for timers later in the post.)

Since there’s only one timer, there’s no need for it to have a name or description. So when the timer on your phone/watch/tablet/speaker goes off, you might have to think a bit before you remember what it’s for. Alarms and reminders don’t have this problem.

I didn’t mention alarms in my last post, but Kirk McElhearn reminded^{2} me of them. If you’ve only used the Clock app’s UI to set an alarm, you may think you have to use a specific time (like 8:55 PM) instead of a relative time (in 20 minutes). But Siri offers another way:

“Hey Siri, set a casserole^{3} alarm for 20 minutes.”

One problem with using alarms as your alert system is that they don’t delete themselves when you dismiss them; they just sit there, inactive, taking up space in your list of alarms until you undertake a second action to remove them from the list. Timers delete themselves upon dismissal, which is certainly more convenient. Reminders *almost* delete themselves—when you mark a reminder as complete, it gets hidden in the Completed list. I take this as close enough to deletion that I gave Reminders a Yes on the Autodelete line.

One of the biggest advantages to using reminders is that they’re shared via iCloud, which also syncs them to your Mac. This is very convenient if you use reminders during the workday and allow notifications from the Reminders app, which I do. Timers and alarms are not shared; the timer you set on your phone doesn’t appear in the Clock app on your iPad or on your watch. But the watch is special because of its intimate relationship with the phone. Your watch *will* alert you of a timer or alarm set on your phone, even though it doesn’t appear in the watch’s Timer or Alarms app. The Mac is ignorant of all timers and alarms.

Here’s where we get to the HomePod’s software omission. Even if you set up your HomePod to access your reminders—which, I admit, you may be reluctant to do in some households—*the HomePod will not alert you when a reminder comes due*. I was first informed of this stunning fact by Holger Eilhard, and it’s been confirmed by others. So I guess you can create a reminder through your HomePod but not be alerted by one. For whatever that’s worth. Because I don’t think it’s worth much, I decided to put a No in the Reminder column for sharing on the HomePod.

A feature many people find essential is getting the time remaining before an alert goes off. I would like to tell these people to chill out, take a Zen approach, that “a watched pot never boils,” but that would only anger folks who seem to be a little on edge already. My blithe assertion that timed reminders are the solution to the lack of multiple timers was based too much on my own use. In the 4+ years I’ve been using reminders for timed alerts, I have never wanted to know how much time was left, but I guess the rest of the world doesn’t slavishly model itself after me.

So if you need to know the time left on an alert, the timer is your only friend. Neither alarms nor reminders will give you that. Alarms and reminders *will* give you the time an alert will go off (like 8:55 PM), but you’ll have to do the subtraction yourself, which isn’t convenient.

By the way, although I put a Yes in the “Time of” section for the Watch, my watch has never actually been able to tell me the time a reminder is due when I ask it via Siri. It definitely understands me, and it acts like it’s going to retrieve that information, but it’s never finished the job. I can, of course, see the due time of a reminder using the watch’s Reminders app.

And there are also a couple of problems with asking Siri for the time of a reminder on the phone:

The obvious problem is that the time Siri says is wrong. And it’s been wrong every time I’ve tried this over the past two days.^{4} For this example, the reminder was set for 3:50 PM, but Siri told me a time six hours earlier. Now, I happen to live six hours away from UTC, so my first thought was that Siri was programmed (stupidly) to respond in universal time. But then I realized the six-hour difference was in the wrong direction. 3:50 PM US/Central is 9:50 PM UTC, not 9:50 AM UTC. So Siri’s answer is so bad it isn’t even wrong in an understandable way.

The less obvious problem is Siri’s characterization of my casserole reminder as the “next reminder.” Inexplicably, she uses that phrase even if the reminder you ask about isn’t the next one. Sigh.
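For anyone who wants to double-check the time zone arithmetic, here’s a minimal Python sketch. It assumes US/Central is on standard time (UTC−6) in February, and the date is made up for illustration; only the 3:50 PM reminder time comes from the example above.

```python
from datetime import datetime, timedelta, timezone

# Assumption: US/Central is on standard time (UTC-6) in February.
CST = timezone(timedelta(hours=-6), "CST")

# The reminder from the example: 3:50 PM Central. The date is illustrative.
reminder_local = datetime(2018, 2, 17, 15, 50, tzinfo=CST)

# The correct conversion to UTC moves the clock six hours *later*.
reminder_utc = reminder_local.astimezone(timezone.utc)

# What Siri reported was six hours *earlier* than the local time.
siri_said = reminder_local - timedelta(hours=6)

print(reminder_utc.strftime("%H:%M"))  # 21:50
print(siri_said.strftime("%H:%M"))     # 09:50
```

The two results are twelve hours apart, which is why the wrong answer can’t be explained as a simple UTC conversion.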

After going through this exercise, I will continue to use timed reminders because

- they work across all my devices, including the Mac;
- their deficiencies regarding the time remaining don’t affect me; and
- they don’t require a second action to get them out of the way when completed, unlike alarms.

I’ve said on Twitter that I think Apple intends timed reminders to be the substitute for multiple timers. I still think that, but I’m less certain now than I was a few days ago.

**Update Feb 18, 2018 9:22 AM**

There’s always more.

First, something I had scribbled in a note but forgot to put in the post: a timer may not sound an alert. If you like to fall asleep listening to music, you may have the Timer’s When Timer Ends setting assigned to Stop Playing.

If that’s the case, the next time you use Siri to set a timer, it won’t make a sound, which probably isn’t what you want.

Second, reader Thomas Shannon has emailed me that alarms go off only at minute markers. So if it’s 9:55:45 and you tell Siri to set an alarm for one minute, it will go off 15 seconds later. I was annoyed to hear this because I looked into this four years ago with regard to reminders and found that their alert times are *not* restricted to whole minutes. If you tell Siri at 9:55:45 to remind you of something in one minute, the alert goes off at 9:56:45.
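The difference between the two behaviors can be sketched in a few lines of Python. This is only a model that matches the one example above: it assumes the alarm’s target time is truncated to the whole minute, which is a guess at the mechanism, and the function names are hypothetical.

```python
from datetime import datetime, timedelta

def alarm_fire_time(now, minutes):
    """Assumed alarm behavior: the target time is cut back to a whole minute."""
    target = now + timedelta(minutes=minutes)
    return target.replace(second=0, microsecond=0)

def reminder_fire_time(now, minutes):
    """Reminder behavior as described: the seconds are kept."""
    return now + timedelta(minutes=minutes)

# 9:55:45, as in the example; the date is arbitrary.
now = datetime(2018, 2, 18, 9, 55, 45)
alarm = alarm_fire_time(now, 1)
reminder = reminder_fire_time(now, 1)

print((alarm - now).seconds)     # 15
print((reminder - now).seconds)  # 60
```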

I used to tell people the advantage of using Apple products was their consistency across devices and applications. I don’t do that anymore.

1. You’re right, the Mac isn’t an iOS device, but it does work with Reminders, which can be very handy, so I’m including it.
2. Hah! I slay me.
3. I’m using casseroles in the examples because I’m a homespun Midwesterner (and not from Minnesota).
4. As I said above, I’ve never asked about the time of a reminder. Good thing, too.

[If the formatting looks odd in your feed reader, visit the original article]

**Friendly reminders**

Dr. Drang, 2018-02-16 (http://leancrew.com/all-this/2018/02/friendly-reminders/)

Yes, you can run multiple timers on your HomePod. Just don’t call them timers.

My vision of myself as a powerful thinkfluencer in the Apple world took a real beating this week. It seemed as if everyone who got a HomePod was complaining that it couldn’t set multiple timers. This is something I’ve written about a couple of times, going back four years. And I’ve explained the solution. Is this thing on?

Of course, four years ago, I wasn’t talking about the HomePod, I was talking about the iPhone, but the principle is the same. In iOS, the timer function is in the Clock app, and there’s only one. There’s no way to have two timers running simultaneously and no way to give your timer a name that lets you know what it’s for.

But you do have Reminders. They have names and can be set to alarm not only at an absolute time, but also at a relative time:

“Hey Siri, remind me to check the casserole in 20 minutes.”

This works on my iPhone, iPad, and Watch, and I assume—based on this article—that it would work on my HomePod if I had one. This is clearly Apple’s preferred solution to setting multiple timers, each with a distinct name.

So I was frustrated to hear John Gruber and Paul Kafasis in the latest episode of *The Talk Show* complain about the multiple timer problem. They should both know how to use Reminders to solve this problem. So should Myke Hurley, who made the same complaint in the most recent *Upgrade*.

I understand where they’re coming from. If you’re an Amazon Echo user, you’re probably in the habit of saying something like

“Alexa, set a 20-minute timer for the casserole.”

Habits like that are hard to break, especially as you get older.^{1} But Apple users should be used to the idea that Apple has strong opinions about the right way to use its products and you’re usually better off not bucking the system.

You don’t like cluttering up your Reminders with hundreds of “check the casserole” and “check the tea” items? Even though you typically don’t see completed reminders? There is a solution.

In the past couple of days, the HomePod complaint industry has moved on from multiple timers to white rings. Cheaply made leather circles are already coming onto the market, but I’m going to suggest that high end furniture protection should come from lace doilies with tatting that complements the HomePod’s fabric pattern.

1. Myke is 30 now, so his brain has lost much of its former plasticity.

**LaTeX contact info through Workflow**

Dr. Drang, 2018-02-10 (http://leancrew.com/all-this/2018/02/latex-contact-info-through-workflow/)

I get by with a little help from Twitter friends.

I’ve been writing more on my iPad recently; not just blog posts, but reports for work, too. Because I have a lot of helper scripts and macros built up over many years of working on a Mac, writing on the iPad is still slower. But I’m gradually building up a set of iOS tools and techniques to make the process go faster. Today’s post is about a Workflow I built yesterday with advice from iOS automation experts conveyed over Twitter.

For several years, I wrote reports for work using a Markdown→LaTeX→PDF workflow. For most of those years, it was rare for me to have to edit the LaTeX before turning it into a PDF. Recently, though, that rarity has disappeared, mainly because my reports have more tables and figures of varying size that need to be carefully positioned, something that can’t be done in Markdown. A few months ago I decided it would be more efficient to just write in LaTeX from the start. This wasn’t as big a change as you might think. I used to write in LaTeX directly, and the combination of TextExpander and a few old scripts I resurrected got me back up to speed relatively quickly—on the Mac, anyway.

On iOS, most of the TextExpander snippets I built for writing in LaTeX work fine, but the helper scripts, which tend to rely on AppleScript, don’t. One of the scripts I definitely wanted an iOS counterpart for was one that extracted the contact information from a client in a particular format. In my reports, the title page usually includes a section for the name, company, and address of the client. This is added in the LaTeX source code by this:

```tex
\client{John Cheatham\\
Dewey, Cheatham \& Howe\\
1515 Loquitor Lane\\
Amicus OH 44100}
```

where `\client` is a LaTeX command I created long ago, and its argument needs the usual LaTeX double backslashes to designate line breaks. Also, ampersands, which are special characters in LaTeX, need to be escaped.

I thought I could whip something up in Workflow, but my limited understanding of Workflow isn’t conducive to whipping. When I first tried to put something together a couple of weeks ago, it looked to me as if I was going to have to painstakingly extract every piece of information from the selected contact, create variables to store them in, and then put those variables together into a new string of text. So I gave up.

Yesterday I decided to ask for help.

> I would like to extract from a selected contact a standard name/address block as plain text:
>
> Full Name
> Company
> Street Address
> City, ST Zip
>
> I don’t think Contacts or Interact do this. Does anything?
>
> — Dr. Drang (@drdrang) Fri Feb 9 2018 9:37 PM

As you can see, I asked for something a bit simpler than what I really wanted, and I was kind of expecting suggestions for an app that would do the trick. But I soon got a response from Ari Weinstein with a Workflow solution:

Since Ari is a co-developer of Workflow, I kind of figured he knew what he was talking about. But I didn’t, and it’s because I didn’t appreciate Workflow’s magic variables. I’ve always thought of Workflow as being almost like a functional language, where each action transforms the data passed to it and sends it along to the next action in turn. That, at least, is what I thought happened when the actions are connected by lines.

Which is why I didn’t understand Ari’s workflow at first. I figured that if it was extracting the Street Address in the second step, there’d be no way for it to get ahold of the Name and Company in the fourth step. What I didn’t appreciate was that there can be side effects the usual view of a workflow doesn’t show you. In this case, the Contact that’s selected in the first step is saved to a magic variable (called “Contact”) that remains available for use in later steps. So the third and fourth steps have access to all the Contact information even after the extraction of the Street Address in the second step.

Ari’s sample is a standard workflow that would have to be run from within Workflow itself or from a launcher app like Launch Center Pro. I was thinking about how to turn it into an Action Extension that could be called from within Contacts when I noticed I had a Twitter reply from Federico Viticci:

His suggestion is set up as an Action Extension that accepts only Contacts and extracts the info from the Workflow Input magic variable. Just what I was going to do.

“My” final workflow, called LaTeX Address, combines what I learned from Ari and Federico and adds some search-and-replace stuff to handle the LaTeX-specific parts:

The first two steps create a text variable named `Ret` that consists of a single line break. We’ll see why I needed it in a bit.

Steps 3–5 are the Ari/Federico mashup. I couldn’t use Federico’s suggestion to just add `Workflow input:Street Address` to the end of the block because my contacts usually include the country, even though the country is almost always the US, and I didn’t want that at the end of the block. At some point, I’ll improve this by writing up a filter that deletes the country line *only* if it’s the US, but this will do until I get another job with a non-US client.

Step 6 escapes the ampersands, and Step 7 adds the double backslashes to the end of each line. You need four backslashes to get two in the output because regexes need two to produce one. I thought I could use `\n` at the end of the replacement string to get a line break, but I couldn’t get that to work. Thus, the `Ret` variable defined at the beginning of the workflow.
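The same two replacements can be sketched outside of Workflow. This is a Python analogue, not the workflow itself; the function name `latex_address_block` is made up for illustration. Python’s `re.sub` follows the same doubling rule in its replacement template (two backslashes produce one), though unlike Workflow it does accept `\n` there.

```python
import re

def latex_address_block(address):
    """Escape ampersands and end every line but the last with a LaTeX line break."""
    # Analogue of Step 6: two backslashes in the template give one in the
    # output, so each & becomes \&.
    escaped = re.sub(r"&", r"\\&", address)
    # Analogue of Step 7: four backslashes in the template give two in the
    # output, and \n in the template puts the line break back.
    return re.sub(r"\n", r"\\\\\n", escaped)

block = latex_address_block(
    "John Cheatham\nDewey, Cheatham & Howe\n1515 Loquitor Lane\nAmicus OH 44100"
)
print(block)
```

Printing `block` gives the four address lines with `\\` at the end of the first three and `\&` in the company name, ready to drop into the argument of `\client`.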

Finally, Step 8 puts the text on the clipboard, ready for pasting into a LaTeX document.

My plan is to use this extension in Split View, with my text editor, currently Textastic, on one side and Contacts on the other. When I need to insert the client info, I find it in Contacts, tap Share Contact to bring up the Sharing Sheet, and select the Run Workflow action.

This brings up the list of Workflow Action Extensions that can accept Contacts. I choose LaTeX Address from the list, switch focus back to Textastic, and paste the text block where it belongs. Boom.

I’ll try to remember to look for magic variables the next time I make a workflow. There is a trick to making them visible. When you’re editing a workflow and can insert a variable (magic or otherwise), a button with a magic wand will appear in the special keyboard row.

Tapping it will give you a new view of your workflow, with the magic variables appearing where the workflow creates them.

You don’t need to do this, as all of these variables should appear in the special keyboard row if you keep scrolling it to the right. But I find it easier to understand what they are and where they come from in this view.

Thanks to everyone who had suggestions for me, especially Ari and Federico.

**My feed reading system**

Dr. Drang, 2018-02-04 (http://leancrew.com/all-this/2018/02/my-feed-reading-system/)

All the parts of my homemade feed reading system.

As promised, or threatened, here’s my setup for RSS feed reading. It consists of a few scripts that run periodically throughout the day on a server I control and which is accessible to me from any browser on any device. The idea is to have a system that fits the way I read and doesn’t rely on any particular service or company. If my current web host went out of business tomorrow, I could move this system to another and be back up and running in an hour or so—less time than it would take to research and decide on a new feed reading service.

The linchpin of the system is the `getfeeds` script:

```python
  1:  #!/usr/bin/env python
  2:  # coding=utf8
  3:  
  4:  import feedparser as fp
  5:  import time
  6:  from datetime import datetime, timedelta
  7:  import pytz
  8:  from collections import defaultdict
  9:  import sys
 10:  import dateutil.parser as dp
 11:  import urllib2
 12:  import json
 13:  import sqlite3
 14:  import urllib
 15:  
 16:  def addItem(db, blog, id):
 17:    add = 'insert into items (blog, id) values (?, ?)'
 18:    db.execute(add, (blog, id))
 19:    db.commit()
 20:  
 21:  jsonsubscriptions = [
 22:    'http://leancrew.com/all-this/feed.json',
 23:    'https://daringfireball.net/feeds/json',
 24:    'https://sixcolors.com/feed.json',
 25:    'https://www.robjwells.com/feed.json',
 26:    'http://inessential.com/feed.json',
 27:    'https://macstories.net/feed/json']
 28:  
 29:  xmlsubscriptions = [
 30:    'http://feedpress.me/512pixels',
 31:    'http://alicublog.blogspot.com/feeds/posts/default',
 32:    'http://blog.ashleynh.me/feed',
 33:    'http://www.betalogue.com/feed/',
 34:    'http://bitsplitting.org/feed/',
 35:    'https://kieranhealy.org/blog/index.xml',
 36:    'http://blueplaid.net/news?format=rss',
 37:    'http://brett.trpstra.net/brettterpstra',
 38:    'http://feeds.feedburner.com/NerdGap',
 39:    'http://www.libertypages.com/clarktech/?feed=rss2',
 40:    'http://feeds.feedburner.com/CommonplaceCartography',
 41:    'http://kk.org/cooltools/feed',
 42:    'https://david-smith.org/atom.xml',
 43:    'http://feeds.feedburner.com/drbunsenblog',
 44:    'http://stratechery.com/feed/',
 45:    'http://feeds.feedburner.com/IgnoreTheCode',
 46:    'http://indiestack.com/feed/',
 47:    'http://feeds.feedburner.com/theendeavour',
 48:    'http://feed.katiefloyd.me/',
 49:    'http://feeds.feedburner.com/KevinDrum',
 50:    'http://www.kungfugrippe.com/rss',
 51:    'http://www.caseyliss.com/rss',
 52:    'http://www.macdrifter.com/feeds/all.atom.xml',
 53:    'http://mackenab.com/feed',
 54:    'http://macsparky.com/blog?format=rss',
 55:    'http://www.marco.org/rss',
 56:    'http://themindfulbit.com/feed.xml',
 57:    'http://merrillmarkoe.com/feed',
 58:    'http://mjtsai.com/blog/feed/',
 59:    'http://feeds.feedburner.com/mygeekdaddy',
 60:    'https://nathangrigg.com/feed/all.rss',
 61:    'http://onethingwell.org/rss',
 62:    'http://www.practicallyefficient.com/feed.xml',
 63:    'http://www.red-sweater.com/blog/feed/',
 64:    'http://blog.rtwilson.com/feed/',
 65:    'http://feedpress.me/candlerblog',
 66:    'http://inversesquare.wordpress.com/feed/',
 67:    'http://joe-steel.com/feed',
 68:    'http://feeds.veritrope.com/',
 69:    'https://with.thegra.in/feed',
 70:    'http://xkcd.com/atom.xml',
 71:    'http://doingthatwrong.com/?format=rss']
 72:  
 73:  # Feedparser filters out certain tags and eliminates them from the
 74:  # parsed version of a feed. This is particularly troublesome with
 75:  # embedded videos. This can be fixed by changing how the filter
 76:  # works. The following is based these tips:
 77:  #
 78:  # http://rumproarious.com/2010/05/07/\
 79:  # universal-feed-parser-is-awesome-except-for-embedded-videos/
 80:  #
 81:  # http://stackoverflow.com/questions/30353531/\
 82:  # python-rss-feedparser-cant-parse-description-correctly
 83:  #
 84:  # There is some danger here, as the included elements may contain
 85:  # malicious code.
 86:  fp._HTMLSanitizer.acceptable_elements |= {'object', 'embed', 'iframe'}
 87:  
 88:  # Connect to the database of read posts.
 89:  db = sqlite3.connect('/path/to/read-feeds.db')
 90:  query = 'select * from items where blog=? and id=?'
 91:  
 92:  # Collect all unread posts and put them in a list of tuples. The items
 93:  # in each tuple are when, blog, title, link, body, n, and author.
 94:  posts = []
 95:  n = 0
 96:  
 97:  # We're not going to accept items that are more than 3 days old, even
 98:  # if they aren't in the database of read items. These typically come up
 99:  # when someone does a reset of some sort on their blog and regenerates
100:  # a feed with old posts that aren't in the database or posts that are
101:  # in the database but have different IDs.
102:  utc = pytz.utc
103:  homeTZ = pytz.timezone('US/Central')
104:  daysago = datetime.today() - timedelta(days=3)
105:  daysago = utc.localize(daysago)
106:  
107:  # Start with the JSON feeds.
108:  for s in jsonsubscriptions:
109:    try:
110:      feed = urllib2.urlopen(s).read()
111:      jfeed = json.loads(feed)
112:      blog = jfeed['title']
113:      for i in jfeed['items']:
114:        try:
115:          id = i['id']
116:        except KeyError:
117:          id = i['url']
118:  
119:        # Add item only if it hasn't been read.
120:        match = db.execute(query, (blog, id)).fetchone()
121:        if not match:
122:          try:
123:            when = i['date_published']
124:          except KeyError:
125:            when = i['date_modified']
126:          when = dp.parse(when)
127:          when = utc.localize(when)
128:  
129:          try:
130:            author = ' ({})'.format(i['author']['name'])
131:          except KeyError:
132:            author = ''
133:          try:
134:            title = i['title']
135:          except KeyError:
136:            title = blog
137:          link = i['url']
138:          body = i['content_html']
139:  
140:          # Include only posts that are less than 3 days old. Add older posts
141:          # to the read database.
142:          if when > daysago:
143:            posts.append((when, blog, title, link, body, "{:04d}".format(n), author, id))
144:            n += 1
145:          else:
146:            addItem(db, blog, id)
147:    except:
148:      pass
149:  
150:  # Add the RSS/Atom feeds.
151:  for s in xmlsubscriptions:
152:    try:
153:      f = fp.parse(s)
154:      try:
155:        blog = f['feed']['title']
156:      except KeyError:
157:        blog = "---"
158:      for e in f['entries']:
159:        try:
160:          id = e['id']
161:          if id == '':
162:            id = e['link']
163:        except KeyError:
164:          id = e['link']
165:  
166:        # Add item only if it hasn't been read.
167:        match = db.execute(query, (blog, id)).fetchone()
168:        if not match:
169:  
170:          try:
171:            when = e['published_parsed']
172:          except KeyError:
173:            when = e['updated_parsed']
174:          when = datetime(*when[:6])
175:          when = utc.localize(when)
176:  
177:          try:
178:            title = e['title']
179:          except KeyError:
180:            title = blog
181:          try:
182:            author = " ({})".format(e['authors'][0]['name'])
183:          except KeyError:
184:            author = ""
185:          try:
186:            body = e['content'][0]['value']
187:          except KeyError:
188:            body = e['summary']
189:          link = e['link']
190:  
191:          # Include only posts that are less than 3 days old. Add older posts
192:          # to the read database.
193:          if when > daysago:
194:            posts.append((when, blog, title, link, body, "{:04d}".format(n), author, id))
195:            n += 1
196:          else:
197:            addItem(db, blog, id)
198:    except:
199:      pass
200:  
201:  # Sort the posts in reverse chronological order.
202:  posts.sort()
203:  posts.reverse()
204:  toclinks = defaultdict(list)
205:  for p in posts:
206:    toclinks[p[1]].append((p[2], p[5]))
207:  
208:  # Create an HTML list of the posts.
209:  listTemplate = '''<li>
210:    <p class="title" id="{5}"><a href="{3}">{2}</a></p>
211:    <p class="info">{1}{6}<br />{0}</p>
212:    <p>{4}</p>
213:    <form action="/path/to/addreaditem.py" method="post" name="readform{5}" onsubmit="return markAsRead(this);">
214:      <input type="hidden" name="blog" value="{8}" />
215:      <input type="hidden" name="id" value="{9}" />
216:      <input class="mark-button" type="submit" value="Mark as read" name="readbutton{5}"/>
217:    </form>
218:    <br />
219:    <form action="/path/to/addpinboarditem.py" method="post" name="pbform{5}" onsubmit="return addToPinboard(this);">
220:      <input type="hidden" name="url" value="{11}" />
221:      <input type="hidden" name="title" value="{10}" />
222:      <input class="pinboard-field" type="text" name="tags" size="30" /><br />
223:      <input class="pinboard-button" type="submit" value="Pinboard" name="pbbutton{5}" />
224:    </form>
225:  </li>'''
226:  litems = []
227:  for p in posts:
228:    q = [ x.encode('utf8') for x in p[1:] ]
229:    timestamp = p[0].astimezone(homeTZ)
230:    q.insert(0, timestamp.strftime('%b %d, %Y %I:%M %p'))
231:    q += [urllib.quote_plus(q[1]),
232:          urllib.quote_plus(q[7]),
233:          urllib.quote_plus(q[2]),
234:          urllib.quote_plus(q[3])]
235:    litems.append(listTemplate.format(*q))
236:  body = '\n<hr />\n'.join(litems)
237:  
238:  # Create a table of contents organized by blog.
239:  tocTemplate = '''<li class="toctitle"><a href="#{1}">{0}</a></li>\n'''
240:  toc = ''
241:  blogs = toclinks.keys()
242:  blogs.sort()
243:  for b in blogs:
244:    toc += '''<p class="tocblog">{0}</p>
245:  <ul class="rss">
246:  '''.format(b.encode('utf8'))
247:    for p in toclinks[b]:
248:      q = [ x.encode('utf8') for x in p ]
249:      toc += tocTemplate.format(*q)
250:    toc += '</ul>\n'
251:  
252:  # Print the HTMl.
253:  print '''<html>
254:  <meta charset="UTF-8" />
255:  <meta name="viewport" content="width=device-width" />
256:  <head>
257:  <style>
258:  body {{
259:    background-color: #555;
260:    width: 750px;
261:    margin-top: 0;
262:    margin-left: auto;
263:    margin-right: auto;
264:    padding-top: 0;
265:    font-family: Georgia, Serif;
266:  }}
267:  h1, h2, h3, h4, h5, h6 {{
268:    font-family: Helvetica, Sans-serif;
269:  }}
270:  h1 {{
271:    font-size: 110%;
272:  }}
273:  h2 {{
274:    font-size: 105%;
275:  }}
276:  h3, h4, h5, h6 {{
277:    font-size: 100%;
278:  }}
279:  .content {{
280:    padding-top: 1em;
281:    background-color: white;
282:  }}
283:  .rss {{
284:    list-style-type: none;
285:    margin: 0;
286:    padding: .5em 1em 1em 1.5em;
287:    background-color: white;
288:  }}
289:  .rss li {{
290:    margin-left: -.5em;
291:    line-height: 1.4;
292:  }}
293:  .rss li pre {{
294:    overflow: auto;
295:  }}
296:  .rss li p {{
297:    overflow-wrap: break-word;
298:    word-wrap: break-word;
299:    word-break: break-word;
300:    -webkit-hyphens: auto;
301:    hyphens: auto;
302:  }}
303:  .rss li figure {{
304:    -webkit-margin-before: 0;
305:    -webkit-margin-after: 0;
306:    -webkit-margin-start: 0;
307:    -webkit-margin-end: 0;
308:  }}
309:  .title {{
310:    font-weight: bold;
311:    font-family: Helvetica, Sans-serif;
312:    font-size: 120%;
313:    margin-bottom: .25em;
314:  }}
315:  .title a {{
316:    text-decoration: none;
317:    color: black;
318:  }}
319:  .info {{
320:    font-size: 85%;
321:    margin-top: 0;
322:    margin-left: .5em;
323:  }}
324:  .tocblog {{
325:    font-weight: bold;
326:    font-family: Helvetica, Sans-serif;
327:    font-size: 100%;
328:    margin-top: .25em;
329:    margin-bottom: 0;
330:  }}
331:  .toctitle {{
332:    font-weight: medium;
333:    font-family: Helvetica, Sans-serif;
334:    font-size: 100%;
335:    padding-left: .75em;
336:    text-indent: -.75em;
337:    margin-bottom: 0;
338:  }}
339:  .toctitle a {{
340:    text-decoration: none;
341:    color: black;
342:  }}
343:  .tocinfo {{
344:    font-size: 75%;
345:    margin-top: 0;
346:    margin-left: .5em;
347:  }}
348:  img, embed, iframe, object {{
349:    max-width: 700px;
350:  }}
351:  .mark-button {{
352:    width: 15em;
353:    border: none;
354:    border-radius: 4px;
355:    color: black;
356:    background-color: #B3FFB2;
357:    text-align: center;
358:    padding: .25em 0 .25em 0;
359:    font-weight: bold;
360:    font-size: 1em;
361:  }}
362:  .pinboard-button {{
363:    width: 7em;
364:    border: none;
365:    border-radius: 4px;
366:    color: black;
367:    background-color: #B3FFB2;
368:    text-align: center;
369:    padding: .25em 0 .25em 0;
370:    font-weight: bold;
371:    font-size: 1em;
372:    margin-left: 11em;
373:  }}
374:  .pinboard-field {{
375:    font-size: 1em;
376:    font-family: Helvetica, Sans-serif;
377:  }}
378:  
379:  @media only screen
380:    and (max-width: 667px)
381:    and (-webkit-device-pixel-ratio: 2)
382:    and (orientation: portrait) {{
383:    body {{
384:      font-size: 200%;
385:      width: 640px;
386:      background-color: white;
387:    }}
388:    .rss li {{
389:      line-height: normal;
390:    }}
391:    img, embed, iframe, object {{
392:      max-width: 550px;
393:    }}
394:  }}
395:  @media only screen
396:    and (min-width: 668px)
397:    and (-webkit-device-pixel-ratio: 2) {{
398:    body {{
399:      font-size: 150%;
400:      width: 800px;
401:      background-color: #555;
402:    }}
403:    .rss li {{
404:      line-height: normal;
405:    }}
406:    img, embed, iframe, object {{
407:      max-width: 700px;
408:    }}
409:  }}
410:  </style>
411:  
412:  <script language=javascript type="text/javascript">
413:  function markAsRead(theForm) {{
414:    var mark = new XMLHttpRequest();
415:    mark.open(theForm.method, theForm.action, true);
416:    mark.send(new FormData(theForm));
417:    mark.onreadystatechange = function() {{
418:      if (mark.readyState == 4 && mark.status == 200) {{
419:        var buttonName = theForm.name.replace("readform", "readbutton");
420:        var theButton = document.getElementsByName(buttonName)[0];
421:        theButton.value = "Marked!";
422:        theButton.style.backgroundColor = "#FFB2B2";
423:      }}
424:    }}
425:    return false;
426:  }}
427:  
428:  function addToPinboard(theForm) {{
429:    var mark = new XMLHttpRequest();
430:    mark.open(theForm.method, theForm.action, true);
431:    mark.send(new FormData(theForm));
432:    mark.onreadystatechange = function() {{
433:      if (mark.readyState == 4 && mark.status == 200) {{
434:        var buttonName = theForm.name.replace("pbform", "pbbutton");
435:        var theButton = document.getElementsByName(buttonName)[0];
436:        theButton.value = "Saved!";
437:        theButton.style.backgroundColor = "#FFB2B2";
438:      }}
439:    }}
440:    return false;
441:  }}
442:  
443:  </script>
444:  
445:  <title>Today’s RSS</title>
446:  </head>
447:  <body>
448:    <div class="content">
449:    <ul class="rss">
450:    {}
451:    </ul>
452:    <hr />
453:    <a name="start" />
454:    <ul class="rss">
455:    {}
456:    </ul>
457:    </div>
458:  </body>
459:  </html>
460:  '''.format(toc, body)
```

\nFor me, this is a very long script, but most of it is just the HTML template. What `getfeeds`

does is go through my subscription list, gather all the articles from those feeds that I haven’t already read, and generate a static HTML file with the unread articles laid out in reverse chronological order. At the end of each article, it puts a button to mark the article as read and a form for adding a link to the article to my account at Pinboard.

Start by noticing that this is a Python 2 script, so Line 2 is a comment that tells Python that UTF-8 characters will be in the source code. We’ll also run into `decode/encode`

invocations that wouldn’t be necessary if I’d written this in Python 3. I suppose I’ll translate it at some point.

Lines 16–19 are a function for adding an article to the database of read items. This is an SQLite database that’s also kept on the server. The database has a single table whose schema consists of just two fields: the blog name and the article GUID. Each article that I’ve marked as read gets entered as a new record in the database. The `addItem`

function runs a simple SQL insertion command via Python’s `sqlite3`

library.
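A minimal sketch of that read-items table, written for Python 3 and using an in-memory database in place of the real file on the server. The column names `blog` and `id` come from the insertion command in `addreaditem.py`; the column types in the `create table` statement are an assumption.

```python
import sqlite3

# In-memory database for illustration; the real one is a file on the server.
db = sqlite3.connect(':memory:')

# Assumed schema: two text fields, one row per article marked as read.
db.execute('create table items (blog text, id text)')

def addItem(db, blog, id):
    'Record an article as read.'
    add = 'insert into items (blog, id) values (?, ?)'
    db.execute(add, (blog, id))
    db.commit()

def markedItem(db, blog, id):
    'Return the matching row if the article is already marked, else None.'
    check = 'select * from items where blog=? and id=?'
    return db.execute(check, (blog, id)).fetchone()

# Made-up blog name and GUID.
addItem(db, 'Example Blog', 'http://example.com/post-1')
```

Passing the values as parameters (the `?` placeholders) rather than pasting them into the SQL string keeps odd characters in blog names and GUIDs from breaking the query.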

Lines 21–27 and 29–71 define my subscriptions: two lists of feed URLs, one for JSON feeds and the other for traditional RSS/Atom feeds. A lot of these feeds have gone silent over the past year, but I remain subscribed to them in the hope that they’ll come back to life.

\nLine 86 sets a parameter in the `feedparser`

library that relaxes some of the filtering that library does by default. There is some danger to this, but I’ve found that some blogs are essentially worthless if I don’t do this. The comments above Line 86 contain links to discussions of `feedparser`

’s filtering.

Lines 89–90 connect to the database of read items (note the fake path to the database file) and create a query string that we’ll use later to determine whether an article is in the database.

\nLines 94–95 initialize the list of `posts`

that will ultimately be turned into the HTML page and the `n`

variable that keeps track of the post count.

Lines 102–105 initialize a set of variables used to handle timezone information and the filtering of older articles that aren’t in the database of read items. As discussed in the comments above Line 102 and in my previous post, old articles that aren’t in the database can sometimes appear in a blog’s RSS feed when the blog gets updated.

\nLines 108–148 assemble the unread articles from the JSON feeds. For each subscription, the feed is downloaded, converted into a dictionary, and run through to extract information on each article. Articles that are in the database of read items are ignored (Lines 120–121). Articles that aren’t in the database are appended to the `posts`

list, unless they’re more than three days old, in which case they are added to the database of read items instead of to `posts`

(Lines 142–146).
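The three-way decision described above (already read, unread but stale, unread and fresh) can be sketched as follows. This is a Python 3 illustration, not the code in `getfeeds`; the `triage` function, its arguments, and the sample data are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

def triage(articles, read_items, now, max_age=timedelta(days=3)):
    """Split articles into those to show and those to retire.

    articles   -- iterable of (when, blog, guid) tuples, `when` timezone-aware
    read_items -- set of (blog, guid) pairs already in the database
    Returns (posts, retired): posts to display, and old unread items that
    should be added to the database instead of being shown.
    """
    posts, retired = [], []
    for when, blog, guid in articles:
        if (blog, guid) in read_items:
            continue                      # already read: ignore
        if now - when > max_age:
            retired.append((blog, guid))  # too old: mark as read, don't show
        else:
            posts.append((when, blog, guid))
    return posts, retired

now = datetime(2018, 2, 3, 12, 0, tzinfo=timezone.utc)
articles = [
    (now - timedelta(hours=5), 'Blog A', 'a-new'),
    (now - timedelta(days=30), 'Blog A', 'a-old'),
    (now - timedelta(days=1), 'Blog B', 'b-read'),
]
posts, retired = triage(articles, {('Blog B', 'b-read')}, now)
```

Here only `a-new` ends up in `posts`, while `a-old` lands in `retired` so it can be written to the database and never considered again.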

Much of Lines 108–148 is devoted to error handling and the normalization of disparate input into a uniform output. Each item of the `posts`

list is a tuple with

- the article date,
- the blog name,
- the article title,
- the article URL,
- the article content,
- the running count of posts,
- the article author, and
- the article GUID.

Lines 151–199 do for RSS/Atom feeds what Lines 108–148 do for JSON feeds. The main difference is that the `feedparser`

library is used to download and convert the feed into a dictionary.

Lines 202–203 sort the posts in reverse chronological order. This is made easy by my choice to put the article date as the first item in the tuple described above.
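A short sketch of why the tuple layout makes the sort easy, using made-up posts with only three of the eight fields. Python compares tuples element by element, so the date in the first slot decides the order.

```python
from datetime import datetime

# Made-up posts: (date, blog, title).
posts = [
    (datetime(2018, 1, 30, 3, 36), 'Blog A', 'Older post'),
    (datetime(2018, 2, 2, 2, 28), 'Blog B', 'Newest post'),
    (datetime(2018, 2, 1, 9, 0), 'Blog A', 'Middle post'),
]

# No key function needed: the date in slot 0 drives the comparison.
posts.sort(reverse=True)
titles = [p[2] for p in posts]
```

If two posts shared the exact same timestamp, the comparison would fall through to the blog name and then the title, which is harmless here.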

\nLines 204–206 generate a dictionary of lists of tuples, `toclinks`

, for the HTML page’s table of contents, which appears at the top of the page. A table of contents isn’t really necessary, but I like seeing an overview of what’s available before I start reading. The keys of the dictionary are the blog names, and each tuple in the list consists of the article’s title and its number, as given in the running post count, `n`

. The number will be used to create internal links in the HTML page.
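One way such a dictionary of lists of tuples could be built is sketched below. The use of `defaultdict` and the sample posts are assumptions for illustration; the tuple positions follow the eight-item layout listed earlier.

```python
from collections import defaultdict

# Each post: (date, blog, title, url, content, n, author, guid). Only the
# fields needed for the table of contents are filled in here.
posts = [
    ('2018-02-02', 'Blog B', 'Newest post', '', '', 1, '', ''),
    ('2018-02-01', 'Blog A', 'Middle post', '', '', 2, '', ''),
    ('2018-01-30', 'Blog A', 'Older post', '', '', 3, '', ''),
]

toclinks = defaultdict(list)
for p in posts:
    blog, title, n = p[1], p[2], p[5]
    toclinks[blog].append((title, n))   # n becomes the internal link target
```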

From this point on, it’s all HTML templating. I suppose I could’ve used one of the myriad Python libraries for this, but I didn’t feel like doing the research to figure out which would be best for my needs. The ol’ `format`

command works pretty well.
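A small sketch of templating with `format`, including the one wrinkle visible in the template's CSS: literal braces have to be doubled so `format` doesn't treat them as replacement fields. The template text and the named fields here are made up; the real template uses positional `{}` fields.

```python
# Doubled braces survive as single braces; {n}, {url}, and {title} are filled in.
template = '''<style>
.toctitle {{ font-weight: bold; }}
</style>
<h2 id="{n}"><a href="{url}">{title}</a></h2>'''

html = template.format(n=3, url='http://example.com/post', title='A post')
```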

Lines 209–225 define the template for each article. It starts with the title (which links to the original article), the date, and the author. The `id`

attribute in the title provides the internal target for the link in the table of contents. After the post contents come two forms. The first has two hidden fields with the blog name and the article GUID and a visible button that marks the article as read. The second form has the same hidden fields, a visible text field for Pinboard tags, and a button to add a link to the original article to my Pinboard list. We’ll see later how these buttons work.

Lines 227–236 concatenate all of the posts, through their template, into one long stretch of HTML that will make up the bulk of the body of the page.

\nLine 239 defines a template for a table of contents entry (note the internal link), and Lines 240–250 then use that template to assemble the `toclinks`

dictionary into the HTML for the table of contents.

The last piece, Lines 253–460, assembles and outputs the final, full HTML file. It’s as long as it is because I wanted a single, self-contained file with all the CSS and JavaScript in it. I’m sure this doesn’t comport with best practices, but I’ve noticed that best practices in web programming and design change more often than I have time to keep track of. Whenever I need to change something, I know it’ll be here in `getfeeds`

.

The CSS is in Lines 257–410 and is set up to look decent (to me) on my computer, iPad, and iPhone. There’s a lot I don’t know about responsive web design, and I’m sure it shows here.

\nLines 412–426 and Lines 428–441 define the `markAsRead`

and `addToPinboard`

JavaScript functions, which are activated by the buttons described above. These are basic AJAX functions that do not rely on any outside library. They’re based on what I read in David Flanagan’s *JavaScript: The Definitive Guide* and, I suspect, a Stack Overflow page or two that I forgot to preserve the links to. There’s a decent chance they don’t work in Internet Explorer, which I will worry about in the next life.

The `markAsRead`

function triggers this `addreaditem.py`

script on the server:

`python:\n 1: #!/usr/bin/python\n 2: # coding=utf8\n 3: \n 4: import sqlite3\n 5: import cgi\n 6: import sys\n 7: import urllib\n 8: import cgitb\n 9: \n10: def addItem(db, blog, id):\n11: add = 'insert into items (blog, id) values (?, ?)'\n12: db.execute(add, (blog, id))\n13: db.commit()\n14: \n15: def markedItem(db, blog, id):\n16: check = 'select * from items where blog=? and id=?'\n17: return db.execute(check, (blog, id)).fetchone()\n18: \n19: # Connect to database of read items\n20: db = sqlite3.connect('/path-to/read-feeds.db')\n21: \n22: # Get the item from the request and add it to the database\n23: form = cgi.FieldStorage()\n24: blog = urllib.unquote_plus(form.getvalue('blog')).decode('utf8')\n25: id = urllib.unquote_plus(form.getvalue('id')).decode('utf8')\n26: if markedItem(db, blog, id):\n27: answer = 'Already marked'\n28: else:\n29: addItem(db, blog, id)\n30: answer = 'OK'\n31: \n32: minimal='''Content-Type: text/html\n33: \n34: <html>\n35: <head>\n36: <title>Add Item</title>\n37: <body>\n38: <h1>{}</h1>\n39: </body>\n40: </html>'''.format(answer)\n41: \n42: print(minimal)\n`

\nThere’s not much to this script. It uses the same `addItem`

function we saw before and a `markedItem`

function that uses the same query we saw earlier to check if an item is in the database. Lines 23–30 get the input from the form that called it, check whether that item is already in the database, and add it if it isn’t. There’s some minimal HTML for output, but that’s of no importance. What matters is that if the script returns a success, the `markAsRead`

function changes the color of the button from green to red and the text of the button from “Mark as read” to “Marked!”

Before:\n

\nAfter:\n

\nThe `addToPinboard`

JavaScript function does essentially the same thing, except it triggers this `addpinboarditem.py`

script on the server:

`python:\n 1: #!/usr/bin/python\n 2: # coding=utf8\n 3: \n 4: import cgi\n 5: import pinboard\n 6: import urllib\n 7: \n 8: # Pinboard token\n 9: token = 'myPinboardName:myPinboardToken'\n10: \n11: # Get the page info from the request\n12: form = cgi.FieldStorage()\n13: url = urllib.unquote_plus(form.getvalue('url')).decode('utf8')\n14: title = urllib.unquote_plus(form.getvalue('title')).decode('utf8')\n15: tagstr = urllib.unquote_plus(form.getvalue('tags')).decode('utf8')\n16: tags = tagstr.split()\n17: \n18: # Add the item to Pinboard\n19: pb = pinboard.Pinboard(token)\n20: result = pb.posts.add(url=url, description=title, tags=tags)\n21: if result:\n22: answer = \"OK\"\n23: else:\n24: answer = \"Failed\"\n25: \n26: minimal='''Content-Type: text/html\n27: \n28: <html>\n29: <head>\n30: <title>Add To Pinboard</title>\n31: <body>\n32: <h1>{}</h1>\n33: </body>\n34: </html>'''.format(answer)\n35: \n36: print(minimal)\n`

\nThis script uses the Pinboard API to add a link to the original article. Line 9 defines my Pinboard credentials. Lines 12–16 extract the article and tag information from the form. Lines 19–24 connect to Pinboard and add the item to my list. If the script returns a success, the `addToPinboard`

function changes the color of the button from green to red and the text of the button from “Pinboard” to “Saved!”

Before:\n

\nAfter:\n

\nThe overall system is controlled by this short shell script, `runrss.sh`

:

`bash:\n1: #!/bin/bash\n2: \n3: /path/to/getfeeds > /other/path/to/rsspage-tmp.html\n4: cd /other/path/to\n5: mv rsspage-tmp.html rsspage.html\n`

\nLine 3 runs the `getfeeds`

script, sending the HTML output to a temporary file. Line 4 then changes to the directory that contains the temporary file, and Line 5 renames it. The file I direct my browser to is `rsspage.html`

. This seeming extra step with the temporary file is there because the `getfeeds`

script takes several seconds to run, and if I sent its output directly to `rsspage.html`

, that file would be in a weird state during that run time. I don’t want to browse the page when it isn’t finished.
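The same write-then-rename idea can be sketched in Python rather than the shell. The `publish` function is a hypothetical helper, not part of `getfeeds`; it assumes the temporary file and the final file live in the same directory, so the rename never crosses filesystems.

```python
import os
import tempfile

def publish(html, final_path):
    'Write html to a temporary file beside final_path, then rename it.'
    directory = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(suffix='.html', dir=directory)
    with os.fdopen(fd, 'w') as f:
        f.write(html)
    # Readers see either the old page or the new one, never a partial file.
    os.replace(tmp_path, final_path)
```

Note that `mkstemp` creates the file readable only by its owner, so a version meant for a web server would probably also need an `os.chmod` call before the rename.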

Finally, `runrss.sh`

is executed periodically throughout the day by `cron`

. The `crontab`

entry is

`*/20 0,6-23 * * * /path/to/runrss.sh\n`

\nThis runs the script every 20 minutes from 6:00 am through midnight every day.
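To see exactly when that entry fires, the hour and minute fields can be expanded by hand. This is only an illustration of how the two fields combine, not a general `cron` parser.

```python
# Hour field "0,6-23" and minute field "*/20", written out explicitly.
hours = [0] + list(range(6, 24))
minutes = list(range(0, 60, 20))

runs = ['{:02d}:{:02d}'.format(h, m) for h in hours for m in minutes]
```

The list has 57 entries. Because the `0` in the hour field covers the whole midnight hour, the last runs of the night are at 12:20 and 12:40 am, after which nothing happens until 6:00 am.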

\nSo that’s it. Three Python scripts, one of which is long but mostly HTML templating, a short shell script, and a `crontab`

entry. Was it easier to do this than set up a Feedbin (or whatever) account? Of course not. But I won’t have to worry if I see that Feedbin’s owners have written a Medium post.

\n

[If the formatting looks odd in your feed reader, visit the original article]

"}, {"title": "Feed reader robustification", "url": "http://leancrew.com/all-this/2018/02/feed-reader-robustification/", "author": {"name": "Dr. Drang"}, "summary": "A few changes to fix my homemade feed reading script.", "date_published": "2018-02-02T02:28:24+00:00", "id": "http://leancrew.com/all-this/2018/02/feed-reader-robustification/", "content_html": "I had a bit of shock this afternoon when I opened my RSS feed reader to see if anything was new.

\n\nNot much new, but a lot that’s old. Over 1400 posts from Kieran Healy, holder of the Krzyzewski Chair in Sociological R at the second best basketball university in North Carolina and author of a much-anticipated forthcoming book on how to make good graphs.

\nWhat happened? I don’t know for sure, but something in Kieran’s site generation software decided to include every post he’s written in his blog’s RSS feed. It’s an impressive body of work, going back to 2002, but I didn’t have time during my lunch hour to read it all.

\nMy homemade feed reader works like this. For every site I subscribe to, it

- reads the RSS (or JSON) feed;
- checks each article against a SQLite database of articles I’ve already read; and
- adds the article to a list if it’s unread.

After going through all the subscriptions, the script sorts the unread articles in reverse chronological order and arranges them in a static HTML page on my server, adding a table of contents to the top of the page. The script runs via a `cron`

job a few times an hour from 6:00 am until midnight.

So many of Kieran’s posts appeared today because my database of read posts is relatively young and only the last dozen or so of his articles are in it. It was all the earlier ones that were on my feed reader page.

\nThis is my fault, not Kieran’s. I knew perfectly well when I wrote my script that blogging software will sometimes regenerate its feed with all new GUIDs for each article. When this happens, it makes the articles look new to the feed reader. I’d seen this happen even back when I was using professionally written feed reading apps. What made this especially troublesome for my definitely-not-professionally-written feed reading system was that it’s not equipped with a “Mark all as read” button. Which gave me four choices:

1. Do the programming to add a “Mark all as read” button, something I will almost never use.
2. Go through and individually mark all 1400 old posts as read so they get entered into the database and don’t appear again. Fat chance.
3. Figure out another way to add all these posts to the database.
4. Change my feed reading script to just ignore articles that are more than a few days old, regardless of whether they’re in the database.

I chose #4 because it was the quickest to implement and should protect me against this kind of thing happening again. Kieran’s older posts disappeared from my feed reading page, and my blog reading went back to normal. Afterward, though, I realized that I could have implemented #3 in combination with #4, ignoring the older articles for the purposes of assembling the feed reading page but adding them to the database of read articles to give me added protection against seeing them pop up again.

\nI’ll try to get that working in the next day or two and then post the script in its final form. I doubt that many people really want to set up their own feed reading system, but you never know.

\n


"}, {"title": "Subplots, axes, Matplotlib, OmniGraffle, and LaTeXiT", "url": "http://leancrew.com/all-this/2018/01/subplots-axes-matplotlib-omnigraffle-and-latexit/", "author": {"name": "Dr. Drang"}, "summary": "Another post to remind me how I did something in Matplotlib.", "date_published": "2018-01-30T03:36:43+00:00", "id": "http://leancrew.com/all-this/2018/01/subplots-axes-matplotlib-omnigraffle-and-latexit/", "content_html": "When I learn something new in Matplotlib, I usually write a short post about it to reinforce what I’ve learned and to give me a place to look it up when I need to do it again. In my section properties post from last week, I had a 2×2 set of plots that helped explain which arctangent result I wanted to choose under different circumstances.

\nHere’s the plot:

\n\nAnd here’s the code that made most of it:

\n`python:\n 1: #!/usr/bin/env python\n 2: \n 3: import matplotlib.pyplot as plt\n 4: import numpy as np\n 5: \n 6: x = np.linspace(-3, 3, 101)\n 7: y1 = (10+6)/2 - (10-6)/2*np.cos(2*x) - 3*np.sin(2*x)\n 8: y2 = (10+6)/2 - (6-10)/2*np.cos(2*x) - 3*np.sin(2*x)\n 9: y3 = (10+6)/2 - (6-10)/2*np.cos(2*x) + 3*np.sin(2*x)\n10: y4 = (10+6)/2 - (10-6)/2*np.cos(2*x) + 3*np.sin(2*x)\n11: \n12: f, axarr = plt.subplots(2, 2, figsize=(8, 8))\n13: axarr[0, 0].plot(x, y2, lw=2)\n14: axarr[0, 0].axhline(y=2, color='k', lw=1)\n15: axarr[0, 0].axvline(x=0, color='k')\n16: axarr[0, 0].set_ylim(0, 12)\n17: axarr[0, 0].set_xticks([])\n18: axarr[0, 0].set_yticks([])\n19: axarr[0, 0].set_frame_on(False)\n20: \n21: axarr[0, 1].plot(x, y1, lw=2)\n22: axarr[0, 1].axhline(y=2, color='k', lw=1)\n23: axarr[0, 1].axvline(x=0, color='k')\n24: axarr[0, 1].set_ylim(0, 12)\n25: axarr[0, 1].set_xticks([])\n26: axarr[0, 1].set_yticks([])\n27: axarr[0, 1].set_frame_on(False)\n28: \n29: axarr[1, 0].plot(x, y3, lw=2)\n30: axarr[1, 0].axhline(y=2, color='k', lw=1)\n31: axarr[1, 0].axvline(x=0, color='k')\n32: axarr[1, 0].set_ylim(0, 12)\n33: axarr[1, 0].set_xticks([])\n34: axarr[1, 0].set_yticks([])\n35: axarr[1, 0].set_frame_on(False)\n36: \n37: axarr[1, 1].plot(x, y4, lw=2)\n38: axarr[1, 1].axhline(y=2, color='k', lw=1)\n39: axarr[1, 1].axvline(x=0, color='k')\n40: axarr[1, 1].set_ylim(0, 12)\n41: axarr[1, 1].set_xticks([])\n42: axarr[1, 1].set_yticks([])\n43: axarr[1, 1].set_frame_on(False)\n44: \n45: plt.savefig('quadrants.pdf', format='pdf')\n`

\nWhat was new to me was the use of the `pyplot.subplots`

function to generate both the overall figure and the grid of subplots in one fell swoop. It’s possible that this technique was new to me because the documentation for Matplotlib’s Pyplot API doesn’t contain an entry for `subplots`

.^{1} I don’t remember where I first learned about it—Stack Overflow would be a good guess—but I’ve since learned that `pyplot.subplots`

is basically a combination of `pyplot.figure`

and `Figure.subplots`

.

Lines 6–10 define the four functions to be plotted. The `x`

values are the same for each and the `y`

values are named according to the quadrant they’re going to appear in. The `y`

values are defined so the moments and product of inertia match the annotations shown in the graph. The actual numbers used in these definitions are less important than their signs and their relative magnitudes, as the plots are intended to be generic.

Line 12 then defines the figure and the array of “axes,” where you have to remember that Matplotlib unfortunately uses that word in a way that doesn’t fit the rest of the world’s usage. In Matplotlib, “axes” is usually treated as a singular noun and refers to the area of an individual plot. After Line 12, the `axarr`

variable is a 2×2 array of Matplotlib axes.

Lines 13–19 then define the subplot in the upper left quadrant (what you learned as Quadrant II in analytic geometry class). Line 19 turns off the usual plot frame, and Lines 17–18 ensure there are no tick marks or labels. Lines 14–15 draw the [x] and [y] axes (here I’m using the normal definition of the word). You’ll notice that I’ve drawn the [x] axis at [y = 2] instead of [y = 0]. I didn’t like the way the graphs looked with the [x] axis lower, so I moved it up. Again, this doesn’t change the meaning behind the graph because it’s generic.

\nThe rest of the lines down through 43 are just repetitions for the other quadrants. Finally, Line 45 saves the figure to a PDF file that looks like this:

\n\nNow it’s time to annotate the figure. In theory, I could do this in Matplotlib, but that’s a lot of programming for something that’s more visual than algorithmic. If I were making dozens of these figures, I’d probably invest the time in annotating them in Matplotlib, but for a one-off it’s much faster to do it in OmniGraffle.

\nI can open the PDF directly in OmniGraffle and start editing. First, I select the white background rectangle that’s usually included in files like this and delete it. It doesn’t add anything, and it’s too easy to select by mistake. Then I select all the axes (again, the usual definition) and add the arrowheads.

\n\nThe

\n command is very helpful in selecting repeated elements like this.

After placing red circles at the maxima, it was time to label the axes (yes, usual definition; we’re out of Matplotlib now) and add the annotations. I made the annotations in LaTeXiT, a very nice little program for generating equations to be pasted into graphics programs. I’ve been using it for ages.

\n\nLaTeXiT cleverly ties into your existing LaTeX installation, so you can take advantage of all the packages you’re used to having available. I usually have LaTeXiT use the Arev package because I like its sans-serif look in figures.

\nAfter adding all the annotations, I export the figure from OmniGraffle as a PNG, run it through OptiPNG to save a little bandwidth, and upload it to the server. If this were a figure for a report instead of the blog, I’d export it as a PDF.

\n\n

\n

\n

1. I’ve complained about Matplotlib’s documentation before, so I’ll spare you the rant this time.

\n


"}, {"title": "Canvas and my remote iPad", "url": "http://leancrew.com/all-this/2018/01/canvas-and-my-remote-ipad/", "author": {"name": "Dr. Drang"}, "summary": "The most recent episode of the Canvas podcast fits in with my ability to use my iPad Pro as a portable computer.", "date_published": "2018-01-27T15:48:07+00:00", "id": "http://leancrew.com/all-this/2018/01/canvas-and-my-remote-ipad/", "content_html": "My older son’s notebook computer, an Asus bought a couple of years ago, has developed a hinge problem that’s reached the point where he doesn’t want to take it to class for fear of it falling apart. After talking over his needs, we decided he could get through the semester with my old MacBook Air. So I set it up for a new user, moved all of my files to an external disk, and delivered it to him yesterday. Coincidentally, on the drive back up through central Illinois, I listened to an episode of Canvas that gave decent explanation of why I could give up my notebook computer.

\nYou could, of course, argue that *every* episode of Canvas is an explanation of how you can give up your notebook computer. It’s the podcast in which Federico Viticci and Fraser Speirs cover the software and work habits that allow you to use your iOS devices (especially the iPad) to accomplish things you might otherwise think you need a “real computer” to do. But Episode 52 was especially apropos because it covered SSH clients for iOS, which are the reason I feel comfortable in my current state, without a laptop computer for the first time in maybe 25 years or more.

I held off getting an iPad for several years, not because I thought it was a toy or a “consumption only” device, but because my work habits—lots of scripting and command-line use in a multi-window environment—weren’t aligned with the iPad’s strengths. I like to think I wasn’t an anti-iPad zealot during this time. I saw it as the perfect computer for many people, including my wife. I got her an iPad 2 back in 2011; she hasn’t touched a “real computer” since.

\nSo when Split Screen and the iPad Pro were introduced, my ears pricked up. I got the 9.7″ model in late 2016 and have been slowly figuring out how to work with it. Panic’s Prompt and, more recently, the [mosh]^{1} client Blink Shell are my key apps. My typical setup is to have one of them on the right in Split Screen, connected to my iMac, while I edit in Textastic on the left. This edit/test system on my iPad is very similar to the BBEdit/Terminal window arrangement I use when working on a Mac.

The irony of using a modern, highly graphical device like the iPad to handle a remote, command-line connection to another computer is not lost on me. I often think back to using the Hazeltine terminal that was in a room around the corner from my graduate school office to connect to a Cyber 175 mainframe. And when my iPad is tethered to my iPhone, it’s not unlike using the Hazeltine’s acoustic coupler.

\n\n

\n\n

\n\n

\n


"}, {"title": "Transforming section properties and principal directions", "url": "http://leancrew.com/all-this/2018/01/transforming-section-properties-and-principal-directions/", "author": {"name": "Dr. Drang"}, "summary": "Arbitrary and specific coordinate axes.", "date_published": "2018-01-26T02:26:45+00:00", "id": "http://leancrew.com/all-this/2018/01/transforming-section-properties-and-principal-directions/", "content_html": "The Python `section`

module has one last function we haven’t covered: the determination of the principal moments of inertia and the axes associated with them. We’ll start by looking at how the moments and products of inertia change with our choice of axes.

The formula for area,

\n[A = \\iint\\limits_A dx \\, dy]\nwill give the same answer regardless of where we put the origin of the [x\\text{-}y] coordinate system or how we orient them. You can see this if you think of the area as being the sum of all the little [dx\\, dy] squares in the cross-section.

\n\nThe formulas for the location of the centroid,

\n[x_c = \\frac{1}{A} \\iint\\limits_A x\\; dx \\, dy]\n[y_c = \\frac{1}{A} \\iint\\limits_A y\\; dx \\, dy]\nwill give different answers for different positions and orientations of the [x] and [y] axes, but those answers will all correspond to the same physical point of the cross-section.

\nThe moments and product of inertia as we defined them, relative to the centroid,

\n[I_{xx} = \\iint\\limits_A (y - y_c)^2\\; dx \\, dy]\n[I_{yy} = \\iint\\limits _A (x - x_c)^2 \\; dx \\, dy]\n[I_{xy} = \\iint\\limits_A (y - y_c)\\; (x - x_c)\\; dx \\, dy]\ndo not depend on the position of the [x\\text{-}y] origin (because [x - x_c] and [y - y_c] measure the horizontal and vertical distances away from the centroid, which is the same for any origin), but *do* depend on the orientation of the axes. We’ll show how this works by putting the origin at the centroid (which simplifies the math but does not make the results any less general) and comparing the moments and product of inertia for two coordinate systems, one of which is rotated relative to the other.

Note that [\\theta] is the angle from the [x] axis to the [\\xi] axis and is positive in the counterclockwise direction.^{1}

Because our origin is at the centroid, [x_c = y_c = \\xi_c = \\eta_c = 0], and we can write the equations for the moments and products of inertia in a more compact form:

\nIn the [x\\text{-}y] system,

\n[I_{xx} = \\iint\\limits_A y^2\\; dx \\, dy]\n[I_{yy} = \\iint\\limits _A x^2 \\; dx \\, dy]\n[I_{xy} = \\iint\\limits_A x y\\; dx \\, dy]\nand in the [\\xi\\text{-}\\eta] system,

\n[I_{\\xi\\xi} = \\iint\\limits_A \\eta^2\\; d\\xi \\, d\\eta]\n[I_{\\eta\\eta} = \\iint\\limits _A \\xi^2 \\; d\\xi \\, d\\eta]\n[I_{\\xi\\eta} = \\iint\\limits_A \\xi \\eta\\; d\\xi \\, d\\eta]\nWe can go back and forth between the two coordinate systems by noting that

\n[\\xi = \\quad x \\cos\\theta + y \\sin\\theta]\n[\\eta = -x \\sin\\theta + y \\cos\\theta]\nThus,

\n[I_{\\xi\\xi} = \\iint\\limits_A \\left( x^2 \\sin^2\\theta - 2 x y \\sin\\theta \\cos\\theta + y^2 \\cos^2\\theta \\right) \\, dx\\, dy]\n[I_{\\eta\\eta} = \\iint\\limits_A \\left( x^2 \\cos^2\\theta + 2 x y \\sin\\theta \\cos\\theta + y^2 \\sin^2\\theta \\right) \\, dx\\, dy]\n[I_{\\xi\\eta} = \\iint\\limits_A \\left( \\left( y^2-x^2 \\right) \\sin\\theta \\cos\\theta + x y \\left( \\cos^2\\theta - \\sin^2\\theta \\right) \\right) \\, dx\\, dy]\nThe [\\theta] terms can come out of the integrals, leaving us with

\n[I_{\\xi\\xi} = \\sin^2\\theta \\iint\\limits_A x^2\\; dx\\,dy - 2\\sin\\theta \\cos\\theta \\iint\\limits_A x\\,y\\; dx\\,dy + \\cos^2\\theta \\iint\\limits_A y^2\\; dx\\,dy]\nor

\n[I_{\\xi\\xi} = I_{xx} \\cos^2\\theta + I_{yy} \\sin^2\\theta - 2 I_{xy} \\sin\\theta \\cos\\theta]\nSimilarly,

\n[I_{\\eta\\eta} = I_{xx} \\sin^2\\theta + I_{yy} \\cos^2\\theta + 2 I_{xy} \\sin\\theta \\cos\\theta]\n[I_{\\xi\\eta} = \\left( I_{xx} - I_{yy} \\right) \\sin\\theta \\cos\\theta + I_{xy} \\left( \\cos^2\\theta - \\sin^2\\theta \\right)]\nSo far, this is just a bunch of algebra that could’ve been done quickly in SymPy. Now it’s time to start thinking.
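The three transformation formulas are easy to check numerically. The sketch below is not part of the `section` module; `rotate` is a hypothetical name and the section properties are made up.

```python
from math import sin, cos, radians

def rotate(Ixx, Iyy, Ixy, theta):
    'Moments and product of inertia about axes rotated counterclockwise by theta (radians).'
    s, c = sin(theta), cos(theta)
    Ixixi = Ixx*c**2 + Iyy*s**2 - 2*Ixy*s*c
    Ietaeta = Ixx*s**2 + Iyy*c**2 + 2*Ixy*s*c
    Ixieta = (Ixx - Iyy)*s*c + Ixy*(c**2 - s**2)
    return Ixixi, Ietaeta, Ixieta

# Rotating the axes by 90 degrees should swap the two moments of inertia
# and flip the sign of the product of inertia.
Ia, Ib, Iab = rotate(10.0, 6.0, 3.0, radians(90))
```

Adding the first two formulas also shows that the sum of the two moments of inertia is the same at every angle, which is a handy sanity check on any implementation.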

\nLooking at the expression for [I_{\\xi\\eta}], you might notice that each term includes parts from the double angle formulas. So we can rewrite it this way:

\n[I_{\\xi\\eta} = \\frac{1}{2} \\left( I_{xx} - I_{yy} \\right) \\sin 2\\theta + I_{xy} \\cos 2\\theta]\nNote that [I_{\\xi\\eta} = 0] when

\n[\\left( I_{yy} - I_{xx} \\right) \\sin 2\\theta = 2 I_{xy} \\cos 2\\theta]\nor

\n[\\tan 2\\theta = \\frac{2 I_{xy}}{I_{yy} - I_{xx}}]\nBecause the tangent function repeats itself every 180°, this expression can be solved with an infinite number of values of [\\theta] that are 90° apart from one another. These orientations all look basically the same, except the [\\xi] and [\\eta] axes swap positions and flip around. For each of them, [I_{\\xi\\eta} = 0].

\nSince we’ve written the expression for [I_{\\xi\\eta}] in terms of [2\\theta], let’s do the same for [I_{\\xi\\xi}] and [I_{\\eta\\eta}]. We start by recognizing the double angle formula for sine in each equation:

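\n[I_{\\xi\\xi} = I_{xx} \\cos^2\\theta + I_{yy} \\sin^2\\theta - I_{xy} \\sin 2\\theta]\n[I_{\\eta\\eta} = I_{xx} \\sin^2\\theta + I_{yy} \\cos^2\\theta + I_{xy} \\sin 2\\theta]\nThen we use the thoroughly non-obvious identity,^{2}

\n[A\\, \\cos^2\\theta + B\\, \\sin^2\\theta = \\frac{A + B}{2} \\left( \\cos^2\\theta + \\sin^2\\theta \\right) + \\frac{A - B}{2} \\left( \\cos^2\\theta - \\sin^2\\theta \\right)]

and use the usual trig identities to get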

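\n[A\\, \\cos^2\\theta + B\\, \\sin^2\\theta = \\frac{A + B}{2} + \\frac{A - B}{2} \\cos 2\\theta]\nTherefore,^{3}

\n[I_{\\xi\\xi} = \\frac{I_{xx} + I_{yy}}{2} - \\frac{I_{yy} - I_{xx}}{2} \\cos 2\\theta - I_{xy} \\sin 2\\theta]

and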

\n[I_{\\eta\\eta} = \\frac{I_{xx} + I_{yy}}{2} + \\frac{I_{yy} - I_{xx}}{2} \\cos 2\\theta + I_{xy} \\sin 2\\theta]\nNow let’s look at how these moments of inertia change with [\\theta]. Suppose we wanted to find the [\\theta] that maximized (or minimized) the value of [I_{\\xi\\xi}]? We’d take the derivative of the expression for [I_{\\xi\\xi}] and set it to zero:

\n[\\frac{dI_{\\xi\\xi}}{d\\theta} = \\left( I_{yy} - I_{xx} \\right) \\sin 2\\theta - 2 I_{xy} \\cos 2\\theta = 0]\nThis should look familiar. Its solution is

\n[\\tan 2\\theta = \\frac{2 I_{xy}}{I_{yy} - I_{xx}}]\nthe same thing we got setting [I_{\\xi\\eta} = 0]. And if we took the derivative of [I_{\\eta\\eta}] with respect to [\\theta] and set it to zero to find the maxima and minima of [I_{\\eta\\eta}], we’d get the same thing.

\nThese orientations of the axes are known as the *principal directions* of the cross-section. They give us both a product of inertia of zero and the largest and smallest values of the moments of inertia. (If [I_{\\xi\\xi}] is at a maximum, then [I_{\\eta\\eta}] is at a minimum, and vice versa.)

The largest and smallest moments of inertia are commonly called [I_1] and [I_2], respectively. They can be calculated by substituting our solution for [\\theta] back into the expressions for [I_{\\xi\\xi}] and [I_{\\eta\\eta}], but there’s some messy math along the way. It’s easier to recognize that the maximum and minimum moments of inertia are determined entirely by the second and third terms of [I_{\\xi\\xi}] and [I_{\\eta\\eta}], which are in the form

\n[A \\cos\\alpha + B \\sin\\alpha]\nThis expression can be thought of as the horizontal projection of a pair of vectors, one of length [A] at an angle [\\alpha] to the horizontal and the other of length [B] at right angles to [A].

\n\nThe largest value of this expression will come when the hypotenuse of the triangle, [\\sqrt{A^2 + B^2}], is itself horizontal and pointing to the right. The algebraically smallest value will come when the hypotenuse is horizontal and pointing to the left.

\nApplying this idea to our expressions for [I_{\\xi\\xi}] and [I_{\\eta\\eta}], the larger principal moment of inertia will be

\n[I_1 = \\frac{I_{xx} + I_{yy}}{2} + \\sqrt{ \\left( \\frac{I_{yy} - I_{xx}}{2} \\right)^2 + I_{xy}^2}]\nand the smaller will be

\n[I_2 = \\frac{I_{xx} + I_{yy}}{2} - \\sqrt{ \\left( \\frac{I_{yy} - I_{xx}}{2} \\right)^2 + I_{xy}^2}]\nThe axis associated with the larger principal moment of inertia is called the major principal axis and the axis associated with the smaller principal moment of inertia is called the minor principal axis. These are sometimes called the strong and weak axes, respectively. Whatever you call them, they’ll be 90° apart.
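As a numerical check on these results, the sketch below picks one solution of the [\\tan 2\\theta] equation, rotates to it using the double-angle forms, and compares the outcome with the closed-form principal moments. The section properties are made up, and the variable names are only for illustration.

```python
from math import sin, cos, sqrt, atan2

Ixx, Iyy, Ixy = 10.0, 6.0, 3.0   # made-up values

# Principal moments from the closed-form expressions.
avg = (Ixx + Iyy)/2
radius = sqrt(((Iyy - Ixx)/2)**2 + Ixy**2)
I1, I2 = avg + radius, avg - radius

# One solution of tan(2*theta) = 2*Ixy/(Iyy - Ixx).
theta = atan2(2*Ixy, Iyy - Ixx)/2

# Moments and product of inertia at that orientation.
s2, c2 = sin(2*theta), cos(2*theta)
Ixixi = avg - (Iyy - Ixx)/2*c2 - Ixy*s2
Ietaeta = avg + (Iyy - Ixx)/2*c2 + Ixy*s2
Ixieta = (Ixx - Iyy)/2*s2 + Ixy*c2
```

The product of inertia comes out as zero (to rounding), and the two moments match [I_1] and [I_2]. With this particular solution, though, it is the [\\eta] axis that carries the larger moment, a reminder that the tangent equation alone doesn’t say which principal axis is the major one.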

\nNow let’s look at the `principal`

function from the `section`

module and see how these formulas were used.

`python:\ndef principal(Ixx, Iyy, Ixy):\n 'Principal moments of inertia and orientation.'\n\n avg = (Ixx + Iyy)/2\n diff = (Ixx - Iyy)/2 # signed\n I1 = avg + sqrt(diff**2 + Ixy**2)\n I2 = avg - sqrt(diff**2 + Ixy**2)\n theta = atan2(-Ixy, diff)/2\n return I1, I2, theta\n`

\nLooks like I was careless with the [(I_{yy} - I_{xx})] term and got it backward in the expression for `diff`

, doesn’t it? Also, there seems to be a stray negative sign in the expression for `theta`

. But the `principal`

function does work despite these apparent errors. What we’re running into is the sometimes vexing difference between math and computation.

First, in the formulas for [I_1] and [I_2], the `diff`

term gets squared, so flipping its sign doesn’t matter. Second, the numerical calculation of the arctangent isn’t as straightforward as you might think.

There are two arctangent functions in Python’s `math`

library (and in the libraries of many languages):

- `atan` takes a single argument and returns a result between [-\\pi/2] and [\\pi/2] (-90° and 90°, but in radians instead of degrees). \n
- `atan2` takes two arguments, the [y] and [x] components of a vector directed out from the origin at the angle of interest, and returns a result between [-\\pi] and [\\pi] (-180° and 180°), depending on which quadrant the vector points toward. \n

We can’t use `atan`

in our code because it isn’t robust for some inputs. If we tried

`python:\ntheta = atan(2*Ixy/(Iyy - Ixx))/2\n`

\nas our formula suggests, we’d get divide-by-zero errors whenever [I_{xx} = I_{yy}]. We can’t have that because there are real cross sections of practical importance for which that’s the case. Any equal-legged angle, for example.

\n\nBut `atan2`

can be a problem, too, because we need to distinguish between the major and minor principal axes. In particular, I decided that `theta`

should be the angle between the [x] axis and the major principal axis. Using `atan2`

directly from the formula like this

`python:\ntheta = atan2(2*Ixy, Iyy - Ixx)/2\n`

\ncan return an angle 90° away from what we want.

\nUsing the `inertia`

function developed earlier, the moments and product of inertia of the equal-legged angle we just looked at are

`Ixx = 9.4405\nIyy = 9.4405\nIxy = -5.1429\n`

\nPlopping these numbers into the naive formula above, we get

\n`theta = -0.7854\n`

\nor -45°. This is the angle from the [x] axis to the weak axis, not the strong axis. The correct answer is 45°, just like the blue line in the figure.
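Both failure modes are easy to reproduce. The sketch below uses the rounded values for the equal-legged angle given above; the variable names are mine and only for illustration.

```python
from math import atan, atan2, degrees

# Rounded section properties of the equal-legged angle
Ixx, Iyy, Ixy = 9.4405, 9.4405, -5.1429

# The single-argument arctangent needs the quotient, which doesn't exist here
try:
    atan(2*Ixy/(Iyy - Ixx))/2
    atan_failed = False
except ZeroDivisionError:
    atan_failed = True
print(atan_failed)

# atan2 copes with the zero, but the naive argument order picks the weak axis
naive = atan2(2*Ixy, Iyy - Ixx)/2
print(round(naive, 4), round(degrees(naive), 1))
```

So `atan2` solves the robustness problem but not, by itself, the problem of choosing the right axis.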

\nTo figure out a way around this problem, let’s plot [I_{\\xi\\xi}] for the four cases of interest:

\n- \n
- [I_{xy} > 0 \\quad \\text{and} \\quad I_{yy} > I_{xx}] \n
- [I_{xy} > 0 \\quad \\text{and} \\quad I_{yy} < I_{xx}] \n
- [I_{xy} < 0 \\quad \\text{and} \\quad I_{yy} < I_{xx}] \n
- [I_{xy} < 0 \\quad \\text{and} \\quad I_{yy} > I_{xx}] \n

This will let us see what we need for all four quadrants of the `atan2`

function.

In each of the subplots, successive peaks and valleys of [I_{\\xi\\xi}] are 90° apart.

\nWe’re looking for the maximum values of [I_{\\xi\\xi}] that are closest to [\\theta = 0], which I’ve marked with the red dots. That means

\n- \n
- When [I_{xy} > 0 \\quad \\text{and} \\quad I_{yy} > I_{xx}] (upper right), we want the negative [\\theta] with an absolute value greater than 45°. \n
- When [I_{xy} > 0 \\quad \\text{and} \\quad I_{yy} < I_{xx}] (upper left), we want the negative [\\theta] with an absolute value less than 45°. \n
- When [I_{xy} < 0 \\quad \\text{and} \\quad I_{yy} < I_{xx}] (lower left), we want the positive [\\theta] with an absolute value less than 45°. \n
- When [I_{xy} < 0 \\quad \\text{and} \\quad I_{yy} > I_{xx}] (lower right), we want the positive [\\theta] with an absolute value greater than 45°. \n

The invocation of `atan2`

that gives us all of these is

`python:\ntheta = atan2(-2*Ixy, Ixx - Iyy)/2\n`

\nwhich we can visualize this way, where the curved arrows represent the angle [2\\theta] for each type of result:

\n\nBy flipping the signs of both arguments of `atan2`

, we get the sign and magnitude of `theta`

we’re looking for. Note that the expression used in the `principal`

function,

`python:\ntheta = atan2(-Ixy, diff)/2\n`

\nis equivalent to

\n`python:\ntheta = atan2(-2*Ixy, Ixx - Iyy)/2\n`

\nbecause of the way we defined `diff`

. And by flipping the signs of both the numerator and denominator, we’re not changing the quotient or the definition of [\\theta]. We’re just choosing which solution of

\n[\\tan 2\\theta = \\frac{2 I_{xy}}{I_{yy} - I_{xx}}]\nis the most useful.
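If you’d rather not take the quadrant argument on faith, a brute-force check is easy to write. The sketch below assumes the rotated moment of inertia has the standard form, the average plus `diff` times the cosine of [2θ] minus `Ixy` times the sine of [2θ], which is consistent with the arctangent condition above. The function names and the four sets of sample values are hypothetical.

```python
from math import atan2, cos, sin, radians, degrees

def rotated_moment(Ixx, Iyy, Ixy, theta):
    'Moment of inertia about an axis rotated by theta (assumed standard form).'
    avg = (Ixx + Iyy)/2
    diff = (Ixx - Iyy)/2
    return avg + diff*cos(2*theta) - Ixy*sin(2*theta)

def major_axis_angle(Ixx, Iyy, Ixy):
    'Angle from the x axis to the major principal axis.'
    return atan2(-2*Ixy, Ixx - Iyy)/2

def brute_force_angle(Ixx, Iyy, Ixy, steps=18000):
    'Scan angles in (-90°, 90°] and return the one with the largest moment.'
    angles = [radians(-90 + 180*(k + 1)/steps) for k in range(steps)]
    return max(angles, key=lambda t: rotated_moment(Ixx, Iyy, Ixy, t))

# One made-up example for each of the four sign combinations
cases = {
    'Ixy > 0, Iyy > Ixx': (5, 11, 2),
    'Ixy > 0, Iyy < Ixx': (11, 5, 2),
    'Ixy < 0, Iyy < Ixx': (11, 5, -2),
    'Ixy < 0, Iyy > Ixx': (5, 11, -2),
}
for label, (Ixx, Iyy, Ixy) in cases.items():
    formula = degrees(major_axis_angle(Ixx, Iyy, Ixy))
    scanned = degrees(brute_force_angle(Ixx, Iyy, Ixy))
    print(label, round(formula, 2), round(scanned, 2))
```

The scan is deliberately dumb. It knows nothing about quadrants, so agreement with the `atan2` expression in all four cases is decent evidence that the signs are right.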

\nIf you’re mathematically inclined, you may recognize the rotation of axes as a tensor transformation and the determination of principal moments of inertia and principal directions as an eigenvalue/eigenvector problem. But writing `principal`

in those terms would have required me to use more libraries than just `math`

. The formulas in `principal`

are simple, even if their derivation can take us all over the map.

Now that we’ve figured out how `principal`

works, what good is it? It can be shown^{4} that when the loads on a beam are aligned with one of the principal directions, the beam will bend in that direction only. If the loading is not aligned with a principal direction, the beam will bend both in the direction of the load and in a direction perpendicular to it.

For example, if we were using the equal-legged angle above as a beam and hung a vertical downward load off of it, it would bend both downward and to the left. Not the most intuitively obvious result, but true nonetheless.

\nEveryone who takes an advanced strength of materials class learns the formulas for the principal moments of inertia and their directions, but there’s usually a bit of hand waving to make the math go faster. And, because the strong and weak directions are typically easy to determine by inspection, the details of picking out the correct arctangent value aren’t discussed. But there’s a richness to even the simplest mechanics, and I enjoy exploring it. And since computers can’t figure things out by inspection, you can’t gloss over the details when writing a program.

\n\n

\n

\n

- \n
- \n
In case you don’t recognize them, the Greek letters [\\xi] and [\\eta] are *xi* and *eta*, respectively. They’re often used for coordinate directions when [x] and [y] are already taken. You’re probably more familiar with *theta*, [\\theta], usually the first choice to represent an angle. ↩ \n - \n
Don’t believe it? Well, I told you it was non-obvious. But go ahead and multiply out the right hand side and see for yourself. ↩

\n \n - \n
Pay close attention to the negative signs. ↩

\n \n - \n
Don’t worry, I’m not going to show it (not here, anyway). We’re almost done. ↩

\n \n

\n

[If the formatting looks odd in your feed reader, visit the original article]

"}, {"title": "Section properties and SymPy", "url": "http://leancrew.com/all-this/2018/01/section-properties-and-sympy/", "author": {"name": "Dr. Drang"}, "summary": "Symbolic math library makes quick work of integration and simplification.", "date_published": "2018-01-18T23:28:54+00:00", "id": "http://leancrew.com/all-this/2018/01/section-properties-and-sympy/", "content_html": "I felt a little guilty about this footnote in yesterday’s post:

\n\n\nYes, the product of inertia integral is definitely more complicated if you’re going to do the derivation by hand. So don’t do it by hand. Learn SymPy and you’ll be able to zip through it.

\n

This is entirely too much like those “it can be easily shown” tricks that math textbook writers use to avoid complicated and unintuitive manipulations. If I’m going to claim you can zip through the product of inertia, I should be able to prove it. So let’s do it.

\nSymPy comes with the Anaconda Python distribution, and that’s how I installed it. I believe you can get it working with Apple’s system-supplied Python, but Anaconda is so helpful in getting and maintaining a numerical/scientific Python installation, I don’t see why you’d try anything else.

\nIf you’ve ever used a symbolic math program, like Mathematica or Maple, SymPy will seem reasonably familiar to you. My main hangup is the need in SymPy to declare certain variables as symbols before doing any other work. I understand the reason for it—SymPy needs to protect symbols from being evaluated the way regular Python variables are—but I tend to forget to declare all the symbols I need and don’t realize it until an error message appears.

\nThat one personal quirk aside, I find SymPy easy to use for the elementary math I tend to do. The functions I use most often, like `diff`

, `integrate`

, `expand`

, and `factor`

, are easy to remember, so I don’t have to continually look things up in the documentation. And the docs are well-organized when I do have to use them.

The problem we’re going to look at is the solution of this integral for a polygonal area:

\n[\\iint\\limits_A xy\\; dx dy]\nWe’ll use Green’s theorem to turn this area integral into a path integral around the polygon’s perimeter:

\n[\\iint\\limits_A xy\\; dx dy = \\oint\\limits_C \\frac{1}{2} x^2 y\\; dy]\nFor each side of the polygon, from point [(x_i, y_i)] to point [(x_{i+1}, y_{i+1})], the line segment defining the perimeter can be expressed in parametric form,

\n[x = x_i + (x_{i+1} - x_i)\\;t]\n[y = y_i + (y_{i+1} - y_i)\\;t]\nwhich means

\n[dy = (y_{i+1} - y_i)\\; dt]\nNow we’re ready to use SymPy to evaluate and simplify the integral for a single line segment. To make the typing go faster as I used SymPy, which I ran interactively in a Jupyter console session, I decided to use 0 for subscript [i] and 1 for subscript [i+1]. Here’s a transcript of the session, where I’ve broken up long lines to make it easier to read:

\n`In [1]: from sympy import *\n\nIn [2]: x, y, x_0, x_1, y_0, y_1, t = symbols('x y x_0 x_1 y_0 y_1 t')\n\nIn [3]: x = x_0 + (x_1 - x_0)*t\n\nIn [4]: y = y_0 + (y_1 - y_0)*t\n\nIn [5]: full = integrate(x**2*y/2*diff(y, t), (t, 0, 1))\n\nIn [6]: full\nOut[6]: -x_0**2*y_0**2/8 + x_0**2*y_0*y_1/12 + x_0**2*y_1**2/24\n - x_0*x_1*y_0**2/12 + x_0*x_1*y_1**2/12 - x_1**2*y_0**2/24\n - x_1**2*y_0*y_1/12 + x_1**2*y_1**2/8\n\nIn [7]: part = x_0**2*y_0*y_1/12 + x_0**2*y_1**2/24 - x_0*x_1*y_0**2/12\n + x_0*x_1*y_1**2/12 - x_1**2*y_0**2/24 - x_1**2*y_0*y_1/12\n\nIn [8]: factor(part)\nOut[8]: (x_0*y_1 - x_1*y_0)*(2*x_0*y_0 + x_0*y_1 + x_1*y_0 + 2*x_1*y_1)/24\n\nIn [9]: print(latex(_))\n\\frac{1}{24} \\left(x_{0} y_{1} - x_{1} y_{0}\\right) \\left(2 x_{0} y_{0}\n + x_{0} y_{1} + x_{1} y_{0} + 2 x_{1} y_{1}\\right)\n`

\nWe start by importing everything from SymPy and defining all the symbols needed. Then we define the parametric equations of the line segment in `In[3]`

and `In[4]`

.

`In[5]`

does a lot of work. We define the integrand inside the `integrate`

function and tell it to integrate that expression over [t] from 0 to 1 (i.e., from [(x_0, y_0)] to [(x_1, y_1)]). Note that we didn’t need to explicitly enter the expressions for [x], [y], or [dy]; SymPy did all the substitution for us, including the differentiation.

I called the result of the integration `full`

because it contains every term of the integration. But we learned in the last post that the leading and trailing terms get cancelled out when we sum over all the segments of the polygon. So I copied just the inner terms from `full`

and pasted them into `In[7]`

to define a new expression, called `part`

.

`In[8]`

then factors `part`

to get a more compact expression, and `In[9]`

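converts it to a LaTeX expression, so I can render it nicely here:

\n[\\frac{1}{24} \\left(x_{0} y_{1} - x_{1} y_{0}\\right) \\left(2 x_{0} y_{0} + x_{0} y_{1} + x_{1} y_{0} + 2 x_{1} y_{1}\\right)]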

With a quick search-and-replace to convert the subscripts to their more general forms, we get the expression presented in the last post (albeit with the terms in a different order):

\n[\\iint\\limits_A xy\\; dx dy = \\frac{1}{24} \\sum_{i=0}^{n-1} \\left(x_{i} y_{i+1} - x_{i+1} y_{i}\\right) \\left(2 x_{i} y_{i} + x_{i} y_{i+1} + x_{i+1} y_{i} + 2 x_{i+1} y_{i+1}\\right)]\nSymPy didn’t do everything for us. We had to figure out the Green’s theorem transformation and recognize the cancellation of the leading and trailing terms of `full`

. But it did all the boring stuff, which is its real value.
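A symbolic result like this deserves a numerical spot check. Here’s a minimal sketch in plain Python (no SymPy) that sums the factored term around a rectangle with one corner at the origin, where the integral of [xy] is known to be [a^2 b^2/4]. The function name `product_integral` and the rectangle dimensions are just for illustration.

```python
from fractions import Fraction

def product_integral(pts):
    'Integral of x*y over a polygon, summing the factored per-segment term.'
    closed = pts + pts[:1]
    s = sum((x0*y1 - x1*y0)*(2*x0*y0 + x0*y1 + x1*y0 + 2*x1*y1)
            for (x0, y0), (x1, y1) in zip(closed, closed[1:]))
    return Fraction(s, 24)

# Rectangle of width a and height b, vertices listed counterclockwise
a, b = 2, 3
rect = [(0, 0), (a, 0), (a, b), (0, b)]
print(product_integral(rect), Fraction(a**2*b**2, 4))
```

Integer vertices and `Fraction` keep the arithmetic exact, so the two printed values can be compared without worrying about roundoff.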

\n


"}, {"title": "Green’s theorem and section properties", "url": "http://leancrew.com/all-this/2018/01/greens-theorem-and-section-properties/", "author": {"name": "Dr. Drang"}, "summary": "Deriving the formulas for section properties of polygons.", "date_published": "2018-01-17T17:07:13+00:00", "id": "http://leancrew.com/all-this/2018/01/greens-theorem-and-section-properties/", "content_html": "In the last post, I presented a simple Python module with functions for calculating section properties of polygons. Now we’ll go through the derivations of the formulas used in those functions.

\nThe basis for all the formulas is Green’s theorem, which is usually presented something like this:

\n[\\oint\\limits_C P\\; dx + Q\\; dy = \\iint\\limits_A \\left( \\frac{\\partial Q}{\\partial x} - \\frac{\\partial P}{\\partial y} \\right)\\; dx dy]\nwhere [P] and [Q] are functions of [x] and [y], [A] is the region over which the right integral is being evaluated, and [C] is the boundary of that region. The integral on the left is evaluated in accordance with the right-hand rule, i.e., counterclockwise for the usual orientation of the [x] and [y] axes.

\n\nThe section properties of interest are all area integrals. We’ll use Green’s theorem to turn them into boundary integrals and then evaluate those integrals using the coordinates of the polygon’s vertices.

\nThis is the easiest one, but instead of going through the full derivation here, I’ll refer you to this excellent StackExchange page by apnorton and just hit the highlights.

\n- \n
The area is defined as

\n[A = \\iint\\limits_A dx dy]\nand we’ll choose [P = 0] and [Q = x] as our Green’s theorem functions. This gives us

\n[A = \\iint\\limits_A dx dy = \\oint\\limits_C x\\; dy] \nWe break the polygonal boundary into a series of straight-line segments, each of which can be parameterized this way:

\n[x = x_i + (x_{i+1} - x_i)\\;t]\n[y = y_i + (y_{i+1} - y_i)\\;t]\n[dy = (y_{i+1} - y_i)\\; dt]\nwhere the [(x_i, y_i)] are the coordinates of the vertices.

\nPlugging these equations into the integral, we get

\n[A = \\frac{1}{2} \\sum_{i=0}^{n-1}\\; (x_{i+1} + x_i)(y_{i+1} - y_i)] \n

A note on the indexing: The polygon has [n] vertices, which we’ll number from 0 to [n-1]. The last segment of the boundary goes from [(x_{n-1}, y_{n-1})] to [(x_0, y_0)]. To make this work with the equation, we’ll define [(x_n, y_n) = (x_0, y_0)].

\nLet’s compare this with the `area`

function in the module:

`python:\ndef area(pts):\n    'Area of cross-section.'\n\n    if pts[0] != pts[-1]:\n        pts = pts + pts[:1]\n    x = [ c[0] for c in pts ]\n    y = [ c[1] for c in pts ]\n    s = 0\n    for i in range(len(pts) - 1):\n        s += x[i]*y[i+1] - x[i+1]*y[i]\n    return s/2\n`

\nWe start by checking the `pts`

list to see if the starting and ending items match. If they don’t, we copy the starting item to the end to fit the indexing convention discussed above. We then initialize some variables and execute a loop, summing terms along the way. Rewriting the loop in mathematical terms, we get

\n[A = \\frac{1}{2} \\sum_{i=0}^{n-1}\\; (x_i y_{i+1} - x_{i+1} y_i)]

This doesn’t look like the equation derived from Green’s theorem, does it? But it’s not too hard to see that they are equivalent. Expanding out the binomial product in the earlier equation gives

\n[x_{i+1} y_{i+1} - x_{i+1} y_i + x_i y_{i+1} - x_i y_i]\nAs we loop through all the values of [i] from 0 to [n-1], the leading term of one trip through the loop will cancel the trailing term of the next trip through the loop. Here’s an example for a triangle:

\n[\\quad (x_1 y_1 - x_1 y_0 + x_0 y_1 - x_0 y_0 )]\n[+\\; (x_2 y_2 - x_2 y_1 + x_1 y_2 - x_1 y_1 )]\n[+\\; (x_0 y_0 - x_0 y_2 + x_2 y_0 - x_2 y_2 )]\nAfter the cancellations, all that’s left are the inner terms, and that’s the formula used in the `area`

function.

The cancellation doesn’t do much for us here, changing from two additions and one multiplication per loop to two multiplications and one addition per loop. But we’ll see this same sort of cancellation in the other section properties, and it will provide greater simplification in those.
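If the cancellation argument feels too slick, the two forms of the sum are easy to compare directly. This is a minimal sketch with illustrative function names (`area_green` and `area_shoelace` are not part of the module) and a 3-4-5 right triangle as the test shape.

```python
def area_green(pts):
    'Area from the sum as it comes out of the Green\'s theorem integral.'
    closed = pts + pts[:1]
    return sum((x1 + x0)*(y1 - y0)
               for (x0, y0), (x1, y1) in zip(closed, closed[1:]))/2

def area_shoelace(pts):
    'Area from the inner terms left over after the cancellation.'
    closed = pts + pts[:1]
    return sum(x0*y1 - x1*y0
               for (x0, y0), (x1, y1) in zip(closed, closed[1:]))/2

# Right triangle with legs of 4 and 3, vertices listed counterclockwise
triangle = [(0, 0), (4, 0), (0, 3)]
print(area_green(triangle), area_shoelace(triangle))
```

Individual terms of the two sums differ from segment to segment; it’s only the totals around the closed boundary that agree.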

\nThe centroid is essentially the average position of the area. If a sheet of material of uniform thickness and density were cut into a shape, the centroid would be the center of gravity, the balance point, of that shape. The coordinates of the centroid are defined this way:

\n[x_c = \\frac{1}{A} \\iint\\limits_A x\\; dx dy]\n[y_c = \\frac{1}{A} \\iint\\limits_A y\\; dx dy]\nLet’s derive the formula for [x_c] for a polygon; the derivation of the formula for [y_c] will be similar.

\nIn applying Green’s theorem, we’ll take [P = 0] and [Q = \\frac{1}{2} x^2]. Therefore,

\n[x_c = \\frac{1}{A} \\iint\\limits_A x\\; dx dy = \\frac{1}{A} \\oint\\limits_C \\frac{1}{2} x^2\\; dy]\nBreaking the polygonal boundary into straight-line segments and using the same parametric equations as before, we get an integral that looks like this

\n[\\int_0^1 \\frac{1}{2} \\left[ x_i^2 + 2 x_i (x_{i+1} - x_i)\\; t + (x_{i+1} - x_i)^2\\; t^2\\right] (y_{i+1} - y_i)\\; dt]\nfor each segment. This integral evaluates to

\n[\\frac{1}{6} (x_{i+1}^2 + x_{i+1} x_i + x_i^2)\\; ( y_{i+1} - y_i)]\nso our formula for the centroid is

\n[x_c = \\frac{1}{6A} \\sum_{i=0}^{n-1} (x_{i+1}^2 + x_{i+1} x_i + x_i^2)\\; ( y_{i+1} - y_i)]\nAs we found in the formula for area, the leading and trailing terms in the expansion of this product cancel out as we loop through the sum, leaving us with

\n[x_c = \\frac{1}{6A} \\sum_{i=0}^{n-1} -y_i x_{i+1}^2 + y_{i+1}x_{i+1} x_i - y_i x_{i+1} x_i + y_{i+1} x_i^2]\nThis looks like a mess, but it can be factored into a more compact form:

\n[x_c = \\frac{1}{6A} \\sum_{i=0}^{n-1} (x_{i+1} + x_i)\\;(x_i y_{i+1} - x_{i+1} y_i)]\nThe expression for the other centroidal coordinate is as you’d expect:

\n[y_c = \\frac{1}{6A} \\sum_{i=0}^{n-1} (y_{i+1} + y_i)\\;(x_i y_{i+1} - x_{i+1} y_i)]\nThese are the formulas used in the `centroid`

function.

`python:\ndef centroid(pts):\n    'Location of centroid.'\n\n    if pts[0] != pts[-1]:\n        pts = pts + pts[:1]\n    x = [ c[0] for c in pts ]\n    y = [ c[1] for c in pts ]\n    sx = sy = 0\n    a = area(pts)\n    for i in range(len(pts) - 1):\n        sx += (x[i] + x[i+1])*(x[i]*y[i+1] - x[i+1]*y[i])\n        sy += (y[i] + y[i+1])*(x[i]*y[i+1] - x[i+1]*y[i])\n    return sx/(6*a), sy/(6*a)\n`
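The same kind of check works for the centroid. The sketch below compares the sum as it comes out of the integral with the factored form, using exact rational arithmetic so the comparison isn’t muddied by roundoff. The helper names (`edges`, `xc_green`, `xc_factored`) are hypothetical and the triangle is just a convenient test case.

```python
from fractions import Fraction

def edges(pts):
    'Yield (x_i, y_i, x_i+1, y_i+1) for each side, closing the polygon.'
    exact = [(Fraction(x), Fraction(y)) for x, y in pts]
    for (x0, y0), (x1, y1) in zip(exact, exact[1:] + exact[:1]):
        yield x0, y0, x1, y1

def area(pts):
    return sum(x0*y1 - x1*y0 for x0, y0, x1, y1 in edges(pts))/2

def xc_green(pts):
    'x_c from the sum before the leading and trailing terms cancel.'
    s = sum((x1**2 + x1*x0 + x0**2)*(y1 - y0) for x0, y0, x1, y1 in edges(pts))
    return s/(6*area(pts))

def xc_factored(pts):
    'x_c from the compact, factored sum.'
    s = sum((x1 + x0)*(x0*y1 - x1*y0) for x0, y0, x1, y1 in edges(pts))
    return s/(6*area(pts))

# Right triangle with legs of 4 and 3; its centroid is a third of the way along each leg
triangle = [(0, 0), (4, 0), (0, 3)]
print(xc_green(triangle), xc_factored(triangle))
```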

\nYou may be familiar with moments and products of inertia from dynamics, where the terms are related to the distribution of mass in a body. The moments and product of inertia we’ll be talking about here—more properly called the second moments of area—are mathematically similar and refer to the distribution of area across a planar shape.

\nThe moments and product of inertia that matter in beam bending are taken about the centroidal axes (i.e., a set of [x] and [y] axes with the origin at the centroid of the shape). Since we don’t know where the centroid is when we set up our coordinate system, our list of vertex points doesn’t work off that basis. But we can still calculate the centroidal moments and product of inertia by using these formulas:

\n[I_{xx} = \\iint\\limits_A (y - y_c)^2\\; dx dy]\n[I_{yy} = \\iint\\limits _A (x - x_c)^2 \\; dx dy]\n[I_{xy} = \\iint\\limits_A (y - y_c)\\; (x - x_c)\\; dx dy]\nWe’ll concentrate on [I_{yy}]; the other two will be similarly derived.

\nFirst, let’s expand the square inside the integral and see what we get:

\n[I_{yy} = \\iint\\limits_A x^2\\; dx dy - 2x_c \\iint\\limits_A x\\; dx dy + x_c^2 \\iint\\limits_A dx dy]\nThe integral in the second term is [A x_c] and the integral in the third term is just [A]. Putting this together, we get^{1}

\n[I_{yy} = \\iint\\limits_A x^2\\; dx dy - A x_c^2]\nSince we already have formulas for [A] and [x_c], we can concentrate on the integral in the first term on the right.

\nReturning to Green’s theorem, we’ll use [P = 0] and [Q = \\frac{1}{3}x^3], giving us

\n[\\iint\\limits_A x^2 dx dy = \\oint\\limits_C \\frac{1}{3}x^3 \\; dy]\nOnce again, we break the polygonal boundary into straight-line segments and use parametric equations to define the segments. For each segment, we’ll get the following integral:

\n[\\int_0^1 \\frac{1}{3} \\left[ x_i^3 + 3 x_i^2 (x_{i+1} - x_i)\\; t + 3 x_i (x_{i+1} - x_i)^2\\; t^2 + (x_{i+1} - x_i)^3\\; t^3 \\right] (y_{i+1} - y_i) \\; dt]\nThis integral evaluates to

\n[\\frac{1}{12} \\left[ x_{i+1}^3 + x_{i+1}^2 x_i + x_{i+1} x_i^2 + x_i^3 \\right] (y_{i+1} - y_i)]\ngiving us

\n[\\iint\\limits_A x^2 dx dy = \\frac{1}{12} \\sum_{i=0}^{n-1} \\left[ x_{i+1}^3 + x_{i+1}^2 x_i + x_{i+1} x_i^2 + x_i^3 \\right] (y_{i+1} - y_i)]\nOnce again, if we expand out the product inside the sum, we’ll find that the leading and trailing terms cancel as we work through the loop. That gives us

\n[\\frac{1}{12} \\sum_{i=0}^{n-1} -y_i x_{i+1}^3 + y_{i+1} x_{i+1}^2 x_i - y_i x_{i+1}^2 x_i + y_{i+1} x_{i+1} x_i^2 - y_i x_{i+1} x_i^2 + y_{i+1} x_i^3]\nAnd that long expression can be factored, leaving

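\n[\\iint\\limits_A x^2 dx dy = \\frac{1}{12} \\sum_{i=0}^{n-1} (x_{i+1}^2 + x_{i+1} x_i + x_i^2)\\; (x_i y_{i+1} - x_{i+1} y_i)]\nSimilar^{2} derivations give us

\n[\\iint\\limits_A y^2 dx dy = \\frac{1}{12} \\sum_{i=0}^{n-1} (y_{i+1}^2 + y_{i+1} y_i + y_i^2)\\; (x_i y_{i+1} - x_{i+1} y_i)]\n[\\iint\\limits_A xy\\; dx dy = \\frac{1}{24} \\sum_{i=0}^{n-1} (x_i y_{i+1} + 2 x_i y_i + 2 x_{i+1} y_{i+1} + x_{i+1} y_i)\\; (x_i y_{i+1} - x_{i+1} y_i)]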

These formulas, and the terms accounting for the location of the centroid, are in the function `inertia`

.

`python:\ndef inertia(pts):\n    'Moments and product of inertia about centroid.'\n\n    if pts[0] != pts[-1]:\n        pts = pts + pts[:1]\n    x = [ c[0] for c in pts ]\n    y = [ c[1] for c in pts ]\n    sxx = syy = sxy = 0\n    a = area(pts)\n    cx, cy = centroid(pts)\n    for i in range(len(pts) - 1):\n        sxx += (y[i]**2 + y[i]*y[i+1] + y[i+1]**2)*(x[i]*y[i+1] - x[i+1]*y[i])\n        syy += (x[i]**2 + x[i]*x[i+1] + x[i+1]**2)*(x[i]*y[i+1] - x[i+1]*y[i])\n        sxy += (x[i]*y[i+1] + 2*x[i]*y[i] + 2*x[i+1]*y[i+1] + x[i+1]*y[i])*(x[i]*y[i+1] - x[i+1]*y[i])\n    return sxx/12 - a*cy**2, syy/12 - a*cx**2, sxy/24 - a*cx*cy\n`
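A rectangle makes a good sanity check of the two-step process, because both answers are in every handbook: for a rectangle of width [b] and height [h] with a corner at the origin, the integral of [x^2] is [b^3 h/3], and shifting to the centroid gives [b^3 h/12]. The sketch below is a stripped-down stand-in for that part of `inertia`, not the module’s code; the function name and dimensions are hypothetical.

```python
from fractions import Fraction

def second_moment_x2(pts):
    'Integral of x^2 over the polygon, taken about the coordinate origin.'
    closed = pts + pts[:1]
    s = sum((x1**2 + x1*x0 + x0**2)*(x0*y1 - x1*y0)
            for (x0, y0), (x1, y1) in zip(closed, closed[1:]))
    return Fraction(s, 12)

# Rectangle with integer dimensions, vertices listed counterclockwise
b, h = 4, 6
rect = [(0, 0), (b, 0), (b, h), (0, h)]

about_origin = second_moment_x2(rect)           # should be b^3 h/3
a, xc = b*h, Fraction(b, 2)                     # area and centroid of the rectangle
about_centroid = about_origin - a*xc**2         # should be b^3 h/12
print(about_origin, about_centroid)
```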

\nThis older post explains the use of the moment of inertia in beam bending, but I avoided the trickier bits associated with the product of inertia and principal axes. We’ll cover them in the next post.

\n\n

\n**Update Jan 23, 2018 12:42 PM** \nThanks to Glenn Walker for finding an error in one of the formulas. They’re more annoying to me than mistakes in the text.

\n

\n

\n

- \n
- \n
Yes, this is the parallel axis theorem. ↩

\n \n - \n
Yes, the product of inertia integral is definitely more complicated if you’re going to do the derivation by hand. So don’t do it by hand. Learn SymPy and you’ll be able to zip through it. ↩

\n \n

\n


"}, {"title": "Python module for section properties", "url": "http://leancrew.com/all-this/2018/01/python-module-for-section-properties/", "author": {"name": "Dr. Drang"}, "summary": "A simple set of", "date_published": "2018-01-16T03:34:25+00:00", "id": "http://leancrew.com/all-this/2018/01/python-module-for-section-properties/", "content_html": "A lot of what I do at work involves analyzing the bending of beams, and that means using properties of the beams’ cross sections. The properties of greatest importance are the area, the location of the centroid, and the moments of inertia. Most of the time, I can just look these properties up in a handbook, as I did in this post, or combine the properties of a few well-known shapes. Recently, though, I needed the section properties of an oddball shape, and my handbooks failed me.

\nIn the past, I would open a commercial program that had a section properties module, draw in the shape, and copy out the results. But my partners and I stopped paying the license for that program several years ago, so that wasn’t an option anymore. I decided to write a Python module to do the calculations and draw the cross-section.

\nIf the cross section is a polygon, there are formulas for calculating the section properties from the coordinates of the vertices. Most of the formulas are on the aforelinked Wikipedia pages and on this very nice page from Paul Bourke of the University of Western Australia. I’ll explain how and why the formulas work in a later post; for now, we’ll just accept them. For cross sections that aren’t polygons, we can create a close approximation by fitting a series of short straight lines to any boundary curve.

\nHere’s the source code of the module, which I call `section.py`

:

`python:\n 1: import matplotlib.pyplot as plt\n 2: from math import atan2, sin, cos, sqrt, pi, degrees\n 3: \n 4: def area(pts):\n 5: 'Area of cross-section.'\n 6: \n 7: if pts[0] != pts[-1]:\n 8: pts = pts + pts[:1]\n 9: x = [ c[0] for c in pts ]\n 10: y = [ c[1] for c in pts ]\n 11: s = 0\n 12: for i in range(len(pts) - 1):\n 13: s += x[i]*y[i+1] - x[i+1]*y[i]\n 14: return s/2\n 15: \n 16: \n 17: def centroid(pts):\n 18: 'Location of centroid.'\n 19: \n 20: if pts[0] != pts[-1]:\n 21: pts = pts + pts[:1]\n 22: x = [ c[0] for c in pts ]\n 23: y = [ c[1] for c in pts ]\n 24: sx = sy = 0\n 25: a = area(pts)\n 26: for i in range(len(pts) - 1):\n 27: sx += (x[i] + x[i+1])*(x[i]*y[i+1] - x[i+1]*y[i])\n 28: sy += (y[i] + y[i+1])*(x[i]*y[i+1] - x[i+1]*y[i])\n 29: return sx/(6*a), sy/(6*a)\n 30: \n 31: \n 32: def inertia(pts):\n 33: 'Moments and product of inertia about centroid.'\n 34: \n 35: if pts[0] != pts[-1]:\n 36: pts = pts + pts[:1]\n 37: x = [ c[0] for c in pts ]\n 38: y = [ c[1] for c in pts ]\n 39: sxx = syy = sxy = 0\n 40: a = area(pts)\n 41: cx, cy = centroid(pts)\n 42: for i in range(len(pts) - 1):\n 43: sxx += (y[i]**2 + y[i]*y[i+1] + y[i+1]**2)*(x[i]*y[i+1] - x[i+1]*y[i])\n 44: syy += (x[i]**2 + x[i]*x[i+1] + x[i+1]**2)*(x[i]*y[i+1] - x[i+1]*y[i])\n 45: sxy += (x[i]*y[i+1] + 2*x[i]*y[i] + 2*x[i+1]*y[i+1] + x[i+1]*y[i])*(x[i]*y[i+1] - x[i+1]*y[i])\n 46: return sxx/12 - a*cy**2, syy/12 - a*cx**2, sxy/24 - a*cx*cy\n 47: \n 48: \n 49: def principal(Ixx, Iyy, Ixy):\n 50: 'Principal moments of inertia and orientation.'\n 51: \n 52: avg = (Ixx + Iyy)/2\n 53: diff = (Ixx - Iyy)/2 # signed\n 54: I1 = avg + sqrt(diff**2 + Ixy**2)\n 55: I2 = avg - sqrt(diff**2 + Ixy**2)\n 56: theta = atan2(-Ixy, diff)/2\n 57: return I1, I2, theta\n 58: \n 59: \n 60: def summary(pts):\n 61: 'Text summary of cross-sectional properties.'\n 62: \n 63: a = area(pts)\n 64: cx, cy = centroid(pts)\n 65: Ixx, Iyy, Ixy = inertia(pts)\n 66: I1, I2, theta = principal(Ixx, Iyy, Ixy)\n 67: summ = 
\"\"\"Area\n 68: A = {}\n 69: Centroid\n 70: cx = {}\n 71: cy = {}\n 72: Moments and product of inertia\n 73: Ixx = {}\n 74: Iyy = {}\n 75: Ixy = {}\n 76: Principal moments of inertia and direction\n 77: I1 = {}\n 78: I2 = {}\n 79: θ︎ = {}°\"\"\".format(a, cx, cy, Ixx, Iyy, Ixy, I1, I2, degrees(theta))\n 80: return summ\n 81: \n 82: \n 83: def outline(pts, basename='section', format='pdf', size=(8, 8), dpi=100):\n 84: 'Draw an outline of the cross-section with centroid and principal axes.'\n 85: \n 86: if pts[0] != pts[-1]:\n 87: pts = pts + pts[:1]\n 88: x = [ c[0] for c in pts ]\n 89: y = [ c[1] for c in pts ]\n 90: \n 91: # Get the bounds of the cross-section\n 92: minx = min(x)\n 93: maxx = max(x)\n 94: miny = min(y)\n 95: maxy = max(y)\n 96: \n 97: # Whitespace border is 5% of the larger dimension\n 98: b = .05*max(maxx - minx, maxy - miny)\n 99: \n100: # Get the properties needed for the centroid and principal axes\n101: cx, cy = centroid(pts)\n102: i = inertia(pts)\n103: p = principal(*i)\n104: \n105: # Principal axes extend 10% of the minimum dimension from the centroid\n106: length = min(maxx-minx, maxy-miny)/10\n107: a1x = [cx - length*cos(p[2]), cx + length*cos(p[2])]\n108: a1y = [cy - length*sin(p[2]), cy + length*sin(p[2])]\n109: a2x = [cx - length*cos(p[2] + pi/2), cx + length*cos(p[2] + pi/2)]\n110: a2y = [cy - length*sin(p[2] + pi/2), cy + length*sin(p[2] + pi/2)]\n111: \n112: # Plot and save\n113: # Axis colors chosen from http://mkweb.bcgsc.ca/colorblind/\n114: fig, ax = plt.subplots(figsize=size)\n115: ax.plot(x, y, 'k*-', lw=2)\n116: ax.plot(a1x, a1y, '-', color='#0072B2', lw=2) # blue\n117: ax.plot(a2x, a2y, '-', color='#D55E00') # vermillion\n118: ax.plot(cx, cy, 'ko', mec='k')\n119: ax.set_aspect('equal')\n120: plt.xlim(xmin=minx-b, xmax=maxx+b)\n121: plt.ylim(ymin=miny-b, ymax=maxy+b)\n122: filename = basename + '.' + format\n123: plt.savefig(filename, format=format, dpi=dpi)\n124: plt.close()\n`

\nThe key data structure is a list of tuples,^{1} which represent all of the vertices of the polygon. Each tuple is a pair of (x, y) coordinates for a vertex, and the list must be arranged so the vertices are in consecutive counterclockwise order. This ordering is the result of Green’s theorem, which is the source of the formulas.^{2}
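The direction of a vertex list is easy to test, because the sign of the area sum depends on it: with x to the right and y up, a counterclockwise list gives a positive area and a clockwise list gives the same number with a negative sign. Here’s a minimal sketch of a check that could be run before handing a list to the module. The functions `signed_area` and `counterclockwise` are hypothetical helpers, not part of `section.py`.

```python
def signed_area(pts):
    'Area sum: positive for counterclockwise vertex order, negative for clockwise.'
    closed = pts + pts[:1]
    return sum(x0*y1 - x1*y0
               for (x0, y0), (x1, y1) in zip(closed, closed[1:]))/2

def counterclockwise(pts):
    'Return the vertices in counterclockwise order, reversing them if necessary.'
    return pts if signed_area(pts) > 0 else pts[::-1]

# Unit square listed counterclockwise, then the same square reversed
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(signed_area(square), signed_area(square[::-1]))
```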

Here’s a brief example of using the module:

\n`python:\n1: #!/usr/bin/env python\n2: \n3: from section import summary, outline\n4: \n5: shape = [(0, 0), (5, 0), (5, 1), (3.125, 1), (2.125, 3), (0.875, 3), (1.875, 1), (0, 1)]\n6: print(summary(shape))\n7: outline(shape, 'skewed', format='png', size=(8, 6))\n`

\nLine 5 defines the vertices of the shape. The printed output from Line 6 is

\n`Area\n A = 7.5\nCentroid\n cx = 2.3333333333333335\n cy = 1.0\nMoments and product of inertia\n Ixx = 5.0\n Iyy = 11.367187499999993\n Ixy = -1.666666666666666\nPrincipal moments of inertia and direction\n I1 = 11.77706657483349\n I2 = 4.590120925166502\n θ︎ = 76.18358042418826°\n`

\nand the PNG file created from Line 7, named `skewed.png`

, looks like this

As you might expect, the x-axis is horizontal and the y-axis is vertical. In addition to the shape itself, the `outline`

function also plots the centroid as a black dot and the orientation of the principal axes. The major axis is the thicker bluish line and the minor axis is the thinner reddish line.

The `outline`

function is the most interesting, in that it isn’t just the transliteration of a formula into Python. Lines 92–95 extract the extreme x and y values, and Line 98 calculates the size of a whitespace border (5% of the larger dimension) to keep the frame of the plot a reasonable distance away from the shape. This also makes it easy to crop the drawing to omit the frame. The ends of the principal axes are calculated in Lines 106–110 to make their lengths 20% of the smaller dimension; the idea is to make them long enough to see but not so long as to be distracting.

As noted in the comments, I chose the axis colors from a colorblind-safe palette given by Martin Krzywinski on this page. He got the palette from a paper by Bang Wong that I didn’t feel like paying $59 for (my scholarship has its limits). To better emphasize which is the major principal axis, I made it thicker.

\nMechanical and civil engineers learn how to calculate section properties early on in their undergraduate curriculum, so it’s not a particularly difficult topic, but there is a surprising depth to it. Enough depth that I plan to milk it for three more posts, which I’ll link to here when they’re done.

\n\n

\n

\n

- \n
- \n
Strictly speaking, any data structure that indexes like a list of tuples—a list of lists, for example—would work just as well, but because coordinates are usually given as parenthesized pairs, a list of tuples seemed the most natural. ↩

\n \n - \n
As promised, we’ll get to the derivation of the formulas in a later post, but if you want a taste, here’s a good derivation of the area formula by apnorton. ↩

\n \n

\n


"}, {"title": "A small hint of big data", "url": "http://leancrew.com/all-this/2018/01/a-hint-of-big-data/", "author": {"name": "Dr. Drang"}, "summary": "Handling data files much bigger than I’m used to.", "date_published": "2018-01-06T16:59:21+00:00", "id": "http://leancrew.com/all-this/2018/01/a-hint-of-big-data/", "content_html": "Shortly before Christmas, I got a few gigabytes of test data from a client and had to make sense of it. The first step was being able to read it.

The data came from a series of sensors installed in some equipment manufactured by the client but owned by one of its customers. It was the customer who had collected the data, and precise information about it was limited at best. Basically, all I knew going in was that I had a handful of very large files, most of them about half a gigabyte, and that they were almost certainly text files of some sort.

One of the files was much smaller than the others, only about 50 MB. I decided to start there and opened it in BBEdit, which took a little time to suck it all in but handled it flawlessly. Scrolling through it, I learned that the first several dozen lines described the data that was being collected and the units of that data. At the end of the header section was a line with just the string

`[data]`

and after that came line after line of numbers. Each line was about 250 characters long and used DOS-style CRLF line endings. All the fields were numeric and were separated by single spaces. The timestamp field for each data record looked like a floating point number, but after some review, I came to understand that it was an encoding of the clock time in `hhmmss.ssss` format. This also explained why the files were so big: the records were 0.002 seconds apart, meaning the data had been collected at 500 Hz, much faster than was necessary for the type of information being gathered.

Anyway, despite its excessive volume, the data seemed pretty straightforward, a simple format that I could do a little editing of to get it into shape for importing into Pandas. So I confidently right-clicked one of the larger files to open it in BBEdit, figuring I’d see the same thing. But BBEdit wouldn’t open it.

As the computer I was using has 32 GB of RAM, physical memory didn’t seem like the cause of this error. I had never before run into a text file that BBEdit couldn’t handle, but then I’d never tried to open a 500+ MB file before. I don’t blame BBEdit for the failure—data files like this aren’t what it was designed to edit—but it was surprising. I had to come up with Plan B.

Plan B started with running `head -100` on the files to make sure they were all formatted the same way. I learned that although the lengths of the header sections were different, the files all contained the same type of data and used the same space-separated format for the data itself. Also, in each file the header and data were separated by a `[data]` line.

The next step was stripping out the header lines and transforming the data into CSV format. Pandas can certainly read space-separated data, but I figured that as long as I had to do some editing of the files, I might as well put them into a form that lots of software can read. I considered using a pipeline of standard Unix utilities and maybe Perl to do the transformation, but settled on writing a Python script. Even though such a script was likely to be longer than the equivalent pipeline, my familiarity with Python would make it easier to write.

Here’s the script:

```python
 1:  #!/usr/bin/env python
 2:  
 3:  import sys
 4:  
 5:  f = open(sys.argv[1], 'r')
 6:  for line in f:
 7:    if line.rstrip() == '[data]':
 8:      break
 9:  
10:  print 'sats,time,lat,long,velocity,heading,height,vert_vel,eng_outP,valv_inP,valv_outP,supplyP,event1,T1,flow1,flow2,eng_outT,valv_inT,supplyT,valv_outT,eng_rpm'
11:  
12:  for line in f:
13:    print line.rstrip().replace(' ', ',')
```

(You can see from the `print` commands that this was done back before I switched to Python 3.)

The script, `data2csv`, was run from the command line like this for each data file in turn:

`data2csv file01.dat > file01.csv`

The script takes advantage of the way Python iterates through an open file line-by-line, keeping track of where it left off. The first loop, Lines 6–8, runs through the header lines, doing nothing and then breaking out of the loop when the `[data]` line is encountered.

Line 10 prints a CSV header line of my own devising. This information was in the original file, but its field names weren’t useful, so it made more sense for me to create my own.

Finally, the loop in Lines 12–13 picks up the file iteration where the previous loop left off and runs through to the end of the file, stripping off the DOS-style line endings and replacing the spaces with commas before printing each line in turn.
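For comparison, here’s a rough Python 3 sketch of the same two-loop idea, written as a function so it can be tried on an in-memory sample. The function name `data_to_csv`, the three-column header, and the sample lines are all hypothetical; this is an illustration of the resume-where-you-left-off behavior, not a replacement for the original script.

```python
import io

def data_to_csv(lines, out, header):
    """Skip everything through the '[data]' line, then write the rest as CSV."""
    it = iter(lines)  # for an open file, this is the file object itself

    # First loop: consume header lines until the '[data]' marker.
    for line in it:
        if line.rstrip() == '[data]':
            break

    out.write(header + '\n')

    # Second loop: resumes right after the marker, because the iterator
    # remembers how far the first loop got.
    for line in it:
        out.write(line.rstrip().replace(' ', ',') + '\n')

# Hypothetical three-column sample with DOS-style line endings.
sample = io.StringIO('units: psi\r\n[data]\r\n1 2.5 3\r\n4 5.5 6\r\n')
result = io.StringIO()
data_to_csv(sample, result, 'a,b,c')
print(result.getvalue(), end='')
```

With a real file, the same function could be called with an open file and `sys.stdout` in place of the two `StringIO` objects. The explicit `iter()` call only matters when the input is something like a list, where a second `for` loop would otherwise start over from the top.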

Even on my old 2012 iMac, `data2csv` took less than five seconds to process each of the large files, generating CSV files with over two million lines.
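The line count is in keeping with the file sizes, as a back-of-the-envelope check shows. The figures below are round-number assumptions taken from the descriptions above (a 500 MB file, 250 characters per line plus the two-byte CRLF), so the results are only rough estimates.

```python
FILE_BYTES = 500_000_000   # "about half a gigabyte" (assumed round figure)
LINE_BYTES = 250 + 2       # ~250 characters plus the CRLF pair (assumption)
SAMPLE_INTERVAL = 0.002    # seconds between records, i.e. 500 Hz

# Ignore the header, which is tiny compared to the data.
records = FILE_BYTES // LINE_BYTES
duration_min = records * SAMPLE_INTERVAL / 60

print(f'{records:,} records, roughly {duration_min:.0f} minutes of data')
```

A file somewhat over 500 MB would push the count past two million, and at 500 Hz that works out to a little over an hour of recording per file.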

I realize my paltry half-gigabyte files don’t really qualify as big data, but they were big to me. I’m usually not foolish enough to run high frequency data collection processes on low frequency equipment for long periods of time. Since the usual definition of big data is something like “too voluminous for traditional software to handle,” and my traditional software is BBEdit, this data set fit the definition for me.


"}], "home_page_url": "http://leancrew.com/all-this/", "version": "https://jsonfeed.org/version/1", "icon": "http://leancrew.com/all-this/resources/snowman-200.png"}