E-Scribe : a programmer’s blog

About Me

I'm Paul Bissex. I build web applications using open source software, especially Django. Started my career doing graphic design for newspapers and magazines in the '90s. Then wrote tech commentary and reviews for Wired, Salon, Chicago Tribune, and others you never heard of. Then I built operations software at a photography school. Then I helped big media serve 40 million pages a day. Then I worked on a translation services API doing millions of dollars of business. Now I'm building the core platform of a global startup accelerator. Feel free to email me.

Book

I co-wrote "Python Web Development with Django". It was the first book to cover the long-awaited Django 1.0. Published by Addison-Wesley and still in print!

Colophon

Built using Django, served with gunicorn and nginx. The database is SQLite. Hosted on a FreeBSD VPS at Johncompanies.com. Comment-spam protection by Akismet.

Stuff I Use

Bitbucket, Debian Linux, Django, Emacs, FreeBSD, Git, jQuery, LaunchBar, macOS, Markdown, Mercurial, Python, S3, SQLite, Sublime Text, xmonad

Spam Report

At least 237143 pieces of comment spam killed since 2008, mostly via Akismet.

How to port 100,000 lines of Python 2 to Python 3

TLDR: Use python-future.

The Project

Last summer I led the conversion of a 100KLOC Python 2 web application to Python 3.

The application is called "Accelerate". It is the backbone of operations at my employer, MassChallenge, a global startup accelerator, and it handles every stage of a running accelerator program.

So, it's a mission-critical app. With EOL looming for the versions we were using of both Django (1.11) and Python (2.7), we dove into the migration work about a year ago. Done in parallel with our usual work of maintenance, bugfixes, and enhancements, it took about three months.

After looking into How People Are Doing This, I settled on the python-future project and its futurize tool. It was a good fit for us because we did not want the disruption of a single Big Rewrite project. We couldn't stop the world for a rewrite, and the longer a branch lives, the more merge-conflict hassles you will have. Futurize can get you to Python 3 by way of an intermediate state that still runs on Python 2.

This allowed us to do a couple big merges along the way, from the Python 3 branch into the main line. Much less disruptive and conflict-ridden than trying to do one big merge at the end.

Stage 1

First, we did what they call "Stage 1 conversion", which tackles things that will break in Python 3 but can be easily converted to a 2/3 friendly form. As the docs say, “the goal for this stage is to create most of the diff for the entire porting process, but without introducing any bugs.”

So, after Stage 1 your application still doesn't run cleanly under Python 3, but basic syntax and library issues are taken care of, and nothing is broken for Python 2.
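To give a flavor of Stage 1, here is a small sketch of the kind of mechanical rewrite involved. The function and data are invented for illustration, and the "Before" comments show typical Python 2 idioms, not code from our application.

```python
# Stage 1 style code: the same file runs on Python 2.7 and Python 3.
# (Hypothetical example; all names are made up for illustration.)
from __future__ import print_function


def describe(settings, key):
    # Before: if settings.has_key(key):
    if key in settings:
        # Before: print "found", key
        print("found", key)
        return settings[key]
    try:
        # Before: raise KeyError, key
        raise KeyError(key)
    # Before: except KeyError, e:
    except KeyError as e:
        return "missing: %s" % e.args[0]


result_found = describe({"mode": "fast"}, "mode")
result_missing = describe({"mode": "fast"}, "speed")
```

None of these changes alter behavior on Python 2, which is what makes this stage safe to merge early.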

Stage 2

Then we did Stage 2; the end result of that is “Python 3-style code that [also] runs on Python 2 with the help of the appropriate builtins and utilities in future.”

The most interesting bit there is builtins, which contains rewritten versions of 18 builtins that provide Python 3 semantics. These are: ascii, bytes, chr, dict, filter, hex, input, int, map, next, oct, open, pow, range, round, str, super, zip
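A minimal sketch of what those imports buy you. On Python 2 with the future package installed, the builtins module is supplied by future; on Python 3 the same import line resolves to the standard library, so the snippet should run as-is on either. The values are made up for illustration.

```python
from builtins import bytes, range, round, str, zip

# bytes and str are distinct types, as on Python 3.
data = bytes(b"abc")
first_byte = data[0]  # indexing bytes yields an int, not a 1-char string
text_is_bytes = isinstance(str("abc"), bytes)

# range is lazy rather than building a full list in memory.
big = range(10 ** 9)
big_len = len(big)

# zip returns an iterator, not a list.
pairs = zip("ab", [1, 2])
pairs_is_list = isinstance(pairs, list)
pairs_list = list(pairs)

# round uses round-half-to-even and returns an int for one argument.
rounded = round(2.5)
```

The point of the explicit import is that Python 3 behavior is the default everywhere in the module, so the code you write during the transition is already the code you want to keep.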

Stage 3

Then came Stage 3, the longest and most challenging (several weeks of work for me), summarized in a single sentence in the futurize docs:

After running futurize, we recommend first running your tests on Python 3 and making further code changes until they pass on Python 3.

Ah. A mere matter of programming.

Fixing tests

A prerequisite for a successful effort of this type, in my book, is excellent test coverage. We were at about 95% line coverage at the beginning of this work. The beginning of stage 3 is basically watching your test suite explode.

I had to touch about 25% of our 2400 unit tests to complete that work. Most of the issues were around string handling. A lot of our tests were checking for string (str) values in Django HttpResponse.content — which under Python 3 is a bytestring (bytes). So, under Python 3 a lot of those tests just threw TypeError. In almost all those cases, the fix was simply to use the Django test framework's assertContains(response, text) method, which reconciles str and bytes pretty seamlessly.
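The failure mode is easy to reproduce without Django. In the sketch below, FakeResponse is a hypothetical stand-in for HttpResponse (only the content and charset attributes are modeled), and contains_text is a simplified illustration of the decode-then-compare idea, not Django's actual implementation of assertContains.

```python
class FakeResponse(object):
    """Hypothetical stand-in for django.http.HttpResponse."""

    def __init__(self, body, charset="utf-8"):
        self.charset = charset
        # Like HttpResponse, keep the body as bytes.
        self.content = body.encode(charset)


def contains_text(response, text):
    """Decode the bytes body before comparing it with a str."""
    return text in response.content.decode(response.charset)


response = FakeResponse("<h1>Welcome, Pat</h1>")

# The Python 2 era test idiom: a str checked against bytes content.
# On Python 3 this raises TypeError instead of returning True or False.
try:
    "Welcome" in response.content
    naive_check_failed = False
except TypeError:
    naive_check_failed = True

decoded_check = contains_text(response, "Welcome")
```

In a real Django test the equivalent call is self.assertContains(response, "Welcome"). Note that assertContains also checks the response status code (200 by default), so it is slightly stricter than a bare substring test.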

Dependencies

After fixing up the test suite, we did exhaustive manual QA and caught a few things that needed fixing. A significant part of the Stage 3 effort, and one that is surprisingly little discussed, was dependencies, which can be a major pain point in this process. While most of our many dependencies worked fine with Python 3 when updated to their latest version, many did not.

We had to 1) find substitutes, 2) rework our application to let us drop the problematic dependency (my favorite), or 3) patch for compatibility.


Spreading the word

Python 2 is now officially EOL, but of course there is lots of Python 2 code running in production out there. I suspect the know-how for this kind of conversion will be relevant for many years. I gave talks on this at Django Boston last fall, and NERD Summit this spring.

The thumbnail here links to the video recording of the more recent of those two talks. (My "five minute" intro extends to 9:10; skip to there if you want to just dive into the meat of the talk.)

Footnote: After we completed the work I describe in this post, we moved on to the Django upgrade, settling on version 2.2, which was the newest LTS at the time. The Python 3 conversion taught us a lot about managing the Django upgrade, and it went very smoothly.


Wednesday, April 15th, 2020

GatsbyJS

This past week I started playing with GatsbyJS, a static site generator and framework centered around React.

Today I successfully used it to generate a static version of this blog (I'm in the process of selecting the static site tool that will replace my vintage 2008 Django-based engine).

The componentization that React brings isn't much of a win for me here, i.e. I'm not likely to be building components for my blog that I reuse elsewhere.

It's definitely more than just a static site generation tool (hence the 400MB of Node modules for a "starter blog" development environment!), but as a way to develop apps that don't need a dedicated back-end I can see the appeal.

Sunday, January 19th, 2020

261-character git one-liner of the day

I wanted to have a quick way to see what the other team members are doing, and after pillaging a half-dozen SO posts this is what I came up with.

git branch -va --sort=-committerdate --format='%(HEAD) %(color:yellow)%(refname:strip=-1)%(color:reset) - %(color:red)%(objectname:short)%(color:reset) - %(contents:subject) - %(authorname) (%(color:green)%(committerdate:relative)%(color:reset))' --color=always

I added it to my git aliases as recent-commits. Sample output:

AC-6929 - e0c57582a - Added sorting to event list - Pat (2 hours ago)
AC-7054 - 0a9c84222 - Updated 'No mentor selected' option - Sam (5 hours ago)
AC-7053 - 337ef4071 - Removed duplicate values in formset - Charlie (6 hours ago)

Wednesday, September 4th, 2019

How things get better after you screw up at work

(Hint: it's about your team.)

A couple weeks ago I accidentally replaced our live, production database with a 17-hour-old snapshot.

This is an always-on application with users around the globe, so the mistake was likely to have blown away some new user-entered data.

I didn't realize what I had done for an hour or so (I thought I had targeted a test database server, not production). When it hit me, I had already left work. Here are the steps of how we handled it, with an emphasis on the “good engineering team culture” aspect:

  1. I immediately shared my realization of the crisis with the team. I did not try to fix it myself, or pretend I didn't know what had happened. I was able to do this because I knew the team had my back.

  2. Available team members immediately dove into confirming, assessing, and mitigating the problem. (Since I was in transit I was not yet able to get on a computer.) Focus was on minimizing pain for our users and the business, not on blame, resentment, or face-saving.

  3. User monitoring tools used by our UX person gave us critical info on which users had potentially lost data. We shared knowledge.

  4. We didn't think we had a more recent backup than the snapshot I had used — but one of the engineers had been making more frequent snapshots as part of a new project. He had been done with work for hours (he's in a different time zone), but when he saw the chatter on Slack he jumped in to help. He didn't say, “not my job.”

  5. After we had reached a stable state, people signed off, but I stayed on to double-check things and write up a summary to broadcast to the team. Communication is key.

  6. The next day, we scheduled a postmortem meeting to discuss the incident. This is a standard practice that's very important for building teams that can learn and grow from mistakes. It's “blameless” — the focus is on what happened, how we responded, what the business impact was, and what we can do to reduce the chance of recurrence. An important part of prevention is making measures more concrete and realistic than “try not to make that mistake.” In the end we lost only about 90 minutes of database history, and accounted for all user data added in that period.

I made a bad mistake, the team rose to the occasion, we were lucky to have good mitigation options, and we are making changes to reduce the chance of the mistake happening again. Win.

Sunday, October 7th, 2018

XFCE Good

After a couple years of mostly using XMonad on my Linux machines instead of a standard Desktop Environment, I'm coming around to using XFCE. I've always liked it; it's been my installed "fallback" DE (for when you need the damned settings dialog for some thing or other). Now it's becoming my primary.

I like the low resource use. I don't hate Unity and Gnome Shell but they are too much for my older machines.

But the little thing that is making the most difference is its good, standard, keyboard-driven features for launching and manipulating windows.

E.g.

Sure, it's not XMonad, but it lets me get stuff done and doesn't require any custom setup.

Thursday, May 31st, 2018