E-Scribe : a programmer’s blog

About Me

PBX I'm Paul Bissex. I build web applications using open source software, especially Django. Started my career doing graphic design for newspapers and magazines in the '90s. Then wrote tech commentary and reviews for Wired, Salon, Chicago Tribune, and others you never heard of. Then I built operations software at a photography school. Then I helped big media serve 40 million pages a day. Then I worked on a translation services API doing millions of dollars of business. Now I'm building the core platform of a global startup accelerator. Feel free to email me.

Book

I co-wrote "Python Web Development with Django". It was the first book to cover the long-awaited Django 1.0. Published by Addison-Wesley and still in print!

Colophon

Built using Django, served with gunicorn and nginx. The database is SQLite. Hosted on a FreeBSD VPS at Johncompanies.com. Comment-spam protection by Akismet.

Pile o'Tags

Stuff I Use

bitbucket, Django, Emacs, FreeBSD, Git, jQuery, LaunchBar, Markdown, Mercurial, OS X, Python, Review Board, S3, SQLite, Sublime Text, Ubuntu Linux

Spam Report

At least 236429 pieces of comment spam killed since 2008, mostly via Akismet.

Random crufty open source release of the day

Last year a client asked for help moving his website to a new host from XO.com. The tricky part was that his 200 pages of content were locked into an obsolescent proprietary tool called "Site Builder" that offered no exporting options. The file format was a flatfile that looked like this:

#Page-Type "html"
#UID "1000"
#Access-PublicRead "on"
#Access-PublicWrite "off"
#Page-Links-Style "links_outline.nhtml"
#addbrs "off"
#hidenav "off"
#HTML ...

The file structure went like this: a parent directory named nss-objects; child directories bearing page names (or slugs, really); and inside each, an empty directory named !data and a text file named !object with the page content as described above. Weird. (I suppose every proprietary one-off system is weird in its own way, so there's nothing to be gained from dwelling on the specifics, but at least including them in the post raises the chances that somebody who actually needs this thing and searches for it will find it.)

I've since handed the job and the code off to someone else, and decided to release the script. It's fairly simple, just a couple hundred lines of Python. Since I wrote it for my own use on my Mac, it uses EasyDialogs; if you're on Windows, you can try the Windows port, and otherwise you're on your own (I'd just convert the EasyDialogs calls to output text in the shell).

You feed the script your nss-objects directory, and it does the rest. It can output either a pile of HTML pages (based on a simple template inside the script), or SQL code for ingesting into your favorite relational database.

I'd love to hear from anyone who ends up using this -- it's incredibly obscure, but if you need it, you really need it.

Source: sitebuilder-extract.py

Saturday, April 15th, 2006
+ +

Comments are closed for this post. But I welcome questions/comments via email or Twitter.