E-Scribe News : a programmer’s blog

About Me

PBX My name is Paul Bissex, and e-scribe.com is my consulting business. I build web applications using as much open source software as possible. From September to June I teach web design and other important non-photographic professional skills to photographers. In the '90s I wrote technology commentary and reviews for magazines, newspapers, and web publications, including Wired, Salon.com, FamilyPC, the late lamented Web Review, and the Chicago Tribune. Feel free to email me.

Book Project

I'm co-authoring a book, "Python Web Development with Django", with Jeff Forcier and Wesley Chun. It will be published by Prentice Hall in July 2008, but is available for pre-ordering on Amazon now.

Colophon

This site is built on a fresh trunk checkout of Django, running on Python 2.5.1, served by Apache and mod_python. The database is SQLite. The operating system is FreeBSD, on a VPS hosted at Johncompanies.com. Comment-spam protection by Akismet. Vintage topo imagery from the Maptech archive.

Pile o'Tags

Stuff I Use

Akismet, del.icio.us, Django, dpaste.com, Emacs, FreeBSD, Freenode, jQuery, LaunchBar, MacPorts, Markdown, Mercurial, OS X, Postfix, Python, SQLite, Subversion, TextMate, Trac, Ubuntu Linux, wmii

A Django site.
(Finally!)

Copyright 2008
by Paul Bissex
and E-Scribe New Media

Random crufty open source release of the day

Last year a client asked for help moving his website to a new host from XO.com. The tricky part was that his 200 pages of content were locked into an obsolescent proprietary tool called "Site Builder" that offered no exporting options. The file format was a flatfile that looked like this:

#Page-Type "html"
#UID "1000"
#Access-PublicRead "on"
#Access-PublicWrite "off"
#Page-Links-Style "links_outline.nhtml"
#addbrs "off"
#hidenav "off"
#HTML ...

The file structure went like this: a parent directory named nss-objects; child directories bearing page names (or slugs, really); and inside each, an empty directory named !data and a text file named !object with the page content as described above. Weird. (I suppose every proprietary one-off system is weird in its own way, so there's nothing to be gained from dwelling on the specifics, but at least including them in the post raises the chances that somebody who actually needs this thing and searches for it will find it.)

I've since handed the job and the code off to someone else, and decided to release the script. It's fairly simple, just a couple hundred lines of Python. Since I wrote it for my own use on my Mac, it uses EasyDialogs; if you're on Windows, you can try the Windows port, and otherwise you're on your own (I'd just convert the EasyDialogs calls to output text in the shell).

You feed the script your nss-objects directory, and it does the rest. It can output either a pile of HTML pages (based on a simple template inside the script), or SQL code for ingesting into your favorite relational database.

I'd love to hear from anyone who ends up using this -- it's incredibly obscure, but if you need it, you really need it.

Source: sitebuilder-extract.py

Saturday, April 15th, 2006
+ +

Post a comment

Comments use Markdown syntax. Your comment will not appear until approved, which may take a few hours or more. Spammers will be torpedoed.


(Will not be shared)

(Optional)