Random crufty open source release of the day

Last year a client asked for help moving his website from XO.com to a new host. The tricky part was that his 200 pages of content were locked into an obsolescent proprietary tool called “Site Builder” that offered no export options. Each page was stored as a flat file that looked like this:

#Page-Type "html"
#UID "1000"
#Access-PublicRead "on"
#Access-PublicWrite "off"
#Page-Links-Style "links_outline.nhtml"
#addbrs "off"
#hidenav "off"
#HTML ...

The file structure went like this: a parent directory named nss-objects; child directories bearing page names (or slugs, really); and inside each, an empty directory named !data and a text file named !object with the page content as described above. Weird. (I suppose every proprietary one-off system is weird in its own way, so there’s nothing to be gained from dwelling on the specifics, but at least including them in the post raises the chances that somebody who actually needs this thing and searches for it will find it.)
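For anyone who wants to poke at this format themselves, here is a rough sketch of how the layout above could be walked and parsed. It is not the released script. The names `parse_object` and `iter_pages` are made up for illustration, and it assumes that everything following the `#HTML` marker is the page body, which the truncated sample above does not confirm.

```python
import os
import re

# Matches header lines such as: #Page-Type "html"
HEADER_RE = re.compile(r'^#([\w-]+)\s+"(.*)"\s*$')


def parse_object(text):
    """Split the text of an !object file into (headers, body).

    Assumption: the body is whatever follows the #HTML marker, both on
    the same line and on all later lines.
    """
    headers = {}
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("#HTML"):
            first = line[len("#HTML"):].strip()
            rest = lines[i + 1:]
            body = "\n".join(([first] if first else []) + rest)
            return headers, body
        match = HEADER_RE.match(line)
        if match:
            headers[match.group(1)] = match.group(2)
    # No #HTML marker found: return the headers with an empty body.
    return headers, ""


def iter_pages(root):
    """Yield (slug, headers, body) for each page directory under root."""
    for slug in sorted(os.listdir(root)):
        path = os.path.join(root, slug, "!object")
        if os.path.isfile(path):
            # The real encoding is unknown; undecodable bytes are replaced.
            with open(path, encoding="utf-8", errors="replace") as f:
                headers, body = parse_object(f.read())
            yield slug, headers, body
```

Calling `list(iter_pages("nss-objects"))` would then give one tuple per page, keyed by the directory name.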

I’ve since handed the job and the code off to someone else, and decided to release the script. It’s fairly simple, just a couple hundred lines of Python. Since I wrote it for my own use on my Mac, it uses EasyDialogs; if you’re on Windows, you can try the Windows port, and otherwise you’re on your own (I’d just convert the EasyDialogs calls to output text in the shell).
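If you go the plain-shell route, something like the following could stand in for the dialogs. This is only a sketch: the argument names (`source`, `--format`, `--out`) are my own invention for illustration and do not correspond to anything in the released script.

```python
import argparse


def parse_args(argv=None):
    """Collect from the command line what the dialogs would have asked for."""
    parser = argparse.ArgumentParser(
        description="Extract pages from a Site Builder nss-objects directory."
    )
    # Hypothetical options, not taken from the original script.
    parser.add_argument("source", help="path to the nss-objects directory")
    parser.add_argument("--format", choices=["html", "sql"], default="html",
                        help="kind of output to produce")
    parser.add_argument("--out", default="output",
                        help="directory (html) or file (sql) to write to")
    return parser.parse_args(argv)
```

Passing `argv=None` makes `argparse` read the real command line, while passing a list makes the function easy to try out from an interactive session.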

You feed the script your nss-objects directory, and it does the rest. It can output either a pile of HTML pages (based on a simple template inside the script), or SQL code for ingesting into your favorite relational database.
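To give a feel for the two output modes, here is a minimal sketch of each. The template, the function names, and the `pages` table with its `slug` and `body` columns are all assumptions made for illustration; the actual script's template and schema may look nothing like this.

```python
import html
from string import Template

# A stand-in template; the real script carries its own.
PAGE_TEMPLATE = Template(
    "<html>\n<head><title>$title</title></head>\n"
    "<body>\n$body\n</body>\n</html>\n"
)


def render_html(slug, body):
    """Wrap an extracted page body in the template, using the slug as title."""
    return PAGE_TEMPLATE.substitute(title=html.escape(slug), body=body)


def sql_quote(value):
    """Quote a string as a standard SQL literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def render_sql(slug, body, table="pages"):
    """Build one INSERT statement for a page (hypothetical table layout)."""
    return "INSERT INTO {} (slug, body) VALUES ({}, {});".format(
        table, sql_quote(slug), sql_quote(body)
    )
```

Doubling single quotes covers standard SQL string literals, but some databases also treat backslashes specially, so if you are loading straight into a database from Python, parameterized queries through the database driver are the safer choice.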

I’d love to hear from anyone who ends up using this – it’s incredibly obscure, but if you need it, you really need it.

Source: sitebuilder-extract.py


