E-Scribe News : a programmer’s blog

About Me

My name is Paul Bissex, and e-scribe.com is my consulting business. I build web applications using as much open source software as possible. From September to June I teach web design and other important non-photographic professional skills to photographers. In the '90s I wrote technology commentary and reviews for magazines, newspapers, and web publications, including Wired, Salon.com, FamilyPC, the late lamented Web Review, and the Chicago Tribune. Feel free to email me.

Book Project

I'm co-authoring a book, "Python Web Development with Django", with Jeff Forcier and Wesley Chun. It will be published by Prentice Hall in July 2008, but is available for pre-ordering on Amazon now.

Colophon

This site is built on a fresh trunk checkout of Django, running on Python 2.5.1, served by Apache and mod_python. The database is SQLite. The operating system is FreeBSD, on a VPS hosted at Johncompanies.com. Comment-spam protection by Akismet. Vintage topo imagery from the Maptech archive.

Stuff I Use

Akismet, del.icio.us, Django, dpaste.com, Emacs, FreeBSD, Freenode, jQuery, LaunchBar, MacPorts, Markdown, Mercurial, OS X, Postfix, Python, SQLite, Subversion, TextMate, Trac, Ubuntu Linux, wmii

A Django site.
(Finally!)

Copyright 2008
by Paul Bissex
and E-Scribe New Media

A Mercurial mirror of Django's Subversion repository

Just wanted to post a quick note that I'm now publishing an experimental Mercurial mirror of the Django source code repository, including all tags and branches and even the djangoproject.com website source itself. Tom Tobin at The Onion has been maintaining a similar mirror of Django trunk for a while (and very helpfully answered some of my questions in IRC), but I wanted to do the whole tree.

It's an experiment, which is to say it might go away at any time, so be warned! And let me know if you think it's useful. Components in the chain include svnsync, hgwebdir, Apache mod_cache, and even Pygments for source code colorizing. It's updated once per hour.

My earlier experiments along these lines used hgsvn, which is cool because it tags each commit with the corresponding svn rev number. Unfortunately, hgsvn basically ground to a halt while trying to digest all 7000 revs, so I switched to Mercurial's built-in "convert" command.

Mercurial doesn't yet support partial-tree cloning, so if you want your own copy you're going to be fetching the whole thing! It takes up about 350MB, which isn't bad considering it includes all 7000+ changesets.

Have fun!

Update: I've replaced the monolithic repo with individual repos for the most active branches: trunk, gis, newforms-admin, and queryset-refactor. This is less exhaustive but more useful, since these individual repos are much smaller. A Mercurial clone of trunk is about 40MB -- about the same size as a Subversion checkout of same, but containing a full revision history!

Wednesday, February 6th, 2008
10 comments

Comment from masklinn, later that day

That's really cool. I was using git-svn to track Django, but I may switch to that instead.

Comment from Thomas Capricelli, later that day

Hi Paul, thanks for this mirror. I was actually looking for one, and I was not aware of the other one you mention. That said, I find it weird to have everything in the same repository. As you say, it's not possible to check out a partial tree with Mercurial, so it seems natural to split it into several smaller repositories. Can't your 'svn import script' do that? Svn can do partial checkouts and updates.

Comment from Nicholas Riley, later that day

I do this with hgsvn and svnsync (hg convert failed on me, last time I tried), but having multiple trees means you can't track branch merges in the same repo. I now regret doing it.

Comment from Thomas Capricelli, later that day

Of course you want to keep tags and branches, but you can still have a repository for the website and another one for the source. I successfully imported svn repositories and kept branches and tags. It was not easy though. I've tried a lot of different ways to import svn to Mercurial, and the one included in Mercurial was the best one (although I had to use the Mercurial repository, since it was not released yet).

Comment from Paul, later that day

Hi Thomas, the challenge is: what's the right point at which to break out separate repos? Going one level deeper than I have gone (i.e. separate repos for django and djangoproject.com) makes sense, but it doesn't really make the main repo much smaller, and it's still got every branch and tag ever made. Or the next level down (separate repos for branches/tags/trunk)? That would be much more satisfying to people who just want to track trunk especially, but as Nicholas points out it can break merge tracking (though I admit I haven't thought that part through).

As far as my "svn import script", there's nothing custom. I maintain a local mirror of the main Django repo using svnsync, and update the Mercurial repo via hg convert in a cronjob.
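For anyone who wants to try the same thing, here is a minimal sketch of what one update pass could look like when driven from Python. It is not the actual script behind this mirror: the paths and the function names `build_sync_commands` and `run_sync` are made up for illustration, and it assumes the local mirror was already set up once with `svnsync init`. The convert extension ships with Mercurial but is not enabled by default, so the sketch switches it on for the single invocation.

```python
import subprocess
from typing import Callable, List, Sequence

# Hypothetical locations; the post does not say where the mirror or repo live.
SVN_MIRROR_URL = "file:///var/mirrors/django-svn"
HG_REPO_PATH = "/var/hg/django"


def build_sync_commands(svn_mirror_url: str, hg_repo_path: str) -> List[List[str]]:
    """Return the two commands for one update pass, in the order they run."""
    return [
        # Step 1: pull new upstream revisions into the local svnsync mirror.
        ["svnsync", "sync", svn_mirror_url],
        # Step 2: convert revisions not yet seen into Mercurial changesets.
        # "extensions.convert=" enables the bundled convert extension
        # for this invocation only.
        ["hg", "--config", "extensions.convert=",
         "convert", svn_mirror_url, hg_repo_path],
    ]


def run_sync(
    commands: Sequence[Sequence[str]],
    runner: Callable[[Sequence[str]], object] = subprocess.check_call,
) -> None:
    """Run each command in turn; check_call raises on the first failure."""
    for command in commands:
        runner(command)
```

Calling `run_sync(build_sync_commands(SVN_MIRROR_URL, HG_REPO_PATH))` from a script that cron starts once an hour would match the update schedule described in the post. Because `hg convert` remembers which revisions it has already processed, repeat runs only pick up what is new.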

Comment from Horst Gutmann, later that day

Great, thank you :D

Just a question: How do you deploy hgwebdir? CGI, FastCGI, WSGI? Just curious.

Comment from Paul, later that day

Horst: It's plain CGI with Apache mod_cache in front. The caching is crucial, especially with the addition of Pygments rendering for every source file.

Re the earlier questions about structure, I'm playing with other arrangements, so you may see individual branch repos start showing up as well.
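To give a rough idea of how hgwebdir can serve several repos side by side, here is a small sketch that generates its config file. hgwebdir reads an INI-style file (typically named `hgweb.config`) whose `[paths]` section maps a URL name to a repository directory. The directory layout and the function name `render_hgweb_config` are assumptions for the example, not the setup actually used on this server.

```python
import configparser
import io
from typing import Dict

# Hypothetical on-disk layout: one Mercurial repo per Django branch.
BRANCH_REPOS = {
    "trunk": "/var/hg/django-trunk",
    "gis": "/var/hg/django-gis",
    "newforms-admin": "/var/hg/django-newforms-admin",
    "queryset-refactor": "/var/hg/django-queryset-refactor",
}


def render_hgweb_config(repos: Dict[str, str]) -> str:
    """Render a [paths] section mapping each URL name to its repo path."""
    config = configparser.ConfigParser()
    # Keep repo names exactly as given instead of lowercasing them.
    config.optionxform = str
    config["paths"] = repos
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()
```

Writing the result of `render_hgweb_config(BRANCH_REPOS)` to the config file next to the CGI script would publish each branch under its own name, and adding a branch later is a one-line change to the dictionary.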

Comment from Thomas Capricelli, 1 day later

Hi Paul. I see you're experimenting. There's currently one repository for one branch.

I still have the impression that it would be better to have one repository for the code and another one for the website. It really makes sense to have all branches in the same repository, and Mercurial is quite optimized to handle big projects. I'm using the kernel Mercurial repository and I've never had any problem. Where do you think the problem would be? Bandwidth for your server? For the user? Disk storage for the user?

Comment from Paul, 1 day later

My main concerns are usability, maintainability (for me), and potential bandwidth consumption. Most people really only want one or two branches, which are much much smaller when broken out separately.

I'm now looking at a third option, a single repo with selected branches (Mercurial "named branches") via hgsvn.

Comment from Tane, 6 weeks later

As a Django and Mercurial user, you might be interested in having a look at http://hg.sharesource.org/hgfront. It's an application we're developing in Django at the moment to manage your local and remote repositories.

We're happy to get feedback and accept new ideas, and we're quite close to our first public release.
