Python question: a better urlparse?

December 12, 2005

Is there a more sophisticated equivalent of urlparse.urlparse() somewhere that knows enough to break out username and password components? Ideally it would return a dict, with keys like ‘scheme’ and ‘host’ and ‘user’, instead of a tuple. Something like PHP’s parse_url().
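To make the request concrete, here is a minimal sketch of that kind of function. It assumes a later Python (3.x), where urllib.parse.urlsplit() exposes username, password, hostname and port attributes. The name parse_url and the dict keys are hypothetical, borrowed loosely from the PHP function mentioned above.

```python
from urllib.parse import urlsplit


def parse_url(url):
    """Return the components of url as a dict (hypothetical helper).

    Components that are absent come back as None (user, password, host, port)
    or as an empty string (scheme, path, query, fragment).
    """
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,  # int, or None when no port is given
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


info = parse_url("http://user:password@host:8080/path?query#fragment")
print(info["user"], info["host"], info["port"])
```

The dict is built by hand here, so the key names are a design choice rather than anything the standard library defines.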

Paul Jimenez commented on Mon Dec 12 13:09:22 2005:

Not too long ago I wrote http://mail.python.org/pipermail/python-dev/2005-November/058301.html about urlparse being broken, though I have yet to present my replacement. What kind of API do you think a better urlparse() should have? Keep in mind that a good solution should deal not only with http://user:password@host:port/path?query#fragment, but also with tel:1-234-567-8910, news:newsgroup, and news:msgid@newsgroup. I suspect the problem with a dict instead of a tuple is standardization of keys. Or maybe that’s fine. I’d be interested in your opinion.
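As a rough illustration of why those non-hierarchical examples are awkward, the sketch below runs them through urllib.parse.urlsplit() from a later Python (3.x), which is an assumption and not the replacement mentioned in this comment. With no "//" after the scheme there is no network location, so everything after the colon lands in the path.

```python
from urllib.parse import urlsplit

uris = ("tel:1-234-567-8910", "news:newsgroup", "news:msgid@newsgroup")

# Map each URI to its split result so the pieces can be inspected.
results = {uri: urlsplit(uri) for uri in uris}

for uri, parts in results.items():
    # netloc is empty for all three; the scheme-specific part is in path.
    print(uri, "->", parts.scheme, repr(parts.netloc), parts.path)
```

A dict-based API would have to decide what keys such URIs get, which is the standardization problem raised here.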

Paul commented on Mon Dec 12 18:47:49 2005:

Now that I’ve seen that thread, the solution seems pretty well hashed out. That little netlocparse() snippet does a lot of what I was looking for. Seems like Guido’s just waiting for somebody to submit a patch. Since the function is called “urlparse” and not “uriparse” I wouldn’t let utter completeness stand in the way of a decent fix.
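For readers who have not followed the link, the idea of splitting the network location by hand can be sketched roughly as follows. This is not the netlocparse() snippet from the thread. The function name split_netloc and its dict keys are hypothetical, and it assumes a plain user:password@host:port form with no IPv6 literals.

```python
def split_netloc(netloc):
    """Split 'user:password@host:port' into a dict (hypothetical helper).

    Missing parts are returned as None. IPv6 literals such as '[::1]:80'
    are not handled, and a non-numeric port raises ValueError.
    """
    user = password = port = None
    host = netloc

    if "@" in netloc:
        # Split on the last '@' so the host is whatever follows it.
        userinfo, host = netloc.rsplit("@", 1)
        user, sep, pw = userinfo.partition(":")
        password = pw if sep else None

    if ":" in host:
        host, _, port_text = host.rpartition(":")
        port = int(port_text) if port_text else None

    return {"user": user, "password": password, "host": host, "port": port}


print(split_netloc("user:password@host:8080"))
print(split_netloc("host"))
```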