E-Scribe News : a programmer’s blog

About Me

My name is Paul Bissex, and e-scribe.com is my consulting business. I build web applications using as much open source software as possible. From September to June I teach web design and other important non-photographic professional skills to photographers. In the '90s I wrote technology commentary and reviews for magazines, newspapers, and web publications, including Wired, Salon.com, FamilyPC, the late lamented Web Review, and the Chicago Tribune. Feel free to email me.

Book Project

I'm co-authoring a book, "Python Web Development with Django", with Jeff Forcier and Wesley Chun. It will be published by Prentice Hall in July 2008, but is available for pre-ordering on Amazon now.

Colophon

This site is built on a fresh trunk checkout of Django, running on Python 2.5.1, served by Apache and mod_python. The database is SQLite. The operating system is FreeBSD, on a VPS hosted at Johncompanies.com. Comment-spam protection by Akismet. Vintage topo imagery from the Maptech archive.

Stuff I Use

Akismet, del.icio.us, Django, dpaste.com, Emacs, FreeBSD, Freenode, jQuery, LaunchBar, MacPorts, Markdown, Mercurial, OS X, Postfix, Python, SQLite, Subversion, TextMate, Trac, Ubuntu Linux, wmii

A Django site.
(Finally!)

Copyright 2008
by Paul Bissex
and E-Scribe New Media

Fixing Web Spam

Over on the Technorati blog I see that there's a summit on web spam happening next week. That's good. Link farms and spam blogs have been driving me batty.

For combating the phenomenon from inside tools like Technorati, IceRocket, Feedster, Google Blog Search, and so on, I think our best bet may be collaborative reporting, similar to the Razor or Pyzor email-spam-reporting networks.

Last month Blogger, following the Craigslist model, introduced a "Flag" button at the top of every Blogspot-hosted blog, which is on the right track. But nobody except Google has access to that information. With a shared reporting system, before adding an alleged blog to my index, aggregator service, or whatever, I could query a central database to see whether that URL had already been flagged as spam by other users.

This tech is already well-proven with email spam. I run Pyzor on my mail server, and when messages come through to my spamtrap addresses, they get immediately reported so that other users can benefit from that information. Likewise I benefit from the reports made by other users.
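To make the report-and-query cycle concrete, here is a minimal Python sketch of what a shared database might look like for URLs. Everything in it is an assumption for illustration: the `SpamRegistry` class, the `url_digest` helper, and the two-report threshold are hypothetical, not Pyzor's actual protocol or API, and a real service would live behind a network interface rather than in memory.

```python
import hashlib
from urllib.parse import urlsplit


def url_digest(url):
    """Reduce a URL to a stable digest so trivially different spellings match.

    Hypothetical normalization: lowercase host plus path without a trailing
    slash. A real service would need something more careful.
    """
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    path = parts.path.rstrip("/")
    return hashlib.sha1((host + path).encode("utf-8")).hexdigest()


class SpamRegistry:
    """In-memory stand-in for the shared, central reporting database."""

    def __init__(self):
        self._reports = {}  # digest -> set of reporter ids

    def report(self, url, reporter):
        """Record that `reporter` flagged `url`; repeat reports count once."""
        self._reports.setdefault(url_digest(url), set()).add(reporter)

    def count(self, url):
        """Number of distinct reporters who have flagged `url`."""
        return len(self._reports.get(url_digest(url), ()))

    def is_flagged(self, url, threshold=2):
        """True once at least `threshold` distinct reporters agree."""
        return self.count(url) >= threshold
```

An indexer or aggregator would call `is_flagged(url)` before adding a new blog, and call `report(url, reporter)` whenever one of its own users flags something.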

Obviously the system would have to be protected against poisoning by vindictive spammers, who might be tempted to report, say, all the URLs in the Technorati Top 100.
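One possible defense, sketched below in Python, is to combine an allowlist of well-known sites with per-reporter trust weights, so that a flood of reports from unknown accounts carries little weight. This is only one approach among many, and every name and number here (`spam_score`, the allowlist, the trust values, the threshold) is a hypothetical placeholder, not part of any existing service.

```python
from urllib.parse import urlsplit


def spam_score(url, reporters, trust, allowlist, default_trust=0.25):
    """Sum the trust of the distinct reporters who flagged `url`.

    Hosts on the allowlist always score 0.0, so mass-reporting a
    well-known site has no effect. Unknown reporters get `default_trust`.
    """
    host = (urlsplit(url).hostname or "").lower()
    if host in allowlist:
        return 0.0
    return sum(trust.get(r, default_trust) for r in set(reporters))


def is_spam(url, reporters, trust, allowlist, threshold=1.0):
    """Treat a URL as spam once its weighted score reaches `threshold`."""
    return spam_score(url, reporters, trust, allowlist) >= threshold


# Hypothetical data: one established reporter and one protected host.
TRUST = {"alice": 1.0}
ALLOWLIST = {"popular.example"}
```

With these values, three throwaway accounts are not enough to flag a URL on their own, while a single established reporter is. Reports against an allowlisted host are ignored entirely.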

Lots of other people have written about this in the past few months: David Sifry of Technorati, for one. There's also the Fighting Splog blog, and services like Splogreporter.com that may eventually be used in the way I'm describing. When Google launched their new blog search tool, it was immediately criticized for being full of spam blogs. That's an astounding oversight on their part, but mostly it points up the fact that there is as yet no standard for attacking this problem. Let's create one.

Saturday, September 17th, 2005