Fixing Web Spam

Over on the Technorati blog I see that there’s a summit on web spam happening next week. That’s good. Link farms and spam blogs have been driving me batty.

For fighting the problem from inside tools like Technorati, IceRocket, Feedster, and Google Blog Search, I think our best bet may be collaborative reporting, similar to the Razor and Pyzor networks for reporting email spam.

On the model of Craigslist, last month Blogger introduced a “Flag” button at the top of the screen of all Blogspot-hosted blogs, which is on the right track. But nobody except Google has access to that information. A shared reporting system would mean that before I added an alleged blog to my index, or aggregator service, or whatever, I could query that central database to see if that URL had already been flagged as spam by other users.
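To make that lookup concrete, here's a rough sketch in Python. No such shared service exists yet, so everything here is hypothetical: the `FlagRegistry` class is an in-memory stand-in for the central database, and `normalize`, `should_index`, and the threshold of three reports are names and numbers I made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to lowercase host plus path so trivial variants match."""
    # Assume http:// when the scheme is missing, so urlsplit finds the host.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.hostname or ""  # urlsplit already lowercases the hostname
    return host + parts.path.rstrip("/")


class FlagRegistry:
    """Hypothetical in-memory stand-in for a shared spam-report database."""

    def __init__(self) -> None:
        # normalized URL -> set of reporter ids (a set ignores repeat reports)
        self._reporters = defaultdict(set)

    def report(self, url: str, reporter: str) -> None:
        """Record that `reporter` flagged `url` as spam."""
        self._reporters[normalize(url)].add(reporter)

    def flag_count(self, url: str) -> int:
        """Number of distinct reporters who have flagged `url`."""
        return len(self._reporters.get(normalize(url), ()))


def should_index(registry: FlagRegistry, url: str, threshold: int = 3) -> bool:
    """Index a URL only if fewer than `threshold` distinct users flagged it."""
    return registry.flag_count(url) < threshold
```

An aggregator would call `should_index(registry, url)` before adding a newly discovered blog. In a real deployment the registry would sit behind a network API rather than in memory, but the shape of the check would be about the same.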

This approach is already well proven with email spam. I run Pyzor on my mail server, and when messages come through to my spamtrap addresses, they get reported immediately so that other users can benefit from that information. Likewise, I benefit from the reports made by other users.

Obviously the system would have to be protected against poisoning by vindictive spammers, who might be tempted to report, say, all the URLs in the Technorati Top 100.
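One possible defense, sketched below, is to weight each report by how much the reporter is trusted and to exempt sites on a hand-maintained allowlist. This is only an assumption about how such a system might work, not a description of how Razor or Pyzor do it; the function names, the trust weights, the default weight of 0.1 for unknown reporters, and the threshold of 2.0 are all invented for the example.

```python
def spam_score(reporters, trust, default_trust=0.1):
    """Sum the trust weights of the distinct reporters who flagged a site.

    `trust` maps reporter id -> weight; unknown reporters count for very
    little, so a crowd of throwaway accounts cannot easily flag a site.
    """
    return sum(trust.get(r, default_trust) for r in set(reporters))


def is_flagged(host, reporters, trust, allowlist, threshold=2.0):
    """Treat `host` as spam only if it is not allowlisted and the weighted
    score of its reports reaches `threshold`."""
    if host in allowlist:
        return False
    return spam_score(reporters, trust) >= threshold
```

With these numbers, ten brand-new accounts reporting a popular blog add up to a score of about 1.0 and fall short of the threshold, while two established reporters with a weight of 1.5 each are enough to flag a site. How reporters would earn that trust is the hard part, and I'm leaving it open here.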

Lots of other people have written about this in the past few months: David Sifry of Technorati, for one. There’s also the Fighting Splog blog, and services like Splogreporter.com that may eventually be used in the way I’m describing. When Google launched their new blog search tool, it was immediately criticized for being full of spam blogs. That’s an astounding oversight on their part, but mostly it points up the fact that there is as yet no standard for attacking this problem. Let’s create one.


