Eran's blog

Tag Spam

One of the biggest disadvantages of using tags in your application is how easy it is to create spam tags. Since the Web became of commercial interest we’ve seen spam invade just about every space and technology, and there is no reason to assume that tag-space is not next in line to suffer from an epidemic of spam.

There is definitely evidence of tag-spam; you can see it on ice-rocket, for example (just search for tag:mesothelioma). Luckily, tag-spam is not nearly as widespread as email spam, or as search-engine spam was in the dark ages of the Web. Spamming tags, as a practice, has probably not yet reached mainstream spammers, but that’s not the only factor at work here. Most online tools today are aware of the spam problem and are taking steps to fight it, and those that do are doing pretty well.

Fortunately for the forces of Good, spamming is a numbers game (even more so with sites that sort information by date instead of relevancy). For spammers to successfully attract traffic to a site they must get high placement in the search results, and the easiest way to get this done is by creating spam in large numbers. This type of behavior leads to patterns, and patterns can be learned and employed by filters to detect spam or spam-suspect data. If businesses share this information in an open way, every filter gets more examples to learn from and the effectiveness of those tools increases dramatically.
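To make the "numbers game" point concrete, here is a minimal sketch of the simplest possible pattern filter: count how often each source posts the same tag and mark the sources that go over a limit as spam-suspect. The function name, the (source, tag) record shape and the threshold of 20 are all assumptions made for illustration; a real filter would look at many more signals.

```python
from collections import Counter

def suspect_sources(posts, max_posts_per_tag=20):
    """Return the sources that used one tag more than max_posts_per_tag times.

    posts is an iterable of (source, tag) pairs. Both fields and the
    default threshold are hypothetical, chosen only for this sketch.
    """
    # Count posts per (source, tag), ignoring case differences in the tag.
    counts = Counter((source, tag.lower()) for source, tag in posts)
    # A source becomes a suspect as soon as any one of its tags is over the limit.
    return {source for (source, _tag), n in counts.items() if n > max_posts_per_tag}

posts = [("spam.example", "mesothelioma")] * 50 + [("alice.example", "tagging")] * 3
print(suspect_sources(posts))
```

A suspect is not necessarily a spammer, so the output of something like this is better used to queue items for review than to delete them outright.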

Another tool, which is just as strong and must be present in any decent tagging application, is the community. Without a community there is no data; with a community there’s not only data but also people who care about it and will perform some gardening on it. There is no need for any one user to cover your entire database, but if you give your users tools and enough of them care about their own little niches, spam will disappear. We’ve seen this in Craigslist and in Wikipedia: the community takes care of its own.
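One simple tool you can hand to the community is a "flag as spam" button. The sketch below hides an item once enough different users have flagged it. The class name, the method names and the default of three flags are assumptions for illustration, not a description of how Craigslist or Wikipedia actually work.

```python
from collections import defaultdict

class FlagTracker:
    """Hide an item once enough distinct users have flagged it as spam."""

    def __init__(self, hide_after=3):
        # hide_after is a hypothetical threshold; tune it to your community size.
        self.hide_after = hide_after
        self._flags = defaultdict(set)  # item_id -> set of user_ids who flagged it

    def flag(self, item_id, user_id):
        # A set means repeated flags from the same user count only once.
        self._flags[item_id].add(user_id)

    def is_hidden(self, item_id):
        return len(self._flags.get(item_id, ())) >= self.hide_after

tracker = FlagTracker(hide_after=2)
tracker.flag("post-1", "alice")
tracker.flag("post-1", "alice")   # same user again, still one flag
tracker.flag("post-1", "bob")
print(tracker.is_hidden("post-1"))
```

Counting distinct users instead of raw clicks matters here, because otherwise a single spammer could use the same button to bury legitimate content.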

There will always be some part of the database that does not get proper gardening from the community; you can only expect so much from users volunteering their time. If this part is still important, you can hire people to tend to it, but I have to ask: if your community doesn’t care about that data, do you?

As with every other form of spam, this is, and always will be, an arms race. As programmers find new ways to detect spam and to empower the community to help, spammers find new ways to get around those tools. We need to decide: are tags worth fighting for? I think they are.


Filed under: Tagging

One Response - Comments are closed.

  1. Gael says:

    WordPress Trackback Spam!!!
    I have installed plugins that prevent comment spam, but they don't block trackback spam. I've been spammed by many
    MFA websites, most probably from the same network, using trackbacks, but they are not linking to me on their websites. May I
    know how they do it and how I can stop it without disabling trackbacks?
    Thanks, and I'm using WordPress.
