Eran's blog

“What about Tagging?”

Tagging is the new foo. There’s no doubt about it. Tagging seems to have overtaken even social networking as the buzzword of the moment, so I guess it’s time for me to rant a bit. This post is mostly a collection of things I posted in several other places, with some new thoughts and ideas added for variety.

This article on CNN has some interesting quotes that I think show very well why tagging works and also how it might fail.

David Sifry

“Tagging is something selfishly useful. It helps you understand and categorize something for yourself”

Stewart Butterfield

“I don’t think in the context of Flickr that there are bad tags, the point is not for you to find all of and only pictures of elephants but to give people a few extra tools to organize their own stuff.”

Both Sifry and Butterfield seem to think that tagging is a personal activity, done more for one’s own benefit than for the “greater good of humanity”. I don’t tag a bookmark to enable others to find it; I tag it into my own little catalog using my own private key so that later it’ll be easier for me to retrieve it.

There is definitely some emergent behavior born from tagging on a wide social scale, such as we see on del.icio.us. Related tags are an interesting improvement that takes tagging to the next level. By using co-citation and clustering algorithms we can guess that tag A is related to tag B, thereby creating a sort of loose structure on the global set of tags. This structure can then be applied even to the local (single user’s) set of tags to improve that user’s personal experience. The user’s local tag set might not be rich or big enough to create those connections, but she can still enjoy a pretty accurate list of related tags based on the larger set of information created by all the users in the system. Related tags, however, are a limited tool in both breadth and depth.

Related tags work because it’s possible to invert the relation between tag and URL and then use meta-information about the set of URLs to learn about the connected tags. On a system like Flickr this inversion is not easy: there is little co-citation (pictures are mostly tagged by one person), and clustering pictures by topic is a pretty difficult problem.
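To make the inversion idea concrete for the bookmark case, here is a minimal sketch in Python. It is only my guess at the general shape of such a thing, not how del.icio.us actually does it: the sample data, the `invert` and `related_tags` names, and the simple "count shared URLs" ranking are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: (user, url, tag) triples, invented for illustration.
BOOKMARKS = [
    ("alice", "http://example.com/a", "python"),
    ("alice", "http://example.com/a", "programming"),
    ("bob",   "http://example.com/a", "code"),
    ("bob",   "http://example.com/b", "python"),
    ("bob",   "http://example.com/b", "code"),
    ("carol", "http://example.com/c", "elephants"),
    ("carol", "http://example.com/c", "photos"),
]


def invert(bookmarks):
    """Turn tag -> URL assignments into a URL -> set-of-tags index."""
    tags_by_url = defaultdict(set)
    for _user, url, tag in bookmarks:
        tags_by_url[url].add(tag)
    return tags_by_url


def related_tags(tag, bookmarks, limit=5):
    """Rank other tags by how many URLs they share with `tag` (an assumed metric)."""
    counts = Counter()
    for tags in invert(bookmarks).values():
        if tag in tags:
            # Every other tag on this URL co-occurs with `tag` once.
            counts.update(tags - {tag})
    # Sort by count (descending), then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _count in ranked[:limit]]
```

With this toy data, `related_tags("python", BOOKMARKS)` gives `["code", "programming"]`. Alice never used "code" herself, yet she would still see it next to her "python" tag because Bob tagged the same page. Note that this is pure counting: nothing in it knows what any of the tags mean.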

On the other hand, even in systems that have related tags, how deep is that information? Does the system know anything about the meaning of the tags or the pages? Given a new, unseen page, can del.icio.us tell me what tags it should have? Related tags are a very shallow approximation of knowledge. They fake context but carry very little actual contextual information. We are left again with a set of personal tags that mean very little to anyone but ourselves.

So what can we do with tagging to improve on this situation? How do we take tagging even further? Coming soon: lexical analysis, tag translation, hierarchical overlays and more!!

Update: Apparently someone wrote a related tags browser for Flickr. It does a pretty good job (on some tags at least) and has a very slick interface too. Via: Ryan King.


Filed under: The Net
