Eran's blog

Re: Microformats and SIOC

John Breslin writes about the overlap between Microformats and the SIOC (Semantically-Interlinked Online Communities) project. SIOC was one of my references when expanding cite-rel and later when designing my Distributed Social Anything project so, naturally, I’m interested.

John brings up a few points where SIOC and microformats could help each other and therefore help make the Web more semantic.

In the SIOC and FOAF vocabularies, there are two properties linking people to user profiles: a Person is linked to a User using a “hasOnlineAccount” link, and a User is linked to a Person using an “account_of” relationship.

To the best of my knowledge, the way this is normally handled in the mf world is based on XFN (XHTML Friends Network) and possibly with the addition of hCard. Using XFN rel=me links one can establish an equivalence relation between different accounts to show that they belong to the same person. Additionally, one could use a rel=me link to an hCard to provide further information about the actual person (as opposed to an account on a website). This is described to some degree on the GMPG site but that description is getting somewhat out of date and should really be moved to the mf wiki.
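
Here is a rough sketch of how a consumer might group accounts using rel=me links. It assumes the pages have already been fetched; the `PAGES` dictionary, its URLs and the function names are all made up for illustration. Links are treated as undirected for simplicity, whereas a stricter consumer would probably insist on reciprocal links before trusting the claim.

```python
from html.parser import HTMLParser


class RelMeParser(HTMLParser):
    """Collects the href of every <a> or <link> whose rel value includes "me"."""

    def __init__(self):
        super().__init__()
        self.me_links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attributes = dict(attrs)
        # rel is a space-separated list, so compare whole tokens ("met" is not "me").
        rel_values = (attributes.get("rel") or "").split()
        if "me" in rel_values and attributes.get("href"):
            self.me_links.append(attributes["href"])


def rel_me_links(html):
    parser = RelMeParser()
    parser.feed(html)
    return parser.me_links


def group_identities(pages):
    """Group URLs connected by rel="me" links (treated as undirected here)."""
    parent = {}

    def find(url):
        parent.setdefault(url, url)
        while parent[url] != url:
            parent[url] = parent[parent[url]]  # path halving
            url = parent[url]
        return url

    for url, html in pages.items():
        find(url)
        for target in rel_me_links(html):
            parent[find(url)] = find(target)

    groups = {}
    for url in list(parent):
        groups.setdefault(find(url), set()).add(url)
    return list(groups.values())


# Hypothetical pages, keyed by URL.
PAGES = {
    "http://example.com/blog/": '<a rel="me" href="http://photos.example.org/eran">photos</a>',
    "http://photos.example.org/eran": '<a rel="me" href="http://example.com/blog/">blog</a>',
    "http://example.net/someone-else": '<a rel="friend met" href="http://example.com/blog/">Eran</a>',
}

IDENTITIES = group_identities(PAGES)
```

The blog and the photo account end up in one group, while the third page, which only carries a `friend met` link, stays on its own.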

…when you create your post, link back to your own post aggregation resource. From the user side, this could be achieved by creating a mf for sioc:has_container that allows linking to “Virtual Forums” from the post creator side, e.g. from the post content.

I’ve faced this problem while designing the DSA system and chose a similar solution based on an existing microformat – rel-directory. By adding ‘directory’ to the rel value of a link, the author states that the destination of the link is a container listing an entry for the current page. Using the rev attribute, one could publish the complementary relation.

From forums, a mf for has_part / part_of would allow one to link a forum or blog or any discussion channel to a larger community, e.g. this could be done from a forum description. Also, a mf for has_part / part_of would help to link a user to a community, e.g. by creating a typed link from a user signature.

As John says, this is very close to the work I’ve done with DSA. I haven’t really thought of showing relationships between whole forums and groups, but this can be done in the same way as for single posts, using rel-directory (mentioned above). For users, I wanted something with slightly more specific semantics that’s easy to integrate with XFN, so I’ve started work on XMF (XHTML Membership format).

Lastly, John touches on distributed conversations and cite-rel. There is, indeed, some correlation between cite-rel values and SIOC/IBIS relations but there’s also an important difference.

Ultimately, we need ways to say that a post is in agreement or disagreement with a previous post or even with specific parts of a previous post (see picture below). Also, we may need to describe other reply types that are needed, as with IBIS. (Note: rev-update and rel-update may correspond to sioc:previous_version and sioc:next_version respectively, and via may be compared with sioc:related_to.)

Ryan and I created cite-rel with the intent of keeping it simple. In the microformat way of focusing on the 80% case, we mostly dealt with simple relationships between posts (replies, forwarding, updates and via links). We left dealing with the subtle nuances of agreeing/disagreeing to future specs. In this specific case, we can actually use one of the earliest microformats, vote-links, to show the author’s attitude towards a previous post. As for other values, here’s the mapping as I see it:

sioc:has_reply – rel-reply
sioc:reply_of – rev-reply
sioc:next_version – rel-update
sioc:previous_version – rev-update
sioc:related_to – I’d say that via can be mapped to sioc:related_to but not necessarily the other way around as via links have a very specific meaning that is not encoded in SIOC as far as I can see. Perhaps IBIS’ source would work?
IBIS:pro/con – map very well to vote-for and vote-against in vote-links.
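
The mapping above can be written down as a small lookup table. This is only a sketch: the dictionary and function names are invented here, and I'm assuming that via is expressed as a rel value and that vote-links values are carried in rev. Note that via maps to sioc:related_to in one direction only, as discussed above.

```python
# SIOC/IBIS term -> (attribute, value) on the microformat side.
SIOC_TO_MICROFORMAT = {
    "sioc:has_reply": ("rel", "reply"),
    "sioc:reply_of": ("rev", "reply"),
    "sioc:next_version": ("rel", "update"),
    "sioc:previous_version": ("rev", "update"),
    "IBIS:pro": ("rev", "vote-for"),        # assumption: vote-links uses rev
    "IBIS:con": ("rev", "vote-against"),
}

# Reverse direction, plus the one-way mapping for via links.
MICROFORMAT_TO_SIOC = {pair: term for term, pair in SIOC_TO_MICROFORMAT.items()}
MICROFORMAT_TO_SIOC[("rel", "via")] = "sioc:related_to"  # assumption: via is a rel value


def to_sioc(attribute, value):
    """Return the SIOC/IBIS term for a link relation, or None if unmapped."""
    return MICROFORMAT_TO_SIOC.get((attribute, value))


def to_microformat(term):
    """Return the (attribute, value) pair for a SIOC/IBIS term, or None."""
    return SIOC_TO_MICROFORMAT.get(term)
```

With this table, `to_sioc("rel", "via")` gives `"sioc:related_to"`, but `to_microformat("sioc:related_to")` deliberately returns `None`.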


Filed under: Aggregation, MicroFormats, Projects, The Net

Microcontent Design and Good Engineering

Richard MacManus is writing an interesting series of posts about microcontent design (more here). I agree with most of Richard’s ideas and with some of the problems he foresees for microcontent. There is, however, one important point I’d like to take issue with. Richard mentions Canter’s Law #1:

it basically says: support all formats and don’t take sides, because the user doesn’t care about your geeky format wars. As Marc put it :

“No human cares about what format is supported. Only us. Flickr proved that they could be completely format agnostic and provide a compelling experience to all.”

As Kevin Burton points out in the comments, Canter’s Law is actually a bastardization of Postel’s Law, aka the Robustness Principle:

“Be conservative in what you do; be liberal in what you accept from others.”

While the robustness principle talks about the implementation of Internet protocols, it is easy to apply it to this case as well. Design and produce content using good (robust, based on solid principles, well thought out, well documented and well specified) formats and protocols, but be ready to consume even the bad ones. While our users may not care about the underlying technology, we must remember that the formats and protocols that we design today might (and hopefully will) become the building blocks for tomorrow’s Web.

We don’t need another Y2K scare and it is up to the designers and implementers (read: us geeks) to make sure that they provide a solid foundation to build on. Now is the time, as this new generation of technologies is being defined and rolled out, to make sure that 5 years from now there won’t be a collective shout of “D’oh!” echoing all around the ‘Net.

If we don’t care about the format wars, if we don’t make sure that the best formats win, we’ll end up (once more) stuck with the loudest solution that kinda works (if you squint your eyes and tilt your head just so) instead of the one that works well. So, by all means, take out the ego from the format wars but keep the dedication to quality. You won’t regret it.

Filed under: MicroFormats, The Net


i-Tag is a new standard proposal by Mary Hodder, Kaliya Hamlin, and Drummond Reed.

The basic idea of an i-tag (identity tag, independent tag, intelligent tag – take your pick) is that a user could tag an object on their own site (photo, video, sound file, text or an entire blog post), where the tag, and the object, would then go out through the RSS feed or be spidered, with some additional information that doesn’t now exist in tags.

Here is a sample i-Tag

<a href="http://example.com/tags/dog" rel="tag" class="http://example.com/blog/post/54">dog</a>

This is an interesting idea but we already have xFolk that solves this problem pretty well, in my opinion. xFolk is a format for publishing collections of bookmarks; it allows users to apply rel-tag to any object identified by a URL. xFolk can be embedded in XHTML, Atom, etc. so it can be easily aggregated by any interested party. Here’s the same entry in xFolk:

<span class="xfolkentry">
<a class="taggedlink" href="http://example.com/blog/post/54">my blog post</a>
<a rel="tag" href="http://example.com/tags/dog">dog</a>
</span>

The other option for i-Tag is tagging objects with a license. I fail to see a general use case for this that would not be covered by rel-license. A sample license i-Tag:

<a href="http://creativecommons.org/licenses/publicdomain/" rel="license" class="http://example.com/blog/post/54">dog</a>

It might be useful to indicate the license of an image or a video clip in some cases but in those same cases one could use xFolk and rel-license to achieve the same:

<span class="xfolkentry">
<a class="taggedlink" href="http://example.com/blog/post/54">my blog post</a>
<a rel="license" href="http://creativecommons.org/licenses/publicdomain/">Public Domain</a>
</span>

Besides replicating existing work, there are a couple of other problems with i-Tags. Embedding a URL (or XRI or what-have-you) in the class attribute strikes me as a generally bad idea.

  1. From a semantic standpoint, this is completely meaningless.
  2. From a design standpoint, you cannot use that class value as part of a CSS selector – it just doesn’t work. And am I allowed to add more values to the class attribute to fix that? The spec is unclear on this.
  3. Last but not least, this piece of data is completely hidden from whoever is viewing the page. How are my readers supposed to know what I’m tagging if they can’t see the subject URL?

As for the whole community dictionary thing (and whatever other information might be hidden inside those XRIs), I’m not sure I quite follow, but it seems like it would be just as useful inside xFolk and rel-license if one were inclined to use XRI. Alternatively, we do have Wikipedia.

via: You’re It

Filed under: MicroFormats, Tagging, The Net

Microformats do it with class

Or more specifically, microformats.org – we do it with class

Now, someone please put that on a t-shirt. Thanks!

Update: FactoryJoe to the rescue!

microformats shirt

Filed under: MicroFormats

Updates: cite-rel, distributed social anything

Time for a couple of updates.


cite-rel

I’ve been doing some research on distributed conversations and markup in order to bring cite-rel to a draft status. You can see the markup examples I collected, formats used in other places and the current state of the discussion, all on the wiki. Please add whatever additional information or thoughts you have on the subject to the appropriate wiki page.

Distributed Social Anything

In case you missed the blog post, there’s now a whole trac wiki devoted to the project. I’ll be pursuing this project throughout this semester at USF as my Master’s Project. I’m planning to keep the project as open and transparent as possible. I’ll try to keep the wiki up-to-date and I would appreciate any feedback from interested parties. I only ask that you create an account on the wiki before modifying anything; this will allow me to keep track of who I’m talking to.

Oh, the code-name for this is Project Nirvana. Don’t ask why.

Filed under: MicroFormats, Projects

Using a DNS-like model for Distributed Conversations

One of the goals of cite-rel is to enable tracking of distributed conversations by aggregators (like Technorati or memeorandum) over multiple blogs. Using a simple microformat like cite-rel to solve the problem has the advantage of a very low cost of entry. Any user can employ cite-rel and any blog software, indeed any tool that publishes HTML, can support the format. The downside is that it requires a third party – the aggregator – and a possibly large amount of work by that aggregator. It is possible, however, to build a different solution to the problem that would not require any third party and would only require each participant to analyze the conversations it is a part of.

DNS is a distributed publishing mechanism. Each DNS server is in charge of only a small subset of the entire domain system but using recursive queries each server can serve information about every public domain. Recursive queries work as a distributed search mechanism that leads your DNS server to the servers with the authority to answer the query. Your DNS server then caches the reply for a limited time so that repeated queries for the same domain would be served faster. We can employ a similar solution with blogs.

To implement such a solution we require two elements:

  1. Support for recursive queries.
  2. A search mechanism.

The search mechanism can be based on an existing web technology – pingbacks. Blog software already supports sending pingbacks and tracking them; this allows blogs to store references to all replying posts. Further, when posting a reply to a post on a different blog, the blog software can keep track of the original post. With these two mechanisms in place we can completely reconstruct the entire thread, even though each blog only stores links to directly connected posts.

Recursive queries for thread information will come in two different flavors.

  1. A request for an entire thread from a specific post. This type of request will be recursively redirected to the blog hosting the original post, unless it is already there.
  2. A recursive request for all replies starting with a specific post in a thread. This request will recursively propagate down using pingback information to all blogs that published replies to this post.

Using these two simple requests any blog can give access to full threads for every post published on it. If we add a simple caching mechanism, the performance of the system should improve dramatically without using too much space.
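
To make the two request types a bit more concrete, here is a minimal in-memory sketch. Everything in it is hypothetical: the `Blog` class, the shared `network` dictionary standing in for HTTP requests between blogs, and the cache lifetime. As with DNS, a cached answer can be stale until it expires.

```python
import time


class Blog:
    """In-memory stand-in for one blog; a real one would answer over HTTP."""

    def __init__(self, network, cache_ttl=300.0):
        self.network = network    # hypothetical registry: post URL -> hosting Blog
        self.parents = {}         # post URL -> URL of the post it replies to
        self.pingbacks = {}       # post URL -> URLs of posts that replied to it
        self.cache = {}           # post URL -> (expiry time, reply subtree)
        self.cache_ttl = cache_ttl

    def publish(self, url, in_reply_to=None):
        self.network[url] = self
        self.pingbacks.setdefault(url, [])
        if in_reply_to is not None:
            self.parents[url] = in_reply_to
            # Pingback: tell the blog hosting the original post about the reply.
            self.network[in_reply_to].pingbacks[in_reply_to].append(url)

    def replies(self, url):
        """Request type 2: all replies below `url`, propagated down via pingbacks."""
        cached = self.cache.get(url)
        if cached and cached[0] > time.monotonic():
            return cached[1]
        subtree = {
            reply: self.network[reply].replies(reply)
            for reply in self.pingbacks[url]
        }
        self.cache[url] = (time.monotonic() + self.cache_ttl, subtree)
        return subtree

    def thread(self, url):
        """Request type 1: the entire thread that contains `url`."""
        parent = self.parents.get(url)
        if parent is not None:
            # Not the original post: redirect to the blog hosting the parent.
            return self.network[parent].thread(parent)
        return {url: self.replies(url)}


network = {}
a, b, c = Blog(network), Blog(network), Blog(network)
a.publish("http://a.example/1")
b.publish("http://b.example/1", in_reply_to="http://a.example/1")
c.publish("http://c.example/1", in_reply_to="http://b.example/1")
c.publish("http://c.example/2", in_reply_to="http://a.example/1")

THREAD = c.thread("http://c.example/1")
```

Asking blog `c` for the thread of its own reply walks up to the original post on `a` and then back down through the pingbacks, so `THREAD` holds the full tree even though each blog only knows its direct neighbours.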

Filed under: Aggregation, MicroFormats, Projects

Distributed Social Anything

Following are generally unstructured thoughts and plans for a possible project. I’ve been thinking about something in this vein for a while but have never put those thoughts into a more permanent form, so here goes. This post serves mostly as scratch paper for my ideas, so feel free to skip it if you don’t like long, raw, technical posts.

For lack of a better name, I’m calling this Distributed Social Anything. The most concise description I can come up with is distributed Tribe.net. Completely distributed (and then aggregated for convenience :).

  • All content published and owned by the users.
  • All content is accessible to any would-be aggregator and formatted according to open standards (mostly microformats).
  • Based on existing tools and technologies. The main publishing tool is a blog.
  • Compatible with current tools. The requirements to participate are few, users of most blog hosting services should be able to participate.

Features and concepts:

  • Identity is defined by a URL. Currently the entities in the system are users and groups, both will have a canonical URL that contains at least XFN data. This XFN data (slightly expanded) defines the standard social network for users but also group membership.
  • Reciprocal XFN links might be required for some of the relations defined later. This is optional and left to aggregators to decide.
  • Group membership is published in XFN. This might require reciprocal links between users and group.
  • Users can publish information about themselves using XFN and hCard. Use rel="me" to link to additional shards of identity.
  • Users publish content on their blog. This content is later aggregated by groups to create a coherent group view.
  • Channels are feeds of blog posts that belong to a specific set. Channels are defined by tags or categories. Each group has at least one channel. Posts marked with that channel’s ID will be part of that group’s discussions.
  • Discussions are annotated using citeRel. Group aggregators might display those in a threaded format
  • Displaying previous versions of posts (in case of editing) with a diff view would be nice
  • XFN links should be aggregated and searchable (similar to rubhub.com). A service that offers search in the XFN space would be very nice
  • Group aggregators should also be aware of rich data (events, listings, etc.)
  • The group site might be able to highlight specific types of rich data (images, bookmarks, etc.) and/or offer access to them using feeds/API
  • We need administrative control of the group – membership, post moderation, rules, access-control, etc.
  • API for the group aggregator
  • Note: group aggregators can collect content from many sources, not just blogs (e.g. flickr, delicious)
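
As a rough illustration of how a group aggregator might combine channels and membership, here is a small sketch. The `Post` and `Group` classes, the `claims` dictionary and all URLs are invented for the example; it assumes the XFN links and tags have already been extracted from the pages. Whether reciprocal links are required is left as a switch, since that decision belongs to the aggregator.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    url: str
    author: str                                  # canonical URL of the author
    tags: set = field(default_factory=set)


@dataclass
class Group:
    url: str
    channel: str                                 # tag that marks posts for this group
    members: set = field(default_factory=set)    # users the group page links to


def is_member(author, claims, group, require_reciprocal=True):
    """`claims` maps a user URL to the group URLs that user links to."""
    claimed = group.url in claims.get(author, set())
    if not require_reciprocal:
        return claimed
    return claimed and author in group.members


def group_view(group, posts, claims, require_reciprocal=True):
    """URLs of posts in the group's channel that were written by members."""
    return [
        post.url
        for post in posts
        if group.channel in post.tags
        and is_member(post.author, claims, group, require_reciprocal)
    ]


GROUP = Group("http://groups.example.org/dogs", "dogs-group",
              {"http://alice.example.com/"})
CLAIMS = {
    "http://alice.example.com/": {GROUP.url},
    "http://bob.example.com/": {GROUP.url},  # claims membership, not listed by the group
}
POSTS = [
    Post("http://alice.example.com/1", "http://alice.example.com/", {"dogs-group"}),
    Post("http://alice.example.com/2", "http://alice.example.com/", {"cats"}),
    Post("http://bob.example.com/1", "http://bob.example.com/", {"dogs-group"}),
]
```

With reciprocal links required, only Alice's first post makes it into the group view; relaxing the requirement also lets in Bob's post.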

Existing support:

  • WordPress supports feeds for categories. Also posts can belong to more than one category. Free channels!
  • RubHub does some XFN search but does not seem to be open source 😦

To do:

  • Express group membership using XFN (rel="memberof" ?)
  • Finalize citeRel.
  • Expand and improve on structured blogging.
  • A format for publishing group information.
  • Possibly replicate and improve on RubHub.
  • The group aggregator service.

Filed under: Aggregation, MicroFormats, Projects, Tagging

Really? Simple Sharing?

I hate specs, they bore me. I’d rather see a few examples and keep a handy reference for any questions not covered by the examples. I’ve decided, however, to make an exception for Microsoft’s SSE and read the actual spec. All I could see was a simplified version control system that, for some unknown reason, is published with RSS and OPML. I quickly turned to Ray Ozzie’s post to see if maybe he’s got some motivation for me.

It seems that Ozzie is really excited about SSE because it will let him better coordinate schedules and contacts with his wife’s staff – heartwarming, really! Personally I think that would work just as well using a phone but I can see how a protocol would scale better. So, assuming we agree that a protocol is desirable for this new-found problem of synching our PIMs, that leaves the question: why RSS and OPML?

Ozzie seems to have two reasons for that: the existing support for those standards and the simplicity of RSS. Reason one is not good enough; it’s what leads you to backwards-compatibility hell. This is even worse when the proposed enhancements break reason two. Synchronization isn’t simple and RSS + SX isn’t simple either.

So if we do still want this subscription model and RSS + SX is not the solution, what is? And what do we stand to gain from a new solution? The answer is very simple, actually, it’s XHTML (with microformats for seasoning).

We already know that anything OPML can do, XOXO can do as well so that’s easy. As for RSS, well, it’s about time we got rid of this hack anyway. There’s no reason for RSS to exist when it can be replaced by XHTML, banishing that ugly XML link from our blogs and maintaining style at the same time. The gain here is obvious, feeds become readable or even (how’s this for magic?) disappear completely – the content is the feed.

Calendar data? Easy. Contacts? Easy. Just use hCalendar and hCard to represent those. What did you just get? Is this a feed of your events that also serves as your calendar? Amazing! And importing contacts into your favorite PIM application is as easy as applying an XSLT? Magic! Synchronization can be done as per Microsoft’s schema or, since we’re just dealing with XHTML here, use any existing solution. How about DAV?
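
To show how little is needed to read such a "feed", here is a minimal sketch that pulls events out of an hCalendar fragment. It uses Python's ElementTree instead of XSLT, the sample markup and function names are made up, and it assumes a well-formed fragment without an XML namespace (real XHTML pages declare one, which would change the tag names).

```python
import xml.etree.ElementTree as ET

# Hypothetical page content: the calendar is the feed.
XHTML = """
<div>
  <div class="vevent">
    <span class="summary">Project review</span>
    <abbr class="dtstart" title="2005-12-01T10:00">Dec 1, 10am</abbr>
    <span class="location">Room 2</span>
  </div>
  <div class="vevent">
    <span class="summary">Lunch</span>
    <abbr class="dtstart" title="2005-12-02T12:30">Dec 2, 12:30pm</abbr>
  </div>
</div>
"""


def has_class(element, name):
    # class is a space-separated list, so compare whole tokens.
    return name in element.get("class", "").split()


def property_value(element):
    # hCalendar puts machine-readable dates in the title of an <abbr>.
    if element.tag == "abbr" and element.get("title"):
        return element.get("title")
    return "".join(element.itertext()).strip()


def extract_events(xhtml, properties=("summary", "dtstart", "location")):
    """Return one dict per vevent, keeping the first value of each property."""
    root = ET.fromstring(xhtml)
    events = []
    for node in root.iter():
        if not has_class(node, "vevent"):
            continue
        event = {}
        for child in node.iter():
            for prop in properties:
                if has_class(child, prop) and prop not in event:
                    event[prop] = property_value(child)
        events.append(event)
    return events


EVENTS = extract_events(XHTML)
```

The same page a person reads in a browser yields two event records here, with no separate feed file involved.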

That’s what I call simple.

Filed under: MicroFormats, The Net

miniProject: Microformat Autopopulation

I’ve been thinking about various ways to parse microformat data into an object. There are plenty of ways to tackle this problem but most require a different script for every microformat standard. The alternative that’s most alluring to me is reflection – learning the schema and values of a piece of data and automagically creating the appropriate object populated with the correct values. I started working on something similar to the database reflection used in Ruby on Rails but soon stopped and limited the scope of the project. Since most microformats still lack a proper schema and since the XHTML structure lacks definite data on the internal structure of the object, full reflection is a complicated affair and might not even be possible. I chose to use an existing schema (database tables and module files) and to limit myself to autopopulation of the values in objects based on that module. The resulting script (while not complete) contains no specific references to any microformat (with the exception of the schemas, of course). I currently have a schema in place for xFolk RC1 entries and will soon add more.

To play with the current (early!) iteration, go to http://hellonline.com/reflect/public/dispatch.fcgi/xfolkentries/text
(NOTE: this version uses class="description" instead of class="extended" as per the recent changes)

How it Works
The script relies on creating well-formatted names for properties based on HTML element information (element name, class name and rel value). Once a name for a possible property has been decided, it checks to see if the object in question has a property of this name and, if it does, sets the value according to the property’s type. Using Ruby it is easy to detect that a certain property is an array and treat it correctly.
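
The idea can be sketched in a few lines. Note that this is not the actual script, which is written in Ruby: it is a simplified Python illustration with invented names (`XFolkEntry`, `Autopopulator`, `candidate_names`), it assumes links contribute their href and other elements their text, and it does not handle properties nested inside other properties.

```python
from html.parser import HTMLParser


class XFolkEntry:
    """Stand-in schema: attribute names and types say what to populate."""

    def __init__(self):
        self.taggedlink = None
        self.description = None
        self.tag = []            # list-valued: may occur several times


def candidate_names(tag, attributes):
    """Possible property names: class values, then rel values, then element name."""
    names = (attributes.get("class") or "").split()
    names += (attributes.get("rel") or "").split()
    names.append(tag)
    return [name.replace("-", "_").lower() for name in names]


class Autopopulator(HTMLParser):
    """Fills `target` using only the names it already has; no format-specific code."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.current = None      # name of the property being read, if any
        self.href = None
        self.text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        for name in candidate_names(tag, attributes):
            if hasattr(self.target, name):
                self.current = name
                self.href = attributes.get("href")
                self.text = []
                return

    def handle_data(self, data):
        if self.current:
            self.text.append(data)

    def handle_endtag(self, tag):
        if not self.current:
            return
        value = self.href or "".join(self.text).strip()
        existing = getattr(self.target, self.current)
        if isinstance(existing, list):
            existing.append(value)           # list property: collect every value
        else:
            setattr(self.target, self.current, value)
        self.current = None


XFOLK = """
<span class="xfolkentry">
  <a class="taggedlink" href="http://example.com/blog/post/54">my blog post</a>
  <span class="description">A post about dogs.</span>
  <a rel="tag" href="http://example.com/tags/dog">dog</a>
  <a rel="tag" href="http://example.com/tags/pets">pets</a>
</span>
"""

ENTRY = XFolkEntry()
Autopopulator(ENTRY).feed(XFOLK)
```

Supporting another format in this sketch would mean adding another schema class, not another parser.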

Complete source is available but the interesting file is really lib/microformats.rb

Update: The application can now parse (most) hCard data as well. Next in line, hReview.

Filed under: MicroFormats, Projects

citeRel at microformats.org

CiteRel has finally found its way to the microformats.org wiki (thanks Ryan). We’re still at the brainstorming stage, trying to get as much feedback as possible. There’s interesting discussion going on in the mailing list and interest seems to be growing.

Filed under: MicroFormats