HellOnline

Eran's blog

Re: Microformats and SIOC

John Breslin writes about the overlap between Microformats and the SIOC (Semantically-Interlinked Online Communities) project. SIOC was one of my references when expanding cite-rel and later when designing my Distributed Social Anything project so, naturally, I’m interested.

John brings up a few points where SIOC and microformats could help each other and therefore help make the Web more semantic.

In the SIOC and FOAF vocabularies, there are two properties linking people to user profiles: a Person is linked to a User using a “hasOnlineAccount” link, and a User is linked to a Person using an “account_of” relationship.

To the best of my knowledge, the way this is normally handled in the mf world is based on XFN (XHTML Friends Network) and possibly with the addition of hCard. Using XFN rel=me links one can establish an equivalence relation between different accounts to show that they belong to the same person. Additionally, one could use a rel=me link to an hCard to provide further information about the actual person (as opposed to an account on a website). This is described to some degree on the GMPG site but that description is getting somewhat out of date and should really be moved to the mf wiki.
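To make the rel=me idea a bit more concrete, here is a minimal sketch that pulls rel="me" links out of a page using Python's standard html.parser. The sample markup, the URLs and the names RelMeParser and find_rel_me are invented for illustration; this is not how any particular XFN consumer is implemented.

```python
from html.parser import HTMLParser


class RelMeParser(HTMLParser):
    """Collect href values of <a>/<link> elements whose rel contains 'me'."""

    def __init__(self):
        super().__init__()
        self.me_links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        # rel holds a space-separated list of values
        rels = (attr.get("rel") or "").lower().split()
        if "me" in rels and attr.get("href"):
            self.me_links.append(attr["href"])


def find_rel_me(html):
    """Return every URL the page claims belongs to the same person."""
    parser = RelMeParser()
    parser.feed(html)
    return parser.me_links


# Hypothetical profile page: two of the author's accounts plus a friend link.
SAMPLE_PAGE = """
<a href="http://example.org/blog/" rel="me">my blog</a>
<a href="http://example.com/photos/eran" rel="me">my photos</a>
<a href="http://example.net/john" rel="friend met">John</a>
"""

print(find_rel_me(SAMPLE_PAGE))
```

An aggregator would still want to check that the linked pages point back with rel="me" before treating the accounts as equivalent.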

…when you create your post, link back to your own post aggregation resource. From the user side, this could be achieved by creating a mf for sioc:has_container that allows linking to “Virtual Forums” from the post creator side, e.g. from the post content.

I’ve faced this problem while designing the DSA system and chose a similar solution based on an existing microformat – rel-directory. By adding ‘directory’ to the rel value of a link, the author states that the destination of the link is a container listing an entry for the current page. Using the rev attribute one could publish the complementary relation: that the current page is a container listing the link’s destination.
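A small sketch of how a consumer might read these links, assuming the semantics described above: rel="directory" means the link target contains the current page, and rev="directory" means the current page contains the link target. The URLs, the sample markup and the helper name directory_pairs are made up for the example.

```python
from html.parser import HTMLParser


class DirectoryLinkParser(HTMLParser):
    """Turn rel/rev="directory" links into (container, entry) pairs."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # rel="directory": the destination lists the current page.
        if "directory" in (attr.get("rel") or "").lower().split():
            self.pairs.append((href, self.page_url))
        # rev="directory": the current page lists the destination.
        if "directory" in (attr.get("rev") or "").lower().split():
            self.pairs.append((self.page_url, href))


def directory_pairs(page_url, html):
    parser = DirectoryLinkParser(page_url)
    parser.feed(html)
    return parser.pairs


# Hypothetical post and group pages stating the same fact from both sides.
POST_URL = "http://example.org/blog/post-1"
GROUP_URL = "http://example.com/groups/microformats"
POST_HTML = '<a href="%s" rel="directory tag">posted to the group</a>' % GROUP_URL
GROUP_HTML = '<a href="%s" rev="directory">a member post</a>' % POST_URL

FROM_POST = directory_pairs(POST_URL, POST_HTML)
FROM_GROUP = directory_pairs(GROUP_URL, GROUP_HTML)
print(FROM_POST == FROM_GROUP)
```

Both pages yield the same (container, entry) pair, which is what lets an aggregator confirm the relation from either end.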

From forums, a mf for has_part / part_of would allow one to link a forum or blog or any discussion channel to a larger community, e.g. this could be done from a forum description. Also, a mf for has_part / part_of would help to link a user to a community, e.g. by creating a typed link from a user signature.

As John says, this is very close to the work I’ve done with DSA. I haven’t really thought of showing relationships between whole forums and groups but this can be done in the same way as single posts using rel-directory (mentioned above). For users, I wanted something with slightly more specific semantics that’s easy to integrate with XFN so I’ve started work on XMF (XHTML Membership format).

Lastly, John touches on distributed conversations and cite-rel. There is, indeed, some correlation between cite-rel values and SIOC/IBIS relations but there’s also an important difference.

Ultimately, we need ways to say that a post is in agreement or disagreement with a previous post or even with specific parts of a previous post (see picture below). Also, we may need to describe other reply types that are needed, as with IBIS. (Note: rev-update and rel-update may correspond to sioc:previous_version and sioc:next_version respectively, and via may be compared with sioc:related_to.)

Ryan and I created cite-rel with the intent of keeping it simple. In the microformat way of focusing on the 80% case we mostly dealt with simple relationships between posts (replies, forwarding, updates and via links). We left dealing with the subtle nuances of agreeing/disagreeing to future specs. In this specific case, we can actually use one of the earliest microformats, vote-links, to show the author’s attitude towards a previous post. As for other values, here’s the mapping as I see it:

sioc:has_reply – rel-reply
sioc:reply_of – rev-reply
sioc:next_version – rel-update
sioc:previous_version – rev-update
sioc:related_to – I’d say that via can be mapped to sioc:related_to but not necessarily the other way around as via links have a very specific meaning that is not encoded in SIOC as far as I can see. Perhaps IBIS’ source would work?
IBIS:pro/con – map very well to vote-for and vote-against in vote-links.
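The mapping above can be written down as a lookup table, which is roughly what a converter between the two worlds would need. This is only a sketch: the "ibis:pro"/"ibis:con" spellings are placeholders, and putting via in rel and the vote-links values in rev is my assumption about how those links are marked up.

```python
# SIOC/IBIS property -> (link attribute, value), following the list above.
SIOC_TO_LINK = {
    "sioc:has_reply": ("rel", "reply"),
    "sioc:reply_of": ("rev", "reply"),
    "sioc:next_version": ("rel", "update"),
    "sioc:previous_version": ("rev", "update"),
    # Assumption: vote-links values are carried in the rev attribute.
    "ibis:pro": ("rev", "vote-for"),
    "ibis:con": ("rev", "vote-against"),
}

# The reverse direction, plus the one-way case: a via link can be read as
# sioc:related_to, but sioc:related_to does not imply a via link.
LINK_TO_SIOC = {link: prop for prop, link in SIOC_TO_LINK.items()}
LINK_TO_SIOC[("rel", "via")] = "sioc:related_to"  # assumption: via uses rel


def to_sioc(attribute, value):
    """Return the SIOC/IBIS property for a typed link, or None if unmapped."""
    return LINK_TO_SIOC.get((attribute, value))


def to_link(sioc_property):
    """Return (attribute, value) for a SIOC/IBIS property, or None."""
    return SIOC_TO_LINK.get(sioc_property)


print(to_sioc("rev", "reply"))
print(to_link("sioc:related_to"))
```

Keeping sioc:related_to out of the forward table is deliberate; it encodes the "not necessarily the other way around" caveat.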

Filed under: Aggregation, MicroFormats, Projects, The Net

Using a DNS-like model for Distributed Conversations

One of the goals of cite-rel is to enable aggregators (like technorati or memeorandum) to track distributed conversations over multiple blogs. Using a simple microformat like cite-rel to solve the problem has the advantage of a very low cost of entry. Any user can employ cite-rel, and any blog software, indeed any tool that publishes HTML, can support the format. The downside is that it requires a third party – the aggregator – and a possibly large amount of work from that aggregator. It is possible, however, to build a different solution to the problem that would not require any third party and would only require each participant to analyze the conversations they are a part of.

DNS is a distributed publishing mechanism. Each DNS server is in charge of only a small subset of the entire domain system but using recursive queries each server can serve information about every public domain. Recursive queries work as a distributed search mechanism that leads your DNS server to the servers with the authority to answer the query. Your DNS server then caches the reply for a limited time so that repeated queries for the same domain would be served faster. We can employ a similar solution with blogs.

To implement such a solution we require two elements:

  1. Support for recursive queries.
  2. A search mechanism.

The search mechanism can be based on an existing web technology – pingbacks. Blog software already supports sending pingbacks and tracking them, which allows blogs to store references to all replying posts. Further, when posting a reply to a post on a different blog, the blog software can keep track of the original post. With these two mechanisms in place we can completely reconstruct the entire thread, even though each blog only stores links to directly connected posts.

Recursive queries for thread information will come in two different flavors.

  1. A request for an entire thread from a specific post. This type of request will be recursively redirected to the blog hosting the original post, unless it is already there.
  2. A recursive request for all replies starting with a specific post in a thread. This request will recursively propagate down using pingback information to all blogs that published replies to this post.

Using these two simple requests any blog can give access to full threads for every post published on it. If we add a simple caching mechanism, the performance of the system should improve dramatically without using too much space.
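The two request types and the cache can be sketched with an in-memory model. Everything here is hypothetical: the Blog class, the BLOGS registry standing in for HTTP requests between blogs, the method names and the 300-second time-to-live are all assumptions made to keep the example runnable, not a proposed protocol.

```python
import time
from urllib.parse import urlparse

BLOGS = {}  # host -> Blog; stands in for network requests between blogs


def blog_for(url):
    """Find the blog that hosts a post (a real system would use HTTP)."""
    return BLOGS[urlparse(url).netloc]


class Blog:
    def __init__(self, host, ttl=300.0):
        self.parent = {}   # post URL -> URL of the post it replies to
        self.replies = {}  # post URL -> reply URLs learned from pingbacks
        self.cache = {}    # post URL -> (expiry time, cached reply tree)
        self.ttl = ttl
        BLOGS[host] = self

    def publish(self, url, in_reply_to=None):
        self.parent[url] = in_reply_to
        self.replies.setdefault(url, [])
        if in_reply_to:
            # Simulated pingback to the blog hosting the original post.
            blog_for(in_reply_to).replies[in_reply_to].append(url)

    def thread_root(self, url):
        """Request type 1: redirect upwards until the original post."""
        parent = self.parent[url]
        if parent is None:
            return url
        return blog_for(parent).thread_root(parent)

    def replies_below(self, url):
        """Request type 2: propagate down through pingback data."""
        hit = self.cache.get(url)
        if hit and hit[0] > time.monotonic():
            return hit[1]
        tree = (url, [blog_for(r).replies_below(r) for r in self.replies[url]])
        self.cache[url] = (time.monotonic() + self.ttl, tree)
        return tree

    def full_thread(self, url):
        root = self.thread_root(url)
        return blog_for(root).replies_below(root)


a, b, c = Blog("a.example"), Blog("b.example"), Blog("c.example")
a.publish("http://a.example/1")
b.publish("http://b.example/1", in_reply_to="http://a.example/1")
c.publish("http://c.example/1", in_reply_to="http://b.example/1")
a.publish("http://a.example/2", in_reply_to="http://a.example/1")

THREAD = c.full_thread("http://c.example/1")
print(THREAD)
```

As with DNS, a cached subtree can be stale until its time-to-live expires, so a reply published after a query may not show up immediately.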

Filed under: Aggregation, MicroFormats, Projects

Distributed Social Anything

The following are generally unstructured thoughts and plans for a possible project. I’ve been thinking about something in this vein for a while but have never put those thoughts into a more permanent form so here goes. This post serves mostly as scratch paper for my ideas so feel free to skip it if you don’t like long, raw, technical posts.

For lack of a better name, I’m calling this Distributed Social Anything. The most concise description I can come up with is distributed Tribe.net. Completely distributed (and then aggregated for convenience :).

  • All content published and owned by the users.
  • All content is accessible by any would be aggregator and formatted according to open standards (mostly microformats).
  • Based on existing tools and technologies. The main publishing tool is a blog.
  • Compatible with current tools. The requirements to participate are few, users of most blog hosting services should be able to participate.

Features and concepts:

  • Identity is defined by a URL. Currently the entities in the system are users and groups, both will have a canonical URL that contains at least XFN data. This XFN data (slightly expanded) defines the standard social network for users but also group membership.
  • Reciprocal XFN links might be required for some of the relations defined later. This is optional and left to aggregators to decide.
  • Group membership is published in XFN. This might require reciprocal links between users and group.
  • Users can publish information about themselves using XFN and hCard. Use rel=“me” to link to additional shards of identity.
  • Users publish content on their blog. This content is later aggregated by groups to create a coherent group view.
  • Channels are feeds of blog posts that belong to a specific set. Channels are defined by tags or categories. Each group has at least one channel. Posts marked with that channel’s ID will be part of that group’s discussions.
  • Discussions are annotated using citeRel. Group aggregators might display those in a threaded format.
  • Displaying previous versions of posts (in case of editing) with a diff view would be nice
  • XFN links should be aggregated and searchable (similar to rubhub.com). A service that offers search in the XFN space would be very nice
  • Group aggregators should also be aware of rich data (events, listings, etc.)
  • The group site might be able to highlight specific types of rich data (images, bookmarks, etc.) and/or offer access to it using feeds/API
  • We need administrative control of the group – membership, post moderation, rules, access-control, etc.
  • API for the group aggregator
  • Note: group aggregators can collect content from many sources, not just blogs (e.g. flickr, delicious)
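To illustrate the channel idea from the list above, here is a rough sketch of how a group aggregator might select posts: a post joins the group's discussion when it carries the channel's tag. Restricting the result to posts by known members is my own assumption about how membership would be enforced, and the Post and Group classes, field names and URLs are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Post:
    url: str
    author: str      # canonical URL identifying the user
    tags: frozenset  # tags or categories the post was filed under


@dataclass
class Group:
    url: str         # canonical URL identifying the group
    channel: str     # tag/category ID that marks posts for this group
    members: set = field(default_factory=set)

    def aggregate(self, posts):
        """Keep posts tagged with the channel ID and written by members."""
        return [
            p for p in posts
            if self.channel in p.tags and p.author in self.members
        ]


# Hypothetical data: one group, two members, three posts.
GROUP = Group(
    url="http://groups.example/mf",
    channel="mf-group",
    members={"http://alice.example/", "http://bob.example/"},
)
POSTS = [
    Post("http://alice.example/1", "http://alice.example/", frozenset({"mf-group", "xfn"})),
    Post("http://bob.example/7", "http://bob.example/", frozenset({"cooking"})),
    Post("http://carol.example/3", "http://carol.example/", frozenset({"mf-group"})),
]

GROUP_VIEW = [p.url for p in GROUP.aggregate(POSTS)]
print(GROUP_VIEW)
```

In practice the posts would come from per-category feeds rather than an in-memory list, and the member set from XFN data published at each canonical URL.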

Existing support:

  • WordPress supports feeds for categories. Also posts can belong to more than one category. Free channels!
  • RubHub does some XFN search but does not seem to be open source 😦

To do:

  • Express group membership using XFN (rel=“memberof”?)
  • Finalize citeRel.
  • Expand and improve on structured blogging.
  • A format for publishing group information.
  • Possibly replicate and improve on RubHub.
  • The group aggregator service.

Filed under: Aggregation, MicroFormats, Projects, Tagging

Google Alerts?!

From Google Blog: (Emphasis and editing mine)

If you track information on a batch of discrete topics all the time like I do, managing your inbox is no day at the beach. Monitoring a number of mailing lists for interesting news on, say, [harry potter] or [sony aibo] or [housing bubble] without actually subscribing to what could be hundreds of mailing lists is a daunting task…
Some of us thought it would be cool to offer a feature that does this work for us… a beta version of Google Groups Alerts. It monitors the top 50 most recent Google Groups search results that relate to keywords you’re interested in. Any new articles posted that match your criteria will be emailed to you, just like Google News alerts.

Email? Email??? It’s nice of Google to allow us to subscribe to search results but why through Email? Where’s my RSS support? Technorati does it, Yahoo does it but Google, you know, those guys that developed AdSense for Feeds? They do Email.

Filed under: Aggregation, The Net

Aggregation

Now that our data is scattered so wide and so far, it’s time to start pulling things in; this is where aggregation becomes important. This is also where microformats can make a huge impact. From my viewpoint as a user, I can see two sides to the aggregation game:

  1. Aggregating from many people and one/many sources. You’re probably reading this post in an RSS feed reader, and that’s exactly what it does. Technorati is doing something very similar with an added bonus – search! We’ve been doing this type of aggregation for a while now and I think it’s time for a nice shake-up.
  2. Aggregating content from one person and many sources, what some may call a Digital Lifestyle Aggregator (DLA). A DLA tries to recombine your fractured online persona into one piece. You can see a pretty good example for that in tribe’s new profiles.
    Behind all the flashy new design options, the new profiles let you pull in syndicated content from anywhere else, displaying it all in one place. My biggest problem with tribe’s implementation is their target audience for the aggregated content. Unlike your RSS reader which pulls in content for you to consume, tribe’s profiles pull in the content mainly for others to consume. I want to see my events from upcoming, messages from flickr and active tribes in one page. I’ve heard tribe employees say that a similar revamping of the main page is coming up soon and I think this is the way they’re headed. I’m definitely looking forward to that!
    Another interesting new example is CNet’s new Shoebox. You can read all about it yourself. Finally, somebody realized that I don’t need to upload my photos to their server before they can let me play with them.

Filed under: Aggregation, MicroFormats