Thursday, May 27, 2010

About one of my FOAF problems

I am currently involved in defining the Annotation Ontology (AO). For this task I would like to reuse existing ontologies; AO is in fact built on top of the Annotea project schema. Another popular vocabulary is FOAF. Now, FOAF defines - or rather, loosely defines - a foaf:Document class as: The Document class represents those things which are, broadly conceived, 'documents'. The documentation (accessed today) also says that there is no distinction between physical and electronic documents.

In AO I need to point to the document I am annotating. As many of you know, documents on the web are subject to change even when their URI stays the same. For instance, my curriculum changes with each new project I work on, yet its URI is always the same. So I might annotate a piece of text in a document today and find tomorrow that the piece is gone and my annotation is orphaned.

To be concrete, I can write something like:
<rdf:Description rdf:about="http://www.hcklab.org/paolo-ciccarese-cv.html">
  <rdf:type rdf:resource="Document"/>             
  <pav:sourceAccessedOn>2010-05-10</pav:sourceAccessedOn>
</rdf:Description>
After a while another user (or I myself) creates another annotation on the same page, which results in:
<rdf:Description rdf:about="http://www.hcklab.org/paolo-ciccarese-cv.html">
  <rdf:type rdf:resource="Document"/>             
  <pav:sourceAccessedOn>2010-05-31</pav:sourceAccessedOn>
</rdf:Description>
As you can imagine, the access date does not work well: we now have a single URI associated with two dates, and no way to tell which date belongs to which annotation.
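
The problem above can be reproduced without any RDF tooling. What follows is only a minimal sketch in Python: it treats a graph as a plain set of (subject, predicate, object) tuples, and the names ex:annotation1, ex:annotation2 and ex:annotates are hypothetical placeholders of mine, not AO terms.

```python
# Model RDF triples as plain (subject, predicate, object) tuples.
# A graph is a set of triples, so merging two graphs is a set union.
CV = "http://www.hcklab.org/paolo-ciccarese-cv.html"
ACCESSED = "pav:sourceAccessedOn"
ANNOTATES = "ex:annotates"  # hypothetical property, for illustration only

# Statements produced by the first annotation (hypothetical identifier).
first_graph = {
    ("ex:annotation1", ANNOTATES, CV),
    (CV, ACCESSED, "2010-05-10"),
}
# Statements produced by the second annotation, three weeks later.
second_graph = {
    ("ex:annotation2", ANNOTATES, CV),
    (CV, ACCESSED, "2010-05-31"),
}

merged = first_graph | second_graph


def access_dates(graph, annotation):
    """Return every access date reachable from an annotation via its target."""
    targets = {o for s, p, o in graph if s == annotation and p == ANNOTATES}
    return sorted(o for s, p, o in graph if s in targets and p == ACCESSED)


# Both annotations now see both dates, because the dates hang off the shared URI.
print(access_dates(merged, "ex:annotation1"))
print(access_dates(merged, "ex:annotation2"))
```

Once the two descriptions end up in the same store, both dates are attached to the page URI itself, so neither annotation can recover its own access date.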

A better solution would be to have a stable URI for the webpage and a different URI for each version of the document. Something like:
<rdf:Description rdf:about="http://my.example.org/sd/2332">
  <rdf:type rdf:resource="Document"/>
  <ao:retrievedFrom rdf:resource="http://tinyurl.com/ykjn87p"/>
  <pav:sourceAccessedOn>2010-05-10</pav:sourceAccessedOn>
</rdf:Description>

<rdf:Description rdf:about="http://tinyurl.com/ykjn87p">
  <rdf:type rdf:resource="WebPage"/>
</rdf:Description>
or, if we really want to keep FOAF in the picture:
<rdf:Description rdf:about="http://my.example.org/sd/2332">
  <rdf:type rdf:resource="SourceDocument"/>
  <ao:retrievedFrom rdf:resource="http://tinyurl.com/ykjn87p"/>
  <pav:sourceAccessedOn>2010-05-10</pav:sourceAccessedOn>
</rdf:Description>

<rdf:Description rdf:about="http://tinyurl.com/ykjn87p">
  <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Document"/>
</rdf:Description>
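
To see why one URI per retrieved version helps, here is a rough Python sketch under my own assumptions: graphs are again plain sets of tuples, the snapshot URIs are minted from a simple counter starting at 2332 (so the second one, 2333, is made up), and ex:annotates and the annotation identifiers are hypothetical, not AO terms.

```python
import itertools

PAGE = "http://tinyurl.com/ykjn87p"
ACCESSED = "pav:sourceAccessedOn"
ANNOTATES = "ex:annotates"  # hypothetical property, for illustration only

# Assumed minting scheme: a counter appended to a base URI.
_counter = itertools.count(2332)


def snapshot(graph, page, accessed_on):
    """Mint a new URI for one retrieved version of `page` and describe it."""
    uri = f"http://my.example.org/sd/{next(_counter)}"
    graph.add((uri, "ao:retrievedFrom", page))
    graph.add((uri, ACCESSED, accessed_on))
    return uri


def access_dates(graph, annotation):
    """Return every access date reachable from an annotation via its target."""
    targets = {o for s, p, o in graph if s == annotation and p == ANNOTATES}
    return sorted(o for s, p, o in graph if s in targets and p == ACCESSED)


graph = set()
v1 = snapshot(graph, PAGE, "2010-05-10")
v2 = snapshot(graph, PAGE, "2010-05-31")

# Each annotation points at the version it was made on, not at the page.
graph.add(("ex:annotation1", ANNOTATES, v1))
graph.add(("ex:annotation2", ANNOTATES, v2))

print(access_dates(graph, "ex:annotation1"))
print(access_dates(graph, "ex:annotation2"))
```

Each annotation now resolves to exactly one access date, while both versions still point back to the same stable page URI through ao:retrievedFrom.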
