Thursday, September 22, 2011

SWAN, AlzSWAN, HyQue and Nanopublications

While developing the SWAN ontology and the SWAN platform (see AlzSWAN for Alzheimer's disease), two issues have always remained open: (i) the use of named graphs and (ii) the translation of textual discourse elements (claims such as "Intramembranous Aβ could behave as chaperones of other membrane proteins") into a formal representation made of triples.


(i) Named graphs are a useful way of wrapping some content and specifying its provenance. The basic idea is to create an 'onion layers' model in which each layer has its own provenance. Back in 2006 I investigated the use of named graphs, and TriX, for representing SWAN content. However, we decided not to implement that approach: the technological uncertainty of developing an application like SWAN in uncharted territory was already high enough, named graph usage was not homogeneous across the community, and their serialization was not standardized. This meant introducing de facto reification for some of the SWAN relationships in order to attach the appropriate provenance. Now that graphs are the topic of one of the task forces of the RDF Working Group, which is updating the 2004 RDF Recommendations, I have started to think about resuming the old plans.
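To make the trade-off concrete, here is a minimal sketch in plain Python of the two ways of attaching provenance: once per named graph, or once per reified statement. This is not SWAN code. Every `ex:` identifier, both function names and the provenance properties are made up for illustration; only the `rdf:` reification terms come from the RDF vocabulary.

```python
# Assumption: all "ex:" names are hypothetical placeholders, not SWAN terms.

def as_named_graph(triples, graph, provenance, provenance_graph="ex:provenance"):
    """Wrap the triples in one named graph and describe that graph once."""
    content = [(s, p, o, graph) for (s, p, o) in triples]
    # Provenance is stated about the graph name, in an outer layer.
    about_graph = [(graph, p, o, provenance_graph) for (p, o) in provenance]
    return content + about_graph


def as_reified(triples, provenance):
    """Describe every triple as an rdf:Statement and annotate each one."""
    out = []
    for i, (s, p, o) in enumerate(triples, start=1):
        stmt = f"ex:statement{i}"
        out += [
            (stmt, "rdf:type", "rdf:Statement"),
            (stmt, "rdf:subject", s),
            (stmt, "rdf:predicate", p),
            (stmt, "rdf:object", o),
        ]
        # The same provenance has to be repeated for every statement.
        out += [(stmt, pp, po) for (pp, po) in provenance]
    return out


claim = [
    ("ex:claim1", "ex:hasText",
     "Intramembranous Aβ could behave as chaperones of other membrane proteins"),
    ("ex:claim1", "ex:supportedBy", "ex:article1"),
]
provenance = [("ex:curatedBy", "ex:curatorA"), ("ex:importedFrom", "ex:sourceB")]

quads = as_named_graph(claim, "ex:graph1", provenance)
reified = as_reified(claim, provenance)
print(len(quads), len(reified))
```

In this toy case the named-graph version needs far fewer statements than the reified one, and the gap grows with the number of triples. The 'onion' comes from the fact that the provenance graph is itself a named graph and can be described by another layer in the same way.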

(ii) Translating textual discourse elements into a formal representation made of triples is possible, for instance, through the HyBrow (now HyQue) approach. Translating narrative into triples is not an easy job, though. Many already found the manual SWAN process of creating narrative claims very labor-intensive. In fact, the SWAN curators have usually rephrased each claim or hypothesis to make it simple and self-contained (including the minimum necessary context). Translation into triples requires, even more so, starting from neat hypotheses and claims, and these are not always easy to obtain.
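As a rough illustration of what such a translation involves, the sketch below pairs a narrative claim with one possible hand-made set of triples. It does not follow the HyQue model or the SWAN ontology: the `DiscourseElement` class, the `render` helper and all `ex:` terms are assumptions chosen only to show the shape of the problem.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class DiscourseElement:
    narrative: str                                        # claim as phrased by a curator
    triples: List[Triple] = field(default_factory=list)   # formal counterpart, if any

    def is_formalized(self) -> bool:
        return len(self.triples) > 0


def render(triples: List[Triple]) -> str:
    """Print triples one per line, in a loose N-Triples-like layout."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


claim = DiscourseElement(
    narrative="Intramembranous Aβ could behave as chaperones of other membrane proteins",
)

# One possible translation, with hypothetical ex: terms. Note what is lost:
# the hedge ("could") and the scope ("other") have no counterpart here and
# would need further triples or a dedicated vocabulary.
claim.triples = [
    ("ex:hypothesis1", "ex:hasAgent", "ex:IntramembranousAbeta"),
    ("ex:hypothesis1", "ex:hasProposedRole", "ex:Chaperone"),
    ("ex:hypothesis1", "ex:hasTarget", "ex:MembraneProtein"),
]

print(render(claim.triples))
```

Even for a short, already rephrased claim, several modeling decisions have to be taken by hand, which is why neat input claims matter so much.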

These two SWAN-related issues had been on my mind for a while when the Nanopublication [1] concept came out.

A Nanopublication is a "set of annotations that refers to the same statement and contains a minimum set of (community) agreed-upon annotations".
The concept itself is simple, and in the slideshow linked above you can find a first attempt based on real SWAN data. With respect to the paper, the concept of 'statement' (triple) has to be widened to 'statements' (triples), as a single statement is not always enough to satisfy the needs of real use cases.
 Statement --> statements
Starting from the above example, we are now trying to formalize a bit better what a Nanopublication architecture would look like. It is a work in progress, but if you look at the slides you will get the drift.
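Read literally, the definition quoted above (with 'statement' widened to 'statements') suggests a structure along the following lines. This is only a sketch and not the architecture we are working on: the class, the `ex:` annotation keys and the contents of the required set are assumptions, since which annotations count as the community-agreed minimum is exactly what has to be decided.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical minimum set of agreed-upon annotations (assumption).
REQUIRED_ANNOTATIONS: Set[str] = {"ex:author", "ex:created"}


@dataclass
class Nanopublication:
    statements: List[Triple]                                # one or more triples
    annotations: Dict[str, str] = field(default_factory=dict)

    def missing_annotations(self, required=REQUIRED_ANNOTATIONS) -> Set[str]:
        """Return the required annotation keys that are not present."""
        return set(required) - set(self.annotations)

    def is_valid(self, required=REQUIRED_ANNOTATIONS) -> bool:
        """Valid means: at least one statement and no missing annotation."""
        return bool(self.statements) and not self.missing_annotations(required)


nanopub = Nanopublication(
    statements=[
        ("ex:hypothesis1", "ex:hasAgent", "ex:IntramembranousAbeta"),
        ("ex:hypothesis1", "ex:hasProposedRole", "ex:Chaperone"),
    ],
    annotations={"ex:author": "ex:curatorA"},
)

print(nanopub.is_valid(), nanopub.missing_annotations())

# Adding the missing annotation (placeholder value) completes the minimum set.
nanopub.annotations["ex:created"] = "ex:someDate"
print(nanopub.is_valid())
```

Keeping the statements as a list rather than a single triple is the only change with respect to the paper's definition that this sketch tries to capture.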

[1] Paul Groth, Andrew Gibson, Jan Velterop. The anatomy of a nanopublication. Information Services and Use (2010). Volume: 30, Issue: 1, Publisher: IOS Press, Pages: 51-56 (on Mendeley)
