Thursday, February 24, 2011

AO: Annotating with one or multiple statements (triples)

A few days ago I had a phone discussion with some colleagues (Tudor Groza, Vit Novacek and Cartic Ramakrishnan) about how to use the Annotation Ontology (AO) to attach something more complex than a single term (identified by a URI) to a document or document fragment. To make this concrete, here is an idea of how something like that can already be done in AO.

Let's say I am performing text mining on some textual content. I may not want simply to associate a term with a span of text; I may want to do something more elaborate. For example, I may want to say that, by analyzing this span of text, I obtained the triple GeneG encodes ProteinP. How can I do that in AO? One option is to use a Named Graph and say something like what is shown in the following picture:

Figure 1: The dashed ovals are instances of annotation items. Selectors and other details of the actual annotation have been omitted.

As you can see, we have also annotated the atomic components of the triple. This way, while analyzing the assertions belonging to a specific domain, I can always trace back to the original text. Also, by using a graph as the object of my annotation, I am moving in the direction of the Nanopublication format; that, however, will be the topic of a future post.
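For readers who prefer code to pictures, here is a rough sketch of the same structure in plain Python, with each statement stored as a quad (a triple plus the name of the graph it belongs to). Everything in the ex: namespace is a made-up identifier, and the AO terms used (ao:Annotation, ao:annotatesResource, ao:context, ao:hasTopic) are my assumption about how the picture maps onto AO, so take it as an illustration rather than a reference serialization.

```python
from collections import namedtuple

# A quad is a triple plus the name of the graph that contains it.
Quad = namedtuple("Quad", "subject predicate object graph")

DEFAULT = None          # the default (unnamed) graph
GRAPH = "ex:graph1"     # hypothetical name of the graph holding the mined triple
DOC = "ex:document1"    # hypothetical URI of the annotated document

quads = [
    # The mined statement lives in its own named graph.
    Quad("ex:GeneG", "ex:encodes", "ex:ProteinP", GRAPH),
    # Annotation whose topic is the whole named graph.
    Quad("ex:ann-statement", "rdf:type", "ao:Annotation", DEFAULT),
    Quad("ex:ann-statement", "ao:annotatesResource", DOC, DEFAULT),
    Quad("ex:ann-statement", "ao:context", "ex:selector-sentence", DEFAULT),
    Quad("ex:ann-statement", "ao:hasTopic", GRAPH, DEFAULT),
    # Annotations on the atomic components of the triple.
    Quad("ex:ann-gene", "rdf:type", "ao:Annotation", DEFAULT),
    Quad("ex:ann-gene", "ao:annotatesResource", DOC, DEFAULT),
    Quad("ex:ann-gene", "ao:context", "ex:selector-gene", DEFAULT),
    Quad("ex:ann-gene", "ao:hasTopic", "ex:GeneG", DEFAULT),
    Quad("ex:ann-protein", "rdf:type", "ao:Annotation", DEFAULT),
    Quad("ex:ann-protein", "ao:annotatesResource", DOC, DEFAULT),
    Quad("ex:ann-protein", "ao:context", "ex:selector-protein", DEFAULT),
    Quad("ex:ann-protein", "ao:hasTopic", "ex:ProteinP", DEFAULT),
]


def trace_to_text(term, quads):
    """Return the selectors of all annotations about `term`, either
    directly or through a named graph that mentions `term`."""
    # Topics that lead back to the term: the term itself and any named graph using it.
    topics = {term}
    for q in quads:
        if q.graph is not None and term in (q.subject, q.predicate, q.object):
            topics.add(q.graph)
    # Annotations whose topic is one of those.
    annotations = {
        q.subject for q in quads
        if q.predicate == "ao:hasTopic" and q.object in topics
    }
    # Selectors (contexts) of those annotations, i.e. the pieces of original text.
    return sorted(
        q.object for q in quads
        if q.predicate == "ao:context" and q.subject in annotations
    )
```

Calling trace_to_text("ex:GeneG", quads) returns both the selector of the gene mention and the selector of the sentence from which the whole triple was extracted, which is the "trace back to the original text" mentioned above.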

Given this, you can attach the proper provenance to the annotation. If you are a text miner, you might be interested in recording which software or computational workflow generated the annotation and with what confidence.
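As a small illustration of why this provenance is useful, the sketch below attaches a generating tool and a confidence score to a few annotations and then keeps only the confident ones. The property names tm:generatedBy and tm:confidence are placeholders I made up for this example, not settled terms, and the annotation identifiers and scores are invented.

```python
# Hypothetical provenance records, keyed by annotation identifier.
# Property names, identifiers and scores are all illustrative.
provenance = {
    "ex:ann-statement": {"tm:generatedBy": "ex:miner-v1", "tm:confidence": 0.82},
    "ex:ann-gene": {"tm:generatedBy": "ex:miner-v1", "tm:confidence": 0.97},
    "ex:ann-protein": {"tm:generatedBy": "ex:miner-v1", "tm:confidence": 0.64},
}


def confident_annotations(provenance, threshold):
    """Return the annotations whose confidence is at least `threshold`."""
    return sorted(
        annotation
        for annotation, record in provenance.items()
        if record["tm:confidence"] >= threshold
    )
```

With a threshold of 0.8, only the statement and gene annotations survive; the same kind of filter could be applied per tool or per workflow.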


You might have noticed the use of the namespace tm, which stands for Text Mining. It is a set of properties I am working on to extend AO so that it better represents text mining results.
