Thursday, May 26, 2011

DOMEO: Linking science and semantics with Annotation Ontology (AO) [1]

In the last few months I've been focusing on the development of the SWAN Annotation Tool (recently renamed DOMEO)*. DOMEO (Document Metadata Exchange Organizer) is an extensible web component that lets users visually and efficiently create and share ontology-based stand-off annotation metadata on HTML or XML document targets, using the Annotation Ontology RDF model. The tool supports manual, fully automated, and semi-automated annotation with complete provenance records, as well as personal or community annotation with access authorization and control. DOMEO is one piece of a larger architecture that we internally call the Annotation Framework.
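
To make "stand-off annotation" a bit more concrete, here is a minimal Python sketch. It is not DOMEO code and it does not use the actual Annotation Ontology vocabulary: the class names, field names, and example URIs are all made up for illustration. The point is only that the annotation lives outside the document and refers to a piece of text through a selector (here, the exact text plus a little surrounding context) rather than by modifying the HTML.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TextSelector:
    """Hypothetical selector: the annotated text plus surrounding context."""
    prefix: str
    exact: str
    suffix: str


@dataclass(frozen=True)
class Annotation:
    """Hypothetical stand-off annotation; stored separately from the document."""
    target_url: str         # document being annotated
    selector: TextSelector  # which fragment of the document
    term_uri: str           # ontology term attached to the fragment
    created_by: str         # minimal provenance


def locate(text: str, selector: TextSelector) -> Optional[int]:
    """Return the offset of selector.exact in text, or None if not found.

    The prefix and suffix disambiguate repeated occurrences of the same words.
    """
    start = text.find(selector.prefix + selector.exact + selector.suffix)
    return None if start < 0 else start + len(selector.prefix)


document_text = "Mutations in the APP gene are linked to disease."
annotation = Annotation(
    target_url="http://example.org/papers/article.html",
    selector=TextSelector(prefix="in the ", exact="APP", suffix=" gene"),
    term_uri="http://example.org/ontology/APP",
    created_by="http://example.org/users/alice",
)
offset = locate(document_text, annotation.selector)
```

Because the document itself is never touched, the same page can carry any number of independent annotations from different people or services.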

The DOMEO interface
The idea itself is pretty simple: DOMEO is basically a little browser inside the browser. It allows the user to type a URL, open the corresponding document, and annotate it. It is also possible to pass a URL as a parameter so that the tool opens with the page you want to annotate already in the content frame. This option is particularly helpful when integrating the tool with other applications or other sections of the Annotation Framework.
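
As a rough illustration of the URL-as-parameter option, another application could build a launch link along these lines. This is only a sketch: the base address and the parameter name `url` are assumptions of mine, not the actual DOMEO ones. The one detail that matters in any case is that the document URL must be percent-encoded, otherwise its own query string gets mixed up with the tool's.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values: neither the base address nor the parameter name
# is taken from DOMEO itself.
DOMEO_BASE = "http://example.org/domeo/"
URL_PARAM = "url"


def build_launch_url(document_url: str) -> str:
    """Return a link that opens the tool with document_url preloaded."""
    return DOMEO_BASE + "?" + urlencode({URL_PARAM: document_url})


def read_document_url(launch_url: str) -> Optional[str]:
    """Recover the document URL from a launch link, or None if absent."""
    values = parse_qs(urlparse(launch_url).query).get(URL_PARAM)
    return values[0] if values else None


link = build_launch_url("http://example.org/papers/article.html?id=7&view=full")
print(link)
print(read_document_url(link))
```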

Figure 1: A screenshot of DOMEO. The address bar shows the URL of the document displayed below it. The document is rendered the same way it would appear when opened in a new browser window.
Annotation can be performed manually by the user or automatically by text mining or entity recognition services. The two features are available through two buttons in the DOMEO toolbar, labeled 'Annotate' and 'Text Mining' respectively. When 'Text Mining' is selected, the tool lists all the available text mining or entity recognition services. The user can then decide which one or ones to run on the loaded document.

The 'save' button saves the annotation produced. DOMEO supports a complex versioning system that saves items only when necessary, keeps track of the different versions, and records the full provenance data for each of them. I will probably explain the versioning and the provenance models in another post.
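
Until that post exists, the "save only when necessary" idea can be pictured with a toy Python sketch. To be clear, this is not the DOMEO versioning or provenance model, which is more complex; the class names, the fields, and the choice of comparing content hashes are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Version:
    """One stored version with minimal, illustrative provenance."""
    number: int
    content: dict
    saved_by: str
    saved_at: str
    digest: str


@dataclass
class AnnotationHistory:
    """Hypothetical per-annotation history; keeps every stored version."""
    versions: List[Version] = field(default_factory=list)

    def save(self, content: dict, user: str) -> bool:
        """Store a new version only if content changed; return True if stored."""
        # Serialize with sorted keys so equal content always hashes the same.
        digest = hashlib.sha256(
            json.dumps(content, sort_keys=True).encode("utf-8")
        ).hexdigest()
        if self.versions and self.versions[-1].digest == digest:
            return False  # nothing changed since the last save
        self.versions.append(
            Version(
                number=len(self.versions) + 1,
                content=dict(content),
                saved_by=user,
                saved_at=datetime.now(timezone.utc).isoformat(),
                digest=digest,
            )
        )
        return True


history = AnnotationHistory()
history.save({"body": "APP gene"}, "alice")            # stored as version 1
history.save({"body": "APP gene"}, "bob")              # unchanged, skipped
history.save({"body": "APP gene (human)"}, "bob")      # stored as version 2
```

Pressing save repeatedly without editing anything therefore adds no versions, while each real change gets its own record of who saved it and when.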

* The development of DOMEO is managed and carried out by Dr. Paolo Ciccarese. DOMEO is a product of the MIND Informatics group - Mass General Hospital. The tool is developed in parallel with the Annotation Ontology (AO).