Friday, February 18, 2011

Principle: Traceability [1]

According to Wikipedia:
Traceability refers to the completeness of the information about every step in a process chain. 
I've been working on Clinical Information Systems for quite a while, and traceability is a very well-known - even if usually poorly implemented - concept when talking about medical processes and patient data. For instance, for a blood pressure measurement, it is important to know who performed the procedure, where and when. If the notes were written on paper first, we also need to know who wrote down the measurements and when, and eventually who entered the data into the system, where and when. If the information system manages structured data, we might also want to record the language of the operator who entered the data, the templates she used and so on. The main idea is to keep track of the process details and of all the accountable health care professionals. To keep it simple, I deliberately excluded the medical context from the list above - which cuff was used, whether the pressure was measured after a meal or after physical activity - which is crucial for reproducibility but opens up many other representational issues. You would be amazed at how complicated the model for a blood pressure measurement can become.
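To make the example a bit more concrete, here is a minimal sketch in Python of what such a record could look like. Everything in it is an assumption made for illustration: the class names `Step` and `BloodPressureRecord`, the field names, and the sample values are hypothetical and do not come from any real clinical system or standard. It only covers the process chain (who, where, when) and leaves out the medical context discussed above.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Step:
    """One step in the process chain: who did what, where and when (hypothetical model)."""
    action: str                  # e.g. "measured", "written on paper", "entered"
    who: str
    when: datetime
    where: Optional[str] = None  # not every step has a known location


@dataclass
class BloodPressureRecord:
    """A measurement plus the trace of every step that produced it (hypothetical model)."""
    systolic_mmhg: int
    diastolic_mmhg: int
    steps: List[Step] = field(default_factory=list)
    entry_language: Optional[str] = None  # language of the operator who entered the data
    template: Optional[str] = None        # template used for structured entry

    def accountable(self) -> List[str]:
        """Return everyone involved, ordered by their first step in time."""
        seen: List[str] = []
        for step in sorted(self.steps, key=lambda s: s.when):
            if step.who not in seen:
                seen.append(step.who)
        return seen


# Sample data, invented for the example; steps are deliberately out of order.
record = BloodPressureRecord(
    systolic_mmhg=120,
    diastolic_mmhg=80,
    steps=[
        Step("entered", "clerk C", datetime(2011, 2, 18, 11, 0), "front desk"),
        Step("measured", "nurse A", datetime(2011, 2, 18, 9, 30), "ward 3"),
        Step("written on paper", "nurse A", datetime(2011, 2, 18, 9, 35)),
    ],
    entry_language="en",
    template="bp-basic",
)

print(record.accountable())  # ['nurse A', 'clerk C']
```

Even this toy version needs a list of steps rather than a single "author" field, which hints at how quickly the model grows once the medical context is added.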

But what is traceability in Semantic Web terms? I guess one way of saying it is through the term Provenance, very popular these days.
Provenance, from the French provenir, "to come from", means the origin, or the source of something, or the history of the ownership or location of an object. The term was originally mostly used for works of art, but is now used in similar senses in a wide range of fields, including science and computing.
A good alternative definition, more focused on computing and taking process aspects into account, is provided by the W3C Provenance Incubator Group:
Provenance of a resource is a record that describes entities and processes involved in producing and delivering or otherwise influencing that resource. Provenance provides a critical foundation for assessing authenticity, enabling trust, and allowing reproducibility. Provenance assertions are a form of contextual metadata and can themselves become important records with their own provenance.
In my mind, traceability is still a more generic concept than provenance. For instance, I would consider some of the aspects required for reproducibility to be part of traceability and not of provenance. This is because I believe it would be easier to standardize 'where, when and who' (what I consider provenance) than 'what, why, which, how' (the context that, added to provenance, gives traceability), since the latter are domain dependent and can become very hard to define. However, the last definition makes me quite happy, and I would be glad if the Provenance Incubator Group turned into an actual Working Group.
