Saturday, February 26, 2011

Dublin Core and PRISM

As I was saying in one of my previous posts, distinguishing the different kinds of contributions is not trivial. However, sometimes it is necessary, and this is probably the case for publishers that want to keep track of the exact role of each contributor to a resource.

This is the case of PRISM (Publishing Requirements for Industry Standard Metadata), a metadata vocabulary for managing, post-processing, multi-purposing and aggregating content for magazine and journal publishing. PRISM makes it possible to distinguish between different creator roles: writer, editor, composer, speaker, photographer... You can find the full list in The PRISM Controlled Vocabulary Namespace. PRISM also uses parts of the Dublin Core Element Set and Dublin Core Terms; the subset of terms is listed in the document named The PRISM Subset of the Dublin Core Namespace.

Combining DC and PRISM, for instance for a book, gives XML along these lines:
<dc:creator prism:role="writer">John Doe</dc:creator>
<dc:creator prism:role="editor">Paolo Ciccarese</dc:creator>
<dc:creator prism:role="graphicDesigner">Micheal Doe</dc:creator>
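To make the role distinction concrete, here is a minimal Python sketch that reads creators like the ones above and groups them by role. The `<record>` wrapper, the function name `creators_by_role`, and the PRISM namespace URI are my assumptions for illustration (the PRISM namespace changes between versions of the specification), so treat this as a sketch rather than the way PRISM tooling actually works.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The DC URI is the Dublin Core Element Set namespace.
# The PRISM URI is an assumption (a PRISM 2.x basic namespace); check your spec version.
DC = "http://purl.org/dc/elements/1.1/"
PRISM = "http://prismstandard.org/namespaces/basic/2.0/"

# Hypothetical <record> wrapper, added only so the fragment is well-formed XML.
SAMPLE = f"""
<record xmlns:dc="{DC}" xmlns:prism="{PRISM}">
  <dc:creator prism:role="writer">John Doe</dc:creator>
  <dc:creator prism:role="editor">Paolo Ciccarese</dc:creator>
  <dc:creator prism:role="graphicDesigner">Micheal Doe</dc:creator>
</record>
"""

def creators_by_role(xml_text):
    """Return a dict mapping each prism:role value to a list of creator names."""
    root = ET.fromstring(xml_text)
    roles = defaultdict(list)
    for creator in root.iter(f"{{{DC}}}creator"):
        # Creators without a prism:role attribute are grouped under "unspecified".
        role = creator.get(f"{{{PRISM}}}role", "unspecified")
        roles[role].append((creator.text or "").strip())
    return dict(roles)

if __name__ == "__main__":
    for role, names in creators_by_role(SAMPLE).items():
        print(role, "->", ", ".join(names))
```

Note that the attribute values use plain straight quotes; typographic quotes would make the fragment fail to parse.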
In RDF, according to the specifications (paragraph 3.5.2 of the PRISM Subset of the Dublin Core Namespaces: Version 2.1), this would look like:
<dc:creator rdf:resource="contributorrole.xml#writer">
     John Doe
</dc:creator>
<dc:creator rdf:resource="contributorrole.xml#editor">
     Paolo Ciccarese
</dc:creator>
<dc:creator rdf:resource="contributorrole.xml#graphicDesigner">
     Micheal Doe
</dc:creator>
However, this is not valid RDF, for a couple of reasons that you can find yourself through the RDF Validator Service.

The Dublin Core Element Set properties used by PRISM and by the PRISM Aggregator Message (PAM) are: creator, contributor, description, format (PRISM records restrict the values of dc:format to those in the list of Internet Media Types [MIME]), identifier (for instance a DOI), publisher, subject, title and type. Other properties are listed, but not as items of the PAM format: language, relation and source.
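Going back to the dc:creator RDF snippet for a moment: one thing a validator will object to is that, in RDF/XML, a property element carrying rdf:resource has to be empty, so it cannot also wrap a literal such as a name. The sketch below only checks for that single problem; it is not a replacement for the validator. The function name, the rdf:RDF wrapper and the http://example.org/book subject are hypothetical, added just to get a parseable document.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

def resource_elements_with_content(xml_text):
    """List the tags of elements that carry rdf:resource and also have text or children."""
    offenders = []
    for elem in ET.fromstring(xml_text).iter():
        has_resource = f"{{{RDF}}}resource" in elem.attrib
        has_content = bool((elem.text or "").strip()) or len(elem) > 0
        if has_resource and has_content:
            offenders.append(elem.tag)
    return offenders

# Hypothetical wrapper and subject URI around one creator from the snippet.
BAD = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}">
  <rdf:Description rdf:about="http://example.org/book">
    <dc:creator rdf:resource="contributorrole.xml#writer">John Doe</dc:creator>
  </rdf:Description>
</rdf:RDF>
"""

# Same property, but as an empty element that only points at the resource.
OK = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}">
  <rdf:Description rdf:about="http://example.org/book">
    <dc:creator rdf:resource="contributorrole.xml#writer"/>
  </rdf:Description>
</rdf:RDF>
"""

if __name__ == "__main__":
    print(resource_elements_with_content(BAD))
    print(resource_elements_with_content(OK))
```

The empty form parses cleanly, but it then says the creator is the role resource itself, which is clearly not what was meant; the name and the role need to be modelled separately.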

For instance, this is how PRISM can deal with identifiers in RDF:
<dc:identifier>10.1030/03054</dc:identifier>
<prism:doi>http://dx.doi.org/10.1030/03054</prism:doi>
<prism:url rdf:resource="http://dx.doi.org/10.1030/03054"/>
Basically, besides using dc:identifier, PRISM uses the properties prism:doi, which declares more explicitly than dc:identifier what the identifier is, and prism:url. Strangely enough, prism:doi actually takes as its value the DOI proxy URL and not the DOI string. Therefore, I see prism:doi and prism:url as redundant properties. You can find some more details in this old blog post by Tony Hammond.
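The redundancy is easy to see with a few lines of Python. This is only a sketch over the three values from the example above: the dictionary layout and the helper name `doi_from_proxy_url` are mine, not part of PRISM.

```python
DOI_PROXY = "http://dx.doi.org/"

def doi_from_proxy_url(url):
    """Return the bare DOI string if url starts with the DOI proxy prefix, else None."""
    if url.startswith(DOI_PROXY):
        return url[len(DOI_PROXY):]
    return None

# The three identifier values from the example, keyed by property name.
record = {
    "dc:identifier": "10.1030/03054",
    "prism:doi": "http://dx.doi.org/10.1030/03054",
    "prism:url": "http://dx.doi.org/10.1030/03054",
}

if __name__ == "__main__":
    # prism:doi carries the same value as prism:url ...
    print(record["prism:doi"] == record["prism:url"])
    # ... and the bare DOI in dc:identifier can be recovered from either of them.
    print(doi_from_proxy_url(record["prism:doi"]) == record["dc:identifier"])
```

In other words, in this example the DOI string plus the proxy prefix is all the information there is, and the three properties repeat it.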

Moreover, PRISM PAM also makes use of the Dublin Core Terms dct:hasPart and dct:isPartOf, for instance to identify images that are part of a document:
<dcterms:hasPart rdf:resource="http://www.myexamples.com/ExamplePhoto.jpg"/>
