Machine Readable Rights Statements

Leigh Dodds

As the open data movement continues to mature, discussion is focusing on how the publishing and use of open data can become more sustainable. This includes making the process of opening up data clear and simple for any organisation.

The recent launch of the Open Data Certificates sees the ODI begin to provide tools that support best practices in this area. The certificates provide guidance to publishers, helping them to identify a variety of issues to be considered when releasing data, whilst also giving re-users some assurance as to the quality of open datasets.

Data published to the web should always be accompanied by machine-readable metadata describing all aspects of the dataset, including its content, origin, publication schedule and, importantly, clear licensing. A clear statement of re-use rights can ensure that consumers fully understand both how a dataset can be re-used and any obligations that they may incur through that usage.

Through the Open Data Certificates, the ODI will be encouraging data publishers to publish machine-readable metadata. In fact, doing so is a key criterion for reaching each level of certification.

There are several existing efforts to help standardise dataset metadata, including the W3C Data Catalog Vocabulary (DCAT). Publishers should use these standards to help describe their data.

However, in a few cases there are gaps in existing standards that merit further work. This is particularly true in the case of publishing machine-readable rights information.

At the ODI we are currently working on a new vocabulary to support the publication of “Open Data Rights Statements”. The vocabulary builds upon and extends the Dublin Core and Creative Commons vocabularies to support the description of Rights Statements that may include:

  • A reference to a license for the dataset
  • A reference to a content license that applies to copyrightable parts of a dataset – an important piece of metadata in jurisdictions that recognise database rights
  • Copyright notices
  • Attribution metadata to support re-users in acknowledging their sources
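
As a rough illustration, the elements listed above could be gathered into a single JSON-serialisable record. This is only a minimal sketch in Python: the key names (`dataLicense`, `contentLicense`, `copyrightNotice`, `attributionText`, `attributionURL`), the helper `build_rights_statement`, and the example publisher and dataset URL are assumptions made for illustration, not the terms defined by the draft vocabulary.

```python
import json


def build_rights_statement(data_license, content_license=None,
                           copyright_notice=None,
                           attribution_text=None, attribution_url=None):
    """Collect the parts of a rights statement into a plain dict.

    Only the dataset licence is required here; every other element is
    optional and is left out of the result when not supplied.
    All key names are hypothetical, chosen for this example only.
    """
    statement = {"dataLicense": data_license}
    optional = {
        "contentLicense": content_license,
        "copyrightNotice": copyright_notice,
        "attributionText": attribution_text,
        "attributionURL": attribution_url,
    }
    statement.update({k: v for k, v in optional.items() if v is not None})
    return statement


# Hypothetical publisher and dataset, used purely as sample values.
rights = build_rights_statement(
    data_license="http://opendatacommons.org/licenses/odbl/1.0/",
    content_license="https://creativecommons.org/licenses/by/4.0/",
    copyright_notice="Copyright Example Publisher",
    attribution_text="Example Publisher",
    attribution_url="http://example.org/dataset",
)

# Serialise so the statement can sit alongside other JSON metadata.
rights_json = json.dumps(rights, indent=2, sort_keys=True)
print(rights_json)
```

Keeping the dataset licence and the content licence as separate fields mirrors the distinction drawn in the list: the two can differ where database rights and copyright apply to different parts of a dataset.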

Alongside the vocabulary we have drafted guides for both publishers and re-users that will describe:

  • How to publish machine-readable rights statements as RDFa, Linked Data, or as additions to existing JSON or XML formats
  • How to link to rights statements from both web pages and APIs
  • Guidance for developers on how to apply this metadata to build attribution and citation links
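
To give a flavour of the developer-facing side, the sketch below shows how a re-user might turn attribution metadata into an HTML attribution link. It is an illustrative assumption rather than the approach taken in the guides: the function `attribution_link` and the keys `attributionText` and `attributionURL` are hypothetical names, and the input is assumed to be a rights statement already parsed into a Python dict.

```python
from html import escape


def attribution_link(statement):
    """Build an HTML attribution snippet from a rights statement dict.

    Returns an empty string when no attribution text is available,
    plain escaped text when there is no URL to link to, and an
    anchor element otherwise. Key names are hypothetical.
    """
    text = statement.get("attributionText")
    url = statement.get("attributionURL")
    if not text:
        return ""
    if not url:
        return escape(text)
    # Escape both values so publisher-supplied metadata cannot inject markup.
    return '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(text))


# Sample values only; not taken from a real dataset.
example = {
    "attributionText": "Example & Co",
    "attributionURL": "http://example.org/data",
}
print(attribution_link(example))
```

Escaping matters here because the attribution text and URL come from third-party metadata and end up embedded in the re-user's own pages.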

The guides are still a work in progress but can be viewed in our GitHub repository. We would love to hear your feedback on any aspect of the work. Please submit issues and comments on GitHub.

Once complete, these guides will form part of a broader set of guidance from the ODI on how to publish standards-compliant dataset metadata.