Identifiers
Persistence of Identifiers
For the Archives Hub we take data in from many different sources, and the data can be quite variable. Therefore, we have to implement a system that is as effective as possible within this changing environment.
We cannot guarantee persistence of identifiers, because we do not generate and maintain the data. But we can make every effort to ensure persistence, so that collections are effectively discoverable over time. Persistence is usually maintained if descriptions are not revised. When a description is revised, however, it can be hard to identify it clearly as representing the same archive as the previous description. Whether we can do so depends on the stability of the references used for the collections, combined with any other changes made to the descriptions during revision.
Identifiers on the Hub
Our intention is to provide a URI, or identifier, for all archives described on the Hub, including collections and components.
We have two means of identification - an 'opaque URI' in the Web browser and a 'direct link' within the description to enable you to reference the description:
Direct Link: https://archiveshub.jisc.ac.uk/data/gb3401-flk
Opaque identifier: https://archiveshub.jisc.ac.uk/search/archives/8ad61472-5ec9-3dc8-896b-2fa6ceb7ea86
These two identifiers are linked within our system. If you enter the 'gb3401-flk' link into a web browser, it will resolve to the opaque identifier and give you the correct description.
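As a rough illustration, the two kinds of link can be told apart by their paths. The sketch below is not the Hub's own code: the function name classify_hub_link is hypothetical, and the path patterns (/data/ followed by a reference, /search/archives/ followed by a UUID-style string) are assumptions inferred only from the two examples above.

```python
import uuid
from urllib.parse import urlsplit


def classify_hub_link(url):
    """Classify a link as ("direct", reference), ("opaque", uuid) or ("unknown", None).

    Assumption: the path layouts are inferred from the two example links
    on this page, not taken from any published specification.
    """
    parts = [p for p in urlsplit(url).path.split("/") if p]
    # Direct link: /data/<reference>
    if len(parts) == 2 and parts[0] == "data":
        return ("direct", parts[1])
    # Opaque identifier: /search/archives/<UUID-style string>
    if len(parts) == 3 and parts[:2] == ["search", "archives"]:
        try:
            return ("opaque", str(uuid.UUID(parts[2])))
        except ValueError:
            pass  # last segment is not a well-formed UUID string
    return ("unknown", None)


print(classify_hub_link("https://archiveshub.jisc.ac.uk/data/gb3401-flk"))
# ('direct', 'gb3401-flk')
```

The function only inspects the text of the link; it does not contact the Hub or check that a description exists.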
The advantage of using opaque identifiers is that they do not contain the reference. If this particular collection moved to a different repository, or was re-catalogued or combined with another archive, its reference may change.
The Importance of Unique References
The reference for each component of a description must be unique, and we have a test within our data processing to check for this. If two descriptions have the same reference, their URIs will be the same, so the newer description will overwrite the older one, which will then no longer be available to researchers.
Please be careful when cutting and pasting at lower levels of description. If you submit descriptions with duplicate references we will not be able to process them and we will ask you to modify the references.
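If you want to check your own references before submitting, a check along these lines may help. It is a minimal sketch and not the test used in the Hub's data processing: the function name find_duplicate_references is hypothetical, the sample references are made up, and it assumes that references differing only in letter case or spaces should be treated as the same, since those differences disappear when the identifier is created.

```python
from collections import Counter


def find_duplicate_references(references):
    """Return the normalised references that occur more than once.

    Assumption: references are compared after removing spaces and
    converting to lower case, mirroring how identifiers are formed.
    """
    counts = Counter(ref.replace(" ", "").lower() for ref in references)
    return sorted(ref for ref, n in counts.items() if n > 1)


# Made-up component references, with one pasted duplicate
sample = ["SK/1/1", "SK/1/2", "sk/1/1", "SK/1/3"]
print(find_duplicate_references(sample))
# ['sk/1/1']
```

An empty list means no duplicates were found under this assumption.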
The Anatomy of References
The reference consists of three elements:
- the country code (GB)
- the ARCHON repository code (between 1 and 4 digits, without leading zeroes, e.g. 532)
- the local reference code
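The rule for the repository code given in the list above can be expressed as a simple pattern. This is a sketch only: the name is_valid_repository_code is hypothetical, and it checks the shape of the code (1 to 4 digits, no leading zero), not whether the code belongs to a real repository.

```python
import re

# 1 to 4 digits, the first of which is not zero
REPOSITORY_CODE = re.compile(r"[1-9][0-9]{0,3}")


def is_valid_repository_code(code):
    """Check the shape of a repository code given as a string."""
    return REPOSITORY_CODE.fullmatch(code) is not None


print(is_valid_repository_code("532"))   # True
print(is_valid_repository_code("0532"))  # False (leading zero)
```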
When the reference is used to create the persistent link, both the country code and the local reference are converted to lower case. A hyphen separates the combined country and repository code from the local reference:
University of the Arts London, Stanley Kubrick Archive
ARCHON: 3184
Collection Reference: SK
Full Reference: gb3184-sk
Direct Link: https://archiveshub.jisc.ac.uk/data/gb3184-sk
URI: https://archiveshub.jisc.ac.uk/search/archives/d650ae34-1958-3530-92f7-f68410cd0dc2
If the local reference code contains spaces, these are removed from the identifier and the URI:
British Library, Western Manuscripts: Mervyn Peake Archive
ARCHON: 58
Collection Reference: Add MS 88931
Full Reference: gb58-addms88931
Direct Link: https://archiveshub.jisc.ac.uk/data/gb58-addms88931
URI: https://archiveshub.jisc.ac.uk/search/archives/bcbd0936-29d1-3efc-adbf-181ed24dd947
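The two examples above follow the same steps, which can be sketched as below. The function names full_reference and direct_link are hypothetical, and the sketch covers only the rules stated on this page (lower case, spaces removed, one hyphen). How other characters in a local reference are handled is not described here, so they are left unchanged.

```python
DIRECT_LINK_BASE = "https://archiveshub.jisc.ac.uk/data/"


def full_reference(country_code, repository_code, local_reference):
    """Combine the three elements into the full reference."""
    # Spaces are removed and the local reference is lower-cased
    local = local_reference.replace(" ", "").lower()
    # Country code is lower-cased; a hyphen precedes the local reference
    return f"{country_code.lower()}{repository_code}-{local}"


def direct_link(country_code, repository_code, local_reference):
    """Build the direct link from the three elements."""
    return DIRECT_LINK_BASE + full_reference(
        country_code, repository_code, local_reference
    )


print(full_reference("GB", "3184", "SK"))
# gb3184-sk
print(direct_link("GB", "58", "Add MS 88931"))
# https://archiveshub.jisc.ac.uk/data/gb58-addms88931
```

The opaque URI cannot be derived in this way, as this page does not describe how it is generated.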
The ability to uniquely identify a resource over the Web provides for long term preservation, dissemination and access.
Identifiers on the Hub are URIs, or Uniform Resource Identifiers. URIs can be defined as short strings that identify resources. URLs are Uniform Resource Locators, a subset of URIs that use the Web protocols to provide a location for a resource.
The Hub uses URIs, which also act as URLs, to provide a means to both identify the resource and locate it on the Web.
"It is well-known that Internet resources tend to have a short life; their identification and persistent location pose complex problems that affect many technological and organizational issues involving the citation, retrieval and preservation of cultural/scientific resources. This is by no means technical problem alone: persistent digital object identification, including texts, music, video, still images, scientific documents and the like, is still a major issue that prevents the use of today’s Internet as a trustworthy platform for the research and dissemination of scientific and cultural content."
Digital Preservation Europe: Persistent Identifiers for Cultural Heritage
(http://www.digitalpreservationeurope.eu)