The Archives Hub strives to maintain a service that is interoperable with other services. We want to ensure flexible access to the content, in order to bring archives to a wide audience. We do this by providing various interfaces to the Hub, by using international and national standards for format and content, and by working with other bodies and services to promote best practice and encourage data sharing.
Archives Hub descriptions are created according to the General International Standard Archival Description, ISAD(G). This is the international standard for archival description and underpins most archival finding aids. Index terms created for descriptions must follow recognised rules or be taken from recognised sources (e.g. NCA Rules, AACR2, UKAT). The Hub does not prescribe which rules or sources should be used, as long as they are specified.
The format used to store the descriptions is Encoded Archival Description (EAD). Descriptions may be at collection level or they may be multi-level, down to the level of the individual item. Hub contributors are responsible for creating and submitting descriptions for inclusion on the Hub.
Most people use a web browser to access the Archives Hub content. The descriptions are stored as EAD and transformed into HTML on the fly so that they can be viewed in a browser. For machine access, the Hub also supports the SRU and Z39.50 search protocols, and OAI-PMH, which is used to harvest descriptions.
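As an illustration of what machine access over these protocols looks like, the sketch below builds an SRU searchRetrieve request and an OAI-PMH ListRecords request. The parameter names come from the SRU and OAI-PMH specifications; the base URLs are placeholders, not the Hub's real endpoints, and the CQL query and metadata prefix are assumptions chosen for the example.

```python
from urllib.parse import urlencode

# Placeholder endpoints (assumptions): substitute the service's real base URLs.
SRU_BASE = "https://example.org/sru"
OAI_BASE = "https://example.org/oai"


def sru_search_url(cql_query, max_records=10, start_record=1):
    """Build an SRU searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": max_records,
    }
    return SRU_BASE + "?" + urlencode(params)


def oai_list_records_url(metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords URL.

    In OAI-PMH, resumptionToken is an exclusive argument, so when it is
    given it replaces metadataPrefix rather than accompanying it.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return OAI_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    print(sru_search_url("railway"))
    print(oai_list_records_url())
```

A client would fetch these URLs and parse the XML responses; a harvester would keep requesting with each returned resumption token until none is supplied.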
The Archives Hub uses a distributed model. Most of the descriptions provided by our contributors are stored centrally, at Mimas, but a number of contributors host their own data; these contributors are commonly referred to as Spokes. Each Spoke institution has an administrative interface that enables it to create descriptions and submit them to its own database, and the central Hub harvests the indexes that are created. Each Spoke also has its own customisable web interface for searching its own descriptions.
Each Archives Hub contributor has its own record store, from which indexes are created. SRU is used to harvest all of these index terms and create a single database for each contributor. An XML document summarising each database is created, and all of these summary XML documents are pulled together to create a further, central record store. This central record store is indexed to create what may be termed meta-indexes. When a search is conducted, the meta-indexes identify which contributors’ data include the search term(s).
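The role of the meta-indexes can be pictured with a small in-memory sketch. This is not the Cheshire implementation: the contributor names, the dictionary-based summaries and the choice to require every search term (rather than any term) are assumptions made for illustration.

```python
from collections import defaultdict


def build_meta_index(summaries):
    """Map each index term to the set of contributors whose data contain it.

    summaries: dict of contributor name -> iterable of harvested index terms
    (a stand-in for the per-contributor summary XML documents).
    """
    meta = defaultdict(set)
    for contributor, terms in summaries.items():
        for term in terms:
            meta[term.lower()].add(contributor)
    return meta


def contributors_for(meta, search_terms):
    """Return the contributors whose data include every search term."""
    matches = [meta.get(term.lower(), set()) for term in search_terms]
    if not matches:
        return set()
    return set.intersection(*matches)


if __name__ == "__main__":
    # Hypothetical contributors and terms.
    summaries = {
        "spoke_a": ["railway", "Liverpool"],
        "spoke_b": ["railway", "shipping"],
    }
    meta = build_meta_index(summaries)
    print(sorted(contributors_for(meta, ["railway"])))
    print(sorted(contributors_for(meta, ["railway", "liverpool"])))
```

Only the contributors returned by the lookup would then need to be searched in full, which is the point of holding a summary centrally.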
The list of results that a user sees when they run a search merges the results from all contributors. Results are always ranked by the probability that they will be useful, based on the search terms provided. No contributor is given preferential treatment in the rankings, although a user can choose to see results from a single contributor only. When an individual description is viewed, the Hub software makes no distinction between centrally held and distributed data; all descriptions display in exactly the same way.
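Merging ranked lists from several contributors into one ranking can be sketched as follows. This is a minimal illustration, assuming each contributor returns hits already sorted by a relevance score that is comparable across contributors; the scores, titles and contributor names are invented, and the Hub's actual ranking is not described here.

```python
import heapq


def merge_ranked(results_by_contributor):
    """Merge per-contributor hit lists into one list ordered by score.

    results_by_contributor: dict of contributor -> list of (score, title),
    each list already sorted from highest to lowest score.
    Returns a list of (score, contributor, title) tuples.
    """
    streams = [
        [(score, contributor, title) for score, title in hits]
        for contributor, hits in results_by_contributor.items()
    ]
    # Ordering depends only on the score, never on who supplied the hit.
    return list(heapq.merge(*streams, key=lambda hit: hit[0], reverse=True))


if __name__ == "__main__":
    merged = merge_ranked({
        "central": [(0.9, "A"), (0.4, "B")],
        "spoke": [(0.7, "C"), (0.1, "D")],
    })
    for score, contributor, title in merged:
        print(score, contributor, title)
```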
The Hub uses the Cheshire information retrieval system. This provides us with a Web interface as well as Z39.50 and SRU (Search/Retrieve via URL) machine interfaces and the ability to harvest descriptions using OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting).
The Cheshire information retrieval system has been developed through a partnership between the University of Liverpool and the University of California, Berkeley, where Cheshire was conceived. The Hub uses Cheshire with an additional Cheshire for Archives layer, software that has been specifically designed for use with EAD. Indexes are provided for title, creator, date, record identifier, scope and content, full text, people, places and subjects, which together support sophisticated search and browse functionality.
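To show how some of these indexes relate to the underlying EAD, the sketch below pulls a few indexable fields out of a small collection-level description. It is not how Cheshire for Archives builds its indexes: the sample record is invented, and the mapping from index name to EAD element (for example, creator to origination) is an assumption based on standard EAD 2002 element names.

```python
import xml.etree.ElementTree as ET

# Invented sample record using EAD 2002 element names (no namespace).
SAMPLE_EAD = """<ead>
  <archdesc level="collection">
    <did>
      <unitid>GB 000 EXAMPLE</unitid>
      <unittitle>Example Family Papers</unittitle>
      <unitdate>1850-1900</unitdate>
      <origination>Example family</origination>
    </did>
    <scopecontent><p>Letters and diaries.</p></scopecontent>
  </archdesc>
</ead>"""

# Assumed mapping from index name to an element path below <ead>.
FIELD_PATHS = {
    "title": "archdesc/did/unittitle",
    "creator": "archdesc/did/origination",
    "date": "archdesc/did/unitdate",
    "identifier": "archdesc/did/unitid",
    "scope_and_content": "archdesc/scopecontent",
}


def extract_index_fields(ead_xml):
    """Return a dict of index name -> whitespace-normalised text."""
    root = ET.fromstring(ead_xml)
    fields = {}
    for name, path in FIELD_PATHS.items():
        element = root.find(path)
        if element is not None:
            text = "".join(element.itertext())
            fields[name] = " ".join(text.split())
    return fields


if __name__ == "__main__":
    for name, value in extract_index_fields(SAMPLE_EAD).items():
        print(name, "=", value)
```

A multi-level description would repeat this extraction for each component within the hierarchy, which is omitted here for brevity.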