See also our page on linking for:
This can be resolved very easily from the admin interface:
1. Check that duplicate files do not exist in the data directory - uploaded files are given a unique filename to avoid accidental overwrites, meaning that older versions of the same files must be manually removed.
2. Rebuild the database
You may find that you have display problems if you include empty fields within your EAD files. It is good practice to delete any field tags where you are not intending to add content.
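As a rough aid for spotting empty field tags before upload, the following is a minimal Python sketch, not part of the Spoke software. It assumes the EAD file is well-formed XML without a namespace; the function name and the short list of elements that are allowed to be empty are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Elements assumed to be legitimately empty in EAD (illustrative, not exhaustive).
ALLOWED_EMPTY = {"lb", "ptr", "extptr"}

def find_empty_elements(xml_text):
    """Return the tag names of elements that have no text and no child elements."""
    root = ET.fromstring(xml_text)
    empty = []
    for elem in root.iter():
        has_children = len(elem) > 0
        has_text = bool(elem.text and elem.text.strip())
        if not has_children and not has_text and elem.tag not in ALLOWED_EMPTY:
            empty.append(elem.tag)
    return empty

sample = """<ead><archdesc><did>
<unittitle>Papers of a botanist</unittitle>
<unitdate></unitdate>
</did><scopecontent><p/></scopecontent></archdesc></ead>"""

print(find_empty_elements(sample))  # ['unitdate', 'p']
```

Any tag reported here is a candidate for deletion if you do not intend to add content to it.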
You need to have a reasonable number of records in your database for the Subject Finder to work.
Rebuild the subject cluster database from the command line by doing the following:
cd /home/cheshire/cheshire3/cheshire3/dbs/ead
./run.py -cluster
You need to comment out the relevant section of a file called template.ssi, which can be found in the /home/cheshire/cheshire3/cheshire3/www/ead/html directory. Use the <!-- angled brackets with dashes --> to comment the section out, for example:
<!--<a href="/ead/subject.html" title="Find Subjects" class="navlink">Find Subjects</a> |-->
After this, go up one directory and run the build custom pages command:
./buildCustomPages.py
This happens when the software cannot retrieve the temporary results set created by your search. Result sets 'expire' after one hour of inactivity (inactivity of the result set - not the computer). So this error will occur when you have carried out a search, left the results on the screen for some time, and then returned to the result over an hour later. The solution is simply to resubmit your original search.
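The expiry behaviour can be pictured with a small Python sketch. This is a toy model only: the class and method names are hypothetical and do not come from the Spoke software, and it assumes that each access to a result set resets its one-hour timer.

```python
import time

EXPIRY_SECONDS = 60 * 60  # one hour of inactivity, as described above

class ResultSetStore:
    """Toy in-memory store; hypothetical names, not the Spoke software's API."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._sets = {}  # result set id -> (records, time of last access)

    def save(self, set_id, records):
        self._sets[set_id] = (records, self._clock())

    def fetch(self, set_id):
        entry = self._sets.get(set_id)
        if entry is None:
            return None
        records, last_access = entry
        if self._clock() - last_access > EXPIRY_SECONDS:
            del self._sets[set_id]
            return None  # expired: the search has to be resubmitted
        # Assumption: using the result set counts as activity and resets the timer.
        self._sets[set_id] = (records, self._clock())
        return records
```

In this model a result set that is paged through every few minutes never expires, while one left untouched for more than an hour is gone the next time it is requested.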
This may be due to a number of reasons. The purple python pages indicate that something is wrong with the underlying systems.
Please send us a bug report with details of the error and include the error text that is at the bottom of the Python error page (if unsure, please send a screenshot of the whole error message).
Some Spoke administrators like to allow their users to perform a search for particular subjects by providing them with a link which will search the Spoke and bring back all relevant records. The underlying database of your Spoke uses the Common Query Language (CQL), which is a standard way of formulating a search. There is some background on this at http://zing.z3950.org/cql/intro.html.
The basic CQL syntax for a 'general keyword' search on a single word ('biology' in this example) is:
dc.description all/relevant "biology"
This would search in the title, controlled access and scope and content fields for the word biology. You can choose any of the indexes supported by the Spokes software as the one to be searched. The table below shows the names of the various indexes and the fields that they search:
| Index name | Field(s) searched |
|---|---|
| cql.anywhere | full text |
| dc.description | unittitle, controlaccess, and scopecontent fields |
| dc.title | collection title (titleproper) |
| dc.creator | creator of the collection |
| dc.identifier | eadid |
| dc.subject | subjects |
| bath.name | personal, family, corporate and geographic names |
| bath.personalName | personal names |
| bath.corporateName | corporate names |
| bath.geographicName | geographic names |
| bath.genreForm | genre |
You need to append the CQL query to the base URL for a search on your Spoke. This is in the following form, where myspokeURL is the URL for your spoke, such as archiveshub.man.ac.uk:
http://myspokeURL/ead/search/?operation=search&query=
You also need to treat some of the search characters in a particular way, using URL escape codes:
ampersand is &amp; (when the link is written into an HTML page)
forward slash is %2F
quotation mark is %22
space is +
So the complete example for the query above would be:
http://myspokeURL/ead/search/?operation=search&query=dc.description+all%2Frelevant+%22biology%22
Other examples:
This is how you would search for the word 'literature' in the subject field:
http://myspokeURL/ead/search/?operation=search&query=dc.subject+all%2Frelevant+%22literature%22
For the more complex queries, such as a subject search for both history and russia, you would put:
http://myspokeURL/ead/search/?operation=search&query=dc.subject+all%2Frelevant+%22history+russia%22
This CQL query is really dc.subject all/relevant "history russia", but is using %2F for the forward slash, + for the space and %22 for the quotation marks. If you wanted to look for either word, rather than both, it would be dc.subject any/relevant "history russia". The 'relevant' part makes sure that the most likely records come back first, otherwise they will just come back in the order that they appear in the database.
For an exact match the 'all' or 'any' is replaced by an equals sign:
http://myspokeURL/ead/search/?operation=search&query=dc.subject+=%2Frelevant+%22photographs+russia%22
To search for a personal name, such as Winston Churchill:
http://myspokeURL/ead/search/?operation=search&query=bath.personalName+=%2Frelevant+%22churchill+winston%22
This will bring back records exactly matching "churchill winston" (the all has been replaced by an equals sign) in the personal name field.
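The escaping rules above can also be applied programmatically. The following Python sketch uses the standard library's `urllib.parse.quote_plus`; the function name `build_search_url` is an illustrative assumption, and the equals sign is deliberately left unescaped so that the output matches the exact-match examples above.

```python
from urllib.parse import quote_plus

def build_search_url(spoke_host, cql_query):
    """Return a search link for the given Spoke host and CQL query string."""
    # quote_plus turns spaces into '+', '/' into %2F and '"' into %22.
    # '=' is kept literal (safe="=") to mirror the examples in this FAQ.
    encoded = quote_plus(cql_query, safe="=")
    return "http://" + spoke_host + "/ead/search/?operation=search&query=" + encoded

print(build_search_url("myspokeURL", 'dc.description all/relevant "biology"'))
print(build_search_url("myspokeURL", 'bath.personalName =/relevant "churchill winston"'))
```

The two calls reproduce the 'biology' and 'churchill winston' links shown above; replace myspokeURL with the address of your own Spoke.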
Yes, you can remove this box by making a simple change in the localConfig.py file:
Open localConfig.py in a text editor such as emacs and change the switch display_splash_screen_popup=True to display_splash_screen_popup=False. Save and exit the file, then log in as the superuser and restart the Web server:
su - root
/etc/init.d/httpd restart
Remember to exit superuser status once you have done this.
You may get the following error:
"/home/cheshire/cheshire3/install/lib/python2.5/site-packages/mod_python/importer.py", line 1155, in _execute_target
assert (type(result) == types.IntType), _result_warning % type(result)
AssertionError: Handler has returned result or raised SERVER_RETURN exception with argument having non integer type. Type of value returned was <type 'NoneType'>, whereas expected <type 'int'>
If this error also includes a reference to 'verifyDatabases' then it may be worth recreating the db_ead database. To do this, type the following commands from the command line to delete and recreate the db_ead database:
dropdb db_ead
then
createdb db_ead
This is a problem that may happen using the Firefox browser (it has not been reported using any other browser). It is therefore not a data or software problem, but a browser issue. You tend to get a blank screen, and when you refresh you get a pop-up window warning that the page contains POSTDATA and asking you to click OK to resend. This makes sense when there are security issues, but it is not currently possible to disable it for selected sites. It has been reported to Mozilla as a bug and may be corrected in future versions of the browser.
Not all indexes store proximity information (which is required for highlighting), as this takes a lot of processing time and storage space. It is not efficient to store it for indexes that search for terms in very specific places (e.g. access terms, creator).
You can change the relevance rankings from percentages to stars by editing the localConfig.py file, found in the /home/cheshire/cheshire3/cheshire3/www/ead directory. There is a line in this file (line 30) which says 'graphical_relevance = False'. The 'False' can be changed to 'True' to replace the percentage ranking with stars. You then need to run the ./buildCustomPages.py command for this to take effect.
If you would like to remove relevance ranking altogether, make the following change to localConfig.py:
Navigate to around line 38, where there is the setting 'display_relevance = True'. Simply change this to False to stop the display of the relevance ranking symbols.
You'll need to restart the web server for any changes in localConfig.py to take effect. Do this by logging into the command line as superuser (root) and entering the following command:
su - root
/etc/init.d/httpd restart
exit
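If you prefer to script this kind of True/False change rather than edit the file by hand, the following Python sketch shows one way to do it. It is only an illustration: the helper name is hypothetical, it works on the text of the file rather than the file itself, and it assumes each setting sits on its own line in the form shown above. You would still need to restart the web server afterwards.

```python
import re

def set_boolean_setting(config_text, name, value):
    """Return config_text with the line 'name = True/False' set to the given value."""
    pattern = re.compile(
        r"^([ \t]*" + re.escape(name) + r"[ \t]*=[ \t]*)(True|False)\b",
        re.MULTILINE,
    )
    new_text, count = pattern.subn(lambda m: m.group(1) + str(bool(value)), config_text)
    if count == 0:
        raise KeyError(name + " not found")
    return new_text

sample = "graphical_relevance = False\ndisplay_relevance = True\n"
updated = set_boolean_setting(sample, "display_relevance", False)
print(updated)
```

The same helper could be applied to graphical_relevance or display_splash_screen_popup, since all three are simple True/False switches in localConfig.py.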
There are certain changes that you can make to your stylesheet that will change the interface display. We would advise that someone familiar with HTML should make the alterations. The Hub team can provide further advice on this.
You can change the display of the date in the sub-fonds so that the brackets are removed and the date is on a separate line from the title. To do this, you need to edit the stylesheet, called html-common.xsl in the folder /home/cheshire/cheshire3/cheshire3/dbs/ead/xsl. Open html-common.xsl in an editor and navigate to around line 761, where the style information for the unitdate in the sub-fonds is displayed. You can leave the original script in the stylesheet but just comment it out using <!-- at the beginning and --> at the end of the section you want to omit:
<!-- <h2><xsl:apply-templates select="did/unittitle"/>
<xsl:text>(</xsl:text>
<xsl:choose>
<xsl:when test="did/unitdate">
<xsl:value-of select="did/unitdate"/>
</xsl:when>
<xsl:otherwise>
<xsl:text>undated</xsl:text>
</xsl:otherwise>
</xsl:choose>
<xsl:text>)</xsl:text>
<xsl:if test="did/origination">
<xsl:text> - </xsl:text>
<xsl:apply-templates select="did/origination"/>
</xsl:if> >
</h2>
-->
Enter the following style information:
<h2>
<xsl:apply-templates select ="did/unittitle"/>
</h2>
<xsl:if test="did/unitdate">
<h3>
<xsl:apply-templates select="did/unitdate"/>
</h3>
</xsl:if>
<xsl:if test="did/origination">
<h3>
<xsl:apply-templates select="did/origination"/>
</h3>
</xsl:if>
By making adjustments to the stylesheet, html-common.xsl, you can comment out a small section in order to prevent the display of data. For example, you may wish to ensure that the data contained within the <origination> tag does not display at sub-fonds level. The example below shows the relevant section, with the part referring to origination commented out. You need to ensure that you are within the <Component> section of the stylesheet in order to apply this to the sub-fonds only:
<h2>
<xsl:apply-templates select ="did/unittitle"/>
<xsl:text>(</xsl:text>
<xsl:choose>
<xsl:when test="did//unitdate">
<xsl:value-of select="did//unitdate"/>
</xsl:when>
<xsl:otherwise>
<xsl:text>undated</xsl:text>
</xsl:otherwise>
</xsl:choose>
<xsl:text>)</xsl:text>
<!-- <xsl:if test="did/origination">
<xsl:text> - </xsl:text>
<xsl:apply-templates select="did/origination"/>
</xsl:if> --> </h2>
It is a good idea to delete cached HTML to see the changes in full-text display. You might also need to restart httpd to remove the old XSLT from the search handler's memory.
It may be that the name of the Spoke is wrapping over two lines and forcing the search options off the screen.
There are three options to deal with this:
a) Making the name shorter in the localConfig.py file in /home/cheshire/cheshire3/cheshire3/www/ead/
b) Editing the size of the <h1> tag in the style.css file (/home/cheshire/cheshire3/install/htdocs/ead/css)
c) Changing the height of the banner <div>, which is set to 8em in the struc-all.css file in the same location (this is probably the most effective option).
If you choose option (c), it would be worth looking for all occurrences of the size 8em in the file and setting them all to the new value, to avoid the banner covering up the more vital features of the page, such as parts of the search form.
Another possibility is that you are using a large graphic for your logo, so it would be worth replacing this with a smaller alternative.
To do this, you need to navigate to the index.html page and open it in a text editor. The page can be found at /home/cheshire/cheshire3/cheshire3/www/ead/html. You need to remove or comment out the relevant section (comment out using <!-- at the beginning and --> at the end of the section):
<!-- <p>
<label for="noComponents" title="Only display hits for collections, NOT individual items.">
Only display collections:
</label>
<input type="checkbox" name="noComponents" id="noComponents"/>
<a href="/ead/help.html#nocomponents" title="What is this?">
<img src="/images/whatisthis.gif" alt="[What is this?]"/>
</a>
</p> -->
Then navigate to the ead directory above this (cd ..) and enter the command to build the Custom Pages:
./buildCustomPages.py
You should then find that you have removed the text from your homepage.
You can make changes to most of the text on the search pages by editing the relevant html document. For the main search page you need to edit index.html; for the browse page you need to edit browse.html; for the Subject Finder the page to edit is subject.html. You can find these pages at /home/cheshire/cheshire3/cheshire3/www/ead/html. Once you have made the required changes to the page, save it, navigate to the ead directory directly above it (enter the command cd ..) and then enter the command to build the Custom Pages:
./buildCustomPages.py
You should then find that your changes have been made to the live Web interface.
This may be due to problematic characters in the content. If any of the content has been cut and pasted from word processing software, the file may include smart quotes. These can be difficult to spot. You can try opening the file in WordPad as this will show up any strange characters in amongst the data, e.g. <unitdate label="Dates:">
You can also try removing the record, by using the 'unindex and delete' command, deleting the cached HTML and then reloading and indexing.
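As an alternative to inspecting the file by eye, a short Python sketch such as the following can report where smart quotes and similar characters occur. The list of suspect characters and the function name are assumptions for illustration; extend the list to suit your own data.

```python
# Characters commonly introduced by word processors, with plain replacements.
SUSPECT = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # long dashes
}

def find_suspect_characters(text):
    """Return (line number, column, character) for each suspect character found."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPECT:
                hits.append((line_no, col, ch))
    return hits

sample = "<unitdate label=\u201cDates:\u201d>1900</unitdate>"
print(find_suspect_characters(sample))
```

To check a real file, read its contents into a string first (for example with `open(path, encoding="utf-8").read()`, assuming the file is UTF-8) and pass that string to the function.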
If you copy a record that is on your Spoke and edit it and then upload the edited version, your Spoke will only display the newer record. If you do not delete the older record first, you may experience duplicate results, and you may still get hits when searching for words which have been removed from the new version. This behaviour will persist until you remove the old file and rebuild the database (e.g. by using the 'Rebuild Database' option in the administration interface).
We therefore recommend that you delete an older version of a record from your database before uploading a newer version. You should 'unindex and delete' the record, delete the cached HTML, upload and index the new record.
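To check whether older versions of a record are still sitting in the data directory, a sketch along the following lines may help. It is not part of the Spoke software: it assumes that data files end in .xml, that they have no XML namespace, and that two files with the same <eadid> text are versions of the same record.

```python
import os
import xml.etree.ElementTree as ET
from collections import defaultdict

def group_files_by_eadid(data_dir):
    """Map each <eadid> value to the list of XML files in data_dir that contain it."""
    groups = defaultdict(list)
    for name in sorted(os.listdir(data_dir)):
        if not name.endswith(".xml"):
            continue
        try:
            root = ET.parse(os.path.join(data_dir, name)).getroot()
        except ET.ParseError:
            continue  # skip files that cannot be parsed as XML
        eadid = root.find(".//eadid")
        if eadid is not None and eadid.text:
            groups[eadid.text.strip()].append(name)
    return groups

def find_duplicates(data_dir):
    """Return only the eadid values that appear in more than one file."""
    return {eadid: names
            for eadid, names in group_files_by_eadid(data_dir).items()
            if len(names) > 1}
```

Calling `find_duplicates("/home/cheshire/cheshire3/cheshire3/dbs/ead/data")` would list any eadid that occurs in more than one file, which are the files to review before rebuilding the database.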
This can be useful if you wish to refer back to it. Run the indexing script as follows to create a file 'output.txt' as well as viewing the output on the screen:
./run.py -load -load_components -cluster 2>&1 | tee output.txt
It is not necessary to create static HTML copies of all of your descriptions. For particularly long full text descriptions it should improve the speed of retrieval, but the on-the-fly generation of HTML files is quite quick.
Apportioning CPU time to requested processes is the job of the operating system. If the machine is asked to do indexing and nothing else, it should allocate nearly 100% (minus a minuscule amount to allow responsiveness of the user interface) of the CPU time to the task. When asked to perform another process, the OS should allocate CPU time to the new task, although both indexing and the new process will progress more slowly than if they were running alone.
Indexing is computationally intensive, which is why it is done in advance, and preferably at off-peak times when there should be relatively few people wanting to do other things (e.g. searching) with the server.
The impact of indexing on dual- or quad-core machines (which are becoming increasingly common) should be significantly less, as the indexing process only uses one of these 'cores'.
Yes, if you reset the statistics you can still recover your older statistics. There is a drop-down box enabling you to access statistics for previous periods of time.
Make sure that you save the cheshire3-ead.tgz file in the right directory - it should be in /home/cheshire/cheshire3. You can go to the command line and use the 'ls' command to check that it is there.
Currently the programs and scripts are set up based on this location. Whilst it is possible to use different directory names, you would need to change a number of references to the location so we do not advise this unless you are an expert.
When the Spoke software was first being developed, it was configured for use with the base installation of Apache. Both the search and admin functions of the Spoke software require mod_python, and there were problems with Python libraries and conflicts. We developed a packaged Spoke version of Apache, enabling us to build a much more stable version that worked with all the required libraries, so we would recommend that you use this if possible.
See also general page on creating Links
The XSLT stylesheet looks for the <repository> element first of all and, if there is one, it generates the 'Held At' title that is displayed in the description. It then looks for the repositorycode attribute in the <unitid> that is in the <did>, and if there isn't one it looks for the mainagencycode attribute in the <eadid> in the header. If neither of these attributes is present, the display will show the repository name without a link.
So, you will need to ensure that you have the mainagencycode attribute in the <eadid> or the repositorycode attribute in the <unitid>, and omit the leading '0's from the code. Most contributors use the mainagencycode:
<eadheader><eadid mainagencycode="206" publicid="GB 206 MS 246" countrycode="GB">GB 206 MS 246</eadid> ..... </eadheader>
The content of the <repository> element will then be displayed with the link to Archon.
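The lookup order described above can be sketched in Python as follows. This is only an approximation of what the XSLT stylesheet does, written to help you check your own files: the function name is hypothetical, the file is assumed to have no XML namespace, and the first <did>/<unitid> found in the document is taken to be the collection-level one.

```python
import xml.etree.ElementTree as ET

def find_repository_code(ead_text):
    """Return the code that would be used for the repository link, or None."""
    root = ET.fromstring(ead_text)
    # First choice: repositorycode on the <unitid> inside the <did>.
    unitid = root.find(".//did/unitid")
    if unitid is not None and unitid.get("repositorycode"):
        return unitid.get("repositorycode")
    # Fallback: mainagencycode on the <eadid> in the header.
    eadid = root.find(".//eadid")
    if eadid is not None and eadid.get("mainagencycode"):
        return eadid.get("mainagencycode")
    return None  # repository name would be shown without a link

sample = (
    '<ead><eadheader><eadid mainagencycode="206" countrycode="GB">'
    'GB 206 MS 246</eadid></eadheader>'
    '<archdesc><did><unitid>GB 206 MS 246</unitid>'
    '<repository>Example Repository</repository></did></archdesc></ead>'
)
print(find_repository_code(sample))  # 206
```

A result of None for one of your records suggests that neither attribute is present and the 'Held At' name will display without a link.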
Currently if you enter the URL of your Spoke without adding the /ead directory, you will get to a Cheshire page. These instructions are for redirecting you straight to your Spoke homepage.
Log in as the 'cheshire' user and navigate to /home/cheshire/cheshire3/install/conf
Edit the file httpd.conf, e.g. to edit with emacs:
emacs httpd.conf
Navigate to the section <IfModule alias_module> which starts:
# Redirect: Allows you to tell clients about documents that used to
Redirect permanent /index.html
After 'Redirect permanent /index.html' you need to add the URL for your Spoke, including '/ead'. The example below shows the URL for the MIMAS Spoke:
# Redirect: Allows you to tell clients about documents that used to
Redirect permanent /index.html http://spoke.mimas.archiveshub.ac.uk/ead
# exist in your server's namespace, but do not anymore. The client
# will make a new request for the document at its new location.
Save the file.
Log in as root with the command
su
Restart the webserver:
/etc/init.d/httpd restart
Exit superuser.
Records in Spokes on version 3.3.0 can now be harvested using the Open Archives Initiative Protocol for Metadata Harvesting. This means that external parties can harvest your records, either as EAD or Dublin Core. You can disable this if you wish to:
In the file ~/cheshire3/cheshire3/dbs/ead/config.xml navigate to the line (approximately line 13):
<setting type="oai-pmh">1</setting>
Change this to:
<setting type="oai-pmh">0</setting>
Note: If you have the Protocol enabled, it will automatically include the ability to harvest using Dublin Core as well as XML (EAD), as this is the minimum requirement for this Protocol. DC creates a less detailed record than the EAD version, containing only unittitle, unitid, subjects and scopecontent mapped to the appropriate Dublin Core elements.
It is easy to make a copy of your current data files for backup purposes. To do this, simply type the following from the command line:
mkdir /home/cheshire/backup/
and
cp /home/cheshire/cheshire3/cheshire3/dbs/ead/data/* /home/cheshire/backup
Alternatively, the Archives Hub can offer a backup facility on our central server if your Spoke administrator or IT administrator is happy with this arrangement.
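If you would like to keep several dated backups rather than overwrite a single copy, the following Python sketch is one possible approach. It is an illustration only: the function name and the dated-folder naming scheme are assumptions, and it copies regular files from the data directory without descending into subdirectories.

```python
import os
import shutil
import time

def backup_data_files(data_dir, backup_root):
    """Copy every regular file in data_dir to a new dated folder under backup_root."""
    target = os.path.join(backup_root, time.strftime("%Y-%m-%d_%H%M%S"))
    os.makedirs(target)  # raises an error if this dated folder already exists
    copied = []
    for name in sorted(os.listdir(data_dir)):
        source = os.path.join(data_dir, name)
        if os.path.isfile(source):
            shutil.copy2(source, target)  # copy2 also preserves timestamps
            copied.append(name)
    return target, copied
```

For example, `backup_data_files("/home/cheshire/cheshire3/cheshire3/dbs/ead/data", "/home/cheshire/backup")` would create a new time-stamped folder inside the backup directory used above.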
Yes, this is the standard that you should be using, although it is not mandatory.
As the cheshire user, go to the ead directory:
cd /home/cheshire/cheshire3/cheshire3/www/ead
Next, edit the localConfig.py file:
emacs localConfig.py
Scroll down the file to the outgoing_email_username setting and change this to an email username that you want to use on your Spoke (the one you would use when setting up an email client to send messages). The outgoing_email_host and outgoing_email_port settings will need to be changed in accordance with the settings used by your institution. The usual default port for most mail servers is 25.
Once you have edited and saved the file, you will need to rebuild the web pages and restart the webserver.
From the current directory type:
./buildCustomPages.py
Log in as the superuser:
su
Restart the webserver:
/etc/init.d/httpd restart
Exit from superuser:
exit
Refresh your browser before trying to email yourself a description.
The Archives Hub software is designed to display valid EAD, but there are times when a contributor may use EAD in a way that we have not anticipated. For example, within EAD it is possible to nest one <controlaccess> within another, but we need to make some adjustments for the Hub to process it.
If contributors report these issues to us we will endeavour to implement a solution. However, it is not feasible for us to deal with the vast number of permutations that EAD allows, so we have to decide on a case-by-case basis in terms of what is of benefit to all contributors and to our users.
This should be possible, but it is likely to require some alterations to the data, as EAD is flexible and services often implement slightly different 'flavours' of EAD. We will endeavour to work with contributors to facilitate the exchange of data with other EAD systems.