Spoke FAQ

Search

Display Results

Interface design

File Management

Database Management

Statistics

Installation of software

Linking

Other


Search FAQ

I am getting duplicate records in my search

This can be resolved very easily from the admin interface:

1. Check that duplicate files do not exist in the data directory - uploaded files are given a unique filename to avoid accidental overwrites, meaning that older versions of the same files must be manually removed.
2. Rebuild the database
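
The first check can be partly automated. The following is a minimal sketch and not part of the Spoke software: it groups the files in a directory by a hash of their contents and reports any groups of identical files. The function name find_duplicate_files is hypothetical, and the data directory path is assumed to be the one given in the backup answer of this FAQ. It only finds byte-for-byte copies, so an older, edited version of the same record still has to be spotted by eye.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed location of the uploaded EAD files (path used in the backup answer).
DATA_DIR = Path("/home/cheshire/cheshire3/cheshire3/dbs/ead/data")


def find_duplicate_files(directory):
    """Return lists of file names whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path.name)
    # Keep only the groups that contain more than one file.
    return [names for names in by_digest.values() if len(names) > 1]


if __name__ == "__main__":
    if DATA_DIR.is_dir():
        for group in find_duplicate_files(DATA_DIR):
            print("Identical files:", ", ".join(group))
```

Any file reported here still needs to be removed manually before you rebuild the database.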

Headings are being displayed for fields that do not contain data

You may find that you have display problems if you include empty fields within your EAD files. It is good practice to delete any field tags where you are not intending to add content.

I am getting an error when I use the Subject Finder that says there are '0' documents in the database

You need to have a reasonable number of records in your database for the Subject Finder to work

  • Check that you have a file called tempCluster.data in the /home/cheshire/cheshire3/cheshire3/dbs/ead directory. If the file is not there, there is no cluster data.
  • Check that the tempCluster.data file is not empty.
  • Check that your records do contain entries in the controlaccess area
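
The first two checks can be scripted. This is an illustrative sketch only; the function name cluster_file_status is hypothetical, and the path is the one quoted in the checklist above.

```python
from pathlib import Path

# Path quoted in the checklist above.
CLUSTER_FILE = Path("/home/cheshire/cheshire3/cheshire3/dbs/ead/tempCluster.data")


def cluster_file_status(path):
    """Classify the cluster data file as 'missing', 'empty' or 'ok'."""
    path = Path(path)
    if not path.is_file():
        return "missing"  # no cluster data has been built
    if path.stat().st_size == 0:
        return "empty"    # the file exists but holds no cluster data
    return "ok"


if __name__ == "__main__":
    print(CLUSTER_FILE, "->", cluster_file_status(CLUSTER_FILE))
```

A result of 'missing' or 'empty' suggests that the subject cluster database needs to be rebuilt.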

Rebuild the subject cluster database from the command line by doing the following:

  1. Log into your spoke and open a terminal window
  2. Use the cd (change directory) command to navigate to the /ead directory
    cd /home/cheshire/cheshire3/cheshire3/dbs/ead
  3. Enter and run the command: ./run.py -cluster

 

I would like to remove the link to the Subject Finder (or another search option) temporarily

You need to comment out the relevant section of a file called template.ssi, which can be found in the /home/cheshire/cheshire3/cheshire3/www/ead/html directory. Use HTML comment markers (<!-- at the beginning and --> at the end) to comment the section out, for example:

<!--<a href="/ead/subject.html" title="Find Subjects" class="navlink">Find Subjects</a> |-->

After this, go up one directory and run the build custom pages command:

./buildCustomPages.py

I am getting the error 'Could not retrieve requested record. It is possible your results set has expired'

This happens when the software cannot retrieve the temporary results set created by your search. Result sets 'expire' after one hour of inactivity (inactivity of the result set - not the computer). So this error will occur when you have carried out a search, left the results on the screen for some time, and then returned to the result over an hour later. The solution is simply to resubmit your original search.
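
The idea of expiry after inactivity can be illustrated with a toy example. This is not the Spoke software's implementation, which is not shown in this FAQ; the class ResultSetStore and its method names are hypothetical. Each successful fetch resets the one-hour timer, and a fetch after more than an hour without access returns nothing, which corresponds to the error message above.

```python
import time

EXPIRY_SECONDS = 60 * 60  # one hour of inactivity, as described above


class ResultSetStore:
    """Toy in-memory store; illustrates inactivity-based expiry only."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._sets = {}  # set id -> (results, time of last access)

    def save(self, set_id, results):
        self._sets[set_id] = (results, self._clock())

    def fetch(self, set_id):
        """Return the results and reset the timer, or None if expired or unknown."""
        entry = self._sets.get(set_id)
        if entry is None:
            return None
        results, last_access = entry
        if self._clock() - last_access > EXPIRY_SECONDS:
            del self._sets[set_id]  # expired: the search must be resubmitted
            return None
        self._sets[set_id] = (results, self._clock())
        return results
```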

I am getting Python script errors (purple python page)

This may be due to a number of reasons. The purple python pages indicate that something is wrong with the underlying systems.

Please send us a bug report with details of the error and include the error text that is at the bottom of the Python error page (if unsure, please send a screenshot of the whole error message).

I would like to set up pre-defined searches for subjects

Some Spoke administrators like to allow their users to perform a search for particular subjects by providing them with a link which will search the Spoke and bring back all relevant records. The underlying database of your Spoke uses the Common Query Language (CQL), which is a standard way of formulating a search. There is some background on this at http://zing.z3950.org/cql/intro.html.

The basic CQL syntax for a 'general keyword' search on a single word (biology, in this example) is:

dc.description all/relevant "biology"

This would search in the title, controlled access and scope and content fields for the word biology. You can choose any of the indexes supported by the Spokes software as the one to be searched. The table below shows the names of the various indexes and the fields that they search:

Index name            Data field(s) searched
cql.anywhere          full text
dc.description        unittitle, controlaccess and scopecontent fields
dc.title              collection title (titleproper)
dc.creator            creator of the collection
dc.identifier         eadid
dc.subject            subjects
bath.name             personal, family, corporate and geographic names
bath.personalName     personal names
bath.corporateName    corporate names
bath.geographicName   geographic names
bath.genreForm        genre

You need to append the CQL query to the base URL for a search on your Spoke. This is in the following form, where myspokeURL is the URL for your spoke, such as archiveshub.man.ac.uk:

http://myspokeURL/ead/search/?operation=search&query=

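You also need to encode some of the characters in the link. Within the query itself, use URL escape codes. When the link is written into a web page, the ampersand that separates the parameters is written as the HTML entity &amp;:

ampersand is &amp; (in HTML links)
forward slash is %2F
quotation mark is %22
space is +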

So the complete example for the query above would be:

http://myspokeURL/ead/search/?operation=search&amp;query=dc.description+all%2Frelevant+%22biology%22
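
If you have many links to create, the encoding can be done with a few lines of Python rather than by hand. This is a minimal sketch using only the standard library; the function name build_search_url is hypothetical, and myspokeURL is the same placeholder used above. The equals sign is deliberately left unencoded so that the output matches the style of the examples in this FAQ.

```python
from html import escape
from urllib.parse import quote_plus


def build_search_url(host, cql, for_html=True):
    """Build a Spoke search link for a CQL query.

    quote_plus turns spaces into '+', '/' into %2F and '"' into %22.
    With for_html=True the '&' separator is written as '&amp;', ready to
    paste into the href attribute of a web page.
    """
    query = quote_plus(cql, safe="=")
    url = "http://%s/ead/search/?operation=search&query=%s" % (host, query)
    return escape(url) if for_html else url


if __name__ == "__main__":
    print(build_search_url("myspokeURL", 'dc.description all/relevant "biology"'))
```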

 

Other examples:

This is how you would search for the word 'literature' in the subject field:

http://myspokeURL/ead/search/?operation=search&amp;query=dc.subject+all%2Frelevant+%22literature%22

For more complex queries, such as a subject search for both history and russia, you would put:

http://myspokeURL/ead/search/?operation=search&amp;query=dc.subject+all%2Frelevant+%22history+russia%22

This CQL query is really dc.subject all/relevant "history russia", but uses %2F for the forward slash, + for the space and %22 for the quotation marks. If you wanted to look for either word, rather than both, it would be dc.subject any/relevant "history russia". The 'relevant' part makes sure that the most likely records come back first; otherwise they will just come back in the order that they appear in the database.

For an exact match the 'all' or 'any' is replaced by an equals sign:

http://myspokeURL/ead/search/?operation=search&amp;query=dc.subject+=%2Frelevant+%22photographs+russia%22

To search for a personal name, such as Winston Churchill:

http://myspokeURL/ead/search/?operation=search&amp;query=bath.personalName+=%2Frelevant+%22churchill+winston%22

This will bring back records exactly matching "churchill winston" (the all has been replaced by an equals sign) in the personal name field.
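
To check that a link says what you intended, you can decode it back into the CQL query it carries. The following is a sketch using the Python standard library; the function name cql_from_url is hypothetical.

```python
from html import unescape
from urllib.parse import parse_qs, urlsplit


def cql_from_url(url):
    """Return the decoded CQL query carried by a search link, or None."""
    # Undo the HTML form of the separator (&amp; -> &) if it is present.
    query_string = urlsplit(unescape(url)).query
    # parse_qs reverses the URL escape codes (+, %2F, %22).
    values = parse_qs(query_string).get("query", [])
    return values[0] if values else None


if __name__ == "__main__":
    link = ("http://myspokeURL/ead/search/?operation=search&amp;"
            "query=bath.personalName+=%2Frelevant+%22churchill+winston%22")
    print(cql_from_url(link))
```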

Is it possible to remove the status box that appears during searches?

Yes, you can remove this box by making a simple change in the localConfig.py file:

Open localConfig.py in a text editor such as emacs and change the switch display_splash_screen_popup=True to display_splash_screen_popup=False. Save and exit the file, then log in as the superuser and restart the web server:

su - root
/etc/init.d/httpd restart

Remember to exit superuser status once you have done this.
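
If you prefer not to edit the file by hand, a True/False switch of this kind can be changed with a short script. This is a sketch only, assuming the switch appears on its own line in the form name=True or name = False; the function name set_boolean_setting is hypothetical. It works on the text of the file, so you would read localConfig.py, pass its contents through the function and write the result back, keeping a copy of the original.

```python
import re


def set_boolean_setting(config_text, name, value):
    """Return config_text with the switch `name` set to True or False."""
    pattern = re.compile(
        r"^(\s*%s\s*=\s*)(True|False)\b" % re.escape(name), re.MULTILINE
    )
    new_text, count = pattern.subn(
        lambda match: match.group(1) + str(bool(value)), config_text
    )
    if count == 0:
        raise KeyError("setting not found: %s" % name)
    return new_text


if __name__ == "__main__":
    sample = "display_splash_screen_popup=True\n"
    print(set_boolean_setting(sample, "display_splash_screen_popup", False))
```

The web server still needs to be restarted afterwards, as described above.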

I am getting a mod_python 'AssertionError' during a search

You may get the following error:

"/home/cheshire/cheshire3/install/lib/python2.5/site-packages/mod_python/importer.py", line 1155, in _execute_target assert (type(result) == types.IntType), _result_warning % type(result) AssertionError: Handler has returned result or raised SERVER_RETURN exception with argument having non integer type. Type of value returned was <type 'NoneType'>, whereas expected <type 'int'>

If this error also includes a reference to 'verifyDatabases' then it may be worth recreating the db_ead database. To do this, type the following commands at the command line to delete and recreate the db_ead database:

dropdb db_ead

then

createdb db_ead

I am getting a blank page or pop-up confirm window when I initiate a search

This is a problem that may happen using the Firefox browser (it has not been reported using any other browser). It is therefore not a data or software problem, but a browser issue. You tend to get a blank screen, and when you refresh you get a pop-up window warning that the page contains POSTDATA and asking you to click OK to resend. This makes sense when there are security issues, but it is not currently possible to disable it for selected sites. It has been reported to Mozilla as a bug and may be corrected in future versions of the browser.

Display Results FAQ

I do not get highlighting in all of the data fields

Not all indexes have proximity information (required for highlighting) stored as this takes a lot of processing time and space to store. It is not efficient to have it stored when searching for terms in very specific places (e.g. access terms, creator etc.)

I would like to change the relevance ranking symbols from percentages to stars

You can change the relevance rankings from percentages to stars by editing the localConfig.py file, found in the /home/cheshire/cheshire3/cheshire3/www/ead directory. There is a line in this file (line 30) which says 'graphical_relevance = False'. The 'False' can be changed to 'True' to replace the percentage ranking with stars. You then need to run the ./buildCustomPages.py command to get this to take effect.

I would like to remove the relevance ranking symbols

If you would like to remove relevance ranking altogether, make the following changes to localConfig.py

Navigate to around line 38, where there is the instruction 'display_relevance = True'. Simply change this to False to stop the display of the relevance ranking symbols.

You'll need to restart the web server for any changes in localConfig.py to take effect. Do this by logging into the command line as superuser (root) and entering the following command:

su - root

/etc/init.d/httpd restart

exit

Interface Design

I would like to change the format of the headings in the display

There are certain changes that you can make to your stylesheet that will change the interface display. We would advise that someone familiar with HTML should make the alterations. The Hub team can provide further advice on this.

I would like to change the display of the date in the sub-fonds

You can change the display of the date in the sub-fonds so that the brackets are removed and the date is on a separate line from the title. To do this, you need to edit the stylesheet, called html-common.xsl in the folder /home/cheshire/cheshire3/cheshire3/dbs/ead/xsl. Open html-common.xsl in an editor and navigate to around line 761, where the style information for the unitdate in the sub-fonds is displayed. You can leave the original script in the stylesheet but just comment it out using <!-- at the beginning and --> at the end of the section you want to omit:

<!-- <h2><xsl:apply-templates select="did/unittitle"/>
<xsl:text>(</xsl:text>
<xsl:choose>
<xsl:when test="did/unitdate">
<xsl:value-of select="did/unitdate"/>
</xsl:when>
<xsl:otherwise>
<xsl:text>undated</xsl:text>
</xsl:otherwise>
</xsl:choose>
<xsl:text>)</xsl:text>
<xsl:if test="did/origination">
<xsl:text> - </xsl:text>
<xsl:apply-templates select="did/origination"/>
</xsl:if>
</h2>

-->

Enter the following style information:

<h2>
<xsl:apply-templates select="did/unittitle"/>
</h2>
<xsl:if test="did/unitdate">
<h3>
<xsl:apply-templates select="did/unitdate"/>
</h3>
</xsl:if>
<xsl:if test="did/origination">
<h3>
<xsl:apply-templates select="did/origination"/>
</h3>
</xsl:if>

I would like to omit the display of a particular field in the sub-fonds

By making adjustments to the stylesheet, html-common.xsl, you can comment out a small section in order to prevent the display of data. For example, if you wish to ensure that the data contained within the <origination> tag does not display at sub-fonds level, the example below shows the relevant section, with the part referring to origination commented out. You need to ensure that you are within the <Component> section of the stylesheet in order to apply this to the sub-fonds only:

<h2>
<xsl:apply-templates select="did/unittitle"/>
<xsl:text>(</xsl:text>
<xsl:choose>
<xsl:when test="did//unitdate">
<xsl:value-of select="did//unitdate"/>
</xsl:when>
<xsl:otherwise>
<xsl:text>undated</xsl:text>
</xsl:otherwise>
</xsl:choose>
<xsl:text>)</xsl:text>
<!-- <xsl:if test="did/origination">
<xsl:text> - </xsl:text>
<xsl:apply-templates select="did/origination"/>
</xsl:if> --> </h2>

It is a good idea to delete cached HTML to see the changes in full-text display. You might also need to restart httpd to remove the old XSLT from the search handler's memory.

With a scrollable frame at the top, the various options don't appear unless I scroll down

It may be that the name of the Spoke is wrapping over two lines and forcing the search options off the screen. There are three options to deal with this:
a) Making the name shorter in the localConfig.py file in /home/cheshire/cheshire3/cheshire3/www/ead/
b) Editing the size of the <h1> tag in the style.css file (/home/cheshire/cheshire3/install/htdocs/ead/css)
c) Changing the height of the banner <div>, which is set to 8em in the struc-all.css file in the same location (this is probably the most effective option).

If you choose option (c), it is worth looking for all occurrences of the size 8em in the file and setting them all to the new value, to avoid the banner covering up more vital features of the page, such as parts of the search form.

Another possibility is that you are using a large graphic for your logo, so it would be worth replacing this with a smaller alternative.
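
For option (c), changing every occurrence of 8em consistently can be done with a short script rather than by hand. This is a sketch only: the function name replace_banner_height is hypothetical and the new value of 10em is just an example. It works on the text of the stylesheet and returns the new text together with the number of replacements, and it takes care not to touch other sizes that merely end in 8em, such as 18em or 0.8em.

```python
import re


def replace_banner_height(css_text, old="8em", new="10em"):
    """Replace every standalone `old` size in css_text with `new`.

    Returns (new_css_text, number_of_replacements).
    """
    # The lookbehind stops matches inside values such as 18em or 0.8em.
    pattern = re.compile(r"(?<![\w.])%s\b" % re.escape(old))
    return pattern.subn(new, css_text)


if __name__ == "__main__":
    sample = "#banner { height: 8em; }\n#content { top: 8em; }\n"
    print(replace_banner_height(sample))
```

Keep a copy of the original struc-all.css before writing any changed text back to it.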

I would like to remove the option to 'only display collections' from the search page

To do this, you need to navigate to the index.html page and open it in a text editor. The page can be found at /home/cheshire/cheshire3/cheshire3/www/ead/html. You need to remove or comment out the relevant section (comment out using <!-- at the beginning and --> at the end of the section):

<!-- <p>
<label for="noComponents" title="Only display hits for collections, NOT individual items.">
Only display collections:&nbsp;
</label>
<input type="checkbox" name="noComponents" id="noComponents"/>
<a href="/ead/help.html#nocomponents" title="What is this?">
<img src="/images/whatisthis.gif" alt="[What is this?]"/>
</a>
</p> -->

Then navigate to the ead directory above this (cd ..) and enter the command to build the Custom Pages:

./buildCustomPages.py

You should then find that you have removed the text from your homepage.

I would like to change some of the text on the search pages

You can make changes to most of the text on the search pages by editing the relevant html document. For the main search page you need to edit index.html; for the browse page you need to edit browse.html; for the Subject Finder the page to edit is subject.html. You can find these pages at /home/cheshire/cheshire3/cheshire3/www/ead/html. Once you have made the required changes to the page, save it, navigate to the ead directory directly above it (enter the command cd ..) and then enter the command to build the Custom Pages:

./buildCustomPages.py

You should then find that your changes have been made to the live Web interface.

File Management

I have uploaded a file but it will not display in full

This may be due to problematic characters in the content. If any of the content has been cut and pasted from word processing software, the file may include smart quotes. These can be difficult to spot. You can try opening the file in WordPad as this will show up any strange characters in amongst the data, e.g. <unitdate label="Dates:">

You can also try removing the record, by using the 'unindex and delete' command, deleting the cached HTML and then reloading and indexing.
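
Smart quotes and similar characters can also be located with a short script. The sketch below lists every character outside the basic ASCII range with its line and column; the function name find_non_ascii is hypothetical. Legitimate accented characters will be listed too, so treat the output as a list of places to look at rather than a list of errors.

```python
def find_non_ascii(text):
    """Return (line_number, column, character) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, char in enumerate(line, start=1):
            if ord(char) > 127:
                hits.append((line_no, col, char))
    return hits


if __name__ == "__main__":
    # \u201c and \u201d are the left and right 'smart' double quotes.
    sample = "<unitdate label=\u201cDates:\u201d>"
    for line_no, col, char in find_non_ascii(sample):
        print("line %d, column %d: %r" % (line_no, col, char))
```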

Can I edit a file and upload the edited version before deleting the original?

If you copy a record that is on your Spoke and edit it and then upload the edited version, your Spoke will only display the newer record. If you do not delete the older record first, you may experience duplicate results, and you may still get hits when searching for words which have been removed from the new version. This behaviour will persist until you remove the old file and rebuild the database (e.g. by using the 'Rebuild Database' option in the administration interface).

We therefore recommend that you delete an older version of a record from your database before uploading a newer version. You should 'unindex and delete' the record, delete the cached HTML, upload and index the new record.

Database Management

I would like to output the results of the indexing process to a logfile

This can be useful if you wish to refer back to it. Run the indexing script as follows to write the output (including any error messages) to a file called output.txt as well as viewing it on the screen:

./run.py -load -load_components -cluster 2>&1 | tee output.txt
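
The same effect can be achieved from Python if you want to wrap the indexing run in a script of your own. This is a sketch only; the function name run_and_log is hypothetical. It runs a command, shows each line of output on the screen as it arrives and writes the same line to a log file, with error messages merged into the normal output.

```python
import subprocess
import sys


def run_and_log(command, log_path):
    """Run command, echoing its output to the screen and to log_path.

    Returns the exit status of the command.
    """
    with open(log_path, "w", encoding="utf-8") as log:
        process = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge error messages into the output
            text=True,
        )
        for line in process.stdout:
            sys.stdout.write(line)
            log.write(line)
        return process.wait()
```

For example, run_and_log(["./run.py", "-load", "-load_components", "-cluster"], "output.txt") would be called from the directory containing run.py.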

Should I always 'Build HTML'?

It is not necessary to create static HTML copies of all of your descriptions. For particularly long full text descriptions it should improve the speed of retrieval, but the on-the-fly generation of HTML files is quite quick.

How much of the processor (CPU) does the indexing process use?

Apportioning CPU time to requested processes is the job of the operating system. If the machine is asked to do indexing and nothing else, it should allocate nearly 100% (minus a minuscule amount to allow responsiveness of the user interface) of the CPU time to the task. When asked to perform another process, the OS should allocate CPU time to the new task, although both indexing and the new process will progress more slowly than if they were running alone.

Indexing is computationally intensive, which is why it is done in advance, and preferably at off-peak times when there should be relatively few people wanting to do other things (e.g. searching) with the server.

The impact of indexing on dual- or quad-core machines (which are becoming increasingly common) should be significantly less, as the indexing process only uses one of these 'cores'.

Statistics

If I use the 'clear statistics' button, can I retrieve my old statistics?

Yes, if you reset the statistics you can still recover your older statistics. There is a drop-down box enabling you to access statistics for previous periods of time.

Installation of software

I cannot install the Cheshire packages

Make sure that you save the cheshire3-ead.tgz file in the right directory - it should be in /home/cheshire/cheshire3. You can go to the command line and use the 'ls' command to check that it is there.

I would like to install in a different location to /home/cheshire

Currently the programs and scripts are set up based on this location. Whilst it is possible to use different directory names, you would need to change a number of references to the location so we do not advise this unless you are an expert.

Why can't we use the default version of apache with our operating system; why do we have to use the Cheshire version of apache?

Initially, when the Spoke software was being developed, it was configured for use with the base installation of Apache. Both the search and admin functions of the Spoke software require mod_python, and there were problems with regard to Python libraries and conflicts. We developed a packaged Spoke version of Apache, enabling us to build a much more stable version that worked with all the required libraries, so we would recommend that you use this if possible.

Linking

See also general page on creating Links

How does the Archon link work on the Spokes?

The XSLT stylesheet first looks for the <repository> element and, if there is one, it generates the 'Held At' title that is displayed in the description. It then looks for the repositorycode attribute in the <unitid> that is in the <did>, and if there isn't one it looks for the mainagencycode attribute in the <eadid> in the header. If neither of these attributes is present, the display will show the repository name without a link.

So, you will need to ensure that you have the mainagencycode attribute in the <eadid> or the repositorycode attribute in the <unitid>, and omit the leading '0's from the code. Most contributors use the mainagencycode:

<eadheader><eadid mainagencycode="206" publicid="GB 206 MS 246" countrycode="GB">GB 206 MS 246</eadid> ..... </eadheader>

The content of the <repository> element will then be displayed with the link to Archon.
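
The lookup order described above can be expressed as a short Python sketch. This is an illustration of the logic only, not the stylesheet itself; the function name archon_details is hypothetical, the sample record is made up, and the EAD is assumed to have no XML namespace.

```python
import xml.etree.ElementTree as ET


def archon_details(ead_xml):
    """Return (repository_name, code) following the order described above.

    code is None when neither attribute is present, in which case the
    repository name would be shown without a link.
    """
    root = ET.fromstring(ead_xml)
    repository = root.find(".//repository")
    if repository is None:
        return None, None
    name = " ".join("".join(repository.itertext()).split())
    # First choice: repositorycode on the <unitid> inside the <did>.
    unitid = root.find(".//did/unitid")
    code = unitid.get("repositorycode") if unitid is not None else None
    if not code:
        # Fallback: mainagencycode on the <eadid> in the header.
        eadid = root.find(".//eadid")
        code = eadid.get("mainagencycode") if eadid is not None else None
    return name, code


if __name__ == "__main__":
    sample = (
        '<ead><eadheader><eadid mainagencycode="206" countrycode="GB">'
        "GB 206 MS 246</eadid></eadheader><archdesc><did>"
        "<unitid>MS 246</unitid><repository>Example Repository</repository>"
        "</did></archdesc></ead>"
    )
    print(archon_details(sample))
```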

How do I create a redirect from http://spokename.ac.uk to http://spokename.ac.uk/ead?

Currently if you enter the URL of your Spoke without adding the /ead directory, you will get to a Cheshire page. These instructions are for redirecting you straight to your Spoke homepage.

Log in as the 'cheshire' user and navigate to /home/cheshire/cheshire3/install/conf

edit the file httpd.conf

e.g. to edit with emacs:

emacs httpd.conf

Navigate to the section <IfModule alias_module> which starts:

# Redirect: Allows you to tell clients about documents that used to
Redirect permanent /index.html

After 'Redirect permanent /index.html' you need to add the URL for your Spoke, including '/ead'. The example below shows the URL for the MIMAS Spoke:

# Redirect: Allows you to tell clients about documents that used to
Redirect permanent /index.html http://spoke.mimas.archiveshub.ac.uk/ead
# exist in your server's namespace, but do not anymore. The client
# will make a new request for the document at its new location.

Save the file.

Log in as root with the command

su

Restart the webserver:

/etc/init.d/httpd restart

Exit superuser.

Other

How do I disable OAI-PMH access to our Spoke?

Records in Spokes on version 3.3.0 can now be harvested using the Open Archives Initiative Protocol for Metadata Harvesting. This means that external parties can harvest your records, either as EAD or Dublin Core. You can disable this if you wish to:

In the file ~/cheshire3/cheshire3/dbs/ead/config.xml navigate to the line (approximately line 13):

<setting type="oai-pmh">1</setting>

Change this to:

<setting type="oai-pmh">0</setting>

Note: If you have the Protocol enabled, it will automatically include the ability to harvest using Dublin Core as well as XML (EAD), as this is the minimum requirement for this Protocol. DC creates a less detailed record than the EAD version, containing only unittitle, unitid, subjects and scopecontent mapped to the appropriate Dublin Core elements.
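
The change to config.xml can also be made with a short script. This is a sketch only, assuming the setting appears exactly in the form shown above; the function name set_oai_pmh is hypothetical. It works on the text of the file and leaves everything else, including comments, untouched.

```python
import re

# Matches the oai-pmh setting line shown above, with a value of 0 or 1.
OAI_SETTING = re.compile(r'(<setting\s+type="oai-pmh"\s*>)\s*[01]\s*(</setting>)')


def set_oai_pmh(config_xml, enabled):
    """Return config_xml with OAI-PMH switched on (1) or off (0)."""
    replacement = r"\g<1>%d\g<2>" % (1 if enabled else 0)
    new_xml, count = OAI_SETTING.subn(replacement, config_xml)
    if count == 0:
        raise ValueError("no oai-pmh setting found")
    return new_xml


if __name__ == "__main__":
    sample = '<setting type="oai-pmh">1</setting>'
    print(set_oai_pmh(sample, enabled=False))
```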

How do I make a backup of my data?

It is easy to make a copy of your current data files for backup purposes. To do this, type the following from the command line:

mkdir /home/cheshire/backup/

and

cp /home/cheshire/cheshire3/cheshire3/dbs/ead/data/* /home/cheshire/backup

Alternatively, the Archives Hub can offer a backup facility on our central server if your Spoke administrator and IT administrator are happy with this arrangement.
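
If you want to keep more than one backup, the copy can be scripted so that each run goes into its own dated folder. This is a sketch only; the function name backup_data and the dated sub-folder are assumptions for illustration, while the two paths are the ones used in the commands above.

```python
import shutil
from datetime import date
from pathlib import Path

DATA_DIR = "/home/cheshire/cheshire3/cheshire3/dbs/ead/data"
BACKUP_ROOT = "/home/cheshire/backup"


def backup_data(data_dir, backup_root):
    """Copy every file in data_dir into backup_root/<today's date>/.

    Returns the target folder and the list of file names copied.
    """
    target = Path(backup_root) / date.today().isoformat()
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(data_dir).iterdir()):
        if path.is_file():
            shutil.copy2(path, target / path.name)  # keeps timestamps
            copied.append(path.name)
    return target, copied


if __name__ == "__main__":
    if Path(DATA_DIR).is_dir():
        folder, names = backup_data(DATA_DIR, BACKUP_ROOT)
        print("Copied %d files to %s" % (len(names), folder))
```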

Should I be using Unicode?

Yes, this is the standard that you should be using, although it is not mandatory.

How do I set up email to work on my Spoke?

As the cheshire user, go to the ead directory:

cd /home/cheshire/cheshire3/cheshire3/www/ead

Next, edit the localConfig.py file:

emacs localConfig.py

Scroll down the file to the outgoing_email_username setting and change this to the email username that you want to use on your Spoke (the one you would use when setting up an email client to send messages). The outgoing_email_host and outgoing_email_port settings will need to be changed in accordance with the settings used by your institution. The usual default port for most mail servers is 25.

Once you have edited and saved the file, you will need to rebuild the web pages and restart the web server.

From the current directory type:

./buildCustomPages.py

Log in as the superuser:

su

Restart the web server:

/etc/init.d/httpd restart

Exit from superuser:

exit

Refresh your browser before trying to email yourself a description.

Why am I having problems loading valid EAD onto the Spoke?

The Archives Hub software is designed to display valid EAD, but there are times when a contributor may use EAD in a way that we have not anticipated. For example, within EAD it is possible to nest one <controlaccess> within another, but we need to make some adjustments for the Hub to process it.

If contributors report these issues to us we will endeavour to implement a solution. However, it is not feasible for us to deal with the vast number of permutations that EAD allows, so we have to decide on a case-by-case basis in terms of what is of benefit to all contributors and to our users.

Can I use the Hub EAD for other systems that are also using EAD?

This should be possible, but it is likely to require some alterations to the data, as EAD is flexible and services often implement slightly different 'flavours' of EAD. We will endeavour to work with contributors to facilitate the exchange of data with other EAD systems.
