
Encoded Archival Description (EAD) and the Creation of Electronic Finding Aids

by Sheila Comeau, consultant
Network Notes #58
ISSN 1201-4338
Information Technology Services
National Library of Canada

December 7, 1998


What is EAD?

EAD is a Standard Generalized Markup Language (SGML) encoding standard designed specifically for marking up information contained in archival finding aids. Finding aids are documents describing the content of primary source collections (e.g. archival fonds, print and photo libraries, manuscript collections) available in archives, libraries and museums. Intellectual finding aids describe interrelationships between a group of records and the administrative entities that created them; physical finding aids are administrative tools used by archivists to find the actual physical items in the collection. Finding aids have historically been created and maintained in a variety of print and electronic formats.

Birth of the electronic finding aid standard

In 1993, a team of library researchers at the University of California, Berkeley initiated a project to develop a non-proprietary, platform-neutral standard for creating machine-readable finding aids that could be shared over distributed networks. The standard also needed to be flexible enough to accommodate embedded links from the finding aid to digital items such as scanned texts, images, or sound files, or to additional descriptive or background information (e.g. bibliographies). Given these requirements, the group turned to SGML.

SGML is an ISO standard (ISO 8879) currently supported in various sectors, including the military, aeronautics, high tech, and commercial information companies. SGML provides a framework for defining the logical structure and elements of varying document types. The SGML Document Type Definition (DTD) is essentially the template or blueprint that specifies the markup allowed in a particular set of similarly structured documents.

The DTD formally defines relationships among various elements within a document. Archival finding aids are appropriate candidates for a DTD because they share common informational and structural elements. Finding aids often include various hierarchies of description, ranging from information about the entire archival collection to notes pertaining to specific files within a sub-series of a series within a collection.

The EAD DTD was developed following extensive analysis of existing finding aids gathered from a cross-section of archives and libraries. Informed by existing archival standards such as ISAD(G): General International Standard Archival Description and Rules for Archival Description, EAD defines both required and optional informational elements that make up the finding aid header, description and ancillary material.

The EAD header contains information relating to the electronic file itself, including a title, publisher statement and finding aid author. The EAD description can include a collection description (title, creator, extent, and repository), controlled search terms (personal and corporate names, subject headings), administrative information (details on the acquisition, usage restrictions and processing of the collection), biographical or organizational history of the creator, notes on the scope and contents of the collection, and a description of the physical and intellectual arrangement, organization, and contents of series within the collection. Ancillary material is further information relating to the collection, such as a bibliography or index.
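The three-part layout described above can be sketched in a few lines of Python. This is only an illustration: the element names and nesting (eadheader, archdesc, did, add, and so on) are assumptions modelled on EAD Version 1.0 and should be checked against the DTD itself, and the function name build_skeleton and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

def build_skeleton(aid_title, author, publisher, unit_title, repository):
    """Return a minimal EAD-like tree: header, description, ancillary material."""
    ead = ET.Element("ead")

    # Header: information about the electronic file itself.
    header = ET.SubElement(ead, "eadheader")
    filedesc = ET.SubElement(header, "filedesc")
    titlestmt = ET.SubElement(filedesc, "titlestmt")
    ET.SubElement(titlestmt, "titleproper").text = aid_title
    ET.SubElement(titlestmt, "author").text = author
    pubstmt = ET.SubElement(filedesc, "publicationstmt")
    ET.SubElement(pubstmt, "publisher").text = publisher

    # Description: collection-level summary plus empty descriptive sections.
    archdesc = ET.SubElement(ead, "archdesc", {"level": "collection"})
    did = ET.SubElement(archdesc, "did")
    ET.SubElement(did, "unittitle").text = unit_title
    ET.SubElement(did, "repository").text = repository
    for name in ("controlaccess", "admininfo", "bioghist", "scopecontent", "dsc"):
        ET.SubElement(archdesc, name)

    # Ancillary material such as a bibliography or index (assumed placement).
    ET.SubElement(archdesc, "add")
    return ead

skeleton = build_skeleton(
    "Register of the John Smith Papers",  # sample values only
    "Bill Jones",
    "Springfield Library",
    "John Smith Papers",
    "Springfield Library",
)
print(ET.tostring(skeleton, encoding="unicode"))
```

A real finding aid would fill each empty section with content and would normally be produced with an SGML-aware editor rather than assembled by hand.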

Accessing EAD finding aids

To view an EAD finding aid in its native SGML format, an SGML viewer such as Panorama is required. The SGML viewer reads the finding aid document (the content), the EAD DTD file (the structure), and a stylesheet file that provides the viewer with instructions governing how the various elements will display. A navigation file is also required to define 'jump to' links within the document. Additional support files may be required for proper display of the file.

Given the considerable overhead involved in viewing SGML files in their native format, coupled with the fact that a freely distributed SGML viewer or browser plug-in is not readily available, many EAD sites have chosen to serve up HTML conversions of their SGML finding aids. While some loss of markup detail and navigational information may occur in the conversion process, the finding aid content can be made widely available to anyone with a web browser. EAD sites have taken various approaches to the conversion process.
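As a rough illustration of such a conversion, the following Python sketch maps a short fragment of EAD-style markup to HTML. It is not the method used by any particular site: the tag mapping (TAG_MAP) is a hypothetical choice, and the fragment is assumed to be well-formed enough for an XML parser to read. Tags without an HTML counterpart are flattened to their text, which shows the kind of markup detail that can be lost.

```python
import xml.etree.ElementTree as ET
from html import escape

SAMPLE = """<ADMININFO>
<HEAD>Administrative Information</HEAD>
<PROCESSINFO><P>Selected artifacts have been transferred to the <CORPNAME>Smithsonian Institution</CORPNAME>.</P></PROCESSINFO>
</ADMININFO>"""

# Hypothetical mapping from EAD tags to HTML tags.
# Any tag not listed here is dropped and only its text is kept.
TAG_MAP = {"HEAD": "h2", "P": "p"}

def to_html(elem):
    """Recursively convert an element and its children to an HTML string."""
    inner = escape(elem.text or "")
    for child in elem:
        inner += to_html(child) + escape(child.tail or "")
    html_tag = TAG_MAP.get(elem.tag)
    if html_tag is None:
        return inner
    return f"<{html_tag}>{inner}</{html_tag}>"

html = to_html(ET.fromstring(SAMPLE))
print(html)
```

In this sketch the CORPNAME tagging disappears from the HTML output, so a search limited to corporate names would have to run against the SGML source rather than the converted file.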

By retaining an SGML 'master file' of the finding aid behind the scenes, sites are able to perform element-specific indexing and retrieval on their finding aid collections, or contribute their finding aids to collective EAD efforts. Adhering to an independent standard such as EAD provides developers with the assurance that the stored electronic content will not be made obsolete or inaccessible by changes in specific software applications. In addition, the EAD DTD functions as a building block for integrated collections, extending access to primary source materials across multiple repositories. A less immediate benefit may also emerge for EAD adherents as systems move towards XML (Extensible Markup Language). The recently released EAD DTD Version 1.0 is designed with built-in 'switches' which will allow it to function as both an SGML and an XML DTD.

Creating EAD DTD finding aids

When creating a finding aid following the EAD standard, the author encloses the finding aid's informational elements in the tags defined by the DTD. The tagged document is a flat text file, and looks similar to an HTML source file:

 

<ADMININFO>

<HEAD>Administrative Information</HEAD>

<ACQINFO><P>The papers of <PERSNAME>John Smith</PERSNAME> (1880-1939), poet, were given to the <CORPNAME>Springfield Library</CORPNAME> in 1967 by Smith's wife, <PERSNAME>Leslie Smith</PERSNAME>.</P></ACQINFO>

<USERESTRICT><P>Copyright in the unpublished writings of John Smith in these papers and in other collections of papers in the custody of the Springfield Library has been dedicated to the public.</P></USERESTRICT>

<PROCESSINFO><P>Selected artifacts have been transferred to the <CORPNAME>Smithsonian Institution</CORPNAME>.</P>

<P>The original register was prepared by <PERSNAME>Bill Jones</PERSNAME> in 1969.</P></PROCESSINFO>

</ADMININFO>

<BIOGHIST>

<HEAD>Biographical Note</HEAD>

<CHRONLIST><CHRONITEM><DATE TYPE="long1">1880, June 20.</DATE><EVENT>Born, <GEOGNAME>Dayton, Ohio.</GEOGNAME></EVENT></CHRONITEM></CHRONLIST>

</BIOGHIST>

This excerpt shows the tags that make up the administrative information and biographical history elements, together with their sub-elements.

Although SGML documents can be created using a simple text editor, various commercial and publicly available SGML software tools exist to assist in the creation of SGML encoded documents. Internet Archivist is a software product that offers a forms-based interface for creating EAD-compliant finding aids which can be saved as either SGML or HTML files.

UC Berkeley has used in-house expertise to develop a web-based EAD template generator for participants in the Online Archive of California project. The template generates a basic EAD-compliant finding aid by merging information from the participant's profile with the data entered in the template. The generated file is saved locally and detailed container list information is added.

While software tools lighten the load for SGML authoring efforts, a thorough understanding of how the software represents and manipulates the various elements of the DTD is required for consistent, high-quality finding aids to be produced. Because SGML is intolerant of deviations from the established document type definition, validation and error-detection may extend the initial time investment required to complete an EAD-compliant finding aid. However, a benefit of such stringency is increased consistency across finding aids.
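The kind of error that a parser will refuse to accept can be shown with a small Python sketch. It checks only that tags are properly opened and closed (XML well-formedness); it does not validate against the EAD DTD, which requires a validating SGML or XML parser. The function name check_well_formed and the sample strings are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

def check_well_formed(markup):
    """Return None if the markup parses, otherwise the parser's error message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return str(err)
    return None

# A correctly closed chronology list.
good = "<CHRONLIST><CHRONITEM><EVENT>Born</EVENT></CHRONITEM></CHRONLIST>"

# The same list with a damaged end tag: "CHRONLIST>" instead of "</CHRONLIST>".
bad = "<CHRONLIST><CHRONITEM><EVENT>Born</EVENT></CHRONITEM>CHRONLIST>"

print(check_well_formed(good))  # no error reported
print(check_well_formed(bad))   # parser error message
```

A single mistyped end tag is enough to make the whole file unreadable to a strict parser, which is why checking each finding aid before publication adds to the time required.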

Integration and access

EAD sites have taken various approaches to integrating finding aids with existing resource discovery tools. Some sites, such as the Library of Congress and University of San Diego, create MARC records for each collection described by a finding aid. The MARC records link to the full finding aid from the 856 field.
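As a sketch of what such a link might look like, the short Python function below formats an 856 field as a display string. The indicator values (first indicator 4 for HTTP, second indicator 2 for a related resource) and the subfields $3 (materials specified) and $u (URL) follow the MARC 21 definition of field 856, but the display layout, the function name format_856 and the example URL are assumptions; local systems and cataloguing policies differ.

```python
def format_856(url, materials="Finding aid"):
    """Return a display string for a MARC 856 field that links to a finding aid."""
    # Indicator 1 = "4" (access by HTTP); indicator 2 = "2" (related resource).
    return f"856 42 $3 {materials} $u {url}"

# Hypothetical URL, for illustration only.
field = format_856("http://example.org/findaid/smith.html")
print(field)
```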

Most sites involved in EAD finding aid projects dedicate a portion of their web site to the finding aid collection. The EAD site typically provides background information and instructions on the use of finding aids, and includes a dedicated search interface to the collection. Often, specialized indexing software is run on the finding aid collection in order to harness the element-specific tagging of the SGML source files.
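Element-specific indexing can be illustrated with a minimal Python sketch that builds a lookup table of personal and corporate names across a set of finding aids. The file names, the sample fragments and the choice of indexed tags are hypothetical, the fragments are assumed to be well-formed, and production indexing software would do considerably more.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical finding aid fragments keyed by hypothetical file names.
FINDING_AIDS = {
    "smith.sgm": (
        "<ADMININFO><ACQINFO><P>The papers of <PERSNAME>John Smith</PERSNAME> "
        "were given to the <CORPNAME>Springfield Library</CORPNAME>.</P>"
        "</ACQINFO></ADMININFO>"
    ),
    "jones.sgm": (
        "<PROCESSINFO><P>Register prepared by <PERSNAME>Bill Jones</PERSNAME> "
        "for the <CORPNAME>Springfield Library</CORPNAME>.</P></PROCESSINFO>"
    ),
}

INDEXED_TAGS = ("PERSNAME", "CORPNAME")

def build_index(docs):
    """Map (tag, value) pairs to the set of documents that contain them."""
    index = defaultdict(set)
    for doc_id, markup in docs.items():
        root = ET.fromstring(markup)
        for tag in INDEXED_TAGS:
            for elem in root.iter(tag):
                # Collapse runs of whitespace so that values compare cleanly.
                value = " ".join("".join(elem.itertext()).split())
                index[(tag, value)].add(doc_id)
    return index

index = build_index(FINDING_AIDS)
print(sorted(index[("CORPNAME", "Springfield Library")]))
print(sorted(index[("PERSNAME", "John Smith")]))
```

Because the index is keyed by tag as well as by value, a search for "Springfield Library" as a corporate name stays separate from any occurrence of the same words in running text.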

Conclusion

As with any SGML project, there is a considerable up-front investment in adopting the EAD standard, both in terms of training and in the acquisition of any specialized hardware and software required to support a collection of SGML documents.

Issues surrounding display and output options need to be assessed. Offering only SGML versions of a finding aid will disenfranchise users without SGML viewer software. Those that have the software may find the experience of downloading the SGML document and all of its support files frustrating and time-consuming. Sites offering HTML output may opt for HTML frames, while some may convert the SGML source to a single HTML file. Each approach needs to be assessed in terms of accessing, navigating, and printing/saving the finding aid.

In order to harness the metadata incorporated in an EAD finding aid, library systems and technical services need to be consulted on EAD cataloguing procedures and integration issues. Procedures regarding use of controlled subject headings and authority files for finding aids may need to be reviewed. In addition, new policy issues may arise for reference or document delivery services as more information about local primary source collections becomes available through electronic finding aids.

The strongest arguments for adopting the EAD standard tend to address data preservation, migration, management, consistency, scalability and interoperability issues. Maintaining a collection of finding aids using the EAD standard creates a rich repository of raw data which can be manipulated and re-purposed to meet the changing demands of search and retrieval, indexing, display and file sharing systems. To accommodate growth and change, the EAD header tracks the evolution of a finding aid as it develops from a preliminary collection level description to an extensive item level analysis, or as corrections are made to the original analysis. For sites interested in cooperative collection development and resource sharing, the EAD standard provides a building block for the creation of union catalogues or integrated collections.

More information:

  • EAD Official Web site

lcweb.loc.gov/ead/

Maintains the EAD DTD standard in partnership with the Society of American Archivists; includes a list of current implementers.

  • EAD Help Pages - EAD Roundtable of the Society of American Archivists

jefferson.village.virginia.edu/ead

Provides links to EAD source files, readings on SGML/XML, EAD sites by location (with annotations), and tools and helper files.

  • IFLA Digital Libraries: Metadata Resources-EAD

www.ifla.org/II/metadata.htm#ead

Links to background material, help pages, and EAD site lists.


Copyright. The National Library of Canada. (Revised: 1999-1-28).