Volume 1, Number 2 (August 1995)
Electronic Texts, File Formats, and Copyright: The Christian
Classics Ethereal Library
Review by
Perry Willett
Indiana University
pwillett@indiana.edu
Willett, Perry. "Electronic Texts, File Formats, and
Copyright: The Christian Classics Ethereal Library"
Early Modern Literary Studies 1.2 (1995): 12.1-27
<URL: http://www.library.ubc.ca/emls/01-2/rev_wil2.html>.
Copyright (c) 1995 by the author, all rights reserved. Volume
1.2 as a whole is copyright (c) 1995 by Early Modern
Literary Studies, all rights reserved, and may be used and
shared in accordance with the fair-use provisions of U.S.
copyright law. Archiving and redistribution for profit, or
republication of this text in any medium, requires the consent of
the author and the Editor of EMLS.
Introduction
- Two factors, file format and copyright, invisibly shape
the creation of electronic texts. Electronic files can be
created and stored in a variety of formats, with each
format having features that allow or limit possible uses
of the text. Copyright, and its interpretation, has
tremendous influence over the texts that are chosen for
transferral to electronic formats. These two issues
underlie many or most editorial decisions made when
creating an electronic text, and determine the eventual
uses for which a text is most appropriate. Moreover, both
of these factors, in very different ways, influence the
text collections that are available for use over the
World Wide Web.
- Before reviewing electronic texts, one must first
consider their scholarly uses. The ease of duplication
and transmission of electronic files makes them ideal for
retrieval. Those who wish to find a copy of the Bible, or
Augustine's Confessions, may now download
the texts through the World Wide Web. They may wish
simply to read the text as they would a printed edition,
but with the electronic version, they never have to worry
about it being checked out from the library or missing
from the shelf. As long as the workstation and network
connections are running, these texts will be available
for reading, saving to disk, or printing.
- Electronic texts can also be used as concordances, for
finding specific passages. This is something the most
basic word processor or text editor can perform in a
rudimentary way, but more sophisticated searches, such as
those involving word proximity, require more substantial
search engines than those found in word processors.
Examples of tools available for analysis of electronic
texts include WordCruncher, a commercial software package
that creates concordances and allows for complex
searching and analysis of electronic texts, and
TACT (Textual Analysis Computing Tools), software for
searching and textual analysis, freely available from the
University of Toronto.
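
The difference between a word processor's simple "find" and the proximity searching mentioned above can be sketched in a few lines of Python. This is only an illustration of the two techniques (a keyword-in-context concordance and a word-proximity search); it does not reproduce how WordCruncher or TACT work, and the sample sentence and the function names `tokenize`, `kwic`, and `near` are invented for the example.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())

def kwic(tokens, keyword, width=3):
    """Keyword-in-context lines: `width` words on either side of each hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

def near(tokens, first, second, distance=5):
    """Index pairs where `first` and `second` occur within `distance` words."""
    pos_first = [i for i, t in enumerate(tokens) if t == first]
    pos_second = [i for i, t in enumerate(tokens) if t == second]
    return [(a, b) for a in pos_first for b in pos_second
            if 0 < abs(a - b) <= distance]

# Invented sample sentence, not taken from any edition under review.
sample = ("The heart is restless until it finds rest, "
          "and the restless heart seeks rest.")
tokens = tokenize(sample)

for line in kwic(tokens, "restless", width=2):
    print(line)
print(near(tokens, "heart", "rest", distance=2))
```

A concordance of a full text would store every word's positions once in an index rather than rescanning the token list for each query, which is the kind of work the dedicated packages take on.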
- These two functions mirror current uses of books, perhaps
improving on delivery or searching (especially if no
printed concordance is available for a particular text).
If the ultimate use of an electronic text is as an
electronic reproduction of a printed book, then the only
formatting required will be to make the typeface and page
layout attractive and easy to read. Other types of
research are possible using electronic editions.
Linguistic, semantic, or syntactic features of texts
could be tagged and used in researching a particular
text, or across a collection of texts.
- The World Wide Web, in implementing a notion of
hypertext, has capabilities very different from those of printed
books, and allows for other, broadened uses of electronic
texts. In this medium, texts can be linked to notes, to
variants, to other texts, even to graphics or sound or
movies, or to themselves. Creating electronic editions
designed to allow these kinds of research requires
planning and considerably more work than scanning or
typing in a transcription of a printed version, but some
texts would benefit greatly from the availability of such
tagging or links, making this effort worthwhile.
- Scholars who wish to take advantage of these features of
electronic texts in their research will notice the
paucity of texts available in the public domain. Even as
the number of electronic texts grows rapidly, there are
generally few choices of editions of any particular text.
One major impediment to the wholesale creation of
electronic editions is copyright. Anyone who creates an
electronic text and makes it available for use by others
must be concerned with the copyright status of the
edition that is chosen. Only texts that are in the public
domain can be made freely available over the Internet,
and this fact restricts greatly the choices of editors.
- Copyright remains a vexing question when applied to
electronic texts. The ease of reproduction and delivery
of electronic texts brings into clear conflict the rights
of authors and publishers on the one hand, and the desire
of researchers for electronic versions of important works
or particular editions, on the other. Copyright will
continue to cause problems for creators of electronic
versions of printed works due to its complexity.
- U.S. copyright laws have changed over the years, with the
most important change occurring in 1978. As a minimal
outline, works created before 1978 could obtain an
initial copyright of 28 years, with a renewal available
that in some cases extended copyright to 75 years. For works
created after 1978, copyright extends to 50 years after
the author's death. And, as stated in the
Frequently Asked Questions About Copyright
(v.1.1.2), International Aspects, under
international copyright law, generally speaking, "an
author's rights are respected in another country as
though the author were a citizen of that country,"
creating even larger complications for international
distribution of electronic texts over the Internet.
Copyright has many more complexities than can be
explained in this brief overview, and the general lack of
familiarity may prevent more scholars from creating
electronic texts. There is a WWW page designed to assist
in understanding
copyright, with a helpful list of "frequently
asked questions." In addition, the Harry Ransom
Humanities Research Center at the University of Texas and
the University of Reading Library have embarked on a
joint project called
WATCH (Writers And Their Copyright
Holders) to collect and provide access "to the names
and addresses of copyright holders for English-language
authors whose papers are housed, in whole or in part, in
libraries and archives in North America and the United
Kingdom." Even with these aids, the question of
whether a particular book is in the public domain will
rarely be free of ambiguity. The choice for scholars is
either to create new editions, such as the
Renaissance Electronic Text series (
reviewed in the previous issue of EMLS),
or, as is much more common, to make do with older
editions until this issue can be resolved (if, indeed, it
can ever be resolved to the satisfaction of all parties).
Christian Classics Ethereal Library. Harry
Plantinga, general editor. Pittsburgh: University of
Pittsburgh, 1994-.
- The WWW provides a particularly appropriate environment
for biblical editions and studies because of its
potential to link the extensive intra- and intertextual
references, allusions and glosses. The Bible itself can
be thought of as a kind of proto-hypertext
as discussed by Delany and Landow in their introduction
to Hypermedia and Literary
Studies. One could imagine an ideal
hypertextual Bible that would allow the reader to follow
all the typologies and allusions through a series of
links. Instead of having to flip manually, one could
simply click on a verse or phrase to follow these links
through the rest of the text. One could also, in this
ideal edition, easily compare different versions or
translations at a touch of a button, or link to an
encyclopedia, dictionary, commentary or image file. (As Steven DeRose points out, such a
Bible could easily have over one million explicit links.)
- Biblical scholars embraced the computer for their
research long before other scholars in the humanities.
They have sophisticated electronic resources, such as Bible Windows
and CDWord for collations of multiple
biblical versions, or CATSS-Base for aligned
Greek and Hebrew bibles and their variants. These
resources combine text with software, and were designed
before the World Wide Web existed, to run on single
workstations. Robert Kraft began an online journal for
biblical studies,
Offline, over 10 years ago. There are a
number of guides to electronic religious studies
materials, such as those by Gresham
or Strangelove, that point
to the large number of electronic discussion groups,
journals, and software libraries available over the
Internet. As expertise grows, one expects to see rapid
growth in the number of resources available through the
WWW for biblical studies.
- The Christian Classics Ethereal Library is
such a resource, and provides an example of how the two
factors of copyright and file format influence the
availability and possible uses of electronic texts.
Plantinga is a professor of computer science at the
University of Pittsburgh, and is creating a growing
library of works of interest to literary and religious
scholars. Works by Thomas à Kempis, Milton, Bunyan,
Calvin, Jonathan Edwards, St. John of the Cross and
others are available. The works of some of these authors,
such as Milton and Bunyan, can also be found in different
versions at several other WWW sites.
However, the importance of this collection is that most
of the works are not available elsewhere. Plantinga's
work in creating this library is extremely important for
biblical scholars, for he has created versions of texts
otherwise not available in electronic form.
- Unlike the electronic texts reviewed in the
previous issue of EMLS, most of these
works are not encoded using SGML (Standard Generalized
Markup Language), but instead use other formats.
Plantinga has made most of the texts available in RTF,
or Rich Text Format, developed by
Microsoft, which can be used with word processors such as
Microsoft Word, WordPerfect, FrameMaker, or others. Other
electronic texts are formatted as PDF or
Portable Document Format, developed by
Adobe. PDF files can only be viewed or printed with
software called Acrobat,
freely available from Adobe. (Acrobat works on a limited
number of platforms, including DOS, Windows, and
Macintosh.) There are also texts available in Hypercard
editions, for use with the Macintosh Hypercard program,
or in plain ASCII format. Some of the texts are available
in multiple formats, providing an opportunity to compare
the relative strength of each format.
- RTF and PDF are page description languages
and are largely concerned with how the text looks, either
on the screen or on the printed page. Clearly, Plantinga
intends his collection solely for reading after
printing the file--he states in an explanation of file
formats that "reading these books is easier from a printed
version." He has therefore chosen file formats
that will allow for attractively printed documents with
little effort.
- One may look at his versions of St. Augustine's Confessions
as an example of the collection available. All of the
versions of the Confessions in their various
formats seem to be accurate transcriptions of the
original editions. No errors or omissions were evident,
and thus each of these versions passes the first critical
test of any electronic text: that of accuracy.
- Two translations are available, one by the Rev. E.B. Pusey
and the other by Albert Outler. Plantinga has chosen
editions in the public domain, as he clearly states at
the top of each edition. Neither translation is
considered the most important for scholarly uses. James
O'Donnell, in an
introduction to a plain text version of the Pusey
translation, calls it "not the best," listing
several other more modern editions, but notes that it is
"safely out of copyright."
- As with the other texts in this library, Plantinga
asserts that these two editions are in the public domain.
In the case of Outler's translation, published in 1955,
this is true only if the copyright was not renewed. Under
U.S. copyright law, works published before 1978 went into
public domain after 28 years unless the copyright was
renewed. One would have to review publications from the
U.S. Copyright Office to determine
whether the copyright on this edition was renewed in
1983; if it was not, the work passed into the public domain that year. To
make matters more complicated, the series published by
the U.S. Copyright Office listing copyright renewals each
year has been published only through 1982. Researchers
would have to contact the Copyright Office directly to be
certain of the copyright status of Outler's translation.
Most editors and researchers, faced with such
complexities, make choices in good faith without knowing
in all certainty the exact status of particular texts.
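
The arithmetic behind the 1983 date can be written out explicitly. The sketch below encodes only the simplified rule as this review summarizes it (a 28-year first term for pre-1978 works, extended to 75 years in total if renewed); the function and constant names are invented for the example, and the statute itself contains details that this outline leaves out.

```python
FIRST_TERM_YEARS = 28    # initial term for pre-1978 works, as outlined above
RENEWED_TERM_YEARS = 75  # total term if the copyright was renewed

def renewal_year(publication_year):
    """Year in which the first 28-year term ends and renewal was due."""
    return publication_year + FIRST_TERM_YEARS

def public_domain_year(publication_year, renewed):
    """Approximate year a pre-1978 work leaves copyright under this outline."""
    if publication_year >= 1978:
        raise ValueError("this outline covers only works from before 1978")
    term = RENEWED_TERM_YEARS if renewed else FIRST_TERM_YEARS
    return publication_year + term

# Outler's translation was published in 1955.
print(renewal_year(1955))                        # 1983
print(public_domain_year(1955, renewed=False))   # 1983
print(public_domain_year(1955, renewed=True))    # 2030
```

The calculation is trivial; the difficulty described above lies entirely in the `renewed` argument, which can only be settled from the Copyright Office's records.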
- In turning to considerations of file formats, Outler's
translation is available in a variety of electronic
formats, including plain text, RTF, and PDF. The Pusey
translation is formatted in RTF and as a Macintosh
Hypercard stack. Plantinga generally gives some
bibliographic information about the editions in the
library, including author, title, translator and date of
publication (and this is much more than given in some
electronic texts found at other sites). However, in a
serious omission, he neglects to include the publisher.
This slip would not be accepted in printed editions, and
one hopes that future editions in the Library
will include this information.
- The limitations of the plain ASCII text will be familiar
to almost anyone who has used a computer to create texts.
The limited character set requires that one either omit
characters outside it or change accented letters to their unaccented
equivalents. Outler includes Greek words and phrases in
his notes, for instance; these words have been omitted in
the ASCII version. He also includes references to French
editions and translations; every accented character in them has
been changed. The notes are at the end of this fairly
large file, making following references rather
cumbersome.
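
The two kinds of loss described here, accents stripped from Latin letters and Greek dropped altogether, are what a straightforward reduction to ASCII produces. The following Python sketch shows one way such a reduction can be done; it is an assumption for illustration only, since the review does not say how the library's ASCII files were actually prepared, and the function name `to_plain_ascii` is invented.

```python
import unicodedata

def to_plain_ascii(text, placeholder=""):
    """Fold accented Latin letters to their base letters; replace any other
    non-ASCII character with `placeholder` (empty by default, i.e. omitted)."""
    decomposed = unicodedata.normalize("NFKD", text)
    out = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop the accent mark, keep the base letter
        out.append(ch if ord(ch) < 128 else placeholder)
    return "".join(out)

french = "\u00e9dition fran\u00e7aise"        # "edition francaise" with accents
greek = "\u03bb\u03cc\u03b3\u03bf\u03c2"      # the Greek word "logos"

print(to_plain_ascii(french))                 # accents removed
print(repr(to_plain_ascii(greek)))            # nothing survives
print(to_plain_ascii(greek, placeholder="?")) # one "?" per lost letter
```

Marking each lost character with a placeholder, as in the last call, at least tells the reader that something was there, which silent omission does not.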
- The RTF and PDF versions, as mentioned above, were meant
to look as much like printed editions as possible.
Indeed, when viewing the files with a word processor (in
the case of the RTF file) or Adobe Acrobat (for the PDF
file), they do look very much like a printed edition,
with nice fonts and carefully designed layout. There is
no problem in displaying foreign characters, and
footnotes are placed at the bottom of each page. RTF
files are similar to SGML-encoded files in that features
such as paragraphs and quotations are marked using plain
text codes. However, RTF is solely concerned with
appearance, and most of the encoding concerns the font
size and type. RTF does not allow for tagging linguistic,
syntactic, or semantic features of the text as
recommended by the Text
Encoding Initiative (TEI) Guidelines, and
therefore the kind of research requiring these elements
is simply not possible in this format. One advantage to
RTF, however, is that most current versions of word
processors can import texts encoded in this format. PDF
files have the same limitations as files encoded in RTF, with
the added restriction that they can only be used with
Adobe Acrobat.
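
The contrast between presentational and descriptive encoding can be made concrete with a small example. Both fragments below are hypothetical and encode the same invented sentence; neither is taken from the library's files, and the element and attribute names in the second are illustrative only and have not been checked against the TEI Guidelines. The point of the sketch is simply what a program can and cannot ask of each encoding.

```python
import re

# RTF: the group {\i ...} says only "set this run in italic".
rtf_fragment = r"{\rtf1\ansi Great is the Lord ({\i Ps. 145:3}).\par}"

# Descriptive markup in the spirit of SGML: the element records what the
# run is (a biblical reference) and which verse it points to.
sgml_fragment = ('<p>Great is the Lord '
                 '(<ref type="bible" n="Ps.145.3">Ps. 145:3</ref>).</p>')

def italic_runs(rtf):
    """Text of every {\\i ...} group: all that presentational codes reveal."""
    return re.findall(r"\{\\i ([^{}]*)\}", rtf)

def bible_refs(sgml):
    """Verse identifiers from every <ref type="bible"> element."""
    return re.findall(r'<ref type="bible" n="([^"]+)">', sgml)

print(italic_runs(rtf_fragment))   # italic text, which may or may not be a citation
print(bible_refs(sgml_fragment))   # citations, identified as such
```

From the RTF one can recover that some words were italic, but a title, a foreign phrase, and a citation would all look alike; the descriptive version can be queried for citations directly.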
- In addition, one notices immediately that there are no
hypertext links. Neither PDF nor RTF allows for links
either to other sections within the text or to external
texts, as is common in HTML documents. Both Pusey and
Outler note Augustine's biblical quotations and
allusions, for instance; the citations to chapter and
verse are merely noted in the electronic versions, just
as in the printed editions. Of course, hypertext links
are not needed to read the text; the printed editions of
Outler and Pusey were quite adequate without them.
However, it would be of great benefit to be able to link
directly to a cited verse and its fuller context. With
the texts already scanned and proofed, there is no reason
that Plantinga (or someone else with his permission)
could not create editions that take advantage of the
hypertextual features of HTML.
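
Because the citations already follow a regular chapter-and-verse pattern, a first pass at such an HTML edition could be generated mechanically. The Python sketch below assumes citations of the form "Ps. 145:3" and a purely hypothetical address scheme (`BASE_URL`); the review gives no real addresses for verse-level links, and a usable version would need a table of book abbreviations and handling for verse ranges.

```python
import re
from html import escape

# Hypothetical URL pattern, invented for this example.
BASE_URL = "http://example.org/bible/{book}/{chapter}.html#v{verse}"

# A capitalized abbreviation, a period, then chapter:verse, e.g. "Rom. 13:13".
CITATION = re.compile(r"\b([A-Z][a-z]+)\.\s(\d+):(\d+)")

def link_citations(text):
    """Escape the text for HTML and wrap each citation in an anchor."""
    def to_anchor(match):
        book, chapter, verse = match.groups()
        url = BASE_URL.format(book=book.lower(), chapter=chapter, verse=verse)
        return f'<a href="{url}">{match.group(0)}</a>'
    return CITATION.sub(to_anchor, escape(text))

print(link_citations("See Ps. 145:3 and Rom. 13:13."))
```

The output of a pass like this would still need proofreading, since an abbreviation followed by numbers is not always a biblical citation.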
- The Hypercard formatted file of the Confessions
(the Pusey translation) presents an opportunity to
achieve the kind of hypertextual links to biblical verse
and other commentary imagined above. Unfortunately, these
links do not exist in this edition either. Pusey's
introduction and notes have been omitted, leaving the
Hypercard shell as a kind of page turner. There are
advantages to this format over the others under review,
in that one may navigate quickly through chapters. The
search feature is superior to that of any word processor,
for the software shows how many matches are found in a
separate window, and allows for quick movement from one
match to another. The disadvantage is, of course, that
this version only runs on a Macintosh, using Hypercard
software.
- Plantinga has also created a World Wide Study Bible
that more closely resembles the hypertextual Bible
envisioned by DeRose and others.
He plans to link various Bible versions with
commentaries, sermons, images, and even musical scores,
and encourages world-wide cooperation in this effort
(hence the name). He has four versions of the Bible,
including the King James and RSV, and has linked them to
two commentaries, the Concise Matthew Henry
Commentary and Aaron's Bible Commentary.
He notes in his introduction to this project the
copyright restrictions on the NIV Bible, and sagely
cautions other participants in the project to abide by
these restrictions. He has left room for links to the
other types of materials listed above, but currently only
the Bible versions and commentaries are operable.
- In a chapter of any of the versions available, one may
easily link to the same chapter in the other three
versions. Unfortunately, the commentaries are not linked
by chapter, but instead by book, so that one must first
exit the chapter to find the relevant commentary. The
verses are linked to the commentary, so the commentaries
are perhaps a better starting place. Also, intratextual
allusions and typologies are not linked, but all of these
together would probably approach the million links
estimated by DeRose.
- One also notices the limitations of the HTML environment
as currently realized by viewers such as Netscape, Mosaic
or Cello. It is possible to view only one source at a
time, so a comparison of different versions of a chapter,
or reading a commentary in conjunction with a chapter, is
not possible unless one is running multiple simultaneous
WWW sessions. This is of course possible with an Ethernet
connection, but would be very unwieldy. The kind of true
hypertextual study possible with commercial publications
such as Bible Windows or CDWord
is not possible with current WWW viewers, and is a major
impediment to serious textual research. This is of course
not a fault of the World Wide Study Bible
but of the WWW environment in which it runs.
- The World Wide Study Bible is an ambitious
project and has exciting possibilities. One hopes that
others contribute to the wide range of links as
envisioned by Plantinga, and that he continues to create
electronic editions for other works in religious studies.
Such a project incorporates the two current strengths of
the WWW, namely its ability to link widely dispersed
materials, and the opportunity for collaborative projects
among widely dispersed contributors.
Afterword
- The issues of file format and copyright inform and shape
the electronic collections that are available through the
WWW. As with the Christian Classics Ethereal
Library, editions available through the WWW are
limited to those in the public domain, in order to comply
with copyright laws. In this medium, where scholars
create and publish electronic editions outside of
traditional publishers, the opportunity for
misinterpretations and misunderstandings regarding
copyright restrictions is very great. Scholars must
either trust the editor's interpretation of copyright in
regard to a particular text, or else be prepared to
double-check.
- In addition, the multiple uses of electronic editions
have spawned, in some cases, multiple versions of the
same text in various file formats, requiring a level of
technical understanding rather burdensome for an editor.
Researchers must consider and be conversant with these
file formats also, in order to choose the most
appropriate version for their purposes. Many of the
limitations on the uses of electronic texts arise when
their editors create them solely for current software and
technologies, particularly proprietary software and
technologies, leaving an unclear future as technologies
change or software companies upgrade formats. For
instance, Adobe Acrobat runs only on a limited number of
platforms, making it useless to those with only mainframe
or UNIX accounts. As discussed in the previous
review, SGML and its various instances, such as the
Text Encoding Initiative Guidelines, seem to present the
best format for electronic texts because they are then
not linked to any particular software or operating
system, and have the greatest flexibility for storage,
interchange, enhancement and reuse. However, while
providing richer possibilities for analysis, TEI-encoded
files generally remain more difficult to handle than
those files using other formats, and require both the
editor and ultimate user to have specialized software
and/or skills in order to realize the full research
potential of electronic texts. There is some hope that
popular word processors will accept TEI-encoded files as
another routine format, but this hope has not yet been
realized.
Bibliography
- Augustine. Augustine: Confessions
and Enchiridion. Trans. Albert Outler. The
Library of Christian Classics, Vol. 7.
Philadelphia: Westminster Press, 1955.
- ---. The Confessions of S.
Augustine. Trans. Rev. E.B. Pusey. Oxford: John
Henry Parker, 1840.
- Bible Windows. Ver. 3.0. Cedar Hill, TX:
Silver Mountain Software, 1994.
- CDWord. Ver. 1.0. Dallas, TX: CDWord
Library, 1989.
- Carroll, Terry.
Frequently Asked Questions About Copyright.
ver.1.1.2. Columbus, OH: Ohio State University, 1993.
- Delany, Paul and George P. Landow, eds.
Hypermedia and Literary Studies. Cambridge,
MA: MIT Press, 1991.
- DeRose, Steven. "Biblical
Studies and Hypertext." In Hypermedia and
Literary Studies. Delany, Paul and George P. Landow, eds. Cambridge,
MA: MIT Press, 1991. p. 186-202.
- Gresham, John. Finding
God in Cyberspace: A Guide to Religious Studies Resources
on the Internet. Sterling, KS: Sterling
College, 1994.
- Henderson, Cathy and David Sutton.
Writers and Their Copyright Holders. Austin,
TX: The Harry Ransom Humanities Research Center,
University of Texas, 1994-.
- The
ILTguide to Copyright. New York: Institute for
Learning Technologies, Columbia University, 1993-.
- Kraft, Robert, ed.
Offline Review. Philadelphia: Center for
Computer Analysis of Texts, University of Pennsylvania,
1984-.
- Landow, George. Hypertext: the Convergence of
Contemporary Critical Theory and Technology.
Baltimore: The Johns Hopkins U P, 1992.
- ---, ed. Hyper/Text/Theory. Baltimore: The
Johns Hopkins U P, 1994.
- Liu, Alan, ed. The
Voice of the Shuttle: Religious Studies. Santa
Barbara, CA: University of California, Santa Barbara,
1995-.
- Sperberg-McQueen, Michael and Lou Burnard, eds. TEI
Guidelines for Text Encoding and Interchange (TEI P3).
Chicago, Oxford: ACH, ACL, ALLC, 1994.
- Stover, Mark. "Religious Studies and Electronic
Information: a Librarian's Perspective." Library
Trends, Spring 1992, 40(4) p. 687-703.
- Strangelove, Michael. The
Electric Mystic's Guide to the Internet: a Complete
Bibliography of Networked Electronic Documents, Online
Conferences, Serials, Software and Archives Relevant to
Religious Studies.
Volume 1 (version 2) and
Volume 3 (version 1.3). Ottawa: University of Ottawa,
Dept. of Religious Studies, 1992-1993.
- United States Library of Congress,
Copyright Office. Catalog of Copyright Entries,
Fourth Series. Part 8, Renewals. Washington, D.C.:
Copyright Office, U.S. Library of Congress, 1979-.
- TACT: Textual Analysis Computing Tools.
Toronto: U of Toronto.
- WordCruncher software. Version 4.50.
American Fork, UT: Distributed by Johnston & Company,
1992.
Responses to this piece intended for the Readers' Forum may be
sent to the Editor at EMLS@arts.ubc.ca.
[RGS; August 31, 1995.]