

Document Management Systems

by Gary Cleveland
Network Notes #44
ISSN 1201-4338
Information Technology Services
National Library of Canada

March 7, 1997


What is a Document Management System?

When technical architectures supporting digital libraries are discussed--both at the National Library and in the wider library community--document management systems are often cited as necessary technical components, along with relational databases, full text search engines and Web servers. While the purpose of those other components is well understood, it is less clear what document management systems are and what they do. This Network Notes will attempt to shed some light on this issue.

Before answering the question "What is a document management system?" let's define "document management". Document management, in general, is the automated control of electronic "documents" through their entire life cycle, from creation to archiving. The electronic documents being managed can include any kind of digital object--bitmap images, HTML files, SGML, PDF, graphics, spreadsheets, and word-processed documents. Document management allows organizations to exert control over the production, storage, management, and distribution of electronic documents, which makes it easier to reuse information and to control the flow of documents.

So what is a "document management system"? One of the confusing aspects of this question is that the term doesn't describe a single type of software package or technology that performs a set of specific, agreed-upon functions. This is in contrast to more familiar, established technologies such as relational database management systems (RDBMS). The domain of RDBMSs is well-understood and well-defined. We know exactly what functions they perform and their place in an overall technical architecture. This is not true of document management systems.

The term "document management system" signifies a broad collection of roughly related systems that perform one or more of several functions. It's a relatively new, and as yet undefined, class of information technology, one that is used to coordinate electronic document management, storage, and retrieval. Typical functions that document management systems perform, though no system performs them all, include:

  • Descriptive filenaming: Allowing longer, more descriptive filenames, thus overcoming the problems of short filenames.

  • Indexing: Creating lists of keywords.

  • Multi-file document control: A key feature of document management systems is the treating of various files and data associated with one document as a single object during archival and retrieval transactions. Such functionality is required because complex electronic documents often consist of several files (e.g., chapters, diagrams, and photographs) each in a different file format. To manage such documents efficiently, a document management system must keep track of each file, presenting the document to users as though it were a single entity.

    A helpful analogy is that of a book ripped into several pieces, with its chapters, diagrams, and photographs physically separated from each other. The challenge would be to manage the "book" as a single work even though it exists as several pieces. Manual tasks taken for granted with a bound book become difficult and time consuming when the pieces exist independently: keeping pieces from getting lost; preserving the relationships among the sections; moving all the pieces together from one place to another; and tracking changes from edition to edition. A document management system would have the effect of electronically "binding" the separate pieces together, thereby simplifying management -- a matter of some significance when dealing with thousands of complex electronic documents.

  • Storage and retrieval: Assists in managing storage and retrieval functions.

  • "Library" services: Not to be confused with what librarians consider to be library services, this is a term used to refer to document control mechanisms such as check-in, check-out, audit trail, protection/security, and version control.

  • Workflow management: Workflow is the coordination of tasks, data, and people to make a business process more efficient, effective, and adaptable to change. It is the control of information throughout all phases of a process. The path of a particular document is determined by the document type (e.g., press releases, manuals, policy papers, memos), the processes governing a document, and organizational roles (i.e., who has the authority to see what?). It supports functions such as writing, revising, routing, commentary, approval, conditional branching, and the establishment of deadlines and milestones.

  • Presentation/distribution services: Presentation and distribution address the form and manner in which users are provided with information. Document management systems should allow "multi-purposing", where the same information can be distributed in different formats: viewed on a network (e.g., the Web), distributed on CD-ROM, or printed on paper.
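
To make the multi-file document control and "library" services items above more concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any particular product: the class name CompoundDocument, its fields, and its methods are all hypothetical. It assumes a document is simply a named set of files that is checked out, checked in, and versioned as a single unit, with each action recorded in an audit trail.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CompoundDocument:
    """Hypothetical model: several files managed as one document."""
    title: str
    files: Dict[str, str] = field(default_factory=dict)  # file name -> file format
    version: int = 1
    checked_out_by: Optional[str] = None
    audit_trail: List[str] = field(default_factory=list)

    def add_file(self, name: str, file_format: str) -> None:
        # "Bind" another piece (chapter, diagram, photograph) to the document.
        self.files[name] = file_format
        self.audit_trail.append(f"added {name}")

    def check_out(self, user: str) -> None:
        # Only one person may hold the document at a time.
        if self.checked_out_by is not None:
            raise RuntimeError(f"already checked out by {self.checked_out_by}")
        self.checked_out_by = user
        self.audit_trail.append(f"check-out by {user}")

    def check_in(self, user: str) -> None:
        # Checking in releases the lock and creates a new version of the whole set.
        if self.checked_out_by != user:
            raise RuntimeError(f"{user} does not hold the check-out")
        self.checked_out_by = None
        self.version += 1
        self.audit_trail.append(f"check-in by {user} as version {self.version}")
```

In this sketch the version number belongs to the document rather than to any one file, which is the electronic "binding" described in the book analogy: the pieces move, lock, and change version together. A real system would also need persistent storage and access control, which are left out here.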

Thus, when asking the question "What does a document management system do?" the answer is that it depends on the particular document management system in question, and what it has been created to do. Generally, a system isn't bought off-the-shelf, but is created for each specific application by integrating a number of separate components -- an RDBMS, a full text search engine -- with over-arching management routines.

So what would a document management system do in a digital library? Again, the specific functions would depend on the particular tasks that are identified for a given digital library, but a few typical functions could include:

  • automate the check-in of newly acquired electronic documents;

  • route electronic documents through the cataloguing process, presenting each document simultaneously to descriptive cataloguers and subject analysts, automatically notifying the personnel responsible, and acquiring approvals and sign-offs as applicable;

  • store processed files in a database, automatically updating indexes;

  • help manage multi-part documents as "publications," rather than sets of files;

  • supply the environment in which documents are linked to other, related resources and processes, such as bibliographic records and interlibrary loan (ILL);

  • handle copyright management.
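
As one illustration of the routing function listed above, the following Python sketch models a document that is presented to descriptive cataloguers and subject analysts at the same time and moves on to storage only once both have signed off. The names CataloguingJob and PARALLEL_STEPS, and the wording of the notices, are assumptions made for the example; they do not come from any National Library system.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical step names, taken from the two roles named in the list above.
PARALLEL_STEPS = ("descriptive cataloguing", "subject analysis")


@dataclass
class CataloguingJob:
    """Hypothetical record of one document's path through cataloguing."""
    document_id: str
    approvals: Dict[str, str] = field(default_factory=dict)  # step -> approver
    notifications: List[str] = field(default_factory=list)

    def start(self) -> None:
        # Present the document to both groups at once, with a notice for each.
        for step in PARALLEL_STEPS:
            self.notifications.append(f"{self.document_id}: ready for {step}")

    def sign_off(self, step: str, approver: str) -> None:
        # Record who approved which step; reject steps outside the workflow.
        if step not in PARALLEL_STEPS:
            raise ValueError(f"unknown step: {step}")
        self.approvals[step] = approver

    def ready_to_store(self) -> bool:
        # The document may be stored only when every step has been approved.
        return all(step in self.approvals for step in PARALLEL_STEPS)
```

A job created for a newly checked-in document would call start() once, collect a sign_off() from each group in whatever order they finish, and be passed to storage and indexing when ready_to_store() returns True.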

These and other related issues will be explored by Information Technology Services (ITS) over the coming year as part of its investigations into developing a digital library technical architecture for the National Library of Canada.


Canada Copyright. The National Library of Canada. (Revised: 1997-07-31).