Data Administration at the National Library of Canada
by Pierre Dorion
Network Notes #22
December 29, 1995
"Underlying any database is an organized structure of entity types and relationships, which also defines and explains the enterprise in a fundamental way. Capturing this structure in a model (entity data-model, or E/R diagram) is a crucial step in understanding the corresponding data in its most basic, stable and nonredundant form." (Ronald G. Ross, Editor/Publisher of the Database Research Group, Inc.)
Introduction
Data Administration was established at the National Library of Canada (NLC) in 1991 with the introduction of the AMICUS project. Data Administration defines the standards for data modelling, the contents of the data models, and the techniques used to create and maintain these models. Data modelling is a technique for representing the nature of the data the organization requires to meet its objectives. The role of Data Administration is to inventory and classify the data of the business and to provide a uniform model for the integration of systems.
The challenges of Data Administration at the Library include: introducing end users and Information Technology Services (ITS) staff to data modelling techniques for the development of relational databases; introducing techniques for developing data models with Computer-Aided Software Engineering (CASE) tools; and developing the Data Administration Standards and Procedures Guide.
Data Administration staff, ITS staff and end users are jointly responsible for developing a collection of data models that together reflect the Library's business requirements. One of the primary functions of Data Administration is to participate in the development, approval and maintenance of these models. Data Administration uses a CASE tool to define, document and store all data required to develop data models.
One of the biggest implementation challenges was introducing the Data Administration standards that maintain consistency in the naming, definition, attributes and contents of entities. It would be impossible to share data if an entity had more than one name. Data Administration staff are responsible for overseeing the standards, policies and procedures that minimize inconsistencies in data.
Mission
The primary goal of Data Administration is to participate in the development of conceptual, logical and physical data models. Data Administration manages NLC data by ensuring that all metadata is up-to-date, consistent, integrated and easily accessible. NLC data comprises all objects relevant to NLC business about which information is retained. Data Administration staff also ensure that NLC information is maintained and shared by establishing standards, procedures and guidelines.
The Data Administration Standards and Procedures Guide (DA Guide) provides a single source of reference for the standards, procedures and guidelines required to develop data models at the National Library of Canada. The DA Guide provides a coherent set of naming conventions for entities (diagrams, data elements, data relationships, code tables, etc.), which is crucial to the documentation and central management of corporate data. Naming conventions provide greater efficiency in data handling, reduce data redundancy and inconsistency, and minimize confusion among staff, management and the system integrator.
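As an illustration of how a naming convention can be enforced mechanically, here is a minimal Python sketch. The rules it checks (upper-case words joined by underscores, a 30-character limit, no duplicate names) are hypothetical and are not taken from the DA Guide; the function name `check_names` is likewise an assumption made for this example.

```python
import re

# Hypothetical convention (NOT taken from the DA Guide): entity names are
# upper-case words joined by underscores, at most 30 characters long.
NAME_PATTERN = re.compile(r"[A-Z][A-Z0-9]*(_[A-Z0-9]+)*")
MAX_LENGTH = 30


def check_names(names):
    """Return (name, problem) pairs for names that break the convention."""
    problems = []
    seen = set()
    for name in names:
        if not NAME_PATTERN.fullmatch(name):
            problems.append((name, "does not match naming pattern"))
        if len(name) > MAX_LENGTH:
            problems.append((name, "longer than 30 characters"))
        # Compare case-insensitively so "Item" and "ITEM" count as one name.
        key = name.upper()
        if key in seen:
            problems.append((name, "duplicate name"))
        seen.add(key)
    return problems


if __name__ == "__main__":
    for name, problem in check_names(
        ["BIBLIOGRAPHIC_ITEM", "bibItem", "BIBLIOGRAPHIC_ITEM"]
    ):
        print(f"{name}: {problem}")
```

A check of this kind could run whenever a new entity is proposed, so that a second name for an existing entity is caught before it reaches the data dictionary.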
Data Modelling
The Data Administration group is the custodian of all data models (entity/relationship) at the National Library. A data model is a graphical representation of the data that the organization uses to meet its objectives. An accompanying list of definitions provides, in text form, the same information shown graphically in the entity/relationship (E/R) diagram. A data model evolves through three levels: the conceptual model, the logical model and the physical model. The participation of end users, Information Technology Services (ITS) staff and management is mandatory in the development of a data model to ensure that all business requirements for the system are supported.
Several data models were developed for Phase 1 of the AMICUS project. Here are the primary models:
The subject of data modelling at the National Library will be explored in detail in a future issue of the Network Notes.
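The point made above, that a list of definitions carries the same information as the E/R diagram, can be sketched in Python. The classes `Entity` and `Relationship`, the function `list_definitions`, and the sample entities and attributes are all hypothetical; they illustrate the idea and do not describe the AMICUS models.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    definition: str
    attributes: list = field(default_factory=list)


@dataclass
class Relationship:
    source: str       # name of the entity the relationship starts from
    verb: str         # how the source relates to the target
    target: str       # name of the entity the relationship points to
    cardinality: str  # for example "1:N"


def list_definitions(entities, relationships):
    """Render the model as plain-text lines, one per entity or relationship."""
    lines = []
    for e in sorted(entities, key=lambda e: e.name):
        attrs = ", ".join(e.attributes)
        lines.append(f"{e.name}: {e.definition} [{attrs}]")
    for r in relationships:
        lines.append(f"{r.source} {r.verb} {r.target} ({r.cardinality})")
    return lines


if __name__ == "__main__":
    # Hypothetical sample model, for illustration only.
    entities = [
        Entity("HOLDING", "A copy held by a library", ["holding_id"]),
        Entity("BIBLIOGRAPHIC_ITEM", "A catalogued work", ["item_id", "title"]),
    ]
    relationships = [Relationship("BIBLIOGRAPHIC_ITEM", "has", "HOLDING", "1:N")]
    print("\n".join(list_definitions(entities, relationships)))
```

The same in-memory structure could feed either a diagramming tool or a printed glossary, which is why the two views stay consistent.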
CASE Tools
Computer-Aided Software Engineering (CASE) tools support Data Administration activities by providing an integrated set of analysis and design tools that automate the development of specifications for software systems. A CASE tool helps data analysts define, verify and document the design before coding begins. All data required to develop data models are managed, defined, organized, stored and maintained with the use of a CASE Tool/Data Dictionary/Repository.
CASE technology is characterized by components such as diagramming tools, prototyping, re-engineering tools, import/export and error-checking. Although all these components are important, the "central repository" component is the keystone. A central repository is more than just a dictionary. It is the place where all data about the system is kept, including its graphs, data and rules. Central control over the logical data model and data element definitions allows applications to share data structures and field validations. For example, no application needs to incur the cost of recreating the Bibliographic Item table layout, that is, of rewriting, retesting and maintaining the code that validates each column in the table.
Since November 1995, SILVERRUN has been the NLC CASE tool. SILVERRUN is a single-user application that runs on a PC connected to the LAN. The SILVERRUN relational data model (RDM) tool allows for:
Data Administration Group -- Ongoing Tasks
Statistics (AMICUS)
Glossary Of Terms
Copyright. The National Library of Canada. (Revised: 1997-07-30).