CPA Annual Report: 1997 - 1998
CLIR serves as the administrative home for the Digital Library Federation (DLF). Begun by 15 research libraries and archives in 1995, the Federation seeks to establish the necessary conditions for creating, maintaining, expanding, and preserving a distributed collection of digital materials for both scholars and the general public. Federation partners share the investment in developing the infrastructure that will enable them to bring together, or federate, the works they manage for their users.
"The Federation's program plans call for action in four areas: developing libraries of materials born digitally, integrating digital materials into the fabric of academic life, building core infrastructure for digital libraries, and supporting the organization of digital libraries."
The DLF has grown to 22 libraries and other organizations participating as full partners and three institutions formally allied to the Federation. The DLF has also forged working relationships with many institutions, both in the United States and abroad, that have related interests in digital libraries. Directors of the partner and allied institutions serve on the DLF Steering Committee, which, together with members of a Technical Architecture Committee and staff of the partner and allied institutions, works closely with the DLF Director in formulating and executing a rich agenda of projects, research, and other tasks designed to help the development of digital libraries.
The Federation's program plans call for action in four areas: developing libraries of materials born digitally, integrating digital materials into the fabric of academic life, building core infrastructure for digital libraries, and supporting the organization of digital libraries. The activities described below reflect the work undertaken in these areas.
Several projects have given the DLF partners a chance to contribute to the growing store of digital materials, to develop mature organizational structures, and to enhance specific aspects of digital library infrastructure and operational services. These include:
The Making of America, Part II. Led by the University of California at Berkeley and including Cornell, New York Public Library, Pennsylvania State, and Stanford, this project focuses on special collections related to the theme of transportation in the Gilded Age. A report produced in the initial phase of the project describes standard means of creating electronic links between encoded finding aid descriptions of the collections and digitized versions of collection objects. The National Endowment for the Humanities has funded the implementation phase of this project.
The Making of America, Part III. Planning began for this project, which will provide ways of integrating collections of Americana already digitized at the DLF institutions so that readers can more easily discover and retrieve them. It will also develop and demonstrate migration techniques for preserving digital information.
Social Sciences Databases Project. Reflecting the emphasis of the DLF on materials born digitally, participants in this project will identify databases that are in high demand for the undergraduate curriculum but are difficult to use and costly to support. Work will be divided among several institutions to make these databases and their codebooks available in a uniform, user-friendly, broadly accessible way.
Scope of work. To "federate" digital libraries effectively, the DLF partners must share a common understanding of what a digital library is. The definition of digital libraries they created identifies the Federation's general scope of work:
"Digital libraries are organizations that provide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of, and ensure the persistence over time of collections of digital works so that they are readily and economically available for use by a defined community or set of communities."
Support structures. The Federation supports and fosters the development of digital libraries as organizations in part by helping digital library directors identify key policy and operational issues. The directors have begun to address the shared and conflicting values associated with the distribution of intellectual property that digital libraries own and manage. Other policy questions that need attention include how closely to position digital libraries in relationship to the researcher's and scholar's desk or lab bench, and how to finance digital libraries in a larger information environment. The DLF has also planned a series of projects, workshops, and publications that will help digital library managers and staff build technical experience and a sense of professional community for creating and managing digital libraries that are highly responsive to user needs.
The durability of digital libraries depends on how deeply the materials selected are integrated institutionally into the fabric of research, teaching, and learning. A study of the DLF institutions shows that the criteria for selection must include how well the material advances one or more of the following four institutional goals: Does the material help the institution to organize and manage new forms of knowledge that are available only in digital form? Does it contribute to efforts to manage intellectual property in ways that enhance scholarly communication? Does it serve programs that aim to improve the quality and lower the cost of research and learning? Does it support institutional efforts to extend research and educational services to the general public, to special categories of constituents such as corporate partners or alumni, or to students in distance education programs?
Metadata. Metadata, or information about content, is important in helping readers identify, retrieve, and use digital materials. Accordingly, the DLF distinguishes descriptive, structural, and administrative metadata. Information that describes content helps readers know of an item's existence and characteristics in relation to other information and to particular information needs. The digital environment has allowed new kinds of descriptive information, such as the encoded archival description (EAD) for archival materials, to be developed. It has also allowed the integration and manipulation of descriptive information about materials in digital and other formats.
The difficulties and added costs of managing digital information arise from the uncertainty and lack of standard practice regarding the creation and use of structural and administrative metadata. The Making of America, Part II, project is developing practices for structural metadata, that is, information about the internal structure of digital materials that is essential for organizing their delivery in a consistent and coherent way. Administrative metadata is information about a digital resource that facilitates the management of its intellectual property, as well as its long-term preservation. The DLF is advancing understanding of administrative metadata requirements in digital libraries through its work on access management and archiving, as described below.
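As a rough illustration of the three-way distinction drawn above, the following Python sketch models one digital object that carries descriptive, structural, and administrative metadata. Every class and field name here is hypothetical, chosen only for illustration; none is taken from a DLF or Making of America specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DescriptiveMetadata:
    """Helps readers learn that an item exists (hypothetical fields)."""
    title: str
    creator: str
    subjects: List[str] = field(default_factory=list)


@dataclass
class StructuralMetadata:
    """Records internal structure so the item can be delivered coherently."""
    page_files: List[str] = field(default_factory=list)  # in reading order


@dataclass
class AdministrativeMetadata:
    """Supports rights management and long-term preservation."""
    rights_statement: str
    file_format: str


@dataclass
class DigitalObject:
    """One digital item with all three kinds of metadata attached."""
    descriptive: DescriptiveMetadata
    structural: StructuralMetadata
    administrative: AdministrativeMetadata

    def page_count(self) -> int:
        # The structural record, not the description, knows the page order.
        return len(self.structural.page_files)
```

Keeping the three records separate means each can be maintained by the people who care about it (catalogers, delivery systems, rights and preservation staff) without touching the others.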
Distributed finding aids. There are many ways that libraries can present finding aids that are created as encoded archival descriptions. The DLF supported research at Michigan and Harvard Universities to explore the means and costs of searching encoded finding aids that are distributed among different institutions, rather than collected in a single repository.
Workshop on editorial practice. The DLF seeks to enhance the editorial practices of Electronic Text Centers and others engaged in digital conversion as a form of publication. It sponsored a conference of leading text center staff focused on the application of the Text Encoding Initiative (TEI) standards in library-based text encoding projects, the ramifications of XML development for existing and future text encoding programs, and the future governance and stability of the TEI standard.
Licensing digital materials. The DLF funded Ann Okerson of Yale University to develop software that will support library and publisher licensing efforts. Tentatively entitled "The LIBLICENSE Guide to Digital Information Licensing Agreements," the software will systematically query librarians (or producers) about the details of the information to be licensed and, based on that input, produce a draft license agreement. The draft license agreement can then be sent to information publishers (or customers) to serve as the basis for further negotiations for license agreements with acceptable terms.
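The query-then-draft workflow described above can be sketched in a few lines of Python. This is only an assumption about how such a tool might be organized and is not the LIBLICENSE software itself; the question list, the template wording, and the function name `draft_license` are all hypothetical.

```python
from string import Template

# Hypothetical questions a librarian (or producer) would be asked.
QUESTIONS = {
    "licensor": "Who is licensing the material?",
    "licensee": "Which institution is acquiring access?",
    "resource": "What information resource is being licensed?",
    "term_years": "For how many years does the license run?",
}

# Hypothetical draft wording; a real agreement would be far longer.
DRAFT = Template(
    "DRAFT LICENSE AGREEMENT\n"
    "$licensor grants $licensee access to $resource "
    "for a term of $term_years year(s).\n"
    "This draft is a starting point for further negotiation."
)


def draft_license(answers):
    """Fill the draft template from the answers to QUESTIONS."""
    missing = [key for key in QUESTIONS if key not in answers]
    if missing:
        raise ValueError("unanswered questions: " + ", ".join(missing))
    return DRAFT.substitute(answers)
```

Calling `draft_license` with a dictionary that answers every question returns the draft text; leaving a question unanswered raises an error rather than producing an incomplete agreement.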
Access management. The DLF is addressing the technical and other problems associated with helping users gain authorized access to networked information. The Federation sponsored a planning meeting for a project that is now under way involving the libraries and information technology divisions of universities belonging to the Committee on Institutional Cooperation (CIC). The project will implement a protocol that enables one institution to accept users of its resources who have been authorized at other institutions. The DLF participated in the development of the CNI White Paper on Authentication and Access Management, which identifies various technical options and sets forth criteria for evaluating their effectiveness. With the support of the National Science Foundation, the DLF and Columbia University's Center for Research on Information Access (CRIA) sponsored a day-long workshop in April to develop formal requirements for authorization systems that are more sophisticated and versatile than those in common use today. The report of the workshop identifies a set of themes that can guide systems designers and developers of prototype systems for information access.
Emulation. CLIR commissioned a report from Jeff Rothenberg, computer scientist at the RAND Corporation, to document and assess existing models of digital archiving and to develop his theories of software emulation. His interim report argues that migration strategies are simply too labor-intensive to be viewed as a reliable preservation treatment, especially in an era of such dynamic change as our own, when standardization, which is critical for migration, is not feasible.
Migration. CLIR has also commissioned a study at Cornell University to explore the degree of preservation risk associated with various formats of materials in digital form. Cornell is developing a risk assessment tool and investigating practical preservation procedures to carry out migration strategies for materials in the selected formats. The products of this project should help other libraries in managing their digital collections.
Digital Library Architectures. The DLF has commissioned a survey of recent literature on the systems architecture of digital libraries. The survey will highlight selected component systems and show how they relate to a larger architectural whole. It will assess which components are relatively well conceptualized and developed, which need further attention, and where gaps might exist in the overall conception of digital library architecture. The survey report will serve as the basis for a forthcoming workshop for library systems staff, to be sponsored by the DLF Technical Architecture Committee.
Persistent names. As part of its focus on the infrastructure for discovery and retrieval, the DLF is helping to design a framework for the persistent identification of digital materials. It has formulated a research agenda to create the means of linking a reference to a digital work to the multiple repositories where the work may reside. This is a difficult problem that DLF libraries are now encountering in the distribution and use of digital materials. The DLF is identifying research partners as part of the development of a formal proposal in the NSF-sponsored Digital Library Initiative funding program.
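The core idea, a single stable reference that resolves to every repository holding a copy of the work, can be illustrated with a small Python sketch. It is a toy model under stated assumptions and not the framework the DLF is designing; the class `NameResolver`, its methods, and the example identifier and URLs are hypothetical.

```python
class NameResolver:
    """Maps a persistent name to the locations where the work resides."""

    def __init__(self):
        self._locations = {}  # persistent name -> list of repository URLs

    def register(self, name, location):
        """Record that a copy of the named work is held at `location`."""
        known = self._locations.setdefault(name, [])
        if location not in known:
            known.append(location)

    def resolve(self, name):
        """Return every known location for the name, in registration order."""
        if name not in self._locations:
            raise KeyError("unknown persistent name: " + name)
        return list(self._locations[name])


resolver = NameResolver()
resolver.register("example:work-0001", "https://repo-a.example.org/work-0001")
resolver.register("example:work-0001", "https://repo-b.example.org/items/17")
```

Because citations carry only the persistent name, a repository can move or add a copy by updating the resolver, and existing references continue to work.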