CPA Newsletter #91, Jul-Aug 1996

Commission on Preservation and Access

Newsletter

July-August 1996

Number 91

Planning Task Force Recommends NDLF Leadership Roles

The National Digital Library Federation Policy Board stated in October 1995 that the construction of a national digital library must respect and accommodate local decision making at each institution, while also identifying and endorsing processes and standards necessary for a coherent network of scholarly information resources and services. The NDLF Planning Task Force has found in its deliberations since that time that an organization founded on these principles of federation is not only feasible and compelling, but imperative to ensure the affordable and beneficial use of digital technologies by the higher education community.

The task force found that much of the technology that would facilitate a federated approach to a national digital library is either already available or well advanced in the process of development. However, if individual work is to contribute to a greater whole–to the construction of a national digital library–it will need to be based on a set of common structures and protocols.

In its report to the Policy Board, the task force identified three areas where the research library community can exert leadership.

Discovery and Retrieval. The heterogeneity of the information available in digital form–different data structures, search engines, vocabularies for access–significantly challenges users in their ability to identify and retrieve needed information. To lower the barriers to access for these heterogeneous materials and to provide cross-collection search capability, the task force has charted a multi-step course of action.

First, a pilot effort is underway to build a model for “institutional gateways” to digital collections, which will allow the aggregation and browsing of digital information by categories of material (e.g., journals, special collection materials, spatial data). These pilot gateways are currently accessible via the Federation home page on the World Wide Web.

Second, to build on this initial step, the Federation should explore adding functionality to the World Wide Web gateways through the incorporation of Internet indexing tools for the Web space of Federation participants.

Third, the Federation needs to develop more formal database support for cross-collection search capability. There are a variety of possible database solutions, including the use of SGML (standard generalized markup language), but perhaps the most important issue is for institutions to agree to guidelines for the use of a minimal set of metadata elements in a portable form. These metadata should build on existing efforts. The elements need to be mapped to MARC or other record formats as desired. They must incorporate naming conventions for digital objects, and they should include descriptive attributes for other infrastructure elements, such as rights and archival status.
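As an illustration only, the kind of minimal, portable metadata record described above might be sketched as follows. The element names, the sample values, and the choice of MARC field tags (100, 245, 540, 583, 856) are assumptions made for this sketch; the task force report does not specify an element set or a mapping.

```python
from dataclasses import dataclass

# Hypothetical mapping from element names (invented for this sketch)
# to MARC field tags chosen for illustration.
MARC_MAP = {
    "creator": "100",          # main entry, personal name
    "title": "245",            # title statement
    "rights": "540",           # terms governing use and reproduction
    "archival_status": "583",  # action note
    "object_name": "856",      # electronic location and access
}


@dataclass
class MinimalRecord:
    """A minimal set of metadata elements for one digital object."""
    title: str
    creator: str
    object_name: str       # name assigned under a shared naming convention
    rights: str            # rights statement for the object
    archival_status: str   # declared level of archival commitment

    def to_marc_fields(self):
        """Return (tag, value) pairs sorted by MARC tag."""
        pairs = [(tag, getattr(self, name)) for name, tag in MARC_MAP.items()]
        return sorted(pairs)


record = MinimalRecord(
    title="Sample Journal, vol. 1",
    creator="Doe, Jane",
    object_name="example.edu/journals/sample/0001",
    rights="Open to Federation institutions",
    archival_status="committed",
)
for tag, value in record.to_marc_fields():
    print(tag, value)
```

Keeping the element set small and the mapping in a separate table means the same record could be mapped to other record formats by swapping the table, which is the portability the guidelines call for.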

Intellectual Property Rights and Economic Models. Since most of the technical requirements for the management of intellectual property rights are now–or will shortly become–available, the Federation should concentrate on putting in place a clear and articulate rights policy to regulate rights relationships among Federation institutions. Such a policy will have the effect of organizing common access to digital objects and creating incentives for institutions to make digital objects they hold readily accessible via the infrastructure. The principles that the task force considers fundamental to an effective rights policy, and that it recommends for further investigation and testing, include the following:

  • Maximize scholarly access to digital objects, the intellectual property rights of which are vested with the local institution; avoid or keep to a minimum interinstitutional charges for access to those objects.
  • Work together and with other partners to influence intellectual property rights legislation and the rights policies at individual institutions; take aggressive action to preserve fair use rights; ensure that scholars are aware that the Federation infrastructure provides them with an opportunity to explore alternative methods of scholarly publishing.
  • Monitor and, when possible, participate in groups and projects, such as the Common Solutions Group and the Mellon-funded publishing projects, that are creating technologies affecting rights management.

A further policy element is how rights to intellectual property affect the economic relationships needed to support the creation, accessibility and maintenance of content in digital form. The Planning Task Force sees the development of the Federation as an opportunity to begin defining and resolving the issues involved in the interaction of rights management and economic organization. The economic issues that most affect the possible future of the Federation and interact directly with the principles of rights management include the need to:

  • establish models of collaborative funding within the research and learning community;
  • create pools of investment capital to support the development of content, access structures and preservation mechanisms;
  • define and rationalize the costs of digital access and preservation; and
  • create revenue streams that recoup development costs, cover ongoing costs and provide incentives for institutions to share and distribute content.

Archiving of Digital Information. Perhaps the greatest test of adherence to the goal of creating a national digital library is a commitment to preserve culturally significant digital information as part of the national heritage. The Federation can foster and facilitate a commitment to digital archiving in at least three ways.

First, it can assist in the development of the legal foundations for digital archiving. In the development of its rights management policy, the Federation can help ensure that a common feature of purchase agreements and licenses for digital information is clarity about whether the information will be archived and which party has archival rights and responsibilities. The Federation should also have an interest in defining the archival fail-safe mechanism for which the Task Force on Archiving of Digital Information called in its recently issued final report.

Second, the Federation can encourage digital archiving by providing, in its metadata work, a mechanism and clear guidelines for institutions to declare the level of commitment to archiving the material they have made available through Federation means.

Third, the Federation can recognize that the prospects of migrating digital information into the future are today more promising and economical for some kinds of materials than for others. To the extent that the Federation helps institutions discriminate among digital materials by the ability to migrate them and develops corresponding guidelines and best practices for digital materials, it assists in the creation of a trustworthy or, in the words of the report of the Task Force on Archiving of Digital Information, a “certified” process for preserving digital information.

The Policy Board has taken the recommendations of the task force under advisement and is considering how this work can be supported and continued. Meanwhile, the Commission and Council have announced contracts that address some of the areas (see articles in this newsletter). For background see newsletters of March and June of 1996, and June, July-August, October, November-December of 1995. See also the NDLF Web site, http://lcweb.loc.gov/loc/ndlf/

Yale Project Addresses Archiving Concerns

A contract from the Commission will support a pilot project on the preservation of digital information in Yale University’s Social Science Data Archive. The project employs a two-pronged preservation strategy of migrating digital files and digitizing related paper records for enhanced access. Preserving Digital Information, the final report of the Task Force on Archiving of Digital Information (DATF), recommended that the Commission support follow-on studies to establish best practices and to benchmark costs for archiving digital files. The project embodies a substantive and early response to those recommendations.

The Yale University Library, one of the first academic libraries to form a collection of machine-readable data, began collecting numeric data in 1972. The collection includes materials from the Roper Center for Public Opinion Research, whose data files provide a record of public opinion research in the U.S. from 1935 to the present, along with surveys conducted abroad since the 1940s.

Over the years, Yale has copied its data from one form of digital storage to another as mainframe computer technology has dictated. The copying of data, while labor-intensive, was straightforward in creating exact logical copies from out-of-date media to newer data storage formats. Now, as users move from the mainframe to distributed computing systems and from one hardware and software configuration to another, digital formats that are mainframe-dependent require not just simple duplication, but restructuring.

Yale staff will select a small, representative collection of important studies from the Roper Center materials for this project. They will migrate digital numeric information, now stored on computer tape, to a system-independent format.

The Roper Center materials also include deteriorating paper records necessary for use of the data files. Several options are available for digitizing them, including optical character recognition, image scanning, text encoding, and manual data entry. Yale staff will analyze the alternate methods and costs. Staff will restructure the digital files of data and documentation, migrate them to a format that allows networked online storage, and create metadata for the resulting files. As a result, researchers will be able to use numeric data and the descriptive information, both now and into the future, no matter when and how the data were created.

The project will advance a design on which Yale and others can subsequently build. Out of this experience, Yale will create a model of the process needed to handle a large-scale project that entails migration of data and preservation of accompanying documentation. The contract calls for Yale to produce a report to help other institutions working on digital archiving projects. 

What’s New on the Web

As described in the article, “Planning Task Force Recommends NDLF Leadership Roles,” pilot institutional gateways to digital collections are accessible via the Federation home page: http://lcweb.loc.gov/loc/ndlf/.

At the Commission’s Web site, you can quickly link to the names of Commission sponsors, the Publication List, and electronic mail to staff. Direct links have been added to the top of the home page. A catalog with descriptions of Commission publications is being added to the site.

The Commission Web pages are maintained at Stanford University. During May, Stanford logged the following statistics:

Analyzed requests from May 1, 1996 to May 31, 1996

  • Total completed requests: 12,851
  • Average completed requests per day: 415
  • Number of distinct files requested: 326
  • Number of distinct hosts served: 4,452
  • Number of new hosts served in last 7 days: 911
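As a quick check on the figures above, the daily average can be reproduced from the total and the reporting period. This is a minimal sketch using only the numbers reported here; the requests-per-host ratio is a derived figure that does not appear in Stanford's log summary, and the variable names are illustrative.

```python
from datetime import date

# Figures as reported for May 1996.
total_requests = 12851
distinct_hosts = 4452

# The reporting period is inclusive of both endpoints.
start, end = date(1996, 5, 1), date(1996, 5, 31)
days = (end - start).days + 1

# Average completed requests per day, rounded to a whole number.
average_per_day = round(total_requests / days)

# Derived figure (not in the original summary): requests per distinct host.
requests_per_host = total_requests / distinct_hosts

print(days, average_per_day, round(requests_per_host, 1))
```

The rounded daily average agrees with the 415 reported above, and each distinct host made just under three requests on average.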

We welcome corrections and suggestions regarding the Web site. Please contact Maxine Sitts, mksitts@cpa.org.

Cornell, Michigan to Expand Making of America Project

Cornell University and the University of Michigan, participants in the National Digital Library Federation (NDLF), will begin an expanded phase of the Making of America project at a two-day meeting supported by a contract from the Commission. The meeting is likely to launch the first substantive demonstration project of the NDLF.

Making of America is a multi-institutional project to preserve and make accessible through digital technology a significant body of thematically related sources on the history of America between 1850 and 1950. It is envisioned as a phased effort, focusing on an array of themes and historical epochs that support the building of coherent collection segments at cooperating institutions. With support from The Andrew W. Mellon Foundation in the pilot phase, Cornell and Michigan are selecting, scanning, and providing online access to 5,000 monographs and journal volumes that document 19th-century America. The pilot phase involved Cornell and Michigan; the next phase will include additional NDLF institutions.

At the meeting, two senior representatives from each invited institution will consider key issues raised in multi-institutional efforts to share digital resources: interoperability, technical standards, intellectual property rights, and others. Participants will address such specific matters as:

  • the object and products of the second phase
  • incentives for participation
  • sources of funding
  • revisions needed to move beyond the pilot phase
  • mechanisms to ensure an appropriate and effective level of interoperability for a distributed digital collection
  • technical and structural issues including conversion of color and non-published materials, naming conventions, security, and metadata
  • principles and strategies for selection of materials
  • promotion, user instruction, and user feedback
  • methods and responsibilities for long-term archiving, and the potential to test or implement recommendations of the CPA/RLG Task Force on Archiving of Digital Information.

Commission/CLR Contract to Support Electronic Licenses Tool

A contract awarded jointly by the Commission and the Council on Library Resources is supporting an 8-month project at Yale University to develop an online tool to help academic libraries negotiate license agreements with providers of electronic resources. The tool will help those dealing with ownership, lease, or access to remote databases, CD-ROM and networked resources, and other forms of electronic information.

Many libraries have contacted the Commission and Council–as well as peer institutions and other organizations–seeking guidance about license agreements. Too often, publishers present agreements that are unsatisfactory for library purposes. They are often far more restrictive than current copyright law, but few libraries possess the technical and legal expertise–not to mention the time–to negotiate and close license agreements that allow appropriate service to patrons. Yale will draw on its successful licenses to mount on the World Wide Web a multi-faceted tool:

  • an introductory essay
  • an “anatomy of a license” (or model agreement) with hyperlinks to definitions and vocabulary, examples and assessments of good and poor language, and citations of and links to printed and online information resources
  • a database of Yale’s key electronically licensed titles and their main license terms

Library administrators and law school personnel at Yale will develop the tool, with advice from knowledgeable attorneys and law librarians beyond the university.

The project addresses needs articulated by the National Digital Library Federation Planning Task Force. The online tool will be a useful resource for librarians and university attorneys, and may serve as an educational document for publishers as they develop markets for their electronic publications.

Board Accepts Heilbron Resignation

At the May 10, 1996, meeting, the Commission Board accepted with regret the resignation of John L. Heilbron, Professor, Graduate School, University of California, Berkeley. Heilbron, who currently resides in England, joined the Commission in 1991 when he served as Vice-Chancellor of the University of California, Berkeley.

In accepting the resignation, Chairman Billy E. Frye stated, “You have contributed a great deal to the Board’s discussions, but more importantly, you have acted upon your firm commitment to the cause of preservation and access. Your scholarly perspective has reminded us all of the primary reason we persist in these efforts.”

Two ARL Publications Survey the Preservation Landscape

Digitization Projects

A new SPEC Kit from the Association of Research Libraries (ARL) provides information on 46 digitization projects in 29 libraries. A survey distributed in November 1995 forms the basis of the kit and reveals the variety among projects and practices in U.S. and Canadian libraries.

Photographic materials, archival collections, and books were the source materials most commonly being digitized. Funding for the projects was equally divided between internal and external sources, and a third were done in cooperative initiatives. Four sites had set up an archiving program for their electronically stored images, and 15 had created permanent copies on microfilm or other media. Responses suggest that much work remains to be done in the area of bibliographic control and underscore the need for technical standards.

Digitization Technologies for Preservation (SPEC Kit 214, March 1996) includes:

  • synopses of all 46 projects and detailed profiles of 8
  • National Agricultural Library selection criteria and guidelines for digital preservation
  • job descriptions
  • sample bibliographic records
  • publicity materials
  • a bibliography

Preservation Statistics

After the emergence and dramatic growth of preservation programs in research libraries during the 1980s, preservation expenditures have leveled off in the past two years, according to the ARL Preservation Statistics 1994-95. Statistics for personnel, expenditures, conservation, preservation treatment (deacidification and preservation photocopying), and microfilming reveal mostly small increases. There are two notable exceptions, however: conservation activities generally declined, and there was a marked increase in the number of titles and volumes microfilmed.

ARL members spent over $79 million for preservation, a small increase from the previous year. Individual library expenditures ranged from $127,000 to $3.8 million, and from 1% to 10% of total library budgets. Grants from the National Endowment for the Humanities (NEH) and other external sources accounted for about 13% of the total expenditures, and those funds were used predominantly in preservation microfilming projects.

Eighty-one libraries had a preservation administrator (defined as one who spends at least 25% of time managing a program), and 61 of those were full-time preservation managers.

Digitization Technologies for Preservation is available for $40 ($25 for ARL members), and ARL Preservation Statistics 1994-95 is available for $65 ($35 for members), plus $5 per item for shipping and handling. Prepayment is required. Direct orders and inquiries to ARL Publications, Department #0692, Washington, DC 20073-0692, email: pubs@cni.org.

LC Issues RFP for Digitization, Text Conversion

A request for proposals (RFP) issued by the Library of Congress (LC) on May 15 solicits offers to create 1.3 to 2.7 million digital images and digitize selected texts in a project that may extend 5 years.

Source materials will include archival documents, books, and other printed matter–most of which are unique, valuable, and in fragile condition. Owing to the nature of the documents and the anticipated uses of the digital files, the RFP includes special requirements. It outlines stringent guidelines for physical handling of materials, requires that scanning personnel attend a training session led by LC’s Binding and Collections Care Division and Conservation Division, prohibits disbinding of most volumes, and proscribes certain scanning equipment and techniques. Because of the value of the materials, LC will require many items–perhaps about 80% in the first year–to be scanned within the library.

The RFP calls for printed texts, once digitized, to be converted to a version of Standard Generalized Markup Language (SGML). The RFP specifies outcomes and quality criteria but not the methods to be used for conversion and encoding. Offerors are free to propose manual re-keying, an automated process such as optical character recognition (OCR), and other approaches.

Like LC’s earlier digitization RFP (see the May 1996, no. 89, newsletter), this one specifies a file naming structure that will link the scanned images to the library’s bibliographic records and finding aids.

The project includes 19th-century sheet music, music manuscripts, theater playbills, reports of slavery trials, Native American legal materials, early congressional documents, published materials related to the Continental Congress and Constitutional debates, letters from the Presidential Papers, and selected books and periodicals. Like the earlier one, it is part of LC’s National Digital Library Program.

Proposals were due July 11. Copies of the RFP are available, while supplies last, from: The Library of Congress, Contracts and Logistics Service, 1701 Brightseat Road, Landover, MD 20785; fax 202-707-8611. Requests must reference RFP96-18.

EROMM Report Updated

The European Commission on Preservation and Access (ECPA) has issued its first publication: an updated version of European Register of Microform Masters (EROMM)–Supporting International Cooperation. The few revisions reflect new realities of access, such as the use of CD-ROM for distributing files and the ability to search EROMM through the World Wide Web. Dr. Werner Schwartz, who wrote the initial version published by the Commission in May 1995, prepared the updated report.

The EROMM database has grown to well over 300,000 bibliographic records. Nine libraries are participating as partners in 8 European countries, and several others contribute records of their microform masters. The EROMM Steering Committee is working to develop record exchange mechanisms with national bibliographic databases in Australia, South America, and the U.S.

The ECPA chose the report as its first publication to provide greater awareness within Europe of the EROMM activities. Copies are available, while supplies last, from the ECPA Secretariat, P.O. Box 19121, NL-1000 GC Amsterdam, The Netherlands, or by sending an email request to: ecpa@bureau.knaw.nl.

New Commission Catalog

The Commission has created an annotated catalog of its reports to help individuals identify the materials that are most relevant to their needs. Each entry describes the publication’s scope and focus, along with the publication date, length, ISBN, and cost.

The catalog is available at no cost by mailing or faxing a request to the Commission or by sending an email request to Alex Mathews (amathews@cpa.org). It also will be mounted on the Commission’s Web site.

Microfilming Book Provides Guidance

Preservation Microfilming: A Guide for Librarians and Archivists, 2nd edition, provides significant new guidance on planning and managing microfilming projects, cooperative filming efforts, evaluating service bureaus, bibliographic control, and the relationship between microfilming and digitization. Written by Lisa Fox, the book was designed to complement the RLG handbooks, Archives Microfilming Manual and Preservation Microfilming Handbook, and incorporates relevant national standards. The 394-page book was published by the American Library Association in cooperation with the Association of Research Libraries; OCLC provided financial support. It costs $70 ($63 for ALA members) and is available from: Book Order Fulfillment, ALA, 155 N. Wacker Dr., Chicago, IL 60606-1719; 800-545-2433, press 7; fax 312-836-9958.


Commission on Preservation and Access
1400 16th Street, NW, Suite 740
Washington, DC 20036-2217
(202) 939-3400 Fax: (202) 939-3407

The Commission on Preservation and Access was established in 1986 to foster and support collaboration among libraries and allied organizations in order to ensure the preservation of the published and documentary record in all formats and to provide enhanced access to scholarly information.

The Newsletter reports on cooperative national and international preservation activities and is written primarily for university administrators and faculty, library and archives administrators, preservation specialists and administrators, and representatives of consortia, governmental bodies, and other groups sharing in the Commission’s goals. The Newsletter is not copyrighted; its duplication and distribution are encouraged.

Deanna B. Marcum–President
Maxine K. Sitts–Program Officer, Editor