Contents
Establishing Minimum Requirements for Archival Repositories
The Challenge of Preservation in Creating Digital Library Services
Managing our Collections by Managing our Risks
The Organizational Impact of Digital Library Initiatives
Rebecca Graham to Oversee Digital Library Program at Johns Hopkins
College Libraries Committee Sets Agenda
Enhancing Digital Libraries through the Use of Knowledge Organization Systems
LAST FALL, THE Coalition for Networked Information (CNI) and CLIR convened a small group of library directors and publishers and asked what would be required to ensure the persistence of an electronic journal for one hundred years. CNI has since enlarged the group and continued the discussion. Other organizations, such as the Society for Scholarly Publishing and the National Science Foundation, have also devoted meetings to the subject of long-term maintenance of digital information. These and other discussions have brought to light many questions focusing on responsibility for archiving, technical strategies, the connection between archiving and access, and economic models. These questions cannot be answered in the abstract. To make progress, projects must be started, the processes documented, and the costs calculated. Following strategy sessions with digital library experts, CLIR agreed to convene a series of meetings to identify practical next steps to establish some archival repositories, to seek publishing partners to populate the archives, and to develop the necessary licensing apparatus to ensure that the archiving strategy accommodates the interests of users. For publishers and librarians to feel confident about archival repositories, both parties must agree upon the criteria for archivability. Thirteen librarians met in Washington on April 14 to develop a framework for establishing archival repositories and to agree upon the minimum requirements. Daniel Greenstein, director of the Digital Library Federation, extracted relevant requirements from the existing Open Archival Information System (OAIS) model and framed them in the context of electronic journals as a place to begin. Also providing a framework for discussion was a summary of archival repositories for electronic journals that Clifford Lynch presented at the CNI meeting in March.
Although the exact language of the eight criteria for archivability is still being developed, the librarians agreed in principle with the following:
Fuller discussions of these criteria are available on the Digital Library Federation Web site. The librarians agreed to observe these criteria in developing local and consortial archival repositories for electronic journals. They further agreed to document the processes they follow in archiving, and to make available the information about costs. CLIR will work with the libraries to gather information on processes and costs, and will disseminate the findings widely to the library, archival, scholarly, and publishing communities. The next step is to convene a group of publishers in early May to discuss these criteria. In June, the licensing group will meet. Then, CLIR intends to bring the three groups together to negotiate criteria of archivability and terms of deposit of electronic journals.
THE PERSISTENCE OF digital information remains an essential challenge in building the online service environment of the digital library. A few libraries and library organizations are poised to develop limited digital archival repositories. Their progress may rely upon the emergence of two elements that are currently absent. First, there is no widespread agreement about the minimum functional requirements of a digital archival repository. Such agreement is essential. Without defining what maintenance entails (and thus the requirements of the repository), libraries cannot tell suppliers of digital content what is needed to preserve the information. The suppliers need to agree on the requirements of a repository to satisfy any demand that libraries may make with regard to that content's persistence. (See "Establishing Minimum Requirements for Archival Repositories," above.) In addition, for emerging repositories to be trusted, whether as suppliers or consumers of digital content, they require a blueprint for the services they need to offer and a benchmark against which their services can be measured and validated. A second element that is absent from the digital preservation arena is a more realistic understanding of the value of digital information. The costs of maintaining digital information over time are unknown but undoubtedly high. The costs of information loss are likewise unknown, but the potential costs must be considered. For example, a drug company maintains data generated in the development of a new product for as long as those data have value to the company. Such data might be kept as evidence in the case of legal action; the costs of not preserving the data could be ruinous. In this context, preservation may be expensive but less so than the alternative. It would be difficult for libraries to make similar assessments, given their overwhelming focus on commercially produced scholarly materials (e.g., journals and reference services).
Moreover, because of the number of subscriptions they hold, it would be unlikely that any single library or library consortium could take responsibility for preserving such content over the longer term. Nor does long-term preservation motivate the commercial supplier. And the commercial supplier's understanding of "longer term" will understandably be at variance with that of the library. Might we begin, then, with digital information for which no other body is likely to take an archival interest: with the digital surrogates, for example, that are created by some libraries? This is not to suggest that all digital surrogates must be preserved. The UK's National Gallery periodically re-digitizes its collection of some 2,500 art objects to take advantage of new imaging technologies. The same strategy is not necessarily advisable for all, especially those conducting projects to digitize tens or even hundreds of thousands of individual objects. The question to be addressed is not only about the costs of preservation but also about the higher costs that are likely to be involved in periodic re-digitization. And what about the digital content emanating from surrounding academic departments, which makes up an increasing proportion of the university's intellectual assets? Computer-based research, learning, and teaching materials have significant value. Yet, that value is fully realized only if the materials are assembled into professionally managed collections and maintained over time. Admittedly, decisions to maintain the university's intellectual assets will not be made by the university library in isolation. The information content that is available from the university's digital library makes up only one part (a very important part to be sure) of the university's portfolio of information assets. To determine its value and the bearable expense involved in its preservation, the entire portfolio needs to be reviewed.
In the university context, progress in digital preservation is likely to require institutional ownership of a far broader preservation problem.
IT IS NOT unusual for a library to field questions from its oversight bodies such as "How much are your collections worth?", "How do you decide how much to invest in security, cataloging, preservation, or collection development?", and "What if you digitize your collections?" These questions beg more fundamental questions about the role of collections in contemporary libraries and the funding required to make them productive. For as long as libraries have existed, ownership of collections has been central to the mission of providing information to patrons. But it is no longer true that libraries must own (or even have physical custody of) an item to serve it to a patron. First with the growth of interlibrary lending, then with the spread of networked resources, libraries have slowly begun to uncouple collections and services. This development coincides with the increasing demand by funding organizations to manage library services and collections in a businesslike way. CLIR has published a report, Managing Cultural Assets from a Business Perspective, that describes how the Library of Congress developed and implemented a plan for greater accountability over its collections. The report is a case study that, while focusing on one institution, sets out a model that can easily be adapted to every type of library, no matter how small, how specialized, or how atypical. This report is published with the cooperation of the Library of Congress and written by Laura Price of KPMG LLP and Abby Smith, now of CLIR but formerly with the Library of Congress. The Public Services Assurance practice of KPMG LLP, an international audit and business advisory firm, developed the business risk model for the Library of Congress. Both Ms. Price and Ms. Smith were deeply involved in adapting the business risk model to the daily operations of a major cultural institution.
Given how little data we have about the relationship between access to collections and the productivity of scholars, the achievements of students, or the enlightenment of the public who use library resources, many managers are rightly concerned about treating the library business as a business. Nonetheless, it is not difficult to translate the work that goes on in libraries into the language of business. Nor is it incorrect to assess the effectiveness of libraries by investigating how responsibly staff members meet their obligations as custodians of collections. Responsible stewardship is, after all, at the very heart of professional librarianship. When appropriately adapted to the library environment, the business risk assessment model effectively addresses the major challenges facing library managers, funders, and staff in the course of running a library. It defines library collections as core institutional assets and defines good stewardship as a dynamic process of identifying risks to the collections and instituting policies and procedures that mitigate the risks. This model is valuable to managers because it is designed not only to identify risks to assets, but also to determine which risks are unacceptable and what measures must be taken to reduce them. It guides managers' decisions about investments in their collections, and is grounded in the individual mission of each library. The fundamental step in determining what constitutes the chief risks to a collection is to quantify what threatens its fitness for use. What good would a book be if no one could use it, and what could happen to a book that would render it unusable? It could become embrittled and crumble (a preservation risk). It could become misplaced, inadvertently through misshelving (an inventory control risk) or deliberately through theft (a security risk). It could be incorrectly cataloged and hence be unretrievable (a bibliographic risk).
These hazards are well-known to librarians, and staff members spend much of their effort in reducing the chances that any of those things will happen. Libraries can effectively serve patrons if they have bibliographic controls that tell them what they have, inventory controls that tell them where the items are, preservation controls that mitigate loss of information caused by physical deterioration, and security controls that prevent unauthorized removal of items from the library. The risk assessment model is flexible and dynamic: it allows for the great variety of physical formats and intellectual value of library materials and permits one to define risk at any given moment in the life cycle of an item. For example, an eighteenth-century manuscript leaf is more vulnerable to theft than a book, while a book may be more vulnerable to embrittlement than a manuscript. Digital materials carry entirely different risks. Policies and procedures to control these materials must derive from those perceived threats. All staff who handle the materials, from those who unload incoming materials at the loading dock to the registrar who documents collection items leaving for exhibitions, are responsible for following those procedures to reduce risk to acceptable levels. By looking at the life cycle of library materials, the risk assessment model widens the definition of who is involved in the stewardship of library assets to include information technology staff. The bibliographical and inventory controls in libraries are increasingly dependent upon systems librarians and the technology staff that support the systems. Because risk assessment is a dynamic process, it can help managers to identify the risk to collections in advance of crises and to plan for the strategic investments in collections management that will ensure the productivity of institutional assets.
CLIR is interested in working with libraries and other collecting institutions to move this risk assessment model from case study into practice and seeks ways to help libraries incorporate it into their work. The published report is available from CLIR for $15, prepaid; an online version is available.
SEVENTY-FIVE PARTICIPANTS from the Digital Library Federation (DLF) gathered at Emory University on March 31 for the second DLF Forum on Digital Library Practices, which focused on the libraries' organization for digital library initiatives. The two-and-a-half-day event allowed for an in-depth exchange of ideas about how libraries can effectively manage their digital library efforts. What emerged from the presentations and discussions is that organizational practices vary significantly, reflecting the variations in institutional mission and strategy. Yet, some common themes ran through the sessions. Digital library initiatives, which in the early days tended to be driven by grant funding, are now driven by libraries' interest in providing services to their users or by the changes in scholarly communication, or both. The participants noted increased demand for digital materials in distance education and for classroom teaching. Users expect the library to be able to bring multimedia materials together for them. Simultaneously, there are growing expectations of self-service access to library resources. As digital library programs evolve, more and more library staff want to be involved. The early initiatives were limited to a few members of the staff, who used the digital library projects to gain expertise. Increasingly, however, there is a strong interest among the entire staff to integrate the digital library into the larger library program. As this integration occurs, traditional library positions are evolving to include new tasks. Technical services staffs are no longer concerned with acquisitions and cataloging exclusively. They are also taking on the tasks of capturing and producing metadata, assuming rights management responsibilities, and assigning and managing persistent names for digital objects. Some libraries have added Web harvesting responsibilities to the mix of technical services duties. But these changes are not limited to technical services.
Public services staff have also been integrated into the digital library initiatives because of their service orientation and their subject expertise. Kitty Bridges, Head of the Shapiro Science Library at the University of Michigan, commented that "information technology is now a part of all of our jobs." Throughout the Forum sessions, participants emphasized the need for additional technical training and better preparation for project management. In the technical areas, they called for training in encoded archival description and metadata standards. They need a better understanding of digital library architecture. In project management, they cited the need for basic training as well as help in the use of project management software. Recruitment and retention of digital library staff are common problems. Several participants recommended that libraries seek partners from departments of computer science and the schools of information and library studies, both for the expertise these partners would bring and for the potential supply of new staff they could provide. Not surprisingly, the review of organizational practices uncovered both challenges and opportunities. Many of the participants picked up new ideas from their counterparts in other institutions. The discussions, while focused on how to organize digital library initiatives, soon branched into a larger and more important question: how to organize the library of the twenty-first century.
AT ITS MEETING on March 20, 2000, CLIR's College Libraries Committee concluded that the agenda for the next phase of work could be best accomplished if the group were expanded to include different types of academic libraries. In the near future, the committee will add representatives from mid-sized universities and independent research libraries to its number, and the name of the group will be changed to reflect the new composition. Formed originally to advise the Commission on Preservation and Access on preservation problems confronting liberal arts colleges, the group has broadened its focus in recent years to address a wider range of issues. For example, early in 1999, CLIR and the CLC convened a conference on the innovative use of technology on the campuses of small and mid-sized academic institutions. The conference was based on a series of case studies conducted the previous year. (CLIR published the case studies and a summary of the conference proceedings in August 1999.) At its March meeting, the committee identified the following topics that most need attention:
After a thorough discussion of these needs, the committee chose four areas for in-depth study this year: