CLIR Issues Number 24
On Digital Library Standards: From Yours and Mine to Ours
by Daniel Greenstein
What's Happening to Archives? A Tale of Three Sessions
by Jerry George
IF YOU ASK people in research libraries to identify the most significant digital library challenge facing them, it is likely that most will respond with the same answer: the absence of standards. These people are not referring to the formal standards emerging from the International Organization for Standardization (ISO) or the World Wide Web Consortium (W3C). Such standards are plentiful. Instead, they are bemoaning the lack of a consensus about when and how to apply those formal standards in a digital library.
What Do We Mean by "Standards"?
This seemingly uniform complaint—a lack of standards—carries at least three distinct meanings. For example, in digital libraries that are starting work for which there is little local expertise, the need for "standards" might be interpreted as a call for "recipes" for online finding aids or for assistance with the construction of local name-resolution services.
Respondents from organizations seeking to maximize investments in digital content by leveraging efforts and adding value to other complementary digital content assign a different meaning to the word "standards." Consider a library with a small but respectable online collection of digital images that are used to teach specific aspects of veterinary medicine. To better support teachers and students of veterinary medicine, that library might want to make its images accessible as part of a more comprehensive uniform database with digital images drawn from several collections. With its potential partners in veterinary imaging, the library may be thinking about the attributes of a "good" digital image that would be appropriate for inclusion in its (possibly virtual) database. Criteria for a good image might include some mix of suitability for teaching, interoperability, and, perhaps, even persistence. In this case, "standards" will actually refer to the qualities of "good" and, no doubt, to the difficulties involved in reaching agreement about what these qualities are. The need for formal standards for encoding textual information (e.g., XML) and formatting image files (e.g., TIFF) is minimal. Instead, discussion and debate tend to focus on what descriptive textual information should be recorded in XML and on the resolution and color depth of the TIFF files.
The need for standards will be interpreted differently again by people working at organizations that are trying to maximize their investment in networked services by ensuring that they integrate effectively in a larger universe of networked services and information. The organization building a digital archive is one example. Another is the library building an online database (whether of finding aids or streaming video) that can be searched personally by individuals and automatically by other Internet services. The would-be archive will be trying to reach a consensus with prospective users about how and in what format electronic information will be deposited, stored, and managed over time. Equally important, it will be trying to decide how, to whom, and under what circumstances the electronic information will be distributed. Among the users will be persons who create and then deposit the journal content, as well as persons who may wish to use the archived journal content in the future. The database developer will want to work with potential users to agree upon the attributes of interoperable online services.
Until recently, only demand for standards of the first type—the recipe—was being satisfied in digital libraries. This was partly because of the apparent competitive advantage that some leading libraries gained by documenting their successes. Consequently, methods were often described as "the ____ way," or issues as "the ____ problem," inserting the name of the appropriate library in the blank.
The numerous successes, documented in a proliferation of recipes, intensified demand for standards of the second and third types—that is, for practices that are tried, tested, and generally agreed to be fit for purpose. However, these successes also slowed progress. The same dynamic logic that encouraged digital libraries to document and promote their way of doing things tended to complicate the consensus building that standards of the second and third types require. In many cases, benchmarks and applications were assessed in terms of who developed, promoted, and supported them, rather than on their merits alone.
Two New Approaches to Standards
Two important changes have recently occurred in the ways in which libraries approach the development of digital library standards. First, whereas recipes once proliferated, "cookbooks" are now beginning to appear. Cookbooks are compilations that acknowledge that there are different flavors of finding aids, metadata, and digital formats, and that each is suited to a limited set of purposes. Thus, the best-practice guidelines being prepared by the National Initiative for a Networked Cultural Heritage (NINCH) will aim to document data- and metadata-creation practices that are appropriately used with different kinds of digital information (e.g., texts, images, video) being created for different purposes. The Institute of Museum and Library Services (IMLS) and the National Science Foundation's National Science Digital Library are similarly developing a framework for characterizing "good" (that is, minimally consistent, interoperable, and persistent) digital collections, objects, and metadata. It is understood not only that methods are more or less fit for specific purposes but also that the methods will change over time. The framework provides a convenient mechanism for collating and organizing the community's methodological preferences as they evolve.
The second change is a shift in the way libraries develop and adopt standards. Simply put, the trendsetters and opinion makers who have forged ahead by setting good practical examples are identifying real advantages in leading from the rear. Increasingly, they are claiming adherence to practices that are already vetted and endorsed by at least one, but preferably by several, peer institutions. They are also pushing much harder on benchmarks than on cookbooks.
This approach is becoming more evident in this country and abroad, and it has become very apparent in initiatives of the Digital Library Federation (DLF). In the past year, DLF members have reached a consensus on a model for negotiating access to commercial journals and databases and on the minimum requirements for digitally reformatted book and serial publications. DLF members have also facilitated agreement among selected libraries and publishers about the minimum requirements of a digital archival repository for electronic journals.
Nowhere is this new wave of collegiality more evident than in the emergence of the metadata encoding and transmission scheme (METS). METS aspires to be a standard scheme for recording information about a digital object's structural and administrative characteristics—information essential to the effective management and interchange of that object. "Led" by individuals from Berkeley, Harvard, New York University, the Research Libraries Group, and the Library of Congress, the METS initiative has developed rapidly and attracted widespread and enthusiastic support. It has sped toward a consensus at a rate that would have been inconceivable for any comparable initiative.
Economics Driving Change
The reasons for these limited but nonetheless remarkable successes are only starting to emerge. In no small part, it appears that economics are responsible. As digital libraries mature, they must transform interesting projects into essential infrastructure. At that stage, the costs and risks associated with digital library investments are no longer ephemeral and can no longer be met by discretionary or external funding. Failures can no longer be written off as "learning experiences" gained at relatively small cost. Instead, these libraries face very real large-scale investments and the need to make a commitment to a broad range of methodologies that will evolve into complex operational services. The issues they face cannot be treated in isolation (as individual recipes) or in the spirit of theoretical or scientific inquiry. Libraries at this stage demonstrate a desire to pool their collective uncertainties and to define, and then frame, a broad suite of practices as benchmarks in which they can all invest and upon which they can all safely build. The initiatives described above have tapped, perhaps unintentionally, into this very fruitful seam. With other comparable successes here and abroad, they define a trend that will enable digital libraries of any scale and maturity to develop with greater confidence, speed, consistency, and quality.
Though profound, the changes described here are far from universal. Many institutions will continue to develop and use nonstandard methods, both homegrown and new. They must be encouraged to do so. An independent spirit is the source of innovation—the same innovation that will improve digital libraries and develop the needed benchmarks. But with standards as with most innovation, there is risk: redundancy, wasted effort, misspent funds, unsuccessful initiatives, and aimless digital library programs. The true innovators have been willing and able to tolerate these substantial risks. Understandably, they are few and far between. As the innovators continue, more and more digital libraries are focusing their attention on the demands of supporting sustainable operational services. Where standards are concerned, they find value in leading from the rear.
TO LIBRARIANS, THE declaration would have seemed familiar. Technology will change archives, the speaker asserted. Professionals in the field will become "cyberarchivists." And he, for one, welcomed it. In that high spirit, Leon J. Stout, president of the Society of American Archivists, opened SAA's annual meeting in Washington, D.C., last August.
As Mr. Stout knows, successful "cyberarchivy" will depend on answering a hard question that faces librarians as well as archivists. Namely, how can we ensure that all the valuable digital material we are expensively putting online persists, is preserved, and stays accessible?
Archivists are digitizing some records and providing Web access to finding aids that describe many others. Like librarians, they need to preserve such material for ongoing use. But archivists have a more urgent need. They must find ways to provide electronic access in perpetuity to a continuous stream of electronic records judged permanently valuable. These are records that governments and others have generated and continue to generate. Failure to provide long-term access to these records would mean great losses of documentation of our era's history.
Like cyberarchivists, "cybrarians" also are loading the Web with useful information—books, journals, special collections—that needs to be preserved for future as well as current use. But can librarians and archivists do it? Can we transfer relatively unstable digital data from tape to tape, program to program, system to system indefinitely without significant loss? Can we retrieve, on tomorrow's computer systems, data generated on systems that are already becoming obsolete? And can we preserve and provide access to great volumes of digital data generated in increasingly complex forms?
If not, the online archival and library information that can so easily be studied, searched, enhanced, and creatively recombined today will be unavailable tomorrow. As one expert on the subject, Jeff Rothenberg, has said, "Digital information lasts forever—or five years, whichever comes first."
How do archivists hope to meet the digital preservation challenge?
Archives of the Future: Under Development Today
The answer to this question came in a second SAA session, one of several that dealt with electronic records. In it, Kenneth Thibodeau asserted that preservation mastery is not complete, but that the "archives of the future" is in sight. An Electronic Records Archives (ERA) is actually under development by the National Archives and Records Administration (NARA). Thibodeau directs this effort.
NARA has been accessioning electronic records since 1970. But, said Thibodeau, "proven methods for preserving digital information across generations of technology are limited to the simplest formats." He said that NARA needs a way to preserve millions of government records in all digital formats, with "continuing authenticity" through "unlimited time." That is what he hopes the ERA will do.
The ERA is evolving from technologies already under development to support e-commerce and e-government in multiple ways. For NARA, computer scientists are using a "persistent object preservation architecture" to relieve records from dependency on particular software and hardware, Thibodeau explained. In simplest terms, the goal is to preserve information about records—their content, structure, and context—in ways that permit reconstituting the records for access on future computer systems.
This "transformative approach," according to Thibodeau, differs from preservation by "migration," which relies on recopying, or by the "emulation," on new computer systems, of obsolete systems used to create the records being preserved. Will it work? The approach has critics, yet there are grounds for optimism. The San Diego Supercomputer Center has demonstrated the feasibility of the approach in tests involving, among other kinds of records, a million government e-mail messages.
When will the ERA be finished? "Ten years after computer development stops," Thibodeau quipped. By this he meant that the ERA will be "dynamic," alterable to accommodate continuing technological evolution, and "progressively deployed." A core ERA, without "full functionality" but capable of basic work, will be operable within five years, he speculated, and will continue developing incrementally. To run it, he added, archivists will need to transform themselves into "archival engineers."
Anticipating Users' Needs
At a third SAA session, a new question arose. To paraphrase the movie, "If we build ERA, will users come?" Archivists and their archives are changing, but so are some of their traditional patrons, and the two parties sometimes move in different directions. As archivists employ technology to preserve records for the study of history, historians increasingly express interest in studying archivists. That is, a number of historians and other scholars have taken up the study of "social memory": how it is created, by whom, and with what effects and motives.
Francis Blouin spoke of this "new historiographical direction" in his introduction to an SAA session on "Archives, Documentation, and the Institutions of Social Memory." That was also the title of a recent yearlong seminar at the Bentley Historical Library, which Mr. Blouin directs. The seminar had to do with how society constructs what gets collectively remembered. Such studies include how archives function and make choices about which records to preserve. Some historians expressed the view in the seminar that such choices reflect ideology, politics, authority, and control of power. One seminar participant even pondered whether archives, by helping determine what society saves from its history and transmutes into social memory, could be "technologies of rule in themselves."
That certainly shed a different light on both technology and archives.
What effect might NARA's archives of the future have on social memory? If the ERA can store more records than paper archives have had space to keep, will the result be a correspondingly broader view of history? If the ERA makes access easier, for more people in more places, will the power that comes from information be less the property of just the privileged? And will today's archival guidelines for appraising the significance of records make sense to such future ruling technologists as the cyberarchivist and the archivist-engineer?
Perhaps we do need the historians' new questions as well as the archivists' new machinery. Five years from now, when the ERA shows us how to save digital documents, maybe we can also take a new look, in an SAA session, at that fundamental question: What are we saving?
CLIR IS IMPLEMENTING several recommendations that emerged from "Building and Sustaining Digital Collections: Models for Libraries and Museums," a meeting held with the National Initiative for a Networked Cultural Heritage (NINCH) in February 2001. Supported by a grant from the Institute of Museum and Library Services under its National Leadership Grants for Library and Museum Collaborations, CLIR and NINCH brought together museum and library leaders who are putting cultural and educational collections and services online.
Guide to Nonprofit Business Planning
At the meeting, participants explored for-profit and nonprofit models for developing sustainable enterprises. They focused on identifying the most appropriate business models for institutions that want to go online to share collections, staff expertise, and educational services without doing so for profit, which, many attendees felt, could undermine or distract from an institution's core mission work. Representatives from these libraries and museums said they did not wish to go online to make money per se; their goal was to identify strategies to support the high costs associated with developing and sustaining a digital presence. They also wanted to find their niche in the highly competitive online marketplace of ideas without having to spend a great deal of money on advertising.
Participants expressed the need for a simple guide to developing business plans for nonprofit organizations. In July, CLIR commissioned consultant David Rodgers to create such a guide. Due to be published by the end of the year, the guide will be a concise and accessible resource for those in cultural institutions who are ready to develop an online enterprise and who need general guidance and a pointer to further resources. The report will provide a comparative framework for assessing various business models and how they may or may not fit the profile of a given cultural institution. It will provide guidance for early-stage decision making and will provide references for those seeking more detailed information.
New Functions, New Staff Skills
Participants in the February meeting also expressed interest in better understanding the implications of the significant changes that their institutions are undergoing. The growth in the use of technology has had an enormous impact on the internal culture of museums and libraries. Staff in all departments need new skills, and the needs continue to change. For example, libraries report that bibliographers and subject specialists are spending more time on licensing agreements and less time on selection issues. Museums report that interpretation and research are extending beyond curatorial departments to education and Web site activities.
These are important cultural shifts whose consequences are not yet well understood. Thus, while the need for technological skills is great, the need to reconceptualize the nature of jobs within cultural institutions is even greater. Some meeting participants proposed reconfiguring jobs along functional lines rather than maintaining the traditional divisions between genres, formats, or areas of intellectual expertise. Others called for cross-domain experience through such programs as temporary or rotating assignments to different areas. Such programs, they maintained, would afford staff members an opportunity to acquire a better appreciation of institutional purpose and direction. Although libraries and museums have different definitions of professional turf, both know that protecting one's turf can interfere with responding appropriately to change. Finally, both types of institutions are dealing with the problem of identifying, recruiting, and mentoring the next generation of leaders.
To explore in greater depth the many facets of institutional transformation, CLIR will convene another meeting between leaders of libraries, museums, and other cultural institutions in the spring of 2002. The changing nature of cultural institutions and the strategies that some of them are using to guide that change will provide a context for discussing approaches to sustainability and for reconceptualizing jobs within cultural institutions. A survey of digital library growth strategies that Suzanne Thorin is conducting for the Digital Library Federation will provide important data for the meeting and will serve as a model for similar work within the museum community.
CLIR will publish the results of the upcoming meeting in a document that will continue the series on library-museum collaborations, which includes Collections, Content, and the Web (January 2000) and Building and Sustaining Digital Collections: Models for Libraries and Museums (August 2001).
CLIR Sponsors' Symposium Set for April 26, 2002
"Reimagining Collections for the 21st Century" is the topic of CLIR's next annual sponsors' symposium, which will be held at the Cosmos Club in Washington, D.C., on April 26, 2002. Invitations will be mailed in November.
Symposium participants will explore issues relating to collection development and management. Preservation will be prominently featured. Speakers and topics for the morning sessions will be announced in November. In the afternoon, structured, facilitated discussions will give symposium sponsors an opportunity to suggest projects and commissioned papers they would like to see CLIR take on in the future.
The symposium will provide an opportunity for all attendees to learn more about the concerns and aspirations of CLIR's sponsors. Mark your calendar now and plan to be part of this annual event.
Susan Perry Joins CLIR
Susan Perry, Director of Library, Information, and Technology Services at Mount Holyoke College, has joined the CLIR staff part-time as director of programs, through an arrangement with The Andrew W. Mellon Foundation and Mount Holyoke. At CLIR, Ms. Perry will focus on issues of interest to liberal arts colleges, with the aim of connecting CLIR's programs with work that the Mellon Foundation is supporting through its Institute for Technology and Liberal Education. In addition, she will work with CLIR President Deanna Marcum on developing the curriculum for the Frye Leadership Institute.
Annual Report Available
CLIR has just issued its 2000-2001 annual report. It is available as a .pdf file on the Web at www.clir.org. Print copies are also available from CLIR at no charge. To order, please direct your request to email@example.com.
The Frye Leadership Institute June 2-14, 2002
The Frye Leadership Institute is accepting applications for its 2002 season. The Institute is an intensive, two-week residential program in which participants study and analyze the leadership challenges stemming from the changing context of higher education. The session will be held June 2-14, 2002 at Emory University. Participants will be selected competitively from among nominees and applicants who have a commitment to, and talent for, leadership within higher education. The group as a whole will be chosen to reflect the variety of backgrounds and experience that constitute higher education.
The Institute is supported by a grant from the Robert W. Woodruff Foundation and is sponsored by CLIR, EDUCAUSE, and Emory University.
Applications must be submitted by December 15, 2001. Information and application instructions are available at www.fryeinstitute.org. The Institute can also be contacted by e-mail at firstname.lastname@example.org.
CLIR and DLF Invite Proposals for Distinguished Fellows Program
CLIR and the DLF are seeking applicants for the Distinguished Fellows Program. The program is open to individuals who have achieved a high level of professional distinction in their fields and who are working in areas of interest to CLIR or the DLF. Unlike other fellowship programs that provide support for individual research, the Distinguished Fellows Program is aimed at identifying potential partners for the CLIR/DLF agenda.
The fellowships, available for periods of between three and twelve months, are ideal for senior professionals with well-developed personal research agendas who will benefit significantly from time away from their day-to-day responsibilities.
Although distinguished fellows will not be required to relocate to Washington, D.C., during their tenure, they will be expected to participate in program planning sessions in Washington and to cooperate with CLIR staff on current projects, in addition to working on their own projects.
For more information on how to apply for a fellowship, see www.clir.org/news/pressrelease/fellows.html.
CLIR WISHES TO thank The Andrew W. Mellon Foundation, the Institute of Museum and Library Services, and the Alfred P. Sloan Foundation for awarding grants for the following projects.
Dimensions and Use of the Scholarly Information Environment. The Andrew W. Mellon Foundation has awarded funds to the Digital Library Federation for a study of how faculty and students at universities and colleges use the academic library and how they perceive the library within the larger scholarly information environment. The study will be conducted by Outsell, Inc., a research and advisory firm that focuses on the information content industry. The findings will help libraries and universities plan information services that match the current and emerging needs of their faculty and students. A keener sense of user needs will also help publishers and content providers that serve the education market create better information products. The study is scheduled to be completed in April 2002.
The State of Preservation Programs in American College and Research Libraries. The Institute of Museum and Library Services has awarded a grant to CLIR for a study of preservation programs in American college and research libraries. The study will be undertaken jointly with the Association of Research Libraries (ARL), the University Libraries Group (ULG), and the Regional Alliance for Preservation. The partners will assess preservation efforts and concerns in the nearly 250 college and research libraries that constitute the membership of ARL and ULG, as well as in leading liberal arts colleges and major non-ARL land-grant institutions. In addition to documenting current conditions and preservation needs, the study will suggest new strategies to equip preservation programs for an increasingly complex technical environment. Further information on the project can be found in CLIR Issues 19 (Jan.-Feb. 2001), page 3.
Ensuring Long-Term Access to Web-based Documents. The Alfred P. Sloan Foundation has awarded a grant to CLIR for a project to address issues related to the archiving of Web-based documentation projects. CLIR will convene a meeting of scholars, librarians, publishers, and technologists who have worked successfully on various aspects of digital archiving. The meeting will focus on the legal and ethical ramifications of the current copyright regime, the lack of appropriate infrastructure within archives and libraries for providing long-term access, and the lack of best practices for creators of Web sites designed for long-term access. The aim of the meeting is to build partnerships, transfer knowledge among stakeholders, and, ultimately, prepare scholars for creating Web sites that can be preserved and that warrant the considerable resources that preservation and long-term access demand.
Council on Library and Information Resources
1755 Massachusetts Avenue NW, Suite 500
Washington, DC 20036
Fax: (202) 939-4765 · E-mail: email@example.com