CLIR Issues Number 2
Special Collections Stake Their Claim in the Electronic Age
--by Abby Smith
Digital Library Federation to Develop Authorization Systems
--by Donald J. Waters
Task Forces Cope with the Future, and the Present
--by James Morris
Special Collections Stake Their Claim in the Electronic Age
--by Abby Smith
Special collections—the manuscripts, prints and photographs, musical scores, maps, motion pictures, sound recordings, ephemera, and other non-print materials that make up a large percentage of the holdings of research libraries, archives, and historical societies—are the raw materials of scholarship and are often of unique research value. But though they may be among the richest resources in an institution, they are frequently among the most underutilized as well. Moreover, they present formidable endemic preservation, access, and security challenges to library and archival managers because of their complexity and their significance as artifacts. All institutions are now under pressure to make their collections available electronically, and have scant financial resources to do so. Why, then, should libraries and archives invest in improving access to special collections, which, under the best of circumstances, are used so much less than the general monograph and serials collections?
It is, first and foremost, the intrinsic research value of the collections that gives them a claim to attention in an institution. These holdings allow researchers to apprehend directly the history of document creation or the assembling of a collection, even as they are mining the accumulated raw data from the past for historical evidence. Moreover, these materials are often organized or grouped not by subject or format but by creator or collector, which makes them an especially rich source for interdisciplinary and multimedia approaches to a subject. The wealth of information to be discovered in special collections and the inherent difficulties and expense of making these materials accessible argue that librarians and archivists should focus on a few core activities to mitigate the impediments to their greater discovery and use by scholars. Technology can help.
The greatest stumbling block to access is often that a collection is poorly described, or not described at all. Scholars quite naturally turn first to materials in a repository that have already been discovered, used, and described, relying as they traditionally do on finding aids, catalog subject headings, and clues found in footnotes. Perhaps the most important step a repository can take is simply to inventory collections and create some level of access record, no matter how rudimentary it may be. A collection lacking description of any kind is not likely to be used. Most finding aids do not need to be detailed to be of significant help. Even a collection-level record, disseminated to scholars through shared databases, or a catalog entry on an institution's Web site, would serve access goals admirably.
In addition to the challenge of intellectual control and the creation of appropriate access points, special collections present daunting preservation problems. These may arise from the sheer size of some collections (e.g., a large group of family records from nineteenth-century Wisconsin) or from their physical unwieldiness and fragility (e.g., a small number of feature films by African-American creators that exist only on nitrate). Repositories must take a series of essential actions to ensure long-term accessibility: do enough preservation treatment to stabilize the collection, apply conservation treatment to endangered items, and provide secure and environmentally sound storage conditions, especially for image-based and audio media.
Once processed, special collections can play an outsized role in the institution, in addition to serving the needs of scholarship. The very nature of the material as artifacts—the way, for example, that sheet-music covers from World War I or political posters from the Eisenhower/Stevenson campaigns can summon up a host of associations, and with them create a vivid impression of things past—makes the objects as such significant in a way that goes well beyond their role as "information resource." These collections can be mined for many uses beyond the reading room. They may be featured in exhibitions, for example, or exploited by the publishing industry (including the repository's own publishing enterprise), or used for educational programs, public outreach, and promotional purposes. Further, because artifacts in and of themselves convey something immediate and authentic about the nature of research to those not actively engaged in scholarship, formal and informal displays of maps, manuscripts, vintage photographs, and holograph musical scores can be surprisingly effective in winning outside financial support for library programs.
What can digital technology do for special collections? In addition to conveying information about the items far beyond their repositories, the technology can make the collections themselves accessible to researchers through the creation of digital surrogates. Digital conversion and dissemination are not appropriate for all materials (daguerreotype collections from the Antebellum period do better online than, say, rare recordings of musical performances from the forties that are still under copyright restrictions). But it is surprising and gratifying to see how digital versions of some special collections can both aid long-distance researchers and serve specific preservation goals for fragile materials in the reading room. By posting finding aids and catalog records online, putting up thumbnail scans of a few representative items from a large collection that will not be reformatted, and featuring unique and rare items in digital library projects, repositories may greatly expand the potential user base for special collections and help to justify, if not defray, the high costs of processing.
Perhaps the most important reason of all for preserving and making accessible non-print documents in libraries and archives is that these are the formats in which, increasingly, information is being recorded in this century and will be recorded into the future. If there is a relative lack of demand for access to audio and visual resources, it is only because scholars and teachers have traditionally used text-based documents and continue to feel more comfortable working with them. But that is unlikely to be the case for succeeding generations of scholars and students. Librarians and archivists probably know better than their patrons the extent to which contemporary information is being captured on audio and visual media, both analog and digital. Once this source base has accumulated sufficient mass, though not until then, researchers will begin to mine it extensively. It is the mission of libraries and archives to make sure that information in these media will be available when it is sought, however many years from now. For that to happen, research institutions must come to grips now with the preservation and access challenges of special collections: fragile, immense, irreplaceable.
Digital Library Federation to Develop Authorization Systems
--by Donald J. Waters
Colleges, universities, libraries, publishers, and other parties associated with higher education share substantial interests in the production and dissemination of knowledge that exists only in digital form. Moreover, many believe that investment in digital technologies will enhance the quality of the educational experience, improve the means of managing intellectual property, and extend the reach of the academic enterprise to distance learners, alumni, and the general public. And yet, multiple obstacles impede the ability of those with stakes in higher education to achieve these goals and to integrate digital information into the larger academic information environment.
One of the most prominent obstacles is the lack of adequate facilities for authentication and access management. Such facilities are needed if diverse categories of readers are to present themselves in the digital environment as individuals authorized to gain access to various distributed bodies of knowledge. Without mechanisms for authentication and authorization, institutions cannot effectively extend the reach of research and higher education through distance-learning programs; they cannot, in fact, even provide off-campus staff, faculty members, and students access to works licensed on campus. Nor can publishers develop differentiated products that expand the market for digital products and lower their cost. The lack of effective mechanisms may even contribute to the compartmentalization of systems of discovery and retrieval if it causes information service providers who seek to protect intellectual property against unauthorized use to invest huge amounts in proprietary systems that do not easily interact with other systems.
Systems of authentication and authorization
Systems of authentication and authorization consist of several component processes that are distinguished by their focus on either the user or the provider of an information object. One process must authenticate the identity of the user by affirming that the person is who he or she claims to be. Another must authenticate the properties of an information object and ensure that it is what it purports to be.
On the user side, an authorization process links an identity to a valid set of rights and duties relative to objects of information. (For example, I may be authorized as a student of a university to use the Britannica Online.) On the provider side, an authorization process links the properties of information objects with a set of terms and conditions for rightful use. Thus, a publisher may authorize a journal for use by the faculty and students of a particular university but not by its alumni. Use occurs at the intersection of these authorization processes, when a user is validated to exercise a set of rights and duties that meets the provider's terms and conditions.
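The intersection described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a user's rights are reduced to a set of role names and that a provider's terms are reduced to a set of permitted roles; the class names, the function name, and the sample journal are hypothetical and do not describe any system named in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """An authenticated identity with the roles it has been granted (hypothetical model)."""
    name: str
    roles: frozenset


@dataclass(frozen=True)
class InfoObject:
    """An information object with the provider's terms, reduced here to permitted roles."""
    title: str
    permitted_roles: frozenset


def may_use(user: User, obj: InfoObject) -> bool:
    """Use is allowed only where the user's roles meet the provider's terms."""
    return bool(user.roles & obj.permitted_roles)


# A publisher authorizes a journal for faculty and students, but not alumni.
journal = InfoObject("Example Journal", frozenset({"faculty", "student"}))
student = User("a_student", frozenset({"student"}))
alumna = User("an_alumna", frozenset({"alumni"}))

print(may_use(student, journal))  # True
print(may_use(alumna, journal))   # False
```

A real system would also have to express duties, time limits, and other conditions, which a simple set intersection cannot capture.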
A number of technical facilities have emerged to manage authentication processes in the digital environment, and they are currently being implemented. For example, to determine that individuals are indeed members of the communities they profess to inhabit, colleges and universities are turning increasingly to Kerberos, an authentication system invented at MIT and now supported as a standard feature in a wide variety of products. Cryptographic methods, including digital watermarking, are available, and various object-identifier schemes are under development to ensure that digital information is authentic and uncorrupted. Although authentication means are rapidly emerging and being adopted, the means of authorization are much less well developed, and in need of sustained investment.
The primitive state of authorization systems
Rudimentary means of authorization do exist, of course. For example, institutions almost everywhere now use authorization schemes based on Internet IP addresses. Under such schemes, the use of a computer with an IP address belonging to an institution is presumed to qualify an individual as a member of an authorized class of users for particular digital information products. But IP addressing turns information products on or off for an entire geographically related population, though the institution may actually need products only for faculty members, or for participants in a course of study, or for a subclass of users such as alumni.
The use of IP addressing as a means of authorization prevents colleges and universities from making the distinctions they need to make—and do make for information products in other forms—and keeps providers from usefully differentiating their products. Providers cannot target specific markets, open new markets, or experience the growth and economy of business that typically accompany such differentiation.
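The limitation can be seen in a short sketch that compares an IP-based check with a role-based one. The network range (a reserved documentation range), the role names, and both function names are assumptions chosen for illustration only.

```python
import ipaddress

# Hypothetical campus network; 192.0.2.0/24 is a range reserved for documentation.
CAMPUS_NETWORK = ipaddress.ip_network("192.0.2.0/24")


def ip_authorized(client_ip: str) -> bool:
    """IP-based scheme: any machine on the campus network qualifies."""
    return ipaddress.ip_address(client_ip) in CAMPUS_NETWORK


def role_authorized(role: str, permitted_roles: set) -> bool:
    """Role-based scheme: the decision depends on who the user is, not where."""
    return role in permitted_roles


# A product licensed for faculty only (hypothetical terms).
permitted = {"faculty"}
requests = [
    ("faculty", "192.0.2.17"),   # on campus
    ("alumni", "192.0.2.40"),    # on campus, but not covered by the license
    ("faculty", "203.0.113.5"),  # off campus
]

for role, ip in requests:
    print(role, ip, "ip:", ip_authorized(ip), "role:", role_authorized(role, permitted))
```

In this sketch the IP check admits the on-campus alumni user and refuses the off-campus faculty member, while the role check makes the distinction the license actually calls for.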
What must be done
Because momentum is already driving the development of systems of authentication, it makes sense to focus on improving the systems of authorization, through a series of interrelated steps. First of all, we must articulate the divergent values and expectations about the use of intellectual property and devise ways to keep them in balance. For example, we must balance the values of academic freedom and privacy in the pursuit of knowledge against the values of economy and organization that lead to the means of tracking identity and usage patterns as a part of the authorization process.
We need as well to develop a better understanding of the technological options available for authorization and of how they relate to the differing values. The Coalition for Networked Information (CNI) has invited broad participation in the development of a white paper that should greatly improve our understanding of how the technical options relate to the critical values. Over the coming months, the Digital Library Federation (DLF) and, individually, numerous DLF institutions will help to prepare the CNI white paper.
Another essential step will be to define the formal requirements for distinguishing user roles and product terms and conditions in systems of authorization as they apply to work in research and higher education. The DLF and the Center for Research on Information Access at Columbia University, with support from the National Science Foundation, will convene a workshop of specialists on April 6 in Washington, D.C., to consider these requirements.
Further, institutions must begin to gain experience with more sophisticated and flexible authorization mechanisms, such as digital certificate systems. One way of gaining this experience is by deploying alternative mechanisms in a controlled and measured way—for example, as a substitute for the simple functionality provided by IP addressing. The DLF is supporting a project under development within the Committee on Institutional Cooperation (CIC) to test technology for authorization across institutional boundaries.
The results of these various initiatives will become available during the spring and summer. They should set the stage for a series of additional projects to implement systems of authentication and authorization that will greatly improve the use and the usability of digital information for teaching, learning, and research.
Washington Post Publishes Letter from Deanna Marcum
Letters to the Editor, Wednesday, January 21, 1998
The Greater Digital Crisis
The Jan. 12 Federal Page article on the Defense Department's year-2000 problem discusses serious issues affecting our computer-dependent government. But the nation also faces a second digitally based crisis that might, in time, do great harm.
We run the risk that digital information will disappear. Indeed, portions of it already have become inaccessible. Either the media on which the information is stored are disintegrating, or the computer hardware and software needed to retrieve it from obsolete digital formats no longer exist. The extent of the problem will emerge as more and more records are requested for retrieval and cannot be read. There are already documented examples of this, and government and industry representatives are concerned about the potential large-scale consequences.
When President Clinton completes his second term, his administration will send some 8 million electronic files to the National Archives. But those files are only a small fraction of the information the government will have generated during his years in office. Given the problems now surfacing as existing digital files are retrieved, the prospect of major losses to come grows increasingly likely.
Military files, including POW and MIA data from the Vietnam War, were nearly lost forever because of errors and omissions contained in the original digital records. Ten to twenty percent of vital data tapes from the Viking Mars mission have significant errors because magnetic tape is too susceptible to degradation to serve as an archival storage medium.
Research conducted by the National Media Lab, part of the National Technology Alliance—a consortium of government, industry and educational institutions that seeks to leverage commercial information technology for government users—has shown that magnetic tapes, disks, and optical CD-ROMs have relatively short lives and, therefore, questionable value as preservation media. The findings reveal that, at room temperature, top-quality data VHS tape becomes unreliable after 10 years, and average-quality CD-ROMs are unreliable after only five years. Compare those figures with a life of more than 100 years for archival-quality microfilm and paper. Current digital media are plainly unacceptable for long-term preservation.
Finding a late-model computer to read a 5.25-inch floppy disk—a format common only a few years ago—or the software to translate WordPerfect 4.0 is practically impossible. On government and industry levels, the problem is magnified: old DECtape and UNIVAC drives, which recorded vast amounts of government data, are long retired, and programs like FORTRAN II are historical curiosities.
The data stored by these machines in now-obsolete formats are virtually inaccessible. The year-2000 problem concerns only obsolete formats for storing dates. It is merely a snapshot of the greater digital crisis that puts future access to important government, business, and cultural data in such jeopardy.
Librarians and archivists have long worried that hardware and software manufacturers show more interest in discovering new technology than in preserving today's data. It is important for federal, state and local governments to set digital storage standards that will ensure future access. If private industries hope to sell their wares to governments, they will need to comply with those standards. And all of us will benefit.
CLIR Initiates South Africa Program with Preservation Workshop
The extent and the diversity of Africa's historical and cultural record almost defy measurement. Unfortunately, few institutions in African countries are equipped to address the preservation needs of this vast record. Although most librarians and archivists know about the effects of heat, humidity, and age on the physical condition of materials in their collections, few have the resources or training to tackle the problems systematically. And few are familiar with the special requirements of preserving audio, visual, and digital information.
In early March, twenty South African library and archives staff members attended a preservation workshop in Durban that was supported by funds from CLIR's International Program. The week-long workshop considered the problems of paper-based records and why they deteriorate, and presented options for reformatting print, audio, visual, and digital materials. The workshop marked the first of several activities that the International Program will sponsor in South Africa over the next two years, under a grant from The Andrew W. Mellon Foundation.
In addition to providing much-needed instruction in preservation management to South Africans, the Durban workshop will serve as a model for a subsequent workshop directed to all of Anglophone Africa. The larger workshop, to be held in Durban in April, will be sponsored by the Joint IFLA/ICA Committee for Preservation in Africa.
CLIR to Study Innovative Uses of Technology by Colleges
Dramatic developments in digital technology and the growth of electronic information are changing the way college campuses supply information to faculty members and students. In cooperation with its College Libraries Committee, CLIR will conduct a study of how a group of colleges are responding to the challenge of providing information services in a rapidly changing environment.
At the beginning of April, CLIR will send letters to about 1,200 colleges to determine their interest in participating in the case studies. After reviewing their responses, up to twelve campuses will be chosen for the study. CLIR program staff and members of the College Libraries Committee will make site visits to the colleges and then issue a volume of case studies that illustrate how college libraries in particular have strengthened their role on campus, enhanced their information services, and improved instruction and research. The volume will be distributed in both print and electronic versions.
For more information about the Innovative Use of Technology by Colleges Case Studies Project, please contact Deanna Marcum at (202) 939-4750 (firstname.lastname@example.org); or Willis Bridegam at (413) 542-2212 (email@example.com).
Deadline Approaches for 1998 Zipf Fellowship in Information Management
The Council on Library and Information Resources has established a fellowship to honor A.R. Zipf, a pioneer in information management systems. The fellowship is awarded annually to a student currently enrolled in graduate school, in the early stages of study, who shows exceptional promise for leadership and technical achievement in information management.
Completed applications for this year's fellowship must be received at the Council no later than April 1, 1998. The amount of the award in 1998 will be $5,000, and the winner will be selected by June 1. (Please note that applicants must be citizens or permanent residents of the United States.)
Applications for the fellowship may be requested by e-mail <firstname.lastname@example.org>, by phone (202-939-4750), fax (202-939-4765), or by writing to the following:
A.R. Zipf Fellowship
1755 Massachusetts Ave., N.W.
Washington, D.C. 20036
Task Forces Cope with the Future, and the Present
--by James Morris
The five task forces that CLIR and the American Council of Learned Societies convened to discuss how the needs of scholars will be met in a technologically transformed environment held their initial sessions in December, January, and February. The conversations they began will be extended over the next several months by making use, appropriately enough, of a technological capacity: participants in each of the groups will communicate with one another through listservs.
As we reported in CLIR Issues Number 1, the task forces have been defined by formats and types of materials—audio, visual, monographs and journals, manuscripts, and area studies—to keep their size manageable and the obligations of the participants realistic. This was only one of several schemes that might have been used to organize them, and to choose among the options was inevitably to compromise, as was made clear during the meeting of the first task force, on audio materials, when the participants wished they could address their visual counterparts. The experience in subsequent groups has been much the same: each has had something to say to, or to ask of, absent colleagues. This was to be expected, given that the technology will soon allow an unprecedented mixing of media for research and pedagogical presentation.
The first conversations tended toward provisional conclusions only. It would be fair to say, for example, that scholars do not appear to be driving the development of the technology in significant ways. Not in response to a scholars' wish list does the commercial sector proceed from the summoning of one astonishing capacity to the next. But scholars are not skittish about the technology either, or reluctant to embrace it. An injudicious comparison might be to children who have neglected to send Santa a list of goods tailored expressly to their galloping desires but who are none the less thrilled by the abundant, though previously unspecified, gifts they encounter on Christmas morning.
Members of several task forces expressed a worry that the Web will come to define what real knowledge is. Instances of this are already multiplying among too many undergraduates. One scholar complained explicitly about the prevalence—not among students but among his colleagues—of "Op-Ed scholarship," the sort of surface traversal of a topic to which research done exclusively through electronic means may lead. This is not because the means are intrinsically deficient. In part, at least, it reflects the shallow possibilities of the search when so small a portion of materials is available to be mined electronically.
All the groups considered the role librarians will play in the creation of digital collections. Librarians are professionals who are trained to organize knowledge systematically and to apply standards. They are the ones who must integrate electronic materials into the larger whole of collections and judge whether the addition of new materials to collections matches fundamental institutional goals. It is essential that scholars and librarians work with one another in this process of selection and growth, lest there be a disjunction between the vision that librarians have of their patrons and the users' own sense of what they require.
One concern expressed by a librarian member of the monographs and journals group ran through all the sessions—a fear that there may be no adequate vision of where the technology will be in 10 or 15 years, and no grand strategy that says "we need these particular technological devices and capacities to make the information environment work for us." We are moving along too haphazardly, when we should be setting quite specific goals. For example, we might commit to enhancing, through digitization, the usefulness of materials that, for want of any better disposition, and often as an alternative to their disappearing, have been consigned to microfilm and fiche.
The monographs and journals group expressed the most aggressive intent to harness the technology to the needs of scholarship and the academic process. Some members are especially committed to changing the current situation with respect to scholarly journals. Their hope is to use the technology to support a new approach to scholarly certification. The character of the arrangement is by no means settled, but, in essence, it would separate scholarly certification from scholarly publication, so that work would be peer reviewed and endorsed (and archived) without the need for it to appear in any print format. The transferral of the communication of research results from print-based formats to electronic would profoundly affect the economics of information provision on campuses and represent a signal victory achieved through technological means. Libraries would save the enormous sums they are now required to lay out to purchase back, in journal format, the research that faculty members have done while in the employ of their home institutions. Under current practice, faculty members, in effect, give away their research to commercial entrepreneurs, who profit by selling it back to its source—the academy.
Several of the groups called explicitly for finding aids to materials in the format under discussion. The audio group—in the very first of the sessions—led the way, and the visual group picked up the theme by asking for proper finding aids to electronic image collections. One member of the visual group said that creating access to digital images through finding aids is the most important thing that can be done to advance the application of the technology to scholarship. The lack of adequate indexing mechanisms is a key stumbling block to the widespread availability and use of digital information. Scholars lag behind other sectors (photo stock houses, other image providers, the military) in driving the creation of retrieval systems for images. At Columbia University, for example, digital images in the medical collections (which have important pedagogical value) are cataloged online only at the collection level: there are no records for individual images within a collection. Even when academic institutions have scanned special collections, they do not necessarily make the images available beyond their premises. This is a consequence not of technological capacity but of institutional policy (and perhaps of copyright concerns).
Members of the visual task force noted that, in some ways, the image itself can be considered a surrogate for a bibliographic record. Even if the quality of the image is poor, it is an access point—an instance of the kind of superficial information the Web is useful for delivering. But they also wondered whether funds spent on improving access to materials by digital means will be taken away from the acquisition and preservation of hard-copy materials. And if some research that includes visual devices (maps, charts, and so forth) will be available only on the Web, where it may have been placed without the filter of peer review, how will librarians bring their traditional skills of selection to bear on what they choose to acquire and retain?
The area studies group thought that the immediate effects of the introduction of digital technologies were damaging because area studies traditionally exist at the margin of resource allocation within libraries. If funds are directed, with increasing enthusiasm, to digital projects—few of which are in the vernacular languages of the world—area studies run the risk of being further marginalized. The group was concerned that much of the current demand for digital projects seems to be vendor-driven, not mission-driven, and that vendors see no profit in the financial returns associated with area studies. Vendors that provide access to journals in electronic form, for example, often exclude area studies journals because of their low subscription numbers. The foreign materials valuable to area studies scholars will rarely be created in digital format and are not strong candidates for digitization. Works in non-Roman scripts, as an instance, cannot be scanned with adequate resolution for reformatting projects—though groups in Asia are working to solve this problem.
The area studies group did think that technology might make possible the levels of cooperation among institutions—in collection development, for example—that are essential if area studies are to survive. They acknowledged that these cooperative solutions are old-fashioned but thought that the technology might put a new face on them. They endorsed the mounting of resource guides to area studies on Web sites to facilitate resource sharing, and they recommended that models of distributed collecting, such as one the Germans have developed, be considered, and that area studies be shored up through the generous maintenance of a few major programs around the country.
The manuscripts group worried that the easy accessibility of information on the Web will draw people from the use of primary materials. The group was most enthusiastic about using the technology to promote access to primary materials, not to digitize them. The preservation and accessibility of primary evidence were the principal concerns of individuals in this group, and it is with the second of those responsibilities that the technology is likely to prove most helpful, for it will introduce marvelous tools of access.
The group also considered the preservation of the massive amount of material that is being created digitally and that exists only in electronic form on the Web—in effect, our contemporary version of manuscripts (items in the right-to-die debate, urgent and topical, were cited as an example). They did not challenge one member's assertion that we shall have to rely on individual enterprise and individual collectors to capture and to keep items from the vast flea market of the Web, just as individuals in the past have built collections of every sort (from proper manuscripts to improper postcards) out of personal passion. We should stop worrying about how to save everything on the Web. This is, after all, the same flawed but endurable world that has lost most of Sophocles.
International Newsletter to Appear on the Web
CLIR has published the first issue of a new quarterly, Preservation and Access International Newsletter. The newsletter will inform readers of international developments in preservation and access. Look for the newsletter on our website in the near future.