
3. States of the Artifact, 1800-2000

Research library collections are primarily made up of printed matter: books, serials, journals, and newspapers. Even with the rapid growth of machine-readable and electronic resources, 85 to 90 percent of the acquisitions budgets of libraries still goes toward the purchase of printed matter.9 Humanities and social science scholars rely heavily on print journals for scholarly communication and so do scientists. Those who work within the print regime rely on a series of conventions about documents and their relationship to their physical manifestation that may be so familiar that they are invisible. These conventions bear mentioning, however, because many of them do not operate in realms of audiovisual and digital resources.

Among the many consequences of the adoption of printing technology, several became fixtures in the print landscape. They are

  • the creation of a comparatively fixed and stable text
  • the concept of authorship and of intellectual property inhering in authorship
  • easy duplication and wide dissemination of texts, especially after the introduction of high-volume presses in the nineteenth century
  • the notion of fungibility of informational content (also a result of mass-market publishing)

While texts copied by hand were intended to replicate their sources, they did not do so precisely or completely. With the advent of printing, it became possible to produce nearly identical copies in large numbers. This great increase in the accuracy of reproduction was crucial in the development of scientific and technical literature, and especially so in the reproduction of illustrations. While there certainly have been variations between printings, and even among copies from the same press run, the presumption that authorship and content were, in principle, stable and fixed, took hold, with significant consequences. The notion of repeatability and accurate reproduction, on the one hand, and of a reliable text with a known author, on the other, caused a shift that Foucault describes in detail in “What Is an Author?” According to Foucault, with the arrival of printing, scientific literature ceased to derive its claim to legitimacy from its attribution to an author (Aristotle, for example) and began to derive its authority from principles of experimentation: the potential falsifiability of the hypothesis, the repeatability of the experiment, the replicability of the result. Literature, on the other hand, began to derive its authority not from its dispersion and repetition in the culture, but from its originality and its connection with a particular author (Foucault 1977). In an information economy, much now depends on these notions of fixity, reproducibility, and authorship, as many recent court cases, legislative acts, and international agreements attest.

Bibliographic and textual scholarship, since the nineteenth century, has shown just how precarious and nuanced these concepts are. The destabilization of our ideas of fixed content began to accelerate, however, with the advent of audiovisual recording technologies. Daguerreotypes exist in only one version, because the image is exposed directly onto a metallic plate; wax cylinder recordings are also unique, each produced as a live recording. (Although they were produced in batches, there was no master.) Film-based images, however, do have a master: the photographic negative. For this reason, they can exist in multiple, nearly identical copies. Still, although the images are all made from one source (i.e., a negative), the negative inevitably wears with use, and the copies become less faithful. Through reproduction, the image on the negative can be effaced. This is not as true of print products. Engravings do deteriorate with each production, and books set with type can show wear from copy to copy; nevertheless, the loss of information for texts from copy to copy is generally not as striking as it is for visual and sound resources.

In non-print materials, the problem of “version control” (that is, determining which version should be preserved as an original) is complicated by a further lack of stability over time and place. This factor characterizes broadcast media in particular. Take as an example a single television document that a news archives might wish to collect: the CNN news broadcast for January 15, 1998. What do the archives collect? The broadcast that originated in Atlanta, in London, or in Tokyo? This problem has existed among collectors of newspapers, of course, with libraries deciding often on the final edition as being the one of record and making efforts to collect several editions of a paper that tracks a particularly important event, such as an assassination or an election. But the scale of this problem for broadcast media on a 24-hour news cycle that constantly update news is of a different order.

The artifact that records an image or sound, moreover, can easily lose its originality or uniqueness. The adjective “unique” may have once been sufficient to identify a primary document in manuscript, and in some cases, such as legal documents or signed first editions, it may still have some meaning. But with newer formats, such as electronic and broadcast media that rely on refreshing and reformatting for longevity, the terms “original,” “unique,” “content fixity,” and “material artifact” mean little.

Task force members found the cardinal features of an artifact that have the highest research value (originality, faithfulness, fixity, and stability) retreating like a mirage as they worked their way from the 1800s into the twenty-first century. The three sections that follow look at how these four values work themselves out in print, analog audiovisual, and digital documents. These sections also explore how preservation strategies address these values and seek to preserve them. Preservation options, it will be seen, are shaped by such factors as the quantity and quality of resources, the instability of media, constraints of resource allocation, and the changing valuation of the artifact in research and teaching.

3.2 Print/Paper

3.2.1 The Relative Stability of Imprints

Books and other printed matter deteriorate over time as the result of their inherent chemical instability. For example, when paper made of wood pulp reacts with humidity and heat, it becomes brittle. Books also deteriorate as a result of mechanical strain. For example, the spine is stressed when an open book is placed on a photocopy machine. (Photocopying also exposes the paper to heat and light.) But there have been significant changes in how books, journals, and newspapers have been made since 1800, and these changes affect their significance as artifacts as well as their physical robustness.

In the 1970s, 1980s, and 1990s, much attention was focused on the legacy of brittle books created by the processes for making inexpensive paper. Now joined to this concern about the paper is concern about the structural support for that text block: the binding. The last few decades have seen an explosion of relatively inexpensive, soft-cover editions of books that were not designed to last. Thus, the task force looked at both chemical and mechanical problems associated with paper and its binding.

Paper. Until the middle of the nineteenth century, paper was made from linen and cotton rags, which in principle make a strong and durable product.10 In the 1850s, wood pulp came into general usage for making paper more economically. The publishing industry rapidly converted to this process, following the lead of the newspapers, for which wood pulp was a source of inexpensive newsprint. The manufacturing process required that wood-pulp paper be treated with aluminum sulfate (alum) to keep the inks from running and to improve the hand. Alum, together with various bleaches and sizings usually added during the papermaking process, reacts with humidity to produce an acid that, over time, breaks down the molecular structure of the cellulose in the wood pulp. In its worst form, damage leads to “brittle” paper that loses its flexibility and eventually crumbles when handled. While any given page may be readable, turning it may lead to several types of damage. In thin, hard-finish paper, the page becomes brittle and brown along the edges and can easily snap off along a fold line. Thick, pulpy paper tends to separate almost spontaneously under tensile stress in any direction, and whole blocks of text may come loose near the gutter and fall out of the volume (Kantor 1986).

Once paper becomes embrittled, nothing can arrest the decay. Such materials are candidates for preservation reformatting, that is, capturing the information content of the original and transferring it onto a stable medium such as preservation-quality microfilm.11

While acid paper of any sort is at risk, decay manifests itself unevenly. Manufacturing processes vary a great deal, and the conditions of storage and use can vary dramatically as well. Few places have proved to be as bad for wood-pulp paper as the humid, polluted eastern seaboard of the United States. Books, like people, prefer California-like climates that are temperate and do not vary dramatically.

Many acid-paper items are not yet embrittled and can be stabilized to arrest the process of embrittlement. The two chief methods of stabilizing acid-based paper are deacidification and storage under optimal conditions. Deacidification is a chemical process whereby paper is treated with an alkaline buffering agent that neutralizes the acid content. It can be done on a single item or on many items at a time. There are facilities that can deacidify bound materials en masse, and some facilities can even treat unbound materials, such as sheet music, archival documents, and newspapers. Deacidification can stop further damage, but it neither reverses damage done nor strengthens already-brittle paper. Therefore, it is unsuitable for books that are damaged or weak. Part of the cost of treatment lies in carefully assessing, item by item, how suitable a book is for deacidification.12

Embrittlement can also be avoided or slowed by storing materials in stable environments with set parameters for temperature and humidity. These storage conditions cannot be obtained in open-access stacks. The conditions that slow the decay of library materials would be uninhabitable for most people. One of the reasons that libraries have been eager to build remote storage facilities is to lengthen the productive lives of their print collections.

How extensive is the problem of brittle books? In 1984, the Library of Congress and Yale University surveyed their holdings and found that one-quarter to one-third of their collections were highly embrittled and in danger of imminent disintegration. This alarmed other libraries, which turned to their partners in the academic community to help assess their collections and devise a coordinated strategy to address the problem of brittle books. The Council on Library Resources asked the Association of American Universities and the American Council of Learned Societies to join in creating a task force to study the extent of book deterioration. In 1984, Robert Hayes, then dean of the Graduate School of Library and Information Sciences at the University of California at Los Angeles, was commissioned to determine the percentage of embrittlement at major repositories in the United States. He determined that of the 305 million volumes in Association of Research Libraries collections, approximately 25 percent were brittle. Hayes also attempted to determine the degree of overlap among libraries to find out how many of these were individual titles that needed to be preserved. The number he arrived at was 12 million, and he estimated that about one-third of these could be microfilmed in a 20-year period (Hayes 1985).
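The arithmetic behind Hayes's estimates can be sketched as follows. This is a minimal illustration that uses only the figures quoted above. The variable names are hypothetical, and the per-year filming rate and the average number of brittle volumes per title are inferences from those figures, not numbers reported by Hayes.

```python
# Figures quoted in the text (Hayes 1985).
ARL_VOLUMES = 305_000_000      # volumes in ARL collections
BRITTLE_PERCENT = 25           # approximate share found to be brittle
UNIQUE_TITLES = 12_000_000     # individual titles needing preservation
PROGRAM_YEARS = 20             # period over which one-third could be filmed

# Brittle volumes across all ARL collections.
brittle_volumes = ARL_VOLUMES * BRITTLE_PERCENT // 100

# One-third of the unique titles could be microfilmed in the period.
filmable_titles = UNIQUE_TITLES // 3

# Inferred (not reported): average filming rate per year.
titles_per_year = filmable_titles // PROGRAM_YEARS

# Inferred (not reported): average brittle volumes per unique title,
# a rough indicator of overlap among library holdings.
volumes_per_title = brittle_volumes / UNIQUE_TITLES

print(f"Brittle volumes:          {brittle_volumes:,}")
print(f"Titles filmable:          {filmable_titles:,}")
print(f"Titles per year:          {titles_per_year:,}")
print(f"Brittle volumes per title: {volumes_per_title:.2f}")
```

Read this way, the estimate implies that each endangered title was held, on average, in several brittle copies, which is why coordinated filming could avoid a great deal of duplicated effort.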

The reformatting of brittle books accomplished two things: it rescued information deemed endangered and increased access to that information, a point that was critical in persuading the U.S. Congress to fund the National Endowment for the Humanities (NEH) Brittle Books Program. Each reel of preservation microfilm produced under the auspices of NEH was made available for purchase, in accordance with any copyright considerations, and each title filmed was entered into a database that recorded the existence of the film. This strategy not only helped avoid accidental duplication of effort but also made known the availability of the titles.

Although important for meeting the needs of remote users, microfilming books seldom best serves the access needs of local users. Photocopying onto acid-free paper is the preferred technique for this purpose, and it is an option used increasingly by most libraries.13

But what does shared access to the artifact look like? How can a single book serve the needs of both local and remote users? The example of registering reformatted books raises a question for those engaged in mass-deacidification programs. Does each library have to duplicate the deacidification work of the others? Would it be possible to create national registries where libraries log the books that they have treated? Other libraries could then consult the log and determine whether local demand dictated treating their copy of the work, or whether it would be acceptable to box or shrink-wrap the work and send it to remote storage, knowing that if that copy became too brittle to use, library patrons would have access to another copy through interlibrary loan. Unlike reformatting on film or digital files, saving one book in its original form does not increase access to it. The problem is, how can libraries achieve economies of scale in the preservation of artifacts? The first step would be to improve passive systems, especially environmental conditions. If libraries do not intervene to save every low-use book that will turn brittle, but take action (such as deacidification) to stabilize some number of them that can fulfill the needs of patrons who must use an original, how many of a single item should be preserved, where, and at whose expense?

Yale University’s Sterling Memorial Library has microfilmed more than 60,000 books during the last 12 years. During that time, though, the library acquired 150,000 additional books each year, and more than 65 percent of these came from countries where permanent, or alkaline-buffered, paper is not used. In other words, almost 100,000 volumes being added to the library’s collection each year are at risk of becoming brittle in the future. On the basis of its own estimates, Yale determined that filming a volume after it became brittle would cost about $120 in current dollars. Scanning the volume might cost as little as $80, but the cost of preserving and managing digital files over time is unknown, and, in any event, it is not yet a preservation format. Deacidification would cost $17 a volume. This is one library’s estimate of how cheap an ounce of prevention would be in comparison with a pound of cure (Walls 2000). This is in line with the LC’s estimates of the various treatments available for print materials (see Appendix VI).
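The Yale comparison can be sketched as a back-of-the-envelope calculation. The acquisition figure and per-volume costs come from the paragraph above. The 65 percent share is treated as a lower bound, and the assumption that every at-risk volume acquired in a year would receive exactly one treatment is purely illustrative. The function and variable names are hypothetical.

```python
# Figures quoted in the text (Walls 2000).
ANNUAL_ACQUISITIONS = 150_000
NON_ALKALINE_PERCENT = 65  # "more than 65 percent", used as a lower bound
COST_PER_VOLUME = {        # current dollars, per volume
    "deacidification": 17,
    "scanning": 80,        # excludes the unknown cost of managing files over time
    "microfilming": 120,
}


def at_risk_volumes(acquisitions: int, percent: int) -> int:
    """Volumes acquired per year on paper that is not alkaline-buffered."""
    return acquisitions * percent // 100


def treatment_costs(volumes: int, cost_per_volume: dict) -> dict:
    """Total cost of applying each treatment to every at-risk volume."""
    return {name: volumes * cost for name, cost in cost_per_volume.items()}


at_risk = at_risk_volumes(ANNUAL_ACQUISITIONS, NON_ALKALINE_PERCENT)
costs = treatment_costs(at_risk, COST_PER_VOLUME)

print(f"At-risk volumes per year: {at_risk:,}")
for name, total in costs.items():
    print(f"{name:>16}: ${total:,}")
```

Under these assumptions, deacidifying a year's at-risk acquisitions costs roughly one-seventh of what filming the same volumes would cost after they had become brittle.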

With respect to the future, the good news is that publishers in the industrialized world had largely ceased to use acid paper by the 1990s, at least for first printings. In 1990, the U.S. government mandated the use of permanent, acid-free paper in all documents and publications that were to be archived, and most state governments have followed suit. Most publishers agreed to print the first press runs of hardcover books on permanent paper.

Librarians have noted a disturbing trend during routine checking of new acquisitions for acidity. In many newly independent and emerging countries (countries from which U.S. libraries get a significant portion of their collections), printers continue to use unbuffered wood-pulp paper. A large portion of these acquisitions are at high risk for acidity. But more and more academic and trade presses in the developed countries are producing reprints (and an ever-increasing number of first press runs in paperback) on acid paper. Consequently, many new acquisitions, not only from Asia or South America but also from Europe and North America, may be at risk of embrittlement unless deacidification or other preventive measures are undertaken.

Binding. In the West, changes in the publishing economy have created yet more preservation problems. To save money and cater to mass markets, publishers have sharply increased the numbers of paperback books published. As a result of this trend, which began in the 1950s, libraries need to rebind these acquisitions even before putting them on the shelves. This may divert money that would otherwise be spent for preserving older items.

Books and other bound materials are vulnerable not only because the paper may be weak but also because physical handling weakens the structure of the volume and creates other problems. A study done in 1994 at New York University yielded some interesting data. During one week, preservation experts looked at everything that came across the circulation desk. They found that 21 percent of circulating books returned with spine damage and 14 percent needed rebinding or recasing. Thirty-two percent of the books returned were stained or damaged in some way.14

The latest ARL preservation statistics (1996-1997) reflect the local need to keep books as objects alive and well. ARL libraries repaired 12 times as many books as they filmed (873,000 versus 70,597) during that period. Thirty-six percent of total preservation expenses went for contract binding; filming accounted for only 3.6 percent. Of the 2 to 4 percent of library budgets that go to preservation (exclusive of capital costs), item-level repair appears to be the asset-management strategy of choice.

3.2.2 Evaluation of the Artifact and Selection for Preservation

The recent interest in the production of knowledge has made materials that libraries acquired as secondary sources in the last two centuries into primary sources for this century. The lively field of printing history has made the book itself, rather than its content, the subject of investigation. This has promoted a large and not very physically robust category of resources to artifact status.

It is not surprising that historians of print are interested in books that were created before the mechanization of printing and binding. Book production before the 1830s was craft work. Printers delivered unbound sheets to the booksellers, who then bound them into volumes that carried the printers’ own imprints. One printing was distributed among a number of booksellers, each of which would bind the volumes differently. Consequently, the same printing could appear in a number of forms. Each set can be considered unique in its printing, binding, or dissemination and thus worthy of retention in a collection.

The introduction of mass production has not diminished the status of books from the latter two-thirds of the nineteenth century as items with artifactual interest. There were so many innovations in the business and art of printing during that period that there are now many candidates for special treatment as objects, irrespective of the content. While books printed before 1801 are usually managed as “rare” books, there is an increasing use of the category of “medium-rare” books of the nineteenth century that are served to patrons under somewhat stricter protocols than are general collection imprints. This could be because the books have illustrations that are of research value or are vulnerable to mutilation; have aesthetically significant bindings; or were produced by special printers or publishing houses. These books rarely receive the intensive, item-level treatment that a rare book would get; a global or collection-level treatment is sufficient to ensure that they survive in usable form. This is one example of preventive preservation that should be encouraged in libraries holding important collections of such materials.

One should not underestimate the amount of time it takes to select books for this kind of treatment. (Books are rarely shelved by age, after all.) Some libraries pull these books during normal shelf inventorying or when they cross the circulation desk. Other libraries pull these books when they are selecting items for filming, scanning, deacidification, or secondary storage. To expect each library to develop a collection of artifacts relevant to the history of printing simply because it has books from eras relevant to that field, however, is neither feasible nor responsible.

Many paper-based collections other than books and periodical publications held in special collections libraries and departments are also at risk. Libraries, archives, historical societies, and museums house large collections of pamphlets, letters, brochures, broadsides, sheet music, printed maps, advertising art, playbills, restaurant menus, scrapbooks and memorabilia, almanacs, proof sheets, children’s books, religious tracts, and other items printed between 1800 and 2000. The task force has not focused on these collections because, to the extent that they are rare or unique and constitute primary sources, institutions that hold them do not dispose of them. Nevertheless, retention of these materials does not guarantee their survival. These collections are invaluable intellectual and cultural resources, and they must be considered in any national preservation strategy. While these materials fall outside the scope of consideration here, it is important to emphasize that these collections warrant separate investigation. Indeed, the ARL recently completed a study of special collections within its membership that underscores the need to devote attention to the stewardship of these types of collections (Panitch 2001). The report points to a number of areas, including but not limited to preservation, that demand fresh approaches as well as new resources. The task force recommends that such studies be extended to non-ARL libraries that hold special collections, that the needs of these collections be identified, and that a strategy for devising and funding cost-effective solutions be developed.

3.2.3 Creating Surrogates: Filming versus Scanning

Surrogates are created for one of two reasons: to create copies of works too fragile to use or to replace items in imminent danger of disintegration. In the former case, a rare book or collection of broadsides may be scanned and the originals retired from use, except under extraordinary circumstances. For materials that are on their last or next-to-last use, the content is reformatted onto a more stable medium, such as preservation microfilm, or is photocopied onto acid-free paper. For these types of books and other items that are in current demand, most libraries create paper copies, a far more convenient mode of access than is microfilm. In most cases, the source is photocopied onto acid-free paper; however, at libraries with active digitization programs, the source materials are often scanned and the scans are used to recreate the original volume on acid-free paper on demand. The question that then arises is what to do with the original. This issue is addressed in Section 3.2.4.

Microfilming remains the gold standard for preservation reformatting of low-use materials. With proper storage and handling, preservation microfilm can remain faithful and legible for a century or more. Film is still considered to be the best medium for preservation of images. But microfilm is just that: images, and no more. Digital reformatting that includes optical character recognition (OCR) adds functionality, including the capability for full-text searching.

Given the value added by digitization, why isn’t all reformatting digital? The chief reason, besides the often higher cost of scanning and creating searchable text compared with filming, is that there is as yet no reason to be confident that digital files will last as long as microfilms, or be as easy to manage over time.15 The preservation community has given much thought to making preservation microfilm whenever digital scans are made or to converting preservation microfilm into digital scans (Chapman, Conway, and Kenney 1999), but few libraries have embraced this more costly approach. Even though the money goes toward ensuring preservation after creating access, funders tend to be reluctant to put money into preservation when the same money can be put toward enhancing access to something else.

The investment taxpayers have made, through the NEH, to create microfilm copies of brittle books could be leveraged to create digital scans of those books for ready access. This notion, however, has elicited little enthusiasm. Part of the reason is that the books that have been microfilmed are often low-use items. As one expert has written, “Brittle books have been selected for filming because they have potential research value, but are low priority for current researchers and so can be put on film for storage even though it is an awkward access medium” (Gertz 1999). Digitization of low-use special-format collections, by contrast, is common. This predilection to scan special collections rather than monographs is based in part on the idea that special-collections materials (maps, photographs, manuscripts) have traditionally received little use because they exist in single copies in one collection. Once made easily accessible, these materials may become high-use items.

One significant exception to the practice of not scanning low-use print materials is the Making of America (MOA) program at the University of Michigan and Cornell University. This digitization program focused on brittle American monographs and journals from the latter half of the nineteenth century, and the selection of materials was coordinated between the two libraries. This conversion of brittle American monographs and periodicals has created an interesting and largely unanticipated result: the MOA journals at Cornell receive thousands of hits weekly. The beneficiaries of the project are many. For example, a graduate student used the publications as research materials for his dissertation. They enabled him to complete his studies from abroad when his wife’s job took them out of the United States. William Safire noted the value of mining the text for early uses of words and phrases and cited MOA publications scanned using OCR as a rich source for nineteenth-century texts (Safire 2000). A high school student found the ideal material for a term paper, while a company in Detroit located an engraving of Daniel Boone, which they intend to make into a poster to commemorate the city’s 300th anniversary. The visibility of digitized materials on the Web has facilitated their discovery, resulting in usage that greatly outstrips that of the paper copies that were slowly decaying on their shelves. A dedicated researcher may have consulted the print copies, but the secondary school student, the genealogist, the lexicographer, and the insurance company are unlikely to have engaged with the paper volumes as they are able to access these digitized publications on the Internet.16

A recent study on how humanities scholars work in this evolving information environment, based on surveys and case studies, reports a similar trend in preferences for electronic access to print materials via Web-based data repositories:

. . . the scholars all had access to a number of full-text databases published by Chadwyck-Healey, the Women Writers Online project from Brown University, and full-text journals from the Johns Hopkins Muse project and from JSTOR. Few scholars mentioned using these full-text resources, but the ones who had were hooked on what they could offer and particularly appreciated those products that provided access to primary sources (Brockman et al. 2001, forthcoming).

The scholars who were “hooked” reported that the searching techniques available were especially prized. The report noted that “the thoroughness with which searching is possible across any of the corpora covered by these databases means that once they have been recognized by a group of researchers in a particular field, their use is obligatory.”

The observation that use is “obligatory” means that these scholars are now able to avail themselves of otherwise-scarce texts. Women Writers Online was cited specifically for both research and teaching uses. The types of searching are novel as well, being inaccessible to manual research; they include such techniques as keyword searching, pattern identification, and an abundance of searchable elements. Finally, the database created a contextual mass of different editions of the same work that allows collation and comparison among versions. In addition to reporting use of the larger commercial databases, a few scholars recounted using smaller, noncommercial Web-based projects devoted to individual authors.

All these instances testify that scholars are growing more comfortable with digital surrogates of texts. There is a need to develop and apply methodologies to track the growing use of large digitized collections and to evaluate how researchers use these aggregations. Their increasing use also raises the question of what value there will be for libraries to maintain their hard-copy collections of these texts, when the same collections, without gaps and items missing from the shelves, are accessible from a computer 24 hours a day from anywhere in the world. Duplicative collections of materials that are not rare and are readily accessible on the desktop will have lost much of their original reason for being: to provide local access. What will the next generation of scholars think of the journals and monographs that languish in dead storage? Many of them will have grown up with JSTOR, MOA, and other databases and will find the very notion of having to retrieve hard-copy journals from stacks during library hours a hindrance to research. How much will a university or college be willing to pay for digital access and also for keeping hard copies? How do we ensure that enough originals are kept and are available for present and future use?

Although we are far from having a comprehensive body of primary and secondary literature readily available for use from the desktop, there is every reason to believe that within a decade, significant corpora of texts will be available in a number of fields. It is not too early to plan for the eventual disposition of the scores, or even hundreds, of duplicate copies of individual items that scholars, voting with mouse clicks, prefer to use online. The question is not whether libraries should keep hard copies of them. Of course they should. The real question is, in a world in which access is no longer tied to physical possession, let alone ownership, which institutions, and how many institutions, will take the responsibility of ensuring the preservation of and access to hard copies? Information technologies may allow libraries to develop a system of registration or other kind of tracking that can allow linking local preservation decisions and investments to the changing needs of a national and international research community (Greenstein 2001). And with new models of distributed preservation responsibilities facilitated by digital technology, shared among libraries, must come financial and legal support for those actions taken on behalf of the group.

3.2.4 Responsible Retention Policies

Books are totemic objects that have developed a powerful status over centuries, both for their content and for their physical significance. At the same time, books, like other objects, have always been cultural variables. Codices were routinely expunged in the Middle Ages, their writing scraped off to make room for a new work, which, in turn, might be effaced by a later generation. That said, task force members testified repeatedly to the symbolic value of the book and to the cultural significance of the library both as a building and as a mental construct. They asserted that responsible stewardship is necessary to strengthen and reinforce those values. The spiraling costs of basic library functions and the added demands of new and more expensive services and collections cannot be allowed to put libraries and their missions at risk. To ensure the continued accessibility of current library collections into the future, not to mention the extension of access to those collections through networked resources, economies of scale, some of which entail the forging of cooperative enterprises, must be achieved. The technical infrastructure, financial resources, and public support must also be secured to sustain those actions.

As a way of envisaging how responsible retention of reformatted materials might be reconciled with the economic realities of preservation, librarians can look to archivists for sound guidance (Kenney 1996). The benchmarks of sufficient quality film or digital capture include the following:

  • The scan has captured all informational content: color, original formatting, full content, or whatever else is important in the source document.
  • The document is fully accessible through a defined indexing-and-retrieval scheme.
  • Access to the digital file will be maintained over time, and the data will be protected from loss, corruption, and technological obsolescence.

A consortium of leading American research libraries has proposed benchmarks for the digital capture of text and images, with technical specifications intended to meet these requirements. If adopted and widely practiced, these benchmarks would go a long way toward building digital files on which researchers could rely for some minimal levels of capture and fidelity (DLF 2001a).

Once a library has created scans of sufficient quality to serve as full surrogates, it should put in place a plan to maintain those resources over time. It should also consider the question of what to do with the original source material. A plan for maintaining the files over time may one day become routine, but at present there are very few libraries or archives that can or would assert that their digital assets are secure for more than a few years hence. There is a need for what might be called digital service bureaus or utilities that would provide such services as scanning, managing files, delivering files to local users, and long-term archiving. This infrastructure needs to be in place before most libraries and archives could develop routine, cost-effective digital services.

Any scanned material that is rare and of artifactual value should be handled with care during scanning, and retained afterward, even if retired from active duty, and stored to prevent further damage. Items that are common, such as journals, and that have content value but little artifactual value may also be sent to storage. However, if hard copies exist at other sites, there is no compelling reason to retain them, unless the local patrons have a history of using hard copy even when digital files are available. What is important is that researchers who do need hard copy should be able to locate and retrieve it with relative convenience. In many cases, this may mean developing shared repositories for originals, used and supported by several libraries that store their materials in one site and are able, thereby, to create richer and often more comprehensive shared collections. The custodians of artifacts need to design a plan that strives for the most comprehensive coverage of given journals or subject matter. This would almost certainly entail collaboration by the repositories of the original materials, support from their local constituencies, and concurrence that acting for the greater good might engender some local inconvenience. The Five-College Library Depository (see Section 4.1) is a model for this kind of shared storage. Such a repository might be able to afford collections services such as in-house preservation facilities, or even scan-on-demand services, that single institutions could not develop and sustain alone.

Promoting better understanding of the importance of the artifact will require a clear, succinct framing of the issues and structured discussions. The objective should be to develop commonly understood and widely endorsed approaches to the problem of caring for an abundance of materials with limited resources. There will never be complete agreement: the matter of what to preserve and how to make that accessible will never be considered resolved for all time. Instead, interested parties should develop and declare a basis for making those decisions that must be made in the immediate future.

The task force’s investigation of the print record has confirmed that there is a great need for the identification and preservation of numbers of important artifacts. Materials that are unique can be identified and preserved. But there are large categories of printed materials that exist in abundance and do not have high market or exhibition value, but are important nonetheless because they exemplify a genre and a way of recording information and communicating. For these materials, the need for the identification is especially great. Such things as antislavery pamphlets, election broadsides, and sermons printed as circulars are distinct genres that have value but are unlikely to survive in toto. A measured plan to save numbers of such genres will probably succeed in securing funding for preservation; a general alarm about their status probably will not. It is important that scholars work with librarians to identify and define categories of materials and locate the finest and best-preserved specimens.

Beyond specific genres of printed ephemera, there is a need for a repository of American imprints (what one of the task force advisers called a “Federal Reserve of National Literature”) that would ensure that an archival copy of American imprints is preserved. The American Antiquarian Society (AAS) has been working for decades to create a deposit of record for American imprints up to 1876. Other than AAS, there is no library or consortium of libraries attempting to preserve a record of American imprints. The Library of Congress, despite the fact that it receives two copies of every copyrighted book in America, does not have a program, or indeed a mandate, to preserve even one copy for posterity. It is worth exploring the possibility of seeking congressional authority and funding to require the Library of Congress to send one of its two copies directly into storage upon receipt and to be obliged to make a surrogate of that copy available for use if the title becomes too scarce to find elsewhere. The Library of Congress has congressional authorization to build additional storage for its needs, but does not have the authority to set aside a copy for archival retention.

The Library of Congress is the working library of the legislative branch of the United States government, and its collections are at the service of the Congress. A request for an archival deposit system has never been made to Congress, and members of Congress neither know about nor understand the need for such a system. Congress, however, has proved itself a champion of scholarship and library preservation in the past, when it authorized and funded the brittle-books reformatting grants under the NEH. It acted then at the behest of scholars, who impressed upon it the irreplaceable value of the information printed on acidic paper. The task force recommends that a consortium of scholars and librarians, working with the AAS, the Library of Congress, and other appropriate bodies, develop a strategy for a series of repositories to assemble and preserve a full record of American publications. One such plan, for example, would have the AAS responsible for materials before 1876; another body or groups of institutions responsible for 1877 to the present; and the Library of Congress responsible for all prospective archiving. This archival repository system would be in addition to the local or regional repository system proposed above, which would serve the needs of scholars routinely requiring originals and would be designed to provide access to them. In contrast, an archival repository or series of repositories would be designed as a fail-safe mechanism that ensured the survival of an original, not ready access to one. The details of governance, deposit, access, and so forth would need to be carefully worked out by a number of interested parties, including librarians, preservation experts, and scholars.

3.3 Audiovisual

3.3.1 Sound and Light as Artifacts

Audiovisual materials present many of the same preservation challenges as do resources on paper. They are recorded on media that decay and they are abundant, thereby forcing libraries and archives to make difficult choices in acquiring and preserving them. Researchers’ demands for audiovisual products, like the demands for print resources, change over time. This reinforces libraries’ sense that they need to collect masses of materials with no immediate demand in sight, in case they eventually prove to be of research value.

At the same time, in large part because of the extreme fragility of the recording media and the dependence on playback machinery that quickly becomes obsolete, audiovisual materials present new challenges to traditional notions of the intrinsic value of the artifact.17 For printed materials, the artifact is a single physical object that has some measure of fixity. That fixity is largely dependent on the fact that the object has been recorded on a relatively stable medium. The stability of the paper depends on its physical composition, the effects of handling and storage on the original imprint, and other variables; however, there is little confusion about what a book or a journal is and what constitutes the original. Moreover, the role of the print artifact in research is fairly straightforward: it provides evidence, and does so through information that the physical object itself carries. This information offers a means of ensuring the authenticity of an object and of judging the veracity and accuracy of the information contained in it. It also brings researchers in some tangible way closer to the moment of creation of that object and, presumably, closer to the intentions of the creator.

A simple example can be used to illustrate the variety of ways in which audiovisual technologies assault the fundamental notion of artifact. Judged by any criteria, a film by Stanley Kubrick (Spartacus, for example) is important to keep and make accessible to future researchers, even if estimations of its merit fluctuate over time. What does it mean to preserve a film in its artifactual form? The Library of Congress, which acquired this film through copyright deposit shortly after its creation, defines the artifact as the original manifestation.18

The movie was a technological marvel during its day; however, we have now discovered that the 65-mm negative and 70-mm prints made from this negative both fade irreversibly, and sometimes dramatically, over time. What we have of “the original manifestation” of Spartacus is no longer a physical object that provides accurate or meaningful information about what viewers saw in theaters in 1960. The original object was fixed onto a carrier that precludes the very notion of stability. While there was a commercial effort recently to restore the film to a more or less faithful recreation of what it once was, that restoration is not an original artifact from the 1960s, but one from the 1990s.19 Looking ahead to the 2060s, there is every reason to anticipate that 35-mm and 70-mm films will be orphans of a technology that no longer exists, and that projectors to play them will be relics from the past. If no one reformats Spartacus onto new technology, then we will have lost direct access to a film that was crucial in the development of an important American artist who also created Dr. Strangelove and 2001: A Space Odyssey.

This is a relatively uncomplicated example of what happens to the concepts of originality, fidelity, stability, and fixity when information moves onto one of the most significant recording media of the nineteenth and twentieth centuries. Similar examples abound for recorded musical performances and spoken-word documents.

While many humanists are familiar with print resources and the copyright regime that governs their distribution and permits timely preservation action, fewer scholars are familiar with the protocols that govern audiovisual resources. Their manufacture, preservation, and access protocols and, perhaps most significantly, the copyright laws that permit or restrict those protocols, are well-known only to specialists. Moreover, the history of collecting these materials differs dramatically from that of print resources. The following discussion of these matters, together with case studies in Section 4, provides background and a historical context with which to grasp the challenges that these media pose to scholars and librarians.

The curatorial and preservation communities involved in audiovisual materials are widely dispersed: in media, government, and public and research libraries, as well as in museums, historical societies, and regional collecting institutions. There are as many conflicts as there are common interests between the creative community (such as film directors, photographers, and musical artists) and the rights holders (such as studios, news services, and major media companies), or between researchers (folklorists, ethnographers, and linguists) and their subjects (the individuals, communities, and sovereign Native American nations who have been recorded). These conflicts sometimes have serious commercial consequences, and the communities can come to blows over preservation issues such as colorization and letter-boxing, cropping and retouching, or enhancing recorded performances for re-release on CD by cleaning up the noise in an original long-playing (LP) record. Because these resources often have high entertainment and commercial value, the role of the marketplace must be considered in any strategy to preserve and make them accessible for research. In other instances, for unpublished and noncommercial recordings, serious ethical issues have arisen between some researchers and the communities that have been documented. This has led to new access restrictions on old bodies of evidence, which raise the question of whether repositories should invest in preserving materials to which they may never be allowed to grant access. Some of the intricacies of rights management, commerce, and ethics are discussed in detail in two case studies about film preservation and folklore collections (see Sections 4.2 and 4.3). We focus here on describing the scope of visual and sound resources and highlighting some of the issues that affect artifactual evaluations and preservation options.

3.3.2 Still Images

The member libraries of ARL reported holding more than 64 million photographs, pictures, maps, prints, slides, charts, posters, cartoons, engravings, and other graphic arts in 1998-1999 (ARL 1999a). The Library of Congress has more than 13 million items in its photography and print collections. Many special-collections repositories also have significant still-image collections, often filed with printed, cartographic, or design materials in collections organized not by medium but by provenance or subject matter. No one knows exactly how much of this type of material exists in the United States or where it is stored. Nonetheless, it seems clear that the majority of historical documents that are image-based are not in academic research libraries but are scattered in historical societies, natural history collections, special-collections libraries, commercial archives, and local, municipal, state, and federal records offices. The charts that document land ownership and management; architectural drawings and engineering records of the built environment; photographs taken in the course of collecting news or creating journalistic essays; and archives of architectural firms, industrial design companies, and advertising enterprises all constitute invaluable historical records. While many companies take excellent care of their business archives and offer some level of preventive preservation for the still images scattered throughout their records, most enterprises do not have preservation strategies for their archives, lack resources to develop and implement such strategies, and may not even be aware that they hold materials that are of great potential value to historians and others. The task force recognizes that many collections that are part of the visual record of this nation lie well outside the purview of research institutions and that special efforts are required to identify and properly preserve these materials.

Within research institutions, still-image collections have been slowly coming into their own as primary source materials. In public libraries such as the New York Public and the Library of Congress, both of which have premier visual resource collections, most users are professional picture researchers who are looking for images to reproduce in publications. There is still a gap between those scholars who seek out visual sources as primary documentation and those who come to picture collections because they are looking for material to illustrate a monograph or an article. While professional picture researchers will always constitute a significant portion of users in public institutions, there is evidence that the current and future generations of humanities scholars will turn increasingly to visual documents for primary evidence.

There are several reasons for this trend in research methodology. The first is chronology, pure and simple. The primary recording media of the past 100 years have been visual, and anyone studying that period can—or in many cases must—rely on still or moving images for information. A second reason is that the proliferation of photography and easy reproduction of images in magazines and books have exposed the current generation to more imagery. Some believe that this has led to a rise in “visual literacy”; others argue that exposure to many images does not create the ability to “read,” or understand, an image critically—the true sign of literacy. (Despite the proliferation of imagery in everyday life, few graduate programs outside the traditions of art history and archaeology teach students the same critical approach to visual sources that they do to textual sources. Reading documentary photography with an art historian’s eye can be inappropriate for those pursuing other research agendas.) The third reason is that the topics and methodologies of literary, historical, and sociological research have broadened to include many phenomena that are not well documented in texts but are so in visual resources. Gender studies is a good example of a field that relies on a variety of sources from a variety of disciplines and makes good use of nontextual materials for such things as the history of domestic life (for example, illustrated magazines, advertising art, family photographs). There are also relatively new fields, such as environmental studies, that end up relying on the inadvertent documentation of built and unbuilt environments that would not have been remarked upon in texts. This priority of image over text is encouraged by the digitization of images. Visual resources perform superbly in an online environment.20

That said, what is the artifact in question? In the case of film-based images, is it the print, or is it the film from which the print was made? Sometimes the film has been developed and printed in a way that makes the printed image significantly different from the image captured on film. Most curators choose to collect both the medium of original capture and the print, when this is possible.

In pictorial collections, the items that receive priority for selection and preservation include any work that is of artistic value or of a certain age, or that has been created by a well-known artist, photographer, graphic artist, or cartoonist; has a significant provenance; constitutes an institutional priority; and so forth. Original works (vintage photographs and their negatives, or original drawings) receive the best care. Color film and photographs are generally more fragile than are black-and-white film and photographs; however, any photograph printed on resin-coated or acid paper is at risk.

Film-based materials are often served in a high-quality surrogate, such as a digital image, slide, or copy print, unless the researcher is primarily interested in the object rather than the image content. A researcher using Toni Frissell photographs from the front of World War II could be satisfied with copy prints if she is looking for general-level information about Frissell’s coverage of the war or various activities on the front. Another researcher might be interested in seeing only vintage prints, negatives, and contact sheets that carry information about the entire project or shooting assignment, the original sequencing of shots, how the negative was cropped and printed, and so forth.

One of the dilemmas of preservation is that the best conditions for storing an object often compromise the object’s effectiveness for research. Still images, whether film or paper based, are often an integral part of a mixed-media collection. Such images might include photos in a personal archive, illustrations in a book, or architectural records in a business archive. Preservation would demand that they be physically separated from the original collection and stored elsewhere in cold vaults. Each medium has its own storage requirements; for example, requirements for black-and-white film differ from those for color.

There are large collections of documentary photography that gain research value as they are supplemented by a range of secondary materials—notebooks, work files, and contact sheets—that enlarge the evidential base of the documents under review. In many cases, it is sufficient to create surrogates that can be kept with the original collection. Copy photos or digital reproductions usually provide all the information needed, but creating high-quality surrogates can be costly.

3.3.3 Moving Images

The catastrophic loss of silent film—the incunabula of moving images—that occurred before movies came to be seen as respectable sources for research should warn us of the perils of undervaluing new media. What we have faced in film is not only a failure to keep films that have been collected from undue deterioration but also a failure to collect films systematically at all. Fortunately, certain events conspired in the last 15 years to draw attention to the lamentable state of film preservation and galvanize communities into action. This should serve as an example of what can and should be done for other audiovisual sources.

While both public and private nonprofit archives have long been working to document and preserve film, it was the emergence of ancillary markets for resale of film inventories that moved the industry itself to invest resources in preservation. As videotape playback equipment became cheap enough for the consumer market, studios predicted that recycling old films for reissue would be worth the investment. That led to the introduction of colorization, theoretically to enhance the appeal of old movies to consumers. Colorization and cropping to fit a television screen, in turn, galvanized the creative community around issues of original intent and moral rights, which sparked a confrontation with commercial forces and a widening circle of discourse about the state of the global film heritage.

In 1988, Congress passed the National Film Preservation Act, creating the National Film Preservation Board and asking the Copyright Office to investigate the colorization and material alteration to film. By the time the film board was reauthorized in 1992, its focus had moved from alteration to preservation in the broadest sense. A consequence of its work was that all the players—archives and libraries, studios, artists, and distributors, each with competing interests—were forced to identify their common interest in the integrity of the film heritage and to collaborate for the first time toward a common preservation goal. This concern for the integrity of and continued access to film was one factor in the development of a national strategy for preservation. It coalesced easily with other significant initiatives of the time, notably the formal establishment of the Association of Moving Image Archivists (AMIA), which brought together the nonprofit and for-profit professionals in film—archivists, technicians, filmmakers, academics, and laboratory managers. The high-profile efforts of filmmaker Martin Scorsese and the Film Foundation also raised awareness of these issues among the public and creative communities. In 1993, the film board undertook a national study of preservation needs and the following year put forth a national plan (LC 1994). In 1997, with congressional authorization to fund film preservation, the National Film Preservation Foundation opened its doors with the financial support of the Film Foundation and the Academy of Motion Picture Arts and Sciences.

That landmark film-preservation study addressed a broad range of issues, including technical, legal, economic, and financial issues around preservation. It helped document what had been created, what had been lost, and what was at highest risk for loss. The first order of business was to document what had been lost, and that was tricky. Written records, old journals and newspapers, and collections of film memorabilia provided clues. A mental reconstruction of the film industry at certain periods of time led to meaningful deductions about what must have been and to an understanding of the distribution chain that could suggest where copies of films had been shown. This, in turn, led to happily correct suppositions about where missing films could be located. The American film industry often shipped abroad films that, for various reasons, were never returned at the end of a run. Australia and the Czech Republic proved to hold significant American films thought to be long lost. International collaboration is crucial to the ongoing reconstruction of the film record of any nation.

Documentation is a serious challenge for the film community, especially for noncommercial films. Because so much of what has been created is either lost or in imminent peril of disintegration, information about what once was made and how it was distributed and received can serve very important needs, even without the original itself. Film, after all, is a public medium with wide influence on those who see it. One example of a documentation project on which scholars and archivists have collaborated is the British Universities Film & Video Council (BUFVC). It has created a database to document what newsreels were shot and shown in theaters, even those newsreels that no longer exist.

From these state-of-film data, the National Film Preservation Board and its collaborators developed a strategy for preserving films that was based on a clear division of labor. Studio-produced and -distributed films are preserved by a series of collaborative and unilateral arrangements. Because studios now see their old films as assets, they are preserving the artifact and its economic value. Studios are building new storage facilities. But while many studios preserve a significant portion of their inventory, a very large portion of films are actually preserved and stored at the four major film archives (the Library of Congress, UCLA, the George Eastman House, and the Museum of Modern Art), often with financial assistance from the studios.

The National Film Preservation Board’s study also highlighted an array of noncommercial films, from independent art films to documentaries and home movies, that are vital parts of the historical record. Some “home movies,” for example, were shot at an internment camp for Japanese Americans. Because these films and their importance have now been identified, public and private support for funding their preservation has emerged. The National Film Preservation Foundation was established to coordinate these preservation efforts. Its approach is to find funding from a mix of private and public sources to regrant to those repositories—historical societies, regional film archives, local museums—for the purpose of undertaking preservation work.

The preservation challenges are complex, and the solutions are expensive. Film-based materials need to be protected not only from their inherent chemical instability but also from the mechanical damage incurred by running a reel through a projector. A master negative needs to be properly stored and used only to produce intermediaries that can be used for creating access copies (either film or tape). Original materials for old films, no matter how badly deteriorated, need to be saved in the event that a new technology comes along that can extract more information from them. The four major film archives in the United States all keep original materials.

There is much dissension among preservation experts about whether analog materials should or should not be preserved in digital form. With respect to film, the important thing to note is that the physical artifact itself is an endangered species that warrants special measures to ensure its survival. Few institutions have taken on the burden of preserving film. The best practices that have emerged include retaining as much of the original material as possible, including any supporting materials that provide further evidence of the film’s original state or that can be used to restore lost portions of films. For the sake of preserving the original materials, researchers are given access to reformatted versions. Technology is increasing the fidelity of reformatting so successfully that most researchers do not need access to the original.

3.3.4 Recorded Sound

The heritage of recorded sound is imperiled because the industry that manufactures currently acceptable preservation media—primarily analog reel-to-reel tape—is phasing them out, along with the playback equipment needed to render the encoded information intelligible. Sound recordings seem to be especially vulnerable to technological obsolescence. Anyone who grew up with 78s, 45s, and LPs, then cassettes and CDs, and now reads about (or downloads files through) Napster and Gnutella, is aware of this shifting landscape. Nonetheless, these consumer formats have remained remarkably stable relative to the industry standards with which the preservation community has to keep pace. One expert reported the following to the Library of Congress:

Most industry representatives report that their audio and video products—media and systems—will all be digitally based within the next five years or less. This abandonment of traditional analog technology has come alarmingly fast and in dramatic fashion. Access, even on a limited basis, will need to become digital. Master preservation copies will need to become digital (Storm 1998, vi).

Many people disagree about whether digital preservation of film is necessary for analog films, because commercial entities still shoot with old-fashioned film stock. Editing is now done primarily on digital equipment, and some major filmmakers—George Lucas most famously—are shooting with digital cameras. As long as film stock continues to be manufactured, the debate will go on.

Recorded sound is a different matter. In this case, members of the preservation community will soon find themselves without the tape and equipment they need for analog reformatting. Acquiring the equipment needed to transfer analog to digital is a huge capital expenditure—one that many institutions that hold valuable recordings are not prepared to make. Consequently, many collecting institutions will have to outsource their preservation reformatting. Such a process is fraught with risk because few laboratories specialize in the art of old media.

Among the most significant repositories of sound recordings are the Library of Congress, the National Archives, the New York Public Library, and a number of academic institutions, including Columbia, Yale, Syracuse, and Stanford Universities. There are also important, but smaller, collections such as those of whale sounds at Scripps Oceanic Institute and Cornell University’s ornithology collections. Folklore and linguistic materials abound in the personal collections of many scholars. Some type of preservation needs to be done for these items as well, because they constitute primary data for these scholars’ research. If one of these personal, unpublished, and often uncataloged collections goes, the evidence base for that scholar’s work also disappears. It is fair to say that for field recordings that exist on various tapes and cassettes, the artifact itself has little to no value; the chief mission of preservation is to rescue the information on those media before it disappears. It is crucial that scholars attend to their own collections, and that they do so now.

Perhaps the most important thing that scholars can do to prevent the loss of these documents is to recognize their value and to lobby for support for their preservation. The U.S. Congress, largely in response to pressure from constituents and from the industry, recently passed the National Recording Preservation Act of 2000. Modeled on a similar act for film preservation, it calls for public recognition of the value of the nation’s sound heritage and for a survey to document the state of recorded sound. (See Appendix VII for the full law.) Until that survey is completed, it will be hard to assess accurately what is imperiled.

A significant part of the identification problem is that, outside the field of classical music, sound recordings have not been incorporated into the canon of bona fide research resources until recently. The copyright regime has been slow to grapple with the fact of recording. Recorded sound has been covered by the copyright code only since 1972. Even when one admits popular and ethnographic music into the fold, libraries have collected chiefly published recordings. Published materials have not been readily incorporated into the bibliographical systems of libraries, and unpublished materials have almost never been included in these systems. Researchers have found that commercial catalogs are the best sources for the initial phases of a search of published recordings. For unpublished materials, which constitute the core collections of most anthropologists, ethnomusicologists, folklorists, and linguists, few if any points of access are available. Scholars who are themselves creators of documentation should start thinking about access and preservation issues at the moment they create the documentation, from getting informed consent from subjects so that rights to access for various purposes will not later become a problem, to using best practices for recording, describing, and storing materials.21

At present, selection of audio for preservation, like that of moving image material, is based on an assessment of the following factors:

  • cultural value of the item
  • historical uniqueness of the item
  • estimated longevity of the medium
  • current condition of the item
  • state of playback equipment
  • access restrictions
  • frequency of use

When the access copy or preservation copy does not adequately capture the information, the original must be retained for future use because of the probability of advances in technology.

3.3.5 Broadcast Media

Broadcast media—radio and television—are the stepchildren of the audiovisual media. While everyone acknowledges their reach across the country and their power to shape public perceptions, interests, and thinking, few institutions take them seriously as sources of historical information, at least as evidenced by how few collect them. Vanderbilt University collects television and specializes in network news. The Library of Congress has an unparalleled collection of black-and-white-era television holdings and some significant radio holdings, such as the Armed Forces Radio and Television Service Collection. But in general, at both the national and the local levels, what gets broadcast is woefully under-documented.

There is little local collecting, yet because broadcast media are the equivalent of newspapers in another era, the most significant materials are generated locally and should be collected and preserved at the local, not national, level. Those who decry the loss of newspapers from the past are well positioned to raise awareness about the present need to save broadcasting, because what they most value about old newspapers is present in the newer forms of news and entertainment—exquisite sources for popular opinion, advertising, and cultural phenomena of all types.

The National Historical Publications and Records Commission (NHPRC), a body of the National Archives, has recognized the problem and become a leading funder of local television news collection. The Library of Congress has been authorized to create an archival record of national radio and television through the American Television and Radio Archives program. However, the state of local television news collections across the country is, in the words of an expert report, “extremely desperate” (LC 1997, 91). The state of collections of non-news television and of radio of all sorts is equally deplorable, if not more so.

The Library of Congress report on the state of television and video preservation (1997) lays out the scope of the problem and proposes a national plan of shared responsibilities to ensure that at least some of the television and videotape in the country will survive. Should these broadcast documents not survive, it would be an irreparable loss for present and future generations. Scholars can play a crucial role in making the academic communities of which they are members, as well as the public, aware of the value of this documentation, the extent of the challenges to future access, and the efforts under way to preserve these materials. It would be ironic and sad indeed if this generation of scholars, using ephemera and other artifacts of the nineteenth century to reconstruct the history and consciousness of that time, were to do nothing to articulate the need to safeguard the comparable artifacts of this era.

To the extent that preservation funding follows closely what the scholarly community has declared to be of research value, all the audiovisual formats discussed in this report suffer at present from lack of advocacy by scholars. But the task force recommendations cannot assume that significant additional funds will be forthcoming simply because the need for them has been identified. University librarians and administrators alike have testified to the task force that preservation budgets today are remaining flat or, in real dollars, shrinking. There is no reason to believe that this trend will not continue. If some base funding were to become available for reallocation, it would surely go to meet the greatest faculty need—the purchase and licensing of electronic resources. However, in the area of film, television, recorded sound, videotape, and other audiovisual media, partnerships beyond the academy are critically important, and the national preservation plans that are in place, or will be so shortly, must claim the time and attention of the scholarly community.

3.4 Digital

At first, digital information objects appear to be outside the scope of work of the task force, charged as it was to investigate the role of the artifact in library collections. After all, what is an artifact if not a physical object? And what is digital information if not intangible, available to be “output” to any number of media for access but having no intrinsic physical form and not bound by the temporal and spatial constraints of print, visual, and sound resources? Is it not digital information itself that is threatening to replace, or at least displace, the physical collections that libraries hold?

It is in part the phenomenal growth of digital information in libraries that prompts this examination of the nature and importance of physical artifacts for research and teaching. Digital technology is changing the ways in which researchers and teachers are getting access to information and in some cases obviating the need to consult an original. The demand for electronic resources and the infrastructure needed to support their access and maintenance is affecting the budgetary climate in which artifactual preservation must fight for its share. Finally, the flooding of the information landscape by “nonartifactual” intellectual property influences our understanding of the notion of the artifact in ways that beg scrutiny. It is for reasons such as these that digital formats warrant special attention.

Thus far, the task force has focused primarily on the value of digital conversion for the purpose of access to originals—on the creation of surrogates of nondigital works to overcome obstacles created by scarcity or physical fragility. With respect to original artifacts, the salient questions to pose about digitized representations are as follows:

  • When can a digital surrogate stand in for its source?
  • When can a digital surrogate replace its source?
  • When might a digital surrogate be superior to its source?
  • What is the cost of producing and maintaining digital surrogates?
  • What risks do digital surrogates pose?

In print, there is a great deal of secondary literature that is amenable to digitization. There is also a great deal of primary source material—dime novels, vintage photographs, and various ephemera—that, because of its scarcity, could be more widely accessible were it in digital form. Other items—for example, oversize materials such as posters, or books in fragile states—are easier and safer to use in the form of digital surrogates. The research tools available for digital materials, such as full-text searching, may make the surrogates more accessible to specific types of research questions than the originals are. Postscanning processes may make the use of damaged items easier because one can lighten dark patches or sharpen the resolution of faded inks.

For visual resources and recorded sound, reformatting for access purposes is nearly always recommended or required for the sake of preserving the integrity of the source material. There is a growing consensus that digital, rather than analog, reformatting will best meet the demand for accessibility, fidelity, ease of reproduction, and cost-effectiveness, although significant issues still must be addressed in standards for access, preservation, and rights management before this technology can fulfill its promise. For audiovisual resources, the funding of the preservation work that is required remains a hurdle.

In sum, there are many ways in which this new technology can create adequate and at times superior access to information in physical artifacts. There are also instances in which no surrogate, no matter how splendid, will serve the scholar’s needs. Finally, the infrastructure needed to support and sustain such reformatting programs is still in its infancy.

Before one can recommend digital reformatting for the preservation of or access to artifacts with intrinsic value as physical objects, it will be necessary to identify the hazards of digital representation of artifacts and to determine the true nature of an artifact in a digital environment. Physical artifacts bear all sorts of evidence about how they were created or manufactured, who had possession of them at what time, how they were used, how they have changed, whether the information they contain has been altered, and whether the alteration was intentional or inadvertent. What happens to this evidence when the object is represented in digital form?

3.4.1 Artifacts and Artifactual Value in the Digital Realm

Libraries and archives have two distinct categories of digital objects—materials that exist only in digital form (“born-digital” information) and materials that are digitized versions of analog source materials (“reborn-digital” information). The two categories can exist side by side on a server, and the information technologists responsible for the maintenance of those files do not draw distinctions between them in terms of treatments. From the researcher’s point of view, however, the distinctions are great, and they will be maintained here.

Digital technology can represent all genres and types of library and archival materials—textual, numeric, visual, and sound information. Because the technology of creating and disseminating information through digital means is relatively new, society has appropriated many terms specific to analog information and used them, almost by analogy, to describe digital objects. This goes beyond such examples as referring to individual instances of files uploaded to computer screens as “pages.” It has also resulted in such awkward back-formations as “nondigital” information to mean everything that already exists in analog form.22 These neologisms and appropriations have considerably complicated the work of the task force, and none has proved more vexing than the use of the word “artifact” in the digital realm.

The simplest, and one of the oldest, denotations of the word “artifact,” as discussed previously, is a physical object on which information is recorded. The very value of the physical manifestation, together with the dependence on the physical medium, has created problems for preserving information printed on paper. That problem deepens with audiovisual information objects, because the mechanical processes that produce and record audiovisual information produce physical artifacts far more fragile than paper is. The concept of artifactual value is contingent not only on changes in cultural valuations but also on information technologies and their businesses—be they printing presses and paper mills or film stock manufacturers and playback equipment companies. All these complexities abound, and are perhaps even more problematic, in digital formats as well.

Some task force members argued correctly that an artifact is something, indeed anything, made by art and that it does not need to be physical. This is one of several definitions that the Oxford English Dictionary attests for the word. Nonetheless, the notion of physicality was intrinsic to the working definition used by the task force, because its central charge was to distinguish between those times when the researcher needs the original physical manifestation and those when a secondary or reformatted manifestation is sufficient. This is yet another example of how slippery a concept “artifact” is at heart. To speak of a “digital artifact” may even appear paradoxical, to the extent that “artifact” implies uniqueness or scarcity, whereas electronic information is replicable to the point where “original” and “copy” may lose their frame of reference. Consequently, when considering artifacts that are originally digital, the first and possibly the most difficult question is “What is the artifact?” This question is discussed in detail in Section 2. We will return to it during the discussion of born-digital items.

3.4.2 Digital Surrogates

What is the utility of a digital surrogate? The answer to this question depends, to a large extent, on the nature of the original artifact and the conditions of its use. Therefore, as a means of determining the value and appropriate use of digital surrogates for library holdings, it may be useful to divide original materials into those that are rare and those that are not, and to divide them further into those that are frequently used and those that are infrequently used. There would be, then, four possible cases:

1. Materials that are not rare and that are frequently used. In this case, we can assume that preservation of the original is not a particularly high priority (since the original is not rare); nevertheless, digital surrogates for such an object might be worth producing and providing, for several reasons:

  • to reduce the cost associated with reshelving the object
  • to make the object simultaneously available to multiple users (for example, through an electronic reserve desk)
  • to replace the object, thereby doing away with the cost of housing it

The first two are obvious and uncontroversial benefits. The third is potentially problematic, even if the object in question is not rare, because it is not obvious that digital surrogates provide all the functionality, all the information, or all the aesthetic value of originals. Therefore, while it may be sensible to recommend that digital surrogates be used to reduce the cost and increase the availability of library holdings that circulate frequently, the decision to deaccession a physical object in library collections and replace it with a digital surrogate should be based on a careful assessment of the way in which library patrons use the original object or objects of its kind. It is not necessary that the digital surrogate possess all the qualities and perform all the functions of the original, but it is necessary that the digital surrogate answer to the identifiable needs and expectations of those who frequently used the original.

2. Materials that are not rare and that are infrequently used. Many libraries now store infrequently used books and other materials in long-term storage facilities. Those materials are retrievable and available to library patrons, but only after a wait of two or three days. With such materials, digital surrogates might

  • help users to determine whether recalling an object from long-term storage was worth the wait—and worth the library’s effort
  • increase frequency of use (by providing searchable metadata, for example)
  • reduce costs by replacing the object with a digital surrogate

The first two are obvious and uncontroversial benefits, and the third comes with the caveat that the digital surrogate should answer to the identifiable needs and expectations of those who (in)frequently used the original. At some point, especially with infrequently used materials that are not rare, libraries might be expected to evolve a calculus that balances functionality with actual use in order to help decide when digital surrogates that provide most of the functionality of originals are acceptable.

One other point needs to be raised, especially here, where we are discussing the component of library collections that has the least “market value.” Libraries, as an institutional and cultural community, need to consider whether these infrequently used and commonly held materials are, in fact, being preserved in a concerted and deliberate way in their original form by any one (or more than one) library. If they are not, the sources for digital surrogates that are common today could easily become rare, or nonexistent, tomorrow. This is the substance of Nicholson Baker’s objection to libraries’ practice of discarding their newspaper holdings. If 50 libraries are holding the same issues of the same newspapers in original form, at great expense and with limited use, then it is difficult to make the case that all of them should pay to house, shelve, reshelve, and preserve the originals. However, if 49 of those libraries, over time, have replaced their physical holdings with digital surrogates, one certainly hopes that the fiftieth library would be aware that its physical holdings were now rare, and therefore subject to considerations outlined directly below under cases 3 and 4.

3. Materials that are rare and are frequently used. In this case, the principal (and very obvious) benefits of digital surrogates are

  • Preservation: By standing in for some uses, the digital surrogate reduces wear and tear on the original object; and
  • Access: By providing access that does not impose wear and tear on the original, the digital surrogate makes rare objects more accessible.

Few would argue that digital surrogates should replace truly rare materials. Digital technology and techniques of digitization are so new, and are developing so rapidly, that we cannot be confident that we have devised the best method for extracting and digitally representing information from any analog source—whether it is a printed page, an audiotape, or a filmstrip. Nonetheless, digital surrogates could, in many cases, stand in for rare and frequently used materials, and could thereby aid in the preservation of originals.

4. Materials that are rare and are infrequently used. On the face of it, these materials seem the least likely to be represented with digital surrogates, if only because digitizing is expensive. On the other hand, if the cost of housing a rare but infrequently used object rises high enough, then digitizing and deaccessioning that object may become an attractive possibility. Here again, libraries need to be aware of the actual or potential rarity of even those materials used infrequently today. Tomorrow, those may very well be the most valuable of artifacts, perhaps for users, or uses, that one could not predict today.

The basic questions, and their answers, are therefore as follows:

  • When can a digital surrogate stand in for its source? When it answers to the needs of users.
  • When can a digital surrogate replace its source? When the source is not rare.
  • When might a digital surrogate be superior to its source? In cases where remote or simultaneous access to the object is required, when software provides tools that allow something more than or different from physical examination, and when the record of the digital surrogate finds its way into indexes and search engines that would never find the physical original.
  • What is the cost of producing and maintaining digital surrogates? The cost of producing digital surrogates depends on the uniformity, disposability, and legibility of the original. The cost of maintenance depends on frequency of use and the idiosyncrasy of format. Beyond that, cost depends on technological, social, and institutional factors that are difficult or impossible to predict. This is an important reason for being cautious when one chooses to replace a physical object (the maintenance costs for which are known) with a digital surrogate (the maintenance costs for which are, to some extent, unknown).
  • What risks do digital surrogates pose? The principal risk is the possibility of disposing of an imperfectly represented original because one believes the digital surrogate to be a perfect substitute for it. Digital surrogates also pose the risk of providing a partial view (of an object) that seems to be complete, and the risk of decontextualization, i.e., the possibility that the digital surrogate will become detached from some context that is important to understanding what it is and that it will be received and understood in the absence of that context.
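
The four cases discussed in this section can be restated as a small lookup. What follows is only a sketch of the reasoning, not a decision procedure that any library uses: the names `SurrogateAdvice` and `advise` are hypothetical, and the two yes/no fields compress judgments that the text treats as matters of careful assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurrogateAdvice:
    can_stand_in: bool  # surrogate may serve at least some uses
    may_replace: bool   # replacing the original is conceivable at all
    note: str           # the caveat the text attaches to this case


def advise(rare: bool, frequently_used: bool) -> SurrogateAdvice:
    """Return the general position of the text for one of the four cases."""
    if not rare and frequently_used:   # case 1
        note = "Replace only after assessing how patrons use the original."
    elif not rare:                     # case 2
        note = "Check that some library still preserves the original form."
    elif frequently_used:              # case 3
        note = "Use the surrogate to reduce wear and widen access."
    else:                              # case 4
        note = "Rarity argues for retention even when use is low today."
    # Summary answers from the text: a surrogate can stand in when it meets
    # user needs (assumed here), and can replace only a source that is not rare.
    return SurrogateAdvice(can_stand_in=True, may_replace=not rare, note=note)


if __name__ == "__main__":
    for rare in (False, True):
        for frequent in (True, False):
            print(rare, frequent, advise(rare, frequent))
```

The sketch makes one point visible: in every case the surrogate may stand in for the source, but only rarity decides whether replacement is even on the table.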

3.4.3 Access

While digital collections such as the William Blake Archive or the Women Writers Online databases have been created (in part) to make disparate items more accessible, scanning is but the first step. Ease of access depends on how these items are described in cataloging and index schemes and how easy it is to find and retrieve records of these collections in a catalog or database. The cost of cataloging, mark-up, and other things that make up metadata (i.e., data about data) is often as high as, or even higher than, the cost of image capture. This cost is also one of the few things about digitization that is not likely to go down as a result of technological improvements, because it requires significant human input. It is important that descriptions of digitized materials be done in the most accessible forms possible, not in hand-created systems or forms that are commercially restricted. Creating records for monographs and serial publications can be fairly straightforward and can allow direct linking to an institution’s online catalog. Visual resources can be marked up according to the controlled vocabularies found in Art & Architecture Thesaurus and Thesaurus for Graphic Materials.

Fields that do not have a controlled and shared vocabulary, such as folklore, are at a great disadvantage in a networked environment. Many ethnographic and field recordings exist in analog formats that are rapidly deteriorating or becoming obsolete or that have a limited life span. They are prime candidates for digital reformatting. Saving the information by transferring it from audiocassette to digital files is only the first step. The files need to be described in conventional language to allow access. It is no accident that textual corpora have been among the first to find new lives in digital form. These are materials that have standardized forms and bibliographical standards. The machine-readable catalog (MARC) record is a standard that allows libraries to ensure uniformity of description and to share information in a networked environment. Visual and sound resources, as well as manuscript collections, are less standardized in description and so are rendered less readily accessible online. Librarians and archivists are making efforts to devise descriptive systems that allow for uniform descriptions of archival collections; Encoded Archival Description (EAD) is among the most widely adopted systems so far. Moreover, key professional associations in sound and moving image archives are developing best practices for digital capture and mark-up. Academic fields such as folklore, dance history, and ecology, which lack best practices in documentation and research, would be well advised to agree on ways to make their resources accessible through standardized descriptive practices. Access tools will be user-friendly only to the extent that users are involved in their creation. It is time to engage systematically in studies of research needs and researcher behaviors online, so that the tool kit developed for their use will meet the users’ needs.

Another missing feature of the current digital library infrastructure is a central location from which to discover which analog collections have been digitized and how one can get access to them. There is much talk in the digital library community about building a registry or series of registries for digitized items, but so far, none exists (RLG DigiNews 2000, DLF 2001b, Greenstein 2001). One reason is that organizations that digitize use a great variety of standards, not only for capture but also for description. Some items are described at the item level, others at the collection level. Items that have been digitized more than once would not all appear in a search of the records as they currently exist. Libraries are often short-staffed, and they cannot report regularly on what they are doing or planning to do. The absence of such reports would jeopardize the timeliness and reliability of such a registry. No single body has agreed to act as an organizing agency to create and maintain the information, and it is not clear from where the funding for such a body would come. The work of such a body would include securing the information and ensuring that changes in URLs were kept up-to-date. In the meantime, a number of associations and groups at the local, state, and national levels have created databases about what they are doing. These, taken together, do not meet the researcher’s need to locate without great difficulty the digital surrogates of items he or she seeks.

3.4.4 Born-Digital Materials

Whereas digital surrogates always originate from physical source materials, a born-digital item has no previous manifestation in physical form. It is entirely dependent on hardware and software for accessibility, storage, and long-term access. Anyone who has tried to trace a citation to a digital source, only to find that the site no longer exists, understands that dedicated maintenance and resources are required to keep digital sources alive, let alone up-to-date. Beyond the problem of stability, it can be difficult to judge the authority or authenticity of Web-based information.

Digital information is by its nature perfectly replicable. To distinguish between the first and the forty-first instantiations of a digital file is a fool’s errand. But born-digital information is of very great import for scholars interested in the artifact, for it challenges notions of originality and uniqueness, and even of authenticity, fixity, and stability. That it does so matters greatly, because of the sheer quantity of information that is being created in digital-only form. In 1998-1999, ARL libraries spent an average of $742,598, or a median of 10.18 percent of their materials budgets, on electronic resources (Jewell 2001). While still a small portion of what is spent on total acquisitions, this percentage is increasing each year. This category of materials would include databases such as the Inter-University Consortium for Political and Social Research data sets (the world’s largest archives of computerized social science data) or the Environmental Systems Research Institute data and maps (a detailed set of data for the United States and the world that includes census boundaries and major transportation features), and, increasingly, thematic research collections. Such examples as Ed Ayers’s “Valley of the Shadow” Civil War site (Ayers et al. 2001) and Columbia International Affairs Online, while largely based on digital surrogates, also include layer upon layer of secondary or editorial work that is uniquely digital, and the sum of the parts is wholly digital. Many of these new “digital objects,” as they are often called, have been created by scholars or publishers or both. They have the same claim to intrinsic value that any intellectual property created with analog technologies would have. By far the largest portion of these electronic sources, however, is the science, technology, and medicine periodical literature, an increasing percentage of which is available exclusively in digital form.

Of the things we value in these materials, what is fungible and what cannot be separated from its carrier medium? How does one begin to make that distinction? In many digital objects, the chief value is in the informational content of the file, and that content is as good in one copy as in another. It is similar to the case of feature films, where the researcher will normally prize ease of access over fidelity to the original manifestation, within limits. Many researchers are happy enough with laser-disc or VHS-tape versions of films they are researching. So it is with digital information: ease of access is usually its greatest value. A PC Word document viewed on a Mac will still serve the access needs of the researcher. It is seldom important to seek out a PC to view that same document with full fidelity to the look and feel of the environment in which it was created.

But are there equivalents in the digital realm to the intrinsic values we have defined for the physical artifact, namely, originality, faithfulness, fixity, and stability? This is a subject of intense discussion among librarians, archivists, computer scientists, and security experts. A detailed examination of these complex issues is beyond the scope of this report. A summary of the major issues surrounding the problem of authenticity is, however, necessary for understanding the possible hazards of using digital information.

Originality in its simplest sense—that is, of “oneness”—is not a meaningful notion in the digital realm. Archivists have many ways of defining “digital original,” all of which may suffice to meet core requirements for evidence. In a library context, however, the notion is more or less bankrupt (Bearman and Trant 1998, CLIR 2000, Lynch 1999). Faithfulness, by contrast, is not necessarily an empty notion. It is true that the bit streams that make up a file can be changed without evidence of the change, regardless of whether it was intentional or accidental. For example, someone who receives a forwarded plain-text file that has had three paragraphs deleted from it would not be able to see evidence of the deletion on a computer monitor. Internal evidence, however, might lead an attentive reader to believe that “something is missing,” and if the file comes with a digital signature—a public key or other mechanism for objectively verifying that the file has not been changed since the key was generated—then one does have some hope of determining faithfulness.23 Of course, few researchers perform forensic examinations on the physical artifacts they view. Users trust that a source found in a library, for example, is more reliable than one found posted to a billboard. So, too, many researchers place more trust in resources found on a library Web site than in information anonymously posted on the Web.
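
The idea of objectively verifying that a file has not changed can be illustrated with a checksum. The sketch below is a deliberate simplification and an assumption on our part: it uses a SHA-256 digest from Python's standard `hashlib` module rather than a true public-key signature, so it detects change only if the recorded digest itself comes from a trusted source. The function names and the sample text are hypothetical.

```python
import hashlib
import hmac


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 digest that changes if any byte of the file changes."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, recorded_digest: str) -> bool:
    """Compare the current digest with the one recorded when the file was deposited."""
    return hmac.compare_digest(fingerprint(data), recorded_digest)


# Hypothetical plain-text file and a forwarded copy with the middle paragraph deleted.
original = b"First paragraph.\n\nSecond paragraph.\n\nThird paragraph.\n"
forwarded = b"First paragraph.\n\nThird paragraph.\n"

recorded = fingerprint(original)  # digest stored at the time of deposit

if __name__ == "__main__":
    print("original intact:", is_unchanged(original, recorded))
    print("forwarded copy intact:", is_unchanged(forwarded, recorded))
```

The deletion leaves no trace on the screen, but the digests no longer match. A check of this kind shows only that something changed; it cannot say what was changed or why.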

The issues of stability and fixity, however, are quite troublesome for digital texts. For analog sources, stability and concrete form are absolute requirements for proving the authenticity and provenance of unique documents. The malleability of digital information, the ease of creating several different and subtly tailored versions of digital documents for different audiences, and the difficulties of maintaining digital sources intact and accessible over time are similar to the problems associated with archiving the international television broadcasts cited in the previous section. Which version of a digital document designed to evolve and grow over time is the one that should be archived? For interactive documents, what importance does the interaction have in defining the authentic document? It is important that scholarly societies work with publishers and librarians to specify what should constitute the archived version of an e-journal article, for example, and who should be responsible for creating that final version. Efforts are under way to bring together libraries and publishing houses to address the many questions that surround the archiving of scholarly journals (DLF 2001c). It is in the interest of members of the academic community to join this effort.

As to the intrinsic value of the digital artifact, as opposed to the value of the content that might easily be reformatted onto another medium, what are those features that drop out when content is reformatted? It could be the text formatting, as anyone who has imported a file created in one word-processing program into another program well knows. But in most cases, it is some sort of functionality that is lost. Relational databases can export their data in a form that permits another database engine to import that content, but the relationships laid down among items in the original database will be lost, because there is no standard way of encoding those relationships. Without those relationships, much of the functionality of the database will be lost as well. So, just as with analog sources, preservation of the digital artifact is successful to the degree that it maintains over time the chief distinctions of that object as digital—its functionality, its formatting, or whatever is important about the digital object for a particular use and a particular user.
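The database case can be made concrete with a small sketch. It assumes the export is a flat CSV dump, one common lowest-common-denominator form, and uses Python's built-in SQLite module. The table names, column names, and sample rows are hypothetical.

```python
import csv
import io
import sqlite3

# Original database: every article must point at an existing journal.
src = sqlite3.connect(":memory:")
src.execute("PRAGMA foreign_keys = ON")
src.execute("CREATE TABLE journal (id INTEGER PRIMARY KEY, title TEXT)")
src.execute(
    "CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, "
    "journal_id INTEGER REFERENCES journal(id))"
)
src.execute("INSERT INTO journal VALUES (1, 'Hypothetical Quarterly')")
src.execute("INSERT INTO article VALUES (1, 'On Fixity', 1)")


def export_csv(conn, table):
    """Dump one table's rows as CSV text: values only, no constraints."""
    buf = io.StringIO()
    csv.writer(buf).writerows(conn.execute(f"SELECT * FROM {table}"))
    return buf.getvalue()


article_csv = export_csv(src, "article")

# Target database rebuilt from the flat export alone.
dst = sqlite3.connect(":memory:")
dst.execute("PRAGMA foreign_keys = ON")
dst.execute("CREATE TABLE article (id, title, journal_id)")
dst.executemany(
    "INSERT INTO article VALUES (?, ?, ?)",
    csv.reader(io.StringIO(article_csv)),
)


def accepts_orphan(conn):
    """True if the database lets an article cite a nonexistent journal."""
    try:
        conn.execute("INSERT INTO article VALUES (99, 'Orphan', 42)")
        return True
    except sqlite3.IntegrityError:
        return False


print(accepts_orphan(src))  # False: the relationship is enforced
print(accepts_orphan(dst))  # True: the relationship did not survive export
```

The rows arrive intact in the second database, but the rule linking articles to journals does not, so the new engine cannot enforce or exploit it.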

3.4.5 Preservation of Digital Information

Computer scientists, librarians, and archivists have explored the subject of preserving digital information in depth for nearly a decade, and it continues to be a subject of active research. This section summarizes a topic that has been well covered by others.24

Like analog moving-image and recorded sound information, digital information is both technology- and medium-dependent. Digital technology appears to be evolving even more rapidly than analog recording media have evolved. Hardware and software have changed so quickly that information recorded with one software program is often at risk of obsolescence in only three or four years. The media on which digital information is recorded, whether magnetic tape or hard disk, are vulnerable to a startling array of new risk factors, including magnetic fluctuations, separation of information from the substrate caused by mild environmental disturbances, accidental overwrites, and “bit rot.”

Precisely because the technologies used to encode, display, and enact digital information are changing so rapidly, the digital artifact that goes untouched for 10 or 20 years may well be unrecoverable. Its storage medium might require hardware that no longer exists, the software used to create it might no longer be available, or the operating system under which that software ran might be obsolete. Digital files must be “refreshed” and magnetic tape “exercised” to ensure that the bits keep their integrity and the recording medium does not suffer undue degradation. This constitutes minimal-level preservation for digital files. There is no such thing as putting the text on eight-inch floppies, putting the floppies in storage, and being able to retrieve the text 50 years later. Digital preservation programs will have to be much more active, and even aggressive, than book preservation programs are.

In the past, with information stored in more or less durable physical objects, the task of preservation has been to stabilize and, if possible, enhance the integrity of items (books, films, sound recordings) that had been around for generations because they had been acquired by a library. Some judgment has been made about the intellectual or cultural value of the item, and the question for preservation is not whether, but how, to preserve the item. With digital information, whose life span can be as short as one software upgrade, the decision to preserve must be made almost simultaneously with its creation. This turns the traditional preservation paradigm on its head.

The two most commonly advocated methods of preservation, both of which involve concerted efforts by custodians, are migration and emulation. Migration is essentially a moving of digital files from old hardware and software platforms onto new ones. Migration is a translation, and some measure of loss is inherent in the process. What happens when one reformats word-processing files from an old version of a program to a new one, or from a Mac platform to a PC? The content of the documents is usually preserved, but the formatting is corrupted to one degree or another. Migration has occurred, and loss has been incurred. Yet most people who look back at the costs and benefits of the process would probably decide that the benefits of not having to rekey whole documents outweigh the tinkering necessary to restore the original formatting.

At the same time, there are categories of digital documents, most of them nontextual, that might suffer unacceptable losses during migration. These include executable files, such as time-lapse simulations and interactive programs. For these, the goal is to preserve not only the content but all aspects of the behavior of the original content and of the software used to present it. This is what emulation is designed to do—create software that can simulate the hardware and software environment in which a document was created. The technology of emulation is still in its early stages.

When considering artifacts that are born digital, the first and possibly the most difficult question is “What is the artifact?” What information or value inheres in the carrier medium? Is the equipment originally used for display part of the digital artifact? Does the software that presents and actualizes the data qualify as a constituent element of the artifact? Thinking again of the criteria for determining intrinsic value in an artifact—as a way of understanding what the features of an artifact might be—it is evident that there are a number of practical ways in which these questions might surface.

The physical form of a document or program might be the subject for study if the records provide meaningful documentation or significant examples of the form. For example, the layout of a form used for collecting data on the Web might reveal a good deal about otherwise-inexplicable aberrations, omissions, or misconstructions in the data collected.

There are also aesthetic or artistic qualities that may require preserving, most notably in digital art and literature (Lyman and Kahle 1998). Achieving that level of fidelity to the original, or original instantiation, would require running a program on originally specified system hardware and software. Early computer games are a popular species of programming wizardry that inspire heroic efforts at emulation, or recreation of original hardware and software platforms, for replay. One such emulator, written in Java, runs Kaypro (Z-80, CP/M-based) software on contemporary Intel/Windows-based systems. The challenges posed by preserving digital art are manifold, because the art is often designed, like a game, to provide a real-time experience for the viewer, and that experience may or may not be intended to be replicable. The Guggenheim Museum of Art has developed a program, the Variable Media Initiative, that “encourages digital artists to help establish preservation guidelines long before their equipment, code, and lives become history,” as Wired recently reported (Jana 2001). Through this program, artists can record not only the technical specifications of a given work of art but also the artist’s intentions or considerations for longevity.

A third form of digital preservation is the preservation of the original hardware and software for use at some time in the indefinite future. Similar to the saving of wax cylinders and the original playback equipment for listening, this is a strategy of last resort, one that is not scalable for the routine needs of moving massive amounts of information forward and making it readily accessible to users in the future.

These examples suffice to show that the relevant features of a digital artifact could include more than the fungible information contained in an electronic file. Although the Kaypro example may seem facetious, the many issues attendant to digital art and literature are not. The questions that the preservation of these digital objects brings forth are similar to those that collection development staff members ask about preserving all sorts of digital information that is dependent on specific software programs for its intrinsic value.

In many cases, the answer for libraries will be that they are, in fact, primarily concerned with collecting, preserving, and providing access to the fungible informational content of digital objects. In that case, the “preservation through handling” scheme is a likely winner; digital information that is frequently used by patrons stands a better chance of being migrated and refreshed, and therefore is more likely to continue to be available in future generations, than is little-used digital information. Indeed, migration may turn out to be a much more frequently recurring problem than is refreshing, because “today’s optical media most likely will far outlast the capability of systems to retrieve and interpret the data stored on them” (Conway 2000). Regrettably, it is easier and cheaper to refresh than to migrate. If libraries have reason to be hopeful in this regard, it lies in open, nonproprietary standards such as JPEG for images, MPEG for video, and SGML, XML, and XSL for textual data. There are still important data types for which no such standards exist (GIS data, for example). However, the trend over the last 20 years—accelerated significantly in the last decade by the advent of the World Wide Web—has been in the direction of support for nonproprietary standards, even in proprietary software (Rosenthal and Reich 2000).25

Another promising strategy under development is called LOCKSS (Lots of Copies Keep Stuff Safe). The basic principle of LOCKSS is preservation through proliferation. An article in The Economist described LOCKSS as follows:

. . . a network of PCs based at libraries around the world and designed to preserve access to scientific journals that are published on the web. The computers organise polls among themselves to find out whether files on their hard disks have been corrupted or altered, and replace them with intact copies if necessary. (“Here, There and Everywhere,” June 24, 2000, 92)
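The polling idea in that description can be illustrated with a toy majority vote. This is a simplified sketch and not the actual LOCKSS protocol: it assumes every site holds a copy of the same file, that the sites can compare SHA-256 digests, and that a strict majority of copies is intact. The site names and the function name are hypothetical.

```python
import hashlib
from collections import Counter
from typing import Dict


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of one stored copy."""
    return hashlib.sha256(data).hexdigest()


def poll_and_repair(copies: Dict[str, bytes]) -> Dict[str, bytes]:
    """Replace any copy that disagrees with the majority digest."""
    votes = Counter(digest(data) for data in copies.values())
    winner, count = votes.most_common(1)[0]
    if count <= len(copies) // 2:
        raise ValueError("no majority; cannot tell which copy is intact")
    # Take any copy that matches the winning digest as the repair source.
    intact = next(d for d in copies.values() if digest(d) == winner)
    return {site: intact for site in copies}


# Three hypothetical library caches; one has suffered a corrupted byte.
copies = {
    "library_a": b"article text",
    "library_b": b"article text",
    "library_c": b"article tex?",
}
repaired = poll_and_repair(copies)
print(repaired["library_c"])  # b'article text'
```

The sketch depends on redundancy: with only one copy there is nothing to vote against, and with no majority the damage can be detected but not repaired.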

Curiously, this strategy flies in the face of the new logic of preservation and access in the digital world: that libraries should not build redundant collections in a networked environment, but should develop cooperative collection development and preservation strategies. Duplication of effort is seen as not cost-effective. In the digital realm, cold storage does not work. Born-digital artifacts will not benefit from living undisturbed in dead storage for years. On the contrary, digital artifacts seem to require preservation through handling. Such active intervention would seem to obviate redundancy. Nevertheless, there is plenty of room for experiments in this new environment, and if the future proves to be anything like the past, then many of our initial approaches to ensuring longevity will be stood on their heads.

With print originals, the “preservation through proliferation” strategy of duplicative collections assumes that one copy is as good as another, that is, that the value of the artifact is fungible and that it is the information, not the artifact, one wants to preserve. One copy cannot serve the access needs of a large research community, and the only way for two people to have access to a book is for each to have his or her own copy. The copies cannot be shared in real time: If I give you mine, I don’t have one. With digital files, the opposite is true: If I give you the file for an e-book, you have it and I have it, too. But a million perfect copies will still stand mute if we no longer understand how to read them.

3.4.6 Copyright: A Barrier to Preservation?

Libraries and archives are allowed to use copying technologies for preservation purposes under an exemption in the copyright code. When the Digital Millennium Copyright Act first appeared, there was no exemption for digital materials. The copyright community feared that it would be impossible to control the distribution of copied files, and that even those created for educational and preservation purposes would be a threat to the intellectual property rights of their creators or publishers.

These concerns have been addressed in further legislation that allows preservation copying, and technologies to encode data are adding desired layers of protection from infringement for rights owners. The real threat to preservation in the digital realm is that copyright law is being finessed by licensing agreements. Libraries do not purchase electronic databases and do not have ownership of materials as they do of analog materials. Libraries do not preserve materials that they do not own. At the same time, however, publishers are not in the preservation business.

The library and publishing communities are making efforts to come to terms with the crisis that this situation may create (DLF 2001c). This is a matter that deeply affects scholars, as creators of intellectual property and as researchers and teachers wishing to gain access to the published record. It is one more area in which scholars must become familiar with what is at stake in negotiations between publishers and librarians and be sure that their interests are considered and that their responsibilities—as writers as well as teachers—are clear.


9 In 1998-1999, both ARL and Digital Library Federation member libraries spent about 10 percent of their resources budgets on electronic resources (Jewell 2001, 4).

10 Acid was sometimes used even in the production of rag paper. The first problems with acid paper appeared in the 1830s, when rag paper was bleached with acid-producing chemicals that weakened it.

11 Not all microfilm is created equal, and the preservation profession recognizes certain standards of film capture, film quality, and storage as being preservation-worthy. Unfortunately, a significant number of microfilms do not meet these standards, resulting in surrogates of poor quality.

12 A relatively new technology, known as paper splitting, strengthens weakened paper by splitting a piece of paper in half to separate the front from the back, inserting a stabilizing layer, and reapplying the two halves of the paper. This process, which was developed in Germany, is not widely deployed in the United States.

13 Johns Hopkins University is doing more preservation photocopying and less microfilming, and this trend is common among ARL libraries. (Testimony of James Neal to the task force, October 29, 1999.)

14 Testimony of Carol Mandel to the task force, October 29, 1999. A similar study of circulating collections at several Columbia University libraries in 1995 revealed significantly higher levels of embrittlement and other damage in the oldest collections. More funds were subsequently allocated for rebinding and protective enclosures. (Personal communication, Janet Gertz to Abby Smith, April 4, 2001.)

15 For a full discussion of the technical, legal, financial, and administrative complexities of ensuring the longevity and integrity of digital files, see Task Force on Archiving of Digital Information 1996. The problem of preserving digital information is treated briefly in Section 3.4 of this report.

16 This transformative effect is further borne out by JSTOR, another project that is building a dynamic database from the static pages of journals. For more information, see Section 4.4.

17 “Intrinsic value is the archival term that is applied to permanently valuable records that have qualities and characteristics that make the records in their original physical form the only archivally acceptable form for preservation. Although all records in their original physical form have qualities and characteristics that would not be preserved in copies, records with intrinsic value have them to such a significant degree that the originals must be saved” (NARA 1982, 1). See also Menne-Haritz and Brübach 1999.

18 The copy on deposit is likely to be in poor shape. Film companies routinely use exhibition copies, which are no longer fit for screening, to fulfill their legal deposit duty.

19 Characteristically for the 1990s, the restorers tried to recreate what Kubrick wanted to make, not what the studio had released in 1960. For example, a scene with sexual innuendo between characters played by Laurence Olivier and Tony Curtis, which had not been included in the original cut but was restored in the 1990 version of the film, became a talking point in the publicity surrounding the restoration.

20 Libraries with important image collections have only recently begun to hire photo conservators. This is another indication of the increased importance of visual resources in research, although many conservators work not on materials that are in demand in the reading room but on those that are being prepared for digital conversion.

21 For a comprehensive view of audio preservation and best practices for recording, see the Web site of the Photographic and Recording Media Committee of the Preservation and Reformatting Section of the American Library Association. Available at

22 Though inelegant, the word “analog” is preferred in this report to “nondigital,” because in the present context, the identity of these (analog) collections is not derivative of their distinction from digital formatting. “Analog collections” is also preferable to “legacy collections,” which has the misleading connotation that all future collections will come only in digital format.

23 See the application of digital authentication technology to electronic scholarly editions, carried out by the Australian Scholarly Editions Centre. Available at:

24 See, for example, Task Force on Archiving of Digital Information 1996, Rothenberg 1995, Rothenberg 1999, Bearman 1999, and Granger 2000.

25 That this general truth should not be taken for granted by the library community was recently and significantly demonstrated by the release of Microsoft’s Windows XP, with its Internet Explorer 6 (which does not have native support for Java) and its Windows Media Player (which does not have support for the MP3 music format).
