The Internet has transformed the way in which scholarship is produced and disseminated, most notably in the sciences. In the humanities and social sciences, digital technologies for scholarly research, analysis, communication, and teaching have been adopted more slowly, but there has been significant innovation even in these fields. There has been enough progress that libraries and special collecting institutions are concerned about how to acquire, preserve, and make accessible some of the digital content coming from historians, literary scholars, and other humanists, as well as the primary sources in digital format on which this scholarship is based.
Most scholars who are creating digital information are seeking ways to make the best use of the technologies at hand to further research, discovery, and the sharing of results. In the scientific community, where currency of information is at a premium, this has led to such innovations as the establishment of preprint archives in high-energy physics and mathematics that are managed by the data creators. Certain fields, such as genomics, are building massive databases that require the attention of information management specialists in an academic domain (bioinformatics, in this case). Scientific communities of knowledge develop and manage their own information nodes on the Web to speed communication in time and across space. They can thus create a community of scientists around the globe who have access to essentially the same information with few of the sociological or physical barriers that previously existed. Such sites as the preprint archive for high-energy physics (arXiv.org) are not intended to stand in for or replace peer-reviewed publication, nor are they intended to be “archival” in the sense that the fields creating them view them as “of record” and necessarily persistent. The sites are something new: technology allows an old need, timely communication, to be met in innovative ways, yielding a new model of scholarly output.
The humanities, which work with much less money and on a smaller scale than “big science,” have not seen similar growth in building common information resources and managing them collectively. Innovators in the use of digital technology in the humanities often work singly or in small teams; creators of information must address major data management problems on their own or turn, belatedly, to their campus library to help them preserve and manage their handcrafted sites and databases. This is a cause of increasing concern to digital librarians, who confront in this growing phenomenon a series of preservation challenges different from those they have seen until now with electronic journals. Although scholarly electronic journals present many intellectual property issues besides format and version challenges, they present few problems in selection or assessment for enduring value. E-journals are in the mainstream of genres that form the core of academic collection development. Their value is known, and they do not come as “one-offs” in nonstandard formats.
What will happen to these new models of scholarship? Although it is impossible to predict how such scholarship will develop, it is not too early to focus on the long-term value and longevity of these new models. How will they be assessed for enduring value, and how long will they survive?
These are basic questions that preservationists in all media face daily. Preservation is that series of actions that individuals and institutions take to ensure that a given resource will be accessible for use at some unknown time. To preserve effectively, one must be able to anticipate what those future uses might be and then develop policies and procedures to safeguard the information.
New-model scholarship poses novel challenges to preservationists: How do we know what the value of these digital objects is and may be decades hence? How do we anticipate and address the technical needs of fragile digital objects over time? Who is responsible for preservation, and how is it financed? Librarians and archivists have been working on these issues for more than a decade, but they will not be able to answer these questions alone.
Preservation has been a back-room library operation for many years. Patrons are seldom aware of this work unless a resource is lost or too damaged to provide useful information. In the analog realm, most scholars understand that preservation goes well beyond “passive restraints” on aging and damage, but they rarely give the matter much attention. The burden of collecting and preserving materials for future access does not normally impinge directly on the scholar or creator, especially for published materials. It is the responsibility of publishers and libraries. As for unpublished materials, such as the manuscripts, maps, photographs, and audio tapes found in archives and special collections in research libraries, few scholars are responsible for building primary source collections from scratch or curating them, let alone taking measures to help ensure their long-term survival. Most researchers rely on archivists and librarians to collect, preserve, and make accessible the key resources on which discovery and scholarship are based. A major exception to this rule is found in such fields as ethnomusicology and anthropology, which are based on observational data.
In the digital realm, by contrast, the critical dependency of preservation on good stewardship begins with the act of creation, and the creator has a decisive role in the longevity of the digital object. This is a new role for most scholars, one for which their professional training has seldom prepared them. The role is fraught with unhappy implications for the use of the scholars’ time, daunting demands to acquire new skills, and the uncertainties that come with dependencies on hardware and software, and also on librarians and archivists, to cope with the complicated tasks of data management.
For the digital scholar intent on creating information resources that are long-lived and can be accessed easily, the task is not only to invent tools that foster productive use of the Web as a medium of scholarship and teaching but also to create material in preservable form.
In the past decade, digital librarians and archivists have worked hard to define the parameters of “material in preservable form.” They have tried to specify which formats and encoding schemes will hold up the best through one or more cycles of data migration. Because of their often-prescriptive nature, these efforts have met with mixed success in the academic community.
New-Model Scholarship: Three Examples
Developing strategies for preserving digital resources of high research value should begin with a look at the aspirations of digital scholars. An understanding of what these scholars are trying to achieve and an examination of the challenges they face should inform approaches to ensure the longevity of digital objects. The examples that follow describe three scholarly projects that share crucial dependencies on Web-based technologies and illustrate many problems creators face today. The examples cover a wide range of fields in the humanities (the history of current science and technology, history, and literary studies), yet they have several common aspirations, approaches, and challenges.
History of Recent Science and Technology, the Dibner Institute (hrst.mit.edu)1
The History of Recent Science and Technology (HRST), funded by the Alfred P. Sloan Foundation and located at the Dibner Institute at the Massachusetts Institute of Technology (MIT), is attempting to document specific fields in contemporary science and engineering. Scholars at the Dibner Institute believe that the history of science must keep pace with science itself, and that it is not well served by the practice of one scholar focusing on one scientist and writing a biographical study. Science comprises massive projects that involve large collaborations of scientists and technicians who work in highly specialized fields of inquiry. HRST is developing a historical methodology that it believes is well suited to documenting this phenomenon as well as the results of the scientific research itself. They have created groups of historians who collaborate on documenting these large projects through Web-based networks. They are recruiting support from technologists, librarians and archivists, and the historical actors: the scientists and engineers themselves.
One of five projects on the site, the Physics of Scales has three historians who make up the core team of principal investigators. They are recruiting about two dozen scientists who have worked in the physics of scales. The core team will interview the scientists, put their working papers and documents online, and collaborate with them to annotate the documents. They will then ask a few of these scientists to moderate forums that will yield yet more primary source documentation. These scientists, in other words, will become “communication nodes” who will recruit other collaborators and students to contribute their own documentation to the site. By the end of the project, HRST hopes to have online interviews with 60 to 80 scientists that document aspects of the field they pioneered. In effect, the core team is trying to develop networks that then create their own networks, each in turn creating a documentary trail. Given the inherent difficulties in motivating the subjects of a historical study (in this case, scientists and engineers) to take hours of their time to document their past and present activities, the Physics of Scales core team has tried to develop a process that minimizes the technical and procedural barriers to participation. This has not been easy.
A major technical problem has been the need for a trade-off between standardizing formats, which is good for digital preservation, and allowing data entry to be customized, which the historians creating the site perceive as essential to their work. Although the historians have encouraged the scientists to use certain XML standards to enable important digital library features, their efforts have not always been successful. The historians realize they must make it as simple as possible for scientists to contribute to the body of knowledge. Therefore, they decided not to impose any required standards. The historians also gave top priority to creating tools that can be easily customized by the core team. They did this to put the historians who serve as site moderators firmly in charge of structuring the online interactions among the scientist participants. They hoped thereby to ensure that the contents would be historically significant and worthy of preserving.2
Early in the project, the HRST historians decided to use an open source toolkit that allows them to annotate documents, create extensible time lines, produce highly detailed bibliographies, conduct interviews, and enable scores of scientists and engineers to discuss their fields. They favored these tools because using them does not require much technical knowledge. This approach contrasts with that of the Perseus project (www.perseus.tufts.edu), where the scholars are the programmers: when they need a tool, they know how to write it. The Physics of Scales project hired a programmer and a graphic designer. HRST also had to decide between ease of use (text only) and the physical attractiveness of the site (buttons rather than text links). They chose ease of use.
The project is now in its third year. The software that has enabled data gathering is finished, but the digital library software that will allow users to extract the richness of information is not. That digital library work has taken a backseat to creating the scholarly documents. Preservation planning has also been put off, and the core team is only now beginning to discuss how it will secure a library’s commitment to acquire and sustain the output of the project. Although not ideal, this process is typical and in some ways hard to avoid. The creators were fully occupied with the short-term goal of creating something of value from a historian’s point of view. They had little time to plan simultaneously for preserving something that was being created in an iterative fashion.
The Physics of Scales project exemplifies many key features of new-model scholarly enterprises. It is
- experimental: designed to develop and model a methodology for generating recorded information about a scientific enterprise that might otherwise go undocumented.
- open-ended: generates digital objects that are intended to be added to over time.
- interactive: gathers content through dynamic interactions among the participants. The historians stipulate that the interactions, as well as the content, are part of what is to be preserved.
- software-intensive: stipulates that the tools for using the data are as valuable a part of the project as is the content, and thus worthy of equal attention by preservationists.
- multimedia: creates information in a variety of genres (texts, time lines, images) and of file formats.
- unpublished: designed to be used and disseminated through the Web, yet not destined to be published formally or submitted for peer review.
Center for History and New Media, George Mason University (chnm.gmu.edu)3
Some members of the history faculty at George Mason University (GMU) have created the Center for History and New Media to explore new ways of creating historical documents. Like the historians in the HRST, these scholars do not wish to become experts in technology. They see technology as a tool that can enable them to expand the historical resource base and its functionality. The promise of this technology is, as one historian explains, to open the writing of history to a host of new voices and new stories, to create a more democratic and inclusive view of the past, to offer modes of learning about the past that spur student participation and engagement, and to engender innovative scholarship that challenges traditional ways of “doing history.”
The site creators have placed a high value on experimentation and sharing knowledge, and have consciously tolerated “make-do” use of the technology. The formats they have created so far include text, image, audio, video, e-mail, database, hypertext, and interactive programs. Although proud of its accomplishments, the Center acknowledges having slighted standards and preservation as it focused on the short term more than on the long term. This approach is typical of that of other startup enterprises. Moreover, it is unlikely that they would have been able to accomplish as much as they have if they had focused on the long term. Now, however, they are at the point where they must begin to extend their focus into the future.
The twin demands of scholarship and preservation create tension. The site creators were able to avoid this tension in their early projects by declaring their activities “experimental.” For example, in 1999 GMU undertook a project with The American Quarterly, the flagship journal in American studies, to present hypertext scholarship. The goal was to move beyond hypertext theory and create examples. The project team chose not to work with Project Muse at Johns Hopkins University (JHU), which publishes the online version of The American Quarterly, so that it could obviate the problems that might arise when creating a product that departed from JHU’s standardized format. Creating an experimental, nonstandard project was one of GMU’s objectives.
For other projects, the team has thought more about the longevity of its digital objects, because their goal is to create enduring historical resources. Their current projects are not designed to be experimental, although they do suffer from tensions between the need to achieve some short-term goals and to create with longevity in mind. For example, the 9/11 Project, designed to gather testimonies from people around the country about their experiences on September 11, 2001, was inaugurated under great time pressure. The team had only one month to launch the site, and it was impossible to foresee and act on the variety of preservation issues that might arise. The historians already see problems that may in the end compromise the value of the sources that they have created. These potential problems originate from the need to act expediently: they were not able to create the deep metadata that will provide future evidence of the records’ authenticity.
Like the historians of the Physics of Scales project, the GMU team finds it necessary to lower technical barriers and reduce time commitments of the volunteer participants to ensure a critical mass of contributions. While collecting stories to document the effects of 9/11 on individuals, for example, the GMU team decided to demand only “barebones data” from and about the contributors. They felt that if they were to require more, then contributions would drop off significantly. The team had assessed the National Endowment for the Humanities (NEH) project, “My History Is America’s History,” which collected more metadata than GMU’s 9/11 project did and thus promised a possibly richer site in the end. But the team concluded that, because it takes a significant effort to fill out the NEH forms, NEH was able to collect only as many stories in two and a half years as GMU collected in two and a half months, despite the publicity afforded the NEH project by a cover story in Parade magazine.
GMU’s 9/11 site continues to grow; about 15,000 stories are available. Discussions are under way with a national institution to preserve the 9/11 site. Faculty members are also discussing with GMU ways to ensure persistence for other Center projects. Interestingly, the NEH site, which chose depth of contribution at the expense of breadth of coverage, has been discontinued and is no longer accessible on the Web.
Like the HRST sites, GMU sites are experimental, open-ended, and interactive. They represent a complex mix of formats and genres. They are clearly created for wide dissemination, even though they fall outside the well-known publishing norms. The libraries with which the Center is negotiating for long-term deposit understand the value of a site such as 9/11, in that it documents a major event in the history of the United States. It is also valuable as evidence of a new and rapidly evolving information technology. Such sites are, in effect, “digital incunables,” and may be as prized over time as fifteenth-century imprints are today. Historians at the Center advocate for a digital preservation strategy that, like GMU’s projects, favors action over deliberation and has built-in assessments and course corrections that are familiar in computer science, where an iterative process of “learn as you go” can result in significant advances.
Institute for Advanced Technology in the Humanities, University of Virginia (www.iath.virginia.edu)4
A third set of examples of new-model scholarship comes from the Institute for Advanced Technology in the Humanities (IATH) at the University of Virginia (UVA), which has supported humanities projects that use technology as a research tool for close to a decade. IATH has several projects that are built of complex, heterogeneous file format types. They are still growing and showcase the challenges of incorporating successive generations of content, contributors, and software.
The William Blake Archive, for example, which brings together into one virtual space a variety of source materials by and about Blake from many institutions, is among the oldest of the IATH sites. Over the years, differences of opinion have arisen about what good digital library standards allow creators to do and what creators want to do. For example, staff members of the Blake Archive are intensely concerned with reproducing the quality of physical artifacts, and the site is designed to allow a user to examine closely a surrogate image of a plate on which William Blake engraved illustrated poems. IATH established an approach to structuring the data that appropriately privileges the physical structure of a volume over its logical structure. When the scholars later decided the table of contents ought to list the poems, not merely the plates, IATH had to create ways to cut across the privilege hierarchy.
Monuments and Dust, a site devoted to the culture of Victorian London, renders its content dynamically, presenting a challenge to creator and preserver alike. It includes a three-dimensional model of the Crystal Palace showing every nut, bolt, wire, and pane of glass in the original. Created in an architectural modeling program (Form.Z), rendered in a lighting simulation program (Radiance), animated, and delivered in QuickTime, the site is very difficult to standardize into a digital library format because few standards of the XML/SGML type exist for such models. The demand for these types of complex and dynamic format types is common among architects and landscape architects, for example, with whom IATH does much of its work. Until the digital preservation community develops and promotes preservation standards in areas such as these, Monuments and Dust and similar projects are fated to be ephemeral.
The Rossetti Archive (properly titled The Complete Writings and Pictures of Dante Gabriel Rossetti: A Hypermedia Research Archive) is another early project of IATH. It originated as text (SGML, then XML) and images. Much of the scholarly work is found in the illustration. The archive now contains about 10,000 files and 45,000 cross-references in various languages, and it continues to grow. The references include songs and digitized films, among other complex formats. The second of four planned installments of new materials was mounted in the summer of 2002. It defines itself as a hypertextual instrument designed to facilitate scholarly research, not as content per se. The content, both reformatted and wholly original, is one part of the larger whole. Therefore, ensuring future access to the Rossetti Archive does not mean just securing the preservation of the content.
What Do These Examples Tell Us?
The three projects just described illustrate several challenges to preservation that are typical of work in the humanities. First, the digital objects created are often complex, composed of heterogeneous types, open ended, and resistant to closure and to normalization. Moreover, the functionalities that scholars prize may often be at odds with emerging best practices for preservation as well as with one another. Of the three endeavors, IATH is the oldest and best positioned to address some of these challenges. Indeed, because of IATH’s commitment to pioneering technology in the humanities, it has partnered with the University of Virginia Libraries’ Digital Library Research and Development Group in a project to address three related problems:
- scholarly use of digital primary resources
- library adoption of born-digital scholarly research
- co-creation of digital resources by scholars, publishers, and libraries
The project, titled “Supporting Digital Scholarship” and funded by The Andrew W. Mellon Foundation, is designed to identify problems associated with preserving materials that originate outside libraries. Much of the material created in this fashion may be sponsored by a research institute and be designed for publication, but if it is to endure, it will need to be supported by a repository that will ensure its integrity, authenticity, and accessibility into the future. Traditionally, that repository has been a library. Will it be a library in the future, or are other possibilities evolving?
IATH is trying to articulate new roles for all the institutions and individuals that have traditionally played well-defined roles in the production, dissemination, and long-term care of scholarly resources. They are especially concerned that neither libraries nor publishers are able to deal with the digital objects coming from the sorts of collaborations they foster. Understanding the distinction that some digital librarians make between the “presentation form” and the “archival form” of a digital object, and the work entailed in transforming the former into the latter, some at IATH suggest that the publisher’s role is to take material originating in digital form and produce it in a format that libraries can collect for long-term retention as well as contemporary access (i.e., with standardized metadata and sustainable formats). They point out that in the print regime, the publisher is responsible for editing, printing, binding, and delivering a text to a library and argue that in the digital realm, these activities should take place at an equally high institutional level. Responsibility for these activities should not be based on a negotiation between the digital author and the librarian, as is often the case today.
Normalizing a digital object for preservation is one area under negotiation between author and archive. Librarians would like to see creators adhere to standard, preservation-friendly formats. Authors do not like to be inhibited by such parameters, and they also do not like to create the amount of metadata usually required for preservation. Ironically, in the IATH projects, insufficient documentation of the digital object is seldom the problem in normalizing it for preservation; extensive documentation exists for all of IATH’s projects. Rather, sites produced by scholars tend to overload some file names with too much information, resulting in ambiguities that cause problems even for the originator. Publishers have knowledge that can be applied to developing and maintaining digital materials and to recovering ongoing production costs. Even a well-tagged, well-described object may lose its identity over time and become irretrievable unless a “persistent identifier” has been assigned to it. In the view of some at IATH, publishers are ideally positioned to mitigate this problem as well.
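The contrast between an overloaded file name and a persistent identifier can be made concrete with a minimal sketch. It assumes a toy in-memory registry rather than any system IATH or a publisher actually uses; the class name `IdentifierRegistry`, the example paths, and the metadata fields are all hypothetical. Production services (Handle, DOI, ARK, and PURL resolvers, for instance) add governance and redundancy that this sketch omits. The idea shown is only that the identifier stays fixed while the location changes, and that descriptive information lives in a record beside the object instead of being packed into its name.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class IdentifierRegistry:
    """Toy in-memory registry: a hypothetical stand-in for a resolver service."""

    # Opaque identifier -> current storage location.
    locations: dict[str, str] = field(default_factory=dict)
    # Opaque identifier -> descriptive record kept apart from the file name.
    descriptions: dict[str, dict[str, str]] = field(default_factory=dict)

    def register(self, location: str, **description: str) -> str:
        """Assign a new opaque identifier to an object and record its description."""
        identifier = f"urn:uuid:{uuid.uuid4()}"
        self.locations[identifier] = location
        self.descriptions[identifier] = dict(description)
        return identifier

    def relocate(self, identifier: str, new_location: str) -> None:
        """Update where the object lives; the identifier itself never changes."""
        if identifier not in self.locations:
            raise KeyError(identifier)
        self.locations[identifier] = new_location

    def resolve(self, identifier: str) -> str:
        """Return the current location for an identifier."""
        return self.locations[identifier]


if __name__ == "__main__":
    registry = IdentifierRegistry()
    # Hypothetical file whose name carries copy, version, and status all at once.
    pid = registry.register(
        "/site/2002/plate_copyB_v3_final.xml",
        title="Example plate transcription",
        version="3",
    )
    # The object is later moved into a repository with a different layout.
    registry.relocate(pid, "/repository/objects/0001/transcription.xml")
    print(pid, "->", registry.resolve(pid))
```

Citations and cross-references that point at the identifier survive the move, whereas references to the original path would break.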
Who Should Be Responsible for Safekeeping?
The suggestion that publishers assume certain critical functions of “digital librarianship” raises its own concerns. For some, the idea that digital preservation, or at least some of its key functions, would become the responsibility of commercial or nonprofit entities that come and go in the marketplace is unacceptable. Preservation, they argue, should be the responsibility of institutions that are buffered from the vicissitudes of business cycles. But do such institutions exist? Libraries are not entirely unaffected by upswings and downturns in the economy. They are not currently prepared to recover the added expenses of the preservation services that digital media demand. Many analog collections are “preserved” in libraries and archives through simple accessioning and storing, and no other investments are made to prolong the useful life of the resource. Libraries are underfunded for the tasks of analog preservation. They cannot assume sole responsibility for the added burdens of digital preservation.
The debate will continue about whether publishers will be ready, willing, and able to provide the types of preservation services just mentioned, but in truth, the discussion may not be relevant to the new-model scholarly resources that HRST and GMU are creating. These sites, after all, are not destined for publication as their primary form of dissemination. HRST sites are, at present, designed for archival deposit, that is, to serve as primary source materials for the secondary literature that publishers see. Repurposing may be desirable at some point, either for publishing or for use in teaching, but the HRST data creators have not provided for that. Their sites are collections of archival materials that, in the analog realm, would go to a special collections library without being published.
How Do We Decide What to Preserve?
For many scholarly sites, such as those mounted by the Center for History and New Media, there is no clear-cut audience to help acquisitions librarians determine the suitability of the sites for library users. The sites at GMU have a variety of uses: some are meant for teaching purposes, some as primary source material, some for the general public; still others have mixed target audiences. Part of the reason for getting as much material online as quickly as possible has been to see who is attracted to the sites and how users interact with them. There is always a greater use of a “scholarly” site by the public than is anticipated. Although the nature of that use can be difficult to assess, it is nonetheless something that historians such as those at GMU want to consider and, if possible, encourage.
Another challenge for acquisitions lies in the fact that these sites collectively do not pass through the quality filters provided by publication. This raises the types of technical issues for digital material seen in the IATH projects described above. Perhaps most significant, lack of peer review or the vetting that customarily underlies decisions about publishing makes selection more difficult for librarians. Also, publication information (who published, when, and in what number) has always been critical in helping librarians evaluate items under consideration for acquisition.
Another feature of new-model scholarship presenting novel choices for librarians is that these sites do not always contain new content and that the content of the site itself is not always the chief offering to the scholar and teacher. The Physics of Scales site has created a resource that gathers important and unique information, but does so in conjunction with specific functionalities that are crucial to using the data. Those functionalities are similar to the instrumentation found in laboratories, instrumentation that constitutes assets just as important as the specimens or data that the instruments analyze. Historians of the HRST would not agree that printing out the documentation they created is a suitable preservation strategy, although they agree that it is better than no strategy at all. Sites at GMU and IATH, on the other hand, are what might be termed “thematic research collections.” They are designed to support research and are structured, as Carole Palmer points out, to be open ended, flexible, and dynamic (Palmer 2003). These sites mix genres, formats, and secondary and primary sources, and all exist within a specific platform designed for querying and retrieval that is difficult to archive. These are not information resources that libraries have traditionally collected.
Therefore, it is possible to claim that an important feature of new-model scholarship is a blurring between “collections” and “services” and between research “information” and research “tools.” An analog information resource such as a book represents a highly sophisticated technology for information transmission that does not depend on an array of peripheral technologies for use. The tools for mining information from a monograph, for example, include things embedded in the physical object, such as page numbers, indexes, tables of contents, typeface, spacing, and other formatting conventions. Save the book and you save the tools for search and retrieval.
In the digital realm, those search and retrieval tools are behaviors that are embedded in the software but are not, strictly speaking, the data themselves that are recorded in the digital object. Nonetheless, the tools or instruments needed to use the data must be conveyed with the digital objects. It is not beyond the reach of libraries to extend collection development paradigms to include the research tools and software interfaces along with the information; however, some preferred preservation strategies, such as migration, are fairly good at preserving the integrity of data but not the functionality of a digital object. Digital librarians need to know whether and when an information object includes the tools and behaviors, as well as the data, and policies must be developed that support those choices. Making custodial provisions for these types of digital resources also includes ensuring that subject specialists and selectors are trained to understand those tools.
How Do We Sustain These Resources?
Another issue, sustainability, must be added to considerations of quality and appropriateness for acquisition. Librarians must not only assess the digital object’s value for their institution but also determine whether the institution can sustain access to the object over time. Access involves many factors besides good storage and preservation: it includes creating metadata for preservation, search, and retrieval; maintaining hardware and software that can read the digital file; and providing reference help, among other things. Given the multiple (sometimes unquantifiable) costs of acquiring and maintaining Web-based scholarship, negotiations over acquisition between the library and the scholar often revolve around the perceived value of the object and soft projections about investments required to enable future use.
In many cases, the best both parties can do is to ensure that the digital file gets deposited in a repository, even if the repository can guarantee only that the bit stream will remain intact (physical preservation), as opposed to guaranteeing the logical rendering of the bits into an original digital object format over time (logical preservation). Because bit storage is possible and often not too expensive, this solution for physical preservation has much to recommend it, even if logical preservation of the digital object, needed for recalling the bits from storage and (re)creating the object, is an uncertainty. Inadequate as it may seem to some, it does trump total inaction and is in the spirit of experimenting, assessing, and learning as one goes.
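The distinction between physical and logical preservation can be made concrete with a small sketch of a fixity check, one common way a repository might confirm that a bit stream is still intact. This is an illustration under stated assumptions, not a description of any particular repository's practice: the function names (`sha256_of`, `record_fixity`, `verify_fixity`) and the choice of SHA-256 are hypothetical. Note what the check does not do: a matching checksum says nothing about whether software still exists that can render the file.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_fixity(paths):
    """Build a manifest mapping each deposited file to its checksum.

    Hypothetical helper: run once, at the time of deposit.
    """
    return {str(p): sha256_of(p) for p in paths}


def verify_fixity(manifest):
    """Return the files that are missing or whose bits have changed.

    An empty list means the bit stream is intact (physical
    preservation). It does not show that the object can still be
    rendered (logical preservation).
    """
    failed = []
    for name, expected in manifest.items():
        path = Path(name)
        if not path.is_file() or sha256_of(path) != expected:
            failed.append(name)
    return failed
```

A repository might run `record_fixity` when the file is deposited and `verify_fixity` on a schedule afterward. Any file the second function reports would need to be restored from another copy.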
The lack of clarity about the intended audience for or use of the digital objects creates problems for preservation as well as for selection. The strategy one chooses for preserving a digital object is usually calibrated to enable some future use; for example, cataloging (or metadata) choices always try to provide for scenarios in which a user will attempt to discover that object. Specifying the uses in the beginning goes a long way to ensuring the authenticity and reliability of the object; it is also of great use to an end user. As an ironic example of such forethought, the William Blake Archive has an agreement with the Charles Babbage Institute’s Center for the History of Information Technology to preserve what IATH sees as the archival value of the site, that is, the textual record of how the site has been created, developed, and maintained. In this instance, the Blake Archive is seen to be significant as an early example of the use of digital technology for humanistic scholarship, fitting in well with the Babbage’s collecting scope. Therefore, provision has been made to preserve that particular value of the archive, not the content of the archive itself. (This does not preclude preserving that content under different auspices.)
Better approaches to ensuring the accessibility of complex digital behaviors in the future entail engaging the creator in stipulating the intended use. Some digital artists, creating works that are essentially interactive, performative, or based on dynamic objects, are helping to ensure the authenticity of persistent objects by declaring what elements of the object are needed to re-create the experience of the art. These elements range from hardware specification (for example, a certain size and resolution of screen, or certain processing speeds but not others) to specific features of the software that must be replicable. (These declarations are often printed out on acid-free paper for archival purposes, an irony seldom lost on the artists.)
With complex digital objects, there is a disjunction between what scholars wish to create and hand off to a third party for preservation and what a given third party is willing to commit to preserving. There are ways to bridge that gap. For example, creators can fully document what their work is and which elements are most important to preserve, or they can adhere to formats and markups that preservationists are able to manage. Developing good practice for creation, as well as preservation, is iterative and needs to be informed by research, testing, and analysis. It is important for all to take a long view of their work and to keep moving forward, one step at a time.
1 The author thanks Babak Ashrafi of the Dibner Institute for providing information and advice about this project.
2 From the archival viewpoint, the documents created in the course of this project are not records, that is, texts and other forms of recorded information that are generated in the course of doing business. This might raise some concerns about the value of these sources over time, concerns well understood by, for example, scholars who gather documentary evidence from firsthand accounts. It also means that, because the self-documenting by scientists does not occur in the course of normal business, the documents created will be less comprehensive than records.
3 The author thanks Roy Rosenzweig of George Mason University for information and advice about the Center for History and New Media projects.
4 The author thanks John Unsworth, of the University of Virginia’s Institute for Advanced Technologies in the Humanities, for information and advice on IATH projects.