Managing Digital Assets: An Overview of Strategic Issues
by
Donald J. Waters
Program Officer, Scholarly Communications
The Andrew W. Mellon Foundation
February 4, 2005
THIS IS A WORK IN PROGRESS
Introduction
Good evening, ladies and gentlemen. Welcome to Charleston and to this workshop. Thank you, David, for the kind introduction, and thank you to the organizers of this meeting for assigning me the task of providing an overview of issues related to digital asset management.
The speed with which this workshop
became fully subscribed after its announcement is a measure of the importance to
you of the topics that will occupy us over the next two days — and perhaps of the
difficult challenges that those topics represent. It may also be a measure of the frustration that we all feel in trying to manage digital
assets prudently and well at a time of change when all the parts seem to be
moving at once — and we are making up many of the parts as we go along.
Despite the rapid pace of change, there
is one point of constancy and common ground that I should emphasize at the
outset of this workshop. That
point is our common mission in the service of the academy. Let's be crystal clear that the digital
assets of which we speak tonight and in the next few days are resources for
research and teaching, and that our purpose in managing them is to advance
knowledge and improve education. The
touchstone question for all the topics that we will cover this weekend must
be: how will this system or that
feature advance scholarship?
Inquiring minds often want to know
the criteria or other features of the Mellon Foundation's digitization initiatives. My reply is always that Mellon has no
digitization initiatives. We do,
however, have a variety of programs to help advance scholarship that may
involve digitization and other uses of digital technologies. So with respect to the management of digital
assets, we must remember — to paraphrase the slogan from the 1992 Clinton
campaign — that "It's the scholarship, stupid!" That should be our talisman this weekend. Write it down on the back of your hand.
Raise that hand if the discussions
become at any point too abstract or start veering off into technical or other
kinds of esoterica.
But do also remember that these are
complicated issues that require creativity and imagination. We must teach each other and share what
we are learning in an open cooperative spirit. Our biggest digital assets are each other — and your biggest
digital asset this weekend may be sitting next to or across from you. In saying this, however, I do not mean
to underestimate the challenges of communication in managing the human factors
of scholarly digital assets. We
all depend, for example, on the genius of our programmers, but some of you are
surely aware of what distinguishes an introverted programmer from an extroverted
programmer. An introverted
programmer stares at her shoes when she is talking to you; an extroverted
programmer stares at YOUR shoes.
Talking in code is also endemic in
this field and is another major obstacle to clarity and understanding. Some of you may have heard the story
about the student of computer science on his first day here at the College of
Charleston. He had entered a
strange, bizarre world. The only
sounds were the occasional calling out of a number by one of the professors,
followed by laughter. One
professor would say "52," there would be a short pause then peals of laughter. Someone else would say "713," same
thing, everyone would fall down laughing.
"What's going on here?" the new
student asked his advisor. "We're
telling jokes," the advisor replied. "You see, we've all worked here so long we know each other's
jokes. There are a thousand of them.
So, being information theorists we
applied data compression. We just assigned them all numbers, 0 through 999. It saves a lot of time and effort. Would
you like to try? Just say any
number from 0 to 999."
The student wasn't fully convinced,
but he tried. Very quietly he
whispered, "477." There was hardly
a murmur. He looked at his advisor.
"What's wrong?" he said. "Try
again," said the advisor. So he
said, louder: "318." Again, nothing;
no response. "Something's wrong," the
new student said. "Well," said the
advisor, "it's like this: it's not so much the joke as the way you tell it!"
There is a curious sequel to this
story. The student eventually
succeeded by accident in the most dramatic and unexpected way. He called out a number outside the range
0 to 999. "Minus 105," he said. At first there was stunned amazement,
then one professor began laughing hysterically, then another, then another,
till they were all rolling about holding their sides. None of them had heard that one before.
"It's the scholarship, stupid" and
clear and open channels of communication are two of the preliminary points that
I want to make here this evening.
The third is that the effort to ask the right questions is just as
important as — and perhaps more important than — getting the right
answers. I was intrigued last week
by the report on MSNBC that an asteroid had been officially named for Douglas
Adams, the science fiction humorist and author of the Hitchhiker's Guide to
the Galaxy. The DouglasAdams asteroid is also known as 2001-DA-42, a striking
combination of Adams's initials, the year of his death, and the famous answer
to the Ultimate Question.
Readers of the Hitchhiker's
Guide will remember that in the story:
A hyper-intelligent race of beings take the form of mice and construct Deep
Thought, which was the second greatest computer of all time and space, and
built to calculate the answer to the Ultimate Question. After seven and a half million years of
pondering the question, Deep Thought provides the answer: 'forty-two.'
"Forty-two!"
yelled [one of the researchers]. "Is that all you've got to show for seven and
a half million years' work?" "I
checked it very thoroughly," replied the computer, "and that quite definitely
is the answer. I think the problem,
to be quite honest with you, is that you've never actually known what the
question is."
Deep Thought then informed
the researchers that it would design a second and greater computer,
incorporating living beings as part of its computational matrix, to tell them
what the Ultimate Question is. That
computer was called Earth and was so big that it was often mistaken for a
planet. Just as Earth was about to
produce the Ultimate Question, it was destroyed, supposedly to build a
hyperspace bypass. Later in the Hitchhiker
series, it is revealed that it was actually a consortium of philosophers and psychiatrists
that arranged to have Earth destroyed because they feared for the loss of their
jobs if the meaning of life became common knowledge.[1]
If you signed up to this workshop looking for answers, there you have it. The answer to the Ultimate Question is "42." But one of the morals of the Hitchhiker story is that the certainty of answers like "42" is all but irrelevant if it is cast in the absence of a framework of meaningful, substantial questions. I take it as my job for the remaining time I have this evening to help frame this workshop with some of the strategic questions that need to be addressed in creating effectively managed scholarly resources in your institutions. The very distinguished series of speakers that the organizers have assembled will work with you over the next two days, and together you will discuss these issues in much more detail, elaborate them, suggest different emphases, and no doubt disagree with points that I will ask you to consider.
My plan is as follows: First, I want to review and critique the current debate about scholarly publishing. Then I want to recognize the big elephant lurking about in our midst — Google — and highlight how the priority of asset management questions might shift in light of recently-announced activities by Google and other commercial search engines. I will conclude by suggesting a few principles for determining when digital assets need to be managed by someone else, when collective action is needed, and when local institutions must act.
Scholarly publishing
One of the largest, most visible,
and fastest growing investments in digital assets by colleges and universities
over the last decade has been the purchase of licenses to electronic journals
and databases. The key management
task for this class of assets has been to deal with steadily increasing
prices. Even before the
introduction of electronic versions, journal pricing had been on a rising
trajectory, especially in fields of science, technology, and medicine. As publishers have moved steadily over
the last decade to add electronic versions, the pricing crisis for academic
institutions has not improved but steadily worsened as provosts and librarians,
especially during the booming nineties, largely agreed to finance enormous
annual price increases. A provost
once justified such payments to me by saying: "It is my job to feed the hearts and minds of the faculty
and staff. For healthy hearts, I
must provide extraordinary annual increases in budget for health care; for
their minds, I must provide similar increases in the serials budgets."
Now that resources are squeezed,
attention has focused on the increasingly deleterious effects of these previous
financial agreements. On many
campuses, administrators have mobilized faculty, calling them to account as
editors and authors of expensive journals and involving them more deeply in ongoing
purchase decisions. Big deals have
been undone, bundles unbundled, and there is even research underway to scrutinize
publisher consolidation for cause to justify government anti-trust action. In addition, members of the academic
community have embarked on a vigorous search for other, alternative business
models. This search has resulted
in stimulating and sometimes heated debates about the viability of a suite of
options under the broad umbrella of "open access publishing."
As originally formulated, open
access publishing would disrupt the current system by shifting the burden of
generating revenue from the demand side through widespread use of
subscriptions, to the supply side through the use of author fees, making use
cost-free. Such a shift would have
the benefit, in theory, of putting the principals—the scholars—in the economic
driver's seat, and it would have the broad public policy benefit of lowering
the economic barriers to reading and using the publications. Discussion of this idea has quickly
revealed that in very few disciplines do scholars have sufficient funds from
grants and other sources to pay author fees, and that there would be an
administrative nightmare if academic institutions that are already
strapped for funds were to massively reallocate budgets from library
acquisitions and other sources to support author fees in any kind of fair and
equitable fashion. Publications
adhering to this strict definition of open access will undoubtedly continue to
be created and survive, but they will probably be limited in number unless and
until sources of supply-side revenue can be found that do not depend as heavily
as they do now on grant support, and that do not require fundamental administrative
and financial overhauls of our institutions.
In the face of these practical difficulties,
the open access discussion has now morphed to focus on other ways to lower the
barriers to access, such as by encouraging publishers to make articles freely accessible
after a limited time during which they exploit subscription revenue, or by calling
for authors to "self-archive," that is, to retain rights to make available
their articles in pre-print and/or post-print form. The theory behind this approach is that if enough authors
were self-archiving, then new services could arise to collect, aggregate, evaluate,
and present these articles to users.
At some tipping point, as yet undetermined, these services might serve
to challenge and undermine the economics and inefficiencies of the current
system of publication.
The key barrier to a complete
transformation following this scenario, however, is precisely an asset
management issue. Does it make
sense for systems of knowledge to be built upon the fragile infrastructure of a
network of personal Web sites that are subject to personal whims, not to
mention the migratory habits of individuals? Institutional repositories might help, especially if they
were to collect other related faculty output, such as underlying data and
teaching materials. However, in
order to justify the necessary and significant costs of such repositories,
institutions like yours must develop compelling rationales for collecting,
preserving, and providing access to these kinds of output. Moreover, institutional repositories
must be invested with features of the current scholarly publishing system that preserve
trust in the authenticity of academic work and reliably allocate credit.
Because there is so much at stake
for individual faculty in the ways that the current system confers credit and
authenticity, it appears that these factors are going to be the hardest to
disrupt. Even the self-archived material
in physics and related disciplines in Paul Ginsparg's famous arXiv has not
resulted in substantial shifts from traditional forms of publication — at least
not yet. Still, efforts to build
new models of scholarly communications based on institutional repositories and
rights to self-archiving have gained growing interest. They remain worth exploring, and Mellon
has provided a series of grants for preliminary studies of how this approach
might be scaled up across disciplines.
Important as the serials crisis is,
and as pregnant as the discussions about open access alternatives may be, there
are even larger forces at play.
These are only partially revealed in the system of scholarly publishing,
and may even be obscured by a narrow focus on pricing and open access. First, whatever happens with open
access, it is not likely to result in a uniform, utopian solution. As Jason Epstein has written, "the
global village green will not be paradise. It will be undisciplined,
polymorphous and polyglot." The
academic world is and will undoubtedly remain highly pluralistic. Just within the domain of publishing,
traditional journal production will remain with us for some time and the shift
of those journals to electronic forms of dissemination is likely to continue.
Moreover, our institutions have a
lot to gain economically in this transition from print to electronic
publishing. Not only do electronic
publications provide greater functionality for teaching and research than those
in print, but also, as a recent study published by CLIR has shown, there is good
evidence that the considerable operational costs in libraries of ordering,
receiving, processing, shelving, and circulating physical copies can be
eliminated by a shift to electronic versions. In the aggregate across institutions, these potential savings
may total in the tens of millions annually. It is worth noting, however, that these are the costs in the
print world that represent our system of archiving. Cutting those costs without putting in place reliable
archives that are committed to the academic mission remains a problem, and
there is increasingly—and unfortunately—a widely held view among our academic
institutions that savings are theirs to capture and reallocate, and that
covering the costs of archiving digital assets for the long term is a
responsibility for someone else.
In the face of this reasoning, let
me cast the archiving problem in even starker terms. The shift to electronic publication in its current form
represents a dramatic, jump-off-the-cliff shift in the academy from owning
scholarly output to renting it. With
subscription payments, institutions no longer buy journals, and the current
form of licenses limits use so that more traditional, mostly regional initiatives
for collaborative collection development and resource sharing across
institutions are now next to impossible.
Instead, the hundreds of thousands of dollars going out the door each
year typically buy only a year's worth of access to the resource and only for
members of a single institution.
Not only have prices risen, but the material terms of the licensing deals
are transforming the underlying infrastructure of resources available for teaching
and research.
A growing number of senior officers
of our colleges and universities—presidents, provosts, and chief financial
officers—are beginning to question the huge risk to the future of their institution's
core operations because of the growing dependence on a record of scholarship
for which the institution is paying substantial sums but on which it has no
real continuing claim. Library
licenses often have a clause that reserves so-called "perpetual access" for the
institution to the material to which it subscribes each year, but there are no
reliable enforcement mechanisms in these clauses. Typically, publishers promise to transfer the material on a
pile of CDs, but I am not aware that any such transfer has ever taken place,
and it is unlikely that any institution has built or will build the capacity to
implement such a solution.
Instead, collective action is needed, and Portico, a new organization, will
be emerging in the coming months as a trusted third party archive, offering a
model for how such action might best be organized.
In fact, the Mellon Foundation,
JSTOR, and, we expect, the Library of Congress are all making commitments to
the development of Portico as a new digital archive that could help alleviate
this substantial and growing business risk within the academy—and address the
problem of providing enduring access to the scholarly record for the community
as a whole. Libraries and
publishers must both join in crafting the solution. Library supporters of the archive would require publishers
to deposit journals in the archive as a matter of contract. Upon cancellation of a subscription by
a participating library, the archive would provide perpetual access to that
material to that library. After a
period of time all material in the archive would be available electronically to
all paying supporters of the archive.
The business model would effectively turn a rent of journals into a buy
for everyone participating, but the collective, collaborative action needed to
implement this model requires that universities and colleges make it a priority
to invest in the support of the archive at least a portion of the operational savings
that they would accrue from the shift to electronic publications.
The collective action will likely
also require mechanisms to audit and certify the archive to ensure that it is
performing its archiving functions.
Managing the institutional risks associated with continuing and growing
investment in electronic resources is a looming problem that responsible financial
officers, faculty, librarians, information technology specialists, and others
in the academy must confront across the board, not just for electronic
journals, and which they must together address collectively across traditional
institutional and disciplinary boundaries. The Library of Congress' National Digital Information
Infrastructure and Preservation Program (NDIIPP) has mounted a significant
effort on this issue for a broad range of materials involving a large set of
institutions, and you will hear more about this later in the workshop. But even for journals, a reliance on
Open Access, however it is defined, is not an answer but just another form of
the problem.
The Elephant in Our Midst and Its Implications
An even more significant strategic
issue that has the potential to profoundly and permanently disrupt the patterns
of higher education is what Lorcan Dempsey of OCLC calls the "Amazoogle factor." It is now well known, and still
deplored by some, that Amazon, Google, Yahoo, and other online systems are the
first and sometimes only stops for students doing research. Faculty, too, have come to depend increasingly
on these services. These
organizations are now working closely with publishers to make the contents of
current publications more accessible and "search friendly." Google Scholar, which was announced last
fall, may not be comprehensive in its coverage, but its ability to parse out
citations from articles, among other remarkable features, shows how adept
Google can be in addressing some of the more nuanced and specialized needs of
scholars. And this is not the
end. In mid-December, Google also announced
that it had launched a massive retrospective digitization project based, at
least initially, in five major research libraries.
One of the most common figures of
speech that has appeared in public discussions of digitization over the last
decade has been the invitation to imagine having the entire Library of Congress
available electronically and accessible at the click of a mouse. Google's investment in re-engineering
the digitization process and in significantly reducing the costs so that it
could undertake its own initiative means that the vision of digitizing the
holdings of our largest research libraries is not only imaginable but may actually
be within reach. The initiative
and any competitive projects it might stimulate could be incredibly valuable
for the public and for the academy in particular. But that Google is undertaking this effort, not for
philanthropic purposes, but for business reasons, means that higher
education—at least its library arm responsible for collecting, preserving, and
providing access to content of scholarly significance—now has a formidable for-profit
competitor with considerable resources and its sights set squarely on at least
one core aspect of the higher education business. The deals that Google has made with its research library
partners are just now beginning to be scrutinized in public, and those deals
may not pass muster, as Microsoft's Corbis deals with museums fifteen years ago
did not. But Google is different
from Corbis, and its interest in library content resonates deeply with the
interests of libraries in digitizing book content and enhancing search of those
digital assets by more sophisticated means than catalog searching.
Let us assume that one way or
another, massive digitization takes place. Among the big strategic questions for higher education would
be how digital assets for scholarly communications should be organized in such
an environment. These questions
have scarcely been identified, much less aired and fully discussed. I am going to leave a number of these
issues to one side tonight and instead highlight several other broad implications
of Google's potentially disruptive influences on the academy and particularly
on the ways that it manages its digital scholarly assets.
The "processed" publication. First, I want to draw attention to an idea that Joseph
Esposito highlighted a few years ago in a First Monday article.
For scholars, massive digitization and open access are not ends unto
themselves. The central issue is
whether scholars can advance knowledge in ways that were not previously
possible. Scholars need to make
use of digitized and open access materials. Esposito's insight is that at the highest level of
generality, what unites our interest in digitization and open access in a
digital world is that the material becomes "processable." That is, it is subject to computational
processing: it can be indexed,
manipulated, mined, aggregated, decomposed, built up, and so on by algorithm,
and it is this "processability" that makes digitized objects and open access
materials valuable to scholars.
Intellectual property. This brings me to a second point about intellectual
property. The temptation is to
throw up one's hands in despair at the massive cost of meticulously clearing
the rights of every rights holder in an object to be digitized, and either to
abandon digitization of copyrighted material altogether, or to engage in
efforts—also costly but often not accounted for—to stay under the radar of the
copyright police. Google seems
prepared to take the risk of violating copyright by displaying snippets of
copyrighted material in search results and then handing off the searcher to the
publisher or to a library where the searcher can obtain the full text legitimately
in print or digital form. This
approach would represent one of a growing set of initiatives, including
Mellon-funded initiatives such as JSTOR, ARTstor, CIAO, ACLS's History-E
project, the BiblioVault project at the University of Chicago, the Electronic
Enlightenment at Oxford University, and New World Records, as well as others
such as the ECCO project based at the University of Michigan and CLIR, all of
which demonstrate that communities of users and publishers can find ways to
create the needed trust and goodwill and agree to overcome the costly barriers
of copyright to create highly useful digitized collections of research and
educational materials.
Open access materials might lower
costs to some who want to use them, but they will never comprise the full range
of materials for scholarly purposes, and to that extent open access simply will
not be a necessary condition of advancing scholarship. On the other hand, what does appear to be
necessary to the future of scholarship is "processable" materials. There may well be an opportunity here to
recalibrate licenses, rights, and even copyright law itself with a richer
taxonomy of uses, many of which may currently be regarded by default as "piratical."
Uses that support machine indexing,
for example, may actually need to be redefined as legitimate, provided other
protections are in place, because they have become the core infrastructure
today for serving the U.S. constitutional principle of promoting "the progress
of science and the useful arts."
Search. Third, I would highlight the need for new and expanded search
and research capabilities. Google's
indexing of full text would be generated by optical character recognition (OCR)
and could greatly expand and facilitate basic searching and retrieval. Serious thought now needs to be given
to ways that Google and other search engines could be used to meet the
metasearch and other service objectives we are trying to achieve, sometimes at great
expense, in the catalogs of our local systems. However, we also need to be thinking beyond the local system
catalogs.
The sheer volume of digitized material,
for example, is going to require implementation of much more sophisticated
indexing, searching, and filtering techniques, including broad application of
computational linguistic and related statistical techniques as well as
sophisticated techniques for filtering based on markup and thesauri, which
would relate results to discipline-based concepts and concerns. Above all, there will be growing demand
for mechanisms to link search results flexibly across systems in ways that
resemble but will be fundamentally different from metasearching across catalogs. To provide a simple example: how easily
could one search for related materials in ARTstor, and JSTOR, and, say, Readex
Newsbank? Google or Yahoo may be
able to respond to the basic demand for cross searching, but as scholars become
more sophisticated in their use of these technologies, their needs will become correspondingly
more specialized and discipline-specific in ways that commercial companies aimed at the
mass market will likely find unprofitable to address. Search and information retrieval is a
growth industry not only in the general economy but also for scholarly
communications. Solutions that the
large search engines cannot supply will have to come from search applications
developed within and for the academy, and finding these solutions should be a
high priority for the academy to address.
Research methods. The fourth strategic area that I would highlight for you is
the advance of new discipline-based research methods. The development of search technologies will drive the
scholarly use of massively digitized resources, but scholarly use will also
shape and guide the development of particular technologies and applications for
specific disciplinary pursuits.
Disciplines will need to develop new and specialized methodologies—an
informatics of standards and practices—to identify, mark up, and explore the
large volumes of digital information with which they each need to work: economists with tabular data in
government publications; literature scholars with literary texts from various
genres; social historians with contemporary accounts of various aspects of
social life; ethicists with case studies of ethical dilemmas; art historians
with evidence about the context of artists and their art; and so on. As scholars in various fields of study
develop experience with these materials, the disciplines and sub-disciplines
will need to develop and codify practice.
Over the next 3-5 years, if
scholars begin to formulate how the use of these newly digitized materials
could advance knowledge in their fields and begin to set discipline-based standards
for how these materials should be organized for systematic use, then we will
likely need to pave the way for three further types of intensive scholarly
activity. Editorial activity will
shift, field-by-field, to the markup and online annotation of digitized source
materials to shape them for scholarly activity in particular disciplines. Tools will be needed to operate on
these materials in discipline-appropriate ways. And given appropriately edited and marked up resources, and proficiency
in new methodological techniques, scholars will begin to generate and report
results based on research using these methods. These reports will refer systematically to digitized sources
and may incorporate them in various ways.
Researchers at the Institute for Advanced Technology in the Humanities
at the University of Virginia, including Ed Ayers and Will Thomas, and elsewhere
have been modeling new forms of scholarly practice like these. The results of this early work
demonstrate that there will be a growing need for training and other forms of
support from librarians and IT specialists in all forms of institutions as
these practices take shape in discipline-appropriate ways and spread throughout
the academy.
New collection emphases. The fifth strategic area for managing digital assets that I would bring to your attention is the need for dramatic shifts in the emphases in collection building and processing. If large quantities of published materials are available online through some common interface, it will be increasingly hard to distinguish libraries based on their holdings of these materials. Of course, scholars will always need access to the original artifacts for various purposes, and holding libraries will need to streamline the ways that they collectively manage these artifacts in offsite, and perhaps shared, regional repositories. Instead, libraries and their institutions will increasingly be distinguished by the special collections of rare and unique materials which they hold and by the scholarly services they provide for these materials. Special collections are often inaccessible or underprocessed, and the forms of description do not integrate well with other kinds of catalogs. Several institutions have been working together in recent years to develop innovative methods of appraising special collections for processing; others to simplify the cataloging. Building on these initiatives, support will be needed over the next 3-5 years for the development of more efficient forms of processing and description, and revised standards so that special collections finding aids integrate more effectively in larger asset management systems.
Perhaps even more important is the need for more aggressive development of collections in new media. Recent and contemporary culture is documented in audio recordings, in still and moving images, and in various exclusively digital formats, such as geographic information systems, simulations, Web pages, and Weblogs. Scholars will increasingly need access to these materials for teaching and research. Concerted action is especially needed among libraries to ensure that these materials are actively and comprehensively collected and processed for scholarly use. Economies of scale and the complexities associated with intellectual property rights management may prove, as they are proving for ARTstor with art images and for New World Records with musical recordings, that individual libraries need more centralized, collaborative mechanisms to achieve these objectives.
Interaction between digital
library and learning management systems. The
last strategic issue that I would highlight is the need for more seamless
interaction between digital library and learning management systems. There is a pedagogical trend to
incorporate the use of primary sources and research methods more deeply in the
curriculum of higher education, and this trend will likely continue, but will
also vary by discipline. As
scholars in different fields gain experience with and develop discipline-based methodologies
for using massively digitized content, as well as special collections and new media
collections, they will need to incorporate the material into their teaching and train students in
the research methods. Demand will
grow for deepening connections between digital library systems used for managing
digital assets in various forms and combinations of licensed, digitized, and
open access materials and learning management systems such as Sakai. Conversely, at least some of the content
specifically created for teaching and learning will need to flow to digital
library systems for long-term management and preservation. Essential for the effective management
of the flows of content among digital library systems and between digital
library systems and learning management systems are mechanisms, like Shibboleth, for building and expressing
levels of trust between owners and users of the digital assets.
Conclusion
There is a view that the promise,
or curse, of Google's activities is that they will make the management of scholarly
digital assets within the academy largely irrelevant. I hope you can tell from the strategic issues that I have
highlighted for you this evening that I find such a view to be spectacularly uninformed
and shortsighted. Rather, the
promise, or the curse, is that managing digital assets for scholarly purposes has
become a vastly more interesting enterprise than it has ever been. It is increasingly possible for
scholars to have unprecedented access to the resources they need to engage
issues that have remained elusive or even unthinkable. The custodial challenge for us is to be
both extraordinarily innovative and conservative at the same time: innovative in that we must organize
ourselves to take the best possible advantage of the opportunities, and
conservative in that we must protect our gains and not screw up the scholarly
process.
I promised to conclude by focusing
on roles and division of labor. I
cannot be prescriptive about who should do what. There is just too much to do. The need is great for imagination and expertise to be
applied wherever they can be found, from the largest institutions to the
smallest. But allow me the
following rough distinctions.
For many of the issues that I have
highlighted, such as the shift in methodological practices in scholarly
disciplines and the interaction of library-based content with pedagogical
practice, much of the energy and support is inevitably local. Indeed, these issues will require
flexible and responsive service relationships between the faculty and library
and information technology specialists, which the smaller colleges and
universities have proven themselves to be extraordinarily adept at providing,
compared to the larger research universities. In order to make and preserve the gains through local support
efforts, I especially want to mention that generating funding is critical and requires
that we make urgent, common cause with our faculty colleagues to produce a more
disciplined, rigorous, and articulate public face than we have had in a long
time about the value of humanistic study in a hostile and brutal world.
Having said that there is much room
for local effort, it is worth pointing out that many activities, including
initiatives involving mass digitization such as the Google activities, and even
JSTOR and ARTstor, need to be organized centrally, largely for reasons of scale
and economy. Activities involving
uniquely held, special collection materials must take a more distributed
approach and involve the holding institutions, but that does not mean that
every institution must itself invest in digitization labs. Similarly, large-scale software
development projects, like DSpace, Fedora, and Sakai, greatly benefit from
central control to produce code efficiently and on budget, but they also require
input and support from a broad range of users and experts, and each has generated
slightly different mechanisms for opening its processes and engaging the
broader community.
Beyond the need for local
initiative and expertise, there is also a need for collaboration and collective
organization involving shared financing and responsive governance at levels
that are probably unprecedented.
If massive digitization occurs, it would be imprudent for the
institutions that provide the source material to cede complete control of the
digitized versions to a commercial organization, and it is my understanding
that each of the initial partners has ensured that Google will provide them a
copy of the digitized versions.
However, the volume of material to be digitized would be so large that,
even with rapidly falling storage costs, no single institution would likely
be able to afford to store the digitized versions plus provide necessary
backups. Moreover, no single
institution would be the sole source of the material to be digitized, and the digital
copies collectively would be of great use to the broad educational
community. A collectively financed
mechanism or organization—a "BookStor"—would probably need to be founded for
the purpose of providing long-term preservation of the digital copies in trust
for the community, if not actually for serving up digital copies to users once
they are found and requested. Will
Google agree to allow a "BookStor" to come into being, and will institutions
rise up to support it?
The need for such collective
organization raises another, larger question about how the academy can
reorganize itself to accommodate efficiently and responsibly within its embrace
entities that essentially outsource library and related functions that once
were held closely within individual institutions. The California Digital Library is one model of outsourcing
within a state system. JSTOR, ARTstor,
and NITLE represent yet other models, and the Mellon and Hewlett Foundations
are experimenting with yet another in their jointly funded creation called
Ithaka, which is designed to stitch together with common services ARTstor,
JSTOR, NITLE, and a family of other scholarly support entities. These resources simply cannot take
shape if they are imagined to be "one off," or ad hoc organizations. Presidents, provosts, deans, scholars,
librarians, and technologists together must find ways within the larger
academic community for their institutions to work together to realize the
extraordinary economies of scale that are possible, and foundations like Mellon
should not be seen as the "deep pockets" to which they turn to cover the costs
of these entities, but as catalysts in the necessary effort to establish them financially
and organizationally as new modes of ongoing operation in higher education.
Let me stop here and leave you with a cautionary tale from Adrian Johns's The Nature of the Book. In nineteenth-century England, there arose a group called the Society for the Diffusion of Useful Knowledge. Worried that an educated working class could be a dangerous force in society, it resolved to swamp the country with cheap magazines, the Penny Cyclopedia and the Penny Magazine, that contained absolutely nothing "to excite the passions." To achieve this mission, the Society was the first group to make full, industrial use of the steam press, a remarkably cost-effective technology at large scale. By 1832, the Society's magazines were "by far the most extensively circulated periodical works that issue from the press." The Society estimated its readership at the then unprecedented figure of one million.
However, for all the attention to
cost-effectiveness, critics of all persuasions attacked the project and it
eventually failed. Conservatives
were convinced that the project dispersed unnecessary ideas that might still
prove dangerous. Radicals, on the
other hand, complained that the magazines contained no really useful
knowledge. Rather than meeting demand, they said,
the Society sought nothing more than to "stuff our
mouths with Kangaroos."
As we continue the discussions in this workshop and in the future about how most effectively to manage scholarly digital assets, let us not fall into the trap of the Society for the Diffusion of Useful Knowledge and lose sight of the ultimate objective: meeting demand for useful knowledge. Let us be on the lookout for the "Kangaroos."
Thank you very much for your attention.
[1] "The Answer to Life, the Universe, and Everything," in Wikipedia. Available at http://en.wikipedia.org/wiki/The_Answer_to_Life,_the_Universe,_and_Everything.