Lessons in Deep Resource Sharing from the University of California Libraries
Daniel Greenstein
The research library's historic role is providing access to great
collections of scholarly knowledge. To date, those great collections
have been assembled in a single place, with a high level of professional
service surrounding them, in support of research, teaching, and all
sorts of civic and cultural engagements. The greatest challenge that
research libraries face today is to fundamentally transform themselves
so that they may continue to build and maintain those collections.
I suggest that the traditional collection development model (one
that assembles information resources, and people in physical proximity
to them, in a single organization) is no longer a functional one.
Instead, we are driven by the challenges we face to implement a new
division of labor between organizationally distinctive, layered library
services that work interdependently to provide individual users with
the full suite of collections and services that they require.
Layering Library Services at the University of California
The story will be told with reference to the University of California
(UC), where a layered library model is beginning to emerge. Before
introducing the model itself, it is important to reflect a little
on the context in which it is becoming realized. If it were a nation
in its own right, the state of California would claim the fifth or
sixth largest gross national product in the world. The state has
two public university systems: the University of California and the
California State University. The University of California has 10
campuses (the tenth, Merced, will begin enrolling students soon),
nearly 200,000 student full-time equivalents, and about 5,000 faculty
members. Its governance and funding are both highly decentralized.1
UC also has 11 university libraries. Ten of these are located on
the campuses (where they are in most cases themselves library systems),
and one, the California Digital Library (CDL), is located at the
Office of the President. Collectively, the libraries hold nearly
32 million volumes, and their combined annual budget includes some
$240 million in state funding. Harvard libraries, by comparison,
claim some 14 million volumes. In maintaining the breadth and depth
of their collections, UC libraries, like other great research libraries,
are hard pressed to keep up with the escalating costs of scholarly
publications. These costs have risen more rapidly than library budgets
in the past several years. Figure 1 shows the extent of the challenge.
It compares a price index calculated for scholarly journals with
the consumer price and higher education price indices, respectively,
and demonstrates that libraries, in good years as well as in
bad, cannot keep up with the annual 6-12 percent price increases
in scholarly journal subscription costs.
Figure 2 shows the same problem in a slightly different way. It
charts the annual increase in the number of volumes published worldwide
against the declining purchasing power, in volumes, of the state funding
that libraries receive for monograph purchases.2
Fig. 1. Periodical price increases in comparison with common inflation indexes, 1985-2000
Fig. 2. Growth in publishing and decline in library buying power, 1988-2001
This inflation in both the volume and the cost of scholarly publications
has forced the UC libraries to seek new ways of maintaining their
historic collecting roles. In particular, they have invested collectively
in services that all require but that none can afford independently.
Looking briefly at a number of these services, we will see a layered
library service model beginning to emerge, one in which campus libraries
build upon a range of common or utility services in order to better
meet the very distinctive local needs of their own faculty, students,
and civic constituencies.
Regional Libraries, a Union Catalog, and a Digital Collection
The regional library facilities (compact print storage facilities
of which UC has two, in the north and the south) were an early, perhaps
the first, UC experiment with a new library service model. These
facilities, which are paid for centrally and managed (by Berkeley in
the north and UCLA in the south) for the use of the libraries generally,
free up scarce shelving space in campus libraries,
thereby enabling them to keep locally maintained collections current.
A second utility is a union catalog, Melvyl®, which
makes information about the UC libraries' collective holdings
available to any user, anywhere in the system (anywhere in the world, in fact).
By combining Melvyl with an online patron-initiated interlibrary
loan service (a further utility), the UC libraries give their users
access to more than 32 million volumes as if they formed part of
a virtual uniform library. Figure 3 shows the results of an online
search conducted using Melvyl. A publication called Adaptive
Instructional Systems is not widely held by the UC libraries.
So a user at Riverside who is interested in the title clicks the
Request button, and the volume is delivered within 24 to 48 hours.
Fig. 3. Melvyl and patron-initiated request
Another utility, of more recent origin, is a collection of digital
materials that the libraries agree to license or purchase together.
The collection is one of the largest made available digitally by
a research library and at present includes more than 8,000 journal
titles, 250 reference and other databases, all books printed in English
before 1800, 200,000 digital images of works of art and architecture,
and 4,500 social scientific and government statistical databases.
Nothing in this collection is acquired that is not agreed to and
paid for by every library.3 The
rationale for the shared digital collection's development is simple.
Digital information doesn't need to live anywhere in particular and
can be accessed from anywhere over the network. Rather than acquiring
highly redundant local digital collections, the UC libraries began
in 1997 to acquire some digital materials together, not as a
buying club, but as a single corporate entity. By sharing in the
development of digital collections, the UC libraries can effectively
share in a variety of essential tasks, including identification,
review, vendor negotiation, content acquisitions, and acquisitions
processing. They also exercise and enhance their buying power acting
as the University of California libraries.
A next step, a very new one for the UC libraries, is to think about
extending the shared collection from digital to printed materials.
The UC libraries are, for example, building shared collections of
printed journals that exist in digital formats and exploring the
development of shared collections of federal and state government
documents. The rationale for print is the same as for digital:
- enhancing collections and services that each UC campus library
makes available to its faculty and students;
- expanding the breadth and depth of collections available systemwide
to support the university's distinguished teaching and research
programs;
- reducing unnecessary duplication of campus holdings; and
- saving substantially in cost and effort.
Planning for the shared print collection has been a revealing process
and has forced us to ask hard but essential questions. Of the materials
on our libraries' shelves, which of them do we need to continue holding
redundantly? Are there economies to be had through some coordination?
How can shared print holdings be collaboratively governed?4
We are starting with print materials where cooperative collection
development makes obvious sense, notably with new journals (e.g.,
as published by Elsevier and the Association for Computing Machinery
[ACM]) where a single print edition is supplied "free" to
the UC libraries in respect of their systemwide electronic site license.
In these economic times, when libraries are beginning to cancel print
subscriptions where electronic versions exist, we are also expecting
this kind of shared collection to ensure that print editions aren't
knowingly or willingly lost to the system. We are also thinking retrospectively
about focusing not only on journals that are available online but
also on federal and state government publications. In an interesting
hallway discussion recently, two of our university librarians found
themselves wondering whether and to what extent libraries should
share in the cost of "core" materials, leaving campus libraries
to enhance, maintain, and assert their distinctiveness by investing
in distinctive local collections.
The shared print collection is yet another example of a utility
set of services. It enables campus libraries to provide a higher
level of collection and service support for research and teaching
on their campuses and for the various public communities they serve.
The layering model is also evident in a range of technology applications
that are supplied by the California Digital Library in close cooperation
with the campus libraries. One example is a reference linking service
that is demonstrated in figures 4-7. In figure 4 a user is searching
OVID's Current Contents, an abstracting and indexing database, for
journal articles on strokes. Having located a promising reference to
Anatomy of Stroke, Part I, she wants to see the full text of the
article. Clicking on the reference, she does (figure 5). If she then
sees a footnote or reference to something that she also wishes to
read, clicking on that reference will pull
up the full text of that article (figures 6-7). But links from Current
Contents will not always lead to the full text of an article.
In fact, the links can be made only if the article text is available
under license at UC. In some instances, only the print edition is
available, in which case the user may end up back at Melvyl, having
to issue a request for an interlibrary loan.
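The fallback behavior just described can be sketched in a few lines of
Python. This is a minimal illustration under stated assumptions, not the
CDL's actual linking software: the Citation type, the LICENSED_FULL_TEXT
table, the example.org URLs, and the resolve function are all hypothetical
names invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Citation:
    """A reference as it might appear in an abstracting and indexing database."""
    journal: str
    year: int
    title: str


# Hypothetical table of journals licensed online: journal title -> base URL.
# A real link resolver would consult a far richer knowledge base.
LICENSED_FULL_TEXT: Dict[str, str] = {
    "Stroke": "https://example.org/fulltext/stroke",
    "Annals of Neurology": "https://example.org/fulltext/annals-of-neurology",
}

# Hypothetical address of the union catalog's request form.
CATALOG_REQUEST_URL = "https://example.org/catalog/request"


def resolve(citation: Citation) -> Tuple[str, str]:
    """Return (kind, url): full text when licensed, else a catalog request."""
    base = LICENSED_FULL_TEXT.get(citation.journal)
    if base is not None:
        # The article text is available under license, so link straight to it.
        return ("full_text", f"{base}/{citation.year}")
    # No licensed online copy: send the user back to the union catalog,
    # where an interlibrary loan request can be issued.
    return ("catalog_request", CATALOG_REQUEST_URL)
```

The point of the sketch is the single decision it makes: the link target
depends on what is licensed, not on where the citation was found.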
Fig. 4. Reference linking from Current Contents
Fig. 5. Link found in Stroke
Fig. 6. Reference linking from a footnote in the article in Stroke
Fig. 7. Link found in the Annals of Neurology
This linking utility is a particularly interesting model of a layered
service. The CDL hosts technology that enables this kind of linking
and uses that technology to ensure that it applies wherever possible
to the electronic content that makes up the shared digital collection.
But the shared digital collection does not constitute the sum total
of electronic materials to which UC faculty and students have access.
Campus libraries acting individually and in small groups also license
or purchase electronic information over and above that which is available
in the shared collection. To ensure that campuses can integrate the
unique electronic materials that they hold, the CDL makes the linking
technology that it maintains available to the campus libraries;
these libraries in turn configure the linking service to include
locally held online materials.
A Union Catalog of Finding Aids
A further example of a layered service is the Online Archive of
California (OAC), a union catalog of some 7,000 finding aids that
have been developed for library special collections and archives
on UC campuses and more generally around the state. Bound up within
the OAC are perhaps two enabling utilities. A technology infrastructure
enables integration of disparate finding aids. Perhaps more interesting,
the OAC as a project provided the guidelines, and in some cases the
motivation, to campus and other collections to produce online finding
aids in a format that could be integrated. A new service that integrates
access to digital image surrogates for works of art and architecture
may have a similar effect and help UC's libraries and museums make
hundreds of thousands of digital images available to the widest possible
community. As with other utilities, this one is designed to enhance
the local services that campus libraries can make available to their
users. In this vein, we are exploring the development of tools that
will enable libraries to configure the service to meet local users'
specific needs, for example, by adding local images to the collection,
by integrating the image collection with other local holdings, and
by building interfaces that ensure the image service as a whole integrates
with local course management systems.
What Makes the Layered Service Model a Challenge
This brief review of the layered services that are available within
UC suggests that there is nothing at all new about the service model.
The great public utilities (electricity, gas, even water) have been
provided on a similar model since the late nineteenth century. What
is new is the application to library services of this layered model.
Also new are the weaknesses in the digital library that the model's
development at UC has revealed, and it is to these challenges that
the paper turns.
Figure 8 depicts schematically and somewhat abstractly the current
digital library service model. It shows star shapes toward the top
of the picture to represent library Web sites where users come to
find a host of materials (online public access catalogs, online journals,
online databases, etc.). Libraries construct the Web sites for their
users. They make reference to a wide variety of information resources
represented as oval shapes toward the bottom of the picture. These
information resources may include
- catalogs of materials that are available locally in print and
other analog formats (e.g., through online public access catalogs
and finding aids);
- online materials that are available to local users under licenses
and that may be managed by third parties (e.g., online journals
and reference databases); and
- freely accessible Internet-based materials that are accessible
through the library Web site and may be hosted anywhere in the
world.
Fig. 8. The current digital library service model
Because information resources are built differently in a variety
of places, by a variety of people, and to serve a variety of ends,
the library has to work quite hard and often in very proprietary,
ad hoc ways (demonstrated by differently depicted arrows) to include
them in its Web site. The model is enormously ineffective and inefficient.
Take the library's integration into its Web site of online journal
content as an example. Operating at the content layer (represented
by ovals), journal publishers have produced a host of different products,
each of them aggregating or assembling in one place a particular
collection of journals. Although the aggregations tend to focus on
particular subject areas and can be quite large, they are only a
very partial representation of the available journal content. Rather
than look exclusively at one publisher's collection of scientific
journals, for example, the library user wants to look across a host
of publishers' science offerings. To support this research, the library
is forced to combine, in a single Web site, a wide variety of journal
collections, linking collections wherever possible by using the reference-linking
technology discussed above. In effect, the library spends considerable
energy in disaggregating the publisher aggregations so the journal
content they contain can be more useful. Further, the library is
charged doubly for its inconvenience. It pays a premium in subscription
costs for the so-called value-added services that publishers claim
they add by aggregating content. It then pays again to support the
reference-linking technologies that allow it to unbundle aggregations
so that the materials become more useful.
Many journal publishers have recognized the burden that the model
imposes and have organized themselves through CrossRef so that they
universally support network protocols that enable cross-collection
linking. Unfortunately, the hard lessons learned are apparently not
having any influence over those monograph publishers who are beginning
to make some of their backlists available online. Once again, we
see the publishers' insistence on aggregating online content in ways
that make little sense to library users, who typically want unfettered
access to a range of information products. Indeed, the model emerging
with electronic monographs may prove to be more flawed than that
which is only now being transformed in the journal market. At least
the journal publishers went out of their way to aggregate content
by discipline, including in any one aggregation the journals of many
different academic societies and, sometimes, publishers. With online
monograph collections, the organizing principle that is most commonly
in evidence seems to be by publisher (and perhaps, within publisher,
by subject).
Commercial electronic publishers are not the only or even the worst
offenders. Libraries that produce their own digital collections (for
example, by scanning selected special collections) do so in a way
that makes it extremely difficult for others to federate and integrate
those collections with one another and with the more foundational
holdings of printed and electronic monographs and journals. Have we,
too, developed content that is so distinctive and ad hoc in its local
orientation that it forces others who want to use it to go through
the same unbundling process that commercial journal and monograph
publishers force upon us?
A more rational digital library model is depicted in figure 9. The
model proposes that we (publishers, libraries, anyone who builds
digital information content) develop digital content and distribute
it in open repositories. The repositories are "open," not
because they are freely accessible (the model doesn't prejudice business
decisions) but because the digital objects they contain (whether
they are encoded texts, digital images, digital sound or film, statistical
databases, or geospatial information systems) can be accessed, transformed,
combined, and recombined with objects drawn from other collections
by bona fide users according to their needs and interests. The model
does not constrain the journal or book publishers, or even the digital
libraries, from aggregating content in their own unique ways or distributing
it with their own look, feel, brand, and functionality. They will
and they must continue to build "higher-level" end-user
services based on the content they supply. The model simply suggests
that perhaps others will want to develop different higher-level services
supporting needs and uses that the content owners cannot envisage.
The model also forces us to think creatively about what kind of higher-level
services might materialize if digital information content is available
in open repositories. At present, the only higher-level services
that we know are union catalogs (which integrate access to information
about holdings of different online information resources) and, more
recently, linking services as discussed above. Although there is
a great deal more to do before we can claim to have perfected these
kinds of services, we might still want to ask whether there are others
that we have not yet thought of.
Fig. 9. A layered approach
What about alerting services, which indicate to users that something
in their field of study has just become available from one or other
source? Or format-based services that integrate access to and encourage
use of online maps or space data? What about authoring tools that
allow users to weave an interpretive web around digital objects (online
journals, encoded books, manuscript images, databases) that are found
in a variety of different open repositories and to present the interpretive
web as an interactive lesson in support of online learning? Can we
surface online library information in a manner that allows it to
be integrated selectively into online learning materials, whether
the materials are developed in Blackboard, WebCT, or some proprietary
system? The answer, sadly, is no. At the University of California,
this translates financially as follows: the $240-million annual investment
that UC makes in its libraries is not available to the $170-million
investment that it makes in instructional technologies. And UC is
by no means unique in this.
There are other challenges. Even if we do adopt a layered model
and put at its foundation a range of open digital object repositories,
we are uncertain about how best to manage our digital content. What
we do currently is perhaps best exemplified with reference to the
many lives of a digital image surrogate for a work of art. Let's
say that a library wishes to develop an online finding aid to assist
users interested in accessing its slide library. That library might
include for every record in the catalog a thumbnail of the image
of the slide in question. The thumbnail image is produced, included
in a catalog record, bundled into a database management system that
is useful for cataloging, and made accessible through a range of
search-and-retrieval functions that are appropriate to a catalog.
If the same library wants to include images that are available from
the slide library in, say, an online collection of works by German
expressionists, it will create an altogether different image (probably
at higher resolution), bundle it along with some descriptive data
in an altogether different content management system (e.g., as appropriate
to an online image service), and make it available through a variety
of search, retrieval, slide-table, and other functions as specifically
appropriate to such a service. Then let's assume that a teacher who
is presenting a class on a particular German expressionist wants
to create some online learning materials utilizing some of the same
digital images that are now available both in the catalog and in
the image service. She will in this case have to reproduce the digital
image and include it along with any descriptive information in an
entirely separate content management system, this one providing the
functionality as appropriate to online learning materials.
The model, depicted in figure 10, relies upon proliferation of parallel
and independent services, each with its own data ingest, data management,
and data delivery schemes. It doesn't scale. Every time the library
wants to use a single digital object (whether an image, a graph,
a map, or a text) in a new way, it is almost forced to build another
vertical and independent silo of infrastructure and technology around
it. That's pretty silly. In the more rational model depicted in figure
11, the library's digital images are managed in a single consistent
format as part of one or several open image repositories that are
constructed in a way that supports very different uses of selected
digital images. This is where we think we are going at UC, as at
many other research libraries, and we are going in this direction
because the parallel model (figure 10) is so uneconomical.
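The contrast between the two models can be made concrete with a small
Python sketch. It is an illustration only, assuming a hypothetical
OpenRepository class and three hypothetical view functions standing in
for the catalog, the image service, and the online lesson; none of these
names comes from UC's systems. Each image is ingested once, and every
service derives what it needs from the same stored object.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ImageObject:
    """One master record per image, stored once in the repository."""
    object_id: str
    title: str
    width: int   # pixel width of the master image
    height: int  # pixel height of the master image


class OpenRepository:
    """A single store of master objects that any service may read."""

    def __init__(self) -> None:
        self._objects: Dict[str, ImageObject] = {}

    def ingest(self, obj: ImageObject) -> None:
        self._objects[obj.object_id] = obj

    def get(self, object_id: str) -> ImageObject:
        return self._objects[object_id]


def thumbnail_size(obj: ImageObject, long_edge: int = 128) -> Tuple[int, int]:
    """Scale the master dimensions so the longer edge equals long_edge."""
    scale = long_edge / max(obj.width, obj.height)
    return (round(obj.width * scale), round(obj.height * scale))


def catalog_view(obj: ImageObject) -> dict:
    """Catalog service: descriptive data plus a small thumbnail."""
    return {"id": obj.object_id, "title": obj.title,
            "thumb_size": thumbnail_size(obj)}


def image_service_view(obj: ImageObject) -> dict:
    """Image service: the same object at full resolution."""
    return {"id": obj.object_id, "title": obj.title,
            "size": (obj.width, obj.height)}


def lesson_view(obj: ImageObject, note: str) -> dict:
    """Online lesson: the same object with a teacher's annotation."""
    return {"id": obj.object_id, "caption": f"{obj.title}: {note}"}
```

In the parallel model each of the three views would own its own copy of
the image and its own ingest and management code; here only the view
functions differ, and adding a fourth use means adding one more function.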
Fig. 10. Content management: the parallel service model
Fig. 11. Content management: a layered model
Conclusion
The layered service model also forces us to think differently about
organizational issues. In it, we give up on any understanding that
the content producer (the entity responsible for the open digital
object repository) can know all the various ways in which the digital
objects they produce will ultimately be presented and used. Abstractly,
the repository cannot predict the range, complexity, or functionality
of the higher-level services that are built on top of it. The question
then becomes how to build a repository so that it can support a virtually
infinite array of unknown higher-level services. Any answer to this
question will undoubtedly be technical, but it will necessarily include
organizational and political aspects as well. In a layered model,
the success of those building open digital object repositories will
be tied directly to the success and visibility of those building
higher-level services based upon them. The promise that the model
holds for libraries is compelling.
Today, we heard talks by people from public, national, and research
libraries. Many of us have talked about the wonderful independent
services we have created, services that array themselves in parallel
to one another and comport themselves according to some organizational
independence. Perhaps in two or three years we will return and speak
in a different way. Then, perhaps, public librarians will speak eloquently
about the services they have built for a local community on top of
the collections offered up by research and national libraries. And
the research librarians might in turn speak with passion about how
they are delivering their collections through services developed
by civic libraries and by schools, services that are tailored
to specific user communities and user needs. Is it possible that
a layered service model permits an organizational division of labor
through which a variety of organizational entities, each playing
different functional roles, are equally empowered?
1. The 10 campuses are Berkeley, Davis, Irvine, Los Angeles, Merced, Riverside,
San Diego, San Francisco, Santa Barbara, and Santa Cruz.
2. U.S. libraries manage somehow to acquire some 650,000 books annually using
endowment and other funding. Even at that rate, they are unable to
keep pace with the rate of publication.
3. Payments are made according to a prorated formula that is worked out and agreed
to by the campus libraries.
4. These themes are more fully developed in Daniel Greenstein, "Library
Stewardship in a Networked Age: The Compelling Logic of Shared Collections," in Redefining
Preservation in the Twenty-first Century, edited by Abby Smith.
Forthcoming.