How much does digitization cost, and how much should it? It’s a simple enough question. You take the cost per item, multiply it by the number of items, and you’re done: problem solved. What’s that? You ask what the cost per item is? Well, it depends… what type of collection is it? Where is the collection based? Who’s doing the labor? Are there any preservation activities that need to be incorporated into the digitization workflow? How about metadata? How thorough do you want the records? What quality of images do you need? Where are you storing the digital files? What’s your sustainability plan? How about discoverability? Do you have digitization equipment at your institution or do you plan to work with a vendor or partner?
It doesn’t take long to realize that when it comes to digitization, the question of cost is more complicated than it initially appears. The high number of variables at play makes it challenging to estimate costs for an individual project, let alone compare costs between projects or develop broadly applicable guidelines.
Last week, CLIR’s Digitizing Hidden Special Collections and Archives program organized a panel to explore issues of cost at DPLAfest 2016 in Washington, DC. The 45-minute session was intended to start a conversation about tools and resources that could help project planners arrive at informed budgets, and to offer insights to those assessing digitization proposals. DPLAfest, the annual conference organized by the Digital Public Library of America (DPLA), was a particularly fruitful environment to discuss this topic. As a mass aggregation service for digitized sources, DPLA and its constituents are naturally invested in efficient digitization. DPLA’s hub model also provides an interesting regional approach to addressing cost questions at scale. It’s worth mentioning that DPLA and other open aggregation platforms helped inspire the adoption of a new core value for CLIR’s Hidden Collections program: connectedness, highlighting the need to maximize the discoverability and accessibility of digitized material alongside related materials.
The panel opened with a presentation by Joyce Chapman, assessment librarian at Duke University, on the Digitization Cost Calculator (beta) created by the DLF Assessment Interest Group. The Cost Calculator is a planning tool that aims to provide time and cost estimates for digitization projects. Chapman and her colleagues began their project by assembling a bibliography, and were surprised to find fewer than 20 resources on the subject of digitization costs. They then created best practice guidelines for collecting time data for various digitization processes; these were finalized in July 2015. Since then, the group has been soliciting data to inform the calculator and make its assessments more accurate. They are cognizant of what it requires to collect this data and seek to offer support to contributors through their upcoming “Day of Data” initiative, which challenges professionals to set aside a single day (or work shift) for timing themselves during one or more of the processes in their digitization workflow.
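To make the idea of a time-based estimate concrete, here is a minimal sketch of how per-process timing data could be turned into a labor estimate. It is not the DLF calculator’s actual method: the `ProcessStep` and `estimate_labor` names are hypothetical, and every figure in the example (minutes per item, hourly wages, item count) is invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProcessStep:
    """One step in a digitization workflow (hypothetical structure)."""
    name: str
    minutes_per_item: float  # average staff time per item, e.g. from timing data
    hourly_wage: float       # wage of the person doing this step


def estimate_labor(steps, item_count):
    """Return (total_hours, total_cost) of staff labor for a project.

    Covers labor only; equipment, storage, and vendor fees are left out.
    """
    total_hours = 0.0
    total_cost = 0.0
    for step in steps:
        hours = step.minutes_per_item * item_count / 60
        total_hours += hours
        total_cost += hours * step.hourly_wage
    return total_hours, total_cost


# Invented example figures, not data from any real project.
example_steps = [
    ProcessStep("image capture", minutes_per_item=3.0, hourly_wage=15.0),
    ProcessStep("metadata creation", minutes_per_item=6.0, hourly_wage=20.0),
    ProcessStep("quality control", minutes_per_item=1.5, hourly_wage=16.0),
]

hours, cost = estimate_labor(example_steps, item_count=1000)
print(f"Estimated labor: {hours:.0f} hours, ${cost:,.2f}")
```

Even this toy version shows why timing data matters: the estimate is only as good as the minutes-per-item figures that go into it, and those vary with the type of material, the equipment, and the staff doing the work.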
The next speaker, Sandra McIntyre, Director of the Mountain West Digital Library (MWDL), spoke about MWDL’s attempt to establish common pricing across digitization centers in its region. The MWDL is a DPLA service hub that has ingested one million items from 825 collections throughout the Mountain West region. MWDL established a Digital Services Pricing Task Force, which conducted a survey of prices offered by several vendors and DPLA Service Hubs. While the task force had difficulty convincing vendors to share their information, it had a good response from academic libraries and government groups, which enabled it to establish the Digital Services Price List. Both MWDL digitization center managers and new digital collections partners have enjoyed the advantages of common pricing, such as simplified project planning and review.
Jen Palmentiero, Digital Services Librarian at Southeastern New York Library Resources Council (SENYLRC), gave the final presentation, offering a perspective from a regional consortium serving smaller organizations. SENYLRC has a diverse member base including institutions focused on K-12 education, small museums, and organizations that do not see cultural memory as a primary function, such as hospitals. Jen described how institutions that otherwise might not be able to afford digitization could pool resources through SENYLRC and gain access to subscriptions and related services that support digitization work. Their members have access to web hosting, backup services, collections management and digital curation software, DPLA and WorldCat harvesting, and hardware if needed. The consortium also provides organizations with training and personalized consultation. The presentation closed with a look at SENYLRC’s pricing models and three case studies of small-scale digitization projects, tailored to the institutions’ individual circumstances.
In 45 minutes we were barely able to scratch the surface of the question, “How much does digitization cost?” What digitization should cost is a question inherently intertwined with complex conversations about labor and best practices, and it deserves a much richer exploration than we ultimately had time for last week. However, our brief conversation introduced new resources and shed light on why cost modeling and cost comparisons can be so challenging. Beyond the many variables to consider, it is time-consuming to conduct a survey of prices, particularly given the lack of transparency from many providers. There is also limited literature available on costs to guide these efforts, suggesting a need for further research. Cost planning is particularly difficult for organizations with limited capacity, which can benefit from a regional or other focused network that provides targeted guidance and services.
There is a clear need for further conversations about this topic and resources to make cost planning and assessment more accessible to projects and institutions of all types. So let’s keep talking about the cost of digitization. It’s a simple enough request.
Nicole Ferraiolo is program officer for scholarly resources at CLIR.