Enabling Access in Digital Libraries: Convenors' Questions • CLIR

What kinds of role distinctions are necessary?

Users of the resources in these scenarios may play many different roles, using the term role in a general sense. A faculty member may, for example, act as teacher, author or creator, researcher, consultant, or private individual. It would be impossible, participants argued, for an individual to declare that access to a particular article was being sought in conjunction with only one such role. Some expressed a fear that the mere technical ability to introduce and enforce distinctions among roles would lead to the adoption of practices that would discourage the general pursuit of knowledge and so would not be in the best of academic interests. One librarian recalled a case in which access to a licensed resource was permitted to faculty only during semesters in which they were teaching particular courses, regardless of whether the resource was useful to research or even to the preparation of the courses.

In the context of automated authorization and access management schemes, the term role has a related but more specific sense. It describes recorded characteristics of an individual user, such as membership in a group. Rules within the access management system determine whether a user with a particular role is able to access a resource and what operations he or she can perform on it. A user’s role or roles might be established or negotiated in different ways, for example, through a campus-based proxy service or authorization scheme supported by a directory database, by membership in a professional society, or by acceptance of a charge to a credit card. Where institutional licensing of published journals is being considered, roles may be divided into those for which the institution can issue credentials and those that must be negotiated by the individual. Participants agreed that any access management scheme should allow an individual user to negotiate privileges beyond those afforded by institutional credentials or offered to the general public.
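The narrower sense of role described above, recorded characteristics checked against rules, can be illustrated with a minimal sketch. It is not drawn from any system discussed at the workshop: the names `Rule`, `User`, and `is_permitted`, the role labels, and the resource identifier `journal-a` are all hypothetical, chosen only to show how an institutionally issued role and an individually negotiated role might combine.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    """One hypothetical rule: holders of `role` may perform `operations` on `resource`."""
    role: str
    resource: str
    operations: frozenset


@dataclass
class User:
    """A user known to the system only by a pseudonym and a set of recorded roles."""
    pseudonym: str
    roles: set = field(default_factory=set)


def is_permitted(user, resource, operation, rules):
    """Return True if any rule grants one of the user's roles this operation."""
    return any(
        rule.role in user.roles
        and rule.resource == resource
        and operation in rule.operations
        for rule in rules
    )


# Assumed example rules: one role certified by the institution,
# one negotiated by the individual (for instance by paying a charge).
RULES = [
    Rule("member:example-university", "journal-a", frozenset({"view", "print"})),
    Rule("individual-subscriber", "journal-a", frozenset({"view", "print", "download"})),
]

reader = User("u-1", {"member:example-university"})
print(is_permitted(reader, "journal-a", "view", RULES))      # True
print(is_permitted(reader, "journal-a", "download", RULES))  # False

# The individual negotiates a privilege beyond the institutional credential.
reader.roles.add("individual-subscriber")
print(is_permitted(reader, "journal-a", "download", RULES))  # True
```

Keeping the institutional role as coarse as "member of the community" means the rule table stays small, which is consistent with the participants' preference for avoiding fine role distinctions.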

Much of the discussion in this area focused on the granularity (degree) of role distinctions required and perhaps transmitted through credentials or gateway services that an institution may provide for members of its community. Privacy, cost of implementation, and institutional requirements (associated with varying missions and policies) were seen as factors here. Some argued strongly that the granularity should be no finer than membership in a community as defined by the licensing institution, in other words, that all those affiliated with a university should have access to the same resources on the same terms. Finer distinctions by school or department within a university (such as those suggested in the second scenario) are likely to inhibit cross-disciplinary research. Distinctions between faculty, undergraduates, and graduate students would cause problems for teaching. Others suggested that some distinctions might be necessary because of institutional policies relating to services for alumni, say, or, in the case of state universities, services for the general public. The consensus was that fine role distinctions should be avoided and that certification of any distinctions should be the responsibility of the user institution.

The technology, said participants, should allow libraries and publishers to make the business agreements they want, but both sides are more likely to benefit if the agreements do not rely on complex role distinctions. In the second scenario, licensed journals are only of interest to a subset of the community; in such a case, the licensor and licensee might avoid the transaction costs in enforcing special limitations on access by negotiating the subscription price on a different basis. One suggestion was to base the price on the size of the subset interested in the resource (though not limiting access to this group). Others stressed the value of developing a volume-based approach to pricing other than a pay-per-view model.

The purpose of use, observed two breakout groups, is often more relevant than any characteristic of the user. In the first scenario (public domain materials digitized by libraries), libraries would probably encourage any use for teaching or research but wish to control commercial re-use of digital reproductions made at substantial expense, in order to recover costs or fund future digitization projects. The privileges afforded by the fair use doctrine and exceptions granted in copyright laws are also primarily based on the nature (and effects) of use and not on characteristics of the users. On further reflection, participants concluded that requiring users to declare in advance how they intended to use materials was unrealistic and would be seen as an invasion of privacy.

What rights and duties are expected?

One issue raised by this question related to the use of the term rights. Under U.S. copyright law, observed Mary Levering of the U.S. Copyright Office, publishers and authors have rights in intellectual works, but users exercise privileges and duties. Furthermore, copyright owners and their agents generally manage rights in copyrighted works, whereas libraries generally manage access to those works.

As pointed out earlier (under the heading What perspectives are needed?), users and providers have different expectations. Rights, privileges, duties, and responsibilities are shaped not only by license agreements, but also by the overall legal, economic, and technical environment. They will be subject to change over time.

Legally, privileges and duties may be established through a chain of obligation from author or creator to publisher, to library (possibly via a consortium or third party aggregator) to end user. Not every link in this chain is associated with a formal agreement. In the first scenario, where unpublished materials may be involved, there may be no way to follow the chain and establish unambiguously the rights associated with the original materials. Complex terms of gift may impose additional duties on the recipient library. After converting the material to digital form and becoming the provider of online access, the library may wish to assert rights in the digital reproductions in order to safeguard the potential for income or retain control over how the materials are used. Most users of converted archival materials would comply with reasonable terms, if it were easy to determine what the terms were. Automatic enforcement of all such terms is infeasible, since they often apply to subsequent use rather than to access or to specific operations that might be controlled by technical means. Both providers and users would benefit in this case from a mechanism that cautions users about special conditions and allows them to determine whether or how to proceed.

Academic users value their personal space highly. In the words of one librarian, users want the library to “make the connection and get out of the way.” They expect to be allowed to exercise personal responsibility or, as one breakout group reported, to have the “right to do reasonable things and the responsibility not to do unreasonable things.” They would expect any access management system to allow them access to all the information that they are entitled to have access to, inform them of their privileges and responsibilities, and explain how they can negotiate additional privileges. They expect patterns of use permitted for print publications to carry over into the electronic environment. They also expect that publishers will somehow guarantee that the content they are accessing has not been corrupted inadvertently or maliciously.

For their part, publishers hope to maintain revenue, whether to satisfy shareholders, subsidize other activities, or simply cover costs. To achieve this end, they expect to control distribution of works for which they hold rights. They expect that privileges given to users based on a reasonable business model can be implemented by technical means. To be acceptable to publishers, an access management scheme must be customizable to individual license agreements and flexible enough to incorporate new types of agreement and new technology for authentication and for delivery of content. Market forces will determine which technical barriers to access and usage protect revenue and which inhibit market expansion.

As intermediaries, libraries have the responsibility to negotiate reasonable agreements on behalf of their user communities and parent institutions. They cannot be responsible for the actions of end users, but they do have a duty to make reasonable efforts to inform users of terms and conditions for access and use and to ensure that institutional policies, as well as the systems or data that support access controls, are effective and valid. To fulfil these responsibilities, they will expect to understand how license agreements are encoded and enforced within an access management scheme. Libraries (or their parent institution or agent) must give reasonable assurances that proxy or gateway services exclude unauthorized users and that credentials offered for users are valid. In return, they will expect publishers’ access management schemes to honor the credentials provided and facilitate access through such proxies.

In the print environment, libraries have assumed the responsibility for archiving materials for posterity. Under section 108 of the U.S. Copyright Law, libraries and archives may reproduce materials in certain circumstances, for example, to replace “a copy or phonorecord that is damaged, deteriorating, lost, or stolen, if the library or archives has, after a reasonable effort, determined that an unused replacement cannot be obtained at a fair price.”⁸ In an electronic environment in which the publisher controls the master copy, after-the-fact preservation will be impossible. Archiving for preservation must be planned for in advance. Libraries, as custodial institutions, will expect license agreements, and access management schemes that implement them, to provide contingency provisions and fail-safe mechanisms that ensure the long-term accessibility of the information resource. The long-term archiving of information in digital form presents a formidable challenge. Information, concluded one breakout group, “will only be preserved if someone’s job depends on preserving it.” Although the archiving challenge was beyond the scope of the workshop, participants noted that a possible contribution to an eventual solution would be special access management provisions that allowed libraries or trusted agents to make archival copies.

What are the privacy issues?

Participants were unanimous in their view on the privacy of individual users, an important issue in the discussions surrounding the development of the CNI White Paper: the metadata that establishes privileges, they argued, should be under the control of the licensing organization and closely guarded. Using the CNI’s categories of identification (anonymous, pseudonymous, pseudonymous with demographics, and actual identities), they recommended that campus-based authentication services, gateways, or proxies should not relay actual identities to access management schemes run by publishers or aggregators. Anonymous access, they concluded, poses the least threat to privacy. Pseudonymous identifiers ensure accountability by allowing a publisher to identify abnormal volumes of use by one (unidentified) user and notify the licensing organization. The association of demographic information with pseudonymous identifiers should be limited; under no circumstances should it be detailed enough to identify an individual user. As librarians have found, some publishers request more details than they can usefully analyze. However, libraries require some tracking of demographics for acquisition decisions and resource allocation, while providers may need such information to adjust business models.
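One way a campus gateway might produce pseudonymous identifiers of the kind recommended here is to derive them from the campus identity with a keyed hash, so that the publisher sees a stable but meaningless label. The sketch below is an assumption for illustration, not a mechanism described in the CNI White Paper or at the workshop; the function name `pseudonym_for`, the key, and the provider labels are hypothetical.

```python
import hashlib
import hmac


def pseudonym_for(campus_id: str, secret_key: bytes, provider: str) -> str:
    """Derive a stable pseudonym for one user at one provider.

    The secret key stays with the licensing institution, so the provider
    cannot map the pseudonym back to a campus identity. Including the
    provider name gives each provider a different pseudonym for the same user.
    """
    message = f"{provider}:{campus_id}".encode("utf-8")
    digest = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability; an arbitrary choice


# Hypothetical values for illustration only.
KEY = b"held-only-by-the-campus-gateway"
first = pseudonym_for("jdoe", KEY, "publisher-a")
again = pseudonym_for("jdoe", KEY, "publisher-a")
other = pseudonym_for("jdoe", KEY, "publisher-b")

print(first == again)  # True: the same user is recognizable across sessions
print(first == other)  # False: two providers cannot correlate the same user
```

Because only the institution holds the key, a publisher that notices abnormal use can report the pseudonym, and only the institution can decide whether to look up the individual behind it.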

Participants stressed that no unnecessary information should be tracked by provider or licensing institution. Users should not be required to indicate the purpose of use. In many states, library reader records are confidential, and the law prohibits libraries from tracking readers’ behavior. The academic community, some participants argued, should lobby for more extensive legal protection for privacy, extending to transactions with publishers and bookstores. However, it is reasonable to allow users to reveal personal information voluntarily in order to secure additional privileges, if they are told how that information will be protected.

How strong must the security controls be?

The design of any access management scheme will balance the tightness of security against user inconvenience and even denial of access to valid users in some cases. The degree of security enforced should be commensurate with the provider’s trust in the user community. Publishers, it was also pointed out, do recognize that libraries are basically honest and will try to comply with reasonable license agreements to the best of their ability. Existing arrangements suggest that they would honor credentials generated through campus-based authentication schemes. Where trust between libraries is concerned, as in the first scenario, libraries have already proved the benefits of mutual trust in many resource-sharing activities, such as interlibrary loan. Libraries will certainly trust each other’s authorization procedures if technically compatible.

In neither of the scenarios examined by the workshop does the content call for very tight security. The limited market value of scholarly and archival information is unlikely to invite widespread abuse. Thus, in the case of a student dropout, say, it would not be essential for the system to be able to revoke privileges immediately. Other classes of information, however, such as current recreational literature or some reference materials, might require more robust controls because of the potential for publishers to lose revenue.

Legal experts reminded participants that no access management scheme exists in a vacuum and that the external environment must be taken into account. They recommended that access management systems emphasize the detection of inappropriate behavior rather than enforcement ahead of time, which is likely to prevent some valid use. Users, they added, need to know what their responsibilities are, and institutional policies need to include adequate sanctions for abusers and procedures for dealing with them. Abuse could be punished by revoking privileges within the system or within the external environment.
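Detection after the fact, as opposed to enforcement ahead of time, can be as simple as reviewing an access log for unusually heavy use. The following sketch assumes a log of (pseudonym, resource) pairs and a threshold chosen by the institution or provider; the function name `flag_heavy_users` and the sample data are hypothetical.

```python
from collections import Counter


def flag_heavy_users(access_log, threshold):
    """Return pseudonyms with more than `threshold` accesses in the log.

    Nothing is blocked at access time; the result is a list for a person
    to review and, if warranted, refer to the licensing institution.
    """
    counts = Counter(pseudonym for pseudonym, _resource in access_log)
    return sorted(p for p, n in counts.items() if n > threshold)


# Assumed sample log: one reader with ordinary use, one with heavier use.
log = [("anon-1", "article-17")] * 3 + [("anon-2", "article-17")] * 6
print(flag_heavy_users(log, threshold=5))  # ['anon-2']
```

What counts as abnormal is a policy question that the sketch leaves open; the threshold here is arbitrary.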

In considering how to balance accountability and privacy in the campus environment, participants found promising one technical approach that had emerged in the discussions relating to the CNI White Paper. Campuses could issue short-term pseudonymous certificates to authenticated users. Certificates valid for a semester or a year could act as credentials for access to most information resources. For selected resources, certificates valid for a few hours might be more appropriate.
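The idea of a short-lived pseudonymous credential can be sketched as a signed token that carries only a pseudonym and an expiry time. This is a simplified stand-in, not a certificate format: real certificates would normally use public-key signatures so that a publisher can verify a credential without being able to issue one, whereas this sketch assumes a key shared between campus and verifier. The function names and the token layout are hypothetical.

```python
import hashlib
import hmac
import time


def issue_credential(pseudonym, lifetime_seconds, key, now=None):
    """Issue a token of the form 'pseudonym|expiry|signature' (assumed layout)."""
    issued = int(time.time() if now is None else now)
    expires = issued + lifetime_seconds
    payload = f"{pseudonym}|{expires}"
    signature = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{payload}|{signature}"


def is_valid(credential, key, now=None):
    """Accept the token only if the signature matches and it has not expired."""
    try:
        pseudonym, expires, signature = credential.rsplit("|", 2)
        expires_at = int(expires)
    except ValueError:
        return False  # malformed token
    payload = f"{pseudonym}|{expires}"
    expected = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    current = time.time() if now is None else now
    signature_ok = hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))
    return signature_ok and current < expires_at


# Hypothetical key and clock values, fixed so the example is reproducible.
KEY = b"campus-signing-key"
few_hours = issue_credential("anon-42", 3 * 60 * 60, KEY, now=0)

print(is_valid(few_hours, KEY, now=60 * 60))      # True: one hour in
print(is_valid(few_hours, KEY, now=4 * 60 * 60))  # False: expired after three hours
```

A semester-long credential would differ only in the lifetime passed to `issue_credential`. Because expiry stands in for revocation, the lifetime bounds how long a departed user could keep access, which is why shorter lifetimes suit more sensitive resources.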

What kinds of accountability are necessary and what kinds of management data are needed?

Participants reiterated that libraries cannot, in practice, be accountable for the actions of users. Realistically, they can only make reasonable efforts to ensure compliance with license terms and the law. Any license agreement between a publisher or publisher’s agent and a library will include some clauses relating to accountability of either party for complying with terms of the agreement. The JSTOR Library License Agreement, repeatedly cited as a model, stipulates that libraries must inform JSTOR if they are using a proxy server to control access, must exert reasonable efforts and cooperate with JSTOR in the implementation of security procedures, must work with JSTOR to inform users of the User Rules, and must notify JSTOR if the library becomes aware of violations. The license allows either JSTOR or the licensee organization to terminate access in the case of unauthorized use. To the extent that access to licensed resources is supported by technical means, some degree of accountability for the effectiveness of those technical controls is to be expected. As mentioned in the discussion on security controls, the group favored after-the-fact accountability rather than automated enforcement that might prevent valid access.

In conjunction with the discussion on privacy, participants observed that libraries, even when objecting to licenses that limit access to subsets of users, may still wish to collect usage statistics aggregated by demographic categories in order to make acquisition decisions and allocate resources. As noted earlier, some publishers ask for access to more demographic details than they fully use. No specific suggestions emerged as to an appropriate level of detail. In this instance, it is possible that both publishers and libraries would like to gather more detail for management purposes than is consistent with protection of the user’s privacy.

How do we evaluate effectiveness of the system from user and provider perspectives?

According to participants, the basic test for a general access management scheme will be whether it is adopted in the marketplace. Its success will depend at least in part on quantity and breadth of use and its viability on whether the various parties receive appropriate value in the bargains they strike. Not surprisingly, no short-term or formal measures of effectiveness were discussed, since there is still much uncertainty about how best to evaluate digital libraries. No better criteria have emerged than precision and recall, which have served heretofore to evaluate information retrieval systems of much more limited scope.

REFERENCE

⁸ Copyright Law of the United States, contained in Title 17 of the U.S. Code, Section 108: “Limitations on Exclusive Rights: Reproduction by Libraries and Archives.” Available at http://lcweb.loc.gov/copyright/title17/1-108.html.
