
2. USER STUDIES
DLF respondents devoted the bulk of their discussion to user studies,
reflecting the user-centered focus of their operations. One respondent
referred to the results of user studies as "outcome" measures because,
although they do not measure the impact of library use on student learning
or faculty research, they do indicate the impact of library services,
collections, facilities, and staff on user experiences and perceptions.
Libraries participating in the DLF survey organize, staff, and conduct
user studies differently. Some take an ad hoc approach; others use
a more systematic approach. Some sites have dedicated staff experts
in research methodologies who conduct user studies; others train
staff throughout the libraries to conduct user studies. Some libraries
take both approaches. Some have consulted experts on their campuses
or contracted with commercial firms to develop research instruments
and analyze the results. For example, libraries participating in
the DLF survey have recruited students in library science and human-computer
interaction to conduct user studies or hired companies such as Websurveyor.com
or Zoomerang.com to host Web-based surveys and analyze the data.
Libraries that conduct user studies use spreadsheet, database, or
statistical analysis software to manage and analyze the data. In
the absence of standard instruments, guidelines, or best practices,
institutions either adapt published efforts to local circumstances
or make their own. There is clearly a flurry of activity, some of
it not well organized or effective, for various reasons discussed
elsewhere in this report.
Learning how to prepare research instruments, analyze and interpret
the data, and use the results is a slow process. Unfortunately,
the ability to apply research results quickly is often essential,
because the environment changes quickly and results go out of date.
Many DLF respondents reported instances where data languished without
being analyzed or applied. They strongly cautioned against conducting
research when resources and interest are insufficient to support
use of the results. Nevertheless, DLF libraries are conducting many
user studies employing a variety of research methods. The results
of these studies run the gamut: they may reinforce librarian understanding
of what users need, like, or expect; challenge librarian assumptions
about what people want; or provide conflicting, ambiguous, misleading,
or incomplete information that requires follow-up research to resolve
or interpret. Multiple research methods may be required to fully
understand and corroborate research results. This exacerbates an already
complicated situation and can frustrate staff. Resources may not
be available to conduct follow-up studies immediately. In other cases,
new priorities emerge that make the initial study results no longer
applicable; in such a case, any attempt at follow-up is worthless.
Moreover, even when research data have been swiftly analyzed, interpreting
the results and deciding how to apply them may be slowed if many
people are involved in the process or if the results challenge long-held
assumptions and preferences of librarians. Finally, even when a plan
to use the results is in hand, implementation may pose a stumbling
block. The longer the entire research process takes, from conception
to implementing the results, the more likely the loss of momentum
and conflict with other priorities, and the greater the risk that
the process will break down and the effort will be wasted. The issue
appears to be related to the internal organization and support for
the library's assessment effort.
To help libraries understand and address these concerns, this section
of the report describes popular user study methods, when and why
DLF libraries have used them, where they succeeded, and where they
failed. Unless otherwise noted, all claims and examples derive from
the DLF interviews. The focus is surveys, focus groups, and user
protocols, which are the methods DLF libraries use most often. Heuristic
evaluations, paper prototypes and scenarios, and card-sorting exercises
are also described because several DLF institutions have used
these methods successfully.1
2.1. Surveys (Questionnaires)
2.1.1. What Is a Survey Questionnaire?
Survey questionnaires are self-administered interviews in which
the instructions and questions are sufficiently complete and intelligible
for respondents to act as their own interviewers.2 The
questions are simply stated and carefully articulated to accomplish
the purpose for which the survey is being conducted. Survey questions
typically force respondents to choose from among alternative answers
provided or to rank or rate items provided. Such questions enable
a simple quantitative analysis of the responses. Surveys can also
ask open-ended questions to gather qualitative comments from the
respondents.
Surveys are an effective way to gather information about respondents'
previous or current behaviors, attitudes, beliefs, and feelings.
They are the preferred method to gather information about sensitive
topics because respondents are less likely to try to please the researcher
or to feel pressured to provide socially acceptable responses than
they would in a face-to-face interview. Surveys are an effective
method to identify problem areas and, if repeated over time, to identify
trends. Surveys cannot, however, establish cause-effect relationships,
and the information they gather reveals little if anything about
contextual factors affecting the respondents. Additional research
is usually required to gather the information needed to determine
how to solve the problems identified in a survey.
The primary advantage of survey questionnaires is economy. Surveys
enable researchers to collect data from large numbers of respondents
in relatively short periods of time at relatively low cost. Surveys
also give respondents time to think about the questions before answering
and often do not require respondents to complete the survey in one
sitting.
The primary disadvantage of survey questionnaires is that they must
be simple, impersonal, and relatively brief. If the survey is too
long or complex, respondents may get tired and hurriedly answer or
skip questions. The response rate and the quality of responses decline
if a survey exceeds 11 pages (Dillman 1978). Instructions and questions
must be carefully worded in language meaningful to the respondents,
because no interviewer is present to clarify the questions or probe
respondents for additional information. Finally, it is possible that
someone other than the selected respondent may complete the survey.
This can skew the results from carefully selected samples. (For more
about sampling, see section 4.2.1.) When necessary, survey instructions
may explicitly ask that no one complete the survey other than the
person for whom it is intended.
2.1.2. Why Do Libraries Conduct Surveys?
Most of the DLF respondents reported conducting surveys, primarily
to identify trends, "take the temperature" of what was happening
among their constituencies, or get a sense of their users' perceptions
of library resources. Occasionally they conduct surveys to compare
themselves with their peers. In summary, DLF libraries have conducted
surveys to assess the following:
- Patterns, frequency, ease, and success of use
- User needs, expectations, perspectives, priorities, and preferences
for library collections, services, and systems
- User satisfaction with vendor products, library collections, services,
staff, and Web sites
- Service quality
- Shifts in user attitude and opinion
- Relevance of collections or services to the curriculum
A few respondents reported conducting surveys as a way to market
their collections and services; others commented that this was an
inappropriate use of survey research. One respondent referred to
this type of survey as "push polling" and stated that there were
easier, more appropriate ways than this to market what the library
offers.
The data gathered from surveys are used to inform decision making
and strategic planning related to the allocation of financial and
human resources and to the organization of library units. Survey
data also serve political purposes. They are used in presentations
to faculty senates, deans' councils, and library advisory boards
as a means to bolster support for changes in library practice. They
are also used in grant proposals and other requests for funding.
2.1.3. How Do Libraries Conduct Surveys?
DLF respondents reported that they conduct some surveys routinely;
these include annual surveys of general library use and user priorities
and satisfaction. Other surveys are conducted sporadically; in this
category might be, for example, a survey to determine user satisfaction
with laptop-lending programs. The library administrator's approval
is generally required for larger, more formal, and routine surveys.
Smaller, sporadic, less expensive surveys are conducted at the discretion
of middle managers.
Once the decision has been made to conduct a survey, libraries convene
a small group of librarians or staff to prepare the survey instructions
and questionnaire, determine the format of the survey (for example,
print, e-mail, Web-based), choose the sampling method, identify the
demographic groups appropriate for the research purpose, determine
how many participants to recruit in each group and decide how to
recruit them, and plan the budget and timetable for gathering, analyzing,
interpreting, and applying the data. A few DLF respondents reported
using screening questionnaires to find experienced or inexperienced
users, depending on the purpose of the study.
Different procedures are followed for formal surveys than for small
surveys. The former require more work. Because few libraries employ
survey experts, a group preparing a formal survey might consult with
survey experts on campus to ensure that the questions it has drafted
will gather the information needed. The group might consult with
a statistician on campus to ensure that it recruits enough participants
to gather statistically significant results. When a survey is deemed
to be extremely important and financial resources are available,
an external consulting or research firm might be hired. Alternatively,
libraries with adequate budgets and sufficient interest in assessment
have begun to use commercial firms such as Websurveyor.com to conduct
some surveys.
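The question a campus statistician would answer, namely how many participants to recruit, can be roughed out in advance. The sketch below is illustrative only and is not drawn from the DLF interviews: it assumes the goal is to estimate a proportion within a chosen margin of error at 95 percent confidence, and it uses Cochran's formula with a finite population correction. The function names, the default margin of error, and the example population and response rate are all assumptions.

```python
import math

def sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Completed surveys needed to estimate a proportion.

    Assumptions (not from the report): simple random sampling,
    z = 1.96 for 95 percent confidence, and p = 0.5 as the most
    conservative guess about how answers will split.
    """
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)   # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)          # finite population correction
    return math.ceil(n)

def invitations_needed(completed, response_rate):
    """Invitations to send so that `completed` surveys come back."""
    return math.ceil(completed / response_rate)

# Hypothetical campus of 20,000 people and a guessed 30 percent response rate.
needed = sample_size(20000)
print(needed, invitations_needed(needed, 0.30))
```

A calculation like this only sizes the sample; it does not replace consultation with a statistician about the sampling method or the analysis planned for the results.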
If the survey is to be conducted in-house, time and financial constraints
and the skills of library staff influence the choice of survey format.
Paper surveys are slow and expensive to conduct. Follow-up may be
needed to ensure an adequate response rate. Respondents are not required
to complete them in one sitting; for this reason, paper surveys may
be longer than electronic surveys. E-mail surveys are less expensive
than paper surveys; otherwise, their advantages are similar. Web-based
surveys might be the least expensive to conduct, particularly if
scripts are available to analyze the results automatically. They
also offer several other advantages. For example, they can be edited
up to the last minute, and the capabilities of the Web enable sophisticated
branching and multimedia surveys, which are difficult or even impossible
in other formats. Both Web and e-mail surveys are easier to ignore
than are paper surveys, and they assume participants have computer
access. Web surveys have the further disadvantage that they must
be completed in one sitting, which means they must be relatively
short. They also require HTML skills to prepare and, if results are
to be analyzed automatically, programming skills. Whether Web-based
surveys increase response rate is not known. One DLF library reported
conducting a survey in both e-mail and Web formats. An equal number
of respondents chose to complete the survey in each format.
Considerable time and effort should be spent on preparing the content
and presentation of surveys. Instructions and questions must be carefully
and unambiguously worded and presented in a layout that is easy to
read. If not, results will be inaccurate or difficult or impossible
to interpret; worse yet, participants may not complete the survey.
The choice of format affects the amount of control libraries have
over the presentation or appearance of the survey. Print offers the
most control; with e-mail and Web-based formats, there is no way
for the library to know exactly what the survey will look like when
it is viewed using different e-mail programs or Web browsers. The
group preparing e-mail or Web surveys might find it helpful to view
the survey using e-mail programs and Web browsers available on campus
to ensure that the presentation is attractive and intelligible.
Libraries pilot test survey instructions and questions with a few
users and revise them on the basis of test results to solve problems
with vocabulary, wording, and the layout or sequence of the questions.
Pilot tests also indicate the length of time required to complete
a survey. Libraries appear to have ballpark estimates for how long
it should take to complete their surveys. If the time it takes participants
to complete the survey in the pilot tests exceeds this figure, questions
might be omitted. The survey instructions include the estimated time
required to complete the survey.
DLF respondents reported using different approaches to distribute
or provide access to surveys, based on the sampling method and survey
format. For example, when recruiting volunteers to take Web-based
surveys, the survey might automatically pop up when users display
the library home page or click the exit button on the online public
access catalog (OPAC). Alternatively, a button or link on the home
page might provide access to the survey. Posters or flyers might
advertise the URL of a Web-based survey or, if a more carefully selected
sample is needed, an e-mail address to contact to indicate interest
in participating. Paper surveys may be made available in trays or
handed to library users. With more carefully selected sample populations,
e-mail containing log-in information to do a Web-based survey, or
the e-mail or paper survey itself, is sent to the targeted sample.
Paper surveys can be distributed as e-mail enclosures or via campus
or U.S. mail. DLF respondents indicated that all of these methods
worked well.
Libraries use spreadsheet or statistical software to analyze the
quantitative responses to surveys. Cross-tabulations are conducted
to discover whether different user groups responded to the questions
differently; for example, to discover whether the priorities of undergraduate
students are different from those of graduate students or faculty.
Some libraries compare the distribution of survey respondents with
the demographics of the campus to determine whether the distribution
of user groups in their sample is representative of the campus population.
A few libraries have used content analysis software to analyze the
responses to open-ended questions.
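Both analyses described above, the cross-tabulation and the comparison with campus demographics, can be done in a spreadsheet; the following minimal sketch shows the same logic in code. The field names ("group", "priority"), the sample responses, and the campus head counts are hypothetical and serve only to illustrate the technique.

```python
from collections import Counter, defaultdict

def cross_tabulate(responses, group_key, answer_key):
    """Count answers separately for each user group."""
    table = defaultdict(Counter)
    for response in responses:
        table[response[group_key]][response[answer_key]] += 1
    return {group: dict(counts) for group, counts in table.items()}

def representation_gap(responses, group_key, campus_counts):
    """Share of each group in the sample minus its share on campus.

    Positive values mean the group is over-represented in the sample.
    """
    sample = Counter(response[group_key] for response in responses)
    n = sum(sample.values())
    if n == 0:
        raise ValueError("no responses to analyze")
    total = sum(campus_counts.values())
    return {group: sample.get(group, 0) / n - count / total
            for group, count in campus_counts.items()}

# Hypothetical data for illustration only.
responses = [
    {"group": "undergraduate", "priority": "longer hours"},
    {"group": "undergraduate", "priority": "longer hours"},
    {"group": "graduate", "priority": "e-journals"},
    {"group": "faculty", "priority": "e-journals"},
]
campus = {"undergraduate": 12000, "graduate": 6000, "faculty": 2000}

print(cross_tabulate(responses, "group", "priority"))
print(representation_gap(responses, "group", campus))
```

Large gaps in the second result suggest that the cross-tabulated figures should be read group by group rather than pooled, since the pooled totals would overweight the over-represented groups.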
2.1.4. Who Uses Survey Results? How Are They Used?
Libraries share survey results with the people empowered to decide
how those results will be applied. The formality of the survey and
the sample size also determine who will see the results and participate
in interpreting them and determining how they will be used. High-profile,
potentially contentious survey topics or research purposes tend to
be treated more formally. They entail the use of larger samples and
generate more interest. Survey results of user satisfaction with
the library Web site might be presented to the library governing
council, which will decide how the results will be used. Data from
more informal surveys might be shared strictly within the department
that conducted the survey. For example, the results of a survey of
user satisfaction with the laptop-lending program might be presented
to the department, whose members will then decide whether additional
software applications should be provided on the laptops. Striking
or significant results from a survey of any size seem to bubble up
to the attention of library administrators, particularly if follow-up
might have financial or operational implications or require interdepartmental
cooperation. For example, results of a survey of reference service
that suggest that users would be better served by longer reference
desk hours or staffing with systems office personnel in addition
to reference librarians should be brought to the attention of library
administration. Survey data might also be shared with university
administrators, faculty senates, library advisory boards, and similar
groups, to win or bolster support for changing directions in library
strategic planning or to support requests for additional funding.
Multiyear trends are often included in annual reports. The results
are also presented at conferences and published.
Although survey results often confirm expectations and validate
what the library is doing, sometimes the results are surprising.
In this case, they may precipitate changes in library services, user
interfaces, or plans. The results of the DLF survey indicate the
following applications of survey data:
- Library administrators have used survey results to inform budget requests
and secure funding from university administrators for electronic
resources and library facilities.
- Library administrators and middle managers have used survey results to
guide reallocation of resources to better meet user needs and
expectations. For example, low-priority services have been discontinued.
More resources have been put into improving high-priority services
with low satisfaction ratings or into enhancing existing services
and tools or developing new ones.
- Collection developers have used survey results to inform investment
decisions, for example, to decide which vendor's Modern Language
Association (MLA) bibliography to license; whether to license a product
after the six-month free trial period; or whether to drop journal titles,
keep the titles in both print and electronic format, or add the
journals in electronic format. Developers have also used survey
data to inform collection-development decisions, for example,
to set priorities for content to be digitized for inclusion in
local collections or to decide whether to continue to create
and collect analog slides rather than move entirely to digital
images.
- Service providers, such as reference, circulation, and resource sharing
(interlibrary loan [ILL] and document delivery) departments,
have used survey results to identify problem areas and formulate
steps to improve service quality in a variety of ways, for example,
by reducing turnaround time for ILL requests, solving problems
with network ports and dynamic host assignments for loaner laptops,
helping users find new materials in the library, improving staff
customer service skills, assisting faculty in the transition
from traditional to electronic reserves, and developing or revising
instruction in the use of digital collections, online finding
aids, and vendor products.
- Developers have used survey results to set priorities and inform the
customization or development of user interfaces for the OPAC, the
library Web site, local digital collections, and online exhibits. Survey
results have guided the revision of Web site vocabulary, the
redesign of navigation and content of the library Web site, and
the design of templates for personalized library Web pages. They
have also been used to identify online exhibits that warrant
upgrading.
- Survey results have been used to inform or establish
orientation, technical competencies, and training programs for
staff, to prepare reports for funding agencies, and to inform
a Request for Proposals from ILS vendors.
- A multilibrary organization has conducted surveys to assess the need
for original cataloging, the use of shared catalog records and vendor
records, the standards for record acceptance (without local changes),
and the applicability of subject classifications to library Web
pages, all to inform plans for the future and ensure the
appropriate allocation of cataloging resources.
DLF respondents mentioned that survey results often fueled discussion
of alternative ways to solve problems identified in the survey. For
example, when users report that they want around-the-clock access
to library facilities, libraries examine student wages (since students
provide most of the staffing in libraries during late hours) and
management of late-night service hours. When users complain that
use of the library on a campus with many libraries is unnecessarily
complicated, libraries explore ways to reorganize collections to
reduce the number of service points. When users reveal that the content
of e-resources is not what they expect, libraries evaluate their
aggregator and document delivery services.
2.1.5. What Are the Issues, Problems, and Challenges With Surveys?
2.1.5.1. The Costs and Benefits of Different Types of Surveys
DLF respondents agreed that general surveys are not very helpful.
Broad surveys of library collections and services do provide baseline
data and, if the same questions are repeated in subsequent surveys,
offer longitudinal data to track changing patterns of use. However,
such surveys are time-consuming and expensive to prepare, conduct,
and interpret. Getting people to complete them is difficult. The
results are shallow and require follow-up research. Some libraries
believe the costs of such surveys exceed the benefits and that important
usage trends can be tracked more cost-effectively using transaction
log analysis. (See section 3.)
Point-of-use surveys that focus on a specific subject, tool, or
product work as well as, or better than, general surveys. They are
quicker to prepare and conduct, easier to interpret, and more cost-effective
than broad surveys. However, they must be repeated periodically to
assess trends, and they, too, frequently require follow-up research.
User satisfaction surveys can reveal problem areas, but they do
not provide enough information to solve the problems. Service quality
surveys, based on the gap model (which measures the "gap" or difference
between users' perceptions of excellent service and their perceptions
of the service they received), are preferred because they provide
enough information to plan service improvements. Unfortunately, service
quality surveys are much more expensive to conduct than user satisfaction
surveys.
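As a rough illustration of the gap model described above, the sketch below computes a gap score for each service as the rating of service received minus the rating of excellent service, and then averages the gaps across respondents. The service names, the rating scale, and the function name are hypothetical; published service quality instruments define their own items and scoring rules, which this sketch does not reproduce.

```python
from collections import defaultdict
from statistics import mean

def mean_gaps(ratings):
    """Average (received - expected) per service, worst shortfall first.

    `ratings` holds (service, expected, received) tuples, one per
    respondent and service. A negative gap means the service fell
    short of what respondents consider excellent.
    """
    gaps = defaultdict(list)
    for service, expected, received in ratings:
        gaps[service].append(received - expected)
    return sorted(((service, mean(values)) for service, values in gaps.items()),
                  key=lambda pair: pair[1])

# Hypothetical ratings on a 1-to-9 scale.
ratings = [
    ("ILL turnaround", 9, 5),
    ("ILL turnaround", 8, 6),
    ("Reference desk", 7, 7),
    ("Reference desk", 8, 9),
]
for service, gap in mean_gaps(ratings):
    print(service, gap)
```

Sorting by gap puts the services with the largest shortfall at the top, which is the kind of information needed to plan service improvements.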
2.1.5.2. The Frequency of Surveys
Surveys are so popular that DLF respondents expressed concern about
their number and frequency. Over-surveying can decrease participation
and make it more difficult to recruit participants. When the number
of completed surveys is very small, the results are meaningless.
Conducting surveys as a way to market library resources might exacerbate
the problem.
2.1.5.3. Composing Survey Questions
The success of a survey depends on the quality and precision of
the questions asked: their wording, presentation, and appropriateness
to the research purpose. In the absence of in-house survey expertise,
adequate training, or consultation with an expert, library surveys
often contain ambiguous or inaccurate questions. In the worst cases,
the survey results are meaningless and the survey must be entirely
revised and conducted again the following year. More likely, the
problem applies to particular questions rather than to the entire
survey. For example, one DLF respondent explained that a survey conducted
to determine the vocabulary to be used on the library Web site did
not work well because the categories of information that users were
to label were difficult to describe, particularly the category of "full-text" electronic
resources. Developing appropriate and precise questions is the key
reason for pilot testing survey instruments.
Composing well-worded survey questions requires a sense of what
respondents know and how they are likely to respond. DLF respondents
reported the following examples. A survey conducted to assess interface
design based on heuristic principles did not work well, probably
because the respondents lacked the knowledge and skills necessary
to apply heuristic principles to interface design (see section 2.4.1.1).
Surveys that ask respondents to specify the priority of each service
or collection in a list yield results where everything is simply
ranked either "high" or "low," which is not particularly informative.
Similarly, surveys that ask respondents how often they use a service
or collection yield results of either "always use" or "never use." Where
it is desirable to compare or contrast collections or services, it
is important to require users to rank the relative priority
of services or collections and to rank the relative frequency of
use. Otherwise, interpreting the results will be difficult.
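One way to see why forced ranking helps is to look at how the answers can be summarized. In the sketch below, each respondent orders every service from most to least important, and the services are compared by mean rank position. This is a simple illustrative summary, not a method reported by DLF respondents, and the service names are hypothetical.

```python
from statistics import mean

def mean_ranks(rankings):
    """Mean rank position per item (1 = highest priority), best first.

    Each ranking must list every item exactly once, so respondents
    cannot mark everything as equally important.
    """
    if not rankings:
        raise ValueError("no rankings supplied")
    items = set(rankings[0])
    positions = {item: [] for item in items}
    for ranking in rankings:
        if len(ranking) != len(items) or set(ranking) != items:
            raise ValueError("each respondent must rank every item exactly once")
        for position, item in enumerate(ranking, start=1):
            positions[item].append(position)
    return sorted(((item, mean(p)) for item, p in positions.items()),
                  key=lambda pair: pair[1])

# Hypothetical rankings from three respondents.
rankings = [
    ["e-journals", "reserves", "ILL"],
    ["e-journals", "ILL", "reserves"],
    ["reserves", "e-journals", "ILL"],
]
for service, rank in mean_ranks(rankings):
    print(service, rank)
```

Because every respondent must place each service in a distinct position, the summary always separates the services, unlike independent ratings in which everything can be marked "high."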
Asking open-ended questions and soliciting comments can also be
problematic. Many respondents will not take the time to write answers
or comments. If they do, the information they provide can offer significant
insights into user perceptions, needs, and expectations. However,
analyzing the information is difficult, and the responses can be
incomplete, inconsistent, or illegible. One DLF respondent reported
having hundreds of pages of written responses to a large survey.
Another respondent explained that he and his staff "spent lots of
time figuring out how to quantify written responses." A few DLF libraries
have attempted to automate the process using content analysis software,
but none of them was pleased with the results. Perhaps the problem
is trying to extract quantitative results from qualitative data.
The preferred approach appears to be to limit the number of open-ended
questions and analyze them manually by developing conceptual categories
based on the content of the comments. Ideally, the categories would
be mutually exclusive and exhaustive (that is, all the data fit into
one of them). After the comments are coded into the categories, the
gist would be extracted and, if possible, associated with the quantitative
results of the survey. For example, do the comments offer any explanations
of preferences or problems revealed in the quantitative data? The
point is to ask qualitative questions if and only if you have the
resources to read and digest the results and if your aims in conducting
the survey are at least partly subjective and indicative, as opposed
to precise and predictive.
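The manual approach described above still leaves a tallying step once a person has read and coded each comment. The following sketch assumes the coding has already been done by hand and shows only the bookkeeping: it checks that every comment falls into a category from the codebook, counts the comments in each category, and associates each category with a quantitative result from the same respondents. The codebook, respondent numbers, and satisfaction scores are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean

def summarize_coded_comments(coded, scores, codebook):
    """Tally hand-coded comments and link them to quantitative answers.

    coded:    list of (respondent_id, category), one category per comment
    scores:   {respondent_id: numeric answer to a related survey question}
    codebook: the full set of allowed categories
    Returns {category: (comment_count, mean_score_or_None)}.
    """
    counts = Counter()
    linked = defaultdict(list)
    for respondent_id, category in coded:
        if category not in codebook:
            # The categories are meant to be exhaustive, so this signals
            # a gap in the codebook rather than a bad comment.
            raise ValueError(f"category not in codebook: {category!r}")
        counts[category] += 1
        if respondent_id in scores:
            linked[category].append(scores[respondent_id])
    return {category: (counts[category],
                       mean(linked[category]) if linked[category] else None)
            for category in counts}

# Hypothetical codebook, coded comments, and 1-to-5 satisfaction scores.
codebook = {"hours", "e-resources", "staff", "other"}
coded = [(1, "hours"), (2, "hours"), (3, "staff")]
scores = {1: 2, 2: 4, 3: 5}
print(summarize_coded_comments(coded, scores, codebook))
```

A category with many comments and a low mean score is a candidate explanation for a problem revealed in the quantitative data; the comments themselves still have to be read to extract the gist.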
2.1.5.4. Lack of Analysis or Application
Theoretically, the process is clear: prepare the survey, conduct
the survey, analyze and interpret the results, decide how to apply
them, and implement the plan. In reality, the process frequently
breaks down after the survey is conducted, regardless of how carefully
it was prepared or how many hundreds of respondents completed it.
Many DLF respondents reported surveys whose results were never analyzed.
Others reported that survey results were analyzed and recommendations
made, but nothing happened after that. No one knew, or felt comfortable
enough to mention, who dropped the ball. No one claimed that changes
in personnel were instrumental in the failure to analyze or apply
the survey results. Instead, they focused on the impact this has
on the morale of library staff and users. Conducting research creates
expectations; people expect results. Faculty members in particular
are not likely to participate in library research studies if they
never see results. Library staff members are unlikely to want to
serve on committees or task forces formed to conduct studies if the
results are never applied.
The problem could be loss of momentum and commitment, but it could
also be lack of skill. Just as preparing survey questions requires
specific skills, so too do analysis, interpretation, and application
of survey results. Libraries appear to be slow in acquiring the skills
needed to use survey data. The problem is exacerbated when survey
results conflict with other data. For example, a DLF respondent reported
that their survey data indicate that users do not want or need reference
service, even though the number of questions being asked at the reference
desk is increasing. Morale takes a hit if no concrete next steps
can be formulated from survey results or if the data do not match
known trends or anecdotal evidence. In such cases, the smaller the
sample, the more likely the results will be dismissed.
2.1.5.5. Lack of Resources or Comprehensive Plans
Paper surveys distributed to a statistically significant sample
of a large university community can cost more than $10,000 to prepare,
conduct, and analyze. Many libraries cannot afford or choose not
to make such an investment. Alternative formats and smaller samples
seem to be the preferred approach; however, even these take a considerable
amount of time. Furthermore, surveys often fail to provide enough
information to enable planners to solve the problems that have been
identified. Libraries might not have the human and financial resources
to allocate to follow-up research, or they could simply have run
out of momentum. The problem could also be a matter of planning.
If the research process is not viewed from conception through application
of the results and follow-up testing, the process is likely to halt
at the point where existing plans end.
2.2. Focus Groups
2.2.1. What Is a Focus Group?
A focus group is an exploratory, guided interview or interactive
conversation among seven to ten participants with common interests
or characteristics.3 The
purpose of a focus group is to test hypotheses; reveal what beliefs
the group holds about a particular product, service, or opportunity
and why; or to uncover detailed information about complex issues
or behaviors from the group's perspective. Focus group studies entail
several such group conversations to identify trends and patterns
in perception across groups. Careful analysis of the discussions
reveals insights into how each group perceives the topic of discussion.
A focus group interview is typically one to two hours long. A trained
moderator guides the conversation using five to ten predetermined
questions or key issues prepared as an "interview guide." The questions
are open-ended and noncommittal. They are simply stated and carefully
articulated. The questions are asked in a specific sequence, but
there are no predetermined response categories. The moderator clarifies
anything that participants do not understand. The moderator may also
ask probing follow-up questions to identify concepts important to
the participants, pursue interesting leads, and develop and test
hypotheses. In addition to the moderator, one or two observers take
detailed notes.
Focus group discussions are audio- or videotaped. Audiotape is less
obtrusive and therefore less likely to intimidate the participants.
Participants who feel comfortable are likely to talk more than those
who are not; for this reason, audiotape and well-trained observers
are often preferred to videotape. The observers' notes should be
so complete that they can substitute if the tape recorder does not
work.
Focus groups are an effective and relatively easy way to gather
insight into complex behavior and experience from the participants'
perspective. Because they can reveal how groups of people think and
feel about a particular topic and why they hold certain opinions,
they are good for detecting changes in behavior. Participant responses
can not only indicate what is new but also distinguish trends from
fads. Interactive discussion among the participants creates synergy
and facilitates recall and insight. A few focus groups can be conducted
at relatively low cost. Focus group research can inform the planning
and design of new programs or services, be a means for evaluating
existing programs or services, and facilitate the development of
strategies for improvement and outreach. Focus groups are also helpful
as a prelude to survey or protocol research, where they may be used to
identify appropriate language, questions, or tasks, and as a follow-up
to survey or protocol research, to clarify or explain factors
influencing survey responses or user behaviors. (Protocol research
is discussed in section 2.3.)
The quality of the responses to focus group questions depends on
how clearly the questions are asked, the moderator's skills, and
the participants' understanding of the goals of the study and what
is expected of them. A skilled moderator is critical to the success
of a focus group. Moderators must quickly develop rapport with the
participants, remain impartial, and keep the discussion moving and
focused on the research objectives. They should have background knowledge
of the discussion topic and must be able to repress domineering individuals
and bring everyone into the conversation. Before the focus group
begins, the moderator should observe the participants and, if necessary,
strategically seat extremely shy or domineering individuals. For
example, outspoken, opinionated participants should be placed to
the immediate left or right of the moderator and quiet-spoken persons
must be placed at some distance from them. This enables the moderator
to shut out the domineering person simply by turning his or her torso
away from the individual. Moderators and observers must avoid making
gestures (for example, head nodding) or comments that could bias
the results of the study.
Moderators must be carefully selected, because attitude, gender,
age, ethnicity, race, religion, and even clothing can trigger stereotypical
perceptions in focus group participants and bias the results of the
study. If participants do not trust the moderator, are uncomfortable
with the other participants, or are not convinced that the study
or their role is important, they can give incomplete, inaccurate,
or biased information. To facilitate discussion, reduce the risk
of discomfort and intimidation, and increase the likelihood that
participants will give detailed, accurate responses to the focus
group questions, focus groups should be organized so that participants
and, in some cases, the moderator are demographically similar.
The selection of demographic participant groupings and focus group
moderator should be based on the research purpose, the sensitivity
of the topic, and an understanding of the target population. For
example, topics related to sexual behavior or preferences suggest
conducting separate focus groups for males and females in similar
age groups with a moderator of the same age and gender. When the
topic is not sensitive and the population is diverse, the research
purpose is sufficient to determine the demographic groupings for
selecting participants. For example, three focus groups (for
undergraduate students, graduate students, and faculty) could
be used to test hypotheses about needs or expectations for library
resources among these groups. Mixing students and faculty could intimidate
undergraduates. Although homogeneity is important, focus group participants
should be sufficiently diverse to allow for contrasting opinions.
Ideally, the participants do not know one another. Participants who
do know one another tend to form small groups within the focus group,
which makes the session harder for the moderator to manage.
The primary disadvantage of focus groups is that participants may
give false information to please the moderator, stray from the topic,
be influenced by peer pressure, or seek a consensus rather than explore
ideas. A dominating or opinionated participant can make more reserved
participants hesitant to talk, which could bias the results. In addition,
data gathered in focus groups can be difficult to evaluate because
such information can be chaotic, qualitative, or emotional rather
than objective. The findings should be interpreted at the group level.
The small number of participants and frequent use of convenience
sampling severely limit the ability to generalize the results of
focus groups, and the results cannot be generalized to groups with
different demographic characteristics. However, the results are more
intelligible and accessible to lay audiences and decision makers
than are complex statistical analyses of survey data.
A final disadvantage of focus groups is that they rely heavily on
the observational skills of the moderator and observer(s), who will
not see or hear everything that happens, and will see or hear even
less when they are tired or bored. How the moderators or observers
interpret what they see and hear depends on their point of reference,
cultural bias, experience, and expectations. Furthermore, observers
adjust to conditions. They may eventually fail to recognize language
or behaviors that become commonplace in a series of focus groups.
In addition, observation itself can change what is observed. The idea
is often loosely compared to the Heisenberg principle in physics; in
the context of human subjects research, it is called the Hawthorne
or "guinea pig" effect. Being a research subject changes the
subject's behavior. Having multiple observers
can compensate for many of these limitations and increase the accuracy
of observational studies, but it can also further influence the behaviors
observed. The best strategy is to articulate the specific behaviors
or aspects of behavior to be observed before conducting the study.
Deciding, on the basis of the research objectives, what to observe
and how to record the observations, coupled with training the observers,
facilitates systematic data gathering, analysis of the research findings,
and the successful completion of observational studies.
2.2.2. Why Do Libraries Conduct Focus Groups?
More than half of the DLF respondents reported conducting focus
groups. They chose to conduct focus groups rather than small, targeted
surveys because focus groups offer the opportunity to ask for clarification
and to hear participants converse about library topics. Libraries
have conducted focus groups to assess what users do or want to do
and to obtain information on the use, effectiveness, and usefulness
of particular library collections, services, and tools. They have
also conducted focus groups to verify or clarify the results from
survey or user protocol research, to discover potential solutions
to problems identified in previous research, and to help decide what
questions to ask in a survey. One participant reported conducting
focus groups to determine how to address practical and immediate
concerns in implementing a grant-funded project.
Data gathered from focus groups are used to inform decision making,
strategic planning, and resource allocation. Focus groups have the
added benefit of providing good quotations that are effective in
public relations publications and presentations or proposals to librarians,
faculty, university administrators, and funders. Several DLF respondents
observed that a few well-articulated comments from users in conjunction
with quantitative data from surveys or transaction log analysis can
help make a persuasive case for changing library practice, receiving
additional funding, or developing new services or tools.
2.2.3. How Do Libraries Conduct Focus Groups?
DLF respondents reported conducting focus groups periodically. Questions
asked in focus groups, unlike those included in surveys, are not
repeated; they are not expected to serve as a basis for assessing
trends over time. The decision to convene a focus group appears to
be influenced by the organization of the library and the significance
or financial implications of the decision to be informed by the focus
group data. For example, in a library with an established usability
program or embedded culture of assessment (including a budget and
in-house expertise), a unit head can initiate focus group research.
If the library must decide whether to purchase an expensive product
or undertake a major project that will require the efforts of personnel
throughout the organization, a larger group of people might be involved
in sanctioning and planning the research and in approving the expenditure
to conduct it.
Once the decision has been made to conduct focus groups, one or
more librarians or staff prepare the interview questions, identify
the demographic groups appropriate for the research purpose, determine
how many focus groups to conduct, decide how to recruit participants,
and plan the budget and timetable for gathering, analyzing, interpreting,
and applying the data.
Focus group questions should be pilot tested with a group of users
and revised on the basis of the test results to solve problems with
vocabulary, wording, or the sequence of questions, and to ensure
that the questions can be discussed in the allotted time. However,
few DLF respondents reported testing focus group questions. More
likely, the questions are simply reviewed by other librarians and
staff before conducting the study. Questions are omitted or reorganized
during the initial focus group session, on the basis of time constraints
and the flow of the conversation. The revised list of questions is
used in subsequent focus groups.
DLF libraries have used e-mail, posters, and flyers to recruit participants
for focus group studies. The invitations to prospective participants
briefly describe the goals and significance of the study, the participants'
role in the study, what is expected of them, how long the groups
will last, and any token of appreciation that will be given to the
participants. Typically, focus groups are scheduled for 60 to 90
minutes. If food is provided during the focus group, a 90-minute
session is preferred. When efforts fail to recruit at least six participants
for a group, some libraries have conducted individual interviews
with the people they did recruit.
In addition to preparing interview questions and recruiting and
scheduling participants, focus group preparation entails the following:
- Recruiting, scheduling, and training a moderator and observer(s)
  for each focus group
- Scheduling six to twelve (preferably seven to ten) participants in
  designated demographic groups, and sending them a reminder a week
  or a few days before the focus group
- Scheduling an appropriate room for each focus group. DLF respondents
  offered the following cautions:
  - Make sure that the participants can easily find the room. Put up
    signs if necessary.
  - Beware of construction or renovation nearby, the sound of heating
    or air-conditioning equipment, and regularly scheduled noise
    makers (for example, a university marching band practice on the
    lawn outside).
  - Ensure that there are sufficient chairs in the room to comfortably
    seat the participants, moderator, and observer(s) around a
    conference table.
  - If handouts are to be distributed, for example, for participants
    to comment on different interface designs, be sure that the table
    is large enough to spread out the documents.
- Ordering food if applicable
- Photocopying the focus group questions for the moderator and
  observer(s)
- Testing the audio- or videotape equipment and purchasing tapes
The focus group moderator or an observer typically arrives at the
room early, adjusts the light and temperature in the room, arranges
the chairs, and retests and positions the recording equipment. If
audiotape is used, a towel or tablet is placed under the recording
device to absorb any table vibrations. When the participants arrive,
the moderator thanks them for participating, introduces and explains
the roles of moderator and observer, reiterates the purpose and significance
of the research, confirms that their anonymity will be preserved
in any discussion or publication of the study, and briefly describes
the ground rules and how the focus group will be conducted. The introductory
remarks emphasize that the goal of the study is not for the participants
to reach consensus, but to express their opinions and share their
experiences and concerns. Disagreement and discussion are invited.
Sometimes the first question is asked round-robin, so that each participant
responds and gets comfortable talking. Subsequent questions are answered
less formally, more conversationally. The moderator asks the prepared
questions and may ask undocumented, probing questions or invite further
comments to better understand what the participants are saying and
test relevant hypotheses that surface during the discussion. For
example, "Would you explain that further?" or "Please give me an
example." The moderator uses verbal and body language to invite comments
from shy or quiet participants and to discourage domineering individuals
from turning dialogue into monologue. If participants ask questions
unrelated to the research purpose, the moderator indicates that
the question is outside the scope of the topic under discussion,
but that he or she will be happy to answer it after the focus group
is completed. Observers have no speaking roles.
When the focus group is over, the moderator thanks the participants
and might give them a token of appreciation for their participation.
The moderator may also answer any questions the participants have
about the study, the service or product that was the focus of the
study, or the library in general. Observer notes and tapes are labeled
immediately with the date and number of the session.
Libraries might or might not transcribe the focus group tapes. Some
libraries believe the cost of transcribing exceeds the benefits of
having a full transcription. One DLF respondent explained that clerical
help is typically unfamiliar with the vocabulary or acronyms used
by focus group participants and therefore cannot accurately transcribe
the tapes. This means that a professional must also listen to the
tapes and correct the transcriptions, which significantly increases
the cost of the study. When the tapes are transcribed, a few libraries
have used content analysis software to analyze the transcriptions,
but they have not been pleased with the results, perhaps because
the software attempts to conduct a quantitative analysis of qualitative
data. Even when the tapes are not transcribed, at least one person
listens to them carefully and annotates the notes taken by observers.
Analysis of focus group data is driven by the research purpose.
Ideally, at least two people analyze the data (the moderator
and observer) and there is high interrater reliability. With
one exception, DLF respondents did not discuss the process of analyzing
focus group data in detail. They talked primarily about their research
purpose, what they learned, and how they applied the results. Participants
who mentioned a specific method of data analysis named content analysis,
but they neither described how they went about it nor specified who
analyzed the data. No one offered an interrater reliability factor.
Only one person provided details about the data analysis and interpretation.
This person explained that the moderator analyzed the focus group
data by using content analysis to cluster similar concepts, examining
the context in which these concepts occurred, looking for changes
in the focus group participants' position based on the discussion,
weighting responses based on the specificity of the participants'
experience, and looking for trends or ideas that cut across one or
more focus group discussions. The overall impression from the DLF
survey is that focus group data are somehow examined by question
and user group to identify issues, problems, preferences, priorities,
and concepts that surface in the data. The analyst prepares a written
summary of significant findings from each focus group session, with
illustrative examples or quotations from the raw data. The summaries
are examined to discern significant differences among the groups
or to determine whether the data support or do not support hypotheses
being tested.
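No DLF respondent reported an interrater reliability figure, so the following is only a minimal sketch of how one could be computed once two analysts have independently coded the same comments. It uses Cohen's kappa, one common agreement statistic for two raters; the category names, the sample codes, and the function name cohens_kappa are hypothetical and chosen for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assign one category per item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty lists of codes")
    n = len(rater_a)
    # Observed agreement: share of items given the same code by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: for each category, multiply the two raters'
    # marginal proportions, then sum over categories.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned to the same ten participant comments.
moderator = ["navigation", "vocabulary", "navigation", "content", "navigation",
             "vocabulary", "content", "navigation", "vocabulary", "content"]
observer = ["navigation", "vocabulary", "content", "content", "navigation",
            "vocabulary", "content", "navigation", "navigation", "content"]

kappa = cohens_kappa(moderator, observer)
print(f"kappa = {kappa:.2f}")
```

Kappa is 1 when the two analysts agree on every comment and near 0 when they agree no more often than chance would predict. Agreeing on the code categories before the sessions makes a comparison of this kind possible.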
2.2.4. Who Uses Focus Group Results? How Are They Used?
Decisions as to who applies the results of focus group research
and how it is applied depend on the purpose of the research, the
significance of the findings, and the organization of the library.
For example, the results of focus groups conducted to inform redesign
of the library Web site were presented to the Web Redesign Committee.
The results of focus groups conducted to assess the need for and
use of electronic resources were presented to the Digital Library
Initiatives Department. The larger the study, the more attention
it seems to draw. Striking or significant results come to the attention
of library administrators, especially if potential next steps have
financial or operational implications or require interdepartmental
cooperation. For example, if the focus group results indicate that
customer service training is required or that facilities must be
improved to increase user satisfaction, the administrator should
be informed. Focus groups provide excellent quotations in support
of cases being presented to university administrators, faculty senates,
and deans' councils to gain support for changing library directions
or receiving additional funding. The results are also presented at
conferences and published in the library literature.
The results of the DLF study indicate that focus group data have
been used to
- Clarify or explain factors influencing survey responses, for
  example, to discover reasons for undergraduate students' declining
  satisfaction with the library
- Determine questions to ask in survey questionnaires, tasks to be
  performed in protocols, and the vocabulary to use in these
  instruments
- Identify user problems and preferences related to collection format
  and system design and functionality
- Confirm hypotheses that user expectations and perceived needs for a
  library Web site differ across discipline and user status
- Confirm user needs for more and better library instruction
- Confirm that faculty are concerned that students cannot judge the
  quality of resources available on the Web and do not appreciate the
  role of librarians in selecting quality materials
- Target areas for fundraising
- Identify ways to address concerns in grant-funded projects
In addition, results from focus group research have been used to
inform processes that resulted in
- Canceling journal subscriptions
- Providing needed information to faculty
- Redesigning the library Web site, OPAC, or other user interface
- Providing personalized Web pages for library users
- Sending librarians and staff to customer service training
- Eliminating a high-maintenance method of access to e-journals
- Planning the direction and development priorities for the digital
  library, including the scope, design, and functionality of digital
  library services
- Planning and allocating resources to market library collections and
  services continuously
- Creating a Distance Education Department to integrate distance
  learning with library services
- Renovating library facilities
2.2.5. What Are the Issues, Problems, and Challenges with Focus
Groups?
2.2.5.1. Unskilled Moderators and Observers
If the moderator of a focus group is not well trained or has a vested
interest in the research results, the discussion can easily go astray.
Without proper facilitation, some individuals can dominate the conversation,
while others may not get the opportunity to share their views. Faculty
in particular can be problematic subjects. They frequently have their
own agendas and will not directly answer the focus group questions.
A skilled, objective moderator equipped with the rhetorical strategies
and ability to keep the discussion on track, curtail domineering
or rambling individuals, and bring in reticent participants is a
basic requirement for a successful focus group.
Similarly, poor observer notes can hinder the success of a focus
group. If observers do not know what comments or behaviors to observe
and record, the data will be difficult, if not impossible, to analyze
and interpret. The situation worsens if several observers attend
different focus group sessions and record different kinds of things.
Decisions should be made before conducting the focus groups to ensure
that similar behaviors are observed and recorded during each focus
group session. The following list can serve as a starting point for
this discussion (Marczak and Sewell).
- Characteristics of the focus group participants
- Descriptive phrases or words used by participants in response to
  the key questions
- Themes in the responses to the key questions
- Subthemes held by participants with common characteristics
- Indications of participant enthusiasm or lack of enthusiasm
- Consistency or inconsistency between participant comments and
  observed behaviors
- Body language
- The mood of the discussion
- Suggestions for revising, eliminating, or adding questions in the
  future
2.2.5.2. Interpreting and Using the Data
A shared system of categories for recording observations will simplify
the analysis and interpretation of focus group data. No DLF respondent
mentioned establishing such a system before conducting a focus group
study. Imposing a system after the data have been gathered significantly
complicates interpreting the findings. The difficulty of interpreting
qualitative data from a focus group study can lead to disagreement
about the interpretation and delay preparation of the results. The
limited number of participants in a typical focus group study, and
the degree to which they are perceived to be representative of the
target population, exacerbate the difficulty of interpreting and
applying the results. The greater the time lapse between gathering
the data and developing plans to use the data, the greater the risk
of loss of momentum and abandonment of the study. The results of
the DLF study suggest that the problem worsens if the results are
presented to a large group within the library and if the recommended
next steps are unpopular with or counterintuitive to librarians.
2.3. User Protocols
2.3.1. What Is a User Protocol?
A user protocol is a structured, exploratory observation of clearly
defined aspects of the behavior of an individual performing one or
more designated tasks. The purpose of the protocol is to gather in-depth
insight into the behavior and experience of a person using a particular
tool or product. User protocol studies include multiple research
subjects to identify trends or patterns of behavior and experience.
Data gathered from protocols provide insight into what different
individuals do or want to do to perform specific tasks.
Protocol studies usually take 60 to 90 minutes per participant.
The protocol is guided by a list of five to ten tasks (the "task
script") that individuals are expected to perform. Each participant
is asked to think aloud while performing the designated tasks. The
task script is worded in a way that tells the user what tasks
to accomplish (for example, "Find all the books in the library catalog
published by author Walter J. Ong before 1970"), but not how to
accomplish the tasks using the particular tool or product involved
in the study. Discovering whether or how participants accomplish
the task is a typical goal of protocol research. A facilitator encourages
the participants to think aloud if they fall silent. The facilitator
may clarify what task is to be performed, but not how to perform
it.
The participant's think-aloud protocol is audio- or videotaped,
and one or two observers take notes of his or her behavior. Some
researchers prefer audiotape because it is less obtrusive. Experts
in human-computer interaction (HCI) prefer videotape. In HCI studies,
software can be used to capture participant keystrokes.
Protocols are very strict about the observational data to be collected.
Before the study, the protocol author designates the specific user
comments, actions, and other behaviors that observers are to record.
The observers' notes should be so complete that they can substitute
for the audiotape, should the system fail. In HCI studies, observer
notes should capture the participant's body language, selections
from software menus or Web pages, what the user apparently does or
does not see or understand in the user interface, and, depending
on the research goals, the speed and success (or failure) of task
completion. Employing observers who understand heuristic principles
of good design facilitates understanding the problems users encounter,
and therefore the recording of what is observed and interpretation
of the data.
User protocols are an effective method to identify usability problems
in the design of a particular product or tool, and often the data
provide sufficient information to enable the problems identified
to be solved. These protocols are less useful to identify what works
especially well in a design. Protocols can reveal the participant's
mental model of a task or the tool that he or she is using to perform
the task. Protocols enable the behavior to be recorded as it occurs
and do not rely on the participants' memories of their behaviors,
which can be faulty. Protocols provide accurate descriptions of situations
and, unlike surveys, can be used to test causal hypotheses. Protocols
also provide insights that can be tested with other research methods
and supplementary data to qualify or help interpret data from other
studies.
For protocols to be effective, participants must understand the
goals of the study, appreciate their role in the study, and know
what is expected of them. The selection of participants should be
based on the research purpose and an understanding of the target
population. Facilitators and observers must be impartial and refrain
from providing assistance to struggling or frustrated participants.
However, a limit can be set on how much time participants may spend
trying to complete a task, and facilitators can encourage participants
to move to the next task if the time limit is exceeded. Without a
time limit, participants can become so frustrated trying to complete
a task that they abandon the study. In HCI studies, it is essential
that the participants understand it is the software that is
being tested, not their skill in using it.
The primary disadvantage of user protocols is that they are expensive.
Protocols require at least an hour per participant, and the results
apply only to the particular product or tool being tested. In addition,
protocol data can be difficult to evaluate, depending on whether
the research focuses on gathering qualitative information (for example,
the level of participant frustration) or quantitative metrics (for
example, success rate and speed of completion). The small number
of participants and frequent use of convenience sampling limit the
ability to generalize the results of protocol studies to groups with
different demographic characteristics or to other products or tools.
Furthermore, protocols suffer from the built-in limitations of human
sensory perception and language, which affect what the facilitator
and observer(s) see and hear and how they interpret and record it.
2.3.2. Why Do Libraries Conduct User Protocols?
Half of the DLF respondents reported conducting or planning to
conduct user protocols. With rare exception, libraries appear to
view think-aloud protocols as the premier research method for assessing
the usability of OPACs, Web pages, local digital collections, and
vendor products. Protocol studies are often precipitated or informed
by the results of previous research. For example, focus groups, surveys,
and heuristic evaluations can identify frequently performed or suspected
problematic tasks to be included in protocol research. (Heuristic
evaluations are discussed in section 2.4.1.1.)
Libraries participating in the DLF study have conducted think-aloud
protocols to
- Identify problems in the design, functionality, navigation, and
  vocabulary of the library Web site or user interfaces to different
  products or digital collections
- Assess whether efforts to improve service quality were successful
- Determine what information to include in a Frequently Asked
  Questions (FAQ) database and the design of access points for the
  database
One DLF respondent reported plans to conduct a protocol study of
remote storage robotics.
2.3.3. How Do Libraries Conduct User Protocols?
DLF respondents reported conducting user protocols when the results
of previous research or substantial anecdotal evidence indicated
that there were serious problems with a user interface or when a
user interface was being developed as part of a grant-funded project,
in which case the protocol study is described in the grant proposal.
When protocols are conducted to identify problems in a user interface,
often they are repeated later, to see whether the problems were solved
in the meantime. In the absence of an established usability-testing
program and budget, the decision to conduct protocols can involve
a large group of people because of the time and expense of conducting
such research.
After the decision has been made to conduct user protocols, one
or more librarians or staff members prepare the task script, choose
the sampling method, identify the demographic groups appropriate
for the research purpose, determine how many participants to recruit
in each group, decide how to recruit them, recruit and schedule the
participants, and plan the budget and timetable for gathering, analyzing,
interpreting, and applying the data. Jakob Nielsen's research has
shown that four to six subjects per demographic group are sufficient
to capture most of the information that could be discovered by involving
more subjects. Beyond this number, the cost exceeds the benefits
of conducting more protocols (Nielsen 2000). Sometimes protocols
are conducted with only two or three subjects per user group because
of the difficulty of recruiting research subjects.
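The diminishing return from adding subjects can be illustrated with the model Nielsen uses, in which the expected share of usability problems found by n participants is 1 - (1 - L)^n. The sketch below assumes L = 0.31, the average single-participant detection rate Nielsen reports; the rate for a particular library interface or user group may differ, and the function name is hypothetical.

```python
def share_of_problems_found(n_participants, detection_rate=0.31):
    """Expected share of problems seen at least once: 1 - (1 - L)^n.

    detection_rate (L) is the chance that a single participant reveals
    a given problem. The default of 0.31 is an assumed average, not a
    value measured in the DLF study.
    """
    if n_participants < 0 or not 0 <= detection_rate <= 1:
        raise ValueError("n_participants must be >= 0 and 0 <= rate <= 1")
    return 1 - (1 - detection_rate) ** n_participants


# Print the expected coverage for one to eight participants.
for n in range(1, 9):
    print(f"{n} participants: {share_of_problems_found(n):.0%}")
```

Under this assumption each additional participant mostly repeats problems that earlier participants already revealed, which is the reasoning behind stopping at a handful of subjects per group.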
DLF libraries immediately follow user protocol sessions with a brief
survey or interview to gather additional information from each participant.
This information helps clarify the user's behavior and provides some
sense of the user's perception of the severity of the problems encountered
with the user interface. One or more people prepare the survey or
interview questions. In addition, some libraries prepare a recording
sheet that observers use to structure their observations and simplify
data analysis. Some also prepare a written facilitator guide that
outlines the entire session.
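As an illustration of how a recording sheet can simplify analysis, the sketch below treats each row of a sheet as one participant's attempt at one task and summarizes the success rate and median completion time per task. The field names, the sample rows, and the 300-second time limit implied by them are hypothetical; no DLF respondent described a specific sheet layout.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class TaskObservation:
    participant: str   # anonymized label, for example "P1"
    task: int          # task number from the task script
    completed: bool    # whether the participant finished the task
    seconds: float     # time spent on the task
    notes: str = ""    # observer's free-text remarks


def summarize(observations):
    """Return {task: (success_rate, median_seconds_of_completed_or_None)}."""
    by_task = {}
    for obs in observations:
        by_task.setdefault(obs.task, []).append(obs)
    summary = {}
    for task, rows in sorted(by_task.items()):
        # Only completed attempts count toward the completion time.
        times = [row.seconds for row in rows if row.completed]
        success_rate = len(times) / len(rows)
        summary[task] = (success_rate, median(times) if times else None)
    return summary


# Hypothetical observations from three participants on two tasks.
sheet = [
    TaskObservation("P1", 1, True, 95, "used author search"),
    TaskObservation("P2", 1, False, 300, "reached time limit"),
    TaskObservation("P3", 1, True, 140),
    TaskObservation("P1", 2, True, 60),
    TaskObservation("P2", 2, True, 80),
    TaskObservation("P3", 2, False, 300, "did not see the link"),
]

summary = summarize(sheet)
for task, (rate, seconds) in summary.items():
    print(f"task {task}: success {rate:.0%}, median time {seconds} s")
```

Recording the same fields for every participant keeps the quantitative measures comparable across sessions, while the notes column preserves the qualitative observations.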
DLF libraries pilot test the research instruments with at least
one user and revise them on the basis of the test results. Pilot
testing can help solve problems with the vocabulary, wording, or
sequencing of protocol tasks or survey questions; it also can target
ways to refine the recording sheet to facilitate rapid recording
of observations. Pilot testing also enables the researcher to ensure
that the protocol and follow-up research can be completed in the
time allotted.
DLF libraries have used e-mail, posters, and flyers to recruit
participants for user protocol studies. The recruitment information
briefly describes the goals and significance of the research, the
participants' role, and what is expected of them, including the time
it will take to participate and any token of appreciation that will
be given to the participants. Other than preparing the instruments
and recruiting participants, preparation for a user protocol study
closely resembles preparation for a focus group. It involves the
following steps:
- Recruiting, scheduling, and training a facilitator and one or more
  observers; in some cases, the facilitator is the sole observer
- Scheduling the participants and sending them a reminder a week or a
  few days before the protocol
- Scheduling a quiet room; protocol studies have been conducted in
  offices, laboratories, or library settings
- If necessary, ordering computer or videotape equipment to be
  delivered a half hour before the protocol is to begin
- Photocopying the research instruments
- Testing the audio- or videotape equipment and purchasing tapes
The facilitator or an observer arrives at the room early, adjusts
the light and temperature in the room, arranges the chairs so that
the facilitator and observers can see the user's face and the computer
screen, and tests and positions the recording equipment. If audiotape
is used, a towel or tablet is placed under the recording device to
absorb any table vibrations. The audiotape recorder is positioned
close enough to the user to pick up his or her comments, but far
enough away from the keyboard to avoid capturing each key click.
If computer or videotape equipment must be delivered to the room,
someone must arrive at the room extra early to confirm delivery,
be prepared to call if it is not delivered, test the computer equipment,
and allow time for replacement or software reinstallation if something
is not working.
Though HCI experts recommend videotape, all but one of the DLF libraries
reported using audiotape to record user protocols. The library that
used videotape observed that the camera made users uncomfortable
and the computer screen did not record well, so the group used audiotape
instead for the follow-up protocols. Few DLF libraries have the resources
or facilities to videotape their research, and the added expense
of acquiring these might also be a deterrent to using videotape.
When participants arrive, the facilitator thanks them for participating,
explains the roles of facilitator and observer(s), reiterates the
purpose and significance of the research, confirms that anonymity
will be preserved in any discussion or publication of the study,
and describes the ground rules and how the protocol will be conducted.
The facilitator emphasizes that the goal of the study is to test
the software, not the user. The facilitator usually reminds participants
multiple times to think aloud, for example by asking, "What are you
thinking now?" or saying, "Please share your thoughts." Observers have
no speaking role.
DLF libraries immediately followed protocol sessions with brief
interviews or a short survey to capture additional information and
give participants the opportunity to clarify what they did in the
protocol, describe their experience, and articulate expectations
they had about the task or the user interface that were not met.
Protocol research is sometimes followed up with focus groups or surveys
to confirm the findings with a larger sample of the target population.
When the protocol is over, the facilitator thanks the participant
and usually gives him or her a token of appreciation. The facilitator
also answers any questions the participant has. Observer notes and
tapes are labeled immediately.
DLF libraries might or might not transcribe protocol tapes for
the same reasons they do or do not transcribe focus group tapes.
If the tapes are not transcribed, at least one person listens to
them and annotates the observer notes. With two exceptions, DLF respondents
did not discuss the process of analyzing, interpreting, and figuring
out how to apply the protocol results, although several did mention
using quantitative metrics. They simply talked about significant
applications of the results. The two cases that outlined procedures
for analyzing, interpreting, and applying results merit examination:
- Case one: The group responsible for conducting the protocol study
  created a table of observations (based on the protocol data),
  interpretations, and accompanying recommendations for interface
  redesign. The recommendations were based on the protocol data
  and the application of Jakob Nielsen's 10 heuristic principles
  of good user interface design (Nielsen, no date). The group assessed
  how easy or difficult it would be to implement each recommendation
  and plotted a continuum of recommendations based on the difficulty,
  cost, and benefit of implementing them. The cost-effective
  recommendations were implemented.
- Case two: When protocol data identified many problems and yielded
  a high failure rate for task completion, the group responsible
  for the study did the following:
  - Determined the severity of each problem on the basis of its
    frequency and distribution across users, whether it prevented
    users from successfully completing a task, and the user's
    assessment of the severity of the problem, which was gathered
    in a follow-up survey.
  - Formulated alternative potential solutions to the most severe
    problems on the basis of the protocol or follow-up survey data
    and heuristic principles of good design.
  - Winnowed the list of possible solutions by consulting programmers
    and doing a quick-and-dirty cost-benefit analysis. Problems
    that can be fixed at the interface level are often less expensive
    to fix than those that require changes in the infrastructure.
  - Recommended implementing the solutions believed to have the
    greatest benefit to users for the least amount of effort and
    expense.
The procedures in the two cases are similar, and although the other
DLF respondents did not describe the process they followed, it could
be that their processes resemble these. At least one other respondent
reported ranking the severity of problems identified by protocol
analysis to determine which problems to try to solve.
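The severity ranking and quick cost-benefit winnowing described in the
two cases can be sketched in a few lines of code. The following is only
an illustration under stated assumptions: the equal weighting of
frequency, task blocking, and user-rated severity, the 1-to-5 rating
scale, the cost units, and the sample problems are all hypothetical and
are not taken from any DLF library's procedure.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    name: str
    users_affected: int   # participants who encountered the problem
    blocked_task: bool    # did it prevent task completion?
    user_severity: float  # mean follow-up survey rating, 1 (minor) to 5 (severe)
    fix_cost: float       # rough effort estimate from programmers (arbitrary units)


def severity(p: Problem, total_users: int) -> float:
    """Combine the three severity factors into one score between 0 and 1.

    Equal weighting is an assumption made for this sketch.
    """
    frequency = p.users_affected / total_users
    blocking = 1.0 if p.blocked_task else 0.0
    rated = (p.user_severity - 1) / 4  # rescale 1..5 to 0..1
    return (frequency + blocking + rated) / 3


def prioritize(problems, total_users):
    """Order problems by severity per unit of cost, best value first."""
    return sorted(
        problems,
        key=lambda p: severity(p, total_users) / p.fix_cost,
        reverse=True,
    )


# Hypothetical findings from a protocol study with eight participants.
problems = [
    Problem("Unclear link label", 7, False, 3.0, 1.0),
    Problem("Search fails silently", 5, True, 4.5, 8.0),
    Problem("Hidden 'new search' button", 6, True, 4.0, 2.0),
]
ranked = prioritize(problems, total_users=8)
for p in ranked:
    print(f"{p.name}: severity={severity(p, 8):.2f}, cost={p.fix_cost}")
```

Dividing severity by cost mirrors the tendency noted in the text for
inexpensive interface-level fixes to be implemented first; a library
that wants to avoid neglecting severe but costly problems could rank on
severity alone and treat cost as a separate column.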
2.3.4. Who Uses Protocol Results? How Are They Used?
The results of the study suggest that who applies the results from
user protocols and how the results are applied depend on the purpose
of the research, the significance of the findings, and the organization
of the library. The larger the study and the more striking its implications
for financial and human resources, the more attention it draws in
the library. Although the results of protocol studies are not always
presented to university administrators, faculty senates, deans' councils,
and similar groups, they might be presented at conferences and published
in the library literature.
DLF libraries have used significant findings from protocol analysis
to inform processes that resulted in the following:
- Customizing the OPAC interface, or redesigning the library Web site
  or user interfaces to local digital collections. Examples of steps
  taken based on protocol results include
  - rearranging a hierarchy
  - changing the order and presentation of search results
  - changing the vocabulary, placement of links, or page layout
  - providing more online help, on-screen instructions, or suggestions
    when searches fail
  - changing the labeling of images
  - changing how to select a database or start a new search
  - improving navigation
  - enhancing functionality
- Revising the metadata classification scheme for image or text
  collections
- Developing or revising instruction for how to find resources on the
  library Web site and how to use full-text e-resources and archival
  finding aids
The results of protocol studies have also been used to suggest revisions
or enhancements to vendor products, to verify improvements in interface
design and functionality, and to counter anecdotal evidence or suggestions
that an interface should be changed.
2.3.5. What Are the Issues, Problems, and Challenges With User
Protocols?
2.3.5.1. Librarian Assumptions and Preferences
Several DLF respondents commented that librarians can find it difficult
to observe user protocols because they often have assumptions about
user behavior or preferences for interface design that are challenged
by what they witness. Watching struggling or frustrated participants
and refraining from providing assistance run counter to the librarians'
service orientation. Participants often ask questions during the
protocol about the software, the user interface, or how to use it.
Facilitators and observers must resist providing answers during the
protocol. Librarians who are unable to do this defeat the purpose
of the research.
Librarians can also be a problem when it comes to interpreting and
applying the results of user protocols. Those trained in social science
research methods often do not understand or appreciate the difference
between HCI user protocols and more rigorous statistical research.
They may dismiss results that challenge their own way of thinking
because they believe the research method is not scientific enough
or the pool of participants is too small.
2.3.5.2. Lack of Resources and Commitment
User protocols require skilled facilitators, observers, and analysts
and the commitment of human and financial resources. Requisite skills
might be lacking to analyze, interpret, and persuasively present
the findings. Even if the skills are available, there could be a
breakdown in the processes of collecting, analyzing, and interpreting
the data, planning how to use the findings, and implementing the
plans, which could include conducting follow-up research to gather
more information. Often the process is followed to the last stage,
implementation, where Web masters, programmers, systems specialists,
or other personnel are needed. These people can have other priorities.
Human and financial resources or momentum can be depleted before
all the serious problems identified have been solved. Limited resources
frequently restrict implementation to only the problems that are
cheap and easy to fix, which are typically those that appear on the
surface of the user interface. Problems that must be addressed in
the underlying architecture often are not addressed.
2.3.5.3. Interpreting and Using the Data
Effective, efficient analysis of data gathered in user protocols
depends on making key decisions ahead of time about what behaviors
to observe and how to record them. For example, if quantitative usability
metrics are to be used, they must be carefully defined. If the success
rate is to be calculated, what constitutes success? Is it more than
simply completing a task within a set time limit? What constitutes
partial success, and how is it to be calculated? Similar questions
should be posed and answers devised for qualitative data gathering
during the protocols. Otherwise, observer notes are chaotic and data
analysis may be as difficult as analyzing the responses to open-ended
questions in a survey. The situation worsens if different observers
attend different protocols and record different kinds of things.
Such key decisions should be made prior to conducting the study.
If made afterward, they can result in significant lag time between
data gathering and the presentation of plans to apply the results
of the data analysis. The greater the lag time, the greater the risk
of loss of momentum, which can jeopardize the entire effort.
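To illustrate what deciding on a metric ahead of time might look like,
the sketch below fixes one possible definition of success, partial
success, and failure before any data are gathered. The rules used here
(a task finished within the time limit counts as success; a task
finished late or only partly accomplished counts as partial success,
worth half credit) are assumptions chosen for the example, not
definitions reported by DLF respondents.

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    PARTIAL = "partial"
    FAILURE = "failure"


# Assumption for this sketch: a partial success earns half credit.
PARTIAL_CREDIT = 0.5


def classify(completed: bool, reached_partial_goal: bool,
             seconds: float, time_limit: float) -> Outcome:
    """Apply the success definition agreed on before the study."""
    if completed and seconds <= time_limit:
        return Outcome.SUCCESS
    if completed or reached_partial_goal:
        return Outcome.PARTIAL
    return Outcome.FAILURE


def success_rate(outcomes) -> float:
    """Average credit across task attempts, counting partial successes."""
    if not outcomes:
        raise ValueError("no task attempts recorded")
    credit = {
        Outcome.SUCCESS: 1.0,
        Outcome.PARTIAL: PARTIAL_CREDIT,
        Outcome.FAILURE: 0.0,
    }
    return sum(credit[o] for o in outcomes) / len(outcomes)


# Hypothetical attempts at one task with a 120-second limit.
attempts = [
    classify(True, True, 95, 120),     # finished in time
    classify(True, True, 150, 120),    # finished, but over the limit
    classify(False, True, 120, 120),   # got partway
    classify(False, False, 120, 120),  # made no progress
]
rate = success_rate(attempts)
print(f"Success rate: {rate:.0%}")
```

Because every observer applies the same classification rule, the
recording sheet needs only three facts per task (completed, partial goal
reached, elapsed time), which keeps notes comparable across sessions.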
2.3.5.4. Recruiting Participants Who Can Think Aloud
General problems and strategies for recruiting research subjects
are discussed in section 4.2.1. DLF respondents reported difficulty
in getting participants to think aloud. At least one librarian is
considering conducting screening tests to ensure that protocol participants
can think aloud. Enhancing the skills of the facilitator (through
training or experience) and including a pretest task or two for the
participants to get comfortable thinking aloud would be preferable
to risking biasing the results of the study by recruiting only participants
who are naturally comfortable thinking aloud.
2.4. Other Effective Research Methods
2.4.1. Discount Usability Research Methods
Discount usability research can be conducted to supplement more
expensive usability studies. This informal research can be done at
any point in the development cycle, but is most beneficial in the
early stages of designing a user interface or Web site. When done
at this time, the results of discount usability research can solve
many problems and increase the efficiency of more formal testing
by targeting specific issues and reducing the volume of data gathered.
Discount usability research methods are not replacements for formal
testing with users, but they are fruitful, inexpensive ways to improve
interface design. In spite of these merits, few DLF libraries reported
using discount methods. Are leading digital libraries not using these
research methods because they are unaware of them or because they
do not have the skills to use them?
2.4.1.1. Heuristic Evaluations
Heuristic evaluation is a critical inspection of a user interface
conducted by applying a set of design principles as part of an iterative
design process.4 The principles are not a checklist, but conceptual
categories or rules
that describe common properties of a usable interface and guide close
scrutiny of an interface to identify where it does not comply with
the rules. Several DLF respondents referred to Nielsen's heuristic
principles of good design, mentioning the following:
- Visibility of system status
- Match between system and real world
- User control and freedom
- Consistency and standards
- Recognition rather than recall
- Flexibility and efficiency of use
- Aesthetics and minimalist design
- Error prevention
- Assistance with recognizing, diagnosing, and recovering from errors
- Help and documentation5
Heuristic evaluations can be conducted before or after formal usability
studies involving users. They can be conducted with functioning interfaces
or with paper prototypes (see section 2.4.1.2.). Applying heuristic
principles to a user interface requires skilled evaluators. Nielsen
recommends using three to five evaluators, including someone with
design expertise and someone with expertise in the domain of the
system being evaluated. According to his research, a single evaluator
can identify 35 percent of the design problems in the user interface.
Five evaluators can find 75 percent of the problems. Using more than
five evaluators can find more problems, but at this point the cost
exceeds the benefits (Nielsen 1994).
Heuristic evaluations take one to two hours per evaluator. The evaluators
should work independently but share their results. An evaluator can
record his or her own observations, or an observer may record the
observations made by the evaluator. Evaluators follow a list of tasks
that, unlike the task script in a user protocol, may indicate how
to perform the tasks. The outcome from a heuristic evaluation is
a compiled list of each evaluator's observations of instances where
the user interface does not comply with good design principles. To
guide formulating solutions to the problems, each problem identified
is accompanied by a list of the design principles that are violated
in this area of the user interface.
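The compiled list described above can be assembled mechanically once
each evaluator has worked independently. The sketch below is one
possible way to do so; the report structure, the evaluator names, and
the rule that two observations describe the same problem only when the
interface area and description match exactly are assumptions made for
illustration. In practice, someone would have to reconcile differently
worded descriptions by hand.

```python
from collections import defaultdict


def compile_findings(reports):
    """Merge independent evaluator reports into one list of problems.

    `reports` maps an evaluator's name to a list of
    (interface_area, description, violated_principle) tuples.
    Each compiled problem lists every principle cited for it and every
    evaluator who reported it; problems reported by more evaluators
    come first.
    """
    merged = defaultdict(lambda: {"principles": set(), "evaluators": set()})
    for evaluator, observations in reports.items():
        for area, description, principle in observations:
            entry = merged[(area, description)]
            entry["principles"].add(principle)
            entry["evaluators"].add(evaluator)

    compiled = [
        {
            "area": area,
            "problem": description,
            "principles": sorted(entry["principles"]),
            "evaluators": sorted(entry["evaluators"]),
        }
        for (area, description), entry in merged.items()
    ]
    compiled.sort(
        key=lambda row: (-len(row["evaluators"]), row["area"], row["problem"])
    )
    return compiled


# Hypothetical reports from two evaluators.
reports = {
    "Evaluator A": [
        ("Search results", "No message when a search returns nothing",
         "Visibility of system status"),
        ("Home page", "Jargon in link labels",
         "Match between system and real world"),
    ],
    "Evaluator B": [
        ("Search results", "No message when a search returns nothing",
         "Assistance with recognizing, diagnosing, and recovering from errors"),
    ],
}
compiled = compile_findings(reports)
for row in compiled:
    print(row["area"], "|", row["problem"], "|", "; ".join(row["principles"]))
```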
Heuristic evaluations have several advantages over other methods
for studying user interfaces. No participants need to be recruited.
The method is inexpensive, and applying even a few principles can
yield significant results. The results can be used to expand or clarify
the list of principles. Furthermore, heuristic evaluations are more
comprehensive than think-aloud protocols are, because they can examine
the entire interface and because even the most talkative participant
will not comment on every facet of the interface. The disadvantages
of heuristic evaluations are that they require familiarity with good
design principles and interpretation by an evaluator, do not provide
solutions to the problems they identify, and do not identify mismatches
between the user interface and user expectations. Interface developers
sometimes reject the results of heuristic evaluations because no
users were involved.
A few DLF libraries have conducted their own heuristic evaluations
or have made arrangements with commercial firms or graduate students
to do them. The evaluations were conducted to assess the user-friendliness
of commercially licensed products, the library Web site, and a library
OPAC. In the process, libraries have analyzed such details as the
number of keystrokes and mouse movements required to accomplish tasks
and the size of buttons and links that users must click. The results
of these evaluations were referred to as a "wake-up call" to improve
customer service. It is unclear from the survey whether multiple
evaluators were used in these studies or the study was conducted
in-house, and whether the libraries have the interface design expertise
to apply heuristic principles or conduct a heuristic evaluation effectively.
Nevertheless, several DLF libraries reported using heuristic principles
to guide redesign of a user interface.
2.4.1.2. Paper Prototypes and Scenarios
Paper prototype and scenario research resembles think-aloud protocols,
but instead of having users perform tasks with a functioning system,
this method employs sketches, screen prints, or plain text and asks
users how they would use a prototype interface to perform different
tasks or how they would interpret the vocabulary. For example, where
would they click to find a feature or information? What does a link
label mean? Where should links be placed? Paper prototypes and scenarios
can also be a basis for heuristic evaluations.
Paper prototype and scenario research is portable, inexpensive,
and easy to assemble, provided that the interface is not too complicated.
Paper prototypes do not intimidate users. If the method is used early in
the development cycle, the problems identified can be rectified easily
because the system has not been fully implemented. Paper prototypes
are more effective than surveys at identifying usability, navigation,
functionality, and vocabulary problems. The disadvantage is that
participants interact with paper interfaces differently than they
do with on-screen interfaces; that is, paper gets closer scrutiny.
A few DLF respondents reported using paper prototype research. They
have used it successfully to evaluate link and button labels and
to inform the design of Web sites, digital collection interfaces,
and classification (metadata) schemes. One library used scenarios
of horizontal paper prototypes, which provide a conceptual
map of the entire surface layer of a user interface, and scenarios
of vertical paper prototypes, which cover the full scope of
a feature, such as searching or browsing. This site experimented
with using Post-it™ notes to display menu selections
in a paper prototype study, and accordion-folded papers to imitate
pages that would require scrolling. The notes were effective, but
the accordion folds were awkward.
2.4.2. Card-Sorting Tests
Vocabulary problems can arise in any user study, and they are often
rampant in library Web sites. A few respondents reported conducting
research specifically designed to target or solve vocabulary problems,
including card-sorting studies to determine link labels and
appropriate groupings of links on their Web sites. Card-sorting studies
entail asking individual users to
- Organize note cards containing service or collection descriptions
  into stacks of related information
- Label the stacks of related information
- Label the service and collection descriptions in each stack
Reverse card-sorting exercises have been used to test the labels.
These exercises ask users which category (label) they would use to
find a given service or collection. Alternatively, the researcher can
simply ask users what they would expect to find in each category,
then show them what is in each category and ask them what they would
call the category.
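One simple way to summarize the stacks that individual users produce in
a card-sorting study is to count how often each pair of cards lands in
the same stack. The sketch below shows the idea; the card names and the
three participants' sorts are invented for illustration, and pairwise
counting is a common analysis technique rather than a procedure the DLF
respondents described.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(sorts):
    """Count, for each pair of cards, how many participants grouped them.

    `sorts` holds one entry per participant; each entry is that
    participant's list of stacks, and each stack is a list of card names.
    Pairs are stored in alphabetical order so that (a, b) and (b, a)
    are counted together.
    """
    counts = Counter()
    for stacks in sorts:
        for stack in stacks:
            for pair in combinations(sorted(stack), 2):
                counts[pair] += 1
    return counts


# Hypothetical sorts from three participants.
sorts = [
    [["Catalog", "E-journals"], ["Interlibrary loan", "Renew books"]],
    [["Catalog", "E-journals", "Renew books"], ["Interlibrary loan"]],
    [["Catalog"], ["E-journals"], ["Interlibrary loan", "Renew books"]],
]
pairs = co_occurrence(sorts)
for (first, second), n in sorted(pairs.items()):
    print(f"{first} + {second}: grouped by {n} of {len(sorts)} participants")
```

Pairs grouped by most participants are candidates for sharing a category
on the Web site; pairs that are rarely grouped suggest that the cards
belong under different labels.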
The primary problem encountered in conducting card-sorting tests
is describing the collections and services to be labeled and grouped.
Describing "full-text" e-resources appears to be particularly difficult
in card-sorting exercises, and the results of surveys, focus groups,
and user protocols indicate that users often do not understand what "full-text" means.
Unfortunately, this is the term found on many library Web sites.

1 To give the reader a better understanding of the care with which user
studies must be designed and conducted, sample research instruments
may be viewed at www.clir.org/pubs/reports/pub105/instr.pdf.
2 Much of the information in this section is taken from Chadwick, Bahr,
and Albrecht 1984.
3 Much of the information in this section is taken from Chadwick, Bahr,
and Albrecht 1984.
4 See, for example, Nielsen 1994. Other chapters in the book describe
other usability inspection methods, including cognitive walk-throughs.
5 A brief description of these principles is available in Nielsen, no
date.