Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
Rigorous Evaluation Usability Testing
What is Usability Testing? • Formal and rigorous testing using a structured process • Validate adherence to interaction requirements • “Actual” users who perform realistic and representative tasks • Utilize a functional prototype • Quantitative and qualitative usability measures
Chuckism • Usability Evaluation instead of Usability Testing • Want to put participants at ease • The word test automatically invokes feelings of anxiety • Don’t use the word test on
- Recruiting artifacts (flyers) - Consent forms - Label on door to testing room - first thing the user sees when they get to the evaluation
• Your conversation (“We are evaluating the software and not you.”)
Usability Measures • Ease of learning (learnability)—how fast can a user learn to
accomplish basic tasks? • Ease of remembering (memorability)—can a user remember
enough to be effective the next time? • Efficiency of use—how fast can an experienced user accomplish
tasks? • Error frequency and severity (understandability/comprehensibility) -
how often do users make errors, how serious are they, and how do users recover from them?
• Engaging (Subjective satisfaction)—how much does the user like using the system?
Usability Evaluation - Ethics
Awareness of Regulations • Human Subjects Protocols
• You must be fully aware of the regulations imposed by the various institutions and regulatory bodies that pertain to your experimental design
• Health and well being of subjects • The U.S. Department of Health and Human Services Web site
• http://www.hhs.gov/ohrp/ • Informed consent form – all participant users should
read and sign • Explains the study, risks (if any), and contact name for questions/issues • Written consent to record participants (video and audio) • How you may or may not use their image/voice
Advantages and Limitations of Usability Evaluation • Advantages
• Discover usability issues before deployment • Particularly important for a market driven product • Begin to build user loyalty • Gain knowledge for future releases
• Disadvantages • Artificial context • No guarantee of product acceptance • Result skew if true user demographic missed • May not be the most efficient and cost effective method for usability
evaluation
Comparative • Can be formative or summative
- Formative: what do we need to change vs competition, with which design should we proceed
- Summative: verify we are better than competition • Between your own designs/prototypes • Vs. competition • Methodology
- Compare to pre-determined standard or benchmark - Use performance and preference criteria - Between or within-subjects design
Format of the Test Plan 1. Purpose/goals 2. Research questions (usability requirements) 3. Target Audience 4. Design of the Usability Evaluation 5. Logistics 6. Data Collection Methodology 7. Deliverables Description
1. Purpose • High level reasoning for performing the test at this time • Need sound reasons – NOT… • Everyone else has a testing program • Goal examples from 3.3 of Tullis/Albert
- Comparing products (you vs. competition) (summative) - Comparing designs - Evaluate Information Architecture (IA) - Problem discovery - Evaluate critical feature or product - Impact of change on established user base
2. Research Questions / Usability Requirements • Describes issues/questions that need to be
resolved and focuses the research • Precise, accurate, clear and measurable (or
observable) • Helps you plan your test design Your test results will need to answer these questions • If a test task scenario isn’t related to one of the questions, it
should be removed • If you don’t have a test task scenario to answer one of these,
need to add one
2. Research Questions / Usability Requirements
• Too unfocused and vague - Is the current product usable? - Is product ready for release?
• Better… - Can users successfully install the software using the setup guide
without assistance? - Do the screens reflect the end user’s conceptual model? - Can users perform task xyz in under 3 minutes? - Can users perform task abc without instruction (first-use intuitive.)? - Do users prefer prototype A vs. prototype B? - How quickly & easily can users perform picture upload from a
mobile device?
Activity Outline a first version of your project test plan:
• Define the purpose of the usability evaluation - Which system will you compare your design against?
• Define the usability evaluation goals • Use the Test Plan template
3. Target Audience • Participants should be real users • You don’t need a large sample (8-15 or so) to get good
feedback • Recruit users with the following characteristics:
• Availability • Responsiveness • Objectivity • Diversity – background, experience, responsibility, … • Represent primary user roles
4. Design of the Usability Evaluation • Introductory paragraph
- Layperson’s explanation of how you will conduct the test • Test Design
- Type of study (within-, or between- subjects) - Number of participants - Duration of each session - Number and types of tasks performed by each participant - Matrix design
Test Design: Intro Paragraph Example In this between-subjects test, each participant will sit through a one-hour usability study session. Approximately 15 minutes of each session will be used to explain the session to the participant, review basic background information with the participant, and conduct a pre-test questionnaire. During the middle 30 minutes of the session, participants will use the XYZ software to complete 4 tasks, which the moderator will administer. Each participant will be asked to use the think-aloud protocol while conducting the tasks. The last 15 minutes will be used to conduct the post-test questionnaire and debriefing.
Within-Subjects Test Design • A single group of participants • Essentially comparing each participant to themselves • Compare Designs
- Each participant uses both interfaces - Preferred interface measures
• Repeated measures - Example: Multiple trials of same task, Compare time on task
between trials - Can be used to measure “learnability” of product
• Advantage: fewer participants required • Disadvantage: carryover/transfer of learning
- Mitigate via counterbalancing
2 Products – 1 Group – 1 Set of Tasks
A B
Within-Subjects Test Design
Between-Subjects • Comparing Interfaces
- Randomly assign participants to groups - Randomly assign all participants - Randomly assign males to each group then females to ensure equal distribution between groups - Or balance novice and expert between groups
• Comparing groups - Manually assign to group based upon characteristic - group of experts vs. group of novices
2 Products – 2 Groups – 1 Set of Tasks
A B Between-Subjects Test
Design
Disadvantage of Within-Subjects What if Product A leaves a bad taste? What if participants learned something using Product A that made Product B easier to use? *Carryover/transfer of learning
Think: Taste Test
A B
Counterbalancing • Alternate order of tasks • Alternate order of interface use • Alternate order of both interfaces and tasks
Within-Subjects: Comparing Interfaces *More Typical
Participants P1 A1 B1 A2 B2 A3 B3 A4 B4 P2 B1 A1 B2 A2 B3 A3 B4 A4 P3 A1 B1 A2 B2 A3 B3 A4 B4 P4 B1 A1 B2 A2 B3 A3 B4 A4 P5 A1 B1 A2 B2 A3 B3 A4 B4 P6 B1 A1 B2 A2 B3 A3 B4 A4 P7 A1 B1 A2 B2 A3 B3 A4 B4 P8 B1 A1 B2 A2 B3 A3 B4 A4
Tasks
Tasks An equivalent to task Bn in their respective interfaces
Compare Interfaces
An = Tasks for Interface A Bn = Tasks for Interface B
Session Outline • General description of what will happen during
session • Contains high-level outline with timings
- Pre-session setup (5 minutes) - Intro and informed consent (5 minutes) - Background questionnaire (5 minutes) - Tasks (not a list of tasks) (30 minutes) - Post task questionnaire (5 minutes) - Post-session debrief and questionnaire (10 minutes)
Activity Work on the design of the usability evaluation section
• Define the user tasks • Design the task matrix • Design the session outline • Use the Test Plan template
5. Logistics • Schedule • Location • Resources – people and equipment • Materials, e.g. task scenarios – how will tasks be used in the user
environment?
Constraints on Usability Testing • Time to …
• Design, prepare, and administer the test • Analyze the results
• Financial • Equipment and software • Laboratory time • Recording media • [Participant compensation ]
• Space—to perform the usability test • A dedicated laboratory or room is recommended.
People • Tester roles
• Test project leader, expert • Moderator – interacts with the participant during the test • Data logger / Note taker • [Technician] – operational responsibility
• Optional observers: • Other development team members not involved in the test • Other stakeholders
Test Scripts • Write test scripts – to avoid bias due to inconsistent
moderator-participant interaction • Greet the participant – introductions, set the stage • Preliminary interview – warm-up questions • Provide instructions • Monitor the test – record observations, capture participant’s
impressions and comments • Debrief the participant – wrap-up discussion
• Script test and task execution details • Length and order • Breaks to minimize user fatigue • Intervals between tests • Flexibility for the unexpected
Activity • Work on the Logistics section of your project test plan. • Write test scripts • Fill in Appendix A.
6. Data Collection Methodology (Plan) • Define the measurements – reflect usability goals • Quantitative - objective, measurable
• Performance data - times, error rates, etc. • Subjective ratings, from post test surveys
• Qualitative: subjective • Participant comments, survey answers • Test team comments, observations • Background participant data from user profiles, surveys,
questionnaires • List of problems (known and/or suspected)
6. Data Collection Methodology (During the test)
• Maintain a log or observation check list for each task
• Create a problem list to capture anything that is not covered by the check list
• Note any ideas or theories that occur to you about the problems
• Usability measurements • Critical incident observation – emotional impact
Usability Specification Table
• User role – user category • UX goal – quality measure, e.g., learnability • Measuring instrument – the benchmark task(s) or survey to generate test
data • UX Metric – test measurement values to be collected; e.g., response times • Baseline level – performance of current system if relevant • Target level – minimum value for success • Observed results – measured values
User Role
UX Goal Measuring Instrument
UX Metric
Baseline Level
Target Level
Observed Results
Usability engineering: Our experience and evolution M. Helander Handbook of Human-Computer Interaction , J.A. Whiteside J. Bennett K. Holtzblatt 1988
7. Deliverables: Process the Data • Identify problems (known and/or suspected)
• Severity • Frequency • Errors of omission • Errors of commission
• Prioritize problems • Theorize reasons and solutions • Identify successes and areas of uncertainty
Run A Pilot Test To Rehearse • Be organized • Be presentable for a good first impression
Activity Work on the Data Collection section of your project test plan. Fill in Appendix B and Appendix C.
User Role
UX Goal Measuring Instrument
UX Metric
Baseline Level
Target Level
Observed Results
References • Albert, William, and Thomas Tullis.
Measuring the user experience: collecting, analyzing, and presenting usability metrics. Newnes, 2013.