
Discount Usability Engineering

Marti Hearst (UCB SIMS)
SIMS 213, UI Design & Development
March 2, 1999

Adapted from slide by James Landay

Last Time

UI tools good for testing more developed UI ideas

Two styles of tools
– “Prototyping” tools vs. UI builders

Most ignore the “insides” of the application

Today

Discount Usability Engineering
– How to do Heuristic Evaluation
– Comparison of Techniques

Evaluation Techniques

UI Specialists
– User Studies / Usability Testing
– Heuristic Evaluation

Interface Designers and Implementors
– Software Guidelines
– Cognitive Walkthroughs

What is Heuristic Evaluation?

A “discount” usability testing method

UI experts inspect an interface
– prototype initially
– later, the full system

Check the design against a list of design guidelines or heuristics

Present a list of problems to the UI designers and developers, ranked by severity

Adapted from slide by James Landay

Phases of Heuristic Evaluation

1) Pre-evaluation training
– give evaluators needed domain knowledge and information on the scenario

2) Evaluation
– individuals evaluate and then aggregate results

3) Severity rating
– determine how severe each problem is (priority)

4) Debriefing
– discuss the outcome with design team

Heuristic Evaluation

Developed by Jakob Nielsen

Helps find usability problems in a UI design

Small set (3-5) of evaluators examine UI
– independently check for compliance with usability principles (“heuristics”)
– different evaluators will find different problems
– evaluators only communicate afterwards
» findings are then aggregated

Can perform on working UI or on sketches

Why Multiple Evaluators?

No single evaluator finds every problem

Good evaluators find both easy & hard ones

Heuristics (original)

H1-1: Simple & natural dialog
H1-2: Speak the users’ language
H1-3: Minimize users’ memory load
H1-4: Consistency
H1-5: Feedback
H1-6: Clearly marked exits
H1-7: Shortcuts
H1-8: Precise & constructive error messages
H1-9: Prevent errors
H1-10: Help and documentation

Adapted from slide by James Landay

How to Perform H.E.

At least two passes for each evaluator
– first to get feel for flow and scope of system
– second to focus on specific elements

Assistance from implementors/domain experts
– If system is walk-up-and-use or evaluators are domain experts, then no assistance needed
– Otherwise might supply evaluators with scenarios and have implementors standing by

Adapted from slide by James Landay

How to Perform H.E.

Each evaluator produces a list of problems
– Explain each problem with reference to heuristics or other information
– Be specific
– List each problem separately

Adapted from slide by James Landay
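Each problem report is easier to aggregate later if every evaluator records it in the same structure. Below is a minimal Python sketch of such a record; the class and field names are illustrative assumptions (not a form from the slides), chosen to mirror the heuristic / description / fix / severity format of the examples that follow.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UsabilityProblem:
    """One entry in an evaluator's problem list (illustrative structure only)."""
    heuristic: str                  # e.g. "H1-4: Consistency"
    description: str                # be specific; one problem per record
    location: str                   # where in the UI the problem was found
    suggested_fix: Optional[str] = None
    severity: Optional[int] = None  # 0-4 rating, assigned later, after all evaluations are done

# Example, mirroring the "Save" vs. "Write file" problem shown later in these slides:
problem = UsabilityProblem(
    heuristic="H1-4: Consistency",
    description='First screen uses "Save", second screen uses "Write file" for the same function.',
    location="File-saving screens 1 and 2",
    suggested_fix="Use one term for the save action throughout",
)
```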

Example Problem Descriptions

Can’t copy info from one window to another
– Violates “Minimize the users’ memory load” (H1-3)
– Fix: allow copying

Typography uses mix of upper/lower case formats and fonts
– Violates “Consistency and standards” (H2-4)
– Slows users down
– Probably wouldn’t be found by user testing
– Fix: pick a single format for entire interface

Adapted from slide by James Landay

How to Perform Evaluation

Where problems may be found
– single location in UI
– two or more locations that need to be compared
– problem with overall structure of UI
– something that is missing
» this part is hard to find with paper prototypes, so work extra hard on those

Adapted from slide by James Landay

Debriefing

Conduct with evaluators, observers, and development team members

Discuss general characteristics of UI

Suggest potential improvements to address major usability problems

Dev. team rates how hard things are to fix

Make it a brainstorming session
– little criticism until end of session

Adapted from slide by James Landay

Severity Rating

Used to allocate resources to fix problems

Estimates of need for more usability efforts

Combination of
– frequency
– impact
– persistence (one time or repeating)

Should be calculated after all evaluations are done

Should be done independently by all evaluators

Adapted from slide by James Landay

Severity Ratings (from Nielsen & Mack 94)

0 - don’t agree that this is a usability problem
1 - cosmetic problem
2 - minor usability problem
3 - major usability problem; important to fix
4 - usability catastrophe; imperative to fix

Adapted from slide by James Landay
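Because each evaluator assigns these 0-4 ratings independently, the ratings have to be combined before the debriefing. A minimal sketch, assuming the independent ratings are simply averaged per problem and the problems then sorted most severe first (the averaging rule is an assumption; the slides only say the ratings are made independently):

```python
# Hypothetical independent 0-4 severity ratings, one list per problem (one entry per evaluator).
ratings = {
    "save vs. write-file inconsistency": [3, 4, 3],
    "cannot copy between windows": [2, 3, 2],
    "mixed typography": [1, 2, 1],
}

# Average the independent ratings, then sort most severe first to prioritize fixes.
averaged = {problem: sum(scores) / len(scores) for problem, scores in ratings.items()}
for problem, severity in sorted(averaged.items(), key=lambda item: item[1], reverse=True):
    print(f"{severity:.1f}  {problem}")
```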

Severity Ratings Example

1. [H1-4 Consistency] [Severity 3]

The interface used the string "Save" on the first screen for saving the user's file, but used the string "Write file" on the second screen. Users may be confused by this different terminology for the same function.

Adapted from slide by James Landay

Results of Using HE

Discount: benefit-cost ratio of 48 [Nielsen94]
– cost was $10,500 for benefit of $500,000
– value of each problem ~15K (Nielsen & Landauer)
– how might we calculate this value?
» in-house productivity; open market sales

There is a correlation between severity & finding w/ HE

Adapted from slide by James Landay
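The 48 figure is just the ratio of the estimated benefit to the cost of running the evaluation. A quick sanity check of the arithmetic (the implied problem count on the last line is derived from the per-problem value, not stated on the slide):

```python
cost = 10_500               # dollars spent on the heuristic evaluation [Nielsen94]
benefit = 500_000           # estimated dollar value of the problems it found
value_per_problem = 15_000  # ~15K per problem (Nielsen & Landauer)

print(round(benefit / cost))               # ~48, the quoted benefit-cost ratio
print(round(benefit / value_per_problem))  # ~33 problems would account for the benefit (derived, illustrative)
```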

Results of Using HE (cont.)

Single evaluator achieves poor results
– only finds 35% of usability problems
– 5 evaluators find ~75% of usability problems
– why not more evaluators? 10? 20?
» adding evaluators costs more
» adding more evaluators doesn’t increase the number of unique problems found

Adapted from slide by James Landay

Decreasing Returns

[Graphs: problems found, and benefits / cost, as the number of evaluators increases]

Caveat: graphs for a specific example

Adapted from slide by James Landay
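The usual way to model the "problems found" curve is the Nielsen & Landauer formula: the proportion found by i independent evaluators is 1 - (1 - λ)^i, where λ is the chance that a single evaluator finds a given problem. A sketch with λ taken, purely for illustration, from the 35% single-evaluator figure two slides back (as the caveat says, the real curve depends on the project):

```python
def proportion_found(i: int, lam: float = 0.35) -> float:
    """Proportion of all problems found by i independent evaluators,
    assuming each finds any given problem with probability lam."""
    return 1 - (1 - lam) ** i

# Each added evaluator contributes fewer new problems, which is why 3-5 is the usual recommendation.
for i in range(1, 11):
    print(f"{i:2d} evaluators: {proportion_found(i):.0%} of problems found")
```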

Summary of H.E.

Heuristic evaluation is a “discount” method

Procedure
– Evaluators go through the UI twice
– Check to see if it complies with heuristics
– Note where it doesn’t and say why

Follow-up
– Combine the findings from 3 to 5 evaluators
– Have evaluators independently rate severity
– Discuss problems with design team

Alternate with user testing

Study by Jeffries et al. 91

Compared the four techniques
– Evaluating the HP-VUE windows-like OS interface
– Used a standardized usability problem reporting form

Study by Jeffries et al. 91

Compared the four techniques
– Usability test
» conducted by a human factors professional
» six participants (subjects)
» 3 hours learning, 2 hours doing ten tasks
– Software guidelines
» 62 internal guidelines
» 3 software engineers (had not done the implementation, but had relevant experience)

Study by Jeffries et al. 91

Compared the four techniques
– Cognitive Walkthrough
» 3 software engineers, as a group (had not done the implementation, but had relevant experience)
» procedure: developers “step through” the interface in the context of core tasks a typical user will need to accomplish
» used tasks selected by Jeffries et al.
» a pilot test was done first

Study by Jeffries et al. 91

Compared the four techniques
– Heuristic Evaluation
» 4 HCI researchers
» two-week period (interspersed evaluation with the rest of their tasks)

Results of Study

H.E. group found the largest number of unique problems

H.E. group found more than 50% of all problems found

Results of Study: Problems Discovered

                     H.E.   U.T.   S.G.   C.W.
Found by technique    105     30     13     30
Side effect            --      1      8      5
Prior experience       --      0     12      0

Time Required (in person-hours)

                          H.E.   U.T.   S.G.   C.W.
Evaluation time             20    199     17     27
Technique training time     --     --      5     27
HP-VUE training time        15     --     64      6
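One way to read the two tables together is problems found per person-hour; this is illustrative arithmetic only (the benefit/cost figures on a later slide weight problems by severity rather than just counting them):

```python
# Counts and person-hours copied from the two tables above ("--" entries treated as 0).
problems_found = {"H.E.": 105, "U.T.": 30, "S.G.": 13, "C.W.": 30}
person_hours = {
    "H.E.": 20 + 0 + 15,    # evaluation + technique training + HP-VUE training
    "U.T.": 199 + 0 + 0,
    "S.G.": 17 + 5 + 64,
    "C.W.": 27 + 27 + 6,
}
for technique, count in problems_found.items():
    print(f"{technique}: {count / person_hours[technique]:.2f} problems per person-hour")
```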

Average Severity Ratings

Partition the problems into three severity groups

Classify results into the top and bottom groups
– most severe / least severe
– average >= 3.86 / average <= 2.86

        most severe / least severe
HE          28 / 52
UT          18 / 2
SG          12 / 11
CW           9 / 10

Benefit/Cost Ratios

Average severity rating / time
– HE: 12
– UT: 1
– SG: 2
– CW: 3
– (S.G. becomes 6 if you exclude the time required for HP-VUE training)

Adapted from slide by James Landay

HE vs. User Testing

HE is much faster
– 1-2 hours per evaluator vs. days-weeks
– finds some problems user testing does not find

HE doesn’t require interpreting users’ actions

User testing is more accurate (by definition)
– takes into account actual users and tasks
– finds some problems HE doesn’t find
– HE may miss problems & find “false positives”

Good to alternate between HE and user testing
– find different problems
– don’t waste participants

Next Time

Work on Low-fi prototypes in class