U08038 The Human Computer Interface Lecture 5

U08038 The Human Computer Interface

Lecture 5 Evaluation

Dr Clare Martin


Today's Lecture

• Usability Evaluation
• Analytical Methods
  – GOMS analysis
  – Cognitive Walkthroughs
  – Heuristic Evaluation
• Evaluation Methods
  – User Observation
  – Beyond Testing


Reminder: what is usability? Usability is not about

"Is the system capable of performing this task?"

Usability is about "How effective is the system at allowing the user to perform this task?"

Usability is not about functionality (important though that is, other branches of computer science address this requirement), it is about good design. (Preece 1.6.1)


Brookes Calendar

Example Usability Question:

Is it obvious what the heavy lines represent?


Photocopier Example Usability Question:

Can a novice user figure out which button is the "Sample Copy" button?


More examples of usability questions

• Will the design of this ATM prevent people from leaving their bank cards behind?

• Is this button on the screen too small to press accurately? And if the user accidentally presses the button, will he/she manage to undo the effects?


General Categories of Usability (1) See Redmond-Pyle & Moore, Section 1.2:

• Effectiveness:
  – How effectively and/or quickly can the users perform their tasks using the interface?
• Learnability:
  – How much training time and practice do users require to be effective with the system?
  – If users only use the system intermittently, how long does it take to relearn the system each time they use it?


General Categories of Usability (2)

• Flexibility:
  – Is the interface still effective if there are changes in the task or environment?
• Attitude/Satisfaction:
  – Do people experience using the system as frustrating, or rewarding?
  – Do users like the system?


How to get good usability?

•  Good usability makes for a better system, but how?

•  Designers are biased: it is easy to produce a design that is personally pleasing, but very difficult to see how other people are going to view the design!

•  Experienced designers may produce designs with fewer usability problems than inexperienced designers, but there WILL still be problems.

•  This means that every designer needs to be humble! (Yes, you too!)

(Redmond-Pyle p4)


The need to measure usability

Design guidelines are very useful, but they only go so far. We need to be able to measure usability to get feedback about our designs (and thus improve them). (Dumas & Redish p184)

There are no quick and easy answers when measuring usability: no clear scale of measurement, and no one right answer. This does not mean it is useless to attempt to measure usability!

Quite the contrary: a lot of valuable information can be gained, which can improve a design greatly. This is the whole point of measuring usability: to improve the design!


Measuring Usability

Useful terminology:
•  A measurement method is formative if it gives back information that provides input into a design (helps form the design)
•  A measurement method is summative if it gives information about whether a design meets a standard or not, with no feedback into the design


Evaluation (formative)

Formative evaluation is carried out as part of the iterative design process:
– A prototype reflects the current state of the design.
– The prototype is regularly updated and tested on users.
– The results are fed into the next iteration of the design, helping to form the design.

Each usability evaluation allows the designers to check design decisions that have already been made, and helps them make new ones, e.g.
"Which of these alternatives should we adopt?"
"Are there any problems with the design so far?"
(Preece 12.2.4)


Evaluation (summative)

This involves testing the design as a whole (typically at the end of the design process).

If there was no testing up to this point, then it is likely the whole design will fail to meet usability requirements (and maybe there won't be enough time to fix it).

Summative evaluation is therefore less effective, but it can still provide useful information (e.g. for the next version of the product). (Preece 12.2.4)


Usability Measuring Methods

There are two main types of methods: theoretical and practical (Preece)

Analysis (theoretical)
– Requires a detailed description of the design (but the design doesn't have to be implemented)
– Creates a model of the user's activity, and then analysis is performed on that model

– Cheap, no advance planning


Usability Measuring Methods

Evaluation (Practical) (Preece p)
– The design needs to be implemented
– A prototype is built and then tested
– Relatively expensive, requiring advance planning
– Many different testing methods possible, but should include some kind of user observation


Usability Evaluation Methods

There are many! We will look at:

Analytical Methods
• GOMS analysis
• Cognitive Walkthroughs
• Heuristic Evaluation

Evaluation Methods
• User Observation
  – Think-aloud protocol
  – Performance measurements
  – Computer logging
• Beyond Usability Testing
  – Questionnaires/Interviews
  – Focus groups


GOMS Reminder:
•  Goals, Operators, Methods, Selection rules
•  GOMS breaks the user's goals down into smaller sub-goals and, for each method of achieving a goal, lists the operators used; the selection rules describe how the user chooses between methods

GOMS provides a way of modelling how a user interacts with an interface to achieve their goals

Can also be used in Task Analysis
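
One way to make the four GOMS components concrete is to write a small model down as data. The sketch below is illustrative only: the goal ("close the current window"), its two methods, their operators, and the selection rule are all hypothetical choices made for this example, not part of the lecture material.

```python
from dataclasses import dataclass, field


@dataclass
class Method:
    """One way of achieving a goal, as an ordered list of operators."""
    name: str
    operators: list[str]


@dataclass
class Goal:
    """A user goal together with the alternative methods that achieve it."""
    description: str
    methods: dict[str, Method] = field(default_factory=dict)


def select_method(goal: Goal, hands_on_keyboard: bool) -> Method:
    """Hypothetical selection rule: use the shortcut if the hands are already on the keyboard."""
    return goal.methods["shortcut" if hands_on_keyboard else "mouse"]


# Hypothetical goal with two methods, each broken down into operators.
close_window = Goal(
    description="Close the current window",
    methods={
        "shortcut": Method("shortcut", ["Press Alt+F4"]),
        "mouse": Method("mouse", ["Move hand to mouse", "Point at close button", "Click"]),
    },
)

chosen = select_method(close_window, hands_on_keyboard=False)
print(chosen.name, len(chosen.operators))
```

Writing the model this way makes it easy to ask questions of it, such as which method needs the fewest operators.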


GOMS analysis

GOMS techniques are theoretical, and can be used for design and/or analysis.

To do a GOMS analysis, the analysts take a detailed specification of the design and express it in the GOMS model. Various questions can then be asked of the model.

Note that creating the model is itself summative, not formative: the model by itself doesn't give any clues as to how to correct problems, although it might provide inspiration.

Example → → →


GOMS example: currency conversion

Let us consider an example task scenario (GOAL): convert £40 to Singapore Dollars (SGD), then convert 100 SGD to £, using http://www.x-rates.com/calculator.html


GOMS example (2)

One possible method:

1. Choose the currency to convert from, using the drop-down menu
2. Choose the currency to convert to, using the drop-down menu
3. Choose the amount to convert
4. Repeat steps 1-3 for the opposite conversion

Operators:
• Click on the middle box
• Click GBP
• Click the bottom box
• Press the down arrow until you reach SGD
• Click on the top box
• Type 40
• Press Enter
• Then repeat the above process for the opposite conversion


GOMS example (2)

One possible method:
1. Choose the amount to convert
2. Choose the currency to convert from, using the drop-down menu
3. Choose the currency to convert to, using the drop-down menu
4. Select a different method for the opposite conversion

Operators:
• Click on the top box
• Type 40
• Press
• Click on the middle box
• Scroll down alphabetical list to GBP
• Click the bottom box
• Type ‘si’ in the search box
• Click on SGD
• Click on
• Click on the top box
• Type 100, then Enter


Old x-rates interface

Question: which features of the new interface make the user experience more efficient?


GOMS example (4) After forming the GOMS model, it is possible to analyse methods, selection and operators and ask questions of the model. For example:

– How long is it estimated that users will take to perform the task?
– Are there some methods which are unnecessarily lengthy which we can make more efficient?
– Is this system better than the old design?


GOMS modelling exercise

•  Goal: deleting a section of text in MS Word
•  What different ways (methods) are there to delete text?
•  Can you think of different scenarios when the user might select one method rather than another?
•  The mouse movements, key presses etc. are the operators used


Keystroke level model
• GOMS has also been developed to provide a quantitative model: the keystroke level model.
• The keystroke level model allows predictions to be made about how long it takes an expert user to perform a task.


Response times for keystroke level operators (Card et al., 1983)

Operator  Description                                                Time (sec)
K         Pressing a single key or button:
            Average skilled typist (55 wpm)                          0.22
            Average non-skilled typist (40 wpm)                      0.28
            Pressing shift or control key                            0.08
            Typist unfamiliar with the keyboard                      1.20
P         Pointing with a mouse or other device on a display to
          select an object (value derived from Fitts’ Law)           0.40
P1        Clicking the mouse or similar device                       0.20
H         Bring ‘home’ hands on the keyboard or other device         0.40
D         Draw a line using the mouse                                Variable
M         Mentally prepare to do something                           1.35
R(t)      The response time is counted only if it causes the user
          to wait when carrying out a task                           t

See also: http://www.measuringu.com/predicted-times.php


Summing together

T_execute = T_K + T_P + T_H + T_D + T_M + T_R
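
As a rough illustration of how this sum is used, the sketch below adds up operator times for a short action sequence. The times are the Card et al. (1983) values listed for the keystroke level operators; the choice of 0.28 s for K (average non-skilled typist), the function name, and the example sequence are assumptions made for this example only.

```python
# Operator times in seconds, from the Card et al. (1983) table of keystroke level operators.
KLM_SECONDS = {
    "K": 0.28,   # assumption: average non-skilled typist (40 wpm)
    "P": 0.40,   # point at a target with the mouse
    "P1": 0.20,  # click the mouse
    "H": 0.40,   # home hands on the keyboard or other device
    "M": 1.35,   # mentally prepare
}


def execution_time(operators, variable_seconds=0.0):
    """Sum the fixed operator times; D and R(t) have no fixed value,
    so their measured times are passed in via variable_seconds."""
    return sum(KLM_SECONDS[op] for op in operators) + variable_seconds


# Hypothetical sequence: think, reach for the mouse, point at a text box and click it,
# return to the keyboard, then type "40" and press Enter (three keystrokes).
sequence = ["M", "H", "P", "P1", "H", "K", "K", "K"]
total = execution_time(sequence)
print(f"Predicted expert time: {total:.2f} s")
```

For this sequence the prediction is 1.35 + 0.40 + 0.40 + 0.20 + 0.40 + 3 × 0.28 = 3.59 seconds. Remember that the model predicts error-free expert performance, so novice times will normally be longer.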


Cognitive Walkthroughs

•  A walkthrough is a step-by-step inspection. (Preece 15.2.2)

•  Several analysts look at the design and try to imagine/imitate what the user is going to be thinking as he/she tries to use the design.

•  This often results in noticing problems for novice users, and suggestions for design improvements.

•  Could be done with a prototype, but could also be done with a very detailed specification of the design.

•  Alternatively, cognitive walkthroughs can be done without a complete design, to try and help form the design in the first place (formative).

Cognitive walkthroughs concentrate on what the user is thinking whilst learning to use the system


Cognitive Walkthroughs (2)

There are 5 steps involved:
1. The analysts need:
   – Some indication of who the users of the system are.
   – A detailed description or prototype of the interface, including what happens as a result of each action.
   – A description of which tasks the user is to perform.
2. For each task, the analysts simulate:
   – The user exploring the system, looking for actions that might help
   – The user selecting an action
   – The user's interpretation of the system's response
(Preece 15.2.2)


Cognitive Walkthroughs (3)

3. The analysts walk through the action sequences for each task, putting them in the context of a task scenario, and ask questions like these:
   – Will the user know what to do?
   – Will the user see how to do it?
   – Will the user understand from feedback whether the action was correct or not?
   See also http://www.userfocus.co.uk/articles/cogwalk.html
4. A record is kept of the following:
   – The assumptions about what would cause problems
   – Notes about side issues and design changes
5. The design is then revised to fix the problems


Demo

Task: to buy a copy of Interaction Design from Amazon

Users: regular web users

Task steps:
1. Selecting the category of goods
2. Completing the form

– Will the user know what to do?
– Will the user see how to do it?
– Will the user understand from feedback whether the action was correct or not?


Heuristic Evaluation

Heuristic evaluation requires a team of evaluators, along with a set of design guidelines (heuristics).

The heuristics guide the analysis: the evaluators carefully and systematically examine the interface, and assess how the design does or doesn't meet the guidelines. Typically the interface is examined by performing walkthroughs of it.

Preece 15.2.1
http://www.useit.com/
http://www.useit.com/papers/heuristic/


Design Heuristics (1)

For example, one set of possible heuristics from Nielsen and Molich (1989): http://www.nngroup.com/topic/heuristic-evaluation/
– Simple and natural dialogue
– Provide clearly marked exits
– Speak the user’s language
– Provide short cuts
– Minimise user memory load
– Good error messages
– Be consistent
– Prevent errors
– Provide feedback

Other possible sets of heuristics can be obtained from design guidelines, e.g. - Microsoft Style Guidelines

- Shneiderman's Guidelines


Design Heuristics (2)

Examples:
•  ‘Prevent errors’ would focus the evaluator on searching for errors, for example by scanning the design for features that the user might misinterpret

•  ‘Provide short cuts’ would suggest looking for frequently performed tasks that involve lengthy sequences of actions

•  ‘Provide feedback’ would involve checking that suitable feedback is always provided on any user action, and also the quality of the feedback


Heuristic Evaluation (1)

Notes:
• This type of evaluation can be done with a prototype, or before (provided a sufficiently detailed description of the design is available)
• Better to use external experts for the evaluators, not the designers.
• Useful for analysing a variety of user situations (e.g. where the user is not a novice).

Examples on Moodle: Nationwide Heuristic Instrument & Sample Google Form


Heuristic Evaluation (2)

Advantages:
•  Relatively low cost
•  Intuitive to perform
•  Doesn't necessarily require advance planning
•  Suitable for use in all stages of the development process

Disadvantages:
•  Focuses on problems rather than solutions
•  Encourages designers to repair existing designs rather than think up new designs
•  Not an exact science: two evaluators might give different results

http://usabilitybok.org/heuristic-evaluation
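
Because two evaluators might give different results, it can help to merge everyone's findings before deciding what to fix. The sketch below shows one possible way to do that; the findings, the evaluator labels, and the 0-4 severity scale are all hypothetical, and only the heuristic names come from the Nielsen and Molich list.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical findings: (evaluator, heuristic, problem, severity on an assumed 0-4 scale).
findings = [
    ("A", "Provide feedback", "No confirmation after saving", 3),
    ("B", "Provide feedback", "No confirmation after saving", 2),
    ("A", "Prevent errors", "Delete button next to Save", 4),
    ("C", "Be consistent", "Two different icons for search", 1),
]


def merge_findings(rows):
    """Group findings by (heuristic, problem) and summarise evaluator agreement."""
    grouped = defaultdict(list)
    for _evaluator, heuristic, problem, severity in rows:
        grouped[(heuristic, problem)].append(severity)
    summary = [
        {"heuristic": h, "problem": p, "reports": len(s), "mean_severity": mean(s)}
        for (h, p), s in grouped.items()
    ]
    # Most severe problems first.
    return sorted(summary, key=lambda row: row["mean_severity"], reverse=True)


for row in merge_findings(findings):
    print(row)
```

The "reports" count shows how many evaluators independently noticed a problem, which is a useful sanity check alongside the averaged severity.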


Activity: Evaluation in Coursework

You need to design 2 evaluations: a heuristic one for ‘experts’ (i.e. another group) and one for your users.
• Start by thinking up a list of tasks for the evaluators to perform
• Decide how you will gather the data in each case, e.g. Google form, filming, audio recording
• Plan how you will analyse the data, as this can affect what sort of data you gather
• For the heuristic evaluation, choose which heuristics you will use (not necessarily Nielsen’s)
• What scale will you use for the evaluators?
• Look at the Moodle resources for this week for tips


Which Analysis to Perform?

We have looked at three methods (more exist).
•  GOMS
   –  good at analysing precise sequences of user actions
•  Cognitive Walkthroughs
   –  good for assessing how novice users will cope
•  Heuristic Evaluation
   –  can look at a wide range of issues and user situations

Describing/examining the interface from a different perspective often provides fresh insight into the usability of a system.

These methods usually require detailed models, and have limited applicability. Theory isn't always applicable, and even when it is, it will only get you so far and reveal only certain kinds of problems.

Need to evaluate/test with REAL USERS!


Evaluation by Usability Testing

User Observation
– Think-aloud protocol
– Performance measurements
– Usability Laboratories
– Computer logging

Beyond Usability Testing
– Questionnaires/Interviews
– Focus groups

Preece 14.2


User Observation

Very simply, user observation involves getting users to use the software, observing what they do, and recording this in some way.
• Direct observation
  – The investigator is present, maybe taking notes
  – The investigator can direct attention to areas of interest
• Indirect observation
  – Recording is done mechanically, e.g. by use of a video recorder
  – More useful for discovering otherwise unnoticed activity

User observation is useful for obtaining qualitative data (data that can't be precisely measured)


User Observation (2)

User observation tests need careful planning, e.g.
– What is/are the objective(s) of your testing?
– What tasks are you going to get the user to do?

One very useful tool is the Thinking Aloud Protocol:
• Literally, the user "thinks aloud" whilst going through tasks with the interface, and their thoughts are recorded.
• Designers can then see what the user was thinking at the time of having problems.
• Reveals not only what the problems are, but why.
• Subjective, but one of the most valuable methods in usability engineering.


User Observation (3)

Bear in mind the following!
• Give your users sufficient information about the testing, including reassurance that it is the design being tested, not them.
• When observing users in person, your presence can change events.
• Explain that you can't provide help/answers during the testing, but that if they do have questions about how to do things, they should ask the questions anyway, because this provides valuable information about the design.
• Don't blame participants for making mistakes; that is the wrong focus. Attribute problems to faulty design, not the users!


Performance Measurements

Examples:
• The time users take to complete a specific task (efficiency)
• The ratio between successful interactions and errors (effectiveness)
• The number of user errors (and the time spent recovering from them) (simplicity)
• The number of times the user expresses clear frustration
• The ratio between positive comments and criticisms
Preece p646
• Cognitive load (NASA TLX)
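
Several of these measurements are simple to compute once the observations have been written down. The sketch below is a minimal example under stated assumptions: the session records, their field names, and the numbers in them are hypothetical, invented purely to show the arithmetic.

```python
from statistics import mean

# Hypothetical observations for one task, one record per participant.
sessions = [
    {"seconds": 95, "successes": 18, "errors": 2, "recovery_seconds": 12},
    {"seconds": 140, "successes": 15, "errors": 5, "recovery_seconds": 40},
    {"seconds": 80, "successes": 20, "errors": 0, "recovery_seconds": 0},
]


def summarise(rows):
    """Compute time, success/error ratio, error count and recovery time across participants."""
    total_successes = sum(r["successes"] for r in rows)
    total_errors = sum(r["errors"] for r in rows)
    return {
        # Efficiency: average time to complete the task.
        "mean_task_seconds": mean(r["seconds"] for r in rows),
        # Effectiveness: successful interactions per error (None if nobody made an error).
        "success_to_error_ratio": total_successes / total_errors if total_errors else None,
        # Simplicity: number of errors and the time spent recovering from them.
        "total_errors": total_errors,
        "recovery_seconds": sum(r["recovery_seconds"] for r in rows),
    }


stats = summarise(sessions)
print(stats)
```

Deciding on these fields before the test session makes it much easier to record the right things while observing.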


Usability Laboratories

These can be simple or sophisticated, cheap or expensive.

Must include:
– desks, chairs, computers

May include:
– audio/visual recording equipment
– two-way mirrors
– computer logging
– intercom
– good lighting and nice decor
– wheelchair accessible


Automated User Observation

There are ways to record user interactions for later analysis without needing a usability lab or direct user testing.
• Screen captures, along with eye tracking devices
• Computer logging
  – software can have built-in collection of usage statistics
  – e.g. web server logs can reveal useful information about which pages are visited most and which links are used most
  – "big brother" issues for client-run software
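
To show what the web server log idea can look like in practice, here is a small sketch that counts page visits. It assumes the log is in Common Log Format; the log lines, IP addresses, and page names are made up for the example.

```python
import re
from collections import Counter

# Hypothetical access-log lines in Common Log Format.
log_lines = [
    '10.0.0.1 - - [01/Mar/2024:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Mar/2024:10:00:05 +0000] "GET /timetable.html HTTP/1.1" 200 2048',
    '10.0.0.1 - - [01/Mar/2024:10:00:09 +0000] "GET /timetable.html HTTP/1.1" 200 2048',
    '10.0.0.3 - - [01/Mar/2024:10:00:12 +0000] "GET /missing.html HTTP/1.1" 404 128',
]

# Captures the requested path and the HTTP status code of a GET request.
REQUEST = re.compile(r'"GET (?P<path>\S+) [^"]*" (?P<status>\d{3})')


def page_hits(lines):
    """Count successful (status 200) GET requests per page."""
    hits = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match and match.group("status") == "200":
            hits[match.group("path")] += 1
    return hits


print(page_hits(log_lines).most_common())
```

Counts like these say which pages are used, not why; they are a complement to observation rather than a replacement for it.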


Beyond Usability Testing

Questionnaires & Interviews
– Useful for measuring subjective satisfaction (e.g. SUS)
– Useful for assessing which features users like
– Users can identify which features are disliked, but don't (usually) offer any suggestions for fixing them!

Preece p308, and Section 7.5
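
SUS (the System Usability Scale) is a ten-item questionnaire answered on a 1-5 scale, and its score is simple to compute. The sketch below follows the standard SUS scoring rule (odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5); the function name and the sample answers are hypothetical.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for ten responses on a 1-5 scale."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses, each between 1 and 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1   # odd items are positively worded
        else:
            total += 5 - response   # even items are negatively worded
    return total * 2.5


# Hypothetical answers from one participant, items 1 to 10 in order.
answers = [4, 2, 5, 1, 4, 2, 4, 2, 5, 1]
print(sus_score(answers))
```

For these answers the odd items contribute 17 and the even items contribute 17, so the score is 34 × 2.5 = 85. A SUS score is not a percentage, so it is best interpreted by comparison with other systems or earlier versions of the same design.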

Focus groups
– a small group of users formed especially to discuss the system
– e.g. the PIP/UMP system has a working group, and the discussions of this group include design issues
