SIMS 213: User Interface Design & Development Marti Hearst March 9 and 16, 2006

SIMS 213: User Interface Design & Development

Marti Hearst
March 9 and 16, 2006

Formal Usability Studies

Outline

Experiment Design
– Factoring Variables
– Interactions

Special considerations when involving human participants

Example: Marking Menus
– Motivation
– Hypotheses
– Design
– Analysis

Adapted from slide by James Landay

Formal Usability Studies

When useful
– to determine time requirements for task completion
– to compare two designs on measurable aspects
  • time required
  • number of errors
  • effectiveness for achieving very specific tasks

Require Experiment Design

Experiment Design

Experiment design involves determining how many experiments to run and which attributes to vary in each experiment

Goal: isolate which aspects of the interface really make a difference

Experiment Design

Decide on
– Response variables
  • the outcome of the experiment
  • usually the system performance
  • aka dependent variable(s)
– Factors (aka attributes)
  • aka independent variables
– Levels (aka values for attributes)
– Replication
  • how often to repeat each combination of choices

Experiment Design

Example:
– Studying a system (ignoring users)

Say we want to determine how to configure the hardware for a personal workstation
– Hardware choices
  • which CPU (three types)
  • how much memory (four amounts)
  • how many disk drives (from 1 to 3)
– Workload characteristics
  • administration, management, scientific

Experiment Design

We want to isolate the effect of each component for the given workload type.

How do we do this?
– WL1 CPU1 Mem1 Disk1
– WL1 CPU1 Mem1 Disk2
– WL1 CPU1 Mem1 Disk3
– WL1 CPU1 Mem2 Disk1
– WL1 CPU1 Mem2 Disk2
– …

There are (3 CPUs) * (4 memory sizes) * (3 disk counts) * (3 workload types) = 108 combinations!
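The count above can be checked by enumerating the full factorial design. This is a minimal sketch; the level labels (WL1, CPU1, and so on) follow the slide's naming, but the dictionary layout is only an illustrative assumption.

```python
from itertools import product

# Factor levels from the workstation example; the labels are illustrative.
factors = {
    "workload": ["WL1", "WL2", "WL3"],
    "cpu": ["CPU1", "CPU2", "CPU3"],
    "memory": ["Mem1", "Mem2", "Mem3", "Mem4"],
    "disks": ["Disk1", "Disk2", "Disk3"],
}

# A full factorial design runs every combination of every level.
full_factorial = list(product(*factors.values()))

print(len(full_factorial))  # 3 * 3 * 4 * 3 = 108
for run in full_factorial[:5]:
    print(" ".join(run))
```

The first five runs printed match the sequence listed on the slide, because the last factor varies fastest.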

Experiment Design

One strategy to reduce the number of comparisons needed:
– pick just one attribute
– vary it
– hold the rest constant

Problems:
– inefficient
– might miss effects of interactions

Interactions among Attributes

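A and B do not interact:

      A1   A2
B1    3    5
B2    6    8

A and B may interact:

      A1   A2
B1    3    5
B2    6    12

[Figure: two line plots of the response at B1 and B2, with one line each for A1 and A2. Left: A and B do not interact. Right: A and B may interact.]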
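One way to read the two tables is as a "difference of differences": if the effect of switching from A1 to A2 is the same at B1 and at B2, the factors do not interact. The sketch below uses the slide's numbers; the function name and table layout are assumptions made for illustration.

```python
def interaction_contrast(table):
    """Difference of differences for a 2x2 table keyed by (A level, B level)."""
    effect_of_a_at_b1 = table[("A2", "B1")] - table[("A1", "B1")]
    effect_of_a_at_b2 = table[("A2", "B2")] - table[("A1", "B2")]
    return effect_of_a_at_b2 - effect_of_a_at_b1

no_interaction = {("A1", "B1"): 3, ("A2", "B1"): 5,
                  ("A1", "B2"): 6, ("A2", "B2"): 8}
may_interact = {("A1", "B1"): 3, ("A2", "B1"): 5,
                ("A1", "B2"): 6, ("A2", "B2"): 12}

print(interaction_contrast(no_interaction))  # 0: A adds 2 at both levels of B
print(interaction_contrast(may_interact))    # 4: A adds 2 at B1 but 6 at B2
```

A nonzero contrast in real data is only suggestive; whether it is a genuine interaction still needs a significance test.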

Experiment Design

Another strategy: figure out which attributes are important first

Do this by just comparing a few major attributes at a time
– if an attribute has a strong effect, include it in future studies
– otherwise assume it is safe to drop it

This strategy also allows you to find interactions between attributes

Experiment Design

Common practice: Fractional Factorial Design
– Just compare important subsets
– Use experiment design to partially vary the combinations of attributes

Blocking
– Group factors or levels together
– Use a Latin Square design to arrange the blocks
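As a rough illustration of a fractional factorial design, the sketch below builds a half-fraction of a design with three two-level factors, setting the third factor from the product of the first two. The factors A, B, C and the -1/+1 coding are assumptions for illustration; this is not the design used in the marking menu study.

```python
from itertools import product

LEVELS = (-1, +1)  # low and high setting of each two-level factor

# Full design: 2 * 2 * 2 = 8 runs.
full_design = list(product(LEVELS, repeat=3))

# Half-fraction: choose A and B freely, then set C = A * B.
# This halves the number of runs, at the cost of confounding
# the main effect of C with the A-by-B interaction.
half_fraction = [(a, b, a * b) for a, b in product(LEVELS, repeat=2)]

print(len(full_design), len(half_fraction))
for run in half_fraction:
    print(run)
```

Each factor still appears at its low and high level equally often in the four runs, which is what keeps the main-effect estimates balanced.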

Between-Groups Design

Wilma and Betty use one interface

Dino and Fred use the other

Within-Groups Design

Everyone uses both interfaces

Adapted from slide by James Landay

Between-Groups vs. Within-Groups

Between groups
– 2 or more groups of test participants
– each group uses only 1 of the systems

Within groups
– one group of test participants
– each person uses all systems
  • can’t use the same tasks on different systems

Between Groups Example

Comparing EdgeWrite to Graffiti

Wobbrock, J.O., Myers, B.A. and Kembel, J.A. (2003) EdgeWrite: A stylus-based text entry method designed for high accuracy and stability of motion. (UIST '03).

– Independent Variables?
– Dependent Variables?
– Between or Within Subjects?

Between Groups Example

Comparing EdgeWrite to Graffiti
– Independent Variables?
  • EdgeWrite vs Graffiti!
– Dependent Variables?
  • Time
  • Errors
– Between or Within Subjects?
  • Between, to control for learning effects

Between-Groups vs. Within-Groups

Within groups design
– Pros:
  • Is more powerful statistically (can compare the same person across different conditions, thus isolating effects of individual differences)
  • Requires fewer participants than between-groups
– Cons:
  • Learning effects
  • Fatigue effects
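The statistical-power point can be illustrated with made-up numbers. In the sketch below, five hypothetical participants differ a lot from each other but are each slightly slower on interface B. Treating the data as paired (within groups) yields a much larger t statistic than treating the same numbers as two independent groups. The data and function names are assumptions for illustration only.

```python
import math
import statistics

# Hypothetical task times in seconds for five participants (made-up numbers).
time_a = [10.0, 20.0, 30.0, 40.0, 50.0]
time_b = [11.0, 22.0, 31.0, 42.0, 52.0]

def paired_t(x, y):
    """t statistic on per-participant differences (within-groups view)."""
    diffs = [b - a for a, b in zip(x, y)]
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_err

def independent_t(x, y):
    """Unpooled (Welch-form) t statistic, as if x and y were separate groups."""
    std_err = math.sqrt(statistics.variance(x) / len(x)
                        + statistics.variance(y) / len(y))
    return (statistics.mean(y) - statistics.mean(x)) / std_err

print(paired_t(time_a, time_b))
print(independent_t(time_a, time_b))
```

The paired version removes the large person-to-person spread from the error term, which is why the same 1 to 2 second difference stands out so much more clearly.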

Special Considerations for Formal Studies with Human Participants

Studies involving human participants vs. measuring automated systems
– people get tired
– people get bored
– people (may) get upset by some tasks
– learning effects
  • people will learn how to do the tasks (or the answers to questions) if repeated
  • people will (usually) learn how to use the system over time

More Special Considerations

High variability among people

– especially when involved in reading/comprehension tasks

– especially when following hyperlinks! (can go all over the place)

Experiment Design Example: Marking Menus

Based on Kurtenbach, Sellen, and Buxton, Some Articulatory and Cognitive Aspects of “Marking Menus”, Graphics Interface ‘94, http://reality.sgi.com/gordo_tor/papers

Experiment Design Example: Marking Menus

Pie marking menus can reveal
– the available options
– the relationship between mark and command

1. User presses down with stylus
2. Menu appears
3. User marks the choice, an ink trail follows

Why Marking Menus?

Same movement for selecting command as for executing it
Supporting markings with pie menus should help transition between novice and expert
Useful for keyboardless devices
Useful for large screens
Pie menus have been shown to be faster than linear menus in certain situations

What do we want to know?

Are marking menus better than pie menus?
– Do users have to see the menu?
– Does leaving an “ink trail” make a difference?
– Do people improve on these new menus as they practice?

Related questions:
– What, if any, are the effects of different input devices?
– What, if any, are the effects of different size menus?

Experiment Factors

Isolate the following factors (independent variables):
– Menu condition
  • exposed, hidden, hidden w/marks (E,H,M)
– Input device
  • mouse, stylus, track ball (M,S,T)
– Number of items in menu
  • 4,5,7,8,11,12 (note: both odd and even)

Response variables (dependent variables):
– Response Time
– Number of Errors

Experiment Hypotheses

Note these are stated in terms of the factors (independent variables)

1. Exposed menus will yield faster response times and lower error rates, but not when menu size is small

2. Response variables will monotonically increase with menu size for exposed menus

3. Response time will be sensitive to number of menu choices for hidden menus (familiar ones will be easier, e.g., 8 and 12)

4. Stylus better than Mouse better than Track ball

Experiment Hypotheses

5. Device performance is independent of menu type

6. Performance on hidden menus (both marking and hidden) will improve steadily across trials. Performance on exposed menus will remain constant.

Experiment Design

Participants
– 36 right-handed people
  • usually gender distribution is stated
– considerable mouse experience
– (almost) no trackball, stylus experience

Experiment Design

Task
– Select target “slices” from a series of different pie menus as quickly and accurately as possible
  • (a) exposed (b) hidden
  • Can move mouse to select, as long as button held down
– Menus were simply numbered segments
  • (meaningful items would have longer learning times)
– Participants saw running scores
  • Shown grayed-out feedback about which item was selected
  • Lose points for wrong selection

Experiment Design

36 participants

One between-subjects factor
– Menu View Type
  • Three levels: E, H, or M
  • (Exposed, Hidden, Marking)

Two within-subjects factors
– Device Type
  • Three levels: M, T, or S
  • (Mouse, Trackball, Stylus)
– Number of Menu Items
  • Six levels: 4, 5, 7, 8, 11, 12

How should we arrange these?

Experiment Design

Between-subjects design (12 participants per menu group):

  E    H    M
  12   12   12

How to arrange the devices?

Experiment Design

A Latin Square: no row or column shares labels

  E    H    M
  M    T    S
  T    S    M
  S    M    T
  12   12   12

(Note: each of 12 participants does everything in one column)
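A Latin square like the one on this slide can be generated by cyclically shifting the list of labels. The sketch below is one simple construction (the slide does not say how its square was produced), with a helper that checks the defining property. Function names are assumptions for illustration.

```python
def cyclic_latin_square(labels):
    """Build an n x n square where cell (row, col) is labels[(row + col) % n]."""
    n = len(labels)
    return [[labels[(row + col) % n] for col in range(n)] for row in range(n)]

def is_latin_square(square):
    """True if every label appears exactly once in each row and each column."""
    labels = set(square[0])
    rows_ok = all(set(row) == labels and len(row) == len(labels) for row in square)
    cols_ok = all(set(col) == labels for col in zip(*square))
    return rows_ok and cols_ok

devices = ["M", "T", "S"]  # Mouse, Trackball, Stylus
square = cyclic_latin_square(devices)
for row in square:
    print(" ".join(row))
# M T S
# T S M
# S M T
print(is_latin_square(square))  # True
```

Because each device appears once in every position, order effects such as learning and fatigue are spread evenly across devices rather than always penalizing the same one.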

Experiment Design

  E    H    M
  M    T    S
  T    S    M
  S    M    T

How to arrange the menu sizes?

Block by size, then randomize the blocks.

Experiment Design

  E    H    M
  M    T    S
  T    S    M
  S    M    T

Block by size, then randomize the blocks. Example menu-size blocks within one device cell:

  5    11
  12   8
  7    4

(Note: the order of each set of menu size blocks will differ for each participant in each square)
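The "block by size, then randomize the blocks" step can be sketched as follows: for each device a participant uses, the six menu sizes are shuffled into a fresh order. The function names and the use of a per-participant seed are assumptions for illustration, not details from the study.

```python
import random

MENU_SIZES = [4, 5, 7, 8, 11, 12]

def menu_size_order(rng):
    """Return the six menu-size blocks in a random order."""
    order = MENU_SIZES[:]
    rng.shuffle(order)
    return order

def participant_schedule(device_order, seed):
    """One randomized sequence of menu-size blocks per device."""
    rng = random.Random(seed)  # seeding makes a schedule reproducible
    return [(device, menu_size_order(rng)) for device in device_order]

# Hypothetical participant assigned the device order M, T, S.
plan = participant_schedule(["M", "T", "S"], seed=1)
for device, sizes in plan:
    print(device, sizes)
```

Every participant still sees every menu size on every device; only the order changes, so that no size is systematically helped or hurt by practice or fatigue.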

Experiment Design

  E    H    M
  M    T    S
  T    S    M
  S    M    T

Example menu-size blocks within two device cells:

  5    11        7    8
  12   8         12   5
  7    4         4    11

40 trials per block

(Note: these blocks will look different for each participant.)
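The size of the experiment follows from the design by simple multiplication. The sketch below assumes that every participant completes one 40-trial block for each combination of device and menu size, which is how the design reads here.

```python
participants = 36       # 12 per menu group (E, H, M)
devices = 3             # within-subjects: mouse, trackball, stylus
menu_sizes = 6          # within-subjects: 4, 5, 7, 8, 11, 12
trials_per_block = 40

blocks_per_participant = devices * menu_sizes
trials_per_participant = blocks_per_participant * trials_per_block
total_trials = participants * trials_per_participant

print(blocks_per_participant)   # 18
print(trials_per_participant)   # 720
print(total_trials)             # 25920
```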

Experiment Overall Results

Group     Mean RT (s.d.)   Mean Errors (s.d.)   Mean % Errors
Exposed   0.98 (.23)       0.64 (1.0)           1.6%
Hidden    1.10 (.31)       3.27 (3.57)          8.2%
Marking   1.10 (.31)       3.76 (3.67)          9.4%

So exposing menus is faster … or is it?
Let’s factor things out more.
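The percentage column appears to be the mean error count divided by the 40 trials in a block. That reading is an assumption, but it reproduces the reported percentages:

```python
TRIALS_PER_BLOCK = 40  # from the experiment design

# Mean errors per group, copied from the results table.
mean_errors = {"Exposed": 0.64, "Hidden": 3.27, "Marking": 3.76}

percent_errors = {
    group: 100 * errors / TRIALS_PER_BLOCK
    for group, errors in mean_errors.items()
}

for group, percent in percent_errors.items():
    print(f"{group}: {percent:.1f}%")
# Exposed: 1.6%
# Hidden: 8.2%
# Marking: 9.4%
```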

A Learning Effect

When we graph over the number of trials, we find a difference between exposed and hidden menus. This suggests that participants may eventually become faster using marking menus (was hypothesized). A later study verified this.

Factoring to Expose Interactions

Increasing menu size increases selection time and number of errors (was hypothesized).
No differences across menu groups in terms of response time.
That is, until we factor by menu size AND menu group
– Then we see that menu size has interaction effects on Hidden groups not seen in Exposed group
– This was hypothesized (12 easier than 11)

Factoring to Expose Interactions

Stylus and mouse outperformed trackball (hypothesized)
Stylus and mouse the same (not hypothesized)
Initially, effect of input device did not interact with menu type
– this is when comparing globally
– BUT ...

More detailed analysis:
– Compare both by menu type and device type
– Stylus significantly faster with Marking group
– Trackball significantly slower with Exposed group
– Not hypothesized!

Average response time and errors as a function of device, menu size, and menu type.

Potential explanations:

Markings provide feedback for when stylus is pressed properly.
Ink trail is consistent with the metaphor of using a pen.

Experiment Design

  E    H    M
  M    T    S
  T    S    M
  S    M    T

How can we tell if the order in which the device appears has an effect on the final outcome?

Some evidence:
There is no significant difference among devices in the Hidden group.
Trackball was slowest and most error prone in all three cases.
Still, there may be some hidden interactions, but they are unlikely to be strong given the previous graph.

Statistical Tests

Need to test for statistical significance
– This is a big area
– Assuming a normal distribution:
  • Student’s t-test to compare the means of two groups
  • ANOVA to compare the means of more than two groups

Adapted from slide by James Landay

Analyzing the Numbers

Example: trying to get task time <= 30 min.
– test gives: 20, 15, 40, 90, 10, 5
– mean (average) = 30
– median (middle) = 17.5
– looks good!
– wrong answer, not certain of anything

Factors contributing to our uncertainty
– small number of test users (n = 6)
– results are very variable (standard deviation = 32)
  • std. dev. measures dispersal from the mean
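The summary figures on this slide can be reproduced with Python's standard `statistics` module. The sketch uses the sample standard deviation (dividing by n - 1), which is an assumption about how the slide's value of 32 was computed, though it matches after rounding.

```python
import statistics

# Task times in minutes from the six test users.
times = [20, 15, 40, 90, 10, 5]

mean = statistics.mean(times)
median = statistics.median(times)
stdev = statistics.stdev(times)  # sample standard deviation (n - 1)

print(mean)          # 30
print(median)        # 17.5
print(round(stdev))  # 32
```

The single 90-minute result pulls the mean well above the median, which is one reason a mean of 30 from six users says little about meeting a 30-minute target.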

Adapted from slide by James Landay

Analyzing the Numbers (cont.)

This is what statistics are for

Crank through the procedures and you find
– 95% certain that typical value is between 5 & 55

Usability test data is quite variable
– need lots to get good estimates of typical values
– 4 times as many tests will only narrow range by 2x
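The slide does not say which procedure produced the 5 to 55 range. A normal-approximation 95% confidence interval for the mean, sketched below on the six task times from the earlier example, reproduces it, so that is the assumption made here. The second part shows why quadrupling the sample only halves the range: the standard error shrinks with the square root of n (assuming the standard deviation stays the same).

```python
import math
import statistics

times = [20, 15, 40, 90, 10, 5]  # task times in minutes, n = 6
n = len(times)
mean = statistics.mean(times)
stdev = statistics.stdev(times)

# Two-sided 95% critical value of the standard normal, about 1.96.
z = statistics.NormalDist().inv_cdf(0.975)

def half_width(sample_size):
    """Half-width of the interval for a given sample size, same stdev."""
    return z * stdev / math.sqrt(sample_size)

low, high = mean - half_width(n), mean + half_width(n)
print(round(low), round(high))  # 5 55

# Four times as many tests: the interval is only half as wide.
print(half_width(4 * n) / half_width(n))
```

With only six users, a small-sample t interval (5 degrees of freedom, critical value about 2.57) would be wider still, roughly -3 to 63, which reinforces the slide's point about uncertainty.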

Followup Work

Hierarchical Marking Menu study

Followup Work

Results of use of marking menus over an extended period of time
– two person extended study
– participants became much faster using gestures without viewing the menus

Followup Work

Results of use of marking menus over an extended period of time
– participants temporarily returned to “novice” mode when they had been away from the system for a while

Summary

Formal studies can reveal detailed information but take extensive time/effort

Human participants entail special requirements

Experiment design involves
– Factors, levels, participants, tasks, hypotheses
– Important to consider which factors are likely to have real effects on the results, and isolate these

Analysis
– Often need to involve a statistician to do it right
– Need to determine statistical significance
– Important to make plots and explore the data

References

Kurtenbach, Sellen, and Buxton, Some Articulatory and Cognitive Aspects of “Marking Menus”, Graphics Interface ‘94, http://reality.sgi.com/gordo_tor/papers
Kurtenbach and Buxton, User Learning and Performance with Marking Menus, Graphics Interface ‘94, http://reality.sgi.com/gordo_tor/papers
Jain, The Art of Computer Systems Performance Analysis, Wiley, 1991
http://www.statsoft.com/textbook/stanman.html
Gonick and Smith, The Cartoon Guide to Statistics, HarperPerennial, 1993
Dix et al. textbook

Discuss Jeffries et al.

Compared 4 Evaluation Techniques
– Heuristic Evaluation
– Software Guidelines
– Cognitive Walkthroughs
– Usability Testing

Findings?