
2011-10-21


Usability Study and Error Analysis of Space Logistics Analysis Tools

HFES NEC Student Research Conference

October 14, 2011, Cambridge, Massachusetts

Paul Grogan, Chaiwoo Lee, and Prof. de Weck

MIT Engineering Systems Division

MIT Strategic Engineering

Designing Systems for an Uncertain Future

Slide 2

Context: Space Exploration Logistics

• Infrequent, long duration transports

• Limited cargo capacity

• Critical resource requirements

• Coupled missions in campaigns

Software tools assist analysis

… but tools must be usable


Slide 3

Analysis Tools

| | SpaceNet (SN) | Spreadsheet (SS) |
|---|---|---|
| Type of Tool | General-purpose application | Ad hoc (improvised) |
| Analysis Method | Discrete event simulation | Cell-based formulas |
| User Interface | Java Swing GUI | Microsoft Excel |
| Visualizations | Plots, animations, etc. | None |
| Error-checking | Simulation error messages | Status messages |

Study Motivation & Objectives

• Applying usability design principles and software testing methods to the domain of space logistics

• Research questions

– What is the comparative effectiveness and efficiency of space logistics analysis tools?

– How do the user experience and usage patterns compare between the two tools?

• Answer questions with comprehensive usability testing involving human subjects

Slide 4


Study Design

• Design for comparative evaluation

• Randomized, orthogonal assignment of experimental variables

– Tool (SpaceNet vs. Spreadsheet)

– Scenario (C vs. D)

• Within-subject evaluation with 12 volunteer subjects

– Primary group: 8 with space background, aged 22-30 (1♀, 7♂)

– Secondary group: 4 without space background, aged 24-32 (2♀, 2♂)

• Study procedure

– Session 1 (~90 min): Consent → tutorial → 1st scenario (part 1 → part 2) → questionnaire

– Session 2 (~90 min, different day): Tutorial → 2nd scenario (part 1 → part 2) → questionnaire → interview
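One way to picture the orthogonal tool/scenario assignment is the short Python sketch below. It is illustrative only and not the procedure the study used: the function name `assign_conditions`, the seed, and the balancing rule (each of the four first-session tool/scenario combinations used equally often) are assumptions. Subject IDs 5 to 16 follow the numbering in the per-subject backup table.

```python
import itertools
import random

TOOLS = ("SN", "SS")      # SpaceNet, Spreadsheet
SCENARIOS = ("C", "D")


def assign_conditions(subject_ids, seed=0):
    """Return {subject: [(session, tool, scenario), ...]} with both factors crossed.

    Hypothetical helper: every subject sees both tools and both scenarios,
    one pairing per session.
    """
    # Four possible first-session conditions: (SN, C), (SN, D), (SS, C), (SS, D).
    first_conditions = list(itertools.product(TOOLS, SCENARIOS))
    # Cycle through them (balanced when the subject count is a multiple of four),
    # then shuffle so the order of assignment is random.
    pool = [first_conditions[i % len(first_conditions)]
            for i in range(len(subject_ids))]
    random.Random(seed).shuffle(pool)

    plan = {}
    for subject, (tool, scenario) in zip(subject_ids, pool):
        other_tool = TOOLS[1 - TOOLS.index(tool)]
        other_scenario = SCENARIOS[1 - SCENARIOS.index(scenario)]
        plan[subject] = [(1, tool, scenario), (2, other_tool, other_scenario)]
    return plan


plan = assign_conditions(list(range(5, 17)))
for subject, sessions in sorted(plan.items()):
    print(subject, sessions)
```

With 12 subjects, each first-session combination is used three times, so neither tool nor scenario is confounded with session order.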

Slide 5

Usability Metrics

Slide 6

ISO 9241-11 definitions

| Factor | Description |
|---|---|
| Effectiveness | Accuracy and completeness to achieve goals |
| Efficiency | Resources expended to achieve goals |
| Satisfaction | Freedom from discomfort, positive attitudes |

Factors and metrics for this study

| Factor | Description | Metric | Data collection |
|---|---|---|---|
| Effectiveness | Modeling missions completely with high research values | Completeness / outcome quality | Observation |
| | | Perception of outcomes | Questionnaire |
| Efficiency | The time and effort needed | Completion time / time in mode / time until event | Observation |
| | | Mental effort / ease of use / complexity | Questionnaire |
| Error Tolerance and Prevention | Making fewer errors, recovering quickly from errors, feeling a sense of control in usage | Error rate / recovery rate / recovery time | Observation |
| | | Annoyance / confidence / predictability / intuitiveness / familiarity | Questionnaire |


Scenario Description

Slide 7

Scenarios C & D:

• Part 1: Efficiency

– Create new model

  • Lunar orbital mission

  • Validate propellant levels

– Not time limited

• Part 2: Effectiveness

– Modify existing model parameters with constraints

  • Lunar surface exploration

  • Maximize REC: Relative Exploration Capability

– Limited to 15 minutes

Part 1 – Efficiency Results

Slide 8

[Figure: Task Completion Times. Time (s) with SpaceNet vs. Spreadsheet for each of four tasks: Assemble Launch Stack, Earth Launch**, Earth Departure Burn, Lunar Arrival Burn**]

** (p<0.01)


Part 2 – Effectiveness Results

Slide 9

Outcome Quality (% baseline REC)

Significant Effects:

• Non-space users did not increase quality using SpaceNet tool

• 2nd session resulted in higher quality

• Spreadsheet tool produced:

– Higher quality

– Fewer errors

– Less time in recovery

[Figure: Quality (%) by tool, SpaceNet vs. Spreadsheet]

User Perception Results

Slide 10

| | SpaceNet (SN) | Spreadsheet (SS) |
|---|---|---|
| Questionnaire | Convenient, easy to use; confident, intuitive, and familiar; higher quality outcomes | More mental effort (p<0.05) |
| Interview Comments | Graphical and visual; “busier” and less transparent; errors pin-pointed with messages; “more convenient in the long run” | Easier to “play” with inputs; unconfident of error detection; “not intuitive” compared to mental model; “not scalable,” at some point “chokes” |


Error Analysis Logging

Slide 11

[Figure: timeline of logged events. Milestones: Assemble Launch Stack, Earth Launch, Earth Departure, Moon Arrival, Verify Propellant. Error markers: Error, Detection, Correction, Undetected Error]

Error Analysis – Key Results

Slide 12

Error Comparison Across Tools

[Figure: errors linked to how they were detected. Circle size: frequency (# errors). Line width: error-detection pairs]


User experience & usage

• User Perception

– Model input accessibility

– Graphics and feedback

– Mental model agreement

• Error Analysis

– Error susceptibility differs

– Detection method differs

Revisiting Study Objectives

Slide 13

Efficiency & effectiveness

• SpaceNet more efficient at parts of model creation

• Spreadsheet more effective for modifying existing model

Future work

• Investigate scalability on complex campaigns

• Integrate best practices into future analysis tools

Slide 14

Image credit: NASA

Questions?

spacenet.mit.edu

Grogan, P., C. Lee, and O. de Weck, “Comparative Usability Study of Two Space Logistics Analysis Tools,” AIAA-2011-7345, AIAA Space 2011 Conference and Exposition, Long Beach, California, Sept. 26-29, 2011.

Acknowledgements:

– Experimental subject volunteers

– DoD Air Force Office of Scientific Research, NDSEG Fellowship, 32 CFR 168a (Grogan)

– Samsung Scholarship (Lee)


Slide 15

Backup Slides

Slide 16

Scenario C – Part 1

[Figure: mission timeline. Nodes: KSC, LEO, LLPO. Time (days): 0.0, 1.0, 2.0, 3.0, 7.0]

Residual Propellant

• Upper Stage: 3,777 kg

• Propulsion Module: 15,697 kg


Slide 17

Scenario C – Part 2

[Figure: mission timeline. Nodes: KSC, LEO, LLPO, LSP, PSZ. Time (not to scale): 0.0, 1.0, 2.0, 3.0, 7.0, 8.0, 9.0, 14.0, 15.0, 16.0, 20.0]

Baseline REC: 1.16

Slide 18

Scenario D – Part 1

[Figure: mission timeline. Nodes: KSC, LEO, LLOI. Time (days): 0.0, 1.0, 1.5, 3.0, 7.0]

Residual Propellant

• Third Stage: 10,018 kg

• Service Module: 3,679 kg


Slide 19

Scenario D – Part 2

[Figure: mission timeline. Nodes: KSC, LEO, LLOI, TLV, PSZ. Time (not to scale): 0.0, 0.5, 4.5, 5.0, 8.0, 8.5, 12.5]

Baseline REC: 0.49
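The Part 2 results slide reports outcome quality as a percentage of the baseline REC, and the two Part 2 scenario slides give those baselines (1.16 for Scenario C, 0.49 for Scenario D). A minimal Python sketch of that normalization follows, assuming quality is simply 100 × REC / baseline. The function name `outcome_quality` and the example REC values are hypothetical.

```python
# Baseline REC per scenario, as stated on the Part 2 scenario slides.
BASELINE_REC = {"C": 1.16, "D": 0.49}


def outcome_quality(rec: float, scenario: str) -> float:
    """Express a subject's final REC as a percentage of the scenario baseline.

    Assumption: quality = 100 * REC / baseline, so an unmodified model scores 100%.
    """
    return 100.0 * rec / BASELINE_REC[scenario]


# Hypothetical final REC values, for illustration only.
print(round(outcome_quality(1.16, "C"), 2))   # unchanged Scenario C model
print(round(outcome_quality(0.98, "D"), 2))   # Scenario D model with doubled REC
```

Under this assumption an unchanged model scores 100%, and doubling the Scenario D baseline REC scores about 200%.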

Slide 20

| Factor | Metric | Definition | Collection Method |
|---|---|---|---|
| Effectiveness | Completeness (Part 1) | % of events correctly completed in 5 minutes | Observation |
| | Outcome quality (Part 2) | % increase in relative exploration capability in 15 minutes | Observation |
| | Perception of outcomes | Perceived quality of task outcomes | Questionnaire |
| Efficiency | Completion time (Part 1) | Time to complete the tasks given in the scenario | Observation |
| | Time in mode (Part 1) | Time spent on each event in the scenario | Observation |
| | Time until event (Part 1) | Time elapsed before first creating an event correctly | Observation |
| | Time until event (Part 2) | Time elapsed before first making a valid increase in relative exploration capability | Observation |
| | Mental effort | Perceived mental effort required to do the given task | Questionnaire |
| | Ease of use | The degree to which the system is convenient for completing the scenario | Questionnaire |
| | Complexity | Perceived complicatedness and difficulty | Questionnaire |
| Error Tolerance and Prevention | Error rate | Number of errors made by a user during the process of completing a task | Observation |
| | Recovery rate | Percentage of errors correctly recovered | Observation |
| | Recovery time | Percentage of time spent recovering from errors | Observation |
| | Annoyance | Perceived frustration and irritation | Questionnaire |
| | Confidence | The degree to which a user felt confident using the interface without the fear of making mistakes | Questionnaire |
| | Predictability | The degree to which the user was able to predict how the interface will function | Questionnaire |
| | Intuitiveness | Perceived ability to know or understand the interface without cognitive effort | Questionnaire |
| | Familiarity | The degree to which a user recognizes interface components and views their interaction as natural | Questionnaire |
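The three observed error metrics defined above (error rate, recovery rate, recovery time) could be computed from a timestamped error log along the lines of the Python sketch below. It is a sketch under stated assumptions rather than the study's actual logging scheme: the `ErrorRecord` type and `error_metrics` function are hypothetical, recovery is assumed to run from detection to correction, and the sample log is invented.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ErrorRecord:
    """One logged error (hypothetical structure); times are seconds into the task."""
    made_at: float
    detected_at: Optional[float] = None    # None if the user never noticed the error
    corrected_at: Optional[float] = None   # None if the error was never fixed


def error_metrics(errors: Sequence[ErrorRecord], task_time: float) -> dict:
    """Compute error rate (#), recovery rate (%), and recovery time (% of task time)."""
    recovered = [e for e in errors
                 if e.detected_at is not None and e.corrected_at is not None]
    # Assumption: time "spent recovering" is the span from detection to correction.
    recovering_s = sum(e.corrected_at - e.detected_at for e in recovered)
    return {
        "error_rate": len(errors),
        # Undefined (None) when no errors were made, like the n/a table entries.
        "recovery_rate_pct": 100.0 * len(recovered) / len(errors) if errors else None,
        "recovery_time_pct": 100.0 * recovering_s / task_time,
    }


# Invented log: three errors in a 200 s task, one of them never detected.
log = [ErrorRecord(10, 20, 50), ErrorRecord(60), ErrorRecord(100, 110, 130)]
metrics = error_metrics(log, task_time=200)
print(metrics)
```

In this invented log, two of the three errors are recovered and 50 s of the 200 s task is spent recovering, so the sketch reports 3 errors, a recovery rate of about 66.7%, and a recovery time of 25%.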


Slide 21

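| Subject | Tool | Session | Scenario | Task 1 (s) | Task 2 (s) | Task 3 (s) | Task 4 (s) | Comp. Time (s) | # Tasks in 5 min | Correct Event (s) | # Errors (Part 1) | % Recov. (Part 1) | Time to Valid REC | Quality (%) | # Errors (Part 2) | % Recov. (Part 2) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | SN | 1 | C | 53 | 64 | 24 | 36 | 245 | 4 | 117 | 3 | 33.33 | 143 | 300.86 | 2 | 50.00 |
| 5 | SS | 2 | D | 35 | 221 | 82 | 25 | 730 | 2 | 186 | 2 | 100.00 | 348 | 383.67 | 0 | n/a |
| 6 | SS | 1 | C | 30 | 410 | 73 | 31 | 544 | 3 | 52 | 2 | 60.00 | 565 | 401.72 | 2 | 50.00 |
| 6 | SN | 2 | D | 52 | 160 | 59 | 11 | 412 | 0 | 102 | 5 | 100.00 | n/a | 100.00 | 4 | 50.00 |
| 7 | SN | 1 | D | 56 | 43 | 24 | 18 | 177 | 1 | 42 | 0 | | 554 | 214.29 | 2 | 50.00 |
| 7 | SS | 2 | C | 42 | 388 | 18 | 35 | 483 | 4 | 57 | 4 | 50.00 | 48 | 419.83 | 1 | 100.00 |
| 8 | SS | 1 | D | 49 | 122 | 107 | 101 | 379 | 0 | 352 | 1 | 100.00 | 810 | 146.94 | 0 | n/a |
| 8 | SN | 2 | C | 103 | 249 | 84 | 20 | 598 | 2 | 49 | 5 | 75.00 | 587 | 231.03 | 1 | 100.00 |
| 9 | SN | 1 | C | 87 | 369 | 74 | 29 | 960 | 1 | 208 | 3 | 66.66 | n/a | 100.00 | 5 | 0.00 |
| 9 | SS | 2 | D | 28 | 787 | 74 | 53 | 1462 | 1 | 287 | 9 | 100.00 | 871 | 626.53 | 1 | 100.00 |
| 10 | SS | 1 | C | 93 | 264 | 245 | 27 | 3567 | 0 | 667 | 6 | 83.33 | 332 | 296.55 | 1 | 0.00 |
| 10 | SN | 2 | D | 66 | 142 | 55 | 28 | 560 | 3 | 66 | 3 | 66.66 | n/a | 100.00 | 1 | 100.00 |
| 11 | SN | 1 | D | 40 | 145 | 111 | 10 | 390 | 1 | 383 | 1 | 100.00 | n/a | 100.00 | 2 | 50.00 |
| 11 | SS | 2 | C | 30 | 353 | 43 | 52 | 839 | 3 | 40 | 5 | 100.00 | 801 | 222.41 | 0 | n/a |
| 12 | SS | 1 | D | 62 | 121 | 35 | 19 | 351 | 3 | 133 | 3 | 50.00 | 107 | 208.16 | 1 | 0.00 |
| 12 | SN | 2 | C | 71 | 96 | 59 | 19 | 496 | 4 | 62 | 2 | 66.66 | n/a | 100.00 | 1 | 0.00 |
| 13 | SN | 1 | C | 114 | 62 | 141 | 13 | 450 | 1 | 411 | 3 | 66.66 | 616 | 300.86 | 0 | n/a |
| 13 | SS | 2 | D | 22 | 245 | 90 | 63 | 463 | 2 | 414 | 2 | 75.00 | 98 | 214.29 | 1 | 0.00 |
| 14 | SS | 1 | C | 129 | 146 | 97 | 59 | 549 | 2 | 505 | 1 | 100.00 | 362 | 100.86 | 1 | 0.00 |
| 14 | SN | 2 | D | 65 | 55 | 52 | 11 | 252 | 4 | 250 | 0 | n/a | 615 | 175.51 | 1 | 0.00 |
| 15 | SN | 1 | D | 52 | 67 | 52 | 11 | 251 | 4 | 251 | 0 | n/a | 542 | 208.16 | 1 | 100.00 |
| 15 | SS | 2 | C | 45 | 130 | 59 | 35 | 282 | 4 | 282 | 1 | 100.00 | 323 | 481.03 | 1 | 100.00 |
| 16 | SS | 1 | D | 133 | 149 | 376 | 81 | 1869 | 1 | 114 | 3 | 33.33 | 770 | 208.16 | 1 | 0.00 |
| 16 | SN | 2 | C | 114 | 105 | 108 | 18 | 616 | 1 | 460 | 1 | 100.00 | 215 | 109.48 | 3 | 66.66 |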

Complete Analysis – Part 1

Slide 22

| Metric | Between Groups (SN): 1° | Between Groups (SN): 2° | Between Groups (SS): 1° | Between Groups (SS): 2° | Paired Scenarios: C | Paired Scenarios: D | Paired Sessions: 1 | Paired Sessions: 2 | Paired Tools: SN | Paired Tools: SS |
|---|---|---|---|---|---|---|---|---|---|---|
| Completion Time (s) | 375.1 | 601.5 | 662.4 | 1554.8 | 802.4 | 608.0 | 811.0 | 599.4 | 450.6 | 959.8 |
| Time to Correct Task (s) | 114.4 | 111.8 | 119.1* | 349.8* | 199.8 | 109.7 | 143.5 | 166.0 | 113.5 | 196.0 |
| Time in Task 1 (s) | 76.1 | 66.0 | 60.6 | 53.3 | 75.9 | 55.0 | 74.8 | 56.1 | 72.8 | 58.2 |
| Time in Task 2 (s) | 100.6 | 188.0 | 226.4 | 381.3 | 219.7 | 188.1 | 163.5 | 244.25 | 129.8** | 278.0** |
| Time in Task 3 (s) | 68.0 | 74.8 | 112.8 | 99.3 | 85.4 | 93.0 | 113.3 | 62.3 | 70.5 | 108.3 |
| Time in Task 4 (s) | 17.3 | 21.5 | 53.8 | 37.8 | 31.2 | 35.9 | 36.3 | 30.8 | 18.7** | 48.4** |
| Tasks in 5 Minutes (#) | 2.63 | 2.50 | 1.75 | 1.50 | 1.50* | 2.75* | 2.17 | 2.08 | 2.58 | 1.67 |
| Error Rate (#) | 1.63 | 2.25 | 2.38* | 5.00* | 3.08 | 2.00 | 2.33 | 2.75 | 1.83 | 3.25 |
| Recovery Rate (%) | 75.0 | 70.8 | 77.3 | 83.3 | 70.6 | 80.6 | 65.9 | 85.2 | 73.1 | 78.0 |
| Recovery Time (%) | 13.5 | 1.95 | 33.8 | 47.0 | 33.7 | 20.0 | 22.9 | 30.8 | 15.5* | 38.2* |

* Significant difference at α=0.05, ** Significant difference at α=0.01
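The Paired Tools columns can be checked against the per-subject data on Slide 21. The Python sketch below recomputes the Time in Task 2 means for each tool and a paired t statistic across the 12 subjects. The slides do not name the statistical test used, so the paired t-test here is an assumption, and `paired_t` is a hypothetical helper built only on the standard library.

```python
import math
from statistics import mean, stdev

# Time in Task 2 (s) for subjects 5..16, copied from the per-subject table (Slide 21).
sn_task2 = [64, 160, 43, 249, 369, 142, 145, 96, 62, 55, 67, 105]      # SpaceNet
ss_task2 = [221, 410, 388, 122, 787, 264, 353, 121, 245, 146, 130, 149]  # Spreadsheet


def paired_t(a, b):
    """Paired t statistic for b - a: mean difference over its standard error."""
    diffs = [y - x for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Two-tailed critical value of Student's t for df = 11 at alpha = 0.01.
T_CRIT_DF11_ALPHA_01 = 3.106

t_stat = paired_t(sn_task2, ss_task2)
print("SN mean (s):", mean(sn_task2))
print("SS mean (s):", mean(ss_task2))
print("significant at 0.01:", abs(t_stat) > T_CRIT_DF11_ALPHA_01)
```

The two means come out to 129.75 s and 278 s, matching the 129.8 and 278.0 reported in the table, and under the paired t-test assumption the difference clears the α=0.01 threshold, consistent with the ** marking.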


Complete Analysis – Part 2

Slide 23

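| Metric | Between Groups (SN): 1° | Between Groups (SN): 2° | Between Groups (SS): 1° | Between Groups (SS): 2° | Paired Scenarios: C | Paired Scenarios: D | Paired Sessions: 1 | Paired Sessions: 2 | Paired Tools: SN | Paired Tools: SS |
|---|---|---|---|---|---|---|---|---|---|---|
| Outcome Quality (%) | 205.0** | 100.0** | 294.6 | 338.4 | 255.4 | 223.8 | 215.5 | 263.6 | 170.0* | 309.2* |
| Time to REC Increase (s) | 467.4 | n/a | 415.5 | 527.8 | 327.7 | 533.9 | 542.4 | 319.1 | 467.4 | 394.1 |
| Error Rate (#) | 1.75 | 2.25 | 0.88 | 0.75 | 1.50 | 1.25 | 1.50 | 1.25 | 1.92* | 0.83* |
| Recovery Rate (%) | 59.5 | 37.5 | 41.7 | 33.3 | 39.6 | 50.0 | 25.0* | 64.6* | 45.8 | 43.8 |
| Recovery Time (%) | 12.6 | 36.6 | 3.4 | 25.4 | 14.8 | 16.5 | 12.0 | 18.3 | 20.6* | 10.8* |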


• 7-point Likert scale

• Questions related to:

– Mental Effort

– Convenience

– Predictability

– High-quality Outcome

– Complicated Interface

– Annoyance or Frustration

– Confidence

– Intuitiveness

– Familiarity

• User (space, non-space)

– Not significant

• Scenario (C, D)

– Not significant

• Ordering

– Not significant

• Tool

– SN: significantly less mental effort

Questionnaire Analysis

Slide 24


Complete Analysis – Questionnaire

Slide 25

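| Question | Between Groups (SN): 1° | Between Groups (SN): 2° | Between Groups (SS): 1° | Between Groups (SS): 2° | Paired Scenarios: C | Paired Scenarios: D | Paired Sessions: 1 | Paired Sessions: 2 | Paired Tools: SN | Paired Tools: SS |
|---|---|---|---|---|---|---|---|---|---|---|
| Q1: Mental Effort | 4.25 | 3.25 | 5.12 | 4.75 | 4.75 | 4.17 | 4.92 | 4.00 | 3.92* | 5.00* |
| Q2: Convenience | 5.25 | 5.00 | 3.75 | 4.50 | 4.67 | 4.50 | 4.67 | 4.50 | 5.17 | 4.00 |
| Q3: Predictability | 4.75 | 4.25 | 4.12 | 6.00 | 4.67 | 4.67 | 4.67 | 4.67 | 4.58 | 4.75 |
| Q4: High-quality Outcome | 4.38 | 4.00 | 4.00 | 3.25 | 3.58 | 4.41 | 3.75 | 4.25 | 4.25 | 3.75 |
| Q5: Complicated Interface | 3.81 | 3.25 | 4.63 | 3.50 | 3.88 | 4.00 | 4.08 | 3.79 | 3.63 | 4.25 |
| Q6: Annoyance or Frustration | 3.25 | 5.50 | 4.50 | 4.00 | 4.08 | 4.25 | 4.83 | 3.50 | 4.00 | 4.33 |
| Q7: Confidence | 4.25 | 5.50 | 3.75 | 5.00 | 4.42 | 4.42 | 4.92 | 3.92 | 4.67 | 4.17 |
| Q8: Intuitiveness | 5.38 | 5.00 | 4.25 | 5.25 | 4.75 | 5.08 | 4.75 | 5.08 | 5.25 | 4.58 |
| Q9: Familiarity | 4.75 | 5.50 | 4.13 | 5.25 | 4.50 | 5.00 | 4.67 | 4.83 | 5.00 | 4.50 |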

