
Evaluation and metrics: Measuring the effectiveness of

virtual environments

Doug Bowman

Edited by C. Song

(C) 2005 Doug Bowman, Virginia Tech

11.2.2 Types of evaluation

Cognitive walkthrough

Heuristic evaluation

Formative evaluation
  Observational user studies
  Questionnaires, interviews

Summative evaluation
  Task-based usability evaluation
  Formal experimentation

Sequential evaluation

Testbed evaluation


11.5 Classifying evaluation techniques

Evaluation techniques can be classified along three dimensions: user involvement (requires users vs. does not require users), context of evaluation (generic vs. application-specific), and type of results (quantitative vs. qualitative).

Requires users
  Generic, quantitative: formal summative evaluation; post-hoc questionnaire
  Generic, qualitative: informal summative evaluation; post-hoc questionnaire
  Application-specific, quantitative: formative evaluation; formal summative evaluation; post-hoc questionnaire
  Application-specific, qualitative: formative evaluation (informal and formal); post-hoc questionnaire; interview / demo

Does not require users
  Generic, quantitative: (generic performance models for VEs, e.g., Fitts' law)
  Generic, qualitative: heuristic evaluation
  Application-specific, quantitative: (application-specific performance models for VEs, e.g., GOMS)
  Application-specific, qualitative: heuristic evaluation; cognitive walkthrough


11.4 How VE evaluation is different

Physical issues
  User can't see the real world in an HMD
  Think-aloud protocols and speech input are incompatible

Evaluator issues
  Evaluator can break presence
  Multiple evaluators are usually needed


11.4 How VE evaluation is different (cont.)

User issues
  Very few expert users
  Evaluations must include rest breaks to avoid possible sickness

Evaluation type issues
  Lack of heuristics/guidelines
  Choosing independent variables is difficult


11.4 How VE evaluation is different (cont.)

Miscellaneous issues
  Evaluations must focus on lower-level entities (interaction techniques, or ITs) because of the lack of standards
  Results are difficult to generalize because of differences between VE systems


11.6.1 Testbed evaluation framework

Main independent variables: ITs

Other considerations (independent variables)
  Task (e.g., target known vs. target unknown)
  Environment (e.g., number of obstacles)
  System (e.g., use of collision detection)
  User (e.g., VE experience)

Performance metrics (dependent variables)
  Speed, accuracy, user comfort, spatial awareness, ...

Generic evaluation context


Testbed evaluation

[Figure: the testbed evaluation process. (1) Initial evaluation -> (2) Taxonomy -> (3) Outside factors (task, users, environment, system) and (4) Performance metrics -> (5) Testbed evaluation -> (6) Quantitative performance results -> (7) Heuristics & guidelines -> (8) User-centered application.]


Taxonomy

Establish a taxonomy of interaction techniques for the interaction task being evaluated.

Example task: changing an object's color
Three subtasks:
  Selecting the object
  Choosing a color
  Applying the color

Two possible technique components (TCs) for choosing a color:
  Changing the values of R, G, and B sliders
  Touching a point within a 3D color space
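The decomposition above can be written down explicitly so that a complete interaction technique is formed by picking one technique component per subtask. The following is a minimal sketch (Python, used here only for illustration); the TCs for selecting the object and applying the color are hypothetical placeholders, while the two color-choosing TCs come from the example above.

from itertools import product

# taxonomy: each subtask maps to its candidate technique components (TCs)
taxonomy = {
    "select object": ["ray-casting", "virtual hand"],            # hypothetical TCs
    "choose color":  ["RGB sliders", "3D color-space touch"],     # TCs from the example
    "apply color":   ["confirm button", "automatic on release"],  # hypothetical TCs
}

# every complete technique is one TC per subtask
techniques = list(product(*taxonomy.values()))
for tech in techniques:
    print(dict(zip(taxonomy.keys(), tech)))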


Outside Factors

A user’s performance on an interaction task may depend on a variety of factors.

Four categories:
  Task: distance to be traveled, size of the object to be manipulated
  Environment: number of obstacles, level of activity or motion
  User: spatial awareness, physical attributes (arm length, etc.)
  System: lighting model, mean frame rate, etc.
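In a testbed experiment these outside factors become secondary independent variables whose levels are crossed with the interaction techniques. A minimal sketch, with factor levels assumed from the examples in this chapter:

from itertools import product

# outside factors and example levels (levels drawn from the examples above)
factors = {
    "task":        ["target known", "target unknown"],
    "environment": ["few obstacles", "many obstacles"],
    "system":      ["collision detection on", "collision detection off"],
    "user":        ["novice", "VE-experienced"],
}

# cross all levels to enumerate the experimental conditions
conditions = [dict(zip(factors, levels)) for levels in product(*factors.values())]
print(len(conditions), "crossed conditions")  # 2 x 2 x 2 x 2 = 16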


Performance Metrics

Information about human performance

Speed, accuracy: quantitative

More subjective performance values
  Ease of use, ease of learning, and user comfort
  These concern the user's senses and body: user-centric performance measures


Testbed Evaluation

Final stage in the evaluation of interaction techniques for 3D interaction tasks

Generic, generalizable, and reusable evaluation through the creation of testbeds

Testbeds: environments and tasks that
  Involve all important aspects of a task
  Evaluate each component of a technique
  Consider outside influences on performance
  Have multiple performance measures


Application and Generalization of Results

Testbed evaluation produces models that characterize the usability of an interaction technique for the specified task. Usability is given in terms of multiple performance metrics with respect to various levels of outside factors, yielding a performance database (DB). More information is added to the DB each time a new technique is run through the testbed.

To choose interaction techniques for applications appropriately, one must understand the interaction requirements of the application. The performance results from testbed evaluation can then be used to recommend interaction techniques that meet those requirements.
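One way to picture this performance database is as a collection of records keyed by technique, task, and outside-factor levels, which can later be queried against an application's requirements. The following is a minimal sketch under assumed data shapes; the techniques, factor levels, and numbers are hypothetical.

from statistics import mean

performance_db = []  # grows with every technique run through the testbed

def add_result(technique, task, factors, metrics):
    # one record per testbed run: outside-factor levels plus measured metrics
    performance_db.append({"technique": technique, "task": task,
                           "factors": factors, "metrics": metrics})

def recommend(task, required_factors, metric):
    # rank techniques for a task by one metric, restricted to matching factor levels
    scores = {}
    for rec in performance_db:
        if rec["task"] == task and all(rec["factors"].get(k) == v
                                       for k, v in required_factors.items()):
            scores.setdefault(rec["technique"], []).append(rec["metrics"][metric])
    return sorted(((mean(vals), tech) for tech, vals in scores.items()), reverse=True)

# hypothetical usage: which selection technique is most accurate with many obstacles?
add_result("ray-casting", "selection", {"environment": "many obstacles"}, {"accuracy": 0.82})
add_result("virtual hand", "selection", {"environment": "many obstacles"}, {"accuracy": 0.74})
print(recommend("selection", {"environment": "many obstacles"}, "accuracy"))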


11.6.2 Sequential evaluation

Traditional usability engineering methods

Iterative design/eval.

Relies on scenarios, guidelines

Application-centric

[Figure: the sequential evaluation process. (1) User task analysis -> (A) Task descriptions, sequences & dependencies -> (2) Heuristic evaluation, using (B) Guidelines and heuristics -> (C) Streamlined user interface designs -> (3) Formative user-centered evaluation, using (D) Representative user task scenarios -> (E) Iteratively refined user interface designs -> (4) Summative comparative evaluation -> User-centered application.]


11.3 When is a VE effective?

Users’ goals are realized

User tasks done better, easier, or faster

Users are not frustrated

Users are not uncomfortable


11.3 How can we measure effectiveness?

System performance

Interface performance / User preference

User (task) performance

All are interrelated


Effectiveness case studies

Watson experiment: how system performance affects task performance

Slater experiments: how presence is affected

Design education: task effectiveness


11.3.1 System performance metrics

Avg. frame rate (fps)

Avg. latency / lag (msec)

Variability in frame rate / lag

Network delay

Distortion
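Several of these metrics can be logged directly by the application. A minimal sketch (assuming a per-frame timestamp log; the numbers are hypothetical) of deriving average frame rate and frame-time variability:

from statistics import mean, stdev

# hypothetical per-frame timestamps in seconds
frame_times = [0.000, 0.016, 0.033, 0.052, 0.068, 0.101]

intervals = [t1 - t0 for t0, t1 in zip(frame_times, frame_times[1:])]
avg_fps = 1.0 / mean(intervals)   # average frame rate (fps)
jitter = stdev(intervals)         # variability in frame time (seconds)
print(f"avg frame rate: {avg_fps:.1f} fps, frame-time std dev: {jitter * 1000:.1f} ms")

# End-to-end latency/lag would be measured separately, e.g. as the delay between
# a tracked head movement and the corresponding update on the display.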


System performance

Only important for its effects on user performance / preference
  Frame rate affects presence
  Net delay affects collaboration

Necessary, but not sufficient


Case studies - Watson

How does system performance affect task performance?

Vary avg. frame rate, variability in frame rate

Measure performance on closed-loop and open-loop tasks

e.g. B. Watson et al, Effects of variation in system responsiveness on user performance in virtual environments. Human Factors, 40(3), 403-414.


11.3.3 User preference metrics

Ease of use / learning

Presence

User comfort

Usually subjective (measured in questionnaires, interviews)


User preference in the interface

UI goals
  Ease of use
  Ease of learning
  Affordances
  Unobtrusiveness
  etc.

Achieving these goals leads to usability

Crucial for effective applications


Case studies - Slater

Presence measured via questionnaires

assumes that presence is required for some applications

e.g. M. Slater et al, Taking Steps: The influence of a walking metaphor on presence in virtual reality. ACM TOCHI, 2(3), 201-219.

Study effect of:
  Collision detection
  Physical walking
  Virtual body
  Shadows
  Movement


User comfort

Simulator sickness

Aftereffects of VE exposure

Arm/hand strain

Eye strain


Measuring user comfort

Rating scales

Questionnaires
  Kennedy's Simulator Sickness Questionnaire (SSQ)

Objective measures
  Stanney's work on measuring aftereffects
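As an illustration of the rating-scale approach, the following minimal sketch compares hypothetical pre- and post-exposure symptom ratings. It is not the published SSQ scoring, which defines its own symptom list, subscales, and weightings.

# hypothetical 0-3 symptom ratings before and after VE exposure
pre  = {"nausea": 0, "eye strain": 1, "dizziness": 0, "headache": 0}
post = {"nausea": 1, "eye strain": 2, "dizziness": 1, "headache": 0}

# per-symptom change and a simple overall increase score
change = {symptom: post[symptom] - pre[symptom] for symptom in pre}
print(change, "total increase:", sum(change.values()))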


11.3.2 Task performance metrics

Speed / efficiency

Accuracy

Domain-specific metrics
  Education: learning
  Training: spatial awareness
  Design: expressiveness


Speed-accuracy tradeoff

Subjects will (consciously or not) decide how to trade speed against accuracy

Must explicitly look at particular points on the curve

Manage the tradeoff

[Figure: speed vs. accuracy tradeoff curve]
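One way to look explicitly at particular points on the curve is to analyze speed and accuracy separately per instruction condition rather than collapsing them into a single score. A minimal sketch with a hypothetical trial log:

from statistics import mean

# hypothetical trial log: instruction condition, completion time, error flag
trials = [
    {"instruction": "favor speed",    "time_s": 2.1, "error": 1},
    {"instruction": "favor speed",    "time_s": 1.8, "error": 0},
    {"instruction": "favor accuracy", "time_s": 3.4, "error": 0},
    {"instruction": "favor accuracy", "time_s": 3.1, "error": 0},
]

for cond in ("favor speed", "favor accuracy"):
    sel = [t for t in trials if t["instruction"] == cond]
    print(cond,
          "mean time:", round(mean(t["time_s"] for t in sel), 2),
          "error rate:", round(mean(t["error"] for t in sel), 2))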


Case studies: learning

Measure effectiveness by comparing learning against a control group

Metric: standard test

Issue: time on task not the same for all groups

e.g. D. Bowman et al. The educational value of an information-rich virtual environment. Presence: Teleoperators and Virtual Environments, 8(3), June 1999, 317-331.


Aspects of performance

[Figure: system performance, interface performance, and task performance are interrelated and together determine effectiveness.]


11.7 Guidelines for 3D UI evaluation

Begin with informal evaluation

Acknowledge and plan for the differences between traditional UI and 3D UI evaluation

Choose an evaluation approach that meets your requirements

Use a wide range of metrics – not just speed of task completion


Guidelines for formal experiments

Design experiments with general applicability
  Generic tasks
  Generic performance metrics
  Easy mappings to applications

Use pilot studies to determine which variables should be tested in the main experiment

Look for interactions between variables – rarely will a single technique be the best in all situations
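A simple way to look for such interactions is to compare cell means across the crossed levels of two independent variables and see whether the ranking of techniques flips. A minimal sketch with hypothetical completion times:

from statistics import mean

# hypothetical completion times (seconds), keyed by (technique, environment)
times = {
    ("ray-casting",  "sparse"): [1.9, 2.1],
    ("ray-casting",  "dense"):  [4.0, 4.4],
    ("virtual hand", "sparse"): [2.5, 2.6],
    ("virtual hand", "dense"):  [3.0, 3.2],
}

# cell means for the technique x environment design
for cell, values in times.items():
    print(cell, "mean:", round(mean(values), 2))

# If ray-casting wins in sparse scenes but loses in dense ones, technique and
# environment interact: no single technique is best in all situations.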


Acknowledgments

Deborah Hix

Joseph Gabbard
