
Just Another Pretty Visualization?

The What, Where, When, Why and How of Evaluating Visualizations

Jim Foley, College of Computing, Georgia Institute of Technology, Atlanta, Georgia USA


Talk originally presented at the Opening Convocation, UNC-Charlotte Visualization Center, May 2006


Visualization

• Interactive visual representation of data to provide a user with information and insights

• Two subfields:
  Scientific visualization
  Information visualization



Scientific Visualization

• Data typically from physics/chemistry/biology

• Data has an inherent geometry
• Examples:
  Air flow, molecules, weather



Information Visualization

• Data typically abstract:
  Census, financial, business, …
• Must create a geometrical representation


Evaluation of Visualizations

• Examples taken from information visualization
• Methodologies that are appropriate for both SciVis and InfoVis


Roadmap

• The challenges of evaluation
• Subjective evaluation
• Cognition + perception - based evaluation
• Experimental evaluation
• Long-term Evaluation
• Conclusions


The Challenges of Evaluation

• The challenges of evaluation
• Subjective evaluation
• Evaluation based on model of cognition & perception
• Experimental evaluation
• Long-term Evaluation
• Conclusions


Evaluation Criteria

• Look pretty? Be effective?
• Extract numbers?
  Accuracy or speed or avoid errors or …
• "The purpose of computing is insight, not numbers" - Richard Hamming
  How do we measure insights? Are some insights better than others?
• "Detecting the Expected – Discovering the Unexpected™" - Jim Thomas
  If we don't know what is expected to be there, how do we know if we have found it?

Challenge 1


The Problem of Good, Better, Best

• There is no "proof of optimality"
• Even if we find all the info that we seek – could be faster with another viz
• Maybe another insight was there?
• Settle for "Good" or "Better" rather than "Best"

Challenge 2


Effectiveness Depends on …

• The data
• The user
  Domain knowledge
  Tool knowledge
  Right brain - left brain

Challenge 3


The Tool vs. the Visualization

• Usability ≠ Usefulness
• If the viz is hard to create, users may not use it
  even if the viz is perfect for the task
• Focus here is NOT on usability of the viz tool
  Assume it is easy to learn and use
  => Apply ALL usability tools/methods!

Challenge 4


Compare Best Apple to Best Orange

• Comparing two visualizations => each must be as good as possible
• A great implementation of a low-effectiveness visualization may perform better than a poor implementation of a high-effectiveness visualization

Challenge 5


The Default Trap

• Users tend to stick with default settings and visualizations, even though others would be better
  Example: Kobsa, An empirical evaluation of three commercial information visualization systems, InfoVis 2001
  Seen in other domains as well

Challenge 6


Purpose of Evaluation: Insight, not Numbers

• Good to know that A is better than B

• But MUCH better to know WHY A is better than B

=> Ground experiments in theory

=> Get inside users’ heads

Challenge 7


Subjective Evaluation

• The challenges of evaluation
• Subjective evaluation
• Evaluation based on model of cognition & perception
• Experimental evaluation
• Long-term Evaluation
• Conclusions


Subjective Evaluation

• Really subjective - "I like it so it must be GREAT"
• Less subjective - "All my friends like it so it must be REALLY GREAT"
• Tufte's "minimize chart junk" etc.
• Shneiderman's "Overview first, zoom and filter, details on demand"
• But
  Need more goal-oriented criteria
  Need HCI/InfoVis experts, not our friends


Potential Criteria

• Amar and Stasko, A Knowledge Task-Based Framework for Design and Evaluation of Information Visualizations, Best Paper, InfoVis 2004.

• Rationale-based tasks
  Expose uncertainty
  Concretize relationships
  Formulate cause and effect
• Worldview-based tasks
  Determination of domain parameters
  Multivariate explanation
  Confirm hypotheses


Pros/Cons (+/-)

+ Fast
+ Has a rationale basis
- Still based on subjective (but informed) judgments


Cognition & Perception - Based

• The challenges of evaluation
• Subjective evaluation
• Evaluation based on model of cognition & perception
• Experimental evaluation
• Long-term Evaluation
• Conclusions


Evaluation based on model of cognition & perception

• Lohse, A Cognitive Model for the Perception and Understanding of Graphics, CHI 1991

• In the spirit of GOMS and ACT
• Given one of three query types:
  Read: Did Ford's market share exceed 18% in 1983?
  Compare: In 1983, did Honda have more USA market share than Nissan?
  Trend: From 1979 to 1982, did GM's market share increase?


Lohse’s Model Accounts for

• Eye movements (saccades)
• Word recognition
• Retrieve into working memory
• Compare two units in working memory
• Etc.
  (a sketch of this kind of additive time prediction follows)
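The sketch below illustrates the general idea of such an additive, operator-based prediction. The operator set and the millisecond costs are illustrative assumptions, not the parameters from Lohse (1991).

```python
# Minimal sketch of a GOMS/Lohse-style additive time prediction.
# All durations below are hypothetical placeholder values.

OPERATOR_MS = {
    "saccade": 230,          # move the eyes to the next graphical element
    "recognize_word": 314,   # recognize a label or legend entry
    "retrieve_to_wm": 1200,  # bring a value into working memory
    "compare_in_wm": 450,    # compare two items held in working memory
}

def predict_answer_time(scan_sequence):
    """Predicted answer time = sum of the primitive operations performed."""
    return sum(OPERATOR_MS[op] for op in scan_sequence)

# Hypothetical scan sequence for a "compare" query on a bar chart:
# find each bar, read its label, load its value, then compare.
compare_query = [
    "saccade", "recognize_word", "retrieve_to_wm",
    "saccade", "recognize_word", "retrieve_to_wm",
    "compare_in_wm",
]
print(predict_answer_time(compare_query), "ms (predicted, not measured)")
```

Different content-equivalent visualizations yield different scan sequences, and therefore different predicted times, which is how the model compares designs.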


“Semantic Traces” on three content-equivalent visualizations, to answer the question “In 1983, did Nissan have more USA market share than Toyota?”


[Diagram: Lohse's model, including a working-memory component]

The model explained 58% of the variation in the verification experiment.


Pros/Cons (+/-)

- Focus is facts (read, compare, trend), not insights
+ But facts can lead to insights
+ Useful for comparing designs
+ Faster than experiments!
- Assumes the scanning sequence can be determined



Application

http://www.perceptualedge.com/example2.php


Experimental Evaluation

• The challenges of evaluation
• Subjective evaluation
• Evaluation based on model of cognition & perception
• Experimental evaluation
• Long-term Evaluation
• Conclusions


Experimental Comparison of Tree Visualization Systems

• Kobsa, User Experiments with Tree Visualization Systems, InfoVis 2004
• Compares 5 tree visualization systems + the Windows file browser
• Quantitative - measure results
• Qualitative - to (partially) understand results


[Screenshots of the five systems: Treemap 3.2, Tree Viewer, Hyperbolic Browser, BeamTrees, SequoiaView]


Experimental Design

• One data set - subset of the eBay item taxonomy, 5799 nodes
• Six systems: 5 viz + Windows file browser
• Between subjects: 48 S's, each uses one system => 8 S's per system (assignment sketch below)
• Each subject performs 15 tasks
  9 tasks concern directory structure
    Maximum depth of hierarchy? Tree balanced or unbalanced?
  6 tasks concern file or directory attributes
    Find file with name = xxxx; find name of largest file
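A minimal sketch of the between-subjects assignment described above: 48 participants, six systems, 8 participants per system. This is illustrative only, not Kobsa's actual assignment procedure.

```python
# Randomly assign 48 participants to 6 systems, 8 per system.
import random

systems = ["Treemap 3.2", "SequoiaView", "BeamTrees",
           "Hyperbolic Browser", "Tree Viewer", "Windows File Browser"]
participants = list(range(1, 49))   # 48 subjects, numbered 1..48
random.shuffle(participants)

assignment = {system: sorted(participants[i * 8:(i + 1) * 8])
              for i, system in enumerate(systems)}

for system, subjects in assignment.items():
    print(f"{system}: {subjects}")
```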


Results - Correct Answers

[Bar chart of correct answers per system: Treemap (TM), SequoiaView (SV), BeamTrees (BT), Hyperbolic Browser (HB), Tree Viewer (TV), Windows File Browser (WFB)]

BT << TM, SV, HB, WFB (significantly fewer correct answers, 1% level)
TV << TM, HB, WFB (1% level)
TV < SV (5% level)


Results - Effect of Task Type

[Chart: correct answers for BeamTrees and Tree Viewer, split into structure tasks vs. attribute tasks]

Tree Viewer: structure/attribute difference statistically significant at 0.0001
BeamTrees: statistically significant at 0.001


Results - Task Times

• Range for all tasks: 101 seconds (Windows File Browser) and 106 (Treemap) up to 188 (BeamTrees)
• Structure-related tasks: ~165 (SequoiaView and Windows Browser) up to 265 (Tree Viewer)
• Attribute-related tasks: ~80 (Treemap and Windows) up to 155 (BeamTrees)


Typical Qualitative Results

• Treemaps
  Color coding and filtering heavily used
  Needs "details on demand"
• Hyperbolic Browser
  Cartesian distance ≠ distance from root
  Tree depth not what HB is intended for!!
• Lots of usability problems
• From analyzing videos of users


Pros/Cons

+ Nicely designed experiment
- Very different tool functionalities
  Detecting the Expected – Discovering the Unexpected™
+ Good job measuring "Detecting the Expected"
- Does not try to measure "Discovering the Unexpected" (i.e., insights)


Experimental Evaluation for INSIGHT

• Saraiya, North and Davis, An Evaluation of Microarray Visualization Tools for Biological Insight, InfoVis 2004

• Purpose of visualizations is insight => count insights
• Domain: visualization of gene expression arrays
• Compare: five software tools


The Tools

[Screenshots: TimeSearcher, Hierarchical Clustering Explorer, Spotfire®, GeneSpring®, Cluster/Treeview (aka Clusterview, not shown)]


Experimental Design

• Five visualization tools
• Three data sets
  Time series: viral infection - 1060 genes x 5 time points
  Comparison: 861 genes x 3 related viral infections
  Comparison: 170 genes x (48 with lupus + 42 healthy)
• Three classes of users
  Domain experts (10)
  Domain novices (11)
  Software developers (9)
• One user, two tools => 30 trials
  Mixed design – within/between subject


Tool Capabilities

• Overview + Detail
• Dynamic Queries (see the minimal sketch below)
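A minimal sketch of what a "dynamic query" filter does: as the user drags a range slider, only the items inside the range remain displayed. The gene names and expression values here are made up for illustration; this is not code from any of the tools above.

```python
# Toy "dynamic query": filter displayed items to a user-chosen value range.
genes = {"geneA": 2.3, "geneB": -1.1, "geneC": 0.4, "geneD": 3.8}

def dynamic_query(data, low, high):
    """Return the subset of items whose value lies in [low, high]."""
    return {name: v for name, v in data.items() if low <= v <= high}

print(dynamic_query(genes, 0.0, 3.0))   # slider set to [0, 3] -> geneA, geneC
```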


Measure INSIGHTS

• Users were instructed to list some questions they might want to ask, and then to use the tool until they could learn nothing more
• For each insight, record / assess (one possible coding sketch follows):
  The actual insight
  Time to reach the insight
  Importance of the insight
  Correctness of the insight
  Sought-for vs. serendipitous insight
  Etc.
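A minimal sketch of how such insight records might be coded for later analysis. The field names and the aggregate "insight value" function are my own illustration, not Saraiya et al.'s coding scheme.

```python
# One possible record structure for insight-based evaluation.
from dataclasses import dataclass

@dataclass
class Insight:
    text: str               # the insight as stated by the participant
    minutes_to_reach: float
    importance: int         # e.g. domain-expert rating on a 1-5 scale
    correct: bool
    serendipitous: bool     # discovered, rather than sought for
    tool: str
    dataset: str

# Example record from a hypothetical session.
record = Insight(
    text="Gene cluster X is up-regulated only at the final time point",
    minutes_to_reach=12.5, importance=4, correct=True,
    serendipitous=True, tool="TimeSearcher", dataset="viral time series",
)

def insight_value(insights):
    """Aggregate per-tool value, e.g. summed importance of correct insights."""
    return sum(i.importance for i in insights if i.correct)

print(insight_value([record]))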


Results (Statistically Sig.)

• Insight value: Spotfire® (66) > GeneSpring® (40)
• User's perception of how much was learned: Spotfire® > HierClusExplr, Clusterview
• Time to first insight:
  Clusterview (4.6) < all others (7 to 16) :-)
  GeneSpring® (16) > all others (4.6 to 14) :-(
  (a sketch of the kind of significance test behind such comparisons follows)
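A minimal sketch of how a between-tool comparison such as "time to first insight" might be tested for statistical significance. The numbers are invented, and this is not Saraiya et al.'s actual analysis; it just illustrates the kind of nonparametric test used for small samples.

```python
# Compare time-to-first-insight for two tools with a Mann-Whitney U test.
from scipy.stats import mannwhitneyu

# Hypothetical minutes-to-first-insight for two tools.
tool_a = [4.2, 5.1, 3.8, 6.0, 4.9, 5.5]
tool_b = [14.0, 16.5, 12.3, 18.1, 15.2, 13.7]

stat, p = mannwhitneyu(tool_a, tool_b, alternative="two-sided")
print(f"U={stat}, p={p:.4f}")   # small p => difference unlikely to be chance
```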


Other Results & Insights

• User background did not have a statistically significant effect
• Poor UI design in some cases
  Discouraged/prevented users from using the most effective visualization
  Slowed down use of an otherwise powerful tool – too many clicks
• Clustering is a poor default view - biases users toward certain points of view
• Tool - dataset interaction (in some cases)


Pros/Cons (+/-)

+ Very important methodology! Gets at users' real objectives!!!!
+ Detecting the Expected – Discovering the Unexpected™
+ Measures "Detecting the Expected"
+ Measures "Discovering the Unexpected" (i.e., insights)
- Wide variation in tool capabilities
  TimeSearcher did really well with the time series data - but not with other types of data
- Not enough users to get many s.s. results
- Only 10 domain experts
- Short-term use – just a few hours
- Users not 100% motivated


Implications

• Study real users doing real work with "seeded" data
  Known set of predetermined insights
  Also called "ground truth"
• Can approximate this with run-off competitions using "seeded" data
  (a minimal scoring sketch against seeded data follows)
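A minimal sketch of scoring against "seeded" data: compare the insights participants report to the predetermined ground-truth set. The insight names are made up for illustration.

```python
# Score reported insights against a seeded ground-truth set.
ground_truth = {"regional voting bloc", "outlier senator", "seasonal spike"}
reported     = {"outlier senator", "seasonal spike", "weekday effect"}

found  = reported & ground_truth           # expected insights detected
recall = len(found) / len(ground_truth)    # how much of the seed was found
extras = reported - ground_truth           # candidate "unexpected" insights
print(f"recall={recall:.2f}, unexpected={extras}")
```

Recall captures "Detecting the Expected"; the extras are the starting point for judging "Discovering the Unexpected".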


Long-term Evaluation

• The challenges of evaluation
• Subjective evaluation
• Evaluation based on model of cognition & perception
• Experimental evaluation
• Long-term Evaluation
• Conclusions


Long-term Evaluation

• Hard to know what really works for exploratory data analysis

• Real people doing real work
  Real people need results for professional success
  Highly motivated users!!


Long-term Studies

• Perer & Shneiderman, Integrating Statistics and Visualization: Case Studies of Gaining Clarity during Exploratory Data Analysis, CHI ’08

• Domain: Social network analysis – political influence, etc

• Problem: network visualizations by themselves are not so useful
  Need statistics about the networks
• SocialAction InfoVis system
  Combines visualization and statistics
• Question: does SocialAction improve researchers' capabilities?

For another such study, see Saraiya, North, et al., An Insight-Based Longitudinal Study of Visual Analytics, IEEE TVCG 2006.


SocialAction


Methodology – for each user

• Interview (1 hour)
  Screen participants
  Understand their goals
• Training (2 hours)
  With generic data
• Early use (2-4 weeks)
  With user's data
  Weekly visits
  Requested (and feasible) software enhancements
  Provide help
• Mature use (2-4 weeks)
  No software improvements
  Weekly visits
  Provide help
• Exit interview (1 hour)
  How did the system impact research?
  How well were goals met?


One User: A Political Analyst

Study US Senate power structures, based on voting blocs


Political Analyst

• The analyst had tried 3 other network analysis tools without gaining the desired insights:
  NetDraw, ManyEyes, and KrackPlot
• Found interesting patterns using the capability to rank all nodes, visualize the outcome, and then filter out the unimportant ones (see the sketch below)
  The betweenness centrality statistic helped find "centers of gravity"
  Found geographic alliances
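A minimal sketch of the rank-then-filter workflow described above, using networkx to compute the betweenness-centrality statistic. This is not SocialAction's code; the example graph and the threshold are stand-ins.

```python
# Rank nodes by betweenness centrality, then keep only the important ones.
import networkx as nx

G = nx.les_miserables_graph()            # stand-in co-occurrence network
centrality = nx.betweenness_centrality(G)

# Rank all nodes by betweenness ("centers of gravity")...
ranked = sorted(centrality.items(), key=lambda kv: kv[1], reverse=True)

# ...then filter out the unimportant ones before visualizing.
threshold = 0.05                          # arbitrary cutoff for illustration
important = [node for node, score in ranked if score >= threshold]

print("Top nodes:", ranked[:5])
print("Kept for display:", len(important), "of", G.number_of_nodes())
```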


Findings

• With all four users, integration of network statistics with visualizations improved the ability to find insights

• User feedback helped improve SocialAction


Pros/Cons (+/-)

+ Detecting the Expected – Discovering the Unexpected™
+ Measures "Detecting the Expected"
+ Measures "Discovering the Unexpected" (i.e., insights) – by users' successes
- Does not compare two tools
  Focus is on how well a tool helps real users do real work – very authentic, not a lab study
- Not enough users to get s.s. results
+ Users 110% motivated


Conclusions

• Apply UI & graphic design skills
• Methods not mutually exclusive
• Use as part of development cycle
• Understand why users do what they do
• Be sure you know what you're measuring
• The purpose of evaluations is insight, not numbers!


Thank You


Questions and (maybe) Answers