DESCRIPTION
Information graphics have been used for thousands of years to help illustrate ideas and communicate information. However, it takes skill and time to handcraft high-quality, customized information graphics for specific situations (e.g., data characteristics and user tasks). The problem becomes more acute when we must deal with big data. To address this problem, we are researching and developing mixed-initiative visual analytic systems that leverage both the intelligence of humans and machines to aid users in deriving insights from massive data. On the one hand, such a system automatically guides users to perform their data analytic tasks by recommending suitable visualizations and discovery paths in context. On the other hand, users interactively explore, verify, and improve visual analytic results, which in turn helps the system to learn from users' behavior and improve its quality over time. In this talk, I will present key technologies that we have developed in building mixed-initiative visual analytic systems, including feature-based visualization recommendation and optimization-based approaches to dynamic data transformation for more effective visualization. I will also use concrete applications to demonstrate the use and value of mixed-initiative visual analytic systems, and discuss existing challenges and future directions in this area.
“Big Picture”: Mixed-Initiative Visual Analytics of Big Data
Michelle Zhou, IBM Research, Almaden
http://blog.threestory.com/
Outline
Definitions
– Big data
– Mixed-initiative visual analytics
Challenges and Goals
Our Approaches
– Key technologies
– Use cases
Future Directions
Definitions
Big data
– Volume: 210 million customers, 10 billion transactions, 850 TB of data, …
– Velocity: 100,000 tweets, 684,478 FB shares, 204 million emails, … per minute (www.domo.com)
– Veracity: rumors, incomplete data, …
– Variety
Definitions
Mixed-initiative visual analytics
– User-initiative: “Tell me more about my customers”
– System-initiative: “Here is your customer summary. I also suggest …”
[Image: dougblakely.com]
Key Challenges
“Tell me about …”
– How to visually summarize large volumes of heterogeneous data to quickly discover meaningful insights
“What do they mean?”
– How to visually explain discovered insights (complex + abstract) and guide exploration
“This does not look right, I want to … do it again”
– How to allow users to correct analytic results and adopt previous analytic steps
[Image: www.gfi.com]
Our Goals
Combine advanced data analytics and interactive visualization to help end users
– Derive and consume insights
– Explore various analytic paths and trust derived insights
– Discover opportunities to compensate for and improve insights and analytic processes
“Big Picture”: Overview
[Architecture diagram. Components: Expressive UI & Action Interpreter; Visual Analytics “Concierge”. Labels: Input: User Actions; Analytic Requests; Output: Interactive Visualization; Data Analytic Recipes; User Models; Analytic Engines; Alternative Visualization; Visualization Examples]
“Big Picture”: Our Focus
Visual Analytics Concierge
– Visualization Recommender
– Data Transformer
– Insight Revision & Provenance
[Architecture diagram labels: Expressive UI & Action Interpreter; Input: User Actions; Analytic Requests; Output: Interactive Visualization; Data Analytic Recipes; User Models; Analytic Engines; Alternative Visualization; Visualization Examples]
Data Transformation: Motivation
“Dirty”, noisy data
Large data variance
“Plain” raw data
Distorted, illegible visualization
“Messy” visualization without insights
Quality of data to be visualized affects the quality of visualization
Example 1: “Dirty” Noisy Data
Task: Show houses on a map
[Figure: original visualization vs. after separating noise]
Example 2: Large Data Variance
Task: Summarize houses by styles and towns
[Figure: original visualization vs. after normalization]
Example 3: “Plain” Raw Data
Task: Correlate house price and towns under $1.5M
[Figure: original visualization vs. after ordering towns; axes: Price, Town]
Example 4: “Plain” Raw Data
Task: What is my emotional style?
[Figure: original visualization vs. after semantic-temporal segmentation] [Pan et al. IUI 2013]
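Two of the transformations named in the examples, normalization and ordering, can be sketched in a few lines. This is only an illustration: the slides do not say which normalization or ordering criterion was used, so row-wise proportions and ordering by median price are assumptions, and the town names and numbers are made up.

```python
from statistics import median


def normalize_rows(counts):
    """Turn each row of raw counts into proportions so that rows with very
    different totals become comparable (one way to tame large variance)."""
    normalized = {}
    for row, cells in counts.items():
        total = sum(cells.values())
        normalized[row] = {k: (v / total if total else 0.0) for k, v in cells.items()}
    return normalized


def order_by_median(records, key, value):
    """Return the distinct values of `key`, ordered by the median of `value`."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[key], []).append(rec[value])
    return sorted(groups, key=lambda g: median(groups[g]))


# Hypothetical data: house counts by town and style, and individual sales.
houses_by_town_and_style = {
    "Town A": {"colonial": 400, "ranch": 100},
    "Town B": {"colonial": 3, "ranch": 1},
}
sales = [
    {"town": "Town A", "price": 900_000},
    {"town": "Town B", "price": 350_000},
    {"town": "Town A", "price": 700_000},
    {"town": "Town C", "price": 500_000},
]

print(normalize_rows(houses_by_town_and_style))
print(order_by_median(sales, "town", "price"))
```

After normalization the small town is no longer dwarfed by the large one, and ordering towns by a price statistic turns an arbitrary axis into one where a trend can show.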
Technical Challenges
Determine proper data transformation for different visualization situations
– Difficult to predict visualization situations involving multiple factors: data, user, and types of visualization
Certain situations require multiple data transformations
Balance multiple, potentially conflicting factors
– Quality of visualization and performance
Our Approach
Optimization-based approach to automatically derive data transformations that maximize visualization quality
[Pipeline diagram labels: Original Data (D); Data Retrieval; Data Transformer; Transformed Data; Visualization Generation; Visualization Recommender; Visualization Type (Vt)]
Input: Original data D, visualization type Vt
Output: A set of transformation operators Op = {…, op[i], …}
where the reward ∑ desirability(D, Vt, Op) is maximized
[Wen and Zhou IUI 2008, InfoVis 2008]
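The input/output formulation above can be illustrated with a toy search. This is a minimal sketch, not the method of the cited papers: exhaustive search over small operator subsets stands in for whatever optimizer they use, and the operator names, gain table, cost table, and the scoring rule (quality minus summed cost) are all hypothetical.

```python
from itertools import combinations


def best_operator_set(data, vis_type, operators, quality, cost, max_ops=3):
    """Try every operator subset up to max_ops and keep the one with the
    highest score, where score = quality(...) - total cost of the operators."""
    best_ops = ()
    best_score = quality(data, vis_type, ())
    for k in range(1, max_ops + 1):
        for ops in combinations(operators, k):
            score = quality(data, vis_type, ops) - sum(cost(op) for op in ops)
            if score > best_score:
                best_ops, best_score = ops, score
    return best_ops, best_score


# Hypothetical quality gains per (visualization type, operator) and time costs.
GAIN = {("scatter", "denoise"): 0.4, ("scatter", "normalize"): 0.1}
COST = {"denoise": 0.1, "normalize": 0.2, "sort": 0.05}


def toy_quality(data, vis_type, ops):
    # Baseline quality of the untransformed data plus each operator's gain.
    return 0.3 + sum(GAIN.get((vis_type, op), 0.0) for op in ops)


def toy_cost(op):
    return COST[op]


ops, score = best_operator_set(None, "scatter", list(COST), toy_quality, toy_cost)
print(ops, round(score, 2))
```

Exhaustive search is only workable for a handful of operators; it is used here because it makes the objective easy to see.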
Measuring Visualization Desirability
∑ Desirability(D, Vt, Op) = ∑ Visual_Quality(D, Vt, Op) − ∑ Cost(Op)
Visual quality metrics
– visual legibility
– visual pattern recognizability
– visual fidelity
– visual continuity
Cost: time cost of data transformations
χ(D, Vt) = 1 − (λ1 × complexity(D) + λ2 × density(D) / β(Vt))
[Wen and Zhou IUI 2008, InfoVis 2008]
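A worked example of the metric on this slide, reading its formula as χ(D, Vt) = 1 − (λ1 × complexity(D) + λ2 × density(D) / β(Vt)). The weights, the input values, and the treatment of β as a positive constant per visualization type are assumptions made for illustration; the slide does not define them.

```python
def chi(complexity, density, beta, lam1=0.5, lam2=0.5):
    """chi(D, Vt) = 1 - (lam1 * complexity(D) + lam2 * density(D) / beta(Vt)).

    complexity and density describe the data D; beta is assumed to be a
    positive constant that depends on the visualization type Vt.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    return 1.0 - (lam1 * complexity + lam2 * density / beta)


# Hypothetical scores before and after a transformation that lowers density
# (for example, by separating noise from the data to be drawn).
before = chi(complexity=0.4, density=0.9, beta=1.0)
after = chi(complexity=0.4, density=0.3, beta=1.0)
print(f"before={before:.2f} after={after:.2f}")
```

With these made-up numbers the score rises once density drops, which is the behavior a quality metric of this shape is meant to reward.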
Data Transformation: What’s Next
What additional desirability metrics should we consider?
How to perform data transformation in context (incremental transformation)?
How to scale out to support exabytes of data for different user tasks and situations?
Visualization Recommendation: Motivation
Visually encoding data requires skills and time and is influenced by a number of factors
– Characteristics of data
– User tasks
– Device
– User interaction behavior
Visualization Recommendation: Types
Data-driven recommendation
– Dynamically recommend suitable visualizations based on data, display, and user tasks
– Display + new data + task → Display
Behavior-driven recommendation
– Dynamically track user interactions and detect behavior patterns to recommend suitable visualizations in context
– Display + user behavior → Display
Data-Driven Visualization Recommendation
Two situations
– Single display
– Multiple, consecutive displays
Multiple methods
– Rule based
– Planning based
– Machine learning based
[Figure adapted from Roth et al., CHI 94]
[Mackinlay ’86; Roth & Mattis CHI ’94; Zhou & Feiner InfoVis 96; Zhou & Feiner CHI 98; Zhou IJCAI 99; Zhou & Chen InfoVis 02; Zhou & Chen IJCAI03; Wen & Zhou InfoVis 05]
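Of the methods listed, the rule-based one is the simplest to sketch. The rules below are illustrative assumptions, not taken from the cited systems: they map the kinds of data fields to a chart type, checking the most specific rule first.

```python
from collections import Counter


def recommend_chart(field_types):
    """Pick a chart type from a mapping of field name -> field kind, where the
    kind is 'quantitative', 'categorical', 'temporal', or 'geographic'."""
    counts = Counter(field_types.values())
    if counts["geographic"] >= 1:
        return "map"
    if counts["temporal"] >= 1 and counts["quantitative"] >= 1:
        return "line chart"
    if counts["quantitative"] >= 2:
        return "scatter plot"
    if counts["categorical"] >= 1 and counts["quantitative"] >= 1:
        return "bar chart"
    return "table"  # Fallback when no rule applies.


print(recommend_chart({"location": "geographic", "price": "quantitative"}))
print(recommend_chart({"town": "categorical", "price": "quantitative"}))
```

A fuller recommender would also take the user task and the display into account, as the slides note; this sketch looks at the data only.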
Behavior-Driven Visualization Recommendation
Observation
– Users tend to stay with an unsuitable visualization, or compensate for it with a large number of interactions, instead of changing the visualization
Goal
– Detect user interaction patterns and make pattern-based visualization recommendations
Behavior-Driven Visualization Recommendation: Example 1
Display: Bridge problems by state by year
User interactions: click on each state to examine the structural bridge problems
Pattern: Scan
Visualization recommendation: line charts for direct comparison
[Gotz and Wen IUI 2009]
Behavior-Driven Visualization Recommendation: Example 2
Display: Map of the Market
User interactions: repeatedly change time windows (26 weeks, 52 weeks, …) for two industries
Pattern: Flip
Visualization recommendation: line chart for direct trend comparison
[Figure: line chart comparing the Utility and Networking industries]
[Gotz and Wen IUI 2009]
Behavior-Driven Visualization Recommendation: Pattern-Based Approach
Stages
– Pattern Detection (input: User Interactions; patterns: Scan, Flip, Swap, …)
– Pattern-Task Matching (Task Features)
– Pattern-Data Matching (Data Features)
– Example Match
Output: Visualization Recommendations
[Gotz and Wen IUI 2009]
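The pattern-detection stage can be sketched over a simple interaction log. The definitions below are simplifying assumptions, not the ones from the cited paper: a Scan is taken to be a run of the same action over distinct targets, and a Flip is the same action alternating between exactly two targets. The event format, action names, and thresholds are hypothetical; the recommendation strings are the ones from the two examples.

```python
def detect_scan(events, min_len=4):
    """True if the log ends with at least min_len consecutive events that
    share one action and each touch a different target."""
    if not events:
        return False
    action = events[-1][0]
    seen = []
    for act, target in reversed(events):
        if act != action or target in seen:
            break
        seen.append(target)
    return len(seen) >= min_len


def detect_flip(events, min_len=4):
    """True if the last min_len events share one action and alternate
    between exactly two targets (A, B, A, B, ...)."""
    tail = events[-min_len:]
    if len(tail) < min_len or len({act for act, _ in tail}) != 1:
        return False
    targets = [target for _, target in tail]
    alternates = all(a != b for a, b in zip(targets, targets[1:]))
    return len(set(targets)) == 2 and alternates


RECOMMENDATION = {
    "scan": "line charts for direct comparison",
    "flip": "line chart for direct trend comparison",
}


def recommend(events):
    """Map a detected pattern to a recommendation, or None if none is found."""
    if detect_flip(events):
        return RECOMMENDATION["flip"]
    if detect_scan(events):
        return RECOMMENDATION["scan"]
    return None


scan_log = [("inspect", s) for s in ("CA", "TX", "NY", "WA")]
flip_log = [("set_time_window", w) for w in ("26 weeks", "52 weeks") * 2]
print(recommend(scan_log))
print(recommend(flip_log))
```

The later matching stages (task features, data features, examples) are left out; here each pattern maps straight to a fixed recommendation.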
Recommending Visual Interactions
Automatically annotate and suggest follow-on user interactions based on displayed visual features
[Figure: original display vs. annotated display]
[Kandogan VAST’2012]
Recommending Visual Interactions
Grid-based approach to detect salient visual features in a display, and then annotate
– Clusters
– Outliers
– Trends
[Kandogan VAST’2012]
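A rough illustration of the grid idea: bin the plotted points into cells, then read features off the cell counts. This is not the algorithm of the cited paper. The cell size, the density threshold, and the rule that an outlier is a lone point with no occupied neighboring cell are assumptions, and trend detection is left out.

```python
from collections import Counter


def grid_features(points, cell=1.0, dense=3):
    """Bin 2-D points into square cells. Cells holding at least `dense`
    points are reported as clusters; cells holding a single point with no
    occupied neighboring cell are reported as outliers."""
    cells = Counter((int(x // cell), int(y // cell)) for x, y in points)
    clusters = sorted(c for c, n in cells.items() if n >= dense)
    outliers = []
    for (cx, cy), n in cells.items():
        if n != 1:
            continue
        neighbors = [
            (cx + dx, cy + dy)
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)
        ]
        if not any(nb in cells for nb in neighbors):
            outliers.append((cx, cy))
    return {"clusters": clusters, "outliers": sorted(outliers)}


# Hypothetical point data: four points close together and one far away.
points = [(0.1, 0.2), (0.5, 0.5), (0.9, 0.3), (0.4, 0.8), (5.5, 5.5)]
print(grid_features(points))
```

The returned cell coordinates are what an annotation layer would use to place labels on the display.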
Visualization Recommendation: What’s Next
Recommend a suitable heterogeneous visualization as a consecutive display
Recommend the composition of two or more existing visualizations
[Figure: two charts combined into one (+ = ?); axis labels: Vehicle Group, Vehicle Age, Cost]
[Wen, Zhou & Aggarwal, InfoVis05; Heer & Robertson, InfoVis07]
[Yang, Li, & Zhou 2013]
Visualization Recommendation: What’s Next
“Individualized” (hyper-personalized), adaptive visualization
– By cognitive style and personality [Gardner 1983]
– By one’s emotional/affective states
Big 5 Personality Model
– O: inventive/curious vs. consistent/cautious
– C: efficient/organized vs. easy-going/careless
– E: outgoing/energetic vs. solitary/reserved
– A: friendly/compassionate vs. cold/unkind
– N: sensitive/nervous vs. secure/confident
[Figure labels: Sadness, Optimism, Trust]
Insight Revision and Provenance
Insight revision
– Users amend derived insights to correct analytic mistakes or make personalized adjustments
Insight provenance
– Users record interactions and insights for continuation and reuse
[Figure labels: Neuroticism (high → low); Extroversion (low → high); User Actions: Zoom In2, Edit2, Query, Filter]
Insight Revision
A crowd-powered approach to insight revision
– Users amend various types of text analytics mistakes
– Adopting multi-user consistent inputs
Examples
– Correcting sentiment classification error
– Correcting summarization label error
[Hu et al. INTERACT 2013]
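One way to read “adopting multi-user consistent inputs” is as a consensus rule: the system changes its own label only when enough users propose the same correction. The thresholds and the function below are assumptions for illustration, not the mechanism described in the cited paper.

```python
from collections import Counter


def consensus_label(system_label, amendments, min_votes=3, min_share=0.6):
    """Return the label most users proposed if it has at least min_votes
    votes and at least min_share of all amendments; otherwise keep the
    system's original label."""
    if not amendments:
        return system_label
    label, votes = Counter(amendments).most_common(1)[0]
    if votes >= min_votes and votes / len(amendments) >= min_share:
        return label
    return system_label


# Hypothetical sentiment corrections submitted by four users.
print(consensus_label("negative", ["positive", "positive", "positive", "negative"]))
# Too few consistent votes: the system label is kept.
print(consensus_label("negative", ["positive", "neutral"]))
```

Requiring both a minimum vote count and a minimum share keeps a single user, or a noisy split, from overriding the system.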
Insight Provenance
An action-based approach to insight provenance
– “Actions” capture observable and semantically meaningful user interactions
• Three types of actions: Exploration | Insight | Meta
– “Action trails” capture the sequence of actions leading to an insight for insight provenance
[Gotz and Zhou InfoVis 2009]
τ = [A_Exploration(τ)]+ ∘ A_Insight
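The action and trail notions can be sketched as a small data structure. This is a simplification under stated assumptions: a trail is taken to be one or more exploration actions followed by an insight action, meta actions are skipped rather than applied, and the action names (“query”, “filter”, “undo”, “bookmark”, “zoom in”) are hypothetical labels.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str  # "exploration", "insight", or "meta"
    name: str  # e.g. "query", "filter", "bookmark"


@dataclass
class Session:
    log: list = field(default_factory=list)

    def record(self, kind, name):
        """Append one observed user action to the session log."""
        self.log.append(Action(kind, name))

    def trails(self):
        """Split the log into trails: runs of exploration actions that end
        with an insight action. Meta actions are skipped in this sketch."""
        result, current = [], []
        for action in self.log:
            if action.kind == "meta":
                continue
            current.append(action.name)
            if action.kind == "insight":
                if len(current) > 1:  # Needs at least one exploration step.
                    result.append(current)
                current = []
        return result


session = Session()
session.record("exploration", "query")
session.record("exploration", "filter")
session.record("meta", "undo")
session.record("insight", "bookmark")
session.record("exploration", "zoom in")  # No insight yet, so no trail.
print(session.trails())
```

Stored trails are what make continuation and reuse possible: replaying one reproduces the steps that led to a recorded insight.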
Insight Revision and Provenance: What’s Next
Balance crowd input and personalized adjustments
– Reconcile diverse user amendments vs. prevent potential system abuse
Detect and learn different types of logical structures from user interactions
– Automatically infer and predict user interaction patterns to better support and anticipate user tasks
Summary
“Big data” is of high volume, heterogeneous, and often “dirty”
It requires both users and computers to take initiatives for effective visual analytics of big data
– User: “Tell me what’s in my data”
– System: “Here is the “big picture” of your data. I also suggest you look into …”
– User: “Something is wrong… should be…” and “Remember what I have done so far”
– System: “I incorporated your and others’ feedback. Please continue …”
Components: Data Transformation; Visualization Recommendation; Insight Revision & Provenance
[Image: dougblakely.com]
Acknowledgements
IBM Research, Almaden
– Eser Kandogan, Fei Wang, Huahai Yang, Liang Gou, Ying Xuan, Eben Haber, Yunyao Li
IBM T. J. Watson
– David Gotz, Zhen Wen, Shimei Pan, Jie Lu, Min Chen*, Sheng Ma*, Peter Kissa*, Vikram Aggarwal*
IBM Research, China
– Shixia Liu*, Nan Cao, Yangqiu Song*, Weihong Qian
Summer interns
– Ying Feng (Indiana University)
– Basak Alper (UC Santa Barbara)
– Mengdie Hu (Georgia Tech)
– Jian Zhao (University of Toronto)
References of Our Work
David Gotz and Michelle X. Zhou: Characterizing users' visual analytic activity for insight provenance. Information Visualization 8(1): 42-55, 2009.
David Gotz and Zhen Wen: Behavior-driven visualization recommendation. IUI 2009: 315-324.
Eser Kandogan: Just-in-time annotation of clusters, outliers, and trends in point-based data visualizations. IEEE VAST 2012: 73-82.
Mengdie Hu, Huahai Yang, Michelle X. Zhou, Liang Gou, Yunyao Li, and Eben Haber: OpinionBlocks: A Crowd-Powered, Self-Improving Interactive Visual Analytic System for Understanding Opinion Text. To appear in Proc. INTERACT 2013.
Zhen Wen and Michelle X. Zhou: Evaluating the Use of Data Transformation for Information Visualization. IEEE Trans. Vis. Comp. Graph. 14(6): 1309-1316, 2008.
Zhen Wen and Michelle X. Zhou: An optimization-based approach to dynamic data transformation for smart visualization. IUI 2008: 70-79.
Zhen Wen, Michelle X. Zhou, and Vikram Aggarwal: An Optimization-based Approach to Dynamic Visual Context Management. INFOVIS 2005: 25-32.
Huahai Yang, Yunyao Li, and Michelle X. Zhou: A Crowd-sourced Study: Understanding Users’ Comprehension and Preferences for Composing Information Graphics. In Submission to TOCHI 2013.
Michelle X. Zhou and Min Chen: Automated Generation of Graphic Sketches by Example. IJCAI 2003: 65-74.
Michelle X. Zhou, Min Chen, and Ying Feng: Building a Visual Database for Example-based Graphics Generation. INFOVIS 2002: 23-30.
Michelle X. Zhou, Sheng Ma, and Ying Feng: Applying machine learning to automated information graphics generation. IBM Systems Journal 41(3): 504-523, 2002.