This is a CIDR 2009 presentation. See http://infoblog.stanford.edu/ for more information and http://www-db.cs.wisc.edu/cidr/cidr2009/program.html for downloads.
Voyagers and Voyeurs: Supporting Social Data Analysis
Jeffrey Heer, Computer Science Department, Stanford University
CIDR 2009 – Monterey, CA, 5 January 2009
A Tale of Two Visualizations
vizster
Observations
Groups spent more time in front of the visualization than individuals.
Friends encouraged each other to unearth relationships, probe community boundaries, and challenge reported information.
Social play resulted in informal analysis, often driven by story-telling of group histories.
NameVoyager: The Baby Name Voyager
Social Data Analysis
Visual sensemaking can be social as well as cognitive.
Analysis of data coupled with social interpretation and deliberation.
How can user interfaces catalyze and support collaborative visual analysis?
sense.us: A Web Application for Collaborative Visualization of Demographic Data
Voyagers and Voyeurs
Complementary faces of analysis
Voyager – focus on visualized data
Active engagement with the data
Serendipitous comment discovery
Voyeur – focus on comment listings
Investigate others’ explorations
Find people and topics of interest
Catalyze new explorations
Out of the Lab, Into the Wild
Wikimapia.org
Spotfire DecisionSite Posters
Tableau Server
Many-Eyes
Social Data Analysis In Action
1. Discussion and Debate
2. Text is Data, Too
3. Data Integrity and Cleaning
4. Integrating Data in Context
5. Pointing and Naming
For each, some thoughts on future directions.
I asked my colleagues: if you could give database researchers a wish list, what would it be?
Discussion and Debate
Tableau X-Box / Quest Diag?
“Valley of Death”
Content Analysis of Comments
Feature prevalence from content analysis (min Cohen’s κ = .74)
High co-occurrence of Observations, Questions, and Hypotheses
[Figure: bar charts of comment feature prevalence (axis: Percentage, 0 to 80), one panel each for Sense.us and Many-Eyes. Features: Observation, Question, Hypothesis, Data Integrity, Linking, Socializing, System Design, Testing, Tips, To-Do, Affirmation.]
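Cohen's kappa, the agreement statistic quoted for the content analysis, corrects the raw agreement between two coders for the agreement expected by chance. A minimal sketch follows; the labels are made up for illustration and are not the study's data, and the function name `cohens_kappa` is an assumption.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders labelling the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(coder_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: sum over labels of the product of marginal rates.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes assigned to six comments (illustrative only).
a = ["observation", "question", "hypothesis",
     "observation", "question", "observation"]
b = ["observation", "question", "observation",
     "observation", "question", "hypothesis"]
kappa = cohens_kappa(a, b)
```

Here the coders agree on four of six comments, yet kappa is well below 4/6 because much of that agreement could arise by chance.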
WANTED: Structured Conversation
Reduce the cost of synthesizing contributions
Wikipedia: Shared Revisions
NASA ClickWorkers: Statistics
Can we represent data, visualizations, and social activity in a unified data model?
Text is Data, Too
Visualization Popularity
Over 1/3 of Many-Eyes visualizations use free text
[Figure: bar charts of visualization type popularity (axis: Percentage, 0.0 to 0.5), one panel each for Many-Eyes and Swivel. Types: Tag Cloud, Bubble Graph, Word Tree, Bar Chart, Maps, Network Diagram, Treemap, Matrix Chart, Line Graph, Scatterplot, Stacked Graph, Pie Chart, Histogram.]
Alberto Gonzales
WANTED: Better Tools for Text
Statistical Analysis of text (with ties to source!)
Entity Extraction
Aggregation and Comparison of texts
Get a “global” view of documents
We can do better than Tag Clouds (!?)
Use text analysis tools to enable analysis of structured conversation by the community.
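One way to read "statistical analysis of text (with ties to source!)" is an index that counts terms while keeping a pointer back to every occurrence. The sketch below is only an illustration under that assumption; the document names, sample text, and function names are hypothetical.

```python
import re
from collections import defaultdict


def term_index(documents):
    """Map each term to the (doc_id, token position) pairs where it occurs."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        tokens = re.findall(r"[a-z']+", text.lower())
        for position, token in enumerate(tokens):
            index[token].append((doc_id, position))
    return index


def compare_counts(index, term, doc_ids):
    """Per-document count of one term, for side-by-side comparison."""
    counts = {doc_id: 0 for doc_id in doc_ids}
    for doc_id, _position in index.get(term, []):
        if doc_id in counts:
            counts[doc_id] += 1
    return counts


# Hypothetical documents (illustrative only).
docs = {
    "doc_a": "I do not recall. I do not recall the meeting.",
    "doc_b": "I recall the memo.",
}
index = term_index(docs)
recall_counts = compare_counts(index, "recall", ["doc_a", "doc_b"])
```

Because every count is backed by a list of positions, an aggregate view can always link back to the passages that produced it.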
Data Integrity and Cleaning
No cooks in 1910? … There may have been cooks then. But maybe not.
The great postmaster scourge of 1910?
Or just a bug in the data?
Content Analysis of Comments
16% of sense.us comments and 10% of Many-Eyes comments reference data quality or integrity.
WANTED: Data Cleaning Tools
Reshape data, reformat rows & columns
Handle missing data: label, repair, interpolate
Entity resolution and de-duplication
Group related values into aggregates
Assist table lookups & data transforms
Provide tools in situ to leverage the collective
Transparency requires provenance
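As an illustration of "label, repair, interpolate", the sketch below fills interior gaps in a series by linear interpolation and returns a flag per value, so that repaired values stay visibly labelled. The function name and sample numbers are assumptions, not part of any tool named in the talk.

```python
def interpolate_missing(values):
    """Fill interior None gaps linearly; return (filled, repaired_flags).

    Leading and trailing gaps are left as None rather than extrapolated.
    """
    filled = list(values)
    repaired = [False] * len(values)
    known = [i for i, v in enumerate(values) if v is not None]
    # Walk consecutive pairs of known values and fill the gap between them.
    for left, right in zip(known, known[1:]):
        gap = right - left
        for i in range(left + 1, right):
            fraction = (i - left) / gap
            filled[i] = values[left] + fraction * (values[right] - values[left])
            repaired[i] = True  # label the value as repaired, not observed
    return filled, repaired


# Hypothetical series with two interior gaps and one trailing gap.
counts = [120.0, None, None, 150.0, None]
filled, repaired = interpolate_missing(counts)
```

Keeping the `repaired` flags alongside the data is one small form of provenance: a viewer can tell observed values from inferred ones.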
Integrating Data in Context
College Drug Use
Harry Potter is Freaking Popular
WANTED: In-Situ Data Integration
Search for and suggest related data or views
User input for types, schema matching, or data
Apply in context of the current task
But record mappings for future use
Record provenance: chain of data sources
Examples: Google Web Tables, Pay-As-You-Go, Stanford Vispedia, Utah VisTrails
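The wish list above can be sketched in a few lines: a user supplies a column mapping for the task at hand, the mapping is remembered for reuse, and the merged table carries a chain of its sources. Everything here (the `Table` class, `integrate`, `saved_mappings`, and the sample data) is hypothetical and does not describe any of the systems named in the examples.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    rows: list
    provenance: list = field(default_factory=list)  # chain of data sources


# (source name, target name) -> column mapping, recorded for future use.
saved_mappings = {}


def integrate(target, source, key, mapping):
    """Left-join source onto target by `key`, renaming columns via `mapping`."""
    saved_mappings[(source.name, target.name)] = dict(mapping)
    renamed = [
        {mapping.get(col, col): val for col, val in row.items()}
        for row in source.rows
    ]
    lookup = {row[key]: row for row in renamed}
    merged = [{**row, **lookup.get(row[key], {})} for row in target.rows]
    chain = target.provenance + source.provenance
    return Table(f"{target.name}+{source.name}", merged, chain)


# Hypothetical tables whose key columns are spelled differently.
enrollment = Table(
    "enrollment",
    [{"state": "CA", "students": 100}, {"state": "NV", "students": 20}],
    ["enrollment.csv"],
)
population = Table("population", [{"State": "CA", "pop": 1000}], ["population.csv"])
combined = integrate(
    enrollment, population, key="state",
    mapping={"State": "state", "pop": "population"},
)
```

Rows without a match in the source are kept unchanged, and `combined.provenance` lists every source that contributed to the view.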
Pointing and Naming
“Look at that spike.”
“Look at the spike for Turkey.”
“Look at the spike in the middle.”
Free-form Data-aware
Visual Queries
Model selections as declarative queries over interface elements or underlying data
(-118.371 ≤ lon AND lon ≤ -118.164) AND (33.915 ≤ lat AND lat ≤ 34.089)
Applicable to dynamic, time-varying data
Retarget selection across visual encodings
Support social navigation and data mining
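A minimal sketch of the idea: the brushed map region is stored as a declarative range query over the data rather than as pixels, so it can be re-evaluated on new data or applied under a different visual encoding. The bounds are the ones shown on the slide; the class names and sample records are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    field: str
    low: float
    high: float

    def matches(self, record):
        return self.low <= record[self.field] <= self.high


@dataclass(frozen=True)
class And:
    clauses: tuple

    def matches(self, record):
        return all(clause.matches(record) for clause in self.clauses)


# The selection from the slide, expressed over data fields.
selection = And((
    Range("lon", -118.371, -118.164),
    Range("lat", 33.915, 34.089),
))


def select(query, records):
    """Evaluate a stored selection against any set of records."""
    return [r for r in records if query.matches(r)]


# Hypothetical records; the same query works on later snapshots too.
records = [
    {"id": 1, "lon": -118.25, "lat": 34.05},
    {"id": 2, "lon": -117.90, "lat": 33.80},
]
selected_ids = [r["id"] for r in select(selection, records)]
```

Because the query names data fields, not screen coordinates, the same selection could highlight matching bars in a bar chart as readily as points on a map.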
WANTED: Data-Aware Annotation
Meta-queries linking annotations to views
Visually specifying notification triggers
Annotating data aggregates (use lineage?)
Unified model (again!) to facilitate reference
How to make it work at scale?
How else to use machine-readable annotations?
Can annotations be used to steer data mining?
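To make the first two wishes concrete, the sketch below attaches a data selection to each comment, then uses it both as a meta-query (which comments refer to this record?) and as a notification trigger (does new data fall inside an annotated selection?). Plain Python callables stand in for a stored declarative query; all names, fields, and values are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Annotation:
    author: str
    text: str
    view_id: str                    # the view the comment was made on
    query: Callable[[dict], bool]   # which data the comment refers to


def annotations_for(record, annotations):
    """Meta-query: find the annotations whose selection covers a record."""
    return [a for a in annotations if a.query(record)]


def notify(new_records, annotations):
    """Trigger: (author, record) pairs for new data inside a selection."""
    return [(a.author, r) for r in new_records for a in annotations if a.query(r)]


notes = [
    Annotation(
        author="user_a",
        text="Look at the spike for Turkey.",
        view_id="view_42",
        query=lambda r: r["country"] == "Turkey" and 1990 <= r["year"] <= 1995,
    )
]
incoming = [
    {"country": "Turkey", "year": 1992, "value": 7},
    {"country": "Chile", "year": 1992, "value": 3},
]
alerts = notify(incoming, notes)
```

The comment "Look at the spike for Turkey." thus becomes machine-readable: it names the data it points at, independent of how that data is drawn.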
Conclusion
Social Data Analysis
Collective analysis of data supported by social interaction.
1. Discussion and Debate
2. Text is Data, Too
3. Data Integrity and Cleaning
4. Integrating Data in Context
5. Pointing and Naming
Summary
As visualization becomes common on the web, opportunities for collaborative analysis abound.
Weave visualizations into the web: data access, visualization creation, view sharing and pointing.
Support discovery, discussion, and integration of contributions to leverage the collective.
Improve both processes and technologies for communication and dissemination.
Parting Thoughts
Visualizations may have a catalytic effect on social interaction around data.
Encourage participation by minimizing or offsetting interaction costs.
Provide incentives by fostering the personal relevance of the data.
Acknowledgements
@ Berkeley: Maneesh Agrawala, Wes Willett, danah boyd, Marti Hearst, Joe Hellerstein
@ IBM: Martin Wattenberg, Fernanda Viégas
@ PARC: Stu Card
@ Tableau: Jock Mackinlay, Chris Stolte, Christian Chabot
Jeffrey Heer, Stanford University
[email protected]
http://jheer.org
Voyagers and Voyeurs: Supporting Social Data Analysis
With a collaborative spirit, with a collaborative platform where people can upload data, explore data, compare solutions, discuss the results, build consensus, we can engage passionate people, local communities, media and this will raise - incredibly - the amount of people who can understand what is going on.
And this would have fantastic outcomes: the engagement of people, especially new generations; it would increase knowledge, unlock statistics, improve transparency and accountability of public policies, change culture, increase numeracy, and in the end, improve democracy and welfare.
Enrico Giovannini, Chief Statistician, OECD. June 2007.