hubert-doyle
LOGO
Exploring Millions of
Social Stereotypes
• How old do they look?
• Do you think they look smart?
• How do we perceive age, gender, and attractiveness?
Data Analysis!
The FaceStat Judging Interface
Preprocessing the Data
Problematic data
Aggregate results from multiple people into a single description
Map from multiple-choice responses to one numerical value
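The two preprocessing steps above could be sketched roughly as follows. The response labels and their numeric values are hypothetical; the slides do not say which scale FaceStat used, and averaging is only one plausible way to aggregate.

```python
from statistics import mean

# Hypothetical mapping from multiple-choice labels to numbers (not from the source).
ATTRACTIVENESS_SCALE = {
    "not at all": 1,
    "not very": 2,
    "somewhat": 3,
    "very": 4,
}

def aggregate_judgments(responses, scale):
    """Average the numeric values of all recognized responses for one face."""
    values = [scale[r] for r in responses if r in scale]
    return mean(values) if values else None

# Several people judged the same face; unrecognized answers are skipped.
judgments = ["very", "somewhat", "very", "n/a"]
score = aggregate_judgments(judgments, ATTRACTIVENESS_SCALE)
print(score)
```

Skipping unrecognized answers rather than failing is one simple way to tolerate problematic data; a real pipeline would probably log them as well.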
Exploring the Data
Initial scatterplot matrix of the face data
Exploring the Data
Initial histogram of face age data
Exploring the Data
Histogram of cleaned face age data
Age, Attractiveness, and Gender
Scatterplot of attractiveness versus age, colored by gender
Age, Attractiveness, and Gender
Smoothed scatterplots for attractiveness versus age, colored by gender
Age, Attractiveness, and Gender
Three iterations of plotting attractiveness versus age versus gender: (a) ages averaged within buckets per age year, (b) 95% confidence interval for each bucket, plus loess curves, and (c) larger buckets where the data is sparser.
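As a rough illustration of the bucketing idea in this caption, the sketch below groups (age, score) pairs into age buckets and reports a mean with a normal-approximation 95% interval per bucket. The function name, the sample data, and the choice of interval formula are assumptions; the slides do not say how the intervals were computed, and the loess curves are not covered here.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev

def bucket_means(points, bucket_width=1):
    """Group (age, score) pairs into age buckets; return mean and 95% CI per bucket."""
    buckets = defaultdict(list)
    for age, score in points:
        buckets[int(age // bucket_width) * bucket_width].append(score)
    summary = {}
    for start, scores in sorted(buckets.items()):
        m = mean(scores)
        if len(scores) > 1:
            # Normal approximation: mean +/- 1.96 * standard error.
            half = 1.96 * stdev(scores) / sqrt(len(scores))
        else:
            half = float("nan")  # no interval from a single observation
        summary[start] = (m, m - half, m + half)
    return summary

# Made-up sample data for illustration only.
sample = [(20, 3.0), (20, 3.4), (21, 3.2), (45, 2.5), (47, 2.9)]
by_year = bucket_means(sample, bucket_width=1)
by_decade = bucket_means(sample, bucket_width=10)
```

Widening `bucket_width` where observations are sparse, as in panel (c), trades age resolution for tighter intervals.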
Age, Attractiveness, and Gender
Pearson correlation matrix
Clustering
Attractiveness versus age, colored by cluster, 2000 points.
Clustering
Cluster centroids, tags, and exemplars
Clustering
Cluster centroids, tags, and exemplars
Conclusion
Our data indicates some familiar stereotypes.
Women are considered more attractive than men
Age has a stronger effect on attractiveness for women than for men
Also some potential surprises.
Babies are most attractive
Conservatives look more intelligent
The point of this example is not to reach any particular conclusion.
Instead, we want to show some examples of the rich set of significant patterns contained in a large, messy data set of human judgments.
Visualizing Urban Data
Crimespotting Project
Home Page
How to Get the Crime Data?
Collect further details on the crime reports
Determining the location of crime
Recognize the crime icon
Get an image from CrimeWatch server
A Sample Image
A sample image from CrimeWatch shows areas of theft, narcotics, robbery, and other crimes.
A Sample Image
The same sample image from CrimeWatch with programmatically recognized icons outlined.
A Sample Image
The same sample image with the reddish parts made white to show the red boxing glove icon more clearly.
Geolocation
A map of downtown Oakland showing three reference points for triangulation purposes.
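One way to use three reference points is to fit an affine map from image pixels to geographic coordinates. The sketch below does this with Cramer's rule. It is an assumption about how the triangulation could work, not the project's actual code, and the pixel positions and coordinates are made up. An affine fit also ignores map projection effects, which is only reasonable over a small area such as a downtown.

```python
def solve3(a, b):
    """Solve the 3x3 linear system a @ x = b with Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(a)
    if abs(d) < 1e-12:
        raise ValueError("reference points are collinear")
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for i in range(3):
            m[i][col] = b[i]
        solution.append(det(m) / d)
    return solution

def fit_affine(pixels, coords):
    """Fit lon = a*x + b*y + c and lat = d*x + e*y + f from three point pairs."""
    a = [[float(x), float(y), 1.0] for x, y in pixels]
    lon_coef = solve3(a, [c[0] for c in coords])
    lat_coef = solve3(a, [c[1] for c in coords])
    def to_geo(x, y):
        return (lon_coef[0] * x + lon_coef[1] * y + lon_coef[2],
                lat_coef[0] * x + lat_coef[1] * y + lat_coef[2])
    return to_geo

# Hypothetical reference points: pixel positions and their (lon, lat).
pixel_refs = [(0, 0), (100, 0), (0, 100)]
geo_refs = [(-122.30, 37.82), (-122.20, 37.82), (-122.30, 37.78)]
to_geo = fit_affine(pixel_refs, geo_refs)
icon_lon, icon_lat = to_geo(50, 50)
```

Once fitted, `to_geo` can be applied to the pixel position of every recognized icon in the same image.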
The Spotlight Feature
The type selector shows the total number of reports of each type in the selected time span
Conclusion
Crime is a serious issue for any urban resident. By visualizing crime data, we can help protect citizens more effectively.
The project has been a productive success, resulting in what we believe is a data service maximally useful to local residents.
City and government information is being moved onto the Internet to match the expectations of a connected, wired citizenry.
For more information about Crimespotting:
http://oakland.crimespotting.org/
Beautiful Political Data
Data Help Obama Win
Redistricting and Partisan Bias
Redistricting: the process of drawing United States electoral district boundaries, often in response to population changes determined by the results of the decennial census.
Partisan bias: a measure of how much the electoral system favors the Democrats or Republicans, after accounting for their vote share.
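To make the definition of partisan bias concrete, here is a deliberately simplified sketch. It assumes a uniform swing (every district shifts by the same amount) and equal-sized districts, then asks what seat share the Democrats would win if the average vote were exactly 50%. This is a common textbook simplification, not necessarily the statistical model behind the slides, and the district vote shares are made up.

```python
def partisan_bias_uniform_swing(dem_shares):
    """Seat share the Democrats would win at a 50% average vote, minus 0.5.

    Positive values favor the Democrats, negative values favor the Republicans.
    Assumes uniform swing and equal-sized districts (illustrative only).
    """
    shift = 0.5 - sum(dem_shares) / len(dem_shares)
    seats = sum(1 for share in dem_shares if share + shift > 0.5)
    return seats / len(dem_shares) - 0.5

# Made-up example: Democratic voters packed into a single district.
shares = [0.80, 0.45, 0.45, 0.45, 0.45]
bias = partisan_bias_uniform_swing(shares)
print(f"estimated partisan bias: {bias:+.2f}")
```

In this toy example the Democrats hold a majority of the vote on average yet would win only one of five seats at an even split, so the estimate is strongly negative.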
Redistricting and Partisan Bias
Effect of redistricting on partisan bias
Time Series of Estimates
Age and Voting
“Sure, young people voted heavily for Mr. Obama, but they voted heavily for John Kerry.” (Mark Penn, political consultant)
Was he right?
Age and Voting
Some graphs showing recent patterns of voting by age
Localized Partisanship in Pennsylvania
Geographic partisanship in Pennsylvania
Conclusion
Political data is increasingly accessible and is increasingly being plotted and shared in the media and on the web.
At the research level, articles in political science journals are starting to make use of graphical techniques for discovery and presentation of results.
Statistical visualization is expected to become more important and more widespread in political analysis.
Data Finds Data
An example: Corruption at the Roulette Wheel (Past Posting)
Data Finds Data
What can a “data finds data” system do for us?
Guest Convenience
Customer service
On the way to “data finds data”:
Data Finds Data
What can a “data finds data” system do for us?
Improved Child Safety
Cross-compartment Exploitation
Data Finds Data
What should we solve first?
All examples benefit from just-in-time discovery. However, we should first solve the “enterprise discoverability” problem.
Federated search:
Lacks the indexes necessary to locate a record efficiently.
Requires recursive processing.
Federated search cannot support the “data finds data” mission, because it has no ability to deliver on enterprise discoverability at scale.
Directories are necessary!
Conclusion
Determine how new observations relate to what is known.
Differentiate one organization from another.
Will likely become another building block from which the next generations of advanced analytics will benefit.
Exploring Your Life in Data
Web: about sharing, broadcasting, and distributing; about tracking, monitoring, and analyzing one's own habits and behaviors.
Tools: PEIR & YFD
Difference: PEIR runs in the background and automatically uploads data. YFD requires that users actively enter data.
Some Examples
DietSense
Family Dynamics
Walkability
Thanks to built-in sensors, all of these projects get people involved in their communities with just their mobile phones.
Visualization
• Traces are colored based on impact and exposure values.
• A different mapping scheme makes all trips on the map a single color and uses circles to encode impact and exposure.
• All traces are colored white, and the model values are visually represented with circles that vary in size at the end of each trip.
• Greater values are displayed as circles larger in area while lesser values are smaller in area.
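Because the circles encode values by area rather than by radius, the radius has to grow with the square root of the value. A minimal sketch, with a hypothetical function name and an arbitrary maximum radius in pixels:

```python
from math import sqrt

def radius_for_value(value, max_value, max_radius=20.0):
    """Return a radius such that the circle's AREA is proportional to value."""
    if max_value <= 0 or value <= 0:
        return 0.0
    return max_radius * sqrt(value / max_value)

r_full = radius_for_value(100, 100)
r_quarter = radius_for_value(25, 100)
print(r_full, r_quarter)
```

Scaling the radius linearly instead would exaggerate large values, since area grows with the square of the radius.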
Visualization
• We grayscaled map tiles and inverted the color filters so that map items that were originally lightly colored turned dark and vice versa.
• To be more specific, the terrain was originally lightly colored, so now it is dark gray, and roads that were originally dark are now light gray.
• This darkened map lets lightly colored traces stand out, and because the map is grayscale, there is less clashing.
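The grayscale-then-invert step described above can be sketched per pixel as follows. The luma weights are the common Rec. 601 values and are an assumption; the slides do not say which conversion PEIR used, and the sample colors are made up.

```python
def gray_invert(pixel):
    """Convert an (r, g, b) pixel to gray, then invert it."""
    r, g, b = pixel
    # Rec. 601 luma weights (assumed; not stated in the source).
    gray = round(0.299 * r + 0.587 * g + 0.114 * b)
    inverted = 255 - gray
    return (inverted, inverted, inverted)

# Made-up tile colors: light terrain and a dark road.
light_terrain = gray_invert((240, 240, 230))
dark_road = gray_invert((40, 40, 40))
print(light_terrain, dark_road)
```

After the transform the originally light terrain comes out darker than the originally dark road, which is the effect that lets light traces stand out.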
Visualization
• PEIR provides histograms to show distributions of impact and exposure for selected trips.
PEIR Interface
Design of Interface in YFD
Track of Feelings and Emotions
Conclusion
People who collect data about themselves are not necessarily after the actual data.
They are mostly interested in the resulting information and how they can use their own data to improve themselves.
We use data visualization to teach and to draw interest.
The Design of Sense.us
Is data beautiful?
How to make it beautiful?
An example to demonstrate: sense.us
Quartet: An Example
Four data sets
Same statistical properties
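The slide does not name the quartet, but four data sets with the same statistical properties matches Anscombe's quartet, so the sketch below assumes that is what is meant. It uses the published quartet values and checks that the means and Pearson correlations nearly coincide, even though the four scatterplots look very different.

```python
from math import sqrt

# Anscombe's quartet (assumed to be the "quartet" on this slide).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

def mean(values):
    return sum(values) / len(values)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

summary = {
    name: (mean(xs), mean(ys), pearson(xs, ys))
    for name, (xs, ys) in QUARTET.items()
}
for name, (mx, my, r) in summary.items():
    print(f"{name}: mean x = {mx:.2f}, mean y = {my:.2f}, r = {r:.3f}")
```

Summary statistics alone cannot tell these data sets apart, which is the case for plotting the data before trusting the numbers.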
Quartet: An Example
Back to Sense.us
Consider the correlation between two numerical values: scatterplot
To visualize change over time: line graph
Not always the case
Back to Sense.us
Our choice was influenced by Martin’s Baby Name Voyager visualization, a stacked graph of baby name popularity that became surprisingly popular online.
Stacked Graph
Job Voyager
Job Voyager visualization:
Left: an overview showing the constitution of the labor force over 150 years;
Right: a filtered view showing the percentage of farmers.
Differentiate Individual Series
When we filtered the view to show only males or only females.
Enable perceptual discrimination by varying color saturation in an arbitrary fashion.
Rather than vary colors arbitrarily, do so in a meaningful, data-driven way.
Subsequently vary color saturation according to socio-economic index scores for each occupation.
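A data-driven saturation mapping along these lines might look like the sketch below. The function name, the saturation floor, the fixed lightness, and the direction of the mapping (higher index score means more saturated) are all assumptions; the slides only say that saturation varies with the socio-economic index score.

```python
import colorsys

def color_for_occupation(hue, score, min_score, max_score):
    """Return an (r, g, b) tuple whose saturation scales with the index score."""
    span = max_score - min_score
    t = (score - min_score) / span if span else 0.0
    # Keep a floor so low-scoring occupations still show a tint of the hue.
    saturation = 0.2 + 0.8 * t
    r, g, b = colorsys.hls_to_rgb(hue, 0.5, saturation)
    return (round(r * 255), round(g * 255), round(b * 255))

# Hypothetical scores on a 10-90 scale, sharing one hue.
low = color_for_occupation(0.6, 10, 10, 90)
high = color_for_occupation(0.6, 90, 10, 90)
print(low, high)
```

Keeping the hue fixed within a group and varying only saturation lets adjacent series stay distinguishable while the color itself carries information.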
Stacked Graph
Birthplace Voyager
Birthplace Voyager visualization:
Left: an overview showing the distribution of birthplaces over 150 years;
Right: a filtered view showing the total number of European immigrants.
U.S. Census State Map and Scatterplot
Left: Interactive state map showing changes in each state’s population from 2000 to 2005;
Right: Scatterplot of U.S. states showing median household income (x-axis) versus retail sales (y-axis); New Hampshire and Delaware have the highest retail sales.
Population Pyramid
Population pyramid visualization:
Left: a comparison of the total number of males and females in each age group in 2000;
Right: the distribution of school attendees in 2000 (an annotation highlights the prevalence of adult education).
Conclusion
The combination of interactive visualization and social interpretation can help an audience more richly explore a data set.
The forms of analysis we observed in sense.us were exploratory in nature. The system had a clear educational benefit, and users reported that using sense.us was both enjoyable and informative.