Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
How I learned to stop visualizing and love statistics
1
You have a hunch
2
Visualizations sanity check
Statistics quantify the hunch
(Visualizations storytelling)
3
Someone says: “Obama got more small campaign
contributions than McCain”
4
5
???
6
7
© Jhguch on Wikipedia. All rights reserved. This content is excluded from our CreativeCommons license. For more information, see http://ocw.mit.edu/fairuse.
Median
8
9
25% 75%
10
Inner Quartile Range
11
Whiskers / Extremes
12
Outliers
13
Box-and-Whiskers Plot
14
???
15
Are they actually different?
T-Test
16
Obama McCain Obama McCain 17
Assume Reality
Obama McCain Obama McCain 18
How likely is given ? How likely is given ? How likely is given ?
Obama 19
McCain
avg1
avg2
Obama 20
McCain
avg1
avg2
Effect Size
Obama 21
McCain
avg1
avg2
vari
ance
1
vari
ance
2
How likely is given ? How likely is given ? How likely is given ?
avg1
avg2 avg2 avg2 avg2
avg1 avg1 avg1
vari
ance
1
vari
ance
2
22
How likely are they equal given avg/variance differences?
Probablility p
p is low p is high Obama, McCain Don’t trust
are different the difference (significant) 23 (not significant)
Probablility
p is high
Significance is binary
• Pick a threshold: .01? .05?
• Is p > threshold, or < threshold?
p < .05? significant
p > .05? don’t trust the difference 24
avg1
avg2 avg2 avg2 avg2
avg1 avg1 avg1
vari
ance
1
vari
ance
2
T-Test Signifiance
# Samples Obama: >1M
McCain: >1M +
25
Correlation, Linear Regression
26
County Health Rankings
• Every county in USA
• Years of Potential Life Lost (YPLL): early morbidity
– less is good
– more is bad
• Median income, % population w/ diabetes,
% population under 18, …
27
What is correlated with early death in a community?
Burgers Sleep
Education Exercise
# Rappers Your theory here
28
29
y = mx + b
R2 (0 to 1)
p < .05?
Line coefficients:
Correlation amount:
Significance:
30
31
Correlation != Causation
Correlation
Causal Hunch
Randomized Trial
T-Test! 32
Causal Hunch
Randomized Trial
T-Test!
MIT OpenCourseWarehttp://ocw.mit.edu
Resource: How to Process, Analyze and Visualize DataAdam Marcus and Eugene Wu
The following may not correspond to a particular course on MIT OpenCourseWare, but has been provided by the author as an individual learning resource.
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.