Upload
zeeshanshhd
View
215
Download
0
Embed Size (px)
Citation preview
7/30/2019 Math is Music - Stats is Literature, 2004
1/56
September 18, 2004 AWL Workshop -- HACC 1
Math is MusicMath is MusicStats is LiteratureStats is Literature
Dick De Veaux - Williams CollegeThanks also to Paul Velleman,
Cornell University
Or why are there no six year old novelists?Or why are there no six year old novelists?
7/30/2019 Math is Music - Stats is Literature, 2004
2/56
September 18, 2004 AWL Workshop -- HACC 2
ProdigiesProdigies Math, music, chess
Gauss, Pascal Mozart, Schubert, Mendelssohn
Bobby Fischer
Why these three areas? Each creates its own world with its
own set of rules
There is no experience required Once you know the rules, you are freeto create anything
7/30/2019 Math is Music - Stats is Literature, 2004
3/56
September 18, 2004 AWL Workshop -- HACC 3
Prodigies in LiteratureProdigies in Literature
Mary Wollstonecraft Shelley Age 19
Created Frankenstein, an imaginarycreature
Others? Why?
Literature is about the world, not aboutrules. It deals with lifes experience andthe wisdom we develop over time.
7/30/2019 Math is Music - Stats is Literature, 2004
4/56
September 18, 2004 AWL Workshop -- HACC 4
StatisticsStatistics What doWhat dostudents find hard?students find hard?
Understood the material in class, butfound it hard to do the homework
Should be more like a math course,with everything laid out beforehand
More problems in class should be like
the HW and tests
7/30/2019 Math is Music - Stats is Literature, 2004
5/56
September 18, 2004 AWL Workshop -- HACC 5
What is easy?What is easy?
The math part Give them the formula, they can get the
answer with some training
The hard part Putting it all together Real world
Experience
Methods
7/30/2019 Math is Music - Stats is Literature, 2004
6/56
September 18, 2004 AWL Workshop -- HACC 6
Whats Hard?Whats Hard? ----ExampleExample
7/30/2019 Math is Music - Stats is Literature, 2004
7/56
September 18, 2004 AWL Workshop -- HACC 7
TT--CodeCode
7/30/2019 Math is Music - Stats is Literature, 2004
8/56
7/30/2019 Math is Music - Stats is Literature, 2004
9/56
September 18, 2004 AWL Workshop -- HACC 9
Whats Hard?Whats Hard?Five Unnatural ActsFive Unnatural Acts
Think Critically
Be Skeptical
Focus not on what we know, buton what we dont know
Think first about Variation
Think clearly about Conditioningand Rare events
7/30/2019 Math is Music - Stats is Literature, 2004
10/56
September 18, 2004 AWL Workshop -- HACC 10
Statistics is UnnaturalStatistics is Unnaturaland Subversiveand Subversive
We ask students to Question the data
Examine the assumptions
Reject the null hypothesis
Have they done this in mathclass?
Convincing them to be subversivemay be easier than you think
7/30/2019 Math is Music - Stats is Literature, 2004
11/56
September 18, 2004 AWL Workshop -- HACC 11
ThinkThink CriticallyCritically Challenge the datas credentials.
Look for bias. Know what we want to know.
Whats the QUESTION?
Look for Lurking variables. Check Assumptions and Conditions.
Critical thinking requires creativity. Youmust think about things that are not infront of you and imagine ways in whichthings might have gone wrong.
7/30/2019 Math is Music - Stats is Literature, 2004
12/56
September 18, 2004 AWL Workshop -- HACC 12
Be SkepticalBe Skeptical Be cautious about making claims
based on data. Trust every analysis, but plot the
residuals. Skeptical statisticians expect the
unexpected, so we go looking for it.
SHOW that the analysis isappropriate
7/30/2019 Math is Music - Stats is Literature, 2004
13/56
September 18, 2004 AWL Workshop -- HACC 13
Ancient HistoryAncient History
The vote in the 2000 Presidential election forBuchanan and the vote for Nader, (the two
principal alternatives to Bush and Gore), has
a correlation of 0.65 over the counties ofFlorida.
Ask: Is the relationship linear?
Is the data set homogeneous or are there subgroups?
Are there any outliers?
7/30/2019 Math is Music - Stats is Literature, 2004
14/56
September 18, 2004 AWL Workshop -- HACC 14
Plot the DataPlot the Data
750
1500
2250
3000
2500 5000 7500
NADER
B
U
C
H
A
N
A
N
Without Palm Beach county and itsbutterfly ballot, the correlation is 0.91.
7/30/2019 Math is Music - Stats is Literature, 2004
15/56
September 18, 2004 AWL Workshop -- HACC 15
Hypothesis TestingHypothesis Testing
Skepticism formalized
The null hypothesis is a skepticalclaim about the data
Its unnatural to show theopposite
7/30/2019 Math is Music - Stats is Literature, 2004
16/56
September 18, 2004 AWL Workshop -- HACC 16
Critical Thinking andCritical Thinking and
SkepticismSkepticism
Critical thinking is open-endedquestioning of the datas credentials. We wonder whether the data are competent to tell
us what we want to know.
Skepticism questions whether whatthe data appear to be telling us is the
whole truth.
7/30/2019 Math is Music - Stats is Literature, 2004
17/56
September 18, 2004 AWL Workshop -- HACC 17
Focus on What We DontFocus on What We Dont
KnowKnow
In most science and mathcourses, we focus on what we
know Statisticians are a bit perverse
7/30/2019 Math is Music - Stats is Literature, 2004
18/56
September 18, 2004 AWL Workshop -- HACC 18
Confidence IntervalsConfidence Intervals
We dont say The mean is 31.2.
We dont say The mean is probably 31.2
We dont say The mean is close to 31.2.
All we can manage is The mean is close to 31.2. Probably
(and, in fact, Im willing to admit I may bewrong and to spend the effort to give you a
whole interval of plausible values and then to
spend extra effort to estimate how likely it is
that even that interval is wrong.)
7/30/2019 Math is Music - Stats is Literature, 2004
19/56
September 18, 2004 AWL Workshop -- HACC 19
All Models are WrongAll Models are WrongGeorge Box:
All models are wrong but some areuseful
Statisticians, like artists, have the badhabit of falling in love with their models
But, statisticians love models--because theyare wrong.
What do we focus on?residuals!
what the model fails to account for
7/30/2019 Math is Music - Stats is Literature, 2004
20/56
September 18, 2004 AWL Workshop -- HACC 20
VariationVariation
Students find it easier to thinkabout values rather than variation,
butStatistics is about Variation
7/30/2019 Math is Music - Stats is Literature, 2004
21/56
September 18, 2004 AWL Workshop -- HACC 21
ExampleExample
A town has two hospitals Large hospital about 100 babies a day
Smaller hospitals about 15 babies a day
Over the course of the year, whichhospital (if either) would probably have
more days in which more than 60% of
the babies born are male?
7/30/2019 Math is Music - Stats is Literature, 2004
22/56
September 18, 2004 AWL Workshop -- HACC 22
The Standard DeviationThe Standard Deviation
is the Statisticians Ruleris the Statisticians Ruler
Most of the inference seen in theintroductory course compares a
statistic to its standard deviationto see whether it is big.
This idea carries into advancedmethods as well.
7/30/2019 Math is Music - Stats is Literature, 2004
23/56
September 18, 2004 AWL Workshop -- HACC 23
Thinking aboutThinking about
Conditional EventsConditional Events
This is just plain hard.
It is easy to show that we dont
naturally think clearly aboutconditional probabilities.
But we must for rational decision
making.
7/30/2019 Math is Music - Stats is Literature, 2004
24/56
September 18, 2004 AWL Workshop -- HACC 24
LindaLinda((TverskyTversky && KahnemanKahneman))
Linda is 31 years old, single,outspoken, and very bright. Shemajored in philosophy. As a student,she was deeply concerned withissues of discrimination and social
justice, and she participated inantinuclear demonstrations.
7/30/2019 Math is Music - Stats is Literature, 2004
25/56
September 18, 2004 AWL Workshop -- HACC 25
Order these in order of LikelihoodOrder these in order of Likelihood
a) Linda is a teacher in an elementary school
b) Linda works in a bookstore and takes yogaclasses.
c) Linda is active in the feminist movement.
d) Linda is a psychiatric social worker
e) Linda is a member of the League of WomenVoters.
f) Linda is a bank teller.
g) Linda is an insurance salesperson.
h) Linda is a bank teller who is active in the feministmovement.
7/30/2019 Math is Music - Stats is Literature, 2004
26/56
September 18, 2004 AWL Workshop -- HACC 26
Pick a number at RandomPick a number at Random
7/30/2019 Math is Music - Stats is Literature, 2004
27/56
September 18, 2004 AWL Workshop -- HACC 27
Random?Random?
0 1 2 3 4
7/30/2019 Math is Music - Stats is Literature, 2004
28/56
September 18, 2004 AWL Workshop -- HACC 28
Random IIRandom II
10
20
30
Count
1 2 3 4 5 6 7 8 9 10 12
7/30/2019 Math is Music - Stats is Literature, 2004
29/56
September 18, 2004 AWL Workshop -- HACC 29
Is Statistical ThinkingIs Statistical Thinking
Unnatural?Unnatural?
We havent evolved to be Statisticians. Our students who think Statistics is an
unnatural subject are right. This isnt howhumans think naturally.
But it is how humans think rationally. And itis how scientists think. This is the way wemust think if we are to make progress inunderstanding how the world works and,
for that matter, how we ourselves work.
7/30/2019 Math is Music - Stats is Literature, 2004
30/56
September 18, 2004 AWL Workshop -- HACC 30
How can we help?How can we help?
Give them an outline for putting
the real world into a framework Whats the problem?
The Ws
The model The method
What are the mechanics?
What have we learned?
7/30/2019 Math is Music - Stats is Literature, 2004
31/56
September 18, 2004 AWL Workshop -- HACC 31
ThinkThink ShowShow ---- TellTell
THINK: What techniques apply?
SHOW: Mechanics how to do it.
TELL: Explain what you learned.
7/30/2019 Math is Music - Stats is Literature, 2004
32/56
September 18, 2004 AWL Workshop -- HACC32
7/30/2019 Math is Music - Stats is Literature, 2004
33/56
September 18, 2004AWL Workshop -- HACC
33
The Three Rules of DataThe Three Rules of Data
AnalysisAnalysis
I. Make a Pictureit will help you think about the data
II. Make a Picture
it may show unexpected features
III. Make a Pictureit will help you tell others what
youve found.
These are made easier with technology!
7/30/2019 Math is Music - Stats is Literature, 2004
34/56
September 18, 2004AWL Workshop -- HACC
34
Know the Datas WsKnow the Datas Ws Who is the data about?
Whats a row?
What is measured? What are the columns?
And in what units?
When was it measured?
Where was it measured?
Ho(W) was it measured?
Why was it measured?
7/30/2019 Math is Music - Stats is Literature, 2004
35/56
September 18, 2004AWL Workshop -- HACC
35
The WsThe Ws
14718833912040.5383.36.02USALance Armstrong2004
14718934272040.9483.41.12USALance Armstrong2003
15318932782039.9382.05.12USALance Armstrong2002
14418934532040.0286.17.28USALance Armstrong2001
12818036622139.5692.33.08USALance Armstrong2000
14118036872040.391.32.16USALance Armstrong1999
3611444881428.7156.09.31FranceLucien Petit-Breton1908
339344881428.5156.22.30FranceLucien Petit-Breton1907
148246371324.5185.47.26FranceRene Pottier1906
246029751127.3112.18.09FranceLouisTrousselier1905
23882388624.396.05.00FranceHenri Cornet1904
21602428625.394.33.00FranceMaurice Garin1903
FinishStartDis (km)StagesSpeedTimeCountryWinnerYear
7/30/2019 Math is Music - Stats is Literature, 2004
36/56
September 18, 2004AWL Workshop -- HACC
36
The ModelThe Model Statistics is about models
A model is a simplification ofreality.
We know its not perfect
Two quotations from George Box
7/30/2019 Math is Music - Stats is Literature, 2004
37/56
September 18, 2004AWL Workshop -- HACC
37
Common ModelsCommon Models Probability models
Regression model
7/30/2019 Math is Music - Stats is Literature, 2004
38/56
September 18, 2004AWL Workshop -- HACC
38
CommonCommon ModelsModels Simulation
7/30/2019 Math is Music - Stats is Literature, 2004
39/56
September 18, 2004
AWL Workshop -- HACC39
Pay Dirt ModelsPay Dirt Models
Sampling distribution models
By now students know that models areidealized
Theyve seen probability models and
simulations: CLT follows naturally Null hypothesis models
Wrong (we hope) but useful
7/30/2019 Math is Music - Stats is Literature, 2004
40/56
September 18, 2004
AWL Workshop -- HACC40
ModelsModels
Require assumptionsBecause they are idealized, they are only really
true under idealized assumptions
Are described by parametersParameters refer to models of populations, not
to the populations themselves
7/30/2019 Math is Music - Stats is Literature, 2004
41/56
September 18, 2004
AWL Workshop -- HACC41
Assumptions andAssumptions and
ConditionsConditions
Some assumptions we must justassume. (Pretend)
Many can be checked forplausibility with appropriateconditions Often the conditions are graphical
(Remember the 3 rules)
Few are really true
7/30/2019 Math is Music - Stats is Literature, 2004
42/56
September 18, 2004 AWL Workshop -- HACC 42
Conditions to CheckConditions to Check Summary statistics
Quantitative data condition.
T-test Assumption is that data are Normal
Rule of thumb? 30? 50? 100?
Nearly normal condition--make a picture
Variable -- TCODE
Mean 54.41
Std Dev 957.50
Std Err Mean 3.11
upper 95% Mea 60.51lower 95% Mean 48.31
7/30/2019 Math is Music - Stats is Literature, 2004
43/56
September 18, 2004 AWL Workshop -- HACC 43
7/30/2019 Math is Music - Stats is Literature, 2004
44/56
September 18, 2004 AWL Workshop -- HACC 44
Show, with TechnologyShow, with Technology
Calculation is for calculators andstatistics packages. Let them do it, so students can think about
statistical thinking.
Show generic output rather than aparticular package.
Let them do it so we can play Statistics
7/30/2019 Math is Music - Stats is Literature, 2004
45/56
September 18, 2004 AWL Workshop -- HACC 45
Play StatsPlay Stats
M H lM H l
7/30/2019 Math is Music - Stats is Literature, 2004
46/56
September 18, 2004 AWL Workshop -- HACC 46
More HelpMore Help
Reality ChecksReality Checks The answer is wrong if it makes no sense --
even if you pushed the buttons you meant topush or gave the command you intended
Check that the results are plausible
Remember the units!
7/30/2019 Math is Music - Stats is Literature, 2004
47/56
September 18, 2004 AWL Workshop -- HACC 47
7/30/2019 Math is Music - Stats is Literature, 2004
48/56
September 18, 2004 AWL Workshop -- HACC 48
Draw ConclusionsDraw Conclusions
Plot the data, but then say what you
see. Give guidance for how to see
Reject the null hypothesis, but then
provide a CI to assess effect size. Emphasize interplay between tests and CI
Think about costs and consequences.
Dont be satisfied with I rejected Ho
7/30/2019 Math is Music - Stats is Literature, 2004
49/56
September 18, 2004 AWL Workshop -- HACC 49
What Can Go Wrong?What Can Go Wrong?
Acknowledge common
misapplications and misinterpretationsof statistics.
(Hope to) Minimize them in Tellingwhat was found.
7/30/2019 Math is Music - Stats is Literature, 2004
50/56
September 18, 2004 AWL Workshop -- HACC 50
7/30/2019 Math is Music - Stats is Literature, 2004
51/56
September 18, 2004 AWL Workshop -- HACC 51
StepStep--ByBy--StepStep
Encourage students to bring all ofthese ideas together when they
solve a statistical problem.
Illustrate how, step-by-step
7/30/2019 Math is Music - Stats is Literature, 2004
52/56
September 18, 2004 AWL Workshop -- HACC 52
7/30/2019 Math is Music - Stats is Literature, 2004
53/56
September 18, 2004 AWL Workshop -- HACC 53
7/30/2019 Math is Music - Stats is Literature, 2004
54/56
September 18, 2004 AWL Workshop -- HACC 54
7/30/2019 Math is Music - Stats is Literature, 2004
55/56
September 18, 2004 AWL Workshop -- HACC 55
Take Home MessagesTake Home Messages
Stats is about the real world: Technology frees the student to think
about the world
Give the student a structure for a chaotic
world Root the course in examples taken from
the students lives to make the connection
apparent
Help them with unnatural thinking
7/30/2019 Math is Music - Stats is Literature, 2004
56/56
September 18, 2004 AWL Workshop -- HACC 56
Thank you !!Thank you !!