Math is Music - Stats is Literature, 2004


  • 7/30/2019 Math is Music - Stats is Literature, 2004

    1/56

    September 18, 2004 AWL Workshop -- HACC 1

    Math is Music - Stats is Literature

    Dick De Veaux - Williams College

    Thanks also to Paul Velleman, Cornell University

    Or why are there no six-year-old novelists?


    Prodigies

    Math, music, chess

      Gauss, Pascal

      Mozart, Schubert, Mendelssohn

      Bobby Fischer

    Why these three areas?

      Each creates its own world with its own set of rules

      There is no experience required

      Once you know the rules, you are free to create anything


    Prodigies in Literature

    Mary Wollstonecraft Shelley

      Age 19

      Created Frankenstein, an imaginary creature

    Others?

    Why?

    Literature is about the world, not about rules. It deals with life's experience and the wisdom we develop over time.


    Statistics: What do students find hard?

    Understood the material in class, but found it hard to do the homework

    Should be more like a math course, with everything laid out beforehand

    More problems in class should be like the HW and tests


    What is easy?

    The math part

      Give them the formula, they can get the answer with some training

    The hard part

      Putting it all together

        Real world

        Experience

        Methods


    What's Hard? -- Example


    T-Code


    What's Hard? Five Unnatural Acts

    Think Critically

    Be Skeptical

    Focus not on what we know, but on what we don't know

    Think first about Variation

    Think clearly about Conditioning and Rare events


    Statistics is Unnatural and Subversive

    We ask students to

      Question the data

      Examine the assumptions

      Reject the null hypothesis

    Have they done this in math class?

    Convincing them to be subversive may be easier than you think


    Think Critically

    Challenge the data's credentials.

      Look for bias.

    Know what we want to know.

      What's the QUESTION?

    Look for Lurking variables.

    Check Assumptions and Conditions.

    Critical thinking requires creativity. You must think about things that are not in front of you and imagine ways in which things might have gone wrong.


    Be Skeptical

    Be cautious about making claims based on data.

    Trust every analysis, but plot the residuals.

    Skeptical statisticians expect the unexpected, so we go looking for it.

    SHOW that the analysis is appropriate
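A minimal sketch of why "plot the residuals" matters, using made-up data rather than anything from the talk: the points below follow a curve (y = x squared), a straight line is fitted by ordinary least squares, and the residuals are printed as a crude text plot. The data, the helper name `fit_line`, and the text-plot format are all assumptions chosen for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data with a curved (not linear) relationship.
xs = list(range(1, 11))
ys = [x ** 2 for x in xs]

intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Crude text "plot": one row per point, sign and size of the residual.
for x, e in zip(xs, residuals):
    bar = ("+" if e > 0 else "-") * round(abs(e))
    print(f"x={x:2d}  residual={e:+6.1f}  {bar}")
```

The fitted line looks respectable on its own, but the residuals are positive at both ends and negative in the middle. That bend is the kind of pattern a skeptical analyst goes looking for.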


    Ancient History

    The vote in the 2000 Presidential election for Buchanan and the vote for Nader (the two principal alternatives to Bush and Gore) have a correlation of 0.65 over the counties of Florida.

    Ask:

      Is the relationship linear?

      Is the data set homogeneous or are there subgroups?

      Are there any outliers?


    Plot the Data

    [Figure: scatterplot of BUCHANAN votes (vertical axis) against NADER votes (horizontal axis)]

    Without Palm Beach county and its butterfly ballot, the correlation is 0.91.
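The effect of a single unusual point on a correlation can be sketched with synthetic data. The numbers below are NOT the Florida county returns; they are randomly generated, and the one "outlier" point, the seed, and the helper `pearson_r` are assumptions made purely for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Synthetic points: y is roughly proportional to x, plus noise.
rng = random.Random(2004)
x = [rng.uniform(100, 7500) for _ in range(60)]
y = [0.1 * xi + rng.gauss(0, 60) for xi in x]

# One hypothetical outlier: an ordinary x value with a far-too-large y value.
x_all = x + [5500]
y_all = y + [3400]

r_without = pearson_r(x, y)
r_with = pearson_r(x_all, y_all)
print(f"r without the outlier: {r_without:.2f}")
print(f"r with the outlier:    {r_with:.2f}")
```

One point out of 61 is enough to pull the correlation down noticeably, which is why the plot comes before the summary statistic.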


    Hypothesis Testing

    Skepticism formalized

    The null hypothesis is a skeptical claim about the data

    It's unnatural to show the opposite


    Critical Thinking and Skepticism

    Critical thinking is open-ended questioning of the data's credentials.

      We wonder whether the data are competent to tell us what we want to know.

    Skepticism questions whether what the data appear to be telling us is the whole truth.


    Focus on What We Don't Know

    In most science and math courses, we focus on what we know

    Statisticians are a bit perverse


    Confidence Intervals

    We don't say "The mean is 31.2."

    We don't say "The mean is probably 31.2."

    We don't say "The mean is close to 31.2."

    All we can manage is "The mean is close to 31.2. Probably."

    (and, in fact, I'm willing to admit I may be wrong and to spend the effort to give you a whole interval of plausible values and then to spend extra effort to estimate how likely it is that even that interval is wrong.)


    All Models are Wrong

    George Box: "All models are wrong but some are useful"

    Statisticians, like artists, have the bad habit of falling in love with their models

    But statisticians love models -- because they are wrong.

    What do we focus on? Residuals!

      what the model fails to account for


    Variation

    Students find it easier to think about values rather than variation, but Statistics is about Variation


    Example

    A town has two hospitals

      Large hospital: about 100 babies a day

      Smaller hospital: about 15 babies a day

    Over the course of the year, which hospital (if either) would probably have more days in which more than 60% of the babies born are male?
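One way to check the hospital question is with a simple binomial model. The sketch below assumes each birth is independently a boy with probability 0.5 and that each hospital has exactly 100 or 15 births every day; neither assumption is stated in the slides, and the function name is hypothetical.

```python
from math import comb

def prob_more_than_60pct_boys(n_births, p_boy=0.5):
    """P(more than 60% of n_births are boys) under a Binomial(n_births, p_boy) model."""
    # Largest boy count that is NOT more than 60% of the births.
    threshold = (6 * n_births) // 10
    return sum(
        comb(n_births, k) * p_boy ** k * (1 - p_boy) ** (n_births - k)
        for k in range(threshold + 1, n_births + 1)
    )

p_large = prob_more_than_60pct_boys(100)  # needs 61 or more boys
p_small = prob_more_than_60pct_boys(15)   # needs 10 or more boys

for name, p in (("large", p_large), ("small", p_small)):
    print(f"{name} hospital: P = {p:.4f}, about {365 * p:.0f} such days a year")
```

Under these assumptions the smaller hospital has far more such days. The proportion of boys varies more from day to day when it is based on 15 births than when it is based on 100, which is the point about thinking first about variation.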


    The Standard Deviation is the Statistician's Ruler

    Most of the inference seen in the introductory course compares a statistic to its standard deviation to see whether it is big.

    This idea carries into advanced methods as well.


    Thinking about Conditional Events

    This is just plain hard.

    It is easy to show that we don't naturally think clearly about conditional probabilities.

    But we must for rational decision making.


    Linda (Tversky & Kahneman)

    Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and she participated in antinuclear demonstrations.


    Order these in order of Likelihood

    a) Linda is a teacher in an elementary school.

    b) Linda works in a bookstore and takes yoga classes.

    c) Linda is active in the feminist movement.

    d) Linda is a psychiatric social worker.

    e) Linda is a member of the League of Women Voters.

    f) Linda is a bank teller.

    g) Linda is an insurance salesperson.

    h) Linda is a bank teller who is active in the feminist movement.
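The slides do not spell out the resolution, but the standard probability rule behind this exercise can be written down. Option (h) is the conjunction of (f) and (c), so whatever one believes about Linda, (h) cannot be more likely than (f). The event labels below are shorthand introduced here for illustration.

```latex
\[
P(\text{bank teller} \cap \text{feminist})
  = P(\text{bank teller}) \, P(\text{feminist} \mid \text{bank teller})
  \le P(\text{bank teller}),
\]
since $0 \le P(\text{feminist} \mid \text{bank teller}) \le 1$.
```

Ranking (h) above (f) therefore violates this inequality, however well the description seems to fit.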


    Pick a number at Random


    Random?

    0 1 2 3 4


    Random II

    [Figure: histogram with Count on the vertical axis (ticks at 10, 20, 30) and values from 1 to 12 on the horizontal axis]


    Is Statistical Thinking Unnatural?

    We haven't evolved to be Statisticians.

    Our students who think Statistics is an unnatural subject are right. This isn't how humans think naturally.

    But it is how humans think rationally. And it is how scientists think. This is the way we must think if we are to make progress in understanding how the world works and, for that matter, how we ourselves work.


    How can we help?

    Give them an outline for putting the real world into a framework

      What's the problem?

      The W's

      The model

      The method

      What are the mechanics?

      What have we learned?


    Think -- Show -- Tell

    THINK: What techniques apply?

    SHOW: Mechanics, how to do it.

    TELL: Explain what you learned.


    The Three Rules of Data Analysis

    I. Make a Picture -- it will help you think about the data

    II. Make a Picture -- it may show unexpected features

    III. Make a Picture -- it will help you tell others what you've found.

    These are made easier with technology!


    Know the Data's W's

    Who is the data about?

      What's a row?

    What is measured?

      What are the columns?

      And in what units?

    When was it measured?

    Where was it measured?

    Ho(W) was it measured?

    Why was it measured?


    The W's

    Year  Winner                Country  Time       Speed  Stages  Dis (km)  Start  Finish
    1903  Maurice Garin         France   94.33.00   25.3   6       2428      60     21
    1904  Henri Cornet          France   96.05.00   24.3   6       2388      88     23
    1905  Louis Trousselier     France   112.18.09  27.3   11      2975      60     24
    1906  Rene Pottier          France   185.47.26  24.5   13      4637      82     14
    1907  Lucien Petit-Breton   France   156.22.30  28.5   14      4488      93     33
    1908  Lucien Petit-Breton   France   156.09.31  28.7   14      4488      114    36
    1999  Lance Armstrong       USA      91.32.16   40.3   20      3687      180    141
    2000  Lance Armstrong       USA      92.33.08   39.56  21      3662      180    128
    2001  Lance Armstrong       USA      86.17.28   40.02  20      3453      189    144
    2002  Lance Armstrong       USA      82.05.12   39.93  20      3278      189    153
    2003  Lance Armstrong       USA      83.41.12   40.94  20      3427      189    147
    2004  Lance Armstrong       USA      83.36.02   40.53  20      3391      188    147


    The Model

    Statistics is about models

    A model is a simplification of reality.

    We know it's not perfect

    Two quotations from George Box


    Common Models

    Probability models

    Regression model


    Common Models

    Simulation


    Pay Dirt Models

    Sampling distribution models

      By now students know that models are idealized

      They've seen probability models and simulations: CLT follows naturally

    Null hypothesis models

      Wrong (we hope) but useful
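A minimal simulation sketch of the "CLT follows naturally" step, assuming a right-skewed Exponential(1) population (mean 1, standard deviation 1). The population, the sample size of 50, the 5000 repetitions, and the seed are choices made here for illustration, not taken from the talk.

```python
import random
import statistics

rng = random.Random(18)

def sample_mean(n):
    """Mean of n draws from a right-skewed Exponential(1) population."""
    return statistics.fmean([rng.expovariate(1.0) for _ in range(n)])

n = 50
means = [sample_mean(n) for _ in range(5000)]

center = statistics.fmean(means)
spread = statistics.stdev(means)
print(f"mean of sample means: {center:.3f}  (population mean is 1)")
print(f"SD of sample means:   {spread:.3f}  (sigma / sqrt(n) = {1 / n ** 0.5:.3f})")
```

The simulated sample means cluster around the population mean with a spread close to sigma / sqrt(n), even though the individual values are strongly skewed. A histogram of `means` would be the natural picture to show alongside this.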


    Models

    Require assumptions

      Because they are idealized, they are only really true under idealized assumptions

    Are described by parameters

      Parameters refer to models of populations, not to the populations themselves


    Assumptions and Conditions

    Some assumptions we must just assume. (Pretend)

    Many can be checked for plausibility with appropriate conditions

      Often the conditions are graphical (Remember the 3 rules)

    Few are really true


    Conditions to Check

    Summary statistics

      Quantitative data condition.

    T-test

      Assumption is that data are Normal

      Rule of thumb? 30? 50? 100?

      Nearly normal condition -- make a picture

    Variable -- TCODE

    Mean            54.41
    Std Dev         957.50
    Std Err Mean    3.11
    upper 95% Mean  60.51
    lower 95% Mean  48.31
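The interval in the TCODE output can be reproduced from the other three numbers. This is a sketch that assumes the package used a critical value close to the Normal 1.96 (reasonable when the sample is large) and that the standard error is the standard deviation divided by the square root of n; the output does not state either.

```python
# Values copied from the summary output above.
mean = 54.41
std_dev = 957.50
std_err = 3.11

z_star = 1.96  # assumption: Normal critical value for 95% confidence

lower = mean - z_star * std_err
upper = mean + z_star * std_err

# Since std_err = std_dev / sqrt(n), the sample size is implied (approximately,
# because the printed values are rounded).
implied_n = (std_dev / std_err) ** 2

print(f"95% CI: ({lower:.2f}, {upper:.2f})")
print(f"implied sample size: about {implied_n:,.0f}")
```

The endpoints match the printed 48.31 and 60.51. The arithmetic says nothing about whether the interval is appropriate, though: a standard deviation more than seventeen times the mean is a strong hint to make a picture before trusting it.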


    Show, with Technology

    Calculation is for calculators and statistics packages.

      Let them do it, so students can think about statistical thinking.

    Show generic output rather than a particular package.

    Let them do it so we can play Statistics


    Play Stats


    More Help

    Reality Checks

      The answer is wrong if it makes no sense -- even if you pushed the buttons you meant to push or gave the command you intended

      Check that the results are plausible

      Remember the units!


    Draw Conclusions

    Plot the data, but then say what you see.

      Give guidance for how to see

    Reject the null hypothesis, but then provide a CI to assess effect size.

      Emphasize interplay between tests and CI

    Think about costs and consequences.

    Don't be satisfied with "I rejected H0"


    What Can Go Wrong?

    Acknowledge common misapplications and misinterpretations of statistics.

    (Hope to) Minimize them in Telling what was found.


    Step-By-Step

    Encourage students to bring all of these ideas together when they solve a statistical problem.

    Illustrate how, step-by-step


    Take Home Messages

    Stats is about the real world:

      Technology frees the student to think about the world

      Give the student a structure for a chaotic world

      Root the course in examples taken from the students' lives to make the connection apparent

      Help them with unnatural thinking


    Thank you!!