Lecture 1 on R programming

  • 7/27/2019 Lecture 1 on R programming

    STAT 2

    Lecture 1:

    An introduction

    About the lecturer

    Brad Luen

    [email protected] (put STAT 2 in subject line)

    http://www.stat.berkeley.edu/users/bradluen/stat2/

    Why are we here?

    World is full of data

    Statistics lets us make sense of data

    Therefore, statistics helps us make sense of the world

    Why are we here?

    It's a requirement...

    It's a prerequisite...

    It's easy units...

    Why are we here?

    Statistical literacy: understand statistical statements

    Statistical reasoning: draw conclusions from statistical statements

    Statistical thinking: investigate problems statistically

    What we're going to do today

    Course outline: everything you have to do this semester

    Course structure: everything you need to know about statistics, in half an hour

    Course Structure

    Week in, week out

    Textbook: Statistics by Freedman, Pisani & Purves, 3rd or 4th ed.

    Lectures: M-F 10-11 am, here

    Discussion: M-Th: 11 am in 332 Evans; 11 am in 344 Evans; 12 pm in 344 Evans

    Grading

    Weekly quizzes: 20% (best 5 of 6)

    Midterm: Friday 18th July: 30%

    Final: Friday 15th August: 50%

    For full schedule, see course webpage

    First quiz: this Thursday during discussion

    Where to get help

    Brad's office hours: Wed 11 am - 1 pm

    Partha's office hours: W 9-10 am, Th 3-4 pm

    Daniel's office hours: Tu 9-10 am, 2-3 pm

    Probably in 307 Evans but to be confirmed

    Protips

    Don't fall behind!

    Read the book!

    Come to office hours!

    Questions?

    I

    Dealing with data: weeks 1 and 2

    Design of experiments

    How do you design an experiment to show what you want to show?

    How can you set up a fair comparison?

    What if you can't do an experiment?

    Summarising data

    Summarise data through:

    Graphs

    Averages

    Spreads

    Mistakes in measurement

    Physics sez: V = IR

    Is physics right?

    II

    The best fit: weeks 3 and 4

    Correlation

    How strong is the relationship?
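
One common way to put a number on the strength of a linear relationship is the correlation coefficient r. The sketch below is only an illustration: the function name `correlation` and the height and weight values are made up, not taken from the course.

```python
import math

def correlation(xs, ys):
    """Return the correlation coefficient r of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Standard deviations (dividing by n).
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    # r is the average of the products of the values in standard units.
    products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return products / (n * sd_x * sd_y)

# Hypothetical heights (inches) and weights (pounds), for illustration only.
heights = [62, 65, 68, 70, 73]
weights = [120, 140, 150, 165, 180]
r = correlation(heights, weights)
print(f"r = {r:.3f}")
```

Values of r near 1 or -1 indicate a strong linear relationship; values near 0 indicate a weak one.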

    Regression

    Which line shows the average weight given the person's height?
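
A minimal sketch of one standard answer, the least-squares line, assuming the relationship is roughly linear. The function name `regression_line` and the data are hypothetical.

```python
def regression_line(xs, ys):
    """Return (slope, intercept) of the least-squares line for y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The line passes through the point of averages.
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical heights (inches) and weights (pounds), for illustration only.
heights = [62, 65, 68, 70, 73]
weights = [120, 140, 150, 165, 180]
slope, intercept = regression_line(heights, weights)
print(f"estimated average weight = {slope:.2f} * height + {intercept:.2f}")
```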

    Prediction

    How accurately can we predict a person's weight, given their height?
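
One way to measure prediction accuracy is the root-mean-square (r.m.s.) error: the typical size of the gap between predicted and actual values. In this sketch the helper `rms_error`, the prediction rule, and the data are all hypothetical.

```python
import math

def rms_error(xs, ys, predict):
    """Root-mean-square size of the prediction errors y - predict(x)."""
    errors = [y - predict(x) for x, y in zip(xs, ys)]
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))

# Hypothetical heights (inches) and weights (pounds), for illustration only.
heights = [62, 65, 68, 70, 73]
weights = [120, 140, 150, 165, 180]

def predict_weight(height):
    # A made-up straight-line rule, not a fitted course result.
    return 5.4 * height - 214

typical_miss = rms_error(heights, weights, predict_weight)
print(f"predictions are off by about {typical_miss:.1f} pounds")
```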

    Probability

    What does chance mean?

    How do we calculate probabilities of complex events?

    What if we can't do exact calculations?
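
When an exact calculation is out of reach, one option is to estimate a probability by simulation: repeat the chance process many times and count how often the event happens. The sketch below uses an event chosen purely for illustration (at least one six in four rolls of a fair die), which is simple enough to check against the exact answer. The function names are hypothetical.

```python
import random

def estimate_probability(event, trials, seed=0):
    """Estimate the chance of `event` by repeating it `trials` times."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(trials) if event(rng))
    return hits / trials

def at_least_one_six(rng):
    """Roll a fair die four times; report whether any roll is a six."""
    return any(rng.randint(1, 6) == 6 for _ in range(4))

estimate = estimate_probability(at_least_one_six, trials=100_000)
exact = 1 - (5 / 6) ** 4  # chance of not getting four non-sixes in a row
print(f"simulated: {estimate:.4f}, exact: {exact:.4f}")
```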

    Intermission: The outcome effect

    France vs Holland soccer, June 18th

    Most sportsbooks: bet $1, win $2 if France wins

    One sportsbook: bet $1, win $2 if France wins OR draws

    I bet on France

    Holland 4, France 1

    The outcome effect

    After the fact, probability is meaningless

    Single probability statements generally can't be judged on outcomes alone

    Need multiple observations for a test

    III

    Variation: weeks 5 and 6

    The law of averages

    After taking a large number of observations, the observed average is very close to the theoretical average... if the theory is right

    How can we use this knowledge to statistically model events?
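
A small simulation can illustrate the law of averages. This sketch assumes a fair coin (theoretical fraction of heads 0.5) and prints the observed fraction for increasingly long runs. The function name `fraction_of_heads` is hypothetical.

```python
import random

def fraction_of_heads(tosses, seed=0):
    """Toss a simulated fair coin `tosses` times; return the fraction of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(tosses))
    return heads / tosses

# The observed fraction tends to settle near 0.5 as the run gets longer.
for n in (10, 100, 10_000, 100_000):
    print(f"{n:>7} tosses: fraction of heads = {fraction_of_heads(n):.4f}")
```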

    How to gamble

    Don't gamble

    In most cases, you're sure to lose in the long run

    We can analyse games (and life) in terms of expected value
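
As an illustration of expected value, consider a $1 bet on a single number at American roulette (38 pockets, pays 35 to 1). This game is chosen here as an example and is not from the slides. The expected value is each possible net gain weighted by its probability.

```python
from fractions import Fraction

def expected_value(outcomes):
    """outcomes: list of (probability, net winnings) pairs."""
    return sum(p * x for p, x in outcomes)

# $1 on a single number at American roulette.
single_number_bet = [
    (Fraction(1, 38), 35),   # win: gain $35
    (Fraction(37, 38), -1),  # lose: lose the $1 stake
]
ev = expected_value(single_number_bet)
print(f"expected net winnings per $1 bet: {float(ev):.4f}")
```

The result is negative: on average the player loses a little over 5 cents per dollar bet, which is why long-run play loses money.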

    Taking samples and surveys

    How do we avoid bias?

    How do we deal with chance errors?

    How large should our sample size be?

    How accurate are samples?

    How accurate are opinion poll percentages?

    How accurate are experimental averages?

    Confidence intervals: the most confusing things in all statistics
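
For an opinion poll percentage, a rough 95% confidence interval is the sample proportion plus or minus two standard errors. The sketch below assumes a simple random sample and uses made-up poll numbers. The function name `approx_95_interval` is hypothetical.

```python
import math

def approx_95_interval(successes, n):
    """Approximate 95% confidence interval for a population proportion."""
    p = successes / n
    # Standard error of the sample proportion.
    se = math.sqrt(p * (1 - p) / n)
    return p - 2 * se, p + 2 * se

# Hypothetical poll: 520 of 1000 respondents say yes.
low, high = approx_95_interval(520, 1000)
print(f"approximate 95% interval: {low:.3f} to {high:.3f}")
```

The interval describes the procedure, not a single poll: about 95% of intervals built this way from repeated samples would cover the true proportion.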

    IV

    Putting it to the test: weeks 7 and 8

    More about errors

    Types of error

    Models for error

    Checking for cheats

    Is the difference real?

    Testing for a significant difference

    What tests assume

    How to interpret test results
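
As one example of a significance test, the sketch below runs a two-sided z-test on whether a coin is fair. It assumes independent tosses and a sample large enough for the normal approximation. The function name `z_test_proportion` and the toss counts are hypothetical.

```python
import math

def z_test_proportion(successes, n, p0):
    """Two-sided z-test of the null hypothesis that the true proportion is p0."""
    expected = n * p0
    # Standard error of the count under the null hypothesis.
    se = math.sqrt(n * p0 * (1 - p0))
    z = (successes - expected) / se
    # Two-sided p-value from the normal curve.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical data: 60 heads in 100 tosses of a coin claimed to be fair.
z, p_value = z_test_proportion(60, 100, 0.5)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

A small p-value says the observed difference would be unusual if the null hypothesis were true. It does not by itself say how large or important the difference is.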

    Is the difference real: advanced

    Looking too hard

    Bad models, bad tests

    Make your own tests

    Recap

    Statistical literacy

    Understand graphs

    Understand probabilistic statements

    Understand experimental and survey results

    Statistical reasoning

    Draw conclusions from graphs and data summaries

    Make decisions based on probabilities

    Evaluate conclusions others have drawn from statistics

    Statistical thinking

    Design experiments to test hypotheses

    Build and evaluate prediction models

    Understand the relative strength of statistical conclusions

    Next time:

    How statistics helped to vanquish polio