Lecture 0: Introduction

Statistics 101

Mine Cetinkaya-Rundel

January 12, 2012

Syllabus & policies

General Info

Classroom: Social Sciences 139

Time: Tuesdays and Thursdays 1:15pm - 2:30pm

Professor: Mine Cetinkaya-Rundel

Office: Old Chemistry 213

Email: [email protected]

Teaching Assistants:

Yingbo Li - [email protected]

Brittany Cohen - [email protected]

Anthony Weishampel - [email protected]

Statistics 101 (Mine Cetinkaya-Rundel) Lecture 0: Introduction January 12, 2012 1 / 40

Course goals & objectives

Introduce you to the discipline of statistics as a science of understanding and analyzing data

Provide you with the tools for solving real world problems using statistics and a better understanding of the process of scientific research and statistical inference

By the end of the class you should be able to

interpret statistical results in context and critique news stories and journal articles that include statistical information,

be comfortable with concepts such as association and causation, random sampling and random assignment, statistical bias and statistical significance, and

understand and appreciate why real data beats anecdotes.

Major topics discussed

Exploratory data analysis: description, summary and visualization

Principles of experimental design and causal inference

Observational studies and non-causal inference

The basics of probability and chance processes

The normal distribution

Central Limit Theorem and sampling distributions

Statistical inference through theory and randomization

Bivariate correlation and causality

Simple and multiple linear regression and ANOVA

Bayesian perspective in statistical inference

Required materials

Textbook: OpenIntro Statistics

Diez, Barr, Cetinkaya-Rundel

CreateSpace, 1st Edition, 2011

Clicker: i>clicker 1

Available at the Duke textbook store or on Amazon

Calculator: Four function calculator that can do square roots, no limitation on the type of calculator.

We will not be providing calculators and you will not be allowed to borrow one from another student during an exam.

Webpage

http://stat.duke.edu/courses/Spring12/sta101.1

All announcements and assignments will be posted on this website under the Schedules tab.

Office Hours

Professor: Mondays and Wednesdays 2 pm - 4 pm

+ after class and by appointment

additional OH announced weeks of projects and exams

TAs: Sunday - Thursday 4pm - 9pm starting next week at the SECC (Old Chemistry 211A)

You are highly encouraged to stop by with any questions or comments about the class, or just to say hi and introduce yourself.

Most homework assignments due on Thursday. Recommend attempting all homework problems by Wednesday to take advantage of Wednesday OH.

Grading

Clicker questions 7.5%

Homework & Labs 15%

Online quizzes 5%

Project 1 7.5%

Project 2 10%

Midterms (2 × 15%) 30%

Final Exam 25%

Grades curved at the end of the course after overall averages have been calculated.

Average of 90-100 guarantees A-.

Average of 80-90 guarantees B-.

Average of 70-80 guarantees C-.

The more evidence there is that the class has mastered the material, the more generous the curve will be.
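The grading weights above amount to a weighted average, which can be sketched in Python as follows. This is an illustration only: the dictionary keys, function names, and example scores are hypothetical, the treatment of boundary values (an average of exactly 90 counting toward the higher band) is an assumption, and the final curve is not modeled.

```python
# Component weights from the grading slide, in percent of the overall average.
WEIGHTS = {
    "clicker_questions": 7.5,
    "homework_and_labs": 15.0,
    "online_quizzes": 5.0,
    "project_1": 7.5,
    "project_2": 10.0,
    "midterm_1": 15.0,
    "midterm_2": 15.0,
    "final_exam": 25.0,
}


def overall_average(scores):
    """Weighted average of component scores, each given on a 0-100 scale."""
    missing = sorted(set(WEIGHTS) - set(scores))
    if missing:
        raise ValueError(f"missing component scores: {missing}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100.0


def guaranteed_minimum(average):
    """Minimum letter grade guaranteed before the curve, or None below 70.

    Assumption: a boundary value such as exactly 90 counts toward the higher band.
    """
    if average >= 90:
        return "A-"
    if average >= 80:
        return "B-"
    if average >= 70:
        return "C-"
    return None


# Hypothetical student, for illustration only.
example_scores = {
    "clicker_questions": 100,
    "homework_and_labs": 90,
    "online_quizzes": 80,
    "project_1": 85,
    "project_2": 90,
    "midterm_1": 75,
    "midterm_2": 85,
    "final_exam": 80,
}
example_average = overall_average(example_scores)
print(example_average, guaranteed_minimum(example_average))
```

Because the curve can only help, the letter returned here is a floor on the final grade, not a prediction of it.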

Lectures

Lecture slides will be posted on the course webpage (under schedule) by 9am the day of the course.

In order to be able to keep up with the pace of the course and not fall behind you must attend the lectures.

Introduction of concepts as well as hands on activities and exercises to complement them.

Clicker registration

http://iclicker.com/support/registeryourclicker

Clicker grading

Two types of questions:

Review questions:

One question per lecture on material covered in previous classes.

Get credit for answering correctly.

Generally in the first 5 minutes of class.

Objective: motivate you to keep up with the material.

New questions:

On new material introduced in class that day.

Credit for clicking in, regardless of whether you have the correct answer.

Objective: make you an active participant and help me pace the class.

Up to two unexcused late arrivals or absences will not affect your clicker grade.

If one person is simultaneously using two or more clickers, then all owners of the clickers will receive a 0 for an overall clicker grade.

Homework

Questions from the book and “on your own” part of the lab.

Due at the beginning of class on the due date.

Show all your work to receive credit.

Welcomed and encouraged to work with others, but turn in your own work.

Lowest homework score will be dropped.

No make-ups.

Online quizzes

All quizzes at http://www.openintro.org

Designed to help you find any problem areas, and to help me judge how to pace the course.

1 hour to complete each quiz (should take no more than half an hour), must take the quizzes by yourself.

Available Thursdays at 5 pm to Mondays at 8:30 am; each quiz will cover the previous and the coming week’s material.

No make-ups.

Lowest quiz score will be dropped.

Online quizzes (cont.)

Course ID: STAT101S12

Access code: 6VQ3T

Labs

http://beta.rstudio.org

Email me your Gmail address, if you haven’t yet done so, to create an RStudio account.

Projects

Project 1: individual, 5 page write-up

Project 2: in teams, presentation + 10 page write-up

Two projects will be selected to be entered to a national OpenIntro project competition. More details to follow...

Exams

Midterm 1: Thursday, February 23

Midterm 2: Tuesday, March 27

Final: Thursday, May 3, 9:00am - 12:00pm (Cumulative)

Exam dates cannot be changed. No make-up exams will be given. If you cannot take the exams on these dates you should drop this class.

You must bring a calculator to the exams (no cell phones, iPods, etc.) and you are also allowed to bring one sheet of notes (“cheat sheet”). This sheet must be no larger than 8 1/2” × 11” and must be prepared by you (no photocopies). You may use both sides of the sheet.

Email

I will regularly send announcements by email, so make sure to check your email daily.

While email is the quickest way to reach me outside of class, note that it is much more efficient to answer most statistical questions in person.

Work load

You are expected to put in 4-6 hours of work outside of class. A few of you will do well with less time than this, and a few of you will need more.

Other learning resources

Aside from the TAs and the professor’s office hours, you can also make use of the Academic Skill Center. For more information, see http://web.duke.edu/arc.

Policies

Late work policy:

late but during class: lose 10% of points

after class on due date: lose 20% of points

next day: lose 40% of points

later than next day: lose all points

There will not be make-ups for any of the online quizzes, homework, labs, or exams.

All regrade requests on homework assignments, labs, and exams must be discussed with the professor within one week of receiving your grade. There will be no grade changes after the final exam.

Academic integrity & Duke Community Standard

Excused absences
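The late work penalties on this slide can be sketched as a small lookup in Python. The category names and the function are hypothetical, and the sketch assumes the percentage is deducted from the points the work earned; the slide does not say whether the deduction is taken from earned or possible points.

```python
from enum import Enum


class Submission(Enum):
    """Hypothetical submission categories; values are the percent of points lost."""
    ON_TIME = 0
    LATE_DURING_CLASS = 10
    AFTER_CLASS_ON_DUE_DATE = 20
    NEXT_DAY = 40
    LATER_THAN_NEXT_DAY = 100


def points_after_penalty(earned_points, submission):
    """Apply the late penalty to the points earned (assumption: earned, not possible)."""
    if earned_points < 0:
        raise ValueError("earned_points must be non-negative")
    return earned_points * (100 - submission.value) / 100


# Example: 50 earned points on a homework turned in the next day.
next_day_points = points_after_penalty(50, Submission.NEXT_DAY)
print(next_day_points)
```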

Tips for success

1 Read (or skim) the relevant sections before a new week begins, and then review again after the lectures.

2 Be an active participant during lectures and labs.

3 Ask questions - during class or office hours, or by email. Ask me, the TAs, and/or your classmates.

4 Do the homework - start early and make sure you attempt and understand all questions.

5 Start your projects early and allow adequate time for working on them.

6 Give yourself plenty of time to prepare a good cheat sheet for exams. This requires going through the material and taking the time to review the concepts that you’re not comfortable with.

7 Do not procrastinate - don’t let a week go by with unanswered questions as it will just make the following week’s material even more difficult to follow.

Examples Arbuthnot

Birth counts and history

What is going on?

Data error?

Examples Baby names

Baby names in the US

Each year the Social Security Administration collects and releases data on how many babies are given a certain name.

They released these data for years 1880 to 2010 for each gender.

For privacy reasons they restrict the list of names to those with at least 5 occurrences.

2010

http://www.ssa.gov/oact/babynames

Jac...

http://www.babynamewizard.com

2010's least frequent

Zulie,F,5

Zuriya,F,5

Zuriyah,F,5

Zyda,F,5

Zyera,F,5

Zyia,F,5

Zyiana,F,5

Zyira,F,5

Zylia,F,5

Zylynn,F,5

Zyniya,F,5

Zyonnah,F,5

Zyriana,F,5

Zyrihanna,F,5

Ziven,M,5

Zmari,M,5

Zoren,M,5

Zuhaib,M,5

Zyeire,M,5

Zygmunt,M,5

Zykerion,M,5

Zylar,M,5

Zylin,M,5

Zymaire,M,5

Zyonne,M,5

Zyquarius,M,5

Zyran,M,5

Zzyzx,M,5
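Each record above has the form name,sex,count. A minimal Python sketch of reading such records and checking the 5-occurrence floor is shown below. The function names are hypothetical, and the sample holds only a few rows copied from the list above, not the full data set.

```python
import csv
import io

# A few rows copied from the slide; the real 2010 data set is much longer.
SAMPLE = """Zulie,F,5
Zyrihanna,F,5
Ziven,M,5
Zygmunt,M,5
Zzyzx,M,5
"""


def parse_records(text):
    """Parse lines of the form name,sex,count into (name, sex, count) tuples."""
    reader = csv.reader(io.StringIO(text))
    return [(row[0], row[1], int(row[2])) for row in reader if row]


def babies_by_sex(records):
    """Total number of babies per sex across the given records."""
    totals = {}
    for _name, sex, count in records:
        totals[sex] = totals.get(sex, 0) + count
    return totals


records = parse_records(SAMPLE)
totals = babies_by_sex(records)

# Every released name should have at least 5 occurrences (the privacy floor).
meets_privacy_floor = all(count >= 5 for _name, _sex, count in records)
print(totals, meets_privacy_floor)
```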

Examples Geotagged data

Clicker question

Do you geotag your posts on social networking sites, like Facebook, Twitter, GooglePlus, etc.?

(a) yes

(b) no

Map based on Flickr tags

http://aaronstraupcope.com

Map based on Flickr tags

Red: Tourists

Blue: Locals

Yellow: Either

http://www.flickr.com/photos/walkingsf/4671594023/in/set-72157624209158632/

Examples Modeling and uncertainty

Primary Predictions

http://fivethirtyeight.blogs.nytimes.com/

Why study statistics

[...], the study also warns there is a significant shortage of qualified workers to analyze these data sets adequately. According to the report, a shortfall of about 140,000 to 190,000 individuals with analytical expertise is projected by 2018. The study also predicts a need for an additional 1.5 million managers and analysts by that same date to fully engage the true potential of the currently available data.

http://jobs.aol.com/articles/2011/08/10/data-scientist-the-hottest-job-you-havent-heard-of

http://www.dailymail.co.uk/news/article-2023514

Data collection

Work in groups for a few minutes to come up with data you would like to find out about Sta 101 students. Think of interesting questions since we will analyze these data throughout the semester.

I will email you with instructions to fill out an anonymous online survey over the weekend.
