Introduction to Statistics for the Social Sciences SBS200, COMM200, GEOG200, PA200, POL200, or SOC200 Lecture Section 001, Fall 2015 Room 150 Harvill Building 10:00 - 10:50 Mondays, Wednesdays & Fridays. http://courses.eller.arizona.edu/mgmt/delaney/d15s_database_weekone_screenshot.xlsx


Introduction to Statistics for the Social Sciences

SBS200, COMM200, GEOG200, PA200, POL200, or SOC200
Lecture Section 001, Fall 2015

Room 150 Harvill Building
10:00 - 10:50 Mondays, Wednesdays & Fridays.

http://courses.eller.arizona.edu/mgmt/delaney/d15s_database_weekone_screenshot.xlsx


Everyone will want to be enrolled in one of the lab sessions

Labs continue next week


Please re-register your clicker

http://student.turningtechnologies.com/


By the end of lecture today (10/9/15)

Law of Large Numbers

Central Limit Theorem


Before next exam (October 16th)

Please read chapters 1 - 8 in OpenStax textbook

Please read Chapters 10, 11, 12 and 14 in Plous

Chapter 10: The Representativeness Heuristic

Chapter 11: The Availability Heuristic

Chapter 12: Probability and Risk

Chapter 14: The Perception of Randomness

Schedule of readings


On class website: Please print and complete homework worksheet #11 Due Monday October 12th

Dan Gilbert Reading and Law of Large Numbers

Homework


Review of Homework Worksheet

just in case of questions


Homework review

Based on a priori probability – all options equally likely – not based on previous experience or data

Based on expert opinion - don’t have previous data for these two companies merging together

2/5 = .40

Based on frequency data (Percent of rockets that successfully launched)


Homework review

Based on a priori probability – all options equally likely – not based on previous experience or data

Based on frequency data (Percent of pages that are “fake”)

30/100 = .30

Based on frequency data (Percent of times at bat that successfully resulted in hits)


Homework review

5/50 = .10

Based on frequency data (Percent of students who successfully chose to be Economics majors)
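
The frequency-based answers on the worksheet are simple relative frequencies: the number of times the outcome happened divided by the number of opportunities. A minimal sketch of that arithmetic follows; the function name `relative_frequency` is made up for illustration, and the fractions are read from the slides as 2/5, 30/100 and 5/50.

```python
def relative_frequency(successes, trials):
    """Empirical probability: share of trials in which the outcome occurred."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    return successes / trials


# Fractions as read from the homework review slides
print(relative_frequency(2, 5))     # 0.4
print(relative_frequency(30, 100))  # 0.3
print(relative_frequency(5, 50))    # 0.1
```

An a priori probability uses the same division, but the counts come from reasoning about equally likely options rather than from collected data.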


Mean = 50, standard deviation = 4

Area between 44 and 55:
z = (44 - 50) / 4 = -1.5
z = (55 - 50) / 4 = +1.25
z of 1.5 = area of .4332
z of 1.25 = area of .3944
.4332 + .3944 = .8276

Area above 55:
z = (55 - 50) / 4 = +1.25
z of 1.25 = area of .3944
.5000 - .3944 = .1056

Area between 52 and 55:
z = (52 - 50) / 4 = +.5
z = (55 - 50) / 4 = +1.25
z of .5 = area of .1915
z of 1.25 = area of .3944
.3944 - .1915 = .2029
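
The three answers for the distribution with mean 50 and standard deviation 4 can be checked without a z table. The sketch below is one way to do it, assuming the scores are normally distributed; the helper names `phi`, `area_between` and `area_above` are invented for this example. The standard normal cumulative area is computed from `math.erf`.

```python
import math


def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_between(low, high, mean, sd):
    """Area under a normal curve between two raw scores."""
    return phi((high - mean) / sd) - phi((low - mean) / sd)


def area_above(x, mean, sd):
    """Area under a normal curve above a raw score."""
    return 1.0 - phi((x - mean) / sd)


MEAN, SD = 50, 4  # values read off the z-score formulas on the slide

print(round(area_between(44, 55, MEAN, SD), 3))  # table answer: .8276
print(round(area_above(55, MEAN, SD), 3))        # table answer: .1056
print(round(area_between(52, 55, MEAN, SD), 3))  # table answer: .2029
```

Small differences in the fourth decimal place are expected, because the table areas are each rounded before they are added or subtracted.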


Homework review

Mean = 2708, standard deviation = 650

Area above 3,000:
z = (3000 - 2708) / 650 = 0.45
z of 0.45 = area of .1736
.5000 - .1736 = .3264

Area between 3,000 and 3,500:
z = (3000 - 2708) / 650 = 0.45
z = (3500 - 2708) / 650 = 1.22
z of 0.45 = area of .1736
z of 1.22 = area of .3888
.3888 - .1736 = .2152

Area between 2,500 and 3,500:
z = (2500 - 2708) / 650 = -0.32
z = (3500 - 2708) / 650 = 1.22
z of -0.32 = area of .1255
z of 1.22 = area of .3888
.3888 + .1255 = .5143
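
The worked answers for mean 2708 and standard deviation 650 follow the z-table routine: round z to two decimals, look up the area between the mean and z, then add or subtract. The sketch below imitates that routine; `table_area` is a hypothetical stand-in for the printed table, computed from `math.erf` and rounded to four decimals the way a table entry would be.

```python
import math


def z_score(x, mean, sd):
    """Standardize a raw score, rounded to two decimals as for a table lookup."""
    return round((x - mean) / sd, 2)


def table_area(z):
    """Area between the mean and z, rounded to four decimals like a z table."""
    return round(0.5 * math.erf(abs(z) / math.sqrt(2.0)), 4)


MEAN, SD = 2708, 650  # values read off the z-score formulas on the slide

a_3000 = table_area(z_score(3000, MEAN, SD))  # z = 0.45
a_3500 = table_area(z_score(3500, MEAN, SD))  # z = 1.22
a_2500 = table_area(z_score(2500, MEAN, SD))  # z = -0.32

above_3000 = round(0.5 - a_3000, 4)        # one tail: half the curve minus the area
between_3000_3500 = round(a_3500 - a_3000, 4)  # same side of the mean: subtract
between_2500_3500 = round(a_3500 + a_2500, 4)  # opposite sides of the mean: add

print(above_3000, between_3000_3500, between_2500_3500)
```

The rule of thumb the code encodes: subtract the two areas when both scores are on the same side of the mean, add them when the scores straddle the mean.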


Homework review

Mean = 15, standard deviation = 3.5

Area above 20:
z = (20 - 15) / 3.5 = 1.43
z of 1.43 = area of .4236
.5000 - .4236 = .0764

Area between 10 and 12:
z = (10 - 15) / 3.5 = -1.43
z = (12 - 15) / 3.5 = -0.86
z of -1.43 = area of .4236
z of -.86 = area of .3051
.4236 - .3051 = .1185

Area below 20:
z = (20 - 15) / 3.5 = 1.43
z of 1.43 = area of .4236
.5000 + .4236 = .9236
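
For comparison, the problems with mean 15 and standard deviation 3.5 can also be computed with Python's built-in `statistics.NormalDist`, which works on raw scores directly and does not round z. This is only a cross-check under the assumption that the scores are normal; the variable names are illustrative.

```python
from statistics import NormalDist

# Mean and SD read off the z-score formulas on the slide
scores = NormalDist(mu=15, sigma=3.5)

above_20 = 1 - scores.cdf(20)                        # table answer: .0764
between_10_and_12 = scores.cdf(12) - scores.cdf(10)  # table answer: .1185
below_20 = scores.cdf(20)                            # table answer: .9236

print(round(above_20, 3), round(between_10_and_12, 3), round(below_20, 3))
```

These values differ from the table answers by less than .002, because the table method rounds each z to two decimals before the lookup.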


Comments on Dan Gilbert Reading


Law of large numbers: As the number of measurements increases, the data become more stable and a better approximation of the true (theoretical) probability

As the number of observations (n) increases or the number of times the experiment is performed, the estimate will become more accurate.
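
A quick simulation makes the law of large numbers concrete. The sketch below flips a simulated fair coin and records the proportion of heads at a few checkpoints; the function name, the checkpoints and the seed are arbitrary choices for illustration, not part of the lecture.

```python
import random


def running_proportion(n_flips, p_heads=0.5, seed=0):
    """Proportion of heads observed after 10, 100, 1000, ... simulated flips."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    checkpoints = (10, 100, 1000, 10000, 100000)
    heads = 0
    proportions = {}
    for i in range(1, n_flips + 1):
        if rng.random() < p_heads:
            heads += 1
        if i in checkpoints:
            proportions[i] = heads / i
    return proportions


for n, prop in running_proportion(100000).items():
    print(f"n = {n:>6}: proportion of heads = {prop:.4f}")
```

With only 10 flips the proportion can sit far from .50; by 100,000 flips it should be very close to the theoretical probability.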


Law of large numbers: As the number of measurements increases, the data become more stable and a better approximation of the true signal (e.g. mean)

As the number of observations (n) increases or the number of times the experiment is performed, the signal will become more clear (static cancels out).

http://www.youtube.com/watch?v=ne6tB2KiZuk

With only a few people any little error is noticed (becomes exaggerated when we look at whole group)

With many people any little error is corrected (becomes minimized when we look at whole group)
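
The "static cancels out" idea can be sketched by giving every person's measurement some random error and comparing how far the group average lands from the true value for small and large groups. Everything specific here (true value 100, error SD 10, group sizes 4 and 400) is an assumption chosen for illustration.

```python
import random


def typical_error(group_size, true_value=100.0, noise_sd=10.0, trials=1000, seed=1):
    """Average distance between a group's mean and the true value, over many groups."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Each person's score is the true value plus random error ("static")
        group = [rng.gauss(true_value, noise_sd) for _ in range(group_size)]
        total += abs(sum(group) / group_size - true_value)
    return total / trials


print("group of 4:  ", round(typical_error(4), 2))
print("group of 400:", round(typical_error(400), 2))
```

The small groups miss the true value by several points on average, while the large groups miss by a fraction of a point: individual errors in opposite directions offset each other.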


Sampling distributions of sample means versus frequency distributions of individual scores

[Figure: frequency distribution of individual raw scores; individual scores labeled Melvin, Eugene and Preston]

Distribution of raw scores: an empirical probability distribution of the values from a sample of raw scores from a population

Frequency distributions of individual scores
• derived empirically
• we are plotting raw data
• this is a single sample

[Figure: Population; take a single score; repeat over and over]


Sampling distribution: a theoretical probability distribution of the possible values of some sample statistic that would occur if we were to draw an infinite number of same-sized samples from a population

Sampling distributions of sample means
• theoretical distribution
• we are plotting means of samples

[Figure: Population; take a sample and get its mean (mean for 1st sample); repeat over and over]

Important note: “fixed n”
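
The "take a sample, get the mean, repeat over and over" procedure can be approximated by simulation. A true sampling distribution assumes infinitely many samples, so the sketch below only draws 2,000; the skewed population, the sample size of 25 and the function name `sample_means` are all assumptions made for the example.

```python
import random


def sample_means(population, n, n_samples, seed=0):
    """Draw n_samples samples of fixed size n and return the mean of each."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = rng.choices(population, k=n)  # random sample, with replacement
        means.append(sum(sample) / n)
    return means


# Hypothetical skewed population of 10,000 raw scores
pop_rng = random.Random(42)
population = [pop_rng.expovariate(1 / 50) for _ in range(10000)]

means = sample_means(population, n=25, n_samples=2000)

pop_mean = sum(population) / len(population)
mean_of_means = sum(means) / len(means)
print(f"population mean     = {pop_mean:.2f}")
print(f"mean of sample means = {mean_of_means:.2f}")
print(f"raw scores range from {min(population):.1f} to {max(population):.1f}")
print(f"sample means range from {min(means):.1f} to {max(means):.1f}")
```

Note the fixed n: every sample has the same size. The sample means cluster much more tightly than the raw scores do, while centering on the same value.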


[Figure: Population; take a sample and get its mean; repeat over and over with “fixed n”; the result is the distribution of means of samples]


Sampling distributions of sample means
• theoretical distribution
• we are plotting means of samples

Frequency distributions of individual scores
• derived empirically
• we are plotting raw data
• this is a single sample

[Figure: distribution of raw scores (points labeled Melvin, Eugene) next to a sampling distribution of sample means (points labeled 2nd sample, 23rd sample)]


Central Limit Theorem: If random samples of a fixed N are drawn from any population (regardless of the shape of the population distribution), as N becomes larger, the distribution of sample means approaches normality, with the overall mean approaching the theoretical population mean.

Sampling distribution for continuous distributions

[Figure: Distribution of Raw Scores (points labeled Melvin, Eugene) and Sampling Distribution of Sample Means (points labeled 2nd sample, 23rd sample)]
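
One way to watch the Central Limit Theorem at work is to start from a clearly non-normal population and measure how lopsided the distribution of sample means is for different sample sizes. The sketch below uses an exponential population (strongly right-skewed) and a hand-written skewness measure; a normal curve has skewness 0. The population choice, sample sizes and function names are assumptions for illustration only.

```python
import random


def skewness(values):
    """Moment-based skewness: 0 for a symmetric distribution such as the normal."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def skew_of_sample_means(n, n_samples=4000, seed=0):
    """Skewness of the means of n_samples samples, each of fixed size n."""
    rng = random.Random(seed)
    means = [
        sum(rng.expovariate(1.0) for _ in range(n)) / n  # skewed population
        for _ in range(n_samples)
    ]
    return skewness(means)


for n in (2, 10, 100):
    print(f"n = {n:>3}: skewness of sample means = {skew_of_sample_means(n):.2f}")
```

The skewness should shrink toward 0 as n grows, which is the "approaches normality" part of the theorem, even though the raw scores themselves stay skewed.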


An example of a sampling distribution of sample means

Distribution of raw scores: µ = 100, σ = 3 (Mean = 100, Standard Deviation = 3)

Sampling distribution of sample means: Mean = 100, Standard Error of the Mean = 1

Notice: SEM is smaller than SD, especially as n increases

[Figure: distribution of raw scores (points labeled Melvin, Eugene) and sampling distribution of sample means (points labeled 2nd sample, 23rd sample), both centered at 100]
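
The numbers in this example fit the usual formula for the standard error of the mean, SEM = σ / √n. The slide does not state the sample size; n = 9 is an inference, since it is the value that turns σ = 3 into SEM = 1. The function name below is also made up for the sketch.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: population SD divided by the square root of n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma / math.sqrt(n)


SIGMA = 3  # population standard deviation from the slide

print(standard_error(SIGMA, 9))    # 1.0  (n = 9 is inferred, not stated on the slide)
print(standard_error(SIGMA, 36))   # 0.5
print(standard_error(SIGMA, 100))  # 0.3
```

Quadrupling the sample size halves the SEM, which is why the SEM falls further below the SD as n increases.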


Central Limit Theorem

Proposition 1: If sample size (n) is large enough (e.g. 100), the mean of the sampling distribution will approach the mean of the population.
As n ↑, x̄ will approach µ

Proposition 2: If sample size (n) is large enough (e.g. 100), the sampling distribution of means will be approximately normal, regardless of the shape of the population.
As n ↑, curve will approach normal shape

Proposition 3: The standard deviation of the sampling distribution equals the standard deviation of the population divided by the square root of the sample size. As n increases, SEM decreases.
As n ↑, curve variability gets smaller
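
Proposition 3 can be checked empirically: simulate many samples of size n, take the standard deviation of their means, and compare it with σ / √n. The sketch below assumes a normal population with µ = 100 and σ = 3 (the values used in the earlier example); the number of samples, the sample sizes and the function name are arbitrary choices.

```python
import math
import random
import statistics


def sd_of_sample_means(mu, sigma, n, n_samples=5000, seed=0):
    """Standard deviation of the means of n_samples simulated samples of size n."""
    rng = random.Random(seed)
    means = [
        sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        for _ in range(n_samples)
    ]
    return statistics.pstdev(means)


MU, SIGMA = 100, 3
for n in (4, 9, 36):
    simulated = sd_of_sample_means(MU, SIGMA, n)
    predicted = SIGMA / math.sqrt(n)
    print(f"n = {n:>2}: simulated SEM = {simulated:.3f}, sigma / sqrt(n) = {predicted:.3f}")
```

The simulated and predicted values should agree closely, and both shrink as n grows, matching "as n ↑ curve variability gets smaller".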
