Introduction to Sampling “If you don’t believe in sampling, the next time you have a blood test tell the doctor to take it all.”


Introduction to Sampling

“If you don’t believe in sampling, the next time you have a blood test tell the doctor to take it all.”


To date we’ve learned to display, describe, and summarize data and to examine relationships between variables, but so far we have been limited to examining data we’re given.

Now we need to use our knowledge and skills to answer questions of interest to us – by collecting our own data…


Three Important Ideas

1: Sampling: Examining part of a whole…

2: Randomization: Choosing randomly!!

3: Sample Size: It’s all about the sample size (the population size doesn’t matter).


Idea 1: Sampling

We often want to know something about a population, but examining each individual is impractical or impossible.

To combat this problem, we examine a small but representative group of individuals – called a sample – from the population.


Examples:

1. Think about cooking… if you want to know how your meal will taste….

2. Opinion polls are samples; they’re designed to ask questions of a small group of people to learn something about the opinions of an entire population.


Types of Bias

• Voluntary Response Bias: Bias introduced when participants self-select.

• Under-coverage Bias: Introduced by a sampling method that ignores a portion of a population. Bias results in a sample that is not fully representative.

• Response Bias: Something in a survey’s design that influences responses. Includes question wording, interview techniques, etc. (video?)


Types of Bias (cont’d)

• Non-Response Bias: Bias introduced when a large portion of the target sample fails to respond, and those who do respond are likely not representative of the population of interest.


Bias

• Bias is the bane of sampling: the one thing that must be avoided at all costs!

• There is usually no way to fix a biased sample and no way to salvage useful information from it.

• The easiest way to avoid bias is to select individuals for the sample at random.

• The deliberate introduction of randomness to eliminate bias is one of the great insights of Statistics.


Bad Sampling Methods

Convenience sampling: Just ask whoever is around.

– Example: “Man on the street” survey (cheap, convenient)

– BUT… Which men, and on what street?

– Ask about legalizing marijuana “on the street” in Boston, then in some small town in Idaho, and you would probably get totally different answers. Even within an area, answers may differ. Think about this question when asked outside a church, then again outside a bar.

Bias: Under-coverage; the sample is limited to those present.


Voluntary Response Sampling: Samples of individuals who choose to be involved. These samples are very susceptible to bias because people are motivated (one way or another) to respond. Often called “public opinion polls.” These are not considered valid or scientific.

Bias: Sample design systematically favors a particular outcome (voluntary response bias).

Example: Ann Landers summarized the responses of readers and reported that 70% of (the 10,000) parents who wrote in said that having kids wasn’t worth it. If they had to do it over again, they wouldn’t!!

Bias: Most letters to newspapers are written by disgruntled people. A random sample showed that 91% of parents WOULD have kids again.
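
The gap between a write-in poll and a random sample can be imitated with a small simulation. This is only a sketch: it assumes that 9% of parents are unhappy (echoing the 91% figure above), and the write-in rates are invented purely for illustration.

```python
import random

random.seed(5)  # fixed seed so the sketch is repeatable

# Assumptions for illustration only: 9% of parents regret having kids,
# and unhappy parents are far more likely to write in.
# Both response rates below are invented.
N = 100_000
P_UNHAPPY = 0.09
P_WRITE_IF_UNHAPPY = 0.20
P_WRITE_IF_HAPPY = 0.01

population = [random.random() < P_UNHAPPY for _ in range(N)]  # True = unhappy

# Voluntary response: each parent decides for themselves whether to write in.
letters = [
    unhappy for unhappy in population
    if random.random() < (P_WRITE_IF_UNHAPPY if unhappy else P_WRITE_IF_HAPPY)
]

# Random sample: the pollster, not the parent, decides who is asked.
srs = random.sample(population, 1000)

share_letters = sum(letters) / len(letters)
share_srs = sum(srs) / len(srs)
print(f"unhappy share among letter writers: {share_letters:.0%}")
print(f"unhappy share in random sample:     {share_srs:.0%}")
```

Under these made-up rates the letter writers are mostly unhappy parents, even though unhappy parents are a small minority of the population.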


Online surveys:

Bias (voluntary response): People have to care enough about an issue to bother replying. This sample is probably a combination of people who hate “wasting the taxpayers’ money” and “animal lovers.” Is this representative of everyone??


Sampling Terms

• Population (of interest): The group we’re interested in drawing conclusions about.

• Sampling Frame: A list of individuals from which a sample is drawn.

• Target Sample: The group that you plan (or hope!) to sample.

• Sample: The actual group you end up with when you’re done.


Sampling Terms (cont’d)

• Beads???

• Sampling Variability (“Error”): Samples drawn at random differ from one another. These differences lead to different values for the statistics we measure.

• Strata: Homogeneous portions of a larger population.

• Cluster: A small section of a population that represents the entire population.
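
Sampling variability is easy to see by drawing several random samples from the same population. The sketch below uses an invented population of 10,000 test scores; the numbers and names are assumptions, not data from these slides.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 test scores (invented for illustration).
population = [random.gauss(70, 10) for _ in range(10_000)]
parameter = statistics.mean(population)  # population mean (a parameter)

# Draw five simple random samples of 50 and compute each sample mean (a statistic).
sample_means = [
    statistics.mean(random.sample(population, 50)) for _ in range(5)
]

print(f"population mean: {parameter:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i} mean: {m:.2f}")
```

Each sample gives a slightly different mean, and none is expected to match the population mean exactly. That spread is the sampling variability.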


Sampling Methods

• Simple Random Sample (SRS): Every possible sample of the chosen size has an equal chance of being selected, so each person in the population of interest has an equal chance of being chosen.

• Stratified Sampling: The population is divided into strata (homogeneous groups) before the target sample is selected. SRSs are then selected from each stratum.

• Stratified sampling can reduce sampling variability and highlight important differences between groups.
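
A minimal sketch of these two methods, assuming a hypothetical sampling frame of 400 students with grade level as the stratum. The frame and both function names are made up for illustration.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: (student_id, grade) pairs, 100 students per grade.
frame = [(f"{g}-{i:03d}", g) for g in (9, 10, 11, 12) for i in range(100)]

def simple_random_sample(frame, n):
    """SRS: every possible group of n individuals is equally likely."""
    return random.sample(frame, n)

def stratified_sample(frame, n_per_stratum):
    """Split the frame into strata by grade, then take an SRS within each stratum."""
    strata = {}
    for student in frame:
        strata.setdefault(student[1], []).append(student)
    sample = []
    for grade in sorted(strata):
        sample.extend(random.sample(strata[grade], n_per_stratum))
    return sample

srs = simple_random_sample(frame, 40)
strat = stratified_sample(frame, 10)

print("SRS count per grade:       ", {g: sum(1 for s in srs if s[1] == g) for g in (9, 10, 11, 12)})
print("Stratified count per grade:", {g: sum(1 for s in strat if s[1] == g) for g in (9, 10, 11, 12)})
```

The SRS may over- or under-represent a grade by chance; the stratified sample always contains exactly 10 students from each grade.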


Sampling Methods (cont’d)

• Cluster Sampling: Splitting a population into clusters, each of which represents the entire population. Once divided, several clusters are selected randomly and a census is performed within each cluster.

• Multi-Stage Sampling: Sampling methods that combine several other methods are called multi-stage samples. (web example / handout).
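
Cluster and multi-stage sampling can be sketched as follows, assuming a hypothetical school with 20 homerooms of 25 students each acting as the clusters. The names and sizes are invented for illustration.

```python
import random

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical school: 20 homerooms (clusters) of 25 students each.
homerooms = {
    f"HR{h:02d}": [f"HR{h:02d}-student{i:02d}" for i in range(25)]
    for h in range(20)
}

def cluster_sample(clusters, n_clusters):
    """Pick whole clusters at random, then take everyone in them (a census per cluster)."""
    chosen = random.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

def multistage_sample(clusters, n_clusters, n_per_cluster):
    """Stage 1: pick clusters at random. Stage 2: take an SRS within each chosen cluster."""
    chosen = random.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(random.sample(clusters[name], n_per_cluster))
    return sample

cluster = cluster_sample(homerooms, 4)
multi = multistage_sample(homerooms, 4, 5)

print(len(cluster), "students from 4 whole homerooms")
print(len(multi), "students: 5 drawn at random from each of 4 homerooms")
```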


Sampling Methods (cont’d)

• Systematic Sampling: Individuals are selected systematically from a sampling frame (for example, every 10th name on a list); the starting point must be generated randomly.

• Pilot: A small trial run of a survey to check whether questions are clear, so that sources of bias can be removed and questions corrected.
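
A systematic sample can be sketched in a few lines. The roster of 400 students and the function name are assumptions made for illustration; the sketch also assumes the frame size is a multiple of the sample size.

```python
import random

random.seed(4)  # fixed seed so the sketch is repeatable

def systematic_sample(frame, n):
    """Take every k-th individual from the frame, starting at a random offset."""
    k = len(frame) // n            # sampling interval
    start = random.randrange(k)    # random starting point within the first interval
    return frame[start::k][:n]

# Hypothetical sampling frame: an ordered roster of 400 students.
roster = [f"student{i:03d}" for i in range(400)]
picked = systematic_sample(roster, 20)

print(picked[:3], "...")
```

Only the starting point is random. After that, the selection is fixed by the interval, so any pattern in the ordering of the frame can leak into the sample.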


Sampling Methods (example)

• The Principal is interested in student opinions on the school’s attendance policy…
• Population of Interest?
• Simple Random Sample (SRS)
– Sampling Frame? Target Sample?
– Sample? Problems?
• Systematic Sample
– Sampling Frame? Target Sample?
– Sample? Problems?


Sampling Methods (example)

• The Principal is interested in student opinions of the school’s attendance policy…
• Population of Interest?
• Cluster Sample
– Clusters? Target Sample?
– Sample? Problems?
• Stratified Sample
– Strata? Sampling Frame?
– Target Sample? Sample?
– Problems?


Sampling Methods (example)

• The Principal is interested in student opinions of the school’s attendance policy…
• Population of Interest?
• Multi-Stage Sample
– Target Sample?
– Sample?
– Problems?
• Which is Best Method?? IT DEPENDS!!


Three Important Ideas

1: Sampling: Examining part of a whole…

2: Randomization: Choosing randomly!!

3: Sample Size: It’s all about the sample size (the population size doesn’t matter).


Idea 2: Randomization

Randomization minimizes bias.


Idea 3: It’s all about the Sample Size

• It’s the size of the sample, not the size of the population or the proportion of the population you’ve sampled that matters.


Which Survey is Most Accurate?

1) In the city of Peabody, 1,000 likely voters are randomly selected and asked who they are going to vote for in the Peabody mayoral race.

2) In the state of Massachusetts, 1,000 likely voters are randomly selected and asked who they are going to vote for in the Massachusetts Governor's race.

3) In the United States, 1,000 likely voters are randomly selected and asked who they are going to vote for in the presidential election.

Answer: All three surveys have essentially the same accuracy, because each is a random sample of 1,000.
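
One way to check this is to compute an approximate 95% margin of error for each survey. The sketch below uses the standard formula for a proportion from a simple random sample, with the finite population correction included; the voter counts are rough, hypothetical figures chosen only to show the scale of each population.

```python
import math

def margin_of_error(n, N, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from an SRS of size n
    drawn from a population of size N, with the finite population correction."""
    fpc = math.sqrt((N - n) / (N - 1))
    return z * math.sqrt(p * (1 - p) / n) * fpc

# Rough, hypothetical likely-voter counts, for illustration only.
populations = {
    "Peabody": 30_000,
    "Massachusetts": 4_000_000,
    "United States": 150_000_000,
}

for place, N in populations.items():
    print(f"{place}: +/- {margin_of_error(1000, N):.2%}")
```

With n = 1,000 in every case, all three margins come out at roughly three percentage points. The population size enters only through the correction factor, which is close to 1 whenever the sample is a small fraction of the population.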


Does a Census Make Sense?

• Why bother with sampling… worrying about sample size, bias, etc.?

• Wouldn’t it be better to include everyone? To “sample” an entire population? Well… sometimes! (example) Such a special sample is called a census.


Does a Census Make Sense? (cont.)

• There are problems with taking a census:

– Practicality: It can be difficult to complete a census; individuals can be hard to locate...

– Timeliness: populations don’t stand still. Even if you could take a census, the population changes while you work.

– Expense: taking a census is much more expensive than sampling. U.S. Census??? $14.7 billion

– Accuracy: a census may not be as accurate as a good sample (data entry errors, tedium, etc.).


Population vs. Sample

• Population: The entire group of individuals we are interested in but can’t get to directly.

– Examples: All humans, all working-age people in New England, all crickets, all high school students.

– A parameter is a number describing a characteristic of the population.

• Sample: The part of the population we actually examine (for which we do have data).

– A statistic is a number describing a characteristic of a sample.



Notation: Greek letters (parameters) vs. Latin letters (statistics)


Various claims are made about surveys. Why is each of these incorrect?

• It is always better to take a census than a sample…
– Timeliness, expense, complexity, accuracy

• Stopping students on their way out of the cafeteria food line is a good way to sample if we want to know the quality of the food in the cafeteria.
– Bias; they chose to eat at the cafeteria


• An internet poll taken at the website (www.statsisfun.org) garnered 12,357 responses. The majority of the respondents said they enjoyed doing statistics homework. With a sample size so large, we can be pretty sure that most stats students feel this way too.
– Voluntary response bias; size of sample does not remove the bias.


Lots of New Vocabulary!!!

Population
Sample
Convenience Sample
Voluntary Response Bias
Pilot
Systematic Sample
Multistage Sample
Cluster Sample
Stratified Random Sample
Sampling Variability
Sampling Frame
SRS
Population Parameter
Census
Sample size
Bias
Undercoverage bias
Nonresponse bias
Response bias

