Sampling

Sampling. Can’t talk to everybody. Select some members of the population of interest. If the sample is “representative,” can generalize findings.


Page 1

Sampling

Page 2

Sampling

Can’t talk to everybody

Select some members of the population of interest

If the sample is “representative,” can generalize findings

Page 3

Some Terms

Population

Sample

Population Parameter

Estimator

Sample Statistic

Stratum

Y

Page 4

More Terms

Sampling Frame

Sampling Unit

Sample Bias

Page 5

When Good Surveys Go Bad

• Literary Digest 1936 Poll
– Draw on approx. 4 million respondents
– Predict Landon victory

• What went wrong?
– Bad sample
– Auto registrations
– Phones
– Readers of the Literary Digest

Page 6

Sampling

• Bigger is generally preferable to smaller

• Quality trumps quantity

• Margin of Error (sample size ± percentage points)
– 4000 ± 2
– 1500 ± 3
– 1000 ± 4
– 600 ± 5
– 400 ± 6
– 200 ± 8
– 100 ± 11

Page 7

Sampling/Margin of Error

Poll shows 53-47 prefer Bush

Sample of 1000

± 4

Could be 49-51

Could be 57-43
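The ± 4 figure and the 49-51 / 57-43 range can be checked with a little arithmetic. The sketch below assumes the textbook 95% formula for a simple random sample (z = 1.96, worst case p = 0.5); the function names are illustrative and not from the slides. The plain formula gives about ± 3.1 points for 1000 respondents, so the slide's table looks like a more conservative, rounded-up version (that reading is an assumption).

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error, in percentage points, for a simple random sample of size n."""
    return 100 * z * math.sqrt(p * (1 - p) / n)

def plausible_range(share, moe):
    """Low and high ends for a reported share (in percent), given a margin of error."""
    return share - moe, share + moe

# Sample sizes taken from the margin-of-error table in the slides.
for n in (4000, 1500, 1000, 600, 400, 200, 100):
    print(n, round(margin_of_error(n), 1))

# The 53-47 poll with the slide's margin of 4 points:
# the leader's share could be anywhere from 49 to 57.
print(plausible_range(53, 4))
```

Because the low end of the range (49) is below 50, the poll alone cannot say who is ahead.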

Page 8

An Example

• 2000 NES: 1798 people answered age question

• Actual Ages
– Average 47.2
– Range 18-97

• 11 samples of 10%
– Estimates: 45.4-49.6
– Std. deviation of estimates: 1.2
– Mean of estimates: 47.1

[Figure: histogram of the 11 sample estimates; x-axis var00001 (46.00 to 49.00), y-axis Count (1 to 5)]
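The repeated-sampling exercise on this slide can be imitated in a few lines. The NES ages are not reproduced here, so this sketch uses a made-up population of 1798 ages drawn uniformly from 18 to 97. That is an assumption, and the printed numbers will not match the slide. What carries over is the pattern: individual 10% samples scatter around the true average, and the mean of the estimates sits close to it.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical stand-in for the 1798 NES respondents (not the real data).
population = [random.randint(18, 97) for _ in range(1798)]
true_mean = statistics.mean(population)

def sample_means(pop, fraction, repeats):
    """Draw `repeats` simple random samples of the given fraction; return each sample's mean."""
    n = round(len(pop) * fraction)
    return [statistics.mean(random.sample(pop, n)) for _ in range(repeats)]

estimates = sample_means(population, 0.10, 11)  # 11 samples of 10%, as on the slide

print("population mean:", round(true_mean, 1))
print("range of estimates:", round(min(estimates), 1), "-", round(max(estimates), 1))
print("mean of estimates:", round(statistics.mean(estimates), 1))
print("std. deviation of estimates:", round(statistics.stdev(estimates), 1))
```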

Page 9

Example Continued

Actual Average: 47.2

1% Sample: 39.3

5% Sample: 49.1

25% Sample: 47.9

50% Sample: 48

75% Sample: 46.8

99% Sample: 47.25
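One way to read these figures: the typical error of a sample mean shrinks as the sample grows, and shrinks further as the sample approaches the whole population. The sketch below uses the standard error formula with the finite population correction. The slides do not give the standard deviation of the ages, so the result is expressed as a multiple of it. The function name is illustrative.

```python
import math

def se_multiplier(N, n):
    """Standard error of a sample mean, as a multiple of the population standard deviation,
    for a simple random sample of n drawn without replacement from N."""
    fpc = math.sqrt((N - n) / (N - 1))  # finite population correction
    return fpc / math.sqrt(n)

N = 1798  # respondents who answered the age question
for pct in (1, 5, 25, 50, 75, 99):
    n = round(N * pct / 100)
    print(f"{pct}% sample (n={n}): SE = {se_multiplier(N, n):.3f} x population SD")
```

Any single sample can still land further off than a smaller one, which is why the slide's 25% and 50% results are not strictly ordered.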

Page 12

Types of Samples: Simple Random

Simple Random

RDD (random-digit dialing)

Requires numbered list of population members

Pick random elements until you meet sample size
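A simple random sample from a numbered list can be sketched with Python's standard `random` module. The list of 1000 members, the sample size of 200 and the function name are all hypothetical.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Pick sample_size distinct members, each with the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, sample_size)

# Hypothetical numbered list of 1000 population members.
frame = list(range(1, 1001))
chosen = simple_random_sample(frame, 200, seed=42)
print(len(chosen), sorted(chosen)[:5])
```

`random.sample` draws without replacement, so no member is picked twice.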

Page 13

Systematic Sampling

• List population

• Determine sampling interval
– E.g., if you have 1000 and want 200 cases, take every 5th case (1000/200 = 5)

• Start on a random list number (for example, 1-5)

• Include every 5th case thereafter

• Problem: POPULATION MUST NOT BE RANKED BY A CHARACTERISTIC
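The steps above translate almost directly into code. This is a minimal sketch using the slide's numbers (1000 cases, 200 wanted, interval 5). The function name and the numbered list are hypothetical.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Take every k-th case after a random start, where k = population size // sample size."""
    interval = len(population) // sample_size
    start = random.Random(seed).randrange(interval)  # random start within the first interval
    return population[start::interval][:sample_size]

frame = list(range(1, 1001))  # hypothetical list of 1000 cases
sample = systematic_sample(frame, 200, seed=1)
print(len(sample), sample[:4])
```

If the list is ordered by some characteristic, every sample inherits that ordering, which is the problem the slide warns about.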

Page 14

Stratified Sample

• Probability sample

• Group elements by some trait

• Select number from each group to reflect its distribution in the population

• Example
– City is 70% White, 20% Latino, 10% African American; want sample of 1000
– Randomly select 700 Whites, 200 Latinos, 100 African Americans

• Oversampling: some traits may not be common, so select extra members of that population
– E.g., African Americans make up about 10% of population; might collect extra African Americans for more detailed analysis
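The 700/200/100 split in the example is proportional allocation, sketched below. The city of 10,000 residents, the member labels and the function name are made up for illustration.

```python
import random

def stratified_sample(strata, total, seed=None):
    """Draw from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    pop_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        quota = round(total * len(members) / pop_size)  # proportional allocation
        sample[name] = rng.sample(members, quota)
    return sample

# Hypothetical city of 10,000 residents split 70/20/10, as in the example.
city = {
    "White": [f"W{i}" for i in range(7000)],
    "Latino": [f"L{i}" for i in range(2000)],
    "African American": [f"A{i}" for i in range(1000)],
}
picked = stratified_sample(city, 1000, seed=7)
print({group: len(members) for group, members in picked.items()})
# prints {'White': 700, 'Latino': 200, 'African American': 100}
```

Oversampling would mean raising one stratum's quota above its proportional share. That group then has to be weighted back down when estimating figures for the whole population.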

Page 15

Cluster Samples

Initial frame is clusters of units

Take sample of initial units (e.g. telephone exchanges, zip codes, city blocks, etc.)

Get details of the makeup of selected units

Take random sample within units

Still random, just done in steps

Works best in fairly homogeneous populations
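The "random, just done in steps" idea can be sketched as a two-stage draw. The frame of 50 city blocks with 40 households each, and all names, are hypothetical.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: sample clusters at random. Stage 2: sample units at random within each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        c: rng.sample(clusters[c], min(n_per_cluster, len(clusters[c])))
        for c in chosen
    }

# Hypothetical frame: 50 city blocks, each with 40 households.
blocks = {f"block-{b}": [f"block-{b}/house-{h}" for h in range(40)] for b in range(50)}
sample = two_stage_cluster_sample(blocks, n_clusters=10, n_per_cluster=5, seed=3)
print(len(sample), sum(len(units) for units in sample.values()))
# prints 10 50
```

Only the 10 selected blocks need a full household listing, which is the practical appeal of the design.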

Page 16

Non-Probability Samples

Cases where there is no good way to specify the population

Too expensive for probability sampling

Preference for studying certain cases

Page 17

Types of Non-Probability Samples

Purposive

Convenience Sample

Quota Sample

Snowball Sample

Page 18

For Next Time

Content Analysis