INTRODUCTION TO SURVEY SAMPLING

February 14, 2018

Linda Owens

www.srl.uic.edu

General information

• Please hold questions until the end of the presentation
• Slides available at http://www.srl.uic.edu/seminars.htm
• Please raise your hand so that I can see that you can hear me

Outline

• Introduction
• Target Populations
• Sample Frames
• Sample Designs
• Determining Sample Sizes
• Modes of Data Collection
• Questions

Introduction

Census:
• Gathering information about every individual in a population

Sample:
• Selection of a small subset of a population

Why sample instead of taking a census?

[Figure: Census vs. Sample]

• Less expensive
• Less time-consuming
• Can be more accurate, because resources can be concentrated on data quality
• Samples can lead to statistical inference about the entire population

Probability vs. non-probability

• Probability Sample
  • Generalize to the entire population
  • Unbiased results
  • Known, non-zero probability of selection
• Non-probability Sample
  • Exploratory research
  • Convenience
  • Probability of selection is unknown

Probability vs Non-Probability Sample

[Figure: a probability sample with p = n/N = 10/30 = .3333, next to a non-probability sample with p = n/? = ?]

Steve Mays, YouTube video on sampling: https://youtu.be/yx5KZi5QArQ

Rahul Patwari, YouTube video on non-probability sampling: https://youtu.be/-kwdXEXC7yE

Target population

Definition: The population to which we want to generalize our findings

• Unit of analysis: Individual/Household/City
• Geography: State of Illinois/Champaign County/City of Urbana
• Age/Gender
• Other variables

Examples of target populations

• Population of adults in Champaign County
• Faculty, staff, or students at the University of Illinois
• Youth age 5 to 18 in Champaign County
• Registered Voters

Sampling frame

• Before you can ask people to answer your questions, you have to make contact with them
• How will you do that?
• The sampling frame is the mechanism that makes that possible
• The information available on the sampling frame has a bearing on the mode of data collection

Sampling frame

• A complete list of all units, at the first stage of sampling, from which a sample is drawn
• For example, lists of . . .
  • addresses
  • landline phone numbers in specific area codes
  • blocks or census tracts in specified geographic areas
  • members of a professional organization
  • schools
  • cell phone numbers

Target populations, sample frames, and coverage

Example 1:
• Population: Adults in Champaign County, IL
• Frames: List of landline numbers, list of census blocks, list of addresses

Example 2:
• Population: Youth age 5 to 18 in Cook County
• Frame: List of schools

Example 3:
• Population: Adults age 18-34 in United States
• Frame: ??

Coverage: How well does the sample frame represent the target population?

Coverage Error

[Figure: Target Population and Sample Frame]

Sample designs for probability samples

• Simple random samples
• Systematic samples
• Stratified samples
• Cluster
• Multi-stage
• Combination (e.g. stratified cluster sample)

Simple random sampling (SRS)

• Definition: Every element has the same probability of selection and every combination of elements has the same probability of selection.
• Probability of selection: n/N, where n = sample size; N = population size
• Use random number tables or software packages to generate random numbers
• Most standard formulas for the precision of estimates assume SRS

[Figure: Simple random sample (6 out of 30)]
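The SRS idea can be sketched in a few lines of Python. This is only an illustration: the frame of 30 numbered units, the function name `simple_random_sample`, and the seed are assumptions made for the example, not part of the original slides.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n elements without replacement.

    random.sample gives every subset of size n the same chance of
    selection, which is the defining property of SRS.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical frame of 30 units, mirroring the "6 out of 30" illustration.
frame = list(range(1, 31))
sample = simple_random_sample(frame, n=6, seed=2018)

# Probability of selection for any one unit: n / N.
selection_probability = len(sample) / len(frame)
print(sorted(sample), selection_probability)
```

Passing a seed only makes the draw reproducible; leaving it out gives a fresh draw each time.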

Systematic sampling

• Definition: Every element has the same probability of selection, but not every combination can be selected.
• Use when drawing SRS is difficult
  • List of elements is long & not computerized
• Procedure
  • Determine population size N and sample size n
  • Calculate sampling interval i = N/n
  • Pick random start between 1 & sampling interval
  • Take every ith case
• Problem of periodicity: if the list has a cyclical pattern that lines up with the sampling interval, the sample can be biased

[Figure: Systematic sample (every 5th)]
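The systematic sampling procedure could be sketched as follows. The function name `systematic_sample` and the 30-unit frame are hypothetical, and the sketch assumes N is an exact multiple of n so that the interval is a whole number.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Take every i-th element after a random start, where i = N // n."""
    N = len(frame)
    interval = N // n                    # sampling interval (assumes N % n == 0)
    rng = random.Random(seed)
    start = rng.randint(1, interval)     # random start between 1 and the interval, inclusive
    # Convert the 1-based start to a 0-based index, then step by the interval.
    return frame[start - 1::interval][:n]

# Hypothetical frame of 30 units; n = 6 gives an interval of 5 ("every 5th").
frame = list(range(1, 31))
sample = systematic_sample(frame, n=6, seed=7)
print(sample)
```

Only as many distinct samples exist as there are possible starts (5 here), which is why not every combination of elements can be selected.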

Stratified sampling: Proportionate

• To ensure sample resembles some aspect of population
• Population is divided into subgroups (strata)
  • Students by year in school
  • Faculty by gender
• Simple Random Sample (with same probability of selection) taken from each stratum.
• Sampling fraction is the same for all strata, regardless of population in each stratum.
  • Larger strata will have larger sample

[Figure: Proportionate stratified sample (sampling fraction = 1/5), with strata of N=25 (n=5), N=10 (n=2), and N=15 (n=3)]
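A minimal sketch of proportionate stratified selection, using the stratum sizes from the figure (25, 10, and 15) and a sampling fraction of 1/5. The stratum labels "A", "B", "C", the unit names, and the function name are made up for the example.

```python
import random

def proportionate_stratified_sample(strata, fraction, seed=None):
    """Draw an SRS from each stratum using the same sampling fraction."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        n_h = round(len(units) * fraction)   # stratum sample size, nearest whole unit
        sample[name] = rng.sample(units, n_h)
    return sample

# Hypothetical strata with the sizes shown in the figure.
strata = {
    "A": [f"A{i}" for i in range(25)],
    "B": [f"B{i}" for i in range(10)],
    "C": [f"C{i}" for i in range(15)],
}
sample = proportionate_stratified_sample(strata, fraction=1 / 5, seed=1)
sizes = {name: len(units) for name, units in sample.items()}
print(sizes)
```

Because the fraction is constant, the larger strata contribute more cases and the sample mirrors the population's distribution across strata.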

Stratified sampling: Disproportionate

• Major use is comparison of subgroups
• Population is divided into subgroups (strata)
  • Compare girls & boys who play Little League
  • Compare seniors & freshmen who live in dorms
• Probability of selection needs to be higher for smaller stratum (girls & seniors) to be able to compare subgroups.
• Requires weighting to adjust for different probabilities of selection

[Figure: Disproportionate stratified sample (n=12, 4 from each stratum, overall p=.24), with stratum probabilities p=4/25=.16, p=4/10=.40, and p=4/15=.267]
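The weighting step can be illustrated with the numbers from the figure. This sketch assumes the usual base weight, the inverse of the probability of selection; the stratum labels are hypothetical.

```python
# Stratum population sizes from the figure; labels are hypothetical.
stratum_sizes = {"A": 25, "B": 10, "C": 15}
n_per_stratum = 4  # equal allocation: 4 cases from each stratum

# Probability of selection within each stratum: n_h / N_h.
selection_prob = {h: n_per_stratum / N_h for h, N_h in stratum_sizes.items()}

# Base weight = 1 / probability of selection = N_h / n_h.
weights = {h: N_h / n_per_stratum for h, N_h in stratum_sizes.items()}

# The weighted sample should add back up to the population size.
weighted_total = sum(n_per_stratum * w for w in weights.values())

print(selection_prob)
print(weights, weighted_total)
```

Each sampled case in the largest stratum stands in for more population members than a case in the smallest stratum, which is what the weights correct for when estimating for the whole population.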

Cluster/Multistage sampling

• Typically used in face-to-face surveys
• Population divided into clusters
  • Schools (earlier example)
  • Blocks
• Draw a sample of clusters, then either
  • Include every member of cluster (=cluster sample)
  • Select random sample of cluster members (=multistage sample)
• Reasons for cluster sampling
  • Reduction in cost
  • No satisfactory sampling frame available

[Figure: Cluster sample]
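The difference between a cluster sample and a multistage sample can be sketched as below. The six schools of ten students each, and both function names, are invented for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Sample clusters, then include every member of each sampled cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for c in chosen for member in clusters[c]]

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Sample clusters, then draw an SRS of members within each sampled cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for c in chosen:
        members = clusters[c]
        sample.extend(rng.sample(members, min(n_per_cluster, len(members))))
    return sample

# Hypothetical frame: 6 schools (clusters) with 10 students each.
clusters = {f"school_{s}": [f"s{s}_student_{i}" for i in range(10)] for s in range(6)}

one_stage = cluster_sample(clusters, n_clusters=2, seed=3)
two_stage = two_stage_sample(clusters, n_clusters=2, n_per_cluster=3, seed=3)
print(len(one_stage), len(two_stage))
```

Only the list of clusters is needed up front; member lists are required just for the clusters that are actually drawn, which is why this approach helps when no satisfactory element-level frame exists.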

Complex Sample Designs

• Combination of sample strategies
• Example: multistage, stratified sample of adults in Chicago
  1. Stratify census blocks into groups based on predominant racial/ethnic group
  2. Draw a sample of census blocks from each stratum
  3. Draw a sample of housing units from each sampled census block
  4. Sample one respondent from all eligible adults in the household
• Each sampling stage has its own probability of selection
• Final probability of selection of an eligible adult is the product of the probabilities at all stages
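As a rough illustration of how stage probabilities combine, the sketch below multiplies three made-up stage probabilities. None of these numbers come from the Chicago example; they are assumptions chosen to keep the arithmetic simple.

```python
from fractions import Fraction
from math import prod

# Hypothetical probabilities of selection at each stage for one adult.
stage_probabilities = {
    "census block within its stratum": Fraction(10, 200),
    "housing unit within the block": Fraction(5, 50),
    "adult within the household": Fraction(1, 2),   # 1 of 2 eligible adults
}

# Final probability of selection is the product across stages.
overall_probability = prod(stage_probabilities.values())

# The corresponding base weight is the inverse of that probability.
base_weight = 1 / overall_probability

print(overall_probability, base_weight)
```

Using `Fraction` keeps the product exact, so the result reads directly as "1 in 400" under these assumed values.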

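Determining sample size: SRS

• Need to consider
  • Precision
  • Variation in subject of interest
• Formula: $n_0 = \frac{CI^2 \cdot (pq)}{\text{Precision}^2}$, where CI is the z-value for the chosen confidence level, p is the expected proportion, and q = 1 - p
• For example: $n_0 = \frac{1.96^2 \cdot (.5 \cdot .5)}{.05^2} = 384.16$, which rounds up to 385
• Sample size not dependent on population size (except finite population correction)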
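The sample size calculation can be checked with a short function. The function and parameter names are assumptions for this sketch; it uses the slide's example values (z = 1.96 for 95% confidence, p = .5, precision = .05) and treats precision as a margin of error that is squared in the denominator.

```python
import math

def srs_sample_size(z, p, precision):
    """Initial SRS sample size for estimating a proportion: z^2 * p * q / precision^2."""
    q = 1 - p
    return (z ** 2) * p * q / precision ** 2

n0 = srs_sample_size(z=1.96, p=0.5, precision=0.05)

# Round up, since a fraction of a respondent cannot be sampled.
n0_rounded = math.ceil(n0)
print(round(n0, 2), n0_rounded)
```

Setting p = .5 maximizes p * q, so it gives the most conservative (largest) sample size when the true proportion is unknown.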

Sample size: Other issues

• Finite Population Correction (FPC)
  • Use when sample >5% of pop
  • $n = \frac{n_0}{1 + n_0/N}$
• Design effects
• Analysis of subgroups
• Increase size to accommodate nonresponse
• Cost
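A sketch of how these adjustments might be applied in sequence, assuming the common form of the finite population correction, n = n0 / (1 + n0/N), and the common practice of multiplying by a design effect and dividing by an expected response rate. The population size, design effect, and response rate below are invented for illustration.

```python
import math

def apply_fpc(n0, N):
    """Finite population correction (assumed form): n = n0 / (1 + n0 / N)."""
    return n0 / (1 + n0 / N)

def inflate_for_fieldwork(n, design_effect=1.0, response_rate=1.0):
    """Scale up for a complex design and expected nonresponse, rounding up."""
    return math.ceil(n * design_effect / response_rate)

n0 = 384.16   # initial SRS sample size for 95% confidence, p = .5, precision = .05
N = 2000      # hypothetical population size; n0 is well over 5% of it

n = apply_fpc(n0, N)

# Hypothetical design effect of 1.5 and expected response rate of 40%.
n_to_field = inflate_for_fieldwork(n, design_effect=1.5, response_rate=0.4)
print(round(n, 2), n_to_field)
```

The correction shrinks the required number of completed cases for a small population, while the design effect and nonresponse adjustments push the number of cases to field back up.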

Modes of data collection

• Face to face
• Phone
• Web
• Mail

Target population/frame/mode correspondence

• Mode needs to be consistent with information in sample frame
• Mode needs to be consistent with target population

Cell phone and landline frames

• An increasing proportion of US households are cell phone only (52.5% in 2017; 5.9% landline only)
  • https://www.cdc.gov/nchs/data/nhis/earlyrelease/wireless201712.pdf (Blumberg & Luke)
• Cell phone only households tend to be
  • Unrelated adults
  • Hispanic adults
  • Younger
  • Lower SES
  • But...
• Landline sample frames can lead to bias

Cell phone and landline frames, cont.

• Cell phone frames are harder to target geographically than landline frames
• Survey researchers are combining landline and cell phone frames

Address-based sampling

• Sampling addresses from a near universal listing of residential mail delivery locations
• Post Office Delivery Sequence Files (DSF)

Address-based sampling: advantages

• Coverage of households is very high
• Can be matched to name and listed telephone numbers
• Includes non-telephone households
• More efficient than traditional block-listing

Address-based sampling: disadvantages

• Incomplete in rural areas (although improving with 9-1-1 address conversion)
• Difficulties with “multidrop” addresses
• Best used with mail or face to face surveys.
• Can be used for web surveys with some additional effort/cost

Thank you!

Future noontime webinars
• Introduction to Questionnaire Design, Wednesday, February 21
• Survey Response Rates: Uses and Misuses, Wednesday, February 28

Evaluation


Questions


Resources

• Books on Sampling: the Classics
  • Leslie Kish, Survey Sampling, 1965
  • William Cochran, Sampling Techniques, 3rd Ed., 1977
  • Seymour Sudman, Applied Sampling, 1976
• Sharon Lohr, Sampling: Design and Analysis, 2009
• https://www.cdc.gov/nchs/nhis/releases.htm#wireless
• Rahul Patwari, YouTube video on non-probability sampling: https://youtu.be/-kwdXEXC7yE
• Steve Mays, YouTube video on sampling: https://youtu.be/yx5KZi5QArQ