INTRODUCTION TO SURVEY SAMPLING
February 14, 2018
Linda Owens
www.srl.uic.edu
General information
• Please hold questions until the end of the presentation
• Slides available at http://www.srl.uic.edu/seminars.htm
• Please raise your hand so that I can see that you can hear me
Outline
• Introduction
• Target Populations
• Sample Frames
• Sample Designs
• Determining Sample Sizes
• Modes of Data Collection
• Questions
Introduction
Census:
• Gathering information about every individual in a population
Sample:
• Selection of a small subset of a population
Why sample instead of taking a census?
[Figure: Census vs. Sample]
• Less expensive
• Less time-consuming
• More accurate
• Samples can lead to statistical inference about the entire population
Probability vs. non-probability
• Probability Sample
  • Generalize to the entire population
  • Unbiased results
  • Known, non-zero probability of selection
• Non-probability Sample
  • Exploratory research
  • Convenience
  • Probability of selection is unknown
Probability vs. Non-Probability Sample
Probability sample: p = n/N = 10/30 = .3333
Non-probability sample: p = n/? = ?
Steve Mays, YouTube video on sampling: https://youtu.be/yx5KZi5QArQ
Rahul Patwari, YouTube video on non-probability sampling: https://youtu.be/-kwdXEXC7yE
Target population
Definition: The population to which we want to generalize our findings
• Unit of analysis: Individual/Household/City
• Geography: State of Illinois/Champaign County/City of Urbana
• Age/Gender
• Other variables
Examples of target populations
• Population of adults in Champaign County
• Faculty, staff, or students at the University of Illinois
• Youth age 5 to 18 in Champaign County
• Registered voters
Sampling frame
• Before you can ask people to answer your questions, you have to make contact with them
• How will you do that?
• The sampling frame is the mechanism that makes that possible
• The information available on the sampling frame has a bearing on the mode of data collection
Sampling frame
• A complete list of all units, at the first stage of sampling, from which a sample is drawn
• For example, lists of . . .
  • addresses
  • landline phone numbers in specific area codes
  • blocks or census tracts in specified geographic areas
  • members of a professional organization
  • schools
  • cell phone numbers
Target populations, sample frames, and coverage
Example 1:
• Population: Adults in Champaign County, IL
• Frames: List of landline numbers, list of census blocks, list of addresses
Example 2:
• Population: Youth age 5 to 18 in Cook County
• Frame: List of schools
Example 3:
• Population: Adults age 18-34 in United States
• Frame: ??
Coverage: How well does the sample frame represent the target population?
Coverage Error
[Figure: Target Population and Sample Frame]
Sample designs for probability samples
• Simple random samples
• Systematic samples
• Stratified samples
• Cluster samples
• Multi-stage samples
• Combination (e.g., stratified cluster sample)
Simple random sampling (SRS)
• Definition: Every element has the same probability of selection and every combination of elements has the same probability of selection.
• Probability of selection: n/N, where n = sample size; N = population size
• Use random number tables or software packages to generate random numbers
• Most formulas for estimating precision assume SRS
Simple Random (6 out of 30)
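As a minimal sketch of drawing an SRS with software, the Python snippet below selects 6 of 30 elements, matching the figure caption. The element labels 1 to 30, the function name, and the seed are illustrative assumptions, not part of the original slides.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n elements without replacement; every subset of size n is equally likely."""
    rng = random.Random(seed)  # seed is used only to make the illustration reproducible
    return rng.sample(population, n)

population = list(range(1, 31))  # hypothetical frame: elements numbered 1..30
sample = simple_random_sample(population, n=6, seed=2018)

# Probability of selection for each element is n/N
p_selection = len(sample) / len(population)
print(sorted(sample), p_selection)
```

Here each element has probability n/N = 6/30 = .2 of being selected.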
Systematic sampling
• Definition: Every element has the same probability of selection, but not every combination can be selected.
• Use when drawing SRS is difficult
  • List of elements is long & not computerized
• Procedure
  • Determine population size N and sample size n
  • Calculate sampling interval i = N/n
  • Pick a random start between 1 and the sampling interval
  • Take every ith case after the random start
• Problem of periodicity (a cyclical pattern in the list that coincides with the sampling interval can bias the sample)
Systematic Sample (every 5th)
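The systematic sampling procedure can be sketched in a few lines of Python. This is an illustration only: it assumes N is an exact multiple of n (30 and 6 here, giving an interval of 5 as in the figure caption), and the function name and seed are hypothetical.

```python
import random

def systematic_sample(population, n, seed=None):
    """Select every i-th element after a random start, where i = N // n."""
    N = len(population)
    interval = N // n                  # sampling interval (assumes N is a multiple of n)
    rng = random.Random(seed)
    start = rng.randint(1, interval)   # random start between 1 and the interval, inclusive
    # Convert the 1-based start to a 0-based index, then step through the list by the interval.
    return population[start - 1::interval][:n]

population = list(range(1, 31))        # hypothetical ordered list of 30 elements
sample = systematic_sample(population, n=6, seed=2018)
print(sample)
```

Only 5 distinct samples are possible here (one per random start), which is why not every combination of elements can be selected.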
Stratified sampling: Proportionate
• To ensure sample resembles some aspect of population
• Population is divided into subgroups (strata)
  • Students by year in school
  • Faculty by gender
• Simple random sample (with the same probability of selection) is taken from each stratum.
• Sampling fraction is the same for all strata, regardless of population in each stratum.
• Larger strata will have larger samples
Proportionate Stratified Sample (sampling fraction = 1/5)
Stratum 1: N=25 (n=5); Stratum 2: N=10 (n=2); Stratum 3: N=15 (n=3)
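A rough sketch of proportionate allocation, using the three stratum sizes from the slide (25, 10, 15) and a sampling fraction of 1/5. The stratum names, member labels, and seed are hypothetical, and rounding the stratum sample size is an assumption that only works cleanly when the fraction divides each stratum evenly.

```python
import random

def proportionate_stratified_sample(strata, fraction, seed=None):
    """Take an SRS from each stratum using the same sampling fraction."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        n_h = round(len(members) * fraction)  # stratum sample size
        sample[name] = rng.sample(members, n_h)
    return sample

# Hypothetical strata with the sizes shown on the slide
strata = {
    "stratum_1": [f"s1_{i}" for i in range(25)],
    "stratum_2": [f"s2_{i}" for i in range(10)],
    "stratum_3": [f"s3_{i}" for i in range(15)],
}
sample = proportionate_stratified_sample(strata, fraction=1 / 5, seed=2018)
sizes = {name: len(members) for name, members in sample.items()}
print(sizes)
```

Because the fraction is constant, the sample sizes (5, 2, 3) mirror the relative sizes of the strata.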
Stratified sampling: Disproportionate
• Major use is comparison of subgroups
• Population is divided into subgroups (strata)
  • Compare girls & boys who play Little League
  • Compare seniors & freshmen who live in dorms
• Probability of selection needs to be higher for the smaller stratum (girls, seniors) so that subgroups can be compared.
• Requires weighting to adjust for different probabilities of selection
Disproportionate Stratified Sample (n=12, 4 from each stratum; overall p=.24)
Stratum 1: p=4/25=.16; Stratum 2: p=4/10=.40; Stratum 3: p=4/15=.267
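The weighting step can be illustrated with the numbers from the slide. The sketch below assumes the common approach of weighting each case by the inverse of its selection probability (N_h/n_h); the stratum names are hypothetical and the slides do not specify a particular weighting method.

```python
# Stratum population sizes from the slide; 4 cases are sampled from each stratum
strata_sizes = {"stratum_1": 25, "stratum_2": 10, "stratum_3": 15}
n_per_stratum = 4

design = {}
for name, N_h in strata_sizes.items():
    p_h = n_per_stratum / N_h        # probability of selection within the stratum
    weight = N_h / n_per_stratum     # inverse-probability weight, equal to 1/p_h
    design[name] = {"p": p_h, "weight": weight}

# Check: the weighted sample should reproduce the population size (25 + 10 + 15)
estimated_N = sum(n_per_stratum * d["weight"] for d in design.values())

for name, d in design.items():
    print(name, round(d["p"], 3), d["weight"])
print(estimated_N)
```

Cases from the oversampled small strata receive smaller weights, so weighted estimates for the whole population are not distorted by the unequal sampling fractions.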
Cluster/Multistage sampling
• Typically used in face-to-face surveys
• Population divided into clusters
  • Schools (earlier example)
  • Blocks
• Draw a sample of clusters
  • Include every member of cluster (=cluster sample)
  • Select random sample of cluster members (=multistage sample)
• Reasons for cluster sampling
  • Reduction in cost
  • No satisfactory sampling frame available
Cluster Sample
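The difference between a cluster sample and a multistage sample can be sketched as follows. The school and student labels, the cluster sizes, and the seed are made up for illustration; a real design would typically also account for unequal cluster sizes.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Sample whole clusters and include every member of each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {c: list(clusters[c]) for c in chosen}

def multistage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Sample clusters, then take a random sample of members within each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        c: rng.sample(clusters[c], min(n_per_cluster, len(clusters[c])))
        for c in chosen
    }

# Hypothetical frame: 10 schools with 20 students each
clusters = {
    f"school_{k}": [f"school_{k}_student_{i}" for i in range(20)]
    for k in range(10)
}
one_stage = cluster_sample(clusters, n_clusters=3, seed=2018)
two_stage = multistage_sample(clusters, n_clusters=3, n_per_cluster=5, seed=2018)
print({c: len(m) for c, m in one_stage.items()})
print({c: len(m) for c, m in two_stage.items()})
```

Only a list of schools is needed up front; student lists are required only for the selected schools, which is the cost and frame advantage noted on the slide.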
Complex Sample Designs
• Combination of sample strategies
• Example: multistage, stratified sample of adults in Chicago
  1. Stratify census blocks into groups based on predominant racial/ethnic group
  2. Draw a sample of census blocks from each stratum
  3. Draw a sample of housing units from each sampled census block
  4. Sample one respondent from all eligible adults in the household
  5. Each sampling stage has its own probability of selection
  6. Final probability of selection of an eligible adult is the product of the probabilities at all stages
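To make the last point concrete, the sketch below multiplies stage-level selection probabilities to get the final probability for one respondent. All counts (blocks, housing units, eligible adults) are invented for illustration and do not come from the slides.

```python
from math import prod

# Hypothetical selection probabilities at each stage for one respondent
stage_probabilities = {
    "census block within stratum": 10 / 200,  # 10 of 200 blocks sampled
    "housing unit within block": 5 / 50,      # 5 of 50 housing units sampled
    "adult within household": 1 / 2,          # 1 of 2 eligible adults sampled
}

# Final probability of selection is the product across stages
p_final = prod(stage_probabilities.values())

# A base weight is commonly taken as the inverse of the final probability
base_weight = 1 / p_final
print(p_final, base_weight)
```

Respondents in households with more eligible adults have a lower final probability of selection, which is one reason such designs usually require weighting.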
Determining sample size: SRS
• Need to consider
  • Precision
  • Variation in subject of interest
• Formula: n0 = (CI^2 * pq) / Precision^2, where CI is the z-value for the chosen confidence level (1.96 for 95%), p is the expected proportion, and q = 1 - p
• For example: n0 = (1.96^2 * (.5 * .5)) / .05^2 = 384.16
• Sample size not dependent on population size (except finite population correction)
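The sample size formula can be checked with a short calculation. This sketch uses the slide's example values (z = 1.96, p = .5, precision = .05); the function and parameter names are illustrative.

```python
import math

def srs_sample_size(z=1.96, p=0.5, precision=0.05):
    """Initial SRS sample size for estimating a proportion: z^2 * p * q / precision^2."""
    q = 1 - p
    return (z ** 2) * (p * q) / (precision ** 2)

n0 = srs_sample_size()          # slide example: 95% confidence, p = .5, +/- 5 points
n0_rounded = math.ceil(n0)      # round up to a whole number of completed cases
print(round(n0, 2), n0_rounded)
```

Setting p = .5 maximizes p * q, so it gives the most conservative (largest) sample size when the true proportion is unknown.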
Sample size: Other issues
• Finite Population Correction (FPC)
  • Use when the sample is more than 5% of the population
  • n = n0 / (1 + n0/N)
• Design effects
• Analysis of subgroups
• Increase size to accommodate nonresponse
• Cost
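Two of these adjustments are easy to sketch. The snippet below assumes one common form of the finite population correction, n = n0 / (1 + n0/N), and inflates the result by dividing by an expected response rate. The population size of 2,000 and the 40% response rate are hypothetical values chosen for illustration.

```python
import math

def apply_fpc(n0, N):
    """Finite population correction (one common form, assumed here)."""
    return n0 / (1 + n0 / N)

def inflate_for_nonresponse(n, response_rate):
    """Number of cases to field so that about n completes are expected."""
    return math.ceil(n / response_rate)

n0 = 384.16   # initial SRS sample size from the previous slide's example
N = 2000      # hypothetical population size (n0 is well over 5% of N)

n_fpc = apply_fpc(n0, N)
n_to_field = inflate_for_nonresponse(n_fpc, response_rate=0.40)
print(round(n_fpc, 1), n_to_field)
```

The correction lowers the required number of completes for a small population, while the nonresponse adjustment raises the number of cases that must be fielded.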
Modes of data collection
• Face to face
• Phone
• Web
Target population/frame/mode correspondence
• Mode needs to be consistent with information in sample frame
• Mode needs to be consistent with target population
Cell phone and landline frames
• An increasing proportion of US households are cell phone only (52.5% in 2017; 5.9% landline only)
  • https://www.cdc.gov/nchs/data/nhis/earlyrelease/wireless201712.pdf (Blumberg & Luke)
• Cell phone only households tend to be
  • Unrelated adults
  • Hispanic adults
  • Younger
  • Lower SES
  • But...
• Landline sample frames can lead to bias
Cell phone and landline frames, cont.
• Cell phone frames are harder to target geographically than landline frames
• Survey researchers are combining landline and cell phone frames
Address-based sampling
• Sampling addresses from a near-universal listing of residential mail delivery locations
  • Post Office Delivery Sequence Files (DSF)
Address-based sampling: advantages
• Coverage of households is very high
• Can be matched to names and listed telephone numbers
• Includes non-telephone households
• More efficient than traditional block-listing
Address-based sampling: disadvantages
• Incomplete in rural areas (although improving with 9-1-1 address conversion)
• Difficulties with “multidrop” addresses
• Best used with mail or face-to-face surveys
• Can be used for web surveys with some additional effort/cost
Thank you!
Future noontime webinars
• Introduction to Questionnaire Design, Wednesday, February 21
• Survey Response Rates: Uses and Misuses, Wednesday, February 28
Evaluation
Questions
Resources
• Books on Sampling: the Classics
  • Leslie Kish, Survey Sampling, 1965
  • William Cochran, Sampling Techniques, 3rd Ed., 1977
  • Seymour Sudman, Applied Sampling, 1976
• Sharon Lohr, Sampling: Design and Analysis, 2009
• https://www.cdc.gov/nchs/nhis/releases.htm#wireless
• Rahul Patwari, YouTube video on non-probability sampling: https://youtu.be/-kwdXEXC7yE
• Steve Mays, YouTube video on sampling: https://youtu.be/yx5KZi5QArQ