VI. Sampling: (Nov. 2, 4)
Frankfort-Nachmias & Nachmias (Chapter 8 – Sampling and Sample Designs)
King, Keohane and Verba (Chapter 4)
Barbara Geddes. 1990. “How the Cases You Choose Affect the Answers You Get: Selection Bias in Comparative Politics.” Political Analysis, 2:1, 131-150.
Applications
William Reed. 2000. “A Unified Statistical Model of Conflict Onset and Escalation.” American Journal of Political Science, Vol. 44 (1): 84-93.
Richard Timpone. 1998. “Structure, Behavior and Voter Turnout in the United States.” American Political Science Review, Vol. 92 (1): 145-158.
Sampling
Population – any well-defined set of units of analysis; the group to which our theories apply
Sample – any subset of units collected in some manner from the population; the data we use to test our theories
Parameter vs. Statistic
Types of Samples
Probability sample – each element of the population has a known probability of being included in the sample
Nonprobability sample – each element of the population has an unknown probability of being included in the sample
Types of Nonprobability Samples
Convenience sample
Purposive sample
Problem – may not be representative of the population to which we want to generalize
Famous Example of Convenience Sampling
Literary Digest – used automobile registration lists and telephone directories as the sampling frame for presidential polls
1928: 18 million postcards; accurately predicted the outcome of the election (Hoover-R)
1932: 20 million postcards; accurately predicted the outcome of the election (Roosevelt-D)
1936: predicted Landon (R) with 57%
What happened?
Before 1936: upper class and working class had a more or less representative partisan distribution
1936 and beyond: upper class disproportionately Republican; working class disproportionately Democratic
Result: a sampling frame built from car and telephone owners over-represented Republican voters
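The frame problem can be illustrated with a small simulation. This is only a sketch: the class shares, vote probabilities, and frame coverage rates below are made-up illustrative numbers, not historical data. It shows that a very large poll drawn from a skewed frame can still miss the true vote share.

```python
import random

rng = random.Random(1936)

# Hypothetical electorate (illustrative numbers, NOT historical data):
# 30% upper class, 70% working class, with vote choice linked to class.
population = []
for _ in range(100_000):
    upper = rng.random() < 0.30
    p_republican = 0.70 if upper else 0.30
    votes_republican = rng.random() < p_republican
    # Assumed frame coverage: car/telephone lists reach most upper-class
    # voters but only a minority of working-class voters.
    in_frame = rng.random() < (0.80 if upper else 0.20)
    population.append((votes_republican, in_frame))

# True Republican share in the whole electorate.
true_share = sum(v for v, _ in population) / len(population)

# The poll can only reach people who appear in the sampling frame.
frame = [v for v, f in population if f]
poll = rng.sample(frame, 20_000)
poll_share = sum(poll) / len(poll)

print(f"true Republican share:   {true_share:.3f}")
print(f"polled Republican share: {poll_share:.3f}")
```

Under these assumptions the poll points to a Republican majority that does not exist in the population, and a larger poll from the same frame would not fix it.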
Types of Nonprobability Samples
Quota samples – elements are chosen based on selected characteristics and the representation of these characteristics in the population
Ensures accurate representation of selected characteristics
Elements with selected characteristics are chosen in convenience fashion
Famous Examples of Quota Samples
1936 – George Gallup used quota sampling to accurately predict:
The (inaccurate) Literary Digest prediction
The winner of the 1936 election
1948 – quota sampling incorrectly predicts Dewey to defeat Truman
Types of Probability Samples
Simple random sample – each element of the population has an equal chance of being selected
Systematic sample – elements selected from a list at predetermined intervals
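A minimal sketch of both designs, assuming the population is available as a complete list (the voter ID list and the function names are hypothetical) and that the sample size does not exceed the list length:

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n elements without replacement; every element is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Pick a random start, then take every k-th element from the list."""
    rng = random.Random(seed)
    k = len(population) // n      # sampling interval
    start = rng.randrange(k)      # random start within the first interval
    return [population[start + i * k] for i in range(n)]

voters = list(range(1, 1001))     # hypothetical list of 1,000 voter IDs
srs = simple_random_sample(voters, 50, seed=1)
sys_sample = systematic_sample(voters, 50, seed=1)

print(sorted(srs)[:5])
print(sys_sample[:5])
```

With a random start, every element on the list has the same chance of selection, though not every combination of elements can occur as it can under simple random sampling.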
Stratified sample – elements in the population are grouped into strata, and each stratum is randomly sampled
Example of Stratified Sampling
Population: 75% white, 10% black, 10% Hispanic, 5% Asian
Simple random sample of 1000: Approximately 750 white, 100 black, 100 Hispanic, 50 Asian
Problem: minority-group subsamples are too small for group comparisons
Solution: Use stratified sampling to over-sample minority groups (disproportionate stratified sampling)
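A sketch of disproportionate stratified sampling using the group shares from the example. The population size of 10,000, the equal allocation of 250 per stratum, and the variable names are assumptions for illustration. Because minority groups are over-sampled, design weights are needed when estimating for the whole population.

```python
import random

rng = random.Random(42)

# Hypothetical population of 10,000 with the group shares from the example.
shares = {"white": 0.75, "black": 0.10, "hispanic": 0.10, "asian": 0.05}
N = 10_000
strata = {g: [(g, i) for i in range(round(share * N))]
          for g, share in shares.items()}

# Disproportionate allocation: the same number of cases from every stratum.
n_per_stratum = 250
sample = {g: rng.sample(units, n_per_stratum) for g, units in strata.items()}

# Design weight = stratum population size / stratum sample size.
# Each sampled case "stands for" this many people in the population.
weights = {g: len(strata[g]) / n_per_stratum for g in strata}

for g in strata:
    print(g, len(sample[g]), weights[g])
```

Each group now has enough cases for comparison, and the weights restore the population proportions when groups are pooled.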
Types of Probability Samples
Cluster sample – elements are grouped into “clusters,” and sampling proceeds in two stages:
• (1) A random sample of clusters is chosen
• (2) Elements within selected clusters are then randomly selected and aggregated to form the final sample
• This is the sampling method used in many national surveys (e.g. clusters=metropolitan areas, zip codes, area codes)
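The two stages can be sketched as follows, assuming a hypothetical population of 40 equal-sized clusters (the cluster labels and sizes are invented for illustration):

```python
import random

rng = random.Random(7)

# Hypothetical population: 40 clusters (e.g. zip codes) of 200 residents each.
clusters = {c: [f"zip{c:02d}-person{i}" for i in range(200)]
            for c in range(40)}

# Stage 1: draw a random sample of clusters.
chosen = rng.sample(sorted(clusters), 8)

# Stage 2: draw a random sample of elements within each selected cluster,
# then pool them into the final sample.
final_sample = []
for c in chosen:
    final_sample.extend(rng.sample(clusters[c], 25))

print(len(chosen), len(final_sample))
```

Only the selected clusters need a full list of elements, which is why this design is practical when no national list of individuals exists.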
Sampling Distribution (of sample means)
Population
Draw Random Sample of Size n
Calculate sample mean
Repeat until all possible random samples of size n are exhausted
The resulting collection of sample means is the sampling distribution of sample means
Sampling Distribution of Sample Means
Def: A frequency distribution of all possible sample means taken from the same population for a given sample size (n)
The mean of the sampling distribution will be equal to the population mean.
The sampling distribution will be approximately normally distributed regardless of the population distribution when n is large (n > 30 is a common rule of thumb).
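For a very small population, every possible sample can be listed, so the sampling distribution can be built exactly. The sketch below assumes a hypothetical toy population of six values and sampling without replacement with n = 3. It checks that the mean of all sample means equals the population mean.

```python
from fractions import Fraction
from itertools import combinations
from statistics import mean

population = [2, 4, 6, 8, 10, 12]   # hypothetical toy population
n = 3

# Every possible sample of size n (without replacement) and its mean.
# Fractions keep the arithmetic exact.
sample_means = [Fraction(sum(s), n) for s in combinations(population, n)]

print("number of possible samples:", len(sample_means))
print("mean of sample means:", mean(sample_means))
print("population mean:", mean(population))
```

With real populations the number of possible samples is far too large to enumerate, which is why the sampling distribution is usually treated as a theoretical construct or approximated by simulation.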
Standard Error
How the sample means vary from sample to sample (i.e. within the sampling distribution) is expressed statistically by the value of the standard deviation of the sampling distribution.
Standard Error, cont.
The standard error for a sample mean is calculated as: SE = s / √n
Where:
s = sample standard deviation
n = sample size
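A short worked example of the formula, using a hypothetical sample of nine scores (the data are invented for illustration):

```python
import math
import statistics

sample = [52, 47, 61, 55, 49, 58, 50, 53, 60]   # hypothetical scores

s = statistics.stdev(sample)    # sample standard deviation (n - 1 denominator)
n = len(sample)
standard_error = s / math.sqrt(n)

print(f"s = {s:.3f}, n = {n}, SE = {standard_error:.3f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.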
Simulating a Sampling Distribution (For a Sample Proportion)
Dichotomous variable for which the true population value is set at .25
Randomly draw 1,000 samples of size n
Repeat for different n’s and compare
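The simulation described above might be sketched like this. The function name, the seed, and the extra sample size n = 1000 are assumptions added for illustration; the population value of .25 and the 1,000 replications come from the slide.

```python
import random
import statistics

def simulate_proportions(p, n, reps=1000, seed=0):
    """Draw `reps` random samples of size n and return each sample proportion."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

results = {}
for n in (10, 100, 1000):
    props = simulate_proportions(0.25, n)
    # Center and spread of the simulated sampling distribution.
    results[n] = (statistics.mean(props), statistics.pstdev(props))
    print(f"n={n:5d}  mean={results[n][0]:.3f}  sd={results[n][1]:.3f}")
```

The simulated proportions should center near .25 at every n, while their spread (the simulated standard error) shrinks as n grows.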
Simulation of a Sampling Distribution (n=10)
Simulation of a Sampling Distribution (n=100)
Sample Size and Sampling Error
Sample Selection Bias
What is it?
What are the consequences of selecting on:
The dependent variable?
The independent variable?
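One way to explore these questions is with simulated data. The sketch below assumes a linear relationship with a true slope of 1 and normally distributed noise. The cutoff of 1 and all names are hypothetical. It compares the estimated slope in the full data with the slope after selecting cases on the dependent variable and on the independent variable.

```python
import random

def ols_slope(pairs):
    """Bivariate least-squares slope of y on x."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    return sxy / sxx

rng = random.Random(4)
data = []
for _ in range(20_000):
    x = rng.gauss(0, 1)
    y = 1.0 * x + rng.gauss(0, 1)   # assumed true slope = 1
    data.append((x, y))

slope_all = ols_slope(data)
# Selecting on the dependent variable: keep only high-y cases.
slope_select_y = ols_slope([(x, y) for x, y in data if y > 1])
# Selecting on the independent variable: keep only high-x cases.
slope_select_x = ols_slope([(x, y) for x, y in data if x > 1])

print(f"all cases:      {slope_all:.2f}")
print(f"selected on y:  {slope_select_y:.2f}")
print(f"selected on x:  {slope_select_x:.2f}")
```

In this setup, selecting on the dependent variable pulls the estimated slope toward zero, while selecting on the independent variable leaves it roughly unbiased but less precise.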