Sampling
Representing populations
Let’s say you wanted to know whether people over 60 used the Internet for medical information
You could save a bundle on providing medical information by putting up a web page with the necessary information rather than contacting people directly or having them call their doctors for it
But how could you determine whether they would use it?
Track them all down and ask them?
◦Practically impossible
◦Prohibitively expensive
◦Not really necessary
So
Talk to some of them and estimate what the rest would say
But which ones should be talked to?
Sampling theory guides us in the choice of people to measure as well as estimating what the entire population would have answered
Samples and Sampling
A sample is a subgroup drawn from a larger population that is meant to represent all members
Sampling refers to the actions taken to draw a sample from a population
Examples of Sampling
Small portions of food are given away in supermarkets in order to get you to buy the product (I made it through grad school this way)
Geologists drill out deep cylinders of rock to determine whether to drill for oil
Farmers pick ears of corn from many parts of the field to check for insects
Short portions of songs are downloaded from the Internet by prospective buyers
Sampling frame
◦A list of the units of the population used to draw the sample
◦A sampling frame must closely reflect the population (e.g., telephone books, voter registration lists)
Parameters and statistics
Parameter
◦A true characteristic of a population
◦Example: average age of Lexingtonians
Statistic
◦A numeric summary of a variable in a sample
◦Example: mean age of a sample of Lexingtonians
◦Sample statistics are computed in order to estimate population parameters.
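The difference between a parameter and a statistic can be sketched in a few lines of Python. The population below is made-up data (10,000 invented ages), not real figures for Lexington, and the sample size of 200 is an arbitrary choice for illustration.

```python
import random
from statistics import mean

# Hypothetical population: ages of 10,000 residents (invented data).
rng = random.Random(42)
population_ages = [rng.randint(18, 90) for _ in range(10_000)]

# Parameter: the true mean age of the whole population.
parameter_mean = mean(population_ages)

# Statistic: the mean age in a random sample of 200 residents.
sample_ages = rng.sample(population_ages, 200)
statistic_mean = mean(sample_ages)

print(f"parameter (population mean): {parameter_mean:.2f}")
print(f"statistic (sample mean):     {statistic_mean:.2f}")
```

In practice the parameter is unknown, which is why the statistic is computed in the first place. Here both are visible only because the population was generated by the script.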
Random sample
The best method for representing the entire population with a sample is to use a random sample
In a random sample, each person in the population of interest has an equal and known chance of being selected
◦allows researchers to calculate sampling error
Nonrandom samples
In nonrandom samples, the likelihood of inclusion of any individual element from the population into the sample is not known
◦Means that many of the advantages of statistical analyses are lost
The researcher may choose a nonrandom sample for several reasons:
◦Purpose of the study
  explore variable relationships (experiment)
  exploratory research
◦Cost versus value
  probability sample may be too expensive
◦Low incidence of preferred respondents
  black lawyers
◦Willingness to participate
  focus groups
◦Time constraints
◦Exploratory study
Types of nonrandom samples
Convenience sample (also called ‘haphazard’ or ‘accidental’ sample)
Volunteer sample
Purposive sample
Quota sample
Network sample
Convenience sample
Respondents are included based on availability
◦students in introductory courses
◦mall intercepts
◦movie studio tours
Volunteer sample
Respondents choose to participate in the study
◦clinical trials
◦consumer juries
◦extra-credit psych experiments
Volunteers are different:
◦higher educational status
◦higher occupational status
◦greater need for approval
◦higher IQ
◦lower authoritarianism
◦more sociable
◦more ‘arousal-seeking’
◦less conventional
◦tend to be first children
◦younger
Purposive sample
Subjects selected on the basis of specific characteristics or qualities
◦users of a particular brand
◦young mothers with small children
◦doctors
◦members of a fan club
◦target market members
Quota sample
“respondents are selected nonrandomly on the basis of their known proportion in a population” (Frey et al., 2000)
◦Large/medium/small hospitals
◦Caucasian/Black/Asian
◦Heavy/medium/light users
Responses may be weighted according to population proportion
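One common way to weight responses, sketched here as an illustration, is to give each group a weight equal to its population share divided by its sample share. Every number below (population shares, respondent counts, "yes" rates) is hypothetical.

```python
# Hypothetical known population shares by hospital size.
population_share = {"large": 0.20, "medium": 0.30, "small": 0.50}
# Hypothetical number of respondents obtained in each quota group.
sample_counts = {"large": 50, "medium": 30, "small": 20}
# Hypothetical share of "yes" answers observed within each group.
group_yes_rate = {"large": 0.80, "medium": 0.60, "small": 0.40}

n = sum(sample_counts.values())

# Weight = population share / sample share, so over-represented
# groups count for less and under-represented groups count for more.
weights = {g: population_share[g] / (sample_counts[g] / n) for g in sample_counts}

unweighted = sum(sample_counts[g] * group_yes_rate[g] for g in sample_counts) / n
weighted = (
    sum(weights[g] * sample_counts[g] * group_yes_rate[g] for g in sample_counts)
    / sum(weights[g] * sample_counts[g] for g in sample_counts)
)

print(f"unweighted yes rate: {unweighted:.2f}")
print(f"weighted yes rate:   {weighted:.2f}")
```

Weighting corrects for group sizes only. It does not make the respondents within each group a random selection.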
Network sample
‘Snowball’ sample
◦ask respondents to recommend additional sources/respondents
  cheaper
  helps identify people with certain characteristics
  aids in respondent compliance
  identify networks of people
Random samples
Simple random sample
Systematic random sample
Stratified random sample
Cluster sample
Simple random sample (SRS)
The simple random sample is a case where each element has an equal chance of being selected into the sample
◦Lottery
◦Random number table
◦Roulette wheel
◦Random digit dialing
Statistics often assume a “SRS”
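A software version of the lottery is easy to sketch with Python's standard library. The sampling frame of 600 placeholder names is hypothetical; `random.sample` draws without replacement, so every unit has the same chance of selection and nobody is picked twice.

```python
import random

# Hypothetical sampling frame: 600 placeholder names.
sampling_frame = [f"person_{i}" for i in range(1, 601)]

# Seeded generator so the draw can be reproduced; drop the seed for a fresh draw.
rng = random.Random(0)

# Simple random sample of 100 units, drawn without replacement.
srs = rng.sample(sampling_frame, 100)

print(len(srs), "units selected, e.g.", srs[:3])
```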
Systematic random sampling
“A random sample that chooses every nth person/text from a complete list of a population after starting at a random point.” (Frey et al., 2000)
For example, if you have a sampling frame of 600 elements and you need a sample of 100, then you would have to pick every 6th name. You randomly choose the first name--it turns out to be the 4th element. You then choose the 4th, 10th, 16th, 22nd, etc.
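The 600-element example can be reproduced with a short sketch. The function name `systematic_sample` and its parameters are made up for illustration. The frame is simply the numbers 1 to 600, and the start is fixed at 4 to match the example (leave `start` out to get a random start).

```python
import random

def systematic_sample(frame, sample_size, start=None, rng=None):
    """Pick every k-th unit from frame, beginning at a 1-based start position."""
    interval = len(frame) // sample_size      # k: 600 // 100 = 6
    if start is None:
        rng = rng or random.Random()
        start = rng.randint(1, interval)      # random start within the first k units
    picked = frame[start - 1::interval]       # every k-th unit from the start
    return picked[:sample_size]

frame = list(range(1, 601))                   # positions 1..600 in the frame
picked = systematic_sample(frame, 100, start=4)
print(picked[:4])                             # [4, 10, 16, 22]
```

The sketch assumes the frame size divides evenly by the sample size, as it does in the example.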
Stratified random sample
A sample developed by first splitting the population based on some important characteristic and sampling randomly from within categories
◦e.g. age, gender, race, income
Random samples are taken from within each of the subpopulations
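A minimal sketch of stratified sampling follows, assuming the same sampling fraction is used in every stratum (proportionate stratification, which is only one possible design). The function name, the age groups, and their sizes are all hypothetical.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(units, stratum_of, fraction, rng):
    """Split units into strata, then draw a simple random sample within each."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)    # same fraction in every stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population of 600 people, tagged by age group.
population = (
    [("under 40", i) for i in range(300)]
    + [("40-59", i) for i in range(200)]
    + [("60+", i) for i in range(100)]
)

rng = random.Random(1)
sample = stratified_sample(population, lambda unit: unit[0], 0.10, rng)
counts = Counter(group for group, _ in sample)
print(counts)
```

Because each stratum is sampled separately, the age groups appear in the sample in exactly their population proportions, which a simple random sample would only approximate.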
Cluster sampling
Larger groupings of individual sample elements are chosen, then the elements are measured
◦Usually geographic areas
Cluster sampling
Advantages:
◦Only part of the population needs to be enumerated
◦Costs reduced
◦Cluster estimates can be compared to population numbers
Cluster sampling
Disadvantages:
◦Sampling errors are likely
◦Clusters may not be representative of the population
  Number and size of clusters is important
◦Each subject or unit must be assigned to a specific cluster
Multi-stage sampling
Sample large groups/clusters, then sample smaller units within the groups, and so on
◦metropolitan area
◦county
◦block
◦residence
◦individual
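The stage-by-stage idea can be sketched with a small nested structure. Everything here is hypothetical: 8 counties, 10 blocks per county, 20 residences per block, and the choice to draw 3 counties, 4 blocks, and 5 residences at the three stages.

```python
import random

rng = random.Random(7)

# Hypothetical nested population: county -> block -> list of residences.
counties = {
    f"county_{c}": {
        f"block_{c}_{b}": [f"residence_{c}_{b}_{r}" for r in range(20)]
        for b in range(10)
    }
    for c in range(8)
}

selected = []
# Stage 1: randomly choose 3 of the 8 counties.
for county in rng.sample(sorted(counties), 3):
    # Stage 2: randomly choose 4 blocks within each chosen county.
    for block in rng.sample(sorted(counties[county]), 4):
        # Stage 3: randomly choose 5 residences within each chosen block.
        selected.extend(rng.sample(counties[county][block], 5))

print(len(selected), "residences selected")
```

Only the chosen counties and blocks need to be listed in full, which is where the cost saving comes from.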
Sample Size
Generally speaking, the larger the better
◦But quality is most important
◦Though people find it hard to believe, you can make some pretty good estimates of very large populations from rather small samples
  National polls can be pretty accurate with 600 respondents
Sample size
There is a law of diminishing returns:
◦additional units add less and less precision
  The first respondent is the most valuable, the second is second-most, etc.
Will often be determined by time and cost considerations
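Diminishing returns can be illustrated with the usual margin-of-error formula for a proportion. This is a sketch under assumptions the slides do not state: a simple random sample, a 95% confidence level (z = 1.96), and the worst case of a 50/50 split (p = 0.5). The function name `margin_of_error` is made up for the example.

```python
from math import sqrt

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample."""
    return z * sqrt(p * (1 - p) / n)

# Quadrupling the sample size only halves the margin of error.
for n in (100, 600, 2400, 9600):
    print(f"n = {n:>5}: +/- {100 * margin_of_error(n):.1f} percentage points")
```

Under these assumptions, 600 respondents give a margin of roughly 4 percentage points, and going from 600 to 2,400 respondents only brings it down to about 2.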
Sampling error
“A number that expresses how much the characteristics of a sample probably differ from the characteristics of its population” (Frey et al., 2000)
Sampling error can be estimated for random samples
This is nonsystematic error variance
Sampling error
Two key components of sampling error estimates are confidence levels and confidence intervals
“We express the accuracy of our sample statistics in terms of a level of confidence that the statistics fall within a specified interval from the parameter.” (Babbie)
◦tradeoff between confidence level and confidence interval
Example:
Research finds that 45% of males say that they have broken the speed limit by 15 mph in the last two months.
◦The researcher is 99% confident that the actual percent is between 42% and 48%.
◦That is, if the researcher took 100 samples and built an interval like this from each one, she would expect about 99 of those intervals to contain the actual % of males speeding by 15 mph.
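An interval like the one in the example can be computed with the normal approximation for a proportion. The slides do not give the sample size, so n = 1825 below is an assumption chosen because it yields roughly a 3-point margin at 99% confidence. The function name `proportion_ci` is likewise made up for the sketch.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence):
    """Normal-approximation confidence interval for a sample proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Assumed sample size (not given in the example): n = 1825.
low99, high99 = proportion_ci(0.45, 1825, 0.99)
low95, high95 = proportion_ci(0.45, 1825, 0.95)

print(f"99% interval: {100 * low99:.1f}% to {100 * high99:.1f}%")
print(f"95% interval: {100 * low95:.1f}% to {100 * high95:.1f}%")
```

Comparing the two intervals shows the tradeoff: for the same sample, a lower confidence level gives a narrower interval, and a higher confidence level gives a wider one.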
So
We use samples to estimate population parameters because our estimates can be pretty close while drastically reducing the costs of carrying out the research
Samples are either random or nonrandom
Random samples allow us to estimate the sampling error attached to statistics describing the sample
Nonrandom samples are used when random samples are too expensive or impractical
◦They employ methods other than randomization meant to increase their representativeness
A number of different types of random and nonrandom sampling can be used to reduce costs or improve sample quality