1.3 Experimental Design
Designing a Statistical Study
1. Identify the variable(s) of interest (the focus) and the population of the study.
2. Develop a detailed plan for collecting data. Make sure the data are representative of the population.
Slide 2
Designing a Statistical Study
3. Collect the data.
4. Describe the data with descriptive statistics and techniques.
5. Make decisions using inferential statistics. Identify any possible errors.
Slide 3
Data Collection
1. Perform an experiment
Apply a treatment to part of the population. Do nothing to, or give a placebo to, the other part of the population; this is the control group. Collect data and compare the results between the two groups.
Example: an experiment could be used to evaluate the benefits of a new drug or medical procedure.
Slide 4
Data Collection
2. Use a simulation
A simulation uses a mathematical or physical model to reproduce the conditions of a situation or process. It allows the study of something that is impractical or dangerous to create in real life. Simulations often save time and money.
Example: the destructive characteristics of a bomb or fire.
Example: a 50/50 odds game with pins on a board at a carnival.
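As a rough illustration of the carnival example, the pin board can be simulated in a few lines of Python. This is only a sketch: the board size (8 rows of pins), the number of balls, and the function names are assumptions, not details from the slides. Each ball is assumed to bounce left or right with equal probability at every pin.

```python
import random
from collections import Counter

def drop_ball(rows, rng):
    """Return the final slot index, i.e. the number of rightward bounces."""
    return sum(rng.random() < 0.5 for _ in range(rows))

def simulate_board(rows=8, balls=10_000, seed=0):
    """Drop many balls and count how many land in each slot."""
    rng = random.Random(seed)
    return Counter(drop_ball(rows, rng) for _ in range(balls))

counts = simulate_board()
for slot in range(9):  # 8 rows of pins give slots 0 through 8
    print(f"slot {slot}: {counts[slot]} balls")
```

Running the model thousands of times costs nothing, which is the time-and-money advantage of a simulation over playing the real game.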
Slide 5
Data Collection
3. Take a census
A census is a count or measure of the entire population. A census provides complete information, but it is expensive, time-consuming, and difficult to perform. In the case of destructive testing (think of testing bombs), there may be nothing left when the census is done.
Slide 6
Data Collection
4. Use sampling (this is what you are going to do)
A sampling is a count or measure of part of the population. Use the sample to predict the behavior of the population. For example, a sample of the bombs can be tested for potency, and the results can be used to predict the potency of the untested bombs.
Slide 7
Examples from the book: Try It Yourself, pg 16
1a) Focus: Effect of exercise on senior citizens. Population: Collection of all senior citizens.
1b) Experiment
2a) Focus: Effect of radiation fallout on senior citizens. Population: Collection of all senior citizens.
2b) Sampling
Slide 8
Dictionary Word Chase
What percent of English words do you know?
Randomly open the book and pick a word. Is this truly random? This would be like convenience sampling.
Slide 9
Dictionary Word Chase
Simple Random Sample (SRS): all the words (and every set of n words) must have the same probability of being selected.
Use a number generator (math, probability, randInt(#)) to randomly pick a word from all the words in the dictionary (Webster's Ninth New Collegiate Dictionary has on the order of 160,000 entries). Is this feasible?
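A number generator such as randInt can be mimicked in Python. The sketch below assumes the dictionary entries have been numbered 1 through total_entries; the value 100,000 is a placeholder, and the function name is hypothetical.

```python
import random

def pick_entry_numbers(total_entries, n, seed=None):
    """Draw n distinct entry numbers from 1..total_entries.

    random.sample draws without replacement, so every entry, and every
    set of n entries, is equally likely to be chosen.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, total_entries + 1), n))

# Placeholder size: substitute the real number of entries in your dictionary.
entry_numbers = pick_entry_numbers(total_entries=100_000, n=25, seed=7)
print(entry_numbers)
```

Look up each numbered entry and mark whether you know it; the percent known in the sample estimates the percent known in the whole dictionary. The hard part in practice is numbering the entries, which is why feasibility is the real question.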
Slide 10
Dictionary Word Chase
Stratified: treat each letter as a stratum, then use the number generator to randomly select words from every letter. Is this feasible?
Cluster: randomly pick a page (or pages), then a column, then take all the words in that column (or those columns).
Slide 11
Dictionary Word Chase
Systematic: randomly select a page, randomly select a starting word, then select words at a specified interval. An advantage of systematic sampling is that it is easy to use.
Slide 12
Resource: Much of the information for the following slides was
taken from: http://stattrek.com/
Slide 13
Data Collection Methods: Pros and Cons Each method of data
collection has advantages and disadvantages. Resources. When the
population is large, a sample survey has a big resource advantage
over a census. A well-designed sample survey can provide very
precise estimates of population parameters - quicker, cheaper, and
with less manpower than a census.
Slide 14
Data Collection Methods: Pros and Cons Generalizability.
Generalizability refers to the appropriateness of applying findings
from a study to a larger population. Generalizability requires
random selection. If participants in a study are randomly selected
from a larger population, it is appropriate to generalize study
results to the larger population; if not, it is not appropriate to
generalize. Observational studies often do not feature random
selection; when they do not, it is not appropriate to generalize
from their results to a larger population.
Slide 15
Data Collection Methods: Pros and Cons Causal inference.
Cause-and-effect relationships can be teased out when subjects are
randomly assigned to groups. Therefore, experiments, which allow
the researcher to control assignment of subjects to treatment
groups, are the best method for investigating causal
relationships.
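Random assignment can be sketched in Python as follows. The subject IDs, group sizes, and function name are hypothetical; the point is only that chance, not the researcher or the subject, decides who receives the treatment.

```python
import random

def assign_groups(subjects, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)          # chance decides the order
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical subject IDs 1 through 20.
treatment, control = assign_groups(range(1, 21), seed=1)
print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

Because the split is random, differences between the groups other than the treatment tend to balance out, which is what allows a cause-and-effect reading of the results.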
Slide 16
Test Your Understanding of This Lesson
Which of the following statements are true?
I. A sample survey is an example of an experimental study.
II. An observational study requires fewer resources than an experiment.
III. The best method for investigating causal relationships is an observational study.
(A) I only (B) II only (C) III only (D) All of the above. (E) None of the above.
Slide 17
Test Your Understanding of This Lesson Solution The correct
answer is (E). In a sample survey, the researcher does not assign
treatments to survey respondents. Therefore, a sample survey is not
an experimental study; rather, it is an observational study. An
observational study may or may not require fewer resources (time,
money, manpower) than an experiment. The best method for
investigating causal relationships is an experiment - not an
observational study - because an experiment features randomized
assignment of subjects to treatment groups.
Slide 18
Survey Sampling Methods Probability vs. Non-Probability Samples
As a group, sampling methods fall into one of two categories.
Probability samples. With probability sampling methods, each
population element has a known (non-zero) chance of being chosen
for the sample. Non-probability samples. With non-probability
sampling methods, we do not know the probability that each
population element will be chosen, and/or we cannot be sure that
each population element has a non-zero chance of being chosen.
Slide 19
Survey Sampling Methods Non-probability samples. With
non-probability sampling methods, we do not know the probability
that each population element will be chosen, and/or we cannot be
sure that each population element has a non-zero chance of being
chosen. Non-probability sampling methods offer two potential
advantages - convenience and cost. The main disadvantage is that
non-probability sampling methods do not allow you to estimate the
extent to which sample statistics are likely to differ from
population parameters. Only probability sampling methods permit
that kind of analysis.
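For example, under simple random sampling with a reasonably large sample, one standard approximation for the margin of error of a sample proportion is z * sqrt(p(1 - p) / n). The sketch below assumes that approximation and a 95% confidence level (z = 1.96); the function name and the survey numbers are made up for illustration.

```python
from math import sqrt

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion under SRS."""
    return z * sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical survey: 1,000 respondents, 50% answering "yes".
moe = margin_of_error(p_hat=0.5, n=1_000)
print(f"estimate: 50% +/- {100 * moe:.1f} percentage points")
```

No comparable calculation is available for a voluntary or convenience sample, because the selection probabilities are unknown.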
Slide 20
Non-Probability Sampling Methods Two of the main types of
non-probability sampling methods are voluntary samples and
convenience samples. Voluntary sample. A voluntary sample is made
up of people who self-select into the survey. Often, these folks
have a strong interest in the main topic of the survey. Suppose,
for example, that a news show asks viewers to participate in an
online poll. This would be a voluntary sample. The sample is
chosen by the viewers, not by the survey administrator.
Slide 21
Non-Probability Sampling Methods Convenience sample. A
convenience sample is made up of people who are easy to reach.
Consider the following example. A pollster interviews shoppers at a
local mall. If the mall was chosen because it was a convenient site
from which to solicit survey participants and/or because it was
close to the pollster's home or business, this would be a
convenience sample.
Slide 22
Probability Sampling Methods The main types of probability
sampling methods are simple random sampling, stratified sampling,
cluster sampling, multistage sampling, and systematic random
sampling. The key benefit of probability sampling methods is that
they tend to produce samples that are representative of the
population, and they make it possible to estimate how far sample
statistics are likely to be from population parameters. No sampling
method can guarantee that a particular sample is representative.
Slide 23
Probability Sampling Methods Simple random sampling. Simple
random sampling refers to any sampling method that has the
following properties. The population consists of N objects. The
sample consists of n objects. If all possible samples of n objects
are equally likely to occur, the sampling method is called simple
random sampling. There are many ways to obtain a simple random
sample. One way would be the lottery method. Each of the N
population members is assigned a unique number. The numbers are
placed in a bowl and thoroughly mixed. Then, a blind-folded
researcher selects n numbers. Population members having the
selected numbers are included in the sample.
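The "all possible samples are equally likely" property can be checked empirically on a tiny population. The sketch below is an illustration under assumed values (N = 5, n = 2, 20,000 repeated draws); the function name is hypothetical. It uses random.sample as a stand-in for the bowl of numbers.

```python
import random
from collections import Counter
from itertools import combinations

def lottery_draw(population, n, rng):
    """Draw n members without replacement, like numbers from a mixed bowl."""
    return tuple(sorted(rng.sample(population, n)))

rng = random.Random(0)
population = [1, 2, 3, 4, 5]
trials = 20_000
draws = Counter(lottery_draw(population, 2, rng) for _ in range(trials))

# There are 10 possible samples of size 2; each should appear about 10% of the time.
possible = list(combinations(population, 2))
for sample in possible:
    print(sample, round(draws[sample] / trials, 3))
```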
Slide 24
Probability Sampling Methods Stratified sampling. With
stratified sampling, the population is divided into groups, based
on some characteristic. Then, within each group, a probability
sample (often a simple random sample) is selected. In stratified
sampling, the groups are called strata. As an example, suppose we
conduct a national survey. We might divide the population into
groups or strata, based on geography - north, east, south, and
west. Then, within each stratum, we might randomly select survey
respondents.
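The geography example can be sketched in Python. The respondent IDs, the stratum sizes, and the function name are hypothetical; the sketch assumes a simple random sample of equal size is drawn within each stratum.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of n_per_stratum members from every stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, n_per_stratum)
        for name, members in strata.items()
    }

# Hypothetical respondent IDs grouped by region.
strata = {
    "north": list(range(0, 100)),
    "east": list(range(100, 200)),
    "south": list(range(200, 300)),
    "west": list(range(300, 400)),
}
sample = stratified_sample(strata, n_per_stratum=5, seed=1)
for region, members in sample.items():
    print(region, sorted(members))
```

Every region is guaranteed to appear in the sample, which a simple random sample of 20 respondents would not guarantee.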
Slide 25
Probability Sampling Methods Cluster sampling. With cluster
sampling, every member of the population is assigned to one, and
only one, group. Each group is called a cluster. A sample of
clusters is chosen, using a probability method (often simple random
sampling). Only individuals within sampled clusters are surveyed.
Note the difference between cluster sampling and stratified
sampling. With stratified sampling, the sample includes elements
from each stratum. With cluster sampling, in contrast, the sample
includes elements only from sampled clusters.
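A minimal sketch of cluster sampling follows. The classrooms, names, and function name are invented for illustration; the sketch assumes clusters are chosen by simple random sampling and that every member of a chosen cluster is surveyed.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose n_clusters clusters and keep every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical clusters: students grouped by classroom.
clusters = {
    "room_a": ["Ana", "Ben", "Cal"],
    "room_b": ["Dee", "Eli"],
    "room_c": ["Fay", "Gus", "Hal", "Ida"],
    "room_d": ["Jon", "Kim"],
}
sample = cluster_sample(clusters, n_clusters=2, seed=3)
print(sample)
```

Unlike the stratified case, two of the four classrooms contribute nobody to the sample.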
Slide 26
Probability Sampling Methods Systematic random sampling. With
systematic random sampling, we create a list of every member of the
population. From the list, we randomly select the first sample
element from the first k elements on the population list.
Thereafter, we select every kth element on the list. This method is
different from simple random sampling because not every possible
sample of n elements is equally likely.
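The procedure can be sketched in Python using list slicing. The population of 100 numbered members, the interval k = 10, and the function name are assumptions chosen for illustration.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k elements, then take every kth element."""
    rng = random.Random(seed)
    start = rng.randrange(k)      # index 0..k-1, i.e. one of the first k elements
    return population[start::k]

# Hypothetical population list: members numbered 1 through 100.
population = list(range(1, 101))
sample = systematic_sample(population, k=10, seed=5)
print(sample)
```

With k = 10 there are only 10 possible samples, one per starting position, whereas simple random sampling of 10 members from 100 allows every one of the comb(100, 10) combinations.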
Slide 27
Probability Sampling Methods Multistage sampling. With
multistage sampling, we select a sample by using combinations of
different sampling methods. For example, in Stage 1, we might use
cluster sampling to choose clusters from a population. Then, in
Stage 2, we might use simple random sampling to select a subset of
elements from each chosen cluster for the final sample.
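The two-stage example can be sketched as follows. The schools, student labels, sample sizes, and function name are hypothetical; the sketch assumes simple random sampling is used at both stages.

```python
import random

def multistage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: choose clusters at random. Stage 2: SRS within each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)            # stage 1
    return {
        name: rng.sample(clusters[name], n_per_cluster)          # stage 2
        for name in chosen
    }

# Hypothetical population: 6 schools with 30 students each.
clusters = {
    f"school_{i}": [f"s{i}_{j}" for j in range(30)]
    for i in range(6)
}
sample = multistage_sample(clusters, n_clusters=2, n_per_cluster=5, seed=11)
print(sample)
```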
Slide 28
Test Your Understanding
An auto analyst is conducting a satisfaction survey, sampling from a list of 10,000 new car buyers. The list includes 2,500 Ford buyers, 2,500 GM buyers, 2,500 Honda buyers, and 2,500 Toyota buyers. The analyst selects a sample of 400 car buyers by randomly sampling 100 buyers of each brand. Is this an example of a simple random sample?
(A) Yes, because each buyer in the sample was randomly sampled.
(B) Yes, because each buyer in the sample had an equal chance of being sampled.
(C) Yes, because car buyers of every brand were equally represented in the sample.
(D) No, because not every possible 400-buyer sample had an equal chance of being chosen.
(E) No, because the population consisted of purchasers of four different brands of car.
Slide 29
Test Your Understanding Solution The correct answer is (D). A
simple random sample requires that every sample of size n (in this
problem, n is equal to 400) have an equal chance of being selected.
In this problem, there was a 100 percent chance that the sample
would include 100 purchasers of each brand of car. There was zero
percent chance that the sample would include, for example, 99 Ford
buyers, 101 Honda buyers, 100 Toyota buyers, and 100 GM buyers.
Thus, not all possible samples of size 400 had an equal chance
of being selected, so this cannot be a simple random sample.
Slide 30
Test Your Understanding of This Lesson The fact that each buyer
in the sample was randomly sampled is a necessary condition for a
simple random sample, but it is not sufficient. Similarly, the fact
that each buyer in the sample had an equal chance of being selected
is characteristic of a simple random sample, but it is not
sufficient. The sampling method in this problem used random
sampling and gave each buyer an equal chance of being selected; but
the sampling method was actually stratified random sampling.
The fact that car buyers of every brand were equally represented in
the sample is irrelevant to whether the sampling method was simple
random sampling. Similarly, the fact that the population consisted
of buyers of different car brands is irrelevant.
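The arithmetic behind this answer can be checked directly. The sketch below uses Python's math.comb to count samples; the variable names are illustrative.

```python
from math import comb

# Each buyer's chance of selection is the same under both designs.
p_simple_random = 400 / 10_000     # SRS of 400 from 10,000
p_stratified = 100 / 2_500         # 100 from each brand's 2,500
print(p_simple_random, p_stratified)

# The number of possible samples is not the same.
srs_samples = comb(10_000, 400)            # any 400 of the 10,000 buyers
stratified_samples = comb(2_500, 100) ** 4 # exactly 100 from each of 4 brands
print(stratified_samples < srs_samples)
```

Every buyer has a 4% chance of selection either way, but the analyst's design can only produce the samples with exactly 100 buyers per brand, a strict subset of the samples a simple random sample could produce.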
Slide 31
Bias in Survey Sampling In survey sampling, bias refers to the
tendency of a sample statistic to systematically overestimate or
underestimate a population parameter.
Slide 32
Bias Due to Unrepresentative Samples A good sample is
representative. This means that each sample point represents the
attributes of a known number of population
elements. Bias often occurs when the survey sample
does not accurately represent the population. The bias that results
from an unrepresentative sample is called selection bias. Some
common examples of selection bias are described below.
Slide 33
Bias Due to Unrepresentative Samples Undercoverage.
Undercoverage occurs when some members of the population are
inadequately represented in the sample. A classic example of
undercoverage is the Literary Digest voter survey, which predicted
that Alfred Landon would beat Franklin Roosevelt in the 1936
presidential election. The survey sample suffered from
undercoverage of low-income voters, who tended to be Democrats. How
did this happen? The survey relied on a convenience sample, drawn
from telephone directories and car registration lists. In 1936,
people who owned cars and telephones tended to be more affluent.
Undercoverage is often a problem with convenience
samples.
Slide 34
Bias Due to Unrepresentative Samples Nonresponse bias.
Sometimes, individuals chosen for the sample are unwilling or
unable to participate in the survey. Nonresponse bias is the bias
that results when respondents differ in meaningful ways from
nonrespondents. The Literary Digest survey illustrates this
problem. Respondents tended to be Landon supporters; and
nonrespondents, Roosevelt supporters. Since only 25% of the sampled
voters actually completed the mail-in survey, survey results
overestimated voter support for Alfred Landon. The Literary Digest
experience illustrates a common problem with mail surveys. Response
rate is often low, making mail surveys vulnerable to nonresponse
bias.
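The mechanism can be illustrated with a small calculation. All numbers below are hypothetical, chosen only so that the overall response rate comes out to 25%; they are not the actual 1936 figures, and the function name is invented.

```python
def observed_share(true_share, response_rate_a, response_rate_b):
    """Return (share of returned surveys favoring A, overall response rate)."""
    returned_a = true_share * response_rate_a          # A supporters who respond
    returned_b = (1 - true_share) * response_rate_b    # B supporters who respond
    overall_rate = returned_a + returned_b
    return returned_a / overall_rate, overall_rate

# Hypothetical: 40% truly support A; A supporters respond at 40%, B supporters at 15%.
share, rate = observed_share(true_share=0.40, response_rate_a=0.40, response_rate_b=0.15)
print(f"overall response rate: {rate:.0%}")
print(f"share for A among respondents: {share:.0%}")
```

With these made-up numbers, candidate A's true support is 40%, but about 64% of the returned surveys favor A, because A's supporters are more likely to mail the survey back.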
Slide 35
Bias Due to Unrepresentative Samples Voluntary response bias.
Voluntary response bias occurs when sample members are
self-selected volunteers, as in voluntary samples. An example would be
call-in radio shows that solicit audience participation in surveys
on controversial topics (abortion, affirmative action, gun control,
etc.). The resulting sample tends to overrepresent individuals who
have strong opinions.
Slide 36
Bias Due to Unrepresentative Samples Random sampling is a
procedure for sampling from a population in which (a) the selection
of a sample unit is based on chance and (b) every element of the
population has a known, non-zero probability of being selected.
Random sampling helps produce representative samples by eliminating
voluntary response bias and guarding against undercoverage bias.
All probability sampling methods rely on random sampling.
Slide 37
Bias Due to Measurement Error A poor measurement process can
also lead to bias. In survey research, the measurement process
includes the environment in which the survey is conducted, the way
that questions are asked, and the state of the survey
respondent.
Slide 38
Bias Due to Measurement Error Response bias refers to the bias
that results from problems in the measurement process. Some
examples of response bias are given below. Leading questions. The
wording of the question may be loaded in some way to unduly favor
one response over another. For example, a satisfaction survey may
ask the respondent to indicate whether she is satisfied,
dissatisfied, or very dissatisfied. By giving the respondent one
response option to express satisfaction and two response options to
express dissatisfaction, this survey question is biased toward
getting a dissatisfied response.
Slide 39
Bias Due to Measurement Error Social desirability. Most people
like to present themselves in a favorable light, so they will be
reluctant to admit to unsavory attitudes or illegal activities in a
survey, particularly if survey results are not confidential.
Instead, their responses may be biased toward what they believe is
socially desirable.
Slide 40
Test Your Understanding
Which of the following statements are true?
I. Random sampling is a good way to reduce response bias.
II. To guard against bias from undercoverage, use a convenience sample.
III. Increasing the sample size tends to reduce survey bias.
IV. To guard against nonresponse bias, use a mail-in survey.
(A) I only (B) II only (C) III only (D) IV only (E) None of the above.
Slide 41
Test Your Understanding The correct answer is (E). None of the
statements is true. Random sampling provides strong protection
against undercoverage bias and voluntary response bias, but it is
not effective against response bias. A convenience sample does not
protect against undercoverage bias; in fact, it sometimes causes
undercoverage bias. Increasing sample size reduces sampling
variability, but it does not reduce survey bias. And finally, using
a mail-in survey does not prevent nonresponse bias. In fact, mail-in
surveys are quite vulnerable to nonresponse bias.
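The claim about sample size can be illustrated with a small simulation of undercoverage. Everything here is hypothetical: the population sizes, the support rates, and the function names are invented, and the "listed" group stands in for people reachable through a convenient list such as a telephone directory.

```python
import random

def build_population():
    """Return (listed, unlisted) voters; 1 = supports candidate A, 0 = does not."""
    listed = [1] * 18_000 + [0] * 12_000      # 60% support among listed voters
    unlisted = [1] * 21_000 + [0] * 49_000    # 30% support among unlisted voters
    return listed, unlisted

def estimate_support(frame, n, rng):
    """Estimate support from a simple random sample of n voters in the frame."""
    sample = rng.sample(frame, n)
    return sum(sample) / n

rng = random.Random(0)
listed, unlisted = build_population()
true_support = (sum(listed) + sum(unlisted)) / (len(listed) + len(unlisted))

# Sample only from the listed voters, with increasing sample sizes.
estimates = {n: estimate_support(listed, n, rng) for n in (100, 1_000, 10_000)}
print(f"true support: {true_support:.2f}")
for n, estimate in estimates.items():
    print(f"n = {n:>6}: estimate = {estimate:.2f}")
```

In this made-up population the true support is 39%, yet the estimates center on 60%. Larger samples only cluster more tightly around the wrong value, because the unlisted voters can never be selected.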