Upload
hannah-ball
View
215
Download
0
Tags:
Embed Size (px)
Citation preview
Producing data: sampling and experiments
BPS chapters 8 & 9
© 2006 W. H. Freeman and Company
From Exploration to Inference (p 186)Exploratory Data
AnalysisStatistical Inference
Purpose is unrestricted exploration of the data, searching for interesting
patterns.
Purpose is to answer specific questions, posed before the data were
produced.
Conclusions apply only to the individuals and circumstances for
which we have data in hand.
Conclusions apply to a larger group of individuals or a broader class of
circumstances.
Conclusions are informal, based on what we see in the data.
Conclusions are formal, backed by a statement of our confidence in them.
Objectives (BPS chapter 8)
Producing data: sampling
Observation versus experiment
Population versus sample
Sampling methods
How to sample badly
Simple random samples
Other sampling designs
Caution about sample surveys
Learning about populations from samples (inference)
Observation versus experiment (p 190)Observational study: Record data on individuals without attempting to
influence the responses. We typically cannot prove causation this way.
Example 8.1: Observational study on hormone replacement.
Experimental study: Deliberately impose a treatment on individuals and record their responses. Influential factors can be controlled.
Example 8.1: Experimental study on hormone replacement.
Confounding (p 191)Two variables (explanatory variables or lurking variables)
are confounded when their effects on a response
variable cannot be distinguished from each other.
Well-designed experiments take steps to defeat confounding.
Lurking:Richer/Better Educated
Reduced Risk of Heart Attack
Hormone Replacement
CAUSE?
Confounding?Confounding?
Observational studies of the effect of one variable on another often fail because
the explanatory variable is confounded with lurking variables.
Confounding?Confounding?
Population versus sample (p 192) Sample: The part of the
population we actually examine
and for which we do have data
How well the sample represents the population depends on the sample design.
Population: The entire group
of individuals in which we are
interested but can’t usually
assess directly
Example: All humans, all working-age people in California, all crickets
A parameter is a number
describing a characteristic of
the population.
A statistic is a number
describing a characteristic of a
sample.
Population
Sample
We want samples that are representative of the population
Convenience sampling: Just ask whoever is around.
Example: “Man on the street” survey (cheap, convenient, often quite
opinionated or emotional → now very popular with TV “journalism”)
Which men, and on which street?
Ask about gun control or legalizing marijuana “on the street” in
Berkeley, CA and in some small town in Idaho and you would probably
get totally different answers.
Even within an area, answers would probably differ if you did the
survey outside a high school or a country-western bar.
Bias: Opinions limited to individuals present
Bad sampling methods (p 194-195)
Voluntary Response Sample:
A voluntary response sample consists of people who choose
themselves for inclusion in the sample in response to a broad
appeal. Voluntary response samples are prone to bias because people with
strong opinions are most likely to respond.
Bias: Sample design systematically favors a particular outcome.
Ann Landers summarizing responses of readers:
Seventy percent of (10,000) parents wrote in to say that having
kids was not worth it—if they had to do it over again, they
wouldn’t.
Bias: Most letters to newspapers are written by disgruntled
people. A random sample showed that 91% of parents
WOULD have kids again.
CNN on-line surveys:
Bias: People have to care enough about an issue to bother replying. This sample
is probably a combination of people who hate “wasting the taxpayers’ money”
and “animal lovers.”
Good sampling methods
Probability or random sampling:
Individuals are randomly selected. No one group should be over-
represented.
Random samples rely on the absolute objectivity of random numbers. There are books and tables of random digits available for random sampling.
Statistical software cangenerate random digits (e.g., Excel “=random()”).
Sampling randomly gets rid of bias.
Sampling randomly controls for lurking variables
Simple random samples (p 196)The simple random sample (SRS) is made of randomly selected
individuals. Each individual in the population has the same probability of
being in the sample. All possible samples of size n have the same
chance of being drawn.
How to choose an SRS of size n from a population of size N using the TI-8x randInt function:
Label the individuals in your population from 1 to N.
(optional) Store a “seed” value in the rand register of the TI (the seed will be given to you):
Generate 2n random integers from 1 to N and store them in a list.
Select the n corresponding individuals for your sample
19 STO rand MATH | PRB menuFor a seed of 19
randInt(1,N,n) STO L6MATH|PRB menu
01 Alison02 Amy03 Brigitte04 Darwin05 Emily06 Fernando07 George08 Harry09 Henry10 John11 Kate12 Max13 Moe14 Nancy15 Ned16 Paul17 Ramon18 Rupert19 Tom20 Victoria
Example: Choosing a simple random sample using the randInt function
01 Alison02 Amy03 Brigitte04 Darwin05 Emily06 Fernando07 George08 Harry09 Henry10 John11 Kate12 Max13 Moe14 Nancy15 Ned16 Paul17 Ramon18 Rupert19 Tom20 Victoria
Problem: select a simple random sample of size 5 from a
class of size 20. The class members are listed to the right…
1. Number the members of the population.
2. Let’s use a seed value of 71 for the randInt function.
71 STO rand
3. Generate the sample, store it in list 6.
randInt(1,20,10) STO L6
4. Pick the corresponding individuals. (Skip duplicates.)
Simple random samples (p 196)How to choose an SRS of size n from a population of size N using a random number table.
Label. Give each member of the population a numerical label of the same length.
Table. To choose an SRS, read from Table B (page 686) successive groups of digits of the length you used as labels. Your sample contains the individuals whose labels you find in the table.
Example: Choosing a simple random sample using a table of random numbers (Table B page 686)Problem: select a simple random sample of size 5 from a
class of size 20. The class members are listed to the right…
Step 1: Assign a numerical label to each member of the
population. Use leading zeros to ensure that each label
has the same number of digits.
In our case there are 20 individuals in the
population, so all labels must use 2 digits
Step 2: Go to the table of random digits, and select a
row at random. (Or, I may tell you which row to start
with.)
01 Alison02 Amy03 Brigitte04 Darwin05 Emily06 Fernando07 George08 Harry09 Henry10 John11 Kate12 Max13 Moe14 Nancy15 Ned16 Paul17 Ramon18 Rupert19 Tom20 Victoria
01 Alison02 Amy03 Brigitte04 Darwin05 Emily06 Fernando07 George08 Harry09 Henry10 John11 Kate12 Max13 Moe14 Nancy15 Ned16 Paul17 Ramon18 Rupert19 Tom20 Victoria
Example: Choosing a simple random sample using a table of random numbers (Table B page 686) 01 Alison
02 Amy03 Brigitte04 Darwin05 Emily06 Fernando07 George08 Harry09 Henry10 John11 Kate12 Max13 Moe14 Nancy15 Ned16 Paul17 Ramon18 Rupert19 Tom20 Victoria
For this example, let’s start with line 103…
Step 2: Go to the table of random digits, and select a
row at random. (Or, I may tell you which row to start
with.)
Example: Choosing a simple random sample using a table of random numbers (Table B page 686) 01 Alison
02 Amy03 Brigitte04 Darwin05 Emily06 Fernando07 George08 Harry09 Henry10 John11 Kate12 Max13 Moe14 Nancy15 Ned16 Paul17 Ramon18 Rupert19 Tom20 Victoria
For this example, let’s start with line 103…
Our data labels have two digits, so we parse the
random number table into blocks of length 2…
Step 3: Parse the random numbers into blocks of the
same length as your data labels.
Example: Choosing a simple random sample using a table of random numbers (Table B page 686) 01 Alison
02 Amy03 Brigitte04 Darwin05 Emily06 Fernando07 George08 Harry09 Henry10 John11 Kate12 Max13 Moe14 Nancy15 Ned16 Paul17 Ramon18 Rupert19 Tom20 Victoria
Our data labels have two digits, so we parse the
random number table into blocks of length 2…
Step 3: Parse the random numbers into blocks of the
same length as your data labels.
45 46 77 17 09 77 55 80 00 95 32 86 32 94 85 82 22 69 00 56
Example: Choosing a simple random sample using a table of random numbers (Table B page 686) 01 Alison
02 Amy03 Brigitte04 Darwin05 Emily06 Fernando07 George08 Harry09 Henry10 John11 Kate12 Max13 Moe14 Nancy15 Ned16 Paul17 Ramon18 Rupert19 Tom20 Victoria
Step 4: Read through your parsed numbers in order,
and each time you come to one of your labels, select it
for the sample.
45 46 77 17 09 77 55 80 00 95 32 86 32 94 85 82 22 69 00 56
Example: Choosing a simple random sample using a table of random numbers (Table B page 686) 01 Alison
02 Amy03 Brigitte04 Darwin05 Emily06 Fernando07 George08 Harry09 Henry10 John11 Kate12 Max13 Moe14 Nancy15 Ned16 Paul17 Ramon18 Rupert19 Tom20 Victoria
Step 5: Continue line-by-line through the table until
you’ve got the number of samples that you need.
45 46 77 17 09 77 55 80 00 95 32 86 32 94 85 82 22 69 00 56
52 71 13 88 89 93 07 46 02 27 40 01 18 58 48 48 76 75 25 73
If you pick a sample that you’d already picked
previously, just skip it.
Stratified samplesA stratified random sample is essentially a series of SRS performed
on subgroups of a given population. The subgroups are chosen to
contain all the individuals with a certain characteristic. For example: Divide the population of UCI students into males and females.
Divide the population of California by major ethnic group.
Divide the counties in America as either urban or rural based on a
criterion of population density.
The SRS taken within each group in a stratified random sample need
not be of the same size. For example: Stratified random sample of 100 male and 150 female UCI students
Stratified random sample of a total of 100 Californians, representing
proportionately the major ethnic groups
Multistage samples use multiple stages of stratification. They are often
used by the government to obtain information about the U.S. population.
Example: Sampling both urban and rural areas, people in different ethnic
and income groups within the urban and rural areas, and then individuals
of different political orientation within those strata.
Data are obtained by taking an SRS for each substrata.
Statistical analysis for
multistage samples is more
complex than for an SRS.
Caution about sampling surveys (p 201) Nonresponse: People may simply refuse to participate. Yet they are
part of the population, and should be represented. Nonresponse can
introduce bias into results, especially if those who refuse to participate
come predominately from one subgroup of the population.
Response bias: Fancy term for lying when you think you should not tell
the truth. Like if your family doctor asks: “How much do you drink?”
People also simply forget, give erroneous answers, or avoid giving
answers that show them in a potentially bad light. Example: a survey of
white voters asking: “Does a candidate’s race affect your decision of
whether a not to vote for him or her?” (Bradley effect)
Wording effects: Questions worded like “Don’t you agree that it is
awful that…” are prompting you to give a particular response.
Undercoverage
Undercoverage occurs when parts of the population
are left out in the process of choosing the sample.
Because the U.S. Census goes “house to house,” homeless people
are not represented. Illegal immigrants also avoid being counted.
Geographical districts with a lot of undercoverage tend to be poor
ones. Representatives from richer areas typically strongly oppose
statistical adjustment of the census.
Historically, clinical trials have avoided including women in
their studies because of their periods and the chance of
pregnancy. This means that medical treatments were not
appropriately tested for women. This problem is slowly
being recognized and addressed.
1) To assess the opinions of students at The Ohio State University regarding campus safety, a reporter interviews 15 students he meets walking on the campus late at night who are willing to give their opinions.
What is the sample here? What is the population? Has the reporter chosen a random sample of students?
2) An SRS of 1200 adult Americans is selected and asked: “In light of the huge national deficit, should the government at this time spend additional money to establish a national system of health insurance?” Thirty-nine percent of those responding answered yes.
What is the sample here? What is the population? What else can you say about this survey?
Should you trust the results of the first survey? Of the second? Why?
Learning about populations from samples
The techniques of inferential statistics allow us to draw inferences or
conclusions about a population from a sample.
Your estimate of the population is only as good as your sampling design
Work hard to eliminate biases.
Your sample is only an estimate—and if you randomly sampled again,
you would probably get a somewhat different result.
The bigger the sample the better. We’ll get back to it in later chapters.
Population
Sample
Objectives (BPS chapter 9)
Producing data: experiments
Experiments
How to experiment badly
Randomized comparative experiments
The logic of randomized comparative experiments
Cautions about experimentation
Matched pairs and other block designs
Terminology
The individuals in an experiment are the experimental units. If they are human, we call them subjects.
The explanatory variables in an experiment are often called factors.
A treatment is any specific experimental condition applied to the subjects. If an experiment has several factors, a treatment is a
combination of specific values of each factor.
The factor may be the administration of a drug.
One group of people may be placed on a diet/exercise program for 6 months (treatment), and their blood pressure (response variable) would be
compared with that of people who did not diet or exercise.
If the experiment involves giving two different doses of a drug, we
say that we are testing two levels of the factor.
A response to a treatment is statistically significant if it is larger
than you would expect by chance (due to random variation among
the subjects). We will learn how to determine this later.
In a study of sickle cell anemia, 150 patients were given the drug hydroxyurea, and 150
were given a placebo (dummy pill). The researchers counted the episodes of pain in
each subject. Identify:
•The subjects : the 300 patients
•The factors : hydroxyurea and placebo
•The treatments :
• Treatment 1 : hydroxyurea
• Treatment 2 : placebo
•And the response variable : number of episodes of pain
How to experiment badlySubjects Treatment Measure response
In a controlled environment of a laboratory (especially if human
subjects are not being used), a simple design like this one, where all
subjects receive the same treatment, can work well.
Field experiments and experiments with human subjects are
exposed to more variable conditions and deal with more variable
subjects.
A simple design often yields worthless results because of
confounding with lurking variables.
3 Principles of experimental design Control the effects of lurking variables on the response, most
simply by comparing two or more treatments.
Randomize—use impersonal chance to assign subjects to
treatments.
Replicate—use enough subjects in each group to reduce chance
variation in the results.
Principles of comparative experimentsMost experiments are comparative in nature: We compare the
response to a treatment versus to: another treatment no treatment (a control) a placebo or any combination of the above
A control is a situation in which no treatment is administered. It serves
as a reference mark for an actual treatment (e.g., a group of subjects
does not receive any drug or pill of any kind).
A placebo is a fake treatment, such as a sugar pill. It is used to test
the hypothesis that the response to the treatment is due to the actual
treatment and not to how the subject is being taken care of.
About the placebo effect
The “placebo effect” is an improvement in health due not to
any treatment but only to the patient’s belief that he or she
will improve.
The placebo effect is not well understood, but it is believed
to have therapeutic results on a significant proportion of
patients.
It can sometimes ease the symptoms of a variety of ills,
from asthma to pain to high blood pressure and even to
heart attacks.
An opposite, or “negative placebo effect,” has been
observed when patients believe their health will get worse.
The best way to exclude biases in an experiment is to randomize
the design. Both the individuals and treatments are assigned
randomly.
A double-blind experiment is one in which neither the subjects nor
the experimenter know which individuals got which treatment until
the experiment is completed.
Getting rid of sampling biases
Another way to make sure your conclusions are robust is to replicate
your experiment—do it over. Replication ensures that particular results
are not due to uncontrolled factors or errors of manipulation.
In a completely randomized experimental design, individuals are
randomly assigned to groups, then the groups are randomly assigned
to treatments.
Completely randomized designs
Matched pairs: Choose pairs of subjects that are closely matched—
e.g., same sex, height, weight, age, and race. Within each pair,
randomly assign who will receive which treatment.
It is also possible to just use a single person and give the two
treatments to this person over time in random order. In this case, the
“matched pair” is just the same person at different points in time.
Matched pairs designs
Matched pairs designs are not always possible. But when they
are, they usually do a better job of controlling for lurking variables
than completely randomized designs.
Example: To assess the effects of genetics and environment on IQ,
study identical twins who have been separated at birth.
Example: To assess the effects cell phone usage on braking time, study
the reaction times of the same individual both with and without a cell.
In a block design, subjects are divided into groups, or blocks, prior
to the experiment to test hypotheses about differences between the
groups.
The blocking, or stratification, here is by gender.
Block designs
Example: Advertising
What are the effects of repeated exposure to advertising messages? Do these effects vary between men and women?
The effectiveness of an advertising campaign may depend on both the length of the message, and the number of times a viewer is exposed to the message. In addition, there might be a difference between the way men and women percieve the ad. A researcher decides to study this question. He recruits 40 undergraduates as subjects for his experiment. He has two versions of the same ad for a digital camera; the first version is 30 seconds long, while the second version is 90 seconds long. He wants to assess the effects of showing the ad either 1, 3, or 5 times.
Discuss an appropriate design for the experiment.
What are the factors?
What are the treatments?
What are the blocks?
What is the appropriate design?
Factors:
1. Number of Repetitions
a) 3 values (or levels) of this variable: 1, 3, or 5 repetitions
2. Length of ad
a) 2 values (or levels) of this variable: 30 seconds or 90 seconds
Treatments:
All combinations of a value from each factor:
1 rep.30 sec.
2 rep.30 sec.
3 rep.30 sec.
1 rep.90 sec.
2 rep.90 sec.
3 rep.90 sec.
Factor:Ad Length
Factor:Num. Repetitions
30 sec.
90 sec.
1 rep. 2 rep. 3 rep.Treatment 1: 1 rep., 30 sec.
Treatment 2: 2 rep., 30 sec.
Treatment 3: 3 rep., 30 sec.
Treatment 4: 1 rep., 90 sec.
Treatment 5: 2 rep., 90 sec.
Treatment 6: 3 rep., 90 sec.
Blocks:
1. Men
2. Women
Design:
subjects
Men
Women
Group 1Group 2Group 3Group 4Group 5Group 6
Group 1Group 2Group 3Group 4Group 5Group 6
Treatment 1Treatment 2Treatment 3Treatment 4Treatment 5Treatment 6
Treatment 1Treatment 2Treatment 3Treatment 4Treatment 5Treatment 6
Compareresponse
Compareresponse
Compareresponse
Example: Alchohol and Reaction Times A study of the effects of alchohol on reaction time studied 50
undergraduate subjects. Reaction times for each subject were measured while playing a “shooter” video game. Each subject was tested twice. In one test, the subject was asked to drink a beer 10 minutes before playing the game. In the other test, the subject was asked to drink a non-alchoholic beer 10 minutes before playing the game. (The order of the two tests was varied from subject to subject, and the subject was not given any information about which test involved the real beer, and which involved the NA beer.) The difference in average reaction times was compared for the two tests. This is an example of a1. Observational study
2. Completely randomized experimental design.
3. Matched pairs experimental design.
4. Block experimental design.