8/17/2019 Cu as i Experimental
1/81
EXPERIMENTAL AND QUASI-EXPERIMENTAL DESIGNS FOR GENERALIZED CAUSAL INFERENCE

William R. Shadish, The University of Memphis
Thomas D. Cook, Northwestern University
Donald T. Campbell

HOUGHTON MIFFLIN COMPANY
Boston New York
Experiments and Generalized Causal Inference
Ex·per·i·ment (ĭk-spĕr'ə-mənt): [Middle English from Old French from Latin experimentum, from experiri, to try; see per- in Indo-European Roots.] n. Abbr. exp., expt. 1. a. A test under controlled conditions that is made to demonstrate a known truth, examine the validity of a hypothesis, or determine the efficacy of something previously untried. b. The process of conducting such a test; experimentation. 2. An innovative act or procedure: "Democracy is only an experiment in government" (William Ralph Inge).

Cause (kôz): [Middle English from Old French from Latin causa, reason, purpose.] n. 1. a. The producer of an effect, result, or consequence. b. The one, such as a person, an event, or a condition, that is responsible for an action or a result. v. 1. To be the cause of or reason for; result in. 2. To bring about or compel by authority or force.
To many historians and philosophers, the increased emphasis on experimentation in the 16th and 17th centuries marked the emergence of modern science from its roots in natural philosophy (Hacking, 1983). Drake (1981) cites Galileo's 1612 treatise Bodies That Stay Atop Water, or Move in It as ushering in modern experimental science, but earlier claims can be made favoring William Gilbert's 1600 study On the Loadstone and Magnetic Bodies, Leonardo da Vinci's (1452-1519) many investigations, and perhaps even the 5th-century B.C. philosopher Empedocles, who used various empirical demonstrations to argue against Parmenides (Jones, 1969a, 1969b). In the everyday sense of the term, humans have been experimenting with different ways of doing things from the earliest moments of their history. Such experimenting is as natural a part of our life as trying a new recipe or a different way of starting campfires.
However, the scientific revolution of the 17th century departed in three ways from the common use of observation in natural philosophy at that time. First, it increasingly used observation to correct errors in theory. Throughout history, natural philosophers often used observation in their theories, usually to win philosophical arguments by finding observations that supported their theories. However, they still subordinated the use of observation to the practice of deriving theories from "first principles," starting points that humans know to be true by our nature or by divine revelation (e.g., the assumed properties of the four basic elements of fire, water, earth, and air in Aristotelian natural philosophy). According to some accounts, this subordination of evidence to theory degenerated in the 17th century: "The Aristotelian principle of appealing to experience had degenerated among philosophers into dependence on reasoning supported by casual examples and the refutation of opponents by pointing to apparent exceptions not carefully examined" (Drake, 1981, p. xxi). When some 17th-century scholars then began to use observation to correct apparent errors in theoretical and religious first principles, they came into conflict with religious or philosophical authorities, as in the case of the Inquisition's demands that Galileo recant his account of the earth revolving around the sun. Given such hazards, the fact that the new experimental science tipped the balance toward observation and away from dogma is remarkable. By the time Galileo died, the role of systematic observation was firmly entrenched as a central feature of science, and it has remained so ever since (Harré, 1981).
Second, before the 17th century, appeals to experience were usually based on passive observation of ongoing systems rather than on observation of what happens after a system is deliberately changed. After the scientific revolution in the 17th century, the word experiment (terms in boldface in this book are defined in the Glossary) came to connote taking a deliberate action followed by systematic observation of what occurred afterward. As Hacking (1983) noted of Francis Bacon: "He taught that not only must we observe nature in the raw, but that we must also 'twist the lion's tale', that is, manipulate our world in order to learn its secrets" (p. 149). Although passive observation reveals much about the world, active manipulation is required to discover some of the world's regularities and possibilities (Greenwood, 1989). As a mundane example, stainless steel does not occur naturally; humans must manipulate it into existence. Experimental science came to be concerned with observing the effects of such manipulations.
Third, early experimenters realized the desirability of controlling extraneous influences that might limit or bias observation. So telescopes were carried to higher points at which the air was clearer, the glass for microscopes was ground ever more accurately, and scientists constructed laboratories in which it was possible to use walls to keep out potentially biasing ether waves and to use (eventually sterilized) test tubes to keep out dust or bacteria. At first, these controls were developed for astronomy, chemistry, and physics, the natural sciences in which interest in science first bloomed. But when scientists started to use experiments in areas such as public health or education, in which extraneous influences are harder to control (e.g., Lind, 1753), they found that the controls used in natural science in the laboratory worked poorly in these new applications. So they developed new methods of dealing with extraneous influence, such as random assignment (Fisher, 1925) or adding a nonrandomized control group (Coover & Angell, 1907). As theoretical and observational experience accumulated across these settings and topics, more sources of bias were identified and more methods were developed to cope with them (Dehue, 2000).
Today, the key feature common to all experiments is still to deliberately vary something so as to discover what happens to something else later: to discover the effects of presumed causes. As laypersons we do this, for example, to assess what happens to our blood pressure if we exercise more, to our weight if we diet less, or to our behavior if we read a self-help book. However, scientific experimentation has developed increasingly specialized substance, language, and tools, including the practice of field experimentation in the social sciences that is the primary focus of this book. This chapter begins to explore these matters by (1) discussing the nature of causation that experiments test, (2) explaining the specialized terminology (e.g., randomized experiments, quasi-experiments) that describes social experiments, (3) introducing the problem of how to generalize causal connections from individual experiments, and (4) briefly situating the experiment within a larger literature on the nature of science.
EXPERIMENTS AND CAUSATION
A sensible discussion of experiments requires both a vocabulary for talking about causation and an understanding of key concepts that underlie that vocabulary.
Defining Cause, Effect, and Causal Relationships
Most people intuitively recognize causal relationships in their daily lives. For instance, you may say that another automobile's hitting yours was a cause of the damage to your car; that the number of hours you spent studying was a cause of your test grades; or that the amount of food a friend eats was a cause of his weight. You may even point to more complicated causal relationships, noting that a low test grade was demoralizing, which reduced subsequent studying, which caused even lower grades. Here the same variable (low grade) can be both a cause and an effect, and there can be a reciprocal relationship between two variables (low grades and not studying) that cause each other.
Despite this intuitive familiarity with causal relationships, a precise definition of cause and effect has eluded philosophers for centuries.1 Indeed, the definitions of terms such as cause and effect depend partly on each other and on the causal relationship in which both are embedded. So the 17th-century philosopher John Locke said: "That which produces any simple or complex idea, we denote by the general name cause, and that which is produced, effect" (1975, p. 324) and also: "A cause is that which makes any other thing, either simple idea, substance, or mode, begin to be; and an effect is that, which had its beginning from some other thing" (p. 325). Since then, other philosophers and scientists have given us useful definitions of the three key ideas (cause, effect, and causal relationship) that are more specific and that better illuminate how experiments work. We would not defend any of these as the true or correct definition, given that the latter has eluded philosophers for millennia; but we do claim that these ideas help to clarify the scientific practice of probing causes.

1. Our analysis reflects the use of the word causation in ordinary language, not the more detailed discussions of cause by philosophers. Readers interested in such detail may consult a host of works that we reference in this chapter, including Cook and Campbell (1979).
Cause
Consider the cause of a forest fire. We know that fires start in different ways: a match tossed from a car, a lightning strike, or a smoldering campfire, for example. None of these causes is necessary because a forest fire can start even when, say, a match is not present. Also, none of them is sufficient to start the fire. After all, a match must stay "hot" long enough to start combustion; it must contact combustible material such as dry leaves; there must be oxygen for combustion to occur; and the weather must be dry enough so that the leaves are dry and the match is not doused by rain. So the match is part of a constellation of conditions without which a fire will not result, although some of these conditions can be usually taken for granted, such as the availability of oxygen. A lighted match is, therefore, what Mackie (1974) called an inus condition: "an insufficient but nonredundant part of an unnecessary but sufficient condition" (p. 62; italics in original). It is insufficient because a match cannot start a fire without the other conditions. It is nonredundant only if it adds something fire-promoting that is uniquely different from what the other factors in the constellation (e.g., oxygen, dry leaves) contribute to starting a fire; after all, it would be harder to say whether the match caused the fire if someone else simultaneously tried starting it with a cigarette lighter. It is part of a sufficient condition to start a fire in combination with the full constellation of factors. But that condition is not necessary because there are other sets of conditions that can also start fires.
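The logic of an inus condition can be made concrete with a small truth-table check. The following is only a sketch under simplifying assumptions: the factor names and the rule inside `fire` are hypothetical stand-ins for the forest-fire example, not a model of real combustion, and only two ignition sources are considered.

```python
from itertools import product

# Hypothetical factor names for the forest-fire illustration.
FACTORS = ("match", "lightning", "oxygen", "dry_leaves", "dry_weather")


def fire(match, lightning, oxygen, dry_leaves, dry_weather):
    """Toy rule: either ignition source starts a fire, but only when the
    background conditions (oxygen, dry leaves, dry weather) all hold."""
    background = oxygen and dry_leaves and dry_weather
    return (match and background) or (lightning and background)


def all_states():
    """Yield every true/false combination of the factors as a dict."""
    for values in product((False, True), repeat=len(FACTORS)):
        yield dict(zip(FACTORS, values))


def is_necessary(factor):
    """A factor is necessary if no fire ever occurs without it."""
    return not any(fire(**s) for s in all_states() if not s[factor])


def is_sufficient(factor):
    """A factor is sufficient if a fire occurs whenever it is present."""
    return all(fire(**s) for s in all_states() if s[factor])


# One sufficient constellation of conditions that includes the match.
constellation = dict(match=True, lightning=False, oxygen=True,
                     dry_leaves=True, dry_weather=True)
without_match = dict(constellation, match=False)

print(is_sufficient("match"))   # False: insufficient by itself
print(is_necessary("match"))    # False: lightning offers another route
print(fire(**constellation))    # True: the full constellation is sufficient
print(fire(**without_match))    # False: the match is nonredundant within it
```

In this toy rule the match is insufficient alone, nonredundant within its constellation, and part of a condition that is sufficient but not necessary, which is the pattern Mackie's definition describes.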
A research example of an inus condition concerns a new potential treatment for cancer. In the late 1990s, a team of researchers in Boston headed by Dr. Judah Folkman reported that a new drug called Endostatin shrank tumors by limiting their blood supply (Folkman, 1996). Other respected researchers could not replicate the effect even when using drugs shipped to them from Folkman's lab. Scientists eventually replicated the results after they had traveled to Folkman's lab to learn how to properly manufacture, transport, store, and handle the drug and how to inject it in the right location at the right depth and angle. One observer labeled these contingencies the "in-our-hands" phenomenon, meaning "even we don't know which details are important, so it might take you some time to work it out" (Rowe, 1999, p. 732). Endostatin was an inus condition. It was insufficient cause by itself, and its effectiveness required it to be embedded in a larger set of conditions that were not even fully understood by the original investigators.
Most causes are more accurately called inus conditions. Many factors are usually required for an effect to occur, but we rarely know all of them and how they relate to each other. This is one reason that the causal relationships we discuss in this book are not deterministic but only increase the probability that an effect will occur (Eells, 1991; Holland, 1994). It also explains why a given causal relationship will occur under some conditions but not universally across time, space, human populations, or other kinds of treatments and outcomes that are more or less related to those studied. To different degrees, all causal relationships are context dependent, so the generalization of experimental effects is always at issue. That is why we return to such generalizations throughout this book.
Effect
We can better understand what an effect is through a counterfactual model that goes back at least to the 18th-century philosopher David Hume (Lewis, 1973, p. 556). A counterfactual is something that is contrary to fact. In an experiment, we observe what did happen when people received a treatment. The counterfactual is knowledge of what would have happened to those same people if they simultaneously had not received treatment. An effect is the difference between what did happen and what would have happened.
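The definition of an effect as a difference between two outcomes for the same person can be written down directly. This is a minimal sketch, assuming we could somehow know both outcomes; the class name, field names, and blood-pressure numbers are hypothetical and serve only to illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Both potential outcomes for one person (illustrative numbers only)."""
    name: str
    outcome_if_treated: float     # what happens with the treatment
    outcome_if_untreated: float   # what would happen without it


def effect(person):
    """Effect = outcome under treatment minus the outcome the same person
    would have had at the same time without treatment."""
    return person.outcome_if_treated - person.outcome_if_untreated


def observed_outcome(person, treated):
    """In practice only one of the two outcomes can ever be observed."""
    if treated:
        return person.outcome_if_treated
    return person.outcome_if_untreated


# Hypothetical blood pressure with and without an exercise program.
ana = Person("Ana", outcome_if_treated=120.0, outcome_if_untreated=130.0)
print(effect(ana))                   # -10.0
print(observed_outcome(ana, True))   # 120.0
```

The function `effect` needs both fields, whereas `observed_outcome` returns only one of them; that gap is why the counterfactual has to be approximated rather than observed.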
We cannot actually observe a counterfactual. Consider phenylketonuria (PKU), a genetically-based metabolic disease that causes mental retardation unless treated during the first few weeks of life. PKU is the absence of an enzyme that would otherwise prevent a buildup of phenylalanine, a substance toxic to the nervous system. When a restricted phenylalanine diet is begun early and maintained, retardation is prevented. In this example, the cause could be thought of as the underlying genetic defect, as the enzymatic disorder, or as the diet. Each implies a different counterfactual. For example, if we say that a restricted phenylalanine diet caused a decrease in PKU-based mental retardation in infants who are phenylketonuric at birth, the counterfactual is whatever would have happened had these same infants not received a restricted phenylalanine diet. The same logic applies to the genetic or enzymatic version of the cause. But it is impossible for these very same infants simultaneously to both have and not have the diet, the genetic disorder, or the enzyme deficiency.
So a central task for all cause-probing research is to create reasonable approximations to this physically impossible counterfactual. For instance, if it were ethical to do so, we might contrast phenylketonuric infants who were given the diet with other phenylketonuric infants who were not given the diet but who were similar in many ways to those who were (e.g., similar race, gender, age, socioeconomic status, health status). Or we might (if it were ethical) contrast infants who were not on the diet for the first 3 months of their lives with those same infants after they were put on the diet starting in the 4th month. Neither of these approximations is a true counterfactual. In the first case, the individual infants in the treatment condition are different from those in the comparison condition; in the second case, the identities are the same, but time has passed and many changes other than the treatment have occurred to the infants (including permanent damage done by phenylalanine during the first 3 months of life). So two central tasks in experimental design are creating a high-quality but necessarily imperfect source of counterfactual inference and understanding how this source differs from the treatment condition.
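How much a comparison group can differ from the true counterfactual may be easier to see in a simulation. The sketch below is an illustration under invented assumptions, not an analysis of PKU: every unit has a made-up baseline score, the treatment adds a fixed `TRUE_EFFECT`, and the only thing that varies is how units end up in the treated group.

```python
import random
from statistics import mean

TRUE_EFFECT = 5.0  # assumed constant treatment effect (hypothetical)


def estimated_effect(assign, n=20000, seed=0):
    """Difference in mean outcome between treated and comparison units."""
    rng = random.Random(seed)
    treated, comparison = [], []
    for _ in range(n):
        baseline = rng.gauss(50.0, 10.0)   # outcome without treatment
        if assign(baseline, rng):
            treated.append(baseline + TRUE_EFFECT)
        else:
            comparison.append(baseline)
    return mean(treated) - mean(comparison)


def self_selected(baseline, rng):
    """Units that were already better off end up in the treated group."""
    return baseline > 50.0


def randomized(baseline, rng):
    """A coin flip decides treatment, independent of the baseline."""
    return rng.random() < 0.5


print(estimated_effect(self_selected))  # far above TRUE_EFFECT
print(estimated_effect(randomized))     # close to TRUE_EFFECT
```

Under these assumptions the self-selected comparison mixes the treatment effect with preexisting differences between the groups, while the randomized comparison recovers something near the assumed effect, which is one reason random assignment is described as a higher quality source of counterfactual inference.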
This counterfactual reasoning is fundamentally qualitative because causal inference, even in experiments, is fundamentally qualitative (Campbell, 1975; Shadish, 1995a; Shadish & Cook, 1999). However, some of these points have been formalized by statisticians into a special case that is sometimes called Rubin's Causal Model (Holland, 1986; Rubin, 1974, 1977, 1978, 1986). This book is not about statistics, so we do not describe that model in detail (West, Biesanz, & Pitts [2000] do so and relate it to the Campbell tradition). A primary emphasis of Rubin's model is the analysis of cause in experiments, and its basic premises are consistent with those of this book.2 Rubin's model has also been widely used to analyze causal inference in case-control studies in public health and medicine (Holland & Rubin, 1988), in path analysis in sociology (Holland, 1986), and in a paradox that Lord (1967) introduced into psychology (Holland & Rubin, 1983); and it has generated many statistical innovations that we cover later in this book. It is new enough that critiques of it are just now beginning to appear (e.g., Dawid, 2000; Pearl, 2000). What is clear, however, is that Rubin's is a very general model with obvious and subtle implications. Both it and the critiques of it are required material for advanced students and scholars of cause-probing methods.
Causal Relationship
How do we know if cause and effect are related? In a classic analysis formalized by the 19th-century philosopher John Stuart Mill, a causal relationship exists if (1) the cause preceded the effect, (2) the cause was related to the effect, and (3) we can find no plausible alternative explanation for the effect other than the cause. These three characteristics mirror what happens in experiments in which (1) we manipulate the presumed cause and observe an outcome afterward; (2) we see whether variation in the cause is related to variation in the effect; and (3) we use various methods during the experiment to reduce the plausibility of other explanations for the effect, along with ancillary methods to explore the plausibility of those we cannot rule out (most of this book is about methods for doing this).
2. However, Rubin's model is not intended to say much about the matters of causal generalization that we address in this book.
Hence experiments are well-suited to studying causal relationships. No other scientific method regularly matches the characteristics of causal relationships so well. Mill's analysis also points to the weakness of other methods. In many correlational studies, for example, it is impossible to know which of two variables came first, so defending a causal relationship between them is precarious. Understanding this logic of causal relationships and how its key terms, such as cause and effect, are defined helps researchers to critique cause-probing studies.
Causation, Correlation, and Confounds
A well-known maxim in research is: Correlation does not prove causation. This is so because we may not know which variable came first nor whether alternative explanations for the presumed effect exist. For example, suppose income and education are correlated. Do you have to have a high income before you can afford to pay for education, or do you first have to get a good education before you can get a better paying job? Each possibility may be true, and so both need investigation. But until those investigations are completed and evaluated by the scholarly community, a simple correlation does not indicate which variable came first. Correlations also do little to rule out alternative explanations for a relationship between two variables such as education and income. That relationship may not be causal at all but rather due to a third variable (often called a confound), such as intelligence or family socioeconomic status, that causes both high education and high income. For example, if high intelligence causes success in education and on the job, then intelligent people would have correlated education and incomes, not because education causes income (or vice versa) but because both would be caused by intelligence. Thus a central task in the study of experiments is identifying the different kinds of confounds that can operate in a particular research area and understanding the strengths and weaknesses associated with various ways of dealing with them.
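A short simulation can show how a confound alone produces a correlation. The sketch below assumes an invented data-generating process in which a variable labeled intelligence influences both education and income, while education has no effect on income at all; the variable names, sample size, and noise levels are illustrative assumptions, not empirical claims.

```python
import random
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def residuals(y, x):
    """Part of y left over after a least-squares line on x is removed."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


rng = random.Random(1)
n = 5000
# The confound drives both variables; neither variable affects the other.
intelligence = [rng.gauss(0.0, 1.0) for _ in range(n)]
education = [i + rng.gauss(0.0, 1.0) for i in intelligence]
income = [i + rng.gauss(0.0, 1.0) for i in intelligence]

raw = pearson(education, income)
adjusted = pearson(residuals(education, intelligence),
                   residuals(income, intelligence))
print(round(raw, 2))       # clearly positive
print(round(adjusted, 2))  # near zero once the confound is held constant
```

Statistical adjustment of this kind works here only because the confound is known and measured; it is offered as one way of dealing with confounds, with the weakness that unmeasured confounds remain untouched.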
Manipulable and Nonmanipulable Causes
In the intuitive understanding of experimentation that most people have, it makes sense to say, "Let's see what happens if we require welfare recipients to work"; but it makes no sense to say, "Let's see what happens if I change this adult male into a three-year-old girl." And so it is also in scientific experiments. Experiments explore the effects of things that can be manipulated, such as the dose of a medicine, the amount of a welfare check, the kind or amount of psychotherapy, or the number of children in a classroom. Nonmanipulable events (e.g., the explosion of a supernova) or attributes (e.g., people's ages, their raw genetic material, or their biological sex) cannot be causes in experiments because we cannot deliberately vary them to see what then happens. Consequently, most scientists and philosophers agree that it is much harder to discover the effects of nonmanipulable causes.
To be clear, we are not arguing that all causes must be manipulable, only that experimental causes must be so. Many variables that we correctly think of as causes are not directly manipulable. Thus it is well established that a genetic defect causes PKU even though that defect is not directly manipulable. We can investigate such causes indirectly in nonexperimental studies or even in experiments by manipulating biological processes that prevent the gene from exerting its influence, as through the use of diet to inhibit the gene's biological consequences. Both the nonmanipulable gene and the manipulable diet can be viewed as causes: both covary with PKU-based retardation, both precede the retardation, and it is possible to explore other explanations for the gene's and the diet's effects on cognitive functioning. However, investigating the manipulable diet as a cause has two important advantages over considering the nonmanipulable genetic problem as a cause. First, only the diet provides a direct action to solve the problem; and second, we will see that studying manipulable agents allows a higher quality source of counterfactual inference through such methods as random assignment. When individuals with the nonmanipulable genetic problem are compared with persons without it, the latter are likely to be different from the former in many ways other than the genetic defect. So the counterfactual inference about what would have happened to those with the PKU genetic defect is much more difficult to make.
Nonetheless, nonmanipulable causes should be studied using whatever means are available and seem useful. This is true because such causes eventually help us to find manipulable agents that can then be used to ameliorate the problem at hand. The PKU example illustrates this. Medical researchers did not discover how to treat PKU effectively by first trying different diets with retarded children. They first discovered the nonmanipulable biological features of retarded children affected with PKU, finding abnormally high levels of phenylalanine and its associated metabolic and genetic problems in those children. Those findings pointed in certain ameliorative directions and away from others, leading scientists to experiment with treatments they thought might be effective and practical. Thus the new diet resulted from a sequence of studies with different immediate purposes, with different forms, and with varying degrees of uncertainty reduction. Some were experimental, but others were not.
Further, analogue experiments can sometimes be done on nonmanipulable causes, that is, experiments that manipulate an agent that is similar to the cause of interest. Thus we cannot change a person's race, but we can chemically induce skin pigmentation changes in volunteer individuals, though such analogues do not match the reality of being Black every day and everywhere for an entire life. Similarly, past events, which are normally nonmanipulable, sometimes constitute a natural experiment that may even have been randomized, as when the 1970 Vietnam-era draft lottery was used to investigate a variety of outcomes (e.g., Angrist, Imbens, & Rubin, 1996a; Notz, Staw, & Cook, 1971).
Although experimenting on manipulable causes makes the job of discovering their effects easier, experiments are far from perfect means of investigating causes. Sometimes experiments modify the conditions in which testing occurs in a way that reduces the fit between those conditions and the situation to which the results are to be generalized. Also, knowledge of the effects of manipulable causes tells nothing about how and why those effects occur. Nor do experiments answer many other questions relevant to the real world, for example, which questions are worth asking, how strong the need for treatment is, how a cause is distributed through society, whether the treatment is implemented with theoretical fidelity, and what value should be attached to the experimental results.
In addition, in experiments, we first manipulate a treatment and only then observe its effects; but in some other studies we first observe an effect, such as AIDS, and then search for its cause, whether manipulable or not. Experiments cannot help us with that search. Scriven (1976) likens such searches to detective work in which a crime has been committed (e.g., a robbery), the detectives observe a particular pattern of evidence surrounding the crime (e.g., the robber wore a baseball cap and a distinct jacket and used a certain kind of gun), and then the detectives search for criminals whose known method of operating (their modus operandi or m.o.) includes this pattern. A criminal whose m.o. fits that pattern of evidence then becomes a suspect to be investigated further. Epidemiologists use a similar method, the case-control design (Ahlbom & Norell, 1990), in which they observe a particular health outcome (e.g., an increase in brain tumors) that is not seen in another group and then attempt to identify associated causes (e.g., increased cell phone use). Experiments do not aspire to answer all the kinds of questions, not even all the types of causal questions, that social scientists ask.
Causal Description and Causal Explanation
The unique strength of experimentation is in describing the consequences attributable to deliberately varying a treatment. We call this causal description. In contrast, experiments do less well in clarifying the mechanisms through which and the conditions under which that causal relationship holds, which we call causal explanation. For example, most children very quickly learn the descriptive causal relationship between flicking a light switch and obtaining illumination in a room. However, few children (or even adults) can fully explain why that light goes on. To do so, they would have to decompose the treatment (the act of flicking a light switch) into its causally efficacious features (e.g., closing an insulated circuit) and its nonessential features (e.g., whether the switch is thrown by hand or a motion detector). They would have to do the same for the effect (either incandescent or fluorescent light can be produced, but light will still be produced whether the light fixture is recessed or not). For full explanation, they would then have to show how the causally efficacious parts of the treatment influence the causally affected parts of the outcome through identified mediating processes (e.g., the passage of electricity through the circuit, the excitation of photons).3 Clearly, the cause of the light going on is a complex cluster of many factors. For those philosophers who equate cause with identifying that constellation of variables that necessarily, inevitably, and infallibly results in the effect (Beauchamp, 1974), talk of cause is not warranted until everything of relevance is known. For them, there is no causal description without causal explanation. Whatever the philosophic merits of their position, though, it is not practical to expect much current social science to achieve such complete explanation.
The practical importance of causal explanation is brought home when the switch fails to make the light go on and when replacing the light bulb (another easily learned manipulation) fails to solve the problem. Explanatory knowledge then offers clues about how to fix the problem, for example, by detecting and repairing a short circuit. Or if we wanted to create illumination in a place without lights and we had explanatory knowledge, we would know exactly which features of the cause-and-effect relationship are essential to create light and which are irrelevant. Our explanation might tell us that there must be a source of electricity but that that source could take several different molar forms, such as a battery, a generator, a windmill, or a solar array. There must also be a switch mechanism to close a circuit, but this could also take many forms, including the touching of two bare wires or even a motion detector that trips the switch when someone enters the room. So causal explanation is an important route to the generalization of causal descriptions because it tells us which features of the causal relationship are essential to transfer to other situations.
This benefit of causal explanation helps elucidate its priority and prestige in all sciences and helps explain why, once a novel and important causal relationship is discovered, the bulk of basic scientific effort turns toward explaining why and how it happens. Usually, this involves decomposing the cause into its causally effective parts, decomposing the effects into its causally affected parts, and identifying the processes through which the effective causal parts influence the causally affected outcome parts.
These examples also show the close parallel between descriptive and explanatory causation and molar and molecular causation.4 Descriptive causation usually concerns simple bivariate relationships between molar treatments and molar outcomes, molar here referring to a package that consists of many different parts. For instance, we may find that psychotherapy decreases depression, a simple descriptive causal relationship between a molar treatment package and a molar outcome. However, psychotherapy consists of such parts as verbal interactions, placebo-generating procedures, setting characteristics, time constraints, and payment for services. Similarly, many depression measures consist of items pertaining to the physiological, cognitive, and affective aspects of depression. Explanatory causation breaks these molar causes and effects into their molecular parts so as to learn, say, that the verbal interactions and the placebo features of therapy both cause changes in the cognitive symptoms of depression, but that payment for services does not do so even though it is part of the molar treatment package.

3. However, the full explanation a physicist would offer might be quite different from this electrician's explanation, perhaps invoking the behavior of subparticles. This difference indicates just how complicated is the notion of explanation and how it can quickly become quite complex once one shifts levels of analysis.

4. By molar, we mean something taken as a whole rather than in parts. An analogy is to physics, in which molar might refer to the properties or motions of masses, as distinguished from those of molecules or atoms that make up those masses.
If experiments are less able to provide this highly-prized explanatory causal knowledge, why are experiments so central to science, especially to basic social science, in which theory and explanation are often the coin of the realm? The answer is that the dichotomy between descriptive and explanatory causation is less clear in scientific practice than in abstract discussions about causation. First, many causal explanations consist of chains of descriptive causal links in which one event causes the next. Experiments help to test the links in each chain. Second, experiments help distinguish between the validity of competing explanatory theories, for example, by testing competing mediating links proposed by those theories. Third, some experiments test whether a descriptive causal relationship varies in strength or direction under Condition A versus Condition B (then the condition is a moderator variable that explains the conditions under which the effect holds). Fourth, some experiments add quantitative or qualitative observations of the links in the explanatory chain (mediator variables) to generate and study explanations for the descriptive causal effect.
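The third point, testing whether an effect varies under Condition A versus Condition B, can be illustrated with a small calculation. The following Python sketch is only an illustration: the scores are made up, and the names `scores` and `effect` are hypothetical. It computes the treatment-minus-control mean difference separately within each condition; a gap between the two estimates is what one would look for when asking whether the condition acts as a moderator.

```python
from statistics import mean

# Hypothetical outcome scores, keyed by (condition, group). Illustration only.
scores = {
    ("A", "treatment"): [12, 14, 13, 15],
    ("A", "control"):   [10, 11, 9, 10],
    ("B", "treatment"): [11, 10, 12, 11],
    ("B", "control"):   [11, 10, 11, 12],
}

def effect(condition):
    """Treatment-minus-control difference in mean outcome within one condition."""
    return mean(scores[(condition, "treatment")]) - mean(scores[(condition, "control")])

effect_a = effect("A")            # descriptive causal effect under Condition A
effect_b = effect("B")            # descriptive causal effect under Condition B
moderation = effect_a - effect_b  # how much the effect differs across conditions

print(effect_a, effect_b, moderation)
```

In these invented data the treatment appears to help under Condition A and not under Condition B. A real analysis would also need to ask whether such a difference exceeds what chance alone would produce.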
Experiments are also prized in applied areas of social science, in which the identification of practical solutions to social problems has as great or even greater priority than explanations of those solutions. After all, explanation is not always required for identifying practical solutions. Lewontin (1997) makes this point about the Human Genome Project, a coordinated multibillion-dollar research program to map the human genome that it is hoped eventually will clarify the genetic causes of diseases. Lewontin is skeptical about aspects of this search:
What is involved here is the difference between explanation and intervention. Many disorders can be explained by the failure of the organism to make a normal protein, a failure that is the consequence of a gene mutation. But intervention requires that the normal protein be provided at the right place in the right cells, at the right time and in the right amount, or else that an alternative way be found to provide normal cellular function. What is worse, it might even be necessary to keep the abnormal protein away from the cells at critical moments. None of these objectives is served by knowing the DNA sequence of the defective gene. (Lewontin, 1997, p. 29)
Practical applications are not immediately revealed by theoretical advance. Instead, to reveal them may take decades of follow-up work, including tests of simple descriptive causal relationships. The same point is illustrated by the cancer drug Endostatin, discussed earlier. Scientists knew the action of the drug occurred through cutting off tumor blood supplies; but to successfully use the drug to treat cancers in mice required administering it at the right place, angle, and depth, and those details were not part of the usual scientific explanation of the drug's effects.
In the end, then, causal descriptions and causal explanations are in delicate balance in experiments. What experiments do best is to improve causal descriptions; they do less well at explaining causal relationships. But most experiments can be designed to provide better explanations than is typically the case today. Further, in focusing on causal descriptions, experiments often investigate molar events that may be less strongly related to outcomes than are more molecular mediating processes, especially those processes that are closer to the outcome in the explanatory chain. However, many causal descriptions are still dependable and strong enough to be useful, to be worth making the building blocks around which important policies and theories are created. Just consider the dependability of such causal statements as that school desegregation causes white flight, or that outgroup threat causes ingroup cohesion, or that psychotherapy improves mental health, or that diet reduces the retardation due to PKU. Such dependable causal relationships are useful to policymakers, practitioners, and scientists alike.
MODERN DESCRIPTIONS OF EXPERIMENTS
Some of the terms used in describing modern experimentation (see Table 1.1) are unique, clearly defined, and consistently used; others are blurred and inconsistently used. The common attribute in all experiments is control of treatment (though control can take many different forms). So Mosteller (1990, p. 225) writes, "In an experiment the investigator controls the application of the treatment"; and Yaremko, Harari, Harrison, and Lynn (1986, p. 72) write, "one or more independent variables are manipulated to observe their effects on one or more dependent variables." However, over time many different experimental subtypes have developed in response to the needs and histories of different sciences (Winston, 1990; Winston & Blais, 1996).
TABLE 1.1 The Vocabulary of Experiments
Experiment: A study in which an intervention is deliberately introduced to observe its effects.
Randomized Experiment: An experiment in which units are assigned to receive the treatment or an alternative condition by a random process such as the toss of a coin or a table of random numbers.
Quasi-Experiment: An experiment in which units are not assigned to conditions randomly.
Natural Experiment: Not really an experiment because the cause usually cannot be manipulated; a study that contrasts a naturally occurring event such as an earthquake with a comparison condition.
Correlational Study: Usually synonymous with nonexperimental or observational study; a study that simply observes the size and direction of a relationship among variables.
Randomized Experiment
The most clearly described variant is the randomized experiment, widely credited to Sir Ronald Fisher (1925, 1926). It was first used in agriculture but later spread to other topic areas because it promised control over extraneous sources of variation without requiring the physical isolation of the laboratory. Its distinguishing feature is clear and important: that the various treatments being contrasted (including no treatment at all) are assigned to experimental units⁵ by chance, for example, by coin toss or use of a table of random numbers. If implemented correctly, random assignment creates two or more groups of units that are probabilistically similar to each other on the average.⁶ Hence, any outcome differences that are observed between those groups at the end of a study are likely to be due to treatment, not to differences between the groups that already existed at the start of the study. Further, when certain assumptions are met, the randomized experiment yields an estimate of the size of a treatment effect that has desirable statistical properties, along with estimates of the probability that the true effect falls within a defined confidence interval. These features of experiments are so highly prized that in a research area such as medicine the randomized experiment is often referred to as the gold standard for treatment outcome research.⁷
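Random assignment and the resulting effect estimate can be sketched in a few lines. The Python below is a minimal illustration, not a recommended analysis: the function names `randomly_assign` and `mean_difference_ci` are hypothetical, the outcome scores are invented, and the interval assumes a large-sample normal approximation (z = 1.96), which is our simplifying assumption rather than anything the text prescribes.

```python
import random
from math import sqrt
from statistics import mean, stdev

def randomly_assign(units, seed=None):
    """Shuffle the units and split them into two groups by chance alone."""
    rng = random.Random(seed)  # the seed only makes this sketch reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def mean_difference_ci(treated, control, z=1.96):
    """Difference in mean outcome with an approximate 95% interval.

    Assumes a large-sample normal approximation; a small study would
    ordinarily use a t distribution instead.
    """
    diff = mean(treated) - mean(control)
    se = sqrt(stdev(treated) ** 2 / len(treated) + stdev(control) ** 2 / len(control))
    return diff, (diff - z * se, diff + z * se)

units = list(range(1, 21))  # hypothetical unit identifiers
treatment_group, control_group = randomly_assign(units, seed=1)

# Hypothetical outcome scores, for illustration only.
diff, (lower, upper) = mean_difference_ci([5, 7, 9], [2, 4, 6])
print(treatment_group, control_group)
print(diff, lower, upper)
```

Because chance alone decides group membership, nothing about a unit can systematically influence which condition it receives; that is the property the paragraph above describes as making the groups probabilistically similar on the average.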
Closely related to the randomized experiment is a more ambiguous and inconsistently used term, true experiment. Some authors use it synonymously with randomized experiment (Rosenthal & Rosnow, 1991). Others use it more generally to refer to any study in which an independent variable is deliberately manipulated (Yaremko et al., 1986) and a dependent variable is assessed. We shall not use the term at all given its ambiguity and given that the modifier true seems to imply restricted claims to a single correct experimental method.
Quasi-Experiment
Much of this book focuses on a class of designs that Campbell and Stanley (1963) popularized as quasi-experiments.⁸ Quasi-experiments share with all other experiments a similar purpose, to test descriptive causal hypotheses about manipulable causes, as well as many structural details, such as the frequent presence of control groups and pretest measures, to support a counterfactual inference about what would have happened in the absence of treatment. But, by definition, quasi-experiments lack random assignment. Assignment to conditions is by means of self-selection, by which units choose treatment for themselves, or by means of administrator selection, by which teachers, bureaucrats, legislators, therapists, physicians, or others decide which persons should get which treatment. However, researchers who use quasi-experiments may still have considerable control over selecting and scheduling measures, over how nonrandom assignment is executed, over the kinds of comparison groups with which treatment groups are compared, and over some aspects of how treatment is scheduled. As Campbell and Stanley note:
There are many natural social settings in which the research person can introduce something like experimental design into his scheduling of data collection procedures (e.g., the when and to whom of measurement), even though he lacks the full control over the scheduling of experimental stimuli (the when and to whom of exposure and the ability to randomize exposures) which makes a true experiment possible. Collectively, such situations can be regarded as quasi-experimental designs. (Campbell & Stanley, 1963, p. 34)
5. Units can be people, animals, time periods, institutions, or almost anything else. Typically in field experimentation they are people or some aggregate of people, such as classrooms or work sites. In addition, a little thought shows that random assignment of units to treatments is the same as assignment of treatments to units, so these phrases are frequently used interchangeably.
6. The word probabilistically is crucial, as is explained in more detail in Chapter 8.
7. Although the term randomized experiment is used this way consistently across many fields and in this book, statisticians sometimes use the closely related term random experiment in a different way to indicate experiments for which the outcome cannot be predicted with certainty (e.g., Hogg & Tanis, 1988).
8. Campbell (1957) first called these compromise designs but changed terminology very quickly; Rosenbaum (1995a) and Cochran (1965) refer to these as observational studies, a term we avoid because many people use it to refer to correlational or nonexperimental studies, as well. Greenberg and Shroder (1997) use quasi-experiment to refer to studies that randomly assign groups (e.g., communities) to conditions, but we would consider these group-randomized experiments (Murray, 1998).
In quasi-experiments, the cause is manipulable and occurs before the effect is measured. However, quasi-experimental design features usually create less compelling support for counterfactual inferences. For example, quasi-experimental control groups may differ from the treatment condition in many systematic (nonrandom) ways other than the presence of the treatment. Many of these ways could be alternative explanations for the observed effect, and so researchers have to worry about ruling them out in order to get a more valid estimate of the treatment effect. By contrast, with random assignment the researcher does not have to think as much about all these alternative explanations. If correctly done, random assignment makes most of the alternatives less likely as causes of the observed treatment effect at the start of the study.
In quasi-experiments, the researcher has to enumerate alternative explanations one by one, decide which are plausible, and then use logic, design, and measurement to assess whether each one is operating in a way that might explain any observed effect. The difficulties are that these alternative explanations are never completely enumerable in advance, that some of them are particular to the context being studied, and that the methods needed to eliminate them from contention will vary from alternative to alternative and from study to study. For example, suppose two nonrandomly formed groups of children are studied, a volunteer treatment group that gets a new reading program and a control group of nonvolunteers who do not get it. If the treatment group does better, is it because of treatment or because the cognitive development of the volunteers was increasing more rapidly even before treatment began? (In a randomized experiment, maturation rates would have been probabilistically equal in both groups.) To assess this alternative, the researcher might add multiple pretests to reveal maturational trend before the treatment, and then compare that trend with the trend after treatment.
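The logic of using multiple pretests can be made concrete with a small calculation. The sketch below is one possible way to do it, under assumptions of our own: the reading scores are invented, maturation is assumed to be linear over the pretest waves, and the helper names `slope` and `projected` are hypothetical. It fits a straight line to each group's pretests, projects that line to the posttest wave, and compares the projection with the observed posttest.

```python
def slope(times, values):
    """Ordinary least-squares slope of values regressed on times."""
    n = len(times)
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    num = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    den = sum((t - t_bar) ** 2 for t in times)
    return num / den

def projected(times, values, t):
    """Score expected at time t if the pretest trend simply continued."""
    b = slope(times, values)
    a = sum(values) / len(values) - b * sum(times) / len(times)
    return a + b * t

# Hypothetical mean reading scores at three pretest waves and one posttest.
pretest_times = [0, 1, 2]
posttest_time = 3
volunteers_pre, volunteers_post = [40, 44, 48], 58
nonvolunteers_pre, nonvolunteers_post = [40, 42, 44], 46

naive_difference = volunteers_post - nonvolunteers_post
gain_beyond_trend_vol = volunteers_post - projected(pretest_times, volunteers_pre, posttest_time)
gain_beyond_trend_non = nonvolunteers_post - projected(pretest_times, nonvolunteers_pre, posttest_time)

print(naive_difference, gain_beyond_trend_vol, gain_beyond_trend_non)
```

In these invented data the volunteers were already improving faster before treatment, so part of the raw posttest difference reflects maturation rather than the program. The linear projection is itself an assumption that could be wrong, which is one reason such adjustments support weaker inferences than random assignment.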
Another alternative explanation might be that the nonrandom control group included more disadvantaged children who had less access to books in their homes or who had parents who read to them less often. (In a randomized experiment, both groups would have had similar proportions of such children.) To assess this alternative, the experimenter may measure the number of books at home, parental time spent reading to children, and perhaps trips to libraries. Then the researcher would see if these variables differed across treatment and control groups in the hypothesized direction that could explain the observed treatment effect. Obviously, as the number of plausible alternative explanations increases, the design of the quasi-experiment becomes more intellectually demanding and complex, especially because we are never certain we have identified all the alternative explanations. The efforts of the quasi-experimenter start to look like attempts to bandage a wound that would have been less severe if random assignment had been used initially.
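Checking whether a measured variable such as books at home differs across groups can be sketched as follows. This is a hedged illustration only: the counts are invented, and the standardized mean difference used here is one common summary that we have chosen for the example, not a method named in the text. The function name `standardized_mean_difference` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def standardized_mean_difference(treated, control):
    """Difference in group means divided by the pooled standard deviation."""
    pooled_sd = sqrt((stdev(treated) ** 2 + stdev(control) ** 2) / 2)
    return (mean(treated) - mean(control)) / pooled_sd

# Hypothetical counts of books at home for children in each group.
books_treatment = [30, 40, 50, 60]
books_control = [10, 20, 30, 40]

smd = standardized_mean_difference(books_treatment, books_control)
print(round(smd, 2))
```

A large positive value here would mean the treatment group started with more books at home, which is the hypothesized direction that could explain a treatment group advantage. A value near zero would make this particular alternative less plausible, though it says nothing about alternatives that were never measured.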
The ruling out of alternative hypotheses is closely related to a falsificationist logic popularized by Popper (1959). Popper noted how hard it is to be sure that a general conclusion (e.g., all swans are white) is correct based on a limited set of observations (e.g., all the swans I've seen were white). After all, future observations may change (e.g., some day I may see a black swan). So confirmation is logically difficult. By contrast, observing a disconfirming instance (e.g., a black swan) is sufficient, in Popper's view, to falsify the general conclusion that all swans are white. Accordingly, Popper urged scientists to try deliberately to falsify the conclusions they wish to draw rather than only to seek information corroborating them. Conclusions that withstand falsification are retained in scientific books or journals and treated as plausible until better evidence comes along. Quasi-experimentation is falsificationist in that it requires experimenters to identify a causal claim and then to generate and examine plausible alternative explanations that might falsify the claim.
However, such falsification can never be as definitive as Popper hoped. Kuhn (1962) pointed out that falsification depends on two assumptions that can never be fully tested. The first is that the causal claim is perfectly specified. But that is never the case. So many features of both the claim and the test of the claim are debatable, for example, which outcome is of interest, how it is measured, the conditions of treatment, who needs treatment, and all the many other decisions that researchers must make in testing causal relationships. As a result, disconfirmation often leads theorists to respecify part of their causal theories. For example, they might now specify novel conditions that must hold for their theory to be true and that were derived from the apparently disconfirming observations. Second, falsification requires measures that are perfectly valid reflections of the theory being tested. However, most philosophers maintain that all observation is theory-laden. It is laden both with intellectual nuances specific to the partially
Nonexperimental Designs
The terms correlational design, passive observational design, and nonexperimental design refer to situations in which a presumed cause and effect are identified and measured but in which other structural features of experiments are missing. Random assignment is not part of the design, nor are such design elements as pretests and control groups from which researchers might construct a useful counterfactual inference. Instead, reliance is placed on measuring alternative explanations individually and then statistically controlling for them. In cross-sectional studies in which all the data are gathered on the respondents at one time, the researcher may not even know if the cause precedes the effect. When these studies are used for causal purposes, the missing design features can be problematic unless much is already known about which alternative interpretations are plausible, unless those that are plausible can be validly measured, and unless the substantive model used for statistical adjustment is well-specified. These are difficult conditions to meet in the real world of research practice, and therefore many commentators doubt the potential of such designs to support strong causal inferences in most cases.
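What it means to measure an alternative explanation and then statistically control for it can be illustrated with the simplest form of adjustment, stratification. The Python sketch below is an assumption-laden toy: the records are invented, a single measured variable (here called the stratum) stands in for the alternative explanation, and the function names are hypothetical. It compares the raw exposed-versus-unexposed difference with the average of the differences computed within each stratum.

```python
from statistics import mean

# Hypothetical records: (stratum of the measured confounder, exposed?, outcome).
records = [
    ("low", True, 6), ("low", False, 5), ("low", False, 5), ("low", False, 5),
    ("high", True, 11), ("high", True, 11), ("high", True, 11), ("high", False, 10),
]

def raw_difference(rows):
    """Exposed-minus-unexposed difference in mean outcome, ignoring strata."""
    exposed = [y for _, x, y in rows if x]
    unexposed = [y for _, x, y in rows if not x]
    return mean(exposed) - mean(unexposed)

def stratified_difference(rows):
    """Average of within-stratum differences, weighted by stratum size."""
    total = len(rows)
    adjusted = 0.0
    for stratum in sorted({s for s, _, _ in rows}):
        subset = [r for r in rows if r[0] == stratum]
        adjusted += raw_difference(subset) * len(subset) / total
    return adjusted

raw = raw_difference(records)
adjusted = stratified_difference(records)
print(raw, adjusted)
```

Here the raw difference is inflated because exposed cases are concentrated in the high stratum, where outcomes are higher regardless of exposure. The adjustment helps only for variables that were measured and modeled correctly, which is the limitation the paragraph above emphasizes.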
EXPERIMENTS AND THE GENERALIZATION OF CAUSAL CONNECTIONS
The strength of experimentation is its ability to illuminate causal inference. The weakness of experimentation is doubt about the extent to which that causal relationship generalizes. We hope that an innovative feature of this book is its focus on generalization. Here we introduce the general issues that are expanded in later chapters.
Most Experiments Are Highly Local But Have General Aspirations
Most experiments are highly localized and particularistic. They are almost always conducted in a restricted range of settings, often just one, with a particular version of one type of treatment rather than, say, a sample of all possible versions. Usually they have several measures, each with theoretical assumptions that are different from those present in other measures, but far from a complete set of all possible measures. Each experiment nearly always uses a convenient sample of people rather than one that reflects a well-described population; and it will inevitably be conducted at a particular point in time that rapidly becomes history.
Yet readers of experimental results are rarely concerned with what happened in that particular, past, local study. Rather, they usually aim to learn either about theoretical constructs of interest or about a larger policy. Theorists often want to connect experimental results to theories with broad conceptual applicability, which requires generalization at the linguistic level of constructs rather than at the level of the operations used to represent these constructs in a given experiment. They nearly always want to generalize to more people and settings than are represented in a single experiment. Indeed, the value assigned to a substantive theory usually depends on how broad a range of phenomena the theory covers. Similarly, policymakers may be interested in whether a causal relationship would hold (probabilistically) across the many sites at which it would be implemented as a policy, an inference that requires generalization beyond the original experimental study context. Indeed, all human beings probably value the perceptual and cognitive stability that is fostered by generalizations. Otherwise, the world might appear as a buzzing cacophony of isolated instances requiring constant cognitive processing that would overwhelm our limited capacities.
In defining generalization as a problem, we do not assume that more broadly applicable results are always more desirable (Greenwood, 1989). For example, physicists who use particle accelerators to discover new elements may not expect that it would be desirable to introduce such elements into the world. Similarly, social scientists sometimes aim to demonstrate that an effect is possible and to understand its mechanisms without expecting that the effect can be produced more generally. For instance, when a "sleeper effect" occurs in an attitude change study involving persuasive communications, the implication is that change is manifest after a time delay but not immediately so. The circumstances under which this effect occurs turn out to be quite limited and unlikely to be of any general interest other than to show that the theory predicting it (and many other ancillary theories) may not be wrong (Cook, Gruder, Hennigan & Flay, 1979). Experiments that demonstrate limited generalization may be just as valuable as those that demonstrate broad generalization.
Nonetheless, a conflict seems to exist between the localized nature of the causal knowledge that individual experiments provide and the more generalized causal goals that research aspires to attain. Cronbach and his colleagues (Cronbach et al., 1980; Cronbach, 1982) have made this argument most forcefully and their works have contributed much to our thinking about causal generalization. Cronbach noted that each experiment consists of units that receive the experiences being contrasted, of the treatments themselves, of observations made on the units, and of the settings in which the study is conducted. Taking the first letter from each of these four words, he defined the acronym utos to refer to the "instances on which data are collected" (Cronbach, 1982, p. 78), that is, to the actual people, treatments, measures, and settings that were sampled in the experiment. He then defined two problems of generalization: (1) generalizing to the "domain about which [the] question is asked" (p. 79), which he called UTOS; and (2) generalizing to "units, treatments, variables, and settings not directly observed" (p. 83), which he called *UTOS.⁹
9. We oversimplify Cronbach's presentation here for pedagogical reasons. For example, Cronbach only used capital S, not small s, so that his system referred only to utoS, not utos. He offered diverse and not always consistent definitions of UTOS and *UTOS, in particular. And he does not use the word generalization in the same broad way we do here.
Our theory of causal generalization, outlined below and presented in more detail in Chapters 11 through 13, melds Cronbach's thinking with our own ideas about generalization from previous works (Cook, 1990, 1991; Cook & Campbell, 1979), creating a theory that is different in modest ways from both of these predecessors. Our theory is influenced by Cronbach's work in two ways. First, we follow him by describing experiments consistently throughout this book as consisting of the elements of units, treatments, observations, and settings,¹⁰ though we frequently substitute persons for units given that most field experimentation is conducted with humans as participants. We also often substitute outcome for observations given the centrality of observations about outcome when examining causal relationships. Second, we acknowledge that researchers are often interested in two kinds of generalization about each of these five elements, and that these two types are inspired by, but not identical to, the two kinds of generalization that Cronbach defined. We call these construct validity generalizations (inferences about the constructs that research operations represent) and external validity generalizations (inferences about whether the causal relationship holds over variation in persons, settings, treatment, and measurement variables).
Construct Validity: Causal Generalization as Representation
The first causal generalization problem concerns how to go from the particular units, treatments, observations, and settings on which data are collected to the higher order constructs these instances represent. These constructs are almost always couched in terms that are more abstract than the particular instances sampled in an experiment. The labels may pertain to the individual elements of the experiment (e.g., is the outcome measured by a given test best described as intelligence or as achievement?). Or the labels may pertain to the nature of relationships among elements, including causal relationships, as when cancer treatments are classified as cytotoxic or cytostatic depending on whether they kill tumor cells directly or delay tumor growth by modulating their environment.
Consider a randomized experiment by Fortin and Kirouac (1976). The treatment was a brief educational course administered by several nurses, who gave a tour of their hospital and covered some basic facts about surgery with individuals who were to have elective abdominal or thoracic surgery 15 to 20 days later in a single Montreal hospital. Ten specific outcome measures were used after the surgery, such as an activities of daily living scale and a count of the analgesics used to control pain. Now compare this study