
THE OXFORD HANDBOOK OF
POLITICAL METHODOLOGY

Edited by

JANET M. BOX-STEFFENSMEIER
HENRY E. BRADY
and
DAVID COLLIER

OXFORD UNIVERSITY PRESS

OXFORD UNIVERSITY PRESS

Great Clarendon Street, Oxford OX2 6DP

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide in

Oxford New York

Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto

With offices in

Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam

Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

Published in the United States by Oxford University Press Inc., New York

© The several contributors 2008

The moral rights of the authors have been asserted
Database right Oxford University Press (maker)

First published 2008
First published in paperback 2010

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this book in any other binding or cover and you must impose the same condition on any acquirer

British Library Cataloguing in Publication Data
Data available

Library of Congress Cataloging in Publication Data
The Oxford handbook of political methodology / edited by Janet M. Box-Steffensmeier, Henry E. Brady and David Collier.
p. cm. ISBN 978-0-19-928654-6
1. Political science-Methodology-Handbooks, manuals, etc. I. Box-Steffensmeier, Janet M., 1965- II. Brady, Henry E. III. Collier, David, 1942-
JA71.O948 2008
320.01 dc22 2008024395

Typeset by SPI Publisher Services, Pondicherry, India
Printed in Great Britain on acid-free paper by CPI Antony Rowe, Chippenham and Eastbourne

ISBN 978-0-19-928654-6 (hbk.)
ISBN 978-0-19-958556-4 (pbk.)

3 5 7 9 10 8 6 4

  • T H E " O X F O R D H A N D B O O K S

    O F

    P O L I T I C A L S C I E N C E

    G e n e r a l E d i t o r : R o b e r t E . G o o d i n

    The Oxford Handbooks of Political Science is a ten-volume set o f reference books offering authoritative and engaging critical overviews of all the main branches of political science.

    The series as a whole is under the General Editorship of Robert E. Goodin, with each volume being edited by a distinguished international group of specialists in their respective fields:

    P O L I T I C A L T H E O R YJohn S. Dryzek, Bonnie Honig & Anne Phillips

    P O L I T I C A L I N S T I T U T I O N SR. A. W. Rhodes, Sarah A. Binder & Bert A. Rockman

    P O L I T I C A L B E H A V IO RRussell J. Dalton & Hans-Dieter Klingemann

    C O M P A R A T IV E P O L I T I C SCarles Boix & Susan C. Stokes

    LA W & P O L I T I C SKeith E. Whittington, R. Daniel Kelemen & Gregory A. Caldeira

    P U B L I C P O L IC YMichael Moran, Martin Rein & Robert E. Goodin

    P O L I T I C A L E C O N O M YBarry R. Weingast & Donald A. Wittman

    IN T E R N A T IO N A L R E L A T IO N SChristian Reus-Smit & Duncan Snidal

    C O N T E X T U A L P O L I T I C A L A N A L Y S ISRobert E. Goodin & Charles Tilly

    P O L I T I C A L M E T H O D O L O G YJanet M. Box-Steffensmeier, Henry E. Brady & David Collier

    This series aspires to shape the discipline, not just to report on it. Like the Goodin- Klingemann New Handbook of Political Science upon which the series builds, each of these volumes will combine critical commentaries on where the field has been together with positive suggestions as to where it ought to be heading.

For those who instilled in us a passion for methods, and the confidence to strive for excellence in pursuing rigor and high standards

James Lindsay, Peter McCormick, Tse-Min Lin, and Herbert Weisberg
- JBS

Christopher Achen, Walter Dean Burnham, Douglas Hibbs, and Daniel McFadden
- HEB

Giovanni Sartori, Alexander George, Christopher Achen, and David Freedman
- DC

CHAPTER 14

EXPERIMENTATION IN POLITICAL SCIENCE

REBECCA B. MORTON
KENNETH C. WILLIAMS

1 The Advent of Experimental Political Science

Experimentation is increasing dramatically in political science. Figure 14.1 shows the number of experimental articles published by decade in three major mainstream journals, the American Political Science Review (APSR), American Journal of Political Science (AJPS), and Journal of Politics (JOP), from 1950 to 2005.¹ These figures do not include the new use of so-called survey experiments and natural experiments. Moreover, other political science journals have published political experimental articles, such as Economics and Politics, Political Behavior, Political Psychology, Public Choice, Public Opinion Quarterly, and the Journal of Conflict Resolution, as have numerous economics and social psychology journals. Furthermore, a number of political scientists have published experimental work in research monographs; see for example Ansolabehere and Iyengar (1997), Lupia and McCubbins (1998), and Morton and Williams (2001).

As discussed in Kinder and Palfrey (1993), there are a number of reasons for the increase in experimentation in political science in the 1970s and 1980s. We believe that the dominant explanation for the expansion since 1990 is the increase in cheap and easily programmable computer networking technology, for the number of possible experimental designs via laboratory networks and over the internet is hugely greater than what researchers can conduct manually. Computer technology has also led to a greater ability to engage in survey experiments, and to deal with the statistical and other methodological issues that are sometimes involved in field and natural experiments. Technology has transformed political science into an experimental discipline. In this chapter we explore the expanding area of experimental political science and some of the concerns involved in experimentation.²

Fig. 14.1. Experimental articles published in APSR, AJPS & JOP, 1950-2005

¹ Earlier figures are from McGraw and Hoekstra (1994); more recent ones were compiled by the authors. Note that our figures are significantly greater than those compiled by McDermott (2002). McDermott's calculations were limited to publications by certain authors according to undefined criteria ("established political scientists").

² A full and detailed treatment of experimentation in political science cannot be presented in a chapter-length study; we encourage readers to consult Morton and Williams (2008), who expand on the concepts and issues presented here.

2 What is an Experiment?

    2.1 The Myth of the Ideal Experiment

Political science is a discipline that is defined by the substance of what we study: politics. As Beck (2000) remarks, researchers freely use whatever methodological solutions are available, drawing from other social sciences like economics, sociology, and psychology as well as statistics and applied mathematics to answer substantive questions. Unfortunately, this has led to misinterpretations, as experimentalists learn about their method from other disciplines and few nonexperimentalists have an understanding of advances that have occurred outside political science. The most prominent misconception is that the best experiment is one where a researcher manipulates one variable, called a treatment, while having an experimental group that receives the treatment and a control group that does not, and randomly assigning subjects across groups. This perception arises largely because most nonexperimentalists learn about experiments through the lens of behavioral social science methods formulated in the 1950s. Certainly such a design can be perfect for the case where a researcher is interested in the causal effect of a binary variable that might affect the choices of subjects in some known context, and the researcher is interested in nothing else. But this is rarely true for any significant question in twenty-first-century political science research. Furthermore, the aforementioned advances in technology have changed the nature of experimental research in fundamental ways that researchers writing in the 1950s could not have anticipated. By adhering to an outdated view of what an experiment should be and what is possible with experimentation, political scientists often rule out experimentation as a useful method for many interesting research questions.

There is no perfect or true experiment. The appropriate experimental design depends on the research question, just as is the case with observational data. In fact, the variety of possible experimental designs and treatments is in some ways greater than the range of possibilities with observational data. But how is experimental research different from research with observational data? That is, what is experimental political science?

    2.2 How Experimental Data Differs from Observational Data

The defining characteristic of experimental research is intervention by the researcher in the data-generating process (DGP), which we define as the source of the data we use in research. In experimental research, the variation in the data is partly a consequence of the researcher's decisions at the design stage, before the data are measured. We call the data generated by such intervention experimental data. Nonexperimental empirical research involves using only data in which all the variation is a consequence of factors outside of the control of the researcher; that is, the researcher only observes the DGP, but does not intervene in that process.³ We call this data observational or nonexperimental data.

³ This assumes that the researcher measures the data perfectly; clearly choices made in measurement can result in a type of post-DGP intervention.


3 Other Features of Experimentation

Beyond intervention, some would contend that control and random assignment are "must have" attributes of an experiment. Below we discuss these two additional features.

    3.1 Control

In the out-of-date view of political experimentation, control refers to a baseline treatment that allows a researcher to gather data where he or she has not intervened. But many experiments do not have a clear baseline, and in some cases it is not necessary. For example, suppose a researcher is interested in evaluating how voters choose in a three-party election conducted by plurality rule as compared to how they would choose in an identical three-party election conducted via proportional representation. The researcher might conduct two laboratory elections where subjects' payments depend on the outcome of the election, but some subjects vote in a proportional representation election and others vote in a plurality rule election. The researcher can then compare voter behavior in the two treatments.

In this experiment and many like it, the aspect of control that is most important is not the fact that the researcher has a comparison, but that the researcher can control confounding variables, such as voter preferences and candidate identities, in order to make the comparison meaningful. To make the same causal inference solely with observational data the researcher would need to do two things: (1) rely on statistical methods to control for observable confounding variables, such as control functions in regression equations, propensity scores, or matching methods, and (2) make the untestable assumption that there are no unobservable variables that confound the causal relationship the researcher is attempting to measure.⁴ In observational data, things like candidate identities cannot be held constant, and it may be impossible to match the observable variables. By being able to control these things, an experimentalist can ensure that observable variables are perfectly matched and that many unobservable variables are actually made observable (and made subject to the same matching).
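The difference between the two routes can be sketched in a few lines of code. This is a toy illustration with made-up numbers and names, not the authors' example: the observational route stratifies on an observable confounder (here, education) and remains vulnerable to unobservables, while the laboratory route holds everything but the voting rule fixed by design.

```python
import random

random.seed(1)

# Hypothetical observational records: (rule, education, voted_for_winner).
def make_obs(rule, base):
    return [(rule, e, random.random() < base + 0.1 * e)
            for e in (0, 1, 2) for _ in range(50)]

obs = make_obs("plurality", 0.5) + make_obs("pr", 0.7)

def matched_difference(records):
    """Statistical control: compare rules within education strata, then average.
    Any confounder not in the data still biases this comparison."""
    diffs = []
    for e in (0, 1, 2):
        pr = [v for r, ed, v in records if r == "pr" and ed == e]
        pl = [v for r, ed, v in records if r == "plurality" and ed == e]
        diffs.append(sum(pr) / len(pr) - sum(pl) / len(pl))
    return sum(diffs) / len(diffs)

def lab_difference(n=200):
    """Experimental control: identical candidates and induced preferences,
    so the rule is the only thing that differs between the two elections."""
    pr = sum(random.random() < 0.7 for _ in range(n)) / n
    pl = sum(random.random() < 0.5 for _ in range(n)) / n
    return pr - pl

print(f"matched observational estimate: {matched_difference(obs):.2f}")
print(f"laboratory estimate:            {lab_difference():.2f}")
```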

    3.2 Random Assignment

Most nonexperimentalists think of random assignment as the principal way that an experimentalist deals with unobservable variables. If subjects are randomly assigned to treatments, then the experimentalist can eliminate, within statistical limits, extraneous factors that can obscure the effects that the experimentalist expects to observe (or not observe), such as voter cognitive abilities, gender differences in voter behavior, and so on. For this reason, experimentalists typically attempt to randomly assign manipulations as much as possible, and when random assignment is not perfect they employ statistical techniques to account for uncontrolled variables that can be observed.⁴

⁴ Morton and Williams (2008) discuss these issues in depth.
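A minimal simulation, with assumed numbers, of the point about unobservables: each subject carries an unobserved trait that moves the outcome, but random assignment balances the trait across treatments on average, so a simple difference in means recovers the treatment effect.

```python
import random

random.seed(42)
TRUE_EFFECT = 0.15  # assumed effect of the treatment on the outcome

# Each subject carries an unobserved trait (say, cognitive ability)
# that also moves the outcome.
subjects = []
for _ in range(2000):
    ability = random.gauss(0, 1)
    treated = random.random() < 0.5          # random assignment
    outcome = 0.5 + TRUE_EFFECT * treated + 0.2 * ability
    subjects.append((treated, outcome))

def mean(xs):
    return sum(xs) / len(xs)

treated_mean = mean([y for t, y in subjects if t])
control_mean = mean([y for t, y in subjects if not t])
# The trait is balanced across groups on average, so the simple
# difference in means recovers the treatment effect (about 0.15).
print(f"estimated effect: {treated_mean - control_mean:.3f}")
```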

    3.3 Simulations versus Experiments

Occasionally political scientists use the term "experiment" to refer to a computer simulation that either solves a formal model or explores the implications of an estimated empirical model. We do not include such analyses here. In experimental political science the subjects' choices are independent of the researcher's brain; the subjects affect their own choices in ways that the researcher cannot control. As opposed to simulations, subjects in experiments freely make decisions within the controlled environment constructed by the experimenter.

    3.4 Natural Experiments

Sometimes nature acts in a way that is close to how a researcher, given the choice, would have intervened. For example, hurricane Katrina displaced thousands of New Orleans residents and changed the political make-up of the city; it also had an impact on locations that received large numbers of refugees. This event, although devastating to many, provides researchers the opportunity to evaluate theories of how elected representatives respond to changing constituencies, in particular large changes in the racial and ethnic composition of the population. For example, US Census surveys in the spring of 2006 showed that the African-American population in Orleans parish, Louisiana, fell from 36 percent to 21 percent, and a study by Tulane University and the University of California, Berkeley, found that the percentage of Latinos in the city doubled from 4 percent to 8 percent as many immigrants filled clean-up jobs. Such situations are called natural experiments. In the case of Katrina, it is as if nature were composed of two separate personalities: the one that generated most data, and the interventionist one that occasionally ran experiments like academics, messing up the other personality's data-generating process.

4 The Validity of Experimental Political Science Research

    4.1 Defining Validity

The most important question facing empirical researchers is what can we believe about what we learn from the data, or how valid is the research? The most definitive recent study of validity is provided in the work of Shadish, Cook, and Campbell (2002) (hereafter SCC). They use the term validity to mean the approximate truth of the inference or knowledge claim. As McGraw and Hoekstra (1994) remark, political scientists have adopted a simplistic view of validity based on the early division of Campbell (1957), using internal validity to refer to how robust experimental results are within the experimental data, and external validity to refer to how robust experimental results are outside the experiment.⁵ Again, we note that political scientists are adhering to an outdated view, for there are actually many ways to measure the validity of empirical research, both experimental and observational.

SCC divide validity into four types, each of which is explained below: statistical conclusion validity, internal validity, construct validity, and external validity. Statistical conclusion validity is defined as whether there is statistically significant covariance between the variables the researcher is interested in, and whether the relationship is sizeable. Internal validity is defined as the determination of whether the relationships the researcher finds within the particular data-set analyzed are causal relationships. It is important to note that this is a much narrower definition than generally perceived by many political scientists. In fact, Campbell (1986) relabeled internal validity as "local molar causal validity."

When some political scientists think of internal validity, particularly with respect to experiments that evaluate formal models or take place in the laboratory, they are referring to construct validity. Construct validity has to do with how valid the inferences of the data are for the theory (or constructs) that the researcher is evaluating. SCC emphasize issues of sampling accuracy; that is, is this an appropriate data-set for evaluating the theory? Is there a close match between what the theory is about and what is happening in the manipulated data-generating process? Thus, when political scientists refer to internal validity they are often referring to a package of three things: statistical conclusion validity, local molar causal validity, and construct validity.

    4.2 Are Experiments Externally Valid?

External validity is whether causal inferences established in an empirical analysis hold over variations in persons, settings, treatment variables, and measurement variables. Do the results from one data-set generalize to another? This question is important for any empirical researcher, regardless of whether the data-set is experimental or observational. For example, if we study voter turnout in congressional elections, can we generalize from that study to turnout in mayoral elections, or elections in Germany, or elections held in a laboratory? Yet most political scientists rarely worry about the external validity issue with observational data, but do worry about it with experimental data.

⁵ For an example of how the outdated view of validity still dominates discussions of experimentation in political science, see McDermott (2002, 35-8).


Political scientists typically assume that external validity means that the data-set used for the analysis must resemble some natural or unmanipulated DGP. Since experimentation means manipulation and control of the DGP by the researcher, and an experimental data-set is generated through that manipulation and control, some argue that the results cannot be externally valid by definition. Nevertheless, even with observational data, by choosing variables to measure and study and by focusing on particular aspects of the data, the researcher also engages in manipulation and control through statistical and measurement choices. The researcher who works with observational data simplifies and ignores factors or makes assumptions about those things he or she cannot measure. Thus, the observational data-set is also not natural, since it is similarly divorced from the natural DGP; by the same reasoning those results also cannot be externally valid.

It is a mistake to equate external validity with whether a given data-set used to establish a particular causal relationship resembles the unmanipulated DGP (which can never be accurately measured or observed). Instead, establishing whether a result is externally valid involves replication of the result across a variety of data-sets. For example, if a researcher discovered that more informed voters were more likely to turn out in congressional elections, and then found this also to be true in mayoral elections, in elections in Germany, and in an experiment with control and manipulation, then we would say the result showed high external validity. The same is true of results that originate from experimental analysis. If an experiment demonstrates a particular causal relationship, then whether that relationship is externally valid is determined by examining the relationship across a range of experimental and observational data-sets; it is not ascertained by seeing if the original experimental data-set resembles a hypothesized version of the unmanipulated DGP.

One of the most interesting developments in experimental research has come in laboratory experiments that test the behavioral assumptions in rational choice based models in economics. In order to test these assumptions a researcher must be able to maintain significant control over the choices before an individual and be able to measure subtle differences in preference decisions as these choices are manipulated by the researcher. These things are extremely hard to discern in observational data, since it is almost impossible to measure isolated preference changes with enough variation in choices. The robustness of some of the behavioral violations has been established across a wide range of experimental data-sets. An entire new field within economics, behavioral game theory, has evolved as a consequence, with a whole new set of perspectives on understanding economic behavior; this is something that was almost unthinkable twenty years ago. It is odd that many in political science who disagree with assumptions made in rational choice models are also quick to dismiss the experiments which test these assumptions; these critics claim that the experiments do not have external validity, but external validity is really about the robustness of these experiments across different formulations, not about whether the experiment resembles the hypothesized unmanipulated DGP.


5 Dimensions of Experimentation in Political Science

    5.1 Location: Field, Lab, or Web?

A dimension that is especially salient among political scientists, particularly nonexperimentalists, is location. Laboratory experiments are experiments where the subjects are recruited to a common location, the experiment is largely conducted at that location, and the researcher controls almost all aspects of the environment in that location, except for subjects' behavior. Field experiments are experiments where the researcher's intervention takes place in an environment where the researcher has only limited control beyond the intervention conducted, and where the relationship between researcher and subject often operates through factors outside of the researcher's control, such as postal delivery times or subjects' work schedules. Furthermore, since many of the individuals in a field experiment who are affected by the researcher are not aware that they are being manipulated, such experiments also raise ethical issues not generally involved in laboratory experiments. For example, in Wantchekon's (2003) experiment in Benin, candidates were induced to vary their campaign messages in order to examine the effects of different messages on voter behavior. But voters were unaware that they were subjects in an experimental manipulation.

    5.2 Why Laboratories?

First, laboratory experiments provide a researcher with greater control. For example, suppose that a researcher is conducting a decision-making experiment that evaluates the extent to which different types of campaign advertisements affect voter preferences. In the laboratory, the researcher can show the ads in a specially designed environment that can be held constant across subjects. In the field, however, though subjects can be shown the ads in their homes or via the internet, the researcher loses control over other aspects of the environment in which the ads are watched. If these environmental differences vary systematically with which ads are viewed, or if there are other factors that affect the subjects' viewing of the ads, and if these factors are unobservable by the researcher, then the comparison between ads can be more difficult to determine. The advantage of control in the laboratory also extends to group experiments such as the aforementioned one involving a comparison between voters in plurality versus proportional representation systems.

Second, in the laboratory the researcher has the ability to create environments that simply do not exist in observational data. For example, in the laboratory the researcher can conduct group and individual decision-making experiments on specific voting mechanisms that are used very little, if at all, in any country. Many have proposed that approval voting would be a better choice mechanism than plurality or proportional representation,⁶ and in the laboratory a researcher can create and conduct elections using new procedures that it would be difficult to convince jurisdictions to adopt on a random basis (as in Bassi 2006).

⁶ See for example Brams and Fishburn (1983).

The third advantage of laboratory experiments is that the researcher can induce a wider range of variation than is possible in the field. For example, suppose a researcher theorizes that voters who choose late in sequential voting elections like presidential primaries learn about candidates' policy positions from horse race information about the outcomes of early voting, and that this learning affects their choices in the elections. In the field the researcher can randomly provide horse race information to later voters during a presidential primary contest, and then observe how the provision affects their choices and information measured via survey instruments. But suppose the researcher also expects that the effect is related to the policy positions of the candidates. In the laboratory the researcher can not only provide information randomly, but can also randomly vary the candidates' policy positions to see if the effect is robust to this manipulation; a researcher can do this because the laboratory affords the opportunity to intervene in hundreds of elections, and not just one. Thus, while the number and variety of observations limits the researcher in the field, the experimentalist is able to pick up on subtle differences in the predictions of theories.

Independent of these advantages, laboratories are still necessary for some technologies and manipulations. In particular, social scientists are beginning to use fMRI equipment to measure brain activity as subjects make choices (see Dickson and Scheve 2006); currently such equipment is not portable. Other experiments involve subjects in face-to-face interactions, as in some free-form deliberation experiments that test theories about social interaction and discussion. Such experiments would be extremely difficult, if not impossible, to conduct outside of a laboratory.

    5.3 Why the Field or the Internet?

Field and internet experiments can potentially focus on either the group or the individual, although most are currently individual decision-making experiments. An early example of a field experiment in political science was Gosnell's (1927) study. By sending postcards containing information on voter registration (the postcards also encouraged individuals to vote) to some voters and not to others, he manipulated the environment of the subjects. However, Gosnell did not randomly assign his treatments. More recently, Gerber and Green (2000) replicated Gosnell's experiments making use of current understandings of statistical inference.

Survey experiments are an increasingly popular type of field (and sometimes internet) experiment in political science. In a particularly noteworthy survey experiment, Sullivan, Piereson, and Marcus (1978) compared the question format used by the National Election Studies prior to 1964 with that used later. Researchers using the data had found that the surveys showed significantly different attitudes towards politics over time. But at the same time, the survey questions had changed. Thus, to determine whether attitudes had truly changed, or whether the survey question format changes accounted for the results, Sullivan et al. surveyed the same set of voters but randomly determined which set of questions (pre-1964 or after) each respondent would be asked. The researchers found that the change in the format of the survey implied that the respondents had different political attitudes, and concluded that the empirical analysis comparing the two time periods was fundamentally flawed.

The most exciting advances in survey experimental design can be found in the work of Paul Sniderman and colleagues with respect to computer-assisted telephone interviewing (CATI) (see Sniderman, Brody, and Tetlock 1991). Using computers, telephone interviewers can randomly assign questions to subjects, thereby manipulating the question environment. The National Science Foundation funded program Time-Sharing Experiments in the Social Sciences (TESS) has facilitated the advancement of survey experiments in political science, providing researchers with a large, randomly selected, diverse subject population for such experiments (not to mention the infrastructure needed for conducting the experiments with this population).⁷ For a more comprehensive review of current field experiments, see the chapter by Gerber and Green in this volume.

⁷ Arthur Lupia and Diana Mutz have been particularly influential scholars in this field (see, for example, Lupia and McCubbins 1998 and Mutz 2006).
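The core of such a design can be sketched as follows; the question wordings are placeholders rather than the actual NES items, and the assignment logic is the essential part: the computer, not the interviewer, randomizes which format each respondent receives.

```python
import random

# Placeholder wordings standing in for the pre-1964 and post-1964 NES formats.
FORMATS = {
    "pre1964": "Do you agree or disagree that ...?",
    "post1964": "Which of the following comes closest to your view: ...?",
}

def assign_format(respondent_id):
    """The computer, not the interviewer, randomizes the question format."""
    fmt = random.choice(list(FORMATS))
    return respondent_id, fmt, FORMATS[fmt]

interviews = [assign_format(i) for i in range(1000)]

# Because assignment is random, differences in measured attitudes between
# the two formats can be attributed to the format itself.
n_pre = sum(1 for _, fmt, _ in interviews if fmt == "pre1964")
print(f"{n_pre} of {len(interviews)} respondents received the pre-1964 format")
```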

    5.4 Audience or Goal: Theorists, Empiricists, or Policy-makers?

Experiments vary in both the audience that they address and the goal of the analysis. Experimental political scientists have three types of audiences: theorists, empiricists, and policy-makers.⁸

⁸ We build on distinctions discussed in Davis and Holt (1993) and Roth (1994).

5.4.1 Theory Testing

As in a number of observational studies, some experimentation in political science evaluates nonformal theories. A nonformal theory or model is one in which hypotheses are proposed about the DGP, but the assumptions behind the hypotheses are not precisely stated. One example of such work is Lau and Redlawsk's (2001) study, which evaluates a theory from political psychology, hypothesizing that voters use cognitive heuristics in decision-making. Another example of nonformal theory evaluation using experiments is contained in the work of Mutz (2006) who, using both experimental and observational work, challenges arguments of political theorists about the benefits of face-to-face deliberative interactions. The relationship between the theory and experimental data in nonformal theory testing is much the same as the relationship between the theory and observational data in similar evaluations of nonformal theory, and is already familiar to political scientists.⁹

⁹ For a more expansive recent review of the political-psychological experimental research, see McDermott (2002). Morton (1999) reviews and discusses the differences between formal and nonformal theory evaluation as well as a number of other examples of such work.

Other experimental work in political science, recently reviewed in Palfrey (2005), evaluates formal models with experimental data; Fiorina and Plott (1978) is an example of one of the first such experimental tests. A formal theory or model is one in which assumptions made regarding the DGP are explicitly stated and the implications of those assumptions are then solved for, usually using tools from mathematics. Formal theory testing using experiments is distinctive from most of the empirical research conducted by political scientists for two reasons. First, researchers can use the explicit nature of the model's assumptions to guide the experimental design, yielding a close link between the theory and the experimental manipulations. Second, researchers can directly evaluate the underlying assumptions of their theories in experiments, which is not possible when evaluating nonformal theories where the assumptions are not stated, or when using observational data to evaluate formal models. Thus we expand on how these experiments proceed in the next subsection.

    5.5 An Example: Theory Testing of Formal Models

In formal models the assumptions are of two types: assumptions about institutional and other exogenous factors, such as voting rules, bargaining procedures, and preference distributions; and assumptions about the behavioral choices of the political actors in the context of these exogenous factors. For example, suppose a formal model considers three-candidate elections. In the model, voters in plurality rule elections without majority requirements are more likely to vote strategically for their second preference, while in plurality rule elections with majority requirements they are more likely to vote sincerely for their first preference. The model makes assumptions about both electoral institutions and voter rationality, and the predictions are behavioral.

An experimentalist can create the electoral institution and then pay subjects based on the outcomes of voting in a way that induces a preference ordering over the candidates. An example of such a payment schedule is shown in Table 14.1. Note that the payoffs are received based on who wins, not based on how a voter chooses. For example, if a subject who is assigned to be Type 1 voted for candidate A, and that candidate wins the election, then that subject would receive $1 for the election period. However, if that subject chose candidate B but candidate C wins, the subject would receive only $0.10.

Table 14.1. An example payoff schedule in a voting experiment

                Winning candidate ($)
Voter type      A       B       C       Number of voters
1               1.00    0.75    0.10    3
2               0.75    1.00    0.10    3
3               0.25    0.25    1.00    4

Under the plurality rule, if all voters voted sincerely for the candidate for whom they would receive the highest payoff, then candidate C would win. But if, for example, voters who are Type 1 vote strategically for B while the other voters choose sincerely, B would win. Similarly, if voters who are Type 2 vote strategically for A while the other voters choose sincerely, A would win. As Myerson and Weber (1993) demonstrate formally, all three of these voting outcomes are equilibria in pure strategies.

Suppose now, however, that there is a majority requirement; that is, in order for a candidate to win, he or she must receive at least 50 percent of the vote. If no candidate receives more than 50 percent of the vote, a run-off election is held between the two top vote receivers. In such a case, even if all voters vote sincerely in the first round election and candidate C wins that vote, C would have to face either A or B in a run-off (assuming that ties are broken by random draws). In a run-off, either A or B would beat C, so C cannot win. Of course voters of Types 1 and 2 might engage in strategic voting in the first round (as discussed above), thereby avoiding the need for a run-off. Morton and Rietz (2006) formally demonstrate that when there are majority requirements and voters perceive that there is some probability, albeit perhaps small, that their votes can affect the electoral outcome, the only equilibrium is for voters to vote sincerely in the first stage and let the random draw for second place choose which candidate faces C in a run-off (and then wins the election).

In the end, this theory suggests that strategic voting (in our payoff matrix) is not likely when there are majority requirements, but is possible when these requirements do not exist. These predictions depend on assumptions about payoffs, voting rules, and the distribution of voter types, as well as voter rationality. An experimenter can create a situation that closely matches these theoretical assumptions about payoffs, voting rules, and the distribution of voter types, and then evaluate whether individuals act in accordance with the theoretical predictions. The results from such an experiment tell the theorist about the plausibility of the theory's assumptions about human behavior.
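The logic of this example can be checked mechanically. The sketch below encodes the Table 14.1 payoffs and tallies winners under sincere and strategic profiles, with and without a majority requirement; it illustrates the example's arithmetic, not the Morton and Rietz (2006) equilibrium analysis, and it breaks ties deterministically where the theory uses random draws.

```python
from collections import Counter

# Table 14.1: payoff to each voter type when a given candidate wins.
PAYOFFS = {1: {"A": 1.00, "B": 0.75, "C": 0.10},
           2: {"A": 0.75, "B": 1.00, "C": 0.10},
           3: {"A": 0.25, "B": 0.25, "C": 1.00}}
N_VOTERS = {1: 3, 2: 3, 3: 4}

def winner(profile, majority_required=False):
    """profile maps each voter type to the candidate that type votes for."""
    tally = Counter()
    for vtype, candidate in profile.items():
        tally[candidate] += N_VOTERS[vtype]
    top, votes = tally.most_common(1)[0]
    if majority_required and votes <= sum(N_VOTERS.values()) / 2:
        # Run-off between the two top vote-getters; in the run-off everyone
        # votes sincerely for whichever finalist pays them more.
        finalists = [c for c, _ in tally.most_common(2)]
        runoff = Counter()
        for vtype in N_VOTERS:
            best = max(finalists, key=lambda c: PAYOFFS[vtype][c])
            runoff[best] += N_VOTERS[vtype]
        return runoff.most_common(1)[0][0]
    return top

sincere = {1: "A", 2: "B", 3: "C"}
print(winner(sincere))                          # plurality, all sincere: C wins
print(winner({1: "B", 2: "B", 3: "C"}))         # Type 1 votes strategically: B wins
print(winner(sincere, majority_required=True))  # sincere voting forces a run-off,
                                                # which C loses to a finalist
```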

    5.6 Searching for Facts

Political scientists, like economists, also use experiments to search for facts. For example, as noted in the aforementioned voting game (Table 14.1), when votes are counted according to the plurality rule there are multiple equilibria. Myerson and Weber (1993) conjecture (but do not prove) that campaign contributions and public opinion polls can serve as a coordinating device for voters of Types 1 and 2; that is, they argue that these types of voters would like to coordinate on a common candidate. One way such coordination may be facilitated is by focusing on whichever candidate is ahead in a public opinion poll or in campaign contribution funds prior to the election. However, there is no basis within the Myerson and Weber theory on which to argue that voters will use polls or contributions in this fashion. Researchers have used experiments to see if voters do use polls and campaign contributions to coordinate, and the results support Myerson and Weber's conjecture that polls and contributions can serve as a coordination device (Rietz 2003).

In contrast to the approach of relaxing theoretical assumptions or investigating conjectures in the context of a theoretically driven model as above, some prominent political scientists see searching for facts through experiments as a substitute for theorizing. For example, Gerber and Green (2002) advocate experiments as an opportunity to search for facts with little or no theory, remarking (2002, 820): "The beauty of experimentation is that one need not possess a complete theoretical model of the phenomenon under study. By assigning . . . at random, one guarantees that the only factors that could give rise to a correlation . . . occur by chance alone." Although they go on to recognize the value of having theoretical underpinnings more generally, they clearly see this as simply being more efficient; they do not view a theoretical model as a necessary component in the experimental search for facts.

    5.7 The Need for Theory in Experimentation

How necessary is a well-articulated theory for learning from experiments? Is it possible to discover knowledge without theory? Is theorizing merely an aid to experimental political science, but not a requirement? The argument for this position is based on the presumption that causality can be determined without theory: by the trial and error of manipulating various causes and determining their effects through experimentation, facts and nuances can be accumulated that eventually lead to greater knowledge (with or without theory). But what does it mean to measure causality in the experimental setting? Measuring causality requires that a researcher think hypothetically or theoretically, having a model of the data-generating process.

To see how theorizing is necessary when searching for facts, consider a simple field experiment on voter information. Suppose that a researcher randomly assigns a set of voters in an upcoming election to a treatment group that receives informative campaign material, and a control group that does not. The researcher then measures how voters in the treated and control groups voted. In order to make a causal inference about the effect of the treatment on voter behavior, researchers who are searching for facts like Gerber and Green typically take a Rubin Causal Model approach (hereafter RCM) to establishing causality (see Holland 1988; Rubin 1974). The researcher using an RCM approach to establish a causal inference must hypothesize that for each voter in each group there are two potential behavioral choices: a choice the voter makes when fully informed and a choice the voter makes when uninformed. Of course, because only one state of the world is ever observed for any given voter, the researcher must hypothesize that these counterfactual worlds exist (even though she knows that they do not). Furthermore, the researcher must then make what is called the stable unit treatment value assumption (SUTVA), which demands no cross-effects of treatment from one subject onto another's choices, homogeneity of treatment across subjects, and a host of other implicit assumptions about the DGP that are often left unexplored (Morton and Williams 2008). Finally, the researcher must also typically make the untestable assumption that unobservable variables do not confound the effect that she is measuring; that there are no selection effects that interfere with the randomization she is using in the experimental design. In the end, searching for facts even in field experiments requires that a researcher theorize. It is not possible to search for facts and discover nuances of the DGP through trial and error without theorizing about that process.
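The counterfactual reasoning behind RCM can be made concrete in a short simulation with assumed quantities: each voter has two potential choices, only one of which is ever revealed, yet random assignment lets the difference in observed group means estimate the average effect defined by the missing counterfactuals.

```python
import random

random.seed(7)

# Potential outcomes for each voter: turnout choice if uninformed (y0)
# and if informed (y1). These are hypothesized, never jointly observed.
voters = []
for _ in range(5000):
    y0 = random.random() < 0.45            # would vote without the material
    y1 = y0 or (random.random() < 0.10)    # the material can only add votes here
    voters.append((y0, y1))

# Random assignment reveals exactly one potential outcome per voter.
# SUTVA holds by construction: no voter's treatment affects another's choice.
observed = []
for y0, y1 in voters:
    treated = random.random() < 0.5
    observed.append((treated, y1 if treated else y0))

def group_mean(treated_flag):
    ys = [y for t, y in observed if t == treated_flag]
    return sum(ys) / len(ys)

true_ate = sum(y1 - y0 for y0, y1 in voters) / len(voters)
est_ate = group_mean(True) - group_mean(False)
print(f"true ATE: {true_ate:.3f}  estimated ATE: {est_ate:.3f}")
```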

    5.8 A Test Bed for Political Methodologists

Searching for facts is not the only way in which experimental research can provide useful information for empirical scholars. Methodologists often test their approaches on simulated data to see if known parameters can be recovered. However, experiments can serve as a behavioral test bed. The use of experimental data to test methodologies goes back to at least the work of LaLonde (1986), who compared experimental estimates of the impact of a training program with a range of nonexperimental estimates of the same program. LaLonde found that the nonexperimental estimates varied widely, and that although some came close to the experimental estimates, there was no a priori way for the empirical researcher to determine which estimates were more reliable. This analysis led to many new developments in the econometric literature on the estimation of treatment effects in observational data (for more see Morton and Williams 2008).
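A miniature version of LaLonde's exercise, with synthetic data and an assumed true effect, shows why the experimental benchmark matters: the randomized comparison recovers the effect, while a naive comparison on self-selected observational data is badly biased.

```python
import random

random.seed(0)
TRUE_EFFECT = 2.0  # assumed program impact on earnings (synthetic)

def outcome(ability, trained):
    return 10 + 3 * ability + TRUE_EFFECT * trained + random.gauss(0, 1)

abilities = [random.gauss(0, 1) for _ in range(20000)]

def diff_in_means(assign):
    """Average outcome of trained minus untrained under an assignment rule."""
    trained = [outcome(a, True) for a in abilities if assign(a)]
    untrained = [outcome(a, False) for a in abilities if not assign(a)]
    return sum(trained) / len(trained) - sum(untrained) / len(untrained)

# Experimental benchmark: random assignment, estimate close to 2.0.
print(f"experimental estimate:  {diff_in_means(lambda a: random.random() < 0.5):.2f}")

# Observational data: low-ability workers self-select into training, so the
# naive comparison is badly biased (here it even flips sign).
print(f"observational estimate: {diff_in_means(lambda a: a < 0):.2f}")
```

The experimental estimate then serves as the yardstick against which competing nonexperimental estimators can be judged, which is precisely the role laboratory data play in the study discussed next.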

More recently, Frechette, Kagel, and Morelli (2005) compare empirical procedures used by Ansolabehere et al. (2005), as well as Warwick and Druckman (2001), to evaluate models of legislative bargaining over ministry divisions when forming governments. Using observational data, researchers can only evaluate the ex post distributions and argue that they are the same as predicted by a given theory, since it is impossible to observe the secret bargaining process that the legislators typically employ. Ansolabehere et al. (2005) find support for the Baron-Ferejohn legislative bargaining model (see Baron and Ferejohn 1989) with observational data from a cross-section of industrialized democracies. However, using the same data, Warwick and Druckman (2001) find support for a demand-bargaining model (see Morelli 1999).

Which empirical study is correct? Frechette, Kagel, and Morelli use data from laboratory experiments where the underlying bargaining framework was controlled by the researchers. In some treatments the underlying bargaining model is a Baron-Ferejohn game and in others it is a demand-bargaining game. In the laboratory, subjects are paid based on their choices, and thus they are induced to have preferences as hypothesized by the two theories. Frechette, Kagel, and Morelli (2005) then take the data from each treatment individually and apply the estimation strategies employed by Ansolabehere et al. and Warwick and Druckman to the experimental data; they do so as if they did not know the underlying bargaining game used in that treatment, as if the data were observational. Frechette, Kagel, and Morelli (2005) find that the empirical estimation strategies fail to distinguish between the different bargaining procedures used in the laboratory. Thus, their results show that some standard empirical strategies used by political scientists on observational data are unlikely to identify correctly the underlying bargaining game, and they explain how two researchers using the same observational data may find support for two different theories of legislative bargaining.

5.9 Experimentalists as Politicians' Assistants

Finally, political scientists also conduct experiments in order to speak to policymakers. Gosnell's (1927) early field experiment was arguably an attempt to whisper in the ears of princes; it is well known that he was highly involved in everyday politics in his Chicago community. Gerber and Green (2004) have written a book, based on their experimental research, on techniques of mobilizing voters; it is expressly addressed to those involved in election campaigns.

Moreover, Gerber and Green (2002) have argued that the relationship between political experimentalists in field experiments and actors in politics is a potentially interactive one, and that increased experimentation by political scientists will increase the relevance of political science research for political actors. For example, suppose that an interest group is uncertain about which advertising strategy is likely to work in an upcoming election. If the interest group is induced by an experimentalist to randomize its advertisements across television markets, both the experimentalist and the interest group can gain knowledge about the causal effects of such advertisements; presumably, the interest group can then use this knowledge to make more efficient future choices.

But should political scientists, as Gerber and Green advocate, increase their focus on experiments that can interest and be directly relevant to the nonacademic community? Certainly, some experiments can serve this purpose, and there are questions about politics that are of interest to both academics and nonacademics. But there are risks, given that nonacademics may have financial resources to pay for the services of academics; the weight of dollars can affect which questions are asked, and how they are asked. Furthermore, nonacademics may not be as willing to share information gathered through such research, given that politics is a competitive arena.


6 Concluding Remarks

Political science is an experimental discipline. Nevertheless, we argue that most political scientists have a view of experimentation that fits with a 1950s understanding of research methodology, and one that especially misunderstands the issue of external validity. These false perceptions lead many political scientists to dismiss experimental work as not being relevant to substantive questions of interest. In this chapter we have provided an analysis of twenty-first-century experimentation in political science, and have demonstrated the wide variety of ways in which experimentation can be used to answer interesting research questions while speaking to theorists, empiricists, and policy-makers.

We expect the use of experimentation techniques to continue to rise in political science, just as it has in other social sciences. In particular, we expect that the technological revolution will continue to expand the opportunities for experimentation in political science in two important ways. First, the expansion of interactive Web-based experimental methods will allow researchers to conduct large-scale experiments testing game-theoretic models that have previously only been contemplated. For example, through the internet, it is possible for researchers to consider how voting rules might affect outcomes in much larger electorates than has been possible in the laboratory. Second, advances in brain-imaging technology will allow political scientists to explore much more deeply the connections between cognitive processes and political decisions. These two types of experiments will join with traditional laboratory and field experiments to transform political science into a discipline where experimental work will one day be as prevalent as traditional observational analyses.

References

Ansolabehere, S., and Iyengar, S. 1997. Going Negative: How Political Advertisements Shrink and Polarize the Electorate. New York: Free Press.

Ansolabehere, S., Strauss, A., Snyder, J., and Ting, M. 2005. Voting weights and formateur advantages in the formation of coalition governments. American Journal of Political Science, 49: 550-63.

Baron, D., and Ferejohn, J. 1989. Bargaining in legislatures. American Political Science Review, 83: 1181-206.

Bassi, A. 2006. Experiments on approval voting. Working paper, New York University.

Beck, N. 2000. Political methodology: a welcoming discipline. Journal of the American Statistical Association, 95: 651-4.

Brams, S., and Fishburn, P. 1983. Approval Voting. Cambridge, Mass.: Birkhauser.

Campbell, D. T. 1957. Factors relevant to the validity of experiments in social settings. Psychological Bulletin, 54: 297-312.

Campbell, D. T. 1986. Science's social system of validity-enhancing collective belief change and the problems of the social sciences. Ch. 19 in Selected Papers, ed. E. S. Overman. Chicago: University of Chicago Press.

Davis, D., and Holt, C. 1993. Experimental Economics. Princeton, NJ: Princeton University Press.

Dickson, E., and Scheve, K. 2006. Testing the effect of social identity appeals in election campaigns: an fMRI study. Working paper, Yale University.

Fiorina, M., and Plott, C. 1978. Committee decisions under majority rule. American Political Science Review, 72: 575-98.

Frechette, G., Kagel, J., and Morelli, M. 2005. Behavioral identification in coalitional bargaining: an experimental analysis of demand bargaining and alternating offers. Econometrica, 73: 1893-937.

Gerber, A., and Green, D. 2000. The effects of canvassing, direct mail, and telephone calls on voter turnout: a field experiment. American Political Science Review, 94: 653-63.

Gerber, A., and Green, D. 2002. Reclaiming the experimental tradition in political science. Pp. 805-32 in Political Science: State of the Discipline, ed. I. Katznelson and H. V. Milner. New York: W. W. Norton.

Gerber, A., and Green, D. 2004. Get out the Vote: How to Increase Voter Turnout. Washington, DC: Brookings.

Gosnell, H. 1927. Getting out the Vote: An Experiment in the Stimulation of Voting. Chicago: University of Chicago Press.

Holland, P. W. 1988. Causal inference, path analysis, and recursive structural equation models (with discussion). Pp. 449-93 in Sociological Methodology, ed. C. C. Clogg. Washington, DC: American Sociological Association.

Kinder, D., and Palfrey, T. R. 1993. On behalf of an experimental political science. Pp. 1-42 in Experimental Foundations of Political Science, ed. D. Kinder and T. Palfrey. Ann Arbor: University of Michigan Press.

LaLonde, R. 1986. Evaluating the econometric evaluations of training programs. American Economic Review, 76: 604-20.

Lau, R. R., and Redlawsk, D. P. 2001. Advantages and disadvantages of cognitive heuristics in political decision making. American Journal of Political Science, 45: 951-71.

Lupia, A., and McCubbins, M. 1998. The Democratic Dilemma: Can Citizens Learn What They Need to Know? Cambridge: Cambridge University Press.

McDermott, R. 2002. Experimental methods in political science. Annual Review of Political Science, 5: 31-61.

McGraw, K., and Hoekstra, V. 1994. Experimentation in political science: historical trends and future directions. Pp. 3-30 in Research in Micropolitics, vol. iv, ed. M. Delli Carpini, L. Huddy, and R. Y. Shapiro. Greenwood, Conn.: JAI Press.

Morelli, M. 1999. Demand competition and policy compromise in legislative bargaining. American Political Science Review, 93: 809-20.

Morton, R. 1999. Methods and Models: A Guide to the Empirical Analysis of Formal Models in Political Science. Cambridge: Cambridge University Press.

Morton, R., and Rietz, T. 2006. Majority requirements and strategic coordination. Working paper, New York University.

Morton, R., and Williams, K. 2001. Learning by Voting: Sequential Choices in Presidential Primaries and Other Elections. Ann Arbor: University of Michigan Press.

Morton, R., and Williams, K. 2008. Experimental political science and the study of causality. Unpublished manuscript, New York University.

Mutz, D. C. 2006. Hearing the Other Side: Deliberative Versus Participatory Democracy. Cambridge: Cambridge University Press.

Myerson, R., and Weber, R. 1993. A theory of voting equilibria. American Political Science Review, 87: 102-14.

Palfrey, T. 2005. Laboratory experiments in political economy. Center for Economic Policy Studies Working Paper No. 111, Princeton University.

Rietz, T. 2003. Three-way experimental election results: strategic voting, coordinated outcomes and Duverger's Law. In The Handbook of Experimental Economics Results, ed. C. R. Plott and V. L. Smith. Amsterdam: Elsevier Science.

Roth, A. 1994. Introduction to experimental economics. In Handbook of Experimental Economics, ed. J. Kagel and A. Roth. Princeton, NJ: Princeton University Press.

Rubin, D. B. 1974. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66: 688-701.

Shadish, W. R., Cook, T. D., and Campbell, D. T. 2002. Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston, Mass.: Houghton Mifflin.

Sniderman, P. M., Brody, R. A., and Tetlock, P. E. 1991. Reasoning and Choice: Explorations in Political Psychology. New York: Cambridge University Press.

Sullivan, J. L., Piereson, J. E., and Marcus, G. E. 1978. Ideological constraint in the mass public: a methodological critique and some new findings. American Journal of Political Science, 22: 233-49.

Wantchekon, L. 2003. Clientelism and voting behavior: evidence from a field experiment in Benin. World Politics, 55: 399-422.

Warwick, P., and Druckman, J. 2001. Portfolio salience and the proportionality of payoffs in coalition government. British Journal of Political Science, 31: 627-49.