+ Lab Experiments for Measurement in Program Evaluation
Michael J. Gilligan, New York University
Slide 2
+ The Task
Government/NGO/CBO programs wish to change participants' attitudes and beliefs in particular ways.
Typically these programs coach participants in the "right" set of attitudes and beliefs.
Examples:
Pro-social behaviors: contributions to public goods, trust, tolerance, non-violence and so on.
Attitudes and behaviors toward marginalized groups: women, minorities, particular ethnic groups.
These programs would like to be able to measure whether their efforts have been successful.
Slide 3
+ The Problem
Randomized control trials are essential for making causal statements about the effects of the program.
But randomized control trials are not a solution to the measurement problem; indeed, they are a hindrance to it.
RCT programmers operate only with treated populations, so only treated populations receive coaching on the "right" responses.
RCTs, the very thing that ensures unbiasedness with respect to subject pools (balance), introduce bias in measurement.
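The coaching problem can be illustrated with a minimal simulation sketch. Everything in it is a hypothetical assumption rather than part of the original slides: attitudes are drawn from a standard normal distribution, treatment is assigned by a fair coin flip, and coaching adds a fixed `coaching_shift` to the survey answers of treated respondents only.

```python
import random
from statistics import mean


def simulate_survey_rct(n=20000, true_effect=0.0, coaching_shift=0.5, seed=0):
    """Return the treated-minus-control difference in mean survey responses.

    Assumed (hypothetical) data-generating process:
      - each respondent has a latent attitude ~ Normal(0, 1)
      - treatment is assigned with probability 0.5 (balanced in expectation)
      - treated respondents' answers shift by true_effect + coaching_shift
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        attitude = rng.gauss(0, 1)
        if rng.random() < 0.5:
            # Only treated respondents were coached on the "right" answer.
            treated.append(attitude + true_effect + coaching_shift)
        else:
            control.append(attitude)
    return mean(treated) - mean(control)


# With no true effect, the survey contrast still recovers roughly the
# coaching shift, even though randomization balanced the latent attitudes.
estimate = simulate_survey_rct(true_effect=0.0, coaching_shift=0.5)
```

Under these assumptions the estimate approximates `true_effect + coaching_shift`, so randomization alone cannot separate a real attitude change from learned survey answers.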
Slide 4
+ Social Capital & Pro-social Attitudes
Slide 5
+ Definition
"[S]ocial networks and the norms of reciprocity and trustworthiness that arise from them. [S]ocial capital is closely related to ... 'civic virtue.'"
The difference is that civic virtue is more powerful when embedded in a dense network of reciprocal social relations. A society of many virtuous but isolated individuals is not necessarily rich in social capital (Putnam 2000).
Slide 6
+ We are interested in measuring:
Altruism
Trust
Trustworthiness
Willingness to contribute to public goods
The social networks that (purportedly) support these behaviors
Slide 7
+ Implications for Development
Trust: crucial for cost-effective self-enforcement of contracts
Compliance with social norms: non-violence, compromise, fairness
Contributions to public goods: essential for economic efficiency
Respect for legitimate sources of authority
Slide 8
+ A Few Findings (among many)
Putnam (1993) shows that local governments in Italy are more efficient where there is greater civic engagement.
Knack and Keefer (1997) demonstrate that increases in country-level trust lead to large increases in the country's economic growth.
La Porta et al. (1997) establish a strong positive link between trust and judicial efficiency and a strong negative link between trust and corruption.
Slide 9
+ Implications
The World Bank and other international actors have many programs to foster social capital and pro-sociality:
Community-based DDR
Community-driven development programs
A focus on local capacity in development efforts
Local ownership of development programs to foster sustainability
Slide 10
+ Measuring Social Capital and Social Norms
These are very difficult concepts to measure.
In many cases they are not observed directly.
Indicators differ greatly across cultures.
People are often unwilling to reveal behavior that is not pro-social.
Slide 11
+ Traditional survey measures
Generally speaking, would you say that most people can be trusted or that you can't be too careful in dealing with people? (World Values Survey)
Would you be willing to contribute a day of free time to ?
How difficult do you think it would be for your community to reach agreement on ?
In the last three months have you contributed time or money to a community-based organization?
Did you vote in the last election?
Slide 12
+ Bias concerns with surveys
Programmers coach respondents in the "right" answers to these types of questions.
They do not operate in control communities at all, so respondents there may not even know the "right" answers.
Slide 13
+ Observational Measures
Number of people who voted in the last election
Number of people who show up to clean up a public park
Contributions to a community fund
Slide 14
+ The measures have great external (real-world) validity, but:
Are we measuring social attitudes? Or leadership strength? Or intimidation? Or corruption?
Example: Voter turnout in the Soviet Union was routinely above 98 percent.
Good outcomes may be caused by the exact opposite of good institutions and pro-social attitudes.
Slide 15
+ Structured Observational Measures
Structured Community Activities (Casey, Glennerster and Miguel):
Funds collected in matching-grant scheme
Decision making over allocating salt or batteries
Allocation of tarpaulin
Tuungane Project, Congo (Humphreys, Sanchez de la Sierra and van der Windt 2013):
Participation in matching funds for a public good
Allocation of a $100 windfall
Participation in a community meeting
Slide 16
+ Structured Observational Measures
Structured and therefore more comparable to each other.
Have great external validity, but we still cannot disentangle individual factors (attitudes) from community-wide factors (leadership, institutions).
Slide 17
+ Lab-in-the-Field Activities
Observing behavior in a controlled laboratory setting.
All social pressures, political and institutional effects, etc., are removed by the design of the experiment.
We observe only people's responses to the incentives that we (the experimenters) offer them.
We are able to disentangle attitudes from community-wide factors.
Slide 18
+ Loss in External Validity
Community-wide factors (leadership, institutional efficiency) are excluded from the lab, so we cannot obtain measures of them.
Thus lab activities are best combined with the other measurement methods.
Slide 19
+ Behavioral games
Three important games are:
Altruism game
Trust game
Public goods game
Our main interest is in the altruism, trust and public goods games, but we also need to conduct the other games to control for risk attitudes, patience and altruism.
Slide 20
+ Game Instruction
Slide 21
+
Slide 22
+ Altruism Activity
Subjects were given a sum of money:
Nepal: 40 NPR in 5 NPR notes
Sudan: 3 pounds in half-pound coins
Cambodia: 16,000 KHR in 4,000 KHR notes
Subjects decide how much they want to contribute to a local needy family.
The identity of the family is not revealed.
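Because the endowments differ across the three countries, one way to compare altruism across sites is to express each gift as a share of the endowment. The sketch below does that using the amounts and denominations listed on the slide. The function names and the share-based measure are illustrative assumptions, not a description of the authors' actual analysis.

```python
from fractions import Fraction

# (endowment, note or coin size), taken from the slide.
ENDOWMENTS = {
    "Nepal": (Fraction(40), Fraction(5)),           # 40 NPR in 5 NPR notes
    "Sudan": (Fraction(3), Fraction(1, 2)),         # 3 pounds in half-pound coins
    "Cambodia": (Fraction(16000), Fraction(4000)),  # 16,000 KHR in 4,000 KHR notes
}


def feasible_gifts(country):
    """List every amount a subject could give, in steps of one note or coin."""
    total, unit = ENDOWMENTS[country]
    n_units = int(total / unit)
    return [unit * k for k in range(n_units + 1)]


def share_given(country, amount):
    """Hypothetical altruism measure: gift as a fraction of the endowment."""
    total, _ = ENDOWMENTS[country]
    amount = Fraction(amount)
    if amount not in feasible_gifts(country):
        raise ValueError(f"{amount} is not a feasible gift in {country}")
    return amount / total


# A subject in Nepal who gives two 5 NPR notes gives a quarter of the endowment.
nepal_share = share_given("Nepal", 10)
```

Exact fractions are used instead of floats so that half-pound amounts compare cleanly against the list of feasible gifts.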
Slide 23
+
Slide 24
+ Trust/Trustworthiness Activity
Subjects are randomly assigned to one of two roles: sender or receiver (we use neutral names in the field).
Both types are given an initial endowment of money.
Senders decide how much of their endowment to send to the receiver.
We triple that amount and give it to the receiver.
The receiver decides how much of this total to return to the sender.
All players and types are anonymous.
Nash: send zero, return zero.
Social optimum: send full endowment, return whatever is necessary to support trusting behavior.
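The payoff arithmetic of the trust activity can be sketched as follows. Two points are assumptions made for illustration: the receiver can return at most the tripled transfer (the slide leaves open whether the receiver's own endowment is also returnable), and the function name and example amounts are hypothetical.

```python
def trust_game_payoffs(endowment, sent, returned, multiplier=3):
    """Return (sender_payoff, receiver_payoff) for one sender-receiver pair.

    Assumption: the receiver may return only out of the multiplied transfer,
    not out of the receiver's own endowment.
    """
    if not 0 <= sent <= endowment:
        raise ValueError("sent must be between 0 and the endowment")
    received = multiplier * sent  # the experimenters triple the amount sent
    if not 0 <= returned <= received:
        raise ValueError("returned must be between 0 and the tripled transfer")
    sender_payoff = endowment - sent + returned
    receiver_payoff = endowment + received - returned
    return sender_payoff, receiver_payoff


# Nash prediction: nothing sent, nothing returned; both keep their endowment.
nash = trust_game_payoffs(endowment=10, sent=0, returned=0)

# Full trust with half of the tripled amount returned: both do better than Nash.
cooperative = trust_game_payoffs(endowment=10, sent=10, returned=15)
```

With an endowment of 10, the Nash outcome leaves each player with 10, while full trust with 15 returned gives the sender 15 and the receiver 25. That gap is why the amount sent is read as trust and the amount returned as trustworthiness.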
Slide 25
+
Slide 26
+ Public Goods Game
All subjects play simultaneously.
Each player is given two cards, one with an X and one blank.
For each X card turned in during the first round, all players receive an amount of money, say 4 NPR.
Turning in an X card in the second round earns the player who turned it in a larger amount, say 20 NPR.
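A minimal sketch of the payoffs in this game, using the example amounts from the slide (4 NPR and 20 NPR). It assumes that each player turns in the X card exactly once, in either the first round (a public contribution) or the second round (a private payout). The function name is hypothetical.

```python
def public_goods_payoffs(contributes, public_return=4, private_return=20):
    """Return each player's payoff in NPR.

    contributes[i] is True if player i turns in the X card in round 1
    (benefiting everyone) and False if the player holds it for round 2
    (benefiting only that player). Assumes one X card per player.
    """
    n_contributors = sum(contributes)
    return [
        public_return * n_contributors + (0 if c else private_return)
        for c in contributes
    ]


# Six players: universal contribution pays 24 each, universal defection 20 each.
all_contribute = public_goods_payoffs([True] * 6)
none_contribute = public_goods_payoffs([False] * 6)

# A lone contributor among three players earns far less than the free riders.
mixed = public_goods_payoffs([True, False, False])
```

Holding the X card back is always individually better by 16 NPR (20 minus the 4 the player would receive from their own contribution). Under these example amounts the group as a whole gains from contribution only when there are more than five players.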
Slide 27
+ Attitudes Toward Marginalized Groups
Slide 28
+ Examples
Many programs are interested in improving the status of marginalized groups, especially women.
Governments/NGOs/CBOs are often interested in easing (often violent) ethnic rivalries, especially in post-conflict settings.
Slide 29
+ Same Problem
RCT programmers operate only with treated populations, so only treated populations receive coaching on the "right" responses.
RCTs, the very thing that ensures unbiasedness with respect to subject pools (balance), introduce bias in measurement.
Slide 30
+ A Variety of Options
Standard games (altruism, trust, public goods, etc.) can be used to measure attitudes toward out-groups.
Bracic 2013: attitudes toward Roma in the former Yugoslavia
Observing behavior in deliberation, cooperation and teamwork among mixed groups.
Karpowitz and Mendelberg 2014: deliberation in mixed groups of men and women
Slide 31
+ Observing group behavior
Bales Interaction Process Analysis:
Participants are given a task that requires a group decision or cooperation.
Record interactions according to a specific set of criteria to code whatever the researcher is interested in measuring (respect, hostility, etc.).
The trick:
Not cuing participants that this is a study of in-group/out-group interaction
Incentivizing participants to act according to their beliefs about the out-group
Slide 32
+ Example: Attitudes toward Gender and Ethnicity in the Liberian National Police (LNP)
The government of Liberia adopted an explicit 30% quota for women in the LNP.
We did NOT conduct an RCT, but we were interested in testing some of the assumptions of the gender program.
Slide 33
+ Program proponents claimed that more women would produce a variety of benefits:
More consensual decision making
Greater sensitivity to gendered crimes
Decades of social psychology findings suggest that women would not participate fully in group deliberations.
Slide 34
+ The program had been underway for several years, so officers knew the attitudes toward female officers that they were supposed to have.
Thus a survey would not have been a convincing measurement strategy.
We had groups of six officers complete team tasks and randomized the number of female officers in each group.
We observed team members to see if men reacted differently in groups with more women.
We tested whether groups with more women deliberated more consensually and were more likely to see crime as gendered.
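Randomizing the number of female officers per group could be sketched as below. This is an illustrative assumption, not the study's actual assignment procedure: the officer IDs, the group size default, and the list of target female counts are all hypothetical placeholders.

```python
import random


def draw_groups(men, women, female_counts, group_size=6, seed=0):
    """Form one group per entry in female_counts.

    Each group gets that many randomly drawn women and is filled up to
    group_size with randomly drawn men. All inputs are hypothetical.
    """
    rng = random.Random(seed)
    men, women = list(men), list(women)
    rng.shuffle(men)
    rng.shuffle(women)
    counts = list(female_counts)
    rng.shuffle(counts)  # randomize the order in which compositions are filled
    groups = []
    for k in counts:
        if k > len(women) or group_size - k > len(men):
            raise ValueError("not enough officers left to fill this group")
        members = [women.pop() for _ in range(k)]
        members += [men.pop() for _ in range(group_size - k)]
        groups.append(members)
    return groups


# Hypothetical pool of 12 male and 12 female officers, four group compositions.
men = [f"M{i}" for i in range(12)]
women = [f"W{i}" for i in range(12)]
groups = draw_groups(men, women, female_counts=[0, 2, 4, 6])
```

Because which officers land in which composition is left to chance, differences in behavior across groups can be attributed to the number of women rather than to who chose to work with whom.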
Slide 35
+ Findings
Female officers were not, in general, more likely to see a gendered crime, but more competent women were.
Groups with more women members were not more likely to see a gendered crime.
Groups with more women were not more consensual.
Backlash effect: Men in majority-female groups were significantly more aggressive.
Slide 36
+ Conclusion
Programming by its very nature coaches beneficiaries in giving the types of survey responses the program would like to hear.
Randomization exacerbates this problem.
Behavioral measures are appealing, but:
Measures with high external validity can make it hard to disentangle mechanisms at the individual and community level.
Fine-tuning individual incentives correctly gets at attitudes even when subjects are cued to the "right" answer: monetary rewards induce people to act on actually held beliefs rather than the socially correct ones.
Lab-in-the-field activities address both of these issues and provide an important tool for measuring the social effects of programs, at some loss of external validity.