
Signaling When No One Is Watching: A Reputation Heuristics Account of Outrage and Punishment in One-Shot Anonymous Interactions

Jillian J. Jordan
Northwestern University

David G. Rand
Massachusetts Institute of Technology

Moralistic punishment can confer reputation benefits by signaling trustworthiness to observers. However, why do people punish even when nobody is watching? We argue that people often rely on the heuristic that reputation is typically at stake, such that reputation concerns can shape moralistic outrage and punishment even in one-shot anonymous interactions. We then support this account using data from Amazon Mechanical Turk. In anonymous experiments, subjects (total n = 8,440) report more outrage in response to others’ selfishness when they cannot signal their trustworthiness through direct prosociality (sharing with a third party), such that if the interaction were not anonymous, punishment would have greater signaling value. Furthermore, mediation analyses suggest that sharing opportunities reduce outrage by influencing reputation concerns. Additionally, anonymous experiments measuring costly punishment (total n = 6,076) show the same pattern: subjects punish more when sharing is not possible. Moreover, and importantly, moderation analyses provide some evidence that sharing opportunities do not merely reduce outrage and punishment by inducing empathy toward selfishness or hypocrisy aversion among non-sharers. Finally, we support the specific role of heuristics by investigating individual differences in deliberateness. Less deliberative individuals (who typically rely more on heuristics) are more sensitive to sharing opportunities in our anonymous punishment experiments, but, critically, not in punishment experiments where reputation is at stake (total n = 3,422), and not in our anonymous outrage experiments (where condemning is costless). Together, our results suggest that when nobody is watching, reputation cues nonetheless can shape outrage and, among individuals who rely on heuristics, costly punishment.

Keywords: signaling, third-party punishment, morality, trustworthiness, anger

Supplemental materials: http://dx.doi.org/10.1037/pspi0000186.supp

Moralistic punishment is a central feature of human nature. Humans react to a wide range of selfish and immoral behaviors with condemnation, and act to punish transgressors, even as third-party observers who have not been directly harmed. Such third-party punishment (TPP) appears universal across cultures (Henrich, Ensminger, et al., 2010; Henrich et al., 2006; Herrmann, Thöni, & Gächter, 2008), has early roots in development (Hamlin, Wynn, Bloom, & Mahajan, 2011; Jordan, McAuliffe, & Warneken, 2014; McAuliffe, Jordan, & Warneken, 2015), is observed in both lab (Fehr & Fischbacher, 2004; FeldmanHall, Sokol-Hessner, Van Bavel, & Phelps, 2014; Goette, Huffman, & Meier, 2006; Jordan, McAuliffe, & Rand, 2015) and field (Balafoutas & Nikiforakis, 2012; Mathew & Boyd, 2011) experiments, and is unique to humans (Jensen, 2010; Riedl, Jensen, Call, & Tomasello, 2012).

Moreover, theoretical research demonstrates that punishment can serve to promote and maintain prosocial behavior by deterring selfishness (Boyd, Gintis, Bowles, & Richerson, 2003; Boyd & Richerson, 1992; Henrich & Boyd, 2001), and empirical evidence supports this claim (Balafoutas, Grechenig, & Nikiforakis, 2014; Balliet, Mulder, & Van Lange, 2011; Charness, Cobo-Reyes, & Jimenez, 2008; Feinberg, Willer, & Schultz, 2014; Jordan et al., 2015; Mathew & Boyd, 2011; Yamagishi, 1986). Thus, moralistic punishment plays a critical role in shaping human morality and supporting prosocial behavior.

However, punishing wrongdoing can be costly. It can take time and effort, and risk physical harm and (physical or non-physical) retaliation (Balafoutas et al., 2014; Dreber, Rand, Fudenberg, & Nowak, 2008; Nikiforakis, 2008). So, why do unaffected third parties respond to moral transgressions with condemnation and punishment? While a large body of research has investigated the “proximate” psychological drivers of moralistic punishment (e.g., Carlsmith, Darley, & Robinson, 2002; Cushman, Dreber, Wang, & Costa, 2009; Haidt, 2001; Horberg, Oveis, Keltner, & Cohen, 2009; Nelissen & Zeelenberg, 2009), in this article we focus on the “ultimate-level” question of why our psychology should drive us to incur costs to punish wrongdoing. In other words, what material benefits might moralistic punishment confer in the long run, such that it is supported by learning or evolutionary processes?

Jillian J. Jordan, Kellogg School of Management, Northwestern University; David G. Rand, Sloan School, Massachusetts Institute of Technology.

Correspondence concerning this article should be addressed to Jillian J. Jordan, Kellogg School of Management, Northwestern University, Global Hub, 2211 Campus Drive, Evanston, Illinois. E-mail: [email protected]

This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

Journal of Personality and Social Psychology: Interpersonal Relations and Group Processes

© 2019 American Psychological Association. 2019, Vol. 1, No. 999, 000. 0022-3514/19/$12.00. http://dx.doi.org/10.1037/pspi0000186

Reputation Mechanisms for Punishment

A large body of work has investigated mechanisms through which TPP can, in the long run, be strategically beneficial. Much of this work has focused on mechanisms through which punishment can confer reputational benefits (Kurzban, DeScioli, & O’Brien, 2007). These mechanisms include indirect reciprocity, whereby punishers are rewarded (Ohtsuki, Iwasa, & Nowak, 2009; Raihani & Bshary, 2015b), and signaling, whereby punishers advertise either their prosociality (Barclay, 2006; Horita, 2010; Jordan, Hoffman, Bloom, & Rand, 2016; Nelissen, 2008; Raihani & Bshary, 2015a; Simpson, Harrell, & Willer, 2013) or their willingness to retaliate when harmed directly (Delton & Krasnow, 2017; Krasnow, Delton, Cosmides, & Tooby, 2016).

One particular way that TPP may confer reputational benefits is by serving as a costly signal (Zahavi, 1975) of trustworthiness (i.e., an individual’s propensity to reciprocate cooperation from others). This account is supported by game theoretic modeling that proceeds from the premise that the same mechanisms (e.g., reciprocity, institutions) incentivize people to both (a) cooperate themselves, and (b) encourage others to cooperate by punishing selfishness (Jordan, Hoffman, Bloom, et al., 2016; Jordan & Rand, 2017). As a result, individuals who face larger incentives to cooperate also face larger incentives to punish, such that punishing is less costly for them. TPP can, therefore, serve as a costly signal that the punisher can be trusted to cooperate.

Critically, however, this theory predicts that punishment should only provide a meaningful window into an individual’s underlying trustworthiness in the absence of more informative signals. For example, what would happen if a potential punisher also had the opportunity to signal his or her trustworthiness via a “direct” act of prosociality, like helping somebody by sharing a resource with them? If punishment signals trustworthiness because similar incentives encourage both punishment and reciprocal cooperation, we might expect sharing a resource to be an even stronger signal of trustworthiness than punishment. The prosociality-promoting mechanisms that incentivize both punishment and reciprocal cooperation should also incentivize resource sharing. And typically, the incentive structures underlying resource sharing and reciprocal cooperation may be more tightly linked than the incentive structures underlying punishment and reciprocal cooperation, because resource sharing and reciprocal cooperation (but not punishment) both involve paying a cost to directly benefit another individual.

Relatedly, resource sharing may be a “purer” signal of trustworthiness than punishment. This form of direct helping is unambiguously prosocial, whereas punishment, while it encourages others to cooperate, also harms the punished and can, thus, reflect antisocial or spiteful motivations (Herrmann et al., 2008), or seem wrong or aversive under certain moral frameworks (Baron & Ritov, 1993). Consequently, there are good reasons to expect resource sharing to be a stronger signal of trustworthiness than punishment.

As such, the opportunity to share a resource should undermine the signaling value of punishment. And notably, this should be true both for individuals who do and do not actually choose to share. After an individual chooses to share, she should be perceived as quite trustworthy by others, even if she declines to punish. And after an individual chooses not to share, he should be perceived as quite untrustworthy by others, even if he does punish. Thus, in both cases, the marginal signaling benefit of punishment should decline after a sharing opportunity.

Indeed, experimental evidence supports key predictions of this costly signaling theory. Specifically, when punishment is the only available signal, it is perceived as (Barclay, 2006; Horita, 2010; Jordan, Hoffman, Bloom, et al., 2016; Nelissen, 2008), and actually is (Jordan, Hoffman, Bloom, et al., 2016), an honest and reliable signal of trustworthiness. However, when potential punishers also have the opportunity to directly help others by sharing a resource with them, the perceived and actual signaling value of punishment declines dramatically (while sharing is perceived as, and actually is, a very strong signal of trustworthiness; Jordan, Hoffman, Bloom, et al., 2016). And, most critically, potential punishers are less likely to punish when sharing is possible (i.e., when a more informative signal is available). In other words, rates of punishment are influenced not merely by the transgression itself, but also by the value of punishment as a signal of trustworthiness.

Thus, there is clear evidence that people strategically use TPP to build their reputations. However, a framework that merely views moralistic punishment as a “strategic,” reputation-focused phenomenon seems limited in many ways. First, beyond enacting punishment, people often respond to wrongdoing by experiencing genuine moral outrage. Moral outrage is often discussed primarily as an affective reaction to wrongdoing, consisting of moralistic anger toward the transgressor (Batson et al., 2007; Haidt, 2003; M. L. Hoffman, 2001; Montada & Schneider, 1989), but other discussions of moral outrage also consider cognitive (e.g., beliefs that the transgressor has bad moral character) and behavioral (e.g., a drive toward or support for punishing the transgressor) components (Fiske & Tetlock, 1997; Tetlock, Kristel, Elson, Green, & Lerner, 2000). Furthermore, while outrage is proposed to serve the ultimate function of motivating punishment (Carlsmith et al., 2002; Darley & Pittman, 2003; Fessler & Haley, 2003; Fiske & Tetlock, 1997; Goldberg, Lerner, & Tetlock, 1999; Jordan et al., 2015), from a proximate psychological perspective, it is notable that moral emotions and judgments are usually not caused by reasoning (Haidt, 2001) and do not feel strategic. Rather, introspection suggests that outrage feels like a private response to immorality that simply tracks the magnitude of wrongdoing that has occurred, and certainly does not shut off in contexts where there is no opportunity for punishment to confer reputation benefits.

Moreover, people sometimes punish wrongdoing even in contexts where punishment cannot confer reputation benefits (Crockett, Özdemir, & Fehr, 2014; Fehr & Fischbacher, 2004; Jordan et al., 2015; Nelissen & Zeelenberg, 2009). In other words, people punish in contexts where there are no observers who can link their behavior to their identity and may interact with them in the future (or transmit information about their behavior to someone who will interact with them in the future). Throughout this article, we refer to these contexts as “one-shot anonymous interactions,” or interactions where “reputation is not at stake.”

On first inspection, it may seem that because reputation is not at stake, the ultimate explanation for punishment in these contexts cannot involve reputation. However, in this article, we challenge this idea. We argue that reputation theories are not exclusively relevant to moralistic punishment and outrage in contexts where reputation is at stake. Rather, we argue that because people often rely on the heuristic that reputation is typically at stake, a reputation framework that incorporates heuristics (while drawing on the theory that punishment serves to signal trustworthiness) can shed light on when and why people experience outrage and enact punishment even in one-shot anonymous interactions.

A Reputation Heuristics Hypothesis for One-Shot Anonymous Punishment

Our work is based on the premise that it is a good rule of thumb to behave, by default, as if reputation is at stake (i.e., as if your behavior will be observed and linked to your identity, influencing the way that others treat you in the future). One reason such an approach could be optimal is “error management” (Delton, Krasnow, Cosmides, & Tooby, 2011; although see Zefferman, 2014; Zimmermann & Efferson, 2017). According to this account, even if one attempts to evaluate whether reputation is at stake and concludes that it likely is not, there is still uncertainty, and so it may pay to nonetheless behave as if reputation is at stake. However, our work is based on a different reason it may pay to behave, by default, as if reputation is at stake. This reason is not that it is optimal to behave as if reputation is at stake after determining that it appears not to be, but rather that it can sometimes be optimal not to evaluate whether reputation appears to be at stake. In other words, it can pay to sometimes rely on the heuristic that reputation is typically at stake.

In human social life, reputation is frequently at stake, and determining whether the current situation is an exception may be effortful (e.g., because even when nobody seems to be watching, one may need to evaluate whether there are hidden observers). Consequently, evaluating whether reputation is at stake can have cognitive costs (such evaluation takes time and effort; Bear & Rand, 2016; Kahneman, 2011; Rand, Tomlin, Bear, Ludvig, & Cohen, 2017) as well as social costs (those who appear calculating in their moral decision-making may be seen negatively by others; Critcher, Inbar, & Pizarro, 2013; M. Hoffman, Yoeli, & Nowak, 2015; Jordan, Hoffman, Nowak, & Rand, 2016). To avoid these costs, it may be beneficial to rely on the heuristic that reputation is typically at stake instead of constantly calculating whether this is currently the case (Bear, Kagan, & Rand, 2017; Bear & Rand, 2016).

If (some) people use the heuristic that reputation is typically at stake, the theory that punishment serves to signal trustworthiness may help explain when and why people punish in contexts where reputation is not at stake. Specifically, a reputation heuristics hypothesis makes the prediction that even in one-shot anonymous interactions, people’s punishment decisions should be sensitive to cues of the potential signaling value of punishing (if reputation were at stake), and that this sensitivity should be greater among less deliberative decision-makers, who should be more prone to rely on heuristics. Moreover, because outrage is proposed to adaptively motivate punishment (despite being experienced genuinely), a reputation heuristics hypothesis may predict that when reputation is not at stake, outrage also increases in contexts where punishment would confer larger reputation benefits if reputation were at stake.

An Illustrative Example

To illustrate our reputation heuristics hypothesis, imagine the following example. One day, your workplace holds a fundraiser for a local charity that fights homelessness. To collect funds, they ask for donations in the break room during lunchtime. However, you happen to be in a meeting when the funds are collected, so you have no opportunity to donate. Afterward, a well-off colleague tells you (and several other colleagues) that he thinks homeless people are lazy and makes it a rule to never help them.

How outraged do you feel, and how likely are you to chastise your colleague? A signaling theory predicts that in general, condemning him could confer reputation benefits by demonstrating to other colleagues that you are not selfish, and do not have disdain for the homeless. It also predicts that in this particular situation, you might be especially driven to punish. Because you were out of the room when donations were collected, you were unable to donate to the charity, and consequently, you missed an opportunity to send a more direct signal that you are not selfish and have positive attitudes toward the homeless. Thus, the signaling value of punishing may be especially high, as compared with the counterfactual in which you had the opportunity to donate.

In this example, punishing would be observed by a host of people you know and, therefore, could confer genuine reputational benefits. However, now consider the case where after missing the opportunity to donate, you leave the office and see a stranger insult a homeless person on the street. If you choose to chastise the stranger, you will not be observed by people you know and, thus, will not actually gain reputation benefits. However, insofar as you behave by default as if reputation is at stake, your reaction might nonetheless be influenced by reputation-relevant cues. Specifically, your reaction might be influenced by the fact that you missed the opportunity to donate to the office fundraiser, so punishing the stranger would serve as a relatively strong signal of your morality if you were observed by somebody from work. Thus, we predict that even in this anonymous context, you might feel heightened outrage, and be more likely to chastise the stranger (as compared with the counterfactual in which you were present when donations were solicited).

Overview of Analyses

To test our reputation heuristics account of outrage and punishment in one-shot anonymous interactions, we used five analyses of 12 different experiments. See Table 1 for a summary of our analyses, and Table 2 for a summary of the experiments included in them.

Across our first two analyses, we began by testing the prediction that moral outrage is sensitive to reputation cues in contexts where reputation is not at stake. In Analysis 1, we investigated seven experiments that measured moral outrage in one-shot anonymous interactions (total n = 8,440). (Six experiments measured outrage using a three-item scale designed to tap the affective, cognitive, and behavioral components of outrage, and one used a single item designed to tap only the affective component of outrage.) We tested the prediction that outrage would increase in contexts where punishment would serve as an effective signal of trustworthiness, if observed. Specifically, we predicted that subjects would report more outrage toward selfishness when they could not signal their trustworthiness via direct helping (sharing a resource with a third party) and, thus, if punishment were observed, it would have greater signaling value.

In Analysis 2, we tested the prediction that helping opportunities influence reported outrage via reputation concerns. To this end, we investigated mediation through two reputation-relevant constructs. First, in one subset of our outrage experiments (n = 2,434), we measured the perceived reputation benefits of punishment. This construct was intended to be a mediator, and we predicted that (a) when subjects did not have the opportunity to help, they would report that punishment would have greater reputation benefits, (b) the perceived reputation benefits of punishment would correlate positively with outrage, and (c) the perceived reputation benefits of punishment would mediate the effect of helping opportunities on outrage. Second, in a partially overlapping subset of our outrage experiments (n = 2,432), we measured general reputation concerns. This construct was initially intended to be a moderator; however, it was measured after our helping opportunities manipulation and we found evidence that it was influenced by helping opportunities, so we analyzed it as a mediator. Thus, we investigated whether (a) subjects reported being more generally concerned with their reputations when they did not have the opportunity to help, (b) general reputation concerns correlated positively with outrage, and (c) general reputation concerns mediated the effect of helping opportunities on outrage.
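The three-part mediation logic described for Analysis 2 (the manipulation shifts the mediator, the mediator relates to outrage, and the mediator carries part of the manipulation's effect) can be sketched with a product-of-coefficients calculation. This is only an illustrative sketch: the text here does not state which estimator or significance test was used, so the ordinary least squares approach, the function name `mediation_paths`, and the variable names are all assumptions rather than the authors' method.

```python
from statistics import mean


def _cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def mediation_paths(x, m, y):
    """Estimate mediation paths by ordinary least squares (assumed estimator).

    x: 0/1 indicator of the manipulation (1 = helping opportunity)
    m: mediator scores (e.g., perceived reputation benefits of punishment)
    y: outcome scores (e.g., outrage)
    """
    sxx, smm = _cov(x, x), _cov(m, m)
    sxm, sxy, smy = _cov(x, m), _cov(x, y), _cov(m, y)
    a = sxm / sxx                             # effect of x on m
    total = sxy / sxx                         # total effect of x on y
    det = sxx * smm - sxm ** 2                # zero if x and m are collinear
    b = (sxx * smy - sxm * sxy) / det         # effect of m on y, holding x fixed
    direct = (smm * sxy - sxm * smy) / det    # effect of x on y, holding m fixed
    return {"a": a, "b": b, "direct": direct,
            "indirect": a * b, "total": total}
```

With least squares, the total effect equals the direct effect plus the indirect effect `a * b`, so a mediator that fully accounts for the manipulation leaves a direct effect near zero. Any real analysis would also need an uncertainty estimate for the indirect effect (for example, by bootstrapping), which this sketch omits.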

In Analysis 3, we tested the prediction that helping opportunities also influence costly punishment in contexts where reputation is not at stake. We investigated a set of four experiments that measured costly punishment in one-shot anonymous interactions (total n = 6,076). We predicted that subjects would be more likely to punish when they did not have the opportunity to help.

Table 1
Overview of Analyses

Analysis 1. Experiments included: 1, 2, 3, 4, 5, 6, 7
Key question: When reputation is not at stake, how do helping opportunities influence outrage, and vice versa?
Predict: Helping opportunities reduce outrage (Experiments 1–7)
Predict: The opportunity to rate outrage does not reduce helping (Experiment 1)

Analysis 2. Experiments included: 2, 3, 4, 5
Key question: When reputation is not at stake, do two reputation-relevant constructs mediate the effect of helping opportunities on outrage?
Predict: The Perceived Reputation Benefits of Punishment (PRBP) mediate the effect of helping opportunities on outrage (Experiments 2, 4, and 5)
Explore whether General Reputation Concerns (GRC) mediate the effect of helping opportunities on outrage (Experiments 3–5)

Analysis 3. Experiments included: 6, 8, 9, 10
Key question: When reputation is not at stake, how do helping opportunities influence punishment, and vice versa?
Predict: Helping opportunities reduce punishment (Experiments 6, and 8–10)
Predict: Punishment opportunities do not reduce helping (Experiments 8 and 9)

Analysis 4. Experiments included: 6
Key question: When reputation is not at stake, does follow-up experiment helping moderate the effects of helping opportunities on affective outrage and punishment? And if so, are these effects driven solely by non-helpers, or do they also hold among helpers?
Predict: The negative effects of helping opportunities on affective outrage and punishment are not driven solely by non-helpers (Experiment 6)

Analysis 5a. Experiments included: 6, 8, 9, 10
Key question: When reputation is not at stake, does deliberativeness moderate the effect of helping opportunities on punishment?
Predict: Deliberativeness attenuates the effect of helping opportunities on punishment (Experiments 6, and 8–10)

Analysis 5b. Experiments included: 9, 10, 11, 12
Key question: When reputation is at stake, does deliberativeness moderate the effect of helping opportunities on punishment?
Predict: Deliberativeness does not attenuate the effect of helping opportunities on punishment (Experiments 9–12)

Analysis 5c. Experiments included: 1, 2, 3, 4, 5, 6, 7
Key question: When reputation is not at stake, does deliberativeness moderate the effect of helping opportunities on outrage?
Explore this question without a directional prediction (Experiments 1–7)

Note. For each analysis, we report the key questions and predictions, and the experiments included.

In Analysis 4, we tested the deflationary hypothesis that helping opportunities reduce outrage and punishment merely by inducing empathy toward selfishness or hypocrisy aversion among subjects who decline to help. This deflationary hypothesis predicts that helping opportunities only reduce outrage and punishment among subjects who choose not to help when given the opportunity, and not among subjects who choose to help. In contrast, our signaling hypothesis predicts that helping opportunities should reduce outrage and punishment among all subjects, regardless of whether or not they choose to help when given the opportunity.

To test our signaling hypothesis against this deflationary alternative, we sought to tap individual differences in the likelihood of helping, when given the chance. To this end, after one of our experiments (that manipulated helping opportunities and measured affective outrage and punishment), we conducted a follow-up experiment that gave all subjects the opportunity to help. We treated follow-up experiment helping as an index of an individual’s propensity to help when given the chance. Then, we tested (a) whether follow-up experiment helping moderated the effects of helping opportunities on affective outrage and punishment in our original experiment, and (b) if so, whether these effects were driven solely by follow-up experiment non-helpers, or also held among follow-up experiment helpers. We predicted that the negative effects of helping opportunities on affective outrage and punishment would not be driven solely by non-helpers.

Together, Analyses 1–4 tested the predictions that moral outrage and costly punishment are influenced by the potential signaling value of punishment, even when reputation is not at stake. Finally, in Analysis 5, we specifically tested our reputation heuristics explanation for these predictions. Based on the premise that less deliberative individuals tend to rely more on heuristics, we investigated the potential moderating role of deliberativeness. We did so by investigating two indicators of deliberativeness: performance on questions assessing comprehension of the incentives in our experiment, and performance on the Cognitive Reflection Test (Frederick, 2005).

As per our reputation heuristics hypothesis, we predicted that less deliberative subjects would be more likely to enact one-shot anonymous punishment when helping was not possible, while more deliberative subjects would punish at relatively lower rates regardless of helping opportunities. Moreover, we predicted that deliberativeness would not moderate the influence of helping opportunities on punishment in a set of experiments (total n = 3,422) where reputation was actually at stake and thus attending to reputation cues actually had strategic value. Finally, we also investigated whether deliberativeness would moderate the effect of helping opportunities on outrage in our one-shot anonymous outrage experiments.
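The moderation prediction for Analysis 5 amounts to a difference between two differences: the effect of helping opportunities on punishment should be larger among less deliberative subjects than among more deliberative ones. The sketch below expresses that contrast over a 2 x 2 table of cell means. It assumes, purely for illustration, that deliberativeness is coded as a binary flag and punishment as 0/1; the field names and functions are hypothetical, and the analyses in the article may instead use regression models with continuous indicators.

```python
from statistics import mean


def punishment_rate(rows, helping_opportunity, deliberative):
    """Mean punishment (coded 0/1) in one cell of the 2 x 2 design."""
    cell = [r["punished"] for r in rows
            if r["helping_opportunity"] == helping_opportunity
            and r["deliberative"] == deliberative]
    return mean(cell)


def helping_effect(rows, deliberative):
    """Punishment rate without a helping opportunity minus the rate with one."""
    return (punishment_rate(rows, False, deliberative)
            - punishment_rate(rows, True, deliberative))


def moderation_contrast(rows):
    """Helping effect among less deliberative minus more deliberative subjects."""
    return helping_effect(rows, False) - helping_effect(rows, True)
```

Under the reputation heuristics hypothesis, this contrast would be positive in the anonymous punishment experiments and close to zero in experiments where reputation is actually at stake.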

Analysis 1

In Analysis 1, we tested the prediction that moral outrage is influenced by cues of the potential signaling value of punishment. To this end, we considered seven experiments investigating whether people respond to selfishness with more moral outrage in situations where they lack the opportunity to directly help others.

As discussed previously, there are theoretical reasons that direct helping should typically be a stronger signal of trustworthiness than punishment. And indeed, empirical evidence from a context where reputation is at stake suggests that the expected signaling value of punishment is larger when helping is not possible (and, thus, a better signal is not available; Jordan et al., 2016). Moreover, Jordan et al. find that helping opportunities reduce punishment (as predicted by the observation that helping is a stronger signal than punishment), while punishment opportunities do not reciprocally reduce helping (as predicted by the observation that punishment is a weaker signal than helping).

When designing the seven experiments analyzed here, we adapted the design of this previous work to test the hypothesis that moral outrage is sensitive to cues of punishment’s potential signaling value. Across all seven experiments, we tested the prediction that helping opportunities would decrease moral outrage (rather than punishment) in a context where reputation was not actually at stake. Additionally, we tested the prediction that the opportunity to express moral outrage would not reciprocally decrease helping.

Method

Design. We designed a “Third-Party Condemnation Game” (TPCG), which we used in all seven experiments. The TPCG had three players, and involved an incentivized economic game decision with no deception. In this game, subjects had the opportunity to earn money that was paid out in a “bonus payment,” on top of the show-up fee they earned for participating. Specifically, one subject (the Helper) was endowed with money (30¢) and decided whether or not to split it evenly with (i.e., help) another subject (the Recipient). Then, a third subject (the Condemner) rated their moral outrage toward the Helper. (Specifically, we always measured outrage toward a selfish Helper who did not split the money with the Recipient; see Procedure for details.) The TPCG met our definition of a one-shot anonymous interaction, in which reputation was not at stake. It was conducted online in privacy, with anonymous strangers, and there was no potential for any of the players to base their game play on other players’ past actions. Moreover, while we (i.e., the experimenters) could observe subjects’ responses, we could not link them to subjects’ identities. Thus, there was no strategic reason for subjects to care about how their responses were perceived by others.

In all seven experiments, target subjects read about all roles in the TPCG, and then we manipulated the role(s) they were assigned to play. In the Condemnation Only condition, we assigned target subjects to play the TPCG once, in the role of the Condemner. In the Condemnation + Helping condition, by contrast, we assigned target subjects to play twice, with two different sets of other players: once in the role of Condemner, and once in the role of Helper.

While our experiments were anonymous, what would happen if target subjects in these conditions were actually being judged by observers? In the Condemnation + Helping condition, an observer would have access to a very strong signal of a target’s trustworthiness: whether or not the target chose to help. Therefore, if the observer were to also find out whether the target punished selfishness, we would expect this second (weaker) signal to have limited influence on the observer’s judgment. In contrast, in Condemnation Only, the observer would not have information about target helping, so we would expect punishment to carry more weight. Thus, despite the fact that our experiments were anonymous, helping opportunities served to undermine the potential signaling value that punishment could confer if observed. We predicted that this would influence outrage, such that subjects in Condemnation Only would report more outrage than subjects in Condemnation + Helping.

We note that, importantly, when target subjects participated as the Condemner, they rated their outrage toward a Helper who had behaved selfishly toward a Recipient. In contrast, when target subjects participated as the Helper, they decided whether to help a Recipient who had no previous experience with the game. In other words, they decided whether to help a completely neutral party; they were not paired with a Recipient who had previously been mistreated in any way. Thus, while Condemners had the opportunity to express outrage in response to a selfish transgression, Helpers were not reacting to injustice or compensating victims. As such, our experimental design falls outside the purview of the moral psychology literature on compensation versus punishment as modes of restorative justice (e.g., Darley & Pittman, 2003; Gromet, Okimoto, Wenzel, & Darley, 2012; Lotz, Okimoto, Schlösser, & Fetchenhauer, 2011).

While the above-described design was constant across our seven outrage experiments, some details varied (see Table 2 for an overview of differences). First, in Experiment 1, to rule out the possibility that subjects might express less outrage in Condemnation + Helping simply because they had two different response options, we also included a Helping Only condition (in which we assigned target subjects to play the TPCG once, in the role of Helper). We predicted that while helping opportunities would attenuate reported outrage, condemnation opportunities would not reciprocally attenuate rates of helping. Specifically, we predicted that rates of helping would be similar in the Helping Only and Condemnation + Helping conditions (although see Simpson et al., 2013 for a context in which condemnation opportunities actually increased helping). As described below, we found support for this prediction; thus, in Experiments 2–7, we focused only on the effect of helping opportunities on outrage, and did not include Helping Only conditions.

Relatedly, in Experiment 1, we counterbalanced the order in which subjects in Condemnation + Helping made their Condemner and Helper decisions. (Subjects in Condemnation + Helping always knew that they would make both a Condemner and a Helper decision, but we randomized the order in which these decisions were made.) This counterbalancing allowed for a symmetrical test of the effect of helping opportunities on outrage, as compared with the effect of condemnation opportunities on helping. In contrast, in Experiments 2–7, we always assigned subjects in Condemnation + Helping to make their Helper decisions before their Condemner decisions. This fixed order was used to increase the salience of helping opportunities, given our exclusive focus on the effect of helping opportunities on outrage. (For analyses of order effects within the Condemnation + Helping condition of Experiment 1, see online supplementary material.)

Additionally, Experiments 2–5 investigated the mechanism behind the effect of helping opportunities on outrage by measuring two candidate mediators (see Analysis 2 for more details).

Experiment 6 also differed from Experiments 1–5 in several ways. In Experiments 1–5, we framed the outrage-rating task as “making a judgement about the Helper’s moral character,” asked subjects to complete this task imagining that the Helper chose not to help (without knowing what the Helper actually did), and measured outrage using a three-item scale designed to tap the affective, cognitive, and behavioral components of outrage; in Experiment 6, we framed the outrage-rating task more neutrally, told subjects that the Helper chose not to help, and measured outrage with a single item designed to tap only the affective component of outrage (see Procedure for details). Furthermore, unlike Experiments 1–5, Experiment 6 (a) administered the Cognitive Reflection Task before assigning subjects to an experimental condition (see Analysis 5 for more details), (b) after measuring outrage, administered a filler task and then measured costly punishment (see Analysis 3 for details), (c) added a few post-experimental questions (see Procedure and Analysis 2 for details), and (d) approximately 2 weeks after data collection was completed, was followed by a follow-up experiment that Experiment 6 subjects were invited to complete (see Analysis 4 for details).

Finally, in Experiment 7, we returned to our procedure from Experiments 1–5, and thus used our three-item outrage scale, and framed our outrage-rating task as “making a judgement about the Helper’s moral character”; however, as in Experiment 6, we again told subjects that the Helper chose not to help (as opposed to asking them to imagine that the Helper chose not to help). We also included a slightly modified version of one of the post-experimental questions included in Experiment 6 (see Procedure for more details).

Subjects. In each of Experiments 1–5, we requested a target of n = 400 subjects per condition from Amazon Mechanical Turk (MTurk; i.e., a total of n = 1,200 subjects in Experiment 1, which included a Helping Only condition, and n = 800 subjects in each of Experiments 2–5, which did not). In Experiment 6, we decided (before data collection) to request a larger sample size of n = 1,500 subjects per condition (i.e., a total of n = 3,000 subjects) for greater power, particularly because this experiment involved inviting subjects to complete a follow-up experiment (see Analysis 4 for more details) and we were concerned about the potential for low response rates. Finally, in Experiment 7, we decided (before data collection) to request a sample of n = 750 subjects per condition. We selected this sample size to provide high power to confirm that the effect of helping opportunities on outrage would be comparable to the effect observed in Experiments 1–6, despite Experiment 7 being the only experiment in which we both used our three-item outrage scale and told subjects that the Helper chose not to help (rather than asking them to imagine the Helper not helping).

In our final samples for analysis, we included all subjects who finished the survey and thus completed all dependent variables, and had a unique IP address and MTurk ID; when we encountered duplicate IPs or IDs, we included only the observation that was completed chronologically first. This process sometimes resulted in final samples that were slightly larger than the target number requested on MTurk (as some subjects completed our survey, but did not indicate this to MTurk).
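The inclusion rule above can be sketched in a few lines. This is a minimal illustration, not the authors' script: the field names (`worker_id`, `ip`, `finished_at`, `completed`) are hypothetical, and the sketch assumes that only rows that were actually kept count when checking later rows for duplicates.

```python
def first_unique_submissions(rows):
    """Keep completed rows whose IP and worker ID have not been seen before.

    rows: list of dicts with hypothetical keys
          'worker_id', 'ip', 'finished_at' (sortable), 'completed' (bool).
    """
    seen_ips, seen_ids = set(), set()
    kept = []
    # Walk through submissions in chronological order of completion.
    for row in sorted(rows, key=lambda r: r["finished_at"]):
        if not row["completed"]:
            continue  # unfinished surveys are excluded
        if row["ip"] in seen_ips or row["worker_id"] in seen_ids:
            continue  # duplicate IP or ID: the earlier observation wins
        seen_ips.add(row["ip"])
        seen_ids.add(row["worker_id"])
        kept.append(row)
    return kept
```

Whether a dropped duplicate should itself block later rows is a design choice the text does not settle; the sketch takes the more permissive reading.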

Throughout our article, we report and plot results from all subjects, regardless of performance on comprehension questions (see online supplementary materials, http://dx.doi.org/10.1037/pspi0000186.supp, for statistics on performance); then, in Analysis 5, we investigate the influence of comprehension on our results. Aggregating across our seven experiments, our final samples have n = 8,847 subjects (n = 4,228 in Condemnation Only, n = 4,212 in Condemnation + Helping, and n = 407 in Helping Only), M_age = 35.98 years, SD_age = 11.66 years, 43% male. For demographics by experiment, see online supplementary material.

Table 2
Overview of Experiments

Experiment 1
  Key design features: Reputation not at stake; Key DV: Outrage
  Conditions: Condemnation Only; Condemnation + Helping (counterbalanced); Helping Only
  Analyzed variables measured (in order): Comprehension; Outrage
  Sample size: 1,195 (~400/condition)
  In which analyses? 1, 5c

Experiment 2
  Conditions: Condemnation Only; Condemnation + Helping (helping first)
  Analyzed variables measured (in order): Comprehension; Outrage; PRBP
  Sample size: 819 (~400/condition)
  In which analyses? 1, 2, 5c

Experiment 3
  Analyzed variables measured (in order): Comprehension; Outrage; GRC

Experiment 4
  Analyzed variables measured (in order): Comprehension; Outrage; PRBP and GRC in random order

Experiment 5
  Sample size: 804 (~400/condition)

Experiment 6
  Key design features: Reputation not at stake; Key DVs: Affective outrage and punishment
  Conditions: Condemnation/Punishment Only; Condemnation/Punishment + Helping (helping first)
  Analyzed variables measured (in order): CRT; Comprehension (for outrage task); Affective outrage; Comprehension (for punishment task); Punishment; Beliefs re: can other players affect payoff?; Other-, self-, and experimenter-signaling concerns; Follow-up experiment helping (if participated)
  Sample size: 2,924 (~1,500/condition)
  Methodological notes: Outrage-rating task framed more neutrally than in Experiments 1–5; Transgression had already occurred (i.e., was not hypothetical); Filler memory task between measurement of affective outrage and punishment
  In which analyses? 1, 3, 4, 5a, 5c

Experiment 7
  Key design features: Reputation not at stake; Key DV: Outrage
  Conditions: Condemnation Only; Condemnation + Helping (helping first)
  Analyzed variables measured (in order): Comprehension; Outrage; Beliefs re: can other players affect payoff?
  Sample size: 1,447 (~750/condition)
  Methodological notes: Transgression had already occurred (i.e., was not hypothetical); Wording for beliefs question modified from Experiment 6
  In which analyses? 1, 5c

Experiment 8
  Key design features: Reputation not at stake; Key DV: Punishment
  Conditions: Punishment Only; Punishment + Helping (counterbalanced); Helping Only
  Analyzed variables measured (in order): Comprehension; Punishment
  Sample size: 1,160 (~400/condition)
  In which analyses? 3, 5a

Experiment 9
  Key design features: Manipulated if reputation at stake; Key DV: Punishment
  Conditions: Punishment Only vs. Punishment + Helping (counterbalanced) vs. Helping Only × TG vs. No TG
  Sample size: 2,331 (~400/condition)
  In which analyses? 3, 5a, 5b

Experiment 10
  Conditions: Punishment Only vs. Punishment + Helping (counterbalanced) × TG vs. No TG
  Sample size: 3,104 (~775/condition)
  Methodological notes: Reputation condition emphasized how subjects’ decisions would look to TG senders

Experiment 11
  Key design features: Reputation at stake; Key DV: Punishment
  Conditions: Punishment Only; Punishment + Helping (counterbalanced); Helping Only
  Sample size: 1,199 (~400/condition)
  Methodological notes: Previously published and re-analyzed here (Jordan, Hoffman, Bloom, et al., 2016)
  In which analyses? 5b

Experiment 12
  Sample size: 563 (~200/condition)

Note. PRBP = Perceived Reputation Benefits of Punishment; TG = Trust Game; No TG = No Trust Game; DV = Dependent Variable. For each experiment, we report the key design features (specifically, whether reputation was at stake and the key DV(s)), experimental conditions (along with counterbalancing information), measured variables that were analyzed, final sample size (as well as the approximate number per condition, which was the target number recruited), methodological notes, and analyses the experiment was included in.

This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

Procedure. We began by providing subjects with instructions explaining the TPCG and their role(s) in it (as determined by condition). In Experiments 1–5 and 7, we described the Condemner’s role as “making a judgement about the Helper’s moral character, in the event that the Helper decides not to help.” In contrast, in Experiment 6, we described the Condemner’s role more neutrally, as “rating their reaction towards the Helper.”

Next, we provided subjects with two comprehension questions evaluating their understanding of the incentive structure of the TPCG helping decision (for all experiments in this article, see online supplementary material for full stimuli). Then, subjects in Helping Only made their helping decisions, subjects in Condemnation Only rated their outrage, and subjects in Condemnation + Helping provided both responses. To measure helping, we reminded subjects that they had 30¢, and that their job was to decide whether to pay 15¢ to share with the Recipient. We then asked them to make a decision, which we subsequently repeated back to them.

To measure moral outrage, we reminded subjects of their role as Condemner (using the language described above). Then, in Experiments 1–5, we instructed subjects to imagine that the Helper decided not to share, and in Experiments 6 and 7, we told subjects that the Helper did not share. Next, we presented our moral outrage scale. In Experiments 1–5 and 7, we used a three-item scale that we designed to tap the affective, cognitive, and behavioral components of outrage. This scale was conceptually similar to other moral outrage scales designed to tap these three components of outrage (Salerno & Peter-Hagene, 2013; Skitka, Bauman, & Mullen, 2004), and was designed to use language appropriate for the relatively minor transgression our experiments focused on. In our scale, we asked subjects (a) how angry they felt toward the Helper, (b) how much the Helper deserved to be punished, and (c) how morally bad the Helper was (in that fixed order); then, we computed moral outrage scores as the average response across our three scale items.

In Experiment 6, we replaced this three-item scale with one item that specifically measured the affective component of outrage. Our goal was to investigate whether our results were robust to a context in which only affective outrage was measured, to provide a stronger case for an effect on an affective process and, thus, to connect our work to the psychological literatures on affective outrage, moral emotions, and emotion regulation (e.g., Batson et al., 2007; Brady, Wills, Jost, Tucker, & Van Bavel, 2017; Gross, 1998b; Haidt, 2003; Hutcherson & Gross, 2011; Nelissen & Zeelenberg, 2009; Tangney, Stuewig, & Mashek, 2007). To this end, we presented only the anger item from our three-item scale.

In Experiment 1, Condemners made ratings using Likert scales ranging from 10 to 100 in 10-point increments, with extreme anchors reading “not at all” and “very much.” In Experiments 2–7, we modified these scales to range from 0 to 100. Then, to facilitate comparison across experiments, we rescaled Experiment 1 responses (that originally ranged from 10 to 100) by subtracting 10 and then multiplying by 10/9 (such that they ranged from 0 to 100, as in Experiments 2–6). In Experiments 6–7, for grammatical correctness, we changed the wording on the extreme anchor from “very much” to “a lot.”
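The rescaling and scoring steps described above amount to two small formulas. The sketch below spells them out; the function and argument names are hypothetical and are not taken from the authors' analysis script.

```python
def rescale_experiment1(rating):
    """Map an Experiment 1 rating on the 10-100 scale onto 0-100.

    Subtract 10, then multiply by 10/9, as described in the text.
    """
    return (rating - 10) * 10 / 9


def outrage_score(anger, deserved_punishment, moral_badness):
    """Three-item moral outrage score: the plain average of the items."""
    return (anger + deserved_punishment + moral_badness) / 3


# The endpoints of the old scale land on the endpoints of the new one.
print(rescale_experiment1(10), rescale_experiment1(100))  # 0.0 100.0
```

Because the transformation is linear, it makes no difference whether the three items are rescaled before averaging or the average is rescaled afterwards.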

After subjects made their decisions, they completed a post-experimental survey including some demographic and other questions. Of relevance to Analysis 1, in both Experiments 6 and 7, we included one post-experimental question investigating subjects’ beliefs about whether other players could influence their payoffs. These questions were designed to investigate whether, to the extent that subjects were sensitive to reputation cues in our one-shot anonymous experiments, this reflected a mistaken explicit belief that other players really could observe their behavior and then influence their payoffs. Specifically, in Experiment 6, we asked subjects who, if anyone, could influence their payoffs, and provided response options of the Helper, the Recipient, both, and neither; responses of “neither” were considered correct. In Experiment 7, we modified the wording slightly to avoid suggesting to subjects that other player(s) could influence their payoffs. We asked subjects whether, while rating their outrage, they believed that any of the other players had the ability to influence their payoffs, and provided response options of Yes or No; then, (only) if subjects selected “Yes,” we asked them to pick between the response options offered in Experiment 6. Responses of “No” were considered correct. Finally, after all data were collected, we used ex-post matching to pair Helpers and Recipients and calculate their bonuses.
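To make the bonus calculation concrete, here is a minimal sketch of ex-post matching under the payoff rule stated in the Design and Procedure (a 30¢ endowment, with 15¢ passed to the Recipient if the Helper shares). The random pairing, the seed, and all names are assumptions for illustration; the text does not describe how pairs were formed.

```python
import random

ENDOWMENT_CENTS = 30  # Helper's endowment, from the Design section
SHARE_CENTS = 15      # amount passed to the Recipient when the Helper shares


def match_and_pay(helper_decisions, recipient_ids, seed=0):
    """Pair each Helper with one Recipient and compute both bonuses.

    helper_decisions: dict mapping helper id -> True if the Helper shared.
    recipient_ids: iterable of recipient ids (assumed at least as many as helpers).
    Returns a list of (helper_id, helper_bonus, recipient_id, recipient_bonus).
    """
    rng = random.Random(seed)      # assumed: pairing is random
    recipients = list(recipient_ids)
    rng.shuffle(recipients)
    pairs = []
    for (helper_id, shared), recipient_id in zip(helper_decisions.items(), recipients):
        transfer = SHARE_CENTS if shared else 0
        pairs.append((helper_id, ENDOWMENT_CENTS - transfer, recipient_id, transfer))
    return pairs
```

Each pair's bonuses always sum to 30¢, so sharing only redistributes the endowment between Helper and Recipient.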

Results

We begin by noting that all of our data, and a script for reproducing all our analyses, are available online at https://osf.io/7z8b6/. Next, we report aggregated analyses of moral outrage across Experiments 1–7. These analyses aggregated average responses across our three-item outrage scale in Experiments 1–5 and 7 with responses to our single-item affective outrage measure in Experiment 6; however, we subsequently report analyses by experiment to demonstrate robustness across both measures. We note that throughout our analyses, we used linear regressions to predict continuous variables and logistic regressions to predict binary variables, and in all analyses that pooled data from multiple experiments, we included experiment dummies.
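As an illustration of what the experiment dummies do in a pooled linear regression, the sketch below relies on a standard equivalence (the Frisch–Waugh–Lovell result): including one dummy per experiment gives the same condition slope as centering both the condition indicator and the outcome within each experiment. This is not the script posted at the link above; the function name, the tuple layout, and the toy numbers are assumptions, and the function returns a raw slope in outrage-scale points rather than a standardized coefficient.

```python
from collections import defaultdict


def pooled_condition_effect(rows):
    """Slope of outrage on a 0/1 condition dummy, net of experiment dummies.

    rows: iterable of (experiment_label, condemnation_only, outrage) tuples,
          where condemnation_only is 1 for Condemnation Only and 0 otherwise.
    """
    by_experiment = defaultdict(list)
    for experiment, x, y in rows:
        by_experiment[experiment].append((x, y))

    sxy = sxx = 0.0
    for observations in by_experiment.values():
        mean_x = sum(x for x, _ in observations) / len(observations)
        mean_y = sum(y for _, y in observations) / len(observations)
        for x, y in observations:
            # Centering within experiment removes each experiment's baseline.
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    return sxy / sxx


# Toy data: two experiments with very different baselines, same 5-point effect.
toy = [("A", 0, 10), ("A", 0, 10), ("A", 1, 15), ("A", 1, 15),
       ("B", 0, 40), ("B", 0, 40), ("B", 1, 45), ("B", 1, 45)]
print(pooled_condition_effect(toy))  # 5.0
```

In the toy data the baseline difference between experiments does not leak into the condition estimate, which is the point of including the dummies.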

Collapsing across our six experiments that used our three-item outrage scale, we found that this scale was reliable (α = .88). All three items were strongly correlated with each other: anger and deserved punishment, r = .73, p < .001; anger and badness of person, r = .72, p < .001; badness of person and deserved punishment, r = .71, p < .001. Additionally, as predicted and as illustrated in Figure 1a, we found that subjects across Experiments 1–7 reported significantly more outrage in Condemnation Only (M = 35.18, SD = 29.50) than Condemnation + Helping (M = 30.22, SD = 30.06), B = 0.08, t = 7.68, p < .001, n = 8,440. Thus, when subjects had the opportunity to signal their trustworthiness via direct helping, they reported less outrage in response to selfish behavior.
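For readers who want to see how a scale reliability of this kind is computed, the sketch below implements the usual formula for Cronbach's alpha, assuming that the reported α is Cronbach's alpha (the text does not name the coefficient). The function name and input layout are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items answered by the same subjects.

    items: list of k lists; items[i][j] is subject j's response to item i.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(items)
    totals = [sum(responses) for responses in zip(*items)]  # per-subject sums
    sum_item_variances = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))
```

Alpha approaches 1 as the items move together across subjects and falls as they diverge; it is undefined when the total score has no variance.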

Conversely, as predicted and as illustrated in Figure 1b, we observed comparable rates of helping in Helping Only (66%) and Condemnation + Helping (67%) in Experiment 1 (which included the Helping Only control condition), odds ratio (OR) = 0.94, z = −.39, p = .693, n = 797. Thus, while helping opportunities reduced outrage across Experiments 1–7, condemnation opportunities did not reduce helping in Experiment 1. We also bolster this conclusion by directly comparing the effect of helping opportunities on outrage to the effect of condemnation opportunities on helping within Experiment 1. Investigating only this experiment, we used linear regressions to predict both outrage and helping as a function of condition, and found that the standardized condition coefficient was significantly larger when predicting outrage (B = .10, SE = .04, p = .006) than helping (B = −.01, SE = .04, p = .694), z = 2.24, p = .025.

Next, we report our results by experiment. In Table 3 we report, for each experiment and overall, the reliability of, and effect of helping opportunities on, our three-item outrage scale (measured in Experiments 1–5 and 7), as well as the effect of helping opportunities on our affective outrage item (anger) specifically (measured in all seven experiments).

Our results are quite robust across experiments: in all six experiments measuring outrage via our three-item scale, we found that the scale was reliable, and in five out of those six, we observed significantly more outrage in Condemnation Only than Condemnation + Helping. Moreover, this effect was significant in Experiment 7, demonstrating that the effect of helping opportunities on outrage was robust to telling Condemners that the Helper did not share (rather than asking them to imagine the Helper not sharing). Additionally, across all seven experiments, we always observed directionally more affective outrage in Condemnation Only, and this effect was significant in three experiments and overall, as well as marginally significant in two experiments. Furthermore, the effect was significant in Experiment 6, providing further evidence that the effect of helping opportunities on outrage was robust to telling Condemners that the Helper did not share, and also demonstrating that it was robust to framing the Condemner’s role more neutrally, and measuring affective outrage only.

Figure 1. Helping opportunities reduce moral outrage (while condemnation opportunities do not reduce helping) in one-shot anonymous interactions. In (a), we show box plots (drawing lines at the 25th, 50th, and 75th percentiles, and illustrating the minimum and maximum values) for outrage as a function of helping opportunities across Experiments 1–7. In (b), we plot the proportion of subjects helping as a function of condemnation opportunities in Experiment 1; error bars are 95% confidence intervals (CIs). See the online article for the color version of this figure.

Table 3
Analysis 1 Results, By Experiment

For each experiment: reliability of three-item outrage scale; effect of Condemnation Only dummy on three-item outrage scale; effect of Condemnation Only dummy on affective outrage (anger item).

Experiment 1 (n = 788): reliability = .88; three-item scale, B = .10, p = .006; anger item, B = .07, p = .050
Experiment 2 (n = 819): reliability = .88; three-item scale, B = .07, p = .036; anger item, B = .05, p = .123
Experiment 3 (n = 817): reliability = .86; three-item scale, B = .04, p = .274; anger item, B = .02, p = .635
Experiment 4 (n = 811): reliability = .88; three-item scale, B = .09, p = .011; anger item, B = .06, p = .094
Experiment 5 (n = 804): reliability = .88; three-item scale, B = .12, p = .001; anger item, B = .09, p = .012
Experiment 6 (n = 2,924): reliability = N/A; three-item scale, N/A; anger item, B = .09, p < .001
Experiment 7 (n = 1,477): reliability = .90; three-item scale, B = .08, p = .001; anger item, B = .06, p = .031
Aggregate (n varies): reliability = .88 (Experiments 1–5, and 7, n = 5,516); three-item scale, B = .08, p < .001 (Experiments 1–5, and 7, n = 5,516); anger item, B = .07, p < .001 (Experiments 1–7, n = 8,440)

Note. Sample sizes indicate subjects for whom outrage was measured (i.e., the Experiment 1 sample size excludes the Helping Only condition).

Finally, we consider a deflationary account of our results. Were our subjects sensitive to helping opportunities in our one-shot anonymous experiments simply because they held the mistaken explicit belief that other players really could observe their behavior and then influence their payoffs? To address this question, we analyzed responses to the post-experimental questions in Experiments 6 and 7 that measured subjects’ explicit beliefs about whether other players could influence their payoffs. Excluding subjects who reported that other player(s) could influence their payoffs, we still found that helping opportunities reduced affective outrage in Experiment 6 (B = 0.09, t = 3.74, p < .001, n = 1,703) and outrage in Experiment 7 (B = 0.08, t = 2.86, p = .004, n = 1,227). Thus, our results do not seem to have been driven by subjects who held the mistaken explicit belief that other players could observe their behavior and influence their payoffs.

Because the one-shot anonymous nature of our experiments is a critical design feature, we also examined the absolute percentage of subjects who correctly indicated that other players could not influence their payoffs. We found that 58.24% of subjects answered correctly in Experiment 6, while 83.07% answered correctly in Experiment 7. At face value, the percentage of correct responses in Experiment 6 seems worryingly low. However, given the Experiment 6 question wording, and the Experiment 7 result, we suspect that the rate of incorrect responses overestimates the frequency with which Experiment 6 subjects, while rating their outrage and making their punishment decisions, actually believed that other players could influence their payoffs.

Recall that Experiment 6 asked subjects who, if anyone, could influence their payoffs, and then presented four response options, three of which indicated that other player(s) could influence their payoffs. We worried that this setup may have suggested to subjects that other players could influence their payoffs, inducing this belief among subjects who had not previously held it while making their outrage and punishment decisions. We also considered that random responses (provided by hurried or inattentive subjects) would be incorrect 75% of the time. For these reasons, in Experiment 7, we asked a “yes or no” question of whether, while rating their outrage, subjects believed that any of the other players could influence their payoffs. Using this wording, we found a substantial increase in correct responding, consistent with the possibility that our Experiment 6 wording was suggestive.

Of course, it is difficult to completely avoid suggestion while measuring subjects’ beliefs about whether other players could influence their payoffs. Our data cannot decisively reveal the true percentage of subjects in each experiment who held this belief before we asked about it. Nonetheless, we see the comprehension rate in Experiment 7 as encouraging, and consistent with our general prior that it would be relatively unlikely for subjects to explicitly believe that other players could influence their payoffs (as this would require confabulating an additional component of the game that did not exist). And regardless of the true comprehension rate, our key hypothesis (that reputation heuristics can shape outrage in one-shot anonymous interactions) is supported by the fact that our results hold among subjects who explicitly understood that other players could not influence their payoffs. Of course, our results may have been driven by subjects who implicitly believed that other players could influence their payoffs, a possibility that is consistent with our reputation heuristics hypothesis.

Discussion

Analysis 1 supports our prediction that in one-shot anonymous interactions, subjects who did not have the opportunity to signal their trustworthiness via direct helping (and, thus, for whom punishment had greater potential signaling value) reacted to selfishness with more moral outrage. This finding does not appear to reflect a general mechanism whereby any response is less likely when two response options are available: as predicted, while helping opportunities reduced reported outrage, the opportunity to report outrage did not reduce helping.

Our Analysis 1 results align with previous findings that, in a context where reputation was actually at stake, helping opportunities reduced rates of costly punishment, while the reverse was not true (Jordan, Hoffman, Bloom, et al., 2016). Analysis 1 extends this pattern to the context of reported moral outrage in one-shot anonymous interactions.

We note that the observed effect of helping opportunities on outrage was relatively small. However, it is theoretically significant that helping opportunities (a proposed reputation cue) had any effect on outrage, given that the transgression in question was identical across conditions. This result provides support for the proposal that, as an adaptive motivator of punishment, moral outrage is not just an objective indicator of the magnitude of wrongdoing that has occurred. Rather, our data suggest that outrage, despite being experienced genuinely, can be influenced by the potential signaling value of punishment. This conclusion has the implication that in daily life, other reputation cues could also influence outrage; more broadly, our results support the theory that a reputation framework can shed light on our moral psychology, even in contexts where reputation is not at stake.

However, while our results are consistent with the hypothesis that helping opportunities influenced the subjective experience of moral outrage (i.e., that subjects genuinely felt more morally outraged when helping was not possible), we note that an alternative interpretation of our results is also possible. Specifically, it is possible that subjects who did not have the opportunity to help had the same subjective experience of moral outrage, but were driven to rate themselves as more morally outraged. In other words, helping opportunities may have influenced not feelings of moral outrage, but rather the drive to express those feelings, in this case via ratings on our moral outrage scale, which subjects may have treated as an opportunity for verbal condemnation (rather than a precise barometer of their subjective experience). It is difficult to discriminate between these possibilities, which are not mutually exclusive. An increase in self-reported outrage can always reflect an increase in the experience of outrage, or the drive to express it, but it is difficult to measure the subjective experience of outrage without self-report.

We do find it notable that helping opportunities reduced reported outrage even in Experiment 6, where we focused on the affective component of outrage (by measuring only anger), and framed the outrage-rating task more neutrally (by telling subjects to “rate their reaction towards” rather than “make a moral judgement about” the Helper). It seems possible that these changes reduced the extent to which subjects viewed our outrage-rating task as an opportunity to verbally condemn selfishness, such that Experiment 6 served as a purer measure of subjects’ true affective experience. Nonetheless, it is of course still possible that helping opportunities reduced reported affective outrage in Experiment 6 merely by modifying subjects’ drive to express their (unchanged) experience of affective outrage. Future work should seek to differentiate between these possibilities. Even if helping opportunities only influenced expressions of outrage, however, our results would still imply that the basic drive to express outrage, in a context where expressions are completely anonymous, is shaped by reputation cues. Thus, either interpretation suggests that a reputation framework can help explain a broad set of expressions of moral outrage and acts of punishment, even when reputation is not at stake.

Analysis 2

In Analysis 2, we aimed to provide more direct support for a reputation-based interpretation of our Analysis 1 results. Specifically, we sought to test the hypothesis that helping opportunities influenced reported outrage because they served as a cue of the potential signaling value of punishment. To this end, we conducted mediation analyses to investigate the mechanisms through which helping opportunities influenced reported outrage, and tested the prediction that they did so insofar as they were seen as a reputation-relevant cue.

Two Candidate Mediators

We investigated two reputation-relevant candidate mediators. First, in Experiments 2, 4, and 5, we measured the perceived reputation benefits of punishment. According to our theory, (a) because helping is such a diagnostic signal of trustworthiness, the potential reputation benefits of punishment should decline after a helping opportunity, and (b) moral outrage (or the drive to express outrage) should be sensitive to the potential reputation benefits of punishment. Thus, helping opportunities should influence outrage insofar as they are, in fact, seen as relevant to punishment’s potential reputation value. In other words, the perceived reputation benefits of punishment should mediate the effect of helping opportunities on outrage.
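The mediation logic described above (helping opportunities → perceived reputation benefits of punishment → outrage) can be illustrated with a product-of-coefficients decomposition. This is only a sketch: this section does not specify the estimator or the inference procedure used, so the ordinary-least-squares decomposition, the function names, and the variable names below are assumptions, and no standard errors or confidence intervals are computed.

```python
def _centered_cross_product(u, v):
    """Sum of (u - mean(u)) * (v - mean(v)) over paired observations."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def simple_mediation(x, m, y):
    """Decompose the total effect of x on y into direct and indirect parts.

    x: condition dummy (hypothetically, 1 = Condemnation Only).
    m: candidate mediator (e.g., perceived reputation benefits of punishment).
    y: outcome (outrage).
    """
    sxx, smm = _centered_cross_product(x, x), _centered_cross_product(m, m)
    sxm = _centered_cross_product(x, m)
    sxy, smy = _centered_cross_product(x, y), _centered_cross_product(m, y)
    det = sxx * smm - sxm ** 2

    a = sxm / sxx                           # path a: x -> m
    b = (sxx * smy - sxm * sxy) / det       # path b: m -> y, holding x fixed
    direct = (smm * sxy - sxm * smy) / det  # path c': x -> y, holding m fixed
    total = sxy / sxx                       # path c: x -> y on its own
    return {"a": a, "b": b, "indirect": a * b, "direct": direct, "total": total}
```

With least-squares estimates the total effect equals the direct effect plus the indirect effect `a * b`, so the indirect term is the portion of the condition effect that runs through the mediator.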

This pattern could reflect that when helping is possible, outrage declines insofar as people have learned that helping opportunities are a reliable cue that punishment will have limited reputation value. Alternatively, outrage might decline insofar as people respond to helping opportunities by computing, in the moment, that the potential reputation value of punishment is relatively small (Crockett, 2013; Cushman, 2013). Even under this second possibility, it seems unlikely that our subjects would consciously compute the reputation value of punishment, or that such reasoning would consciously influence outrage: in our outrage experiments, subjects did not actually make punishment decisions, and reputation was not actually at stake. Thus, we saw it as more likely that helping opportunities would unconsciously influence outrage insofar as they were implicitly seen as a reliable reputation cue, or influenced implicit computations about punishment’s reputation value. However, we reasoned that these implicit processes could likely be accessed by explicitly asking subjects to evaluate the reputation value of punishment. Thus, we directly asked subjects how a hypothetical act of punishment would be perceived by others, and treated this measure as our first candidate mediator.

Our second candidate mediator, which was measured in Experiments 3–5, was the extent to which subjects reported being generally concerned with their reputations (i.e., being a person who tends to desire positive social evaluation, and fear negative social evaluation). We initially intended for this measure to be a moderator and thus selected a scale that was designed to assess the general trait of concern with one’s reputation across contexts. However, we always collected this measure after our manipulation of helping opportunities, and found some evidence that it was influenced by helping opportunities (with a significant effect observed in Experiment 3, and a marginally significant effect observed in aggregated analyses). Thus, we concluded that it would be inappropriate to treat this measure as a moderator of the effect of helping opportunities.

Moreover, the evidence that our manipulation impacted reported trait reputation concerns suggests that to some degree, subjects were also reporting on their state reputation concerns. Thus, we chose to investigate general reputation concerns as a candidate mediator. Thus far, we have proposed that when helping is not possible, outrage is elevated insofar as subjects implicitly see punishment as having more reputation value. However, by treating general reputation concerns as a second mediator, we could also ask whether subjects who did not have the chance to help felt more concerned with their reputations, and whether such concerns might have shaped outrage.

Method

As discussed above, Experiments 2–5 each measured at least one of our candidate mediators. In each of these experiments, we always measured our mediators after measuring outrage and thus avoided activating reputation concepts before measuring outrage. This decision has an important advantage: we can be confident that evidence of mediation does not merely reflect that we induced subjects to think about the reputation value of punishment, or their general reputation concerns, before measuring outrage. However, it also has a disadvantage: responses to our outrage scale could have causally influenced responses to our mediator scales. This possibility is worth keeping in mind in the interpretation of our mediation analyses. However, we note that if we had measured our mediators before measuring outrage, while the act of reporting outrage could not have causally affected ratings of our mediators, the outrage subjects experienced (before being asked to report it) could still have causally affected these ratings. Thus, we view a possible causal path from our dependent variable to our mediating variables as an inherent issue that would be necessary to keep in mind, regardless of order.

Perceived reputation benefits of punishment. To measure the perceived reputation benefits of punishment, we instructed subjects to imagine that, instead of being asked to make a judgment about the Helper, they had been given the opportunity to punish the Helper with a financial fine. Specifically, we instructed subjects to imagine that (a) the Helper did not share with the Recipient, and (b) they were given 30¢, and had the opportunity to punish the Helper by paying 5¢ to deduct 15¢ from the Helper’s payoff. Then, subjects answered six questions, which measured their beliefs that punishing (if observed) would have positive reputation consequences, as compared to not punishing.

Models of punishment as a signal of trustworthiness show that, depending on the context, the act of punishing can be a positive signal (i.e., it can increase the punisher’s perceived trustworthiness), and the act of not punishing can be a negative signal (i.e., it can decrease the punisher’s perceived trustworthiness; Jordan, Hoffman, Bloom, et al., 2016; Jordan & Rand, 2017). For this reason, we asked three questions about the likely positive reputation consequences of punishing, and three questions about the likely negative reputation consequences of not punishing.

Specifically, we first asked subjects, if they were to punish the Helper, (a) how morally good this would make them look in the eyes of others, (b) how much this would benefit their reputation, and (c) how positively others would see this. Next, we asked subjects, if they were not to punish the Helper, (a) how immoral this would seem to somebody else, (b) to what extent this would make them look bad, and (c) how much this would reflect negatively on their reputation. These questions were presented in that fixed order, and were each answered on a Likert scale that ranged from 0 to 100 in 10-point increments, with extreme anchors reading not at all and very much. As our composite measure of the perceived reputation benefits of punishment, we took the average value across the six items in our scale (although we note that results were qualitatively equivalent when using only the positive or negative items).

We also note that our items were not neutrally framed (i.e., they suggested that punishing would be perceived neutrally or positively but not negatively, and that not punishing would be perceived neutrally or negatively but not positively). While this is likely to have affected absolute ratings of the perceived reputation value of punishment, we do not believe that it is likely to have interacted with our helping opportunities manipulation to produce the predicted mediation pattern. Finally, we note that in Experiment 4 (but not Experiments 2 or 5), we added an extra item to the end of our perceived reputation benefits of punishment scale, which was designed to measure subjects’ valuation of those benefits (see stimuli in the online supplementary material for details). We found no condition effect on this item and, thus, do not report analyses of it.

General reputation concerns. To measure general reputation concerns, we used a 16-item scale. Eight of the items were the eight straightforwardly worded items on the brief fear of negative evaluation scale (BFNE). The BFNE (Leary, 1983) is based on the fear of negative evaluation scale (FNE; Watson & Friend, 1969), which was designed to measure the extent to which people are afraid of being evaluated negatively by others, and predicts behaviors like working hard to gain approval in the eyes of others, as well as traits like social approval seeking. The eight straightforwardly worded BFNE items have been shown to correlate more strongly with theoretically related measures than reverse-worded items (Rodebaugh et al., 2004). The other eight items in our general reputation concerns scale were designed by us to mirror these eight BFNE items, but measure the desire for positive evaluation.

All 16 items were measured as in the BFNE: with 1–5 Likert scales with anchors at every item, ranging from not at all characteristic of me to extremely characteristic of me. We presented the 16 items in a pseudorandom order across two pages. For all subjects, each page had the same four fear of negative evaluation items and four desire for positive evaluation items, but we randomized the order of the items within each page across subjects. As our composite general reputation concerns measure, we took the average value across our 16 scale items (although the results were qualitatively equivalent when using only the positive or negative items). Finally, we note that in Experiments 4 and 5, which measured both mediators, we randomized between subjects the order in which they were measured.
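The page-wise randomization and the composite score described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' survey code: the item labels (FNE1, DPE1, and so on), the assignment of particular items to each page, and the per-subject seed are all hypothetical.

```python
import random

# Hypothetical item labels: FNE = fear of negative evaluation,
# DPE = desire for positive evaluation. The real item wording and the
# assignment of items to pages are not given in the text.
PAGE_1 = ["FNE1", "FNE2", "FNE3", "FNE4", "DPE1", "DPE2", "DPE3", "DPE4"]
PAGE_2 = ["FNE5", "FNE6", "FNE7", "FNE8", "DPE5", "DPE6", "DPE7", "DPE8"]


def presentation_order(subject_seed):
    """Return the two pages for one subject, shuffled within each page."""
    rng = random.Random(subject_seed)
    pages = []
    for fixed_items in (PAGE_1, PAGE_2):
        page = list(fixed_items)  # every subject sees the same items on this page
        rng.shuffle(page)         # only the order within the page varies
        pages.append(page)
    return pages


def composite_score(ratings):
    """Average the 1-5 ratings across all 16 items (dict: item label -> rating)."""
    if len(ratings) != 16:
        raise ValueError("expected ratings for all 16 items")
    return sum(ratings.values()) / len(ratings)
```

Seeding the generator per subject is a design choice made here only so that an order can be reproduced; any source of randomness would serve the same purpose.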

Results

Perceived reputation benefits of punishment (PRBP). We began by investigating our first candidate mediator in Experiments 2, 4, and 5. Collapsing across these three experiments, we found that our six-item PRBP scale was reliable (α = .92). Before testing for mediation, we also estimated the total effect of helping opportunities on outrage in Experiments 2, 4, and 5 (which was slightly different from the results reported in Analysis 1, because it excluded Experiments 1 and 3). Within these experiments, we observed significantly more outrage when helping was not possible, B = 0.09, t = 4.57, p < .001, n = 2,434.
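For readers unfamiliar with scale reliability, the sketch below computes a reliability coefficient for a multi-item scale. It assumes the coefficient reported above is Cronbach's alpha, computed as k/(k − 1) × (1 − sum of item variances / variance of total scores); the function name and the toy ratings are hypothetical and are not the paper's data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of subjects, each a list of k item scores."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two subjects")
    item_columns = list(zip(*responses))  # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Made-up ratings for four subjects on three items (not data from the paper).
toy_ratings = [
    [10, 20, 10],
    [40, 50, 40],
    [70, 60, 80],
    [90, 100, 90],
]
alpha = cronbach_alpha(toy_ratings)
```

Values near 1 indicate that the items move together across subjects, which is the sense in which a composite average of the items is treated as a single measure.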

Next, we tested for mediation, and found the predicted pattern (Figure 2a). First, helping opportunities attenuated the perceived reputation benefits of punishment. Subjects in Condemnation Only reported that punishment would have significantly greater reputational benefits (M = 3.77, SD = 2.43) than subjects in Condemnation + Helping did (M = 3.24, SD = 2.47), B = 0.11, t = 5.42, p < .001, n = 2,434. This suggests that subjects did, in fact, treat helping opportunities as a cue of punishment’s reputation value. Second, predicting outrage as a function of condition and PRBP, we found a significant effect of PRBP, B = 0.51, t = 29.47, p < .001, n = 2,434. This suggests that individuals who believed that punishing would confer larger reputation benefits experienced more outrage, which is consistent with the theory that outrage functions to motivate punishment and thus is sensitive to its perceived reputation value.

Finally, we investigated the indirect effect of helping opportunities on outrage through PRBP, and the direct effect of helping opportunities on outrage. For all analyses, we calculated indirect and direct effects using standardized β coefficients and Preacher and Hayes’s (2008) bootstrapping procedure with 5,000 resamples. We found a significant indirect effect of .06 [.04, .08], and a significant direct effect of 0.04 [.002, .07]. Comparing the direct effect to the total effect of helping opportunities revealed that 61% of the total effect was mediated by PRBP.
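The logic of a single-mediator analysis with a bootstrapped indirect effect can be sketched as below. This is an illustration under stated assumptions rather than the authors' analysis code: it uses ordinary least squares with standardized coefficients, a simple percentile interval (the cited bootstrapping procedure admits several interval variants), synthetic data, and hypothetical function names and condition coding.

```python
import random


def _cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def mediation_paths(x, m, y):
    """Standardized OLS paths for x -> m -> y: returns (a, b, direct, total)."""
    sxx, smm, syy = _cov(x, x), _cov(m, m), _cov(y, y)
    sxm, sxy, smy = _cov(x, m), _cov(x, y), _cov(m, y)
    det = sxx * smm - sxm ** 2                    # zero only if x and m are collinear
    a = sxm / (sxx * smm) ** 0.5                  # mediator regressed on condition
    b = (smy * sxx - sxy * sxm) / det * (smm / syy) ** 0.5       # mediator slope in y ~ x + m
    direct = (sxy * smm - smy * sxm) / det * (sxx / syy) ** 0.5  # condition slope in y ~ x + m
    total = sxy / (sxx * syy) ** 0.5              # outcome regressed on condition alone
    return a, b, direct, total


def bootstrap_indirect_ci(x, m, y, resamples=5000, seed=0):
    """Approximate percentile 95% interval for the indirect effect a * b."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < resamples:
        idx = [rng.randrange(n) for _ in range(n)]  # resample subjects with replacement
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:                        # skip draws containing one condition only
            continue
        a, b, _, _ = mediation_paths(xs, [m[i] for i in idx], [y[i] for i in idx])
        estimates.append(a * b)
    estimates.sort()
    return estimates[int(0.025 * resamples)], estimates[int(0.975 * resamples) - 1]


def simulate(n=200, seed=1):
    """Synthetic data (not the paper's): condition raises the mediator, which raises the outcome."""
    rng = random.Random(seed)
    x = [i % 2 for i in range(n)]                   # hypothetical coding: 1 = no helping opportunity
    m = [2.0 * xi + rng.gauss(0, 1) for xi in x]
    y = [1.0 * mi + 0.5 * xi + rng.gauss(0, 1) for xi, mi in zip(x, m)]
    return x, m, y
```

With OLS estimates from the same sample, the total effect equals the direct effect plus the indirect effect, so the share of the total effect that is mediated can be computed either as `a * b / total` or as `1 - direct / total`. Percentages computed from rounded coefficients may differ from those computed on unrounded estimates.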

General reputation concerns (GRC). We next investigated our second candidate mediator in Experiments 3–5. Collapsing across these three experiments, we found that our 16-item GRC scale was reliable (α = .96). Before testing for mediation, we also estimated the total effect of helping opportunities on outrage in Experiments 3–5. Within these experiments, we observed significantly more outrage when helping was not possible, B = 0.08, t = 4.01, p < .001, n = 2,432.

Next, we tested for mediation, and found equivocal evidence (Figure 2b). First, helping opportunities had a marginally significant effect on general reputation concerns. Subjects in Condemnation Only reported being marginally significantly more concerned with their reputations (M = 2.99, SD = 0.96) than subjects in Condemnation + Helping did (M = 2.92, SD = 0.97), B = 0.04, t = 1.79, p = .073, n = 2,432. This suggests that having the chance to help may have reduced the extent to which subjects felt concerned with their reputations. Second, predicting outrage as a function of condition and GRC, we found a significant effect of GRC, B = 0.20, t = 10.19, p < .001, n = 2,432. This suggests that individuals with greater general reputation concerns reported more outrage. This correlation is consistent with the theory that reputation concerns drive punishment and thus shape outrage as a motivator of punishment.

Figure 2. Reputation constructs mediate the effect of helping opportunities on outrage. We illustrate the effect of our helping opportunities manipulation on moral outrage, as mediated by (a) the perceived reputation benefits of punishment (in a single mediation analysis of Experiments 2, 4, and 5), (b) general reputation concerns (in a single mediation analysis of Experiments 3–5), and (c) both candidate mediators (in a multiple mediation analysis of Experiments 4 and 5).

Finally, we estimated the indirect effect of helping opportunities through GRC, as well as the direct effect of helping opportunities. We observed a marginally significant indirect effect of .01 [−.001, .02], and a significant direct effect of 0.07 [.04, .11]. Comparing the direct and total effects revealed that 9% of the total effect was mediated by GRC.

Multiple mediation. Finally, we simultaneously investigated both mediators in Experiments 4 and 5. First, within these two experiments, we investigated the total effect of helping opportunities on outrage. We observed significantly more outrage when helping was not possible, B = 0.10, t = 4.14, p < .001, n = 1,615.

Next, we conducted a multiple mediation analysis (Figure 2c). First, we found that helping opportunities significantly influenced both mediators. In Condemnation Only, subjects both reported that the reputation value of punishment was higher (B = 0.09, t = 3.71, p < .001, n = 1,615) and that they were more concerned with their reputations (B = 0.07, t = 2.68, p = .007, n = 1,615). Second, predicting outrage as a function of condition, PRBP, and GRC, we found significant effects of both PRBP (B = .49, t = 22.82, p < .001, n = 1,615) and GRC (B = .12, t = 5.71, p < .001, n = 1,615). This result suggests that the perceived reputation benefits of punishment and general reputation concerns may have had independent effects on outrage.

Finally, we estimated the indirect effects of each mediator, as well as the direct effect of helping opportunities. We found significant indirect effects through both PRBP (.05 [.02, .07]) and GRC (.01 [.001, .01]), resulting in a significant total indirect effect (.05 [.03, .08]). We also found a significant direct effect of .05 [.01, .09]. Comparing the direct and total effects revealed that 52% of the total effect was mediated by our mediators. (Note that this percentage is smaller than what was reported above for PRBP alone because in Experiment 2, which only measured PRBP, PRBP mediated considerably more of the total effect than it did in Experiments 4 and 5.)

Mediation results by experiment. Finally, we conducted mediation analyses separately by experiment (see Table 4). All three experiments measuring PRBP showed a consistent pattern: when subjects did not have the opportunity to help, they reliably reported increased PRBP, which reliably predicted increased outrage, so we reliably observed indirect effects through PRBP.

In contrast, across the three experiments measuring GRC, we saw a mixed pattern. In all three experiments, GRC reliably predicted increased outrage. However, we found equivocal evidence regarding the effect of the Condemnation Only condition on GRC and, thus, the indirect effect through GRC. Specifically, the marginally significant positive indirect effect in our aggregated analysis was driven most strongly by the significant positive effect in Experiment 5. It was also consistent with the directionally positive effect in Experiment 4, but not the directionally negative effect in Experiment 3. In interpreting this pattern, it is perhaps worth noting that Experiment 3 was the one experiment in which we did not observe a significant effect of helping opportunities on outrage (see Table 3); this might suggest that for some reason (e.g., a randomization failure), the effect of our manipulation on both outrage and GRC was meaningfully different in Experiment 3.

Finally, in both experiments measuring both PRBP and GRC, our multiple mediation analyses produced fairly consistent results. Both experiments showed a significant positive indirect effect of PRBP, and a directionally positive indirect effect of GRC (although this effect was only significant in Experiment 5).

Discussion

Together, Analysis 2 supports the hypothesis that helping opportunities shape reported outrage insofar as they serve as a reputation-relevant cue. We found robust evidence for partial mediation through the perceived reputation benefits of punishment. When subjects did not have the chance to help, they saw punishing as having greater reputation value, and insofar as this was true, their outrage was heightened. This pattern is consistent with our theory that helping opportunities influence outrage because helping is a stronger signal of trustworthiness than punishment, and outrage is sensitive to the potential signaling value of punishment.

Table 4
Analysis 2 Results, By Experiment

| Statistic | Experiment 2 (n = 819) | Experiment 3 (n = 817) | Experiment 4 (n = 811) | Experiment 5 (n = 804) | Aggregate (n varies) |
|---|---|---|---|---|---|
| Single mediation by Perceived Reputation Benefits of Punishment (PRBP; aggregate n = 2,434) | | | | | |
| Effect of Condemnation Only (CO) dummy on PRBP | B = .14, p < .001 | | B = .09, p = .007 | B = .09, p = .011 | B = .11, p < .001 |
| Effect of PRBP on outrage (controlling for CO) | B = .53, p < .001 | | B = .50, p < .001 | B = .51, p < .001 | B = .51, p < .001 |
| Indirect effect of CO on outrage via PRBP | .07 [.04, .11] | | .05 [.01, .08] | .05 [.01, .08] | .06 [.04, .08] |
| Single mediation by General Reputation Concerns (GRC; aggregate n = 2,432) | | | | | |
| Effect of CO on GRC | | B = −.02, p = .505 | B = .05, p = .157 | B = .08, p = .018 | B = .04, p = .073 |
| Effect of GRC on outrage (controlling for CO) | | B = .21, p < .001 | B = .18, p < .001 | B = .21, p < .001 | B = .20, p < .001 |
| Indirect effect of CO on outrage via GRC | | −.005 [−.02, .01] | .01 [−.004, .02] | .02 [.001, .03] | .01 [−.001, .02] |
| Multiple mediation by both PRBP and GRC (aggregate n = 1,615) | | | | | |
| Indirect effect of CO on outrage via PRBP in multiple mediation | | | .05 [.01, .08] | .04 [.01, .08] | .05 [.02, .07] |
| Indirect effect of CO on outrage via GRC in multiple mediation | | | .01 [−.003, .01] | .01 [.001, .02] | .01 [.001, .01] |

Note. For a and b paths, we show standardized coefficients and p values, and for indirect effects, we show standardized coefficients and 95% confidence intervals (CIs).

We also found equivocal evidence for partial mediation through general reputation concerns. When subjects did not have the chance to help, our results suggest that they may have felt somewhat more concerned with their reputations. This effect was only marginally significant despite our large sample sizes, and the effect size was very small. The relative weakness of this effect could reflect that we designed our reputation concerns scale as a trait measure, and that it measured general reputation concerns, rather than moral reputation concerns specifically. Nonetheless, it might also reflect that helping opportunities genuinely do not have much effect on reputation concerns. However, while the effect of helping opportunities on general reputation concerns was marginal, we did observe a robust correlation between general reputation concerns and outrage. This correlation supports our reputation framework for moral outrage.

Reputation in the eyes of whom? Our reputation heuristics theory proposes that even when reputation is not at stake, people may engage in reputation-relevant computations that make them sensitive to reputation cues. However, what kind of reputation-relevant computations? Our candidate mediators focused on reputation in the eyes of vaguely described “others.” We asked subjects to report on how others would evaluate them if they chose to punish (or not), and how concerned they were with others evaluating them positively (or negatively). We reasoned that a plausible mechanism through which our subjects may have implemented a reputation heuristic, and shown a sensitivity to helping opportunities, is by engaging in (likely implicit) computations about their hypothetical reputation in the eyes of other (but absent) individuals.

However, other mechanisms are also plausible: subjects may have engaged in reputation-based computations that did not concern reputation in the eyes of absent others. For example, subjects may have conducted computations about their reputation in their own eyes. People can always observe their own behavior, and are strongly driven to view themselves as morally good; a large body of work demonstrates the importance of self-concept management in shaping our moral psychology and behavior (Aquino & Reed, 2002; Mazar, Amir, & Ariely, 2008; Merritt, Effron, & Monin, 2010; Monin & Miller, 2001; Perugini & Leone, 2009; Sachdeva, Iliev, & Medin, 2009; Young, Chakroff, & Tom, 2012). Likewise, despite the fact that the experimenter could not link subjects’ responses to their identities, subjects may have conducted computations about their reputations in the eyes of the experimenter. Such computations could reflect a general heuristic to care about what people will think of your behavior, even if they cannot identify you or will not interact with you in the future.

Moreover, when measuring our candidate mediators, we may have tapped these alternative reputation computations. Specifically, when measuring our first candidate mediator, we asked subjects how others would perceive the choice to (or not to) punish selfishness; however, their responses may have reflected how they would have perceived their own choice, or how the experimenter would have perceived it. Likewise, when measuring our second candidate mediator, we asked subjects how concerned they typically are with the way others evaluate them; however, their responses may have reflected concerns with their own self-evaluations, or evaluations from the experimenter.

These different reputation-based computations are theoretically distinct, but teasing them apart empirically is a challenge: they are not mutually exclusive, and may often be strongly positively correlated. Moreover, all of these reputation-based computations could ultimately function to implement reputation heuristics in anonymous interactions. For these reasons, we did not attempt to discriminate between them in our mediation analyses.

However, in Experiment 6, we did collect some exploratory data designed to investigate the extent to which subjects reported being concerned with signaling to others, themselves, and the experimenter. These items were retrospectively measured after outrage, punishment, and the post-experimental question about whether other players could influence subjects’ bonuses. We observed strong positive correlations between them (Bs ranging from .66 to .83, all ps < .001), and none of them were influenced by our manipulation of helping opportunities; thus, we did not treat them as mediators. However, descriptive statistics about these variables may provide some interesting and suggestive information about the mechanisms through which reputation heuristics operate in the context of our experiments.

Specifically, in our post-experimental survey (i.e., after we measured both outrage and punishment), we asked subjects to report the extent to which they had been concerned with whether their decisions would (a) make them look like a good person in the eyes of others (other-signaling concerns), (b) make them look like a good person in the eyes of the “HIT requestor” (i.e., the experimenters; experimenter-signaling concerns), and (c) make them think that they were a good person (self-signaling concerns), using 1–7 Likert scales ranging from not concerned at all to very concerned. We randomized the order of these three questions between subjects, and in the Condemnation + Helping condition, we specifically asked subjects about the extent to which they had held such concerns while in the role of the condemner. We observed moderate levels of all three types of signaling concerns, with somewhat higher levels for self-signaling concerns (M = 3.38, SD = 2.07) than other-signaling concerns (M = 2.87, SD = 1.96; paired-sample t test: t = 18.51, p < .001, n = 2,924), and somewhat higher other-signaling concerns than experimenter-signaling concerns (M = 2.74, SD = 1.93; paired-sample t test: t = 5.89, p < .001, n = 2,924). These results suggest that all of these reputation concerns are plausible mechanisms through which subjects may have implemented reputation heuristics in the context of our one-shot anonymous experiments, and should be investigated in future research.
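The paired-sample comparisons reported above rest on the within-subject differences between two ratings. A minimal sketch of the test statistic follows; the function name and the five made-up ratings are hypothetical and are not the paper's data.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """t statistic and degrees of freedom for a paired-sample t test."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [a - b for a, b in zip(first, second)]  # one difference per subject
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    # Undefined (division by zero) if every subject has the same difference.
    return mean(diffs) / standard_error, len(diffs) - 1


# Made-up 1-7 ratings from five subjects (not data from the paper).
self_signaling = [4, 5, 3, 6, 2]
other_signaling = [3, 3, 2, 5, 2]
t_stat, df = paired_t(self_signaling, other_signaling)
```

Because the standard error shrinks with the square root of the number of pairs, small mean differences can yield large t values in samples of several thousand subjects.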

Analysis 3

Together, Analyses 1 and 2 supported our proposal that moral outrage is sensitive to the potential signaling value of punishment, and thus is influenced by helping opportunities even in one-shot anonymous interactions. In Analysis 3, we tested the prediction that costly punishment decisions are also sensitive to helping opportunities in one-shot anonymous interactions.

Method

Design. To this end, we conducted three additional experiments (Experiments 8–10) that measured costly punishment in one-shot anonymous interactions. Their design was very similar to that of Experiments 1–7, except that the dependent variable was costly punishment, not moral outrage. Thus, it was also very similar to previous research showing that helping opportunities reduce punishment in a situation where reputation was at stake (Jordan, Hoffman, Bloom, et al., 2016), except that we modified the design so that reputation was not at stake.

Thus, Experiments 8–10 used the Third-Party Punishment Game (TPPG) from Jordan, Hoffman, Bloom, et al. (2016). The TPPG was identical to the TPCG described previously, except that the Condemner was replaced with a Punisher. The Punisher, thus, made an incentivized decision that was similar to the hypothetical punishment decision described to subjects in our perceived reputation benefits of punishment scale. As in Experiments 1–7, subjects in Experiments 8–10 read about the TPPG and their role(s) in it. In the Punishment Only condition, target subjects played once as the Punisher. In Punishment + Helping, they played twice: once as the Punisher, and once as the Helper.

Experiments 8–10 measured punishment via the strategy method: Punishers were endowed with 20¢, and, without knowing whether or not the Helper chose to share with the Recipient, decided whether or not to commit to punishing the Helper in the event that the Helper chose not to share. Specifically, Punishers could commit to paying 5¢ to punish the Helper by deducting 15¢ from the Helper’s payoff, in the event that the Helper did not share. By using the strategy method, we were able to obtain an incentivized measure of punishment of selfishness for all Punishers, despite the fact that not all Helpers selfishly declined to share. We note that the strategy method is a standard approach for measuring third-party punishment (Fehr & Fischbacher, 2004) and evidence suggests that it does not influence rates of punishment (Jordan et al., 2015).
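The strategy-method decision described above can be sketched as a small payoff rule. This is an illustration, not the experiment's implementation: the function and constant names are hypothetical, and the sketch assumes that the 5¢ cost is charged only when the punishment is actually carried out (that is, when the Punisher committed and the Helper did not share).

```python
ENDOWMENT_CENTS = 20    # Punisher's endowment
PUNISH_COST_CENTS = 5   # paid by the Punisher when punishment is carried out
FINE_CENTS = 15         # deducted from the Helper when punishment is carried out


def resolve_punishment(commits_to_punish, helper_shared):
    """Return (punisher_payoff, helper_deduction) in cents for one pairing.

    The Punisher commits before learning the Helper's choice, so the
    commitment is recorded for every Punisher; it only takes effect
    when the Helper turns out not to have shared.
    """
    punishment_happens = commits_to_punish and not helper_shared
    if punishment_happens:
        return ENDOWMENT_CENTS - PUNISH_COST_CENTS, FINE_CENTS
    return ENDOWMENT_CENTS, 0
```

The point of the design is visible in the signature: `commits_to_punish` is observed for every Punisher, including those whose Helper shared, which is what yields a punishment measure for the whole sample.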

In addition to Experiments 8–10, as previously described, Experiment 6 also measured costly punishment. Specifically, in Experiment 6, after manipulating helping opportunities and measuring outrage, we explained the punishment decision described above, and then measured punishment. Thus, Experiments 6 and 8–10 all manipulated helping opportunities and measured punishment, and we analyzed them together in Analysis 3. In the context of Analysis 3 (as well as all other punishment analyses in this article) we refer to the Experiment 6 conditions as “Punishment Only” and “Punishment + Helping” (rather than “Condemnation Only” and “Condemnation + Helping,” as in the context of our outrage analyses).

However, recall that in Experiment 6, all subjects of interest were matched with a Helper who did not share, and were told that the Helper did not share before rating their outrage. Thus, Experiment 6 subjects also knew that the Helper did not share before deciding whether to punish; in other words, Experiment 6 did not use the strategy method, and helps test whether our results are robust to this methodological distinction. Additionally, in Experiment 6, we presented subjects with a filler task after measuring outrage but before measuring punishment (see Procedure for more details). Our goal was to reduce the probability that measuring outrage influenced punishment ratings via anchoring or consistency effects (that could cause subjects to match their punishment decisions to their outrage ratings) to facilitate comparison between Experiment 6 and our other punishment experiments (that did not measure outrage).

Moreover, several other details varied across our set of punishment experiments (see Table 2 for an overview of differences). First, as described previously, subjects in Experiment 6 who had the opportunity to help always made their helping decision before we measured outrage and, subsequently, punishment. In contrast, within the Punishment + Helping conditions of Experiments 8–10, we always counterbalanced the order of helping and punishment decisions. (For analyses of order effects within the Punishment + Helping conditions of our punishment experiments that used counterbalancing, see online supplementary material.) Second, Experiments 8 and 9 included a Helping Only condition (in which target subjects played the TPPG once as the Helper).

Third, while reputation was never at stake in Experiments 6 and 8, Experiments 9 and 10 also included a manipulation of whether reputation was at stake (that we examine in Analysis 5b). Specifically, for half of subjects, like in Experiments 6 and 8, the experiment ended after the TPPG; thus, TPPG decisions had no reputation consequences. These are the subjects who we analyze in Analysis 3, which investigates one-shot anonymous punishment. However, for the other half of subjects, the TPPG was followed by an economic Trust Game (TG), as in Jordan, Hoffman, Bloom, et al. (2016). In this TG, another MTurk worker, who was not involved in the TPPG, decided how much money to entrust the target subject with, and could condition this decision on the target subject’s TPPG behavior. Thus, reputation was at stake. In Analysis 5b, we provide more methodological details about our TG manipulation, and investigate costly punishment when reputation is at stake. (Hereafter, we refer to Experiments 6 and 8, and the “No Trust Game” conditions of Experiments 9 and 10, as our “No TG punishment experiments”; and we refer to the “Trust Game” conditions of Experiments 9 and 10, as well as two other very similar experiments using a Trust Game, as our “TG punishment experiments.”)

Finally, for completeness we note that in Experiments 8 and 9, after subjects finished their economic game decisions (i.e., punishment and/or helping), they completed some emotion ratings. Specifically, subjects in Punishment Only completed our three-item outrage scale, subjects in Helping Only completed a three-item scale measuring positive emotions toward the Recipient, and subjects in Punishment + Helping completed both scales. This design makes it possible to analyze the effect of helping opportunities on outrage in these experiments; however, we leave this analysis to the online supplementary material, because due to a programming error we failed to counterbalance the order of scale presentation (outrage or positive emotions first) in the Punishment + Helping condition. As such, we confounded the effect of helping before rating outrage with the effect of rating positive emotions before rating outrage, and suspect that these two manipulations may have had countervailing effects. See online supplementary materials for complete methodological details, analyses, and discussion.

Subjects. As reported above, in Experiment 6 we requested a target of n = 1,500 subjects per condition from MTurk (i.e., a total of n = 3,000 subjects). In Experiments 8 and 9, which both included Helping Only conditions, we requested a target of n = 400 subjects per condition (i.e., a total of n = 1,200 subjects in Experiment 8, and across the No TG conditions of Experiment 9). In Experiment 10, which did not include a Helping Only condition, we decided (before data collection) to request a larger sample of n = 775 subjects per condition (i.e., a total of n = 1,550 subjects across the No TG conditions) for increased power because this experiment sought to detect an interaction between helping opportunities and the presence of a TG. Our final sample of No TG punishment experiments includes n = 6,863 subjects (n = 3,066 in Punishment Only, n = 3,010 in Punishment + Helping, and n = 787 in Helping Only), M_age = 35.64 years, SD_age = 11.61 years, 45% male.

Procedure. The Experiment 8–10 procedure was analogous to that of our outrage experiments, but with the above-described design changes. Our TPPG instructions were followed by four TPPG comprehension questions. Two tested comprehension of the incentive structure underlying the Helper’s decision (as in the TPCG); the other two focused on the Punisher’s decision. When measuring punishment, we reminded subjects that they had 20¢, and that their job was to decide whether to pay 5¢ to deduct 15¢ from the Helper, if the Helper chose not to share with the Recipient. We then asked them to make a decision, which we subsequently repeated back to them.
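The punishment decision described above can be sketched as a small payoff function. This is only an illustration: the function and constant names are hypothetical, and the sketch covers just the amounts stated here (a 20¢ endowment, a 5¢ cost, and a 15¢ deduction), not the full game.

```python
# Illustrative sketch of the Punisher's decision; all names are hypothetical.
PUNISHER_ENDOWMENT = 20  # cents, as stated in the procedure
PUNISH_COST = 5          # cents paid by the Punisher
PUNISH_DEDUCTION = 15    # cents deducted from the Helper


def punishment_outcome(helper_shared: bool, punish_if_not_shared: bool):
    """Return (punisher_payoff, helper_deduction) in cents.

    Punishment is carried out only if the Helper did not share,
    mirroring the conditional decision described in the text.
    """
    punished = (not helper_shared) and punish_if_not_shared
    punisher_payoff = PUNISHER_ENDOWMENT - (PUNISH_COST if punished else 0)
    helper_deduction = PUNISH_DEDUCTION if punished else 0
    return punisher_payoff, helper_deduction


print(punishment_outcome(helper_shared=False, punish_if_not_shared=True))
print(punishment_outcome(helper_shared=True, punish_if_not_shared=True))
```

Under these assumptions, choosing to punish costs the Punisher a quarter of the endowment, and only when the Helper did not share.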

In Experiment 6, after measuring outrage, we presented subjects with a filler task that involved memorizing a list of words. Specifically, we informed subjects that they would be shown a list of 20 words for 60 s, and instructed them to try their best to study and remember them without writing them down, before attempting to recall as many as possible on the next screen. Then, we presented 20 neutral words (with no moral content), while a timer counted down from 1 min. Finally, the screen advanced and subjects were asked to recall as many words as they could. Subjects were informed that their performance in this task would have no bearing on their bonus payments.

Afterward, we informed subjects that they would move on to the “next phase” of the game, where they would have the opportunity to make another decision. Then, we explained their punishment decision, and presented them with the two punishment-relevant TPPG comprehension questions. Finally, we measured punishment. Punishment was measured as in Experiments 8–10, except that in Experiment 6, subjects had already been told that the Helper did not share and were asked whether they wanted to punish (whereas in Experiments 8–10, subjects were asked whether they wanted to punish if the Helper did not share).

Results

We began by investigating the effect of helping opportunities on punishment in an aggregated analysis of our No TG punishment experiments. As predicted and illustrated in Figure 3a, subjects were more likely to punish in Punishment Only (32%) than Punishment + Helping (27%), OR = 1.31, z = 4.78, p < .001, n = 6,076. Thus, when subjects had the opportunity to signal their trustworthiness via direct helping, they were less likely to pay to punish, even though reputation was not actually at stake.

Next, as in Analysis 1, we asked whether this effect simply reflected that subjects in Punishment + Helping had two actions available to them. To address this question, we investigated whether punishment opportunities reciprocally influenced helping in the subset of our No TG punishment experiments that included a Helping Only condition. On the contrary, as predicted and illustrated in Figure 3b, subjects in these experiments helped at comparable rates in Helping Only (58%) and Punishment + Helping (57%), OR = 1.05, z = .43, p = .669, n = 1,556. We also investigated only these experiments, and used linear regressions to predict both punishment and helping as a function of condition. We found that the standardized condition coefficient was marginally significantly larger when predicting punishment (B = .08, SE = .03, p = .002) than helping (B = .01, SE = .03, p = .669), z = 1.90, p = .058.

Next, we investigated the effect of helping opportunities on punishment by experiment (see Table 5). In three out of our four experiments, we observed significantly more punishment in Punishment Only than Punishment + Helping, and in the fourth, we observed a marginally significant effect in the same direction. Thus, our results were fairly robust across experiments. In particular, we note that compared with Experiments 8–10, Experiment 6 showed a similar effect, despite its methodological differences. This suggests that our key result was robust to whether outrage was measured before punishment, and whether the strategy method was used to measure punishment.

Finally, as in Analysis 1, we investigated whether helping opportunities merely influenced one-shot anonymous punishment among subjects who held the mistaken explicit belief that other players could observe their behavior and influence their payoffs. The only punishment experiment in which we measured these beliefs was Experiment 6; thus, we asked whether helping opportunities reduced costly punishment in Experiment 6, excluding all subjects who reported that other player(s) could influence their bonuses. Indeed, we continued to observe more punishment in Punishment Only than Punishment + Helping, OR = 1.34, z = 2.47, p = .014, n = 1,703. Thus, our punishment results do not seem to be driven by subjects who held the mistaken explicit belief that other players could observe their behavior and influence their payoffs.
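For readers who wish to check how the reported z statistics map onto the reported p values, the following is a minimal sketch. It assumes two-sided tests against a standard normal distribution, which is the usual convention for logistic regression coefficients but is not stated explicitly here; the function name is hypothetical.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2))


# z statistics reported in the Results above.
for label, z in [("aggregate", 4.78), ("Experiment 6 subset", 2.47)]:
    print(f"{label}: z = {z:.2f}, p = {two_sided_p(z):.4f}")
```

Because the reported z values are rounded to two decimals, p values recomputed this way can differ from the reported ones in the last digit.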

Figure 3. Helping opportunities reduce punishment (while punishment opportunities do not reduce helping) in one-shot anonymous interactions. In (a), we plot the proportion of subjects punishing as a function of helping opportunities across our No Trust Game (No TG) punishment experiments. In (b), we plot the proportion of subjects helping as a function of punishment opportunities across the subset of these experiments with a Helping Only condition. Error bars are 95% confidence intervals (CIs). See the online article for the color version of this figure.

Discussion

Analysis 3 provides evidence that in one-shot anonymous interactions, helping opportunities can influence costly punishment. It demonstrates that in settings where reputation is not at stake, reputation cues do not merely have the potential to influence reported outrage (as shown in Analysis 1); they can also influence the willingness to pay actual costs to punish wrongdoers. As in the context of outrage, this effect is relatively small, but has the important implication that a reputation framework can help explain one-shot anonymous punishment.

Analysis 4

Together, Analyses 1 and 3 provided evidence that helping opportunities reduce outrage and punishment in one-shot anonymous interactions, and Analysis 2 provided evidence that helping opportunities specifically reduce outrage insofar as they serve as a reputation-relevant cue. In Analysis 4, we aimed to further support our reputation-based theory by testing a deflationary explanation for the effects of helping opportunities on outrage and punishment.

As discussed previously, our theory posits that helping opportunities should reduce outrage and punishment for all subjects, regardless of whether or not they chose to help. After having the chance to help, the positive reputations of people who did help should be relatively established, and the negative reputations of people who did not help should be relatively established; so for everyone, the potential reputation value of punishing should decline.

However, one might imagine that helping opportunities specifically reduced outrage and punishment among subjects who declined to help, for two reasons. First, declining to help and then condemning another non-helper is hypocritical, and hypocrites are viewed negatively (Barden, Rucker, & Petty, 2005; Effron, Lucas, & O’Connor, 2015; Jordan, Sommers, Bloom, & Rand, 2017); thus, hypocrisy aversion could reduce outrage among non-helpers. Second, declining to help could increase empathy for other non-helpers, reducing outrage.

Our Analysis 2 results provide some evidence that this empathy mechanism is not the sole driver of our results: an empathy-based mechanism would not predict the observed mediation patterns. However, our finding that helping opportunities reduced the perceived reputation value of punishment could merely reflect subjects who declined to help perceiving that punishment would be seen as hypocritical, harming their reputations. While this possibility would still support the theory that reputation concerns shape outrage and punishment in one-shot anonymous contexts, our reputation-based theory is based on a broader reputation mechanism that should extend to both helpers and non-helpers. In Analysis 4, we sought to support our reputation-based mechanism and provide evidence against the deflationary explanation that our results merely reflect empathy or hypocrisy aversion among non-helpers. To this end, we tested our prediction that the effects of helping opportunities on outrage and punishment were not solely driven by non-helpers.

To test this prediction, we needed to compare the effects of helping opportunities on outrage and punishment among helpers versus non-helpers. But how? One obvious approach is to simply compare subjects who chose to help to subjects who did not have the opportunity to help (and to compare subjects who chose not to help to subjects who did not have the opportunity to help). However, these comparisons introduce a self-selection effect that violates random assignment and prevents appropriate causal inference: subjects who chose to help might differ from the overall population in their baseline inclination toward outrage and punishment (and likewise for subjects who chose not to help). Consistent with this possibility, across the Condemnation + Helping conditions of our outrage experiments, subjects who helped reported significantly more outrage (M = 37.78, SD = 30.04, n = 2,937) than subjects who did not help (M = 12.82, SD = 21.72, n = 1,275), B = 0.39, t = 27.38, p < .001. Likewise, across the Punishment + Helping conditions of our punishment experiments (including both our TG and No TG punishment experiments), subjects who helped punished at a significantly higher rate (36%, n = 3,257) than subjects who did not help (10%, n = 1,445), OR = 5.06, z = 16.89, p < .001. Thus, comparing helpers to those who did not have the opportunity to help likely biases us away from finding the predicted negative effect of helping opportunities on outrage and punishment (while comparing non-helpers to those who did not have the opportunity to help likely biases us toward finding the predicted negative effects).
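The self-selection problem described above can be illustrated with a small simulation. Everything in this sketch is hypothetical and chosen only for illustration: a latent trait drives both helping and outrage, and a helping opportunity lowers outrage by the same amount for everyone. None of the numbers are estimates from our data.

```python
import random
from statistics import mean

random.seed(0)

TRUE_EFFECT = -5.0  # hypothetical effect of a helping opportunity on outrage
N = 20_000          # hypothetical number of subjects per condition


def outrage(trait: float, has_opportunity: bool) -> float:
    """Outrage rises with a latent trait; the opportunity lowers it for all."""
    noise = random.gauss(0, 10)
    return 50 * trait + (TRUE_EFFECT if has_opportunity else 0.0) + noise


# Control condition: no helping opportunity, so helping is never observed.
control = [outrage(random.random(), False) for _ in range(N)]

# Treatment condition: subjects with a higher trait choose to help.
helpers, non_helpers = [], []
for _ in range(N):
    trait = random.random()
    score = outrage(trait, True)
    (helpers if trait > 0.3 else non_helpers).append(score)

randomized_diff = mean(helpers + non_helpers) - mean(control)
helper_diff = mean(helpers) - mean(control)
non_helper_diff = mean(non_helpers) - mean(control)

print(f"randomized comparison:   {randomized_diff:.1f}")
print(f"helpers vs. control:     {helper_diff:.1f}")
print(f"non-helpers vs. control: {non_helper_diff:.1f}")
```

In this toy setup, only the randomized comparison recovers the assumed effect. Comparing helpers to the control group masks it, and comparing non-helpers to the control group exaggerates it, which is the direction of bias described in the text.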

To avoid this self-selection issue, we would ideally compare (a) subjects who did help (when given the chance) to subjects who would have helped (if given the chance), and (b) subjects who did not help (when given the chance) to subjects who would not have helped (if given the chance). However, we do not know which subjects in our Condemnation Only and Punishment Only conditions would have helped, had they instead been assigned to our Condemnation + Helping or Punishment + Helping conditions.

In Experiment 6, we addressed this issue by gathering additional data, to obtain a measure of helping for all subjects (regardless of condition). One way to do this would have been to give subjects who initially did not have the opportunity to help (i.e., subjects in our Condemnation/Punishment Only condition) an unexpected helping opportunity after we measured their affective outrage and punishment. However, we were concerned about the possibility of order effects in a design like this (i.e., about the possibility that different types of people would choose to help, depending on whether helping was measured at the beginning or end of the experiment). Instead, then, we conducted a follow-up experiment approximately 2 weeks after Experiment 6 was finished. In this experiment, we measured helping among all subjects, regardless of their Experiment 6 condition.

Table 5
Analysis 3 Results

Experiment                          n        OR      p
Experiment 6                        2,924    1.26    .005
Experiment 8                        772      1.53    .009
No TG condition of Experiment 9     799      1.32    .072
No TG condition of Experiment 10    1,581    1.31    .014
Aggregate                           6,076    1.31    < .001

Note. No TG = No Trust Game; OR = odds ratio. Reported sample sizes include only subjects for whom punishment was measured (i.e., sample sizes exclude subjects in Helping Only conditions).

In Analysis 4, we treated follow-up experiment helping as an index of an individual’s propensity to help when given the chance. In other words, we treated it as a proxy for who would have helped in Experiment 6, even among subjects who were not given the opportunity to help. Through this approach, we attempted to gain insight into whether the effect of helping opportunities on affective outrage and punishment in Experiment 6 was specifically driven by non-helpers, as predicted by the hypocrisy aversion and empathy mechanisms. We did so by investigating (a) whether helping in the follow-up experiment moderated the effects of helping opportunities on affective outrage or punishment, and (b) if so, whether these effects were driven solely by follow-up experiment non-helpers, or also held among follow-up experiment helpers. We predicted that either (a) follow-up experiment helping would not moderate the effects of helping opportunities on affective outrage or punishment, or (b) it would moderate, but the negative effects of helping opportunities would hold among helpers.

Method

To conduct our follow-up experiment, 13 (12) days after beginning (completing) data collection for Experiment 6, we invited all subjects to participate in an additional experiment, in which everyone was asked to complete the same helping decision that we used in the helping condition of Experiment 6 (and all other experiments). We kept our follow-up experiment survey open to new respondents for 8 days, at which point the rate of new responses had become very low, and we closed the survey. We then linked follow-up survey responses to our Experiment 6 data using MTurk Worker IDs; for subjects who completed the follow-up survey more than once, we used their chronologically first response. A total of n = 2,056 subjects completed our follow-up experiment (n = 1,051 who were assigned to the Condemnation Only condition of Experiment 6, n = 1,005 who were assigned to Condemnation + Helping), M_age = 37.22 years, SD_age = 12.16 years, 44% male.
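The linking step described above (matching on Worker ID and keeping each subject's chronologically first follow-up response) can be sketched as follows. This is an illustration rather than our actual processing code: the record layout, field names, IDs, and timestamps are all hypothetical.

```python
from datetime import datetime

# Hypothetical records; field names and values are illustrative only.
experiment6 = {
    "W1": {"condition": "Condemnation Only"},
    "W2": {"condition": "Condemnation + Helping"},
    "W3": {"condition": "Condemnation Only"},
}
followup_responses = [
    {"worker_id": "W2", "submitted": datetime(2020, 1, 15, 9, 0), "helped": True},
    {"worker_id": "W1", "submitted": datetime(2020, 1, 14, 12, 0), "helped": False},
    {"worker_id": "W2", "submitted": datetime(2020, 1, 14, 8, 0), "helped": False},
]


def first_responses(responses):
    """Keep each worker's chronologically first follow-up response."""
    earliest = {}
    for response in sorted(responses, key=lambda r: r["submitted"]):
        # setdefault only stores the first (earliest) response per worker.
        earliest.setdefault(response["worker_id"], response)
    return earliest


def link(exp_data, responses):
    """Attach follow-up helping to Experiment 6 rows with a matching ID."""
    firsts = first_responses(responses)
    return {
        worker_id: {**row, "followup_helped": firsts[worker_id]["helped"]}
        for worker_id, row in exp_data.items()
        if worker_id in firsts
    }


linked = link(experiment6, followup_responses)
print(linked)
```

In this sketch, a subject without a follow-up response (here "W3") simply drops out of the linked data, and a subject with two responses (here "W2") contributes only the earlier one.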

We designed our follow-up experiment with the goal that few (if any) subjects would remember Experiment 6 clearly enough for their Experiment 6 decisions to influence their follow-up experiment helping. We attempted to facilitate this goal in two ways. First, we did not tell Experiment 6 subjects that there would be a follow-up experiment, and when inviting them to participate in the follow-up experiment, we did not tell them that it was related to Experiment 6. We hoped that this would limit the extent to which the follow-up experiment reminded them of Experiment 6. Second, we conducted the follow-up experiment after a meaningful time delay, which we hoped would substantially weaken subjects’ memories of Experiment 6 (and give them ample opportunity to complete other tasks on MTurk, interfering with their memories).

Consistent with this goal, at the end of the follow-up experiment we asked subjects to report the approximate number of tasks they had completed on MTurk over the last 2 weeks. Among subjects who answered this question with a number (n = 1,994), the median answer was 80 (25th percentile = 30, 75th percentile = 200). We see these numbers as relatively large and, thus, find it likely that most subjects did not have a clear memory of Experiment 6.

Results

Validation of our analysis approach. We began by investigating the validity of treating helping in our follow-up experiment as a proxy for helping in Experiment 6. We did so in two ways. First, we asked whether our Experiment 6 manipulation of helping opportunities influenced rates of helping in our follow-up experiment. This question is relevant to whether it may be appropriate to treat helping in our follow-up experiment as a moderator of our Experiment 6 results, even though the follow-up experiment was conducted after Experiment 6. Indeed, we found that subjects who were assigned to the Condemnation Only condition of Experiment 6 did not show significantly different rates of helping in the follow-up experiment (62%), as compared with subjects who were assigned to the Condemnation + Helping condition of Experiment 6 (60%), OR = 1.09, z = 0.94, p = .346, n = 2,056. This provides suggestive evidence that the type of individual who helped in the follow-up experiment did not vary by condition, such that follow-up experiment helping may be an appropriate moderator.

Second, we investigated the correlation between helping in the follow-up experiment and helping in Experiment 6, among subjects assigned to the Experiment 6 helping condition. This analysis is relevant to whether follow-up experiment helping is in fact a reliable predictor of Experiment 6 helping. Indeed, we found that 88% of follow-up experiment helpers (n = 606) helped in Experiment 6, while only 34% of follow-up experiment non-helpers (n = 399) helped in Experiment 6. Thus, we observed a significant association between helping in Experiment 6 and the follow-up experiment (via linear regression B = 0.56, t = 21.57, p < .001; via logistic regression OR = 14.64, z = 16.25, p < .001, n = 1,005). In other words, helping in the follow-up experiment strongly predicted helping in Experiment 6, when given the chance.
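As a rough check on the size of this association, an odds ratio can be recomputed from the two reported proportions. The sketch below uses only the rounded percentages given in the text, so it can only approximate the odds ratio from the logistic regression on the raw data; the function names are hypothetical.

```python
def odds(p: float) -> float:
    """Convert a proportion to odds."""
    return p / (1 - p)


def odds_ratio(p1: float, p2: float) -> float:
    """Odds ratio comparing proportion p1 with proportion p2."""
    return odds(p1) / odds(p2)


# Rounded proportions reported in the text.
helpers_rate = 0.88      # follow-up helpers who helped in Experiment 6
non_helpers_rate = 0.34  # follow-up non-helpers who helped in Experiment 6

approx_or = odds_ratio(helpers_rate, non_helpers_rate)
print(f"approximate OR = {approx_or:.1f}")
```

The result is a little above 14, close to the reported OR of 14.64; the remaining gap is what one would expect from rounding the percentages to whole numbers.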

Do helping opportunities solely reduce affective outrage and punishment among non-helpers? After validating our Analysis 4 approach, we moved to testing our key prediction: that helping opportunities did not solely reduce affective outrage and punishment among non-helpers. More specifically, we tested our prediction that either (a) follow-up experiment helping would not moderate the effects of helping opportunities on affective outrage or punishment, or (b) follow-up experiment helping would moderate, but the negative effects of helping opportunities on affective outrage and punishment would hold among helpers.

We began by investigating affective outrage. We predicted Experiment 6 affective outrage as a function of a Condemnation Only dummy, helping in the follow-up experiment, and their interaction. We found a significant negative interaction, B = -.10, t = -2.45, p = .014, n = 2,056. In other words, we did support the deflationary explanation’s prediction that follow-up experiment helping should moderate the effect of helping opportunities on affective outrage.

As such, we moved to investigating whether the negative effect of helping opportunities on affective outrage held among helpers. Critically, we found a significant positive effect of a Condemnation Only dummy on affective outrage among both follow-up experiment helpers, B = .07, t = 2.42, p = .016, n = 1,261, and non-helpers, B = .18, t = 5.19, p < .001, n = 795. Thus, the negative effect of helping opportunities on affective outrage did hold among helpers, as predicted by our signaling account and not the deflationary explanation. That said, we did find that the effect was significantly stronger among non-helpers, which is consistent with a role of empathy and/or hypocrisy aversion.
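The logic connecting the interaction term to the two simple effects can be sketched with cell means. In a model with two binary predictors and their interaction, the interaction coefficient equals the difference between the simple effects in the two groups. The scores below are hypothetical raw values chosen only to mirror the qualitative pattern reported above; the reported coefficients are standardized, so they need not subtract exactly in the same way.

```python
from statistics import mean

# Hypothetical outrage scores by (condemnation_only, followup_helper) cell.
cells = {
    (1, 1): [40, 44, 42],  # Condemnation Only, helper
    (0, 1): [36, 38, 40],  # Condemnation + Helping, helper
    (1, 0): [30, 34, 32],  # Condemnation Only, non-helper
    (0, 0): [18, 22, 20],  # Condemnation + Helping, non-helper
}
cell_mean = {key: mean(values) for key, values in cells.items()}

# Simple effects of the Condemnation Only dummy within each group.
effect_helpers = cell_mean[(1, 1)] - cell_mean[(0, 1)]
effect_non_helpers = cell_mean[(1, 0)] - cell_mean[(0, 0)]

# In a saturated 2 x 2 model, the interaction coefficient equals the
# difference between the two simple effects.
interaction = effect_helpers - effect_non_helpers

print(effect_helpers, effect_non_helpers, interaction)
```

In this toy example, both simple effects are positive, the effect is smaller among helpers, and the interaction is therefore negative, matching the direction of the pattern described in the text.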

Next, we turned to punishment. We predicted Experiment 6 punishment as a function of a Punishment Only dummy, helping in the follow-up experiment, and their interaction. We found no significant interaction, OR = .83, z = -.84, p = .403, n = 2,056. In other words, we failed to support the deflationary explanation’s prediction that helping in the follow-up experiment should moderate the effect of helping opportunities on punishment, and found no statistical justification for investigating helpers and non-helpers separately.

However, because helpers showed a directionally smaller effect than non-helpers and the non-significant interaction could reflect limited power, we nonetheless analyzed each group separately. In these separate analyses, we found a non-significant positive effect of a Punishment Only dummy on punishment among helpers, OR = 1.15, z = 1.19, p = .234, n = 1,261, and a marginally significant positive effect among non-helpers, OR = 1.40, z = 1.70, p = .090, n = 795. Thus, while our punishment analyses mirror our affective outrage analyses in terms of the directional effects observed, their more limited power makes them more equivocal: we failed to support the deflationary explanation’s moderation prediction, but also were unable to demonstrate a significant effect of helping opportunities on punishment among helpers.

Comparing helpers and non-helpers in the context of our mediators. Finally, we note that our follow-up experiment only included subjects from Experiment 6, which did not measure either of our reputation-relevant mediators. Thus, we cannot use our Analysis 4 approach to compare the effects of helping opportunities on our reputation-relevant mediators (or the indirect effects via our reputation-relevant mediators) among helpers and non-helpers.

Moreover, the simple approach of investigating these effects among helpers (or non-helpers) by comparing subjects who helped (or did not help) to subjects who did not have the opportunity to help creates the same self-selection effect described above in the context of outrage. And as in the context of outrage, we find evidence consistent with the possibility that helpers and non-helpers differ in their baseline perceptions of the reputation value of punishment and general reputation concerns. Across the Condemnation + Helping conditions of our three experiments that measured PRBP, as compared with non-helpers, helpers reported that punishment would have significantly greater reputation value, B = 0.09, t = 3.07, p = .002. Furthermore, across our three experiments that measured GRC, helpers reported significantly greater general reputation concerns than non-helpers, B = 0.11, t = 3.80, p < .001. Thus, comparing helpers to those who did not have the opportunity to help likely biases us away from finding the predicted negative effect of helping opportunities on our mediators (while comparing non-helpers to those who did not have the opportunity to help likely biases us toward finding the predicted negative effects).

Indeed, across our experiments that measured GRC, subjects in Condemnation Only reported significantly greater general reputation concerns (M = 2.99, SD = 0.96) than Condemnation + Helping non-helpers (M = 2.73, SD = 0.93; B = 0.11, t = 4.22, p < .001, n = 1,521), but not helpers (M = 2.98, SD = 0.98; B = 0.004, t = 0.19, p = .847, n = 2,127). However, across our experiments that measured PRBP, subjects in Condemnation Only reported that punishment would have greater reputation value (M = 3.77, SD = 2.43) than both Condemnation + Helping non-helpers (M = 2.89, SD = 2.44; B = 0.15, t = 5.99, p < .001, n = 1,557) and helpers (M = 3.37, SD = 2.47; B = 0.08, t = 3.71, p < .001, n = 2,094). Because the self-selection effect likely biases us against finding this pattern in the context of helpers, this result provides evidence that helping opportunities reduced the perceived reputation value of punishing even among subjects who chose to help. Thus, it further supports our signaling theory, and its prediction that helping opportunities should reduce outrage and punishment even among helpers.

Discussion

Overall, Analysis 4 supports a role of a signaling-based mechanism for the effects of helping opportunities on affective outrage and punishment, and provides evidence that these effects were not solely driven by empathy or hypocrisy aversion among non-helpers. In the context of affective outrage, we supported our prediction that if follow-up experiment helping moderated the negative effect of helping opportunities, the effect would hold among helpers. And in the context of punishment, we did not find significant moderation. Our results thus matched the predictions outlined by our signaling account. We also report evidence suggesting that helping opportunities reduced the perceived reputation value of punishment among helpers. Together, these analyses are supportive of our signaling account.

We note, however, that we did not find a significant effect of helping opportunities on punishment when restricting our analyses to helpers; thus, our conclusions regarding punishment are somewhat equivocal, and future research should attempt to more precisely estimate the effect of helping opportunities on punishment among helpers and non-helpers.

Additionally, it remains possible that helping opportunities reduce outrage and punishment among helpers via mechanism(s) other than the reputation-based ones we have proposed. It seems unlikely that helping opportunities induce hypocrisy aversion among helpers, because helping and condemning others for not helping is not hypocritical. In contrast, however, it is possible that when people are given the opportunity to help and choose to do so, they gain empathy for the perspective of non-helpers, decreasing outrage toward them.

As noted previously, however, an empathy mechanism would not predict the mediation results from Analysis 2. Furthermore, it is plausible that empathy could go in the reverse direction among helpers. Having the chance to help and choosing to do so could make the decision not to help seem less relatable, decreasing empathy toward non-helpers and thus increasing outrage and punishment. This possibility is consistent with evidence that people who have endured a hardship can be less likely to empathize with others enduring the same hardship, as compared with those who have no experience with the relevant situation (Ruttan, McDonnell, & Nordgren, 2015). If helping opportunities reduced empathy toward selfishness among helpers, this effect would actually suppress the observed negative effect of helping opportunities on outrage and punishment, such that the reported effects would underestimate the reputation-based mechanism we have proposed.

Adjudicating between these possibilities may be difficult, given that the causal pathway between reduced outrage and empathy is likely bidirectional. If having the chance to help and choosing to do so reduces outrage toward non-helpers for reasons that do not relate to empathy (i.e., via our proposed reputation-based mechanism), this could plausibly increase reported empathy for non-helpers, making such a finding difficult to interpret. Nevertheless, future research should attempt to provide further insight into the role of empathy in shaping the effects of helping opportunities on outrage toward and punishment of non-helpers. It should also further investigate whether these effects occur through different processes among helpers and non-helpers.

Together, however, our results from Analyses 1–4 provide support for our theory that helping opportunities reduce outrage and punishment by reducing the signaling value of punishment, and not merely by inducing hypocrisy aversion or empathy toward selfishness among non-helpers.

Analysis 5

In our fifth and final analysis, we tested the heuristics component of our reputation heuristics theory. To this end, we investigated whether deliberativeness moderated the effects of helping opportunities on outrage and punishment.

In general, deliberation allows people to tailor their behavior to the specific situation they are in and, thus, can serve to inhibit typically advantageous responses in atypical contexts where they will be costly (Kahneman, 2011; Rand et al., 2014, 2017; Shenhav et al., 2017; Stanovich, 2005). Consequently, when reputation is not actually at stake (but punishment would be an effective signal if reputation were at stake), we predicted that less deliberative individuals would show elevated levels of costly punishment, while this pattern would be attenuated or eliminated among more deliberative individuals. In other words, we predicted that less deliberative individuals would be more sensitive to helping opportunities in the context of our one-shot anonymous punishment experiments. In contrast, however, we predicted that deliberativeness would not moderate the effect of helping opportunities on punishment in contexts where reputation really was at stake.

To test these predictions, we investigated individual differences in deliberativeness. We drew on two distinct behavioral indicators of the extent to which subjects were likely to use deliberation during our experiment. First, we considered performance on questions assessing comprehension of incentives in our experiment, following the logic that individuals approaching our experiment more deliberatively should be more likely to carefully consider their current situation and incentives. Second, we considered performance on the Cognitive Reflection Task (CRT; Frederick, 2005), a set of math problems with intuitively compelling but incorrect answers designed to measure individual differences in deliberativeness.

Analysis 5a tested the prediction that across both of these indicators, less deliberative subjects would enact one-shot anonymous punishment at higher rates when helping was not possible, while more deliberative subjects would punish at relatively lower rates regardless of helping opportunities. Analysis 5b tested the prediction that deliberativeness would not moderate the influence of helping opportunities on punishment in experiments where reputation was actually at stake.

Finally, after confirming our prediction from Analysis 5a (that deliberativeness should attenuate the effect of helping opportunities on one-shot anonymous punishment), we sought in Analysis 5c to unpack the mechanism underlying this pattern. To this end, we investigated whether deliberativeness moderated the influence of helping opportunities in our (one-shot anonymous) outrage experiments. If more deliberative subjects are always less sensitive to reputation cues in one-shot anonymous interactions, deliberativeness should attenuate the influence of helping opportunities on outrage. In contrast, if deliberative subjects are specifically less sensitive to reputation cues when acting on such sensitivity is costly, deliberativeness might not moderate the influence of helping opportunities on outrage. Because either possibility seemed consistent with our reputation heuristics theory, we did not approach Analysis 5c with a clear directional prediction.

Analysis 5a

In Analysis 5a, we tested our prediction that the effect of helping opportunities on one-shot anonymous punishment would be driven by relatively less deliberative decision-makers.

Method. To this end, we investigated whether our two indicators of deliberativeness would moderate the influence of helping opportunities on punishment in our No TG punishment experiments. For our first indicator, we used the comprehension questions included in all of our experiments. For our second indicator, we used performance on the CRT. In Experiment 6, subjects completed the CRT at the beginning of the study. While Experiments 8–10 did not measure the CRT, we took advantage of the observation that CRT scores are fairly stable across time (Stagnaro, Pennycook, & Rand, 2018) to nonetheless obtain CRT scores for some subjects in those experiments by matching MTurk IDs to an external dataset compiling other MTurk experiments that included the CRT and were conducted by members of our research group (Stagnaro et al., 2018). These experiments all used a version of the CRT that was conceptually identical to the one presented in Experiment 6, originally published in Frederick (2005); however, there was some minor variation in the wording for some subset of the questions (e.g., “If it takes 10 s for 10 printers to print out 10 documents, how many seconds will it take 50 printers to print out 50 documents?” vs. “If it takes 5 machines 5 min to make 5 widgets, how long would it take 100 machines to make 100 widgets?”).

This dataset compiled 11 different sets of experiments, conducted between 2012 and 2017, and included 23,264 unique CRT scores from 17,999 unique subjects (as indexed by MTurk IDs). Stagnaro et al. (2018) found that among subjects in this dataset who took the CRT more than once (i.e., because they participated in multiple included experiments), CRT scores increased over time (suggesting learning effects); thus, we considered only the chronologically first CRT score from each subject. Then, we identified matches between subjects in this CRT dataset (as indexed by MTurk Worker IDs) and subjects in the No TG conditions of Experiments 8–10. This resulted in a sample of n = 1,672 matches in the punishment conditions (i.e., excluding subjects in Helping Only) of these experiments (n = 847 in Punishment Only, n = 825 in Punishment + Helping), M_age = 36.95 years, SD_age = 11.70 years, 48% male. When including Experiment 6, we had CRT data for a total of n = 4,596 subjects in the punishment conditions of our No TG punishment experiments (n = 2,313 in Punishment Only, n = 2,283 in Punishment + Helping), M_age = 36.41 years, SD_age = 11.80 years, 46% male.
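The matching step described above can be illustrated with a minimal Python sketch. The record layout, worker IDs, dates, and scores below are invented for illustration and are not drawn from the actual dataset; the sketch only shows the logic of keeping each worker's chronologically first CRT score and then joining on MTurk Worker ID.

```python
from datetime import date

def first_crt_scores(records):
    """Keep the chronologically first CRT score for each worker.

    `records` is an iterable of (worker_id, test_date, score) tuples
    (a hypothetical layout, assumed for this sketch).
    """
    earliest = {}
    for worker_id, test_date, score in records:
        # Replace the stored entry only if this attempt is strictly earlier.
        if worker_id not in earliest or test_date < earliest[worker_id][0]:
            earliest[worker_id] = (test_date, score)
    return {worker_id: score for worker_id, (_, score) in earliest.items()}

def match_subjects(subject_ids, crt_by_worker):
    """Return CRT scores for the subjects found in the external dataset."""
    return {sid: crt_by_worker[sid] for sid in subject_ids if sid in crt_by_worker}

# Hypothetical external CRT records (worker "A1" took the CRT twice).
crt_records = [
    ("A1", date(2014, 3, 1), 1),
    ("A1", date(2016, 7, 9), 3),
    ("B2", date(2015, 1, 20), 2),
]
# Hypothetical subjects from a punishment experiment.
experiment_subjects = ["A1", "C3"]

crt_first = first_crt_scores(crt_records)
matched = match_subjects(experiment_subjects, crt_first)
print(matched)  # {'A1': 1}
```

Here worker "A1" keeps the earlier score of 1 rather than the later score of 3, and subject "C3" drops out because no CRT record exists for that ID.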

We note for completeness that in Experiments 6 and 9, our post-experimental survey included one item each from the Faith in Intuition and Need for Cognition scales (Epstein, Pacini, Denes-Raj, & Heier, 1996), which are conceptually related to deliberativeness; however, these single-item self-report measures correlated only weakly with comprehension and CRT performance and did not moderate the effect of helping opportunities on one-shot anonymous punishment. We focus on comprehension and CRT performance here, because (a) as multi-item measures they are more reliable than the single-item measures, and (b) as behavioral measures of deliberateness, they, unlike the self-report measures, do not rely on subjects’ introspection and are not susceptible to self-presentation concerns.

For both of our indicators of deliberativeness, we analyzed both the continuous measure (i.e., number of comprehension questions correct and number of CRT questions correct) and a median split on that continuous measure. These median split measures capture (a) whether all comprehension questions were correct (true of 59% of subjects) and (b) whether at least 2 out of 3 CRT questions were correct (true of 41% of subjects for whom we had CRT data). We also note that our continuous indicators were modestly positively correlated, r = .34, p < .001, supporting our premise that they are distinct but related indicators of deliberativeness.
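The two binary indicators described above can be written as simple threshold rules. The following is a minimal Python sketch; the function names and the example subjects are hypothetical, and the cutoffs (all comprehension questions correct; at least 2 of 3 CRT questions correct) are taken from the text.

```python
def comprehension_split(num_correct, num_questions):
    """1 if every comprehension question was answered correctly, else 0."""
    return int(num_correct == num_questions)

def crt_split(num_correct, threshold=2):
    """1 if at least `threshold` CRT items (2 of 3 in the text) were correct, else 0."""
    return int(num_correct >= threshold)

# Hypothetical subjects: (comprehension correct out of 4, CRT correct out of 3).
subjects = [(4, 3), (3, 2), (4, 1), (2, 0)]
flags = [(comprehension_split(c, 4), crt_split(r)) for c, r in subjects]
print(flags)  # [(1, 1), (0, 1), (1, 0), (0, 0)]
```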

Results. Did deliberativeness attenuate the influence of helping opportunities on one-shot anonymous punishment? To address this question, first we separately considered less versus more deliberative subjects using our two median split indicators, and investigated the effect of helping opportunities on punishment (Table 6, rows 1–2). As predicted, across both indicators, only less deliberative subjects showed a significant effect of helping opportunities on punishment. Next, we tested whether our deliberateness indicators significantly moderated the effect of helping opportunities on punishment. For each (continuous or median split) indicator, we separately predicted punishment as a function of condition, deliberativeness, and their interaction (Table 6, row 3). We found a significant negative interaction for all four indicators.

Next, we further examined these interactions by computing simple slopes for the effect of helping opportunities on punishment at 1 SD above and below the mean for both of our continuous deliberativeness indicators. We found that helping opportunities had a significant effect on punishment at 1 SD below the mean on comprehension (OR = 1.50, z = 5.39, p < .001) and CRT (OR = 1.42, z = 3.97, p < .001) performance, but no significant effect at 1 SD above the mean on comprehension (OR = 1.11, z = 1.31, p = .191) or CRT (OR = 1.09, z = 0.85, p = .396) performance.
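In equation form, the moderation model and the simple slopes described above can be sketched as follows. This assumes a logistic regression (as the reported odds ratios suggest); the symbols $\mathrm{PO}_i$ (Punishment Only dummy), $D_i$ (a mean-centered deliberativeness indicator), and the $\beta$ coefficients are notation introduced here for illustration and do not appear in the original analyses.

```latex
\begin{align}
\operatorname{logit}\Pr(\text{punish}_i = 1)
  &= \beta_0 + \beta_1 \mathrm{PO}_i + \beta_2 D_i
     + \beta_3 (\mathrm{PO}_i \times D_i), \\
\mathrm{OR}_{\mathrm{PO}}(d) &= \exp(\beta_1 + \beta_3 d).
\end{align}
```

Under this notation, the interaction odds ratio is $\exp(\beta_3)$, so a value below 1 means the effect of Punishment Only shrinks as deliberativeness increases, and the simple slopes at 1 SD below and above the mean correspond to $\mathrm{OR}_{\mathrm{PO}}(-s_D)$ and $\mathrm{OR}_{\mathrm{PO}}(+s_D)$, where $s_D$ is the standard deviation of $D$.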

In Figure 4, we illustrate our results. Across our set of No TG punishment experiments, we plot punishment as a function of helping opportunities and our binary measures of comprehension (panel a) and CRT performance (panel b). Across both indicators, we see that less deliberative subjects were more likely to punish when helping was not possible. In contrast, more deliberative subjects were not sensitive to helping opportunities. Instead, they punished at relatively low rates regardless of whether helping was possible.

Finally, we investigated the possibility that the results among our less deliberative subjects were driven exclusively by subjects who held the mistaken explicit belief that other players could observe their behavior and influence their payoffs. We again focused on Experiment 6, our only punishment experiment measuring these beliefs, and investigated the effect of helping opportunities on punishment among below-median comprehension and CRT performers, excluding subjects who reported that other players could influence their payoffs.

In this analysis, we observed marginally significantly more punishment in Punishment Only than Punishment + Helping, both among below-median comprehension performers, OR = 1.43, z = 1.75, p = .080, n = 538, and among below-median CRT performers, OR = 1.34, z = 1.79, p = .073, n = 863. While these results are only marginally significant, we note that restricting this analysis to less deliberative subjects substantially reduces power. Thus, in conjunction with our Analysis 3 finding that our overall Experiment 6 punishment result is robust to excluding subjects who reported that other players could influence their payoffs, it seems unlikely that our less deliberative subjects were only sensitive to our manipulation because they held this mistaken explicit belief.

Table 6
Analysis 5a Results

| Statistic | Comprehension (n = 6,076): binary measure | Comprehension (n = 6,076): continuous measure | CRT performance (n = 4,596): binary measure | CRT performance (n = 4,596): continuous measure |
| --- | --- | --- | --- | --- |
| Simple effect of Punishment Only (PO) dummy among less deliberative subjects | OR = 1.59, z = 5.50, p < .001 | | OR = 1.41, z = 4.13, p < .001 | |
| Simple effect of PO among more deliberative subjects | OR = 1.13, z = 1.57, p = .117 | | OR = 1.06, z = .51, p = .606 | |
| Interaction between PO and indicator of deliberativeness | OR = .71, z = −2.96, p = .003 | OR = .85, z = −2.74, p = .006 | OR = .75, z = −2.13, p = .033 | OR = .89, z = −2.02, p = .043 |

Note. Reported sample sizes indicate the number of subjects for whom punishment and the relevant indicator of deliberativeness were both measured across our No Trust Game punishment experiments.

Together, Analysis 5a presents evidence that when reputation is not at stake, deliberativeness attenuates the influence of helping opportunities on costly punishment. This evidence is consistent with our theory that in one-shot anonymous interactions, subjects are sensitive to helping opportunities insofar as they rely on reputation heuristics.

Analysis 5b

Our theory also predicts that when reputation is actually at stake, even more deliberative individuals, who rely less on heuristics, should be sensitive to helping opportunities. In such contexts, punishment really can confer reputation benefits, and really is more likely to do so when helping is not possible. Thus, deliberativeness should not moderate the effect of helping opportunities. To test this prediction, we analyzed a set of experiments where reputation was at stake, because the opportunity to punish (and/or help) was followed by a Trust Game where another player decided how much to trust the subject (based on his or her TPPG decisions).

Method.

Design. Specifically, in Analysis 5b, we analyzed data from four experiments (see Table 2 for an overview of their designs). In each of these experiments, we used the design from our No TG punishment experiments, except that the TPPG was followed by a Trust Game (TG). The TG involved two players: a Sender and a Receiver. The Sender was a new MTurk worker who did not participate in the TPPG, and the Receiver was the target subject from the TPPG (i.e., the player who we focus on in this article).

In the TG, the Sender was endowed with 30¢, and decided how much, if anything, to send to the Receiver; anything sent was tripled by the experimenter. Then, the Receiver decided how much of the amount sent to return to the Sender. In this game, Senders had an incentive to send more to Receivers who they trusted to return more. And critically, Senders could condition their sending on the Receiver’s TPPG decision(s). Thus, TPPG decisions had reputation consequences. And these reputation consequences were financially meaningful to Receivers: the more money the Sender trusted them with, the more money they could potentially take home.
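The payoff structure of the Trust Game described above can be made concrete with a short Python sketch. The function name and parameters are hypothetical; the 30¢ endowment and the tripling of the amount sent come from the text, and the sketch models only Trust Game earnings (not any earnings from the TPPG).

```python
def trust_game_payoffs(sent_cents, return_fraction, endowment_cents=30, multiplier=3):
    """Return (sender_payoff, receiver_payoff) in cents for one Trust Game.

    Only Trust Game earnings are modeled; TPPG earnings are outside this sketch.
    """
    if not 0 <= sent_cents <= endowment_cents:
        raise ValueError("sent_cents must be between 0 and the endowment")
    if not 0 <= return_fraction <= 1:
        raise ValueError("return_fraction must be between 0 and 1")
    received = multiplier * sent_cents      # the amount sent is tripled
    returned = return_fraction * received   # the Receiver's chosen share goes back
    sender_payoff = endowment_cents - sent_cents + returned
    receiver_payoff = received - returned
    return sender_payoff, receiver_payoff

print(trust_game_payoffs(30, 0.5))  # (45.0, 45.0)
print(trust_game_payoffs(0, 0.5))   # (30.0, 0.0)
```

Holding the return fraction fixed, the Receiver's earnings rise with the amount sent, which is why being trusted by the Sender is financially meaningful to the Receiver.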

The first experiment we analyze in Analysis 5b, which we refer to here as Experiment 11, is the previously mentioned (and published) experiment in Jordan, Hoffman, Bloom, et al. (2016). The second experiment, which we refer to here as Experiment 12, is a previously unpublished exact replication of Experiment 11, albeit with a smaller sample size (determined before data collection). And the final two experiments are the TG conditions of Experiments 9 and 10, which were very similar to Experiments 11 and 12, except that in Experiment 10, (a) there was no Helping Only condition, and (b) we showed subjects an example screenshot of how their TPPG decision(s) might be conveyed to the TG Sender.

Subjects. As noted previously, in Experiment 9 we requested a target of n = 400 subjects per condition (i.e., a total n = 1,200 subjects across the TG conditions), and in Experiment 10, we requested a target of n = 775 subjects per condition (i.e., a total n = 1,550 subjects across the TG conditions). In Experiment 11 we also requested a target of n = 400 subjects per condition (i.e., a total of n = 1,200 subjects), and in Experiment 12, we requested a target of n = 200 subjects per condition (i.e., a total of n = 600 subjects). Our final sample of TG punishment experiments includes n = 4,418 subjects (n = 1,730 in Punishment Only, n = 1,692 in Punishment + Helping, and n = 996 in Helping Only), M_age = 34.07 years, SD_age = 11.41 years, 46% male.

Procedure. The procedure was analogous to that of our No TG punishment experiments, but with the above-described design changes. After reading about the TPPG, subjects read about the TG and answered three TG comprehension questions. When they subsequently made their TPPG decision(s), they were reminded that the TG Sender would see these decision(s) before deciding how much to send them. Afterward, they decided what percentage of the amount sent to them by the TG Sender to return to the Sender (without actually learning this amount).

Figure 4. Deliberativeness moderates the influence of helping opportunities on one-shot anonymous punishment. We plot the proportion of subjects punishing as a function of helping opportunities and our median split indicators of deliberativeness. Error bars are 95% confidence intervals (CIs). See the online article for the color version of this figure.

Indicators of deliberativeness. To investigate whether deliberativeness would moderate the effect of helping opportunities on punishment in our TG punishment experiments, we used the same two indicators as in Analysis 5a. When investigating comprehension, we considered only questions about the TPPG (that were identical to the comprehension questions in our No TG punishment experiments) and not questions about the TG. None of our TG punishment experiments directly measured CRT; thus, we relied on n = 1,446 matches between the punishment conditions of our TG punishment experiments and the external CRT dataset (n = 714 in Punishment Only, n = 732 in Punishment + Helping), M_age = 36.56 years, SD_age = 11.73 years, 47% male. Our median split indicators of deliberativeness again captured (a) whether all comprehension questions were correct (true for 61% of subjects) and (b) whether at least 2 out of 3 CRT questions were correct (true of 49% of subjects for whom we had CRT data). We also again found a moderate correlation between our continuous indicators of comprehension and CRT performance, r = .31, p < .001.

Results. Before investigating whether deliberativeness moderated the effect of helping opportunities on punishment, we asked whether there was a main effect of helping opportunities on punishment across our TG punishment experiments. Indeed, subjects in these experiments were significantly more likely to punish in the Punishment Only conditions (39%) than the Punishment + Helping conditions (30%), OR = 1.55, z = 6.04, p < .001, n = 3,422. We also confirmed that punishment opportunities did not reciprocally influence helping in the subset of our TG punishment experiments that included a Helping Only condition. Indeed, subjects in these experiments helped at comparable rates in Helping Only (82%) and Punishment + Helping (81%), OR = 1.04, z = .30, p = .766, n = 1,927. We also investigated only these experiments, and used linear regressions to predict both punishment and helping as a function of condition. We found that the standardized condition coefficient was significantly larger when predicting punishment (B = .10, SE = .02, p < .001) than when predicting helping (B = .01, SE = .02, p = .766), z = 2.78, p = .006.

Thus, within our TG punishment experiments, we replicated the findings that helping opportunities reduced punishment, but punishment opportunities did not reduce helping. Next, we tested our key prediction that deliberativeness should not moderate the influence of helping opportunities on punishment in these experiments. We used the same approach as in Analysis 5a, reported our results in Table 7, and illustrated them in Figure 5. As predicted, for both of our median split indicators of deliberativeness, both less and more deliberative subjects were more likely to punish when helping was not possible. Furthermore, we observed no significant interactions between helping opportunities and any indicator of deliberativeness. Additionally, when we computed simple slopes for the effect of helping opportunities on punishment at 1 SD above and below the mean for both of our continuous deliberativeness indicators, we found significant effects at 1 SD below the mean on comprehension (OR = 1.53, z = 4.06, p < .001) and CRT (OR = 1.51, z = 2.54, p = .011) performance, and at 1 SD above the mean on comprehension (OR = 1.57, z = 4.42, p < .001) and CRT (OR = 1.47, z = 2.41, p = .016) performance.

Thus, deliberativeness did not undermine the influence of helping opportunities on costly punishment when there was an explicit strategic reason to appear trustworthy. Rather, subjects were more likely to punish when helping was not possible, regardless of deliberativeness. This finding rules out the possibility that more deliberative subjects are never sensitive to helping opportunities, and documents their sensitivity in a context where it can confer strategic benefits: our TG punishment experiments.

We can also directly compare our TG and No TG punishment experiments by investigating the three-way interactions between helping opportunities, deliberativeness, and the presence of a Trust Game. For each (continuous or median split) indicator of deliberativeness, we predicted punishment as a function of helping opportunities, the deliberativeness indicator, a dummy indicating whether there was a TG, and all two- and three-way interactions. We found that the three-way interaction term was in the predicted direction for all measures, and was significant for our median split measure of comprehension (OR = 1.51, z = 2.17, p = .030, n = 9,498), but only marginally significant for our continuous measure of comprehension (OR = 1.19, z = 1.75, p = .081, n = 9,498), and non-significant for our median split (OR = 1.31, z = 1.01, p = .311, n = 6,042) and continuous (OR = 1.11, z = 0.91, p = .362, n = 6,042) measures of CRT.
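The three-way model described above can be written out as a sketch, again assuming a logistic regression. The symbols are notation introduced here for illustration: $\mathrm{PO}_i$ is the Punishment Only dummy, $D_i$ a deliberativeness indicator, and $\mathrm{TG}_i$ a dummy assumed to be coded 1 when a Trust Game followed the TPPG and 0 otherwise.

```latex
\begin{equation}
\begin{aligned}
\operatorname{logit}\Pr(\text{punish}_i = 1) ={}&
  \beta_0 + \beta_1 \mathrm{PO}_i + \beta_2 D_i
  + \beta_3 (\mathrm{PO}_i \times D_i) + \beta_4 \mathrm{TG}_i \\
&+ \beta_5 (\mathrm{PO}_i \times \mathrm{TG}_i)
  + \beta_6 (D_i \times \mathrm{TG}_i)
  + \beta_7 (\mathrm{PO}_i \times D_i \times \mathrm{TG}_i)
\end{aligned}
\end{equation}
```

Under this coding, $\beta_3$ is the helping-by-deliberativeness interaction in the No TG experiments and $\beta_3 + \beta_7$ is the same interaction in the TG experiments. The predicted pattern (attenuation by deliberativeness only when reputation is not at stake) corresponds to $\beta_3 < 0$ and $\beta_7 > 0$, that is, a three-way odds ratio $\exp(\beta_7)$ above 1, which matches the direction of the odds ratios reported in the text.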

Table 7
Analysis 5b Results

| Statistic | Comprehension: binary measure | Comprehension: continuous measure | CRT performance: binary measure | CRT performance: continuous measure |
| --- | --- | --- | --- | --- |
| Simple effect of Punishment Only (PO) dummy among less deliberative subjects | OR = 1.49, z = 3.39, p = .001 | | OR = 1.49, z = 2.53, p = .012 | |
| Simple effect of PO among more deliberative subjects | OR = 1.59, z = 5.07, p < .001 | | OR = 1.47, z = 2.40, p = .016 | |
| Interaction between PO and measure of deliberativeness | OR = 1.07, z = .46, p = .647 | OR = 1.01, z = .18, p = .858 | OR = .98, z = −.09, p = .929 | OR = .99, z = −.12, p = .906 |

Thus, we found some (albeit weak) evidence of the three-way interaction implied by the significance of the two-way interactions in the No TG condition, and the non-significance of the two-way interactions in the TG condition reported above. The non-significant three-way interaction terms here may reflect a lack of power: even though our sample sizes seem very large, a great deal of power is needed to detect a three-way interaction in the context of a relatively small simple effect on a binary dependent variable. Moreover, because we do not have CRT data for roughly one third of our subjects, we have less power to detect a three-way interaction for our CRT indicators of deliberativeness than for the comprehension indicators.

This possibility is supported by power calculation simulations (described in detail in the online supplementary material) investigating our ability to detect a three-way interaction between helping opportunities, our binary deliberativeness measures, and the presence of a Trust Game. These simulations indicate that, even if more deliberative subjects in the No TG condition showed no more punishment in Punishment Only than in Punishment + Helping, rates of punishment would have to be 14 percentage points higher in Punishment Only among all other groups of subjects (i.e., less deliberative subjects in No TG, and all subjects in TG) to generate 80% power to detect the three-way interaction at n = 750 per cell (roughly the sample size in our CRT analyses). This would be a rather sizable simple effect to be entirely eliminated among deliberative subjects when reputation is not at stake: as illustrated in Figures 4 and 5, we generally observed a baseline of about 30% of subjects punishing in Punishment + Helping, and therefore a 14 percentage point simple effect would be an almost 50% increase in punishment.
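The arithmetic behind the "almost 50% increase" can be checked with a short Python sketch. The 30% baseline and 14 percentage point effect come from the text; the function names are hypothetical, and the implied odds ratio is a derived quantity computed here for illustration, not a value reported in the analyses.

```python
def relative_increase(baseline_rate, absolute_increase):
    """Relative change implied by adding `absolute_increase` to `baseline_rate`."""
    return absolute_increase / baseline_rate

def odds_ratio(p_treatment, p_control):
    """Odds ratio comparing two proportions."""
    def odds(p):
        return p / (1 - p)
    return odds(p_treatment) / odds(p_control)

baseline = 0.30  # approximate punishment rate in Punishment + Helping (from the text)
effect = 0.14    # 14 percentage point simple effect (from the text)

print(round(relative_increase(baseline, effect), 3))      # 0.467
print(round(odds_ratio(baseline + effect, baseline), 2))  # 1.83
```

A rise from 30% to 44% is therefore a relative increase of about 47%, corresponding to an odds ratio of roughly 1.83 under these rounded inputs.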

Thus, the non-significant results for CRT suggest that it is unlikely that the true three-way interaction effect is that large. However, observing our data would not be especially surprising if there was actually a three-way interaction, but the baseline simple effect of helping opportunities was smaller than 14 percentage points and/or that effect was not completely attenuated among more deliberative individuals in the No TG condition. Thus, although our results do not provide strong evidence in support of the hypothesized three-way interaction, they also do not provide strong evidence against a meaningfully sized three-way interaction. Furthermore, our power simulations reveal that even with a larger sample size of n = 1,200 per cell (roughly our sample size for our comprehension analyses), we are not that well-powered to detect the three-way interaction (see online supplementary material for details). Thus, as with CRT, the relatively weak evidence for a three-way interaction for comprehension does not place an especially small upper bound on the possible true effect size.

Overall, we argue that the results in this section provide tentative support for the hypothesis that when making costly punishment decisions, deliberative individuals are specifically less sensitive to reputation cues in contexts where reputation is not at stake.

Analysis 5c

However, why were deliberative individuals insensitive to reputation cues in the context of one-shot anonymous punishment, as observed in Analysis 5a? To address this question, we returned to our (one-shot anonymous) outrage experiments, and investigated whether our indicators of deliberativeness moderated the influence of helping opportunities on outrage. If more deliberative individuals are always insensitive to reputation cues in one-shot anonymous interactions, deliberativeness should also have attenuated the influence of helping opportunities on reported outrage. In contrast, if deliberative individuals specifically inhibit their sensitivity to reputation cues when acting on such sensitivity is costly, it is possible that deliberativeness did not attenuate the influence of helping opportunities on outrage, which was costless to express in our experiments. Because either possibility seemed consistent with our reputation heuristics theory, we did not approach Analysis 5c with a clear directional prediction.

Method. We used the same two deliberativeness indicators as in Analyses 5a–b, except that we only considered the two comprehension questions included in all of our outrage experiments (rather than the four in our punishment experiments). We note, however, that we found qualitatively identical results when reanalyzing our punishment experiments considering only these two questions. When investigating CRT performance, we found n = 1,576 matches between the outrage conditions of Experiments 1–5 and 7 (which did not measure CRT) and our CRT dataset (n = 785 in Condemnation Only, n = 791 in Condemnation + Helping), M_age = 37.85 years, SD_age = 11.97 years, 46% male. Thus, when including Experiment 6 (which did measure CRT), we had CRT data for a total of n = 4,500 subjects in the outrage conditions of our outrage experiments (n = 2,251 in Condemnation Only, n = 2,249 in Condemnation + Helping), M_age = 36.72 years, SD_age = 11.92 years, 45% male.

Figure 5. Deliberativeness does not moderate the influence of helping opportunities on punishment when reputation is at stake. We plot the proportion of subjects punishing as a function of helping opportunities and our median split indicators of deliberativeness. Error bars are 95% confidence intervals (CIs). See the online article for the color version of this figure.

Our median split indicators of deliberativeness captured (a) whether both comprehension questions about the helping decision were correct (true for 82% of subjects) and (b) whether at least 2 out of 3 CRT questions were correct (true of 41% of subjects for whom we had CRT data). We again found a modest correlation between our continuous measures of comprehension and CRT performance, r = .22, p < .001.

Results. To investigate whether deliberativeness moderated the influence of helping opportunities on outrage, we used an analogous approach to Analyses 5a–b. Our results are reported in Table 8, and illustrated in Figure 6. For both of our median split indicators of deliberativeness, we observed a significant effect of helping opportunities on outrage among both more and less deliberative subjects. Thus, in one-shot anonymous interactions, more deliberative subjects did report heightened outrage when helping was not possible. Unexpectedly, in fact, we observed that more deliberative subjects actually showed directionally larger effects of helping opportunities than less deliberative subjects, although we observed no significant interactions between our deliberativeness indicators and helping opportunities.

Next, we computed simple slopes for the effect of helping opportunities on outrage at 1 SD above and below the mean for both of our continuous deliberativeness indicators. We found significant effects at 1 SD below the mean on comprehension (B = .08, t = 5.21, p < .001) and CRT (B = .08, t = 3.69, p < .001) performance, and at 1 SD above the mean on comprehension (B = .09, t = 6.02, p < .001) and CRT (B = .11, t = 5.11, p < .001) performance.

Overall, then, helping opportunities did influence outrage among more deliberative (as well as less deliberative) individuals. We also unexpectedly found that more deliberative individuals showed directionally larger effects of helping opportunities on outrage (the opposite of the pattern we observed in the context of punishment), but did not observe significant interactions.

Discussion. Together, Analyses 5a–c suggest that deliberativeness attenuates the influence of helping opportunities on punishment in one-shot anonymous interactions, but not on punishment when reputation is actually at stake, and not on reported outrage in one-shot anonymous interactions.

It is interesting that in one-shot anonymous interactions where it was not possible to help, more deliberative individuals reported heightened outrage but were not more likely to pay to punish. This pattern suggests that even when reputation is not at stake, deliberative individuals are not always insensitive to reputation cues. It is also consistent with a large body of evidence that, depending on the individual and the situation, a particular emotional experience can give rise to many different behavioral expressions, or no expression at all (Roseman, 2011; Roseman, Wiest, & Swartz, 1994). In the context of our experiments, deliberativeness seems to be one individual difference that is relevant to whether a context that increases outrage (or the drive to report outrage) also increases costly punishment behavior.

Generally, one important reason for the limited correspondence between emotion feelings (like outrage) and emotion-related behaviors (like punishment) is that people can regulate their emotions (Gross, 1998b), and there are substantial individual differences in when, how, and whether emotion regulation occurs (Gross & John, 2003). In our experiments, more deliberative individuals may have engaged in emotion regulation that hampered their punishment behavior, but not their experience of or drive to report outrage.

Given that punishing was costly and outrage was costless to report, this pattern may reflect that deliberative individuals specifically regulate their sensitivity to reputation cues when such a sensitivity will be costly. This explanation would be consistent with the proposal that emotions constitute adaptive response tendencies, but that these tendencies are not always optimal for a situation and thus need to be regulated (Gross, 1998b). In line with this proposal, it is hypothesized that deliberation has the function of preventing typically advantageous behaviors in atypical contexts where they are costly (Kahneman, 2011; Rand et al., 2014, 2017; Shenhav et al., 2017; Stanovich, 2005). An interesting question concerns the process through which deliberative individuals regulate their sensitivity to reputation cues when making one-shot anonymous punishment decisions. Like non-deliberative subjects, deliberative subjects reported heightened outrage when helping was not possible, so what process prevented them from enacting more costly punishment?

One possibility is that they were driven to enact more punishment but inhibited that drive. This mechanism is consistent with evidence that people often engage in the “response-focused” emotion regulation strategy of suppressing emotion-related behaviors, despite being driven to engage in them (Gross, 1998a, 1998b; Gross & John, 2003). For example, deliberative subjects who lacked the opportunity to help might have chosen not to punish, despite a relatively strong drive to do so, because they reasoned that punishing would be costly and would not materially benefit them. Or, they might have suppressed their drive to punish by constructing self-serving moral justifications (Uhlmann, Pizarro, Tannenbaum, & Ditto, 2009; e.g., by reasoning that punishing would actually be morally wrong because the non-helper likely really needed the money, or because punishing is a destructive action that only serves to harm others). Such processes could suppress the drive to engage in a typically advantageous behavior in an atypical context where it is costly.

Alternatively, it is possible that when reputation was not at stake, deliberative subjects who did not have the opportunity to help were not driven to enact heightened punishment, despite reporting heightened outrage. This would imply that for these subjects, while our manipulation altered the experience of or drive to report outrage, it did not alter the drive to punish. This mechanism is consistent with evidence that people often engage in the “antecedent-focused” emotion regulation strategy of cognitive reappraisal, which involves thinking about a situation differently so as to change one’s emotional experience, in this case their affective drive to punish (Gross, 1998a, 1998b; Gross & John, 2003). Under this scenario, insofar as deliberative subjects reasoned that punishing was personally costly or morally wrong, these processes would have served to prevent them from ever feeling driven to punish (rather than to help them suppress that drive). Future research should attempt to discriminate between these potential mechanisms.

A related question for future research pertains to the psychological process through which helping opportunities did influence punishment when reputation was at stake (as observed in Analysis 5b). In such contexts, did subjects explicitly reason about the reputation value of punishment, motivating their sensitivity to helping opportunities? Or did they rely on emotional feelings like outrage, or the affective drive to punish, in the absence of strategic reasoning? And did the answer vary with deliberativeness?
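The moderation tests in Analysis 5 share a common structure: a simple effect of the condition dummy within each deliberativeness group, plus an interaction between the dummy and deliberativeness. The Python sketch below illustrates that structure for a binary indicator of deliberativeness. It is only an illustration under stated assumptions: the ratings are invented (they are not data from the experiments), the names `ratings`, `simple_effect`, `interaction`, and `saturated_coefficients` are hypothetical, and the sketch omits any covariates, scaling, and significance tests used in the reported analyses.

```python
from statistics import mean

# Invented outrage ratings (NOT data from the article), keyed by
# (condemnation_only, more_deliberative), where 1 = yes and 0 = no.
ratings = {
    (0, 0): [2.0, 3.0, 2.5],
    (1, 0): [3.0, 4.0, 3.5],
    (0, 1): [2.0, 2.5, 3.0],
    (1, 1): [3.5, 4.0, 4.5],
}


def cell_mean(co, delib):
    """Mean rating in one cell of the 2 x 2 design."""
    return mean(ratings[(co, delib)])


def simple_effect(delib):
    """Effect of the Condemnation Only (CO) dummy within one deliberativeness group."""
    return cell_mean(1, delib) - cell_mean(0, delib)


def interaction():
    """Difference between the two simple effects (a difference in differences)."""
    return simple_effect(1) - simple_effect(0)


def saturated_coefficients():
    """Coefficients of y = b0 + b1*CO + b2*D + b3*CO*D with dummy coding.

    With no covariates, these equal differences in cell means.
    """
    b0 = cell_mean(0, 0)
    b1 = simple_effect(0)                   # CO effect among less deliberative subjects
    b2 = cell_mean(0, 1) - cell_mean(0, 0)  # deliberativeness effect when CO = 0
    b3 = interaction()                      # CO x deliberativeness interaction
    return b0, b1, b2, b3
```

In this coding, a near-zero interaction term means that the CO effect is similar in the two groups, and a large one means that deliberativeness moderates it.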

General Discussion

Across five analyses of 12 different experiments, we have provided evidence that (a) moral outrage is influenced by cues of the potential signaling value of punishment, and (b) these cues also influence one-shot anonymous punishment among less deliberative individuals. Together, our results suggest that a reputation framework (and specifically the hypothesis that punishment serves to signal trustworthiness) can shed light on when and why people express outrage and incur personal costs to punish wrongdoing, even when reputation is not actually at stake. Thus, they contribute to our understanding of key features of human morality, and have numerous theoretical implications.

A Reputation Heuristics Account of One-Shot Anonymous Punishment

First, our results support a reputation heuristics account of one-shot anonymous punishment. We found that helping opportunities influenced one-shot anonymous punishment, but not among more deliberative individuals. This pattern provides insight into why, from an ultimate perspective, less deliberative individuals were sensitive to helping opportunities in a context where reputation was not at stake. Our results suggest that these individuals relied on the heuristic that reputation is typically at stake to avoid the cognitive (Bear et al., 2017; Bear & Rand, 2016) and/or social (Critcher et al., 2013; M. Hoffman et al., 2015; Jordan, Hoffman, Nowak, et al., 2016) costs of constantly calculating who is currently watching. If less deliberative individuals had instead been sensitive to helping opportunities because it is actually optimal to attend to reputation cues even when reputation appears not to be at stake (e.g., as an error management strategy; Delton et al., 2011),

Table 8
Analysis 5c Results

| Statistic | Binary measure | Continuous measure | Binary measure | Continuous measure |
|---|---|---|---|---|
| Simple effect of Condemnation Only (CO) dummy among less deliberative subjects | B = .07, t = 2.84, p = .005 | | B = .07, t = 3.63, p < .001 | |
| Simple effect of CO among more deliberative subjects | B = .09, t = 7.40, p < .001 | | B = .12, t = 5.29, p < .001 | |
| Interaction between CO and measure of deliberativeness | B = .01, t = .40, p = .690 | B = .02, t = .57, p = .566 | B = .04, t = 1.56, p = .118 | B = .03, t = 1.01, p = .312 |

Figure 6. Deliberativeness does not significantly moderate the influence of helping opportunities on outrage in one-shot anonymous interactions. We show box plots (that draw lines at the 25th, 50th, and 75th percentiles, and illustrate the minimum and maximum values) for moral outrage as a function of helping opportunities and our binary indicators of deliberativeness. See the online article for the color version of this figure.

we would expect more deliberative individuals to have shown the same sensitivity.

Thus, our results suggest that one-shot anonymous punishment reflects a reputation heuristic. They therefore contribute to and extend evidence that social heuristics shape moral decision-making (Kiyonari, Tanida, & Yamagishi, 2000). In particular, previous research has provided evidence that one-shot anonymous cooperation can reflect the heuristic that interactions are typically repeated or observed (Bear & Rand, 2016; Everett, Ingbretsen, Cushman, & Cikara, 2017; Rand, 2016; Rand, Greene, & Nowak, 2012), and our results extend this evidence to the domain of punishment.

An interesting open question is how, from a proximate psychological perspective, reputation heuristics are implemented in contexts where reputation is not at stake. What kinds of reputation concerns do people have, and what makes them sensitive to cues of the potential reputation value of their possible actions? Experiment 6 provided some preliminary evidence that our subjects may have been concerned about looking good in the eyes of generically described others (possibly reflecting their imagination about how potential observers would evaluate their behavior), as well as in the eyes of the experimenter and in their own eyes. Future research should investigate the relative contributions of these different reputation motivations.

Critically, however, our reputation heuristics hypothesis makes a unique prediction that is independent of the particular reputational concern(s) that are at play: in one-shot anonymous interactions, deliberative individuals should be relatively unwilling to pay costs to act on those concerns. By supporting this prediction, our results provide interesting context for the possibility that helping opportunities influenced outrage and punishment because people were concerned about being viewed positively by others, the experimenter, or themselves. Specifically, our results suggest that among more deliberative individuals, the reputation concerns underlying their sensitivity to helping opportunities did not persist in contexts where reputation was not actually at stake, and acting on them would be costly.

Our results also have implications for when social heuristics are most likely to motivate typically advantageous behaviors in atypical contexts. We found that less deliberative individuals engaged in more one-shot anonymous punishment than more deliberative individuals, supporting a reputation heuristics account of one-shot anonymous punishment. However, this pattern was much stronger when helping was not possible (and, thus, punishment, if observed, would have been an effective signal of trustworthiness). This may suggest that in atypical contexts more generally, social heuristics are most likely to motivate typically advantageous behaviors when they would be advantageous in typical contexts.

Relatedly, our results suggest that despite relying on a reputation heuristic, less deliberative individuals were nonetheless sensitive to whether helping was possible. This may imply that it is fairly cognitively demanding or socially costly to determine that reputation is not at stake, but less demanding or costly to determine that if reputation were at stake, punishment would have limited reputation value because helping would also be observable. It also raises the important future question of which reputation cues people who rely on reputation heuristics are and are not sensitive to in contexts where nobody is watching.

The Nature and Functions of Moral Outrage

In addition to supporting a reputation heuristics hypothesis for one-shot anonymous punishment, our results have theoretical implications for the nature and function of moral outrage. Introspection suggests that we experience outrage as a private and genuine response to wrongdoing that simply indexes the magnitude of immorality that has occurred. However, our results suggest that the experience of (or drive to report) outrage also tracks the reputation benefits we may gain from punishing. This proposal is not mutually exclusive with the idea that outrage is experienced genuinely, but it supports theories of emotions as adaptive motivators of action (Cosmides & Tooby, 2000; Fredrickson, 2001; Frijda, 1986; Lazarus, 1991), and moral outrage specifically as a motivator of punishment (Carlsmith et al., 2002; Darley & Pittman, 2003; Fessler & Haley, 2003; Fiske & Tetlock, 1997; Goldberg et al., 1999; Jordan et al., 2015).

Additionally, because moral outrage appears to track the potential reputation value of punishment even when reputation is not at stake, our results are consistent with theories that moral emotions and judgments are usually not caused by reasoning (Haidt, 2001), and can “misfire” in contexts where they are not adaptive (Greene, 2014; Gross, 1998b; Haidt, 2001; Inbar, Pizarro, Knobe, & Bloom, 2009; Kahneman, 2011). Moreover, we find that when reputation is not at stake, deliberative individuals are sensitive to reputation cues when reporting outrage but not when enacting punishment. This result is consistent with evidence that emotion feelings do not always translate to emotion-related behaviors (Roseman, 2011; Roseman et al., 1994), and that these gaps can reflect that some individuals respond to “misfiring” by adaptively engaging in emotion regulation (Gross, 1998b; Gross & John, 2003).

Implications for Moral Licensing

Our outrage and punishment results also connect to a large body of work on moral licensing (Monin & Miller, 2001). Moral licensing refers to a phenomenon in which engaging in one moral behavior makes an individual feel free to subsequently behave less morally. Licensing effects have been documented in the context of political correctness, prosocial behavior, and consumer choice (Merritt et al., 2010), and are often discussed as reflecting self-concept maintenance motives.

In our experiments, the effects of helping opportunities on punishment and outrage among subjects who chose to help may be thought of as extending licensing effects to the domains of punishment, as well as emotions and judgments. As discussed in Analysis 2, subjects in our one-shot anonymous experiments may have been concerned with their self-concepts. And if choosing to help (an act of morality that is straightforward and “positive”) reduces the probability of punishing wrongdoing (another act of morality, albeit one that is less straightforward and more “negative”), it may plausibly reflect that helping makes people feel licensed not to punish. Moreover, helping opportunities also reduced outrage, suggesting that licensing effects may extend to the domains of emotions and judgments.

Importantly, however, we also found evidence that helping opportunities reduced punishment and outrage among subjects who declined to help. Declining to help should not affirm an individual’s positive moral self-concept, and thus should not make an individual feel licensed to not punish. Thus, the observed effects among non-helpers are unlikely to have reflected a licensing psychology, and are also inconsistent with the broader theory of moral balancing (Mullen & Monin, 2016). Moral balancing proposes that while moral behavior should license subsequent immorality, immoral behavior should induce compensation efforts, increasing subsequent morality. Thus, balancing predicts that having the opportunity to help should increase punishment among non-helpers, which is the opposite of what we found. Instead, our results are consistent with our signaling theory, which proposes that declining to help sends a strong signal of untrustworthiness, reducing the reputation value of punishment and, thus, the probability that it will occur. In other words, our signaling theory makes a prediction that contrasts with balancing theory, and which was borne out in our data.
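The contrast among the three accounts discussed above can be summarized as a small lookup. The Python sketch below encodes, as a coarse simplification, the direction in which each account predicts that helping opportunities should change punishment for helpers and for non-helpers, and then checks which account matches the observed pattern. The coding (-1 for a decrease, 0 for no change, +1 for an increase) and all names are illustrative assumptions, not part of the reported analyses.

```python
# Predicted direction of the effect of a helping opportunity on punishment,
# by account and by whether the subject chose to help.
# Illustrative coding: -1 = decrease, 0 = no change, +1 = increase.
PREDICTIONS = {
    # Helping licenses not punishing; declining to help licenses nothing.
    "licensing": {"helper": -1, "non_helper": 0},
    # Helping licenses; declining to help induces compensation.
    "balancing": {"helper": -1, "non_helper": +1},
    # Helping makes punishment redundant as a signal; declining to help
    # undermines the signal that punishment could send.
    "signaling": {"helper": -1, "non_helper": -1},
}

# Pattern described in the text: reduced punishment in both groups.
OBSERVED = {"helper": -1, "non_helper": -1}


def consistent_accounts(observed):
    """Return the accounts whose predicted directions match the observed pattern."""
    return [name for name, predicted in PREDICTIONS.items() if predicted == observed]
```

Calling `consistent_accounts(OBSERVED)` leaves only the signaling account, because the three accounts diverge only in what they predict for non-helpers.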

Moreover, our signaling theory and results may help shed light on the factors that moderate licensing effects after people behave morally. We have proposed a reputation-based explanation for why choosing to help reduces outrage and punishment. And we have supported this proposal by showing that deliberative individuals cease to be sensitive to helping opportunities when reputation is not at stake, and reacting to wrongdoing is costly (i.e., in our punishment but not our outrage experiments). Might this moderation pattern extend to licensing effects more generally? Our reputation theory predicts that (a) licensing may occur whenever engaging in an initial moral act reduces the reputation value of a subsequent moral act, and (b) deliberative individuals may not show these licensing effects whenever reputation is not at stake and the subsequent moral act is costly.

Future Directions

Our experiments investigated moralistic punishment and outrage in the context of one canonical, but relatively minor, act of selfishness. Specifically, we measured reactions to an MTurk worker who declined to share money with another MTurk worker. On the one hand, the fact that this straightforward transgression is not embedded in rich contextual details suggests that our results may be likely to generalize to other transgressions. On the other hand, its relatively minor nature raises the question of whether our results would generalize to more severe moral violations. We used the term moral outrage in this article to refer to the set of affective, cognitive, and behavioral responses people have to wrongdoing. However, subjects’ reactions would probably not be colloquially described as outraged, given their relatively low absolute ratings on our scale. Future research should investigate the effect of reputation cues on moralistic outrage and punishment in response to a more diverse set of transgressions, including those that are more extreme, and that are more concrete and realistic.

Another important direction for future research is investigating the influence of reputation heuristics on outrage and punishment across cultures. Our experiments were all conducted via MTurk and only investigated American subjects, raising questions about generalizability (Henrich, Heine, & Norenzayan, 2010). Research investigating moralistic punishment across cultures has demonstrated that it is widespread, and that punishment of selfishness seems to universally increase with the severity of selfishness (Henrich et al., 2006). Nonetheless, the prevalence of moralistic punishment varies considerably, and the different mechanisms (both proximate and ultimate) driving punishment across cultures remain unclear. Is there substantial cross-cultural variation in the extent to which punishment serves to signal trustworthiness, and in the extent to which signaling cues influence punishment even when reputation is not actually at stake? And might such variance correlate with the prevalence of punishment? Future research should address these important questions.

Conclusion

Third-party punishment is central to human morality, and plays a key role in promoting cooperation. However, from an ultimate perspective, it is also puzzling, especially in the context of one-shot anonymous interactions: why should we make personal sacrifices to punish wrongdoing toward others? Our results support the theory that even in such contexts, some people rely on the heuristic that reputation is typically at stake. As a result, even when reputation is not actually at stake, reputation cues can shape moral outrage and, among less deliberative individuals, costly punishment. Our results thus demonstrate how a reputation framework can shed light on these key features of human morality.

References

Aquino, K., & Reed, A., II. (2002). The self-importance of moral identity. Journal of Personality and Social Psychology, 83, 1423–1440. http://dx.doi.org/10.1037/0022-3514.83.6.1423

Balafoutas, L., Grechenig, K., & Nikiforakis, N. (2014). Third-party punishment and counter-punishment in one-shot interactions. Economics Letters, 122, 308–310. http://dx.doi.org/10.1016/j.econlet.2013.11.028

Balafoutas, L., & Nikiforakis, N. (2012). Norm enforcement in the city: A natural field experiment. European Economic Review, 56, 1773–1785. http://dx.doi.org/10.1016/j.euroecorev.2012.09.008

Balliet, D., Mulder, L. B., & Van Lange, P. A. (2011). Reward, punishment, and cooperation: A meta-analysis. Psychological Bulletin, 137, 594–615. http://dx.doi.org/10.1037/a0023489

Barclay, P. (2006). Reputational benefits for altruistic punishment. Evolution and Human Behavior, 27, 325–344. http://dx.doi.org/10.1016/j.evolhumbehav.2006.01.003

Barden, J., Rucker, D. D., & Petty, R. E. (2005). “Saying one thing and doing another”: Examining the impact of event order on hypocrisy judgments of others. Personality and Social Psychology Bulletin, 31, 1463–1474. http://dx.doi.org/10.1177/0146167205276430

Baron, J., & Ritov, I. (1993). Intuitions about penalties and compensation in the context of tort law. Journal of Risk and Uncertainty, 7, 17–33. http://dx.doi.org/10.1007/BF01065312

Batson, C. D., Kennedy, C. L., Nord, L. A., Stocks, E., Fleming, D. Y. A., Marzette, C. M., . . . Zerger, T. (2007). Anger at unfairness: Is it moral outrage? European Journal of Social Psychology, 37, 1272–1285. http://dx.doi.org/10.1002/ejsp.434

Bear, A., Kagan, A., & Rand, D. G. (2017). Co-evolution of cooperation and cognition: The impact of imperfect deliberation and context-sensitive intuition. Proceedings of the Royal Society B: Biological Sciences, 284. http://dx.doi.org/10.1098/rspb.2016.2326

Bear, A., & Rand, D. G. (2016). Intuition, deliberation, and the evolution of cooperation. Proceedings of the National Academy of Sciences of the United States of America, 113, 936–941. http://dx.doi.org/10.1073/pnas.1517780113

Boyd, R., Gintis, H., Bowles, S., & Richerson, P. J. (2003). The evolution of altruistic punishment. Proceedings of the National Academy of Sciences of the United States of America, 100, 3531–3535. http://dx.doi.org/10.1073/pnas.0630443100

Boyd, R., & Richerson, P. J. (1992). Punishment allows the evolution of cooperation (or anything else) in sizeable groups. Ethology & Sociobiology, 13, 171–195. http://dx.doi.org/10.1016/0162-3095(92)90032-Y

Brady, W. J., Wills, J. A., Jost, J. T., Tucker, J. A., & Van Bavel, J. J. (2017). Emotion shapes the diffusion of moralized content in social networks. Proceedings of the National Academy of Sciences of the United States of America, 114, 7313–7318. http://dx.doi.org/10.1073/pnas.1618923114

Carlsmith, K. M., Darley, J. M., & Robinson, P. H. (2002). Why do we punish? Deterrence and just deserts as motives for punishment. Journal of Personality and Social Psychology, 83, 284–299. http://dx.doi.org/10.1037/0022-3514.83.2.284

Charness, G., Cobo-Reyes, R., & Jimenez, N. (2008). An investment game with third-party intervention. Journal of Economic Behavior & Organization, 68, 18–28. http://dx.doi.org/10.1016/j.jebo.2008.02.006

Cosmides, L., & Tooby, J. (2000). Evolutionary psychology and the emotions. Handbook of emotions, 2, 91–115.

Critcher, C. R., Inbar, Y., & Pizarro, D. A. (2013). How quick decisions illuminate moral character. Social Psychological and Personality Science, 4, 308–315. http://dx.doi.org/10.1177/1948550612457688

Crockett, M. J. (2013). Models of morality. Trends in Cognitive Sciences, 17, 363–366. http://dx.doi.org/10.1016/j.tics.2013.06.005

Crockett, M. J., Özdemir, Y., & Fehr, E. (2014). The value of vengeance and the demand for deterrence. Journal of Experimental Psychology: General, 143, 2279–2286. http://dx.doi.org/10.1037/xge0000018

Cushman, F. (2013). Action, outcome, and value: A dual-system framework for morality. Personality and Social Psychology Review, 17, 273–292. http://dx.doi.org/10.1177/1088868313495594

Cushman, F., Dreber, A., Wang, Y., & Costa, J. (2009). Accidental outcomes guide punishment in a “trembling hand” game. PLoS ONE, 4, e6699. http://dx.doi.org/10.1371/journal.pone.0006699

Darley, J. M., & Pittman, T. S. (2003). The psychology of compensatory and retributive justice. Personality and Social Psychology Review, 7, 324–336. http://dx.doi.org/10.1207/S15327957PSPR0704_05

Delton, A. W., & Krasnow, M. M. (2017). The psychology of deterrence explains why group membership matters for third-party punishment. Evolution and Human Behavior, 38, 734–743. http://dx.doi.org/10.1016/j.evolhumbehav.2017.07.003

Delton, A. W., Krasnow, M. M., Cosmides, L., & Tooby, J. (2011). Evolution of direct reciprocity under uncertainty can explain human generosity in one-shot encounters. Proceedings of the National Academy of Sciences of the United States of America, 108, 13335–13340. http://dx.doi.org/10.1073/pnas.1102131108

Dreber, A., Rand, D. G., Fudenberg, D., & Nowak, M. A. (2008). Winners don’t punish. Nature, 452, 348–351. http://dx.doi.org/10.1038/nature06723

Effron, D. A., Lucas, B. J., & O’Connor, K. (2015). Hypocrisy by association: When organizational membership increases condemnation for wrongdoing. Organizational Behavior and Human Decision Processes, 130, 147–159. http://dx.doi.org/10.1016/j.obhdp.2015.05.001

Epstein, S., Pacini, R., Denes-Raj, V., & Heier, H. (1996). Individual differences in intuitive-experiential and analytical-rational thinking styles. Journal of Personality and Social Psychology, 71, 390–405. http://dx.doi.org/10.1037/0022-3514.71.2.390

Everett, J. A., Ingbretsen, Z., Cushman, F., & Cikara, M. (2017). Deliberation erodes cooperative behavior—Even towards competitive out-groups, even when using a control condition, and even when eliminating selection bias. Journal of Experimental Social Psychology, 73, 76–81. http://dx.doi.org/10.1016/j.jesp.2017.06.014

Fehr, E., & Fischbacher, U. (2004). Third-party punishment and social norms. Evolution and Human Behavior, 25, 63–87. http://dx.doi.org/10.1016/S1090-5138(04)00005-4

Feinberg, M., Willer, R., & Schultz, M. (2014). Gossip and ostracism promote cooperation in groups. Psychological Science, 25, 656–664. http://dx.doi.org/10.1177/0956797613510184

FeldmanHall, O., Sokol-Hessner, P., Van Bavel, J. J., & Phelps, E. A. (2014). Fairness violations elicit greater punishment on behalf of another than for oneself. Nature Communications, 5, 5306. http://dx.doi.org/10.1038/ncomms6306

Fessler, D. M., & Haley, K. J. (2003). The Strategy of Affect: Emotions in Human Cooperation 12. In P. Hammerstein (Ed.), The genetic and cultural evolution of cooperation (pp. 7–36). Cambridge, MA: MIT Press.

Fiske, A. P., & Tetlock, P. E. (1997). Taboo trade-offs: Reactions to transactions that transgress the spheres of justice. Political Psychology, 18, 255–297. http://dx.doi.org/10.1111/0162-895X.00058

Frederick, S. (2005). Cognitive reflection and decision making. The Journal of Economic Perspectives, 19, 25–42. http://dx.doi.org/10.1257/089533005775196732

Fredrickson, B. L. (2001). The role of positive emotions in positive psychology. The broaden-and-build theory of positive emotions. American Psychologist, 56, 218–226. http://dx.doi.org/10.1037/0003-066X.56.3.218

Frijda, N. H. (1986). The emotions. New York, NY: Cambridge University Press.

Goette, L., Huffman, D., & Meier, S. (2006). The impact of group membership on cooperation and norm enforcement: Evidence using random assignment to real social groups. The American Economic Review, 96, 212–216. http://dx.doi.org/10.1257/000282806777211658

Goldberg, J. H., Lerner, J. S., & Tetlock, P. E. (1999). Rage and reason: The psychology of the intuitive prosecutor. European Journal of Social Psychology, 29, 781–795. http://dx.doi.org/10.1002/(SICI)1099-0992(199908/09)29:5/63.0.CO;2-3

Greene, J. (2014). Moral tribes: Emotion, reason, and the gap between us and them. New York, NY: Penguin Press.

Gromet, D. M., Okimoto, T. G., Wenzel, M., & Darley, J. M. (2012). A victim-centered approach to justice? Victim satisfaction effects on third-party punishments. Law and Human Behavior, 36, 375–389. http://dx.doi.org/10.1037/h0093922

Gross, J. J. (1998a). Antecedent- and response-focused emotion regulation: Divergent consequences for experience, expression, and physiology. Journal of Personality and Social Psychology, 74, 224–237. http://dx.doi.org/10.1037/0022-3514.74.1.224

Gross, J. J. (1998b). The emerging field of emotion regulation: An integrative review. Review of General Psychology, 2, 271–299. http://dx.doi.org/10.1037/1089-2680.2.3.271

Gross, J. J., & John, O. P. (2003). Individual differences in two emotion regulation processes: Implications for affect, relationships, and well-being. Journal of Personality and Social Psychology, 85, 348–362. http://dx.doi.org/10.1037/0022-3514.85.2.348

Haidt, J. (2001). The emotional dog and its rational tail: A social intuitionist approach to moral judgment. Psychological Review, 108, 814–834. http://dx.doi.org/10.1037/0033-295X.108.4.814

Haidt, J. (2003). The moral emotions. Handbook of Affective Sciences, 11, 852–870.

Hamlin, J. K., Wynn, K., Bloom, P., & Mahajan, N. (2011). How infants and toddlers react to antisocial others. Proceedings of the National Academy of Sciences of the United States of America, 108, 19931–19936. http://dx.doi.org/10.1073/pnas.1110306108

Henrich, J., & Boyd, R. (2001). Why people punish defectors. Weak conformist transmission can stabilize costly enforcement of norms in cooperative dilemmas. Journal of Theoretical Biology, 208, 79–89. http://dx.doi.org/10.1006/jtbi.2000.2202

Henrich, J., Ensminger, J., McElreath, R., Barr, A., Barrett, C., Bolyanatz, A., . . . Ziker, J. (2010). Markets, religion, community size, and the evolution of fairness and punishment. Science, 327, 1480–1484. http://dx.doi.org/10.1126/science.1182238

Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33, 61–83. http://dx.doi.org/10.1017/S0140525X0999152X

Henrich, J., McElreath, R., Barr, A., Ensminger, J., Barrett, C., Bolyanatz, A., . . . Ziker, J. (2006). Costly punishment across human societies. Science, 312, 1767–1770. http://dx.doi.org/10.1126/science.1127333

Herrmann, B., Thöni, C., & Gächter, S. (2008). Antisocial punishment across societies. Science, 319, 1362–1367. http://dx.doi.org/10.1126/science.1153808

Hoffman, M. L. (2001). Empathy and moral development: Implications for caring and justice. New York, NY: Cambridge University Press.

Hoffman, M., Yoeli, E., & Nowak, M. A. (2015). Cooperate without looking: Why we care what people think and not just what they do. Proceedings of the National Academy of Sciences of the United States of America, 112, 1727–1732. http://dx.doi.org/10.1073/pnas.1417904112

Horberg, E. J., Oveis, C., Keltner, D., & Cohen, A. B. (2009). Disgust and the moralization of purity. Journal of Personality and Social Psychology, 97, 963–976. http://dx.doi.org/10.1037/a0017423

Horita, Y. (2010). Punishers may be chosen as providers but not as recipients. Letters on Evolutionary Behavioral Science, 1, 6–9. http://dx.doi.org/10.5178/lebs.2010.2

Hutcherson, C. A., & Gross, J. J. (2011). The moral emotions: A social-functionalist account of anger, disgust, and contempt. Journal of Personality and Social Psychology, 100, 719–737. http://dx.doi.org/10.1037/a0022408

Inbar, Y., Pizarro, D. A., Knobe, J., & Bloom, P. (2009). Disgust sensitivity predicts intuitive disapproval of gays. Emotion, 9, 435–439. http://dx.doi.org/10.1037/a0015960

Jensen, K. (2010). Punishment and spite, the dark side of cooperation. Philosophical Transactions of the Royal Society of London Series B: Biological Sciences, 365, 2635–2650. http://dx.doi.org/10.1098/rstb.2010.0146

Jordan, J. J., Hoffman, M., Bloom, P., & Rand, D. G. (2016). Third-party punishment as a costly signal of trustworthiness. Nature, 530, 473–476. http://dx.doi.org/10.1038/nature16981

Jordan, J. J., Hoffman, M., Nowak, M. A., & Rand, D. G. (2016). Uncalculating cooperation is used to signal trustworthiness. Proceedings of the National Academy of Sciences of the United States of America, 113, 8658–8663. http://dx.doi.org/10.1073/pnas.1601280113

Jordan, J. J., McAuliffe, K., & Rand, D. (2015). The effects of endowment size and strategy method on third party punishment. Experimental Economics, 19, 741–763.

Jordan, J. J., McAuliffe, K., & Warneken, F. (2014). Development of in-group favoritism in children’s third-party punishment of selfishness. Proceedings of the National Academy of Sciences of the United States of America, 111, 12710–12715. http://dx.doi.org/10.1073/pnas.1402280111

Jordan, J. J., & Rand, D. G. (2017). Third-party punishment as a costly signal of high continuation probabilities in repeated games. Journal of Theoretical Biology, 421, 189–202. http://dx.doi.org/10.1016/j.jtbi.2017.04.004

Jordan, J. J., Sommers, R., Bloom, P., & Rand, D. G. (2017). Why do we hate hypocrites? Evidence for a theory of false signaling. Psychological Science, 28, 356–368. http://dx.doi.org/10.1177/0956797616685771

Kahneman, D. (2011). Thinking, fast and slow. London: Macmillan.Kiyonari, T., Tanida, S., & Yamagishi, T. (2000). Social exchange and

reciprocity: Confusion or a heuristic? Evolution and Human Behavior,21, 411–427. http://dx.doi.org/10.1016/S1090-5138(00)00055-6

Krasnow, M. M., Delton, A. W., Cosmides, L., & Tooby, J. (2016).Looking under the hood of third-party punishment reveals design forpersonal benefit. Psychological Science, 27, 405–418. http://dx.doi.org/10.1177/0956797615624469

Kurzban, R., DeScioli, P., & O’Brien, E. (2007). Audience effects onmoralistic punishment. Evolution and Human Behavior, 28, 75–84.http://dx.doi.org/10.1016/j.evolhumbehav.2006.06.001

Lazarus, R. S. (1991). Emotion and adaptation. New York, NY: Oxford University Press on Demand.

Leary, M. R. (1983). A brief version of the Fear of Negative Evaluation Scale. Personality and Social Psychology Bulletin, 9, 371–375. http://dx.doi.org/10.1177/0146167283093007

Lotz, S., Okimoto, T. G., Schlösser, T., & Fetchenhauer, D. (2011). Punitive versus compensatory reactions to injustice: Emotional antecedents to third-party interventions. Journal of Experimental Social Psychology, 47, 477–480. http://dx.doi.org/10.1016/j.jesp.2010.10.004

Mathew, S., & Boyd, R. (2011). Punishment sustains large-scale cooperation in prestate warfare. Proceedings of the National Academy of Sciences of the United States of America, 108, 11375–11380. http://dx.doi.org/10.1073/pnas.1105604108

Mazar, N., Amir, O., & Ariely, D. (2008). The dishonesty of honest people: A theory of self-concept maintenance. Journal of Marketing Research, 45, 633–644. http://dx.doi.org/10.1509/jmkr.45.6.633

McAuliffe, K., Jordan, J. J., & Warneken, F. (2015). Costly third-party punishment in young children. Cognition, 134, 1–10. http://dx.doi.org/10.1016/j.cognition.2014.08.013

Merritt, A. C., Effron, D. A., & Monin, B. (2010). Moral self-licensing: When being good frees us to be bad. Social and Personality Psychology Compass, 4, 344–357. http://dx.doi.org/10.1111/j.1751-9004.2010.00263.x

Monin, B., & Miller, D. T. (2001). Moral credentials and the expression of prejudice. Journal of Personality and Social Psychology, 81, 33–43. http://dx.doi.org/10.1037/0022-3514.81.1.33

Montada, L., & Schneider, A. (1989). Justice and emotional reactions to the disadvantaged. Social Justice Research, 3, 313–344. http://dx.doi.org/10.1007/BF01048081

Mullen, E., & Monin, B. (2016). Consistency versus licensing effects of past moral behavior. Annual Review of Psychology, 67, 363–385. http://dx.doi.org/10.1146/annurev-psych-010213-115120

Nelissen, R. (2008). The price you pay: Cost-dependent reputation effects of altruistic punishment. Evolution and Human Behavior, 29, 242–248. http://dx.doi.org/10.1016/j.evolhumbehav.2008.01.001

Nelissen, R., & Zeelenberg, M. (2009). Moral emotions as determinants of third-party punishment: Anger, guilt, and the functions of altruistic sanctions. Judgment and Decision Making, 4, 543–553.

Nikiforakis, N. (2008). Punishment and counter-punishment in public good games: Can we really govern ourselves? Journal of Public Economics, 92, 91–112. http://dx.doi.org/10.1016/j.jpubeco.2007.04.008

Ohtsuki, H., Iwasa, Y., & Nowak, M. A. (2009). Indirect reciprocity provides only a narrow margin of efficiency for costly punishment. Nature, 457, 79–82. http://dx.doi.org/10.1038/nature07601

Perugini, M., & Leone, L. (2009). Implicit self-concept and moral action. Journal of Research in Personality, 43, 747–754. http://dx.doi.org/10.1016/j.jrp.2009.03.015

Preacher, K. J., & Hayes, A. F. (2008). Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behavior Research Methods, 40, 879–891.

Raihani, N. J., & Bshary, R. (2015a). The reputation of punishers. Trends in Ecology & Evolution, 30, 98–103. http://dx.doi.org/10.1016/j.tree.2014.12.003

Raihani, N. J., & Bshary, R. (2015b). Third-party punishers are rewarded, but third-party helpers even more so. Evolution, 69, 993–1003. http://dx.doi.org/10.1111/evo.12637

Rand, D. G. (2016). Cooperation, fast and slow: Meta-analytic evidence for a theory of social heuristics and self-interested deliberation. Psychological Science, 27, 1192–1206. http://dx.doi.org/10.1177/0956797616654455

Rand, D. G., Greene, J. D., & Nowak, M. A. (2012). Spontaneous giving and calculated greed. Nature, 489, 427–430. http://dx.doi.org/10.1038/nature11467

Rand, D. G., Peysakhovich, A., Kraft-Todd, G. T., Newman, G. E., Wurzbacher, O., Nowak, M. A., & Greene, J. D. (2014). Social heuristics shape intuitive cooperation. Nature Communications, 5, 3677. http://dx.doi.org/10.1038/ncomms4677

Rand, D. G., Tomlin, D., Bear, A., Ludvig, E. A., & Cohen, J. D. (2017). Cyclical population dynamics of automatic versus controlled processing: An evolutionary pendulum. Psychological Review, 124, 626–642. http://dx.doi.org/10.1037/rev0000079

Riedl, K., Jensen, K., Call, J., & Tomasello, M. (2012). No third-party punishment in chimpanzees. Proceedings of the National Academy of Sciences of the United States of America, 109, 14824–14829. http://dx.doi.org/10.1073/pnas.1203179109

Rodebaugh, T. L., Woods, C. M., Thissen, D. M., Heimberg, R. G., Chambless, D. L., & Rapee, R. M. (2004). More information from fewer questions: The factor structure and item properties of the original and brief fear of negative evaluation scale. Psychological Assessment, 16, 169–181. http://dx.doi.org/10.1037/1040-3590.16.2.169

Roseman, I. J. (2011). Emotional behaviors, emotivational goals, emotion strategies: Multiple levels of organization integrate variable and consistent responses. Emotion Review, 3, 434–443. http://dx.doi.org/10.1177/1754073911410744

Roseman, I. J., Wiest, C., & Swartz, T. S. (1994). Phenomenology, behaviors, and goals differentiate discrete emotions. Journal of Personality and Social Psychology, 67, 206–221. http://dx.doi.org/10.1037/0022-3514.67.2.206

Ruttan, R. L., McDonnell, M.-H., & Nordgren, L. F. (2015). Having “been there” doesn’t mean I care: When prior experience reduces compassion for emotional distress. Journal of Personality and Social Psychology, 108, 610–622. http://dx.doi.org/10.1037/pspi0000012

Sachdeva, S., Iliev, R., & Medin, D. L. (2009). Sinning saints and saintly sinners: The paradox of moral self-regulation. Psychological Science, 20, 523–528. http://dx.doi.org/10.1111/j.1467-9280.2009.02326.x

Salerno, J. M., & Peter-Hagene, L. C. (2013). The interactive effect of anger and disgust on moral outrage and judgments. Psychological Science, 24, 2069–2078. http://dx.doi.org/10.1177/0956797613486988

Shenhav, A., Musslick, S., Lieder, F., Kool, W., Griffiths, T. L., Cohen, J. D., & Botvinick, M. M. (2017). Toward a rational and mechanistic account of mental effort. Annual Review of Neuroscience, 40, 99–124. http://dx.doi.org/10.1146/annurev-neuro-072116-031526

Simpson, B., Harrell, A., & Willer, R. (2013). Hidden paths from morality to cooperation: Moral judgments promote trust and trustworthiness. Social Forces, 91(4), 1529–1548.

Skitka, L. J., Bauman, C. W., & Mullen, E. (2004). Political tolerance and coming to psychological closure following the September 11, 2001, terrorist attacks: An integrative approach. Personality and Social Psychology Bulletin, 30, 743–756. http://dx.doi.org/10.1177/0146167204263968

Stagnaro, M., Pennycook, G., & Rand, D. G. (2018). Cognitive reflection is a stable trait. Available at SSRN: https://ssrn.com/abstract=3115809; http://dx.doi.org/10.2139/ssrn.3115809

Stanovich, K. E. (2005). The robot’s rebellion: Finding meaning in the age of Darwin. Chicago, IL: University of Chicago Press.

Tangney, J. P., Stuewig, J., & Mashek, D. J. (2007). Moral emotions and moral behavior. Annual Review of Psychology, 58, 345–372. http://dx.doi.org/10.1146/annurev.psych.56.091103.070145

Tetlock, P. E., Kristel, O. V., Elson, S. B., Green, M. C., & Lerner, J. S. (2000). The psychology of the unthinkable: Taboo trade-offs, forbidden base rates, and heretical counterfactuals. Journal of Personality and Social Psychology, 78, 853–870. http://dx.doi.org/10.1037/0022-3514.78.5.853

Uhlmann, E. L., Pizarro, D. A., Tannenbaum, D., & Ditto, P. H. (2009). The motivated use of moral principles. Judgment and Decision Making, 4, 479.

Watson, D., & Friend, R. (1969). Measurement of social-evaluative anxiety. Journal of Consulting and Clinical Psychology, 33, 448–457. http://dx.doi.org/10.1037/h0027806

Yamagishi, T. (1986). The provision of a sanctioning system as a public good. Journal of Personality and Social Psychology, 51, 110–116. http://dx.doi.org/10.1037/0022-3514.51.1.110

Young, L., Chakroff, A., & Tom, J. (2012). Doing good leads to more good: The reinforcing power of a moral self-concept. Review of Philosophy and Psychology, 3, 325–334. http://dx.doi.org/10.1007/s13164-012-0111-6

Zahavi, A. (1975). Mate selection-a selection for a handicap. Journal of Theoretical Biology, 53, 205–214. http://dx.doi.org/10.1016/0022-5193(75)90111-3

Zefferman, M. R. (2014). Direct reciprocity under uncertainty does not explain one-shot cooperation, but demonstrates the benefits of a norm psychology. Evolution and Human Behavior, 35, 358–367. http://dx.doi.org/10.1016/j.evolhumbehav.2014.04.003

Zimmermann, J., & Efferson, C. (2017). One-shot reciprocity under error management is unbiased and fragile. Evolution and Human Behavior, 38, 39–47. http://dx.doi.org/10.1016/j.evolhumbehav.2016.06.005

Received May 3, 2018
Revision received January 16, 2019
Accepted February 6, 2019
