



PERSUASION OF A PRIVATELY INFORMED RECEIVER

Anton Kolotilin, Ming Li, Tymofiy Mylovanov, Andriy Zapechelnyuk

Abstract. We study Bayesian persuasion in linear environments with a privately informed receiver. We allow the sender to condition information provided to the receiver on the receiver's report about his type (private persuasion). We describe implementable outcomes, establish equivalence between public persuasion studied in the prior literature and private persuasion, and draw connections with the standard linear mechanism design with transfers. We also characterize optimal persuasion rules, establish monotone comparative statics, and consider several applications, such as a competitive market for a good with consumption externalities (e.g., cigarettes) in which a benevolent government designs an educational campaign about a payoff-relevant characteristic (e.g., the health risks of smoking).

JEL Classification: D81, D82, D83

Keywords: information disclosure, Bayesian persuasion, screening, mechanism design without transfers, advertising, certification, mass media

Date: February 18, 2016.

Kolotilin: School of Economics, University of New South Wales, Sydney, NSW 2052, Australia. E-mail: [email protected].
Li: Concordia University and CIREQ, Department of Economics, 1455 Boulevard de Maisonneuve Ouest, Montréal, Québec, Canada H3G 1M8. E-mail: [email protected].
Mylovanov: University of Pittsburgh, Department of Economics, 4714 Posvar Hall, 230 South Bouquet Street, Pittsburgh, PA 15260, USA. E-mail: [email protected].
Zapechelnyuk: Adam Smith Business School, University of Glasgow, University Avenue, Glasgow G12 8QQ, UK. E-mail: [email protected].

The authors would like to express their gratitude to Navin Kartik for very helpful comments and discussion at the Boston ASSA meetings. We are also grateful to Ricardo Alonso, Dirk Bergemann, Patrick Bolton, Alessandro Bonatti, Rahul Deb, Péter Eső, Johannes Hörner, Florian Hoffmann, Roman Inderst, Emir Kamenica, Daniel Krähmer, Marco Ottaviani, Mallesh Pai, Ilya Segal, Larry Samuelson, and Joel Sobel for their excellent comments.



1. Introduction

A sender wishes to manipulate a receiver's beliefs about an optimal action, but there is uncertainty about the receiver's preferences. This is a Bayesian persuasion environment à la Kamenica and Gentzkow (2011) and Rayo and Segal (2010),¹ with a novelty that the receiver has private information (type) about his tastes. Numerous natural applications fit this environment: public educational campaigns, optimal certification policies for financial and consumer products, media censorship regulations, informational lobbying, etc. (Some of the applications are discussed in detail in Section 7.)

By way of motivation, consider a government that would like to persuade the public to reduce the amount of public smoking by commissioning an educational campaign about the health risks of cigarettes. What is the optimal way to conduct the educational campaign? Should the government target different consumers by providing them with different information, or is it optimal to provide the same information to everyone? When should the government withhold information from the public? How does the distribution of tastes among the consumers affect the amount of information released by the government?

In our model, the receiver has a binary choice: to act or not. His utility difference between action and inaction depends on his private type and an uncertain state, and it is additively separable. The sender is biased and might wish the receiver to act in situations where a fully informed receiver would not have. The sender chooses any mechanism that sends messages depending on the type reported by the receiver and the realized state. The receiver makes a report, observes a message from the mechanism, and then takes an optimal action based on his updated beliefs about the state. Thus, the sender can screen different types of receivers by targeting information disclosure to the receiver's reports. We call this model private persuasion.

The important benchmark is public persuasion, in which the sender designs a public signal about the state that is independent of the receiver's report. The literature on Bayesian persuasion focuses on the environments in which the receiver does not have private information and, hence, private persuasion is not relevant. Public persuasion with private information is considered in Rayo and Segal (2010), Kamenica and Gentzkow (2011), and Kolotilin (2014).

The first result of the paper is the equivalence of private and public persuasion. The result is based on an elegant connection between the Mirrlees integral representation of incentive compatibility (Mirrlees 1971) and the Blackwell integral representation of garbling (Blackwell 1953). First, using the integral envelope representation of incentive compatibility, we characterize the set of the receiver's utility profiles implementable by private persuasion. It consists of every convex function bounded by the utility profiles obtained if the sender reveals the state (full disclosure) and if the sender reveals nothing (no disclosure). Second, using the Blackwell integral representation of

¹ We study the environment of Kamenica and Gentzkow (2011); our results do not apply to the environment in Rayo and Segal (2010).


garbling, we observe that every convex function bounded by the full-disclosure and no-disclosure utility profiles can be obtained by a public signal that is a garbling of the fully revealing signal, establishing the equivalence.

The equivalence result tells us that screening the receiver's type by conditioning messages on the receiver's reports does not expand the set of implementable utility profiles. The result is non-obvious because different types of the receiver have different preferences over disclosed information: the ideal signal for the receiver reveals whether his utility difference between action and inaction is positive or negative, which varies with the receiver's type. The equivalence result fails if, in addition to information disclosure, a mechanism can use transfers (cf. Bergemann, Bonatti, and Smolin (2014) and Krähmer and Mylovanov (2014)).

The equivalence has implications for the optimality of public persuasion for the sender in our model. We consider the sender's payoff to be a weighted sum of the probability of acting and the utility of the receiver. By incentive compatibility, the probability of acting is equal to the receiver's marginal utility. Hence, the sender's payoff can be expressed in terms of the receiver's utility profile, which together with the equivalence result establishes that public persuasion is optimal.

Our second result is the characterization of optimal persuasion mechanisms and comparative statics. A public signal is called upper censorship if it sends a message equal to the realized value of the state whenever it is below some threshold, and sends no message otherwise. In the latter case, the receiver infers only that the state is above the threshold. Upper censorship is optimal if the probability density of the receiver's types is logconcave, a condition satisfied by most densities used in applications.²

The structure of the optimal mechanism underscores the effect of the privacy of the receiver's information. As an example, assume that the sender would like to maximize the probability of action. If the receiver's preferences were public, the optimal (public) mechanism would be a signal that reveals whether the state is above or below some cutoff. The cutoff is designed so that the receiver is just willing to act if the news is good (i.e., the state is above the cutoff) and strictly prefers not to act otherwise. In this mechanism, the sender offers no value to the receiver, raising the question of why the receiver should employ the sender in the first place.³ By contrast, in our model, privacy of the type generates information rents for the receiver.

Upper censorship provides more information than cutoff mechanisms (which are optimal in the benchmark environment with a publicly known type of the receiver). It also resembles some of the persuasion mechanisms observed in reality. Consider, for example, the question of optimal certification of financial products.⁴ Upper censorship

² Lower censorship, defined symmetrically, is optimal if the probability density of the receiver's types is logconvex.

³ Since the receiver weakly prefers not to act regardless of the signal realization, he obtains the same payoff as from ignoring the signals and taking no action. For this setting, Kolotilin (2015) derives comparative statics over the distributions of states of nature and public types of the receiver.

⁴ For optimal certification, see Lizzeri (1999) and the subsequent literature.


can be approximated by a coarse ranking, with one high grade that bunches together good securities (the states above the threshold) and multiple lower grades (the states below the threshold). The high grade is rather uninformative, but is nevertheless a signal of good enough quality that makes cautious investors buy the security. The other grades signal negative news and are quite informative, inducing purchase from a smaller set of less risk-averse investors.

We establish a comparative statics result that the sender optimally discloses more information if she puts a higher weight on the receiver's utility and if economic fundamentals are more biased against the sender. As extreme cases, revealing full information is optimal if the density of the receiver's type is increasing, and revealing no information can be optimal if the receiver type's density is decreasing. Thus, privacy of the receiver's tastes can have an ambiguous effect on the amount of information revealed by the sender. Nevertheless, a relatively higher probability of a more skeptical receiver will tend to force the sender to reveal more information.

In Section 7 we consider three applications to demonstrate the versatility of the model. We start with the design of optimal educational campaigns about the value of a product in competitive markets with positive or negative consumption externalities and consumer biases. Our second application is the government's control of the editorial policies of media outlets. Finally, we consider the optimal informational lobbying of a politician with an uncertain stand on an issue. In these applications, we provide conditions under which an upper censorship policy is optimal and show the relationship between the economic fundamentals and the amount of information revealed by the sender.

We conclude the paper by comparing our results with those in Johnson and Myatt (2006) and by relating the problem of persuasion by cutoff mechanisms to the problem of optimal delegation.

Two classes of results are relegated to the Appendix. We provide a structural characterization of optimal public signals for general environments, beyond logconcave and logconvex densities, in Appendix A. We explore the limits of equivalence between public and private persuasion outside of linear environments in Appendix B. Some proofs are deferred to Appendix C.

Related literature. Our Bayesian persuasion model is a variation of Kamenica and Gentzkow (2011), with more structure on the preferences and the action space, and the novelty that the receiver is privately informed.⁵ We study private persuasion, in which the information revealed to the receiver depends on his reported private information. Public persuasion, in which the information revealed must be identical for all receiver types, has been covered in Kamenica and Gentzkow (2011, Section VI.A), Rayo and Segal (2010), and Kolotilin (2014). Kamenica and Gentzkow (2011) provide a methodological contribution; Rayo and Segal (2010) consider a sufficiently different

⁵ For recent work on persuasion by disclosure, see, for example, Che, Dessein, and Kartik (2013), Koessler and Renault (2012), Hagenbach, Koessler, and Perez-Richet (2014), and Hoffmann, Inderst, and Ottaviani (2014).


model, with a more complex preference structure; and Kolotilin (2014) provides an algebraic criterion of optimality for public persuasion mechanisms.

Bayesian persuasion with a privately informed sender is considered in Rayo and Segal (2010), Perez-Richet (2014), and Alonso and Camara (2014a).⁶ Competition in Bayesian persuasion is analyzed in Gentzkow and Kamenica (2012),⁷ Bayesian persuasion with costly signals is studied in Gentzkow and Kamenica (2014a), and endogenous acquisition of information in Bayesian persuasion is introduced in Gentzkow and Kamenica (2014b). Dynamic Bayesian persuasion environments appear in Ely, Frankel, and Kamenica (2015) and Ely (2014).⁸ Variations of Bayesian persuasion with monetary transfers and selling information are studied in Hörner and Skrzypacz (2009), Bergemann and Bonatti (2013), Bergemann, Bonatti, and Smolin (2014), and Krähmer and Mylovanov (2014). Alonso and Camara (2014b) explore Bayesian persuasion without common priors.

Bayesian persuasion is a single-agent (decision) problem. Informational structures in multi-agent environments are studied in Bergemann and Morris (2013a) and Bergemann and Morris (2013b). Bergemann, Brooks, and Morris (2013) analyze limits of price discrimination over the set of all information structures, and Bergemann, Brooks, and Morris (2012) focus on information structures in first price auctions.⁹ Alonso and Camara (2014c) study Bayesian persuasion in voting environments. Bayesian persuasion with multiple receivers who take a collective action is also considered in Taneva (2014) and Wang (2013).

2. Model

2.1. Setup. There are two players: the sender (she) and the receiver (he). The receiver takes a binary action: to act or not to act. There are two payoff-relevant random variables: the state of nature ω ∈ Ω = [0, 1] and the receiver's type r ∈ R = [0, 1]. The state of nature is uncertain, while the type is privately observed by the receiver. Random variables ω and r are independently distributed with c.d.f.s F and G, respectively, where G admits a strictly positive differentiable density g.

The receiver's payoff is u(ω, r) = ω − r if he acts and is normalised to zero if he does not act. The sender's payoff is v(ω, r) = 1 + ρ(r)u(ω, r) if the receiver acts and is zero if the receiver does not act, where ρ(r) ∈ R. That is, the sender is biased towards a = 1, but she also puts a (type-specific) weight ρ(r) on the receiver's payoff. In particular, if the weight is very large, then the sender's and receiver's interests are aligned, but if the weight is zero, then the sender cares only about whether the receiver acts or not. Utility is not transferable; there are no monetary payments.

⁶ Perez-Richet and Prady (2012) is a related model with a restricted set of persuasion rules.
⁷ Boleslavsky and Cotton (2015a, 2015b) study related models with competing persuaders.
⁸ See also Kremer, Mansour, and Perry (2014) for an optimal disclosure policy in a dynamic environment.
⁹ The literature on optimal informational structures in auctions has been initiated in Bergemann and Pesendorfer (2007). See Eső and Szentes (2007), Bergemann and Wambach (2013), and Li and Shi (2015) for recent contributions.


The assumption that u(ω, r) = ω − r is without loss of generality if we consider the class of the receiver's payoff functions of the form u(ω, r) = b(r) + c(r)d(ω) for some functions b, c, and d, with c having a constant sign, as we note in Remark 2. The boundary conditions u(ω, 0) ≥ 0 and u(ω, 1) ≤ 0 for all ω ∈ Ω allow for elegance of presentation; relaxing these assumptions poses no difficulty.

2.2. Private persuasion. In order to influence the decision made by the receiver, the sender can design a test that asks the receiver to report his private information and sends a message to the receiver conditional on his report and the realized state. This is an environment of private persuasion because the test can depend non-trivially on the information revealed by the receiver.

We are interested in optimal tests and adopt the mechanism design approach. A private persuasion mechanism (test) π : Ω × R → [0, 1] asks the receiver to report r ∈ [0, 1] and then provides him with a binary message: for every ω ∈ Ω, it recommends to act (a = 1) with probability π(ω, r) and not to act (a = 0) with the complementary probability. A private mechanism is incentive-compatible if the receiver finds it optimal to report his true type and follow the mechanism's recommendation.

By the revelation principle, the focus on private incentive-compatible mechanisms is without loss of generality in that any equilibrium outcome of any game between the sender and the receiver, in which the value of ω is disclosed in some way to the receiver, can be replicated by an incentive-compatible private mechanism. In particular, the restriction that the mechanism returns a binary recommendation about the action instead of messages about the state is without loss since the action is binary.

2.3. Public persuasion. Under public persuasion, messages are independent of the reports of the receiver, so all types of the receiver are informed identically. A public persuasion mechanism σ : Ω → ∆(M) sends to the receiver a randomized message distributed according to measure σ(ω) for each realized state ω, where M denotes a message space.

For a given public signal σ, each message m induces a posterior belief of the receiver about the distribution of states, and hence the posterior value of the state, E[ω|m]. As in Kamenica and Gentzkow (2011), without loss assume that the set of messages is M = Ω = [0, 1] and that messages are direct, in the sense that they inform the receiver about the posterior value of the state, m = E[ω|m].

Observe that every public signal σ is identical to the mechanism π defined by

π(ω, r) = Pr[m ≥ r], where m is distributed according to σ(ω).   (1)

2.4. Interpretation: Continuum of receivers and ex-post implementation. In some applications, it can be useful to think about a population of heterogeneous receivers of mass one, parametrized by the privately known preference parameter r distributed according to G. Under private persuasion, the receivers report their types to the sender and then receive private recommendations from the mechanism. Under public persuasion, all receivers observe the same public message about the state.


By definition (cf. (1)), an outcome of public persuasion can be achieved through a private mechanism. Furthermore, in this private mechanism, the incentive compatibility and obedience constraints are satisfied ex-post, even after the receiver learns the entire profile of recommendations that would be made to other types. The reverse statement is true as well: if a private mechanism is ex-post incentive-compatible and obedient, then it can be implemented through public persuasion.

3. Envelope characterization of incentive compatibility

Denote the expected payoff of a receiver of type r who reports r̂ and takes actions a_0 and a_1 in {0, 1} after recommendations a = 0 and a = 1, respectively, by

U_π(r̂, r, a_0, a_1) := ∫_Ω (a_0(1 − π(ω, r̂)) + a_1 π(ω, r̂))(ω − r) dF(ω).

The expected payoff of the truthful (r̂ = r) and obedient (a_0 = 0 and a_1 = 1) receiver is equal to

U_π(r) := U_π(r, r, 0, 1) = ∫_Ω π(ω, r)(ω − r) dF(ω).

We consider mechanisms that satisfy the incentive compatibility constraint

U_π(r) ≥ U_π(r̂, r, a_0, a_1) for all r̂, r ∈ R and a_0, a_1 ∈ {0, 1}.   (IC)

It is convenient to introduce notation for the expected payoff of the obedient receiver, who makes report r̂ and then obeys the recommendation of the mechanism:

U_π(r̂, r) := U_π(r̂, r, 0, 1) = p_π(r̂) − q_π(r̂) r,

where

q_π(r̂) = ∫_Ω π(ω, r̂) dF(ω) and p_π(r̂) = ∫_Ω ω π(ω, r̂) dF(ω).

With this representation of the payoff function we can draw the parallel to the standard linear mechanism design problem with transfers, where r is a private value, q_π(r̂) is the probability of transaction, and p_π(r̂) is the expected monetary transfer, both depending on the report r̂. The classical envelope argument yields the following lemma:

Lemma 1. A mechanism π is incentive-compatible if and only if

U_π(r) = ∫_r^1 q_π(s) ds,   (2)

U_π(0) = E[ω],   (3)

q_π is non-increasing.   (4)

Proof. The proof is in the Appendix.

Interestingly, the obedience constraints for the intermediate types are implied by the obedience constraints for the boundary types and incentive compatibility, U_π(r) ≥ U_π(r̂, r). To disobey by ignoring the recommendation, that is, to act (not to act) irrespective of what is recommended, is the same as to pretend to be the lowest type r̂ = 0 (the highest type r̂ = 1, respectively). To disobey by taking the opposite


action to the recommended one is never beneficial due to the monotonicity of the receiver's utility.

In our model, there are no transfers, and there are obedience constraints instead of individual rationality constraints. These differences between our environment and the standard environment with transfers translate into the following differences in the characterization.

First, there are two boundary conditions, q(0) = 1 and q(1) = 0:

(a) Since ω ≥ 0, we have ω − r ≥ 0 for all ω if r = 0. Hence, type 0's payoff is maximized by acting for any belief about the state, implying q(0) = 1 and a payoff equal to the expected state. This is (3).

(b) Since ω ≤ 1, we have ω − r ≤ 0 for all ω if r = 1. Hence, type 1 never acts, implying q(1) = 0 and U(1) = 0. This is (2), with r = 1.

Second, monotonicity of q and the boundary conditions q(0) = 1 and q(1) = 0 are not sufficient for incentive compatibility. Conditions (2) and (3) imply that the area under q must be equal to the expected state, ∫_0^1 q_π(s) ds = E[ω].

Finally, not every pair (q, U) that satisfies conditions (2)–(4) is feasible, that is, a mechanism π that implements such a pair need not exist. For example, if ω = 1/2 with certainty, then every monotonic q with q(1) = 0, q(0) = 1, and ∫_0^1 q(r) dr = 1/2 satisfies (2)–(4). Among these functions, the unique feasible q is q(r) = 1 if r ≤ 1/2 and q(r) = 0 otherwise.

4. Equivalence of private and public persuasion

4.1. Bounds on the receiver's payoff. Consider two simple mechanisms. The full disclosure mechanism informs the receiver about the state, so the receiver acts iff his type is below the realized state ω, i.e., π(ω, r) = 1 iff r ≤ ω, and the expected payoff is

Ū(r) = ∫_r^1 (ω − r) dF(ω).

The no disclosure mechanism does not convey any information to the receiver, so the receiver acts if and only if his type r is below the ex-ante expected value of the state, i.e., π(ω, r) = 1 iff r ≤ E[ω], and the expected payoff is

U̲(r) = max{E[ω] − r, 0}.

Thus, U̲(r) is the optimal payoff of the receiver based on the prior information about ω as given by F, while Ū(r) is the receiver's expected payoff if he observes ω.

Note that every mechanism π must satisfy

U̲(r) ≤ U_π(r) ≤ Ū(r) for all r ∈ R.   (B)

The left-hand side inequality of (B) is implied by incentive compatibility: the receiver cannot be better off by ignoring the sender's recommendation. The right-hand side inequality of (B) is the feasibility constraint: the receiver's payoff cannot exceed the payoff attained under full disclosure of ω.


4.2. Implementable utility profiles. We say that the receiver's utility profile U is implementable if there exists a private persuasion mechanism π such that U(r) = U_π(r) for all r ∈ R. We say that U is implementable by a public signal if there exists a public persuasion mechanism σ such that the receiver's utility U_σ in this mechanism satisfies U(r) = U_σ(r) for all r ∈ R.

Figure 1. The set 𝒰 contains every convex function between U̲ and Ū.

Let 𝒰 be the set of all convex functions bounded by U̲ and Ū (see Fig. 1).

Theorem 1. The following statements are equivalent:
(a) U is a convex function bounded by U̲ and Ū;
(b) U is implementable;
(c) U is implementable by a public signal.

Before proving the theorem, we would like to point out an important property of public signals. The receiver's utility is linear in the state ω, so he cares only about the posterior mean state. Thus, every public signal σ can be equivalently described by the probability that the posterior mean state is below any given value x ∈ R,

H(x) = ∫_0^1 Pr[σ(ω) ≤ x] dF(ω),  x ∈ R.

The next lemma is immediate by Proposition 1 in Kolotilin (2014).

Lemma 2. There exists a public signal that induces a distribution H of the posterior mean of the state if and only if the prior distribution F is a mean-preserving spread of H.

We now prove Theorem 1.

Proof. By (1), every U implementable by a public signal is implementable. By Lemma 1 and (B), every implementable U belongs to 𝒰. It remains to prove that every U ∈ 𝒰 is implementable by a public signal.

Let U ∈ 𝒰. Define H(r) = 1 + U′(r), where U′ denotes the right-derivative. Observe that U′(0) = Ū′(0) = −1 and U′(1) = Ū′(1) = 0, hence H(0) = 0 and H(1) = 1. Also, H is increasing, since U is convex. Hence, H is a c.d.f. Next, observe that

∫_r^1 (1 − H(s)) ds = U(r) ≤ Ū(r) = ∫_r^1 (1 − F(s)) ds.   (5)

Furthermore, U(0) = Ū(0) = E[ω] implies that (5) is satisfied with equality for r = 0. Thus, F is a mean-preserving spread of H. By Lemma 2, it follows that there exists a signal σ that induces the utility profile U.
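The construction in the proof lends itself to a quick numerical check. The sketch below assumes a uniform prior F on [0, 1] and takes, as an arbitrary illustration, the convex profile U = (Ū + U̲)/2; it builds H = 1 + U′ and verifies that H is a c.d.f. and that inequality (5) holds, with equality at r = 0.

```python
# A minimal numerical sketch of the construction in the proof of Theorem 1,
# assuming a uniform prior F on [0,1]. The candidate profile U is one arbitrary
# convex function between the no-disclosure and full-disclosure bounds.
import numpy as np

r = np.linspace(0.0, 1.0, 2001)
U_bar = (1.0 - r) ** 2 / 2.0            # full-disclosure payoff under a uniform prior
U_low = np.maximum(0.5 - r, 0.0)        # no-disclosure payoff, E[w] = 1/2
U = 0.5 * U_bar + 0.5 * U_low           # convex and between the two bounds

# H(r) = 1 + U'(r): right-derivative approximated on the grid.
H = 1.0 + np.gradient(U, r)

print("H(0) ~", H[0], " H(1) ~", H[-1])                 # approximately 0 and 1
print("H nondecreasing:", np.all(np.diff(H) >= -1e-9))

# Mean-preserving spread check (inequality (5)): the integral of (1-H) from r to 1
# stays below the integral of (1-F), with equality at r = 0.
F = r                                                    # uniform c.d.f.
tail = lambda y: np.array([np.trapz(y[i:], r[i:]) for i in range(len(r))])
lhs, rhs = tail(1.0 - H), tail(1.0 - F)
print("max violation of (5):", np.max(lhs - rhs))        # non-positive up to error
print("equality at r = 0:", abs(lhs[0] - rhs[0]) < 1e-3)
```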

Remark 1. Theorem 1 indirectly characterizes the set of implementable triples (q, U, V), where q : R → [0, 1] is the receiver's action profile and V : R → R is


the sender's utility profile. Indeed, the profile U uniquely pins down the profiles q and V:

q(r) = −U′(r),   (6)

V(r) = q(r) + ρ(r)U(r),   (7)

where the first equality holds by Lemma 1 and the second by the definitions of q, U, and V. In particular, an action profile q is implementable by a persuasion mechanism if and only if q is non-increasing and satisfies

∫_r^1 q(s) ds ≤ Ū(r) for all r ∈ R.   (8)

Remark 2. For Theorem 1 to hold, it suffices to assume that the receiver's payoff u(ω, r) is b(r) + c(r)d(ω) for some functions b, c, and d, with c positive. In addition, the receiver's utility profile U pins down the sender's utility profile V if v(ω, r) = 1 + ρ(r)u(ω, r) for some function ρ.¹⁰ Otherwise, (7) does not hold. Hence, a public persuasion mechanism and a private persuasion mechanism that implement the same utility profile U for the receiver may yield different payoffs to the sender. As a consequence, optimal persuasion mechanisms may be outside of the set of public persuasion mechanisms.

5. Optimal Mechanisms

Denote by V_π(r) the sender's expected payoff under mechanism π when the receiver's type is r:

V_π(r) = ∫_Ω v(ω, r) π(ω, r) dF(ω).

The sender seeks a persuasion mechanism π that maximises ∫_R V_π(r) dG(r). The following lemma is a useful tool for finding optimal persuasion mechanisms. It expresses the sender's expected payoff as a function of the receiver's utility profile.

Lemma 3. For every mechanism π,

∫_R V_π(r) dG(r) = C + ∫_R U_π(r) I(r) dr,   (9)

where I(r) = g′(r) + ρ(r)g(r) and C = g(0)E[ω].

This result follows from the observation that V_π is a function of U_π for every π, as we note in Remark 1. Integration by parts yields (9). The proof is in the Appendix.
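As a sanity check of identity (9), the sketch below evaluates both sides numerically for the no-disclosure mechanism under hypothetical primitives: a uniform prior F, the type density g(r) = (1 + 2r)/2, and a constant weight ρ = 1.

```python
# A small numerical check of identity (9) under hypothetical primitives.
# The mechanism is no disclosure, so q(r) = 1{r <= E[w]}, U(r) = max(E[w] - r, 0),
# and V(r) = q(r) + rho*U(r).
import numpy as np

r = np.linspace(0.0, 1.0, 20001)
g = (1.0 + 2.0 * r) / 2.0            # strictly positive, integrates to 1 on [0,1]
g_prime = np.full_like(r, 1.0)
rho = 1.0
Ew = 0.5                             # E[w] under the uniform prior

q = (r <= Ew).astype(float)
U = np.maximum(Ew - r, 0.0)
V = q + rho * U

lhs = np.trapz(V * g, r)                                   # sender's expected payoff
rhs = g[0] * Ew + np.trapz(U * (g_prime + rho * g), r)     # C + integral of U*I
print(lhs, rhs)                                            # both close to 11/24
```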

By Theorem 1, the receiver's utility profile is implementable by some persuasion mechanism if and only if it is in 𝒰, hence the sender's problem can be expressed as

max_{U ∈ 𝒰} ∫_R U(r) I(r) dr.   (10)

¹⁰ The receiver's incentive compatibility constraints do not change if his payoff function is divided by c(r). After this normalization, without loss of generality, we can rename the types, r̃ = −b(r)/c(r), and the states, ω̃ = d(ω). The sender's payoff can also be expressed as v(ω̃, r̃) = 1 + ρ̃(r̃)(ω̃ − r̃), where ρ̃(r) = ρ(r)c(r).


We say that U is optimal if it solves the above problem.

5.1. Upper- and lower-censorship. A public signal is an upper-censorship if there exists a cutoff ω* ∈ Ω such that the signal truthfully reveals the state whenever it is below the cutoff and does not reveal the state whenever it is above the cutoff, so

σ(ω) = ω if ω ≤ ω*, and σ(ω) = E[ω | ω > ω*] if ω > ω*.

A lower-censorship signal is defined symmetrically: σ(ω) = ω whenever ω > ω* and σ(ω) = E[ω | ω ≤ ω*] whenever ω ≤ ω*, for some ω* ∈ Ω.

Figure 2. Utility under upper-censorship mechanism.

The utility profile under an upper-censorship mechanism (which pools all states on the interval [ω*, 1]) is shown as a black kinked curve in Fig. 2. Since ω < ω* is revealed to the receiver, if the receiver's type r is below ω*, he is informed whether ω exceeds r or not and hence receives the highest feasible utility Ū(r) by acting iff ω > r. On the other hand, a receiver with type r above ω* is essentially informed whether ω exceeds ω* or not, hence he acts iff E[ω | ω > ω*] − r > 0. His utility is thus either E[ω | ω > ω*] − r or zero, whichever is greater, corresponding to the two straight segments of the utility curve.
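The following sketch traces this utility profile for a uniform prior and an arbitrary cutoff ω* = 0.6. For r ≥ ω*, the ex-ante payoff scales the pooled-message payoff by the probability 1 − F(ω*) of that message, which keeps the profile continuous at ω*.

```python
# A minimal sketch of the receiver's utility profile under upper censorship with
# cutoff w_star, assuming a uniform prior on [0,1]; w_star = 0.6 is arbitrary.
import numpy as np

w_star = 0.6
m_pool = (w_star + 1.0) / 2.0        # E[w | w > w_star] under the uniform prior

def U_censor(r):
    # Type r < w_star effectively learns whether w exceeds r: full-disclosure payoff.
    if r < w_star:
        return (1.0 - r) ** 2 / 2.0
    # Type r >= w_star only learns whether w exceeds w_star: acts iff m_pool > r,
    # which happens with probability 1 - F(w_star) = 1 - w_star.
    return (1.0 - w_star) * max(m_pool - r, 0.0)

for r in [0.0, 0.3, 0.6, 0.7, 0.9]:
    print(r, round(U_censor(r), 4))
```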

The next result provides conditions for optimality of upper- (lower-) censorship.

Lemma 4. Upper-censorship (lower-censorship) is optimal for all F iff I crosses the horizontal axis at most once and from above (from below).

Figure 3. Construction of upper-censorship mechanism under quasiconcave I.

Proof. Consider a function I that crosses the horizontal axis once and from above at some r*, so I(r) is positive for r < r* and negative for r > r*. By (10), the sender wishes to design a mechanism that maximizes U(r) for all r < r* and minimizes it for all r > r*, subject to the incentive and feasibility constraints. To prove that upper-censorship is optimal, given U, we now construct an upper-censorship utility profile that has exactly the same utility at r*, a higher utility for all r < r*, and a lower utility for all r > r*, and is thus preferable for the sender. The construction is illustrated by Fig. 3. The function U is depicted by the solid dark red curve; the constructed upper-censorship profile is depicted by the kinked black curve. This profile coincides with Ū(r) on the interval [0, ω*] and then connects the points (ω*, Ū(ω*)) and (r*, U(r*)) by a straight line, where ω* is the tangency point. This straight line continues until it hits zero utility and forms a kink there. This utility profile is induced by an upper-censorship mechanism with cutoff point ω*, and it is superior to U by (10). The argument for lower-censorship is symmetric.


Conversely, suppose that it is not the case that I crosses the horizontal axis at most once and from above. Then, since I is continuous by assumption, there exist r1 < r2 < r3 such that I is negative on (r1, r2) and positive on (r2, r3). Therefore, lower-censorship is optimal for any F that has support only on [r1, r3]; in fact, by (10), every upper-censorship mechanism is strictly inferior. The argument for I that does not cross the horizontal axis at most once and from below is symmetric.

Full disclosure of ω and no disclosure of any information about ω are two natural policies that can be available to the sender even if she cannot use the entire set of persuasion mechanisms. The privacy of the receiver's information can protect the receiver and force the sender to fully disclose her information. However, disclosing no information about the state can also be optimal. From (10), full disclosure is optimal among all persuasion mechanisms if I ≥ 0, and no disclosure is optimal if I ≤ 0. By the same argument as in the proof of Lemma 4, the converse is also true.

Corollary 1. Full disclosure (no disclosure) is optimal for all F iff I is positive (negative).

Optimality of full disclosure or no disclosure in Corollary 1 is determined only by the distribution of the receiver's type. In particular, full disclosure can be optimal even when the sender is indifferent about the payoff of the receiver (ρ(r) = 0, so v(ω, r) = 1) and the ex-ante expected state E[ω] is so high that no disclosure induces action 1 with probability close to 1.

Under the assumption of a type-independent weight ρ, we can obtain a stronger result for the optimality of upper- (lower-) censorship:

Theorem 2. Let ρ(r) = ρ ∈ R. Upper censorship (lower censorship) is optimal for all F and all ρ iff the density g of the receiver's type is logconcave (logconvex).

Proof. By Lemma 4, upper censorship (lower censorship) is optimal iff the function I = g′(r) + ρg(r) crosses the horizontal axis at most once and from above (from below). This holds for all ρ ∈ R iff g′(r)/g(r) is nonincreasing (nondecreasing), by Proposition 1 of Quah and Strulovici (2012).

This result is useful, because many commonly-used density functions are logconcaveor logconvex (Tables 1 and 3 of Bagnoli and Bergstrom, 2005).

Under the assumption of a type-independent weight ρ, we can also obtain a stronger result for the optimality of full disclosure and no disclosure. We say that G is a maximum entropy distribution with parameter λ ∈ R if

g(r) = c e^{λr}, r ∈ [0, 1],

where c > 0 is a constant such that ∫_0^1 g(r) dr = 1.¹¹ Note that the uniform distribution is the maximum entropy distribution with λ = 0.

Corollary 2. Let ρ(r) = ρ ∈ R. Either full disclosure or no disclosure is optimal for all F and all ρ iff G is a maximum entropy distribution.

¹¹ That is, G is the maximum entropy distribution in the class of probability distributions with support on [0, 1] and a given mean.


Proof. The only two mechanisms that are both upper-censorship and lower-censorship are full disclosure and no disclosure. By Theorem 2, both upper-censorship and lower-censorship are optimal for all F and all ρ iff g is both logconcave and logconvex. That is, g′(r)/g(r) is both non-increasing and non-decreasing, which holds iff g′(r)/g(r) is equal to some constant λ.

The class of maximum entropy distributions is a very special class under which the optimal choice of the sender polarizes, in the sense that either full disclosure or no disclosure is optimal. So, if g(r) = c e^{λr} for some λ ∈ R, then full disclosure is optimal whenever ρ ≥ −λ and no disclosure is optimal whenever ρ ≤ −λ (and anything is optimal when ρ = −λ).

As an illustration, let ρ = 0 and let G be the maximum entropy distribution with mean r_0. Then no disclosure is optimal if and only if r_0 ≤ 1/2, and full disclosure is optimal if and only if r_0 ≥ 1/2. Indeed, with ρ = 0, the optimal persuasion mechanism is fully determined by the sign of λ, which is in turn determined by whether the mean r_0 is greater or smaller than 1/2.
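A quick numerical illustration of this polarization: for g(r) = c e^{λr} on [0, 1], the mean r_0 exceeds 1/2 exactly when λ ≥ 0, and with ρ = 0 the sign of λ (i.e., of I = g′) selects between full and no disclosure. The λ values below are arbitrary.

```python
# A sketch checking the relation between lambda, the mean r0 of the maximum entropy
# density g(r) = c*exp(lambda*r) on [0,1], and the optimal policy when rho = 0.
import numpy as np

def mean_of_max_entropy(lam, n=10001):
    r = np.linspace(0.0, 1.0, n)
    g = np.exp(lam * r)
    g /= np.trapz(g, r)            # normalize so that g integrates to 1
    return np.trapz(r * g, r)

for lam in [-3.0, -0.5, 0.0, 0.5, 3.0]:
    r0 = mean_of_max_entropy(lam)
    policy = "full disclosure" if lam >= 0 else "no disclosure"
    print(f"lambda={lam:+.1f}  mean r0={r0:.3f}  optimal: {policy}")
```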

5.2. Comparative statics. Theorem 2 allows for a sharp comparative statics analysis of the amount of information that is optimally disclosed by the sender. Specifically, if g is logconcave and if a change in primitives increases the optimal cutoff ω*, then the optimal mechanism discloses more information. We now show that the sender optimally discloses more information when she is less biased relative to the receiver, or when receivers are more reluctant to act.

Let ρ(r) = ρ ∈ R. Recall that ρ is the alignment parameter between the sender's and the receiver's preferences: a greater ρ means a smaller sender's bias.

Consider a family of c.d.f.s G_t of the receiver's type,

G_t(r) ≡ G(r − t),

where t ∈ R is a parameter and G admits a strictly positive, differentiable, and logconcave density g on R with g′ + ρg being almost everywhere nonzero. Parameter t is a parallel shift of the distribution of types. A greater t means a more "conservative" (reluctant to act) distribution, since every type of the receiver has a greater cost of action.

Since g_t is logconcave for every t, upper censorship is optimal by Theorem 2. Let ω*(ρ, t) ∈ Ω be the optimal upper-censorship cutoff.

Proposition 1. For all ρ and t such that ω*(ρ, t) ∈ (0, 1):

(a) ω*(ρ, t) is strictly increasing in ρ;
(b) ω*(ρ, t) is strictly increasing in t.

The intuition for part (a) is that for a higher ρ, the sender puts more weight on the receiver's utility, so she optimally endows the receiver with a higher utility by providing more information. To get the intuition for part (b), consider the case of ρ = 0. When t increases, the peak of the density function g shifts to the right. For a large enough t, the density g is increasing on [0, 1], so full disclosure is optimal.


Symmetrically, when t is sufficiently small, the density g is decreasing on [0, 1], so no disclosure is optimal.

6. Multiple Actions and Piecewise Linear Utility

In this section we extend our model by allowing the receiver to make a choice among multiple actions. We characterize the receiver's utility profiles implementable by private and public persuasion mechanisms and demonstrate that they are, generally, not equal when the receiver has more than two actions. That is, the sender can generally implement a strictly larger set of the receiver's utility profiles by private persuasion, as compared to public persuasion.

We point out that the problem with multiple actions and linear utility of the receiver can simulate the problem with binary actions and convex piecewise linear utility. Therefore, this section also addresses the question of implementation equivalence with nonlinear utilities. It shows that the equivalence of private and public implementation fails even if we limit attention to the class of convex piecewise linear utility functions.

The section concludes with two examples. The first is a simple illustration of how a utility profile attainable by private persuasion cannot be implemented by a public signal. The second example shows that optimality equivalence for the sender also fails: the outcome of the optimal private persuasion mechanism is unattainable by public persuasion.

6.1. Model. Let A = {0, 1, ..., n} be a finite set of actions available to the receiver. As we previously assumed, the receiver's utility u(ω, r, a) is linear in the state ω for every type and every action. That is to say, after having observed any message from the sender, all that matters for the receiver is the posterior mean of the state. With this assumption we can represent the receiver's utility in a reduced form. For every type r and every posterior distribution over states with mean µ, define

U(µ, r) = max_{a∈A} u(µ, r, a).

Observe that if U(µ, r) is convex and piecewise linear in type with a finite number of kinks, then by the Envelope Theorem U(µ, r) is a reduced form of some utility function u(µ, r, a) with finitely many actions which is linear in type for every action. We restrict attention to such utility functions. Specifically, let the receiver's utility u(ω, r, a) satisfy

u(ω, r, a) = b_a ω − c_a r − d_a,

where ω ∈ [0, 1] is a state, r ∈ [0, 1] is a receiver's type, a ∈ A is an action, and (b_a, c_a, d_a)_{a∈A} are parameters. We normalize b_0 = c_0 = d_0 = 0.

Suppose that the actions are ordered such that a receiver chooses a higher action in a higher posterior mean state. This assumption is w.l.o.g. after having eliminated the actions that are dominated, i.e., never optimal for any type. Denote by x_a(r) the cutoff state at which the receiver of type r is indifferent between two consecutive


actions a − 1 and a:

x_a(r) = ((c_a − c_{a−1})/(b_a − b_{a−1})) r + (d_a − d_{a−1})/(b_a − b_{a−1}),  a = 1, ..., n,

and define x_0(r) = −∞ and x_{n+1}(r) = ∞. Thus, the receiver of type r chooses action a on the interval of states [x_a(r), x_{a+1}(r)]. Note that the cutoffs need not be in Ω = [0, 1].

6.2. Signals. Recall that every public signal σ can be equivalently described by the probability that the posterior mean state is below a given value x ∈ R,

H(x) = ∫_Ω Pr[σ(ω) ≤ x] dF(ω).

This remains true here due to the assumption that the receiver's utility is linear in the state. In fact, it will be more convenient to describe a signal by a convex function C : R → [0, 1] defined as

C(x) = ∫_x^∞ (1 − H(s)) ds = ∫_x^∞ (s − x) dH(s).

With abuse of notation, we will refer to the function C as a signal. Denote by C̄ and C̲ the full-disclosure and no-disclosure signals, respectively. For the full-disclosure signal, the c.d.f. of the posterior mean state is F, and thus

C̄(x) = ∫_x^∞ (1 − F(s)) ds.

For the no-disclosure signal we have

C̲(x) = max{µ_0 − x, 0},

where µ_0 = E[ω] denotes the prior mean. Lemma 2 then implies

C̲(x) ≤ C(x) ≤ C̄(x),  x ∈ R.

Public mechanisms. A public mechanism is a signal whose messages are observed

by every type of the receiver. For a given signal σ, the expected utility of the receiver of type r is equal to

U(r) = ∫_Ω U(µ, r) dH(µ),  r ∈ [0, 1].   (11)

Denote by U_pub the set of utility profiles implementable by public signals,

U_pub = {U : there exists a signal σ s.t. U satisfies (11)}.

mechanisms.

Proposition 2. A utility profile U is implementable by public persuasion if and onlyif there exists a convex C with C ≤ C ≤ C such that

U(r) =n∑a=1

(ba − ba−1)C(xa(r)) for all r ∈ [0, 1].


Proof. Fix a signal and let H : R → [0, 1] be the c.d.f. of the posterior mean state. Also let C(x) = ∫_x^∞ (1 − H(s)) ds.

Recall that x_a(r) denotes the cutoff posterior mean at which type r is indifferent between actions a − 1 and a, where we set x_0 = −∞ and x_{n+1} = ∞. The payoff of type r under the given signal is

U(r) = ∑_{a=0}^n ∫_{x_a(r)}^{x_{a+1}(r)} (b_a µ − c_a r − d_a) dH(µ)
     = ∑_{a=1}^n ∫_{x_a(r)}^∞ ((b_a − b_{a−1})µ − (c_a − c_{a−1})r − (d_a − d_{a−1})) dH(µ),

where we used b_0 = c_0 = d_0 = 0. For each a = 1, ..., n, using the notation β_a = b_a − b_{a−1} and γ_a = (c_a − c_{a−1})r + (d_a − d_{a−1}) to simplify exposition, integration by parts yields

∫_{x_a(r)}^∞ (β_a µ − γ_a) dH(µ) = −(β_a µ − γ_a)(1 − H(µ))|_{x_a(r)}^∞ + β_a ∫_{x_a(r)}^∞ (1 − H(µ)) dµ
                                = (β_a x_a(r) − γ_a)(1 − H(x_a(r))) + β_a C(x_a(r)).

Notice that, by definition,

x_a(r) = ((c_a − c_{a−1})r + (d_a − d_{a−1}))/(b_a − b_{a−1}) = γ_a/β_a,

and hence β_a x_a(r) − γ_a = 0. Thus we have, for every r ∈ [0, 1],

U(r) = ∑_{a=1}^n β_a C(x_a(r)) = ∑_{a=1}^n (b_a − b_{a−1}) C(x_a(r)).   (12)

By Lemma 2, the set of signals is characterized by the set of convex functions C such that C̲ ≤ C ≤ C̄. Consequently, the set of implementable utility profiles for the receiver is equal to the set of functions that satisfy (12) as C ranges over all possible signals.
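Formula (12) is easy to verify numerically against the direct computation E_H[max_a u(µ, r, a)]. The sketch below uses hypothetical parameters (three actions with u(µ, r, a) = µa − ra², so the cutoffs are x_1(r) = r and x_2(r) = 3r) and a two-point distribution H of the posterior mean.

```python
# A small check of formula (12) with hypothetical parameters: b=(0,1,2), c=(0,1,4),
# d=(0,0,0), so u(mu,r,a) = mu*a - r*a**2, and a two-point posterior-mean distribution H.
from fractions import Fraction as Fr

b, c, d = [0, 1, 2], [0, 1, 4], [0, 0, 0]
H = [(Fr(3, 10), Fr(1, 2)), (Fr(7, 10), Fr(1, 2))]   # (posterior mean, probability)

def u(mu, r, a):
    return b[a] * mu - c[a] * r - d[a]

def C(x):
    # C(x) = E[max(m - x, 0)] for the posterior-mean distribution H.
    return sum(p * max(m - x, 0) for m, p in H)

def cutoff(r, a):
    return ((c[a] - c[a - 1]) * r + (d[a] - d[a - 1])) / (b[a] - b[a - 1])

r = Fr(1, 5)
direct = sum(p * max(u(m, r, a) for a in range(3)) for m, p in H)
via_12 = sum((b[a] - b[a - 1]) * C(cutoff(r, a)) for a in range(1, 3))
print(direct, via_12)    # both equal 7/20
```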

Private mechanisms. A private mechanism is described by a (possibly infinite) menu of signals, Σ. The receiver of type r chooses a signal from the menu and then observes messages only from that signal. Obviously, he will choose the signal that maximizes his expected utility:

U(r) = max_{σ∈Σ} ∫_Ω U(µ, r) dH(µ),  r ∈ [0, 1].   (13)

Denote by U_pr the set of utility profiles implementable by all menus of signals,

U_pr = {U : there exists a menu Σ s.t. U satisfies (13)}.

The following corollary is immediate by Proposition 2. Denote

𝒞 = {C : C̲ ≤ C ≤ C̄ and C is convex}.


Corollary 3. A utility profile U is implementable by private persuasion if and only if there exists a menu Σ ⊂ 𝒞 such that

U(r) = max_{C∈Σ} ∑_{a=1}^n (b_a − b_{a−1}) C(x_a(r)) for all r ∈ [0, 1].

To make a comparison with the result in Theorem 1 for the binary-action model, let us establish necessary conditions for implementation by private mechanisms. First, if U is implementable, it must be below the full-disclosure utility and above the no-disclosure utility, U̲ ≤ U ≤ Ū. Second, notice that U(µ, r) is piecewise linear and convex in r due to our assumptions, and so is ∫_Ω U(µ, r) dH(µ) for every signal σ. Hence, by the Envelope Theorem, an implementable U(r) must be convex. Denote by U_nes the set of utility profiles that satisfy these necessary conditions,

U_nes = {U : U̲ ≤ U ≤ Ū and U is convex}.

With binary actions, A = {0, 1}, after the normalization b_1 = c_1 = 1 and d_1 = 0, we have u(ω, r, a) = (ω − r)a, a ∈ {0, 1}. The receiver's payoff under a signal C is simply U(r) = C(r). With this assumption, the equivalence is immediate:

U_pub = U_pr = U_nes = 𝒞.

However, with more than two actions the equivalence generally fails, as established by the following proposition.

Proposition 3. Let A be a finite set with at least 3 actions. Suppose that the cutoffs x_a(r) are increasing in a for all types r ∈ [0, 1] and strictly increasing for an interval of types. Then

U_pub ⊊ U_pr ⊊ U_nes.

The proof is in the Appendix. Example 1 below provides the intuition and illustrates the proof.

Example 1: Nonequivalence of utility profile implementation. A simple example with 3 actions illustrates why the equivalence can fail.

Consider the problem with three actions, A = {0, 1, 2}, and the uniform distribution of states, F(x) = x. Let

u(ω, r, a) = 0 if a = 0;  1/4 + ω − r if a = 1;  2(ω − r) if a = 2.

The coefficients are (b_0, b_1, b_2) = (0, 1, 2), (c_0, c_1, c_2) = (0, 1, 2), and (d_0, d_1, d_2) = (0, −1/4, 0). The cutoffs are x_1(r) = r − 1/4 and x_2(r) = r + 1/4. By Proposition 2, for a given signal C the expected utility of type r is

U_C(r) = C(r − 1/4) + C(r + 1/4).



Figure 4. The full-disclosure utility profile Ū and the no-disclosure utility profile U̲.

The full-disclosure signal under the uniform distribution is

C̄(x) = ∫_x^1 (ω − x) dω = (1 − x)²/2,

and the no-disclosure signal is

C̲(x) = max{µ_0 − x, 0} = max{1/2 − x, 0}.

The full-disclosure and no-disclosure utility profiles are thus

Ū(r) = (1 − (r − 1/4))²/2 + (1 − (r + 1/4))²/2 = 17/16 − (2 − r)r

and

U̲(r) = max{3/4 − r, 0} + max{1/4 − r, 0} =
  0 if r > 3/4;
  3/4 − r if 1/4 ≤ r ≤ 3/4;
  1 − 2r if r < 1/4.

The full-disclosure and no-disclosure utility profiles are shown in Fig. 4. Notice the gap of 1/16 between these utility curves for the extreme types, 0 and 1, a feature that was not present with a binary action. We now fix the expected utilities of two types, r_L = 1/4 and r_R = 3/4, equal to their full-disclosure utilities, Ū(1/4) = 5/8 and Ū(3/4) = 1/8, and consider the least attainable utility of type r_M = 1/2 under the necessary conditions, private persuasion, and public persuasion.


Necessary conditions. We construct the least function that lies between U̲ and Ū and passes through the points (r_L, Ū(r_L)) and (r_R, Ū(r_R)). This function is piecewise linear and equal to

U_nes(r) = (2 − 3r)/2 if r ≤ 1/2;  (1 − r)/2 if r > 1/2.

Type r_M = 1/2 gets 1/4, which is also the payoff of r_M under no disclosure,

U_nes(r_M) = U̲(r_M) = 1/4.

Private persuasion. The least payoff for all types, subject to the utility profile passing through the points (r_L, Ū(r_L)) and (r_R, Ū(r_R)), is attained by the least informative menu Σ which provides the first-best payoffs for types r_L and r_R. This menu consists of two cutoff signals, Σ = {L, R}, which are first-best for types r_L and r_R, respectively:

1. Signal L informs the receiver whether ω is below or above 1/4:

µ_L = 1/8 with probability 1/4;  5/8 with probability 3/4.

2. Signal R informs the receiver whether ω is below or above 3/4:

µ_R = 3/8 with probability 3/4;  7/8 with probability 1/4.

Observe that type r_L = 1/4 never chooses action 0, and signal L lets him optimally choose between actions 1 and 2. Thus signal L is first-best for this type. Similarly, type r_R = 3/4 never chooses action 2, and signal R lets him optimally choose between actions 0 and 1. Thus signal R is first-best for this type. Under these signals,

C_L(x) = 0 if x > 5/8;  (3/32)(5 − 8x) if 1/8 ≤ x ≤ 5/8;  1/2 − x if x < 1/8,

and

C_R(x) = 0 if x > 7/8;  (1/32)(7 − 8x) if 3/8 ≤ x ≤ 7/8;  1/2 − x if x < 3/8.

Thus, the utility of type r_M = 1/2 is equal to

U_pr(r_M) = max{C_L(r_M − 1/4) + C_L(r_M + 1/4), C_R(r_M − 1/4) + C_R(r_M + 1/4)} = 9/32.

Public persuasion. In order to preserve the utilities of types r_L and r_R, who are getting their first-best payoffs from the private menu {L, R}, a public signal must at least


communicate the partition {[0, 1/4], [1/4, 3/4], [3/4, 1]} to the set of types r_L and r_R, with posterior mean

µ_{L∧R} = 1/8 with probability 1/4;  1/2 with probability 1/2;  7/8 with probability 1/4.

Substituting this into the utility function, one obtains

U_pub(r_M) = 5/16.

Thus, the least utilities for type r_M = 1/2 attainable under public persuasion, private persuasion, and the necessary conditions satisfy

U_pub(r_M) = 10/32 > U_pr(r_M) = 9/32 > U_nes(r_M) = U̲(r_M) = 8/32.

Consequently, in this example, the sets of corresponding utility profiles satisfy the strict inclusions

U_pub ⊊ U_pr ⊊ U_nes.
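The comparison above can be verified with exact arithmetic. The sketch below encodes each signal by its posterior-mean distribution, evaluates U_C(r_M) = C(r_M − 1/4) + C(r_M + 1/4) for the private menu {L, R} and for the three-interval public partition, and reproduces 5/16 > 9/32 > 1/4.

```python
# A verification sketch of Example 1 using exact fractions.
from fractions import Fraction as Fr

def C(dist):
    # Returns the function x -> E[max(m - x, 0)] for a discrete distribution
    # dist = [(posterior mean, probability), ...].
    return lambda x: sum(p * max(m - x, 0) for m, p in dist)

C_L = C([(Fr(1, 8), Fr(1, 4)), (Fr(5, 8), Fr(3, 4))])                    # cutoff 1/4
C_R = C([(Fr(3, 8), Fr(3, 4)), (Fr(7, 8), Fr(1, 4))])                    # cutoff 3/4
C_pub = C([(Fr(1, 8), Fr(1, 4)), (Fr(1, 2), Fr(1, 2)), (Fr(7, 8), Fr(1, 4))])

def U(Cfun, r):
    return Cfun(r - Fr(1, 4)) + Cfun(r + Fr(1, 4))

rM = Fr(1, 2)
U_pr  = max(U(C_L, rM), U(C_R, rM))   # best signal in the private menu {L, R}
U_pub = U(C_pub, rM)                  # public three-interval partition
U_nes = Fr(1, 4)                      # lower envelope from the necessary conditions
print(U_pub, U_pr, U_nes)             # 5/16 > 9/32 > 1/4
```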

Example 2: Nonequivalence of optimality. We have established that with multiple actions or nonlinear utility, public persuasion mechanisms generally cannot implement every utility profile of the receiver attainable by private persuasion. One may suspect that the same statement can be made about optimality for the sender. That is, an optimal outcome under private mechanisms can be unattainable by public signals. Here we show, by example, that this is true.

Consider the problem with three actions, A = {0, 1, 2}, and the uniform distribution of states, F(x) = x. Let

u(ω, r, a) = ωa − ra².

The interpretation is that a is a level of effort and ra² is the type-dependent cost of effort. The receiver maximizes his expected utility. The cutoff values of the posterior means are

x_1(r) = r,  x_2(r) = 3r.

Thus type r chooses action 0 if the posterior mean satisfies µ < r, chooses action 1 if r ≤ µ < 3r, and chooses action 2 if µ ≥ 3r.

Suppose that there are two receiver types, r_L = 0.2 and r_R = 0.975, in equal proportion in the population. The sender's objective is to maximize the expected effort (action) of the receivers.

Consider the private menu of two signals, Σ = {L, R}:

1. Signal L informs the receiver whether ω is below or above 0.2:

µ_L = 0.1 with probability 0.2;  0.6 with probability 0.8.


2. Signal R informs the receiver whether ω is below or above 0.95:

µ_R = 0.475 with probability 0.95;  0.975 with probability 0.05.

First, let us inspect the incentives of type r_R = 0.975. This type chooses either action 0 or 1 (action 2 is never optimal). Signal R maximizes this type's probability of choosing action 1 (and, in fact, takes away all surplus of this type). Signal L is useless for this type, as it leads to the choice a = 0 after all of its messages. Hence type r_R weakly prefers signal R. This is optimal for the sender, as type r_R's expected action is maximized.

Next, we inspect the incentives of type r_L = 0.2. Signals L and R yield the utilities, respectively,

0.2 u(0.1, r_L, 0) + 0.8 u(0.6, r_L, 1) = 0.32

and

0.95 u(0.475, r_L, 1) + 0.05 u(0.975, r_L, 2) = 0.31875.

Thus type r_L prefers signal L. Observe that signal L is optimal for the sender, as it maximizes type r_L's probability of action 2, and there is no way to further partition the state space to increase the expected action.

So, we have shown that signals L and R maximize the expected actions of types r_L and r_R, respectively, and the menu is incentive compatible; hence the menu {L, R} is optimal. The expected action of type r_L is 0.8 × 2 = 1.6 and the expected action of type r_R is 0.05 × 1 = 0.05. Given the equal proportions of the types, the total is

(1/2)(1.6) + (1/2)(0.05) = 0.825.

Now, notice that any public signal that preserves the expected action of type r_L must send the same message whenever the state is in [0.2, 1], as otherwise the frequency of action 2 would be smaller. But then, under every such signal, type r_R will always choose a = 0. Hence there does not exist a public signal that generates as high a payoff for the sender as the private menu {L, R}.
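The arithmetic of this example can be reproduced as follows, with ties broken toward the higher action (the convention "chooses action 2 if µ ≥ 3r").

```python
# A verification sketch of Example 2 using exact fractions.
from fractions import Fraction as Fr

def u(mu, r, a):
    return mu * a - r * a ** 2

def best_action(mu, r):
    # Ties are broken toward the higher action.
    return max(range(3), key=lambda a: (u(mu, r, a), a))

signal_L = [(Fr(1, 10), Fr(1, 5)), (Fr(3, 5), Fr(4, 5))]        # cutoff 0.2
signal_R = [(Fr(19, 40), Fr(19, 20)), (Fr(39, 40), Fr(1, 20))]  # cutoff 0.95

def expected(signal, r, f):
    return sum(p * f(mu, r, best_action(mu, r)) for mu, p in signal)

r_L, r_R = Fr(1, 5), Fr(39, 40)
# Type r_L prefers signal L: 0.32 vs 0.31875.
print(float(expected(signal_L, r_L, u)), float(expected(signal_R, r_L, u)))
# Expected actions from the chosen signals and the sender's total: 1.6, 0.05, 0.825.
act = lambda mu, r, a: a
aL, aR = expected(signal_L, r_L, act), expected(signal_R, r_R, act)
print(float(aL), float(aR), float(aL + aR) / 2)
```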

7. Applications

7.1. Public education campaign. The government would like to persuade the public to reduce the amount of smoking by commissioning a public education campaign about the health risks of cigarettes.

There is a competitive market for cigarettes. The supply is perfectly elastic: the market clearing price is equal to the cost parameter, p = c. The demand is induced by a continuum of consumers. Each consumer values cigarettes at ω − ε, where ω is an uncertain common value of cigarettes and ε is an idiosyncratic preference shock; ω ∈ [0, 1] and ε ∈ R are distributed according to F and G, respectively, where the density g is log-concave. Thus, a consumer would like to purchase cigarettes if and only if her value ω − ε exceeds the price p.

Denote r = ε + p. Then a buyer's net payoff is u(ω, r) = ω − r, and the c.d.f. of r is G_p(r) ≡ G(r − p). The total consumed quantity will depend on consumers'


beliefs about the expected value of ω conditional on the information they have. If this expected value is m, then a consumer would like to purchase cigarettes if and only if m ≥ r. The demand curve is thus equal to G_p(m), and the consumer surplus is equal to

CS(m) = ∫_0^m (m − r) dG_p(r).

amount of cigarette consumption, αGp(m), where α > 0. The social planner wouldlike to maximize the social welfare by designing a public education campaign (per-suasion mechanism) that reveals information about ω and thereby alters its expectedvalue, m. For each ω and r social planner’s preferences are captured by payoff

v(ω, r) = u(ω, r)− α

if consumer with type r buys cigarettes, and zero if he does not buy. The socialwelfare is equal to V (m) = CS(m)− αGp(m).

As in Section 2, let π be a persuasion mechanism. The social planner maximizes the expected welfare

∫ V_π(r) dG_p(r) = ∫∫ v(ω, r) π(ω, r) dF(ω) dG_p(r) = ∫ (U_π(r) − α q_π(r)) dG_p(r) = C + ∫ I_p(r) U_π(r) dr,

where I_p(r) = g_p(r) − α g′_p(r) and C is a constant. Our results from Section 5 apply, allowing us to characterize the optimal educational campaign. In particular, full information disclosure is optimal if g is decreasing (so a large mass of consumers has a high value for cigarettes), irrespective of the magnitude of the negative externality, α. A lower-censorship that fully informs consumers only if the quality ω is above a cutoff is optimal if g is log-concave (Theorem 2).

Naturally, absent the negative externality (α = 0), full disclosure is always optimal: the social planner only increases the consumer surplus by letting the consumers make informed choices. But in the presence of the externality, the social planner may wish to hide information. A larger negative externality and a higher price of cigarettes lead to (weakly) less disclosure (Proposition 1). The intuition is that if the price is sufficiently large, then g_p is increasing on the relevant range (so a large mass of consumers has a low value for cigarettes) and a substantial fraction of consumers would not buy cigarettes if uninformed. So, better information is more likely to increase cigarette consumption than to decrease it. Even though there is a positive effect of information due to consumers making better informed choices, it is offset by the externality effect if α is large.

Finally, our analysis can easily handle environments with positive externalities (negative α), such as educational campaigns about the benefits of vaccination. In such environments, upper censorship, which fully informs consumers only if the quality ω is below a cutoff, is optimal. Moreover, a larger positive externality and a lower price of vaccines lead to less disclosure.

7.2. Lobbying. We consider a modification of the lobbying example from Kamenica and Gentzkow (2011), adapted to the case of two actions of the receiver and uncertainty about the receiver's preferences. A lobbying group (sender) commissions a study with the goal of influencing a politician's decision about the adoption of a policy. The politician (receiver) chooses whether to adopt the policy (a = 1) or not (a = −1). The social value of the policy is determined by the state ω ∈ [−1, 1] and the politician's type r ∈ [−1, 1]. The politician's payoff is equal to u(a, ω, r) = −(a − ap(ω, r))², where the politician's bliss point is given by ap(ω, r) = ω − r. The lobbyist's payoff is equal to v(a, ω) = −(a − al(ω, r))², where the lobbyist's bliss point is biased towards c: al(ω, r) = α ap(ω, r) + (1 − α)c with α ∈ [0, 1) and c > 0.

Note that

(1/4)(u(1, ω, r) − u(−1, ω, r)) = (1/4)((1 + (ω − r))² − (1 − (ω − r))²) = ap(ω, r)

and

(1/4)(v(1, ω) − v(−1, ω)) = al(ω, r).

Making the substitution ω = (ω + 1)/2, r = (r + 1)/2, and ρ = α/(2(1 − α)c), and changing the notation to a = 0 for a = −1 and a = 1 for a = 1, we obtain

u(ω, r) = (1/2)(u(1, ω, r) − u(−1, ω, r)) = ω − r,

v(ω, r) = ((1 − α)c/2)(v(1, ω) − v(−1, ω)) = 1 + ρ(ω − r)

with ω, r ∈ [0, 1] and ρ ≥ 0. Thus, our analysis applies. In particular, if the density g of r is log-concave, then upper censorship is optimal, with more information disclosed if the lobbyist is less biased (α closer to 1 and c closer to 0).
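As a sanity check, the short symbolic sketch below (ours, not from the paper) verifies the payoff-difference identities used in the transformation above, under the stated quadratic-loss payoffs.

```python
# A quick symbolic check (not from the paper) of the payoff-difference identities above,
# assuming the quadratic-loss payoffs u(a, w, r) = -(a - a_p)^2 and v(a, w) = -(a - a_l)^2.
import sympy as sp

w, r, alpha, c = sp.symbols('omega r alpha c', real=True)
a_p = w - r
a_l = alpha * a_p + (1 - alpha) * c

u = lambda a: -(a - a_p)**2
v = lambda a: -(a - a_l)**2

# (1/4)(u(1) - u(-1)) should reduce to the politician's bliss point a_p = omega - r
assert sp.simplify(sp.Rational(1, 4) * (u(1) - u(-1)) - a_p) == 0
# (1/4)(v(1) - v(-1)) should reduce to the lobbyist's bliss point a_l
assert sp.simplify(sp.Rational(1, 4) * (v(1) - v(-1)) - a_l) == 0
print("payoff-difference identities verified")
```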

There are two noteworthy differences with the results in Kamenica and Gentzkow (2011) for this environment. First, the optimal persuasion mechanism is very much driven by the shape of the distribution of the private information of the receiver and much less by the degree of the conflict of preferences between the parties expressed in terms of the effective bias. In Kamenica and Gentzkow (2011), the value of α determines the structure of the optimal mechanism, whereas c plays no role. In our environment, both of them interact. Second, in Kamenica and Gentzkow (2011) the lobbyist either commissions a fully revealing study or no study at all. As Kamenica and Gentzkow (2011) remark: “This contrasts with the observation that industry-funded studies often seem to produce results more favorable to the industry than independent studies.” In our environment, more complex mechanisms can be optimal, and the expected probability of approval of the policy can be higher than the ex-ante probabilities of approval under full or no information.

7.3. Media censorship. The following is an adaptation of the model of media endorsement of competing political candidates (parties) in Chiang and Knight (2011), augmented by the possibility of censorship by a partial government.12

There is a government, two political parties c ∈ {G, O} (pro-Government and Opposition) competing in a parliamentary election, a continuum of voters, and a continuum of newspapers, each of unit measure. Parties are characterized by a pair of platform and quality (ic, ωc), where we assume that ic, ωc ∈ [0, 1], and without loss of generality set iG < iO. If party c wins, the voter's payoff is equal to

u(c, r) = ωc − (β/2)(ic − r)²,

where r ∈ [0, 1] is the voter's ideal platform (ideology) and β > 0 represents the relative utility weight placed on the party's platform. The value of r is distributed across the population of voters according to the c.d.f. G. The voters know the platforms of the parties but are uncertain about the relative quality differential ω = ωG − ωO of the candidates on the party ballots. We assume that ω is distributed on R according to the c.d.f. F. Voters support the party that maximizes their expected utility.

12 Another model with a similar structure is Chan and Suen (2008).

The newspapers receive information about the quality of the candidates and choose to endorse one of the parties. Each newspaper n observes a signal ωn = ω + εn, where the εn are independent identically distributed uniform error terms with zero mean. We denote the distribution of ωn by Fn. A newspaper endorses the pro-Government party iff ωn > γn, where γn describes the newspaper's editorial policy. Voters know the editorial policies and choose a single newspaper to read.

In Chiang and Knight (2011), the newspaper editorial policies are exogenous. We extend the model by allowing the government to determine the set of editorial policies available in the market by shutting down newspapers with undesirable editorial policies and by subsidizing the creation of additional newspapers if some desired editorial policies are lacking. Thus, we allow the government to determine the set of editorial policies Γ = {γn | n ∈ N}. The objective of the government is to maximize the share of the votes for the pro-Government party.

The timing of the game is as follows. The government chooses the set of permitted editorial policies.13 The voters choose their preferred newspapers. The newspapers observe the information about the party quality and make their endorsements. The votes are cast. The solution concept is perfect Bayesian equilibrium.

13 The newspapers are not strategic; they only follow their editorial policies (unless censored by the government).

Proposition 4. Assume that the density of the distribution of the voter ideology is log-concave. Then Γ = {γ | γ ≤ ω∗} for some ω∗ ∈ [0, 1], that is, the government censors all newspapers with editorial policies above ω∗ and ensures that all editorial policies with γ ≤ ω∗ are represented in the market.

8. Discussion

In this section we discuss side issues and extensions that connect this paper to the literature beyond the topic of Bayesian persuasion.

8.1. Rotation order. The set of upper-censorship (lower-censorship) mechanisms is totally ordered by cutoff points ω∗ in Ω. The extremes, ω∗ = 0 and ω∗ = 1, correspond to the two extreme mechanisms, full disclosure and no disclosure.

To compare to Johnson and Myatt (2006), we consider the case of a log-concave density g of the receiver's type and assume that ρ = 0 (so the sender cares about maximizing the probability that the receiver acts). Interestingly, the set of c.d.f.s of the posterior expected payoff θ induced by upper-censorship (lower-censorship) mechanisms is also totally ordered by the rotation order R (Johnson and Myatt 2006): for every two c.d.f.s H and H̃ in this set one can find a rotation point θ∗ such that H̃ is higher than H in the rotation order iff H̃(θ) ≥ H(θ) for all θ ≤ θ∗ and H̃(θ) ≤ H(θ) for all θ ≥ θ∗. Roughly speaking, a higher c.d.f. in the rotation order has a higher dispersion. For upper-censorship mechanisms, the rotation point θ∗ is ω∗ − ω∗B, where ω∗B is the optimal upper-censorship cutoff. This rotation point is increasing w.r.t. R. For lower-censorship mechanisms, the rotation point θ∗ is ω∗ − ω∗W, where ω∗W is the worst lower-censorship cutoff. This rotation point is decreasing w.r.t. R.

Let us first restrict attention to the set of lower-censorship mechanisms (which are generally suboptimal, since under log-concave g an optimal mechanism is upper censorship by Theorem 2). The c.d.f.s H induced by the lower-censorship mechanisms are ordered by the rotation order R and satisfy the condition of Lemma 1 of Johnson and Myatt (2006) that the rotation point is decreasing w.r.t. R, which entails that one of the extremes (full disclosure or no disclosure) maximizes the sender's payoff on this set.14

14 Of course, neither may be optimal on a more general domain of mechanisms.

If, instead, we restrict attention to the set of upper-censorship mechanisms (which contains an optimal mechanism by Theorem 2), then the c.d.f.s H induced by the mechanisms in this set are also ordered by the rotation order R, but the condition of Lemma 1 of Johnson and Myatt (2006) is not satisfied. In this case, the sender's payoff is not quasiconvex w.r.t. R, so it can be maximized at an interior upper-censorship mechanism.
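The following small Python illustration (ours; it assumes a uniform prior for concreteness) tabulates the c.d.f.s of the posterior mean induced by two upper-censorship cutoffs; the two c.d.f.s are ordered one way below a single crossing point and the other way above it, which is the rotation-order property discussed above.

```python
# A minimal illustration (not from the paper): c.d.f.s of the posterior mean induced by
# upper-censorship mechanisms with different cutoffs, for an assumed uniform state on [0, 1].
import numpy as np

def posterior_mean_cdf(theta, cutoff):
    """C.d.f. of the posterior mean when states below `cutoff` are revealed and
    states above `cutoff` are pooled into their conditional mean (uniform prior)."""
    pooled_mean = (cutoff + 1.0) / 2.0          # E[omega | omega >= cutoff] for U[0,1]
    if theta < cutoff:
        return theta                            # revealed states: posterior mean equals omega
    elif theta < pooled_mean:
        return cutoff                           # atom at the pooled mean not yet reached
    else:
        return 1.0

grid = np.linspace(0, 1, 11)
for w_star in (0.3, 0.7):                       # less vs. more disclosure
    print(w_star, [round(posterior_mean_cdf(t, w_star), 2) for t in grid])
# The cutoff-0.7 c.d.f. lies above the cutoff-0.3 c.d.f. below their crossing point and below it
# above, i.e., more disclosure yields a more dispersed distribution of the posterior mean.
```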

8.2. Cutoff mechanisms and optimal delegation. A mechanism π is a cutoff mechanism if it recommends the receiver to act if and only if ω is above some cutoff ϕ(r) in [0, 1]:

π(ω, r) = 1 if ω ≥ ϕ(r), and π(ω, r) = 0 otherwise.

A cutoff mechanism can be equivalently described as a menu T ⊂ [0, 1] of cutoff tests for the receiver to choose from, such that T = {ϕ(r)}r∈R. A cutoff test t ∈ T is a binary signal that recommends the receiver to act if and only if ω ≥ t.

Upper-censorship and lower-censorship mechanisms are cutoff mechanisms. For example, an upper-censorship mechanism with censorship cutoff ω∗ can be represented as the menu of cutoff tests T = [0, ω∗] ∪ {1}.

Observe that test t = 0 (t = 1) yields the uninformative recommendation to always act (never act), irrespective of the state. Since type r = 0 always acts and type r = 1 never acts, for every incentive-compatible cutoff mechanism the set of cutoff tests T must contain 0 and 1. There are no further restrictions: every T ⊂ [0, 1] such that {0, 1} ⊂ T corresponds to an incentive-compatible cutoff mechanism.

The problem of optimal persuasion on the class of cutoff mechanisms is equivalent to the problem of optimal delegation of a choice from a set T ⊂ [0, 1], with the constraint that T must contain the endpoints, 0 and 1. In this delegation problem, the receiver must pick a cutoff test and then follow the recommendation of the test. The sender restricts the receiver's choice to a subset T ⊂ [0, 1] such that {0, 1} ⊂ T and lets the receiver choose t from T.
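A minimal numerical sketch of this delegation interpretation (ours; it assumes a uniform state distribution and an arbitrary illustrative menu T) shows each type picking its preferred cutoff test from the menu.

```python
# A minimal sketch (not from the paper) of the delegation view of cutoff mechanisms:
# each receiver type r picks from the menu T the cutoff test t maximizing the expected payoff
# of acting on {omega >= t}; with an assumed uniform state, this objective is single-peaked at t = r.
import numpy as np

def test_payoff(t, r, n=10_000):
    """Expected payoff of type r from following cutoff test t when omega ~ U[0,1]."""
    omega = np.linspace(0.0, 1.0, n)
    return np.mean((omega - r) * (omega >= t))

T = [0.0, 0.2, 0.5, 1.0]            # an assumed menu of cutoff tests (it must contain 0 and 1)
for r in (0.05, 0.3, 0.8):
    best = max(T, key=lambda t: test_payoff(t, r))
    print(f"type r={r}: chooses test t={best}")
# With a uniform state, each type simply picks the available test closest to its own type r,
# which is the sense in which the menu delegates the choice of a cutoff to the receiver.
```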

Figure 5. Utility under upper-censorship mechanism.

Note that the restriction to the class of cutoff mechanisms is not without loss of generality. Consider the following example.

Example 1. Let ρ(r) = 0 (so the sender wishes to maximize the probability that the receiver acts). Let the distribution F be uniform and let the distribution G be bimodal, as shown in Fig. 5.

Under these assumptions, G is the value function for the sender: conditional on the posterior expected value of the state m, the receiver acts iff r ≤ m, so the sender's payoff is G(m), the probability that the receiver acts. The sender's unconditional payoff is a convex combination of these values.

The upper bound on the sender's expected payoff, V∗, is given by the concave closure of G evaluated at E[ω] (cf. Kamenica and Gentzkow, 2011). This upper bound can be achieved by the binary public signal that sends message m = 1/3 if ω ∈ (1/12, 7/12) and m = 2/3 otherwise. However, this upper bound cannot be achieved by any cutoff mechanism.
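For completeness, the short Python sketch below (ours, with an assumed bimodal G built from two normal c.d.f.s and a uniform prior) computes the concave-closure upper bound at the prior mean on a grid; it is not the paper's construction, only a numerical illustration of the bound.

```python
# A minimal sketch (not from the paper): the sender's payoff is bounded above by the concave
# closure of G evaluated at E[omega]. We compute this bound on a grid for an assumed bimodal G
# (a mixture of two normal c.d.f.s) and a uniform prior on [0, 1].
import numpy as np
from scipy.stats import norm

def upper_hull(xs, ys):
    """Upper convex hull (concave envelope) of points (xs[i], ys[i]) with xs increasing."""
    hull = []
    for x, y in zip(xs, ys):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (y2 - y1) * (x - x1) <= (y - y1) * (x2 - x1):   # middle point is below the chord
                hull.pop()
            else:
                break
        hull.append((x, y))
    return np.array(hull)

G = lambda m: 0.5 * norm.cdf(m, 0.25, 0.05) + 0.5 * norm.cdf(m, 0.75, 0.05)  # assumed bimodal G
m = np.linspace(0.0, 1.0, 2001)
hull = upper_hull(m, G(m))
cav_at_prior = np.interp(0.5, hull[:, 0], hull[:, 1])   # concave closure at E[omega] = 1/2
print("G(E[omega]) =", round(float(G(0.5)), 3), ", concave closure at E[omega] =", round(float(cav_at_prior), 3))
```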

Appendix A. Structure of Optimal Mechanisms: General Case

In Section 5 we studied optimal persuasion mechanisms under the assumption that the function I changes sign at most once. Here we describe the structure of the optimal mechanism for a more general case.

Assume that I is almost everywhere nonzero and changes sign n times.15 Let r1, r2, . . . , rn be the types at which I changes its sign.

15 On the intervals where I(r) = 0 the sender is indifferent about the choice of U, hence multiple solutions emerge in this case. The characterization of these solutions is straightforward but tedious, and thus omitted.

Figure 6. Optimal utility profile for the case where I(r) changes sign three times.

Theorem 3. Every optimal utility profile U is the convex envelope of the graph of U and a set of at most (n + 1)/2 points. Each point lies between U and U and corresponds to a distinct interval (ri, ri+1) on which I is negative.

Theorem 3 implies that the optimal U is piecewise linear unless U(r) = U(r), with kinks on intervals where I is negative. Fig. 6 illustrates the structure of the optimal utility profile for an example where I(r) (the red curve) changes its sign three times. The optimal utility profile U(r) (the black kinked curve) is the lower contour of the convex hull of points A and B and the graph of U (the solid blue curve). Points A and B are kinks of U(r) and are located over the intervals where I(r) is negative. On the first interval of positive I(r), U(r) is a straight line. On the second interval of positive I(r), U(r) is a straight line that smoothly merges into U(r). The segment of U(r) to the right of the tangency point coincides with U.

Moreover, an optimal U is fully determined by a nonincreasing profile (u1, . . . , un), where each ui defines the value of U at point ri, so ui ∈ [U(ri), U(ri)] for all i = 1, . . . , n. This follows from the two properties below.

Figure 7. Optimal utility profile on an interval where I(r) is positive.

Figure 8. Optimal utility profile on an interval where I(r) is positive.

Clearly, as follows from (10), on any interval (ri, ri+1) where I(r) is positive, optimality requires that U(r) be pointwise maximized subject to feasibility (U ≤ U) and convexity of U. That is, for any given values of U at the boundary points ri and ri+1, the utility profile U on the interior of (ri, ri+1) is a straight line unless U(r) = U(r), as illustrated by Fig. 7 and Fig. 8. Formally:

(P1) On every interval (ri, ri+1) on which I is positive, U is the greatest convex function that passes through the endpoints U(ri) and U(ri+1) and does not exceed U.

Figure 9. Optimal utility profile on an interval where I(r) is negative.

Similarly, on any interval (ri, ri+1) where I(r) is negative, optimality requires that U(r) be pointwise minimized subject to U ≥ U and convexity of U. That is, on the interior of (ri, ri+1) the utility profile U is the minimum convex function that passes through the endpoints U(ri) and U(ri+1). It is the upper envelope of two straight lines whose slopes are the derivatives at the endpoints, U′(ri) and U′(ri+1), as illustrated by Fig. 9. Formally:

(P2) On every interval (ri, ri+1) on which I is negative, U is piecewise linear with at most one kink, and thus must satisfy

U(r) = max{U(ri) + U′(ri)(r − ri), U(ri+1) + U′(ri+1)(r − ri+1)}, r ∈ (ri, ri+1).

Consider a nonincreasing sequence (u1, . . . , un) that satisfies ui ∈ [U(ri), U(ri)] for all i = 1, . . . , n. Let ui determine the value of U at point ri, U(ri) = ui. Observe that U on the interior of every interval (ri, ri+1) is then uniquely determined by properties (P1) and (P2). Thus, an optimal U can be fully described by the profile (u1, . . . , un), and finding an optimal U is an n-variable optimization problem.

Proof of Theorem 3. The proof follows immediately from the necessary conditions (P1) and (P2) that every optimal U must satisfy.


Appendix B. Nonlinear Utility of the Receiver

In this section we discuss the complications of handling the case of a nonlinear utility of the receiver, even for the case of binary actions. We relax the linearity assumption and only assume that the receiver's payoff from acting, u(ω, r), is continuous and strictly monotonic (w.l.o.g., increasing in ω and decreasing in r), and satisfies the normalization u(r, r) = 0 for all r.

Note that if types ω and r are correlated, the analysis below carries over if we impose strict monotonicity on the function ũ(ω, r) = u(ω, r) g(r|ω)/g(r) rather than on u, because the receiver's interim expected payoff under mechanism π can be written as U(r) = ∫_Ω ũ(ω, r) π(ω, r) dF(ω).

B.1. Monotonicity. Monotonicity of u implies the same ordinal preferences over the states by all types of the receiver. Without this assumption, optimal persuasion mechanisms need not be public, as the following example demonstrates.

Example 2. Let ω = (ω1, ω2) ∈ {0, 1} × {0, 1} and r ∈ {r1, r2, r3}. Let all pairs (ω, r) be equally likely. Define u(ω, r1) = ω1 − 1, u(ω, r2) = ω2 − 1, and u(ω, r3) = 3/4 − ω1ω2. Suppose that the sender wishes to maximize the probability that the receiver acts.

Consider the following mechanism:

π(ω, r) = 1 if r = r1 and ω1 = 1, or if r = r2 and ω2 = 1, or if r = r3; and π(ω, r) = 0 otherwise.

Essentially, this mechanism allows the receiver to learn at most one component of ω. Clearly, under this mechanism, it is incentive compatible for the receiver to truthfully report r and follow the recommendation. Thus, the probability that the receiver acts (a = 1) under π is

Prπ(a = 1) = Pr(r = r1) Pr(ω1 = 1) + Pr(r = r2) Pr(ω2 = 1) + Pr(r = r3) = 2/3.

However, the sender cannot induce the receiver to choose a = 1 with probability 2/3 using a public signal. To see this, note first that receiver r1 chooses a = 1 only if he is certain that ω1 = 1, and receiver r2 chooses a = 1 only if he is certain that ω2 = 1. Thus, under any public signal, the probability that the receiver acts cannot exceed 2/3. The only way the sender could achieve this probability would be to reveal both ω1 and ω2, but in that case receiver r3 would not act when (ω1, ω2) = (1, 1).
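A small enumeration (ours, not from the paper) confirms the probability computed above for the private mechanism of Example 2.

```python
# A small enumeration check (not from the paper) of the probability that the receiver acts
# under the private mechanism of Example 2, with all (omega, r) pairs equally likely.
from itertools import product
from fractions import Fraction

states = list(product([0, 1], repeat=2))        # omega = (omega1, omega2)
types = ['r1', 'r2', 'r3']

def pi(omega, r):
    """The mechanism's recommendation to act (1) or not (0)."""
    if r == 'r1':
        return int(omega[0] == 1)
    if r == 'r2':
        return int(omega[1] == 1)
    return 1                                    # r3 is always told to act

prob_act = Fraction(sum(pi(w, r) for w in states for r in types),
                    len(states) * len(types))
print(prob_act)                                 # prints 2/3
```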

B.2. Additive-multiplicative separability. Recall that Theorem 1 only proves equivalence of implementation of the receiver's utility profile under private and public persuasion. To establish equivalence of the sender's payoff, we used two assumptions: that the sender's utility is a function of the receiver's utility and the receiver's action, and that the receiver's action profile is pinned down by her utility profile (Remark 1). The latter property is a consequence of linear u. More generally, this property obtains if and only if u is additively-multiplicatively separable in the sense of Remark 2 (see Proposition 5 below). In other words, when u is not additively-multiplicatively separable, it may be the case that two persuasion mechanisms implement the same utility profile for the receiver, and yet the sender's expected payoff is different.

Proposition 5. Let π1 and π2 be two mechanisms that are distinct for every r but implement the same receiver's utility profile U. The probability of a = 1 is the same for π1 and π2 if and only if there exist functions b, c, and d such that u(ω, r) = b(r) + c(r)d(ω) for every (ω, r).

Proof. Consider two mechanisms π1 and π2 that implement the same utility profile U for the receiver. We have, for each i = 1, 2 and for all r,

∫ u(ω, r) πi(ω, r) dF(ω) = U(r),
∫ (∂u(ω, r)/∂r) πi(ω, r) dF(ω) = U′(r),

where the first line holds by definition of U and the second line is the local incentive compatibility condition.

The expected action, qπi(r) = ∫ πi(ω, r) dF(ω), is the same across i = 1, 2 for all r if and only if the vectors u(ω, r), ∂u(ω, r)/∂r, and 1 are linearly dependent for all r:

γ(r) ∂u(ω, r)/∂r + µ(r) u(ω, r) = 1 for some µ and γ. (14)

Note that γ(r) ≠ 0, because otherwise u(ω, r) would not depend on ω. For every ω, the solution of differential equation (14) is given by

u(ω, r) = exp(−∫ µ(r)/γ(r) dr) [ Const(ω) + ∫ (1/γ(r)) exp(∫ µ(s)/γ(s) ds) dr ],

which completes the proof with b(r), c(r), and d(ω) defined by

(b(r), c(r), d(ω)) = ( exp(−∫ µ(r)/γ(r) dr) ∫ (1/γ(r)) exp(∫ µ(s)/γ(s) ds) dr, exp(−∫ µ(r)/γ(r) dr), Const(ω) ).

B.3. Ex-post equivalence. Theorem 1 establishes equivalence of private and public persuasion mechanisms in terms of the interim expected payoff of the receiver. This definition is in line with recent papers on standard mechanism design with transfers (Manelli and Vincent, 2010 and Gershkov et al., 2013).

A stronger notion of equivalence requires that private and public persuasion mechanisms implement the same ex-post outcomes (actions by the receiver). Notice that equivalence in terms of ex-post outcomes implies equivalence in terms of interim expected payoffs, without any assumption (such as linearity) on the sender's and receiver's payoff functions. Under monotonic u, it is easy to verify whether a given private mechanism π is implementable by a public signal:

Proposition 6. An incentive-compatible persuasion mechanism π is ex-post equivalent to a public signal if and only if π(ω, r) is non-increasing in r for every ω.


Proof. Consider an arbitrary public signal σ. By monotonicity of u, for every message m, if type r acts, then every type r′ < r acts as well. Consequently, we can focus on direct signals where every message m is equal to the type who is indifferent between acting and not acting after receiving this message, Eω[u(ω, m)|m] = 0. Thus, the mechanism16

π(ω, r) = Σm Pr[m ≥ r] σm(ω)

is non-increasing in r for every ω.

Conversely, consider a mechanism π that is non-increasing in r. Then the probability qπ(r) = ∫_0^1 π(ω, r) dF(ω) that the receiver of type r acts is non-increasing in r. The c.d.f. H(r) = 1 − qπ(r) defines a public signal as described in the proof of Theorem 1.

16 We write σm(ω) for the probability of message m conditional on state ω.

Observe that when u is nonlinear, an incentive-compatible mechanism π need not have a monotonic qπ, and hence it may be impossible to construct an equivalent public signal.

B.4. Binary state. If the state is binary, then there is ex-post equivalence between private and public mechanisms under the assumption of monotonic u.17

17 In fact, this result is tight: if the state has at least three values, then there is no ex-post equivalence between private and public mechanisms, even if the receiver's utility is linear.

Proposition 7. Let the support of F consist of {0, 1}. Then every incentive-compatible mechanism π is ex-post equivalent to a public signal.

Proof. It is sufficient to show that π is non-increasing in r for r ∈ (0, 1). The receiver of type r prefers to report r rather than r′ only if

Σ_{ω=0,1} u(ω, r)(π(ω, r) − π(ω, r′)) Pr(ω) ≥ 0. (15)

Writing (15) for (r, r′) = (r2, r1) and (r, r′) = (r1, r2) yields

−(u(0, r2)/u(1, r2)) δ(r2, r1, 0) ≤ δ(r2, r1, 1) ≤ −(u(0, r1)/u(1, r1)) δ(r2, r1, 0), (16)

where δ(r2, r1, ω) ≡ (π(ω, r2) − π(ω, r1)) Pr(ω). Because u(0, r) < 0 and u(1, r) > 0 for r = r1, r2, the monotonicity of u in r implies that

0 < −u(0, r2)/u(1, r2) ≤ −u(0, r1)/u(1, r1) for r2 ≤ r1. (17)

Combining (16) and (17) gives π(ω, r2) ≥ π(ω, r1) for ω = 0, 1, and r2 ≤ r1.

Appendix C. Omitted Proofs

C.1. Proof of Lemma 1. Incentive compatibility (IC) implies truthtelling,

Uπ(r) ≥ Uπ(r, r̃) for all r, r̃ ∈ R. (18)

Also, observe that type r = 0 can secure the maximal attainable payoff E[ω] by always acting (irrespective of the recommendation). On the other hand, the maximal attainable payoff of type r = 1 is zero, which can be secured by never acting. Together with (IC), this implies

Uπ(0) = E[ω] and Uπ(1) = 0. (19)

By the standard envelope argument, (18)–(19) is equivalent to (2)–(4).

It remains to show that (18)–(19) imply (IC). By contradiction, suppose that (18)–(19) hold and there exist r and r̃ such that

Uπ(r) < max_{a0, a1 ∈ {0,1}} Uπ(r, r̃, a0, a1).

By (18) we have Uπ(r, r̃, 0, 1) ≤ Uπ(r). Also, it can be easily verified that Uπ(r, r̃, 0, 1) ≤ Uπ(r, r̃, 1, 0) implies

Uπ(r, r̃, 1, 0) ≤ max{Uπ(r, r̃, 0, 0), Uπ(r, r̃, 1, 1)}.

Thus we are left with

Uπ(r) < max{Uπ(r, r̃, 0, 0), Uπ(r, r̃, 1, 1)}.

Note that a0 = a1 = 1 (a0 = a1 = 0) means that the receiver chooses to act (not to act) regardless of the recommendation, in which case his expected payoff is E[ω] (zero). By (19) this is also the payoff of the obedient receiver who reports r̃ = 0 (r̃ = 1, respectively). Hence

Uπ(r, r̃, 0, 0) = Uπ(r, 0) = E[u(ω)] and Uπ(r, r̃, 1, 1) = Uπ(r, 1) = 0.

But (18) implies Uπ(r) ≥ max{Uπ(r, 0), Uπ(r, 1)}, a contradiction.

C.2. Proof of Lemma 3. We have

Vπ(r) = ∫_Ω (1 + ρ(r) u(ω, r)) π(ω, r) dF(ω) = qπ(r) + ρ(r) Uπ(r).

Since ∫_r^1 qπ(s) ds = Uπ(r) by Lemma 1, integrating by parts yields

∫_R qπ(r) dG(r) = g(0) Uπ(0) + ∫_R g′(r) Uπ(r) dr.

Hence

∫_R Vπ(r) dG(r) = Uπ(0) g(0) + ∫_R Uπ(r) (g′(r) + ρ(r) g(r)) dr.

Substituting C = Uπ(0) g(0) = E[ω] g(0) and I(r) = g′(r) + ρ(r) g(r) yields (9).
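The integration-by-parts step can be checked symbolically; the sketch below (ours, not from the paper) does so for an assumed expected-action profile q and an assumed density g, not for the general case.

```python
# A quick symbolic check (not from the paper) of the integration-by-parts step in the proof of
# Lemma 3, for an assumed expected-action profile q(r) = 1 - r and an assumed density g(r) = 3/2 - r.
import sympy as sp

r, s = sp.symbols('r s', real=True)
q = 1 - r                                   # a nonincreasing expected-action profile (assumed)
U = sp.integrate(q.subs(r, s), (s, r, 1))   # U(r) = integral_r^1 q(s) ds, so U(1) = 0
g = sp.Rational(3, 2) - r                   # an assumed density on [0, 1]

lhs = sp.integrate(q * g, (r, 0, 1))
rhs = g.subs(r, 0) * U.subs(r, 0) + sp.integrate(sp.diff(g, r) * U, (r, 0, 1))
assert sp.simplify(lhs - rhs) == 0
print("integration-by-parts identity verified:", lhs)
```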

C.3. Proof of Proposition 1. Types r < 0 always act and types r > 1 never act; so we omit these types from the analysis. The sender's expected payoff under a public mechanism σ can be written as

Vσ = C + ∫_0^1 Jt(r) dHσ(r), (20)

where C is a constant that does not depend on σ, Hσ is the c.d.f. of posterior values Eσ[ω|m], and

Jt(r) = ∫_0^r (gt(s) + ρ Gt(s)) ds.


Consider ρ and t such that ω∗(ρ, t) ∈ (0, 1). Denote ω∗∗ = E[ω | ω ≥ ω∗]. The derivative of the sender's expected payoff (20) under upper censorship with respect to the cutoff ω∗ is

dV/dω∗ = f(ω∗) ∫_{ω∗}^{ω∗∗} (J′t(ω∗∗) − J′t(s)) ds
       = f(ω∗) [ ∫_{ω∗}^{ω∗∗} (gt(ω∗∗) − gt(s)) ds + ρ ∫_{ω∗}^{ω∗∗} (Gt(ω∗∗) − Gt(s)) ds ].

This derivative is strictly increasing in ρ; so ω∗ is strictly increasing in ρ by Theorem 1 of Edlin and Shannon (1998).

Consider t at which ω∗(ρ, t) ∈ (0, 1). Notice that

J′t(r) = g(r − t) + ρ G(r − t) = J′(r − t);

so

d²V/(dt dω∗) = −f(ω∗) ∫_{ω∗}^{ω∗∗} (J″t(ω∗∗) − J″t(s)) ds
            = −f(ω∗) J″t(ω∗∗)(ω∗∗ − ω∗) + f(ω∗) (J′t(ω∗∗) − J′t(ω∗)).

At an optimal interior cutoff, we have dV/dω∗ = 0; so

J′t(ω∗∗)(ω∗∗ − ω∗) = ∫_{ω∗}^{ω∗∗} J′t(s) ds. (21)

Since J″t(r) is almost everywhere nonzero, J′t is not constant on (ω∗, ω∗∗). Moreover, by Theorem 2, It(r) = J″t(r) crosses the horizontal axis at most once and from above; so J′t(r) is quasiconcave. Therefore, (21) implies that J″t(ω∗∗) < 0 and J′t(ω∗∗) > J′t(ω∗); so d²V/(dt dω∗) > 0 and ω∗ is strictly increasing in t by Theorem 1 of Edlin and Shannon (1998).

C.4. Proof of Proposition 4. Let ω̃n = E[ω | ωn] and notice that E[ω | ωn] is increasing in ωn, because εn is uniformly distributed. Thus, without loss of generality we can assume that newspaper n observes ω̃n and endorses the pro-Government party iff ω̃n > γn. Given a newspaper's signal about the quality differential ω̃n, the voter's expected net payoff from choosing the pro-Government party versus the Opposition party is equal to

u(ω̃n, r) = ω̃n − (β/2)(iG − r)² + (β/2)(iO − r)² = ω̃n − β(iO − iG) r + (β/2)(iO² − iG²).

Define ω̂ = (ω̃n + (β/2)(iO² − iG²)) / (β(iO − iG)) and let

u(ω̂, r) = (1/(β(iO − iG))) u(ω̃n, r) = ω̂ − r.

Let F̂ be the distribution of ω̂. Let γ(r) be the choice of the editorial policy by voter r. Now, consider the auxiliary problem in which the government maximizes V over the set of persuasion mechanisms. Then, by Proposition 2, the optimal persuasion policy is upper censorship. We now show that Γ = [0, ω∗] implements the outcome of the optimal persuasion policy, which establishes the claim of the proposition.

Observe that the voter's payoff from choosing the newspaper with editorial policy γ, net of his full-information payoff U(r), is equal to

v(γ, r) = ∫_γ^r (ω̂ − r) dF̂(ω̂).

If Γ = [0, ω∗], this payoff is maximized by choosing the newspaper with the editorial policy γ = r if r ≤ ω∗ and γ = ω∗ otherwise. It follows that the voter votes for the pro-Government party iff ω̂ ≥ r when r ≤ ω∗; iff ω̂ > ω∗ when r > ω∗ and E[ω̂ | ω̂ > ω∗] ≥ r; and never otherwise. Observe that the same behavior is induced by the optimal persuasion policy of upper censorship at ω∗.
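A short numerical sketch (ours; it assumes a uniform distribution for the normalized quality ω̂) illustrates the voter's newspaper choice described above when Γ = [0, ω∗].

```python
# A minimal numerical sketch (not from the paper): with editorial policies Gamma = [0, w_star]
# available and an assumed uniform distribution for the normalized quality, a voter of type r
# picks the policy gamma maximizing v(gamma, r) = integral_gamma^r (omega - r) dF(omega):
# gamma = r if r <= w_star, and gamma = w_star otherwise.
import numpy as np

def v(gamma, r, n=20_000):
    omega = np.linspace(0.0, 1.0, n)
    sign = 1.0 if r >= gamma else -1.0
    lo, hi = min(gamma, r), max(gamma, r)
    mask = (omega >= lo) & (omega <= hi)
    return sign * np.mean((omega - r) * mask)   # approximates the integral under U[0, 1]

w_star = 0.6
Gamma = np.linspace(0.0, w_star, 301)
for r in (0.3, 0.9):
    best = Gamma[np.argmax([v(g, r) for g in Gamma])]
    print(f"type r={r}: best available editorial policy gamma={best:.2f}")
# Output: the type below the cap picks gamma = r (here 0.30); the type above it picks gamma = w_star.
```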

References

Alonso, R., and O. Camara (2014a): “On the value of persuasion by experts,” Mimeo.
Alonso, R., and O. Camara (2014b): “Persuading skeptics and reaffirming believers,” Mimeo.
Alonso, R., and O. Camara (2014c): “Persuading voters,” Mimeo.
Bagnoli, M., and T. Bergstrom (2005): “Log-concave probability and its applications,” Economic Theory, 26(2), 445–469.
Bergemann, D., and A. Bonatti (2013): “Selling Cookies,” Cowles Foundation Discussion Papers 1920, Cowles Foundation for Research in Economics, Yale University.
Bergemann, D., A. Bonatti, and A. Smolin (2014): “Selling Experiments: Menu Pricing of Information,” working paper.
Bergemann, D., B. Brooks, and S. Morris (2012): “Extremal Information Structures in First Price Auctions,” Cowles Foundation Discussion Papers 1299, Cowles Foundation for Research in Economics, Yale University.
Bergemann, D., B. Brooks, and S. Morris (2013): “The Limits of Price Discrimination,” Cowles Foundation Discussion Papers 1896, Cowles Foundation for Research in Economics, Yale University.
Bergemann, D., and S. Morris (2013a): “Bayes Correlated Equilibrium and Comparison of Information Structures,” mimeo.
Bergemann, D., and S. Morris (2013b): “Robust Predictions in Games With Incomplete Information,” Econometrica, 81(4), 1251–1308.
Bergemann, D., and M. Pesendorfer (2007): “Information structures in optimal auctions,” Journal of Economic Theory, 137, 580–609.
Bergemann, D., and A. Wambach (2013): “Sequential Information Disclosure in Auctions,” Cowles Foundation Discussion Papers 1900, Cowles Foundation for Research in Economics, Yale University.
Blackwell, D. (1953): “Equivalent comparisons of experiments,” Annals of Mathematical Statistics, 24, 265–272.
Boleslavsky, R., and C. Cotton (2015a): “Grading Standards and Education Quality,” American Economic Journal: Microeconomics, forthcoming.
Boleslavsky, R., and C. Cotton (2015b): “Information and Extremism in Elections,” American Economic Journal: Microeconomics, forthcoming.
Chan, J., and W. Suen (2008): “A spatial theory of news consumption and electoral competition,” Review of Economic Studies, 75, 699–728.
Che, Y.-K., W. Dessein, and N. Kartik (2013): “Pandering to Persuade,” American Economic Review, 103(1), 47–79.
Chiang, C.-F., and B. Knight (2011): “Media Bias and Influence: Evidence from Newspaper Endorsements,” Review of Economic Studies, 78(3), 795–820.
Edlin, A. S., and C. Shannon (1998): “Strict Monotonicity in Comparative Statics,” Journal of Economic Theory, 81, 201–219.
Ely, J. (2014): “Beeps,” mimeo.
Ely, J., A. Frankel, and E. Kamenica (2015): “Suspense and Surprise,” Journal of Political Economy, forthcoming.
Eső, P., and B. Szentes (2007): “Optimal Information Disclosure in Auctions and the Handicap Auction,” Review of Economic Studies, 74(3), 705–731.
Gentzkow, M., and E. Kamenica (2012): “Competition in Persuasion,” Discussion paper, University of Chicago.
Gentzkow, M., and E. Kamenica (2014a): “Costly Persuasion,” American Economic Review, 104(5), 457–62.
Gentzkow, M., and E. Kamenica (2014b): “Disclosure of Endogenous Information,” Discussion paper, University of Chicago.
Hagenbach, J., F. Koessler, and E. Perez-Richet (2014): “Certifiable Pre-Play Communication: Full Disclosure,” Econometrica, 83(3), 1093–1131.
Hoffmann, F., R. Inderst, and M. Ottaviani (2014): “Persuasion through Selective Disclosure: Implications for Marketing, Campaigning, and Privacy Regulation,” Mimeo.
Hörner, J., and A. Skrzypacz (2009): “Selling Information,” Cowles Foundation Discussion Papers 1743, Cowles Foundation for Research in Economics, Yale University.
Johnson, J. P., and D. P. Myatt (2006): “On the Simple Economics of Advertising, Marketing, and Product Design,” American Economic Review, 96(3), 756–784.
Kamenica, E., and M. Gentzkow (2011): “Bayesian Persuasion,” American Economic Review, 101(6), 2590–2615.
Koessler, F., and R. Renault (2012): “When does a firm disclose product information?,” RAND Journal of Economics, 43(4), 630–649.
Kolotilin, A. (2014): “Optimal Information Disclosure: Quantity vs. Quality,” mimeo.

Kolotilin, A. (2015): “Experimental Design to Persuade,” Games and Economic Behavior, forthcoming.
Krähmer, D., and T. Mylovanov (2014): “Selling Information,” working paper.
Kremer, I., Y. Mansour, and M. Perry (2014): “Implementing the ‘Wisdom of the Crowd’,” Journal of Political Economy, 122(5), 988–1012.
Li, H., and X. Shi (2015): “Discriminatory Information Disclosure,” mimeo.
Lizzeri, A. (1999): “Information Revelation and Certification Intermediaries,” RAND Journal of Economics, 30(2), 214–231.
Mirrlees, J. A. (1971): “An Exploration in the Theory of Optimum Income Taxation,” Review of Economic Studies, 38, 175–208.
Perez-Richet, E. (2014): “Interim Bayesian Persuasion: First Steps,” American Economic Review, 104(5), 469–74.
Perez-Richet, E., and D. Prady (2012): “Complicating to Persuade?,” Working papers.
Rayo, L., and I. Segal (2010): “Optimal Information Disclosure,” Journal of Political Economy, 118(5), 949–987.
Taneva, I. (2014): “Information design,” mimeo.
Wang, Y. (2013): “Bayesian Persuasion with Multiple Receivers,” Mimeo.