Perfect communication equilibria in repeated games
with imperfect monitoring
Tristan Tomala∗
HEC Paris, Economics and Finance Department
December 19, 2007
Abstract
This paper introduces an equilibrium concept called perfect communication equi-
librium for repeated games with imperfect private monitoring. This concept is a
selection of Myerson’s ([25]) communication equilibrium. A communication equilib-
rium is perfect if it induces a communication equilibrium of the continuation game,
after every history of messages of the mediator. We provide a characterization of the
set of corresponding equilibrium payoffs and derive a Folk Theorem for discounted
repeated games with imperfect private monitoring.
Keywords: Repeated Games, Imperfect monitoring, Communication equilibria.
∗The author thanks Galit Ashkenazi, Eduardo Faingold, Olivier Gossner, Eilon Solan and Satoru Takahashi for helpful
discussions and comments. Two anonymous referees are also gratefully acknowledged for their help in improving the exposition of the paper.
1 Introduction
The central result in the theory of repeated games is the Folk Theorem which states
that when players perfectly observe their opponents’ actions and are patient enough, every
feasible and individually rational payoff vector can be sustained by an equilibrium. The
equilibrium construction is well known: a contract specifies the actions that should be
played by each player at each stage of the game, and in case of unilateral deviation from
the contract, the deviation is publicly observed and the deviating player’s payoff is pushed
down to his individually rational level.
However, in many economic situations, actions are not directly observed. In Stigler
[32], two firms are engaged in a repeated price competition and each firm observes its own
sales, but not the price set by the opponent. A low sales level may result either from
secret price cutting by the rival or from demand shocks. In Green and Porter [16], firms
compete over quantities, and observe the market price but not the quantities produced by
the opponents. In Radner’s ([28]) partnership games, each player chooses an effort level
and receives a payoff contingent on a random output level which depends on all effort
levels. Players do not observe each other’s efforts but only the output level.
Such interactions are modelled by games with imperfect monitoring where players get
signals that partially reflect their opponents' behavior. The main problems in the theory
of repeated games with imperfect monitoring are to characterize the set of equilibrium
payoffs and to give conditions under which feasible and individually rational payoffs are
equilibrium payoffs.
Sharp characterizations of equilibrium payoffs are obtained for games with public
signals and perfect public equilibria, i.e. equilibria that rely on public signals only. Such equilibria
possess a recursive structure similar to dynamic programming models, see Abreu et al. [1]
who characterize the set of payoffs associated to perfect public equilibria of the discounted
game as the largest fixed point of some correspondence. The analysis of the asymptotic
behavior of this set as the discount factor grows is based on these recursive methods and
is due to Fudenberg and Levine [12] (see [14] for the most general result to date in this
direction). Fudenberg et al. [13] derive a Folk Theorem for games with public monitoring
and perfect public equilibria.
Little is known about discounted repeated games with imperfect private monitoring.
The prominent solution concept is the sequential equilibrium (Kreps and Wilson [20])
which lacks a tractable recursive structure1 and is therefore much harder to analyze (see
[24] for a survey).
A natural way to circumvent the difficulty is to allow for some kind of communica-
tion between players. First, it seems reasonable to assume that players may exchange
messages freely between game stages, and further, the repeated game extended by com-
munication hopefully possesses a more tractable structure. Ben-Porath and Kahneman
[4], [5], Compte [6] and Kandori and Matsushima [18], consider games extended by public
communication where players make public announcements between game stages. Kandori
and Matsushima show that a recursive analysis may be based on the profile of public
announcements. They identify a class of sequential equilibria of the repeated game with
public communication that displays recursive properties analogous to those of perfect
public equilibria. They characterize the corresponding set of equilibrium payoffs and derive a
Folk Theorem which generalizes the result of [13] when specialized to games with public
monitoring.
However, assuming that all communication is public is rather restrictive. For instance,
this precludes pairwise exchange of private messages, like phone calls or e-mails. To take
into account all possible communication methods, we resort to the concept of communication
equilibrium (Myerson [25], [26], Forges [10]). The repeated game is extended by
1 Amarante [2] provides a recursive structure for repeated games with imperfect private monitoring. However, the relevant state variable at stage t is the distribution of past histories, which lies in a set of increasing dimension. Gossner and Tomala [17] show that, at least for the restrictive problem of computing the individually rational level, the entropy of the distribution of past histories is a sufficient statistic.
allowing players to communicate privately with a mediator between game stages. The me-
diator is a non-strategic agent who does not observe the outcomes of the game and is only
informed of the messages it exchanges with the players. A communication equilibrium is
given by two objects: a communication device (a rule that specifies how the mediator
selects new messages given past messages), and a Nash equilibrium of the extended game.
A useful feature of this concept is that a “revelation principle” applies ([26]). Every
communication equilibrium is payoff equivalent to a canonical communication equilibrium
where the messages sent by the mediator to the players are actions that the mediator rec-
ommends to play, the messages sent by a player to the mediator are observed signals (the
device is then called canonical), and the equilibrium strategies are faithful: each player
plays the actions recommended by the mediator and reports the signal actually observed.
This representation of communication equilibria has been fruitfully exploited by Renault
and Tomala [31], who give a characterization of communication equilibrium payoffs which
is valid for every undiscounted repeated game with imperfect monitoring.
The goal of the present paper is to study communication equilibria for discounted
games. The achievements are of two kinds. First, we introduce a natural notion of per-
fectness for canonical communication equilibria. We say that a canonical communication
device is a perfect communication equilibrium if, after every history of messages, the con-
tinuation communication device is a canonical communication equilibrium. For such a
communication device, the profile of faithful strategies is a sequential equilibrium of the
(extended) repeated game. We show that this equilibrium concept possesses recursive
properties and that the analyses of [1] and [12] extend. Second, we derive sufficient
conditions to get a Folk Theorem for perfect communication equilibria. The conditions
we give are weaker than those of [18].
The paper is organized as follows. The model is given in section 2. Section 3 gives the
characterizations of equilibrium payoffs. Section 4 gives the Folk Theorem. There, we
discuss our conditions at length and compare them with those in the literature. We also study
a class of monitoring structures for which our conditions are strictly weaker than those
of [18]. We give concluding remarks in section 5. The proof of the Folk Theorem is in
section 6.
2 Model
2.1 The repeated game
The data of the repeated game are as follows. There is a finite set of players {1, . . . , n};
each player i has a finite set of actions A^i, a finite set of signals Y^i and a payoff function
g^i : ∏_j A^j → R. Players' observation of each other's actions is given by a transition
probability q from the set of action profiles to the set of distributions over profiles of
signals.
The game is repeated over and over. At each stage t = 1, 2, . . . , each player i = 1, . . . , n
chooses an action a^i_t from his set of actions; if a_t = (a^1_t, . . . , a^n_t) is the action profile
played, a profile of signals y_t = (y^1_t, . . . , y^n_t) is selected with probability q(y_t|a_t). Each
player i observes y^i_t and the game goes to stage t + 1. If (a_t)_{t≥1} is the sequence of action
profiles played, the discounted payoff of player i is:

∑_{t≥1} (1 − δ) δ^{t−1} g^i(a_t)

where δ < 1 is a discount factor common to all players.
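To fix ideas, this normalized discounted sum is straightforward to compute from a (truncated) stream of stage payoffs. The sketch below uses a hypothetical constant payoff stream, not data from the paper:

```python
def discounted_payoff(stage_payoffs, delta):
    """Normalized discounted sum (1 - delta) * sum_{t>=1} delta^(t-1) * g(a_t).

    `stage_payoffs` lists g^i(a_t) for t = 1, 2, ...; enumerate starts at 0,
    which matches the exponent t - 1."""
    return (1 - delta) * sum(delta ** t * g for t, g in enumerate(stage_payoffs))

# A constant stream of stage payoff 1, truncated after T stages, has
# normalized discounted value 1 - delta^T, hence close to 1 for large T.
print(round(discounted_payoff([1.0] * 2000, 0.9), 6))
```

The normalization by (1 − δ) is what makes a constant stream of payoff g evaluate to g, so discounted payoffs live on the same scale as stage payoffs.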
Throughout the paper, we use the following notations. If (E^i)_{i=1,...,n} is a collection
of sets, e^{−i} denotes an element of ∏_{j≠i} E^j. A profile e ∈ ∏_j E^j is denoted e = (e^i, e^{−i})
when the i-th component is stressed. If E is a finite set, ∆(E) denotes the set of probability
distributions over E.
2.2 Communication equilibria
A communication equilibrium of a repeated game (see Myerson [25], [26] and Forges
[10]) is an equilibrium of an extended repeated game where players communicate privately
with a mediator between stages. The mediator does not observe the history of the game
and is only informed of the private messages it sends to the players and receives from
them. The mediator has no preferences over plays and cannot sign binding contracts with
the players. The set of communication equilibria is obtained by considering all possible
specifications of the mediator’s behavior, that is the way the mediator produces new
messages according to its history of messages, and all Nash equilibria of the corresponding
extended repeated games. Nash equilibria, normal form correlated equilibria (Aumann,
[3]) and extensive form correlated equilibria (Forges, [10]) are particular communication
equilibria. This concept covers most communication technologies, for instance repeated
games extended by public communication where each player makes a public announcement
after each stage (see e.g. Ben-Porath and Kahneman [4], [5], Compte [6], Kandori and
Matsushima [18]).
The well known revelation principle (Myerson, [26]) states that every communication
equilibrium is payoff equivalent2 to a canonical communication equilibrium where: i) be-
fore each stage, the mediator recommends privately an action to each player, ii) after each
stage, each player privately reports a signal to the mediator, iii) the equilibrium strategies
prescribe each player to play the recommended actions and to report the observed signals.
Now we give a formal description of canonical communication equilibria.
Canonical communication device   The mediator selects at each stage a profile of
recommended actions and observes a profile of reported signals. A history of length t for
the mediator is thus an element of H^m_t = (A × Y)^t. A canonical communication device is
2 The equivalence is even stronger: for every communication equilibrium, there exists a canonical communication equilibrium such that both induce the same probability distribution over plays of the game.
a mapping c : ∪_{t≥1} H^m_t → ∆(A) which defines a lottery on recommended action profiles,
after every history.
Strategies   In the repeated game extended by such a device, the set of histories of length
t for player i is H^i_t = (A^i × A^i × Y^i × Y^i)^t; that is, a history for player i consists of the
actions recommended to player i by the mediator, the actions played by player i, the
signals observed by player i and the signals reported to the mediator up to stage t. A
behavior strategy for player i defines a lottery on actions, after each history and each new
recommendation. It also defines a lottery on reported signals, after each history, each
recommendation and each observed signal. In other words, after each history, player i
chooses at random an action rule, that is a mapping α^i : A^i → A^i that associates the
action played α^i(a^i) to the recommended action a^i, and a reporting rule, that is a mapping
ρ^i : A^i × Y^i → Y^i that associates the reported signal ρ^i(a^i, y^i) to the recommended action
a^i and the observed signal y^i. We call such a pair of mappings (α^i, ρ^i) a decision rule and
let D^i be the set of all decision rules. A behavior strategy for player i is then a mapping
σ^i : ∪_{t≥1} H^i_t → ∆(D^i). We define the faithful decision rule ϕ^i as the one that plays
recommended actions, i.e. the action rule is the identity on A^i, and reports the correct
signals, i.e. the reporting rule is the projection on Y^i. We call the faithful strategy ϕ^i of
player i the one that selects the faithful decision rule after every history.
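Since action and signal sets are finite, a decision rule is just a pair of finite lookup tables, which the following sketch makes concrete (the action and signal labels are hypothetical, chosen for illustration only):

```python
# A decision rule for player i is a pair (alpha, rho): alpha maps each
# recommended action to the action actually played, and rho maps each
# (recommendation, observed signal) pair to the reported signal.
A_i = ["L", "R"]   # hypothetical action set A^i
Y_i = ["g", "b"]   # hypothetical signal set Y^i

# The faithful decision rule: alpha is the identity on A^i,
# rho is the projection on Y^i (report what was observed).
faithful = ({a: a for a in A_i},
            {(a, y): y for a in A_i for y in Y_i})

# A deviating rule: always play "R", but still report signals truthfully.
deviation = ({a: "R" for a in A_i},
             {(a, y): y for a in A_i for y in Y_i})

alpha, rho = faithful
print(alpha["L"], rho[("R", "b")])
```

With |A^i| actions and |Y^i| signals, player i has |A^i|^{|A^i|} · |Y^i|^{|A^i|·|Y^i|} pure decision rules, which is why the set D^i is finite and deviation checks reduce to finite enumeration.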
Definition 2.1 The canonical communication device c is a canonical communication
equilibrium if the profile of faithful strategies is a Nash equilibrium of the repeated game
extended by c.
2.3 Perfect communication equilibria
The equilibrium concept studied in this paper is a refinement of communication equi-
libria. For each history h of the mediator, we let c(·|h) be the canonical communication
device induced by c after history h, i.e. if h′ is a history for the mediator, c(h′|h) = c(hh′)
where hh′ is h followed by h′.
Definition 2.2 A canonical communication device c is a perfect communication equilib-
rium if for every history h of the mediator, c(·|h) is a canonical communication equilib-
rium.
If the communication device is perfect, the faithful strategy profile is a sequential
equilibrium ([20]) of the extended game. If player i had more information, that is if he
observed the history of the mediator, the faithful strategy would be a best-reply in the
continuation game following every history. Thus, for every possible belief that player i
may hold on the histories of the mediator, the faithful strategy is a best-reply in each
continuation game. The faithful strategy profile is thus a belief-free equilibrium of the
extended game (see [27], [8], [7]).
3 Characterizations
We let C(δ) be the set of payoff vectors associated to perfect communication equi-
libria of the game with discount factor δ. This set is non-empty: if c(h) is a correlated
equilibrium of the stage game for every history h of the mediator, then c is obviously a
perfect communication equilibrium. Perfect communication equilibria possess a recursive
structure similar to that of perfect public equilibria ([1], [12]), which allows us to derive a
characterization of the limit set of payoffs as the discount factor goes to one.
3.1 The recursive structure
Given a correlated distribution of actions p ∈ ∆(A), a mapping f = (f^i)_i : A × Y → R^n
and a discount factor δ < 1, we let Γ(p, f, δ) be the game where:
• At a first round, the mediator selects a ∈ A according to p and informs privately
each player i of a^i.
• At a second round, player i chooses b^i ∈ A^i, choices being simultaneous. A profile
of signals z is drawn according to q(·|b) and player i observes z^i.
• At the final round, player i reports privately a signal y^i to the mediator.
The payoff for player i is (1 − δ) g^i(b) + δ f^i(a, y).
This game represents the interaction at a given stage of the repeated game, where the
mediator selects an action profile according to p, and f^i(a, y) is the continuation payoff of
player i if a is the profile of recommended actions and y is the profile of reported signals.
Definition 3.1 The pair (p, f) is δ-balanced if the faithful strategy profile is a Nash
equilibrium of Γ(p, f, δ).
We let v_δ(p, f) = (1 − δ) g(p) + δ ∑_{a,y} p(a) q(y|a) f(a, y) be the payoff vector induced
by the faithful strategy profile in Γ(p, f, δ). Given a set of payoff vectors W ⊂ R^n, we
say that a payoff vector v is decomposable with respect to W and δ if there exists a
δ-balanced pair (p, f) such that f(a, y) ∈ W for all profiles (a, y) and v = v_δ(p, f). The
set of payoffs which are decomposable with respect to W and δ is denoted F_δ(W). A set
W is self-decomposable with respect to δ if W ⊂ F_δ(W). The recursive characterization
of the set of perfect communication equilibrium payoffs is the following:
Theorem 3.2  C(δ) is the largest (for inclusion) bounded set which is self-decomposable
with respect to δ, i.e. C(δ) ⊂ F_δ(C(δ)), and for every bounded W, W ⊂ F_δ(W) implies
W ⊂ C(δ).
Proof. The proof of this result follows from similar characterizations for perfect public
equilibria (see [1], [12], [13], [14]). Actually, perfect communication equilibrium payoffs
can be obtained as perfect public equilibrium payoffs of a modification of the repeated
game extended by a canonical communication device, where the mediator publicly reveals
his information (recommended actions and reported signals) at the end of each stage. We
consider this modified game as an (n+1)-player repeated game (the mediator has payoff
zero), where the final announcement of the mediator is a public signal. Then, the
communication device c is a perfect communication equilibrium if and only if, in the
modified game, the strategy profile (c, ϕ^1, . . . , ϕ^n) is a perfect public equilibrium. This
follows from the definition of perfect communication equilibrium: the faithful strategy is
a best-reply for each player in the game where he knows the history of the mediator. The
result then follows from [1]'s characterization of perfect public equilibria (Theorem 1 page
1047 and Theorem 2 page 1049). Note that C(δ) is not the set of all perfect equilibrium
payoffs of the modified game but only a subset, as the first n players are constrained to
use the faithful strategies at equilibrium. The proof adapts without any difficulty. □
3.2 Characterization of the limit set
Now, we give a characterization of lim_{δ→1} C(δ). We apply the method of [12],
generalized by [14].
Definition 3.3  Let λ be a vector in R^n and (p, f) be a δ-balanced pair. This pair is
λ-directed if:

λ · v_δ(p, f) ≥ λ · f(a, y),  ∀(a, y) ∈ A × Y

where u · v is the inner product in R^n.
Since v_δ(p, f) = (1 − δ) g(p) + δ ∑_{a,y} p(a) q(y|a) f(a, y), the above inequalities imply
that λ · v_δ(p, f) ≤ (1 − δ) λ · g(p) + δ λ · v_δ(p, f), thus λ · v_δ(p, f) ≤ λ · g(p). Therefore
λ · g(p) ≥ λ · v_δ(p, f) ≥ λ · f(a, y) for all (a, y) ∈ A × Y, i.e. the current payoff vector is
separated from the continuation payoffs by the hyperplane {v | λ · v = λ · v_δ(p, f)}.
Definition 3.4  The maximal score in direction λ is:

k_δ(λ) = max {λ · v_δ(p, f) | (p, f) δ-balanced and λ-directed}

The maximal score k_δ(λ) does not depend on the discount factor3 and is denoted k(λ).
We denote by C∗ the convex set of payoff vectors v such that for each λ ∈ R^n, λ · v ≤ k(λ).
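Since C∗ is an intersection of half-spaces, membership is tested direction by direction. A minimal sketch, assuming the scores k(λ) have already been computed for finitely many sampled directions (the numerical scores below are hypothetical, describing a unit square for a two-player game):

```python
def in_C_star(v, scores, tol=1e-9):
    """Check lambda . v <= k(lambda) for every sampled direction lambda.

    `scores` maps direction tuples to the maximal score k(lambda); with
    finitely many sampled directions this tests an outer approximation
    of C*, which is the intersection of all such half-spaces."""
    return all(sum(l * x for l, x in zip(lam, v)) <= k + tol
               for lam, k in scores.items())

# Hypothetical scores in the four coordinate directions, so that the
# resulting polytope is the unit square [0, 1]^2.
scores = {(1, 0): 1.0, (-1, 0): 0.0, (0, 1): 1.0, (0, -1): 0.0}
print(in_C_star((0.5, 0.5), scores), in_C_star((1.5, 0.5), scores))
```

Refining the sampling of directions λ shrinks the outer approximation toward C∗ itself, since C∗ is convex and equal to the intersection over all directions.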
Theorem 3.5 (1) For each discount factor δ, C(δ) ⊂ C∗.
(2) If C∗ has full dimension, lim_{δ→1} C(δ) = C∗.
Proof. This follows directly from the main theorem of [14] which characterizes the asymp-
totic set of payoffs associated to perfect public equilibria with the additional constraint
that at equilibrium, the stage mixed actions belong to a fixed set of distributions. We
apply this theorem to the auxiliary game introduced in the proof of Theorem 3.2 above,
with the constraint that at equilibrium, all players but the mediator should play a single
pure strategy (the faithful decision rule). Under the assumption that the dimension of C∗
is n, the algorithm used in [14] to find the dimension of the limit set stops at the first
iteration4. □
4 A Folk Theorem
The characterization of the limit set of perfect communication equilibrium payoffs
allows us to derive a Folk Theorem, that is, to give sufficient conditions under which the
set C∗ coincides with the set of feasible and individually rational payoffs.
3 The proof of this fact is a word-to-word extension of Lemma 3.1 i) in [12].
4 We do not use the full force of the main theorem of [14]. Indeed, their algorithm can be used to give a characterization of lim_{δ→1} C(δ) without the full-dimensionality assumption. We keep this assumption for simplicity.
4.1 Feasible and individually rational payoffs
Given a probability distribution p on A, the expected payoff to player i is g^i(p) =
∑_a p(a) g^i(a) and the payoff vector yielded by p is g(p) = (g^1(p), . . . , g^n(p)). The set of
feasible payoffs (the feasible set) is the set of such vectors as p varies: V = {g(p) | p ∈ ∆(A)},
i.e. V is the convex hull of the payoff vectors associated to pure action profiles.
The individually rational level of player i is the harshest punishment that players −i
can inflict on player i. Using the mediator as a correlation device, players −i may punish
player i in correlated strategies. The correlated minmax level of player i is:

m^i = min_{p^{−i} ∈ ∆(A^{−i})} max_{a^i} g^i(a^i, p^{−i})
A correlated distribution p ∈ ∆(A) is a minmax distribution for player i if for each action
a^i such that p(a^i) > 0, the conditional distribution p(·|a^i) on A^{−i} achieves the min in the
definition of m^i and a^i is a best reply to p(·|a^i).
We let IR = {v ∈ R^n | ∀i, v^i ≥ m^i} be the set of individually rational payoffs and
denote V∗ = V ∩ IR.
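The correlated minmax m^i is the value of a min-max over a simplex; as a sketch, when the punishing players have only two joint profiles, the distribution p^{−i} is one-dimensional and a grid search suffices. The matching-pennies payoffs below are illustrative, not taken from the paper:

```python
def minmax_level(payoff, my_actions, opp_profiles, grid=1000):
    """Approximate m^i = min over p^{-i} of max over a^i of g^i(a^i, p^{-i}),
    in the special case of exactly two opponent profiles, so a correlated
    distribution p^{-i} is parameterized by a single probability t on a grid.

    `payoff[(a_i, b)]` is g^i when player i plays a_i and the others play b."""
    best = float("inf")
    for k in range(grid + 1):
        t = k / grid
        val = max(t * payoff[(ai, opp_profiles[0])]
                  + (1 - t) * payoff[(ai, opp_profiles[1])]
                  for ai in my_actions)
        best = min(best, val)
    return best

# Matching pennies: the deviator wants to match; the punisher mixes
# 1/2-1/2 to hold him to his minmax level of 0.
g = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}
print(round(minmax_level(g, ["H", "T"], ["H", "T"]), 6))
```

In general the inner max is over finitely many linear functions of p^{−i}, so m^i is the value of a linear program; the grid search here is only a one-dimensional stand-in for that LP.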
4.2 Enforceability
Before giving sufficient conditions for the Folk Theorem, we need the key notion of
enforceability (we adapt it from [13]).
Definition 4.1  A correlated distribution p ∈ ∆(A) is enforceable if there exist continuation
payoffs f and a discount factor δ < 1 such that (p, f) is δ-balanced.
For every perfect communication equilibrium c and every history h for the mediator,
c(h) is enforceable. Let f^i_h(a, y) be the continuation payoff for player i given that after h,
the mediator recommends the action profile a and observes the profile of reported signals
y. From the definition of perfect communication equilibria, the faithful strategy profile is
a Nash equilibrium of Γ(c(h), f_h, δ).
Enforceability is related to detectable deviations, that is unilateral deviations that
change the reported signals. To formalize this relationship, we introduce some notations.
Given a probability distribution p ∈ ∆(A) and a decision rule d^i = (α^i, ρ^i) for player i,
we let g^i(p, d^i) be the payoff of player i when the mediator selects recommended actions
according to p and player i plays according to d^i while the other players play faithfully:

g^i(p, d^i) = ∑_a p(a) g^i(α^i(a^i), a^{−i}).

We let Q_Y(a, d^i) be the probability distribution induced on reported signals by the
recommended action profile a and the decision rule d^i:

Q_Y(a, d^i)(y) = ∑_{z^i : ρ^i(a^i, z^i) = y^i} q(z^i, y^{−i} | α^i(a^i), a^{−i}).
The probability distribution on the set of profiles of recommended actions and reported
signals induced by p and d^i is Q(p, d^i)(a, y) = p(a) Q_Y(a, d^i)(y). When d^i is the faithful
decision rule ϕ^i, we denote Q_Y(a, ϕ^i) by Q_Y(a) and Q(p, ϕ^i) by Q(p). These definitions
extend to mixed decision rules in the standard way.
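These distributions are mechanical to compute once q is tabulated. A minimal sketch, in which the monitoring structure and the action and signal labels are hypothetical and profiles are tuples indexed by player:

```python
from collections import defaultdict

def Q_Y(q, a, d, i):
    """Distribution of reported signal profiles when the mediator recommends
    profile `a`, player i applies the decision rule d = (alpha, rho), and all
    other players are faithful.

    `q` maps an action profile to a dict from signal profiles to
    probabilities. Player i plays alpha[a^i]; each realized signal z^i is
    relabeled to the report rho[(a^i, z^i)], and probabilities are summed."""
    alpha, rho = d
    played = tuple(alpha[x] if j == i else x for j, x in enumerate(a))
    out = defaultdict(float)
    for z, prob in q[played].items():
        reported = tuple(rho[(a[i], x)] if j == i else x
                         for j, x in enumerate(z))
        out[reported] += prob
    return dict(out)

# Two players, one action each, binary signals drawn i.i.d. uniform
# (a hypothetical monitoring structure).
q = {("C", "C"): {(s1, s2): 0.25 for s1 in "gb" for s2 in "gb"}}
faithful = ({"C": "C"}, {("C", s): s for s in "gb"})
always_g = ({"C": "C"}, {("C", s): "g" for s in "gb"})  # misreport: always "g"

print(Q_Y(q, ("C", "C"), faithful, 0) == q[("C", "C")])
print(Q_Y(q, ("C", "C"), always_g, 0)[("g", "g")])
```

As expected, the faithful rule reproduces q(·|a) exactly, while the misreporting rule piles probability 1/2 on reports of the form ("g", ·).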
Given that recommended actions are distributed according to p, a decision rule d^i
is an undetectable deviation if for each a in the support of p, Q_Y(a, d^i) = Q_Y(a): the
distribution of reported signals is not affected by the deviation. Enforceable distributions
are characterized as follows:
Proposition 4.2  The distribution p is enforceable if and only if it is immune to
undetectable deviations, that is, for every player i and every decision rule d^i:

Q(p, d^i) = Q(p) =⇒ g^i(p, d^i) ≤ g^i(p).
Proof. Let δ be a discount factor and f : A × Y → R^n. The pair (p, f) is δ-balanced if
and only if for each player i and decision rule d^i:

(1 − δ) g^i(p) + δ Q(p) · f^i ≥ (1 − δ) g^i(p, d^i) + δ Q(p, d^i) · f^i

where Q(p, d^i) · f^i denotes ∑_{a,y} Q(p, d^i)(a, y) f^i(a, y). That is, (p, f) is δ-balanced if and
only if for each player i, f^i is a solution of the following system of inequalities:

(Q(p) − Q(p, d^i)) · f^i ≥ ((1 − δ)/δ) (g^i(p, d^i) − g^i(p))  (∀d^i)   (1)

We apply the alternative theorem (see e.g. Ky Fan [9] or Rockafellar [29], Theorem 22.1,
page 198): a linear system of inequalities Mx ≥ c has a solution if and only if (β ≥ 0
and βM = 0 imply β · c ≤ 0), where ≥ means component-wise weak inequality. Letting
M be the matrix with row vectors (Q(p) − Q(p, d^i)), d^i ∈ D^i, and c be the column vector
with d^i-component ((1 − δ)/δ)(g^i(p, d^i) − g^i(p)), the system (1) has a solution f^i if and
only if p is immune to undetectable deviations by player i, which gives the desired result. □
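The criterion of Proposition 4.2 is easy to check once the pairs (Q(p, d^i), g^i(p, d^i)) have been tabulated over decision rules. A minimal sketch on hypothetical toy data, with distributions encoded as dicts:

```python
def immune_to_undetectable_deviations(rules):
    """Proposition 4.2's criterion, with all distributions precomputed.

    `rules` is a list of (Q, g) pairs: the distribution Q(p, d) over
    (recommendation, report) profiles and the deviator's payoff g^i(p, d)
    under each decision rule d; the first entry is the faithful rule.
    Detectable deviations (Q != Q(p)) impose no constraint here, since
    balanced continuation payoffs can punish them; undetectable deviations
    must not be strictly profitable."""
    Q0, g0 = rules[0]
    return all(g <= g0 + 1e-12 for Q, g in rules[1:] if Q == Q0)

# Hypothetical toy data: a detectable deviation may be profitable,
# an undetectable one may not.
faithful = ({("a", "y"): 1.0}, 2.0)       # Q(p), g^i(p)
detectable = ({("a", "z"): 1.0}, 5.0)     # changes reports: no constraint
undetectable = ({("a", "y"): 1.0}, 3.0)   # same reports, strictly profitable

print(immune_to_undetectable_deviations([faithful, detectable]))
print(immune_to_undetectable_deviations([faithful, detectable, undetectable]))
```

This mirrors the role of the alternative theorem in the proof: the system (1) is feasible exactly when no convex combination of deviations reproduces Q(p) while raising the deviator's payoff.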
4.3 The Folk Theorem
Now we give sufficient conditions to get a Folk Theorem, i.e. C∗ = V ∗. The main
conditions require that individual deviations by players are detected and identified by the
mediator. That is, whenever a player deviates from the faithful strategy, either by playing
an action other than the one recommended or by reporting a signal other than the one
observed, the distribution of the profile of reported signals is affected (the deviation is
detected). Further, two deviations by different players induce different distributions of
reported signals (the deviation is identified).
A detectability condition   Our first condition requires that the distributions of signals
induced by unilateral deviations differ from those induced by the faithful strategies.
The condition involves pairs of players. For every pair of players (i, j), call a composite
decision a probability distribution over D^i ∪ D^j, i.e. a tuple (t, µ^i, µ^j) where t ∈ [0, 1] is
the probability of D^i, µ^i is the conditional distribution on D^i and µ^j is the conditional
distribution on D^j. A composite decision is faithful if t µ^i(ϕ^i) + (1 − t) µ^j(ϕ^j) = 1. A
tuple (t, µ^i, µ^j) which is not faithful is a composite deviation. The following condition
requires that every composite deviation alters the distribution of reported signals.
Condition C1. For each pair of players i, j, for all mixed decision rules µ^i ∈ ∆(D^i),
µ^j ∈ ∆(D^j) and t ∈ [0, 1],

[t µ^i(ϕ^i) + (1 − t) µ^j(ϕ^j) < 1] =⇒ [∃a ∈ A, Q_Y(a) ≠ t Q_Y(a, µ^i) + (1 − t) Q_Y(a, µ^j)].   (2)
The interpretation is as follows. If player i [resp. player j] is selected with proba-
bility t [resp. 1 − t] and then plays µi [resp. µj], then unless each of them plays the
faithful strategy, there exists a profile of recommended actions allowing the mediator to
statistically detect the deviation.
Remark that this implies that for every player i and every non-faithful decision rule
d^i ≠ ϕ^i, there exists a profile a such that Q_Y(a) ≠ Q_Y(a, d^i). Since Q(p, d^i)(a, y) =
p(a) Q_Y(a, d^i)(y), if p has full support, then Q(p, d^i) ≠ Q(p). In view of Proposition 4.2,
this implies:
Lemma 4.3 Under (C1), every distribution of actions with full support is enforceable.
An identifiability condition The second condition requires that deviations by differ-
ent players induce different distributions of reported signals.
Condition C2. For every pair of players i ≠ j, for all mixed decision rules µ^i ∈ ∆(D^i),
µ^j ∈ ∆(D^j),

[∃a ∈ A, Q_Y(a, µ^i) ≠ Q_Y(a) or Q_Y(a, µ^j) ≠ Q_Y(a)] =⇒ [∃a ∈ A, Q_Y(a, µ^i) ≠ Q_Y(a, µ^j)].
This condition says that if either player i or player j deviates in such a way that
the mediator, recommending a, detects the deviation, then the mediator attributes the
deviation to the correct player.
Additional conditions We add two technical conditions.
Condition C3. For each player i, there exists an enforceable distribution of action
profiles which is a minmax distribution for player i.
Condition C4. For each player i, there exists an enforceable distribution of action
profiles p that maximizes i's payoff, i.e. g^i(p) = max_{a∈A} g^i(a).
Both (C3) and (C4) are satisfied if we assume the following:
Condition C5. Every pure action profile is enforceable.
Indeed, under (C5), every distribution of action profiles is enforceable. Conditions similar
to (C3) and (C4) are found in [13] and [18].
We obtain the following Folk Theorem for perfect communication equilibria:
Theorem 4.4 If the dimension of V ∗ is n, then under conditions (C1), (C2), (C3) and
(C4), C∗ = V ∗.
The proof is deferred to the appendix.
4.4 Discussion
The paper closest to ours is [18], which deals with repeated games with private
monitoring extended by public communication. It is thus worthwhile to compare our
assumptions and results with those of [18].
The use of general communication devices has several advantages, compared to public
communication. First, the mediator has the ability to correlate the players’ actions.
This enables us to get a complete Folk Theorem. Indeed, every perfect communication
equilibrium payoff is in V ∗, i.e. is feasible and individually rational with respect to the
minmax levels in correlated strategies. Thus, under conditions (C1), (C2), (C3) and (C4),
we have a full characterization of the set of equilibrium payoffs. This contrasts with [18],
who consider payoffs above the minmax levels in mixed strategies: in repeated games
with more than two players and imperfect monitoring, equilibrium payoffs may be lower
than these levels for some players5. Remark also that we get the Folk Theorem without
assuming full support conditions (q(y|a) > 0, ∀(y, a)).
Discussion of the conditions An important feature is that the mediator has more
information than any player has, even with public communication. The mediator knows
the action profile that should be played at equilibrium whereas players in a model with
public communication only know the distribution of the action profile. This allows us to
obtain the Folk Theorem for a wider class of monitoring structures: our main conditions (C1)
and (C2) are weaker than conditions (A2) and (A3) of [18]. We recall these conditions
and compare them with ours.
For each action profile a, denote by q^{−i}(a) the marginal probability distribution on Y^{−i}
derived from q(·|a). For every pair of players (i, j), denote by q^{−ij}(a) the marginal
probability distribution on ∏_{k≠i,j} Y^k derived from q(·|a). Let also Ex be the set of action
profiles such that the payoff vector g(a) is extremal in V. Kandori and Matsushima's
conditions are as follows.
5 In games with more than two players and imperfect monitoring, the past signals may be used as a correlation device, and a player's payoff may be pushed down to a level lower than the minmax in mixed strategies, see [17].
Condition A2. ([18]) For each pair i ≠ j and each a ∈ Ex,

q^{−ij}(a) ∉ co({q^{−ij}(a^{−i}, b^i) | b^i ∈ A^i \ {a^i}} ∪ {q^{−ij}(a^{−j}, b^j) | b^j ∈ A^j \ {a^j}})

Condition A3. ([18]) For each pair i ≠ j and each a ∈ Ex,

co({q^{−ij}(a^{−i}, b^i) | b^i ∈ A^i}) ∩ co({q^{−ij}(a^{−j}, b^j) | b^j ∈ A^j}) = {q^{−ij}(a)}
Under Condition (A2), (C1) is clearly satisfied: if the distribution of signals for players
k ≠ i, j is changed by a deviation of player i or of player j, then the distribution of
reported signals is changed as well. However, (C1) is weaker. Indeed, (A2) states that for
each pure action profile a ∈ Ex and each pair of players (i, j), every composite deviation
is detected. By contrast, condition (C1) requires that for every pair of players (i, j) and
every composite deviation, there exists a profile a, depending on the tuple (i, j, t, µ^i, µ^j),
such that if a is recommended, the mediator detects the deviation. Likewise, (A3) implies
(C2), but (C2) is weaker: (A3) requires every deviation to be identified at every pure action
profile a ∈ Ex, whereas (C2) requires that for every pair of deviations by different players,
there exists a profile a that allows the mediator to differentiate those deviations.
On the other hand, the conditions of [18] are stated on the monitoring structure q,
whereas our conditions are formulated on distributions of reported signals, which are more
difficult to check. We give two conditions on q which are respectively weaker than (A2)
and (A3), and which imply respectively (C1) and (C2).
Condition C1’. For each pair i 6= j, there exists an action profile a such that,
q−ij(a) /∈ co ({q−ij(a−i, bi) | bi 6= ai} ∪ {q−ij(a−j, bj) | bj 6= aj})
18
Condition C2’. For each pair i 6= j there exists an action profile a such that,
co ({q−ij(a−i, bi) | bi ∈ Ai}) ∩ co ({q−ij(a−j, bj) | bj ∈ Aj}) = {q(·|a)}
Clearly, (A2) implies (C1’) which implies in turn (C1). Likewise, (A3) implies (C2’)
which implies in turn (C2).
The conditions given in [18] on the monitoring structure q are generic (satisfied for
an open and dense set of transition probabilities q) provided that the number of possible
signals is large with respect to the number of possible actions. Our conditions thus
inherit the same genericity properties. The next subsection presents a class of monitoring
structures for which our conditions are strictly weaker than those of [18].
Computational complexity. Lastly, our equilibrium concept yields a set of payoffs C∗ which is computationally simpler than its counterpart in [18]. The computation of the maximal score, as introduced in [12] and used in [18], requires computing the Nash equilibria of the game with payoff function $(1-\delta)g^i(a) + \delta\sum_y q(y\mid a) f^i(a,y)$ for player i, for various continuation payoffs f. Computing the set of Nash equilibria of a finite game is known to be of high complexity, in fact NP-hard (see e.g. [15]). By contrast, the incentive constraints that characterize a δ-balanced pair (p, f) are linear with respect to the distribution p. Computing the distribution p is thus a polynomial-time problem, akin to computing correlated equilibria ([15]).
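To make the complexity comparison concrete, here is a minimal sketch (our own toy example, not code from the paper) that checks the linear incentive constraints defining a correlated equilibrium in a one-shot 2×2 game; since each constraint is linear in the distribution p, optimizing any linear objective over them is a standard linear program:

```python
# Toy 2x2 game of Chicken; actions: 0 = swerve, 1 = dare.
# payoff[(a1, a2)] = (u1, u2).  Game and numbers are illustrative only.
payoff = {
    (0, 0): (6, 6), (0, 1): (2, 7),
    (1, 0): (7, 2), (1, 1): (0, 0),
}

def is_correlated_eq(p, tol=1e-9):
    """Check the linear correlated-equilibrium constraints: for each
    player i, recommended action a_i and deviation b_i, the expected
    gain from deviating (weighted by p) must be non-positive."""
    for i in (0, 1):                      # player index
        for a_i in (0, 1):                # recommended action
            for b_i in (0, 1):            # candidate deviation
                if b_i == a_i:
                    continue
                gain = 0.0
                for a_j in (0, 1):        # opponent's recommendation
                    prof = (a_i, a_j) if i == 0 else (a_j, a_i)
                    dev = (b_i, a_j) if i == 0 else (a_j, b_i)
                    gain += p.get(prof, 0.0) * (payoff[dev][i] - payoff[prof][i])
                if gain > tol:
                    return False
    return True

# Uniform weight on (S,S), (S,D), (D,S): a classic correlated equilibrium.
ce = {(0, 0): 1/3, (0, 1): 1/3, (1, 0): 1/3}
assert is_correlated_eq(ce)
assert not is_correlated_eq({(1, 1): 1.0})  # mass on (dare, dare) fails
```

Since the constraints are linear in the entries of p, checking feasibility or maximizing a linear objective over them is an LP, which is the source of the polynomial-time claim above.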
4.5 Partially perfect monitoring
Consider an n-player game with action sets (Aj)j. We say that a monitoring structure is partially perfect if, for every action profile, each player observes the actions chosen by a subset of players. That is, the monitoring structure is given by a family of mappings (obsj)j where, for each player i, obsi maps action profiles to subsets of players, and if a = (aj)j is played, player i is told (aj)j∈obsi(a).
Particular cases of this model appear in the literature. [4] and [30] consider monitoring
structures for which each player observes the actions of a fixed subset of players and [5]
consider games where each player chooses at each stage the set of players he wishes to monitor, subtracting a monitoring cost from the stage payoff.
We give a characterization of partially perfect monitoring structures that satisfy the
conditions of Theorem 4.4. We start by discussing an example.
Example 4.5 Consider a 5-player game where the players are partners working in the
same firm. The firm’s building has two offices (room A and room B), each containing at
most two persons. Players 1 and 2 are both in room A and players 3 and 4 are both in
room B. Player 5 is an inspector who can choose to stay in room A, in room B or shirk.
The monitoring structure is thus as follows: player 5 observes the actions of players 1 and 2 [resp. 3 and 4] if he stays in room A [resp. room B], and gets no signal if he does not stay in one of the two offices. Players 1 and 2 [resp. 3 and 4] observe each other's action, and observe the action of player 5 if he inspects room A [resp. room B].
Conditions (C1), (C2), and (C5) are satisfied for this example (recall that (C5) implies
(C3) and (C4)). Take a pair of players among the first four players and a deviation for each of them. If the two players are in different offices, say players 1 and 3, then their deviations induce different reported signals: a deviation of player 1 is reported by player 2 while a deviation of player 3 is reported by player 4. If the two players are in the same office, say players 1 and 2, then their deviations are detected and
differentiated at any action profile where the inspector inspects room A. Lastly, deviations
by player 5 can be detected and identified: if the inspector shirks or inspects the wrong
office, the mediator knows it from the other players’ reports. Therefore conditions (C1)
and (C2) hold. Further, each player's action is observed by at least one other player, thus every unilateral deviation can be detected at every action profile. It follows that every correlated distribution on actions is enforceable, and (C5) holds.
Theorem 4.4 thus holds for this monitoring structure and for each payoff function that
satisfies the full-dimensionality condition.
On the other hand, this monitoring structure does not satisfy the conditions of [18].
Given any pure action profile, there is an office which is not inspected by player 5, and
thus deviations by players in this office cannot be differentiated.
Regarding n-player partially-perfect monitoring structures, we get the following result:
Proposition 4.6 Assume that for each player i, there exists an action profile a such that two players j ≠ i, k ≠ i observe the action of player i, i.e. i ∈ obsj(a) ∩ obsk(a). Then Conditions (C1), (C2) and (C5) are satisfied and the Folk Theorem holds under the full-dimensionality condition.
The proof is a straightforward extension of the above example and is left to the reader.
Remark that a partially perfect monitoring structure satisfies (A2) and (A3) if for every a ∈ Ex and every player i, there exist j ≠ i, k ≠ i such that i ∈ obsj(a) ∩ obsk(a).
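The condition in Proposition 4.6 is mechanical to verify. Below is a hedged sketch (our own encoding of Example 4.5, not code from the paper) in which only the inspector's action — 'A', 'B', or 'shirk' — affects who observes whom; it checks that every player is observed by two others at some profile:

```python
def obs(i, a5):
    """Players whose actions player i observes when the inspector
    (player 5) plays a5 in {'A', 'B', 'shirk'}."""
    if i in (1, 2):   # room A: see the office mate, and 5 if he inspects A
        return ({1, 2} - {i}) | ({5} if a5 == 'A' else set())
    if i in (3, 4):   # room B: symmetric
        return ({3, 4} - {i}) | ({5} if a5 == 'B' else set())
    # i == 5: the inspector observes the inspected room, nothing if he shirks
    return {'A': {1, 2}, 'B': {3, 4}}.get(a5, set())

def doubly_observed(i):
    """Is there a profile at which two distinct players j, k != i
    both observe player i's action (the hypothesis of Prop. 4.6)?"""
    for a5 in ('A', 'B', 'shirk'):
        observers = {j for j in range(1, 6) if j != i and i in obs(j, a5)}
        if len(observers) >= 2:
            return True
    return False

assert all(doubly_observed(i) for i in range(1, 6))
```

By contrast, (A2) and (A3) would require this double observation at every profile, which fails here: whichever room the inspector leaves uninspected contains players each observed by only one other player.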
5 Conclusion
The characterization of equilibrium payoffs for discounted repeated games with imperfect private monitoring is still an open problem. We provide in this paper a solution concept which entails private communication and possesses recursive properties that allow us to derive a characterization of the corresponding equilibrium payoffs and a Folk Theorem. However, the perfect communication equilibrium concept is more restrictive than sequential equilibrium of the game extended by communication, and does not even cover all sequential equilibria of the game without communication. Mailath et al. [23] give an example of a 2-stage repeated game with imperfect monitoring for which: i) the stage game has a unique correlated equilibrium, and ii) there exists a sequential equilibrium of the repeated game that does not play this unique correlated equilibrium at the first stage. For this game, any perfect communication equilibrium prescribes the unique correlated equilibrium at the second stage for every history. By a backward induction argument, the mediator prescribes the unique correlated equilibrium at the first stage as well.
An open problem is thus to characterize the equilibrium payoffs that may be obtained by sequential equilibria of the repeated game extended by a communication device.
Another question is how to dispense with the mediator. The mediator is necessary
for correlation, which pushes the individually rational payoffs downwards, thus enlarg-
ing the set of equilibrium payoffs. Apart from this aspect, could it be possible to get
the Folk Theorem with our conditions using public equilibria? Answering this question
requires characterizing the set of sequential equilibria of repeated games extended by
public communication. Note that the revelation principle does not apply for games ex-
tended by public communication: the canonical communication equilibrium associated to
a public communication equilibrium may well fail to be public. Thus, one cannot restrict
to canonical public equilibria without losing generality. This problem seems also difficult
since sequential equilibria do not possess a known, tractable recursive structure ([18] study
a class of equilibria with public communication which possesses a recursive structure).
6 Appendix: Proof of Theorem 4.4
Lemma 6.1 $C^* \subset V^*$.

Proof. First, $C^* \subset V$. Take $v \in C^*$. For each direction $\lambda$, there exists a $\delta$-balanced and $\lambda$-directed $(p,f)$ such that $\lambda \cdot v \leq \lambda \cdot v_\delta(p,f) \leq \lambda \cdot g(p)$. Since $g(p) \in V$, $\lambda \cdot v \leq \max_{x \in V} \lambda \cdot x$, which implies $C^* \subset V$ since $V$ is convex.

Now we prove $C^* \subset IR$. For each player $i$, let $e_i$ be the unit vector whose $i$-th component is 1 and whose other components are zero. Take $v \in C^*$; for $\lambda = -e_i$, $\lambda \cdot v \leq k(\lambda)$, i.e. there exists a $\delta$-balanced $(p,f)$ such that $v^i \geq v^i_\delta(p,f)$ and $f^i(a,y) \geq v^i_\delta(p,f)$ for each $(a,y)$. This implies that for every decision rule $d^i$, $Q(p,d^i)\cdot f^i \geq v^i_\delta(p,f)$. Since the pair is balanced, for each $d^i$:
$$v^i \;\geq\; v^i_\delta(p,f) \;\geq\; (1-\delta)g^i(p,d^i) + \delta Q(p,d^i)\cdot f^i \;\geq\; (1-\delta)g^i(p,d^i) + \delta v^i_\delta(p,f),$$
thus $v^i_\delta(p,f) \geq g^i(p,d^i)$. So there exists $p$ such that for each $d^i$, $v^i \geq g^i(p,d^i)$, i.e. $v^i \geq \min_p \max_{d^i} g^i(p,d^i) = m^i$. $\square$
To complete the proof, we need to prove $V^* \subset C^*$. First we introduce a stronger notion of enforceability.

Definition 6.2
1. Given a direction $\lambda$, the distribution $p$ is enforceable with respect to $\lambda$-hyperplanes if there exist a mapping $f$ and a discount factor $\delta$ such that $(p,f)$ is $\delta$-balanced and there exists $k \in \mathbb{R}$ such that $\lambda \cdot f(a,y) = k$ for all $(a,y)$.
2. A vector $\lambda = (\lambda^1,\ldots,\lambda^n)$ is singular if there is a unique $i$ such that $\lambda^i \neq 0$. Otherwise, $\lambda$ is called regular. Given a pair of players $i \neq j$, an $ij$-vector is a regular vector $\lambda$ such that $\lambda^k = 0$ for $k \neq i, j$.
3. Given a pair of players $i \neq j$, $p$ is enforceable with respect to $ij$-hyperplanes if $p$ is enforceable with respect to $\lambda$-hyperplanes for each $ij$-vector $\lambda$.

The following properties hold.

Lemma 6.3
1. If $p$ is enforceable with respect to $\lambda$-hyperplanes, then $\lambda \cdot g(p) \leq k(\lambda)$.
2. If $p$ is enforceable with respect to $ij$-hyperplanes for all pairs $i \neq j$, then $p$ is enforceable with respect to $\lambda$-hyperplanes for all regular $\lambda$.
Proof. (1). Let $(p,f)$ be $\delta$-balanced with $\lambda \cdot f(a,y) = k$ for all $(a,y)$. Given a vector $\beta \in \mathbb{R}^n$, let $f_\beta(a,y) = f(a,y) + \beta$; then $(p,f_\beta)$ is $\delta$-balanced as well. Choosing $\beta$ such that $\lambda \cdot \beta = \lambda \cdot g(p) - k$, we get that $\lambda \cdot f_\beta(a,y) = \lambda \cdot g(p)$ for all $(a,y)$. It follows that $\lambda \cdot v_\delta(p,f_\beta) = \lambda \cdot \bigl((1-\delta)g(p) + \delta Q(p)\cdot f_\beta\bigr) = \lambda \cdot g(p)$, so $(p,f_\beta)$ is $\lambda$-directed. Therefore, $\lambda \cdot g(p) = \lambda \cdot v_\delta(p,f_\beta) \leq k(\lambda)$.
(2). Assume that for each $ij$-vector $\lambda$, $p$ is enforceable with respect to $\lambda$-hyperplanes. Let $\lambda$ be a regular vector. First assume that the number of players with $\lambda^i \neq 0$ is even, i.e., up to a relabelling of players, $\lambda = (\lambda^1,\ldots,\lambda^{2L},0,\ldots,0)$. For each pair of players $(\ell,\ell+1)$ with $\ell$ odd in $\{1,\ldots,2L-1\}$, choose $(f^\ell, f^{\ell+1})$ that solves the system:
$$
\begin{cases}
(Q(p)-Q(p,d^\ell))\cdot f^\ell \;\geq\; \frac{1-\delta}{\delta}\bigl(g^\ell(p,d^\ell)-g^\ell(p)\bigr) & \forall d^\ell\\[2pt]
(Q(p)-Q(p,d^{\ell+1}))\cdot f^{\ell+1} \;\geq\; \frac{1-\delta}{\delta}\bigl(g^{\ell+1}(p,d^{\ell+1})-g^{\ell+1}(p)\bigr) & \forall d^{\ell+1}\\[2pt]
\lambda^\ell f^\ell(a,y) + \lambda^{\ell+1} f^{\ell+1}(a,y) = 0 & \forall (a,y)
\end{cases}
\qquad (3)
$$
Such a pair exists since $p$ is enforceable with respect to $(\ell,\ell+1)$-hyperplanes, and since we may shift the continuation payoffs by adding constants in order to get $\lambda^\ell f^\ell(a,y) + \lambda^{\ell+1} f^{\ell+1}(a,y) = 0$. For the other players $i$, choose $f^i$ that solves:
$$
(Q(p)-Q(p,d^i))\cdot f^i \;\geq\; \frac{1-\delta}{\delta}\bigl(g^i(p,d^i)-g^i(p)\bigr) \quad \forall d^i \qquad (4)
$$
Such $f^i$ exists since $p$ is enforceable. This defines a continuation payoff $f$ which has all the desired properties.
Assume now that the number of players with $\lambda^i \neq 0$ is odd, i.e. $\lambda = (\lambda^1,\ldots,\lambda^{2L+1},0,\ldots,0)$. For players $i$ with $\lambda^i = 0$, solve the system (4). For every pair of players $(\ell,\ell+1)$ with $\ell$ even and in $\{4,\ldots,2L\}$, choose $(f^\ell,f^{\ell+1})$ that solves the system (3) for the pair $(\ell,\ell+1)$. For players 1 and 2 choose $(f^1_*, f^2_*)$ that solves the system:
$$
\begin{cases}
(Q(p)-Q(p,d^1))\cdot f^1 \;\geq\; \frac{1-\delta}{\delta}\bigl(g^1(p,d^1)-g^1(p)\bigr) & \forall d^1\\[2pt]
(Q(p)-Q(p,d^2))\cdot f^2 \;\geq\; \frac{1-\delta}{\delta}\bigl(g^2(p,d^2)-g^2(p)\bigr) & \forall d^2\\[2pt]
\lambda^1 f^1(a,y) + \frac{\lambda^2}{2} f^2(a,y) = 0 & \forall (a,y)
\end{cases}
$$
and for players 2 and 3 choose $(f^2_{**}, f^3_{**})$ that solves:
$$
\begin{cases}
(Q(p)-Q(p,d^2))\cdot f^2 \;\geq\; \frac{1-\delta}{\delta}\bigl(g^2(p,d^2)-g^2(p)\bigr) & \forall d^2\\[2pt]
(Q(p)-Q(p,d^3))\cdot f^3 \;\geq\; \frac{1-\delta}{\delta}\bigl(g^3(p,d^3)-g^3(p)\bigr) & \forall d^3\\[2pt]
\frac{\lambda^2}{2} f^2(a,y) + \lambda^3 f^3(a,y) = 0 & \forall (a,y)
\end{cases}
$$
Finally set $f^1 = f^1_*$, $f^2 = \frac{1}{2}f^2_* + \frac{1}{2}f^2_{**}$ and $f^3 = f^3_{**}$. As before, the mapping $f$ has the desired properties. Note that since $f^2_*$ and $f^2_{**}$ satisfy the incentive constraints for player 2, so does their average $f^2$. $\square$
The core of the proof of Theorem 4.4 is the following lemma.
Lemma 6.4 For each v ∈ interior(V ) ∩ IR and each direction λ, there exists p ∈ ∆(A)
which is enforceable with respect to λ-hyperplanes such that λ · v ≤ λ · g(p).
Proof. Consider v ∈ interior (V ) ∩ IR and a direction λ. We distinguish two cases: first
we assume that λ is a singular vector, then a regular vector.
Singular vectors.
Case 1. λ = ei, the unit vector whose i-th component is 1 and other components are
zero. Let p be an enforceable distribution that maximizes i’s payoff. Such a distribution
exists from condition (C3). Then, vi ≤ gi(p). Since p maximizes player i’s payoff, each
ai s.t. p(ai) > 0 is a best reply to p(·|ai). To enforce p, we may thus choose f i constant,
e.g. $f^i(a,y) = g^i(p)$ for each $(a,y)$. Since $p$ is enforceable, for each player $j \neq i$, there exists $f^j$ that solves the system:
$$(Q(p)-Q(p,d^j))\cdot f^j \;\geq\; \frac{1-\delta}{\delta}\bigl(g^j(p,d^j)-g^j(p)\bigr) \quad \forall d^j$$
For these continuation payoffs, $(p,f)$ is $\delta$-balanced and $\lambda \cdot f(a,y) = f^i(a,y)$ is constant. Thus $p$ is enforceable with respect to $\lambda$-hyperplanes.
Case 2. λ = −ei. Let p be an enforceable distribution that is also a minmax distribution
for player i. Such a distribution exists from (C4). Then, as v is individually rational
vi ≥ mi = gi(p). Since p is a minmax distribution for player i, each ai s.t. p(ai) > 0 is a
best response to p(·|ai), thus we may choose f i constant, e.g. f i(a, y) = mi for each (a, y).
Since $p$ is enforceable, for each player $j \neq i$, there exists $f^j$ that solves the system:
$$(Q(p)-Q(p,d^j))\cdot f^j \;\geq\; \frac{1-\delta}{\delta}\bigl(g^j(p,d^j)-g^j(p)\bigr) \quad \forall d^j$$
The pair $(p,f)$ is then $\delta$-balanced and $\lambda \cdot f(a,y) = -f^i(a,y)$ is constant. Thus, $p$ is enforceable with respect to $\lambda$-hyperplanes.
Regular vectors.
Since $v$ belongs to the interior of $V$ and since $g$ is linear on $\Delta(A)$, there exists a full-support distribution $p \in \Delta(A)$ such that $v = g(p)$. We prove that (C1) and (C2) imply that $p$ is enforceable with respect to $ij$-hyperplanes for all pairs $(i,j)$. From Lemma 6.3 point 2, this implies that $p$ is enforceable with respect to $\lambda$-hyperplanes for all regular $\lambda$. Let us choose a pair of players $(i,j)$ and a vector $\lambda$ such that $\lambda^i\lambda^j \neq 0$ and $\lambda^k = 0$ for each $k \neq i, j$. Since $p$ is enforceable, for each player $k$ such that $\lambda^k = 0$, there exists $f^k$ that solves the system:
$$(Q(p)-Q(p,d^k))\cdot f^k \;\geq\; \frac{1-\delta}{\delta}\bigl(g^k(p,d^k)-g^k(p)\bigr) \quad \forall d^k$$
Case 1. $\lambda^i\lambda^j > 0$. Recall that condition (C2) states that for every pair of players $i \neq j$ and all mixed decision rules $\mu^i \in \Delta(D^i)$, $\mu^j \in \Delta(D^j)$,
$$\bigl[\forall a \in A,\ Q_Y(a,\mu^i) = Q_Y(a,\mu^j)\bigr] \Longrightarrow \bigl[\forall a \in A,\ Q_Y(a,\mu^i) = Q_Y(a,\mu^j) = Q_Y(a)\bigr].$$
Since for each $p \in \Delta(A)$, $Q(p,\mu^i)(a,y) = p(a)\,Q_Y(a,\mu^i)(y)$ and since $p$ has full support, we get that for every pair of players $i \neq j$ and all mixed decision rules $\mu^i \in \Delta(D^i)$, $\mu^j \in \Delta(D^j)$,
$$\bigl[Q(p,\mu^i) = Q(p,\mu^j)\bigr] \Longrightarrow \bigl[Q(p,\mu^i) = Q(p,\mu^j) = Q(p)\bigr].$$
Or equivalently,
$$\mathrm{co}\bigl\{Q(p,d^i) \mid d^i \in D^i\bigr\} \cap \mathrm{co}\bigl\{Q(p,d^j) \mid d^j \in D^j\bigr\} = \{Q(p)\}.$$
By the separation theorem, there exists a mapping $\ell : A \times Y \to \mathbb{R}$ such that
$$Q(p,d^i)\cdot \ell \;<\; Q(p)\cdot \ell \;<\; Q(p,d^j)\cdot \ell$$
for all decision rules $d^i \neq \varphi^i$, $d^j \neq \varphi^j$. For $t > 0$, set $f^i = t\ell$ and $f^j = -\frac{\lambda^i}{\lambda^j}\,f^i$. Obviously, $\lambda^i f^i + \lambda^j f^j = 0$. The system of incentive constraints for player $i$ is then:
$$t\,(Q(p)-Q(p,d^i))\cdot \ell \;\geq\; \frac{1-\delta}{\delta}\bigl(g^i(p,d^i)-g^i(p)\bigr) \quad \forall d^i$$
which is satisfied for $t$ large enough since the left-hand side is positive for $d^i \neq \varphi^i$. The system of incentive constraints for player $j$ is:
$$-t\,\frac{\lambda^i}{\lambda^j}\,(Q(p)-Q(p,d^j))\cdot \ell \;\geq\; \frac{1-\delta}{\delta}\bigl(g^j(p,d^j)-g^j(p)\bigr) \quad \forall d^j$$
which is also satisfied for $t$ large enough since $(Q(p)-Q(p,d^j))\cdot \ell$ is negative for $d^j \neq \varphi^j$.
Case 2. $\lambda^i\lambda^j < 0$. Now recall that Condition (C1) states that for each pair of players $i \neq j$, all mixed decision rules $\mu^i \in \Delta(D^i)$, $\mu^j \in \Delta(D^j)$ and $t \in [0,1]$,
$$\bigl[\forall a \in A,\ Q_Y(a) = tQ_Y(a,\mu^i) + (1-t)Q_Y(a,\mu^j)\bigr] \Longrightarrow \bigl[t\mu^i(\varphi^i) + (1-t)\mu^j(\varphi^j) = 1\bigr].$$
Again, since for each $p \in \Delta(A)$, $Q(p,\mu^i)(a,y) = p(a)\,Q_Y(a,\mu^i)(y)$ and since $p$ has full support, for every pair of players $i \neq j$, all mixed decision rules $\mu^i \in \Delta(D^i)$, $\mu^j \in \Delta(D^j)$ and $t \in [0,1]$,
$$\bigl[Q(p) = tQ(p,\mu^i) + (1-t)Q(p,\mu^j)\bigr] \Longrightarrow \bigl[t\mu^i(\varphi^i) + (1-t)\mu^j(\varphi^j) = 1\bigr].$$
This implies
$$Q(p) \notin \mathrm{co}\bigl(\{Q(p,d^i) \mid d^i \neq \varphi^i\} \cup \{Q(p,d^j) \mid d^j \neq \varphi^j\}\bigr).$$
By the separation theorem, there exists a mapping $\ell : A \times Y \to \mathbb{R}$ such that
$$Q(p,d^h)\cdot \ell \;<\; Q(p)\cdot \ell$$
for $h = i,j$ and all decision rules $d^h \neq \varphi^h$. For $t > 0$, let $f^i = t\ell$ and $f^j = -\frac{\lambda^i}{\lambda^j}\,f^i$. Again, $\lambda^i f^i + \lambda^j f^j = 0$, and the incentive constraints for player $i$ are:
$$t\,(Q(p)-Q(p,d^i))\cdot \ell \;\geq\; \frac{1-\delta}{\delta}\bigl(g^i(p,d^i)-g^i(p)\bigr) \quad \forall d^i$$
These are satisfied for $t$ large enough. The incentive constraints for player $j$ are:
$$-t\,\frac{\lambda^i}{\lambda^j}\,(Q(p)-Q(p,d^j))\cdot \ell \;\geq\; \frac{1-\delta}{\delta}\bigl(g^j(p,d^j)-g^j(p)\bigr) \quad \forall d^j$$
and are also satisfied for $t$ large enough. $\square$
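The "t large enough" step in both cases admits a uniform bound. The following spelled-out version is our own added detail (the quantities $m$ and $G$ are not notation from the paper): since the sets of decision rules are finite, the strict separating inequalities have a positive minimum margin.

```latex
\[
m \;=\; \min_{h\in\{i,j\},\; d^h \neq \varphi^h} \bigl|(Q(p)-Q(p,d^h))\cdot \ell\bigr| \;>\; 0,
\qquad
G \;=\; \max_{h\in\{i,j\},\; d^h \in D^h} \bigl|g^h(p,d^h)-g^h(p)\bigr|.
\]
Positivity of $m$ follows from the finiteness of $D^i$ and $D^j$ together with the
strict separating inequalities. Then every
\[
t \;\ge\; \frac{(1-\delta)\,G}{\delta\, m \,\min\{1,\ |\lambda^i/\lambda^j|\}}
\]
satisfies all the incentive constraints for players $i$ and $j$: each left-hand
side is at least $t\,m\,\min\{1, |\lambda^i/\lambda^j|\}$, while each right-hand
side is at most $\tfrac{1-\delta}{\delta}\,G$.
```
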
Theorem 4.4 follows from Lemma 6.4 and Lemma 6.3 point 1. Take $v \in \mathrm{interior}(V) \cap IR$ and a direction $\lambda$. There exists $p \in \Delta(A)$ which is enforceable with respect to $\lambda$-hyperplanes such that $\lambda \cdot v \leq \lambda \cdot g(p) \leq k(\lambda)$. Thus $v \in C^*$. Since $C^*$ is closed and the closure of $\mathrm{interior}(V) \cap IR$ is $V^*$, the proof of Theorem 4.4 is complete.
References
[1] D. Abreu, D. Pearce, and E. Stacchetti. Toward a theory of discounted repeated
games with imperfect monitoring. Econometrica, 58:1041–1063, 1990.
[2] M. Amarante. Recursive structure and equilibria in games with private moni-
toring. Economic Theory, 22:353–374, 2003.
[3] R.J. Aumann. Subjectivity and correlation in randomized strategies. Journal
of Mathematical Economics, 1:67–95, 1974.
[4] E. Ben-Porath and M. Kahneman. Communication in repeated games with
private monitoring. Journal of Economic Theory, 70:281–297, 1996.
[5] E. Ben-Porath and M. Kahneman. Communication in repeated games with
costly monitoring. Games and Economic Behavior, 44:227–250, 2003.
[6] O. Compte. Communication in repeated games with imperfect private monitor-
ing. Econometrica, 66:597–626, 1998.
[7] J.C. Ely, J. Horner, and W. Olszewski. Belief-free equilibria in repeated games.
Econometrica, 73:377–415, 2005.
[8] J.C. Ely and J. Valimaki. A robust folk theorem for the prisoner’s dilemma.
Journal of Economic Theory, 102:84–106, 2002.
[9] K. Fan. On systems of linear inequalities. In Linear Inequalities, H. Kuhn and
A. Tucker, eds., Princeton, NJ: Princeton University Press, 1956.
[10] F. Forges. An approach to communication equilibria. Econometrica, 54:1375–
1385, 1986.
[11] F. Forges. Correlated equilibrium in two person zero sum games. Econometrica,
58:515–516, 1988.
[12] D. Fudenberg and D. K. Levine. Efficiency and observability with long-run and
short-run players. Journal of Economic Theory, 62:103–135, 1994.
[13] D. Fudenberg, D. K. Levine, and E. Maskin. The folk theorem with imperfect
public information. Econometrica, 62:997–1039, 1994.
[14] D. Fudenberg, D. K. Levine, and S. Takahashi. Perfect public equilibrium when
players are patient. Games and Economic Behavior, 61:27–49, 2007.
[15] I. Gilboa and E. Zemel. Nash and correlated equilibria: Some complexity con-
siderations. Games and Economic Behavior, 1:80–93, 1989.
[16] E. J. Green and R. H. Porter. Noncooperative collusion under imperfect price
information. Econometrica, 52:87–100, 1984.
[17] O. Gossner and T. Tomala. Secret correlation in repeated games with signals.
Mathematics of Operations Research, 32:413–424, 2007.
[18] M. Kandori and H. Matsushima. Private observation, communication and col-
lusion. Econometrica, 66:627–652, 1998.
[19] M. Kandori and I. Obara. Efficiency in repeated games revisited: The role of
private strategies. Econometrica, 74:499–519, 2006.
[20] D.M. Kreps and R.B. Wilson. Sequential equilibria. Econometrica, 50:863–894,
1982.
[21] E. Lehrer. Nash equilibria of n player repeated games with semi-standard in-
formation. International Journal of Game Theory, 19:191–217, 1990.
[22] E. Lehrer. Internal correlation in repeated games. International Journal of
Game Theory, 19:431–456, 1991.
[23] G.J. Mailath, S. A. Matthews and T. Sekiguchi. Private Strategies in Finitely
Repeated Games with Imperfect Public Monitoring. The B.E. Journal in The-
oretical Economics, vol. 2, issue 1, 2002.
[24] G.J. Mailath and L. Samuelson. Repeated Games and Reputations: Long-Run
Relationships. Oxford University Press, 2006.
[25] R.B. Myerson. Optimal coordination mechanisms in generalized principal agent
problems, Journal of Mathematical Economics, 10:67–81, 1982.
[26] R.B. Myerson. Multistage games with communication, Econometrica, 54:323–
358, 1986.
[27] M. Piccione. The repeated prisoner’s dilemma with imperfect private monitor-
ing. Journal of Economic Theory, 102:70–84, 2002.
[28] R. Radner. Repeated partnership games with imperfect monitoring and no
discounting. Review of Economic Studies, 53:43–58, 1986.
[29] R. T. Rockafellar. Convex analysis. Princeton University Press, 1970.
[30] J. Renault and T. Tomala. Repeated proximity games. International Journal
of Game Theory, 27:539–559, 1998.
[31] J. Renault and T. Tomala. Communication equilibrium payoffs of repeated
games with imperfect monitoring. Games and Economic Behavior, 49:313–344,
2004.
[32] G. Stigler. A theory of oligopoly. Journal of Political Economy, 72:44–61, 1964.