Presentation on Cooperation and Reputation, on June 29, 2010.
Cooperation and Reputation
Vincent Traag
June 29, 2010
Introduction Cooperative Mechanisms Indirect Reciprocity Proposed model
Outline
1. Introduction
2. Cooperative Mechanisms
3. Indirect Reciprocity
4. Proposed model
Cooperation
Cooperation (and defection)
• Organizations (also Wikipedia, open source software, ...)
  ◮ Why do people contribute?
• Worker ants in colonies
  ◮ Why do workers help without individual benefit?
• Prudent parasites in hosts
  ◮ Why do parasites not replicate faster?
• Human body
  ◮ Why do cells not replicate faster?
Central question
If defecting (not cooperating) is a real option, why (and how) has cooperation evolved?
Formal cooperation (and defection)
Prisoner’s Dilemma
• The game has two options: donating or not donating.
• Donate at a cost $c > 0$ to benefit someone else with benefit $b > c$.
• Agents are paired, and play a round of donating or not.
• Cooperators C donate, defectors D do not donate.
This can be summarized in the payoff matrix
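$$A = \begin{array}{c|cc} & C & D \\ \hline C & b - c & -c \\ D & b & 0 \end{array}$$
(rows: own strategy, columns: strategy of the other player)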
Defectors dominate
Whatever strategy you encounter (C or D), it is always better to defect.
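The dominance of defection can be checked directly from the payoff matrix. The following is a minimal sketch; the values of `b` and `c` and the helper name `best_reply` are illustrative assumptions, not part of the slides.

```python
# Donation-game payoffs. The numbers are illustrative (assumption: b > c > 0).
b, c = 3.0, 1.0

# payoff[(me, other)] is my payoff when I play `me` against `other`.
payoff = {
    ("C", "C"): b - c,
    ("C", "D"): -c,
    ("D", "C"): b,
    ("D", "D"): 0.0,
}

def best_reply(other):
    """Return the strategy with the highest payoff against `other`."""
    return max(("C", "D"), key=lambda me: payoff[(me, other)])

# Best reply to each possible strategy of the other player.
replies = {other: best_reply(other) for other in ("C", "D")}
print(replies)
```

Since b > b − c and 0 > −c, defecting is the best reply in both columns for any b > c > 0.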
Evolutionary Stability (static)
Definition (Nash equilibrium)
Strategy i is a Nash equilibrium if $A_{ii} \ge A_{ji}$ for all j, and is a strict Nash equilibrium if $A_{ii} > A_{ji}$ for all $j \ne i$. Players cannot benefit by switching from strategy i if it is a Nash equilibrium.
Definition (ESS)
Strategy i is an Evolutionarily Stable Strategy (ESS) if, for all $j \ne i$,
$$A_{ii} > A_{ji} \quad\text{or}\quad (A_{ii} = A_{ji} \text{ and } A_{ij} > A_{jj}).$$
A population of players with strategy i cannot be ‘invaded’ by a small number of players with a different strategy.
Strict Nash ⟹ ESS ⟹ Nash
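The three static conditions translate almost literally into code for pure strategies. This is a sketch under the assumption that the payoff matrix is a list of rows with `A[i][j]` the payoff of strategy i against strategy j; the function names are hypothetical.

```python
def is_nash(A, i):
    """A[i][i] >= A[j][i] for every strategy j."""
    return all(A[i][i] >= A[j][i] for j in range(len(A)))

def is_strict_nash(A, i):
    """A[i][i] > A[j][i] for every other strategy j."""
    return all(A[i][i] > A[j][i] for j in range(len(A)) if j != i)

def is_ess(A, i):
    """Either strictly better against i, or equal against i and better against j."""
    return all(
        A[i][i] > A[j][i] or (A[i][i] == A[j][i] and A[i][j] > A[j][j])
        for j in range(len(A))
        if j != i
    )

# Prisoner's Dilemma with illustrative values; index 0 = C, index 1 = D.
b, c = 3.0, 1.0
A = [[b - c, -c], [b, 0.0]]
print(is_nash(A, 0), is_ess(A, 1))
```

For the Prisoner's Dilemma, cooperation is not even a Nash equilibrium, while defection is a strict Nash equilibrium and therefore an ESS.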
Mixed strategies
Mixed strategies
• There are n different ‘pure’ strategies (e.g. Cooperate, Defect).
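• Mixed strategy p: play ‘pure’ strategy i with probability $p_i$.
• Average payoff for ‘pure’ strategy i versus p is then $(Ap)_i$.
• Average payoff for mixed strategy q versus p is then $q^\top A p$.
Stability revisited
Strategy p is
• (Strict) Nash if $p^\top A p \ge q^\top A p$ for all q (with strict inequality for all $q \ne p$ in the strict case).
• ESS if, for all $q \ne p$, either $p^\top A p > q^\top A p$, or $p^\top A p = q^\top A p$ and $p^\top A q > q^\top A q$.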
There always exists a mixed strategy Nash equilibrium.
Dynamical View
• Natural to model game dynamics in an evolutionary context.
• Survival of the fittest (fitness = payoff).
Definition (Replicator equation)
Population with i = 1, ..., n different mixed strategies $p_i$:
• $x_i$: relative abundance (frequency)
• $\bar{p} = \sum_i p_i x_i$: average strategy
• $f_i = p_i^\top A \bar{p}$: expected payoff
• $\bar{f} = \bar{p}^\top A \bar{p}$: average payoff
Evolution of the population is given by
$$\dot{x}_i = x_i (f_i - \bar{f}) = x_i (p_i - \bar{p})^\top A \bar{p}.$$
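To see the replicator equation at work, the sketch below integrates it with a simple forward Euler scheme for the special case of pure strategies (so the average strategy equals the frequency vector x). The step size, number of steps, payoff values and function names are assumptions chosen for illustration.

```python
def replicator_step(x, A, dt):
    """One forward Euler step of dx_i/dt = x_i (f_i - f_bar) for pure strategies."""
    n = len(x)
    # Expected payoff of each pure strategy against the current population.
    f = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Average payoff in the population.
    f_bar = sum(x[i] * f[i] for i in range(n))
    return [x[i] + dt * x[i] * (f[i] - f_bar) for i in range(n)]

def simulate(x0, A, dt=0.01, steps=5000):
    """Iterate the Euler step from the initial frequencies x0."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, A, dt)
    return x

# Prisoner's Dilemma with illustrative values; index 0 = C, index 1 = D.
b, c = 3.0, 1.0
A = [[b - c, -c], [b, 0.0]]
x_final = simulate([0.9, 0.1], A)
print(x_final)
```

Even when 90% of the population starts as cooperators, the cooperator frequency decays towards zero, in line with the dominance of defection.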
Stability (dynamic)
Fixed points
• Total population always satisfies $\sum_i x_i = 1$.
• Dynamics are restricted to the unit simplex $S_n$.
• At a fixed point $x^*$, $p_i^\top A \bar{p} = \bar{p}^\top A \bar{p}$ for all i with $x_i > 0$.
Nash and ESS vs. fixed points
• If x∗ is (strictly) Nash, then it is a (stable) fixed point.
• If the fixed point x∗ is stable, it is a Nash equilibrium.
• If x∗ is ESS, then it is a stable fixed point.
• An interior ESS x∗ is globally stable.
Overview
What are possible mechanisms to get cooperation?
Payoff matrix
$$A = \begin{array}{c|cc} & C & D \\ \hline C & b - c & -c \\ D & b & 0 \end{array}$$
Mechanisms
• Kin selection ($r > c/b$)
  Cooperate because your offspring benefit from your cooperation. Basis of the ‘selfish gene’, or ‘inclusive fitness’.
• Direct reciprocity ($w > c/b$)
  Cooperate because of possible future payoffs.
• Indirect reciprocity ($q > c/b$)
  Cooperate because someone else may cooperate with you in the future.
Kin selection
Kin and gene
• Focus is on the gene, how can the gene spread?
• If the coefficient of kinship satisfies $r > c/b$, the cooperative gene will spread.
Game theoretic dynamic view
• Let $0 \le r \le 1$ be the assortativity.
• Average payoff (cooperators x, defectors 1 − x)
$$f_C(x) = r(b - c) + (1 - r)\bigl(x(b - c) - (1 - x)c\bigr)$$
$$f_D(x) = (1 - r)\,x\,b$$
• Dynamics $\dot{x} = x(1 - x)(f_C - f_D)$; $x^* = 1$ is stable if $r > c/b$.
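The stability threshold follows from a short calculation that the slide leaves implicit. Using only the two payoff functions above, the payoff difference does not depend on x:

```latex
\begin{aligned}
f_C(x) - f_D(x)
  &= r(b - c) + (1 - r)\bigl(x(b - c) - (1 - x)c\bigr) - (1 - r)xb \\
  &= r(b - c) + (1 - r)(xb - c) - (1 - r)xb \\
  &= r(b - c) - (1 - r)c \\
  &= rb - c .
\end{aligned}
```

The dynamics therefore reduce to $\dot{x} = x(1 - x)(rb - c)$, so the cooperator frequency grows towards $x^* = 1$ exactly when $rb > c$, that is, when $r > c/b$.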
Kin selection
Change in payoff
• Average payoff (cooperators x, defectors 1 − x)
$$f_C(x) = r(b - c) + (1 - r)\bigl(x(b - c) - (1 - x)c\bigr)$$
$$f_D(x) = (1 - r)\,x\,b$$
• Gives payoff matrix
$$A = \begin{array}{c|cc} & C & D \\ \hline C & b - c & rb - c \\ D & (1 - r)b & 0 \end{array}$$
• Cooperation is ESS if $b - c > (1 - r)b$, hence if $r > c/b$.
Reciprocity
Cooperate because of possible future rewards.
Iterated Prisoner’s Dilemma
• Play the PD game multiple times.
• Usually probability w to play another round.
• Huge number of possible strategies.
• No definite ESS.
Framework
• Play on average k = 1/(1 − w) rounds, then apply selection.
• Expected payoff $a_{ij}$ of strategy i vs. j.
• Then apply earlier framework (ESS, replicator).
Some strategies
Example (Always)
Defect/cooperate on all rounds.
  Other  CDDDDCC
  AllD   DDDDDDD
  AllC   CCCCCCC
Example (Win-Stay, Lose-Shift)
Change strategy if losing, keep it otherwise.
  Other  CDDDDCC
  WSLS   CCDCDCC
Example (Tit-for-tat)
Start cooperating, then repeat the opponent's previous move.
  Other  CDDDDCC
  TFT    CCDDDDC
Example (Generous Tit-for-tat)
As TFT, but cooperates after a defection with probability p.
  Other  CDDDDCC
  GTFT   CCDDCDC
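The deterministic example sequences can be reproduced with a few lines of code. This sketch assumes the usual reading of Win-Stay, Lose-Shift in the donation game: a round counts as a ‘win’ when the opponent cooperated (payoff b − c or b) and as a ‘loss’ otherwise (payoff −c or 0). The function names are hypothetical.

```python
OTHER = "CDDDDCC"  # moves of the opponent, taken from the examples above

def tit_for_tat(opponent):
    """Cooperate first, then repeat the opponent's previous move."""
    return "C" + opponent[:-1]

def win_stay_lose_shift(opponent):
    """Cooperate first; keep the last move after a win, switch after a loss."""
    moves = ["C"]
    for previous in opponent[:-1]:
        mine = moves[-1]
        won = previous == "C"  # assumption: opponent's cooperation means a win
        if won:
            moves.append(mine)
        else:
            moves.append("D" if mine == "C" else "C")
    return "".join(moves)

print("TFT ", tit_for_tat(OTHER))
print("WSLS", win_stay_lose_shift(OTHER))
```

Generous Tit-for-tat is omitted here because its moves are random; the GTFT line in the example is one possible realisation.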
Stability of reciprocity (TFT)
TFT vs. AllD
• TFT will cooperate first round, then defect subsequently.
• Expected payoff matrix
$$A = \begin{array}{c|cc} & \text{TFT} & \text{AllD} \\ \hline \text{TFT} & (b - c)/(1 - w) & -c \\ \text{AllD} & b & 0 \end{array}$$
• TFT is ESS when $(b - c)/(1 - w) > b$, or $w > c/b$.
TFT vs. AllC
• TFT is neutral vs AllC, neither is ESS.
• Expected payoff always (b − c)/(1 − w) for both TFT and AllC.
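The entry (b − c)/(1 − w) is the geometric sum of the per-round payoff b − c weighted by the probability w^t of reaching round t. The sketch below checks this numerically and evaluates the threshold w > c/b; the numbers and function names are illustrative assumptions.

```python
def expected_payoffs(b, c, w):
    """Expected payoff matrix for TFT (index 0) and AllD (index 1)."""
    tft_vs_tft = (b - c) / (1 - w)  # mutual cooperation in every round
    tft_vs_alld = -c                # TFT is exploited once, then defects
    alld_vs_tft = b                 # AllD gains once, then gets nothing
    alld_vs_alld = 0.0
    return [[tft_vs_tft, tft_vs_alld], [alld_vs_tft, alld_vs_alld]]

def tft_resists_alld(b, c, w):
    """TFT is a strict Nash equilibrium (hence ESS) against AllD."""
    A = expected_payoffs(b, c, w)
    return A[0][0] > A[1][0]

b, c = 3.0, 1.0  # illustrative values, so c / b = 1/3
# Truncated geometric series as a numerical check of the closed form.
series = sum((b - c) * 0.5 ** t for t in range(1000))
print(series, tft_resists_alld(b, c, 0.5), tft_resists_alld(b, c, 0.2))
```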
Cyclic behaviour
Weaknesses of TFT
• TFT population can drift towards AllC.
• TFT does not restore cooperation after errors:
  TFT  CCDCDCDD
  TFT  CCCDCDDD
• Generous TFT (GTFT) sometimes cooperates unreciprocally.
• GTFT can correct errors but still neutral vs AllC.
[Figure: diagram of the strategies TFT, GTFT, AllC and AllD, illustrating the cyclic behaviour.]
Introduction
Why are kin selection and reciprocity not sufficient?
Insufficient explanation
• Humans cooperate also with non-kin.
• Humans cooperate in non-iterative situations.
Indirect reciprocity
• Cooperate with those who cooperated with others in the past.
• Brings reputation into play.
• How to respond to reputation?
• How to determine new reputation?
Indirect Reciprocity
Cooperate because others will return the favor.
Reputation
• Cooperation increases reputation, defection decreases it.
• Cooperate with those who have a good reputation.
• Defect against those who have a bad reputation.
Action and assessment
• Many other possible interactions between cooperation and reputation.
• Should it be ‘bad’ or ‘good’ to cooperate with ‘bad’ agents?
• Should you cooperate only to increase your own reputation?
Image score
Definition (Image score, reputation)
• Integer status $-5 \le S_i \le 5$ known to all.
• Cooperating increases the status by 1.
• Defecting decreases the status by 1.
Definition (Discriminator Strategy)
• Cooperative threshold $-5 \le k_j \le 6$.
• If status $S_i \ge k_j$ cooperate, otherwise defect.
• Strategy $k_j = -5$ corresponds to AllC.
• Strategy $k_j = 6$ corresponds to AllD.
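The two definitions can be sketched as two small functions. Clamping the score at the bounds −5 and 5 is an assumption made here to keep the status in its stated range; the function names are hypothetical.

```python
S_MIN, S_MAX = -5, 5  # range of the image score

def update_score(score, cooperated):
    """Raise the score by 1 after cooperating, lower it by 1 after defecting.

    Assumption: the score is clamped to the interval [S_MIN, S_MAX].
    """
    step = 1 if cooperated else -1
    return max(S_MIN, min(S_MAX, score + step))

def discriminator_cooperates(recipient_score, threshold):
    """Cooperate if and only if the recipient's score reaches the threshold."""
    return recipient_score >= threshold

scores = range(S_MIN, S_MAX + 1)
# Threshold -5 helps everybody (AllC); threshold 6 helps nobody (AllD).
print(all(discriminator_cooperates(s, -5) for s in scores))
print(any(discriminator_cooperates(s, 6) for s in scores))
```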
Image score
Simulation
• Have n agents playing m rounds of donating.
• Each agent i has a threshold $k_i$ and reputation $S_i$.
• Offspring are produced in proportion to payoff.
Results of simulation
• Cooperative strategies ($k_i \le 0$) prevail without mutation.
• Cycles of Discriminator → AllC → AllD with mutation.
Some simple analytics
Simple image score
• Only good (1) or bad (0) reputation.
• Conditional cooperation (CC): cooperate if reputation is good.
• Probability q to know reputation of defector.
CC vs AllD
• Payoff matrix
$$A = \begin{array}{c|cc} & \text{CC} & \text{AllD} \\ \hline \text{CC} & b - c & -c(1 - q) \\ \text{AllD} & b(1 - q) & 0 \end{array}$$
• Conditional Cooperation is ESS when $b - c > b(1 - q)$, i.e. when $q > c/b$.
Other reputation dynamics
Morals
• Defecting against a defector counts as bad in image scoring.
• What action should be regarded as good?
• When to cooperate, when to defect?
Reputation of donor and recipient:   GG  GB  BG  BB
New reputation after donor plays C:   ∗   ∗   ∗   ∗
New reputation after donor plays D:   ∗   ∗   ∗   ∗
Action of donor:                      ∗   ∗   ∗   ∗
• New reputation can be either Good or Bad.
• Action can be either Cooperate or Defect.
Some reputation dynamics
Image scoring
     GG  GB  BG  BB
C    G   G   G   G
D    B   B   B   B
Standing
     GG  GB  BG  BB
C    G   G   G   G
D    B   G   B   B
Judging
     GG  GB  BG  BB
C    G   B   G   B
D    B   G   B   B
Shunning
     GG  GB  BG  BB
C    G   B   G   B
D    B   B   B   B
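The four assessment rules are small lookup tables, which makes them easy to encode. The sketch below stores each rule as the new reputation of the donor for every combination of donor reputation, recipient reputation and action; the dictionary layout and function name are assumptions for illustration.

```python
# Column order: (reputation of donor, reputation of recipient).
COLUMNS = ("GG", "GB", "BG", "BB")

# New reputation of the donor after cooperating (C) or defecting (D).
ASSESSMENT = {
    "image scoring": {"C": "GGGG", "D": "BBBB"},
    "standing":      {"C": "GGGG", "D": "BGBB"},
    "judging":       {"C": "GBGB", "D": "BGBB"},
    "shunning":      {"C": "GBGB", "D": "BBBB"},
}

def new_reputation(rule, donor, recipient, action):
    """Look up the donor's new reputation under the given assessment rule."""
    column = COLUMNS.index(donor + recipient)
    return ASSESSMENT[rule][action][column]

# A good donor who defects against a bad recipient:
for rule in ASSESSMENT:
    print(rule, new_reputation(rule, "G", "B", "D"))
```

The example highlights the main difference between the rules: standing and judging regard a good donor's defection against a bad recipient as justified, whereas image scoring and shunning do not.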
Leading eight
Best strategies
• In total 2,048 different possible strategies.
• There are 8 strategies (leading eight) that perform best (highest payoff, and ESS).
          GG  GB  BG  BB
C         G   ∗   G   ∗
D         B   G   B   ∗
Action    C   D   C   ×
Figure annotations:
• Maintenance of cooperation
• Mark defectors
• Punish defectors
• Forgive defectors
• Apologize
Subjective reputation
Subjective reputation
• Unrealistic that everybody knows the reputation of everybody.
• Introduce a subjective (private) reputation.
• ‘Observe’ only a few interactions.
Observing
• Probability q of observing an interaction.
• Cooperation declines with lower q.
• Diverging reputations cause further errors.
• Good agents may defect against bad agents, but not all agree on who is bad.
Synchronize reputations
Synchronizing reputations
• Spread local information to synchronize reputations.
• Players ‘gossip’ about each other to share information.
• When to start gossip, when to spread it, and how to interpret it?
Lying, cheating and defecting
• ‘False’ gossip may spread.
• Spreading rumours unconditionally allows liars to invade.
• Liars cannot invade conditional rumour spreaders.
Empirical evidence
Directly observable
• Humans seem to be using image scoring.
• Norm (help if S > k) can be different across groups.
• Standing strategy might be too ‘demanding’.
• Generates trust, also in subsequent games.
With gossip
• Gossip effective to spread information on reputation.
• Even in presence of direct observation, gossip has an effect.
• More gossip increases the effect.
Current research
Research questions
• What population structure can result from gossip?
• How stable are certain population structures?
Desired properties
• Have subjective reputations.
• Influenced by ‘local’ gossip.
• In the absence of gossip, rely on own observations.
• More gossip should have more influence.
• Have an analytically tractable model.
Simple model
• Start with some simple model and obtain some results.
• Somewhat arbitrary choices, which might be varied later on.
Basics
1. Each agent has a reputation of the other: $S_{ij}$.
2. Everybody plays and cooperates/defects based on reputation.
3. Everybody gossips the result of the interaction.
4. Update reputation based on own observation and gossip.
Reputation and cooperation
One interaction
• Suppose agents i and j interact.
• Each agent has a reputation of the other: $S_{ij}$ and $S_{ji}$.
• The probabilities to cooperate, $\alpha_{ij}$ and $\alpha_{ji}$, depend on reputation.
Approximation to image score
• Image score effectively uses a Heaviside step function:
$$\alpha_{ij} = \Theta(S_{ij} - k)$$
• We propose a continuous version (for now, k = 0):
$$\alpha_{ij} = \frac{1}{1 + e^{-\gamma (S_{ij} - k)}}$$
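The relation between the step function and its continuous version can be illustrated numerically: for large γ the logistic function approaches the step. In this sketch the convention Θ(0) = 1 is an assumption (it matches ‘cooperate if S ≥ k’), and the function names and test values are illustrative.

```python
import math

def heaviside(s, k=0.0):
    """Step rule: cooperate (1) if s >= k, otherwise defect (0)."""
    return 1.0 if s >= k else 0.0

def cooperation_probability(s, gamma=1.0, k=0.0):
    """Logistic probability to cooperate, given reputation s."""
    return 1.0 / (1.0 + math.exp(-gamma * (s - k)))

# At the threshold the logistic rule is undecided; far from it and with a
# large gamma it is practically identical to the step rule.
print(cooperation_probability(0.0))
print(cooperation_probability(2.0, gamma=50.0), heaviside(2.0))
print(cooperation_probability(-2.0, gamma=50.0), heaviside(-2.0))
```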
Individual strategy
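The four different outcomes have the following probabilities (rows: action of player i, columns: action of player j):
$$\begin{array}{c|cc} & C & D \\ \hline C & \alpha_{ij}\alpha_{ji} & \alpha_{ij}(1 - \alpha_{ji}) \\ D & (1 - \alpha_{ij})\alpha_{ji} & (1 - \alpha_{ij})(1 - \alpha_{ji}) \end{array}$$
Individual strategy
• +1 for ‘good’ actions, −1 for ‘bad’ actions to reputation.
• TFT-like: Consider CC and DC as good.
• We currently study WSLS-like: Consider CC and DD as good.
$$\begin{aligned} \Delta_i S_{ij}(t) &= \alpha_{ij}\alpha_{ji} + (1 - \alpha_{ij})(1 - \alpha_{ji}) - (1 - \alpha_{ij})\alpha_{ji} - \alpha_{ij}(1 - \alpha_{ji}) \\ &= (2\alpha_{ij} - 1)(2\alpha_{ji} - 1) \end{aligned}$$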
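The factorisation of the expected reputation change can be verified numerically. The sketch below computes the WSLS-like expectation from the four outcome probabilities and compares it with the closed form on a grid of values; the function names and the grid are assumptions for illustration.

```python
def expected_change_wsls(a_ij, a_ji):
    """Expected change: +1 for outcomes CC and DD, -1 for CD and DC."""
    p_cc = a_ij * a_ji
    p_cd = a_ij * (1 - a_ji)
    p_dc = (1 - a_ij) * a_ji
    p_dd = (1 - a_ij) * (1 - a_ji)
    return p_cc + p_dd - p_cd - p_dc

def closed_form(a_ij, a_ji):
    """Factorised form (2 a_ij - 1)(2 a_ji - 1)."""
    return (2 * a_ij - 1) * (2 * a_ji - 1)

grid = [i / 10 for i in range(11)]
max_error = max(
    abs(expected_change_wsls(a, b) - closed_form(a, b))
    for a in grid
    for b in grid
)
print(max_error)
```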
Gossiping
Who gossips?
• To whom should you gossip?
• What gossip should you trust?
• Pass on the gossip?
• Currently: no further spreading, talk to cooperative people.
Gossip about what?
• Gossip about reputation?
• Gossip about last interaction?
• Currently: last interaction.
Gossiping
Consider all neighbours k when updating the reputation $S_{ij}$.
[Figure: triangle of agents i, j and k. The link from i to j is the link to be updated. The links between i and k are annotated ‘Does i ‘like’ k?’ and ‘Will k gossip to i?’; the link from j to k is annotated ‘What action has j taken to k?’]
Change in reputation after gossiping
$$\Delta_g S_{ij}(t) = \sum_{k \ne i,j} \alpha_{ki}(2\alpha_{ik} - 1)(2\alpha_{jk} - 1)$$
Reputation dynamics
Reputation
• Combine change from individual strategy and from gossiping.
• Balance the two changes with a ‘social influence’ parameter $0 \le \lambda \le 1$.
$$\Delta S_{ij}(t) = (1 - \lambda)\underbrace{(2\alpha_{ij} - 1)(2\alpha_{ji} - 1)}_{\text{Individual strategy}} + \lambda \underbrace{\sum_{k \ne i,j} \alpha_{ki}(2\alpha_{ik} - 1)(2\alpha_{jk} - 1)}_{\text{Gossip influence}}$$
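The combined update can be sketched as a single function over a matrix of cooperation probabilities. Here `alpha[a][b]` is assumed to hold the probability that agent a cooperates with agent b (diagonal entries are never used); the function name and the example population are hypothetical.

```python
def reputation_change(alpha, i, j, lam):
    """Expected change of S_ij from own interaction and from gossip."""
    n = len(alpha)
    # Own observation: the WSLS-like assessment of the interaction with j.
    individual = (2 * alpha[i][j] - 1) * (2 * alpha[j][i] - 1)
    # Gossip: every third agent k reports to i with probability alpha[k][i],
    # weighted by how i regards k and by how j treated k.
    gossip = sum(
        alpha[k][i] * (2 * alpha[i][k] - 1) * (2 * alpha[j][k] - 1)
        for k in range(n)
        if k not in (i, j)
    )
    return (1 - lam) * individual + lam * gossip

# Four agents who all cooperate with each other with certainty.
alpha = [[1.0] * 4 for _ in range(4)]
print(reputation_change(alpha, 0, 1, 0.0))
print(reputation_change(alpha, 0, 1, 0.5))
```

With full mutual cooperation the individual term equals 1 and each of the two remaining agents contributes 1 to the gossip term, so the change is (1 − λ) + 2λ.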
Analytics
Obtain differential equation
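• Assume that for an interval $\Delta t < 1$, the probability to interact is $\Delta t$.
• Then we can take the limit $\lim_{\Delta t \to 0} \Delta S_{ij}(t)/\Delta t$.
• The derivative $\dot{S}_{ij}$ can be written in terms of $\alpha_{ij}$; we obtain
$$\dot{S}_{ij} = \frac{\dot{\alpha}_{ij}}{\gamma (1 - \alpha_{ij})\alpha_{ij}}$$
Differential equation becomes (with rescaled time $\tau = \gamma t$)
$$\dot{\alpha}_{ij} = \alpha_{ij}(1 - \alpha_{ij})\Bigl[(1 - \lambda)(2\alpha_{ij} - 1)(2\alpha_{ji} - 1) + \lambda \sum_{k \ne i,j} \alpha_{ki}(2\alpha_{ik} - 1)(2\alpha_{jk} - 1)\Bigr]$$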
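The step from the reputation S to the cooperation probability α is an application of the chain rule to the logistic function defined earlier. A short worked version, using only symbols from the slides:

```latex
\begin{aligned}
\alpha_{ij} &= \frac{1}{1 + e^{-\gamma (S_{ij} - k)}}, \\
\frac{d\alpha_{ij}}{dS_{ij}}
  &= \frac{\gamma\, e^{-\gamma (S_{ij} - k)}}{\bigl(1 + e^{-\gamma (S_{ij} - k)}\bigr)^{2}}
   = \gamma\, \alpha_{ij} (1 - \alpha_{ij}), \\
\frac{d\alpha_{ij}}{dt}
  &= \gamma\, \alpha_{ij} (1 - \alpha_{ij})\, \frac{dS_{ij}}{dt}
  \quad\Longrightarrow\quad
  \frac{d\alpha_{ij}}{d\tau} = \alpha_{ij} (1 - \alpha_{ij})\, \frac{dS_{ij}}{dt},
  \qquad \tau = \gamma t .
\end{aligned}
```

Substituting the expected reputation change for $dS_{ij}/dt$ gives the differential equation for $\alpha_{ij}$.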
No gossip
No gossip
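• When gossip is not present, the differential equation is simple:
$$\dot{\alpha}_{ij} = \alpha_{ij}(1 - \alpha_{ij})(2\alpha_{ij} - 1)(2\alpha_{ji} - 1)$$
• Only dependent on $\alpha_{ij}$ and $\alpha_{ji}$.
• Only stable fixed point: $\alpha^*_{ij} = \alpha^*_{ji} = 1$.
[Figure: plot on the unit square, both axes running from 0.0 to 1.0.]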
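A quick numerical experiment illustrates the approach to mutual cooperation. The sketch below integrates the two coupled equations (the one above and its mirror image for the pair j, i) with forward Euler steps. The initial values, step size and function names are assumptions, and the start is chosen with both probabilities above 1/2, where both derivatives are positive.

```python
def no_gossip_step(a_ij, a_ji, dt):
    """One forward Euler step of the no-gossip dynamics for a pair of agents."""
    coupling = (2 * a_ij - 1) * (2 * a_ji - 1)
    d_ij = a_ij * (1 - a_ij) * coupling
    d_ji = a_ji * (1 - a_ji) * coupling
    return a_ij + dt * d_ij, a_ji + dt * d_ji

def integrate(a_ij, a_ji, dt=0.01, steps=5000):
    """Iterate the Euler step from the given initial probabilities."""
    for _ in range(steps):
        a_ij, a_ji = no_gossip_step(a_ij, a_ji, dt)
    return a_ij, a_ji

final = integrate(0.6, 0.7)
print(final)
```

Other starting points need not end in mutual cooperation: whenever one of the two probabilities equals 1/2 both derivatives vanish, so the system can also come to rest away from the corners.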
Stability of fixed points
Two classes of fixed points
• Let $S_n$ be the unit hypercube of dimension n.
• The first class of fixed points consists of the corners of $S_n$.
• That is, $\alpha^*_{ij} \in \{0, 1\}$ for all ij.
• The second class lies outside the corners (internal points).
• That is, there is at least one $\alpha^*_{ij} \notin \{0, 1\}$.
Stability of points
• Points in the corners are easily classified as (un)stable.
• Internal points are more difficult.
• It seems that most internal points are non-hyperbolic.
• Possibly some (limit) cycles may exist.
Corner points
Corner points
• All corner points are fixed points.
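• Jacobian of $\dot{\alpha} = F(\alpha)$, evaluated at $\alpha^*$, defined as
$$\nabla F = \left.\begin{pmatrix} \dfrac{\partial f_{12}}{\partial \alpha_{12}} & \cdots & \dfrac{\partial f_{12}}{\partial \alpha_{n(n-1)}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_{n(n-1)}}{\partial \alpha_{12}} & \cdots & \dfrac{\partial f_{n(n-1)}}{\partial \alpha_{n(n-1)}} \end{pmatrix}\right|_{\alpha^*}$$
• For corner points, only $\partial f_{ij}/\partial \alpha_{ij}$ is non-zero.
Condition for stability in corners:
$$(1 - 2\alpha^*_{ij})\left[(1 - \lambda)(2\alpha^*_{ij} - 1)(2\alpha^*_{ji} - 1) + \lambda (k^+_{ij} - k^-_{ij})\right] < 0$$
where $k^\pm_{ij}$ is the number of matches/differences between i and j.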
Stable groups
Groups
• One special case of corner points
• Cooperate within group, defect between groups
• Working out stability conditions gives
$$n_c > \frac{1}{\lambda}$$
• Social influence λ induces a lower bound on group size.
Invasion from AllD
AllD
• Suppose system in equilibrium α∗ = (1, 1, . . . , 1).
• Add a number of defectors (AllD).
• Relationships between gossiping cooperators are unaffected.
• Only reputation of defector changes.
New reputation equilibrium
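• Let i be a cooperator, and j a defector, then
$$\dot{\alpha}_{ij} = \alpha_{ij}(1 - \alpha_{ij})\left[(1 - \lambda)(1 - 2\alpha_{ij}) - \lambda(n_c - 1)\right]$$
• Stable fixed point $\alpha^*_{ij} = \dfrac{1 - \lambda n_c}{2(1 - \lambda)}$ exists if $n_c < 1/\lambda$ (otherwise 0).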
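The interior fixed point is the value of α at which the bracket in the equation above vanishes. The sketch below computes it and checks this numerically; the parameter values and function names are illustrative assumptions, with 0 < λ < 1.

```python
def defector_reputation(lam, n_c):
    """Equilibrium probability that a cooperator helps a defector."""
    if n_c >= 1 / lam:
        return 0.0  # large groups: no cooperation with defectors
    return (1 - lam * n_c) / (2 * (1 - lam))

def bracket(alpha, lam, n_c):
    """Bracketed factor of the differential equation for alpha_ij."""
    return (1 - lam) * (1 - 2 * alpha) - lam * (n_c - 1)

small_group = defector_reputation(0.1, 5)  # n_c < 1 / lambda = 10
large_group = defector_reputation(0.5, 3)  # n_c > 1 / lambda = 2
print(small_group, bracket(small_group, 0.1, 5))
print(large_group, bracket(large_group, 0.5, 3))
```

In the large group the bracket is negative at α = 0, so any small probability of helping the defector shrinks back to zero.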
Invasion from AllD
• In equilibrium, expected payoff $A_{cc}$ of cooperator vs. itself is
$$(b - c)\,\frac{n_c(n_c - 1)}{n^2}$$
• Expected payoff $A_{dc}$ of defector vs. cooperator is
$$b\,\frac{1 - \lambda n_c}{2(1 - \lambda)}\,\frac{n_c n_d}{n^2}$$
• Condition $A_{cc} > A_{dc}$ reduces to
$$1 - \frac{(1 - \lambda n_c)\,n_d}{2(1 - \lambda)(n_c - 1)} > \frac{c}{b}$$
• Since $c/b < 1$, AllD cannot invade if the left-hand side exceeds 1. This reduces to
$$n_c > \frac{1}{\lambda}$$
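The reduction of the payoff comparison can be checked numerically. The sketch below evaluates both payoffs directly and compares the outcome with the reduced inequality over a small grid; all parameter values and function names are assumptions, and the equilibrium probability of helping a defector is set to 0 when $n_c > 1/\lambda$, as stated for the fixed point.

```python
def payoffs(b, c, lam, n_c, n_d):
    """Expected payoffs A_cc and A_dc for n_c cooperators and n_d defectors."""
    n = n_c + n_d
    # Equilibrium probability that a cooperator helps a defector (0 if negative).
    alpha = max(0.0, (1 - lam * n_c) / (2 * (1 - lam)))
    a_cc = (b - c) * n_c * (n_c - 1) / n**2
    a_dc = b * alpha * n_c * n_d / n**2
    return a_cc, a_dc

def reduced_condition(b, c, lam, n_c, n_d):
    """Reduced form of A_cc > A_dc (requires n_c > 1)."""
    lhs = 1 - (1 - lam * n_c) * n_d / (2 * (1 - lam) * (n_c - 1))
    return lhs > c / b

b, c = 2.0, 1.0
results = []
for lam in (0.1, 0.3):
    for n_c in (2, 3, 5, 12):
        for n_d in (1, 4):
            a_cc, a_dc = payoffs(b, c, lam, n_c, n_d)
            results.append((a_cc > a_dc, reduced_condition(b, c, lam, n_c, n_d)))
print(results)
```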
Invasion from AllD
Group size
• Two regimes of behavior: $n_c < 1/\lambda$ and $n_c > 1/\lambda$.
• In the first regime, there is some cooperation with defectors.
• The amount of cooperation decreases with group size $n_c$ and social influence λ.
• In the second regime, defectors can never invade.
• But by the earlier stability condition for groups, $n_c > 1/\lambda$.
• So groups are always stable against invasion from AllD.