Download pdf - Attainability in Repeated Games with Vector Payoffsbagagiol/Talk-Trento.pdf · 2013. 4. 20. · Lehrer, Solan, Bauso, \Repeated games over networks with vector payo s: the notion

Set-upMotivations

Model and results

Attainability in Repeated Games with VectorPayoffs

Ehud Lehrer1 Eilon Solan1 Dario Bauso2

1Tel Aviv University

2University of Palermo

University of Trento, 17 April 2013

Ehud Lehrer, Eilon Solan, Dario Bauso Attainability in repeated games

Set-upMotivations

Model and results

Outline

Set-upWhat we know (approachability)What we do (attainability)

Motivations

Model and resultsResults


Set-upMotivations

Model and results

What we know (approachability)What we do (attainability)

Repeated game with vector payoffs

I Two players play repeated game

I vector payoffs are d-dimensional

I the total payoff up to stage t is x(t)

I example for d = 2 (in discrete-time)(6, 7) (1, 7) (6, 2) (1, 2)

(6,−4) (1,−4) (6,−9) (1,−9)(−3,−1) (−8,−1) (−3,−6) (−8,−6)(−3, 10) (−8, 10) (−3, 5) (−8, 5)


Set-upMotivations

Model and results


Repeated game with vector payoffs: stage 1




I example for d = 2 (in discrete-time)(6,7) (1, 7) (6, 2) (1, 2)

(6,−4) (1,−4) (6,−9) (1,−9)(−3,−1) (−8,−1) (−3,−6) (−8,−6)(−3, 10) (−8, 10) (−3, 5) (−8, 5)

I t = 1: (Top,Left) → x(1) = (6, 7)


Set-upMotivations

Model and results






I example for d = 2 (in discrete-time)(6, 7) (1, 7) (6, 2) (1, 2)

(6,−4) (1,−4) (6,−9) (1,−9)(−3,−1) (−8,−1) (−3,−6) (−8,−6)(−3, 10) (−8, 10) (−3, 5) (-8,5)

I t = 1: (Top,Left) → x(1) = (6, 7)I t = 2: (T,L)-(Bottom,Right) → x(2) = (−2, 12)


Set-upMotivations

Model and results






I example for d = 2 (in discrete-time)(6,7) (1, 7) (6, 2) (1, 2)

(6,−4) (1,−4) (6,−9) (1,−9)(−3,−1) (−8,−1) (−3,−6) (−8,−6)(−3, 10) (−8, 10) (−3, 5) (−8, 5)

I t = 1: (Top,Left) → x(1) = (6, 7)I t = 2: (T,L)-(Bottom,Right) → x(2) = (−2, 12)I t = 3: (T,L)-(B,R)-(Top,Left) → x(3) = (4, 19)


Set-upMotivations

Model and results


Approachability [Blackwell, ’56]

A set of payoff vectors A is approachable by P1 if she has astrategy such that the average payoff up to stage t, x̄(t) :=x(t)t , converges to A, regardless of the strategy of P2.

Blackwell, D. (1956b) “An Analogof the MinMax Theorem For VectorPayoffs,” Pacific J. of Math., 6, 1-8.


Set-upMotivations

Model and results


Example [Solan, Maschler, and Zamir, 2010]

I C1 approachable when P1 plays T in every stageI C2 approachable when P1 plays B in every stageI C3 approachable when P1 plays{

B if x̄1(t− 1) + x̄2(t− 1) < 1T otherwise


Set-upMotivations

Model and results


Geometric insight

x̄(t)

x̄(t + 1)

x(t + 1)

A

H+

H−

R1(p)

A approachable if for any x̄(t) ∈ H−, ∃ p s.t. R1(p) ⊂ H+.

I y(t) is projection of x̄(t) onto AI take supporting hyperplane (dashed line) for A in y(t)

H = {z ∈ Rd| 〈z − y(t), x̄(t)− y(t)〉 = 0}I R1(p) set of payoff vectors when P1 plays mixed strategy p.


kalisperaSticky NoteShow paper by Blackwell

Set-upMotivations

Model and results


just a few references

I Extension to infinite dimensional spaces��

�

Lehrer, “Approachability in infinite dimensional spaces,”IJGT, 31(2) 2002

I Connections to differential games��

�

Soulaimani, Quincampoix, Sorin “Repeated Games andQualitative Differential Games...” SICON 48, 2009

I continuous-time approachability��

�

Hart, Mas-Colell, “Regret-based continuous-time dynam-ics”, Games and Economic Behavior 45, 2003


kalisperaSticky NoteShow paper by Sorin

Set-upMotivations

Model and results


The notion of attainability

A set of payoff vectors A is attainable by P1 if she hasa strategy such that the total payoff up to stage t, x(t),“converges” to A, regardless of the strategy of P2.

Lehrer, Solan, Bauso, “Repeated games over networks withvector payoffs: the notion of attainability” NetGCooP 2011(journal version at arXiv:1201.6054v1).


Set-upMotivations

Model and results

Background and historical notes


Set-upMotivations

Model and results

Motivation: Control Theory

I Uncontrolled flows/demand w(t) ∈ W, ∀t,I Controlled flows/supply u(t) ∈ U , ∀t.I buffer (excess supply) dynamics:

ẋ(t) = Bu(t)− w(t), x(0) = ζ

I B incidence matrix.

[Bauso, Blanchini, Pesenti, Automatica, 2006], [Bauso, Blanchini, Pesenti, TAC, 2010]


Set-upMotivations

Model and results

Robust stabilizability

I We need to bound total excess supply x(t) =∫ t0 ẋ(τ)dτ + ζ

reinterpret as continuous time repeated game whereI P1 plays u(t) and P2 plays w

I ẋ(t) is instantaneous payoff

I x(t) is total payoff up to time t


Set-upMotivations

Model and results

back to first example

f1

f2

f3

w1

w2

u(t) ∈

1−2

6

, 1−2−5

, −51−5

, −51

6

w(t) ∈

{[−3−3

],

[2−3

],

[−32

],

[22

]}

[ẋ1(t)ẋ2(t)

]=

[1 −1 00 1 1

] u1(t)u2(t)u3(t)

− [ w1(t)w2(t)

]

(6, 7) (1, 7) (6, 2) (1, 2)(6,−4) (1,−4) (6,−9) (1,−9)

(−3,−1) (−8,−1) (−3,−6) (−8,−6)(−3, 10) (−8, 10) (−3, 5) (−8, 5)


Set-upMotivations

Model and results

simulation


Set-upMotivations


The Model

I A repeated game (A1, A2,g)

I Ai action space of PiI g : A1 ×A2 → [−1, 1]d is d-dimensional payoffI (ati)t∈R+ is non-anticipating behavior strategy for player i

I (ati)t∈R+ takes values in ∆(Ai)I ∃ increasing sequence of times τ1i < τ2i < τ3i < . . . s.t. ati is

measurable w.r.t. the information τki , τki ≤ t < τk+1i .

T

Bτ 1i τ 2i τ

3i

∆(Ai)(12,

12)

(1, 0)

(13,23)


kalisperaHighlightSee formal definition in the paper

Set-upMotivations


Attainability: definition

I gt =payoff at time t given the mixed actions of the players

I x(t) =∫ tτ=0 gτ (mixed action pairs at time τ)dτ

Def.: A set A in Rd is attainable by P1 if there is T > 0 such thatfor every � > 0 there is a strategy σ1 of P1 s.t.

dist(x(t)[σ1, σ2], A) ≤ �, ∀t ≥ T, ∀σ2.

�

B(A, �)

A B(A, �) := {z : dist(z,A) ≤ �}


Set-upMotivations


Main results in a nutshell

Thm. 1: The following conditions are equivalent.

B1 Vector ~0 ∈ Rd is attainable by P1;B2 vλ ≥ 0 for every λ ∈ Rd.

Thm. 2: Vector z ∈ Rd (6= ~0) is attainable by P1 ⇔B1 The vector ~0 ∈ Rd is attainable by P1and either one between B3 and B4 holds (yet to be introduced)

Thm. 3: The following statements are equivalent:

C1 vλ > 0 for every λ ∈ Rd;C2 Every vector z ∈ Rd is attainable by player 1.


Set-upMotivations


Value of projected game

x(τk1 )

x(τk+11 )0

H+

H−

R1(p)

λ

I vλ > 0 for every λ ∈ Rd.((#,#) (#,#)(#,#) (#,#)

)⇒(〈λ, (#,#)〉〈λ, (#,#)〉〈λ, (#,#)〉〈λ, (#,#)〉

)I vλ > 0 is value of projected game


Set-upMotivations


Theorem 2

Vector z ∈ Rd (6= ~0) is attainable by P1 ⇔B1 The vector ~0 ∈ Rd is attainable by P1B3 for every function f : ∆(A1)→ ∆(A2), vector z is in

Cone(f) :={y ∈ Rd| y =

∑p∈A1

αpg(p, f(p)) : αp ≥ 0 ∀p}


Set-upMotivations


Sketch of proof

Cone(f )

z

T

Bb

a

I P1 plays mixed action p ∈ ∆{T,B} and P2 plays f(p),I payoff x(τ11 ) belongs to segment ab


Set-upMotivations


Sketch of proof: stage τ 11

Cone(f )

z

T

Bx(τ11 )

d

c

I suppose P1 plays B in interval 0 ≤ t ≤ τ11I payoff x(τ21 ) belongs to segment cd


Set-upMotivations



Cone(f )

z

T

Bx(τ11 )

x(τ21 ) f

e

I suppose P1 plays T in interval τ11 ≤ t ≤ τ21

I payoff x(τ31 ) belongs to segment ef


Set-upMotivations



Cone(f )

z

T

Bx(τ11 )

x(τ21 )

x(τ31 )

I suppose P1 plays12 T and

12 B in interval τ

21 ≤ t ≤ τ31

I get to x(τ31 ) very close to z


Set-upMotivations


Sketch of proof: Necessity of 1.

Cone(f )

z

T

Bx(τ11 )

x(τ21 )

x(τ31 )

In the game from τ31 to∞ total payoff has to be close to zero- true only if ~0 is attainable


Set-upMotivations


Sketch of proof: Necessity of 2.

Cone(f )

z

T

Bx(τ11 )

x(τ21 )

x(τ31 )

all possible trajectories {x(τk1 )}k=0,...,∞ are within Cone(f)


Set-upMotivations


Monte Carlo simulations


Set-upMotivations


sampled average of dist(z, x(t))

0 1 2 3 4 50

0.5

1

1.5

2

2.5

3

3.5sa

mpl

ed a

vera

ge d

(γT(σ

),x)

time T


Set-upMotivations


Conclusions and further questions

I Games with vector payoffs:

I approachability looked at average payoffI attainability looks at total payoff

I necessary and sufficient conditions for attainability

I Still to be done:I characterization of attainable setsI look at discounted payoff


Set-upWhat we know (approachability)What we do (attainability)

MotivationsModel and resultsResults