
Computing Nash Equilibrium


Page 1: Computing Nash Equilibrium


Computing Nash Equilibrium

Presenter: Yishay Mansour

Page 2: Computing Nash Equilibrium


Outline

• Problem Definition

• Notation

• Today: Zero-Sum game

• Next week: General Sum Games
– Multiple players

Page 3: Computing Nash Equilibrium


Model

• Multiple players $N = \{1, \ldots, n\}$

• Strategy set
– Player $i$ has $m$ actions $S_i = \{s_{i1}, \ldots, s_{im}\}$
– $S_i$ are the pure actions of player $i$
– $S = \times_i S_i$

• Payoff functions
– Player $i$: $u_i : S \to \mathbb{R}$

Page 4: Computing Nash Equilibrium


Strategies

• Pure strategies: actions

• Mixed strategy
– Player $i$: $p_i$ is a distribution over $S_i$
– Game: $P = \times_i p_i$
  • Product distribution

• Modified distribution
– $P_{-i}$ = the distribution $P$ with player $i$ left out
– $(q, P_{-i})$ = player $i$ plays $q$, every other player $j$ plays $p_j$

Page 5: Computing Nash Equilibrium


Notations

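• Average Payoff
– Player $i$: $u_i(P) = E_{s \sim P}[u_i(s)] = \sum_{s \in S} P(s)\, u_i(s)$
– $P(s) = \prod_i p_i(s_i)$

• Nash Equilibrium
– $P^*$ is a Nash equilibrium if, for every player $i$
– and for any distribution $q_i$:
– $u_i(q_i, P^*_{-i}) \le u_i(P^*)$
  • Best response: $p^*_i$ is a best response to $P^*_{-i}$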

Page 6: Computing Nash Equilibrium


Notations

• Alternative payoff
– $x_{ij}(P) = u_i(s_{ij}, P_{-i}) = E_{s \sim P}[u_i(s) \mid s_i = s_{ij}]$

• Difference in payoff
– $z_{ij}(P) = x_{ij}(P) - u_i(P)$

• Improvement in payoff
– $g_{ij}(P) = \max\{z_{ij}(P), 0\}$
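
The three quantities above can be computed directly from their definitions. The following is a minimal sketch, assuming a game is given as one payoff function per player over pure action profiles (tuples of action indices); the names `expected_payoff` and `improvements` and the matching-pennies payoffs are illustrative choices, not part of the slides.

```python
from itertools import product

def expected_payoff(u_i, P):
    """u_i(P): sum over pure profiles s of P(s) * u_i(s), with P(s) = prod_k p_k(s_k)."""
    total = 0.0
    for s in product(*(range(len(p)) for p in P)):
        prob = 1.0
        for k, s_k in enumerate(s):
            prob *= P[k][s_k]
        total += prob * u_i(s)
    return total

def improvements(u, P):
    """Return g[i][j] = max(x_ij(P) - u_i(P), 0) for every player i and action j."""
    g = []
    for i, p_i in enumerate(P):
        base = expected_payoff(u[i], P)              # u_i(P)
        row = []
        for j in range(len(p_i)):
            pure = [1.0 if a == j else 0.0 for a in range(len(p_i))]
            deviated = P[:i] + [pure] + P[i + 1:]    # (s_ij, P_-i)
            x_ij = expected_payoff(u[i], deviated)   # alternative payoff
            row.append(max(x_ij - base, 0.0))        # improvement g_ij
        g.append(row)
    return g

# Assumed example: matching pennies. Player 0 wins on a match, player 1 on a mismatch.
u = [lambda s: 1.0 if s[0] == s[1] else -1.0,
     lambda s: -1.0 if s[0] == s[1] else 1.0]

g_uniform = improvements(u, [[0.5, 0.5], [0.5, 0.5]])
g_skewed = improvements(u, [[0.5, 0.5], [1.0, 0.0]])
```

With both players uniform, every improvement is zero, which is the Nash condition. When player 1 always plays action 0, player 0 can gain by switching to action 0, so `g_skewed[0][0]` is positive.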

Page 7: Computing Nash Equilibrium


Fixed point Theorems

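• Intermediate Value Theorem
– domain $[a,b]$
– function $f$ continuous
– $f(a)\, f(b) < 0$
– there exists $z$ such that $f(z) = 0$
– Proof: $M^+ = \{x \mid f(x) \ge 0\}$, $M^- = \{x \mid f(x) \le 0\}$
– both are nonempty closed sets that together cover the connected interval $[a,b]$, so they intersect; any $z$ in the intersection has $f(z) = 0$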

Page 8: Computing Nash Equilibrium


Brouwer’s Fixed point theorem

• $f : S \to S$ continuous, $S$ compact and convex

• There exists $z$ in $S$: $z = f(z)$
– For $S = [0,1]$ this follows from the previous theorem, applied to $f(x) - x$
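
In one dimension the argument is constructive: bisection on $h(x) = f(x) - x$ finds the fixed point. The sketch below assumes $f$ is continuous and maps $[0,1]$ into itself; the function name `fixed_point_1d` and the choice of `math.cos` as the test function are illustrative only.

```python
import math

def fixed_point_1d(f, tol=1e-12):
    """Find z in [0, 1] with f(z) = z by bisection on h(x) = f(x) - x."""
    def h(x):
        return f(x) - x

    lo, hi = 0.0, 1.0
    if h(lo) == 0.0:
        return lo
    if h(hi) == 0.0:
        return hi
    # Since f maps [0, 1] into [0, 1], h(lo) > 0 and h(hi) < 0 here.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if h(mid) > 0.0:
            lo = mid      # sign change lies to the right of mid
        else:
            hi = mid      # sign change lies at or to the left of mid
    return (lo + hi) / 2.0

z = fixed_point_1d(math.cos)   # cos maps [0, 1] into [cos 1, 1]
```

No comparable search works in higher dimensions, which is one reason computing a Nash equilibrium is harder than proving that one exists.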

Page 9: Computing Nash Equilibrium

Kakutani’s Fixed Point Theorem

• $L : S \to S$ is a correspondence (each $L(x)$ is a subset of $S$)
– $L(x)$ is a convex set
– $L$ semi-continuous
– $S$ compact and convex

• There exists $z$: $z$ in $L(z)$

Page 10: Computing Nash Equilibrium


Nash Equilibrium I

• Best response correspondence
– $L(P) = \arg\max_{Q} \{u_i(q_i, P_{-i})\}$, where each player $i$ maximizes over its own $q_i$
– $L$ is a correspondence, continuous
– Nash is a fixed point of $L$
  • $P^*$ in $L(P^*)$
– Kakutani’s fixed point theorem

Page 11: Computing Nash Equilibrium


Nash Equilibrium II

• Fixed point
– $K(P)$ has $m \cdot n$ parameters ($m$ actions for each of the $n$ players)
– $K_{ij}(P) = \dfrac{p_{ij} + g_{ij}(P)}{1 + \sum_k g_{ik}(P)}$
– Nash is a fixed point of $K$
  • $P^* = K(P^*)$
– Original proof of Nash
– Continuous function on a compact space
  • Brouwer’s fixed point theorem
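
To make the map concrete, here is a minimal sketch of one application of $K$ for a two-player game, assuming payoff matrices `A` (row player) and `B` (column player) given as lists of lists. Each player's new probabilities are $(p_{ij} + g_{ij}(P))$ divided by $1 + \sum_k g_{ik}(P)$, so they still sum to one. The function name `nash_map` is an assumption for illustration.

```python
def nash_map(A, B, p, q):
    """One application of K to the mixed strategies (p, q) of a bimatrix game."""
    m, n = len(A), len(A[0])
    # Alternative payoffs of each pure action against the opponent's mix.
    x1 = [sum(A[i][j] * q[j] for j in range(n)) for i in range(m)]
    x2 = [sum(p[i] * B[i][j] for i in range(m)) for j in range(n)]
    # Current expected payoffs u_1(P), u_2(P).
    u1 = sum(p[i] * x1[i] for i in range(m))
    u2 = sum(q[j] * x2[j] for j in range(n))
    # Improvements g_ij(P) = max(x_ij(P) - u_i(P), 0).
    g1 = [max(x - u1, 0.0) for x in x1]
    g2 = [max(x - u2, 0.0) for x in x2]
    new_p = [(p[i] + g1[i]) / (1.0 + sum(g1)) for i in range(m)]
    new_q = [(q[j] + g2[j]) / (1.0 + sum(g2)) for j in range(n)]
    return new_p, new_q

# Assumed example: matching pennies.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
fixed = nash_map(A, B, [0.5, 0.5], [0.5, 0.5])   # stays at the equilibrium
moved = nash_map(A, B, [1.0, 0.0], [1.0, 0.0])   # player 2 shifts toward action 1
```

The map certifies equilibria (they are exactly its fixed points), but iterating it is not guaranteed to converge, so this is a characterization rather than an algorithm.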

Page 12: Computing Nash Equilibrium


Nash Equilibrium III

• Non-linear complementarity problem (NCP)
– Recall $z_{ij}(P)$
– For every player $i$ and action $s_{ij}$:
  • $z_{ij}(P) \cdot p_{ij} = 0$
  • $z_i(P)$ is orthogonal to $p_i$
– Nash: $z(P^*) \le 0$
  • $z_{ij}(P^*) \le 0$

Page 13: Computing Nash Equilibrium


Nash Equilibrium IV

• Stationary point problem
– Recall: $x$ = alternative payoff
– Nash: $P^*$
– For every $P$
– $(P - P^*) \cdot x(P^*) \le 0$
  • $\sum_i \sum_j (p_{ij} - p^*_{ij})\, x_{ij}(P^*) \le 0$

Page 14: Computing Nash Equilibrium


Nash Equilibrium V

• Minimizing a function
– Objective function:
– $V(P) = \sum_i \sum_j [g_{ij}(P)]^2$
– $V(P)$ is a continuous, differentiable, non-negative function
– Nash: $V(P^*) = 0$
  • Local minima

Page 15: Computing Nash Equilibrium


Nash Equilibrium VI

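• Semi-Algebraic set
– distribution $P$: $\sum_j p_{ij} = 1$
– difference in payoff:
  • $z_{ij}(P) \le 0$
  • $z_{ij}(P) = x_{ij}(P) - u_i(P) \le 0$
  • Explicitly, with $S_{-i} = \times_{k \ne i} S_k$:

$$z_{ij}(P) = \sum_{s_{-i} \in S_{-i}} u_i(s_{ij}, s_{-i}) \prod_{k \ne i} p_k(s_k) \;-\; \sum_{s \in S} u_i(s) \prod_{k} p_k(s_k) \;\le\; 0$$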

Page 16: Computing Nash Equilibrium


Two player games

• Payoff matrices $(A, B)$
– $m$ rows and $n$ columns
– player 1 has $m$ actions, player 2 has $n$ actions

• strategies $p$ and $q$

• Payoffs: $u_1(p,q) = p A q^T$ and $u_2(p,q) = p B q^T$

• Zero sum game
– $A = -B$

Page 17: Computing Nash Equilibrium


Linear Programming

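• Primal LP:

• $x$ in $SET_{primal}$ is feasible

• maximize $\langle c, x \rangle$ subject to $x$ in $SET_{primal}$

$$SET_{primal} = \left\{ x \in \mathbb{R}^n \;:\; \begin{array}{ll} \sum_j a_{ij} x_j = b_i & \text{(equality constraints } i\text{)} \\ \sum_j a_{ij} x_j \le b_i & \text{(inequality constraints } i\text{)} \\ x_j \ge 0 & \text{(sign-constrained variables } j\text{)} \end{array} \right\}$$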

Page 18: Computing Nash Equilibrium


Linear Programming

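• Dual LP:

• $y$ in $SET_{dual}$ is feasible

• minimize $\langle b, y \rangle$ subject to $y$ in $SET_{dual}$

$$SET_{dual} = \left\{ y \in \mathbb{R}^m \;:\; \begin{array}{ll} \sum_i y_i a_{ij} = c_j & \text{(columns } j \text{ of free primal variables)} \\ \sum_i y_i a_{ij} \ge c_j & \text{(columns } j \text{ with } x_j \ge 0\text{)} \\ y_i \ge 0 & \text{(rows } i \text{ of primal inequality constraints)} \end{array} \right\}$$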

Page 19: Computing Nash Equilibrium


Duality Theorem

• Weak duality: $\langle c, x \rangle \le \langle b, y \rangle$
– for any feasible $x$ and $y$
– proof!

• Strong Duality
– If there are feasible solutions then
– $\langle c, x \rangle = \langle b, y \rangle$ for some feasible $x$ and $y$
– sketch of proof.
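
The weak duality proof is a two-step chain of inequalities. The derivation below is a sketch under a simplifying assumption not stated on the slide: every primal constraint is an inequality $\sum_j a_{ij} x_j \le b_i$ with $x_j \ge 0$, so every dual constraint is $\sum_i y_i a_{ij} \ge c_j$ with $y_i \ge 0$.

$$
\begin{aligned}
\langle c, x \rangle = \sum_j c_j x_j
  &\le \sum_j \Big( \sum_i y_i a_{ij} \Big) x_j
  && \text{(dual feasibility and } x_j \ge 0\text{)} \\
  &= \sum_i y_i \Big( \sum_j a_{ij} x_j \Big)
  && \text{(exchange the order of summation)} \\
  &\le \sum_i y_i b_i = \langle b, y \rangle
  && \text{(primal feasibility and } y_i \ge 0\text{)}
\end{aligned}
$$

Equality constraints and free variables are handled the same way: the corresponding step holds with equality instead of relying on a sign.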

Page 20: Computing Nash Equilibrium


Two players zero sum

• Fix strategy $q$ of player 2,

• player 1 best response:
– maximize $p\,(A q^T)$ such that $\sum_j p_j = 1$ and $p_j \ge 0$
– dual LP: minimize $u$ such that $u \ge A q^T$ (in every coordinate)

• Player 2: select strategy $q$:
– minimize $u$ such that $u \ge A q^T$ and $\sum_i q_i = 1$ and $q_i \ge 0$
– dual (strategy for player 1)
– maximize $v$ such that $v \le pA$ (in every coordinate), $\sum_j p_j = 1$ and $p_j \ge 0$

• There exists a unique value $v$.
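
The two LPs give a simple optimality certificate: any $p$ guarantees player 1 at least $\min_j (pA)_j$, any $q$ holds player 1 to at most $\max_i (A q^T)_i$, and when the two numbers meet both strategies are optimal. The sketch below only checks a candidate pair; it does not solve the LP. The function name `security_levels` and the 2x2 matrix are assumptions for illustration.

```python
def security_levels(A, p, q):
    """Return (v_low, u_high) for the zero-sum game with payoff matrix A.

    v_low  = min_j (pA)_j    : what p guarantees player 1 (feasible v in the dual)
    u_high = max_i (Aq^T)_i  : the most q concedes (feasible u in the primal)
    """
    m, n = len(A), len(A[0])
    v_low = min(sum(p[i] * A[i][j] for i in range(m)) for j in range(n))
    u_high = max(sum(A[i][j] * q[j] for j in range(n)) for i in range(m))
    return v_low, u_high

# Assumed example game; its value is 1/5, attained at p = q = (2/5, 3/5).
A = [[2, -1],
     [-1, 1]]

v_low, u_high = security_levels(A, [0.4, 0.6], [0.4, 0.6])   # both about 0.2
loose = security_levels(A, [1.0, 0.0], [1.0, 0.0])           # bounds far apart
```

For the pure strategies the bounds are $-1$ and $2$, so neither is optimal. Weak duality guarantees `v_low <= u_high` for every pair.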

Page 21: Computing Nash Equilibrium


Example

Page 22: Computing Nash Equilibrium


Summary

• Two players zero sum
– linear programming
– polynomial time
– can have multiple Nash equilibria
– unique value!
– If $(p, q)$ and $(p', q')$ are Nash then
– $(p, q')$ and $(p', q)$ are Nash

Page 23: Computing Nash Equilibrium


Online learning

• Playing with unknown payoff matrix

• Online algorithm:
– at each step selects an action
  • can be stochastic or fractional
– Observes all possible payoffs
– Updates its parameters

• Goal: Achieve the value of the game
– Payoff matrix of the “game” is defined only at the end

Page 24: Computing Nash Equilibrium


Online learning - Algorithm

• Notations:
– Opponent distribution $Q_t$
– Our distribution $P_t$
– Observed cost $M(i, Q_t)$
  • In matrix notation, entry $i$ of $M Q_t$
– Goal: minimize cost

• Algorithm: Exponential weights
– Action $i$ has weight proportional to $b^{L(i,t)}$
– $L(i,t)$ = loss of action $i$ until time $t$

Page 25: Computing Nash Equilibrium


Online algorithm: Notations

• Formally:
– parameter: $b$, $0 < b < 1$
– $w_{t+1}(i) = w_t(i)\, b^{M(i, Q_t)}$
– $Z_t = \sum_i w_t(i)$
– $P_{t+1}(i) = w_{t+1}(i) / Z_{t+1}$
– Number of total steps $T$ is known
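
A single round of the update can be sketched as follows, assuming the loss matrix `M` is a list of rows (one per action of ours) and `Q` is the opponent's distribution over columns. The function name `exp_weights_step` is an illustrative choice; the new distribution is normalised by the sum of the updated weights.

```python
def exp_weights_step(w, M, Q, b):
    """Apply w_{t+1}(i) = w_t(i) * b ** M(i, Q_t) and return (w_{t+1}, P_{t+1})."""
    # M(i, Q_t): expected loss of each pure action i against Q_t.
    loss = [sum(row[j] * Q[j] for j in range(len(Q))) for row in M]
    new_w = [w_i * b ** l_i for w_i, l_i in zip(w, loss)]
    Z = sum(new_w)                       # normaliser for the new weights
    return new_w, [w_i / Z for w_i in new_w]

# Assumed example: two actions, the opponent plays column 0 with certainty.
M = [[0, 1],
     [1, 0]]
new_w, P_next = exp_weights_step([1.0, 1.0], M, [1.0, 0.0], b=0.5)
```

Action 1 suffered loss 1 and has its weight halved, so the next distribution puts probability 2/3 on action 0.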

Page 26: Computing Nash Equilibrium


Online algorithm: Theorem

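• Theorem
– For any matrix $M$ with entries in $[0,1]$
– Any sequence of distributions $Q_1, \ldots, Q_T$
– The algorithm generates $P_1, \ldots, P_T$ such that

$$\sum_{t=1}^{T} M(P_t, Q_t) \;\le\; \min_P \left[ \frac{\ln(1/b)}{1-b} \sum_{t=1}^{T} M(P, Q_t) + \frac{1}{1-b}\, RE(P \,\|\, P_1) \right]$$

– where $RE(A \,\|\, B) = E_{x \sim A}[\ln(A(x) / B(x))]$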

Page 27: Computing Nash Equilibrium


Online algorithm: Analysis

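• Lemma
– For any mixed strategy $P$

$$RE(P \,\|\, P_{t+1}) - RE(P \,\|\, P_t) \;\le\; \ln(1/b)\, M(P, Q_t) + \ln\big(1 - (1-b)\, M(P_t, Q_t)\big)$$

• Corollary (with $P_1$ uniform over the $n$ actions)

$$\sum_{t=1}^{T} M(P_t, Q_t) \;\le\; \frac{\ln(1/b)}{1-b} \min_P \sum_{t=1}^{T} M(P, Q_t) + \frac{\ln n}{1-b}$$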

Page 28: Computing Nash Equilibrium


Online Algorithm: Optimization

• $b = 1 / \big(1 + \sqrt{2 (\ln n) / T}\big)$

• Average loss: at most $v + O\big(\sqrt{(\ln n) / T}\big)$
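
Putting the pieces together, the sketch below runs exponential weights for $T$ rounds with the tuned $b$ against an adversary that best-responds each round, then compares the total loss with the bound $\frac{\ln(1/b)}{1-b} \min_P \sum_t M(P, Q_t) + \frac{\ln n}{1-b}$. The best-responding opponent, the function name `run_exp_weights`, and the matching-pennies loss matrix are assumptions made for the example.

```python
import math

def run_exp_weights(M, T):
    """Run exponential weights for T rounds against a best-responding opponent.

    Returns (total loss of the algorithm, loss of the best fixed action, b).
    """
    n = len(M)
    cols = range(len(M[0]))
    b = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n) / T))
    w = [1.0] * n                 # uniform starting weights, so P_1 is uniform
    cum = [0.0] * n               # sum over t of M(i, Q_t) for each action i
    total = 0.0                   # sum over t of M(P_t, Q_t)
    for _ in range(T):
        Z = sum(w)
        P = [w_i / Z for w_i in w]
        # Opponent picks the pure column that maximises our expected loss.
        j = max(cols, key=lambda c: sum(P[i] * M[i][c] for i in range(n)))
        total += sum(P[i] * M[i][j] for i in range(n))
        for i in range(n):
            cum[i] += M[i][j]
            w[i] *= b ** M[i][j]
    # The minimum over mixed P is attained at a pure action.
    return total, min(cum), b

# Matching pennies written as losses in [0, 1]; the value of the game is 1/2.
M = [[0, 1],
     [1, 0]]
T = 1000
total, best, b = run_exp_weights(M, T)
bound = math.log(1.0 / b) / (1.0 - b) * best + math.log(len(M)) / (1.0 - b)
average = total / T
```

Because the opponent best-responds, the average loss cannot drop below the value 1/2; the bound shows it also cannot exceed it by much.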

Page 29: Computing Nash Equilibrium


Two players General sum games

• Input matrices $(A, B)$

• No unique value

• Computational issues: find some Nash equilibrium, or find all of them

• player 1 best response:
– Like for zero sum:
– Fix strategy $q$ of player 2
– maximize $p\,(A q^T)$ such that $\sum_j p_j = 1$ and $p_j \ge 0$
– dual LP: minimize $u$ such that $u \ge A q^T$ (in every coordinate)

Page 30: Computing Nash Equilibrium


Two players General sum games

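• Assume the support of strategies known.
– $p$ has support $S_p$ and $q$ has support $S_q$
– Can formulate the Nash as LP:

$$
\begin{aligned}
\sum_j a_{ij} q_j &= v && \text{for } i \in S_p \\
\sum_j a_{ij} q_j &\le v && \text{for } i \notin S_p \\
p_i &\ge 0 && \text{for } i \in S_p \\
p_i &= 0 && \text{for } i \notin S_p \\
\sum_i p_i &= 1
\end{aligned}
\qquad\qquad
\begin{aligned}
\sum_i p_i b_{ij} &= u && \text{for } j \in S_q \\
\sum_i p_i b_{ij} &\le u && \text{for } j \notin S_q \\
q_j &\ge 0 && \text{for } j \in S_q \\
q_j &= 0 && \text{for } j \notin S_q \\
\sum_j q_j &= 1
\end{aligned}
$$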
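
In the smallest case, a 2x2 game where both players use both actions, the equality constraints reduce to two indifference equations with a closed-form solution. The sketch below covers only that case and is not a general support-enumeration routine; the function name `full_support_nash_2x2` and the Battle of the Sexes payoffs are assumptions for illustration, and exact arithmetic with `fractions.Fraction` assumes integer payoffs.

```python
from fractions import Fraction

def full_support_nash_2x2(A, B):
    """Solve the support conditions for S_p = S_q = {both actions} in a 2x2 game.

    q makes player 1 indifferent between its rows; p makes player 2 indifferent
    between its columns. Returns (p, q, v, u), or None if no such equilibrium.
    """
    da = A[0][0] - A[0][1] - A[1][0] + A[1][1]
    db = B[0][0] - B[0][1] - B[1][0] + B[1][1]
    if da == 0 or db == 0:
        return None                               # indifference equations degenerate
    q1 = Fraction(A[1][1] - A[0][1], da)          # row 0 payoff equals row 1 payoff
    p1 = Fraction(B[1][1] - B[1][0], db)          # column 0 payoff equals column 1 payoff
    if not (0 < p1 < 1 and 0 < q1 < 1):
        return None                               # support would not be full
    p, q = (p1, 1 - p1), (q1, 1 - q1)
    v = A[0][0] * q[0] + A[0][1] * q[1]           # player 1's equilibrium payoff
    u = p[0] * B[0][0] + p[1] * B[1][0]           # player 2's equilibrium payoff
    return p, q, v, u

# Assumed example: Battle of the Sexes.
A = [[2, 0], [0, 1]]
B = [[1, 0], [0, 2]]
result = full_support_nash_2x2(A, B)
```

For this game the mixed equilibrium is $p = (2/3, 1/3)$, $q = (1/3, 2/3)$ with payoff $2/3$ to each player. In general the supports are not known in advance, so one would have to try every pair of supports, which is exponential in the number of actions.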

Page 31: Computing Nash Equilibrium


Approximate Nash

Page 32: Computing Nash Equilibrium


Lemke & Howson

Page 33: Computing Nash Equilibrium


Example