
Lesson 34 (KH, Section 11.4)
Introduction to Game Theory

Math 20

December 12, 2007

Announcements

I Pset 12 due December 17 (last day of class)

I next OH today 1–3 (SC 323)

Outline

Games and payoffs
    Matching dice
    Vaccination

The theorem of the day

Strictly determined games
    Example: Network programming
    Characteristics of an Equilibrium

Two-by-two strictly-determined games

Two-by-two non-strictly-determined games
    Calculation
    Example: Vaccination

Other

A Game of Chance

I You and I each have a six-sided die

I We roll and the loser pays the winner the difference in the numbers shown

I If we play this a number of times, who’s going to win?

The Payoff Matrix

I Lists each player’s outcomes versus the other’s

I Each $a_{ij}$ represents the payoff from C to R if outcomes $i$ for R and $j$ for C occur (a zero-sum game).

                      C’s outcomes
                  1    2    3    4    5    6
              1   0   -1   -2   -3   -4   -5
              2   1    0   -1   -2   -3   -4
R’s outcomes  3   2    1    0   -1   -2   -3
              4   3    2    1    0   -1   -2
              5   4    3    2    1    0   -1
              6   5    4    3    2    1    0

Expected Value

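I Let the probabilities of R’s outcomes and C’s outcomes be given by probability vectors

\[
\mathbf{p} = \begin{pmatrix} p_1 & p_2 & \cdots & p_n \end{pmatrix},
\qquad
\mathbf{q} = \begin{pmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \end{pmatrix}
\]

I The probability of R having outcome $i$ and C having outcome $j$ is therefore $p_i q_j$.

I The expected value of R’s payoff is

\[
E(\mathbf{p},\mathbf{q}) = \sum_{i,j=1}^{n} p_i a_{ij} q_j = \mathbf{p}A\mathbf{q}
\]

I The dice game is a “fair game” (expected value zero) if the dice are fair.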


Expected value of this game

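\[
\begin{aligned}
\mathbf{p}A\mathbf{q}
&= \begin{pmatrix} 1/6 & 1/6 & 1/6 & 1/6 & 1/6 & 1/6 \end{pmatrix}
\begin{pmatrix}
0 & -1 & -2 & -3 & -4 & -5 \\
1 & 0 & -1 & -2 & -3 & -4 \\
2 & 1 & 0 & -1 & -2 & -3 \\
3 & 2 & 1 & 0 & -1 & -2 \\
4 & 3 & 2 & 1 & 0 & -1 \\
5 & 4 & 3 & 2 & 1 & 0
\end{pmatrix}
\begin{pmatrix} 1/6 \\ 1/6 \\ 1/6 \\ 1/6 \\ 1/6 \\ 1/6 \end{pmatrix} \\
&= \begin{pmatrix} 1/6 & 1/6 & 1/6 & 1/6 & 1/6 & 1/6 \end{pmatrix}
\begin{pmatrix} -15/6 \\ -9/6 \\ -3/6 \\ 3/6 \\ 9/6 \\ 15/6 \end{pmatrix}
= 0
\end{aligned}
\]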

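Expected value with an unfair die

Suppose $\mathbf{p} = \begin{pmatrix} 1/10 & 1/10 & 1/5 & 1/5 & 1/5 & 1/5 \end{pmatrix}$, while C’s die stays fair. Then

\[
\begin{aligned}
\mathbf{p}A\mathbf{q}
&= \begin{pmatrix} 1/10 & 1/10 & 1/5 & 1/5 & 1/5 & 1/5 \end{pmatrix}
\begin{pmatrix}
0 & -1 & -2 & -3 & -4 & -5 \\
1 & 0 & -1 & -2 & -3 & -4 \\
2 & 1 & 0 & -1 & -2 & -3 \\
3 & 2 & 1 & 0 & -1 & -2 \\
4 & 3 & 2 & 1 & 0 & -1 \\
5 & 4 & 3 & 2 & 1 & 0
\end{pmatrix}
\begin{pmatrix} 1/6 \\ 1/6 \\ 1/6 \\ 1/6 \\ 1/6 \\ 1/6 \end{pmatrix} \\
&= \frac{1}{10} \cdot \frac{1}{6}
\begin{pmatrix} 1 & 1 & 2 & 2 & 2 & 2 \end{pmatrix}
\begin{pmatrix} -15 \\ -9 \\ -3 \\ 3 \\ 9 \\ 15 \end{pmatrix}
= \frac{24}{60} = \frac{2}{5}
\end{aligned}
\]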
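The two expected values can be checked mechanically. The following is a minimal sketch, assuming the dice payoff $a_{ij} = i - j$ read off the payoff matrix; the helper names `dice_payoff_matrix` and `expected_value` are illustrative and not part of the lecture. Exact fractions are used so the results match the hand computation.

```python
from fractions import Fraction

def dice_payoff_matrix(sides=6):
    # Entry (i, j) is R's payoff when R rolls i and C rolls j: the difference i - j.
    return [[i - j for j in range(1, sides + 1)] for i in range(1, sides + 1)]

def expected_value(p, A, q):
    # E(p, q) = sum over i, j of p_i * a_ij * q_j, i.e. the product pAq.
    return sum(p[i] * A[i][j] * q[j]
               for i in range(len(p))
               for j in range(len(q)))

A = dice_payoff_matrix()
fair = [Fraction(1, 6)] * 6
biased = [Fraction(1, 10), Fraction(1, 10)] + [Fraction(1, 5)] * 4

fair_value = expected_value(fair, A, fair)      # both dice fair
biased_value = expected_value(biased, A, fair)  # R's die biased, C's die fair
print(fair_value, biased_value)  # 0 2/5
```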

Strategies

I What if we could choose a die to be as biased as we wanted?

I In other words, what if we could choose a strategy $\mathbf{p}$ for this game?

I Clearly, we’d want to get a 6 all the time!


Flu Vaccination

I Suppose there are two flu strains, and we have two flu vaccines to combat them.

I We don’t know the distribution of strains

I Neither pure strategy is the clear favorite

I Is there a combination of vaccines (a mixed strategy) that maximizes total immunity of the population?

                 Strain
               1       2
Vaccine 1    0.85    0.70
Vaccine 2    0.60    0.90


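Theorem (Fundamental Theorem of Zero-Sum Games)

There exist optimal strategies $\mathbf{p}^*$ for R and $\mathbf{q}^*$ for C such that for all strategies $\mathbf{p}$ and $\mathbf{q}$:

\[
E(\mathbf{p}^*,\mathbf{q}) \ge E(\mathbf{p}^*,\mathbf{q}^*) \ge E(\mathbf{p},\mathbf{q}^*)
\]

$E(\mathbf{p}^*,\mathbf{q}^*)$ is called the value $v$ of the game.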


Reflect on the inequality

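\[
E(\mathbf{p}^*,\mathbf{q}) \ge E(\mathbf{p}^*,\mathbf{q}^*) \ge E(\mathbf{p},\mathbf{q}^*)
\]

In other words,

I $E(\mathbf{p}^*,\mathbf{q}) \ge E(\mathbf{p}^*,\mathbf{q}^*)$: R can guarantee a lower bound on his/her payoff

I $E(\mathbf{p}^*,\mathbf{q}^*) \ge E(\mathbf{p},\mathbf{q}^*)$: C can guarantee an upper bound on how much he/she loses

I This value could be negative, in which case C has the advantage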
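As a numerical illustration of the two guarantees, the sketch below tests the inequality on the dice game. It assumes (this choice is ours, not stated on the slide) that both players' optimal strategy is "always roll 6", which is consistent with the entry 0 in row 6, column 6 being the smallest in its row and the largest in its column. The helper names and the random sampling of opposing strategies are illustrative only.

```python
import random

def expected_value(p, A, q):
    # E(p, q) = sum over i, j of p_i * a_ij * q_j
    return sum(p[i] * A[i][j] * q[j]
               for i in range(len(p))
               for j in range(len(q)))

def random_strategy(n, rng):
    # A random probability vector of length n (hypothetical helper).
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

A = [[i - j for j in range(1, 7)] for i in range(1, 7)]  # dice payoff matrix
always_six = [0, 0, 0, 0, 0, 1]                          # assumed p* and q*
value = expected_value(always_six, A, always_six)        # value of the game

rng = random.Random(0)
inequality_holds = True
for _ in range(1000):
    p = random_strategy(6, rng)
    q = random_strategy(6, rng)
    lower_ok = expected_value(always_six, A, q) >= value  # R's guarantee
    upper_ok = value >= expected_value(p, A, always_six)  # C's guarantee
    inequality_holds = inequality_holds and lower_ok and upper_ok

print(value, inequality_holds)  # 0 True
```

A random search like this cannot prove optimality; it only fails to find a counterexample.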

Fundamental problem of zero-sum games

I Find the $\mathbf{p}^*$ and $\mathbf{q}^*$!

I The general case we’ll look at next time (hard-ish)

I There are some games in which we can find optimal strategies now:
    I Strictly-determined games
    I $2 \times 2$ non-strictly-determined games


Example: Network programming

I Suppose we have two networks, NBC and CBS

I Each chooses which program to show in a certain time slot

I Viewer share varies depending on these combinations

I How can NBC get the most viewers?

The payoff matrix and strategies

                                        CBS
                       60 Minutes   Survivor   CSI   Yes, Dear
     My Name is Earl       60          20       30      55
NBC  Dateline              50          75       45      60
     Law & Order           70          45       35      30



What is NBC’s strategy?

I NBC wants to maximize NBC’s minimum share

I In airing Dateline, NBC’s share is at least 45

I This is a good strategy for NBC



What is CBS’s strategy?

I CBS wants to minimize NBC’s maximum share

I In airing CSI, CBS keeps NBC’s share no bigger than 45

I This is a good strategy for CBS
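Both worst-case computations, NBC's and CBS's, can be sketched in a few lines. The list layout (rows for NBC's programs, columns for CBS's) and all variable names are assumptions made for this illustration; the shares are the ones in the payoff matrix.

```python
nbc_shows = ["My Name is Earl", "Dateline", "Law & Order"]
cbs_shows = ["60 Minutes", "Survivor", "CSI", "Yes, Dear"]

# NBC's viewer share: rows are NBC's choices, columns are CBS's choices.
shares = [
    [60, 20, 30, 55],
    [50, 75, 45, 60],
    [70, 45, 35, 30],
]

# NBC looks at the worst case of each row and picks the best of those.
row_minima = [min(row) for row in shares]
nbc_choice = max(range(len(shares)), key=lambda i: row_minima[i])

# CBS looks at the worst case (for CBS) of each column and picks the smallest.
col_maxima = [max(row[j] for row in shares) for j in range(len(cbs_shows))]
cbs_choice = min(range(len(cbs_shows)), key=lambda j: col_maxima[j])

print(row_minima, nbc_shows[nbc_choice])  # [20, 45, 30] Dateline
print(col_maxima, cbs_shows[cbs_choice])  # [70, 75, 45, 60] CSI
```

NBC's guaranteed minimum and CBS's guaranteed cap are both 45, which is why neither network can improve by changing programs unilaterally.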



Equilibrium

I (Dateline, CSI) is an equilibrium pair of strategies

I Assuming NBC airs Dateline, CBS’s best choice is to air CSI, and vice versa





Characteristics of an Equilibrium

I Let $A$ be a payoff matrix. A saddle point is an entry $a_{rs}$ which is the minimum entry in its row and the maximum entry in its column.

I A game whose payoff matrix has a saddle point is called strictly determined

I Payoff matrices can have multiple saddle points
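The definition translates directly into a search over all entries. This is a minimal sketch; the function name `saddle_points` and the zero-based `(row, column)` output are choices made here, and the second matrix is a made-up example of a game with more than one saddle point.

```python
def saddle_points(A):
    """Return zero-based (row, column) positions of all saddle points of A."""
    points = []
    for r, row in enumerate(A):
        for s, entry in enumerate(row):
            column = [other_row[s] for other_row in A]
            # Saddle point: minimum of its row and maximum of its column.
            if entry == min(row) and entry == max(column):
                points.append((r, s))
    return points

network = [
    [60, 20, 30, 55],   # My Name is Earl
    [50, 75, 45, 60],   # Dateline
    [70, 45, 35, 30],   # Law & Order
]
print(saddle_points(network))           # [(1, 2)], i.e. (Dateline, CSI)
print(saddle_points([[2, 2], [1, 0]]))  # [(0, 0), (0, 1)]
print(saddle_points([[2, 3], [4, 1]]))  # []
```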

Pure Strategies are optimal in Strictly-Determined Games

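Theorem
Let $A$ be a payoff matrix. If $a_{rs}$ is a saddle point, then $\mathbf{e}_r'$ is an optimal strategy for R and $\mathbf{e}_s$ is an optimal strategy for C.

Proof.
If $\mathbf{q}$ is a strategy for C, then, because $a_{rs}$ is the minimum entry in row $r$,

\[
E(\mathbf{e}_r',\mathbf{q}) = \mathbf{e}_r' A \mathbf{q} = \sum_{j=1}^{n} a_{rj} q_j \ge \sum_{j=1}^{n} a_{rs} q_j = a_{rs} = E(\mathbf{e}_r',\mathbf{e}_s)
\]

If $\mathbf{p}$ is a strategy for R, then, because $a_{rs}$ is the maximum entry in column $s$,

\[
E(\mathbf{p},\mathbf{e}_s) = \mathbf{p} A \mathbf{e}_s = \sum_{i=1}^{m} p_i a_{is} \le \sum_{i=1}^{m} p_i a_{rs} = a_{rs} = E(\mathbf{e}_r',\mathbf{e}_s)
\]

So for any $\mathbf{p}$ and $\mathbf{q}$, we have

\[
E(\mathbf{e}_r',\mathbf{q}) \ge E(\mathbf{e}_r',\mathbf{e}_s) \ge E(\mathbf{p},\mathbf{e}_s)
\]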



Finding equilibria by gravity

I If C chose strategy 2, and R knew it, R would definitely choose 2

I This would make C choose strategy 1

I But (2, 1) is an equilibrium, a saddle point.

\[
\begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}
\]

Here (1, 1) is an equilibrium position; starting from there neither player would want to deviate from this.

\[
\begin{pmatrix} 2 & 3 \\ 1 & 4 \end{pmatrix}
\]

What about this one?

\[
\begin{pmatrix} 2 & 3 \\ 4 & 1 \end{pmatrix}
\]
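The "gravity" argument is an alternating best-response process, which can be simulated. The sketch below is one possible reading of it, not the lecture's own procedure: R replies to a known column with the row that maximizes the entry, C replies with the column that minimizes it, and the process stops when C no longer wants to move. The function name `follow_gravity` and the zero-based indices are assumptions of this example.

```python
def follow_gravity(A, start_col):
    """Alternate best responses starting from a known column choice of C.

    Returns the zero-based (row, column) where the process settles,
    or None if the players chase each other in a cycle.
    """
    col = start_col
    seen_cols = set()
    while col not in seen_cols:
        seen_cols.add(col)
        # R's best reply to this column: the row with the largest entry.
        row = max(range(len(A)), key=lambda i: A[i][col])
        # C's best reply to that row: the column with the smallest entry.
        best_col = min(range(len(A[0])), key=lambda j: A[row][j])
        if best_col == col:
            return (row, col)
        col = best_col
    return None

print(follow_gravity([[1, 3], [2, 4]], start_col=1))  # (1, 0): strategies (2, 1)
print(follow_gravity([[2, 3], [1, 4]], start_col=0))  # (0, 0): strategies (1, 1)
print(follow_gravity([[2, 3], [4, 1]], start_col=0))  # None: no pure equilibrium
```

For the third matrix the process cycles, which suggests that no pure strategy pair is stable there and that mixed strategies are needed.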


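Two-by-two non-strictly-determined games
Calculation

In this case we can compute $E(\mathbf{p},\mathbf{q})$ by hand in terms of $p_1 = p$ and $q_1 = q$:

\[
E(p,q) = p a_{11} q + p a_{12}(1-q) + (1-p) a_{21} q + (1-p) a_{22}(1-q)
\]

The critical points are when

\[
0 = \frac{\partial E}{\partial p} = a_{11} q + a_{12}(1-q) - a_{21} q - a_{22}(1-q)
\]

\[
0 = \frac{\partial E}{\partial q} = p a_{11} - p a_{12} + (1-p) a_{21} - (1-p) a_{22}
\]

Solving the first equation for $q$ and the second for $p$ gives

\[
p = \frac{a_{22} - a_{21}}{a_{11} + a_{22} - a_{21} - a_{12}}
\qquad
q = \frac{a_{22} - a_{12}}{a_{11} + a_{22} - a_{21} - a_{12}}
\]

These are in between 0 and 1 if there are no saddle points in the matrix.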






Examples

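I If $A = \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}$, then $p = \frac{2}{0}$? Doesn’t work because $A$ has a saddle point.

I If $A = \begin{pmatrix} 2 & 3 \\ 1 & 4 \end{pmatrix}$, $p = \frac{3}{2}$? Again, doesn’t work.

I If $A = \begin{pmatrix} 2 & 3 \\ 4 & 1 \end{pmatrix}$, $p = \frac{-3}{-4} = 3/4$, while $q = \frac{-2}{-4} = 1/2$. So R should pick 1 three quarters of the time and 2 the rest, while C should pick 1 half the time and 2 the other half.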

Further Calculations

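Also

\[
\frac{\partial^2 E}{\partial p^2} = 0
\qquad
\frac{\partial^2 E}{\partial q^2} = 0
\]

while the mixed partial $\frac{\partial^2 E}{\partial p\,\partial q} = a_{11} + a_{22} - a_{21} - a_{12}$ is nonzero whenever the formulas for $p$ and $q$ are defined, so the Hessian determinant is negative. So this is a saddle point! Finally, at the critical point,

\[
E(p,q) = \frac{a_{11} a_{22} - a_{12} a_{21}}{a_{11} + a_{22} - a_{21} - a_{12}}
\]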
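The closed-form expressions can be collected in one helper. This is a sketch under stated assumptions: $p$ comes from setting $\partial E/\partial q = 0$ and $q$ from setting $\partial E/\partial p = 0$, the function names are hypothetical, and matrices with a saddle point are rejected up front because the formulas are not meant for them.

```python
from fractions import Fraction

def has_saddle_point(A):
    # True if some entry is the minimum of its row and the maximum of its column.
    return any(
        entry == min(row) and entry == max(other[s] for other in A)
        for row in A
        for s, entry in enumerate(row)
    )

def solve_2x2(A):
    """Mixed-strategy solution (p, q, value) of a 2x2 game with integer payoffs.

    Returns None when A has a saddle point (use pure strategies instead).
    """
    (a11, a12), (a21, a22) = A
    if has_saddle_point(A):
        return None
    d = a11 + a22 - a21 - a12            # common denominator
    p = Fraction(a22 - a21, d)           # probability that R plays row 1
    q = Fraction(a22 - a12, d)           # probability that C plays column 1
    v = Fraction(a11 * a22 - a12 * a21, d)
    return p, q, v

print(solve_2x2([[1, 3], [2, 4]]))  # None
print(solve_2x2([[2, 3], [1, 4]]))  # None
print(solve_2x2([[2, 3], [4, 1]]))  # (Fraction(3, 4), Fraction(1, 2), Fraction(5, 2))
```

As a sanity check on the last result, R's mix (3/4, 1/4) earns 5/2 against either column, and C's mix (1/2, 1/2) concedes 5/2 against either row.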

Example: Vaccination

We have

\[
p_1 = \frac{0.9 - 0.6}{0.85 + 0.9 - 0.6 - 0.7} = \frac{2}{3}
\]

\[
q_1 = \frac{0.9 - 0.7}{0.85 + 0.9 - 0.6 - 0.7} = \frac{4}{9}
\]

\[
v = \frac{(0.85)(0.9) - (0.6)(0.7)}{0.85 + 0.9 - 0.6 - 0.7} \approx 0.767
\]

                 Strain
               1       2
Vaccine 1    0.85    0.70
Vaccine 2    0.60    0.90

I We should give 2/3 of the population vaccine 1 and the rest vaccine 2

I The worst case scenario is a 4 : 5 distribution of strains

I We’ll still cover 76.7% of the population
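One way to see why the 2/3 split is the right one is to compute the worst-case coverage for different splits. The sketch below assumes the effectiveness table above, uses exact fractions, and scans a grid of splits in steps of 1/300; the grid and the helper name `worst_case` are choices made for this illustration.

```python
from fractions import Fraction

# coverage[i][j]: effectiveness of vaccine i+1 against strain j+1
coverage = [
    [Fraction(85, 100), Fraction(70, 100)],
    [Fraction(60, 100), Fraction(90, 100)],
]

def worst_case(p1):
    """Guaranteed coverage if a share p1 of the population gets vaccine 1."""
    mix = [p1, 1 - p1]
    per_strain = [sum(mix[i] * coverage[i][j] for i in range(2)) for j in range(2)]
    return min(per_strain)

print(worst_case(Fraction(1)))     # 7/10: vaccine 1 only
print(worst_case(Fraction(0)))     # 3/5: vaccine 2 only
print(worst_case(Fraction(2, 3)))  # 23/30, about 0.767

# Scan splits k/300 and keep the one with the best guaranteed coverage.
best_value, best_share = max(
    (worst_case(Fraction(k, 300)), Fraction(k, 300)) for k in range(301)
)
print(best_share, float(best_value))
```

With the 2/3 split the coverage is 23/30 against either strain, so the guarantee does not depend on which strain dominates.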


Other Applications of GT

I War
    I the Battle of the Bismarck Sea

I Business
    I product introduction
    I pricing

I Dating