August 25, 2022 Stochastic Games Mr Sujit P Gujar. e-Enterprise Lab Computer Science and Automation IISc, Bangalore.

February 2, 2016 Stochastic Games Mr Sujit P Gujar. e-Enterprise Lab Computer Science and Automation IISc, Bangalore


May 3, 2023

Stochastic Games

Mr Sujit P Gujar
e-Enterprise Lab
Computer Science and Automation
IISc, Bangalore


Agenda

Stochastic Game
Special Class of Stochastic Games
Analysis: Shapley's Result
Applications


Repeated Game

When players interact by playing a similar stage game (such as the prisoner's dilemma) numerous times, the game is called a repeated game.


Stochastic Game

A stochastic game is a repeated game with probabilistic (stochastic) transitions.
The game has several states.
The transition probabilities depend on the players' actions.
A two-player stochastic game is also called a "2 and 1/2 player" game (chance acts as the half player).


Repeated Prisoner’s Dilemma

Consider the game tree for the PD repeated twice.

What is Player 1's strategy set? (The cross product of the choice sets at all of his information sets.)

{C,D} x {C,D} x {C,D} x {C,D} x {C,D}: 2^5 = 32 possible strategies.

[Figure: game tree of the twice-repeated PD for players 1 and 2, showing the first iteration, the second iteration, and one subgame.]

Assume each player has the same two options at each information set: {C,D}.
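The count of 32 can be checked by brute force. The sketch below assumes that Player 1 has one information set at the first iteration and one for each of the four possible first-iteration outcomes; the names `info_sets` and `strategies` are illustrative only.

```python
from itertools import product

ACTIONS = ("C", "D")

# Player 1's information sets in the twice-repeated PD (assumed structure):
# one at the start, plus one for every outcome of the first iteration.
first_iteration_outcomes = list(product(ACTIONS, repeat=2))  # (own move, opponent's move)
info_sets = ["start"] + first_iteration_outcomes             # 1 + 4 = 5 information sets

# A pure strategy picks one action at every information set.
strategies = [
    dict(zip(info_sets, choice))
    for choice in product(ACTIONS, repeat=len(info_sets))
]

print(len(info_sets), len(strategies))  # 5 32
```

Each additional repetition multiplies the number of information sets, which is why the strategy set becomes unmanageable for long or infinite horizons.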


Issues in Analyzing Repeated Games

How do we solve infinitely repeated games?

Strategies are infinite in number.

We need to compare sums of infinite streams of payoffs.


Stochastic Game: The Big Match

Every day player 2 chooses a number, 0 or 1.
Player 1 tries to predict it and wins a point if he is correct.
This continues as long as player 1 predicts 0.
But if he ever predicts 1, all future choices for both players are required to be the same as that day's choices.


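The Big Match

N = {1,2}: players.
S = {0,1*,2*}: state space.
s0 = {0,1}, s1 = {0}, s2 = {1}

A = Payoff Matrix = $\begin{pmatrix} 1^* & 0^* \\ 0 & 1 \end{pmatrix}$

$P^{00} = \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}$, $P^{01} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, $P^{02} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$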
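The rules of the Big Match can be simulated directly. This is a minimal sketch, assuming each strategy is a function from the history of past (prediction, choice) pairs to a number in {0, 1}. The function `play_big_match` and the strategy names are hypothetical helpers, not part of the original slides.

```python
def play_big_match(predict, choose, days):
    """Average payoff to player 1 over `days` days of the Big Match.

    predict(history) -> 0 or 1 is player 1's prediction,
    choose(history)  -> 0 or 1 is player 2's number.
    """
    history = []      # (prediction, choice) pairs played so far
    frozen = None     # set once player 1 predicts 1; repeated on all later days
    points = 0
    for _ in range(days):
        if frozen is None:
            pair = (predict(history), choose(history))
            if pair[0] == 1:       # predicting 1 freezes that day's choices forever
                frozen = pair
        else:
            pair = frozen
        history.append(pair)
        points += 1 if pair[0] == pair[1] else 0   # one point for a correct prediction
    return points / days


always_0 = lambda history: 0
always_1 = lambda history: 1

print(play_big_match(always_0, always_0, 100))  # player 1 is right every day
print(play_big_match(always_1, always_0, 100))  # wrong on day 1, then frozen on a loss
```

The second call shows why predicting 1 is risky: a single wrong guess locks in a payoff of 0 on every later day.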


The "Big Match" game was introduced by Gillette (1957) as a difficult example.

David Blackwell and T. S. Ferguson, The Big Match. The Annals of Mathematical Statistics, Vol. 39, No. 1 (Feb. 1968), pp. 159-163.


Scenario

$N$: total number of states (positions).
$m_k$: number of choices for the row player at position k.
$n_k$: number of choices for the column player at position k.
$s^k_{ij} > 0$: the probability with which the game in position k stops when player 1 plays i and player 2 plays j.
$p^{kl}_{ij}$: the probability with which the game in position k moves to position l when player 1 plays i and player 2 plays j.
$s = \min_{k,i,j} s^k_{ij}$
$a^k_{ij}$: payoff to the row player at position k when player 1 plays i and player 2 plays j.
$M = \max_{k,i,j} |a^k_{ij}|$
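The role of s and M can be made explicit with a short worked bound. The relations below are not stated on the slide; they follow Shapley's 1953 paper, where the stop and transition probabilities at each position sum to one.

```latex
s^k_{ij} = 1 - \sum_{l=1}^{N} p^{kl}_{ij} \;\ge\; s \;>\; 0
\\[4pt]
\Pr[\text{the game has not stopped after } t \text{ steps}] \;\le\; (1-s)^t
\\[4pt]
|\text{expected total payoff}| \;\le\; M + (1-s)M + (1-s)^2 M + \cdots \;=\; \frac{M}{s}
```

So the game ends with probability 1, and the expected total payoff is finite for every pair of strategies.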


Stationary Strategies

Enumerating all pure and mixed strategies is cumbersome and redundant.
Stationary strategies are behavior strategies that prescribe for a player the same probabilities for his choices every time the same position is reached, by whatever route.
$x = (x^1, x^2, \dots, x^N)$, where each $x^k = (x^k_1, x^k_2, \dots, x^k_{m_k})$.


Notation

Given a matrix game B,
val[B] = the minimax value to the first player.
X[B] = the set of optimal strategies for the first player.
Y[B] = the set of optimal strategies for the second player.
It can be shown that, for B and C having the same dimensions,
$|\mathrm{val}[B] - \mathrm{val}[C]| \le \max_{i,j} |b_{ij} - c_{ij}|$
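For 2x2 matrix games, val[B] has a closed form, which makes the inequality easy to check numerically. The sketch below is restricted to the 2x2 case (general matrix games need a linear program); `val_2x2` is a hypothetical helper, and the matrices B and C are illustrative choices.

```python
def val_2x2(B):
    """Minimax value of a 2x2 matrix game; the row player maximizes."""
    (a, b), (c, d) = B
    maximin = max(min(a, b), min(c, d))   # best guaranteed payoff with a pure row
    minimax = min(max(a, c), max(b, d))   # best guaranteed cap with a pure column
    if maximin == minimax:                # pure saddle point
        return maximin
    # No saddle point: both players mix, and the value has this closed form.
    return (a * d - b * c) / (a + d - b - c)


B = [[1.0, -1.0], [-2.0, 1.0]]
C = [[1.5, -1.0], [-2.0, 0.5]]

gap = max(abs(B[i][j] - C[i][j]) for i in range(2) for j in range(2))
print(val_2x2(B), val_2x2(C), gap)
assert abs(val_2x2(B) - val_2x2(C)) <= gap
```

The inequality says that val is a non-expansive function of the payoff entries, which is the property that makes iterating on values well behaved.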


When we start in position k, we obtain a particular game, $\Gamma^k$.

We will refer to the stochastic game as the collection $\Gamma = \{\Gamma^k \mid k = 1, \dots, N\}$.

Define,
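The formula that followed "Define," did not survive on this slide. In Shapley's 1953 paper, the object defined at this point is an auxiliary matrix game for each position k and each vector $\vec{\alpha} = (\alpha^1, \dots, \alpha^N)$ of candidate values; reading the slide this way is an assumption.

```latex
A^k(\vec{\alpha}) \;=\; \Bigl(\, a^k_{ij} + \sum_{l=1}^{N} p^{kl}_{ij}\,\alpha^l \Bigr)_{i = 1,\dots,m_k;\; j = 1,\dots,n_k}
```

Entry (i, j) is the immediate payoff plus the expected continuation payoff, if reaching position l is worth $\alpha^l$.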


Shapley's Results¹

¹ L. S. Shapley, Stochastic Games. PNAS 39 (1953), 1095-1100.
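The statements themselves are missing from this slide. As a hedged reconstruction from the cited paper, with $A^k(\vec{\phi})$ denoting the matrix whose (i, j) entry is $a^k_{ij} + \sum_l p^{kl}_{ij}\,\phi^l$, the two main results read roughly as follows.

```latex
% Value: the value vector \vec{\phi} = (\phi^1, \dots, \phi^N) of the stochastic game
% is the unique solution of
\phi^k = \operatorname{val}\bigl[ A^k(\vec{\phi}) \bigr], \qquad k = 1, \dots, N.

% Optimal play: stationary strategies x^*, y^* with
x^{*k} \in X\bigl[ A^k(\vec{\phi}) \bigr], \qquad
y^{*k} \in Y\bigl[ A^k(\vec{\phi}) \bigr], \qquad k = 1, \dots, N,
% are optimal for the first and second player in every game \Gamma^k.
```

Uniqueness comes from the fact that the map $\vec{\alpha} \mapsto (\operatorname{val}[A^k(\vec{\alpha})])_k$ shrinks distances by a factor of at most (1 - s).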


Consider the collection of games whose pure strategies are the stationary strategies of $\Gamma$. The payoff function of these new games must satisfy,
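The equation is missing from the slide. In Shapley's paper the corresponding condition is a linear recursion; the symbol $R^k(x, y)$ for the expected total payoff from position k under stationary strategies x and y is notation assumed here.

```latex
R^k(x, y) \;=\; x^k A^k y^k \;+\; \sum_{l=1}^{N} \bigl( x^k P^{kl} y^k \bigr)\, R^l(x, y),
\qquad k = 1, \dots, N,
\\[4pt]
\text{where}\quad
x^k A^k y^k = \sum_{i=1}^{m_k} \sum_{j=1}^{n_k} x^k_i\, a^k_{ij}\, y^k_j,
\qquad
x^k P^{kl} y^k = \sum_{i=1}^{m_k} \sum_{j=1}^{n_k} x^k_i\, p^{kl}_{ij}\, y^k_j .
```

Since every row of the matrix $(x^k P^{kl} y^k)_{k,l}$ sums to at most 1 - s < 1, this linear system has exactly one solution.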


Shapley’s Result,
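The formula for this slide is also missing. A plausible reading, based on the cited paper, is the saddle-point statement for games played in stationary strategies, written here with $R^k(x, y)$ for the expected total payoff from position k when the players use stationary strategies x and y (assumed notation).

```latex
\min_{y} \max_{x} R^k(x, y) \;=\; \max_{x} \min_{y} R^k(x, y) \;=\; \phi^k,
\qquad k = 1, \dots, N,
```

where $\phi^k$ is the value of the game started in position k. In words, restricting both players to stationary strategies does not change the value of the stochastic game.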


Applications

When N = 1,¹

By setting all $s^k_{ij} = s > 0$, we get a model of an infinitely repeated game in which future payments are discounted by a factor (1 - s).

If we set $n_k = 1$ for all k, the result is a "dynamic programming model".

¹ J. von Neumann, Ergebnisse eines Math. Kolloquiums, 8, 73-83 (1937).
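The discounting interpretation can be spelled out in one line. The derivation below is a sketch, assuming a constant stop probability s at every stage and writing $a_t$ for the stage-t payoff (notation introduced here). The second line further assumes N = 1 and uses the fact that adding a constant to every entry of a matrix game shifts its value by that constant.

```latex
\mathbb{E}[\text{total payoff}] \;=\; \sum_{t=0}^{\infty} (1-s)^t\, \mathbb{E}[a_t]
\\[4pt]
\phi = \operatorname{val}\bigl[ A + (1-s)\,\phi\, J \bigr] = \operatorname{val}[A] + (1-s)\,\phi
\quad\Longrightarrow\quad
\phi = \frac{\operatorname{val}[A]}{s}
\qquad (J = \text{all-ones matrix})
```

So with a uniform stop probability, the repeated game has the same optimal mixed strategies as the stage game A, and its value is the stage value scaled by 1/s.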


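Example

Consider the game with N = 1 and payoff matrix

$A = \begin{pmatrix} 1 & -1 \\ -2 & 1 \end{pmatrix}$

With continuation probabilities $P_1 = \begin{pmatrix} 1-s & 1-s \\ 1-s & 1-s \end{pmatrix}$, the optimal stationary strategies are x = (0.6, 0.4), y = (0.4, 0.6).

With continuation probabilities $P_2 = \begin{pmatrix} 1-2s & 1-2s \\ 1-s & 1-2s \end{pmatrix}$, the optimal stationary strategies are x = (0.61, 0.39), y = (0.39, 0.61).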
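The two strategy pairs can be reproduced by iterating the fixed-point equation $\phi = \operatorname{val}[A + P\phi]$ from $\phi = 0$. The sketch below assumes s = 0.1, since the slide leaves s unspecified; `solve_2x2` and `shapley_iteration` are hypothetical helper names.

```python
def solve_2x2(B):
    """Value and optimal strategies (x, y) of a 2x2 matrix game; row player maximizes."""
    (a, b), (c, d) = B
    row_mins = [min(a, b), min(c, d)]
    col_maxes = [max(a, c), max(b, d)]
    if max(row_mins) == min(col_maxes):            # pure saddle point
        i = row_mins.index(max(row_mins))
        j = col_maxes.index(min(col_maxes))
        x = tuple(float(r == i) for r in range(2))
        y = tuple(float(q == j) for q in range(2))
        return B[i][j], x, y
    denom = a + d - b - c                          # nonzero when no saddle point exists
    x = ((d - c) / denom, (a - b) / denom)
    y = ((d - b) / denom, (a - c) / denom)
    return (a * d - b * c) / denom, x, y


def shapley_iteration(A, P, steps=2000):
    """Iterate phi <- val[A + P * phi] for a one-position (N = 1) stochastic game."""
    phi = 0.0
    for _ in range(steps):
        B = [[A[i][j] + P[i][j] * phi for j in range(2)] for i in range(2)]
        phi, x, y = solve_2x2(B)
    return phi, x, y


s = 0.1                                            # assumed; not given on the slide
A = [[1.0, -1.0], [-2.0, 1.0]]
P1 = [[1 - s, 1 - s], [1 - s, 1 - s]]
P2 = [[1 - 2 * s, 1 - 2 * s], [1 - s, 1 - 2 * s]]

for P in (P1, P2):
    phi, x, y = shapley_iteration(A, P)
    print(round(phi, 3), [round(v, 2) for v in x], [round(v, 2) for v in y])
```

With P1 the strategies coincide with those of the one-shot game A, as expected for uniform discounting. With P2 the first row stops the game more often, which shifts both players' mixtures slightly.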


Thank You!!