39
logo Introduction Markov Decision Evolutionary Games Application to energy management in wireless networks Numerical illustrations Conclusions and perspectives Markov Decision Evolutionary Games Eitan Altman, Yezekael Hayel, Hamidou Tembine, Rachid El Azouzi INRIA Sophia-Antipolis, France University of Avignon, France Popeye seminar, 2008 1 / 39

Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Markov Decision Evolutionary Games

Eitan Altman, Yezekael Hayel,Hamidou Tembine, Rachid El Azouzi

INRIA Sophia-Antipolis, FranceUniversity of Avignon, France

Popeye seminar, 2008

1 / 39

Page 2: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Plan

1 Introduction

2 Markov Decision Evolutionary Games

3 Application to energy management in wireless networks

4 Numerical illustrations

5 Conclusions and perspectives

2 / 39

Page 3: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Plan

1 Introduction

2 Markov Decision Evolutionary Games

3 Application to energy management in wireless networks

4 Numerical illustrations

5 Conclusions and perspectives

3 / 39

Page 4: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Evolutionary Game Theory

Evolutionary Stable Strategy (ESS)

The ESS is characterized by a property of robustness againstinvaders (mutations). More specifically,

if an ESS is reached, then the proportions of each population donot change in time.at ESS, the populations are immune from being invaded by othersmall populations.restriction to interactions that are limited to pairwise.

This notion is stronger than Nash equilibrium in which it is onlyrequested that a single user would not benefit by a change (mutation)of its behavior.ESS is robust against a deviation of a whole fraction of thepopulation.

4 / 39

Page 5: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Definitions

ESSJ(p, q) the expected immediate payoff for an individual if it uses astrategy p when meeting another individual who adopts thestrategy q.K available strategies which are called pure strategies.

Definition

A strategy q is said to be an ESS if for every p 6= q there exists someεq > 0 such that for all ε ∈ (0, εq):

J(q, εp + (1− ε)q) > J(p, εp + (1− ε)q)

5 / 39

Page 6: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Important theorem

TheoremA strategy q is an ESS if and only if it satisfies

for all p 6= q, J(q, q) > J(p, q),

orfor all p 6= q, J(q, q) = J(p, q) and J(q, p) > J(p, p).

6 / 39

Page 7: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Markov Decision Evolutionary Games (MDEG)

MDEGThe fitness of a player depends not only on the actions chosen inthe interaction but also on the individual state of the players.Players have finite life time and take during which theyparticipate in several local interactions.The actions taken by a player determine not only the immediatefitness but also the transition probabilities to its next individualstate.

7 / 39

Page 8: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Interactions and system modelComputing the Weak (resp. Strong) ESSWESS and dynamic programming

Plan

1 Introduction

2 Markov Decision Evolutionary Games

3 Application to energy management in wireless networks

4 Numerical illustrations

5 Conclusions and perspectives

8 / 39

Page 9: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Interactions and system modelComputing the Weak (resp. Strong) ESSWESS and dynamic programming

Model for individual player

Individual MDPWe associate with each player a Markov Decision Process (MDP)embedded at the instants of the interactions. The parameters of theMDP are given by the tuple {S,A, Q} where

S is the set of possible individual states of the player.A is the set of available actions. For each state s, a subset As ofactions is available.Q is the set of transition probabilities; for each s, s′ ∈ S anda ∈ As, Qs′(s, a) is the probability to move from state s to state s′

taking action a.∑

s′∈S Qs′(s, a) is allowed to be smaller than 1.

9 / 39

Page 10: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Interactions and system modelComputing the Weak (resp. Strong) ESSWESS and dynamic programming

Model for individual player

PoliciesDefine further

The set of policies is U . A general policy u is a sequenceu = (u1, u2, . . .) where ui is a distribution over action space A attime i .The subset of mixed (resp. pure or deterministic) policies is UM(resp. UD). We define also the set of stationary policies US wheresuch policy does not depend on time.α(u) = {α(u; s, a)} is the fraction of the population at individualstate s and that use action a when all the population usesstrategy u.

10 / 39

Page 11: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Interactions and system modelComputing the Weak (resp. Strong) ESSWESS and dynamic programming

Interactions and system model

Notationsr(s, a, s′, b) be the immediate reward that a player receives whenit is at state s and it uses action a while interacting with a playerwho is in state s′ that uses action b.The expected immediate reward of a player in state St andplaying action At at time t is given by

Rt =∑s,a

αt(u; s, a)r(St , At , s, a).

The global expected fitness when using a policy v is then

Fη(v , u) =∞∑

t=1

Eη,v [Rt ],

where η is the initial state distribution.11 / 39

Page 12: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Interactions and system modelComputing the Weak (resp. Strong) ESSWESS and dynamic programming

Assumptions

Assumptions

A1 : the expected lifetime of a player Tη,u is finite for all u ∈ UD.A2 : When the whole population uses a policy u, then at any timet which is either fixed or is an individual time of an arbitraryplayer, αt(u) is independent of t and is given by

αt(u; s, a) =fη,u(s, a)

Tη,u

for all s, a and where fη,u(s, a) =∑+∞

t=1 pt(η, u; s, a) is theexpected number of time units during which it is at state s and itchooses action a.

12 / 39

Page 13: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Interactions and system modelComputing the Weak (resp. Strong) ESSWESS and dynamic programming

Defining the weak (resp. strong) ESS

Equivalent class of strategies

We shall say that two strategies u and u′ are equivalent if thecorresponding occupation measures are equal for all state. We shallwrite u =e u′.

Definition of the WESS (resp. SESS)

A strategy u is a weak (resp. strong) ESS, denoted by WESS (resp.SESS), for the MDEG if and only if it satisfies one of the following:

for all v 6=e u (resp. v 6= u), F (u, u) > F (u, v) (1)

for all v 6=e u (resp. v 6= u), F (u, u) = F (v , u) and F (u, v) > F (v , v)(2)

13 / 39

Page 14: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Interactions and system modelComputing the Weak (resp. Strong) ESSWESS and dynamic programming

Transforming the MDEG into a standard EG

MDEG into an EGThe fitness function is bilinear in the occupation measures of theplayers. The set of occupation measures is a polytope whoseextreme points correspond to strategies in UD.Consider the following standard evolutionary game EG:

the finite set of actions of a player is UD,the fitness of a player that uses v ∈ UD when the other use apolicy u ∈ US is given by

F̃ (v , u) =∑s,a

fη,v (s, a)∑s′,a′

fη,u(s′, a′)r(s, a, s′, a′).

Enumerate the strategies in UD such that UD = (u1, ..., um).Define γ = (γ1, ..., γm) where γi is the fraction of the populationthat uses ui .

14 / 39

Page 15: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Interactions and system modelComputing the Weak (resp. Strong) ESSWESS and dynamic programming

Transforming the MDEG into a standard EG

Proposition

Let γ̂ be an ESS for the game EG. Then it is a WESS for the originalMDEG.

15 / 39

Page 16: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Interactions and system modelComputing the Weak (resp. Strong) ESSWESS and dynamic programming

What about stationary policies ?

Theorem

(i) A necessary condition for a policy u to be WESS is thatF (u, u) ≥ F (v , u) for all stationary v.(ii) Assume that the following set of dynamic programming equationsholds: For all state s ∈ S,

Fs(u, u) = maxa

[r(u; s, a) +

∑s′

Qs′(s, a)Fs′(u, u)

]. (3)

Then F (u, u) ≥ F (v , u) for any v.(iii) If η(s) > 0 for all s, then the converse also holds: and (3) isequivalent to F (u, u) ≥ F (v , u) for all stationary v.

16 / 39

Page 17: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ModelDP approachMatrix Game approach

Plan

1 Introduction

2 Markov Decision Evolutionary Games

3 Application to energy management in wireless networks

4 Numerical illustrations

5 Conclusions and perspectives

17 / 39

Page 18: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ModelDP approachMatrix Game approach

Model of Energy Management in a DistributedAloha Network

Actions and statesA terminal i attempts transmissions during time slots.At each attempt, it has to take a decision on the transmissionpower based on his battery energy state.We assume that the state can take three values: {F , A, E} forFull, Almost empty or Empty.The transmission signal power of a terminal can be High (h) orLow (l).Transmission at high power is possible only when the mobile is instate F .

18 / 39

Page 19: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ModelDP approachMatrix Game approach

Model of Energy Management in a DistributedAloha Network

Aloha-type game

A mobile transmits a packet with success during a slot if:the mobile is the only one to transmit during this slotthe mobile transmits with high power and all others transmittingnodes use low power

19 / 39

Page 20: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ModelDP approachMatrix Game approach

Some notations

Notationsp is the probability for a mobile to be the only transmitter during aslot.Qi(a) is the probability of remaining at energy level i when usingaction a.α is the fraction of the population who use the action h at anygiven time (situation in which the system attains a stationaryregime).

20 / 39

Page 21: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ModelDP approachMatrix Game approach

Policies and fitness

PoliciesA general policy u is a sequence u = (u1, u2, ...) where ui is theprobability of choosing h if at time i the state is F . We consider onlystationary policies, ui = β for all time i .

FitnessLet Rt denote the number of packets (zero or one) successfullytransmitted at time slot t and the fitness of the terminal to be given by∑∞

t=1 Rt .Vβ(i , α) is the total expected fitness (i.e. reward or valuation) of auser given that it uses policy β, that it is in state i and given theparameter α.

21 / 39

Page 22: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ModelDP approachMatrix Game approach

Computing fitness and sojourn times

State E and AWhen the level of energy is in state E , the valuation is equal toV (E) = 0.When the state is A, the valuation is V (A) = p

1−QAand expected

time during which a mobile spends in state A is T (A) = 11−QA

.

State FDefine the dynamic programming operator Y (v , a, α) to be the totalexpected fitness of an individual starting at state F , if

It takes action a at time 1,If at time 2 the state is F then the total sum of expected fitnessfrom time 2 onwards is v .At each time the mobile attempts transmission, the probabilitythat another interfering mobile uses action h is α.

22 / 39

Page 23: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ModelDP approachMatrix Game approach

Computing fitness and sojourn times

State F

Y (v , l) = p + QF (l)v + p1−QF (l)

1−QA.

and

Y (v , h, α) = α(p + QF (h)v + (1−QF (h))V (A))

+(1− α)(1 + QF (h)v + (1−QF (h))V (A)),

= αp + (1− α) + QF (h)v + p1−QF (h)

1−QA.

23 / 39

Page 24: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ModelDP approachMatrix Game approach

Computing fitness and sojourn times

State FThe expected time it spends at state F is

T (F ) =1

1− βQF (h)− (1− β)QF (l).

The fraction of time that the mobile uses action h is then

α̂(β) = βT (F )

T (F ) + T (A)= β

1−QA

2−QA − βQF (h)− (1− β)QF (l).

24 / 39

Page 25: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ModelDP approachMatrix Game approach

Computing fitness and sojourn times

State FThe total expected utility Vβ(F , α) the mobile gains starting from stateF is the unique solution of v = (1−β)Y (v , l)+βY (v , h, α). This gives

Vβ(F , α) = V (A) +p + β(1− p)(1− α)

1−QF (l) + β(QF (l)−QF (h)).

some remarks

aggressive policy β = 1, V1(F , α) = V (A) + 1−α(1−p)1−QF (h) ,

passive policy β = 0, V0(F , α) = V (A) + p1−QF (l) ,

Vβ(F , α) is either constant or strictly monotone in β over thewhole interval [0, 1].

25 / 39

Page 26: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ModelDP approachMatrix Game approach

Characterization of the ESS

Relation with EGThe fitness that is maximized is not the outcome of a singleinteraction but of the sum of fitnesses obtained during all theopportunities in the mobile’s lifetime.The ESS can be defined using the following fitness:

Vβ(F , α̂(β′)) = J(β, β′).

A necessary condition for β∗ to be an ESS is

for all β′ 6= β∗, Vβ∗(F , α̂(β∗)) ≥ Vβ′(F , α̂(β∗)).

26 / 39

Page 27: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ModelDP approachMatrix Game approach

Pure equilibrium with high power

TheoremDefine

∆h :=1−QF (h)

2−QA −QF (h)(1− p)− QF (l)−QF (h)

1−QF (l)p

Let u be the pure aggressive strategy that uses always h at state F .(i) ∆h > 0 is a sufficient condition for u to be an ESS.(ii) ∆h ≥ 0 is a necessary condition for u to be an ESS.

Remarks:If QF (l) = QF (h), the strategy high power is obviously an ESS.If p = 0, there is no benefit from transmission with high power.∆h > 0 is a sufficient and necessary condition for u to be astrongly immune ESS.

27 / 39

Page 28: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ModelDP approachMatrix Game approach

Pure equilibrium with low power

TheoremDefine

∆l := p(1−QF (h))− (1−QF (l))

Let v be the pure strategy that uses always l at state F .(i) ∆l > 0 is a sufficient condition for v to be an ESS.(ii) ∆l ≥ 0 is a necessary condition for v to be an ESS.

Remarks:If QF (l) = QF (h), the strategy low power is not an ESS.The condition for the policy v to be ESS does not depend on QA.∆l > 0 is a sufficient and necessary condition for v to be astrongly immune ESS.

28 / 39

Page 29: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ModelDP approachMatrix Game approach

Mixed equilibrium

Theorem(a) Each one of the following conditions is necessary for there to exista Weakly Immune ESS:

Condition (i): ∆l ≤ 0,Condition (ii): ∆h ≤ 0,

(b) Assume that Condition (i) and (ii) hold. Then there exists a uniqueweakly immune ESS given by

β∗ =(QA + QF (l))[QF (l)− pQF (h)]

QApQF (l)− (QF (l)−QF (h))(QF (l)− pQF (h))

TheoremFor all QA, QF (l), QF (h) and p, the ESS β∗ of the stochasticevolutionary game exists and is unique.

29 / 39

Page 30: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ModelDP approachMatrix Game approach

Price of Anarchy (equilibrium aggressivenesscomparison)

PoA

The strategy β̃ is globally optimal if it maximizes Vβ(F , α̂(β)).The global optimal solution is solution of a second orderpolynomial function.We compare the optimal global solution to the ESS.

Theorem

ESS strategy is more aggressive than the social optimum strategy, i.e.

β∗ ≥ β̃.

30 / 39

Page 31: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ModelDP approachMatrix Game approach

Matrix Game of the EG

Matrix Game with deterministic policies

We restrict to the deterministic policies u1 = (l , l) and u2 = (l , h) (usealways high power in state F ). The WESS of the MDEG is the ESS ofa standard EG defined by through the related matrix game:

G̃ =

(p(X1 + X3)

2 p(X1 + X3)(X1 + X4)

(X1 + X3)(p(X1 + X4) + (1− p)X4) p(X1 + X4)2 + (1− p)X1X4

)with

X1 =1

1 − Q(1, l), X2 =

1

1 − Q(1, h), X3 =

1

1 − Q(2, l), X4 =

1

1 − Q(2, h).

31 / 39

Page 32: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ModelDP approachMatrix Game approach

ESS of the EG

Proposition

The ESS γ̂ exists and is unique.

Proposition

Policies β∗ and γ̂ are in the same equivalent class,i.e.

β∗ =e γ̂.

32 / 39

Page 33: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Plan

1 Introduction

2 Markov Decision Evolutionary Games

3 Application to energy management in wireless networks

4 Numerical illustrations

5 Conclusions and perspectives

33 / 39

Page 34: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ESS and the global optimumComparison of the ESS and the global optimum depending on theprobability p.

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

0.2

0.4

0.6

0.8

1

p

popu

latio

ns

β social optimum

β* ESS

34 / 39

Page 35: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

ESS and the global optimumComparison of the ESS and the global optimum depending on thedifference QF (l)−QF (h).

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.90

0.2

0.4

0.6

0.8

1

QF(l)−Q

F(h)

popu

latio

ns

β social optimum

β* ESS

35 / 39

Page 36: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Comparison of the two approachesComparison of the two mixed ESS β∗ and γ̂ given by the twoapproaches.

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

0.2

0.4

0.6

0.8

1

p

ES

S

β*

γ^

36 / 39

Page 37: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Plan

1 Introduction

2 Markov Decision Evolutionary Games

3 Application to energy management in wireless networks

4 Numerical illustrations

5 Conclusions and perspectives

37 / 39

Page 38: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Conclusions

ConclusionsExtension of evolutionary game paradigm considering actionstate dependance.Application to competitive energy management in wirelessterminals.Two different methods for computing ESS of a MDEG.

38 / 39

Page 39: Markov Decision Evolutionary Gamesmescal.imag.fr/membres/corinne.touati/events/hayel.pdfUniversity of Avignon, France Popeye seminar, 2008 1/39. logo Introduction ... invaders (mutations)

logo

IntroductionMarkov Decision Evolutionary Games

Application to energy management in wireless networksNumerical illustrations

Conclusions and perspectives

Perspectives

Perspectives

Generalization of our energy management application.Develop the theoretical results to other rewards like the meanand the discounted ones.Notion of population dynamics into this framework.

39 / 39