
Why are stochastic networks so hard to simulate?


DESCRIPTION

http://arxiv.org/abs/0906.4514. Strange behavior in simulation of queues and other "skip-free" stochastic models, including the random-walk Hastings-Metropolis algorithm. Presented at the Workshop on Markov Chains and MCMC, in honor of Persi Diaconis: http://pages.cs.aueb.gr/users/yiannisk/AWMCMC.html


Page 1: Why are stochastic networks so hard to simulate?

Why are reflected random walks so hard to simulate ?

Sean Meyn

Department of Electrical and Computer Engineering and the Coordinated Science Laboratory, University of Illinois

Joint work with Ken Duffy - Hamilton Institute, National University of Ireland

NSF support: ECS-0523620

Page 2: Why are stochastic networks so hard to simulate?

References

(Skip this advertisement)

Page 3: Why are stochastic networks so hard to simulate?

Outline

Motivation

Reflected Random Walks

Why So Lopsided?

Most Likely Paths

Summary and Conclusions

[Figure: a simulated sample path (Observed: X) compared with the most likely path predicted by theory, plotted over 0 ≤ t ≤ 50.]

Page 4: Why are stochastic networks so hard to simulate?

MM1 queue - the nicest of Markov chains

For the stable queue, with load strictly less than one, it is:

• Reversible
• Skip-free
• Monotone
• Marginal distribution geometric
• Geometrically ergodic

Page 5: Why are stochastic networks so hard to simulate?

MM1 queue - the nicest of Markov chains

For the stable queue, with load strictly less than one, it is:

• Reversible
• Skip-free
• Monotone
• Marginal distribution geometric
• Geometrically ergodic

Why then, can't I simulate my queue?

Page 6: Why are stochastic networks so hard to simulate?

MM1 queue - the nicest of Markov chains

Why then, can't I simulate my queue?

[Figure: histogram of the CLT-scaled error (1/√100,000) Σ_{t=1}^{100,000} (X(t) − η) for the MM1 queue with ρ = 0.9, compared against the matching Gaussian density. Ch. 11, CTCN.]
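The histogram above can be reproduced in a few lines for a modest run. A sketch with illustrative parameters: here the increments are ±1 with ρ = 0.5, for which the steady-state mean η = 1 is known in closed form.

```python
import random

# CLT-scaled error (1/sqrt(n)) * sum_t (X(t) - eta) for the reflected +/-1 walk
n, alpha = 100_000, 1/3          # P(Delta = +1) = 1/3, so rho = 0.5 and eta = 1
rng = random.Random(4)
x, err = 0, 0.0
for _ in range(n):
    x = max(0, x + (1 if rng.random() < alpha else -1))
    err += x - 1.0
z = err / n ** 0.5               # approximately N(0, asymptotic variance)
```

Repeating this over many independent runs and histogramming `z` gives the Gaussian-looking picture on the slide.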

Page 7: Why are stochastic networks so hard to simulate?

II. Reflected Random Walk

Page 8: Why are stochastic networks so hard to simulate?

Reflected Random Walk

A reflected random walk:

X(t + 1) = max(0, X(t) + ∆(t + 1)), t ≥ 0,

where X(0) = x is the initial condition, and the increments ∆ are i.i.d.

Page 9: Why are stochastic networks so hard to simulate?

Reflected Random Walk

A reflected random walk:

X(t + 1) = max(0, X(t) + ∆(t + 1)), t ≥ 0,

where X(0) = x is the initial condition, and the increments ∆ are i.i.d.

Basic stability assumption: negative drift, E[∆(0)] < 0, and finite second moment, E[∆(0)^2] < ∞.

Page 10: Why are stochastic networks so hard to simulate?

MM1 queue - the nicest of Markov chains

Reflected random walk with increments P(∆(t) = +1) = α, P(∆(t) = −1) = µ.

Load condition: ρ = α/µ < 1.

Page 11: Why are stochastic networks so hard to simulate?

Reflected Random Walk

A reflected random walk:

X(t + 1) = max(0, X(t) + ∆(t + 1)), t ≥ 0,

where X(0) = x is the initial condition, and the increments ∆ are i.i.d.

Basic stability assumption: negative drift, E[∆(0)] < 0, and finite second moment, E[∆(0)^2] < ∞.

Basic question: Can I compute sample path averages to estimate the steady-state mean?
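The basic question invites a direct experiment. A minimal sketch in Python (the ±1 increment law and all parameters are illustrative; with ρ = α/µ = 0.5 the steady-state mean is ρ/(1 − ρ) = 1):

```python
import random

def simulate_rrw_mean(n, alpha, seed=0):
    """Sample-path average eta(n) of X(t+1) = max(0, X(t) + Delta(t+1)),
    with i.i.d. increments Delta = +1 w.p. alpha, -1 w.p. mu = 1 - alpha."""
    rng = random.Random(seed)
    x, total = 0, 0
    for _ in range(n):
        delta = 1 if rng.random() < alpha else -1
        x = max(0, x + delta)
        total += x
    return total / n

# rho = alpha/mu = 0.5, so the steady-state mean is rho/(1 - rho) = 1
eta_n = simulate_rrw_mean(200_000, alpha=1/3)
```

Even in this nicest of cases, the rest of the talk explains why such averages behave badly as ρ approaches one.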

Page 12: Why are stochastic networks so hard to simulate?

Simulating the RRW: Asymptotic Variance

The CLT requires a third moment for the increments:

E[∆(0)] < 0 and E[∆(0)^3] < ∞

Asymptotic variance = O(1/(1 − ρ)^4)    [Whitt 1989; Asmussen 1992]
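To put the O(1/(1 − ρ)^4) growth in concrete terms, a back-of-the-envelope sketch (the constant c is illustrative; only the scaling matters):

```python
def required_run_length(rho, target_se, c=1.0):
    """Run length n so that sqrt(asymptotic variance / n) <= target_se,
    taking asymptotic variance = c / (1 - rho)**4."""
    var = c / (1.0 - rho) ** 4
    return var / target_se ** 2

n_half = required_run_length(0.5, target_se=0.1)   # about 1,600 steps
n_nine = required_run_length(0.9, target_se=0.1)   # about 1,000,000 steps
```

Moving the load from 0.5 to 0.9 multiplies the required run length by (0.5/0.1)^4 = 625.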

Page 13: Why are stochastic networks so hard to simulate?

Simulating the RRW: Asymptotic Variance

The CLT requires a third moment for the increments:

E[∆(0)] < 0 and E[∆(0)^3] < ∞

Asymptotic variance = O(1/(1 − ρ)^4)

[Figure: histogram of the CLT-scaled error (1/√100,000) Σ_{t=1}^{100,000} (X(t) − η) for the MM1 queue with ρ = 0.9, compared against the matching Gaussian density. Ch. 11, CTCN.]

Page 14: Why are stochastic networks so hard to simulate?

Simulating the RRW: Lopsided Statistics

Assume only negative drift and finite second moment:

E[∆(0)] < 0 and E[∆(0)^2] < ∞

Lower LDP asymptotics: for each r < η,

lim_{n→∞} (1/n) log P{η(n) ≤ r} = −I(r) < 0    [M 2006]

Page 15: Why are stochastic networks so hard to simulate?

Simulating the RRW: Lopsided Statistics

Assume only negative drift and finite second moment:

E[∆(0)] < 0 and E[∆(0)^2] < ∞

Lower LDP asymptotics: for each r < η,

lim_{n→∞} (1/n) log P{η(n) ≤ r} = −I(r) < 0    [M 2006]

Upper LDP asymptotics are null: for each r ≥ η,

lim_{n→∞} (1/n) log P{η(n) ≥ r} = 0    [M 2006; CTCN]

even for the MM1 queue!
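The lopsidedness is visible in a small experiment: over many independent runs, the sample mean almost never falls far below η, while values equally far above are comparatively common. All parameters are illustrative (ρ = 0.5, so η = 1 for this walk):

```python
import random

def eta_hat(n, alpha, rng):
    """One sample-path average of the reflected +/-1 walk."""
    x, total = 0, 0
    for _ in range(n):
        x = max(0, x + (1 if rng.random() < alpha else -1))
        total += x
    return total / n

rng = random.Random(1)
estimates = [eta_hat(1000, 1/3, rng) for _ in range(400)]
under = sum(e <= 0.6 for e in estimates)  # far below eta = 1: exponentially rare
over = sum(e >= 1.4 for e in estimates)   # equally far above: much more likely
```

The empirical distribution of the estimates is strongly right-skewed, exactly as the asymmetric LDP asymptotics predict.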

Page 16: Why are stochastic networks so hard to simulate?

III. Why So Lopsided?

Page 17: Why are stochastic networks so hard to simulate?

Lower LDP

Construct the family of twisted transition laws

P̌(x, dy) = e^{θx − Λ(θ) − F(x)} P(x, dy) e^{F(y)},    θ < 0

Page 18: Why are stochastic networks so hard to simulate?

Lower LDP

Construct the family of twisted transition laws

P̌(x, dy) = e^{θx − Λ(θ) − F(x)} P(x, dy) e^{F(y)},    θ < 0

Geometric ergodicity of the twisted chain implies multiplicative ergodic theorems, and thence Bahadur-Rao LDP asymptotics.    [Kontoyiannis & M 2003, 2005; M 2006]

Page 19: Why are stochastic networks so hard to simulate?

Lower LDP

Construct the family of twisted transition laws

P̌(x, dy) = e^{θx − Λ(θ) − F(x)} P(x, dy) e^{F(y)},    θ < 0

Geometric ergodicity of the twisted chain implies multiplicative ergodic theorems, and thence Bahadur-Rao LDP asymptotics.

This is only possible when θ is negative.    [Kontoyiannis & M 2003, 2005; M 2006]
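The twist can be illustrated at the level of the increment distribution (a simplified sketch of exponential tilting, not the state-dependent kernel itself): for θ < 0, the tilted law p_θ(v) ∝ e^{θv} p(v) pushes the drift further negative, which is why only the lower deviations admit a standard LDP.

```python
import math

def twist(p, theta):
    """Exponentially tilt a finite increment law p: v -> p(v) e^{theta v} / M(theta)."""
    m = sum(prob * math.exp(theta * v) for v, prob in p.items())  # M(theta)
    return {v: prob * math.exp(theta * v) / m for v, prob in p.items()}

p = {+1: 1/3, -1: 2/3}                      # drift = -1/3
p_theta = twist(p, theta=-0.5)
drift = sum(v * prob for v, prob in p_theta.items())
```

With θ < 0 the tilted walk drifts down even faster, so the twisted chain stays geometrically ergodic; no θ > 0 twist can do the same.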

Page 20: Why are stochastic networks so hard to simulate?

Null Upper LDP: What is the area of a triangle?

[Figure: a triangle with base b and height h.]

A = (1/2) b h

Page 21: Why are stochastic networks so hard to simulate?

Null Upper LDP: What is the area of a triangle?

[Figure: an excursion of X(t) over [0, T], approximated by a triangle with base b and height h.]

A = (1/2) b h,    η(n) ≈ n^{−1} A

Page 22: Why are stochastic networks so hard to simulate?

Null Upper LDP: Area of a Triangle?

[Figure: an excursion of X(t) over [0, T], approximated by a triangle with base b and height h.]

Consider the scaling T = n, h = γn, b = β*n, with η(n) ≈ n^{−1} A.

Base obtained from the most likely path to this height. Large deviations for reflected random walk: e.g., Ganesh, O'Connell, and Wischik, 2004.

Page 23: Why are stochastic networks so hard to simulate?

Null Upper LDP: Area of a Triangle

[Figure: an excursion of X(t) over [0, T], approximated by a triangle with base b and height h.]

Consider the scaling T = n, h = γn, b = β*n, so that

A_n = (1/2) γ β* n²,    η(n) ≈ n^{−1} A_n

Upper LDP is null: for any γ.    [M 2006; CTCN 2008; see also Borovkov et al. 2003]
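The scaling argument can be checked numerically: to make η(n) ≈ r with a triangular excursion, the needed height fraction is γ_n = 2r/(β* n), so the actual height γ_n · n stays bounded, the large-deviation cost (proportional to the height) stays bounded, and the n-normalized exponent vanishes. A sketch with illustrative constants β* and δ:

```python
def normalized_exponent(n, r, beta_star=1.0, delta=1.0):
    """Cost-per-n of a triangular excursion achieving sample mean r.
    Height needed: gamma * n = 2r / beta_star, independent of n, so the
    large-deviation cost (~ delta * height) is O(1) and cost/n -> 0."""
    gamma = 2 * r / (beta_star * n)
    height = gamma * n
    cost = delta * height
    return cost / n

exponents = [normalized_exponent(n, r=2.0) for n in (10, 100, 1000, 10_000)]
```

The normalized exponent decays like 1/n, which is exactly the null upper LDP.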

Page 24: Why are stochastic networks so hard to simulate?

IV. Most Likely Paths

Page 25: Why are stochastic networks so hard to simulate?

What Is The Most Likely Area?

Are triangular excursions optimal? Most likely path?

[Figure: candidate excursion paths over [0, T].]

Page 26: Why are stochastic networks so hard to simulate?

What Is The Most Likely Area?

Triangular excursions are not optimal. A concave perturbation greatly increases the area at modest "cost".

[Figure: a concave perturbation of the triangular excursion, labeled "Better...".]

Page 27: Why are stochastic networks so hard to simulate?

Most Likely Paths

Scaled process: ψ_n(t) = n^{−1} X(nt)

LDP question translated to the scaled process.    [Duffy and M 2009, ...]

Page 28: Why are stochastic networks so hard to simulate?

Most Likely Paths

Scaled process: ψ_n(t) = n^{−1} X(nt)

LDP question translated to the scaled process:

η(n) ≈ An  ⟺  ∫_0^1 ψ_n(t) dt ≈ A
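With ψ_n taken as the piecewise-constant interpolation of the scaled path, the correspondence is exact: the left-endpoint Riemann sum for ∫_0^1 ψ_n(t) dt at resolution 1/n is n^{−1} η(n). A quick check (the ±1 increment law is illustrative):

```python
import random

rng = random.Random(2)
n = 1000
x, path = 0, []
for _ in range(n):
    x = max(0, x + (1 if rng.random() < 1/3 else -1))
    path.append(x)

eta_n = sum(path) / n                                  # sample-path average
# psi_n(t) = X(nt)/n, piecewise constant on [k/n, (k+1)/n)
integral = sum((path[k] / n) * (1 / n) for k in range(n))
```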

Page 29: Why are stochastic networks so hard to simulate?

Most Likely Paths

Scaled process: ψ_n(t) = n^{−1} X(nt), with η(n) ≈ An ⟺ ∫_0^1 ψ_n(t) dt ≈ A.

Basic assumption: the sample paths for the unconstrained random walk with increments ∆ satisfy the LDP in D[0,1] with good rate function I_X.

Page 30: Why are stochastic networks so hard to simulate?

Most Likely Paths

Basic assumption: the sample paths for the unconstrained random walk with increments ∆ satisfy the LDP in D[0,1] with good rate function I_X.

For concave ψ, with no downward jumps:

I_X(ψ) = ϑ+ ψ(0+) + ∫_0^1 I_∆(ψ'(t)) dt

Page 31: Why are stochastic networks so hard to simulate?

Most Likely Path Via Dynamic Programming

Dynamic programming formulation:

min  ϑ+ ψ(0+) + ∫_0^1 I_∆(ψ'(t)) dt
s.t.  ∫_0^1 ψ(t) dt = A

Page 32: Why are stochastic networks so hard to simulate?

Most Likely Path Via Dynamic Programming

Dynamic programming formulation:

min  ϑ+ ψ(0+) + ∫_0^1 I_∆(ψ'(t)) dt
s.t.  ∫_0^1 ψ(t) dt = A

Solution: integration by parts + Lagrangian relaxation:

min  ϑ+ ψ(0+) + ∫_0^1 I_∆(ψ'(t)) dt + λ( ψ(1) − A − ∫_0^1 t ψ'(t) dt )

Page 33: Why are stochastic networks so hard to simulate?

Most Likely Path Via Dynamic Programming

min  ϑ+ ψ(0+) + ∫_0^1 I_∆(ψ'(t)) dt + λ( ψ(1) − A − ∫_0^1 t ψ'(t) dt )

Variational arguments where ψ is strictly concave: strictly concave on (T_0, T_1), with ψ linear elsewhere.

[Figure: optimal path ψ(t) on [0, 1], with an initial jump to ψ(0+), linear outside (T_0, T_1), and strictly concave on (T_0, T_1).]

Page 34: Why are stochastic networks so hard to simulate?

Most Likely Path Via Dynamic Programming

min  ϑ+ ψ(0+) + ∫_0^1 I_∆(ψ'(t)) dt + λ( ψ(1) − A − ∫_0^1 t ψ'(t) dt )

Variational arguments where ψ is strictly concave: strictly concave on (T_0, T_1), with ψ linear elsewhere.

∇I_∆(ψ'(t)) = b − λ* t    for a.e. t ∈ (T_0, T_1)

[Figure: optimal path ψ(t) on [0, 1], with an initial jump to ψ(0+), linear outside (T_0, T_1), and strictly concave on (T_0, T_1).]
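As a worked example, suppose the increments are Gaussian, ∆ ~ N(m, σ²), so I_∆(v) = (v − m)²/(2σ²) and ∇I_∆(v) = (v − m)/σ². The condition ∇I_∆(ψ'(t)) = b − λ* t then gives ψ'(t) = m + σ²(b − λ* t): the optimal path is a parabola, strictly concave whenever λ* > 0. A numerical check, with m, σ², b, and λ* chosen purely for illustration:

```python
# psi'(t) = m + sigma2 * (b - lam * t)
# => psi(t) = psi0 + m*t + sigma2 * (b*t - lam * t**2 / 2), a concave parabola
m, sigma2, b, lam, psi0 = -0.5, 1.0, 2.0, 1.5, 0.0

def psi(t):
    return psi0 + m * t + sigma2 * (b * t - lam * t * t / 2)

ts = [k / 100 for k in range(101)]
vals = [psi(t) for t in ts]
# strict concavity: second differences are negative (= -sigma2 * lam * h^2)
second_diffs = [vals[k + 1] - 2 * vals[k] + vals[k - 1] for k in range(1, 100)]
```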

Page 35: Why are stochastic networks so hard to simulate?

Most Likely Paths - Examples

[Figure: optimal paths ψ*(t) for various values of A, with the exponent ϑ+ ψ*(0+) + ∫_0^1 I_∆(ψ*'(t)) dt shown for each, for four increment distributions: Gaussian, ±1 (MM1 queue), geometric, and Poisson.]

Page 36: Why are stochastic networks so hard to simulate?

Most Likely Paths - Examples

The observed path has the largest simulated mean out of 10^8 sample paths.

[Figure: selected sample paths, theory vs. observed X, for the RRW with Gaussian increments and for the MM1 queue.]

Page 37: Why are stochastic networks so hard to simulate?

Conclusions

Page 38: Why are stochastic networks so hard to simulate?

Summary

It is widely known that simulation variance is high in "heavy traffic" for queueing models.

Large deviation asymptotics are exotic, regardless of load.

Sample-path behavior is identified for the RRW when the sample mean is large. This behavior is very different from that previously seen in "buffer overflow" analysis of queues.

Page 39: Why are stochastic networks so hard to simulate?

Summary

It is widely known that simulation variance is high in "heavy traffic" for queueing models.

Large deviation asymptotics are exotic, regardless of load.

Sample-path behavior is identified for the RRW when the sample mean is large. This behavior is very different from that previously seen in "buffer overflow" analysis of queues.

What about other Markov models?

Page 40: Why are stochastic networks so hard to simulate?

MM1 queue - the nicest of Markov chains

What about other Markov models?

• Reversible
• Skip-free
• Monotone
• Marginal distribution geometric
• Geometrically ergodic

Page 41: Why are stochastic networks so hard to simulate?

MM1 queue - the nicest of Markov chains

What about other Markov models?

The skip-free property makes the fluid model analysis possible. Similar behavior in:

• Fixed-gain SA
• Some MCMC algorithms

Page 42: Why are stochastic networks so hard to simulate?

What Next?

• Heavy-tailed and/or long-memory settings?

Page 43: Why are stochastic networks so hard to simulate?

What Next?

• Heavy-tailed and/or long-memory settings?

• Variance reduction, such as control variates

[Figure: running estimates η(n), η+(n), and η−(n) as functions of n, 0 ≤ n ≤ 1000. M 2006; CTCN.]

Page 44: Why are stochastic networks so hard to simulate?

What Next?

• Heavy-tailed and/or long-memory settings?

• Variance reduction, such as control variates

Smoothed estimator using the fluid value function in the control variate:

g = h − Ph = −Dh,    h = Ĵ

[Figure: performance as a function of safety stocks w: (i) simulation using the smoothed estimator, (ii) simulation using the standard estimator. Henderson, M., and Tadić 2003; CTCN.]
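For the simple RRW, the smoothed estimator can be sketched directly: with h = Ĵ(x) = x²/(2δ), δ = −E[∆], the term Ph − h is available in closed form for the ±1 walk, and adding it to the standard summand leaves the steady-state mean unchanged (since E_π[Ph − h] = 0) while removing most of the variability. This is a simplified illustration of the idea, not the multiclass-network experiment shown above:

```python
import random

def estimates(n, alpha, rng):
    """Standard vs. smoothed estimates of the steady-state mean for the +/-1 RRW."""
    mu = 1 - alpha
    delta = mu - alpha                     # -E[Delta] > 0 under the load condition
    def ph_minus_h(x):
        # closed-form P h - h for h(x) = x^2 / (2 delta)
        return alpha / (2 * delta) if x == 0 else 1 / (2 * delta) - x
    x, s_std, s_cv = 0, 0.0, 0.0
    for _ in range(n):
        x = max(0, x + (1 if rng.random() < alpha else -1))
        s_std += x
        s_cv += x + ph_minus_h(x)          # control variate has zero steady-state mean
    return s_std / n, s_cv / n

def variance(vals):
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

rng = random.Random(3)
pairs = [estimates(2000, 1/3, rng) for _ in range(50)]
var_std = variance([p[0] for p in pairs])
var_cv = variance([p[1] for p in pairs])
```

Because h solves Poisson's equation up to a bounded correction, the smoothed summand is nearly constant along the path, which is the source of the variance reduction.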

Page 45: Why are stochastic networks so hard to simulate?

What Next?

• Heavy-tailed and/or long-memory settings?

• Variance reduction, such as control variates

• Original question? Conjecture:

lim_{n→∞} (1/√n) log P{η(n) ≥ r} = −J_η(r) < 0,    r > η

for some range of r.

Page 46: Why are stochastic networks so hard to simulate?

References

[1] G. Fort, S. Meyn, E. Moulines, and P. Priouret. ODE methods for skip-free Markov chain stability with applications to MCMC. Ann. Appl. Probab., 18(2):664-707, 2008.

[2] I. Kontoyiannis and S. P. Meyn. Spectral theory and limit theorems for geometrically ergodic Markov processes. Ann. Appl. Probab., 13:304-362, 2003. Presented at the INFORMS Applied Probability Conference, NYC, July, 2001.

[3] I. Kontoyiannis and S. P. Meyn. Large deviations asymptotics and the spectral theory of multiplicatively regular Markov processes. Electron. J. Probab., 10(3):61-123 (electronic), 2005.

[4] K. R. Duffy and S. P. Meyn. Most likely paths to error when estimating the mean of a reflected random walk. http://arxiv.org/abs/0906.4514, June 2009.

[5] S. P. Meyn. Large deviation asymptotics and control variates for simulating large functions. Ann. Appl. Probab., 16(1):310-339, 2006.

[6] V. S. Borkar and S. P. Meyn. The ODE method for convergence of stochastic approximation and reinforcement learning. SIAM J. Control Optim., 38(2):447-469, 2000.

[7] S. P. Meyn. Control Techniques for Complex Networks. Cambridge University Press, Cambridge, 2007.

[8] S. G. Henderson, S. P. Meyn, and V. B. Tadić. Performance evaluation and policy selection in multiclass networks. Discrete Event Dynamic Systems: Theory and Applications, 13(1-2):149-189, 2003. Special issue on learning, optimization and decision making (invited).