When Means Bound Variances: Concentration for Recursively Determined Random Values

Mike Steele¹

Analysis of Algorithms, Menorca, Spain
May 2013

¹ Including joint work with Noah Gans and Alessandro Arlotto

J. M. Steele (Analysis of Algorithms), Concentration and Recursion, May 2013


“Just once in a while I’d like to see someone begin a talk without a PLAN”—Luc Devroye, Barbados, 2013


After starting with a quote — time for a “heads up”?

1. Quick Look at Variance Bounds in the “Early Days”
   - The Euclidean TSP: Two Probability Models
   - A Variance Bound for Many Seasons
   - Modern Technology — a Gap and a Challenge

2. A Challenge for Bounders of Variances
   - Leading Example: The Sequential Knapsack Problem
   - MDPs: A General Framework — Served with Alphabet Soup
   - Three Notable Properties
   - Main Result: Variance Bound for a General Class of MDPs
   - Proof sketch
   - Concrete Conjecture in Simplest Context

3. “Take Aways”


Quick Look at Variance Bounds in the “Early Days”
The Euclidean TSP: Two Probability Models

First imagine n points in the plane and the shortest path through them ...

I’d draw a picture, but I am “Beamer challenged.”

Now, more precisely, imagine how the points arrive. There are two ways:
- The Incrementing model
- The Independent model

This matters because of Karp’s Partitioning Algorithm for the TSP:
- It is based on the BHH theorem:

    lim_{n→∞} L_n / √n = β_BHH   with probability one.

- But BHH is NOT good enough for honest “almost sure” Partitioning Algorithms.

Karp and S. (1985) sorted this out: one needs complete convergence (or better).

The tool of the time was a jackknife bound on the variance.
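The limit L_n/√n → β_BHH is easy to watch numerically, even with a crude tour. The sketch below uses a greedy nearest-neighbour path as a stand-in for the optimal tour (which is hard to compute), so the printed ratio over-estimates the numerically estimated β_BHH ≈ 0.71, but it still stabilizes as n grows:

```python
import math
import random

def nn_tour_length(points):
    """Length of the greedy nearest-neighbour path through the points."""
    unvisited = points[1:]
    cur = points[0]
    total = 0.0
    while unvisited:
        # pick the closest remaining point (compare squared distances)
        j = min(range(len(unvisited)),
                key=lambda i: (unvisited[i][0] - cur[0]) ** 2
                            + (unvisited[i][1] - cur[1]) ** 2)
        nxt = unvisited.pop(j)
        total += math.dist(cur, nxt)
        cur = nxt
    return total

random.seed(1)
for n in (500, 2000):
    pts = [(random.random(), random.random()) for _ in range(n)]
    print(n, nn_tour_length(pts) / math.sqrt(n))   # ratio roughly constant in n
```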

Quick Look at Variance Bounds in the “Early Days”
A Variance Bound for Many Seasons

Familiar Now but Still “News You Can Use”

There is a VERY GENERAL trick for bounding

    Var F(X_1, X_2, ..., X_n).

Steele (1981), generalizing Efron and Stein (1980), showed it is bounded by

    (1/2) ∑_{1 ≤ i ≤ n} E[ (F(X_1, ..., X_i, ..., X_n) − F(X_1, ..., X̂_i, ..., X_n))² ]   ♥

where X̂_i is an independent copy of X_i.

For the TSP (and the MST, etc.) this can be used to prove that in d = 2 there is a constant C such that

    Var L_n ≤ C   for all n.

Even now this may seem surprising. Here, and in many other cases, it gives a very pleasing path to the desired strong laws.
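The bound ♥ is easy to test by simulation. A minimal sketch with F = max and i.i.d. Uniform(0,1) coordinates: resampling one randomly chosen coordinate and multiplying by n gives an unbiased estimate of the sum in ♥.

```python
import random
import statistics

def efron_stein_demo(F, n=10, reps=20000, rng=random.Random(0)):
    """Monte Carlo comparison of Var F(X_1..X_n) against the Efron-Stein
    bound (1/2) * sum_i E (F(X) - F(X with X_i resampled))^2,
    for i.i.d. Uniform(0,1) coordinates."""
    vals = []
    sq_diffs = 0.0
    for _ in range(reps):
        x = [rng.random() for _ in range(n)]
        fx = F(x)
        vals.append(fx)
        i = rng.randrange(n)            # random coordinate i
        y = x[:]
        y[i] = rng.random()             # resample X_i independently
        sq_diffs += (fx - F(y)) ** 2
    var = statistics.pvariance(vals)
    # averaging over a random i and multiplying by n estimates the sum over i
    bound = 0.5 * n * sq_diffs / reps
    return var, bound

var, bound = efron_stein_demo(max)
print(var, bound)   # the Efron-Stein bound should dominate the variance
```

Swapping in another symmetric statistic for `max` (the sum, the range, a small tour length) shows the same domination.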

Quick Look at Variance Bounds in the “Early Days”
Modern Technology — a Gap and a Challenge

After intermediate results by several authors, Talagrand proved with his convex distance inequality that the TSP and MST in d = 2 have Gaussian tail bounds:

    P(|L_n − E L_n| ≥ x) ≤ A e^{−B x²}.

The proof of this inequality can be somewhat simplified by using the Spacefilling Curve Heuristic to get a “certificate” for the size of Talagrand’s distance (see PTCO for the trick).

The problem for d > 2 seems to have dropped off the radar. The basic variance bound gives

    Var L_n ≤ C n^{(d−2)/d}   for all n;

more is known, but the full truth remains to be sorted out.

The text of Boucheron, Lugosi, and Massart develops the bound ♥ in remarkably powerful ways.
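For readers who have not met the Spacefilling Curve Heuristic: visiting the points in their order along a Hilbert curve yields a tour of length O(√n) in the unit square, which is the kind of cheap “certificate” the slide alludes to. A sketch, using the standard bit-twiddling Hilbert index (the grid resolution 1024 is an arbitrary choice):

```python
import math
import random

def xy2d(side, x, y):
    """Index of grid cell (x, y) along a Hilbert curve filling a
    side-by-side grid (side a power of two)."""
    d = 0
    s = side // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                      # rotate/flip the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        s //= 2
    return d

def spacefilling_tour_length(points, side=1024):
    """Visit the points in Hilbert-curve order; a classic O(sqrt(n)) tour."""
    key = lambda p: xy2d(side,
                         min(int(p[0] * side), side - 1),
                         min(int(p[1] * side), side - 1))
    tour = sorted(points, key=key)
    return sum(math.dist(a, b) for a, b in zip(tour, tour[1:]))

random.seed(2)
pts = [(random.random(), random.random()) for _ in range(4000)]
print(spacefilling_tour_length(pts) / math.sqrt(len(pts)))  # stays bounded in n
```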

A Challenge for Bounders of Variances

A Challenge for Bounders of Variances
Leading Example: The Sequential Knapsack Problem

How to Pack a Knapsack Sequentially and Optimally
(a problem going back at least to Coffman et al., 1987)

- Knapsack capacity: c ∈ (0, ∞)
- Item sizes: Y_1, Y_2, ..., Y_n independent with continuous distribution F
- Decision: include or exclude Y_t, 1 ≤ t ≤ n
- Knapsack policy π: the number of items included is

      R_n(π) = max{ k : ∑_{i=1}^{k} Y_{τ_i} ≤ c },

  where τ_i, the index of the i-th item included, must be a stopping time
- Objective: sup_π E[R_n(π)]
- π*_n: the optimal Markov deterministic policy

Today’s question:

      Var[R_n(π*_n)] ?

Page 36: When Means Bound Variances: Concentration for Recursively ...stat.wharton.upenn.edu/~steele/AccessCash/Steele-AofA-Menorca-2… · When Means Bound Variances: Concentration for Recursively

MDPs: The Alphabet Soup Illustrated by the Knapsack Problem

(X, Y, A, f, r, n)

X is the state space; at each t the decision maker knows the state of the system x ∈ X
    Knapsack example: x is the remaining capacity

The independent sequence Y_1, Y_2, ..., Y_n takes values in Y
    Knapsack example: y ∈ Y is the size of the item that is presented

Action space: A(t, x, y) ⊆ A is the set of admissible actions for (x, y) at time t
    Knapsack example: "select"; "do not select"

State transition function: f(t, x, y, a) is the state one reaches by taking a ∈ A(t, x, y)
    Knapsack example: f(t, x, y, select) = x − y; f(t, x, y, do not select) = x

Reward function: r(t, x, y, a) is the reward for taking action a at time t when at (x, y)
    Knapsack example: r(t, x, y, select) = 1; r(t, x, y, do not select) = 0

Time horizon: n < ∞
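One way to hold the (X, Y, A, f, r, n) data in code is as a bundle of callables. The encoding, field names, and the label "pass" for the do-not-select action below are ours, chosen only to mirror the knapsack illustration:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class MDP:
    """The (X, Y, A, f, r, n) tuple as plain callables; an
    illustrative encoding, not the talk's formalism."""
    actions: Callable  # A(t, x, y) -> tuple of admissible actions
    f: Callable        # f(t, x, y, a) -> next state
    r: Callable        # r(t, x, y, a) -> one-period reward
    n: int             # time horizon

def knapsack_mdp(n):
    # "pass" plays the role of the do-nothing action; "select" is
    # admissible only when the item fits the remaining capacity.
    return MDP(
        actions=lambda t, x, y: ("select", "pass") if y <= x else ("pass",),
        f=lambda t, x, y, a: x - y if a == "select" else x,
        r=lambda t, x, y, a: 1 if a == "select" else 0,
        n=n,
    )

m = knapsack_mdp(10)
```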


MDPs: A Natural Context Where Means Bound Variances

Π(n): the set of all feasible Markov deterministic policies for the n-period problem

Reward of policy π up to time k:

    R_k(π) = Σ_{t=1}^{k} r(t, X_t, Y_t, A_t),   X_1 = x̄, 1 ≤ k ≤ n

Expected total reward criterion, i.e. we look for π*_n ∈ Π(n) such that

    E[R_n(π*_n)] = sup_{π ∈ Π(n)} E[R_n(π)].

Dynamic Programming Recursion: for each 1 ≤ t ≤ n and x ∈ X,

    v_t(x) = E[ sup_{a ∈ A(t, x, Y_t)} { r(t, x, Y_t, a) + v_{t+1}(f(t, x, Y_t, a)) } ],

where v_{n+1}(x) = 0 for all x ∈ X, and v_1(x̄) = E[R_n(π*_n)].
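The recursion above can be run numerically by backward induction. The sketch below does this for the uniform knapsack with c = 1; the capacity grid, the discretization of Y, and all names are ours (and the size-0 grid point is a harmless discretization artifact, since a continuous F puts no mass there):

```python
# Backward induction for v_t on a capacity grid, for the knapsack
# with c = 1 and Y ~ Uniform[0, 1].  Discretization is ours.
def value_functions(n, m=200):
    # grid point i represents capacity i/m; v[i] approximates v_t(i/m)
    v_next = [0.0] * (m + 1)          # v_{n+1} is identically 0
    out = []
    for t in range(n, 0, -1):
        v = []
        for i in range(m + 1):
            total = 0.0
            for j in range(m + 1):    # offered item of size j/m
                stay = v_next[i]      # do-nothing action
                take = 1.0 + v_next[i - j] if j <= i else stay
                total += max(stay, take)
            v.append(total / (m + 1)) # expectation over uniform Y
        out.append(v)
        v_next = v
    out.reverse()                     # out[t-1][i] ~ v_t(i/m)
    return out

v = value_functions(n=10, m=50)
# v[0][-1] approximates v_1(1) = E[R_n(pi*_n)] at full capacity
```

With one period left and full capacity, any offered item fits, so v_n(1) = 1 exactly; the grid version reproduces this.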


Three Properties: Common and Easy to Check

Property (Bounded Rewards)
There is a constant K < ∞ such that 0 ≤ r(t, x, y, a) ≤ K for all triples (x, y, a) and all times 1 ≤ t ≤ n.

Property (Existence of a Do-Nothing Action)
For each time 1 ≤ t ≤ n and pair (x, y), the set of actions A(t, x, y) includes a do-nothing action a_0 such that

    r(t, x, y, a_0) = 0 and f(t, x, y, a_0) = x.

Property (Optimal Action Monotonicity, or "Pay-to-Play")
For each time 1 ≤ t ≤ n and state x ∈ X one has the inequality

    v_{t+1}(x*) ≤ v_{t+1}(x),

where x* = f(t, x, y, a*) and a* is an optimal action in A(t, x, y).
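For the knapsack instance the first two properties are mechanical to verify. A minimal sanity check, with our own encoding of f, r, and the do-nothing action "pass":

```python
# Check Bounded Rewards (K = 1) and the do-nothing action a0 for
# the knapsack MDP; the encoding is ours, for illustration only.
def f(t, x, y, a):
    return x - y if a == "select" else x

def r(t, x, y, a):
    return 1 if a == "select" else 0

a0 = "pass"  # the do-nothing action: zero reward, unchanged state
for x in (0.0, 0.25, 1.0):
    for y in (0.1, 0.9):
        assert r(1, x, y, a0) == 0 and f(1, x, y, a0) == x

# Bounded rewards hold with K = 1:
assert all(0 <= r(1, 1.0, y, a) <= 1
           for y in (0.1, 0.9) for a in ("select", a0))
```

Optimal action monotonicity is the one substantive hypothesis: here it holds because selecting an item only shrinks the remaining capacity, and v_{t+1} is monotone in capacity.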


A Variance Bounded by a Mean: Easy and Useful

Theorem (Arlotto, Gans, S., 2012)
Suppose that the Markov decision problem (X, Y, A, f, r, n) satisfies reward boundedness, existence of a do-nothing action, and optimal action monotonicity. If π*_n ∈ Π(n) is a Markov deterministic policy such that

    E[R_n(π*_n)] = sup_{π ∈ Π(n)} E[R_n(π)],

then

    Var[R_n(π*_n)] ≤ K E[R_n(π*_n)],

where K is the uniform bound on the one-period reward function.

Corollary (Relative Stability)
If E[R_n(π*_n)] → ∞ as n → ∞, then

    R_n(π*_n) / E[R_n(π*_n)] → 1 in probability as n → ∞.


Examples

Examples of MDPs that satisfy reward boundedness, existence of a do-nothing action, and optimal action monotonicity:

General dynamic and stochastic knapsack problems (Papastavrou, Rajagopalan and Kleywegt, 1996)

Sequential investment problems (Derman et al., 1975; Prastacos, 1983)

Capacity control problems in revenue management (Talluri and van Ryzin, 2004)

Stochastic depletion problems with deterministic transitions (Chan and Farias, 2009)

Sequential selection of monotone, unimodal and d-modal subsequences (Arlotto and S., 2011)

More?


Bounding the Variance by the Mean: Sketch of the Proof

For 0 ≤ t ≤ n, the process

    M_t = R_t(π*_n) + v_{t+1}(X_{t+1})

is a martingale with respect to the natural filtration F_t = σ{Y_1, ..., Y_t}

M_0 = E[R_n(π*_n)] and M_n = R_n(π*_n)

For the martingale differences d_t = M_t − M_{t−1},

    Var[M_n] = Var[R_n(π*_n)] = E[ Σ_{t=1}^{n} d_t² ]

An application of reward boundedness, existence of a do-nothing action, and optimal action monotonicity gives

    E[d_t² | F_{t−1}] ≤ K E[r(t, X_t, Y_t, A*_t) | F_{t−1}]

Taking total expectations and summing gives

    Var[R_n(π*_n)] ≤ K E[R_n(π*_n)]

Crucial here: X_{t+1} = f(t, X_t, Y_t, A_t) is F_t-measurable!
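The orthogonality identity Var[M_n] = E[Σ d_t²] is the workhorse of the sketch. It can be seen numerically on a toy martingale (partial sums of fair ±1 coin flips — a stand-in of our choosing, not the reward martingale itself):

```python
import random

# Monte Carlo check of Var[M_n] = E[ sum_t d_t^2 ] for the
# martingale of partial sums of fair +/-1 coin flips.
random.seed(1)
n, trials = 20, 20000
mn2_sum, sumsq_sum = 0.0, 0.0
for _ in range(trials):
    m, ssq = 0.0, 0.0
    for _ in range(n):
        d = random.choice((-1.0, 1.0))  # martingale difference d_t
        m += d
        ssq += d * d
    mn2_sum += m * m                    # E[M_n] = 0 for this toy
    sumsq_sum += ssq

var_mn = mn2_sum / trials        # Monte Carlo estimate of Var[M_n]
mean_sumsq = sumsq_sum / trials  # exactly n here, since d_t^2 = 1
```

In the proof the same identity is combined with the pointwise bound E[d_t² | F_{t−1}] ≤ K E[r(t, X_t, Y_t, A*_t) | F_{t−1}], which is where the three structural properties enter.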


Two Concrete Conjectures in the Simplest Context

The simple knapsack problem is equivalent to the monotone subsequence problem:

Knapsack capacity c = 1

Item sizes: Y_1, Y_2, ..., Y_n independent uniform on [0, 1]

Knapsack policy π: the number of items included is

    R_n(π) = max{ k : Σ_{i=1}^{k} Y_{τ_i} ≤ 1 },

π*_n: optimal Markov deterministic policy such that E[R_n(π*_n)] = sup_π E[R_n(π)]

Two Conjectures

From the MDP variance bound and from Arlotto and S. (2011) [for the lower bound] we know

    (1/3) E[R_n(π*_n)] − 2 ≤ Var(R_n(π*_n)) ≤ E[R_n(π*_n)] for all n ≥ 1,

and (prodded by a very insightful referee) we conjecture

    Var(R_n(π*_n)) ~ (1/3) E[R_n(π*_n)] as n → ∞.

CLT: when centered and scaled, R_n(π*_n) converges in distribution to N(0, 1).

J. M. Steele (Analysis of Algorithms) Concentration and Recursion May 2013 16 / 20

Page 65: When Means Bound Variances: Concentration for Recursively ...stat.wharton.upenn.edu/~steele/AccessCash/Steele-AofA-Menorca-2… · When Means Bound Variances: Concentration for Recursively

A Challenge for Bounders of Variances Concrete Conjecture in Simplest Context

Two Concrete Conjectures in the Simplest Context . . .The simple knapsack problem is equivalent to the monotone subsequence problem:

Knapsack capacity c = 1

Item sizes: Y1,Y2, . . . ,Yn independent uniform on [0, 1]

Knapsack policy π: the number of items included is

Rn(π) = max

{k :

k∑i=1

Yτi ≤ 1

},

π∗n : optimal Markov deterministic policy such that E [Rn(π∗n )] = supπ E [Rn(π)]

Two Conjectures

From the MDP variance bound and Arlotto and S. (2011) [for the lower bound] weknow

(1/3)E [Rn(π∗n )]− 2 ≤ Var (Rn(π∗n )) ≤ E [Rn(π∗n )] for all n ≥ 1

and (prodded by a very insightful referee) we conjecture

Var (Rn(π∗n )) ∼ (1/3)E [Rn(π∗n )] as n→∞

CLT When centered and scaled Rn(π∗n ) converges in distribution to N(0, 1).

J. M. Steele (Analysis of Algorithms) Concentration and Recursion May 2013 16 / 20

Page 66: When Means Bound Variances: Concentration for Recursively ...stat.wharton.upenn.edu/~steele/AccessCash/Steele-AofA-Menorca-2… · When Means Bound Variances: Concentration for Recursively

A Challenge for Bounders of Variances Concrete Conjecture in Simplest Context

Two Concrete Conjectures in the Simplest Context . . .The simple knapsack problem is equivalent to the monotone subsequence problem:

Knapsack capacity c = 1

Item sizes: Y1,Y2, . . . ,Yn independent uniform on [0, 1]

Knapsack policy π: the number of items included is

Rn(π) = max

{k :

k∑i=1

Yτi ≤ 1

},

π∗n : optimal Markov deterministic policy such that E [Rn(π∗n )] = supπ E [Rn(π)]

Two Conjectures

From the MDP variance bound and Arlotto and S. (2011) [for the lower bound] weknow

(1/3)E [Rn(π∗n )]− 2 ≤ Var (Rn(π∗n )) ≤ E [Rn(π∗n )] for all n ≥ 1

and (prodded by a very insightful referee) we conjecture

Var (Rn(π∗n )) ∼ (1/3)E [Rn(π∗n )] as n→∞

CLT When centered and scaled Rn(π∗n ) converges in distribution to N(0, 1).

J. M. Steele (Analysis of Algorithms) Concentration and Recursion May 2013 16 / 20


“Take Aways”

Not a Plan — Just Suggestions

1 Quick Look at Variance Bounds in the “Early Days”
   The Euclidean TSP: Two Probability Models
   A Variance Bound for Many Seasons
   Modern Technology — a Gap and a Challenge

2 A Challenge for Bounders of Variances
   Leading Example: The Sequential Knapsack Problem
   MDPs: A General Framework — Served with Alphabet Soup
   Three Notable Properties
   Main Result: Variance Bound for a General Class of MDPs
   Proof Sketch
   Concrete Conjecture in Simplest Context

3 “Take Aways”

J. M. Steele (Analysis of Algorithms) Concentration and Recursion May 2013 17 / 20


“Take Aways”

“Take Aways”: Hopefully Something New

Summary:

The alphabet soup of an MDP can be off-putting, but the MDP structure is honestly rich, and it is worth one’s time to become familiar with it (if you’ve not done so already). You get all the “benefits” of abstraction.

There is a natural martingale associated with any fixed-horizon MDP:

M_t = R_t(π*_n) + v_{t+1}(X_{t+1})

This dynamic programming martingale is not as universal as the Doob martingale, but it still has substantial range. It is worth consideration in “any sequential problem”; this is good, since in such problems the Doob martingale is often useless.
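That M_t is indeed a martingale is a short consequence of the Bellman equation. Writing r_{t+1} = R_{t+1}(π*_n) − R_t(π*_n) for the reward collected at stage t+1 and taking F_t to contain X_{t+1} (my notational conventions, chosen to match the display above), a sketch of the computation is:

```latex
% Bellman optimality at stage t+1 (under the optimal action):
%   v_{t+1}(x) = E[ r_{t+1} + v_{t+2}(X_{t+2}) \mid X_{t+1} = x ]
\begin{aligned}
E[M_{t+1} \mid \mathcal{F}_t]
  &= E\bigl[R_{t+1}(\pi^*_n) + v_{t+2}(X_{t+2}) \mid \mathcal{F}_t\bigr] \\
  &= R_t(\pi^*_n) + E\bigl[r_{t+1} + v_{t+2}(X_{t+2}) \mid \mathcal{F}_t\bigr] \\
  &= R_t(\pi^*_n) + v_{t+1}(X_{t+1}) = M_t .
\end{aligned}
```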

Simple martingale arguments can be used to extract useful (but not quite precise) distributional information; the “trick” seems to rest in finding the features of the MDP that feed back into nice properties of the DP martingale.

There are numerous open problems with a wide range of potential and of difficulty.

J. M. Steele (Analysis of Algorithms) Concentration and Recursion May 2013 18 / 20


Muchas Gracias a Todos

J. M. Steele (Analysis of Algorithms) Concentration and Recursion May 2013 19 / 20


References

References I

Arlotto, A. and Steele, J. M. (2011), ‘Optimal sequential selection of a unimodal subsequence of a random sequence’, Combinatorics, Probability and Computing 20(6), 799–814.

Chan, C. W. and Farias, V. F. (2009), ‘Stochastic depletion problems: effective myopic policies for a class of dynamic optimization problems’, Math. Oper. Res. 34(2), 333–350.

Coffman, Jr., E. G., Flatto, L. and Weber, R. R. (1987), ‘Optimal selection of stochastic intervals under a sum constraint’, Adv. in Appl. Probab. 19(2), 454–473.

Derman, C., Lieberman, G. J. and Ross, S. M. (1975), ‘A stochastic sequential allocation model’, Operations Res. 23(6), 1120–1130.

Papastavrou, J. D., Rajagopalan, S. and Kleywegt, A. J. (1996), ‘The dynamic and stochastic knapsack problem with deadlines’, Management Science 42(12), 1706–1718.

Prastacos, G. P. (1983), ‘Optimal sequential investment decisions under conditions of uncertainty’, Management Science 29(1), 118–134.

Talluri, K. T. and van Ryzin, G. J. (2004), The theory and practice of revenue management, International Series in Operations Research & Management Science, 68, Kluwer Academic Publishers, Boston, MA.

J. M. Steele (Analysis of Algorithms) Concentration and Recursion May 2013 20 / 20