Robust Optimization Concepts and Examples Yuriy Zinchenko Shane G. Henderson ORIE, Cornell...

Preview:

Citation preview

Robust OptimizationConcepts and Examples

Yuriy ZinchenkoShane G. Henderson

ORIE, Cornell University

Zinchenko and Henderson 2005 2

Outline

• What can go wrong with LP?• A familiar blend problem• The general picture

– Robust linear programming– Software, resources, practicalities

• Radiation therapy for cancer treatment

Zinchenko and Henderson 2005 3

What can go wrong with LP?

Tough LP problem:

max x + ys/t 1 x 1

1 y 1 x, y 0?

Zinchenko and Henderson 2005 4

Blend Problem

blend to get output properties at minimum cost

$ $$ $$$

but properties change with time

for anyinput propertieswithin reason

Zinchenko and Henderson 2005 5

Blend constraints

• Typical constraint looks like Low ≤ 10 x1 + 12 x2 + 7 x3 ≤ High

• Changes to Low ≤ a1 x1 + a2 x2 + a3 x3 ≤ High

for any vector a that is “close” to (10, 12, 7)

Zinchenko and Henderson 2005 6

General robust LP

min cTx s/t A(1) x b1

A(2) x b2

A(3) x b3

Zinchenko and Henderson 2005 7

A more detailed view

Simple linear constrainta x 1

x 0with a “close” to 1, namely0 a 2

Want x to work for all such aHow do we deal with it?

Zinchenko and Henderson 2005 8

a x 1, x 0 for all 0 a 2

max a x 1, x 0 0 a 2, x 0

2 x 1 , x 0

x 1/2 , x 0

Zinchenko and Henderson 2005 9

A slightly more involved example:a x + b y 1

where (a, b) “close” to (1, 1), namely inEllipsoidal (spherical)

“uncertainty” set U(a, b) is in U if(a, b) = (a0, b0) + (a, b)

with (a0, b0) = (1, 1) and a2 + b2 1

Zinchenko and Henderson 2005 10

Ellipsoidal “uncertainty” set U(a, b) = (a0, b0) + (a, b)

(a0, b0) = (1, 1)

a2 + b2 1

Want (x, y) to satisfya x + b y 1, for all (a, b) from U

U

(a0, b0)

Zinchenko and Henderson 2005 11

a x + b y 1 for all (a, b) in U

max a x + b y 1 (a, b) in U

What can we say about a x + b y ?

a x + b y = (a0 + a) x + (b0 + b) y

= (a0 x + b0 y) + (a x + b y)

Zinchenko and Henderson 2005 12

For a moment, think of (x, y)as your objective function (fixed)

max a x + b y ( 1 ?) (a, b) in U

same as

(a0 x + b0 y) + max (a x + b y) ( 1 ?)

a2 + b2 1

U

(a0, b0)

(x, y)

Zinchenko and Henderson 2005 13

max (a x + b y) ( 1 - (a0 x + b0 y) ?)

a2 + b2 1

Here

a x + b y ||(x, y)|| = (x2 + y2)1/2

the “length” of (x, y)

U(a0, b0)

(x1, y1)

(x2, y2)

Zinchenko and Henderson 2005 14

a x + b y 1 for all (a, b) in U

max a x + b y 1(a, b) in U

(a0 x + b0 y) + max (a x + b y) 1a2 + b2 1

||(x, y)|| 1 - (a0 x + b0 y)

Zinchenko and Henderson 2005 15

Good newsCan handle constraints of this type

||(x, y)|| 1 - (1 x + 1 y)

easily (the so-called second-order conicprogramming (SOCP))

Not much harder than linear programming!

Zinchenko and Henderson 2005 16

General Robust LP formulation

Robust LP:

max cTxs/t A(i) x bi, i = 1,…,m

wherec, x Rn, A(i) R1 x n, A(i)=A(i)

0 + wi Pi

withwi R1 x ki, ||wi|| 1, i=1,…,m, Pi

Rki x n

Zinchenko and Henderson 2005 17

SOCP equivalent:

max cTxs/t || Pi

x || bi - A(i)0 x, i = 1,…,m

Probabilistic interpretation:think of A(i) taken from an -level set of your favorite probability distribution (e.g. multivariate normal)

the robust constraint will readsatisfy the constraint with a given probability

Zinchenko and Henderson 2005 18

Where’d the ellipse come from?

• Expert opinion• Statistics: Averages live in

ellipsoids• Doesn’t have to be an ellipse. Can

be some other shape (e.g., boxes)

Zinchenko and Henderson 2005 19

Commercial:• Mosek (http://www.mosek.com/)

“Free”:• SeDuMi (http://sedumi.mcmaster.ca/)

• SDPT3.x (http://www.math.nus.edu.sg/~mattohkc/sdpt3.html/)

Software

Zinchenko and Henderson 2005 20

Practicalities

• Realistic problem sizes– number of variables/constraints on the

order of 103 – 104

– depends (greatly) on the problem data structure/sparsity

• Possible to obtain a “good”, “inexpensive” approximation with LP

Zinchenko and Henderson 2005 21

Generality• Possible to extend this approach to

quite a few other convex programming problems

Resources• Lectures on Modern Convex Optimizatio

n: Analysis, Algorithms, and Engineering Applications by A. Ben-Tal, A. S. Nemirovskii

• Google for Robust Optimization (robust LP etc.)

Zinchenko and Henderson 2005 22

Joint work with Millie Chu (Cornell) and Michael B. Sharpe (Princess Margaret Hospital, Toronto)

Zinchenko and Henderson 2005 23

Cancer treatment

• About 1.3 million new cancer cases in the U.S. each year

• Nearly 60% receive radiation therapy (in conjunction with surgery, chemotherapy etc)

Zinchenko and Henderson 2005 24

External beam radiation therapy• Radiation

delivered by a linear accelerator

• Cancer cells more susceptible than normal cells

• Overlay beams from different angles

• Dose given in daily fractions for ~ 6 weeks

Zinchenko and Henderson 2005 25

Intensity Modulated Radiation Therapy

• Block parts of the radiation beam – discretize the whole beam into a grid of smaller “beamlets”

• Choose different intensities for each beamlet

Intensity Modulated Radiation Therapy

Collaborative Working Group, 2001

Zinchenko and Henderson 2005 26

Goal: Choose beam angles and beamlet intensities that deliver enough radiation to kill all tumor cells, while avoiding healthy organs & tissue as much as possible

Treatment Planning

Princess Margaret Hospital

- Take CT scan- Delineate target

region and healthy structures

- Discretize body as small cubes, or “voxels”

- Formulate & solve a mathematical program to find a “good” plan

Zinchenko and Henderson 2005 27

Robust Treatment Planning

Setup errors+ Patient motion+Structural changes during

treatment= uncertainty in geometry

• Don’t rescan patient much if at all• Use RO to “robustify”

mathematical program

Zinchenko and Henderson 2005 28

Model Formulation • Many different formulations exist – we use a penalty formulation

minimize: penalty objective

subject to:

Pr(Dose to voxel i in healthy structure k ≤ Uk) ≥ 0.95

Pr(Dose to voxel i in tumor ≥ L) ≥ 0.95

x = beamlet intensities ≥ 0

Zinchenko and Henderson 2005 29

Computational Results

• Prostate: tumor + 5 healthy regions• 5 equi-spaced beams, ~ 225

beamlets from each angle• Voxel size = 2 cm, ~ 400 total

voxels• Solver: Mosek, v. 3.0.1.18• Solve time = 6 seconds (LP), 45

minutes (SOCP)

Zinchenko and Henderson 2005 30

Dose-Volume HistogramsDose-Volume Histograms

deterministic solution’s plan

stochastic solution’s plan

% of structure receiving ≥ x Gy

DVH of expected dose

Zinchenko and Henderson 2005 31

Comparison

• Simulate 10 treatments (45 fractions each)

• For each of the 10 treatments, and for each solution (deterministic & stochastic),– calculated dose delivered to each voxel in

each fraction– summed over the 45 fractions to get total

dose delivered to each voxel– plotted DVH

Zinchenko and Henderson 2005 32

DVH – Treatment 1DVH – Treatment 1

det

stoch

Zinchenko and Henderson 2005 33

DVH – Treatment 2DVH – Treatment 2

det

stoch

Zinchenko and Henderson 2005 34

DVH – Treatment 3DVH – Treatment 3

det

stoch

Zinchenko and Henderson 2005 35

DVH – Treatment 4DVH – Treatment 4

det

stoch

Zinchenko and Henderson 2005 36

DVH – Treatment 5DVH – Treatment 5

det

stoch

Zinchenko and Henderson 2005 37

DVH – Treatment 6DVH – Treatment 6

det

stoch

Zinchenko and Henderson 2005 38

DVH – Treatment 7DVH – Treatment 7

det

stoch

Zinchenko and Henderson 2005 39

DVH – Treatment 8DVH – Treatment 8

det

stoch

Zinchenko and Henderson 2005 40

DVH – Treatment 9DVH – Treatment 9

det

stoch

Zinchenko and Henderson 2005 41

DVH – Treatment 10DVH – Treatment 10

det

stoch

Zinchenko and Henderson 2005 42

Conclusions

• LP “pushes you into a corner”• True situation never same as data• Robust LP: Find good solution that

is always feasible within reason• Efficient solution methods: can

solve real problems• Software available

Zinchenko and Henderson 2005 43

A Bit More DetailA Bit More Detail• Di(x) = Dose delivered to voxel i in N fractions, with intensities x, a random variable

Di(x) is the sum of N random variables (N = 45), assume iid,

apply CLT, so Di(x) is approximately normally distributed

• Take n sample shifts, s1,...,sn, with associated probabilities p = (p1,...,pn)T

• Let ai(∙)T = ai(s1)T ai(s2)T dose delivered to voxel i, shifted by sj,

from each beamlet with unit intensityai(sn)T

so that NpTai(∙)Tx = expected total dose delivered to voxel i, for N fractions.

• Let vi(x) = sample variance of dose delivered to voxel i

Di(x) ~ Normal ( NpTai(·)Tx, Nvi(x) )

Zinchenko and Henderson 2005 44

)(N

)(N

)(N

)(N)(

x

xapm

x

xapx

i

Ti

Tk

i

Ti

Ti

vv

DP

• Want constraints to be violated with low probability (say, δ = .05)

• Example: maximum dose constraint on voxel i in Hk:

Assuming Di(x) ~ Normal ( NpTai(∙)Tx, Nvi(x) ),

mk

1)(N

)(Nz

vi

Ti

Tk

x

xapm

N

)(N)(

1

zv

Ti

Tk

i

xapmx

N

)(N)(

12

zR

Ti

TkT

i

xapmxa

• Second order cone program (SOCP)

Want P(Di(x) > mk) ≤ δ

Probabilistic ConstraintsProbabilistic Constraints

Zinchenko and Henderson 2005 45

Dose-Volume Constraints•Physicians like constraints of form:

“<= fraction fk of structure Hk gets >= dk”•0-1 var for each voxel: = 1 if dose is > dk.•MIP: Hard to solve!•Many voxels get near max allowed dose•Alternative: upper bound the “excess”

dose. For healthy structure Hk, we require:

•Linear constraints☺

kHi

kki gd )'N( Txa