Mixed-Integer programming approaches for some non-convex ...srikris.github.io/presentations/Defense.pdf · Life since 2009 I Part 1:Mixed integer programming (MIP) techniques with

Mixed-Integer programming approaches for some non-convex andcombinatorial optimization problems.

Srikrishna Sridhar

Computer Sciences

University of Wisconsin-Madison

http://www.cs.wisc.edu/∼srikris/

Advised By

Jeff Linderoth, James Luedtke, and Stephen Wright

Committee

Jeff Linderoth, James Luedtke, Christopher Re, Thomas Rutherford, and Stephen Wright

1

Life since 2009

I Part 1: Mixed integer programming (MIP) techniques with applications in:

I Chemical processes design.

I Compressors scheduling in petroleum reservoirs.

I Hydro turbine performance modelling.

I Part 2: Approximation algorithms for combinatorial problems using approximate LPRounding.

2

Life since 2009


I Extraction of natural resources like oil and natural gas.





2

Life since 2009


I Extraction of natural resources like oil and natural gas. Sample Application!





2

Life since 2009


I Extraction of natural resources like oil and natural gas.





2

Part 1: MIP techniques for some non-convex problems.

FPSO

Field

Reservoir

Well

Injection Lines Fiscal Model

Production Model

PLF Model

3

Oil Field Development Life Cycle

Exploration

Appraisal

Development

Production

Abandonment

Data Collection

Strategic Planning

4

Oil Field Development Life Cycle

Exploration

Appraisal

Development

Production

Abandonment

Data Collection

Strategic Planning

4

Oil Field Development Infrastructure1

1Statoil Peregrino Field

5

Oil Field Development Infrastructure1

FPSO

Field

Reservoir

Well

Injection Lines

1Tarhan, Grossmann, and Goel, Industrial & Engineering Chemistry Research (2009)

6

Why do we care?

FPSO development & installation. ≈ $5B1

Oil field development & installation. ≈ $1B1

20 year operational expenses. ≈ $5B1

Knowing optimal strategic & operational de-cisions for a 20 year horizon.

1Estimate: Do not start an oil company based on these estimates.

7

Why do we care?






7

Why do we care?






7

Why do we care?




Knowing optimal strategic & operational de-cisions for a 20 year horizon. Priceless2

1Estimate: Do not start an oil company based on these estimates.2Accurate: You may start a company based on this estimate.

7

Supermodel...

FPSO

Field

Reservoir

Well

Injection Lines

8

Dissected Supermodel...

FPSO

Field

Reservoir

Well

Injection Lines

9


FPSO

Field

Reservoir

Well

Injection Lines

9


FPSO

Field

Reservoir

Well

Injection Lines

9


FPSO

Field

Reservoir

Well

Injection Lines

9


FPSO

Field

Reservoir

Well


Production Model

PLF Model

9

Key Challenges while Modelling Oil Field Infrastructure Planning

FPSO

Field

Reservoir

Well


Production Model

PLF Model

10

Contributions

I PLF Model: Perfect MIP models for piecewise linear functions (PLFs) with indicatorvariables.1

I Production Model: Convex reformulation of the production planning process toeliminate the bilinear terms from the MIP model. 2

I Fiscal Model: MIP models, solution techniques, and algorithms for productionplanning problems in the presence of complex fiscal objectives. 3

1Sridhar, Linderoth, and Luedtke Operations Research Letters (2013)

11

Contributions




1Sridhar, Linderoth, and Luedtke Operations Research Letters (2013)2Sridhar, Linderoth, and Luedtke Journal of Global Optimization (2014) – under review

11

Contributions




1Sridhar, Linderoth, and Luedtke Operations Research Letters (2013)2Sridhar, Linderoth, and Luedtke Journal of Global Optimization (2014) – under review2Sridhar, Linderoth, Luedtke, and Wright – in preparation

11

Challenge I: MIP Formulations for PLFs with Indicator Variables.

FPSO

Field

Reservoir

Well


Production Model

PLF Model

12


FPSO

Field

Reservoir

Well


Production Model

PLF Model

12


FPSO

Field

Reservoir

Well

Injection Lines

12

Challenge I: Nonlinear Production Functions

I The production process creates a mixture of useful products P+ and byproducts P−.

I Production function f (·) is a concave function that determines the maximumproduction rate as a function of total production v(t).

I Product fraction functions gp(·) evolve monotonically as a function of the totalproduction v(t).

Total production (v(t))

Maximum production rate (f)


Product fraction (gp)

Useful product

By-product

13









Useful product

By-product

13









Useful product

By-product

13









Useful product

By-product

13


Contribution: Piecewise linear functions (PLFs) with indicator variables to tacklenonlinearity!1

B0 B1 B2 B3

Variable (x)

Funct

ion f

(x)

[B0 ,f(B0 )]

[B1 ,f(B1 )]

[B2 ,f(B2 )] [B3 ,f(B3 )]


13

Challenge I: MIP Formulations for PLFs with Indicator Variables

MIP formulations for PLFs that are evaluated when an indicator variable (z) is turned on.

z = 0︸︷︷︸Binary variable

⇒ x = 0︸︷︷︸Function argument

, f (x) = 0︸︷︷︸PLF

.

Application Reference

Gas network optimization Martin et al. (2006)

Transmissions expansion planning Alguacil et al. (2003)

Thermal unit commitment Carrion et al. (2006)

Oil field development Gupta et al. (2012)

Hydro Scheduling Borghetti et al. (2008)

Sales resource allocation Lodish et al. (1971)

14





, f (x) = 0︸︷︷︸PLF

.








14





, f (x) = 0︸︷︷︸PLF

.








14





, f (x) = 0︸︷︷︸PLF

.

SOS2 Model

B0 B1 B2 B3

Variable (x)

Funct

ion f

(x)

[B0 ,f(B0 )]

[B1 ,f(B1 )]

[B2 ,f(B2 )] [B3 ,f(B3 )]

Beale and Tomlin(1970)

Incremental Model

B0 B1 B2 B3 B4 B5

Variable (x)

Funct

ion f

(x)

δ0 δ1 δ2 δ3 δ4

Markowitz and Manne(1957)

Multiple Choice

B0 B1 B2 B3

Variable (x)

Funct

ion f

(x)

z0 =1 z1 =1 z2 =1

Balakrishnan and Graves(1989)

14





, f (x) = 0︸︷︷︸PLF

.

Theoretical Results1

I Our proposed formulations is locally ideal.

I Previously proposed formulations are not locally ideal. (counter example)


14

Challenge I: Numerical Experiments

Solve time

0 800 1600 2400 3200MIP Solve time (s)

0.0

0.2

0.4

0.6

0.8

1.0

Fract

ion o

f data

sets

P(S2 ) P(S1 )

Root LP gap

0 10 20 30 40LP relaxation gap (%)

0.0

0.2

0.4

0.6

0.8

1.0

Fract

ion o

f data

sets

P(S2 ) P(S1 )

Nodes explored

0 400 800 1200 1600Nodes

0.0

0.2

0.4

0.6

0.8

1.0

Fract

ion o

f data

sets

P(S2 ) P(S1 )

Average Performance Improvement1

I 40x faster solve times.

I 15x fewer nodes explored.

I 20% better root node gaps.


15

Challenge II: Bilinear Terms in Production Process

FPSO

Field

Reservoir

Well


Production Model

PLF Model

16


FPSO

Field

Reservoir

Well


Production Model

PLF Model

16


Cumulative production v(t) is calculatedusing production rate x(t)

v(t) =

∫ t

0

x(s)ds

Production rate is limited by a productionfunction f (·)

x(t) ≤ f (v(t))

Product production rates yp(t) calculatedby fraction functions gp(·)

yp(t) = x(t) gp(v(t))

Production profiles are active only after thestart time z(t)

z(t) = 0⇒ v(t) = 0





Useful product

By-product

17



v(t) =

∫ t

0

x(s)ds


x(t) ≤ f (v(t))




z(t) = 0⇒ v(t) = 0





Useful product

By-product

17



v(t) =

∫ t

0

x(s)ds


x(t) ≤ f (v(t))




z(t) = 0⇒ v(t) = 0





Useful product

By-product

17



v(t) =

∫ t

0

x(s)ds


x(t) ≤ f (v(t))




z(t) = 0⇒ v(t) = 0





Useful product

By-product

17



v(t) =

∫ t

0

x(s)ds

Mixture production rate is limited byproduction function f (·)

x(t) ≤ f (v(t))

Product production rates yp(t) calculatedby fraction functionsgp(·)

yp(t) = x(t) gp(v(t))︸︷︷︸non-convex bilinear terms

Production profiles are active only after thetime z(t)

z(t) = 0⇒ v(t) = 0





Useful product

By-product

17

Production Functions: Applications





Useful product

By-product


Oil & Natural gas Iyer et al. (1998) Tarhan et al. (2009)



Compressor Scheduling Camponogara et al. (2011)

18

Main Results1

I Reformulate based on cumulative product production!

I Up to 30% more accurate than Tarhan et al. (2009).

I Order of magnitude faster because it deals with convex functions while Tarhan et. al(2009) deals with bilinear terms.

Total production (vt)


Useful product

By-product


Cumulative product (hp)

Useful product

By-product

1Sridhar, Linderoth, and Luedtke Journal of Global Optimization (2014) – under review

19

Main Results1




0.2 0.4 0.6 0.8

Total Production (vt)

0.0

0.2

0.4

0.6

0.8

1.0

Useful product fraction (gp+ )

F1

F2

0.2 0.4 0.6 0.8


0.0

0.2

0.4

0.6

0.8

1.0

By-product fraction (gp− )

F1

F2


19

Main Results1






Useful product

By-product



Useful product

By-product


19

Solving the Production Model

Piecewise Linear Approximation (PLA)




Cumulative useful product (gp+ )


Cumulative by-product (hp− )

Approximate all the nonlinear production functions using PLFs.

20

Challenge III: Production Planning Problems with Complex Fiscal Terms.

FPSO

Field

Reservoir

Well


Production Model

PLF Model

21

Challenge III: non-convex Objective Function1

Production

Profit Oil Royalty Cost Oil

Contractor's Share Government's Share

Total contractor Share Total Government Share

Contractor's after-tax share Income Tax

Tranche Model

Tax Model

Capex Model

1Image source: World Bank Survey 22

Challenge III: non-convex Objective Function1

Production

Profit Oil Royalty Cost Oil

Contractor's Share Government's Share

Total contractor Share Total Government Share

Contractor's after-tax share Income Tax

Tranche Model

Tax Model

Capex Model

1Image source: World Bank Survey 22


FPSO

Field

Reservoir

Well


Production Model

PLF Model

Set (X )

23

Problem Setup

I Multi period planning problem X with T := {1, 2 . . .T} time periods.

I Continuous operational decision variables x ∈ Rm × {0, 1}n which produce a set ofcash flows for each time period ft .

I The cash flows are broken down into revenue streams rt ∀t ∈ T and expensestreams ct ∀t ∈ T .

ft = rt − ct ∀t ∈ T

Complex fiscal terms like can be viewed as optimizing a discontinuous, non-convexfunction G : RT × RT → R.

max G(r , c)︸︷︷︸Noncovex function

subject to (r , c, x) ∈ X︸︷︷︸MIP Production Model

24

Problem Setup








24

Problem Setup








24

Problem Setup








24

Production Sharing Contracts

Net Present Value (NPV)

Given a time series of cash flows (f[1,t]) and rate of interest (q)

h(q, f[1,t]) =t∑

s=1

fs(1 + q)s︸︷︷︸

Present value of money

Internal rate of return (IRR)

Rate of return for which the NPV function is zero.

h(qt , f[1,t]) = 0︸︷︷︸Solve in each time period

25

Production Sharing Contracts

Net Present Value (NPV)

Given a time series of cash flows (f[1,t]) and rate of interest (q)

h(q, f[1,t]) =t∑

s=1

fs(1 + q)s︸︷︷︸

Present value of money

Internal rate of return (IRR)

Rate of return for which the NPV function is zero.

h(qt , f[1,t]) = 0︸︷︷︸Solve in each time period

25

Illustration: Production Sharing Contracts

0.0 0.2 0.4 0.6 0.8

Discounting rate (q)

400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR

Production Sharing Contracts (PSC)I IRR scale into tranches

(K := {1, 2 . . .K})

I Contractors to retain µk fraction ofthe profit during time period t, ifassociated with tranche k.

Applications

I Oil & natural gas fieldinfrastructure planning.

I Portfolio optimization.

I Production planning.

26


0.0 0.2 0.4 0.6 0.8


400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR


(K := {1, 2 . . .K})I Contractors to retain µk fraction of

the profit during time period t, ifassociated with tranche k.

Applications




26


0.0 0.2 0.4 0.6 0.8


400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR


(K := {1, 2 . . .K})I Contractors to retain µk fraction of

the profit during time period t, ifassociated with tranche k.

Applications




26

Example PSC

Time(t)

Cost(ct )

Revenue(pt )

IRR(qt )

Tranche(k(t))

Contractorshare(µk(t))

Cash flow(µk(t)rt −ct )

1 4000 0 n.a 1 70% -4000

2 0 2300 -49.8% 1 70% 1610

3 0 6000 20% 1 70% 4200

4 0 5000 46.7% 2 60% 3000

5 0 4000 49.0% 3 15% 600

27

Example PSC

Time(t)

Cost(ct )

Revenue(pt )

IRR(qt )

Tranche(k(t))



1 4000 0 n.a 1 70% -4000

2 0 2300 -49.8% 1 70% 1610

3 0 6000 20% 1 70% 4200

4 0 5000 46.7% 2 60% 3000

5 0 4000 49.0% 3 15% 600

27

Example PSC

Time(t)

Cost(ct )

Revenue(pt )

IRR(qt )

Tranche(k(t))



1 4000 0 n.a 1 70% -4000

2 0 2300 -49.8% 1 70% 1610

3 0 6000 20% 1 70% 4200

4 0 5000 46.7% 2 60% 3000

5 0 4000 49.0% 3 15% 600

27

Example PSC

Time(t)

Cost(ct )

Revenue(pt )

IRR(qt )

Tranche(k(t))



1 4000 0 n.a 1 70% -4000

2 0 2300 -49.8% 1 70% 1610

3 0 6000 20% 1 70% 4200

4 0 5000 46.7% 2 60% 3000

5 0 4000 49.0% 3 15% 600

27

Markets (Ringfences)

FPSO 1 FPSO 2

Market 1 Market 2

Market 3

Reservoirs

28

Solution Restrictions

1. Given a series of cash flows f[1,t], the equation h(q, f[1,t]) = 0 alwayshas at most one root.1

2. The sequence of tranches associated with each time period isnon-decreasing.2

0.0 0.2 0.4 0.6 0.8


400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR

0.2 0.0 0.2 0.4 0.6

Discounting rate (q)400

200

0

200

400

600

800

Net

pre

sent

valu

e h

(·)

q1

q2

q3

q4

q5

1Sufficient conditions: Single sign change test & the Norstrøm condition (1972).

29

Solution Restrictions

1. Given a series of cash flows f[1,t], the equation h(q, f[1,t]) = 0 alwayshas at most one root.1

2. The sequence of tranches associated with each time period isnon-decreasing.2

0.0 0.2 0.4 0.6 0.8


400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR

0.2 0.0 0.2 0.4 0.6


200

0

200

400

600

800

Net

pre

sent

valu

e h

(·)

q1

q2

q3

q4

q5

1Sufficient conditions: Single sign change test & the Norstrøm condition (1972).2Sufficient conditions: Single sign change test.

29

Tranche Model: Key Idea

0.0 0.2 0.4 0.6 0.8


400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR

h(q, f[1,t])

NPV

=t∑

s=1

fa,s(1 + q)s

h( qt

IRR

, f[1,t]) = 0

qt ∈ [l , u]

IRR between [l,u]

if h(l , f[1,t]) ≥ 0 and h(u, f[1,t]) < 0

30


0.0 0.2 0.4 0.6 0.8


400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR

h(q, f[1,t])

NPV

=t∑

s=1

fa,s(1 + q)s

h( qt

IRR

, f[1,t]) = 0

qt ∈ [l , u]

IRR between [l,u]

if h(l , f[1,t]) ≥ 0 and h(u, f[1,t]) < 0

30


0.0 0.2 0.4 0.6 0.8


400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR

h(q, f[1,t])

NPV

=t∑

s=1

fa,s(1 + q)s

h( qt

IRR

, f[1,t]) = 0

qt ∈ [l , u]

IRR between [l,u]

if h(l , f[1,t]) ≥ 0 and h(u, f[1,t]) < 0

30


0.0 0.2 0.4 0.6 0.8


400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR

h(q, f[1,t])

NPV

=t∑

s=1

fa,s(1 + q)s

h( qt

IRR

, f[1,t]) = 0

qt ∈ [l , u]

IRR between [l,u]

if h(l , f[1,t]) ≥ 0 and h(u, f[1,t]) < 0

30

Tranche Model: MIP Formulation

0.0 0.2 0.4 0.6 0.8


400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR

ha,k,t

NPV at λk

=t∑

s=1

fa,s(1 + λk )s

ba,k,t =

{0 ha,k,t−1 ≤ 0

1 otherwise

fa,t

Cash flow retained by contractor

=

{µk ra,t − ca,t ba,k,t = 1, ba,k+1,t = 0

µ1ra,t − ca,t otherwise

(r , c, x) ∈ X Production model

Notation

Markets: a ∈ A, Tranches: k ∈ K, Timeperiods: t ∈ T

31


0.0 0.2 0.4 0.6 0.8


400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR

ha,k,t

NPV at λk

=t∑

s=1

fa,s(1 + λk )s

ba,k,t =

{0 ha,k,t−1 ≤ 0

1 otherwise

fa,t


=




Notation


31


0.0 0.2 0.4 0.6 0.8


400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR

ha,k,t

NPV at λk

=t∑

s=1

fa,s(1 + λk )s

ba,k,t =

{0 ha,k,t−1 ≤ 0

1 otherwise

fa,t


=




Notation


31


0.0 0.2 0.4 0.6 0.8


400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR

ha,k,t

NPV at λk

=t∑

s=1

fa,s(1 + λk )s

ba,k,t =

{0 ha,k,t−1 ≤ 0

1 otherwise

fa,t


=




Notation


31


0.0 0.2 0.4 0.6 0.8


400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR

ha,k,t

NPV at λk

=t∑

s=1

fa,s(1 + λk )s

ba,k,t =

{0 ha,k,t−1 ≤ 0

1 otherwise

fa,t


=




Notation


31


0.0 0.2 0.4 0.6 0.8


400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR

ha,k,t

NPV at λk

=t∑

s=1

fa,s(1 + λk )s

mha,k,t−1︸︷︷︸

Min NPV

(1− ba,k,t) ≤ ha,k,t−1 ≤ Mha,k,t−1︸︷︷︸

Max NPV

ba,k,t

fa,t


=




Notation


31


0.0 0.2 0.4 0.6 0.8


400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR

ha,k,t

NPV at λk

=t∑

s=1

fa,s(1 + λk )s

mha,k,t−1︸︷︷︸

Min NPV

(1− ba,k,t) ≤ ha,k,t−1 ≤ Mha,k,t−1︸︷︷︸

Max NPV

ba,k,t

fa,t


=




Notation


31


0.0 0.2 0.4 0.6 0.8


400

200

0

200

400

Net

pre

sent

valu

e h

(·)

IRR

ha,k,t =t∑

s=1

fa,s(1 + λk )s

mha,k,t−1︸︷︷︸

Min NPV

(1− ba,k,t) ≤ ha,k,t−1 ≤ Mha,k,t−1︸︷︷︸

Max NPV

ba,k,t

pa,k,t ≤ Mpa,k,t︸︷︷︸

Max Profit

(ba,k,t − ba,k+1,t)

ra,t =∑k∈K

µkpa,k,t

ba,k,t ≥ ba,k+1,t

fa,t = ra,t − ca,t

(r , c, x) ∈ X

ba,k,t ≥ ba,k,t+1

Notation

Markets:a ∈ A, Tranches: k ∈ K, Timeperiods: t ∈ T

32


0.2 0.0 0.2 0.4 0.6


200

0

200

400

600

800

Net

pre

sent

valu

e h

(·)

q1

q2

q3

q4

q5

ha,k,t =t∑

s=1

fa,s(1 + λk )s

mha,k,t−1︸︷︷︸

Min NPV

(1− ba,k,t) ≤ ha,k,t−1 ≤ Mha,k,t−1︸︷︷︸

Max NPV

ba,k,t

pa,k,t ≤ Mpa,k,t︸︷︷︸

Max Profit

(ba,k,t − ba,k+1,t)

ra,t =∑k∈K

µkpa,k,t

ba,k,t ≥ ba,k+1,t

fa,t = ra,t − ca,t

(r , c, x) ∈ X

ba,k,t ≥ ba,k,t+1

Notation

Markets:a ∈ A, Tranches: k ∈ K, Timeperiods: t ∈ T

32

Markets are Everything!

FPSO 1 FPSO 2

Market 1 Market 2

Market 3

Reservoirs

33

Markets are hard!1

Markets # Variables (binary) # Constraints Solve time (s) Gap (%) Nodes

1 15132 (1457) 22101 680.39 - 1323

2 15263 (1514) 22401 1117.4 - 2934

3 15389 (1568) 22701 3670.3 - 12136

4 15515 (1622) 23001 - 8.24 28208

5 15646 (1679) 23301 - 34.4 22381

6 15762 (1727) 23601 - 66.2 10515

1Solution statistics for for a single instance of the MIP for a sample application problem solved using Gurobi5.0.1 with 2 threads for 7200 seconds.

34

Markets are hard!1


1 15132 (1457) 22101 680.39 - 1323

2 15263 (1514) 22401 1117.4 - 2934

3 15389 (1568) 22701 3670.3 - 12136

4 15515 (1622) 23001 - 8.24 28208

5 15646 (1679) 23301 - 34.4 22381

6 15762 (1727) 23601 - 66.2 10515


34

Markets are hard!1


1 15132 (1457) 22101 680.39 - 1323

2 15263 (1514) 22401 1117.4 - 2934

3 15389 (1568) 22701 3670.3 - 12136

4 15515 (1622) 23001 - 8.24 28208

5 15646 (1679) 23301 - 34.4 22381

6 15762 (1727) 23601 - 66.2 10515


34

Contributions1

Complex fiscal terms like can be viewed as optimizing a discontinuous,non-convex function G : RT × RT → R.



1. Search algorithms for finding high-quality feasible solutions.

2. Decomposition approaches for finding solution bounds.

1Sridhar, Linderoth, Luedtke, and Wright – in preparation

35

Contributions1







35

Contributions1







35

Finding Feasible Solutions

FPSO 1 FPSO 2

Market 1 Market 2

Market 3

Reservoirs

36

Notation

Definition: Tranche configuration κ := {tk} where tk is the last time periodwhere the system moves from tranche k − 1 to k.

Time(t)

Cost(ct )

Revenue(pt )

IRR qt Tranche(k(t))


Cashflow(ft )

1 4000 0 n.a 1 70% -4000

2 0 2300 -43% 1 70% 1610

3 0 3000 20% 1 70% 2110

4 0 2800 44% 2 60% 1680

5 0 1000 48% 3 15% 150

Example

The tranche configuration is [0 3 4] (Note: Separate for each market).

37

Fixed Tranche Problem

Function arguments: Tranche configurations

The function uses tranche configurations of the form [0,3,4] as input.

Function evaluation: Operational planning problem

Solve a fixed-tranche version of the operational problem by forcing signs on theNPV at each time period.

First to second tranche

6

8

10

12

14

16

18

Second to

third

tranch

e

6

8

10

12

14

16

18

Obje

ctiv

e f

unct

ion

0

5

10

15

20

25

30

35

40

45

38







6

8

10

12

14

16

18

Second to

third

tranch

e

6

8

10

12

14

16

18

Obje

ctiv

e f

unct

ion

0

5

10

15

20

25

30

35

40

45

38







6

8

10

12

14

16

18

Second to

third

tranch

e

6

8

10

12

14

16

18

Obje

ctiv

e f

unct

ion

0

5

10

15

20

25

30

35

40

45

38

Discrete Domain Search

Pattern search for finding tranche configurations over a discrete space?


6

8

10

12

14

16

18

Second to

third

tranch

e

6

8

10

12

14

16

18

Obje

ctiv

e f

unct

ion

0

5

10

15

20

25

30

35

40

45

Search Directions

I Coarse.

I Computationally Expensive.

39

Discrete Domain Search

Pattern search for finding tranche configurations over a discrete space?


6

8

10

12

14

16

18

Second to

third

tranch

e

6

8

10

12

14

16

18

Obje

ctiv

e f

unct

ion

0

5

10

15

20

25

30

35

40

45

Search Directions

I Coarse.

I Computationally Expensive.

39

Continuous Tranche Space

Question: What does it mean for the system to to move from tranche 1 to 2 at time 2.9.

That would define a tranche configuration [0, 2.9, 4]!


6

8

10

12

14

16

18

Second to

third

tranch

e

6

8

10

12

14

16

18

Obje

ctiv

e f

unct

ion

0

5

10

15

20

25

30

35

40

45

100 200 300 400 500 600 700 800 900


100

200

300

400

500

600

700

800

900

Seco

nd t

o t

hir

d t

ranch

e

0

5

10

15

20

25

30

35

40

I Accurate search directions using finite difference.

I Search directions are computationally cheaper!

I Convert the fractional tranche configuration to feasible solution.

40





6

8

10

12

14

16

18

Second to

third

tranch

e

6

8

10

12

14

16

18

Obje

ctiv

e f

unct

ion

0

5

10

15

20

25

30

35

40

45

100 200 300 400 500 600 700 800 900


100

200

300

400

500

600

700

800

900

Seco

nd t

o t

hir

d t

ranch

e

0

5

10

15

20

25

30

35

40




40





6

8

10

12

14

16

18

Second to

third

tranch

e

6

8

10

12

14

16

18

Obje

ctiv

e f

unct

ion

0

5

10

15

20

25

30

35

40

45

100 200 300 400 500 600 700 800 900


100

200

300

400

500

600

700

800

900

Seco

nd t

o t

hir

d t

ranch

e

0

5

10

15

20

25

30

35

40




40





6

8

10

12

14

16

18

Second to

third

tranch

e

6

8

10

12

14

16

18

Obje

ctiv

e f

unct

ion

0

5

10

15

20

25

30

35

40

45

100 200 300 400 500 600 700 800 900


100

200

300

400

500

600

700

800

900

Seco

nd t

o t

hir

d t

ranch

e

0

5

10

15

20

25

30

35

40




40





6

8

10

12

14

16

18

Second to

third

tranch

e

6

8

10

12

14

16

18

Obje

ctiv

e f

unct

ion

0

5

10

15

20

25

30

35

40

45

100 200 300 400 500 600 700 800 900


100

200

300

400

500

600

700

800

900

Seco

nd t

o t

hir

d t

ranch

e

0

5

10

15

20

25

30

35

40




40

Contractor Share

Question: What does it mean for the system to to move to tranche 2 at time 2.9.


Tranche IRR Range (%) Contractor Share

k [λk , λk+1) µk

1 [-∞, 10) 70%

2 [10, 25) 50%

3 [25, ∞) 30%

100 200 300 400 500 600 700 800 900


100

200

300

400

500

600

700

800

900

Seco

nd t

o t

hir

d t

ranch

e

0

5

10

15

20

25

30

35

40

Contractor share rate for time period 3

0.9× 70%

Tranche 1

+ 0.1× 50%

Tranche 2

≈ 68%.

41

Fractional NPV

Continuous NPV2

h(f[1,t+1], λ, t

Integer

+ δ

Fractional

) =t∑

s=1

fs(1 + λ)s

Discrete NPV

+ (δft+1)e−λc (δ+t)

Continuous discounting

where δ ∈ [0, 1], λc = log(1 + λ)︸︷︷︸Continuous discounting rate

.

.

1The continuous NPV function ˆh(·) is consistent with the NPV function h(·) when δ ∈ {0, 1}42

Search Algorithm


6

8

10

12

14

16

18

Second to

third

tranch

e

6

8

10

12

14

16

18

Obje

ctiv

e f

unct

ion

0

5

10

15

20

25

30

35

40

45

100 200 300 400 500 600 700 800 900


100

200

300

400

500

600

700

800

900

Seco

nd t

o t

hir

d t

ranch

e

0

5

10

15

20

25

30

35

40

43

Search Algorithm

FPSO 1 FPSO 2

Market 1 Market 2

Market 3

Reservoirs

100 200 300 400 500 600 700 800 900


100

200

300

400

500

600

700

800

900

Seco

nd t

o t

hir

d t

ranch

e

0

5

10

15

20

25

30

35

40

43

Solution Bounds: Market Based Decomposition

FPSO 1 FPSO 2

Market 1 Market 2

Market 3

Reservoirs

44

Market Based Decomposition

FPSO 1 FPSO 2

Market 1 Market 2

Market 3

Reservoirs

Observations

I A single market problem cansolve in < 10 minutes.

Decompose by markets?

I Lagrangian decomposition.

I ADMM (Progressive Hedging).1

1Boyd et al. Foundations and Trends in Machine Learning (2009).

45


FPSO 1 FPSO 2

Market 1 Market 2

Market 3

Reservoirs

max∑a∈A

∑t∈T

πt fa,t subject to

(ba, ca, fa, ha, pa, ra) ∈ Ya︸︷︷︸Fiscal model

∀a ∈ A

(ca, ra, xa,w) ∈ Xa︸︷︷︸Production model

∀a ∈ A

w ∈W︸︷︷︸FPSO Constraints∑

a∈A

Caxa + Dw ≤ d︸︷︷︸Aggregate constraints

Notation


46


FPSO 1 FPSO 2

Market 1 Market 2

Market 3

Reservoirs

max∑a∈A

∑t∈T

πt fa,t subject to


∀a ∈ A


∀a ∈ A


a∈A


Notation


46


FPSO 1 FPSO 2

Market 1 Market 2

Market 3

Reservoirs

max∑a∈A

∑t∈T

πt fa,t subject to


∀a ∈ A


∀a ∈ A


a∈A


Notation


46


FPSO 1 FPSO 2

Market 1 Market 2

Market 3

Reservoirs

max∑a∈A

∑t∈T

πt fa,t subject to


∀a ∈ A


∀a ∈ A

w ∈W︸︷︷︸FPSO Constraints

∑a∈A


Notation


46


FPSO 1 FPSO 2

Market 1 Market 2

Market 3

Reservoirs

max∑a∈A

∑t∈T

πt fa,t subject to


∀a ∈ A


∀a ∈ A


a∈A


Notation


46


Original Problem

max∑a∈A

∑t∈T

πt fa,t subject to

(ba, ca, fa, ha, pa, ra) ∈ Ya ∀a ∈ A(ca, ra, xa,w) ∈ Xa︸︷︷︸

Relax

∀a ∈ A

w︸︷︷︸Make copies

∈ W︸︷︷︸Relax∑

a∈A

Caxa + Dw ≤ d

Relaxation

max∑a∈A

∑t∈T

πt fa,t s.t

(ba, ca, fa, ha, pa, ra) ∈ Ya ∀a ∈ A(ca, ra, xa,wa) ∈ LP(Xa) ∀a ∈ A

wa ∈ LP(W ) ∀a ∈ A1

A

∑a′∈A

wa′ = wa ∀a ∈ A

∑a∈A

Caxa + Dwa ≤ d ∀a ∈ A

47


Original Problem

max∑a∈A

∑t∈T

πt fa,t subject to

(ba, ca, fa, ha, pa, ra) ∈ Ya ∀a ∈ A(ca, ra, xa,w) ∈ Xa︸︷︷︸

Relax

∀a ∈ A

w︸︷︷︸Make copies

∈ W︸︷︷︸Relax∑

a∈A

Caxa + Dw ≤ d

Relaxation

max∑a∈A

∑t∈T

πt fa,t s.t

(ba, ca, fa, ha, pa, ra) ∈ Ya ∀a ∈ A(ca, ra, xa,wa) ∈ LP(Xa) ∀a ∈ A

wa ∈ LP(W ) ∀a ∈ A1

A

∑a′∈A

wa′ = wa ∀a ∈ A

∑a∈A

Caxa + Dwa ≤ d ∀a ∈ A

47

Lagrangian Decomposition

Define the Lagrangian as

La(xa, ya; Θ) :=∑t∈T

πt fa,t − ∆Ta wa︸︷︷︸

Relax copy equality

− θT( d

A− Caxa −

Dwa

A

)︸︷︷︸

Aggregate constraints relaxed

.

where for each market a ∈ A, we solve

LDa (Θ) := max

xa,wa ya

{La(xa, ya; Θ) s.t. xa ∈ LP(Xa),wa ∈ LP(W ), ya ∈ Ya

}︸︷︷︸

Single market MIP

.

Lagrangian Dual problem:

G∗D := minΘ

{∑a∈A

LDa (Θ) : θ ≥ 0

}Solve for θ

48





Relax copy equality

− θT( d

A− Caxa −

Dwa

A

)︸︷︷︸


.


LDa (Θ) := max

xa,wa ya


}︸︷︷︸

Single market MIP

.


G∗D := minΘ

{∑a∈A

LDa (Θ) : θ ≥ 0

}Solve for θ

48





Relax copy equality

− θT( d

A− Caxa −

Dwa

A

)︸︷︷︸


.


LDa (Θ) := max

xa,wa ya


}︸︷︷︸

Single market MIP

.


G∗D := minΘ

{∑a∈A

LDa (Θ) : θ ≥ 0

}Solve for θ

48

Augmented/Regularized Decomposition

Primal update

(xa, ya)i+1 ← argmaxxa,wa ,ya

{La(xa, ya; Θ)− ρi

2‖wa − w i‖2︸︷︷︸

Single market MIQP

:

xa ∈ LP(Xa),w ∈ LP(W ), ya ∈ Ya

}

Dual update

∆i+1a ← ∆i

a + ρi(w i+1

a − w i+1)

a ∈ A

θi+1i ← θi

i + ηid

(d −

∑a∈A

Caxi+1a − D

∑a∈A

w i+1a

A

)+

Step size

δi+1s ← δ0

s /(i + 1)

ρe+1 ← max{ρdρe , ρm}︸︷︷︸

Geometrically decreasing

where ρd < 1, ρm > 0

49

Augmented/Regularized Decomposition

Primal update

(xa, ya)i+1 ← argmaxxa,wa ,ya

{La(xa, ya; Θ)− ρi

2‖wa − w i‖2︸︷︷︸

Single market MIQP

:

xa ∈ LP(Xa),w ∈ LP(W ), ya ∈ Ya

}Dual update

∆i+1a ← ∆i

a + ρi(w i+1

a − w i+1)

a ∈ A

θi+1i ← θi

i + ηid

(d −

∑a∈A

Caxi+1a − D

∑a∈A

w i+1a

A

)+

Step size

δi+1s ← δ0

s /(i + 1)

ρe+1 ← max{ρdρe , ρm}︸︷︷︸

Geometrically decreasing

where ρd < 1, ρm > 0

49

Experiments

Goals

I Compare the quality of the solutions obtained by the MIP with that obtained by thecontinuous domain formulation.

I Compare the quality of the solution bounds obtained by the MIP with thedecomposition algorithms.

100 200 300 400 500 600 700 800 900


100

200

300

400

500

600

700

800

900

Seco

nd t

o t

hir

d t

ranch

e

0

5

10

15

20

25

30

35

40

FPSO 1 FPSO 2

Market 1 Market 2

Market 3

Reservoirs

50

Feasible Solution Quality (5min)

Average gap to the best feasible solution (%)

Markets TRMIP1 HEU+M2

t=300 t =1800 t =7200 t=300 t =1800 t =7200

2 3.89 0.01 0.00 0.77 0.17 0.15

3 2.76 0.37 0.00 0.70 0.22 0.18

4 5.34 1.02 0.23 0.67 0.27 0.17

5 8.47 1.39 0.30 0.62 0.33 0.10

6 8.23 1.57 0.79 0.68 0.41 0.25

Average 5.74 0.87 0.26 0.69 0.28 0.17

1TRMIP: Tranche MIP formulation.2HEU+M: Search algorithm (multi-start).

51

Feasible Solution Quality (30min)



t=300 t =1800 t =7200 t=300 t =1800 t =7200

2 3.89 0.01 0.00 0.77 0.17 0.15

3 2.76 0.37 0.00 0.70 0.22 0.18

4 5.34 1.02 0.23 0.67 0.27 0.17

5 8.47 1.39 0.30 0.62 0.33 0.10

6 8.23 1.57 0.79 0.68 0.41 0.25

Average 5.74 0.87 0.26 0.69 0.28 0.17


51

Feasible Solution Quality (2 hour)



t=300 t =1800 t =7200 t=300 t =1800 t =7200

2 3.89 0.01 0.00 0.77 0.17 0.15

3 2.76 0.37 0.00 0.70 0.22 0.18

4 5.34 1.02 0.23 0.67 0.27 0.17

5 8.47 1.39 0.30 0.62 0.33 0.10

6 8.23 1.57 0.79 0.68 0.41 0.25

Average 5.74 0.87 0.26 0.69 0.28 0.17


51

Solution Bound Quality (30min)


Markets TRMIP1 DD+LP2 D+LP3

t=1800 t =7200 t =1800 t=7200 t =1800 t =7200

2 0.23 0.01 2.41 2.13 2.87 2.17

3 14.93 0.03 3.24 2.57 3.34 2.59

4 43.25 11.65 4.49 2.99 3.81 2.92

5 66.43 36.77 6.04 3.54 3.93 3.05

6 81.20 51.86 8.80 4.74 4.60 3.53

Average 41.21 20.06 4.99 3.19 3.71 2.85

1TRMIP: Tranche MIP formulation.2DD+LP: Lagrangian Decomposition.3D+LP: Regularized Lagrangian Decomposition.

52

Solution Bound Quality (2 hour)


Markets TRMIP1 DD+LP2 D+LP3

t=1800 t =7200 t =1800 t=7200 t =1800 t =7200

2 0.23 0.01 2.41 2.13 2.87 2.17

3 14.93 0.03 3.24 2.57 3.34 2.59

4 43.25 11.65 4.49 2.99 3.81 2.92

5 66.43 36.77 6.04 3.54 3.93 3.05

6 81.20 51.86 8.80 4.74 4.60 3.53

Average 41.21 20.06 4.99 3.19 3.71 2.85

1TRMIP: Tranche MIP formulation.2DD+LP: Lagrangian Decomposition.3D+LP: Regularized Lagrangian Decomposition.

52

Part 1 Conclusion

FPSO

Field

Reservoir

Well


Production Model

PLF Model

53

Part 1 Conclusion

100 200 300 400 500 600 700 800 900


100

200

300

400

500

600

700

800

900

Seco

nd t

o t

hir

d t

ranch

e

0

5

10

15

20

25

30

35

40

FPSO 1 FPSO 2

Market 1 Market 2

Market 3

Reservoirs

(Fiscal Model, Search Algorithm)

53

Part 1 Conclusion

FPSO 1 FPSO 2

Market 1 Market 2

Market 3

Reservoirs

(Fiscal Model, Decomposition)

53

Part 2: Approximation algorithms for large combinatorial problems

vertex cover

multiway cut

independent set

54

One slide summary

Background

I Motivation: Sometimes approximate is good enough.

I Approximation algorithms via LP rounding.

Contributions1

I Rounding approximate LP solutions.

I Building a parallel solver to approximately solve large LPs.

I Theoretical results bounding runtime and solution quality for approximatingcombinatorial problems.

1Sridhar, Bittorf, Liu, Zhang, Re, and Wright NIPS (2013)

55

One slide summary

Background



Contributions1





55

One slide summary

Background



Contributions1





55

One slide summary

Background



Contributions1





55

One slide summary

Background



Contributions1





55

One slide summary

Background



Contributions1





55

Motivating Applications

Problem Applications

Set Covering Classification (Bien 2009), Multi-objecttracking (Wu 2012)

Set Packing MAP-inference (Sanghavi 2007), Naturallanguage (Kschischang 2001)

Multiway-cut Computer vision (Yuri 2004), Entity resolu-tion (Lee 2011)

Graphical Models Semantic role labeling (Roth 2005), Cluster-ing (Van Gael 2007)

56

Vertex Cover

Problem StatementGiven a graph G = (V ,E), find a subset of vertices V ⊂ V that covers all edges.

min∑v∈V

cvxv

subject to xu + xv ≥ 1︸︷︷︸Chose one vertex

∀(u, v) ∈ E

Edges

xv ∈ {0, 1} ∀v ∈ V

Vertices

vertex cover

57

Rounding 101

Integer program

min∑v∈V

cvxv

xu + xv ≥ 1 ∀(u, v) ∈ E

xv ∈ {0, 1} ∀v ∈ V

LP relaxation

min∑v∈V

cvxv

xu + xv ≥ 1 ∀(u, v) ∈ E

xv ∈ [0, 1] ∀v ∈ V

Algorithm 1: 2 Approx

1. Compute an optimal solution xLP of the LP relaxation.

2. Round xLP to an integral feasible solution xIP .

3. Clearly, c ′xIP ≤ 2c ′xLP︸︷︷︸2 approx

.

58

Rounding 101

Integer program

min∑v∈V

cvxv

xu + xv ≥ 1 ∀(u, v) ∈ E

xv ∈ {0, 1} ∀v ∈ V

LP relaxation

min∑v∈V

cvxv

xu + xv ≥ 1 ∀(u, v) ∈ E

xv ∈ [0, 1] ∀v ∈ V


1. Compute an optimal solution xLP of the LP relaxation.



.

58

Rounding 101

Formulate MIP

Relax MIP to an LP

Solve the LP

Round LP solution

59

Well Known LP Rounding Schemes 1

Problem family Approx factor Applications

Set Covering f (Hochbaum 1982) Classification (Bien 2009), Multi-object tracking (Wu 2012)

Set Packing es + o(s) (Bansal 2012). MAP-inference (Sanghavi 2007),Natural language (Kschischang2001)

Multiway-cut 3/2− 1/k (Rabini 1998). Computer vision (Yuri 2004), Entityresolution (Lee 2011).

Graphical Models Heuristic Semantic role labeling (Roth 2005),Clustering (Van Gael 2007)

1The parameter f refers to the frequency of the most frequent element; s refers to s-column sparsematrices; and k refers to the number of terminals. e refers to the Euler’s constant.

60

Definitions

α-factor approx

An α-factor approximation provides a solution that is at most α times the cost ofthe true optimal solution.

Integer program

min∑v∈V

cvxv

xu + xv ≥ 1 ∀(u, v) ∈ E

xv ∈ {0, 1} ∀v ∈ V

LP relaxation

min∑v∈V

cvxv

xu + xv ≥ 1 ∀(u, v) ∈ E

xv ∈ [0, 1] ∀v ∈ V

Integrality gap

The worst case ratio between the LP optimum and the integer programmingsolution is the integrally gap.

61

Definitions

Feasible IP Solution

Optimal LP Solution

α factor

Optimal IP Solution

Integrality gap

62

Do we need the optimal LP solution?

Integer program

min∑v∈V

cvxv

xu + xv ≥ 1 ∀(u, v) ∈ E

xv ∈ {0, 1} ∀v ∈ V

LP relaxation

min∑v∈V

cvxv

xu + xv ≥ 1 ∀(u, v) ∈ E

xv ∈ [0, 1] ∀v ∈ V


1. Compute an optimal a feasible solution xLP of the LP relaxation.



.

63

Main Result

Formulate MIP

Relax MIP to an LP

Solve the LP approximately

Round approximate LP solution

64

Contribution: Oblivious Rounding

Feasible IP Solutions

Optimal LP Solution

α factor

Feasible LP Solution

ɣ factor oblivious

Optimal IP Solution

65

Oblivious LP Rounding Schemes

For a minimization problem Φ with an IP formulation P whose LP relaxation is denotedby LP(P), a γ-factor ‘oblivious’ rounding scheme converts any feasible point xf ∈ LP(P)to an integral solution xI ∈ P with cost at most γ times the cost of LP(P) at xf .

vertex cover

multiway cut

independent set

Oblivious Rounding + Feasible LP Solution = Feasible Integral66

Handling Infeasible LP Solutions

(ε, δ) approximate LP solutions

A point x is an (ε, δ) approximate solution of the LP

min cT x s.t Ax = b, x ≥ 0

if x ≥ 0 and ∃ε > 0 and δ > 0 such that

||Ax − b||∞ ≤ ε

infeasibility

|cT x − cT x∗| ≤ δcT x∗

sub-optimality

.

Key Idea

Convert an (ε, δ)︸︷︷︸Infeasible

LP solution to a (0, δ)︸︷︷︸Feasible

LP solution.

67

Handling Infeasible LP Solutions

(ε, δ) approximate LP solutions

A point x is an (ε, δ) approximate solution of the LP

min cT x s.t Ax = b, x ≥ 0

if x ≥ 0 and ∃ε > 0 and δ > 0 such that

||Ax − b||∞ ≤ ε

infeasibility

|cT x − cT x∗| ≤ δcT x∗

sub-optimality

.

Key Idea

Convert an (ε, δ)︸︷︷︸Infeasible

LP solution to a (0, δ)︸︷︷︸Feasible

LP solution.

67

Main Results I: Approximate LP Rounding

Let x be an (ε, δ) approximate solution for the LP relaxation of vertex cover withε ∈ [0, 1). Then, x = Π[0,1]n ((1− ε)−1x) is a (0, δ(1− ε)−1)-approximate solutionfor vertex cover..

α-‐factor approxExact LP solution

Exact LP rounding

-‐approx LP solution

Approx LP rounding (✏, �)

factor approx↵(1 + �(1 � ✏)�1)

Rounding approximate LP solutions are good enough!

Extends to covering, packing and multiway-cuts!

68

Main Results II: Computing Approximate LP Solutions

min cT x s.t

Ax = b, x � 0LP

x(�) := arg minx�0

cT x � uT (Ax � b) +�

2kAx � bk2 +

1

2�kx � xk2

Quadratic penalty formulation

69


x(β) := arg minx≥0

fβ(x) := cT x − uT (Ax − b)

Dual multiplier

+β

2‖Ax − b‖2

Quadratic penalty

+1

2β‖x − x‖2

Regularizer

1Liu et al. Arxiv (2013)2Depends on the conditioning of the underlying LP.

70




Dual multiplier

+β

2‖Ax − b‖2

Quadratic penalty

+1

2β‖x − x‖2

Regularizer

ASCD1 is ideal for large solving QPs!

3 2 1 0 1 2 33

2

1

0

1

2

2.000

6.000

10.000

20.000

30.000

30.000

40.000

40.000

40.000

40.000

Coordinate Descent steps


70




Dual multiplier

+β

2‖Ax − b‖2

Quadratic penalty

+1

2β‖x − x‖2

Regularizer

I For a large enough β, the unique solution x(β) of the QP approximation is an(ε, δ) approximate LP solution2.

I ASCD1 helps provide a O(m3n2ε−2) complexity estimate for the vertex coverproblem for a graph with m edges and n vertices.


70




Dual multiplier

+β

2‖Ax − b‖2

Quadratic penalty

+1

2β‖x − x‖2

Regularizer

I For a large enough β, the unique solution x(β) of the QP approximation is an(ε, δ) approximate LP solution2.

I ASCD1 helps provide a O(m3n2ε−2) complexity estimate for the vertex coverproblem for a graph with m edges and n vertices.


70

Results: Comparisons with Cplex-LP and Cplex-IP

Instance # Vars # Nonzeros Speedup 1 Quality 2

frb59-26-1 0.12M 0.37M 2.8 1.04

Amazon 0.39M 1.17M 8.4 1.23

DBLP 0.37M 1.13M 8.3 1.25

Google+ 0.71M 2.14M 9.0 1.21

1S: (time taken by Cplex-LP)/(time taken by our solver)2Q: (our objective)/(Cplex-IP objective)

71

Results: Comparisons with Cplex-LP and Cplex-IP

Instance # Vars # Nonzeros Speedup 1 Quality 2

frb59-26-1 0.12M 0.37M 2.8 1.04

Amazon 0.39M 1.17M 8.4 1.23

DBLP 0.37M 1.13M 8.3 1.25

Google+ 0.71M 2.14M 9.0 1.21

I Rounding approximate LP solutions produced feasible integral solutions ofcomparable quality with rounding exact LP solutions.

I On larger problems like multiway-cut, we found solutions within 2 minuteswhile Cplex-LP timed out!.

I On statistical inference problems, we obtain identical application level qualitysolutions in comparison with Cplex-LP and Cplex-IP.

1S: (time taken by Cplex-LP)/(time taken by our solver)2Q: (our objective)/(Cplex-IP objective)

71

Part 2 Conclusion

Formulate MIP

Relax MIP to an LP

Solve the LP approximately

Round approximate LP solution

72

Part 2 Conclusion

Feasible IP Solutions

Optimal LP Solution

α factor

Feasible LP Solution

ɣ factor oblivious

Optimal IP Solution

72

Part 2 Conclusion

α-‐factor approxExact LP solution

Exact LP rounding

-‐approx LP solution

Approx LP rounding (✏, �)

factor approx↵(1 + �(1 � ✏)�1)

Rounding approximate LP solutions are good enough!

72

Part 2 Conclusion

min cT x s.t

Ax = b, x � 0LP

x(�) := arg minx�0

cT x � uT (Ax � b) +�

2kAx � bk2 +

1

2�kx � xk2

Quadratic penalty formulation

3 2 1 0 1 2 33

2

1

0

1

2

2.000

6.000

10.000

20.000

30.000

30.000

40.000

40.000

40.000

40.000

Coordinate Descent steps

72

Thats’ all folks!

73

Backup Slides

1

Results: Vertex Cover

VC Cplex-IP Cplex-LP Thetis

(min) t (secs) BFS Gap(%) t (secs) LP RSol t (secs) LP RSol

frb59-26-1 - 1475 0.7 2.48 767.0 1534 0.88 959.7 1532

frb59-26-2 - 1475 0.6 3.93 767.0 1534 0.86 979.7 1532

frb59-26-3 - 1475 0.5 4.42 767.0 1534 0.89 982.9 1533

Amazon 85.5 1.60×105 - 24.8 1.50×105 2.04×105 2.97 1.50×105 1.97×105

DBLP 22.1 1.65×105 - 22.3 1.42×105 2.08×105 2.70 1.42×105 2.06×105

Google+ - 1.06×105 0.01 40.1 1.00×105 1.31×105 4.47 1.00×105 1.27×105

2

Results: Vertex Cover

Cplex-LP

Instance ε = 1× 10−1 ε = 1× 10−3 ε = 1× 10−5

t(s) LP RSol t(s) LP RSol t(s) LP RSol

frb59-26-1 2.48 767.0 1534 4.70 767.0 1534 4.59 767.0 1534

frb59-26-2 3.93 767.0 1534 4.61 767.0 1534 4.67 767.0 1534

frb59-26-3 4.42 767.0 1534 4.62 767.0 1534 4.76 767.0 1534

Amazon 24.8 1.50×105 2.04×105 21.0 1.50×105 1.99×105 46.7 1.50×105 1.99×105

DBLP 22.3 1.42×105 2.08×105 22.8 1.42×105 2.07×105 31.1 1.42×105 2.06×105

Google+ 40.1 1.00×105 1.31×105 61.1 1.00×105 1.29×105 60.0 1.00×105 1.30×105

Thetis

Instance ε = 1× 10−1 ε = 1× 10−3 ε = 1× 10−5

t(s) LP RSol t(s) LP RSol t(s) LP RSol

frb59-26-1 0.88 959.7 1532 13.7 767.0 1534 13.3 767.0 1534

frb59-26-2 0.86 979.7 1532 14.2 767.0 1534 14.1 767.0 1534

frb59-26-3 0.89 982.9 1533 12.9 767.0 1534 12.9 767.0 1534

Amazon 2.97 1.50×105 1.97×105 59.5 1.50×105 1.99×105 50.3 1.50×105 1.99×105

DBLP 2.70 1.42×105 2.06×105 39.2 1.42×105 2.07×105 59.1 1.42×105 2.07×105

Google+ 4.47 1.00×105 1.27×105 1420.1 1.00×105 1.29×105 2818.2 1.00×105 1.30×105

3

Results: Inference

Task Formulation PV NNZ Method P R F1 Rank

Cplex-IP .87 .91 .89 10/13

CoNLL Skip-chain CRF 25M 51M Thetis .87 .90 .89 10/13

Gibbs Sampling .86 .90 .88 10/13

Cplex-IP .80 .80 .80 6/17

TAC-KBP Factor graph 62K 115K Thetis .79 .79 .79 6/17

Gibbs Sampling .80 .80 .80 6/17

4

Results: Combinatorial Problems

VC Cplex-IP Cplex-LP (default) Thetis


frb59-26-1 - 1475 0.7 4.59 767.0 1534 0.88 959.7 1532

Amazon 85.5 1.60×105 - 21.6 1.50×105 1.99×105 2.97 1.50×105 1.97×105

DBLP 22.1 1.65×105 - 23.7 1.42×105 2.07×105 2.70 1.42×105 2.06×105

Google+ - 1.06×105 0.01 60.0 1.00×105 1.30×105 4.47 1.00×105 1.27×105

MC Cplex-IP Cplex-LP (default) Thetis (ε = 0.1)


frb59-26-1 547.4 346 - 397.0 346 346 5.86 352.3 349

Amazon - 12 NA - - - 55.8 7.28 5

DBLP - 15 NA - - - 63.8 11.70 5

Google+ - 6 NA - - - 109.9 5.84 5

MIS Cplex-IP Cplex-LP (default) Thetis (ε = 0.1)

(max) t (secs) BFS Gap(%) t (secs) LP RSol t (secs) LP RSol

frb59-26-1 - 50 18.0 4.88 767 16 0.88 447.7 18

Amazon 35.4 1.75×105 - 25.7 1.85×105 1.58×105 3.09 1.73×105 1.43×105

DBLP 17.3 1.52×105 - 24.0 1.75×105 1.41×105 2.72 1.66×105 1.34×105

Google+ - 1.06×105 0.02 68.8 1.11×105 9.40×104 4.37 1.00×105 8.67×104

5

Problem Description

6

Production Process


I Decisions span a planning horizon T .

I Discrete decisions determine the start time of the production process.

I Continuous decisions determine the production profile evaluated by productionfunctions f (·) and gp(·).

7

Production Process





7

Production Process





7

Production Process





7

Production functions

I Production function f (·) is a concave function that determines the maximumproduction rate as a function of total production.

I Product fraction functions gp(·) evolve monotonically as a function of the totalproduction.





Useful product

8








Useful product

8








Useful product

9








Useful product

By-product

10

Continous time formulation

Cumulative production v(t) is calculated using production rate x(t)

v(t) =

∫ t

0

x(s)ds

Mixture production rate is limited by production function f (·)

x(t) ≤ f (v(t))

Product production rates yp(t) calculated by fraction functions gp(·)


Production profiles are active only after the start time z(t)

v(t) ≤ M z(t)

11



v(t) =

∫ t

0

x(s)ds


x(t) ≤ f (v(t))




v(t) ≤ M z(t)

11



v(t) =

∫ t

0

x(s)ds


x(t) ≤ f (v(t))




v(t) ≤ M z(t)

11



v(t) =

∫ t

0

x(s)ds


x(t) ≤ f (v(t))




v(t) ≤ M z(t)

11

Discrete Time Formulations

12

Discrete time formulations

A ‘natural’ discretization of this continuous time model (Tarhan 2009)

Continuous time formulation(CNT)

v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)

z(t) :T → {0, 1}, increasing

vt Cumulative production up totime period t ∈ T .

xt Mixture production during timeperiod t ∈ T .

yp,t Product p ∈ P production dur-ing time period t ∈ T .

zt Facility on/off decision vari-able.

13




v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)






13




v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)






13




v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)






13




v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)






13




v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)






13


Past models have proposed a natural discretization of this continuous time model.


v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)


=⇒

Discrete time formulation (F1 )

vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)

yp,t = xt gp(vt−1)

vt ≤ M zt

zt ≥ zt−1, zt ∈ {0, 1}

14




v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)


=⇒


vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1, zt ∈ {0, 1}

14




v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)


=⇒


vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1, zt ∈ {0, 1}

14




v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)


=⇒


vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1, zt ∈ {0, 1}

14




v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)


=⇒


vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1, zt ∈ {0, 1}

14




v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)


=⇒


vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1, zt ∈ {0, 1}

14

F1 formulation

How much product is produced up to time t?

wp,t :=∑s≤t

yp,s

=∑s≤t

xsgp(vs−1)

Discrete time formulation(F1 )

vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1, zt ∈ {0, 1}

15

F1 formulation


wp,t :=∑s≤t

yp,s

=∑s≤t

xsgp(vs−1)






vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1, zt ∈ {0, 1}

15

F1 formulation


wp,t :=∑s≤t

yp,s

=∑s≤t

xsgp(vs−1)






vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1, zt ∈ {0, 1}

15

F1 formulation


wp,t :=∑s≤t

yp,s

=∑s≤t

xsgp(vs−1)

0.2 0.4 0.6 0.8


0.0

0.2

0.4

0.6

0.8

1.0


F1

0.2 0.4 0.6 0.8


0.0

0.2

0.4

0.6

0.8

1.0


F1


vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1, zt ∈ {0, 1}

16

Alternate formulation

Can we do better?

Can we calculate exactly how much ofproduct p ∈ P is produced up to andincluding time period t ?

wp,t =

∫ t

0

yp(s)ds

=

∫ t

0

x(s) gp(v(s))ds

=

∫ vt

0

gp(v)dv

:= hp(vt)

Continuous timeformulation (CNT)

v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)

z(t) :T → {0, 1}, inc

17


Can we do better?


wp,t =

∫ t

0

yp(s)ds

=

∫ t

0

x(s) gp(v(s))ds

=

∫ vt

0

gp(v)dv

:= hp(vt)


v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)

z(t) :T → {0, 1}, inc

17


Can we do better?


wp,t =

∫ t

0

yp(s)ds

=

∫ t

0

x(s) gp(v(s))ds

=

∫ vt

0

gp(v)dv

:= hp(vt)


v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)

z(t) :T → {0, 1}, inc

17


Can we do better?


wp,t =

∫ t

0

yp(s)ds

=

∫ t

0

x(s) gp(v(s))ds

=

∫ vt

0

gp(v)dv

:= hp(vt)


v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)

z(t) :T → {0, 1}, inc

17


Can we do better?


wp,t =

∫ t

0

yp(s)ds

=

∫ t

0

x(s) gp(v(s))ds

=

∫ vt

0

gp(v)dv

:= hp(vt)


v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)

z(t) :T → {0, 1}, inc

17


Can we do better?


wp,t =

∫ t

0

yp(s)ds

=

∫ t

0

x(s) gp(v(s))ds

=

∫ vt

0

gp(v)dv

:= hp(vt)


v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)

z(t) :T → {0, 1}, inc

17


Can we do better?


wp,t =

∫ t

0

yp(s)ds

=

∫ t

0

x(s) gp(v(s))ds

=

∫ vt

0

gp(v)dv

:= hp(vt)


v(t) =

∫ t

0

x(s)ds

x(t) ≤ f (v(t))


v(t) ≤ M z(t)

z(t) :T → {0, 1}, inc

17

Comparing Formulations

Which formulation is better?

Formulation F1

vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1

Formulation F2

vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)

yp,t = hp(vt)− hp(vt−1)

vt ≤ M zt

zt ≥ zt−1

I F2 is a more accurate formulation of CNT than F1 .I F2 is computationally better because it deals with convex functions while F1 deals

with bilinear terms.

18



I F2 is a more accurate formulation of CNT than F1 .

I F2 is computationally better because it deals with convex functions while F1 dealswith bilinear terms.

0.2 0.4 0.6 0.8


0.0

0.2

0.4

0.6

0.8

1.0


F1

F2

0.2 0.4 0.6 0.8


0.0

0.2

0.4

0.6

0.8

1.0


F1

F2

18



I F2 is a more accurate formulation of CNT than F1 .

I F2 is computationally better because it deals with convex functions while F1 dealswith bilinear terms.



Useful product

By-product



Useful product

By-product

18

MIP Approximation & Relaxations

19

Approximations & Relaxations I


Approximate all the nonlinear production functions with piecewise linearizations.

I ProsI ‘Close’ to a feasible solution of the MINLP formulation.

I Cons

I Introduces additional SOS2 variables to branch on.I NOT a relaxation of the original formulation.







20



Approximate all the nonlinear production functions with piecewise linearizations.I Pros

I ‘Close’ to a feasible solution of the MINLP formulation.

I Cons

I Introduces additional SOS2 variables to branch on.I NOT a relaxation of the original formulation.







20





I ConsI Introduces additional SOS2 variables to branch on.

I NOT a relaxation of the original formulation.







20





I ConsI Introduces additional SOS2 variables to branch on.I NOT a relaxation of the original formulation.







20

Specially ordered sets (SOS)


[a0 ,gp (a0 )]

[a1 ,gp (a1 )]

[a2 ,gp (a2 )] [a3 ,gp (a3 )]

Cumulative useful product (gp+ ) Approximating gp(vt)

gp(vt) ≈∑o∈O

λt,ogp(ao)

1 =∑o∈O

λt,o

Structure: Only two adjacent nonzeros.

{λt,o |o ∈ O} ∈ S0S2



[a0 ,gp (a0 )]

[a1 ,gp (a1 )]

[a2 ,gp (a2 )] [a3 ,gp (a3 )]


gp(vt) ≈∑o∈O

λt,ogp(ao)

1 =∑o∈O

λt,o


{λt,o |o ∈ O} ∈ S0S2



[a0 ,gp (a0 )]

[a1 ,gp (a1 )]

[a2 ,gp (a2 )] [a3 ,gp (a3 )]


gp(vt) ≈∑o∈O

λt,ogp(ao)

1 =∑o∈O

λt,o


{λt,o |o ∈ O} ∈ S0S2



[a0 ,gp (a0 )]

[a1 ,gp (a1 )]

[a2 ,gp (a2 )] [a3 ,gp (a3 )]


gp(vt) ≈∑o∈O

λt,ogp(ao)

1 =∑o∈O

λt,o


{λt,o |o ∈ O} ∈ S0S2


Formulation F2

vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1

Piecewise Linear Approximation(PLA)

vt =t∑

s=0

xs

vt =∑o∈O

Bo λt,o

xt ≤ ∆t

∑o∈O

Fo λt,o

yp,t = wp,t − wp,t−1

wp,t =∑o∈O

Hp,o λt,o

vt ≤ M zt

zt ≥ zt−1

1 =∑o∈O

λt,o

{λt,o |o ∈ O} ∈ S0S2

22


Formulation F2

vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1


vt =t∑

s=0

xs

vt =∑o∈O

Bo λt,o

xt ≤ ∆t

∑o∈O

Fo λt,o


wp,t =∑o∈O

Hp,o λt,o

vt ≤ M zt

zt ≥ zt−1

1 =∑o∈O

λt,o

{λt,o |o ∈ O} ∈ S0S2

22


Formulation F2

vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1


vt =t∑

s=0

xs

vt =∑o∈O

Bo λt,o

xt ≤ ∆t

∑o∈O

Fo λt,o


wp,t =∑o∈O

Hp,o λt,o

vt ≤ M zt

zt ≥ zt−1

1 =∑o∈O

λt,o

{λt,o |o ∈ O} ∈ S0S2

22


Formulation F2

vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1


vt =t∑

s=0

xs

vt =∑o∈O

Bo λt,o

xt ≤ ∆t

∑o∈O

Fo λt,o


wp,t =∑o∈O

Hp,o λt,o

vt ≤ M zt

zt ≥ zt−1

1 =∑o∈O

λt,o

{λt,o |o ∈ O} ∈ S0S2

22

Approximations & Relaxations II

Secant Relaxation (1-SEC)

Relax all the nonlinear production functions using inner and outer approximations.

I Pros

I Relaxation of the original formulation.I Does NOT introduce additional SOS2 variables.

I ConsI May not be ‘close’ to a feasible solution of the MINLP formulation.




Cumulative useful product (hp+ )



23



Relax all the nonlinear production functions using inner and outer approximations.I Pros

I Relaxation of the original formulation.

I Does NOT introduce additional SOS2 variables.








23



Relax all the nonlinear production functions using inner and outer approximations.I Pros

I Relaxation of the original formulation.I Does NOT introduce additional SOS2 variables.








23

Approximations & Relaxations III

Multiple Secant Relaxation (k-SEC)

Relax all the nonlinear production functions using inner and outer approximations but usemultiple secants instead of a just a single one.

I ProsI ‘Close’ to a feasible solution of the MINLP formulation.

I Relaxation of the original formulation.

I Cons

I Introduces additional SOS2 variables to branch on.


Cumulative useful product (hp− )


Cumulative by-product (hp+ )

24




I ProsI ‘Close’ to a feasible solution of the MINLP formulation.I Relaxation of the original formulation.

I Cons






24





I Cons






24





I ConsI Introduces additional SOS2 variables to branch on.





24

Trix

Key Idea

Production functions are positive only if the facility is open.

Original Formulation...

vt =∑o∈O

Bo λt,o

wp,t =∑o∈O

Hp,o λt,o

1 =∑o∈O

λt,o

vt ≤ Mzt

Stronger Formulation...

vt =∑o∈O

Bo λt,o

wp,t =∑o∈O

Hp,o λt,o

zt =∑o∈O

λt,o

25

Trix

Key Idea



vt =∑o∈O

Bo λt,o

wp,t =∑o∈O

Hp,o λt,o

1 =∑o∈O

λt,o

vt ≤ Mzt


vt =∑o∈O

Bo λt,o

wp,t =∑o∈O

Hp,o λt,o

zt =∑o∈O

λt,o

25

Trix

Key Idea



vt =∑o∈O

Bo λt,o

wp,t =∑o∈O

Hp,o λt,o

1 =∑o∈O

λt,o

vt ≤ Mzt


vt =∑o∈O

Bo λt,o

wp,t =∑o∈O

Hp,o λt,o

zt =∑o∈O

λt,o

25

Locally Ideal formulations

How do we determine which formulation is better?

I Strength of LP relaxations in the absence of other constraints.

Locally Ideal: A strong property of a MIP model

I Vertices of the LP relaxation of a MIP model always satisfy integrality [Padberg2000]

I In a more general sense, locally ideal could also mean that you get a property likeSOS2 also for free. [Keha2004]

Bottom line

I Locally ideal → we get integrality for free.

26







Bottom line


26







Bottom line


26







Bottom line


26







Bottom line


26







Bottom line


26


Locally Ideal

Non-integral extreme point

Not Locally Ideal

Bottom line


27

Properties of MIP formulations

Theorem 1

I F1 is not locally ideal. (counter example)

Theorem 2

I F2 is locally ideal.

28

Other Piecewise Linear Formulations

Incremental Model

B0 B1 B2 B3 B4 B5

Variable (x)

Funct

ion f

(x)

δ0 δ1 δ2 δ3 δ4

[Markowitz and Manne 1957]

Multiple Choice

B0 B1 B2 B3

Variable (x)

Funct

ion f

(x)

z0 =1 z1 =1 z2 =1

[Balakrishna and Graves 1989]

29

Performance Evaluation

30

Experiments

Goals

I Impact on formulation accuracy in going from F1 to F2

I Impact in solution time in going from F1 to F2 as solved by our models.

I Impact of stronger formulations on solving the MIP approximation/relaxations.

Sample Application

Transportation problem with production facilities manufacturing products for customers.

31

Accuracy

2.5

3.0

3.5

4.0

4.5

Obje

ctiv

e f

unct

ion

Formulation F1



I F1: Bilinear formulation of [Tarhan2009].

I F2: Our formulation.

32

Accuracy

2.5

3.0

3.5

4.0

4.5

Obje

ctiv

e f

unct

ion

Formulation F1

Repaired F2 solution



F1



32

Accuracy

2.5

3.0

3.5

4.0

4.5

Obje

ctiv

e f

unct

ion

Formulation F1

Repaired F2 solution

Formulation F2



F1

F2



32

Formulations

0 1 2 3 4 5

Gap to best feasible solution (%)

0.0

0.2

0.4

0.6

0.8

1.0

Fract

ion o

f in

stance

s

SEC-S kSEC-SS

0 1 2 3 4 5

Gap to best relaxation bound (%)

0.0

0.2

0.4

0.6

0.8

1.0

Fract

ion o

f in

stance

s

PLA-SS SEC-S kSEC-SS

33

Trix

1 10 1002 5 20

MIP Gaps (%)

0.0

0.2

0.4

0.6

0.8

1.0

Fract

ion o

f in

stance

s

Secant relaxation (1-SEC)

strengthening no strengthening

1 10 1002 5 20

MIP Gaps (%)

0.0

0.2

0.4

0.6

0.8

1.0

Fract

ion o

f in

stance

s

Multiple secant relaxation (3-SEC)

1 10 1002 5 20

MIP Gaps (%)

0.0

0.2

0.4

0.6

0.8

1.0

Fract

ion o

f in

stance

s

Piecewise linear approximation (PLA)

34

Conclusions

I Problem Description

I Defined a non-convex production process involving desirable & undesirable products.

I Ratio of byproducts to total production increases monotonically.

I Methods

I Reformulated an existing formulation (F1) to produce a more accurate formulation(F2) based on the cumulative product production function.

I Devised scalable MIP approximations & relaxations (PLA, 1-SEC, k-SEC).

I Conclusions

I F2 formulation is a more accurate evaluation of operations as compared to F1 .

I F2 is computationally more tractable than F1 .

35

Conclusions




I Methods



I Conclusions



35

Conclusions




I Methods



I Conclusions



35

Conclusions




I Methods



I Conclusions



35

Conclusions




I Methods



I Conclusions



35

Conclusions




I Methods



I Conclusions



35

Conclusions




I Methods



I Conclusions



35

Incremental Formulation

Set structure

I Function f (·) is evaluated only if a binary variable z is on.

Incremental formulation

B0 B1 B2 B3 B4 B5

Variable (x)

Funct

ion f

(x)

δ0 δ1 δ2 δ3 δ4

Original Formulation (∆1)

x =∑o∈O

[Bo − Bo−1]δo

y =∑o∈O

[Fo − Fo−1]δo

δ1 ≤ 1

δn ≥ 0

δi+1 ≤ bi ≤ δi ∀o ∈ Ob ∈ {0, 1}n

36


Set structure



B0 B1 B2 B3 B4 B5

Variable (x)

Funct

ion f

(x)

δ0 δ1 δ2 δ3 δ4


x =∑o∈O

[Bo − Bo−1]δo

y =∑o∈O

[Fo − Fo−1]δo

δ1 ≤ 1

δn ≥ 0

δi+1 ≤ bi ≤ δi ∀o ∈ Ob ∈ {0, 1}n

36


Set structure



B0 B1 B2 B3 B4 B5

Variable (x)

Funct

ion f

(x)

δ0 δ1 δ2 δ3 δ4


x =∑o∈O

[Bo − Bo−1]δo

y =∑o∈O

[Fo − Fo−1]δo

δ1 ≤ 1

δn ≥ 0

δi+1 ≤ bi ≤ δi ∀o ∈ Ob ∈ {0, 1}n

36


Set structure


Stronger Formulation (∆2)

x =∑o∈O

[Bo − Bo−1]δo

y =∑o∈O

[Fo − Fo−1]δo

δ1 ≤ z

δn ≥ 0

δi+1 ≤ bi ≤ δi ∀o ∈ Ob ∈ {0, 1}n


x =∑o∈O

[Bo − Bo−1]δo

y =∑o∈O

[Fo − Fo−1]δo

δ1 ≤ 1

δn ≥ 0

δi+1 ≤ bi ≤ δi ∀o ∈ Ob ∈ {0, 1}n

37


Set structure


Stronger Formulation (∆2)

x =∑o∈O

[Bo − Bo−1]δo

y =∑o∈O

[Fo − Fo−1]δo

δ1 ≤ z

δn ≥ 0

δi+1 ≤ bi ≤ δi ∀o ∈ Ob ∈ {0, 1}n


x =∑o∈O

[Bo − Bo−1]δo

y =∑o∈O

[Fo − Fo−1]δo

δ1 ≤ 1

δn ≥ 0

δi+1 ≤ bi ≤ δi ∀o ∈ Ob ∈ {0, 1}n

37

Numerical Experiments: Incremental Model

Numerical experiments performed on randomly generated instances of an advertisingbudget allocation problem.

Solve time

0 1500 3000 4500 6000

MIP Solve time (s)

0.0

0.2

0.4

0.6

0.8

1.0

Fract

ion o

f data

sets

P(∆2 ) P(∆1 )

Root LP gap

0 10 20 30 40

LP relaxation gap (%)

0.0

0.2

0.4

0.6

0.8

1.0Fr

act

ion o

f data

sets

P(∆2 ) P(∆1 )

Nodes explored

0 150 300 450 600

Nodes

0.0

0.2

0.4

0.6

0.8

1.0

Fract

ion o

f data

sets

P(∆2 ) P(∆1 )

38


Many formulations can incorporate binary indicators variables in a locally ideal way!

Convex Combination

B0 B1 B2 B3

Variable (x)

Funct

ion f

(x)

[B0 ,f(B0 )]

[B1 ,f(B1 )]

[B2 ,f(B2 )] [B3 ,f(B3 )]

Incremental Model

B0 B1 B2 B3 B4 B5

Variable (x)

Funct

ion f

(x)

δ0 δ1 δ2 δ3 δ4

Multiple Choice

B0 B1 B2 B3

Variable (x)

Funct

ion f

(x)

z0 =1 z1 =1 z2 =1

39


Incremental Model

B0 B1 B2 B3 B4 B5

Variable (x)

Funct

ion f

(x)

δ0 δ1 δ2 δ3 δ4

[Balakrishnan1989]

Multiple Choice

B0 B1 B2 B3

Variable (x)

Funct

ion f

(x)

z0 =1 z1 =1 z2 =1

[Dantzig1960]

40

Numerical Experiments: SOS2 Model

Numerical experiments performed on randomly generated instances of an advertisingbudget allocation problem.

Solve time

0 800 1600 2400 3200MIP Solve time (s)

0.0

0.2

0.4

0.6

0.8

1.0

Fract

ion o

f data

sets

P(S2 ) P(S1 )

Root LP gap

0 10 20 30 40LP relaxation gap (%)

0.0

0.2

0.4

0.6

0.8

1.0Fr

act

ion o

f data

sets

P(S2 ) P(S1 )

Nodes explored

0 400 800 1200 1600Nodes

0.0

0.2

0.4

0.6

0.8

1.0

Fract

ion o

f data

sets

P(S2 ) P(S1 )

41


I Transportation problem with production facilities I manufacturing products P+ forcustomers J .

I Demand made by customers are known a priori.

I Facility operations follow known production functions.

I Facilities incur fixed, operating, transportation and penalty costs.

42






42






42






42


Formulation F2

vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1


vt =t∑

s=0

xs

vt =∑o∈O

Bo λt,o

xt ≤ ∆t

∑o∈O

Fo λt,o


wp,t =∑o∈O

Hp,o λt,o

vt ≤ M zt

zt ≥ zt−1

1 =∑o∈O

λt,o

43


Formulation F2

vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1


vt =t∑

s=0

xs

vt =∑o∈O

Bo λt,o

xt ≤ ∆t

∑o∈O

Fo λt,o


wp,t =∑o∈O

Hp,o λt,o

vt ≤ M zt

zt ≥ zt−1

1 =∑o∈O

λt,o

43


Formulation F2

vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1


vt =t∑

s=0

xs

vt =∑o∈O

Bo λt,o

xt ≤ ∆t

∑o∈O

Fo λt,o


wp,t =∑o∈O

Hp,o λt,o

vt ≤ M zt

zt ≥ zt−1

1 =∑o∈O

λt,o

43


Formulation F2

vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1


vt =t∑

s=0

xs

vt =∑o∈O

Bo λt,o

xt ≤ ∆t

∑o∈O

Fo λt,o


wp,t =∑o∈O

Hp,o λt,o

vt ≤ M zt

zt ≥ zt−1

1 =∑o∈O

λt,o

43


Formulation F2

vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1


vt =t∑

s=0

xs =∑o∈O

Bo λt,o

yp,t = wp,t − wp,t−1∑o∈O

Hp,o λt,o ≤ wp,t ≤∑o∈O

Hp,o λt,o ∀p ∈ P+

∑o∈O


Hp,o λt,o ∀p ∈ P−

vt ≤ M zt

zt ≥ zt−1

1 =∑o∈O

λt,o

{λt,o |o ∈ O} ∈ S0S2

44


Formulation F2

vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1


vt =t∑

s=0

xs =∑o∈O

Bo λt,o




∑o∈O



vt ≤ M zt

zt ≥ zt−1

1 =∑o∈O

λt,o

{λt,o |o ∈ O} ∈ S0S2

44


Formulation F2

vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1


vt =t∑

s=0

xs =∑o∈O

Bo λt,o




∑o∈O



vt ≤ M zt

zt ≥ zt−1

1 =∑o∈O

λt,o

{λt,o |o ∈ O} ∈ S0S2

44


Formulation F2

vt =t∑

s=0

xs

xt ≤ ∆t f (vt−1)


vt ≤ M zt

zt ≥ zt−1


vt =t∑

s=0

xs =∑o∈O

Bo λt,o




∑o∈O



vt ≤ M zt

zt ≥ zt−1

1 =∑o∈O

λt,o

{λt,o |o ∈ O} ∈ S0S2

44

Breakpoint Selection

45

Multiple functions functions

I We consider a special case with multiple functions sharing the same domain.





Useful product

By-product

This structure occurs in models from various applications!

46

Multiple functions functions

I We consider a special case with multiple functions sharing the same domain.





Useful product

By-product

This structure occurs in models from various applications!

46

Model 1

Separately approximate each of the nonlinear functions.

I Pros: Accurate

I Cons: Introduces additional SOS2 variables for each function.


[b 11 ,f1 ]

[b 12 ,f2 ][b 1

0 , f0 ]

[b3 , f3 ]



[b 20 ,f 2

0 ]

[b 21 ,f 2

1 ]

[b 22 ,f 2

2 ]

[b 23 ,f 2

3 ]

[b 24 ,f 2

4 ]



[b 30 ,f 3

0 ][b 3

1 ,f 31 ]

[b 32 ,f 3

2 ]


47

Model 1


I Pros: Accurate



[b 11 ,f1 ]

[b 12 ,f2 ][b 1

0 , f0 ]

[b3 , f3 ]



[b 20 ,f 2

0 ]

[b 21 ,f 2

1 ]

[b 22 ,f 2

2 ]

[b 23 ,f 2

3 ]

[b 24 ,f 2

4 ]



[b 30 ,f 3

0 ][b 3

1 ,f 31 ]

[b 32 ,f 3

2 ]


47

Model 1


I Pros: Accurate



[b 11 ,f1 ]

[b 12 ,f2 ][b 1

0 , f0 ]

[b3 , f3 ]



[b 20 ,f 2

0 ]

[b 21 ,f 2

1 ]

[b 22 ,f 2

2 ]

[b 23 ,f 2

3 ]

[b 24 ,f 2

4 ]



[b 30 ,f 3

0 ][b 3

1 ,f 31 ]

[b 32 ,f 3

2 ]


47

Model 2

Use common SOS2 variables to approximate the nonlinear functions simultaneously.

I Pros: Fewer branching entities (SOS2 variables)

I Cons: Each function approximation is less accurate.


[b1 ,f1 )]

[b2 ,f2 )][b0 , f0 ]

[b3 , f3 ]



[b0 ,f 20 ]

[b1 ,f 21 ]

[b2 ,f 22 ]

[b3 ,f 23 ]



[b0 ,f 30 ]

[b1 ,f 31 ]

[b2 ,f 32 ]

[b3 ,f 33 ]


Applicable to piecewise linear relaxations

48

Model 2





[b1 ,f1 )]

[b2 ,f2 )][b0 , f0 ]

[b3 , f3 ]



[b0 ,f 20 ]

[b1 ,f 21 ]

[b2 ,f 22 ]

[b3 ,f 23 ]



[b0 ,f 30 ]

[b1 ,f 31 ]

[b2 ,f 32 ]

[b3 ,f 33 ]



48

Model 2





[b1 ,f1 )]

[b2 ,f2 )][b0 , f0 ]

[b3 , f3 ]



[b0 ,f 20 ]

[b1 ,f 21 ]

[b2 ,f 22 ]

[b3 ,f 23 ]



[b0 ,f 30 ]

[b1 ,f 31 ]

[b2 ,f 32 ]

[b3 ,f 33 ]



48

Model 2





[b1 ,f1 )]

[b2 ,f2 )][b0 , f0 ]

[b3 , f3 ]



[b0 ,f 20 ]

[b1 ,f 21 ]

[b2 ,f 22 ]

[b3 ,f 23 ]



[b0 ,f 30 ]

[b1 ,f 31 ]

[b2 ,f 32 ]

[b3 ,f 33 ]



48

Choosing breakpoints

Key question

Is there a way to select breakpoints of piecewise linear approximations & relaxationswhich are:

I As accurate as Model I.

I Use as few variables as Model II.

Contributions

I An NLP formulation to find the tightest possible piecewise linear relaxations orapproximations of a single/multiple convex function(s).

49

Single function approximation

Piecewise linear approximations that minimize maximum point-wise function evaluationerror.

x

f(x)

[a0 ,f(a0 )][B0 ,F0 ]

[B1 ,F1 ]

εε

ε

Single segment minimax solution

x

f(x)

[B0 ,F0 ][B1 ,F1 ]

[B2 ,F2 ]

[B3 ,F3 ]

[a0 ,f(a0 )]

[a1 ,f(a1 )]

[a2 ,f(a2 )]

Picewise linear approximation

We extend previous work [Imamoto2008], [Rote1992] to multiple univariateconvex/concave that share the same domain

50

Breakpoint selection: Single function approximation

x

f(x)

[a0 ,f(a0 )][B0 ,F0 ]

[B1 ,F1 ]

εε

ε


ε := maxx∈D|f (x)− ˆf (x)|

= |f (Bo)− Fo |

∀o ∈ O

= |Fo + f ′(ao)(ao − Bo)− f (ao)|

∀o ∈ O

51


x

f(x)

[a0 ,f(a0 )][B0 ,F0 ]

[B1 ,F1 ]

εε

ε



= |f (Bo)− Fo |

∀o ∈ O

= |Fo + f ′(ao)(ao − Bo)− f (ao)|

∀o ∈ O

51


x

f(x)

[a0 ,f(a0 )][B0 ,F0 ]

[B1 ,F1 ]

εε

ε



= |f (Bo)− Fo |

∀o ∈ O

= |Fo + f ′(ao)(ao − Bo)− f (ao)|

∀o ∈ O

51


x

f(x)

[B0 ,F0 ][B1 ,F1 ]

[B2 ,F2 ]

[B3 ,F3 ]

[a0 ,f(a0 )]

[a1 ,f(a1 )]

[a2 ,f(a2 )]



= |f (Bo)− Fo | ∀o ∈ O

= |Fo + f ′(ao)(ao − Bo)− f (ao)| ∀o ∈ O

51


x

f(x)

[a0 ,f(a0 )][B0 ,F0 ]

[B1 ,F1 ]

εε

ε


MIP Formulation

minB,a,F

ε

Fo+1 − Fo = f ′(ao)(Bo+1 − Bo)

ε =∣∣Fo+1 − f (Bo)

∣∣ε =

∣∣Fo + f ′(ao)(ao − Bo)− f (ao)∣∣

ao ≤ ao+1

ao−1 ≤ Bo ≤ ao

52


x

f(x)

[a0 ,f(a0 )][B0 ,F0 ]

[B1 ,F1 ]

εε

ε


MIP Formulation

minB,a,F

ε

Fo+1 − Fo = f ′(ao)(Bo+1 − Bo)

ε =∣∣Fo+1 − f (Bo)

∣∣ε =

∣∣Fo + f ′(ao)(ao − Bo)− f (ao)∣∣

ao ≤ ao+1


52


x

f(x)

[B0 ,F0 ][B1 ,F1 ]

[B2 ,F2 ]

[B3 ,F3 ]

[a0 ,f(a0 )]

[a1 ,f(a1 )]

[a2 ,f(a2 )]


MIP Formulation

minB,a,F

ε

Fo+1 − Fo = f ′(ao)(Bo+1 − Bo)

ε =∣∣Fo+1 − f (Bo)

∣∣ε =

∣∣Fo + f ′(ao)(ao − Bo)− f (ao)∣∣

ao ≤ ao+1


52


x

f(x)

[B0 ,F0 ][B1 ,F1 ]

[B2 ,F2 ]

[B3 ,F3 ]

[a0 ,f(a0 )]

[a1 ,f(a1 )]

[a2 ,f(a2 )]


MIP Formulation

minB,a,F

ε

Fo+1 − Fo = f ′(ao)(Bo+1 − Bo)

ε =∣∣Fo+1 − f (Bo)

∣∣ε =

∣∣Fo + f ′(ao)(ao − Bo)− f (ao)∣∣

ao ≤ ao+1


52

Multiple functions


[b1 ,f1 )]

[b2 ,f2 )][b0 , f0 ]

[b3 , f3 ]



[b0 ,f 20 ]

[b1 ,f 21 ]

[b2 ,f 22 ]

[b3 ,f 23 ]



[b0 ,f 30 ]

[b1 ,f 31 ]

[b2 ,f 32 ]

[b3 ,f 33 ]


MIP Formulation

I Share the same variables Bo ∀o ∈ O.

I Objective function changed to a weighted error.

53

Single function relaxation

Piecewise linear relaxations that minimize area difference to the nonlinear function.

x

f(x)

[B0 ,F0 ][B1 ,F1 ]

[B2 ,F2 ]

[B3 ,F3 ]

[a0 ,f(a0 )]

[a1 ,f(a1 )]

[a2 ,f(a2 )]

Relaxing convex function f (·)

minB,F ,H,a

∑o∈O

(Fo + Fo−1)(Bo − Bo−1)

−∫ Mv

0

f (s)ds

Fo − f (ao+1) = f ′(ao+1)(Bo − ao+1)

Fo − f (ao) = f ′(ao)(Bo − ao)

ao ≤ ao+1


f (Bo) ≤ Fo

Extension to multiple functions similar to the approximation case.

54



x

f(x)

[B0 ,F0 ][B1 ,F1 ]

[B2 ,F2 ]

[B3 ,F3 ]

[a0 ,f(a0 )]

[a1 ,f(a1 )]

[a2 ,f(a2 )]


minB,F ,H,a

∑o∈O

(Fo + Fo−1)(Bo − Bo−1)

−∫ Mv

0

f (s)ds

Fo − f (ao+1) = f ′(ao+1)(Bo − ao+1)


ao ≤ ao+1


f (Bo) ≤ Fo


54



x

f(x)

[B0 ,F0 ][B1 ,F1 ]

[B2 ,F2 ]

[B3 ,F3 ]

[a0 ,f(a0 )]

[a1 ,f(a1 )]

[a2 ,f(a2 )]


minB,F ,H,a

∑o∈O

(Fo + Fo−1)(Bo − Bo−1)

−∫ Mv

0

f (s)ds

Fo − f (ao+1) = f ′(ao+1)(Bo − ao+1)


ao ≤ ao+1


f (Bo) ≤ Fo


54



x

f(x)

[B0 ,F0 ][B1 ,F1 ]

[B2 ,F2 ]

[B3 ,F3 ]

[a0 ,f(a0 )]

[a1 ,f(a1 )]

[a2 ,f(a2 )]


minB,F ,H,a

∑o∈O

(Fo + Fo−1)(Bo − Bo−1)

−∫ Mv

0

f (s)ds

Fo − f (ao+1) = f ′(ao+1)(Bo − ao+1)


ao ≤ ao+1


f (Bo) ≤ Fo


54



x

f(x)

[B0 ,F0 ][B1 ,F1 ]

[B2 ,F2 ]

[B3 ,F3 ]

[a0 ,f(a0 )]

[a1 ,f(a1 )]

[a2 ,f(a2 )]


minB,F ,H,a

∑o∈O

(Fo + Fo−1)(Bo − Bo−1)

−∫ Mv

0

f (s)ds

Fo − f (ao+1) = f ′(ao+1)(Bo − ao+1)


ao ≤ ao+1


f (Bo) ≤ Fo


54



x

f(x)

[B0 ,F0 ][B1 ,F1 ]

[B2 ,F2 ]

[B3 ,F3 ]

[a0 ,f(a0 )]

[a1 ,f(a1 )]

[a2 ,f(a2 )]


minB,F ,H,a

∑o∈O

(Fo + Fo−1)(Bo − Bo−1)

−∫ Mv

0

f (s)ds

Fo − f (ao+1) = f ′(ao+1)(Bo − ao+1)


ao ≤ ao+1


f (Bo) ≤ Fo


54


55

Piecewise linear approximation: Accuracy

Pointwise function evaluation error between the piecewise linear approximations and thenon-linear functions over 592 instances.

Benchmarks

I Our method: NLP formulation usingcommon breakpoints.

I Optimal: NLP formulation usingseparate breakpoints.

I Uniform: Uniformly spaced commonbreakpoints.

Metrics

I Maximum relative difference infunction evaluation.

56



Benchmarks




Metrics


56



Benchmarks




Metrics


56



Benchmarks




Metrics


56



0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0

Relative function evaluation error (%)

0.0

0.2

0.4

0.6

0.8

1.0

Fract

ion o

f in

stance

s

uniform

optimal

our method

Benchmarks




Metrics


56

Piecewise linear relaxations: Accuracy

Relative area difference between the piecewise linear relaxations and the non-linearfunctions over 592 instances.

Benchmarks



Metrics

I Relative difference in area under thecurve.

57



Benchmarks



Metrics


57



Benchmarks



Metrics


57



0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0

Relative area difference (%)

0.0

0.2

0.4

0.6

0.8

1.0

Fract

ion o

f in

stance

s

our method

optimal

Benchmarks



Metrics


57

Documents

Mixed-Integer programming approaches for some non-convex ...srikris.github.io/presentations/Defense.pdf · Life since 2009 I Part 1:Mixed integer programming (MIP) techniques with