ENE 2XX: Renewable Energy Systems and Control LEC 02 : Convex Programs Professor Scott Moura University of California, Berkeley Summer 2017 Prof. Moura |Tsinghua-Berkeley Shenzhen Institute ENE 2XX | LEC 02 - Convex Programs Slide 1


ENE 2XX: Renewable Energy Systems and Control

LEC 02 : Convex Programs

Professor Scott Moura, University of California, Berkeley

Summer 2017

Prof. Moura | Tsinghua-Berkeley Shenzhen Institute ENE 2XX | LEC 02 - Convex Programs Slide 1


What is an Optimization Program?

We seek “the best” values for design variables $x \in \mathbb{R}^n$

Must respect certain constraints / limitations

minimize $f(x)$ [Objective Function]

subject to $g_i(x) \le 0, \quad i = 1, \dots, m$ [Inequality constraints]

$h_j(x) = 0, \quad j = 1, \dots, l$ [Equality constraints]

A value $x^*$ that solves this optimization program is called a “minimizer”.


In vector notation, the same program reads:

minimize $f(x)$ [Objective Function]

subject to $g(x) \le 0$ [Inequality constraints]

$h(x) = 0$ [Equality constraints]


Classes of Optimization Programs

Figure: diagram of the program classes LP, QP, CP, NLP, and MIP.

LP = Linear Program; QP = Quadratic Program; CP = Convex Program; NLP = Nonlinear Program; MIP = Mixed Integer Program


Outline

1 Convex Programming

2 Linear Programming

3 Quadratic Programming

4 Second Order Cone Programming

5 Robust Programming & Chance Constraints

6 Maximum Likelihood Estimation


Convex Programs

A convex optimization problem has the form

minimize $f(x)$ (1)

subject to $g_i(x) \le 0, \quad i = 1, \dots, m$ (2)

$a_j^T x = b_j, \quad j = 1, \dots, l.$ (3)

Comparing this problem with the abstract optimization problem defined before, the convex optimization problem has three additional requirements:

the objective function $f(x)$ must be convex,

the inequality constraint functions $g_i(x)$ must be convex for all $i = 1, \dots, m$,

the equality constraint functions $h_j(x)$ must be affine for all $j = 1, \dots, l$.

Note that in the convex optimization problem, we can only tolerate affine equality constraints, meaning (3) takes the matrix-vector form $A_{eq} x = b_{eq}$.


Why care?

No general analytic solutions; however, VERY powerful methods exist to solve CPs numerically

Ex: Easily solve CPs with 100’s or 1000’s of variables in just a few seconds

Ex: Easily solve CPs with 1M’s of variables in tens of seconds

CP solvers are off-the-shelf technology

YOUR focus: Find ways to convert your problem into a CP

If you formulate your problem into a CP, then you have essentially solved it

Converting your problem into a CP requires both art & technical skill


Key CP Properties

If a local minimum exists, then it is the global minimum.

If the objective function is strictly convex, and a local minimum exists, then it is a unique minimum.


Sub-classes of Convex Programs

Linear Programs (LPs)

Some Quadratic Programs (QPs)

Second Order Cone Programs (SOCPs)

Maximum Likelihood Estimation (MLE)

Geometric Programs (GPs)

Semidefinite Programs (SDPs)


Exercise

Is this a convex program?

minimize $f(x) = x_1^2 + x_2^2$ (4)

subject to $g_1(x) = x_1 / (1 + x_2^2) \le 0$ (5)

$h_1(x) = (x_1 + x_2)^2 = 0$ (6)

NOT a convex program.

Inequality constraint function $g_1(x)$ is not convex in $(x_1, x_2)$

Equality constraint function $h_1(x)$ is not affine in $(x_1, x_2)$

Now, an astute observer might comment that both sides of (5) can be multiplied by $(1 + x_2^2)$ and (6) can be represented simply by $x_1 + x_2 = 0$, without loss of generality.


Exercise

Is this a convex program?

minimize $f(x) = x_1^2 + x_2^2$ (4)

subject to $g_1(x) = x_1 \le 0$ (5)

$h_1(x) = x_1 + x_2 = 0$ (6)

YES. This is a convex program.

Objective function f(x) is convex in (x1, x2)

Inequality constraint function g1(x) is convex in (x1, x2)

Equality constraint function h1(x) is affine in (x1, x2)
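The verdict on the original constraint function $g_1(x) = x_1 / (1 + x_2^2)$ can be spot-checked numerically. The sketch below tests the midpoint inequality that every convex function must satisfy, on one hand-picked pair of points (the points and helper names are illustrative choices, not part of the lecture). A single violation proves non-convexity; passing one pair proves nothing by itself.

```python
def g1(x1, x2):
    """Original inequality constraint function from (5)."""
    return x1 / (1.0 + x2 ** 2)


def f(x1, x2):
    """Objective function from (4)."""
    return x1 ** 2 + x2 ** 2


def violates_midpoint_convexity(func, a, b, tol=1e-12):
    """True if func at the midpoint of a and b exceeds the average of func(a), func(b).

    A convex function never does this, so True certifies non-convexity.
    """
    mid = tuple((ai + bi) / 2.0 for ai, bi in zip(a, b))
    return func(*mid) > 0.5 * (func(*a) + func(*b)) + tol


# Hand-picked test points (an assumption made for illustration).
a, b = (-1.0, 0.0), (-1.0, 2.0)
g1_violates = violates_midpoint_convexity(g1, a, b)  # g1 fails the convexity test here
f_violates = violates_midpoint_convexity(f, a, b)    # f passes on this pair
```

At these points $g_1$ takes the values $-1$ and $-0.2$ at the endpoints but $-0.5$ at the midpoint, which lies above the average $-0.6$.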


Linear Programs

A linear program (LP) is defined as the following special case of a CP:

minimize $c^T x$ (7)

subject to $A x \le b$ (8)

$A_{eq} x = b_{eq}$ (9)

$f(x)$ must be linear (or affine, before dropping the additive constant)

$g_i(x)$ and $h_j(x)$ must be affine for all $i$ and $j$, respectively.

Figure: Feasible set of LPs always forms a polyhedron P. Objective function visualized as isolines of constant cost (dotted lines). The optimal solution is at the boundary point that touches the isoline of least cost.


Nature of LP Solutions

Proposition (Nature of LP Solutions)

The solution to any linear program is characterized by one of the following three categories:

[No Solution] Occurs when the feasible set is empty, or the objective function is unbounded below on the feasible set.

[One Unique Solution] There exists a single unique solution at a vertex of the feasible set. That is, at least two constraints are active and their intersection gives the optimal solution. (see previous slide)

[A Non-Unique Solution] There exists an infinite number of solutions, given by one edge of the feasible set. That is, one or more constraints are active and all solutions along the intersection of these constraints are equally optimal. This can only occur when the objective function gradient is orthogonal to one or more constraints.


LP Examples

Diet Problem: choose quantities $x_1, \dots, x_n$ of $n$ foods

one unit of food $j$ costs $c_j$, contains amount $a_{ij}$ of nutrient $i$

healthy diet requires nutrient $i$ in quantity at least $b_i$

to find the cheapest healthy diet

minimize $c^T x$

subject to $A x \ge b, \quad x \ge 0$

Minimize a piecewise affine (PWA) function:

minimize $\max_{i = 1, \dots, m} \{ a_i^T x + b_i \}$

is equivalent to the LP

minimize $t$

subject to $a_i^T x + b_i \le t, \quad \forall\, i = 1, \dots, m$


Optimal Economic Dispatch

You are the California Independent System Operator (CAISO). You must schedule power generators for tomorrow (24 one-hour segments) to satisfy electricity demand. Given data:

Generator $i$ provides “marginal cost” $c_i$ (units of USD/MW). Quantity $c_i$ is financial compensation each generator requests for providing 1 MW.

Generator $i$ has maximum power capacity of $x_{i,\max}$ (units of MW).

California electricity demand is $D(k)$, where $k$ indexes each hour, i.e. $k = 0, 1, \dots, 23$.

minimize $\sum_{k=0}^{23} \sum_{i=1}^{n} c_i x_i(k)$ (10)

subject to $0 \le x_i(k) \le x_{i,\max}, \quad \forall\, i = 1, \dots, n, \; k = 0, \dots, 23$ (11)

$\sum_{i=1}^{n} x_i(k) = D(k), \quad k = 0, \dots, 23$ (12)

The optimization variable $x_i(k)$ is the power produced by generator $i$ during hour $k$.
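Because (10)-(12) have no constraint linking one hour to the next, each hour can be solved on its own, and for a single hour the LP is solved by filling demand from the cheapest generator upward (a merit-order rule). The sketch below implements that rule in plain Python rather than calling an LP solver; the function names and the three-generator data are hypothetical and chosen only for illustration.

```python
def dispatch_hour(costs, capacities, demand):
    """Merit-order dispatch for one hour.

    costs[i]      : marginal cost c_i of generator i [USD/MW]
    capacities[i] : maximum power x_{i,max} of generator i [MW]
    demand        : demand D(k) for this hour [MW]
    Returns x with 0 <= x[i] <= capacities[i] and sum(x) == demand.
    """
    if demand > sum(capacities):
        raise ValueError("infeasible: demand exceeds total capacity")
    x = [0.0] * len(costs)
    remaining = demand
    # Serve demand starting from the cheapest generator.
    for i in sorted(range(len(costs)), key=lambda idx: costs[idx]):
        x[i] = min(capacities[i], remaining)
        remaining -= x[i]
    return x


def dispatch_day(costs, capacities, demand_profile):
    """Solve every hour independently; returns (schedule, total_cost)."""
    schedule = [dispatch_hour(costs, capacities, d) for d in demand_profile]
    total_cost = sum(c * x_i for x in schedule for c, x_i in zip(costs, x))
    return schedule, total_cost


# Hypothetical data: three generators and a three-hour demand profile.
costs = [50.0, 20.0, 35.0]        # USD/MW
capacities = [100.0, 80.0, 60.0]  # MW
schedule, total_cost = dispatch_day(costs, capacities, [70.0, 120.0, 200.0])
```

The shortcut stops being valid once the model gains coupling constraints such as ramp limits or storage, at which point a general LP solver is the appropriate tool.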


Plot: Power Demand [GW] versus Time of Day.

Figure: [LEFT] Marginal cost of electricity for various generators, as a function of cumulative capacity. The purple line indicates the total demand D(k). All generators left of the purple line are dispatched. [RIGHT] Optimal supply mix and demand for 03:00.


Plot: Power Demand [GW] versus Time of Day.

Figure: [LEFT] Marginal cost of electricity for various generators, as a function of cumulative capacity. The purple line indicates the total demand D(k). All generators left of the purple line are dispatched. [RIGHT] Optimal supply mix and demand for 19:00.


Quadratic Programs

A quadratic program (QP) is defined as:

minimize $\frac{1}{2} x^T Q x + R^T x + S$ (10)

subject to $A x \le b$ (11)

$A_{eq} x = b_{eq}$ (12)

$f(x)$ must be quadratic in $x$

$g_i(x)$ and $h_j(x)$ must be affine for all $i$ and $j$, respectively.

Figure: Feasible set of QPs always forms a polyhedron P. Objective function visualized as convex quadratic iso-contours of constant cost (dotted lines).


Remark

Not all QPs are convex programs! A QP is a convex program only if $Q \succeq 0$, i.e. $Q$ is positive semi-definite. QPs where $Q \not\succeq 0$ are called non-convex QPs and are generally very hard to solve.
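The condition $Q \succeq 0$ can be tested by checking that the smallest eigenvalue of the symmetric matrix $Q$ is non-negative. As a minimal sketch, restricted to the $2 \times 2$ case where the eigenvalues have a closed form (the function names are illustrative assumptions):

```python
import math


def eigenvalues_sym_2x2(q11, q12, q22):
    """Eigenvalues of the symmetric matrix [[q11, q12], [q12, q22]], smallest first."""
    mean = 0.5 * (q11 + q22)
    radius = math.hypot(0.5 * (q11 - q22), q12)
    return mean - radius, mean + radius


def is_convex_qp_2x2(q11, q12, q22, tol=1e-12):
    """True if Q is positive semi-definite, i.e. the QP objective is convex."""
    smallest, _ = eigenvalues_sym_2x2(q11, q12, q22)
    return smallest >= -tol


convex_case = is_convex_qp_2x2(2.0, 0.0, 1.0)      # eigenvalues 1 and 2
boundary_case = is_convex_qp_2x2(1.0, 1.0, 1.0)    # eigenvalues 0 and 2
nonconvex_case = is_convex_qp_2x2(1.0, 2.0, 1.0)   # eigenvalues -1 and 3
```

For larger matrices the same test would normally be delegated to a numerical eigenvalue routine.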


Linear Regression Models

more specifically, linear-in-the-parameters models

Suppose you have data comprising $n$ data pairs $(x_i, y_i)$, where $i = 1, \dots, n$. You seek to fit a mathematical model to this data, of the form:

$y = \theta_1 x + \theta_0$ (13)

How do we determine $\theta_1, \theta_0$?

Regression Analysis

Establish a mathematical relationship between variables, given data.

Quoted Text Message from Tech IP Attorney to Me

Attorney: One of our outside consultant firms billed us 170,000 USD to do an SQL regression model

Me: My undergrads would do that for 25 USD and pizza.

Attorney: yeah, next time we should go that route


Graphical Version

Determine a “best fit” for $m, b$ in the linear model

$y = m x + b$ (14)

given $n$ data pairs $(x_i, y_i)$, where $i = 1, \dots, n$. In other words, find the line that best fits the data points.


Least Squares

a.k.a. Ordinary Least Squares (OLS) or Linear Least Squares

Let us define best fit as follows. Define the “residual” $r_i$ for $m, b$ and data pair $(x_i, y_i)$ as follows:

$r_i = m x_i + b - y_i$ (15)

Obviously, when $r_i = 0$, then $m, b$ fit that data pair perfectly. We would like to select $m, b$ such that the sum of all squared residuals is minimized:

$\min_{m,b} \sum_{i=1}^{n} r_i^2 = \min_{m,b} \sum_{i=1}^{n} (m x_i + b - y_i)^2$ (16)

$= \min_{\theta = [m, b]} \| X\theta - Y \|_2^2$ (17)

where

$\theta = \begin{bmatrix} m \\ b \end{bmatrix}, \quad X = \begin{bmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix}, \quad Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$ (18)


Graphical Result


Other Linear-in-the-Parameter Models

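Polynomial: $y = \theta_0 + \theta_1 x + \theta_2 x^2 + \cdots + \theta_p x^p$. Residual $r = X\theta - Y$

$\theta = \begin{bmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_p \end{bmatrix}, \quad X = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^p \\ 1 & x_2 & x_2^2 & \cdots & x_2^p \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^p \end{bmatrix}, \quad Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$ (19)

Harmonic: $y = \theta_1 \sin(x) + \theta_2 \cos(x) + \theta_3 \sin(2x) + \theta_4 \cos(2x)$. Residual $r = X\theta - Y$

$\theta = \begin{bmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \\ \theta_4 \end{bmatrix}, \quad X = \begin{bmatrix} \sin(x_1) & \cos(x_1) & \sin(2x_1) & \cos(2x_1) \\ \sin(x_2) & \cos(x_2) & \sin(2x_2) & \cos(2x_2) \\ \vdots & \vdots & \vdots & \vdots \\ \sin(x_n) & \cos(x_n) & \sin(2x_n) & \cos(2x_n) \end{bmatrix}, \quad Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$ (20)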


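Radial Basis Function: $y = \theta_1 e^{-(x+0.5)^2} + \theta_2 e^{-(x)^2} + \theta_3 e^{-(x-0.5)^2}$. Residual $r = X\theta - Y$

$\theta = \begin{bmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \end{bmatrix}, \quad X = \begin{bmatrix} e^{-(x_1+0.5)^2} & e^{-(x_1)^2} & e^{-(x_1-0.5)^2} \\ e^{-(x_2+0.5)^2} & e^{-(x_2)^2} & e^{-(x_2-0.5)^2} \\ \vdots & \vdots & \vdots \\ e^{-(x_n+0.5)^2} & e^{-(x_n)^2} & e^{-(x_n-0.5)^2} \end{bmatrix}, \quad Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$ (21)

limited only by your imagination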
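All three model families share one recipe: pick a list of regressor functions and evaluate each of them at every data point to fill one row of $X$. A minimal sketch in plain Python follows; the helper names and the sample points are illustrative assumptions, not part of the lecture.

```python
import math


def design_matrix(xs, regressors):
    """Row i holds every regressor evaluated at data point xs[i]."""
    return [[phi(x) for phi in regressors] for x in xs]


def polynomial_regressors(p):
    """Regressors 1, x, x^2, ..., x^p (k=k binds each power at definition time)."""
    return [lambda x, k=k: x ** k for k in range(p + 1)]


# Harmonic regressors matching the columns of (20).
harmonic_regressors = [
    lambda x: math.sin(x),
    lambda x: math.cos(x),
    lambda x: math.sin(2 * x),
    lambda x: math.cos(2 * x),
]

# Radial basis regressors matching the columns of (21).
rbf_regressors = [
    lambda x: math.exp(-(x + 0.5) ** 2),
    lambda x: math.exp(-(x) ** 2),
    lambda x: math.exp(-(x - 0.5) ** 2),
]

xs = [0.0, 0.5, 1.0]  # hypothetical sample points
X_poly = design_matrix(xs, polynomial_regressors(2))
X_harm = design_matrix(xs, harmonic_regressors)
X_rbf = design_matrix(xs, rbf_regressors)
```

Whatever the regressors, the model stays linear in $\theta$, so the same least-squares machinery applies to each resulting $X$.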


Optimization Perspective

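All regression problems for linear-in-the-parameters models can be written:

$\underset{\theta}{\text{minimize}} \; \| X\theta - Y \|_2^2, \quad X \in \mathbb{R}^{n \times p}, \; \theta \in \mathbb{R}^p, \; Y \in \mathbb{R}^n$ (22)

$n$ : number of data pairs $(x_i, y_i)$

$p$ : number of coefficients $\theta_1, \dots, \theta_p$.

We assume $n > p$.

Recall First Order Necessary Condition (FONC): If $\theta^*$ minimizes (22), then $\frac{d}{d\theta} \| X\theta - Y \|_2^2 = 0$. Let’s expand this condition!

$0 = \frac{d}{d\theta} \| X\theta - Y \|_2^2$

$= \frac{d}{d\theta} (X\theta - Y)^T (X\theta - Y)$

$= \frac{d}{d\theta} \left( \theta^T X^T X \theta - 2 Y^T X \theta + Y^T Y \right)$

$= 2 X^T X \theta - 2 X^T Y$

$\Rightarrow \theta^* = \left( X^T X \right)^{-1} X^T Y$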
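The closed-form result can be exercised on a tiny data set. The sketch below forms $X^T X$ and $X^T Y$ and solves the normal equations $(X^T X)\theta = X^T Y$ by Gaussian elimination instead of computing the inverse explicitly; the helper names and the data (points lying exactly on $y = 2x + 1$) are assumptions made for illustration.

```python
def solve_linear_system(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def least_squares(X, Y):
    """Return theta solving the normal equations (X^T X) theta = X^T Y."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    XtY = [sum(row[i] * y for row, y in zip(X, Y)) for i in range(p)]
    return solve_linear_system(XtX, XtY)


# Hypothetical data on the line y = 2x + 1; rows of X are [x_i, 1] as in (18).
xs = [0.0, 1.0, 2.0, 3.0]
Y = [1.0, 3.0, 5.0, 7.0]
X = [[x, 1.0] for x in xs]
slope, intercept = least_squares(X, Y)
```

Solving the linear system directly is a common design choice because it avoids forming $(X^T X)^{-1}$, which is both more expensive and less accurate numerically.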


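Least Squares with L2 Regularization

a.k.a. Ridge Regression

What if we define “best fit” by a different criterion? For example, we minimize the sum of residuals squared, but penalize the coefficients from getting “too big”. Consider

$\underset{\theta}{\text{minimize}} \; \| X\theta - Y \|_2^2 + \alpha \| \theta \|_2^2$ (23)

Apply FONC:

$0 = \frac{d}{d\theta} \left( \| X\theta - Y \|_2^2 + \alpha \| \theta \|_2^2 \right)$

$= \frac{d}{d\theta} \left( (X\theta - Y)^T (X\theta - Y) + \alpha \theta^T \theta \right)$

$= \frac{d}{d\theta} \left( \theta^T (X^T X + \alpha I) \theta - 2 Y^T X \theta + Y^T Y \right)$

$= 2 (X^T X + \alpha I) \theta - 2 X^T Y$

$\Rightarrow \theta^* = \left( X^T X + \alpha I \right)^{-1} X^T Y$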
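A short numerical sketch shows the shrinking effect of $\alpha$. Since $X^T X + \alpha I$ is symmetric positive definite for $\alpha > 0$ (and for $\alpha = 0$ when $X$ has full column rank), the system is solved here with a Cholesky factorization; that solver choice, the function names, and the data are assumptions made for illustration.

```python
import math


def cholesky_solve(A, b):
    """Solve A z = b for symmetric positive definite A via A = L L^T."""
    n = len(b)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    # Forward substitution: L w = b.
    w = [0.0] * n
    for i in range(n):
        w[i] = (b[i] - sum(L[i][k] * w[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T z = w.
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (w[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z


def ridge_fit(X, Y, alpha):
    """Return theta solving (X^T X + alpha I) theta = X^T Y."""
    p = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (alpha if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    XtY = [sum(row[i] * y for row, y in zip(X, Y)) for i in range(p)]
    return cholesky_solve(A, XtY)


# Hypothetical data on the line y = 2x + 1; rows of X are [x_i, 1].
X = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
Y = [1.0, 3.0, 5.0, 7.0]
theta_ols = ridge_fit(X, Y, alpha=0.0)    # ordinary least squares
theta_ridge = ridge_fit(X, Y, alpha=1.0)  # coefficients pulled toward zero
```

With $\alpha = 1$ the coefficients come out as $74/39$ and $36/39$, both smaller in magnitude than the unregularized values 2 and 1.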


Ridge Coefficients as you vary α


Least Squares with L1 Regularization

a.k.a. Lasso Regression

What if we define “best fit” by a different criterion? For example, suppose our data occasionally contains outliers that can bias our fitted linear model undesirably. Is there a “robust regression” method? Yes.

$\min_{\theta} \| X\theta - Y \|_2^2 + \alpha \| \theta \|_1$ (24)

L2 penalties place small weight on small coefficients

$\theta_i^2$ is very small when $\theta_i$ is small

Little incentive to drive $\theta_i$ to zero, unless you consider $|\theta_i|$ instead.

Note: Due to the 1-norm, this is no longer a QP! It is, however, a CP.
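The lecture does not say how (24) is solved; in practice one would hand it to a CP solver. Purely as an illustration of why the 1-norm produces exact zeros, the sketch below uses cyclic coordinate descent with soft thresholding, a standard method for this objective that is swapped in here and is not taken from the slides. Function names and data are hypothetical.

```python
def soft_threshold(rho, kappa):
    """Shrink rho toward zero by kappa, returning exactly 0 inside [-kappa, kappa]."""
    if rho > kappa:
        return rho - kappa
    if rho < -kappa:
        return rho + kappa
    return 0.0


def lasso_coordinate_descent(X, Y, alpha, sweeps=200):
    """Minimize ||X theta - Y||_2^2 + alpha * ||theta||_1 one coordinate at a time."""
    n, p = len(X), len(X[0])
    theta = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes coordinate j.
            rho = 0.0
            for i in range(n):
                pred_without_j = sum(X[i][k] * theta[k] for k in range(p) if k != j)
                rho += X[i][j] * (Y[i] - pred_without_j)
            # Exact minimizer of the one-dimensional subproblem in theta[j].
            theta[j] = soft_threshold(rho, alpha / 2.0) / col_sq[j]
    return theta


# Hypothetical data with an identity design matrix, so each coordinate decouples.
X = [[1.0, 0.0], [0.0, 1.0]]
Y = [3.0, 0.1]
theta_lasso = lasso_coordinate_descent(X, Y, alpha=1.0)
theta_plain = lasso_coordinate_descent(X, Y, alpha=0.0)
```

In this example the large coefficient is shrunk from 3 to 2.5 while the small one is set to exactly zero, whereas an L2 penalty would only scale it down.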


$y = \theta_0 + \theta_1 x + \theta_2 x^2 + \cdots + \theta_8 x^8$

Plot: Voltage versus SOC, showing Data and LSQ Fit.

$\theta_0 = 2.6460,\ \theta_1 = 5.5442,\ \theta_2 = -15.7690,\ \theta_3 = 16.4894,\ \theta_4 = -0.9965,\ \theta_5 = -4.2202,\ \theta_6 = -2.8927,\ \theta_7 = 0.0602,\ \theta_8 = 2.6326$


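$y = \theta_0 + \theta_1 x + \theta_2 x^2 + \cdots + \theta_8 x^8$

Plot: Voltage versus SOC, showing Data and L1 Reg w/ $\alpha = 0.0042$.

$\theta_0 = 2.9873,\ \theta_1 = 0.9366,\ \theta_2 = -0.5531,\ \theta_3 = -0.0641,\ \theta_4 = 0,\ \theta_5 = 0,\ \theta_6 = 0,\ \theta_7 = 0,\ \theta_8 = 0.1052$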


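$y = \theta_0 + \theta_1 x + \theta_2 x^2 + \cdots + \theta_8 x^8$

Plot: Voltage versus SOC, showing Data and L1 Reg w/ $\alpha = 0.0083$.

$\theta_0 = 3.0579,\ \theta_1 = 0.4773,\ \theta_2 = 0,\ \theta_3 = -0.1202,\ \theta_4 = 0,\ \theta_5 = 0,\ \theta_6 = 0,\ \theta_7 = 0,\ \theta_8 = 0$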


$y = \theta_0 + \theta_1 x + \theta_2 x^2 + \cdots + \theta_8 x^8$

Plot: Voltage versus SOC, showing Data and L1 Reg w/ $\alpha = 0.0125$.

$\theta_0 = 3.0889,\ \theta_1 = 0.3547,\ \theta_2 = 0,\ \theta_3 = 0,\ \theta_4 = 0,\ \theta_5 = 0,\ \theta_6 = 0,\ \theta_7 = 0,\ \theta_8 = 0$


$y = \theta_0 + \theta_1 x + \theta_2 x^2 + \cdots + \theta_8 x^8$

Plot: Voltage versus SOC, showing Data, L2 Reg w/ $\alpha = 0.000$, L2 Reg w/ $\alpha = 0.001$, and L2 Reg w/ $\alpha = 0.010$.

$\theta_0 = 2.1,\ \theta_1 = 28.8,\ \theta_2 = -301,\ \theta_3 = 1595,\ \theta_4 = -4743,\ \theta_5 = 8222,\ \theta_6 = -8238,\ \theta_7 = 4414,\ \theta_8 = -977$

$\theta_0 = 2.71,\ \theta_1 = 4.13,\ \theta_2 = -8.51,\ \theta_3 = 4.10,\ \theta_4 = 3.42,\ \theta_5 = -0.34,\ \theta_6 = -2.31,\ \theta_7 = -1.49,\ \theta_8 = 1.75$

$\theta_0 = 2.82,\ \theta_1 = 2.40,\ \theta_2 = -2.81,\ \theta_3 = -0.33,\ \theta_4 = 0.71,\ \theta_5 = 0.69,\ \theta_6 = 0.32,\ \theta_7 = -0.04,\ \theta_8 = -0.27$


A Generalized Linear Model

The generalized linear model is given by:

$y = \sum_{i=1}^{p} \theta_i \phi_i(x) = \theta^T \phi(x)$ (25)

$y \in \mathbb{R}$ is the output of interest

$\theta \in \mathbb{R}^p$ are the coefficients or parameters to fit

$\phi(x)$ are “regressors” or “predictors”, which can involve dependent data in a nonlinear way

Summary of Regression Procedures

Least Squares (LSQ), a.k.a. linear least squares, ordinary least squares (convex QP)

LSQ w/ L2 Regularization, a.k.a. ridge regression (convex QP)

LSQ w/ L1 Regularization, a.k.a. lasso regression (CP)

LSQ w/ L1 and L2 Regularization, a.k.a. elastic net (CP)

LSQ w/ Huber Regularization (Hybridized L1+L2), a.k.a. Robust LSQ (CP)


Markowitz Portfolio Optimization - I

Problem Statement: Imagine you are an investment portfolio manager. You control a large sum of money, and can invest in $n$ different assets. At the end of some time period, your investment produces a financial return. The key challenge, here, is the return is not easily predictable. It is random.

Notation:

$x_i$ denotes the percentage of fund to invest in asset $i$. Note that $\sum_{i=1}^{n} x_i = 1$, and $x_i \ge 0$

Return is well characterized by Gaussian distribution $\mathcal{N}(\mu, \Sigma)$, where $\mu \in \mathbb{R}^n$ is expected return and $\Sigma \in \mathbb{R}^{n \times n}$ is the covariance

Examples:

Asset $i$ has expected return $\mu_i = 2\%$, with std dev of $\sqrt{\Sigma_{ii}} = 5\%$

Asset $j$ has expected return $\mu_j = 5\%$, with std dev of $\sqrt{\Sigma_{jj}} = 50\%$


Markowitz Portfolio Optimization - II

Suppose we seek to

maximize expected return, AND

minimize risk

These two objectives cannot be achieved w/o tradeoffs. Therefore, one often “scalarizes” this bi-criterion problem to explore the tradeoff:

minimize $-\mu^T x + \gamma \cdot x^T \Sigma x$ (26)

subject to $\mathbf{1}^T x = 1, \quad x \succeq 0$ (27)

where the parameter γ ≥ 0 is called the “risk aversion” parameter.

Increasing γ increases your sensitivity to risk

γ = 0 means you are risk neutral

γ < 0 means you are a risk seeker. Note this is NOT a convex QP.


Markowitz Portfolio Optimization - III

Consider this expected return & covariance data for a portfolio of 3 assets:

$\mu = [1.02, 1.05, 1.04]^T, \quad \Sigma = \begin{bmatrix} (0.05)^2 & 0 & 0 \\ 0 & (0.5)^2 & 0 \\ 0 & 0 & (0.1)^2 \end{bmatrix}$ (28)

Plot: Risk versus Expected Return [%], showing the Pareto Frontier with marked points $\gamma = 0$, $\gamma = 0.05$, and $\gamma = 1$, and a data cursor at X: 1.046, Y: 0.09999.

Figure: Trade off between maximizing expected return and minimizing risk. This trade off curve is called a “Pareto Frontier”

Plot: Portfolio Distribution ($x_1$, $x_2$, $x_3$) versus Risk Aversion Param, $\gamma$.

Figure: Optimal portfolio investment strategy, as risk aversion parameter $\gamma$ increases.
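The three-asset problem is small enough to solve with a few lines of plain Python. The sketch below applies projected gradient descent to (26)-(27) using the data in (28): each step moves against the gradient $-\mu + 2\gamma\Sigma x$ and then projects back onto the simplex $\{x : \mathbf{1}^T x = 1, x \ge 0\}$. This solver is an illustrative substitute for an off-the-shelf QP solver, it exploits the fact that $\Sigma$ in (28) is diagonal, and the function names, step size, and iteration count are assumptions.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {x : sum(x) = 1, x >= 0}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    tau = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            tau = t  # threshold from the last index that keeps u_k - t positive
    return [max(v_i - tau, 0.0) for v_i in v]


def markowitz_weights(mu, sigma_diag, gamma, step=0.5, iterations=5000):
    """Minimize -mu^T x + gamma * x^T Sigma x over the simplex, Sigma = diag(sigma_diag)."""
    n = len(mu)
    x = [1.0 / n] * n  # start from the equal-weight portfolio
    for _ in range(iterations):
        grad = [-mu[i] + 2.0 * gamma * sigma_diag[i] * x[i] for i in range(n)]
        x = project_to_simplex([x[i] - step * grad[i] for i in range(n)])
    return x


mu = [1.02, 1.05, 1.04]
sigma_diag = [0.05 ** 2, 0.5 ** 2, 0.1 ** 2]
x_neutral = markowitz_weights(mu, sigma_diag, gamma=0.0)  # risk neutral
x_averse = markowitz_weights(mu, sigma_diag, gamma=1.0)   # risk averse
```

With $\gamma = 0$ the whole fund goes to asset 2, the one with the highest expected return. With $\gamma = 1$ the optimality conditions give $x = [0, 3/52, 49/52]$, so most of the fund shifts to asset 3.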


Quadratically Constrained QPs

A generalization of the convex QP problem is the quadratically constrained QP (QCQP):

minimize $\frac{1}{2} x^T Q x + R^T x + S$ (29)

subject to $\frac{1}{2} x^T Q_i x + R_i^T x + S_i \le 0, \quad \forall\, i = 1, \dots, m$ (30)

$A_{eq} x = b_{eq}$ (31)

where $Q, Q_i \succeq 0$ for the program to be convex.


Second Order Cone Programs

A second order cone program (SOCP) is defined as:

minimize $f^T x$ (32)

subject to $\| A_i x + b_i \|_2 \le c_i^T x + d_i, \quad i = 1, \dots, m$ (33)

$A_{eq} x = b_{eq}$ (34)

Inequalities form a “second order cone” constraint

The unit second-order (convex) cone of dimension $k$ is

$C_k = \left\{ [x; t] \mid x \in \mathbb{R}^{k-1}, \; t \in \mathbb{R}, \; \|x\| \le t \right\}$

which is also called the “Ice cream” cone or “Lorentz” cone.

Figure: Boundary of second-order cone in $\mathbb{R}^3$, $\{ (x_1, x_2, t) \mid (x_1^2 + x_2^2)^{1/2} \le t \}$.


Convex QP→ SOCP - I

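Consider the convex QP Problem:

minimize $\frac{1}{2} x^T Q x + R^T x + S$ (35)

subject to $A x \le b, \quad A_{eq} x = b_{eq}$ (36)

where $Q \in \mathbb{R}^{n \times n}$ is symmetric and positive definite, $R \in \mathbb{R}^n$, and $S \in \mathbb{R}$. Note that

$\left\| \frac{1}{\sqrt{2}} Q^{1/2} x + \frac{\sqrt{2}}{2} Q^{-1/2} R \right\|_2^2 = \frac{1}{2} x^T Q x + R^T x + \frac{1}{2} R^T Q^{-1} R$ (37)

This allows us to re-write this convex QP as

minimize $\left\| \frac{1}{\sqrt{2}} Q^{1/2} x + \frac{\sqrt{2}}{2} Q^{-1/2} R \right\|_2^2 - \frac{1}{2} R^T Q^{-1} R + S$ (38)

subject to $A x \le b, \quad A_{eq} x = b_{eq}$ (39)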


Convex QP→ SOCP - II

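The constant term $-\frac{1}{2} R^T Q^{-1} R + S$ in (38) does not affect the minimizer, and minimizing the squared norm is equivalent to minimizing the norm itself. Problem (38)-(39) can therefore be recast as a SOCP

\begin{align}
\text{minimize} \quad & t \tag{40} \\
\text{subject to} \quad & \left\| \frac{1}{\sqrt{2}} Q^{1/2} x + \frac{\sqrt{2}}{2} Q^{-1/2} R \right\|_2 \le t \tag{41} \\
& A x \le b, \quad A_{eq} x = b_{eq} \tag{42}
\end{align}

Note:

Original convex QP: n optimization variables; m + l constraints

SOCP reformulation: n + 1 optimization variables; m + l + 1 constraints

Remark:

Can extend to positive semi-definite Q

Can extend to QCQP problems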


Robust Dispatch w/ High Penetration of Renewables

Problem Statement: Economically dispatch n generators to serve electricity demand D. In this case, a large percentage of the n generators are renewable.

Key Challenge: Maximum power generating capacity of renewable plants is uncertain. For example,

Wind farm can produce anywhere between 0 MW and 10 MW

Solar PV farm can produce anywhere between 0 MW and 20 MW


Robust Dispatch w/ High Penetration of Renewables

Model as LP:

\begin{align}
\text{minimize} \quad & f^T x \tag{43} \\
\text{subject to} \quad & R^T x \ge D \tag{44} \\
& 0 \le x \le 1 \tag{45}
\end{align}

$x \in \mathbb{R}^n$ is vector of power generation dispatched to the generators, as a fraction of the generator’s rated capacity

$f \in \mathbb{R}^n$ is vector of marginal costs [USD/MW]

$D \in \mathbb{R}$ is electricity demand [MW]

$R \in \mathbb{R}^n$ is vector of real-time power capacity for generators [MW]

Convert into standard form:

\begin{align}
\text{minimize} \quad & f^T x \tag{46} \\
\text{subject to} \quad & a^T x \le b \tag{47} \\
& 0 \le x \le 1 \tag{48}
\end{align}

where $a = -R$ and $b = -D$.
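Because the dispatch LP has a single coupling constraint plus box bounds, it can be solved by filling the cheapest generators first. The sketch below is a greedy merit-order fill, not a general LP solver, and it is only valid under the stated assumptions ($f \ge 0$, $R > 0$). The function name and all numbers are hypothetical, and `f[i]` is interpreted here as the cost of running generator i at $x_i = 1$.

```python
def merit_order_dispatch(f, R, D):
    """Solve min f'x s.t. R'x >= D, 0 <= x <= 1 by filling cheapest cost-per-MW first.

    Assumes f >= 0 and R > 0 elementwise; raises if total capacity < D.
    """
    if sum(R) < D:
        raise ValueError("demand exceeds total capacity")
    x = [0.0] * len(f)
    remaining = D
    # Cost per MW of generator i is f[i] / R[i]; dispatch in increasing order.
    for i in sorted(range(len(f)), key=lambda i: f[i] / R[i]):
        if remaining <= 0:
            break
        power = min(R[i], remaining)   # MW taken from generator i
        x[i] = power / R[i]            # fraction of available capacity
        remaining -= power
    return x

# Hypothetical data: wind, solar, natural gas
R = [10.0, 20.0, 50.0]      # available capacity [MW]
f = [10.0, 40.0, 500.0]     # cost of running each generator at x_i = 1
D = 40.0                    # demand [MW]

x = merit_order_dispatch(f, R, D)
cost = sum(fi * xi for fi, xi in zip(f, x))
a, b = [-Ri for Ri in R], -D   # standard-form data for (47)
```

With this data the wind and solar farms run at full output and the gas plant covers the remaining 10 MW, which satisfies (47) with equality.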


Robust Dispatch w/ High Penetration of Renewables

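Focus on uncertain parameter a. To illustrate, imagine...

generator i is wind farm, $0 \text{ MW} \le R_i \le 10 \text{ MW}$; $-10 \text{ MW} \le a_i \le 0 \text{ MW}$

generator j is solar PV farm, $0 \text{ MW} \le R_j \le 20 \text{ MW}$; $-20 \text{ MW} \le a_j \le 0 \text{ MW}$

generator k is natural gas plant, $R_k = 50 \text{ MW}$; $a_k = -50 \text{ MW}$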


Robust Dispatch w/ High Penetration of Renewables

Assumption: Vector a is known to lie within an ellipsoid

$$a \in \mathcal{E} = \{ \bar{a} + P u \mid \|u\|_2 \le 1 \} \tag{49}$$

where $\bar{a} \in \mathbb{R}^n$ is ellipsoid center, and $P \in \mathbb{R}^{n \times n}$ is positive semidefinite matrix

wind farm i: $\bar{a}_i = -5$ MW

solar PV farm j: $\bar{a}_j = -10$ MW

natural gas plant k: $\bar{a}_k = -50$ MW

Recall that $\lambda_i(P)$ provides semi-axis lengths of ellipsoid.

Define P as diagonal matrix with:

wind farm i: $P_{ii} = 5$ MW

solar PV farm j: $P_{jj} = 10$ MW

natural gas plant k: $P_{kk} = 0$ MW

Note: when P is positive definite, we have a 100% renewable grid!
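The center and semi-axis values listed above are the midpoints and half-widths of each generator's interval. A minimal sketch of that construction follows; the helper name `ellipsoid_from_intervals` is hypothetical. Note that the resulting ellipsoid sits inside the box of intervals, so it does not cover the box corners.

```python
def ellipsoid_from_intervals(lo, hi):
    """Center and diagonal of P from per-generator bounds lo[i] <= a_i <= hi[i]."""
    center = [(l + h) / 2 for l, h in zip(lo, hi)]
    p_diag = [(h - l) / 2 for l, h in zip(lo, hi)]
    return center, p_diag

# Bounds on a = -R for wind, solar, natural gas [MW]
lo = [-10.0, -20.0, -50.0]
hi = [0.0, 0.0, -50.0]
a_bar, p_diag = ellipsoid_from_intervals(lo, hi)
```

The natural gas plant has a zero semi-axis because its capacity is known exactly.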


Robust Dispatch w/ High Penetration of Renewables

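Robust LP requires us to satisfy demand in EVERY instance of $a \in \mathcal{E}$:

\begin{align}
\text{minimize} \quad & f^T x \tag{50} \\
\text{subject to} \quad & a^T x \le b, \quad \forall\, a \in \mathcal{E} \tag{51} \\
& 0 \le x \le 1 \tag{52}
\end{align}

Note: We have n optimization vars and infinitely many constraints... not cool.

Alternative idea: Solve under worst case scenario for $a \in \mathcal{E}$. Convert (51) to

$$\max\{ a^T x \mid a \in \mathcal{E} \} \le b \tag{53}$$

Re-write left hand side of (53) as

\begin{align}
\max\{ a^T x \mid a \in \mathcal{E} \} &= \bar{a}^T x + \max\{ u^T P^T x \mid \|u\|_2 \le 1 \} \tag{54} \\
&= \bar{a}^T x + \|P^T x\|_2 \tag{55}
\end{align}

then the robust linear constraint can be re-expressed as

$$\bar{a}^T x + \|P^T x\|_2 \le b \tag{56}$$

which is a second order cone constraint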
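The step from (54) to (55) rests on the Cauchy-Schwarz inequality. A short worked version, using only the symbols already defined:

```latex
\begin{align*}
u^T P^T x &\le \|u\|_2 \, \|P^T x\|_2 \le \|P^T x\|_2
  \quad \text{for all } \|u\|_2 \le 1, \\
u^\star = \frac{P^T x}{\|P^T x\|_2}
  &\;\Rightarrow\; (u^\star)^T P^T x
  = \frac{\|P^T x\|_2^2}{\|P^T x\|_2} = \|P^T x\|_2 .
\end{align*}
```

The upper bound is attained at $u^\star$, so the maximum equals $\|P^T x\|_2$. This assumes $P^T x \ne 0$; otherwise both sides are zero and any feasible $u$ is a maximizer.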


Robust Dispatch w/ High Penetration of Renewables

Robust LP → SOCP, a sub-class of convex optimization problems:

\begin{align}
\text{minimize} \quad & f^T x \tag{57} \\
\text{subject to} \quad & \bar{a}^T x + \|P^T x\|_2 \le b \tag{58} \\
& 0 \le x \le 1 \tag{59}
\end{align}

Note: additional norm term acts as regularization term. Namely, it prevents x from being large in directions with considerable uncertainty.
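To see the regularization effect, the left-hand side of (58) can be evaluated for candidate dispatches. This sketch assumes the diagonal $P$ and center $\bar{a}$ from the wind/solar/gas example; the demand of 40 MW and both candidate dispatch vectors are hypothetical.

```python
import math

def robust_lhs(a_bar, p_diag, x):
    """a_bar'x + ||P'x||_2 for a diagonal P given by its diagonal entries."""
    nominal = sum(ai * xi for ai, xi in zip(a_bar, x))
    margin = math.sqrt(sum((p * xi) ** 2 for p, xi in zip(p_diag, x)))
    return nominal + margin

a_bar = [-5.0, -10.0, -50.0]   # ellipsoid center [MW]
p_diag = [5.0, 10.0, 0.0]      # semi-axis lengths [MW]
b = -40.0                      # b = -D with D = 40 MW (hypothetical demand)

x_nominal = [1.0, 1.0, 0.5]    # meets demand at the center: 5 + 10 + 25 = 40 MW
x_robust = [1.0, 1.0, 0.75]    # leans harder on the firm gas plant

nominal_ok = robust_lhs(a_bar, p_diag, x_nominal) <= b
robust_ok = robust_lhs(a_bar, p_diag, x_robust) <= b
```

The first dispatch meets demand only at the ellipsoid center and fails (58); the second adds enough firm gas capacity to absorb the norm term of about 11.2 MW.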



A Stochastic Approach to Robust Programming

In the previous example, we optimized w.r.t. the worst case scenario. Some would argue this is too conservative. That is, we could allow constraint violations in very low probability situations. This motivates “chance constraints”.

Recall the LP

\begin{align}
\text{minimize} \quad & c^T x \tag{60} \\
\text{subject to} \quad & a^T x \le b \tag{61}
\end{align}

Assume $a \in \mathbb{R}^n$ is a Gaussian random vector, i.e. $a \sim \mathcal{N}(\bar{a}, \Sigma)$. Then $a^T x$ is a Gaussian random variable, with

mean $\bar{a}^T x$

variance $x^T \Sigma x$

Express probability that $a^T x \le b$ is satisfied as

$$\Pr(a^T x \le b) = \Phi\left( \frac{b - \bar{a}^T x}{\|\Sigma^{1/2} x\|_2} \right) \tag{62}$$

where $\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-y^2/2} \, dy$ is the CDF of $\mathcal{N}(0,1)$.
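Formula (62) is straightforward to evaluate. The sketch below assumes a diagonal covariance (independent components of a) so that $\|\Sigma^{1/2} x\|_2$ reduces to a weighted sum of squares, and it assumes $x \ne 0$ so the standard deviation is positive. The data and function name are hypothetical.

```python
import math
from statistics import NormalDist

def prob_constraint_satisfied(a_bar, sigma_diag, x, b):
    """Pr(a'x <= b) for a ~ N(a_bar, diag(sigma_diag)), following (62)."""
    mean = sum(ai * xi for ai, xi in zip(a_bar, x))
    std = math.sqrt(sum(s * xi * xi for s, xi in zip(sigma_diag, x)))  # ||Sigma^{1/2} x||_2
    return NormalDist().cdf((b - mean) / std)

# Hypothetical two-dimensional example
a_bar = [1.0, 2.0]
sigma_diag = [0.25, 1.0]       # variances of a_1 and a_2
x = [2.0, 1.0]                 # mean of a'x is 4, standard deviation is sqrt(2)

p_at_mean = prob_constraint_satisfied(a_bar, sigma_diag, x, b=4.0)
p_one_sigma = prob_constraint_satisfied(a_bar, sigma_diag, x, b=4.0 + math.sqrt(2.0))
```

Setting b equal to the mean gives a probability of one half, and each additional standard deviation of slack raises it along the normal CDF.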


Chance Constraints

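This enables us to relax (61) into

\begin{align}
\text{minimize} \quad & c^T x \tag{63} \\
\text{subject to} \quad & \Pr(a^T x \le b) \ge \eta \quad \text{[“chance constraint”]} \tag{64}
\end{align}

In words, we require $a^T x \le b$ is satisfied with a reliability of $\eta$, where $\eta$ is typically 0.9, 0.95, or 0.99

Interestingly, we can use (62) to convert this stochastic LP into an SOCP

\begin{align}
\Pr(a^T x \le b) = \Phi\left( \frac{b - \bar{a}^T x}{\|\Sigma^{1/2} x\|_2} \right) &\ge \eta \tag{65} \\
\frac{b - \bar{a}^T x}{\|\Sigma^{1/2} x\|_2} &\ge \Phi^{-1}(\eta) \tag{66} \\
\Phi^{-1}(\eta) \cdot \|\Sigma^{1/2} x\|_2 &\le b - \bar{a}^T x \tag{67}
\end{align}

where we recognize the final inequality as a second order cone constraint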
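The multiplier $\Phi^{-1}(\eta)$ in (67) can be computed with the standard library. A small sketch follows; the name `safety_factor` is an illustrative label, not terminology from the lecture.

```python
from statistics import NormalDist

def safety_factor(eta):
    """Phi^{-1}(eta): multiplier on ||Sigma^{1/2} x||_2 in the cone constraint (67)."""
    return NormalDist().inv_cdf(eta)

factors = {eta: safety_factor(eta) for eta in (0.5, 0.9, 0.95, 0.99)}
```

For the typical reliabilities 0.9, 0.95, and 0.99 the multipliers are roughly 1.28, 1.64, and 2.33 standard deviations, and the multiplier is zero at a reliability of 0.5.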


Chance Constrained LP→ SOCP

Thus, we converted the chance constrained LP into the SOCP

\begin{align}
\text{minimize} \quad & c^T x \tag{68} \\
\text{subject to} \quad & \bar{a}^T x + \Phi^{-1}(\eta) \|\Sigma^{1/2} x\|_2 \le b \tag{69}
\end{align}

where $\Phi^{-1}(\cdot)$ is the inverse CDF of $\mathcal{N}(0,1)$

Note: (69) is a valid second order cone constraint only if $\Phi^{-1}(\eta) \ge 0$, which holds if $\eta \ge 0.5$. In practice, we almost always want reliability ≥ 0.5.

Comments:

approach extends to chance constrained convex QPs

approach does NOT directly extend to non-Gaussian distributions

much more elegant & efficient than two-stage stochastic programming
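Constraint (69) can be checked directly for a candidate x. The sketch below again assumes a diagonal covariance, and the data and function name are hypothetical; it shows how the same x can pass at one reliability level and fail at a stricter one.

```python
import math
from statistics import NormalDist

def satisfies_chance_constraint(a_bar, sigma_diag, x, b, eta):
    """Check (69): a_bar'x + Phi^{-1}(eta) * ||Sigma^{1/2} x||_2 <= b, diagonal Sigma."""
    mean = sum(ai * xi for ai, xi in zip(a_bar, x))
    std = math.sqrt(sum(s * xi * xi for s, xi in zip(sigma_diag, x)))
    return mean + NormalDist().inv_cdf(eta) * std <= b

# Hypothetical data: mean of a'x is 4, standard deviation is sqrt(2)
a_bar, sigma_diag, x = [1.0, 2.0], [0.25, 1.0], [2.0, 1.0]

ok_90 = satisfies_chance_constraint(a_bar, sigma_diag, x, b=6.0, eta=0.90)
ok_99 = satisfies_chance_constraint(a_bar, sigma_diag, x, b=6.0, eta=0.99)
```

Here the slack of 2 covers about 1.41 standard deviations, which is enough for 90% reliability but not for 99%.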

Want to learn more about robust programming? Read A. Ben-Tal, L. El Ghaoui, A. Nemirovski. Robust Optimization, Princeton University Press, 2009.


Ex: Portfolio Optimization

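Recall the Portfolio optimization problem:

$x \in \mathbb{R}^n$ indicates portfolio allocation; $x_i$ is fraction invested in asset i

x must satisfy $\mathbf{1}^T x = 1$, $x \ge 0$

return (in percentage) is given by $p^T x$, where $p \in \mathbb{R}^n$ and $p \sim \mathcal{N}(\bar{p}, \Sigma)$

Objective: Maximize expected return, subject to limit on probability of loss

\begin{align}
\text{minimize} \quad & -\mathbb{E}\left[ p^T x \right] \tag{70} \\
\text{subject to} \quad & \Pr\left[ p^T x \ge 1 \right] \ge \eta \tag{71} \\
& \mathbf{1}^T x = 1, \quad x \succeq 0 \tag{72}
\end{align}

can be recast into the SOCP

\begin{align}
\text{minimize} \quad & -\bar{p}^T x \tag{73} \\
\text{subject to} \quad & \Phi^{-1}(\eta) \cdot \|\Sigma^{1/2} x\|_2 \le \bar{p}^T x - 1 \tag{74} \\
& \mathbf{1}^T x = 1, \quad x \succeq 0 \tag{75}
\end{align}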
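A candidate allocation can be screened against (74) before handing the problem to a solver. This is a sketch under the assumption of uncorrelated assets (diagonal covariance); the two-asset data and the function name are hypothetical.

```python
import math
from statistics import NormalDist

def portfolio_check(p_bar, sigma_diag, x, eta):
    """Return (expected return, whether (74) holds) for uncorrelated assets."""
    expected = sum(pi * xi for pi, xi in zip(p_bar, x))
    risk = math.sqrt(sum(s * xi * xi for s, xi in zip(sigma_diag, x)))
    feasible = NormalDist().inv_cdf(eta) * risk <= expected - 1.0
    return expected, feasible

p_bar = [1.02, 1.10]           # mean gross returns: safe asset, risky asset
sigma_diag = [0.0001, 0.01]    # return variances (standard deviations 0.01 and 0.10)
eta = 0.95

all_risky = portfolio_check(p_bar, sigma_diag, [0.0, 1.0], eta)
mostly_safe = portfolio_check(p_bar, sigma_diag, [0.9, 0.1], eta)
```

Putting everything in the risky asset gives the higher expected return but violates the loss-probability limit, while the 90/10 split satisfies it.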



Problem Setting

Problem Statement: You are provided m data points for random variable y. Goal: fit a probability distribution to this data.

Notation:

$p(y; \theta)$ is probability density function for y

free parameters $\theta \in \mathbb{R}^n$.

“likelihood function” is when we consider $p(y; \theta)$ as function of $\theta$, for fixed value of y

“log-likelihood function” $l(\theta) = \log p(y; \theta)$

Maximum Likelihood Estimation:

$$\theta^\star = \arg\max_{\theta} \; l(\theta) \tag{76}$$

Remark: Interestingly, (76) is a convex optimization problem for many common scenarios.


MLE for Linear Models

Consider a linear measurement model:

$$y_i = \theta^T \phi_i + v_i, \quad i = 1, \cdots, m \tag{77}$$

$\theta \in \mathbb{R}^n$ are parameters to be estimated

$y_i \in \mathbb{R}$ are the measured data points

$\phi_i \in \mathbb{R}^n$ are the regressors

$v_i \in \mathbb{R}$ are the measurement errors or noise. Assume $v_i$’s are IID with probability density $p(\cdot)$

The likelihood function, given all the measured points $y_i$ and regressors $\phi_i$, is given by products of likelihood for each data point $y_i, \phi_i$:

$$p(v; \theta) = \prod_{i=1}^{m} p(y_i - \theta^T \phi_i) \tag{78}$$

The log-likelihood function is then

$$l(\theta) = \log p(v; \theta) = \sum_{i=1}^{m} \log p(y_i - \theta^T \phi_i) \tag{79}$$

recall $\log(a \cdot b) = \log(a) + \log(b)$
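Expression (79) translates directly into a loop over data points. In the sketch below the noise log-density is passed in as a function so that any IID noise model can be plugged in; the standard normal density, the data, and the function names are assumptions made for illustration.

```python
import math

def log_likelihood(theta, phis, ys, log_density):
    """l(theta) = sum_i log p(y_i - theta' phi_i), as in (79)."""
    total = 0.0
    for phi, y in zip(phis, ys):
        residual = y - sum(t * p for t, p in zip(theta, phi))
        total += log_density(residual)
    return total

def std_normal_logpdf(v):
    """Log density of N(0, 1), used here only as an example noise model."""
    return -0.5 * math.log(2 * math.pi) - 0.5 * v * v

phis = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]   # hypothetical regressors
ys = [0.1, 0.9, 2.1]                           # hypothetical measurements

ll_good = log_likelihood([0.0, 1.0], phis, ys, std_normal_logpdf)   # residuals near 0.1
ll_bad = log_likelihood([1.0, -1.0], phis, ys, std_normal_logpdf)   # large residuals
```

Parameters that leave small residuals score a higher log-likelihood, which is what the maximization in (76) exploits.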


MLE for Linear Models & Gaussian Noise

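The MLE problem is

$$\underset{\theta}{\text{maximize}} \quad \sum_{i=1}^{m} \log p(y_i - \theta^T \phi_i) \tag{80}$$

the likelihood function is log-concave for several common distributions

Suppose that $v_i \sim \mathcal{N}(0, \sigma^2)$. Thus $p(v) = (2\pi\sigma^2)^{-1/2} \cdot e^{-v^2/(2\sigma^2)}$. Then...

\begin{align}
\sum_{i=1}^{m} \log p(v_i) &= \sum_{i=1}^{m} \left[ -\frac{1}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \cdot v_i^2 \right] \tag{81} \\
&= -\frac{m}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{m} v_i^2 \tag{82} \\
\sum_{i=1}^{m} \log p(y_i - \theta^T \phi_i) &= -\frac{m}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{m} (y_i - \theta^T \phi_i)^2 \tag{83} \\
&= -\frac{m}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \|\Phi^T \theta - Y\|_2^2 \tag{84}
\end{align}

where $\Phi = [\phi_1, \cdots, \phi_m] \in \mathbb{R}^{n \times m}$, $Y = [y_1; \cdots; y_m] \in \mathbb{R}^m$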


MLE for Linear Models & Gaussian Noise

Only the last term of (84) depends on $\theta$, so the MLE problem with Gaussian noise is a least squares problem!

$$\theta^\star = \arg\min_{\theta} \|\Phi^T \theta - Y\|_2^2 \tag{85}$$
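For a single scalar parameter ($n = 1$), problem (85) has a closed-form minimizer obtained by setting the derivative of the sum of squares to zero. The sketch below covers only that scalar case; the data and function names are hypothetical.

```python
def least_squares_scalar(phis, ys):
    """Minimizer of sum_i (y_i - theta * phi_i)^2 for a scalar parameter theta."""
    return sum(p * y for p, y in zip(phis, ys)) / sum(p * p for p in phis)

def sum_sq_residuals(theta, phis, ys):
    """Objective of (85) for scalar theta."""
    return sum((y - theta * p) ** 2 for p, y in zip(phis, ys))

phis = [1.0, 2.0, 3.0, 4.0]     # hypothetical regressors
ys = [2.1, 3.9, 6.2, 7.8]       # hypothetical noisy measurements of y = 2 * phi

theta_star = least_squares_scalar(phis, ys)
```

The estimate lands close to the slope of 2 used to generate the data, and perturbing it in either direction increases the sum of squared residuals.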

Exercise: Derive the MLE optimization formulation for (77) for the following distributions for $v_i$:

1. Laplacian noise distribution: $p(v) = \frac{1}{2a} \cdot e^{-|v|/a}$

2. Uniform noise distribution: $p(v) = \frac{1}{2a}$ on $[-a, +a]$ and zero elsewhere


MLE for Linear Models & Laplacian Noise

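Laplace distribution

$$p(v_i) = \frac{1}{2a} \cdot e^{-|v_i|/a}$$

Figure: Laplace distributions. Image Source: wikipedia.org

The MLE problem with Laplacian noise is an $\ell_1$-norm minimization problem:

$$\theta^\star = \arg\min_{\theta} \|\Phi^T \theta - Y\|_1 \tag{86}$$

This follows from the log-likelihood:

\begin{align}
\sum_{i=1}^{m} \log p(v_i) &= \sum_{i=1}^{m} \left[ \log\left(\frac{1}{2a}\right) - \frac{1}{a} \cdot |v_i| \right] = m \log\left(\frac{1}{2a}\right) - \frac{1}{a} \sum_{i=1}^{m} |v_i| \tag{87} \\
\sum_{i=1}^{m} \log p(y_i - \theta^T \phi_i) &= m \log\left(\frac{1}{2a}\right) - \frac{1}{a} \sum_{i=1}^{m} |y_i - \theta^T \phi_i| \tag{88} \\
&= m \log\left(\frac{1}{2a}\right) - \frac{1}{a} \|\Phi^T \theta - Y\|_1 \tag{89}
\end{align}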
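A simple special case shows how the two noise models differ. For the location model $y_i = \theta + v_i$ (that is, $\phi_i = 1$, an assumption made here for illustration), the $\ell_1$ problem (86) is solved by the sample median, while least squares gives the sample mean. The data below are hypothetical and include one outlier.

```python
from statistics import mean, median

def l1_cost(theta, ys):
    """Sum of absolute residuals for the location model y_i = theta + v_i."""
    return sum(abs(y - theta) for y in ys)

ys = [1.0, 1.2, 0.9, 1.1, 9.0]     # hypothetical data with one outlier
theta_l1 = median(ys)               # Laplacian-noise MLE
theta_l2 = mean(ys)                 # Gaussian-noise MLE (least squares)
```

The median stays at 1.1 while the mean is pulled to 2.64 by the outlier, which is why the Laplacian model is often preferred when the data contain occasional large errors.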


Logistic Regression with application to discrete choice models

Consider binary random variable $y \in \{0, 1\}$, with

$$\Pr[y = 1] = p, \quad \Pr[y = 0] = 1 - p \tag{90}$$

for example

y = 1 corresponds to “charge EV at Nanshan iPark chg station”

y = 0 corresponds to “DO NOT charge EV at Nanshan iPark chg station”

Hypothesize probability p is a function of the EV driver’s utility function, i.e. p(U). The utility function is:

$$U = a^T \phi + b \tag{91}$$

where $a \in \mathbb{R}^n$, $b \in \mathbb{R}$ are free parameters; $\phi \in \mathbb{R}^n$ are “explanatory” vars. Example explanatory vars:

day of week

time of day

charging price

is parking space open

We relate probability p to utility U using the logistic model:

$$p(U) = \frac{e^{U}}{1 + e^{U}} = \frac{e^{a^T \phi + b}}{1 + e^{a^T \phi + b}} \tag{92}$$
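The logistic model (92) can be evaluated as follows. This is a minimal sketch: the branch on the sign of U is a common trick to avoid overflow in the exponential, and the parameter values and the choice of explanatory variables are hypothetical.

```python
import math

def logistic(u):
    """p(U) = e^U / (1 + e^U), evaluated without overflow for large |U|."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)

def choice_probability(a, b, phi):
    """Probability of y = 1 given explanatory variables phi, per (91)-(92)."""
    utility = sum(ai * pi for ai, pi in zip(a, phi)) + b
    return logistic(utility)

# Hypothetical parameters: phi = [charging price, 1 if a parking space is open]
a, b = [-2.0, 1.5], 0.5
p_cheap_open = choice_probability(a, b, [0.5, 1.0])    # U = -1.0 + 1.5 + 0.5 = 1.0
p_pricey_full = choice_probability(a, b, [1.5, 0.0])   # U = -3.0 + 0.0 + 0.5 = -2.5
```

With these made-up parameters a low price and an open space give a probability of about 0.73 of charging, while a high price and no open space give about 0.08.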


Logistic Regression with application to discrete choice models

Logistic model:

$$p(U) = \frac{e^{U}}{1 + e^{U}}$$

Objective: Given explanatory variables $\phi_1, \cdots, \phi_m \in \mathbb{R}^n$ with corresponding outcomes $y_1, \cdots, y_m \in \{0, 1\}$, find free parameters a, b via MLE

Re-order data

$\phi_1, \cdots, \phi_q$ is for outcome y = 1

$\phi_{q+1}, \cdots, \phi_m$ is for outcome y = 0


Likelihood function

$$p(y; a, b) = \prod_{i=1}^{q} p_i \cdot \prod_{i=q+1}^{m} (1 - p_i) \tag{93}$$

Log-likelihood function

\begin{align}
l(a, b) &= \sum_{i=1}^{q} \log p_i + \sum_{i=q+1}^{m} \log(1 - p_i) \tag{94} \\
&= \sum_{i=1}^{q} \log \frac{e^{U_i}}{1 + e^{U_i}} + \sum_{i=q+1}^{m} \log \frac{1}{1 + e^{U_i}} \tag{95} \\
&= \sum_{i=1}^{q} \left[ U_i - \log(1 + e^{U_i}) \right] - \sum_{i=q+1}^{m} \log(1 + e^{U_i}) \tag{96} \\
&= \sum_{i=1}^{q} U_i - \sum_{i=1}^{m} \log(1 + e^{U_i}) \tag{97} \\
&= \sum_{i=1}^{q} \left[ a^T \phi_i + b \right] - \sum_{i=1}^{m} \log\left( 1 + e^{a^T \phi_i + b} \right) \tag{98}
\end{align}

where this last expression is a concave function w.r.t. (a, b). Solve via CP.
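Expression (98) can be evaluated without re-ordering the data by weighting each utility with its outcome $y_i$, which gives the same sum. The sketch below does that and spot-checks concavity at the midpoint of two parameter sets; the data, parameter values, and function names are hypothetical, and a single midpoint check illustrates concavity rather than proving it.

```python
import math

def log1p_exp(u):
    """log(1 + e^u), computed stably for large |u|."""
    return u + math.log1p(math.exp(-u)) if u > 0 else math.log1p(math.exp(u))

def logistic_log_likelihood(a, b, phis, ys):
    """Expression (98): sum of U_i over y_i = 1 minus sum of log(1 + e^{U_i}) over all i."""
    total = 0.0
    for phi, y in zip(phis, ys):
        u = sum(ai * pi for ai, pi in zip(a, phi)) + b
        total += y * u - log1p_exp(u)
    return total

phis = [[0.5], [1.0], [2.0], [3.0]]   # hypothetical one-dimensional explanatory variable
ys = [1, 1, 0, 0]                      # outcomes

l1 = logistic_log_likelihood([-1.0], 1.5, phis, ys)
l2 = logistic_log_likelihood([1.0], -1.5, phis, ys)
lmid = logistic_log_likelihood([0.0], 0.0, phis, ys)   # midpoint of the two parameter sets
```

The value at the midpoint is at least the average of the two endpoint values, as concavity requires, and the first parameter set (negative slope) fits this data better than the second.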