COMP4121 Lecture Notes

Linear Programming

LiC: Aleks Ignjatovic

[email protected]

THE UNIVERSITY OF NEW SOUTH WALES

School of Computer Science and Engineering, The University of New South Wales

Sydney 2052, Australia

We now move to one of the most important cases of convex programming, called Linear Programming (LP), in which the objective is a linear function and the convex domain over which the extremum is sought is defined by constraints which are also linear. We follow closely our old COMP3121 textbook by Cormen, Leiserson, Rivest and Stein, Introduction to Algorithms. We start by defining a common representation of Linear Programming optimization problems.

In the standard form the objective to be maximized is given by

$$f(x) = \sum_{j=1}^{n} c_j x_j,$$

and the constraints are of the form

$$\sum_{j=1}^{n} a_{ij} x_j \le b_i, \quad 1 \le i \le m; \tag{1}$$

$$x_j \ge 0, \quad 1 \le j \le n. \tag{2}$$

Note that $x$ represents a vector, $x^\top = \langle x_1, \dots, x_n \rangle$; we assume that vectors are columns; thus, to write them as rows we have to transpose them. The numbers $a_{ij}$ are assumed to be real.

Clearly, a minimization problem of the form:

$$\text{minimize } f(x) = \sum_{j=1}^{n} c_j x_j,$$

subject to the constraints of the form

$$\sum_{j=1}^{n} a_{ij} x_j \ge b_i, \quad 1 \le i \le m; \tag{3}$$

$$x_j \ge 0, \quad 1 \le j \le n, \tag{4}$$

can easily be reduced to a corresponding maximization problem in the standard form

$$\text{maximize } f^*(x) = \sum_{j=1}^{n} c^*_j x_j,$$

with constraints of the form

$$\sum_{j=1}^{n} a^*_{ij} x_j \le b^*_i, \quad 1 \le i \le m;$$

$$x_j \ge 0, \quad 1 \le j \le n,$$

by taking $c^* = -c$, $a^*_{ij} = -a_{ij}$ and $b^* = -b$.
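The reduction above amounts to negating all of the data. The following is a minimal sketch of it; the function names and the small 2×2 instance are illustrative assumptions and do not come from the notes.

```python
def min_to_standard_form(A, b, c):
    """Rewrite  min c.x  s.t.  A x >= b, x >= 0
    as          max c*.x s.t.  A* x <= b*, x >= 0,
    by taking c* = -c, A* = -A and b* = -b."""
    A_star = [[-a for a in row] for row in A]
    b_star = [-bi for bi in b]
    c_star = [-cj for cj in c]
    return A_star, b_star, c_star


def satisfies_ge(A, b, x):
    """True if A x >= b holds coordinate-wise."""
    return all(sum(a * xj for a, xj in zip(row, x)) >= bi for row, bi in zip(A, b))


def satisfies_le(A, b, x):
    """True if A x <= b holds coordinate-wise."""
    return all(sum(a * xj for a, xj in zip(row, x)) <= bi for row, bi in zip(A, b))


# Hypothetical instance, used only to illustrate the transformation.
A = [[1, 2], [3, 1]]
b = [4, 6]
c = [5, 7]
A_star, b_star, c_star = min_to_standard_form(A, b, c)
```

A point satisfies the original constraints exactly when it satisfies the transformed ones, and minimizing $c^\top x$ is the same as maximizing $(c^*)^\top x$; the optimal values differ only in sign.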


Example: Humans require 13 vitamins $V_1, \dots, V_{13}$ in total; the daily required amounts $d_1, \dots, d_{13}$ for each of them are known. Your local grocery store carries 130 types of food staples, and for each food staple $S_k$ and each of the 13 vitamins $V_i$ the content of $V_i$ per gram of $S_k$ is also a known value, denoted by $C(k, i)$. For each food staple $S_k$ you are also given its price per gram, denoted by $P_k$. Your goal is to determine the quantities $x_k$ of each food staple $S_k$ which you should buy, such that your daily requirements of all vitamins are met, but the total price you pay for your daily food provision is as small as possible.

Solution: You have to:

minimize the objective

$$\sum_{k=1}^{130} P_k x_k$$

subject to 13 constraints

$$\sum_{k=1}^{130} C(k, i)\, x_k \ge d_i, \quad 1 \le i \le 13,$$

and the non-negativity constraints

$$x_1 \ge 0, \dots, x_{130} \ge 0.$$

The somewhat messy formulation of a linear program which we have given can be rewritten in a much more compact matrix form. To this end, let us introduce a partial ordering on vectors in $\mathbb{R}^n$ by $x \le y$ if and only if the inequality holds coordinate-wise, i.e., if and only if $x_j \le y_j$ for all $1 \le j \le n$. Since vectors are written as single-column matrices, letting $c^\top = \langle c_1, \dots, c_n \rangle \in \mathbb{R}^n$ and $b^\top = \langle b_1, \dots, b_m \rangle \in \mathbb{R}^m$, and letting $A = (a_{ij})$ be the matrix of size $m \times n$, we get that the standard form problem can be formulated simply as:

maximize $c^\top x$

subject to the following two (matrix-vector) constraints [1]: $Ax \le b$ and $x \ge 0$.

Thus, to specify a Linear Programming optimisation problem we just have to provide a triplet $(A, b, c)$, and the information that the triplet represents an LP problem in the standard form.

While in general the “natural formulation” of a problem as a Linear Program does not necessarily produce the non-negativity constraints (2) for all of the variables, in the standard form such constraints are indeed required for all of the variables. However, this poses no problem, because each occurrence of an unconstrained variable $x_j$ can be replaced by the expression $x'_j - x^*_j$, where $x'_j, x^*_j$ are new variables satisfying the constraints $x'_j \ge 0$, $x^*_j \ge 0$.

Any vector $x$ which satisfies the two constraints is called a feasible solution, regardless of what the corresponding objective value $c^\top x$ might be. Note that the set of feasible solutions, i.e., the domain over which we seek to maximize the objective, is a convex set because it is an intersection of the half-spaces defined by each of the constraints.

[1] Note that the inequality involved in the constraints is the partial ordering on vectors which we have just introduced above. Also, in the non-negativity constraint, $0$ represents the vector $0 \in \mathbb{R}^n$ with all coordinates equal to 0.

As an example, let us consider the following optimization problem:

maximize 3x1 + x2 + 2x3 (5)

subject to the constraints

x1 + x2 + 3x3 ≤ 30 (6)

2x1 + 2x2 + 5x3 ≤ 24 (7)

4x1 + x2 + 2x3 ≤ 36 (8)

x1, x2, x3 ≥ 0 (9)

How large can the value of the objective $z(x_1, x_2, x_3) = 3x_1 + x_2 + 2x_3$ be, without violating the constraints? If we add inequalities (6) and (7), we get

$$3x_1 + 3x_2 + 8x_3 \le 54$$

Since all variables are constrained to be non-negative, we are assured that

$$3x_1 + x_2 + 2x_3 \le 3x_1 + 3x_2 + 8x_3 \le 54$$

The far left hand side of this chain of inequalities is just the objective (5); thus $z(x_1, x_2, x_3)$ is bounded above by 54, i.e., $z(x_1, x_2, x_3) \le 54$.

Can we obtain a tighter bound? We could try to look for coefficients $y_1, y_2, y_3 \ge 0$ to be used to form a linear combination of the constraints:

y1(x1 + x2 + 3x3) ≤ 30y1

y2(2x1 + 2x2 + 5x3) ≤ 24y2

y3(4x1 + x2 + 2x3) ≤ 36y3

Then, summing up all these inequalities and factoring, we get

x1(y1 + 2y2 + 4y3) + x2(y1 + 2y2 + y3) + x3(3y1 + 5y2 + 2y3)

≤ 30y1 + 24y2 + 36y3 (10)

If we compare this with our objective (5) we see that if we choose $y_1$, $y_2$ and $y_3$ so that:

$$y_1 + 2y_2 + 4y_3 \ge 3$$

$$y_1 + 2y_2 + y_3 \ge 1$$

$$3y_1 + 5y_2 + 2y_3 \ge 2$$

then, since $x_1, x_2, x_3 \ge 0$,

$$x_1(y_1 + 2y_2 + 4y_3) + x_2(y_1 + 2y_2 + y_3) + x_3(3y_1 + 5y_2 + 2y_3) \ge 3x_1 + x_2 + 2x_3$$

Combining this with (10) we get:

$$30y_1 + 24y_2 + 36y_3 \ge 3x_1 + x_2 + 2x_3 = z(x_1, x_2, x_3)$$

Consequently, in order to find as tight an upper bound as possible for our objective $z(x_1, x_2, x_3)$, we have to look for $y_1, y_2, y_3$ which produce the smallest possible value of $z^*(y_1, y_2, y_3) = 30y_1 + 24y_2 + 36y_3$, but which do not violate the constraints

y1 + 2y2 + 4y3 ≥ 3 (11)

y1 + 2y2 + y3 ≥ 1 (12)

3y1 + 5y2 + 2y3 ≥ 2 (13)

y1, y2, y3 ≥ 0 (14)

Thus, trying to find the best upper bound for our objective $z(x_1, x_2, x_3)$ obtained by forming a linear combination of the constraints only reduces the original maximization problem to a minimization problem:

minimize $30y_1 + 24y_2 + 36y_3$

subject to the constraints (11-14)

Such a minimization problem $P^*$ is called the dual problem of the initial problem $P$.
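The construction of the dual is mechanical: transpose the constraint matrix and swap the roles of $b$ and $c$. The sketch below (the helper names `dual_of` and `upper_bound` are assumptions made for illustration) builds the dual of the example and evaluates the bound $30y_1 + 24y_2 + 36y_3$ for a few choices of multipliers, returning `None` when the multipliers violate (11-14).

```python
from fractions import Fraction as F

# Data of the example program P: maximize c.x subject to A x <= b, x >= 0.
A = [[1, 1, 3], [2, 2, 5], [4, 1, 2]]
b = [30, 24, 36]
c = [3, 1, 2]


def dual_of(A, b, c):
    """Dual of (max c.x, A x <= b, x >= 0) is (min b.y, A^T y >= c, y >= 0).
    Returns (constraint matrix, right hand side, objective) of the dual."""
    A_T = [list(col) for col in zip(*A)]
    return A_T, c, b


def upper_bound(A, b, c, y):
    """Return b.y if y satisfies the dual constraints, otherwise None."""
    A_T, rhs, cost = dual_of(A, b, c)
    if any(yi < 0 for yi in y):
        return None
    for row, cj in zip(A_T, rhs):
        if sum(a * yi for a, yi in zip(row, y)) < cj:
            return None
    return sum(bi * yi for bi, yi in zip(cost, y))


print(upper_bound(A, b, c, [1, 1, 0]))              # 54, adding (6) and (7)
print(upper_bound(A, b, c, [0, 0, 1]))              # 36, constraint (8) alone
print(upper_bound(A, b, c, [0, F(1, 6), F(2, 3)]))  # 28
```

The multipliers $y = (1, 1, 0)$ reproduce the bound 54 obtained above by adding (6) and (7); the other two choices give the tighter bounds 36 and 28.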

Let us now repeat the whole procedure with $P^*$ in place of $P$, i.e., let us find the dual program $(P^*)^*$ of $P^*$. We are now looking for non-negative multipliers $z_1, z_2, z_3 \ge 0$ to multiply inequalities (11-13) and obtain

$$z_1(y_1 + 2y_2 + 4y_3) \ge 3z_1$$

$$z_2(y_1 + 2y_2 + y_3) \ge z_2$$

$$z_3(3y_1 + 5y_2 + 2y_3) \ge 2z_3$$

Summing these up and factoring produces

$$y_1(z_1 + z_2 + 3z_3) + y_2(2z_1 + 2z_2 + 5z_3) + y_3(4z_1 + z_2 + 2z_3) \ge 3z_1 + z_2 + 2z_3 \tag{15}$$

If we choose these multipliers so that

$$z_1 + z_2 + 3z_3 \le 30 \tag{16}$$

$$2z_1 + 2z_2 + 5z_3 \le 24 \tag{17}$$

$$4z_1 + z_2 + 2z_3 \le 36 \tag{18}$$

we will have:

$$y_1(z_1 + z_2 + 3z_3) + y_2(2z_1 + 2z_2 + 5z_3) + y_3(4z_1 + z_2 + 2z_3) \le 30y_1 + 24y_2 + 36y_3$$

Combining this with (15) we get

$$3z_1 + z_2 + 2z_3 \le 30y_1 + 24y_2 + 36y_3$$

Consequently, finding the dual program $(P^*)^*$ of $P^*$ amounts to maximizing the objective $3z_1 + z_2 + 2z_3$ subject to the constraints (16-18) and $z_1, z_2, z_3 \ge 0$. But notice that, except for having different variables, $(P^*)^*$ is exactly our starting program $P$. Thus, the dual program $(P^*)^*$ of the program $P^*$ is just $P$ itself, i.e., $(P^*)^* = P$.

So, at first sight, looking for the multipliers $y_1, y_2, y_3$ did not help much, because it only reduced a maximization problem to an equally hard minimization problem. Well, perhaps it still may be easier to somehow solve both problems at the same time, and this is precisely what the SIMPLEX algorithm does. [2]

To find the maximal value of 3x1 + x2 + 2x3 subject to the constraints

x1 + x2 + 3x3 ≤ 30

2x1 + 2x2 + 5x3 ≤ 24

4x1 + x2 + 2x3 ≤ 36

let us start with $x_1 = x_2 = x_3 = 0$ and ask ourselves: how much can we increase $x_1$ without violating the constraints? Since $x_1 + x_2 + 3x_3 \le 30$ we can introduce a new variable $x_4$ such that $x_4 \ge 0$ and

$$x_4 = 30 - (x_1 + x_2 + 3x_3) \tag{19}$$

Since such a variable measures how much “slack” we have between the actual value of $x_1 + x_2 + 3x_3$ and its upper limit 30, such $x_4$ is called a slack variable. Similarly we introduce new variables $x_5, x_6$, requiring them to satisfy $x_5, x_6 \ge 0$, and let

x5 = 24− 2x1 − 2x2 − 5x3 (20)

x6 = 36− 4x1 − x2 − 2x3 (21)

Since we started with the values $x_1 = x_2 = x_3 = 0$, this implies that these new slack variables must have values $x_4 = 30$, $x_5 = 24$, $x_6 = 36$, or, as a single vector, the initial basic feasible solution is $(x_1, x_2, x_3, x_4, x_5, x_6) = (0, 0, 0, 30, 24, 36)$. Note that “feasible” refers merely to the fact that all of the constraints are satisfied. [3]

Now we see that (19) implies that $x_1$ cannot exceed 30, and (20) implies that $2x_1 \le 24$, i.e., $x_1 \le 12$, while (21) implies that $4x_1 \le 36$, i.e., $x_1 \le 9$. Since all of these conditions must be satisfied we conclude that $x_1$ cannot exceed 9, which is the upper limit coming from the constraint (21).

If we set $x_1 = 9$, this forces $x_6 = 0$. We now swap the roles of $x_1$ and $x_6$: since we cannot increase $x_1$ any more, we eliminate $x_1$ from the right hand sides of the equations (19-21) and from the objective, introducing instead the variable $x_6$ to the right hand side of the constraints and into the objective. To do that, we solve equation (21) for $x_1$:

$$x_1 = 9 - \frac{x_2}{4} - \frac{x_3}{2} - \frac{x_6}{4}$$

and eliminate $x_1$ from the right hand side of the remaining constraints and the objective to get:

$$\begin{aligned}
z &= 3\left(9 - \frac{x_2}{4} - \frac{x_3}{2} - \frac{x_6}{4}\right) + x_2 + 2x_3 \\
  &= 27 - \frac{3}{4}x_2 - \frac{3}{2}x_3 - \frac{3}{4}x_6 + x_2 + 2x_3 \\
  &= 27 + \frac{1}{4}x_2 + \frac{1}{2}x_3 - \frac{3}{4}x_6
\end{aligned}$$

$$\begin{aligned}
x_5 &= 24 - 2\left(9 - \frac{x_2}{4} - \frac{x_3}{2} - \frac{x_6}{4}\right) - 2x_2 - 5x_3 \\
    &= 6 + \frac{x_2}{2} + x_3 + \frac{x_6}{2} - 2x_2 - 5x_3 \\
    &= 6 - \frac{3}{2}x_2 - 4x_3 + \frac{x_6}{2}
\end{aligned}$$

$$\begin{aligned}
x_4 &= 30 - \left(9 - \frac{x_2}{4} - \frac{x_3}{2} - \frac{x_6}{4}\right) - x_2 - 3x_3 \\
    &= 21 + \frac{x_2}{4} + \frac{x_3}{2} + \frac{x_6}{4} - x_2 - 3x_3 \\
    &= 21 - \frac{3}{4}x_2 - \frac{5}{2}x_3 + \frac{x_6}{4}
\end{aligned}$$

[2] It is now useful to remember how we proved that the Ford-Fulkerson Max Flow algorithm in fact produces a maximal flow, by showing that it terminates only when the flow reaches the capacity of a (minimal) cut!

[3] Clearly, setting all variables to 0 does not always produce a basic feasible solution because this might violate some of the constraints; this would happen, for example, if we had a constraint of the form $-x_1 + x_2 + x_3 \le -3$; choosing an initial basic feasible solution requires a separate algorithm to “bootstrap” the SIMPLEX algorithm; for the details see our (C-L-R-S) textbook.

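To summarise: the “new” objective is

$$z = 27 + \frac{1}{4}x_2 + \frac{1}{2}x_3 - \frac{3}{4}x_6$$

and the “new constraints” are

$$x_1 = 9 - \frac{x_2}{4} - \frac{x_3}{2} - \frac{x_6}{4} \tag{22}$$

$$x_4 = 21 - \frac{3}{4}x_2 - \frac{5}{2}x_3 + \frac{x_6}{4} \tag{23}$$

$$x_5 = 6 - \frac{3}{2}x_2 - 4x_3 + \frac{x_6}{2} \tag{24}$$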

Our new basic feasible solution, replacing $(x_1, x_2, x_3, x_4, x_5, x_6) = (0, 0, 0, 30, 24, 36)$, is obtained by setting all the variables on the right to zero, thus obtaining $(x_1, x_2, x_3, x_4, x_5, x_6) = (9, 0, 0, 21, 6, 0)$.

NOTE: These are EQUIVALENT constraints and objectives; the old ones were only transformed to an equivalent form. Any values of the variables will produce exactly the same value in both forms of the objective and they will satisfy the first set of constraints if and only if they satisfy the second set.

So $x_1$ and $x_6$ have switched their roles; $x_1$ acts as a new basic variable, and the new basic feasible solution is $(x_1, x_2, x_3, x_4, x_5, x_6) = (9, 0, 0, 21, 6, 0)$; the new value of the objective is $z = 27 + \frac{1}{4} \cdot 0 + \frac{1}{2} \cdot 0 - \frac{3}{4} \cdot 0 = 27$. We will continue this process of finding basic feasible solutions which increase the value of the objective, switching which variables are used to measure the slack and which are on the right hand side of the constraint equations and in the objective. The variables on the left are called the basic variables and the variables on the right are the non-basic variables.

We now choose another variable with a positive coefficient in the objective, say $x_3$ (we could also have chosen $x_2$). How much can we increase it?

From (22) we see that $\frac{x_3}{2}$ must not exceed 9, otherwise $x_1$ will become negative. Thus $x_3$ cannot be larger than 18. Similarly, $\frac{5}{2}x_3$ cannot exceed 21, otherwise $x_4$ will become negative, and so $x_3 \le \frac{42}{5}$; similarly, $4x_3$ cannot exceed 6, i.e., $x_3 \le \frac{3}{2}$. Thus, in order for all constraints to remain valid, $x_3$ cannot exceed $\frac{3}{2}$. Thus, we increase $x_3$ to $\frac{3}{2}$; equation (24) now forces $x_5$ to zero. We now eliminate $x_3$ from the right hand side of the constraints and from the objective, taking it as a new basic variable:

$$4x_3 = 6 - \frac{3}{2}x_2 - x_5 + \frac{x_6}{2}$$

i.e.,

$$x_3 = \frac{3}{2} - \frac{3}{8}x_2 - \frac{1}{4}x_5 + \frac{1}{8}x_6 \tag{25}$$

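After eliminating $x_3$ by substitution using (25), the objective now becomes:

$$z = 27 + \frac{1}{4}x_2 + \frac{1}{2}\left(\frac{3}{2} - \frac{3}{8}x_2 - \frac{1}{4}x_5 + \frac{1}{8}x_6\right) - \frac{3}{4}x_6 = \frac{111}{4} + \frac{1}{16}x_2 - \frac{1}{8}x_5 - \frac{11}{16}x_6 \tag{26}$$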

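Using (25) to eliminate $x_3$ from the constraints, after simplifications we get the new constraints:

$$x_1 = \frac{33}{4} - \frac{x_2}{16} + \frac{x_5}{8} - \frac{5x_6}{16} \tag{27}$$

$$x_3 = \frac{3}{2} - \frac{3x_2}{8} - \frac{x_5}{4} + \frac{x_6}{8} \tag{28}$$

$$x_4 = \frac{69}{4} + \frac{3x_2}{16} + \frac{5x_5}{8} - \frac{x_6}{16} \tag{29}$$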

Our new basic solution is again obtained using the fact that all variables on the right and in the objective, including the newly introduced non-basic variable $x_5$, are equal to zero, i.e., the new basic feasible solution is $(x_1, x_2, x_3, x_4, x_5, x_6) = \left(\frac{33}{4}, 0, \frac{3}{2}, \frac{69}{4}, 0, 0\right)$ and the new value of the objective is $z = \frac{111}{4}$.

Comparing this with the previous basic feasible solution $(x_1, x_2, x_3, x_4, x_5, x_6) = (9, 0, 0, 21, 6, 0)$ we see that in the new basic feasible solution the value of $x_1$ has dropped from 9 to 33/4; however, this now has no effect on the value of the objective, because $x_1$ no longer appears in the objective; all the variables appearing in the objective (thus, the non-basic variables) have value 0 in a basic feasible solution.

We now see that the only variable in the objective (26) appearing with a positive coefficient is $x_2$. How much can we increase it without violating the new constraints? The first constraint (27) implies that $\frac{x_2}{16} \le \frac{33}{4}$, i.e., that $x_2 \le 132$; the second constraint (28) implies that $\frac{3x_2}{8} \le \frac{3}{2}$, i.e., that $x_2 \le 4$. Note that the third constraint (29) does not impose any restrictions on how large $x_2$ can be, for as long as it is positive; thus, we conclude that the largest possible value of $x_2$ which does not cause violation of any of the constraints is $x_2 = 4$, which corresponds to constraint (28). The value $x_2 = 4$ forces $x_3 = 0$; we now switch the roles of $x_2$ and $x_3$ (this operation of switching the roles of two variables is called pivoting) by solving (28) for $x_2$:

$$x_2 = 4 - \frac{8x_3}{3} - \frac{2x_5}{3} + \frac{x_6}{3}$$

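and then using this to eliminate $x_2$ from the objective, obtaining

$$z = 28 - \frac{x_3}{6} - \frac{x_5}{6} - \frac{2x_6}{3}$$

as well as from the constraints, obtaining after simplification

$$x_1 = 8 + \frac{x_3}{6} + \frac{x_5}{6} - \frac{x_6}{3} \tag{30}$$

$$x_2 = 4 - \frac{8x_3}{3} - \frac{2x_5}{3} + \frac{x_6}{3} \tag{31}$$

$$x_4 = 18 - \frac{x_3}{2} + \frac{x_5}{2} \tag{32}$$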

which produces the new basic feasible solution $(x_1, x_2, x_3, x_4, x_5, x_6) = (8, 4, 0, 18, 0, 0)$ with the new value of the objective $z = 28$. Note that in the new objective all the variables appear with a negative coefficient; thus our procedure terminates, but did it find the maximum value of the objective? Maybe with different choices of variables in pivoting we would have come up with another basic feasible solution which would have different basic variables, also with all non-basic variables in the objective appearing with a negative coefficient, but for which the obtained value of the objective is larger?

This is not the case: just as in the case of the Ford-Fulkerson algorithm for Max Flow, once the pivoting terminates, the solution must be optimal regardless of which particular variables were swapped in pivoting, because the pivoting terminates when the value of the objective at the corresponding basic feasible solution of the program becomes equal to the value of the dual objective at a basic feasible solution of the dual program. Since the value of the dual objective at every feasible solution of the dual is at least as large as the value of the objective at every feasible solution of the starting (or primal) program, we get that the SIMPLEX algorithm must return the optimal value after it terminates. We now explain this in more detail.
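The pivoting steps carried out by hand above can be sketched in a few lines of Python using exact fractions. This is a minimal illustration and not the textbook's implementation: it assumes that all $b_i \ge 0$ (so that setting the original variables to 0 is feasible), it picks the entering variable with the largest coefficient in the objective (which happens to reproduce the choices $x_1$, $x_3$, $x_2$ made in the text), and it has no protection against cycling on degenerate problems. Variables are indexed from 0, so index 0 is $x_1$ and index 5 is $x_6$; the names `simplex` and `_substitute` are made up for this sketch.

```python
from fractions import Fraction


def _substitute(const, coef, e, e_const, e_coef):
    """Replace x_e in  const + sum_j coef[j]*x_j  by  e_const + sum_j e_coef[j]*x_j."""
    t = coef.pop(e)
    const += t * e_const
    for j, cj in e_coef.items():
        coef[j] = coef.get(j, 0) + t * cj
    return const, coef


def simplex(A, b, c):
    """Maximize c.x subject to A x <= b, x >= 0, assuming every b[i] >= 0.

    Indices 0..n-1 are the original variables, n..n+m-1 the slack variables.
    Returns (v, x, final_obj): optimal value, optimal basic feasible solution,
    and the coefficients of the non-basic variables in the final objective.
    """
    m, n = len(A), len(c)
    # rows[i] = (const, coef) encodes the equation  x_i = const + sum_j coef[j]*x_j
    rows = {n + i: (Fraction(b[i]), {j: Fraction(-A[i][j]) for j in range(n)})
            for i in range(m)}
    v, obj = Fraction(0), {j: Fraction(c[j]) for j in range(n)}
    while True:
        # entering variable: largest objective coefficient (ties: smallest index)
        e = max(obj, key=lambda j: (obj[j], -j))
        if obj[e] <= 0:
            break  # no positive coefficient left: terminate
        # ratio test: how far can x_e grow before some basic variable hits 0?
        ratios = [(-const / coef[e], i)
                  for i, (const, coef) in rows.items() if coef[e] < 0]
        if not ratios:
            raise ValueError("the objective is unbounded")
        _, l = min(ratios)
        # pivot: solve the equation of x_l for x_e ...
        const, coef = rows.pop(l)
        a = coef.pop(e)
        e_const = -const / a
        e_coef = {j: -cj / a for j, cj in coef.items()}
        e_coef[l] = 1 / a
        # ... and eliminate x_e from the other equations and from the objective
        rows = {i: _substitute(k, kc, e, e_const, e_coef)
                for i, (k, kc) in rows.items()}
        rows[e] = (e_const, e_coef)
        v, obj = _substitute(v, obj, e, e_const, e_coef)
    x = [rows[j][0] if j in rows else Fraction(0) for j in range(n + m)]
    return v, x, obj


v, x, final_obj = simplex([[1, 1, 3], [2, 2, 5], [4, 1, 2]], [30, 24, 36], [3, 1, 2])
print(v, [str(t) for t in x])  # 28 ['8', '4', '0', '18', '0', '0']
```

On the example program the sketch passes through the same three pivots as the text and ends with the objective $z = 28 - \frac{x_3}{6} - \frac{x_5}{6} - \frac{2x_6}{3}$.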


1 LP Duality

General setup

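Comparing our initial program $P$ with its dual $P^*$:

$$P: \quad \text{maximize } 3x_1 + x_2 + 2x_3 \quad \text{subject to} \quad
\begin{cases}
x_1 + x_2 + 3x_3 \le 30 \\
2x_1 + 2x_2 + 5x_3 \le 24 \\
4x_1 + x_2 + 2x_3 \le 36 \\
x_1, x_2, x_3 \ge 0;
\end{cases}$$

$$P^*: \quad \text{minimize } 30y_1 + 24y_2 + 36y_3 \quad \text{subject to} \quad
\begin{cases}
y_1 + 2y_2 + 4y_3 \ge 3 \\
y_1 + 2y_2 + y_3 \ge 1 \\
3y_1 + 5y_2 + 2y_3 \ge 2 \\
y_1, y_2, y_3 \ge 0.
\end{cases}$$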

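we see that the original, primal Linear Program $P$ and its dual Linear Program $P^*$ are related as follows:

$$P: \quad \text{maximize } z(x) = \sum_{j=1}^{n} c_j x_j \quad \text{subject to} \quad
\begin{cases}
\sum_{j=1}^{n} a_{ij} x_j \le b_i, & 1 \le i \le m \\
x_1, \dots, x_n \ge 0;
\end{cases}$$

$$P^*: \quad \text{minimize } z^*(y) = \sum_{i=1}^{m} b_i y_i \quad \text{subject to} \quad
\begin{cases}
\sum_{i=1}^{m} a_{ij} y_i \ge c_j, & 1 \le j \le n \\
y_1, \dots, y_m \ge 0,
\end{cases}$$

or, in matrix form,

$P$: maximize $z(x) = c^\top x$, subject to the constraints $Ax \le b$ and $x \ge 0$;

$P^*$: minimize $z^*(y) = b^\top y$, subject to the constraints $A^\top y \ge c$ and $y \ge 0$.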

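Weak Duality Theorem: If $x^\top = \langle x_1, \dots, x_n \rangle$ is any feasible solution (in particular, any basic feasible solution) for $P$ and $y^\top = \langle y_1, \dots, y_m \rangle$ is any feasible solution for $P^*$, then:

$$z(x) = \sum_{j=1}^{n} c_j x_j \le \sum_{i=1}^{m} b_i y_i = z^*(y)$$

Proof: Since $x$ and $y$ are feasible solutions for $P$ and $P^*$ respectively, we can use the constraint inequalities, first from $P^*$ (together with $x_j \ge 0$) and then from $P$ (together with $y_i \ge 0$), to obtain

$$z(x) = \sum_{j=1}^{n} c_j x_j \le \sum_{j=1}^{n} \left( \sum_{i=1}^{m} a_{ij} y_i \right) x_j = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} a_{ij} x_j \right) y_i \le \sum_{i=1}^{m} b_i y_i = z^*(y)$$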

Thus, the value of the dual objective at any feasible solution of $P^*$ is an upper bound for the values of the objective at all feasible solutions of $P$, and the value of the objective at any feasible solution of $P$ is a lower bound for the values of the dual objective at all feasible solutions of $P^*$; see the figure below.

[Figure: Solutions for P; Solutions for P*]

Consequently, if we find a feasible solution $x$ of $P$ and a feasible solution $y$ of $P^*$ for which $z(x) = z^*(y)$, then $x$ must be a maximal feasible solution for $P$ and $y$ a minimal feasible solution for $P^*$.

We now show that when the SIMPLEX algorithm terminates, it produces a basic feasible solution $x$ for $P$ and, implicitly, a basic feasible solution $y$ for the dual $P^*$ for which $z(x) = z^*(y)$; by the above, this will imply that $z(x)$ is the maximal value of the objective of $P$ and that $z^*(y)$ is the minimal value of the objective of $P^*$.

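Assume that the SIMPLEX algorithm has terminated; let $B$ be such that the basic variables (variables on the left hand side of the constraint equations) in the final form of $P$ are the variables $x_i$ for which $i \in B$; let $N = \{1, 2, \dots, n+m\} \setminus B$; then $x_j$ for $j \in N$ are all the non-basic variables in the final form of $P$. Since the SIMPLEX algorithm has terminated, we have also obtained a set of coefficients $\bar{c}_j \le 0$ for $j \in N$ (we write $\bar{c}_j$ for the coefficients of the final objective, to distinguish them from the original coefficients $c_j$), as well as $v$ such that the final form of the objective is

$$z(x) = v + \sum_{j \in N} \bar{c}_j x_j$$

If we set all the final non-basic variables $x_j$, $j \in N$, to zero, we obtain a basic feasible solution $x$ for which $z(x) = v$.

Let us define $\bar{c}_j = 0$ for all $j \in B$; then

$$\begin{aligned}
z(x) &= \sum_{j=1}^{n} c_j x_j \\
     &= v + \sum_{j=1}^{n+m} \bar{c}_j x_j \\
     &= v + \sum_{j=1}^{n} \bar{c}_j x_j + \sum_{i=n+1}^{n+m} \bar{c}_i x_i \\
     &= v + \sum_{j=1}^{n} \bar{c}_j x_j + \sum_{i=1}^{m} \bar{c}_{n+i} x_{n+i}
\end{aligned}$$

Since the variables $x_{n+i}$, $(1 \le i \le m)$, are the initial slack variables, they satisfy $x_{n+i} = b_i - \sum_{j=1}^{n} a_{ij} x_j$; thus we get

$$\begin{aligned}
z(x) &= \sum_{j=1}^{n} c_j x_j \\
     &= v + \sum_{j=1}^{n} \bar{c}_j x_j + \sum_{i=1}^{m} \bar{c}_{n+i} \left( b_i - \sum_{j=1}^{n} a_{ij} x_j \right) \\
     &= v + \sum_{j=1}^{n} \bar{c}_j x_j + \sum_{i=1}^{m} \bar{c}_{n+i} b_i - \sum_{i=1}^{m} \sum_{j=1}^{n} \bar{c}_{n+i} a_{ij} x_j \\
     &= v + \sum_{j=1}^{n} \bar{c}_j x_j + \sum_{i=1}^{m} \bar{c}_{n+i} b_i - \sum_{j=1}^{n} \sum_{i=1}^{m} \bar{c}_{n+i} a_{ij} x_j \\
     &= v + \sum_{i=1}^{m} \bar{c}_{n+i} b_i + \sum_{j=1}^{n} \left( \bar{c}_j - \sum_{i=1}^{m} \bar{c}_{n+i} a_{ij} \right) x_j
\end{aligned}$$

The above equations hold true for all values of $x$; thus, comparing the first and the last expression we conclude that

$$v + \sum_{i=1}^{m} \bar{c}_{n+i} b_i = 0;$$

$$\bar{c}_j - \sum_{i=1}^{m} \bar{c}_{n+i} a_{ij} = c_j, \quad (1 \le j \le n),$$

i.e.,

$$\sum_{i=1}^{m} b_i (-\bar{c}_{n+i}) = v;$$

$$\sum_{i=1}^{m} a_{ij} (-\bar{c}_{n+i}) = c_j + (-\bar{c}_j), \quad (1 \le j \le n).$$

We now see that if we set $y_i = -\bar{c}_{n+i}$ for all $1 \le i \le m$, then, since the SIMPLEX algorithm terminates when all coefficients $\bar{c}_j$ of the objective are either negative or zero, such $y$ satisfies:

$$\sum_{i=1}^{m} b_i y_i = v;$$

$$\sum_{i=1}^{m} a_{ij} y_i = c_j - \bar{c}_j \ge c_j, \quad (1 \le j \le n),$$

$$y_i \ge 0, \quad (1 \le i \le m).$$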

Thus, such $y$ is a basic feasible solution for the dual program $P^*$ for which the dual objective has the same value $v$ which the original, primal program $P$ achieves for the basic feasible solution $x$. Thus, by the Weak Duality Theorem, we conclude that $v$ is the maximal feasible value of $P$ and the minimal feasible value of $P^*$.

Note also that the basic and non-basic variables of the final primal form of the problem and of its dual are complementary: for every $1 \le i \le m$, variable $x_{n+i}$ is basic for the final form of the primal if and only if $y_i$ is non-basic for the final form of the dual; similarly, for every $1 \le j \le n$, variable $x_j$ is basic for the final form of the primal if and only if $y_{m+j}$ is non-basic for the final form of the dual. Since the basic variables measure the slack of the corresponding basic feasible solution, we get that if $x$ and $y$ are the extremal feasible solutions for $P$ and $P^*$, respectively, then for all $1 \le j \le n$ and all $1 \le i \le m$,

either $x_j = 0$ or $y_{m+j} = 0$, i.e., $\sum_{i=1}^{m} a_{ij} y_i = c_j$;

either $y_i = 0$ or $x_{n+i} = 0$, i.e., $\sum_{j=1}^{n} a_{ij} x_j = b_i$.
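Both facts can be checked on the worked example. The sketch below takes the final objective $z = 28 - \frac{x_3}{6} - \frac{x_5}{6} - \frac{2x_6}{3}$ and the final basic feasible solution from the text, reads off the dual solution from the coefficients of the slack variables $x_4, x_5, x_6$ with the sign flipped, and computes the primal and dual slacks. The variable names are chosen for this illustration only.

```python
from fractions import Fraction as F

A = [[1, 1, 3], [2, 2, 5], [4, 1, 2]]
b = [30, 24, 36]
c = [3, 1, 2]
n, m = 3, 3

# Final objective of the worked example (1-based variable index -> coefficient);
# basic variables do not appear in it, so their coefficient is taken to be 0.
v = F(28)
final_coef = {3: F(-1, 6), 5: F(-1, 6), 6: F(-2, 3)}

x = [F(8), F(4), F(0)]  # original variables of the final basic feasible solution
# dual solution: minus the final coefficient of the i-th slack variable x_{n+i}
y = [-final_coef.get(n + i, F(0)) for i in range(1, m + 1)]

# primal slacks x_{n+i} = b_i - sum_j a_ij x_j and dual slacks sum_i a_ij y_i - c_j
primal_slack = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
dual_slack = [sum(A[i][j] * y[i] for i in range(m)) - c[j] for j in range(n)]

print([str(t) for t in y])             # ['0', '1/6', '2/3']
print([str(t) for t in primal_slack])  # ['18', '0', '0']
print([str(t) for t in dual_slack])    # ['0', '0', '1/6']
```

Here $\sum_i b_i y_i = \sum_j c_j x_j = 28$, and in every complementary pair at least one member is zero: $y_1 = 0$ where the first primal constraint has slack 18, and $x_3 = 0$ where the third dual constraint has slack $\frac{1}{6}$.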

Note that any equivalent form of $P$ which is obtained through a pivoting operation is uniquely determined by its corresponding set of basic variables. Assuming that we have $n$ variables and $m$ equations, there are $\binom{n+m}{m}$ choices for the set of basic variables. Using the Stirling formula

$$n! \approx \sqrt{2\pi n} \left( \frac{n}{e} \right)^n \;\Rightarrow\; \ln n! \approx \frac{\ln(2\pi n)}{2} + n \ln n - n = n \ln n - n + O(\ln n)$$

we get, up to terms of order $O(\ln(m+n))$,

$$\begin{aligned}
\ln \binom{n+m}{m} &= \ln \frac{(m+n)!}{m!\, n!} = \ln (m+n)! - \ln m! - \ln n! \\
&\approx (m+n)\ln(m+n) - (m+n) - n \ln n - m \ln m + m + n \\
&= m(\ln(m+n) - \ln m) + n(\ln(m+n) - \ln n)
\end{aligned}$$

When $m$ and $n$ are comparable, this quantity grows linearly in $m + n$; for example, for $m = n$ it equals $2n \ln 2$, so $\binom{2n}{n}$ grows like $4^n$ up to a polynomial factor. Thus, the total number of choices for the set of basic variables can be exponential in the size of the problem. This implies that the SIMPLEX algorithm could potentially run in exponential time, and in fact, one can construct examples of LP on which the run time of the SIMPLEX algorithm is exponential. However, in practice the SIMPLEX algorithm is extremely efficient, even for large problems with thousands of variables and constraints, and it often outperforms algorithms for LP which do run in polynomial time (the Ellipsoid Method and the Interior Point Method).
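To get a feeling for these numbers, the following short sketch counts the possible sets of basic variables exactly and compares the logarithm of the count with the leading terms $m(\ln(m+n) - \ln m) + n(\ln(m+n) - \ln n)$ of the Stirling estimate. The function names are assumptions made for this illustration.

```python
from math import comb, log


def num_bases(n, m):
    """Number of ways to choose m basic variables among n + m variables."""
    return comb(n + m, m)


def leading_term(n, m):
    """Leading terms of ln C(n+m, m) from the Stirling formula."""
    return m * (log(m + n) - log(m)) + n * (log(m + n) - log(n))


print(num_bases(3, 3))  # 20 candidate bases for the worked example
for size in (10, 50, 100):
    exact = log(num_bases(size, size))
    print(size, round(exact, 2), round(leading_term(size, size), 2))
```

For $m = n$ the two columns differ only by a term of order $\ln n$, while the count itself already exceeds $2^{50}$ for $m = n = 50$.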

1.1 Examples of dual programs: max flow

We would now like to formulate the Max Flow problem in a flow network as a Linear Program.

Thus, assume we are given a flow network $G$ with capacities $\kappa_{ij}$ of all edges $(i, j) \in G$. The max flow problem seeks to maximize the flow through the network $G$, subject to the constraints $C^*$:

$$f_{ij} \le \kappa_{ij}, \quad (i, j) \in G \quad \text{(flow smaller than pipe's capacity)};$$

$$\sum_{i : (i,j) \in G} f_{ij} = \sum_{k : (j,k) \in G} f_{jk}, \quad j \in G \quad \text{(incoming flow equals outgoing)};$$

$$f_{ij} \ge 0, \quad (i, j) \in G \quad \text{(no negative flows)}.$$

To eliminate the equality in the second constraint in $C^*$ while introducing only one rather than two inequalities, we use a “trick”: we make the flow circular, by connecting the sink $t$ with the source $s$ with a pipe of infinite capacity. Thus, we now have a new graph $G'$, $G \subset G'$, with an additional edge $(t, s) \in G'$ with capacity $\infty$.

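We can now formulate the Max Flow problem as a Linear Program by replacing the equality in the second constraint with a single but equivalent inequality (summing the left hand sides of these inequalities over all vertices $j$ gives exactly 0, because every $f_{ij}$ appears once with each sign; hence none of the inequalities can be strict):

$P$: maximize $f_{ts}$ subject to the constraints:

$$f_{ij} \le \kappa_{ij}, \quad (i, j) \in G;$$

$$\sum_{i : (i,j) \in G'} f_{ij} - \sum_{k : (j,k) \in G'} f_{jk} \le 0, \quad j \in G;$$

$$f_{ij} \ge 0, \quad (i, j) \in G'.$$

Thus, the coefficients $c_{ij}$ of the objective of the primal $P$ are zero for all variables $f_{ij}$ except for $f_{ts}$, whose coefficient is equal to 1, i.e.,

$$z(f) = \sum_{(i,j) \in G} 0 \cdot f_{ij} + 1 \cdot f_{ts} \tag{1.1}$$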

To obtain the dual of $P$ we look for coefficients $d_{ij}$, $(i, j) \in G$, corresponding to the first set of constraints, and coefficients $p_j$, $j \in G$, corresponding to the second set of constraints, to use as multipliers of the constraints:

$$f_{ij} d_{ij} \le \kappa_{ij} d_{ij}, \quad (i, j) \in G;$$

$$\sum_{i : (i,j) \in G'} f_{ij} p_j - \sum_{k : (j,k) \in G'} f_{jk} p_j \le 0, \quad j \in G.$$

Note that the two special cases of the second inequality involving $f_{ts}$ are

$$\sum_{i : (i,t) \in G} f_{it} p_t - f_{ts} p_t \le 0;$$

$$f_{ts} p_s - \sum_{k : (s,k) \in G} f_{sk} p_s \le 0.$$

Summing these inequalities and factoring out, we get

$$\sum_{(i,j) \in G} (d_{ij} - p_i + p_j) f_{ij} + (p_s - p_t) f_{ts} \le \sum_{(i,j) \in G} \kappa_{ij} d_{ij}$$

Thus, the dual objective is the right hand side of the above inequality, and, as before, the dual constraints are obtained by comparing the coefficients of the left hand side with the coefficients of the objective:

$P^*$: minimize $\sum_{(i,j) \in G} \kappa_{ij} d_{ij}$ subject to the constraints:

$$d_{ij} - p_i + p_j \ge 0, \quad (i, j) \in G;$$

$$p_s - p_t \ge 1;$$

$$d_{ij} \ge 0, \quad (i, j) \in G;$$

$$p_i \ge 0, \quad i \in G.$$

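Let us write the constraints of $P$ using new slack variables $\psi_{ij}$, $\varphi_j$:

$$\psi_{ij} = \kappa_{ij} - f_{ij}, \quad (i, j) \in G;$$

$$\varphi_j = \sum_{k : (j,k) \in G'} f_{jk} - \sum_{i : (i,j) \in G'} f_{ij}, \quad j \in G;$$

$$f_{ij} \ge 0, \quad (i, j) \in G'; \qquad \psi_{ij} \ge 0, \quad (i, j) \in G; \qquad \varphi_j \ge 0, \quad j \in G.$$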

Note now an important feature of both $P$ and $P^*$: all variables $f_{ij}, \psi_{ij}, \varphi_j$ in $P$ (and also all variables $d_{ij}, p_i$ in $P^*$) appear only with the coefficients $\pm 1$. One can now see that, if we solve the above constraint equations for any subset of the set of all variables $\{f_{ij}, \psi_{ij}, \varphi_j : (i, j) \in G', j \in G\}$ as the set of the basic variables, all the coefficients in the new objective which multiply $\psi_{ij}$ or $\varphi_j$, obtained after the corresponding substitutions removing the basic variables from the objective, will have absolute value either 0 or 1. [1] This means that in the corresponding basic feasible solution of $P^*$ at which the minimal value of the dual program is obtained, all values $\bar{d}_{ij}$ of $d_{ij}$ and all values $\bar{p}_j$ of $p_j$ will also be either 0 or 1.

What is the interpretation of such a solution of the dual Linear Program $P^*$? Let us consider the set $A$ of all vertices $j$ of $G$ for which $\bar{p}_j = 1$ and the set $B$ of all vertices $j$ for which $\bar{p}_j = 0$. Then $A \cup B = G$ and $A \cap B = \emptyset$. The constraint $p_s - p_t \ge 1$ of $P^*$ implies that $\bar{p}_s = 1$ and $\bar{p}_t = 0$, i.e., $s \in A$ and $t \in B$. Thus, $A$ and $B$ define a cut in the flow network. Since at the point $\bar{p}, \bar{d}$ the objective $\sum_{(i,j) \in G} \kappa_{ij} d_{ij}$ achieves its minimal value, the constraint $d_{ij} - p_i + p_j \ge 0$ implies that $\bar{d}_{ij} = 1$ if and only if $\bar{p}_i = 1$ and $\bar{p}_j = 0$, i.e., if and only if the edge $(i, j)$ crosses from the set $A$ into the set $B$. Thus, the minimum value of the dual objective $\sum_{(i,j) \in G} \kappa_{ij} d_{ij}$ precisely corresponds to the capacity of the cut defined by $A, B$. Since such value is equal to the maximal value of the flow defined by the primal problem, we have obtained a maximal flow and a minimal cut in $G$!

[1] This corresponds to the fact that such constraints define a polyhedron all of whose vertices have coordinates which are all either 0 or 1.
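The correspondence between cuts and 0/1 solutions of the dual can be tried out on a toy network. The network below and all function names are hypothetical and chosen only for illustration: for every cut with $s \in A$ and $t \in B$ the sketch builds $p$ and $d$ as described above, checks the dual constraints, and evaluates the dual objective, which is the capacity of the cut.

```python
from itertools import combinations

# Hypothetical network: capacities kappa_ij on the edges of G (not from the notes).
cap = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
vertices = sorted({v for edge in cap for v in edge})


def dual_from_cut(A_side):
    """p_j = 1 on the source side A and 0 otherwise; d_ij = 1 iff (i, j) goes from A to B."""
    p = {v: int(v in A_side) for v in vertices}
    d = {(i, j): max(0, p[i] - p[j]) for (i, j) in cap}
    return p, d


def is_dual_feasible(p, d):
    """Check the constraints of P*: d_ij - p_i + p_j >= 0, p_s - p_t >= 1, d, p >= 0."""
    return (p["s"] - p["t"] >= 1
            and all(d[i, j] - p[i] + p[j] >= 0 for (i, j) in cap)
            and min(d.values()) >= 0 and min(p.values()) >= 0)


def dual_objective(d):
    """Sum of kappa_ij * d_ij, i.e. the capacity of the cut encoded by d."""
    return sum(cap[e] * d[e] for e in cap)


inner = [v for v in vertices if v not in ("s", "t")]
cuts = [{"s", *extra} for r in range(len(inner) + 1) for extra in combinations(inner, r)]
capacities = {tuple(sorted(A_side)): dual_objective(dual_from_cut(A_side)[1])
              for A_side in cuts}
print(capacities)
print(min(capacities.values()))  # 5
```

In this toy network the smallest cut capacity is 5, and a flow of value 5 exists (2 units along s-a-t, 2 along s-b-t and 1 along s-a-b-t), in agreement with the duality argument.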

As we have mentioned, the extreme values of linear optimization problems are always attained at vertices of the corresponding constraint polyhedra. In this particular case all vertices have 0, 1 coordinates; however, in general this is false. For example, NP-hard problems formulated as Linear Programming problems typically result in polyhedra with non-integer vertices.

Theorem: Solving an Integer Linear Program (ILP), i.e., an LP with the additional constraint that the values of all variables must be integers, is NP-hard, i.e., there cannot be a polynomial time algorithm for solving ILPs (unless an extremely unlikely thing happens, namely that P = NP).

Extending the scope of LP. There are several types of problems which are not linear programs per se, but which can be reduced to linear programs by clever tricks. We give examples of such reductions below.

Example: Assume that you are given a set of 10 points in the plane, with coordinates $(x_i, y_i)$, $1 \le i \le 10$. Your goal is to find a polynomial of degree 3 which “fits” the data as closely as possible. What does this mean? In order for the requirement to make sense we must specify how we measure how close such a polynomial is to the given data. Let us represent the polynomial in the form

$$P(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0$$

where $a_3, a_2, a_1, a_0$ are the coefficients to be determined.

1. In the sense of the $L_2$ norm: We need to minimize the sum

$$S_2(a_0, a_1, a_2, a_3) = \sum_{i=1}^{10} (P(x_i) - y_i)^2 = \sum_{i=1}^{10} (a_3 x_i^3 + a_2 x_i^2 + a_1 x_i + a_0 - y_i)^2$$

which means that we want to minimize the sum of the squares of the differences between the values $P(x_i)$ of the polynomial at the points $x_1, \dots, x_{10}$ and the corresponding values $y_i$. This is a typical “Least Squares” approximation, and it is solved by computing the partial derivatives of $S_2(a_0, a_1, a_2, a_3)$ and setting them equal to 0:

$$\frac{\partial S_2(a_0, a_1, a_2, a_3)}{\partial a_k} = 2 \sum_{i=1}^{10} (a_3 x_i^3 + a_2 x_i^2 + a_1 x_i + a_0 - y_i)\, x_i^k = 0, \quad 0 \le k \le 3.$$

Note that this results in a system of linear equations in the unknowns $a_0, a_1, a_2, a_3$ which is easy to solve.
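As a rough sketch of how the resulting linear system can be set up and solved, the code below builds the four equations $\sum_j a_j \sum_i x_i^{j+k} = \sum_i y_i x_i^k$, $0 \le k \le 3$, and solves them by Gauss-Jordan elimination over exact fractions. It assumes at least four distinct values $x_i$ (so that the system is nonsingular); the function names and the test data are made up for this illustration.

```python
from fractions import Fraction


def solve_linear_system(M, rhs):
    """Gauss-Jordan elimination over Fractions; assumes M is square and nonsingular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # find a row with a non-zero entry in this column and move it into place
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [val / aug[col][col] for val in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def least_squares_cubic(points):
    """Solve sum_i (P(x_i) - y_i) * x_i^k = 0 for k = 0..3; returns [a0, a1, a2, a3]."""
    M = [[sum(Fraction(x) ** (j + k) for x, _ in points) for j in range(4)]
         for k in range(4)]
    rhs = [sum(Fraction(y) * Fraction(x) ** k for x, y in points) for k in range(4)]
    return solve_linear_system(M, rhs)


# Hypothetical data lying exactly on P(x) = 2x^3 - x + 5, so the fit recovers it.
points = [(x, 2 * x ** 3 - x + 5) for x in range(10)]
print([str(a) for a in least_squares_cubic(points)])  # ['5', '-1', '0', '2']
```

With noisy data the same code returns the coefficients minimizing $S_2$; exact fractions are used here only to keep the example free of rounding issues.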

2. In the sense of the uniform norm: We need to minimize

$$S_U(a_0, a_1, a_2, a_3) = \max \{ |P(x_i) - y_i| : 1 \le i \le 10 \} = \max \{ |a_3 x_i^3 + a_2 x_i^2 + a_1 x_i + a_0 - y_i| : 1 \le i \le 10 \}$$

Note that, as it stands, this is NOT an LP problem (or program, as it is called) because it involves two non-linear operations: taking the max and taking absolute values. However, we can reduce it to an LP by introducing a new variable $u$ and solving the following problem:

minimize $u$

subject to the constraints $|P(x_i) - y_i| \le u$, $1 \le i \le 10$.

Note that in the above minimization problem the objective is extremely simple: it is just the variable $u$, which plays the role of an upper bound for all of the absolute values of the differences $|P(x_i) - y_i|$, $1 \le i \le 10$. This way we got rid of the non-linear max operator, but now we also have to get rid of the absolute values of all the differences. To do that, we note that, if $u \ge 0$, then $|z| \le u$ if and only if both $z \le u$ and $-z \le u$. Thus, we solve the following minimization problem:

minimize $u$

subject to the constraints $P(x_i) - y_i \le u$ and $-(P(x_i) - y_i) \le u$, $1 \le i \le 10$,

which is equivalent to:

minimize $u$

subject to the constraints

$$a_3 x_i^3 + a_2 x_i^2 + a_1 x_i + a_0 - u \le y_i, \quad 1 \le i \le 10;$$

$$-(a_3 x_i^3 + a_2 x_i^2 + a_1 x_i + a_0) - u \le -y_i, \quad 1 \le i \le 10.$$

This is an LP almost in the standard form (minimizing $u$ is the same as maximizing $-u$): we now only add the inequality $u \ge 0$ but, since the coefficients $a_k$ can be both positive and negative, we now do the trick we have already mentioned: we replace each $a_k$ with $a'_k - a^*_k$, where $a'_k$ and $a^*_k$ are two new variables. Thus our polynomial now is $(a'_3 - a^*_3)x^3 + (a'_2 - a^*_2)x^2 + (a'_1 - a^*_1)x + (a'_0 - a^*_0)$, and we can add the constraints $a'_k \ge 0$ and $a^*_k \ge 0$, thus obtaining a proper standard form LP.
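The reduction can be made concrete by writing out the triplet $(A, b, c)$ of the resulting standard form LP. In the sketch below the ordering of the variables, $(a'_0, \dots, a'_3, a^*_0, \dots, a^*_3, u)$, and the function name are assumptions made for the illustration; the objective "minimize $u$" is encoded as "maximize $-u$".

```python
def chebyshev_fit_lp(points, degree=3):
    """Standard form LP (maximize c.v, A v <= b, v >= 0) for the uniform norm fit.

    The variable vector is v = (a'_0..a'_d, a*_0..a*_d, u), where the k-th
    polynomial coefficient is a'_k - a*_k.
    """
    A, b = [], []
    for x, y in points:
        powers = [x ** k for k in range(degree + 1)]
        neg = [-p for p in powers]
        A.append(powers + neg + [-1])   #  P(x_i) - u <=  y_i
        b.append(y)
        A.append(neg + powers + [-1])   # -P(x_i) - u <= -y_i
        b.append(-y)
    c = [0] * (2 * (degree + 1)) + [-1]  # maximize -u, i.e. minimize u
    return A, b, c


# Two hypothetical data points, just to show the shape of the rows.
A, b, c = chebyshev_fit_lp([(2, 5), (0, 1)])
for row, rhs in zip(A, b):
    print(row, "<=", rhs)
```

For the 10 data points of the example this produces 20 constraints in 9 variables.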

3. In the sense of the $L_1$ norm: We need to minimize the sum

$$S_1(a_0, a_1, a_2, a_3) = \sum_{i=1}^{10} |P(x_i) - y_i| = \sum_{i=1}^{10} |a_3 x_i^3 + a_2 x_i^2 + a_1 x_i + a_0 - y_i|$$

This can be reduced to an LP by a somewhat similar trick: for every $i$ such that $1 \le i \le 10$ we introduce a new variable $u_i$ and solve the following minimization problem:

minimize $\sum_{i=1}^{10} u_i$

subject to the constraints $P(x_i) - y_i \le u_i$ and $-(P(x_i) - y_i) \le u_i$, $1 \le i \le 10$,

which is equivalent to:

minimize $\sum_{i=1}^{10} u_i$

subject to the constraints

$$a_3 x_i^3 + a_2 x_i^2 + a_1 x_i + a_0 - u_i \le y_i, \quad 1 \le i \le 10;$$

$$-(a_3 x_i^3 + a_2 x_i^2 + a_1 x_i + a_0) - u_i \le -y_i, \quad 1 \le i \le 10,$$

which is again reducible to a standard form LP by replacing each $a_k$ with the difference $a'_k - a^*_k$ of two new non-negative variables, just as in the previous case.
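The only change compared with the uniform norm case is that every data point gets its own bound variable. A minimal sketch of the corresponding standard form data follows; the variable ordering $(a'_0, \dots, a'_3, a^*_0, \dots, a^*_3, u_1, \dots, u_{10})$ and the function name are again assumptions made for the illustration.

```python
def l1_fit_lp(points, degree=3):
    """Standard form LP (maximize c.v, A v <= b, v >= 0) for the L1 norm fit.

    The variable vector is v = (a'_0..a'_d, a*_0..a*_d, u_1..u_N), where the
    k-th polynomial coefficient is a'_k - a*_k and N is the number of points.
    """
    n_pts = len(points)
    A, b = [], []
    for i, (x, y) in enumerate(points):
        powers = [x ** k for k in range(degree + 1)]
        neg = [-p for p in powers]
        u_col = [0] * n_pts
        u_col[i] = -1                    # only u_i appears in the rows of point i
        A.append(powers + neg + u_col)   #  P(x_i) - u_i <=  y_i
        b.append(y)
        A.append(neg + powers + u_col)   # -P(x_i) - u_i <= -y_i
        b.append(-y)
    c = [0] * (2 * (degree + 1)) + [-1] * n_pts  # maximize -(u_1 + ... + u_N)
    return A, b, c


# Two hypothetical data points, just to show the shape of the rows.
A, b, c = l1_fit_lp([(2, 5), (0, 1)])
for row, rhs in zip(A, b):
    print(row, "<=", rhs)
```

For the 10 data points of the example this gives 20 constraints in 18 variables.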
