
Math Review: Week 3¹

Yoshifumi Konishi

Ph.D. in Applied Economics

COB 337 | E-mail: [email protected]

12. Quadratic Forms and Definite Matrices

Finally, we are going to study optimization problems. At least up until the 1970s, economics was all about the study of rational optimizing agents. Though we often study cases of incomplete information and imperfect competition, we still assume that agents intend to optimize their behaviors, given the information they have. It is important to note that, when we say agents are rational, we mean that agents make optimal choices according to their objectives. In economics, rationality is never a statement about whether or not their objectives are reasonable or rational. We simply take their objectives as given and solve their optimization problems. So, it is very critical for us to be very familiar with optimization techniques. To be more precise, we need to learn at least three things: (i) When are the optimization problems of our interest well-defined? When do they have a solution? (ii) Are the solutions unique? Are they global or local? (iii) How can we solve them correctly and efficiently? In Week 1, we learned how to solve unconstrained one-variable objective functions. This section discusses the case of more than one variable, but with a very simple functional form: a quadratic form.

Before proceeding further, recall how to solve a one-variable optimization problem. Consider:

min f(x) = x²

FONC gives us a solution:

2x = 0  ⟹  x = 0

We need to check whether this is a minimum or a maximum. The SOC at x = 0 is:

2 > 0

So, this is at least a local minimum. But we can easily see that f(x) > 0 for all x ≠ 0. This means that x = 0 is a global minimum. Now, a question arises naturally: can we extend this sort of simple analysis to cases with more than one variable? A natural extension of quadratic functions to more than one variable is a quadratic form. Recall that a quadratic form on R^n is a real-valued function Q : R^n → R such that:

Q(x) = Σ_{i≤j} aij xi xj

1 This lecture note is adapted from the following sources: Simon & Blume (1994), W. Rudin (1976), A. Takayama(1985), M. Wadachi (2000), and Toda & Asano (2000).


Note that a quadratic form takes the value of zero if x = 0. Recall again that we can represent every quadratic form in matrix form, using a symmetric matrix A:

Q(x) = x^T A x

For example, on R²,

a11 x1² + a22 x2² + a12 x1 x2 = (x1  x2) ( a11        (1/2)a12 ) (x1)
                                         ( (1/2)a12   a22      ) (x2)

The definiteness of the symmetric matrix A is defined by the characteristics of the original function Q.

Definition 12-1 (Definiteness): Let A be an (n × n) symmetric matrix. Then, A is said to be:
(i) positive definite iff x^T A x > 0 for all x ≠ 0 in R^n;
(ii) positive semidefinite iff x^T A x ≥ 0 for all x ≠ 0 in R^n;
(iii) negative definite iff x^T A x < 0 for all x ≠ 0 in R^n;
(iv) negative semidefinite iff x^T A x ≤ 0 for all x ≠ 0 in R^n;
(v) indefinite iff x^T A x > 0 for some x and x^T A x < 0 for some other x in R^n.

Note that if A is positive (negative) definite, then A is positive (negative) semidefinite. Now, what's the point of this? Remember that x^T A x = 0 at x = 0. So, if A is positive definite, then x = 0 is a unique global minimum. If A is positive semidefinite, then x = 0 is a global minimum (which may not be unique). And so on. So, if we are solving a maximization problem of a quadratic form, what we want is negative semidefiniteness of A. In fact, as we will see later, we can generalize this test to more general functional forms, where we look for the negative semidefiniteness of a Hessian matrix. To get a more geometric idea, take a look at Figures 16.2-16.6 in S&B.

Obviously, these definitions by themselves are of no help to us, unless there is a convenient method for identifying the definiteness of matrices. Luckily for us, there is one relatively easy method.

Definition 12-2 (Principal Submatrix and Minor): Let A be an (n × n) matrix. The (k × k) submatrix of A obtained by deleting n − k columns, i1, i2, ..., i_{n−k}, and the n − k rows of the same indices i1, i2, ..., i_{n−k} is called a k-th order principal submatrix of A. The determinant of the (k × k) principal submatrix is called a k-th order principal minor of A.

Example 1. Consider a (3 × 3) matrix:

    A = ( a11  a12  a13 )
        ( a21  a22  a23 )
        ( a31  a32  a33 )

Question: How many principal submatrices of the third order are there? Answer: There is only one. Question: How many second-order submatrices (not necessarily principal submatrices) are there? Answer: We need to find how many combinations of (i, j) can be deleted. The possible combinations are (1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3). Note that deleting the 1st row and the 2nd column is different from deleting the 2nd row and the 1st column. So, there are nine possible second-order submatrices of A. How about principal submatrices? There are only three, corresponding to (1,1), (2,2), (3,3), because we have to delete rows and columns of the same indices.

A33 = ( a11  a12 )
      ( a21  a22 )

A22 = ( a11  a13 )
      ( a31  a33 )

A11 = ( a22  a23 )
      ( a32  a33 )

To find the definiteness of a matrix, however, we only use one special kind of principal minor, called a leading principal minor.

Definition 12-3 (Leading Principal Submatrix and Minor): Let A be an (n × n) matrix. The k-th order principal submatrix of A obtained by deleting the last n − k rows and the last n − k columns is called the k-th order leading principal submatrix of A. We denote it by Ak. Its determinant is called the k-th order leading principal minor of A.

Example 2. For a (3 × 3) matrix:

    A = ( a11  a12  a13 )
        ( a21  a22  a23 )
        ( a31  a32  a33 )

the leading principal minors are:

det(A1) = det(a11)

det(A2) = det ( a11  a12 )
              ( a21  a22 )

det(A3) = det ( a11  a12  a13 )
              ( a21  a22  a23 )
              ( a31  a32  a33 )

Now, we are ready to state the main theorem of this section.

Theorem 12-1 (Definiteness): Let A be an (n × n) symmetric matrix. Then,
(i) A is positive definite if and only if all its leading principal minors are strictly positive (> 0);
(ii) A is positive semidefinite if and only if all its principal minors are nonnegative (≥ 0);
(iii) A is negative definite if and only if its leading principal minors alternate signs as follows:

det(A1) < 0, det(A2) > 0, det(A3) < 0, etc.

(iv) A is negative semidefinite if and only if all its principal minors of odd order are ≤ 0 and those of even order are ≥ 0;
(v) If some k-th order leading principal minor of A is nonzero but the sign pattern of the nonzero terms does not fit either case (i) or case (iii), then A is indefinite.

Note that for positive or negative definiteness of A, we only need to check the leading principal minors. But if A is neither positive (negative) definite nor indefinite, then we must check all of its principal minors. Furthermore, it is important to note that when some of the leading principal minors are zero, the matrix need not be indefinite: it can be positive or negative semidefinite if the sign pattern of its nonzero terms still obeys the patterns in (i) or (iii).
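To make this test concrete, here is a minimal Python/NumPy sketch (my own illustration, not part of the original note) that computes leading principal minors and applies parts (i), (iii), and (v) of Theorem 12-1. The example matrix is hypothetical; the semidefinite cases would require checking all principal minors, which is not done here.

    import numpy as np

    def leading_principal_minors(A):
        """Return [det(A_1), det(A_2), ..., det(A_n)] for a square matrix A."""
        n = A.shape[0]
        return [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]

    def classify(A, tol=1e-10):
        """Classify a symmetric matrix using Theorem 12-1 (i), (iii), (v) only."""
        m = leading_principal_minors(A)
        if all(d > tol for d in m):
            return "positive definite"
        if all((d < -tol) if k % 2 == 1 else (d > tol) for k, d in enumerate(m, 1)):
            return "negative definite"
        return "indefinite or semidefinite (inspect all principal minors)"

    A = np.array([[2.0, 1.0], [1.0, 3.0]])   # x^T A x = 2x1^2 + 2x1x2 + 3x2^2
    print(leading_principal_minors(A))        # [2.0, 5.0]
    print(classify(A))                        # positive definite
    print(classify(-A))                       # negative definite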

Example 3. Consider a (4 × 4) matrix:

    A = ( a11  a12  a13  a14 )
        ( a21  a22  a23  a24 )
        ( a31  a32  a33  a34 )
        ( a41  a42  a43  a44 )

Question: What are the leading principal submatrices? Question: Consider each of the following cases. Is A positive definite, negative definite, or indefinite?

(a) det(A1) > 0, det(A2) > 0, det(A3) > 0, det(A4) > 0 ⟹ A is positive definite.
(b) det(A1) < 0, det(A2) > 0, det(A3) < 0, det(A4) > 0 ⟹ A is negative definite.
(c) det(A1) > 0, det(A2) < 0, det(A3) > 0, det(A4) < 0 ⟹ A is indefinite.
(d) det(A1) > 0, det(A2) > 0, det(A3) = 0, det(A4) > 0 ⟹ A is not positive definite, but may be positive semidefinite. If it is not positive semidefinite, then it is indefinite.
(e) det(A1) = 0, det(A2) < 0, det(A3) > 0, det(A4) > 0 ⟹ A is indefinite, because of det(A2).

If a symmetric matrix is a diagonal matrix, then things become very easy. Recall that the determinant of a diagonal matrix is simply the product of its diagonal terms. So, its leading principal minors are:

det(A1) = a11,  det(A2) = a11 a22,  ...,  det(An) = a11 a22 ··· ann

Thus, we have the following convenient theorem.

Theorem 12-2 (Definiteness of Diagonal Matrices): Let A be an (n × n) diagonal matrix. Then,
(i) A is positive definite if and only if all the aii are strictly positive;
(ii) A is negative definite if and only if all the aii are strictly negative;
(iii) A is positive semidefinite if and only if all the aii are nonnegative (≥ 0);
(iv) A is negative semidefinite if and only if all the aii are nonpositive (≤ 0);
(v) A is indefinite if two of the aii have opposite signs.

In summary, we have that x = 0 is the unique solution to max Q(x) = x^T A x if A is N.D. and to min Q(x) = x^T A x if A is P.D.

Lastly, in economic applications, we often have a linear constraint of the form:


max Q(x)   s.t.   c·x = 0,  where c = (c1, ..., cn)

In this case, we need to have "A is negative definite on the constraint set {x : c·x = 0}". To check this, we form a bordered matrix:

    H_{n+1} = ( 0    c  )                                          (1)
              ( c^T  A  )

            = ( 0    c1   c2   ···  cn  )
              ( c1   a11  a12  ···  a1n )
              ( c2   a21  a22  ···  a2n )
              ( ...  ...  ...  ...  ... )
              ( cn   an1  an2  ···  ann )

Theorem 12-3: Consider a bordered matrix of the form in (1). Suppose that c1 ≠ 0.
(i) If the last n leading principal minors of H_{n+1} have the same sign, then the quadratic form Q is positive definite on the constraint set {x : c·x = 0} (so that x = 0 is the unique global minimum).
(ii) If the last n leading principal minors of H_{n+1} alternate in sign, then the quadratic form Q is negative definite on the constraint set {x : c·x = 0} (so that x = 0 is the unique global maximum).

We can generalize the result of this theorem to more than one constraint. However, we rarely see it used, so I believe it is enough that you know where to find it in your textbook when you need to refer to it. (But you will see relatively more frequent use of a similar technique, the 'bordered Hessian', later.)

13. Unconstrained Optimization

In this section, we deal with unconstrained optimization. The analysis of this case is important, because it corresponds to the interior solutions of any (constrained or unconstrained) optimization problem. We have learned that for functions of one variable, a necessary condition for an interior maximum is that the derivative of the objective must equal zero. We also had a sufficient condition for an interior maximum, which is that the second derivative must be negative. The main results for functions of several variables are analogous to these. First, let's review a key term.

Definition 3-11 (Interior): For any set E ⊆ X, a point x is an interior point of E if there exists a neighborhood N of x such that N ⊆ E. We denote the set of all interior points of E by int(E).

Example 1. Consider the set E = {(x, y) : x² + y² ≤ 1}. Question: What does this set look like in R²? Answer: It is the area inside the unit circle. Consider the point (0, 1). Is it an interior point? No, it is a boundary point. How about (0, 1/2)? 0² + (1/2)² = 1/4 < 1, so it is an interior point. Now, consider the set E = {(x, y) : x ≥ 0, y > 0}. Is the point (0, 1) a boundary point or an interior point? Answer: It is a boundary point. To convince yourself, try to construct an open ball around (0, 1) and see whether any open ball can be contained in E.


Theorem 13-1 (FONC for Local Max/Min): Let F : U ⊆ R^n → R be a C¹ function. Suppose that x* ∈ U is a local maximum or minimum of F in U. Suppose that
(i) U is open; or
(ii) x* ∈ int(U).
Then,

D¹F(x*) = 0^T,  i.e.,  ∂F/∂xi (x*) = 0 for all i.

It is important to keep in mind that (i) this condition is a necessary condition (D¹F(x*) = 0^T does not imply that x* is a local min or max; it simply means that x* is a critical point of F) and (ii) this necessary condition only works for interior points. (See the graph.)

What about second-order conditions? Recall that the second-order derivatives of F can be summarized in the Hessian of F:

    H = D²F = ( ∂²f/∂x1²      ∂²f/∂x2∂x1   ···  ∂²f/∂xn∂x1 )
              ( ∂²f/∂x1∂x2    ∂²f/∂x2²     ···  ∂²f/∂xn∂x2 )
              ( ...           ...          ...  ...        )
              ( ∂²f/∂x1∂xn    ∂²f/∂x2∂xn   ···  ∂²f/∂xn²   )

Recall also that the Hessian is symmetric. We have the following second-order conditions.

Theorem 13-2 (SOSC for Local Max/Min): Let F : U ⊆ R^n → R be a C² function. Suppose that U is open and that D¹F(x*) = 0^T.
(i) If the Hessian D²F(x*) is negative definite, then x* is a (strict) local maximum of F;
(ii) If the Hessian D²F(x*) is positive definite, then x* is a (strict) local minimum of F;
(iii) If the Hessian D²F(x*) is indefinite, then x* is neither a local maximum nor a local minimum of F.

Definition 13-1 (Saddle Point): A critical point x* of F for which the Hessian D²F(x*) is indefinite is called a saddle point of F.

A saddle point is a minimum of F in some directions and a maximum in other directions. It looks like a saddle, just like in Figure 16.4.

To prove Theorem 13-2, we use two of the important results learned in Weeks 1 and 2. Let's see a sketch of the proof, as it is a good review of the previous material. In Week 1, we learned Taylor approximation. Let's write a second-order approximation of F around the critical point x*. Let h be a change in x, so that x* + h represents an arbitrary point around x*. In a sufficiently small neighborhood of x*, therefore,

F(x* + h) ≈ F(x*) + D¹F(x*)h + (1/2) h^T D²F(x*) h

Because D¹F(x*) = 0, D¹F(x*)h = 0. So, we can rewrite this as:

F(x* + h) − F(x*) ≈ (1/2) h^T D²F(x*) h

We learned that, if D²F(x*) is negative definite, then for all h ≠ 0, h^T D²F(x*) h < 0. This means that, for all h ≠ 0,

F(x* + h) − F(x*) < 0   or   F(x* + h) < F(x*)

This means that, among all points around x*, F(x*) gives the highest value. In other words, F(x*) is a local maximum. We can easily adapt the argument to the positive definite case.

Example 2. Let’s work through a concrete problem. Consider a function:

F (x; y) = x3 y3 + 9xy

Let’s …nd a critical point …rst, using FOC.

F x   = 3x2 + 9y = 0   (1)

F y   =   3y2 + 9x = 0   (2)

Now, from (1),  y  = (1=3) x2. Substitute this into (2), and manipulate:

3 (1=3) x2

2+ 9x   =   (1=3) x4 + 9x = 0

or   x4 27x   = 0

or   x

x3 27

  = 0

So,  x = 0  or  3. Thus, the solution to this system is:   (x; y) = (0; 0) or  (3;3). Now, to determinewhich one is a local maximum or minimum, let’s check the Hessian.

H    =

  F xx   F xyF xy   F y

=

  6x   9

9   6y

At (x; y) = (0; 0),

H  =   0 9

9 0 To check the de…niteness of the Hessian, compute the leading principal minors,

det(H 1) = det(0) = 0

det(H 2) = det

  0 99 0

 = 81


In summary, we need either one of the following conditions:
(a) x* is a local maximum (minimum) of f and x* is the only critical point of f on a connected interval I; or
(b) f''(x) ≤ 0 for all x (for a maximum), or f''(x) ≥ 0 for all x (for a minimum).

As we will see in a moment, we have analogous conditions for global maxima of functions of several variables. Before discussing this further, we need to define several related concepts.

Definition 13-2 (Concave & Convex Functions): A function F : U ⊆ R^n → R is concave on U if and only if for all x, y ∈ U and for all t ∈ [0, 1], we have:

F[tx + (1 − t)y] ≥ tF(x) + (1 − t)F(y)

A function F : U ⊆ R^n → R is convex on U if and only if for all x, y ∈ U and for all t ∈ [0, 1], we have:

F[tx + (1 − t)y] ≤ tF(x) + (1 − t)F(y)

A function F : U ⊆ R^n → R is strictly concave on U if and only if for all x ≠ y ∈ U and for all t ∈ (0, 1), we have:

F[tx + (1 − t)y] > tF(x) + (1 − t)F(y)

A function F : U ⊆ R^n → R is strictly convex on U if and only if for all x ≠ y ∈ U and for all t ∈ (0, 1), we have:

F[tx + (1 − t)y] < tF(x) + (1 − t)F(y)

For functions of one variable, we can easily see what concavity and convexity mean graphically. (See the graphs.) Note that a linear function is both convex and concave. Now, we have a closely related concept for sets. You may find them a bit confusing, but it is very important that you are able to use them correctly. Believe me, you will see them everywhere in your prelim and coursework.

Definition 13-3 (Convex Sets): A set U ⊆ R^n is convex if and only if for all x, y ∈ U and for all t ∈ [0, 1],

tx + (1 − t)y ∈ U

There is no such thing as a concave set. We say that sets that are not convex are nonconvex. Moreover, note that, for functions of one variable, if f'' ≤ 0 the function is concave, and if f'' ≥ 0 the function is convex. Thus, condition (b) above translates into the statement: if a function is concave, then a critical point is a global maximum; if a function is convex, then a critical point is a global minimum. Now, we are ready to state the main theorem:

Theorem 13-4 (Sufficiency for Global Min/Max): Let F : U ⊆ R^n → R be a C² function, where U is a convex open subset of R^n.
(a) The following three conditions are equivalent:
(i) F is a concave function on U;
(ii) F(y) − F(x) ≤ D¹F(x)(y − x) for all x, y ∈ U;
(iii) D²F(x) is negative semidefinite for all x ∈ U.
(b) The following three conditions are equivalent:
(i) F is a convex function on U;
(ii) F(y) − F(x) ≥ D¹F(x)(y − x) for all x, y ∈ U;
(iii) D²F(x) is positive semidefinite for all x ∈ U.
(c) If F is a concave function on U and D¹F(x*) = 0 for some x* ∈ U, then x* is a global maximum of F on U.
(d) If F is a convex function on U and D¹F(x*) = 0 for some x* ∈ U, then x* is a global minimum of F on U.

Question: What do (ii) and (iii) mean for one-variable functions? Answer: For (ii), draw a picture: the graph of a concave F lies below its tangent lines. For (iii), D²F(x) = F''(x), so F is concave if and only if F'' ≤ 0 (and F'' < 0 gives strict concavity).
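A minimal SymPy sketch of Theorem 13-4(c), using my own example F(x, y) = −x² + xy − y² (the function is hypothetical, not from the note):

    import sympy as sp

    x, y = sp.symbols('x y', real=True)
    F = -x**2 + x*y - y**2

    crit = sp.solve([sp.diff(F, x), sp.diff(F, y)], (x, y), dict=True)   # [{x: 0, y: 0}]
    H = sp.hessian(F, (x, y))                                            # [[-2, 1], [1, -2]]
    minors = [H[0, 0], H.det()]                                          # [-2, 3]
    print(crit, minors)   # the leading minors alternate in sign for every (x, y), so the
                          # Hessian is ND (hence NSD) everywhere, F is concave, and by
                          # Theorem 13-4(c) the critical point (0, 0) is a global maximum.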

Note the difference between Theorems 13-2, 13-3, and 13-4.

For local maxima:

[H(x*) is ND and D¹F(x*) = 0]  ⟹  [x* is a local maximum]
[x* is a local maximum]  ⟹  [H(x*) is NSD and D¹F(x*) = 0]

For global maxima:

[H(x) is NSD for all x and D¹F(x*) = 0]  ⟹  [x* is a global maximum]

Example 4. Recall the problem in Example 2:

F(x, y) = x³ − y³ + 9xy

We had two critical points: (x, y) = (0, 0) or (3, −3). We concluded that at (0, 0) the Hessian is indefinite, so it is neither a local maximum nor a local minimum. On the other hand, we found that at (3, −3) the Hessian is positive definite, so it is a local minimum. Now, the question is whether or not this local minimum is also a global minimum. To find out, we need to check the Hessian at arbitrary points x:

    H = ( Fxx  Fxy )  =  ( 6x   9  )
        ( Fxy  Fyy )     ( 9   −6y )

We first need to check all leading principal minors:

det(H1) = 6x  (sign ambiguous)
det(H2) = −36xy − 81  (sign ambiguous)


So, the signs are not determinate for all (x, y), and we cannot conclude whether this is a global minimum or not.

Example 5. Consider an additively separable utility function:

F(x, y) = u(x) + v(y)

Suppose that both u(·) and v(·) are concave functions, so that u'' ≤ 0 and v'' ≤ 0. Question: Is the original utility function F then concave? Answer: To answer this, we need to check the Hessian at arbitrary points. Fxx = u'', Fxy = Fyx = 0, Fyy = v''. So, the Hessian is:

    H = ( Fxx  Fxy )  =  ( u''  0   )
        ( Fxy  Fyy )     ( 0    v'' )

For NSD, we need to check all principal minors. We have two 1st-order principal minors and one 2nd-order principal minor:

det(u'') = u'' ≤ 0,   det(v'') = v'' ≤ 0,   det ( u''  0   )  = u'' v'' ≥ 0
                                                ( 0    v'' )

So, they are consistent with NSD: F is a concave function. Question: What can you say if u(·) and v(·) are strictly concave? Answer: F is a strictly concave function.

Lastly, as a corollary to Theorem 13-4, we have the following result.

Theorem 13-5 (Uniqueness of Global Min/Max): Let F : U ⊆ R^n → R be a C² function, where U is a convex open subset of R^n.
(a) If F is a strictly concave function on U and D¹F(x*) = 0 for some x* ∈ U, then x* is the unique global maximum of F on U.
(b) If F is a strictly convex function on U and D¹F(x*) = 0 for some x* ∈ U, then x* is the unique global minimum of F on U.

To check strict concavity (convexity) of a function, we have the following characterization theorem (see Takayama, A., Mathematical Economics, 1985, pp. 125-126).

Theorem 13-6: Let F : U ⊆ R^n → R be a C² function, where U is a convex open subset of R^n.
(a) F is a strictly concave function on U if D²F(x) is negative definite for all x ∈ U.
(b) F is a strictly convex function on U if D²F(x) is positive definite for all x ∈ U.

Note that the converse may not be true (See an example).

14. Constrained Optimization I: First-Order Conditions

The objective of this and the next section is for us to be able to solve optimization problems of the following form:

max_{x ∈ R^n}  F(x1, x2, ..., xn)
subject to  g1(x1, x2, ..., xn) ≤ b1
            ...
            gk(x1, x2, ..., xn) ≤ bk
and         h1(x1, x2, ..., xn) = c1
            ...
            hm(x1, x2, ..., xn) = cm

That is, an optimization problem with n decision variables, k inequality constraints, and m equality constraints. Note that this form of optimization arises very naturally in economics. For example, a utility maximization problem is generally of the form:

max_{x ∈ R^n}  U(x1, x2, ..., xn)   s.t.   p·x ≤ M,  x1, ..., xn ≥ 0

The last n constraints are usually called nonnegativity constraints and are assumed to be there, whether explicitly stated or not, because a consumer can never consume a negative amount of a good (except in some special cases). Other common economic problems include profit maximization:

max_{x ∈ R^n}  p·y − w·x   s.t.   f1(x) ≥ y1, ..., fm(x) ≥ ym,  x1, ..., xn ≥ 0

and cost minimization:

min_{x ∈ R^n}  p·x   s.t.   U(x1, x2, ..., xn) ≥ Ū,  x1, ..., xn ≥ 0

14-1. Equality Constraints without Inequality Constraints

Let's first consider an optimization problem with equality constraints only. The method is called the Lagrangian method. You may have learned it somewhere in your calculus classes. Consider an objective function of two decision variables with one equality constraint:

max_{x ∈ R²}  f(x1, x2)   s.t.   h(x1, x2) = c   (1)

Then, we have the following result.

Result 14-1 (FONC for (1)): Suppose that in Problem (1) f and h are C¹ functions and that (x1*, x2*) is the solution. If D¹h(x1*, x2*) ≠ 0 (that is, (x1*, x2*) is not a critical point of h), then there exists a real number λ* such that (x1*, x2*, λ*) is a critical point of the following function:

L(x1, x2, λ) = f(x1, x2) + λ[c − h(x1, x2)]

Thus, by FONC, we must have:

D¹L(x1*, x2*, λ*) = 0


In other words,

∂L/∂x1 (x1*, x2*, λ*) = 0,   ∂L/∂x2 (x1*, x2*, λ*) = 0,   ∂L/∂λ (x1*, x2*, λ*) = 0

This is a very surprising result (at least to me). It basically says that we can turn constrained optimization into unconstrained optimization, and that just working through FONCs can give us (candidate) solutions. As we will see, this method works (with some modification) for inequality constraints as well. The function L is called a Lagrangian function and the miraculous variable λ is called a Lagrange multiplier. The condition that (x1*, x2*) is not a critical point of h is called a constraint qualification. As we get into more complicated problems, we will see various constraint qualifications. But in most economic problems this condition is automatically satisfied, because the constraint is likely to be a linear combination of variables:

a1 x1 + ... + an xn ≤ M

in which case we only need ai ≠ 0 for at least one i.

Now, let’s see why this result can be obtained geometrically. Consider Problem (1) again. Note…rst that the equality constraint forms a level set:

C  = f(x1; x2) : h(x1; x2) = cg

In two dimensional space, this set is represented by some line. Now, consider the objective,  f (x1; x2).We can also draw levels sets for each and every real number  a.

L(a) = f(x1; x2) : f (x1; x2) = ag

Suppose that an increase in  a  is represented by the moves of levels sets in the northeast direction.

Then, the maximum solution occurs exactly at the tangency of  L(a) and C . Recall Implicit FunctionTheorem, which says the slope of the constraint set is given by:

dx2

dx1=

@h=@x1

@h=@x2

This must be equal to the slope of the level set L(a) at this optimum, which is also given by ImplicitFunction Theorem:

dx2

dx1=

@f=@x1

@f=@x2

Thus, we have:

@h=@x1

@h=@x2(x) =

 @f=@x1

@f=@x2(x)

Or,

@f=@x1

@h=@x1(x) =

 @f=@x2

@h=@x2(x)


So, let λ* be the common value:

λ* = (∂f/∂x1)/(∂h/∂x1) (x*) = (∂f/∂x2)/(∂h/∂x2) (x*)

Rewrite these as:

λ* = (∂f/∂x1)/(∂h/∂x1) (x*)  ⟹  ∂f/∂x1 (x*) − λ* ∂h/∂x1 (x*) = 0
λ* = (∂f/∂x2)/(∂h/∂x2) (x*)  ⟹  ∂f/∂x2 (x*) − λ* ∂h/∂x2 (x*) = 0

Along with the original constraint,

c − h(x*) = 0

we have a system of three equations in three unknowns. Note that this system is exactly the set of FONCs for the Lagrangian:

∂L/∂x1 (x1*, x2*, λ*) = ∂f/∂x1 (x*) − λ* ∂h/∂x1 (x*) = 0
∂L/∂x2 (x1*, x2*, λ*) = ∂f/∂x2 (x*) − λ* ∂h/∂x2 (x*) = 0
∂L/∂λ (x1*, x2*, λ*) = c − h(x*) = 0

A couple of caveats apply in using this result. First, λ* cannot be defined at x* if ∂h/∂x1 (x*) = 0 or ∂h/∂x2 (x*) = 0, because the right-hand sides of the equation

λ* = (∂f/∂x1)/(∂h/∂x1) (x*) = (∂f/∂x2)/(∂h/∂x2) (x*)

are not well defined in such a case. Let's see why this can be a problem in a concrete example.

Example 1. Consider the problem:

max_{x ∈ R²}  f(x1, x2) = x1 x2   s.t.   h(x1, x2) = x1 = 1

Note that ∂h/∂x1 = 1 but ∂h/∂x2 = 0. Let's draw a picture. In this case, x1 is fixed at 1, but there is no constraint on x2 (in general, if ∂h/∂x2 = 0 for all x2, there is no constraint on x2). Thus, we can take any value of x2, the function f(x1, x2) = x1 x2 is unbounded under this constraint, and there can be no solution.

In general, when CQ is violated, there may or may not be a solution. Caution: this example is not meant as an example of CQ being violated. In fact, CQ is still satisfied here, because (∂h/∂x1, ∂h/∂x2) = (1, 0) ≠ (0, 0). Rather, it is an example illustrating what kind of problems you might encounter when CQ is violated. The following example is a case in which the constraint works just fine.

Example 2. Consider the problem:

min_{x ∈ R²}  f(x1, x2) = x1² + x2²   s.t.   h(x1, x2) = x1 = 1

The level sets of f(x1, x2) are obviously circles: the bigger the value of f, the bigger the circle. So, the minimum obviously occurs at x1 = 1 and x2 = 0, at which we have a tangency. (Note that if this were a maximization problem, then there would be no solution.) A small symbolic check of this example appears below.
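Here is the small symbolic check promised above (a sketch, not part of the original note), solving the FONCs of the Lagrangian for Example 2 with SymPy:

    import sympy as sp

    x1, x2, lam = sp.symbols('x1 x2 lam', real=True)
    L = x1**2 + x2**2 + lam*(1 - x1)          # Lagrangian in the form f + lam*(c - h)

    foncs = [sp.diff(L, v) for v in (x1, x2, lam)]   # 2*x1 - lam, 2*x2, 1 - x1
    print(sp.solve(foncs, (x1, x2, lam), dict=True))
    # [{x1: 1, x2: 0, lam: 2}] -- the tangency point (1, 0) found geometrically above.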

Another caveat is about how to set up the Lagrangian function. As you may be aware, (i) we can put + or − before the Lagrange multiplier, and (ii) we can use h(x1, x2) − c or c − h(x1, x2). Mathematically, to obtain a solution, it does not matter which of these (four in total) combinations you use in your Lagrangian setup. However, to get the right interpretation of the value of the Lagrange multiplier, it is sometimes important to set it up in the right manner. We will discuss this point later. For this reason, I advise that you use one of the following setups and make it a habit:

L(x1, x2, λ) = f(x1, x2) + λ[c − h(x1, x2)]   or   L(x1, x2, λ) = f(x1, x2) − λ[h(x1, x2) − c]

Lastly, we generalize this Lagrangian method to n variables and m equality constraints.

Theorem 14-2 (FONC for Equality Constraints): Consider the maximization or minimization problem of the form:

max_{x ∈ R^n}  F(x1, x2, ..., xn)
subject to  h1(x1, x2, ..., xn) = c1
            ...
            hm(x1, x2, ..., xn) = cm

Suppose that F, h1, ..., hm are C¹ functions and that x* = (x1*, x2*, ..., xn*) is a local maximum (or minimum) of F on the constraint set. If x* is not a critical point of h = (h1, ..., hm), then there exists a vector of Lagrange multipliers λ* = (λ1*, λ2*, ..., λm*) such that (x*, λ*) is a critical point of the Lagrangian function:

L(x, λ) = F(x) + λ·[c − h(x)]
        = F(x) + λ1[c1 − h1(x)] + ... + λm[cm − hm(x)]

Note that we haven't defined what it means to be a critical point of a vector-valued multivariate function h.


Definition 14-1 (Critical Point): Let h : R^n → R^m be a C¹ function. A point x is said to be a critical point of h if and only if the rank of its Jacobian matrix, rank(D¹h(x)), is < m.

14-2. Inequality Constraints without Equality Constraints

What if the constraints are inequality constraints? Things get a bit more complicated. To develop our intuition, let's consider a simple case of two variables and one inequality constraint:

max_{x ∈ R²}  f(x1, x2)   s.t.   g(x1, x2) ≤ b   (2)

For this problem, two cases can happen. Case 1: the constraint is binding at the optimum, i.e., g(x1*, x2*) = b. Case 2: the constraint is not binding, i.e., g(x1*, x2*) < b. In other words, the optimum occurs in the interior of the constraint set. Let's see this graphically. (See the Figure.)

Case 1. The maximum would have occurred outside the constraint set if there were no constraint. In this case, we can set the problem up as one with an equality constraint. Thus, the constrained problem is exactly the same as the one we saw in (1) before, and the FONCs would be:

∂f/∂x1 (x*) − λ* ∂g/∂x1 (x*) = 0   (3)
∂f/∂x2 (x*) − λ* ∂g/∂x2 (x*) = 0

Case 2. The maximum is exactly the same as if there were no constraint. In this case, we can set the problem up as an unconstrained optimization. The FONCs would be just:

∂f/∂x1 (x*) = 0   (4)
∂f/∂x2 (x*) = 0

So, we just need a convenient device that lets us deal with both cases. The following 'trick', called a complementary slackness condition, does the job:

λ*[b − g(x1*, x2*)] = 0

Let's examine what this does for us. First, note that (3) reduces to (4) when λ* = 0. Suppose that the constraint is binding. Then b − g(x1*, x2*) = 0, so λ* can be any value. Suppose that the constraint is not binding. Then b − g(x1*, x2*) > 0, so λ* must be zero. So, we have an ideal relationship:

Binding constraint  ⟹  ∂f/∂x1 (x*) − λ* ∂g/∂x1 (x*) = 0,  ∂f/∂x2 (x*) − λ* ∂g/∂x2 (x*) = 0
Non-binding constraint  ⟹  ∂f/∂x1 (x*) = 0,  ∂f/∂x2 (x*) = 0

Question: Can it happen that λ* = 0 and b − g(x1*, x2*) = 0? Answer: Yes. It is called a degenerate case. It rarely happens, but when it does, it means that the unconstrained optimum occurs exactly at the boundary of the constraint set. (See the Figure.) These conditions can be summarized in the following result.

Result 14-3 (FONC for (2)): Suppose that in Problem (2) f and g are C¹ functions and that (x1*, x2*) is the solution. Suppose that, if g(x1*, x2*) = b, then D¹g(x1*, x2*) ≠ 0 (that is, (x1*, x2*) is not a critical point of g). Then, there exists a real number λ* such that, for the Lagrangian function defined by

L(x1, x2, λ) = f(x1, x2) + λ[b − g(x1, x2)]

we have:

(i) ∂L/∂x1 (x1*, x2*, λ*) = 0;
(ii) ∂L/∂x2 (x1*, x2*, λ*) = 0;
(iii) g(x1*, x2*) ≤ b,  λ* ≥ 0,  λ*[b − g(x1*, x2*)] = 0.

I advise that you write the last condition (iii) as a whole set. This way, you are less likely to forget it when there are many mixed constraints. Note that nonnegativity of the Lagrange multiplier comes only if you set up the Lagrangian function correctly. For this problem, there are only two ways to set it up right:

L(x1, x2, λ) = f(x1, x2) + λ[b − g(x1, x2)]
L(x1, x2, λ) = f(x1, x2) − λ[g(x1, x2) − b]

Use + in front of λ if you put the RHS of the 'less than or equal' constraint first (i.e., b − g(x1, x2)), and use − in front of λ if you put the RHS of the 'less than or equal' constraint second (i.e., g(x1, x2) − b). Question: What if we have a resource constraint of the form b ≤ g(x1, x2)? Answer: The same rule applies, so the sign changes: use + in front of λ if you put the RHS of the constraint first (i.e., g(x1, x2) − b). That is, f(x1, x2) + λ[g(x1, x2) − b] or f(x1, x2) − λ[b − g(x1, x2)]. In any case, make a habit of using a correct rule, so that you don't have to 'think' while you are writing a Lagrangian in exams.

Now, we generalize this to an optimization problem with n decision variables and m inequality constraints.

Theorem 14-4 (FONC for Inequality Constraints): Consider the maximization problem of the form:

max_{x ∈ R^n}  F(x1, x2, ..., xn)
subject to  g1(x1, x2, ..., xn) ≤ b1
            ...
            gm(x1, x2, ..., xn) ≤ bm


Suppose that F, g1, ..., gm are C¹ functions and that x* = (x1*, x2*, ..., xn*) is a local maximum of F on the constraint set. Suppose that, when the first k constraints are binding at x*, x* is not a critical point of g_k = (g1, ..., gk); that is, the rank of the Jacobian

    D¹g_k(x*) = ( ∂g1/∂x1 (x*)  ···  ∂g1/∂xn (x*) )
                ( ...                ...           )
                ( ∂gk/∂x1 (x*)  ···  ∂gk/∂xn (x*) )

is k. Then, there exists a vector of Lagrange multipliers λ* = (λ1*, λ2*, ..., λm*) such that, for the Lagrangian function defined by

L(x, λ) = F(x) + λ·[b − g(x)]
        = F(x) + λ1[b1 − g1(x)] + ... + λm[bm − gm(x)]

we have:

(i) ∂L(x*, λ*)/∂xi = 0   for all i = 1, 2, ..., n;
(ii) gj(x*) ≤ bj,  λj* ≥ 0,  λj*[bj − gj(x*)] = 0   for all j = 1, 2, ..., m.

Example 3. Let’s consider a standard utility maximization problem:

maxx2R2

U (x1; x2)   s.t.   p1x1 +  p2x2  I 

Assume that  p1; p2 >  0, so that CQ is satis…ed. The Lagrangian is:

L = U (x1; x2) + [I  p1x1  p2x2]

FONCs are:

@ L

@x1=

  @U 

@x1 p1 = 0

@ L

@x2=

  @U 

@x2 p2 = 0

@ L@   =   I  p1x1  p2x2  0; 0; [I  p1x1  p2x2] = 0

Now, let’s impose another assumption on utility, called monotonicity. That is,   @U @x1> 0 or   @U 

@x2> 0.

This says that increasing consumption of at least one good would strictly increase her utility.Under such assumption,     cannot be zero, for otherwise, FONCs would become   @U 

@x1=   p1   = 0

or   @U @x2

=  p2   = 0. Thus,   >  0. But, then, from the slackness condition,   I    p1x1   p2x2   = 0.


This means that the consumer spends all of her income if her utility is monotonically increasing. This phenomenon is called Walras' Law.

In the above formulation, we omitted the non-negativity constraints x1, x2 ≥ 0. We should have the following problem:

max_{x ∈ R²}  U(x1, x2)   s.t.   p1 x1 + p2 x2 ≤ I,  x1 ≥ 0,  x2 ≥ 0

How can we incorporate these constraints? Notice that we can rewrite the non-negativity constraints as:

−x1 ≤ 0,  −x2 ≤ 0

Then, in terms of Theorem 14-4, b1 = 0, g1 = −x1, b2 = 0, g2 = −x2. Thus, the Lagrangian is:

L = U(x1, x2) + λ1[b1 − g1] + λ2[b2 − g2] + λ3[I − p1 x1 − p2 x2]
  = U(x1, x2) + λ1 x1 + λ2 x2 + λ3[I − p1 x1 − p2 x2]

The FONCs are:

∂L/∂x1 = ∂U/∂x1 + λ1 − λ3 p1 = 0,  x1 ≥ 0,  λ1 ≥ 0,  λ1 x1 = 0   (5)
∂L/∂x2 = ∂U/∂x2 + λ2 − λ3 p2 = 0,  x2 ≥ 0,  λ2 ≥ 0,  λ2 x2 = 0   (6)
∂L/∂λ3 = I − p1 x1 − p2 x2 ≥ 0,  λ3 ≥ 0,  λ3[I − p1 x1 − p2 x2] = 0   (7)

Now, note that in (5), if we have x1 > 0, then from the slackness condition λ1 = 0, which implies ∂U/∂x1 − λ3 p1 = 0. If x1 = 0, then λ1 may not be zero, λ1 ≥ 0, which implies:

∂U/∂x1 − λ3 p1 = −λ1 ≤ 0

We can do the same analysis for (6). In summary, we can simplify our FONCs using only one Lagrange multiplier:

∂U/∂x1 − λ p1 ≤ 0,  x1 ≥ 0,  and  ∂U/∂x1 − λ p1 = 0 if x1 > 0
∂U/∂x2 − λ p2 ≤ 0,  x2 ≥ 0,  and  ∂U/∂x2 − λ p2 = 0 if x2 > 0
I − p1 x1 − p2 x2 ≥ 0,  λ ≥ 0,  λ[I − p1 x1 − p2 x2] = 0

These are called the Kuhn-Tucker conditions. In essence, the Kuhn-Tucker formulation is a special case of Theorem 14-4 in which there are non-negativity constraints but the Lagrangian is formulated without the non-negativity constraints. A small numerical illustration follows below.
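The numerical illustration promised above: a sketch (my own example, not from the note) that solves a concrete utility-maximization problem with SciPy and then checks the Kuhn-Tucker conditions at the reported solution. The utility function U(x1, x2) = sqrt(x1) + sqrt(x2), prices (1, 2), and income 12 are assumptions chosen for illustration.

    import numpy as np
    from scipy.optimize import minimize

    p, I = np.array([1.0, 2.0]), 12.0
    U = lambda x: np.sqrt(x[0]) + np.sqrt(x[1])

    res = minimize(lambda x: -U(x),                      # maximize U = minimize -U
                   x0=np.array([1.0, 1.0]),
                   bounds=[(0, None), (0, None)],        # x1 >= 0, x2 >= 0
                   constraints=[{'type': 'ineq', 'fun': lambda x: I - p @ x}],
                   method='SLSQP')

    x = res.x                                            # roughly (8, 2)
    lam = (0.5 / np.sqrt(x[0])) / p[0]                   # lambda from dU/dx1 = lam*p1
    print(x, lam)
    print(np.isclose(0.5 / np.sqrt(x[1]), lam * p[1], atol=1e-3))   # dU/dx2 = lam*p2
    print(np.isclose(p @ x, I, atol=1e-5))               # the budget binds (Walras' Law)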


Theorem 14-5 (Kuhn-Tucker Theorem): Consider the maximization problem of the form:

max_{x ∈ R^n}  F(x1, x2, ..., xn)
subject to  g1(x1, x2, ..., xn) ≤ b1
            ...
            gm(x1, x2, ..., xn) ≤ bm
            x1 ≥ 0
            ...
            xn ≥ 0

Suppose that F, g1, ..., gm are C¹ functions and that x* = (x1*, x2*, ..., xn*) is a local maximum of F on the constraint set. Suppose that the modified Jacobian matrix

(∂gj/∂xi)_{ji}

has maximal rank, where the j's run over the indices of the gj that are binding at x*, and the i's range over the indices i for which xi* > 0. Then, there exists a vector of Lagrange multipliers λ* such that, for the Kuhn-Tucker Lagrangian defined by

L(x, λ) = F(x) + λ·[b − g(x)]
        = F(x) + λ1[b1 − g1(x)] + ... + λm[bm − gm(x)]

we have:

(i) ∂L(x*, λ*)/∂xi ≤ 0,  xi* ≥ 0,  xi* ∂L(x*, λ*)/∂xi = 0   for all i = 1, 2, ..., n;
(ii) gj(x*) ≤ bj,  λj* ≥ 0,  λj*[bj − gj(x*)] = 0   for all j = 1, 2, ..., m.

So, we have n slackness conditions for the non-negativity constraints and m slackness conditions for the regular constraints.

Lastly, I will state the theorem for the case of mixed constraints. It is a straightforward adaptation of Theorems 14-2 and 14-4.

Theorem 14-6 (FONC for Mixed Constraints): Consider the maximization problem of the form:

max_{x ∈ R^n}  F(x1, x2, ..., xn)
subject to  g1(x1, x2, ..., xn) ≤ b1
            ...
            gk(x1, x2, ..., xn) ≤ bk
            h1(x1, x2, ..., xn) = c1
            ...
            hm(x1, x2, ..., xn) = cm

Suppose that F, g1, ..., gk, h1, ..., hm are C¹ functions and that x* = (x1*, x2*, ..., xn*) is a local maximum of F on the constraint set. Without loss of generality, assume that the first k0 inequality constraints are binding at x*. Suppose that the rank of the Jacobian matrix of the equality constraints plus the binding inequality constraints,

    ( ∂g1/∂x1 (x*)    ···  ∂g1/∂xn (x*)    )
    ( ...                  ...              )
    ( ∂g_{k0}/∂x1 (x*) ··· ∂g_{k0}/∂xn (x*) )
    ( ∂h1/∂x1 (x*)    ···  ∂h1/∂xn (x*)    )
    ( ...                  ...              )
    ( ∂hm/∂x1 (x*)    ···  ∂hm/∂xn (x*)    )

is k0 + m. Then, there exists a vector of Lagrange multipliers λ* = (λ1*, λ2*, ..., λk*), μ* = (μ1*, μ2*, ..., μm*) such that, for the Lagrangian function defined by

L(x, λ, μ) = F(x) + λ·[b − g(x)] + μ·[c − h(x)]
           = F(x) + λ1[b1 − g1(x)] + ... + λk[bk − gk(x)] + μ1[c1 − h1(x)] + ... + μm[cm − hm(x)]

we have:

(i) ∂L(x*, λ*, μ*)/∂xi = 0   for all i = 1, 2, ..., n;
(ii) gj(x*) ≤ bj,  λj* ≥ 0,  λj*[bj − gj(x*)] = 0   for all j = 1, 2, ..., k;
(iii) hl(x*) = cl   for all l = 1, 2, ..., m.

Example 4. Let’s conclude this section with a couple of concrete examples. Consider an opti-mization problem with mixed constraints:

max F (x; y) = 3xy x2

21

8/11/2019 MathReview_3

http://slidepdf.com/reader/full/mathreview3 22/38

subject to   2x y   = 5   (1)

5x + 2y     8   (2)

x     0   (3)y     0   (4)

Let us use a Kuhn-Tucker formulation. So, the Lagrangian can be written:

L = 3xy x2 + 1[5 2x + y] + 2[5x + 2y 8]

First, we need to check a constraint quali…cation. The Jacobian of the system (1)-(2) (for Kuhn-Tucker) is:

  @g1=@ x @ g1=@y@g2=@ x @ g2=@y

 =

  2   15 2

The two vectors are linearly independent, so it has maximal rank=2, regardless of what the solutionsare. Thus, it satis…es CQ. Now, from KTNC, we have:

@ L

@x  = 3y 2x 21 + 52  0; x 0;   and x[2y 3x2 21 + 52] = 0   (5)

@ L

@y  = 3x + 1 + 22  0; y  0;   and y [3x + 1 + 22] = 0   (6)

@ L

@1= 5 2x + y = 0   (This must be always binding) (7)

@ L

@2

= 5x + 2y 8 0; 2   0;   and 2[5x + 2y 8] = 0   (8)

There may be a couple of possible cases. But, let’s eliminate impossible cases. Suppose  x  = 0.Then, from (7), we have  y  = 5 <  0, which contradicts  y   0. So,  x > 0. If  y  = 0, then from (7),we have x  = 5=2. We are done. Now, suppose that both  x > 0  and  y > 0. By slackness conditions,we know that:

3y 2x 21 + 52   = 0   (9)

3x + 1 + 22   = 0   (10)

Combining these with (7), we have three equations with four unknowns. We need to eliminate at

least one more variable. Fortunately, we can eliminate  2. To see this, suppose by contradictionthat 2 >  0. Then, it must be the case that  5x + 2y = 17 by slackness condition. Combining with(7),

5x + 2y   = 8

4x 2y   = 10


This gives us x = 2, y = −1 < 0. So, this cannot be a solution. Thus, we can set λ2 = 0. Then, the system (7), (9), and (10) reduces to a system of 3 equations with 3 unknowns:

2x − y = 5
3y − 2x − 2λ1 = 0
3x + λ1 = 0

With appropriate algebra, we should be able to solve this system. But, in fact, solving it leads to y = −2, which contradicts y > 0. Thus, the unique (candidate) solution for this problem is x = 5/2 and y = 0.

Example 5. Lastly, let’s consider the following utility maximization problem.

max U (x; y) = x2 + y2

subject to   2x + y     4   (1)

x     0   (2)

y     0   (3)

For this problem, it might be helpful to use a diagram. Draw several level sets for the objectivefunction. Each level set is a quarter piece of a circle in the positive orthant of  R2. So, from thediagram, it is clear that a solution is on the y-axis. The question is: Can we con…rm this using theKuhn-Tucker theorem? (BTW, you will be often asked to do both diagrammatic and mathematicalanalyses in a micro sequence). The Lagrangian is:

L = x2 + y2 + [4 2x y]

By FONCs, we have:

Lx   = 2x 2

  = 0   if  x > 0 0   if  x = 0

  (4)

Ly   = 2y

  = 0   if  y > 0 0   if  y = 0

  (5)

L   = 4 2x y

  = 0   if   > 0 0   if    = 0

  (6)

Note that U x = 2x 0 and  U y  = 2y  0 and it cannot happen that both  x = y = 0 (for example,(1; 1) is within the budget and  U (1; 1) > U (0; 0)). So, We have U x = 2x > 0  or  U y  = 2y > 0. Thus,by Walras law, the consumer must spend all of her resource. We can consider three possible casesof solutions (See the diagram).

Case 1:   x > 0; y > 0:Case 2:   x > 0; y = 0:

23

8/11/2019 MathReview_3

http://slidepdf.com/reader/full/mathreview3 24/38

Case 3:   x = 0; y > 0:

Consider each case.
Case 1: Suppose x > 0, y > 0. From (4), 2x − 2λ = 0, so that x = λ. From (5), 2y − λ = 0, so that 2y = λ. Thus, x = 2y. By assumption, x > 0 and y > 0, so that λ > 0. This implies that 4 − 2x − y = 0. Combining with x = 2y, we have 4 − 4y − y = 0. So, y = 4/5 and x = 8/5.

Case 2: Suppose x > 0, y = 0. From (4), 2x − 2λ = 0, so that x = λ. This implies λ > 0. So, from (6), 4 − 2x − y = 0. Substituting y = 0, we have x = 2.

Case 3: Suppose x = 0, y > 0. From (5), 2y − λ = 0, so that 2y = λ. This implies λ > 0. So, again from (6), 4 − 2x − y = 0. Substituting x = 0, we have y = 4.

So, we have obtained three candidate solutions directly from the KT conditions. Now substitute those candidates into the objective:

Case 1: U(x*, y*) = (8/5)² + (4/5)² = 64/25 + 16/25 = 80/25 = 16/5.
Case 2: U(x*, y*) = 2² + 0² = 4 = 20/5.
Case 3: U(x*, y*) = 0² + 4² = 16 = 80/5.

Obviously, x* = 0, y* = 4 gives the highest value of the objective, so it is the solution. The lesson here is: although KT conditions are useful, they are still only necessary conditions for optimality. We will learn second-order sufficient conditions in the next section. A quick numerical comparison of the three candidates appears below.
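The quick numerical comparison promised above (an illustrative sketch, not part of the note):

    candidates = [(8/5, 4/5), (2.0, 0.0), (0.0, 4.0)]      # Cases 1, 2, 3
    U = lambda x, y: x**2 + y**2

    feasible = [(x, y) for (x, y) in candidates
                if 2*x + y <= 4 + 1e-12 and x >= 0 and y >= 0]
    best = max(feasible, key=lambda p: U(*p))
    print({p: U(*p) for p in candidates})   # {(1.6, 0.8): 3.2, (2.0, 0.0): 4.0, (0.0, 4.0): 16.0}
    print(best)                              # (0.0, 4.0): the KT conditions are only necessary,
                                             # so the candidates must still be compared.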

15. Constrained Optimization II: Comparative Statics and SOCs

In this section, we will learn (i) comparative statics and (ii) second-order conditions that distinguish maxima from minima.

15-1. Comparative Statics and the Envelope Theorem

By comparative statics, we mean the sensitivity of (i) the optimal value of the objective and (ii) the optimal values of the decision variables with respect to changes in the primitive parameters of the problem. For example, consider a standard utility maximization:

max U(x, y)
subject to  px + y ≤ I,  x ≥ 0,  y ≥ 0

We have two primitive parameters: the relative price of a good, p, and income, I. Thus, in general, the solution to this problem will be written as:


x*(p, I)
y*(p, I)
v(p, I) = U(x*(p, I), y*(p, I))

as functions of the primitive parameters. As many of you know, v(p, I) is called an indirect utility function and x*(p, I), y*(p, I) are (individual) demand functions. Our interest here is: how would a change in these parameters change v(p, I), x*(p, I), y*(p, I)? More explicitly, what are:

∂x*(p, I)/∂I,   ∂x*(p, I)/∂p,   ∂v(p, I)/∂I,   ∂v(p, I)/∂p ?

In many economic applications, we have to work very hard to get these comparative statics. However, there are several useful theorems that we can employ in such an endeavor: a theorem on the shadow price and the Envelope Theorem. To get an idea, let's turn to a simple problem: one primitive parameter on one equality constraint.

parameter on one equality constraint.

max f(x, y)   (1)
subject to  g(x, y) = a

We can write the optimal value of this problem as:

f(x*(a), y*(a))

We can prove that:

λ(a) = (d/da) f(x*(a), y*(a))

where λ(a) is the Lagrange multiplier on the equality constraint. This says that the Lagrange multiplier measures the rate of change of the optimal value of f with respect to a marginal change in the resource constraint. Let's prove this result.

Proof: The Lagrangian for problem (1) is:

L = f(x, y) + λ[a − g(x, y)]

By the FONCs, we have:

∂L/∂x = ∂f(x*, y*)/∂x − λ ∂g(x*, y*)/∂x = 0  ⟹  ∂f(x*, y*)/∂x = λ ∂g(x*, y*)/∂x   (2)
∂L/∂y = ∂f(x*, y*)/∂y − λ ∂g(x*, y*)/∂y = 0  ⟹  ∂f(x*, y*)/∂y = λ ∂g(x*, y*)/∂y   (3)

Now, because g(x*(a), y*(a)) = a for all a, we can consider this as an identity. So, let's take the derivative of both sides of the identity with respect to a:


dg(x*(a), y*(a))/da = (d/da) a  ⟹  (∂g(x*, y*)/∂x)(dx*(a)/da) + (∂g(x*, y*)/∂y)(dy*(a)/da) = 1   (4)

Using the chain rule, we can compute:

(d/da) f(x*(a), y*(a)) = (∂f(x*, y*)/∂x)(dx*(a)/da) + (∂f(x*, y*)/∂y)(dy*(a)/da)

Substituting (2) and (3) first and then (4),

(d/da) f(x*(a), y*(a)) = (∂f(x*, y*)/∂x)(dx*(a)/da) + (∂f(x*, y*)/∂y)(dy*(a)/da)
                       = λ(a)(∂g(x*, y*)/∂x)(dx*(a)/da) + λ(a)(∂g(x*, y*)/∂y)(dy*(a)/da)
                       = λ(a)[(∂g(x*, y*)/∂x)(dx*(a)/da) + (∂g(x*, y*)/∂y)(dy*(a)/da)]
                       = λ(a)

Thus, we have obtained the desired result. QED

It is relatively straightforward to generalize this result to the case of several equality constraints and inequality constraints. I will state it without proof.

Theorem 15-1 (Shadow Price): Consider the following maximization problem:

max_{x ∈ R^n}  F(x1, x2, ..., xn)
subject to  g1(x1, x2, ..., xn) ≤ b1
            ...
            gk(x1, x2, ..., xn) ≤ bk
            h1(x1, x2, ..., xn) = c1
            ...
            hm(x1, x2, ..., xn) = cm

Suppose that F, g1, ..., gk, h1, ..., hm are C¹ functions and that x* = (x1*, x2*, ..., xn*) is a local maximum of F on the constraint set. Let λ*(b, c) = (λ1*, λ2*, ..., λk*), μ*(b, c) = (μ1*, μ2*, ..., μm*) be the corresponding Lagrange multipliers for the Lagrangian:

L(x, λ, μ) = F(x) + λ·[b − g(x)] + μ·[c − h(x)]
           = F(x) + λ1[b1 − g1(x)] + ... + λk[bk − gk(x)] + μ1[c1 − h1(x)] + ... + μm[cm − hm(x)]


Suppose that x*(b, c), λ*(b, c), μ*(b, c) are differentiable with respect to (b, c) and that the relevant CQ holds at x*(b, c). Then,

λj*(b, c) = ∂F(x*(b, c))/∂bj   for all j = 1, 2, ..., k
μl*(b, c) = ∂F(x*(b, c))/∂cl   for all l = 1, 2, ..., m

Thus, a Lagrange multiplier λj* measures the effect of a marginal change in input j on the objective value. In this view, economists often call λj* the shadow price (or imputed value) of input j.

Example 1. Consider again:

max U(x, y)
subject to  px + y ≤ I,  x ≥ 0,  y ≥ 0

Let's treat the price p as fixed, so we write the objective value as U(x*(I), y*(I)). Then, by the above theorem,

λ = dU(x*(I), y*(I))/dI

Thus, the Lagrange multiplier on the budget constraint measures the shadow price of income. A small numerical check of this appears below.
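The numerical check promised above. It uses my own closed-form example U(x, y) = ln x + ln y (an assumption, not from the note), for which the solution is x*(I) = I/(2p), y*(I) = I/2 with multiplier λ(I) = 2/I, and compares λ with a finite-difference estimate of dv/dI:

    import numpy as np

    p = 2.0
    v = lambda I: np.log(I / (2 * p)) + np.log(I / 2)   # indirect utility for U = ln x + ln y
    I0, h = 10.0, 1e-6
    dv_dI = (v(I0 + h) - v(I0 - h)) / (2 * h)            # central finite difference
    lam = 2.0 / I0                                        # multiplier lambda(I) = 2/I
    print(dv_dI, lam)                                     # both approximately 0.2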

Now, I will state and prove an easy version of the Envelope Theorem. Although some people may find it obvious, this is a very useful result and yet is sometimes not understood properly. The confusion comes mainly from the notation. For this reason, I slightly deviate from Simon & Blume's notation.

Theorem 15-2 (Envelope Theorem I): Let F : R^n × R^m → R be a continuously differentiable function. For each fixed vector of parameters θ ∈ R^m, consider the maximization problem:

max_{x ∈ R^n}  F(x; θ)

Let x*(θ) be the solution and v(θ) = max_{x ∈ R^n} F(x; θ) = F(x*(θ); θ) be the value function of this problem. If x*(θ) is a C¹ function, then:

∂v(θ)/∂θj = ∂F(x; θ)/∂θj |_{x = x*(θ)}   (#)

That is, the derivative of the value function with respect to a parameter is equal to the derivative of the objective function with respect to that parameter, evaluated at the optimum x*(θ).


Proof: We can prove this using the Chain Rule and the total differential:

∂v(θ)/∂θj = ∂F(x*(θ); θ)/∂θj
          = [ Σ_{i=1}^{n} (∂F(x; θ)/∂xi)(∂xi*(θ)/∂θj) + ∂F(x; θ)/∂θj ]_{x = x*(θ)}
          = ∂F(x; θ)/∂θj |_{x = x*(θ)}

where the last equality follows because ∂F(x*; θ)/∂xi = 0 at the optimum by the FONC. QED

Question: What's the point of this theorem? Answer: It is sometimes cumbersome to compute the LHS of (#). But, by this theorem, we can simply compute the RHS of (#). To see how it works, consider the following example.

Example 2.

max_x  −x² + 2ax + 4a²

where a is the primitive parameter of this problem. We are interested in the effect of a change in a on the optimal value. The direct approach is to compute the LHS of (#). Let's first derive the solution. By the FONC,

Fx = −2x + 2a = 0  ⟹  x* = a

Then, the value function is:

v(a) = −a² + 2a·a + 4a² = 5a²

Thus, its derivative is:

dv(a)/da = 10a

Now, let's use the Envelope Theorem. We can simply compute:

∂F/∂a = 2x + 8a

Evaluating this at the optimum x* = a, we obtain 2x* + 8a = 10a, which is exactly the same as before.
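A minimal SymPy sketch (not part of the note) verifying this calculation symbolically:

    import sympy as sp

    x, a = sp.symbols('x a', real=True)
    F = -x**2 + 2*a*x + 4*a**2

    x_star = sp.solve(sp.diff(F, x), x)[0]       # x*(a) = a
    v = sp.simplify(F.subs(x, x_star))           # value function v(a) = 5a^2
    direct = sp.diff(v, a)                       # LHS of (#): 10a
    envelope = sp.diff(F, a).subs(x, x_star)     # RHS of (#): 2x + 8a at x = a, i.e. 10a
    print(direct, envelope, sp.simplify(direct - envelope) == 0)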

A more surprising result is that we can generalize this theorem to parameters that appear in the constraints. That is,

Theorem 15-3 (Envelope Theorem II): Let F, G : R^n × R^m → R be continuously differentiable functions. For each fixed vector of parameters θ ∈ R^m, consider the maximization problem:

max_{x ∈ R^n}  F(x; θ)   subject to  G(x; θ) ≤ 0

(Note that some of the parameters may be in the objective while others may be in the constraint.) Let x*(θ) be the solution and v(θ) = F(x*(θ); θ) be the value function of this problem. Write the Lagrangian:

L = F(x; θ) − λ G(x; θ)

If x*(θ) and λ*(θ) are C¹ functions and the CQ conditions are satisfied, then

∂v(θ)/∂θj = ∂L(x, λ; θ)/∂θj |_{x = x*(θ), λ = λ*(θ)}

In fact, this theorem generalizes to the case of several constraints. Let's see how it works with a couple of examples.

Example 3. Consider a standard utility maximization again:

max U(x, y)
subject to  px + y ≤ I,  x ≥ 0,  y ≥ 0

Let v(p, I) = U(x*(p, I), y*(p, I)) be its indirect utility function and L = U(x, y) + λ[I − px − y]. Then, using the Envelope Theorem,

∂v(p, I)/∂I = ∂L/∂I |_{x*(p,I), y*(p,I), λ*(p,I)} = λ*(p, I)

This shows that the shadow price theorem is in fact a special case of the Envelope Theorem. The advantage of the Envelope Theorem, however, goes beyond that of the shadow price theorem. Let's take the derivative with respect to price:

∂v(p, I)/∂p = ∂L/∂p |_{x*(p,I), y*(p,I), λ*(p,I)} = −λ*(p, I) x*(p, I)

Substituting the result above for λ*(p, I), we get


(∂v(p, I)/∂p) / (∂v(p, I)/∂I) = −x*(p, I)

Or, in a more conventional notation,

− (∂v(p, I)/∂p) / (∂v(p, I)/∂I) = x*

In general, for the n-goods case,

− (∂v(p, I)/∂pi) / (∂v(p, I)/∂I) = xi*   for i = 1, 2, ..., n

This is called Roy's Identity, which says that the (Marshallian) demand for good i can be obtained as a quotient of two partial derivatives of the value function.

Example 4. Consider a …rm’s cost minimization problem:

min  w x

subject to   F (x) y

where   w   is a vector of input prices and   F   is a production function. Let   C (w) =   w x andL = w x+[F (x) y]. (Note  minw x = maxw x). By Envelope Theorem,

@C (w)

@wi=  

  @ L

@wi

x;

=   xi   for i = 1; 2;:::n

This is called  Shephard’s Lemma, which says that input demand for good  i  can be obtained bya derivative of the cost function evaluated at optimum.

15-2. Second Order Conditions

Up to now, we have only looked at first-order necessary conditions. But, as we know, FONCs do not guarantee a solution. We need second-order conditions. Recall from Section 13 that we need to check the definiteness of the Hessian matrix for sufficient conditions:

Theorem 13-2 (SOSC for Local Max/Min): Let F : U ⊆ R^n → R be a C² function. Suppose that U is open and that D¹F(x*) = 0^T.
(i) If the Hessian D²F(x*) is negative definite, then x* is a (strict) local maximum of F;
(ii) If the Hessian D²F(x*) is positive definite, then x* is a (strict) local minimum of F;
(iii) If the Hessian D²F(x*) is indefinite, then x* is neither a local maximum nor a local minimum of F.

But this theorem is for unconstrained optimization. What if we have constraints? Intuitively, all we need is to study the definiteness of the Hessian on the constraint space. How can we do that? Recall that we studied a very simple case of constrained optimization with a quadratic function in Section 12. We constructed a bordered matrix of the form:

    H_{n+1} = ( 0    c )
              ( c^T  A )

where c is the vector of coefficients of the linear constraint c·x = 0. We basically combine these methods together.

Theorem 15-4 (SOSC for Constrained Local Max/Min): Let F, h1, ..., hm be C² functions on R^n. Consider the problem:

max_{x ∈ R^n}  F(x1, x2, ..., xn)
subject to  h1(x1, x2, ..., xn) = c1
            ...
            hm(x1, x2, ..., xn) = cm

Form the Lagrangian as usual, L = F(x) + λ·[c − h(x)], and let (x*, λ*) satisfy the FONCs. Suppose that "the Hessian of L with respect to x at (x*, λ*), D²_x L(x*, λ*), is negative definite on the linear constraint set {v : D¹h(x*)v = 0}"; then x* is a strict local constrained maximum of F.

What does the condition in quotation marks mean? It means that the following bordered Hessian matrix

$$\bar{H}_{m+n} = \begin{pmatrix} \underset{m\times m}{\mathbf{0}} & \underset{m\times n}{D^1h(x^*)} \\[4pt] \underset{n\times m}{D^1h(x^*)^T} & \underset{n\times n}{D^2_x L(x^*,\lambda^*)} \end{pmatrix}
= \begin{pmatrix}
0 & \cdots & 0 & \frac{\partial h_1}{\partial x_1} & \cdots & \frac{\partial h_1}{\partial x_n} \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & \frac{\partial h_m}{\partial x_1} & \cdots & \frac{\partial h_m}{\partial x_n} \\
\frac{\partial h_1}{\partial x_1} & \cdots & \frac{\partial h_m}{\partial x_1} & \frac{\partial^2 L}{\partial x_1^2} & \cdots & \frac{\partial^2 L}{\partial x_n\partial x_1} \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
\frac{\partial h_1}{\partial x_n} & \cdots & \frac{\partial h_m}{\partial x_n} & \frac{\partial^2 L}{\partial x_1\partial x_n} & \cdots & \frac{\partial^2 L}{\partial x_n^2}
\end{pmatrix}$$

satisfies two conditions: (i) the last $(n-m)$ leading principal minors alternate in sign, and (ii) $\det(\bar{H}) > 0$ if $n$ is even, or $\det(\bar{H}) < 0$ if $n$ is odd.
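Since checking these sign conditions by hand is tedious, a rough numerical helper can do it at a candidate point. The sketch below is my own construction: the toy problem ($\max xy$ subject to $x+y = 10$) and the helper code are assumptions for illustration, not taken from the note.

```python
# Build the bordered Hessian at the candidate (x*, y*, lambda*) = (5, 5, 5) for
# max F(x, y) = x*y  s.t.  x + y = 10, and inspect the relevant minors.
import numpy as np

m, n = 1, 2                      # one constraint, two variables
Dh = np.array([[1.0, 1.0]])      # D^1 h(x*) for h(x, y) = x + y
D2L = np.array([[0.0, 1.0],      # D^2_x L(x*, lambda*) for L = x*y + lam*(10 - x - y)
                [1.0, 0.0]])
H = np.block([[np.zeros((m, m)), Dh],
              [Dh.T, D2L]])

# Last (n - m) leading principal minors: orders 2m+1, ..., m+n
minors = [np.linalg.det(H[:k, :k]) for k in range(2 * m + 1, m + n + 1)]
print(minors)                          # here: [2.0]
print(np.sign(minors[-1]) == (-1)**n)  # True: consistent with a local max
```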

There are related theorems in Simon & Blume, pp. 460-467. As you can see, it is very cumbersome to check second order conditions (even with a computer, writing the code is tedious). Moreover, this second order condition cannot guarantee a global solution.


So, what we usually do is to study properties of the objective and constraint functions under which global solutions are guaranteed. This method is called concave programming.

Example 5. Consider a standard utility maximization again:

$$\max\; U(x,y) \quad\text{subject to}\quad p_1 x + p_2 y = h(x,y) = I$$

Let’s construct a bordered Hessian.

$$\bar{H} = \begin{pmatrix} 0 & h_1 & h_2 \\ h_1 & L_{xx} & L_{yx} \\ h_2 & L_{xy} & L_{yy} \end{pmatrix} \quad\Longrightarrow\quad \bar{H} = \begin{pmatrix} 0 & p_1 & p_2 \\ p_1 & U_{xx} & U_{yx} \\ p_2 & U_{xy} & U_{yy} \end{pmatrix}$$

(here $h_1 = p_1$, $h_2 = p_2$, and, since the constraint is linear, the second derivatives of $L$ with respect to $x$ and $y$ coincide with those of $U$).

We want to have:

$$\det(\bar{H}) = \det\begin{pmatrix} 0 & p_1 & p_2 \\ p_1 & U_{xx} & U_{yx} \\ p_2 & U_{xy} & U_{yy} \end{pmatrix} > 0$$

It turns out that $\det(\bar{H}) > 0$ if $U$ is quasiconcave, a concept we study in the next section.

16. Concave and Quasiconcave Functions

In economic applications, you will often encounter concave and quasiconcave functions. It is essential that you familiarize yourself with the properties of these functions. In general,

(i) For a global maximizer of an unconstrained maximization problem, we want the objective function to be concave.
(ii) For a global maximizer of a constrained maximization problem, we want the objective function to be quasiconcave and the constraint functions to be quasiconvex.

The objective of this section is three-fold:
(a) To learn properties of concave and convex functions;
(b) To learn properties of quasiconcave and quasiconvex functions;
(c) To learn concave programming.

16-1. Concave and Convex Functions 

First, let's recall the definitions of concave and convex functions.

Definition 13-2 (Concave & Convex Functions): A function $F : U \subset \mathbb{R}^n \to \mathbb{R}$ is concave on $U$ if and only if for all $x, y \in U$ and all $\lambda \in [0,1]$, we have:

$$F[\lambda x + (1-\lambda)y] \ge \lambda F(x) + (1-\lambda)F(y)$$

A function $F : U \subset \mathbb{R}^n \to \mathbb{R}$ is convex on $U$ if and only if for all $x, y \in U$ and all $\lambda \in [0,1]$, we have:


$$F[\lambda x + (1-\lambda)y] \le \lambda F(x) + (1-\lambda)F(y)$$

A function $F : U \subset \mathbb{R}^n \to \mathbb{R}$ is strictly concave on $U$ if and only if for all $x \ne y \in U$ and all $\lambda \in (0,1)$, we have:

$$F[\lambda x + (1-\lambda)y] > \lambda F(x) + (1-\lambda)F(y)$$

A function $F : U \subset \mathbb{R}^n \to \mathbb{R}$ is strictly convex on $U$ if and only if for all $x \ne y \in U$ and all $\lambda \in (0,1)$, we have:

$$F[\lambda x + (1-\lambda)y] < \lambda F(x) + (1-\lambda)F(y)$$

We have the following characterizations of concave (convex) functions.

Theorem 16-1 (Properties of Concave Functions): Let $F : U \subset \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function and $U$ a convex open subset of $\mathbb{R}^n$.

(a) $F$ is a concave function on $U$ if and only if:
(i) $F(y) - F(x) \le D^1F(x)(y - x)$ for all $x, y \in U$;
(ii) $D^2F(x)$ is negative semidefinite for all $x \in U$;
(iii) $-F$ is convex on $U$;
(iv) $\alpha F$ is concave for all $\alpha \ge 0$;
(v) for $i = 1, \ldots, n$, $F(x_1, \ldots, x_i, \ldots, x_n)$ is concave in $x_i$ for each fixed $(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)$;
(vi) its restriction to every line segment in $U$ is a concave function.

(b) Let $F_1, F_2, \ldots, F_m$ be concave functions and let $\alpha_1, \alpha_2, \ldots, \alpha_m$ be positive numbers. Then the linear combination $\sum_{j=1}^{m} \alpha_j F_j$ is a concave function.

(c) If $F$ is a concave function on $U$, then for every $x^* \in U$ the upper contour set

$$C^+(x^*) = \{x \in U : F(x) \ge F(x^*)\}$$

is a convex set. (If $F$ is convex, then the lower contour set is a convex set.) The converse is not true in general.

We saw (a)-(i), (ii), and (iii) already, and (a)-(iv) is obvious. The implication of (v) and (vi) is very important: it says that we can determine the concavity of a function just by looking at the geometric features of its graph. In particular, let $F$ be a function from $\mathbb{R}^2$ to $\mathbb{R}$. Then (vi) means that if one slices the graph of $F$ along any line segment in $\mathbb{R}^2$, one should see the graph of a one-variable concave function. Question: Can you write down the analogous theorem for convex functions?

Example 1. In dynamic programming, we often see the following time-separable utility function for an infinite sequence of consumption $c = \{c_t\}_{t=0}^{\infty}$:

$$U(c) = \sum_{t=0}^{\infty} \beta^t u_t(c_t)$$

where $\beta$ is a discount factor. If each $u_t(\cdot)$ is concave, then $U(c)$ is also concave by (b), as long as the infinite sum is well-defined.
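As a quick numerical sanity check of this claim, here is a sketch of my own (with a truncated horizon standing in for the infinite sum, and log period utility as an assumed example): it verifies the concavity inequality at randomly drawn consumption sequences.

```python
# Check concavity of U(c) = sum_t beta**t * log(c_t) over a finite horizon T.
import numpy as np

rng = np.random.default_rng(0)
beta, T = 0.95, 50

def U(c):
    t = np.arange(T)
    return np.sum(beta**t * np.log(c))

for _ in range(1000):
    c1, c2 = rng.uniform(0.1, 10.0, T), rng.uniform(0.1, 10.0, T)
    lam = rng.uniform()
    mix = lam * c1 + (1 - lam) * c2
    # Concavity: U(lam*c1 + (1-lam)*c2) >= lam*U(c1) + (1-lam)*U(c2)
    assert U(mix) >= lam * U(c1) + (1 - lam) * U(c2) - 1e-9
print("no violations found")
```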


Example 2. Consider a Cobb-Douglas function:

$$F(x,y) = Ax^a y^b \quad\text{where } A, a, b > 0 \text{ and } a + b \le 1$$

Question: Is this concave or convex? Answer: It is a concave function on $\mathbb{R}^2_{++}$. To see this, we only need to show that $x^a y^b$ is concave. Let's construct the Hessian:

$$D^2F = H = \begin{pmatrix} a(a-1)x^{a-2}y^b & abx^{a-1}y^{b-1} \\ abx^{a-1}y^{b-1} & b(b-1)x^a y^{b-2} \end{pmatrix}$$

$$\det(H_1) = a(a-1)x^{a-2}y^b < 0, \quad \text{because } a - 1 < 0$$
$$\det(H) = ab(1-a-b)x^{2a-2}y^{2b-2} \ge 0, \quad \text{because } a + b \le 1$$

So, under these assumptions, the Hessian is negative semidefinite, and hence $F$ is concave. In fact, if $a + b = 1$, $F$ is concave; if $a + b < 1$, $F$ is strictly concave.
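The same Hessian computation can be delegated to a computer algebra system. The short sketch below is my own illustration (not from the note) and reproduces the two leading principal minors symbolically.

```python
# Symbolic Hessian of F(x, y) = x**a * y**b on the positive orthant.
import sympy as sp

x, y, a, b = sp.symbols('x y a b', positive=True)
F = x**a * y**b
H = sp.hessian(F, (x, y))

H1 = H[0, 0]                 # first leading principal minor
detH = sp.factor(H.det())    # determinant of the full Hessian
print(sp.factor(H1))         # should equal a*(a - 1)*x**(a - 2)*y**b
print(detH)                  # should equal a*b*(1 - a - b)*x**(2*a - 2)*y**(2*b - 2)
# H1 < 0 for a < 1 and det(H) >= 0 for a + b <= 1, matching the text above.
```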

Example 3. We can show that an expenditure function is concave in prices. Consider a standard expenditure minimization problem.

$$\min_{x\in\mathbb{R}^n_+}\; p\cdot x \quad\text{subject to}\quad U(x) \ge \bar{U}$$

Let $e(p, \bar{U}) = \min_{x\in\mathbb{R}^n_+}\{p\cdot x : U(x) \ge \bar{U}\}$ be the expenditure function. We want to show that $e(\cdot, \bar{U})$ is a concave function of $p$ for each fixed $\bar{U}$. Now, pick arbitrary price vectors $p_1, p_2$ and $\lambda \in [0,1]$. Let $p_\lambda = \lambda p_1 + (1-\lambda)p_2$ and let $x^*(p_1), x^*(p_2), x^*(p_\lambda)$ be the solutions to the corresponding minimization problems. By definition,

$$e(p_1, \bar{U}) = \min_{x\in\mathbb{R}^n_+}\{p_1\cdot x : U(x) \ge \bar{U}\} = p_1\cdot x^*(p_1) \le p_1\cdot x^*(p_\lambda), \quad (1)$$

because $x^*(p_\lambda)$ is feasible but not necessarily the cost-minimizing bundle at prices $p_1$.

Similarly,

$$e(p_2, \bar{U}) = p_2\cdot x^*(p_2) \le p_2\cdot x^*(p_\lambda), \quad (2)$$

because $x^*(p_\lambda)$ is feasible but not necessarily the cost-minimizing bundle at prices $p_2$.

Taking the convex combination,

$$\lambda e(p_1, \bar{U}) + (1-\lambda)e(p_2, \bar{U}) = \lambda p_1\cdot x^*(p_1) + (1-\lambda)p_2\cdot x^*(p_2) \le \lambda p_1\cdot x^*(p_\lambda) + (1-\lambda)p_2\cdot x^*(p_\lambda) \quad \text{by (1) and (2)}$$
$$= [\lambda p_1 + (1-\lambda)p_2]\cdot x^*(p_\lambda) = p_\lambda\cdot x^*(p_\lambda) = e(p_\lambda, \bar{U}) \quad \text{by definition.}$$

Using a similar argument, we can show that a profit function is convex in prices. Try it yourself.
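The concavity of the expenditure function can also be illustrated numerically. The brute-force sketch below is my own construction: the utility $U(x_1,x_2)=\sqrt{x_1 x_2}$, the grid approximation, and the sample price vectors are all assumptions for illustration.

```python
# Approximate e(p, Ubar) by grid search and check the concavity inequality along
# a segment of price vectors.
import numpy as np

Ubar = 2.0
grid = np.linspace(0.01, 20.0, 400)
X1, X2 = np.meshgrid(grid, grid)
feasible = np.sqrt(X1 * X2) >= Ubar      # constraint set {U(x) >= Ubar} on the grid

def e(p):
    # cheapest feasible bundle on the grid at prices p
    return np.min((p[0] * X1 + p[1] * X2)[feasible])

p1, p2 = np.array([1.0, 3.0]), np.array([4.0, 0.5])
e1, e2 = e(p1), e(p2)
for lam in np.linspace(0.0, 1.0, 11):
    # e(lam*p1 + (1-lam)*p2) >= lam*e(p1) + (1-lam)*e(p2)
    assert e(lam * p1 + (1 - lam) * p2) >= lam * e1 + (1 - lam) * e2 - 1e-9
print("concavity inequality holds at all tested points")
```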


16-2. Quasiconcave and Quasiconvex Functions 

Let's first define what is meant by a quasiconcave or quasiconvex function.

Definition 16-1 (Quasiconcave & Quasiconvex Functions):
(i) A function $F : U \subset \mathbb{R}^n \to \mathbb{R}$ is quasiconcave on $U$, where $U$ is a convex set in $\mathbb{R}^n$, if and only if for all $x, y \in U$ and all $\lambda \in [0,1]$, we have:

$$F(x) \ge F(y) \implies F(\lambda x + (1-\lambda)y) \ge F(y)$$

(ii) A function $F : U \subset \mathbb{R}^n \to \mathbb{R}$ is quasiconvex on $U$, where $U$ is a convex set in $\mathbb{R}^n$, if and only if for all $x, y \in U$ and all $\lambda \in [0,1]$, we have:

$$F(x) \le F(y) \implies F(\lambda x + (1-\lambda)y) \le F(y)$$

(iii) A function $F : U \subset \mathbb{R}^n \to \mathbb{R}$ is strictly quasiconcave on $U$, where $U$ is a convex set in $\mathbb{R}^n$, if and only if for all $x \ne y \in U$ and all $\lambda \in (0,1)$, we have:

$$F(x) \ge F(y) \implies F(\lambda x + (1-\lambda)y) > F(y)$$

(iv) A function $F : U \subset \mathbb{R}^n \to \mathbb{R}$ is strictly quasiconvex on $U$, where $U$ is a convex set in $\mathbb{R}^n$, if and only if for all $x \ne y \in U$ and all $\lambda \in (0,1)$, we have:

$$F(x) \le F(y) \implies F(\lambda x + (1-\lambda)y) < F(y)$$

What does a quasiconcave (quasiconvex) function look like? (See the graphs.) Are these functions quasiconcave?

Obviously, quasiconcavity is a geometric characterization. So, we have the following equivalence theorem.

Theorem 16-2 (Properties of Quasiconcave Functions): Let $F : U \subset \mathbb{R}^n \to \mathbb{R}$ be defined on a convex set $U$. Then the following are equivalent:
(a) $F$ is quasiconcave on $U$;
(b) for every real number $a$, the upper contour set

$$C^+(a) = \{x \in U : F(x) \ge a\}$$

is a convex set in $U$;
(c) for all $x, y \in U$ and all $\lambda \in [0,1]$,

$$F(\lambda x + (1-\lambda)y) \ge \min\{F(x), F(y)\};$$

(d) $-F$ is quasiconvex on $U$.

Moreover, in view of Theorem 16-1 (c), we have:


$$F \text{ is concave} \implies F \text{ is quasiconcave}$$
$$F \text{ is strictly concave} \implies F \text{ is strictly quasiconcave}$$

Theorem 16-3 (Cobb-Douglas Functions): Any Cobb-Douglas function $F(x,y) = Ax^a y^b$ is quasiconcave for $A, a, b > 0$.

Question: How can we check whether a function $F$ is quasiconcave on $\mathbb{R}$ or $\mathbb{R}^2$? Answer: If $F : \mathbb{R} \to \mathbb{R}$ is monotone increasing, it is quasiconcave. To see this, suppose $F(x) \ge F(y)$. Then it must be that $x \ge y$, for otherwise $x < y$ would imply $F(x) < F(y)$, a contradiction. But then the convex combination satisfies $\lambda x + (1-\lambda)y \ge y$, so by monotonicity $F(\lambda x + (1-\lambda)y) \ge F(y)$. Now, if $F : \mathbb{R}^2 \to \mathbb{R}$, we use the fact that the upper contour sets of a quasiconcave function must be convex sets. So, follow these steps:

(i) Set $F(x,y) = c$ for some arbitrary constant $c$.
(ii) Solve for $y$ to get the level curve $y = g(x; c)$.
(iii) Check whether $g$ is convex. Recall that $g$ is convex if and only if $g'' \ge 0$.

Example 4. Let's do this for the Cobb-Douglas function. Set $F(x,y) = Ax^a y^b = c$. Solving for $y$, $y = (c/(Ax^a))^{1/b} = (c/A)^{1/b}x^{-a/b}$. Now take derivatives: $y' = -(a/b)(c/A)^{1/b}x^{-a/b-1} \le 0$ and $y'' = (a/b)(a/b+1)(c/A)^{1/b}x^{-a/b-2} \ge 0$ if $A, a, b > 0$ (note that $F$ cannot take negative values for $(x,y) \in \mathbb{R}^2_+$, so $c \ge 0$). So the level curve is convex, and $F$ is quasiconcave.
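The same three-step check can be carried out symbolically. The short sketch below is my own illustration of the procedure, not part of the note.

```python
# Level-curve check of quasiconcavity for F(x, y) = A*x**a*y**b.
import sympy as sp

x, c, A, a, b = sp.symbols('x c A a b', positive=True)

# Steps (i)-(ii): set A*x**a*y**b = c and solve for y, giving the level curve y = g(x; c)
g = (c / (A * x**a))**(1 / b)

# Step (iii): check convexity of g via its second derivative
g2 = sp.simplify(sp.diff(g, x, 2))
print(g2)
# Mathematically, g'' = (a/b)*(a/b + 1)*(c/A)**(1/b)*x**(-a/b - 2) >= 0 for
# A, a, b, c > 0, so the level curve is convex and F is quasiconcave.
```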

If $F : \mathbb{R}^n \to \mathbb{R}$ where $n \ge 2$, then we need to invoke the following theorem to check the quasiconcavity of $F$.

Theorem 16-4 (Bordered Hessian Test): Let $F : U \subset \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function on a convex set $U$. Consider the following bordered Hessian:

$$\bar{H} = \begin{pmatrix} 0 & D^1F(x) \\ D^1F(x)^T & D^2F(x) \end{pmatrix}_{(n+1)\times(n+1)}
= \begin{pmatrix}
0 & F_{x_1} & \cdots & F_{x_n} \\
F_{x_1} & F_{x_1x_1} & \cdots & F_{x_1x_n} \\
\vdots & \vdots & \ddots & \vdots \\
F_{x_n} & F_{x_nx_1} & \cdots & F_{x_nx_n}
\end{pmatrix}$$

(a) Starting with the 3rd leading principal minor, if the last $(n-1)$ leading principal minors of $\bar{H}$ alternate in sign, and the sign starts with a positive sign, for all $x \in U$, then $F$ is quasiconcave.*
(b) Starting with the 3rd leading principal minor, if those $(n-1)$ leading principal minors of $\bar{H}$ are all negative for all $x \in U$, then $F$ is quasiconvex.

*Remark: Some textbooks start with the 2nd LPM, instead of the 3rd LPM, and its sign must be negative.


Example 5. Let's use this for $F(x,y) = xy$, a special case of Cobb-Douglas. We have $F_x = y$, $F_y = x$, $F_{xx} = 0$, $F_{xy} = 1$, $F_{yy} = 0$. So, the bordered Hessian is:

$$\bar{H} = \begin{pmatrix} 0 & y & x \\ y & 0 & 1 \\ x & 1 & 0 \end{pmatrix}$$

We start with the third leading principal minor, which here means $\det(\bar{H})$.

$$\det(\bar{H}) = 0\begin{vmatrix} 0 & 1 \\ 1 & 0 \end{vmatrix} - y\begin{vmatrix} y & 1 \\ x & 0 \end{vmatrix} + x\begin{vmatrix} y & 0 \\ x & 1 \end{vmatrix} = xy + xy = 2xy > 0 \quad \text{for all } x, y > 0$$

Thus, $F(x,y) = xy$ is quasiconcave on $\mathbb{R}^2_{++}$.
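For higher dimensions, or when the algebra gets messy, the same test can be run numerically at sample points. The helper below is my own sketch (the function `bordered_hessian` is an assumption for illustration, not an existing library routine).

```python
# Numerically inspect the bordered Hessian of F(x, y) = x*y at sample points.
import numpy as np

def bordered_hessian(grad, hess):
    n = len(grad)
    H = np.zeros((n + 1, n + 1))
    H[0, 1:] = grad
    H[1:, 0] = grad
    H[1:, 1:] = hess
    return H

# F(x, y) = x*y: gradient (y, x) and Hessian [[0, 1], [1, 0]]
for x, y in [(1.0, 2.0), (0.5, 3.0), (4.0, 0.25)]:
    H = bordered_hessian(np.array([y, x]), np.array([[0.0, 1.0], [1.0, 0.0]]))
    # 3rd leading principal minor = det of the full 3x3 bordered Hessian
    print(np.linalg.det(H))   # equals 2*x*y > 0, consistent with quasiconcavity
```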

16-3. Concave Programming 

Concave programming is the name given to a class of techniques and theorems for solving programming problems with concave or quasiconcave functions. Although it is a very rich field with many analytically important theorems, as applied economists we probably need to know just two of them, at least for this lecture. We have already seen the first one.

Theorem 13-4 (Sufficiency for Unconstrained Global Min/Max): Let $F : U \subset \mathbb{R}^n \to \mathbb{R}$ be a $C^1$ function and $U$ a convex and open subset of $\mathbb{R}^n$.
(c) If $F$ is a concave function on $U$ and $D^1F(x^*) = \mathbf{0}$ for some $x^* \in U$, then $x^*$ is a global maximum of $F$ on $U$.
(d) If $F$ is a convex function on $U$ and $D^1F(x^*) = \mathbf{0}$ for some $x^* \in U$, then $x^*$ is a global minimum of $F$ on $U$.
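A tiny concrete instance (my own, with an assumed objective $F(x,y) = -(x-1)^2 - (y-2)^2$) shows how the theorem is used in practice.

```python
# For a concave F, the critical point from the FONCs is a global maximum.
import sympy as sp

x, y = sp.symbols('x y', real=True)
F = -(x - 1)**2 - (y - 2)**2          # concave on all of R^2

crit = sp.solve([sp.diff(F, x), sp.diff(F, y)], [x, y])
print(crit)                            # {x: 1, y: 2}
print(sp.hessian(F, (x, y)))           # diag(-2, -2): negative definite everywhere
# Hence F is concave and, by Theorem 13-4(c), (1, 2) is the global maximum.
```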

Question: Why is it essential to have $U$ convex and open? Answer: If $U$ is not convex, then concavity or convexity of $F$ on $U$ is not well defined (because the convex combination $\lambda x + (1-\lambda)y$ need not lie in $U$). If $U$ is not open, then there may be a boundary solution at which the FONCs fail.

Theorem 16-5 (Sufficiency for Constrained Global Min/Max): Let $F, g_1, \ldots, g_m : U \subset \mathbb{R}^n \to \mathbb{R}$ be $C^1$ functions and $U$ a convex and open subset of $\mathbb{R}^n$. Consider the maximization problem:

$$\max_{x\in\mathbb{R}^n}\; F(x_1, x_2, \ldots, x_n)$$
$$\text{subject to}\quad g_1(x_1, \ldots, x_n) \le b_1,\;\; \ldots,\;\; g_m(x_1, \ldots, x_n) \le b_m$$


Suppose that $F$ is quasiconcave on $U$ and $g_1, \ldots, g_m$ are quasiconvex on $U$. If all the necessary FOCs and the constraint qualification (CQ, as stated in Theorem 14-4) are satisfied at $x^*$, then $x^*$ is a global maximum of $F$ on the constraint set.

What does this theorem mean intuitively? Let's consider a standard utility maximization problem.

$$\max\; U(x,y) \quad\text{subject to}\quad px + y \le I$$

Suppose the domain is $U = \mathbb{R}^2_{++}$ (with a slight abuse of notation, since $U(x,y)$ also denotes the utility function). Note that the budget function $g(x,y) = px + y$ is a quasiconvex (actually also quasiconcave) function, because its level sets are linear. Suppose $U(x,y)$ is (strictly) quasiconcave. Then there will be a tangency point that satisfies the FONCs, and it is a global maximum. Suppose, on the contrary, that $U(x,y)$ is strictly quasiconvex. Then the tangency does not mean a maximum. There is also the case where $U(x,y)$ is not even quasiconcave; then the FONCs do not guarantee a global maximum. Consider, furthermore, the case where $U(x,y)$ is quasiconcave but $g(x,y)$ is not quasiconvex. Then a tangency (FONC) does not imply a global solution. (See the graphs.)
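Finally, here is a numerical sketch that ties the pieces together (my own example; the solver settings and the utility $U(x,y) = \sqrt{xy}$ are assumptions): a local solver finds the tangency point, and Theorem 16-5 is what lets us call it a global maximum.

```python
# Maximize a quasiconcave Cobb-Douglas utility subject to a linear budget.
import numpy as np
from scipy.optimize import minimize

p, I = 2.0, 10.0

def neg_U(z):
    x, y = z
    return -(x**0.5 * y**0.5)        # minimize -U to maximize U

budget = {'type': 'ineq', 'fun': lambda z: I - p * z[0] - z[1]}  # I - p*x - y >= 0
res = minimize(neg_U, x0=[1.0, 1.0], method='SLSQP',
               bounds=[(1e-6, None), (1e-6, None)], constraints=[budget])
print(res.x)   # approx. [2.5, 5.0]: x* = I/(2p), y* = I/2 for this utility
```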