Extrema of Functions of Two Variables

Suppose we wish to examine the possibility of a relative extremum at a point (x_0, y_0) in the domain of a function z = f(x, y). Let us assume that both f and as many partial derivatives as necessary are continuous near (x_0, y_0).

It seems reasonable, and can be shown to be true, that f(x, y) will have a relative extremum at (x_0, y_0) if and only if g(t) = f(x_0 + ut) has a relative extremum at t = 0 for all unit vectors u, where x_0 denotes the point (x_0, y_0).

One necessary condition for g to have a relative extremum at 0 is g'(0) = 0.

We know, however, that g'(0) = D_u f(x_0, y_0) = \nabla f \cdot u, and \nabla f \cdot u can equal 0 for all unit vectors u if and only if both partial derivatives of f are 0.

Such points are called critical points or stationary points.
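As a concrete illustration (a sketch of my own, not from the original notes), the stationary points of a specific function can be located by solving f_x = 0 and f_y = 0 simultaneously. The sample function f(x, y) = x^3 - 3xy + y^3 is a hypothetical choice, and SymPy is assumed to be available:

import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**3 - 3*x*y + y**3            # hypothetical example function

fx = sp.diff(f, x)                 # f_x = 3x^2 - 3y
fy = sp.diff(f, y)                 # f_y = 3y^2 - 3x

# Stationary (critical) points: both partial derivatives vanish simultaneously.
critical_points = sp.solve([fx, fy], [x, y], dict=True)
print(critical_points)             # the real solutions are (0, 0) and (1, 1)

Here (0, 0) turns out to be a saddle point and (1, 1) a relative minimum, which the second derivative test developed below confirms.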
Using the Second Derivative Test

Writing z = f(x, y), with x = x_0 + at and y = y_0 + bt, where u = \langle a, b \rangle, we can calculate

g'(t) = \frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} = f_x a + f_y b,

where both f_x and f_y are evaluated at (x_0 + at, y_0 + bt).

g''(t) = \frac{d}{dt}(f_x a + f_y b) = \frac{df_x}{dt} a + \frac{df_y}{dt} b = (f_{xx} a + f_{xy} b)a + (f_{yx} a + f_{yy} b)b = f_{xx} a^2 + 2 f_{xy} ab + f_{yy} b^2.

We thus have g''(0) = f_{xx} a^2 + 2 f_{xy} ab + f_{yy} b^2, where the partial derivatives are evaluated at (x_0, y_0).

Consider the equation f_{xx} a^2 + 2 f_{xy} ab + f_{yy} b^2 = 0, looking at it as a quadratic equation in a. Using the Quadratic Formula, we get the solutions

a = \frac{-2 f_{xy} b \pm \sqrt{(2 f_{xy} b)^2 - 4 f_{xx} f_{yy} b^2}}{2 f_{xx}}.
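For instance (an illustration of my own, not from the original notes), take f(x, y) = x^2 - y^2 at the origin, so that f_{xx} = 2, f_{xy} = 0 and f_{yy} = -2. Then

g''(0) = 2a^2 - 2b^2,

which is positive in the direction \langle 1, 0 \rangle and negative in the direction \langle 0, 1 \rangle. The sign of g''(0) depends on the direction u, which is exactly the saddle-point behavior the discriminant below detects.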
The nature of any possible solutions is determined by the discriminant (2 f_{xy} b)^2 - 4 f_{xx} f_{yy} b^2 = 4b^2(f_{xy}^2 - f_{xx} f_{yy}) inside the radical.

This will have the same sign as D = f_{xy}^2 - f_{xx} f_{yy}, which we will also call the discriminant. There are three different possibilities:
The Second Derivative Test

- D < 0: In this case, the quadratic equation has no real solutions, so the sign of g''(0) doesn't change as the direction u changes. Thus either g''(0) is always positive, in which case f has a relative minimum, or g''(0) is always negative, in which case f has a relative maximum. We can check which case we are in by checking the sign of f_{xx}.

- D = 0: In this case, anything can happen. This occurs for f(x, y) = x^4 - y^4 at the origin, where there is a saddle point, but also occurs for f(x, y) = x^4 + y^4 at the origin, where there is a relative minimum.

- D > 0: In this case, g''(0) > 0 for some direction vectors u but g''(0) < 0 for other direction vectors, and the graph has a saddle point.
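A small computational check of these three cases (a sketch of my own, not part of the original notes; the helper classify and the sample functions are hypothetical, and SymPy is assumed to be available):

import sympy as sp

x, y = sp.symbols('x y', real=True)

def classify(f, point):
    # Second derivative test, using the convention D = f_xy^2 - f_xx*f_yy from above.
    fxx = sp.diff(f, x, 2).subs(point)
    fyy = sp.diff(f, y, 2).subs(point)
    fxy = sp.diff(f, x, y).subs(point)
    D = fxy**2 - fxx*fyy
    if D < 0:
        return 'relative minimum' if fxx > 0 else 'relative maximum'
    if D > 0:
        return 'saddle point'
    return 'inconclusive (D = 0)'

# Each sample function has a critical point at the origin.
print(classify(x**2 + y**2, {x: 0, y: 0}))   # relative minimum (D < 0, f_xx > 0)
print(classify(x*y,         {x: 0, y: 0}))   # saddle point (D > 0)
print(classify(x**4 + y**4, {x: 0, y: 0}))   # inconclusive (D = 0)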
Extrema with Constraints

Suppose we want to maximize (or minimize) a function z = f(x, y) subject to a constraint g(x, y) = 0. We can look at g(x, y) = 0 as implicitly defining a function y = h(x), so z = f(x, h(x)).

Any extrema must occur where \frac{dz}{dx} = 0.

Using the Chain Rule,

\frac{dz}{dx} = f_1(x, y)\frac{dx}{dx} + f_2(x, y)\frac{dy}{dx} = f_1 + f_2(x, y)\frac{dy}{dx}.

Since y = h(x) is defined implicitly by g(x, y) = 0, we have \frac{dy}{dx} = -\frac{g_1}{g_2} (assuming g_2 \neq 0), so

\frac{dz}{dx} = f_1(x, y) + f_2(x, y)\left(-\frac{g_1}{g_2}\right).

We thus must have

f_1(x, y) - f_2(x, y)\frac{g_1}{g_2} = 0,
or f_1 = f_2 \frac{g_1}{g_2}, or equivalently \frac{f_1}{f_2} = \frac{g_1}{g_2}.

In other words, the vectors \nabla f and \nabla g are proportional to each other; that is, \nabla f = \lambda \nabla g for some constant \lambda.

This gives the method of Lagrange Multipliers: any extremum of f(x, y) subject to the constraint g(x, y) = 0 must occur at a point where \nabla f = \lambda \nabla g.
Lagrange Multipliers

In practice, this means we simultaneously solve the system of equations

\frac{\partial f}{\partial x} = \lambda \frac{\partial g}{\partial x}, \qquad \frac{\partial f}{\partial y} = \lambda \frac{\partial g}{\partial y}, \qquad g(x, y) = 0.
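A minimal sketch of solving this system with SymPy (my own illustration, not from the original notes; the objective f(x, y) = xy and the constraint x^2 + y^2 - 2 = 0 are hypothetical choices):

import sympy as sp

x, y, lam = sp.symbols('x y lambda', real=True)

f = x*y                        # hypothetical objective to optimize
g = x**2 + y**2 - 2            # hypothetical constraint, g(x, y) = 0

# The Lagrange system: f_x = lambda*g_x, f_y = lambda*g_y, g = 0.
equations = [
    sp.Eq(sp.diff(f, x), lam*sp.diff(g, x)),
    sp.Eq(sp.diff(f, y), lam*sp.diff(g, y)),
    sp.Eq(g, 0),
]

for sol in sp.solve(equations, [x, y, lam], dict=True):
    print(sol, '  f =', f.subs(sol))

The candidate points are (1, 1), (-1, -1), (1, -1) and (-1, 1); comparing the values of f at these candidates gives the constrained maximum f = 1 and minimum f = -1.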
A Geometric Perspective

Any maximum M for a function f(x, y) subject to a constraint g(x, y) = 0 occurs at a point where the graphs of f(x, y) = M and g(x, y) = 0 meet. One would expect that f(x, y) > M on one side of the graph of f(x, y) = M and f(x, y) < M on the other side.

It would thus appear that the graphs of f(x, y) = M and g(x, y) = 0 are tangent, since otherwise there would be points on g(x, y) = 0 on either side of the graph of f(x, y) = M, and M wouldn't be a maximum. A similar argument could be made for a minimum.

Thus, the tangent lines to f(x, y) = M and g(x, y) = 0 at the extremum coincide, and the two curves must have parallel normals.

Since \nabla f is normal to the tangent to f(x, y) = M and \nabla g is normal to the tangent to g(x, y) = 0, it follows that \nabla f = \lambda \nabla g for some scalar \lambda.
Multiple Constraints and Higher Dimensions
If there are multiple constraints, the gradient of the function to be optimized must be a linear combination of the gradients of the functions defining the constraints.
In higher dimensions, the obvious analogue holds.
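For example (an illustrative restatement of my own, not from the original notes), with two constraints g(x, y, z) = 0 and h(x, y, z) = 0, the condition at an extremum of f(x, y, z) becomes

\nabla f = \lambda \nabla g + \mu \nabla h, \qquad g(x, y, z) = 0, \qquad h(x, y, z) = 0,

for some scalars \lambda and \mu, solved simultaneously just as in the single-constraint case.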
Alan H. Stein, University of Connecticut