
ECON 2101: Math Refresher (Tutorial W1)
Keiichi Kawai

The goal of this note is to refresh your memory on the constrained optimization problem, i.e., max_{x,y} u(x, y) s.t. g(x, y) = 0. You studied the Lagrangian method in your first year. We start with the intuition as to why the Lagrangian method works. This may look like unnecessary torture, but it helps you nurture “economic intuition,” which you will need to master in Microeconomics 2. After reviewing why the Lagrangian method works, we look at a “cookbook” procedure at the end and apply it to a few examples. This note is therefore not meant to be comprehensive, and it sacrifices “rigor” in some parts. In any case, if this note doesn’t refresh your memory, go back to the textbook/notes you used in your first year.

    Introduction

Almost all math problems in Economics boil down to a constrained optimization problem, i.e.,

    max_{x,y} f(x, y)  s.t.  g(x, y) = 0.

For example, in the case of “utility maximization,” f(x, y) = u(x, y) and g(x, y) = p_x x + p_y y − I.¹

¹ If you are interested in minimizing some γ(x, y), you can convert the problem into a maximization problem by setting f(x, y) = −γ(x, y).

Review of Single-Variable Optimization and First-Order Condition (FOC)

Let’s review how to deal with maximization problems when no “constraint” exists. That is, your goal is to find a maximizer x* such that f(x*) ≥ f(x) for all x in the domain. The set of maximizers is often denoted arg max f(x).²

² Note that there can be multiple maximizers.

Suppose that x* ∈ arg max f(x) and that f(x) is differentiable. Then you know that f′(x*) = 0. This is the so-called first-order (necessary) condition for maximizers.³

³ By the way, can you formally state what a function is? What is a utility “function,” a profit “function,” etc.?

To recall where we got this, you need to understand what the derivative f′(x), or df(x)/dx, is.⁴ If you remember, the formal definition is

    f′(x) = lim_{h→0} [f(x + h) − f(x)] / h.

⁴ Note that the derivative of a (differentiable) function is also a function.

For an arbitrary h ≠ 0, [f(x + h) − f(x)] / h measures the slope of the line that goes through (x, f(x)) and (x + h, f(x + h)). If you are not sure, draw some graphs on your own to check.
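To make the difference quotient concrete, here is a small Python sketch (not part of the original note): it approximates f′(1) for f(x) = x², whose exact derivative is f′(x) = 2x.

```python
def slope(f, x, h):
    """Slope of the line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h

f = lambda x: x ** 2  # f'(x) = 2x, so f'(1) = 2

# As h shrinks, the difference quotient approaches the derivative.
for h in (0.1, 0.01, 0.001):
    print(h, slope(f, 1.0, h))  # approximately 2.1, then 2.01, then 2.001
```

Shrinking h further drives the quotient arbitrarily close to 2, which is exactly what the limit in the definition says.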


Now that we have refreshed our memory on the definition of the derivative, let’s see why (conditional on f(x) being a differentiable function) we have

    x* ∈ arg max f(x) ⇒ f′(x*) = 0.

Recall that x* such that f′(x*) = 0 is called a critical point of f. So the previous statement can be rewritten as follows:

    x* ∈ arg max f(x) ⇒ x* is a critical point of f(x).

Take an arbitrary x̃ in the domain of f(x).⁵

⁵ Do you remember the definition of the domain of a function?

If the “slope” f′(x̃) is positive, then it means you can increase the value of the function by increasing x slightly, i.e., for some ∆x > 0, f(x̃ + ∆x) > f(x̃). We can thus conclude that x̃ is not a maximizer, i.e., x̃ ∉ arg max f(x).

Similarly, if the “slope” f′(x̃) is negative, then it means you can increase the value of the function by decreasing x a little bit from x̃.

Combining those two observations, we have the conclusion that if the function f(x) is maximized at x*, then f′(x*) = 0.⁶

⁶ Notice, this is NOT equivalent to saying “If f′(x*) = 0, then the function is maximized at x*.” Indeed, if f′(x*) = 0, then so is −f′(x*). Thus, if that converse were true, every maximizer would have to be a minimizer too, i.e., the function would have to be constant.

Again, there may be many points such that f′(x*) = 0. To pin down which of those critical points are maximizers, you have to rely on other tools, e.g., the second-order condition, or comparing the values of the objective function at the critical points. But the first-order condition (FOC) drastically simplifies your search for the maximizers (and for most of the problems you see in this course, the FOC gives you the “solution” you need).

Example 1  If f(x) = −x(x − 2), then f′(x) = −2(x − 1). So arg max_x f(x) = {1}.⁷

⁷ When arg max f(x) is a singleton, i.e., arg max f(x) = {y} for some y, it is conventional to write arg max f(x) = y, even though this is a slight abuse of notation.

Example 2  If f(x) = ln x − px, then f′(x) = (1 − px)/x. So arg max_x f(x) = {1/p}.
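The FOCs in Examples 1 and 2 can be checked symbolically; here is a quick sketch with sympy (not part of the original note):

```python
import sympy as sp

x, p = sp.symbols("x p", positive=True)

# Example 1: f(x) = -x(x - 2); solving f'(x) = 0 gives the critical point x = 1.
f1 = -x * (x - 2)
crit1 = sp.solve(sp.diff(f1, x), x)
print(crit1)  # [1]

# Example 2: f(x) = ln x - p*x; solving f'(x) = 0 gives x = 1/p.
f2 = sp.log(x) - p * x
crit2 = sp.solve(sp.diff(f2, x), x)
print(crit2)  # [1/p]
```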

Even when the maximization problem involves two or more variables, the logic is the same. If f(x, y) is differentiable and (x*, y*) ∈ arg max_{x,y} f(x, y), then ∂f(x*, y*)/∂x = 0 and ∂f(x*, y*)/∂y = 0.⁸

⁸ If you are not sure what a partial derivative means, go back to the textbook and review the topic.

    Constrained Optimization

Now, let’s get onto the main topic. Suppose you are asked to maximize the function f(x, y) by choosing x and y. But you cannot freely choose x and y: you have to choose x and y so that the constraint g(x, y) = 0 is satisfied. This problem is often written as

    max_{x,y} f(x, y)  s.t.  g(x, y) = 0.

    ©Keiichi KAWAI (ver 2016.1)


How do you solve this type of question? Remember, many economic problems fall into this category. (Again, the utility maximization problem is a canonical example.) The issue here is that you cannot choose x and y freely. If you choose a certain x, you have to choose y so that g(x, y) = 0. In other words, the y you can choose is a function γ(x) of x.

Sometimes, finding this function γ(x) such that y = γ(x) is straightforward; more formally, g(x, y) = 0 defines y as an explicit function of x. But sometimes it is not; more formally, g(x, y) = 0 then defines y only as an implicit function of x.

Let’s start with the simple case, where we can explicitly define y as a function of x from g(x, y) = 0. For example, suppose g(x, y) = y − γ(x) = 0. That is, if you choose x, then you have to choose y so that y = γ(x).⁹

⁹ Indeed, for most of the problems that you see in Economics, you actually can do this.

Then this problem becomes an unconstrained single-variable optimization problem:

    max_x φ(x) = max_x f(x, γ(x)).

Therefore, if (x*, y*) is the solution to the original problem, then y* = γ(x*) and φ′(x*) = 0.

So the biggest challenge now is finding φ′(x). Recall that φ′(x) × ∆x measures the overall change in the value of f when you change x by ∆x.¹⁰

¹⁰ Recall that lim_{∆x→0} [φ(x + ∆x) − φ(x)]/∆x = φ′(x).

Notice that a change in x affects f(x, γ(x)) through two channels:

    φ′(x) = ∂f(x, y)/∂x  [term (i)]  +  ∂f(x, γ(x))/∂y  [term (ii)]  × γ′(x)  [term (iii)].

The first channel is the direct one. If you change x by ∆x, then it has a “direct” effect on the value of the objective function of [∂f(x, y)/∂x] × ∆x, as captured by term (i) above. The second channel is the indirect one that comes through the change in y. If you change x by ∆x, then y changes by ∆y = γ′(x) × ∆x. For such a change in y, f changes by [∂f(x, γ(x))/∂y] × ∆y, as represented by terms (ii) and (iii).

So the overall change in f is

    ∂f(x, y)/∂x + [∂f(x, γ(x))/∂y] × γ′(x).

This is the so-called chain rule you studied.¹¹

¹¹ If you do care about formality, and/or are aiming for the honours program, here is the formal statement: Let f : Rⁿ → R and let a : R → Rⁿ be C¹. Then the composite function g(t) = f(a(t)) is a C¹ function from R to R, and

    g′(t) = Σ_j [∂f(a(t))/∂x_j] × a_j′(t) = Df(a(t)) · a′(t).

So to sum up, if (x*, y*) is the solution to the following problem:

    max_{x,y} f(x, y)  s.t.  y = γ(x),



then

    ∂f(x*, γ(x*))/∂x + [∂f(x*, γ(x*))/∂y] × γ′(x*) = 0.   (1)

Example 3  Suppose f(x, y) = ln x + ln y and γ(x) = 1 − x. Then

    ∂f(x, y)/∂x + [∂f(x, γ(x))/∂y] × γ′(x) = 1/x + [1/(1 − x)] × (−1) = 0.

Therefore, x* = 1/2 and y* = 1/2.
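A sympy sketch (not in the original note) confirms Example 3: substituting the constraint and solving the single-variable FOC recovers x* = 1/2.

```python
import sympy as sp

x = sp.symbols("x", positive=True)

# Example 3: maximize f(x, y) = ln x + ln y subject to y = gamma(x) = 1 - x.
# Substituting the constraint gives phi(x) = ln x + ln(1 - x).
phi = sp.log(x) + sp.log(1 - x)
crit = sp.solve(sp.diff(phi, x), x)
print(crit)  # [1/2]
```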

Example 4  Consider a utility maximization problem, g(x, y) = p_x x + p_y y − I. Then γ(x) = (I − p_x x)/p_y, so γ′(x) = −p_x/p_y. Therefore, the corresponding FOC becomes

    ∂u(x, y)/∂x + ∂u(x, y)/∂y × (−p_x/p_y) = 0.

So what we can learn from this exercise is that y does not necessarily have to be an explicit function of x to convert a “constrained optimization problem” into an “unconstrained” one. All we need to know is the change in y that arises from a change in x, i.e., γ′(x). Notice that we can find γ′(x) even when the constraint only implicitly defines y as a function of x, as in g(x, γ(x)) = 0.

To see this, notice that

    d(g(x, γ(x)))/dx = 0.

By the definition of γ(x), you can only choose x so that g(x, γ(x)) = 0. This means that even if you change x, the value of g(x, γ(x)) cannot change, i.e., d(g(x, γ(x)))/dx = 0.¹²

¹² Notice that d(g(x, γ(x)))/dx ≠ ∂(g(x, γ(x)))/∂x: the total derivative also accounts for the change in y = γ(x).

Since¹³

    ∂g(x, y)/∂x + [∂g(x, y)/∂y] × γ′(x) = 0,

¹³ Again, we are using the chain rule here.

we obtain

    γ′(x) = − [∂g(x, y)/∂x] / [∂g(x, y)/∂y],

conditional on ∂g(x, y)/∂y ≠ 0.
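The formula γ′(x) = −g_x/g_y can be sanity-checked with sympy’s implicit-differentiation helper idiff; this sketch (not in the original note) uses the budget constraint from Example 4.

```python
import sympy as sp

x, y = sp.symbols("x y")
px, py, I = sp.symbols("p_x p_y I", positive=True)

# g(x, y) = p_x*x + p_y*y - I = 0 implicitly defines y as a function of x.
g = px * x + py * y - I

# gamma'(x) = -(dg/dx) / (dg/dy)
gamma_prime = -sp.diff(g, x) / sp.diff(g, y)
print(gamma_prime)  # -p_x/p_y

# sympy's idiff computes dy/dx from g = 0 directly, and the two agree.
print(sp.idiff(g, y, x))  # -p_x/p_y
```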

Therefore, the counterpart of (1) becomes

    ∂f(x*, y*)/∂x + [∂f(x*, y*)/∂y] × ( − [∂g(x*, y*)/∂x] / [∂g(x*, y*)/∂y] ) = 0.

Or equivalently, for λ = − [∂f(x*, y*)/∂y] / [∂g(x*, y*)/∂y],

    ∂f(x*, y*)/∂x + λ ∂g(x*, y*)/∂x = 0
    ∂f(x*, y*)/∂y + λ ∂g(x*, y*)/∂y = 0
    g(x*, y*) = 0



Since we have three unknowns (x*, y*, λ) and three equations, (in most cases) this system of equations has a solution.

So to sum up, the solution (x*, y*) of the following problem

    max_{x,y} f(x, y)  s.t.  g(x, y) = 0

has to satisfy

    ∂f(x*, y*)/∂x + λ ∂g(x*, y*)/∂x = 0
    ∂f(x*, y*)/∂y + λ ∂g(x*, y*)/∂y = 0
    g(x*, y*) = 0

Notice that this set of conditions is exactly the set of first-order conditions for the unconstrained problem

    max f(x, y) + λ g(x, y).

This is the so-called Lagrangian Theorem.¹⁴ In sum, (under some mild conditions that are usually satisfied for most economic problems), we can convert constrained optimization problems into unconstrained optimization ones.

¹⁴ Again, this is only for those who care about formality, and/or are aiming for the honours program. But here’s the formal statement:

Theorem 1  Let f : Rⁿ → R and g : Rⁿ → R^k be C¹ functions. Suppose x* is a local optimum of f on the set

    D = U ∩ {x | g(x) = 0},

where U ⊂ Rⁿ is open. Suppose dim(Dg(x*)) = k. Then there exists a vector λ* ∈ R^k such that

    Df(x*) + Σ_{i=1}^{k} λ*_i Dg_i(x*) = 0.

• The condition dim(Dg(x*)) = k is the (general version of the) constraint qualification.
• This condition enables us to use the (generalized version of the) implicit function theorem.

Example 5  Suppose you want to maximize the utility function u(x, y) = ln x + ln y, and you face the budget constraint x + y = 1. Then f(x, y) = ln x + ln y and g(x, y) = x + y − 1. Since the solution (x*, y*) is a critical point of the Lagrangian L,

    L = f(x, y) + λ g(x, y),

x* and y* have to satisfy the following conditions:

    ∂f(x*, y*)/∂x + λ ∂g(x*, y*)/∂x = 1/x* + λ = 0
    ∂f(x*, y*)/∂y + λ ∂g(x*, y*)/∂y = 1/y* + λ = 0
    x* + y* = 1

Solving, we get x* = y* = 1/2.

This is the basic logic behind the so-called Lagrangian Method. It can be generalized to the case where there are more than two variables.
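Example 5 can also be reproduced mechanically; this sympy sketch (not part of the original note) solves the three critical-point equations of the Lagrangian at once.

```python
import sympy as sp

x, y = sp.symbols("x y", positive=True)
lam = sp.symbols("lam")

# Example 5: maximize ln x + ln y subject to x + y = 1,
# via the critical points of L = f + lam * g.
L = sp.log(x) + sp.log(y) + lam * (x + y - 1)
foc = [sp.diff(L, v) for v in (x, y, lam)]
sol = sp.solve(foc, [x, y, lam], dict=True)
print(sol)  # x* = y* = 1/2 (with lam = -2)
```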



    “Cookbook” Procedure

Now let’s summarize what we have reviewed in the form of a “cookbook.” Suppose you are asked to solve the problem

    max_{x∈Rⁿ} f(x)  s.t.  g(x) = 0.

1. We set up a function L : Rⁿ × R → R, called the Lagrangian:

    L(x, λ) = f(x) + λ g(x).

The scalar λ is called the Lagrange multiplier.

2. We find the set of all critical points of L(x, λ). That is, all points (x, λ) such that ∂L(x, λ)/∂x_i = 0 for all i, and ∂L(x, λ)/∂λ = 0. Since x ∈ Rⁿ and λ ∈ R, this results in a system of (n + 1) equations in the (n + 1) unknowns:

    ∂L/∂x_j (x, λ) = 0,  j = 1, · · · , n
    ∂L/∂λ (x, λ) = 0.

3. Let M be the set of all solutions to these equations. We evaluate f at each point x in this set M. “Usually” the values of x that maximize f over this set are also the solutions of the constrained maximization problem we started with. In case M is a singleton, i.e., consists of only one point, check carefully whether it is a maximizer or a minimizer, e.g., by comparing the value of the objective function at x ∈ M with its value at some other feasible point y ≠ x.

Example 1: Simple Numerical Example

Consider the problem of maximizing and minimizing f(x, y) = x² − y² subject to g(x, y) = 1 − x² − y² = 0.

1. Set up the Lagrangian:

    L(x, y, λ) = x² − y² + λ(1 − x² − y²).

The critical points of L are the solutions (x, y, λ) ∈ R³ to

    2x − 2λx = 0
    −2y − 2λy = 0
    x² + y² = 1



2. From the first equation, 2x(1 − λ) = 0, and from the second equation, −2y(1 + λ) = 0. If λ ≠ ±1, then these can hold only when (x, y) = (0, 0), which violates the constraint x² + y² = 1. So λ = ±1. Hence, there are only four possibilities:

    (x, y, λ) = (1, 0, 1), (−1, 0, 1), (0, 1, −1), (0, −1, −1).

3. Evaluating f at those points, we see f(1, 0) = f(−1, 0) = 1, and f(0, 1) = f(0, −1) = −1.

4. Since the critical points of L contain the global maximizers and minimizers of f, the first two points are the maximizers we are after (and the last two are the minimizers).
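This sympy sketch (not part of the original note) recovers the same four critical points and the two values of f:

```python
import sympy as sp

x, y, lam = sp.symbols("x y lam", real=True)

# Cookbook Example 1: L = x^2 - y^2 + lam*(1 - x^2 - y^2).
f = x**2 - y**2
L = f + lam * (1 - x**2 - y**2)
sols = sp.solve([sp.diff(L, v) for v in (x, y, lam)], [x, y, lam], dict=True)
print(len(sols))  # 4 critical points

# Values of the objective at the critical points: the max is 1, the min is -1.
vals = sorted({f.subs(s) for s in sols})
print(vals)  # [-1, 1]
```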

Example 2: Utility Maximization

Consider the following utility maximization problem:

    max x₁x₂  s.t.  p₁x₁ + p₂x₂ = I

1. Set up the Lagrangian:

    L(x₁, x₂, λ) = x₁x₂ + λ(I − p₁x₁ − p₂x₂).

2. The critical points of L are the solutions (x₁*, x₂*, λ) ∈ R²₊₊ × R to:

    x₂ − λp₁ = 0
    x₁ − λp₂ = 0
    I − p₁x₁ − p₂x₂ = 0

Let’s check if there is a solution such that λ = 0. If λ = 0, then x₁ = x₂ = 0, which violates the third equation.

So, suppose λ ≠ 0. Then we have λ = x₂/p₁ = x₁/p₂. Thus, x₁ = p₂x₂/p₁. Using the third equation, we obtain

    (x₁*, x₂*, λ*) = (I/(2p₁), I/(2p₂), I/(2p₁p₂)).

3. Notice that (x₁, x₂) = (0, I/p₂) satisfies the constraint, and the resulting value of the objective function at this point is zero. Since the value of the objective function at (x₁*, x₂*, λ*) = (I/(2p₁), I/(2p₂), I/(2p₁p₂)) is positive, we can conclude that (I/(2p₁), I/(2p₂)) is the solution we are after.
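As a final check (not part of the original note), the same demand functions fall out of sympy:

```python
import sympy as sp

x1, x2, lam = sp.symbols("x1 x2 lam", positive=True)
p1, p2, I = sp.symbols("p1 p2 I", positive=True)

# Cookbook Example 2: max x1*x2 s.t. p1*x1 + p2*x2 = I.
L = x1 * x2 + lam * (I - p1 * x1 - p2 * x2)
sol = sp.solve([sp.diff(L, v) for v in (x1, x2, lam)], [x1, x2, lam], dict=True)
print(sol)  # x1* = I/(2*p1), x2* = I/(2*p2), lam* = I/(2*p1*p2)
```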
