8/18/2019 Week1 Tutorial
ECON 2101: Math Refresher (Tutorial W1)
Keiichi Kawai
The goal of this note is to refresh your memory on the constrained optimization problem, i.e., max_{x, y} u(x, y) s.t. g(x, y) = 0. You studied the Lagrangian method in your first year. We start with the intuition as to why the Lagrangian method works. This may look like unnecessary torture, but it helps you nurture "economic intuition," which you will need to master in Microeconomics 2. After reviewing why the Lagrangian method works, we look at a "cookbook" procedure at the end and apply it to a few examples. This note is therefore not meant to be comprehensive, and it sacrifices "rigor" in some parts. In any case, if this note doesn't refresh your memory, go back to the textbook/notes you used in first year.
Introduction
Almost all math problems in Economics boil down to a constrained optimization problem, i.e.,

max_{x, y} f(x, y)
s.t. g(x, y) = 0.
For example, in the case of "utility maximization," f(x, y) = u(x, y) and g(x, y) = p_x x + p_y y − I.[1]

[1] If you are interested in minimizing some γ(x, y), you can convert the problem into a maximization problem by setting f(x, y) = −γ(x, y).

Review of Single-Variable Optimization and First-Order Condition (FOC)
Let's review how to deal with maximization problems when no "constraint" exists. That is, your goal is to find a maximizer x∗ such that f(x∗) ≥ f(x) for all x in the domain. The set of maximizers is often denoted as arg max f(x).[2]

[2] Note that there can be multiple maximizers.

Suppose that x∗ ∈ arg max f(x) and that f(x) is differentiable. Then you know that f′(x∗) = 0. This is the so-called first-order (necessary) condition for maximizers.[3]

[3] By the way, can you formally state what a function is? What is a utility "function," a profit "function," etc.?
To recall where we got this, you need to understand what the derivative f′(x), or df(x)/dx, is.[4]

[4] Note that the derivative of a (differentiable) function is also a function.

If you remember, the formal definition is

f′(x) = lim_{h→0} [f(x + h) − f(x)] / h.

For an arbitrary h ≠ 0, [f(x + h) − f(x)] / h measures the slope of the line that goes through (x, f(x)) and (x + h, f(x + h)). If you are not sure, draw some graphs on your own to check.
Now that we have refreshed our memory on the definition of the derivative, let's see why (conditional on f(x) being a differentiable function) we have

x∗ ∈ arg max f(x) ⇒ f′(x∗) = 0.

Recall that a point x∗ such that f′(x∗) = 0 is called a critical point of f. So the previous statement can be rewritten as follows:

x∗ ∈ arg max f(x) ⇒ x∗ is a critical point of f(x).
Take an arbitrary x̃ in the domain of f(x).[5]

[5] Do you remember the definition of the domain of a function?

If the "slope" f′(x̃) is positive, then you can increase the value of the function by increasing x slightly from x̃, i.e., for some ∆x > 0, f(x̃ + ∆x) > f(x̃). We can thus conclude that x̃ is not a maximizer, i.e., x̃ ∉ arg max f(x).

Similarly, if the "slope" f′(x̃) is negative, then you can increase the value of the function by decreasing x a little bit from x̃.

Combining these two observations, we have the conclusion that if the function f(x) is maximized at x∗, then f′(x∗) = 0.[6]

[6] Notice, this is NOT equivalent to saying that "if f′(x∗) = 0, then the function is maximized at x∗." Indeed, if f′(x) = 0, then (−f)′(x) = 0 as well. Thus, if the converse statement were true, the maximizer would have to be the minimizer too, i.e., the function would have to be constant.
Again, there may be many points such that f′(x∗) = 0. To pin down which of those critical points are maximizers, you have to rely on other tools, e.g., the second-order condition, or comparing the values of the objective function at the critical points. But the first-order condition (FOC) drastically simplifies your search for the maximizers (and for most of the problems you see in this course, the FOC gives you the "solution" you need).
Example 1 If f(x) = −x(x − 2), then f′(x) = −2(x − 1). So arg max_x f(x) = {1}.[7]

[7] When arg max f(x) is a singleton, i.e., arg max f(x) = {y} for some y, it is conventional to write arg max f(x) = y, even though this is a slight abuse of notation.

Example 2 If f(x) = ln x − px, then f′(x) = (1 − px)/x. So arg max_x f(x) = {1/p}.
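The FOCs in Examples 1 and 2 are easy to sanity-check numerically. Below is a minimal Python sketch using a central finite difference; the concrete value p = 2 is a hypothetical choice, not from the note:

```python
import math

def numerical_derivative(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Example 1: f(x) = -x(x - 2), claimed maximizer x* = 1
f1 = lambda x: -x * (x - 2)
d1 = numerical_derivative(f1, 1.0)

# Example 2: f(x) = ln x - p*x, claimed maximizer x* = 1/p (here p = 2)
p = 2.0
f2 = lambda x: math.log(x) - p * x
d2 = numerical_derivative(f2, 1.0 / p)

print(d1, d2)  # both ≈ 0: the FOC holds at the claimed maximizers
```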
Even when the maximization problem involves two or more variables, the logic is the same. If f(x, y) is differentiable and (x∗, y∗) ∈ arg max_{x, y} f(x, y), then ∂f(x∗, y∗)/∂x = 0 and ∂f(x∗, y∗)/∂y = 0.[8]

[8] If you are not sure what a partial derivative is, go back to the textbook and review the topic.
Constrained Optimization
Now, let's get onto the main topic. Suppose you are asked to maximize the function f(x, y) by choosing x and y. But you cannot choose x and y freely: you have to choose x and y so that the constraint g(x, y) = 0 is satisfied. This problem is often written as

max_{x, y} f(x, y)
s.t. g(x, y) = 0.
©Keiichi KAWAI (ver 2016.1)
How do you solve this type of question? Remember, many economic problems fall into this category. (Again, the utility maximization problem is a canonical example.) The issue here is that you cannot choose x and y freely. If you choose a certain x, you have to choose y so that g(x, y) = 0. In other words, the y you can choose is a function γ(x) of x.

Sometimes, finding this function γ(x) such that y = γ(x) is straightforward, or more formally, g(x, y) = 0 defines y as an explicit function of x. But sometimes not, or more formally, g(x, y) = 0 defines y only as an implicit function of x.
Let's start with a simple case, where we can explicitly define y as a function of x from g(x, y) = 0. For example, suppose g(x, y) = y − γ(x) = 0. That is, if you choose x, then you have to choose y so that y = γ(x).[9]

[9] Indeed, for most of the problems that you see in Economics, you actually can do this.

Then this problem becomes an unconstrained single-variable optimization problem:

max_x φ(x) = max_x f(x, γ(x)).

Therefore, if (x∗, y∗) is the solution to the original problem, then y∗ = γ(x∗) and φ′(x∗) = 0.
So the biggest challenge now is finding φ′(x). Recall that φ′(x) × ∆x measures the overall change in the value of f when you change x by ∆x.[10]

[10] Recall that lim_{∆x→0} [φ(x + ∆x) − φ(x)] / ∆x = φ′(x).

Notice that a change in x affects f(x, γ(x)) through two channels:

φ′(x) = ∂f(x, γ(x))/∂x + ∂f(x, γ(x))/∂y × γ′(x),

where the first term is (i), the first factor of the second term is (ii), and γ′(x) is (iii).
The first channel is the direct one. If you change x by ∆x, then it has a "direct" effect on the value of the objective function of [∂f(x, y)/∂x] × ∆x, as captured by term (i) above. The second channel is the indirect one that comes through the change in y. If you change x by ∆x, then y changes by ∆y = γ′(x) × ∆x. For such a change in y, f changes by [∂f(x, γ(x))/∂y] × ∆y, as represented by (ii) and (iii).

So the overall change in f, per unit change in x, is

∂f(x, γ(x))/∂x + ∂f(x, γ(x))/∂y × γ′(x).
This is the so-called chain rule you studied.[11]

[11] If you do care about formality, and/or are aiming for the honours program, here is the formal statement: let f : Rⁿ → R and a : R → Rⁿ be C¹. Then the composite function g(t) = f(a(t)) is a C¹ function from R to R, and

g′(t) = Σ_j ∂f(a(t))/∂x_j × a_j′(t) = Df(a(t)) · a′(t).
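The chain rule above can also be checked numerically. The sketch below uses a hypothetical pair f(x, y) = x·y² and γ(x) = 3x + 1 (chosen for illustration only, not from the note) and compares a finite-difference φ′(x) against f_x + f_y × γ′(x):

```python
f = lambda x, y: x * y**2          # hypothetical objective
gamma = lambda x: 3 * x + 1        # hypothetical constraint function y = γ(x)
phi = lambda x: f(x, gamma(x))     # φ(x) = f(x, γ(x))

x0 = 2.0
y0 = gamma(x0)

# left-hand side: finite-difference derivative of φ at x0
h = 1e-6
lhs = (phi(x0 + h) - phi(x0 - h)) / (2 * h)

# right-hand side: ∂f/∂x + ∂f/∂y × γ'(x), computed analytically
f_x = y0**2            # ∂f/∂x = y²
f_y = 2 * x0 * y0      # ∂f/∂y = 2xy
rhs = f_x + f_y * 3    # γ'(x) = 3

print(lhs, rhs)  # both ≈ 133.0
```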
So, to sum up: if (x∗, y∗) is the solution to the following problem:

max_{x, y} f(x, y)
s.t. y = γ(x),
then

∂f(x∗, γ(x∗))/∂x + ∂f(x∗, γ(x∗))/∂y × γ′(x∗) = 0.   (1)
Example 3 Suppose f(x, y) = ln x + ln y and γ(x) = 1 − x. Then

∂f(x, γ(x))/∂x + ∂f(x, γ(x))/∂y × γ′(x) = 1/x + 1/(1 − x) × (−1).

Setting this expression to zero gives x∗ = 1/2, and therefore y∗ = γ(x∗) = 1/2.
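Example 3 can be replicated numerically by maximizing φ(x) = f(x, γ(x)) directly. The grid search below is a crude numerical stand-in for solving φ′(x) = 0, just to confirm the answer:

```python
import math

# φ(x) = f(x, γ(x)) = ln x + ln(1 - x), from Example 3
phi = lambda x: math.log(x) + math.log(1 - x)

grid = [i / 1000 for i in range(1, 1000)]   # interior points of (0, 1)
x_star = max(grid, key=phi)
print(x_star)  # 0.5, matching x* = y* = 1/2
```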
Example 4 Consider a utility maximization problem, with g(x, y) = p_x x + p_y y − I. Then γ(x) = (I − p_x x)/p_y, so γ′(x) = −p_x/p_y. Therefore, the corresponding FOC becomes

∂u(x, y)/∂x − ∂u(x, y)/∂y × p_x/p_y = 0.
So what we can learn from this exercise is that y does not necessarily have to be an explicit function of x to convert a "constrained optimization problem" into an "unconstrained one." All we need to know is the change in y that arises from a change in x, i.e., γ′(x). Notice that we can find γ′(x) even when the constraint only implicitly defines y as a function of x, as in g(x, γ(x)) = 0.
To see this, notice that

d g(x, γ(x)) / dx = 0.

By the definition of γ(x), you can only choose x so that g(x, γ(x)) = 0. This means that even if you change x, the value of g(x, γ(x)) cannot change, i.e., d g(x, γ(x)) / dx = 0.[12]

[12] Notice that d g(x, γ(x)) / dx ≠ ∂g(x, γ(x)) / ∂x.
Since[13]

∂g(x, y)/∂x + ∂g(x, y)/∂y × γ′(x) = 0,

we obtain

γ′(x) = − [∂g(x, y)/∂x] / [∂g(x, y)/∂y],

conditional on ∂g(x, y)/∂y ≠ 0.

[13] Again, we are using the chain rule here.
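The formula γ′(x) = −(∂g/∂x)/(∂g/∂y) can be checked on the budget constraint of Example 4, where the explicit answer γ′(x) = −p_x/p_y is known. The prices and income below are hypothetical values, and the partial derivatives are approximated by central differences:

```python
px, py, I = 2.0, 3.0, 12.0          # hypothetical prices and income

def g(x, y):
    """Budget constraint g(x, y) = px*x + py*y - I."""
    return px * x + py * y - I

def partial(f, x, y, wrt, h=1e-6):
    """Central-difference partial derivative of f at (x, y)."""
    if wrt == "x":
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

x0 = 1.0
y0 = (I - px * x0) / py             # a point on the constraint
gamma_prime = -partial(g, x0, y0, "x") / partial(g, x0, y0, "y")
print(gamma_prime)  # ≈ -px/py = -2/3
```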
Therefore, the counterpart of (1) becomes

∂f(x∗, y∗)/∂x + ∂f(x∗, y∗)/∂y × ( − [∂g(x∗, y∗)/∂x] / [∂g(x∗, y∗)/∂y] ) = 0.

Or equivalently, for λ = − [∂f(x∗, y∗)/∂y] / [∂g(x∗, y∗)/∂y],

∂f(x∗, y∗)/∂x + λ ∂g(x∗, y∗)/∂x = 0
∂f(x∗, y∗)/∂y + λ ∂g(x∗, y∗)/∂y = 0
g(x∗, y∗) = 0
Since we have three unknowns (x∗, y∗, λ) and three equations, this system of equations has a solution (in most cases).

So, to sum up, the solution (x∗, y∗) of the following problem

max_{x, y} f(x, y)
s.t. g(x, y) = 0
has to satisfy

∂f(x∗, y∗)/∂x + λ ∂g(x∗, y∗)/∂x = 0
∂f(x∗, y∗)/∂y + λ ∂g(x∗, y∗)/∂y = 0
g(x∗, y∗) = 0
Notice that this set of conditions is the same as the set of first-order conditions for the unconstrained problem

max f(x, y) + λ g(x, y).

This is the so-called Lagrangian Theorem.[14] In sum, (under some mild conditions that are usually satisfied for most economic problems), we can convert constrained optimization problems into unconstrained optimization ones.

[14] Again, this is only for those who care about formality and/or are aiming for the honours program, but here is the formal statement:

Theorem 1 Let f : Rⁿ → R and g : Rⁿ → Rᵏ be C¹ functions. Suppose x∗ is a local optimum of f on the set

D = U ∩ {x | g(x) = 0},

where U ⊂ Rⁿ is open. Suppose dim(Dg(x∗)) = k. Then there exists a vector λ∗ ∈ Rᵏ such that

Df(x∗) + Σ_{i=1}^{k} λ∗_i Dg_i(x∗) = 0.

• The condition dim(Dg(x∗)) = k is the (general version of the) constraint qualification.
• This condition enables us to use the (generalized version of the) implicit function theorem.
Example 5 Suppose you want to maximize the utility function u(x, y) = ln x + ln y, and you face the budget constraint x + y = 1. Then f(x, y) = ln x + ln y and g(x, y) = x + y − 1. Since the solution (x∗, y∗) is a critical point of the Lagrangian L,

L = f(x, y) + λ g(x, y),

x∗ and y∗ have to satisfy the following conditions:

∂f(x∗, y∗)/∂x + λ ∂g(x∗, y∗)/∂x = 1/x∗ + λ = 0
∂f(x∗, y∗)/∂y + λ ∂g(x∗, y∗)/∂y = 1/y∗ + λ = 0
x∗ + y∗ = 1

Solving, we get x∗ = y∗ = 1/2.
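As a check on Example 5, the claimed solution can be substituted back into the three Lagrangian conditions; all residuals should be zero:

```python
x_s, y_s = 0.5, 0.5
lam = -1 / x_s   # λ = -1/x* from the first condition, so λ = -2

# residuals of: 1/x + λ = 0, 1/y + λ = 0, x + y - 1 = 0
residuals = (1 / x_s + lam, 1 / y_s + lam, x_s + y_s - 1)
print(residuals)  # (0.0, 0.0, 0.0)
```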
This is the basic logic behind the so-called Lagrangian Method. It can be generalized to the case where there are more than 2 variables.
“Cookbook” Procedure
Now let's summarize what we have reviewed in the form of a "cookbook." Suppose you are asked to solve the problem

max_{x ∈ Rⁿ} f(x)
s.t. g(x) = 0.
1. We set up a function L : Rⁿ × R → R, called the Lagrangian:

L(x, λ) = f(x) + λ g(x).

The scalar λ is called the Lagrangian multiplier.
2. We find the set of all critical points of L(x, λ), that is, all points (x, λ) such that ∂L(x, λ)/∂x_j = 0 for all j, and ∂L(x, λ)/∂λ = 0. Since x ∈ Rⁿ and λ ∈ R, this results in a system of (n + 1) equations in the (n + 1) unknowns:

∂L/∂x_j (x, λ) = 0,  j = 1, · · · , n
∂L/∂λ (x, λ) = 0.
3. Let M be the set of all solutions to these equations. We evaluate f at each point x in this set M. "Usually," the value of x that maximizes f over this set is also the solution of the constrained maximization problem we started with. In case M is a singleton, i.e., consists of only one point, check carefully whether that point is the maximizer or the minimizer, e.g., by comparing the value of the objective function at x ∈ M with its value at some other feasible point y ≠ x.
Example 1: Simple Numerical Example
Consider the problem of maximizing and minimizing f(x, y) = x² − y² subject to g(x, y) = 1 − x² − y² = 0.

1. Now set up the Lagrangian:

L(x, y, λ) = x² − y² + λ (1 − x² − y²).

The critical points of L are the solutions (x, y, λ) ∈ R³ to

2x − 2λx = 0
−2y − 2λy = 0
x² + y² = 1
2. From the first equation, 2x(1 − λ) = 0, and from the second equation, 2y(1 + λ) = 0. If λ ≠ ±1, these can hold only when (x, y) = (0, 0), which violates the constraint. So λ = ±1. Hence, there are only four possibilities:

(x, y, λ) = (1, 0, 1), (−1, 0, 1), (0, 1, −1), (0, −1, −1).

3. Evaluating f at those points, we see f(1, 0) = f(−1, 0) = 1 and f(0, 1) = f(0, −1) = −1.

4. Since the critical points of L contain the global maximizers and minimizers of f, the first two points must be the solutions we are after.
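The four candidate points and the final comparison can be verified mechanically; the sketch below plugs each candidate into the three critical-point equations and then evaluates f:

```python
f = lambda x, y: x**2 - y**2

candidates = [(1, 0, 1), (-1, 0, 1), (0, 1, -1), (0, -1, -1)]

for x, y, lam in candidates:
    # each candidate must satisfy all three critical-point equations
    assert 2 * x - 2 * lam * x == 0
    assert -2 * y - 2 * lam * y == 0
    assert x**2 + y**2 == 1

values = [f(x, y) for x, y, _ in candidates]
print(values)  # [1, 1, -1, -1]: the first two points are the maximizers
```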
Example 2: Utility Maximization
Consider the following utility maximization problem:

max_{x1, x2} x1 x2
s.t. p1 x1 + p2 x2 = I

1. Set up the Lagrangian:

L(x1, x2, λ) = x1 x2 + λ (I − p1 x1 − p2 x2).

2. The critical points of L are the solutions (x1∗, x2∗, λ∗) ∈ R²₊₊ × R to:

x2 − λ p1 = 0
x1 − λ p2 = 0
I − p1 x1 − p2 x2 = 0
Let's check if there is a solution such that λ = 0. If λ = 0, then x1 = x2 = 0, which violates the third equation.

So suppose λ ≠ 0. Then we have λ = x2/p1 = x1/p2, and thus x1 = (p2/p1) x2. Using the third equation, we obtain

(x1∗, x2∗, λ∗) = ( I/(2 p1), I/(2 p2), I/(2 p1 p2) ).
3. Notice that (x1, x2) = (0, I/p2) satisfies the constraint, and the resulting value of the objective function at this point is zero. Since the value of the objective function at (x1∗, x2∗, λ∗) = ( I/(2 p1), I/(2 p2), I/(2 p1 p2) ) is positive, we can conclude that ( I/(2 p1), I/(2 p2) ) is the solution we are after.
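Finally, the closed-form demands x1∗ = I/(2 p1), x2∗ = I/(2 p2) can be compared against a grid search along the budget line; p1, p2, and I below are hypothetical values:

```python
p1, p2, I = 2.0, 4.0, 40.0              # hypothetical prices and income

def u_on_budget(x1):
    """Utility x1*x2 along the budget line, i.e., x2 = (I - p1*x1)/p2."""
    return x1 * (I - p1 * x1) / p2

grid = [(I / p1) * i / 10000 for i in range(1, 10000)]
x1_best = max(grid, key=u_on_budget)
print(x1_best, I / (2 * p1))  # both 10.0
```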