Lecture 11 Econ 2001 2015 August 24

Lecture 11 - sites.pitt.edu


Lecture 11

Econ 2001

2015 August 24


Lecture 11 Outline

1. Differentiability, reprise
2. Homogeneous Functions and Euler’s Theorem
3. Mean Value Theorem
4. Taylor’s Theorem

Announcement: The last exam will be Friday at 10:30am (usual class time), in WWPH 4716.


Differentiability Reminder

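Given $f : \mathbb{R}^n \to \mathbb{R}$, the $i$th partial derivative of $f$ at $x$ is

$$\frac{\partial f}{\partial x_i}(x) = \lim_{h \to 0} \frac{f(x + h e_i) - f(x)}{h}$$

when this limit exists; the gradient of $f$ at $x \in \mathbb{R}^n$ is

$$\nabla f(x) = \left( \frac{\partial f}{\partial x_1}(x), \frac{\partial f}{\partial x_2}(x), \ldots, \frac{\partial f}{\partial x_n}(x) \right)$$

Given $f : \mathbb{R}^n \to \mathbb{R}$ and a unit vector $v \in \mathbb{R}^n$, the directional derivative of $f$ in the direction $v$ at $x$ is

$$D_v f(x) = \lim_{h \to 0} \frac{f(x + h v) - f(x)}{h}$$

when this limit exists. Thus

$$\frac{\partial f}{\partial x_i}(x) = D_{e_i} f(x).$$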


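Differentiability Reminder

Let $X \subset \mathbb{R}^n$ be open, $f : X \to \mathbb{R}^m$, and $x \in X$. $f$ is differentiable at $x$ if there exists $T_x \in L(\mathbb{R}^n, \mathbb{R}^m)$ such that

$$\lim_{h \to 0,\; h \in \mathbb{R}^n} \frac{\| f(x + h) - (f(x) + T_x(h)) \|}{\| h \|} = 0$$

$f$ is differentiable if it is differentiable at all $x \in X$.

$T_x$, sometimes called the differential (or derivative) of $f$ at $x$, is denoted $df_x$.

The Jacobian of $f$ at $x$, denoted $Df(x)$, is the matrix corresponding to $df_x$ with respect to the standard basis; its $i$th column is $T_x(e_i)$.

If $f$ is differentiable at $x$, then $\frac{\partial f^i}{\partial x_j}(x)$ exist for $1 \le i \le m$, $1 \le j \le n$, and

$$Df(x) = \begin{pmatrix} \frac{\partial f^1}{\partial x_1}(x) & \cdots & \frac{\partial f^1}{\partial x_n}(x) \\ \vdots & \ddots & \vdots \\ \frac{\partial f^m}{\partial x_1}(x) & \cdots & \frac{\partial f^m}{\partial x_n}(x) \end{pmatrix}$$

If $f$ is differentiable at $x$, and $u \in \mathbb{R}^n$ is such that $\|u\| = 1$, then the directional derivative in the direction $u$, $T_x(u)$, is

$$Df(x)\, u,$$

a weighted sum of the partial derivatives, with weights given by the components of $u$.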
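The identity between the directional derivative and $Df(x)u$ can be checked numerically. The following is a minimal sketch, assuming a hypothetical example function and central finite differences (neither appears in the lecture); it compares the limit definition with the gradient-times-direction formula.

```python
import math

def f(x):
    # Hypothetical example: f(x1, x2) = x1^2 * x2 + sin(x2)
    return x[0] ** 2 * x[1] + math.sin(x[1])

def partial(f, x, i, h=1e-6):
    # Central-difference estimate of the i-th partial derivative at x
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (f(xp) - f(xm)) / (2 * h)

def gradient(f, x, h=1e-6):
    return [partial(f, x, i, h) for i in range(len(x))]

def directional(f, x, v, h=1e-6):
    # Central-difference estimate of the directional derivative along unit vector v
    xp = [xi + h * vi for xi, vi in zip(x, v)]
    xm = [xi - h * vi for xi, vi in zip(x, v)]
    return (f(xp) - f(xm)) / (2 * h)

x = [1.0, 2.0]
u = [3 / 5, 4 / 5]  # a unit vector
grad = gradient(f, x)
from_definition = directional(f, x, u)
from_gradient = sum(g * ui for g, ui in zip(grad, u))
print(from_definition, from_gradient)
```

The two printed numbers should agree up to finite-difference error; the step size `h` is an arbitrary choice.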


Higher-Order Derivatives

If $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable, then its partial derivatives can be viewed as functions from $\mathbb{R}^n$ to $\mathbb{R}$.

Therefore, one can take a derivative with respect to one variable, then another, then the first again, and so on.

Provided that all of the derivatives are continuous in a neighborhood of the point you start with, the order in which you take partial derivatives does not matter (this is not obvious).

Second derivatives are important, but we rarely care about third or higher derivatives.

Notation

$f : \mathbb{R}^n \to \mathbb{R}$ has $n^2$ second derivatives, and we think of them as terms in a square matrix:

$$\underset{n \times n}{D^2 f(x)} = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2}(x) & \cdots & \frac{\partial^2 f}{\partial x_n \partial x_1}(x) \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_1 \partial x_n}(x) & \cdots & \frac{\partial^2 f}{\partial x_n^2}(x) \end{pmatrix}$$


Homogeneous Functions

Definition

A function $F : \mathbb{R}^n \to \mathbb{R}$ is homogeneous of degree $k$ if $F(\lambda x) = \lambda^k F(x)$ for all $\lambda > 0$.

All linear functions are homogeneous of degree one, but homogeneity of degree one is weaker than linearity.

$f(x, y) = \sqrt{xy}$ is homogeneous of degree one but not linear.

Example: Cost functions depend on the prices paid for inputs.
If all prices double, then the cost doubles: this is homogeneity of degree one.

Example: Consumers’ demand depends on income and prices.
Demand $\phi(p, I)$ gives the consumer’s utility-maximizing choices among the feasible consumption bundles, given prices $p$ and income $I$.

Affordable consumption $x$ satisfies $p \cdot x \le I$ (and a non-negativity constraint). If $p$ and $I$ are multiplied by the same factor, $\lambda$, then the budget constraint remains unchanged.

The demand function is homogeneous of degree zero.
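As a quick illustration of the claim that degree-one homogeneity is weaker than linearity, here is a small sketch using the slide’s own example $f(x, y) = \sqrt{xy}$; the particular input values and the scaling factor are arbitrary choices made for the illustration.

```python
import math

def f(x, y):
    # The slide's example, defined for non-negative inputs
    return math.sqrt(x * y)

lam = 3.0
x, y = 2.0, 8.0

# Homogeneity of degree one: f(lam * x, lam * y) equals lam * f(x, y)
scaled = f(lam * x, lam * y)
expected = lam * f(x, y)

# Additivity fails, so f is not linear:
# f(1, 0) + f(0, 1) differs from f((1, 0) + (0, 1)) = f(1, 1)
sum_of_values = f(1.0, 0.0) + f(0.0, 1.0)
value_of_sum = f(1.0, 1.0)

print(scaled, expected, sum_of_values, value_of_sum)
```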


Euler’s Theorem

Theorem (Euler’s Theorem)
If $F : \mathbb{R}^n \to \mathbb{R}$ is differentiable at $x$ and homogeneous of degree $k$, then

$$\nabla F(x) \cdot x = k F(x).$$

Proof.
Fix $x$. Consider the function $H(\lambda) = F(\lambda x)$. This is a composite function, $H(\lambda) = F \circ G(\lambda)$, where $G : \mathbb{R} \to \mathbb{R}^n$ is given by $G(\lambda) = \lambda x$.

By the chain rule,

$$DH(\lambda) = DF(G(\lambda))\, DG(\lambda)$$

Since $DG(\lambda) = x$, evaluating this at $\lambda = 1$ gives

$$DH(1) = \nabla F(x) \cdot x \qquad (A)$$

On the other hand, we know from homogeneity that $H(\lambda) = \lambda^k F(x)$.

Differentiate this with respect to $\lambda$

$$DH(\lambda) = k \lambda^{k-1} F(x)$$

and evaluate it when $\lambda = 1$

$$DH(1) = k F(x) \qquad (B)$$

Combining equations (A) and (B) yields the result.
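Euler’s Theorem is easy to sanity-check numerically. The sketch below assumes a hypothetical Cobb-Douglas-style function that is homogeneous of degree $k = 2$ on the positive orthant, and approximates the gradient by central differences; the function, the evaluation point, and the step size are all illustrative choices.

```python
def F(x):
    # Hypothetical example: exponents sum to 2, so F is homogeneous of degree 2
    return x[0] ** 0.5 * x[1] ** 1.5

def gradient(f, x, h=1e-6):
    # Central-difference estimate of the gradient of f at x
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

x = [4.0, 9.0]
k = 2
lhs = sum(g * xi for g, xi in zip(gradient(F, x), x))  # grad F(x) . x
rhs = k * F(x)                                         # k F(x)
print(lhs, rhs)
```

Both sides should be close to 108, since $F(4, 9) = 2 \cdot 27 = 54$.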


An application of Euler’s Theorem

Euler’s Theorem provides a useful decomposition of a function F (x).

Suppose that $F$ describes the profit produced by a team of $n$ agents, when agent $i$ contributes $x_i$.

How does such a team divide $F$?

If $F$ is linear, $F(x) = p \cdot x$, then just give each agent $i$ the amount $p_i x_i$. Each agent receives a constant “per unit” payment equal to her marginal contribution to profits.

When $F$ is non-linear, it is harder to figure out the contribution of each agent.

The theorem states that if you pay each agent her marginal contribution ($D_{e_i} F(x)$) per unit, then you distribute the surplus fully if $F$ is homogeneous of degree one.
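The division rule can be illustrated with a two-agent team. This is a sketch under stated assumptions: the profit function $F(x) = x_1^{0.25} x_2^{0.75}$ and the contribution levels are hypothetical, chosen only because $F$ is homogeneous of degree one and its partial derivatives have a simple closed form.

```python
def F(x):
    # Hypothetical team profit, homogeneous of degree one
    return x[0] ** 0.25 * x[1] ** 0.75

def marginal_contributions(x):
    # Analytic partial derivatives of F: dF/dx1 = 0.25 F / x1, dF/dx2 = 0.75 F / x2
    value = F(x)
    return [0.25 * value / x[0], 0.75 * value / x[1]]

x = [16.0, 81.0]
# Pay each agent her marginal contribution per unit contributed
payments = [m * xi for m, xi in zip(marginal_contributions(x), x)]
total_paid = sum(payments)
print(payments, total_paid, F(x))
```

Here the payments exhaust the profit: agent 1 receives a quarter of $F(x)$ and agent 2 receives three quarters.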


Mean Value Theorem: Easy

Theorem (Mean Value Theorem, Univariate Case)
Let $a, b \in \mathbb{R}$ with $a < b$. Suppose $f : [a, b] \to \mathbb{R}$ is continuous on $[a, b]$ and differentiable on $(a, b)$. Then there exists $c \in (a, b)$ such that

$$\frac{f(b) - f(a)}{b - a} = f'(c)$$

that is,

$$f(b) - f(a) = f'(c)(b - a)$$


Need to show that $f(b) - f(a) = f'(c)(b - a)$.

Proof.
Consider the function

$$g(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}(x - a)$$

By construction, $g(a) = 0 = g(b)$. Note that for $x \in (a, b)$,

$$g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}$$

so it suffices to find $c \in (a, b)$ such that $g'(c) = 0$.

Case I: If $g(x) = 0$ for all $x \in [a, b]$, choose any $c \in (a, b)$; since $g'(c) = 0$, we are done.

Case II: Suppose $g(x) > 0$ for some $x \in [a, b]$. Since $g$ is continuous on the compact interval $[a, b]$, it attains its maximum at some point $c \in [a, b]$; because $g(a) = g(b) = 0$ and the maximum is positive, $c \in (a, b)$. Since $g$ is differentiable at $c$ and $c$ is an interior point of the domain of $g$, we have $g'(c) = 0$, and we are done.

Case III: If $g(x) < 0$ for some $x \in [a, b]$, a similar argument to Case II applies, with the minimum in place of the maximum.
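To make the theorem concrete, the sketch below locates a point $c$ for a hypothetical example, $f(x) = x^3$ on $[0, 2]$, by bisection on $g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}$. Bisection is an assumption of convenience: it works here because $g'$ is continuous and changes sign on the interval, which the theorem itself does not require.

```python
def f(x):
    return x ** 3

def fprime(x):
    return 3 * x ** 2

a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)  # average rate of change, here 4

# g'(x) = f'(x) - slope is negative at a and positive at b, so bisect on its sign
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2
    if fprime(mid) - slope < 0:
        lo = mid
    else:
        hi = mid
c = (lo + hi) / 2
print(c, fprime(c), slope)
```

For this example the point is $c = 2/\sqrt{3}$, which lies strictly inside $(0, 2)$.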


Mean Value Theorem

Notation
For any $x, y \in X \subseteq \mathbb{R}^n$, the line segment from $x$ to $y$ is

$$\ell(x, y) = \{ \alpha x + (1 - \alpha) y : \alpha \in [0, 1] \}$$

Theorem (Mean Value Theorem)
Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable on an open set $X \subset \mathbb{R}^n$, $x, y \in X$ and $\ell(x, y) \subset X$. Then there exists $z \in \ell(x, y)$ such that

$$f(y) - f(x) = Df(z)(y - x)$$

The statement is exactly the same as in the univariate case.


Mean Value Theorem: General

When the function’s range is in $\mathbb{R}^m$, things get more complicated.

For $f : \mathbb{R}^n \to \mathbb{R}^m$, we could apply the Mean Value Theorem to each component $f^i$, to obtain $z_1, \ldots, z_m \in \ell(x, y)$ such that

$$f^i(y) - f^i(x) = Df^i(z_i)(y - x)$$

Note that each $z_i \in \ell(x, y) \subset \mathbb{R}^n$; there are $m$ of them, one for each component in the range.

However, we may not find a single $z$ which works for every component.

Theorem
Suppose $X \subset \mathbb{R}^n$ is open and $f : X \to \mathbb{R}^m$ is differentiable. If $x, y \in X$ and $\ell(x, y) \subseteq X$, then there exists $z \in \ell(x, y)$ such that

$$\| f(y) - f(x) \| \le \| df_z(y - x) \| \le \| df_z \| \, \| y - x \|$$
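A standard example shows why only an inequality survives for vector-valued functions. The sketch below assumes the hypothetical curve $f(t) = (\cos t, \sin t)$ with $x = 0$ and $y = 2\pi$ (so $n = 1$, $m = 2$): the endpoints coincide, yet $df_z(y - x)$ has norm $2\pi$ at every $z$, so no single $z$ gives equality while the inequality holds everywhere on the segment.

```python
import math

def f(t):
    return (math.cos(t), math.sin(t))

def df(t):
    # Derivative of f at t (a 2x1 Jacobian stored as a tuple)
    return (-math.sin(t), math.cos(t))

x, y = 0.0, 2 * math.pi
fx, fy = f(x), f(y)
lhs = math.hypot(fy[0] - fx[0], fy[1] - fx[1])  # ||f(y) - f(x)||, zero up to rounding

rhs = []
for s in range(101):
    z = x + (y - x) * s / 100  # grid of points on the segment from x to y
    d = df(z)
    rhs.append(math.hypot(d[0] * (y - x), d[1] * (y - x)))  # ||df_z(y - x)||

print(lhs, min(rhs), max(rhs))
```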


Taylor’s Theorem

Approximating a Function

We know what happens to a function $f$ at some point $a \in \mathbb{R}^n$ and we want to approximate the function at another point $x$ using this knowledge.

Define $F : \mathbb{R} \to \mathbb{R}$ by

$$F(t) = f(xt + a(1 - t)).$$

$F(1) = f(x)$ and $F(0) = f(a)$.

We can think of this as a function of one variable, namely $t$.

We want to learn about $f$ at $x$ using information about $f$ at $a$; this is equivalent to learning about the one-variable function $F$ at $t = 1$ using information about $F$ at $t = 0$.

Multivariable version of Taylor’s Theorem: just apply the one-variable version of the theorem to $F$.

The chain rule describes the derivatives of $F$ (in terms of $f$) and there are a lot of these derivatives.
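The reduction to one variable can be sketched in a few lines. The example below assumes a hypothetical $f(x_1, x_2) = x_1^2 + x_1 x_2$ and arbitrary points $a$ and $x$; it builds $F(t) = f(xt + a(1 - t))$ and compares a finite-difference estimate of $F'(0)$ with the chain-rule value $\nabla f(a) \cdot (x - a)$.

```python
def f(p):
    # Hypothetical example function of two variables
    return p[0] ** 2 + p[0] * p[1]

a = (1.0, 1.0)
x = (2.0, 3.0)

def F(t):
    # Restriction of f to the line through a (t = 0) and x (t = 1)
    point = tuple(xi * t + ai * (1 - t) for xi, ai in zip(x, a))
    return f(point)

# Chain rule: F'(0) = grad f(a) . (x - a), with grad f = (2 x1 + x2, x1)
grad_a = (2 * a[0] + a[1], a[0])
chain_rule = sum(g * (xi - ai) for g, xi, ai in zip(grad_a, x, a))

eps = 1e-6
numeric = (F(eps) - F(-eps)) / (2 * eps)  # central-difference estimate of F'(0)
print(F(0.0), F(1.0), chain_rule, numeric)
```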


Taylor’s Theorem in R

Theorem (Taylor’s Theorem in R)
Let $f : I \to \mathbb{R}$ be $n$-times differentiable, where $I \subseteq \mathbb{R}$ is an open interval. If $x, x + h \in I$, then

$$f(x + h) = f(x) + \sum_{k=1}^{n-1} \frac{f^{(k)}(x) h^k}{k!} + E_n$$

where $f^{(k)}$ is the $k$th derivative of $f$ and

$$E_n = \frac{f^{(n)}(x + \lambda h) h^n}{n!} \quad \text{for some } \lambda \in (0, 1).$$
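For a function whose derivatives are known in closed form, the $\lambda$ in the remainder can be recovered explicitly. The following sketch assumes the hypothetical case $f = \exp$, $x = 0$, $h = 1$, $n = 3$, where every derivative of $f$ is again $\exp$, and solves $E_n = \frac{e^{x + \lambda h} h^n}{n!}$ for $\lambda$.

```python
import math

x, h, n = 0.0, 1.0, 3

# f(x) plus the terms k = 1, ..., n-1; every derivative of exp at x equals exp(x)
partial_sum = sum(math.exp(x) * h ** k / math.factorial(k) for k in range(n))
E_n = math.exp(x + h) - partial_sum

# Solve exp(x + lam * h) * h^n / n! = E_n for lam
lam = (math.log(E_n * math.factorial(n) / h ** n) - x) / h
print(E_n, lam)
```

The recovered $\lambda$ is roughly 0.27, inside $(0, 1)$ as the theorem requires.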


Taylor’s Theorem in R

Motivation: Let

$$T_n(h) = f(x) + \sum_{k=1}^{n} \frac{f^{(k)}(x) h^k}{k!} = f(x) + f'(x) h + f''(x) \frac{h^2}{2} + \cdots + f^{(n)}(x) \frac{h^n}{n!}$$

then

$$\begin{aligned}
T_n(0) &= f(x) \\
T_n'(h) &= f'(x) + f''(x) h + \cdots + f^{(n)}(x) \frac{h^{n-1}}{(n-1)!} && \text{hence } T_n'(0) = f'(x) \\
T_n''(h) &= f''(x) + \cdots + f^{(n)}(x) \frac{h^{n-2}}{(n-2)!} && \text{hence } T_n''(0) = f''(x) \\
&\;\;\vdots \\
T_n^{(n)}(h) &= f^{(n)}(x) && \text{hence } T_n^{(n)}(0) = f^{(n)}(x)
\end{aligned}$$

so $T_n(h)$ is the unique $n$th degree polynomial such that

$$T_n(0) = f(x), \quad T_n'(0) = f'(x), \quad \ldots, \quad T_n^{(n)}(0) = f^{(n)}(x)$$


Taylor’s Theorem in R

Theorem (Alternate Taylor’s Theorem in R)
Let $f : I \to \mathbb{R}$ be $n$ times differentiable, where $I \subseteq \mathbb{R}$ is an open interval and $x \in I$. Then

$$f(x + h) = f(x) + \sum_{k=1}^{n} \frac{f^{(k)}(x) h^k}{k!} + o(h^n) \quad \text{as } h \to 0$$

If $f$ is $(n + 1)$ times continuously differentiable, then

$$f(x + h) = f(x) + \sum_{k=1}^{n} \frac{f^{(k)}(x) h^k}{k!} + O(h^{n+1}) \quad \text{as } h \to 0$$

REMARKS
The first equation in the theorem is essentially a restatement of the definition of the $n$th derivative.

The second equation is proven from Taylor’s Theorem in R with the remainder term (applied with $n + 1$ in place of $n$), and the continuity of the $(n+1)$th derivative.
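The two error rates can be observed numerically. This sketch assumes the hypothetical case $f = \sin$, $x = 0.5$, $n = 2$ and a few arbitrary step sizes: the error divided by $h^2$ should shrink toward zero (the $o(h^n)$ statement), while the error divided by $h^3$ should stay bounded (the $O(h^{n+1})$ statement), here near $-\cos(0.5)/6$.

```python
import math

x = 0.5

def taylor2(h):
    # Second-order Taylor polynomial of sin around x, using sin' = cos, sin'' = -sin
    return math.sin(x) + math.cos(x) * h - math.sin(x) * h ** 2 / 2

steps = [0.1, 0.05, 0.025]
errors = [math.sin(x + h) - taylor2(h) for h in steps]

little_o = [abs(e) / h ** 2 for e, h in zip(errors, steps)]  # shrinks as h -> 0
big_o = [e / h ** 3 for e, h in zip(errors, steps)]          # stays bounded

print(little_o)
print(big_o)
```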


Taylor’s Theorem: Linear Terms

First, consider only linear terms in $\mathbb{R}^n$.

Theorem
Suppose $X \subset \mathbb{R}^n$ is open and $x \in X$. If $f : X \to \mathbb{R}^m$ is differentiable, then

$$f(x + h) = f(x) + Df(x) h + o(\|h\|) \quad \text{as } h \to 0$$

This is essentially a restatement of the definition of differentiability.

Theorem
Suppose $X \subset \mathbb{R}^n$ is open and $x \in X$. If $f : X \to \mathbb{R}^m$ is $C^2$, then

$$f(x + h) = f(x) + Df(x) h + O(\|h\|^2) \quad \text{as } h \to 0$$


Taylor’s Theorem: Quadratic Terms

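Consider $f : X \to \mathbb{R}$, with $X \subset \mathbb{R}^n$ an open set. Let

$$D^2 f(x) = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2}(x) & \frac{\partial^2 f}{\partial x_2 \partial x_1}(x) & \cdots & \frac{\partial^2 f}{\partial x_n \partial x_1}(x) \\ \frac{\partial^2 f}{\partial x_1 \partial x_2}(x) & \frac{\partial^2 f}{\partial x_2^2}(x) & \cdots & \frac{\partial^2 f}{\partial x_n \partial x_2}(x) \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_1 \partial x_n}(x) & \cdots & \cdots & \frac{\partial^2 f}{\partial x_n^2}(x) \end{pmatrix}$$

then

$$f \in C^2 \;\Rightarrow\; \frac{\partial^2 f}{\partial x_i \partial x_j}(x) = \frac{\partial^2 f}{\partial x_j \partial x_i}(x)$$

$\Rightarrow$ $D^2 f(x)$ is symmetric

$\Rightarrow$ there is an orthonormal basis of $\mathbb{R}^n$ consisting of eigenvectors of $D^2 f(x)$, and thus $D^2 f(x)$ can be diagonalized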


Taylor’s Theorem: Quadratic Terms

Theorem
Let $X \subset \mathbb{R}^n$ be open, $f : X \to \mathbb{R}$, $f \in C^2(X)$, and $x \in X$. Then

$$f(x + h) = f(x) + Df(x) h + \frac{1}{2} h^\top (D^2 f(x)) h + o(\|h\|^2) \quad \text{as } h \to 0$$

If $f \in C^3$,

$$f(x + h) = f(x) + Df(x) h + \frac{1}{2} h^\top (D^2 f(x)) h + O(\|h\|^3) \quad \text{as } h \to 0$$

Note
The second order term ($h^\top (D^2 f(x)) h$) yields a quadratic form. We will use this information later to help us understand whether $f(x)$ is a local minimum or a local maximum of $f$.
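The quality of the quadratic approximation can be checked on a small example. This sketch assumes a hypothetical $f(x_1, x_2) = e^{x_1} x_2^2$ expanded around $(0, 1)$, with the gradient and Hessian entered by hand from the analytic derivatives; it verifies that the error divided by $\|h\|^2$ shrinks as $h \to 0$ along one arbitrary direction.

```python
import math

def f(x1, x2):
    # Hypothetical example function
    return math.exp(x1) * x2 ** 2

a = (0.0, 1.0)
grad = (1.0, 2.0)                # (e^{x1} x2^2, 2 e^{x1} x2) evaluated at a
hess = ((1.0, 2.0), (2.0, 2.0))  # second partial derivatives evaluated at a

def p2(h):
    # f(a) + Df(a) h + (1/2) h^T D^2 f(a) h
    linear = sum(g * hi for g, hi in zip(grad, h))
    quad = sum(h[i] * hess[i][j] * h[j] for i in range(2) for j in range(2))
    return f(*a) + linear + 0.5 * quad

ratios = []
for t in (0.1, 0.01, 0.001):
    h = (t, t)
    err = f(a[0] + h[0], a[1] + h[1]) - p2(h)
    ratios.append(abs(err) / (h[0] ** 2 + h[1] ** 2))  # |error| / ||h||^2

print(ratios)
```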


Taylor’s Expansion: First Order Approximation

Slightly different notation to clarify what is going on.

Definition
Consider $f : \mathbb{R}^n \to \mathbb{R}$ such that $f$ is differentiable. The 1st degree Taylor polynomial of $f$ at $a \in \mathbb{R}^n$ is

$$P_1(x, a) \equiv f(a) + \nabla f(a) \cdot (x - a)$$

The first-order approximation looks familiar (there are $n$ “derivatives” (the partials) in the gradient).

If we write $f(x) = P_1(x, a) + E_2(x, a)$ for the first-order approximation with error of $f$ at $x$ around the point $a$, then we have

$$\lim_{x \to a} \frac{|E_2(x, a)|}{\|x - a\|} = \lim_{x \to a} \frac{|f(x) - f(a) - Df(a) \cdot (x - a)|}{\|x - a\|} = 0$$

Thus, as $x \to a$, $E_2(x, a)$ converges to 0 faster than $x$ converges to $a$.


Taylor’s Expansion: Quadratic Terms

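If $f \in C^2$, the 2nd degree Taylor approximation is

$$f(x) = \underbrace{f(a) + \underset{1 \times n}{\nabla f(a)}\, \underset{n \times 1}{(x - a)} + \frac{1}{2} \underset{1 \times n}{(x - a)^\top}\, \underset{n \times n}{D^2 f(a)}\, \underset{n \times 1}{(x - a)}}_{P_2(x, a)} + E_3(x, a)$$

where

$$\frac{1}{2} \underbrace{(x - a)^\top D^2 f(a) (x - a)}_{1 \times 1} = \frac{1}{2} W = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - a_i) \frac{\partial^2 f}{\partial x_i \partial x_j}(a) (x_j - a_j)$$

and

$$W = (x_1 - a_1, \ldots, x_n - a_n) \cdot \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2}(a) & \cdots & \frac{\partial^2 f}{\partial x_n \partial x_1}(a) \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_1 \partial x_n}(a) & \cdots & \frac{\partial^2 f}{\partial x_n^2}(a) \end{pmatrix} \cdot \begin{pmatrix} x_1 - a_1 \\ \vdots \\ x_n - a_n \end{pmatrix}$$

The error terms are usually not that important since they vanish in the limit.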


Taylor’s Theorem: General Form

One can write a general form of Taylor’s theorem that includes higher-order derivatives.

Taylor’s theorem will have the form

$$f(x) = P_k(x, a) + E_k(x, a)$$

where $P_k(x, a)$ is a $k$th order Taylor’s Approximation, and $E_k(x, a)$ is the error term.

The $k$th polynomial involves derivatives up to order $k$. The error term will have the property that:

$$\lim_{h \to 0} \frac{E_k(a + h, a)}{\|h\|^k} = 0$$

Check a book for a detailed statement.

We will use the 2nd degree Taylor expansion to help us determine whether or not certain points along the function are extreme points.


Taylor’s Theorem: General Form

Definition
Define $D_h^k f$ to be a $k$th derivative:

$$D_h^k f = \sum_{j_1 + \cdots + j_n = k} \binom{k}{j_1 \cdots j_n} h_1^{j_1} \cdots h_n^{j_n} D_1^{j_1} \cdots D_n^{j_n} f,$$

where the summation is taken over all $n$-tuples $(j_1, \ldots, j_n)$ of non-negative integers that sum to $k$ and

$$\binom{k}{j_1 \cdots j_n} = \frac{k!}{j_1! \cdots j_n!}.$$

Theorem (Taylor’s Theorem)

If $f$ is a real-valued function in $C^{k+1}$ defined on an open set containing the line segment connecting $a$ to $x$, then there exists a point $\eta$ on the segment such that

$$f(x) = P_k(x, a) + E_k(x, a)$$

where $P_k(x, a)$ is a $k$th order Taylor’s Approximation:

$$P_k(x, a) = \sum_{r=0}^{k} \frac{D_{x-a}^{r} f(a)}{r!}$$

and $E_k(x, a)$ is the error term:

$$E_k(x, a) = \frac{D_{x-a}^{k+1} f(\eta)}{(k+1)!}$$

Moreover, the error term satisfies:

$$\lim_{h \to 0} \frac{E_k(a + h, a)}{\|h\|^k} = 0$$


Tomorrow

Unconstrained Optimization and the Inverse Function Theorem.

1. Critical Points and Quadratic Forms
2. Unconstrained Optimization
   1. First Order Conditions
   2. Second Order Conditions
3. Inverse Function Theorem
4. Easy Implicit Function Theorem