
MSc Maths and Statistics 2008
UCL Department of Economics

Chapter 2: Calculus
Jidong Zhou

1 Univariate Calculus

This section studies real functions of one variable $f : \mathbb{R} \to \mathbb{R}$ in the Euclidean space with metric $d(x, y) = |x - y|$.

1.1 Differentiation and derivatives

• A function $f$ is differentiable at $x$ if the following limit exists:

$$\lim_{z \to 0} \frac{f(z + x) - f(x)}{z}.$$

We denote this limit by $f'(x)$ or $\frac{df(x)}{dx}$, and call it the derivative of the function $f$ at $x$. A function is differentiable if it is differentiable at every point in its domain.

— the derivative $f'(x)$ is the slope of the tangent line of $f$ at $x$. Roughly speaking, it measures the rate of change of $f(x)$ when $x$ changes.

— the differential of $f$ is $df(x) = f'(x)\,dx$.

• Differentiability and continuity:

— if a function $f$ is differentiable at a point $x$, then it must be continuous at this point. The proof is simple: as $z \to 0$,

$$f(z + x) - f(x) = \frac{f(z + x) - f(x)}{z} \cdot z \to f'(x) \cdot 0 = 0.$$

— a differentiable function is always continuous, but a continuous function may not be differentiable. For example, $f(x) = |x|$ is continuous but not differentiable at $x = 0$.[1]

• A function $f$ is said to be continuously differentiable or of class $C^1$ if it is differentiable and $f'$ is a continuous function.[2]

— polynomial functions are continuously differentiable.

[1] In fact, there exist functions which are continuous but nowhere differentiable. See, for instance, the example on p. 154 of Rudin (1976).

[2] The following function is differentiable everywhere but its derivative is not continuous at $x = 0$:

$$f(x) = \begin{cases} x^2 \sin\frac{1}{x} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0. \end{cases}$$

• Higher order derivatives can be obtained by sequential differentiation. Denote by $f''$ the second (order) derivative, $f'''$ the third (order) derivative, and in general $f^{(n)}$ the derivative of order $n$. More explicitly,

$$f''(x) = \frac{d}{dx} f'(x), \quad \ldots, \quad f^{(n)}(x) = \frac{d}{dx} f^{(n-1)}(x).$$

— a function is of class $C^n$ if its $n$th derivative exists and is a continuous function.
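The limit that defines the derivative can be explored numerically. The following is a minimal sketch, not part of the original notes: the helper names `difference_quotient` and `square` are chosen here for illustration. It evaluates the quotient for $f(x) = x^2$, where it settles near $f'(3) = 6$, and for $f(x) = |x|$ at $0$, where the one-sided quotients disagree.

```python
def difference_quotient(f, x, z):
    """Return (f(x + z) - f(x)) / z, the quotient whose limit defines f'(x)."""
    return (f(x + z) - f(x)) / z


def square(x):
    return x * x


if __name__ == "__main__":
    for z in (1e-1, 1e-3, 1e-5):
        # For f(x) = x^2 at x = 3 the quotient equals 6 + z, so it tends to 6.
        print(z, difference_quotient(square, 3.0, z))
    # For f(x) = |x| at x = 0 the quotient is +1 from the right and -1 from
    # the left, so the two-sided limit does not exist.
    print(difference_quotient(abs, 0.0, 1e-6), difference_quotient(abs, 0.0, -1e-6))
```

A finite step only approximates the limit, so this kind of check illustrates the definition rather than proving differentiability.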

1.2 Computing derivatives

• Useful rules ($k$ is a real constant):

— $(kf)' = kf'$

— $(f \pm g)' = f' \pm g'$

— $(f \cdot g)' = f' \cdot g + f \cdot g'$ (product rule)

— $\left(\frac{f}{g}\right)' = \frac{1}{g^2}(f' \cdot g - f \cdot g')$ (quotient rule)

— $\frac{d}{dx} f(g(x)) = f'(g(x)) \cdot g'(x)$ (chain rule)

— $\frac{d}{dx} f^{-1}(x) = \frac{1}{f'(f^{-1}(x))}$ (inverse function rule)[3]

• Useful formulas ($k$ is a real constant):

— the derivative of a constant function is 0

— $(x^k)' = kx^{k-1}$

— $(e^x)' = e^x$

— $(\ln x)' = \frac{1}{x}$

— $(\sin x)' = \cos x$

— $(\cos x)' = -\sin x$

[2, continued] However, $f'$ cannot be "too" discontinuous, in the sense that $f'$ cannot have any point of discontinuity $x_0$ at which both $f'(x_0^-)$ and $f'(x_0^+)$ exist. For instance, in the above example, $f'(0^-)$ and $f'(0^+)$ do not exist. (See the formal statement on p. 109 of Rudin (1976).)

[3] Notice that $f^{-1}(x)$ is well defined only when $f(x)$ is strictly monotonic on some domain.



Exercise 1 Use the above results to show

(i) $(\sqrt{x})' = \frac{1}{2\sqrt{x}}$

(ii) $(a^x)' = a^x \ln a$ for $a > 0$

(iii) $(\log_a x)' = \frac{1}{x \ln a}$

(iv) $(\ln f(x))' = \frac{f'(x)}{f(x)}$

(v) $\left(f(x)^{g(x)}\right)' = f(x)^{g(x)} \left[ g'(x) \ln f(x) + g(x) \frac{f'(x)}{f(x)} \right]$

Exercise 2 Let $Q(P)$ be the demand for a good at price $P$. Show that the price elasticity is

$$\epsilon = -\frac{d \ln Q}{d \ln P}.$$

Let $R = PQ(P)$ be the revenue. Show that the marginal revenue with respect to price is

$$Q(P)(1 - \epsilon),$$

and the marginal revenue with respect to quantity is

$$P\left(1 - \frac{1}{\epsilon}\right).$$

1.3 Important results

• The Mean Value Theorem:
If $f$ is a continuous function on $[a, b]$ which is differentiable in $(a, b)$, then there exists a point $x \in (a, b)$ such that

$$f(b) - f(a) = (b - a) f'(x).$$

In particular, if $f(a) = f(b)$, then there exists $x \in (a, b)$ such that $f'(x) = 0$.

• L'Hospital's rule:
Suppose that, at some point $x_0$, $f$ and $g$ are both zero or $|f(x_0)| = |g(x_0)| = \infty$, so that $f(x_0)/g(x_0)$ is indeterminate. Then

$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \lim_{x \to x_0} \frac{f'(x)}{g'(x)}$$

if $\lim_{x \to x_0} f'(x)/g'(x)$ exists (including $\infty$).

This rule is very useful in evaluating limits.

• Taylor's expansion:
If $f$ is a $C^{n+1}$ function defined on $(a, b)$, then for any $x, x + \varepsilon \in (a, b)$, we have

$$f(x + \varepsilon) = f(x) + f'(x)\varepsilon + \frac{1}{2} f''(x)\varepsilon^2 + \cdots + \frac{1}{n!} f^{(n)}(x)\varepsilon^n + \frac{1}{(n+1)!} f^{(n+1)}(\bar{x})\varepsilon^{n+1}$$

for some $\bar{x}$ between $x$ and $x + \varepsilon$.



— when $n = 0$, this is just the mean value theorem.

— notice that, as $\varepsilon \to 0$,

$$\frac{\frac{1}{(n+1)!} f^{(n+1)}(\bar{x})\varepsilon^{n+1}}{\varepsilon^n} \to \frac{1}{(n+1)!} f^{(n+1)}(x) \cdot 0 = 0.$$

That is, the last term on the right-hand side decreases faster than $\varepsilon^n$ as $\varepsilon$ decreases to zero. Therefore, when $\varepsilon$ is relatively small, the right-hand side without the last term is a good approximation of $f(x + \varepsilon)$. The accuracy of the approximation increases as $n$ becomes larger or $\varepsilon$ becomes smaller.

— this theorem can be understood by appealing to the mean value theorem:

∗ $f(x + \varepsilon) = f(x) + f'(x_1)\varepsilon$ for some $x_1$ between $x$ and $x + \varepsilon$;

∗ $f'(x_1) = f'(x) + f''(x_2)(x_1 - x) \approx f'(x) + f''(x_2)\frac{\varepsilon}{2}$ for some $x_2$ between $x$ and $x_1$ if $\varepsilon$ is small;

∗ these two steps imply

$$f(x + \varepsilon) \approx f(x) + f'(x)\varepsilon + \frac{1}{2} f''(x_2)\varepsilon^2;$$

∗ we further approximate $f''(x_2) \approx f''(x) + f^{(3)}(x_3)\frac{\varepsilon}{3}$ for some $x_3$ between $x$ and $x_2$, and so

$$f(x + \varepsilon) \approx f(x) + f'(x)\varepsilon + \frac{1}{2} f''(x)\varepsilon^2 + \frac{1}{3!} f^{(3)}(x_3)\varepsilon^3.$$

We can continue this process till $f^{(n+1)}$.
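The claim that the remainder shrinks faster than $\varepsilon^n$ can be illustrated numerically. The sketch below uses $\sin$ purely as a convenient test function (an assumption made here, since its derivatives cycle through $\sin, \cos, -\sin, -\cos$); the function names are illustrative and not from the notes.

```python
import math


def sin_derivative(order, x):
    """n-th derivative of sin at x, using the cycle sin, cos, -sin, -cos."""
    cycle = (math.sin, math.cos, lambda u: -math.sin(u), lambda u: -math.cos(u))
    return cycle[order % 4](x)


def taylor_sin(x, eps, n):
    """Taylor polynomial of order n for sin(x + eps), expanded around x."""
    return sum(sin_derivative(j, x) * eps ** j / math.factorial(j) for j in range(n + 1))


def remainder_ratio(x, eps, n):
    """|sin(x + eps) - polynomial| / |eps|**n, which should vanish as eps -> 0."""
    return abs(math.sin(x + eps) - taylor_sin(x, eps, n)) / abs(eps) ** n


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        # The ratio falls roughly in proportion to eps, as the remainder term predicts.
        print(eps, remainder_ratio(1.0, eps, 2))
```

Printing the ratio for a decreasing sequence of $\varepsilon$ shows it falling towards zero, which is the statement made about the last term of the expansion.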

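Exercise 3 (i) Show $\lim_{x \to 0} \frac{\sin x}{x} = 1$; $\lim_{x \to 0} \frac{e^x - 1}{x} = 1$; $\lim_{x \to \infty} \frac{\sqrt{x}}{\ln x} = \infty$; $\lim_{x \to 0^+} x \ln x = 0$; and $\lim_{x \to 0^+} x^x = 1$.

(ii) Approximate $e^x$ around $x = 0$ by Taylor's expansion.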

1.4 The indefinite integral

• For a function $f : \mathbb{R} \to \mathbb{R}$ which is the derivative of some differentiable function, we call

$$\int f(x)\,dx$$

the indefinite integral (or antiderivative) of $f$. Its meaning is that the derivative of $\int f(x)\,dx$ should be $f$.

— the indefinite integral is the reverse operation of differentiation: we want to recover a function from its derivative.

— the indefinite integral may not exist for some (discontinuous) functions, but it always exists for continuous functions.

— clearly, $\int f(x)\,dx$ does not represent a unique function. If $F(x) = \int f(x)\,dx$, then the derivative of $F(x) + k$ for any constant $k$ is also $f$.

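• Some integration formulas (where $c$ and $k$ are real constants):

— $\int c\,dx = cx + k$

— $\int x^n\,dx = \frac{x^{n+1}}{n+1} + k$ for $n \neq -1$

— $\int \frac{1}{x}\,dx = \begin{cases} \ln x + k & \text{for } x > 0 \\ \ln(-x) + k & \text{for } x < 0 \end{cases}$

— $\int e^x\,dx = e^x + k$

— $\int c^x\,dx = \frac{c^x}{\ln c} + k$ for $c > 0$, $c \neq 1$

— $\int \sin x\,dx = -\cos x + k$

— $\int \cos x\,dx = \sin x + k$

These can be derived from the derivative formulas. Many indefinite integrals, however, are irreducible (i.e., we are unable to derive their formulas explicitly). Examples include $\int e^{-x^2}\,dx$, $\int \frac{1}{\ln x}\,dx$, $\int \frac{e^{-x}}{x}\,dx$, $\int \frac{\sin x}{x}\,dx$, $\int \frac{\cos x}{x}\,dx$, etc.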

Exercise 4 Calculate

$$\int \frac{e^x + 1}{e^x + x}\,dx; \qquad \int (x^2 + 2x + 4)^{1/2}(x + 1)\,dx.$$

1.5 The definite integral

We only review the Riemann integral in this course.

• For a bounded real function $f$ defined on $[a, b]$, we denote by

$$\int_a^b f(x)\,dx$$

the Riemann integral of $f$ over $[a, b]$. Roughly speaking, it measures the area under the graph of $f$ on $[a, b]$.

• A more precise definition goes as follows:

— let $P$ be a partition of $[a, b]$: $\{x_i\}_{i=0}^n$ such that $a = x_0 \leq x_1 \leq \cdots \leq x_{n-1} \leq x_n = b$ and

$$\bigcup_{i=0,\ldots,n-1} [x_i, x_{i+1}] = [a, b].$$

— for $i = 1, \ldots, n$, define $\Delta_i = x_i - x_{i-1}$ and

$$M_i = \sup_{x \in [x_{i-1}, x_i]} f(x), \qquad m_i = \inf_{x \in [x_{i-1}, x_i]} f(x).$$

— we further define

$$U(P, f) = \sum_{i=1}^n M_i \Delta_i, \qquad L(P, f) = \sum_{i=1}^n m_i \Delta_i.$$

— $f$ is Riemann integrable if

$$\sup L(P, f) = \inf U(P, f),$$

where inf and sup are taken over all partitions of $[a, b]$,[4] and we denote this common value by

$$\int_a^b f(x)\,dx.$$

• Do we have easier ways to identify whether $f$ is Riemann integrable? The bounded function $f$ is integrable on $[a, b]$ if $f$ is continuous, monotonic, or has only finitely many points of discontinuity.[5]

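Example 1 The function

$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \cap [a, b] \\ 0 & \text{if } x \in [a, b] \setminus \mathbb{Q} \end{cases}$$

is not Riemann integrable on $[a, b]$.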
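In contrast to Example 1, the upper and lower sums of a well-behaved function squeeze together as the partition is refined. The following sketch computes $U(P, f)$ and $L(P, f)$ on a uniform partition under the stated assumption that $f$ is increasing, so that each supremum sits at the right endpoint and each infimum at the left endpoint. The names `upper_lower_sums` and `square` are illustrative only.

```python
def upper_lower_sums(f, a, b, n):
    """Upper and lower sums U(P, f), L(P, f) on a uniform partition with n cells.

    Assumes f is increasing on [a, b], so on each cell the supremum is attained
    at the right endpoint and the infimum at the left endpoint.
    """
    width = (b - a) / n
    points = [a + i * width for i in range(n + 1)]
    upper = sum(f(points[i]) * width for i in range(1, n + 1))
    lower = sum(f(points[i - 1]) * width for i in range(1, n + 1))
    return upper, lower


def square(x):
    return x * x


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Both sums approach 1/3, the integral of x^2 over [0, 1].
        print(n, upper_lower_sums(square, 0.0, 1.0, n))
```

For an increasing function the gap $U - L$ on a uniform partition equals $(f(b) - f(a))(b - a)/n$, which is why monotonic functions are integrable.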

• Properties of the Riemann integral:

— $\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$ for any $c \in [a, b]$.

— $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$.

— $\int_a^b (f_1 + f_2)\,dx = \int_a^b f_1\,dx + \int_a^b f_2\,dx$.

— if $f_1 \leq f_2$ on $[a, b]$, then $\int_a^b f_1\,dx \leq \int_a^b f_2\,dx$.

— if $f$ is integrable, $|f|$ is integrable as well, and $\left| \int_a^b f\,dx \right| \leq \int_a^b |f|\,dx$.

— if both $f_1$ and $f_2$ are integrable, $f_1 f_2$ is integrable as well.

[4] One can show that $\sup L(P, f) \leq \inf U(P, f)$. See, for instance, p. 124 of Rudin (1976).

[5] In general, a bounded real function on $[a, b]$ is Riemann integrable if and only if it is continuous "almost" everywhere on $[a, b]$.

• The first fundamental theorem of calculus:

Suppose $f$ is Riemann integrable on $[a, b]$. For $x \in [a, b]$, put

$$F(x) = \int_a^x f(t)\,dt.$$

Then $F$ is continuous on $[a, b]$. Moreover, if $f$ is continuous at a point $x \in [a, b]$, then $F$ is differentiable at $x$, and

$$F'(x) = f(x).$$

— this result indicates that integration and differentiation are, in some sense, inverse operations.

• The second fundamental theorem of calculus:

Suppose $f$ is Riemann integrable on $[a, b]$, and there is a differentiable function $F$ on $[a, b]$ such that $F' = f$. Then

$$\int_a^b f(x)\,dx = F(x)\big|_a^b \equiv F(b) - F(a).$$

• The integral mean value theorem:
If $f$ is continuous on $[a, b]$, then there exists some $c \in (a, b)$ such that

$$\int_a^b f(x)\,dx = (b - a) f(c).$$

(Think about why continuity is needed.)

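• Leibniz's rule: if

$$F(t) = \int_{a(t)}^{b(t)} f(x, t)\,dx$$

where all functions are $C^1$, then

$$F'(t) = \int_{a(t)}^{b(t)} \frac{\partial f}{\partial t}(x, t)\,dx + f(b(t), t)\,b'(t) - f(a(t), t)\,a'(t).$$

In particular, we have

$$\frac{d}{dt} \int_a^b f(x, t)\,dx = \int_a^b \frac{\partial f}{\partial t}(x, t)\,dx$$

if $f$ is $C^1$.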

• Some integration rules:

— integration by parts: if both $f$ and $g$ are differentiable, then

$$\int_a^b f \cdot g'\,dx = (f \cdot g)\big|_a^b - \int_a^b f' \cdot g\,dx.$$

This follows from $\int_a^b (f \cdot g)'\,dx = (f \cdot g)\big|_a^b$, by the second fundamental theorem of calculus, together with the product rule.

— change of variables: suppose the function $g(x)$ is monotonic and differentiable. Then

$$\int_a^b f(g(x))\,g'(x)\,dx = \int_{g(a)}^{g(b)} f(z)\,dz.$$

This is because, if we let $z = g(x)$, then $dz = g'(x)\,dx$.
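The rule for differentiating an integral whose limits and integrand both depend on $t$ (with limits $a(t)$ and $b(t)$) can be sanity-checked numerically. The sketch below is an illustration under assumptions made here, not taken from the notes: the integrand $f(x, t) = \sin(tx)$, the limits $a(t) = t$ and $b(t) = t^2$, and a hand-written composite Simpson rule. It compares the right-hand side of the rule with a finite-difference derivative of $F$.

```python
import math


def simpson(g, lo, hi, n=200):
    """Composite Simpson rule for the integral of g over [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3


def F(t):
    """F(t) = integral of sin(t x) for x from a(t) = t to b(t) = t**2."""
    return simpson(lambda x: math.sin(t * x), t, t * t)


def F_prime_rule(t):
    """Integral of df/dt plus the boundary terms f(b, t) b'(t) - f(a, t) a'(t)."""
    interior = simpson(lambda x: x * math.cos(t * x), t, t * t)
    boundary = math.sin(t * t * t) * (2 * t) - math.sin(t * t) * 1.0
    return interior + boundary


def F_prime_numeric(t, h=1e-5):
    """Central finite-difference approximation of F'(t)."""
    return (F(t + h) - F(t - h)) / (2 * h)


if __name__ == "__main__":
    print(F_prime_rule(1.5), F_prime_numeric(1.5))
```

The two printed values should agree to several decimal places; the residual difference reflects the step size of the finite difference, not a failure of the rule.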

Exercise 5 (i) Calculate

$$\int_0^1 x^n \ln x\,dx, \quad \int_1^e \frac{\ln x}{x}\,dx, \quad \int_{-\infty}^0 x^2 e^{2x}\,dx, \quad \text{and} \quad \int_0^{\pi^2/4} \frac{\sin\sqrt{x}}{\sqrt{x}}\,dx.$$

(ii) Suppose $f$ is a continuously differentiable function on $[a, b]$ with $f(a) = f(b) = 0$. Prove that

$$\int_a^b f f'\,dx = 0.$$

If we further have

$$\int_a^b f^2\,dx = 1,$$

prove that

$$\int_a^b x f f'\,dx = -\frac{1}{2} \quad \text{and} \quad \int_a^b x (f')^2\,dx = \frac{1}{2}.$$

(iii) For $0 < t < \infty$, define the Gamma function as

$$\Gamma(t) = \int_0^\infty x^{t-1} e^{-x}\,dx.$$

Show that $\Gamma(t + 1) = t\Gamma(t)$ and $\Gamma(n + 1) = n!$ for $n \in \mathbb{N}$.

(iv) Suppose the demand function is $Q(P)$ with $Q'(P) < 0$. We define consumer surplus at price $P$ as

$$V(P) = \int_0^{Q(P)} [P(t) - P]\,dt$$

where $P(\cdot)$ is the inverse demand function. Show that $V'(P) = -Q(P)$ and $V(P)$ is convex in $P$.



2 Multivariate Calculus

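We will now study functions of several variables. In general, we are interested in functions mapping $\mathbb{R}^n$ into $\mathbb{R}^m$. We continue to work with the Euclidean distance as metric. For $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, recall that it is defined as

$$d(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\| = \sqrt{\sum_{i=1}^n (x_i - y_i)^2}.$$

Consider a (vector-valued) function $f : S \subset \mathbb{R}^n \to \mathbb{R}^m$. We can write it as $f(\mathbf{x}) = [f_1(\mathbf{x})\ f_2(\mathbf{x})\ \cdots\ f_m(\mathbf{x})]^T$ where $f_i : \mathbb{R}^n \to \mathbb{R}$ is a real-valued function.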

2.1 Differentiation

2.1.1 Derivatives

• The partial derivative of $f_i$ with respect to $x_j$ is obtained by holding $x_k$ fixed for all $k \neq j$ and differentiating $f_i$ as if it were a single-variable function of $x_j$. We write

$$\frac{\partial f_i(\mathbf{x})}{\partial x_j} = \lim_{z \to 0} \frac{f_i(\mathbf{x} + z\mathbf{e}_j) - f_i(\mathbf{x})}{z}$$

where $z$ is a real number and $\mathbf{e}_j$ is the unit $n$-dimensional vector with a 1 in position $j$ and zeros everywhere else.

— this partial derivative reflects the impact of a small change of $x_j$ on the value of $f_i$ when all other variables are held constant, or measures the slope of the curve in the $x_j$-direction at the point $\mathbf{x}$.

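• The matrix

$$Df(\mathbf{x}) = \begin{pmatrix} \frac{\partial f_1(\mathbf{x})}{\partial x_1} & \cdots & \frac{\partial f_1(\mathbf{x})}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m(\mathbf{x})}{\partial x_1} & \cdots & \frac{\partial f_m(\mathbf{x})}{\partial x_n} \end{pmatrix},$$

where every entry is a partial derivative of $f$ with respect to an argument, is the derivative or the Jacobian derivative of $f$ at $\mathbf{x}$. (When $m = 1$ (i.e., when $f$ is real-valued), the column vector $\nabla f(\mathbf{x}) = Df(\mathbf{x})^T$ is the gradient vector of $f$.)

• $f$ is $C^1$ if all its partial derivatives exist and are continuous.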

• The extended chain rule: the chain rule can be naturally extended to the multivariate case. Here we only present the simplest case:

Suppose we have a function $f : \mathbb{R}^n \to \mathbb{R}$ where the arguments $(x_1(t), \cdots, x_n(t))$ are themselves functions of another real variable $t$. Then

$$\frac{df}{dt} = \sum_{i=1}^n \frac{\partial f}{\partial x_i} \frac{dx_i}{dt}.$$

— in particular, if the arguments $(x_2, \cdots, x_n)$ can be written as functions of $x_1$, and we wish to know how $f$ changes with $x_1$ allowing for all the indirect effects of $x_1$ on the remaining arguments, then the above chain rule yields:

$$\frac{df}{dx_1} = \underbrace{\frac{\partial f}{\partial x_1}}_{\text{direct effect}} + \underbrace{\frac{\partial f}{\partial x_2} \frac{dx_2}{dx_1} + \cdots + \frac{\partial f}{\partial x_n} \frac{dx_n}{dx_1}}_{\text{indirect effects}}.$$
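The extended chain rule can be checked on a concrete case. The sketch below assumes, for illustration only, $f(x_1, x_2) = x_1^2 x_2$ with $x_1(t) = \cos t$ and $x_2(t) = e^t$; these choices and the function names are not from the notes. It compares the chain-rule sum with a finite-difference derivative of the composite function.

```python
import math


def f(x1, x2):
    return x1 ** 2 * x2


def path(t):
    """Arguments as functions of t: x1(t) = cos t, x2(t) = exp(t)."""
    return math.cos(t), math.exp(t)


def df_dt_chain_rule(t):
    """Sum over i of (partial f / partial x_i) * (dx_i / dt)."""
    x1, x2 = path(t)
    grad = (2 * x1 * x2, x1 ** 2)            # partial derivatives of f
    velocity = (-math.sin(t), math.exp(t))   # derivatives of x1(t) and x2(t)
    return sum(g * v for g, v in zip(grad, velocity))


def df_dt_numeric(t, h=1e-6):
    """Central difference of the composite map t -> f(x1(t), x2(t))."""
    return (f(*path(t + h)) - f(*path(t - h))) / (2 * h)


if __name__ == "__main__":
    print(df_dt_chain_rule(0.7), df_dt_numeric(0.7))
```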

Exercise 6 (i) Compute the partial derivative of the following functions with respect to $x$:

$$e^{xy + x^2}; \qquad \frac{x + y}{x^2 - y}; \qquad [x^2 + y^2]^{1/2}.$$

(ii) Let $Q_1(P_1, P_2, I)$ be the demand function for good 1, where $P_i$ is the price of good $i$ and $I$ is the income. Show that the cross price elasticity of demand for good 1 and its income elasticity are

$$\epsilon_{1,2} = \frac{\partial \ln Q_1}{\partial \ln P_2} \quad \text{and} \quad \epsilon_{1,I} = \frac{\partial \ln Q_1}{\partial \ln I},$$

respectively. If $Q_1 = k P_1^\alpha P_2^\beta I^\gamma$, show that all elasticities are constant.

(iii) Given the two vector-valued functions $f(x, y) = (x^2 + 1, y^2)$ and $g(u, v) = (u + v, v^2)$, compute the Jacobian derivative matrix of $g(f(x, y))$ at the point $(x = 1, y = 1)$.

In the following, we mainly focus on real-valued functions mapping Rn into R.

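• Higher order derivatives:
Let us consider a differentiable real-valued function $f : S \subset \mathbb{R}^n \to \mathbb{R}$. Its derivative $Df(\mathbf{x})$ is a vector $\left( \frac{\partial f}{\partial x_1}\ \cdots\ \frac{\partial f}{\partial x_n} \right)^T$ and it is also a function mapping $S \subset \mathbb{R}^n$ into $\mathbb{R}^n$. Then we can define its second order derivative at $\mathbf{x}$ as

$$D^2 f(\mathbf{x}) = \begin{pmatrix} \frac{\partial^2 f(\mathbf{x})}{\partial x_1^2} & \cdots & \frac{\partial^2 f(\mathbf{x})}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f(\mathbf{x})}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f(\mathbf{x})}{\partial x_n^2} \end{pmatrix}$$

where

$$\frac{\partial^2 f(\mathbf{x})}{\partial x_i \partial x_j} = \frac{\partial}{\partial x_j} \left( \frac{\partial f(\mathbf{x})}{\partial x_i} \right).$$

If each entry exists, we say $f$ is twice differentiable at $\mathbf{x}$. If $\frac{\partial^2 f}{\partial x_i \partial x_j}$ is also continuous at every $\mathbf{x}$ for all $i$ and $j$, $f$ is $C^2$.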



— $D^2 f(\mathbf{x})$ is also called the Hessian of $f$ at $\mathbf{x}$.

— $\frac{\partial^2 f}{\partial x_i^2}$ measures the curvature of $f$ in the $x_i$-direction, and $\frac{\partial^2 f}{\partial x_i \partial x_j}$ measures the rate at which the slope in the $x_i$-direction changes as we change $x_j$.

— (Young's Theorem) if $f$ is $C^2$, then $\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}$. That is, the order of differentiation does not matter for twice continuously differentiable functions.[6]

— higher order derivatives can be obtained by applying differentiation sequentially, though the expressions become complicated.

2.1.2 The implicit function theorem

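• Basic idea: an illustration in $\mathbb{R}^2$

— $f(x, y) = 0$ defines $y$ as an implicit function of $x$ or $x$ as an implicit function of $y$. In many circumstances, $f(x, y) = 0$ is rather complicated so that we cannot solve, say, $y$ as an explicit function of $x$. For example, $e^{xy} + x^2 y = 1$. But we still want to know how the change of $x$ affects $y$.

— applying total differentiation to $f(x, y) = 0$ yields

$$\frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = 0.$$

Then

$$\frac{dy}{dx} = -\frac{\partial f}{\partial x} \Big/ \frac{\partial f}{\partial y}$$

if $\frac{\partial f}{\partial y} \neq 0$.

— or we can write $f(x, y(x)) = 0$ since $y$ is an implicit function of $x$. Then the chain rule implies the same result:

$$\frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\,y'(x) = 0 \implies y'(x) = -\frac{\partial f}{\partial x} \Big/ \frac{\partial f}{\partial y}.$$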

• The implicit function theorem in general

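Let $f_1, \cdots, f_n : \mathbb{R}^{n+m} \to \mathbb{R}$ be $C^1$ functions. Consider the system of $n$ equations

$$\begin{aligned} f_1(y_1, \cdots, y_n; x_1, \cdots, x_m) &= 0 \\ &\ \ \vdots \\ f_n(y_1, \cdots, y_n; x_1, \cdots, x_m) &= 0 \end{aligned}$$

as possibly defining $y_1, \cdots, y_n$ as implicit functions of $x_1, \cdots, x_m$. Suppose $(\mathbf{y}^*, \mathbf{x}^*)$ is a solution. If the matrix

$$Df_{\mathbf{y}}(\mathbf{y}, \mathbf{x}) = \begin{pmatrix} \frac{\partial f_1}{\partial y_1} & \cdots & \frac{\partial f_1}{\partial y_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial y_1} & \cdots & \frac{\partial f_n}{\partial y_n} \end{pmatrix}$$

evaluated at $(\mathbf{y}^*, \mathbf{x}^*)$ is nonsingular, then there exist $C^1$ functions

$$y_i = y_i(\mathbf{x}) \quad \text{for } i = 1, \cdots, n$$

defined on an open ball (or a neighborhood) $B$ around $\mathbf{x}^*$ such that:

(a) $f_i(y_1(\mathbf{x}), \cdots, y_n(\mathbf{x}); x_1, \cdots, x_m) = 0$ for all $\mathbf{x} \in B$ and $i = 1, \cdots, n$,

(b) $\mathbf{y}^* = \mathbf{y}(\mathbf{x}^*)$, and

(c)

$$D\mathbf{y}_{x_j}(\mathbf{x}^*) = -\left[ Df_{\mathbf{y}}(\mathbf{y}^*, \mathbf{x}^*) \right]^{-1} \begin{pmatrix} \frac{\partial f_1(\mathbf{y}^*, \mathbf{x}^*)}{\partial x_j} \\ \vdots \\ \frac{\partial f_n(\mathbf{y}^*, \mathbf{x}^*)}{\partial x_j} \end{pmatrix}$$

or

$$\frac{\partial y_i(\mathbf{x}^*)}{\partial x_j} = -\frac{|A_i|}{|Df_{\mathbf{y}}(\mathbf{y}^*, \mathbf{x}^*)|}$$

where $A_i$ is the matrix $Df_{\mathbf{y}}(\mathbf{y}^*, \mathbf{x}^*)$ with its $i$th column replaced by $\left( \frac{\partial f_1(\mathbf{y}^*, \mathbf{x}^*)}{\partial x_j}\ \cdots\ \frac{\partial f_n(\mathbf{y}^*, \mathbf{x}^*)}{\partial x_j} \right)^T$.

— again, the expression for $D\mathbf{y}_{x_j}(\mathbf{x}^*)$ (i.e., how $x_j$ affects $\mathbf{y}$ at the point $\mathbf{x}^*$) is derived by differentiating the system of equations with respect to $x_j$ (remember that all $y_i$ are functions of $x_j$). (Show it as an exercise.)

— this implicit function theorem is very important in solving optimization problems, as we will see in the next chapter.

[6] There are examples of weird functions which are twice differentiable but not continuously twice differentiable and whose cross partial derivatives are not equal. See Exercise 14.28 on p. 332 of Simon & Blume (1994).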
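The two-variable formula $dy/dx = -\frac{\partial f}{\partial x} / \frac{\partial f}{\partial y}$ can be tried on a curve where an explicit solution happens to exist, so that the answer can be cross-checked. The sketch below assumes the unit circle $x^2 + y^2 - 1 = 0$ as the test case (chosen here for illustration; the helper names are hypothetical and not from the notes).

```python
import math


def implicit_slope(fx, fy):
    """dy/dx = -(df/dx) / (df/dy), valid only where df/dy != 0."""
    if fy == 0:
        raise ValueError("df/dy = 0: the formula does not apply at this point")
    return -fx / fy


def circle_slope(x, y):
    """Slope of the curve x^2 + y^2 - 1 = 0 at a point (x, y) on it."""
    return implicit_slope(2 * x, 2 * y)


def explicit_branch(x):
    """Upper branch y(x) = sqrt(1 - x^2), used only to cross-check the slope."""
    return math.sqrt(1 - x * x)


if __name__ == "__main__":
    h = 1e-6
    numeric = (explicit_branch(0.6 + h) - explicit_branch(0.6 - h)) / (2 * h)
    print(circle_slope(0.6, 0.8), numeric)
```

At $(x, y) = (1, 0)$ on the circle, $\partial f / \partial y = 0$ and the helper raises an error, which mirrors the nonsingularity condition of the theorem.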

Exercise 7 Suppose $x$ and $y$ satisfy $e^{xy} + x^2 y = 1$. Evaluate $\frac{dy}{dx}$ at $(x = 1, y = 0)$.

2.1.3 Taylor’s expansion in Rn

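The spirit of Taylor's expansion in the multi-dimensional case is the same as that in the unidimensional case.

• Taylor's expansion of order one:
Suppose $f$ is a $C^1$ real-valued function defined on an open set $A \subset \mathbb{R}^n$. For any $\mathbf{x}, \mathbf{x} + \varepsilon \in A$, we have

$$f(\mathbf{x} + \varepsilon) = f(\mathbf{x}) + Df(\mathbf{x}) \cdot \varepsilon + R_1(\varepsilon; \mathbf{x})$$

where $\frac{R_1(\varepsilon; \mathbf{x})}{\|\varepsilon\|} \to 0$ as $\varepsilon \to \mathbf{0}$.

• Taylor's expansion of order two:
Suppose $f$ is a $C^2$ real-valued function defined on an open set $A \subset \mathbb{R}^n$. For any $\mathbf{x}, \mathbf{x} + \varepsilon \in A$, we have

$$f(\mathbf{x} + \varepsilon) = f(\mathbf{x}) + Df(\mathbf{x}) \cdot \varepsilon + \frac{1}{2} \varepsilon^T D^2 f(\mathbf{x}) \varepsilon + R_2(\varepsilon; \mathbf{x})$$

where $\frac{R_2(\varepsilon; \mathbf{x})}{\|\varepsilon\|^2} \to 0$ as $\varepsilon \to \mathbf{0}$.

• The expansions of higher orders have similar but more complicated forms. See, for example, p. 835 of Simon & Blume (1994).

• When $\varepsilon$ is relatively small, we can use the expansion without the last term to approximate a function at some point.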
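A second-order expansion can be assembled directly from the gradient and the Hessian. The sketch below assumes the test function $f(x, y) = e^x \cos y$, chosen here only because its derivatives are easy to write down; it is not an example from the notes, and the function names are illustrative.

```python
import math


def f(x, y):
    return math.exp(x) * math.cos(y)


def second_order_taylor(point, eps):
    """f(x) + Df(x).eps + 0.5 * eps' D2f(x) eps for f(x, y) = exp(x) cos(y)."""
    x, y = point
    e1, e2 = eps
    ex, c, s = math.exp(x), math.cos(y), math.sin(y)
    grad = (ex * c, -ex * s)                          # first partial derivatives
    hess = ((ex * c, -ex * s), (-ex * s, -ex * c))    # second partial derivatives
    linear = grad[0] * e1 + grad[1] * e2
    quadratic = hess[0][0] * e1 * e1 + 2 * hess[0][1] * e1 * e2 + hess[1][1] * e2 * e2
    return f(x, y) + linear + 0.5 * quadratic


if __name__ == "__main__":
    approx = second_order_taylor((0.0, 0.0), (0.1, 0.2))
    print(approx, f(0.1, 0.2))
```

Around $(0, 0)$ the expansion reduces to $1 + \varepsilon_1 + \frac{1}{2}(\varepsilon_1^2 - \varepsilon_2^2)$, so the approximation at $\varepsilon = (0.1, 0.2)$ can be compared with the true value by hand.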

Exercise 8 Use the second order Taylor's expansion about $(1, 1)$ to approximate $f(x, y) = \sqrt{xy}$ at $(x = 1.2, y = 0.9)$. (That is, $\mathbf{x} = (1, 1)$ and $\varepsilon = (0.2, -0.1)$.)

2.2 Integrals

Since integration in multi-dimensional space is usually complicated, in this course we will only deal with double integration with $f(x, y) : \Omega \subset \mathbb{R}^2 \to \mathbb{R}$ and well-behaved $\Omega$ (as we will specify). We will also content ourselves with a not very precise exposition.

The domain $\Omega$ can be drawn in a plane with the $x$-axis as the horizontal axis and the $y$-axis as the vertical one. Similarly to the definition of single integration, we can partition the domain $\Omega$ into grids by drawing horizontal and vertical lines on the plane. Let us denote by $x_i$ (with $x_{i-1} < x_i$) the points where the vertical lines cut the $x$-axis and by $y_j$ (with $y_{j-1} < y_j$) the points where the horizontal lines cut the $y$-axis. Then we form the Riemann sum

$$\sum_i \sum_j f(x_i, y_j) \cdot \Delta x_i \cdot \Delta y_j$$

where

$$\Delta x_i = x_i - x_{i-1}, \qquad \Delta y_j = y_j - y_{j-1}.$$

When the partition gets finer and finer, if the limit of this sum exists, then we say $f(x, y)$ is integrable on $\Omega$ and denote the limit by

$$\iint_\Omega f(x, y)\,dx\,dy.$$



An intuitive interpretation of this integral is that, if $f(x, y) \geq 0$, it is just the volume of the solid over $\Omega$ and beneath the graph of $f$.

We can calculate the double integral conveniently in the following three cases:

• $\Omega$ is a rectangle on the plane. That is,

$$\Omega = \{(x, y) : x \in (a, b) \text{ and } y \in (c, d)\}.$$

(This is a special case of the following two more general cases.) In this case, the double integral is written as

$$\int_c^d \int_a^b f(x, y)\,dx\,dy$$

and it can be calculated by first keeping $y$ fixed and integrating over $x$ and then integrating over $y$ (or in the opposite order). That is,

$$\int_c^d \int_a^b f(x, y)\,dx\,dy = \int_c^d \underbrace{\left( \int_a^b f(x, y)\,dx \right)}_{\text{a function of } y} dy.$$

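Example 2

$$\begin{aligned} \int_0^2 \int_0^1 (x^2 y + \sqrt{x})\,dx\,dy &= \int_0^1 \left( \int_0^2 (x^2 y + \sqrt{x})\,dy \right) dx \\ &= \int_0^1 2(x^2 + \sqrt{x})\,dx \\ &= 2\left( \frac{x^3}{3} + \frac{2}{3} x^{3/2} \right) \Big|_0^1 = 2. \end{aligned}$$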

• $\Omega$ has the following form:

$$\Omega = \{(x, y) : x \in (a, b) \text{ and } g(x) < y < h(x)\}.$$

Then the double integral can be calculated as

$$\iint_\Omega f(x, y)\,dx\,dy = \int_a^b \underbrace{\left( \int_{g(x)}^{h(x)} f(x, y)\,dy \right)}_{\text{a function of } x} dx.$$

That is, integrate over $y$ for any given $x$ first, then integrate over $x$.

• A similar case is that $\Omega$ has the following form:

$$\Omega = \{(x, y) : y \in (c, d) \text{ and } g(y) < x < h(y)\}.$$

Then the double integral can be calculated as

$$\iint_\Omega f(x, y)\,dx\,dy = \int_c^d \underbrace{\left( \int_{g(y)}^{h(y)} f(x, y)\,dx \right)}_{\text{a function of } y} dy.$$

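Example 3 $f(x, y) = \frac{x}{\sqrt{16 + y^5}}$ and $\Omega = \{(x, y) : y \in (0, 2) \text{ and } 0 < x < y^2\}$. Then

$$\begin{aligned} \iint_\Omega f(x, y)\,dx\,dy &= \int_0^2 \left( \int_0^{y^2} \frac{x}{\sqrt{16 + y^5}}\,dx \right) dy \\ &= \int_0^2 \frac{y^4}{2\sqrt{16 + y^5}}\,dy \\ &= \int_{16}^{48} \frac{1}{10\sqrt{t}}\,dt = \frac{4}{5}(\sqrt{3} - 1), \end{aligned}$$

where the last line uses the substitution $t = 16 + y^5$.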
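The iterated-integral recipe (integrate over $x$ for each fixed $y$, then over $y$) translates directly into a nested midpoint rule. The sketch below is an illustration with names chosen here, not a method from the notes; it approximates the integrals of Examples 2 and 3 so that the closed-form values $2$ and $\frac{4}{5}(\sqrt{3} - 1)$ can be cross-checked.

```python
import math


def double_integral(f, c, d, g, h, n=400):
    """Midpoint-rule approximation of the integral of f over
    {(x, y): c < y < d, g(y) < x < h(y)}: integrate over x first, then over y."""
    dy = (d - c) / n
    total = 0.0
    for j in range(n):
        y = c + (j + 0.5) * dy
        lo, hi = g(y), h(y)
        dx = (hi - lo) / n
        inner = sum(f(lo + (i + 0.5) * dx, y) for i in range(n)) * dx
        total += inner * dy
    return total


def example_2():
    # Rectangle: x in (0, 1), y in (0, 2); the exact value is 2.
    return double_integral(lambda x, y: x * x * y + math.sqrt(x), 0.0, 2.0,
                           lambda y: 0.0, lambda y: 1.0)


def example_3():
    # y in (0, 2), 0 < x < y^2; the exact value is (4/5)(sqrt(3) - 1).
    return double_integral(lambda x, y: x / math.sqrt(16 + y ** 5), 0.0, 2.0,
                           lambda y: 0.0, lambda y: y * y)


if __name__ == "__main__":
    print(example_2(), example_3(), 0.8 * (math.sqrt(3) - 1))
```

The rectangle is handled by passing constant functions for the $x$-limits, which reflects the remark that the first case is a special case of the other two.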

Some more complicated cases can be handled if $\Omega$ can be divided into several parts, each of which belongs to one of the above three cases, by using the result that

$$\iint_\Omega f(x, y)\,dx\,dy = \iint_{\Omega_1} f(x, y)\,dx\,dy + \iint_{\Omega_2} f(x, y)\,dx\,dy$$

if $\Omega = \Omega_1 \cup \Omega_2$ and $\Omega_1 \cap \Omega_2 = \emptyset$.

As a final remark, in some cases the order in which we take integration matters. Sometimes the calculation involved in one order is much simpler than the other; sometimes the integration can be calculated explicitly only in a certain order.

Exercise 9 Compute

$$\iint_\Omega xy\,dx\,dy$$

where $\Omega = \{(x, y) : x \in (0, 2) \text{ and } x^2 < y < \sqrt{8x}\}$.



3 Using Calculus to Characterize Functions

3.1 Monotonic functions

• A differentiable function $f : (a, b) \to \mathbb{R}$ is increasing iff $f'(x) \geq 0$ for all $x \in (a, b)$. If the inequality is strict, the function is strictly increasing.

Decreasing and strictly decreasing functions can be defined with the inequality reversed.

— notice that a monotonic function need not be differentiable, or even continuous.

— the sum of two increasing (decreasing) functions is still increasing (decreasing).

3.2 Concave and convex functions

We mainly characterize concave functions, since convex functions can be similarly treated but

with all inequalities reversed.

• Definition A real-valued function $f$ defined on a convex set $A \subset \mathbb{R}^n$ is said to be concave if

$$f(\alpha \mathbf{x} + (1 - \alpha)\mathbf{y}) \geq \alpha f(\mathbf{x}) + (1 - \alpha) f(\mathbf{y})$$

for all $\mathbf{x}$ and $\mathbf{y} \in A$ and all $\alpha \in [0, 1]$. It is strictly concave if the inequality is strict for $\mathbf{x} \neq \mathbf{y}$ and $\alpha \in (0, 1)$.

Graphically, the line segment connecting two points in the graph of a concave function lies below the graph.

• Properties:

— $f$ is concave iff $-f$ is convex.

— the sum of two concave (or convex) functions is still concave (or convex).

— a concave or convex function must be continuous on the interior of its domain $A^o$.[7]

— (Jensen's inequality) if $f : \mathbb{R} \to \mathbb{R}$ is concave, then

$$f\left( \int x\,dG(x) \right) \geq \int f(x)\,dG(x)$$

for any distribution function $G(x)$.

[7] Moreover, a concave or convex function is differentiable "almost" everywhere.

We then present two (more practical) tests for concavity:



• A $C^1$ function $f$ defined on a convex set $A \subset \mathbb{R}^n$ is concave if and only if

$$f(\mathbf{x} + \mathbf{z}) \leq f(\mathbf{x}) + Df(\mathbf{x}) \cdot \mathbf{z}$$

for all $\mathbf{x}$ and $\mathbf{x} + \mathbf{z} \in A$. It is strictly concave if the inequality is strict for $\mathbf{z} \neq \mathbf{0}$.

• A twice differentiable function $f$ defined on a convex set $A \subset \mathbb{R}^n$ is concave if and only if its Hessian

$$D^2 f(\mathbf{x}) = \begin{pmatrix} \frac{\partial^2 f(\mathbf{x})}{\partial x_1^2} & \cdots & \frac{\partial^2 f(\mathbf{x})}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f(\mathbf{x})}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f(\mathbf{x})}{\partial x_n^2} \end{pmatrix}$$

is negative semidefinite for any $\mathbf{x} \in A$. The function is convex iff its Hessian is positive semidefinite.

— in the case $A \subset \mathbb{R}$, $f$ is concave iff $f''(x) \leq 0$ for all $x \in A$, and it is convex iff $f''(x) \geq 0$ for all $x \in A$.

— this result can be easily understood by using the previous results. For example, let us consider the single-variable case: Taylor's expansion implies

$$f(x + z) = f(x) + f'(x) z + \frac{1}{2} f''(\bar{x}) z^2$$

for some $\bar{x}$ between $x$ and $x + z$. $f$ is concave iff $f(x + z) \leq f(x) + f'(x) z$ for all $x$ and $x + z \in A$, which is equivalent to $f'' \leq 0$.

• A twice differentiable function $f$ defined on a convex set $A \subset \mathbb{R}^n$ is strictly concave if the Hessian $D^2 f(\mathbf{x})$ is negative definite for any $\mathbf{x} \in A$. The function is strictly convex if the Hessian is positive definite.

— notice that negative definiteness or positive definiteness is only sufficient but not necessary for strict concavity or strict convexity. For example, $f(x) = -x^4$ is strictly concave but $f''(0) = 0$ is not strictly negative.
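For functions of two variables, the definiteness of the Hessian can be read off its principal minors. The following is a minimal sketch of that standard check; the function name and the three test Hessians are assumptions introduced here for illustration.

```python
def classify_hessian_2x2(h11, h12, h22):
    """Classify a symmetric 2x2 Hessian [[h11, h12], [h12, h22]] by its principal minors."""
    det = h11 * h22 - h12 * h12
    if h11 < 0 and det > 0:
        return "negative definite"
    if h11 > 0 and det > 0:
        return "positive definite"
    if h11 <= 0 and h22 <= 0 and det >= 0:
        return "negative semidefinite"
    if h11 >= 0 and h22 >= 0 and det >= 0:
        return "positive semidefinite"
    return "indefinite"


if __name__ == "__main__":
    # f(x, y) = -x^2 + x y - y^2 has the constant Hessian [[-2, 1], [1, -2]].
    print(classify_hessian_2x2(-2.0, 1.0, -2.0))
    # f(x, y) = -(x + y)^2 has the constant Hessian [[-2, -2], [-2, -2]].
    print(classify_hessian_2x2(-2.0, -2.0, -2.0))
    # f(x, y) = x y has the constant Hessian [[0, 1], [1, 0]].
    print(classify_hessian_2x2(0.0, 1.0, 0.0))
```

Note that the zero matrix is both negative and positive semidefinite; the sketch reports the first label that matches. For a non-quadratic function the Hessian varies with $\mathbf{x}$, so the check has to hold at every point of $A$, not just at one.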

Exercise 10 (i) Use different methods to show that (a) for $k > 0$, $x^k$ is strictly convex on $(0, \infty)$ if $k > 1$, and it is strictly concave if $k < 1$; (b) $\ln x$ is concave; (c) $e^x$ is convex.

(ii) For $a, b > 0$, show that the Cobb-Douglas function $f(x, y) = x^a y^b$ defined on $\mathbb{R}^2_+$ is concave iff $a + b \leq 1$ (so that, in particular, $a, b < 1$).

(iii) Give an example in which $f - g$ is not concave though both $f$ and $g$ are concave.

(iv) Let $f$ and $g : \mathbb{R} \to \mathbb{R}$ be two twice differentiable functions. When will $fg$ be convex or concave?



3.3 Quasiconcave and quasiconvex functions

• Definition A real-valued function $f$ defined on a convex set $A \subset \mathbb{R}^n$ is said to be quasiconcave if its upper contour sets

$$\{\mathbf{x} \in A : f(\mathbf{x}) \geq t\}$$

are convex sets. That is, for any $t \in \mathbb{R}$, if $\mathbf{x}$ and $\mathbf{y} \in A$, $f(\mathbf{x}) \geq t$ and $f(\mathbf{y}) \geq t$, then $f(\alpha \mathbf{x} + (1 - \alpha)\mathbf{y}) \geq t$ for any $\alpha \in [0, 1]$.

Analogously, $f$ is quasiconvex if its lower contour sets $\{\mathbf{x} \in A : f(\mathbf{x}) \leq t\}$ are convex sets.

— the definition implies that $f$ is quasiconcave iff

$$f(\alpha \mathbf{x} + (1 - \alpha)\mathbf{y}) \geq \min\{f(\mathbf{x}), f(\mathbf{y})\}$$

for all $\mathbf{x}$ and $\mathbf{y} \in A$, and $\alpha \in [0, 1]$, and $f$ is quasiconvex iff

$$f(\alpha \mathbf{x} + (1 - \alpha)\mathbf{y}) \leq \max\{f(\mathbf{x}), f(\mathbf{y})\}$$

for all $\mathbf{x}$ and $\mathbf{y} \in A$, and $\alpha \in [0, 1]$. (Show them as an exercise.)

— the two concepts are not mutually exclusive. For example, all monotonic functions defined on a convex subset of $\mathbb{R}$ are both quasiconcave and quasiconvex.

— quasiconcavity is a "weaker" requirement than concavity. A concave function defined on a convex set must be quasiconcave; a convex function must also be quasiconvex. (Show them as an exercise.) But, again, a convex function can also be quasiconcave, and a concave function can also be quasiconvex. For example, $f(x) = x^2$ on $[0, \infty)$ is both convex and quasiconcave.

— quasiconcave or quasiconvex functions can be discontinuous (vs concave or convex functions).

• Properties:

— $f$ is quasiconcave iff $-f$ is quasiconvex.

— any nondecreasing transformation of a quasiconcave function is still quasiconcave.[8] In particular, any nondecreasing transformation of a concave function results in a quasiconcave function.[9] (Similar properties hold for quasiconvexity.)

[8] This is an advantage of the concept of quasiconcavity relative to concavity. Concavity is only a cardinal property, which means that an increasing transformation of a concave function can become convex. But quasiconcavity does not suffer from this problem.

[9] But not every quasiconcave function can be obtained from a monotone transformation of some concave function. Otherwise, quasiconcavity would add nothing to concavity in dealing with the optimization problem.



We then present two tests for quasiconcavity:

• A $C^1$ function $f$ defined on a convex set $A \subset \mathbb{R}^n$ is quasiconcave if and only if

$$f(\mathbf{y}) \geq f(\mathbf{x}) \implies Df(\mathbf{x}) \cdot (\mathbf{y} - \mathbf{x}) \geq 0$$

for all $\mathbf{x}$ and $\mathbf{y} \in A$. If the second inequality is strict for $\mathbf{x} \neq \mathbf{y}$, then it is strictly quasiconcave.

— this result has a nice geometric interpretation: the gradient vector at $\mathbf{x}$ and the vector $\mathbf{y} - \mathbf{x}$ must form an acute angle if $\mathbf{y}$ brings a higher value of $f$. (See, for instance, the graph on p. 935 of MWG.)

• A $C^2$ function $f$ defined on a convex set $A \subset \mathbb{R}^n$ is quasiconcave iff the Hessian $D^2 f(\mathbf{x})$ is negative semidefinite in the subspace $\{\mathbf{z} \in \mathbb{R}^n : Df(\mathbf{x}) \cdot \mathbf{z} = 0\}$ for any $\mathbf{x} \in A$. It is strictly quasiconcave if the Hessian $D^2 f(\mathbf{x})$ is negative definite in that subspace for any $\mathbf{x} \in A$.

— since checking negative semidefiniteness is quite complicated, we here only present the practical way to check "the Hessian $D^2 f(\mathbf{x})$ is negative definite in the subspace $\{\mathbf{z} \in \mathbb{R}^n : Df(\mathbf{x}) \cdot \mathbf{z} = 0\}$."

∗ define a bordered Hessian as

$$H_n = \begin{pmatrix} 0 & f_1 & \cdots & f_n \\ f_1 & f_{11} & \cdots & f_{1n} \\ \vdots & \vdots & \ddots & \vdots \\ f_n & f_{n1} & \cdots & f_{nn} \end{pmatrix}$$

where $f_i$ is the partial derivative with respect to $x_i$ at $\mathbf{x}$ and $f_{ij}$ is the cross partial derivative at $\mathbf{x}$.

∗ the condition holds if the leading principal minors of $H_n$ of size $\geq 3$ alternate in sign, with the first one (which has size three) being positive. That is, $(-1)^k |H_k| > 0$ for $k = 2, \cdots, n$.[10]

[10] Notice that $|H_1|$ must be nonpositive.

• For $f(x_1, x_2)$, it is strictly quasiconcave if

$$\begin{vmatrix} 0 & f_1 & f_2 \\ f_1 & f_{11} & f_{12} \\ f_2 & f_{21} & f_{22} \end{vmatrix} > 0;$$

and it is quasiconcave iff

$$\begin{vmatrix} 0 & f_1 & f_2 \\ f_1 & f_{11} & f_{12} \\ f_2 & f_{21} & f_{22} \end{vmatrix} \geq 0.$$
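The two-variable bordered Hessian determinant expands to $2 f_1 f_2 f_{12} - f_1^2 f_{22} - f_2^2 f_{11}$ when $f_{21} = f_{12}$, which makes the test easy to evaluate. The sketch below applies it to two functions chosen here for illustration (neither comes from the notes): $f(x, y) = xy$ on the positive orthant, which is not concave but passes the strict test, and $g(x, y) = x + y^2$, whose determinant is negative so the condition fails.

```python
def bordered_hessian_det(f1, f2, f11, f12, f22):
    """Determinant of [[0, f1, f2], [f1, f11, f12], [f2, f12, f22]] (with f21 = f12)."""
    return 2 * f1 * f2 * f12 - f1 * f1 * f22 - f2 * f2 * f11


def det_product(x, y):
    """f(x, y) = x * y: f1 = y, f2 = x, f11 = f22 = 0, f12 = 1."""
    return bordered_hessian_det(y, x, 0.0, 1.0, 0.0)


def det_x_plus_y_squared(x, y):
    """g(x, y) = x + y**2: g1 = 1, g2 = 2y, g11 = 0, g12 = 0, g22 = 2."""
    return bordered_hessian_det(1.0, 2 * y, 0.0, 0.0, 2.0)


if __name__ == "__main__":
    print(det_product(1.5, 2.0))           # equals 2xy, positive for x, y > 0
    print(det_x_plus_y_squared(1.0, 3.0))  # equals -2 at every point
```

The second result is consistent with a direct check: the points $(t, 0)$ and $(0, \sqrt{t})$ both give $g = t$, but their midpoint gives $g = 3t/4 < t$ for $t > 0$, so the upper contour set is not convex.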

Exercise 11 (i) Give an example in which the sum of two quasiconcave functions is not quasiconcave. (vs concavity)

(ii) For $a, b > 0$, show that the Cobb-Douglas function $f(x, y) = x^a y^b$ must be quasiconcave.

3.4 Homogeneous functions

• Definition A real-valued function $f(x_1, \cdots, x_n)$ defined on a cone is homogeneous of degree $k$ if

$$f(tx_1, \cdots, tx_n) = t^k f(x_1, \cdots, x_n)$$

for all $(x_1, \cdots, x_n)$ and all $t > 0$.[11]

For example, $f(x, y) = x^a y^b$ is homogeneous of degree $a + b$.

• Properties:

— if a $C^1$ function $f$ is homogeneous of degree $k$, then its first order partial derivatives are homogeneous of degree $k - 1$.

— $\frac{f_i}{f_j}$ is homogeneous of degree zero.

— (Euler's theorem) $\sum_{i=1}^n f_i(\mathbf{x}) x_i = k f(\mathbf{x})$.

Exercise 12 Prove the above three properties.
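Before attempting the proofs, the scaling property and Euler's theorem can be spot-checked on a concrete function. The sketch below assumes $f(x, y) = x^2 y + y^3$, which is homogeneous of degree 3; the function and helper names are chosen here for illustration. A numerical check at a few points is of course no substitute for the proof.

```python
def f(x, y):
    return x ** 2 * y + y ** 3


def gradient(x, y):
    """Partial derivatives f_1 = 2xy and f_2 = x^2 + 3y^2."""
    return (2 * x * y, x ** 2 + 3 * y ** 2)


def euler_gap(x, y, k=3):
    """f_1 * x + f_2 * y - k * f(x, y); zero when Euler's theorem holds."""
    f1, f2 = gradient(x, y)
    return f1 * x + f2 * y - k * f(x, y)


def scaling_gap(x, y, t, k=3):
    """f(t x, t y) - t**k * f(x, y); zero for a function homogeneous of degree k."""
    return f(t * x, t * y) - t ** k * f(x, y)


if __name__ == "__main__":
    print(euler_gap(1.2, 0.7), scaling_gap(1.2, 0.7, 2.5))
```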

A Appendix:

A.1 Directional Derivatives

Consider a function $f : \mathbb{R}^n \to \mathbb{R}$. We want to measure the rate of its change at a given point $\mathbf{x}^*$ in a given direction $\mathbf{v} = (v_1, \cdots, v_n)$.[12] To parameterize the direction $\mathbf{v}$ from the point $\mathbf{x}^*$, we write the line through $\mathbf{x}^*$ in the direction $\mathbf{v}$ as

$$\mathbf{x} = \mathbf{x}^* + t\mathbf{v}$$

where $t$ is a real number.

[11] A cone is a set with the property that whenever $\mathbf{x}$ is in this set, every positive scalar multiple $t\mathbf{x}$ of $\mathbf{x}$ is also in the set.

[12] In a unidimensional domain, the direction is unique. But in a multi-dimensional domain, we have infinitely many directions which can be represented by vectors.

The rate of change of $f$ along that line can be evaluated as

$$\frac{df(\mathbf{x}^* + t\mathbf{v})}{dt}\Big|_{t=0} = \sum_{i=1}^n \frac{\partial f(\mathbf{x}^*)}{\partial x_i} v_i = Df(\mathbf{x}^*) \cdot \mathbf{v},$$

where $Df(\mathbf{x}^*)$ is the derivative vector or gradient vector at $\mathbf{x}^*$. This is the derivative of $f$ at $\mathbf{x}^*$ in the direction $\mathbf{v}$. In particular, if $\mathbf{v}$ is the unit coordinate vector $\mathbf{e}_j$, then the directional derivative reduces to the partial derivative with respect to $x_j$.

Since

$$Df(\mathbf{x}^*) \cdot \mathbf{v} = \|Df(\mathbf{x}^*)\| \, \|\mathbf{v}\| \cos\theta$$

where $\|\cdot\|$ is the length of a vector and $\theta$ is the angle between the vectors $Df(\mathbf{x}^*)$ and $\mathbf{v}$ at the base point $\mathbf{x}^*$, we can see that, given $\mathbf{x}^*$ and $\|\mathbf{v}\|$, $f$ increases most rapidly when $\mathbf{v}$ has the same direction as $Df(\mathbf{x}^*)$ (i.e., $\theta = 0$). That is, the gradient vector $Df(\mathbf{x}^*)$ points at $\mathbf{x}^*$ into the direction in which $f$ increases most rapidly.
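The steepest-ascent property of the gradient can be seen by brute force: scan many unit directions and keep the one with the largest directional derivative. The sketch below assumes $f(x, y) = x^2 + 3xy$ and the base point $(1, 2)$, both chosen here for illustration; the helper names are hypothetical.

```python
import math


def f(x, y):
    return x ** 2 + 3 * x * y


def gradient(x, y):
    """Partial derivatives of f: (2x + 3y, 3x)."""
    return (2 * x + 3 * y, 3 * x)


def directional_derivative(point, v):
    """Df(x*) . v, the rate of change of f at x* along the direction v."""
    gx, gy = gradient(*point)
    return gx * v[0] + gy * v[1]


def best_unit_direction(point, steps=3600):
    """Scan unit vectors (cos a, sin a) and return the one with the largest derivative."""
    candidates = [(math.cos(2 * math.pi * i / steps), math.sin(2 * math.pi * i / steps))
                  for i in range(steps)]
    return max(candidates, key=lambda v: directional_derivative(point, v))


if __name__ == "__main__":
    point = (1.0, 2.0)
    gx, gy = gradient(*point)
    norm = math.hypot(gx, gy)
    # The best scanned direction should be close to the normalised gradient.
    print(best_unit_direction(point), (gx / norm, gy / norm))
```

With the direction $(1, 0)$ the directional derivative equals the partial derivative with respect to $x$, which is $8$ at the chosen point.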
