
Physics

Volume I

Classical Mechanics

Notes compiled by

Pau Roldan-Blanco

First version: March 2018

Latest version: September 2018

Preface

This volume covers Classical Mechanics. By this, we refer to the state

of physics at the end of the 19th century. It includes the physics of Newton,

Maxwell, Faraday, and their contemporaries.

Part I, called Newtonian mechanics, covers the set of rules that describe

the motion of bodies in the realm of the macroscopic. We discuss motion,

energy, and conservation laws, both in linear and rotational contexts, as well

as the behavior of waves, fluids, and gases. In the last part of Part I, we

present the Lagrangian and Hamiltonian approaches to these same topics.

[TBW]

Part II, called Electricity and Magnetism, discusses the physics of elec-

tric and magnetic fields. [TBW]

Disclaimer: Not one bit of the material included in this book is original.

Only some presentations and extended materials are mine. The core material

has been compiled from the following sources:

• Susskind, L. and G. Hrabovsky (2013). The Theoretical Minimum.

• The Feynman Lectures on Physics, by Richard Feynman.

• Walter Lewin’s lectures from MIT OpenCourseWare.

I have also drawn from numerous Wikipedia articles, and adapted TikZ templates from various users on the LaTeX Stack Exchange websites. All

credit goes to these people. Any errors or omissions are strictly my own.


Contents

0 Mathematical Preliminaries
0.1 Dynamical Systems
0.2 Trigonometry
0.3 Complex Numbers
0.4 Linear Algebra
0.5 Calculus
0.6 Taylor Approximations
0.7 Differential Equations

I NEWTONIAN MECHANICS

1 Motion and Force
1.1 Newton’s Laws of Motion
1.2 Resistive Forces
1.3 Multi-Particle Systems
1.4 Center of Mass

2 Energy
2.1 Work, Kinetic Energy, and Power
2.2 Potential Energy
2.3 Conservation of Energy
2.4 Collisions
2.5 Impulse and Thrust
2.6 Newton’s Universal Law of Gravitation

3 Rotation
3.1 Moment of Inertia
3.2 Angular Momentum and Torques
3.3 Gyroscopic Motion
3.4 Elliptical Orbits and Kepler’s Laws

4 Stability and Elasticity
4.1 Static Equilibrium
4.2 Stress, Strain, and Elasticity

5 Waves, Fluids, and Oscillations
5.1 Waves and the Doppler shift
5.2 Fluid Statics
5.3 Fluid Dynamics
5.4 Oscillations

6 Heat, Temperature, and Thermodynamics
6.1 Thermal Expansion of Solids and Liquids
6.2 The Ideal Gas Law
6.3 Phase Transitions
6.4 Thermodynamics

7 Lagrangian Mechanics
7.1 The Euler-Lagrange Equation
7.2 Non-Inertial Reference Frames
7.3 Generalized Coordinates
7.4 Symmetry and Conservation Laws

II ELECTRICITY AND MAGNETISM

Chapter 0

Mathematical Preliminaries

In this chapter, we review the basic mathematical toolkit that is indis-

pensable for working out problems in classical mechanics.

0.1 Dynamical Systems

Classical mechanics is built around the notion of determinism and re-

versibility.

Definition 0.1 (Determinism) The law stating that the evolution of a

system is fully predictable from its law of motion and a well-defined starting

point (or initial condition).

Definition 0.2 (Reversibility) The law stating that a system is deter-

ministic regardless of the direction of motion.

Example 0.1 Time is discrete, indexed by n ∈ Z. The state space is S =

{−1, 1}. Then, the law of motion:

σ(n+ 1) = −σ(n)

is deterministic (for any n, the state in (n + 1) can be predicted) and

reversible (for any n, both states (n + 1) and (n − 1) are known). The law

of motion:

$$\sigma(n+1) = \sigma(n)^2$$

is deterministic but not reversible. If σ(n) = 1, then σ(n + 1) = 1, but

if σ(n + 1) = 1, then both σ(n) = 1 and σ(n) = −1 are solutions. Since

the system is deterministic in the n → (n + 1) direction, but not in the

(n+ 1)→ n direction, it is not reversible.
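The two laws of motion in Example 0.1 can be checked mechanically. The sketch below is only an illustration: the helper name `is_reversible` is hypothetical, and it relies on the assumption that, on a finite state space, a deterministic law is reversible exactly when every state has a single predecessor.

```python
STATES = (-1, 1)  # state space S = {-1, 1} from Example 0.1


def flip(s):
    """Law of motion sigma(n+1) = -sigma(n)."""
    return -s


def square(s):
    """Law of motion sigma(n+1) = sigma(n)**2."""
    return s * s


def is_reversible(law, states):
    """Check that each state has exactly one predecessor under `law`.

    On a finite state space this is the same as asking that the map
    s -> law(s) is a bijection of the state space onto itself.
    """
    images = [law(s) for s in states]
    return sorted(images) == sorted(states)


print(is_reversible(flip, STATES))    # True: each state has one predecessor
print(is_reversible(square, STATES))  # False: both -1 and 1 map to 1
```

Both laws are deterministic because they are ordinary functions of the current state; only the first one can be run backwards without ambiguity.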

Definition 0.3 (Information conservation) A system is said to conserve

information if it is both deterministic and reversible.

An implicit assumption in all of classical physics is that all dynamical

systems conserve information, that is, are both deterministic and reversible.

Definition 0.4 (Chaotic Behavior) A system is chaotic if a small change in the initial condition can lead to a large change in the system’s outcome.

Nearly all systems are chaotic, in that their evolution depends sensitively on the value of the initial state. Even if the law of motion of the system is known, our inability to determine its initial state with perfect precision prevents us from predicting its behavior perfectly.

0.2 Trigonometry

Trigonometry is present throughout all the disciplines of physics. Here we cover the basic concepts. We start with some definitions.

Definition 0.5 (Radians) The radian is the standard measure of an angle, defined by:

$$1 \text{ radian} = \frac{180^\circ}{\pi}$$

For example:

• There are 2π radians in 360◦ (a full circle).

• A right angle has $\pi/2$ radians.¹

Trigonometric functions are defined in terms of the properties of right triangles. Let a, b, and c be the altitude, base, and hypotenuse of a right triangle. Let φ be the angle opposite the base, i.e. the angle formed by sides a and c, or φ ≡ ∠ac. Let θ = ∠bc. The angle ∠ab is 90◦ because abc is a right triangle, that is ∠ab = π/2 radians.

Definition 0.6 (Basic trigonometric functions) We define the sine (sin), cosine (cos) and tangent (tan) functions as follows:

$$\sin\theta = \frac{a}{c}, \qquad \cos\theta = \frac{b}{c}, \qquad \tan\theta = \frac{a}{b}.$$

Note we can write:

$$\tan\theta = \frac{\sin\theta}{\cos\theta}$$

Here are some useful properties of sine and cosine functions:

Figure 0.1: The sin and cos functions over [0, 2π]. (Plot of sin θ and cos θ against θ, with ticks at π/2, π, 3π/2 and 2π, and values ranging from −1 to 1.)

• Both sin and cos functions are oscillating waves taking values in [−1, 1]. Since positions in circles range from 0◦ to 360◦, we will typically restrict these functions’ domains to [0, 2π] (see Figure 0.1). Because of their oscillatory shape, these functions are sometimes called sinusoidal (or cosinusoidal) waves.

¹ Incidentally, this clarifies why the circumference of a circle of radius R is 2πR. If we roll the circle on a plane, the total distance it will have traveled after one full revolution is the product of the radius and its total number of radians.

• A useful property relates right triangles to circles. Take a circle of radius c centered at some $(x_0, y_0)$. Suppose c is also the hypotenuse of a right triangle with one vertex at $(x_0, y_0)$, and again denote θ = ∠bc. Then, the position of any point (x, y) on the circle, measured relative to the center, satisfies:

$$x = c\cos\theta, \qquad y = c\sin\theta \tag{0.1}$$

See Figure 0.2 for an illustration of equation (0.1) when c = 1.

Figure 0.2: A right triangle in a circle of radius c. By equation (0.1), we have a = c sin θ and b = c cos θ.

• Pythagorean theorem:

$$\sin^2\theta + \cos^2\theta = 1 \tag{0.2}$$

where $\sin^2(\theta) \equiv \sin(\theta)\sin(\theta)$, and similarly for $\cos^2$. Note this is simply a special case of Pythagoras’ theorem for a right triangle of hypotenuse c = 1, i.e. $a^2 + b^2 = 1$.

• Double-angle rule:

$$\sin\theta\cos\theta = \tfrac{1}{2}\sin 2\theta$$

• For any two angles (θ1, θ2), the following properties hold:

sin(θ1 + θ2) = sin θ1 cos θ2 + cos θ1 sin θ2

sin(θ1 − θ2) = sin θ1 cos θ2 − cos θ1 sin θ2

cos(θ1 + θ2) = cos θ1 cos θ2 − sin θ1 sin θ2

cos(θ1 − θ2) = cos θ1 cos θ2 + sin θ1 sin θ2 (0.3)

• Sine and cosine functions are related by their derivatives:

$$\frac{d}{d\theta}\sin\theta = \cos\theta, \qquad \frac{d}{d\theta}\cos\theta = -\sin\theta \tag{0.4}$$
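The identities in this section are easy to sanity-check numerically. The following sketch evaluates the Pythagorean identity, the double-angle identity sin 2θ = 2 sin θ cos θ, the angle-sum formulas, and the derivative rules at two arbitrary angles. The function name `identity_residuals` and the step size `h` are illustrative choices, and the derivatives are only approximated by central finite differences.

```python
import math


def identity_residuals(t1, t2, h=1e-6):
    """Return the absolute error of each identity at angles t1, t2 (radians)."""
    s1, c1 = math.sin(t1), math.cos(t1)
    s2, c2 = math.sin(t2), math.cos(t2)
    # Central finite differences approximating d/dθ sin θ and d/dθ cos θ at t1.
    d_sin = (math.sin(t1 + h) - math.sin(t1 - h)) / (2 * h)
    d_cos = (math.cos(t1 + h) - math.cos(t1 - h)) / (2 * h)
    return {
        "pythagorean": abs(s1 ** 2 + c1 ** 2 - 1.0),
        "double_angle": abs(math.sin(2 * t1) - 2 * s1 * c1),
        "sin_sum": abs(math.sin(t1 + t2) - (s1 * c2 + c1 * s2)),
        "cos_sum": abs(math.cos(t1 + t2) - (c1 * c2 - s1 * s2)),
        "d_sin": abs(d_sin - c1),
        "d_cos": abs(d_cos + s1),
    }


residuals = identity_residuals(0.7, 1.9)
for name, err in residuals.items():
    print(f"{name}: {err:.2e}")
```

All residuals should be at the level of floating-point or finite-difference error; a residual of order one would signal a mis-stated identity.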

0.3 Complex Numbers

Definition 0.7 (Complex number) A complex number is a number which can be expressed in the form:

$$p + qi$$

where p and q are real numbers, and i is a solution to $x^2 = -1$.

Since no real number can solve the equation $x^2 = -1$, $i = \sqrt{-1}$ is called an imaginary number. Hence, in the number p + qi, p is called the real part, and qi is called the imaginary part.

Using the definition generally, all numbers can be thought of as having the form p + qi, with real numbers having q = 0 and pure imaginary numbers having p = 0 and $q \neq 0$.

Moreover, since i is definitional, the “+” sign in front of qi is just a convention. Then, we adopt the following terminology:

Definition 0.8 (Complex conjugate) The complex conjugate of a complex number p + qi is p − qi.

Complex numbers have the following algebraic properties:

• Addition and subtraction: The real and imaginary parts must be

added/subtracted independently, that is:

(a+ bi) + (c+ di) = (a+ c) + (b+ d)i

(a+ bi)− (c+ di) = (a− c) + (b− d)i

• Multiplication and division: Since $i^2 = -1$, we have:

$$(a+bi)(c+di) = (ac-bd) + (bc+ad)i$$

$$\frac{a+bi}{c+di} = \left(\frac{ac+bd}{c^2+d^2}\right) + \left(\frac{bc-ad}{c^2+d^2}\right)i \tag{0.5}$$

Importantly, complex numbers are intimately related to trigonometric

functions through the so-called Euler’s formula:

Result 0.1 (Euler’s Formula) For any real number θ:

$$e^{i\theta} = \cos\theta + i\sin\theta \tag{0.6}$$

where $i = \sqrt{-1}$ is the imaginary unit.²

² We will provide a proof of this famous result in mathematics by using Taylor expansions in Section 0.6.

Recall that $\cos\theta = x/c$ and $\sin\theta = y/c$ for a right triangle of base x, height y, hypotenuse c, and angle θ = ∠xc (see Figure 0.3). Then, another way of writing Euler’s formula is:

$$x + iy = c\,e^{i\theta} \tag{0.7}$$

In sum, equations (0.6)-(0.7) provide us with a one-to-one mapping from complex numbers to exponential functions and sinusoidal functions: any complex number, composed of a real part x and an imaginary part iy, can be written (i) in the exponential form $c\,e^{i\theta}$, with $c = \sqrt{x^2 + y^2}$ and some angle θ, or (ii) equivalently, in the sinusoidal form, $c(\cos\theta + i\sin\theta)$.

Figure 0.3: Geometric interpretation of Euler’s formula. (Right triangle with base x, height y, hypotenuse c, and angle θ.)

A special case of Euler’s formula is θ = π (i.e. 180◦), in which case:

$$e^{i\pi} + 1 = 0$$

This formula is known as Euler’s identity.
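As a quick numerical illustration of (0.6) and (0.7), the sketch below uses Python’s standard `cmath` module. The helper names `euler_sides` and `to_polar_and_back` are hypothetical; note that `cmath.polar` returns the angle in the interval (−π, π], which is one particular choice of θ.

```python
import cmath
import math


def euler_sides(theta):
    """Return both sides of Euler's formula e^{i theta} = cos(theta) + i sin(theta)."""
    lhs = cmath.exp(1j * theta)
    rhs = complex(math.cos(theta), math.sin(theta))
    return lhs, rhs


def to_polar_and_back(x, y):
    """Write x + iy as c * e^{i theta} and rebuild it from (c, theta)."""
    c, theta = cmath.polar(complex(x, y))  # c = sqrt(x^2 + y^2)
    rebuilt = c * cmath.exp(1j * theta)
    return c, theta, rebuilt


lhs, rhs = euler_sides(0.3)
print(abs(lhs - rhs))                    # tiny: the two sides agree

c, theta, z = to_polar_and_back(3.0, 4.0)
print(c, theta, z)                       # c is 5 up to rounding; z is close to 3+4j

print(abs(cmath.exp(1j * math.pi) + 1))  # Euler's identity, zero up to rounding
```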

0.4 Linear Algebra

In physics, a vector is an object with both length (or magnitude) and a direction in space. We denote it $\vec{r}$, and its length is denoted $|\vec{r}|$. Since we interpret vectors as direction, multiplying a vector by a negative scalar changes its direction. For instance, $-2\vec{r}$ is twice the length of $\vec{r}$ (i.e. it has twice the magnitude), but it points in the opposite direction.

For now, let us work in a Cartesian coordinate space with three dimensions: (x, y, z). Of course, the results below extend to any finite number of dimensions, and even more generally to any vector space.

Definition 0.9 (Linear independence) The vectors $(\vec{e}_x, \vec{e}_y, \vec{e}_z)$ are linearly independent if, for all $a_x, a_y, a_z \in \mathbb{R}$, $a_x\vec{e}_x + a_y\vec{e}_y + a_z\vec{e}_z = \vec{0}$ implies $a_x = a_y = a_z = 0$.

Definition 0.10 (Spanning property) The vectors $(\vec{e}_x, \vec{e}_y, \vec{e}_z)$ are said to span $\vec{V}$ if there exist numbers $V_x, V_y, V_z \in \mathbb{R}$ such that $\vec{V} = V_x\vec{e}_x + V_y\vec{e}_y + V_z\vec{e}_z$.

Definition 0.11 (Basis vectors) The vectors $(\vec{e}_x, \vec{e}_y, \vec{e}_z)$ form a basis of $\vec{V}$ if (i) they are linearly independent; (ii) they span $\vec{V}$.

Thus, a vector $\vec{V}$ can be written in terms of a basis as $\vec{V} = V_x\vec{e}_x + V_y\vec{e}_y + V_z\vec{e}_z$. The numbers $V_x$, $V_y$ and $V_z$ are the so-called components (or coordinates) of $\vec{V}$, and by the linear independence property they are uniquely determined. By convention, we will denote a vector by the unique list of its components, i.e. $\vec{V} = (V_x, V_y, V_z)$.

The simplest basis for any vector in $\mathbb{R}^3$ is the standard basis, with $\vec{e}_x = (1, 0, 0)$, $\vec{e}_y = (0, 1, 0)$, and $\vec{e}_z = (0, 0, 1)$.

For any vector $\vec{V} = (V_x, V_y, V_z)$, Pythagoras’ Theorem relates the components to the vector’s magnitude as follows:

$$|\vec{V}| = \sqrt{V_x^2 + V_y^2 + V_z^2} \tag{0.8}$$

which may serve as a definition of magnitude. Then, the following properties hold:

• Scalar multiplication: For any $\alpha \in \mathbb{R}$,

$$\alpha\vec{V} = (\alpha V_x, \alpha V_y, \alpha V_z)$$

• Addition: For any two vectors $\vec{A}$ and $\vec{B}$, $\vec{A} + \vec{B}$ is obtained by adding up their corresponding components:

$$\left(\vec{A} + \vec{B}\right)_x = A_x + B_x$$

and similarly for y and z, where $\left(\vec{A} + \vec{B}\right)_x$ denotes the x component of the vector $\vec{A} + \vec{B}$.

The product of vectors can be performed using the dot product or the cross product. The dot product (or scalar product) of two vectors results in a scalar number. Where $\theta = \angle\vec{A}\vec{B}$ denotes the angle between the two vectors, the dot product of $\vec{A}$ and $\vec{B}$ is:

$$\vec{A}\cdot\vec{B} = |\vec{A}||\vec{B}|\cos\theta \tag{0.9}$$

Equivalently, in terms of the components of $\vec{A}$ and $\vec{B}$, we can write:

$$\vec{A}\cdot\vec{B} = A_xB_x + A_yB_y + A_zB_z$$

Note, for instance, that $|\vec{A}|^2 = \vec{A}\cdot\vec{A}$, a direct implication of (0.8) and (0.9). Moreover, note that the dot product is negative if $\cos\theta < 0$. From Figure 0.1, this is the case, for instance, if $\theta \in (\pi/2, \pi]$ (i.e. an angle of more than 90◦ and up to 180◦).

Definition 0.12 (Orthogonality) Two vectors $\vec{A}$ and $\vec{B}$ are called orthogonal if they are perpendicular. We denote this by $\vec{A} \perp \vec{B}$.

An implication of orthogonality is that the dot product is zero. To see this, let $\vec{A}$ and $\vec{B}$ be orthogonal. Then $\theta \equiv \angle\vec{A}\vec{B} = \pi/2$ radians (that is, 90◦), so $\cos\theta = 0$. Using (0.9), then $\vec{A}\cdot\vec{B} = 0$.

The cross product (or vector product) of two vectors results in another vector. The cross product can only be performed in $\mathbb{R}^3$. Where $\theta = \angle\vec{A}\vec{B}$ denotes the angle between the two vectors, the cross product of $\vec{A}$ and $\vec{B}$ is:

$$\vec{A}\times\vec{B} = \left(|\vec{A}||\vec{B}|\sin\theta\right)\vec{e} \tag{0.10}$$

where $\vec{e}$ is a unit vector perpendicular to the plane containing $\vec{A}$ and $\vec{B}$. Thus, $\vec{A}\times\vec{B}$ is also perpendicular to the plane containing $\vec{A}$ and $\vec{B}$.

For (x, y, z) coordinates, if (x, y) is the plane containing $\vec{A}$ and $\vec{B}$, then $\vec{A}\times\vec{B}$ lies along the z axis. In the standard representation where x is depth, y is width, and z is height, and measuring θ counterclockwise from $\vec{A}$ to $\vec{B}$, the direction of $\vec{A}\times\vec{B}$ is as follows:

• If $0 < \theta < \pi$ (that is, θ < 180◦), then $\sin\theta > 0$, and thus $\vec{A}\times\vec{B}$ points toward z = +∞ (i.e. “upward direction”, that is toward more positive z values).

• If $\theta > \pi$ (that is, θ > 180◦), then $\sin\theta < 0$, and thus $\vec{A}\times\vec{B}$ points toward z = −∞ (i.e. “downward direction”, that is toward more negative z values).

• If θ = π or θ = 0 (that is, $\vec{A}$ and $\vec{B}$ are parallel), then $\sin\theta = 0$, and thus $\vec{A}\times\vec{B} = \vec{0}$ regardless of $\vec{e}$.

Because $\vec{e}$ is a unit vector, the magnitude of the cross product is:

$$|\vec{A}\times\vec{B}| = |\vec{A}||\vec{B}||\sin\theta|$$

which reduces to $|\vec{A}||\vec{B}|\sin\theta$ whenever $\theta \in [0, \pi]$. Therefore:

• If θ = π/2 (i.e. $\vec{A} \perp \vec{B}$), we have $\sin\theta = 1$, so $|\vec{A}\times\vec{B}| = |\vec{A}||\vec{B}|$. In words, the magnitude of the cross product of two perpendicular vectors is the product of their lengths.

• If $\vec{A}$ and $\vec{B}$ are parallel (θ = 0 or θ = π), so that $\sin\theta = 0$, then $\vec{A}\times\vec{B} = \vec{0}$, and so $|\vec{A}\times\vec{B}| = 0$. In words, if two vectors are parallel, their cross product is the zero vector, and thus its magnitude is zero.

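Some other useful properties of the cross product are as follows:

• Anticommutative: $\vec{A}\times\vec{B} = -\vec{B}\times\vec{A}$.

• Distributive over addition: $\vec{A}\times(\vec{B} + \vec{C}) = \vec{A}\times\vec{B} + \vec{A}\times\vec{C}$.

• Scalar multiplication: $(k\vec{A})\times\vec{B} = \vec{A}\times(k\vec{B}) = k(\vec{A}\times\vec{B})$, $\forall k \in \mathbb{R}$.

• The product rule in differentiation applies:

$$\frac{d}{dt}(\vec{A}\times\vec{B}) = \frac{d\vec{A}}{dt}\times\vec{B} + \vec{A}\times\frac{d\vec{B}}{dt}$$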
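The dot- and cross-product properties above can be verified on a concrete pair of vectors. The sketch below uses plain Python tuples. The helper names are illustrative, and the component formula inside `cross` is the standard one for the standard basis (it is not derived in the text above, so treat it as an assumption of this example).

```python
import math


def dot(a, b):
    """Component form of the dot product: AxBx + AyBy + AzBz."""
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    """Standard component formula for the cross product in R^3."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)


def norm(a):
    """Magnitude, using |A|^2 = A . A."""
    return math.sqrt(dot(a, a))


def angle(a, b):
    """Angle between a and b, recovered from A . B = |A||B| cos(theta)."""
    return math.acos(dot(a, b) / (norm(a) * norm(b)))


A = (1.0, 2.0, 0.0)   # both vectors lie in the (x, y) plane
B = (3.0, -1.0, 0.0)
C = cross(A, B)

print(C)                          # only the z component is non-zero
print(dot(A, C), dot(B, C))       # both zero: C is perpendicular to A and B
print(norm(C), norm(A) * norm(B) * math.sin(angle(A, B)))  # equal magnitudes
print(cross(B, A))                # opposite sign: anticommutativity
```

Here B is reached from A by a clockwise rotation, so the cross product points toward negative z.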

Definition 0.13 (Eigenvalues and Eigenvectors) An $n \times 1$ column vector $\vec{v} \neq \vec{0}$ ³ is an eigenvector of an n-dimensional square matrix A if:

$$A\vec{v} = \lambda\vec{v}$$

where λ is a scalar known as the eigenvalue associated with $\vec{v}$.

³ By this notation we mean that $\vec{v}$ has at least one non-zero entry.

Notice that the $n \times 1$ vector $A\vec{v}$ can be thought of as a linear transformation of $\vec{v}$. Therefore, $\vec{v}$ is an eigenvector if the linear transformation $A\vec{v}$ does not change the direction of the vector, but only scales it by some scalar λ, called the eigenvalue.

By definition, to find an eigenvalue λ and the associated eigenvector $\vec{v}$ of a square matrix A, we may solve the linear equation:

$$(A - \lambda I_n)\vec{v} = \vec{0}$$

where $\vec{0}$ is an $n \times 1$ column vector of zeros, and $I_n$ is an n-dimensional identity matrix. Furthermore, since we are looking for a vector $\vec{v} \neq \vec{0}$, then it must be that:

$$\mathrm{Det}(A - \lambda I_n) = 0$$

This is known as the characteristic equation of A, and its left-hand side is the characteristic polynomial of A. Since this is a polynomial of degree n in λ, it can be factored into the product of n linear terms:

$$\mathrm{Det}(A - \lambda I_n) = (\lambda_1 - \lambda)(\lambda_2 - \lambda)\cdots(\lambda_n - \lambda)$$

where $\lambda_i$, for i = 1, . . . , n, is the i-th root. The numbers $\lambda_1, \dots, \lambda_n$ may be complex numbers, and may not all have distinct values.


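Example 0.2 (Two dimensions example) Consider $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$. Then, $A - \lambda I_2 = \begin{pmatrix} 2-\lambda & 1 \\ 1 & 2-\lambda \end{pmatrix}$, so $\mathrm{Det}(A - \lambda I_2) = 3 - 4\lambda + \lambda^2$. If λ is an eigenvalue, then $3 - 4\lambda + \lambda^2 = 0$, so λ = 1 and λ = 3 are the two eigenvalues of A. The corresponding eigenvectors $\vec{v}_\lambda = (v_1, v_2)^\top$ satisfy $A\vec{v} = \lambda\vec{v}$, so:

$$2v_1 + v_2 = \lambda v_1$$
$$v_1 + 2v_2 = \lambda v_2$$

Thus, for λ = 1, $\vec{v}_{\lambda=1} = (1, -1)^\top$, and for λ = 3, $\vec{v}_{\lambda=3} = (1, 1)^\top$.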

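Example 0.3 (Three dimensions example) Consider $A = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 4 \\ 0 & 4 & 9 \end{pmatrix}$. Then, $A - \lambda I_3 = \begin{pmatrix} 2-\lambda & 0 & 0 \\ 0 & 3-\lambda & 4 \\ 0 & 4 & 9-\lambda \end{pmatrix}$, so $\mathrm{Det}(A - \lambda I_3) = -\lambda^3 + 14\lambda^2 - 35\lambda + 22$. If λ is an eigenvalue, then $-\lambda^3 + 14\lambda^2 - 35\lambda + 22 = 0$, so λ = 1, λ = 2, and λ = 11 are the three eigenvalues of A. The corresponding eigenvectors are $\vec{v}_{\lambda=1} = (0, 2, -1)^\top$, $\vec{v}_{\lambda=2} = (1, 0, 0)^\top$, and $\vec{v}_{\lambda=11} = (0, 1, 2)^\top$.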

Definition 0.14 (Trace) The trace of an n-dimensional square matrix $A = (a_{ij})$ is the sum of its diagonal elements, $\mathrm{Tr}(A) = \sum_{i=1}^{n} a_{ii}$.

Trace and determinants are related in different ways. One relation is given by the eigenvalues of the matrix. Where $\lambda_i$ is the i-th eigenvalue of a square n-by-n matrix A, we have:

$$\mathrm{Tr}(A) = \sum_{i=1}^{n}\lambda_i, \qquad \mathrm{Det}(A) = \prod_{i=1}^{n}\lambda_i$$

The eigenvalues of a matrix can also be used for the following decomposition. Suppose A has n linearly independent eigenvectors $\vec{v}_1, \dots, \vec{v}_n$, with corresponding eigenvalues $\lambda_1, \dots, \lambda_n$ (which need not be distinct). That is, $(\vec{v}_1, \dots, \vec{v}_n)$ is a basis. Then, we have:

$$A = Q\Lambda Q^{-1}$$

where $Q = [\vec{v}_1 \cdots \vec{v}_n]$ is an $n \times n$ matrix whose columns are the n linearly independent eigenvectors of A, and $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ is a diagonal matrix with the corresponding eigenvalues on the diagonal.
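To make the decomposition concrete, the sketch below revisits the 2 × 2 matrix of Example 0.2 in plain Python. It is a minimal illustration under stated assumptions: the helpers `eigenvalues_2x2`, `matmul`, and `inverse_2x2` are hypothetical names, and the eigenvalue formula used (roots of λ² − Tr(A)λ + Det(A) = 0) applies only to 2 × 2 matrices with real eigenvalues.

```python
import math

A = [[2.0, 1.0],
     [1.0, 2.0]]  # matrix from Example 0.2


def eigenvalues_2x2(M):
    """Roots of lambda^2 - Tr(M) lambda + Det(M) = 0 (assumes real roots)."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = math.sqrt(tr * tr - 4 * det)
    return (tr - disc) / 2, (tr + disc) / 2


def matmul(X, Y):
    """Plain nested-list matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def inverse_2x2(M):
    """Inverse of an invertible 2 x 2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


lam1, lam2 = eigenvalues_2x2(A)
Q = [[1.0, 1.0],
     [-1.0, 1.0]]                  # columns: eigenvectors (1, -1) and (1, 1)
Lam = [[lam1, 0.0], [0.0, lam2]]   # eigenvalues on the diagonal
A_rebuilt = matmul(matmul(Q, Lam), inverse_2x2(Q))

print(lam1, lam2)   # eigenvalues 1 and 3
print(A_rebuilt)    # reproduces A up to rounding
```

The same numbers also illustrate the trace and determinant relations: 1 + 3 equals Tr(A) = 4, and 1 · 3 equals Det(A) = 3.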

0.5 Calculus

The derivative of a univariate function f(t) with respect to t will be denoted:

$$\frac{df(t)}{dt} \equiv \lim_{\Delta t \to 0}\frac{f(t + \Delta t) - f(t)}{\Delta t}$$

The integral is the area below a function f(t) in the interval [a, b], and is denoted:

$$\int_a^b f(t)\,dt \equiv \lim_{\Delta t \to 0}\sum_i f(t_i)\,\Delta t$$

which results from dividing the area into small rectangles of base ∆t and adding up their areas.

Result 0.2 (Fundamental Theorem of Calculus) Differentiation and integration are reciprocal in the following sense:

$$\int_a^b f(t)\,dt = F(b) - F(a)$$

where $F : T \mapsto \int_a^T f(t)\,dt$ and $f(t) = \frac{dF(t)}{dt}$.

Result 0.3 (Integration by parts) For any two functions f and g:

$$\int_a^b g(x)\frac{df(x)}{dx}\,dx = f(x)g(x)\Big|_a^b - \int_a^b f(x)\frac{dg(x)}{dx}\,dx \tag{0.11}$$


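Example 0.4 To compute the integral $\int_0^{\pi/2} x\cos x\,dx$, note $\cos x = \frac{d\sin x}{dx}$. Using (0.11):

$$\int_0^{\pi/2} x\cos x\,dx = \int_0^{\pi/2} x\,\frac{d\sin x}{dx}\,dx$$

$$= x\sin x\Big|_0^{\pi/2} - \int_0^{\pi/2}\frac{dx}{dx}\sin x\,dx$$

$$= \frac{\pi}{2}\sin\frac{\pi}{2} - \int_0^{\pi/2}\sin x\,dx$$

$$= \frac{\pi}{2} - (-\cos x)\Big|_0^{\pi/2}$$

$$= \frac{\pi}{2} - 1$$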
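The result of Example 0.4 can be cross-checked against the definition of the integral as a limit of sums of small rectangles. The sketch below uses a midpoint rule, which is one particular choice of the sample points $t_i$; the function name `midpoint_integral` and the number of rectangles `n` are illustrative.

```python
import math


def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with n rectangles of base dt,
    evaluating f at the midpoint of each rectangle."""
    dt = (b - a) / n
    return sum(f(a + (i + 0.5) * dt) for i in range(n)) * dt


approx = midpoint_integral(lambda x: x * math.cos(x), 0.0, math.pi / 2)
exact = math.pi / 2 - 1  # value obtained by integration by parts

print(approx, exact, abs(approx - exact))
```

Increasing `n` shrinks ∆t and moves the sum closer to the exact value, in line with the limit in the definition.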

For multivariate functions, we may speak of partial differentiation. For a function $f(x_1, \dots, x_k)$, we define, for any i = 1, . . . , k:

$$\frac{\partial f}{\partial x_i} \equiv \lim_{\Delta x_i \to 0}\frac{f(x_1, \dots, x_i + \Delta x_i, \dots, x_k) - f(x_1, \dots, x_i, \dots, x_k)}{\Delta x_i}$$

We may collect partial derivatives into a gradient vector:⁴

Definition 0.15 (Gradient Vector) The gradient of a multivariate function is the vector of partial derivatives with respect to each argument:

$$\frac{\partial f}{\partial\vec{x}} \equiv \left(\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_k}\right)$$

Equivalently, we may write $\frac{\partial f}{\partial\vec{x}} = \sum_{j=1}^{k}\frac{\partial f}{\partial x_j}\vec{e}_j$, where $\vec{e}_j$ is the standard basis vector in the j-th coordinate.

Definition 0.16 (Stationary Point) A stationary point of a multivariate function F satisfies $\frac{\partial F}{\partial\vec{x}} = \vec{0}$ (i.e. $\partial_{x_i}F = 0$, $\forall x_i$).

Definition 0.17 (Optima for univariate functions) For a univariate function F(x), a local maximum (minimum) is a stationary point for which $\frac{d^2}{dx^2}F(x) < (>)\ 0$. If $\frac{d^2}{dx^2}F(x) = 0$, the second derivative alone does not settle the matter, and x may be a point of inflection.

⁴ For the gradient we will use the same notation as with the simple partial derivative, except we denote the partial with respect to the whole vector $\vec{x} \equiv (x_1, \dots, x_k)$.


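Definition 0.18 (Hessian matrix) For a function $F(x_1, \dots, x_k)$ with k ≥ 2, the Hessian matrix H is the matrix of second partial derivatives of F:

$$H \equiv \begin{pmatrix}
\frac{\partial^2 F}{\partial x_1^2} & \frac{\partial^2 F}{\partial x_1\partial x_2} & \cdots & \frac{\partial^2 F}{\partial x_1\partial x_k} \\
\frac{\partial^2 F}{\partial x_2\partial x_1} & \frac{\partial^2 F}{\partial x_2^2} & \cdots & \frac{\partial^2 F}{\partial x_2\partial x_k} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 F}{\partial x_k\partial x_1} & \frac{\partial^2 F}{\partial x_k\partial x_2} & \cdots & \frac{\partial^2 F}{\partial x_k^2}
\end{pmatrix}$$

Notice that the Hessian is symmetric along its diagonal, for $\frac{\partial^2 F}{\partial x_i\partial x_j} = \frac{\partial^2 F}{\partial x_j\partial x_i}$, $\forall i, j = 1, \dots, k$.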

Result 0.4 (Stationary points for bivariate functions) Consider a bivariate function F(x, y) with Hessian matrix H. Suppose point $(\bar{x}, \bar{y})$ is a stationary point, i.e. $\partial_x F = \partial_y F = 0$ at $(x, y) = (\bar{x}, \bar{y})$. Then, to determine whether this point is a local maximum, a local minimum, or a saddle point, we may use the following rules, with H evaluated at $(\bar{x}, \bar{y})$:

• If Tr(H) > 0 and Det(H) > 0, then $(\bar{x}, \bar{y})$ is a local minimum. Equivalently, H is positive definite (i.e. strictly positive eigenvalues).

• If Tr(H) < 0 and Det(H) > 0, then $(\bar{x}, \bar{y})$ is a local maximum. Equivalently, H is negative definite (i.e. strictly negative eigenvalues).

• If Det(H) < 0, then $(\bar{x}, \bar{y})$ is a saddle irrespective of the trace. Equivalently, H is indefinite (i.e. eigenvalues have mixed signs).

These rules are only valid for bivariate functions (i.e. 2 × 2 Hessian matrices). They extend non-trivially to functions of k ≥ 3 variables.

Example 0.5 (Finding Stationary Points) Consider $F(x, y) = \sin x + \sin y$. Then $\partial_x F = \cos x$ and $\partial_y F = \cos y$. The Hessian matrix is:

$$H = \begin{pmatrix} \frac{\partial^2 F}{\partial x^2} & \frac{\partial^2 F}{\partial x\partial y} \\ \frac{\partial^2 F}{\partial y\partial x} & \frac{\partial^2 F}{\partial y^2} \end{pmatrix} = \begin{pmatrix} -\sin x & 0 \\ 0 & -\sin y \end{pmatrix}$$

Notice that $\mathrm{Det}(H) = \sin x\sin y$ and $\mathrm{Tr}(H) = -\sin x - \sin y$.

• Clearly, $(x, y) = \left(\frac{\pi}{2}, \frac{\pi}{2}\right)$ is a stationary point, as we recall from Figure 0.1 that cos(π/2) = 0. At $(x, y) = \left(\frac{\pi}{2}, \frac{\pi}{2}\right)$, we have Det(H) = 1 > 0 (recall that sin(π/2) = 1) and Tr(H) = −2 < 0. Therefore, the point is a maximum.

• Likewise, $(x, y) = \left(\frac{\pi}{2}, -\frac{\pi}{2}\right)$ is also stationary, with Det(H) = −1 < 0 and Tr(H) = 0, so the point is a saddle. Similarly, $(x, y) = \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$ is also a saddle point.

• Finally, $(x, y) = \left(-\frac{\pi}{2}, -\frac{\pi}{2}\right)$ is a stationary point, with Det(H) = 1 > 0 and Tr(H) = 2 > 0, so this point is a minimum.

Now consider $F(x, y) = \cos x + \cos y$. Then $\partial_x F = -\sin x$ and $\partial_y F = -\sin y$. The Hessian matrix is:

$$H = \begin{pmatrix} -\cos x & 0 \\ 0 & -\cos y \end{pmatrix}$$

so $\mathrm{Det}(H) = \cos x\cos y$ and $\mathrm{Tr}(H) = -\cos x - \cos y$.

• The point (x, y) = (π, π) is a stationary point, as we recall from Figure 0.1 that sin π = 0. At (x, y) = (π, π), we have Det(H) = 1 > 0 (recall that cos π = −1) and Tr(H) = 2 > 0. Therefore, the point is a minimum.

• It is easily checked that (π, −π), (−π, π), and (−π, −π) are all stationary and minima.
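The trace and determinant rules for 2 × 2 Hessians translate directly into a small classifier. The sketch below applies them to the stationary points of this example; the function names `classify`, `hessian_sin_sum`, and `hessian_cos_sum` are hypothetical, and the Hessian entries are hard-coded from the formulas derived above rather than computed symbolically.

```python
import math


def classify(hxx, hxy, hyy):
    """Classify a stationary point from the entries of a symmetric 2 x 2 Hessian."""
    det = hxx * hyy - hxy * hxy
    tr = hxx + hyy
    if det < 0:
        return "saddle"
    if det > 0 and tr > 0:
        return "minimum"
    if det > 0 and tr < 0:
        return "maximum"
    return "inconclusive"  # Det(H) = 0: the rules above do not apply


def hessian_sin_sum(x, y):
    """Hessian entries (Hxx, Hxy, Hyy) of F(x, y) = sin x + sin y."""
    return -math.sin(x), 0.0, -math.sin(y)


def hessian_cos_sum(x, y):
    """Hessian entries (Hxx, Hxy, Hyy) of F(x, y) = cos x + cos y."""
    return -math.cos(x), 0.0, -math.cos(y)


half_pi = math.pi / 2
print(classify(*hessian_sin_sum(half_pi, half_pi)))    # maximum
print(classify(*hessian_sin_sum(half_pi, -half_pi)))   # saddle
print(classify(*hessian_sin_sum(-half_pi, -half_pi)))  # minimum
print(classify(*hessian_cos_sum(math.pi, math.pi)))    # minimum
```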

0.6 Taylor Approximations

In certain situations we will have to approximate functions around a

point to gain analytical tractability. For this, we will typically use Taylor

expansions:

Definition 0.19 (Taylor series) Let k ≥ 1 be an integer and let the function $f : \mathbb{R} \to \mathbb{R}$ be k-times differentiable at some point $a \in \mathbb{R}$. Then, the Taylor series (or expansion) of f to the k-th order is:

$$T^k(x, a) \equiv f(a) + \frac{df(x)}{dx}\Big|_{x=a}(x-a) + \frac{1}{2!}\frac{d^2f(x)}{dx^2}\Big|_{x=a}(x-a)^2 + \cdots + \frac{1}{k!}\frac{d^kf(x)}{dx^k}\Big|_{x=a}(x-a)^k$$

Example 0.6 (Second-order expansions) The second-order Taylor expansion of f around x = a is $f(a) + f'(a)(x-a) + f''(a)\frac{(x-a)^2}{2}$. Examples:

• For $f(x) = \ln x$, then $f'(x) = 1/x$, and $f''(x) = -1/x^2$, so:

$$T^2(x, a) = \ln a + \frac{x-a}{a} - \frac{1}{2}\left(\frac{x-a}{a}\right)^2$$

For instance, around a = 1, then $T^2(x, 1) = (x-1) - \frac{1}{2}(x-1)^2$.

• For $f(x) = e^x$, then $f'(x) = e^x$, and $f''(x) = e^x$, so:

$$T^2(x, a) = e^a\left(1 + x - a + \frac{(x-a)^2}{2}\right)$$

For instance, around a = 0, then $T^2(x, 0) = 1 + x + \frac{x^2}{2}$.

• For $f(x) = \sin x$, then $f'(x) = \cos x$, and $f''(x) = -\sin x$, so:

$$T^2(x, a) = \sin a + (x-a)\cos a - \frac{(x-a)^2}{2}\sin a$$

For instance, around a = 0, then $T^2(x, 0) = x$ (using that $\sin 0 = 1 - \cos 0 = 0$).

• For $f(x) = \cos x$, then $f'(x) = -\sin x$, and $f''(x) = -\cos x$, so:

$$T^2(x, a) = \cos a - (x-a)\sin a - \frac{(x-a)^2}{2}\cos a$$

For instance, around a = 0, then $T^2(x, 0) = 1 - \frac{x^2}{2}$.

Result 0.5 (Taylor’s Theorem) Let k ≥ 1 be an integer and let the function $f : \mathbb{R} \to \mathbb{R}$ be k-times differentiable at some point $a \in \mathbb{R}$. Then, there exists a function $h_k : \mathbb{R} \to \mathbb{R}$ such that:

$$f(x) = T^k(x, a) + h_k(x)(x-a)^k$$

where $T^k(x, a)$ is the k-th order Taylor expansion of f around x = a, and $\lim_{x\to a} h_k(x) = 0$.

Then, to approximate a function f(x) around some x = a, we may use Taylor’s Theorem. In particular, $f(x) \approx T^k(x, a)$ around x = a. Since the error of approximation (the term $h_k(x)(x-a)^k$ in Taylor’s Theorem) vanishes faster than $(x-a)^k$ as x → a, the approximation close to a is typically better for higher-order expansions.

Example 0.7 (k-th order expansions) Generally, for expansions of order k:

• For $f(x) = \ln x$, then $T^k(x, a) = \ln a + \sum_{n=1}^{k}\frac{(-1)^{n+1}}{n}\left(\frac{x-a}{a}\right)^n$. For instance, around x = 1, $\ln x \approx \sum_{n=1}^{k}(-1)^{n+1}\frac{(x-1)^n}{n}$.

• For $f(x) = e^x$, then $T^k(x, a) = e^a\sum_{n=0}^{k}\frac{(x-a)^n}{n!}$. For instance, around x = 0, $e^x \approx \sum_{n=0}^{k}\frac{x^n}{n!}$.

• For $f(x) = \sin x$, then $T^k(x, a) = \sin a + (x-a)\cos a - \frac{(x-a)^2}{2}\sin a - \frac{(x-a)^3}{3!}\cos a + \frac{(x-a)^4}{4!}\sin a + \dots$ For instance, around x = 0, $\sin x \approx x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots$

• For $f(x) = \cos x$, then $T^k(x, a) = \cos a - (x-a)\sin a - \frac{(x-a)^2}{2}\cos a + \frac{(x-a)^3}{3!}\sin a + \frac{(x-a)^4}{4!}\cos a + \dots$ For instance, around x = 0, $\cos x \approx 1 - \frac{x^2}{2} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots$
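A short numerical experiment shows how these expansions behave as the order k grows. The sketch below is illustrative only: the function names are hypothetical, and the ln x coefficients $(-1)^{n+1}/n$ are the ones that follow from the n-th derivative of ln x, namely $(-1)^{n-1}(n-1)!/x^n$, divided by n!.

```python
import math


def taylor_exp(x, a, k):
    """k-th order Taylor polynomial of e^x around a."""
    return math.exp(a) * sum((x - a) ** n / math.factorial(n) for n in range(k + 1))


def taylor_ln(x, a, k):
    """k-th order Taylor polynomial of ln x around a > 0."""
    return math.log(a) + sum(
        (-1) ** (n + 1) / n * ((x - a) / a) ** n for n in range(1, k + 1)
    )


def taylor_sin0(x, k):
    """Taylor polynomial of sin x around 0, keeping terms up to degree k."""
    return sum(
        (-1) ** m * x ** (2 * m + 1) / math.factorial(2 * m + 1)
        for m in range((k + 1) // 2)
    )


# The approximation error of e^x at x = 0.5 shrinks as the order increases.
for k in (1, 2, 4, 8):
    print(k, abs(taylor_exp(0.5, 0.0, k) - math.exp(0.5)))

print(taylor_ln(1.1, 1.0, 2), math.log(1.1))   # second-order value vs. exact
print(taylor_sin0(0.3, 7), math.sin(0.3))
```

The evaluation points are deliberately close to the expansion point; far from it (for instance, ln x around a = 1 evaluated at x > 2) adding terms need not help.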

Example 0.8 (Euler’s Formula proof) One way of proving Euler’s Formula (Result 0.1) is to use an infinite Taylor expansion for $e^{it}$ around t = 0. Indeed, separating the even powers of (it) from the odd ones, note:

$$e^{it} = \sum_{n=0}^{+\infty}\frac{(it)^n}{n!} = \sum_{n=0}^{+\infty}\frac{(-1)^n t^{2n}}{(2n)!} + i\sum_{n=1}^{+\infty}\frac{(-1)^{n-1} t^{2n-1}}{(2n-1)!}$$

We recognize the Taylor expansion for cos t in the first summation, and that of sin t in the second one. Thus, $e^{it} = \cos t + i\sin t$.


0.7 Differential Equations

Definition 0.20 (Differential equation) A differential equation (DE) is

an equation relating a function with its derivatives.

DEs may be distinguished along different dimensions. The most common

distinctions are the following:

1. Ordinary/Partial DEs:

• Ordinary differential equations (ODE) are equations containing

an unknown function y of a single independent variable t, its

derivatives, and some known functions of t.

• Partial differential equations (PDE) are equations containing unknown multivariate functions and their (partial) derivatives.

2. Linear/Non-linear DEs:

Linear DEs involve exclusively a linear polynomial in the unknown function and its derivatives. A k-th order linear ODE has the form:

$$a_0(t)y + a_1(t)\frac{dy}{dt} + a_2(t)\frac{d^2y}{dt^2} + \cdots + a_k(t)\frac{d^ky}{dt^k} + b(t) = 0 \tag{0.12}$$

where a0(t), . . . , ak(t) and b(t) are arbitrary but known functions of t

(these functions need not be linear), and y is the unknown function

(of t) that we are solving for. Further, if a0, . . . , ak are constant in t,

then (0.12) is called a constant-coefficient linear ODE.

A linear PDE would be similar, with the difference that y (and possibly

a0, . . . , ak, b) would be multivariate, and the derivatives in (0.12) would

be partial derivatives.

3. Homogenous/nonhomogenous DEs:

Homogenous DEs involve functions that are homogenous of the same degree.⁵ DEs are nonhomogenous if this requirement fails.

Some examples:

• $\frac{dy}{dt} = cy + t^2$ is a nonhomogenous (first-order, linear, ordinary) DE, while $\frac{dy}{dt} = cy$ is homogenous.

• The second-order linear ODE $\sin(x)\frac{d^2y}{dx^2} + a\frac{dy}{dx} + y = 0$ is homogenous, whereas $2x^2\frac{d^2y}{dx^2} + ax\frac{dy}{dx} + y = 2$ is nonhomogenous.

• The second-order linear ODE $a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x)$ is homogenous only if f(x) = 0, ∀x. Otherwise, it is nonhomogenous.

Solution methods for DEs vary depending on the type of DE. Sometimes,

the same DE can be solved using different methods. Often, a DE can only be

solved numerically. Selecting the right method is very much case-specific.

Here, we will introduce some of the more popular methods for the most

common types of DEs that we encounter in physics.

• We will start with a general method that is valid for both ODEs and

PDEs, as long as they are linear and homogenous (Method I).

• We then present a method that works for linear, first-order ODEs,

even when not homogenous (Method II).

• Then, we will see methods for solving constant-coefficient, second-

order linear ODEs, both homogenous (Method III) and nonhomoge-

nous (Method IV).

• Finally, we will move to ODEs with non-constant coefficients for both

the homogenous (Method V) and the nonhomogenous (Method VI)

cases.

Remark 0.1 (Method I: Separation of Variables)

Used for: Linear and homogenous ODEs and PDEs.

The method: The method of separation of variables (or Fourier method) consists of conjecturing that the dependence on each variable can be factored out as a product. For an ODE in the unknown function y of t, this means that the equation can be written as:

$$\frac{dy}{dt} = g(t)h(y)$$

for some continuous functions g, h. For a PDE, it means conjecturing that the solution itself is a product of univariate functions, as in the example below.

⁵ A multivariate function $g(x_1, \dots, x_k)$ is said to be homogenous of degree n if $g(\alpha x_1, \dots, \alpha x_k) = \alpha^n g(x_1, \dots, x_k)$, for all $\alpha \neq 0$.

Example 0.9 (Heat Equation) Consider the bivariate function u ≡ u(x, t), obeying the PDE:⁶

$$\frac{\partial u}{\partial t} = \alpha\frac{\partial^2 u}{\partial x^2}$$

By the Method of Separation of Variables, we conjecture a solution of the form:

$$u(x, t) = X(x)T(t)$$

Plugging back and dividing both sides by αX(x)T(t), we find $\frac{T'(t)}{\alpha T(t)} = \frac{X''(x)}{X(x)}$. Since the left-hand side is constant in x and the right-hand side is constant in t, both sides are equal to some constant −λ, that is:

$$T'(t) = -\lambda\alpha T(t) \quad\text{and}\quad X''(x) = -\lambda X(x)$$

Here, λ is known as the separation constant.⁷ Importantly, we have turned the problem into two ODEs:

• First, note $\frac{T'(t)}{T(t)} = -\lambda\alpha$. Taking integrals, $\ln T(t) = \tilde{A} - \int\lambda\alpha\,dt$, so

$$T(t) = Ae^{-\lambda\alpha t}$$

where $\tilde{A} \in \mathbb{R}$ is the constant of integration and $A \equiv e^{\tilde{A}}$.

⁶ This is in fact a famous equation in physics, called the heat equation, and describing the variation in temperature u in a given region x over time t.

⁷ The minus sign on λ will be convenient later on if we assume λ > 0.


• Second, to solve $X''(x) = -\lambda X(x)$, conjecture a solution of the form $X(x) = B\sin(bx) + C\cos(cx)$ for some coefficients B, C, and some numbers b, c to be verified. Then, $X'(x) = Bb\cos(bx) - Cc\sin(cx)$ and $X''(x) = -Bb^2\sin(bx) - Cc^2\cos(cx)$, so

$$-\lambda = \frac{X''(x)}{X(x)} = -\frac{Bb^2\sin(bx) + Cc^2\cos(cx)}{B\sin(bx) + C\cos(cx)}$$

Thus, $\lambda B\sin(bx) + \lambda C\cos(cx) = Bb^2\sin(bx) + Cc^2\cos(cx)$, and thus $b = c = \sqrt{\lambda}$. Hence:

$$X(x) = B\sin(\sqrt{\lambda}x) + C\cos(\sqrt{\lambda}x)$$

In sum, we have found the solution for the heat equation:

$$u(x, t) = Ae^{-\lambda\alpha t}\left(B\sin(\sqrt{\lambda}x) + C\cos(\sqrt{\lambda}x)\right)$$

for some constants A, B, C, λ.⁸
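One way to gain confidence in this product solution is to plug it back into the heat equation numerically. The sketch below estimates $\partial u/\partial t$ and $\partial^2 u/\partial x^2$ with central finite differences and checks that their combination nearly vanishes. The constants (λ = 2, α = 0.5, A = 1, B = 0.7, C = −0.3), the evaluation point, and the step size `h` are arbitrary values chosen only for illustration.

```python
import math


def u(x, t, lam=2.0, alpha=0.5, A=1.0, B=0.7, C=-0.3):
    """Product solution u(x, t) = A e^{-lam alpha t} (B sin(sqrt(lam) x) + C cos(sqrt(lam) x))."""
    r = math.sqrt(lam)
    return A * math.exp(-lam * alpha * t) * (B * math.sin(r * x) + C * math.cos(r * x))


def pde_residual(x, t, alpha=0.5, h=1e-4):
    """Estimate u_t - alpha * u_xx at (x, t) with central finite differences."""
    u_t = (u(x, t + h, alpha=alpha) - u(x, t - h, alpha=alpha)) / (2 * h)
    u_xx = (
        u(x + h, t, alpha=alpha) - 2 * u(x, t, alpha=alpha) + u(x - h, t, alpha=alpha)
    ) / (h * h)
    return u_t - alpha * u_xx


print(pde_residual(0.4, 0.3))   # close to zero, up to finite-difference error
print(pde_residual(1.7, 0.05))
```

The residual is not exactly zero because the derivatives are approximated; what matters is that it is many orders of magnitude smaller than u itself.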

Remark 0.2 (Method II: Integrating Factor) .

Used for: Linear, nonhomogenous, first-order ODEs:9

y′ + a(t)y = b(t)

when a(t), b(t) are continuous functions of t, and y′ ≡ dydt .

The method: First, we conjecture the existence of a function µ(t),

called the integrating factor, which solves:

µ′(t) = µ(t)a(t)

8 The precise values of these constants can be deduced from the four boundary conditions of the system. If $x \in [0, \bar{x}]$ and $t \in [0, \bar{t}]$, the four boundary conditions are the numbers $u(0, \bar{t})$, $u(\bar{x}, 0)$, $u(\bar{x}, \bar{t})$, and $u(0, 0)$.

9 We have normalized the coefficient on $y'$ to one. This is with no loss of generality. Indeed, consider $c(t)y' + a(t)y = b(t)$. Then, just divide both sides by $c(t)$ and define $\tilde{a}(t) \equiv a(t)/c(t)$ and $\tilde{b}(t) \equiv b(t)/c(t)$, and we obtain the equivalent representation $y' + \tilde{a}(t)y = \tilde{b}(t)$.


Multiplying both sides of our ODE by $\mu(t)$ yields $\mu(t)y' + \mu'(t)y = \mu(t)b(t)$. Noting the product rule on the LHS, we can write the ODE as $\big(\mu(t)y\big)' = \mu(t)b(t)$. Integrating both sides:

\[ y = \frac{c_1 + \int \mu(t)b(t)\,dt}{\mu(t)} \tag{0.13} \]

where $c_1$ is a constant of integration. Now, all we need is an expression for $\mu(t)$ to plug into equation (0.13). Since $\frac{\mu'(t)}{\mu(t)} = a(t)$ by assumption, integrating both sides yields $\ln\mu(t) = c_2 + \int a(t)\,dt$, where $c_2$ is a constant of integration. Therefore, we have:

\[ \mu(t) = \tilde{c}_2\, e^{\int a(t)dt} \]

where $\tilde{c}_2 \equiv e^{c_2}$. Substituting this into (0.13) gives:

\[ y(t) = \frac{c_1 + \int \tilde{c}_2\, e^{\int a(t)dt}\, b(t)\,dt}{\tilde{c}_2\, e^{\int a(t)dt}} = \frac{c + \int e^{\int a(t)dt}\, b(t)\,dt}{e^{\int a(t)dt}} \]

where $c \equiv c_1/\tilde{c}_2$. Often, the constant of integration can be deduced from initial conditions. For instance, if a(t) and b(t) are defined on $\mathbb{R}_+$ (say, because t represents real time), we can choose the antiderivatives so that they vanish at t = 0, that is, $\mu(t) = e^{\int_0^t a(s)ds}$. Then $\mu(0) = 1$ and $y(0) = c$, and we can write the solution for the ODE as:

\[ y(t) = e^{-\int_0^t a(s)ds}\, y(0) + e^{-\int_0^t a(s)ds} \int_0^t e^{\int_0^s a(r)dr}\, b(s)\,ds \]

Thus, we have found the following result:

Result 0.6 (Solution for first-order linear ODEs) The solution of first-

order linear ODEs of the form y′ + a(t)y = b(t) is:

\[ y(t) = \frac{c + \int \mu(t)b(t)\,dt}{\mu(t)} \]

where $\mu(t) \equiv e^{\int a(t)dt}$, and c is a constant of integration determined by

initial conditions.
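The formula in Result 0.6 can be evaluated numerically. The sketch below assumes that both antiderivatives are taken to vanish at t = 0, so that µ(0) = 1 and the constant c equals y(0); it approximates the integrals with the trapezoid rule. The function name `solve_linear_ode`, the step count, and the test problem $y' + 2y = 4$, $y(0) = 0$ (whose exact solution is $2(1 - e^{-2t})$) are illustrative assumptions, not part of the text.

```python
import math

def solve_linear_ode(a, b, y0, t_end, n=20000):
    """Value y(t_end) for y' + a(t) y = b(t), y(0) = y0, using the integrating
    factor mu(t) = exp(int_0^t a(s) ds) and trapezoid-rule quadrature."""
    h = t_end / n
    int_a = 0.0                      # running value of int_0^t a(s) ds
    int_mu_b = 0.0                   # running value of int_0^t mu(s) b(s) ds
    prev_a, prev_mu_b = a(0.0), b(0.0)   # mu(0) = 1, so mu(0) b(0) = b(0)
    for i in range(1, n + 1):
        t = i * h
        a_t = a(t)
        int_a += 0.5 * h * (prev_a + a_t)
        mu_b = math.exp(int_a) * b(t)
        int_mu_b += 0.5 * h * (prev_mu_b + mu_b)
        prev_a, prev_mu_b = a_t, mu_b
    return (y0 + int_mu_b) / math.exp(int_a)

approx = solve_linear_ode(lambda t: 2.0, lambda t: 4.0, y0=0.0, t_end=1.0)
exact = 2.0 * (1.0 - math.exp(-2.0))
print(approx, exact)
```

For non-constant a(t) or b(t), only the two lambdas need to change; the quadrature error shrinks as n grows.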



The next two methods will show how to solve in practice for linear,

second-order, constant-coefficient ODEs. That is, we will deal with ODEs

of the type:

y′′ + by′ + cy = d(t) (0.14)

where b, c are constants, and d(t) is continuous in t.10 We will see both

the homogenous case (Method III, Remark 0.4), i.e. d(t) ≡ 0; and the

nonhomogenous case (Method IV, Remark 0.5), i.e. d(t) ≢ 0. In Method V

(Remark 0.7), we will then relax the assumption of constant coefficients.

We begin by stating some general properties and definitions of second-

order linear ODEs:

Result 0.7 (Principle of superposition) If y1, y2 are solutions to a linear, homogenous, second-order ODE with constant coefficients (equation (0.14), with d(t) ≡ 0), then the linear combination:

y = C1y1 + C2y2

is also a solution, for all coefficients C1, C2 ∈ R.

As C1, C2 are arbitrary constants, this implies that linear, homogenous, second-order, constant-coefficient ODEs exhibit infinitely many solutions. The coefficients are then pinned down by the initial conditions.

Example 0.10 Consider $y'' - y = 0$. Then $y = C_1e^{t} + C_2e^{-t}$ is a solution for any value of $C_1, C_2$. Indeed, $y' = C_1e^{t} - C_2e^{-t}$ and $y'' = C_1e^{t} + C_2e^{-t} = y$, so $y'' = y$, $\forall C_1, C_2$. For example, $3e^{t}$ is a solution, $5e^{-t}$ is a solution, and $3e^{t} + 5e^{-t}$ is a solution. If $y(0) = 3$ and $y'(0) = 1$, then using the solution $y(t) = C_1e^{t} + C_2e^{-t}$ at t = 0, we have $3 = C_1 + C_2$ and $1 = C_1 - C_2$, so $C_1 = 2$ and $C_2 = 1$. Thus, $y(t) = 2e^{t} + e^{-t}$.

More generally, the initial conditions are numbers:

10 Again, the coefficient on y′′ is normalized to one wlog.

\[ y(t_0) = y_0 \quad\text{and}\quad y'(t_0) = y'_0 \]

Therefore, where (y1, y2) are solutions, C1 and C2 solve the system of

equations:

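\[ C_1y_1(t_0) + C_2y_2(t_0) = y_0 \tag{0.15a} \]

\[ C_1y'_1(t_0) + C_2y'_2(t_0) = y'_0 \tag{0.15b} \]

Solving, we obtain:

\[ C_1 = \frac{y_0y'_2(t_0) - y'_0y_2(t_0)}{y_1(t_0)y'_2(t_0) - y'_1(t_0)y_2(t_0)} \quad\text{and}\quad C_2 = \frac{-y_0y'_1(t_0) + y'_0y_1(t_0)}{y_1(t_0)y'_2(t_0) - y'_1(t_0)y_2(t_0)} \]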

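Note we can write these in terms of determinants as $C_1 = \frac{\mathrm{Det}(A_1)}{W}$ and $C_2 = \frac{\mathrm{Det}(A_2)}{W}$, where:

\[ A_1 \equiv \begin{pmatrix} y_0 & y_2(t_0) \\ y'_0 & y'_2(t_0) \end{pmatrix}; \quad A_2 \equiv \begin{pmatrix} y_1(t_0) & y_0 \\ y'_1(t_0) & y'_0 \end{pmatrix} \]

and

\[ W \equiv \mathrm{Det}\begin{pmatrix} y_1(t_0) & y_2(t_0) \\ y'_1(t_0) & y'_2(t_0) \end{pmatrix} = y_1(t_0)y'_2(t_0) - y'_1(t_0)y_2(t_0) \]

The number $W \equiv W_{y_1,y_2}(t_0)$ is the so-called Wronskian determinant or, simply, the Wronskian of functions y1 and y2.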
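The determinant formulas for C1 and C2 translate directly into code. The sketch below applies them to the initial-value problem of Example 0.10 ($y'' - y = 0$, $y(0) = 3$, $y'(0) = 1$); the function name `superposition_constants` and its argument order are hypothetical choices made for this illustration.

```python
import math

def superposition_constants(y1, dy1, y2, dy2, t0, y0, dy0):
    """Solve C1*y1(t0) + C2*y2(t0) = y0 and C1*y1'(t0) + C2*y2'(t0) = y0'
    by Cramer's rule, dividing by the Wronskian W at t0."""
    w = y1(t0) * dy2(t0) - dy1(t0) * y2(t0)
    if w == 0:
        raise ValueError("Wronskian is zero at t0: constants are not pinned down")
    c1 = (y0 * dy2(t0) - dy0 * y2(t0)) / w
    c2 = (dy0 * y1(t0) - y0 * dy1(t0)) / w
    return c1, c2

# Example 0.10: y1 = e^t, y2 = e^{-t}, with y(0) = 3 and y'(0) = 1.
c1, c2 = superposition_constants(
    math.exp, math.exp,
    lambda t: math.exp(-t), lambda t: -math.exp(-t),
    t0=0.0, y0=3.0, dy0=1.0,
)
print(c1, c2)
```

The explicit check on `w` mirrors the requirement that the Wronskian be non-zero at t0.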

Definition 0.21 (Wronskian) The Wronskian determinant of two func-

tions y1 and y2 is:

\[ W_{y_1,y_2}(t) \equiv \mathrm{Det}\begin{pmatrix} y_1(t) & y_2(t) \\ y'_1(t) & y'_2(t) \end{pmatrix} = y_1(t)y'_2(t) - y'_1(t)y_2(t) \]

Crucially, for the constants C1, C2 to exist, we need W ≠ 0 at t = t0.

Then, we have arrived at the following result:



Result 0.8 (Fundamental solutions) Suppose y1 and y2 are solutions to

a linear, second-order ODE (equation (0.14), with d(t) ≡ 0). If there is a

point t0 such that $W_{y_1,y_2}(t_0) \neq 0$, then every solution to the ODE can be

expressed as:

yc = C1y1 + C2y2

for some constants C1, C2. The solutions y1 and y2 are called the fun-

damental solutions, and yc is called the general solution.

Example 0.11 (Example 0.10 cont'd) For $y'' - y = 0$, we found that $y_1 = e^{t}$ and $y_2 = e^{-t}$ are two solutions. The Wronskian of y1 and y2 is

\[ W = y_1y'_2 - y'_1y_2 = -e^{t}e^{-t} - e^{t}e^{-t} = -2e^{0} = -2 \neq 0, \quad \forall t \]

Thus, y1 and y2 are two fundamental solutions to $y'' - y = 0$, and can be used to construct the whole set of solutions. In particular, the set of solutions is $\{y_c : y_c = C_1e^{t} + C_2e^{-t};\ C_1, C_2 \in \mathbb{R}\}$.

Remark 0.3 (Linear independence) The following statements are equiv-

alent:

• The functions y1, y2 are fundamental solutions.

• The functions y1, y2 are linearly independent.

• The Wronskian of y1 and y2 is non-zero, $W_{y_1,y_2} \neq 0$.

To see this, recall from Definition 0.9 that y1 and y2 are linearly independent if $C_1y_1 + C_2y_2 = 0$ implies $C_1 = C_2 = 0$. Above, we found that (y1, y2) and (C1, C2) are related through the initial conditions $(y_0, y'_0) \equiv (y(t_0), y'(t_0))$ via the system of equations (0.15a)-(0.15b). Then, (y1, y2) will be linearly independent if the only solution for the constants when $y_0 = y'_0 = 0$ is $C_1 = C_2 = 0$. We derived that $C_1 = \frac{y_0y'_2(t_0) - y'_0y_2(t_0)}{W}$ and $C_2 = \frac{-y_0y'_1(t_0) + y'_0y_1(t_0)}{W}$, where $W \equiv y_1(t_0)y'_2(t_0) - y'_1(t_0)y_2(t_0) \neq 0$ since y1, y2 are fundamental solutions. Then, if $y_0 = y'_0 = 0$, the only solution to the system is indeed $C_1 = C_2 = 0$. ∎


Then, the sense in which the solutions y1, y2 are fundamental is that any linear combination of them is also a solution, but neither y1 nor y2 can be written as a multiple of the other.

To explain the general toolkit to solve for these ODEs, we will also need

the following terminology:

Definition 0.22 (Auxiliary equation) The auxiliary (or characteristic)

equation of a linear, second-order ODE with constant coefficients (equation

(0.14), with d(t) ≡ 0) is the equation:

\[ \alpha^2 + b\alpha + c = 0 \]

It follows that, if (α1, α2) are the roots of the auxiliary equation, then the equation can be written as (α − α1)(α − α2) = 0.

We are now ready to explain the method:

Remark 0.4 (Method III: Exponential guess) .

Used for: Linear, homogenous, second-order ODEs with constant coef-

ficients (equation (0.14), with d(t) ≡ 0).

The method: By Result 0.8, the general solution is built from two fundamental (linearly independent) solutions, (y1, y2). To find them, conjecture that $y_j = e^{\alpha_j t}$, for j = 1, 2, where α1, α2 need to be found. Note that $y'_j = \alpha_j e^{\alpha_j t}$ and $y''_j = \alpha_j^2 e^{\alpha_j t}$. The Wronskian is:

\[ W = (\alpha_2 - \alpha_1)e^{(\alpha_1 + \alpha_2)t} \]

Therefore, if the roots are distinct, W ≠ 0 and y1, y2 are fundamental solutions (and linearly independent). Since each yj is a solution, it satisfies the ODE, $\alpha_j^2 e^{\alpha_j t} + b\alpha_j e^{\alpha_j t} + c e^{\alpha_j t} = 0$, that is:

\[ e^{\alpha_j t}\left(\alpha_j^2 + b\alpha_j + c\right) = 0 \]

This is satisfied for all t if, and only if:

\[ \alpha_j^2 + b\alpha_j + c = 0 \]


This is the auxiliary, or characteristic, equation (Definition 0.22). Thus,

to solve the ODE we must find the roots of the auxiliary equation. These

roots might be real or complex, depending on parameters. Thus, we must

consider different cases:

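1. Case 1: If $b^2 > 4c$, then the roots are real and distinct, given by $(\alpha_1, \alpha_2) = \frac{1}{2}\left(-b + \sqrt{b^2 - 4c},\ -b - \sqrt{b^2 - 4c}\right) \in \mathbb{R}^2$. Using Result 0.8, the general solution of (0.14) is:

\[ y = C_1 e^{\left(-b + \sqrt{b^2 - 4c}\right)\frac{t}{2}} + C_2 e^{\left(-b - \sqrt{b^2 - 4c}\right)\frac{t}{2}} \]

for some arbitrary constants C1, C2.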

2. Case 2: If $b^2 = 4c$, then the roots are real but identical, given by $\alpha_1 = \alpha_2 = -\frac{b}{2}$. Thus, now W = 0, so the method is giving us two (trivially) linearly dependent solutions which are, in fact, one and the same: $y_1 = e^{-\frac{b}{2}t}$.

Yet, we know (Result 0.8) that the general solution requires a second fundamental solution, y2. To find it, we conjecture that $y_2 = vy_1$ for some function v ≡ v(t) to be determined, and look for a functional form for v that satisfies the ODE. Writing $y = ve^{-\frac{b}{2}t}$, note:

\[ y' = v'e^{-\frac{b}{2}t} - \frac{b}{2}ve^{-\frac{b}{2}t} \quad\text{and}\quad y'' = v''e^{-\frac{b}{2}t} - bv'e^{-\frac{b}{2}t} + \frac{b^2}{4}ve^{-\frac{b}{2}t} \]

Into the ODE $y'' + by' + cy = 0$, we get, after some algebra:

\[ v'' - \left(\frac{b^2 - 4c}{4}\right)v = 0 \]

and, therefore, $v'' = 0$ (since $b^2 = 4c$). This implies $v = k_3t + k_4$, for some $k_3, k_4 \in \mathbb{R}$, so $y_2 = vy_1 = (k_3t + k_4)e^{-\frac{b}{2}t}$. Thus, the fundamental solutions are $y_1 = e^{-\frac{b}{2}t}$ and $y_2 = te^{-\frac{b}{2}t}$, and the general solution is $y = k_1e^{-\frac{b}{2}t} + k_2(k_3t + k_4)e^{-\frac{b}{2}t}$, or (since the constants are arbitrary):

\[ y = (C_1 + C_2t)e^{-\frac{b}{2}t} \]

This is thus the general solution when the roots of the characteristic equation are repeated.11

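3. Case 3: If $b^2 < 4c$, then the roots (α1, α2) are complex numbers, and one root is the complex conjugate of the other. That is:

\[ \alpha_1 = \beta + \gamma i \quad\text{and}\quad \alpha_2 = \beta - \gamma i \]

for some real (β, γ), where $i = \sqrt{-1}$. Using Result 0.8, the general solution of (0.14) is:

\[ \begin{aligned} y &= C_1e^{(\beta + \gamma i)t} + C_2e^{(\beta - \gamma i)t} \\ &= e^{\beta t}\left(C_1e^{\gamma i t} + C_2e^{-\gamma i t}\right) \\ &= e^{\beta t}\left[C_1(\cos\gamma t + i\sin\gamma t) + C_2(\cos\gamma t - i\sin\gamma t)\right] \\ &= e^{\beta t}\left[(C_1 + C_2)\cos\gamma t + (C_1i - C_2i)\sin\gamma t\right] \end{aligned} \]

where the third equality uses Euler's formula (Result 0.1). Writing $A = C_1 + C_2$ and $B = C_1i - C_2i$, we have found the solution:

\[ y = e^{\beta t}\left(A\cos\gamma t + B\sin\gamma t\right) \]

for some constants A and B. The solution is real-valued exactly when C1 and C2 are complex conjugates of each other, in which case both A and B are real.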
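The three cases of Method III depend only on the sign of the discriminant $b^2 - 4c$. The following sketch computes the roots of the auxiliary equation with `cmath` (so that complex roots are handled uniformly) and reports which case applies. The function names and the three sample (b, c) pairs are illustrative assumptions.

```python
import cmath

def characteristic_roots(b, c):
    """Roots of alpha^2 + b*alpha + c = 0, returned as complex numbers."""
    root = cmath.sqrt(b * b - 4 * c)
    return (-b + root) / 2, (-b - root) / 2

def classify(b, c):
    """Name the case of Method III that applies to y'' + b y' + c y = 0."""
    disc = b * b - 4 * c
    if disc > 0:
        return "Case 1: real and distinct roots"
    if disc == 0:
        return "Case 2: repeated real root"
    return "Case 3: complex conjugate roots"

for b, c in [(0, -1), (2, 1), (0, 9)]:
    print(classify(b, c), characteristic_roots(b, c))
```

For the first pair (the ODE $y'' - y = 0$ of Example 0.10) the roots are ±1, matching the fundamental solutions $e^{t}$ and $e^{-t}$. With floating-point coefficients, the exact comparison `disc == 0` would normally be replaced by a tolerance.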

The next method generalizes Method III when the second-order, linear

ODE with constant coefficients is nonhomogenous (i.e. d(t) ≢ 0). First, we

introduce two more pieces of terminology:

11 One could further verify that $y_1 = e^{-\frac{b}{2}t}$ and $y_2 = te^{-\frac{b}{2}t}$ are indeed linearly independent. For this, we can simply check that $W_{y_1,y_2} \neq 0$ (Remark 0.3). Indeed, some algebra shows that $W = e^{-bt} \neq 0$, as expected.


Definition 0.23 (Complementary function) The complementary function of a nonhomogenous ODE is the general solution yc to its homogenous counterpart.

For linear, second-order, constant-coefficient ODEs, therefore, the com-

plementary function yc solves y′′ + by′ + cy = 0. To obtain yc, we can just

use Method III (Remark 0.4).

Definition 0.24 (Particular integral) A particular integral of an ODE

is any function yp which satisfies the equation. In other words, a particular

integral is any solution of a DE.

Finally, for our next solution method, we will make use of a general

property of ODEs:

Result 0.9 (General solution) The general solution y of a linear ODE

can be written as:

y = yc + yp

where yc is the complementary function and yp is a particular integral.

The language of this Result seems slightly circular, but it should not

lead to confusion. When applied to our context, the Result states that the

general solution of the nonhomogenous ODE is the sum of the complemen-

tary function (i.e. the general solution to its homogenous counterpart) and

any function that is a solution to the ODE.

Remark 0.5 (Method IV: Method of undetermined coefficients) .

Used for: Linear, nonhomogenous, second-order ODEs with constant

coefficients, that is, equation (0.14) when d(t) ≢ 0.

The method: Result 0.9 has laid out the plan for this method. First,

find the complementary function. Then, find the particular integral. Finally,

add the two together.



• Step 1: The complementary function is the solution yc to the homogenous version of (0.14), $y'' + by' + cy = 0$. For this, we can just use Method III. The solution is (derivations in Remark 0.4):

\[ y_c = \begin{cases} C_1e^{\left(-b + \sqrt{b^2 - 4c}\right)\frac{t}{2}} + C_2e^{\left(-b - \sqrt{b^2 - 4c}\right)\frac{t}{2}} & \text{if } b^2 > 4c \\ (C_1 + C_2t)e^{-\frac{b}{2}t} & \text{if } b^2 = 4c \\ e^{\beta t}\left[(C_1 + C_2)\cos\gamma t + (C_1i - C_2i)\sin\gamma t\right] & \text{if } b^2 < 4c \end{cases} \tag{0.16} \]

for some constants C1, C2, where β (resp. γ) is the real (resp. imaginary) part of the roots of the auxiliary equation, $\alpha^2 + b\alpha + c = 0$ (i.e. such that α = β ± γi). Equation (0.16) is thus the complementary function.

• Step 2: Now, we find a particular integral, yp. This is very much

case-specific, but a commonly fruitful approach is to guess that yp has

the same functional form as d(t), and then use the method of unde-

termined coefficients. Let’s see some examples:

1. d(t) = d (a constant).

Guess $y_p = A \in \mathbb{R}$. Then $y'_p = y''_p = 0$. Substituting into the ODE, we get 0 + 0 + cA = d. In sum (assuming c ≠ 0):

\[ y_p = \frac{d}{c} \]

2. $d(t) = d_1 + d_2t$ (a line).

Guess $y_p = At + B$. Then $y'_p = A$ and $y''_p = 0$. Substituting into the ODE, we get $0 + bA + cAt + cB = d_1 + d_2t$. Matching coefficients, we get $cA = d_2$ (so $A = d_2/c$), and $bA + cB = d_1$, so $B = \frac{1}{c}\left(d_1 - \frac{bd_2}{c}\right)$. In sum:

\[ y_p = \frac{1}{c}\left[d_1 + d_2\left(t - \frac{b}{c}\right)\right] \]


3. $d(t) = d_1\cos(d_2t)$.

Guess $y_p = A\cos(d_2t) + B\sin(d_2t)$. Then $y'_p = -Ad_2\sin(d_2t) + Bd_2\cos(d_2t)$ and $y''_p = -Ad_2^2\cos(d_2t) - Bd_2^2\sin(d_2t)$. Substituting into the ODE, we get:

\[ (Bbd_2 - Ad_2^2 + Ac)\cos(d_2t) - (Bd_2^2 + Abd_2 - Bc)\sin(d_2t) = d_1\cos(d_2t) \]

Matching coefficients, we get (after some algebra):

\[ Bbd_2 + A(c - d_2^2) = d_1 \]

\[ B(d_2^2 - c) + Abd_2 = 0 \]

From the second equation, we get $B = \frac{Abd_2}{c - d_2^2}$. Into the first equation, we get $A = \frac{d_1(c - d_2^2)}{b^2d_2^2 + (c - d_2^2)^2}$. In sum:

\[ y_p = \frac{d_1(c - d_2^2)\cos(d_2t) + d_1d_2b\sin(d_2t)}{b^2d_2^2 + (c - d_2^2)^2} \]

A similar example is $d(t) = d_1\sin(d_2t)$, for which the guess should again be $y_p = A\cos(d_2t) + B\sin(d_2t)$.

4. $d(t) = d_1e^{d_2t}$.

Guess $y_p = Ae^{d_2t}$. Then $y'_p = Ad_2e^{d_2t}$ and $y''_p = Ad_2^2e^{d_2t}$. Substituting into the ODE, we get:

\[ Ae^{d_2t}\left[d_2^2 + bd_2 + c\right] = d_1e^{d_2t} \]

Therefore, $A = \frac{d_1}{d_2^2 + bd_2 + c}$. In sum:

\[ y_p = \frac{d_1}{d_2^2 + bd_2 + c}e^{d_2t} \]

A similar example is $d(t) = d_1e^{-d_2t}$, for which the guess should be $y_p = Ae^{-d_2t}$.


Whatever the case, the general solution to the ODE is, then:

y = yc + yp

by Result 0.9.
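The closed-form particular integral for the cosine forcing term, case 3 above, can be checked numerically. The sketch below builds $y_p$ from the formulas for A and B and then measures the residual $y'' + by' + cy - d(t)$ with central finite differences. The function names, the step size, and the sample values b = 1, c = 2, d1 = 3, d2 = 2 are assumptions made for illustration.

```python
import math

def particular_cos(b, c, d1, d2):
    """Particular integral of y'' + b y' + c y = d1 cos(d2 t), from case 3."""
    den = (b * d2) ** 2 + (c - d2 ** 2) ** 2
    A = d1 * (c - d2 ** 2) / den
    B = d1 * b * d2 / den
    return lambda t: A * math.cos(d2 * t) + B * math.sin(d2 * t)

def residual(y, b, c, d, t, h=1e-4):
    """Finite-difference estimate of y'' + b y' + c y - d(t) at time t."""
    dy = (y(t + h) - y(t - h)) / (2 * h)
    d2y = (y(t + h) - 2 * y(t) + y(t - h)) / h ** 2
    return d2y + b * dy + c * y(t) - d(t)

b, c, d1, d2 = 1.0, 2.0, 3.0, 2.0
yp = particular_cos(b, c, d1, d2)
forcing = lambda t: d1 * math.cos(d2 * t)
worst = max(abs(residual(yp, b, c, forcing, t)) for t in (0.0, 0.5, 1.3, 2.0))
print(worst)
```

With these sample values the formulas give A = −0.75 and B = 0.75, so $y_p = -0.75\cos 2t + 0.75\sin 2t$. The same `residual` helper can be reused to test any candidate particular integral.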

Example 0.12 Consider the ODE:

\[ y'' + 3y' - 10y = 3t^2 \]

This is a second-order, linear, nonhomogenous ODE, of the form $y'' + by' + cy = d(t)$ with b = 3, c = −10, and $d(t) = 3t^2$. We use Method IV to solve.

• Complementary function (yc):

Letting $y = e^{kt}$, so that $y' = ke^{kt}$ and $y'' = k^2e^{kt}$, the auxiliary equation is $k^2 + 3k - 10 = 0$, so (k − 2)(k + 5) = 0. Thus, the roots are k = 2 and k = −5, so the solutions are $y_1 = e^{2t}$ and $y_2 = e^{-5t}$. The general complementary solution is: $y_c = C_1e^{2t} + C_2e^{-5t}$, where C1, C2 are arbitrary constants.

• Particular integral (yp):

To find a particular integral, note d(t) is quadratic, so we try a quadratic candidate: $y_p = At^2 + Bt + C$. Thus, $y'_p = 2At + B$ and $y''_p = 2A$. Substituting into the ODE gives $2A + 3(2At + B) - 10(At^2 + Bt + C) = 3t^2$. Matching coefficients, we have: 2A + 3B − 10C = 0, 6A − 10B = 0, and −10A = 3, so $A = -\frac{3}{10}$, $B = -\frac{9}{50}$, and $C = -\frac{57}{500}$. Thus, $y_p = -\frac{3}{10}t^2 - \frac{9}{50}t - \frac{57}{500}$.

Thus, the general solution is

\[ y = y_c + y_p = C_1e^{2t} + C_2e^{-5t} - \frac{3}{10}t^2 - \frac{9}{50}t - \frac{57}{500} \]

for some arbitrary constants C1 and C2.
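The coefficient matching in this example is a small triangular linear system, which can be solved exactly with rational arithmetic. The sketch below reproduces A, B and C using `fractions.Fraction`; the variable names are illustrative.

```python
from fractions import Fraction

# ODE: y'' + b y' + c y = 3 t^2 with trial solution yp = A t^2 + B t + C.
b, c = Fraction(3), Fraction(-10)

A = Fraction(3) / c            # t^2 terms:      c*A = 3
B = -2 * b * A / c             # t terms:        2*b*A + c*B = 0
C = -(2 * A + b * B) / c       # constant terms: 2*A + b*B + c*C = 0

print(A, B, C)
```

Solving from the highest power downwards works because each equation introduces exactly one new unknown.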



In the following example, we will see that the choice of the trial particular

integral is not always so straightforward. In particular, our choice here will

not replicate exactly the functional form of the independent function d(t).

The example will demonstrate the sense in which solving these types of DEs is often very much case-specific.

Example 0.13 Consider the ODE:

\[ y'' - y' - 6y = e^{3t} \]

Now, b = −1, c = −6, and $d(t) = e^{3t}$. We use Method IV to solve.

• Complementary function (yc):

Letting $y = e^{kt}$, the auxiliary equation is $k^2 - k - 6 = 0$, so (k − 3)(k + 2) = 0. Thus, the roots are k = 3 and k = −2, so the solutions are $y_1 = e^{3t}$ and $y_2 = e^{-2t}$. The general complementary solution is: $y_c = C_1e^{3t} + C_2e^{-2t}$, where C1, C2 are arbitrary constants.

• Particular integral (yp):

To find a particular integral, note d(t) is exponential, so let's try an exponential candidate and see why it fails: $y_p = Ae^{3t}$. Thus, $y'_p = 3Ae^{3t}$ and $y''_p = 9Ae^{3t}$. Substituting into the ODE gives $Ae^{3t}(9 - 3 - 6) = e^{3t}$, that is, $0 = e^{3t}$, a contradiction. Thus, this candidate does not work in this case.

The reason why the candidate does not work is that the function $d(t) = e^{3t}$ already appears in the complementary solution. Whenever this is the case, it is useful to use the following alternative guess:

\[ y_p = Ate^{3t} \]

that is, the same as before, times t. Now, $y'_p = Ae^{3t}(3t + 1)$ and $y''_p = Ae^{3t}(9t + 6)$. Substituting into the ODE and solving for A, we find $A = \frac{1}{5}$.

Thus, the general solution is

\[ y = y_c + y_p = C_1e^{3t} + C_2e^{-2t} + \frac{1}{5}te^{3t} \]
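To see concretely why the plain exponential guess fails while the guess multiplied by t works, the sketch below evaluates both for $y'' - y' - 6y = e^{3t}$. It uses the derivatives stated in the example, $y'_p = Ae^{3t}(3t + 1)$ and $y''_p = Ae^{3t}(9t + 6)$ with A = 1/5; the function names and sample times are illustrative assumptions.

```python
import math

A = 1 / 5  # coefficient on t e^{3t} in the particular integral

def yp(t):
    return A * t * math.exp(3 * t)

def dyp(t):
    return A * math.exp(3 * t) * (3 * t + 1)

def d2yp(t):
    return A * math.exp(3 * t) * (9 * t + 6)

def residual(t):
    """Left-hand side minus right-hand side of y'' - y' - 6y = e^{3t}."""
    return d2yp(t) - dyp(t) - 6 * yp(t) - math.exp(3 * t)

# Naive guess A e^{3t}: the bracket multiplying A e^{3t} is 9 - 3 - 6 = 0,
# so no value of A can reproduce the forcing term.
naive_bracket = 9 - 3 - 6

worst = max(abs(residual(t)) for t in (0.0, 0.5, 1.0, 2.0))
print(naive_bracket, worst)
```

The zero bracket is exactly the auxiliary polynomial $k^2 - k - 6$ evaluated at the root k = 3, which is why the forcing term resonates with the complementary function.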



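Example 0.14 Consider the ODE:

\[ y'' - 3y' - 4y = -8e^{t}\cos 2t \]

• Complementary function (yc):

Letting $y = e^{kt}$, the auxiliary equation is $k^2 - 3k - 4 = 0$, so (k − 4)(k + 1) = 0. Thus, the roots are k = 4 and k = −1, so the solutions are $y_1 = e^{4t}$ and $y_2 = e^{-t}$. The general complementary solution is: $y_c = C_1e^{4t} + C_2e^{-t}$, where C1, C2 are arbitrary constants.

• Particular integral (yp):

To find a particular integral, note d(t) is the product of an exponential and a cosine function, so let's try the candidate: $y_p = Ae^{t}\cos 2t + Be^{t}\sin 2t$. Here:

\[ \begin{aligned} y'_p &= Ae^{t}\cos 2t - 2Ae^{t}\sin 2t + Be^{t}\sin 2t + 2Be^{t}\cos 2t \\ &= (A + 2B)e^{t}\cos 2t + (B - 2A)e^{t}\sin 2t \\ y''_p &= (A + 2B)e^{t}\cos 2t - 2(A + 2B)e^{t}\sin 2t + (B - 2A)e^{t}\sin 2t + 2(B - 2A)e^{t}\cos 2t \\ &= (4B - 3A)e^{t}\cos 2t - (4A + 3B)e^{t}\sin 2t \end{aligned} \]

Substituting into the ODE and matching coefficients (−10A − 2B = −8 on the cosine terms and 2A − 10B = 0 on the sine terms) will then yield A = 10/13, B = 2/13. Therefore, $y_p = \frac{10}{13}e^{t}\cos 2t + \frac{2}{13}e^{t}\sin 2t$.

Thus, the general solution is

\[ y = y_c + y_p = C_1e^{4t} + C_2e^{-t} + \frac{10}{13}e^{t}\cos 2t + \frac{2}{13}e^{t}\sin 2t \]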
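The "matching coefficients" step can be made explicit. Collecting the $e^{t}\cos 2t$ and $e^{t}\sin 2t$ terms of $y''_p - 3y'_p - 4y_p$ gives the two conditions −10A − 2B = −8 and 2A − 10B = 0. The sketch below solves this 2×2 system exactly by Cramer's rule; the variable names are illustrative.

```python
from fractions import Fraction

# Coefficient matching for yp = A e^t cos 2t + B e^t sin 2t:
#   cos terms: (4B - 3A) - 3(A + 2B) - 4A = -10A - 2B = -8
#   sin terms: -(4A + 3B) - 3(B - 2A) - 4B = 2A - 10B = 0
a11, a12, r1 = Fraction(-10), Fraction(-2), Fraction(-8)
a21, a22, r2 = Fraction(2), Fraction(-10), Fraction(0)

det = a11 * a22 - a12 * a21          # determinant of the 2x2 system
A = (r1 * a22 - a12 * r2) / det      # Cramer's rule for A
B = (a11 * r2 - r1 * a21) / det      # Cramer's rule for B

print(A, B)
```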


Next, we note that these methods extend naturally to linear, constant-

coefficient ODE of higher order.

Remark 0.6 (Higher-order ODEs) Consider an n-th order, linear, nonhomogenous, constant-coefficient ODE:

\[ \sum_{j=0}^{n} b_j\frac{d^jy}{dt^j} = d(t) \]

where $\{b_j\}_{j=0}^{n}$ are constants, and d is continuous in t.12 Again, we use that the general solution is:

\[ y = y_c + y_p \]

and we find the complementary solution yc and the particular integral yp.

• Particular integral (yp):

The particular integral will depend on the specific problem, and typically we will use a method of undetermined coefficients (see above).

• Complementary function (yc):

The auxiliary equation is $\sum_{j=0}^{n} b_j\alpha^j = 0$, and thus the roots $\{\alpha_j\}_{j=1}^{n}$ solve the polynomial $\prod_{j=1}^{n}(\alpha - \alpha_j) = 0$. Then, the complementary function is:

\[ y_c = \sum_{j=1}^{n} C_je^{\alpha_jt} \tag{0.17} \]

Again, we may have three cases (which we saw already for second-order ODEs, see Method III):

1. If all roots αj are real and distinct, then the complementary function reads exactly as equation (0.17).

2. If each root αj is repeated kj < n times, then equation (0.17) reads:13

\[ y_c = \sum_{j=1}^{n}\sum_{\ell=1}^{k_j} C_{j,\ell}\,t^{\ell-1}e^{\alpha_jt} \]

3. If there exist complex roots, then we set $\alpha_j = \beta_j + \gamma_ji$ for each of those roots. Using Euler's formula (Result 0.1), then:

\[ C_je^{\alpha_jt} = C_je^{\beta_jt}\cos(\gamma_jt + \varphi_j) \]

where $\varphi_j$ is called a phase shift. That is, in this case we must replace the terms $C_je^{\alpha_jt}$ in (0.17) on all complex roots by $C_je^{\beta_jt}\cos(\gamma_jt + \varphi_j)$.

12 Here, we use the convention $\frac{d^0y}{dt^0} = y$.

Methods III and IV showed us ways to solve homogenous and nonho-

mogenous linear, second-order ODEs with constant coefficients. The next

two methods will deal with ODEs whose coefficients are non-constant, such

that:

y′′ + b(t)y′ + c(t)y = d(t) (0.18)

When the ODE is homogenous, we can use the following solution method:

Remark 0.7 (Method V: Reduction of order) .

Used for: Linear, homogenous, second-order ODEs with non-constant

coefficients.

13 We derived this equation within Method III for the case of n = 2 (and k = 2), arguing that if the two roots are identical, the root-finding method gives us only one of the two fundamental solutions, and the other one we can find via a simple guess-and-verify approach. The derivation for higher order follows a similar logic.

The method: Consider a homogenous ODE with varying coefficients, i.e. impose d(t) ≡ 0 in (0.18). Suppose that y1 is a known solution to (0.18). We then look for a second solution of the form $y_2 = vy_1$, for some function v ≡ v(t) to be determined.

Note:

\[ y'_2 = v'y_1 + vy'_1 \quad\text{and}\quad y''_2 = v''y_1 + 2v'y'_1 + vy''_1 \]

Replacing these into the ODE and rearranging terms, we get:

\[ y_1v'' + \left(2y'_1 + b(t)y_1\right)v' + \underbrace{\left(y''_1 + b(t)y'_1 + c(t)y_1\right)}_{=0}\,v = 0 \]

where the last term is zero because y1 is a solution. Thus, we get:

\[ y_1u' + \left(2y'_1 + b(t)y_1\right)u = 0 \]

a first-order ODE in u ≡ v′. Hence, we have reduced the order of the problem, from a second-order ODE to a first-order ODE. Using a separation of variables method (Remark 0.1), we have:

\[ y_1\frac{du}{dt} = -\left(2y'_1 + b(t)y_1\right)u \iff \int\frac{du}{u} = -\int\frac{2y'_1 + b(t)y_1}{y_1}\,dt \]

On the LHS, $\int\frac{du}{u} = \int\frac{u'}{u}\,dt = \ln u$. On the RHS:

\[ -\int\frac{2y'_1 + b(t)y_1}{y_1}\,dt = -2\int\frac{y'_1}{y_1}\,dt - \int b(t)\,dt = -2\ln y_1 - \int b(t)\,dt \]

Thus, $\ln u = -2\ln y_1 - \int b(t)\,dt + c$, or $u = \frac{\tilde{c}}{y_1^2}e^{-\int b(t)dt}$, with $\tilde{c} = e^{c}$. Therefore:

\[ v = \int\frac{\tilde{c}}{y_1^2}e^{-\int b(t)dt}\,dt \]

Thus, the general solution to the ODE is $y_c = (C_1 + C_2v)y_1$ for some C1, C2.

Example 0.15 Consider:

\[ t^2y'' + 3ty' + y = 0 \]

The function $y_1 = t^{-1}$ is a solution. (To see this, note $y'_1 = -t^{-2}$ and $y''_1 = 2t^{-3}$, so $t^2y''_1 + 3ty'_1 + y_1 = 2t^{-1} - 3t^{-1} + t^{-1} = (2 - 3 + 1)t^{-1} = 0$.) Now, we use the method of reduction of order to find a second solution. For this, let $y_2 = vy_1 = vt^{-1}$ be a solution. Then:

\[ y'_2 = v't^{-1} - vt^{-2} \quad\text{and}\quad y''_2 = v''t^{-1} - 2v't^{-2} + 2vt^{-3} \]

Substituting these into the ODE and collecting terms:

\[ \begin{aligned} 0 &= t^2\left(v''t^{-1} - 2v't^{-2} + 2vt^{-3}\right) + 3t\left(v't^{-1} - vt^{-2}\right) + vt^{-1} \\ &= v''t - 2v' + 2vt^{-1} + 3v' - 3vt^{-1} + vt^{-1} \\ &= tv'' + v' \end{aligned} \]

Letting u ≡ v′, we now have a first-order ODE: $tu' = -u$. Solving via separation of variables:

\[ t\frac{du}{dt} = -u \iff \int\frac{du}{u} = -\int\frac{1}{t}\,dt \iff \ln u = -\ln t + C \iff u = ct^{-1} \]

and thus $v' = ct^{-1}$, or $v = c\ln t + k$, for some constants c, k. Thus, the second solution is $y_2 = vt^{-1} = (c\ln t + k)t^{-1} = ct^{-1}\ln t + kt^{-1}$. Since the second term is just a multiple of $y_1 = t^{-1}$, it can be absorbed into the coefficient on y1, so we can ignore it (and set c = 1). Therefore, $y_2 = t^{-1}\ln t$, and the general solution to the ODE is:

\[ y = C_1t^{-1} + C_2t^{-1}\ln t \]

for arbitrary constants C1, C2.
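The general solution found by reduction of order can be verified by direct substitution. The sketch below hard-codes the first and second derivatives of $y = C_1t^{-1} + C_2t^{-1}\ln t$ (computed by hand for this illustration) and checks that $t^2y'' + 3ty' + y$ vanishes for t > 0. The sample constants and times are arbitrary assumptions.

```python
import math

def y(t, C1, C2):
    return C1 / t + C2 * math.log(t) / t

def dy(t, C1, C2):
    # d/dt (ln t / t) = (1 - ln t) / t^2
    return -C1 / t**2 + C2 * (1 - math.log(t)) / t**2

def d2y(t, C1, C2):
    # d/dt ((1 - ln t) / t^2) = (2 ln t - 3) / t^3
    return 2 * C1 / t**3 + C2 * (2 * math.log(t) - 3) / t**3

def residual(t, C1, C2):
    """Left-hand side of t^2 y'' + 3 t y' + y = 0."""
    return t**2 * d2y(t, C1, C2) + 3 * t * dy(t, C1, C2) + y(t, C1, C2)

worst = max(abs(residual(t, 2.0, -1.5)) for t in (0.5, 1.0, 3.0))
print(worst)
```

The check is restricted to t > 0, where ln t is defined.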

If ODE (0.18) is nonhomogenous, we can again solve the homogenous

equation first, then find a particular integral, and add the two together. For

the particular integral, we can always try to use the method of undetermined coefficients: guess that the particular integral yp has a similar functional form to d(t), plug it into the ODE, and then match the coefficients on like terms.

Often, this method may be unfeasible. In those cases, the following

method may help:

Remark 0.8 (Method VI: Variation of parameters) .

Used for: Linear, nonhomogenous, second-order ODEs with non-constant

coefficients.

The method: Consider a nonhomogenous ODE with varying coefficients, equation (0.18), and let y1 and y2 be fundamental solutions of the homogenous counterpart of the ODE (found using the methods developed above). We look for a solution of the form:

\[ y = u_1y_1 + u_2y_2 \]

for some functions u1 ≡ u1(t) and u2 ≡ u2(t). Then, $y' = u'_1y_1 + u_1y'_1 + u'_2y_2 + u_2y'_2$.

The key in this method is that, whatever u1, u2 may be, we make the following assumption:

\[ u'_1y_1 + u'_2y_2 = 0 \tag{0.19} \]

Imposing this restriction, we get $y' = u_1y'_1 + u_2y'_2$, and $y'' = u'_1y'_1 + u_1y''_1 + u'_2y'_2 + u_2y''_2$. Plugging (y′′, y′, y) into the ODE, and rearranging terms:

\[ u'_1y'_1 + u'_2y'_2 + u_1\left(y''_1 + b(t)y'_1 + c(t)y_1\right) + u_2\left(y''_2 + b(t)y'_2 + c(t)y_2\right) = d(t) \]

Since both y1 and y2 solve the homogenous equation, both terms in parentheses equal zero. Imposing this yields:

\[ u'_1y'_1 + u'_2y'_2 = d(t) \tag{0.20} \]

Thus, to find (u1, u2), we need to solve the system of equations (0.19)-(0.20). The two unknowns are $(u'_1, u'_2)$. Solving the system, we find that we can write $(u'_1, u'_2)$ in terms of the Wronskian as:

\[ u'_1 = -\frac{y_2\,d(t)}{W_{y_1,y_2}} \quad\text{and}\quad u'_2 = \frac{y_1\,d(t)}{W_{y_1,y_2}} \]

where $W_{y_1,y_2} \equiv y_1y'_2 - y'_1y_2 \neq 0$ because (y1, y2) are fundamental solutions. Thus:

\[ u_1 = -\int\frac{y_2\,d(t)}{W_{y_1,y_2}}\,dt \quad\text{and}\quad u_2 = \int\frac{y_1\,d(t)}{W_{y_1,y_2}}\,dt \]

Provided that these integrals can actually be computed, we have found a particular solution to the differential equation:

\[ y_p = y_1u_1 + y_2u_2 = -y_1\int\frac{y_2\,d(t)}{W_{y_1,y_2}}\,dt + y_2\int\frac{y_1\,d(t)}{W_{y_1,y_2}}\,dt \tag{0.21} \]


Example 0.16 Consider the ODE:

\[ 2y'' + 18y = 6\tan 3t \]

where recall that $\tan x = \frac{\sin x}{\cos x}$. Dividing through by 2 to normalize the coefficient on y′′ to one, the ODE reads $y'' + 9y = 3\tan 3t$, so that $d(t) = 3\tan 3t$.

• Complementary function (yc):

It is readily checked using our methods above that the fundamental solutions (to $2y'' + 18y = 0$) are $y_1 = \cos 3t$ and $y_2 = \sin 3t$, so the complementary function is $y_c = C_1\cos 3t + C_2\sin 3t$.

• Particular integral (yp):

To find the particular integral, we could try the method of undetermined coefficients. But because d(t) here is not a polynomial, an exponential, a sine or cosine, or a sum or product of these, this method is likely to fail. We will use the method of variation of parameters instead.

We look for a solution $y = u_1y_1 + u_2y_2$, where $y_1 = \cos 3t$ and $y_2 = \sin 3t$. The Wronskian is:

\[ W_{y_1,y_2} = y_1y'_2 - y'_1y_2 = 3\cos^2(3t) + 3\sin^2(3t) = 3 \neq 0 \]

where we have used the Pythagorean theorem, equation (0.2). Then, using equation (0.21), a particular integral is:

\[ \begin{aligned} y_p &= -\cos(3t)\int\frac{3\sin(3t)\tan(3t)}{3}\,dt + \sin(3t)\int\frac{3\cos(3t)\tan(3t)}{3}\,dt \\ &= -\cos(3t)\int\frac{\sin^2(3t)}{\cos(3t)}\,dt + \sin(3t)\int\sin(3t)\,dt \\ &= -\cos(3t)\int\frac{1 - \cos^2(3t)}{\cos(3t)}\,dt + \sin(3t)\int\sin(3t)\,dt \\ &= -\frac{\cos(3t)}{3}\left[\ln\left(\sec(3t) + \tan(3t)\right) - \sin(3t)\right] + \frac{\sin(3t)}{3}\left(-\cos(3t)\right) \\ &= -\frac{\cos(3t)}{3}\ln\left(\sec(3t) + \tan(3t)\right) \end{aligned} \]

where $\sec x = \frac{1}{\cos x}$ is the reciprocal of the cosine, often called the secant function.

Thus, the general solution y = yc + yp is

\[ y = C_1\cos 3t + C_2\sin 3t - \frac{\cos(3t)}{3}\ln\left(\sec(3t) + \tan(3t)\right) \]
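As a check on the particular integral of this example, the sketch below evaluates $y''_p + 9y_p - 3\tan 3t$ (the ODE after dividing by 2) using a central finite difference for the second derivative. The sample times are chosen, as an assumption of this illustration, inside the interval where cos 3t > 0 so that the logarithm is well defined.

```python
import math

def yp(t):
    """Particular integral -(cos 3t / 3) * ln(sec 3t + tan 3t)."""
    return -math.cos(3 * t) / 3 * math.log(1 / math.cos(3 * t) + math.tan(3 * t))

def residual(t, h=1e-4):
    """Finite-difference estimate of yp'' + 9 yp - 3 tan(3t)."""
    d2 = (yp(t + h) - 2 * yp(t) + yp(t - h)) / h**2
    return d2 + 9 * yp(t) - 3 * math.tan(3 * t)

worst = max(abs(residual(t)) for t in (0.1, 0.2, 0.4))
print(worst)
```

The residual is not exactly zero because of the finite-difference approximation, but it should be small relative to the size of the terms involved.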


Example 0.17 Consider the ODE:

\[ y'' - 2y' + y = \frac{e^{t}}{t^2 + 1} \]

• Complementary function (yc):

Again, we will not go through the derivation of the complementary function, but using the methods above one should readily find that the fundamental solutions to $y'' - 2y' + y = 0$ are $y_1 = e^{t}$ and $y_2 = te^{t}$.14

14 Here, the roots are identical, so we should follow Case 2 in Method III (Remark 0.4).

• Particular integral (yp):

To find the particular integral using the variation of parameters method, we again look for a solution $y = u_1y_1 + u_2y_2$, where now $y_1 = e^{t}$ and $y_2 = te^{t}$. The Wronskian is:

\[ W_{y_1,y_2} = y_1y'_2 - y'_1y_2 = e^{t}(e^{t} + te^{t}) - te^{2t} = e^{2t} \neq 0 \]

Then, using equation (0.21), a particular integral is:

\[ \begin{aligned} y_p &= -e^{t}\int\frac{te^{t}e^{t}}{e^{2t}(t^2 + 1)}\,dt + te^{t}\int\frac{e^{t}e^{t}}{e^{2t}(t^2 + 1)}\,dt \\ &= -e^{t}\int\frac{t}{t^2 + 1}\,dt + te^{t}\int\frac{1}{t^2 + 1}\,dt \\ &= -\frac{1}{2}e^{t}\ln(1 + t^2) + te^{t}\tan^{-1}(t) \end{aligned} \]

Thus, the general solution y = yc + yp is

\[ y = C_1e^{t} + C_2te^{t} - \frac{1}{2}e^{t}\ln(1 + t^2) + te^{t}\tan^{-1}(t) \]
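When the integrals in equation (0.21) have no convenient closed form, they can be approximated numerically. The sketch below implements (0.21) with the trapezoid rule, taking both integrals from 0 to t (an assumption of this illustration, which fixes the constants of integration), and compares the result with the closed form found in this example. The helper names `trapezoid` and `particular_integral` are hypothetical.

```python
import math

def trapezoid(f, lo, hi, n=2000):
    """Composite trapezoid rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

def particular_integral(y1, dy1, y2, dy2, d, t):
    """Equation (0.21): yp = -y1 * int(y2 d / W) + y2 * int(y1 d / W)."""
    def w(s):
        return y1(s) * dy2(s) - dy1(s) * y2(s)   # Wronskian at s
    u1 = -trapezoid(lambda s: y2(s) * d(s) / w(s), 0.0, t)
    u2 = trapezoid(lambda s: y1(s) * d(s) / w(s), 0.0, t)
    return u1 * y1(t) + u2 * y2(t)

# Example 0.17: y1 = e^t, y2 = t e^t, d(t) = e^t / (t^2 + 1).
t = 1.5
numeric = particular_integral(
    math.exp, math.exp,
    lambda s: s * math.exp(s), lambda s: (1 + s) * math.exp(s),
    lambda s: math.exp(s) / (s * s + 1),
    t,
)
closed = -0.5 * math.exp(t) * math.log(1 + t * t) + t * math.exp(t) * math.atan(t)
print(numeric, closed)
```

The two values agree because the closed-form antiderivatives $-\frac{1}{2}\ln(1 + t^2)$ and $\tan^{-1}(t)$ both vanish at t = 0, matching the lower limit used in the quadrature.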


Part I

NEWTONIAN

MECHANICS


Chapter 1

Motion and Force

Classical mechanics focuses on describing the motion of particles. We

start with the description of systems with a single particle. Section 1.3 will

generalize the concepts to systems of two or more particles.

1.1 Newton’s Laws of Motion

To describe particle motion, we must first specify a position, i.e. the

value of each of the spatial coordinates at a point in time, and the motion,

or the change in the position when time advances.

1.1.1 Position, Velocity, and Acceleration

We start with some definitions:

Definition 1.1 (Position) The position of a particle is a vector $\vec{r}(t) = (x(t), y(t), z(t))$ specifying the location of a single particle on each coordinate x, y, z at time t.

Definition 1.2 (Velocity) The velocity of a particle is the rate of change of its position over infinitesimal time along each one of its coordinates:

\[ \vec{v}(t) \equiv \dot{\vec{r}}(t) = (\dot{x}(t), \dot{y}(t), \dot{z}(t)) \]

where $\dot{x}(t)$ denotes the time derivative of x (and similarly for y and z).

Definition 1.3 (Speed) The speed of a particle is the magnitude of its velocity, that is, the scalar $|\vec{v}(t)|$.

Definition 1.4 (Acceleration) The acceleration of a particle is the rate of change of its velocity over infinitesimal time along each one of its coordinates:

\[ \vec{a}(t) \equiv \dot{\vec{v}}(t) = (\ddot{x}(t), \ddot{y}(t), \ddot{z}(t)) \]

Example 1.1 (Simple Harmonic Oscillator) Consider a particle oscillating along a single dimension:

\[ x(t) = \sin(\omega t), \quad y(t) = z(t) = 0 \]

where ω is a constant. The particle's position is constant in the second and third dimensions. Along the first dimension, this is a simple harmonic motion,

with larger values of ω implying more rapid oscillations. Indeed, recall from

equation (0.1) and Figure 0.2 that sinωt is the position along one dimension

of a particle on a unit circle when the angle (in radians) is given by:

θ(t) = ωt

The assumption that the angle θ increases linearly with time implies that

motion is uniform.

• The velocity of the particle is:

\[ v = \frac{d}{dt}\sin(\omega t) = \cos(\omega t)\frac{d}{dt}(\omega t) = \omega\cos(\omega t) \]

using the chain rule, and equation (0.4).

Notice that when position x is at its maximum or minimum, the ve-

locity is zero. And vice versa, when the position is at x = 0, velocity

is either at its maximum or at its minimum. Technically, in this case

it is said that position and velocity are 90◦ out of phase.



• The acceleration is:

\[ a = -\omega^2\sin(\omega t) = -\omega^2x \]

Note that the acceleration has the opposite sign to the position: whenever x is positive (negative), the acceleration is negative (positive). That is, wherever the particle is, it is accelerated back toward the origin (indeed, it "oscillates").

Technically, in this case it is said that position and acceleration are

180◦ out of phase.
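The velocity and acceleration of the oscillator can be recovered numerically from the position alone. The sketch below differentiates $x(t) = \sin(\omega t)$ with central finite differences and compares the results with $\omega\cos(\omega t)$ and $-\omega^2x$. The value ω = 2, the step sizes, and the function names are illustrative assumptions.

```python
import math

omega = 2.0  # sample angular frequency (arbitrary choice)

def x(t):
    return math.sin(omega * t)

def velocity(t, h=1e-5):
    """Central-difference estimate of dx/dt."""
    return (x(t + h) - x(t - h)) / (2 * h)

def acceleration(t, h=1e-4):
    """Central-difference estimate of d^2x/dt^2."""
    return (x(t + h) - 2 * x(t) + x(t - h)) / h**2

t = 0.3
print(velocity(t), omega * math.cos(omega * t))
print(acceleration(t), -omega**2 * x(t))
```

At t = 0.3 the position is positive and the estimated acceleration is negative, in line with the acceleration pointing back toward the origin.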

For an example of motion in two dimensions, we need to introduce two

more concepts:

Definition 1.5 (Angular velocity) The number of radians that an angle

θ advances per unit of time. That is:

\[ \omega \equiv \frac{d\theta}{dt} \]

Definition 1.6 (Period of motion) The time it takes for a particle to

complete one full cycle of motion. Typically denoted by T , measured in

seconds.

Definition 1.7 (Frequency of motion) The number of cycles a particle completes per unit of time. Thus, f ≡ 1/T, measured in cycles per second, or hertz (Hz).

Example 1.2 (Harmonic Oscillator: Circular Motion) Consider a par-

ticle moving in a perfect circle on a plane, i.e. along the x (horizontal) and

y (vertical) dimensions, but not along the z dimension. An example is the

motion of a planet orbiting the Sun in a perfect circular orbit.

Formally, the most general (counterclockwise) uniform circular motion

around an orbit of radius R > 0 is:

\[ \vec{r}(t) \equiv \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = \begin{pmatrix} R\cos(\omega t) \\ R\sin(\omega t) \end{pmatrix} \]


Indeed, note from equation (0.1) and Figure 0.2 that R cos(ωt) and

R sinωt are the coordinates on a circle of radius R if we define:

θ(t) = ωt

as the angle (in radians) at time t. Similar to Example 1.1, the assump-

tion that the angle θ increases linearly with time implies that motion around

the orbit is uniform.1

Notice in such a motion the two coordinates are 90◦ out of phase: as

the particle revolves around the orbit, x oscillates between a maximum of R

and a minimum of −R, and at both of these points we have y = 0. And

vice versa: at its maximum and minimum points, y = R and y = −R,

respectively, but in both cases x = 0.

The notation ω for the coefficient is no coincidence, for it is exactly the

angular velocity of this system (indeed, since θ = ωt, then dθ = ωdt). For

orbital motions, the period of motion, or the time it takes to go one full

revolution (i.e. 360◦, or 2π radians), is:

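\[ T = \frac{2\pi}{\omega} \]

(To see this, just plug in θ = 2π into θ = ωt at t = T). The frequency of the oscillation is thus $f = \frac{\omega}{2\pi}$ Hz. Finally, using simple differentiation, we can find the velocity and acceleration vectors for the simple uniform circular motion:

\[ \vec{v}(t) = \begin{pmatrix} -R\omega\sin(\omega t) \\ R\omega\cos(\omega t) \end{pmatrix}; \quad \vec{a}(t) = \begin{pmatrix} -R\omega^2\cos(\omega t) \\ -R\omega^2\sin(\omega t) \end{pmatrix} \]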

Note two interesting observations, first noticed by Newton when studying

the motion of the Moon (see Figure 1.1):

• For uniform circular orbits, the position and velocity vectors are orthogonal, $\vec{r} \perp \vec{v}$ (i.e. $\vec{r}\cdot\vec{v} = 0$).

• The acceleration of uniform particle motion along a circular orbit is antiparallel to the position vector (that is, $\angle\vec{r}\vec{a} = \pi$, or 180°): it lies along the same line but points in the opposite direction (i.e. toward the origin), since $\vec{a} = -\omega^2\vec{r}$. This is called the centripetal acceleration.

1 In Section 3.1, we will generalize circular orbits to non-uniform motion.

Thus, we define:

Definition 1.8 (Centripetal acceleration) In circular motion, the ac-

celeration that is perpendicular to the velocity of the particle.

Finally, let us compute magnitudes.

• The magnitude of the position is, of course, R.2

• The magnitude of the velocity (i.e. the speed) of the particle is:

|~v| =√R2ω2 sin2 ωt+R2ω2 cos2 ωt = Rω

where we have used equation (0.2).3 Because speed is proportional to

the angular velocity ω, when motion is uniform and circular we may

also refer to ω as angular frequency.4

• The magnitude of the acceleration is:

$$|\vec{a}| = \sqrt{R^2\omega^4\cos^2\omega t + R^2\omega^4\sin^2\omega t} = R\omega^2$$

That is, $|\vec{a}| = \frac{|\vec{v}|^2}{R}$.
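The orthogonality and magnitude results above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the values of R and ω are illustrative assumptions.

```python
import math

def circular_state(R, omega, t):
    """Position, velocity and acceleration of uniform circular motion at time t."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    r = (R * c, R * s)
    v = (-R * omega * s, R * omega * c)
    a = (-R * omega**2 * c, -R * omega**2 * s)
    return r, v, a

def dot(u, w):
    """Dot product of two 2D vectors."""
    return u[0] * w[0] + u[1] * w[1]

def norm(u):
    """Euclidean length of a 2D vector."""
    return math.hypot(u[0], u[1])

R, omega = 2.0, 3.0  # illustrative values (meters, radians per second)
for t in (0.0, 0.4, 1.7):
    r, v, a = circular_state(R, omega, t)
    # r.v should vanish, |v| should equal R*omega, |a| should equal R*omega^2
    print(f"t={t}: r.v={dot(r, v):.2e}, |v|={norm(v):.4f}, |a|={norm(a):.4f}")
```

At every sampled time the dot product is zero up to rounding, while the speed and the acceleration magnitude stay at Rω and Rω², respectively.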

In sum, in uniform circular orbits, the particle’s speed is proportional to

the angular frequency, and the acceleration is proportional to the product of

the speed and the angular frequency. For given R, higher angular frequency

increases the particle’s speed, and even more so its acceleration.

2 Proof. Recall the notation sin² x ≡ sin(x) sin(x), and similarly for cos². Then, $|\vec{r}| = \sqrt{R^2\cos^2\omega t + R^2\sin^2\omega t} = R\sqrt{\cos^2\omega t + \sin^2\omega t} = R$, using equation (0.2).

3 This makes sense. We have obtained $|\vec{v}| = \frac{2\pi R}{T}$: the speed (in m/s) is the circumference (in m) divided by the time it takes for the particle to complete one full cycle (in s).

4 Of course, angular frequency and angular velocity are not the same if the latter is not constant. We will see one such case in Example 2.7.

Figure 1.1: In a uniform circular motion around an orbit, the velocity vector ~v is orthogonal to the position vector. The acceleration vector ~a is parallel to the position vector, but points toward the origin (so-called centripetal acceleration). The angle at time t is θ(t) = ωt.

1.1.2 First and Second Laws

We are now ready to state our first fundamental principles: Newton’s

laws of motion. These laws are based on the idea that force is needed to
change the velocity of a body, not to sustain it (in everyday experience, force
is needed to keep a body moving only because friction must be overcome). An isolated
object moving in free space, with no forces acting on it, does not need force
to keep it moving: its inertia suffices. But to change its trajectory and/or
velocity, one must apply force.

Principle 1.1 (Newton’s Second Law of Motion) Force equals the prod-

uct of mass and acceleration:5

$$\vec{F} = m\vec{a} \tag{1.1}$$

Remark 1.1 Some remarks:

• Here, the mass of the object is a scalar describing the resistance of the

body to being moved. Heavier bodies experience lower acceleration for

5 Original text: “Mutationem motus proportionalem esse vi motrici impressae, et fieri secundum lineam rectam qua vis illa imprimitur.” In English: “The alteration of motion is ever proportional to the motive force impressed; and is made in the direction of the right line in which that force is impressed.”


given applied force. When no force is applied to an object, its velocity

does not change, whatever its mass.

• Notice force and acceleration are vectors: they have not only magnitude
but also direction, so they have one component for each dimension of space.

• Equation (1.1) gives us the units of force. We use:

– Kilograms (kg) for mass.

– Meters per second (m/s) for velocity.

– Meters per second per second, or per second squared (m/s²), for
acceleration, because acceleration is the rate of change of velocity.

Thus, since F = ma, force is what it takes to accelerate one kilogram

by one meter per second per second, or “one kilogram meter per second

squared”.

Definition 1.9 (Newton units) A Newton (N) denotes one kilogram meter
per second squared (kg·m/s²), and is the unit of measure of force.

Example 1.3 (No Force) A particle with mass m, moving at velocity ~v,

and with no force acting on it, satisfies by Newton’s law:

$$m\dot{\vec{v}} = \vec{0}$$

Since m > 0, then $\dot{\vec{v}} = \vec{0}$. Since velocity is constant in all components,
then $\vec{v}(t) = \vec{v}_0 \equiv \vec{v}(0)$. Therefore, we have found:

$$\dot{\vec{r}} = \vec{v}_0$$

That is, the change in the position of the particle is given by the initial

value of its velocity. This is a simple differential equation, with solution:

~r(t) = ~r0 + ~v0t

where ~r0, ~v0 ∈ ℝ³. That is, the particle’s position at time t is the sum of

its initial position and the product of time and its initial velocity.



Incidentally, in this example we have derived Newton’s First Law of

Motion as a special case of his Second Law when ~F = ~0. This configuration

has a name: inertial reference frame. To give a proper definition, let us first

introduce some more concepts:

Definition 1.10 (Reference frame) A reference frame is a coordinate

system and a set of points that uniquely fix the coordinate system and stan-

dardize measures.

A coordinate system is, of course, the system that is used to uniquely

determine the position of points on a manifold.

In classical mechanics, the manifold is often the Euclidean space (i.e. a

space with no curvature). The reference frame is often the Cartesian set of

coordinates, where vectors represent directions in flat space. Alternatively,

we will sometimes consider the polar coordinate system. Above we have

seen examples of both systems of coordinates. The Cartesian system is our

usual (x, y, z) system. The polar system is used when dealing with circular

motion:

Definition 1.11 (Polar coordinate system) A two-dimensional coordi-

nate system in which each point on a plane is determined by: (i) a distance

(called radius) from a reference point (called pole, analogous to the origin

in Cartesian coordinates); (ii) an angle from the reference direction.

As we shall see, however, in Einstein’s special relativity theory the ref-

erence frame depends on the observer, so we must use non-Euclidean spaces

and non-Cartesian coordinate systems.

Definition 1.12 (Inertial reference frame) A reference frame in which
bodies subject to zero net force have no acceleration.

Newton’s First Law then states:

Principle 1.2 (Newton’s First Law of Motion) In an inertial reference



frame, every object in a state of uniform motion remains in that state of mo-

tion unless an external force is applied to it.6

Example 1.4 (Constant Acceleration) In the previous example, we saw

the case of a particle of constant velocity (i.e. no acceleration). A slightly

more general particle has constant (zero or otherwise) acceleration. The general

way to describe the position of such a particle is:

$$x(t) = c_1 + c_2 t + c_3 t^2 \tag{1.2}$$

for some constants c1, c2, c3 to be interpreted shortly. Indeed, note that

velocity and acceleration are:

$$v(t) \equiv \dot{x}(t) = c_2 + 2c_3 t, \qquad a(t) \equiv \ddot{x}(t) = 2c_3$$

respectively. Indeed, acceleration is constant. Moreover, note c1 = x(0),
c2 = v(0), and c3 = a(0)/2, so (c1, c2, c3) determine the initial conditions of

the system.

A specific example of constant acceleration is that of an object in free

fall.

Definition 1.13 (Free fall) The situation in which the only force exerted
on a body is that of gravity.

Example 1.5 (Free fall) Consider a particle of mass m, but now suppose

there is a constant force Fz > 0 being exerted along the z direction. Newton’s

law says that, in this case, acceleration along the z-axis is:

$$\dot{v}_z = \frac{F_z}{m}$$

Again, the velocities along x and y are constant, $v_x(t) = v_x(0)$ and $v_y(t) = v_y(0)$, but now:

6 Original text: “Corpus omne perseverare in statu suo quiescendi vel movendi uniformiter in directum, nisi quatenus a viribus impressis cogitur statum illum mutare.” In English: “Every body persists in its state of being at rest or of moving uniformly straight forward, except insofar as it is compelled to change its state by force impressed.”


$$\dot{z}(t) = v_z(t) = v_z(0) + \frac{F_z}{m}t$$

Solving this differential equation, we find the particle’s position along the
third dimension at time t:

$$z(t) = z(0) + v_z(0)t + \frac{F_z}{2m}t^2$$

In vector notation, and recalling the notation ~r(t) = (x(t), y(t), z(t)), we

have:

$$\vec{r}(t) = \vec{r}_0 + \vec{v}_0 t + \vec{\varepsilon}(t)$$

where $\vec{\varepsilon}(t) = \left(0, 0, \frac{F_z}{2m}t^2\right)$.

Incidentally, we have just derived the equation for the motion of a falling

object. If z(t) represents the object’s height above the surface of the earth
at time t, and we use the acceleration caused by gravity, $a_z = \dot{v}_z = -g$ (gravity points downward), then:

$$z(t) = z(0) + v_z(0)t - \frac{g t^2}{2}$$

describes the object’s position at time t after falling from height z(0)

with initial velocity vz(0).
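As a sanity check on the falling-object formula, one can step Newton’s law forward in small time increments and compare the result with the closed form z(t) = z(0) + vz(0)t − gt²/2. The sketch below is illustrative only: the function names, the step count and the initial conditions are assumptions, not part of the notes.

```python
G_EARTH = 9.80665  # m/s^2, the value of g on Earth quoted in these notes

def height(t, z0, vz0=0.0, g=G_EARTH):
    """Closed-form height z(t) = z0 + vz0*t - g*t^2/2."""
    return z0 + vz0 * t - 0.5 * g * t**2

def step_newton(t_end, z0, vz0=0.0, g=G_EARTH, steps=1000):
    """Advance position and velocity in small steps under constant acceleration -g."""
    dt = t_end / steps
    z, v = z0, vz0
    for _ in range(steps):
        z += v * dt - 0.5 * g * dt**2  # position update over one step
        v -= g * dt                    # velocity update over one step
    return z, v

z_closed = height(1.0, z0=10.0, vz0=2.0)
z_stepped, v_stepped = step_newton(1.0, z0=10.0, vz0=2.0)
print(z_closed, z_stepped, v_stepped)
```

Because the acceleration is constant, the stepped solution agrees with the closed form up to rounding error, whatever the number of steps.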

These examples also bring out the difference between mass and

weight. Though often used interchangeably, these are two very different

concepts:

Definition 1.14 (Mass) The mass of an object is the quantity of matter

in the body, regardless of its volume or of any forces acting on it. It is

usually expressed in kilograms (kg).

Definition 1.15 (Weight) The weight of a body is the force that is exerted

on the body, that is, the product of its mass and the acceleration experienced

by the body. Therefore, weight is expressed in Newtons (N).

In inertial reference frames, the weight is just the magnitude of the

gravitational force, so a body’s weight is simply (by Newton’s Second Law)



W = mg, where g denotes the gravitational acceleration. On Earth, g =
9.80665 m/s², so a body of mass 10 kg weighs approximately 98 Newtons. In
outer space, far from any massive body, g ≈ 0, so objects have a certain mass
but they are weightless.

More generally, a body’s weight depends not only on the gravitational

force, but all other forces presently acting upon it. Here are two classic

examples:

Example 1.6 (Weight on an Elevator) Suppose a man of mass m =

80kg enters an elevator. The elevator has a scale inside.

• If the man stands on the scale and the elevator is at rest, the man

exerts a force mg on the scale, and the scale exerts a force Fscale on

the man. By Newton’s laws, Fscale = mg gives the weight of this man.

In this case, the man weighs about 80 × 9.80 = 784 N.

• Now, imagine the elevator is accelerated upward, with acceleration a.

Again, let Fscale be the force that the scale exerts on the man. The

man still exerts a force mg on the scale. But now, additionally, there

is an upward acceleration, so clearly it must be that Fscale > mg,

or else the man would not accelerate upward. In particular, by Newton’s law,

Fscale −mg = ma, or:

Fscale = m(g + a)

For instance, if the acceleration is 5 m/s², the man weighs 80 × (9.8 +
5) = 1184 N. This is why we feel heavier when elevators accelerate

upward, even though our body mass has not changed.

• Finally, imagine the elevator is in free fall. Using the notation above,

the acceleration is now simply a = −g (recall Example 1.5).7 Thus,

Fscale = m(g + a) = 0. Therefore, objects in free fall are weightless,

whatever their mass.

7 Here, we use the convention that the “plus” direction is upward, and the “minus” direction is downward.
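The three elevator cases can be reproduced with a one-line function. This is only a sketch: the name scale_reading and the default g = 9.8 m/s² (the rounded value used in the example) are assumptions made for illustration.

```python
def scale_reading(mass, accel, g=9.8):
    """Force (in Newtons) that the scale exerts on a body of the given mass (kg)
    when the elevator has acceleration accel (m/s^2, upward is positive)."""
    return mass * (g + accel)

print(scale_reading(80, 0.0))   # elevator at rest
print(scale_reading(80, 5.0))   # elevator accelerating upward
print(scale_reading(80, -9.8))  # elevator in free fall (a = -g)
```

The sign convention follows the example: upward accelerations are positive, so free fall corresponds to accel = −g and a zero reading.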



Example 1.7 (Body hanging from a string) Consider a body of mass

m that is attached to a (massless) string. The string is in turn attached to

the ceiling, so that the body is hanging and at rest.8

As usual, there are two forces of opposite direction. On the one hand,

the string has a tension T which pulls upward on the body. On the other,

the body has a force mg, due to gravity, pushing downward on the body.

• If the system is at rest (no acceleration), then clearly T = mg.

• Now suppose the system is accelerated upward with acceleration a.

Then, T − mg = ma, so the tension in the string is T = m(g + a).

Just like in the elevator example.

• Conversely, if the system is accelerated downward, the body will weigh
less, whatever its mass. If the string is cut and the object goes into free

fall, then T = 0 (the body becomes weightless).

Later on, in Example 1.12, we will revisit the notion of weight for a more

sophisticated example of motion.

Example 1.8 (Projectile motion) When an object is shot at an angle,

the force of gravity will bring it back down, and the body will describe a

parabolic trajectory (Figure 1.2).

To describe this motion, consider two dimensions, horizontal (x) and

vertical (y). At time t = 0, the (initial) velocity is denoted ~v0, with speed
v0 = |~v0|, at an angle of α radians above the horizontal. As per Equation (0.1), the x and y
components of the initial velocity vector are v0 cos α and v0 sin α, respectively.

At a later moment in time, vector ~r(t) = (x(t), y(t)) describes the posi-

tion of the projectile. Since acceleration is arguably constant on both (x, y)

dimensions, we may use equation (1.2) to write:

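8 For the case of a moving body, see Example 1.10. For an example with two strings, see Example 1.11.

Figure 1.2: Projectile motion. (The figure shows the initial velocity ~v0 at angle α, its components v0 cos α and v0 sin α, and the position vector ~r(t) along the parabolic arc.)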

$$x(t) = x(0) + v_x(0)t + \frac{a_x(0)t^2}{2}$$

$$y(t) = y(0) + v_y(0)t + \frac{a_y(0)t^2}{2}$$

First, we know (vx(0), vy(0)) = (v0 cosα, v0 sinα). Supposing the projec-

tile starts at the origin, (x(0), y(0)) = (0, 0). In the x direction there is no

acceleration (provided there is no air friction), so velocity is constant.9 In

the y direction, however, gravity produces a downward acceleration of magnitude g. Therefore,
at the beginning of time, (ax(0), ay(0)) = (0, −g), so at time t:

$$x(t) = (v_0\cos\alpha)t, \qquad y(t) = (v_0\sin\alpha)t - \frac{g t^2}{2} \tag{1.3}$$

The velocities in each direction are:

$$v_x(t) = v_0\cos\alpha, \qquad v_y(t) = v_0\sin\alpha - gt$$

Note we have been able to decompose the complicated projectile motion

into two independent motions along each direction. In the case of a parabola,

9 This can be confirmed experimentally: if a ball is thrown vertically from a moving platform, it will land exactly on that same platform after completing the parabolic trajectory.


the x displacement happens to be completely independent of the y displace-

ment.

Let us derive more properties:

• First, the shape of the trajectory. For this, use equation (1.3) to solve
for t in x(t), and obtain $t = \frac{x}{v_0\cos\alpha}$. Plugging this into y(t) gives:10

$$y = (\tan\alpha)x - \frac{1}{2}\frac{g}{v_0^2\cos^2\alpha}x^2$$

This now gives us the location of the particle along the y direction, for

given location along the x direction. Thus, we have an equation of the

form y = ax + bx². This is indeed the formula for a parabola.

• Next, we want to know the highest point reached by the projectile. For

this, just impose vy(t) = 0. Indeed, at the highest point, the projectile’s

vertical speed is zero before it picks up again when the body starts to

move along the downward portion of the arc. Using our formula:

$$0 = v_y(t_p) = v_0\sin\alpha - g t_p$$

where tp denotes the time at which the maximum height is reached.
Thus, $t_p = \frac{v_0\sin\alpha}{g}$. Now we can just substitute time tp into the y law
of motion to obtain, after some algebra:

$$y(t_p) = \frac{v_0^2\sin^2\alpha}{2g}$$

This is intuitive: the highest point is increasing in the initial angle

(steeper trajectories reach higher maxima) and the initial speed, and

decreasing in the gravitational force. For instance, for the same initial

speed and angle, the projectile will reach a higher maximum point if

the experiment is done on the surface of the Moon.11

10 Recall that $\frac{\sin\alpha}{\cos\alpha} = \tan\alpha$.

11 The gravitational acceleration on Earth is g = 9.80 m/s². On the Moon, it is g = 1.625 m/s².


• When will the projectile hit the ground? For this, we simply look for
the tend > 0 such that y(tend) = 0. Alternatively, we know that tend = 2tp
if there is no air friction, since the parabola is symmetric about its peak. Thus,

$$t_{end} = \frac{2v_0\sin\alpha}{g}$$

• What will the x position of the body be when it hits the ground? Know-

ing tend, we have:12

$$x(t_{end}) = \frac{v_0^2\sin 2\alpha}{g}$$

and of course y(tend) = 0.

• From our last calculation we see that, for given initial velocity, the

farthest we can possibly throw the object is if we use an angle of exactly

45◦. Indeed, at this angle (α = π/4), sin 2α reaches a maximum (recall

Figure 0.1).
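The properties derived in this example are easy to verify numerically. The sketch below is an illustration under the same assumptions as the text (launch from the origin, no air friction); the function names and the sample launch speed are assumptions.

```python
import math

def projectile_summary(v0, alpha, g=9.80):
    """Peak time, peak height, flight time and range for launch speed v0 (m/s)
    at angle alpha (radians), starting from the origin with no air friction."""
    t_peak = v0 * math.sin(alpha) / g
    y_peak = (v0 * math.sin(alpha)) ** 2 / (2 * g)
    t_end = 2 * t_peak
    x_end = v0**2 * math.sin(2 * alpha) / g
    return t_peak, y_peak, t_end, x_end

def position(t, v0, alpha, g=9.80):
    """Position (x, y) at time t, as in equation (1.3)."""
    return v0 * math.cos(alpha) * t, v0 * math.sin(alpha) * t - 0.5 * g * t**2

def best_angle_deg(v0, g=9.80):
    """Whole-degree launch angle between 1 and 89 that maximizes the range."""
    return max(range(1, 90),
               key=lambda d: projectile_summary(v0, math.radians(d), g)[3])

t_peak, y_peak, t_end, x_end = projectile_summary(20.0, math.radians(30))
print(t_peak, y_peak, t_end, x_end)
print(position(t_end, 20.0, math.radians(30)))  # y should be zero at landing
print(best_angle_deg(20.0))
```

Evaluating position at t_end recovers y = 0 and x equal to the range, and scanning whole-degree angles picks out 45◦ as the range-maximizing angle.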

Example 1.9 (Harmonic Oscillator: Hooke’s Law) Consider a parti-

cle that moves (for simplicity) only along a single dimension (x), subject to

a force that always pulls it back to the origin. The force can be represented

as:

F = −kx

where k > 0. The force being negative means that it is always a restoring

force (positive when distance is negative, and vice versa). It being propor-

tional to x means that, for each unit distance away from the origin, the force

increases by a fixed amount k. Thus, we can think of this as the motion of

a particle at the end of a spring (Figure 1.3).

When a force is restoring and proportional to the displacement, we say

that Hooke’s Law holds:

12 Recall: 2 sin α cos α = sin 2α.

Figure 1.3: Spring motion as a harmonic oscillator. Above: spring in a relaxed position (x = 0). Below: stretched spring, in position x > 0, subject to the force F = −kx.

Principle 1.3 (Hooke’s Law) The force F needed to extend or compress

a spring by a distance x is proportional to the distance, or F = −kx, where

k > 0 is called the spring constant.

Newton’s Second Law says ma = −kx. Thus, acceleration always points
opposite to the displacement, as expected. We can write this as:

$$\ddot{x} + \frac{k}{m}x = 0 \tag{1.4}$$

This important formula describes the motion of
a simple harmonic oscillator. Such oscillators are very prevalent in nature, from the

motion of the pendulum (see Example 1.10) to the oscillations of the electric

and magnetic fields in a light wave. We will see many examples of simple

harmonic oscillators throughout these notes.

For the example of a spring, think of the spring being pulled and then

released. As the spring returns to its resting (or equilibrium) position, it
overshoots and oscillates about its equilibrium point: the displacement alternates
between positive and negative, while the force alternates between negative and positive.

As it oscillates, its displacement describes a wave over time.

To see formally that this is an oscillator, let us solve the ODE (1.4) and

demonstrate that it is a sinusoidal (i.e. a wave-like) function in time. Here,

equation (1.4) is a second-order, constant-coefficient homogeneous ODE, so



we can use Method III (Remark 0.4). In the notation of Method III, the

coefficients are b = 0 and c = k/m. Since clearly b² < 4c, then the solution

has the form:

x = A cos(ωt) +B sin(ωt)

for some numbers A, B, and ω. In physics, the following alternative

formulation is more commonly used:

$$x = x_{max}\cos(\omega t + \varphi) \tag{1.5}$$

To see that the two expressions are equivalent, note that cos(ωt + ϕ) =

cos(ωt) cosϕ − sin(ωt) sinϕ (as per equation (0.3)), so A ≡ xmax cosϕ and

B ≡ −xmax sinϕ.

In the formulation of equation (1.5), we have the following objects:

• xmax is called the amplitude, corresponding to the maximum x

distance that the body reaches before oscillating back to the origin.

• ω is, after Example 1.2, the angular frequency of oscillation.

• ϕ is called the phase angle (in radians).

Figures 1.4-1.5 provide a graphical interpretation for these objects. In

Figure 1.4 we see that ϕ controls the shift of the wave along the horizontal
axis. In the example, the thin line is out of phase by 180◦ (i.e. π radians)
relative to the thick line. In Figure 1.5, we see that |xmax| corresponds
to half the distance between the maximum and the minimum of the sinusoidal
wave in each oscillation. If the amplitude is negative (dashed line),

the wave is out of phase by 180◦ (i.e. π radians).13 The exact realization

of these variables depends on the initial conditions of the system (more on

this in Example 2.9).

13 That is, for given ω, setting (xmax, ϕ) = (1, π) produces the same wave as setting (xmax, ϕ) = (−1, 0).

Figure 1.4: Same amplitude, different phases. Graphs for x = xmax cos(ωt + ϕ), xmax = 1, with ϕ = 0 [thick line] and ϕ = π [thin line].

Figure 1.5: Same phase, different amplitudes. Graphs for x = xmax cos(ωt + ϕ), ϕ = 0, with xmax = 0.5 [thick line], xmax = 1 [thin line], and xmax = −1 [dashed line]. When the amplitude is negative [dashed line], the wave is out of phase by 180◦.



Recall that the time it takes for the oscillation to take one full cycle

(before it repeats itself) is called the period of motion, equal to T = 2π/ω.

The frequency of motion is f = 1/T hertz (Hz). For solution (1.5), we have:

$$\dot{x} = -x_{max}\omega\sin(\omega t + \varphi)$$

$$\ddot{x} = -x_{max}\omega^2\cos(\omega t + \varphi) = -\omega^2 x$$

Plugging into equation (1.4) gives $-\omega^2 x + \frac{k}{m}x = 0$, and therefore:

$$\omega = \sqrt{\frac{k}{m}}$$

is the angular frequency of the spring. The period of oscillation is thus
$T = 2\pi\sqrt{\frac{m}{k}}$ seconds. Remarkably, note the period is independent of the

amplitude xmax and the phase angle ϕ. This is characteristic of objects

whose motion obeys Hooke’s Law.
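One way to see that the period does not depend on the amplitude is to integrate equation (1.4) numerically over one predicted period and check that the spring is back where it started. The sketch below uses the velocity Verlet scheme, which is a choice made here for illustration and is not discussed in the notes; the function name, step count, and the values of k and m are likewise assumptions.

```python
import math

def simulate_spring(x_max, k, m, steps=10_000):
    """Integrate x'' = -(k/m) x over one predicted period with velocity Verlet.

    Starts at rest at x = x_max (phase angle 0) and returns the final (x, v).
    """
    period = 2 * math.pi * math.sqrt(m / k)  # T = 2*pi*sqrt(m/k) from the text
    dt = period / steps
    x, v = x_max, 0.0
    a = -(k / m) * x
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt * dt   # position update
        a_new = -(k / m) * x              # Hooke's Law at the new position
        v += 0.5 * (a + a_new) * dt       # velocity update with averaged acceleration
        a = a_new
    return x, v

for amplitude in (0.1, 1.0, 5.0):
    x_final, v_final = simulate_spring(amplitude, k=4.0, m=1.0)
    print(f"x_max={amplitude}: after one period x={x_final:.6f}, v={v_final:.2e}")
```

For every amplitude tried, the spring returns to x = xmax with (nearly) zero velocity after the same time T, up to the small discretization error of the integrator.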

Example 1.10 (Harmonic Oscillator: Pendulum) The motion of a pen-

dulum can also be described as a harmonic oscillator. Consider two dimen-

sions, horizontal and vertical, denoted (x, y). At rest, the pendulum is per-

pendicular to the ground (x = 0). In motion, the pendulum arcs back and

forth in oscillation.

There is a ball of mass m attached at the end of the pendulum. The

length of the pendulum’s string (which is massless) is ℓ. The angle of the

pendulum relative to its equilibrium position is θ. Two forces are acting on

this pendulum (in blue in Figure 1.6): first, the force of gravity, equal to

mg, pulling the pendulum downward; second, there is a tension Tθ, pulling

the ball upward along the string and which depends on the pendulum’s angle.

Using Figure 0.2 and equation (0.1), the tension vector can be decomposed

into a horizontal component of magnitude Tθ sin θ and a vertical component of magnitude Tθ cos θ.

Let’s start with the equations of motion. In the x (horizontal) direc-

tion, the force is restoring just like with the spring (Example 1.9), so Fx =

−Tθ sin θ = −Tθx/ℓ.14 In the y (vertical) direction, there is an upward force

14 In the second equality, we have Taylor-expanded sin θ about θ = 0 to obtain sin θ ≈

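Figure 1.6: The pendulum as a harmonic oscillator. (The figure shows the string of length ℓ at angle θ from the rest position x = 0, the tension Tθ with components Tθ sin θ and Tθ cos θ, and the weight mg.)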

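Tθ cos θ, and a downward force mg, so Fy = Tθ cos θ − mg. Thus, Newton’s
Second Law is:

$$\begin{pmatrix} m\ddot{x} \\ m\ddot{y} \end{pmatrix} = \begin{pmatrix} -T_\theta & 0 \\ 0 & T_\theta \end{pmatrix}\begin{pmatrix} \sin\theta \\ \cos\theta \end{pmatrix} - \begin{pmatrix} 0 \\ mg \end{pmatrix} \tag{1.6}$$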

This is now a coupled system of differential equations that can be solved

with a numerical solver (such as Mathematica), but is otherwise hard to

solve by hand. To make progress, one can use a so-called small-angle ap-

proximation.

Definition 1.16 (Small-Angle Approximation) A useful simplification

of trigonometric functions that considers that the angle θ is small or, specifically,
that θ ≪ 1 radian.

Remark 1.2 To test the validity of this approximation, we can use a second-

order Taylor expansion (using results from Example 0.7) about θ = 0 to see

θ. Since the total horizontal displacement is x = ℓ sin θ, then sin θ ≈ x/ℓ.


that sin θ ≈ θ ≈ 0, cos θ ≈ 1 − θ²/2 ≈ 1, and tan θ ≈ θ ≈ 0, respectively.
These approximations are quite good: when θ = 0.0873 (i.e. about 5◦), then

cos θ = 0.996 (only .4% away from unity), and when θ = 0.1745 (i.e. about

10◦), then cos θ = 0.985 (only 1.5% away from unity).
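The quality of the small-angle approximation can be tabulated directly. This is a minimal sketch; the function name and the set of angles are illustrative assumptions.

```python
import math

def small_angle_report(theta):
    """Errors of the small-angle approximations at angle theta (radians).

    'sin' and 'tan' are relative errors of replacing the function by theta;
    'cos' is the distance of cos(theta) from unity.
    """
    return {
        "sin": abs(theta - math.sin(theta)) / math.sin(theta),
        "cos": 1.0 - math.cos(theta),
        "tan": abs(theta - math.tan(theta)) / math.tan(theta),
    }

for degrees in (1, 5, 10, 20):
    report = small_angle_report(math.radians(degrees))
    print(degrees, {name: f"{100 * err:.2f}%" for name, err in report.items()})
```

At 5◦ and 10◦ the cosine differs from one by roughly 0.4% and 1.5%, as stated in the remark, and the errors grow quickly for larger angles.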

The small-angle approximation is very useful for computing the pendulum’s
motion. In particular, it allows us to make two simplifications:

• First, we approximate cos θ ≈ 1.

• Second, notice that for small angles θ, the x displacement is much
larger than the y displacement: when swinging the pendulum, the horizontal
motion of the pendulum is much larger than the
vertical motion.15 Hence, we say that $\ddot{y} \approx 0$.

Imposing this approximation into our second law of motion in (1.6) gives:

Tθ ≈ mg

That is, when the angle is small, the tension force is invariant to the

angle’s size. In fact, it is approximately equal to the gravitational force.

This is intuitive from Figure 1.6: when there is no y acceleration, upward

(tension) and downward (gravitation) forces should cancel out. Using this

back into the first equation, we get:

$$\ddot{x} + \frac{g}{\ell}x = 0$$

Compared to (1.4), we see that this is a simple harmonic oscillation.

Using the results from Example 1.9, we then immediately know that:

• The position of the pendulum at time t is x = xmax cos(ωt+ ϕ).

• The angular frequency of the oscillation is $\omega = \sqrt{g/\ell}$.

15 For example, at θ = 0.0873 (about 5◦), the y displacement is only 4% of the x displacement. At θ = 0.1745 (about 10◦), it is only 9%.


• The period of the pendulum is $2\pi\sqrt{\ell/g}$ seconds.16

Once again, this is an approximately correct description of the pendu-

lum’s motion insofar as it does not swing “too much” (i.e. θ small).
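The pendulum results can be evaluated for concrete lengths and angles. The sketch below is illustrative: the function names are assumptions, g = 9.80 m/s² is the rounded value used elsewhere in these notes, and the displacement ratio uses x = ℓ sin θ and y = ℓ(1 − cos θ) measured from the rest position.

```python
import math

def pendulum_period(length, g=9.80):
    """Small-angle period 2*pi*sqrt(l/g), in seconds, for a string of given length (m)."""
    return 2 * math.pi * math.sqrt(length / g)

def vertical_to_horizontal_ratio(theta):
    """Ratio of vertical to horizontal displacement at angle theta (radians),
    using x = l*sin(theta) and y = l*(1 - cos(theta)); the length l cancels."""
    return (1.0 - math.cos(theta)) / math.sin(theta)

print(pendulum_period(1.0))    # close to 2 seconds
print(pendulum_period(0.25))   # close to 1 second
print(vertical_to_horizontal_ratio(math.radians(5)))
print(vertical_to_horizontal_ratio(math.radians(10)))
```

Quartering the length halves the period, and the vertical displacement stays below a tenth of the horizontal one for swings of up to about 10◦.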

Example 1.11 (Body hanging from two strings) Consider a body of

mass m that is attached to two strings. Each string is, in turn, attached

to the ceiling. The two strings are separated by a certain distance, so they

form an angle with the ceiling. The body is at rest.

Let us agree to call x the horizontal direction, and y the vertical direction.

Each string is exerting a pull on the body. We thus let Ti denote the tension

in string i = 1, 2 (similar to the pendulum case, Example 1.10). The two

tensional forces disagree in direction: while the left-hand side string pulls in

the north-west direction, the right-hand side string pulls in the north-east

direction. On the other hand, the gravitational force on the body is mg, a

downward pull.

As usual, we can decompose the forces into their directional components.

• For each string i = 1, 2, the x component of Ti is Ti cos θi, and the y
component is Ti sin θi, where θi denotes the angle between string i and
the ceiling (i.e. the horizontal axis).

• Since the two strings pull in opposite directions along the x direction, then
the force along this direction is:17

Fx = T1 cos θ1 − T2 cos θ2

• In the y direction, both strings are pulling up, and so they agree in

direction. However, gravity is pulling down. Thus:

Fy = T1 sin θ1 + T2 sin θ2 − mg

16 For instance, when ℓ = 1 meter, a full cycle takes approximately 2 seconds. When ℓ = 0.25 meters (4 times shorter), the period is 1 second.

17 Notice here we agree that force increases to the right. This is just a convention.


Since the body is hanging down and is not being accelerated, then ~a = ~0,

and Newton’s Second Law says ~F = m~a = ~0. This means
that all forces must cancel out: the horizontal pulls of the two strings must offset each other,
and their combined vertical pull must equal the gravitational pull. Thus, Fx = Fy = 0, and we get a system of

two equations with unknowns (T1, T2):

T1 cos θ1 − T2 cos θ2 = 0

T1 sin θ1 + T2 sin θ2 = mg

Solving, we get:

$$(T_1, T_2) = \left(\frac{mg}{\sin\theta_1 + \tan\theta_2\cos\theta_1}, \; \frac{mg}{\sin\theta_2 + \tan\theta_1\cos\theta_2}\right)$$

For example, if m = 4 kg, θ1 = π/3 (i.e. 60◦), θ2 = π/4 (i.e. 45◦), and
using that g = 9.80 m/s² on Earth, we obtain (T1, T2) ≈ (28.696, 20.291)
Newtons.
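The closed-form tensions can be evaluated and checked against the two equilibrium conditions. This is a sketch under the conventions of the example (angles θ1 and θ2 as they enter the formulas above, g = 9.80 m/s²); the function name is an assumption.

```python
import math

def string_tensions(mass, theta1, theta2, g=9.80):
    """Tensions (T1, T2) in Newtons for a body of the given mass (kg) at rest,
    hanging from two strings at angles theta1 and theta2 (radians)."""
    weight = mass * g
    t1 = weight / (math.sin(theta1) + math.tan(theta2) * math.cos(theta1))
    t2 = weight / (math.sin(theta2) + math.tan(theta1) * math.cos(theta2))
    return t1, t2

theta1, theta2 = math.pi / 3, math.pi / 4
t1, t2 = string_tensions(4.0, theta1, theta2)
print(t1, t2)

# Both residuals should vanish: horizontal balance and vertical balance.
fx = t1 * math.cos(theta1) - t2 * math.cos(theta2)
fy = t1 * math.sin(theta1) + t2 * math.sin(theta2) - 4.0 * 9.80
print(fx, fy)
```

With the values of the example, the two tensions come out at about 28.7 N and 20.3 N, and both force residuals are zero up to rounding.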

Example 1.12 (Weight in circular motion) When a pendulum (Exam-

ple 1.10) is given a strong enough initial velocity, its trajectory will describe

a circular motion (Example 1.2).

Let R be the radius of the orbit, m be the mass of the object, and ω be

the angular velocity of motion, assumed to be constant (that is, the angle at

time t is just θ = ωt). From Example 1.2, we know that there must be a

centripetal acceleration, pulling the object toward the origin, equal to Rω².

Let T denote the tension of the string at a given position.

We will now calculate the weight of the object in two different positions

along its orbital trajectory: point S (the peak position) and point P (the

trough position). See Figure 1.7.

• At point P , there is a gravitational force mg pushing down on the

object, and a centripetal acceleration ap = Rω² creating tension on

the string and pulling in the object. Then, the tension of the string at

point P is Tp −mg = map by Newton’s Second Law, or:

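Figure 1.7: Weight for a body describing a circular motion. (The figure shows an orbit of radius R, with the peak S, the trough P, the centripetal accelerations as and ap, and the weight mg at both points.)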

Tp = m(ap + g)

• At point S, Newton’s law says that Ts +mg = mas, so:

Ts = m(as − g)

Notice these are very similar equations to the ones we found for the

weight in an elevator (Example 1.6). Thus, the object gains weight as it goes
down, and loses weight as it comes back up. For instance, if the centripetal
and the gravitational accelerations cancel one another at point S (namely, as = g),
then the object is effectively weightless and the string has no tension.18 For

example, if the object being swung was a bucket full of water, the water would

not fall over, for it is weightless at that point.
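The tensions at the trough and at the peak, and the angular velocity at which the object becomes weightless at the peak, can be computed as follows. This is a sketch under the assumptions of the example (constant ω, so that as = ap = Rω²); the function names and sample values are illustrative.

```python
import math

def tension_bottom(mass, radius, omega, g=9.80):
    """String tension at the trough P: Tp = m*(R*omega^2 + g)."""
    return mass * (radius * omega**2 + g)

def tension_top(mass, radius, omega, g=9.80):
    """String tension at the peak S: Ts = m*(R*omega^2 - g)."""
    return mass * (radius * omega**2 - g)

def weightless_angular_velocity(radius, g=9.80):
    """Angular velocity at which R*omega^2 = g, so the tension at S vanishes."""
    return math.sqrt(g / radius)

m, R = 2.0, 0.5  # illustrative mass (kg) and radius (m)
print(tension_bottom(m, R, 6.0), tension_top(m, R, 6.0))
omega_min = weightless_angular_velocity(R)
print(omega_min, tension_top(m, R, omega_min))
```

Below this angular velocity the formula for Ts turns negative, which a string cannot provide: the object does not complete the orbit.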

18 If as < g, then of course the object could not have made it to point S in the first place: it was not given enough initial velocity (namely, ω was too small for the pendulum to describe a full orbit).


1.2 Resistive Forces

So far, we have assumed that objects move in a vacuum, so that the

force causing them to accelerate is not counteracted by any resistance. In

this section, we explore such resistive forces. We may refer to the frictional

force, or the drag force. Let us study each in turn.

1.2.1 Friction

We call friction the force resisting the motion of objects relative to others
that are in contact with them. When surfaces in contact accelerate relative to

one another, friction exerts a force that counteracts such acceleration.

To begin, consider an object (e.g. a brick) of mass m at rest on a surface

with no inclination. There is a gravitational pull mg on the object. The floor

exerts an upward force FN of equal magnitude (for there is no acceleration

in the vertical direction), or FN = mg. Since this force is perpendicular to

the surface, it is often called the normal force (i.e. normal in the sense of

orthogonal).

Definition 1.17 (Normal Force) The force that is perpendicular to the

surface that an object is in contact with.

Now, suppose we exert a force F on the brick in the horizontal direction.

The brick will move, though not without resistance. Indeed, a frictional

force Ff pulls back on the brick in a horizontal but opposite direction of

motion to that of the original force applied.

It is a fact of nature that the frictional force Ff has a limit. This limit is
proportional to the normal force, with a proportionality constant µ > 0 whose exact
value depends on the characteristics of the materials that compose the objects considered.

Definition 1.18 (Coefficient of Friction (COF)) The ratio of the force

of friction between two objects and the force pressing them together. That

is, Ff = µFN .

There are two types of COFs:



• The coefficient of static friction (µs), is the COF that corresponds to

objects that are at rest relative to each other. That is, µs is the COF

associated to objects when a force is applied to break them loose.

• The coefficient of kinetic friction (µk) is the COF that corresponds
to objects that are in motion relative to each other. That is, µk is the
COF associated with objects that are already sliding when a force is
applied that keeps them moving.

Intuitively, we have:

µk < µs

Indeed, less force must be applied when the brick is already sliding along

the surface than that which is needed to get it started.

To calculate the static and kinetic COFs between two objects, the eas-

iest way is to place an object on an inclined surface, and then change the

inclination angle. The following shows how:

Example 1.13 (Friction I: Brick on an incline) Suppose that a brick

of mass m is placed on an incline at angle θ (Figure 1.8). There is a gravita-

tional force mg pushing the object perpendicularly to the floor, and a normal

force FN pulling it perpendicularly to the inclined surface. For convenience,

consider that our (x, y) coordinate system is also “inclined”, with y indicating
direction perpendicular to the surface of the incline (i.e. the hypotenuse

of the triangle, the same direction as FN ), and x indicating direction parallel

to the incline.

With this choice of coordinates, we can as usual decompose the forces

into their coordinate components. FN being a normal force, it does not need

to be decomposed (as it is zero in the x direction). The gravitational force

mg can be decomposed into mg cos θ in the y direction, and mg sin θ in the

x direction. Since there is no acceleration in the y direction, then obviously

FN = mg cos θ.

This object may or may not slide downhill, depending on the inclination

of the incline. For a certain inclination θ, a frictional force Ff pulls the

object back along the x direction. When the inclination is low, Ff is strong

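Figure 1.8: Friction in a sliding object. (The figure shows a brick on an incline at angle θ, with the normal force ~FN, the frictional force ~Ff, and the weight mg decomposed into mg cos θ and mg sin θ.)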

enough to prevent the object from moving at all. For a high enough θ,

however, Ff reaches its maximum value, friction gives in, and the object

starts to slide down the slope.

Let us calculate the critical value for θ, beyond which the object will slide

down. For this value, acceleration in the x direction is still zero (i.e. a = 0),

but Ff has reached its maximum. Thus, by Newton’s Second Law, at this

exact inclination we have:

mg sin θ − µsFN = 0

where we use µs for the COF because the object is currently at rest. Using

FN = mg cos θ, we then obtain:

µs = tan θ∗

where θ∗ denotes the critical angle. Thus, the angle at which an object

starts to slide down is related to the static COF (a value which depends on

the materials of the objects) by a very simple relationship. In particular,

this angle is independent of (i) the mass of the object, and (ii) the surface

area that the objects are in contact with.19 This is also a recipe to find the

19 For example, the static COF of rubber on concrete is about one, so for these objects the critical value of θ is about 45°. This means that we can never park a car on a slope of more than 45° and hope that it will not slide down, no matter how heavy


static COF of different objects. Experimentally, we can infer µs from µs =

tan θ∗ by trying different inclinations and recording the minimal inclination

at which sliding occurs.

We have found that, if θ < θ∗, then a = 0 (the frictional force will adjust so that the object is not accelerated). Now suppose that θ > θ∗. Then, the brick will slide downhill. The friction that the brick experiences is then:

Ff = µkmg cos θ

Notice we have now used the kinetic COF because the object is already

in motion. By Newton’s Second Law:

mg sin θ − µkmg cos θ = ma

where a is the acceleration downhill. Thus, the acceleration that the brick

will experience is

a = g(sin θ − µk cos θ)

Interestingly, this is once again independent of the body’s mass. How-

ever, gravity now plays a role, as does the kinetic COF of the materials

used.

How much time does it take for the object to reach the bottom of the incline once it starts accelerating, and at what speed will it arrive? Let $\ell$ be the distance from the object to the bottom of the incline. Since acceleration is constant and the initial speed is zero, $\ell = \frac{1}{2}at^2$, so the time it takes is $t = \sqrt{2\ell/a}$. The speed at that point is just $v = at$, so $v = \sqrt{2a\ell} = \sqrt{2g\ell(\sin\theta - \mu_k\cos\theta)}$.
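The formulas of this example can be put to work numerically. The following is a minimal Python sketch, not part of the original notes: the function names, the kinetic COF of 0.8, the 50° incline and the 2 m length are illustrative assumptions, while the static COF of about one is the rubber-on-concrete figure quoted in the footnote.

```python
import math

def critical_angle(mu_s):
    """Critical inclination (in radians), from mu_s = tan(theta*)."""
    return math.atan(mu_s)

def slide_down(theta, mu_k, length, g=9.81):
    """Acceleration, travel time and arrival speed of a brick released
    from rest at a distance `length` from the bottom (needs theta > theta*)."""
    a = g * (math.sin(theta) - mu_k * math.cos(theta))
    if a <= 0:
        raise ValueError("no downhill acceleration for these inputs")
    t = math.sqrt(2 * length / a)   # from length = a t^2 / 2
    return a, t, a * t              # arrival speed is v = a t

# Illustrative inputs (assumed values, except mu_s = 1).
theta_star = critical_angle(1.0)
a, t, v = slide_down(math.radians(50), mu_k=0.8, length=2.0)
print(f"theta* = {math.degrees(theta_star):.1f} degrees")
print(f"a = {a:.2f} m/s^2, t = {t:.2f} s, v = {v:.2f} m/s")
```

Note that the mass of the brick never enters either function, in line with the discussion above.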

Example 1.14 (Friction II: Pulleys) Consider a similar case to the pre-

vious example, but now suppose that the brick (of mass m1) is attached to

a hanging body (of mass m2) through a pulley (Figure 1.9). The pulley is

frictionless and massless, and so is the string connecting the two objects.

the car is or the width of its tires.

Figure 1.9: Friction in a sliding object attached to a pulley and a hanging weight.

Let’s think of the forces at work. The gravitational force acting on the brick is m1g. Its coordinate decomposition is m1g cos θ in the y direction, and m1g sin θ in the x direction, just as in the previous example. Because there is no acceleration in the y direction, the normal force cancels the y component of the gravitational force, so FN = m1g cos θ. Additionally, there is now a tension T1 pulling the brick along the string, and a tension T2 pulling the hanging body upward, with T1 = T2 = T .20 If the system is at rest, then the tension must balance the force of gravity on the hanging object, so:

T = m2g

Because the pulley is frictionless, the only frictional force is that between the incline and the brick. It points opposite to the direction in which the brick moves (or is about to move). Starting from a resting position, the maximum value of the frictional force is:

Ff = µsFN = µsm1g cos θ

If the hanging body did not exist (or m2 = 0), then we would have the

previous example, with the friction pointing uphill. However, the presence

of a massive object at the other end of the string may now pull the object

uphill, in which case the frictional force would be pointing downhill. We

must therefore split the analysis into three limiting cases: uphill acceleration; downhill acceleration; and no acceleration at all.

20 The tension must always be the same along the entire string. To see this, take an infinitesimal (and thus massless) segment of the string. This segment is pulled by two forces, one in each direction. If these forces differed in magnitude, the acceleration would be infinite (by Newton’s Second Law), a contradiction. For this, of course, it is necessary that (i) the string is massless; (ii) the pulley is frictionless.

Throughout, we take (θ,m1, g) as given, and treat m2 as the critical

attribute which determines which case occurs.21

• Upward acceleration: If the system is at the critical m2 where it is just about to accelerate upward (call it $\overline{m}_s$), then Ff = µsm1g cos θ, which is pointing downhill. Since tension T is pointing uphill, then by Newton’s Second Law:

$T - m_1 g \sin\theta - \mu_s m_1 g \cos\theta = 0$

as the object has not accelerated yet. Replacing T = m2g and solving for $m_2 = \overline{m}_s$, we get:

$\overline{m}_s = m_1(\sin\theta + \mu_s\cos\theta)$ (1.7)

Thus, if we use any mass $m_2 \geq \overline{m}_s$, the brick will slide up.

• Downhill acceleration: If the system is at the critical m2 where it is just about to accelerate downward (call it $\underline{m}_s$), then Ff = µsm1g cos θ, which is now pointing uphill, agreeing with the direction of tension $T = m_2 g = \underline{m}_s g$. Therefore:

$T - m_1 g \sin\theta + \mu_s m_1 g \cos\theta = 0$

Thus, now:

$\underline{m}_s = m_1(\sin\theta - \mu_s\cos\theta)$ (1.8)

Thus, if we use any mass $m_2 \leq \underline{m}_s$, the brick will slide down.

• No acceleration: The brick will not move at all if neither of the two conditions above is met. That is, if $\underline{m}_s < m_2 < \overline{m}_s$, with the critical masses given by (1.7) and (1.8).

21 We may also do this with θ (as before), fixing m2. The analysis is similar.



Let’s now look at a moving brick. Suppose $m_2 \geq \overline{m}_s$, where $\overline{m}_s$ is the critical mass in (1.7), so that the brick moves uphill (the analysis for the other two cases is similar). The frictional force then becomes kinetic: Ff = µkm1g cos θ. Thus, Newton’s Second Law in the x direction is now:

T −m1g sin θ − µkm1g cos θ = m1a (1.9)

where a is the acceleration uphill. Crucially, T is no longer m2g, for the hanging body is being accelerated downward as a result of the brick’s motion. In particular, the net downward force on the hanging object is m2g − T , so Newton’s Second Law says:

$m_2 g - T = m_2 a$ (1.10)

Solving the system of equations (1.9)-(1.10), we get:

$a = g\,\frac{m_2 - m_k}{m_1 + m_2}$

where $m_k \equiv m_1(\sin\theta + \mu_k\cos\theta)$. Note that since µs > µk (everything else equal, static friction is always higher than kinetic friction), the critical mass in (1.7) exceeds mk, guaranteeing that m2 > mk and thus a > 0 (i.e. the brick is indeed moving uphill). The associated tension on the string is then:

$T = m_2(g - a) = m_2 g\,\frac{m_1 + m_k}{m_1 + m_2}$

and thus T < m2g, as expected. Unlike in our previous example, where the brick’s mass played no role in its acceleration, now the acceleration does depend on the mass of the objects, though only on the ratio m1/m2 (i.e. how much more massive one object is relative to the other).
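As a check on this example, the critical masses (1.7)-(1.8) and the solution of the system (1.9)-(1.10) can be evaluated numerically. The Python sketch below is illustrative only: the function names and all numerical inputs (a 2 kg brick, a 30° incline, µs = 0.5, µk = 0.4, a 3 kg hanging body) are assumptions, not values from the notes.

```python
import math

def critical_masses(m1, theta, mu_s):
    """Lower and upper critical hanging masses, equations (1.8) and (1.7)."""
    upper = m1 * (math.sin(theta) + mu_s * math.cos(theta))  # brick about to slide up
    lower = m1 * (math.sin(theta) - mu_s * math.cos(theta))  # brick about to slide down
    return lower, upper

def uphill_motion(m1, m2, theta, mu_k, g=9.81):
    """Acceleration and tension when the brick moves uphill, from (1.9)-(1.10)."""
    m_k = m1 * (math.sin(theta) + mu_k * math.cos(theta))
    a = g * (m2 - m_k) / (m1 + m2)
    tension = m2 * (g - a)
    return a, tension

# Illustrative inputs (all assumed).
m1, m2, theta = 2.0, 3.0, math.radians(30)
lower, upper = critical_masses(m1, theta, mu_s=0.5)
a, tension = uphill_motion(m1, m2, theta, mu_k=0.4)
print(f"brick stays at rest for {lower:.3f} kg < m2 < {upper:.3f} kg")
print(f"a = {a:.3f} m/s^2, T = {tension:.3f} N (m2 g = {m2 * 9.81:.3f} N)")
```

Scaling both masses by the same factor leaves the acceleration unchanged, which is the statement that only the ratio m1/m2 matters.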

1.2.2 Drag

The drag force is the resistive force caused by the motion of an object

through a fluid, such as air or water. The drag force depends on the size

and shape of the body, as well as the medium through which the object is



moved and the speed at which it is moved. For example, the drag force is

larger in water than in air.22

The drag force is very different from the frictional force. In frictional

forces, the coefficient of kinetic friction (Definition 1.18) remains constant

independently of the speed (only a function of the material composition of

the objects in contact). However, for drag forces, the object’s velocity is

key. In particular, it has been observed experimentally that:

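Definition 1.19 (Drag force) The drag force on an object of velocity ~v is given by:

$\vec{F}_{drag} = -\left(k_1|\vec{v}| + k_2|\vec{v}|^2\right)\hat{v}$

where $\hat{v} \equiv \vec{v}/|\vec{v}|$ is the unit vector in the direction of motion, for some numbers k1 (in kg/sec) and k2 (in kg/m). Thus, the magnitude of the resistive force is $|\vec{F}_{drag}| = k_1|\vec{v}| + k_2|\vec{v}|^2$.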

Note the resistive force has opposite direction to the velocity, and that faster

moving objects feel a stronger drag. Moreover, the coefficients k1, k2 depend

on the shape and size of the object, as well as the kind of medium.

Henceforth in this section, we will assume that the object is a perfect

sphere of radius R > 0. For these objects, it has been observed that:

$k_1 = c_1 R, \qquad k_2 = c_2 R^2$

where c1, c2 are scalars. Each component of the drag force has a name:

• Viscous term: The term $c_1 R|\vec{v}|$, which relates to the stickiness of the medium (e.g. the “thickness” of the fluid, in the case of liquid media). For example, coefficient c1 is a function of temperature: when water is heated, it loses viscosity, and thus the water’s drag force due to viscosity diminishes.

• Pressure term: The term $c_2 R^2|\vec{v}|^2$, which relates to the stress that the body is exposed to.23 In this case, c2 is highly correlated to the density of the medium.

22 Section 5.2 will be devoted to fluid mechanics. For now, we will take some properties of fluids as given.

Thus, while the viscous term depends on the temperature, the pressure term depends on the density, where:

Definition 1.20 (Density) The density (ρ) of an object is the ratio of its

mass (m) to its volume (V ), or:

ρ ≡ m

V

It is measured in kg/m3.

Consider a sphere of mass m and radius R in free fall. The sphere feels

a gravitational force mg, and a drag force Fdrag in the direction opposite

to gravity, whose magnitude increases as the object picks up speed.

There comes a time when the drag force reaches its gravitational counter-

part, Fdrag = mg. At this point, the object experiences no further accel-

eration, and achieves a constant speed. This velocity is called the terminal

velocity of the object.

Definition 1.21 (Terminal velocity) The constant speed that is eventu-

ally achieved by an object in free fall through a medium that subjects it to

drag.

For spheres, the terminal velocity, denoted ~vterm, thus solves:

$mg = c_1 R|\vec{v}_{term}| + c_2 R^2|\vec{v}_{term}|^2$

Therefore, the terminal velocity is a function of the mass, the radius, and

the attributes (viscosity and pressure) of the system that determine c1, c2.

Another key velocity is the following:

23 Intuitively, this stress is proportional to $R^2$ because the force acts upon the area of the sphere, which itself is proportional to $R^2$.


Definition 1.22 (Critical velocity) The velocity at which the drag due

to viscosity equals that which is due to pressure.

Thus, the critical velocity of a sphere, denoted ~vcrit, solves $c_1 R|\vec{v}_{crit}| = c_2 R^2|\vec{v}_{crit}|^2$ or, simplifying:

$|\vec{v}_{crit}| = \frac{c_1}{c_2 R}$

Thus, $|\vec{v}_{crit}| \propto \frac{1}{R}$. We can now split the drag forces that are experienced by an object into two possible regimes:

• Regime I: Drag due to viscosity (c2 ≈ 0): When most drag is due to the medium’s viscosity, then it must be that $|\vec{v}| \ll |\vec{v}_{crit}|$ (reads “velocity is much smaller than the critical velocity”). In this case, the sphere’s terminal velocity solves $mg = c_1 R|\vec{v}_{term}|$, or:

$|\vec{v}_{term}| = \frac{mg}{c_1 R}$

If ρ is the density of the sphere, then $m = \frac{4}{3}\pi\rho R^3$, so24 $|\vec{v}_{term}| = \frac{4\pi\rho g}{3c_1}R^2$. Interestingly, $|\vec{v}_{term}| \propto R^2$. Thus, if two objects of the same density are dropped into water, their terminal velocity will be proportional to the square of their radius.

• Regime II: Drag due to pressure (c1 ≈ 0): When most drag is due to the medium’s pressure, then it must be that $|\vec{v}| \gg |\vec{v}_{crit}|$ (reads “velocity is much greater than the critical velocity”). In this case, the sphere’s terminal velocity solves $mg = c_2 R^2|\vec{v}_{term}|^2$, or:

$|\vec{v}_{term}| = \sqrt{\frac{mg}{c_2 R^2}}$

Again, for spheres, $m = \frac{4}{3}\pi\rho R^3$, so $|\vec{v}_{term}| = \sqrt{\frac{4\pi\rho g}{3c_2}R}$. Interestingly, $|\vec{v}_{term}| \propto \sqrt{R}$.
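Without assuming either regime, the terminal velocity is the positive root of the quadratic $c_2R^2v^2 + c_1Rv - mg = 0$, and it can be compared against the two regime approximations. The Python sketch below is a hedged illustration: the function names and the pebble density of 2500 kg/m³ are assumptions, while the values of c1 and c2 for air are the ones quoted later in this section.

```python
import math

def critical_speed(c1, c2, R):
    """Speed at which the viscous and pressure terms are equal."""
    return c1 / (c2 * R)

def terminal_speed(m, R, c1, c2, g=9.81):
    """Positive root of c2 R^2 v^2 + c1 R v - m g = 0."""
    k1, k2 = c1 * R, c2 * R**2
    if k2 == 0.0:                      # pure Regime I
        return m * g / k1
    return (-k1 + math.sqrt(k1**2 + 4.0 * k2 * m * g)) / (2.0 * k2)

# Coefficients for air as quoted in the notes; the density is an assumed value.
C1_AIR, C2_AIR = 3.1e-4, 0.85
R = 0.01                               # a pebble of radius 1 cm
rho = 2500.0                           # assumed density in kg/m^3
m = 4.0 / 3.0 * math.pi * rho * R**3

v_full = terminal_speed(m, R, C1_AIR, C2_AIR)
v_regime_two = math.sqrt(m * 9.81 / (C2_AIR * R**2))
v_crit = critical_speed(C1_AIR, C2_AIR, R)
print(f"terminal speed: {v_full:.2f} m/s (Regime II formula: {v_regime_two:.2f} m/s)")
print(f"critical speed: {v_crit:.4f} m/s")
```

For these inputs the terminal speed is far above the critical speed, so the Regime II formula is an excellent approximation to the full solution.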

24 Recall $\rho \equiv m/V$ (Definition 1.20). For spheres, volume is $V = \frac{4}{3}\pi R^3$.


Let’s look at one example for each regime:

Example 1.15 (Regime I: Liquids) If we drop a ball into a thick liquid (e.g. syrup), then essentially all drag will be due to the liquid’s viscosity: the initial velocity is zero, and the ball speeds up toward a terminal velocity that is proportional to the square of its radius. However, if we were to inject the ball into the syrup medium at a great initial speed (in particular, at one that is well above its critical velocity), it would for a short while experience drag mostly due to pressure, and it would slow down until the viscous term dominates again.

How long does it take for the terminal speed to be reached? Suppose we drop the sphere into the fluid. The gravitational force mg is counteracted by a drag force of magnitude $c_1 R|\vec{v}|$ (since we are in Regime I). By Newton’s Second Law:

$m\frac{d}{dt}|\vec{v}| = mg - c_1 R|\vec{v}|$

a simple differential equation. To solve it, write the equation as $\frac{d|\vec{v}|}{dt} + \frac{c_1 R}{m}|\vec{v}| = g$, and use the Method of the Integrating Factor (Remark 0.2). Here, the integrating factor satisfies $\frac{d}{dt}\mu = \mu\frac{c_1 R}{m}$, and thus $\mu = ke^{\frac{c_1 R}{m}t}$ for some k > 0. Solving:

$|\vec{v}| = \frac{c + gk\int e^{\frac{c_1 R}{m}t}dt}{ke^{\frac{c_1 R}{m}t}} = \frac{c + gk\frac{m}{c_1 R}e^{\frac{c_1 R}{m}t}}{ke^{\frac{c_1 R}{m}t}} = \frac{mg}{c_1 R} + \tilde{c}\,e^{-\frac{c_1 R}{m}t}$

where $\tilde{c} \equiv c/k$. To find the constant $\tilde{c}$, set t = 0 and suppose that the initial velocity is zero (because the object is dropped from a state of rest). Then, $0 = \frac{mg}{c_1 R} + \tilde{c}$. Finally, recall that $\frac{mg}{c_1 R} = |\vec{v}_{term}|$ in Regime I, so $\tilde{c} = -|\vec{v}_{term}|$. Plugging back, we have found:

$|\vec{v}| = \left(1 - e^{-\frac{c_1 R}{m}t}\right)|\vec{v}_{term}|$

Therefore, the speed of the sphere approaches the terminal velocity exponentially from below. The rate of convergence, $c_1 R/m$, is higher when the sphere is larger (high R) and lighter (low m), and when the fluid is more viscous (higher c1): in all three cases, drag is stronger relative to the sphere’s inertia. Had there been no drag, there would have been no terminal speed at all, and the velocity would have grown linearly without bound ($|\vec{v}| = gt$).
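The closed-form solution for Regime I can be checked against a direct numerical integration of Newton’s Second Law. The Python sketch below uses a plain Euler scheme; the function names and the parameter values (including the syrup-like value of c1) are hypothetical and chosen only for illustration.

```python
import math

def speed_closed_form(t, m, R, c1, g=9.81):
    """Regime I solution |v| = (1 - exp(-c1 R t / m)) |v_term|."""
    v_term = m * g / (c1 * R)
    return (1.0 - math.exp(-c1 * R * t / m)) * v_term

def speed_euler(t_end, m, R, c1, g=9.81, steps=100_000):
    """Integrate dv/dt = g - (c1 R / m) v from rest with Euler steps."""
    dt = t_end / steps
    v = 0.0
    for _ in range(steps):
        v += dt * (g - c1 * R * v / m)
    return v

# Hypothetical inputs: a 10 g ball of radius 1 cm in a very viscous liquid.
m, R, c1 = 0.01, 0.01, 50.0
tau = m / (c1 * R)                     # time constant of the exponential
v_term = m * 9.81 / (c1 * R)
print(f"time constant: {tau:.3f} s, terminal speed: {v_term:.4f} m/s")
print(f"closed form at t = 0.05 s: {speed_closed_form(0.05, m, R, c1):.5f} m/s")
print(f"Euler       at t = 0.05 s: {speed_euler(0.05, m, R, c1):.5f} m/s")
```

After one time constant m/(c1R) the sphere has reached a fraction 1 − 1/e (about 63%) of its terminal speed.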

Example 1.16 (Regime II: Air) If we drop the ball in air, then virtually all drag is due to air pressure. Indeed, for air, c1 ≈ 0. In particular, it has been measured that, at one atmosphere and room temperature, $c_1 = 3.1 \times 10^{-4}$ kg/(m sec) ≈ 0, while $c_2 = 0.85$ kg/m³. The critical speed is $|\vec{v}_{crit}| = \frac{c_1}{c_2 R} \approx \frac{3.7 \times 10^{-4}}{R}$ (in m/sec, with R in meters), about 400 times lower than the critical speed in syrup. Thus, if we drop a ball in air, its velocity is almost immediately way above the critical speed, and so we are in Regime II: the pressure term dominates, and the terminal speed is proportional to the square root of the radius of the sphere.25

How long does it take now for the terminal speed to be reached? As in Regime I, we have gravity mg pulling down, but now a drag force of magnitude $c_2 R^2|\vec{v}|^2$ (since c1 ≈ 0). By Newton’s Second Law:

$m\frac{d}{dt}|\vec{v}| = mg - c_2 R^2|\vec{v}|^2$

This is now a non-linear ODE. It is still separable, and for a sphere dropped from rest one can check by substitution that it is solved by $|\vec{v}| = |\vec{v}_{term}|\tanh\left(gt/|\vec{v}_{term}|\right)$, with $|\vec{v}_{term}| = \sqrt{mg/(c_2 R^2)}$ as in Regime II. Thus, velocity again builds up and converges to the terminal velocity from below, at a pace that depends on the sphere’s mass and radius (alternatively, its density).
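The build-up of speed in Regime II can also be traced numerically. The Python sketch below integrates the equation of motion with a classical fourth-order Runge-Kutta scheme (a choice made here for illustration, not a method from the notes); the mass and radius are those of the 70 kg sphere mentioned in the footnote, and the function name is an assumption.

```python
import math

def fall_speed(t_end, m, R, c2, g=9.81, steps=10_000):
    """Speed after t_end seconds of fall from rest with drag c2 R^2 v^2 (RK4)."""
    def accel(v):
        return g - c2 * R**2 * v**2 / m

    dt = t_end / steps
    v = 0.0
    for _ in range(steps):
        k1 = accel(v)
        k2 = accel(v + 0.5 * dt * k1)
        k3 = accel(v + 0.5 * dt * k2)
        k4 = accel(v + dt * k3)
        v += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return v

m, R, c2 = 70.0, 0.4, 0.85
v_term = math.sqrt(m * 9.81 / (c2 * R**2))
for t in (1.0, 5.0, 20.0, 60.0):
    v = fall_speed(t, m, R, c2)
    print(f"t = {t:5.1f} s: v = {v:6.2f} m/s ({100 * v / v_term:5.1f}% of terminal)")
```

Early in the fall the speed grows almost as gt, since drag is still negligible; later it flattens out at the terminal speed.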

Example 1.17 (Projectiles revisited) Consider the experiment in Example 1.8, where an object (e.g. a sphere) was shot at an angle. In the absence of air drag, the parabola is perfectly symmetric (black dashed line in Figure 1.10). This is because gravity is then the only force: the horizontal velocity never changes, and the vertical motion on the way down mirrors that on the way up. However, with air drag (red dashed line), there is a resistive force that opposes the velocity in both the y direction (downward while the object rises, upward while it falls) and the x direction (leftward, as the object advances in the x direction).

The action of these forces will make the trajectory asymmetric, with (i) a highest point that is strictly lower, and which is reached at an earlier time, than in the case with no drag; and (ii) a downward trajectory that is steeper than the upward one.26

Figure 1.10: Projectile motion without drag (black dashed) and with drag (red dashed). (The initial velocity ~v0 has components v0 cos α and v0 sin α along the x and y axes.)

25 For example, if we drop a pebble of R = 1 cm from a high building, its speed will never exceed 75 mph. For a sphere of mass m = 70 kg and radius R = 40 cm, the terminal velocity is 150 mph.

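Using Newton’s Second Law, we have (assuming c1 ≈ 0 in air drag, and treating each direction separately):

$m\ddot{x} = -c_2 R^2\dot{x}^2$ and $m\ddot{y} = -mg - c_2 R^2\,\dot{y}\,|\dot{y}|$

where $(\dot{x}, \dot{y}) = (v_x, v_y)$ are the x and y components of velocity ~v, with initial values $(\dot{x}_0, \dot{y}_0) = (v_0\cos\alpha, v_0\sin\alpha)$. The drag term in the y equation always opposes the vertical velocity: it points downward while the object rises, and becomes $+c_2 R^2\dot{y}^2$ (upward) while it falls. If there was no drag force, then $m\ddot{x} = 0$ (no acceleration in the x direction) and $m\ddot{y} = -mg$ (the only acceleration in the y direction is due to gravity), just what we got in Example 1.8. With air drag, motion is characterized by non-linear, second-order differential equations, which no longer have the simple parabolic solution of Example 1.8.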

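At the highest point, the vertical velocity vanishes, $\dot{y}(t_p) = 0$, where $t_p$ is the time at which the highest point is reached. On the way down, the vertical speed grows until air drag and gravity cancel each other out, $mg = c_2 R^2\dot{y}^2$, so it can never exceed:

$|\dot{y}| = \frac{1}{R}\sqrt{\frac{mg}{c_2}}$

Everything else equal, this bound is larger for heavier and smaller spheres: they are less affected by drag, and will thus reach a higher point.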
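Since the equations of motion with drag are awkward to handle by hand, a numerical experiment is a natural way to see the asymmetry. The Python sketch below is illustrative: it uses the vector form of pure pressure drag from Definition 1.19 (a force of magnitude $c_2R^2|\vec{v}|^2$ opposing the velocity), which couples the x and y directions and therefore differs slightly from the per-component equations written above. The function name, the time step and the ball parameters are assumptions.

```python
import math

def simulate(v0, alpha, m, R, c2, g=9.81, dt=1e-4):
    """Shoot a sphere from the origin and integrate until it lands.
    Returns (apex height, time of apex, horizontal range, flight time)."""
    x, y, t = 0.0, 0.0, 0.0
    vx, vy = v0 * math.cos(alpha), v0 * math.sin(alpha)
    apex_y, apex_t = 0.0, 0.0
    k2 = c2 * R**2
    while True:
        speed = math.hypot(vx, vy)
        ax = -k2 * speed * vx / m          # drag opposes the velocity
        ay = -g - k2 * speed * vy / m
        vx += ax * dt
        vy += ay * dt
        x += vx * dt
        y += vy * dt
        t += dt
        if y > apex_y:
            apex_y, apex_t = y, t
        if y <= 0.0:
            return apex_y, apex_t, x, t

# Assumed inputs: a ball of 145 g and radius 3.7 cm shot at 30 m/s and 45 degrees.
v0, alpha, m, R = 30.0, math.radians(45), 0.145, 0.037
no_drag = simulate(v0, alpha, m, R, c2=0.0)
with_drag = simulate(v0, alpha, m, R, c2=0.85)
for label, (h, tp, rng, tf) in (("no drag", no_drag), ("with drag", with_drag)):
    print(f"{label:9s}: apex {h:5.2f} m at t = {tp:4.2f} s, range {rng:5.2f} m, flight {tf:4.2f} s")
```

With drag, the apex is lower and is reached earlier, the range is shorter, and the descent takes longer than the ascent, in line with properties (i) and (ii).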

26 Note (ii) is not reflected in Figure 1.10.



1.3 Multi-Particle Systems

So far, we have focused on the laws of force and motion for isolated

systems with single objects. Ultimately, we wish to have a description of the

dynamical laws of motion of all particles of nature. The fundamental forces

of nature are those that act between particles. To provide a description of

these forces, we need to determine two things. First, the intrinsic properties

of the particle, such as its electric charge and mass. Second, the location

and (by Newton’s law) velocity of the particle, as well as how the particle is

influenced by the location of all other particles.

1.3.1 Newton’s Laws in Multi-Particle Systems

Throughout, we consider N particles indexed i = 1, 2, . . . , N . Consider

a space with three dimensions labeled (x, y, z), and let:

~ri = (xi, yi, zi)

denote the coordinates of particle i. (The extension to D > 3 dimensions is

straightforward). We denote:

~r ≡ {~ri : i = 1, . . . , N}

as the collection of locations across all particles. Specifically, x ≡ {xi :

i = 1, . . . , N} denotes the collection of locations on the x dimension across

all particles, and y and z are defined similarly. Succinctly, ~r = (x,y, z).

Henceforth, we may refer to ~r as the configuration space.

Definition 1.23 (Configuration space) A (Cartesian) configuration space

is the 3N -dimensional space of particle locations, ~r ≡ {(xi, yi, zi) : i =

1, . . . , N}.

The force exerted along each dimension on a particular particle i is as-

sumed to be a function of the location of said particle as well as those of

all other particles, and is thus denoted ~Fi(~r). We consider that the force exerted on a particle i can be thought of as the sum of forces exerted by all particles j ≠ i on i. Because these forces occur between the particles within

the isolated system, they are sometimes called internal forces. A force that

acts upon the system and which does not involve the direct interaction of

particles is an external force.

Formally, let ~fij be the internal force on i due to j (with ~fii = ~0 without

loss of generality). Then:

$\vec{F}_i(\vec{r}) \equiv \sum_{j \neq i} \vec{f}_{ij}(\vec{r})$ (1.11)

With this notation at hand, we can first see that Newton’s law of motion

then extends naturally:

Remark 1.3 (Newton’s Second Law in Multivariate Systems) Force

on a particle equals its mass times its acceleration. In vector notation:

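$\vec{F}_i(\vec{r}) = m_i\frac{d^2\vec{r}_i}{dt^2}$

for each i = 1, . . . , N . Or, written in component form:

$(F_x)_i(\mathbf{x}) = m_i\frac{d^2 x_i}{dt^2}, \qquad (F_y)_i(\mathbf{y}) = m_i\frac{d^2 y_i}{dt^2}, \qquad (F_z)_i(\mathbf{z}) = m_i\frac{d^2 z_i}{dt^2}$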

where (Fx)i(x), (Fy)i(y), and (Fz)i(z), denote the x, y, and z compo-

nents of the force on the i-th particle, as functions of the location of all

particles along the corresponding dimension.

Note there is one Newton equation for each coordinate of every particle.

With 3 dimensions and N particles, this means 3N equations.

Remark 1.4 (Is Newton’s dynamical law suitable?) Newton’s law of

motion is suitable because it conserves information. That is, the dynamical

law is both deterministic and reversible.

• By knowing the initial position of a particle and its initial velocity, we

can determine where it will be in the future (determinism).



• Moreover, by knowing the current position and velocity of the particle,

we can know where it was an instant ago (reversibility).

Later on, we will see that quantum mechanics denies this very principle:

location and velocity cannot be both known with certainty. For now, since

position and velocity are all that is required for a full description of the

dynamical system, we may call this our state space:

Definition 1.24 (State space) The state space is the 6-dimensional space

composed of three spatial coordinates, ~r = (x, y, z), and velocities along each

coordinate, ~v = (vx, vy, vz).

Thus, a point in the state space tells us the position and velocity of a

particle along each spatial dimension.

1.3.2 Momentum and Newton’s Third Law

We will now introduce a convenient reformulation of this space.

Definition 1.25 (Momentum) The quantity of motion of a moving body,

measured as the product of the mass and the velocity:

~pi ≡ mi~vi

for each particle i = 1, . . . , N . Momentum is measured in kg m / sec.

Roughly speaking, the momentum measures how hard it is to stop a

moving object. For given velocity, heavier objects need more force to be

stopped. For instance, a moving ping-pong ball takes very little force to be

stopped compared to a locomotive that is moving with the same velocity.

Definition 1.26 (Phase space) The phase space is the 6-dimensional space

composed of three spatial coordinates, ~r = (x, y, z), and momentum along

each coordinate, ~p = (px, py, pz).

Thus, a point in the phase space tells us the position and momentum of

a particle along each spatial dimension. Changing the space from position-

velocity pairs to position-momentum pairs will become convenient later on.



Remark 1.5 (Newton’s Second Law using momentum) Given that mass mi is constant, we have that:

$\dot{\vec{p}}_i = m_i\dot{\vec{v}}_i$

for each i = 1, . . . , N , by definition of momentum, where a dot denotes a time derivative. Thus, we can write the full description of the dynamical system as follows:

$\dot{\vec{p}}_i = \vec{F}_i(\vec{r}) = \sum_{j \neq i}\vec{f}_{ij}(\vec{r})$ (1.12a)

$\dot{\vec{r}}_i = \frac{\vec{p}_i}{m_i}$ (1.12b)

for each particle i = 1, . . . , N . The first equation is Newton’s law of motion stated in terms of momentum, and the second follows by definition of momentum. Written in component form, using ~p = (px, py, pz):

$\frac{d(p_x)_i}{dt} = (F_x)_i(\mathbf{x}), \qquad \frac{dx_i}{dt} = \frac{(p_x)_i}{m_i}$

for each particle i = 1, . . . , N , and similarly for coordinates y and z.

In words, if a force acts upon a particle, it changes its momentum. In fact, the force equals the rate of change of momentum.

We can now offer a full description of the phase space. For each particle

of a given mass, location, and momentum, we can determine the future

evolution of the object through differential equations (1.12a)-(1.12b).

Principle 1.4 (Newton’s Third Law of Motion) Every force exerted by some particle j on i ≠ j is equal and opposite to the force that i exerts on j. That is, using the notation of equation (1.11):

$\vec{f}_{ij} = -\vec{f}_{ji}$

for all i = 1, . . . , N and j ≠ i.27

Another way to state Newton’s Third Law is the following: for every

action, there is a reaction of equal force and opposite direction. Internal

forces all cancel each other out.

Result 1.1 (Conservation of Momentum) Newton’s Third Law is often referred to as the Law of Conservation of Momentum. To see why, note by equations (1.11) and (1.12a)-(1.12b) that:

$\dot{\vec{p}}_i = \sum_{j \neq i}\vec{f}_{ij}(\vec{r})$

In words, the rate of change in the momentum of a particle equals the sum of the forces due to all other particles (i.e. the sum of all internal forces), provided that there are no external forces. Newton’s Third Law implies that all pairwise forces cancel out (each $\vec{f}_{ij}$ in the double sum is offset by $\vec{f}_{ji} = -\vec{f}_{ij}$), that is:

$\sum_{i=1}^{N}\vec{F}_i(\vec{r}) = \vec{0}$

Hence, $\sum_{i=1}^{N}\dot{\vec{p}}_i = \sum_{i=1}^{N}\sum_{j \neq i}\vec{f}_{ij}(\vec{r}) = \vec{0}$. Interchanging derivative and summation, we get:

$\frac{d}{dt}\sum_{i=1}^{N}\vec{p}_i = \vec{0}$

That is, the rate of change of total momentum (the sum of momenta across particles) is nil.

In sum, the total momentum of an isolated system never changes, assuming no external forces are present. Though individual particles may acquire or lose momentum over their dynamic paths, all internal forces must cancel out, and the total overall momentum of the system remains constant over time.

27 Original text: “Actioni contrariam semper et æqualem esse reactionem: sive corporum duorum actiones in se mutuo semper esse æquales et in partes contrarias dirigi.” In English: “To every action there is always opposed an equal reaction: or the mutual actions of two bodies upon each other are always equal, and directed to contrary parts.”
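Conservation of total momentum can be observed directly by stepping equations (1.12a)-(1.12b) forward in time for a few particles. In the Python sketch below, the internal force is a hypothetical linear spring between every pair of particles, chosen only because it satisfies Newton’s Third Law; the function names, masses, initial conditions and time step are likewise assumptions.

```python
def internal_force(r_i, r_j, stiffness=1.0):
    """Force on i due to j: a linear spring pulling i toward j (assumed form).
    Swapping i and j flips the sign, as Newton's Third Law requires."""
    return [stiffness * (b - a) for a, b in zip(r_i, r_j)]

def step(masses, positions, momenta, dt):
    """One time step of equations (1.12a)-(1.12b), updating the lists in place."""
    n = len(masses)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j != i:
                f_ij = internal_force(positions[i], positions[j])
                for k in range(3):
                    forces[i][k] += f_ij[k]
    for i in range(n):
        for k in range(3):
            momenta[i][k] += dt * forces[i][k]                 # (1.12a)
            positions[i][k] += dt * momenta[i][k] / masses[i]  # (1.12b)

def total_momentum(momenta):
    return [sum(p[k] for p in momenta) for k in range(3)]

masses = [1.0, 2.0, 3.0]
positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 1.0]]
momenta = [[1.0, 0.0, 0.0], [0.0, -2.0, 0.0], [0.5, 0.5, 3.0]]

before = total_momentum(momenta)
for _ in range(1000):
    step(masses, positions, momenta, dt=1e-3)
after = total_momentum(momenta)
print("total momentum before:", before)
print("total momentum after: ", after)
```

Individual momenta change from step to step, but their sum stays the same up to floating-point rounding.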

Example 1.18 (Colliding particles) Consider a two-particle system, with particles of mass m1 and m2 moving along a line at speeds v1 and v2. Suppose the particles are moving toward each other, and take the direction of motion of the first particle as positive. Their momenta are then p1 = m1v1 and p2 = −m2v2, respectively. The system’s total momentum is p = m1v1 − m2v2.

Eventually, a collision occurs. At this point, assume the two particles become virtually one, with mass (m1 + m2) and velocity v′, where the prime indicates “post-collision”. By the conservation of momentum, the total momentum before the collision must be the same as the momentum right after the collision, for there are no external forces on this system (all forces are internal, as a result of the particle interaction implicit in the collision itself). Therefore, m1v1 − m2v2 = (m1 + m2)v′, or:

$v' = \frac{m_1 v_1 - m_2 v_2}{m_1 + m_2}$

For example, if the particles are going in opposite directions but at the same speed (say v), the velocity after the collision will be $v' = \frac{(m_1 - m_2)v}{m_1 + m_2}$. In words, the newly formed particle will move in the same direction as that which the heavier particle had prior to the collision.
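The collision formula is easy to evaluate. In the short Python sketch below (an illustration with assumed numbers and a hypothetical function name), velocities are signed numbers along the line of motion, so a particle moving in the negative direction simply has a negative velocity.

```python
def merged_velocity(m1, v1, m2, v2):
    """Velocity after two particles stick together; v1 and v2 are signed
    velocities along a line."""
    return (m1 * v1 + m2 * v2) / (m1 + m2)

# A 3 kg particle moving at +2 m/s meets a 1 kg particle moving at -2 m/s.
v_after = merged_velocity(3.0, 2.0, 1.0, -2.0)
print(v_after)
```

The result is positive: the merged particle keeps moving in the direction of the heavier one, and its momentum (m1 + m2)v′ equals the total momentum before the collision.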

1.4 Center of Mass

Multi-particle systems (for example, material objects) can be compli-

cated collections of particles, i = 1, . . . , N . Each particle has a certain mass

mi and position ~ri. The center of mass is a convenient way to summarize

the properties of motion of multi-particle systems. We define the center of

mass as the unique position in space that is at the mean location of the

distribution of mass in space.

Definition 1.27 (Center of mass) The center of mass of a system of i = 1, . . . , N particles, each with mass mi and in position ~ri, is the unique position vector ~rCM satisfying:

$\sum_{i=1}^{N} m_i(\vec{r}_i - \vec{r}_{CM}) = \vec{0}$ (1.13)

Solving the equation for ~rCM , we find:

$\vec{r}_{CM} = \frac{1}{M}\sum_{i=1}^{N} m_i\vec{r}_i$

where $M \equiv \sum_{i=1}^{N} m_i$ is the total mass of the system. In words, the

center of mass is the unique position in space with the property that the

mass-weighted position vectors of all particles relative to this point sum

to zero (i.e. cancel out). Thus, the distribution of mass of the system is

balanced around the center of mass.

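Taking the time derivative, we see:

$\vec{v}_{CM} = \frac{1}{M}\sum_{i=1}^{N} m_i\vec{v}_i = \frac{1}{M}\sum_{i=1}^{N}\vec{p}_i \equiv \frac{\vec{P}}{M}$

where $\vec{P} \equiv \sum_{i=1}^{N}\vec{p}_i$ is the total momentum of the system. Therefore, we have found the following important property: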

Result 1.2 (Total momentum and center of mass) The total momen-

tum of a system of particles i = 1, . . . , N satisfies:

~P = M~vCM

Equivalently, taking the time derivative, and using $\dot{\vec{p}}_i = \vec{F}_i(\vec{r})$ by Newton’s Second Law (equation 1.12a), we have:

$\dot{\vec{P}} = \sum_{i=1}^{N}\vec{F}_i(\vec{r}) = M\vec{a}_{CM}$ (1.14)

Therefore:

• If there are no external forces, then $\dot{\vec{P}} = \vec{0}$ by the conservation of momentum, and ~aCM = ~0. In words, when only internal forces (i.e. forces exclusively between particles within the system) are present, the center of mass is not accelerated, and thus has constant velocity.

• If there are external forces, then the net force on the system equals their sum, $\vec{F} \equiv \sum_{i=1}^{N}\vec{F}_i(\vec{r})$ (as internal forces all cancel out). Then, the laws of motion for the whole system are condensed into those of the center of mass, as equation (1.14) is simply Newton’s Second Law of motion for the position ~rCM .

This illustrates why it is convenient to talk about the center of mass in

multi-particle systems: it allows us to simplify the characterization of the

motion of each and every single particle within the system by just focusing

on one position, the one satisfying property (1.13). The behavior of the

body is predictable from the motion of this position.28 The center of mass

is thus the particle equivalent for application of Newton’s laws of motion to

the entire system.

Example 1.19 (A hammer in space) Suppose a hammer is floating in

outer space (i.e. no external forces). Conservation of momentum tells us

that the hammer’s center of mass (a unique point) must move at a constant

velocity. For example, if the hammer is rotating as it advances forward,

all of its particles will experience centripetal acceleration due to the uniform

circular motion, except for its center of mass, whose acceleration will always

be zero. Of course, here the center of mass is itself a particle of the system,

that about which the hammer is rotating.

Example 1.20 (Three particles) Consider three particles in space (x, y). Suppose the masses are m1 = m2 = m and m3 = 2m. Suppose the three bodies are held together by massless strings, forming an equilateral triangle whose sides have length $\ell$. The strings hold the bodies in positions ~r1 = (0, 0) (a normalization), ~r2 = (x2, y2) with x2, y2 > 0, and $\vec{r}_3 = (\ell, 0)$. For the triangle to be equilateral, we need $x_2 = \ell/2$. Moreover, y2 is the height of the triangle, so by Pythagoras’ Theorem we have $\ell^2 = y_2^2 + x_2^2$, or $y_2 = \frac{\sqrt{3}}{2}\ell$.

The position ~rCM = (xCM , yCM ) of the center of mass satisfies:

$4m\,\vec{r}_{CM} = \sum_i m_i\vec{r}_i$

We can decompose this into the x and y components:

$(x):\quad 4m \cdot x_{CM} = m \cdot 0 + m \cdot \frac{1}{2}\ell + 2m \cdot \ell \iff x_{CM} = \frac{5}{8}\ell = 0.625\,\ell$

$(y):\quad 4m \cdot y_{CM} = m \cdot 0 + m \cdot \frac{\sqrt{3}}{2}\ell + 2m \cdot 0 \iff y_{CM} = \frac{\sqrt{3}}{8}\ell \approx 0.217\,\ell$

Note that, in this case, the center of mass is not a particle in the system.

28 Of course, the center of mass may or may not be itself a particle in the system.
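The same computation can be carried out for any set of point masses. The Python sketch below is a minimal illustration (the function name is an assumption) that reproduces the numbers of this example with the normalizations m = 1 and ℓ = 1.

```python
import math

def center_of_mass(masses, positions):
    """Mass-weighted mean position, r_CM = (1/M) sum_i m_i r_i."""
    total_mass = sum(masses)
    dim = len(positions[0])
    return tuple(
        sum(m * r[k] for m, r in zip(masses, positions)) / total_mass
        for k in range(dim)
    )

# The three particles of the example, with m = 1 and side length 1.
ell, m = 1.0, 1.0
masses = [m, m, 2.0 * m]
positions = [(0.0, 0.0), (ell / 2.0, math.sqrt(3.0) / 2.0 * ell), (ell, 0.0)]

x_cm, y_cm = center_of_mass(masses, positions)
print(f"x_CM = {x_cm:.4f}, y_CM = {y_cm:.4f}")
```

One can also verify property (1.13) directly: the mass-weighted positions relative to (x_CM, y_CM) add up to zero.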


Chapter 2

Energy

Though we often speak of many different types of energy (e.g. kinetic,

potential, nuclear, chemical, thermal), in the realm of particle motion there

exist only two fundamental forms of energy: kinetic energy and potential

energy.

2.1 Work, Kinetic Energy, and Power

The kinetic energy (T ) is the energy that a body possesses by virtue of

being in motion. It is defined as the work needed to accelerate a body from

some state (at, say, time t1) to another (at, say, time t2 > t1) with a certain

velocity.

Let us first define work and kinetic energy formally, and then state how

the two are related:

Definition 2.1 (Work) The work done by a force ~Fi(~r) on a particle i =

1, . . . , N of velocity ~vi over the time interval [t1, t2] is:

$W_i(t_1, t_2) \equiv \int_{t_1}^{t_2}\vec{F}_i(\vec{r}) \cdot \vec{v}_i\,dt$ (2.1)

The unit of work is the joule (J).



Definition 2.2 (Kinetic Energy) The kinetic energy of a particle of mass

mi at some time t is defined as follows:

$T_i(t) \equiv \frac{1}{2}m_i|\vec{v}_i(t)|^2$ (2.2)

where ~vi(t) denotes the velocity at time t.

First, from equation (2.1), notice that ~vidt = d~ri by definition (in words,

displacement equals velocity times time elapsed), and thus we can write

work as:

$W_i = \int_X \vec{F}_i(\vec{r}) \cdot d\vec{r}_i$ (2.3)

where X denotes the trajectory from ~ri(t1) to ~ri(t2). Thus, the work

done by a force on a body is the dot product of the force and the body’s

displacement that has accumulated over time. Note that this means that

only forces that are non-perpendicular to the direction of motion contribute

to work.1

Work and kinetic energy are very closely related, and often (and loosely)

used interchangeably. The following result states the key equivalence be-

tween the two:

Result 2.1 (Work-Energy Theorem) The work done on an object by a

net force equals the change in the kinetic energy of the object. That is:

Wi(t1, t2) = Ti(t2)− Ti(t1)

for any two instants (t1, t2) in time.

Proof. To prove this equivalence, use Newton’s Second Law in terms of

momentum (equation (1.12a)) to write work (equation (2.1)) as:

Wi(t1, t2) =

∫ t2

t1

d~pidt· ~vidt =

∫ t2

t1

~vi · d(mi~vi) (2.4)

1 For example, as we shall see in the examples below, tension in pendular motion isperpendicular to the direction of motion, so the dot product is zero. Another exampleare normal forces (Definition 1.17)

100


where the second equality follows by definition of momentum. By the

product rule, we know:

d(~vi · ~vi) = (d~vi) · ~vi + ~vi · (d~vi)

so ~vi ·(d~vi) = 12d(~vi ·~vi). Assuming that mass mi is constant (i.e. mi = 0),

we have that ~vi · d(mi~vi) = mi2 d(~vi · ~vi). Finally, ~vi · ~vi = |~vi|2. Plugging

everything back into (2.4), we find:

$$W_i(t_1, t_2) = \int_{t_1}^{t_2} d\left(\frac{1}{2} m_i |\vec{v}_i|^2\right) = \frac{1}{2} m_i \left(|\vec{v}_i(t_2)|^2 - |\vec{v}_i(t_1)|^2\right) = T_i(t_2) - T_i(t_1)$$

our desired result. ∎

Thus, the work done by a force causes the kinetic energy of the object

to change. Indeed, the fact that an object has kinetic energy implies that a

force must have accelerated it. If the work is positive, then kinetic energy

increases. If negative, the body loses kinetic energy. Finally, kinetic energy

is maintained unless a force changes the body’s velocity, that is, unless a

force works on it.
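As a numerical sanity check of Result 2.1, the following Python sketch integrates equation (2.1) for a constant force and compares the result with the change in kinetic energy from equation (2.2). The function name, the midpoint-rule integration and the sample values are illustrative assumptions, not part of the original notes.

```python
def work_and_kinetic_change(mass, force, v0, t1, t2, steps=10_000):
    """Return (W, T(t2) - T(t1)) for a constant 2D force acting on a particle.

    Assumption: v0 is the velocity at t = 0 and the force is constant,
    so the velocity is v(t) = v0 + (F / m) * t.
    """
    ax, ay = force[0] / mass, force[1] / mass

    def velocity(t):
        return (v0[0] + ax * t, v0[1] + ay * t)

    def kinetic(t):
        vx, vy = velocity(t)
        return 0.5 * mass * (vx ** 2 + vy ** 2)

    # Midpoint-rule approximation of the integral of F . v dt (equation 2.1).
    dt = (t2 - t1) / steps
    work = 0.0
    for k in range(steps):
        vx, vy = velocity(t1 + (k + 0.5) * dt)
        work += (force[0] * vx + force[1] * vy) * dt

    return work, kinetic(t2) - kinetic(t1)


# A 2 kg body thrown with velocity (3, 4) m/s under its weight (0, -m g).
work, delta_T = work_and_kinetic_change(2.0, (0.0, -19.6), (3.0, 4.0), 0.0, 1.5)
print(work, delta_T)
```

The two printed numbers should agree up to floating-point error, which is what the Work-Energy Theorem predicts.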

Henceforth, we will often only refer to kinetic energy, but it should always

be understood that the kinetic energy of a body results from work done on

it by one or more forces.

Finally, we shall define the total kinetic energy of the system (i.e. across

all particles) as follows:

$$T \equiv \sum_{i=1}^{N} T_i = \frac{1}{2} \sum_{i=1}^{N} m_i |\vec{v}_i|^2$$

Note that, in equation (2.3), the integral is not over time, but over the

specific trajectory of displacement between times t1 and t2. In this definition

of work, therefore, the integral is potentially path-dependent.

When the work caused by a force does not depend on the path, but only

the starting and ending points, we say that the system has been exposed to



a conservative force.

Definition 2.3 (Conservative force) A force whose work is independent

of the trajectory of displacement.

For example, gravity is a conservative force (as shown in Example 2.3).

So are spring forces. Resistive forces (friction and drag), however, are non-

conservative.

Therefore, the kinetic energy of a body that is put in motion by a con-

servative force is also path-independent. Henceforth, we will typically deal

with conservative forces. Indeed, it is no accident that we have denoted

kinetic energy simply by T as opposed to T (~r), for the latter is redundant.

Finally, note that, when forces are conservative, we can write the limits

of integration in work as just the start and end points of the path, so (2.3)

simplifies to:

$$W_i = \int_{\vec{r}_i(t_1)}^{\vec{r}_i(t_2)} \vec{F}_i(\vec{r}) \cdot d\vec{r}_i$$

Example 2.1 (Shooting an object vertically) In dimensions (x, y), con-

sider a body of mass m going from point ~rA = (xA, yA) to ~rB = (xB, yB).

Suppose xA = xB and yB = yA +h, where h ∈ R. In words, the object expe-

riences no horizontal displacement, and moves only vertically until it reaches

some maximum height of h. Thus, the displacement is simply d~r = (0, h).

For instance, consider an object that is shot vertically into the air.

The object is given some velocity ~vA = (0, vA) at point A, and it comes

to a halt at point B, so ~vB = (0, 0). On the other hand, gravity acts in the

y direction, pushing downward with a force of −mg (with a minus sign to

indicate that the force acts in the direction opposite to ~vA).

The work done between points A and B, denoted WAB, is:

$$W_{AB} = \int_{\vec{r}_A}^{\vec{r}_B} \vec{F} \cdot d\vec{r} = -mg(y_B - y_A) = -mgh$$

This work is only due to the force of gravity. By the Work-Energy Theorem (Result 2.1), we know $W_{AB} = T_B - T_A$, where $T_B = 0$ (as $\vec{v}_B = \vec{0}$) and $T_A = \frac{1}{2} m v_A^2$. Thus, we have found $mgh = \frac{1}{2} m v_A^2$, or:

$$h = \frac{v_A^2}{2g}$$

In words, if we throw an object vertically with velocity $v_A$, the object will lose kinetic energy as it rises until it reaches a maximum height of $h = \frac{v_A^2}{2g}$, at which point it momentarily comes to rest; it is then attracted back to Earth and regains kinetic energy as it falls. Interestingly, this height is independent of the object's mass.

Example 2.2 (Lifting an object vertically) Consider the same case, ex-

cept now vA = 0. That is, the object is initially at rest, and it reaches point

~rB from ~rA by virtue of some force ~F = (0, Fy) that we apply along the way.

For instance, think of lifting a briefcase that is sitting on the floor.

The work now has two sources: that which is done by our own lifting

force ~F (denote it WF ), and that which is due to gravity (denote it Wg).

Clearly, to overcome gravity, we must exert work equal to WF = mgh. The

work due to gravity is, again, Wg = −mgh. The net work that is done is

then W = 0. By the Work-Energy Theorem, this means that there should

be no change in kinetic energy, or $T_A = T_B$. Indeed, $T_A = \frac{1}{2} m v_A^2 = 0$ and $T_B = \frac{1}{2} m v_B^2 = 0$, so $T_B - T_A = 0$, as predicted.

Example 2.3 (Gravity as a conservative force) Consider a more gen-

eral example in coordinates (x, y, z). A body is displaced from position

$\vec{r}_A = (x_A, y_A, z_A)$ to position $\vec{r}_B = (x_B, y_B, z_B)$. Call $y_B - y_A = h$ the height of the displacement. Gravity exerts a force $\vec{F}_g = (0, -mg, 0)$, that is, only in the y direction.2 Since no other force is at work here, the work for

going from ~rA to ~rB is:

$$W_{AB} = \int_{\vec{r}_A}^{\vec{r}_B} \vec{F}_g \cdot d\vec{r} = -mg(y_B - y_A) = -mgh$$

Thus, note that the work in this example is completely path independent:

2 Recall that we may also write $\vec{F}_g$ in terms of its basis vectors, such that $\vec{F}_g = (F_g)_x \vec{e}_x + (F_g)_y \vec{e}_y + (F_g)_z \vec{e}_z$. In this case, $(F_g)_x = (F_g)_z = 0$ and $(F_g)_y = -mg$, so $\vec{F}_g = -mg\,\vec{e}_y$, where $\vec{e}_y = (0, 1, 0)$.



it does not matter how we get to B from A, all that matters is the height

between these two points.

In short, gravity is a conservative force (recall Definition 2.3).
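The path independence claimed in Example 2.3 can be illustrated with a short Python sketch that sums $\vec{F}_g \cdot d\vec{r}$ over piecewise-straight paths. The helper name `work_along_path`, the mass and the waypoints are hypothetical choices made only for illustration.

```python
def work_along_path(force, points):
    """Sum F . dr over the straight segments joining consecutive points.

    Assumption: the force is constant, so each segment contributes
    exactly F . (end - start).
    """
    total = 0.0
    for start, end in zip(points, points[1:]):
        total += sum(f * (b - a) for f, a, b in zip(force, start, end))
    return total


m, g = 1.5, 9.8
gravity = (0.0, -m * g, 0.0)          # gravity acts along the y axis only
A, B = (0.0, 0.0, 0.0), (4.0, 3.0, 1.0)

direct = work_along_path(gravity, [A, B])
detour = work_along_path(gravity, [A, (10.0, -2.0, 5.0), (-1.0, 7.0, -3.0), B])
print(direct, detour)
```

Both paths should give the same work, $-mg(y_B - y_A)$, regardless of the intermediate waypoints.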

Example 2.4 (Friction revisited) Consider again Example 1.13. At the

end of this example, we calculated that the speed at which the brick arrives at the end of the incline is $v = \sqrt{2g\ell(\sin\theta - \mu_k \cos\theta)}$.

Another way of deriving this is to use the Work-Energy Theorem. Let

A be the initial position of the brick (when at rest), and B be the bottom of

the incline. Since the brick starts at rest, $T_A = 0$. The kinetic energy when it arrives is $T_B = \frac{1}{2} m v_B^2$, where $v_B$ is the speed at point B.

Letting h be the height of the object at point A, the amount of work gravity is doing is $mgh$, or $mg\ell \sin\theta$.3 Friction opposes the motion, so the work that it does is $-\ell \mu_k mg \cos\theta$. Adding the two contributions, we then get $W_{AB} = mg\ell \sin\theta - \ell \mu_k mg \cos\theta$.

Invoking the Work-Energy Theorem, we know WAB = TB − TA, or:

$$mg\ell \sin\theta - \ell \mu_k mg \cos\theta = \frac{1}{2} m v_B^2$$

Solving for $v_B$ yields $v_B = \sqrt{2g\ell(\sin\theta - \mu_k \cos\theta)}$, what we wanted to obtain.
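The Work-Energy computation of Example 2.4 can be sketched in Python as follows. The function name, the default value of g and the error handling are assumptions added for illustration; the formula itself is the one derived above.

```python
import math


def speed_at_bottom(length, theta, mu_k, g=9.8):
    """Speed at the bottom of an incline from W_AB = T_B - T_A with T_A = 0.

    length: distance travelled along the incline (the symbol l in the text)
    theta:  incline angle in radians
    mu_k:   coefficient of kinetic friction
    The mass cancels on both sides, so it is not an argument.
    """
    work_per_unit_mass = g * length * (math.sin(theta) - mu_k * math.cos(theta))
    if work_per_unit_mass < 0:
        raise ValueError("friction exceeds the pull of gravity: the brick does not slide")
    return math.sqrt(2.0 * work_per_unit_mass)


print(speed_at_bottom(2.0, math.radians(30), 0.2))
```

Setting `mu_k = 0` recovers the frictionless speed $\sqrt{2gh}$ with $h = \ell \sin\theta$.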

Finally, we will define a third and closely related concept: power.

Definition 2.4 (Power) The power (P) is the rate of change of work, or:

$$P = \frac{dW}{dt}$$

It is measured in joules (J) per second, also known as watts.4

Recall that $W = \int \vec{F} \cdot d\vec{r}$. Thus, the change in work is the dot product of the force and the displacement which that force produces, $dW = \vec{F} \cdot d\vec{r}$. Thus, an alternative definition for power is:

3 The latter by definition of the sine function: $\sin\theta = h/\ell$.

4 We may also use horsepower (hp), where 1 hp = 746 watts.



$$P = \vec{F} \cdot \vec{v}$$

where $\vec{v} \equiv \frac{d\vec{r}}{dt}$ is the velocity. In particular, if the force is perpendicular to the velocity vector, then power is zero.

2.2 Potential Energy

The second fundamental form of energy is the potential energy (V ). This

is the energy derived from a body’s position relative to others. The stronger

the force of one body on another, the lower the potential energy between

them. The key assertion is as follows:

Principle 2.1 (Potential Energy Principle) All forces are derived from

(i.e. are governed by) a potential energy function, denoted Vi(~r) for particle

i. In particular, for any system, there exists a differentiable function Vi(~r)

satisfying:

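$$\vec{F}_i(\vec{r}) = -\frac{\partial V_i(\vec{r})}{\partial \vec{r}}$$

for each $i = 1, \dots, N$. That is, in component-wise notation:

$$(F_x)_i(x) = -\frac{\partial V_i(\vec{r})}{\partial x_i}, \qquad (F_y)_i(y) = -\frac{\partial V_i(\vec{r})}{\partial y_i}, \qquad (F_z)_i(z) = -\frac{\partial V_i(\vec{r})}{\partial z_i}$$

where recall that $(F_x)_i(x)$ denotes the x (similarly for y and z) coordinate of the force on the i-th particle.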

In short, a force exerted on a body always reduces its potential energy.5

Or, put differently, a force always works in the direction opposite to the

5 For illustration, take a single-particle, one-dimensional space. Let x denote the (scalar) position of the particle in space. Then, $F(x) = -\frac{dV(x)}{dx}$, or $V(x) = -\int F(x)\,dx$. Thus, the particle's potential energy equals the negative of the accumulated force. In fact, using definition (2.1), we recognize that the right-hand side is nothing but $-T(x)$ (up to an additive constant), so $V(x) + T(x)$ is constant. Thus, in single-particle systems, changes in kinetic and potential energy cancel each other out. This is nothing but a trivial application of the Energy Conservation principle, which we shall discuss shortly.



increase in potential energy.

Example 2.5 (Gravitational potential energy) In dimensions (x, y, z),

the force due to gravity is ~F = (0,−mg, 0) on a body of mass m. The po-

tential energy due to this gravitational force, or in short the gravitational

potential energy, denoted Vg, is

$$V_g = -\int \vec{F} \cdot d\vec{r} = mgy$$

It should be noted that this value for the gravitational potential energy

holds for bodies that are close enough to each other (for example, the Earth

and an object standing on it). Example 2.19 will show that, when the dis-

tance between the bodies is small, this is indeed a good enough approximation

of Newton’s Universal Law of Gravitation.

The total potential energy of the system is the sum of potential energies

across all particles:

$$V(\vec{r}) \equiv \sum_{i=1}^{N} V_i(\vec{r})$$

Finally, we define the total energy of the system (E) as the sum of po-

tential and kinetic energy across all particles:

Definition 2.5 (Total Energy) The total energy of a system is the sum

of its kinetic and potential energies:

E(~r) ≡ T + V (~r)

Since in this chapter we are typically dealing with large visible objects

(e.g. planets or objects on Earth), in what follows we may also use the term

mechanical energy to refer to total energy. It should be noted, however, that

kinetic and potential energy are present not only in the motion of visible

objects, but also in that of heat, gases, and even atoms. Here is a (non-

exhaustive) list of different types of energy in nature:



• Mechanical energy: Potential and kinetic energy present in large

visible objects (e.g. planets or large objects on earth). It usually

involves gravitational potential energy.

• Heat and chemical energy: Potential and kinetic energy contained

in gases or other collection of molecules. Heat energy (Q) is expressed

in calories (cal), and is given by:

Q = mC∆T

where C is called specific heat (in calories per gram per degree centi-

grade) and ∆T is the temperature increase, in degrees centigrade.6

Heat and mechanical energy are equivalent in the sense that mechan-

ical energy raises the temperatures of bodies. In particular:

1 cal ≈ 4.2 J

or, equivalently, doing work of 1 joule increases heat energy by 1/4.2 ≈ 0.24 calories. For example, to warm up 100 kg of water by 50 °C, $Q = 5{,}000$ kcal $\approx 2 \times 10^7$ J are needed.

Other types of energy include:

• Atomic and nuclear energy: Potential and kinetic energy stored in

the bonds between the constituents of atoms and atom nuclei. Ruled

by the laws of quantum mechanics.

• Electrostatic energy: Potential energy associated with the forces of

attraction and repulsion between electrically charged particles.

• Magnetic energy: Potential energy between the poles of magnetized

objects.

6 For water, C = 1 cal per gram per degree centigrade. Thus, one calorie is defined as the energy that is required to increase the temperature of 1 gram of water by 1 degree centigrade. For aluminum, C = 0.2. For ice, C = 0.5.



• Electromagnetic radiation: Potential and kinetic energy stored

in radiation (e.g. the sun, radio waves, laser light, etc.). It is not

the energy of particles, but of fields. Ruled by the laws of quantum

mechanics.

All of these types of energy can be converted into one another. For in-

stance, mechanical energy can be turned into electric energy with a dynamo,

and chemical energy is turned into heat energy when gasoline is burned.

2.3 Conservation of Energy

Next, we show a fundamental principle: when forces are conservative,

total energy is always conserved. What this means is that, although indi-

vidual particles might experience different kinetic and potential energies as

they move, total energy is always constant over time. Intuitively, this means

that, overall, potential energy turns into kinetic energy as force is exerted

on particles and they experience motion.7

Result 2.2 (Conservation of Total Energy) When the force is conser-

vative, total energy is conserved. That is, if kinetic energy is path-independent,

then:

$$\frac{\partial E(\vec{r})}{\partial t} = 0$$

Proof. By the Potential Energy principle and Newton’s Second Law, we

have that:

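$$m_i \dot{\vec{v}}_i = -\frac{\partial V_i(\vec{r})}{\partial \vec{r}}$$

Left-multiplying by velocity and adding up across particles we find:

$$\sum_{i=1}^{N} m_i \vec{v}_i \cdot \dot{\vec{v}}_i = -\sum_{i=1}^{N} \vec{v}_i \cdot \frac{\partial V_i(\vec{r})}{\partial \vec{r}}$$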

7 For example, an archer turns the bow's potential energy from pulling the string back into kinetic energy on the arrow as the string is released; overall, the potential and kinetic energies of the bow, arrow, and the archer's body all cancel out.



Using the definition of kinetic energy, $T \equiv \frac{1}{2} \sum_i m_i \vec{v}_i \cdot \vec{v}_i$, note that the left-hand side is just $\dot{T}$ (to see this, simply apply the chain rule, and notice that we crucially need T to be path independent!). Moreover, $\vec{v}_i \cdot \frac{\partial V_i(\vec{r})}{\partial \vec{r}} = \dot{V}_i(\vec{r})$, again by the chain rule, so the right-hand side is just $-\dot{V}(\vec{r})$. Therefore, $\dot{T} = -\dot{V}(\vec{r})$, or $\dot{E}(\vec{r}) = 0$, our desired result. ∎

Once again, note that energy need not be conserved on each particle (i.e. $\dot{T}_i + \dot{V}_i(\vec{r}) = 0$ need not hold). Rather, it is total energy, i.e. the sum of the energies across all particles within the system, which must be constant.

Moreover, we must once again stress that mechanical energy is conserved

only when the force is itself conservative, that is, when kinetic energy is

path-independent. This is often, though not always, the case. For example:

• The gravitational force is conservative (shown in Example 2.3), so the

mechanical energy of objects that experience gravitational acceleration

is always conserved.

• Spring motion is also conservative (shown in Example 2.10), so the

mechanical energy due to the motion of a spring is also conserved.

• However, frictional forces are (famously) not conservative. Intuitively, this is clear: given the coefficients of friction (COFs) of the frictional force (Definition 1.18), the work that is needed to overcome friction is larger the longer the path between two points. In short, kinetic energy depends on the path that is taken between the starting and ending points. As a result, mechanical energy is not conserved in that case.

Example 2.6 (Example 2.3, cont’d) Return to Example 2.3, where we

found WAB = −mgh. By the Work-Energy Theorem, we have −mg(yB −yA) = TB − TA or, rearranging terms:

mgyA + TA = mgyB + TB

Here, mgyA is the gravitational potential energy at point ~rA, and simi-

larly for point ~rB. Thus, the last equation says that total energy (the sum of



potential and kinetic) does not change from point ~rA to point ~rB. In short,

total energy is conserved.

Example 2.7 (Harmonic Oscillator: Circular Motion) Consider uni-

form motion along circular orbits (Example 1.2), so that

$$\vec{r} \equiv \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} R\cos(\omega t) \\ R\sin(\omega t) \end{pmatrix}$$

for radius R > 0 and angular frequency ω when the angular displacement is

simply dθ = ωdt. Using the results for velocity and acceleration obtained in

Example 1.2, we readily obtain:8

$$T = \frac{m(R\omega)^2}{2}, \qquad V = \frac{kR^2}{2}$$

Clearly, then, $\dot{T} = \dot{V} = 0$, so energy is conserved. Indeed, note T is path-independent, so the force is conservative and, therefore, energy is conserved as well.

Example 2.8 (Rolling on a semicircle) Consider placing a marble in-

side a semicircle of radius R > 0 (Figure 2.1). The ball will roll down

and oscillate back and forth about the lowest point of the circle (call it

(x, y) = (0, 0)), until it comes to a rest at this point.

Figure 2.1: A marble (black dot) rolling down a semicircular slope of radius R. Blue arrows are forces (the weight mg and the normal force $F_N$). In red, the vertical distance $R(1 - \cos\theta)$ between y = 0 and the current position of the marble on the y axis.

8 For both V and T, we use that $\sin^2 \omega t + \cos^2 \omega t = 1$ by equation (0.2).



Suppose the marble is in some position (x, y), at an angle θ about the

y axis. Relative to the circle’s center, the y component of the direction

is R cos θ. Relative to the origin y = 0, therefore, the distance is y =

R−R cos θ = R(1− cos θ). Thus, the gravitational potential energy is:9

Vg = mgR(1− cos θ)

For example:

• When θ = 0, then Vg = 0, which makes sense because θ = 0 means the marble is positioned at the resting point (x, y) = (0, 0), and thus it has no potential energy left.

• When θ = π/2 (the highest point in the semicircle, where the position

vector and the y axis are perpendicular), we have Vg = mgR, which is

exactly right since the marble in that case is at height R.

The speed is $|\vec{v}| = R\dot{\theta}$, where $\dot{\theta} \equiv \frac{d\theta}{dt}$ is the angular frequency (Definition 1.5).10 Thus, the kinetic energy is:

$$T = \frac{1}{2} m |\vec{v}|^2 = \frac{1}{2} m R^2 \dot{\theta}^2$$

The potential energy is:

V = mgR(1− cos θ)

Thus, the system’s total mechanical energy is:

$$E = T + V = \frac{1}{2} m R^2 \dot{\theta}^2 + mgR(1 - \cos\theta)$$

Now, as in the case of the pendulum, we use a Small-Angle approximation

(Definition 1.16) to make progress. Namely, we can (Taylor-)approximate

9 Note the potential energy includes forces only in the y direction, not the x direction. We explain this in the final paragraph of this example.

10 The derivation of this velocity is similar to what we saw in Example 1.2, except that now the angular velocity is changing with time (zero when the marble is released, and increasing as the marble rolls down, with a maximum at the lowest point).



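$\cos\theta$ around $\theta = 0$ by $1 - \frac{\theta^2}{2}$. With this approximation, $V = mgR\frac{\theta^2}{2}$, so $\dot{V} = mgR\theta\dot{\theta}$. Therefore:

$$\dot{E} = \dot{T} + \dot{V} = mR^2 \dot{\theta}\ddot{\theta} + mgR\theta\dot{\theta} = mR\dot{\theta}\left(R\ddot{\theta} + g\theta\right)$$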

Since gravity is a conservative force, we may invoke conservation of mechanical energy. Thus, setting $\dot{E} = 0$, we obtain:

$$\ddot{\theta} + \frac{g}{R}\theta = 0$$

Therefore, the angular motion obeys (up to the error in the Small-Angle

approximation method) a simple harmonic oscillation (recall equation (1.4)).

From Example 1.9, we then know that:

$$\theta = \theta_{max} \cos(\omega t + \varphi) \tag{2.5}$$

where $\omega \equiv \sqrt{g/R}$ is the angular frequency of motion,11 $\theta_{max}$ is the amplitude (the maximum angle that is reached before the process oscillates back), and $\varphi$ is the phase angle (in radians). The period of motion is, thus, $T = \frac{2\pi}{\omega} = 2\pi\sqrt{R/g}$ seconds, and the frequency of motion is $f = 1/T$ Hz.
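To gauge how good the Small-Angle approximation is, the Python sketch below compares the period $2\pi\sqrt{R/g}$ with a period measured from a direct numerical integration of $\ddot{\theta} = -(g/R)\sin\theta$, the equation obtained from $\dot{E} = 0$ before approximating $\cos\theta$. The function names, the semi-implicit Euler scheme and the step size are assumptions chosen for simplicity, not the method of the notes.

```python
import math


def small_angle_period(R, g=9.8):
    """Period predicted by the Small-Angle approximation."""
    return 2.0 * math.pi * math.sqrt(R / g)


def simulated_period(R, theta0, g=9.8, dt=1e-4):
    """Release the marble from rest at angle theta0 > 0 and time a quarter period.

    Integrates theta'' = -(g/R) sin(theta) with a semi-implicit Euler scheme
    until theta first crosses zero, then multiplies that time by four.
    """
    theta, omega, t = theta0, 0.0, 0.0
    prev = theta
    while theta > 0.0:
        omega -= (g / R) * math.sin(theta) * dt   # update the angular velocity first
        prev = theta
        theta += omega * dt                        # then the angle
        t += dt
    # Linear interpolation back to the instant where theta = 0.
    t_cross = t - dt * theta / (theta - prev)
    return 4.0 * t_cross


R = 0.5
print(small_angle_period(R), simulated_period(R, 0.05), simulated_period(R, 1.0))
```

For a small release angle the two periods should nearly coincide, while for a large release angle the simulated period is noticeably longer, which is the error that the Small-Angle approximation ignores.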

Importantly, we have found exactly the laws of motion that we found

for pendular motion (Example 1.10). The only difference between the two

set-ups is that, in the case of the pendulum, a string was adding tension

to the body, whereas for the present example a normal force plays this role

(assuming no friction). Yet, we have made no mention of these forces when

invoking the conservation of energy. Why do they make no difference?

The reason is that both of these forces are orthogonal to the direction of

motion (see Figure 2.1), and therefore ~FN · d~r = 0. As potential energy is

defined V = −∫~F · d~r, normal forces do not do any work, and thus they do

not contribute to potential energy.

11 Note that angular frequency and angular velocity are here different because the latter is non-constant. When the circular motion is uniform, as in Example 1.9, the two coincide.



Example 2.9 (Amplitude and phase angle) Consider the simple pen-

dulum once again (Example 1.10). Using the results there, as well as in

our previous example above these lines, the angle at any time t is given by

equation (2.5).

We now ask how to obtain the amplitude θmax, namely the maximum

angle that the pendulum can achieve, and the phase angle ϕ.

Let ~rA = (xA, yA) = (0, 0) be the position of the bob when at rest, ~rC =

(xC , yC) denote the position at the highest point (when θ = θmax), and ~rB =

$(x_B, y_B)$ somewhere in between (at, say, angle $\theta_0$). If $\ell$ is the total length of the string, then the heights at points C and B are, respectively, $y_C = \ell(1 - \cos\theta_{max})$ and $y_B = \ell(1 - \cos\theta_0)$.

• Amplitude: To find θmax, we invoke the conservation of mechanical

energy: TA + VA = TB + VB = TC + VC . Since the bob is at rest in

point A, we can normalize the potential energy to VA = 0. Since it has

come to a halt at C, it has no kinetic energy left, so TC = 0. Thus:

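$$\underbrace{\frac{1}{2} m |\vec{v}_A|^2}_{=T_A} = \underbrace{\frac{1}{2} m |\vec{v}_B|^2}_{=T_B} + \underbrace{mg y_B}_{=V_B} = \underbrace{mg y_C}_{=V_C} \tag{2.6}$$

Letting $h = y_C - y_B$, then the right side simplifies to $\frac{1}{2} m |\vec{v}_B|^2 = mgh$, so $h = \frac{|\vec{v}_B|^2}{2g}$. We can now compute the amplitude. Since $y_C = y_B + h = \ell(1 - \cos\theta_{max})$, then $\cos\theta_{max} = 1 - \frac{y_B + h}{\ell} = \cos\theta_0 - \frac{|\vec{v}_B|^2}{2g\ell}$.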

Another way to find $\theta_{max}$ is to use the Work-Energy Theorem. First, the change in kinetic energy between $\vec{r}_A$ and $\vec{r}_B$ is $T_B - T_A = \frac{1}{2} m |\vec{v}_B|^2 - \frac{1}{2} m |\vec{v}_A|^2$. On the other hand, the work done is just due to gravity, $W_{AB} = -mg y_B$.12 By the Work-Energy Theorem, $W_{AB} = T_B - T_A$, or $-mg y_B = \frac{1}{2} m |\vec{v}_B|^2 - \frac{1}{2} m |\vec{v}_A|^2$, exactly the left side of equation (2.6). Similarly, $W_{BC} = -T_B$ (as $T_C = 0$), so $mg(y_C - y_B) = \frac{1}{2} m |\vec{v}_B|^2$,

exactly the right side of equation (2.6). This shows that the Work-

12 Again, even though there is also the tension force on the string, the force vector in that case is perpendicular to the direction of motion, so the dot product of the two is zero. Thus, tension does not do any work.



Energy theorem and the Conservation of Mechanical Energy are, in

fact, one and the same.

• Phase angle: To find ϕ, we need to consider the initial condition of

the system at t = 0. Suppose that the bob is in point B at that time,

with velocity |~vB,0| and angle θ0. We know, by equation (2.5), that

θ0 = θmax cosϕ. The angular velocity at t = 0 is:

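$$\left.\frac{d\theta}{dt}\right|_{t=0} = -\omega \theta_{max} \sin\varphi$$

(in radians / sec), but $\frac{d\theta}{dt} = \frac{|\vec{v}_{B,0}|}{\ell}$,13 so $|\vec{v}_{B,0}| = -\ell\omega\theta_{max}\sin\varphi$, which we can solve for $\sin\varphi$.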

Example 2.10 (Harmonic Oscillator: Spring Motion) Consider a sin-

gle body with mass m that is attached to the end of a massless spring. Recall

the force in this case is proportional to the position (Hooke’s law, Princi-

ple 1.3), Fx = −kx. Similarly, if there’s a pull in the y direction, then

Fy = −ky.

• The work needed to bring the object from point A to point B is:

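$$W_{AB} = \int_{\vec{r}_A}^{\vec{r}_B} \vec{F} \cdot d\vec{r} = k \int_{\vec{r}_A}^{\vec{r}_B} \vec{r} \cdot d\vec{r} = \frac{1}{2} k \left(|\vec{r}_B|^2 - |\vec{r}_A|^2\right)$$

where $\vec{r} = (x, y)$ and $\vec{F} = k\vec{r}$ is the force that must be applied against the spring's pull. Taking $\vec{r}_A = \vec{0}$ as the reference point, we have then derived the potential energy:

$$V(x, y) = \frac{1}{2} k (x^2 + y^2)$$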

• Conversely, we can recover the force from the potential energy. By the

Potential Energy principle, we have:

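$$\vec{F} = -\begin{pmatrix} \frac{\partial V(x,y)}{\partial x} \\ \frac{\partial V(x,y)}{\partial y} \end{pmatrix} = -\begin{pmatrix} kx \\ ky \end{pmatrix}$$

13 Take an infinitesimal angle $d\theta$. The arc is some $ds$, and the hypotenuse is $\ell$. By construction, then, $d\theta = \frac{ds}{\ell}$. Dividing both sides by $dt$, we find $\frac{d\theta}{dt} = \frac{|\vec{v}|}{\ell}$, where the (linear) speed is $|\vec{v}| = \frac{ds}{dt}$.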



• Newton’s Second Law then says:

$$\begin{pmatrix} \ddot{x} \\ \ddot{y} \end{pmatrix} = -\omega^2 \begin{pmatrix} x \\ y \end{pmatrix} \tag{2.7}$$

where $\omega = \sqrt{k/m}$ is the spring's angular frequency (for the derivation, see Example 1.9). Again, this means that the spring is a simple harmonic oscillator, so using our results for oscillators we have that:

$$x = x_{max} \cos(\omega t + \varphi_x)$$

where $x_{max}$ is the amplitude, and $\varphi_x$ is the phase, and similarly for y.

• The system’s kinetic energy is:

$$T = \frac{1}{2} m |\vec{v}|^2 = \frac{1}{2} m (\dot{x}^2 + \dot{y}^2)$$

Note kinetic energy is path-independent (it depends on velocities, not

positions), showing that the force caused by springs is conservative.

• The system’s total energy is, therefore:

$$E = T(\dot{x}, \dot{y}) + V = \frac{1}{2}\left[m(\dot{x}^2 + \dot{y}^2) + k(x^2 + y^2)\right] = \frac{1}{2}\left(m|\dot{\vec{r}}|^2 + k|\vec{r}|^2\right)$$

In words, the total energy is proportional to the squares of the velocity and the position.

• Now we can compare these results with those of circular motion (Ex-

ample 1.2). For given velocity, the particle motion describes an orbit.

Along such orbit, potential energy remains constant. Thus, the particle

stays in orbits of constant energy. For particles of higher momentum,

the particle describes an orbit of larger radius, and the particle’s veloc-

ity picks up. However, though higher, potential energy remains con-

stant. This is illustrated in Figure 2.2, where outer orbits correspond



to particles with higher momenta and, therefore, higher velocity. We can thus think of the concentric orbits in the figure as nothing but contours of constant potential energy, with energy increasing quadratically as we move away from the origin.

Figure 2.2: Circular motion contours of constant potential energy, with velocity vectors $\vec{v}_1$, $\vec{v}_2$ and acceleration vectors $\vec{a}_1$, $\vec{a}_2$ drawn on the orbits. The particle describes orbits of constant potential energy. For particles of higher velocities (e.g. $|\vec{v}_2| > |\vec{v}_1|$), potential energy is higher and the orbit is larger.

A loosely analogous picture holds for planets in the Solar System: those that are farther from the Sun exhibit higher potential energies, although, because gravity (unlike the spring force) weakens with distance, they orbit at lower speeds.14 The Sun, exerting a gravitational pull, provides the centripetal acceleration that is necessary to keep the planets in orbit.

• Let us show formally that total energy is indeed conserved. First, the

change in total potential energy is:

$$\dot{V} = k(x\dot{x} + y\dot{y})$$

The change in kinetic energy is:

14 The orbits of the planets in the Solar System are elliptical, not circular. We neglect this detail here.



$$\dot{T} = m(\dot{x}\ddot{x} + \dot{y}\ddot{y}) = -m\omega^2(x\dot{x} + y\dot{y}) = -k(x\dot{x} + y\dot{y})$$

where the second equality uses (2.7), and the third equality uses $\omega^2 = k/m$. Therefore, $\dot{V} + \dot{T} = 0$, as sought. Thus, all potential energy is transformed into kinetic energy as the particle orbits around the circle.

• Finally, we can compute the amplitudes (xmax, ymax) and the phase

(ϕx, ϕy) of the spring, similarly to the way we did it in Example 2.9.

Pick three points: ~rA = (xA, yA) = (0, 0), where the spring is in a

relaxed position at some length `; ~rC = (xC , yC) = (xmax, ymax), where

the spring reaches its maximum amplitude before rebounding; and ~rB =

(xB, yB), somewhere in between. By the conservation of energy, TA +

$V(x_A, y_A) = T_B + V(x_B, y_B) = T_C + V(x_C, y_C)$. Since $V(x, y) = \frac{1}{2} k (x^2 + y^2)$, then $V(x_A, y_A) = 0$. Since $T = \frac{1}{2} m |\vec{v}|^2$ and the spring comes to a halt at $\vec{r}_C$, then $T_C = 0$. Thus:

$$\frac{1}{2} m |\vec{v}_A|^2 = \frac{1}{2} m |\vec{v}_B|^2 + \frac{1}{2} k (x_B^2 + y_B^2) = \frac{1}{2} k (x_{max}^2 + y_{max}^2)$$

from which we can calculate directly the amplitudes from given speeds.

Similarly, the Work-Energy Theorem will give us the same result.
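The conservation of energy in Example 2.10 can also be checked numerically. The Python sketch below evaluates $E = T + V$ along the oscillator solution $x = x_{max}\cos(\omega t + \varphi_x)$, $y = y_{max}\cos(\omega t + \varphi_y)$; the function name and all numerical values are illustrative assumptions.

```python
import math


def spring_energy(m, k, x_max, y_max, phi_x, phi_y, t):
    """Total energy T + V of the 2D spring at time t along the oscillator solution."""
    omega = math.sqrt(k / m)
    x = x_max * math.cos(omega * t + phi_x)
    y = y_max * math.cos(omega * t + phi_y)
    # Velocities are the time derivatives of the positions above.
    vx = -omega * x_max * math.sin(omega * t + phi_x)
    vy = -omega * y_max * math.sin(omega * t + phi_y)
    kinetic = 0.5 * m * (vx ** 2 + vy ** 2)
    potential = 0.5 * k * (x ** 2 + y ** 2)
    return kinetic + potential


# Sample the energy at 500 instants; it should not change over time.
energies = [
    spring_energy(0.3, 12.0, 0.1, 0.2, 0.0, math.pi / 2, 0.01 * n)
    for n in range(500)
]
print(min(energies), max(energies))
```

Both printed values should equal $\frac{1}{2} k (x_{max}^2 + y_{max}^2)$ up to rounding, in line with the last equation of the example.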

Example 2.11 (Roller Coaster) Mechanical energy is also conserved in

(frictionless) roller coasters. To show this, consider a roller coaster car of

mass m that slides down an incline with a loop at the end that has radius R

(Figure 2.3).

We will consider four positions for the car: ~rA = (0, h) (i.e. at the very

top of the incline, at height h > 0); ~rB = (xB, yB) with xB > 0, yB < h

(i.e. mid-way down the incline); ~rC = (xC , 0) with xC > xB (i.e. at the

lowest point of the loop); and ~rD = (xD, yD) with xD = xC and yD = 2R

(i.e. at the highest point in the loop, if the object ever makes it there).

The velocity vectors are ~vA = (0, 0), ~vB = (vB,x, vB,y), ~vC = (vC,x, 0), and



Figure 2.3: A roller coaster.

~vD = (vD,x, 0).15 By the Conservation of Mechanical Energy, we have:

TA + VA = TB + VB = TC + VC = TD + VD (2.8)

where Ti and Vi denote kinetic and potential energy at point ~ri for i =

A,B,C,D. In particular:

• Point ~rA: At this point, since there is no velocity, TA = 0. Potential

energy is purely gravitational, so VA = mgh.

• Point $\vec{r}_B$: At this point, $T_B = \frac{1}{2} m \left(v_{B,x}^2 + v_{B,y}^2\right)$. Potential energy is gravitational, so $V_B = mg y_B$.

• Point $\vec{r}_C$: At this point, $T_C = \frac{1}{2} m v_{C,x}^2$. Potential energy is gravitational, so $V_C = mg y_C = 0$.

• Point $\vec{r}_D$: At this point, $T_D = \frac{1}{2} m v_{D,x}^2$. Potential energy is gravitational, so $V_D = mg y_D = 2mgR$.

Note that kinetic energy is always path independent, so the principle of

Conservation of Mechanical Energy holds. This is because only gravitational

forces, which are conservative, are at play here. If friction was added to

15 In point A, the car is released with no acceleration; at points C and D there is clearly no vertical displacement.



the picture, energy conservation could not be invoked (as friction is a non-

conservative force).

Here, equation (2.8) reads:

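$$gh = \frac{1}{2}\left(v_{B,x}^2 + v_{B,y}^2\right) + g y_B = \frac{1}{2} v_{C,x}^2 = \frac{1}{2} v_{D,x}^2 + 2gR$$

Thus, $v_{C,x} = \sqrt{2gh}$ and $v_{D,x} = \sqrt{2g(h - 2R)}$. At point $\vec{r}_B = (x_B, y_B)$, half-way down the incline, the car is moving at speed

$$|\vec{v}_B| = \sqrt{2g(h - y_B)}$$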

We also know, from Example 1.2, that at point $\vec{r}_D$, if reached, there must be a centripetal acceleration equal to $a_D \equiv |\vec{a}_D| = \frac{v_{D,x}^2}{R}$. Moreover, from Example 1.12, we know that, if $\vec{r}_D$ is reached, then $a_D \geq g$, or else the car's acceleration could not have overcome the force of gravity. Using $v_{D,x} = \sqrt{2g(h - 2R)}$, then we have $2g(h - 2R) \geq gR$, which simplifies to:

h ≥ 2.5R

This is a classic result: if we release the car down the hill, it will not

make it to the top of the loop unless we drop it from a height that is, at least,

two-and-a-half times the loop’s radius. In the real world, where there is both

air and car-to-rails friction, the Conservation of Energy will fail, and the

factor becomes even higher. However, 2.5 may be used as a lower bound, as

it holds in a frictionless environment.
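The roller coaster result can be turned into a small Python sketch that returns the speeds at the bottom and top of the loop for a given release height. The function name, the default g and the convention of returning `None` when the top is not reached are assumptions made for illustration; friction is ignored, as in the example.

```python
import math


def loop_speeds(h, R, g=9.8):
    """Return (v_C, v_D) for a frictionless car released from rest at height h.

    v_C is the speed at the bottom of the loop and v_D the speed at the top.
    v_D is None when h < 2.5 R, i.e. when the centripetal condition a_D >= g
    fails and the car does not make it around the loop.
    """
    v_C = math.sqrt(2.0 * g * h)
    if h < 2.5 * R:
        return v_C, None
    v_D = math.sqrt(2.0 * g * (h - 2.0 * R))
    return v_C, v_D


print(loop_speeds(2.5, 1.0))   # borderline case: v_D**2 / R equals g
print(loop_speeds(2.0, 1.0))   # released too low: the top is not reached
```

In the borderline case $h = 2.5R$, the centripetal acceleration at the top is exactly g, which is the lower bound discussed above.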

2.4 Collisions

Collisions of bodies are an example of the conservation of total energy

(Result 2.2) and the conservation of momentum (Result 1.1) in practice.

In this section, we are interested in understanding the change in the ki-

netic energy of particles when total momentum and total energy must be

conserved.



Example 2.12 (Colliding particles, cont’d) Recall our example with col-

liding particles (Example 1.18). The particles had masses m1 and m2, ve-

locities ~v1 and ~v2, and eventually collided and stuck together into a single

particle of mass (m1 +m2) and velocity ~v ′.

By assuming no external forces, we invoked conservation of momentum

to derive:

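$$\vec{v}\,' = \frac{m_1 \vec{v}_1 - m_2 \vec{v}_2}{m_1 + m_2}$$

Now, we can study the energy involved before and after the collision:

• Before the collision, kinetic energy is $T = \frac{1}{2} m_1 |\vec{v}_1|^2 + \frac{1}{2} m_2 |\vec{v}_2|^2$.

• After the collision, kinetic energy is $T' = \frac{1}{2}(m_1 + m_2)|\vec{v}\,'|^2$. Or, using our formula for $\vec{v}\,'$:

$$T' = \frac{1}{2} \frac{m_1^2 |\vec{v}_1|^2 + m_2^2 |\vec{v}_2|^2}{m_1 + m_2} - \frac{m_1 m_2}{m_1 + m_2} |\vec{v}_1||\vec{v}_2| < T$$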

In words, the kinetic energy goes down as a result of the collision (it

turns into heat energy, so that total energy remains constant). Thus, in the

absence of external forces, kinetic energy must be destroyed if the system’s

total momentum is to be conserved.

Example 2.13 (Splitting particles) While kinetic energy has been lost

in the above example, it would increase in the opposite scenario: the (sudden)

separation of particles. Consider a particle of mass m which suddenly splits

into two sub-particles. Initially, ~v = ~0 and thus momentum ~p = ~0. After the

separation, the sub-particles have mass m1 and m2, and acquire velocities

~v ′1 and ~v ′2 , in opposite directions as they fly away from each other.

Because no external forces act upon the system, momentum is conserved, so $\vec{0} = m_1 \vec{v}\,'_1 - m_2 \vec{v}\,'_2$. Interestingly, for momentum to be conserved, the ratio of the two particles' speeds must be fixed by their masses, with the lighter object acquiring the faster speed:

$$\frac{|\vec{v}\,'_1|}{|\vec{v}\,'_2|} = \frac{m_2}{m_1}$$



For instance, if one sub-particle is twice the mass of the other, after the split it will move at half the speed of the other one. It must, or else the system's total momentum would not be conserved.

Finally, clearly, kinetic energy has increased: it was T = 0 initially, and after the split it becomes $T' = T'_1 + T'_2$ with $T'_j = \frac{1}{2} m_j |\vec{v}\,'_j|^2$ for each sub-particle j = 1, 2.

We have just seen two examples of collisions in which kinetic energy

decreases and increases, respectively. Let us analyze these cases more gen-

erally.

Suppose that the two particles collide and, as a result, the particles

bounce off. After the collision, the particles have velocities ~v ′1 and ~v ′2 .

Therefore, invoking conservation of momentum (because there are no exter-

nal forces) we have:

$$m_1 \vec{v}_1 - m_2 \vec{v}_2 = m_1 \vec{v}\,'_1 + m_2 \vec{v}\,'_2 \tag{2.9}$$

On the other hand, total energy must be conserved. Here:

• Before the collision, kinetic energy is $T = \frac{1}{2} m_1 |\vec{v}_1|^2 + \frac{1}{2} m_2 |\vec{v}_2|^2$.

• After the collision, kinetic energy is $T' = \frac{1}{2} m_1 |\vec{v}\,'_1|^2 + \frac{1}{2} m_2 |\vec{v}\,'_2|^2$.

In any case, by the conservation of total energy, there must exist a Q

such that:

T +Q = T ′

Then, we distinguish the following cases:

Case 1: Q > 0⇒ Super-elastic collision.

In this case, kinetic energy increases as a result of the collision. This

is the case, for example, of explosions.

Case 2: Q = 0⇒ (Completely) elastic collision.

In this case, kinetic energy remains constant as a result of the collision.



Case 3: Q < 0⇒ Inelastic collision.

In this case, kinetic energy decreases as a result of the collision. In

this case, kinetic energy is generally transferred into heat energy.

Therefore, Example 2.12 is an example of an inelastic collision, while

Example 2.13 showed a super-elastic collision. Let us now see an example

of Case 2.

Example 2.14 (Completely elastic collision) Since $Q = 0$, then $T = T'$, so:

$$m_1|\vec{v}_1|^2 + m_2|\vec{v}_2|^2 = m_1|\vec{v}_1'|^2 + m_2|\vec{v}_2'|^2 \qquad (2.10)$$

by the conservation of total energy. Now, we can use equations (2.9)-(2.10) to solve for $(\vec{v}_1', \vec{v}_2')$. The solution is:

$$\vec{v}_1' = \frac{(m_1 - m_2)\vec{v}_1 - 2m_2\vec{v}_2}{m_1 + m_2} \quad\text{and}\quad \vec{v}_2' = \frac{(m_1 - m_2)\vec{v}_2 + 2m_1\vec{v}_1}{m_1 + m_2}$$

Suppose for illustration that $\vec{v}_2 = \vec{0}$, so that $\vec{v}_1' = \frac{m_1 - m_2}{m_1 + m_2}\vec{v}_1$ and $\vec{v}_2' = \frac{2m_1}{m_1 + m_2}\vec{v}_1$. For instance, think of a ping-pong ball (object 1) colliding with

a billiards ball (object 2) which is at rest. Then, ~v ′2 has the same sign

as ~v1 (the billiards ball will start moving in the same direction as the one

that the ping-pong had originally), but ~v ′1 has the opposite direction as ~v1 if

m1 < m2 (the ping-pong ball will “bounce off” the billiards ball since it is

less massive).

A few special cases:

• When m2 ≈ 0, we get:

~v ′1 ≈ ~v1 and ~v ′2 ≈ ~v2 + 2~v1

For example, if a bowling ball (here object 1) collides with a ping-pong

ball (object 2) which is at rest (~v2 = ~0), the bowling ball’s velocity will

hardly be altered. However, after the collision, the ping-pong ball will



see its velocity increase to 2~v1. The latter is not intuitive, but it is

confirmed experimentally.

• When m1 ≈ 0, we get:

~v ′1 ≈ −(~v1 + 2~v2) and ~v ′2 ≈ −~v2

For example, if a ping-pong ball (now object 1) collides with a bowling

ball (object 2) which is at rest (~v2 = ~0), the ping-pong ball will bounce

off with the same speed but opposite direction, while the bowling ball

will stay at rest.

• When m1 = m2, we get:

~v ′1 = −~v2 and ~v ′2 = ~v1

Now, two billiards balls collide. For example, if ball 2 was originally

at rest (~v2 = ~0) and ball 1 collides into it, then ball 1 stops abruptly

after the collision and sets ball 2 in motion at an equal speed. This is

also perfectly illustrated with a so-called Newton cradle.
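The formulas of Example 2.14 can be checked numerically. The following is a minimal Python sketch, assuming one-dimensional motion and the sign convention of equation (2.9), in which particle 2 moves with velocity $-\vec{v}_2$ before the collision. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
def elastic_collision(m1, m2, v1, v2):
    """Final velocities (v1', v2') of a 1D completely elastic collision.

    Convention of equation (2.9): before the collision particle 1 moves
    with velocity v1 and particle 2 moves with velocity -v2.
    """
    v1_after = ((m1 - m2) * v1 - 2.0 * m2 * v2) / (m1 + m2)
    v2_after = ((m1 - m2) * v2 + 2.0 * m1 * v1) / (m1 + m2)
    return v1_after, v2_after


def momentum_and_energy(m1, m2, va, vb):
    """Total momentum and kinetic energy for actual velocities va and vb."""
    return m1 * va + m2 * vb, 0.5 * m1 * va**2 + 0.5 * m2 * vb**2


# Illustrative values (assumptions, not taken from the notes).
m1, m2, v1, v2 = 2.0, 3.0, 4.0, 1.0
v1p, v2p = elastic_collision(m1, m2, v1, v2)

# Particle 2 actually moves with velocity -v2 before the collision.
p_before, T_before = momentum_and_energy(m1, m2, v1, -v2)
p_after, T_after = momentum_and_energy(m1, m2, v1p, v2p)

print("momentum:", p_before, p_after)
print("kinetic energy:", T_before, T_after)

# Special case m1 = m2 with ball 2 at rest: ball 1 stops, ball 2 takes over.
print(elastic_collision(1.0, 1.0, 3.0, 0.0))
```

Both momentum and kinetic energy agree before and after the collision, and the equal-mass case reproduces the billiards-ball behavior described in the last bullet.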

Let us now look at collisions from the frame of reference of the center of mass (Definition 1.27). Recall that, in the absence of external forces, the center of mass will always have the same velocity (Result 1.2). From the point of view of the center of mass, the center of mass is at rest. Therefore, from the point of view of the center of mass, the system has zero total momentum.

Consider two particles with masses $m_1$ and $m_2$ moving in opposite directions toward the center of mass, with velocities $\vec{u}_1$ and $\vec{u}_2$.¹⁶ Suppose kinetic energy is conserved, so that the collision is completely elastic ($Q = 0$). After the collision, the velocities are $\vec{u}_1'$ and $\vec{u}_2'$. Since total momentum is zero before the collision, we have:

$$m_1\vec{u}_1' + m_2\vec{u}_2' = \vec{0}$$

16 We use the notation $\vec{u}$ instead of $\vec{v}$ to emphasize that these velocities are strictly from the point of view of the reference frame of the center of mass. The velocity $\vec{v}$ is from the general reference frame (e.g. the laboratory).

Because the collision is perfectly elastic, conservation of energy says:

$$\frac{1}{2}m_1|\vec{u}_1|^2 + \frac{1}{2}m_2|\vec{u}_2|^2 = \frac{1}{2}m_1|\vec{u}_1'|^2 + \frac{1}{2}m_2|\vec{u}_2'|^2$$

The solution of the system of two equations is $\vec{u}_1' = -\vec{u}_1$ and $\vec{u}_2' = -\vec{u}_2$. This is thus a remarkable property: in the center-of-mass frame, colliding particles reverse direction, but the speeds remain the same.

Outside the frame of reference of the center of mass (e.g. the laboratory’s reference frame), the center of mass has some (constant) velocity $\vec{v}_{CM}$. As we derived in Section 1.4, writing $\vec{v}_1$ and $\vec{v}_2$ for the velocities of the two particles in that frame, the velocity of the center of mass is:

$$\vec{v}_{CM} = \frac{m_1\vec{v}_1 + m_2\vec{v}_2}{m_1 + m_2}$$

Of course, from the point of view of the general reference frame:

$$\vec{u}_1 = \vec{v}_1 - \vec{v}_{CM} \quad\text{and}\quad \vec{u}_2 = \vec{v}_2 - \vec{v}_{CM} \qquad (2.11)$$

(These identities allow us to go back and forth between the two reference frames.) Suppose that the collision is perfectly inelastic, so that the particles stick together (kinetic energy is lost and transferred into heat). For illustration, say particle 2 was at rest ($\vec{v}_2 = \vec{0}$). After the collision, a single particle of mass $m_1 + m_2$ emerges with velocity $\vec{v}'$. In the absence of external forces, total momentum is conserved, i.e. $m_1\vec{v}_1 = (m_1 + m_2)\vec{v}'$, so:

$$\vec{v}' = \frac{m_1}{m_1 + m_2}\vec{v}_1 = \vec{v}_{CM}$$

In words, the new particle acquires the velocity of the center of mass. The change in kinetic energy from the lab’s reference frame, call it $Q_v$, is negative (as the collision is inelastic) and given by:

$$Q_v = T_v' - T_v = -\frac{1}{2}\,\frac{m_1 m_2}{m_1 + m_2}\,|\vec{v}_1|^2 < 0$$

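What about from the center of mass reference frame? Using the identities in (2.11), we have:

$$\vec{u}_1 = \vec{v}_1 - \vec{v}_{CM} = \vec{v}_1 - \frac{m_1}{m_1 + m_2}\vec{v}_1 = \frac{m_2}{m_1 + m_2}\vec{v}_1$$

$$\vec{u}_2 = \vec{v}_2 - \vec{v}_{CM} = \vec{0} - \frac{m_1}{m_1 + m_2}\vec{v}_1 = -\frac{m_1}{m_1 + m_2}\vec{v}_1$$

The kinetic energy in the center of mass frame before the collision is $T_u = \frac{1}{2}m_1|\vec{u}_1|^2 + \frac{1}{2}m_2|\vec{u}_2|^2$. After some algebra, we obtain:

$$T_u = \frac{1}{2}\,\frac{m_1 m_2}{m_1 + m_2}\,|\vec{v}_1|^2 = -Q_v$$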

In words, all the kinetic energy measured in the center-of-mass frame is transferred to heat when the collision occurs. The number $T_u$ is therefore the maximum kinetic energy that can ever be lost in an inelastic collision, and it is sometimes called the internal kinetic energy of the system.

Definition 2.6 (Internal Kinetic Energy) The maximum kinetic energy that can be lost in a collision, i.e. the total kinetic energy of a system measured relative to the center of mass.
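To make the change of reference frame concrete, here is a small Python sketch, assuming one-dimensional motion and taking `v1`, `v2` to be the actual lab-frame velocities. It applies the identities (2.11), reverses the velocities in the center-of-mass frame for the elastic case, and compares $T_u$ with $-Q_v$ for the perfectly inelastic case. All names and numbers are illustrative assumptions.

```python
def to_cm_frame(m1, m2, v1, v2):
    """Return (v_cm, u1, u2) for lab-frame velocities v1, v2 (identities 2.11)."""
    v_cm = (m1 * v1 + m2 * v2) / (m1 + m2)
    return v_cm, v1 - v_cm, v2 - v_cm


def kinetic(m1, m2, a, b):
    """Total kinetic energy of two particles with velocities a and b."""
    return 0.5 * m1 * a**2 + 0.5 * m2 * b**2


# Illustrative values; particle 2 is at rest, as in the text.
m1, m2, v1, v2 = 2.0, 3.0, 5.0, 0.0
v_cm, u1, u2 = to_cm_frame(m1, m2, v1, v2)

# Elastic case: velocities reverse in the center-of-mass frame,
# then we transform back to the lab frame.
v1_el, v2_el = -u1 + v_cm, -u2 + v_cm

# Perfectly inelastic case: the merged particle moves with v_cm.
T_lab_before = kinetic(m1, m2, v1, v2)
T_lab_after = 0.5 * (m1 + m2) * v_cm**2
Q_v = T_lab_after - T_lab_before

# Internal kinetic energy: kinetic energy measured in the CM frame.
T_u = kinetic(m1, m2, u1, u2)

print("elastic lab velocities:", v1_el, v2_el)
print("Q_v =", Q_v, " T_u =", T_u)
```

With these values the script gives $Q_v = -15$ and $T_u = 15$ (in the units implied by the inputs), in line with $T_u = -Q_v$.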

2.5 Impulse and Thrust

When acted upon by forces, an object is given an impulse. Impulse

is, thus, nothing but the accumulation of all these forces over a certain

interval of time. For example, when two particles collide, some or all of them

experience an impulse for the duration of the collision. In this section we

explore the connection between impulse, momentum, energy, and collisions.

Definition 2.7 (Impulse) The impulse is the accumulated force on an object $i = 1, \dots, N$ over a certain interval $[t_1, t_2]$ of time. Formally:

$$\vec{I}_i \equiv \int_{t_1}^{t_2} \vec{F}_i(\vec{r})\,dt$$

Since $\vec{F}_i = m_i\vec{a}_i$ by Newton’s Law, and $m_i\vec{a}_i = \frac{d\vec{p}_i}{dt}$ by the definition of momentum, we can write impulse as:

$$\vec{I}_i = \int_{t_1}^{t_2} \frac{d\vec{p}_i}{dt}\,dt = \int_{\vec{p}_i(t_1)}^{\vec{p}_i(t_2)} d\vec{p} = \vec{p}_i(t_2) - \vec{p}_i(t_1)$$

This may thus serve as an alternative definition: impulse is the change

in an object’s momentum. A force acting upon an object causes a change in

the object’s momentum, and this change is what we call the impulse.

Note, of course, that if the objects only exert forces on one another (i.e. no external forces act on the system), then by Newton’s Third Law total momentum is conserved, and thus

$$\sum_{i=1}^{N} \vec{I}_i = \vec{0}$$

Let’s see some examples:

Example 2.15 (Ballistic pendulum) A ballistic pendulum (Figure 2.4)

is a device used to measure the speed of a bullet. Suppose that a bullet

of mass m is fired at speed v0 against an object of mass M , which hangs

from a massless string of length L and whose location before the collision is

normalized to (x, y) = (0, 0).

The collision is completely inelastic, with the resulting object, of mass

(m + M), speeding forward at velocity ~v′ along an arc, coming to a halt at

some angle θ, and swinging back down according to the motion of the simple

pendulum (Example 1.10).

This setting can be used to calculate the initial speed v0. By momentum

conservation, we know that

m~v0 = (m+M)~v ′

Figure 2.4: The ballistic pendulum.

When the pendulum comes to a halt, $\vec{v} = \vec{0}$, the kinetic energy is zero, so the kinetic energy that the object had just after the impact has been completely converted into gravitational potential energy. Thus, by conservation of energy, we have $\frac{1}{2}(m + M)|\vec{v}'|^2 = (m + M)gh$ or, simplifying:

$$|\vec{v}'| = \sqrt{2gh}$$

where $h$ denotes the vertical displacement of the object. In particular, $h = L(1 - \cos\theta)$. As we know from Example 1.10, $h$ is typically very small, and hence hard to measure experimentally. But we can again use the Small-Angle approximation, and say that $h \approx \frac{x^2}{2L}$, with $x$ the horizontal displacement.¹⁷ Therefore:

$$|\vec{v}'|^2 \approx \frac{g x^2}{L}$$

and so the speed of the bullet is:

$$|\vec{v}_0| = \frac{m + M}{m}\, x \sqrt{\frac{g}{L}}$$

17 First, perform a Taylor expansion of $\cos\theta$ about $\theta = 0$ to obtain $\cos\theta \approx 1 - \theta^2/2$, so that $h \approx L\frac{\theta^2}{2}$. Next, we know that the horizontal displacement is $x = L\sin\theta$. Since $\sin\theta \approx \theta$ about $\theta = 0$, then $x \approx L\theta$, or $\theta \approx x/L$. In sum, $h \approx L\frac{\theta^2}{2} \approx \frac{x^2}{2L}$.

Finally, the impulse of the bullet on the massive object is $\vec{I}_M = M\vec{v}'$ (as the object’s momentum before the impact was zero), and the impulse of the object on the bullet is $\vec{I}_m = m\vec{v}' - m\vec{v}_0 = -M\vec{v}'$. The two impulses add up to $(m + M)\vec{v}' - m\vec{v}_0 = \vec{0}$, as momentum conservation holds. Indeed, since the collision is perfectly inelastic, the newly formed object carries all the momentum, while as much kinetic energy as possible (the internal kinetic energy) is lost.
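The ballistic-pendulum calculation of Example 2.15 can be sketched in Python as follows. The sketch compares the bullet speed obtained from the exact height $h = L(1 - \cos\theta)$ with the one obtained from the small-angle formula in terms of the horizontal displacement $x$. The masses, string length, and angle are made-up illustrative values, and the function names are hypothetical.

```python
import math

G_ACC = 9.8  # gravitational acceleration in m/s^2, as used in the notes


def bullet_speed_exact(m, M, L, theta):
    """Bullet speed from the exact rise h = L (1 - cos(theta))."""
    h = L * (1.0 - math.cos(theta))
    v_after = math.sqrt(2.0 * G_ACC * h)      # speed just after impact
    return (m + M) / m * v_after              # momentum conservation


def bullet_speed_small_angle(m, M, L, x):
    """Bullet speed from the small-angle formula using the displacement x."""
    return (m + M) / m * x * math.sqrt(G_ACC / L)


# Illustrative values (assumptions): 10 g bullet, 2 kg block, 1 m string.
m, M, L = 0.01, 2.0, 1.0
theta = 0.1                      # maximum swing angle in radians
x = L * math.sin(theta)          # horizontal displacement at that angle

exact = bullet_speed_exact(m, M, L, theta)
approx = bullet_speed_small_angle(m, M, L, x)
print("exact: %.2f m/s, small-angle: %.2f m/s" % (exact, approx))
```

For a swing angle this small the two estimates differ by well under one percent, which is why measuring $x$ instead of $h$ is an acceptable shortcut.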

Example 2.16 (Falling object) Consider an object of mass $m$ that drops to the floor from height $h$ with no initial speed. Using a similar reasoning to above, the speed at the time that the object hits the floor is $|\vec{v}| = \sqrt{2gh}$, as $\vec{v} = -(0, \sqrt{2gh})$. Its momentum is $\vec{p} = m\vec{v}$.

Let’s look at the impulse for different types of collisions:

• If the collision is completely elastic (no kinetic energy is lost), the ball would bounce back up with the same speed and in the opposite direction, $\vec{v}' = -\vec{v}$ (for a derivation, see Example 2.14). Thus, the new momentum would be $\vec{p}' = -\vec{p}$, and the change in momentum (or impulse) is:

$$\vec{I} = \vec{p}' - \vec{p} = -2m\vec{v}$$

• If the collision is completely inelastic (all kinetic energy is lost), the ball would lose all speed, $\vec{v}' = \vec{0}$ (the ball takes on the velocity of the center of mass, which is essentially at rest because the Earth is so massive). Thus, the new momentum would be $\vec{p}' = \vec{0}$, and the change in momentum (or impulse) is:

$$\vec{I} = \vec{p}' - \vec{p} = -m\vec{v}$$

In both cases the impulse points upward, opposite to $\vec{v}$.

Interestingly, if one knows the impulse, then by Definition 2.7 an average force can be calculated via:

$$\vec{F}_{average} \approx \frac{\vec{I}}{\Delta t}$$

where $\Delta t \equiv t_2 - t_1$ is the time during which the force is exerted, and $\vec{I} \equiv \Delta\vec{p}$ is the change in momentum. For example, for elastic collisions (e.g. a fast-moving tennis ball against a player’s racket), the collision time $\Delta t$ is often extremely short, so the average force that the objects experience can be extremely high. That is, during the brief instants during which the tennis ball changes its course, it may experience a force many times larger than its own weight.
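As a rough numerical illustration of Example 2.16 and of the average-force formula, consider the following Python sketch. It works with the vertical component only (upward positive). The mass, drop height, and contact time are assumed values chosen for illustration, not measurements, and the function names are hypothetical.

```python
import math

G_ACC = 9.8  # m/s^2


def impact_velocity(h):
    """Vertical velocity (negative = downward) after falling from height h."""
    return -math.sqrt(2.0 * G_ACC * h)


def impulse(m, v_before, v_after):
    """Impulse as the change in momentum, I = p' - p."""
    return m * v_after - m * v_before


def average_force(imp, dt):
    """Average force over a contact time dt."""
    return imp / dt


# Assumed values: a 58 g ball dropped from 1 m, 5 ms of contact with the floor.
m, h, dt = 0.058, 1.0, 0.005
v = impact_velocity(h)

I_elastic = impulse(m, v, -v)       # ball bounces back with the same speed
I_inelastic = impulse(m, v, 0.0)    # ball comes to rest

F_avg = average_force(I_elastic, dt)
weight = m * G_ACC
print("elastic impulse:   %.3f kg m/s" % I_elastic)
print("inelastic impulse: %.3f kg m/s" % I_inelastic)
print("average force / weight: %.0f" % (F_avg / weight))
```

The elastic impulse is twice the inelastic one, and with these assumed numbers the average force during contact exceeds the ball's weight by more than a factor of one hundred.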

Now suppose the impulse is exerted, more generally, by $N \ge 2$ objects of equal mass $m$ traveling at equal velocity $\vec{v}$ (though at potentially different times within the period $\Delta t$). Then, the average force exerted on the body satisfies $\vec{F}_{average} = \frac{\vec{I}}{\Delta t} = \frac{Nm}{\Delta t}\vec{v}$, where $\frac{Nm}{\Delta t}$, in kg/sec, is the total mass delivered per unit of time over the period $\Delta t$. Thus, more generally, as $\Delta t \to 0$, we have:

$$\vec{F} = \frac{dm}{dt}\vec{v} \qquad (2.12)$$

where $\frac{dm}{dt}$ is the rate of change in the mass of the colliding or expelled objects. For instance, if many equal objects are thrown onto or propelled out of another object consecutively over a short period of time, the latter object will feel an impulse. This is the basic idea behind rockets, where the force is called thrust.

Definition 2.8 (Thrust) The force, given by equation (2.12), that acts

upon a body and is typically due to the propulsion of particles (e.g. gas).

Let’s study rockets in more detail:

Example 2.17 (Rockets in outer space) Rockets experience an impulse

from their engines, which expel a huge number of gas particles over an ex-

tremely short period of time. As a result, the rocket experiences a thrust

given by equation (2.12), which points in the direction opposite to that of

the gas particles.

Consider a rocket in outer space (so that there are no external forces and momentum must be conserved). The rocket is burning chemical energy and expelling gas at some velocity $\vec{u}$, which is fixed relative to the rocket. Knowing the rate at which gas comes out (the quantity $\frac{dm}{dt}$) would then allow us to calculate the thrust force:¹⁸

$$\vec{F}_{thrust} = \frac{dm}{dt}\vec{u} \qquad (2.13)$$

18 For instance, for the Saturn rocket in the Apollo missions, $u = 2.5$ km/sec and $\frac{dm}{dt} = 15 \times 10^3$ kg/sec, and so the rocket experienced a thrust of roughly $F = 35 \times 10^6$ N (the product of these two rounded values is $37.5 \times 10^6$ N).


The thrust acts upon the rocket for a certain amount of time, called

the burn time, and as the fuel burns the rocket’s mass goes down, so the

acceleration during the burn goes up as a result.

Let us derive this change in velocity. The derivations are from the reference frame of the lab, not the rocket’s. Let $m$ be the mass of the rocket, and $\vec{v}$ be the velocity at time $t$ from the lab’s reference frame (recall $\vec{u}$ is the velocity of the gas from the rocket’s reference frame). At time $t + \Delta t$, velocity has changed to $\vec{v} + \Delta\vec{v}$, and the mass is now $m - \Delta m$. Always from the lab’s reference frame, the cloud of gas particles with total mass $\Delta m$ that has been expelled has velocity $\vec{v} - \vec{u}$. If the rocket’s speed exceeds the exhaust speed ($|\vec{v}| > |\vec{u}|$), we will see the exhaust of gas go up from our frame of reference, otherwise we will see it go down.

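Now, we can compare momenta at both moments in time:

$$\vec{p}(t) = m\vec{v}$$

$$\vec{p}(t + \Delta t) = \underbrace{(m - \Delta m)(\vec{v} + \Delta\vec{v})}_{\text{Rocket}} + \underbrace{\Delta m(\vec{v} - \vec{u})}_{\text{Expelled gas}} = m\vec{v} + m\Delta\vec{v} - \vec{u}\Delta m - \underbrace{\Delta m\Delta\vec{v}}_{\approx 0}$$

Thus, the change in total momentum (zero, by the conservation of total momentum) is:

$$\vec{0} = \vec{p}(t + \Delta t) - \vec{p}(t) = m\Delta\vec{v} - \vec{u}\Delta m \quad\Leftrightarrow\quad m\Delta\vec{v} = \vec{u}\Delta m$$

Dividing through by $\Delta t$, and letting $\Delta t \to 0$, we obtain:

$$m\vec{a} = \vec{u}\frac{dm}{dt} \qquad (2.14)$$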

We recognize on the right-hand side the thrust on the rocket (equation

(2.13)). Thus, this is nothing but Newton’s Second Law for rockets.

Example 2.18 (Rocket on Earth) In the previous example, we have seen

rockets with no external forces (e.g. in outer space). What if the rocket suf-

fers the gravitational force (e.g. as it is launched from Earth)?



Consider a standard vertical launch (i.e. ignore the x dimension so the problem is one-dimensional). Now, a force $mg$ goes opposite to the thrust force $F_{thrust} = u\frac{dm}{dt}$. In this case, equation (2.14) must be modified to:

$$ma = u\frac{dm}{dt} - mg$$

commonly known as the rocket equation. Here $\frac{dm}{dt} > 0$ is the rate at which mass is expelled, so the rocket’s own mass $m(t)$ falls at that rate. If the exhaust speed $u$ is constant and the rocket has initial velocity $v(t_1)$ and final velocity $v(t_2)$, then integration yields:

$$v(t_2) - v(t_1) = -u\ln\left(\frac{m(t_2)}{m(t_1)}\right) - g(t_2 - t_1)$$

Note that because $\frac{m(t_2)}{m(t_1)} \le 1$ due to the rocket burning fuel, the logarithm is non-positive and the first term is never negative.

For instance, in a vertical launch, where the rocket starts at rest ($v(t_1) = 0$ at $t_1 = 0$), the rocket’s speed after some time $t$ (the burn time) is:

$$v(t) = -u\ln\left(\frac{m(t)}{m_0}\right) - gt$$

where $m_0$ denotes the initial mass of the rocket before the exhaustion of gas begins.¹⁹ For the rocket to lift off, then, the thrust must be strong enough in the sense $-u\ln\left(\frac{m(t)}{m_0}\right) > gt$.
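The integrated rocket equation can be cross-checked by stepping $ma = u\frac{dm}{dt} - mg$ forward in time. The Python sketch below uses a simple forward-Euler scheme, which is an assumption of this illustration rather than anything prescribed in the notes. The exhaust speed and burn rate follow the rounded Saturn figures of footnote 18, while the initial mass and burn time are assumed round numbers. Function names are hypothetical.

```python
import math


def closed_form_speed(u, m0, m_t, t, g=9.8):
    """Speed after burn time t for a vertical launch from rest."""
    return -u * math.log(m_t / m0) - g * t


def integrate_rocket(u, m0, burn_rate, t_burn, g=9.8, steps=20_000):
    """Forward-Euler integration of m a = u dm/dt - m g.

    burn_rate is the (positive) mass expelled per second.
    Returns the final speed and the final rocket mass.
    """
    dt = t_burn / steps
    m, v = m0, 0.0
    for _ in range(steps):
        v += (u * burn_rate / m - g) * dt   # acceleration times time step
        m -= burn_rate * dt                 # rocket mass decreases
    return v, m


# u and burn rate as in footnote 18; m0 and t_burn are assumed values.
u, m0, rate, t_burn = 2500.0, 2.8e6, 15e3, 150.0
v_num, m_end = integrate_rocket(u, m0, rate, t_burn)
v_formula = closed_form_speed(u, m0, m_end, t_burn)

print("numerical speed:   %.1f m/s" % v_num)
print("closed-form speed: %.1f m/s" % v_formula)
```

The two speeds agree to within a fraction of a meter per second, and the small gap shrinks further as `steps` is increased.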

This has shown that the change in a rocket’s velocity for a given amount

of fuel and a given burn time is fixed. However, as we will now show, the

change in kinetic energy is not fixed. In particular, two rockets that burn the

same amount of fuel over the same amount of burn time (thereby achieving

the same increase in velocity) do not experience the same change in kinetic

energy if their initial velocities differ. For instance, the change in kinetic

energy at launch is different than that when the rocket is already in the air.

To see this, fix the burn time $t$ and the mass ratio $\frac{m(t)}{m_0}$, so that $\Delta v \equiv v - v_0$ is fixed (where $v_0$ is the initial velocity and $v$ is the velocity at time $t$). Consider two scenarios: a rocket at rest, and one that is already in the air.

• At launch ($v_0 = 0$), the increase in kinetic energy is $\Delta T = \frac{1}{2}m(\Delta v)^2$.

• Once in the air ($v_0 > 0$), the increase in kinetic energy is $\Delta T = \frac{1}{2}m\left[(v_0 + \Delta v)^2 - v_0^2\right] = \frac{1}{2}m\left[(\Delta v)^2 + 2v_0\Delta v\right]$.

19 As a special case, note that if the rocket’s mass did not change, $v(t) = v_0 - gt$, exactly what we derived for falling objects (e.g. Example 1.13).

Thus, the change in kinetic energy depends on the initial velocity of the rocket. The increase in kinetic energy is higher if the rocket was in motion to begin with.
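A two-line computation makes the point tangible. The Python sketch below treats the mass $m$ as a fixed number, as the comparison in the text does, and uses illustrative values. The function name is hypothetical.

```python
def kinetic_energy_gain(m, v0, dv):
    """Increase in kinetic energy when the speed goes from v0 to v0 + dv."""
    return 0.5 * m * ((v0 + dv) ** 2 - v0 ** 2)


# Illustrative values: the same velocity increase from rest and from 500 m/s.
m, dv = 1000.0, 100.0
gain_at_launch = kinetic_energy_gain(m, 0.0, dv)
gain_in_flight = kinetic_energy_gain(m, 500.0, dv)

print("gain at launch:", gain_at_launch, "J")
print("gain in flight:", gain_in_flight, "J")
```

The same $\Delta v$ buys eleven times more kinetic energy at 500 m/s than at rest in this example, because of the cross term $m v_0 \Delta v$.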

2.6 Newton’s Universal Law of Gravitation

So far, we have assumed that a force of gravity mg acts on each body

of mass m, where g is the acceleration that the body feels in free fall. The

potential energy of this fall, we have argued, is mgh (recall Example 2.5),

where h is the height from which the object is dropped.

In considering this, we have made a simplification: the body and the

surface of the Earth are separated by a relatively small distance. But what

are the gravitational forces acting between two distant bodies, say the Earth

and the Sun?

Consider two bodies, of masses $m_1$ and $m_2$, which are at a distance $r > 0$ apart from each other (that is, $r = |\vec{r}_1 - \vec{r}_2|$). These two bodies are attracted by the force of gravity. Let $F_{m_i m_j}$ be the force exerted on body $i$ by body $j \ne i$. By Newton’s Third Law, $F_{m_1 m_2} = F_{m_2 m_1} \equiv F$ (i.e. same magnitude but opposite in direction).

Newton postulated a value for F . His statement has since become one

of the most famous formulas in all of physics:

Principle 2.2 (Newton’s Universal Law of Gravitation) The gravitational force between two bodies of masses $m_1$ and $m_2$, separated by distance $r$, is:

$$F = G\frac{m_1 m_2}{r^2}$$

where $G = 6.674 \times 10^{-11}$ N m²/kg² is called the gravitational constant.

In words, the attractive force between two bodies is directly proportional to the product of their masses, and inversely proportional to the square of the distance between them. This makes the Universal Law of Gravitation an inverse-square law.

The gravitational constant is an extremely small number, showing that gravitation is an extremely weak force. For example, two bodies of 1 kg each that are separated by 1 m are attracted by a force of just $6.674 \times 10^{-11}$ Newtons, an extremely small value.

By Newton’s Second Law, $G\frac{m_1 m_2}{r^2} = m_1 a_1 = m_2 a_2$, where $a_i$ is the gravitational acceleration experienced by body $i = 1, 2$. Thus:

$$a_i = G\frac{m_{-i}}{r^2}$$

for each $i$, where $m_{-i}$ denotes the mass of the other body. Thus, the gravitational acceleration of a body is inversely proportional to the squared distance to the other body. If the objects are ten times farther apart, the gravitational acceleration falls by a factor of 100. This is reminiscent of another inverse law: the centripetal acceleration experienced by a body in uniform circular motion (Example 1.2), such as those (approximately) described by the orbits of the planets.²⁰

Remark 2.1 (Deriving g) Throughout, we have called $g$ the gravitational acceleration of bodies in free fall toward the Earth (Definition 1.13). If the body has mass $m$, then the gravitational force is $mg$ (by Newton’s Second Law). Experimentally, it has been found that $g = 9.8$ m/s².

Interestingly, this is nothing but a special case of Newton’s Universal Law of Gravitation. In particular, let $m_{Earth}$ be the mass of the Earth. Assuming the body is close enough to the surface of the Earth, the distance $r$ is approximately equal to the radius of the Earth, call it $r_{Earth}$. Thus, $F = G\frac{m_{Earth}\,m}{r_{Earth}^2} \approx mg$, and therefore:

$$g \approx \frac{m_{Earth}\,G}{r_{Earth}^2} \qquad (2.15)$$

20 This is not strictly true, for planets describe elliptical (not circular) orbits around the Sun. However, the connection between the two should be clear.

Plugging in values for the Earth’s mass ($m_{Earth} = 6 \times 10^{24}$ kg), the Earth’s radius ($r_{Earth} = 6.4 \times 10^3$ km), and the gravitational constant ($G = 6.674 \times 10^{-11}$), we obtain the familiar $g \approx 9.8$ m/s².
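The arithmetic behind equation (2.15) is easy to reproduce. The short Python sketch below uses the rounded values quoted in the remark, with the radius converted to meters. The function names are hypothetical.

```python
G = 6.674e-11        # gravitational constant, N m^2 / kg^2
M_EARTH = 6e24       # mass of the Earth in kg (rounded, as in the text)
R_EARTH = 6.4e6      # radius of the Earth in m (6.4 x 10^3 km)


def gravitational_force(m1, m2, r):
    """Magnitude of the attraction between two masses at distance r."""
    return G * m1 * m2 / r**2


def surface_gravity(mass, radius):
    """Gravitational acceleration at the surface of a spherical body."""
    return G * mass / radius**2


g_earth = surface_gravity(M_EARTH, R_EARTH)
print("g is approximately %.2f m/s^2" % g_earth)

# Two 1 kg bodies 1 m apart attract each other with a force equal to G newtons.
print("force between 1 kg bodies at 1 m:", gravitational_force(1, 1, 1), "N")
```

With these rounded inputs the result is about 9.78 m/s², which rounds to the familiar 9.8.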

Example 2.19 (Example 2.5, cont’d) In Example 2.5, we argued that

the gravitational potential energy is Vg = mgh when the object is at height h

from the Earth’s surface. Let us now derive the gravitational potential energy

for the general case of distant bodies, and argue that it is well approximated

by mgh when the bodies are sufficiently close to one another.

Consider two bodies, with masses $m$ (e.g. an object in space) and $M$ (e.g. the Earth). Body $m$ moves from some point A to some point B, where A and B are at distance $R_A$ and $R_B$ from $M$, respectively. Along the way from A to B, body $m$ feels the gravitational attraction of $M$. Let $r$ denote the distance between the bodies at some point along the path. The force acting on $m$ at that time has, thus, magnitude $F = \frac{GmM}{r^2}$ and points toward $M$.

Since gravity is a conservative force (Example 2.3), then the gravitational

potential energy is independent of the path taken by m, so we may assume,

without loss of generality and for simplicity, that the path is a straight line.

Then, the work done by the gravitational force from A to B is obtained by integrating its radial component, $-\frac{GmM}{r^2}$ (the minus sign because the force points toward $M$, i.e. toward decreasing $r$):²¹

$$W_{AB} = -\int_{R_A}^{R_B} \frac{GmM}{r^2}\,dr = \frac{GmM}{r}\bigg|_{R_A}^{R_B} = GmM\,\frac{1 - R_B/R_A}{R_B}$$

Using the Work-Energy Theorem, $W_{AB} = T_B - T_A$, so this is also the change in the kinetic energy of $m$. As for potential energy, we can invoke the Conservation of Mechanical Energy to say:²²

$$V_B - V_A = -W_{AB} = -GmM\,\frac{1 - R_B/R_A}{R_B}$$

21 Path-independence allows us to use the actual distances $R_A$ and $R_B$ as the limits of integration in $W$ (recall gravity is a conservative force). If work was path-dependent, we would need the more general specification in equation (2.3).

22 By conservation of mechanical energy, $T_A + V_A = T_B + V_B$. Thus, $V_A - V_B = T_B - T_A = W_{AB}$, the second equality by the Work-Energy Theorem.

For instance, if point A (the starting point) is infinitely far ($R_A = +\infty$) and the body starts from rest there ($T_A = 0$), then it arrives at point B with kinetic energy $T_B = W_{\infty B}$. In this case, work is given by

$$W_{\infty B} = \frac{GmM}{R_B}$$

and the gravitational potential energy is, therefore:

$$V_B - V_\infty = V_B = -\frac{GmM}{R_B}$$

(where we have set $V_\infty = 0$ as A is infinitely far).²³ Another way of deriving this same result is by just using the Principle of Potential Energy (Principle 2.1). Recalling that $F = -\frac{\partial V}{\partial R}$ (recall that $R$ is the distance between the objects), and using the radial force $F = -\frac{GmM}{R^2}$ from Newton’s Law of Gravitation, we have $\frac{\partial V}{\partial R} = \frac{GmM}{R^2}$, i.e. $V = GmM\int \frac{1}{R^2}\,dR = -\frac{GmM}{R}$.

23 As a result of setting $V_\infty = 0$, all potential energies are negative. The minus sign here is just a result of setting $\infty$ as our reference point. As usual, we are free to choose this point, so the fact that potential energies are negative is meaningless per se. We only care about potential energies relative to the reference point.

Generally, suppose A and B are separated by a distance $h$, i.e. $R_B = R_A + h$ (where $h$ may be positive or negative). Then:

$$V_B - V_A = -W_{AB} = GmM\,\frac{h}{R_B(R_B - h)}$$

This means that:

• If $h < 0$ (i.e. A is further from Earth than B, so the body is moving toward the Earth), then $W_{AB} > 0$ and $V_B - V_A < 0$. In words, as the body approaches the Earth, potential energy decreases and (by mechanical energy conservation) kinetic energy increases.

• If the body moves away from the Earth ($h > 0$), the opposite is true: the work of the gravitational force is negative, so kinetic energy is turned into potential energy as the body drifts away.

Finally, let us argue that, for objects on Earth, the gravitational potential energy is well approximated by $mgh$ (as found in Example 2.5), where $h$ is the distance between the body and the surface of the Earth. To show this, consider point A to be on the Earth’s surface, so that $R_A = R_{Earth}$ (the radius of the Earth), and let $R_B = R_{Earth} + h$, where $h$ is tiny compared to $R_{Earth}$ (i.e. the body is just above the surface).

Then, the potential energy difference between points A and B is:

$$V_B - V_A = -W_{AB} = GmM\,\frac{h}{(R_{Earth} + h)R_{Earth}} \approx mgh\left(\frac{1}{1 + h/R_{Earth}}\right)$$

where the last step uses $g \approx \frac{MG}{R_{Earth}^2}$ by equation (2.15). Then, using a simple Taylor expansion (Result 0.5), we see that $h\left(\frac{1}{1 + h/R_{Earth}}\right) \approx h$ around $\frac{h}{R_{Earth}} = 0$. Therefore:

$$V_B - V_A = -W_{AB} \approx mgh$$

which is what we wanted to show. □
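How good is the $mgh$ approximation in practice? The Python sketch below compares it with the exact potential energy difference, taking $V(r) = -GmM/r$ with $V_\infty = 0$ and the rounded Earth values used in these notes. The function names and test heights are illustrative assumptions.

```python
G = 6.674e-11        # gravitational constant, N m^2 / kg^2
M_EARTH = 6e24       # kg
R_EARTH = 6.4e6      # m


def potential(m, r):
    """Gravitational potential energy with the reference point at infinity."""
    return -G * m * M_EARTH / r


def delta_v_exact(m, h):
    """Exact potential energy gained when rising h above the surface."""
    return potential(m, R_EARTH + h) - potential(m, R_EARTH)


def delta_v_mgh(m, h):
    """Near-surface approximation m g h, with g from equation (2.15)."""
    g = G * M_EARTH / R_EARTH**2
    return m * g * h


def relative_error(m, h):
    """Relative error of m g h with respect to the exact difference."""
    exact = delta_v_exact(m, h)
    return abs(delta_v_mgh(m, h) - exact) / exact


for h in (100.0, 1.0e4, 1.0e6):
    print("h = %9.0f m  relative error = %.2e" % (h, relative_error(1.0, h)))
```

The relative error equals $h/R_{Earth}$: negligible for everyday heights, but around 16% at an altitude of 1000 km.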

Having the Newtonian Laws of Gravitation also allows us to understand

escape velocities.

Definition 2.9 (Escape velocity) The lowest velocity that a body must

have in order to escape the gravitational attraction of another body.

To derive the escape velocity formula, consider a (relatively small) object of mass $m$ standing on Earth (of mass $M$), so that the distance between them is $R$, the Earth’s radius. For simplicity, suppose the Earth has no atmosphere. If at time $t = 0$ the object is given a velocity $v_{esc} \equiv v(0)$, then the total energy at that moment is:

$$E = T + V = \frac{1}{2}mv_{esc}^2 - \frac{GmM}{R}$$

Because gravity is a conservative force, $\dot{E} = 0$, i.e. total energy is conserved. Where $r$ is the distance at a certain time $t_r > 0$ from the Earth’s center, $\dot{E} = 0$ implies

$$\frac{1}{2}mv_{esc}^2 - \frac{GmM}{R} = \frac{1}{2}mv(t_r)^2 - \frac{GmM}{r}$$


Suppose the initial velocity $v_{esc}$ is just enough for the body to escape the Earth’s gravitational pull. Since this body escapes, and because we assume no other objects in space, it will eventually reach infinity ($r \to +\infty$). At infinity, potential energy is zero. Further, because $v_{esc}$ is the minimum speed at which this can happen, the body must abandon the gravitational pull of the Earth at nearly zero “residual” velocity, so $v(t_\infty) = 0$. Imposing these above, we get $\frac{1}{2}mv_{esc}^2 - \frac{GmM}{R} = 0$. Solving for $v_{esc}$:

$$v_{esc} = \sqrt{\frac{2GM}{R}}$$

Thus, the escape velocity (e.g. from a planet) is higher for more massive and/or more compact (smaller-radius) planets. For Earth, $M = 6 \times 10^{24}$ kg and $R = 6.4 \times 10^3$ km, so $v_{esc} = 11.2$ km/sec.
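The escape velocity formula is straightforward to evaluate. Here is a minimal Python sketch using the rounded Earth values from the text, with the radius converted to meters. The function name is hypothetical.

```python
import math

G = 6.674e-11   # gravitational constant, N m^2 / kg^2


def escape_velocity(mass, radius):
    """Minimum launch speed needed to escape a body of given mass and radius."""
    return math.sqrt(2.0 * G * mass / radius)


v_esc_earth = escape_velocity(6e24, 6.4e6)
print("Earth: %.1f km/sec" % (v_esc_earth / 1000.0))

# Same mass packed into half the radius: escape velocity grows by sqrt(2).
ratio = escape_velocity(6e24, 3.2e6) / v_esc_earth
print("ratio for half the radius: %.3f" % ratio)
```

The second print statement illustrates the dependence on $R$: halving the radius at fixed mass raises the escape velocity by a factor of $\sqrt{2}$.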

Example 2.20 (Gravity and circular orbits) Consider a satellite of mass $m$ orbiting in circular motion around a body of mass $M \gg m$. Let $R > 0$ be the radius of the circular orbit. As discussed before in these notes, object $m$ has some tangential velocity $v_{orb}$, which we now call the orbital velocity, and a centripetal acceleration that is perpendicular to it.

If the object $m$ is in orbit, then the centripetal force, with magnitude $\frac{mv_{orb}^2}{R}$, must all be due to the gravitational attraction between the two objects. Thus, $\frac{mv_{orb}^2}{R} = \frac{GmM}{R^2}$, or:

$$v_{orb} = \sqrt{\frac{GM}{R}}$$

Note $v_{esc} = \sqrt{2}\,v_{orb}$, so a body in orbit must increase its velocity by a factor of $\sqrt{2}$ in order to escape the orbit. The period of the orbit (the time it takes to complete one full revolution) is $T = 2\pi\frac{R}{v_{orb}} = \frac{2\pi R^{3/2}}{\sqrt{GM}}$.²⁴ Note the period is independent of $m$: a feather and a space shuttle at the same orbital radius would both complete a revolution around the Earth in the same time.

24 For instance, for a satellite about 400 km above the Earth’s surface, $R \approx 6{,}800$ km, so one full revolution will be completed in about $T \approx 90$ minutes, at a speed of $v_{orb} \approx 8$ km/sec. For the Moon, the period is about $T \approx 27.5$ days, and the orbital speed is $v_{orb} \approx 1$ km/sec (slower than satellites, as $R$ is larger). For the Earth’s motion around the Sun (mass $M = 2 \times 10^{30}$ kg), at distance $R \approx 150 \times 10^6$ km, we find a period of $T = 365.5$ days, i.e. one year. The orbital speed in that case is $v_{orb} \approx 30$ km/sec.

Interestingly, the total energy at the orbital velocity is (writing $r$ for the orbital radius):

$$E = T + V = \frac{1}{2}mv_{orb}^2 - \frac{GmM}{r} = \frac{1}{2}\frac{GmM}{r} - \frac{GmM}{r} = -\frac{1}{2}\frac{GmM}{r}$$

That is, $E = \frac{1}{2}V = -T$. This is a remarkable property of objects in orbit (when the orbits are circular): the total energy of objects in orbit is negative. For the Earth and the Sun, $E = -2.7 \times 10^{33}$ Joules, an extremely large negative number.
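The orbital quantities of Example 2.20 can be reproduced with a few lines of Python. The sketch below uses the rounded masses and distances quoted in these notes, converted to SI units. The function names are hypothetical.

```python
import math

G = 6.674e-11   # gravitational constant, N m^2 / kg^2


def orbital_velocity(M, R):
    """Speed of a circular orbit of radius R around a body of mass M."""
    return math.sqrt(G * M / R)


def orbital_period(M, R):
    """Time for one full revolution on a circular orbit."""
    return 2.0 * math.pi * R**1.5 / math.sqrt(G * M)


def orbital_energy(m, M, R):
    """Total mechanical energy T + V on a circular orbit."""
    kinetic = 0.5 * m * orbital_velocity(M, R) ** 2
    potential = -G * m * M / R
    return kinetic + potential


M_EARTH, M_SUN = 6e24, 2e30

# Satellite about 400 km above the surface (R of about 6,800 km).
T_sat_min = orbital_period(M_EARTH, 6.8e6) / 60.0
v_sat = orbital_velocity(M_EARTH, 6.8e6)

# Earth around the Sun (R of about 150 million km).
T_earth_days = orbital_period(M_SUN, 1.5e11) / 86400.0
E_earth = orbital_energy(M_EARTH, M_SUN, 1.5e11)

print("satellite: %.0f min at %.1f km/sec" % (T_sat_min, v_sat / 1000.0))
print("Earth: %.1f days, E = %.2e J" % (T_earth_days, E_earth))
```

The outputs are close to the figures quoted in footnote 24 and in the text: a period of roughly 90 minutes for the low satellite, about one year for the Earth, and a total energy near $-2.7 \times 10^{33}$ J.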


Chapter 3

Rotation

So far, we have explored the laws of linear motion of bodies. In this

chapter, we will explore the dynamics of motion in rotational frameworks.

3.1 Moment of Inertia

In many examples above, we have explored motion along orbits when the speed is constant (i.e. when the object has no tangential acceleration). In this section, we generalize rotational motion to potentially non-uniform motion, show how to work in polar coordinates (Definition 1.11), and introduce a new concept: the moment of inertia.

Let $R$ be the radius of the orbit, $\theta$ the angle at a given time, $\omega \equiv \frac{d\theta}{dt}$ the angular velocity (Definition 1.5), which is now possibly non-constant, and $\vec{v}$ be the velocity of the particle, also possibly non-constant. In Example 1.2, we derived that $|\vec{v}| = \omega R = \dot{\theta}R$. The centripetal acceleration is $|\vec{a}_{cent}| = R\omega^2$.

Because the particle is now being accelerated (for example, it rotates along a disk that is itself rotating due to a torque¹), there is an additional acceleration, called the tangential acceleration, in the direction of the circumference. The tangential acceleration is given by:

$$|\vec{a}_{tang}| = \frac{d}{dt}|\vec{v}| = \dot{\omega}R = \ddot{\theta}R = \alpha R$$

where $\alpha \equiv \ddot{\theta}$ is the so-called angular acceleration. We have then introduced two more concepts:

Definition 3.1 (Tangential acceleration) The acceleration along the rotational perimeter of a particle in rotational motion.

Definition 3.2 (Angular acceleration) The acceleration of the angle described by a particle in rotational motion.

Of course, in the special case of uniform motion, $\omega$ is constant and the tangential and angular acceleration are zero.

1 A torque is a rotational force, i.e. the rotational equivalent of a linear force. A more general definition will be given in Definition 3.5.

Now, we can apply the usual equations of motion in polar coordinates. For instance, if we consider that the angular acceleration is constant, we can use the equations from Example 1.4 and translate them into polar coordinates, so that:

$$\theta = \theta_0 + \omega_0 t + \frac{1}{2}\alpha t^2 \quad\text{and}\quad \omega = \omega_0 + \alpha t$$

Consider now an object $i$ of mass $m_i$ that is standing on the rotating disk. Let $r_i$ be the distance of the particle relative to the disk’s center, point C, which acts here as the axis of rotation. Importantly, we assume that this axis is perpendicular to the disk. Then, the kinetic energy for object $i$ is:

$$T_i = \frac{1}{2}m_i|\vec{v}_i|^2 = \frac{1}{2}m_i\omega^2 r_i^2$$

The kinetic energy of the entire disk is, therefore, $T = \frac{1}{2}\omega^2\sum_{i=1}^{N} m_i r_i^2$, for all the $N$ elements that compose the disk. More generally, when there is a distribution of particles composing the body, we may instead integrate over the entire mass, and say

$$T = \frac{1}{2}\omega^2\underbrace{\int r^2\,dm}_{\equiv I}$$

where $r \equiv r(m)$ is the distance to the (perpendicular) axis of rotation of the particle of mass $m$. Here, $T$ is rotational kinetic energy, and the object $I \equiv \int r^2\,dm$ is called the moment of inertia of the system.

Note the similarity with linear motion. In linear motion, $T = \frac{1}{2}m|\vec{v}|^2$. In rotational motion, $T = \frac{1}{2}I\omega^2$. In both cases, to obtain the kinetic energy we multiply one half of the squared velocity (linear or angular) by $m$, if motion is linear, or $I$, if motion is rotational. Therefore, just as mass determines the force needed for a desired linear acceleration, the moment of inertia determines the torque needed for a desired angular acceleration. Hence, our definition:

Definition 3.3 (Moment of inertia) In rotational motion, the moment of inertia measures the torque needed per unit of angular acceleration, and is given by:

$$I \equiv \int r^2\,dm \qquad (3.1)$$

The moment of inertia depends on the properties of the system under consideration. For some objects, especially symmetric and solid ones, the integral in equation (3.1) can be solved analytically. For example:

• For a disk rotating about an axis through its center that is perpendicular to it, the moment of inertia is $I = \frac{1}{2}mR^2$, where $R$ is the radius of the disk and $m$ is its mass.

• For a solid sphere rotating about an axis through its center, the moment of inertia is $I = \frac{2}{5}mR^2$.

• For a thin rod of length $\ell$ and mass $m$ rotating about a perpendicular axis through its center, the moment of inertia is $I = \frac{1}{12}m\ell^2$.
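When the integral in equation (3.1) is not available in closed form, it can be approximated numerically. As a sanity check, the Python sketch below applies a plain midpoint rule (an assumption of this illustration) to the uniform disk and the thin rod, where the answers are known. Uniform density is assumed throughout, and the function names and numerical values are hypothetical.

```python
import math


def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    step = (b - a) / n
    return sum(f(a + (i + 0.5) * step) for i in range(n)) * step


def disk_inertia(m, R):
    """Uniform disk about the perpendicular axis through its center.

    A ring of radius r and width dr has mass dm = sigma * 2 pi r dr.
    """
    sigma = m / (math.pi * R**2)              # mass per unit area
    return midpoint_integral(lambda r: r**2 * sigma * 2.0 * math.pi * r, 0.0, R)


def rod_inertia(m, length):
    """Uniform thin rod about a perpendicular axis through its center.

    A slice of width dx at position x has mass dm = (m / length) dx.
    """
    lam = m / length                          # mass per unit length
    return midpoint_integral(lambda x: x**2 * lam, -length / 2.0, length / 2.0)


m, R, length = 2.0, 0.5, 1.2   # illustrative values
print("disk:", disk_inertia(m, R), "vs", 0.5 * m * R**2)
print("rod: ", rod_inertia(m, length), "vs", m * length**2 / 12.0)
```

Both numerical values match the closed-form expressions in the list above to many decimal places.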

The moment of inertia depends on the axis about which the rotation is taken. Above we have considered that the axis goes through the center of mass. However, this need not be the case. What is indispensable is for this axis to be perpendicular to the plane of rotation. So long as this is true, a convenient theorem allows us to calculate the moment of inertia about any other axis:



Result 3.1 (Parallel axis theorem) Let ICM be the moment of inertia

of a body of mass m that is rotating about an axis going through the center

of mass. Then, if the body is made to rotate about a different axis which is

parallel to the first one and separated by distance d from it, the moment of

inertia is given by:

I = ICM +md2

Proof. The proof is easiest in Cartesian coordinates, (x, y), on the plane perpendicular to the two axes. Suppose, without loss of generality, that the separation d between the two axes lies along the x-axis, and that the center of mass is at the origin. The moment of inertia relative to the axis going through the origin is $I_{CM} = \int (x^2 + y^2) \, dm$. The moment of inertia relative to the alternative axis is:

$$I = \int \left[ (x+d)^2 + y^2 \right] dm = \int (x^2 + y^2) \, dm + d^2 \int dm + 2d \int x \, dm = I_{CM} + m d^2$$

as $\int x \, dm = 0$ because the center of mass lies at the origin. □

Notice that, by this theorem, the moment of inertia is always lowest

when the axis goes through the center of mass.
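To illustrate the theorem, the following Python sketch compares the parallel axis formula with a direct midpoint-sum evaluation of $\int (x+d)^2 \, dm$ for a thin uniform rod. The helper names are hypothetical and the uniform density is an assumption made for the example.

```python
def parallel_axis(i_cm, m, d):
    """Moment of inertia about an axis parallel to the center-of-mass axis, at distance d."""
    return i_cm + m * d ** 2

def rod_inertia_offset_numeric(m, length, d, n=10_000):
    """Midpoint-sum estimate of ∫ (x + d)^2 dm for a thin uniform rod centered at x = 0."""
    dx = length / n
    density = m / length  # mass per unit length (uniform rod assumed)
    total = 0.0
    for i in range(n):
        x = -length / 2 + (i + 0.5) * dx
        total += (x + d) ** 2 * density * dx
    return total

if __name__ == "__main__":
    m, length = 2.0, 3.0
    i_cm = m * length ** 2 / 12
    d = length / 2  # axis through one end of the rod
    print(parallel_axis(i_cm, m, d))                 # theorem
    print(rod_inertia_offset_numeric(m, length, d))  # direct integration
```

For an axis through one end of the rod (d equal to half the length), both numbers reproduce the familiar value $\frac{1}{3} m \ell^2$.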

A second result is useful for rigid objects that lie entirely within a plane (e.g. a sheet of paper):

Result 3.2 (Perpendicular axis theorem) Define perpendicular axes x,

y, and z, all of which meet at the origin O. Suppose that a body lies entirely

on the xy plane, and the z axis is perpendicular to the plane of the body. Let

Ix, Iy, and Iz be the moments of inertia about axis x, y, and z, respectively.

Then:

$$I_z = I_x + I_y$$

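Proof. In Cartesian coordinates, the moment of inertia of the body about the z axis is:

$$I_z = \int (x^2 + y^2) \, dm = \int x^2 \, dm + \int y^2 \, dm$$

Since z = 0 everywhere on the body, the moments of inertia about the x and y axes are $I_x = \int (y^2 + z^2) \, dm = \int y^2 \, dm$ and $I_y = \int (x^2 + z^2) \, dm = \int x^2 \, dm$, so the right-hand side equals $I_x + I_y$. □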

3.2 Angular Momentum and Torques

Angular momentum is the rotational equivalent of momentum in linear motion. As we shall see, it too is conserved in isolated systems, that is, in systems that are subject to no net external torque.

Let us first introduce it formally. Consider an object of mass m, velocity

~v, and momentum ~p = m~v. Fix an arbitrary point O in space, and let ~r be

the position of the object relative to O. Then:

Definition 3.4 (Angular momentum) The angular momentum relative

to point O is defined as the vector:

~L ≡ ~r × ~p = (~r × ~v)m

Therefore, the angular momentum is the vector resulting from the cross

product of the position vector and the momentum vector. Using the prop-

erties of the cross product (see equation (0.10)), the magnitude L ≡ |~L| of

the angular momentum is:

$$L = m v \underbrace{r \sin\theta}_{\equiv r_\perp}$$

where θ = ∠~r~v is the angle between the position and the velocity vectors,

r ≡ |~r| is the distance between O and the body, and v ≡ |~v| is the speed.

Here, the number $r_\perp \equiv r \sin\theta$ corresponds to the perpendicular distance between the point of reference, O, and the line along which the object is moving (see Figure 3.1). The direction of the angular momentum vector is perpendicular to the plane that contains $\vec{r}$ and $\vec{v}$, which here is the (x, y) plane.



Figure 3.1: The angular momentum.

Note that angular momentum is not an intrinsic property of a moving object. For instance, if we chose O so that $\vec{r}$ and $\vec{v}$ are parallel (that is, so that O lies on the line of motion), then $\vec{L} = \vec{0}$. But if O is chosen so that $\vec{r}$ and $\vec{v}$ are not parallel (as in Figure 3.1), then $\vec{L} \neq \vec{0}$. Henceforth, we will call this point the reference point.

Thus, while the momentum of an object is the same regardless of the

reference point, the angular momentum is different for different reference

points.
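The dependence on the reference point is easy to see numerically. The Python sketch below evaluates $\vec{L} = \vec{r} \times \vec{p}$ for the same moving particle about two different reference points; the function names and the sample numbers are illustrative assumptions.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angular_momentum(position, velocity, mass, reference=(0.0, 0.0, 0.0)):
    """L = r x p, with r measured from the chosen reference point."""
    r = tuple(p - o for p, o in zip(position, reference))
    p = tuple(mass * v for v in velocity)
    return cross(r, p)

if __name__ == "__main__":
    position = (1.0, 0.0, 0.0)  # particle on the x-axis
    velocity = (0.0, 2.0, 0.0)  # moving in the +y direction
    mass = 3.0
    # Reference point at the origin: r is perpendicular to v.
    print(angular_momentum(position, velocity, mass))
    # Reference point on the line of motion: r is parallel to v.
    print(angular_momentum(position, velocity, mass, reference=(1.0, -5.0, 0.0)))
```

The linear momentum is identical in both calls, yet the angular momentum is non-zero about the origin and vanishes about a point on the particle's line of motion.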

Example 3.1 (Projectiles revisited) Consider again the parabolic motion of Example 1.8. At time t = 0, velocity is $\vec{v}_0$ and the position vector is $\vec{r}_0 = \vec{0}$ if we choose the origin (x, y) = (0, 0) as our point of reference. Therefore, the object has no angular momentum: $\vec{L} = \vec{0}$. At some time t > 0, however, when $\vec{r} \neq \vec{0}$, the angular momentum is clearly not the zero vector. In this example, this is because the velocity vector changes along the trajectory of the body, so it does not stay parallel to the position vector.

Example 3.2 (Circular motion) Consider the Earth (mass m) orbiting

around the Sun in a perfect circular orbit. Suppose our point of reference

is the Sun (point O), and let ~r be the Earth’s position vector relative to the

Sun at some point in time. The Earth has a certain tangential velocity ~v.



Then, the angular momentum of the Earth is ~L = m~r × ~v. However, for

circular motion, ~r ⊥ ~v (recall results from Example 1.2), so sin θ = 1 for

θ ≡ ∠~r~v, and so the magnitude of the angular momentum is simply:

L = mrv

where $r \equiv |\vec{r}|$ and $v \equiv |\vec{v}|$. Therefore, although the velocity vector of the Earth is changing direction all the time, its angular momentum relative to the Sun remains constant in magnitude (because the speed and the radius are constant) and in direction (perpendicular to the plane of the orbit). In short, angular momentum is conserved.

Again, this is true only relative to the Sun. Clearly, if we took a point O

that is right on the Earth’s path as our point of reference, then the magnitude

of the angular momentum would depend on the Earth’s position along this

path. It would be zero as the Earth goes through O, and non-zero otherwise.

To see this point more generally, consider taking the time derivative of

the angular momentum:

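$$\frac{d\vec{L}}{dt} = \frac{d\vec{r}}{dt} \times \vec{p} + \vec{r} \times \frac{d\vec{p}}{dt} = \vec{v} \times \vec{p} + \vec{r} \times \vec{F} = \vec{r} \times \vec{F} \qquad (3.2)$$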

where we have used $\vec{v} \times \vec{p} = \vec{0}$, as $\vec{v}$ and $\vec{p}$ are always parallel, and $\frac{d\vec{p}}{dt} = \vec{F}$ by Newton’s Second Law. The right-hand side of this equation is what we call the torque:

Definition 3.5 (Torque) The torque vector is defined by:

~τ ≡ ~r × ~F

that is, the cross product of the position vector of an object (relative to

some point O), and the force applied on the object.

Thus, in equation (3.2) we have found:

Result 3.3 (Torque and Angular Momentum) The torque equals the rate of change in angular momentum, $\vec{\tau} = \frac{d\vec{L}}{dt}$.
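This relation can be checked numerically on the projectile of Example 3.1. The Python sketch below takes the launch point as the reference point, computes the z-component of $\vec{L} = \vec{r} \times \vec{p}$ along the trajectory, and compares its finite-difference time derivative with the torque of gravity. The value of g, the function names, and the sample launch velocity are assumptions for illustration.

```python
G_ACC = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def projectile_state(t, v0x, v0y):
    """Position and velocity of a projectile launched from the origin."""
    x = v0x * t
    y = v0y * t - 0.5 * G_ACC * t ** 2
    return x, y, v0x, v0y - G_ACC * t

def angular_momentum_z(t, m, v0x, v0y):
    """z-component of L = r x p about the launch point."""
    x, y, vx, vy = projectile_state(t, v0x, v0y)
    return m * (x * vy - y * vx)

def torque_z(t, m, v0x, v0y):
    """z-component of tau = r x F, where F is the weight (0, -m g)."""
    x, y, _, _ = projectile_state(t, v0x, v0y)
    fx, fy = 0.0, -m * G_ACC
    return x * fy - y * fx

if __name__ == "__main__":
    m, v0x, v0y, t, h = 0.2, 3.0, 4.0, 0.5, 1e-4
    # Central finite difference approximates dL/dt at time t.
    dL_dt = (angular_momentum_z(t + h, m, v0x, v0y)
             - angular_momentum_z(t - h, m, v0x, v0y)) / (2 * h)
    print(dL_dt, torque_z(t, m, v0x, v0y))
```

The two printed numbers should coincide up to rounding, which is exactly the statement that the torque equals the rate of change of the angular momentum.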



Thus, if there is a torque on an object, the angular momentum must change over time. If there is no torque, angular momentum must be conserved. This is reminiscent of linear momentum, which is conserved (by Newton’s Third Law) in the absence of external forces. Since torque is the rotational analogue of force, a direct corollary of this result is that angular momentum is conserved unless a net torque is applied on the system. Hence:

Result 3.4 (Conservation of angular momentum) Angular momentum is conserved, $\frac{d\vec{L}}{dt} = \vec{0}$, unless acted upon by a net external torque.

Now, it is clear why the Earth’s angular momentum with respect to the Sun is conserved (Example 3.2): the force of gravity between the Earth and the Sun is exactly antiparallel (at 180°) to the position vector, so $\vec{\tau} = \vec{r} \times \vec{F}_g = \vec{0}$. Since there are no other forces acting upon the system, $\frac{d\vec{L}}{dt} = \vec{0}$. However, relative to any other fixed reference point, gravity will generally exert a torque, and thus angular momentum about that point is not conserved.

Example 3.3 (Disk) Consider a disk of mass M and radius R, with center

of mass at point C. It rotates about point C with angular velocity ω. What

is the angular momentum of the disk as a whole?

Consider a particle i of the disk, with mass $m_i$, position $\vec{r}_i$ relative to C, and speed $v_i$. As in Example 3.2, since the velocity and the position vectors are here perpendicular, we simply have that $L_i \equiv |\vec{L}_i| = m_i r_i v_i$ or, since $v_i = \omega r_i$ for all particles, $L_i = m_i r_i^2 \omega$. Thus, the magnitude of the angular momentum of the entire disk (that is, integrating across the mass distribution of the disk) is:

$$L = \omega I$$

where $I \equiv \int r^2 \, dm$ is the disk’s moment of inertia.

Therefore, we have derived the following result:



Result 3.5 (Moment of inertia and angular momentum) For rotational motion, the moment of inertia equals the ratio of the angular momentum to the angular velocity: $I = \frac{L}{\omega}$.

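Since a torque implies a change in the angular momentum (Result 3.3), this result makes clear how an external torque acts on a rigid body: with the moment of inertia held fixed, a change in $L = I\omega$ must come from a change in the angular velocity, so the torque causes an angular acceleration, $\tau = I\alpha$.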

Remarkably, if rotation is about the system’s center of mass (which we

happened to choose in Example 3.3 as our reference point), then the angular

momentum always has magnitude L = ωI regardless of the reference point.

That is, if an object is spinning about its center of mass, then the angular

momentum is uniquely determined, and its magnitude is given by L = ωI.

We call this the spin angular momentum of the system, and it is therefore

an intrinsic property of the system (i.e. independent of the reference point).

Definition 3.6 (Spin angular momentum) The intrinsic angular mo-

mentum of a spinning object that is rotating about a stationary axis going

through the object’s center of mass.

For example, the Earth spins about its center of mass, so it has an

intrinsic spin angular momentum. It also has an orbital angular momentum,

but the magnitude of the latter depends on which point of reference is chosen,

while the former does not.

We can now sum up our results for torque and angular momentum in

rotational motion:

Result 3.6 (Summary: Rotational motion) Consider the rotation of an object with angular velocity $\omega \equiv \dot{\theta}$ and angular acceleration $\alpha \equiv \ddot{\theta}$, about some axis that goes through point Q. Let C denote the object’s center of mass. Then:

• The angular momentum vector is defined by $\vec{L}_Q = \vec{r}_Q \times \vec{p}$, where $\vec{r}_Q$ is the position relative to Q. The magnitude of the angular momentum about Q, $L_Q \equiv |\vec{L}_Q|$, is:

$$L_Q = I_Q \omega \qquad (3.3)$$

• The torque vector is defined by $\vec{\tau}_Q = \vec{r}_Q \times \vec{F}$, where $\vec{F}$ is the force applied at position $\vec{r}_Q$. The magnitude of the torque about Q, $\tau_Q \equiv |\vec{\tau}_Q|$, is:

$$\tau_Q = I_Q \alpha \qquad (3.4)$$

• An external torque changes the angular momentum of the system, for $\vec{\tau}_Q = \frac{d}{dt}\vec{L}_Q$. If there is no net torque, therefore, the angular momentum is conserved.

• If the rotation is instead about a stationary axis going through the center of mass, the angular momentum $L_C = I_C \omega$ is called the spin angular momentum, and is independent of the reference point.

Let’s apply these principles to a few examples:

Example 3.4 (Revolving chair experiment) Consider a man sitting on a round revolving chair that rotates about its center of mass. The man holds one weight in each hand. As the chair turns, the man extends his arms and pulls them back in. When his arms are extended, the chair spins slower, and when the arms are pulled in, the chair spins faster. Why does this happen?

When the arms are pulled in, the moment of inertia is $I = \frac{1}{2} M R^2$, where M is the mass of the man and R is his radius (here, for simplicity, suppose we approximate the man’s shape by that of a cylinder). When the arms are extended, the moment of inertia goes up: the total mass is unchanged, but the arms and the weights are now farther from the axis of rotation, and each mass element contributes $r^2 \, dm$ to I.

Clearly, since I increases but L = Iω must be conserved (as no external torque is at play here), then ω must decrease by the same proportion. In words, because angular momentum cannot change, an increase in the system’s moment of inertia when the arms are extended must translate into a decrease in the angular velocity of the same proportion.²

² Similarly, spinning figure skaters pull their arms in so as to reduce their moment of inertia and, in that way, increase their angular speed.



The reasoning in the last example applies to all spinning objects. For example, when a star shrinks, its radius goes down, its moment of inertia decreases, and so its angular velocity must go up. In particular, if the radius of a star shrinks 10-fold in a certain time while its mass stays the same, the moment of inertia will decrease 100-fold,³ and the angular velocity will increase 100-fold as well.
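The bookkeeping behind this statement is a one-line application of L = Iω. The following Python sketch, with hypothetical function names and a uniform-sphere assumption for the star, computes the new angular velocity when the moment of inertia changes and no external torque acts.

```python
def sphere_inertia(mass, radius):
    """Moment of inertia of a uniform solid sphere about an axis through its center."""
    return 0.4 * mass * radius ** 2

def spin_after_change(omega_initial, inertia_initial, inertia_final):
    """New angular velocity from conservation of L = I * omega (no external torque)."""
    return omega_initial * inertia_initial / inertia_final

if __name__ == "__main__":
    mass = 2.0      # arbitrary units; the mass cancels out of the ratio
    r_before = 10.0
    r_after = 1.0   # the radius shrinks 10-fold
    omega_before = 1.0
    omega_after = spin_after_change(omega_before,
                                    sphere_inertia(mass, r_before),
                                    sphere_inertia(mass, r_after))
    print(omega_after / omega_before)  # ratio of angular velocities
```

Because I scales with the square of the radius, the ratio printed is the square of the shrink factor.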

Example 3.5 (A spinning rod) Consider a rod of mass m and length $\ell$. Suppose that it rotates about some point P, which is at distance d from the rod’s center of mass, C. Let ω be the angular velocity of rotation.

The magnitude of the angular momentum about point P, denoted $L_P$, is:

$$L_P = \omega I_P = \omega (I_C + m d^2)$$

where the second equality follows from the Parallel Axis theorem (Result 3.1), and $I_C = \frac{1}{12} m \ell^2$ for rods. The torque relative to point P is $\vec{\tau}_P = \vec{r}_P \times \vec{F}_P$, where $\vec{r}_P$ and $\vec{F}_P$ are the position and force vectors relative to P. Clearly, $\vec{r}_P$ and $\vec{F}_P$ are parallel (a centripetal force acts on the rod along the direction of the position vector, as the rod is shaped like a straight line), so $\vec{\tau}_P = \vec{0}$. Since no external torque exists, angular momentum relative to point P is conserved.

Clearly, however, angular momentum is not conserved anywhere other

than P (the point about which the rod rotates). Indeed, for any point Q, ~rQ

and ~FQ are not parallel, so a torque exists relative to Q.

If the rotation is relative to C, the center of mass, then clearly there is no

force at all, ~F = ~0. Thus, ~τ = ~0 relative to any point (inside or outside of the

rod). Because rotating about the center of mass has the special property that

no torque exists relative to any point of origin, the angular momentum about

the center of mass is the spin angular momentum, an intrinsic property of

the rod. In particular, it is $L_C = I_C \omega = \frac{1}{12} m \ell^2 \omega$.

Example 3.6 (Translation and Rotation) Here is a classic problem that

illustrates well the concepts above.

³ Recall $I = \frac{2}{5} m R^2$ for spheres such as stars.


Consider a rod (e.g. a ruler) of mass m and length $\ell$ lying at rest on a frictionless surface (e.g. a frictionless table). We give the rod an impulse (i.e. a force for a short period of time; see Definition 2.7) at some point P on the rod. Point P is a distance d away from the rod’s center of mass, C. The impulse vector $\vec{I}$ is perpendicular to the length of the rod.

Because the center of mass behaves like a single point where all the mass is concentrated, after the hit it must move with a certain velocity $\vec{v}_C$ which will never change direction or magnitude, since no net force acts on the rod once the impulse is over. We call this the translational velocity. The rod will also rotate about C at some angular speed, $\omega_C$.

Now, we want to derive $\vec{v}_C$ (the translational velocity) and $\omega_C$ (the angular speed). The impulse is $\vec{I} = \int \vec{F} \, dt = \int \frac{d\vec{p}}{dt} \, dt = \Delta\vec{p}$, where $\vec{p}$ is the momentum of the system. By the properties of the center of mass (Result 1.2), we have $m \vec{a}_C \Delta t = \Delta\vec{p} = \vec{I}$. Since the initial velocity is zero, then $\vec{a}_C \Delta t = \vec{v}_C$. Thus:

$$\vec{v}_C = \frac{\vec{I}}{m}$$

is the translational velocity of the center of mass. Remarkably, it is

independent of d: no matter where on the rod we give the impulse, the

velocity of the center of mass will always be the same (everything else equal).

For the angular velocity about the center of mass, C, we have to choose

the origin. We are of course free to choose its location, so let us do two

cases:

• Take C itself as our reference point. The torque relative to C is clearly not zero, as $\vec{r}_C$ (the position of P relative to C) and $\vec{F}$ are not parallel (in fact, $\vec{r}_C \perp \vec{F}$). Since there is a torque, the angular momentum relative to point C must be changing. In particular, the torque about C is $\vec{\tau}_C = \vec{r}_C \times \vec{F}$. Assuming the hit occurs over a short enough time so that the position vector barely changes, we have:

$$\int \vec{\tau}_C \, dt = \vec{r}_C \times \int \vec{F} \, dt = \vec{r}_C \times \vec{I}$$

Since the object is initially at rest, the angular momentum before the hit is zero, and after the hit it equals the accumulated torque, so $\vec{L}_C = \vec{r}_C \times \vec{I}$. Thus, the angular velocity about C, $\omega_C$, satisfies $L_C = I_C \omega_C$, where $I_C = \frac{1}{12} m \ell^2$ for rods, and $L_C = |\vec{r}_C \times \vec{I}| = |\vec{r}_C||\vec{I}| = dI$ is the magnitude of the angular momentum,⁴ where $I \equiv |\vec{I}|$ is the magnitude of the impulse.⁵ Thus, we get $\frac{1}{12} m \ell^2 \omega_C = dI$, so $\omega_C = \frac{12 d I}{m \ell^2}$.

• Take P as our reference point. Now, the position vector of the point where the force is exerted is the zero vector, since the origin is trivially the same as that point. Thus, there is no torque, and angular momentum is conserved, about point P. The angular momentum remains $\vec{L}_P = \vec{0}$ before and after the hit. Note here we cannot invoke the Parallel Axis theorem, for the point of rotation is still at the center of mass.⁶ Thus, the moment of inertia is still $I_C = \frac{1}{12} m \ell^2$, and $\omega_C = L_C / I_C = \frac{12 L_C}{m \ell^2}$. To compute $L_C$ using P as the reference point, we note once again that $\vec{r}_C \perp \vec{I}$, except now $\vec{r}_C$ is going in the opposite direction as before (from P to C instead of C to P). Relative to P, the angular momentum is the sum of the spin part, of magnitude $L_C = I_C \omega_C$, and the part due to the translation of the center of mass, of magnitude $m v_C d = dI$ and opposite sense. Since the total is zero, $L_C = dI$, and the same result as before follows.
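The two results of this example, $\vec{v}_C = \vec{I}/m$ and $\omega_C = 12 d I / (m \ell^2)$, can be packaged in a small Python sketch. The function name and the sample numbers are hypothetical, and the rod is assumed thin and uniform.

```python
def rod_after_impulse(mass, length, impulse, d):
    """Translational speed of the center of mass and angular speed about it,
    for a thin uniform rod at rest that receives a perpendicular impulse
    of magnitude `impulse` at distance d from its center of mass."""
    v_c = impulse / mass              # independent of where the rod is hit
    i_c = mass * length ** 2 / 12     # moment of inertia about the center of mass
    omega_c = d * impulse / i_c       # from L_C = d * I = I_C * omega_C
    return v_c, omega_c

if __name__ == "__main__":
    for d in (0.0, 0.1, 0.25):
        v_c, omega_c = rod_after_impulse(mass=0.3, length=0.5, impulse=0.06, d=d)
        print(d, v_c, omega_c)
```

Running the loop shows the point made in the text: the translational speed is the same for every d, while the angular speed grows linearly with the distance between the hit and the center of mass, and vanishes when the rod is hit exactly at C.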

Example 3.7 (A Ruler) Take a ruler of mass m with center of mass C

and make it rotate around a perpendicular pin that is placed at some point

P , at distance b from C. We take P as our reference point.

The force vector is now due to gravity acting at point C, whose position vector relative to P we denote by $\vec{r}_P$. Therefore, the torque relative to P is $\vec{\tau}_P = \vec{r}_P \times \vec{F}_g$, with magnitude $\tau_P = mgb \sin\theta$, where θ is the angle of the ruler with respect to the vertical axis. Using equation (3.4), it must also be that $\tau_P = -I_P \alpha$, where $I_P$ is the moment of inertia about a perpendicular axis going through P, and $\alpha \equiv \ddot{\theta}$ is the angular acceleration. The minus sign is there because, just like in a spring (where F = −kx), the torque is a restoring one: it always acts to reduce the displacement θ.

⁴ Here, we use that $\vec{r}_C \perp \vec{I}$ to say that sin θ = 1, where $\theta \equiv \angle \vec{r}_C \vec{I}$.

⁵ The notation of this problem should not lead to confusion: $\vec{I}$ denotes the impulse and $I \equiv |\vec{I}|$ its magnitude, while $I_C$ denotes the moment of inertia about point C.

⁶ The theorem could be invoked if P, our new reference point, was also the new point of rotation, which it is not.


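Thus, $\tau_P = mgb \sin\theta = -I_P \ddot{\theta}$. Using a Small Angle approximation, that is, Taylor-expanding sin θ about θ = 0, we get sin θ ≈ θ. Putting things together, we have found $mgb\,\theta + I_P \ddot{\theta} = 0$, or:

$$\ddot{\theta} + \frac{mgb}{I_P} \theta = 0$$

Clearly, this is a simple harmonic oscillation in θ. Thus, the solution is:

$$\theta = \theta_{max} \cos(\tilde{\omega} t + \varphi)$$

where the angular frequency $\tilde{\omega}$ is a constant,⁷ and (as we have derived before) it is given by $\tilde{\omega} = \sqrt{\frac{mgb}{I_P}}$. The period is $T = 2\pi \sqrt{\frac{I_P}{mgb}}$. Moreover, we know $I_P = I_C + m b^2 = \frac{1}{12} m \ell^2 + m b^2$ by the Parallel Axis theorem. Thus:

$$\tilde{\omega} = \sqrt{\frac{gb}{\frac{1}{12}\ell^2 + b^2}} \quad \text{and} \quad T = 2\pi \sqrt{\frac{\frac{1}{12}\ell^2 + b^2}{gb}}$$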

The kinetic energy of rotation of the ruler is:

$$T_{rot} = \frac{1}{2} I_P \omega^2$$

where $\omega = \dot{\theta}$ and $I_P = \frac{1}{12} m \ell^2 + m b^2$. Using $\theta_{max}$ and ϕ as computed in Example 2.9, the rotational kinetic energy will readily follow. Note the kinetic energy changes with time: it is zero when the ruler comes to a halt (at θ = ±$\theta_{max}$), and it is maximum when the ruler is exactly at θ = 0 (i.e. in a vertical position).

Example 3.8 (A Hula-Hoop) Hang a hula-hoop vertically from a pin at point P. Let C be the center of mass at a certain time. Note that the position of the center of mass will change as the hoop moves (in Figure 3.2, the center of mass moves from C when the hoop is at rest, to C′ at a later time). The mass of the hoop is m, it has a radius R, and we let θ be the angle between $\vec{r}_P$ and the vertical axis.

⁷ Typically, for the solution of a simple harmonic oscillator we will denote the angular frequency by ω. Here, we place a tilde to avoid confusion with the angular velocity of the problem, ω, which is not constant.

Figure 3.2: The hula-hoop.

Fix P, the axis of rotation, as our reference point. The magnitude of the torque relative to point P is $\tau_P = mgR \sin\theta = -I_P \ddot{\theta}$, as before. Now, the moment of inertia about point P is, by the Parallel Axis theorem, $I_P = I_C + m R^2$, where the moment of inertia about the center of mass is $I_C = m R^2$, since all the mass is distributed along the hoop’s circumference, which is at distance R. Thus, $I_P = 2 m R^2$. In sum, using the Small Angle approximation as before (i.e. sin θ ≈ θ for θ ≈ 0), we have found $\ddot{\theta} + \frac{mgR}{I_P} \theta = 0$, that is:

$$\ddot{\theta} + \frac{g}{2R} \theta = 0$$

This is, again, a simple harmonic oscillator in θ, whose solution is $\theta = \theta_{max} \cos(\tilde{\omega} t + \varphi)$, where $\tilde{\omega} = \sqrt{\frac{g}{2R}}$. The period of oscillation is $T = 2\pi \sqrt{\frac{2R}{g}}$.

Remarkably, these are the same results (recall Example 1.10) as for a simple pendulum with length $\ell = 2R$. Note, again, that the mass m is irrelevant for the period of the hoop, as well as that of the pendulum.
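Both the ruler and the hoop are instances of the same small-angle formula, $T = 2\pi \sqrt{I_P / (m g b)}$, where b is the distance between the pivot and the center of mass. The Python sketch below evaluates it for the two cases and for a simple pendulum; the function names and the value of g are assumptions made for illustration.

```python
import math

def physical_pendulum_period(i_p, m, b, g=9.81):
    """Small-angle period of a body of mass m pivoted at distance b from its
    center of mass, with moment of inertia i_p about the pivot."""
    return 2 * math.pi * math.sqrt(i_p / (m * g * b))

def ruler_period(m, length, b, g=9.81):
    """Thin uniform ruler pivoted at distance b from its center (Parallel Axis theorem)."""
    i_p = m * length ** 2 / 12 + m * b ** 2
    return physical_pendulum_period(i_p, m, b, g)

def hoop_period(m, radius, g=9.81):
    """Thin hoop hanging from a point on its rim, so I_P = 2 m R^2 and b = R."""
    i_p = 2 * m * radius ** 2
    return physical_pendulum_period(i_p, m, radius, g)

def simple_pendulum_period(length, g=9.81):
    """Small-angle period of a point mass on a massless string."""
    return 2 * math.pi * math.sqrt(length / g)

if __name__ == "__main__":
    print(ruler_period(m=0.1, length=1.0, b=0.5))
    print(hoop_period(m=1.3, radius=0.4), simple_pendulum_period(2 * 0.4))
```

The second line prints two matching numbers, and changing m in either call leaves the periods unchanged, in line with the remarks above.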

Next, we examine rolling. Rolling is a type of rotational motion that involves the translation of a moving object along a surface (e.g. the wheels of cars). In particular, we will consider situations where there is pure rolling, which we define as rolling motion that does not involve any sliding of the object. That is, for a round object with radius R, pure roll is the situation where, after one full rotation, the object has travelled a distance equal to its circumference, 2πR. As a consequence, in pure rolling, the speed of the center of the object equals the tangential speed ωR of a point on the circumference relative to the center. Therefore:

Definition 3.7 (Pure Rolling) Rolling motion that does not involve sliding, so that the velocity of the center of the circle (point Q) is $v_Q = \omega R$, where ω is the angular velocity.⁸

If there were no friction, the object would simply rotate about its own axis: there would be no translation, and $v_Q = 0$. Thus, it is friction which permits pure rolling. Let’s look at a famous example: rolling cylinders on an incline.

Example 3.9 (Rolling cylinders) Place a round object (e.g. a cylinder or a sphere) on an incline (see Figure 3.3). The object will (pure-)roll downhill. The cylinder has mass m, length $\ell$, radius R, and center of mass Q.

Figure 3.3: A rolling cylinder.

The forces involved in this problem are similar to those of Example 1.13. First, we decompose the force of gravity mg into its x and y components (along and perpendicular to the incline), mg sin β and mg cos β, respectively, where β is the angle of the incline. A normal force from the incline’s surface onto the object has magnitude $F_N = mg \cos\beta$ (as there is no acceleration in the y direction). Finally, there is a frictional force $F_f$.

⁸ Recall that ωR is the velocity of the circumference. If there is pure roll, it coincides with that of the center point.

At any moment in time, there is a (time-varying) angular velocity ω. Point Q (the center of mass) travels at velocity $v_Q = \omega R$ (because of pure roll). Thus, the acceleration is $a = \dot{\omega} R = \alpha R$, where α denotes the angular acceleration.

When calculating the torque about point Q, the only contributing force is $\vec{F}_f$ (for $\vec{F}_N$ and mg both act along lines that go through Q, so they exert no torque about Q), so the magnitude of the torque is $\tau_Q = R F_f$. Using equation (3.4), we know $\tau_Q = I_Q \alpha$, where $I_Q$ is the moment of inertia about the axis through the center of mass. Using a = αR, we have found $\tau_Q = R F_f = I_Q \frac{a}{R}$, or:

$$a = \frac{R^2 F_f}{I_Q}$$

On the other hand, since Q is the center of mass, we have $ma = mg \sin\beta - F_f$ by Newton’s Second Law, or:

$$F_f = m(g \sin\beta - a)$$

Substituting this into our first equation, we find $a = \frac{R^2}{I_Q} m (g \sin\beta - a)$ or, solving for a, we have $a = \frac{m R^2 g \sin\beta}{m R^2 + I_Q}$, and $F_f = \frac{I_Q}{R^2} a$. Then:

• If the cylinder is solid (the mass is evenly distributed over its cross-section), the moment of inertia about the center of mass is $I_Q = \frac{1}{2} m R^2$, and so we obtain an acceleration of:⁹

$$a_{solid} = \frac{2}{3} g \sin\beta$$

• If the cylinder is hollow (all the mass is at the circumference), the moment of inertia is $I_Q = m R^2$, so:

$$a_{hollow} = \frac{1}{2} g \sin\beta$$

⁹ If we considered a solid sphere instead of a cylinder, then $I_Q = \frac{2}{5} m R^2$, in which case $a = \frac{5}{7} g \sin\beta$.

These are surprising results. First, note that the acceleration is always

lower than in the case with no friction (where a = g sinβ). Second, the

acceleration of the cylinder when rolling downhill is completely unaffected

by the cylinder’s mass, length, or radius. Downhill races between rolling

cylinders of different masses, lengths, or radii, will always end in a tie if the

starting positions (and velocities) are the same. However, solid cylinders

have higher acceleration than hollow cylinders, so the former will always hit

the bottom of the incline before the latter does (again, independent of mass,

length, or radii).

Finally, to check that there is indeed pure roll (i.e. no slipping), we need $F_f < \mu_s F_N$, where $\mu_s$ is the coefficient of static friction (recall e.g. Example 1.13). Here, $F_N = mg \cos\beta$ and $F_f = \frac{I_Q}{R^2} a$. For a solid cylinder, for example, $F_f = \frac{1}{3} mg \sin\beta$, so the condition for pure roll becomes $\mu_s > \frac{1}{3} \tan\beta$. For a hollow cylinder, $F_f = \frac{1}{2} mg \sin\beta$, so the condition for pure roll is $\mu_s > \frac{1}{2} \tan\beta$.
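All the cases of this example depend on the shape only through the dimensionless ratio $k = I_Q / (m R^2)$. The Python sketch below, a minimal illustration with hypothetical names and an assumed value of g, evaluates the acceleration $a = g \sin\beta / (1 + k)$, the friction force per unit mass $F_f / m = k a$, and the smallest coefficient of static friction compatible with pure roll, $F_f / F_N = k \tan\beta / (1 + k)$.

```python
import math

def rolling_on_incline(k, beta, g=9.81):
    """Pure rolling down an incline of angle beta (radians) for a round body
    with I_Q = k * m * R^2. Returns (acceleration, friction per unit mass,
    minimum static friction coefficient)."""
    a = g * math.sin(beta) / (1 + k)       # from a = m R^2 g sin(beta) / (m R^2 + I_Q)
    friction_per_mass = k * a              # from F_f = I_Q a / R^2
    mu_min = friction_per_mass / (g * math.cos(beta))  # F_f / F_N
    return a, friction_per_mass, mu_min

if __name__ == "__main__":
    beta = math.radians(20)
    shapes = {"solid cylinder": 0.5, "hollow cylinder": 1.0, "solid sphere": 0.4}
    for name, k in shapes.items():
        a, f, mu_min = rolling_on_incline(k, beta)
        print(name, a, f, mu_min)
```

Neither the mass nor the radius appears among the arguments, which is the numerical counterpart of the statement that races between cylinders of the same type always end in a tie.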

Example 3.10 (Atwood machine) Figure 3.4 shows a so-called Atwood machine: a disk of mass M and radius R with center of mass P serves as a pulley for a massless rope which goes around it and holds, one on each end, two objects of masses $m_1$ and $m_2$, respectively.

Each hanging body is pulled down by gravity. There is tension in the rope, both pulling up on the bodies and pulling down on the pulley. Finally, the pulley itself feels the gravitational force on its center of mass P, and a normal force from its support keeps it in place, so $F_N = T_1 + T_2 + Mg$.

Suppose the system is being accelerated so that the pulley rotates clockwise. Moreover, assume that the rope does not slip around the circumference of the disk, i.e. that friction is strong enough for the rotation of the disk to translate one for one into the rope’s motion. Since there is no slip, the velocity of the rope is v = ωR, where ω is the angular velocity. The acceleration of the rope is thus a = αR, where $\alpha = \dot{\omega}$ is the angular acceleration.

Figure 3.4: The Atwood machine.

For objects 1 and 2, Newton’s Second Law reads:

$$T_1 - m_1 g = m_1 a \qquad m_2 g - T_2 = m_2 a$$

respectively. On the pulley there is no net linear force, but there is a torque. The torque is only due to forces $T_1$ and $T_2$, since $F_N$ and Mg go through point P and thus do not contribute to the torque (their position vectors being the zero vector). Relative to point P, the position vectors for forces $\vec{T}_1$ and $\vec{T}_2$ are $\vec{r}_1$ and $\vec{r}_2$, respectively. Notice $\vec{r}_1 \perp \vec{T}_1$ and $\vec{r}_2 \perp \vec{T}_2$, so in both cases sin θ = 1. Therefore, the magnitude of the total torque relative to point P is simply:¹⁰ $\tau_P = R T_2 - R T_1$. Recall that $\tau_P = I_P \alpha$, where $I_P$ is the moment of inertia. Since this is a rotating disk with rotation axis through the center of mass, $I_P = \frac{1}{2} M R^2$, so we have, putting things together with α = a/R:

$$T_2 - T_1 = \frac{1}{2} M a$$

Thus, we have three equations for three unknowns, $(a, T_1, T_2)$. Solving, we get $a = \left( \frac{m_2 - m_1}{\frac{1}{2}M + m_1 + m_2} \right) g$, $T_1 = m_1 (g + a)$, and $T_2 = m_2 (g - a)$.

¹⁰ Note the two torques have opposite sign because the two torque vectors point in opposite directions: with the z+ direction out of the page, the right-hand torque (due to $T_2$) points into the plane, while the left-hand torque (due to $T_1$) points out of the plane.

The solution reveals that for rotation to be clockwise, it is necessary that

m2 > m1. Only then do we have that a > 0, so that m2g > T2 (body 2’s

weight is stronger than the tension holding it up), T1 > m1g (body 1’s weight

is insufficient to counteract the force pulling it up), and T2 > T1 (a stronger

tension on the right side of the rope makes it rotate clockwise).
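The solution of the three equations can be evaluated directly. The Python sketch below uses a hypothetical function name and an assumed value of g; it returns the acceleration and the two tensions, and can be used to confirm that $T_2 - T_1 = \frac{1}{2} M a$ holds for any choice of masses.

```python
def atwood(m1, m2, pulley_mass, g=9.81):
    """Acceleration and tensions of an Atwood machine whose pulley is a
    uniform disk of mass pulley_mass (rope assumed massless, no slipping).
    Positive a means body 2 moves down and body 1 moves up."""
    a = (m2 - m1) * g / (0.5 * pulley_mass + m1 + m2)
    t1 = m1 * (g + a)  # from T1 - m1 g = m1 a
    t2 = m2 * (g - a)  # from m2 g - T2 = m2 a
    return a, t1, t2

if __name__ == "__main__":
    a, t1, t2 = atwood(m1=1.0, m2=2.0, pulley_mass=0.5)
    print(a, t1, t2)
    print(t2 - t1, 0.5 * 0.5 * a)  # torque equation: T2 - T1 = M a / 2
```

With a massless pulley (pulley_mass set to zero) the two tensions coincide, which recovers the textbook Atwood machine without rotational inertia.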

3.3 Gyroscopic Motion

Up until now, we have studied rotational motion for objects which rotate

about a fixed axis. The relative position of this axis to the point of reference

has been crucial for determining the torques and the angular momentum

involved in the system. Now, we study the rotation of objects about an axis

which may itself not be at rest. This is often called gyroscopic motion.

A gyroscope is a device that consists of a wheel or disk that can rotate

about an axis which is free to change in orientation. When a gyroscope spins

and its axis of rotation is altered due to a one-time external torque, the

conservation of angular momentum (which must hold because no additional

external torque is applied on the system) forces a change in the orientation

of the system’s axis. We call this a precession.

Definition 3.8 (Precession) The change in the orientation of the rota-

tional axis of a rotating body.

Let’s see this by means of example:

Example 3.11 (Spinning Wheel I) Consider spinning a bicycle wheel in

outer space and, at the same time, applying a torque that sets the axis of

rotation in motion. Importantly, the wheel cannot continue spinning about

its axis and, at the same time, have said axis rotate in the same direction



as that of the initial torque vector (thereby describing a circular motion

on the same plane as the torque vector). The reason is that, if that were

the case, the direction of the angular momentum vector would be changing

and, as a result, angular momentum would not be conserved.¹¹ But angular

momentum must be conserved, because there are no external torques on the

system once the initial torque has been given! So how does nature resolve

this?

To understand what will happen, consider the upper panel of Figure 3.5. Let x be the axis of rotation, and (x, y) be the plane on which the observer is located. Suppose that two forces of equal magnitude F and opposite direction are simultaneously applied for some duration ∆t at points A and B, which lie strictly on the (x, y) plane. Let b denote the separation between these two points, i.e. $|\vec{r}| = b$.

The torque relative to the center of mass is:¹²

$$\vec{\tau} \equiv \vec{r} \times \vec{F} = bF \, \vec{e}_+$$

by definition of the cross product (equation (0.10)), where $\vec{e}_+$ is a unit vector perpendicular to the (x, y) plane and pointing in the z+ direction (i.e. pointing toward more positive values of z). Therefore, the torque vector is pointing upward,¹³ and the magnitude of the torque is τ = bF. The spin angular momentum is $\vec{L}$, which points along the axis of rotation, perpendicular to the plane of the wheel.

When the new torque is applied for a period of length ∆t, the angular momentum changes by $\Delta\vec{L} = \vec{\tau} \Delta t$ (by Result 3.3). Crucially, the direction of $\Delta\vec{L}$ is the same as that of $\vec{\tau}$ (that is, upward toward z = +∞). In words, the spin angular momentum will change in the direction of the torque. After the torque ceases to exist, the angular momentum of the system can no longer change. Thus, in order for the angular momentum to be conserved, the wheel must tilt. After the torque has occurred, the spinning wheel must keep its tilted position (Figure 3.5, lower panel). In short, the spinning wheel precesses.

Figure 3.5: A spinning bicycle wheel in outer space.

¹¹ Here, whenever we talk about angular momentum, we really mean spin angular momentum (Definition 3.6), because we are computing the angular momentum taking the axis of rotation that goes through the system’s center of mass.

¹² Here, we have used that $\vec{r} \perp \vec{F}$ (so sin θ = 1).

¹³ As usual, we are free to choose the direction of our axis so long as we are consistent throughout. The convention we will use is that the z+ direction is out of the page, and the z− direction is into the page.

Here, the torque is pointing upward because of the direction of the forces

applied at points A and B. If we flip the directions of these forces, the

torque will point downward, the change in the angular momentum will also

point downward, and that will tilt the wheel in the other direction (to the

right). Similarly, if the forces at A and B lived on the (x, z) plane (i.e. they

were pointing up and down), then the torque would point perpendicular to

(x, z), the change in the angular momentum vector would point along the

(x, y) plane, and this would tilt the wheel right and left along the horizontal

dimension.¹⁴

The above example ignored the effects of gravity, because the wheel was spinning in outer space. The following example examines how the wheel’s precession will interact with gravity.

¹⁴ A famous experiment has the experimenter holding a spinning wheel, sitting on a revolving chair, and experiencing a spin in the chair from left to right and vice versa as the axis of rotation of the wheel is moved vertically.

Example 3.12 (Spinning Wheel II) Consider now the wheel, of mass m and radius R, mounted on a structure similar to what is shown in Figure 3.6. At point O (our point of reference), the wheel is attached to a rod of length a, which serves as the axis of rotation. The position of the wheel relative to point O is $\vec{r}$. Clearly, the spin angular momentum vector is now pointing in the y+ direction (in the figure, it is denoted by $\vec{\omega}$). On the other hand, gravity acts upon the wheel in the z− direction, and has magnitude mg. In the figure, the gravitational force vector is marked by $\vec{W}$.

Relative to point O, the torque vector is:

~τ = ~r × ~W = amg~e

(note sin θ = 1 because ~r ⊥ ~W ), where ~e is pointing in the direction

of x+. In the figure, the torque vector is denoted ~Γgrav. Therefore, since

∆~L = ~τ∆t, the spin angular momentum vector (~ω, in the figure) will move

in the direction of the torque, and the system will precess. In particular, the

system will rotate about the axis of precession counterclockwise, as marked

in the figure.

Finally, the angular frequency of the precession (i.e. the velocity at

which the system rotates around its axis of precession) is given by:15

ωprecession =τ

LS

where τ and LS are, respectively, the magnitudes of the torque and the

spin angular momentum of the system. In our case, τ = amg (see above)

and LS = IOωspin (see equation 3.3), where IO is the moment of inertia

about point O, and ωspin here denotes the angular velocity of the spinning

wheel. If we assume that all the mass of the wheel is at its circumference (a rough approximation), then $I_O = mR^2$, so $\omega_{\text{precession}} = \dfrac{ag}{R^2\,\omega_{\text{spin}}}$. The period of the precession is, as usual, $T_{\text{precession}} = \dfrac{2\pi}{\omega_{\text{precession}}}$.
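The precession formula of Example 3.12 can be evaluated numerically. The following is a minimal sketch under the thin-rim assumption used above; the function name `precession` and the numerical values (rod length, wheel radius, spin rate) are hypothetical and chosen only for illustration.

```python
import math

def precession(a, R, omega_spin, g=9.81):
    """Precession of a thin-rim wheel on a rod of length a.

    Uses omega_precession = tau / L_S with tau = a*m*g and L_S = m*R**2*omega_spin,
    so the mass m cancels. Returns (omega_precession, period) in SI units.
    """
    omega_p = a * g / (R**2 * omega_spin)
    period = 2 * math.pi / omega_p
    return omega_p, period

# Hypothetical values: 20 cm rod, 30 cm wheel radius, wheel spinning at 30 rad/s.
omega_p, period = precession(a=0.2, R=0.3, omega_spin=30.0)
print(f"omega_precession = {omega_p:.3f} rad/s, T_precession = {period:.2f} s")
```

Note that a faster spin gives a slower precession: doubling the spin rate halves the precession frequency.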

15 This result is being stated without proof.

Figure 3.6: Gyroscopic motion. Our usual notation is slightly different for this figure. ~ω here is the spin angular momentum vector, and ~W is the gravitational force.

3.4 Elliptical Orbits and Kepler’s Laws

So far, when discussing rotational motion (for example, the motion of

the Earth around the Sun), we have narrowed attention to circular motion.

In reality, however, planets describe elliptical orbits. In this section, we

explore aspects of this type of motion.

Let us first sum up the results we have found for circular motion:

Result 3.7 (Summary: Circular Motion) When an object of mass m

orbits around another object of mass M and describes a circumference of

radius R, then:

• The period of rotation is $T = 2\pi\sqrt{\dfrac{R^3}{GM}}$, where G is the gravitational constant.

• The orbital speed of motion is $v = \dfrac{2\pi R}{T} = \sqrt{\dfrac{MG}{R}}$.

• Total energy is $E = T + V = \tfrac{1}{2}mv^2 - \dfrac{mMG}{R} = -\dfrac{mMG}{2R}$.16

• The escape velocity is $v_{\text{esc}} = \sqrt{\dfrac{2MG}{R}}$.17

16 Recall that, here, we normalized potential energy to be zero at infinity, V∞ = 0. This explains why all bound orbits have negative total energy.
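The formulas of Result 3.7 can be checked against each other numerically. The sketch below is only an illustration: the function name `circular_orbit`, the value of G, and the chosen mass and radius are assumptions, not part of the text above.

```python
import math

G = 6.674e-11  # gravitational constant in SI units (assumed value)

def circular_orbit(M, R, m=1.0):
    """Period, speed, total energy and escape speed for a circular orbit (Result 3.7)."""
    period = 2 * math.pi * math.sqrt(R**3 / (G * M))
    speed = math.sqrt(G * M / R)
    energy = -G * m * M / (2 * R)
    v_escape = math.sqrt(2 * G * M / R)
    return period, speed, energy, v_escape

# Hypothetical case: a 1 kg object 7,000 km from the center of a 6e24 kg planet.
period, speed, energy, v_escape = circular_orbit(M=6e24, R=7.0e6)
print(f"T = {period / 60:.1f} min, v = {speed / 1e3:.2f} km/s, "
      f"v_esc = {v_escape / 1e3:.2f} km/s")
```

In particular, the escape speed is always a factor of √2 larger than the circular orbital speed at the same radius.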

In general, bound orbits are ellipses, though in our Solar System most

are close to circular. The laws of motion for these types of planetary motion

were first famously formulated by Johannes Kepler (1571 – 1630). His laws

are the following:

Law 1 The orbits of planets are elliptical, and the Sun is at one focus.

Law 2 The line segment joining a planet and the Sun sweeps out the same

area at any point in the ellipse for any given fixed amount of time.

Law 3 The square of the orbital period is proportional to the cube of the semi-major axis of the ellipse.

Figure 3.7 illustrates what is meant by Law 2: if in a given amount of

time, a planet experiences a motion along the arc of any of the three shaded

regions, then these regions all have equal area. For this reason, this law is

commonly known as the Equal areas – Equal times law.

Figure 3.7: An elliptical orbit. An object of mass m orbits about a mass M located at point Q. C is the center of the ellipse, a its semi-major axis, A the apogee, and P the perigee. (Labels in the figure: M, Q, m, P, A, C, a, ~r, ~v.)

Let us thus study elliptical orbits formally. Consider (as in Figure 3.7) an ellipse about an object of mass M in location Q. The perigee/perihelion and apogee/aphelion are clearly marked in Figure 3.7 by points P and A,

17 Recall that we calculated vesc by setting E = 0.

respectively.18 Let the distance between the two be 2a, where a is the semi-major axis of the ellipse and point C marks the center of the ellipse.19 Finally, let ~r be the position vector of object m relative to M , and ~v be the former’s velocity vector.

The critical difference between elliptical and circular motion is that the distance between the objects, r = |~r|, now changes with time, whereas for

circular orbits it is some constant R. Therefore, the speed v now changes

with time. In particular, the period of motion, total energy, and escape

velocity are as follows:

Result 3.8 (Summary: Elliptical Motion) Consider the motion of m

around M along an ellipse with semi-major axis a. Then:20

• The period of rotation is $T = 2\pi\sqrt{\dfrac{a^3}{GM}}$.21

• Total energy is $E = T + V = \tfrac{1}{2}mv^2 - \dfrac{mMG}{r} = -\dfrac{mMG}{2a}$.

• The escape velocity is $v_{\text{esc}} = \sqrt{\dfrac{2MG}{r}}$.

Compare these with the results for circular orbits (Result 3.7). Note

that the functional forms are all similar, but where we had an R in circular

motion (a constant), now we have an r (a time-varying distance).

• Both kinetic and potential energy are time-varying, though by the

conservation of mechanical energy E is constant.

• The period in elliptical orbits is also very similar to that of circular

orbits, except that the radius R is replaced by the semi-major axis,

18 The perigee (if M is the Earth and m is the moon), or perihelion (if M is the Sun and m is the Earth), is the point on the ellipse that is closest to the object M . The apogee (for Earth and moon), or aphelion (for Earth and Sun), is the point that is furthest.

19 The major axis of an ellipse is its longest diameter, i.e. the length of the longest segment running through the center and both foci, with ends at the widest points of the perimeter. The semi-major axis is half of the major axis, and thus runs from the center of the ellipse, through a focus and to the perimeter.

20 These results are being stated without proof.

21 Indeed, $T^2 \propto a^3$, Kepler’s third law.


a. Remarkably, both period and total energy of a circular and an

elliptical orbit are the same if a = R.

• Finally, the escape velocity is again calculated by setting E = 0 and

solving for v (intuitively, the speed necessary to attain infinity with

no energy left). For elliptical motion, this speed is a function of the distance r between the two objects, which is now changing in time.

Let us now calculate the semi-major axis a, and velocities at various

points. Let ϕ denote the angle between position and velocity, ϕ = ∠~r~v.

Take the initial conditions (r0, v0, ϕ0) as given. The total energy at time t = 0 is $E = T + V = \tfrac{1}{2}mv_0^2 - \dfrac{mMG}{r_0}$. Since energy is conserved, E will never change. According to our formula above, $E = -\dfrac{mMG}{2a}$, so $\tfrac{1}{2}v_0^2 - \dfrac{MG}{r_0} = -\dfrac{MG}{2a}$. Solving for a, we obtain the semi-major axis that the orbit will attain:

$a = \dfrac{MG\,r_0}{2MG - v_0^2 r_0}$ (3.5)

Next, we calculate the velocities as the object passes through the perigee

and apogee, ~vP and ~vA. At both these points, ~v ⊥ ~r, so sinϕA = sinϕP = 1.

At t = 0, the magnitude of the angular momentum about point Q (where the mass M is located), LQ = |~LQ|, is LQ(0) = mv0r0 sinϕ0. At

a later time t, when the object is at point P in Figure 3.7 (the perigee),

the angular momentum about point Q is LQ(t) = mvP rP , where rP is

the distance between Q and P (and we have used sinϕP = 1). By the

conservation of angular momentum about point Q, it must then be that

LQ(0) = LQ(t), so:

v0r0 sinϕ0 = vP rP (3.6)

On the other hand, the total energy must be conserved. At time t, when

at point P , we have $E(t) = \tfrac{1}{2}mv_P^2 - \dfrac{mMG}{r_P}$. Using Result 3.8, then:

$\tfrac{1}{2}v_P^2 - \dfrac{MG}{r_P} = -\dfrac{MG}{2a}$ (3.7)


where a is given by equation (3.5). Thus, we have a system of two equations (3.6)-(3.7) and two unknowns, (vP , rP ). Substituting vP from equation (3.6) into equation (3.7) yields a quadratic equation in rP , so it will give us two solutions for rP . One solution will indeed

be the distance to the perigee, rP . The other one will be the distance to

the apogee, rA. Indeed, note that we could have replaced rP by rA above,

and we would have obtained the same equations and, thus, the same two

solutions.

The two solutions should verify that rAvA = rP vP , a direct consequence

of the conservation of angular momentum, and rA + rP = 2a, by construc-

tion.

Example 3.13 (A satellite) Consider a satellite orbiting around the Earth. The Earth is $M = 6 \times 10^{24}$ kg, and suppose the satellite is at a distance r0 = 9,000 km from the center of the Earth, with initial speed of v0 = 9 km/sec and angle ϕ0 = ∠~r0~v0 = 120◦. Then, using our results above, the semi-major axis of its orbit will be $a = 50 \times 10^3$ km, a very large number relative to the initial position. To understand this huge displacement from its initial position, notice that $v_{\text{esc}} = \sqrt{2MG/r_0} = 9.4$ km/sec, so the initial velocity of the object is just shy of the speed it would take for the satellite to fly out of orbit. One orbit is completed in T = 31 hours.

On the other hand, it can be checked that $r_P = 6.6 \times 10^3$ km and $v_P = 10.7$ km/sec, while $r_A = 9.3 \times 10^4$ km and $v_A = 0.75$ km/sec. That is, the object

would pass very close to the Earth (at the perigee) at high speed, then lose

speed until it reached the furthest point, and pick it up again on its way back.
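The numbers in Example 3.13 can be reproduced with a few lines of code. The sketch below combines equation (3.5) with equations (3.6)-(3.7): substituting v = h/r (with h = v0 r0 sin ϕ0) into (3.7) gives the quadratic r² − 2ar + ah²/(MG) = 0, whose two roots are rP and rA. The function name, the dictionary keys, and the value of G are assumptions made for this illustration, and r0 is taken as the distance to the center of the Earth.

```python
import math

G = 6.674e-11  # gravitational constant in SI units (assumed value)

def orbit_from_initial_conditions(M, r0, v0, phi0_deg):
    """Semi-major axis, period, perigee and apogee from (r0, v0, phi0).

    Only valid for bound orbits, i.e. v0 below the escape speed at r0.
    """
    mu = G * M
    a = mu * r0 / (2 * mu - v0**2 * r0)               # equation (3.5)
    period = 2 * math.pi * math.sqrt(a**3 / mu)        # Result 3.8
    h = v0 * r0 * math.sin(math.radians(phi0_deg))     # angular momentum per unit mass
    root = math.sqrt(a**2 - a * h**2 / mu)             # from r^2 - 2 a r + a h^2 / mu = 0
    r_p, r_a = a - root, a + root
    return {
        "a": a, "T": period,
        "r_p": r_p, "v_p": h / r_p,                    # equation (3.6) at the perigee
        "r_a": r_a, "v_a": h / r_a,                    # and at the apogee
        "v_esc": math.sqrt(2 * mu / r0),
    }

orbit = orbit_from_initial_conditions(M=6e24, r0=9.0e6, v0=9.0e3, phi0_deg=120.0)
for name, value in orbit.items():
    print(f"{name}: {value:.4g}")
```

By construction, the two roots satisfy rA + rP = 2a, and rA vA = rP vP follows from using the same h at both points.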

Next, we study orbital changes, that is, situations in which an object in orbit is subjected to a force along the orbit (e.g. a rocket in orbit that fires its

engines). If the object is initially in a circular orbit, then clearly the thrust

will produce kinetic energy, increase the velocity, and change the orbital

path from a circle to an ellipse. Let’s see an example:

Example 3.14 (Two astronauts) Consider two astronauts, A and B, or-

biting Earth along the same circular orbit of radius R. Let A and B also

index their positions. Astronaut A is a fraction f of the circumference of



the circle behind astronaut B. Since the perimeter of the circumference is

2πR, this means that the distance along the arc between A and B is f2πR

in one direction, and (1− f)2πR in the other direction.

Suppose A holds a tennis ball and wants to throw it in such a way that astronaut B catches it exactly as B passes through position A (the point of the throw) at a later time, using an impulse that is exactly tangential to the circumference. The ball will experience an elliptical orbit. For a catch to occur, the throw has to satisfy $T_{\text{ball}} = (1-f)\,T_{\text{astronauts}}$. Since the astronauts are orbiting in circular motion and the ball will describe an ellipse, we have $2\pi\sqrt{\dfrac{a^3}{MG}} = (1-f)\,2\pi\sqrt{\dfrac{R^3}{MG}}$ or, simplifying:

$a = (1-f)^{2/3} R$

In words, A will have to ensure that her throw describes an ellipse with semi-major axis equal to $a = (1-f)^{2/3} R$. Now we can work out the speed at which the throw will have to be made. The total energy satisfies $-\dfrac{mMG}{2a} = \tfrac{1}{2}mv_{\text{ball}}^2 - \dfrac{mMG}{R}$, so we can solve for vball because a is known. Since A is moving at some speed vA herself, the speed she will have to give the ball is vball − vA.22

This is only one of many solutions, as the catch could occur at some other

point after the ball has already made nball revolutions and astronaut B has

already made nB revolutions. A similar derivation delivers $a = \left(\dfrac{n_B - f}{n_{\text{ball}}}\right)^{2/3} R$.

For instance (nB, nball) = (1, 1) is trivially a solution (the one derived

above).
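To make Example 3.14 concrete, the following sketch computes the required semi-major axis and the speed that A must give the ball relative to herself. The function name `throw_for_catch`, the value of G, and the orbit radius and fraction f used below are hypothetical choices for illustration; the formulas are the ones derived in the example, with v_ball obtained from the energy equation as v² = MG(2/R − 1/a).

```python
import math

G = 6.674e-11  # gravitational constant in SI units (assumed value)

def throw_for_catch(M, R, f, n_B=1, n_ball=1):
    """Semi-major axis of the ball's orbit and the throw speed relative to A.

    f is the fraction of the circle by which A trails B; n_B and n_ball are the
    numbers of revolutions completed by B and by the ball before the catch.
    Requires a > R/2 so that the ball's speed at the throw point is real.
    """
    mu = G * M
    a = ((n_B - f) / n_ball) ** (2 / 3) * R
    v_astronaut = math.sqrt(mu / R)               # circular orbital speed
    v_ball = math.sqrt(mu * (2 / R - 1 / a))      # from -mu/(2a) = v^2/2 - mu/R
    return a, v_ball, v_ball - v_astronaut

# Hypothetical case: orbit radius 7,000 km, A trails B by a quarter of the circle.
a, v_ball, delta_v = throw_for_catch(M=6e24, R=7.0e6, f=0.25)
print(f"a = {a / 1e3:.0f} km, v_ball = {v_ball:.0f} m/s, throw speed = {delta_v:.0f} m/s")
```

For (nB, nball) = (1, 1) and 0 < f < 1, the semi-major axis is smaller than R, so the relative throw speed comes out negative: A has to throw the ball backward.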

22 It is possible that this number be negative. This would mean that A would have to throw the ball backward for B to catch it.

Chapter 4

Stability and Elasticity

4.1 Static Equilibrium

In the previous chapters we have argued that the motion of bodies requires the presence of forces and/or torques. By Newton’s Laws, when the

(vector-)sum of forces acting on a particle is zero, the particle will be at

rest.1 For a collection of many particles, it is sufficient to consider the forces

that alter the object’s center of mass. If the sum of forces on the center of

mass is zero, the collection of particles as a whole remains at rest.

For rigid bodies, forces can also be treated as acting upon the object’s

center of mass. However, whether or not the body stays at rest ultimately

depends on where these forces are applied. In general, if the vectorial sum

of forces is zero, this may not guarantee that the body will be at rest. The

reason is that a torque (a rotational force) may be at play. For example, if

we pull two corners of a table in opposite directions with equal force, the sum

of forces is zero, but the table is not at rest (frictional forces permitting)

because a torque makes it rotate around the axis that goes through its center

of mass.

Therefore, for an object to be at rest, we require that the sum of all

forces as well as the sum of all torques be zero. This is the notion of a static

1 Alternatively, it may be in uniform linear motion, but we can always change the reference frame so it remains at rest.


equilibrium.

Definition 4.1 (Static Equilibrium) A body is in static equilibrium if:

• Translational stability: All the forces applied on the body cancel out.

That is: ∑~F = ~0

• Rotational stability: All the torques about any point Q in the body

cancel out. That is: ∑~τQ = ~0

for all reference points Q.

By definition, if an object is in a static equilibrium, it has no linear

acceleration (a = 0) nor angular acceleration (α = 0), so it remains at rest.
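The two conditions of Definition 4.1 can be checked mechanically for forces in a plane. The sketch below is an illustration only: the function names and the table geometry are hypothetical. It reproduces the table example above, where two equal and opposite pulls at opposite corners give zero net force but a nonzero torque.

```python
def net_force_and_torque(forces, points, ref=(0.0, 0.0)):
    """Sum 2D forces and their torques (z components) about the point ref.

    forces: list of (Fx, Fy); points: list of (x, y) where each force is applied.
    """
    fx = sum(f[0] for f in forces)
    fy = sum(f[1] for f in forces)
    torque_z = sum((p[0] - ref[0]) * f[1] - (p[1] - ref[1]) * f[0]
                   for f, p in zip(forces, points))
    return (fx, fy), torque_z

def in_static_equilibrium(forces, points, ref=(0.0, 0.0), tol=1e-9):
    """True if both the net force and the net torque about ref vanish."""
    (fx, fy), torque_z = net_force_and_torque(forces, points, ref)
    return abs(fx) < tol and abs(fy) < tol and abs(torque_z) < tol

# Equal and opposite pulls at opposite corners of a (hypothetical) square table.
pulls = [(1.0, 0.0), (-1.0, 0.0)]
corners = [(1.0, 1.0), (-1.0, -1.0)]
print(net_force_and_torque(pulls, corners))   # zero net force, nonzero torque
print(in_static_equilibrium(pulls, corners))

# The same pulls applied along a common line produce no torque.
print(in_static_equilibrium(pulls, [(1.0, 0.0), (-1.0, 0.0)]))
```

When the net force is zero, the net torque is the same about every reference point, so checking a single convenient point is enough.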

Example 4.1 (A Ladder) Consider a ladder that is reclined on a wall,

forming a right triangle (Figure 4.1, panel (a)). Suppose there is no friction

at point P , and let the coefficient of static friction at point Q be denoted by

µ (whose magnitude depends on the attributes of the ladder and the floor on

which it stands). Let C be the ladder’s center of mass. The ladder (the line

PQ) has length ℓ and mass m. If the angle θ is too small, the ladder will

slide due to gravity. If it is large enough, it will stay upright, and be in a

static equilibrium.

The force of gravity acts upon the ladder’s center of mass. On the other hand, at points Q and P , the floor and the wall push against the ladder with normal forces FNQ and FNP , respectively. Finally, a frictional force Ff at Q keeps the ladder upright.

For the ladder to be in static equilibrium, the sum of all forces must be

zero in both x and y dimensions. Thus:

$\sum F_x = F_{NP} - F_f = 0$

$\sum F_y = F_{NQ} - mg = 0$

Figure 4.1: A ladder (line PQ) on the floor, reclined on a wall. Panel (a): a ladder reclined on a wall. Panel (b): a mass M on the ladder.

Moreover, the sum of all torques must be zero about any point of reference (e.g. P , C, Q, or any other point for that matter). The most convenient point for the calculation of the torque is Q, because for Q the forces ~FNQ and ~Ff do not contribute to the torque (their position vectors are trivially zero, as those forces act on Q itself).

The torques about point Q are due to the normal force on P and the gravitational force on C. Thus, we require that the total sum of torques is zero, or:

$\sum \vec{\tau}_Q = \vec{F}_{NP} \times \vec{r}_P + \vec{F}_{grav} \times \vec{r}_C = \vec{0}$

where ~rP and ~rC are the position vectors from point Q to points P and C, respectively. For point P , |~rP | = ℓ (as ~rP traces out the hypotenuse of the triangle) and ∠~rP ~FNP = θ, so $\vec{F}_{NP} \times \vec{r}_P = (F_{NP}\,\ell \sin\theta)\,\vec{e}_+$, where ~e+ is a unit vector from point Q that is orthogonal to the (x, y) plane and in the plus direction (of the z axis, i.e. out of the page). For point C, $|\vec{r}_C| = \tfrac{1}{2}\ell$ (as the center of mass is exactly half-way between points P and Q) and $\angle \vec{r}_C \vec{F}_{grav} = \tfrac{\pi}{2} - \theta$, so $\vec{F}_{grav} \times \vec{r}_C = \tfrac{1}{2}(mg\ell \cos\theta)\,\vec{e}_-$, where ~e− is a unit vector from Q that is orthogonal to (x, y) but into the minus direction (i.e. into the page).2 Therefore, in magnitudes, we require $F_{NP}\,\ell \sin\theta - \tfrac{1}{2}mg\ell \cos\theta = 0$, or:

2 We have used that $\sin\left(\tfrac{\pi}{2} - \theta\right) = \cos\theta$, a direct consequence of equation (0.3). For another way to see this, note that the angle ∠~rC ~Fgrav is defined by the sides PQ and


$F_{NP} = \tfrac{1}{2}mg\cot\theta = F_f$

where $\cot\theta \equiv \dfrac{\cos\theta}{\sin\theta} = \dfrac{1}{\tan\theta}$ is the reciprocal of the tangent, often called the cotangent, and the second equality comes from ∑Fx = 0 above.

By stability, the ladder does not slide. Therefore, the frictional force

cannot exceed the product of µ (the coefficient of static friction at Q) times

the normal force (recall Definition 1.18), or $F_f \le \mu F_{NQ} = \mu mg$. Using the result above, $\tfrac{1}{2}mg\cot\theta \le \mu mg$, that is $\cot\theta \le 2\mu$. Thus, the critical angle satisfies:

$\tan\theta^* = \dfrac{1}{2\mu}$ (4.1)

Thus, the condition that ensures that the ladder will be stable is θ ≥ θ∗. The smaller µ is (e.g. a slippery floor), the more unstable the system is, in

the sense that θ∗ is high: only if the ladder is at a high angle relative to the

slippery floor will it remain at rest.

Now suppose that a body of mass M is placed at some point D between Q

and P (see Figure 4.1, panel (b)).3 For instance, suppose a person of mass

M starts going up the ladder. Let d be the distance between Q and D. Now,

the sum of forces is:

$\sum F_x = F_{NP} - F_f = 0$

$\sum F_y = F_{NQ} - (M + m)g = 0$

on each dimension. In magnitudes, the torque condition about point Q is:

$F_{NP}\,\ell \sin\theta - \tfrac{1}{2}mg\ell \cos\theta - Mgd\cos\theta = 0$

where the last term is new and due to the heavy body at point D.

2 (cont.) OP (where O is the origin), and the ratio of the two equals cos θ by Definition 0.6.

3 In the figure, D is placed between Q and C, but this need not be the case. As we shall see shortly, the stability of the system is highly affected by the location of D relative to C.

Therefore, $F_{NP} = \dfrac{g\cos\theta}{\ell \sin\theta}\left(\tfrac{1}{2}m\ell + Md\right)$, or:

$F_{NP} = g\cot\theta\left(\tfrac{1}{2}m + \dfrac{Md}{\ell}\right) = F_f$

where the second equality comes from ∑Fx = 0. Again, for the ladder not to slide we need $F_f \le \mu F_{NQ} = \mu(m + M)g$, so $g\cot\theta\left(\tfrac{1}{2}m + \dfrac{Md}{\ell}\right) \le \mu(m + M)g$. Simplifying, we find the critical angle to satisfy:

$\tan\theta^{**} = \tan\theta^{*}\,\underbrace{\left(\dfrac{m + \frac{2Md}{\ell}}{m + M}\right)}_{\lessgtr 1}$ (4.2)

where $\tan\theta^* = \dfrac{1}{2\mu}$ by equation (4.1). Therefore, whether the ladder becomes more or less stable when an object of mass M is added depends on whether the term in parentheses in equation (4.2) is greater or lower than one. Thus:

• If d ≥ ℓ/2, then θ∗∗ ≥ θ∗.

In words, if the object is placed above the ladder’s center of mass (i.e. between points P and C), then the ladder becomes more unstable, in that it will remain at rest for a smaller range of angles. For instance, if the ladder was reclined with θ = θ∗ and the object is placed above the center of mass, the ladder will always topple.

• If d < ℓ/2, then θ∗∗ < θ∗.

In words, if the object is placed below the ladder’s center of mass (i.e. between points Q and C), the ladder becomes more stable, in that it remains at rest for a wider range of angles. For instance, if the ladder is reclined with θ = θ∗∗ < θ∗ and carries no object, it will topple. But once the object is placed, it regains stability and stays upright.

The critical observation is that, while the friction Ff increases with the distance d, the maximum friction, µFNQ (i.e. µ(m + M)g) is independent of d. If the person starts climbing at point Q (i.e. d = 0) and goes up the ladder, the friction will increase but the maximum friction will stay constant, so the ladder will remain stable. But once the person gets past the ladder’s

center of mass (say, the central rung), there will be a point where the ladder

will slide and fall. This point will be the center of mass itself if θ = θ∗, or

a little bit further up if θ > θ∗.
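The ladder results can be explored numerically. The sketch below evaluates the critical angle of equation (4.2), which reduces to equation (4.1) when M = 0, and solves the sliding condition $g\cot\theta\left(\tfrac{1}{2}m + Md/\ell\right) \le \mu(m+M)g$ for the largest distance d the person can climb. The function names and all numerical values are hypothetical.

```python
import math

def critical_angle(mu, m=1.0, M=0.0, d=0.0, length=1.0):
    """Critical angle theta** of equation (4.2), in radians.

    With M = 0 the correction factor is one and this is theta* of equation (4.1).
    """
    tan_star = 1 / (2 * mu)
    factor = (m + 2 * M * d / length) / (m + M)
    return math.atan(tan_star * factor)

def max_climb_distance(theta, mu, m, M, length):
    """Largest d (measured from Q) for which the ladder does not slide.

    Solves cot(theta) * (m/2 + M*d/length) = mu * (m + M) for d and clips the
    result to the physical range [0, length].
    """
    d = length * (mu * (m + M) * math.tan(theta) - m / 2) / M
    return min(max(d, 0.0), length)

# Hypothetical case: mu = 0.5, a 10 kg ladder of length 4 m, a 70 kg person.
theta_star = critical_angle(mu=0.5)
print(f"theta* = {math.degrees(theta_star):.1f} degrees")
print(f"max d at theta*: {max_climb_distance(theta_star, 0.5, 10.0, 70.0, 4.0):.2f} m")
```

At θ = θ∗ the limiting distance is exactly half the ladder's length, in agreement with the statement that the ladder starts to slide once the person passes the center of mass.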

Example 4.2 (A Pulley) Take two objects, of masses M and m, hanging

from the same string which is itself wrapped around a cylinder of radius R.

This structure is called a single pulley (e.g. the result of removing the incline

in Figure 1.9).

On each string there is a tension T , given by T1 = mg and T2 = Mg.

Letting M > m, so that T2 > T1, in the absence of friction between the rope

and the cylinder we would see the system accelerate: the more massive body

would accelerate downward as the less massive body accelerates upward.

Suppose, however, that the contact between the string and the cylinder

is frictional. Moreover, suppose that the cylinder is fixed and cannot rotate.

Since T2 > T1, the system may start slipping and accelerate as in the fric-

tionless case. For every bit of rope that is in contact with the cylinder, a

frictional force exists pointing in the opposite direction. In particular, if the

static friction coefficient is µ, then one can show that a sufficient condition

for stability is:4

$\dfrac{T_2}{T_1} \le e^{\mu\theta_0}$

where θ0 is the angle of the arc on the cylinder’s circumference where

the rope and the cylinder’s surface are in contact. (Surprisingly, this factor

is completely independent of the pulley’s radius.)

If the force of gravity is present, the two bodies will hang vertically down, so θ0 ≈ π (i.e. 180◦), meaning $T_2/T_1 \approx e^{\mu\pi}$. We can also wrap the string

around the cylinder any number of times we want (provided the string is long

enough). For example, θ0 = 6π means that the string is wrapped 3 times

around the cylinder; θ0 = 12π means 6 turns. If, for instance, µ = 0.2, then

4 This is stated without proof. To derive this result, we must integrate the frictional force for each infinitesimal distance over the total length of the arc on the cylinder’s circumference which is in contact with the rope.

6 turns means $T_2/T_1 \approx 2{,}000$. This illustrates why pulleys are used to balance

heavy objects up in the air: by holding on to the lighter object (e.g. grabbing

one end of the string), the tension created on the heavier object is so strong

that one can balance very heavy objects up in the air with hardly any effort,

provided the rope is wrapped around the pulley enough times.
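The exponential growth of the tension ratio in Example 4.2 is easy to tabulate. The following is a small sketch; the function name `max_tension_ratio` is hypothetical, and µ = 0.2 is the value used in the example.

```python
import math

def max_tension_ratio(mu, turns):
    """Largest ratio T2/T1 that friction can hold, exp(mu * theta0),
    for a rope wrapped `turns` times around a fixed cylinder."""
    theta0 = 2 * math.pi * turns   # contact angle in radians
    return math.exp(mu * theta0)

# Half a turn corresponds to the rope simply hanging over the cylinder.
for turns in (0.5, 1, 3, 6):
    print(f"{turns} turns: T2/T1 up to {max_tension_ratio(0.2, turns):.1f}")
```

Each additional turn multiplies the ratio by the same factor, which is why a few extra wraps make such a large difference.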

Example 4.3 (Hanging from a pin) Consider hanging an object of mass

m from some point P (for example, a piece of paper on a wall using a pin).

Figure 4.2 shows an example, though the shape of the object is not relevant.

Let C be the center of mass, separated by distance b from P .

Figure 4.2: An object hanging from P with center of mass C. (Labels in the figure: P, C, mg, ~rP.)

The torque relative to point P is ~τP = ~rP × ~Fgrav, of magnitude τP =

bmg sin θ, where θ is the angle between ~rP and ~Fgrav. Clearly, this object is

going to rotate about P , so the torque’s magnitude is τP = IPα, where IP is

the moment of inertia relative to P , and α is the angular acceleration.

The object will be in a stable equilibrium if the sum of all forces and that

of all torques (relative to any point) are both zero. Note that the only case

in which that can happen is if P and C are along the same vertical line.

Indeed, in this case, there is a downward force on C equal to mg, and (by

Newton’s Third Law) a force of the same magnitude and opposite direction

on P , so forces are zero. There is also no torque, because ~rP and ~Fgrav are

parallel, so ~τP = ~rP × ~Fgrav = ~0.

For example, a pendulum lies at rest in a stable equilibrium when the point about which it rotates is aligned along a vertical line with the center of mass (the center of the bob, if the bob is solid).

This last example also gives us a recipe for finding the center of mass

of an object in practice. Suspend the object from some arbitrary point,

wait for it to become stable, and mark a straight vertical line going down

along the y direction. Repeat the exercise from another arbitrary point of

suspension. The intersection of the two resulting straight lines will then

mark the location of the center of mass.

Example 4.4 (The Tightrope Walker) Consider a walker on a tightrope.

As it is, the point of suspension (the contact point between the rope and

the walker’s feet) is below the system’s center of mass, which is about the

walker’s chest. However, the walker could balance two weights, hanging down

from his/her hands along long strings and toward the floor (suppose the

tightrope is highly elevated). By adding mass onto the system, the center of

mass would now shift below the tightrope, so the walker can keep his/her

balance.

This is one reason why tightrope walkers carry a pole while doing their

stunt: the pole is often curved down, which adds weight and shifts the center

of mass down closer to (and ideally below) the suspension point. Another

reason is that the pole helps add rotational inertia to the system (as in the

case of the spinning figure skater, Example 3.4), which helps keep balance.

Example 4.5 (The Lever Law) Consider a lever, that is, a rigid rod of

mass m that pivots about a fixed hinge (see Figure 4.3). Suppose that two

forces, ~F1 and ~F2, are applied on each end of the rod.5 The center of mass

is at C, it is subject to a gravitational force mg, and di denotes the distance

between C and the force i = 1, 2. Suppose that the lever is pivoted about C.

5 These forces would be, for example, normal forces with magnitude M1g and M2g, respectively, if two bodies of masses (M1,M2) were placed one on each side of the lever. We will analyze the more general case where the forces need not be normal (i.e. perpendicular to the surface of the rod).

For a (one-dimensional) solid rod of length ℓ, the center of mass is located at the middle: d1 = d2 = ℓ/2.

Figure 4.3: The generalized lever.

The system will be at equilibrium if the sum of all forces (including gravity and the force exerted by the pivot) is zero and the sum of torques about any point is zero. Decomposing the forces into their (x, y) components in the usual way, we have:

$\sum F_x = 0 = F_1\cos\theta_1 - F_2\cos\theta_2$ (4.3)

$\sum F_y = 0 = F_1\sin\theta_1 + F_2\sin\theta_2 - mg + F_{\text{pivot}}$ (4.4)

where θi, i = 1, 2, is the angle between the position vector ~ri,C (i.e.

relative to the center of mass) and ~Fi, and Fpivot is the magnitude of the

normal force on the center of mass.

On the other hand, the torque about point C (the pivot point) has several

components. The torque due to ~Fpivot and that due to gravity are both zero,

because our point of reference is precisely the point C where these forces are

applied. Thus, the torque is only due to the ~F1 and ~F2 forces:

~τC = ~F1 × ~r1,C + ~F2 × ~r2,C

The first torque equals ~τ1,C ≡ ~F1 × ~r1,C = F1d1 sin θ1~e+, where ~e+ is

orthogonal to (x, y) and pointing toward positive values of z (i.e. pointing

out of the page). The second torque equals ~τ2,C ≡ ~F2×~r2,C = F2d2 sin θ2~e−,

where ~e− is orthogonal to (x, y) and pointing toward negative values of z (i.e.

pointing into the page). In magnitude, we have F1d1 sin θ1 − F2d2 sin θ2 = 0.


Therefore:

d1F1 sin θ1 = d2F2 sin θ2 (4.5)

Equations (4.3)-(4.4)-(4.5) then guarantee that, if we pivot the lever about its center of mass, the lever will be at rest (it will not move linearly or rotate). Together, these equations are often called the Lever Laws.

For example, if two bodies of masses M1 and M2 are placed on each end

of a lever that pivots about its center of mass, then Fi = Mig and θi = π/2

(so that sin θ1 = sin θ2 = 1), and the condition for stability becomes:

d1M1 = d2M2

with Fpivot = (m + M1 + M2)g, since the pivot must support the weight of the lever and of the two bodies.
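The torque condition (4.5) is simple to check in code. The sketch below is only an illustration; the function names and the masses and distances are hypothetical.

```python
import math

def lever_balanced(F1, d1, theta1, F2, d2, theta2, rel_tol=1e-9):
    """True if the torque condition (4.5), d1*F1*sin(theta1) = d2*F2*sin(theta2), holds."""
    return math.isclose(d1 * F1 * math.sin(theta1),
                        d2 * F2 * math.sin(theta2), rel_tol=rel_tol)

def balancing_mass(M1, d1, d2):
    """Mass M2 at distance d2 that balances M1 at distance d1 (from d1*M1 = d2*M2)."""
    return M1 * d1 / d2

g = 9.81
M2 = balancing_mass(M1=30.0, d1=2.0, d2=1.0)   # hypothetical seesaw
print(f"M2 = {M2} kg")
print(lever_balanced(30.0 * g, 2.0, math.pi / 2, M2 * g, 1.0, math.pi / 2))
```

A body placed twice as far from the pivot balances one that is twice as heavy, which is the familiar seesaw rule.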

Example 4.6 (Standing on an incline) A person of mass m stands on

an incline at an angle θ relative to the floor.6 The person’s legs are separated

by a distance d, with one foot uphill and the other one downhill. The center

of mass is at a distance h above the ground, perpendicular to the hillside, and

it is midway between the person’s feet (around his/her chest). We assume

that the coefficient of friction is high enough for the person not to slide down.

The center of mass is subject to a force mg, and each leg has a normal

force FN,L and FN,R (for left and right legs). These forces are perpendicular

to the incline (our x axis). Frictional forces Ff,L and Ff,R are pointing

uphill from each leg, preventing the person from sliding.

Stability requires that both (x, y) components of these forces add up to

zero. If positive values of x run “uphill”, then:

$\sum F_x = 0 = F_{f,L} + F_{f,R} - mg\sin\theta$ (4.6)

$\sum F_y = 0 = F_{N,L} + F_{N,R} - mg\cos\theta$ (4.7)

6 As in Example 1.13, let the coordinate system be “tilted” so that the direction of the x axis is parallel to the incline, not the floor.


For the torques, let us again pick the center of mass as the reference

point, for convenience. At the center of mass, the force of gravity does not

contribute to the torque (its position relative to C being the zero vector). The

torque is due only to the two normal forces and the two frictional forces.

For the left leg, and always relative to the center of mass C, we have:

$\vec{\tau}_{L,C} = \underbrace{\vec{r}_{L,C} \times \vec{F}_{N,L}}_{\text{Normal force}} + \underbrace{\vec{r}_{L,C} \times \vec{F}_{f,L}}_{\text{Frictional force}} = \left(r_L F_{N,L}\sin\alpha_N\right)\vec{e}_+ + \left(r_L F_{f,L}\sin\alpha_f\right)\vec{e}_+$

where $r_L \equiv |\vec{r}_{L,C}| = \sqrt{h^2 + d^2/4}$,7 (αN , αf ) are the angles between ~rL,C (the position vector of the left leg relative to the center of mass) and ~FN,L and ~Ff,L, respectively, and ~e+ is the unit vector orthogonal to (x, y) which is pointing toward z+ (i.e. out of the paper) because, if the person is facing us, his/her left leg is uphill and therefore αN < 90◦. Next, by definition of the sine function, $r_L\sin\alpha_N = \tfrac{d}{2}$ and $r_L\sin\alpha_f = h$ (note that ~r, ~F and the height h form a right triangle!). Therefore, we can write the torque due to the left leg as $\vec{\tau}_{L,C} = \left(\tfrac{d}{2}F_{N,L} + hF_{f,L}\right)\vec{e}_+$. The magnitude of the torque on the left leg is, therefore:

$\tau_{L,C} = \dfrac{d}{2}F_{N,L} + hF_{f,L}$

The total torque on the right leg, similarly, is:

$\vec{\tau}_{R,C} = \dfrac{d}{2}F_{N,R}\,\vec{e}_- + hF_{f,R}\,\vec{e}_+$

Notice that the two components of the right-leg torque now point in opposite z directions. Therefore, the magnitude of the torque on the right leg is:

$\tau_{R,C} = -\dfrac{d}{2}F_{N,R} + hF_{f,R}$

7 If the center of mass is at height h and midway between the legs, then rL is the hypotenuse of the triangle with sides h and d/2 (and we use Pythagoras’ theorem).


The total torque must be zero, so τL,C + τR,C = 0, or:

$\dfrac{d}{2}\left(F_{N,L} - F_{N,R}\right) + h\left(F_{f,L} + F_{f,R}\right) = 0$ (4.8)

From equation (4.8), we have $F_{N,R} - F_{N,L} = \dfrac{2h}{d}\left(F_{f,L} + F_{f,R}\right)$. Using equation (4.6), which gives $F_{f,L} + F_{f,R} = mg\sin\theta$, this yields:

$F_{N,R} - F_{N,L} = \dfrac{2h}{d}\,mg\sin\theta$

Adding this to equation (4.7) then gives FN,R (after dividing by two), and subtracting it from (4.7) will give FN,L. The result is:

$F_{N,R} = mg\left(\tfrac{1}{2}\cos\theta + \dfrac{h}{d}\sin\theta\right)$

$F_{N,L} = mg\left(\tfrac{1}{2}\cos\theta - \dfrac{h}{d}\sin\theta\right)$

The moment at which the person will start to roll over and fall downhill occurs when the normal force on the upper foot (the left foot) is zero. From our results, this is true whenever $\tfrac{1}{2}\cos\theta = \dfrac{h}{d}\sin\theta$, that is:

$d^* = 2h\tan\theta$

The distance d∗ marks the minimum distance that the person’s feet have

to be separated for him/her not to rotate and fall, given a height h and an

inclination angle θ.
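The results of Example 4.6 can be checked numerically. The sketch below evaluates the two normal forces and the minimum stance d∗; the function names and the person's mass, height of the center of mass, and slope angle are hypothetical values chosen for illustration.

```python
import math

def normal_forces(m, h, d, theta, g=9.81):
    """Normal forces (F_NL, F_NR) on the uphill (left) and downhill (right) foot."""
    F_NR = m * g * (0.5 * math.cos(theta) + (h / d) * math.sin(theta))
    F_NL = m * g * (0.5 * math.cos(theta) - (h / d) * math.sin(theta))
    return F_NL, F_NR

def min_stance(h, theta):
    """Minimum foot separation d* = 2 h tan(theta) below which the person tips over."""
    return 2 * h * math.tan(theta)

# Hypothetical case: 70 kg person, center of mass 1 m above a 20-degree slope.
theta = math.radians(20.0)
d_star = min_stance(h=1.0, theta=theta)
F_NL, F_NR = normal_forces(m=70.0, h=1.0, d=1.0, theta=theta)
print(f"d* = {d_star:.2f} m; with d = 1 m: F_NL = {F_NL:.0f} N, F_NR = {F_NR:.0f} N")
```

The downhill foot always carries more of the weight, and the uphill normal force drops to zero exactly when d = d∗.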

4.2 Stress, Strain, and Elasticity

In this section, we introduce the concept of elasticity. In physics, elastic-

ity is an object’s ability to return to its original shape and size after being

subject to a distorting force. A solid object will deform while being un-

der such force, but if the material is sufficiently elastic it will return to its

original state.



The elasticity of an object depends on the materials of the object as well

as external conditions (e.g. temperature), and is generally described by a

stress-strain curve. Let us first define these concepts:

Definition 4.2 (Stress) The internal force that neighboring particles of a

solid object exert on each other. It is denoted by ~σ.

In particular, the stress between two particles is measured as the force

that a particle applies on the other particle across a small surface of the

object, and it can have any direction relative to said surface. If the distorting

force (of magnitude F ) causes a deformation, then stress is given by:

$\sigma \equiv \dfrac{F}{A}$ (4.9)

in magnitude, where A is the so-called cross-sectional area, i.e. the area

of the object’s cross section that is perpendicular to the direction of the force

applied. Here, σ is in Newtons per unit area, so it has the dimensions of

pressure.

Definition 4.3 (Strain) A measure of deformation representing the dis-

placement between two particles in a solid object, relative to its initial state.

It is denoted by ε.

In other words, while stress is a measure of internal forces between pairs

of particles, strain measures the proportional change in position of the object. Obviously, the deformation (weakly) increases with the stress. If ℓ is the length of the object when in a relaxed state, and ∆ℓ is the displacement it suffers under the distorting force (both relative to the same fixed point of reference), then strain is defined by:8

$\varepsilon \equiv \dfrac{\Delta\ell}{\ell}$ (4.10)

Therefore, ε is a measure of proportional deformation, and it is dimensionless.

8 By convention, ∆ℓ is positive (negative) if the object is being stretched (compressed).


The stress-strain curve (Figure 4.4) is then the function ε(σ), i.e. it is

the graph of the strain ε of an object under various levels of stress σ.9 The

curve can be used to determine the properties of the materials of the object,

including the object’s modulus of elasticity (E):

Definition 4.4 (Modulus of Elasticity) The modulus of elasticity is de-

fined by:

$E(\sigma) \equiv \dfrac{\sigma}{\varepsilon(\sigma)}$

That is, it is the slope of the stress-strain curve.

Therefore, the modulus of elasticity of an object is a measure of the ob-

ject’s resistance to being deformed elastically (i.e. non-permanently) when

stress is applied to it. A stiffer object will have a higher elasticity modulus.

For many objects, if the deformation force is small enough, the elasticity

modulus is constant in the strain. That is, the stress-strain curve is linear for

low levels of stress. For these so-called linear-elasticity objects, the elasticity

modulus is called Young’s modulus. Let us look at these specific cases, and

derive Young’s modulus.

Example 4.7 (Linear Elasticity) Consider a body that behaves according

to Hooke’s Law (Principle 1.3), e.g. a spring. When the body is in a relaxed

state, we normalize its position to x = 0 and let ℓ be the object’s length. Suppose a linear force of magnitude F is applied on the object, thereby stretching it until extended to some position x = ∆ℓ.

We make the following observations:

• By Hooke’s Law (i.e. F = −kx), the displacement is proportional to

the force applied: if the force is twice as strong, the displacement is

doubled. Thus, ∆ℓ ∝ F.

9 By convention, strain is plotted as the independent variable (on the x axis), although we really should think of the stress-strain curve as showing strain levels under various stress forces, and not the other way around.



• Clearly, if the object is twice as long, then the amount of displacement

is doubled as well, for a given force. Thus, ∆ℓ ∝ ℓ.

• Finally, the amount of displacement is inversely proportional to the

cross-sectional area. If the force is applied over a certain area A, then

increasing the area by a factor of two will decrease the displacement

by the same factor. Thus, ∆ℓ ∝ 1/A.10

Thus, we have found that the displacement ∆ℓ is proportional to the force, proportional to the length, and inversely proportional to the cross-sectional area over which the force is applied, or ∆ℓ ∝ Fℓ/A.

Using equations (4.9)-(4.10), we have obtained ε ∝ σ. The constant of

proportionality is the elasticity modulus, given by:

$$Y \equiv \frac{F/A}{\Delta\ell/\ell} = \frac{\sigma}{\varepsilon}$$

The elasticity modulus in this case is denoted Y for Young’s modulus.11

A large number of materials behave in this manner, including most metals and some other materials such as rubber. When the strain is small, the elasticity modulus is constant. If released (e.g. a spring is stretched and then released), the object eventually returns to its original position. In particular, since F = (YA/ℓ)∆ℓ (by definition), the released object will describe a harmonic oscillation with oscillation constant:

$$k = \frac{YA}{\ell} \tag{4.11}$$

so that its position will be x = xmax cos(ωt+ϕ), with angular frequency

10 For instance, consider a rod of cross-sectional area A and length ℓ. If two identical such rods are joined end to end, each one carries the full force F and stretches by ∆ℓ, so the total extension doubles, showing ∆ℓ ∝ ℓ. If instead the two rods are placed side by side in parallel (equivalently, a single rod of cross-sectional area 2A is used), each counteracts F with half the force, so the displacement is (1/2)∆ℓ for the same F, showing ∆ℓ ∝ 1/A.

11 For example, consider a rod of radius r = 0.5 cm, cross-sectional area A = 8 × 10⁻⁵ m², length ℓ = 1 m, and mass M = 500 kg. Suppose a force of F = 5,000 N is applied. If the rod is made of steel (Y = 20 × 10¹⁰ N/m²), then the rod will be extended by ∆ℓ ≈ 0.3 mm. If the rod is made of nylon (Y = 0.36 × 10¹⁰ N/m²), the extension is ∆ℓ ≈ 17 mm. Indeed, ropes are much more elastic than metal rods.



ω = √(k/m) = √(YA/(ℓm)), period of oscillation T = 2π/ω = 2π√(ℓm/(YA)), and frequency 1/T Hz.
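As a rough numerical check of the linear-elasticity formulas, the sketch below plugs in the rod from footnote 11. Treating the 500 kg as a load oscillating at the end of a massless rod is an assumption made here for illustration only, and the function names are hypothetical.

```python
import math

def extension(force, length, area, youngs_modulus):
    """Extension (m) in the linear regime: delta_l = F * l / (A * Y)."""
    return force * length / (area * youngs_modulus)

def spring_constant(youngs_modulus, area, length):
    """Effective Hooke constant k = Y * A / l, equation (4.11)."""
    return youngs_modulus * area / length

def period(mass, k):
    """Period T = 2*pi*sqrt(m/k) of the harmonic oscillation."""
    return 2 * math.pi * math.sqrt(mass / k)

# Values quoted in footnote 11 (SI units)
F, L, A = 5000.0, 1.0, 8e-5
Y_STEEL, Y_NYLON = 20e10, 0.36e10

dl_steel = extension(F, L, A, Y_STEEL)
dl_nylon = extension(F, L, A, Y_NYLON)
print(f"steel: {dl_steel * 1e3:.2f} mm, nylon: {dl_nylon * 1e3:.1f} mm")

# Assumption: the 500 kg acts as a point load on a massless steel rod
k_steel = spring_constant(Y_STEEL, A, L)
print(f"k = {k_steel:.2e} N/m, T = {period(500.0, k_steel):.4f} s")
```

The two extensions should land close to the 0.3 mm and 17 mm quoted in the footnote.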

However, for most objects, the linear relationship disappears after a certain level of stress is applied on the object. At this level (point A in Figure 4.4), the object reaches its so-called elastic limit. Past this

point, and for higher levels of stress, Hooke’s law ceases to hold, the elas-

ticity modulus is no longer constant, and the object becomes permanently

deformed. Namely, if the force were to be released past point A, the object

would not return to its original form. At some maximum strain (point D),

the object breaks.

[Figure 4.4: The typical shape of the stress-strain curve. Horizontal axis: strain (ε); vertical axis: stress (σ); marked points A, B, C, D.]

Different materials will exhibit different behaviors between points A and D. The slope of the stress-strain curve may become negative past a certain strain (a yield point), represented by point B. Typically, the object also reaches a

maximum stress (point C) before it breaks, and it may even plateau around

it (that is, there might be a range of strains whereby the object experiences

deformation without any further increase in stress). The object’s state as it

goes through this plateau is called the plastic flow state.



Notice that the stress decreases as the object’s strain increases just before the breaking point (segment from C to D). This is because, just before breaking, the object’s atoms get “squeezed out” of place and the cross-sectional area shrinks locally, so a smaller force is enough to keep stretching the object. Since the stress plotted is computed relative to the original cross-sectional area, it falls.


Chapter 5

Waves, Fluids, and Oscillations

5.1 Waves and the Doppler shift

In this section, we study the emission and perception of waves. In par-

ticular, we will focus on the so-called Doppler effect.

The Doppler effect (or Doppler shift) is the change that occurs in the

frequency of a wave for an observer who is in motion relative to the source

of the wave. We experience the Doppler shift when the sound source moves

relative to our position, for instance when an ambulance or a police car drives

by in the street.

In particular, the relationship between the observed (or perceived) fre-

quency f and the originally emitted frequency f0 is given by:

$$f = \left(\frac{c + v_r}{c + v_s}\right) f_0 \tag{5.1}$$

where c is the velocity of waves in the medium of consideration, vr is the speed of the receiver (positive when the receiver moves toward the source), and vs is the speed of the source (positive when the source moves away from the receiver).1 If the source is approaching (vr > vs), then f > f0, so the sound is perceived with higher

1 For sound in air, c = 331.5 + 0.6tc m/sec, where tc is temperature in Celsius. For light in a vacuum, c = 299,792,458 m/sec.



pitch. If the source is receding (vr < vs), then f < f0, so the pitch is lower.

If the speeds vs and vr are small relative to the speed of the wave, then a first-order Taylor approximation yields:

$$\Delta f \equiv f - f_0 \approx \frac{\Delta v}{c} f_0 \tag{5.2}$$

where ∆v ≡ vr − vs is the relative velocity of the receiver. Thus, the

change in frequency is proportional to the difference in speeds.
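To get a feel for how good the approximation is, the following sketch compares equation (5.1) with equation (5.2) for a stationary receiver. The 440 Hz tone, the 20 °C air temperature, and the function names are assumptions chosen for illustration; a negative vs stands for an approaching source, which is the sign convention under which vr > vs gives f > f0.

```python
def doppler_exact(f0, c, v_r, v_s):
    """Observed frequency from equation (5.1)."""
    return (c + v_r) / (c + v_s) * f0

def doppler_approx(f0, c, v_r, v_s):
    """First-order approximation, equation (5.2): f ~ f0 * (1 + (v_r - v_s) / c)."""
    return f0 * (1 + (v_r - v_s) / c)

C_SOUND = 331.5 + 0.6 * 20.0   # footnote 1, at an assumed 20 degrees Celsius
f0 = 440.0                     # hypothetical emitted tone (Hz)

# Stationary receiver (v_r = 0); negative v_s means the source is approaching
for v_s in (-1.0, -10.0, -30.0):
    exact = doppler_exact(f0, C_SOUND, 0.0, v_s)
    approx = doppler_approx(f0, C_SOUND, 0.0, v_s)
    print(f"v_s = {v_s:6.1f} m/s: exact {exact:.2f} Hz, approx {approx:.2f} Hz")
```

The gap between the two columns grows with the source speed, as expected of a first-order expansion.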

By definition (recall Definitions 1.6-1.7), the frequency of oscillation relates to the period via T = 1/f. Thus, in T seconds, the wave (or, rather, a particle moving in the field of consideration) moves by a distance of λ ≡ cT = c/f meters. This is called wavelength:

Definition 5.1 (Wavelength) The distance over which a periodic wave repeats its pattern (i.e. the distance between successive crests), defined by:

$$\lambda \equiv \frac{c}{f}$$

where f is the wave’s frequency, and c is the speed of the wave.

Therefore, by definition, waves with higher frequency have lower wavelength: the two are inversely proportional, and the constant of proportionality is c (e.g. the speed of sound for acoustic waves, or the speed of light for electromagnetic waves).

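Suppose that source and receiver are separated by the vector $\vec{r}$ (pointing from the source to the receiver). Let $\vec{v}_s$ be the velocity of the source, and let θ be the angle between $\vec{r}$ and $\vec{v}_s$. Taking the x axis along $\vec{r}$, the x component of the velocity (often called the radial component) is vx = vs cos θ. If the receiver (at rest) perceives frequency f, then: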

$$f = f_0\left(1 + \frac{v_s}{c}\cos\theta\right) \tag{5.3}$$

by equation (5.2). In terms of λ = c/f , equation (5.3) reads:

$$\lambda = \lambda_0\left(1 - \frac{v_s}{c}\cos\theta\right) \tag{5.4}$$

Thus:



• If θ = π/2 (i.e. 90°), then cos θ = 0, so f = f0 and λ = λ0. In words,

when the source and the receiver keep their distance constant, the

frequency and wavelength arrive unaltered to the receiver.

• If θ < π/2, then cos θ > 0, so f > f0 and λ < λ0. In words, when the

source and the receiver are approaching, the receiver reads a higher

frequency and a lower wavelength.

• If θ > π/2, then cos θ < 0, so f < f0 and λ > λ0. In words, when

the source and the receiver are receding, the receiver reads a lower

frequency and a higher wavelength.

The Doppler shift, DS, can thus be calculated simply as the relative change in wavelength, λ/λ0, from equation (5.4):

$$DS \equiv \frac{\lambda}{\lambda_0} = 1 - \frac{v_s}{c}\cos\theta$$
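The relation can also be read backwards: a measured ratio λ/λ0 gives the radial velocity vs cos θ. The sketch below does both directions. The function names and the 656.3 nm example line are illustrative assumptions, not values from the text.

```python
import math

C_LIGHT = 299_792_458.0  # m/s, footnote 1

def doppler_shift(v_s, theta, c=C_LIGHT):
    """DS = lambda / lambda_0 = 1 - (v_s / c) * cos(theta), from equation (5.4)."""
    return 1.0 - (v_s / c) * math.cos(theta)

def radial_velocity(lam, lam0, c=C_LIGHT):
    """Invert (5.4): v_s * cos(theta) = c * (1 - lambda / lambda_0).
    Positive means approaching (blueshift), negative means receding (redshift)."""
    return c * (1.0 - lam / lam0)

# Hypothetical spectral line emitted at 656.3 nm and observed at 656.5 nm
v_rad = radial_velocity(656.5e-9, 656.3e-9)
print(f"radial velocity: {v_rad / 1e3:.1f} km/s")
```

A negative result here signals a receding source, in line with the third bullet above.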

Let’s now examine different types of waves, and how the Doppler effect

is manifested:

Sound waves The Doppler effect is most familiar in sound

waves (e.g. sirens). Suppose that the wave source exhibits a circular motion

and the receiver is on the same plane of the circumference, so the source

oscillates between being close and far from the receiver. Since the circular

motion can be described by a sinusoidal wave (as in Figure 0.1), then the

frequency f takes the shape of such a wave: as the source approaches the

receiver, the frequency increases, and vice versa.

Let fmax be the highest frequency observed, which occurs when the source is moving directly toward the receiver. Then, the velocity of the source can be deduced using equation (5.1). Moreover, the period T (the time elapsed between two crests) satisfies, as we know, 2πR/T = vs, from which the radius of the orbit R can be deduced.
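A minimal sketch of this two-step inference, assuming a stationary receiver, a known emitted frequency f0, and made-up measurements (all numbers and function names below are hypothetical):

```python
import math

def source_speed(f_max, f0, c):
    """Speed of the source when it heads straight for a stationary receiver.
    From (5.1) with v_r = 0 and v_s = -v: f_max = c / (c - v) * f0."""
    return c * (1.0 - f0 / f_max)

def orbit_radius(v, period):
    """Radius from 2*pi*R / T = v_s."""
    return v * period / (2.0 * math.pi)

c = 343.5                  # assumed speed of sound (m/s), about 20 degrees Celsius
f0, f_max = 440.0, 450.0   # hypothetical emitted and peak observed frequencies (Hz)
T = 2.0                    # hypothetical time between successive frequency peaks (s)

v = source_speed(f_max, f0, c)
R = orbit_radius(v, T)
print(f"v_s = {v:.2f} m/s, R = {R:.2f} m")
```

If f0 is not known, the lowest observed frequency could be used alongside fmax instead, but that variant is not shown here.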

Electromagnetic waves Electromagnetic waves reflect electromagnetic

radiation. They include waves involving the motion of light particles, so c in

equation (5.1) stands here for the speed of light. Typically, the speed of the



transmitter (vs) is much smaller than the speed of light (c), so approximation

(5.2) applies.

Most frequencies are invisible to the human eye and, for those that are

visible, the frequency determines the color at which the human eye per-

ceives light. From lowest to highest frequency (that is, highest to lowest

wavelength), the electromagnetic waves are:2

1. Long radio waves, with f < 10⁶ Hz and λ > 10³ m.

2. Radio waves, with f ∈ [10⁶, 10⁸] Hz and λ ∈ [1, 10³] m. Among these:

• AM waves, with f ∈ [10⁶, 10⁷] Hz and λ ∈ [10², 10³] m.

• FM waves, with f ∈ [10⁷, 10⁸] Hz and λ ∈ [1, 10²] m.

3. Microwaves, with f ∈ [10⁸, 10¹¹] Hz and λ ∈ [10⁻³, 1] m.

4. Infrared waves, with f ∈ [10¹¹, 10¹⁴] Hz and λ ∈ [10⁻⁶, 10⁻³] m.

5. Visible light waves, with f ∈ [430 × 10¹², 770 × 10¹²] Hz and λ ∈ [390 × 10⁻⁹, 700 × 10⁻⁹] m. These are the frequencies and wavelengths to which a typical human eye can respond. Within these:

• Red, with f ∈ [400, 484] × 10¹² Hz and λ ∈ [620, 750] × 10⁻⁹ m.

• Orange, with f ∈ [484, 508] × 10¹² Hz and λ ∈ [590, 620] × 10⁻⁹ m.

• Yellow, with f ∈ [508, 526] × 10¹² Hz and λ ∈ [570, 590] × 10⁻⁹ m.

• Green, with f ∈ [526, 606] × 10¹² Hz and λ ∈ [495, 570] × 10⁻⁹ m.

• Blue, with f ∈ [606, 668] × 10¹² Hz and λ ∈ [450, 495] × 10⁻⁹ m.

• Violet, with f ∈ [668, 789] × 10¹² Hz and λ ∈ [380, 450] × 10⁻⁹ m.

6. Ultraviolet waves, with f ∈ [10¹⁵, 10¹⁶] Hz and λ ∈ [10⁻⁸, 10⁻⁷] m.

7. X-rays, with f ∈ [10¹⁶, 10²⁰] Hz and λ ∈ [10⁻¹², 10⁻⁸] m.

2 Frequencies and wavelengths given in the list are crudely approximated.



8. Gamma rays, with f > 10²⁰ Hz and λ < 10⁻¹² m.

In astronomy, the frequency of light cannot be measured, and only the

wavelength λ is observable. Changes in the wavelength can then be used

to infer if the objects (say, the Earth and a distant planet) are approaching

or receding. Using the classification for color used above, astronomers have

adopted the following terminology:

Definition 5.2 (Redshift and blueshift) Redshift occurs when the elec-

tromagnetic radiation of an object is increased in wavelength (i.e. decreased

in frequency). When the opposite occurs, we call it a blueshift.

This terminology is used even when light is not visible by the human

eye. For example, for the relative motion of planets:

• When astronomers observe an increase in the wavelength emitted by

a distant planet as it arrives on Earth, we say that a redshift has

occurred, and thus the planet is receding from the Earth.

• A blueshift in the wavelength will thus denote that the planet is ap-

proaching the Earth.

The detection of wavelengths has permitted astronomers to make many

predictions about the behavior of the cosmos. For example, when explor-

ing a star, astronomers can determine which elements are present in the

composition of the star by looking at the so-called spectral lines of their

electromagnetic spectrum.3 By determining the color shift in the spectrum,

3 Spectral lines are discontinuities present in the otherwise continuous electromagnetic spectrum of light. They are due to either the emission or the absorption of light within narrow frequency ranges. In the visible light spectrum, absorption lines thus appear as dark lines, and emission lines appear as bright lines. Each atom and molecule has its own unique pattern of spectral lines along the spectrum, so these can be used in practice to identify which elements are present in the composition of a star. For instance, helium was found to be present in our Sun by studying the Sun’s spectrum.



the velocity at which the stars are moving relative to each other can be

inferred (using equation (5.4)).4

Throughout its life-cycle, a star survives by burning its nuclear fuel in its

core, a process that is sufficiently powerful to counteract the effects of the

star’s own gravity. When this fuel is exhausted, nuclear fusion gives way to

gravity, and the star implodes into a nova. At this point, depending on the

mass of the original star, the dying star can become one of three objects:

1. A white dwarf, for stars whose mass is below 1.4 times the mass of our Sun.

2. A neutron star, which is denser and smaller than a white dwarf, for stars whose mass is between 1.4 and 3 times that of the Sun.

3. A black hole (for masses at least 3 times larger than that of our Sun)

is an object of such enormous density that it produces a gravitational

field so strong that no light or matter can escape its pull within its

vicinity.

Let’s study these in some detail:

Example 5.1 (Binary-Star Systems) Today, astronomers know that many

star systems in the universe are binary star systems, that is, systems com-

posed of two stars orbiting each other in a spiral-like motion. This can

be observed by looking at the spectra of the stars and observing continuous

red- and blueshifts in the spectral lines, an unequivocal manifestation that a

Doppler effect is occurring and that, therefore, stars are oscillating relative

to each other and to Earth. Moreover, for each one of those stars, the radial

velocity, the radius of the orbit, and the orbital period, can all be determined.

Consider a binary system. Star 1 has mass m1, and describes a circular

orbit about the center of mass of the system, at radius r1. Star 2 has mass

m2, goes around the center of mass, and r2 is the radius of its orbit. By

definition of the center of mass (Definition 1.27), m1r1 = m2r2. Suppose

the observer is on the plane of the two orbits.

4 More specifically, knowing the shift in the spectrum, λ/λ0, one can infer the radial velocity, v cos θ, but the actual velocity v remains unknown unless the angle θ (and thus the direction of motion of the stars) can be calculated.



Because the orbits are circular, the period of motion is:

$$T = 2\pi\sqrt{\frac{(r_1 + r_2)^3}{G(m_1 + m_2)}}$$

Now, we want to make the Doppler shift measurement of both stars. The Doppler shift (in wavelength) of star i = 1, 2 is DSi = 1 − (vi/c) cos θi. Thus, by measuring the Doppler shift, one can know the radial velocity of each star. Because the orbits are circular, vi = 2πri/T, so the velocities and the period give r1 and r2; the formula for the period then gives m1 + m2, and using that m1r1 = m2r2, one can solve for m1 and m2.

This shows that measuring the Doppler effect is extremely informative

of the properties of the binary star system (including the positions, masses,

and velocities of each one of its component stars).
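The chain of inferences in Example 5.1 can be sketched numerically. The code assumes circular orbits seen edge-on, so that the peak radial velocity of each star equals its orbital speed; the input measurements and the function name are hypothetical.

```python
import math

G = 6.674e-11  # gravitational constant (m^3 kg^-1 s^-2)

def solve_binary(v1, v2, period):
    """Recover (m1, m2, r1, r2) of a circular binary seen edge-on.

    v1, v2: orbital speeds (m/s), read off the peak Doppler shifts.
    period: orbital period T (s).
    """
    r1 = v1 * period / (2 * math.pi)   # circular orbit: v = 2*pi*r / T
    r2 = v2 * period / (2 * math.pi)
    # Invert T = 2*pi*sqrt((r1 + r2)^3 / (G * (m1 + m2))) for the total mass
    total_mass = 4 * math.pi**2 * (r1 + r2)**3 / (G * period**2)
    m1 = total_mass * r2 / (r1 + r2)   # from m1 * r1 = m2 * r2
    m2 = total_mass * r1 / (r1 + r2)
    return m1, m2, r1, r2

# Hypothetical measurements: 30 km/s and 60 km/s, 10-day period
m1, m2, r1, r2 = solve_binary(v1=30e3, v2=60e3, period=10 * 86400.0)
print(f"m1 = {m1:.2e} kg, m2 = {m2:.2e} kg, r1 = {r1:.2e} m, r2 = {r2:.2e} m")
```

The slower star sits closer to the center of mass and is therefore the heavier one, here by a factor of two.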

Example 5.2 (X-ray binaries) An X-ray binary is a class of binary stars

that are visible only in X-rays. They are composed of matter falling from

one of the component stars (the so-called donor), to the other component

(the so-called accretor). The latter is a very compact star, usually a neutron

star or a black hole.

When a neutron star accretor is sufficiently close to the donor, mass

from the latter is transferred to the former. But because the donor and

the accretor are orbiting each other, the motion of the transferring particle

describes a spiral motion as it falls into the accretor’s own orbit. The spiral

trajectory is usually called an accretion disk. As a result of the rotation of

the neutron star, the poles of the star heat up and radiate light, which from

Earth appear as pulsations (in a similar way that a lighthouse would). Such

types of neutron stars are called pulsars.

A piece of matter with mass m spiraling toward the accretor will arrive

and release kinetic energy T = (1/2)mv², where v is the particle’s velocity at arrival.5 By the conservation of energy, kinetic energy and gravitational

5 To illustrate the amount of energy released by neutron stars, for m as little as 10 grams, the kinetic energy that would be released from the impact is as large as an atomic bomb explosion. The reason is that v is phenomenally high, given the strong gravitational pull of the neutron star. The transfer of matter from the donor to the accretor has been calculated to be about dm/dt = 10¹⁴ kg/sec, corresponding to a power



potential energy add up to zero, so (1/2)mv² = GmM/R, where M is here the mass of the accretor and R is its radius. Thus, the speed at which matter reaches the neutron star is:

$$v = \sqrt{\frac{2MG}{R}}$$

This is, no surprise, equal to the escape velocity of the circular orbit

(recall Result 3.7): the velocity at which matter will arrive from infinitely

far is the same as the one it would take to escape the neutron star’s

gravitational pull.
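To see the orders of magnitude quoted in footnote 5, the sketch below evaluates the infall speed for an assumed neutron star of 1.4 solar masses and 10 km radius. Both figures are typical textbook values chosen here, not values given in the text, and the calculation is purely Newtonian.

```python
import math

G = 6.674e-11      # gravitational constant (m^3 kg^-1 s^-2)
M_SUN = 1.989e30   # solar mass (kg)

def infall_speed(mass, radius):
    """Arrival speed of matter falling from far away: v = sqrt(2*G*M / R)."""
    return math.sqrt(2 * G * mass / radius)

# Assumed neutron star (not from the text): 1.4 solar masses, 10 km radius
M, R = 1.4 * M_SUN, 10e3
v = infall_speed(M, R)

energy_10g = 0.5 * 0.010 * v**2   # kinetic energy of 10 grams at arrival (J)
power = 0.5 * 1e14 * v**2         # footnote 5: dm/dt = 1e14 kg/s (W)
print(f"v = {v:.2e} m/s, E(10 g) = {energy_10g:.2e} J, P = {power:.1e} W")
```

With these inputs the speed is a sizable fraction of the speed of light, so the Newtonian figure should be read as an estimate only.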

Example 5.3 (Black Holes) A black hole, the most extreme manifesta-

tion of a star’s death, is an object with no size and, thus, infinite density.

Its mass, M , is at least three times that of the Sun.

The event horizon is a sphere with radius REH around the black hole,

which limits the frontier below which objects cannot escape from the hole’s

gravitational pull. At the event horizon, therefore, objects need a speed of v ≥ vesc = √(2MG/REH) to get away from the hole. Since not even light escapes this pull, and because nothing travels faster than light, we have c = √(2MG/REH).

Thus, the radius of the event horizon is:

$$R_{EH} = \frac{2MG}{c^2}$$

For example, if the black hole is the mass of the Earth, REH = 1 cm. If

it is the mass of the Sun, REH = 3 km.
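The two figures just quoted can be reproduced with a few lines. The masses of the Earth and the Sun below are standard reference values supplied here as assumptions, and the function name is illustrative.

```python
G = 6.674e-11        # gravitational constant (m^3 kg^-1 s^-2)
C = 299_792_458.0    # speed of light (m/s)
M_EARTH = 5.972e24   # kg (standard value, not from the text)
M_SUN = 1.989e30     # kg (standard value, not from the text)

def event_horizon_radius(mass):
    """R_EH = 2*M*G / c^2, in meters."""
    return 2 * mass * G / C**2

print(f"Earth: {event_horizon_radius(M_EARTH) * 100:.2f} cm")
print(f"Sun:   {event_horizon_radius(M_SUN) / 1e3:.2f} km")
```

Both results come out slightly below the rounded 1 cm and 3 km given in the example.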

Since a black hole has no surface (because it has no size), it does not pul-

sate as neutron stars in binary systems do. To determine its mass, therefore,

one cannot use the Doppler shift, for no electromagnetic radiation is being

emitted. Instead, one can use the Doppler shift of the donor (if the black

hole is the accretor in the binary system) and, if one has an estimate of the

mass of the donor, one can infer the mass of the accretor.

of P = 2 × 10³⁰ J/sec. When kinetic energy is turned into heat energy, the neutron star can achieve temperatures of 10⁷ K. These are enormous values. The temperature is so high, in fact, that all radiation is exclusively X-ray radiation. Hence, the name of the system.



5.2 Fluid Statics

In this section, we study the behavior of fluids (e.g. water, air, or oil)

that are at rest. We will explore the properties of fluids in various states

and how they react to forces. This branch of physics is called fluid statics.

The dynamics of fluids will be explored in Section 5.3.

5.2.1 Pressure and Pascal’s Principle

Consider a fluid which is fully contained in a closed compartment. The

compartment has an opening of cross-sectional area A. Suppose that a force

F is applied everywhere on this area, and let the direction of the force be

perpendicular to the cross section of the fluid. (In short, let F be a normal

force to the fluid’s surface). Then, we will say that the liquid’s pressure is

being increased:

Definition 5.3 (Pressure) The force applied perpendicular to the surface

of an object per unit area over which the force is distributed. It is defined

by:

$$P \equiv \frac{F}{A}$$

where F is the magnitude of the normal force, and A is the area of the surface in contact.

Pressure is measured in Pascals (Pa), which are force, in Newtons, per

unit area, in squared meters. Our first principle is due to Pascal himself:

Principle 5.1 (Pascal’s Principle) The pressure applied to an enclosed

and static fluid is transmitted undiminished to every point in the fluid and

to the walls of the container.

The principle says that a pressure change that occurs anywhere in the

container is transmitted throughout the fluid such that the same change



occurs everywhere. When gravity is not at play, this implies that the pres-

sure is literally the same everywhere in the vessel.6 Moreover, gravity or

no gravity, the force by the liquid onto the (inner) walls of the vessel must

be everywhere perpendicular to the surface of the walls. Were this not the

case, the fluid would clearly not be in a motionless state.

Formally, for every small area of size ∆A, if a force ∆F is exerted

by the fluid onto the wall and this force is perpendicular to ∆A, then

lim_{∆A→0} ∆F/∆A = P, where P is the liquid’s unique pressure as postulated

by Pascal.

Example 5.4 (Hydraulic jacks) A hydraulic jack is a device that is de-

signed to lift heavy loads with relative ease and which directly exploits Pas-

cal’s principle. Consider two closed vessels whose openings have surface

areas A1 and A2, and suppose that the two vessels are connected by a tube.

Fluid fills both vessels and the tube. Forces with magnitudes F1 and F2

are applied over the two areas, in a direction that is perpendicular to the

corresponding surface.

By Pascal’s principle (and ignoring the effects of gravity), it must be that F1/A1 = F2/A2 if the fluid is at rest. Therefore, if A2 is substantially larger than A1, then F2 will be much higher than F1. Indeed, F2 = (A2/A1)F1 ≫ F1, i.e. a small force F1 translates into a very strong one in surface 2. For

instance, by simply placing a light object on a platform covering the small

surface (e.g. a foot), we may be able to lift a great heavy object (e.g. a car)

placed on a platform over the larger surface.

Suppose that, by pressing on the small platform, we displace the fluid

down by a distance d1. The total volume of fluid displaced is d1A1. The

distance d2 that the fluid on the larger vessel is displaced up must satisfy:

d2A2 = d1A1

for all the fluid that is lost in vessel 1 is shifted into vessel 2 through

the connecting tube. Using Pascal’s principle, this means that d2F2 = d1F1.

6 The effects of gravity will be introduced shortly. In that case, we will see that the pressure depends on the region of the vessel we consider.



Thus, the work on the small surface is W1 = F1d1 = F2d2 = W2, thereby

guaranteeing that total energy is conserved.

Incidentally, this illustrates an impracticality in this method of heavy

lifting: in order to lift a heavy object in surface 2 by a certain distance

d2, we would need to push platform 1 down by a much greater distance, as d1 = (A2/A1)d2 ≫ d2. In practice, this is resolved by a lever that shifts fluid into

vessel 2 every time it is pushed down, but lets it back into vessel 1 when

brought back up.
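The trade-off between force and distance in Example 5.4 can be checked with a small sketch. It assumes an ideal jack (incompressible fluid, no friction, gravity ignored); the piston sizes, the applied force, and the function name are made up for illustration.

```python
def jack(force_1, area_1, area_2, distance_1):
    """Return (force_2, distance_2) for an ideal hydraulic jack."""
    force_2 = force_1 * area_2 / area_1          # equal pressure: F1/A1 = F2/A2
    distance_2 = distance_1 * area_1 / area_2    # equal volume:   d1*A1 = d2*A2
    return force_2, distance_2

# Hypothetical jack: 10 cm^2 and 1000 cm^2 pistons, 150 N push over 0.5 m
F1, A1, A2, d1 = 150.0, 10e-4, 1000e-4, 0.5
F2, d2 = jack(F1, A1, A2, d1)
print(f"F2 = {F2:.0f} N, d2 = {d2 * 1e3:.1f} mm")
print(f"W1 = {F1 * d1:.1f} J, W2 = {F2 * d2:.1f} J")
```

The force is multiplied by A2/A1 while the lift is divided by the same factor, so the two work terms agree.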

Gravity plays an important role in pressure. For example, in the deep

oceans, water pressure is much higher due to the effects of gravity. To

understand this, consider taking a portion of fluid. For simplicity, suppose

we take a rectangular cuboid (i.e. a “box”) at some location (x, y) (see

Figure 5.1).

[Figure 5.1: A cuboid filled with liquid. Labels: y + ∆y, x + ∆x, weight ∆mg, and forces F2 and F1.]

The box extends from height y to y + ∆y, and the area of the upper (and lower) face

is A. The whole box has a mass ∆m, and the liquid has density ρ, which is

possibly a function of y. Since the volume of a cuboid is V = A∆y, then by

Definition 1.20:

∆m = ρV = ρA∆y

A force of magnitude ∆mg pulls down on the fluid due to gravity. Ev-

erywhere in the fluid, the box is subject to normal forces which are per-



pendicular to each one of the six faces of the box. Figure 5.1 depicts only

two such forces, on the upper face, with magnitude F2, and on the lower

face, with magnitude F1, which act upon the whole surface area (omitting

the horizontal forces is without loss of generality because all forces on the x

plane clearly cancel).

For the fluid element to be in static equilibrium, it must be the case that:

F1 − F2 − ∆mg = 0

Moreover, F1 = P_y A, where P_y is the pressure on the lower surface (at position y), and similarly F2 = P_{y+∆y} A for the upper surface (where the position is y + ∆y). Since both areas are equal, we have (P_y − P_{y+∆y})A = ∆mg = ρA∆y g, or (P_{y+∆y} − P_y)/∆y = −ρg. Taking the limit as ∆y → 0, we then obtain:

$$\frac{dP}{dy} = -\rho g \tag{5.5}$$

a formula for the rate of change in pressure. Therefore, as we go deeper

into the ocean (lower values of y), pressure goes up by ρg per unit displace-

ment. This phenomenon is called hydrostatic pressure:

Definition 5.4 (Hydrostatic pressure) The pressure on static fluids that

is due to the action of gravity. The change in hydrostatic pressure from a

displacement in some direction y is given by equation (5.5).

Liquids are virtually incompressible. By this, we mean that their density

ρ is constant in y. By equation (5.5), this means that the change in the

pressure is everywhere the same, exactly what Pascal’s principle postulates.

Gases, on the other hand, are very much compressible, and their density

ρ can be increased easily by reducing the volume they occupy. For example,

the air’s density, and hence changes in atmospheric pressure, are very much

a function of the altitude, with air being denser at lower altitudes.7

7 We will study gases in more detail in Chapter 6.



In sum, while the change in pressure in liquids is everywhere the same

(equation 5.5), the same is not true for gases: Pascal’s principle applies only

to incompressible fluids.

Example 5.5 (Shooting at cans) Imagine filling a can to its very rim

with water, covering it with a lid, and hitting it with a hammer. The counter

force will be extremely strong and the can will resist the hit. However, recall

that P = F/A and, by Pascal’s principle, pressure propagates undiminished

on the whole fluid. Thus, if we shoot a bullet (whose surface area is extremely

small) through the can, pressure inside the liquid will build up enormously,

the can’s materials might give way, and the can itself might explode. If

the can is filled with air instead of water, however, none of this will occur.

Again, this is because air, unlike water, is compressible.

Henceforth, we will assume that liquids are strictly incompressible (so

that ρ is constant in y). Integrating equation (5.5) within some bounds

[P1, P2], corresponding to the pressures at some locations y1 and y2, we

obtain:

$$\int_{P_1}^{P_2} dP = -\rho g \int_{y_1}^{y_2} dy \;\Rightarrow\; P_2 - P_1 = -\rho g (y_2 - y_1)$$

We have just derived the so-called Pascal’s Law:

Result 5.1 (Pascal’s Law) For static and incompressible fluids, the change

in hydrostatic pressure ∆P is proportional to the height of the fluid above

the point of measurement, ∆y. In particular:

∆P = ρg∆y (5.6)

where ρ is the fluid’s density.

Of course, this is nothing but the mathematical analogue of the statement we have made in words in Principle 5.1: a change in the pressure at

any point in an enclosed, static, and incompressible fluid is transmitted to

all points in the fluid.



Unlike liquids, the rate of change of pressure in gases is not constant. For

example, as we go up in the air into the thinner atmosphere, pressure goes

down, but not linearly with our displacement (as density ρy is a function

of altitude y). At sea level (y = 0), pressure on Earth is approximately 1.03 kg/cm² (that is, the air above each square centimeter weighs roughly 10.1 N). This results in an atmospheric pressure of about 101 kilopascals (kPa). This amount of pressure is generally called an atmosphere. In particular:

1 atm ≡ 101,325 Pa

Atmospheric pressure may also be called barometric pressure. What this means is that the air that fills a column of cross-sectional area 1 cm² running from sea level to the very top of Earth’s atmosphere has a mass of about 1.03 kg. With these correspondences, we can see the equivalence between barometric and hydrostatic pressure. In particular, given the density of water:

10 meters of water ≈ 1 atm

Since the change in barometric pressure depends on the density, which

is itself a function of the position, measuring it in practice can be challeng-

ing. One way to do it is to use a barometer, which exploits the following

experiment:

Example 5.6 (Measuring barometric pressure) Consider filling a glass

with liquid (e.g. mercury). In the glass, we introduce a tube with a single

opening, with the opening submerged in the liquid (e.g. a straw whose upper

opening is blocked by our finger). Let y1 denote the vertical height corre-

sponding to the liquid’s surface on the glass, and y2 be the height the liquid

reaches inside the tube or straw, with h ≡ y2 − y1. Let P1 and P2 be the

corresponding pressures. Since the liquid inside the tube is not subject to air

pressure, we have P2 = 0.

Then, we can simply apply Pascal’s Law (equation 5.6) to find:



P1 = ρgh

where ρ is the density of the liquid. Therefore, in practice, to know the

barometric pressure, all we have to do is take a liquid of a certain density

ρ, measure how far we can pull the liquid up in a tube before it breaks loose

due to gravity, and then use the measured distance h to compute barometric

pressure P by P = ρgh.8 This is exactly how barometers work.
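As a quick illustration of P = ρgh, the sketch below converts between column height and pressure for mercury and for water. The mercury density is the one quoted in footnote 8; g = 9.81 m/s², the water density, and the function names are assumptions made here.

```python
G_ACCEL = 9.81     # gravitational acceleration (m/s^2), assumed
ATM = 101_325.0    # 1 atm in Pa

def column_pressure(density, height, g=G_ACCEL):
    """Pressure at the base of a liquid column: P = rho * g * h (Pa)."""
    return density * g * height

def column_height(pressure, density, g=G_ACCEL):
    """Height of liquid that a given pressure can support: h = P / (rho * g)."""
    return pressure / (density * g)

RHO_MERCURY = 13.6e3   # kg/m^3, footnote 8
RHO_WATER = 1.0e3      # kg/m^3, assumed

print(f"0.76 m of mercury: {column_pressure(RHO_MERCURY, 0.76):.0f} Pa")
print(f"1 atm in mercury:  {column_height(ATM, RHO_MERCURY):.3f} m")
print(f"1 atm in water:    {column_height(ATM, RHO_WATER):.2f} m")
```

The water column comes out a little over 10 m, which is why a water barometer would be impractically tall.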

Here are a couple of real-world examples in which pressure plays a role:

Submarines Submarines today can submerge up to 900 meters, where

the hydrostatic pressure is 90 atm. On every square meter of a submarine

at that depth there is a force equivalent to 900 tons of weight, and that force

is perpendicular to the surface of the submarine and distributed across the

entire outside of the vessel. Yet, submarines are able to maintain a steady

1 atm in their interior. If some air was to be sucked out of the interior of

the submarine, its shell would crush due to the enormous forces of water

pressure.

Divers Humans cannot breathe through a tube below water (i.e. snorkel)

below a depth of about 1 meter. To measure how deep we can snorkel, we

use a so-called manometer (see Figure 5.2). This device is used to measure

how much over-pressure human lungs can produce to counteract the water

pressure from outside.

In a manometer, air is blown into one end of the tube (the left end, in the

figure), which displaces the liquid (say, water) by a distance h. Letting the

pressure be P1 at y1 and P2 at y2, Pascal’s Law says P1 − P2 = ρhg, where

8 For example, using mercury (ρ = 13.6 × 10³ kg/m³), it has been found that mercury gives way to gravity at typical heights of about h ≈ 0.76 m, so the barometric pressure is P = ρgh ≈ 1.01 × 10⁵ Pa, i.e. about 1 atm. Water is much less dense than mercury, so the tube would have to be much higher (about 13.6 times higher, making h ≈ 10 meters) if we are to measure the barometric pressure. This is why mercury is often the most practical choice for use in barometers and thermometers.



[Figure 5.2: A manometer filled with water. Blowing air into the left end generates a displacement equal to h on the right end. Labels: heights y1 and y2, displacement h.]

ρ is the density of water. Since P2 = 1 atm (as the right end is open ended

and thus subject to air pressure9), and 1 atm = 101,325 Pa, we then have:

P1 = 101,325 Pa + ρgh

Thus, the manometer indicates how much over-pressure ρgh the human

lungs can generate over and above 1 atm. For human lungs, the resulting

distance of this experiment is about h = 1 meter, which means humans

cannot snorkel at more than 1 meter under water: human lungs can only

generate up to one-tenth of an atmosphere of over-pressure. At greater depths, divers must rely on pressurized air, which they breathe from a tank, to restore pressure within their chest. Incidentally, human lungs can create

about the same amount of under-pressure (i.e. sucking instead of blowing

air in the manometer).
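The numbers in the two examples above follow directly from the over-pressure ρgh. The sketch below assumes fresh water (ρ = 1000 kg/m³) and g = 9.81 m/s²; both values and the function names are illustrative choices, so the submarine figure is only approximate.

```python
RHO_WATER = 1.0e3   # kg/m^3, assumed (fresh water)
G_ACCEL = 9.81      # m/s^2, assumed
ATM = 101_325.0     # 1 atm in Pa

def overpressure(depth, density=RHO_WATER, g=G_ACCEL):
    """Pressure in excess of atmospheric at a given depth: rho * g * h (Pa)."""
    return density * g * depth

def max_snorkel_depth(lung_overpressure, density=RHO_WATER, g=G_ACCEL):
    """Depth at which the water's over-pressure equals what the lungs can produce."""
    return lung_overpressure / (density * g)

print(f"over-pressure at 1 m:   {overpressure(1.0) / ATM:.3f} atm")
print(f"over-pressure at 900 m: {overpressure(900.0) / ATM:.1f} atm")
print(f"depth for 0.1 atm:      {max_snorkel_depth(0.1 * ATM):.2f} m")
```

One meter of water corresponds to roughly a tenth of an atmosphere, matching the limit quoted for human lungs.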

9 Because both ends are open, the liquid is subject to barometric pressure before the experiment starts, so the water levels are equalized (i.e. h = 0).

5.2.2 Archimedes’ Principle and Stability

Consider an object (e.g. a cylinder) that is partially submerged in a

fluid, for instance a liquid (Figure 5.3). The object has mass m, density ρ,

radius r, length ℓ, and cross-sectional area A.¹⁰ The fluid has density ρf ,

and the object is submerged at depth h ≡ y2 − y1, where y1 is the bottom of

the object and y2 is the surface of the liquid. The pressure levels at these two

points are P1 and P2, respectively.

Figure 5.3: A partially submerged cylinder. (Figure labels: h, r, y1, y2, mg, F1, F2.)

As usual, a force of magnitude mg pulls down on the object. Counter-

acting this force, a force of magnitude F1 pushes up on the object. Finally, a

force of magnitude F2, due to atmospheric (or barometric) pressure, pushes

down on the object. As discussed in the previous section, the directions of

these forces are all orthogonal to the surface of the cylinder (or else the

object, fluid, and air would not be in a static equilibrium).

From Pascal’s Law, we have P1 − P2 = ρfgh. For the object to be in a

static equilibrium, it must be that F1 − F2 −mg = 0, that is:

10 Recall that the volume of a cylinder is $V = \pi r^2 \ell$, while density is defined as $\rho = m/V$. Therefore, $m = \pi \rho r^2 \ell$.

Fb = mg

where Fb ≡ F1 − F2 is called the buoyant force.

Definition 5.5 (Buoyant force) The upward force Fb that is exerted by

a fluid and opposes the weight of a submerged object.11

Next, by definition of pressure (Definition 5.3), F = AP, so $F_b = F_1 - F_2 = A(P_1 - P_2) = A\rho_f g h$. Thus:

Fb = Aρfgh

Importantly, note that Ah is the volume of the displaced fluid, so ρfAh

is the mass of the displaced fluid (because density is the ratio of mass to

volume). Multiply mass by g and we obtain weight.12 Thus, Aρfgh is the

weight of the displaced fluid.

We have just derived the physical law of buoyancy, popularly known as

Archimedes’ Principle after its discoverer, Archimedes of Syracuse (287 BC

– 212 BC):

Principle 5.2 (Archimedes’ Principle) The buoyant force on a (totally

or partially) submerged body has the same magnitude as the weight of the

fluid that the body displaces.

For instance, take an object of weight W1 ≡ V ρg, where V is the volume

and ρ is the density (so V ρ is the mass) of the object. If we immerse

the object in water, the weight becomes Wimmersed ≡ V ρg − Fb. Since,

by Archimedes’ Principle, the buoyant force Fb equals the weight of the

displaced fluid, then Fb = V ρwaterg, so the weight loss is:

Wloss ≡W1 −Wimmersed = V ρwaterg

11 If the object is partially submerged, the buoyant force is net of the barometric force. If it is totally submerged, air pressure does not act directly upon the object.

12 Recall that a body’s weight is its mass times the acceleration on the body (Definitions 1.14-1.15). Since the body is in a stable equilibrium, the acceleration is only due to gravity.

In other words, $\frac{W_1}{W_{\text{loss}}} = \frac{\rho}{\rho_{\text{water}}}$. Knowing the density of water ($\rho_{\text{water}} = 0.9998$ g/cm$^3$ at 0 °C) and computing weight losses for the object in question

(e.g. by placing a fully submerged scale at the bottom of the container) thus

allows us to recover the original object’s density.13
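The density-recovery procedure just described can be sketched in a few lines. This is only an illustration: the function name and the sample object (10 cm³ of a material with the density of gold) are hypothetical, and any consistent unit of weight works because only the ratio of weights matters.

```python
RHO_WATER = 0.9998  # g/cm^3 at 0 °C, as quoted in the text


def density_from_weights(w_air, w_immersed, rho_fluid=RHO_WATER):
    """Recover an object's density from W1 / W_loss = rho / rho_fluid.

    w_air is the weight before immersion and w_immersed the apparent weight
    once the object is fully submerged (same units for both).
    """
    w_loss = w_air - w_immersed
    if w_loss <= 0:
        raise ValueError("a fully submerged object must lose some weight")
    return rho_fluid * w_air / w_loss


# Hypothetical sample: 10 cm^3 of material with density 19.32 g/cm^3.
g = 981.0                                  # cm/s^2 (assumed); cancels in the ratio
volume, rho_true = 10.0, 19.32
w1 = volume * rho_true * g                 # weight before immersion
w_imm = w1 - volume * RHO_WATER * g        # weight minus the buoyant force
print(density_from_weights(w1, w_imm))     # recovers a value close to rho_true
```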

Example 5.7 (Icebergs) Consider a block of ice floating on water (e.g. an

iceberg). The density of ice is ρice = 0.92 g/cm$^3$ at 0 °C, slightly lower than that of

water. Since the block is floating, mg = Fb, where m = V ρice and V is the

total volume of the block of ice. By Archimedes’ Principle, Fb = Vuwρwaterg,

where Vuw is the volume of the portion of the block of ice that is underwater.

In sum, mg = Fb implies:

$$\frac{V_{uw}}{V} = \frac{\rho_{\text{ice}}}{\rho_{\text{water}}} \approx 0.92$$

That is, 92% of the block of ice is underwater. Hence, when we see an

iceberg floating in the ocean, we can be certain that we are only seeing 8%

of it.

Archimedes’ Principle can also be used to derive conditions for floating.

Return to the cylinder example (Figure 5.3). For the object to float, we

have established that the buoyant force must balance gravity, so Fb = mg.

The buoyant force is, by Archimedes’ Principle, Fb = Ahρfg, where ρf is

the density of the fluid. Therefore, if the object itself has density ρ, we have

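$A h \rho_f g = A \ell \rho g$ or, simplifying:

$$\frac{\ell}{h} = \frac{\rho_f}{\rho}$$

Now, if the object floats, then ℓ ≥ h (recall Figure 5.3), with strict inequality whenever part of the object stays above the surface. Therefore, a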

necessary condition for floating is:

ρf > ρ

13 Legend has it that Archimedes himself used this approach when commanded to examine whether certain objects were made of gold, knowing that the density of gold is ρgold = 19.32 g/cm$^3$.

In words, for an object to float in a fluid, it is necessary that the object

itself is less dense than the fluid. Otherwise, the object sinks. Remarkably,

this condition is completely independent of the dimensions of the object. The

only thing that matters is the object’s density. For example, a gold pebble

will always sink in water no matter how small, and a piece of wood will

always float in water no matter how large, because ρgold > ρwater > ρwood.
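As a quick illustration of the floating condition and of the iceberg result in Example 5.7, the sketch below encodes ρf > ρ and the ratio h/ℓ = ρ/ρf. The function names are illustrative, and the density used for wood (0.6 g/cm³) is an assumed round number, not a value from the text.

```python
RHO_WATER = 0.9998  # g/cm^3 at 0 °C, as quoted in the text
RHO_ICE = 0.92      # g/cm^3, from Example 5.7
RHO_GOLD = 19.32    # g/cm^3, from footnote 13
RHO_WOOD = 0.6      # g/cm^3, assumed illustrative value


def floats(rho_object, rho_fluid):
    """Necessary condition for floating: the object is less dense than the fluid."""
    return rho_object < rho_fluid


def submerged_fraction(rho_object, rho_fluid):
    """Fraction h / l (equivalently V_uw / V) of a floating object below the surface."""
    if not floats(rho_object, rho_fluid):
        raise ValueError("object is not less dense than the fluid, so it does not float")
    return rho_object / rho_fluid


print(floats(RHO_GOLD, RHO_WATER))               # gold pebble: sinks regardless of size
print(floats(RHO_WOOD, RHO_WATER))               # wood: floats regardless of size
print(submerged_fraction(RHO_ICE, RHO_WATER))    # about 0.92 for ice in water
```

Note that neither function takes the object's dimensions as an argument, which mirrors the remark that only density matters.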

5.2.3 Stability

Archimedes’ Principle is also crucial for understanding the conditions

under which floating objects (e.g. ships) will remain stable.

Consider an object floating in liquid and let C denote its center of mass

(see Figure 5.4). The center of mass is subject to a gravitational force of magnitude

mg. If the object’s distribution of mass is such that the center of mass does

not coincide with the point where the buoyant force acts (point B in the

figure), there will be a torque (relative to any reference point on the object),

and the object will rotate (clockwise, in our example).

Figure 5.4: A floating object with center of mass C. (Figure labels: B, C, mg, Fb, water line.)

As we learned in Example 4.3, the object will be stable when B and C

are directly aligned on a vertical line. If, furthermore, the center of mass

is below the center of the buoyant force (as in the figure), the torque is

restoring, so any tilting of the object (due to an external torque) will bring

it back to equilibrium (through the internal restoring torque). The lower

the center of mass is relative to the object’s point of buoyancy, the more

stability the object will possess. This is why ships are designed to have very

low centers of mass, as close as possible to their keel.

Example 5.8 (Balloons) Similar intuitions can be applied to gaseous fluids.

Consider a balloon that is filled with gas. The balloon has mass m =

mgas + mrest, which includes the mass of the gas inside the balloon, mgas,

and the mass of the rest of materials, mrest (including rubber and string).

Let ρgas be the density of the gas inside, and ρair be the air’s density outside.

For the balloon to rise in the air, it must be that Fb > mg. Here,

the buoyant force Fb is, by Archimedes’ Principle, the weight of the fluid

(air) that is displaced by the balloon, namely V ρairg, where V is the balloon’s

volume. Thus, a necessary condition for rise is Fb = V ρairg > mg. Using

mgas = V ρgas, the necessary condition reads:

$$\rho_{\text{air}} > \rho_{\text{gas}} + \frac{m_{\text{rest}}}{V}$$

Thus, the only way in which the balloon can rise is that the gas inside

has sufficiently low density.

For example, if we fill the balloon with air (so ρgas = ρair), the balloon

will never rise, as the weight of its materials (rubber and string) always

brings it down. But if we fill it up with helium, which is substantially less

dense than air, the balloon will rise provided (i) its materials are not too

heavy, and (ii) the balloon is sufficiently large in volume.
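The rise condition can be turned into a minimum balloon volume, V > mrest/(ρair − ρgas). The sketch below assumes approximate room-temperature densities for air (1.2 kg/m³) and helium (0.17 kg/m³); these numbers, the 3 g mass of rubber and string, and the function names are illustrative assumptions rather than values from the text.

```python
import math

RHO_AIR = 1.2      # kg/m^3, approximate (assumed)
RHO_HELIUM = 0.17  # kg/m^3, approximate (assumed)


def rises(volume, m_rest, rho_gas, rho_air=RHO_AIR):
    """Check the condition rho_air > rho_gas + m_rest / V from Example 5.8."""
    return rho_air > rho_gas + m_rest / volume


def min_volume(m_rest, rho_gas, rho_air=RHO_AIR):
    """Smallest volume above which the balloon rises; infinite if the gas is too dense."""
    if rho_gas >= rho_air:
        return math.inf
    return m_rest / (rho_air - rho_gas)


m_rest = 0.003  # kg of rubber and string (hypothetical)
print(f"helium: V must exceed {min_volume(m_rest, RHO_HELIUM) * 1000:.2f} liters")
print("air-filled balloon can rise:", math.isfinite(min_volume(m_rest, RHO_AIR)))
```

Both conditions listed in the example (light materials, large volume) appear here as a single requirement on the ratio mrest/V.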

Example 5.9 (Objects in a box) Consider a sealed compartment in outer

space in static equilibrium, i.e. with no external forces or torques (see Figure

5.5). Inside the compartment there is an apple (object A) and a helium-filled

balloon (object B). Suppose the box is accelerated with magnitude a in some

direction (in this case, horizontally). The acceleration generates a percep-

tion of gravity for everything that is inside the box, of the same magnitude

but opposite direction as the acceleration, denoted by gp.

Then:

Figure 5.5: An apple (A) and a balloon (B) inside a sealed box in outer space. Left: Vacuum. Right: Box filled with air. (Figure labels in each panel: A, B, a, gp, x1, x2.)

• If there is vacuum in the box (Figure 5.5, left panel), then both apple

and balloon would move in the direction in which gravity is perceived,

i.e. opposite to the acceleration.

• If the box is filled with air (Figure 5.5, right panel), accelerating the

box generates the perception of gravity as well. Crucially, the air inside

the box is being accelerated, just as the apple and the balloon are. The

air pushing on the box creates a pressure differential between points x1

and x2. Just as air pressure is higher on the Earth’s surface because

of gravity, pressure will be greater on the left wall of the box than on

the right, or P1 > P2 (where P1 is the pressure on any point of the

box’s surface at position x1, and similarly for P2 at x2). This means that

the balloon will “rise” relative to the perceived gravity (i.e. move in the

direction of a) whereas the apple will “fall” relative to the perceived

gravity (i.e. move leftward).

Now suppose that the same box is placed on the Earth’s surface, where

there is an actual gravitational force g. To overcome gravity, we suspend

the apple from the ceiling of the box with a string, and attach the helium

balloon to another string which is attached to the floor of the box (Figure

5.6). Again, the box is accelerated in the horizontal direction, generating a

perception of gravity for everything inside the box.

• If there is vacuum in the box (Figure 5.6, left panel), both the apple and

the balloon will arc to the left to some position A′ and B′, respectively,

because of the direction of the perceived gravity gp.

• If the box is filled with air (Figure 5.6, right panel), accelerating the

box horizontally generates the perception of gravity gp in the horizontal

direction. The apple will arc again toward the left to some position A′.

However, now the acceleration generates a pressure differential between

x1 and x2. In particular, P1 > P2 and, because of the low density of

helium, the balloon will go forward! The latter is counterintuitive, but

completely consistent with Archimedes’ Principle.

Figure 5.6: An apple and a balloon inside a sealed box on Earth. The apple (object A) hangs from a string; the balloon (object B) floats on a string. Left: Vacuum. Right: Box filled with air. (Figure labels in each panel: A, B, A′, B′, a, gp, g, x1, x2.)

5.3 Fluid Dynamics

In the previous section, we have analyzed the behavior of objects and

fluids when the latter are at rest. Now, we study situations in which fluids

are in motion.

Consider an incompressible fluid (e.g. a liquid) running inside a vessel

(e.g. a pipe), as in Figure 5.7. Let A2 denote the cross-sectional area of the

liquid on one end of the pipe, and P2 be the pressure across that surface.

The liquid is moving at speed v2. On the other end of the pipe, a given cross

section of area A1 experiences pressure P1 and velocity v1.

Figure 5.7: Liquid running through a pipe. (Figure labels: A1, A2, v1 dt, v2 dt, y1, y2, P1, P2.)

If the fluid were completely static (i.e. v1 = v2 = 0), then by Pascal’s

Principle we would have P1 − P2 = ρgh > 0, where h ≡ y2 − y1 > 0. Thus,

if the liquid were static, the pressure at y1 would be higher than at y2. Since

ρ = m/V , then we can write:

$$P_1 - P_2 = \frac{mgh}{V}$$

We identify in mgh the gravitational potential energy of the system.

Thus, P1 − P2 is expressed in energy per unit volume.

Let’s now set the liquid in motion. In this process, three actors are at

play: the kinetic energy of motion (per unit volume), the gravitational po-

tential energy (per unit volume), and pressure. What is key is that the total

energy is conserved, and therefore the sum of the three must be constant.

Intuitively, as we move the liquid from one place to another, we trade off

speed for either height h or pressure P .

At any location y:

• Kinetic energy is $T = \frac{1}{2} m v^2$ or, per unit volume, $\frac{1}{2}\rho v^2$.

• Gravitational potential energy is Vg = mgy or, per unit volume, ρgy.

Therefore, the conservation of total energy requires that:

$$\frac{1}{2}\rho v^2 + \rho g y + P_y = \text{constant} \tag{5.7}$$

where Py is the pressure at point y. This is called Bernoulli’s equation,

after Daniel Bernoulli (1700 – 1782). In words:

Principle 5.3 (Bernoulli’s Principle) An increase in the speed of a fluid

implies a decrease in pressure or a decrease in the fluid’s potential energy,

as per equation (5.7).

Indeed, from equation (5.7) it is clear that an increase in speed must

come at the expense of a lower potential energy or a lower pressure (or both,

to some extent). Else, total energy would not be conserved. In particular,

specializing equation (5.7) to our two cross sections in Figure 5.7:

$$\frac{1}{2}\rho\,(v_2^2 - v_1^2) + \rho g h + \Delta P = 0 \tag{5.8}$$

where ∆P ≡ P2−P1 denotes the pressure change. Let’s apply Bernoulli’s

Principle to a few special cases of equation (5.8):

• First, consider the special case when h = 0, so that the liquid flows

through a pipe along a straight line at height y (Figure 5.8). Since the

liquid is incompressible, the same amount of matter must flow through

disk 1 in a certain unit of time as through disk 2, so A1v1 = A2v2.

Therefore, in this example, since A1 < A2, then v1 > v2.

By Bernoulli’s Principle, i.e. equation (5.8) with h = 0, we have:

$$\frac{1}{2}\rho v_1^2 + P_1 = \frac{1}{2}\rho v_2^2 + P_2$$

Figure 5.8: Special case 1: h = 0. (Figure labels: y, A1, A2, P1, P2, v1, v2.)

and since v1 > v2, then it must be that P1 < P2. Thus, surprisingly, the

liquid’s pressure is lowest in the region of the pipe where its speed is

highest.

• Next, consider Figure 5.9. A tube is introduced into a container filled

with liquid. This device is called a siphon.

Take the open end and suck in the air so the tube becomes completely

filled with the liquid (as reflected in the figure by the shaded areas).

The liquid exits the tube at velocity v1, and it empties the container

at velocity v2. Because the cross-sectional area of the tube is a lot

smaller than the surface area in the container, v2 ≈ 0, i.e. the water

line descends very slowly.14

Because both ends of the tube are open, the pressure at both ends

is the same and equal to P1 = P2 = 1 atm (as there is barometric

pressure). Therefore, here we have a special case in which h > 0 but

∆P ≡ P2 − P1 = 0.

Using equation (5.8) for this special case thus gives $\frac{1}{2}\rho v_1^2 + \rho g y_1 = \rho g y_2$, that is $\frac{1}{2} v_1^2 = gh$. Thus:

$$v_1 = \sqrt{2gh}$$

14 Again, here we use that A1v1 = A2v2 because the liquid is incompressible. In other words, $\frac{A_1}{A_2} = \frac{v_2}{v_1}$, but $\frac{A_1}{A_2} \approx 0$ because the tube’s opening is very small relative to the surface area in the container.

Figure 5.9: Special case 2: ∆P = 0. (Figure labels: h, d, y1, y2, y3, $v_1 = \sqrt{2gh}$, $v_2 \approx 0$.)

is the speed at which the liquid is running out. We have seen this

formula before: this is exactly the same speed that would be reached

at y1 by an object in free fall from y2 (recall e.g. Example 2.16).15

In the first special case, the change in the liquid’s speed is only at the

expense of a pressure change. In the second special case, instead, the

pressure is the same at both open ends (and, given the vast difference in cross-sectional

areas, the liquid in the container is nearly at rest), so all the speed comes from gravitational potential energy being

converted into the kinetic energy involved in the liquid’s motion.
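Both special cases of equation (5.8) can be evaluated numerically. The sketch below is illustrative: the function names, the pipe areas, the inlet speed, and the siphon drop are hypothetical values, and g = 9.81 m/s² and a water density of 1000 kg/m³ are assumed.

```python
import math

G = 9.81            # m/s^2 (assumed)
RHO_WATER = 1000.0  # kg/m^3 (rounded, assumed)


def continuity_speed(v1, a1, a2):
    """Speed v2 implied by incompressibility, A1*v1 = A2*v2."""
    return v1 * a1 / a2


def pressure_change_level_pipe(rho, v1, a1, a2):
    """Special case 1 (h = 0): Delta P = P2 - P1 = (1/2)*rho*(v1^2 - v2^2)."""
    v2 = continuity_speed(v1, a1, a2)
    return 0.5 * rho * (v1**2 - v2**2)


def siphon_exit_speed(h, g=G):
    """Special case 2 (Delta P = 0, v2 ~ 0): v1 = sqrt(2*g*h)."""
    return math.sqrt(2.0 * g * h)


# Narrow section (1 cm^2) feeding a wide section (4 cm^2) at 2 m/s.
dp = pressure_change_level_pipe(RHO_WATER, v1=2.0, a1=1e-4, a2=4e-4)
print(f"P2 - P1 = {dp:.0f} Pa (pressure is higher where the flow is slower)")

# Siphon whose outlet sits 0.5 m below the water line.
print(f"exit speed = {siphon_exit_speed(0.5):.2f} m/s")
```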

15 This provides an ingenious method for stealing gasoline from a car: introduce a tube in the car’s gas tank, suck the air out of the tube, and as long as the dry end of the tube is below the tank (so that h > 0), the gasoline will automatically flow with no further effort on our part.

5.4 Oscillations

Oscillatory motion has appeared in many examples in the previous chap-

ters. In this section, we first review simple harmonic oscillators, and then

explore some other topics on oscillations.

5.4.1 Simple Harmonic Oscillators

In very general terms, consider an object of mass m with center of mass

C that rotates about some point P (e.g. Figure 4.2). The torque about

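point P is $\vec{\tau}_P = \vec{r}_P \times \vec{F}$, and its magnitude is:

$$\tau_P = b m g \sin\theta = -I_P\ddot{\theta}$$

where $\vec{F}$ is the gravitational force vector, $\theta \equiv \angle(\vec{r}_P, \vec{F})$, and $b \equiv |\vec{r}_P|$ is the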

distance between P and C. The second equality is because the motion is

rotational (recall Result 3.6), where IP is the moment of inertia about point

P .

Near the stable equilibrium (θ = 0, where τP = 0), we can use the small-angle approximation sin θ ≈ θ, so the equation of motion becomes:

$$\ddot{\theta} + \left(\frac{b m g}{I_P}\right)\theta = 0$$

This is a simple harmonic oscillator in θ. Therefore, the solution is:

θ = θmax cos(ωt+ ϕ)

where θmax is the amplitude, ω is the angular frequency (a constant),16

and ϕ is the phase angle. As we have seen before (e.g. Example 1.9), the

16 Again, the angular frequency ω should not be confused with the angular velocity, $\omega \equiv \dot{\theta}$.

solution to this DE gives the angular frequency and period of motion:

$$\omega = \sqrt{\frac{b m g}{I_P}} \ \text{radians/sec} \tag{5.9a}$$

$$T = \frac{2\pi}{\omega} \ \text{sec} \tag{5.9b}$$

In the previous chapters, we have seen that many objects behave in this

manner, e.g. pendulums, springs, or disks. For each case, the moment of

inertia IP is different. Let’s examine each one in turn:

• Rod The moment of inertia of a rod of length L is:

$$I_P = \frac{1}{12} m L^2 + m b^2$$

where the first term is the moment of inertia about the center of mass, and the second term follows from the Parallel Axis Theorem (Result 3.1). If the mass of the rod is evenly distributed, then C is located in the middle of the rod; if, moreover, the rod pivots about one of its ends, then $b = \frac{1}{2}L$ and $I_P = \frac{1}{3} m L^2$. Therefore, by equations (5.9a)-(5.9b), the angular frequency and period of the rod are:

$$\omega = \sqrt{\frac{3g}{2L}} \quad\text{and}\quad T = 2\pi\sqrt{\frac{2L}{3g}}$$

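• Pendulum A pendulum is a bob attached to a massless string of length ℓ. By the Parallel Axis Theorem, its moment of inertia is:

$$I_P = I_C + m b^2$$

where $I_C$, the bob’s moment of inertia about its own center of mass, is negligible when the bob is small compared to ℓ. Since the string is massless, the center of mass is at the bob, so b = ℓ, and thus $I_P = m\ell^2$. Therefore, the angular frequency and period of the pendulum are:

$$\omega = \sqrt{\frac{g}{\ell}} \quad\text{and}\quad T = 2\pi\sqrt{\frac{\ell}{g}}$$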

in agreement with our results from Example 1.10.

• Ring The moment of inertia of a ring (i.e. a hula-hoop) of radius r is:

$$I_P = m r^2 + m b^2$$

by the Parallel Axis Theorem. A hula-hoop has center of mass at the center of the circle, so b = r (for a pivot on the rim), and thus $I_P = 2mr^2$. Therefore, the angular frequency and period of the ring are:

$$\omega = \sqrt{\frac{g}{2r}} \quad\text{and}\quad T = 2\pi\sqrt{\frac{2r}{g}}$$

exactly as we found in Example 3.8.

• Disk The moment of inertia of a disk of radius R is:

$$I_P = \frac{1}{2} m R^2 + m b^2$$

by the Parallel Axis Theorem. A disk has center of mass at the center of the circle, so b = R (for a pivot on the rim), and thus $I_P = \frac{3}{2} m R^2$. Therefore, the angular frequency and period of the disk are:

$$\omega = \sqrt{\frac{2g}{3R}} \quad\text{and}\quad T = 2\pi\sqrt{\frac{3R}{2g}}$$

Comparing our results across the different objects, we observe that a rod

(of length L), a pendulum (of length `), a ring (of radius r), and a disk (of

radius R), will all have the same period of motion if $\frac{2}{3}L = \ell = 2r = \frac{3}{2}R$.
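The equal-period condition can be verified numerically from equations (5.9a)-(5.9b). The sketch below is illustrative: it works with the moment of inertia per unit mass (the mass cancels in ω), assumes g = 9.81 m/s², takes the pivot at the end of the rod and on the rim of the ring and disk, and treats the pendulum bob as a point mass. The function names are hypothetical.

```python
import math

G = 9.81  # m/s^2 (assumed)


def period(i_cm_per_mass, b, g=G):
    """Small-angle period for a pivot at distance b from the center of mass.

    i_cm_per_mass is I_C / m. By the parallel-axis theorem, I_P / m = I_C / m + b**2,
    and omega = sqrt(b*g / (I_P / m)), so the mass drops out.
    """
    i_p_per_mass = i_cm_per_mass + b**2
    omega = math.sqrt(b * g / i_p_per_mass)
    return 2.0 * math.pi / omega


def rod_period(length):       # uniform rod pivoting about one end
    return period(length**2 / 12.0, length / 2.0)


def pendulum_period(length):  # point-like bob on a massless string
    return period(0.0, length)


def ring_period(radius):      # hoop pivoting about a point on its rim
    return period(radius**2, radius)


def disk_period(radius):      # disk pivoting about a point on its rim
    return period(radius**2 / 2.0, radius)


# Dimensions chosen so that (2/3) L = l = 2 r = (3/2) R = 1 m.
for name, value in [("rod", rod_period(1.5)), ("pendulum", pendulum_period(1.0)),
                    ("ring", ring_period(0.5)), ("disk", disk_period(2.0 / 3.0))]:
    print(f"{name:9s} T = {value:.4f} s")
```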

Simple harmonic oscillations are not limited to these four objects, how-

ever. Let us see one such example:

Example 5.10 (Oscillating liquid) Consider the manometer (Figure 5.2).

If air is blown into one end of the tube and then released, the liquid will os-

cillate.

Let the liquid have mass m and density ρ. The cross-sectional area of

the tube is A. The length of the liquid (i.e. of the segment of the tube that is

filled with liquid) is L. By definition of density m = V ρ, where V = AL is

the volume of the liquid. Let y be the total displacement of the liquid (with

y = 0 when the liquid is at rest17).

Since the cross-sectional area of the tube is everywhere the same, the

velocity of the liquid (when released) is the same everywhere along the tube

at any given time, equal to $v = \dot{y}$. Assuming, for simplicity, that the liquid’s

motion inside the tube generates no energy loss¹⁸, we can now invoke the

Conservation of Mechanical Energy. First, the kinetic energy of the system

is

$$T = \frac{1}{2} m \dot{y}^2$$

where recall m = ALρ. For the gravitational potential energy, first normalize

V to V = 0 when y = 0 (i.e. when the liquid is at its stable equilibrium).

When the liquid has been displaced by a distance y, the accumulated mass

is ∆m = ρ∆V , where the increase in volume is ∆V = Ay. Thus, the

gravitational potential energy is $V = (\Delta m) g y = A \rho g y^2$. Thus, the total

energy is:

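$$E = T + V = \frac{1}{2} A L \rho\, \dot{y}^2 + A \rho g y^2$$

Thus, the rate of change of total energy is $\dot{E} = A L \rho\, \dot{y}\ddot{y} + 2 A \rho g\, y \dot{y}$. By the conservation of total energy, $\dot{E} = 0$, and thus (dividing through by $A L \rho\, \dot{y}$):

$$\ddot{y} + \left(\frac{2g}{L}\right) y = 0$$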

a simple harmonic oscillator. Therefore, as we know, the solution will

be $y = y_{\max}\cos(\omega t + \varphi)$, where $\omega = \sqrt{2g/L}$ is the angular frequency, $y_{\max}$ is the amplitude, and ϕ the phase angle. The period of oscillation is $T = 2\pi/\omega = 2\pi\sqrt{L/(2g)}$.

17 That is, in the notation of Figure 5.2, y1 ≡ 0.

18 In reality, some energy loss might occur from the friction between the liquid and the inner walls of the tube, transforming some (though a small amount of) mechanical energy into heat.

Compared to our examples before, therefore, an oscillating liquid has the

same period as a pendulum of length ℓ if L = 2ℓ, that is, if the length of the

liquid in the tube is twice the length of the pendulum.
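The predicted period can be cross-checked by integrating the equation of motion ÿ + (2g/L)y = 0 numerically. The following is a sketch under stated assumptions: g = 9.81 m/s², a liquid column of length L = 0.5 m released from rest at y = 2 cm (both hypothetical), and a velocity-Verlet integrator chosen here for illustration only.

```python
import math

G = 9.81  # m/s^2 (assumed)


def liquid_period(length, g=G):
    """Period T = 2*pi*sqrt(L / (2g)) of the oscillating liquid column."""
    return 2.0 * math.pi * math.sqrt(length / (2.0 * g))


def simulate(length, y0, t_end, dt=1e-4, g=G):
    """Velocity-Verlet integration of y'' = -(2g/L) y, starting at rest at y0."""
    w2 = 2.0 * g / length
    y, v = y0, 0.0
    for _ in range(round(t_end / dt)):
        a = -w2 * y
        y += v * dt + 0.5 * a * dt * dt
        v += 0.5 * (a - w2 * y) * dt   # average of old and new accelerations
    return y, v


L, y0 = 0.5, 0.02
T = liquid_period(L)
y_end, v_end = simulate(L, y0, T)
print(f"predicted period: {T:.4f} s")
print(f"after one period: y = {y_end:.5f} m, v = {v_end:.5f} m/s")
print(f"pendulum of length L/2: {2 * math.pi * math.sqrt((L / 2) / G):.4f} s")
```

After one predicted period the simulated column should be back near its starting displacement with nearly zero velocity, and the period should coincide with that of a pendulum of length L/2.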

Example 5.11 (Torsional Pendulum) The torsional pendulum (Figure

5.10) is another example of a simple harmonic oscillator. A torsional pen-

dulum is a rod (or disk) hanging from a wire or rope and placed horizontally.

The disk is then offset over some angle, and it oscillates back and forth.

Figure 5.10: The torsional pendulum. (Figure label: P.)

Let P be the center of the disk. The torque relative to P is restoring

and obeys a rotational version of Hooke’s law: just like the force in a simple

spring is proportional to the spring’s linear displacement, the restoring torque

in the torsional pendulum is proportional to the pendulum’s angular displacement.

In magnitude:

$$\tau_P = -\kappa\theta$$

where θ is the angle, and κ is the torsional spring constant. Because the

motion is rotational, $\tau_P = I_P\alpha$, where $\alpha \equiv \ddot{\theta}$ is the angular acceleration.

Thus, equating the two expressions for τP:

$$\ddot{\theta} + \left(\frac{\kappa}{I_P}\right)\theta = 0$$

which is, again, a simple harmonic oscillator. The solution is $\theta = \theta_{\max}\cos(\omega t + \varphi)$, with angular frequency $\omega = \sqrt{\kappa / I_P}$ and period of oscillation $T = 2\pi\sqrt{I_P/\kappa}$.¹⁹
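In practice the relation T = 2π√(IP/κ) is often used in reverse, to infer κ from a measured period. The sketch below illustrates both directions; the disk's mass, radius, and torsional constant are hypothetical, the function names are illustrative, and the moment of inertia ½mR² assumes a uniform disk twisting about the axis through its center.

```python
import math


def disk_inertia(mass, radius):
    """Moment of inertia of a uniform disk about the axis through its center."""
    return 0.5 * mass * radius**2


def torsional_period(i_p, kappa):
    """Period T = 2*pi*sqrt(I_P / kappa) of a torsional pendulum."""
    return 2.0 * math.pi * math.sqrt(i_p / kappa)


def kappa_from_period(i_p, period):
    """Invert the period formula: kappa = 4*pi^2 * I_P / T^2."""
    return 4.0 * math.pi**2 * i_p / period**2


i_p = disk_inertia(mass=0.2, radius=0.05)   # hypothetical disk: 200 g, 5 cm radius
T = torsional_period(i_p, kappa=1e-3)       # hypothetical kappa in N*m/rad
print(f"I_P = {i_p:.2e} kg m^2, T = {T:.3f} s")
print(f"kappa recovered from T: {kappa_from_period(i_p, T):.2e} N m/rad")
```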

5.4.2 Forced Oscillations: Resonance and Damping

So far, we have analyzed situations in which objects are allowed to os-

cillate freely with a certain frequency after an external force or a torque

displaces them from their stable equilibrium. In this section, we examine

situations in which certain frequencies are forced upon the system.

Consider an otherwise simple harmonic oscillator (e.g. a spring, Figure

5.11). By Hooke’s Law, there is a restoring internal force −kx at each

position x. Now, consider adding an external force F (t), which we may call

a driving force. For simplicity, we add this force in a sinusoidal fashion (that

is, the added force is itself oscillating), so that:

F (t) ≡ F0 cos(ωt)

Here, the forced amplitude F0 and angular frequency ω are our choice.

Newton’s Second Law now says that ma = −kx + F0 cos(ωt). Using

$a = \ddot{x}$, we have:

$$\ddot{x} + \left(\frac{k}{m}\right) x = \frac{F_0}{m}\cos(\omega t) \tag{5.10}$$

Since the right-hand side is non-zero, this DE is no longer a simple

harmonic oscillator. If it were, we know the process would be described

by $x = x_{\max}\cos(\omega_0 t + \varphi)$, with angular frequency $\omega_0 = \sqrt{k/m}$, which in

this context we may call the natural frequency. Under the driving force

F, however, we have a second-order nonhomogeneous ODE with constant

coefficients, so we must use Method IV (Remark 0.5) to find its general

19 The constant κ will depend on the cross-sectional area A and the length ℓ of the wire or rope holding the disk. Recall that when objects are composed of linear-elasticity materials and obey linear oscillations, equation (4.11) says that the oscillation constant is given by $k = \frac{YA}{\ell}$, where Y is Young’s modulus (whose value depends on the exact material that the wire or rope is made of). Intuitively, thicker (higher A) and shorter (lower ℓ) ropes would oscillate less when the object is stretched. Here, the rope is not being stretched but twisted, but the same intuition applies: κ is increasing in A and decreasing in ℓ (indeed it is easier to twist longer and thinner ropes).

Figure 5.11: Spring motion as a harmonic oscillator. Above: Spring in a relaxed position (x = 0); Below: Stretched spring, in position x > 0, with restoring force −kx and driving force F0 cos(ωt).

solution.

Here, we will just find a particular integral of x. Since the nonhomogeneous

part is $\frac{F_0}{m}\cos(\omega t)$, we propose a guess with a similar functional form:

$$x_p = A\cos(\omega t)$$

where A is the amplitude of the oscillation. Then, $\dot{x}_p = -A\omega\sin(\omega t)$,

and $\ddot{x}_p = -A\omega^2\cos(\omega t)$. Substituting into (5.10), we get:

$$-A\omega^2\cos(\omega t) + \omega_0^2 A\cos(\omega t) = \frac{F_0}{m}\cos(\omega t)$$

where we have used $\omega_0^2 \equiv \frac{k}{m}$ (and ω0 denotes the angular frequency of the process when it oscillates freely). Matching coefficients, we get the amplitude:

$$A = \frac{F_0}{m(\omega_0^2 - \omega^2)} \tag{5.11}$$

This means that the process:

$$x = \frac{F_0}{m(\omega_0^2 - \omega^2)}\cos(\omega t) \tag{5.12}$$

is a particular integral of DE (5.10). Under certain initial conditions, the

system will oscillate with an amplitude A (given by equation (5.11)), and

angular frequency ω. Under general initial conditions, the system might go

through a period in which amplitude and frequency are different from A and

ω, but eventually (once any friction, however small, is present) these oscillations will die out in favor of the ones that are

forced upon the system, and the process will follow the law of motion given

in equation (5.12).

• The initial phase, if it exists, is called the transient response to F (t).

• The phase in which the system has converged to the forced oscillation

is called the steady-state.

Equation (5.12) says that the object oscillates at the same frequency as

the driving force F (t), but with an amplitude A which depends on (i) the

frequency ω of the driving force, and (ii) the frequency ω0 of the natural

motion of the oscillator. Figure 5.12 plots the amplitude A as a function of

ω.20

First, note that:

$$A \begin{cases} > 0 & \text{if } \omega < \omega_0 \\ < 0 & \text{if } \omega > \omega_0 \end{cases}$$

In words, if the forced frequency is lower than the natural frequency,

then the amplitude is positive and the motion is in phase with the driving force. But if the forced

frequency is higher than the natural frequency, the amplitude becomes neg-

ative, which means that the process is still sinusoidal but becomes out of

phase by 180◦ (recall Figure 1.5). Thus, as frequencies increase above ω0,

there is a phase shift of 180◦.

There are three limiting cases of interest:

• If ω → 0, then $A \to \frac{F_0}{k}$.

20 The angular frequency ω is in radians per second, but it can be converted into Hertz via the identity ω = 2πf, where the frequency f is in Hz, as f = 1/T (where T is the period of oscillation, in seconds).

Figure 5.12: The amplitude $A = \frac{F_0}{m(\omega_0^2 - \omega^2)}$ as a function of the imposed (or driving) frequency, ω, in radians per second. The natural frequency is ω0. Note: To convert to Hz, one can use that ω = 2πf, where f is frequency in Hz. (Axes: imposed frequency ω, horizontal; amplitude, vertical. Marked values: ω0 and F0/k.)

That is, if the system is driven to very low frequencies (compared to its

natural frequency ω0), then by equation (5.11) the amplitude of motion

is proportional to the amplitude of the driving force by a factor of $\frac{1}{k}$.

• If ω → +∞, then A→ 0.

That is, if the system is driven to very high frequencies, then by equa-

tion (5.11) the amplitude decays to zero.

• If ω → ω0, then |A| → +∞.

That is, if the system’s frequency is forced to being the natural one

(i.e. if the two are synchronized), the amplitude explodes to infinity,

and we would see an enormous displacement.

The latter case is pathological: when the system is forced to oscillate at

the same frequency as its natural one, but is driven by an external force with

a non-zero amplitude, the system’s amplitude increases to infinity. In Figure

5.12, this appears as an asymptote. This phenomenon is called resonance,

and the frequency at which it occurs (here the natural frequency ω0) is called

the resonant frequency.
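The three limiting cases, and the sign change at ω0, can be seen by evaluating equation (5.11) directly. The sketch below uses hypothetical values (m = 1 kg, k = 4 N/m, F0 = 1 N, so ω0 = 2 rad/s); the function name is illustrative.

```python
import math


def amplitude(omega, omega0, f0, m):
    """Undamped steady-state amplitude A = F0 / (m (omega0^2 - omega^2)), equation (5.11)."""
    denom = m * (omega0**2 - omega**2)
    if denom == 0:
        raise ZeroDivisionError("the undamped amplitude diverges at omega == omega0")
    return f0 / denom


m, k, f0 = 1.0, 4.0, 1.0          # hypothetical oscillator and driving force
omega0 = math.sqrt(k / m)         # natural frequency, 2 rad/s here

for omega in (0.0, 1.0, 1.9, 1.99, 2.01, 2.1, 4.0, 100.0):
    print(f"omega = {omega:7.2f} rad/s  ->  A = {amplitude(omega, omega0, f0, m): .5f} m")
```

The printed values start at F0/k for ω = 0, grow in magnitude as ω approaches ω0 from either side, flip sign across ω0, and shrink toward zero for large ω.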

Definition 5.6 (Resonance) A phenomenon in which an external force

drives a system to oscillate with greater amplitude at specific frequencies.

Definition 5.7 (Resonant frequency) Frequency (or frequencies) at which

the response amplitude is a relative maximum. Also called normal mode or

natural frequency.

In practice, the amplitude will be very large at frequencies close to

or at the resonant frequency, but certainly it will not be infinite. The reason

is that there is always a frictional force, or damping, which will limit the

oscillations to finite amplitudes.

Definition 5.8 (Damping) A frictional force within or upon an oscilla-

tory system that reduces, restricts, or prevents its oscillations.

Thus, in fact, amplitudes look more like Figure 5.13 than Figure 5.12.

Figure 5.13 shows examples for the amplitude (in absolute value) as a function

of the driving frequency, for different levels of damping. This graph

is sometimes called the resonance curve.

Definition 5.9 (Resonance curve) The graph of the (absolute value of)

amplitude against the driving frequency, when friction (i.e. damping) is at play.

In principle, the frictional term of damping might depend on the oscillator’s

displacement in a complex way. In many cases, however, damping

is (nearly) proportional to the object’s speed. In this case,

damping is defined as a frictional force

$$F_{\text{damping}} = -b\dot{x}$$

for some factor of proportionality b, where the minus sign indicates that it

always opposes the motion.

Figure 5.13: The resonance curve: amplitude A (in absolute value) as a function of the driving frequency, in practice, with more or less friction (damping). For frequencies higher than the resonant frequency (marked f0 Hz in the plot), there is a phase shift of 180°.

If this is the case, then Newton’s Second Law for the driven oscillator reads

$ma = -kx - b\dot{x} + F_0\cos(\omega t)$, and thus:

$$\ddot{x} + \gamma\dot{x} + \omega_0^2 x = \frac{F_0}{m}\cos(\omega t)$$

where we have defined $\gamma \equiv \frac{b}{m}$ and, once again, $\omega_0 \equiv \sqrt{k/m}$ is the resonant frequency, and the driving force is assumed to be sinusoidal: $F(t) = F_0\cos(\omega t)$. Here, γ therefore measures the strength of damping.

We have now a nonhomogeneous second-order ODE with constant coefficients. We conjecture a solution of the form:

$$x = A e^{i\omega t}$$

where $i = \sqrt{-1}$. By Euler’s formula (Result 0.1), stating that $e^{i\omega t} = \cos(\omega t) + i\sin(\omega t)$, this conjectured solution is sinusoidal, with a real part proportional to cos(ωt) and, potentially, an imaginary part.

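For this conjecture, $\dot{x} = A i\omega e^{i\omega t}$ and $\ddot{x} = A (i\omega)^2 e^{i\omega t}$. Moreover, by Euler’s formula, the driving-force component $\frac{F_0}{m}\cos(\omega t)$ is the real part of $\frac{F_0}{m} e^{i\omega t}$, so we can work with the complex exponential and take real parts at the end. Substituting these into the ODE, we get:

$$A(i\omega)^2 e^{i\omega t} + \gamma A i\omega e^{i\omega t} + \omega_0^2 A e^{i\omega t} = \frac{F_0}{m} e^{i\omega t}$$

Note that $(i\omega)^2 = -\omega^2$. Since $e^{i\omega t} \neq 0$, we can divide through by $e^{i\omega t}$ to obtain:

$$A\left(-\omega^2 + \gamma i\omega + \omega_0^2\right) = \frac{F_0}{m} \tag{5.13}$$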

If there were no driving force (F0 = 0), equation (5.13) would imply $\omega^2 - i\gamma\omega - \omega_0^2 = 0$, the characteristic equation of the system. In that case, the roots are:

$$\omega_1 = \frac{1}{2}\left(\sqrt{4\omega_0^2 - \gamma^2} + i\gamma\right) \quad\text{and}\quad \omega_2 = \frac{1}{2}\left(-\sqrt{4\omega_0^2 - \gamma^2} + i\gamma\right) \tag{5.14}$$

Thus, the general solution with no forced oscillations is $x = A_1 e^{i\omega_1 t} + A_2 e^{i\omega_2 t}$, for some arbitrary amplitudes A1 and A2.

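With a driving force (F0 > 0), equation (5.13) implies a driving amplitude of:

$$A = \frac{F_0}{m(\omega_0^2 - \omega^2 + i\gamma\omega)}$$

and therefore the (complex) displacement is:

$$x = \frac{F_0}{m(\omega_0^2 - \omega^2 + i\gamma\omega)}\, e^{i\omega t}$$

whose real part is the physical displacement.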

The resulting amplitude is similar to the one we obtained under no damp-

ing (equation (5.11)), except for the term iγω. The resonant frequency is

ω = ω0, as before. However, unlike before, for this frequency the amplitude

does not shoot to infinity.

To see this last point formally, call:

$$R \equiv \frac{1}{m(\omega_0^2 - \omega^2 + i\gamma\omega)}$$

so that $x = R F_0 e^{i\omega t}$. Recall from Euler’s formula (Result 0.1) that any complex number can be written in an exponential form. Here, we can write R = p + qi for some real numbers (p, q) and, by Euler’s formula (equation (0.7)):

$$R = \rho e^{i\theta}$$

where $\rho \equiv \sqrt{p^2 + q^2}$ for some angle θ. Therefore, $x = \rho F_0 e^{i\theta} e^{i\omega t} = \rho F_0 e^{i(\omega t + \theta)}$ or, taking the real part and using again Euler’s formula:

$$x = \rho F_0 \cos(\omega t + \theta)$$

This way of writing the displacement shows that:

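• The oscillation is not in phase with the driving force (whose frequency is $\omega$), but it is shifted by some extra amount, $\theta$. The angle $\theta$ solves $1/R = (1/\rho)e^{-i\theta}$, where $1/R = m(\omega_0^2 - \omega^2 + i\gamma\omega)$, and therefore:

\[ \tan\theta = -\frac{\gamma\omega}{\omega_0^2 - \omega^2} \]

where we have used that $\tan(-\theta) = -\tan\theta$. This is negative for $\omega < \omega_0$. More generally, since the imaginary part of $1/R$ is $m\gamma\omega > 0$, we have $\sin\theta < 0$, so that $-\pi < \theta < 0$. That is, the oscillator “lags behind” the driving force for any value of the frequency of the driving force, $\omega$.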

• The amplitude of the oscillator is proportional, but not equal, to that

of the driving force (which is F0), with some factor of proportionality

ρ between the two.

So what happens as we approach the resonant frequency? For this, consider computing $|R|^2 = \rho^2$ (so as to have everywhere positive values for the amplitude):21

\[ |R|^2 = \frac{1}{m^2\left(\omega_0^2 - \omega^2 + i\gamma\omega\right)\left(\omega_0^2 - \omega^2 - i\gamma\omega\right)} = \frac{1}{m^2\left[(\omega_0^2 - \omega^2)^2 + \gamma^2\omega^2\right]} \tag{5.15} \]

21 To compute $|R|^2$, we use the rule that the squared modulus of a complex number equals the product of the number with its complex conjugate (equation (0.5)).

Therefore, as $\omega \to \omega_0$, we have $|R|^2 \to \frac{1}{m^2\gamma^2\omega_0^2}$, or:

\[ |R| \to \frac{1}{m\gamma\omega_0} \]

Thus, for as long as there is damping ($\gamma \neq 0$), the amplitude does not shoot to infinity as we approach the resonant frequency $\omega_0$. Moreover, as $\omega \to \omega_0$ from below, we have:

\[ \tan\theta = -\frac{\gamma\omega}{\omega_0^2 - \omega^2} \to -\infty \]

for any value of $\gamma > 0$. Since $\tan\theta = \frac{\sin\theta}{\cos\theta} \to -\infty$ and $\theta < 0$, it must be that $\sin\theta \to -1$ and $\cos\theta \to 0$, that is, $\theta \to -\frac{\pi}{2}$ (i.e. $-90^\circ$). In words, as we approach the resonant frequency, the oscillator approaches being $90^\circ$ out of phase with the driving force (see Figure 5.14).
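The behavior of the amplitude factor and of the phase near resonance can be checked numerically. The sketch below evaluates the complex response $R$ for a few driving frequencies and converts it to polar form with the standard-library call `cmath.polar`. The helper name `response` and the parameter values are illustrative assumptions.

```python
import cmath
import math

def response(omega, omega0, gamma, m):
    """Complex response R = 1 / (m (omega0^2 - omega^2 + i gamma omega))."""
    return 1.0 / (m * (omega0**2 - omega**2 + 1j * gamma * omega))

m, omega0, gamma = 1.0, 3.0, 0.2  # illustrative values (assumed, not from the text)
for omega in (1.0, 2.9, 3.0, 3.1, 6.0):
    rho, theta = cmath.polar(response(omega, omega0, gamma, m))  # R = rho * exp(i theta)
    print(f"omega={omega:4.1f}  rho={rho:8.4f}  theta={math.degrees(theta):8.2f} deg")
```

At $\omega = \omega_0$ the modulus equals $1/(m\gamma\omega_0)$ and the phase is $-90^\circ$, while for every positive driving frequency the phase stays between $-180^\circ$ and $0^\circ$.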

Again, $\gamma$ controls the strength of damping: when frictional forces are stronger (higher $\gamma$), the maximum amplitude is smaller. Conversely, the resonance is sharper as damping is made smaller. In a plot of equation (5.15), i.e. of $\rho^2$ (the squared amplitude factor) against $\omega$ (the driving frequency), we see that $\gamma$ controls the width of the resonance peak, with lower $\gamma$ meaning a narrower peak.

Another measure of width that is used in practice is:

\[ Q \equiv \frac{\omega_0}{\gamma} \]

so that narrower resonances correspond to higher values for $Q$.
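To make the link between $\gamma$, $Q$, and the width of the peak concrete, the sketch below estimates on a grid the width of the band of driving frequencies over which $|R|^2$ from (5.15) exceeds half of its peak value. For weak damping this width comes out close to $\gamma$ itself. The grid-based method, the function names, and the parameter values are assumptions made for illustration.

```python
def response_sq(omega, omega0, gamma, m=1.0):
    """|R|^2 from equation (5.15)."""
    return 1.0 / (m**2 * ((omega0**2 - omega**2)**2 + gamma**2 * omega**2))

def half_max_width(omega0, gamma, m=1.0, steps=200_000):
    """Grid estimate of the width of the band where |R|^2 is at least half its peak."""
    omegas = [2.0 * omega0 * i / steps for i in range(1, steps + 1)]
    values = [response_sq(w, omega0, gamma, m) for w in omegas]
    half = max(values) / 2.0
    band = [w for w, v in zip(omegas, values) if v >= half]
    return band[-1] - band[0]

omega0 = 10.0  # illustrative value (assumed)
for gamma in (0.1, 0.5):
    width = half_max_width(omega0, gamma)
    print(f"gamma={gamma}: Q={omega0 / gamma:.0f}, half-maximum width={width:.4f}")
```

Lower damping gives a higher $Q$ and a narrower band, in line with the definition above.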

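[Figure 5.14: The phase angle $\theta$ with damping, as a function of the driving frequency, $\omega$. The horizontal axis shows the driving frequency $\omega$ (with $\omega_0$ marked); the vertical axis shows $\theta$, with ticks at $0^\circ$, $-90^\circ$, and $-180^\circ$.]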

5.4.3 Coupled Oscillators

Systems need not have only one resonance frequency, as we have assumed

so far. More complex systems may exhibit multiple resonance frequencies.

Systems in which different oscillating objects are linked are called coupled

oscillators. For instance, a double spring (i.e. a spring attached at the end

of another spring) is a system of two coupled oscillators which has two reso-

nance frequencies (see Example 5.13 below). In this case, the amplitude (in

absolute value) seen in Figure 5.13 would exhibit two peaks. More generally,

a system with n coupled oscillators will have n resonant frequencies, so the

amplitude curve will exhibit n peaks (see Example 5.14 below).

Let us examine these examples in detail.

Example 5.12 (Two coupled pendulums) Consider two pendulums of equal length $\ell$ and equal mass $m$, attached to each other with a spring (Figure 5.15). Let $x_1$ and $x_2$ denote their respective displacements.

[Figure 5.15: Two coupled pendulums, with displacements $x_1$ and $x_2$.]

If there were no spring, each pendulum would oscillate independently as a simple harmonic oscillator, with natural frequency $\omega_0 = \sqrt{g/\ell}$ (recall Example 1.10), obeying:

\[ \ddot{x}_i + \omega_0^2 x_i = 0 \]

for $i = 1, 2$. Consider adding back the spring. The spring force is, by Hooke’s law, proportional to the distance between the two pendulums. Letting the spring constant be $k$, Newton’s law then says $ma_1 = -m\omega_0^2x_1 - k(x_1 - x_2)$ for pendulum 1, and $ma_2 = -m\omega_0^2x_2 - k(x_2 - x_1)$ for pendulum 2. Thus:

\[ m\ddot{x}_1 + m\omega_0^2x_1 + k(x_1 - x_2) = 0 \]
\[ m\ddot{x}_2 + m\omega_0^2x_2 + k(x_2 - x_1) = 0 \]

a system of two ODEs. This system will have two solutions for the

common frequency of the pendulums, let’s call them (ωs, ωf ) for slow and

fast (i.e. ωs < ωf ). To find them:22

22 Alternative method: Conjecture solutions of the form $x_1 = Ae^{i\omega t}$ and $x_2 = Be^{i\omega t}$, i.e. the pendulums will (eventually) oscillate at the same frequency $\omega$, but with potentially different amplitudes $(A, B)$. Plugging into the ODEs, we obtain $\left(\omega^2 - \omega_0^2 - \frac{k}{m}\right)A = -\frac{k}{m}B$ and $\left(\omega^2 - \omega_0^2 - \frac{k}{m}\right)B = -\frac{k}{m}A$. Multiplying the two together, we readily get $(\omega_s^2, \omega_f^2) = (\omega_0^2, \omega_0^2 + 2k/m)$. Plugging the solutions back, we get $A = B$ for $\omega_s$, and $A = -B$ for $\omega_f$. That is, the pendulums move symmetrically when $\omega = \omega_s$, and antisymmetrically when $\omega = \omega_f$.


• First, add the two together to obtain:

\[ \ddot{\bar{x}} + \omega_0^2\bar{x} = 0 \]

a simple harmonic oscillator in $\bar{x} \equiv x_1 + x_2$. That is, one solution is that the two pendulums oscillate together symmetrically as a single system with frequency $\omega_s = \omega_0$. When one pendulum moves right, the other moves right as well.

• Second, subtract the second equation from the first to obtain:

\[ \ddot{\tilde{x}} + \left(\omega_0^2 + \frac{2k}{m}\right)\tilde{x} = 0 \]

an oscillator for $\tilde{x} \equiv x_1 - x_2$. Thus, the second solution is that the two pendulums oscillate antisymmetrically with common frequency $\omega_f^2 = \omega_0^2 + \frac{2k}{m}$, that is:

\[ \omega_f = \sqrt{\omega_0^2 + \frac{2k}{m}} \]

That is, when one pendulum moves right, the other moves left, and vice versa.

The frequencies $\omega_s$ and $\omega_f > \omega_s$ are the resonant frequencies of the system, and the corresponding patterns of motion are its normal modes. When the system oscillates at the slow resonant frequency, the pendulums oscillate together symmetrically, i.e. when one moves right, the other does so as well (Figure 5.16, left panel). As a result,

the spring between the two pendulums is neither stretched nor compressed.

But if we increase the frequency and reach ωf , then the two pendulums will

start moving antisymmetrically: when one moves to the right, the other one

moves to the left, and vice versa (Figure 5.16, right panel). As a result, the

spring between the two pendulums becomes stretched when the pendulums are

far apart, and compressed when they are close together. In both cases, how-

ever, the two pendulums have the same frequency and, therefore, the same

period of motion.
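The two normal modes of the coupled pendulums can be verified by substitution. The sketch below computes $\omega_s$ and $\omega_f$ and then plugs the trial solutions $x_1 = Ae^{i\omega t}$, $x_2 = Be^{i\omega t}$ into the two equations of motion (with the common factor $e^{i\omega t}$ divided out). The function names and the numerical values of $g$, $\ell$, $k$, and $m$ are assumptions chosen for illustration.

```python
import math

def coupled_pendulum_frequencies(g, length, k, m):
    """Return (omega_s, omega_f) for two identical pendulums joined by a spring."""
    omega0_sq = g / length
    return math.sqrt(omega0_sq), math.sqrt(omega0_sq + 2.0 * k / m)

def mode_residuals(omega, A, B, g, length, k, m):
    """Residuals of both ODEs for x1 = A e^{i w t}, x2 = B e^{i w t} (exponential factored out)."""
    omega0_sq = g / length
    r1 = -m * omega**2 * A + m * omega0_sq * A + k * (A - B)
    r2 = -m * omega**2 * B + m * omega0_sq * B + k * (B - A)
    return r1, r2

g, length, k, m = 9.81, 1.0, 2.0, 0.5  # illustrative values (assumed)
omega_s, omega_f = coupled_pendulum_frequencies(g, length, k, m)
print("symmetric mode (A = B):      ", mode_residuals(omega_s, 1.0, 1.0, g, length, k, m))
print("antisymmetric mode (A = -B): ", mode_residuals(omega_f, 1.0, -1.0, g, length, k, m))
```

Both residual pairs vanish up to rounding, whereas a mismatched combination (for example $A = B$ at $\omega_f$) does not satisfy the equations.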

[Figure 5.16: Left: Oscillation at the lower resonant frequency, $\omega = \omega_s$. Right: Oscillation at the higher resonant frequency, $\omega = \omega_f$.]

A very similar result will be obtained in the following example, which

considers coupled springs as opposed to coupled pendulums.

Example 5.13 (Two coupled springs) Consider two coupled springs (Figure 5.17): two bobs, of masses $m_1$ and $m_2$, are attached to the ends of springs. The spring constants are $k_1$, $k_2$, and $k_3$, where $k_2$ belongs to the middle spring connecting the two bobs. Let $x_1$ and $x_2$ denote, respectively, the displacements of the first and second bobs away from the stable equilibrium.

[Figure 5.17: Two coupled springs, showing the displacements $x_1$ and $x_2$ and the force labels $-k_1x_1$, $-k_2x_2$, and $k_3x_2$.]

For the first mass, Newton’s Second Law says $m_1a_1 = F_1$, where $F_1$ is the net force acting on mass 1. This is composed of the force on mass 1 from moving mass 1 (equal to $-k_1x_1 - k_2x_1$), and the force on mass 1 from moving mass 2 (equal to $k_2x_2$). Thus, Newton’s law on mass 1 says:

\[ m_1\ddot{x}_1 = -(k_1 + k_2)x_1 + k_2x_2 \tag{5.16} \]

Similarly, for mass 2, we have:

\[ m_2\ddot{x}_2 = -(k_2 + k_3)x_2 + k_2x_1 \tag{5.17} \]

We can now find two solutions to this system of ODEs, assuming $m_1 = m_2 = m$ for simplicity:

• Adding equations (5.16) and (5.17), we obtain $m(\ddot{x}_1 + \ddot{x}_2) = -k_1x_1 - k_3x_2$. Note that $k_2$ has disappeared, because the influence of the middle section on either mass cancels out with its effect on the other mass.

Assuming also that $k_1 = k_3 = k$, we can write:

\[ \ddot{\bar{x}} + \frac{k}{m}\bar{x} = 0 \]

a simple harmonic oscillator in $\bar{x} \equiv x_1 + x_2$, with solution $\bar{x} = A_s\cos(\omega_st + \varphi_s)$, where $A_s$ is the amplitude, $\omega_s = \sqrt{k/m}$ is the angular frequency, and $\varphi_s$ is the phase.

• Consider now, instead, subtracting equation (5.17) from equation (5.16). With $m_1 = m_2 = m$, we obtain $m(\ddot{x}_1 - \ddot{x}_2) = -(k_1 + 2k_2)x_1 + (2k_2 + k_3)x_2$.

Assuming, once again, that $k_1 = k_3 = k$, we can write:

\[ \ddot{\tilde{x}} + \frac{k + 2k_2}{m}\tilde{x} = 0 \]

a simple harmonic oscillator in $\tilde{x} \equiv x_1 - x_2$. Thus, the solution is $\tilde{x} = A_f\cos(\omega_ft + \varphi_f)$, where $A_f$ is the amplitude, $\omega_f = \sqrt{(k + 2k_2)/m}$ is the angular frequency, and $\varphi_f$ is the phase.

Thus, we have found two solutions, each of which oscillates at a fixed frequency. The frequencies $\omega_s$ and $\omega_f$ are the resonant frequencies of this system. Comparing the two solutions, notice that $\omega_s = \sqrt{k/m} < \sqrt{(k + 2k_2)/m} = \omega_f$. That is, the second solution oscillates faster than the first one.

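In particular, we have that the displacement for each mass is:

\[ x_1 = \frac{\bar{x} + \tilde{x}}{2} = \frac{1}{2}\Big(A_s\cos(\omega_st + \varphi_s) + A_f\cos(\omega_ft + \varphi_f)\Big) \]
\[ x_2 = \frac{\bar{x} - \tilde{x}}{2} = \frac{1}{2}\Big(A_s\cos(\omega_st + \varphi_s) - A_f\cos(\omega_ft + \varphi_f)\Big) \]

the first equalities by definition, and the second ones by our results above.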

We observe:

• Symmetric oscillation mode:

If we excite the masses so that Af = 0, then x1 = x2, so that both

masses will oscillate at the same frequency ωs. In this case, both

masses move right and left together, in unison. In practice, the masses

will have this motion if we start them off at the same positions, x1(0) =

x2(0). This is called the symmetric oscillation mode.

• Antisymmetric oscillation mode:

If we excite the masses so that $A_s = 0$, then $x_1 = -x_2$, so that both

masses will oscillate at the same frequency ωf . However, in this case,

the masses are out of phase: when one moves to the left, the other one

moves to the right. In practice, the masses will have this motion if we

start them off at the opposite positions, x1(0) = −x2(0). This is called

the antisymmetric oscillation mode.
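The closed-form displacements of the two masses can be checked against the equations of motion (5.16) and (5.17) in the equal-mass case with $k_1 = k_3 = k$. The sketch below evaluates $x_1(t)$ and $x_2(t)$ and estimates the accelerations with central finite differences, so the residuals are only approximately zero. All names and numerical values are illustrative assumptions.

```python
import math

def displacements(t, A_s, phi_s, A_f, phi_f, k, k2, m):
    """x1(t), x2(t) built from the sum mode (slow) and the difference mode (fast)."""
    w_s = math.sqrt(k / m)
    w_f = math.sqrt((k + 2.0 * k2) / m)
    x_sum = A_s * math.cos(w_s * t + phi_s)    # x1 + x2
    x_diff = A_f * math.cos(w_f * t + phi_f)   # x1 - x2
    return 0.5 * (x_sum + x_diff), 0.5 * (x_sum - x_diff)

def ode_residuals(t, params, h=1e-3):
    """Residuals of (5.16)-(5.17) with k1 = k3 = k, using central differences for accelerations."""
    _, _, _, _, k, k2, m = params
    x1m, x2m = displacements(t - h, *params)
    x1, x2 = displacements(t, *params)
    x1p, x2p = displacements(t + h, *params)
    a1 = (x1p - 2.0 * x1 + x1m) / h**2
    a2 = (x2p - 2.0 * x2 + x2m) / h**2
    return m * a1 + (k + k2) * x1 - k2 * x2, m * a2 + (k + k2) * x2 - k2 * x1

params = (1.0, 0.3, 0.7, 1.1, 2.0, 1.5, 1.0)  # A_s, phi_s, A_f, phi_f, k, k2, m (assumed values)
for t in (0.0, 0.5, 2.0):
    print(t, ode_residuals(t, params))
```

Setting `A_f = 0` in `displacements` gives $x_1 = x_2$ at all times (the symmetric mode), and setting `A_s = 0` gives $x_1 = -x_2$ (the antisymmetric mode).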

Example 5.14 ($n$ coupled oscillators) More generally, consider a system with $n$ masses. All the masses are potentially linked through harmonic oscillators. Let $k_{ij}$ denote the spring constant between masses $i$ and $j$. Newton’s Second Law now says:

\[ m_1\ddot{x}_1 = k_{11}x_1 + k_{12}x_2 + \cdots + k_{1n}x_n \]
\[ \vdots \]
\[ m_n\ddot{x}_n = k_{n1}x_1 + k_{n2}x_2 + \cdots + k_{nn}x_n \]


To solve this system of linear ODEs, recall from Section 0.7 that we may guess $x_j = c_je^{i\omega t}$ for each $j = 1, \dots, n$, where $i = \sqrt{-1}$ and $(c_1, \dots, c_n)$ are constants to be found. Plugging the guess into the system of ODEs, we find the following collection of characteristic equations:

\[ -\omega^2c_1 = \frac{k_{11}}{m_1}c_1 + \frac{k_{12}}{m_1}c_2 + \cdots + \frac{k_{1n}}{m_1}c_n \]
\[ \vdots \]
\[ -\omega^2c_n = \frac{k_{n1}}{m_n}c_1 + \frac{k_{n2}}{m_n}c_2 + \cdots + \frac{k_{nn}}{m_n}c_n \]

or, in matrix notation:

\[ -\omega^2\vec{c} = M\vec{c} \]

where $\vec{c} \equiv (c_1, \dots, c_n)^\top$, and the matrix $M \equiv (M_{ij})$ has entries $M_{ij} = \frac{k_{ij}}{m_i}$. Note the last equation can be written as $(M + \omega^21_n)\vec{c} = \vec{0}$, where $1_n$ is the $n \times n$ identity matrix. Since we are looking for a vector $\vec{c} \neq \vec{0}$, we must solve:

\[ \mathrm{Det}(M + \omega^21_n) = 0 \]

Letting $\lambda \equiv -\omega^2$, this means that we are looking for the $n$ eigenvalues of the matrix $M$ (recall Definition 0.13). This will give us $n$ solutions for $\omega = \sqrt{-\lambda}$ (which are potentially complex), each with a corresponding eigenvector $\vec{c}$. The collection of solutions $(\omega_1, \dots, \omega_n)$ then makes up the resonant frequencies of the system.
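For $n = 2$ the determinant condition is a quadratic in $\lambda$ and can be solved by hand. The sketch below does this for the matrix $M$ implied by the two coupled springs of Example 5.13 (equal masses, $k_1 = k_3 = k$), and recovers $\omega_s = \sqrt{k/m}$ and $\omega_f = \sqrt{(k + 2k_2)/m}$. The function name and the numerical values are assumptions, and the code assumes real, non-positive eigenvalues, which holds for this example.

```python
import math

def resonant_frequencies_2x2(M):
    """Solve Det(M - lambda*I) = 0 for a 2x2 matrix M and return omega = sqrt(-lambda), sorted."""
    (a, b), (c, d) = M
    trace, det = a + d, a * d - b * c
    disc = math.sqrt(trace**2 - 4.0 * det)            # assumes real eigenvalues
    lambdas = ((trace + disc) / 2.0, (trace - disc) / 2.0)
    return sorted(math.sqrt(-lam) for lam in lambdas)  # assumes lambda <= 0

k, k2, m = 2.0, 1.5, 1.0  # illustrative values (assumed)
# Entries M_ij = k_ij / m_i read off from equations (5.16)-(5.17) with k1 = k3 = k.
M = [[-(k + k2) / m, k2 / m],
     [k2 / m, -(k + k2) / m]]
omega_s, omega_f = resonant_frequencies_2x2(M)
print(omega_s, math.sqrt(k / m))
print(omega_f, math.sqrt((k + 2.0 * k2) / m))
```

For larger $n$, a numerical eigenvalue routine from a linear algebra library would play the role of the quadratic formula used here.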

Coupled oscillators therefore may have a large number of resonance frequencies. For example, a string (e.g. a violin string) can be thought of as a system that is composed of a continuum of coupled oscillators. Indeed, when plucked, violin strings will oscillate transversally (i.e. in the $y$ direction) and exhibit a very large number of resonances.23

Musical instruments in general have countably many resonant frequencies, $f_n$ for $n = 1, 2, 3, \dots$, called harmonics. For a given harmonic $f_n$, the vibrating string oscillates about $(n - 1)$ nodes, which are points on the string that remain at rest (i.e. not vibrating) and which are equidistant from each other. For example, the third harmonic has nodes at $\frac{1}{3}L$ and $\frac{2}{3}L$ of the string, where $L$ is the length of the string.

Moreover, the harmonic frequencies are proportional to the harmonic number $n$: $f_n = nf$, where $f$ is called the fundamental harmonic.24

23 In contrast, the example of the spring (Example 5.13) was one of longitudinal oscillations, because the spring oscillates only in the $x$ direction. Of course, nothing prevents us from making the spring oscillate transversally as well. These more general cases were considered in Example 5.14.

24 For example, for musical instruments, $f \approx \frac{v}{2L}$ Hz, where $v = 340$ m/sec is the speed of sound, and $L$ is the length of the instrument. For example, flutes have holes so that, by putting our finger on one hole, we effectively decrease the length $L$ of the instrument and increase the frequency. As a result, the sound pitch becomes higher.

Chapter 6

Heat, Temperature, and

Thermodynamics

Heat is a form of energy transfer between bodies. It is a physical phe-

nomenon whereby energy is transmitted between objects in thermal contact.

When the temperature of an object changes, its so-called thermometric prop-

erties change. For instance:

• Objects tend to expand as they heat up, and shrink as they cool down.

• Contained gases increase in pressure as their temperature rises.

• If a hot object is put in contact with a cold one, the first object cools

off and shrinks, while the second one heats up and expands, due to

the flow of thermal energy between one object and the other.

In this last example, when the process stops, the objects reach a so-called

thermal equilibrium.

Definition 6.1 (Thermal Equilibrium) Two or more objects are in ther-

mal equilibrium if there exists a zero net thermal energy flow between them,

so their temperatures are the same. An object is in thermal equilibrium with

itself if the temperature within it is spatially and temporally uniform.



This chapter studies heat and temperature changes. We start off with un-

derstanding how solids and liquids expand and contract due to heat. Then,

we study how heat can change the pressure in gases. Finally, we will review

the basic laws of thermodynamics.

6.1 Thermal Expansion of Solids and Liquids

Solid and liquid bodies expand or contract when their temperature changes.

This is in fact how temperature scales were developed. Swedish astronomer

Anders Celsius (1701–1744) invented a linear scale whereby the temperature

of 0◦C was attributed to the length of a rod when dipped in melting ice,

whereas a temperature of 100◦C was given when the rod’s length changed

due to it being subject to boiling water. According to his system, the change

in the rod’s length is linear in the temperature change. Dutch-German-

Polish physicist Daniel Gabriel Fahrenheit (1686–1736), who is also credited

for inventing the mercury thermometer, used instead the human body as

the reference for his (also linear) scale, mistakenly attributing 100◦F to the

customary heat of the human body (which is typically closer to 97◦F).1

There is no upper limit to how hot systems can get. But if there exists a system which cannot transfer thermal energy to any other system that it is in thermal contact with, then such a system must be at the lowest possible temperature. This minimal temperature is −273.15°C, or −459.67°F. Inspired by the observation that there is in fact a lowest temperature, the physicist William Thomson, 1st Baron Kelvin (1824–1907) created yet another temperature scale, with 0 K given to the lowest possible temperature. In physics, Kelvin is now the standard scale for temperatures.2

To understand the thermal expansion of solids and liquids formally, con-

sider a rod of length L going through a temperature change of ∆T . Then,

1 The conversion between the two scales is $T_F = \frac{9}{5}T_C + 32$, where $T_F$ (respectively, $T_C$) denotes Fahrenheit (respectively, Celsius) temperature.

2 The increments in Kelvin are the same as those in Celsius, i.e. $T_K = T_C + 273.15$.


the change in length is:

\[ \Delta L = \alpha L\,\Delta T \tag{6.1} \]

where $\alpha$ is the so-called linear expansion coefficient, in units per °C, whose exact value depends on the material. Typically, bodies expand with heat, so $\alpha > 0$.3 For instance, for copper, $\alpha = 17 \times 10^{-6}$/°C; for steel, $\alpha = 12 \times 10^{-6}$/°C. So copper expands with temperature more easily than steel does.4

While the length expands linearly with the temperature change, the volume scales with the cube of the length. To see what this implies, consider for simplicity a perfect cube with sides of equal length, $L$. Its volume is $V = L^3$. Consider changing the temperature by $\Delta T$. Then, the new volume is:

\[
\begin{aligned}
V + \Delta V &= (L + \Delta L)^3 \\
&= V\left(1 + \frac{\Delta L}{L}\right)^3 \\
&\approx V + 3\,\Delta L\,L^2 \\
&= V + 3\alpha V\,\Delta T
\end{aligned}
\]

where the third line uses the Binomial theorem,5 and the fourth line uses equation (6.1). In sum:

∆V = βV∆T (6.2)

where β ≡ 3α. Here, β is the so-called cubical expansion coefficient.

Comparing (6.1) and (6.2), the only difference between the rate of expansion in length and that in volume is the factor of proportionality relative to the temperature change, with this factor being larger (and exactly three times larger in the simple case of perfect cube-like shapes) for volume than for length.

3 A rare exception is water in the range $T \in [0, 4]$°C. Indeed, just before hitting its freezing temperature, cooling down water will cause it to expand instead of contract.

4 For example, if there is a temperature change of $\Delta T = 50$°C between Summer and Winter, rail tracks (made of steel) will expand by $\Delta L = 0.6$ meters per kilometer of track. For this reason, rail tracks typically have small gaps every few meters, allowing the material to expand without there being a risk of bulging. Incidentally, this causes distinctive clicking sounds as the train wheels go over these gaps.

5 That is, if $x \ll 1$, then $(1 + x)^n \approx 1 + nx$.
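The quality of the linearization behind equation (6.2) is easy to check numerically. The sketch below compares the exact volume change of a cube, obtained by cubing the expanded side, with the linearized value $3\alpha V\Delta T$, and also evaluates equation (6.1) for steel rails. The cube size and the 1 km track length are assumed inputs chosen for illustration; the expansion coefficients are the ones quoted in the text.

```python
def linear_expansion(alpha, length, delta_T):
    """Change in length from equation (6.1)."""
    return alpha * length * delta_T

def volume_change_exact(alpha, side, delta_T):
    """Exact volume change of a cube whose side expands according to (6.1)."""
    new_side = side + linear_expansion(alpha, side, delta_T)
    return new_side**3 - side**3

def volume_change_linearized(alpha, side, delta_T):
    """Volume change from equation (6.2) with beta = 3 * alpha."""
    return 3.0 * alpha * side**3 * delta_T

alpha_copper, alpha_steel = 17e-6, 12e-6   # per degree Celsius, as quoted in the text
side, delta_T = 0.1, 50.0                  # a 10 cm cube heated by 50 degrees (assumed)

exact = volume_change_exact(alpha_copper, side, delta_T)
approx = volume_change_linearized(alpha_copper, side, delta_T)
print(f"exact: {exact:.6e} m^3, linearized: {approx:.6e} m^3")
print(f"relative error: {abs(exact - approx) / exact:.2e}")
print(f"1 km of steel rail, 50 degrees: {linear_expansion(alpha_steel, 1000.0, delta_T):.2f} m")
```

The relative error of the linearized formula is of the order of $\alpha\Delta T$ itself, which is why the approximation is so good for everyday temperature changes.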

Example 6.1 (Mercury thermometers) Consider a glass tube that is attached to a spherical chamber containing a volume $V$ of mercury. The tube has radius $r$. The change in volume when the system is heated up by $\Delta T$ degrees is $\Delta V = \beta_{Hg}V\Delta T$, where $\beta_{Hg} \approx 18 \times 10^{-5}$/°C is the cubic expansion coefficient of mercury.6 If the mercury is rising up the tube, then the change in the volume of mercury can also be written as $\Delta V = \pi r^2h$, where $h$ is the height by which the mercury rises. Thus:

\[ h = \frac{\beta_{Hg}V\,\Delta T}{\pi r^2} \]

is the height by which the mercury will rise. For example, if $V = 1$ cm$^3$, $r = 0.1$ mm, and $\Delta T = 1$°C, then $h \approx 5.7$ mm. Thus, for every 5.7 mm rise in the mercury inside the tube, we can be certain that the temperature has gone up by 1°C.
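The rise of the mercury column is sensitive to the tube radius, since $h$ scales with $1/r^2$. The following sketch evaluates the formula for $h$ with all lengths in millimeters; the function name and the choice of units are assumptions. With $V = 1$ cm$^3$ and $\Delta T = 1$°C, a radius of 0.1 mm gives a rise of about 5.7 mm, while a radius of 1 mm gives only about 0.057 mm.

```python
import math

def mercury_rise_mm(volume_mm3, radius_mm, delta_T, beta=18e-5):
    """Height (in mm) by which the mercury rises: h = beta * V * dT / (pi * r^2)."""
    return beta * volume_mm3 * delta_T / (math.pi * radius_mm**2)

volume_mm3 = 1000.0  # 1 cm^3 expressed in mm^3
for radius_mm in (0.1, 1.0):
    h = mercury_rise_mm(volume_mm3, radius_mm, delta_T=1.0)
    print(f"r = {radius_mm} mm -> h = {h:.4f} mm per degree")
```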

6.2 The Ideal Gas Law

We have seen that temperature changes in liquids and solids lead to their

expansion or compression. With gases, temperature changes lead to changes

in pressure (Definition 5.3).

In Sections 5.2 and 5.3, we argued that the fact that liquids are incom-

pressible means that the change in pressure is everywhere the same (see

equation 5.5). Gases, on the other hand, are compressible, so Pascal’s prin-

ciple does not apply.

It is an experimental fact, however, that there is a simple mathematical relationship between the pressure of a gas ($P$, in Pa), its temperature ($T$, in Kelvin), and the volume ($V$, in m$^3$) that it occupies. This relationship, which holds approximately true for most gases, is called the Ideal Gas Law:7

6 Here we are considering that the tube itself does not expand as it is heated. In reality, these tubes are typically made of pyrex, a synthetic glass with a very low expansion coefficient.

Result 6.1 (Ideal Gas Law I) The state of gases can be described, to a good approximation, with the following identity:

\[ PV = nRT \]

where $n$ is the number of moles, and $R \approx 8.3$ J/(mol·K) is the so-called universal gas constant. If the gas in question obeys this law exactly, then the gas is called an ideal gas.8

In the definition, we have used the term “mole”. A mole is a standard unit of measure for the amount of substance, and one mole contains $N_A \equiv 6.02 \times 10^{23}$ particles (the so-called Avogadro’s number). In particular, a mole is defined as the amount of substance that contains as many particles as there are atoms in 12 grams of the isotope carbon-12. That is, one mole of carbon-12 weighs, by definition, 12 grams. According to this measure, one mole of helium then weighs 4 grams, and one mole of molecular oxygen, O$_2$ (the gas form of oxygen), weighs 32 grams.

These weights are the atomic masses (A), which are calculated as the

sum of the number of protons (Z), which are positively charged, and the

number of neutrons (N), with neutral charge, that compose the atom:

A = N + Z

Atoms also contain electrons, with negative charge and in equal number

to protons, but electrons are nearly weightless, so their contribution to the

atomic mass is negligible. The mass of neutrons ($m_n$) and protons ($m_z$) happens to be nearly identical and equal to $1.66 \times 10^{-27}$ kg, so the atomic (or molecular, depending on the case) mass, denoted by $m$, is simply:

\[ m = Nm_n + Zm_z \approx A \times 1.66 \times 10^{-27}\ \text{kg} \]

This means that the weights of atoms and molecules are directly proportional to the number of protons and neutrons they contain, with the factor of proportionality being constant across the elements.

7 In what is coming, we will say for simplicity that gases are collections of molecules, i.e. of groups of two or more atoms. While this is true for some gases (for instance oxygen or carbon dioxide), it is not true for the so-called atomic (or noble) gases, which are composed of individual atoms (these are helium, neon, argon, krypton, xenon, radon, and oganesson).

8 For example, oxygen is very close to being an ideal gas.

Example 6.2 (Gas Volumes) Consider an environment with 1 atm (i.e. $P = 101{,}325$ Pa, or atmospheric pressure) and $n = 1$ mole at room temperature (i.e. $T = 293$ K). By the ideal gas law, the volume of the gas is then:

\[ V = \frac{nRT}{P} = \frac{8.3 \times 293}{101{,}325} \approx 0.024\ \text{m}^3 \]

or about 24 liters. Notice this is independent of the type of gas, whether it is helium, or oxygen, or carbon dioxide, etc.
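The arithmetic of this example can be reproduced in a few lines. The sketch below solves $PV = nRT$ for the volume and converts the result to liters, confirming the figure of about 24 liters. The function name is an assumption, and the gas constant is taken with one more digit than in the text.

```python
R = 8.314  # universal gas constant in J/(mol K)

def ideal_gas_volume(n_moles, temperature_K, pressure_Pa):
    """Volume in cubic meters from the Ideal Gas Law, V = n R T / P."""
    return n_moles * R * temperature_K / pressure_Pa

V = ideal_gas_volume(1.0, 293.0, 101_325.0)
print(f"V = {V:.4f} m^3 = {V * 1000.0:.1f} liters")
```

The type of gas never enters the calculation: only the number of moles, the temperature, and the pressure matter.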

A surprising aspect of the Ideal Gas Law is that the mass of the particles

(atoms or molecules) of the gas does not show up at all in the equation.

What this implies is that gases whose particles are heavier will exhibit lower

particle velocities.

To see this, consider two gases with different particle masses. The two

gases are contained inside a container, and they have the same number of

moles n, the same volume V , the same temperature T , and therefore, by the

Ideal Gas Law, the same pressure P . The molecules of the two gases move

around and bounce off the inner walls of the container, describing elastic

collisions. If the mass of a molecule is $m$, and the (average) speed of the molecule is $v$, the momentum transfer from each collision is (proportional to) $mv$. Since the number of collisions with the walls per second is itself proportional to $v$, the momentum transfer per second is proportional to $mv^2$. Moreover:

\[ mv^2 \propto F \propto P \]

the first sign because force is the rate of momentum transfer, the second one by definition of pressure. However, by the Ideal Gas Law, $P$ is constant in $m$. Therefore, we must conclude that the product $mv^2$ is constant in $m$, for each given temperature.

And this is indeed what one observes experimentally. For example, comparing helium (He) to oxygen (O$_2$), with a mass ratio of 1-to-8, then $m_{He}v_{He}^2 = m_{O_2}v_{O_2}^2$ implies that the average velocity of oxygen particles at a given temperature is $\sqrt{8} \approx 2.83$ times smaller than that of helium particles.9

More generally, therefore, heavier gases exhibit lower particle velocities.
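Since $mv^2$ is the same for both gases at a given temperature, the ratio of speeds depends only on the ratio of masses. The sketch below computes that ratio for helium and molecular oxygen (molar masses 4 and 32) and applies it to the oxygen speed of about 480 m/s quoted in the text's footnote. The function name is an assumption.

```python
import math

def speed_ratio(mass_heavy, mass_light):
    """v_light / v_heavy when m * v^2 is equal for the two gases."""
    return math.sqrt(mass_heavy / mass_light)

ratio = speed_ratio(32.0, 4.0)   # oxygen (O2) versus helium
v_oxygen = 480.0                 # m/s at room temperature, figure quoted in the text
v_helium = v_oxygen * ratio
print(f"ratio = {ratio:.4f}, helium speed = {v_helium:.1f} m/s")
```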

A second, equivalent form of the law is as follows:

Result 6.2 (Ideal Gas Law II) The state of gases can be described, to a good approximation, with the following identity:

\[ PV = NkT \]

where $N$ is the number of molecules in the gas, and $k$ is the so-called Boltzmann constant.

Comparing this law with our first Ideal Gas Law (Result 6.1), we see that $Nk = nR$. Since $n = \frac{N}{N_A}$ (in words, the number of gas molecules is Avogadro’s number times the number of moles), it follows that Boltzmann’s constant is given by:

\[ k = \frac{R}{N_A} \approx 1.38 \times 10^{-23}\ \text{J/K} \]
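The equivalence of the two forms of the law can be checked directly. The sketch below derives Boltzmann's constant from $R$ and $N_A$ and confirms that the per-mole and per-molecule expressions give the same pressure. The function names and the sample state (2 moles at 300 K in 0.05 m$^3$) are assumptions for illustration.

```python
R = 8.314        # universal gas constant, J/(mol K)
N_A = 6.022e23   # Avogadro's number, particles per mole
K_B = R / N_A    # Boltzmann's constant, J/K

def pressure_from_moles(n_moles, temperature_K, volume_m3):
    """P = n R T / V (Result 6.1)."""
    return n_moles * R * temperature_K / volume_m3

def pressure_from_molecules(n_molecules, temperature_K, volume_m3):
    """P = N k T / V (Result 6.2)."""
    return n_molecules * K_B * temperature_K / volume_m3

n_moles = 2.0
p1 = pressure_from_moles(n_moles, 300.0, 0.05)
p2 = pressure_from_molecules(n_moles * N_A, 300.0, 0.05)
print(f"k = {K_B:.4e} J/K, P (moles) = {p1:.1f} Pa, P (molecules) = {p2:.1f} Pa")
```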

Finally, we can also write these laws in terms of density, $\rho$. First, the mass of the gas $m$ can be expressed as the molar mass $M$ (in grams per mole) times the number of moles $n$, so $n = \frac{m}{M}$. Therefore, we can write $PV = nRT$ as $PV = \frac{m}{M}RT$. By the definition of density (Definition 1.20), $\rho = m/V$. Plugging it in, we get:

\[ P = \rho\frac{R}{M}T \]

where $\frac{R}{M}$ is a specific gas constant, which is only a function of the substance. This says that, keeping pressure constant, increasing the temperature of a gas always decreases its density by the same proportion.

9 Indeed, oxygen particles at room temperature move at about 480 m/s, while helium particles at room temperature move at around $480 \times \sqrt{8} \approx 1357.64$ m/s.

6.3 Phase Transitions

Substances can typically exist in three forms (gaseous, solid, and liq-

uid), and change abruptly from one form to another. These changes are

called phase transitions, and they depend on the substance in question, the

temperature, and the pressure. To understand these transitions, we usually

make use of phase diagrams, which depict the state of the substance for

different levels of pressure and temperature.

[Figure 6.1: A typical phase diagram, with temperature on the horizontal axis and pressure on the vertical axis, showing the solid, liquid, and gas regions.]

An example is given in Figure 6.1. In the figure, we see that if we take a

gas at a given temperature and increase its pressure (for example by reducing

the volume), the gas will eventually turn into a solid, if the temperature was

low enough to begin with, or even (though maybe temporarily) into a liquid,

if the temperature was higher. Similarly, if we fix the pressure (at 1 atm, for example), increasing the temperature will usually turn solids into liquids, and then into gases. For low pressures, the liquid phase might be altogether bypassed.

These phase transitions are discontinuous (i.e. abrupt), and they occur

as temperature or pressure surpass certain critical points, depicted in the

figure by the lines describing the different colored areas, and called phase

boundaries. In the figure, the dashed line represents the set of critical boiling

points, where the substance suddenly becomes gaseous if temperature is

increased for a given pressure level. The dotted line represents the set of

critical melting points, where the substance transitions from solid to liquid

under a temperature increase, for a given pressure. The intersection of the

phase boundaries (which is indicated in the figure with a black dot) is called

the triple point, and it marks the temperature and pressure conditions at

which the three different phases can coexist in a stable equilibrium.10

Figure 6.2: The phase diagram for water.

10 For example, for water the triple point is at $T = 273.16$ K (about 0.01°C) and $P = 611.657$ Pa (about 0.006 atm). See Figure 6.2.


6.4 Thermodynamics

In Sections 5.2-5.3, we studied the static and dynamic behavior of fluids,

mostly in their liquid state. We can now extend what we learned for liquids

to other states of substances.

Using Pascal’s principle, we learned that the condition for there to be

a hydrostatic equilibrium in liquids is given by equation (5.5), which we

reproduce here:

\[ \frac{dP}{dy} = -\rho g \]

We argued that, since liquids are incompressible, density ρ is constant

in (hydrostatic) pressure, P . Thus, the equation can be integrated out very

easily (as we did) to get a linear relationship between pressure and position

(Result 5.1): ∆P = ρg∆y. In words, changes in altitude translate linearly

into pressure changes.

For gases, however, the density does depend on pressure. For simplicity, assume an isothermal atmosphere, that is an atmosphere where temperature is everywhere the same (i.e. $T$ is constant in $y$). Suppose our gas has $N$ molecules, each with mass $m$, and it is contained in a volume $V$. The density of the gas is, therefore, $\rho = \frac{Nm}{V}$. By the Ideal Gas Law, $\frac{N}{V} = \frac{P}{kT}$, and therefore $\rho = \frac{Pm}{kT}$.

Plugging this density into equation (5.5) we get that $\frac{dP}{dy} = -\frac{Pm}{kT}g$, or:

\[ \frac{dP}{P} = -\frac{1}{H_0}dy \]

where we have denoted $H_0 \equiv \frac{kT}{mg}$. In words, the relative change in pressure is proportional to the absolute change in altitude, with a constant of proportionality $-1/H_0$.11 Integrating between $P_0$ (pressure at sea level, $y = 0$), and $P_h$ (pressure at some altitude of $h$ meters above sea level), we get:

11 Again, this object is truly constant if temperature is everywhere the same, i.e. if $T$ is constant in $y$. This is not a terrible approximation at low enough altitudes in the troposphere. In this case, if $T = 273$ K (i.e. 0°C), and our gas is air (for which the average molecular mass is approximately 29), then $H_0 \approx 8{,}000$ m.


\[ \int_{P_0}^{P_h}\frac{dP}{P} = -\frac{1}{H_0}\int_0^h dy \]

Thus, $\ln P_h - \ln P_0 = -\frac{h}{H_0}$. Solving:

\[ P_h = P_0e^{-h/H_0} \]

This equation then allows us to compute air pressure at various altitudes

h to a good approximation.12

12 For example: for Mount Everest, $h = 8.9$ km, and we can calculate that pressure is only a third of an atmosphere. At such low pressure, water boils at 72°C as opposed to the 100°C at sea level, where $P_0 = 1$ atm. At $h = 30$ km altitude, pressure is only one forty-fifth of an atmosphere, and water boils at only 20°C!
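The scale height $H_0 = kT/(mg)$ and the resulting pressure profile can be evaluated numerically. The sketch below uses $T = 273$ K and a molecular mass of 29 for air, as in the text; the constant values, the function names, and the use of $g = 9.81$ m/s$^2$ are assumptions. It gives a scale height close to 8,000 m and a pressure at the altitude of Mount Everest of roughly a third of an atmosphere.

```python
import math

K_B = 1.380649e-23   # Boltzmann's constant, J/K
AMU = 1.66e-27       # mass of one proton or neutron, kg (value used in the text)
G = 9.81             # gravitational acceleration, m/s^2 (assumed)

def scale_height(temperature_K, molecular_mass):
    """H0 = k T / (m g), with m given as a multiple of the proton/neutron mass."""
    return K_B * temperature_K / (molecular_mass * AMU * G)

def pressure_at(altitude_m, p0=1.0, H0=8000.0):
    """Isothermal pressure profile P_h = P_0 * exp(-h / H0)."""
    return p0 * math.exp(-altitude_m / H0)

H0 = scale_height(273.0, 29.0)
print(f"H0 = {H0:.0f} m")
for h in (8900.0, 30000.0):
    print(f"h = {h / 1000.0:.1f} km -> P = {pressure_at(h):.4f} atm")
```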

Chapter 7

Lagrangian Mechanics

In the previous chapters, we argued that the main goal of classical me-

chanics is to describe the trajectories (or orbits) of systems from their equa-

tions of motion. For this, we argued, we need to know three objects: (i)

the masses of the particles; (ii) a set of forces (or, by the Potential Energy

principle, the potential energy function V ); and (iii) initial conditions for

position and velocity.

Henceforth, we will tackle this problem from a slightly different perspective: given an initial and a final condition for the system, which trajectory makes the time integral of the difference between kinetic and potential energies stationary (typically, a minimum)? This integral is called the action, and we shall call this principle the Principle of Least Action. The mathematics that operationalize it go by the name of Lagrangian mechanics.

Formulating the problem in this way is convenient for two reasons: (i)

we can use it to obtain the equations of motion for the system, and thus it

encodes all its descriptive properties (e.g. mass and potential energy); (ii) it

encompasses the description of not only classical mechanics, but also most

other major theories in physics (e.g. Maxwell’s theory of electrodynam-

ics, Einstein’s theory of relativity, and the Standard Model of elementary

particles).



7.1 The Euler-Lagrange Equation

We begin with some definitions:

Definition 7.1 (Lagrangian Equation) The Lagrangian equation of a system of $i = 1, \dots, N$ particles with trajectory $\vec{r}$ is:

\[ L(\vec{r}, \dot{\vec{r}}) \equiv T - V(\vec{r}) = \frac{1}{2}\sum_{i=1}^{N}m_i|\dot{\vec{r}}_i|^2 - \sum_{i=1}^{N}V_i(\vec{r}) \tag{7.1} \]

where $T$ is the kinetic energy and $V$ is the potential energy.

Note we write the Lagrangian explicitly as a function of the position

(which influences the potential energy) and the velocity (which influences

the kinetic energy).

Definition 7.2 (Action) The action of a system between two instants of time $t_0$ and $t_1 > t_0$ is the integral of the Lagrangian between these points in time, or:

\[ A \equiv \int_{t_0}^{t_1}L\big(\vec{r}(t), \dot{\vec{r}}(t)\big)\,dt \tag{7.2} \]

Principle 7.1 (Principle of Least Action) The Principle of Least Ac-

tion consists of choosing the trajectory ~r(t) that minimizes A, i.e. for which

the action is stationary to first order.

Because the Principle of Least Action requires a minimization over a

set of trajectories, and trajectories are functions of (continuous) time, the

problem involves the minimization of a function of functions, namely a func-

tional. The optimization of functionals involves a branch of mathematics

called variational calculus. The minimization problem is written as:

δA = 0

to signify that we must find the least among all the possible variations

over the whole trajectory space.



The equation that allows us to find the stationary action is the Euler-

Lagrange equation:

Definition 7.3 (Euler-Lagrange Equation) The Euler-Lagrange equation is a second-order differential equation whose solutions are the functions for which a given functional is stationary. In this case, the functional is (7.2), and the equation reads:

\[ \frac{d}{dt}\frac{\partial L}{\partial\dot{\vec{r}}} - \frac{\partial L}{\partial\vec{r}} = \vec{0} \tag{7.3} \]

where $L$ is the Lagrangian, equation (7.1).

Note that (7.3) is really a system of second-order differential equations, one for each dimension of space and each particle.1 Then, we have the following result:

Result 7.1 The stationary action (i.e. the solution to $\delta A = 0$) is given by the function $\vec{r}(t)$ that solves equation (7.3).

We offer two proofs of the result:

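Proof 1. First, a heuristic proof. Discretize time into the grid $\mathbb{T} \equiv \{\Delta t, 2\Delta t, 3\Delta t, \dots\}$, and write $x_n \equiv x(n\Delta t)$. On the interval between grid points $n$ and $n+1$, approximate $x(t) \approx \frac{x_n + x_{n+1}}{2}$ and $\dot{x}(t) \approx \frac{x_{n+1} - x_n}{\Delta t}$, and similarly for the $y$ and $z$ space dimensions. In vector notation, $\vec{r}(t) \approx \bar{\vec{r}}_n \equiv \left(\frac{x_n + x_{n+1}}{2}, \frac{y_n + y_{n+1}}{2}, \frac{z_n + z_{n+1}}{2}\right)$ and $\dot{\vec{r}}(t) \approx \dot{\bar{\vec{r}}}_n \equiv \left(\frac{x_{n+1} - x_n}{\Delta t}, \frac{y_{n+1} - y_n}{\Delta t}, \frac{z_{n+1} - z_n}{\Delta t}\right)$.

We can approximate the action as follows:

$$A \approx \hat{A} \equiv \sum_{n=1}^{+\infty} L\big(\bar{\vec{r}}_n, \dot{\bar{\vec{r}}}_n\big)\, \Delta t$$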

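1 Recall (from Definition 0.15) that we use the same symbol, ∂, to denote both a partial derivative and a vector of partials (or gradient).

Then, for some $k \in \mathbb{N}$, consider taking the derivative with respect to $x_k$ along some dimension (say, $x$). Only the two terms of the sum that contain $x_k$ contribute. This yields:

$$\frac{\partial \hat{A}}{\partial x_k} = \frac{\partial}{\partial x_k}\left[ L\left(\frac{x_k + x_{k+1}}{2}, \frac{x_{k+1} - x_k}{\Delta t}\right)\Delta t + L\left(\frac{x_{k-1} + x_k}{2}, \frac{x_k - x_{k-1}}{\Delta t}\right)\Delta t \right]$$

$$= \Delta t \left[ -\frac{1}{\Delta t}\left( \frac{\partial L}{\partial \dot{x}}\bigg|_{n=k} - \frac{\partial L}{\partial \dot{x}}\bigg|_{n=k-1} \right) + \frac{1}{2}\left( \frac{\partial L}{\partial x}\bigg|_{n=k} + \frac{\partial L}{\partial x}\bigg|_{n=k-1} \right) \right]$$

where $|_{n}$ denotes evaluation on the interval between grid points $n$ and $n+1$. Dividing by $\Delta t$ and taking the continuous-time limit of the right-hand side yields

$$\frac{\partial A}{\partial x} = -\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} + \frac{\partial L}{\partial x}$$

where $\frac{\partial A}{\partial x}$ is shorthand for the limit of $\frac{1}{\Delta t}\frac{\partial \hat{A}}{\partial x_k}$. Thus, $\frac{\partial A}{\partial x} = 0$ if, and only if, $-\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} + \frac{\partial L}{\partial x} = 0$. Repeating the argument for the $y$ and $z$ coordinates proves the result. ∎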
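The discretization in Proof 1 can be tried numerically. The sketch below is only an illustration under assumed values that do not come from the text: a one-dimensional spring with unit mass and spring constant, a time window of length 1, 200 time steps, and a sine-shaped perturbation that vanishes at the endpoints. It evaluates the midpoint-rule action on the exact trajectory and on perturbed trajectories, and prints the difference.

```python
import math

def discrete_action(xs, dt, m=1.0, k=1.0):
    """Midpoint-rule action for L = m*v**2/2 - k*x**2/2 on a time grid."""
    total = 0.0
    for x0, x1 in zip(xs, xs[1:]):
        x_mid = 0.5 * (x0 + x1)      # position approximated by the midpoint
        v = (x1 - x0) / dt           # velocity approximated by a difference
        total += (0.5 * m * v**2 - 0.5 * k * x_mid**2) * dt
    return total

# Assumed setup (not from the text): m = k = 1, so omega = 1.
n_steps = 200
t_end = 1.0
dt = t_end / n_steps
times = [i * dt for i in range(n_steps + 1)]

# Exact solution of x'' = -x with x(0) = 0 and x(t_end) = sin(t_end).
true_path = [math.sin(t) for t in times]

def perturbed(eps):
    """True path plus eps times a bump that vanishes at both endpoints."""
    return [x + eps * math.sin(math.pi * t / t_end)
            for x, t in zip(true_path, times)]

a_true = discrete_action(true_path, dt)
for eps in (-0.1, -0.01, 0.01, 0.1):
    # The change in the action should scale like eps**2, not like eps.
    print(eps, discrete_action(perturbed(eps), dt) - a_true)
```

In this setup the change in the action is positive for perturbations of either sign and grows roughly a hundredfold when the perturbation grows tenfold, which is what a vanishing first-order variation looks like numerically.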

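Proof 2. The second proof uses a more formal variational argument. Suppose $\vec{r}(t)$ is the true trajectory of the system between two states $\vec{r}_0 = \vec{r}(t_0)$ and $\vec{r}_1 = \vec{r}(t_1)$ at two points in time, $t_0$ and $t_1$. Let $\vec{\varepsilon}(t)$ be a small perturbation that is zero at the endpoints, i.e. $\vec{\varepsilon}(t_0) = \vec{\varepsilon}(t_1) = \vec{0}$. To first order, the change in the action functional, $\delta A$, can be written:

$$\delta A = \int_{t_0}^{t_1} \left[ L\big(\vec{r} + \vec{\varepsilon}, \dot{\vec{r}} + \dot{\vec{\varepsilon}}\big) - L\big(\vec{r}, \dot{\vec{r}}\big) \right] dt = \int_{t_0}^{t_1} \left( \vec{\varepsilon} \cdot \frac{\partial L}{\partial \vec{r}} + \dot{\vec{\varepsilon}} \cdot \frac{\partial L}{\partial \dot{\vec{r}}} \right) dt$$

where the second equality follows from a first-order expansion of the Lagrangian. Using integration by parts on the second term, we find:

$$\delta A = \left( \vec{\varepsilon} \cdot \frac{\partial L}{\partial \dot{\vec{r}}} \right)\bigg|_{t_0}^{t_1} + \int_{t_0}^{t_1} \left( \vec{\varepsilon} \cdot \frac{\partial L}{\partial \vec{r}} - \vec{\varepsilon} \cdot \frac{d}{dt}\frac{\partial L}{\partial \dot{\vec{r}}} \right) dt$$

Since $\vec{\varepsilon}(t_0) = \vec{\varepsilon}(t_1) = \vec{0}$, the first term vanishes, and we have:

$$\delta A = \int_{t_0}^{t_1} \vec{\varepsilon} \cdot \left( \frac{\partial L}{\partial \vec{r}} - \frac{d}{dt}\frac{\partial L}{\partial \dot{\vec{r}}} \right) dt$$

Thus, $\delta A = 0$ for every admissible perturbation $\vec{\varepsilon}$ if, and only if, $\frac{\partial L}{\partial \vec{r}} - \frac{d}{dt}\frac{\partial L}{\partial \dot{\vec{r}}} = \vec{0}$. ∎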


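Remark 7.1 (Obtaining Newton’s Laws) To see the power of the Principle of Least Action, we now demonstrate that one can derive Newton’s Second Law directly from the Euler-Lagrange Equation, (7.3). Using the Lagrangian (7.1), note:

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\vec{r}}} = \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{\vec{r}}_1}, \dots, \frac{\partial L}{\partial \dot{\vec{r}}_N} \right) = \left( m_1 \frac{d}{dt}\dot{\vec{r}}_1, \dots, m_N \frac{d}{dt}\dot{\vec{r}}_N \right)$$

and

$$\frac{\partial L}{\partial \vec{r}} = -\left( \frac{\partial V_1(\vec{r})}{\partial \vec{r}_1}, \dots, \frac{\partial V_N(\vec{r})}{\partial \vec{r}_N} \right) = \left( \vec{F}_1(\vec{r}), \dots, \vec{F}_N(\vec{r}) \right)$$

where the second equality uses the definition of potential energy. Therefore, the Euler-Lagrange equation implies:

$$\vec{F}_i = m_i \frac{d^2 \vec{r}_i}{dt^2}$$

for all $i = 1, \dots, N$. This is exactly Newton’s Second Law of Motion. Equivalently, using the definition of momentum (Definition 1.25), the Euler-Lagrange equation gives:

$$\frac{d\vec{p}_i}{dt} = m_i \frac{d^2 \vec{r}_i}{dt^2} = \vec{F}_i \quad \text{where } \vec{p}_i = m_i \frac{d}{dt}\vec{r}_i$$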

Example 7.1 (Spring Motion Revisited) Consider again the single-particle case of Example 2.10. The Lagrangian is:2

$$L(\vec{r}, \dot{\vec{r}}) = T - V = \frac{1}{2}\left( m|\dot{\vec{r}}|^2 - k|\vec{r}|^2 \right)$$

The Euler-Lagrange equation says $\frac{d}{dt}\frac{\partial L}{\partial \dot{\vec{r}}} - \frac{\partial L}{\partial \vec{r}} = \vec{0}$, so:

$$m\ddot{\vec{r}} = -k\vec{r} \;\Rightarrow\; \ddot{\vec{r}} = -\omega^2 \vec{r}$$

where $\omega = \sqrt{k/m}$. This is exactly what we found using Newton’s Second Law in Equation (2.7).

2 For a derivation of the formula for kinetic and potential energy in this case, see Example 2.10.

7.2 Non-Inertial Reference Frames

Recall that Newton’s Laws work only within the realm of inertial reference frames (Definition 1.12), that is, reference frames in which a body subject to zero net force is not accelerated. An important advantage of using the Lagrangian formulation over Newton’s original Laws is that it allows us to operate easily with other reference frames. This will be very useful when we study non-inertial reference frames, i.e. reference frames that undergo acceleration relative to inertial frames. This will be the case, for instance, when we study Einstein’s relativity.

Example 7.2 (Reference Frames I) Consider two reference frames, A and B. For simplicity, suppose there is only one particle and one dimension, $x$. Observers in A (an inertial reference frame) are standing at rest, and use the coordinate $x$ to locate the particle. Observers in B (a non-inertial reference frame) are subject to acceleration. Let $f(t)$ describe the location of observers in B relative to those in A at time $t$. Thus, observers in B use the coordinate $X = x - f(t)$ at time $t$ to locate the particle. That is, when observers in B locate the particle at $X(t)$, those in A locate it at $x(t) = X(t) + f(t)$.

Observers in A and B write the action as:

$$A_A = \int_{t_0}^{t_1} \left( \frac{1}{2} m \dot{x}(t)^2 - V(x(t)) \right) dt$$

$$A_B = \int_{t_0}^{t_1} \left( \frac{1}{2} m \big( \dot{X}(t) + \dot{f}(t) \big)^2 - V(X(t)) \right) dt$$

respectively. Both observers then find the stationary action by means of the Euler-Lagrange equations. For observers in B, this gives:

$$m\ddot{X}(t) = -\left( \frac{dV(X(t))}{dX(t)} + m\ddot{f}(t) \right)$$

while, for observers in A, $m\ddot{x}(t) = -\frac{dV(x)}{dx}$. Thus, observers in the non-inertial reference frame perceive an additional force of $-m\ddot{f}(t)$ relative to those in frame A as a result of the underlying acceleration $\ddot{f}(t)$ that these observers are undergoing.
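The extra force seen in frame B can be checked with a few lines of arithmetic. The sketch below is a hedged illustration, not part of the original example: it assumes a free particle (V = 0), a uniformly accelerating frame with f(t) = a t²/2, and arbitrary numerical values for the mass, the acceleration, and the initial conditions. It estimates second derivatives by finite differences and compares the mass times acceleration seen by each observer.

```python
def second_derivative(g, t, h=1e-3):
    """Central finite-difference estimate of g''(t)."""
    return (g(t + h) - 2.0 * g(t) + g(t - h)) / h**2

# Assumed numbers (not from the text): mass, frame acceleration, initial data.
m, a = 2.0, 3.0
x0, v0 = 1.0, 0.5

def x(t):
    """Trajectory in the inertial frame A: no force, so no acceleration."""
    return x0 + v0 * t

def f(t):
    """Position of frame B's origin relative to frame A."""
    return 0.5 * a * t**2

def X(t):
    """The same trajectory expressed in frame B's coordinate, X = x - f(t)."""
    return x(t) - f(t)

t = 1.7
force_in_A = m * second_derivative(x, t)   # close to zero
force_in_B = m * second_derivative(X, t)   # close to -m * a
print(force_in_A, force_in_B, -m * second_derivative(f, t))
```

Under these assumptions the observer in B measures a force close to −m·a even though the observer in A measures none, matching the $-m\ddot{f}(t)$ term in the equation of motion for frame B.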

Example 7.3 (Reference Frames II) Consider a single particle, now in two dimensions. Consider again an inertial reference frame A, using coordinates $(x, y)$, and a non-inertial reference frame B, using coordinates $(X, Y)$. Now, observers in B are rotating, in uniform circular motion with angular velocity $\omega$, relative to observers in A, who are at rest. (For instance, observers in B are on a carousel, while observers in A are not.) Time dependence is omitted for brevity.

Using what we know about uniform circular motion, we can then relate

the two coordinate systems as follows:

x = X cos(ωt) + Y sin(ωt) (7.4a)

y = −X sin(ωt) + Y cos(ωt) (7.4b)

For simplicity, suppose that observers in A observe that the particle moves with no forces acting on it (no potential energy). Thus, their Lagrangian is simply given by kinetic energy:

$$L_A = \frac{1}{2} m \left( \dot{x}^2 + \dot{y}^2 \right)$$

Thus, the stationary action for these observers solves the Euler-Lagrange equation:

$$m\ddot{\vec{r}} = \vec{0}$$

In sum, according to these observers, the particle possesses no acceleration. What about observers in the non-inertial frame? Differentiating (7.4a)-(7.4b) gives:

$$\dot{x} = \dot{X}\cos(\omega t) - \omega X \sin(\omega t) + \dot{Y}\sin(\omega t) + \omega Y \cos(\omega t)$$

$$\dot{y} = -\dot{X}\sin(\omega t) - \omega X \cos(\omega t) + \dot{Y}\cos(\omega t) - \omega Y \sin(\omega t)$$

Some algebra then shows:

$$\dot{x}^2 + \dot{y}^2 = \dot{X}^2 + \dot{Y}^2 + \omega^2 (X^2 + Y^2) + 2\omega (\dot{X} Y - \dot{Y} X)$$

Therefore, the Lagrangian for observers in frame B is:

$$L_B = \underbrace{\frac{1}{2} m \left( \dot{X}^2 + \dot{Y}^2 \right)}_{(i)} - \underbrace{\left[ -\frac{m\omega^2}{2} \left( X^2 + Y^2 \right) \right]}_{(ii)} + \underbrace{m\omega \left( \dot{X} Y - \dot{Y} X \right)}_{(iii)}$$

We can identify three terms in the Lagrangian:

(i) This term is just the kinetic energy according to the observers in B.

(ii) This term is the potential energy according to observers in B:

$$V(X, Y) = -\frac{m\omega^2}{2} \left( X^2 + Y^2 \right)$$

That is, even though observers at rest see the particle as having no force, those in rotational motion relative to them perceive a potential energy on the particle. Compared with Example 2.10, the potential energy has the form of a spring potential with a parameter of magnitude $|k| = m\omega^2$ but with the opposite sign, so that $\omega = \sqrt{|k|/m}$ is the angular frequency of the rotation. Using the Potential Energy Principle, we have:

$$\vec{F}(\vec{r}) = m\omega^2 \vec{r}$$

where $\vec{r} = (X, Y)$ is the position measured in frame B. Since this force points radially outward, it works as a centrifugal force.3

3 A centrifugal force is a centripetal force (recall Example 1.2) but in the opposite direction.


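(iii) This term is less familiar. It gives rise to the Coriolis force, and it depends not only on the position of the particle but also on its velocity.

Finally, we can work out the Euler-Lagrange equation to find the stationary action for observers in reference frame B. This gives:

$$m \begin{pmatrix} \ddot{X} \\ \ddot{Y} \end{pmatrix} = m\omega^2 \begin{pmatrix} X \\ Y \end{pmatrix} + \begin{pmatrix} 0 & -2m\omega \\ 2m\omega & 0 \end{pmatrix} \begin{pmatrix} \dot{X} \\ \dot{Y} \end{pmatrix}$$

that is, $m\ddot{X} = m\omega^2 X - 2m\omega\dot{Y}$ and $m\ddot{Y} = m\omega^2 Y + 2m\omega\dot{X}$. Again, this shows that, even though the particle has no acceleration from the perspective of observers in the inertial reference frame A, from those in the non-inertial reference frame B it is perceived as being subject to both centrifugal and Coriolis forces.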
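The frame-B equations of motion can be verified numerically for a particle that moves freely in frame A. The sketch below is an illustration under assumed values (the angular velocity, initial position, and velocity are arbitrary, and the helper names are hypothetical). It inverts (7.4a)-(7.4b) to get X(t) and Y(t), estimates derivatives by finite differences, and evaluates the residuals of $\ddot{X} = \omega^2 X - 2\omega\dot{Y}$ and $\ddot{Y} = \omega^2 Y + 2\omega\dot{X}$.

```python
import math

omega = 0.7                              # assumed angular velocity of frame B
x0, y0, vx, vy = 1.0, -0.5, 0.3, 0.8     # assumed free motion seen from frame A

def X(t):
    """Frame-B coordinate X, obtained by inverting (7.4a)-(7.4b)."""
    x, y = x0 + vx * t, y0 + vy * t
    return x * math.cos(omega * t) - y * math.sin(omega * t)

def Y(t):
    """Frame-B coordinate Y, obtained by inverting (7.4a)-(7.4b)."""
    x, y = x0 + vx * t, y0 + vy * t
    return x * math.sin(omega * t) + y * math.cos(omega * t)

def d1(g, t, h=1e-3):
    """Central finite-difference estimate of g'(t)."""
    return (g(t + h) - g(t - h)) / (2.0 * h)

def d2(g, t, h=1e-3):
    """Central finite-difference estimate of g''(t)."""
    return (g(t + h) - 2.0 * g(t) + g(t - h)) / h**2

def residuals(t):
    """Left minus right side of the frame-B equations, per unit mass."""
    res_X = d2(X, t) - (omega**2 * X(t) - 2.0 * omega * d1(Y, t))
    res_Y = d2(Y, t) - (omega**2 * Y(t) + 2.0 * omega * d1(X, t))
    return res_X, res_Y

for t in (0.0, 1.0, 2.5):
    print(t, residuals(t))
```

Both residuals should be zero up to finite-difference error, which confirms that the Coriolis term couples $\ddot{X}$ to $\dot{Y}$ and $\ddot{Y}$ to $\dot{X}$ with opposite signs.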

7.3 Generalized Coordinates

Not only can we accommodate different reference frames, but we can also use the Principle of Least Action to describe the laws of physics in generalized coordinates, that is, in coordinates that need not be Cartesian. This will be useful when we describe systems that move in non-Euclidean spaces, such as spherical surfaces.

Let us thus see how to set up the equations of classical mechanics in a general way that applies to any coordinate system. For a system of particles $i = 1, \dots, N$, the set of generalized coordinates is denoted:

$$\mathbf{q}(t) \equiv \big( q_1(t), \dots, q_N(t) \big)$$

For instance:

• In a Cartesian coordinate system with three spatial dimensions, $\mathbf{q}(t) = \big\{ \big( x_i(t), y_i(t), z_i(t) \big) \big\}_{i=1}^{N}$, where $(x_i, y_i, z_i)$ are the Cartesian coordinates of particle $i$.



• In a polar coordinate system (Definition 1.11), q(t) = (R(t),θ(t)),

where R = (R1, . . . , RN ) are the radii (or distances from the pole),

and θ = (θ1, . . . , θN ) are angles from the polar direction.

More generally, q is a point in the configuration space of the system,

whatever this may be.4

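The position of each particle, $\mathbf{r}_i$, is potentially a function of all the generalized coordinates and of time:

$$\mathbf{r}_i = \mathbf{r}_i(\mathbf{q}(t), t)$$

Finally, the generalized velocities are then defined by:

$$\dot{\mathbf{q}} \equiv \frac{d}{dt}\mathbf{q}(t) = \big( \dot{q}_1(t), \dots, \dot{q}_N(t) \big)$$

The velocity vector of the $i$-th particle, $\mathbf{v}_i$, is then:

$$\mathbf{v}_i \equiv \dot{\mathbf{r}}_i \equiv \frac{d}{dt}\mathbf{r}_i = \sum_{j=1}^{N} \frac{\partial \mathbf{r}_i(\mathbf{q}(t), t)}{\partial q_j} \dot{q}_j + \frac{\partial \mathbf{r}_i(\mathbf{q}(t), t)}{\partial t}$$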

Though the equations of motion in a generalized system may be quite

complicated, the Principle of Least Action always applies. Conveniently, the

system (whether a wave, a field, or otherwise) can always be characterized

by a Lagrangian, which summarizes the equations of motion.

Therefore, we must look for the trajectory for which:

$$\delta A = \delta \int_{t_0}^{t_1} L(\mathbf{q}, \dot{\mathbf{q}}, t)\, dt = 0$$

Again, the stationary action solves the Euler-Lagrange equation. Written in generalized coordinates:

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\mathbf{q}}} = \frac{\partial L}{\partial \mathbf{q}}$$

where $L$ is short for $L(\mathbf{q}, \dot{\mathbf{q}}, t)$. This Euler-Lagrange equation encompasses all of classical physics in its most general form. For a given initial condition $(\mathbf{q}, \dot{\mathbf{q}}, t)$, it describes the motion of any particle within the isolated system.

4 Note we do not use the vector notation $\vec{q}$ to emphasize that, in a generalized coordinate system, the position of a particle may not be a vector in the Euclidean sense of the term.

Often, this equation is expressed in terms of momentum. First, we define

momentum in generalized coordinates:

Definition 7.4 (Generalized Momentum Conjugate) The generalized momentum conjugate to $\mathbf{q}$ is defined by:

$$\mathbf{p} \equiv \frac{\partial L}{\partial \dot{\mathbf{q}}}$$

In Cartesian coordinates, where we denote position by $\vec{r}$, it is clear that $\frac{\partial L}{\partial \dot{\vec{r}}_i} = \vec{p}_i$ for all particles $i = 1, \dots, N$ (recall Remark 7.1), where $\vec{p}_i \equiv m_i \dot{\vec{r}}_i$ is momentum. The previous definition simply extends this idea to generalized coordinates.

Using this definition, therefore, the Euler-Lagrange equation for generalized coordinate systems is often expressed as follows:

$$\frac{d}{dt}\mathbf{p} = \frac{\partial L}{\partial \mathbf{q}}$$

Let’s see some examples in the context of non-Cartesian coordinate systems:

Example 7.4 (Polar Coordinates) Consider a polar coordinate system. Recall (Definition 1.11) that particle location in such a system is characterized by a distance $R$ to the pole (the radius), and an angle $\theta$ from the polar direction. Thus, our set of (generalized) coordinates is:

$$\mathbf{q}(t) = (\mathbf{R}(t), \boldsymbol{\theta}(t))$$

where $\mathbf{R} = (R_1, \dots, R_N)$ and $\boldsymbol{\theta} = (\theta_1, \dots, \theta_N)$ are the radii and angles for each particle relative to the pole. For simplicity, consider $N = 1$ (extending it to $N \geq 2$ is straightforward). Moreover, assume that potential energy is zero.


First, let’s work out the Lagrangian. To transform Cartesian coordinates $(x, y)$, i.e. linear motion, into polar coordinates $(R, \theta)$, i.e. rotational motion, we use:

$$x = R\cos\theta$$

$$y = R\sin\theta$$

Differentiating with respect to time, we get:

$$\dot{x} = \dot{R}\cos\theta - R\dot{\theta}\sin\theta$$

$$\dot{y} = \dot{R}\sin\theta + R\dot{\theta}\cos\theta$$

Some algebra shows: $\dot{x}^2 + \dot{y}^2 = \dot{R}^2 + R^2\dot{\theta}^2$. Thus, the Lagrangian is:

$$L(\mathbf{q}, \dot{\mathbf{q}}) = \frac{1}{2} m \left( \dot{R}^2 + R^2\dot{\theta}^2 \right) \tag{7.5}$$

where recall $\mathbf{q} \equiv (R, \theta)$. Let’s now compute the generalized momenta. The generalized momentum conjugate to $R$, denoted here by $p_R$, is:

$$p_R \equiv \frac{\partial L}{\partial \dot{R}} = m\dot{R}$$

Thus, using the Euler-Lagrange equation, the equation of motion on the $R$ coordinate is $\frac{d}{dt}p_R = \frac{\partial L}{\partial R} = mR\dot{\theta}^2$. Equivalently, using $p_R = m\dot{R}$ (by definition), we have $m\ddot{R} = mR\dot{\theta}^2$, or simply:

$$\ddot{R} = R\dot{\theta}^2$$

Interestingly, the resulting equation of motion on the $R$ coordinate makes no mention of the object’s mass or momentum.

Similarly, the generalized momentum conjugate to $\theta$, denoted here by $p_\theta$, is:


$$p_\theta \equiv \frac{\partial L}{\partial \dot{\theta}} = mR^2\dot{\theta}$$

What this gives us is the momentum (mass times velocity) along the angular coordinate. Accordingly, we often call this the system’s angular momentum (recall Definition 3.4). The equation of motion on the $\theta$ coordinate comes from the Euler-Lagrange equation:

$$\frac{d}{dt}p_\theta = \frac{\partial L}{\partial \theta} = 0$$

Therefore, we have found, using the Euler-Lagrange equation, that angular momentum is conserved in this system (recall Result 3.4). For example, a pegtop remains upright when spinning because its angular momentum is being conserved, until friction eventually slows it down and it topples over.

Another implication of the conservation of angular momentum is that the angular velocity of the particle is higher as the particle gets closer to the origin. To see this, use $p_\theta = mR^2\dot{\theta}$ to write conservation of angular momentum as $\frac{d}{dt}\left( R^2\dot{\theta} \right) = 2R\dot{R}\dot{\theta} + R^2\ddot{\theta} = 0$, or:

$$\ddot{\theta} = -\frac{2\dot{R}}{R}\dot{\theta}$$

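Therefore, smaller $R$ implies that $\dot{\theta}$ must be higher for angular momentum to be conserved. To sum up our results, the generalized momentum conjugate to $\mathbf{q} = (R, \theta)$ is:

$$\mathbf{p} \equiv (p_R, p_\theta) = \frac{\partial L}{\partial \dot{\mathbf{q}}} = \left( \frac{\partial L}{\partial \dot{R}}, \frac{\partial L}{\partial \dot{\theta}} \right) = \left( m\dot{R}, mR^2\dot{\theta} \right)$$

and the equations of motion read $\dot{\mathbf{p}} \equiv (\dot{p}_R, \dot{p}_\theta) = (mR\dot{\theta}^2, 0)$ or, written in terms of coordinates:

$$\ddot{\mathbf{q}} = \left( \ddot{R}, \ddot{\theta} \right) = \left( R\dot{\theta}^2, -\frac{2\dot{R}}{R}\dot{\theta} \right)$$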
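The polar equations of motion can be integrated numerically to see the conservation of angular momentum at work. The sketch below is a minimal illustration, assuming unit mass, a fourth-order Runge-Kutta integrator (a choice made here, not one taken from the text), and a particle that starts at (x, y) = (1, 0) with velocity (0, 1). Since no forces act, the particle should move along the straight line x = 1 while $p_\theta = mR^2\dot{\theta}$ stays constant.

```python
import math

def rk4_step(deriv, state, dt):
    """One classical fourth-order Runge-Kutta step for state' = deriv(state)."""
    k1 = deriv(state)
    k2 = deriv([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = deriv([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = deriv([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def polar_deriv(state):
    """Equations of motion: R'' = R*theta'**2, theta'' = -2*R'*theta'/R."""
    R, theta, R_dot, theta_dot = state
    return [R_dot, theta_dot, R * theta_dot**2, -2.0 * R_dot * theta_dot / R]

m = 1.0                                  # assumed mass
state = [1.0, 0.0, 0.0, 1.0]             # R, theta, R_dot, theta_dot at t = 0
dt, n_steps = 1e-3, 2000                 # integrate up to t = 2

p_theta_start = m * state[0]**2 * state[3]
for _ in range(n_steps):
    state = rk4_step(polar_deriv, state, dt)

R, theta, R_dot, theta_dot = state
p_theta_end = m * R**2 * theta_dot
x_end, y_end = R * math.cos(theta), R * math.sin(theta)
print(p_theta_start, p_theta_end, x_end, y_end, theta_dot)
```

In this run the radius grows over time, so the angular velocity falls below its initial value, while the angular momentum is unchanged up to integration error.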

In this example, both the $R$ coordinate and its velocity $\dot{R}$ appeared in the Lagrangian (equation 7.5). However, only the angular velocity $\dot{\theta}$, and not the $\theta$ coordinate itself, showed up. As a result, the partial of the Lagrangian with respect to $\theta$ was zero, meaning that the momentum along this coordinate is conserved (Result 3.4).

More generally, when a coordinate has the property that shifting its

value does not change the Lagrangian, we call it a cyclic coordinate.

Definition 7.5 (Cyclic coordinates) For a generalized coordinate system $\mathbf{q}$, a coordinate $q \in \mathbf{q}$ is called cyclic if:

$$\frac{\partial L}{\partial q} = 0$$

A direct implication (and alternative definition) is, therefore:

Result 7.2 (Conservation of momentum) The momentum conjugate of cyclic coordinates is always conserved. Indeed, using the Euler-Lagrange equation, if $q \in \mathbf{q}$ is a cyclic coordinate, then:

$$\frac{d}{dt}p_q \equiv \frac{d}{dt}\frac{\partial L}{\partial \dot{q}} = \frac{\partial L}{\partial q} = 0$$

Example 7.5 (Cartesian coordinates, zero forces) The standard example of a cyclic coordinate is the angle $\theta$ in polar coordinates, whose conjugate momentum is the angular momentum (previous example). Let’s now see an example of conservation of linear momentum.

Again, consider a single particle, now in coordinates $(x, y, z)$. Assuming that potential energy is zero (all forces are zero), the Lagrangian is:

$$L = \frac{1}{2} m \left( \dot{x}^2 + \dot{y}^2 + \dot{z}^2 \right)$$

Therefore, $\frac{\partial L}{\partial x} = \frac{\partial L}{\partial y} = \frac{\partial L}{\partial z} = 0$, so all coordinates of the system are cyclic, and all components of momentum are conserved. Of course, this would not be true if potential energy were not zero.

Example 7.6 (Cartesian coordinates, non-zero forces) Let’s thus reintroduce potential energy. Consider two particles $i = 1, 2$ moving in a one-dimensional Cartesian coordinate system (a line), with coordinate $x$. Say $m_1 = m_2 = m$ for simplicity, and suppose that the potential energy depends on the distance $x_1 - x_2$ between them (where $x_i$ is the position of particle $i$). The Lagrangian for this case is:

$$L = \frac{m}{2}\left( \dot{x}_1^2 + \dot{x}_2^2 \right) - V(x_1 - x_2)$$

Since $\frac{\partial L}{\partial x_1}, \frac{\partial L}{\partial x_2} \neq 0$, $x$ is non-cyclic for both particles, and momentum is not conserved for either particle.

Remark 7.2 (Introduction to Symmetries) Importantly, the previous example can be shown to conserve momentum if we introduce a slight change of coordinates. This will introduce the topic of the next section: symmetries. Define new coordinates $(x_+, x_-)$ by:

$$x_+ = \frac{x_1 + x_2}{2}, \qquad x_- = \frac{x_1 - x_2}{2}$$

Some simple algebra steps show that the kinetic energy in the new coordinate system is $T = m(\dot{x}_+^2 + \dot{x}_-^2)$. The Lagrangian is:

$$L = m\left( \dot{x}_+^2 + \dot{x}_-^2 \right) - V(2x_-)$$

Importantly, potential energy (and thus the Lagrangian) is only a function of $x_-$, and not $x_+$. Hence, $x_+$ is a cyclic coordinate, and the conjugate momentum to $x_+$ (call it $p_+$) is conserved, i.e. $\dot{p}_+ = 0$. Computing $p_+$ by Definition 7.4, we have $p_+ = \frac{\partial L}{\partial \dot{x}_+} = 2m\dot{x}_+$. Using $\dot{x}_+ = \frac{\dot{x}_1 + \dot{x}_2}{2}$, we obtain:

$$p_+ = m(\dot{x}_1 + \dot{x}_2)$$

That is, $p_+$ is just the total momentum (i.e. the sum of momenta of the system). Thus, though neither $x_1$ nor $x_2$ is a cyclic coordinate, total momentum is conserved.
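A short simulation makes the contrast between individual and total momentum concrete. The sketch below is an illustration under assumptions that are not in the text: the potential is taken to be a spring, V(x₁ − x₂) = k(x₁ − x₂)²/2, the integrator is a simple semi-implicit Euler scheme, and all numerical values are arbitrary.

```python
def simulate(x1, x2, v1, v2, m=1.0, k=4.0, dt=1e-3, n_steps=1000):
    """Integrate two equal masses coupled by V = k*(x1 - x2)**2 / 2.

    Returns the list of (p1, p2) momenta after each step.
    """
    history = []
    for _ in range(n_steps):
        force_on_1 = -k * (x1 - x2)   # -dV/dx1
        force_on_2 = +k * (x1 - x2)   # -dV/dx2, equal and opposite
        v1 += force_on_1 / m * dt     # update velocities first...
        v2 += force_on_2 / m * dt
        x1 += v1 * dt                 # ...then positions (semi-implicit Euler)
        x2 += v2 * dt
        history.append((m * v1, m * v2))
    return history

# Assumed initial data: particle 1 at rest at x = 1, particle 2 moving at x = 0.
momenta = simulate(x1=1.0, x2=0.0, v1=0.0, v2=0.5)
p1_first, p2_first = momenta[0]
p1_last, p2_last = momenta[-1]
print("p1:", p1_first, "->", p1_last)
print("p2:", p2_first, "->", p2_last)
print("p1 + p2:", p1_first + p2_first, "->", p1_last + p2_last)
```

Each particle's momentum changes as the spring stretches and compresses, but because the two forces are equal and opposite, the sum $p_1 + p_2 = p_+$ stays at its initial value.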

Example 7.7 (Double Pendulum) This example clearly shows the power of the Lagrangian method in non-Cartesian coordinates. In Example 1.10 we explored the motion of a simple pendulum, that is, a harmonic oscillator with a single massive body swinging from a massless string. Now we explore the double pendulum, that is, a simple pendulum that is attached to the end of the body of another simple pendulum (see Figure 7.1).

Figure 7.1: The double pendulum.

Let us start with Cartesian coordinates $(x, y)$, with $x$ (resp. $y$) being the horizontal (resp. vertical) direction, and for simplicity suppose that the string of the upper pendulum is attached to the origin $(x, y) = (0, 0)$. Indexing each pendulum by $i = 1, 2$, where $i = 1$ is the upper one, let $\theta_i$ be the angle of the string about the vertical axis, $l_i$ be the length of the string, and $m_i$ be the mass of the object.

We start by decomposing the tension vector into its $x$ and $y$ components. For the upper pendulum, the position of the body in Cartesian coordinates can be written:

$$x_1 = l_1 \sin\theta_1$$

$$y_1 = -l_1 \cos\theta_1$$

Similarly, the position of the second pendulum’s body is:

$$x_2 = x_1 + l_2 \sin\theta_2$$

$$y_2 = y_1 - l_2 \cos\theta_2$$

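By looking at the components of tension, we have now moved to polar coordinates. Thus, henceforth we work in the generalized coordinate system $\mathbf{q} = (l, \theta)$, where $l$ is the radius (length of the pendulum) and $\theta$ is the angle. Since the lengths $l_1$ and $l_2$ are fixed, only the angles $\theta_1$ and $\theta_2$ vary over time.

Differentiating with respect to time, we obtain the components of the velocity vector for each of the two objects:

$$\vec{v}_1 = \begin{pmatrix} \dot{x}_1 \\ \dot{y}_1 \end{pmatrix} = \begin{pmatrix} l_1\dot{\theta}_1\cos\theta_1 \\ l_1\dot{\theta}_1\sin\theta_1 \end{pmatrix}$$

$$\vec{v}_2 = \begin{pmatrix} \dot{x}_2 \\ \dot{y}_2 \end{pmatrix} = \begin{pmatrix} l_1\dot{\theta}_1\cos\theta_1 + l_2\dot{\theta}_2\cos\theta_2 \\ l_1\dot{\theta}_1\sin\theta_1 + l_2\dot{\theta}_2\sin\theta_2 \end{pmatrix}$$

where $\dot{\theta}_i$ is thus the angular velocity of the $i$-th body.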

We now want to describe the equations of motion of the system. One way to do this is to use Newton’s Laws directly (as we did for the simple pendulum, Example 1.10). This would require tracking the position, velocity and acceleration of the particles in the Cartesian system $(x, y)$.

A much faster and more convenient way is to use the Lagrangian method in the polar coordinate system, $(l, \theta)$. Let’s then write down the Lagrangian. First, kinetic energy is:

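$$T = \frac{1}{2}m_1 v_1^2 + \frac{1}{2}m_2 v_2^2 = \frac{1}{2}m_1\left( \dot{x}_1^2 + \dot{y}_1^2 \right) + \frac{1}{2}m_2\left( \dot{x}_2^2 + \dot{y}_2^2 \right)$$

$$= \frac{1}{2}m_1 l_1^2\dot{\theta}_1^2 + \frac{1}{2}m_2\left( l_1^2\dot{\theta}_1^2 + l_2^2\dot{\theta}_2^2 + 2 l_1 l_2 \dot{\theta}_1\dot{\theta}_2\cos(\theta_1 - \theta_2) \right)$$

where we have used equation (0.3). To know the potential energy $V$, we make use of the Potential Energy Principle to back out $V$ from the forces at work. The gravitational force acting on each body is $m_i g$, for pendulum $i = 1, 2$. Thus, $V_i = \int m_i g\, dy$, so:

$$V = m_1 g y_1 + m_2 g y_2 = -m_1 g l_1\cos\theta_1 - m_2 g\left( l_1\cos\theta_1 + l_2\cos\theta_2 \right) = -(m_1 + m_2) g l_1\cos\theta_1 - m_2 g l_2\cos\theta_2$$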

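Therefore, the Lagrangian is:

$$L = T - V = \frac{1}{2}(m_1 + m_2) l_1^2\dot{\theta}_1^2 + \frac{1}{2}m_2 l_2^2\dot{\theta}_2^2 + m_2 l_1 l_2\dot{\theta}_1\dot{\theta}_2\cos(\theta_1 - \theta_2) + (m_1 + m_2) g l_1\cos\theta_1 + m_2 g l_2\cos\theta_2$$

The momentum conjugate to the $\theta_i$ coordinate for each $i = 1, 2$, denoted $p_{\theta_i}$, is:

$$p_{\theta_1} \equiv \frac{\partial L}{\partial \dot{\theta}_1} = (m_1 + m_2) l_1^2\dot{\theta}_1 + m_2 l_1 l_2\dot{\theta}_2\cos(\theta_1 - \theta_2)$$

$$p_{\theta_2} \equiv \frac{\partial L}{\partial \dot{\theta}_2} = m_2 l_2^2\dot{\theta}_2 + m_2 l_1 l_2\dot{\theta}_1\cos(\theta_1 - \theta_2)$$

respectively. The equations of motion of the system can be obtained from the Euler-Lagrange equations:

$$\frac{d}{dt}p_{\theta_i} = \frac{\partial L}{\partial \theta_i}, \quad \text{for each } i = 1, 2 \tag{7.6}$$

Let’s compute each component. First, using $(p_{\theta_1}, p_{\theta_2})$ from above, we have: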


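$$\frac{d}{dt}p_{\theta_1} = (m_1 + m_2) l_1^2\ddot{\theta}_1 + m_2 l_1 l_2\ddot{\theta}_2\cos(\theta_1 - \theta_2) - m_2 l_1 l_2\dot{\theta}_1\dot{\theta}_2\sin(\theta_1 - \theta_2) + m_2 l_1 l_2\dot{\theta}_2^2\sin(\theta_1 - \theta_2)$$

$$\frac{d}{dt}p_{\theta_2} = m_2 l_2^2\ddot{\theta}_2 + m_2 l_1 l_2\ddot{\theta}_1\cos(\theta_1 - \theta_2) - m_2 l_1 l_2\dot{\theta}_1^2\sin(\theta_1 - \theta_2) + m_2 l_1 l_2\dot{\theta}_1\dot{\theta}_2\sin(\theta_1 - \theta_2)$$

On the other hand, note:

$$\frac{\partial L}{\partial \theta_1} = -m_2 l_1 l_2\dot{\theta}_1\dot{\theta}_2\sin(\theta_1 - \theta_2) - (m_1 + m_2) g l_1\sin\theta_1$$

$$\frac{\partial L}{\partial \theta_2} = m_2 l_1 l_2\dot{\theta}_1\dot{\theta}_2\sin(\theta_1 - \theta_2) - m_2 g l_2\sin\theta_2$$

Then, some algebra shows that (7.6) simplifies to:

$$0 = (m_1 + m_2) l_1\ddot{\theta}_1 + m_2 l_2\ddot{\theta}_2\cos(\theta_1 - \theta_2) + m_2 l_2\dot{\theta}_2^2\sin(\theta_1 - \theta_2) + (m_1 + m_2) g\sin\theta_1 \tag{7.7a}$$

$$0 = l_2\ddot{\theta}_2 + l_1\ddot{\theta}_1\cos(\theta_1 - \theta_2) - l_1\dot{\theta}_1^2\sin(\theta_1 - \theta_2) + g\sin\theta_2 \tag{7.7b}$$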

This is a system of coupled second-order non-linear ODEs describing the angular acceleration of each pendulum as functions of the angular velocities and the angles themselves. For general amplitudes these equations cannot be solved in closed form (the Small-Angle Approximation method, Definition 1.16, yields a tractable linear system, but only for small oscillations), so we will need a numerical solver such as Mathematica.
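As an alternative to a packaged solver, equations (7.7a)-(7.7b) can be integrated with a few lines of code. The sketch below is one possible approach, not the method used in the text: it treats (7.7a)-(7.7b) as a 2×2 linear system in the angular accelerations, solves it by Cramer's rule, and advances the state with a fourth-order Runge-Kutta step. The value of g, the unit masses and lengths, the step size, and the initial angles are all assumptions, and the function names are hypothetical. Conservation of the total energy T + V serves as a sanity check.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def accelerations(th1, th2, w1, w2, m1, m2, l1, l2):
    """Solve the linear system (7.7a)-(7.7b) for the two angular accelerations."""
    delta = th1 - th2
    a11, a12 = (m1 + m2) * l1, m2 * l2 * math.cos(delta)
    a21, a22 = l1 * math.cos(delta), l2
    b1 = -m2 * l2 * w2**2 * math.sin(delta) - (m1 + m2) * G * math.sin(th1)
    b2 = l1 * w1**2 * math.sin(delta) - G * math.sin(th2)
    det = a11 * a22 - a12 * a21      # equals l1*l2*(m1 + m2*sin(delta)**2) > 0
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

def energy(state, m1, m2, l1, l2):
    """Total energy T + V for state = [theta1, theta2, omega1, omega2]."""
    th1, th2, w1, w2 = state
    kinetic = (0.5 * (m1 + m2) * l1**2 * w1**2 + 0.5 * m2 * l2**2 * w2**2
               + m2 * l1 * l2 * w1 * w2 * math.cos(th1 - th2))
    potential = -(m1 + m2) * G * l1 * math.cos(th1) - m2 * G * l2 * math.cos(th2)
    return kinetic + potential

def simulate(state, t_end, dt=1e-3, m1=1.0, m2=1.0, l1=1.0, l2=1.0):
    """Integrate the double pendulum with fixed-step fourth-order Runge-Kutta."""
    def deriv(s):
        a1, a2 = accelerations(s[0], s[1], s[2], s[3], m1, m2, l1, l2)
        return [s[2], s[3], a1, a2]
    for _ in range(round(t_end / dt)):
        k1 = deriv(state)
        k2 = deriv([s + 0.5 * dt * k for s, k in zip(state, k1)])
        k3 = deriv([s + 0.5 * dt * k for s, k in zip(state, k2)])
        k4 = deriv([s + dt * k for s, k in zip(state, k3)])
        state = [s + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state

# Assumed initial data: both strings horizontal, at rest; a second run
# starts with theta_1 shifted by one millionth of a radian.
start = [math.pi / 2, math.pi / 2, 0.0, 0.0]
nearby = [math.pi / 2 + 1e-6, math.pi / 2, 0.0, 0.0]
end, end_nearby = simulate(start, 5.0), simulate(nearby, 5.0)
energy_drift = energy(end, 1, 1, 1, 1) - energy(start, 1, 1, 1, 1)
print("energy drift:", energy_drift)
print("separation in theta_1 after 5 s:", abs(end[0] - end_nearby[0]))
```

Running the same integration from two nearly identical starting angles and comparing the final states, as the last lines do, is a simple way to explore the sensitivity to initial conditions.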

The double pendulum is one of the most famous examples of a chaotic system (Definition 0.4): slightly different initial conditions give rise to drastically different trajectories. Because we are working here with generalized (polar) coordinates and the string lengths are fixed, the initial conditions are the initial angles and angular velocities of both bodies.5

5 For an interactive simulation illustrating the chaotic behavior of the double pendulum, visit http://bestofallpossibleurls.com/double-pendulum.html.

7.4 Symmetry and Conservation Laws

In the previous section, we have introduced Lagrangian mechanics in generalized coordinates. As we have argued, this is the most general description that one can give for the trajectory of a particle. We now introduce the concept of symmetries. A symmetry is a coordinate transformation that leaves the physical properties of the system unchanged, no matter where the system is located in the configuration space. Or, in the language of Lagrangian mechanics:

Definition 7.6 (Symmetry) A symmetry of a physical system is a coordinate transformation that leaves the Lagrangian unchanged.

Since the Lagrangian of the system describes all that there is to know about the system, a symmetry is a transformation that does not alter its properties.

Remark 7.2 showed a case in which a transformation of the coordinate system (a shift of $x_+$) did not change the Lagrangian. As another example, rotating the orbit described by the motion of a particle is a symmetry because it is a transformation that does not alter the equations of motion of the object. Let’s now see more examples:

Example 7.8 (Symmetries, Example I)

Example 7.9 (Symmetries, Example II)

Part II

ELECTRICITY

AND MAGNETISM
