
Gravitation

J.Pearson

July 20, 2009

Abstract

These are a set of notes I have made, based on lectures given by A.Pilaftsis at the University of Manchester, Sept-Dec ’08. Please e-mail me with any comments/corrections: [email protected]. These notes may be found at www.jpoffline.com.


Contents

1 Recap of Special Relativity
    1.1 The Lorentz Transformations
    1.2 Covariant Formalism
        1.2.1 Lorentz Boost
    1.3 Standard Relations
    1.4 The Equivalence Principles
        1.4.1 The Weak Equivalence Principle
        1.4.2 The Strong Equivalence Principle
    1.5 Gravitational Redshift
    1.6 Einstein’s Vision of General Relativity

2 Manifolds, Metrics & Tensors
    2.1 Definitions
    2.2 Coordinate Transformations
        2.2.1 Example: Plane Polars
    2.3 Tangent Vector
    2.4 The Metric & Line Element
        2.4.1 Example: Polars
    2.5 Vectors
        2.5.1 Contravariant Vectors $A^\mu$
        2.5.2 Covariant Vectors $A_\mu$
        2.5.3 The Scalar Product
        2.5.4 Conformal Transformations
        2.5.5 Is it a Proper Vector?
    2.6 Tensors
        2.6.1 Symmetric & Anti-symmetric Tensors

3 Tensor Calculus
    3.1 Covariant Differentiation
        3.1.1 Parallel Transport
        3.1.2 Absolute Derivative
        3.1.3 Transformation of $\Gamma^\lambda{}_{\nu\mu}$
        3.1.4 Locally Inertial Frames
        3.1.5 Torsion
    3.2 Geodesics
        3.2.1 The Affine Geodesic
        3.2.2 The Metric Geodesic
        3.2.3 Relation Between Affine Connection & Christoffel Symbol
    3.3 Isometries & Killing’s Equation
    3.4 Summary
    3.5 Examples
        3.5.1 Computing Christoffel Symbols: Effective Lagrangian
        3.5.2 Computing the Geodesic
        3.5.3 Physical Meaning of the Killing Vector
        3.5.4 Nordström’s Theory of Gravity

4 Curvature
    4.1 The Riemann Tensor
        4.1.1 Symmetries of the Riemann Tensor
        4.1.2 The Round Trip
    4.2 The Ricci Identity
    4.3 The Ricci Tensor & Scalar
        4.3.1 Example: Plane Polars
    4.4 The Bianchi Identity
    4.5 The Einstein Tensor
    4.6 Geodesic Deviation

5 Einstein’s Equation
    5.1 The Energy Momentum Tensor $T^{\mu\nu}$
        5.1.1 Components of $T^{\mu\nu}$
        5.1.2 Conservation Equations
        5.1.3 Perfect Fluids
    5.2 Einstein’s Equation
        5.2.1 The Cosmological Constant
    5.3 The Newtonian Limit
        5.3.1 Newtonian Gravity from Einstein’s Gravity
    5.4 Linearised Gravity
        5.4.1 Linearising Einstein’s Equation
        5.4.2 Gravitational Radiation

6 The Schwarzschild Solution
        6.0.3 Gravitational Redshift
    6.1 Dynamics in the Schwarzschild Spacetime
        6.1.1 Geodesics & Christoffel Symbols
        6.1.2 Orbits
        6.1.3 Summary
    6.2 Light Deflection
    6.3 Perihelion Precession
    6.4 Black Holes
        6.4.1 Null Geodesics
        6.4.2 Eddington-Finkelstein Coordinates

7 The Friedmann-Robertson-Walker Universe
    7.1 The FRW Metric
    7.2 Geodesics & Christoffel Symbols
    7.3 Cosmology in the FRW Universe
        7.3.1 Species Evolution & Densities
    7.4 Age of the FRW Universe
        7.4.1 Age of Matter Dominated Universe
        7.4.2 Age of Matter & Curvature Dominated Universe
    7.5 Light in the FRW Universe
    7.6 Flatness Problem
        7.6.1 Inflation

8 The General Theory of Relativity: Discussion


1 Recap of Special Relativity

Let us quickly recap the principles of special relativity that are assumed to be known.

The postulates of SR:

• All laws of nature are the same for all inertial observers;

• The speed of light, c, is the same for all inertial observers.

1.1 The Lorentz Transformations

Consider a frame $\Sigma'$, within which an observer is stationary. The coordinates in that frame are the “primed” ones, $(ct', x', y', z')$. Now consider another frame, $\Sigma$, such that $\Sigma'$ moves along the $x$-axis at constant velocity $v$ (with $\beta \equiv v/c$) relative to a stationary observer in $\Sigma$. The coordinates in the “stationary frame” are unprimed, $(ct, x, y, z)$.

The two sets of coordinates are related via the transformations

$$ct' = \gamma(ct - \beta x), \qquad x' = \gamma(x - \beta ct), \qquad y' = y, \qquad z' = z. \tag{1.1}$$

We have defined the quantities

$$\gamma \equiv \frac{1}{\sqrt{1-\beta^2}}, \qquad \beta \equiv \frac{v}{c}.$$

From the transformations, we can compute “the invariance of the interval”, thus

$$c^2t'^2 - x'^2 - y'^2 - z'^2 = c^2t^2 - x^2 - y^2 - z^2.$$

The physical consequences of this are Fitzgerald contraction (moving bodies shorten) and time dilation (moving clocks run slow).
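The invariance of the interval can be checked numerically. The following is a minimal sketch in Python; the function names `boost_x` and `interval`, the sample event, and the value of $\beta$ are illustrative choices, not part of the notes.

```python
import math

def boost_x(beta, event):
    """Apply the transformation (1.1) to event = (ct, x, y, z), with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    ct, x, y, z = event
    return (gamma * (ct - beta * x), gamma * (x - beta * ct), y, z)

def interval(event):
    """Return (ct)^2 - x^2 - y^2 - z^2 for the given event."""
    ct, x, y, z = event
    return ct ** 2 - x ** 2 - y ** 2 - z ** 2

event = (5.0, 3.0, 1.0, -2.0)    # arbitrary illustrative event
primed = boost_x(0.6, event)     # the same event seen from the primed frame

# The two numbers agree up to floating-point rounding.
print(interval(event), interval(primed))
```

The components of the event change under the boost, but the combination $(ct)^2 - x^2 - y^2 - z^2$ does not.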

1.2 Covariant Formalism

The title “covariant formalism” is a little misleading: it should read “invariant formalism”, but convention leaves it so.

Let us define the contravariant position 4-vector as

$$x^\mu = (x^0, x^1, x^2, x^3) = (ct, x, y, z). \tag{1.2}$$

The metric of SR is flat, called the Minkowski metric, and written $\eta_{\mu\nu}$. The elements of the metric may be represented as

$$(\eta_{\mu\nu}) = \mathrm{diag}(1,-1,-1,-1) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.$$

Notice that this metric is symmetric; $\eta_{\mu\nu} = \eta_{\nu\mu}$. Consider constructing an inverse matrix to this metric. That is, we require

$$\eta\,\eta^{-1} = 1_4,$$

where $1_4$ is the 4-D identity matrix diag(1,1,1,1). By inspection, the inverse matrix has the same elements as the original. We denote the inverse of the metric as

$$(\eta^{-1})^{\mu\nu} \equiv \eta^{\mu\nu},$$

thus, we have that

$$\eta_{\mu\nu}\eta^{\nu\lambda} = \delta^\lambda_\mu.$$

Now, in Euclidean space, suppose we have a vector $\mathbf{x} = x^i\mathbf{e}_i$, where $\mathbf{e}_i$ is a basis vector and $i = 1, \ldots, n$, where $n$ is the dimension of the Euclidean space (usually 3). Then, the dot-product of the vector with itself can be written as

$$\mathbf{x}\cdot\mathbf{x} = x^i x^j\,\mathbf{e}_i\cdot\mathbf{e}_j,$$

and we “mix the basis vectors” via the Kronecker-delta, which is the metric of Euclidean space

$$\mathbf{e}_i\cdot\mathbf{e}_j = \delta_{ij} \quad\Rightarrow\quad \mathbf{x}\cdot\mathbf{x} = x^i x^j\delta_{ij} = x^i x^i.$$

If we expand out this implied summation, we get the squared radius of a sphere in the n-dimensional Euclidean space; for $n = 3$,

$$x^i x^i = x^2 + y^2 + z^2.$$

Now, we make the analogy to Minkowski space. We write a 4-vector as $x = x^\mu e_\mu$, with contravariant components $x^\mu$, so that the inner-product of the vector with itself is written

$$x\cdot x = x^\mu x^\nu\, e_\mu\cdot e_\nu,$$

and again we mix the basis vectors by the metric of the space; the metric of Minkowski space is $\eta_{\mu\nu}$. Thus,

$$e_\mu\cdot e_\nu = \eta_{\mu\nu},$$

and therefore

$$x\cdot x = x^\mu x^\nu\eta_{\mu\nu}.$$

If we say that

$$x_\mu = \eta_{\mu\nu}x^\nu, \tag{1.3}$$

then we see that

$$x\cdot x = x^\mu x_\mu.$$

From this, we are able to define the covariant position 4-vector as

$$x_\mu = \eta_{\mu\nu}x^\nu = (ct, -x, -y, -z).$$

And therefore, carrying out the summation, we find that the inner-product of the position 4-vector with itself is the Minkowski-space analogue of the squared radius of a sphere (the interval);

$$x^\mu x_\mu = (ct)^2 - x^2 - y^2 - z^2.$$

Of course, we can write the inner-product of one 4-vector with another

$$x\cdot y = x^\mu y^\nu\eta_{\mu\nu} = x^\mu y_\mu.$$

Just as we used the metric to lower a contravariant vector’s index, to become a covariant index, we may use the inverse metric to raise a covariant index to become a contravariant one

$$x^\mu = \eta^{\mu\nu}x_\nu. \tag{1.4}$$

Therefore, using these relations, we are able to see that

$$x^\nu y_\nu = x_\nu y^\nu.$$
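As a small illustration of raising and lowering with the Minkowski metric, here is a sketch in Python. The helper names (`lower`, `raise_index`, `inner`) and the sample components are assumptions made for the example only.

```python
# Minkowski metric eta_{mu nu} with signature (+, -, -, -), as in the text.
ETA = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]

def lower(vec):
    """x_mu = eta_{mu nu} x^nu, equation (1.3)."""
    return [sum(ETA[mu][nu] * vec[nu] for nu in range(4)) for mu in range(4)]

def raise_index(covec):
    """x^mu = eta^{mu nu} x_nu, equation (1.4); the inverse metric has the same entries."""
    return [sum(ETA[mu][nu] * covec[nu] for nu in range(4)) for mu in range(4)]

def inner(x_up, y_up):
    """x . y = x^mu y_mu, for two sets of contravariant components."""
    y_down = lower(y_up)
    return sum(x_up[mu] * y_down[mu] for mu in range(4))

x_up = [2, 1, 0, 3]     # illustrative components (ct, x, y, z)
y_up = [1, 4, -1, 2]

print(lower(x_up))        # covariant components: the spatial signs are flipped
print(inner(x_up, y_up))  # x^mu y_mu
```

Lowering and then raising an index returns the original components, and contracting $x^\nu y_\nu$ gives the same number as $x_\nu y^\nu$.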

1.2.1 Lorentz Boost

Consider again the 4-vector $x = x^\mu e_\mu$. Since the vector itself is the same in another frame, we must have that

$$x^\mu e_\mu = x'^\mu e'_\mu.$$

The way we transform between frames is via a Lorentz boost;

$$x'^\mu = \Lambda^\mu{}_\nu x^\nu, \tag{1.5}$$

where we use

$$(\Lambda^\mu{}_\nu) = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

If we note all of our definitions used thus far (for contravariant vectors, and their components), and the expressions forming the Lorentz transformations, (1.1), we see that

$$\Lambda^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\nu}.$$

We say that $\Lambda^\mu{}_\nu$ (as defined above) constitutes a boost along the $x$-axis. It is in fact a (hyperbolic) rotation in the $ct$-$x$ plane, i.e. about the $y$-$z$ plane.

Hence, we have a rule for transforming contravariant components between frames: (1.5). How, then, does a covariant component transform?


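Consider using the metric to change from a contravariant vector to a covariant one, in the primed frame,

$$x'_\mu = \eta_{\mu\kappa}x'^\kappa,$$

then we use (1.5) to transform the contravariant vector on the RHS

$$\eta_{\mu\kappa}x'^\kappa = \eta_{\mu\kappa}\Lambda^\kappa{}_\lambda x^\lambda,$$

then write $x^\lambda$ on the RHS in terms of its covariant components, using the inverse metric,

$$\eta_{\mu\kappa}\Lambda^\kappa{}_\lambda x^\lambda = \eta_{\mu\kappa}\Lambda^\kappa{}_\lambda\eta^{\lambda\nu}x_\nu.$$

Although not previously stated, the metric can lower/raise indices on anything, not just position vector-components. Thus, we see that $\eta_{\mu\kappa}\Lambda^\kappa{}_\lambda = \Lambda_{\mu\lambda}$. Hence, the above reads

$$\eta_{\mu\kappa}\Lambda^\kappa{}_\lambda\eta^{\lambda\nu}x_\nu = \Lambda_\mu{}^\nu x_\nu.$$

Now, let us define the inverse Lorentz transform as

$$(\Lambda^{-1})^\nu{}_\mu \equiv \Lambda_\mu{}^\nu.$$

Therefore, writing this stream of algebra down, from start to finish, we arrive at our result

$$\begin{aligned}
x'_\mu &= \eta_{\mu\kappa}x'^\kappa \\
&= \eta_{\mu\kappa}\Lambda^\kappa{}_\lambda x^\lambda \\
&= \eta_{\mu\kappa}\Lambda^\kappa{}_\lambda\eta^{\lambda\nu}x_\nu \\
&= \Lambda_\mu{}^\nu x_\nu \\
&= (\Lambda^{-1})^\nu{}_\mu x_\nu.
\end{aligned}$$

That is, to find the covariant components of a vector in the primed frame, we relate them to the unprimed frame via the inverse Lorentz transformation

$$x'_\mu = (\Lambda^{-1})^\nu{}_\mu x_\nu. \tag{1.6}$$

Let us then write our two Lorentz transformation rules; one for contravariant components & one for covariant

$$x'^\mu = \Lambda^\mu{}_\nu x^\nu, \qquad x'_\mu = (\Lambda^{-1})^\nu{}_\mu x_\nu. \tag{1.7}$$

Notice that the inverse Lorentz transformation matrix may be written as

$$\left((\Lambda^{-1})^\nu{}_\mu\right) = \begin{pmatrix} \gamma & \gamma\beta & 0 & 0 \\ \gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad (\Lambda^{-1})^\nu{}_\mu = \frac{\partial x^\nu}{\partial x'^\mu}.$$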

Notice that the product of $\Lambda^\mu{}_\nu$ and $(\Lambda^{-1})^\nu{}_\mu$ is the identity matrix, as they are inverses

$$\Lambda^\nu{}_\lambda(\Lambda^{-1})^\mu{}_\nu = \delta^\mu_\lambda.$$

We are now in a position to be able to prove the invariance of the interval, in Minkowski space, under Lorentz transformations. Consider the inner-product of two vectors in the primed frame,

$$x'\cdot y' = x'^\mu y'_\mu,$$

we then transform each expression on the RHS, according to the relevant rule

$$x'^\mu y'_\mu = \Lambda^\mu{}_\nu(\Lambda^{-1})^\lambda{}_\mu x^\nu y_\lambda,$$

then, noting the relation between the transformation & its inverse,

$$\Lambda^\mu{}_\nu(\Lambda^{-1})^\lambda{}_\mu x^\nu y_\lambda = \delta^\lambda_\nu x^\nu y_\lambda,$$

which easily gives

$$\delta^\lambda_\nu x^\nu y_\lambda = x^\nu y_\nu.$$

And therefore, putting it all together

$$\begin{aligned}
x'^\mu y'_\mu &= \Lambda^\mu{}_\nu(\Lambda^{-1})^\lambda{}_\mu x^\nu y_\lambda \\
&= \delta^\lambda_\nu x^\nu y_\lambda \\
&= x^\nu y_\nu.
\end{aligned}$$

And thus, we have shown that the inner-product is invariant under Lorentz transformation (the invariance of the interval).
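The two transformation rules (1.7) and the invariance of the inner-product can be illustrated numerically. This is a sketch only: the helper names, the value $\beta = 0.6$ and the sample vectors are assumptions for the example. It uses the fact, visible in the matrices above, that the inverse boost is the boost with $\beta \to -\beta$.

```python
import math

def boost(beta):
    """Matrix Lambda^mu_nu for a boost along x (row index mu, column index nu)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform_up(lam, x_up):
    """x'^mu = Lambda^mu_nu x^nu, equation (1.5)."""
    return [sum(lam[mu][nu] * x_up[nu] for nu in range(4)) for mu in range(4)]

def transform_down(lam_inv, x_down):
    """x'_mu = (Lambda^-1)^nu_mu x_nu, equation (1.6); lam_inv is indexed [nu][mu]."""
    return [sum(lam_inv[nu][mu] * x_down[nu] for nu in range(4)) for mu in range(4)]

def lower(v):
    """Lower an index with the diagonal metric diag(1, -1, -1, -1)."""
    return [v[0], -v[1], -v[2], -v[3]]

def contract(a_up, b_down):
    """a^mu b_mu."""
    return sum(a * b for a, b in zip(a_up, b_down))

beta = 0.6
lam, lam_inv = boost(beta), boost(-beta)   # inverse boost: reverse the velocity
x_up = [5.0, 3.0, 1.0, -2.0]               # illustrative components
y_up = [2.0, -1.0, 4.0, 0.5]

x_prime_up = transform_up(lam, x_up)
y_prime_down = transform_down(lam_inv, lower(y_up))

# Both contractions agree up to floating-point rounding.
print(contract(x_up, lower(y_up)), contract(x_prime_up, y_prime_down))
```

Transforming the covariant components with the inverse matrix gives the same result as boosting the contravariant components and then lowering the index, which is the content of (1.6).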

1.3 Standard Relations

Here we shall merely state the standard definitions of various 4-vectors.

The infinitesimal 4-position is defined as

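$$dx^\mu = (c\,dt, d\mathbf{x}) \quad\Rightarrow\quad dx_\mu = \eta_{\mu\nu}dx^\nu = (c\,dt, -d\mathbf{x}).$$

The line element:

$$ds^2 = \eta_{\mu\nu}dx^\mu dx^\nu = c^2dt^2 - (d\mathbf{x})^2.$$

Proper time:

$$d\tau = \frac{1}{c}\sqrt{dx^\nu dx_\nu} = \frac{dt}{\gamma}.$$

4-velocity:

$$u^\mu = \frac{dx^\mu}{d\tau} = (c\gamma, \gamma\mathbf{v}).$$

4-momentum:

$$p^\mu = mu^\mu = (E/c, \mathbf{p}).$$

Differential operator:

$$\partial_\mu \equiv \left(\frac{1}{c}\frac{\partial}{\partial t}, \nabla\right).$$

Charge conservation:

$$\partial_\mu J^\mu = 0, \qquad J^\mu = (c\rho, \mathbf{J}).$$

Lorentz gauge:

$$\partial_\mu A^\mu = 0, \qquad A^\mu = (\phi/c, \mathbf{A}).$$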
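One consequence of these definitions is that the 4-velocity has fixed norm, $u^\mu u_\mu = c^2$, which follows from $ds^2 = c^2d\tau^2$. A short numerical sketch, with illustrative function names and an arbitrary sample velocity:

```python
import math

C = 299_792_458.0   # speed of light in m/s

def four_velocity(v):
    """u^mu = (c*gamma, gamma*v) for a 3-velocity v = (vx, vy, vz) in m/s."""
    vx, vy, vz = v
    speed_sq = vx * vx + vy * vy + vz * vz
    gamma = 1.0 / math.sqrt(1.0 - speed_sq / C ** 2)
    return (gamma * C, gamma * vx, gamma * vy, gamma * vz)

def norm_sq(u):
    """u^mu u_mu with the metric diag(1, -1, -1, -1)."""
    return u[0] ** 2 - u[1] ** 2 - u[2] ** 2 - u[3] ** 2

u = four_velocity((0.3 * C, 0.4 * C, 0.0))   # illustrative velocity, |v| = 0.5c
print(norm_sq(u) / C ** 2)                   # close to 1 for any |v| < c
```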

1.4 The Equivalence Principles

Here we shall discuss some thought experiments which led to the development of general relativity.

1.4.1 The Weak Equivalence Principle

Imagine an observer and a “ball” inside a sealed lift. The observer is stationary relative to the ball, and is unable to see out of the lift. Suppose that the lift is suspended in a homogeneous gravitational field.

Then, suppose that the cable holding the lift up is cut. The lift will accelerate downwards with $\mathbf{a} = \mathbf{g}$, where the acceleration due to gravity is just given by

$$\mathbf{g} = -\nabla\phi_g.$$

Now, experience tells us that both the observer and ball will remain at rest, relative to each other, inside the lift.

From Newton, we have the relation between the resultant force on a body (which will be the gravitational mass times the gravitational field), and the inertial mass times the acceleration:

$$m_i\mathbf{a} = m_g\mathbf{g}.$$

Thus, as $\mathbf{a} = \mathbf{g}$, we see that $m_i = m_g$. This leads to the statement of the weak equivalence principle:

“Gravity couples in the same way to all mass & energy”.

1.4.2 The Strong Equivalence Principle

Consider the same setup as before: observer & ball at rest inside a sealed lift. This time, let the lift be in free space (i.e. no gravitational fields, anywhere).

Then, suppose that we accelerate the lift (using a rocket) such that $a = g$. We see that there is no difference between this situation and one in which the lift is sitting on the Earth’s surface. Thus, the strong equivalence principle:

“All laws of physics are the same in an accelerated frame, and in a uniform static gravitational field”.

1.5 Gravitational Redshift

Consider a lift with a stationary observer in it. Also in the lift is a light bulb, which emits light at frequency $\nu'$, according to the observer stationary inside the lift. Now, consider that there is another observer, stationary, on the surface of the Earth (which we model as having a homogeneous gravitational field). We have $\hat{\mathbf{x}}$ pointing upwards, from the surface of the Earth. Then,

$$\mathbf{g} = -\frac{d\phi}{d\ell}\hat{\mathbf{x}}.$$

Let the length of the lift be $d\ell$, and the light bulb reside at the top of the lift. Then, a signal traveling at speed $c$ takes time $dt = d\ell/c$ to traverse the length of the lift.

Now, suppose that the lift is traveling at speed $v$; then the observer on the Earth will see some shifted frequency, $\nu$. The Doppler shift is just

$$\frac{\nu'}{\nu} = \left(\frac{1+v/c}{1-v/c}\right)^{1/2} \approx 1 + \frac{v}{c},$$

after using the binomial expansion. From this, we see that

$$\frac{d\nu}{\nu} = \frac{v}{c}.$$

Using the relation that $v = du = g\,dt$, this simply gives that

$$\frac{d\nu}{\nu} = \frac{g}{c}dt,$$

which gives, using $c\,dt = d\ell$,

$$\frac{d\nu}{\nu} = \frac{g}{c^2}d\ell.$$

Now, if we use the fact that $g\,d\ell = -d\phi$, then this is just

$$\frac{d\nu}{\nu} = -\frac{d\phi}{c^2}.$$

Therefore, we see that the frequency shift is due to a changing gravitational potential. Thus, if a photon is moving out of a potential well, it will be red-shifted; moving inward, it would be blue-shifted.
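To get a feel for the size of the effect, the result $d\nu/\nu = -d\phi/c^2$ can be evaluated for a photon climbing a short height near the Earth’s surface. This is a rough sketch: the value of $g$, the height of 22.5 m and the function name are assumptions for illustration, and the uniform-field approximation of this section is used throughout.

```python
C = 299_792_458.0   # speed of light in m/s
G_SURFACE = 9.81    # magnitude of g near the Earth's surface in m/s^2 (assumed value)

def fractional_shift(height):
    """dnu/nu = -dphi/c^2 for a photon climbing `height` metres in a uniform field.

    The potential increases upwards, so dphi = +g * height for a climbing photon.
    """
    delta_phi = G_SURFACE * height
    return -delta_phi / C ** 2

shift = fractional_shift(22.5)   # illustrative height of 22.5 m
print(shift)                     # negative: the photon is red-shifted on the way up
```

The fractional shift is of order $10^{-15}$ for a height of a few tens of metres, which is why the effect is negligible in everyday situations.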


1.6 Einstein’s Vision of General Relativity

Einstein’s vision is that spacetime is a manifold, such that line elements are given by

$$ds^2 = g_{\mu\nu}(x^\rho)dx^\mu dx^\nu,$$

where the metric is a function of the coordinates. Within the metric (or, how the metric is constructed) is information on how spacetime is curved; and it is curved by any form of energy/momentum. According to the equivalence principle, one can always choose coordinates such that spacetime is locally flat (Minkowski). Things in the spacetime travel along geodesics, the generalisation of straight lines. Massive particles travel along time-like geodesics, which have $ds^2 > 0$; photons travel along null geodesics, $ds^2 = 0$; and tachyons along space-like geodesics, $ds^2 < 0$.

2 Manifolds, Metrics & Tensors

2.1 Definitions

Let us state some rather (mathematically) loose definitions.

Manifold: A manifold is a continuous set of points which locally looks like an n-dimensional Minkowski space. A manifold endowed with a metric is called a Riemannian manifold.

That is, given a manifold $\mathcal{M}$, if we “zoom in” on a little bit, that little bit will look flat. Suppose we zoom in on a bit which we label $u_i(p)$, where $i$ just means that we chose one of many bits, and $p$ is the point at the middle of the bit $u_i$. The coordinate system in $u_1$ (say) is Minkowski, $x^a(p)$. The whole collection of these little bits leads us to our next definition.

Atlas: An atlas is the complete set of coordinate systems $u_i$ in the manifold $\mathcal{M}$.

Curve: A curve in an n-manifold (where $\mathcal{M}$ merely has n coordinates) is a subset of points defined parametrically

$$x^a = x^a(\lambda), \qquad a = 1, 2, \ldots, n, \qquad \lambda \in \mathbb{R}.$$

For example, consider a 1-sphere (i.e. a circle), defined by the equation $x^2 + y^2 = 1$. We parameterise it thus

$$x^a = (x(\lambda), y(\lambda)) \quad\Rightarrow\quad x(\lambda) = \sin\lambda, \quad y(\lambda) = \cos\lambda; \qquad 0 \le \lambda < 2\pi.$$

Surfaces: An m-dimensional hypersurface in an n-manifold (whereby $m < n$) is defined as

$$x^a = x^a(\lambda_1, \ldots, \lambda_m); \qquad \lambda_1, \ldots, \lambda_m \in \mathbb{R}.$$

So a curve is a 1D hypersurface. Or, alternatively, a surface is a generalisation of a curve.

For example, consider a 2-sphere (i.e. the surface of a ball), of constant radius $r$. It is defined by $x^2 + y^2 + z^2 = r^2 = \text{const}$. We parameterise the surface by $(\theta, \phi)$, so that

$$x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta; \qquad 0 \le \theta \le \pi, \quad 0 \le \phi < 2\pi.$$

2.2 Coordinate Transformations

Consider moving from one coordinate system to another

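$$x^\mu \longmapsto x'^\mu = x'^\mu(x^\nu).$$

Such a transformation is defined by displacement vectors $dx^\mu$ and $dx'^\nu$, such that

$$dx'^\mu = J^\mu{}_\nu dx^\nu, \tag{2.1}$$

whereby the inverse is just

$$dx^\mu = (J^{-1})^\mu{}_\nu dx'^\nu.$$

By the chain rule, it is easy to see that the transformation matrix is just the Jacobian

$$J^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\nu}. \tag{2.2}$$

The transformation & inverse satisfy

$$J^\mu{}_\nu(J^{-1})^\nu{}_\sigma = \delta^\mu_\sigma. \tag{2.3}$$

This is easier to see if we represent the Jacobians in terms of partial derivatives,

$$J^\mu{}_\nu(J^{-1})^\nu{}_\sigma = \frac{\partial x'^\mu}{\partial x^\nu}\frac{\partial x^\nu}{\partial x'^\sigma} = \frac{\partial x'^\mu}{\partial x'^\sigma} = \delta^\mu_\sigma.$$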

2.2.1 Example: Plane Polars

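Consider that some point in the $\mathbb{R}^2$ plane may be defined by Cartesian coordinates $(x, y)$ or plane polars, $(r, \theta)$. Then, we make the identifications

$$(x^1, x^2) = (x, y), \qquad (x'^1, x'^2) = (r, \theta).$$

We also know that

$$x = r\cos\theta, \quad y = r\sin\theta; \qquad r = \sqrt{x^2 + y^2}, \quad \theta = \tan^{-1}(y/x).$$

Then, we can compute the elements of the Jacobian

$$J^i{}_j = \frac{\partial x'^i}{\partial x^j} = \frac{\partial(r, \theta)}{\partial(x, y)} = \begin{pmatrix} \dfrac{\partial r}{\partial x} & \dfrac{\partial r}{\partial y} \\[2ex] \dfrac{\partial\theta}{\partial x} & \dfrac{\partial\theta}{\partial y} \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\[1ex] -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{pmatrix}.$$

And therefore,

$$\begin{aligned}
dr &= \sum_j J^r{}_j dx^j \\
&= J^r{}_x dx + J^r{}_y dy \\
&= \cos\theta\,dx + \sin\theta\,dy.
\end{aligned}$$

And similarly,

$$d\theta = -\frac{\sin\theta}{r}dx + \frac{\cos\theta}{r}dy.$$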
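The plane-polar Jacobian and its inverse can be checked against (2.3) at a sample point. The sketch below is illustrative; the function names and the point $(x, y) = (3, 4)$ are arbitrary choices, and the inverse Jacobian is obtained by differentiating $x = r\cos\theta$, $y = r\sin\theta$.

```python
import math

def jacobian(x, y):
    """J^i_j = d(r, theta)/d(x, y) at the Cartesian point (x, y), with r > 0."""
    r = math.hypot(x, y)
    cos_t, sin_t = x / r, y / r
    return [[cos_t, sin_t],
            [-sin_t / r, cos_t / r]]

def inverse_jacobian(x, y):
    """(J^-1)^i_j = d(x, y)/d(r, theta), from x = r cos(theta), y = r sin(theta)."""
    r = math.hypot(x, y)
    cos_t, sin_t = x / r, y / r
    return [[cos_t, -r * sin_t],
            [sin_t, r * cos_t]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

product = matmul(jacobian(3.0, 4.0), inverse_jacobian(3.0, 4.0))
print(product)   # close to the 2x2 identity matrix, as required by (2.3)
```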


2.3 Tangent Vector

Imagine that on a manifold $\mathcal{M}$ we have curves parameterised by $u$. On one curve, there is a point $p$, at parameter value $u = u_p$. So, with $x^\mu = x^\mu(u)$, the tangent vector to the curve at $p$ is defined to be

$$T^\mu = \left.\frac{dx^\mu}{du}\right|_{u=u_p}. \tag{2.4}$$

2.4 The Metric & Line Element

We have the line element

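$$ds^2 = g_{\mu\nu}(x)dx^\mu dx^\nu. \tag{2.5}$$

Now, a common requirement is the invariance of the line element (i.e. invariance of the interval). Thus, we require that

$$ds^2(x) = ds^2(x').$$

So, under transformation $x^\mu \mapsto x'^\nu(x^\mu)$, we want that

$$g_{\mu\nu}dx^\mu dx^\nu = g'_{\alpha\beta}dx'^\alpha dx'^\beta. \tag{2.6}$$

So, we proceed by writing down the known transformation of the RHS “primed” to “unprimed” displacement vectors,

$$g_{\mu\nu}dx^\mu dx^\nu = g'_{\alpha\beta}dx'^\alpha dx'^\beta = g'_{\alpha\beta}J^\alpha{}_\mu J^\beta{}_\nu dx^\mu dx^\nu.$$

But this must hold for arbitrary displacements, so we see that we must have

$$g_{\mu\nu} = g'_{\alpha\beta}J^\alpha{}_\mu J^\beta{}_\nu. \tag{2.7}$$

We can derive a similar relation by starting from (2.6) and, instead of transforming the RHS, transforming the LHS. So,

$$g'_{\alpha\beta}dx'^\alpha dx'^\beta = g_{\mu\nu}dx^\mu dx^\nu = g_{\mu\nu}(J^{-1})^\nu{}_\beta(J^{-1})^\mu{}_\alpha dx'^\alpha dx'^\beta,$$

which we require to always be true, leaving us with

$$g'_{\alpha\beta} = g_{\mu\nu}(J^{-1})^\nu{}_\beta(J^{-1})^\mu{}_\alpha. \tag{2.8}$$

The alternative way of writing the Jacobian allows us to rewrite (trivially) expressions (2.7) and (2.8)

$$g_{\mu\nu} = \frac{\partial x'^\alpha}{\partial x^\mu}\frac{\partial x'^\beta}{\partial x^\nu}g'_{\alpha\beta}, \qquad g'_{\alpha\beta} = \frac{\partial x^\nu}{\partial x'^\beta}\frac{\partial x^\mu}{\partial x'^\alpha}g_{\mu\nu}.$$

We call $g_{\mu\nu}$ the “metric”, and $g^{\mu\nu}$ the “inverse metric”; where they must satisfy

$$g_{\mu\nu}g^{\nu\lambda} = \delta^\lambda_\mu. \tag{2.9}$$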


2.4.1 Example: Polars

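We know that the line element in plane polars is $ds^2 = dr^2 + r^2d\theta^2$. Thus, we can read off the elements of the metric

$$(g_{ij}) = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix},$$

and, by (2.9), we see that we require

$$(g^{ij}) = \begin{pmatrix} 1 & 0 \\ 0 & 1/r^2 \end{pmatrix}.$$

In spherical polars, the line element is

$$ds^2 = dr^2 + r^2d\theta^2 + r^2\sin^2\theta\,d\phi^2;$$

and we can easily read off the metric

$$(g_{ij}) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2\sin^2\theta \end{pmatrix}, \qquad (g^{ij}) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/r^2 & 0 \\ 0 & 0 & 1/(r^2\sin^2\theta) \end{pmatrix}.$$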
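The plane-polar metric can also be obtained from the Cartesian one by applying the transformation rule (2.8), rather than reading it off the line element. The following sketch does this numerically at one point; the function names and the chosen values of $r$ and $\theta$ are illustrative assumptions.

```python
import math

def inverse_jacobian(r, theta):
    """(J^-1)^mu_alpha = d(x, y)/d(r, theta); row mu in (x, y), column alpha in (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def transform_metric(g, j_inv):
    """g'_{alpha beta} = g_{mu nu} (J^-1)^mu_alpha (J^-1)^nu_beta, equation (2.8)."""
    n = len(g)
    return [[sum(g[mu][nu] * j_inv[mu][a] * j_inv[nu][b]
                 for mu in range(n) for nu in range(n))
             for b in range(n)]
            for a in range(n)]

cartesian_metric = [[1.0, 0.0], [0.0, 1.0]]   # ds^2 = dx^2 + dy^2
r, theta = 2.0, 0.7                           # arbitrary illustrative point

polar_metric = transform_metric(cartesian_metric, inverse_jacobian(r, theta))
print(polar_metric)   # close to [[1, 0], [0, r^2]]
```

Up to rounding, the result is diag$(1, r^2)$, in agreement with the metric read off from $ds^2 = dr^2 + r^2d\theta^2$.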

Raising & Lowering: We can use the metric to raise & lower indices. We shall not show this in use here; see the next subsection.

2.5 Vectors

We start this by discussing contravariant and covariant vectors.

2.5.1 Contravariant Vectors $A^\mu$

These are sometimes just denoted “vectors”.

These are defined to transform, under coordinate transformation $x^\mu \mapsto x'^\mu(x^\nu)$, as

$$A'^\mu = J^\mu{}_\nu A^\nu. \tag{2.10}$$

2.5.2 Covariant Vectors $A_\mu$

These are sometimes called “covectors”.

Let us say that we define a covector $A_\mu$ via

$$A_\mu = g_{\mu\nu}A^\nu.$$

Then, we may derive its transformation properties. Consider that

$$A'_\mu = g'_{\mu\nu}A'^\nu,$$

the RHS of which we know the transformation rules for, from (2.8) and (2.10),

$$A'_\mu = g'_{\mu\nu}A'^\nu = (J^{-1})^\alpha{}_\mu(J^{-1})^\beta{}_\nu g_{\alpha\beta}J^\nu{}_\sigma A^\sigma.$$

We can rearrange the terms in this expression,

$$(J^{-1})^\alpha{}_\mu(J^{-1})^\beta{}_\nu J^\nu{}_\sigma g_{\alpha\beta}A^\sigma,$$

so that we notice the appearance of a transformation-inverse multiplication, which results in a Kronecker-delta

$$(J^{-1})^\alpha{}_\mu\delta^\beta_\sigma g_{\alpha\beta}A^\sigma,$$

acting the Kronecker-delta results in (ignoring the middle equality now)

$$A'_\mu = (J^{-1})^\alpha{}_\mu g_{\alpha\beta}A^\beta,$$

then lowering the index, via the metric,

$$A'_\mu = (J^{-1})^\alpha{}_\mu A_\alpha. \tag{2.11}$$

And therefore, we have arrived at the relation we require.

2.5.3 The Scalar Product

The scalar product between two vectors is written

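$$S\cdot T = S^\mu T^\nu g_{\mu\nu} = S_\nu T^\nu.$$

A fairly obvious thing we need to prove is the invariance of the dot-product. Using $S_\nu = J^\alpha{}_\nu S'_\alpha$ and $T^\nu = (J^{-1})^\nu{}_\beta T'^\beta$, which are (2.11) and (2.10) inverted,

$$S_\nu T^\nu = S'_\alpha T'^\beta J^\alpha{}_\nu(J^{-1})^\nu{}_\beta = S'_\alpha T'^\beta\delta^\alpha_\beta = S'_\alpha T'^\alpha.$$

This is a very important proof. In fact, it also states that scalars are invariant under transformation.

Within the scalar product, we must briefly mention the modulus of a vector. We denote it as $||S||$, and define it as

$$||S|| = \begin{cases} (S^\mu S_\mu)^{1/2} & \text{time-like, } ds^2 > 0, \\ (-S^\mu S_\mu)^{1/2} & \text{space-like, } ds^2 < 0. \end{cases}$$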

2.5.4 Conformal Transformations

Following from the previous definition of the scalar product, we have the definition of the angle between two vectors;

$$\cos\theta = \frac{S^\mu T_\mu}{||S||\,||T||} = \frac{S^\mu T_\mu}{(S^\alpha S_\alpha)^{1/2}(T^\beta T_\beta)^{1/2}}. \tag{2.12}$$

A conformal transformation is defined as one under which the angle between two vectors is unchanged.

Associated metrics are termed “conformal metrics”. How can we find such metrics? They are given by

$$\tilde{g}_{\mu\nu} = \Omega(x)g_{\mu\nu}, \qquad \Omega(x) \neq 0. \tag{2.13}$$

We can see this by putting this new metric into the $\cos\theta$ expression,

$$\cos\theta = \frac{\tilde{g}_{\mu\nu}S^\mu T^\nu}{(\tilde{g}_{\alpha\gamma}S^\alpha S^\gamma)^{1/2}(\tilde{g}_{\beta\delta}T^\beta T^\delta)^{1/2}},$$

and by substituting $\tilde{g}_{\mu\nu} = \Omega(x)g_{\mu\nu}$, we see that the factors of $\Omega$ end up canceling (one from the numerator against two factors of $\Omega^{1/2}$ from the denominator), leaving the angle unchanged.
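The cancellation of $\Omega$ can be seen numerically. The sketch below uses the positive-definite plane-polar metric diag$(1, r^2)$ and a positive $\Omega$ so that all square roots are real; those choices, the sample vectors and the function names are assumptions for illustration only.

```python
import math

def dot(g, s, t):
    """g_{mu nu} S^mu T^nu."""
    return sum(g[i][j] * s[i] * t[j] for i in range(len(s)) for j in range(len(t)))

def cos_angle(g, s, t):
    """cos(theta) between S and T, as in equation (2.12)."""
    return dot(g, s, t) / math.sqrt(dot(g, s, s) * dot(g, t, t))

def rescale(g, omega):
    """Conformally rescaled metric Omega * g, as in equation (2.13)."""
    return [[omega * entry for entry in row] for row in g]

r = 2.0
g = [[1.0, 0.0], [0.0, r * r]]   # plane-polar metric at radius r
s = [1.0, 0.5]                   # illustrative components (S^r, S^theta)
t = [-0.3, 1.0]
omega = 7.5                      # value of Omega at this point (assumed positive)

print(cos_angle(g, s, t), cos_angle(rescale(g, omega), s, t))
```

The two printed values agree up to rounding, even though the lengths of both vectors change by a factor $\Omega^{1/2}$.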

2.5.5 Is it a Proper Vector?

Here, we ask if various quantities are “proper vectors”, or not.

Consider $C^\mu(x) = aA^\mu(x) + bB^\mu(x)$. It is clearly a proper vector, as each of its constituents transforms as we expect, and each is defined at the same coordinate point.

Consider $C^\mu = aA^\mu(x_1) + bB^\mu(x_2)$. This is not a proper vector, as the constituents are defined at different points, and the Jacobian generally takes different values at different points.

2.6 Tensors

These are basically vectors, with more indices. We can also mix the indices, so that we have some up, some down.

For example, consider $F^{\mu\nu} \equiv A^\mu B^\nu$. We call it a second rank contravariant tensor, or a $\binom{2}{0}$-tensor. It clearly transforms as

$$F'^{\mu\nu} = A'^\mu B'^\nu = J^\mu{}_\alpha J^\nu{}_\beta A^\alpha B^\beta = J^\mu{}_\alpha J^\nu{}_\beta F^{\alpha\beta}.$$

Similarly, a second rank covariant tensor, or a $\binom{0}{2}$-tensor, transforms like

$$F'_{\mu\nu} = A'_\mu B'_\nu = (J^{-1})^\alpha{}_\mu(J^{-1})^\beta{}_\nu A_\alpha B_\beta = (J^{-1})^\alpha{}_\mu(J^{-1})^\beta{}_\nu F_{\alpha\beta}.$$

Finally, a mixed $\binom{1}{1}$-tensor transforms

$$F'^\mu{}_\nu = J^\mu{}_\alpha(J^{-1})^\beta{}_\nu F^\alpha{}_\beta.$$

This obviously generalises to higher-rank tensors. One must include a Jacobian for each contravariant index, and one inverse Jacobian for each covariant index.

Getting equations & other expressions into tensorial form (i.e. into a form consistent with the above tensor transformations) is extremely useful. For example, given a tensor equation in one frame of reference, one therefore knows the form in all frames of reference. This becomes particularly useful when one finds a frame in which a particular equation becomes simple to analyse; then, one can simply transform out of that frame, and know that the analysis still holds.

Also, consider a tensor whose components are all zero. Then, one cannot make a coordinate transformation that will be able to “reinstate” those (completely) zero components. That is, a tensor with zero components in one frame has zero components in all frames. This is a very useful concept. If a quantity is not a tensor, then this does not hold true. That is, a non-tensor with zero components in one frame may have non-zero components in another.

2.6.1 Symmetric & Anti-symmetric Tensors

A symmetric $\binom{2}{0}$-tensor is one where

$$A^{\mu\nu} = A^{\nu\mu},$$

that is, the sign is unchanged under exchange of the indices. An anti-symmetric tensor is one for which

$$B^{\mu\nu} = -B^{\nu\mu}.$$

Now then, using these relations (definitions, if you will), we can see some interesting formulae.

Suppose that $A^{\mu\nu}$ is a symmetric tensor. Then, $A^{\mu\nu} = A^{\nu\mu}$. Then, we see that

$$A^{\mu\nu} = \tfrac{1}{2}(A^{\mu\nu} + A^{\nu\mu}) = \tfrac{1}{2}(A^{\mu\nu} + A^{\mu\nu}) = \tfrac{1}{2}\,2A^{\mu\nu} = A^{\mu\nu}.$$

Similarly, suppose that $B^{\mu\nu}$ is an anti-symmetric tensor. Then,

$$B^{\mu\nu} = \tfrac{1}{2}(B^{\mu\nu} - B^{\nu\mu}) = \tfrac{1}{2}(B^{\mu\nu} + B^{\mu\nu}) = B^{\mu\nu}.$$

These obviously all hold for covariant tensors too. Let’s introduce some notation that will be pretty useful.

Suppose we have some tensor, defined as

$$T_{\mu\nu} \equiv \tfrac{1}{2}(B_{\mu\nu} - B_{\nu\mu}),$$

then, we write

$$B_{[\mu\nu]} \equiv \tfrac{1}{2}(B_{\mu\nu} - B_{\nu\mu}).$$

That is, we could say that $T_{\mu\nu}$ is formed by the anti-symmetric interchange of indices on $B_{\mu\nu}$. We use the “square brackets” to denote the anti-symmetric interchange. Similarly, suppose we have

$$C_{\mu\nu} \equiv \tfrac{1}{2}(A_{\mu\nu} + A_{\nu\mu}),$$

then, we define

$$A_{(\mu\nu)} \equiv \tfrac{1}{2}(A_{\mu\nu} + A_{\nu\mu}).$$

Thus, we say that $C_{\mu\nu}$ is formed by the symmetric interchange of indices. We use “round brackets” to denote the symmetric interchange.

Suppose we have some tensor, $Y_{\mu\nu}$. Then, we can write it as the sum of an anti-symmetric part and a symmetric part. That is,

$$Y_{\mu\nu} = Y_{[\mu\nu]} + Y_{(\mu\nu)} = \tfrac{1}{2}(Y_{\mu\nu} - Y_{\nu\mu}) + \tfrac{1}{2}(Y_{\mu\nu} + Y_{\nu\mu}).$$

This is in fact pretty obvious. If the tensor is symmetric, then $Y_{[\mu\nu]} = 0$. And, if the tensor is anti-symmetric, then $Y_{(\mu\nu)} = 0$.
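The split of a two-index object into symmetric and anti-symmetric parts is easy to carry out on a matrix of components. A minimal sketch, with illustrative function names and an arbitrary $2\times 2$ example:

```python
def symmetric_part(y):
    """Y_(mu nu) = (Y_{mu nu} + Y_{nu mu}) / 2."""
    n = len(y)
    return [[0.5 * (y[i][j] + y[j][i]) for j in range(n)] for i in range(n)]

def antisymmetric_part(y):
    """Y_[mu nu] = (Y_{mu nu} - Y_{nu mu}) / 2."""
    n = len(y)
    return [[0.5 * (y[i][j] - y[j][i]) for j in range(n)] for i in range(n)]

y = [[1.0, 2.0],
     [4.0, 3.0]]   # arbitrary illustrative components

sym = symmetric_part(y)
anti = antisymmetric_part(y)
total = [[sym[i][j] + anti[i][j] for j in range(2)] for i in range(2)]

print(sym)    # symmetric under exchange of the two indices
print(anti)   # changes sign under exchange; zero on the diagonal
print(total)  # reproduces the original components
```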

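The bracket notation on lower indices can be used in another way. Recall the electromagnetic field tensor,

$$F_{\mu\nu} \equiv \partial_\mu A_\nu - \partial_\nu A_\mu,$$

then, we can write this as

$$F_{\mu\nu} = 2\partial_{[\mu}A_{\nu]}.$$

Also recall that two of Maxwell’s equations may be recovered from

$$\partial_\mu F_{\alpha\beta} + \partial_\alpha F_{\beta\mu} + \partial_\beta F_{\mu\alpha} = 0,$$

well, we can denote this (notice that this is a cyclic interchange of indices) as

$$\partial_{[\mu}F_{\alpha\beta]} = 0.$$

In this final example, we were a little sloppy. There is in fact a numerical factor associated with this: because $F_{\alpha\beta}$ is anti-symmetric, the fully anti-symmetrised expression $\partial_{[\mu}F_{\alpha\beta]}$ is one third of the cyclic sum. The factor is irrelevant when the expression is set to zero; however, one should be aware that there is a factor there.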

3 Tensor Calculus

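Here we shall lay some formal groundwork for dealing with objects in curved spacetime. We start by looking at differentiation, going on to geodesics.

3.1 Covariant Differentiation

Let us just state some notation. We have

$$\partial_\mu \equiv \frac{\partial}{\partial x^\mu}, \qquad \partial^\mu \equiv \frac{\partial}{\partial x_\mu}.$$

Now, let us look at the coordinate transformation $x^\mu \mapsto x'^\mu(x^\nu)$. Then, we have that

$$dx'^\mu = J^\mu{}_\nu dx^\nu, \qquad J^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\nu} = \partial_\nu x'^\mu.$$

Now, let us consider differentiation of a scalar under a coordinate transformation (noting that scalars do not transform under a coordinate transformation); thus

$$\begin{aligned}
\partial_\mu\phi \longmapsto \partial'_\mu\phi &= \frac{\partial}{\partial x'^\mu}\phi \\
&= \frac{\partial x^\nu}{\partial x'^\mu}\partial_\nu\phi \\
&= (J^{-1})^\nu{}_\mu\partial_\nu\phi.
\end{aligned}$$

Therefore, we see that the derivative of a scalar $\partial_\mu\phi$ transforms as a covariant vector

$$\partial'_\mu\phi = (J^{-1})^\nu{}_\mu\partial_\nu\phi.$$

Now, let us try this with a vector (again, under a coordinate transformation)

$$\partial_\mu A^\nu \longmapsto \partial'_\mu A'^\nu;$$

where we want to derive how the RHS relates back to the LHS. Notice, if $\partial_\mu A^\nu$ is a $\binom{1}{1}$-tensor, then we already know how it should transform. However, let us derive it. So, using the known transformation rules for $A^\nu$ and $\partial_\mu$,

$$\partial'_\mu A'^\nu = (J^{-1})^\alpha{}_\mu\partial_\alpha\left(J^\nu{}_\beta A^\beta\right).$$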

Now, to continue, we must consider the partial derivative above. We must use the product rule on everything to the right of it. That is

$$(J^{-1})^\alpha{}_\mu\partial_\alpha\left(J^\nu{}_\beta A^\beta\right) = (J^{-1})^\alpha{}_\mu\left(\partial_\alpha J^\nu{}_\beta\right)A^\beta + (J^{-1})^\alpha{}_\mu J^\nu{}_\beta\left(\partial_\alpha A^\beta\right).$$

This is not the transformation rule for a $\binom{1}{1}$-tensor, due to the presence of the first term on the RHS. We write the result, swapping the two terms on the RHS, to see this more clearly:

$$\partial'_\mu A'^\nu = (J^{-1})^\alpha{}_\mu J^\nu{}_\beta\left(\partial_\alpha A^\beta\right) + (J^{-1})^\alpha{}_\mu\left(\partial_\alpha J^\nu{}_\beta\right)A^\beta.$$

Therefore, we see that the partial derivative of a vector is not a tensor. The non-tensorial part is the added term on the far right. There is a rather more fundamental reasoning behind why the partial derivative of a vector is not tensorial. Recall that the partial derivative of a vector is defined as

$$\partial_\mu A^\nu(x^\alpha) = \lim_{\delta u\to 0}\frac{A^\nu(x^\alpha + \delta u) - A^\nu(x^\alpha)}{\delta u},$$

where the displacement $\delta u$ is taken along the $x^\mu$ direction.

So, the partial derivative is composed by finding the value of a vector at different points. As we have seen, the sum of two vectors evaluated at different points is not a proper vector (this is due to the Jacobian being evaluated at different positions). Therefore, one should expect the partial derivative of a vector not to be tensorial; which is what we find.

Now, consider the vector

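$$\mathbf{A}(x) = A^\nu(x)\, e_\nu(x) = A'^\nu(x)\, e'_\nu(x),$$

where we use the fact that a vector is the same in all frames. Now consider differentiating $\mathbf{A}$, noting that the components and basis vectors are all functions of the coordinates:

$$\partial_\nu \mathbf{A} = \partial_\nu (A^\mu e_\mu) = (\partial_\nu A^\mu) e_\mu + A^\mu (\partial_\nu e_\mu).$$

Now, to continue, we shall write the final bracketed term as a sum over coefficients,

$$\partial_\nu e_\mu = \Gamma^\rho_{\nu\mu} e_\rho.$$

The logic behind this will become clear. However, one may think of it in a similar way to quantum theory. Given a state, one can write it as a sum over coefficients times the basis. What we are doing here is to say that $\partial_\nu e_\mu$ is a "new object", and write that new object as a sum over the original basis $e_\rho$, with coefficients $\Gamma^\rho_{\nu\mu}$. Notice that this then results in

$$\partial_\nu \mathbf{A} = (\partial_\nu A^\mu) e_\mu + A^\mu \Gamma^\rho_{\nu\mu} e_\rho.$$

In the final term, let us relabel the indices $\rho \to \mu$ and $\mu \to \beta$,

$$A^\mu \Gamma^\rho_{\nu\mu} e_\rho \to A^\beta \Gamma^\mu_{\nu\beta} e_\mu.$$

This therefore results in

$$\partial_\nu \mathbf{A} = (\partial_\nu A^\mu) e_\mu + A^\beta \Gamma^\mu_{\nu\beta} e_\mu,$$

which we factorise (and move the position of the final $A^\beta$) to

$$\partial_\nu \mathbf{A} = e_\mu \left( \partial_\nu A^\mu + \Gamma^\mu_{\nu\beta} A^\beta \right).$$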


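Furthermore, we define the bracketed quantity as

$$\nabla_\nu A^\mu \equiv \partial_\nu A^\mu + \Gamma^\mu_{\nu\beta} A^\beta. \tag{3.1}$$

This defines the covariant derivative of a contravariant vector. We can use this rule for the covariant derivative of a contravariant vector to derive the rule for a covariant vector.

The covariant derivative of a contravariant vector is

$$\nabla_\alpha A^\mu = \partial_\alpha A^\mu + \Gamma^\mu_{\alpha\lambda} A^\lambda.$$

A contravariant vector is constructed from the covector via

$$A^\nu = g^{\nu\mu} A_\mu.$$

So,

$$\begin{aligned}
\nabla_\alpha A^\mu &= \nabla_\alpha (g^{\mu\nu} A_\nu) \\
&= g^{\mu\nu} \nabla_\alpha A_\nu + A_\nu \nabla_\alpha g^{\mu\nu} \\
&= \partial_\alpha A^\mu + \Gamma^\mu_{\alpha\lambda} A^\lambda \\
&= \partial_\alpha (g^{\mu\nu} A_\nu) + \Gamma^\mu_{\alpha\lambda} (g^{\lambda\beta} A_\beta) \\
&= A_\nu \partial_\alpha g^{\mu\nu} + g^{\mu\nu} \partial_\alpha A_\nu + \Gamma^\mu_{\alpha\lambda} g^{\lambda\beta} A_\beta.
\end{aligned}$$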

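If we equate the second and last lines,

$$g^{\mu\nu} \nabla_\alpha A_\nu + A_\nu \nabla_\alpha g^{\mu\nu} = A_\nu \partial_\alpha g^{\mu\nu} + g^{\mu\nu} \partial_\alpha A_\nu + \Gamma^\mu_{\alpha\lambda} g^{\lambda\beta} A_\beta.$$

Now, the index $\beta$ is a "dummy index", so we can relabel $\beta \to \nu$ in the last term, to give

$$g^{\mu\nu} \nabla_\alpha A_\nu + A_\nu \nabla_\alpha g^{\mu\nu} = A_\nu \partial_\alpha g^{\mu\nu} + g^{\mu\nu} \partial_\alpha A_\nu + \Gamma^\mu_{\alpha\lambda} g^{\lambda\nu} A_\nu;$$

collecting terms,

$$g^{\mu\nu} \nabla_\alpha A_\nu = \left( \partial_\alpha g^{\mu\nu} + \Gamma^\mu_{\alpha\lambda} g^{\lambda\nu} - \nabla_\alpha g^{\mu\nu} \right) A_\nu + g^{\mu\nu} \partial_\alpha A_\nu.$$

We then expand out the covariant derivative of the metric (the third term in the bracket), to give

$$g^{\mu\nu} \nabla_\alpha A_\nu = \left( \partial_\alpha g^{\mu\nu} + \Gamma^\mu_{\alpha\lambda} g^{\lambda\nu} - \partial_\alpha g^{\mu\nu} - \Gamma^\mu_{\alpha\lambda} g^{\lambda\nu} - \Gamma^\nu_{\alpha\lambda} g^{\mu\lambda} \right) A_\nu + g^{\mu\nu} \partial_\alpha A_\nu.$$

Now, the first and third terms cancel each other out, as do the second and fourth, leaving

$$g^{\mu\nu} \nabla_\alpha A_\nu = g^{\mu\nu} \partial_\alpha A_\nu - \Gamma^\nu_{\alpha\lambda} g^{\mu\lambda} A_\nu.$$

If we multiply through by $g_{\pi\mu}$, then we can use

$$g_{\pi\mu} g^{\mu\lambda} = \delta^\lambda_\pi, \qquad g_{\pi\mu} g^{\mu\nu} = \delta^\nu_\pi.$$

Hence, this gives

$$\delta^\nu_\pi \nabla_\alpha A_\nu = \delta^\nu_\pi \partial_\alpha A_\nu - \delta^\lambda_\pi \Gamma^\nu_{\alpha\lambda} A_\nu,$$

which is

$$\nabla_\alpha A_\pi = \partial_\alpha A_\pi - \Gamma^\nu_{\alpha\pi} A_\nu.$$

Putting this into more "standard indices", we have our desired result. Hence, the covariant derivative of a covariant vector is

$$\nabla_\nu A_\mu \equiv \partial_\nu A_\mu - \Gamma^\beta_{\nu\mu} A_\beta. \tag{3.2}$$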

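Now, remember that a scalar is invariant, and that the derivative of a scalar is a tensor; so we should have that $\nabla_\mu (A^\nu A_\nu) = \partial_\mu (A^\nu A_\nu)$. This can be checked. So,

$$\begin{aligned}
\nabla_\mu (A^\nu A_\nu) &= (\nabla_\mu A^\nu) A_\nu + A^\nu (\nabla_\mu A_\nu) \\
&= A_\nu \left( \partial_\mu A^\nu + \Gamma^\nu_{\mu\beta} A^\beta \right) + A^\nu \left( \partial_\mu A_\nu - \Gamma^\alpha_{\mu\nu} A_\alpha \right) \\
&= \partial_\mu (A^\nu A_\nu) + A_\nu \Gamma^\nu_{\mu\beta} A^\beta - A^\nu \Gamma^\alpha_{\mu\nu} A_\alpha \\
&= \partial_\mu (A^\nu A_\nu) + A_\nu A^\beta \Gamma^\nu_{\mu\beta} - A^\nu A_\alpha \Gamma^\alpha_{\mu\nu}.
\end{aligned}$$

Now, the last two terms can be shown to cancel, by relabelling indices. Let us manipulate the final term with $\alpha \to \nu$, $\nu \to \beta$:

$$A^\nu A_\alpha \Gamma^\alpha_{\mu\nu} \;\Rightarrow\; A^\beta A_\nu \Gamma^\nu_{\mu\beta},$$

and so, if we put this expression back in, we see that

$$\begin{aligned}
\nabla_\mu (A^\nu A_\nu) &= \partial_\mu (A^\nu A_\nu) + A_\nu A^\beta \Gamma^\nu_{\mu\beta} - A^\beta A_\nu \Gamma^\nu_{\mu\beta} \\
&= \partial_\mu (A^\nu A_\nu).
\end{aligned}$$

Therefore, we see an expected result: the covariant derivative of a scalar is the same as the partial derivative.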

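We call the expansion coefficients $\Gamma^\lambda_{\nu\mu}$ the affine connection.

We are able to find the covariant derivative of tensors of arbitrary rank. A few are given below.

$$\begin{aligned}
\nabla_\alpha A^{\mu\nu} &= \partial_\alpha A^{\mu\nu} + \Gamma^\mu_{\alpha\lambda} A^{\lambda\nu} + \Gamma^\nu_{\alpha\lambda} A^{\mu\lambda}, \\
\nabla_\alpha A_{\mu\nu} &= \partial_\alpha A_{\mu\nu} - \Gamma^\lambda_{\alpha\mu} A_{\lambda\nu} - \Gamma^\lambda_{\alpha\nu} A_{\mu\lambda}, \\
\nabla_\alpha A^\mu{}_\nu &= \partial_\alpha A^\mu{}_\nu + \Gamma^\mu_{\alpha\lambda} A^\lambda{}_\nu - \Gamma^\lambda_{\alpha\nu} A^\mu{}_\lambda, \\
\nabla_\alpha A^{\mu\nu\sigma} &= \partial_\alpha A^{\mu\nu\sigma} + \Gamma^\mu_{\alpha\lambda} A^{\lambda\nu\sigma} + \Gamma^\nu_{\alpha\lambda} A^{\mu\lambda\sigma} + \Gamma^\sigma_{\alpha\lambda} A^{\mu\nu\lambda}.
\end{aligned}$$

Basically, for each contravariant index there should be a positive connection term, and for each covariant index a negative term.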


3.1.1 Parallel Transport

The main idea in parallel transport is this:

Consider moving a vector from one place to another. In general, the vector changes direction as it is moved, so there is a change in the vector associated with the move. We can find this difference in the vector between a point $x$ and a nearby point $x' = x + \delta x$,

$$DA^\mu = A^\mu(x') - A^\mu(x).$$

Considering how the basis changes as well, we end up with

$$DA^\mu = \delta x^\nu \left( \partial_\nu A^\mu + \Gamma^\mu_{\nu\lambda} A^\lambda \right).$$

The bracketed quantity is just the covariant derivative. Thus,

$$DA^\mu = \delta x^\nu \nabla_\nu A^\mu.$$

Now, the point is that this gives another insight as to what the covariant derivative is. When moving a vector around a manifold, one must consider how the basis vectors change from point to point, as well as the components. This information is within the affine connection.

For an example as to what parallel transport is, consider a circle in the plane. Consider that there is an arrow living on the circle, pointing in a given direction (say parallel to the y-axis). Then, consider moving the arrow around the circle. The arrow undergoes parallel transport if it always points in the same direction, no matter what its position on the circle. Now, consider that the entire space is the circle-line. That is, we have a 1D manifold. For a vector living on the manifold, parallel transport means moving on tangents to the circle.
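The first half of this example (the arrow carried around a circle in the flat plane) can be checked numerically. The following is a minimal sketch, not part of the lectures: it assumes the plane-polar connection components $\Gamma^r_{\phi\phi} = -r$ and $\Gamma^\phi_{r\phi} = \Gamma^\phi_{\phi r} = 1/r$ quoted later in these notes, picks an illustrative radius `R`, and integrates $dA^\mu/d\phi + \Gamma^\mu_{\phi\lambda} A^\lambda = 0$ with a hand-written fourth-order Runge-Kutta step. All function names are hypothetical.

```python
import math

R = 2.0  # radius of the circle (illustrative value, an assumption)


def transport_rhs(components):
    """d/dphi of (A^r, A^phi) for parallel transport along the circle r = R.

    Uses dA^mu/dphi + Gamma^mu_{phi lambda} A^lambda = 0 with the plane-polar
    values Gamma^r_{phi phi} = -r and Gamma^phi_{phi r} = 1/r.
    """
    a_r, a_phi = components
    return (R * a_phi, -a_r / R)


def rk4_step(f, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(y)."""
    k1 = f(y)
    k2 = f(tuple(v + 0.5 * h * k for v, k in zip(y, k1)))
    k3 = f(tuple(v + 0.5 * h * k for v, k in zip(y, k2)))
    k4 = f(tuple(v + h * k for v, k in zip(y, k3)))
    return tuple(v + h * (a + 2 * b + 2 * c + d) / 6.0
                 for v, a, b, c, d in zip(y, k1, k2, k3, k4))


def transport(components, phi_end, steps=1000):
    """Transport the polar components from phi = 0 to phi = phi_end."""
    h = phi_end / steps
    for _ in range(steps):
        components = rk4_step(transport_rhs, components, h)
    return components


def to_cartesian(components, phi):
    """Convert polar components (A^r, A^phi) at angle phi to (A^x, A^y)."""
    a_r, a_phi = components
    return (a_r * math.cos(phi) - R * a_phi * math.sin(phi),
            a_r * math.sin(phi) + R * a_phi * math.cos(phi))


start = (0.0, 1.0 / R)  # arrow parallel to the y-axis at phi = 0
quarter = transport(start, math.pi / 2)
arrow_xy = to_cartesian(quarter, math.pi / 2)
print(quarter, arrow_xy)
```

The polar components change along the way (the arrow starts purely along $\phi$ and ends purely along $r$), yet the Cartesian components stay close to $(0, 1)$: the arrow keeps pointing along the y-axis, and the whole change in the components is absorbed by the connection terms.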

3.1.2 Absolute Derivative

We define the absolute derivative as

$$\frac{DA^\mu}{Du} = \frac{dx^\nu}{du} \nabla_\nu A^\mu,$$

where we have considered a curve, parameterised so that

$$A^\mu = A^\mu(x^\nu(u)).$$

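3.1.3 Transformation of $\Gamma^\lambda_{\nu\mu}$

Let us consider the transformation property of the affine connection, $\Gamma^\lambda_{\nu\mu}$. Let us start with our previous definition, but in the primed frame (we will then transform to the unprimed),

$$\Gamma'^\rho_{\mu\nu}\, e'_\rho = \partial'_\mu e'_\nu.$$

Then, we know how to transform the RHS,

$$\partial'_\mu e'_\nu = (J^{-1})^\alpha{}_\mu\, \partial_\alpha \left[ (J^{-1})^\beta{}_\nu\, e_\beta \right];$$

we then use the product rule on the RHS,

$$(J^{-1})^\alpha{}_\mu\, \partial_\alpha \left[ (J^{-1})^\beta{}_\nu\, e_\beta \right] = (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu\, \partial_\alpha e_\beta + (J^{-1})^\alpha{}_\mu\, e_\beta\, \partial_\alpha (J^{-1})^\beta{}_\nu.$$

Now, we also know that $\partial_\alpha e_\beta = \Gamma^\delta_{\alpha\beta} e_\delta$, so that

$$(J^{-1})^\alpha{}_\mu\, \partial_\alpha \left[ (J^{-1})^\beta{}_\nu\, e_\beta \right] = (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu\, \Gamma^\delta_{\alpha\beta}\, e_\delta + (J^{-1})^\alpha{}_\mu\, e_\beta\, \partial_\alpha (J^{-1})^\beta{}_\nu,$$

remembering that the LHS is of course just

$$\Gamma'^\rho_{\mu\nu}\, e'_\rho = (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu\, \Gamma^\delta_{\alpha\beta}\, e_\delta + (J^{-1})^\alpha{}_\mu\, e_\beta\, \partial_\alpha (J^{-1})^\beta{}_\nu.$$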

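If we then transform the basis vector on the LHS, we have

$$\Gamma'^\rho_{\mu\nu} (J^{-1})^\lambda{}_\rho\, e_\lambda = (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu\, \Gamma^\delta_{\alpha\beta}\, e_\delta + (J^{-1})^\alpha{}_\mu\, e_\beta\, \partial_\alpha (J^{-1})^\beta{}_\nu.$$

On the RHS, let us change the indices on the basis vectors, so that they are the same as those on the left. That is, $\delta \to \lambda$ in the first term and $\beta \to \lambda$ in the second;

$$\Gamma'^\rho_{\mu\nu} (J^{-1})^\lambda{}_\rho\, e_\lambda = (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu\, \Gamma^\lambda_{\alpha\beta}\, e_\lambda + (J^{-1})^\alpha{}_\mu\, e_\lambda\, \partial_\alpha (J^{-1})^\lambda{}_\nu,$$

which allows us to then cancel off the basis vectors,

$$\Gamma'^\rho_{\mu\nu} (J^{-1})^\lambda{}_\rho = (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu\, \Gamma^\lambda_{\alpha\beta} + (J^{-1})^\alpha{}_\mu\, \partial_\alpha (J^{-1})^\lambda{}_\nu.$$

If we then multiply this through by something which will kill off the inverse Jacobian on the LHS, we will have our result. Notice that $J^\pi{}_\lambda$ will do this. So,

$$\begin{aligned}
\Gamma'^\rho_{\mu\nu}\, J^\pi{}_\lambda (J^{-1})^\lambda{}_\rho &= J^\pi{}_\lambda (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu\, \Gamma^\lambda_{\alpha\beta} + J^\pi{}_\lambda (J^{-1})^\alpha{}_\mu\, \partial_\alpha (J^{-1})^\lambda{}_\nu \\
\Rightarrow\quad \Gamma'^\rho_{\mu\nu}\, \delta^\pi_\rho &= J^\pi{}_\lambda (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu\, \Gamma^\lambda_{\alpha\beta} + J^\pi{}_\lambda (J^{-1})^\alpha{}_\mu\, \partial_\alpha (J^{-1})^\lambda{}_\nu \\
\Rightarrow\quad \Gamma'^\pi_{\mu\nu} &= J^\pi{}_\lambda (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu\, \Gamma^\lambda_{\alpha\beta} + J^\pi{}_\lambda (J^{-1})^\alpha{}_\mu\, \partial_\alpha (J^{-1})^\lambda{}_\nu.
\end{aligned}$$

We therefore have our result: the transformation of the affine connection is

$$\Gamma'^\pi_{\mu\nu} = J^\pi{}_\lambda (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu\, \Gamma^\lambda_{\alpha\beta} + J^\pi{}_\lambda (J^{-1})^\alpha{}_\mu\, \partial_\alpha (J^{-1})^\lambda{}_\nu. \tag{3.3}$$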

Now, although not a notation we have been using much, we can represent the Jacobians by differentials,

$$J^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\nu}, \qquad (J^{-1})^\mu{}_\nu = \frac{\partial x^\mu}{\partial x'^\nu};$$

and, using this notation, the transformation of the affine connection looks like

$$\Gamma'^\pi_{\mu\nu} = \frac{\partial x'^\pi}{\partial x^\lambda} \frac{\partial x^\alpha}{\partial x'^\mu} \frac{\partial x^\beta}{\partial x'^\nu}\, \Gamma^\lambda_{\alpha\beta} + \frac{\partial x'^\pi}{\partial x^\lambda} \frac{\partial x^\alpha}{\partial x'^\mu} \frac{\partial}{\partial x^\alpha} \frac{\partial x^\lambda}{\partial x'^\nu}.$$

We can see that this immediately shows that the affine connection is not a tensor (due to the existence of the second term on the RHS). Now, if the affine connection were a tensor, then, if one were to find a coordinate system in which all the components were zero, they would have to be zero in all coordinate systems (this is a general property of tensors). That the affine connection is not a $(^1_2)$-tensor means that even if the connection has zero components in one frame, there exist frames in which the components are non-zero. In fact, one can show that there exists a frame in which the components are zero, at a point. We shall now show that.
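As a concrete check of the inhomogeneous term, the following worked example (a sketch added for illustration; the choice of Cartesian and plane-polar coordinates is an assumption, not part of the lectures at this point) transforms the vanishing Cartesian connection to plane polars. Since $(J^{-1})^\alpha{}_\mu \partial_\alpha = \partial/\partial x'^\mu$ and $\Gamma^\lambda_{\alpha\beta} = 0$ in Cartesian coordinates, only the second term of the transformation rule survives:

```latex
% Unprimed frame: Cartesian x^\lambda = (x, y), with \Gamma^\lambda_{\alpha\beta} = 0.
% Primed frame: plane polars x'^\mu = (r, \phi), with x = r\cos\phi, y = r\sin\phi.
\begin{align*}
\Gamma'^{\pi}_{\mu\nu}
  &= \frac{\partial x'^{\pi}}{\partial x^{\lambda}}
     \frac{\partial^{2} x^{\lambda}}{\partial x'^{\mu}\,\partial x'^{\nu}}, \\
\Gamma'^{r}_{\phi\phi}
  &= \frac{\partial r}{\partial x}\frac{\partial^{2} x}{\partial\phi^{2}}
   + \frac{\partial r}{\partial y}\frac{\partial^{2} y}{\partial\phi^{2}}
   = \cos\phi\,(-r\cos\phi) + \sin\phi\,(-r\sin\phi) = -r, \\
\Gamma'^{\phi}_{r\phi}
  &= \frac{\partial\phi}{\partial x}\frac{\partial^{2} x}{\partial r\,\partial\phi}
   + \frac{\partial\phi}{\partial y}\frac{\partial^{2} y}{\partial r\,\partial\phi}
   = \left(-\frac{\sin\phi}{r}\right)(-\sin\phi) + \frac{\cos\phi}{r}\cos\phi
   = \frac{1}{r}.
\end{align*}
```

The connection vanishes in one frame and not in the other, which could not happen for a tensor. These values agree with the plane-polar components found by other means in the examples of Section 3.5.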

3.1.4 Locally Inertial Frames

This will all seem a little pointless, until we reach the very end of our discussion.

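Let us make the following coordinate transformation,

$$x'^\mu = \tilde{x}^\mu + \tfrac{1}{2} \Gamma^\mu_{\alpha\beta}\, \tilde{x}^\alpha \tilde{x}^\beta, \qquad \tilde{x}^\mu \equiv x^\mu - x^\mu_*,$$

where $x^\mu_*$ is a single point, and the tilde marks coordinates measured from that point. Now, under this transformation, we can write down the Jacobian

$$\begin{aligned}
J^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\nu} &= \frac{\partial}{\partial x^\nu} \left( \tilde{x}^\mu + \tfrac{1}{2} \Gamma^\mu_{\alpha\beta}\, \tilde{x}^\alpha \tilde{x}^\beta \right) \\
&= \delta^\mu_\nu + \tfrac{1}{2} \tilde{x}^\alpha \tilde{x}^\beta\, \partial_\nu \Gamma^\mu_{\alpha\beta} + \tfrac{1}{2} \Gamma^\mu_{\alpha\beta} \left( \delta^\alpha_\nu \tilde{x}^\beta + \delta^\beta_\nu \tilde{x}^\alpha \right) \\
&= \delta^\mu_\nu + \tfrac{1}{2} \tilde{x}^\alpha \tilde{x}^\beta\, \partial_\nu \Gamma^\mu_{\alpha\beta} + \Gamma^\mu_{\nu\beta}\, \tilde{x}^\beta,
\end{aligned}$$

where the last step uses the symmetry of the connection in its lower indices. Thus, the Jacobian is

$$J^\mu{}_\nu = \delta^\mu_\nu + \tfrac{1}{2} \tilde{x}^\alpha \tilde{x}^\beta\, \partial_\nu \Gamma^\mu_{\alpha\beta} + \Gamma^\mu_{\nu\beta}\, \tilde{x}^\beta. \tag{3.4}$$

Notice that this can be written

$$J^\mu{}_\nu = \delta^\mu_\nu + \mathcal{O}(\tilde{x}^\beta). \tag{3.5}$$

In fact, the inverse Jacobian is also of this form,

$$(J^{-1})^\mu{}_\nu = \delta^\mu_\nu - \mathcal{O}(\tilde{x}^\beta). \tag{3.6}$$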

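Now then, returning to (3.4), we see that we can differentiate it. Writing $\tilde{x}^\mu \equiv x^\mu - x^\mu_*$,

$$\begin{aligned}
\partial_\alpha J^\pi{}_\lambda &= \partial_\alpha \left( \delta^\pi_\lambda + \tfrac{1}{2} \left( \Gamma^\pi_{\lambda\beta}\, \tilde{x}^\beta + \Gamma^\pi_{\nu\lambda}\, \tilde{x}^\nu \right) \right) + \mathcal{O}(\tilde{x}^\beta) \\
&= \tfrac{1}{2} \left( \Gamma^\pi_{\lambda\beta}\, \delta^\beta_\alpha + \Gamma^\pi_{\nu\lambda}\, \delta^\nu_\alpha \right) + \mathcal{O}(\tilde{x}^\beta) \\
&= \tfrac{1}{2} \left( \Gamma^\pi_{\lambda\alpha} + \Gamma^\pi_{\alpha\lambda} \right) + \mathcal{O}(\tilde{x}^\beta) \\
&= \Gamma^\pi_{\lambda\alpha} + \mathcal{O}(\tilde{x}^\beta).
\end{aligned}$$

Thus,

$$\partial_\alpha J^\pi{}_\lambda = \Gamma^\pi_{\lambda\alpha} + \mathcal{O}(\tilde{x}^\beta). \tag{3.7}$$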

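Now, we previously derived the transformation rule of the affine connection,

$$\Gamma'^\pi_{\mu\nu} = J^\pi{}_\lambda (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu\, \Gamma^\lambda_{\alpha\beta} + J^\pi{}_\lambda (J^{-1})^\alpha{}_\mu\, \partial_\alpha (J^{-1})^\lambda{}_\nu.$$

Let us look at the final term,

$$J^\pi{}_\lambda (J^{-1})^\alpha{}_\mu\, \partial_\alpha (J^{-1})^\lambda{}_\nu;$$

we see that we can write it as

$$-(J^{-1})^\lambda{}_\nu (J^{-1})^\alpha{}_\mu\, \partial_\alpha J^\pi{}_\lambda.$$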

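To see how we can do this, consider that

$$\delta^\alpha_\beta = \frac{\partial x'^\alpha}{\partial x'^\beta} = \frac{\partial x'^\alpha}{\partial x^\pi} \frac{\partial x^\pi}{\partial x'^\beta}.$$

Also, $\partial_\nu \delta^\alpha_\beta = 0$. Then, that means that

$$\begin{aligned}
\partial_\nu \delta^\alpha_\beta &= \frac{\partial}{\partial x^\nu} \left( \frac{\partial x'^\alpha}{\partial x^\pi} \frac{\partial x^\pi}{\partial x'^\beta} \right) \\
&= \frac{\partial x'^\alpha}{\partial x^\pi} \frac{\partial^2 x^\pi}{\partial x^\nu \partial x'^\beta} + \frac{\partial x^\pi}{\partial x'^\beta} \frac{\partial^2 x'^\alpha}{\partial x^\nu \partial x^\pi} \\
&= 0.
\end{aligned}$$

That is,

$$\frac{\partial x'^\alpha}{\partial x^\pi} \frac{\partial^2 x^\pi}{\partial x^\nu \partial x'^\beta} = -\frac{\partial x^\pi}{\partial x'^\beta} \frac{\partial^2 x'^\alpha}{\partial x^\nu \partial x^\pi}.$$

Or, using the Jacobian notation,

$$J^\alpha{}_\pi\, \partial_\nu (J^{-1})^\pi{}_\beta = -(J^{-1})^\pi{}_\beta\, \partial_\nu J^\alpha{}_\pi.$$

Thus, we have shown that we can do the "swap" we did above.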

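Therefore, we write the transformation rule of the connection once again, with this rewriting of the last term,

$$\Gamma'^\pi_{\mu\nu} = J^\pi{}_\lambda (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu\, \Gamma^\lambda_{\alpha\beta} - (J^{-1})^\lambda{}_\nu (J^{-1})^\alpha{}_\mu\, \partial_\alpha J^\pi{}_\lambda.$$

Now, we have all of these expressions. We use (3.5) and (3.6) for the Jacobian and its inverse, and (3.7) for the derivative of the Jacobian;

$$\Gamma'^\pi_{\mu\nu} = \delta^\pi_\lambda \delta^\alpha_\mu \delta^\beta_\nu\, \Gamma^\lambda_{\alpha\beta} - \delta^\lambda_\nu \delta^\alpha_\mu\, \Gamma^\pi_{\alpha\lambda} + \mathcal{O}(\tilde{x}^\rho).$$

Using the Kronecker deltas results in

$$\Gamma'^\pi_{\mu\nu} = \Gamma^\pi_{\mu\nu} - \Gamma^\pi_{\mu\nu} + \mathcal{O}(\tilde{x}^\rho) = \mathcal{O}(\tilde{x}^\rho).$$

Therefore, the components of the transformed connection are all

$$\Gamma'^\pi_{\mu\nu} = \mathcal{O}(\tilde{x}^\rho).$$

Now, if we let $\tilde{x}^\mu \to 0$, which is equivalent (by our definition $\tilde{x}^\mu \equiv x^\mu - x^\mu_*$) to saying $x^\mu = x^\mu_*$, then we see that

$$\Gamma'^\pi_{\mu\nu}(x^\rho = x^\rho_*) = 0.$$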

Therefore, we have a transformation which renders all components of the affine connection zero. That is, we can transform to a frame in which the geometry is Euclidean (flat), at that single point. This is actually an incredibly useful and important result. Notice that if the Christoffel symbols are zero, then the covariant derivative is just the partial derivative. This tends to hugely simplify calculations. In the later discussions on curvature, we shall see that by transforming to a locally inertial frame, where the connection components are zero, we can compute things a lot more easily. And, as the things we are transforming are tensors, the results hold in any frame.

Some of the literature calls such a set of coordinates geodesic coordinates.

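Alternative Derivation. Here we shall present a rather more mathematically rigorous derivation of the existence of geodesic coordinates.

Let $x^\mu = a^\mu$ be the coordinates of a point $A$ in the frame $\Sigma$. Let us transform to a new frame, via the transformation

$$x^\mu = a^\mu + x'^\mu + \tfrac{1}{2} a^\mu{}_{\nu\lambda}\, x'^\nu x'^\lambda,$$

where the coefficient $a^\mu{}_{\nu\lambda}$ is symmetric in its lower indices, and is constant (i.e. we define this as part of the transformation). Thus, at the point $A$, $x'^\mu = 0$. So, let us compute the differentials of the transformation;

$$\begin{aligned}
\frac{\partial x^\mu}{\partial x'^\nu} &= \delta^\mu_\nu + \tfrac{1}{2} a^\mu{}_{\kappa\lambda} \frac{\partial}{\partial x'^\nu} \left( x'^\kappa x'^\lambda \right) \\
&= \delta^\mu_\nu + \tfrac{1}{2} a^\mu{}_{\kappa\lambda} \left( x'^\kappa \delta^\lambda_\nu + x'^\lambda \delta^\kappa_\nu \right) \\
&= \delta^\mu_\nu + a^\mu{}_{\nu\lambda}\, x'^\lambda.
\end{aligned}$$

Hence,

$$\frac{\partial^2 x^\mu}{\partial x'^\nu \partial x'^\lambda} = a^\mu{}_{\nu\lambda}.$$

Hence, at the point $A$ (i.e. where $x'^\mu = 0$), we see that

$$\frac{\partial x^\mu}{\partial x'^\nu} = \delta^\mu_\nu, \qquad \frac{\partial^2 x^\mu}{\partial x'^\nu \partial x'^\lambda} = a^\mu{}_{\nu\lambda}. \tag{3.8}$$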


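Now, we can do a little work to get a relation between the coefficients $a^\mu{}_{\nu\lambda}$ and the metric. The metric transforms via

$$g'_{\mu\nu} = \frac{\partial x^\alpha}{\partial x'^\mu} \frac{\partial x^\beta}{\partial x'^\nu}\, g_{\alpha\beta}.$$

Then, differentiating it,

$$\frac{\partial g'_{\mu\nu}}{\partial x'^\lambda} = \frac{\partial^2 x^\alpha}{\partial x'^\lambda \partial x'^\mu} \frac{\partial x^\beta}{\partial x'^\nu}\, g_{\alpha\beta} + \frac{\partial x^\alpha}{\partial x'^\mu} \frac{\partial^2 x^\beta}{\partial x'^\lambda \partial x'^\nu}\, g_{\alpha\beta} + \frac{\partial x^\alpha}{\partial x'^\mu} \frac{\partial x^\beta}{\partial x'^\nu} \frac{\partial g_{\alpha\beta}}{\partial x'^\lambda}.$$

Now, rewrite the last term using the chain rule;

$$\frac{\partial g_{\alpha\beta}}{\partial x'^\lambda} = \frac{\partial g_{\alpha\beta}}{\partial x^\sigma} \frac{\partial x^\sigma}{\partial x'^\lambda},$$

so that we have

$$\frac{\partial g'_{\mu\nu}}{\partial x'^\lambda} = \frac{\partial^2 x^\alpha}{\partial x'^\lambda \partial x'^\mu} \frac{\partial x^\beta}{\partial x'^\nu}\, g_{\alpha\beta} + \frac{\partial x^\alpha}{\partial x'^\mu} \frac{\partial^2 x^\beta}{\partial x'^\lambda \partial x'^\nu}\, g_{\alpha\beta} + \frac{\partial x^\alpha}{\partial x'^\mu} \frac{\partial x^\beta}{\partial x'^\nu} \frac{\partial g_{\alpha\beta}}{\partial x^\sigma} \frac{\partial x^\sigma}{\partial x'^\lambda}.$$

So, at the point $A$, using (3.8), this reads

$$\begin{aligned}
\frac{\partial g'_{\mu\nu}}{\partial x'^\lambda} &= a^\alpha{}_{\lambda\mu}\, \delta^\beta_\nu\, g_{\alpha\beta} + a^\beta{}_{\lambda\nu}\, \delta^\alpha_\mu\, g_{\alpha\beta} + \delta^\alpha_\mu \delta^\beta_\nu \delta^\sigma_\lambda \frac{\partial g_{\alpha\beta}}{\partial x^\sigma} \\
&= g_{\alpha\nu}\, a^\alpha{}_{\lambda\mu} + g_{\mu\beta}\, a^\beta{}_{\lambda\nu} + \frac{\partial g_{\mu\nu}}{\partial x^\lambda}.
\end{aligned} \tag{3.9}$$

Now let us choose that

$$\frac{\partial g'_{\mu\nu}}{\partial x'^\lambda} = 0, \tag{3.10}$$

which is equivalent to choosing the metric to be locally flat (stationary to first order) at that point $A$. Now, note that

$$g_{\alpha\nu}\, a^\alpha{}_{\lambda\mu} = a_{\nu\lambda\mu}.$$

Then, we see that (3.9) becomes

$$a_{\nu\lambda\mu} + a_{\mu\lambda\nu} + \frac{\partial g_{\mu\nu}}{\partial x^\lambda} = 0,$$

which trivially becomes

$$a_{\nu\lambda\mu} + a_{\mu\lambda\nu} = -\frac{\partial g_{\mu\nu}}{\partial x^\lambda}. \tag{3.11}$$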

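Now, if we permute the indices $\nu \to \mu \to \lambda \to \nu$, this becomes

$$a_{\mu\nu\lambda} + a_{\lambda\nu\mu} = -\frac{\partial g_{\lambda\mu}}{\partial x^\nu}, \tag{3.12}$$

permuting again,

$$a_{\lambda\mu\nu} + a_{\nu\mu\lambda} = -\frac{\partial g_{\nu\lambda}}{\partial x^\mu}. \tag{3.13}$$

Now, if we form (3.12) + (3.13) − (3.11), then we see

$$a_{\mu\nu\lambda} + a_{\lambda\nu\mu} + a_{\lambda\mu\nu} + a_{\nu\mu\lambda} - a_{\nu\lambda\mu} - a_{\mu\lambda\nu} = -\frac{\partial g_{\nu\lambda}}{\partial x^\mu} - \frac{\partial g_{\lambda\mu}}{\partial x^\nu} + \frac{\partial g_{\mu\nu}}{\partial x^\lambda}.$$

Now, as $a_{\mu\nu\lambda} = a_{\mu\lambda\nu}$, we see that the fourth and fifth terms cancel, as do the first and sixth, leaving

$$2 a_{\lambda\nu\mu} = -\left( \frac{\partial g_{\nu\lambda}}{\partial x^\mu} + \frac{\partial g_{\lambda\mu}}{\partial x^\nu} - \frac{\partial g_{\mu\nu}}{\partial x^\lambda} \right),$$

that is,

$$a_{\lambda\nu\mu} = [\lambda\nu, \mu],$$

where

$$[\lambda\nu, \mu] \equiv -\frac{1}{2} \left( \frac{\partial g_{\nu\lambda}}{\partial x^\mu} + \frac{\partial g_{\lambda\mu}}{\partial x^\nu} - \frac{\partial g_{\mu\nu}}{\partial x^\lambda} \right).$$

We call $[\lambda\nu, \mu]$ a Christoffel symbol of the first kind (the overall minus sign and the index placement here are the convention of these notes; many texts define the first-kind symbol without the minus sign). Now, by (3.10) we see that

$$a'_{\lambda\nu\mu} = 0$$

at the point $A$.

Therefore, we have derived that under the coordinate transformation $x^\mu = a^\mu + x'^\mu + \tfrac{1}{2} a^\mu{}_{\nu\lambda}\, x'^\nu x'^\lambda$, the Christoffel symbols are zero at the point $x^\mu = a^\mu$; such coordinates are geodesic coordinates. In the derivation, we assumed that:

$a^\mu{}_{\nu\lambda}$ is constant and symmetric in its lower indices,

$\partial g'_{\mu\nu} / \partial x'^\lambda = 0$, i.e. a constant metric at the point we transform to.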

3.1.5 Torsion

Let us define torsion to be

$$T^\rho{}_{\mu\nu} \equiv \tfrac{1}{2} \left( \Gamma^\rho_{\mu\nu} - \Gamma^\rho_{\nu\mu} \right) = \Gamma^\rho_{[\mu\nu]}. \tag{3.14}$$

We shall work with symmetric affine connections, so that the torsion goes to zero. A torsion free space merely allows us to interchange the lower indices on the connection components at will. This expression for torsion is a tensor; let us prove it.


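So, the transformation of torsion can be written

$$T'^\rho{}_{\mu\nu} = J^\rho{}_\alpha (J^{-1})^\beta{}_\mu (J^{-1})^\gamma{}_\nu\, T^\alpha{}_{\beta\gamma} + \delta T^\rho{}_{\mu\nu},$$

where $\delta T^\rho{}_{\mu\nu}$ is a term that can be easily seen from the transformation rule of the connection (the factor of one-half comes from the definition of the torsion),

$$\delta T^\rho{}_{\mu\nu} = \tfrac{1}{2} \left[ J^\rho{}_\lambda (J^{-1})^\alpha{}_\mu\, \partial_\alpha (J^{-1})^\lambda{}_\nu - J^\rho{}_\lambda (J^{-1})^\alpha{}_\nu\, \partial_\alpha (J^{-1})^\lambda{}_\mu \right].$$

Now, a "trick" that we have used before is to note that

$$\begin{aligned}
\delta^\mu_\nu &= J^\mu{}_\alpha (J^{-1})^\alpha{}_\nu \\
\Rightarrow\quad \partial_\beta \delta^\mu_\nu &= \partial_\beta \left( J^\mu{}_\alpha (J^{-1})^\alpha{}_\nu \right) = J^\mu{}_\alpha\, \partial_\beta (J^{-1})^\alpha{}_\nu + (J^{-1})^\alpha{}_\nu\, \partial_\beta J^\mu{}_\alpha = 0 \\
\Rightarrow\quad J^\mu{}_\alpha\, \partial_\beta (J^{-1})^\alpha{}_\nu &= -(J^{-1})^\alpha{}_\nu\, \partial_\beta J^\mu{}_\alpha.
\end{aligned}$$

We can then use this in the expression for $\delta T^\rho{}_{\mu\nu}$, to see that

$$\begin{aligned}
\delta T^\rho{}_{\mu\nu} &= \tfrac{1}{2} \left[ -(J^{-1})^\alpha{}_\mu (J^{-1})^\lambda{}_\nu\, \partial_\alpha J^\rho{}_\lambda + (J^{-1})^\alpha{}_\nu (J^{-1})^\lambda{}_\mu\, \partial_\alpha J^\rho{}_\lambda \right] \\
&= \tfrac{1}{2}\, \partial_\alpha J^\rho{}_\lambda \left[ (J^{-1})^\alpha{}_\nu (J^{-1})^\lambda{}_\mu - (J^{-1})^\alpha{}_\mu (J^{-1})^\lambda{}_\nu \right].
\end{aligned}$$

Now, let us define

$$A^{\alpha\lambda}{}_{\nu\mu} \equiv (J^{-1})^\alpha{}_\nu (J^{-1})^\lambda{}_\mu - (J^{-1})^\alpha{}_\mu (J^{-1})^\lambda{}_\nu,$$

then we see that

$$A^{\alpha\lambda}{}_{\nu\mu} = -A^{\lambda\alpha}{}_{\nu\mu},$$

i.e. it is anti-symmetric under interchange of $\alpha$ and $\lambda$. Now, notice that

$$\partial_\alpha J^\rho{}_\lambda = \frac{\partial^2 x'^\rho}{\partial x^\alpha \partial x^\lambda} = \frac{\partial^2 x'^\rho}{\partial x^\lambda \partial x^\alpha} = \partial_\lambda J^\rho{}_\alpha.$$

That is, $\partial_\alpha J^\rho{}_\lambda$ is symmetric under interchange of $\alpha$ and $\lambda$. Therefore, as the contraction of something symmetric with something anti-symmetric is zero, we see that

$$\delta T^\rho{}_{\mu\nu} = 0.$$

Hence,

$$T'^\rho{}_{\mu\nu} = J^\rho{}_\alpha (J^{-1})^\beta{}_\mu (J^{-1})^\gamma{}_\nu\, T^\alpha{}_{\beta\gamma},$$

which is the rule of transformation of a $(^1_2)$-tensor.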


3.2 Geodesics

A geodesic is the curve which gives an extremal of motion. That we use the word extremal, rather than minimum (or, indeed, maximum), is very important.

Suppose we are "living in a manifold" (suppose, say, we are confined to the surface of a sphere). Then suppose that we wish to compute the equation of the line (in that manifold) that joins two points, such that the length of the line is an extremum. That is, we can write down many curves joining the two points, but only those which extremise the length are geodesics.

The geodesic will depend upon the geometry of the manifold; the line has its motion confined to the manifold. As we shall see, the metric is used to give the geometrical dependence.

3.2.1 The Affine Geodesic

We call an affine geodesic the curve for which the tangent vector is parallel transported along itself. That is,

$$\frac{DT^\mu}{Du} = \lambda(u)\, T^\mu, \qquad T^\mu \equiv \frac{dx^\mu}{du}.$$

That is, we find a curve along which the tangent vector does not change direction. It may get longer (hence the factor of $\lambda(u)$), but it does not change direction.

We have our definition of the absolute derivative,

$$\frac{DA^\mu}{Du} = \frac{dx^\nu}{du} \nabla_\nu A^\mu = T^\nu \nabla_\nu A^\mu.$$

Therefore, the affine geodesic satisfies

$$T^\nu \nabla_\nu T^\mu = \lambda(u)\, T^\mu,$$

which is, using the definition of the covariant derivative,

$$T^\nu \left( \partial_\nu T^\mu + \Gamma^\mu_{\nu\gamma} T^\gamma \right) = \lambda T^\mu.$$

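Now, along the curve, the chain rule gives

$$T^\nu \partial_\nu = \frac{dx^\nu}{du} \frac{\partial}{\partial x^\nu} = \frac{d}{du},$$

then, we see that the affine geodesic can be written

$$\frac{d}{du} T^\mu + \Gamma^\mu_{\nu\gamma}\, T^\nu T^\gamma = \lambda T^\mu.$$

Therefore, noting that $T^\mu = dx^\mu/du$, we see that

$$\frac{d}{du} \frac{dx^\mu}{du} + T^\nu T^\gamma\, \Gamma^\mu_{\nu\gamma} = \lambda T^\mu,$$

which is of course just

$$\frac{d^2 x^\mu}{du^2} + \Gamma^\mu_{\nu\gamma} \frac{dx^\nu}{du} \frac{dx^\gamma}{du} = \lambda T^\mu. \tag{3.15}$$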

As an example, consider a Cartesian system, whereby the affine connections are all zero. The resultant differential equation (with $\lambda = 0$) has a straight line as the solution. That is, the affine geodesic in Cartesian coordinates is a straight line.

If $\lambda = 0$, then we say that $u$ is an affine parameter, and that the geodesic is affinely parameterised. That is,

$$T^\nu \nabla_\nu T^\mu = 0, \qquad T^\mu \equiv \frac{dx^\mu}{du},$$

along an affinely parameterised geodesic.

3.2.2 The Metric Geodesic

This geodesic is perhaps a little less hand-wavey.

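Consider two points in some space. Consider that they are joined by a line. Then, the metric geodesic is the line which extremises the length of that joining line. So, given a line element

$$ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu,$$

we see that the corresponding action is

$$S = \int ds.$$

Now, considering that the line is parameterised by $u$, the affine parameter, we see that the action is simply

$$S = \int ds = \int \frac{ds}{du}\, du = \int du\, \sqrt{g_{\mu\nu} \frac{dx^\mu}{du} \frac{dx^\nu}{du}}.$$

Then, by the variational principle, the Euler-Lagrange equation

$$\frac{d}{du} \frac{\partial L}{\partial \dot{x}^\mu} - \frac{\partial L}{\partial x^\mu} = 0, \qquad \dot{x}^\mu \equiv \frac{dx^\mu}{du},$$

extremises the action (note, we use the word extremise, rather than maximise or minimise). We must state that $g_{\mu\nu} = g_{\mu\nu}(x^\rho)$ only; that is, the metric depends on the coordinates but not on the velocities $\dot{x}^\rho$.

So, the Lagrangian is

$$L = \left( g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta \right)^{1/2}.$$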


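We are now left to compute the elements of the EL equation. So,

$$\begin{aligned}
\frac{\partial L}{\partial \dot{x}^\mu} &= \frac{1}{2L} \frac{\partial}{\partial \dot{x}^\mu} \left( g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta \right) \\
&= \frac{1}{2L}\, g_{\alpha\beta} \left( \frac{\partial \dot{x}^\alpha}{\partial \dot{x}^\mu}\, \dot{x}^\beta + \dot{x}^\alpha \frac{\partial \dot{x}^\beta}{\partial \dot{x}^\mu} \right) \\
&= \frac{1}{2L}\, g_{\alpha\beta} \left( \delta^\alpha_\mu\, \dot{x}^\beta + \delta^\beta_\mu\, \dot{x}^\alpha \right) \\
&= \frac{1}{2L}\, g_{\alpha\beta} \left( \delta^\alpha_\mu\, \dot{x}^\beta + \delta^\alpha_\mu\, \dot{x}^\beta \right) \\
&= \frac{1}{L}\, g_{\alpha\beta}\, \dot{x}^\beta\, \delta^\alpha_\mu \\
&= \frac{1}{L}\, g_{\mu\beta}\, \dot{x}^\beta,
\end{aligned}$$

where the fourth line relabels $\alpha \leftrightarrow \beta$ in the second term and uses the symmetry of the metric. And also,

$$\frac{\partial L}{\partial x^\mu} = \frac{1}{2L}\, \dot{x}^\alpha \dot{x}^\beta\, \partial_\mu g_{\alpha\beta}.$$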

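Finally,

$$\frac{d}{du} \frac{\partial L}{\partial \dot{x}^\mu} = \frac{d}{du} \left( \frac{1}{L}\, g_{\mu\beta}\, \dot{x}^\beta \right) = -\frac{\dot{L}}{L^2}\, g_{\mu\beta}\, \dot{x}^\beta + \frac{1}{L} \frac{d}{du} \left( g_{\mu\beta}\, \dot{x}^\beta \right);$$

the last expression we evaluate via

$$\begin{aligned}
\frac{d}{du} \left( g_{\mu\beta}\, \dot{x}^\beta \right) &= g_{\mu\beta} \frac{d\dot{x}^\beta}{du} + \dot{x}^\beta \frac{d}{du} g_{\mu\beta} \\
&= g_{\mu\beta}\, \ddot{x}^\beta + \dot{x}^\beta \frac{dx^\gamma}{du} \frac{\partial g_{\mu\beta}}{\partial x^\gamma} \\
&= g_{\mu\beta}\, \ddot{x}^\beta + \dot{x}^\beta \dot{x}^\gamma\, \partial_\gamma g_{\mu\beta}.
\end{aligned}$$

Therefore,

$$\frac{d}{du} \frac{\partial L}{\partial \dot{x}^\mu} = -\frac{\dot{L}}{L^2}\, g_{\mu\beta}\, \dot{x}^\beta + \frac{1}{L} \left( g_{\mu\beta}\, \ddot{x}^\beta + \dot{x}^\beta \dot{x}^\gamma\, \partial_\gamma g_{\mu\beta} \right);$$

and consequently, the EL equation reads

$$-\frac{\dot{L}}{L^2}\, g_{\mu\beta}\, \dot{x}^\beta + \frac{1}{L} \left( g_{\mu\beta}\, \ddot{x}^\beta + \dot{x}^\beta \dot{x}^\gamma\, \partial_\gamma g_{\mu\beta} \right) - \frac{1}{2L}\, \dot{x}^\alpha \dot{x}^\beta\, \partial_\mu g_{\alpha\beta} = 0.$$

Now, the job is to get this into a "nice form", without mention of $L$. We can move the first term over to the RHS, and multiply through by $L$, giving

$$g_{\mu\beta}\, \ddot{x}^\beta + \dot{x}^\beta \dot{x}^\gamma\, \partial_\gamma g_{\mu\beta} - \tfrac{1}{2}\, \dot{x}^\alpha \dot{x}^\beta\, \partial_\mu g_{\alpha\beta} = \frac{\dot{L}}{L}\, g_{\mu\beta}\, \dot{x}^\beta.$$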


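Let us now multiply this by something that will kill off the metric multiplying $\ddot{x}^\beta$ on the LHS. Multiplying by $g^{\rho\mu}$ will work,

$$g^{\rho\mu} g_{\mu\beta}\, \ddot{x}^\beta + g^{\rho\mu} \left( \dot{x}^\beta \dot{x}^\gamma\, \partial_\gamma g_{\mu\beta} - \tfrac{1}{2}\, \dot{x}^\alpha \dot{x}^\beta\, \partial_\mu g_{\alpha\beta} \right) = \frac{\dot{L}}{L}\, g^{\rho\mu} g_{\mu\beta}\, \dot{x}^\beta;$$

noting that $g^{\rho\mu} g_{\mu\beta} = \delta^\rho_\beta$, and using this relation, we see we now have

$$\ddot{x}^\rho + g^{\rho\mu} \left( \dot{x}^\beta \dot{x}^\gamma\, \partial_\gamma g_{\mu\beta} - \tfrac{1}{2}\, \dot{x}^\alpha \dot{x}^\beta\, \partial_\mu g_{\alpha\beta} \right) = \frac{\dot{L}}{L}\, \dot{x}^\rho.$$

To continue, we use the simple result that if $a = b$, then $a = \tfrac{1}{2}(a + b)$. So, we see that

$$\dot{x}^\beta \dot{x}^\gamma\, \partial_\gamma g_{\mu\beta} = \tfrac{1}{2} \left( \dot{x}^\beta \dot{x}^\gamma\, \partial_\gamma g_{\mu\beta} + \dot{x}^\gamma \dot{x}^\beta\, \partial_\beta g_{\mu\gamma} \right);$$

thus, using this, we see that we have the geodesic equation being

$$\ddot{x}^\rho + g^{\rho\mu}\, \tfrac{1}{2} \left( \dot{x}^\beta \dot{x}^\gamma\, \partial_\gamma g_{\mu\beta} + \dot{x}^\gamma \dot{x}^\beta\, \partial_\beta g_{\mu\gamma} - \dot{x}^\alpha \dot{x}^\beta\, \partial_\mu g_{\alpha\beta} \right) = \frac{\dot{L}}{L}\, \dot{x}^\rho.$$

We can pull out common factors of the bracketed term, by relabelling indices $\alpha \to \gamma$, thus

$$\ddot{x}^\rho + g^{\rho\mu}\, \tfrac{1}{2} \left( \partial_\gamma g_{\mu\beta} + \partial_\beta g_{\mu\gamma} - \partial_\mu g_{\gamma\beta} \right) \dot{x}^\beta \dot{x}^\gamma = \frac{\dot{L}}{L}\, \dot{x}^\rho.$$

Now, by way of convenient notation, we define everything multiplying the $\dot{x}^\beta \dot{x}^\gamma$ as

$$\{{}_{\gamma}{}^{\rho}{}_{\beta}\} \equiv g^{\rho\mu}\, \tfrac{1}{2} \left( -\partial_\mu g_{\gamma\beta} + \partial_\gamma g_{\mu\beta} + \partial_\beta g_{\mu\gamma} \right), \tag{3.16}$$

a symbol we call the Christoffel symbol. Thus, the geodesic equation looks like

$$\ddot{x}^\rho + \{{}_{\gamma}{}^{\rho}{}_{\beta}\}\, \dot{x}^\beta \dot{x}^\gamma = \frac{\dot{L}}{L}\, \dot{x}^\rho.$$

Now, if $\dot{L} = 0$, then this reads

$$\ddot{x}^\rho + \{{}_{\alpha}{}^{\rho}{}_{\beta}\}\, \dot{x}^\alpha \dot{x}^\beta = 0, \qquad \frac{d^2 s}{du^2} = 0.$$

Notice that the Christoffel symbol is symmetric in its lower indices,

$$\{{}_{\gamma}{}^{\mu}{}_{\beta}\} = \{{}_{\beta}{}^{\mu}{}_{\gamma}\},$$

which we can see by its definition, noting that the metric is symmetric.

Just to write the result again,

$$\ddot{x}^\rho + \{{}_{\alpha}{}^{\rho}{}_{\beta}\}\, \dot{x}^\alpha \dot{x}^\beta = 0 \tag{3.17}$$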


is the affinely parameterised metric geodesic.

So, to recap these geodesics. The curve which preserves the direction of the tangent vector on that curve is called the affine geodesic. When deriving the geodesic, one uses the $\Gamma^\lambda_{\mu\nu}$ symbol, so we call it the affine connection; or just connection. The second type of geodesic was derived to be the curve which extremises the path length between two points. This was called the metric geodesic. In deriving the metric geodesic, one defines some quantities, the Christoffel symbols.

3.2.3 Relation Between Affine Connection & Christoffel Symbol

We shall start by asserting that for a torsion free connection, $T^\alpha{}_{\mu\nu} = 0$, and for a metric with zero covariant derivative, $\nabla_\alpha g_{\nu\mu} = 0$, the affine connection is the Christoffel symbol. That is,

$$\nabla_\alpha g_{\mu\nu} = 0, \quad T^\mu{}_{\alpha\beta} = 0 \quad\Rightarrow\quad \Gamma^\mu_{\alpha\beta} = \{{}_{\alpha}{}^{\mu}{}_{\beta}\}.$$

The way to use "torsion free" is that the two lower indices on the affine connection can be interchanged. We also use a symmetric metric throughout.

Let us prove it.

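We start by writing the covariant derivative of the metric,

$$\nabla_\alpha g_{\mu\nu} = \partial_\alpha g_{\mu\nu} - \Gamma^\lambda_{\alpha\mu}\, g_{\lambda\nu} - \Gamma^\lambda_{\alpha\nu}\, g_{\mu\lambda}.$$

But, by our definition of the problem, this is zero. So,

$$\partial_\alpha g_{\mu\nu} = \Gamma^\lambda_{\alpha\mu}\, g_{\lambda\nu} + \Gamma^\lambda_{\alpha\nu}\, g_{\mu\lambda}.$$

Let us now cyclically change the indices. First, we shall do $\alpha \to \mu \to \nu \to \alpha$, giving

$$\partial_\mu g_{\nu\alpha} = \Gamma^\lambda_{\mu\nu}\, g_{\lambda\alpha} + \Gamma^\lambda_{\mu\alpha}\, g_{\nu\lambda}.$$

Let us do the interchange again, on this new equation, giving

$$\partial_\nu g_{\alpha\mu} = \Gamma^\lambda_{\nu\alpha}\, g_{\lambda\mu} + \Gamma^\lambda_{\nu\mu}\, g_{\alpha\lambda}.$$

Let us add the first two, and subtract the last equation, giving

$$\partial_\alpha g_{\mu\nu} + \partial_\mu g_{\nu\alpha} - \partial_\nu g_{\alpha\mu} = \Gamma^\lambda_{\alpha\mu}\, g_{\lambda\nu} + \Gamma^\lambda_{\alpha\nu}\, g_{\mu\lambda} + \Gamma^\lambda_{\mu\nu}\, g_{\lambda\alpha} + \Gamma^\lambda_{\mu\alpha}\, g_{\nu\lambda} - \Gamma^\lambda_{\nu\alpha}\, g_{\lambda\mu} - \Gamma^\lambda_{\nu\mu}\, g_{\alpha\lambda}.$$

Now, we notice that by our torsion free assertion, we can cancel off some of the terms on the RHS. These are the second with the fifth, and the third with the sixth. This leaves

$$\partial_\alpha g_{\mu\nu} + \partial_\mu g_{\nu\alpha} - \partial_\nu g_{\alpha\mu} = \Gamma^\lambda_{\alpha\mu}\, g_{\lambda\nu} + \Gamma^\lambda_{\mu\alpha}\, g_{\nu\lambda},$$

from which we further use the torsion free assertion, to see that the two terms on the RHS are identical, leaving

$$\partial_\alpha g_{\mu\nu} + \partial_\mu g_{\nu\alpha} - \partial_\nu g_{\alpha\mu} = 2 \Gamma^\lambda_{\alpha\mu}\, g_{\lambda\nu}.$$

Rearranging this trivially results in

$$\Gamma^\lambda_{\alpha\mu}\, g_{\lambda\nu} = \tfrac{1}{2} \left( \partial_\alpha g_{\mu\nu} + \partial_\mu g_{\nu\alpha} - \partial_\nu g_{\alpha\mu} \right).$$

Now, let us multiply the whole thing by $g^{\rho\nu}$,

$$\begin{aligned}
\Gamma^\lambda_{\alpha\mu}\, g^{\rho\nu} g_{\lambda\nu} &= \tfrac{1}{2}\, g^{\rho\nu} \left( \partial_\alpha g_{\mu\nu} + \partial_\mu g_{\nu\alpha} - \partial_\nu g_{\alpha\mu} \right) \\
\Rightarrow\quad \Gamma^\lambda_{\alpha\mu}\, \delta^\rho_\lambda &= \tfrac{1}{2}\, g^{\rho\nu} \left( \partial_\alpha g_{\mu\nu} + \partial_\mu g_{\nu\alpha} - \partial_\nu g_{\alpha\mu} \right),
\end{aligned}$$

which is just

$$\Gamma^\rho_{\alpha\mu} = \tfrac{1}{2}\, g^{\rho\nu} \left( \partial_\alpha g_{\mu\nu} + \partial_\mu g_{\nu\alpha} - \partial_\nu g_{\alpha\mu} \right).$$

Now, if we switch over the $\mu$ index to $\beta$ (just relabelling),

$$\Gamma^\rho_{\alpha\beta} = \tfrac{1}{2}\, g^{\rho\nu} \left( \partial_\alpha g_{\beta\nu} + \partial_\beta g_{\nu\alpha} - \partial_\nu g_{\alpha\beta} \right).$$

Upon comparison of this with (3.16), we find that they are equal. Therefore,

$$\Gamma^\rho_{\alpha\beta} = \{{}_{\alpha}{}^{\rho}{}_{\beta}\}; \qquad \nabla_\alpha g_{\mu\nu} = 0, \quad T^\mu{}_{\alpha\beta} = 0.$$

It is very important to note that this only holds for a torsion free connection, with the metric having zero covariant derivative. Under these conditions, the affine connection is the Christoffel symbol.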

3.3 Isometries & Killing’s Equation

Consider the coordinate transformation

$$g_{\mu\nu}(x) \longmapsto g_{\mu\nu}(x'),$$

so that the metric in the new frame has the same functional dependence as in the old frame. That is, the new metric depends on $x'$ in the same way as the old metric depended on $x$. Then, we have

$$ds^2(x) = ds^2(x'),$$

and that

$$g'_{\mu\nu}(x') = g_{\mu\nu}(x'). \tag{3.18}$$

Therefore, by the transformation rule of the metric,

$$g_{\mu\nu}(x) = \frac{\partial x'^\alpha}{\partial x^\mu} \frac{\partial x'^\beta}{\partial x^\nu}\, g'_{\alpha\beta}(x'),$$


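and using (3.18) on the RHS, we see that

$$g_{\mu\nu}(x) = \frac{\partial x'^\alpha}{\partial x^\mu} \frac{\partial x'^\beta}{\partial x^\nu}\, g_{\alpha\beta}(x'). \tag{3.19}$$

So, a coordinate transformation leaving the metric in the same form (form invariant) is called an isometry.

The coordinate transformation we considered was $x^\mu \mapsto x'^\mu$. Let us consider a special case of this; namely

$$x^\mu \longmapsto x'^\mu = x^\mu + \varepsilon \xi^\mu,$$

where $\varepsilon$ is small, and $\xi^\mu$ a vector field. Now, the Jacobian is

$$J^\mu{}_\nu = \partial_\nu x'^\mu = \partial_\nu \left( x^\mu + \varepsilon \xi^\mu \right),$$

which is clearly

$$J^\mu{}_\nu = \delta^\mu_\nu + \varepsilon\, \partial_\nu \xi^\mu.$$

Now, by a Taylor expansion, we see that

$$g_{\alpha\beta}(x') = g_{\alpha\beta}(x^\mu + \varepsilon \xi^\mu) = g_{\alpha\beta}(x^\mu) + \varepsilon \xi^\mu\, \partial_\mu g_{\alpha\beta}(x^\mu) + \mathcal{O}(\varepsilon^2).$$

So, we now have enough terms to be able to put them all into the metric isometry transformation equation (3.19). Thus,

$$\begin{aligned}
g_{\mu\nu}(x) &= J^\alpha{}_\mu J^\beta{}_\nu\, g_{\alpha\beta}(x') \\
&= \left( \delta^\alpha_\mu + \varepsilon\, \partial_\mu \xi^\alpha \right) \left( \delta^\beta_\nu + \varepsilon\, \partial_\nu \xi^\beta \right) \left( g_{\alpha\beta}(x) + \varepsilon \xi^\rho\, \partial_\rho g_{\alpha\beta}(x) \right).
\end{aligned}$$

If we expand out the RHS, neglecting terms of $\mathcal{O}(\varepsilon^2)$, one finds that

$$g_{\mu\nu}(x) = g_{\mu\nu}(x) + \varepsilon \xi^\rho\, \partial_\rho g_{\mu\nu} + \varepsilon\, g_{\mu\beta}\, \partial_\nu \xi^\beta + \varepsilon\, g_{\alpha\nu}\, \partial_\mu \xi^\alpha;$$

rearranging,

$$g_{\mu\nu}(x) = g_{\mu\nu}(x) + \varepsilon \left( g_{\mu\beta}\, \partial_\nu \xi^\beta + g_{\alpha\nu}\, \partial_\mu \xi^\alpha + \xi^\rho\, \partial_\rho g_{\mu\nu} \right),$$

which is obviously just

$$g_{\mu\beta}\, \partial_\nu \xi^\beta + g_{\alpha\nu}\, \partial_\mu \xi^\alpha + \xi^\rho\, \partial_\rho g_{\mu\nu} = 0. \tag{3.20}$$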

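Now, notice that

$$\xi_\alpha = g_{\alpha\beta}\, \xi^\beta,$$

and then its differential is

$$\partial_\nu \xi_\alpha = \partial_\nu \left( g_{\alpha\beta}\, \xi^\beta \right) = \xi^\beta\, \partial_\nu g_{\alpha\beta} + g_{\alpha\beta}\, \partial_\nu \xi^\beta.$$

Hence, we can rearrange this into the form

$$g_{\alpha\beta}\, \partial_\nu \xi^\beta = \partial_\nu \xi_\alpha - \xi^\beta\, \partial_\nu g_{\alpha\beta}.$$

So, if we put this into (3.20) for the first and second expressions (being very careful in changing indices), we get

$$\partial_\nu \xi_\mu - \xi^\beta\, \partial_\nu g_{\mu\beta} + \partial_\mu \xi_\nu - \xi^\beta\, \partial_\mu g_{\nu\beta} + \xi^\beta\, \partial_\beta g_{\mu\nu} = 0.$$

Collecting terms,

$$\partial_\nu \xi_\mu + \partial_\mu \xi_\nu + \xi^\beta \left( \partial_\beta g_{\mu\nu} - \partial_\nu g_{\mu\beta} - \partial_\mu g_{\nu\beta} \right) = 0.$$

The bracketed quantity is just $-2 g_{\rho\beta} \Gamma^\rho_{\mu\nu}$, so that

$$\partial_\nu \xi_\mu + \partial_\mu \xi_\nu - 2 \xi^\beta g_{\rho\beta}\, \Gamma^\rho_{\mu\nu} = 0,$$

which is just

$$\partial_\nu \xi_\mu + \partial_\mu \xi_\nu - 2 \Gamma^\rho_{\mu\nu}\, \xi_\rho = 0.$$

This is just a sum of covariant derivatives (noting the symmetry of the Christoffel symbols),

$$\nabla_\mu \xi_\nu + \nabla_\nu \xi_\mu = 0. \tag{3.21}$$

This is known as Killing's equation. A vector $\xi^\nu$ satisfying Killing's equation is called a Killing vector.

Let us just recap what we have done. A metric is said to have an isometry if it can be transformed while retaining its functional dependence. Then, under a small coordinate transformation generated by a vector field $\xi^\mu$, a field satisfying Killing's equation will give an isometry.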

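Now, a theorem states that, for a tangent vector $T^\mu$ and a Killing vector $\xi^\mu$, there is a conserved quantity $T^\mu \xi_\mu$ along an affinely parameterised geodesic. So, to prove it, we consider

$$\frac{D}{Du} (T^\mu \xi_\mu) = T^\nu \nabla_\nu (T^\mu \xi_\mu) = T^\nu \left( \xi_\mu \nabla_\nu T^\mu + T^\mu \nabla_\nu \xi_\mu \right).$$

Now, the first term is zero, as we are on an affinely parameterised geodesic. Now, notice that, because it is contracted with the symmetric product $T^\nu T^\mu$, we can replace the factor in the final term by its symmetric part,

$$\nabla_\nu \xi_\mu \;\to\; \tfrac{1}{2} \left( \nabla_\nu \xi_\mu + \nabla_\mu \xi_\nu \right),$$

and thus we have

$$\frac{D}{Du} (T^\mu \xi_\mu) = T^\nu T^\mu\, \tfrac{1}{2} \left( \nabla_\nu \xi_\mu + \nabla_\mu \xi_\nu \right).$$

We were able to interchange the indices (with the factor of one-half to cancel out the double counting) because the things multiplying it are symmetric under interchange of indices. Notice that the bracketed term is just the LHS of Killing's equation, which vanishes for a Killing vector $\xi^\nu$. Therefore,

$$\frac{D}{Du} (T^\mu \xi_\mu) = 0;$$

thus, $T^\mu \xi_\mu$ is a conserved quantity along an affinely parameterised geodesic.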


3.4 Summary

We shall soon see some examples of geodesics, and what a Killing vector corresponds to; but before then we shall bring together our definitions of the Christoffel symbol, and introduce a little new notation (just to be in keeping with the literature).

We have that the affine connection $\Gamma^\mu_{\nu\lambda}$ is the same as the Christoffel symbol $\{{}_{\mu}{}^{\alpha}{}_{\nu}\}$, for manifolds for which

$$\nabla_\alpha g_{\mu\nu} = 0, \quad T^\alpha{}_{\mu\nu} = 0 \quad\Rightarrow\quad \Gamma^\alpha_{\mu\nu} = \{{}_{\mu}{}^{\alpha}{}_{\nu}\}.$$

We also derived that the relation between the Christoffel symbol (as we may as well call it) and the metric is

$$\Gamma^\rho_{\alpha\mu} = \tfrac{1}{2}\, g^{\rho\nu} \left( -\partial_\nu g_{\alpha\mu} + \partial_\alpha g_{\mu\nu} + \partial_\mu g_{\nu\alpha} \right).$$

In fact, the $\Gamma^\rho_{\alpha\mu}$ are generally denoted Christoffel symbols of the second kind. We can in fact see that

$$\Gamma^\rho_{\alpha\mu} = g^{\rho\nu}\, \Gamma_{\nu\alpha\mu},$$

where we call the $\Gamma_{\nu\alpha\mu}$ the Christoffel symbols of the first kind. When we refer to the "Christoffel symbols", we shall mean those of the second kind.

We derived that the affine geodesic is the same as the metric geodesic, for affinely parameterised geodesics (satisfying the above torsion and covariant derivative relations).

We also saw that the Christoffel symbols are not tensors. The non-tensorial nature of the symbols allowed us to derive that, at any chosen point of a manifold, there exists a coordinate frame in which all components of the symbol are zero. That is, the manifold can always be made to look flat at that point.

The Lagrangian squared,

$$L^2 = \left( \frac{ds}{du} \right)^2 = g_{\mu\nu}\, \dot{x}^\mu \dot{x}^\nu,$$

is just the line element divided by $du^2$. With a suitably normalised affine parameter, its possible values are $0, \pm 1$. We classify $0$ as null geodesics, $+1$ as time-like and $-1$ as space-like.

Also by way of being in keeping with the literature, some books denote partial and covariant derivatives in a different way. Sometimes one may see

$$\partial_\nu A_\mu \equiv A_{\mu,\nu}, \qquad \nabla_\nu A_\mu \equiv A_{\mu;\nu}.$$

That is, a "comma" representing partial derivatives, and a "semi-colon" for covariant derivatives.

3.5 Examples

Here we shall see specific examples of geodesics, Killing vectors and how to compute Christoffel symbols.

3.5.1 Computing Christoffel Symbols: Effective Lagrangian

Now, before we go on to the effective Lagrangian method of computing the Christoffel symbols, we shall see how to do so via brute force.

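Brute force: plane polars. In plane polars, we have the line element

$$ds^2 = dr^2 + r^2 d\phi^2,$$

and therefore, reading off the components of the metric and its inverse,

$$(g_{ij}) = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}, \qquad (g^{ij}) = \begin{pmatrix} 1 & 0 \\ 0 & 1/r^2 \end{pmatrix}.$$

Then, using the notation that $dx^i = (dr, d\phi)$, we see that $g_{rr} = 1$, $g_{\phi\phi} = r^2$, $g^{rr} = 1$, $g^{\phi\phi} = r^{-2}$ are the only non-zero components. So, to compute the Christoffel symbols (the brute force way), we must find

$$\Gamma^i_{jk} = \frac{1}{2} \sum_a g^{ia} \left( -\partial_a g_{jk} + \partial_j g_{ak} + \partial_k g_{ja} \right), \qquad i, j, k, a = r, \phi.$$

We shall spell out, in detail, how to compute one of the components;

$$\begin{aligned}
\Gamma^r_{\phi\phi} &= \frac{1}{2} \sum_{a = r, \phi} g^{ra} \left( -\partial_a g_{\phi\phi} + \partial_\phi g_{\phi a} + \partial_\phi g_{\phi a} \right) \\
&= \frac{1}{2} \left[ g^{rr} \left( -\partial_r g_{\phi\phi} + \partial_\phi g_{\phi r} + \partial_\phi g_{\phi r} \right) + g^{r\phi} \left( -\partial_\phi g_{\phi\phi} + \partial_\phi g_{\phi\phi} + \partial_\phi g_{\phi\phi} \right) \right].
\end{aligned}$$

Now, one of the first things we note is that the metric is diagonal: all off-diagonal components are zero. So, the above reduces to

$$\Gamma^r_{\phi\phi} = -\frac{1}{2}\, g^{rr}\, \partial_r g_{\phi\phi} = -\frac{1}{2} \cdot 1 \cdot \frac{\partial}{\partial r} r^2 = -r.$$

We have thus found one of the components of the Christoffel symbol. We shall state the rest of them (as going through how to find each one is very tedious).

$$\begin{aligned}
\Gamma^r_{rr} &= 0, & \Gamma^r_{r\phi} = \Gamma^r_{\phi r} &= 0, & \Gamma^r_{\phi\phi} &= -r, \\
\Gamma^\phi_{\phi\phi} &= 0, & \Gamma^\phi_{r\phi} = \Gamma^\phi_{\phi r} &= r^{-1}, & \Gamma^\phi_{rr} &= 0.
\end{aligned}$$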
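The brute-force sum above is mechanical enough to hand to a computer. The following is a minimal sketch, not taken from the lectures: it assumes a two-dimensional metric supplied as a Python function, approximates the metric derivatives by central differences with a step `h`, and evaluates $\Gamma^i_{jk} = \tfrac{1}{2} g^{ia}(-\partial_a g_{jk} + \partial_j g_{ak} + \partial_k g_{ja})$ at a single point. The function names and the sample point are hypothetical choices.

```python
def inverse_2x2(g):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def metric_derivatives(metric, x, h):
    """dg[a][j][k] approximates d g_jk / d x^a by central differences."""
    dg = []
    for a in range(2):
        x_plus, x_minus = list(x), list(x)
        x_plus[a] += h
        x_minus[a] -= h
        g_plus, g_minus = metric(x_plus), metric(x_minus)
        dg.append([[(g_plus[j][k] - g_minus[j][k]) / (2.0 * h)
                    for k in range(2)] for j in range(2)])
    return dg


def christoffel(metric, x, h=1e-5):
    """gamma[i][j][k] approximates Gamma^i_{jk} at the point x."""
    g_inv = inverse_2x2(metric(x))
    dg = metric_derivatives(metric, x, h)
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    for i in range(2):
        for j in range(2):
            for k in range(2):
                gamma[i][j][k] = 0.5 * sum(
                    g_inv[i][a] * (-dg[a][j][k] + dg[j][a][k] + dg[k][j][a])
                    for a in range(2))
    return gamma


def polar_metric(x):
    """Plane-polar metric diag(1, r^2); index 0 is r, index 1 is phi."""
    r, _phi = x
    return [[1.0, 0.0], [0.0, r * r]]


gamma = christoffel(polar_metric, [2.0, 0.3])
print(gamma[0][1][1], gamma[1][0][1], gamma[1][1][0])
```

At $r = 2$ the printed values should be close to $-r = -2$ and $1/r = 0.5$, in line with the components stated above; the remaining entries of `gamma` should be close to zero.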

Now, we shall show how to find them in a more intelligent manner.


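Effective Lagrangian Method. When we derived the metric geodesic, we had that the Lagrangian was

$$L = \frac{ds}{du} = \sqrt{g_{\mu\nu}\, \dot{x}^\mu \dot{x}^\nu}.$$

Now, consider

$$L_{\text{eff}} \equiv L^2.$$

The Euler-Lagrange equation for $L_{\text{eff}}$ is

$$\frac{d}{du} \left( \frac{\partial L_{\text{eff}}}{\partial \dot{x}^\mu} \right) - \frac{\partial L_{\text{eff}}}{\partial x^\mu} = 0,$$

from which, for an affinely parameterised curve ($\dot{L} = 0$), it is clear to see that

$$2L \left[ \frac{d}{du} \left( \frac{\partial L}{\partial \dot{x}^\mu} \right) - \frac{\partial L}{\partial x^\mu} \right] = 0.$$

Thus, if $L$ satisfies the Euler-Lagrange equation, then so does $L^2$. This makes life a lot simpler, as we can consider just $g_{\mu\nu}\, \dot{x}^\mu \dot{x}^\nu$, rather than its square root.

So, for plane polars, where

$$L_{\text{eff}} = L^2 = \dot{r}^2 + r^2 \dot{\phi}^2,$$

we have two Euler-Lagrange equations, one for each coordinate $r, \phi$. They are

$$\ddot{r} - r \dot{\phi}^2 = 0, \qquad 2 \dot{r} \dot{\phi} + r \ddot{\phi} = 0.$$

Now, if we get these equations into the form $\ddot{x} + C \dot{x}_1 \dot{x}_2 = 0$,

$$\ddot{r} - r \dot{\phi}^2 = 0, \qquad \ddot{\phi} + \frac{\dot{r} \dot{\phi}}{r} + \frac{\dot{\phi} \dot{r}}{r} = 0.$$

So, we see that we can read off the Christoffel symbols by inspection. To see this a little more clearly, the "general" metric geodesic, for $r$, is

$$\ddot{r} + \sum_{i,j = r,\phi} \Gamma^r_{ij}\, \dot{x}^i \dot{x}^j = 0;$$

then, we can see that the only Christoffel component that is non-zero is that where $i = j = \phi$, and that value is $-r$. Thus, we read off that $\Gamma^r_{\phi\phi} = -r$, which is in accord with what we had by the brute force method. For $\phi$, the general geodesic is

$$\ddot{\phi} + \sum_{i,j = r,\phi} \Gamma^\phi_{ij}\, \dot{x}^i \dot{x}^j = 0;$$

and we therefore see two non-zero Christoffel symbols: when $i = r, j = \phi$ and $i = \phi, j = r$. The corresponding Christoffel symbol components are thus $\Gamma^\phi_{r\phi} = \Gamma^\phi_{\phi r} = r^{-1}$. Again, this is in accord with the brute force components.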


3.5.2 Computing the Geodesic

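Now, we are able to find the geodesic: a parameterised curve that extremises the distance between two points, in the plane polar coordinate system.

When we computed the Euler-Lagrange equation for $\phi$, we had a term (which we did not state above, but is easy to see upon computation)

$$\frac{d}{du} \left( 2 r^2 \dot{\phi} \right) = 0 \quad\Rightarrow\quad r^2 \dot{\phi} = B = \text{const}.$$

That is, $\dot{\phi} = B / r^2$. Now, the effective Lagrangian is just $ds^2/du^2$, which is just the line element, which can take one of 3 values (as previously stated),

$$L_{\text{eff}} = L^2 = \left\{ \begin{matrix} 0 \\ 1 \\ -1 \end{matrix} \right\} \equiv A.$$

So, the effective Lagrangian is just

$$L_{\text{eff}} = \dot{r}^2 + r^2 \dot{\phi}^2 = A.$$

Hence, using our expression for $\dot{\phi}$,

$$\dot{r}^2 + r^2 \frac{B^2}{r^4} = A \quad\Rightarrow\quad \dot{r} = \sqrt{A - \frac{B^2}{r^2}}.$$

Now, if we notice that

$$\frac{\dot{\phi}}{\dot{r}} = \frac{d\phi}{dr} = \frac{B}{r^2} \left( A - \frac{B^2}{r^2} \right)^{-1/2},$$

then this integrates to

$$r \cos(\phi - \phi_c) = \frac{B}{\sqrt{A}}.$$

If we take $A = 1$, so that we are talking about time-like geodesics, then the equation becomes

$$r \cos(\phi - \phi_c) = r \cos\phi \cos\phi_c + r \sin\phi \sin\phi_c = B.$$

We now note that $\phi_c$ is a constant, and that $x = r \cos\phi$, $y = r \sin\phi$, giving

$$x \cos\phi_c + y \sin\phi_c = B \quad\Rightarrow\quad y = mx + c,$$

with $m = -\cot\phi_c$ and $c = B / \sin\phi_c$. That is, the time-like geodesic is a straight line.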
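The straight-line result, and the constancy of $B = r^2 \dot{\phi}$, can be checked by integrating the two geodesic equations numerically. The following is a minimal sketch under stated assumptions: the initial data (start at $r = 1$, $\phi = 0$, moving in the $+y$ direction with unit speed, so $A = 1$ and $B = 1$), the step size, and the hand-written Runge-Kutta integrator are all illustrative choices, not part of the lectures.

```python
import math


def geodesic_rhs(state):
    """Right-hand side for the plane-polar geodesic equations.

    state = (r, phi, r_dot, phi_dot), with
    r_ddot = r * phi_dot**2 and phi_ddot = -2 * r_dot * phi_dot / r.
    """
    r, phi, r_dot, phi_dot = state
    return (r_dot, phi_dot, r * phi_dot ** 2, -2.0 * r_dot * phi_dot / r)


def rk4_step(f, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(y)."""
    k1 = f(y)
    k2 = f(tuple(v + 0.5 * h * k for v, k in zip(y, k1)))
    k3 = f(tuple(v + 0.5 * h * k for v, k in zip(y, k2)))
    k4 = f(tuple(v + h * k for v, k in zip(y, k3)))
    return tuple(v + h * (a + 2 * b + 2 * c + d) / 6.0
                 for v, a, b, c, d in zip(y, k1, k2, k3, k4))


def integrate(state, h, steps):
    """Advance the geodesic by `steps` steps of size h in the parameter u."""
    for _ in range(steps):
        state = rk4_step(geodesic_rhs, state, h)
    return state


# Start at (x, y) = (1, 0) with Cartesian velocity (0, 1): the line x = 1.
start = (1.0, 0.0, 0.0, 1.0)
r, phi, r_dot, phi_dot = integrate(start, 1e-3, 1000)  # up to u = 1

x_end = r * math.cos(phi)
y_end = r * math.sin(phi)
angular_momentum = r * r * phi_dot  # the constant B
speed_squared = r_dot ** 2 + r * r * phi_dot ** 2  # the constant A
print(x_end, y_end, angular_momentum, speed_squared)
```

With these initial data the curve should stay on the vertical line $x = 1$ and reach $y \approx 1$ at $u = 1$, while $B$ and $A$ both remain close to 1, which is the behaviour derived above.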

Let us now consider the null geodesic. We appeal back to the effective Lagrangian, which becomes

$$\dot{r}^2 + r^2 \dot{\phi}^2 = 0.$$

This has solution $\dot{r} = 0$ and $\dot{\phi} = 0$. That is, both radius and angle are constants. That is, a single point. Thus, the null geodesic is a point (null size).

When we consider the space-like geodesic, we find that there is no solution: it does not exist in plane polars.


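Example of Geodesic 2. Let us compute another geodesic, for another line element,

$$ds^2 = \frac{1}{t^2} \left( dt^2 - dx^2 \right).$$

So, we see that the effective Lagrangian is

$$L_{\text{eff}} = \frac{\dot{t}^2 - \dot{x}^2}{t^2}, \qquad \dot{t} \equiv \frac{dt}{du}, \quad \dot{x} \equiv \frac{dx}{du}.$$

Then, the Euler-Lagrange equations for this effective Lagrangian are

$$\ddot{t} - \frac{\dot{t}^2}{t} - \frac{\dot{x}^2}{t} = 0, \qquad \ddot{x} - \frac{2}{t}\, \dot{x} \dot{t} = 0.$$

So, we can read off the Christoffel symbol components. The only non-zero components are

$$\Gamma^t_{xx} = \Gamma^t_{tt} = -\frac{1}{t}, \qquad \Gamma^x_{xt} = \Gamma^x_{tx} = -\frac{1}{t}.$$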
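These components can be cross-checked against the metric formula $\Gamma^i_{jk} = \tfrac{1}{2} g^{ia}(-\partial_a g_{jk} + \partial_j g_{ak} + \partial_k g_{ja})$. The following is a minimal sketch, not from the lectures: it exploits the fact that this metric is diagonal (so $g^{ii} = 1/g_{ii}$), estimates derivatives by central differences, and evaluates at an arbitrarily chosen point $t = 2$. The helper names are hypothetical.

```python
def metric_diagonal(t, x):
    """Diagonal entries (g_tt, g_xx) of ds^2 = (dt^2 - dx^2) / t^2."""
    return (1.0 / t ** 2, -1.0 / t ** 2)


def d_metric(a, j, k, t, x, h=1e-6):
    """Central-difference estimate of d g_jk / d x^a (index 0 = t, 1 = x)."""
    if j != k:
        return 0.0  # the metric has no off-diagonal components
    if a == 0:
        plus, minus = metric_diagonal(t + h, x), metric_diagonal(t - h, x)
    else:
        plus, minus = metric_diagonal(t, x + h), metric_diagonal(t, x - h)
    return (plus[j] - minus[j]) / (2.0 * h)


def christoffel(i, j, k, t, x):
    """Gamma^i_{jk} for a diagonal metric, using g^{ii} = 1 / g_ii."""
    g = metric_diagonal(t, x)
    bracket = (-d_metric(i, j, k, t, x)
               + d_metric(j, i, k, t, x)
               + d_metric(k, j, i, t, x))
    return 0.5 * bracket / g[i]


t0, x0 = 2.0, 0.5  # sample point (an arbitrary choice)
gamma_t_tt = christoffel(0, 0, 0, t0, x0)
gamma_t_xx = christoffel(0, 1, 1, t0, x0)
gamma_x_xt = christoffel(1, 1, 0, t0, x0)
gamma_x_tt = christoffel(1, 0, 0, t0, x0)
print(gamma_t_tt, gamma_t_xx, gamma_x_xt, gamma_x_tt)
```

At $t = 2$ the first three values should be close to $-1/t = -0.5$ and the last close to zero, matching the components read off from the Euler-Lagrange equations.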

3.5.3 Physical Meaning of the Killing Vector

Again, let us go back to plane polars. The line element is

$$ds^2 = dr^2 + r^2 d\phi^2.$$

We would like to think of a vector that leaves the line element unchanged. A transformation on $\phi$ works:

$$\phi \longmapsto \phi' = \phi + \varepsilon,$$

so that

$$(r, \phi) \longmapsto (r', \phi') = (r, \phi + \varepsilon) = (r, \phi) + \varepsilon (0, 1).$$

Therefore, our Killing vector is

$$\xi^i = (0, 1).$$

Now, we stated that $T^i \xi_i$ is a conserved quantity. Let us consider what it is. So,

$$\begin{aligned}
\xi_i \dot{x}^i &= g_{ij}\, \xi^i \dot{x}^j \\
&= g_{\phi\phi}\, \xi^\phi \dot{x}^\phi \\
&= r^2 \cdot 1 \cdot \dot{\phi} \\
&= r^2 \dot{\phi}.
\end{aligned}$$

This quantity is a constant (as it is conserved). We also notice that it is the expression for the angular momentum of the system. Therefore, the conserved quantity associated with the Killing vector is the angular momentum, in plane polars.
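That $\xi^i = (0, 1)$ really does satisfy Killing's equation (3.21) can be verified directly. The following is a minimal sketch under stated assumptions: it uses the plane-polar Christoffel symbols found in Section 3.5.1, lowers the index with $g_{ij} = \text{diag}(1, r^2)$, takes partial derivatives by central differences, and evaluates $\nabla_i \xi_j + \nabla_j \xi_i$ at one sample point. A radial field $\xi^i = (1, 0)$ is included purely as a contrasting, hypothetical example of a vector that is not Killing.

```python
def christoffel_polar(r):
    """gamma[i][j][k] = Gamma^i_{jk} in plane polars (0 = r, 1 = phi)."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    gamma[0][1][1] = -r
    gamma[1][0][1] = 1.0 / r
    gamma[1][1][0] = 1.0 / r
    return gamma


def killing_lhs(xi_field, r, phi, h=1e-5):
    """Return K[i][j] = nabla_i xi_j + nabla_j xi_i at the point (r, phi).

    xi_field(r, phi) must return the contravariant components (xi^r, xi^phi).
    """
    def xi_low(r_, phi_):
        # xi_i = g_ij xi^j with the metric diag(1, r^2)
        up = xi_field(r_, phi_)
        return [up[0], r_ * r_ * up[1]]

    xi = xi_low(r, phi)
    # d[i][j] approximates the partial derivative d_i xi_j
    d = [[(xi_low(r + h, phi)[j] - xi_low(r - h, phi)[j]) / (2.0 * h)
          for j in range(2)],
         [(xi_low(r, phi + h)[j] - xi_low(r, phi - h)[j]) / (2.0 * h)
          for j in range(2)]]
    gamma = christoffel_polar(r)
    # nabla_i xi_j = d_i xi_j - Gamma^k_{ij} xi_k
    cov = [[d[i][j] - sum(gamma[k][i][j] * xi[k] for k in range(2))
            for j in range(2)] for i in range(2)]
    return [[cov[i][j] + cov[j][i] for j in range(2)] for i in range(2)]


def rotation(r, phi):
    return (0.0, 1.0)  # the Killing vector found above


def radial(r, phi):
    return (1.0, 0.0)  # a contrasting field, not a Killing vector


k_rotation = killing_lhs(rotation, 2.0, 0.7)
k_radial = killing_lhs(radial, 2.0, 0.7)
print(k_rotation, k_radial)
```

All four entries for the rotation field should vanish to within the finite-difference error, whereas the radial field leaves a non-zero $\phi\phi$ entry (equal to $2r$), so a radial shift is not an isometry of the plane-polar metric.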


3.5.4 Nordstrom’s Theory of Gravity

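Let us compute the connection associated with $\tilde g_{\mu\nu} = \Omega^2 g_{\mu\nu}$. Now, the connection associated with $\tilde g_{\mu\nu}$ is

\[ \tilde\Gamma^\rho_{\alpha\beta} = \frac12\tilde g^{\rho\nu}\left(\partial_\alpha\tilde g_{\beta\nu} + \partial_\beta\tilde g_{\nu\alpha} - \partial_\nu\tilde g_{\alpha\beta}\right), \qquad \tilde g^{\rho\nu} = \frac{1}{\Omega^2}g^{\rho\nu}. \]

Hence,

\[ \partial_\alpha(\tilde g_{\mu\nu}) = \partial_\alpha\left(\Omega^2 g_{\mu\nu}\right) = g_{\mu\nu}\,2\Omega\,\partial_\alpha\Omega + \Omega^2\partial_\alpha g_{\mu\nu}. \]

Therefore,

\[ \begin{aligned} \tilde\Gamma^\rho_{\alpha\beta} &= \frac12\tilde g^{\rho\nu}\left(\partial_\alpha\tilde g_{\beta\nu} + \partial_\beta\tilde g_{\nu\alpha} - \partial_\nu\tilde g_{\alpha\beta}\right) \\ &= \frac12\frac{1}{\Omega^2}g^{\rho\nu}\left(2\Omega g_{\beta\nu}\partial_\alpha\Omega + \Omega^2\partial_\alpha g_{\beta\nu} + 2\Omega g_{\nu\alpha}\partial_\beta\Omega + \Omega^2\partial_\beta g_{\nu\alpha} - 2\Omega g_{\alpha\beta}\partial_\nu\Omega - \Omega^2\partial_\nu g_{\alpha\beta}\right) \\ &= \frac12 g^{\rho\nu}\left(\partial_\alpha g_{\beta\nu} + \partial_\beta g_{\nu\alpha} - \partial_\nu g_{\alpha\beta}\right) + \frac{1}{\Omega}g^{\rho\nu}\left(g_{\beta\nu}\partial_\alpha\Omega + g_{\nu\alpha}\partial_\beta\Omega - g_{\alpha\beta}\partial_\nu\Omega\right) \\ &= \Gamma^\rho_{\alpha\beta} + \frac{1}{\Omega}\left(\delta^\rho_\beta\partial_\alpha\Omega + \delta^\rho_\alpha\partial_\beta\Omega - g^{\rho\nu}g_{\alpha\beta}\partial_\nu\Omega\right). \end{aligned} \]

Hence,

\[ \tilde\Gamma^\rho_{\alpha\beta} = \Gamma^\rho_{\alpha\beta} + \frac{1}{\Omega}\left(\delta^\rho_\beta\partial_\alpha\Omega + \delta^\rho_\alpha\partial_\beta\Omega - g^{\rho\nu}g_{\alpha\beta}\partial_\nu\Omega\right). \]

Let us suppose that we have

\[ \tilde g_{\mu\nu} = e^{2\phi}\eta_{\mu\nu}, \]

so that

\[ \Omega = e^\phi, \qquad g_{\mu\nu} = \eta_{\mu\nu}. \]

Hence,

\[ \partial_\alpha\Omega = e^\phi\partial_\alpha\phi, \qquad \Gamma^\mu_{\alpha\beta} = 0. \]

Hence, using these,

\[ \begin{aligned} \tilde\Gamma^\rho_{\alpha\beta} &= e^{-\phi}\left(\delta^\rho_\beta e^\phi\partial_\alpha\phi + \delta^\rho_\alpha e^\phi\partial_\beta\phi - \eta^{\rho\nu}\eta_{\alpha\beta}e^\phi\partial_\nu\phi\right) \\ &= \delta^\rho_\beta\partial_\alpha\phi + \delta^\rho_\alpha\partial_\beta\phi - \eta^{\rho\nu}\eta_{\alpha\beta}\partial_\nu\phi. \end{aligned} \]

So, the geodesic equation, with this connection, reads

\[ \ddot x^\rho + \left(\delta^\rho_\beta\partial_\alpha\phi + \delta^\rho_\alpha\partial_\beta\phi - \eta^{\rho\nu}\eta_{\alpha\beta}\partial_\nu\phi\right)\dot x^\alpha\dot x^\beta = 0, \]

which, writing $\dot x^2 \equiv \eta_{\alpha\beta}\dot x^\alpha\dot x^\beta$, reduces to

\[ \ddot x^\rho + 2\dot x^\rho\,\partial_\alpha\phi\,\dot x^\alpha - \dot x^2\,\partial^\rho\phi = 0. \]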


Now, null geodesics have $\dot x^2 = 0$, so that the null geodesic equation is

\[ \ddot x^\rho + 2\dot x^\rho\,\partial_\alpha\phi\,\dot x^\alpha = 0. \]

Similarly, timelike geodesics have $\dot x^2 = 1$, so that the timelike geodesic equation is

\[ \ddot x^\rho + 2\dot x^\rho\,\partial_\alpha\phi\,\dot x^\alpha - \partial^\rho\phi = 0. \]

Now, this example provides us with some practice with using tensors & computing geodesics. In addition to this, we have found the geodesics for a theory whereby the metric is given by $e^{2\phi}\eta_{\mu\nu}$. This theory was proposed by Nordstrom before Einstein.



4 Curvature

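We have now got enough mathematical tools to be able to consider the curvature of a manifold.

To continue, consider the commutator of covariant derivatives, acting upon a scalar,

\[ [\nabla_\mu,\nabla_\nu]\phi = \nabla_\mu\nabla_\nu\phi - \nabla_\nu\nabla_\mu\phi. \]

Now, as we previously showed, the covariant derivative of a scalar is just the normal partial derivative. Therefore,

\[ [\nabla_\mu,\nabla_\nu]\phi = \nabla_\mu(\partial_\nu\phi) - \nabla_\nu(\partial_\mu\phi). \]

We can now expand out the covariant derivatives. So,

\[ \nabla_\mu(\partial_\nu\phi) = \partial_\mu\partial_\nu\phi - \Gamma^\lambda_{\mu\nu}\partial_\lambda\phi. \]

Therefore, the commutator is

\[ [\nabla_\mu,\nabla_\nu]\phi = \partial_\mu\partial_\nu\phi - \Gamma^\lambda_{\mu\nu}\partial_\lambda\phi - \partial_\nu\partial_\mu\phi + \Gamma^\lambda_{\nu\mu}\partial_\lambda\phi. \]

In a torsion-free manifold, where $\Gamma^\lambda_{\mu\nu} = \Gamma^\lambda_{\nu\mu}$, the two Christoffel terms cancel out, as do the partial derivative terms (as they commute naturally). Therefore, we see that

\[ [\nabla_\mu,\nabla_\nu]\phi = 0. \]

So, the commutator of covariant derivatives, acting upon a scalar, is zero. This result isn’t perhaps that surprising. So, let us consider the commutator acting upon a vector.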

4.1 The Riemann Tensor

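As previously stated, we shall compute the commutator of covariant derivatives, acting upon a vector. That is,

\[ [\nabla_\mu,\nabla_\nu]A^\rho = \nabla_\mu\nabla_\nu A^\rho - \nabla_\nu\nabla_\mu A^\rho. \]

Now, before, we expanded out the inner covariant derivatives first (as they resulted in just partial derivatives). However, if we do that this time, we will end up having to compute the covariant derivative of the Christoffel symbol, which we don’t know how to do. Hence, we expand out the outer derivatives first. So,

\[ \nabla_\mu\nabla_\nu A^\rho = \partial_\mu(\nabla_\nu A^\rho) + \Gamma^\rho_{\mu\lambda}\nabla_\nu A^\lambda - \Gamma^\lambda_{\mu\nu}\nabla_\lambda A^\rho, \]

thus, the commutator reads

\[ \begin{aligned} [\nabla_\mu,\nabla_\nu]A^\rho = {}& \partial_\mu(\nabla_\nu A^\rho) + \Gamma^\rho_{\mu\lambda}\nabla_\nu A^\lambda - \Gamma^\lambda_{\mu\nu}\nabla_\lambda A^\rho \\ & - \partial_\nu(\nabla_\mu A^\rho) - \Gamma^\rho_{\nu\lambda}\nabla_\mu A^\lambda + \Gamma^\lambda_{\nu\mu}\nabla_\lambda A^\rho. \end{aligned} \]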


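So, we see that the final terms on each line of the RHS cancel (i.e. third & sixth), as the Christoffel symbol is symmetric in its lower indices,

\[ [\nabla_\mu,\nabla_\nu]A^\rho = \partial_\mu(\nabla_\nu A^\rho) + \Gamma^\rho_{\mu\lambda}\nabla_\nu A^\lambda - \partial_\nu(\nabla_\mu A^\rho) - \Gamma^\rho_{\nu\lambda}\nabla_\mu A^\lambda. \]

Now, expanding out the remaining covariant derivatives,

\[ \begin{aligned} [\nabla_\mu,\nabla_\nu]A^\rho = {}& \partial_\mu\left(\partial_\nu A^\rho + \Gamma^\rho_{\lambda\nu}A^\lambda\right) + \Gamma^\rho_{\mu\lambda}\left(\partial_\nu A^\lambda + \Gamma^\lambda_{\nu\beta}A^\beta\right) \\ & - \partial_\nu\left(\partial_\mu A^\rho + \Gamma^\rho_{\lambda\mu}A^\lambda\right) - \Gamma^\rho_{\nu\lambda}\left(\partial_\mu A^\lambda + \Gamma^\lambda_{\mu\beta}A^\beta\right). \end{aligned} \]

Now, as partial derivatives commute, the two second-derivative terms, $\partial_\mu\partial_\nu A^\rho$ and $\partial_\nu\partial_\mu A^\rho$, cancel. So, cancelling & expanding out the brackets, we have

\[ \begin{aligned} [\nabla_\mu,\nabla_\nu]A^\rho = {}& (\partial_\mu\Gamma^\rho_{\lambda\nu})A^\lambda + \Gamma^\rho_{\lambda\nu}\partial_\mu A^\lambda + \Gamma^\rho_{\mu\lambda}\partial_\nu A^\lambda + \Gamma^\rho_{\mu\lambda}\Gamma^\lambda_{\nu\beta}A^\beta \\ & - (\partial_\nu\Gamma^\rho_{\lambda\mu})A^\lambda - \Gamma^\rho_{\lambda\mu}\partial_\nu A^\lambda - \Gamma^\rho_{\nu\lambda}\partial_\mu A^\lambda - \Gamma^\rho_{\nu\lambda}\Gamma^\lambda_{\mu\beta}A^\beta. \end{aligned} \]

Now, we see that the second term cancels with the seventh, and the third with the sixth (again, by assuming torsion free manifolds). Leaving us with

\[ [\nabla_\mu,\nabla_\nu]A^\rho = (\partial_\mu\Gamma^\rho_{\lambda\nu})A^\lambda + \Gamma^\rho_{\mu\lambda}\Gamma^\lambda_{\nu\beta}A^\beta - (\partial_\nu\Gamma^\rho_{\lambda\mu})A^\lambda - \Gamma^\rho_{\nu\lambda}\Gamma^\lambda_{\mu\beta}A^\beta. \]

Now, in the second & fourth terms, let us interchange $\beta \leftrightarrow \lambda$, giving

\[ [\nabla_\mu,\nabla_\nu]A^\rho = (\partial_\mu\Gamma^\rho_{\lambda\nu})A^\lambda + \Gamma^\rho_{\mu\beta}\Gamma^\beta_{\nu\lambda}A^\lambda - (\partial_\nu\Gamma^\rho_{\lambda\mu})A^\lambda - \Gamma^\rho_{\nu\beta}\Gamma^\beta_{\mu\lambda}A^\lambda, \]

so that we can take out a common factor of $A^\lambda$,

\[ [\nabla_\mu,\nabla_\nu]A^\rho = \left(\partial_\mu\Gamma^\rho_{\lambda\nu} + \Gamma^\rho_{\mu\beta}\Gamma^\beta_{\nu\lambda} - \partial_\nu\Gamma^\rho_{\lambda\mu} - \Gamma^\rho_{\nu\beta}\Gamma^\beta_{\mu\lambda}\right)A^\lambda. \]

Now, we define the bracketed quantity to be the Riemann tensor,

\[ R^\rho{}_{\lambda\mu\nu} \equiv \partial_\mu\Gamma^\rho_{\lambda\nu} + \Gamma^\rho_{\mu\beta}\Gamma^\beta_{\nu\lambda} - \partial_\nu\Gamma^\rho_{\lambda\mu} - \Gamma^\rho_{\nu\beta}\Gamma^\beta_{\mu\lambda}, \qquad (4.1) \]

so that the commutator reads

\[ [\nabla_\mu,\nabla_\nu]A^\rho = R^\rho{}_{\lambda\mu\nu}A^\lambda. \qquad (4.2) \]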
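Definition (4.1) can be evaluated numerically for a concrete metric. The sketch below does this for the unit 2-sphere, $ds^2 = d\theta^2 + \sin^2\theta\,d\phi^2$, chosen purely as an illustration; the nested central finite differences, step sizes and function names are assumptions of this sketch, so the result is only approximate.

```python
import math
from itertools import product

DIM = 2  # coordinates ordered (theta, phi)


def metric(theta, phi):
    """Unit 2-sphere metric components g_ab."""
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def shifted(point, c, h):
    """Copy of `point` with coordinate c displaced by h."""
    p = list(point)
    p[c] += h
    return p


def christoffel(point, h=1e-5):
    """Gamma[r][a][b] from finite-difference derivatives of the metric."""
    g_inv = inverse_2x2(metric(*point))
    dg = []  # dg[c][a][b] = d_c g_ab
    for c in range(DIM):
        g_up = metric(*shifted(point, c, h))
        g_dn = metric(*shifted(point, c, -h))
        dg.append([[(g_up[a][b] - g_dn[a][b]) / (2 * h) for b in range(DIM)]
                   for a in range(DIM)])
    return [[[0.5 * sum(g_inv[r][n] * (dg[a][b][n] + dg[b][n][a] - dg[n][a][b])
                        for n in range(DIM))
              for b in range(DIM)] for a in range(DIM)] for r in range(DIM)]


def riemann(point, h=1e-3):
    """R[r, l, m, n] = R^r_{lmn} following the index order of (4.1)."""
    gam = christoffel(point)
    dgam = []  # dgam[c][r][a][b] = d_c Gamma^r_ab
    for c in range(DIM):
        up = christoffel(shifted(point, c, h))
        dn = christoffel(shifted(point, c, -h))
        dgam.append([[[(up[r][a][b] - dn[r][a][b]) / (2 * h) for b in range(DIM)]
                      for a in range(DIM)] for r in range(DIM)])
    R = {}
    for r, l, m, n in product(range(DIM), repeat=4):
        R[r, l, m, n] = (
            dgam[m][r][l][n] - dgam[n][r][l][m]
            + sum(gam[r][m][b] * gam[b][n][l] - gam[r][n][b] * gam[b][m][l]
                  for b in range(DIM))
        )
    return R


TH, PH = 0, 1
theta0 = 1.0  # illustrative evaluation point
R = riemann((theta0, 0.3))
print(R[TH, PH, TH, PH], math.sin(theta0) ** 2)  # the two should agree closely
```

For this metric the component $R^\theta{}_{\phi\theta\phi}$ comes out as $\sin^2\theta$, and the antisymmetry in the last two indices is visible directly in the computed components.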

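The Riemann tensor is a (1,3)-tensor. It is plausible that $R^\rho{}_{\lambda\mu\nu}$ is a tensor: as the LHS of the above is a tensor, the RHS must also be (as $A^\rho$ is a tensor). This is obviously not a rigorous proof of the tensorial nature of the Riemann “tensor”, so we shall prove it.

We have

\[ (\nabla_\mu\nabla_\nu - \nabla_\nu\nabla_\mu)A^\rho = R^\rho{}_{\lambda\mu\nu}A^\lambda, \]

and therefore that

\[ \left(\nabla'_\mu\nabla'_\nu - \nabla'_\nu\nabla'_\mu\right)A'^\rho = R'^\rho{}_{\lambda\mu\nu}A'^\lambda. \]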


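Now, as covariant derivatives are tensors, we know that

\[ \nabla'_\mu\nabla'_\nu A'^\rho = (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu J^\rho{}_\gamma \nabla_\alpha\nabla_\beta A^\gamma. \]

Hence, also using $A'^\lambda = J^\lambda{}_\pi A^\pi$,

\[ (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu J^\rho{}_\gamma \left(\nabla_\alpha\nabla_\beta - \nabla_\beta\nabla_\alpha\right)A^\gamma = R'^\rho{}_{\lambda\mu\nu}J^\lambda{}_\pi A^\pi. \]

Now, on the LHS, we see that $(\nabla_\alpha\nabla_\beta - \nabla_\beta\nabla_\alpha)A^\gamma = R^\gamma{}_{\sigma\alpha\beta}A^\sigma$. Therefore,

\[ (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu J^\rho{}_\gamma R^\gamma{}_{\sigma\alpha\beta}A^\sigma = R'^\rho{}_{\lambda\mu\nu}J^\lambda{}_\pi A^\pi. \]

Now, as this must be valid for all $A^\mu$, we may relabel the dummy indices on the RHS ($\pi \to \sigma$, $\lambda \to \kappa$) and cancel off the $A^\sigma$,

\[ (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu J^\rho{}_\gamma R^\gamma{}_{\sigma\alpha\beta} = R'^\rho{}_{\kappa\mu\nu}J^\kappa{}_\sigma. \]

Now, multiplying through by something that will ‘kill off’ the Jacobian on the RHS, namely $(J^{-1})^\sigma{}_\lambda$, and using $J^\kappa{}_\sigma (J^{-1})^\sigma{}_\lambda = \delta^\kappa_\lambda$,

\[ (J^{-1})^\sigma{}_\lambda (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu J^\rho{}_\gamma R^\gamma{}_{\sigma\alpha\beta} = R'^\rho{}_{\lambda\mu\nu}, \]

which is the transformation rule of a (1,3)-tensor. Therefore, we have proven that the Riemann tensor is in fact a tensor.

Just to be in keeping with some literature, the Riemann tensor is also called the Riemann-Christoffel tensor, or the curvature tensor.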

4.1.1 Symmetries of the Riemann Tensor

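Now, in one of our previous discussions, we introduced the local inertial frame (LIF), whereby at a point $x^\mu = x^\mu_*$, the metric is flat, and the Christoffel symbols are all zero;

\[ g_{\mu\nu}(x_*) = \eta_{\mu\nu}, \qquad \partial_\rho g_{\mu\nu}(x_*) = 0, \qquad \Gamma^\rho_{\mu\nu}(x_*) = 0. \]

In a LIF, the Riemann tensor looks quite simple. So, we see that the Riemann tensor, in a LIF, is just

\[ R^\rho{}_{\lambda\mu\nu} = \partial_\mu\Gamma^\rho_{\lambda\nu} - \partial_\nu\Gamma^\rho_{\lambda\mu}. \]

Putting in expressions for the Christoffel symbols, and noting that the first derivative of the metric is zero;

\[ R^\rho{}_{\lambda\mu\nu} = \frac12 g^{\rho\pi}\left(\partial_\mu\partial_\lambda g_{\nu\pi} + \partial_\mu\partial_\nu g_{\pi\lambda} - \partial_\mu\partial_\pi g_{\lambda\nu} - \partial_\nu\partial_\lambda g_{\mu\pi} - \partial_\nu\partial_\mu g_{\pi\lambda} + \partial_\nu\partial_\pi g_{\lambda\mu}\right), \]

the second & fifth terms cancel each other (as partial derivatives commute), leaving

\[ R^\rho{}_{\lambda\mu\nu} = \frac12 g^{\rho\pi}\left(\partial_\mu\partial_\lambda g_{\nu\pi} - \partial_\mu\partial_\pi g_{\lambda\nu} - \partial_\nu\partial_\lambda g_{\mu\pi} + \partial_\nu\partial_\pi g_{\lambda\mu}\right). \]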


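Now, to get rid of the metric multiplying the bracket, we form

\[ \begin{aligned} R_{\alpha\lambda\mu\nu} &= g_{\alpha\rho}R^\rho{}_{\lambda\mu\nu} \\ &= \frac12 g_{\alpha\rho}g^{\rho\pi}\left(\partial_\mu\partial_\lambda g_{\nu\pi} - \partial_\mu\partial_\pi g_{\lambda\nu} - \partial_\nu\partial_\lambda g_{\mu\pi} + \partial_\nu\partial_\pi g_{\lambda\mu}\right) \\ &= \frac12\delta^\pi_\alpha\left(\partial_\mu\partial_\lambda g_{\nu\pi} - \partial_\mu\partial_\pi g_{\lambda\nu} - \partial_\nu\partial_\lambda g_{\mu\pi} + \partial_\nu\partial_\pi g_{\lambda\mu}\right) \\ &= \frac12\left(\partial_\mu\partial_\lambda g_{\nu\alpha} - \partial_\mu\partial_\alpha g_{\lambda\nu} - \partial_\nu\partial_\lambda g_{\mu\alpha} + \partial_\nu\partial_\alpha g_{\lambda\mu}\right). \end{aligned} \]

Of course, it must be clear that this is only valid in a LIF. Now, although the above expression is only valid in a LIF, the resulting symmetries are valid everywhere (as the Riemann tensor is a tensor). We see that

\[ R_{\alpha\lambda\mu\nu} = -R_{\lambda\alpha\mu\nu} = -R_{\alpha\lambda\nu\mu} = R_{\mu\nu\alpha\lambda} = R_{\lambda\alpha\nu\mu}. \qquad (4.3) \]

And further that

\[ R_{\alpha\lambda\mu\nu} + R_{\alpha\mu\nu\lambda} + R_{\alpha\nu\lambda\mu} = 0. \qquad (4.4) \]

This can also be denoted

\[ R_{\alpha(\lambda\mu\nu)} = 0, \]

where the notation is understood to mean cyclic interchange, and sum, over bracketed indices.

Theorem We state (without proof), that if all components of the Riemann tensor are zero, then the space is flat. That is

\[ R_{\lambda\mu\nu\delta} = 0 \quad\Rightarrow\quad \text{flat space}. \]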

4.1.2 The Round Trip

Now, although we shall not go into the details here (we have already presented a full mathematical treatment, however, of the Riemann tensor), one can show that the Riemann tensor comes about from a round-trip around a rectangle.

Consider a rectangle, with horizontal sides of length $\Delta x^\mu$ and vertical sides of length $\delta x^\mu$. Then, if one makes the parallel-transported round trip A → B → C → D → A, and if one computes the coordinate shift (merely due to displacement) at each vertex, then one finds that

\[ A^\rho_1 = \left(1 + \delta x^\mu\Delta x^\nu[\nabla_\mu,\nabla_\nu]\right)A^\rho_0, \]

where $A^\rho_1$ is the component of $A$ at the starting point, after making the round trip (i.e. one starts at $A^\rho_0$). Then,

\[ A^\rho_1 - A^\rho_0 = \Delta A^\rho = \delta x^\mu\Delta x^\nu[\nabla_\mu,\nabla_\nu]A^\rho_0. \]


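Now, we see that $[\nabla_\mu,\nabla_\nu]A^\rho_0 = R^\rho{}_{\alpha\mu\nu}A^\alpha$, and so,

\[ \Delta A^\rho = \delta x^\mu\Delta x^\nu R^\rho{}_{\alpha\mu\nu}A^\alpha. \]

Now, notice that

\[ \begin{aligned} R^\rho{}_{\alpha\mu\nu}\delta x^\mu\Delta x^\nu &= \frac12\left(R^\rho{}_{\alpha\mu\nu}\delta x^\mu\Delta x^\nu + R^\rho{}_{\alpha\nu\mu}\delta x^\nu\Delta x^\mu\right) \\ &= \frac12\left(R^\rho{}_{\alpha\mu\nu}\delta x^\mu\Delta x^\nu - R^\rho{}_{\alpha\mu\nu}\delta x^\nu\Delta x^\mu\right) \\ &= \frac12 R^\rho{}_{\alpha\mu\nu}\Delta S^{\mu\nu}, \end{aligned} \]

where we have used the anti-symmetry of the Riemann tensor’s last two indices. Also, we have defined $\Delta S^{\mu\nu} \equiv \delta x^\mu\Delta x^\nu - \delta x^\nu\Delta x^\mu$. Therefore, we can write the round-trip expression as

\[ \Delta A^\rho = \frac12\Delta S^{\mu\nu}R^\rho{}_{\alpha\mu\nu}A^\alpha. \]

Therefore, we have a semi-geometrical interpretation of the Riemann tensor. It is able to tell us the difference in the orientation of a vector, after making a round trip about a rectangle, in a manifold.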

4.2 The Ricci Identity

We call the commutator we defined above, the Ricci identity. That is, the Ricci identity is

\[ [\nabla_\mu,\nabla_\nu]A^\rho = R^\rho{}_{\lambda\mu\nu}A^\lambda, \]

where $R^\rho{}_{\lambda\mu\nu}$ is the Riemann tensor, in a torsion-less manifold.

4.3 The Ricci Tensor & Scalar

If we contract the Riemann tensor on its first & third indices,

\[ R^\rho{}_{\lambda\rho\nu} = g^{\rho\alpha}R_{\alpha\lambda\rho\nu}, \]

we have a quantity we define

\[ R_{\lambda\nu} \equiv R^\rho{}_{\lambda\rho\nu}. \]

If we further contract $R_{\lambda\nu}$,

\[ R \equiv g^{\lambda\nu}R_{\lambda\nu}. \]

Thus, we have what we define the Ricci tensor, $R_{\mu\nu}$, and Ricci scalar, $R$. By the symmetries of the Riemann tensor above, we can easily see that the Ricci tensor is symmetric.

Now, we stated that the condition for flat space was that all components of the Riemann tensor were zero; if only the Ricci scalar is zero, then the space is not necessarily flat. One can see this, as upon contraction, some non-zero components may cancel each other out in summation.


4.3.1 Example: The Unit 2-Sphere

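Consider the line element (that of the unit 2-sphere)

\[ ds^2 = d\theta^2 + \sin^2\theta\,d\phi^2, \]

and suppose that we are given that

\[ R_{\theta\phi\theta\phi} = \sin^2\theta \]

is the only independent non-zero component of the Riemann tensor (obviously we can find the other non-zero components by symmetry relations); then, we can compute the Ricci scalar $R$.

The non-zero components of the metric are easily read off the line element;

\[ g_{\theta\theta} = g^{\theta\theta} = 1, \qquad g_{\phi\phi} = \sin^2\theta, \qquad g^{\phi\phi} = \frac{1}{\sin^2\theta}. \]

Now,

\[ R^\theta{}_{\phi\theta\phi} = g^{\theta\theta}R_{\theta\phi\theta\phi} = \sin^2\theta. \]

Now, by symmetry of the Riemann tensor,

\[ R_{\theta\phi\theta\phi} = -R_{\phi\theta\theta\phi} = -R_{\theta\phi\phi\theta} = R_{\phi\theta\phi\theta}. \]

Now, the Ricci tensor is found by contraction,

\[ R_{ij} = g^{nm}R_{nimj}. \]

We are slightly fortunate in that the metric is diagonal. So,

\[ \begin{aligned} R_{\theta\theta} &= g^{ij}R_{i\theta j\theta} \\ &= g^{\theta\theta}R_{\theta\theta\theta\theta} + g^{\phi\phi}R_{\phi\theta\phi\theta} \\ &= 1 \cdot 0 + \frac{1}{\sin^2\theta}\cdot\sin^2\theta \\ &= 1. \end{aligned} \]

Also,

\[ R_{\theta\phi} = g^{\theta\theta}R_{\theta\theta\theta\phi} + g^{\phi\phi}R_{\phi\theta\phi\phi} = 0. \]

And,

\[ \begin{aligned} R_{\phi\phi} &= g^{\theta\theta}R_{\theta\phi\theta\phi} + g^{\phi\phi}R_{\phi\phi\phi\phi} \\ &= 1\cdot\sin^2\theta + 0 \\ &= \sin^2\theta. \end{aligned} \]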


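Therefore, the Ricci scalar,

\[ \begin{aligned} R &= g^{ij}R_{ij} \\ &= g^{\theta\theta}R_{\theta\theta} + g^{\phi\phi}R_{\phi\phi} \\ &= 1 + \frac{1}{\sin^2\theta}\sin^2\theta \\ &= 2. \end{aligned} \]

Therefore, the Ricci scalar is 2 for this metric, which is that of the unit 2-sphere.

Now, if we were to repeat this, for the line element of flat 3-dimensional space in spherical polars,

\[ ds^2 = dr^2 + r^2d\theta^2 + r^2\sin^2\theta\,d\phi^2, \]

we would find that R = 0.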
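The contractions in this example are mechanical, so they are easy to reproduce in a few lines. The sketch below takes the given component $R_{\theta\phi\theta\phi} = \sin^2\theta$, fills in the other components by the symmetries (4.3), and contracts with the inverse metric; the function name and the dictionary layout are illustrative choices.

```python
import math
from itertools import product


def ricci_unit_sphere(theta):
    """Return (Ricci tensor, Ricci scalar) for ds^2 = dtheta^2 + sin^2(theta) dphi^2.

    Indices: 0 = theta, 1 = phi. The only input is R_{theta phi theta phi}.
    """
    s2 = math.sin(theta) ** 2
    g_inv = [[1.0, 0.0], [0.0, 1.0 / s2]]  # inverse metric g^{ab}

    # Fully covariant Riemann components R_{abcd}, zero unless set below.
    riem = {idx: 0.0 for idx in product(range(2), repeat=4)}
    riem[(0, 1, 0, 1)] = s2    # given component
    riem[(1, 0, 1, 0)] = s2    # swap both index pairs
    riem[(1, 0, 0, 1)] = -s2   # antisymmetry in the first pair
    riem[(0, 1, 1, 0)] = -s2   # antisymmetry in the last pair

    # R_{ij} = g^{nm} R_{nimj}
    ricci = [[sum(g_inv[n][m] * riem[(n, i, m, j)]
                  for n in range(2) for m in range(2))
              for j in range(2)] for i in range(2)]

    # R = g^{ij} R_{ij}
    scalar = sum(g_inv[i][j] * ricci[i][j] for i in range(2) for j in range(2))
    return ricci, scalar


theta0 = 0.9  # illustrative angle; the scalar should not depend on it
ricci, scalar = ricci_unit_sphere(theta0)
print(ricci, scalar)
```

The Ricci tensor comes out proportional to the metric, with $R_{\theta\theta} = 1$ and $R_{\phi\phi} = \sin^2\theta$, and the scalar is independent of $\theta$.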

4.4 The Bianchi Identity

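Consider the Riemann tensor, in a LIF,

\[ R^\rho{}_{\lambda\mu\nu} = \partial_\mu\Gamma^\rho_{\lambda\nu} - \partial_\nu\Gamma^\rho_{\lambda\mu}. \]

Then, let us differentiate it,

\[ \nabla_\pi R^\rho{}_{\lambda\mu\nu} = \nabla_\pi\partial_\mu\Gamma^\rho_{\lambda\nu} - \nabla_\pi\partial_\nu\Gamma^\rho_{\lambda\mu}. \]

Now, even though we don’t know how to evaluate these expressions, we can still cycle indices to see what happens. So, making the change

\[ \pi \to \mu \to \nu \to \pi, \]

then

\[ \nabla_\mu R^\rho{}_{\lambda\nu\pi} = \nabla_\mu\partial_\nu\Gamma^\rho_{\lambda\pi} - \nabla_\mu\partial_\pi\Gamma^\rho_{\lambda\nu}, \]

and again,

\[ \nabla_\nu R^\rho{}_{\lambda\pi\mu} = \nabla_\nu\partial_\pi\Gamma^\rho_{\lambda\mu} - \nabla_\nu\partial_\mu\Gamma^\rho_{\lambda\pi}. \]

Now, if we add these 3 expressions,

\[ \begin{aligned} \nabla_\pi R^\rho{}_{\lambda\mu\nu} + \nabla_\mu R^\rho{}_{\lambda\nu\pi} + \nabla_\nu R^\rho{}_{\lambda\pi\mu} = {}& \nabla_\pi\partial_\mu\Gamma^\rho_{\lambda\nu} - \nabla_\pi\partial_\nu\Gamma^\rho_{\lambda\mu} \\ & + \nabla_\mu\partial_\nu\Gamma^\rho_{\lambda\pi} - \nabla_\mu\partial_\pi\Gamma^\rho_{\lambda\nu} \\ & + \nabla_\nu\partial_\pi\Gamma^\rho_{\lambda\mu} - \nabla_\nu\partial_\mu\Gamma^\rho_{\lambda\pi}. \end{aligned} \]

Now, in a LIF, the Christoffel symbols are zero. Therefore, the covariant derivative is the same as the “usual” partial derivative. So, rather than changing the above symbols, we let the covariant and partial derivatives swap indices, since partial derivatives commute. Then, we can see that the entire RHS cancels itself out in pairs, leaving

\[ \nabla_\pi R^\rho{}_{\lambda\mu\nu} + \nabla_\mu R^\rho{}_{\lambda\nu\pi} + \nabla_\nu R^\rho{}_{\lambda\pi\mu} = 0. \]

Now, if we lower the $\rho$-index (using a metric),

\[ \nabla_\pi R_{\rho\lambda\mu\nu} + \nabla_\mu R_{\rho\lambda\nu\pi} + \nabla_\nu R_{\rho\lambda\pi\mu} = 0. \]

And, using the symmetry property that $R_{\alpha\beta\gamma\delta} = R_{\gamma\delta\alpha\beta}$, then

\[ \nabla_\pi R_{\mu\nu\rho\lambda} + \nabla_\mu R_{\nu\pi\rho\lambda} + \nabla_\nu R_{\pi\mu\rho\lambda} = 0, \]

which we see is just a cyclic interchange of the first three indices of the whole expression. That is,

\[ \nabla_{(\pi}R_{\mu\nu)\rho\lambda} = 0. \]

Hence, we have arrived at our result. The Bianchi identity is that

\[ \nabla_\pi R_{\mu\nu\rho\lambda} + \nabla_\nu R_{\pi\mu\rho\lambda} + \nabla_\mu R_{\nu\pi\rho\lambda} = 0. \qquad (4.5) \]

The Bianchi identity is in fact the equivalent of the rectangular round-trip expression we derived above. The Bianchi identity will come about if one considers the difference in orientation of a vector being parallelly-transported around a cuboid.

Although we derived the Bianchi identity with the Riemann tensor in a LIF, the expression is completely valid in all frames. This is because the Riemann tensor is a tensor; and a tensor equation has the same form in all frames. Thus, one begins to see the power of getting expressions into tensorial form, and of the local inertial frame.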

4.5 The Einstein Tensor

Now, let us take the Bianchi identity,

\[ \nabla_\pi R_{\mu\nu\rho\lambda} + \nabla_\nu R_{\pi\mu\rho\lambda} + \nabla_\mu R_{\nu\pi\rho\lambda} = 0. \]

Now, let us figure out how to contract this expression, so that we have Ricci tensors, rather than Riemann tensors. Now, if we multiply the expression by $g^{\mu\lambda}$, then we will have achieved our goal (one can see that the indices of this metric are those on the first and last parts of the first Riemann tensor). However, let us do this methodically. So, the first term will read,

\[ g^{\mu\lambda}\nabla_\pi R_{\mu\nu\rho\lambda} = -g^{\mu\lambda}\nabla_\pi R_{\mu\nu\lambda\rho} = -\nabla_\pi R_{\nu\rho}, \]

after using the anti-symmetry of the last two indices of the Riemann tensor. The second term can be rewritten, using the symmetry under simultaneous interchange within the first two & within the last two indices of the Riemann tensor;

\[ g^{\mu\lambda}\nabla_\nu R_{\pi\mu\rho\lambda} = g^{\mu\lambda}\nabla_\nu R_{\mu\pi\lambda\rho} = \nabla_\nu R_{\pi\rho}. \]


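Lastly, the final term of the contracted Bianchi identity is just

\[ g^{\mu\lambda}\nabla_\mu R_{\nu\pi\rho\lambda} = \nabla^\lambda R_{\nu\pi\rho\lambda}. \]

Therefore, putting these all together, our contracted Bianchi identity looks like

\[ \nabla_\nu R_{\pi\rho} - \nabla_\pi R_{\nu\rho} + \nabla^\lambda R_{\nu\pi\rho\lambda} = 0. \]

Now, multiplying this whole expression by $g^{\nu\rho}$ will contract the last Riemann tensor into a Ricci tensor; as well as the middle Ricci tensor into a Ricci scalar. Thus,

\[ g^{\nu\rho}\nabla_\nu R_{\pi\rho} - g^{\nu\rho}\nabla_\pi R_{\nu\rho} + g^{\nu\rho}\nabla^\lambda R_{\nu\pi\rho\lambda} = 0, \]

\[ \Rightarrow\quad \nabla^\rho R_{\pi\rho} - \nabla_\pi R + \nabla^\lambda R_{\pi\lambda} = 0. \]

Now, the first and last expressions are identical, as we can relabel the dummy indices at will. Therefore, we have

\[ 2\nabla^\rho R_{\pi\rho} - \nabla_\pi R = 0. \]

Then, notice that we can write

\[ 2\nabla^\rho R_{\pi\rho} - g_{\pi\rho}\nabla^\rho R = 0, \]

and therefore that

\[ \nabla^\rho\left(R_{\pi\rho} - \frac12 g_{\pi\rho}R\right) = 0. \]

Therefore, we can define the Einstein tensor,

\[ G_{\mu\nu} \equiv R_{\mu\nu} - \frac12 g_{\mu\nu}R, \qquad (4.6) \]

whereby

\[ \nabla^\mu G_{\mu\nu} = 0; \qquad (4.7) \]

after noting that the metric, Ricci & therefore the Einstein tensor are symmetric. This is called the contracted Bianchi identity.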

4.6 Geodesic Deviation

Suppose we take, on flat space, two affinely parameterised geodesics $x^\mu(\tau)$, $y^\mu(\tau)$, that are on a collision course. That is, the distance between the two lines,

\[ \delta^\mu(\tau) \equiv x^\mu(\tau) - y^\mu(\tau), \]

decreases. On flat space, the distance will decrease linearly. That is,

\[ \frac{d\delta^\mu}{d\tau} = \text{const} \quad\Rightarrow\quad \frac{d^2\delta^\mu}{d\tau^2} = 0. \]


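Now consider a curved space, and let the paths have tangent vector $T^\mu$. Then, the distance between the two won’t decrease linearly. Instead, they will accelerate; thus

\[ \frac{D^2\delta^\mu}{D\tau^2} = R^\mu{}_{\alpha\beta\rho}T^\alpha T^\beta\delta^\rho. \qquad (4.8) \]

To imagine this in a physical situation, consider two balls falling towards the centre of the earth. Now, the balls will obviously move towards each other, as their motion is radial. However, there will be deviation from radial, and that deviation will be due to the curvature of space. That is, one will observe the balls accelerate towards each other (rather than the expected linear motion towards each other).

Derivation We can derive the geodesic deviation equation, by considering the 2-dim manifold swept out by two affinely parameterised geodesics, next to each other. The manifold may be parameterised by $x^\mu = x^\mu(\tau,\sigma)$ (i.e. two coordinates on this surface, rather than the usual one, on a curve). The tangent vectors are

\[ T^\mu = \frac{dx^\mu}{d\tau}, \qquad \delta^\mu = \frac{dx^\mu}{d\sigma}. \]

Now, let us show a useful relation. Consider

\[ \begin{aligned} T^\mu\nabla_\mu\delta^\nu &= T^\mu\left(\partial_\mu\delta^\nu + \Gamma^\nu_{\mu\lambda}\delta^\lambda\right) \\ &= \frac{dx^\mu}{d\tau}\frac{\partial}{\partial x^\mu}\frac{dx^\nu}{d\sigma} + \Gamma^\nu_{\mu\lambda}T^\mu\delta^\lambda. \end{aligned} \]

Now, the first term can be rewritten

\[ \frac{dx^\mu}{d\tau}\frac{\partial}{\partial x^\mu}\frac{dx^\nu}{d\sigma} = \frac{d^2x^\nu}{d\tau\,d\sigma} = \frac{d^2x^\nu}{d\sigma\,d\tau} = \frac{dx^\mu}{d\sigma}\frac{\partial}{\partial x^\mu}\frac{dx^\nu}{d\tau}. \]

Hence, we use this to see that

\[ \begin{aligned} T^\mu\nabla_\mu\delta^\nu &= \frac{dx^\mu}{d\sigma}\frac{\partial}{\partial x^\mu}\frac{dx^\nu}{d\tau} + \Gamma^\nu_{\mu\lambda}T^\mu\delta^\lambda \\ &= \delta^\mu\partial_\mu T^\nu + \Gamma^\nu_{\lambda\mu}T^\lambda\delta^\mu \\ &= \delta^\mu\nabla_\mu T^\nu, \end{aligned} \]

where we have merely relabelled dummy indices and used the symmetry of the Christoffel symbol. Hence, we have the relation

\[ T^\mu\nabla_\mu\delta^\nu = \delta^\mu\nabla_\mu T^\nu. \qquad (4.9) \]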


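Now, let us state the operator

\[ \frac{D^2}{D\tau^2} = T^\alpha\nabla_\alpha T^\beta\nabla_\beta, \]

and compute its action upon $\delta^\mu$,

\[ \frac{D^2\delta^\mu}{D\tau^2} = T^\alpha\nabla_\alpha\left(T^\beta\nabla_\beta\delta^\mu\right). \]

Now, we use our relation (4.9), to see that

\[ \frac{D^2\delta^\mu}{D\tau^2} = T^\alpha\nabla_\alpha\left(\delta^\beta\nabla_\beta T^\mu\right). \]

Let us now expand this out,

\[ \frac{D^2\delta^\mu}{D\tau^2} = \left(T^\alpha\nabla_\alpha\delta^\beta\right)\nabla_\beta T^\mu + T^\alpha\delta^\beta\nabla_\alpha\nabla_\beta T^\mu. \]

Now, we can rewrite the two-covariant derivatives term on the far RHS, using the Ricci identity,

\[ [\nabla_\mu,\nabla_\nu]A^\rho = R^\rho{}_{\lambda\mu\nu}A^\lambda, \]

so that we have

\[ \begin{aligned} \frac{D^2\delta^\mu}{D\tau^2} &= \left(T^\alpha\nabla_\alpha\delta^\beta\right)\nabla_\beta T^\mu + T^\alpha\delta^\beta R^\mu{}_{\lambda\alpha\beta}T^\lambda + T^\alpha\delta^\beta\nabla_\beta\nabla_\alpha T^\mu \\ &= \left(\delta^\alpha\nabla_\alpha T^\beta\right)\nabla_\beta T^\mu + T^\beta\delta^\alpha\nabla_\alpha\nabla_\beta T^\mu + R^\mu{}_{\lambda\alpha\beta}T^\alpha T^\lambda\delta^\beta \\ &= \delta^\alpha\left(\nabla_\alpha T^\beta\,\nabla_\beta T^\mu + T^\beta\nabla_\alpha\nabla_\beta T^\mu\right) + R^\mu{}_{\lambda\alpha\beta}T^\alpha T^\lambda\delta^\beta. \end{aligned} \]

In the first step we used our relation (4.9) on the first term, and changed dummy indices on the third term, then we merely factorised the expression. Now, notice that the bracketed term can be written

\[ \nabla_\alpha T^\beta\,\nabla_\beta T^\mu + T^\beta\nabla_\alpha\nabla_\beta T^\mu = \nabla_\alpha\left(T^\beta\nabla_\beta T^\mu\right), \]

but the bracketed part on the RHS is zero on an affinely parameterised geodesic. Hence,

\[ \frac{D^2\delta^\mu}{D\tau^2} = R^\mu{}_{\lambda\alpha\beta}T^\alpha T^\lambda\delta^\beta, \]

or, trivially relabelling indices, we arrive at our equation for geodesic deviation

\[ \frac{D^2\delta^\mu}{D\tau^2} = R^\mu{}_{\alpha\beta\rho}T^\alpha T^\beta\delta^\rho. \qquad (4.10) \]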


5 Einstein’s Equation

We almost have enough tools to be able to write Einstein’s equation.

We have seen that freely-falling particles follow geodesics. In curved spacetime, the geodesics will probably be curves. So then, what makes the spacetime curved?

If we consider electromagnetic theory, there is a source for the electric field: the electron. For a field, there is a source. Therefore, we need a source term that will curve spacetime. We shall now discuss a term that is the “gravitation source term”.

5.1 The Energy Momentum Tensor $T^{\mu\nu}$

We shall start by stating that there exists a tensor $T^{\mu\nu}$, which is symmetric. That is, $T^{\mu\nu} = T^{\nu\mu}$. Furthermore, we shall state that the components of this tensor contain all possible forms of energy and momentum (it will be this tensor which is the source-term). Let us state how to compute a given component of the tensor.

A given element $T^{\mu\nu}$ is the flux of $p^\mu$ that goes through the hypersurface $x^\nu = \text{const}$.

The structure of the tensor is clearly

\[ (T^{\mu\nu}) = \begin{pmatrix} T^{tt} & T^{ti} \\ T^{it} & T^{ij} \end{pmatrix}. \]

Also, before we start to compute the components of the tensor, we must state that the full 4-volume is just $\Delta t\,\Delta x\,\Delta y\,\Delta z$.

5.1.1 Components of T µν

Let’s consider $T^{00} = T^{tt}$. Then, by our definition, that element is the flux of $p^0$ through the surface $x^0 = \text{const}$. Now, $p^0 = E$ and $x^0 = t$. Therefore, we see that $T^{tt}$ is the flux of energy $E$ through a 3-volume $\Delta x\,\Delta y\,\Delta z$ (it is the hypersurface that holds $x^0 = t$ constant). Therefore,

\[ T^{tt} = \frac{E}{\Delta x\,\Delta y\,\Delta z} \equiv \varepsilon, \]

that is, the energy per unit volume, the energy density $\varepsilon$.

Consider the component $T^{01} = T^{tx}$. Then, we see that it is the flux of $p^0 = E$ through the hypersurface $x^1 = x = \text{const}$. That is,

\[ T^{01} = \frac{E}{\Delta t\,\Delta y\,\Delta z}, \]

which has the interpretation of being the energy flux through the y − z plane, in unit time. This is easily extrapolated to any term $T^{0i}$: the energy flux through a surface, in unit time.

Now consider the purely-spatial components, $T^{ij}$. For example,

\[ T^{ix} = \frac{\Delta p^i}{\Delta t\,\Delta y\,\Delta z}. \]

Now, notice that we can rewrite this,

\[ T^{ix} = \frac{\Delta p^i/\Delta t}{A_{yz}}, \qquad A_{yz} \equiv \Delta y\,\Delta z; \]

where we have fairly obviously defined an area-element. A change in momentum per unit time is just a force. Thus,

\[ T^{ix} = \frac{F^i}{A_{yz}}, \]

which is a force per unit area: a pressure. Consider the specific component,

\[ T^{yx} = \frac{\Delta p^y/\Delta t}{\Delta y\,\Delta z}, \]

then, using the relation $p^i = v^iE$, we see that

\[ T^{yx} = \frac{\Delta v^yE/\Delta t}{\Delta y\,\Delta z}. \]

So, as $v^i = x^i/t$, we see that this is just

\[ T^{yx} = \frac{(\Delta y/\Delta t)\,E/\Delta t}{\Delta y\,\Delta z}. \]

Now then, as the $\Delta y$’s cancel, we can just replace them with $\Delta x$’s, thus

\[ \begin{aligned} T^{yx} &= \frac{(\Delta x/\Delta t)\,E/\Delta t}{\Delta x\,\Delta z} \\ &= \frac{\Delta v^xE/\Delta t}{\Delta x\,\Delta z} \\ &= \frac{\Delta p^x/\Delta t}{\Delta x\,\Delta z} = T^{xy}. \end{aligned} \]

Therefore, with this little exercise, we see that the spatial components of the energy-momentum tensor are in fact symmetric. One may be able to see that the diagonal components of the spatial part, those $T^{ii}$, correspond to the force perpendicular to a surface. The off-diagonal elements are the force parallel to a surface (shear). Therefore, the spatial components, $T^{ij}$, are components of the stress-tensor.

The final part to the tensor, are those components $T^{it}$. Thus, we see that they are the flow of $p^i$ through the hypersurface $t = \text{const}$. That is, the momentum flow in a given 3-volume, at a constant time. That is, how much momentum there exists in a unit volume, at a single time. This is clearly the momentum density.

\[ T^{it} = \frac{\Delta p^i}{\Delta x\,\Delta y\,\Delta z} \equiv \pi^i. \]

To see that $T^{it} = T^{ti}$, consider the above expression; writing $p^i = v^iE = x^iE/t$, then,

\[ T^{it} = \frac{(\Delta x^i/\Delta t)\,E}{\Delta x\,\Delta y\,\Delta z} = \frac{\Delta x^i\,E}{\Delta t\,\Delta x\,\Delta y\,\Delta z}. \]

Then, as $i$ is changed through $i = x, y, z$, different components on the denominator will be cancelled out, leaving only those in the corresponding $T^{ti}$.

Therefore, we have seen that the energy-momentum tensor $T^{\mu\nu}$ is symmetric, by considering its components. The colloquial construction of the tensor is thus

\[ (T^{\mu\nu}) = \begin{pmatrix} \text{energy density} & \text{energy flux} \\ \text{momentum density} & \text{stress tensor} \end{pmatrix}. \]

We shall write that $T^{it} = \pi^i$, $T^{tt} = \varepsilon$.

5.1.2 Conservation Equations

The standard conservation equation is that

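\[ \nabla_\nu T^{\mu\nu} = 0. \qquad (5.1) \]

Let us consider this in a LIF. Then, it simply reads $\partial_\nu T^{\mu\nu} = 0$.

Now, take the time-component, $\mu = 0 = t$. Then, the conservation equation reads

\[ \frac{\partial}{\partial t}T^{tt} + \frac{\partial}{\partial x^i}T^{ti} = 0, \]

which is just

\[ \frac{\partial\varepsilon}{\partial t} + \frac{\partial\pi^i}{\partial x^i} = 0. \]

This equation can be written

\[ \frac{\partial\varepsilon}{\partial t} + \nabla\cdot\boldsymbol{\pi} = 0, \]

and is the familiar continuity equation, for energy. Note that this is only valid in a LIF.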


Let us take the spatial components, $\mu = i$, of the conservation equation. Thus,

\[ \frac{\partial}{\partial t}T^{it} + \frac{\partial}{\partial x^j}T^{ij} = 0. \]

Now, if we write the force density, in a given direction

\[ \varphi^i \equiv -\frac{\partial T^{ij}}{\partial x^j}, \]

then we see that the conservation equation reads

\[ \frac{\partial\pi^i}{\partial t} - \varphi^i = 0, \]

which is just the statement that the rate of change of momentum density is the force density. This is the familiar statement of Newton’s second law. That is, the above equation is just

\[ \frac{\partial\boldsymbol{\pi}}{\partial t} = \boldsymbol{\varphi}. \]

Therefore, we see that the energy-momentum tensor $T^{\mu\nu}$ contains all sources of energy and momentum, and satisfies basic conservation relations.

5.1.3 Perfect Fluids

A perfect fluid is defined to be one for which there is no viscosity or heat conduction. This “restriction” makes the energy-momentum tensor look a lot simpler.

That a fluid has no heat conduction means that there is no transfer of energy, across surfaces. Viscous forces are those which are parallel to a surface (shear). Thus, the absence of such forces, implies that all forces on surfaces are perpendicular to those surfaces.

Therefore, if we consider our previous “derivation” of the components of $T^{\mu\nu}$, we see that it must be diagonal. This is because

• No heat conduction implies no energy flux. Therefore, $T^{ti} = T^{it} = 0$.

• No viscosity means that all parallel forces are zero. This only leaves diagonal components to the stress tensor. All components left-over are just pressures (as discussed previously), $P$.

We shall change notation slightly, so that $\rho$ is the energy density (which is clearly the case, via $\varepsilon = \rho c^2$, with $c = 1$). Therefore, we see that for a perfect fluid, at rest,

\[ T^{\mu\nu} = \begin{pmatrix} \rho & 0 & 0 & 0 \\ 0 & P & 0 & 0 \\ 0 & 0 & P & 0 \\ 0 & 0 & 0 & P \end{pmatrix} = \mathrm{diag}(\rho, P, P, P). \]


The Perfect Fluid Tensor The general expression for the energy-momentum tensor, for a perfect fluid in its LIF is

\[ T^{\mu\nu} = (\rho + P)u^\mu u^\nu - Pg^{\mu\nu}, \qquad (5.2) \]

where $u^\mu = \gamma(1, \mathbf{u})$ is the 4-velocity, $\rho$ the energy density and $P$ the pressure of the fluid. If we take this tensor, with the fluid at rest in its LIF, in flat space, so that $u^\mu = (1, 0, 0, 0)$ and $g^{\mu\nu} = \eta^{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$, then, some of the components are

\[ T^{00} = (\rho + P) - P = \rho, \]

\[ T^{12} = 0, \]

\[ T^{ii} = P \quad \text{(no sum)}. \]

In fact, all off-diagonal components are zero. Then, we see that we have recovered our previous expression for a perfect fluid at rest.
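The components of (5.2) are simple to tabulate. The sketch below builds $T^{\mu\nu}$ in flat space with signature $(+,-,-,-)$ and units $c = 1$, first for a fluid at rest and then for a fluid moving along $x$; the numerical values of `rho`, `P` and the velocity are arbitrary illustrative choices, as are the function names.

```python
import math

# Minkowski metric, signature (+, -, -, -); its inverse has the same components.
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]


def four_velocity(v):
    """u^mu = gamma * (1, v) for a 3-velocity v, in units with c = 1."""
    vx, vy, vz = v
    gamma = 1.0 / math.sqrt(1.0 - (vx * vx + vy * vy + vz * vz))
    return [gamma, gamma * vx, gamma * vy, gamma * vz]


def perfect_fluid(rho, P, v=(0.0, 0.0, 0.0)):
    """T^{mu nu} = (rho + P) u^mu u^nu - P eta^{mu nu}."""
    u = four_velocity(v)
    return [[(rho + P) * u[m] * u[n] - P * ETA[m][n] for n in range(4)]
            for m in range(4)]


def trace(T):
    """eta_{mu nu} T^{mu nu}."""
    return sum(ETA[m][n] * T[m][n] for m in range(4) for n in range(4))


rho, P = 1.0, 0.3  # illustrative energy density and pressure
T_rest = perfect_fluid(rho, P)
T_moving = perfect_fluid(rho, P, v=(0.6, 0.0, 0.0))

print([T_rest[m][m] for m in range(4)])   # diagonal: rho, P, P, P
print(trace(T_rest), trace(T_moving))     # both equal rho - 3P
```

At rest the tensor is $\mathrm{diag}(\rho, P, P, P)$, while for the moving fluid the individual components change but the trace $\rho - 3P$ does not, as expected of a scalar.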

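We can easily recover some standard fluid mechanics results from the perfect fluid tensor. Suppose we have a non-relativistic pressure-less fluid, $P = 0$, then, the energy-conservation equation is just

\[ \partial_\mu(\rho u^\mu u^0) = 0, \]

which, as $\gamma \approx 1$ for non-relativistic motion (so that $u^0 \approx 1$ and $u^i$ is just the ordinary velocity), easily becomes

\[ \frac{\partial\rho}{\partial t} + \frac{\partial(\rho u^i)}{\partial x^i} = 0, \]

which is just

\[ \frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{u}) = 0. \]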

5.2 Einstein’s Equation

We now have a source term. The sources of energy and momentum can be written “into” the energy-momentum tensor, $T^{\mu\nu}$; a tensor which satisfies the conservation equation.

Now, from the previous section’s contracted Bianchi identity,

\[ \nabla^\mu G_{\mu\nu} = 0, \qquad G_{\mu\nu} \equiv R_{\mu\nu} - \frac12 g_{\mu\nu}R, \]

we have an expression which takes care of the geometry of the spacetime. Recall that the Ricci tensor/scalar are composed of differentials (of various orders) of the metric, where the metric gives meaning to distances within a manifold. Then, if we can equate this expression to an expression which gives information as to what is doing the curving, then we have our general theory of relativity. We must use an expression which also has zero covariant derivative.

The obvious choice is the energy-momentum tensor. Therefore, we write

\[ G_{\mu\nu} = \kappa T_{\mu\nu}. \]

Therefore, up to a constant $\kappa$, we have a LHS which describes the geometry of a manifold, and a RHS which describes the distribution of all forms of energy and momentum in the manifold. Therefore, we say that the distribution of energy-momentum in a manifold causes the manifold to become curved.

Therefore, Einstein’s equation is

\[ R_{\mu\nu} - \frac12 g_{\mu\nu}R = \kappa T_{\mu\nu}. \qquad (5.3) \]

Notice that both sides have vanishing covariant derivative. We shall be able to find the constant $\kappa$ when we consider the Newtonian limit of the theory.

We can write this in an alternative form. Consider multiplying the whole expression by $g^{\mu\nu}$,

\[ g^{\mu\nu}\left(R_{\mu\nu} - \frac12 g_{\mu\nu}R\right) = \kappa g^{\mu\nu}T_{\mu\nu}, \]

then, writing the trace of the energy-momentum tensor $g^{\mu\nu}T_{\mu\nu} \equiv T$, and noting that the Ricci tensor becomes the Ricci scalar upon contraction, we see that

\[ R - \frac12 g^\mu{}_\mu R = \kappa T. \]

Now, the metric multiplied by its inverse is just the Kronecker-delta. Thus, $g^\mu{}_\mu = \delta^\mu_\mu = 4$. Therefore,

\[ R - \frac12\cdot 4R = \kappa T, \]

hence,

\[ R = -\kappa T. \]

Thus, we have written the Ricci scalar in terms of the trace of the energy-momentum tensor. Hence, we can write the Einstein equation as

\[ R_{\mu\nu} + \frac12 g_{\mu\nu}\kappa T = \kappa T_{\mu\nu}, \]

which is just

\[ R_{\mu\nu} = \kappa\left(T_{\mu\nu} - \frac12 g_{\mu\nu}T\right). \qquad (5.4) \]

This is an entirely equivalent form of Einstein’s equation.
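The equivalence of (5.3) and (5.4) is a purely algebraic statement at each point, so it can be checked with plain arithmetic. The sketch below assumes, for illustration only, that the metric at the point is $\eta_{\mu\nu}$, picks an arbitrary symmetric $T_{\mu\nu}$ and an arbitrary value of $\kappa$, builds $R_{\mu\nu}$ from (5.4), and confirms that (5.3) and $R = -\kappa T$ follow.

```python
# Metric at the point, taken to be Minkowski (+, -, -, -) purely for illustration.
# For this metric the inverse has the same components.
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]


def trace(S):
    """g^{mu nu} S_{mu nu} with g = ETA."""
    return sum(ETA[m][n] * S[m][n] for m in range(4) for n in range(4))


kappa = 1.7  # arbitrary illustrative coupling constant

# An arbitrary symmetric tensor T_{mu nu} (hypothetical numbers).
T = [[1.0 + m + n + 0.5 * m * n for n in range(4)] for m in range(4)]
T_trace = trace(T)

# Alternative form (5.4): R_{mu nu} = kappa (T_{mu nu} - 1/2 g_{mu nu} T).
ricci = [[kappa * (T[m][n] - 0.5 * ETA[m][n] * T_trace) for n in range(4)]
         for m in range(4)]
ricci_scalar = trace(ricci)

# Original form (5.3): G_{mu nu} = R_{mu nu} - 1/2 g_{mu nu} R.
einstein = [[ricci[m][n] - 0.5 * ETA[m][n] * ricci_scalar for n in range(4)]
            for m in range(4)]

max_diff = max(abs(einstein[m][n] - kappa * T[m][n])
               for m in range(4) for n in range(4))
print(ricci_scalar, -kappa * T_trace, max_diff)
```

The first two printed numbers agree, reproducing $R = -\kappa T$, and the last is zero to rounding error, reproducing $G_{\mu\nu} = \kappa T_{\mu\nu}$. The factor of 4 from $\delta^\mu_\mu$ is what makes the trace reversal work in four dimensions.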

5.2.1 The Cosmological Constant

Now, if we require the covariant derivative of an expression to be zero, we may add on an “extra term”, a constant, which will not change the value of the covariant derivative. The covariant derivative of the metric is zero, so we may add on any number of metrics and retain zero covariant derivative. Therefore,

\[ G_{\mu\nu} = \kappa T_{\mu\nu} + \Lambda g_{\mu\nu} \]

is still consistent with zero covariant derivative. So, why is this a problem?

Consider the expression, from electrodynamic theory, in a LIF,

\[ \partial_\mu F^{\mu\nu} = J^\nu. \]

Then, consider taking the differential of the expression,

\[ \partial_\nu\partial_\mu F^{\mu\nu} = \partial_\nu J^\nu = 0; \]

where the equality with zero comes from the “usual” conservation equation. Now, consider that we try to add on an extra term,

\[ \partial_\mu(F^{\mu\nu} + \Lambda\eta^{\mu\nu}). \]

Then, these two expressions are not the same. That is, we do not have the freedom to modify the field tensor by adding on an arbitrary quantity of metrics. The reason we are not able to do this, is because the field tensor is anti-symmetric, and the metric is symmetric.

Therefore, the reason we are able to add the constant metric term into Einstein’s equation, is because both the Einstein tensor $G_{\mu\nu}$ and energy-momentum tensor $T_{\mu\nu}$ are symmetric (as is the metric); as well as the metric having zero covariant derivative.

The cosmological constant $\Lambda$ has been measured to exist within the universe, having a very small numerical value. We shall usually define the cosmological constant within the energy-momentum tensor, so that we will essentially ignore it. However, it is to be understood that the term is within the energy-momentum tensor.

5.3 The Newtonian Limit

Let us discuss the correspondences of the theory of gravity on curved spacetime, with Newtonian gravity.

The equation of motion of a free particle, in Newton’s theory, is just given by Newton’s second law of motion,

\[ \frac{d^2x^i}{dt^2} = -\delta^{ij}\frac{\partial\Phi}{\partial x^j}, \]

where $\Phi$ is the gravitational potential a particle feels. The corresponding equation of motion for a free particle, in curved spacetime, is the geodesic

\[ \frac{d^2x^\mu}{d\tau^2} = -\Gamma^\mu_{\alpha\beta}\frac{dx^\alpha}{d\tau}\frac{dx^\beta}{d\tau}, \]

where the Christoffel symbol $\Gamma^\mu_{\alpha\beta}$ contains information about the geometry of the spacetime.

The field equation, which describes how “stuff” generates the gravitational field, for the Newtonian theory is

\[ \nabla^2\Phi = 4\pi G\rho_m. \]

That is, Poisson’s equation. This equation tells us that for some mass density $\rho_m$, there is an associated gravitational potential $\Phi$. Combined with the equation of motion, we see that a mass density gives rise to a gravitational potential, which affects how a free particle moves.

The field equation in general relativity, is Einstein’s equation,

\[ R_{\mu\nu} - \frac12 g_{\mu\nu}R = \kappa T_{\mu\nu}. \]

Some distribution of energy and momentum, defined within the energy-momentum tensor, gives rise to a different geometry. This geometrical information is then carried around by the Ricci tensor, within the metric. The metric then gives the Christoffel symbols, which change the equation of motion - the geodesic.

The basic correspondence is

\[ g_{\mu\nu} \longleftrightarrow \Phi, \qquad T_{\mu\nu} \longleftrightarrow \rho_m. \]

Notice that we have only been referring to “free-particles”. A free-particle is one which does not have any external influences on its motion. For example, this could mean a stone being dropped, in vacuum, from a building. The stone’s motion is only affected by the gravitational potential from the earth. Notice then, that the motion of a freely-falling particle in a curved spacetime is entirely due to the spacetime through which it moves. That is, its trajectory will be curved because of the geometry of the spacetime.

To modify these equations for a particle which is acted upon by an external force, $F_{\rm ext}$, one must merely add this to each component of the equation of motion.

5.3.1 Newtonian Gravity from Einstein’s Gravity

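Let us consider the geodesic equation, for a free particle,

\[ \frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{\alpha\beta}\frac{dx^\alpha}{d\tau}\frac{dx^\beta}{d\tau} = 0. \]

Now, let us consider the non-relativistic limit of this geodesic.

Firstly, for non-relativistic motion, $\tau \approx t$. Second, $dx^i/dt \ll 1$. Then, we can write the geodesic equation as

\[ \frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{00}\left(\frac{dt}{d\tau}\right)^2 + \mathcal{O}\left(\frac{dx^i}{d\tau}\right)^2 = 0, \]

which is just

\[ \frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{00} = 0. \]

Now then, to continue, we make an assumption about the metric. We say that the metric is Minkowskian, with a small perturbation,

\[ g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \qquad |h_{\mu\nu}| \ll 1. \]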


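We shall only work to first order in the perturbation. That is, we shall neglect any terms $\mathcal{O}(h^2)$. We further say that the perturbation is static. That is, $h_{\mu\nu} = h_{\mu\nu}(x^i)$ only; which immediately tells us that

\[ \partial_0h_{\mu\nu} = \partial_th_{\mu\nu} = 0. \]

Now, the general expression for the Christoffel symbol is

\[ \Gamma^\rho_{\alpha\beta} = \frac12 g^{\rho\nu}\left(\partial_\alpha g_{\beta\nu} + \partial_\beta g_{\nu\alpha} - \partial_\nu g_{\alpha\beta}\right). \]

Then, the components that we are interested in are just

\[ \Gamma^\mu_{00} = \frac12\sum_\nu g^{\mu\nu}\left(\partial_0g_{0\nu} + \partial_0g_{\nu 0} - \partial_\nu g_{00}\right). \]

Now, as the time-differential of the metric is zero, all but the last term is zero. We shall also drop the explicit summation sign;

\[ \Gamma^\mu_{00} = -\frac12 g^{\mu\nu}\partial_\nu g_{00}. \]

We shall now drop the greek index on the RHS, and only use roman. This is because the time-differential of the metric is zero. Thus,

\[ \Gamma^\mu_{00} = -\frac12 g^{\mu i}\partial_ig_{00}. \]

Inserting our expression for the metric,

\[ \begin{aligned} \Gamma^\mu_{00} &= -\frac12\left(\eta^{\mu i} - h^{\mu i}\right)\partial_i\left(\eta_{00} + h_{00}\right) \\ &= -\frac12\left(\eta^{\mu i}\partial_ih_{00} - h^{\mu i}\partial_ih_{00}\right). \end{aligned} \]

Now, the expression on the far right is $\mathcal{O}(h^2)$; thus, we ignore it. Therefore,

\[ \Gamma^\mu_{00} = -\frac12\eta^{\mu i}\partial_ih_{00}. \]

Finally, recall that the Minkowski metric is diagonal. Therefore, we only have a contribution for $\mu = i$. Therefore, as $\eta^{ii} = -1$, to first order in a static perturbation

\[ \Gamma^i_{00} = \frac12\partial_ih_{00}, \qquad \Gamma^0_{00} = 0. \]

Therefore, the geodesic equation is

\[ \frac{d^2t}{d\tau^2} = 0, \]

\[ \frac{d^2x^i}{d\tau^2} + \frac12\partial_ih_{00} = 0. \]

Now, we use the first expression to tell us that $dt = A\,d\tau$. We then set $A = 1$, to see that $t = \tau$. Therefore, the second expression is just

\[ \frac{d^2x^i}{dt^2} + \frac12\partial_ih_{00} = 0. \]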


Writing this as a vector equation, this is just

\[ \frac{d^2\mathbf{x}}{dt^2} + \frac12\nabla h_{00} = 0, \]

trivially rewriting results in

\[ \frac{d^2\mathbf{x}}{dt^2} = -\frac12\nabla h_{00}. \]

Now then, recall the Newtonian equation,

\[ \frac{d^2\mathbf{x}}{dt^2} = -\nabla\Phi(\mathbf{x}). \]

Then, we can read off the correspondence,

\[ h_{00} = 2\Phi. \]

Finally, as the metric is just $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, then

\[ g_{00} = 1 + 2\Phi. \]

Therefore, we see that the time-component of a static perturbation to the Minkowski metric is (twice) the gravitational potential.
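To get a feel for how small the perturbation is in practice, the sketch below evaluates $h_{00} = 2\Phi$ at the surface of the Earth. Since the notes use units with $c = 1$, the potential is made dimensionless as $\Phi = -GM/(rc^2)$; the physical constants are rounded textbook values and are assumptions of this illustration, not figures from the notes.

```python
# Rounded physical constants in SI units (illustrative values).
G_NEWTON = 6.674e-11      # m^3 kg^-1 s^-2
C_LIGHT = 2.998e8         # m s^-1
M_EARTH = 5.972e24        # kg
R_EARTH = 6.371e6         # m


def dimensionless_potential(mass, radius):
    """Newtonian potential Phi = -G M / (r c^2), in units where c = 1."""
    return -G_NEWTON * mass / (radius * C_LIGHT ** 2)


phi_surface = dimensionless_potential(M_EARTH, R_EARTH)
h00 = 2.0 * phi_surface          # time-time metric perturbation
g00 = 1.0 + h00                  # weak-field metric component

print(phi_surface, h00, g00)
```

The perturbation comes out at roughly the $10^{-9}$ level, which is why neglecting terms of $\mathcal{O}(h^2)$ is an excellent approximation for terrestrial gravity.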

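Recall the Riemann tensor, in a LIF,

\[ R^\rho{}_{\lambda\mu\nu} = \partial_\mu\Gamma^\rho_{\lambda\nu} - \partial_\nu\Gamma^\rho_{\lambda\mu}, \]

and thus the Ricci tensor,

\[ R_{\lambda\nu} = R^\rho{}_{\lambda\rho\nu} = \partial_\rho\Gamma^\rho_{\lambda\nu} - \partial_\nu\Gamma^\rho_{\lambda\rho}. \]

Now, let us compute the component $R_{00}$. Then,

\[ R_{00} = \partial_\rho\Gamma^\rho_{00} - \partial_0\Gamma^\rho_{0\rho}, \]

noting that

\[ \Gamma^i_{00} = \frac12\partial_ih_{00}, \qquad \Gamma^0_{00} = 0, \]

and that time derivatives vanish for a static perturbation, we then see that

\[ \begin{aligned} R_{00} &= \partial_i\Gamma^i_{00} \\ &= \frac12\partial_i\partial_ih_{00} \\ &= \frac12\nabla^2h_{00}. \end{aligned} \]

Further recall that we just derived that $h_{00} = 2\Phi$, then

\[ R_{00} = \nabla^2\Phi. \]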


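Now then, we are now in a position to compute the constant $\kappa$ in Einstein’s field equation. Let us use the alternative form of the field equation, and take the “00” components;

\[ R_{00} = \kappa\left(T_{00} - \frac12 g_{00}T\right). \]

Now, the trace $T$ is just

\[ T \equiv g^{\mu\nu}T_{\mu\nu} = T^\mu{}_\mu. \]

Therefore, anticipating that only $T_{00}$ is non-negligible for the source we shall consider,

\[ \begin{aligned} g_{00}T &= (\eta_{00} + h_{00})(\eta^{00} - h^{00})T_{00} \\ &= (\eta_{00}\eta^{00} - \eta_{00}h^{00} + \eta^{00}h_{00} - h_{00}h^{00})T_{00} \\ &= T_{00} + \mathcal{O}(h^2). \end{aligned} \]

Let us suppose that the field is generated by a static, non-relativistic body, of mass density $\rho_m$. Then, $T_{00} = \rho_m$. Therefore, the field equation becomes

\[ R_{00} = \kappa\left(\rho_m - \frac12\rho_m\right) = \frac12\kappa\rho_m. \]

Now, we have the Poisson equation, $\nabla^2\Phi = 4\pi G\rho_m$, and also that $R_{00} = \nabla^2\Phi$. Therefore, equating the two,

\[ \nabla^2\Phi = 4\pi G\rho_m = \frac12\kappa\rho_m, \]

we see that

\[ \kappa = 8\pi G. \]

Therefore, the full field equation is

\[ R_{\mu\nu} = 8\pi G\left(T_{\mu\nu} - \frac12 g_{\mu\nu}T\right). \qquad (5.5) \]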

5.4 Linearised Gravity

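Let us take our perturbed metric,

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \qquad g^{\mu\nu} = \eta^{\mu\nu} - h^{\mu\nu},$$

where $|h_{\mu\nu}| \ll 1$. Now then, notice that

$$\begin{aligned}
g^{\mu\nu}g_{\nu\lambda} &= (\eta^{\mu\nu} - h^{\mu\nu})(\eta_{\nu\lambda} + h_{\nu\lambda}) \\
&= \eta^{\mu\nu}\eta_{\nu\lambda} + \eta^{\mu\nu}h_{\nu\lambda} - h^{\mu\nu}\eta_{\nu\lambda} - h^{\mu\nu}h_{\nu\lambda} \\
&= \delta^\mu_\lambda + O(h^2).
\end{aligned}$$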

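Now, consider a coordinate transformation,

$$x^\mu \mapsto x'^\mu = x^\mu + \varepsilon^\mu, \qquad x^\mu = x'^\mu - \varepsilon^\mu.$$

Then, the Jacobians are clearly

$$J^\mu{}_\nu = \delta^\mu_\nu + \partial_\nu\varepsilon^\mu, \qquad (J^{-1})^\mu{}_\nu = \delta^\mu_\nu - \partial_\nu\varepsilon^\mu.$$

The $\varepsilon^\mu$ are small, $|\varepsilon^\mu| \ll 1$, so we work to first order in $\varepsilon^\mu$ only. Now then, let us consider the transformation of the metric,

$$g'_{\mu\nu} = (J^{-1})^\alpha{}_\mu\,(J^{-1})^\beta{}_\nu\, g_{\alpha\beta}.$$

Then, using our Jacobians for the coordinate transformation, this becomes

$$\begin{aligned}
g'_{\mu\nu} &= (\delta^\alpha_\mu - \partial_\mu\varepsilon^\alpha)(\delta^\beta_\nu - \partial_\nu\varepsilon^\beta)\,g_{\alpha\beta} \\
&= (\delta^\alpha_\mu\delta^\beta_\nu - \delta^\alpha_\mu\partial_\nu\varepsilon^\beta - \partial_\mu\varepsilon^\alpha\delta^\beta_\nu + \partial_\mu\varepsilon^\alpha\partial_\nu\varepsilon^\beta)\,g_{\alpha\beta} \\
&= g_{\mu\nu} - \partial_\nu\varepsilon^\beta g_{\mu\beta} - \partial_\mu\varepsilon^\alpha g_{\alpha\nu} + O(\varepsilon^2).
\end{aligned}$$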

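Now then, notice that by the product rule,

$$\partial_\nu\varepsilon_\mu = \partial_\nu(g_{\mu\beta}\varepsilon^\beta) = \varepsilon^\beta\partial_\nu g_{\mu\beta} + g_{\mu\beta}\partial_\nu\varepsilon^\beta,$$

and therefore that

$$g_{\mu\beta}\partial_\nu\varepsilon^\beta = \partial_\nu\varepsilon_\mu - \varepsilon^\beta\partial_\nu g_{\mu\beta}.$$

Hence, using this, we see that the transformation of the metric looks like

$$g'_{\mu\nu} = g_{\mu\nu} - \partial_\nu\varepsilon_\mu - \partial_\mu\varepsilon_\nu + \varepsilon^\beta\partial_\nu g_{\mu\beta} + \varepsilon^\alpha\partial_\mu g_{\alpha\nu}.$$

Now, the last two terms on the right are both second order in small quantities. This is because the derivative of the metric, $\partial g = \partial h$, is itself small (of the same order as $\varepsilon$), and therefore $\varepsilon$ times the derivative of the metric is $O(\varepsilon^2)$. Therefore,

$$g'_{\mu\nu} = g_{\mu\nu} - \partial_\nu\varepsilon_\mu - \partial_\mu\varepsilon_\nu + O(\varepsilon^2).$$

Now, using the fact that $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, and $\eta_{\mu\nu} = \eta'_{\mu\nu}$, then the above is just

$$\eta_{\mu\nu} + h'_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} - \partial_\nu\varepsilon_\mu - \partial_\mu\varepsilon_\nu,$$

which simply becomes

$$h'_{\mu\nu} = h_{\mu\nu} - \partial_\nu\varepsilon_\mu - \partial_\mu\varepsilon_\nu. \tag{5.6}$$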

5.4.1 Linearising Einstein’s Equation

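Now, recall that Einstein’s equation was composed of the Ricci tensor and the energy-momentum tensor. The Ricci tensor was composed of derivatives of the Christoffel symbol, which in turn contained derivatives of the metric. We can recompute the Einstein equation for the perturbed metric, bearing in mind the coordinate transformation defined above,

$$x^\mu \mapsto x'^\mu = x^\mu + \varepsilon^\mu \;\Rightarrow\; h'_{\mu\nu} = h_{\mu\nu} - \partial_\nu\varepsilon_\mu - \partial_\mu\varepsilon_\nu.$$

So, consider

$$\partial_\nu g_{\alpha\beta} = \partial_\nu(\eta_{\alpha\beta} + h_{\alpha\beta}) = \partial_\nu h_{\alpha\beta}.$$

Therefore, the Christoffel symbol, defined as

$$\Gamma^\rho_{\alpha\beta} = \frac{1}{2}g^{\rho\nu}\left(\partial_\alpha g_{\beta\nu} + \partial_\beta g_{\nu\alpha} - \partial_\nu g_{\alpha\beta}\right),$$

becomes, to first order in $h$,

$$\Gamma^\rho_{\alpha\beta} = \frac{1}{2}\eta^{\rho\nu}\left(\partial_\alpha h_{\beta\nu} + \partial_\beta h_{\nu\alpha} - \partial_\nu h_{\alpha\beta}\right).$$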

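Now, the Ricci tensor contains terms that are products of two Christoffel symbols. It is clear that these will be $O(h^2)$, and therefore negligible. Hence, the Ricci tensor looks like

$$R_{\mu\nu} = \partial_\rho\Gamma^\rho_{\mu\nu} - \partial_\nu\Gamma^\rho_{\mu\rho}.$$

Then, plugging in our Christoffel symbols,

$$R_{\mu\nu} = \frac{1}{2}\eta^{\rho\sigma}\left(\partial_\rho\partial_\mu h_{\nu\sigma} + \partial_\rho\partial_\nu h_{\sigma\mu} - \partial_\rho\partial_\sigma h_{\mu\nu} - \partial_\nu\partial_\mu h_{\rho\sigma} - \partial_\nu\partial_\rho h_{\sigma\mu} + \partial_\nu\partial_\sigma h_{\mu\rho}\right).$$

This becomes, after noting that partial derivatives commute, that the Minkowski metric commutes with partial derivatives, and that the second and fifth terms cancel,

$$2R_{\mu\nu} = \partial^\sigma\partial_\mu h_{\nu\sigma} - \partial^\rho\partial_\rho h_{\mu\nu} - \partial_\nu\partial_\mu h^\rho{}_\rho + \partial^\rho\partial_\nu h_{\mu\rho}.$$

Now, $h^\rho{}_\rho \equiv h$, and changing the $\sigma$ index on the first term to a $\rho$,

$$R_{\mu\nu} = \frac{1}{2}\left(\partial^\rho\partial_\mu h_{\nu\rho} + \partial^\rho\partial_\nu h_{\mu\rho} - \partial^\rho\partial_\rho h_{\mu\nu} - \partial_\nu\partial_\mu h\right).$$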

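Then, the Ricci scalar is

$$\begin{aligned}
R &= g^{\mu\nu}R_{\mu\nu} \\
&= \eta^{\mu\nu}R_{\mu\nu} \\
&= \frac{1}{2}\left(\partial^\rho\partial^\nu h_{\nu\rho} + \partial^\rho\partial^\mu h_{\mu\rho} - \partial^\rho\partial_\rho h - \partial^\nu\partial_\nu h\right) \\
&= \partial^\rho\partial^\nu h_{\nu\rho} - \partial^\nu\partial_\nu h.
\end{aligned}$$

Now then, the Einstein tensor is defined as

$$G_{\mu\nu} \equiv R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R.$$

Therefore, using our linearised Ricci tensor and scalar,

$$G_{\mu\nu} = \frac{1}{2}\left(\partial^\rho\partial_\mu h_{\nu\rho} + \partial^\rho\partial_\nu h_{\mu\rho} - \partial^\rho\partial_\rho h_{\mu\nu} - \partial_\nu\partial_\mu h - \eta_{\mu\nu}\partial^\sigma\partial^\pi h_{\pi\sigma} + \eta_{\mu\nu}\partial^\rho\partial_\rho h\right). \tag{5.7}$$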


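Now, let us define the trace-reversed perturbation

$$\bar h_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}h, \tag{5.8}$$

and note that the Lorentz gauge is

$$\partial_\mu\bar h^{\mu\nu} = \partial^\mu\bar h_{\mu\nu} = 0. \tag{5.9}$$

That is,

$$\partial^\mu h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\partial^\mu h = 0,$$

which is just the statement that

$$\partial^\mu h_{\mu\nu} = \frac{1}{2}\eta_{\mu\nu}\partial^\mu h = \frac{1}{2}\partial_\nu h.$$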

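Hence, using this in (5.7) (and swapping $\partial_\mu\partial_\nu \leftrightarrow \partial_\nu\partial_\mu$ at will), we see that

$$G_{\mu\nu} = \frac{1}{2}\left(\frac{1}{2}\partial_\mu\partial_\nu h + \frac{1}{2}\partial_\nu\partial_\mu h - \partial^\rho\partial_\rho h_{\mu\nu} - \partial_\nu\partial_\mu h - \frac{1}{2}\eta_{\mu\nu}\partial^\sigma\partial_\sigma h + \eta_{\mu\nu}\partial^\sigma\partial_\sigma h\right).$$

Now, the first and second terms are identical, and their sum cancels with the fourth term. Hence,

$$G_{\mu\nu} = \frac{1}{2}\left(-\partial^\rho\partial_\rho h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\partial^\sigma\partial_\sigma h + \eta_{\mu\nu}\partial^\sigma\partial_\sigma h\right).$$

The second and third terms add, to give

$$G_{\mu\nu} = \frac{1}{2}\left(-\partial^\rho\partial_\rho h_{\mu\nu} + \frac{1}{2}\eta_{\mu\nu}\partial^\sigma\partial_\sigma h\right).$$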

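Now, if we use a little bit of notation,

$$\Box \equiv \partial^\mu\partial_\mu,$$

then

$$G_{\mu\nu} = -\frac{1}{2}\left(\Box h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\Box h\right),$$

or

$$G_{\mu\nu} = -\frac{1}{2}\Box\left(h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}h\right).$$

Hence, using our substitution (5.8) again,

$$G_{\mu\nu} = -\frac{1}{2}\Box\bar h_{\mu\nu}.$$

Then, if we write down Einstein’s equation,

$$G_{\mu\nu} = 8\pi G\,T_{\mu\nu} \;\Rightarrow\; \Box\bar h_{\mu\nu} = -16\pi G\,T_{\mu\nu}.$$

Therefore, we have a wave equation in the metric perturbation, with the energy-momentum tensor as the source. This is the equation for gravitational radiation.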


5.4.2 Gravitational Radiation

Under the Lorentz gauge (in keeping with the literature, this is sometimes also referred to as the Einstein gauge, or harmonic gauge),

$$\partial^\mu\bar h_{\mu\nu} = 0, \qquad \bar h_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}h,$$

Einstein’s equation becomes a wave equation,

$$\Box\bar h_{\mu\nu} = -16\pi G\,T_{\mu\nu}, \qquad \Box \equiv \partial^\mu\partial_\mu. \tag{5.10}$$

We can write down the solution to this directly, if one recalls the solution to the equivalent equation from electrodynamic theory.

In electrodynamics, under the Lorentz gauge $\partial_\mu A^\mu = 0$, we could derive the wave equation

$$\Box A^\nu = \mu_0 J^\nu,$$

which has solution

$$A^i = \frac{\mu_0}{4\pi}\int d^3x'\,\frac{J^i_{\rm ret}}{|\mathbf{x} - \mathbf{x}'|}.$$

Hence, we can basically read off our solution by analogy,

$$\bar h^{ij} = 4G\int d^3x'\,\frac{T^{ij}_{\rm ret}}{|\mathbf{x} - \mathbf{x}'|}. \tag{5.11}$$

One should recall that these are retarded integrals, with the source evaluated at the retarded time. The minus sign has “gone” because we have raised indices.

Therefore, upon linearising Einstein’s equation and using the Lorentz gauge, we have derived a wave equation in the metric perturbation. The source of the wave is the distribution of energy-momentum.



6 The Schwarzschild Solution

We can write Einstein’s equation, in a vacuum, as

$$R_{\mu\nu} = 0. \tag{6.1}$$

That is, in a vacuum, where $T_{\mu\nu} = 0$, the “alternative form” of Einstein’s equation reduces to the above.

Now, we can look for spherically symmetric solutions to this. That is, we are looking for a line element which possesses spherical symmetry. The most general such line element is

$$ds^2 = e^{\nu(r,t)}dt^2 - e^{\lambda(r,t)}dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2).$$

The reason we can make this supposedly general line element diagonal is that we can always transform out of a frame in which there are off-diagonal elements.

In the line element we chose to use exponentials, as they are generally easy to work with (differentiating them is easy). Hence, the aim is to now find those functions $\nu(r,t)$, $\lambda(r,t)$.

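Now, although we shall not derive them, the only non-zero components of the Ricci tensor are

$$R_{tt} = \frac{1}{2}e^{-\lambda}\left(\nu'' + \frac{1}{2}\nu'(\nu' - \lambda') + \frac{2\nu'}{r}\right) + e^{-\nu}\left(\dot\lambda(\dot\nu - \dot\lambda) - \frac{1}{2}\ddot\lambda\right),$$

$$R_{tr} = \frac{\dot\lambda}{2r},$$

$$R_{rr} = \frac{1}{2}e^{-\nu}\left(\ddot\lambda - \frac{1}{2}\dot\lambda(\dot\nu - \dot\lambda)\right) - \frac{1}{2}e^{-\lambda}\left(\nu'' + \frac{1}{2}\nu'(\nu' - \lambda') - \frac{2\lambda'}{r}\right),$$

$$R_{\theta\theta} = 1 - e^{-\lambda}\left(1 + \frac{1}{2}r(\nu' - \lambda')\right),$$

$$R_{\phi\phi} = \sin^2\theta\,R_{\theta\theta}.$$

We have used that an over-dot represents a derivative with respect to time $t$, and a prime a derivative with respect to $r$.

Hence, due to the reduction of Einstein’s equation to the form $R_{\mu\nu} = 0$, each of these expressions is equal to 0.

The easiest to start with is the $R_{tr}$ term. So,

$$\frac{\dot\lambda}{2r} = 0,$$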


which immediately allows us to state that $\lambda = \lambda(r)$ only. That is, $\lambda$ does not have any dependence upon $t$. Thus, using $\dot\lambda = 0$ allows $R_{tt}$ and $R_{rr}$ to look very similar. In fact, as $R_{rr} = R_{tt} = 0$, then $R_{rr} + R_{tt} = 0$. This then easily shows that

$$R_{tt} + R_{rr} = \frac{1}{2}e^{-\lambda}\left(\frac{2\nu'}{r} + \frac{2\lambda'}{r}\right) = 0,$$

that is, assuming $r \neq 0$,

$$\nu' + \lambda' = 0.$$

Integrating this easily shows that

$$\nu + \lambda = f(t).$$

Now, we can set $f(t)$ to zero, by a time coordinate transformation. Then, $\nu = -\lambda$. Therefore,

$$\nu(r) = -\lambda(r).$$

Hence, using this in $R_{\theta\theta}$, we see that

$$R_{\theta\theta} = 1 - e^{\nu}(1 + r\nu') = 0,$$

that is,

$$e^{\nu} + r\nu' e^{\nu} = 1.$$

Now, we can rewrite this as

$$(re^{\nu})' = e^{\nu} + r\nu' e^{\nu} = 1,$$

that is, as

$$(re^{\nu})' = 1.$$

Integrating easily reveals that

$$e^{\nu} = 1 + \frac{C}{r},$$

where $C$ is some constant. We can find the value of $C$ by considering the Newtonian limit of the metric. That is, recall that we derived

$$g_{00} = 1 + 2\Phi,$$

where we know that

$$\Phi = -\frac{GM}{r}.$$

Now, $e^{\nu} = g_{00}$ by inspection (it is the coefficient of the $dt^2$ term). Hence,

$$1 - \frac{2GM}{r} = 1 + \frac{C}{r} \;\Rightarrow\; C = -2GM.$$

Let us recall that this $M$ is the mass of the body generating the potential $\Phi$. That is, it will be the mass of the planet/star that is curving the spacetime. Therefore,

$$e^{\nu} = 1 - \frac{2GM}{r}, \qquad e^{\lambda} = \left(1 - \frac{2GM}{r}\right)^{-1}.$$


And finally, we have our metric,

$$ds^2 = \left(1 - \frac{2GM}{r}\right)dt^2 - \left(1 - \frac{2GM}{r}\right)^{-1}dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{6.2}$$

That is, we have the vacuum solution of Einstein’s equation, due to a body of mass $M$, where $r > 0$. This metric is called the Schwarzschild metric.

Properties of the Schwarzschild Metric. The metric, by construction, is spherically symmetric. Also, the metric is static; it clearly is not a function of time. That the metric is static means that changing the time coordinate by a constant amount leaves the metric unchanged. That is, the metric is invariant under constant time translations and time reflections. Also notice that the metric has Killing vectors $(1, 0, 0, 0)$ and $(0, 0, 0, 1)$ (i.e. along $t$ and $\phi$); these correspond to conservation of energy and angular momentum.

Notice that as $r \to \infty$, the metric goes over to Minkowski. That is, we say that the metric is asymptotically flat.

Also notice that at $r = 2GM$ the $g_{tt}$ and $g_{rr}$ components flip sign. We call this the Schwarzschild radius, or the event horizon. We denote the event horizon as

$$r_s \equiv 2GM. \tag{6.3}$$
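To get a feel for the size of $r_s$, the following sketch evaluates it in SI units, where the factor of $c^2$ suppressed by the choice $c = 1$ must be restored, $r_s = 2GM/c^2$. The numerical constants are rounded standard values and the function name is an illustrative assumption.

```python
G = 6.674e-11        # Newton's constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8          # speed of light, m/s (rounded)
M_SUN = 1.989e30     # solar mass, kg (rounded)
M_EARTH = 5.972e24   # Earth mass, kg (rounded)

def schwarzschild_radius(mass_kg):
    """r_s = 2GM/c^2 in metres; the notes set c = 1, so c^2 is restored here."""
    return 2.0 * G * mass_kg / C**2

rs_sun = schwarzschild_radius(M_SUN)      # roughly 3 km
rs_earth = schwarzschild_radius(M_EARTH)  # roughly 9 mm
print(f"Sun: {rs_sun:.0f} m, Earth: {rs_earth * 1000:.1f} mm")
```

Both values are far smaller than the physical radii of the bodies, so the horizon lies deep inside the matter, where the vacuum solution does not apply.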

6.0.3 Gravitational Redshift

Consider a clock at fixed spatial position ($dr = d\theta = d\phi = 0$), so that $ds^2 = g_{tt}\,dt^2$. Also consider that

$$d\tau = \frac{ds}{c},$$

hence,

$$d\tau = \frac{\sqrt{g_{tt}}\,dt}{c}.$$

Now, a frequency is inversely proportional to the proper time interval. That is,

$$\nu \propto \frac{1}{\Delta\tau}.$$

Now, if we compare two clocks, at positions 1 and 2, over the same coordinate time interval, then

$$\frac{\nu_1}{\nu_2} = \sqrt{\frac{g_{tt}(2)}{g_{tt}(1)}}.$$

If we use a weak gravitational field, then we can use the previously derived relation $g_{tt} = 1 + 2\Phi$. Hence, to first order in $\Phi$, this gives

$$\frac{\nu_1}{\nu_2} = 1 + \Phi(2) - \Phi(1).$$

That is, the shift in frequency is a function of the distance from the gravitating body.
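The quality of the weak-field approximation can be checked numerically. The sketch below compares the exact Schwarzschild ratio $\sqrt{g_{tt}(2)/g_{tt}(1)}$ with $1 + \Phi(2) - \Phi(1)$ for light travelling from the solar surface to the Earth's orbit. The radii and $r_s$ are rounded values in metres, chosen purely for illustration, and the function names are assumptions.

```python
import math

RS_SUN = 2954.0       # Schwarzschild radius of the Sun, m (rounded)
R_SUN = 6.96e8        # solar radius, m (rounded)
R_EARTH_ORBIT = 1.496e11  # Earth-Sun distance, m (rounded)

def g_tt(r, rs):
    """Schwarzschild g_tt = 1 - r_s/r."""
    return 1.0 - rs / r

def ratio_exact(r1, r2, rs):
    """nu_1/nu_2 = sqrt(g_tt(2)/g_tt(1))."""
    return math.sqrt(g_tt(r2, rs) / g_tt(r1, rs))

def ratio_weak_field(r1, r2, rs):
    """nu_1/nu_2 = 1 + Phi(2) - Phi(1), with Phi = -r_s/(2r) in units c = 1."""
    phi1 = -rs / (2.0 * r1)
    phi2 = -rs / (2.0 * r2)
    return 1.0 + phi2 - phi1

exact = ratio_exact(R_SUN, R_EARTH_ORBIT, RS_SUN)
weak = ratio_weak_field(R_SUN, R_EARTH_ORBIT, RS_SUN)
print(exact - 1.0, weak - 1.0)   # both about 2e-6
```

The two expressions differ only at second order in $r_s/r$, which for the Sun is of order $10^{-11}$.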


6.1 Dynamics in the Schwarzschild Spacetime

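Recall that the effective Lagrangian is

$$L_{\rm eff} = \left(\frac{ds}{d\tau}\right)^2.$$

Therefore, for the Schwarzschild metric, the effective Lagrangian is

$$L_{\rm eff} = \left(1 - \frac{r_s}{r}\right)\dot t^2 - \left(1 - \frac{r_s}{r}\right)^{-1}\dot r^2 - r^2(\dot\theta^2 + \sin^2\theta\,\dot\phi^2), \tag{6.4}$$

where an over-dot denotes a derivative with respect to the affine parameter $\tau$, and $r_s = 2GM$. So, let us consider the first integrals of this effective Lagrangian.

The Euler-Lagrange equations for this effective Lagrangian are

$$\frac{d}{d\tau}\frac{\partial L_{\rm eff}}{\partial\dot x^\mu} - \frac{\partial L_{\rm eff}}{\partial x^\mu} = 0.$$

Then, consider that

$$\frac{\partial L_{\rm eff}}{\partial t} = 0, \qquad \frac{\partial L_{\rm eff}}{\partial\dot t} = 2\left(1 - \frac{r_s}{r}\right)\dot t,$$

then, the $t$-first integral is that

$$2\left(1 - \frac{r_s}{r}\right)\dot t = {\rm const} \equiv 2\varepsilon.$$

Similarly,

$$\frac{\partial L_{\rm eff}}{\partial\phi} = 0, \qquad \frac{\partial L_{\rm eff}}{\partial\dot\phi} = -2r^2\sin^2\theta\,\dot\phi,$$

with its first integral being (absorbing the overall sign into the constant)

$$2r^2\sin^2\theta\,\dot\phi = {\rm const} \equiv 2\ell.$$

These constants, $\varepsilon$ and $\ell$, are related to the conserved energy and angular momentum, per unit mass. Recall that these were predicted to be conserved by the associated Killing vectors.

Finally, the effective Lagrangian is just the line element divided by $d\tau^2$, and that can take on one of 3 values;

$$L_{\rm eff} = K = \begin{cases} 0 & \text{null}, \\ +1 & \text{time-like}, \\ -1 & \text{space-like}. \end{cases}$$

Hence, using the derived relations for $\ell$, $\varepsilon$, we can easily see that

$$\dot t^2 = \varepsilon^2\left(1 - \frac{r_s}{r}\right)^{-2}, \qquad \dot\phi^2 = \frac{\ell^2}{r^4\sin^4\theta}.$$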


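And thus, using that the effective Lagrangian is just a constant $K$, we can easily put the effective Lagrangian into the form

$$K = \left(1 - \frac{r_s}{r}\right)^{-1}\left(\varepsilon^2 - \dot r^2\right) - r^2\left(\dot\theta^2 + \frac{\ell^2}{r^4\sin^2\theta}\right).$$

Now, as the system has spherical symmetry, we may as well take a value of $\theta$ that makes the above expression look simpler. Taking $\theta = \pi/2$ (note that then $\dot\theta = 0$), we see that

$$K = \left(1 - \frac{r_s}{r}\right)^{-1}\left(\varepsilon^2 - \dot r^2\right) - \frac{\ell^2}{r^2},$$

which is trivially just

$$K = \left(1 - \frac{r_s}{r}\right)^{-1}\left[\varepsilon^2 - \left(\frac{dr}{d\tau}\right)^2\right] - \frac{\ell^2}{r^2}.$$

Before we carry on with this expression, let us compute the Christoffel symbols and geodesics.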

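6.1.1 Geodesics & Christoffel Symbols

Let us compute the geodesics and Christoffel symbols for the effective Lagrangian (6.4) in this Schwarzschild spacetime.

We can compute the geodesic for the $\theta$-component of the effective Lagrangian. We have that

$$\frac{\partial L_{\rm eff}}{\partial\dot\theta} = -2r^2\dot\theta, \qquad \frac{\partial L_{\rm eff}}{\partial\theta} = -2r^2\sin\theta\cos\theta\,\dot\phi^2,$$

hence,

$$\frac{d}{d\tau}\frac{\partial L_{\rm eff}}{\partial\dot\theta} = -4r\dot r\dot\theta - 2r^2\ddot\theta.$$

Therefore, the Euler-Lagrange equation for the $\theta$-component is

$$-4r\dot r\dot\theta - 2r^2\ddot\theta + 2r^2\sin\theta\cos\theta\,\dot\phi^2 = 0.$$

Putting this into a more usable form,

$$\ddot\theta + \frac{2}{r}\dot r\dot\theta - \sin\theta\cos\theta\,\dot\phi^2 = 0.$$

Thus, we have the geodesic for $\theta$. Now, we can read off the Christoffel symbols by comparing with $\ddot x^\mu + \Gamma^\mu_{\alpha\beta}\dot x^\alpha\dot x^\beta = 0$. The non-zero components are

$$\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = \frac{1}{r}, \qquad \Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta.$$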

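We can compute the geodesic for $r$. So,

$$\frac{\partial L_{\rm eff}}{\partial\dot r} = -2\left(1 - \frac{r_s}{r}\right)^{-1}\dot r, \qquad \frac{\partial L_{\rm eff}}{\partial r} = \dot t^2\frac{r_s}{r^2} + \frac{r_s}{r^2}\left(1 - \frac{r_s}{r}\right)^{-2}\dot r^2 - 2r(\dot\theta^2 + \sin^2\theta\,\dot\phi^2),$$

and

$$\frac{d}{d\tau}\frac{\partial L_{\rm eff}}{\partial\dot r} = -2\ddot r\left(1 - \frac{r_s}{r}\right)^{-1} + 2\dot r^2\frac{r_s}{r^2}\left(1 - \frac{r_s}{r}\right)^{-2}.$$

Therefore, the geodesic is

$$-2\ddot r\left(1 - \frac{r_s}{r}\right)^{-1} + 2\dot r^2\frac{r_s}{r^2}\left(1 - \frac{r_s}{r}\right)^{-2} - \dot t^2\frac{r_s}{r^2} - \frac{r_s}{r^2}\left(1 - \frac{r_s}{r}\right)^{-2}\dot r^2 + 2r(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) = 0.$$

Dividing through by $-2(1 - r_s/r)^{-1}$, this simplifies down to

$$\ddot r - \dot r^2\frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right)^{-1} + \dot t^2\frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right) - r\left(1 - \frac{r_s}{r}\right)(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) = 0.$$

This is the $r$-geodesic. From this, we can read off the non-zero Christoffel symbols. They are

$$\Gamma^r_{rr} = -\frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right)^{-1}, \qquad \Gamma^r_{tt} = \frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right),$$

$$\Gamma^r_{\theta\theta} = -r\left(1 - \frac{r_s}{r}\right), \qquad \Gamma^r_{\phi\phi} = -r\sin^2\theta\left(1 - \frac{r_s}{r}\right).$$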

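Then, let us compile these four geodesics (i.e. including the two not explicitly computed here). The geodesics for the Schwarzschild spacetime are:

$$\ddot t + \frac{r_s}{r^2}\left(1 - \frac{r_s}{r}\right)^{-1}\dot t\dot r = 0,$$

$$\ddot r - \frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right)^{-1}\dot r^2 + \frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right)\dot t^2 - r\left(1 - \frac{r_s}{r}\right)(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) = 0,$$

$$\ddot\theta + \frac{2}{r}\dot r\dot\theta - \sin\theta\cos\theta\,\dot\phi^2 = 0,$$

$$\ddot\phi + \frac{2}{r}\dot r\dot\phi + 2\cot\theta\,\dot\theta\dot\phi = 0.$$

These complicated non-linear differential equations can be solved to find the trajectories of particles in the spacetime. The non-zero Christoffel symbols are easily read off, and can be seen to be

$$\Gamma^t_{rt} = \frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right)^{-1}, \qquad \Gamma^r_{rr} = -\frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right)^{-1},$$

$$\Gamma^r_{tt} = \frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right),$$

$$\Gamma^r_{\theta\theta} = -r\left(1 - \frac{r_s}{r}\right), \qquad \Gamma^r_{\phi\phi} = -r\sin^2\theta\left(1 - \frac{r_s}{r}\right),$$

$$\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = \frac{1}{r}, \qquad \Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta,$$

$$\Gamma^\phi_{r\phi} = \frac{1}{r}, \qquad \Gamma^\phi_{\theta\phi} = \cot\theta.$$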
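A way to test a set of geodesic equations like these is to integrate them numerically and confirm that $\varepsilon = (1 - r_s/r)\dot t$, $\ell = r^2\dot\phi$ and $K = L_{\rm eff}$ stay constant along the trajectory. The sketch below does this in the equatorial plane ($\theta = \pi/2$) with a hand-written fourth-order Runge-Kutta step. It uses the angular term in the $r$-equation with the sign corresponding to $\Gamma^r_{\phi\phi} = -r\sin^2\theta\,(1 - r_s/r)$, which is the sign for which $K$ is conserved. The units ($r_s = 1$), initial data, step size and all function names are illustrative assumptions.

```python
import math

RS = 1.0  # Schwarzschild radius; units with G = c = 1 and r_s = 1 (illustrative)

def rhs(state):
    """Equatorial geodesic equations written as a first-order system."""
    t, r, phi, td, rd, phid = state
    f = 1.0 - RS / r
    tdd = -(RS / r**2) / f * td * rd
    rdd = ((RS / (2.0 * r**2)) / f * rd**2
           - (RS / (2.0 * r**2)) * f * td**2
           + r * f * phid**2)
    phidd = -(2.0 / r) * rd * phid
    return (td, rd, phid, tdd, rdd, phidd)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h in the affine parameter."""
    k1 = rhs(state)
    k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def invariants(state):
    """Return (epsilon, ell, K) for an equatorial state."""
    t, r, phi, td, rd, phid = state
    f = 1.0 - RS / r
    return (f * td, r**2 * phid, f * td**2 - rd**2 / f - r**2 * phid**2)

# Time-like initial data at r = 10 r_s with ell = 4 and zero radial velocity;
# t-dot is fixed by demanding K = 1.
r0, ell0 = 10.0, 4.0
phid0 = ell0 / r0**2
td0 = math.sqrt((1.0 + r0**2 * phid0**2) / (1.0 - RS / r0))
state = (0.0, r0, 0.0, td0, 0.0, phid0)
eps_start, ell_start, k_start = invariants(state)

for _ in range(2000):
    state = rk4_step(state, 0.05)

eps_end, ell_end, k_end = invariants(state)
print(eps_end - eps_start, ell_end - ell_start, k_end - k_start)
```

Flipping the sign of the angular term in `rhs` makes `K` drift noticeably over the same integration, which is a cheap way to catch sign slips in hand-derived Christoffel symbols.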


6.1.2 Orbits

Let us return to the expression we derived, for $\theta = \pi/2$,

$$K = \left(1 - \frac{r_s}{r}\right)^{-1}\left[\varepsilon^2 - \left(\frac{dr}{d\tau}\right)^2\right] - \frac{\ell^2}{r^2}.$$

We can rearrange it into the form

$$\dot r^2 = \varepsilon^2 - K - \left[\frac{\ell^2}{r^2}\left(1 - \frac{r_s}{r}\right) - \frac{Kr_s}{r}\right],$$

and indeed into the form

$$\frac{1}{2}\dot r^2 = \frac{\varepsilon^2 - K}{2} - \left[\frac{\ell^2}{2r^2}\left(1 - \frac{r_s}{r}\right) - \frac{Kr_s}{2r}\right].$$

Now, we put it into this form, as we see that the LHS is a “velocity term”, the first term on the right is just the “energy” $E$, and the bracket on the far RHS we call the effective potential $V_{\rm eff}$:

$$E = \frac{1}{2}\dot r^2 + V_{\rm eff}(r),$$

where

$$V_{\rm eff} \equiv \frac{\ell^2}{2r^2}\left(1 - \frac{r_s}{r}\right) - \frac{Kr_s}{2r}. \tag{6.5}$$

Now, one familiar with the Newtonian derivation of this formula will realise that this expression is not quite the same as its Newtonian counterpart. The GR “correction” is the factor $r_s/r$, creating a $1/r^3$ term.

Just to recap what the symbols are in this effective potential: $\ell$ is the angular momentum (per unit mass) of the “moving thing”, and $r_s \equiv 2GM$, where $M$ is the mass of the “big body” in whose field the “moving thing” is moving. That is, the big body is curving spacetime, and some moving object is having its motion deflected by the curved spacetime. The amount of deflection is just a function of the distance from the big body to the smaller one. We shall call the “smaller body” the test mass, and the “big body” the gravitating mass.

Suppose that $\ell = 0$. Then,

$$V_{\rm eff} = -\frac{Kr_s}{2r} = -K\frac{2GM}{2r} = -K\frac{GM}{r}.$$

This is the Newtonian result. That is, for a test mass with no angular momentum, the effective potential is just what we would expect.

Recall that we derived

$$\varepsilon = \left(1 - \frac{r_s}{r}\right)\frac{dt}{d\tau},$$

then, we can clearly see that

$$\frac{dt}{d\tau} = \varepsilon\left(1 - \frac{r_s}{r}\right)^{-1}.$$

That is, the rate at which coordinate time elapses relative to the proper time of a test mass is a function of the distance from the gravitating mass, and of the total energy.

Let us now give some results relating to orbits in the spacetime. Circular orbits have

$$\frac{dV_{\rm eff}}{dr} = 0,$$

and stable circular orbits are those for which the second derivative of the potential is positive.

[Figure 6.1: The effective potential, as a function of distance from the gravitating body, for particle orbits. Panels: (a) $\ell < \sqrt{3}\,r_s$; (b) $\sqrt{3}\,r_s < \ell < 2r_s$; (c) $\ell > 2r_s$.]

Particle Orbits K = 1. If we vary the angular momentum, $\ell$, with respect to the event horizon, $r_s$, then various shapes of effective potential are found. With reference to Figure (6.1), we see the 3 ranges of $\ell$.

• $\ell < \sqrt{3}\,r_s$. Here, we see that any particle with energy $E > 0$ escapes, whilst any particle with $E < 0$ crashes back into the origin. No stable orbits exist.

• $\sqrt{3}\,r_s < \ell < 2r_s$. For this range, there are two positions in which circular orbits can exist, but only one of them is stable. If $E > 0$, then any particle will escape. If we define $V_{\rm max}$ as the value of $V_{\rm eff}$ at its maximum, and $V_{\rm min}$ as its value at the minimum, then we can see that for any $0 > E > V_{\rm max}$, a particle will crash into the origin. Also, for a particle with $E = V_{\rm min}$, there is a stable circular orbit. $E = V_{\rm max}$ is an unstable circular orbit. Any particle trapped in the “well” will have some sort of elliptical orbit.

• $\ell > 2r_s$. Here, if $E = V_{\rm min}$, the particle will have a stable circular orbit, and an elliptical one for perturbations about that minimum. If a particle has $E < V_{\rm max}$, and lives to the left of the maximum, then it will crash into the origin. Now, if a particle has $E < V_{\rm max}$, and approaches the system from the right of the maximum, then the particle will be repelled back to $\infty$. However, above a certain energy ($E > V_{\rm max}$), the particle will hit the origin. This is not present in Newtonian gravity, where particles with non-zero angular momentum are always repelled.
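The three ranges of $\ell$ listed above can be recovered directly from (6.5). Setting $dV_{\rm eff}/dr = 0$ for $K = 1$ gives the quadratic $r_s r^2 - 2\ell^2 r + 3\ell^2 r_s = 0$, with roots $r_\pm = (\ell^2/r_s)\left(1 \pm \sqrt{1 - 3r_s^2/\ell^2}\right)$, so extrema exist only for $\ell \geq \sqrt{3}\,r_s$ (both roots merge at $r = 3r_s$ in the limiting case). The sketch below evaluates this; the function names and the choice $r_s = 1$ are illustrative assumptions.

```python
import math

def v_eff(r, ell, rs=1.0, K=1.0):
    """Effective potential (6.5)."""
    return ell**2 / (2.0 * r**2) * (1.0 - rs / r) - K * rs / (2.0 * r)

def circular_orbit_radii(ell, rs=1.0):
    """Radii of the extrema of V_eff for K = 1.

    Returns (r_unstable, r_stable), or None when ell is too small for any
    circular orbit to exist.
    """
    disc = 1.0 - 3.0 * rs**2 / ell**2
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    r_unstable = ell**2 / rs * (1.0 - root)  # maximum of V_eff
    r_stable = ell**2 / rs * (1.0 + root)    # minimum of V_eff
    return (r_unstable, r_stable)

no_orbits = circular_orbit_radii(1.5)        # ell < sqrt(3) r_s
r_max_2, r_min_2 = circular_orbit_radii(2.0) # boundary case ell = 2 r_s
v_peak_2 = v_eff(r_max_2, 2.0)               # height of the barrier at ell = 2 r_s
r_max_4, r_min_4 = circular_orbit_radii(4.0)
print(no_orbits, (r_max_2, r_min_2), v_peak_2, v_eff(r_max_4, 4.0))
```

At $\ell = 2r_s$ the barrier height is exactly zero, which is why that value separates the cases in which a particle arriving from infinity ($E > 0$) can or cannot be turned back.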

Photon Orbits K = 0. In this case, we have that

$$V_{\rm eff} \equiv \frac{\ell^2}{2r^2}\left(1 - \frac{r_s}{r}\right).$$

Upon plotting the effective potential, we see that for a given $E < V_{\rm max}$, the fate of the photon depends on where it is relative to the peak. That is, if the photon is inside the peak, it will fall into the origin. If the photon is outside, then it will be repelled to infinity.

[Figure 6.2: The effective potential, as a function of distance from the gravitating body, for photon orbits.]
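The location and height of the peak follow from $dV_{\rm eff}/dr = 0$, which for $K = 0$ gives $r = \frac{3}{2}r_s$ and $V_{\rm max} = 2\ell^2/(27r_s^2)$. A photon arriving from infinity is captured when $E > V_{\rm max}$; using the relation $d = \ell/\sqrt{2E}$ between energy and impact parameter for photons, this corresponds to $d < \frac{3\sqrt{3}}{2}r_s$. The brute-force scan below is only a sketch to confirm these numbers; the grid, the units ($r_s = \ell = 1$) and the function names are assumptions.

```python
import math

def v_eff_photon(r, ell=1.0, rs=1.0):
    """Photon (K = 0) effective potential."""
    return ell**2 / (2.0 * r**2) * (1.0 - rs / r)

def find_peak(ell=1.0, rs=1.0, n=100000):
    """Scan r_s < r <= 5 r_s on a uniform grid and return (r_peak, V_max)."""
    best_r, best_v = rs, -math.inf
    for i in range(1, n + 1):
        r = rs + 4.0 * rs * i / n
        v = v_eff_photon(r, ell, rs)
        if v > best_v:
            best_r, best_v = r, v
    return best_r, best_v

r_peak, v_peak = find_peak()
d_crit = 1.0 / math.sqrt(2.0 * v_peak)  # ell / sqrt(2 E) at E = V_max, with ell = 1
print(r_peak, v_peak, d_crit)
```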

6.1.3 Summary

Let us just summarise the results obtained, as they will be useful in subsequent discussions.

We derived that, on a $\theta = \pi/2$ trajectory,

$$\frac{1}{2}\dot r^2 = E - V_{\rm eff},$$

where the “energy” is given by

$$E = \frac{\varepsilon^2 - K}{2},$$

and the effective potential by

$$V_{\rm eff} = \frac{\ell^2}{2r^2}\left(1 - \frac{r_s}{r}\right) - \frac{Kr_s}{2r}.$$

The angular momentum (per unit mass) of the test mass was computed to be

$$\ell = r^2\dot\phi,$$

and the energy per unit mass

$$\varepsilon = \left(1 - \frac{r_s}{r}\right)\dot t.$$

Light-like trajectories are those for which $K = 0$. Particle-like trajectories are those for which $K = 1$. The event horizon is related to the mass of the gravitating body by $r_s = 2GM$, and the body is idealised so that all mass is concentrated at a single point. Over-dots represent derivatives with respect to the affine parameter. Notice that we can then easily write that

$$\frac{d\phi}{dr} = \frac{\dot\phi}{\dot r} = \pm\frac{1}{r^2}\frac{\ell}{\sqrt{2}\,(E - V_{\rm eff})^{1/2}}. \tag{6.6}$$

6.2 Light Deflection

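We can compute the angle that light is deflected by, due to the curved spacetime of a star.

[Figure 6.3: Light deflection due to a gravitating mass, showing the impact parameter $d$, the distance of closest approach $r_{\rm min}$, the total angle $\Delta\phi$ and the deflection angle $\delta\phi_{\rm defl}$. $d$ is the impact parameter of the photon, with respect to the centre of the gravitating mass.]

The effective potential, for photons with $K = 0$, reads

$$V_{\rm eff} = \frac{\ell^2}{2r^2}\left(1 - \frac{r_s}{r}\right). \tag{6.7}$$

Consider the combination

$$\frac{\ell}{\varepsilon} = \frac{r^2\dot\phi}{\left(1 - \frac{r_s}{r}\right)\dot t},$$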


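then, considering that $r \gg r_s$ (far from the star),

$$\frac{\ell}{\varepsilon} \approx r^2\frac{d\phi}{dt} + O\!\left(\frac{r_s}{r}\right) \;\Rightarrow\; \frac{\ell}{\varepsilon} = r^2\frac{d\phi}{dt}. \tag{6.8}$$

Now, for small angles, we have that

$$\phi = \frac{d}{r}.$$

Then,

$$\frac{d\phi}{dt} = -\frac{d}{r^2}\frac{dr}{dt}.$$

Now,

$$\frac{dr}{dt} = -1,$$

where the unity comes from $c = 1$, and the minus sign because distances are shrinking for the incoming photon. Hence,

$$\frac{d\phi}{dt} = \frac{d}{r^2},$$

which we use in (6.8) to see that

$$\frac{\ell}{\varepsilon} = d.$$

Therefore, for photons (for which $K = 0$ and so $E = \varepsilon^2/2$),

$$d = \frac{\ell}{\varepsilon} = \frac{\ell}{\sqrt{2E}}. \tag{6.9}$$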

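Now, with reference to Figure (6.3), we see that the deflection angle is given by

$$\delta\phi_{\rm defl} = \Delta\phi - \pi.$$

The total angle change is just the integral

$$\Delta\phi = \int d\phi,$$

or, as we have an expression for $d\phi/dr$,

$$\Delta\phi = \int dr\,\frac{d\phi}{dr}.$$

Hence, using (6.6), we have that

$$\Delta\phi = 2\int_{r_{\rm min}}^{r_{\rm max}} dr\,\frac{1}{r^2}\frac{\ell}{\sqrt{2}\,(E - V_{\rm eff})^{1/2}}, \tag{6.10}$$

and, using the light-like effective potential (6.7),

$$\Delta\phi = 2\int_{r_{\rm min}}^{r_{\rm max}} dr\,\frac{1}{r^2}\frac{\ell}{\sqrt{2}\left[E - \frac{\ell^2}{2r^2}\left(1 - \frac{r_s}{r}\right)\right]^{1/2}}.$$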


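We take $r_{\rm max} \to \infty$, and note that the factor of 2 out front is due to the photon coming in from infinity, then going back out to infinity. If we put the factor of $\ell$ inside the square-root in the denominator, as well as the $\sqrt{2}$, then

$$\Delta\phi = 2\int_{r_{\rm min}}^{\infty}\frac{dr}{r^2}\left[\frac{2E}{\ell^2} - \frac{1}{r^2}\left(1 - \frac{r_s}{r}\right)\right]^{-1/2}.$$

Now, noting that via (6.9), we rewrite

$$\frac{2E}{\ell^2} = \frac{1}{d^2},$$

and also change variables to

$$w \equiv \frac{d}{r} \;\Rightarrow\; dw = -\frac{d}{r^2}dr.$$

Hence, using this change of variables, and rewriting,

$$\Delta\phi = 2\int_{w_{\rm max}}^{0}\left(-\frac{dw}{d}\right)\left[\frac{1}{d^2} - \frac{w^2}{d^2}\left(1 - \frac{r_s w}{d}\right)\right]^{-1/2},$$

the minus sign flipping the integration limits to

$$\Delta\phi = 2\int_{0}^{w_{\rm max}}\frac{dw}{d}\left[\frac{1}{d^2} - \frac{w^2}{d^2}\left(1 - \frac{r_s w}{d}\right)\right]^{-1/2}.$$

The factor of $\frac{1}{d}$ can be taken inside the square-root, giving

$$\Delta\phi = 2\int_{0}^{w_{\rm max}} dw\left[1 - w^2\left(1 - \frac{r_s w}{d}\right)\right]^{-1/2}.$$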

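Now, if we refer to (6.10), we see that the integrand is singular at $E = V_{\rm eff}$, i.e. at the turning point. It is an integral of the form

$$\int_0^1\frac{dx}{\sqrt{x + \epsilon}} \approx \int_0^1\frac{dx}{\sqrt{x}} - \frac{\epsilon}{2}\int_0^1\frac{dx}{x^{3/2}},$$

with $\epsilon$ a small parameter, whereby upon integration the first term does not give a singularity, but the second does (at zero). Thus, naively expanding the integrand in the small parameter generates a divergent term, and the expansion must be organised with more care.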

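Let us continue. If we take out a factor from the square root, then

$$\Delta\phi = 2\int_0^{w_{\rm max}} dw\left(1 - \frac{r_s}{d}w\right)^{-1/2}\left[\left(1 - \frac{r_s}{d}w\right)^{-1} - w^2\right]^{-1/2}.$$

Now, we can expand the two terms,

$$\left(1 - \frac{r_s}{d}w\right)^{-1/2} = 1 + \frac{r_s}{2d}w + O\!\left(\frac{r_s}{d}\right)^2, \qquad \left(1 - \frac{r_s}{d}w\right)^{-1} = 1 + \frac{r_s}{d}w + O\!\left(\frac{r_s}{d}\right)^2,$$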


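so that

$$\Delta\phi = 2\int_0^{w_{\rm max}} dw\left(1 + \frac{r_s}{2d}w\right)\left(1 + \frac{r_s}{d}w - w^2\right)^{-1/2} + O\!\left(\frac{r_s}{d}\right)^2.$$

We can now multiply out the bracket, keeping only terms to first order in $r_s/d$,

$$\Delta\phi = 2\int_0^{w_{\rm max}} dw\left(1 + \frac{r_s}{d}w - w^2\right)^{-1/2} + \frac{r_s}{d}\int_0^{w_{\rm max}} dw\,w\,(1 - w^2)^{-1/2}.$$

Now, we can see that the second integrand has an (integrable) singularity at $w_{\rm max} = 1$. If we look up the values of the integrals, we find

$$\int_0^{w_{\rm max}} dw\left(1 + \frac{r_s}{d}w - w^2\right)^{-1/2} = \frac{\pi}{2} + \frac{r_s}{2d}, \qquad \int_0^{w_{\rm max}} dw\,w\,(1 - w^2)^{-1/2} = 1.$$

Therefore, we see that

$$\Delta\phi = \pi + \frac{2r_s}{d},$$

and hence, the deflection angle,

$$\delta\phi_{\rm defl} = \frac{2r_s}{d}. \tag{6.11}$$

Therefore, we have derived the deflection angle of a photon’s trajectory, with impact parameter $d$ with respect to a gravitating body of event horizon $r_s$.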
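The first-order result can be checked against a direct numerical evaluation of $\Delta\phi = 2\int_0^{w_{\rm max}} dw\,[1 - w^2(1 - r_s w/d)]^{-1/2}$. The sketch below first finds the turning point $w_{\rm max}$ by Newton iteration, then substitutes $w = w_{\rm max}(1 - s^2)$, which removes the inverse-square-root singularity at the upper limit so that a plain midpoint rule works. The substitution, step count and function name are choices made here for illustration, not part of the notes.

```python
import math

def delta_phi(a, n=20000):
    """Total swept angle for a photon, with a = r_s/d (dimensionless)."""
    def f(w):
        return 1.0 - w * w + a * w**3

    def f_prime(w):
        return -2.0 * w + 3.0 * a * w * w

    # Newton iteration for the turning point f(w_max) = 0 nearest w = 1.
    w_max = 1.0
    for _ in range(50):
        w_max -= f(w_max) / f_prime(w_max)

    # With w = w_max (1 - s^2), dw = -2 w_max s ds and the integrand is smooth.
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h              # midpoints never hit s = 0
        w = w_max * (1.0 - s * s)
        total += 2.0 * w_max * s / math.sqrt(f(w))
    return 2.0 * total * h

flat = delta_phi(0.0)        # no gravitating mass: straight line, angle pi
bent = delta_phi(1.0e-3)     # weak field, r_s/d = 1e-3
print(flat - math.pi, bent - math.pi)
```

For $r_s/d = 10^{-3}$ the excess over $\pi$ comes out close to $2 \times 10^{-3}$, in line with (6.11), with the small remainder being the neglected $O(r_s/d)^2$ terms.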

To get a handle on how big this angle is, consider the Sun, for which $r_s \approx 3\,$km, and suppose the photon just grazes the Sun’s surface, so that $d \approx 7\times10^{5}\,$km. Then,

$$\delta\phi = \frac{2r_s}{d} = \frac{2\times 3\,{\rm km}}{7\times10^{5}\,{\rm km}} \approx 10^{-5}\,{\rm rad}.$$

This angle is equivalent to the observed height of a 1 m high object, viewed from 100 km away. That is, the effect is very small. However, this angle can be measured (best in solar eclipses), and the observations agree with the GR value rather than the Newtonian prediction (which is a factor of 2 smaller).

This is one of the tests of general relativity.
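For comparison with eclipse measurements, the deflection is usually quoted in arcseconds. The short sketch below evaluates $\delta\phi = 2r_s/d$ for a ray grazing the Sun, with $r_s = 2GM/c^2$ in SI units. The constants are rounded standard values and the variable names are illustrative.

```python
import math

G = 6.674e-11      # m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # m/s (rounded)
M_SUN = 1.989e30   # kg (rounded)
R_SUN = 6.957e8    # solar radius in m, used as the impact parameter d

rs_sun = 2.0 * G * M_SUN / C**2                 # Schwarzschild radius, m
deflection_rad = 2.0 * rs_sun / R_SUN           # equation (6.11)
deflection_arcsec = math.degrees(deflection_rad) * 3600.0
print(f"{deflection_rad:.2e} rad = {deflection_arcsec:.2f} arcsec")
```

This gives roughly 1.75 arcseconds, the figure usually quoted for light grazing the solar limb.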

6.3 Perihelion Precession

Here we consider the motion of a planet about a star. Supposing that the orbit of the planet is elliptical, and that the “size” of the orbit is unchanged over many periods, does the “position” of the orbit change? That is, after each revolution, let us consider that $r_{\rm min}$ is the same, but is shifted in angular position by $\delta\phi_{\rm prec}$. Then, we have that

$$\delta\phi_{\rm prec} = \Delta\phi - 2\pi,$$

where we subtract $2\pi$ so that the Newtonian prediction, $\Delta\phi = 2\pi$, gives $\delta\phi_{\rm prec} = 0$.

We follow a similar tack as for light deflection, but we must take $K = 1$ as we are dealing with time-like objects. So, the effective potential is now

$$V_{\rm eff} = \frac{\ell^2}{2r^2}\left(1 - \frac{r_s}{r}\right) - \frac{r_s}{2r}.$$

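We also use (6.6),

$$\frac{d\phi}{dr} = \frac{1}{r^2}\frac{\ell}{\sqrt{2}\,(E - V_{\rm eff})^{1/2}}, \tag{6.12}$$

where the energy for time-like objects is

$$E = \frac{\varepsilon^2 - 1}{2}.$$

Then, we write, as before,

$$\Delta\phi = 2\int_{r_{\rm min}}^{r_{\rm max}} dr\,\frac{d\phi}{dr},$$

which is just

$$\Delta\phi = 2\int_{r_{\rm min}}^{r_{\rm max}} dr\,\frac{\ell}{r^2}\left[\sqrt{2}\,(E - V_{\rm eff})^{1/2}\right]^{-1},$$

and, putting in the effective potential,

$$\Delta\phi = 2\int_{r_{\rm min}}^{r_{\rm max}} dr\,\frac{\ell}{r^2}\left[2E - \frac{\ell^2}{r^2}\left(1 - \frac{r_s}{r}\right) + \frac{r_s}{r}\right]^{-1/2}.$$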

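If we now take the $\ell$ inside the square-root, and use the expression for $E$, then

$$\Delta\phi = 2\int_{r_{\rm min}}^{r_{\rm max}} dr\,\frac{1}{r^2}\left[\frac{\varepsilon^2}{\ell^2} - \frac{1}{\ell^2} - \frac{1}{r^2}\left(1 - \frac{r_s}{r}\right) + \frac{r_s}{r\ell^2}\right]^{-1/2}.$$

Let us rewrite the square-rooted bit slightly,

$$\frac{\varepsilon^2}{\ell^2} - \frac{1}{r^2}\left(1 - \frac{r_s}{r}\right) - \frac{1}{\ell^2}\left(1 - \frac{r_s}{r}\right).$$

Let us change integration variables,

$$u \equiv \frac{1}{r},$$

hence,

$$\Delta\phi = 2\int_{u_{\rm min}}^{u_{\rm max}} du\left[\frac{\varepsilon^2}{\ell^2} - u^2(1 - r_s u) - \frac{1}{\ell^2}(1 - r_s u)\right]^{-1/2}.$$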


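If we take out a common factor,

$$\Delta\phi = 2\int_{u_{\rm min}}^{u_{\rm max}} du\,(1 - r_s u)^{-1/2}\left[\frac{\varepsilon^2}{\ell^2}(1 - r_s u)^{-1} - \frac{1}{\ell^2} - u^2\right]^{-1/2}.$$

We now expand out, but we must go to higher order within the expression on the right,

$$\Delta\phi = 2\int_{u_{\rm min}}^{u_{\rm max}} du\left(1 + \frac{r_s u}{2}\right)\left[\frac{\varepsilon^2}{\ell^2}\left(1 + r_s u + r_s^2 u^2\right) - \frac{1}{\ell^2} - u^2\right]^{-1/2},$$

collecting terms,

$$\Delta\phi = 2\int_{u_{\rm min}}^{u_{\rm max}} du\left(1 + \frac{r_s u}{2}\right)\left[\frac{\varepsilon^2}{\ell^2}(1 + r_s u) - \frac{1}{\ell^2} - u^2\left(1 - \frac{\varepsilon^2 r_s^2}{\ell^2}\right)\right]^{-1/2},$$

thus,

$$\Delta\phi = 2\left(1 + \frac{\varepsilon^2 r_s^2}{2\ell^2}\right)\int_{u_{\rm min}}^{u_{\rm max}} du\left[\frac{\varepsilon^2}{\ell^2}(1 + r_s u) - \frac{1}{\ell^2} - u^2\right]^{-1/2} + r_s\int_{u_{\rm min}}^{u_{\rm max}} du\,u\left[\frac{\varepsilon^2}{\ell^2}(1 + r_s u) - \frac{1}{\ell^2} - u^2\right]^{-1/2}.$$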

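Now, by looking up the integrals, the first gives $\pi$, the second $\frac{\pi}{2}(u_{\rm min} + u_{\rm max})$. Now, the integrand of the second integral is singular at the integration limits, which are the roots of the bracketed quadratic in $u$. Therefore, one can easily see that the sum of the roots is

$$u_{\rm min} + u_{\rm max} = \frac{\varepsilon^2}{\ell^2}r_s,$$

and therefore

$$\Delta\phi = 2\pi\left(1 + \frac{\varepsilon^2 r_s^2}{2\ell^2}\right) + \frac{\pi\varepsilon^2 r_s^2}{2\ell^2}.$$

Hence, taking $\varepsilon^2 \approx 1$ for a non-relativistic orbit, we read off

$$\delta\phi_{\rm prec} = \frac{3\pi r_s^2}{2\ell^2} = \frac{6\pi G^2M^2}{\ell^2}.$$

Now, in getting a handle on how big this is, we appeal to standard ellipse-theory. The result of this allows us to write the angular momentum $\ell$ in terms of the semi-major axis $a$ of the orbit, and the eccentricity $e$,

$$\ell^2 = GMa(1 - e^2).$$

Hence, the precession angle (per orbit) reads

$$\delta\phi_{\rm prec} = \frac{6\pi GM}{a(1 - e^2)}. \tag{6.13}$$

See Table (6.1) for a comparison of the prediction and observations of these precession angles.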


Planet     GR Prediction (per century)     Observation
Mercury    43′′                            43.1 ± 0.5′′
Venus      8.6′′                           8.4 ± 4.8′′
Earth      3.8′′                           5.0 ± 1.2′′

Table 6.1: The GR prediction of, and experimental observation of, the perihelion precession of various planets. The agreement is one of the most convincing experimental “proofs” of general relativity.
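The GR column of the table can be reproduced from (6.13). The sketch below restores the factor of $c^2$ (so that $\delta\phi_{\rm prec} = 6\pi GM/[c^2 a(1 - e^2)]$ per orbit), multiplies by the number of orbits per century and converts to arcseconds. The orbital elements are rounded textbook values and the function names are illustrative assumptions.

```python
import math

GM_SUN = 1.32712e20          # G * M_sun in m^3 s^-2 (rounded)
C = 2.99792458e8             # speed of light, m/s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi
DAYS_PER_CENTURY = 36525.0

def precession_per_orbit(a_m, e):
    """Equation (6.13) in SI units: 6 pi G M / (c^2 a (1 - e^2)), in radians."""
    return 6.0 * math.pi * GM_SUN / (C**2 * a_m * (1.0 - e**2))

def precession_arcsec_per_century(a_m, e, period_days):
    """Accumulated precession over one century, in arcseconds."""
    orbits = DAYS_PER_CENTURY / period_days
    return precession_per_orbit(a_m, e) * orbits * ARCSEC_PER_RAD

# (semi-major axis in m, eccentricity, orbital period in days), rounded values
mercury = precession_arcsec_per_century(5.791e10, 0.2056, 87.97)
venus = precession_arcsec_per_century(1.0821e11, 0.0068, 224.70)
earth = precession_arcsec_per_century(1.496e11, 0.0167, 365.256)
print(f"Mercury {mercury:.1f}, Venus {venus:.1f}, Earth {earth:.1f} arcsec/century")
```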

6.4 Black Holes

Let us consider what the mass and radius are, of a gravitating body for which the escape velocity is the speed of light. That is, what are $M$, $R$ for which $v_{\rm esc} = c$?

Recall that the Newtonian expression for total energy is

$$E_N = \frac{1}{2}mv^2 - \frac{GMm}{r},$$

so that rearranging into the familiar form

$$\frac{1}{2}\left(\frac{dr}{dt}\right)^2 = \frac{E_N}{m} - \left(-\frac{GM}{r}\right), \qquad v = \frac{dr}{dt},$$

we see the presence of the effective potential. Now, escape velocity is when $E_N = 0$, which corresponds to

$$v_{\rm esc}^2 = \frac{2GM}{R},$$

which we require to be $c^2$, which, under the units of $c = 1$, is just the statement that

$$R = 2GM = r_s.$$

That is, we seem to have derived the Schwarzschild radius (which was a GR result) using Newtonian mechanics. This is actually just a coincidence, as we have neglected both SR and GR (i.e. no mention of mass-energy in the above derivation).

Let us return to the Schwarzschild metric, with the assumption that $\theta$, $\phi$ are constant. Then, it reads

$$ds^2 = \left(1 - \frac{r_s}{r}\right)dt^2 - \left(1 - \frac{r_s}{r}\right)^{-1}dr^2.$$

Notice that this expression has two singularities. One at $r = r_s$ and one at $r = 0$.

Now, it is not immediately obvious whether these singularities are an artifact of how we have constructed our coordinate system, or if they are “true singularities”. So, a way of finding this out would be to construct a quantity that is independent of the coordinate system. Such a quantity is of course a scalar. Now, we want a scalar that is dependent upon the geometry of the system. Such quantities are the contracted Riemann and Ricci tensors, and the Ricci scalar. Now, experience has shown us that the best test is the Riemann tensor, in the form

$$R^{\alpha\beta\nu\mu}R_{\alpha\beta\nu\mu} = \frac{12r_s^2}{r^6}.$$

That is, we see that this coordinate-system independent quantity does not have a singularity as $r \to r_s$, but does have one for $r \to 0$.

Therefore, we see that $r \to r_s$ is a removable coordinate singularity, whereby we can change coordinates so that the metric does not retain the singularity; and that $r \to 0$ is an essential singularity. Now, although we shall not go into it at all, it is hoped that a quantum theory of gravity will be able to “sort out” this essential singularity.

6.4.1 Null Geodesics

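Let us consider the case $\ell = 0$, and $ds^2 = 0$. Then, the metric is just

$$\left(1 - \frac{r_s}{r}\right)dt^2 - \left(1 - \frac{r_s}{r}\right)^{-1}dr^2 = 0,$$

which trivially rearranges into

$$\left(\frac{dr}{dt}\right)^2 = \left(1 - \frac{r_s}{r}\right)^2,$$

which is just

$$\frac{dr}{dt} = \pm\left(1 - \frac{r_s}{r}\right).$$

Notice that this is the radial null geodesic. So, we can solve this,

$$\begin{aligned}
t &= \pm\int\frac{dr}{1 - r_s/r} \\
&= \pm\int\frac{r\,dr}{r - r_s} \\
&= \pm\int dr\left(1 + \frac{r_s}{r - r_s}\right) \\
&= \pm\left[r + r_s\ln|r - r_s| + {\rm const}\right] \\
&= \pm\left[r + r_s\ln\left|\frac{r}{r_s} - 1\right| + {\rm const}\right].
\end{aligned}$$

Now, we define the tortoise coordinate

$$r_* \equiv r + r_s\ln\left|\frac{r}{r_s} - 1\right|, \tag{6.14}$$

so that the geodesic reads

$$t = \pm r_* + {\rm const}.$$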

Now, notice that for flat space, $r_s \to 0$. Hence, the geodesics read

$$t = \pm r + {\rm const}.$$

Hence, we denote this as

$$u = t - r, \qquad v = t + r,$$

so that lines of $u = {\rm const}$ and $v = {\rm const}$ define the null geodesics. See Figure (6.4) for these lines.

[Figure 6.4: Null geodesics for flat space, plotted with $t$ against $r$. Blue (left to right) lines are $u = {\rm const}$, red (right to left) lines are $v = {\rm const}$. Photons move on these lines, and massive particles move within a light cone defined by the lines. That is, the light cone is defined at an intersection of lines of $v = {\rm const}$ and $u = {\rm const}$; the particle’s future is everything above that point, within the cone, and its past is everything below that point, within the cone.]

We say that $u$, $v$ are the light-cone coordinates for flat space. Now, for flat space, the metric is

$$ds^2 = dt^2 - dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2).$$

Now, notice that

$$t = \frac{1}{2}(u + v), \qquad r = \frac{1}{2}(v - u).$$

Also that

$$dr = \frac{\partial r}{\partial u}du + \frac{\partial r}{\partial v}dv = \frac{1}{2}(dv - du), \qquad dt = \frac{1}{2}(dv + du).$$

Therefore, the metric reads

$$ds^2 = du\,dv - r^2(d\theta^2 + \sin^2\theta\,d\phi^2),$$

which is no longer diagonal.


6.4.2 Eddington-Finkelstein Coordinates

Now, let us return to computing the null geodesics, but for curved space. We shall still use light-cone coordinates, now built from the tortoise coordinate,

$$u = t - r_*, \qquad v = t + r_*,$$

with the tortoise coordinate

$$r_* = r + r_s\ln\left|\frac{r}{r_s} - 1\right|. \tag{6.15}$$

From which we can compute

$$\frac{dr_*}{dr} = \frac{r}{r - r_s} \;\Rightarrow\; dr^2 = \left(1 - \frac{r_s}{r}\right)^2 dr_*^2.$$

[Figure 6.5: The tortoise coordinate (6.15), $r_*$ plotted against $r$. The position of $r_s$ is obvious.]
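The behaviour of the tortoise coordinate is easy to explore numerically. The sketch below implements (6.15), checks $dr_*/dr = r/(r - r_s)$ by a central finite difference on either side of the horizon, and shows that $r_* \to -\infty$ logarithmically as $r \to r_s$. The function names, the choice $r_s = 1$ and the sample radii are illustrative assumptions.

```python
import math

def tortoise(r, rs=1.0):
    """Tortoise coordinate r_* = r + r_s ln|r/r_s - 1|."""
    return r + rs * math.log(abs(r / rs - 1.0))

def d_tortoise_dr(r, rs=1.0, step=1e-6):
    """Central finite-difference estimate of dr_*/dr."""
    return (tortoise(r + step, rs) - tortoise(r - step, rs)) / (2.0 * step)

outside = d_tortoise_dr(3.0)           # expect r/(r - r_s) = 3/2
inside = d_tortoise_dr(0.5)            # expect 0.5/(0.5 - 1) = -1
near_horizon = tortoise(1.0 + 1e-6)    # large and negative
print(outside, inside, near_horizon)
```

The change of sign of $dr_*/dr$ across $r = r_s$ is the same sign flip that tips the light cones over inside the horizon.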

Now, the Schwarzschild metric may be written as (where we are suppressing the angular part)

$$ds^2 = \left(1 - \frac{r_s}{r}\right)\left[dt^2 - \frac{dr^2}{\left(1 - \frac{r_s}{r}\right)^2}\right],$$

which, using our derived relation for $dr_*^2$, is

$$ds^2 = \left(1 - \frac{r_s}{r}\right)\left[dt^2 - dr_*^2\right].$$

This, in terms of $u$, $v$, is just

$$ds^2 = \left(1 - \frac{r_s}{r}\right)du\,dv.$$

Notice that this metric is no longer singular at $r = r_s$, but is still singular at $r = 0$.

[Figure 6.6: The null geodesics for curved spacetime, plotted with $t$ against $r$. Blue lines are $u = {\rm const}$ and red lines are $v = {\rm const}$. The future direction, for a light cone, is that where a red line is on the left, and a blue line on the right. Notice that for $r > r_s$, all future cones are pointing upwards, and that at $r < r_s$, all future cones point leftwards.]

With reference to Figure (6.6), we can see the geodesics for curved spacetime. We have plotted the lines $u = {\rm const}$ and $v = {\rm const}$. The interesting things to note from the plot:

• As $r$ decreases towards $r_s$, the angle between a $u$ and a $v$ line decreases. This means that the future (and past) light cone of a particle becomes narrower. This means that “stuff” must be closer to the particle for it to influence the particle, as the particle gets closer to the Schwarzschild radius.

• As a particle crosses $r = r_s$, light cones flip by $90^\circ$, and point towards the $t$-axis. That is, the future of the particle can only be for motion towards the origin. That is, the particle can never escape.

Therefore, we have seen that as a particle crosses the Schwarzschild radius, its light cone gets tilted so that its future is always within the Schwarzschild radius. That is, particles can get into this region, but never out.

Thus, we see that $r = r_s$ is some sort of membrane which allows one-way travel. This is the event horizon.

Therefore, we see how black holes “work”. We have only considered stationary (non-rotating) black holes. To consider rotating holes, one must analyse the Kerr metric, which we shall not do here.

Hawking Radiation. Now, classically, particles cannot escape from a black hole, as we have just seen. However, quantum mechanically, they can tunnel out. According to quantum field theory, there is a “sea” of particle-anti-particle pairs being created and annihilated all the time, in vacuum (i.e. there is no true vacuum). Now, suppose one of these pairs were created on the event horizon, so that one of the particles gets created inside the horizon, and one outside. Then, as the particle inside cannot get out (it is inside the horizon), it cannot annihilate with the one that was created outside the horizon. Therefore, the particle outside the horizon can escape. Now, the energy to create the particle-anti-particle pair came from the vacuum inside the horizon. Therefore, by the particle escaping, energy is removed from the black hole, and over time, the black hole evaporates. This is called Hawking radiation.

6.4 Black Holes 93

To properly understand this radiation requires a huge amount of QFT, which we shall notgo into here.

This effect can be conceived in a rather tamer environment. Consider two metal plates,which posses opposite electric charge, where the space between the plates is “vacuum”. Now,the energy density due to the electric field may be ramped up so that it is high enough tocreate an electron-positron pair from the vacuum. This experiment, as far as I am aware,has not been done, but it is conceivable to see that it could (if the idea of a sea of virtualparticles is correct).



7 The Friedmann-Robertson-Walker Universe

We shall abbreviate the above name to FRW.

Now, we can start to consider the geometry of our universe. Historically, there were two theories for the universe.

The FRW universe was one based upon the cosmological principle: “Our universe is homogeneous and isotropic.” This means that the universe is pretty much the same everywhere you look, and in any direction. That is, the ensemble properties of the universe are invariant under both translation and rotation.

The competing theory was that of a steady state universe, proposed in 1948 by F. Hoyle, H. Bondi and T. Gold. The steady state theory was a more “perfect” version of the cosmological principle, imposing the condition that the universe be invariant under time translation as well as spatial translation/rotation. This means that the universe looks the same at any time.

The main difference between the theories is that the FRW universe started, and then expanded, whereas the steady state universe “always has been”. At the time these two theories were proposed, the church preferred FRW, with scientists preferring steady state.

The FRW universe model predicts some background radiation from the beginning event (i.e. the big bang), in the form of the cosmic microwave background (CMB). The CMB signature was predicted by Gamow and Alpher, and was observed by Penzias and Wilson, thereby providing evidence for the FRW universe.

The standard model of cosmology, today, uses the FRW model of the universe.

7.1 The FRW Metric

Schur’s theorem (which we state without proof) states that a globally isotropic n-dimensional manifold (n > 2) has a constant curvature k, and that the Riemann tensor has the form

\[ R_{\mu\nu\alpha\beta} = k\left(g_{\mu\alpha}g_{\nu\beta} - g_{\mu\beta}g_{\nu\alpha}\right). \]

Following this, one can construct an isotropic metric,

\[ ds^2 = dt^2 - a^2(t)\,d\sigma^2, \tag{7.1} \]

where $d\sigma^2$ is the line element for 3-dimensional space, and a(t) is the scale factor. We define the Hubble parameter, noting its present value,

\[ H \equiv \frac{\dot a}{a}, \qquad H_0 = 73\ \mathrm{km\,s^{-1}\,Mpc^{-1}}, \]

where it is important to note that an overdot here denotes the derivative with respect to coordinate time t. Furthermore, written out in full, the metric is

\[ ds^2 = dt^2 - a^2(t)\left[\frac{dr^2}{1-kr^2} + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right]. \tag{7.2} \]

Then, by a suitable coordinate transformation, the curvature constant k can take on one of three values,

\[ k = \begin{cases} +1 & \text{closed} \\ \phantom{+}0 & \text{flat} \\ -1 & \text{open} \end{cases} \tag{7.3} \]

So, consider the values of k, to see how they actually correspond to the above “claimed” geometries.

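Closed Space Consider setting k = 1, and the transformation

\[ r = \sin\chi \;\Rightarrow\; dr = \cos\chi\,d\chi, \]

so that the metric looks like

\[ ds^2 = dt^2 - a^2(t)\left[\frac{\cos^2\chi\,d\chi^2}{1-\sin^2\chi} + \sin^2\chi\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right], \]

which, using $1 - \sin^2\chi = \cos^2\chi$, simplifies trivially down to

\[ ds^2 = dt^2 - a^2(t)\left[d\chi^2 + \sin^2\chi\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right]. \]

Now, consider taking a slice through θ. That is, set θ = π/2 (so that dθ = 0); then one finds that

\[ ds^2 = dt^2 - a^2(t)\left[d\chi^2 + \sin^2\chi\,d\phi^2\right], \]

where it is clear that the bracketed quantity is the line element of the 2-sphere. That is,

\[ d\chi^2 + \sin^2\chi\,d\phi^2 \;\Rightarrow\; \text{sphere}. \]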

Open Space Consider setting k = −1, and the coordinate transformation

\[ r = \sinh\chi. \]

Then, in a completely analogous manner to before (again taking the slice θ = π/2), we get the line element

\[ ds^2 = dt^2 - a^2(t)\left[d\chi^2 + \sinh^2\chi\,d\phi^2\right], \]

where we now notice that

\[ d\chi^2 + \sinh^2\chi\,d\phi^2 \;\Rightarrow\; \text{hyperboloid}. \]

That is, k = −1 corresponds to a geometry based upon the surface of a hyperboloid.


(a) Sphere - Closed (b) Hyperboloid - Open

Figure 7.1: A visualisation of closed and open geometries.

Flat Space Let us set k = 0, and use the transformation

\[ r = \chi, \]

then we have the line element

\[ ds^2 = dt^2 - a^2(t)\left[d\chi^2 + \chi^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right]. \]

If we set θ = π/2 again, then the square-bracketed quantity is just

\[ d\chi^2 + \chi^2 d\phi^2. \]

This line element is just that of plane polars, which is flat. Hence, we see that k = 0 corresponds to flat space.

These correspondences of k with a particular geometry will become much clearer later on.
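One way to get a feel for the three geometries is to compare the circumference of a circle with its proper radius on the θ = π/2 slice. The sketch below is only illustrative: the function names are made up for this example, and the scale factor is set to a = 1 so that the proper radius of the circle χ = const is simply χ.

```python
import math

def circle_circumference(chi, k, a=1.0):
    """Proper circumference of the circle chi = const on the theta = pi/2 slice.

    On that slice the spatial line element is a^2 [dchi^2 + S(chi)^2 dphi^2],
    with S = sin, chi, sinh for k = +1, 0, -1 respectively.
    """
    if k == 1:
        s = math.sin(chi)
    elif k == 0:
        s = chi
    elif k == -1:
        s = math.sinh(chi)
    else:
        raise ValueError("k must be +1, 0 or -1")
    return 2.0 * math.pi * a * s

def circumference_ratio(chi, k):
    """Circumference divided by 2*pi*(proper radius), for a = 1."""
    return circle_circumference(chi, k) / (2.0 * math.pi * chi)

# Ratio below 1: closed; equal to 1: flat; above 1: open.
for k in (1, 0, -1):
    print(k, circumference_ratio(1.0, k))
```

In the closed case circles are "too short" for their radius, exactly as on the surface of a ball; in the open case they are "too long".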

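The standard way to write the FRW metric, in light of these coordinate transformations, is

\[ ds^2 = dt^2 - a^2(t)\left[d\chi^2 + \begin{Bmatrix} \sin^2\chi \\ \chi^2 \\ \sinh^2\chi \end{Bmatrix}\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right], \qquad k = \begin{Bmatrix} +1 \\ 0 \\ -1 \end{Bmatrix}. \tag{7.4} \]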


7.2 Geodesics & Christoffel Symbols

We can compute the geodesics, and read off the Christoffel symbols, from the effective Lagrangian formed from the FRW metric (7.2),

\[ L_{\rm eff} = \dot t^2 - a^2(t)\left[\frac{\dot r^2}{1-kr^2} + r^2\left(\dot\theta^2 + \sin^2\theta\,\dot\phi^2\right)\right], \]

where an overdot denotes the derivative with respect to the affine parameter, λ say. Now, one will need to use the following relation

\[ \dot a = \frac{da}{d\lambda} = \frac{\partial a}{\partial t}\frac{\partial t}{\partial\lambda} = a'\dot t, \qquad a' \equiv \frac{da}{dt}. \]

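Upon careful computation (applying the Euler-Lagrange equations to $L_{\rm eff}$ for each coordinate in turn), one finds the four geodesic equations:

\[ \begin{aligned}
\ddot t + \frac{aa'}{1-kr^2}\dot r^2 + aa'r^2\dot\theta^2 + aa'r^2\sin^2\theta\,\dot\phi^2 &= 0, \\
\ddot r + \frac{kr}{1-kr^2}\dot r^2 + 2\frac{a'}{a}\dot t\dot r - r(1-kr^2)\dot\theta^2 - r\sin^2\theta\,(1-kr^2)\dot\phi^2 &= 0, \\
\ddot\theta + 2\frac{a'}{a}\dot t\dot\theta + \frac{2}{r}\dot r\dot\theta - \sin\theta\cos\theta\,\dot\phi^2 &= 0, \\
\ddot\phi + 2\frac{a'}{a}\dot t\dot\phi + \frac{2}{r}\dot r\dot\phi + 2\cot\theta\,\dot\theta\dot\phi &= 0.
\end{aligned} \]

This allows us to read off the non-zero components of the Christoffel symbols (recalling that a cross term such as $\dot t\dot r$ carries a factor of 2 from the symmetry of the lower indices):

\[ \begin{aligned}
\Gamma^t{}_{rr} &= \frac{aa'}{1-kr^2}, & \Gamma^t{}_{\theta\theta} &= aa'r^2, & \Gamma^t{}_{\phi\phi} &= aa'r^2\sin^2\theta, \\
\Gamma^r{}_{rr} &= \frac{kr}{1-kr^2}, & \Gamma^r{}_{tr} &= \frac{a'}{a}, & \Gamma^r{}_{\theta\theta} &= -r(1-kr^2), \\
\Gamma^r{}_{\phi\phi} &= -r\sin^2\theta\,(1-kr^2), \\
\Gamma^\theta{}_{t\theta} &= \frac{a'}{a}, & \Gamma^\theta{}_{r\theta} &= \frac{1}{r}, & \Gamma^\theta{}_{\phi\phi} &= -\sin\theta\cos\theta, \\
\Gamma^\phi{}_{t\phi} &= \frac{a'}{a}, & \Gamma^\phi{}_{r\phi} &= \frac{1}{r}, & \Gamma^\phi{}_{\theta\phi} &= \cot\theta.
\end{aligned} \]

Notice that using the definition of the Hubble parameter, H = a′/a, we see that

\[ \Gamma^r{}_{tr} = \Gamma^\theta{}_{t\theta} = \Gamma^\phi{}_{t\phi} = H. \]

This is the only section in which the derivative with respect to the affine parameter will be used; hence, an overdot from here on denotes the derivative with respect to coordinate time t.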
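The Christoffel symbols of the FRW metric (7.2) can also be checked numerically, straight from the definition $\Gamma^\mu{}_{\alpha\beta} = \tfrac12 g^{\mu\nu}(\partial_\alpha g_{\nu\beta} + \partial_\beta g_{\nu\alpha} - \partial_\nu g_{\alpha\beta})$. The following is a minimal sketch, not part of the lectures: the scale factor $a(t) = t^{2/3}$, the value k = 1, the evaluation point and all function names are assumptions chosen purely for illustration, and derivatives are taken by central finite differences.

```python
import math

K = 1.0  # curvature constant; hypothetical choice for this check

def scale_factor(t):
    # Hypothetical scale factor a(t) = t^(2/3), used only for illustration.
    return t ** (2.0 / 3.0)

def metric_diag(x):
    """Diagonal components (g_tt, g_rr, g_thth, g_phph) of the FRW metric (7.2)."""
    t, r, th, _ = x
    a2 = scale_factor(t) ** 2
    return [1.0,
            -a2 / (1.0 - K * r * r),
            -a2 * r * r,
            -a2 * r * r * math.sin(th) ** 2]

def d_metric(x, comp, direction, h=1e-5):
    """Central finite difference of g_{comp comp} along coordinate `direction`."""
    xp, xm = list(x), list(x)
    xp[direction] += h
    xm[direction] -= h
    return (metric_diag(xp)[comp] - metric_diag(xm)[comp]) / (2.0 * h)

def christoffel(x, mu, al, be):
    """Gamma^mu_{al be} for a diagonal metric (indices 0..3 = t, r, theta, phi)."""
    total = 0.0
    if mu == be:
        total += d_metric(x, mu, al)   # d_al g_{mu be}
    if mu == al:
        total += d_metric(x, mu, be)   # d_be g_{mu al}
    if al == be:
        total -= d_metric(x, al, mu)   # d_mu g_{al be}
    return 0.5 * total / metric_diag(x)[mu]

point = (1.0, 0.5, 1.0, 0.3)           # (t, r, theta, phi); here a = 1, a' = 2/3
print(christoffel(point, 0, 1, 1))     # compare with a a' / (1 - k r^2)
print(christoffel(point, 1, 1, 1))     # compare with k r / (1 - k r^2)
print(christoffel(point, 3, 1, 3))     # compare with 1 / r
print(christoffel(point, 1, 0, 1))     # compare with H = a' / a
```

Because the metric is diagonal, only the three derivative terms that survive are evaluated; a general (non-diagonal) metric would need a full matrix inverse.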


7.3 Cosmology in the FRW Universe

We now wish to consider what happens to spacetime in the FRW Universe. To do so, we shall need the Ricci tensor corresponding to the FRW metric, and some energy-momentum tensor.

So, following from the FRW metric (7.2), one can compute the associated components of the Ricci tensor. Doing so, one finds

\[ R_{00} = -3\frac{\ddot a}{a}, \tag{7.5} \]

\[ R_{0i} = R_{i0} = 0, \tag{7.6} \]

\[ R_{ij} = -\left(\frac{2k}{a^2} + \frac{\ddot a}{a} + \frac{2\dot a^2}{a^2}\right)g_{ij}. \tag{7.7} \]

The metric $g_{\mu\nu}$ is the FRW metric, which we note can be written as

\[ g_{00} = 1, \qquad g_{ij} = -a^2(t)\,\mathrm{diag}\left((1-kr^2)^{-1},\; r^2,\; r^2\sin^2\theta\right). \]

Recall Einstein’s equation, in the form

\[ R_{\mu\nu} = 8\pi G\left(T_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}T\right), \qquad T \equiv g^{\mu\nu}T_{\mu\nu}. \]

We now use Weyl’s postulate, which is that our Universe is a perfect fluid. A perfect fluid is one for which there is no heat conduction or viscosity.

Recall that the energy-momentum tensor of a perfect fluid is given by

\[ T_{\mu\nu} = (\rho + P)u_\mu u_\nu - Pg_{\mu\nu}, \]

where P is the pressure of the fluid, and ρ the density. Hence, its trace is

\[ T = (\rho + P)u^\mu u_\mu - Pg^\mu{}_\mu = \rho + P - 4P, \]

that is,

\[ T = \rho - 3P. \]

In fact, this result can be obtained in a slightly easier way. Recall that in the comoving frame of the fluid, $u^\mu = (1, 0, 0, 0)$, so the (mixed-index) energy-momentum tensor is diagonal,

\[ T^\mu{}_\nu = \mathrm{diag}(\rho, -P, -P, -P). \]

Hence, its trace is just the sum of its components, T = ρ − 3P.

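So, let us compute the bracketed part of the Einstein equation,

\[ \begin{aligned}
T_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}T &= (\rho + P)u_\mu u_\nu - Pg_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}(\rho - 3P) \\
&= (\rho + P)u_\mu u_\nu - \tfrac{1}{2}g_{\mu\nu}(\rho - P).
\end{aligned} \]

Hence, the Einstein equation reads

\[ R_{\mu\nu} = 8\pi G\left((\rho + P)u_\mu u_\nu - \tfrac{1}{2}g_{\mu\nu}(\rho - P)\right). \]

Now, consider the comoving frame of the fluid, then we have that

\[ T_{\mu\nu} = \mathrm{diag}(\rho, -Pg_{ij}), \qquad T = \rho - 3P, \]

and thus that

\[ T_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}T = \tfrac{1}{2}\,\mathrm{diag}\left(\rho + 3P,\; g_{ij}(P - \rho)\right). \]

Hence,

\[ T_{00} - \tfrac{1}{2}g_{00}T = \tfrac{1}{2}(\rho + 3P), \]

so the 00-component of the Einstein equation, using (7.5), is

\[ -3\frac{\ddot a}{a} = 8\pi G\,\tfrac{1}{2}(\rho + 3P), \]

trivially rearranging into

\[ \frac{\ddot a}{a} = -\frac{4\pi G}{3}(\rho + 3P). \tag{7.8} \]

This is known as Raychaudhuri’s equation.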

Similarly, suppose we took the ij-part of the Einstein equation, using (7.7), then

\[ -\left(\frac{2k}{a^2} + \frac{\ddot a}{a} + \frac{2\dot a^2}{a^2}\right)g_{ij} = -8\pi G\,\tfrac{1}{2}g_{ij}(\rho - P), \]

from which we cancel out the metric $g_{ij}$,

\[ \frac{2k}{a^2} + \frac{\ddot a}{a} + \frac{2\dot a^2}{a^2} = 4\pi G(\rho - P). \]

Let us then insert Raychaudhuri’s equation for the middle term on the LHS,

\[ \frac{2k}{a^2} - \frac{4\pi G}{3}(\rho + 3P) + \frac{2\dot a^2}{a^2} = 4\pi G(\rho - P). \]

This can then be rearranged easily enough into

\[ \left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G}{3}\rho - \frac{k}{a^2}. \tag{7.9} \]

This is known as the Friedmann equation. It is common to notate

\[ \frac{\dot a}{a} \equiv H, \]

so that the Friedmann equation reads

\[ H^2 = \frac{8\pi G}{3}\rho - \frac{k}{a^2}. \]

In deriving these two equations, we jumped around a bit between comoving frames. These equations describe the expansion of the universe, in the comoving frame of the fluid.

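Let us see where the continuity equation

\[ \nabla_\nu T^\nu{}_\mu = 0 \]

can get us. Written out, this is just

\[ \partial_\nu T^\nu{}_\mu + \Gamma^\nu{}_{\nu\alpha}T^\alpha{}_\mu - \Gamma^\nu{}_{\mu\alpha}T^\alpha{}_\nu = 0, \]

where

\[ T^\mu{}_\nu = \mathrm{diag}(\rho, -P, -P, -P). \]

Now, the relevant Christoffel symbols are

\[ \Gamma^t{}_{tt} = 0, \qquad \Gamma^\theta{}_{t\theta} = \Gamma^\phi{}_{t\phi} = \Gamma^r{}_{tr} = H. \]

Now, let us take the µ = t component of the continuity equation,

\[ \partial_\nu T^\nu{}_t + \Gamma^\nu{}_{\nu\alpha}T^\alpha{}_t - \Gamma^\nu{}_{t\alpha}T^\alpha{}_\nu = 0, \]

that is,

\[ \partial_t T^t{}_t + \partial_i T^i{}_t + \Gamma^\nu{}_{\nu\alpha}T^\alpha{}_t - \Gamma^\nu{}_{t\alpha}T^\alpha{}_\nu = 0. \]

Now, the second term above is zero, as the energy-momentum tensor is diagonal. Hence, if we write $T^\mu{}_\nu = \delta^\mu_\nu T^\mu{}_\nu$ (no sum implied), then

\[ \partial_t T^t{}_t + \Gamma^\nu{}_{\nu\alpha}\delta^\alpha_t T^\alpha{}_t - \Gamma^\nu{}_{t\alpha}\delta^\alpha_\nu T^\alpha{}_\nu = 0, \]

which is just

\[ \partial_t T^t{}_t + \Gamma^\nu{}_{\nu t}T^t{}_t - \Gamma^\nu{}_{t\nu}T^\nu{}_\nu = 0. \]

Now, the only non-zero Christoffel symbols of the form $\Gamma^\nu{}_{\nu t}$ are the $\Gamma^i{}_{it}$. Hence,

\[ \partial_t T^t{}_t + \Gamma^i{}_{it}T^t{}_t - \Gamma^i{}_{ti}T^i{}_i = 0. \]

Therefore, with reference to the above Christoffel symbols (each of the three $\Gamma^i{}_{it}$ equals H, and each $T^i{}_i$ equals −P), we see that this is just

\[ \partial_t\rho + 3H\rho + 3HP = 0, \]

which is

\[ \dot\rho = -3H(\rho + P), \tag{7.10} \]

which is known as the energy conservation equation, or the fluid equation.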

Hence, the three important equations we have derived, for a Universe filled with a perfect fluid:

• The Raychaudhuri equation:

\[ \frac{\ddot a}{a} = -\frac{4\pi G}{3}(\rho + 3P). \tag{7.11} \]

• The Friedmann equation:

\[ \left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G}{3}\rho - \frac{k}{a^2}. \tag{7.12} \]

• The fluid equation:

\[ \dot\rho = -3\frac{\dot a}{a}(\rho + P). \tag{7.13} \]

The three equations are not independent of one another: using any two, one can derive the third.
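As a quick sanity check that the three equations hang together, the sketch below evaluates their residuals for one known solution. Everything specific here is an assumption made for the example: units with G = 1, a flat (k = 0) pressureless universe with $a = t^{2/3}$ and $\rho = 1/(6\pi G t^2)$, and derivatives taken by finite differences.

```python
import math

G = 1.0   # units with G = 1; illustrative choice
P = 0.0   # pressureless matter
K = 0.0   # flat

def scale_factor(t):
    # Assumed test solution: matter-dominated scale factor, a = t^(2/3).
    return t ** (2.0 / 3.0)

def rho(t):
    # Density that solves the flat Friedmann equation for a = t^(2/3).
    return 1.0 / (6.0 * math.pi * G * t * t)

def deriv(f, t, h=1e-4):
    """Central first derivative."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

def second_deriv(f, t, h=1e-4):
    """Central second derivative."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

def residuals(t):
    """(Friedmann, Raychaudhuri, fluid) residuals; all should vanish."""
    a = scale_factor(t)
    H = deriv(scale_factor, t) / a
    friedmann = H * H - 8.0 * math.pi * G * rho(t) / 3.0 + K / (a * a)
    raychaudhuri = second_deriv(scale_factor, t) / a \
        + 4.0 * math.pi * G * (rho(t) + 3.0 * P) / 3.0
    fluid = deriv(rho, t) + 3.0 * H * (rho(t) + P)
    return friedmann, raychaudhuri, fluid

print(residuals(2.0))  # three numbers close to zero (finite-difference error only)
```

The same `residuals` function can be pointed at any other candidate pair a(t), ρ(t) to see whether it satisfies all three equations at once.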

7.3.1 Species Evolution & Densities

The components of the fluid are called “species”. That is, we could conceive that the fluid is composed of matter, radiation and possibly some other “stuff” (which we shall come to later).

Notice that, since $\dot\rho = \dot a\,\partial\rho/\partial a$, we can write the fluid equation as

\[ a\frac{\partial\rho}{\partial a} = -3(\rho + P). \]

Then, this can be solved for the evolution of ρ as a function of scale factor a. We now consider three cases. We shall consider how the density of a particular species evolves, as a function of scale factor, if only that species exists in the Universe.

Matter Dominated FRW Universe Consider a Universe that is filled solely with matter. For matter, there is no associated pressure. Hence, $P_m = 0$, and the fluid equation becomes

\[ a\frac{\partial\rho_m}{\partial a} = -3\rho_m, \]

integrating,

\[ -3\int\frac{da}{a} = \int\frac{d\rho_m}{\rho_m} \;\Rightarrow\; -3\ln a = \ln\rho_m + \text{const}, \]

which is just

\[ \rho_m = \frac{\rho_{m,0}}{a^3}, \tag{7.14} \]

where $\rho_{m,0}$ is the (constant) density of matter at a = 1.

Radiation Dominated FRW Universe Radiation has the equation of state

\[ \rho_r = 3P_r, \]

which may be derived from black-body radiation theory. Hence, using this, the fluid equation reads

\[ a\frac{\partial\rho_r}{\partial a} = -4\rho_r, \]

and integrating as before results in

\[ \rho_r = \frac{\rho_{r,0}}{a^4}. \tag{7.15} \]

Vacuum Dominated FRW Universe The equation of state for vacuum is

\[ \rho_V = -P_V, \]

so that the fluid equation reads

\[ \dot\rho_V = 0, \]

hence, we see that $\rho_V$ = const.
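The three cases above can be collected into a single scaling law. As a sketch, assume each species obeys P = wρ for a constant w (this parametrisation, and the names below, are introduced here for illustration and are not used elsewhere in these notes). The fluid equation then reads a ∂ρ/∂a = −3(1 + w)ρ, which integrates to ρ ∝ a^(−3(1+w)).

```python
W_MATTER = 0.0          # P = 0
W_RADIATION = 1.0 / 3.0 # rho = 3P
W_VACUUM = -1.0         # rho = -P

def density(a, rho0, w):
    """Density of a single species with P = w * rho.

    Solves a * d(rho)/da = -3 (1 + w) rho with rho(a = 1) = rho0.
    """
    return rho0 * a ** (-3.0 * (1.0 + w))

# Halving the scale factor: matter grows by 2^3, radiation by 2^4,
# vacuum energy density stays fixed.
for name, w in (("matter", W_MATTER),
                ("radiation", W_RADIATION),
                ("vacuum", W_VACUUM)):
    print(name, density(0.5, 1.0, w))
```

This makes it easy to see why radiation dominates at small a, matter at intermediate a, and vacuum at large a.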

Critical Density Recall the Friedmann equation, but let us set k = 0 (i.e. flat),

\[ H^2 = \frac{8\pi G}{3}\rho. \]

Then, let us define this ρ to be $\rho_{\rm crit}$, so that

\[ \rho_{\rm crit} = \frac{3H^2}{8\pi G}. \tag{7.16} \]

That is, $\rho_{\rm crit}$ is the density required to make the Universe flat. If we take the present value of the Hubble parameter to be

\[ H_0 = 100h\ \mathrm{km\,s^{-1}\,Mpc^{-1}}, \]

then the critical density should have the value (if measured today)

\[ \rho_{\rm crit} = 10.54\,h^2\ \mathrm{keV\,cm^{-3}}. \]

We use the notation that a subscript “0” denotes the present value of a quantity. In particular, we define

\[ a_0 \equiv 1; \]

the present value of the scale factor is unity.
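The quoted number for the critical density can be reproduced directly from (7.16). The sketch below assumes rounded SI values for G, c, the megaparsec and the keV (these constants and the function name are supplied here, not taken from the lectures), and converts the mass density to an energy density via ρc².

```python
import math

G = 6.674e-11             # m^3 kg^-1 s^-2 (rounded)
C = 2.99792458e8          # m s^-1
MPC_IN_M = 3.0857e22      # metres per megaparsec (rounded)
KEV_IN_J = 1.602177e-16   # joules per keV (rounded)

def critical_density_kev_cm3(h=1.0):
    """Critical energy density in keV cm^-3 for H0 = 100 h km/s/Mpc."""
    h0 = 100.0 * h * 1.0e3 / MPC_IN_M                  # H0 in s^-1
    rho_mass = 3.0 * h0 ** 2 / (8.0 * math.pi * G)     # kg m^-3
    energy_density = rho_mass * C ** 2                 # J m^-3
    return energy_density / KEV_IN_J / 1.0e6           # keV cm^-3

print(critical_density_kev_cm3(1.0))    # close to 10.54 for h = 1
print(critical_density_kev_cm3(0.73))   # value for H0 = 73 km/s/Mpc
```

Note the h² scaling: since $\rho_{\rm crit} \propto H_0^2$, changing h from 1 to 0.73 roughly halves the critical density.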


Normalised Energy Densities Let us suppose that there are four species present in the Universe: matter, radiation, vacuum and curvature. Let us now define

\[ \Omega_m \equiv \frac{\rho_{m,0}}{\rho_{\rm crit}}, \qquad \Omega_r \equiv \frac{\rho_{r,0}}{\rho_{\rm crit}}, \qquad \Omega_V \equiv \frac{\rho_{V,0}}{\rho_{\rm crit}}, \qquad \Omega_k \equiv -\frac{k}{H_0^2a_0^2}. \tag{7.17} \]

That is, the $\Omega_i$ are called the normalised energy densities of the species; they represent the current fraction of that species, in terms of the critical density. They satisfy the condition

\[ \Omega_m + \Omega_r + \Omega_V + \Omega_k = 1, \]

which is just the Friedmann equation evaluated today (see Section 7.4); by measurement, the Universe appears to be close to flat, so that $\Omega_k$ is small. The matter species is composed of both baryonic and dark matter; radiation is composed of both photons and neutrinos. We tend to call the vacuum species the cosmological constant, so that $\Omega_V = \Omega_\Lambda$. See Table (7.1) for the current values of various quantities.

Quantity    Current Accepted Value
Ωm          0.24
Ωb          0.04
ΩDM         0.20
Ωr          < 0.01
Ωk          0.05
ΩΛ          0.7

Table 7.1: Various quantities, as a fraction of ρcrit.

7.4 Age of the FRW Universe

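Let us return to the Friedmann equation

\[ H^2 = \frac{8\pi G}{3}\rho - \frac{k}{a^2}. \]

If we then divide through by $H_0^2$,

\[ \frac{H^2}{H_0^2} = \frac{8\pi G}{3H_0^2}\rho - \frac{k}{a^2H_0^2}, \]

and multiply and divide the last term on the RHS by $a_0^2$, we get

\[ \frac{H^2}{H_0^2} = \frac{8\pi G}{3H_0^2}\rho - \frac{k}{a_0^2H_0^2}\frac{a_0^2}{a^2}. \]

Now, we notice the presence of our definitions of $\rho_{\rm crit}$ (evaluated today) and $\Omega_k$, so that

\[ \frac{H^2}{H_0^2} = \frac{\rho}{\rho_{\rm crit}} + \frac{\Omega_k}{a^2}, \]

after using that $a_0 = 1$. We now insert our derived evolutions of the various species $\rho_i$,

\[ \begin{aligned}
\frac{H^2}{H_0^2} &= \frac{1}{\rho_{\rm crit}}\left(\frac{\rho_{m,0}}{a^3} + \frac{\rho_{r,0}}{a^4} + \rho_{V,0}\right) + \frac{\Omega_k}{a^2} \\
&= \frac{\Omega_m}{a^3} + \frac{\Omega_r}{a^4} + \Omega_V + \frac{\Omega_k}{a^2}.
\end{aligned} \]

Hence, if we set $H = H_0$ and $a = a_0 = 1$, then we have

\[ \Omega_m + \Omega_r + \Omega_V + \Omega_k = 1. \]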

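So, let us write our expression back in terms of the scale factor, so that

\[ \left(\frac{\dot a}{a}\right)^2 = H_0^2\left[\frac{\Omega_m}{a^3} + \frac{\Omega_r}{a^4} + \Omega_V + \frac{\Omega_k}{a^2}\right], \]

or,

\[ \frac{\dot a}{a} = H_0\left[\frac{\Omega_m}{a^3} + \frac{\Omega_r}{a^4} + \Omega_V + \frac{\Omega_k}{a^2}\right]^{1/2}, \]

multiplying through by a, and pulling it inside the square root,

\[ \dot a = H_0\left[\frac{\Omega_m}{a} + \frac{\Omega_r}{a^2} + \Omega_V a^2 + \Omega_k\right]^{1/2}. \tag{7.18} \]

Now, consider that

\[ t_0 = \int_0^{t_0} dt = \int_0^{a_0 = 1}\frac{dt}{da}\,da = \int_0^1\frac{da}{\dot a}. \]

Hence, we have that

\[ t_0 = \frac{1}{H_0}\int_0^1 da\left[\frac{\Omega_m}{a} + \frac{\Omega_r}{a^2} + \Omega_V a^2 + \Omega_k\right]^{-1/2}. \tag{7.19} \]

Therefore, this expression will give us the age of the Universe.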
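The age integral (7.19) rarely has a neat closed form once several species are present, but it is easy to evaluate numerically. The following is a minimal sketch using a midpoint rule; the function names, the number of steps, and the example parameter values (Ωm = 0.3, ΩV = 0.7, H0 = 73 km/s/Mpc) are illustrative assumptions rather than values endorsed by these notes.

```python
import math

def age_in_hubble_times(om, orad, ov, ok, n=100_000):
    """Midpoint-rule estimate of H0 * t0 from equation (7.19)."""
    da = 1.0 / n
    total = 0.0
    for i in range(n):
        a = (i + 0.5) * da   # midpoints avoid evaluating the integrand at a = 0
        total += da / math.sqrt(om / a + orad / a ** 2 + ov * a ** 2 + ok)
    return total

def hubble_time_gyr(h0_km_s_mpc):
    """1/H0 in Gyr, using 1 Mpc = 3.0857e19 km and 1 Gyr = 3.15576e16 s."""
    return 3.0857e19 / h0_km_s_mpc / 3.15576e16

# Matter only: the analytic answer is H0 * t0 = 2/3.
print(age_in_hubble_times(1.0, 0.0, 0.0, 0.0))

# A flat matter + vacuum mix (hypothetical values), converted to Gyr.
print(age_in_hubble_times(0.3, 0.0, 0.7, 0.0) * hubble_time_gyr(73.0))
```

Adding vacuum energy at fixed H0 makes the Universe older than the matter-only value of 2/(3H0), because the expansion was slower in the past for the same present-day rate.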

7.4.1 Age of Matter Dominated Universe

So, let us assume that $\Omega_m = 1$ (all other species are zero). Hence, the present age of the Universe is given by

\[ t_0 = \frac{1}{H_0}\int_0^1 da\,a^{1/2} = \frac{2}{3H_0}. \]

Also notice that in the matter dominated universe, (7.18) looks quite simple,

\[ \dot a = H_0\sqrt{\Omega_m}\,a^{-1/2}, \]

which is easily solved to give

\[ a \propto t^{2/3}. \tag{7.20} \]

That is, if the Universe is matter dominated, then the scale factor evolves in time as $t^{2/3}$.

Another curious result is that for a vacuum dominated Universe, (7.18) gives $\dot a \propto a$, which implies that

\[ a \propto e^{Ht}, \qquad H = H_0\sqrt{\Omega_V} = \text{const}, \]

that is, in a vacuum dominated Universe, the scale factor grows exponentially with time.

7.4.2 Age of Matter & Curvature Dominated Universe

Here, we have a mixture of two species, such that

\[ \Omega_r = \Omega_V = 0, \qquad \Omega_m + \Omega_k = 1. \]

Let us introduce a rescaling of time, known as conformal time, whereby

\[ a\,d\eta = dt. \]

Hence,

\[ \eta = \int_0^t\frac{dt}{a} = \int_0^a\frac{da}{a\dot a}. \]

Notice that in writing this, we have that η = η(a). We should then be able to invert it, so that a = a(η). Notice that if we use conformal time, the FRW metric (7.2) can be written in the form

\[ ds^2 = a^2\left[d\eta^2 - \frac{dr^2}{1-kr^2} - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right] \sim a^2\,\tilde g_{\mu\nu}dx^\mu dx^\nu, \]

where $\tilde g_{\mu\nu}$ does not depend on time. That is, we have a conformal transformation of the metric. This is why we call η conformal time.

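Now, (7.18) in our model is

\[ \dot a = H_0\left[\frac{\Omega_m}{a} + \Omega_k\right]^{1/2}, \]

hence,

\[ a\dot a = H_0\left[\Omega_m a + \Omega_k a^2\right]^{1/2}. \]

Therefore, using this,

\[ \eta = \frac{1}{H_0}\int_0^a\frac{da}{\sqrt{\Omega_m a + \Omega_k a^2}}. \]

To integrate this, we complete the square, giving

\[ \eta = \frac{1}{H_0}\int_0^a\frac{da}{\sqrt{\Omega_k}}\left[\left(a + \frac{\Omega_m}{2\Omega_k}\right)^2 - \frac{\Omega_m^2}{4\Omega_k^2}\right]^{-1/2}. \]

If we then define

\[ x \equiv \frac{2\Omega_k}{\Omega_m}a + 1, \]

so that $da = (\Omega_m/2\Omega_k)\,dx$ and a = 0 corresponds to x = 1, then we see that we can write

\[ \eta = \frac{1}{H_0}\int_1^x dx\,\frac{\Omega_m}{2\Omega_k}\frac{1}{\sqrt{\Omega_k}}\left[\frac{\Omega_m^2}{4\Omega_k^2}\left(x^2 - 1\right)\right]^{-1/2} = \frac{1}{H_0\sqrt{\Omega_k}}\int_1^x\frac{dx}{\sqrt{x^2 - 1}}, \]

where we look up the value of the integral,

\[ \int_1^x\frac{dx}{\sqrt{x^2 - 1}} = \cosh^{-1}x. \]

Hence,

\[ \eta = \frac{1}{H_0\sqrt{\Omega_k}}\cosh^{-1}x. \]

Therefore,

\[ x = \cosh\left(\eta H_0\sqrt{\Omega_k}\right). \]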

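Hence,

\[ a(\eta) = \frac{\Omega_m}{2\Omega_k}\left[\cosh\left(\eta H_0\sqrt{\Omega_k}\right) - 1\right]. \]

Writing $\Omega_k = 1 - \Omega_m$, this reads

\[ a(\eta) = \frac{\Omega_m}{2(1 - \Omega_m)}\left[\cosh\left(\eta H_0\sqrt{1 - \Omega_m}\right) - 1\right], \qquad \Omega_k > 0. \tag{7.21} \]

Clearly, this only holds for $\Omega_k > 0$. If $\Omega_k < 0$, then the cosh becomes a cosine, and we have

\[ a(\eta) = \frac{\Omega_m}{2(\Omega_m - 1)}\left[1 - \cos\left(\eta H_0\sqrt{\Omega_m - 1}\right)\right], \qquad \Omega_k < 0. \tag{7.22} \]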

With reference to Figure (7.2), we see the two different types of Universes. It is clear from the analytic forms of the evolution of scale factor with conformal time, a(η), that $\Omega_k > 0$ corresponds to an exponential increase in scale factor (7.21), and $\Omega_k < 0$ to an oscillatory scale factor (7.22).

[Figure 7.2: plot of a(η) against η.]

Figure 7.2: A visualisation of closed and open universes. Closed has $\Omega_k < 0$, and open $\Omega_k > 0$. The former is just a sinusoidal oscillation, the latter an exponential expansion.

Also, from the definition of $\Omega_k$,

\[ \Omega_k = -\frac{k}{H_0^2a_0^2}, \]

we see that

\[ \Omega_k > 0 \;\Rightarrow\; k < 0 \;\Rightarrow\; \text{open}, \tag{7.23} \]

\[ \Omega_k < 0 \;\Rightarrow\; k > 0 \;\Rightarrow\; \text{closed}, \tag{7.24} \]

which are in agreement with our previous statements about open and closed Universes.

An oscillatory Universe will have a definite (conformal) time at which it ends, when a(η) hits the axis again,

\[ \cos\left(\eta_{\rm tot}H_0\sqrt{\Omega_m - 1}\right) = 1 \;\Rightarrow\; \eta_{\rm tot} = \frac{2\pi}{\sqrt{\Omega_m - 1}\,H_0}. \]

Hence, the actual total time is given by

\[ t_{\rm tot} = \int_0^{\eta_{\rm tot}} d\eta\,a(\eta), \]

which easily evaluates to

\[ t_{\rm tot} = \frac{\pi\Omega_m}{(\Omega_m - 1)^{3/2}H_0}. \]

Therefore, we have an expression for the total possible age of the Universe, if the Universe has a closed geometry. Hence, a small non-zero k is sufficient to control the future “fate” of the Universe. That is, the Universe will either end up exponentially growing (the “heat death”), or will crunch back on itself (the “big crunch”).
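The conformal-time solutions and the total lifetime can be checked against each other numerically. The sketch below evaluates a(η) from (7.21)/(7.22) and integrates it over one full cycle to compare with the closed-form lifetime. The function names, the default H0 = 1 (time measured in units of 1/H0), the flat-limit branch and the sample value Ωm = 2 are all assumptions of this example.

```python
import math

def scale_factor_conformal(eta, om, h0=1.0):
    """a(eta) for a matter + curvature universe with Omega_k = 1 - Omega_m."""
    ok = 1.0 - om
    if ok > 0.0:   # open, equation (7.21)
        return om / (2.0 * ok) * (math.cosh(eta * h0 * math.sqrt(ok)) - 1.0)
    if ok < 0.0:   # closed, equation (7.22)
        return om / (2.0 * -ok) * (1.0 - math.cos(eta * h0 * math.sqrt(-ok)))
    # Flat case: the limit of either branch as Omega_k -> 0.
    return om * (h0 * eta) ** 2 / 4.0

def total_lifetime_numeric(om, h0=1.0, n=10_000):
    """t_tot = integral of a(eta) d(eta) over one cycle, by the midpoint rule."""
    if om <= 1.0:
        raise ValueError("only a closed universe (Omega_m > 1) recollapses")
    eta_tot = 2.0 * math.pi / (h0 * math.sqrt(om - 1.0))
    d_eta = eta_tot / n
    return d_eta * sum(scale_factor_conformal((i + 0.5) * d_eta, om, h0)
                       for i in range(n))

def total_lifetime_closed_form(om, h0=1.0):
    """t_tot = pi * Omega_m / ((Omega_m - 1)^(3/2) * H0)."""
    return math.pi * om / ((om - 1.0) ** 1.5 * h0)

print(total_lifetime_numeric(2.0), total_lifetime_closed_form(2.0))
```

As Ωm approaches 1 from above the lifetime grows without bound, which is one way of seeing how sensitively the fate of the Universe depends on a small curvature.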


7.5 Light in the FRW Universe

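Consider the FRW metric, where we shall ignore all angular terms;

\[ ds^2 = dt^2 - a^2(t)\frac{dr^2}{1-kr^2}. \]

Now, assuming flatness (k = 0), for light (i.e. null geodesics, $ds^2 = 0$), we have that the metric reduces to

\[ dt = a(t)\,dr. \]

Therefore, consider

\[ R = \int_0^R dr = \int_{t_e}^{t_o}\frac{dt}{a(t)}. \]

That is, the (coordinate) distance between two points that have photons sent between them. We have that $t_e$ is the time of emission of the photon, and $t_o$ the time of observation. Now, we shall assume that this distance is unchanged for pulses sent slightly after this first set, so that

\[ R = \int_{t_e + \delta t_e}^{t_o + \delta t_o}\frac{dt}{a(t)}. \]

Therefore, we have that

\[ \int_{t_e + \delta t_e}^{t_o + \delta t_o}\frac{dt}{a(t)} = \int_{t_e}^{t_o}\frac{dt}{a(t)}. \]

Now, for intervals δt short enough that a(t) is effectively constant across them, the only non-zero contribution to the difference of the two sides (via a general calculus mid-point theorem) is

\[ \frac{\delta t_o}{a(t_o)} - \frac{\delta t_e}{a(t_e)} = 0, \]

which easily rearranges to

\[ \frac{\delta t_o}{\delta t_e} = \frac{a(t_o)}{a(t_e)}. \]

Now, since the frequency is the inverse of the time between successive pulses, we can express the LHS as a ratio of frequencies, so that

\[ \frac{\nu_e}{\nu_o} = \frac{a(t_o)}{a(t_e)} \equiv 1 + z. \]

Hence, we arrive at a standard relation in cosmology,

\[ \frac{\nu_e}{\nu_o} = 1 + z. \tag{7.25} \]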

This ratio is always > 0. Therefore, we see that the ratio of the “sent” frequency (i.e. the frequency that the light had when it was sent by the object) to the received frequency depends upon the redshift z at which the light was emitted. The quantity 1 + z is just the ratio of the scale factor when the light was received to the scale factor when it was emitted. Hence, we see that the further away something is, the lower the frequency at which we see the light it emitted. That is, the wavelength increases. Hence, this is called the cosmological redshift effect. This is a different effect from gravitational redshift, because gravitational redshift occurred due to different distances from a gravitating body.

Expansion of Universe ⇒ Cosmological redshift,

Different distances up gravitational potential ⇒ Gravitational redshift.

To get a handle on the numbers involved, consider that the most distant quasar is at z ≈ 6.6, and that recombination is at $z \approx 10^3$.

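Notice that we can write

\[ z = \frac{\nu_e - \nu_o}{\nu_o} = \frac{a_o - a_e}{a_e}. \]

Also, recall that (non-relativistic) redshift is related to the velocity of the object,

\[ z = \frac{v}{c} = \frac{\delta a}{a}. \]

Hence, notice that we may compute

\[ \frac{\delta a}{a} = \frac{\delta a/\delta t}{a}\,\delta t = H\frac{R}{c}, \]

where δt = R/c is the light travel time over the distance R. Therefore,

\[ v = HR. \]

This is Hubble’s law, as derived from first principles from the FRW metric.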
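The redshift relations of this section are simple enough to put into a few helper functions. This is only a sketch: the function names are invented for the example, the default H0 = 73 km/s/Mpc is the value quoted earlier, and `hubble_velocity` is meaningful only in the low-redshift regime where z ≈ v/c holds.

```python
def redshift(a_emit, a_obs=1.0):
    """z from the scale factors at emission and observation: 1 + z = a_o / a_e."""
    return a_obs / a_emit - 1.0

def observed_frequency(nu_emit, z):
    """Received frequency, from nu_e / nu_o = 1 + z (equation 7.25)."""
    return nu_emit / (1.0 + z)

def hubble_velocity(distance_mpc, h0=73.0):
    """Recession velocity v = H0 * R in km/s, for a distance R in Mpc."""
    return h0 * distance_mpc

# Light emitted when the Universe was half its present size:
z = redshift(0.5)
print(z, observed_frequency(2.0, z))

# A galaxy 100 Mpc away, in the low-redshift regime:
print(hubble_velocity(100.0))
```

For example, light from recombination at z ≈ 10³ was emitted when the scale factor was roughly a thousandth of its present value.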

7.6 Flatness Problem

Now, there are problems with the FRW Universe.

Recall that the fraction, today, of curvature’s contribution to the total density of the Universe is $\Omega_{k,0} < 10\%$. Also recall that we defined

\[ \Omega_k(t) \equiv -\frac{k}{H^2(t)a^2(t)}, \]

where t is the time at which we are measuring. Hence, let us compute

\[ \frac{\Omega_k(t_0)}{\Omega_k(t_r)} = \frac{k/H_0^2a_0^2}{k/H_r^2a_r^2}, \]

the ratio of the curvature contributions today and in the radiation dominated epoch. Since $H_ra_r = \dot a_r$, this easily reduces to

\[ \frac{\Omega_k(t_0)}{\Omega_k(t_r)} = \frac{\dot a_r^2}{H_0^2a_0^2}. \]

Now, recall that the scale factor, in the radiation dominated epoch, depends upon time as

\[ a_r = a_0\left(\frac{t_r}{t_0}\right)^{1/2} \;\Rightarrow\; \dot a_r = \frac{a_0}{2t_0}\left(\frac{t_r}{t_0}\right)^{-1/2}. \]

Also, recall that the Hubble parameter, in the radiation dominated epoch, goes as

\[ H_0 = \frac{1}{2t_0}. \]

Hence,

\[ \dot a_r = H_0a_0\left(\frac{t_r}{t_0}\right)^{-1/2}. \]

And therefore,

\[ \frac{\Omega_k(t_0)}{\Omega_k(t_r)} = \frac{t_0}{t_r}. \]

Putting some typical numbers in, one sees that

\[ \frac{\Omega_k(t_0)}{\Omega_k(t_r)} \approx \frac{10^{17}\ \mathrm{s}}{10^{-43}\ \mathrm{s}} = 10^{60}. \]

Hence,

\[ \Omega_k(t_0) = 10^{60}\,\Omega_k(t_r). \]

That is, the value of $\Omega_k$ is $10^{60}$ times what it was in the radiation epoch! This requires a very small (so-called “fine-tuned”) curvature in the early epoch, so that the Universe could be $10^{60}$ times more curved now than it was.

This fine-tuning required is called the flatness problem.

7.6.1 Inflation

One way to “solve” the flatness problem is to introduce the concept of inflation. Suppose we allow an epoch before the radiation domination that was vacuum dominated (recall that $a_V(t) = a_ie^{Ht}$). In this case, we can compute that

\[ \frac{\Omega_k(t_r)}{\Omega_k(t_i)} = \frac{a_i^2}{a_r^2}, \]

after assuming that $H_i \approx H_r$. This gives

\[ \frac{\Omega_k(t_r)}{\Omega_k(t_i)} = e^{-2H(t_r - t_i)}, \]

a number we require to be less than $10^{-60}$. Therefore, we require

\[ 2H(t_r - t_i) > 60\ln 10 \approx 138, \qquad \text{i.e.} \qquad N_e \equiv H(t_r - t_i) \gtrsim 69. \]

That is, with these numbers we require roughly 70 e-folds of inflation (figures of about 60 are commonly quoted, depending on the epochs assumed) in order for us to observe the flatness that we do today.

Basically, this idea of inflation gives a mechanism by which the Universe is able to stretch and flatten out, very quickly. In fact, inflation also aids in explaining the observed homogeneity of the Universe.
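As a rough numerical check of the flatness numbers, the sketch below computes the growth factor t0/tr and the amount of exponential expansion needed to undo it. It assumes, as in the text, that the curvature fraction is suppressed by exp(−2HΔt) during inflation, and it counts e-folds as N = HΔt (the exponent of the scale factor itself); the function names and the sample times are illustrative.

```python
import math

def curvature_growth(t_now, t_early):
    """Omega_k(t0) / Omega_k(tr) = t0 / tr, using the radiation-era scaling."""
    return t_now / t_early

def efolds_required(suppression):
    """Smallest N = H * (t_r - t_i) such that exp(-2 N) equals `suppression`."""
    return -0.5 * math.log(suppression)

growth = curvature_growth(1e17, 1e-43)   # times in seconds, as in the text
print(growth)                            # of order 10^60
print(efolds_required(1.0 / growth))     # e-folds needed to compensate
```

Since the requirement depends only logarithmically on the suppression factor, changing the assumed times by several orders of magnitude shifts the required number of e-folds by only a few.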


8 The General Theory of Relativity: Discussion

We have now come to a place where all the mathematical groundwork has been laid for a “wordy” discussion about the general theory of relativity.

Before general relativity (or at least a few hundred years before Einstein, as general relativity went through a few people before Einstein, in various forms), gravity was some force that was present between two bodies having mass. As this was so, things that don’t have mass don’t interact with gravity. This means that things like photons are not affected by gravity, and that photons are not capable of generating a gravitational field. Also, the structure of spacetime was that space is flat, and time is just something to be moved through, at a constant rate; where the rate is the same for all observers.

General relativity somewhat starts off by letting space and time mix: spacetime. The whole collection of bits of spacetime is then what we call a manifold; further to this, to give a meaning to the term “distance” in a manifold, we introduce a metric. We call a manifold (collection of points) that has a metric a Riemannian manifold. We “used” to think of spacetime as being flat (Pythagoras’ theorem for distances between two points). A flat spacetime is described by a metric with constant components; the derivative of any one of them, with respect to any coordinate, is zero. Now, general relativity introduces the idea that a metric has components that depend on position. This means that in order to find the distance between two points, you not only have to know where the points are relative to each other, but also where they are relative to the origin of the coordinate system. This is in contrast with only needing to know the relative positions of the two points.

When one computes the derivative of something, one is computing the rate of change of something in a particular direction. Now, when one did this in a flat spacetime, the derivative of the metric didn’t do anything: its derivative was zero. In a position dependent metric, this is no longer true. One finds that there is an extra bit, added onto the differential of something, that is proportional to the derivative of the metric. That this extra bit exists is directly due to the metric being position dependent. Therefore, various combinations of this metric (in the form of differentials with respect to various coordinates) will give us a handle on the geometry of the manifold. A slightly curious thing is that a manifold does not require a higher dimension in which to curve. Usually, when one imagines a ball (as an example), one can see that the surface of the ball is curved round, through three dimensions, but the surface of the ball itself is two dimensional. Manifolds do not require this extra dimension (beyond those within the manifold itself) in which to curve.

Mathematically, we carry around the “extra bits” of the differential in the Christoffel symbols; and the “various combinations” of the differentials of the metric in the Riemann tensor.

Now, something that lives in a manifold, and moves in that manifold, will move along some sort of curve; which is fairly obvious. Now, the motion of something, with respect to a stationary observer, can be determined. In a flat spacetime, something will move along lines that are determined by Newton’s equations of motion. In a curved spacetime (i.e. a spacetime that doesn’t have all zero components of its Riemann tensor), the curves that things move along are changed; and the amount that they are changed by is proportional to the Christoffel symbols. These curves are geodesics. A geodesic in flat spacetime, with no external forces (such as a rocket boost, or magnetic fields), is a straight line. A corresponding geodesic in curved spacetime is curved. This curvature of the path of the freely moving “thing” is due to the curvature of the spacetime.

So, this far in our discussion, we have seen that if a manifold is curved, then things don’t tend to move in straight lines within the manifold. That is, the geodesics are curved lines. A way to imagine this is to envisage a cube threaded with a 3D grid; beads move along the gridlines, but the gridlines are not straight. This is only an analogy, as the real geodesics are 4D. Then, we must consider what it is that does the curving. What thing, in a manifold, causes it to be curved?

The proposal of Einstein is that all forms of matter and energy (even though they are essentially the same) curve spacetime. The proposal equates the distribution of “stuff” (i.e. the things that do the curving, things that have mass & energy) with the geometry of the spacetime. That is, the distribution of mass-energy with combinations of the metric. This means that the more energy you put in a given place, the more the spacetime is curved (and hence the more curvy geodesics get). The distribution of mass-energy is carried around in the energy-momentum tensor, and the geometry in the Einstein tensor.

This curvature of spacetime, due to the distribution of mass-energy, is the “main idea” of general relativity.

Some of the consequences of this general theory include the “ability” of massless things, which have energy, to interact with gravity. This is because the massless things move through the spacetime, and gravity is just the curvature of spacetime. This allows the geodesic of a photon to be curved. Notice that this is in contrast with the previous flat spacetime we started off discussing. This gives the so-called “light deflection” effect. Another consequence is that things at different distances from the centre of a body doing the curving (the so-called gravitating mass) experience different rates of passage through time. This is because the position-dependent metric has different values at different positions (obviously). An example of this is that if we synchronize two clocks on the surface of the earth, then take one up from the surface of the earth and leave one on the surface, they will tell different times when brought back together.
