Gravitation
J.Pearson
July 20, 2009
Abstract
These are a set of notes I have made, based on lectures given by A. Pilaftsis at the University of Manchester, Sept-Dec ’08. Please e-mail me with any comments/corrections: [email protected]. These notes may be found at www.jpoffline.com.
Contents
1 Recap of Special Relativity
1.1 The Lorentz Transformations
1.2 Covariant Formalism
1.2.1 Lorentz Boost
1.3 Standard Relations
1.4 The Equivalence Principles
1.4.1 The Weak Equivalence Principle
1.4.2 The Strong Equivalence Principle
1.5 Gravitational Redshift
1.6 Einstein’s Vision of General Relativity
2 Manifolds, Metrics & Tensors
2.1 Definitions
2.2 Coordinate Transformations
2.2.1 Example: Plane Polars
2.3 Tangent Vector
2.4 The Metric & Line Element
2.4.1 Example: Polars
2.5 Vectors
2.5.1 Contravariant Vectors $A^\mu$
2.5.2 Covariant Vectors $A_\mu$
2.5.3 The Scalar Product
2.5.4 Conformal Transformations
2.5.5 Is it a Proper Vector?
2.6 Tensors
2.6.1 Symmetric & Anti-symmetric Tensors
3 Tensor Calculus
3.1 Covariant Differentiation
3.1.1 Parallel Transport
3.1.2 Absolute Derivative
3.1.3 Transformation of $\Gamma^\lambda{}_{\nu\mu}$
3.1.4 Locally Inertial Frames
3.1.5 Torsion
3.2 Geodesics
3.2.1 The Affine Geodesic
3.2.2 The Metric Geodesic
3.2.3 Relation Between Affine Connection & Christoffel Symbol
3.3 Isometries & Killing’s Equation
3.4 Summary
3.5 Examples
3.5.1 Computing Christoffel Symbols: Effective Lagrangian
3.5.2 Computing the Geodesic
3.5.3 Physical Meaning of the Killing Vector
3.5.4 Nordstrom’s Theory of Gravity
4 Curvature
4.1 The Riemann Tensor
4.1.1 Symmetries of the Riemann Tensor
4.1.2 The Round Trip
4.2 The Ricci Identity
4.3 The Ricci Tensor & Scalar
4.3.1 Example: Plane Polars
4.4 The Bianchi Identity
4.5 The Einstein Tensor
4.6 Geodesic Deviation
5 Einstein’s Equation
5.1 The Energy Momentum Tensor $T^{\mu\nu}$
5.1.1 Components of $T^{\mu\nu}$
5.1.2 Conservation Equations
5.1.3 Perfect Fluids
5.2 Einstein’s Equation
5.2.1 The Cosmological Constant
5.3 The Newtonian Limit
5.3.1 Newtonian Gravity from Einstein’s Gravity
5.4 Linearised Gravity
5.4.1 Linearising Einstein’s Equation
5.4.2 Gravitational Radiation
6 The Schwarzschild Solution
6.0.3 Gravitational Redshift
6.1 Dynamics in the Schwarzschild Spacetime
6.1.1 Geodesics & Christoffel Symbols
6.1.2 Orbits
6.1.3 Summary
6.2 Light Deflection
6.3 Perihelion Precession
6.4 Black Holes
6.4.1 Null Geodesics
6.4.2 Eddington-Finkelstein Coordinates
7 The Friedmann-Robertson-Walker Universe
7.1 The FRW Metric
7.2 Geodesics & Christoffel Symbols
7.3 Cosmology in the FRW Universe
7.3.1 Species Evolution & Densities
7.4 Age of the FRW Universe
7.4.1 Age of Matter Dominated Universe
7.4.2 Age of Matter & Curvature Dominated Universe
7.5 Light in the FRW Universe
7.6 Flatness Problem
7.6.1 Inflation
8 The General Theory of Relativity: Discussion
1 Recap of Special Relativity
Let us quickly recap the principles of special relativity that are assumed to be known.
The postulates of SR:
• All laws of nature are the same for all inertial observers;
• The speed of light, c, is the same for all inertial observers.
1.1 The Lorentz Transformations
Consider a frame Σ′, within which an observer is stationary. The coordinates in that frame are the “primed ones”, (ct′, x′, y′, z′). Now, consider another frame, Σ, such that Σ′ is moving at constant velocity v along the x-axis (with β ≡ v/c) relative to a stationary observer in Σ. The coordinates in the “stationary frame” are unprimed, (ct, x, y, z).
The two sets of coordinates are related via the transformations
$$ ct' = \gamma(ct - \beta x), \quad x' = \gamma(x - \beta ct), \quad y' = y, \quad z' = z. \tag{1.1} $$
We have defined the quantities
$$ \gamma \equiv \frac{1}{\sqrt{1 - \beta^2}}, \qquad \beta \equiv \frac{v}{c}. $$
From the transformations, we can compute “the invariance of the interval”, thus
$$ c^2t'^2 - x'^2 - y'^2 - z'^2 = c^2t^2 - x^2 - y^2 - z^2. $$
The physical consequences of this are FitzGerald contraction (moving bodies shorten) and time dilation (moving clocks run slow).
1.2 Covariant Formalism
The title “covariant formalism” is a little misleading: it should read “invariant formalism”, but convention leaves it so.
Let us define the contravariant position 4-vector as
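$$ x^\mu = (x^0, x^1, x^2, x^3) = (ct, x, y, z). \tag{1.2} $$
The metric of SR is flat, called the Minkowski metric, and written $\eta_{\mu\nu}$. The elements of the metric may be represented as
$$ (\eta_{\mu\nu}) = \mathrm{diag}(1,-1,-1,-1) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. $$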
Notice that this metric is symmetric; $\eta_{\mu\nu} = \eta_{\nu\mu}$. Consider constructing an inverse matrix to this metric. That is, we require
$$ \eta\,\eta^{-1} = 1_4, $$
where $1_4$ is the 4-D identity matrix diag(1,1,1,1). Inspection shows that the inverse matrix has the same elements as the original. We denote the inverse of the metric as
$$ (\eta^{-1})^{\mu\nu} \equiv \eta^{\mu\nu}, $$
thus, we have that
$$ \eta_{\mu\nu}\eta^{\nu\lambda} = \delta_\mu^\lambda. $$
Now, in Euclidean space, suppose we have a vector $\mathbf{x} = x^i e_i$, where $e_i$ is a basis vector and $i \in [1, n]$, where n is the dimension of the Euclidean space (usually 3). Then, the dot-product of the vector with itself can be written as
$$ \mathbf{x}\cdot\mathbf{x} = x^i x^j\, e_i\cdot e_j, $$
and we “mix the basis vectors” via the Kronecker-delta, which is the metric of Euclidean space
$$ e_i\cdot e_j = \delta_{ij} \quad\Rightarrow\quad \mathbf{x}\cdot\mathbf{x} = x^i x^j\delta_{ij} = x^i x^i. $$
If we expand out this implied summation, we get the squared radius of a sphere in the n-dimensional Euclidean space (written here for n = 3)
$$ x^i x^i = x^2 + y^2 + z^2. $$
Now, we make the analogy to Minkowski space. We denote a contravariant vector as $x = x^\mu e_\mu$, so that the inner-product of the vector with itself is written
$$ x\cdot x = x^\mu x^\nu\, e_\mu\cdot e_\nu, $$
and again we mix the basis vectors by the metric of the space; the metric of Minkowski space is $\eta_{\mu\nu}$. Thus,
$$ e_\mu\cdot e_\nu = \eta_{\mu\nu}, $$
and therefore
$$ x\cdot x = x^\mu x^\nu\eta_{\mu\nu}. $$
If we say that
$$ x_\mu = \eta_{\mu\nu}x^\nu, \tag{1.3} $$
then we see that
$$ x\cdot x = x^\mu x_\mu. $$
From this, we are able to define the covariant position 4-vector as
$$ x_\mu = \eta_{\mu\nu}x^\nu = (ct,-x,-y,-z). $$
And therefore, carrying out the summation, we find that the inner-product of the position 4-vector with itself is the Minkowski-space analogue of the squared radius of a sphere;
$$ x^\mu x_\mu = (ct)^2 - x^2 - y^2 - z^2. $$
Of course, we can write the inner-product of one 4-vector with another
$$ x\cdot y = x^\mu y^\nu\eta_{\mu\nu} = x^\mu y_\mu. $$
Just as we used the metric to lower a contravariant vector's index, to become a covariant index, we may use the inverse metric to raise a covariant index to become a contravariant one
$$ x^\mu = \eta^{\mu\nu}x_\nu. \tag{1.4} $$
Therefore, using these relations, we are able to see that
$$ x^\nu y_\nu = x_\nu y^\nu. $$
1.2.1 Lorentz Boost
Consider again the 4-vector $x = x^\mu e_\mu$. Since the vector itself is the same in any other frame, we must have that
$$ x^\mu e_\mu = x'^\mu e'_\mu. $$
The way we transform between frames is via a Lorentz boost;
$$ x'^\mu = \Lambda^\mu{}_\nu x^\nu, \tag{1.5} $$
where we use
$$ (\Lambda^\mu{}_\nu) = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. $$
If we note all of our definitions used thus far (for contravariant vectors, and their components), and the expressions forming the Lorentz transformations, (1.1), we see that
$$ \Lambda^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\nu}. $$
We say that $\Lambda^\mu{}_\nu$ (as defined above) constitutes a boost along the x-axis. It is in fact a (hyperbolic) rotation in the (ct, x)-plane, which leaves the (y, z)-plane fixed.
Hence, we have a rule for transforming contravariant components between frames: (1.5). Then, how do covariant components transform?
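Consider using the metric to change from a contravariant vector to a covariant one, in the primed frame,
$$ x'_\mu = \eta_{\mu\kappa}x'^\kappa, $$
then we use (1.5) to transform the contravariant vector on the RHS
$$ \eta_{\mu\kappa}x'^\kappa = \eta_{\mu\kappa}\Lambda^\kappa{}_\lambda x^\lambda, $$
then express $x^\lambda$ on the RHS through its covariant components, using (1.4),
$$ \eta_{\mu\kappa}\Lambda^\kappa{}_\lambda x^\lambda = \eta_{\mu\kappa}\Lambda^\kappa{}_\lambda\eta^{\lambda\nu}x_\nu. $$
Although not previously stated, we can imagine that the metric can lower/raise indices on anything, not just position vector-components. Thus, we see that $\eta_{\mu\kappa}\Lambda^\kappa{}_\lambda = \Lambda_{\mu\lambda}$. Hence, the above reads
$$ \eta_{\mu\kappa}\Lambda^\kappa{}_\lambda\eta^{\lambda\nu}x_\nu = \Lambda_\mu{}^\nu x_\nu. $$
Now, let us define the inverse Lorentz transform as
$$ (\Lambda^{-1})^\nu{}_\mu \equiv \Lambda_\mu{}^\nu. $$
Therefore, writing this stream of algebra down, from start to finish, we arrive at our result
$$ \begin{aligned} x'_\mu &= \eta_{\mu\kappa}x'^\kappa \\ &= \eta_{\mu\kappa}\Lambda^\kappa{}_\lambda x^\lambda \\ &= \eta_{\mu\kappa}\Lambda^\kappa{}_\lambda\eta^{\lambda\nu}x_\nu \\ &= \Lambda_\mu{}^\nu x_\nu \\ &= (\Lambda^{-1})^\nu{}_\mu x_\nu. \end{aligned} $$
That is, to find the covariant components of a vector in the primed frame, we relate them to the unprimed frame via the inverse Lorentz transformation
$$ x'_\mu = (\Lambda^{-1})^\nu{}_\mu x_\nu. \tag{1.6} $$
Let us then write our two Lorentz transformation rules; one for contravariant components & one for covariant
$$ x'^\mu = \Lambda^\mu{}_\nu x^\nu, \qquad x'_\mu = (\Lambda^{-1})^\nu{}_\mu x_\nu. \tag{1.7} $$
Notice that the inverse Lorentz transformation matrix may be written as
$$ \left((\Lambda^{-1})^\nu{}_\mu\right) = \begin{pmatrix} \gamma & \gamma\beta & 0 & 0 \\ \gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad (\Lambda^{-1})^\nu{}_\mu = \frac{\partial x^\nu}{\partial x'^\mu}. $$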
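Notice that the product of $\Lambda^\mu{}_\nu$ and $(\Lambda^{-1})^\nu{}_\mu$ is the identity matrix, as they are inverses
$$ \Lambda^\nu{}_\lambda(\Lambda^{-1})^\mu{}_\nu = \delta^\mu_\lambda. $$
We are now in a position to be able to prove the invariance of the interval, in Minkowski space, under Lorentz transformations. Consider the inner-product of two vectors in the primed frame,
$$ x'\cdot y' = x'^\mu y'_\mu, $$
we then transform each expression on the RHS, according to the relevant rule
$$ x'^\mu y'_\mu = \Lambda^\mu{}_\nu(\Lambda^{-1})^\lambda{}_\mu x^\nu y_\lambda, $$
then, noting the relation between the transformation & its inverse,
$$ \Lambda^\mu{}_\nu(\Lambda^{-1})^\lambda{}_\mu x^\nu y_\lambda = \delta^\lambda_\nu x^\nu y_\lambda, $$
which easily gives
$$ \delta^\lambda_\nu x^\nu y_\lambda = x^\nu y_\nu. $$
And therefore, putting it all together
$$ \begin{aligned} x'^\mu y'_\mu &= \Lambda^\mu{}_\nu(\Lambda^{-1})^\lambda{}_\mu x^\nu y_\lambda \\ &= \delta^\lambda_\nu x^\nu y_\lambda \\ &= x^\nu y_\nu. \end{aligned} $$
And thus, we have shown that the inner-product is invariant under Lorentz transformation (the invariance of the interval).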
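The invariance just shown can be checked numerically. The following is a minimal Python sketch, not part of the original lectures; the helper names `boost_x`, `apply`, `matmul` and `minkowski_dot`, the value β = 0.6 and the two sample 4-vectors are our own illustrative choices.

```python
import math

def boost_x(beta):
    """Matrix Lambda^mu_nu for a boost along x with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the Minkowski metric

def apply(matrix, vector):
    """x'^mu = Lambda^mu_nu x^nu."""
    return [sum(matrix[m][n] * vector[n] for n in range(4)) for m in range(4)]

def matmul(a, b):
    """Plain 4x4 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def minkowski_dot(x, y):
    """x . y = eta_{mu nu} x^mu y^nu."""
    return sum(ETA[m] * x[m] * y[m] for m in range(4))

beta = 0.6                   # illustrative boost speed
x = [5.0, 1.0, 2.0, 3.0]     # arbitrary sample 4-vectors
y = [2.0, -1.0, 0.5, 4.0]
x_primed = apply(boost_x(beta), x)
y_primed = apply(boost_x(beta), y)

# The boost with beta -> -beta is the inverse matrix written above.
identity_check = matmul(boost_x(beta), boost_x(-beta))

print(minkowski_dot(x, y), minkowski_dot(x_primed, y_primed))
```

The two printed numbers should agree up to rounding, and `identity_check` should be the 4-D identity matrix to the same accuracy.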
1.3 Standard Relations
Here we shall merely state the standard definitions of various 4-vectors.
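The infinitesimal 4-position is defined as
$$ dx^\mu = (c\,dt, d\mathbf{x}), \quad\Rightarrow\quad dx_\mu = \eta_{\mu\nu}dx^\nu = (c\,dt, -d\mathbf{x}). $$
The line element:
$$ ds^2 = \eta_{\mu\nu}dx^\mu dx^\nu = c^2dt^2 - (d\mathbf{x})^2. $$
Proper time:
$$ d\tau = \frac{1}{c}\sqrt{dx^\nu dx_\nu} = \frac{dt}{\gamma}. $$
4-velocity:
$$ u^\mu = \frac{dx^\mu}{d\tau} = (c\gamma, \gamma\mathbf{v}). $$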
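4-momentum:
$$ p^\mu = mu^\mu = (E/c, \mathbf{p}). $$
Differential operator:
$$ \partial_\mu \equiv \left(\frac{1}{c}\frac{\partial}{\partial t}, \nabla\right). $$
Charge conservation:
$$ \partial_\mu J^\mu = 0, \qquad J^\mu = (c\rho, \mathbf{J}). $$
Lorentz gauge:
$$ \partial_\mu A^\mu = 0, \qquad A^\mu = (\phi/c, \mathbf{A}). $$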
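A consequence of the definitions of proper time and 4-velocity, not written out above, is that $u^\mu u_\mu = c^2$ for any sub-luminal 3-velocity. The short Python sketch below checks this; the function names and the sample velocity are our own assumptions, chosen only for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def four_velocity(v):
    """u^mu = (c*gamma, gamma*v) for a 3-velocity v given in m/s."""
    speed_squared = sum(component**2 for component in v)
    gamma = 1.0 / math.sqrt(1.0 - speed_squared / C**2)
    return [gamma * C] + [gamma * component for component in v]

def minkowski_square(a):
    """a^mu a_mu with metric diag(1, -1, -1, -1)."""
    return a[0]**2 - sum(component**2 for component in a[1:])

# Illustrative velocity with |v| = 0.5 c.
u = four_velocity([0.3 * C, 0.4 * C, 0.0])
ratio = minkowski_square(u) / C**2
print(ratio)  # should be close to 1
```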
1.4 The Equivalence Principles
Here we shall discuss some thought experiments which lead to the development of general relativity.
1.4.1 The Weak Equivalence Principle
Imagine an observer and “ball” inside a sealed lift. The observer is stationary relative to the ball, and is unable to see out of the lift. Suppose that the lift is suspended in a homogeneous gravitational field.
Then, suppose that the cable holding the lift up is cut. The lift will accelerate downwards, $\mathbf{a} = \mathbf{g}$; where the acceleration due to gravity is just given by
$$ \mathbf{g} = -\nabla\phi_g. $$
Now, experience tells us that both the observer and ball will remain at rest, relative to each other, inside the lift.
From Newton, we have the relation between the resultant force on a body (which will be the gravitational mass times the gravitational field), and the inertial mass times the acceleration:
$$ m_i\mathbf{a} = m_g\mathbf{g}. $$
Thus, as $\mathbf{a} = \mathbf{g}$, we therefore easily see that $m_i = m_g$. This leads to the statement of the weak equivalence principle:
“Gravity couples in the same way to all mass & energy”.
1.4.2 The Strong Equivalence Principle
Consider the same setup as before: observer & ball at rest inside a sealed lift. This time, let the lift be in free space (i.e. no gravitational fields, anywhere).
Then, suppose that we accelerate the lift (using a rocket) such that a = g. We see that there is no difference between this situation and one in which the lift sits on the earth's surface. Thus, the strong equivalence principle:
“All laws of physics are the same in an accelerated frame, and in a uniform static gravitational field”.
1.5 Gravitational Redshift
Consider a lift with a stationary observer in it. Also in the lift is a light bulb, which emits light at frequency $\nu'$, according to the observer stationary inside the lift. Now, consider that there is another observer, stationary, on the surface of the earth (which we model as having a homogeneous gravitational field). We have $\hat{\mathbf{x}}$ pointing upwards, from the surface of the earth. Then,
$$ \mathbf{g} = -\frac{d\phi}{d\ell}\hat{\mathbf{x}}. $$
Let the length of the lift be $d\ell$, and the light bulb reside at the top of the lift. Then, a signal traveling at speed c takes time $dt = d\ell/c$ to traverse the length of the lift.
Now, suppose that the lift is traveling at speed v, and then the observer on the earth will see some shifted frequency, ν. The Doppler shift is just
$$ \frac{\nu'}{\nu} = \left(\frac{1 + v/c}{1 - v/c}\right)^{1/2} \approx 1 + \frac{v}{c}, $$
after using the binomial expansion for $v \ll c$. From this, writing $d\nu = \nu' - \nu$, we see that
$$ \frac{d\nu}{\nu} = \frac{v}{c}. $$
Using the relation that $v = du = g\,dt$, this simply gives that
$$ \frac{d\nu}{\nu} = \frac{g}{c}dt, $$
which gives, using $c\,dt = d\ell$,
$$ \frac{d\nu}{\nu} = \frac{g}{c^2}d\ell. $$
Now, if we use the fact that $g\,d\ell = -d\phi$, then this is just
$$ \frac{d\nu}{\nu} = -\frac{d\phi}{c^2}. $$
Therefore, we see that frequency shift is due to a changing gravitational potential. Thus, if a photon is moving out of a potential, then it will be red-shifted; and inward would be blue-shifted.
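To get a feel for the size of the effect, the relation $|d\nu/\nu| = g\,d\ell/c^2$ can be evaluated for a laboratory-scale height. The sketch below is only illustrative: the surface gravity of 9.81 m/s² and the height of 22.5 m are our own assumed inputs, and `fractional_shift` is a hypothetical helper name.

```python
C = 299_792_458.0   # speed of light in m/s
G_SURFACE = 9.81    # assumed uniform surface gravity in m/s^2

def fractional_shift(height_m, g=G_SURFACE):
    """Magnitude of d(nu)/nu = g * d(ell) / c^2 in a uniform field."""
    return g * height_m / C**2

shift = fractional_shift(22.5)  # illustrative height in metres
print(f"{shift:.3e}")
```

The result is of order $10^{-15}$, which shows why the shift is negligible in everyday situations.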
1.6 Einstein’s Vision of General Relativity
Einstein’s vision is that spacetime is a manifold, such that line elements are given by
$$ ds^2 = g_{\mu\nu}(x^\rho)dx^\mu dx^\nu, $$
where the metric is a function of coordinates. Within the metric (or, how the metric is constructed) is information on how spacetime is curved; and it is curved by any form of energy/momentum. According to the equivalence principle, one can always choose coordinates such that space is locally flat (Minkowski). Things in the spacetime travel along geodesics (the generalisation of straight lines). Massive particles travel along time-like geodesics, which have $ds^2 > 0$, photons travel along null geodesics $ds^2 = 0$, and tachyons along $ds^2 < 0$.
2 Manifolds, Metrics & Tensors
2.1 Definitions
Let us state some rather (mathematically) loose definitions.
Manifold A manifold is a continuous set of points, which locally looks like an n-dimensional Minkowski space.
That is, given a manifold M, if we “zoom in” on a little bit, that little bit will look flat. Suppose we zoom in on a bit which we label $u_i(p)$, where i just means that we chose one of many bits; and p is the point at the middle of the bit $u_i$. The coordinate system in $u_1$ (say) is Minkowski, $x^a(p)$. The whole collection of these little bits leads us to our next definition.
Atlas An atlas is the complete set of coordinate systems $u_i$ in the manifold M.
A manifold endowed with a metric is called a Riemannian manifold.
Curve A curve, in an n-manifold (where M merely has n coordinates), is a subset of points defined parametrically
$$ x^a = x^a(\lambda), \quad a = 1, 2, \dots, n, \quad \lambda\in\mathbb{R}. $$
For example, consider a 1-sphere (i.e. a circle), defined by the equation $x^2 + y^2 = 1$. We parameterise it thus
$$ x^a = (x(\lambda), y(\lambda)) \quad\Rightarrow\quad x(\lambda) = \sin\lambda, \quad y(\lambda) = \cos\lambda; \quad 0\le\lambda<2\pi. $$
Surfaces An m-dimensional hypersurface in an n-manifold (whereby m < n), is defined as
$$ x^a = x^a(\lambda_1, \dots, \lambda_m); \quad \lambda_1, \dots, \lambda_m\in\mathbb{R}. $$
So that a curve is a 1D hypersurface. Or, alternatively, a surface is a generalisation of a curve.
For example, consider a 2-sphere (i.e. the surface of a ball), of constant radius r. It is defined by $x^2 + y^2 + z^2 = r^2 = \text{const}$. We parameterise the surface by (θ, φ), so that
$$ x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta; \quad 0\le\theta\le\pi, \quad 0\le\phi<2\pi. $$
2.2 Coordinate Transformations
Consider moving from one coordinate system to another
$$ x^\mu \longmapsto x'^\mu = x'^\mu(x^\nu). $$
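Such a transformation is defined by displacement vectors $dx^\mu$ and $dx'^\nu$, such that
$$ dx'^\mu = J^\mu{}_\nu dx^\nu, \tag{2.1} $$
whereby the inverse is just
$$ dx^\mu = (J^{-1})^\mu{}_\nu dx'^\nu. $$
By the chain rule, it is easy to see that the transformation matrix is just the Jacobian
$$ J^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\nu}. \tag{2.2} $$
The transformation & inverse satisfy
$$ J^\mu{}_\nu(J^{-1})^\nu{}_\sigma = \delta^\mu_\sigma. \tag{2.3} $$
This is easier to see if we represent the Jacobians in terms of differentials,
$$ J^\mu{}_\nu(J^{-1})^\nu{}_\sigma = \frac{\partial x'^\mu}{\partial x^\nu}\frac{\partial x^\nu}{\partial x'^\sigma} = \frac{\partial x'^\mu}{\partial x'^\sigma} = \delta^\mu_\sigma. $$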
2.2.1 Example: Plane Polars
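Consider that some point in the $\mathbb{R}^2$ plane may be defined by Cartesian coordinates (x, y) or plane polars, (r, θ). Then, we make the identifications
$$ (x^1, x^2) = (x, y), \qquad (x'^1, x'^2) = (r, \theta). $$
We also know that
$$ x = r\cos\theta, \quad y = r\sin\theta; \qquad r = \sqrt{x^2 + y^2}, \quad \theta = \tan^{-1}(y/x). $$
Then, we can compute the elements of the Jacobian
$$ J^i{}_j = \frac{\partial x'^i}{\partial x^j} = \frac{\partial(r, \theta)}{\partial(x, y)} = \begin{pmatrix} \dfrac{\partial r}{\partial x} & \dfrac{\partial r}{\partial y} \\ \dfrac{\partial\theta}{\partial x} & \dfrac{\partial\theta}{\partial y} \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{pmatrix}. $$
And therefore,
$$ \begin{aligned} dr &= \sum_j J^r{}_j dx^j \\ &= J^r{}_x dx + J^r{}_y dy \\ &= \cos\theta\,dx + \sin\theta\,dy. \end{aligned} $$
And similarly,
$$ d\theta = -\frac{\sin\theta}{r}dx + \frac{\cos\theta}{r}dy. $$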
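The plane-polar Jacobian can be sanity-checked numerically. In the sketch below (our own illustration, with hypothetical helper names `jacobian`, `inverse_jacobian` and `matmul`), the inverse Jacobian is obtained by differentiating x = r cos θ, y = r sin θ, the product of the two matrices is compared with (2.3), and the prediction for dr is compared with the actual change in r for a small Cartesian displacement.

```python
import math

def jacobian(x, y):
    """J^i_j = d(r, theta)/d(x, y), evaluated at the Cartesian point (x, y)."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta) / r, math.cos(theta) / r]]

def inverse_jacobian(x, y):
    """(J^-1)^i_j = d(x, y)/d(r, theta), from x = r cos(theta), y = r sin(theta)."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def matmul(a, b):
    """Plain 2x2 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

x, y = 3.0, 4.0  # arbitrary sample point (r = 5)
product = matmul(jacobian(x, y), inverse_jacobian(x, y))

# Compare dr = cos(theta) dx + sin(theta) dy with the true change in r.
dx, dy = 1e-6, -2e-6
J = jacobian(x, y)
dr_predicted = J[0][0] * dx + J[0][1] * dy
dr_actual = math.hypot(x + dx, y + dy) - math.hypot(x, y)
print(product, dr_predicted, dr_actual)
```

The two values of dr differ only at second order in the displacement.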
2.3 Tangent Vector
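Imagine that on a manifold M, we have curves parameterised by u. On one curve, there is a point p, reached at the parameter value $u = u_p$. So, we have $x^\mu = x^\mu(u)$, and the tangent vector to the curve at p is defined to be
$$ T^\mu = \left.\frac{dx^\mu}{du}\right|_{u = u_p}. \tag{2.4} $$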
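As a concrete instance of (2.4), take the circle of Section 2.1, x(λ) = sin λ, y(λ) = cos λ, whose tangent vector is (cos λ, −sin λ). The Python sketch below approximates the derivative by a central difference; the step size and the sample parameter value are our own choices.

```python
import math

def curve(lam):
    """The unit circle x^a(lambda) = (sin(lambda), cos(lambda))."""
    return (math.sin(lam), math.cos(lam))

def tangent(lam, h=1e-6):
    """Central-difference estimate of T^a = dx^a/d(lambda)."""
    plus, minus = curve(lam + h), curve(lam - h)
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))

lam_p = 0.3  # illustrative parameter value at the point p
T = tangent(lam_p)
exact = (math.cos(lam_p), -math.sin(lam_p))
print(T, exact)
```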
2.4 The Metric & Line Element
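We have the line element
$$ ds^2 = g_{\mu\nu}(x)dx^\mu dx^\nu. \tag{2.5} $$
Now, a common requirement is the invariance of the line element (i.e. invariance of the interval). Thus, we require that
$$ ds^2(x) = ds^2(x'). $$
So, under transformation $x^\mu \mapsto x'^\nu(x^\mu)$, we want that
$$ g_{\mu\nu}dx^\mu dx^\nu = g'_{\alpha\beta}dx'^\alpha dx'^\beta. \tag{2.6} $$
So, we proceed by writing down the known transformation of the RHS “primed” to “unprimed” displacement vectors,
$$ g_{\mu\nu}dx^\mu dx^\nu = g'_{\alpha\beta}dx'^\alpha dx'^\beta = g'_{\alpha\beta}J^\alpha{}_\mu J^\beta{}_\nu dx^\mu dx^\nu. $$
But, this must always be consistent, so we see that we must have
$$ g_{\mu\nu} = g'_{\alpha\beta}J^\alpha{}_\mu J^\beta{}_\nu. \tag{2.7} $$
We can derive a similar relation, by starting from (2.6), and instead of transforming the RHS, transform the LHS. So,
$$ g'_{\alpha\beta}dx'^\alpha dx'^\beta = g_{\mu\nu}dx^\mu dx^\nu = g_{\mu\nu}(J^{-1})^\nu{}_\beta(J^{-1})^\mu{}_\alpha dx'^\alpha dx'^\beta, $$
which we require to always be true, leaving us with
$$ g'_{\alpha\beta} = g_{\mu\nu}(J^{-1})^\nu{}_\beta(J^{-1})^\mu{}_\alpha. \tag{2.8} $$
The alternative way of writing the Jacobian leads us to be able to rewrite (trivially) expressions (2.7) and (2.8)
$$ g_{\mu\nu} = \frac{\partial x'^\alpha}{\partial x^\mu}\frac{\partial x'^\beta}{\partial x^\nu}g'_{\alpha\beta}, \qquad g'_{\alpha\beta} = \frac{\partial x^\nu}{\partial x'^\beta}\frac{\partial x^\mu}{\partial x'^\alpha}g_{\mu\nu}. $$
We call $g_{\mu\nu}$ the “metric”, and $g^{\mu\nu}$ the “inverse metric”; where they must satisfy
$$ g_{\mu\nu}g^{\nu\lambda} = \delta^\lambda_\mu. \tag{2.9} $$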
2.4.1 Example: Polars
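We know that the line element in plane polars is $ds^2 = dr^2 + r^2d\theta^2$. Thus, we can read off the elements of the metric
$$ (g_{ij}) = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}, $$
and, by (2.9), we see that we require
$$ (g^{ij}) = \begin{pmatrix} 1 & 0 \\ 0 & 1/r^2 \end{pmatrix}. $$
In spherical polars, the line element is
$$ ds^2 = dr^2 + r^2d\theta^2 + r^2\sin^2\theta\,d\phi^2; $$
and we can easily read off the metric
$$ (g_{ij}) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2\sin^2\theta \end{pmatrix}, \qquad (g^{ij}) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/r^2 & 0 \\ 0 & 0 & 1/(r^2\sin^2\theta) \end{pmatrix}. $$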
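The plane-polar metric can also be obtained from the transformation rule (2.8), starting from the Cartesian metric $\delta_{ij}$ and the inverse Jacobian of x = r cos θ, y = r sin θ. The following Python sketch does this numerically at one sample point; the function names and the sample values of r and θ are our own assumptions.

```python
import math

def inverse_jacobian(r, theta):
    """(J^-1)^mu_alpha = d(x, y)/d(r, theta); rows are (x, y), columns (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def transform_metric(g, jinv):
    """g'_{alpha beta} = g_{mu nu} (J^-1)^mu_alpha (J^-1)^nu_beta, as in (2.8)."""
    n = len(g)
    return [[sum(g[m][k] * jinv[m][a] * jinv[k][b]
                 for m in range(n) for k in range(n))
             for b in range(n)]
            for a in range(n)]

cartesian_metric = [[1.0, 0.0], [0.0, 1.0]]
r, theta = 2.0, 0.7  # illustrative point
polar_metric = transform_metric(cartesian_metric, inverse_jacobian(r, theta))
print(polar_metric)
```

Up to rounding, the output is diag(1, r²), in agreement with the metric read off from the line element.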
Raising & Lowering We can use the metric to raise & lower indices. We shall not show this in use here; see the next subsection.
2.5 Vectors
We start this by discussing contravariant and covariant vectors.
2.5.1 Contravariant Vectors $A^\mu$
These are sometimes just denoted “vectors”.
These are defined to transform, under coordinate transformation $x^\mu \mapsto x'^\mu(x^\nu)$, as
$$ A'^\mu = J^\mu{}_\nu A^\nu. \tag{2.10} $$
2.5.2 Covariant Vectors $A_\mu$
These are sometimes called “covectors”.
Let us say that we define a covector $A_\mu$ via
$$ A_\mu = g_{\mu\nu}A^\nu. $$
Then, we may derive its transformation properties. Consider that
$$ A'_\mu = g'_{\mu\nu}A'^\nu, $$
the RHS of which we know the transformation rules for
$$ A'_\mu = g'_{\mu\nu}A'^\nu = (J^{-1})^\alpha{}_\mu(J^{-1})^\beta{}_\nu g_{\alpha\beta}J^\nu{}_\sigma A^\sigma. $$
We can rearrange the terms in this expression,
$$ A'_\mu = g'_{\mu\nu}A'^\nu = (J^{-1})^\alpha{}_\mu(J^{-1})^\beta{}_\nu J^\nu{}_\sigma g_{\alpha\beta}A^\sigma, $$
so that we notice the appearance of a transformation-inverse multiplication, which results in a Kronecker-delta
$$ A'_\mu = g'_{\mu\nu}A'^\nu = (J^{-1})^\alpha{}_\mu\delta^\beta_\sigma g_{\alpha\beta}A^\sigma, $$
acting the Kronecker-delta results in (ignoring the middle equality now)
$$ A'_\mu = (J^{-1})^\alpha{}_\mu g_{\alpha\beta}A^\beta, $$
then lowering the index, via the metric,
$$ A'_\mu = (J^{-1})^\alpha{}_\mu A_\alpha. \tag{2.11} $$
And therefore, we have arrived at the relation we require.
2.5.3 The Scalar Product
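The scalar product between two vectors is written
$$ S\cdot T = S^\mu T^\nu g_{\mu\nu} = S^\nu T_\nu. $$
A fairly obvious thing we need to prove is the invariance of the dot-product. So, inverting (2.10) and (2.11) to write the unprimed components in terms of the primed ones, and then using (2.3),
$$ S^\nu T_\nu = (J^{-1})^\nu{}_\alpha S'^\alpha\, J^\beta{}_\nu T'_\beta = S'^\alpha T'_\beta\,\delta^\beta_\alpha = S'^\alpha T'_\alpha. $$
This is a very important proof. In fact, it also states that scalars are invariant under transformation.
Within the scalar product, we must briefly mention the modulus of a vector. We denote it as ||S||, and define it by
$$ \|S\| = \begin{cases} (S^\mu S_\mu)^{1/2} & \text{time-like, } ds^2 > 0, \\ (-S^\mu S_\mu)^{1/2} & \text{space-like, } ds^2 < 0. \end{cases} $$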
2.5.4 Conformal Transformations
Following from the previous definition of the scalar product, we have the definition of the angle between two vectors;
$$ \cos\theta = \frac{S^\mu T_\mu}{\|S\|\,\|T\|} = \frac{S^\mu T_\mu}{(S^\alpha S_\alpha)^{1/2}(T^\beta T_\beta)^{1/2}}. \tag{2.12} $$
A conformal transformation is defined as one under which the angle between two vectors is unchanged.
Associated metrics are termed “conformal metrics”. How can we find such metrics? They are given by
$$ \tilde{g}_{\mu\nu} = \Omega(x)g_{\mu\nu}, \qquad \Omega(x) \neq 0. \tag{2.13} $$
We can see this by writing the cos θ expression in terms of the metric,
$$ \cos\theta = \frac{g_{\mu\nu}S^\mu T^\nu}{(g_{\alpha\gamma}S^\alpha S^\gamma)^{1/2}(g_{\beta\delta}T^\beta T^\delta)^{1/2}}, $$
and by substituting $\tilde{g}_{\mu\nu} = \Omega(x)g_{\mu\nu}$ for $g_{\mu\nu}$, we see that the factors of Ω end up canceling, leaving the angle unchanged.
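The cancellation of Ω can be seen numerically. The sketch below is our own illustration: it uses the positive-definite plane-polar metric diag(1, r²) so that the square roots are real, and an arbitrary positive value of Ω at the point in question; all names and sample numbers are assumptions.

```python
import math

def dot(metric, s, t):
    """g_{mu nu} S^mu T^nu for a metric given as a nested list."""
    n = len(s)
    return sum(metric[m][k] * s[m] * t[k] for m in range(n) for k in range(n))

def cos_angle(metric, s, t):
    """cos(theta) from (2.12), written in terms of the metric."""
    return dot(metric, s, t) / math.sqrt(dot(metric, s, s) * dot(metric, t, t))

def rescale(metric, omega):
    """Conformally related metric Omega * g at a single point."""
    return [[omega * entry for entry in row] for row in metric]

r = 2.0
polar = [[1.0, 0.0], [0.0, r**2]]   # plane-polar metric at radius r
s = [1.0, 0.5]                      # arbitrary sample vectors
t = [0.3, -0.2]
omega = 3.7                         # illustrative positive value of Omega(x)

before = cos_angle(polar, s, t)
after = cos_angle(rescale(polar, omega), s, t)
print(before, after)
```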
2.5.5 Is it a Proper Vector?
Here, we ask if various quantities are “proper vectors”, or not.
Consider $C^\mu(x) = aA^\mu(x) + bB^\mu(x)$. It is clearly a proper vector, as each of its constituents transforms as we expect: each is defined at the same coordinate point.
Consider $C^\mu = aA^\mu(x_1) + bB^\mu(x_2)$. This is not a proper vector, as the constituents are defined at different points, and the Jacobian generally differs from point to point.
2.6 Tensors
These are basically vectors, with more indices. We can also mix the indices, so that we have some up, some down.
For example, consider $F^{\mu\nu} \equiv A^\mu B^\nu$. We call it a second rank contravariant tensor, or a (2,0)-tensor. It clearly transforms as
$$ F'^{\mu\nu} = A'^\mu B'^\nu = J^\mu{}_\alpha J^\nu{}_\beta A^\alpha B^\beta = J^\mu{}_\alpha J^\nu{}_\beta F^{\alpha\beta}. $$
Similarly, a second rank covariant tensor, or a (0,2)-tensor, transforms like
$$ F'_{\mu\nu} = A'_\mu B'_\nu = (J^{-1})^\alpha{}_\mu(J^{-1})^\beta{}_\nu A_\alpha B_\beta = (J^{-1})^\alpha{}_\mu(J^{-1})^\beta{}_\nu F_{\alpha\beta}. $$
Finally, a mixed (1,1)-tensor transforms
$$ F'^\mu{}_\nu = J^\mu{}_\alpha(J^{-1})^\beta{}_\nu F^\alpha{}_\beta. $$
This obviously generalises to higher-rank tensors. One must include a Jacobian for each contravariant index, and one inverse Jacobian for each covariant index.
Getting equations & other expressions into tensorial form (i.e. into a form consistent with the above tensor transformations), is extremely useful. For example, given a tensor equation in one frame of reference, one therefore knows the form in all frames of reference. This becomes particularly useful when one finds a frame in which a particular equation becomes simple to analyse; then, one can simply transform out of that frame, and know that the analysis still holds.
Also, consider a tensor whose components are all zero. Then, one cannot make a coordinate transformation that will be able to “reinstate” those (completely) zero components. That is, a tensor with zero components in one frame, has zero components in all frames. This is a very useful concept. If a quantity is not a tensor, then this does not hold true. That is, a non-tensor with zero components in one frame may have non-zero components in another.
2.6.1 Symmetric & Anti-symmetric Tensors
A symmetric (2,0)-tensor is one where
$$ A^{\mu\nu} = A^{\nu\mu}, $$
that is, the sign is unchanged under exchange of the indices. An anti-symmetric tensor is one for which
$$ B^{\mu\nu} = -B^{\nu\mu}. $$
Now then, using these relations (definitions, if you will), we can see some interesting formulae.
Suppose that $A^{\mu\nu}$ is a symmetric tensor. Then, $A^{\mu\nu} = A^{\nu\mu}$. Then, we see that
$$ A^{\mu\nu} = \tfrac{1}{2}(A^{\mu\nu} + A^{\nu\mu}) = \tfrac{1}{2}(A^{\mu\nu} + A^{\mu\nu}) = \tfrac{1}{2}\,2A^{\mu\nu} = A^{\mu\nu}. $$
Similarly, suppose that $B^{\mu\nu}$ is an anti-symmetric tensor. Then,
$$ B^{\mu\nu} = \tfrac{1}{2}(B^{\mu\nu} - B^{\nu\mu}) = \tfrac{1}{2}(B^{\mu\nu} + B^{\mu\nu}) = B^{\mu\nu}. $$
These obviously all hold for covariant tensors. Let's introduce some notation that will be pretty useful.
Suppose we have some tensor, defined as
$$ T_{\mu\nu} \equiv \tfrac{1}{2}(B_{\mu\nu} - B_{\nu\mu}), $$
then, we write
$$ B_{[\mu\nu]} \equiv \tfrac{1}{2}(B_{\mu\nu} - B_{\nu\mu}). $$
That is, we could say that $T_{\mu\nu}$ is formed by the anti-symmetric interchange of indices on $B_{\mu\nu}$. We use the “square brackets” to denote the anti-symmetric interchange. Similarly, suppose we have
$$ C_{\mu\nu} \equiv \tfrac{1}{2}(A_{\mu\nu} + A_{\nu\mu}), $$
then, we define
$$ A_{(\mu\nu)} \equiv \tfrac{1}{2}(A_{\mu\nu} + A_{\nu\mu}). $$
Thus, we say that $C_{\mu\nu}$ is formed by the symmetric interchange of indices. We used “round brackets” to denote the symmetric interchange.
Suppose we have some tensor, $Y_{\mu\nu}$. Then, we can write it as the sum of an anti-symmetric part, and a symmetric part. That is,
$$ Y_{\mu\nu} = Y_{[\mu\nu]} + Y_{(\mu\nu)} = \tfrac{1}{2}(Y_{\mu\nu} - Y_{\nu\mu}) + \tfrac{1}{2}(Y_{\mu\nu} + Y_{\nu\mu}). $$
This is in fact pretty obvious. If the tensor is symmetric, then $Y_{[\mu\nu]} = 0$. And, if the tensor is anti-symmetric, then $Y_{(\mu\nu)} = 0$.
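The split into symmetric and anti-symmetric parts is easy to illustrate with explicit components. In the Python sketch below, the 3×3 array `y` is an arbitrary example of our own, and the helper names are hypothetical.

```python
def symmetric_part(y):
    """Y_(mu nu) = (Y_{mu nu} + Y_{nu mu}) / 2."""
    n = len(y)
    return [[0.5 * (y[m][k] + y[k][m]) for k in range(n)] for m in range(n)]

def antisymmetric_part(y):
    """Y_[mu nu] = (Y_{mu nu} - Y_{nu mu}) / 2."""
    n = len(y)
    return [[0.5 * (y[m][k] - y[k][m]) for k in range(n)] for m in range(n)]

# Arbitrary components with no particular symmetry.
y = [[1.0, 2.0, 0.0],
     [4.0, 3.0, -1.0],
     [5.0, 7.0, 2.0]]

sym = symmetric_part(y)
anti = antisymmetric_part(y)
recombined = [[sym[m][k] + anti[m][k] for k in range(3)] for m in range(3)]
print(sym, anti, recombined)
```

Adding the two parts recovers the original components, and the diagonal of the anti-symmetric part vanishes.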
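The notation of a lower bracket to denote index interchange can be used in another way. Recall the electromagnetic field tensor,
$$ F_{\mu\nu} \equiv \partial_\mu A_\nu - \partial_\nu A_\mu, $$
then, we can write this as
$$ F_{\mu\nu} = 2\partial_{[\mu}A_{\nu]}. $$
Also recall that two of Maxwell's equations may be recovered from
$$ \partial_\mu F_{\alpha\beta} + \partial_\alpha F_{\beta\mu} + \partial_\beta F_{\mu\alpha} = 0, $$
well, we can denote this (notice that this is a cyclic interchange of index) as
$$ \partial_{[\mu}F_{\alpha\beta]} = 0. $$
Here the square brackets denote anti-symmetrisation over all three indices; because $F_{\alpha\beta}$ is itself anti-symmetric, this reduces to the cyclic sum above. In this final example, we were a little sloppy. There is in fact a numerical factor associated with this (here 1/3); however, since the right-hand side is zero, the factor drops out anyway. One should nevertheless be aware that there is a factor there.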
3 Tensor Calculus
Here we shall lay some formal groundwork for dealing with objects in curved spacetime. We start by looking at differentiation, going on to geodesics.
3.1 Covariant Differentiation
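Let us just state some notation. We have
$$ \partial_\mu \equiv \frac{\partial}{\partial x^\mu}, \qquad \partial^\mu \equiv \frac{\partial}{\partial x_\mu}. $$
Now, let us look at the coordinate transformation $x^\mu \mapsto x'^\mu(x^\nu)$. Then, we have that
$$ dx'^\mu = J^\mu{}_\nu dx^\nu, \qquad J^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\nu} = \partial_\nu x'^\mu. $$
Now, let us consider differentiation of a scalar, and a coordinate transformation (noting that scalars do not transform under a coordinate transformation); thus
$$ \begin{aligned} \partial_\mu\phi \longmapsto \partial'_\mu\phi &= \frac{\partial}{\partial x'^\mu}\phi \\ &= \frac{\partial x^\nu}{\partial x'^\mu}\partial_\nu\phi \\ &= (J^{-1})^\nu{}_\mu\partial_\nu\phi. \end{aligned} $$
Therefore, we see that the derivative of a scalar $\partial_\mu\phi$ transforms as a covariant vector
$$ \partial'_\mu\phi = (J^{-1})^\nu{}_\mu\partial_\nu\phi. $$
Now, let us try this with a vector (again, under a coordinate transformation)
$$ \partial_\mu A^\nu \longmapsto \partial'_\mu A'^\nu; $$
where we want to derive how the RHS relates back to the LHS. Notice that if $\partial_\mu A^\nu$ were a (1,1)-tensor, then we would already know how it transforms. However, let us derive it. So, using the known transformation rules for $A^\nu$ and $\partial_\mu$,
$$ \partial'_\mu A'^\nu = (J^{-1})^\alpha{}_\mu\partial_\alpha\left(J^\nu{}_\beta A^\beta\right). $$
Now, to continue, we must consider the partial derivative above. We must use the product rule on everything to the right of it. That is
$$ (J^{-1})^\alpha{}_\mu\partial_\alpha\left(J^\nu{}_\beta A^\beta\right) = (J^{-1})^\alpha{}_\mu\left(\partial_\alpha J^\nu{}_\beta\right)A^\beta + (J^{-1})^\alpha{}_\mu J^\nu{}_\beta\left(\partial_\alpha A^\beta\right). $$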
This is not the transformation rule for a (1,1)-tensor, due to the presence of the first term on the RHS. We write the result, swapping the two terms on the RHS, to see this more clearly:
$$ \partial'_\mu A'^\nu = (J^{-1})^\alpha{}_\mu J^\nu{}_\beta\left(\partial_\alpha A^\beta\right) + (J^{-1})^\alpha{}_\mu\left(\partial_\alpha J^\nu{}_\beta\right)A^\beta. $$
Therefore, we see that the partial derivative of a vector is not a tensor. The non-tensorial part is the added term on the far right. There is a rather more fundamental reasoning behind why the partial derivative of a vector is not tensorial. Recall that the partial derivative of a vector is defined as
$$ \partial_\mu A^\nu(x^\alpha) = \lim_{\delta u\to 0}\frac{A^\nu(x^\alpha + \delta u) - A^\nu(x^\alpha)}{\delta u}, $$
where the displacement δu is taken along the $x^\mu$ direction.
So, the partial derivative is composed by finding the value of a vector at different points. As we have seen, the sum of two vectors evaluated at different points is not a proper vector (this is due to the Jacobian being evaluated at different positions). Therefore, one should expect the partial derivative of a vector not to be tensorial; which is what we find.
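A simple way to see the non-tensorial behaviour is to take a constant vector field in Cartesian coordinates, for which every partial derivative vanishes, and look at its plane-polar components. If the partial derivative were a tensor, it would have to vanish in polars too. The sketch below (our own example; the field (1, 0) and the function names are assumptions) shows that $\partial_\theta A'^r$ is non-zero.

```python
import math

def polar_components(r, theta, a_cartesian=(1.0, 0.0)):
    """A'^i = J^i_j A^j with J = d(r, theta)/d(x, y), for a constant Cartesian field."""
    ax, ay = a_cartesian
    a_r = math.cos(theta) * ax + math.sin(theta) * ay
    a_theta = (-math.sin(theta) * ax + math.cos(theta) * ay) / r
    return a_r, a_theta

def d_theta_a_r(r, theta, h=1e-6):
    """Central-difference estimate of the partial derivative d_theta A'^r."""
    plus = polar_components(r, theta + h)[0]
    minus = polar_components(r, theta - h)[0]
    return (plus - minus) / (2.0 * h)

value = d_theta_a_r(2.0, 0.7)
print(value, -math.sin(0.7))
```

The derivative equals −sin θ rather than zero, which is precisely the contribution of the $(\partial_\alpha J^\nu{}_\beta)A^\beta$ term.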
Now, consider the vector
$$ A(x) = A^\nu(x)e_\nu(x) = A'^\nu(x)e'_\nu(x), $$
where we use the fact that a vector is the same in all frames. Now consider differentiating A, noting that the components and basis vectors are all functions of the coordinates;
$$ \partial_\nu A = \partial_\nu(A^\mu e_\mu) = (\partial_\nu A^\mu)e_\mu + A^\mu(\partial_\nu e_\mu). $$
Now, to continue, we shall write the final bracketed term as a sum over coefficients
$$ \partial_\nu e_\mu = \Gamma^\rho{}_{\nu\mu}e_\rho. $$
The logic behind this will become clear. However, one may think of it in a similar way to quantum theory. Given a state, one can write it as a sum over coefficients times the basis. What we are doing here, is to say that $\partial_\nu e_\mu$ is a “new object”, and write that new object as a sum over the original basis $e_\rho$, with coefficients $\Gamma^\rho{}_{\nu\mu}$. Notice that this then results in
$$ \partial_\nu A = (\partial_\nu A^\mu)e_\mu + A^\mu\Gamma^\rho{}_{\nu\mu}e_\rho. $$
In the final term, let us swap indices ρ → µ and µ → β,
$$ A^\mu\Gamma^\rho{}_{\nu\mu}e_\rho \to A^\beta\Gamma^\mu{}_{\nu\beta}e_\mu. $$
This therefore results in
$$ \partial_\nu A = (\partial_\nu A^\mu)e_\mu + A^\beta\Gamma^\mu{}_{\nu\beta}e_\mu, $$
which we factorise (and move the position of the final $A^\beta$) to
$$ \partial_\nu A = e_\mu\left(\partial_\nu A^\mu + \Gamma^\mu{}_{\nu\beta}A^\beta\right). $$
Furthermore, we define the bracketed quantity as
$$\nabla_\nu A^\mu \equiv \partial_\nu A^\mu + \Gamma^\mu_{\nu\beta} A^\beta. \tag{3.1}$$
This defines the covariant derivative of a contravariant vector. We can use this rule to derive the corresponding rule for a covariant vector.
The covariant derivative of a contravariant vector is
$$\nabla_\alpha A^\mu = \partial_\alpha A^\mu + \Gamma^\mu_{\alpha\lambda} A^\lambda.$$
A covector is constructed from the contravariant vector via
$$A_\nu = g_{\nu\mu} A^\mu.$$
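So,
$$
\begin{aligned}
\nabla_\alpha A^\mu &= \nabla_\alpha (g^{\mu\nu} A_\nu) \\
&= g^{\mu\nu} \nabla_\alpha A_\nu + A_\nu \nabla_\alpha g^{\mu\nu} \\
&= \partial_\alpha A^\mu + \Gamma^\mu_{\alpha\lambda} A^\lambda \\
&= \partial_\alpha (g^{\mu\nu} A_\nu) + \Gamma^\mu_{\alpha\lambda} (g^{\lambda\beta} A_\beta) \\
&= A_\nu \partial_\alpha g^{\mu\nu} + g^{\mu\nu} \partial_\alpha A_\nu + \Gamma^\mu_{\alpha\lambda} g^{\lambda\beta} A_\beta.
\end{aligned}
$$
If we equate the second and last lines,
$$g^{\mu\nu} \nabla_\alpha A_\nu + A_\nu \nabla_\alpha g^{\mu\nu} = A_\nu \partial_\alpha g^{\mu\nu} + g^{\mu\nu} \partial_\alpha A_\nu + \Gamma^\mu_{\alpha\lambda} g^{\lambda\beta} A_\beta.$$
Now, the index $\beta$ is a "dummy index", so we can relabel $\beta \to \nu$ in the last term, to give
$$g^{\mu\nu} \nabla_\alpha A_\nu + A_\nu \nabla_\alpha g^{\mu\nu} = A_\nu \partial_\alpha g^{\mu\nu} + g^{\mu\nu} \partial_\alpha A_\nu + \Gamma^\mu_{\alpha\lambda} g^{\lambda\nu} A_\nu,$$
collecting terms,
$$g^{\mu\nu} \nabla_\alpha A_\nu = \left(\partial_\alpha g^{\mu\nu} + \Gamma^\mu_{\alpha\lambda} g^{\lambda\nu} - \nabla_\alpha g^{\mu\nu}\right) A_\nu + g^{\mu\nu} \partial_\alpha A_\nu.$$
We then expand out the covariant derivative of the metric (the third term in the bracket), to give
$$g^{\mu\nu} \nabla_\alpha A_\nu = \left(\partial_\alpha g^{\mu\nu} + \Gamma^\mu_{\alpha\lambda} g^{\lambda\nu} - \partial_\alpha g^{\mu\nu} - \Gamma^\mu_{\alpha\lambda} g^{\lambda\nu} - \Gamma^\nu_{\alpha\lambda} g^{\mu\lambda}\right) A_\nu + g^{\mu\nu} \partial_\alpha A_\nu.$$
Now, the first and third terms cancel each other out, as do the second and fourth, leaving
$$g^{\mu\nu} \nabla_\alpha A_\nu = g^{\mu\nu} \partial_\alpha A_\nu - \Gamma^\nu_{\alpha\lambda} g^{\mu\lambda} A_\nu.$$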
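If we multiply through by $g_{\pi\mu}$, then we can use
$$g_{\pi\mu} g^{\mu\lambda} = \delta^\lambda_\pi, \qquad g_{\pi\mu} g^{\mu\nu} = \delta^\nu_\pi.$$
Hence, this gives
$$\delta^\nu_\pi \nabla_\alpha A_\nu = \delta^\nu_\pi \partial_\alpha A_\nu - \delta^\lambda_\pi \Gamma^\nu_{\alpha\lambda} A_\nu,$$
which is
$$\nabla_\alpha A_\pi = \partial_\alpha A_\pi - \Gamma^\nu_{\alpha\pi} A_\nu.$$
Putting this into more "standard indices", we have our desired result. Hence, the covariant derivative of a covariant vector is
$$\nabla_\nu A_\mu \equiv \partial_\nu A_\mu - \Gamma^\beta_{\nu\mu} A_\beta. \tag{3.2}$$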
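Now, remembering that a scalar is invariant, and that the derivative of a scalar is a tensor, we should have that $\nabla_\mu (A^\nu A_\nu) = \partial_\mu (A^\nu A_\nu)$. This can be checked. So,
$$
\begin{aligned}
\nabla_\mu (A^\nu A_\nu) &= (\nabla_\mu A^\nu) A_\nu + A^\nu (\nabla_\mu A_\nu) \\
&= A_\nu \left(\partial_\mu A^\nu + \Gamma^\nu_{\mu\beta} A^\beta\right) + A^\nu \left(\partial_\mu A_\nu - \Gamma^\alpha_{\mu\nu} A_\alpha\right) \\
&= \partial_\mu (A^\nu A_\nu) + A_\nu \Gamma^\nu_{\mu\beta} A^\beta - A^\nu \Gamma^\alpha_{\mu\nu} A_\alpha \\
&= \partial_\mu (A^\nu A_\nu) + A_\nu A^\beta \Gamma^\nu_{\mu\beta} - A^\nu A_\alpha \Gamma^\alpha_{\mu\nu}.
\end{aligned}
$$
Now, the last two expressions can be shown to cancel, by relabelling dummy indices. Let us manipulate the final expression with $\alpha \to \nu$, $\nu \to \beta$,
$$A^\nu A_\alpha \Gamma^\alpha_{\mu\nu} \;\Rightarrow\; A^\beta A_\nu \Gamma^\nu_{\mu\beta},$$
and so, if we put this expression back in, we see that
$$
\begin{aligned}
\nabla_\mu (A^\nu A_\nu) &= \partial_\mu (A^\nu A_\nu) + A_\nu A^\beta \Gamma^\nu_{\mu\beta} - A^\beta A_\nu \Gamma^\nu_{\mu\beta} \\
&= \partial_\mu (A^\nu A_\nu).
\end{aligned}
$$
Therefore, we see an expected result: the covariant derivative of a scalar is the same as the partial derivative.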
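We call the expansion coefficients $\Gamma^\lambda_{\nu\mu}$ the affine connection.
We are able to find the covariant derivative of tensors of arbitrary rank. A few are given below.
$$
\begin{aligned}
\nabla_\alpha A^{\mu\nu} &= \partial_\alpha A^{\mu\nu} + \Gamma^\mu_{\alpha\lambda} A^{\lambda\nu} + \Gamma^\nu_{\alpha\lambda} A^{\mu\lambda}, \\
\nabla_\alpha A_{\mu\nu} &= \partial_\alpha A_{\mu\nu} - \Gamma^\lambda_{\alpha\mu} A_{\lambda\nu} - \Gamma^\lambda_{\alpha\nu} A_{\mu\lambda}, \\
\nabla_\alpha A^\mu{}_\nu &= \partial_\alpha A^\mu{}_\nu + \Gamma^\mu_{\alpha\lambda} A^\lambda{}_\nu - \Gamma^\lambda_{\alpha\nu} A^\mu{}_\lambda, \\
\nabla_\alpha A^{\mu\nu\sigma} &= \partial_\alpha A^{\mu\nu\sigma} + \Gamma^\mu_{\alpha\lambda} A^{\lambda\nu\sigma} + \Gamma^\nu_{\alpha\lambda} A^{\mu\lambda\sigma} + \Gamma^\sigma_{\alpha\lambda} A^{\mu\nu\lambda}.
\end{aligned}
$$
Basically, for each contravariant index there should be a positive connection term, and for each covariant index a negative term.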
3.1.1 Parallel Transport
The main idea in parallel transport is this:
Consider moving a vector from one place to another. Then, in general, that vector will change direction; thus, there is a change in the vector upon moving it. So, we can find the difference in a vector between the new point $x'$ and the old point $x$,
$$DA^\mu = A^\mu(x') - A^\mu(x).$$
Considering how the basis changes as well, we end up with
$$DA^\mu = \delta x^\nu \left(\partial_\nu A^\mu + \Gamma^\mu_{\nu\lambda} A^\lambda\right).$$
The bracketed quantity is just the covariant derivative. Thus,
$$DA^\mu = \delta x^\nu \nabla_\nu A^\mu.$$
Now, the point is that this gives another insight as to what the covariant derivative is. When moving a vector around a manifold, one must consider how the basis vectors change from point to point, as well as the components. This information is within the affine connection.
For an example as to what parallel transport is, consider a circle in the plane. Consider that there is an arrow living on the circle, pointing in a given direction (say parallel to the y-axis). Then, consider moving the arrow around the circle. The arrow undergoes parallel transport if it always points in the same direction, no matter what its position on the circle. Now, consider that the entire space is the circle-line. That is, we have a 1D manifold. For a vector living on the manifold, parallel transport means moving on tangents to the circle.
3.1.2 Absolute Derivative
We define the absolute derivative as
$$\frac{DA^\mu}{Du} = \frac{dx^\nu}{du} \nabla_\nu A^\mu,$$
where we have considered a curve, parameterised so that
$$A^\mu = A^\mu(x^\nu(u)).$$
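As a rough numerical illustration of the two ideas above, the sketch below transports a vector around a circle of fixed radius in plane polar coordinates. It assumes that parallel transport along a curve means $DA^\mu/Du = 0$ (the text does not state this condition explicitly), and it uses the plane-polar connection components $\Gamma^r_{\phi\phi} = -r$, $\Gamma^\phi_{r\phi} = \Gamma^\phi_{\phi r} = 1/r$ quoted in Section 3.5.1. The function names and the choice of an RK4 integrator are illustrative only.

```python
import math

def transport_rhs(vec, radius):
    """dA^mu/du = -Gamma^mu_{phi lambda} A^lambda on the curve r = radius, phi = u."""
    a_r, a_phi = vec
    # Plane polars: Gamma^r_{phi phi} = -r and Gamma^phi_{phi r} = 1/r.
    return (radius * a_phi, -a_r / radius)

def parallel_transport(vec0, radius, phi_end, steps=1000):
    """Integrate the transport equation from phi = 0 to phi_end with classical RK4."""
    h = phi_end / steps
    vec = tuple(vec0)
    for _ in range(steps):
        k1 = transport_rhs(vec, radius)
        k2 = transport_rhs((vec[0] + 0.5 * h * k1[0], vec[1] + 0.5 * h * k1[1]), radius)
        k3 = transport_rhs((vec[0] + 0.5 * h * k2[0], vec[1] + 0.5 * h * k2[1]), radius)
        k4 = transport_rhs((vec[0] + h * k3[0], vec[1] + h * k3[1]), radius)
        vec = (
            vec[0] + h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            vec[1] + h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
        )
    return vec

def to_cartesian(vec, radius, phi):
    """Convert polar components (A^r, A^phi) at (radius, phi) to Cartesian (A^x, A^y)."""
    a_r, a_phi = vec
    a_x = a_r * math.cos(phi) - radius * a_phi * math.sin(phi)
    a_y = a_r * math.sin(phi) + radius * a_phi * math.cos(phi)
    return a_x, a_y

if __name__ == "__main__":
    R = 2.0
    start = (0.0, 1.0 / R)  # an arrow parallel to the y-axis at phi = 0
    end = parallel_transport(start, R, math.pi / 2)
    print("Cartesian components at start:", to_cartesian(start, R, 0.0))
    print("Cartesian components at end:  ", to_cartesian(end, R, math.pi / 2))
```

The polar components change along the way (the connection terms see to that), but the Cartesian components should stay fixed to within integration error, which is the "arrow always points in the same direction" picture in the flat plane.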
3.1.3 Transformation of $\Gamma^\lambda_{\nu\mu}$
Let us consider the transformation property of the affine connection, $\Gamma^\lambda_{\nu\mu}$. Let us start with our previous definition, but in the primed frame (we will then transform to the unprimed)
$$\Gamma'^\rho_{\mu\nu} e'_\rho = \partial'_\mu e'_\nu.$$
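Then, we know how to transform the RHS,
$$\partial'_\mu e'_\nu = (J^{-1})^\alpha_\mu \partial_\alpha \left[(J^{-1})^\beta_\nu e_\beta\right],$$
we then use the product rule on the RHS,
$$(J^{-1})^\alpha_\mu \partial_\alpha \left[(J^{-1})^\beta_\nu e_\beta\right] = (J^{-1})^\alpha_\mu (J^{-1})^\beta_\nu \partial_\alpha e_\beta + (J^{-1})^\alpha_\mu e_\beta \partial_\alpha (J^{-1})^\beta_\nu.$$
Now, we also know that $\partial_\alpha e_\beta = \Gamma^\delta_{\alpha\beta} e_\delta$, so that
$$(J^{-1})^\alpha_\mu \partial_\alpha \left[(J^{-1})^\beta_\nu e_\beta\right] = (J^{-1})^\alpha_\mu (J^{-1})^\beta_\nu \Gamma^\delta_{\alpha\beta} e_\delta + (J^{-1})^\alpha_\mu e_\beta \partial_\alpha (J^{-1})^\beta_\nu,$$
remembering that the LHS is of course just
$$\Gamma'^\rho_{\mu\nu} e'_\rho = (J^{-1})^\alpha_\mu (J^{-1})^\beta_\nu \Gamma^\delta_{\alpha\beta} e_\delta + (J^{-1})^\alpha_\mu e_\beta \partial_\alpha (J^{-1})^\beta_\nu.$$
If we then transform the basis vector on the LHS, we have
$$\Gamma'^\rho_{\mu\nu} (J^{-1})^\lambda_\rho e_\lambda = (J^{-1})^\alpha_\mu (J^{-1})^\beta_\nu \Gamma^\delta_{\alpha\beta} e_\delta + (J^{-1})^\alpha_\mu e_\beta \partial_\alpha (J^{-1})^\beta_\nu.$$
On the RHS, let us change the indices on the basis vectors, so that they are the same as those on the left. That is, $\delta \to \lambda$ in the first term and $\beta \to \lambda$ in the second;
$$\Gamma'^\rho_{\mu\nu} (J^{-1})^\lambda_\rho e_\lambda = (J^{-1})^\alpha_\mu (J^{-1})^\beta_\nu \Gamma^\lambda_{\alpha\beta} e_\lambda + (J^{-1})^\alpha_\mu e_\lambda \partial_\alpha (J^{-1})^\lambda_\nu,$$
which allows us to then cancel off the basis vectors,
$$\Gamma'^\rho_{\mu\nu} (J^{-1})^\lambda_\rho = (J^{-1})^\alpha_\mu (J^{-1})^\beta_\nu \Gamma^\lambda_{\alpha\beta} + (J^{-1})^\alpha_\mu \partial_\alpha (J^{-1})^\lambda_\nu.$$
If we then multiply this through by something which will kill off the inverse Jacobian on the LHS, we will have our result. Notice that $J^\pi_\lambda$ will do this. So,
$$
\begin{aligned}
\Gamma'^\rho_{\mu\nu} J^\pi_\lambda (J^{-1})^\lambda_\rho &= J^\pi_\lambda (J^{-1})^\alpha_\mu (J^{-1})^\beta_\nu \Gamma^\lambda_{\alpha\beta} + J^\pi_\lambda (J^{-1})^\alpha_\mu \partial_\alpha (J^{-1})^\lambda_\nu \\
\Rightarrow\quad \Gamma'^\rho_{\mu\nu} \delta^\pi_\rho &= J^\pi_\lambda (J^{-1})^\alpha_\mu (J^{-1})^\beta_\nu \Gamma^\lambda_{\alpha\beta} + J^\pi_\lambda (J^{-1})^\alpha_\mu \partial_\alpha (J^{-1})^\lambda_\nu \\
\Rightarrow\quad \Gamma'^\pi_{\mu\nu} &= J^\pi_\lambda (J^{-1})^\alpha_\mu (J^{-1})^\beta_\nu \Gamma^\lambda_{\alpha\beta} + J^\pi_\lambda (J^{-1})^\alpha_\mu \partial_\alpha (J^{-1})^\lambda_\nu.
\end{aligned}
$$
We therefore have our result: the transformation of the affine connection is
$$\Gamma'^\pi_{\mu\nu} = J^\pi_\lambda (J^{-1})^\alpha_\mu (J^{-1})^\beta_\nu \Gamma^\lambda_{\alpha\beta} + J^\pi_\lambda (J^{-1})^\alpha_\mu \partial_\alpha (J^{-1})^\lambda_\nu. \tag{3.3}$$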
Now, although it is not a notation we have been using much, we can represent the Jacobians by differentials,
$$J^\mu_\nu = \frac{\partial x'^\mu}{\partial x^\nu}, \qquad (J^{-1})^\mu_\nu = \frac{\partial x^\mu}{\partial x'^\nu};$$
and, using this notation, the transformation of the affine connection looks like
$$\Gamma'^\pi_{\mu\nu} = \frac{\partial x'^\pi}{\partial x^\lambda} \frac{\partial x^\alpha}{\partial x'^\mu} \frac{\partial x^\beta}{\partial x'^\nu} \Gamma^\lambda_{\alpha\beta} + \frac{\partial x'^\pi}{\partial x^\lambda} \frac{\partial x^\alpha}{\partial x'^\mu} \frac{\partial}{\partial x^\alpha} \frac{\partial x^\lambda}{\partial x'^\nu}.$$
We can see that this immediately shows that the affine connection is not a tensor (due to the existence of the second term on the RHS). Now, if the affine connection were a tensor, then, if one were to find a coordinate system in which all the components were zero, they would have to be zero in all coordinate systems (this is a general property of tensors). That the affine connection is not a $\binom{1}{2}$-tensor means that even if the connection has zero components in one frame, there exist frames in which the components are non-zero. In fact, one can show that there exists a frame in which the components are zero, at a point. We shall now show that.
3.1.4 Locally Inertial Frames
This will all seem a little pointless, until we reach the very end of our discussion.
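Let us make the following coordinate transformation,
$$x'^\mu = \tilde{x}^\mu + \frac{1}{2} \Gamma^\mu_{\alpha\beta} \tilde{x}^\alpha \tilde{x}^\beta, \qquad \tilde{x}^\mu \equiv x^\mu - x^\mu_*,$$
where $x^\mu_*$ is a single point. Now, under this transformation, we can write down the Jacobian
$$
\begin{aligned}
J^\mu_\nu = \frac{\partial x'^\mu}{\partial x^\nu} &= \frac{\partial}{\partial x^\nu} \left(\tilde{x}^\mu + \frac{1}{2} \Gamma^\mu_{\alpha\beta} \tilde{x}^\alpha \tilde{x}^\beta\right) \\
&= \delta^\mu_\nu + \frac{1}{2} \tilde{x}^\alpha \tilde{x}^\beta \partial_\nu \Gamma^\mu_{\alpha\beta} + \frac{1}{2} \Gamma^\mu_{\alpha\beta} \left(\delta^\alpha_\nu \tilde{x}^\beta + \delta^\beta_\nu \tilde{x}^\alpha\right) \\
&= \delta^\mu_\nu + \frac{1}{2} \tilde{x}^\alpha \tilde{x}^\beta \partial_\nu \Gamma^\mu_{\alpha\beta} + \Gamma^\mu_{\nu\beta} \tilde{x}^\beta,
\end{aligned}
$$
where the last step uses the symmetry of the connection in its lower indices. Thus, the Jacobian is
$$J^\mu_\nu = \delta^\mu_\nu + \frac{1}{2} \tilde{x}^\alpha \tilde{x}^\beta \partial_\nu \Gamma^\mu_{\alpha\beta} + \Gamma^\mu_{\nu\beta} \tilde{x}^\beta. \tag{3.4}$$
Notice that this can be written,
$$J^\mu_\nu = \delta^\mu_\nu + O(\tilde{x}^\beta). \tag{3.5}$$
In fact, the inverse Jacobian is also of this form,
$$(J^{-1})^\mu_\nu = \delta^\mu_\nu - O(\tilde{x}^\beta). \tag{3.6}$$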
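Now then, returning to (3.4), we see that we can differentiate it,
$$
\begin{aligned}
\partial_\alpha J^\pi_\lambda &= \partial_\alpha \left(\delta^\pi_\lambda + \frac{1}{2} \left(\Gamma^\pi_{\lambda\beta} \tilde{x}^\beta + \Gamma^\pi_{\nu\lambda} \tilde{x}^\nu\right)\right) + O(\tilde{x}^\beta) \\
&= \frac{1}{2} \left(\Gamma^\pi_{\lambda\beta} \delta^\beta_\alpha + \Gamma^\pi_{\nu\lambda} \delta^\nu_\alpha\right) + O(\tilde{x}^\beta) \\
&= \frac{1}{2} \left(\Gamma^\pi_{\lambda\alpha} + \Gamma^\pi_{\alpha\lambda}\right) + O(\tilde{x}^\beta) \\
&= \Gamma^\pi_{\lambda\alpha} + O(\tilde{x}^\beta).
\end{aligned}
$$
Thus,
$$\partial_\alpha J^\pi_\lambda = \Gamma^\pi_{\lambda\alpha} + O(\tilde{x}^\beta). \tag{3.7}$$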
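Now, we previously derived the transformation rule of the affine connection,
$$\Gamma'^\pi_{\mu\nu} = J^\pi_\lambda (J^{-1})^\alpha_\mu (J^{-1})^\beta_\nu \Gamma^\lambda_{\alpha\beta} + J^\pi_\lambda (J^{-1})^\alpha_\mu \partial_\alpha (J^{-1})^\lambda_\nu.$$
Let us look at the final term,
$$J^\pi_\lambda (J^{-1})^\alpha_\mu \partial_\alpha (J^{-1})^\lambda_\nu;$$
we see that we can write it as
$$-(J^{-1})^\lambda_\nu (J^{-1})^\alpha_\mu \partial_\alpha J^\pi_\lambda.$$
To see how we can do this, consider that
$$\delta^\alpha_\beta = \frac{\partial x'^\alpha}{\partial x'^\beta} = \frac{\partial x'^\alpha}{\partial x^\pi} \frac{\partial x^\pi}{\partial x'^\beta}.$$
Also, $\partial_\nu \delta^\alpha_\beta = 0$. Then, that means that
$$
\begin{aligned}
\partial_\nu \delta^\alpha_\beta &= \frac{\partial}{\partial x^\nu} \left(\frac{\partial x'^\alpha}{\partial x^\pi} \frac{\partial x^\pi}{\partial x'^\beta}\right) \\
&= \frac{\partial x'^\alpha}{\partial x^\pi} \frac{\partial^2 x^\pi}{\partial x^\nu \partial x'^\beta} + \frac{\partial x^\pi}{\partial x'^\beta} \frac{\partial^2 x'^\alpha}{\partial x^\nu \partial x^\pi} \\
&= 0.
\end{aligned}
$$
That is,
$$\frac{\partial x'^\alpha}{\partial x^\pi} \frac{\partial^2 x^\pi}{\partial x^\nu \partial x'^\beta} = -\frac{\partial x^\pi}{\partial x'^\beta} \frac{\partial^2 x'^\alpha}{\partial x^\nu \partial x^\pi}.$$
Or, using the Jacobian notation,
$$J^\alpha_\pi \partial_\nu (J^{-1})^\pi_\beta = -(J^{-1})^\pi_\beta \partial_\nu J^\alpha_\pi.$$
Thus, we have shown that we can do the "swap" we did above.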
Therefore, we write the transformation rule of the connection once again, with this rewriting of the last term,
$$\Gamma'^\pi_{\mu\nu} = J^\pi_\lambda (J^{-1})^\alpha_\mu (J^{-1})^\beta_\nu \Gamma^\lambda_{\alpha\beta} - (J^{-1})^\lambda_\nu (J^{-1})^\alpha_\mu \partial_\alpha J^\pi_\lambda.$$
Now, we have all of these expressions. We use (3.5) and (3.6) for the Jacobian and its inverse, and (3.7) for the derivative of the Jacobian;
$$\Gamma'^\pi_{\mu\nu} = \delta^\pi_\lambda \delta^\alpha_\mu \delta^\beta_\nu \Gamma^\lambda_{\alpha\beta} - \delta^\lambda_\nu \delta^\alpha_\mu \Gamma^\pi_{\alpha\lambda} + O(\tilde{x}^\rho).$$
Using the Kronecker deltas results in
$$\Gamma'^\pi_{\mu\nu} = \Gamma^\pi_{\mu\nu} - \Gamma^\pi_{\mu\nu} + O(\tilde{x}^\rho) = O(\tilde{x}^\rho).$$
Therefore, the components of the transformed connection are all
$$\Gamma'^\pi_{\mu\nu} = O(\tilde{x}^\rho).$$
Now, if we let $\tilde{x}^\mu \to 0$, which is equivalent (by our definition of $\tilde{x}^\mu$) to saying $x^\mu = x^\mu_*$, then we see that
$$\Gamma'^\pi_{\mu\nu}(x^\rho = x^\rho_*) = 0.$$
Therefore, we have a transformation which renders all components of the affine connection zero. That is, we can transform to a frame in which the geometry is Euclidean (flat), at that single point. This is actually an incredibly useful and important result. Notice that if the Christoffel symbols are zero, then the covariant derivative is just the partial derivative. This tends to hugely simplify calculations. In the later discussions on curvature, we shall see that in transforming to a locally inertial frame, where the connection components are zero, we can compute things much more easily. And, as the things we are transforming are tensors, the results hold in any frame.
Some of the literature calls such a set of coordinates geodesic coordinates.
Alternative Derivation Here we shall present a rather more mathematically rigorous derivation of the existence of geodesic coordinates.
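Let $x^\mu = a^\mu$ be the coordinates of a point $A$ in the frame $\Sigma$. Let us transform to a new frame, via the transformation
$$x^\mu = a^\mu + x'^\mu + \frac{1}{2} a^\mu_{\nu\lambda} x'^\nu x'^\lambda,$$
where the coefficient $a^\mu_{\nu\lambda}$ is symmetric in its lower indices, and is constant (i.e. we define this as part of the transformation). Thus, at the point $A$, $x'^\mu = 0$. So, let us compute the differentials of the transformation;
$$
\begin{aligned}
\frac{\partial x^\mu}{\partial x'^\nu} &= \delta^\mu_\nu + \frac{1}{2} a^\mu_{\kappa\lambda} \frac{\partial}{\partial x'^\nu} \left(x'^\kappa x'^\lambda\right) \\
&= \delta^\mu_\nu + \frac{1}{2} a^\mu_{\kappa\lambda} \left(x'^\kappa \delta^\lambda_\nu + x'^\lambda \delta^\kappa_\nu\right) \\
&= \delta^\mu_\nu + a^\mu_{\nu\lambda} x'^\lambda.
\end{aligned}
$$
Hence,
$$\frac{\partial^2 x^\mu}{\partial x'^\nu \partial x'^\lambda} = a^\mu_{\nu\lambda}.$$
Hence, at the point $A$ (i.e. where $x'^\mu = 0$), we see that
$$\frac{\partial x^\mu}{\partial x'^\nu} = \delta^\mu_\nu, \qquad \frac{\partial^2 x^\mu}{\partial x'^\nu \partial x'^\lambda} = a^\mu_{\nu\lambda}. \tag{3.8}$$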
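Now, we can do a little work to get a relation between the coefficients $a^\mu_{\nu\lambda}$ and the metric. The metric transforms via
$$g'_{\mu\nu} = \frac{\partial x^\alpha}{\partial x'^\mu} \frac{\partial x^\beta}{\partial x'^\nu} g_{\alpha\beta}.$$
Then, differentiating it,
$$\frac{\partial g'_{\mu\nu}}{\partial x'^\lambda} = \frac{\partial^2 x^\alpha}{\partial x'^\lambda \partial x'^\mu} \frac{\partial x^\beta}{\partial x'^\nu} g_{\alpha\beta} + \frac{\partial x^\alpha}{\partial x'^\mu} \frac{\partial^2 x^\beta}{\partial x'^\lambda \partial x'^\nu} g_{\alpha\beta} + \frac{\partial x^\alpha}{\partial x'^\mu} \frac{\partial x^\beta}{\partial x'^\nu} \frac{\partial g_{\alpha\beta}}{\partial x'^\lambda}.$$
Now, rewrite the last term using the chain rule;
$$\frac{\partial g_{\alpha\beta}}{\partial x'^\lambda} = \frac{\partial g_{\alpha\beta}}{\partial x^\sigma} \frac{\partial x^\sigma}{\partial x'^\lambda},$$
so that we have
$$\frac{\partial g'_{\mu\nu}}{\partial x'^\lambda} = \frac{\partial^2 x^\alpha}{\partial x'^\lambda \partial x'^\mu} \frac{\partial x^\beta}{\partial x'^\nu} g_{\alpha\beta} + \frac{\partial x^\alpha}{\partial x'^\mu} \frac{\partial^2 x^\beta}{\partial x'^\lambda \partial x'^\nu} g_{\alpha\beta} + \frac{\partial x^\alpha}{\partial x'^\mu} \frac{\partial x^\beta}{\partial x'^\nu} \frac{\partial g_{\alpha\beta}}{\partial x^\sigma} \frac{\partial x^\sigma}{\partial x'^\lambda}.$$
So, at the point $A$, using (3.8), this reads
$$
\begin{aligned}
\frac{\partial g'_{\mu\nu}}{\partial x'^\lambda} &= a^\alpha_{\lambda\mu} \delta^\beta_\nu g_{\alpha\beta} + a^\beta_{\lambda\nu} \delta^\alpha_\mu g_{\alpha\beta} + \delta^\alpha_\mu \delta^\beta_\nu \delta^\sigma_\lambda \frac{\partial g_{\alpha\beta}}{\partial x^\sigma} \\
&= g_{\alpha\nu} a^\alpha_{\lambda\mu} + g_{\mu\beta} a^\beta_{\lambda\nu} + \frac{\partial g_{\mu\nu}}{\partial x^\lambda}.
\end{aligned}
\tag{3.9}
$$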
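Now let us choose that
$$\frac{\partial g'_{\mu\nu}}{\partial x'^\lambda} = 0, \tag{3.10}$$
which is equivalent to choosing the metric to be flat (stationary to first order) at the point $A$. Now, note that
$$g_{\alpha\nu} a^\alpha_{\lambda\mu} = a_{\nu\lambda\mu}.$$
Then, we see that (3.9) becomes
$$a_{\nu\lambda\mu} + a_{\mu\lambda\nu} + \frac{\partial g_{\mu\nu}}{\partial x^\lambda} = 0,$$
which trivially becomes
$$a_{\nu\lambda\mu} + a_{\mu\lambda\nu} = -\frac{\partial g_{\mu\nu}}{\partial x^\lambda}. \tag{3.11}$$
Now, if we permute the indices $\nu \to \mu \to \lambda \to \nu$, this becomes
$$a_{\mu\nu\lambda} + a_{\lambda\nu\mu} = -\frac{\partial g_{\lambda\mu}}{\partial x^\nu}, \tag{3.12}$$
permuting again,
$$a_{\lambda\mu\nu} + a_{\nu\mu\lambda} = -\frac{\partial g_{\nu\lambda}}{\partial x^\mu}. \tag{3.13}$$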
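Now, if we form (3.12) + (3.13) − (3.11), then we see
$$a_{\mu\nu\lambda} + a_{\lambda\nu\mu} + a_{\lambda\mu\nu} + a_{\nu\mu\lambda} - a_{\nu\lambda\mu} - a_{\mu\lambda\nu} = -\frac{\partial g_{\nu\lambda}}{\partial x^\mu} - \frac{\partial g_{\lambda\mu}}{\partial x^\nu} + \frac{\partial g_{\mu\nu}}{\partial x^\lambda}.$$
Now, as $a_{\mu\nu\lambda} = a_{\mu\lambda\nu}$, we see that the fourth and fifth terms cancel, as do the first and sixth, leaving
$$2 a_{\lambda\nu\mu} = -\left(\frac{\partial g_{\nu\lambda}}{\partial x^\mu} + \frac{\partial g_{\lambda\mu}}{\partial x^\nu} - \frac{\partial g_{\mu\nu}}{\partial x^\lambda}\right),$$
that is,
$$a_{\lambda\nu\mu} = [\lambda\nu, \mu],$$
where
$$[\lambda\nu, \mu] \equiv -\frac{1}{2} \left(\frac{\partial g_{\nu\lambda}}{\partial x^\mu} + \frac{\partial g_{\lambda\mu}}{\partial x^\nu} - \frac{\partial g_{\mu\nu}}{\partial x^\lambda}\right).$$
Up to the overall sign, $[\lambda\nu, \mu]$ is a Christoffel symbol of the first kind. Now, by (3.10), the first derivatives of the primed metric vanish at the point $A$, so the Christoffel symbols built from $g'_{\mu\nu}$ vanish there:
$$[\lambda\nu, \mu]' = 0$$
at the point $A$.
Therefore, we have derived that under the coordinate transformation $x^\mu = a^\mu + x'^\mu + \frac{1}{2} a^\mu_{\nu\lambda} x'^\nu x'^\lambda$, the Christoffel symbols are zero at the point $x^\mu = a^\mu$; such coordinates are geodesic coordinates. In the derivation, we assumed that:
$a^\mu_{\nu\lambda}$ is constant and symmetric in its lower indices,
$\partial g'_{\mu\nu} / \partial x'^\lambda = 0$, i.e. the metric is stationary at the point we transform to.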
3.1.5 Torsion
Let us define torsion to be
$$T^\rho_{\mu\nu} \equiv \frac{1}{2} \left(\Gamma^\rho_{\mu\nu} - \Gamma^\rho_{\nu\mu}\right) = \Gamma^\rho_{[\mu\nu]}. \tag{3.14}$$
We shall work with symmetric affine connections, so that the torsion goes to zero. A torsion-free space merely allows us to interchange the lower indices on the connection components at will. This expression for torsion is a tensor; let us prove it.
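So, the transformation of torsion can be written
$$T'^\rho_{\mu\nu} = J^\rho_\alpha (J^{-1})^\beta_\mu (J^{-1})^\gamma_\nu T^\alpha_{\beta\gamma} + \delta T^\rho_{\mu\nu},$$
where $\delta T^\rho_{\mu\nu}$ is the term coming from the inhomogeneous part of the transformation rule of the connection (3.3), antisymmetrised as in (3.14),
$$\delta T^\rho_{\mu\nu} = \frac{1}{2} \left[J^\rho_\lambda (J^{-1})^\alpha_\mu \partial_\alpha (J^{-1})^\lambda_\nu - J^\rho_\lambda (J^{-1})^\alpha_\nu \partial_\alpha (J^{-1})^\lambda_\mu\right].$$
Now, a "trick" that we have used before is to note that
$$
\begin{aligned}
\delta^\mu_\nu &= J^\mu_\alpha (J^{-1})^\alpha_\nu \\
\Rightarrow\quad \partial_\beta \delta^\mu_\nu &= \partial_\beta \left(J^\mu_\alpha (J^{-1})^\alpha_\nu\right) = J^\mu_\alpha \partial_\beta (J^{-1})^\alpha_\nu + (J^{-1})^\alpha_\nu \partial_\beta J^\mu_\alpha = 0 \\
\Rightarrow\quad J^\mu_\alpha \partial_\beta (J^{-1})^\alpha_\nu &= -(J^{-1})^\alpha_\nu \partial_\beta J^\mu_\alpha.
\end{aligned}
$$
We can then use this in the expression for $\delta T^\rho_{\mu\nu}$, to see that
$$
\begin{aligned}
\delta T^\rho_{\mu\nu} &= \frac{1}{2} \left[-(J^{-1})^\alpha_\mu (J^{-1})^\lambda_\nu \partial_\alpha J^\rho_\lambda + (J^{-1})^\alpha_\nu (J^{-1})^\lambda_\mu \partial_\alpha J^\rho_\lambda\right] \\
&= \frac{1}{2} \partial_\alpha J^\rho_\lambda \left[(J^{-1})^\alpha_\nu (J^{-1})^\lambda_\mu - (J^{-1})^\alpha_\mu (J^{-1})^\lambda_\nu\right].
\end{aligned}
$$
Now, let us define
$$A^{\alpha\lambda}{}_{\nu\mu} \equiv (J^{-1})^\alpha_\nu (J^{-1})^\lambda_\mu - (J^{-1})^\alpha_\mu (J^{-1})^\lambda_\nu,$$
then we see that
$$A^{\alpha\lambda}{}_{\nu\mu} = -A^{\lambda\alpha}{}_{\nu\mu},$$
i.e. it is anti-symmetric under interchange of $\alpha$ and $\lambda$. Now, notice that
$$\partial_\alpha J^\rho_\lambda = \frac{\partial^2 x'^\rho}{\partial x^\alpha \partial x^\lambda} = \frac{\partial^2 x'^\rho}{\partial x^\lambda \partial x^\alpha} = \partial_\lambda J^\rho_\alpha.$$
That is, $\partial_\alpha J^\rho_\lambda$ is symmetric under interchange of $\alpha$ and $\lambda$. Therefore, as the contraction of something symmetric with something anti-symmetric is zero, we see that
$$\delta T^\rho_{\mu\nu} = 0.$$
Hence,
$$T'^\rho_{\mu\nu} = J^\rho_\alpha (J^{-1})^\beta_\mu (J^{-1})^\gamma_\nu T^\alpha_{\beta\gamma},$$
which is the rule of transformation of a $\binom{1}{2}$-tensor.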
3.2 Geodesics
A geodesic is the curve which gives an extremal of motion. That we use the word extremal, rather than minimum (or, indeed, maximum), is very important.
Suppose we are "living in a manifold" (suppose we are confined to the surface of a sphere). Then suppose that we wish to compute the equation of the line (in that manifold) that joins two points, where the equation of the line is an extremum. That is, we can compute many equations of that line, but only one of them will be an extremum. Then, that curve is a geodesic.
The geodesic will depend upon the geometry of the manifold; the line has its motion confined to the manifold. As we shall see, the metric is used to give the geometrical dependence.
3.2.1 The Affine Geodesic
We call an affine geodesic the curve for which the tangent vector is parallel transported to itself. That is,
$$\frac{DT^\mu}{Du} = \lambda(u) T^\mu, \qquad T^\mu \equiv \frac{dx^\mu}{du}.$$
That is, we find a curve along which the tangent vector does not change direction. It may get longer (hence the factor of $\lambda(u)$), but it does not change direction.
We have our definition of the absolute derivative,
$$\frac{DA^\mu}{Du} = \frac{dx^\nu}{du} \nabla_\nu A^\mu = T^\nu \nabla_\nu A^\mu.$$
Therefore, the affine geodesic satisfies
$$T^\nu \nabla_\nu T^\mu = \lambda(u) T^\mu,$$
which is, using the definition of the covariant derivative,
$$T^\nu \left(\partial_\nu T^\mu + \Gamma^\mu_{\nu\gamma} T^\gamma\right) = \lambda T^\mu.$$
Now, by the chain rule, along the curve we have
$$T^\nu \partial_\nu = \frac{dx^\nu}{du} \frac{\partial}{\partial x^\nu} = \frac{d}{du},$$
so we see that the affine geodesic can be written
$$\frac{dT^\mu}{du} + \Gamma^\mu_{\nu\gamma} T^\nu T^\gamma = \lambda T^\mu.$$
Therefore, noting that $T^\mu = dx^\mu / du$, we see that
$$\frac{d}{du} \frac{dx^\mu}{du} + T^\nu T^\gamma \Gamma^\mu_{\nu\gamma} = \lambda T^\mu,$$
which is of course just
$$\frac{d^2 x^\mu}{du^2} + \Gamma^\mu_{\nu\gamma} \frac{dx^\nu}{du} \frac{dx^\gamma}{du} = \lambda \frac{dx^\mu}{du}. \tag{3.15}$$
As an example, consider a Cartesian system, whereby the affine connections are all zero. The resultant differential equation (with $\lambda = 0$) has a straight line as the solution. That is, the affine geodesic in Cartesian coordinates is a straight line.
We say that $u$ is the affine parameter. If $\lambda = 0$, then we say that the geodesic is affinely parameterised. That is,
$$T^\nu \nabla_\nu T^\mu = 0, \qquad T^\mu \equiv \frac{dx^\mu}{du},$$
along an affinely parameterised geodesic.
3.2.2 The Metric Geodesic
This geodesic is perhaps a little less hand-wavey.
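Consider two points in some space. Consider that they are joined by a line. Then, the metric geodesic is the line which extremises the length of that joining line. So, given a line element
$$ds^2 = g_{\mu\nu} dx^\mu dx^\nu,$$
we see that the corresponding action is
$$S = \int ds.$$
Now, considering that the line is parameterised by $u$, the affine parameter, we see that the action is simply
$$S = \int ds = \int \frac{ds}{du} du = \int du \sqrt{g_{\mu\nu} \frac{dx^\mu}{du} \frac{dx^\nu}{du}}.$$
Then, by the variational principle, the Euler-Lagrange equation
$$\frac{d}{du} \frac{\partial L}{\partial \dot{x}^\mu} - \frac{\partial L}{\partial x^\mu} = 0, \qquad \dot{x}^\mu \equiv \frac{dx^\mu}{du},$$
extremises the action (note, we use the word extremise, rather than maximise or minimise). We must state that the metric depends on the coordinates only, $g_{\mu\nu} = g_{\mu\nu}(x^\rho)$, and not on $\dot{x}^\rho$.
So, the Lagrangian is
$$L = \left(g_{\alpha\beta} \dot{x}^\alpha \dot{x}^\beta\right)^{1/2}.$$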
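We are now left to compute the elements of the EL equation. So,
$$
\begin{aligned}
\frac{\partial L}{\partial \dot{x}^\mu} &= \frac{1}{2L} \frac{\partial}{\partial \dot{x}^\mu} \left(g_{\alpha\beta} \dot{x}^\alpha \dot{x}^\beta\right) \\
&= \frac{1}{2L} g_{\alpha\beta} \left(\frac{\partial \dot{x}^\alpha}{\partial \dot{x}^\mu} \dot{x}^\beta + \dot{x}^\alpha \frac{\partial \dot{x}^\beta}{\partial \dot{x}^\mu}\right) \\
&= \frac{1}{2L} g_{\alpha\beta} \left(\delta^\alpha_\mu \dot{x}^\beta + \delta^\beta_\mu \dot{x}^\alpha\right) \\
&= \frac{1}{2L} g_{\alpha\beta} \left(\delta^\alpha_\mu \dot{x}^\beta + \delta^\alpha_\mu \dot{x}^\beta\right) \\
&= \frac{1}{L} g_{\alpha\beta} \dot{x}^\beta \delta^\alpha_\mu \\
&= \frac{1}{L} g_{\mu\beta} \dot{x}^\beta,
\end{aligned}
$$
where the fourth line uses the symmetry of the metric to relabel the second term. And also,
$$\frac{\partial L}{\partial x^\mu} = \frac{1}{2L} \dot{x}^\alpha \dot{x}^\beta \partial_\mu g_{\alpha\beta}.$$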
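Finally,
$$\frac{d}{du} \frac{\partial L}{\partial \dot{x}^\mu} = \frac{d}{du} \left(\frac{1}{L} g_{\mu\beta} \dot{x}^\beta\right) = -\frac{\dot{L}}{L^2} g_{\mu\beta} \dot{x}^\beta + \frac{1}{L} \frac{d}{du} \left(g_{\mu\beta} \dot{x}^\beta\right),$$
the last expression we evaluate via
$$
\begin{aligned}
\frac{d}{du} \left(g_{\mu\beta} \dot{x}^\beta\right) &= g_{\mu\beta} \frac{d\dot{x}^\beta}{du} + \dot{x}^\beta \frac{d}{du} g_{\mu\beta} \\
&= g_{\mu\beta} \ddot{x}^\beta + \dot{x}^\beta \frac{dx^\gamma}{du} \frac{\partial g_{\mu\beta}}{\partial x^\gamma} \\
&= g_{\mu\beta} \ddot{x}^\beta + \dot{x}^\beta \dot{x}^\gamma \partial_\gamma g_{\mu\beta}.
\end{aligned}
$$
Therefore,
$$\frac{d}{du} \frac{\partial L}{\partial \dot{x}^\mu} = -\frac{\dot{L}}{L^2} g_{\mu\beta} \dot{x}^\beta + \frac{1}{L} \left(g_{\mu\beta} \ddot{x}^\beta + \dot{x}^\beta \dot{x}^\gamma \partial_\gamma g_{\mu\beta}\right);$$
and consequently, the EL equation reads
$$-\frac{\dot{L}}{L^2} g_{\mu\beta} \dot{x}^\beta + \frac{1}{L} \left(g_{\mu\beta} \ddot{x}^\beta + \dot{x}^\beta \dot{x}^\gamma \partial_\gamma g_{\mu\beta}\right) - \frac{1}{2L} \dot{x}^\alpha \dot{x}^\beta \partial_\mu g_{\alpha\beta} = 0.$$
Now, the job is to get this into a "nice form", without mention of $L$. We can move the first term over to the RHS, and multiply through by $L$, giving
$$g_{\mu\beta} \ddot{x}^\beta + \dot{x}^\beta \dot{x}^\gamma \partial_\gamma g_{\mu\beta} - \frac{1}{2} \dot{x}^\alpha \dot{x}^\beta \partial_\mu g_{\alpha\beta} = \frac{\dot{L}}{L} g_{\mu\beta} \dot{x}^\beta.$$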
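Let us now multiply this by something that will kill off the metric multiplying $\ddot{x}^\beta$ on the LHS. Multiplying by $g^{\rho\mu}$ will work,
$$g^{\rho\mu} g_{\mu\beta} \ddot{x}^\beta + g^{\rho\mu} \left(\dot{x}^\beta \dot{x}^\gamma \partial_\gamma g_{\mu\beta} - \frac{1}{2} \dot{x}^\alpha \dot{x}^\beta \partial_\mu g_{\alpha\beta}\right) = \frac{\dot{L}}{L} g^{\rho\mu} g_{\mu\beta} \dot{x}^\beta,$$
noting that $g^{\rho\mu} g_{\mu\beta} = \delta^\rho_\beta$, and using this relation, we see we now have
$$\ddot{x}^\rho + g^{\rho\mu} \left(\dot{x}^\beta \dot{x}^\gamma \partial_\gamma g_{\mu\beta} - \frac{1}{2} \dot{x}^\alpha \dot{x}^\beta \partial_\mu g_{\alpha\beta}\right) = \frac{\dot{L}}{L} \dot{x}^\rho.$$
To continue, we use the simple result that if $a = b$, then $a = \frac{1}{2}(a + b)$. So, we see that
$$\dot{x}^\beta \dot{x}^\gamma \partial_\gamma g_{\mu\beta} = \frac{1}{2} \left(\dot{x}^\beta \dot{x}^\gamma \partial_\gamma g_{\mu\beta} + \dot{x}^\gamma \dot{x}^\beta \partial_\beta g_{\mu\gamma}\right),$$
thus, using this, we see that we have the geodesic equation being
$$\ddot{x}^\rho + g^{\rho\mu} \frac{1}{2} \left(\dot{x}^\beta \dot{x}^\gamma \partial_\gamma g_{\mu\beta} + \dot{x}^\gamma \dot{x}^\beta \partial_\beta g_{\mu\gamma} - \dot{x}^\alpha \dot{x}^\beta \partial_\mu g_{\alpha\beta}\right) = \frac{\dot{L}}{L} \dot{x}^\rho.$$
We can pull out common factors of the bracketed term, by relabelling indices $\alpha \to \gamma$, thus
$$\ddot{x}^\rho + g^{\rho\mu} \frac{1}{2} \left(\partial_\gamma g_{\mu\beta} + \partial_\beta g_{\mu\gamma} - \partial_\mu g_{\gamma\beta}\right) \dot{x}^\beta \dot{x}^\gamma = \frac{\dot{L}}{L} \dot{x}^\rho.$$
Now, by way of convenient notation, we define everything multiplying the $\dot{x}^\beta \dot{x}^\gamma$ as
$$\left\{{}_\gamma{}^\rho{}_\beta\right\} \equiv g^{\rho\mu} \frac{1}{2} \left(-\partial_\mu g_{\gamma\beta} + \partial_\gamma g_{\mu\beta} + \partial_\beta g_{\mu\gamma}\right), \tag{3.16}$$
a symbol we call the Christoffel symbol. Thus, the geodesic equation looks like
$$\ddot{x}^\rho + \left\{{}_\gamma{}^\rho{}_\beta\right\} \dot{x}^\beta \dot{x}^\gamma = \frac{\dot{L}}{L} \dot{x}^\rho.$$
Now, if $\dot{L} = 0$, then this reads
$$\ddot{x}^\rho + \left\{{}_\alpha{}^\rho{}_\beta\right\} \dot{x}^\alpha \dot{x}^\beta = 0, \qquad \frac{d^2 s}{du^2} = 0.$$
Notice that the Christoffel symbol is symmetric in its lower indices,
$$\left\{{}_\gamma{}^\mu{}_\beta\right\} = \left\{{}_\beta{}^\mu{}_\gamma\right\},$$
which we can see by its definition, noting that the metric is symmetric.
Just to write the result again,
$$\ddot{x}^\rho + \left\{{}_\alpha{}^\rho{}_\beta\right\} \dot{x}^\alpha \dot{x}^\beta = 0 \tag{3.17}$$
is the affinely parameterised metric geodesic.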
So, to recap these geodesics. The curve which preserves the direction of the tangent vector on that curve is called the affine geodesic. When deriving that geodesic, one uses the $\Gamma^\lambda_{\mu\nu}$ symbol, so we call it the affine connection, or just the connection. The second type of geodesic was derived to be the curve which extremises the path length between two points. This was called the metric geodesic. In deriving the metric geodesic, one defines some quantities, the Christoffel symbols.
3.2.3 Relation Between Affine Connection & Christoffel Symbol
We shall start by asserting that for a torsion-free connection, $T^\alpha_{\mu\nu} = 0$, and for a metric with zero covariant derivative, $\nabla_\alpha g_{\nu\mu} = 0$, the affine connection is the Christoffel symbol. That is,
$$\nabla_\alpha g_{\mu\nu} = 0, \quad T^\mu_{\alpha\beta} = 0 \quad \Rightarrow \quad \Gamma^\mu_{\alpha\beta} = \left\{{}_\alpha{}^\mu{}_\beta\right\}.$$
The way to use "torsion free" is that the final two indices on the affine connection can be interchanged. We also use a symmetric metric throughout.
Let us prove it.
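We start by writing the covariant derivative of the metric,
$$\nabla_\alpha g_{\mu\nu} = \partial_\alpha g_{\mu\nu} - \Gamma^\lambda_{\alpha\mu} g_{\lambda\nu} - \Gamma^\lambda_{\alpha\nu} g_{\mu\lambda}.$$
But, by our definition of the problem, this is zero. So,
$$\partial_\alpha g_{\mu\nu} = \Gamma^\lambda_{\alpha\mu} g_{\lambda\nu} + \Gamma^\lambda_{\alpha\nu} g_{\mu\lambda}.$$
Let us now cyclically change the indices. First, we shall do $\alpha \to \mu \to \nu \to \alpha$, giving
$$\partial_\mu g_{\nu\alpha} = \Gamma^\lambda_{\mu\nu} g_{\lambda\alpha} + \Gamma^\lambda_{\mu\alpha} g_{\nu\lambda}.$$
Let us do the interchange again, on this new equation, giving
$$\partial_\nu g_{\alpha\mu} = \Gamma^\lambda_{\nu\alpha} g_{\lambda\mu} + \Gamma^\lambda_{\nu\mu} g_{\alpha\lambda}.$$
Let us add the first two, and subtract the last equation, giving
$$\partial_\alpha g_{\mu\nu} + \partial_\mu g_{\nu\alpha} - \partial_\nu g_{\alpha\mu} = \Gamma^\lambda_{\alpha\mu} g_{\lambda\nu} + \Gamma^\lambda_{\alpha\nu} g_{\mu\lambda} + \Gamma^\lambda_{\mu\nu} g_{\lambda\alpha} + \Gamma^\lambda_{\mu\alpha} g_{\nu\lambda} - \Gamma^\lambda_{\nu\alpha} g_{\lambda\mu} - \Gamma^\lambda_{\nu\mu} g_{\alpha\lambda}.$$
Now, we notice that by our torsion-free assertion, we can cancel off some of the terms on the RHS. These are the second with the fifth, and the third with the sixth. This leaves
$$\partial_\alpha g_{\mu\nu} + \partial_\mu g_{\nu\alpha} - \partial_\nu g_{\alpha\mu} = \Gamma^\lambda_{\alpha\mu} g_{\lambda\nu} + \Gamma^\lambda_{\mu\alpha} g_{\nu\lambda},$$
from which we further use the torsion-free assertion, to see that the two terms on the RHS are identical, leaving
$$\partial_\alpha g_{\mu\nu} + \partial_\mu g_{\nu\alpha} - \partial_\nu g_{\alpha\mu} = 2 \Gamma^\lambda_{\alpha\mu} g_{\lambda\nu}.$$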
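Rearranging this trivially results in
$$\Gamma^\lambda_{\alpha\mu} g_{\lambda\nu} = \frac{1}{2} \left(\partial_\alpha g_{\mu\nu} + \partial_\mu g_{\nu\alpha} - \partial_\nu g_{\alpha\mu}\right).$$
Now, let us multiply the whole thing by $g^{\rho\nu}$,
$$
\begin{aligned}
\Gamma^\lambda_{\alpha\mu} g^{\rho\nu} g_{\lambda\nu} &= \frac{1}{2} g^{\rho\nu} \left(\partial_\alpha g_{\mu\nu} + \partial_\mu g_{\nu\alpha} - \partial_\nu g_{\alpha\mu}\right) \\
\Rightarrow\quad \Gamma^\lambda_{\alpha\mu} \delta^\rho_\lambda &= \frac{1}{2} g^{\rho\nu} \left(\partial_\alpha g_{\mu\nu} + \partial_\mu g_{\nu\alpha} - \partial_\nu g_{\alpha\mu}\right),
\end{aligned}
$$
which is just
$$\Gamma^\rho_{\alpha\mu} = \frac{1}{2} g^{\rho\nu} \left(\partial_\alpha g_{\mu\nu} + \partial_\mu g_{\nu\alpha} - \partial_\nu g_{\alpha\mu}\right).$$
Now, if we switch over the $\mu$ index to $\beta$ (just relabelling),
$$\Gamma^\rho_{\alpha\beta} = \frac{1}{2} g^{\rho\nu} \left(\partial_\alpha g_{\beta\nu} + \partial_\beta g_{\nu\alpha} - \partial_\nu g_{\alpha\beta}\right).$$
Upon comparing this with (3.16), we find that they are equal. Therefore,
$$\Gamma^\rho_{\alpha\beta} = \left\{{}_\alpha{}^\rho{}_\beta\right\}; \qquad \nabla_\alpha g_{\mu\nu} = 0, \quad T^\mu_{\alpha\beta} = 0.$$
It is very important to note that this only holds for a torsion-free connection, with the metric having zero covariant derivative. Under these conditions, the affine connection is the Christoffel symbol.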
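As a quick consistency check of the result above, one can verify $\nabla_\alpha g_{\mu\nu} = 0$ by hand for plane polar coordinates. The worked lines below are only a sketch: they borrow the metric $g_{rr} = 1$, $g_{\phi\phi} = r^2$ and the connection components $\Gamma^r_{\phi\phi} = -r$, $\Gamma^\phi_{r\phi} = \Gamma^\phi_{\phi r} = 1/r$ from Section 3.5.1, and show only the two components that are not trivially zero.

```latex
\begin{aligned}
\nabla_r g_{\phi\phi}
  &= \partial_r g_{\phi\phi} - \Gamma^\lambda_{r\phi} g_{\lambda\phi} - \Gamma^\lambda_{r\phi} g_{\phi\lambda}
   = 2r - 2\,\Gamma^\phi_{r\phi}\, g_{\phi\phi}
   = 2r - 2 \cdot \frac{1}{r} \cdot r^2 = 0, \\
\nabla_\phi g_{r\phi}
  &= \partial_\phi g_{r\phi} - \Gamma^\lambda_{\phi r} g_{\lambda\phi} - \Gamma^\lambda_{\phi\phi} g_{r\lambda}
   = 0 - \Gamma^\phi_{\phi r}\, g_{\phi\phi} - \Gamma^r_{\phi\phi}\, g_{rr}
   = -\frac{1}{r} \cdot r^2 + r \cdot 1 = 0.
\end{aligned}
```

The remaining components vanish term by term, so the Christoffel connection built from the polar metric is indeed metric compatible, as the general argument requires.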
3.3 Isometries & Killing’s Equation
Consider the coordinate transformation
$$g_{\mu\nu}(x) \longmapsto g_{\mu\nu}(x'),$$
so that the metric in the new frame has the same functional dependence as in the old frame. That is, the new metric depends on $x'$ in the same way as the old metric depended on $x$. Then, we have
$$ds^2(x) = ds^2(x'),$$
and that
$$g'_{\mu\nu}(x') = g_{\mu\nu}(x'). \tag{3.18}$$
Therefore, by the transformation rule of the metric,
$$g_{\mu\nu}(x) = \frac{\partial x'^\alpha}{\partial x^\mu} \frac{\partial x'^\beta}{\partial x^\nu} g'_{\alpha\beta}(x'),$$
and using (3.18) on the RHS, we see that
$$g_{\mu\nu}(x) = \frac{\partial x'^\alpha}{\partial x^\mu} \frac{\partial x'^\beta}{\partial x^\nu} g_{\alpha\beta}(x'). \tag{3.19}$$
So, a coordinate transformation leaving the metric in the same form (form invariant) is called an isometry.
The coordinate transformation we considered was $x^\mu \mapsto x'^\mu$. Let us consider a special case of this; namely
$$x^\mu \longmapsto x'^\mu = x^\mu + \varepsilon \xi^\mu,$$
where $\varepsilon$ is small, and $\xi^\mu$ is a vector field. Now, the Jacobian is
$$J^\mu_\nu = \partial_\nu x'^\mu = \partial_\nu \left(x^\mu + \varepsilon \xi^\mu\right),$$
which is clearly
$$J^\mu_\nu = \delta^\mu_\nu + \varepsilon \partial_\nu \xi^\mu.$$
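Now, by a Taylor expansion, we see that
$$g_{\alpha\beta}(x') = g_{\alpha\beta}(x^\mu + \varepsilon \xi^\mu) = g_{\alpha\beta}(x^\mu) + \varepsilon \xi^\mu \partial_\mu g_{\alpha\beta}(x^\mu) + O(\varepsilon^2).$$
So, we now have enough terms to be able to put them all into the metric isometry transformation equation (3.19). Thus,
$$
\begin{aligned}
g_{\mu\nu}(x) &= J^\alpha_\mu J^\beta_\nu g_{\alpha\beta}(x') \\
&= \left(\delta^\alpha_\mu + \varepsilon \partial_\mu \xi^\alpha\right) \left(\delta^\beta_\nu + \varepsilon \partial_\nu \xi^\beta\right) \left(g_{\alpha\beta}(x) + \varepsilon \xi^\rho \partial_\rho g_{\alpha\beta}(x)\right).
\end{aligned}
$$
If we expand out the RHS, neglecting terms in $O(\varepsilon^2)$, one finds that
$$g_{\mu\nu}(x) = g_{\mu\nu}(x) + \varepsilon \xi^\rho \partial_\rho g_{\mu\nu} + \varepsilon g_{\mu\beta} \partial_\nu \xi^\beta + \varepsilon g_{\alpha\nu} \partial_\mu \xi^\alpha,$$
rearranging,
$$g_{\mu\nu}(x) = g_{\mu\nu}(x) + \varepsilon \left(g_{\mu\beta} \partial_\nu \xi^\beta + g_{\alpha\nu} \partial_\mu \xi^\alpha + \xi^\rho \partial_\rho g_{\mu\nu}\right),$$
which is just
$$g_{\mu\beta} \partial_\nu \xi^\beta + g_{\alpha\nu} \partial_\mu \xi^\alpha + \xi^\rho \partial_\rho g_{\mu\nu} = 0. \tag{3.20}$$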
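Now, notice that
$$\xi_\alpha = g_{\alpha\beta} \xi^\beta,$$
and then its differential is
$$\partial_\nu \xi_\alpha = \partial_\nu \left(g_{\alpha\beta} \xi^\beta\right) = \xi^\beta \partial_\nu g_{\alpha\beta} + g_{\alpha\beta} \partial_\nu \xi^\beta.$$
Hence, we can rearrange this into the form
$$g_{\alpha\beta} \partial_\nu \xi^\beta = \partial_\nu \xi_\alpha - \xi^\beta \partial_\nu g_{\alpha\beta}.$$
So, if we put this into (3.20) for the first and second expressions (being very careful in changing indices), we get
$$\partial_\nu \xi_\mu - \xi^\beta \partial_\nu g_{\mu\beta} + \partial_\mu \xi_\nu - \xi^\beta \partial_\mu g_{\nu\beta} + \xi^\beta \partial_\beta g_{\mu\nu} = 0.$$
Collecting terms,
$$\partial_\nu \xi_\mu + \partial_\mu \xi_\nu + \xi^\beta \left(\partial_\beta g_{\mu\nu} - \partial_\nu g_{\mu\beta} - \partial_\mu g_{\nu\beta}\right) = 0.$$
The bracketed quantity is just $-2 g_{\rho\beta} \Gamma^\rho_{\mu\nu}$, so that
$$\partial_\nu \xi_\mu + \partial_\mu \xi_\nu - 2 \xi^\beta g_{\rho\beta} \Gamma^\rho_{\mu\nu} = 0,$$
which is just
$$\partial_\nu \xi_\mu + \partial_\mu \xi_\nu - 2 \Gamma^\rho_{\mu\nu} \xi_\rho = 0.$$
This is just a sum of covariant derivatives (noting the symmetry of the Christoffel symbols),
$$\nabla_\mu \xi_\nu + \nabla_\nu \xi_\mu = 0. \tag{3.21}$$
This is known as Killing's equation. A vector $\xi^\nu$ satisfying Killing's equation is called a Killing vector.
Let us just recap what we have done. A metric is said to have an isometry if it can transform while retaining its functional dependence. Then, under a small coordinate transformation with a vector field $\xi^\mu$, a field satisfying Killing's equation will give an isometry.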
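Now, a theorem states that, for a tangent vector $T^\mu$ and a Killing vector $\xi^\mu$, there is a conserved quantity $T^\mu \xi_\mu$ along an affinely parameterised geodesic. So, to prove it, we consider
$$\frac{D}{Du} \left(T^\mu \xi_\mu\right) = T^\nu \nabla_\nu \left(T^\mu \xi_\mu\right) = T^\nu \left(\xi_\mu \nabla_\nu T^\mu + T^\mu \nabla_\nu \xi_\mu\right).$$
Now, the first term is zero, as we are on an affinely parameterised geodesic. Now, notice that, because it is contracted with the symmetric product $T^\nu T^\mu$, we can write the final term as
$$T^\nu T^\mu \nabla_\nu \xi_\mu = T^\nu T^\mu \frac{1}{2} \left(\nabla_\nu \xi_\mu + \nabla_\mu \xi_\nu\right),$$
and thus we have
$$\frac{D}{Du} \left(T^\mu \xi_\mu\right) = T^\nu T^\mu \frac{1}{2} \left(\nabla_\nu \xi_\mu + \nabla_\mu \xi_\nu\right).$$
We were able to interchange the indices (with the factor of one-half to cancel out the double counting) because the things multiplying it are symmetric under interchange of indices. Notice that the bracketed term is just the left-hand side of Killing's equation, for some Killing vector $\xi^\nu$. Therefore,
$$\frac{D}{Du} \left(T^\mu \xi_\mu\right) = 0,$$
thus, $T^\mu \xi_\mu$ is a conserved quantity along an affinely parameterised geodesic.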
3.4 Summary
We shall soon see some examples of geodesics, and what a Killing vector corresponds to; but before then we shall bring together our definitions of the Christoffel symbol, and introduce a little new notation (just to be in keeping with the literature).
We have that the affine connection, $\Gamma^\mu_{\nu\lambda}$, is the same as the geodesic connection $\left\{{}_\mu{}^\alpha{}_\nu\right\}$, for manifolds for which
$$\nabla_\alpha g_{\mu\nu} = 0, \quad T^\alpha_{\mu\nu} = 0 \quad \Rightarrow \quad \Gamma^\alpha_{\mu\nu} = \left\{{}_\mu{}^\alpha{}_\nu\right\}.$$
We also derived that the relation between the Christoffel symbol (as we may as well call it) and the metric is
$$\Gamma^\rho_{\alpha\mu} = \frac{1}{2} g^{\rho\nu} \left(-\partial_\nu g_{\alpha\mu} + \partial_\alpha g_{\mu\nu} + \partial_\mu g_{\nu\alpha}\right).$$
In fact, the $\Gamma^\rho_{\alpha\mu}$ are generally denoted Christoffel symbols of the second kind. We can in fact see that
$$\Gamma^\rho_{\alpha\mu} = g^{\rho\nu} \Gamma_{\nu\alpha\mu},$$
where we call the $\Gamma_{\nu\alpha\mu}$ the Christoffel symbols of the first kind. When we refer to the "Christoffel symbols", we shall mean those of the second kind.
We derived that the affine geodesic is the same as the metric geodesic, for affinely parameterised geodesics (satisfying the above torsion and covariant derivative relations).
We also saw that the Christoffel symbols are not tensors. The non-tensorial nature of the symbols allowed us to derive that, at any given point in a manifold, there exists a coordinate system in which all components of the symbol are zero. That is, the manifold is locally flat at that point.
The Lagrangian squared,
$$L^2 = \left(\frac{ds}{du}\right)^2 = g_{\mu\nu} \dot{x}^\mu \dot{x}^\nu,$$
is just the line element length. With the parameter suitably normalised, its possible values are $0, \pm 1$. We classify $0$ as null geodesics, $+1$ as time-like and $-1$ as space-like.
Also by way of being in keeping with the literature, some books denote partial and covariant derivatives in a different way. Sometimes one may see
$$\partial_\nu A_\mu \equiv A_{\mu,\nu}, \qquad \nabla_\nu A_\mu \equiv A_{\mu;\nu}.$$
That is, a "comma" representing partial derivatives, and a "semi-colon" for covariant derivatives.
3.5 Examples
Here we shall see specific examples of geodesics, Killing vectors and how to compute Christoffel symbols.
3.5.1 Computing Christoffel Symbols: Effective Lagrangian
Now, before we go on to the effective Lagrangian method of computing the Christoffel symbols, we shall see how to do so via brute force.
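Brute force: plane polars In plane polars, we have the line element
$$ds^2 = dr^2 + r^2 d\phi^2,$$
and therefore, reading off the components of the metric and its inverse,
$$(g_{ij}) = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}, \qquad (g^{ij}) = \begin{pmatrix} 1 & 0 \\ 0 & 1/r^2 \end{pmatrix}.$$
Then, using the notation that $dx^i = (dr, d\phi)$, we see that $g_{rr} = 1$, $g_{\phi\phi} = r^2$, $g^{rr} = 1$, $g^{\phi\phi} = r^{-2}$ are the only non-zero components. So, to compute the Christoffel symbols (the brute force way), we must find
$$\Gamma^i_{jk} = \frac{1}{2} \sum_a g^{ia} \left(-\partial_a g_{jk} + \partial_j g_{ak} + \partial_k g_{ja}\right), \qquad i, j, k, a = r, \phi.$$
We shall spell out, in detail, how to compute one of the components;
$$
\begin{aligned}
\Gamma^r_{\phi\phi} &= \frac{1}{2} \sum_{a = r, \phi} g^{ra} \left(-\partial_a g_{\phi\phi} + \partial_\phi g_{\phi a} + \partial_\phi g_{\phi a}\right) \\
&= \frac{1}{2} \left[g^{rr} \left(-\partial_r g_{\phi\phi} + \partial_\phi g_{\phi r} + \partial_\phi g_{\phi r}\right) + g^{r\phi} \left(-\partial_\phi g_{\phi\phi} + \partial_\phi g_{\phi\phi} + \partial_\phi g_{\phi\phi}\right)\right].
\end{aligned}
$$
Now, one of the first things we note is that the metric is diagonal: all off-diagonal components are zero. So, the above reduces to
$$\Gamma^r_{\phi\phi} = -\frac{1}{2} g^{rr} \partial_r g_{\phi\phi} = -\frac{1}{2} \cdot 1 \cdot \frac{\partial}{\partial r} r^2 = -r.$$
We have thus found one of the components of the Christoffel symbol. We shall state the rest of them (as going through how to find each one is very tedious).
$$
\begin{aligned}
&\Gamma^r_{rr} = 0, \qquad \Gamma^r_{r\phi} = \Gamma^r_{\phi r} = 0, \qquad \Gamma^r_{\phi\phi} = -r, \\
&\Gamma^\phi_{\phi\phi} = 0, \qquad \Gamma^\phi_{r\phi} = \Gamma^\phi_{\phi r} = r^{-1}, \qquad \Gamma^\phi_{rr} = 0.
\end{aligned}
$$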
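The brute-force sum above is mechanical enough to hand to a short script. The following is a minimal sketch, not part of the original notes: it evaluates $\Gamma^i_{jk} = \frac{1}{2} g^{ia}(-\partial_a g_{jk} + \partial_j g_{ak} + \partial_k g_{ja})$ numerically, with the metric derivatives approximated by central finite differences. The function names (`christoffel`, `polar_metric`, `polar_inverse_metric`) and the step size are illustrative choices.

```python
def polar_metric(x):
    """Metric g_ij of plane polars at x = [r, phi]."""
    r = x[0]
    return [[1.0, 0.0], [0.0, r * r]]

def polar_inverse_metric(x):
    """Inverse metric g^ij of plane polars at x = [r, phi]."""
    r = x[0]
    return [[1.0, 0.0], [0.0, 1.0 / (r * r)]]

def christoffel(metric, inverse_metric, x, h=1e-5):
    """Return gamma[i][j][k] = Gamma^i_{jk} at the point x.

    dg[a][j][k] approximates d g_{jk} / d x^a by a central difference.
    """
    n = len(x)
    dg = []
    for a in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[a] += h
        x_minus[a] -= h
        g_plus, g_minus = metric(x_plus), metric(x_minus)
        dg.append([[(g_plus[j][k] - g_minus[j][k]) / (2.0 * h) for k in range(n)]
                   for j in range(n)])
    g_inv = inverse_metric(x)
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                gamma[i][j][k] = 0.5 * sum(
                    g_inv[i][a] * (-dg[a][j][k] + dg[j][a][k] + dg[k][j][a])
                    for a in range(n)
                )
    return gamma

if __name__ == "__main__":
    R_INDEX, PHI_INDEX = 0, 1
    gamma = christoffel(polar_metric, polar_inverse_metric, [2.0, 0.3])
    print("Gamma^r_{phi phi} at r = 2:", gamma[R_INDEX][PHI_INDEX][PHI_INDEX])
    print("Gamma^phi_{r phi} at r = 2:", gamma[PHI_INDEX][R_INDEX][PHI_INDEX])
```

At $r = 2$ the two printed values should be close to $-r = -2$ and $1/r = 0.5$, matching the components listed above. Any other metric can be tried by supplying a different pair of metric functions.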
Now, we shall show how to find them in a more intelligent manner.
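Effective Lagrangian Method When we derived the metric geodesic, we had that the Lagrangian was
$$L = \frac{ds}{du} = \sqrt{g_{\mu\nu} \dot{x}^\mu \dot{x}^\nu}.$$
Now, consider
$$L_{\text{eff}} \equiv L^2.$$
The Euler-Lagrange equation for $L_{\text{eff}}$ is
$$\frac{d}{du} \left(\frac{\partial L_{\text{eff}}}{\partial \dot{x}^\mu}\right) - \frac{\partial L_{\text{eff}}}{\partial x^\mu} = 0,$$
from which, for an affinely parameterised curve (where $\dot{L} = 0$), it is clear to see that
$$2L \left[\frac{d}{du} \left(\frac{\partial L}{\partial \dot{x}^\mu}\right) - \frac{\partial L}{\partial x^\mu}\right] = 0.$$
Thus, if $L$ satisfies the Euler-Lagrange equation, then so does $L^2$. This makes life a lot simpler, as we can consider just $g_{\mu\nu} \dot{x}^\mu \dot{x}^\nu$, rather than its square root.
So, for plane polars, where
$$L_{\text{eff}} = L^2 = \dot{r}^2 + r^2 \dot{\phi}^2,$$
we have two Euler-Lagrange equations, one for each coordinate $r, \phi$. They are
$$\ddot{r} - r \dot{\phi}^2 = 0, \qquad 2 \dot{r} \dot{\phi} + r \ddot{\phi} = 0.$$
Now, if we get these equations into the form $\ddot{x} + C \dot{x}_1 \dot{x}_2 = 0$,
$$\ddot{r} - r \dot{\phi}^2 = 0, \qquad \ddot{\phi} + \frac{\dot{r} \dot{\phi}}{r} + \frac{\dot{\phi} \dot{r}}{r} = 0.$$
So, we see that we can read off the Christoffel symbols by inspection. To see this a little more clearly, the "general" metric geodesic, for $r$, is
$$\ddot{r} + \sum_{i,j = r, \phi} \Gamma^r_{ij} \dot{x}^i \dot{x}^j = 0;$$
then, we can see that the only Christoffel component that is non-zero is that where $i = j = \phi$, and that value is $-r$. Thus, we read off that $\Gamma^r_{\phi\phi} = -r$, which is in accord with what we had by the brute force method. For $\phi$, the general geodesic is
$$\ddot{\phi} + \sum_{i,j = r, \phi} \Gamma^\phi_{ij} \dot{x}^i \dot{x}^j = 0;$$
and we therefore see two non-zero Christoffel symbols: when $i = r, j = \phi$ and $i = \phi, j = r$. The corresponding Christoffel symbol components are thus $\Gamma^\phi_{r\phi} = \Gamma^\phi_{\phi r} = r^{-1}$. Again, this is in accord with the brute force components.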
3.5.2 Computing the Geodesic
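Now, we are able to find the geodesic: a parameterised curve that extremises the distance between two points, in the plane polar coordinate system.
When we computed the Euler-Lagrange equation for $\phi$, we had a term (which we did not state above, but is easy to see upon computation)
$$\frac{d}{du} \left(2 r^2 \dot{\phi}\right) = 0 \quad \Rightarrow \quad r^2 \dot{\phi} = B = \text{const}.$$
That is, $\dot{\phi} = B / r^2$. Now, the effective Lagrangian is just $ds^2 / du^2$, which is just the line element, which can be one of 3 values (as previously stated),
$$L_{\text{eff}} = L^2 = \begin{cases} 0 \\ 1 \\ -1 \end{cases} \equiv A.$$
So, the effective Lagrangian is just
$$L_{\text{eff}} = \dot{r}^2 + r^2 \dot{\phi}^2 = A.$$
Hence, using our expression for $\dot{\phi}$,
$$\dot{r}^2 + r^2 \frac{B^2}{r^4} = A \quad \Rightarrow \quad \dot{r} = \sqrt{A - \frac{B^2}{r^2}}.$$
Now, if we notice that
$$\frac{\dot{\phi}}{\dot{r}} = \frac{d\phi}{dr} = \frac{B}{r^2} \left(A - \frac{B^2}{r^2}\right)^{-1/2},$$
then this integrates to
$$r \cos(\phi - \phi_c) = \frac{B}{\sqrt{A}},$$
where $\phi_c$ is a constant of integration. If we take $A = 1$, so that we are talking about time-like geodesics, then the equation becomes
$$r \cos(\phi - \phi_c) = r \cos\phi \cos\phi_c + r \sin\phi \sin\phi_c = B.$$
We now note that $\phi_c$ is a constant, and that $x = r \cos\phi$, $y = r \sin\phi$, giving
$$x \cos\phi_c + y \sin\phi_c = B \quad \Rightarrow \quad y = mx + c,$$
with $m = -\cot\phi_c$ and $c = B / \sin\phi_c$ (for $\sin\phi_c \neq 0$). That is, the time-like geodesic is a straight line.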
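The straight-line result can also be checked numerically. The sketch below is an illustration only: it integrates the two plane-polar Euler-Lagrange equations $\ddot{r} = r \dot{\phi}^2$, $\ddot{\phi} = -2 \dot{r} \dot{\phi} / r$ with a classical fourth-order Runge-Kutta step (the integrator, step count and initial data are arbitrary choices, not something the notes prescribe). The initial data correspond to a curve starting at $(x, y) = (1, 0)$ and moving in the $+y$ direction with $A = 1$ and $B = 1$.

```python
import math

def geodesic_rhs(state):
    """First-order form of the plane-polar geodesic equations.

    state = (r, phi, r_dot, phi_dot); returns its derivative with respect to u.
    """
    r, phi, r_dot, phi_dot = state
    return (r_dot, phi_dot, r * phi_dot ** 2, -2.0 * r_dot * phi_dot / r)

def rk4_step(state, h):
    """Advance the state by one classical RK4 step of size h."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, 0.5 * h))
    k3 = geodesic_rhs(shifted(state, k2, 0.5 * h))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(
        s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def integrate(state, u_end, steps):
    """Integrate from u = 0 to u = u_end in the given number of steps."""
    h = u_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

if __name__ == "__main__":
    # r = 1, phi = 0, r_dot = 0, phi_dot = 1: Cartesian velocity (0, 1), so A = B = 1.
    r, phi, r_dot, phi_dot = integrate((1.0, 0.0, 0.0, 1.0), 2.0, 2000)
    print("x =", r * math.cos(phi), " y =", r * math.sin(phi))
    print("r^2 phi_dot =", r * r * phi_dot)
```

If the derivation above is right, $x = r\cos\phi$ should stay at 1 (so the curve is the vertical line $r\cos\phi = B$ with $\phi_c = 0$), $y$ should equal the elapsed parameter, and $r^2 \dot{\phi}$ should remain equal to $B = 1$.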
Let us now consider the null geodesic. We appeal back to the effective Lagrangian, which becomes
$$\dot{r}^2 + r^2 \dot{\phi}^2 = 0.$$
This has solution $\dot{r} = 0$ and $\dot{\phi} = 0$. That is, both radius and angle are constants. That is, a single point. Thus, the null geodesic is a point (null size).
When we consider the space-like geodesic, we find that there is no solution: it does not exist in plane polars.
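Example of Geodesic 2 Let us compute another geodesic, for another line element,
$$ds^2 = \frac{1}{t^2} \left(dt^2 - dx^2\right).$$
So, we see that the effective Lagrangian is
$$L_{\text{eff}} = \frac{\dot{t}^2 - \dot{x}^2}{t^2}, \qquad \dot{t} \equiv \frac{dt}{du}, \quad \dot{x} \equiv \frac{dx}{du}.$$
Then, the Euler-Lagrange equations for this effective Lagrangian are
$$\ddot{t} - \frac{\dot{t}^2}{t} - \frac{\dot{x}^2}{t} = 0, \qquad \ddot{x} - \frac{2}{t} \dot{x} \dot{t} = 0.$$
So, we can read off the Christoffel symbol components. The only non-zero components are
$$\Gamma^t_{xx} = \Gamma^t_{tt} = -\frac{1}{t}, \qquad \Gamma^x_{xt} = \Gamma^x_{tx} = -\frac{1}{t}.$$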
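For completeness, here is a sketch of the intermediate steps behind the two Euler-Lagrange equations quoted above, using only $L_{\text{eff}} = (\dot{t}^2 - \dot{x}^2)/t^2$ and the Euler-Lagrange equation from Section 3.5.1.

```latex
\begin{aligned}
\frac{\partial L_{\text{eff}}}{\partial \dot{t}} &= \frac{2\dot{t}}{t^2},
&\quad
\frac{d}{du}\frac{\partial L_{\text{eff}}}{\partial \dot{t}} &= \frac{2\ddot{t}}{t^2} - \frac{4\dot{t}^2}{t^3},
&\quad
\frac{\partial L_{\text{eff}}}{\partial t} &= -\frac{2(\dot{t}^2 - \dot{x}^2)}{t^3}, \\
\Rightarrow\quad & \frac{2\ddot{t}}{t^2} - \frac{4\dot{t}^2}{t^3} + \frac{2(\dot{t}^2 - \dot{x}^2)}{t^3} = 0
&\quad \Rightarrow\quad & \ddot{t} - \frac{\dot{t}^2}{t} - \frac{\dot{x}^2}{t} = 0, \\[4pt]
\frac{\partial L_{\text{eff}}}{\partial \dot{x}} &= -\frac{2\dot{x}}{t^2},
&\quad
\frac{d}{du}\frac{\partial L_{\text{eff}}}{\partial \dot{x}} &= -\frac{2\ddot{x}}{t^2} + \frac{4\dot{x}\dot{t}}{t^3},
&\quad
\frac{\partial L_{\text{eff}}}{\partial x} &= 0, \\
\Rightarrow\quad & \ddot{x} - \frac{2}{t}\dot{x}\dot{t} = 0.
\end{aligned}
```

Comparing with $\ddot{x}^\rho + \Gamma^\rho_{\alpha\beta} \dot{x}^\alpha \dot{x}^\beta = 0$, the $\dot{x}\dot{t}$ term in the second equation is shared equally between $\Gamma^x_{xt}$ and $\Gamma^x_{tx}$, which is why each equals $-1/t$ rather than $-2/t$.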
3.5.3 Physical Meaning of the Killing Vector
Again, let us go back to plane polars. The line element is
$$ds^2 = dr^2 + r^2 d\phi^2.$$
We would like to think of a vector that leaves the line element unchanged. A transformation on $\phi$ works:
$$\phi \longmapsto \phi' = \phi + \varepsilon,$$
so that
$$(r, \phi) \longmapsto (r', \phi') = (r, \phi + \varepsilon) = (r, \phi) + \varepsilon (0, 1).$$
Therefore, our Killing vector is
$$\xi^i = (0, 1).$$
Now, we stated that $T^i \xi_i$ is a conserved quantity. Let us consider what it is. So,
$$\xi_i \dot{x}^i = g_{ij} \xi^i \dot{x}^j = g_{\phi\phi} \xi^\phi \dot{x}^\phi = r^2 \cdot 1 \cdot \dot{\phi} = r^2 \dot{\phi}.$$
This quantity is a constant (as it is conserved). We also notice that it is the expression for the angular momentum of the system. Therefore, the conserved quantity associated with the Killing vector is the angular momentum, in plane polars.
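One can also confirm directly that $\xi^i = (0, 1)$ satisfies Killing's equation (3.21). The short check below is a sketch using the plane-polar connection components from Section 3.5.1 and the lowered components $\xi_i = g_{ij} \xi^j = (0, r^2)$.

```latex
\begin{aligned}
\nabla_r \xi_r &= \partial_r \xi_r - \Gamma^k_{rr} \xi_k = 0, \\
\nabla_\phi \xi_\phi &= \partial_\phi \xi_\phi - \Gamma^r_{\phi\phi} \xi_r - \Gamma^\phi_{\phi\phi} \xi_\phi = 0 - (-r)(0) - 0 = 0, \\
\nabla_r \xi_\phi &= \partial_r \xi_\phi - \Gamma^\phi_{r\phi} \xi_\phi = 2r - \frac{1}{r}\, r^2 = r, \\
\nabla_\phi \xi_r &= \partial_\phi \xi_r - \Gamma^\phi_{\phi r} \xi_\phi = 0 - \frac{1}{r}\, r^2 = -r, \\
\Rightarrow\quad \nabla_r \xi_\phi + \nabla_\phi \xi_r &= 0.
\end{aligned}
```

So every symmetrised component $\nabla_i \xi_j + \nabla_j \xi_i$ vanishes, consistent with the rotation $\phi \mapsto \phi + \varepsilon$ being an isometry of the plane.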
42 3 TENSOR CALCULUS
3.5.4 Nordstrom’s Theory of Gravity
Let us compute the connection associated with gµν = Ω2gµν . Now, the connection associatedwith gµν is
Γραβ =1
2gρν (∂αgβν + ∂βgνα − ∂νgαβ) .
Hence,
∂α (gµν) = ∂α(Ω2gµν
)= gµν2Ω∂αΩ + Ω2∂αgµν .
Therefore,
Γραβ =1
2gρν (∂αgβν + ∂β gνα − ∂ν gαβ)
=1
2
1
Ω2gρν(2Ωgβν∂αΩ + Ω2∂αgβν + 2Ωgνα∂βΩ+
Ω2∂βgνα − 2Ωgαβ∂νΩ− Ω2∂νgαβ)
=1
2gρν (∂αgβν + ∂βgνα − ∂νgαβ) +
1
Ωgρν (gβν∂αΩ + gνα∂βΩ− gαβ∂νΩ)
= Γραβ +1
Ω
(δρβ∂αΩ + δρα∂βΩ− gρνgαβ∂νΩ
)Hence,
Γραβ = Γραβ +1
Ω
(δρβ∂αΩ + δρα∂βΩ− gρνgαβ∂νΩ
).
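Let us suppose that we have
$$\tilde{g}_{\mu\nu} = e^{2\phi}\eta_{\mu\nu},$$
so that
$$\Omega = e^{\phi}, \qquad g_{\mu\nu} = \eta_{\mu\nu}.$$
Hence,
$$\partial_\alpha\Omega = e^{\phi}\partial_\alpha\phi, \qquad \Gamma^\mu_{\alpha\beta} = 0.$$
Hence, using these,
$$\tilde{\Gamma}^\rho_{\alpha\beta} = e^{-\phi}\left(\delta^\rho_\beta e^{\phi}\partial_\alpha\phi + \delta^\rho_\alpha e^{\phi}\partial_\beta\phi - \eta^{\rho\nu}\eta_{\alpha\beta}e^{\phi}\partial_\nu\phi\right) = \delta^\rho_\beta\partial_\alpha\phi + \delta^\rho_\alpha\partial_\beta\phi - \eta^{\rho\nu}\eta_{\alpha\beta}\partial_\nu\phi.$$
So, the geodesic equation, with this connection, reads
$$\ddot{x}^\rho + \left(\delta^\rho_\beta\partial_\alpha\phi + \delta^\rho_\alpha\partial_\beta\phi - \eta^{\rho\nu}\eta_{\alpha\beta}\partial_\nu\phi\right)\dot{x}^\alpha\dot{x}^\beta = 0,$$
which reduces to
$$\ddot{x}^\rho + 2\dot{x}^\rho\,\partial_\alpha\phi\,\dot{x}^\alpha - \dot{x}^2\partial^\rho\phi = 0,$$
where $\dot{x}^2 \equiv \eta_{\alpha\beta}\dot{x}^\alpha\dot{x}^\beta$.
Now, null geodesics have $\dot{x}^2 = 0$, so that the null geodesic equation is
$$\ddot{x}^\rho + 2\dot{x}^\rho\,\partial_\alpha\phi\,\dot{x}^\alpha = 0.$$
Similarly, timelike geodesics have $\dot{x}^2 = 1$, so that the timelike geodesic equation is
$$\ddot{x}^\rho + 2\dot{x}^\rho\,\partial_\alpha\phi\,\dot{x}^\alpha - \partial^\rho\phi = 0.$$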
Now, this example provides us with some practice in using tensors & computing geodesics. In addition to this, we have found the geodesics for a theory whereby the metric is given by $e^{2\phi}\eta_{\mu\nu}$. This theory was proposed by Nordström before Einstein.
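The conformally flat connection can be checked numerically. The sketch below works in two dimensions $(t, x)$ with an arbitrarily chosen scalar field (an assumption of this example, not part of the theory), computes the Christoffel symbols of $e^{2\phi}\eta_{\mu\nu}$ by finite differences, and compares them with $\delta^\rho_\beta\partial_\alpha\phi + \delta^\rho_\alpha\partial_\beta\phi - \eta^{\rho\nu}\eta_{\alpha\beta}\partial_\nu\phi$.

```python
import math

ETA = [[1.0, 0.0], [0.0, -1.0]]  # 2-dim Minkowski metric; numerically its own inverse

def phi(x):
    """Hypothetical scalar field phi(t, x), chosen only for illustration."""
    return 0.3 * math.sin(x[0]) + 0.2 * x[1] ** 2

def metric(x):
    """Conformally flat metric e^{2 phi} eta_{mu nu}."""
    factor = math.exp(2.0 * phi(x))
    return [[factor * ETA[a][b] for b in range(2)] for a in range(2)]

def partial(f, x, a, h=1e-5):
    """Central finite-difference estimate of d f / d x^a."""
    xp, xm = list(x), list(x)
    xp[a] += h
    xm[a] -= h
    return (f(xp) - f(xm)) / (2.0 * h)

def christoffel_numeric(x):
    """Gamma^r_{ab} computed directly from the metric e^{2 phi} eta."""
    inv_factor = math.exp(-2.0 * phi(x))
    g_inv = [[inv_factor * ETA[a][b] for b in range(2)] for a in range(2)]
    # dg[c][a][b] = partial_c g_{ab}
    dg = [[[partial(lambda y, a=a, b=b: metric(y)[a][b], x, c)
            for b in range(2)] for a in range(2)] for c in range(2)]
    return [[[0.5 * sum(g_inv[r][n] * (dg[a][b][n] + dg[b][n][a] - dg[n][a][b])
                        for n in range(2))
              for b in range(2)] for a in range(2)] for r in range(2)]

def christoffel_formula(x):
    """Gamma^r_{ab} from the closed-form conformally flat expression."""
    dphi = [partial(phi, x, a) for a in range(2)]
    dphi_up = [sum(ETA[r][n] * dphi[n] for n in range(2)) for r in range(2)]
    delta = lambda i, j: 1.0 if i == j else 0.0
    return [[[delta(r, b) * dphi[a] + delta(r, a) * dphi[b] - dphi_up[r] * ETA[a][b]
              for b in range(2)] for a in range(2)] for r in range(2)]

point = [0.4, -0.7]  # arbitrary evaluation point
numeric = christoffel_numeric(point)
formula = christoffel_formula(point)
max_difference = max(abs(numeric[r][a][b] - formula[r][a][b])
                     for r in range(2) for a in range(2) for b in range(2))
```

`max_difference` should be at the level of the finite-difference error, so the two ways of obtaining the connection agree.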
4 Curvature
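We now have enough mathematical tools to be able to consider the curvature of a manifold.
To continue, consider the commutator of covariant derivatives, acting upon a scalar,
$$[\nabla_\mu, \nabla_\nu]\phi = \nabla_\mu\nabla_\nu\phi - \nabla_\nu\nabla_\mu\phi.$$
Now, as we previously showed, the covariant derivative of a scalar is just the normal partial derivative. Therefore,
$$[\nabla_\mu, \nabla_\nu]\phi = \nabla_\mu(\partial_\nu\phi) - \nabla_\nu(\partial_\mu\phi).$$
We can now expand out the covariant derivatives. So,
$$\nabla_\mu(\partial_\nu\phi) = \partial_\mu\partial_\nu\phi - \Gamma^\lambda_{\mu\nu}\partial_\lambda\phi.$$
Therefore, the commutator is
$$[\nabla_\mu, \nabla_\nu]\phi = \partial_\mu\partial_\nu\phi - \Gamma^\lambda_{\mu\nu}\partial_\lambda\phi - \partial_\nu\partial_\mu\phi + \Gamma^\lambda_{\nu\mu}\partial_\lambda\phi.$$
In a torsion-free manifold the Christoffel symbol is symmetric in its lower indices, so the two Christoffel terms cancel out, as do the partial derivative terms (as partial derivatives commute). Therefore, we see that
$$[\nabla_\mu, \nabla_\nu]\phi = 0.$$
So, the commutator of covariant derivatives, acting upon a scalar, is zero. This result is perhaps not that surprising. So, let us consider the commutator acting upon a vector.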
4.1 The Riemann Tensor
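As previously stated, we shall compute the commutator of covariant derivatives, acting upon a vector. That is,
$$[\nabla_\mu, \nabla_\nu]A^\rho = \nabla_\mu\nabla_\nu A^\rho - \nabla_\nu\nabla_\mu A^\rho.$$
Now, before, we expanded out the inner covariant derivatives first (as they resulted in just partial derivatives). However, if we do that this time, we will end up having to compute the covariant derivative of the Christoffel symbol, which we don't know how to do. Hence, we expand out the outer derivatives first. So,
$$\nabla_\mu\nabla_\nu A^\rho = \partial_\mu(\nabla_\nu A^\rho) + \Gamma^\rho_{\mu\lambda}\nabla_\nu A^\lambda - \Gamma^\lambda_{\mu\nu}\nabla_\lambda A^\rho,$$
thus, the commutator reads
$$\begin{aligned}
[\nabla_\mu, \nabla_\nu]A^\rho ={}& \partial_\mu(\nabla_\nu A^\rho) + \Gamma^\rho_{\mu\lambda}\nabla_\nu A^\lambda - \Gamma^\lambda_{\mu\nu}\nabla_\lambda A^\rho \\
& - \partial_\nu(\nabla_\mu A^\rho) - \Gamma^\rho_{\nu\lambda}\nabla_\mu A^\lambda + \Gamma^\lambda_{\nu\mu}\nabla_\lambda A^\rho.
\end{aligned}$$
So, we see that the final terms on each line of the RHS cancel (i.e. the third & sixth), by the symmetry of the Christoffel symbol in its lower indices,
$$[\nabla_\mu, \nabla_\nu]A^\rho = \partial_\mu(\nabla_\nu A^\rho) + \Gamma^\rho_{\mu\lambda}\nabla_\nu A^\lambda - \partial_\nu(\nabla_\mu A^\rho) - \Gamma^\rho_{\nu\lambda}\nabla_\mu A^\lambda.$$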
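Now, expanding out the remaining covariant derivatives,
$$\begin{aligned}
[\nabla_\mu, \nabla_\nu]A^\rho ={}& \partial_\mu\left(\partial_\nu A^\rho + \Gamma^\rho_{\lambda\nu}A^\lambda\right) + \Gamma^\rho_{\mu\lambda}\left(\partial_\nu A^\lambda + \Gamma^\lambda_{\nu\beta}A^\beta\right) \\
& - \partial_\nu\left(\partial_\mu A^\rho + \Gamma^\rho_{\lambda\mu}A^\lambda\right) - \Gamma^\rho_{\nu\lambda}\left(\partial_\mu A^\lambda + \Gamma^\lambda_{\mu\beta}A^\beta\right).
\end{aligned}$$
Now, as partial derivatives commute, the two second-derivative terms $\partial_\mu\partial_\nu A^\rho$ and $\partial_\nu\partial_\mu A^\rho$ cancel. So, cancelling & expanding out the brackets, we have
$$\begin{aligned}
[\nabla_\mu, \nabla_\nu]A^\rho ={}& (\partial_\mu\Gamma^\rho_{\lambda\nu})A^\lambda + \Gamma^\rho_{\lambda\nu}\partial_\mu A^\lambda + \Gamma^\rho_{\mu\lambda}\partial_\nu A^\lambda + \Gamma^\rho_{\mu\lambda}\Gamma^\lambda_{\nu\beta}A^\beta \\
& - (\partial_\nu\Gamma^\rho_{\lambda\mu})A^\lambda - \Gamma^\rho_{\lambda\mu}\partial_\nu A^\lambda - \Gamma^\rho_{\nu\lambda}\partial_\mu A^\lambda - \Gamma^\rho_{\nu\lambda}\Gamma^\lambda_{\mu\beta}A^\beta.
\end{aligned}$$
Now, we see that the second term cancels with the seventh, and the third with the sixth (again, by assuming torsion-free manifolds), leaving us with
$$[\nabla_\mu, \nabla_\nu]A^\rho = (\partial_\mu\Gamma^\rho_{\lambda\nu})A^\lambda + \Gamma^\rho_{\mu\lambda}\Gamma^\lambda_{\nu\beta}A^\beta - (\partial_\nu\Gamma^\rho_{\lambda\mu})A^\lambda - \Gamma^\rho_{\nu\lambda}\Gamma^\lambda_{\mu\beta}A^\beta.$$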
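Now, in the second & fourth terms, let us interchange $\beta \leftrightarrow \lambda$, giving
$$[\nabla_\mu, \nabla_\nu]A^\rho = (\partial_\mu\Gamma^\rho_{\lambda\nu})A^\lambda + \Gamma^\rho_{\mu\beta}\Gamma^\beta_{\nu\lambda}A^\lambda - (\partial_\nu\Gamma^\rho_{\lambda\mu})A^\lambda - \Gamma^\rho_{\nu\beta}\Gamma^\beta_{\mu\lambda}A^\lambda,$$
so that we can take out a common factor of $A^\lambda$,
$$[\nabla_\mu, \nabla_\nu]A^\rho = \left(\partial_\mu\Gamma^\rho_{\lambda\nu} + \Gamma^\rho_{\mu\beta}\Gamma^\beta_{\nu\lambda} - \partial_\nu\Gamma^\rho_{\lambda\mu} - \Gamma^\rho_{\nu\beta}\Gamma^\beta_{\mu\lambda}\right)A^\lambda.$$
Now, we define the bracketed quantity to be the Riemann tensor,
$$R^\rho{}_{\lambda\mu\nu} \equiv \partial_\mu\Gamma^\rho_{\lambda\nu} + \Gamma^\rho_{\mu\beta}\Gamma^\beta_{\nu\lambda} - \partial_\nu\Gamma^\rho_{\lambda\mu} - \Gamma^\rho_{\nu\beta}\Gamma^\beta_{\mu\lambda}, \tag{4.1}$$
so that the commutator reads
$$[\nabla_\mu, \nabla_\nu]A^\rho = R^\rho{}_{\lambda\mu\nu}A^\lambda. \tag{4.2}$$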
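To see the definition (4.1) at work, the following sketch evaluates it numerically. As an assumed test case it uses the standard Christoffel symbols of the unit 2-sphere, $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \cot\theta$, with the derivatives of the Christoffel symbols taken by finite differences; the evaluation point is arbitrary.

```python
import math

N = 2  # dimension; index 0 = theta, index 1 = phi

def christoffel(x):
    """Christoffel symbols Gamma^r_{ab} of the unit 2-sphere (assumed test case)."""
    theta = x[0]
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(N)]
    gamma[0][1][1] = -math.sin(theta) * math.cos(theta)                  # Gamma^theta_{phi phi}
    gamma[1][0][1] = gamma[1][1][0] = math.cos(theta) / math.sin(theta)  # Gamma^phi_{theta phi}
    return gamma

def d_christoffel(x, c, h=1e-5):
    """Central finite-difference estimate of partial_c Gamma^r_{ab}."""
    xp, xm = list(x), list(x)
    xp[c] += h
    xm[c] -= h
    gp, gm = christoffel(xp), christoffel(xm)
    return [[[(gp[r][a][b] - gm[r][a][b]) / (2.0 * h)
              for b in range(N)] for a in range(N)] for r in range(N)]

def riemann(x):
    """R^r_{l m v} from equation (4.1); returned as R[r][l][m][v]."""
    g = christoffel(x)
    dg = [d_christoffel(x, c) for c in range(N)]  # dg[c][r][a][b] = partial_c Gamma^r_{ab}
    return [[[[dg[m][r][l][v]
               + sum(g[r][m][b] * g[b][v][l] for b in range(N))
               - dg[v][r][l][m]
               - sum(g[r][v][b] * g[b][m][l] for b in range(N))
               for v in range(N)] for m in range(N)] for l in range(N)] for r in range(N)]

theta0 = 0.7
R = riemann([theta0, 0.0])
component = R[0][1][0][1]  # R^theta_{phi theta phi}
```

For this test case `component` should reproduce $\sin^2\theta$ at the chosen point, and `R[0][1][1][0]` should be its negative, reflecting the anti-symmetry in the last two indices.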
The Riemann tensor is a (1,3)-tensor. It is plausible that $R^\rho{}_{\lambda\mu\nu}$ is a tensor: the LHS of the above is a tensor, so the RHS must also be (as $A^\rho$ is a tensor). This is obviously not a rigorous proof of the tensorial nature of the Riemann "tensor", so we shall prove it.
We have
$$(\nabla_\mu\nabla_\nu - \nabla_\nu\nabla_\mu)A^\rho = R^\rho{}_{\lambda\mu\nu}A^\lambda,$$
and therefore that
$$(\nabla'_\mu\nabla'_\nu - \nabla'_\nu\nabla'_\mu)A'^\rho = R'^\rho{}_{\lambda\mu\nu}A'^\lambda.$$
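Now, as covariant derivatives of tensors are themselves tensors, we know that
$$\nabla'_\mu\nabla'_\nu A'^\rho = (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu J^\rho{}_\gamma \nabla_\alpha\nabla_\beta A^\gamma.$$
Hence, also using $A'^\lambda = J^\lambda{}_\pi A^\pi$ on the RHS,
$$(J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu J^\rho{}_\gamma (\nabla_\alpha\nabla_\beta - \nabla_\beta\nabla_\alpha)A^\gamma = R'^\rho{}_{\lambda\mu\nu}J^\lambda{}_\pi A^\pi.$$
Now, on the LHS, we see that $(\nabla_\alpha\nabla_\beta - \nabla_\beta\nabla_\alpha)A^\gamma = R^\gamma{}_{\sigma\alpha\beta}A^\sigma$. Therefore,
$$(J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu J^\rho{}_\gamma R^\gamma{}_{\sigma\alpha\beta}A^\sigma = R'^\rho{}_{\lambda\mu\nu}J^\lambda{}_\pi A^\pi.$$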
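Now, relabelling the dummy index $\sigma \to \pi$ on the LHS, this must be valid for all $A^\pi$, so we may cancel off the $A^\pi$:
$$(J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu J^\rho{}_\gamma R^\gamma{}_{\pi\alpha\beta} = R'^\rho{}_{\lambda\mu\nu}J^\lambda{}_\pi.$$
Multiplying through by something that will 'kill off' the Jacobian on the RHS, $(J^{-1})^\pi{}_\delta$ for example, and using $J^\lambda{}_\pi (J^{-1})^\pi{}_\delta = \delta^\lambda_\delta$, we find
$$(J^{-1})^\pi{}_\delta (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu J^\rho{}_\gamma R^\gamma{}_{\pi\alpha\beta} = R'^\rho{}_{\delta\mu\nu}.$$
Relabelling $\delta \to \lambda$ and $\pi \to \sigma$,
$$(J^{-1})^\sigma{}_\lambda (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu J^\rho{}_\gamma R^\gamma{}_{\sigma\alpha\beta} = R'^\rho{}_{\lambda\mu\nu},$$
which is the transformation rule of a (1,3)-tensor. Therefore, we have proven that the Riemann tensor is in fact a tensor.
Just to be in keeping with some literature, the Riemann tensor is also called the Riemann-Christoffel tensor, or the curvature tensor.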
4.1.1 Symmetries of the Riemann Tensor
Now, in one of our previous discussions, we introduced the local inertial frame (LIF), whereby at a point $x^\mu = x^\mu_*$, the metric is flat, and the Christoffel symbols are all zero;
gµν(x∗) = ηµν , ∂ρgµν(x∗) = 0, Γρµν(x∗) = 0.
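In a LIF, the Riemann tensor looks quite simple, since the terms quadratic in the Christoffel symbols vanish. So, we see that the Riemann tensor, in a LIF, is just
$$R^\rho{}_{\lambda\mu\nu} = \partial_\mu\Gamma^\rho_{\lambda\nu} - \partial_\nu\Gamma^\rho_{\lambda\mu}.$$
Putting in expressions for the Christoffel symbols, and noting that the first derivative of the metric is zero;
$$R^\rho{}_{\lambda\mu\nu} = \frac{1}{2}g^{\rho\pi}\left(\partial_\mu\partial_\lambda g_{\nu\pi} + \partial_\mu\partial_\nu g_{\pi\lambda} - \partial_\mu\partial_\pi g_{\lambda\nu} - \partial_\nu\partial_\lambda g_{\mu\pi} - \partial_\nu\partial_\mu g_{\pi\lambda} + \partial_\nu\partial_\pi g_{\lambda\mu}\right),$$
the second & fifth terms cancel each other (as partial derivatives commute), leaving
$$R^\rho{}_{\lambda\mu\nu} = \frac{1}{2}g^{\rho\pi}\left(\partial_\mu\partial_\lambda g_{\nu\pi} - \partial_\mu\partial_\pi g_{\lambda\nu} - \partial_\nu\partial_\lambda g_{\mu\pi} + \partial_\nu\partial_\pi g_{\lambda\mu}\right).$$
Now, to get rid of the metric multiplying the bracket, we form
$$\begin{aligned}
R_{\alpha\lambda\mu\nu} &= g_{\alpha\rho}R^\rho{}_{\lambda\mu\nu} \\
&= \frac{1}{2}g_{\alpha\rho}g^{\rho\pi}\left(\partial_\mu\partial_\lambda g_{\nu\pi} - \partial_\mu\partial_\pi g_{\lambda\nu} - \partial_\nu\partial_\lambda g_{\mu\pi} + \partial_\nu\partial_\pi g_{\lambda\mu}\right) \\
&= \frac{1}{2}\delta^\pi_\alpha\left(\partial_\mu\partial_\lambda g_{\nu\pi} - \partial_\mu\partial_\pi g_{\lambda\nu} - \partial_\nu\partial_\lambda g_{\mu\pi} + \partial_\nu\partial_\pi g_{\lambda\mu}\right) \\
&= \frac{1}{2}\left(\partial_\mu\partial_\lambda g_{\nu\alpha} - \partial_\mu\partial_\alpha g_{\lambda\nu} - \partial_\nu\partial_\lambda g_{\mu\alpha} + \partial_\nu\partial_\alpha g_{\lambda\mu}\right).
\end{aligned}$$
Of course, it must be clear that this is only valid in a LIF. Now, although the above expression is only valid in a LIF, the resulting symmetries are valid everywhere (as the Riemann tensor is a tensor). We see that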
Rαλµν = −Rλαµν = −Rαλνµ = Rµναλ = Rλανµ. (4.3)
And further that
Rαλµν +Rαµνλ +Rανλµ = 0. (4.4)
This can also be denoted
$$R_{\alpha(\lambda\mu\nu)} = 0,$$
where the notation is understood to mean cyclic interchange, and sum, over bracketed indices.
Theorem We state (without proof), that if all components of the Riemann tensor are zero, then the space is flat. That is
$$R_{\lambda\mu\nu\delta} = 0 \;\Rightarrow\; \text{flat space}.$$
4.1.2 The Round Trip
Now, although we shall not go into the details here (we have already presented a full mathematical treatment, however, of the Riemann tensor), one can show that the Riemann tensor comes about from a round trip around a rectangle.
Consider a rectangle, with horizontal sides of length $\Delta x^\mu$ and vertical sides of length $\delta x^\mu$. Then, if one makes the parallel-transported round trip A → B → C → D → A, and if one computes the coordinate shift (merely due to displacement) at each vertex, then one finds that
$$A^\rho_1 = \left(1 + \delta x^\mu\Delta x^\nu[\nabla_\mu, \nabla_\nu]\right)A^\rho_0,$$
where $A^\rho_1$ is the component of $A$ after making the round trip (i.e. one starts with $A^\rho_0$). Then,
$$A^\rho_1 - A^\rho_0 = \Delta A^\rho = \delta x^\mu\Delta x^\nu[\nabla_\mu, \nabla_\nu]A^\rho_0.$$
Now, we see that $[\nabla_\mu, \nabla_\nu]A^\rho_0 = R^\rho{}_{\alpha\mu\nu}A^\alpha$, and so,
$$\Delta A^\rho = \delta x^\mu\Delta x^\nu R^\rho{}_{\alpha\mu\nu}A^\alpha.$$
Now, notice that
$$\begin{aligned}
R^\rho{}_{\alpha\mu\nu}\delta x^\mu\Delta x^\nu &= \frac{1}{2}\left(R^\rho{}_{\alpha\mu\nu}\delta x^\mu\Delta x^\nu + R^\rho{}_{\alpha\nu\mu}\delta x^\nu\Delta x^\mu\right) \\
&= \frac{1}{2}\left(R^\rho{}_{\alpha\mu\nu}\delta x^\mu\Delta x^\nu - R^\rho{}_{\alpha\mu\nu}\delta x^\nu\Delta x^\mu\right) \\
&= \frac{1}{2}R^\rho{}_{\alpha\mu\nu}\Delta S^{\mu\nu},
\end{aligned}$$
where we have used the anti-symmetry of the Riemann tensor's last two indices. Also, we have defined $\Delta S^{\mu\nu} \equiv \delta x^\mu\Delta x^\nu - \delta x^\nu\Delta x^\mu$. Therefore, we can write the round-trip expression as
$$\Delta A^\rho = \frac{1}{2}\Delta S^{\mu\nu}R^\rho{}_{\alpha\mu\nu}A^\alpha.$$
Therefore, we have a semi-geometrical interpretation of the Riemann tensor. It is able to tell us the difference in the orientation of a vector, after making a round trip about a rectangle, in a manifold.
4.2 The Ricci Identity
We call the commutator we defined above the Ricci identity. That is, the Ricci identity is
$$[\nabla_\mu, \nabla_\nu]A^\rho = R^\rho{}_{\lambda\mu\nu}A^\lambda,$$
where $R^\rho{}_{\lambda\mu\nu}$ is the Riemann tensor, in a torsion-less manifold.
4.3 The Ricci Tensor & Scalar
If we contract the Riemann tensor on its first & third indices,
$$R^\rho{}_{\lambda\rho\nu} = g^{\rho\alpha}R_{\alpha\lambda\rho\nu},$$
we have a quantity we define
$$R_{\lambda\nu} \equiv R^\rho{}_{\lambda\rho\nu}.$$
If we further contract $R_{\lambda\nu}$,
$$R \equiv g^{\lambda\nu}R_{\lambda\nu}.$$
Thus, we have what we define as the Ricci tensor, $R_{\mu\nu}$, and the Ricci scalar, $R$. By the symmetries of the Riemann tensor above, we can easily see that the Ricci tensor is symmetric.
Now, we stated that the condition for flat space was that all components of the Riemann tensor are zero. If only the Ricci scalar is zero, then the space is not necessarily flat. One can see this, as upon contraction, some non-zero components may cancel each other out in summation.
4.3.1 Example: The Unit 2-Sphere
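Consider the line element
$$ds^2 = d\theta^2 + \sin^2\theta\, d\phi^2,$$
and suppose that we are given that
$$R^\theta{}_{\phi\theta\phi} = \sin^2\theta$$
is the only non-zero component of the Riemann tensor (obviously we can find the other non-zero components by symmetry relations); then, we can compute the Ricci scalar $R$.
The non-zero components of the metric are easily read off the line element;
$$g_{\theta\theta} = g^{\theta\theta} = 1, \qquad g_{\phi\phi} = \sin^2\theta, \qquad g^{\phi\phi} = \frac{1}{\sin^2\theta}.$$
Now,
$$R_{\theta\phi\theta\phi} = g_{\theta\theta}R^\theta{}_{\phi\theta\phi} = \sin^2\theta.$$
Now, by symmetry of the Riemann tensor,
$$R_{\theta\phi\theta\phi} = -R_{\phi\theta\theta\phi} = -R_{\theta\phi\phi\theta} = R_{\phi\theta\phi\theta}.$$
Now, the Ricci tensor is found by contraction,
$$R_{ij} = g^{nm}R_{nimj}.$$
We are slightly fortunate in that the metric is diagonal. So,
$$\begin{aligned}
R_{\theta\theta} &= g^{ij}R_{i\theta j\theta} \\
&= g^{\theta\theta}R_{\theta\theta\theta\theta} + g^{\phi\phi}R_{\phi\theta\phi\theta} \\
&= 1 \cdot 0 + \frac{1}{\sin^2\theta}\cdot\sin^2\theta \\
&= 1.
\end{aligned}$$
Also,
$$R_{\theta\phi} = g^{\theta\theta}R_{\theta\theta\theta\phi} + g^{\phi\phi}R_{\phi\theta\phi\phi} = 0.$$
And,
$$\begin{aligned}
R_{\phi\phi} &= g^{\theta\theta}R_{\theta\phi\theta\phi} + g^{\phi\phi}R_{\phi\phi\phi\phi} \\
&= 1\cdot\sin^2\theta + 0 \\
&= \sin^2\theta.
\end{aligned}$$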
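Therefore, the Ricci scalar,
$$\begin{aligned}
R &= g^{ij}R_{ij} \\
&= g^{\theta\theta}R_{\theta\theta} + g^{\phi\phi}R_{\phi\phi} \\
&= 1 + \frac{1}{\sin^2\theta}\sin^2\theta \\
&= 2.
\end{aligned}$$
Therefore, the Ricci scalar is 2 for this metric. Note that this line element is that of the surface of a unit sphere, which is curved; the flat plane-polar metric $ds^2 = dr^2 + r^2 d\phi^2$ has $R = 0$.
Now, if we were to repeat this, for the line element
$$ds^2 = dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\, d\phi^2,$$
we would find that $R = 0$.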
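The contractions above are mechanical, so they are easy to reproduce in a few lines. The sketch below takes the single given component, fills in the others using the symmetries (4.3), and contracts with the inverse metric; the function name and the sample angle are choices made for this illustration.

```python
import math

def ricci_from_given_component(theta):
    """Ricci tensor and scalar for ds^2 = dtheta^2 + sin^2(theta) dphi^2.

    Indices: 0 = theta, 1 = phi. The only input is R_{theta phi theta phi} = sin^2(theta).
    """
    s2 = math.sin(theta) ** 2
    g_inv = [[1.0, 0.0], [0.0, 1.0 / s2]]

    # Fully covariant Riemann tensor R[a][b][c][d], filled in using the symmetries (4.3).
    riemann = [[[[0.0, 0.0] for _ in range(2)] for _ in range(2)] for _ in range(2)]
    riemann[0][1][0][1] = s2
    riemann[1][0][0][1] = -s2
    riemann[0][1][1][0] = -s2
    riemann[1][0][1][0] = s2

    # R_{ij} = g^{nm} R_{n i m j}
    ricci = [[sum(g_inv[n][m] * riemann[n][i][m][j] for n in range(2) for m in range(2))
              for j in range(2)] for i in range(2)]
    # R = g^{ij} R_{ij}
    scalar = sum(g_inv[i][j] * ricci[i][j] for i in range(2) for j in range(2))
    return ricci, scalar

theta0 = 1.1
ricci, scalar = ricci_from_given_component(theta0)
```

The result should give `ricci[0][0]` equal to 1, `ricci[1][1]` equal to $\sin^2\theta$ and `scalar` equal to 2, independent of the angle chosen.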
4.4 The Bianchi Identity
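Consider the Riemann tensor, in a LIF,
$$R^\rho{}_{\lambda\mu\nu} = \partial_\mu\Gamma^\rho_{\lambda\nu} - \partial_\nu\Gamma^\rho_{\lambda\mu}.$$
Then, let us differentiate it,
$$\nabla_\pi R^\rho{}_{\lambda\mu\nu} = \nabla_\pi\partial_\mu\Gamma^\rho_{\lambda\nu} - \nabla_\pi\partial_\nu\Gamma^\rho_{\lambda\mu}.$$
Now, even though we don't know how to evaluate these expressions, we can still cycle indices to see what happens. So, making the change
$$\pi \to \mu \to \nu \to \pi,$$
then
$$\nabla_\mu R^\rho{}_{\lambda\nu\pi} = \nabla_\mu\partial_\nu\Gamma^\rho_{\lambda\pi} - \nabla_\mu\partial_\pi\Gamma^\rho_{\lambda\nu},$$
and again,
$$\nabla_\nu R^\rho{}_{\lambda\pi\mu} = \nabla_\nu\partial_\pi\Gamma^\rho_{\lambda\mu} - \nabla_\nu\partial_\mu\Gamma^\rho_{\lambda\pi}.$$
Now, if we add these 3 expressions,
$$\begin{aligned}
\nabla_\pi R^\rho{}_{\lambda\mu\nu} + \nabla_\mu R^\rho{}_{\lambda\nu\pi} + \nabla_\nu R^\rho{}_{\lambda\pi\mu} ={}& \nabla_\pi\partial_\mu\Gamma^\rho_{\lambda\nu} - \nabla_\pi\partial_\nu\Gamma^\rho_{\lambda\mu} \\
& + \nabla_\mu\partial_\nu\Gamma^\rho_{\lambda\pi} - \nabla_\mu\partial_\pi\Gamma^\rho_{\lambda\nu} \\
& + \nabla_\nu\partial_\pi\Gamma^\rho_{\lambda\mu} - \nabla_\nu\partial_\mu\Gamma^\rho_{\lambda\pi}.
\end{aligned}$$
Now, in a LIF, the Christoffel symbols are zero. Therefore, the covariant derivative is the same as the "usual" partial derivative. So, rather than changing the above symbols, we let the covariant and partial derivatives swap indices (for example, $\nabla_\mu\partial_\pi = \nabla_\pi\partial_\mu$). Then, we can see that the terms on the RHS cancel in pairs, leaving
$$\nabla_\pi R^\rho{}_{\lambda\mu\nu} + \nabla_\mu R^\rho{}_{\lambda\nu\pi} + \nabla_\nu R^\rho{}_{\lambda\pi\mu} = 0.$$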
Now, if we lower the ρ-index (using the metric; the index relabelling is trivial),
∇πRρλµν +∇µRρλνπ +∇νRρλπµ = 0.
And, using the symmetry property that Rαβγδ = Rγδαβ, then
∇πRµνρλ +∇µRνπρλ +∇νRπµρλ = 0,
which we see is just a cyclic interchange of the first three indices of the whole expression.That is,
∇(πRµν)ρλ = 0.
Hence, we have arrived at our result. The Bianchi identity is that
∇πRµνρλ +∇νRπµρλ +∇µRνπρλ = 0. (4.5)
The Bianchi identity is in fact the equivalent of the rectangular round-trip expression we derived above. The Bianchi identity will come about if one considers the difference in orientation of a vector being parallel-transported around a cuboid.
Although we derived the Bianchi identity with the Riemann tensor in a LIF, the expression is completely valid in all frames. This is because the Riemann tensor is a tensor; and a tensor equation has the same form in all frames. Thus, one begins to see the power of getting expressions into tensorial form, and of the local inertial frame.
4.5 The Einstein Tensor
Now, let us take the Bianchi identity,
∇πRµνρλ +∇νRπµρλ +∇µRνπρλ = 0.
Now, let us figure out how to contract this expression, so that we have Ricci tensors, rather than Riemann tensors. Now, if we multiply the expression by $g^{\mu\lambda}$, then we will have achieved our goal (one can see that the indices of this metric are the first and last indices of the first Riemann tensor). However, let us do this methodically. So, the first term will read,
$$g^{\mu\lambda}\nabla_\pi R_{\mu\nu\rho\lambda} = -g^{\mu\lambda}\nabla_\pi R_{\mu\nu\lambda\rho} = -\nabla_\pi R_{\nu\rho},$$
after using the anti-symmetry of the last two indices of the Riemann tensor. The second term can be rewritten, using the symmetry of the Riemann tensor under simultaneous interchange within the first pair & within the last pair of indices;
$$g^{\mu\lambda}\nabla_\nu R_{\pi\mu\rho\lambda} = g^{\mu\lambda}\nabla_\nu R_{\mu\pi\lambda\rho} = \nabla_\nu R_{\pi\rho}.$$
Lastly, the final term of the contracted Bianchi identity is just
$$g^{\mu\lambda}\nabla_\mu R_{\nu\pi\rho\lambda} = \nabla^\lambda R_{\nu\pi\rho\lambda}.$$
Therefore, putting these all together, our contracted Bianchi identity looks like
$$\nabla_\nu R_{\pi\rho} - \nabla_\pi R_{\nu\rho} + \nabla^\lambda R_{\nu\pi\rho\lambda} = 0.$$
Now, multiplying this whole expression by $g^{\nu\rho}$ will contract the last Riemann tensor into a Ricci tensor; as well as the middle Ricci tensor into a Ricci scalar. Thus,
$$g^{\nu\rho}\nabla_\nu R_{\pi\rho} - g^{\nu\rho}\nabla_\pi R_{\nu\rho} + g^{\nu\rho}\nabla^\lambda R_{\nu\pi\rho\lambda} = 0,$$
$$\Rightarrow\; \nabla^\rho R_{\pi\rho} - \nabla_\pi R + \nabla^\lambda R_{\pi\lambda} = 0.$$
Now, the first and last terms are identical, as they differ only in the name of a dummy index. Therefore, we have
$$2\nabla^\rho R_{\pi\rho} - \nabla_\pi R = 0.$$
Then, notice that we can write
$$2\nabla^\rho R_{\pi\rho} - g_{\pi\rho}\nabla^\rho R = 0,$$
and therefore that
$$\nabla^\rho\left(R_{\pi\rho} - \frac{1}{2}g_{\pi\rho}R\right) = 0.$$
Therefore, we can define the Einstein tensor,
$$G_{\mu\nu} \equiv R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R, \tag{4.6}$$
whereby
$$\nabla^\mu G_{\mu\nu} = 0; \tag{4.7}$$
after noting that the metric, Ricci & therefore the Einstein tensor are symmetric. This is called the contracted Bianchi identity.
4.6 Geodesic Deviation
Suppose we take, on flat space, two affinely parameterised geodesics $x^\mu(\tau)$, $y^\mu(\tau)$, that are on a collision course. That is, the distance between the two lines,
$$\delta^\mu(\tau) \equiv x^\mu(\tau) - y^\mu(\tau),$$
decreases. On flat space, the distance will decrease linearly. That is,
$$\frac{d\delta^\mu}{d\tau} = \text{const} \;\Rightarrow\; \frac{d^2\delta^\mu}{d\tau^2} = 0.$$
Now consider a curved space. Let the geodesics have tangent vector $T^\mu$. Then, the distance between the two won't decrease linearly. Instead, they will accelerate relative to each other; thus
$$\frac{D^2\delta^\mu}{D\tau^2} = R^\mu{}_{\alpha\beta\rho}T^\alpha T^\beta\delta^\rho. \tag{4.8}$$
To imagine this in a physical situation, consider two balls falling towards the centre of the earth. Now, the balls will obviously move towards each other, as their motion is radial. However, there will be deviation from radial, and that deviation will be due to the curvature of space. That is, one will observe the balls accelerate towards each other (rather than the expected linear motion towards each other).
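Derivation We can derive the geodesic deviation equation, by considering the 2-dim manifold swept out by two affinely parameterised geodesics, next to each other. The manifold may be parameterised by $x^\mu = x^\mu(\tau, \sigma)$ (i.e. two coordinates on this surface, rather than the usual one, on a curve). The tangent vectors are
$$T^\mu = \frac{dx^\mu}{d\tau}, \qquad \delta^\mu = \frac{dx^\mu}{d\sigma}.$$
Now, let us show a useful relation. Consider
$$\begin{aligned}
T^\mu\nabla_\mu\delta^\nu &= T^\mu\left(\partial_\mu\delta^\nu + \Gamma^\nu_{\mu\lambda}\delta^\lambda\right) \\
&= \frac{dx^\mu}{d\tau}\frac{\partial}{\partial x^\mu}\frac{dx^\nu}{d\sigma} + \Gamma^\nu_{\mu\lambda}T^\mu\delta^\lambda.
\end{aligned}$$
Now, the first term can be rewritten
$$\frac{dx^\mu}{d\tau}\frac{\partial}{\partial x^\mu}\frac{dx^\nu}{d\sigma} = \frac{d^2x^\nu}{d\tau\, d\sigma} = \frac{d^2x^\nu}{d\sigma\, d\tau} = \frac{dx^\mu}{d\sigma}\frac{\partial}{\partial x^\mu}\frac{dx^\nu}{d\tau}.$$
Hence, we use this to see that
$$\begin{aligned}
T^\mu\nabla_\mu\delta^\nu &= \frac{dx^\mu}{d\sigma}\frac{\partial}{\partial x^\mu}\frac{dx^\nu}{d\tau} + \Gamma^\nu_{\mu\lambda}T^\mu\delta^\lambda \\
&= \delta^\mu\partial_\mu T^\nu + \Gamma^\nu_{\lambda\mu}T^\lambda\delta^\mu \\
&= \delta^\mu\nabla_\mu T^\nu,
\end{aligned}$$
where we have merely relabelled dummy indices and used the symmetry of the Christoffel symbol. Hence, we have the relation
$$T^\mu\nabla_\mu\delta^\nu = \delta^\mu\nabla_\mu T^\nu. \tag{4.9}$$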
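Now, let us state the operator
$$\frac{D^2}{D\tau^2} = T^\alpha\nabla_\alpha T^\beta\nabla_\beta,$$
and compute its action upon $\delta^\mu$,
$$\frac{D^2\delta^\mu}{D\tau^2} = T^\alpha\nabla_\alpha\left(T^\beta\nabla_\beta\delta^\mu\right).$$
Now, we use our relation (4.9), to see that
$$\frac{D^2\delta^\mu}{D\tau^2} = T^\alpha\nabla_\alpha\left(\delta^\beta\nabla_\beta T^\mu\right).$$
Let us now expand this out,
$$\frac{D^2\delta^\mu}{D\tau^2} = \left(T^\alpha\nabla_\alpha\delta^\beta\right)\nabla_\beta T^\mu + T^\alpha\delta^\beta\nabla_\alpha\nabla_\beta T^\mu.$$
Now, we can rewrite the two-covariant-derivatives term on the far RHS, using the Ricci identity,
$$[\nabla_\mu, \nabla_\nu]A^\rho = R^\rho{}_{\lambda\mu\nu}A^\lambda,$$
so that we have
$$\begin{aligned}
\frac{D^2\delta^\mu}{D\tau^2} &= \left(T^\alpha\nabla_\alpha\delta^\beta\right)\nabla_\beta T^\mu + T^\alpha\delta^\beta R^\mu{}_{\lambda\alpha\beta}T^\lambda + T^\alpha\delta^\beta\nabla_\beta\nabla_\alpha T^\mu \\
&= \left(\delta^\alpha\nabla_\alpha T^\beta\right)\nabla_\beta T^\mu + T^\beta\delta^\alpha\nabla_\alpha\nabla_\beta T^\mu + R^\mu{}_{\lambda\alpha\beta}T^\alpha T^\lambda\delta^\beta \\
&= \delta^\alpha\left(\nabla_\alpha T^\beta\,\nabla_\beta T^\mu + T^\beta\nabla_\alpha\nabla_\beta T^\mu\right) + R^\mu{}_{\lambda\alpha\beta}T^\alpha T^\lambda\delta^\beta.
\end{aligned}$$
In the first step we used our relation (4.9) on the first term, and changed dummy indices on the third term, then we merely factorised the expression. Now, notice that the bracketed term can be written
$$\nabla_\alpha T^\beta\,\nabla_\beta T^\mu + T^\beta\nabla_\alpha\nabla_\beta T^\mu = \nabla_\alpha\left(T^\beta\nabla_\beta T^\mu\right),$$
but the bracketed part on the RHS is zero on an affinely parameterised geodesic. Hence,
$$\frac{D^2\delta^\mu}{D\tau^2} = R^\mu{}_{\lambda\alpha\beta}T^\alpha T^\lambda\delta^\beta,$$
or, trivially relabelling indices, we arrive at our equation for geodesic deviation
$$\frac{D^2\delta^\mu}{D\tau^2} = R^\mu{}_{\alpha\beta\rho}T^\alpha T^\beta\delta^\rho. \tag{4.10}$$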
5 Einstein’s Equation
We almost have enough tools to be able to write Einstein’s equation.
We have seen that freely-falling particles follow geodesics. In curved spacetime, the geodesics will probably be curves. So then, what makes the spacetime curved?
If we consider electromagnetic theory, there is a source for the electric field: electric charge (carried, for example, by the electron). For a field, there is a source. Therefore, we need a source term that will curve spacetime. We shall now discuss a term that is the "gravitation source term".
5.1 The Energy Momentum Tensor T µν
We shall start by stating that there exists a tensor $T^{\mu\nu}$, which is symmetric. That is, $T^{\mu\nu} = T^{\nu\mu}$. Furthermore, we shall state that the components of this tensor contain all possible forms of energy and momentum (it will be this tensor which is the source-term). Let us state how to compute a given component of the tensor.
A given element $T^{\mu\nu}$ is the flux of $p^\mu$ that goes through the hypersurface $x^\nu = \text{const}$.
The structure of the tensor is clearly
$$(T^{\mu\nu}) = \begin{pmatrix} T^{tt} & T^{ti} \\ T^{it} & T^{ij} \end{pmatrix}.$$
Also, before we start to compute the components of the tensor, we must state that the full 4-volume is just $\Delta t\,\Delta x\,\Delta y\,\Delta z$.
5.1.1 Components of T µν
Let us consider $T^{00} = T^{tt}$. Then, by our definition, that element is the flux of $p^0$ through the surface $x^0 = \text{const}$. Now, $p^0 = E$ and $x^0 = t$. Therefore, we see that $T^{tt}$ is the flux of energy $E$ through a 3-volume $\Delta x\,\Delta y\,\Delta z$ (it is the hypersurface that holds $x^0 = t$ constant). Therefore,
$$T^{tt} = \frac{E}{\Delta x\,\Delta y\,\Delta z} \equiv \varepsilon,$$
that is, the energy per unit volume, the energy density $\varepsilon$.
Consider the component $T^{01} = T^{tx}$. Then, we see that it is the flux of $p^0 = E$ through the hypersurface $x^1 = x = \text{const}$. That is,
$$T^{01} = \frac{E}{\Delta t\,\Delta y\,\Delta z},$$
which has the interpretation of being the energy flux through the y-z plane, in unit time. This is easily extrapolated to any term $T^{0i}$: the energy flux through a surface, in unit time.
Now consider the purely-spatial components, $T^{ij}$. For example,
$$T^{ix} = \frac{\Delta p^i}{\Delta t\,\Delta y\,\Delta z}.$$
Now, notice that we can rewrite this,
$$T^{ix} = \frac{\Delta p^i/\Delta t}{A_{yz}}, \qquad A_{yz} \equiv \Delta y\,\Delta z;$$
where we have fairly obviously defined an area-element. A change in momentum per unit time is just a force. Thus,
$$T^{ix} = \frac{F^i}{A_{yz}},$$
which is a force per unit area: a pressure. Consider the specific component,
$$T^{yx} = \frac{\Delta p^y/\Delta t}{\Delta y\,\Delta z},$$
then, using the relation $p^i = v^i E$, we see that
$$T^{yx} = \frac{\Delta v^y E/\Delta t}{\Delta y\,\Delta z}.$$
So, as $v^i = x^i/t$, we see that this is just
$$T^{yx} = \frac{(\Delta y/\Delta t)\,E/\Delta t}{\Delta y\,\Delta z}.$$
Now then, as the $\Delta y$'s cancel, we can just replace them with $\Delta x$'s, thus
$$T^{yx} = \frac{(\Delta x/\Delta t)\,E/\Delta t}{\Delta x\,\Delta z} = \frac{\Delta v^x E/\Delta t}{\Delta x\,\Delta z} = \frac{\Delta p^x/\Delta t}{\Delta x\,\Delta z} = T^{xy}.$$
Therefore, with this little exercise, we see that the spatial components of the energy-momentum tensor are in fact symmetric. One may be able to see that the diagonal components of the spatial part, those $T^{ii}$, correspond to the force perpendicular to a surface. The off-diagonal elements are the force parallel to a surface (shear). Therefore, the spatial components $T^{ij}$ are components of the stress-tensor.
The final part of the tensor consists of those components $T^{it}$. We see that they are the flow of $p^i$ through the hypersurface $t = \text{const}$. That is, the momentum flow in a given 3-volume, at a constant time. That is, how much momentum there exists in a unit volume, at a single time. This is clearly the momentum density,
$$T^{it} = \frac{\Delta p^i}{\Delta x\,\Delta y\,\Delta z} \equiv \pi^i.$$
To see that $T^{it} = T^{ti}$, consider the above expression; writing $p^i = v^i E = x^i E/t$, then,
$$T^{it} = \frac{(\Delta x^i/\Delta t)\,E}{\Delta x\,\Delta y\,\Delta z} = \frac{\Delta x^i\,E}{\Delta t\,\Delta x\,\Delta y\,\Delta z}.$$
Then, as $i$ is changed through $i = x, y, z$, different components on the denominator will be cancelled out, leaving only those in the corresponding $T^{ti}$.
Therefore, we have seen that the energy-momentum tensor $T^{\mu\nu}$ is symmetric, by considering its components. The colloquial construction of the tensor is thus
$$(T^{\mu\nu}) = \begin{pmatrix} \text{energy density} & \text{energy flux} \\ \text{momentum density} & \text{stress tensor} \end{pmatrix}.$$
We shall write that $T^{it} = \pi^i$, $T^{tt} = \varepsilon$.
5.1.2 Conservation Equations
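The standard conservation equation is that
$$\nabla_\nu T^{\mu\nu} = 0. \tag{5.1}$$
Let us consider this in a LIF. Then, it simply reads $\partial_\nu T^{\mu\nu} = 0$.
Now, take the time-component, $\mu = 0 = t$. Then, the conservation equation reads
$$\frac{\partial}{\partial t}T^{tt} + \frac{\partial}{\partial x^i}T^{ti} = 0,$$
which is just
$$\frac{\partial\varepsilon}{\partial t} + \frac{\partial\pi^i}{\partial x^i} = 0.$$
This equation can be written
$$\frac{\partial\varepsilon}{\partial t} + \nabla\cdot\boldsymbol{\pi} = 0,$$
and is the familiar continuity equation, for energy. Note that this is only valid in a LIF.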
Let us take the spatial components, $\mu = i$, of the conservation equation. Thus,
$$\frac{\partial}{\partial t}T^{it} + \frac{\partial}{\partial x^j}T^{ij} = 0.$$
Now, if we write the force density, in a given direction
$$\varphi^i \equiv -\frac{\partial T^{ij}}{\partial x^j},$$
then we see that the conservation equation reads
$$\frac{\partial\pi^i}{\partial t} - \varphi^i = 0,$$
which is just the statement that the rate of change of momentum density is the force density. This is the familiar statement of Newton's second law. That is, the above equation is just
$$\frac{\partial\boldsymbol{\pi}}{\partial t} = \boldsymbol{\varphi}.$$
Therefore, we see that the energy-momentum tensor $T^{\mu\nu}$ contains all sources of energy and momentum, and satisfies basic conservation relations.
5.1.3 Perfect Fluids
A perfect fluid is defined to be one for which there is no viscosity or heat conduction. This "restriction" makes the energy-momentum tensor look a lot simpler.
That a fluid has no heat conduction means that there is no transfer of energy across surfaces. Viscous forces are those which are parallel to a surface (shear). Thus, the absence of such forces implies that all forces on surfaces are perpendicular to those surfaces.
Therefore, if we consider our previous "derivation" of the components of $T^{\mu\nu}$, we see that it must be diagonal. This is because
• No heat conduction implies no energy flux. Therefore, $T^{ti} = T^{it} = 0$.
• No viscosity means that all parallel forces are zero. This only leaves diagonal components to the stress tensor. All components left over are just pressures (as discussed previously), $P$.
We shall change notation slightly, so that $\rho$ is the energy density (which is clearly the case, via $\varepsilon = \rho c^2$, with $c = 1$). Therefore, we see that for a perfect fluid, at rest,
$$T^{\mu\nu} = \begin{pmatrix} \rho & 0 & 0 & 0 \\ 0 & P & 0 & 0 \\ 0 & 0 & P & 0 \\ 0 & 0 & 0 & P \end{pmatrix} = \text{diag}(\rho, P, P, P).$$
The Perfect Fluid Tensor The general expression for the energy-momentum tensor, for a perfect fluid in its LIF is
$$T^{\mu\nu} = (\rho + P)u^\mu u^\nu - Pg^{\mu\nu}, \tag{5.2}$$
where $u^\mu = \gamma(1, \mathbf{u})$ is the 4-velocity, $\rho$ the energy density and $P$ the pressure of the fluid. If we take this tensor, with the fluid at rest in its LIF, in flat space (so that $u^\mu = (1, 0, 0, 0)$ and $g^{\mu\nu} = \eta^{\mu\nu}$ with $\eta^{ii} = -1$), then some of the components are
$$T^{00} = (\rho + P) - P = \rho, \qquad T^{12} = 0, \qquad T^{ii} = P \;\text{(no sum)}.$$
In fact, all off-diagonal components are zero. Then, we see that we have recovered our previous expression for a perfect fluid at rest.
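The components of (5.2) are easy to tabulate. The sketch below builds $T^{\mu\nu}$ in flat space with signature $(+,-,-,-)$ and $c = 1$, first for a fluid at rest and then for a fluid moving along $x$; the numerical values of $\rho$, $P$ and the velocity are arbitrary choices for this illustration.

```python
import math

# Inverse Minkowski metric eta^{mu nu} = diag(1, -1, -1, -1)
ETA_INV = [[1.0 if m == n == 0 else (-1.0 if m == n else 0.0) for n in range(4)]
           for m in range(4)]

def perfect_fluid(rho, pressure, velocity):
    """T^{mu nu} = (rho + P) u^mu u^nu - P eta^{mu nu} for a given 3-velocity (c = 1)."""
    speed_sq = sum(v * v for v in velocity)
    gamma = 1.0 / math.sqrt(1.0 - speed_sq)
    u = [gamma] + [gamma * v for v in velocity]  # 4-velocity gamma * (1, u)
    return [[(rho + pressure) * u[m] * u[n] - pressure * ETA_INV[m][n]
             for n in range(4)] for m in range(4)]

def trace(T):
    """eta_{mu nu} T^{mu nu}; eta is diagonal, so only diagonal entries contribute."""
    return sum(ETA_INV[m][m] * T[m][m] for m in range(4))

rho, pressure = 2.0, 0.5                                  # hypothetical values
rest = perfect_fluid(rho, pressure, [0.0, 0.0, 0.0])
moving = perfect_fluid(rho, pressure, [0.6, 0.0, 0.0])
```

Here `rest` should come out as diag(ρ, P, P, P), while `moving` picks up a non-zero energy flux `moving[0][1]`. The trace is the same in both cases, ρ − 3P, as expected of a scalar.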
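We can easily recover some standard fluid mechanics results from the perfect fluid tensor. Suppose we have a non-relativistic pressure-less fluid, $P = 0$, then the energy-conservation equation is just
$$\partial_\mu(\rho u^\mu u^0) = 0,$$
which, with $u^0 \approx 1$ for non-relativistic motion, easily becomes
$$\frac{\partial\rho}{\partial t} + \frac{\partial(\rho u^i)}{\partial x^i} = 0,$$
which is just
$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{u}) = 0.$$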
5.2 Einstein’s Equation
We now have a source term. The sources of energy and momentum can be written "into" the energy-momentum tensor, $T^{\mu\nu}$; a tensor which satisfies the conservation equation.
Now, from the previous section's contracted Bianchi identity,
$$\nabla^\mu G_{\mu\nu} = 0, \qquad G_{\mu\nu} \equiv R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R,$$
we have an expression which takes care of the geometry of the spacetime. Recall that the Ricci tensor/scalar are composed of differentials (of various orders) of the metric, where the metric gives meaning to distances within a manifold. Then, if we can equate this expression to an expression which gives information as to what is doing the curving, then we have our general theory of relativity. We must use an expression which also has zero covariant derivative.
The obvious choice is the energy-momentum tensor. Therefore, we write
$$G_{\mu\nu} = \kappa T_{\mu\nu}.$$
Therefore, up to a constant $\kappa$, we have a LHS which describes the geometry of a manifold, and a RHS which describes the distribution of all forms of energy and momentum in the manifold. Therefore, we say that the distribution of energy-momentum in a manifold causes the manifold to become curved.
Therefore, Einstein's equation is
$$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = \kappa T_{\mu\nu}. \tag{5.3}$$
Notice that both sides have vanishing covariant derivative. We shall be able to find the constant $\kappa$ when we consider the Newtonian limit of the theory.
We can write this in an alternative form. Consider multiplying the whole expression by $g^{\mu\nu}$,
$$g^{\mu\nu}\left(R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R\right) = \kappa g^{\mu\nu}T_{\mu\nu},$$
then, writing the trace of the energy-momentum tensor as $g^{\mu\nu}T_{\mu\nu} \equiv T$, and noting that the Ricci tensor becomes the Ricci scalar upon contraction, we see that
$$R - \frac{1}{2}g^\mu{}_\mu R = \kappa T.$$
Now, the metric multiplied by its inverse is just the Kronecker-delta. Thus, $g^\mu{}_\mu = \delta^\mu_\mu = 4$. Therefore,
$$R - \frac{1}{2}\cdot 4R = \kappa T,$$
hence,
$$R = -\kappa T.$$
Thus, we have written the Ricci scalar in terms of the trace of the energy-momentum tensor. Hence, we can write the Einstein equation as
$$R_{\mu\nu} + \frac{1}{2}g_{\mu\nu}\kappa T = \kappa T_{\mu\nu},$$
which is just
$$R_{\mu\nu} = \kappa\left(T_{\mu\nu} - \frac{1}{2}g_{\mu\nu}T\right). \tag{5.4}$$
This is an entirely equivalent form of Einstein's equation.
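The equivalence of (5.3) and (5.4) is pure index algebra, so it can be checked with numbers. In the sketch below the metric is taken to be $\eta_{\mu\nu}$ at a point (as in a LIF), and the symmetric array `T` and the value of `kappa` are arbitrary placeholders, not physical data.

```python
# Minkowski metric eta_{mu nu} = diag(1, -1, -1, -1); numerically equal to its inverse.
ETA = [[1.0 if m == n == 0 else (-1.0 if m == n else 0.0) for n in range(4)]
       for m in range(4)]

def trace(tensor):
    """Contract a covariant rank-2 tensor with eta^{mu nu}."""
    return sum(ETA[m][n] * tensor[m][n] for m in range(4) for n in range(4))

def ricci_from_source(T, kappa):
    """Equation (5.4): R_{mu nu} = kappa (T_{mu nu} - (1/2) g_{mu nu} T)."""
    t = trace(T)
    return [[kappa * (T[m][n] - 0.5 * ETA[m][n] * t) for n in range(4)] for m in range(4)]

def einstein_tensor(ricci):
    """Left-hand side of (5.3): G_{mu nu} = R_{mu nu} - (1/2) g_{mu nu} R."""
    r = trace(ricci)
    return [[ricci[m][n] - 0.5 * ETA[m][n] * r for n in range(4)] for m in range(4)]

kappa = 2.5  # placeholder value
T = [[2.0, 0.3, 0.0, 0.1],
     [0.3, 0.5, 0.2, 0.0],
     [0.0, 0.2, 0.7, 0.4],
     [0.1, 0.0, 0.4, 0.9]]  # arbitrary symmetric sample

ricci = ricci_from_source(T, kappa)
G = einstein_tensor(ricci)
ricci_scalar = trace(ricci)
```

Starting from (5.4), `G` should reproduce `kappa * T` component by component, and `ricci_scalar` should equal `-kappa * trace(T)`.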
5.2.1 The Cosmological Constant
Now, if we require the covariant derivative of an expression to be zero, we may add on an "extra term", a constant, which will not change the value of the covariant derivative. The covariant derivative of the metric is zero, so we may add on any number of metrics and retain zero covariant derivative. Therefore,
$$G_{\mu\nu} = \kappa T_{\mu\nu} + \Lambda g_{\mu\nu}$$
is still consistent with zero covariant derivative. So, why is this a problem?
Consider the expression, from electrodynamic theory, in a LIF,
$$\partial_\mu F^{\mu\nu} = J^\nu.$$
Then, consider taking the differential of the expression,
$$\partial_\nu\partial_\mu F^{\mu\nu} = \partial_\nu J^\nu = 0;$$
where the equality with zero comes from the "usual" conservation equation. Now, consider that we try to add on an extra term,
$$\partial_\mu(F^{\mu\nu} + \Lambda\eta^{\mu\nu}).$$
Then, these two expressions are not the same. That is, we do not have the freedom to modify the field tensor by adding on an arbitrary quantity of metrics. The reason we are not able to do this is that the field tensor is anti-symmetric, and the metric is symmetric.
Therefore, the reason we are able to add the constant metric term into Einstein's equation is that both the Einstein tensor $G_{\mu\nu}$ and energy-momentum tensor $T_{\mu\nu}$ are symmetric (as is the metric); as well as the metric having zero covariant derivative.
The cosmological constant $\Lambda$ has been measured to exist within the universe, having a very small numerical value. We shall usually define the cosmological constant within the energy-momentum tensor, so that we will essentially ignore it. However, it is to be understood that the term is within the energy-momentum tensor.
5.3 The Newtonian Limit
Let us discuss the correspondences of the theory of gravity on curved spacetime, with Newtonian gravity.
The equation of motion of a free particle, in Newton's theory, is just given by Newton's second law of motion,
$$\frac{d^2x^i}{dt^2} = -\delta^{ij}\frac{\partial\Phi}{\partial x^j},$$
where $\Phi$ is the gravitational potential a particle feels. The corresponding equation of motion for a free particle, in curved spacetime, is the geodesic
$$\frac{d^2x^\mu}{d\tau^2} = -\Gamma^\mu_{\alpha\beta}\frac{dx^\alpha}{d\tau}\frac{dx^\beta}{d\tau},$$
where the Christoffel symbol $\Gamma^\mu_{\alpha\beta}$ contains information about the geometry of the spacetime.
The field equation, which describes how "stuff" generates the gravitational field, for the Newtonian theory is
$$\nabla^2\Phi = 4\pi G\rho_m.$$
That is, Poisson's equation. This equation tells us that for some mass density $\rho_m$, there is an associated gravitational potential $\Phi$. Combined with the equation of motion, we see that a mass density gives rise to a gravitational potential, which affects how a free particle moves.
The field equation in general relativity is Einstein's equation,
$$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = \kappa T_{\mu\nu}.$$
Some distribution of energy and momentum, defined within the energy-momentum tensor, gives rise to a different geometry. This geometrical information is then carried by the Ricci tensor, through the metric. The metric then gives the Christoffel symbols, which change the equation of motion: the geodesic.
The basic correspondence is
$$g_{\mu\nu} \longleftrightarrow \Phi, \qquad T_{\mu\nu} \longleftrightarrow \rho_m.$$
Notice that we have only been referring to "free particles". A free particle is one which does not have any external influences on its motion. For example, this could mean a stone being dropped, in vacuum, from a building. The stone's motion is only affected by the gravitational potential from the earth. Notice then, that the motion of a freely-falling particle in a curved spacetime is entirely due to the spacetime through which it moves. That is, its trajectory will be curved because of the geometry of the spacetime.
To modify these equations for a particle which is acted upon by an external force, $F_{\text{ext}}$, one must merely add this to each component of the equation of motion.
5.3.1 Newtonian Gravity from Einstein’s Gravity
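Let us consider the geodesic equation, for a free particle,
$$\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{\alpha\beta}\frac{dx^\alpha}{d\tau}\frac{dx^\beta}{d\tau} = 0.$$
Now, let us consider the non-relativistic limit of this geodesic.
Firstly, for non-relativistic motion, $\tau \approx t$. Second, $dx^i/dt \ll 1$. Then, we can write the geodesic equation as
$$\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{00}\left(\frac{dt}{d\tau}\right)^2 + \mathcal{O}\left(\frac{dx^i}{d\tau}\right)^2 = 0,$$
which is just
$$\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{00} = 0.$$
Now then, to continue, we make an assumption about the metric. We say that the metric is Minkowskian, with a small perturbation,
$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \qquad h_{\mu\nu} \ll 1.$$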
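We shall only work to first order in the perturbation. That is, we shall neglect any terms $\mathcal{O}(h^2)$. We further say that the perturbation is static. That is, $h_{\mu\nu} = h_{\mu\nu}(x^i)$ only; which immediately tells us that
$$\partial_0 h_{\mu\nu} = \partial_t h_{\mu\nu} = 0.$$
Now, the general expression for the Christoffel symbol is
$$\Gamma^\rho_{\alpha\beta} = \frac{1}{2}g^{\rho\nu}\left(\partial_\alpha g_{\beta\nu} + \partial_\beta g_{\nu\alpha} - \partial_\nu g_{\alpha\beta}\right).$$
Then, the components that we are interested in are just
$$\Gamma^\mu_{00} = \frac{1}{2}\sum_\nu g^{\mu\nu}\left(\partial_0 g_{0\nu} + \partial_0 g_{\nu 0} - \partial_\nu g_{00}\right).$$
Now, as the time-differential of the metric is zero, all but the last term is zero. We shall also drop the explicit summation sign, the sum being implied;
$$\Gamma^\mu_{00} = -\frac{1}{2}g^{\mu\nu}\partial_\nu g_{00}.$$
We can now restrict the summed Greek index on the RHS to spatial (Roman) values only. This is because the time-differential of the metric is zero. Thus,
$$\Gamma^\mu_{00} = -\frac{1}{2}g^{\mu i}\partial_i g_{00}.$$
Inserting our expression for the metric,
$$\begin{aligned}
\Gamma^\mu_{00} &= -\frac{1}{2}\left(\eta^{\mu i} - h^{\mu i}\right)\partial_i\left(\eta_{00} + h_{00}\right) \\
&= -\frac{1}{2}\left(\eta^{\mu i}\partial_i h_{00} - h^{\mu i}\partial_i h_{00}\right).
\end{aligned}$$
Now, the expression on the far right is $\mathcal{O}(h^2)$; thus, we ignore it. Therefore,
$$\Gamma^\mu_{00} = -\frac{1}{2}\eta^{\mu i}\partial_i h_{00}.$$
Finally, recall that the Minkowski metric is diagonal. Therefore, we only have a contribution for $\mu = i$. Therefore, as $\eta^{ii} = -1$, to first order in the static perturbation
$$\Gamma^i_{00} = \frac{1}{2}\partial_i h_{00}, \qquad \Gamma^0_{00} = 0.$$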
Therefore, the geodesic equation is
$$\frac{d^2t}{d\tau^2} = 0, \qquad \frac{d^2x^i}{d\tau^2} + \frac{1}{2}\partial_i h_{00} = 0.$$
Now, we use the first expression to tell us that $dt = A\,d\tau$. We then set $A = 1$, to see that $t = \tau$. Therefore, the second expression is just
$$\frac{d^2x^i}{dt^2} + \frac{1}{2}\partial_i h_{00} = 0.$$
Writing this as a vector equation, this is just
$$\frac{d^2\mathbf{x}}{dt^2} + \frac{1}{2}\nabla h_{00} = 0,$$
and trivially rewriting results in
$$\frac{d^2\mathbf{x}}{dt^2} = -\frac{1}{2}\nabla h_{00}.$$
Now then, recall the Newtonian equation,
$$\frac{d^2\mathbf{x}}{dt^2} = -\nabla\Phi(\mathbf{x}).$$
Then, we can read off the correspondence,
$$h_{00} = 2\Phi.$$
Finally, as the metric is just $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, then
$$g_{00} = 1 + 2\Phi.$$
Therefore, we see that the time-component of a static perturbation to the Minkowski metric is (twice) the gravitational potential.
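To get a feel for how small the perturbation is in practice, we can restore factors of $c$ (an assumption of this example, since the notes use $c = 1$), so that $h_{00} = 2\Phi/c^2$ with $\Phi = -GM/r$ for a point mass. The sketch below evaluates this at the surface of the Earth and of the Sun, using rounded textbook values for the constants.

```python
G_NEWTON = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C_LIGHT = 2.998e8     # speed of light, m s^-1 (rounded)

def h00(mass_kg, radius_m):
    """Dimensionless perturbation h_00 = 2 Phi / c^2, with Phi = -G M / r."""
    potential = -G_NEWTON * mass_kg / radius_m
    return 2.0 * potential / C_LIGHT ** 2

earth_surface = h00(5.972e24, 6.371e6)  # rounded mass and radius of the Earth
sun_surface = h00(1.989e30, 6.957e8)    # rounded mass and radius of the Sun
```

Both values are negative and tiny in magnitude (of order $10^{-9}$ for the Earth and $10^{-6}$ for the Sun), which is why working to first order in $h_{\mu\nu}$ is such a good approximation in the solar system.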
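Recall the Riemann tensor, in a LIF,
$$R^\rho{}_{\lambda\mu\nu} = \partial_\mu\Gamma^\rho_{\lambda\nu} - \partial_\nu\Gamma^\rho_{\lambda\mu},$$
and thus the Ricci tensor,
$$R_{\lambda\nu} = R^\rho{}_{\lambda\rho\nu} = \partial_\rho\Gamma^\rho_{\lambda\nu} - \partial_\nu\Gamma^\rho_{\lambda\rho}.$$
Now, let us compute the component $R_{00}$. Then,
$$R_{00} = \partial_\rho\Gamma^\rho_{00} - \partial_0\Gamma^\rho_{0\rho},$$
noting that
$$\Gamma^i_{00} = \frac{1}{2}\partial_i h_{00}, \qquad \Gamma^0_{00} = 0,$$
and that the time-derivative in the second term vanishes for a static perturbation, we then see that
$$R_{00} = \partial_i\Gamma^i_{00} = \frac{1}{2}\partial_i\partial_i h_{00} = \frac{1}{2}\nabla^2 h_{00}.$$
Further recall that we just derived that $h_{00} = 2\Phi$, then
$$R_{00} = \nabla^2\Phi.$$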
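Now then, we are now in a position to compute the constant $\kappa$ in Einstein's field equation. Let us use the alternative form of the field equation, and take the "00" components;
$$R_{00} = \kappa\left(T_{00} - \frac{1}{2}g_{00}T\right).$$
Now, the trace $T$ is just
$$T \equiv g^{\mu\nu}T_{\mu\nu} = T^\mu{}_\mu.$$
Therefore, taking $T_{00}$ to be the only non-negligible component, so that $T = g^{00}T_{00}$,
$$\begin{aligned}
g_{00}T &= (\eta_{00} + h_{00})(\eta^{00} - h^{00})T_{00} \\
&= (\eta_{00}\eta^{00} - \eta_{00}h^{00} + \eta^{00}h_{00} - h_{00}h^{00})T_{00} \\
&= T_{00} + \mathcal{O}(h^2).
\end{aligned}$$
Let us suppose that the field is generated by a static, non-relativistic body, of mass density $\rho_m$. Then, $T_{00} = \rho_m$. Therefore, the field equation becomes
$$R_{00} = \kappa\left(\rho_m - \frac{1}{2}\rho_m\right) = \frac{1}{2}\kappa\rho_m.$$
Now, we have the Poisson equation, $\nabla^2\Phi = 4\pi G\rho_m$, and also that $R_{00} = \nabla^2\Phi$. Therefore, equating the two,
$$\nabla^2\Phi = 4\pi G\rho_m = \frac{1}{2}\kappa\rho_m,$$
we see that
$$\kappa = 8\pi G.$$
Therefore, the full field equation is
$$R_{\mu\nu} = 8\pi G\left(T_{\mu\nu} - \frac{1}{2}g_{\mu\nu}T\right). \tag{5.5}$$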
5.4 Linearised Gravity
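Let us take our perturbed metric,
$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \qquad g^{\mu\nu} = \eta^{\mu\nu} - h^{\mu\nu},$$
where $h_{\mu\nu} \ll 1$. Now then, notice that
$$\begin{aligned}
g^{\mu\nu}g_{\nu\lambda} &= (\eta^{\mu\nu} - h^{\mu\nu})(\eta_{\nu\lambda} + h_{\nu\lambda}) \\
&= \eta^{\mu\nu}\eta_{\nu\lambda} + \eta^{\mu\nu}h_{\nu\lambda} - h^{\mu\nu}\eta_{\nu\lambda} - h^{\mu\nu}h_{\nu\lambda} \\
&= \delta^\mu_\lambda + \mathcal{O}(h^2).
\end{aligned}$$
Now, consider a coordinate transformation,
$$x^\mu \mapsto x'^\mu = x^\mu + \epsilon^\mu, \qquad x^\mu = x'^\mu - \epsilon^\mu.$$
Then, the Jacobians are clearly
$$J^\mu{}_\nu = \delta^\mu_\nu + \partial_\nu\epsilon^\mu, \qquad (J^{-1})^\mu{}_\nu = \delta^\mu_\nu - \partial_\nu\epsilon^\mu.$$
We take $\epsilon^\mu \ll 1$, so we work to first order in $\epsilon^\mu$ only. Now then, let us consider the transformation of the metric,
$$g'_{\mu\nu} = (J^{-1})^\alpha{}_\mu (J^{-1})^\beta{}_\nu g_{\alpha\beta}.$$
Then, using our Jacobians for the coordinate transformation, this becomes
$$\begin{aligned}
g'_{\mu\nu} &= (\delta^\alpha_\mu - \partial_\mu\epsilon^\alpha)(\delta^\beta_\nu - \partial_\nu\epsilon^\beta)g_{\alpha\beta} \\
&= (\delta^\alpha_\mu\delta^\beta_\nu - \delta^\alpha_\mu\partial_\nu\epsilon^\beta - \partial_\mu\epsilon^\alpha\delta^\beta_\nu + \partial_\mu\epsilon^\alpha\partial_\nu\epsilon^\beta)g_{\alpha\beta} \\
&= g_{\mu\nu} - \partial_\nu\epsilon^\beta g_{\mu\beta} - \partial_\mu\epsilon^\alpha g_{\alpha\nu} + \mathcal{O}(\epsilon^2).
\end{aligned}$$
Now then, notice that by the product rule,
$$\partial_\nu\epsilon_\mu = \partial_\nu(g_{\mu\beta}\epsilon^\beta) = \epsilon^\beta\partial_\nu g_{\mu\beta} + g_{\mu\beta}\partial_\nu\epsilon^\beta,$$
and therefore that
$$g_{\mu\beta}\partial_\nu\epsilon^\beta = \partial_\nu\epsilon_\mu - \epsilon^\beta\partial_\nu g_{\mu\beta}.$$
Hence, using this, we see that the transformation of the metric looks like
$$g'_{\mu\nu} = g_{\mu\nu} - \partial_\nu\epsilon_\mu - \partial_\mu\epsilon_\nu + \epsilon^\beta\partial_\nu g_{\mu\beta} + \epsilon^\alpha\partial_\mu g_{\alpha\nu}.$$
Now, the last two terms on the right are both $\mathcal{O}(\epsilon^2)$. This is because the derivative of the metric is the derivative of the perturbation $h_{\mu\nu}$, which we take to be of $\mathcal{O}(\epsilon)$, and therefore $\epsilon$ times the derivative of the metric is $\mathcal{O}(\epsilon^2)$. Therefore,
$$g'_{\mu\nu} = g_{\mu\nu} - \partial_\nu\epsilon_\mu - \partial_\mu\epsilon_\nu + \mathcal{O}(\epsilon^2).$$
Now, using the fact that $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, and $\eta_{\mu\nu} = \eta'_{\mu\nu}$, then the above is just
$$\eta_{\mu\nu} + h'_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} - \partial_\nu\epsilon_\mu - \partial_\mu\epsilon_\nu,$$
which simply becomes
$$h'_{\mu\nu} = h_{\mu\nu} - \partial_\nu\epsilon_\mu - \partial_\mu\epsilon_\nu. \tag{5.6}$$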
5.4.1 Linearising Einstein’s Equation
Now, recall that Einstein’s equation was composed of the Ricci tensor and the energy-momentum tensor. The Ricci tensor was composed of derivatives of the Christoffel symbol, which in turn contained derivatives of the metric. Now, we can recompute the Einstein equation under the coordinate transformation defined above as
$$x^\mu \mapsto x'^\mu = x^\mu + \varepsilon^\mu \quad\Rightarrow\quad h'_{\mu\nu} = h_{\mu\nu} - \partial_\nu\varepsilon_\mu - \partial_\mu\varepsilon_\nu.$$
So, consider
$$\partial_\nu g_{\alpha\beta} = \partial_\nu(\eta_{\alpha\beta} + h_{\alpha\beta}) = \partial_\nu h_{\alpha\beta}.$$
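Therefore, the Christoffel symbol, defined as
$$\Gamma^\rho_{\alpha\beta} = \tfrac{1}{2}g^{\rho\nu}(\partial_\alpha g_{\beta\nu} + \partial_\beta g_{\nu\alpha} - \partial_\nu g_{\alpha\beta}),$$
becomes, to first order in $h$,
$$\Gamma^\rho_{\alpha\beta} = \tfrac{1}{2}\eta^{\rho\nu}(\partial_\alpha h_{\beta\nu} + \partial_\beta h_{\nu\alpha} - \partial_\nu h_{\alpha\beta}).$$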
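Now, the Ricci tensor also contains terms which are products of two Christoffel symbols. It is clear that these will be $O(h^2)$, and therefore negligible. Hence, the Ricci tensor would look like
$$R_{\mu\nu} = \partial_\rho\Gamma^\rho_{\mu\nu} - \partial_\nu\Gamma^\rho_{\mu\rho}.$$
Then, plugging in our Christoffel symbols,
$$R_{\mu\nu} = \tfrac{1}{2}\eta^{\rho\sigma}(\partial_\rho\partial_\mu h_{\nu\sigma} + \partial_\rho\partial_\nu h_{\sigma\mu} - \partial_\rho\partial_\sigma h_{\mu\nu} - \partial_\nu\partial_\mu h_{\rho\sigma} - \partial_\nu\partial_\rho h_{\sigma\mu} + \partial_\nu\partial_\sigma h_{\mu\rho}).$$
This becomes, after noting that partial derivatives commute, that the Minkowski metric commutes with partial derivatives, and that the second and fifth terms cancel,
$$2R_{\mu\nu} = \partial^\sigma\partial_\mu h_{\nu\sigma} - \partial^\rho\partial_\rho h_{\mu\nu} - \partial_\nu\partial_\mu h^\rho{}_\rho + \partial^\rho\partial_\nu h_{\mu\rho}.$$
Now, $h^\rho{}_\rho \equiv h$, and changing the σ index on the first expression to a ρ,
$$R_{\mu\nu} = \tfrac{1}{2}\left(\partial^\rho\partial_\mu h_{\nu\rho} + \partial^\rho\partial_\nu h_{\mu\rho} - \partial^\rho\partial_\rho h_{\mu\nu} - \partial_\nu\partial_\mu h\right).$$
Then, the Ricci scalar is
$$\begin{aligned}
R &= g^{\mu\nu}R_{\mu\nu} \\
&= \eta^{\mu\nu}R_{\mu\nu} \\
&= \tfrac{1}{2}\left(\partial^\rho\partial^\nu h_{\nu\rho} + \partial^\rho\partial^\mu h_{\mu\rho} - \partial^\rho\partial_\rho h - \partial^\nu\partial_\nu h\right) \\
&= \partial^\rho\partial^\nu h_{\nu\rho} - \partial^\nu\partial_\nu h.
\end{aligned}$$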
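Now then, the Einstein tensor is defined as
$$G_{\mu\nu} \equiv R_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}R.$$
Therefore, using our linearised Ricci tensor and scalar,
$$G_{\mu\nu} = \tfrac{1}{2}\left(\partial^\rho\partial_\mu h_{\nu\rho} + \partial^\rho\partial_\nu h_{\mu\rho} - \partial^\rho\partial_\rho h_{\mu\nu} - \partial_\nu\partial_\mu h - \eta_{\mu\nu}\partial^\sigma\partial^\pi h_{\pi\sigma} + \eta_{\mu\nu}\partial^\rho\partial_\rho h\right). \qquad (5.7)$$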
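Now, let us define
$$\bar{h}_{\mu\nu} \equiv h_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu}h, \qquad (5.8)$$
and that the Lorentz gauge is
$$\partial_\mu\bar{h}^{\mu\nu} = \partial^\mu\bar{h}_{\mu\nu} = 0. \qquad (5.9)$$
That is,
$$\partial^\mu h_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu}\partial^\mu h = 0,$$
which is just the statement that
$$\partial^\mu h_{\mu\nu} = \tfrac{1}{2}\eta_{\mu\nu}\partial^\mu h = \tfrac{1}{2}\partial_\nu h.$$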
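Hence, using this in (5.7) (and swapping the $\partial_\mu\partial_\nu \leftrightarrow \partial_\nu\partial_\mu$ at will), we see that
$$G_{\mu\nu} = \tfrac{1}{2}\left(\tfrac{1}{2}\partial_\mu\partial_\nu h + \tfrac{1}{2}\partial_\nu\partial_\mu h - \partial^\rho\partial_\rho h_{\mu\nu} - \partial_\nu\partial_\mu h - \tfrac{1}{2}\eta_{\mu\nu}\partial^\sigma\partial_\sigma h + \eta_{\mu\nu}\partial^\sigma\partial_\sigma h\right).$$
Now, the first and second terms are identical, and their sum cancels with the fourth term. Hence,
$$G_{\mu\nu} = \tfrac{1}{2}\left(-\partial^\rho\partial_\rho h_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu}\partial^\sigma\partial_\sigma h + \eta_{\mu\nu}\partial^\sigma\partial_\sigma h\right).$$
The second and third terms add, to give
$$G_{\mu\nu} = \tfrac{1}{2}\left(-\partial^\rho\partial_\rho h_{\mu\nu} + \tfrac{1}{2}\eta_{\mu\nu}\partial^\sigma\partial_\sigma h\right).$$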
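Now, if we use a little bit of notation,
$$\Box \equiv \partial^\mu\partial_\mu,$$
then
$$G_{\mu\nu} = -\tfrac{1}{2}\left(\Box h_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu}\Box h\right),$$
or
$$G_{\mu\nu} = -\tfrac{1}{2}\Box\left(h_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu}h\right).$$
Hence, using our substitution (5.8) again,
$$G_{\mu\nu} = -\tfrac{1}{2}\Box\bar{h}_{\mu\nu}.$$
Then, if we write down Einstein’s equation,
$$G_{\mu\nu} = 8\pi G T_{\mu\nu} \quad\Rightarrow\quad \Box\bar{h}_{\mu\nu} = -16\pi G T_{\mu\nu}.$$
Therefore, we have a wave equation in the metric perturbation, with the energy-momentum tensor as the source. This is the equation for gravitational radiation.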
5.4.2 Gravitational Radiation
Under the Lorentz gauge (in keeping with the literature, this is sometimes also referred to as the Einstein gauge, or harmonic gauge),
$$\partial_\mu\bar{h}^{\mu\nu} = 0, \qquad \bar{h}_{\mu\nu} \equiv h_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu}h,$$
Einstein’s equation becomes a wave equation,
$$\Box\bar{h}_{\mu\nu} = -16\pi G T_{\mu\nu}, \qquad \Box \equiv \partial^\mu\partial_\mu. \qquad (5.10)$$
We can write down the solution to this directly, if one recalls the solution to the equivalent equation from electrodynamic theory.
In electrodynamics, under the Lorentz gauge $\partial_\mu A^\mu = 0$, we could derive the wave equation
$$\Box A^\nu = \mu_0 J^\nu,$$
which has solution
$$A^i = \frac{\mu_0}{4\pi}\int d^3x'\,\frac{J^i_{\rm ret}}{|\mathbf{x} - \mathbf{x}'|}.$$
Hence, we can basically read off our solution by analogy,
$$\bar{h}^{ij} = 4G\int d^3x'\,\frac{T^{ij}_{\rm ret}}{|\mathbf{x} - \mathbf{x}'|}. \qquad (5.11)$$
One should recall that these are retarded integrals; that is, the source is evaluated at the retarded time $t - |\mathbf{x} - \mathbf{x}'|$. The minus sign has “gone” because we have raised indices.
Therefore, upon linearising Einstein’s equation, and using the Lorentz gauge, we have derived that there is a wave equation in the metric perturbation. The source of the wave is the distribution of energy-momentum.
6 The Schwarzschild Solution
We can write Einstein’s equation, in a vacuum, as
$$R_{\mu\nu} = 0. \qquad (6.1)$$
That is, in a vacuum, where $T_{\mu\nu} = 0$, the “alternative form” of Einstein’s equation reduces to the above.
Now, we can look for spherically symmetric solutions to this. That is, we are looking for a line element which possesses spherical symmetry. The most general such line element is
$$ds^2 = e^{\nu(r,t)}dt^2 - e^{\lambda(r,t)}dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2).$$
The reason we can make this supposedly general line element diagonal is that we can always transform out of a frame in which there are off-diagonal elements.
In the line element we chose to use exponentials, as they are generally easy to work with (differentiating them is easy). Hence, the aim is now to find the functions ν(r, t), λ(r, t).
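Now, although we shall not derive them, the only non-zero components of the Ricci tensor are
$$R_{tt} = \tfrac{1}{2}e^{-\lambda}\left(\nu'' + \tfrac{1}{2}\nu'(\nu' - \lambda') + \frac{2\nu'}{r}\right) + e^{-\nu}\left(\dot\lambda(\dot\nu - \dot\lambda) - \tfrac{1}{2}\ddot\lambda\right),$$
$$R_{tr} = \frac{\dot\lambda}{2r},$$
$$R_{rr} = \tfrac{1}{2}e^{-\nu}\left(\ddot\lambda - \tfrac{1}{2}\dot\lambda(\dot\nu - \dot\lambda)\right) - \tfrac{1}{2}e^{-\lambda}\left(\nu'' + \tfrac{1}{2}\nu'(\nu' - \lambda') - \frac{2\lambda'}{r}\right),$$
$$R_{\theta\theta} = 1 - e^{-\lambda}\left(1 + \tfrac{1}{2}r(\nu' - \lambda')\right),$$
$$R_{\phi\phi} = \sin^2\theta\,R_{\theta\theta}.$$
We have used that an over-dot represents derivative with respect to time t, and a prime with respect to r.
Hence, due to the reduction of Einstein’s equation to the form $R_{\mu\nu} = 0$, each of these components is equal to 0.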
The easiest to start with is the $R_{tr}$ term. So,
$$\frac{\dot\lambda}{2r} = 0,$$
which immediately allows us to state that λ = λ(r) only. That is, λ does not have any dependence upon t. Thus, using $\dot\lambda = 0$ allows $R_{tt}$ and $R_{rr}$ to look very similar. In fact, as $R_{rr} = R_{tt} = 0$, then $R_{rr} + R_{tt} = 0$. This then easily shows that
$$R_{tt} + R_{rr} = \tfrac{1}{2}e^{-\lambda}\left(\frac{2\nu'}{r} + \frac{2\lambda'}{r}\right) = 0,$$
that is, assuming $r \neq 0$,
$$\nu' + \lambda' = 0.$$
Integrating this easily shows that
$$\nu + \lambda = f(t).$$
Now, we can set f(t) to zero, by a time coordinate transformation. Then, ν = −λ. Therefore,
ν(r) = −λ(r).
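Hence, using this in $R_{\theta\theta}$, we see that
$$R_{\theta\theta} = 1 - e^{\nu}(1 + r\nu') = 0,$$
that is,
$$e^{\nu} + r\nu' e^{\nu} = 1.$$
Now, we can rewrite this as
$$(re^{\nu})' = e^{\nu} + r\nu' e^{\nu} = 1,$$
that is, as
$$(re^{\nu})' = 1.$$
Integrating easily reveals that
$$e^{\nu} = 1 + \frac{C}{r},$$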
where C is some constant. We can find the value of C by considering the Newtonian limit of the metric. That is, recall that we derived
$$g_{00} = 1 + 2\Phi,$$
where we know that
$$\Phi = -\frac{GM}{r}.$$
Now, $e^{\nu} = g_{00}$ by inspection (it is the coefficient of the $dt^2$ term). Hence,
$$1 - \frac{2GM}{r} = 1 + \frac{C}{r} \quad\Rightarrow\quad C = -2GM.$$
Let us recall that this M is the mass of the body generating the potential Φ. That is, it will be the mass of the planet/star that is curving the spacetime. Therefore,
$$e^{\nu} = 1 - \frac{2GM}{r}, \qquad e^{\lambda} = \left(1 - \frac{2GM}{r}\right)^{-1}.$$
And finally, we have our metric,
$$ds^2 = \left(1 - \frac{2GM}{r}\right)dt^2 - \left(1 - \frac{2GM}{r}\right)^{-1}dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2). \qquad (6.2)$$
That is, we have the vacuum solution of Einstein’s equation, due to a body of mass M, where r > 0. This metric is called the Schwarzschild metric.
Properties of the Schwarzschild Metric. The metric, by construction, is spherically symmetric. Also, the metric is static; it clearly is not a function of time. That the metric is static means that changing the time coordinate by a constant amount leaves the metric unchanged. That is, the metric is invariant under constant time translations and under time reflections, t → −t. Also notice that the metric has Killing vectors (1, 0, 0, 0) and (0, 0, 0, 1) (i.e. along t and φ); these correspond to conservation of energy and angular momentum.
Notice that as r → ∞, the metric goes over to Minkowski. That is, we say that the metric is asymptotically flat.
Also notice that, at r = 2GM, the $g_{tt}$ and $g_{rr}$ components flip sign. We call this the Schwarzschild radius, or the event horizon. We denote the event horizon as
$$r_s \equiv 2GM. \qquad (6.3)$$
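As a rough numerical illustration of (6.3), the sketch below evaluates the Schwarzschild radius in SI units, assuming the usual restoration of c so that r_s = 2GM/c². The gravitational parameters GM for the Sun and the Earth, and the function name, are inputs chosen here for illustration and are not taken from the notes.

```python
# Assumed SI inputs (the notes use units with c = 1).
C = 2.99792458e8           # speed of light, m s^-1
GM_SUN = 1.32712440018e20  # G * M_sun, m^3 s^-2
GM_EARTH = 3.986004418e14  # G * M_earth, m^3 s^-2


def schwarzschild_radius(gm, c=C):
    """Return r_s = 2GM/c^2 in metres for a body with gravitational parameter gm."""
    return 2.0 * gm / c**2


rs_sun = schwarzschild_radius(GM_SUN)
rs_earth = schwarzschild_radius(GM_EARTH)
print(f"Sun:   r_s = {rs_sun / 1e3:.3f} km")
print(f"Earth: r_s = {rs_earth * 1e3:.2f} mm")
```

Both values are tiny compared with the physical radii of the bodies, which is why the exterior (vacuum) solution never reaches r = r_s for ordinary stars and planets.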
6.0.3 Gravitational Redshift
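Consider a stationary observer, for whom $dr = d\theta = d\phi = 0$, so that $ds^2 = g_{tt}dt^2$. Also consider that
$$d\tau = \frac{ds}{c},$$
hence,
$$d\tau = \frac{\sqrt{g_{tt}}\,dt}{c}.$$
Now, a frequency is inversely proportional to the proper-time interval between successive wave crests. That is,
$$\nu \propto \frac{1}{\Delta\tau}.$$
Now, if we take two observers, at positions 1 and 2, for whom the crests are separated by the same coordinate-time interval Δt, then
$$\frac{\nu_1}{\nu_2} = \sqrt{\frac{g_{tt}(2)}{g_{tt}(1)}}.$$
If we use a weak gravitational field, then we can use the previously derived relation $g_{tt} = 1 + 2\Phi$. Hence, to first order in Φ, this gives
$$\frac{\nu_1}{\nu_2} = 1 + \Phi(2) - \Phi(1).$$
That is, the shift in frequency is a function of the distance from the gravitating body.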
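The two expressions for the frequency ratio can be compared numerically. The following is a minimal sketch, assuming Schwarzschild $g_{tt} = 1 - r_s/r$, the potential Φ = −r_s/(2r) in units with c = 1, and illustrative values for the solar radius, the astronomical unit and the Sun's Schwarzschild radius (none of which are given in the notes). Position 1 is the solar surface and position 2 is the Earth's orbit.

```python
import math


def freq_ratio_exact(r1, r2, rs):
    """nu_1/nu_2 = sqrt(g_tt(2)/g_tt(1)) with g_tt = 1 - rs/r."""
    return math.sqrt((1.0 - rs / r2) / (1.0 - rs / r1))


def freq_ratio_weak(r1, r2, rs):
    """Weak-field form 1 + Phi(2) - Phi(1), with Phi = -rs/(2 r) and c = 1."""
    def phi(r):
        return -rs / (2.0 * r)
    return 1.0 + phi(r2) - phi(r1)


# Assumed illustrative inputs, in metres.
RS_SUN = 2953.25
R_SUN = 6.957e8
AU = 1.495978707e11

exact = freq_ratio_exact(R_SUN, AU, RS_SUN)
weak = freq_ratio_weak(R_SUN, AU, RS_SUN)
print(f"exact - 1 = {exact - 1.0:.6e}")
print(f"weak  - 1 = {weak - 1.0:.6e}")
```

The ratio is slightly larger than one, so light emitted at the solar surface is received at the Earth with a lower frequency (a redshift of about two parts in a million), and the weak-field formula reproduces the exact ratio up to second-order terms in Φ.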
6.1 Dynamics in the Schwarzschild Spacetime
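Recall that the effective Lagrangian is
$$L_{\rm eff} = \left(\frac{ds}{d\tau}\right)^2.$$
Therefore, for the Schwarzschild metric, the effective Lagrangian is
$$L_{\rm eff} = \left(1 - \frac{r_s}{r}\right)\dot{t}^2 - \left(1 - \frac{r_s}{r}\right)^{-1}\dot{r}^2 - r^2(\dot\theta^2 + \sin^2\theta\,\dot\phi^2), \qquad (6.4)$$
where an over-dot denotes derivative with respect to the affine parameter τ, and $r_s = 2GM$. So, let us consider the first integrals of this effective Lagrangian.
The Euler-Lagrange equations for this effective Lagrangian are
$$\frac{d}{d\tau}\frac{\partial L_{\rm eff}}{\partial\dot{x}^\mu} - \frac{\partial L_{\rm eff}}{\partial x^\mu} = 0.$$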
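Then, consider that
$$\frac{\partial L_{\rm eff}}{\partial t} = 0, \qquad \frac{\partial L_{\rm eff}}{\partial\dot{t}} = 2\left(1 - \frac{r_s}{r}\right)\dot{t},$$
then, the t-first integral is that
$$2\left(1 - \frac{r_s}{r}\right)\dot{t} = \text{const} \equiv 2\varepsilon.$$
Similarly,
$$\frac{\partial L_{\rm eff}}{\partial\phi} = 0, \qquad \frac{\partial L_{\rm eff}}{\partial\dot\phi} = -2r^2\sin^2\theta\,\dot\phi,$$
with its first integral being
$$-2r^2\sin^2\theta\,\dot\phi = \text{const} \equiv -2\ell.$$
These constants, $\varepsilon$, $\ell$, are related to the conserved energy and angular momentum, per unit mass. Recall that these were predicted to be conserved, by the associated Killing vectors.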
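Finally, the effective Lagrangian is just the line element, and that can take on one of 3 values;
$$L_{\rm eff} = K = \begin{cases} 0 & \text{null}, \\ +1 & \text{time-like}, \\ -1 & \text{space-like}. \end{cases}$$
Hence, using the derived relations for $\ell$, $\varepsilon$, we can easily see that
$$\dot{t}^2 = \varepsilon^2\left(1 - \frac{r_s}{r}\right)^{-2}, \qquad \dot\phi^2 = \frac{\ell^2}{r^4\sin^4\theta}.$$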
And thus, using that the effective Lagrangian is just a constant K, we can easily put the effective Lagrangian into the form
$$K = \left(1 - \frac{r_s}{r}\right)^{-1}\left(\varepsilon^2 - \dot{r}^2\right) - r^2\left(\dot\theta^2 + \frac{\ell^2}{r^4\sin^2\theta}\right).$$
Now, as the system has spherical symmetry, we may as well take a value of θ that makes the above expression look simpler. Taking θ = π/2 (note that then $\dot\theta = 0$), we see that
$$K = \left(1 - \frac{r_s}{r}\right)^{-1}\left(\varepsilon^2 - \dot{r}^2\right) - \frac{\ell^2}{r^2},$$
which is trivially just
$$K = \left(1 - \frac{r_s}{r}\right)^{-1}\left[\varepsilon^2 - \left(\frac{dr}{d\tau}\right)^2\right] - \frac{\ell^2}{r^2}.$$
Before we carry on with this expression, let us compute the Christoffel symbols and geodesics.
6.1.1 Geodesics & Christoffel Symbols
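Let us compute the geodesics and Christoffel symbols for the effective Lagrangian (6.4) in this Schwarzschild spacetime.
We can compute the geodesic for the θ-component of the effective Lagrangian. We have that
$$\frac{\partial L_{\rm eff}}{\partial\dot\theta} = -2r^2\dot\theta, \qquad \frac{\partial L_{\rm eff}}{\partial\theta} = -2r^2\sin\theta\cos\theta\,\dot\phi^2,$$
hence,
$$\frac{d}{d\tau}\frac{\partial L_{\rm eff}}{\partial\dot\theta} = -4r\dot{r}\dot\theta - 2r^2\ddot\theta.$$
Therefore, the Euler-Lagrange equation for the θ-component is
$$-4r\dot{r}\dot\theta - 2r^2\ddot\theta + 2r^2\sin\theta\cos\theta\,\dot\phi^2 = 0.$$
Putting this into a more usable form,
$$\ddot\theta + \frac{2}{r}\dot{r}\dot\theta - \sin\theta\cos\theta\,\dot\phi^2 = 0.$$
Thus, we have the geodesic for θ. Now, we can read off the Christoffel symbols. The non-zero components are
$$\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = \frac{1}{r}, \qquad \Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta.$$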
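We can compute the geodesic for r. So,
$$\frac{\partial L_{\rm eff}}{\partial\dot{r}} = -2\left(1 - \frac{r_s}{r}\right)^{-1}\dot{r},$$
$$\frac{\partial L_{\rm eff}}{\partial r} = \dot{t}^2\frac{r_s}{r^2} + \frac{r_s}{r^2}\left(1 - \frac{r_s}{r}\right)^{-2}\dot{r}^2 - 2r(\dot\theta^2 + \sin^2\theta\,\dot\phi^2),$$
and
$$\frac{d}{d\tau}\frac{\partial L_{\rm eff}}{\partial\dot{r}} = -2\ddot{r}\left(1 - \frac{r_s}{r}\right)^{-1} + 2\dot{r}^2\frac{r_s}{r^2}\left(1 - \frac{r_s}{r}\right)^{-2}.$$
Therefore, the geodesic is
$$-2\ddot{r}\left(1 - \frac{r_s}{r}\right)^{-1} + 2\dot{r}^2\frac{r_s}{r^2}\left(1 - \frac{r_s}{r}\right)^{-2} - \dot{t}^2\frac{r_s}{r^2} - \frac{r_s}{r^2}\left(1 - \frac{r_s}{r}\right)^{-2}\dot{r}^2 + 2r(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) = 0.$$
Multiplying through by $-\tfrac{1}{2}(1 - r_s/r)$, this simplifies down to
$$\ddot{r} - \frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right)^{-1}\dot{r}^2 + \frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right)\dot{t}^2 - r\left(1 - \frac{r_s}{r}\right)(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) = 0.$$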
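This is the r-geodesic. From this, we can read off the non-zero Christoffel symbols. They are
$$\Gamma^r_{rr} = -\frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right)^{-1}, \qquad \Gamma^r_{tt} = \frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right),$$
$$\Gamma^r_{\theta\theta} = -r\left(1 - \frac{r_s}{r}\right), \qquad \Gamma^r_{\phi\phi} = -r\sin^2\theta\left(1 - \frac{r_s}{r}\right).$$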
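Then, let us compile these four geodesics (i.e. including the two not explicitly computed here). The geodesics for the Schwarzschild spacetime are:
$$\ddot{t} + \frac{r_s}{r^2}\left(1 - \frac{r_s}{r}\right)^{-1}\dot{t}\dot{r} = 0,$$
$$\ddot{r} - \frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right)^{-1}\dot{r}^2 + \frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right)\dot{t}^2 - r\left(1 - \frac{r_s}{r}\right)(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) = 0,$$
$$\ddot\theta + \frac{2}{r}\dot{r}\dot\theta - \sin\theta\cos\theta\,\dot\phi^2 = 0,$$
$$\ddot\phi + \frac{2}{r}\dot{r}\dot\phi + 2\cot\theta\,\dot\theta\dot\phi = 0.$$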
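These complicated non-linear differential equations can be solved to find the trajectories of particles in the spacetime. The non-zero Christoffel symbols are easily read off, and can be seen to be
$$\Gamma^t_{rt} = \frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right)^{-1}, \qquad \Gamma^r_{rr} = -\frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right)^{-1},$$
$$\Gamma^r_{tt} = \frac{r_s}{2r^2}\left(1 - \frac{r_s}{r}\right),$$
$$\Gamma^r_{\theta\theta} = -r\left(1 - \frac{r_s}{r}\right), \qquad \Gamma^r_{\phi\phi} = -r\sin^2\theta\left(1 - \frac{r_s}{r}\right),$$
$$\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = \frac{1}{r}, \qquad \Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta,$$
$$\Gamma^\phi_{r\phi} = \frac{1}{r}, \qquad \Gamma^\phi_{\theta\phi} = \cot\theta.$$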
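A list of Christoffel symbols like this is easy to get wrong by a sign, so it can be worth checking numerically. The sketch below is one possible check, not part of the original derivation: it differentiates the diagonal Schwarzschild metric (6.2) by central finite differences and evaluates $\Gamma^\rho_{\mu\nu} = \tfrac{1}{2}g^{\rho\rho}(\partial_\mu g_{\nu\rho} + \partial_\nu g_{\rho\mu} - \partial_\rho g_{\mu\nu})$ at one sample point. The value r_s = 1, the sample point and all function names are arbitrary choices made here. With the signature (+, −, −, −) used in these notes, the check gives $\Gamma^r_{\theta\theta} = -r(1 - r_s/r)$ and $\Gamma^r_{\phi\phi} = -r\sin^2\theta\,(1 - r_s/r)$.

```python
import math

RS = 1.0  # Schwarzschild radius in arbitrary units (assumed value)
T, R, TH, PH = 0, 1, 2, 3  # coordinate labels for (t, r, theta, phi)


def metric_diag(x):
    """Diagonal components (g_tt, g_rr, g_thth, g_phph) of (6.2) at x."""
    _, r, th, _ = x
    f = 1.0 - RS / r
    return [f, -1.0 / f, -r * r, -(r * math.sin(th)) ** 2]


def d_metric(a, b, c, x, h=1e-6):
    """Central-difference estimate of d g_ab / d x^c (zero off the diagonal)."""
    if a != b:
        return 0.0
    xp, xm = list(x), list(x)
    xp[c] += h
    xm[c] -= h
    return (metric_diag(xp)[a] - metric_diag(xm)[a]) / (2.0 * h)


def christoffel(rho, mu, nu, x):
    """Gamma^rho_{mu nu} for a diagonal metric, using g^{rho rho} = 1/g_{rho rho}."""
    g = metric_diag(x)
    total = (d_metric(nu, rho, mu, x) + d_metric(rho, mu, nu, x)
             - d_metric(mu, nu, rho, x))
    return total / (2.0 * g[rho])


x0 = (0.0, 3.0, 1.0, 0.5)  # sample point outside the horizon
r, th = x0[1], x0[2]
f = 1.0 - RS / r
expected = {
    (T, R, T): RS / (2 * r * r * f),
    (R, T, T): RS * f / (2 * r * r),
    (R, R, R): -RS / (2 * r * r * f),
    (R, TH, TH): -r * f,
    (R, PH, PH): -r * math.sin(th) ** 2 * f,
    (TH, R, TH): 1 / r,
    (TH, PH, PH): -math.sin(th) * math.cos(th),
    (PH, R, PH): 1 / r,
    (PH, TH, PH): math.cos(th) / math.sin(th),
}
for idx, value in expected.items():
    print(idx, christoffel(*idx, x0), value)
```

Each printed pair should agree to within the finite-difference error; a disagreement in sign or magnitude points to an algebra slip in the closed-form expression.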
6.1.2 Orbits
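Let us return to the expression we derived, for θ = π/2,
$$K = \left(1 - \frac{r_s}{r}\right)^{-1}\left[\varepsilon^2 - \left(\frac{dr}{d\tau}\right)^2\right] - \frac{\ell^2}{r^2}.$$
We can rearrange it into the form
$$\dot{r}^2 = \varepsilon^2 - K - \left[\frac{\ell^2}{r^2}\left(1 - \frac{r_s}{r}\right) - \frac{Kr_s}{r}\right],$$
and indeed into the form
$$\tfrac{1}{2}\dot{r}^2 = \frac{\varepsilon^2 - K}{2} - \left[\frac{\ell^2}{2r^2}\left(1 - \frac{r_s}{r}\right) - \frac{Kr_s}{2r}\right].$$
Now, we put it into this form, as we see that the LHS is a “velocity term”, the middle term is just the “energy”, $E \equiv (\varepsilon^2 - K)/2$, and the far-RHS we call the effective potential $V_{\rm eff}$:
$$E = \tfrac{1}{2}\dot{r}^2 + V_{\rm eff}(r),$$
where
$$V_{\rm eff} \equiv \frac{\ell^2}{2r^2}\left(1 - \frac{r_s}{r}\right) - \frac{Kr_s}{2r}. \qquad (6.5)$$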
Now, one familiar with the Newtonian derivation of this formula will realise that this expression is not quite the same as its Newtonian counterpart. The GR “correction” is the $r_s/r$, creating a $1/r^3$ term.
Let us recap what the symbols are in this effective potential. $\ell$ is the angular momentum of the “moving thing”, and $r_s \equiv 2GM$, where M is the mass of the “big body” that the “moving thing” is moving around. That is, the big body is curving spacetime, and some moving object is having its motion deflected by the curved spacetime, which is due to the big body. The amount of deflection is just a function of the distance from the big body to the smaller one. We shall call the “smaller body” the test mass, and the “big body” the gravitating mass.
Suppose that $\ell = 0$. Then,
$$V_{\rm eff} = -\frac{Kr_s}{2r} = -K\frac{2GM}{2r} = -K\frac{GM}{r}.$$
This is the Newtonian result. That is, for a test mass with no angular momentum, the effective potential is just what we would expect.
Recall that we derived
$$\varepsilon = \left(1 - \frac{r_s}{r}\right)\frac{dt}{d\tau},$$
then, we can clearly see that
$$\frac{dt}{d\tau} = \varepsilon\left(1 - \frac{r_s}{r}\right)^{-1}.$$
That is, the proper time of a test mass is a function of the distance from the gravitating mass, and of the total energy.
Let us now give some results relating to orbits in the spacetime. Circular orbits have
$$\frac{dV_{\rm eff}}{dr} = 0,$$
and stable circular orbits are those for which the second derivative of the potential is positive.
Figure 6.1: The effective potential, as a function of distance from the gravitating body, for particle orbits. Panels: (a) $\ell < \sqrt{3}\,r_s$; (b) $\sqrt{3}\,r_s < \ell < 2r_s$; (c) $\ell > 2r_s$.
Particle Orbits, K = 1. If we vary the angular momentum, $\ell$, with respect to the event horizon, $r_s$, then various shapes of effective potential are found. With reference to Figure (6.1), we see the 3 ranges of $\ell$.
• $\ell < \sqrt{3}\,r_s$. Here, we see that any particle with energy E > 0 escapes, whilst any particle with E < 0 falls back into the origin. No stable orbits exist.
• $\sqrt{3}\,r_s < \ell < 2r_s$. For this range, there are two positions at which circular orbits can exist, but only one of them is capable of sustaining stable orbits. If E > 0, then any particle will escape. If we define $V_{\max}$ as the value of $V_{\rm eff}$ at its maximum, and $V_{\min}$ as its value at the minimum, then we can see that for any $V_{\max} < E < 0$, a particle will fall into the origin. Also, for a particle with $E = V_{\min}$, there is a stable circular orbit. $E = V_{\max}$ is an unstable circular orbit. Any particle trapped in the “well” will have some sort of elliptical orbit.
• $\ell > 2r_s$. Here, if $E = V_{\min}$, the particle will have a stable circular orbit, and an elliptical one for perturbations about that minimum. If a particle has $E < V_{\max}$, and lives to the left of the maximum, then it will fall into the origin. Now, if a particle has $E < V_{\max}$, and approaches the system from the right of the maximum, then the particle will be repelled back to ∞. However, for $E > V_{\max}$, the particle will hit the origin. This is not present in Newtonian gravity, where particles with non-zero angular momentum are always repelled.
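The three ranges of ℓ above can be reproduced by locating the extrema of the effective potential (6.5). Setting $dV_{\rm eff}/dr = 0$ for K = 1 gives the quadratic $r_s r^2 - 2\ell^2 r + 3\ell^2 r_s = 0$, with roots $r_\pm = (\ell^2/r_s)\left[1 \pm \sqrt{1 - 3r_s^2/\ell^2}\right]$. The sketch below, with illustrative function names and units chosen so that r_s = 1, evaluates these roots; it is a small check under those assumptions, not part of the original notes.

```python
import math


def v_eff(r, ell, rs, K=1.0):
    """Effective potential (6.5) for angular momentum ell and horizon rs."""
    return ell**2 / (2.0 * r**2) * (1.0 - rs / r) - K * rs / (2.0 * r)


def circular_orbit_radii(ell, rs):
    """Return (r_unstable, r_stable) where dV_eff/dr = 0 for K = 1.

    Returns None when ell < sqrt(3) * rs, where no extrema exist.
    """
    disc = 1.0 - 3.0 * rs**2 / ell**2
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    scale = ell**2 / rs
    return scale * (1.0 - root), scale * (1.0 + root)


print(circular_orbit_radii(1.0, 1.0))   # below sqrt(3) rs: no circular orbits
r_in, r_out = circular_orbit_radii(2.0, 1.0)
print(r_in, v_eff(r_in, 2.0, 1.0))      # maximum of V_eff (unstable orbit)
print(r_out, v_eff(r_out, 2.0, 1.0))    # minimum of V_eff (stable orbit)
```

For ℓ = 2 r_s the maximum sits at r = 2 r_s with $V_{\max} = 0$, which is why ℓ = 2 r_s separates case (b), where $V_{\max} < 0$, from case (c), where $V_{\max} > 0$. The two roots merge at r = 3 r_s when $\ell = \sqrt{3}\,r_s$.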
Photon Orbits, K = 0. In this case, we have that
$$V_{\rm eff} \equiv \frac{\ell^2}{2r^2}\left(1 - \frac{r_s}{r}\right).$$
Upon plotting the effective potential, we see that for a given $E < V_{\max}$, the motion depends on where the photon is, relative to the peak. That is, if the photon is inside the peak, the photon will fall to the origin. If the photon is outside, then the photon will be repelled to infinity.
Figure 6.2: The effective potential, as a function of distance from the gravitating body, all for photon orbits.
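The position and height of the peak follow directly from the photon potential: $dV_{\rm eff}/dr = 0$ gives $r = \tfrac{3}{2}r_s$ and $V_{\max} = 2\ell^2/(27r_s^2)$. The following sketch evaluates these, and also the impact parameter at which $E = V_{\max}$, using the photon relation $d = \ell/\sqrt{2E}$ that appears as (6.9) in the light-deflection section. The function names and the choice r_s = 1 are illustrative assumptions.

```python
import math


def v_eff_photon(r, ell, rs):
    """Photon (K = 0) effective potential."""
    return ell**2 / (2.0 * r**2) * (1.0 - rs / r)


def photon_sphere_radius(rs):
    """Radius of the maximum of the photon potential, r = 3 rs / 2."""
    return 1.5 * rs


def critical_impact_parameter(rs, ell=1.0):
    """Impact parameter d = ell / sqrt(2 E) evaluated at E = V_max."""
    v_max = v_eff_photon(photon_sphere_radius(rs), ell, rs)
    return ell / math.sqrt(2.0 * v_max)


rs = 1.0
r_peak = photon_sphere_radius(rs)
d_crit = critical_impact_parameter(rs)
print(f"peak at r = {r_peak} rs, critical impact parameter = {d_crit:.4f} rs")
```

Photons arriving from infinity with impact parameter below this critical value have $E > V_{\max}$ and are captured; those with a larger impact parameter are turned back to infinity. The result does not depend on the value of ℓ used, since ℓ cancels in the ratio.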
6.1.3 Summary
Let us just summarise the results obtained, as they will be useful in subsequent discussions.
We derived that, on a θ = π/2 trajectory,
$$\tfrac{1}{2}\dot{r}^2 = E - V_{\rm eff},$$
where the “energy” is given by
$$E = \frac{\varepsilon^2 - K}{2},$$
and the effective potential by
$$V_{\rm eff} = \frac{\ell^2}{2r^2}\left(1 - \frac{r_s}{r}\right) - \frac{Kr_s}{2r}.$$
The angular momentum (per unit mass) of the test mass was computed to be
$$\ell = r^2\dot\phi,$$
and the energy (per unit mass)
$$\varepsilon = \left(1 - \frac{r_s}{r}\right)\dot{t}.$$
Light-like trajectories are those for which K = 0. Particle-like trajectories are those for which K = 1. The event horizon is related to the mass of the gravitating body by $r_s = 2GM$, and the body is idealised so that all its mass is concentrated at a single point. Over-dots represent derivative with respect to the affine parameter. Notice that we can then easily write that
$$\frac{d\phi}{dr} = \frac{\dot\phi}{\dot{r}} = \pm\frac{1}{r^2}\frac{\ell}{\sqrt{2}\,(E - V_{\rm eff})^{1/2}}. \qquad (6.6)$$
6.2 Light Deflection
We can compute the angle that light is deflected by, due to the curved spacetime of a star.
Figure 6.3: Light deflection due to a gravitating mass. Notice how various angles are defined. d is the impact parameter of the photon, with respect to the radius of the gravitating mass.
The effective potential, for photons with K = 0, reads
$$V_{\rm eff} = \frac{\ell^2}{2r^2}\left(1 - \frac{r_s}{r}\right). \qquad (6.7)$$
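Consider the combination
$$\frac{\ell}{\varepsilon} = \frac{r^2\dot\phi}{\left(1 - \frac{r_s}{r}\right)\dot{t}},$$
then, considering that $r \gg r_s$,
$$\frac{\ell}{\varepsilon} \approx r^2\frac{d\phi}{dt} + O\left(\frac{r_s}{r}\right) \quad\Rightarrow\quad \frac{\ell}{\varepsilon} = r^2\frac{d\phi}{dt}. \qquad (6.8)$$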
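Now, for small angles, we have that
$$\phi = \frac{d}{r}.$$
Then,
$$\frac{d\phi}{dt} = -\frac{d}{r^2}\frac{dr}{dt}.$$
Now,
$$\frac{dr}{dt} = -1,$$
where the unity comes from c = 1, and the minus sign because distances are shrinking. Hence,
$$\frac{d\phi}{dt} = \frac{d}{r^2},$$
which we use in (6.8) to see that
$$\frac{\ell}{\varepsilon} = d.$$
Therefore, for photons (for which K = 0, so that $E = \varepsilon^2/2$),
$$d = \frac{\ell}{\varepsilon} = \frac{\ell}{\sqrt{2E}}. \qquad (6.9)$$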
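Now, with reference to Figure (6.3), we see that the deflection angle is given by
$$\delta\phi_{\rm defl} = \Delta\phi - \pi.$$
The total angle change is just the integral
$$\Delta\phi = \int d\phi,$$
or, as we have an expression for dφ/dr,
$$\Delta\phi = \int dr\,\frac{d\phi}{dr}.$$
Hence, using (6.6), we have that
$$\Delta\phi = 2\int_{r_{\min}}^{r_{\max}} dr\,\frac{1}{r^2}\frac{\ell}{\sqrt{2}\,(E - V_{\rm eff})^{1/2}}, \qquad (6.10)$$
and, using the light-like effective potential (6.7),
$$\Delta\phi = 2\int_{r_{\min}}^{r_{\max}} dr\,\frac{1}{r^2}\frac{\ell}{\sqrt{2}\left[E - \frac{\ell^2}{2r^2}\left(1 - \frac{r_s}{r}\right)\right]^{1/2}}.$$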
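We take $r_{\max} \to \infty$, and note that the factor of 2 out front is due to the photon coming from infinity, then going back to infinity. If we put the factor of $\ell$ inside the square-root in the denominator, as well as the $\sqrt{2}$, then
$$\Delta\phi = 2\int_{r_{\min}}^{\infty}\frac{dr}{r^2}\left[\frac{2E}{\ell^2} - \frac{1}{r^2}\left(1 - \frac{r_s}{r}\right)\right]^{-1/2}.$$
Now, via (6.9), we rewrite
$$\frac{2E}{\ell^2} = \frac{1}{d^2},$$
and also change variables to
$$w \equiv \frac{d}{r} \quad\Rightarrow\quad dw = -\frac{d}{r^2}dr.$$
Hence, using this change of variables and the rewriting,
$$\Delta\phi = 2\int_{w_{\max}}^{0}\frac{-dw}{d}\left[\frac{1}{d^2} - \frac{w^2}{d^2}\left(1 - \frac{r_s w}{d}\right)\right]^{-1/2},$$
the minus sign obviously flipping the integration limits to
$$\Delta\phi = 2\int_{0}^{w_{\max}}\frac{dw}{d}\left[\frac{1}{d^2} - \frac{w^2}{d^2}\left(1 - \frac{r_s w}{d}\right)\right]^{-1/2}.$$
The factor of $1/d$ can be taken inside the square-root, giving
$$\Delta\phi = 2\int_{0}^{w_{\max}} dw\left[1 - w^2\left(1 - \frac{r_s w}{d}\right)\right]^{-1/2}.$$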
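Now, if we refer to (6.10), we see that the integrand is singular at $E = V_{\rm eff}$, i.e. at the turning point. It is an integral of the form
$$\int_0^1\frac{dx}{\sqrt{x + \epsilon}} \approx \int_0^1\frac{dx}{\sqrt{x}} - \frac{\epsilon}{2}\int_0^1\frac{dx}{x^{3/2}},$$
(where $\epsilon$ here is just a small parameter), whereby upon integration, the first term does not give a singularity, but the second does (at zero). Thus, although the original integral is finite, a naive expansion of the integrand in the small parameter produces a divergent term, and so the expansion must be carried out with some care.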
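Let us continue. If we take out a factor from the square root, then
$$\Delta\phi = 2\int_{0}^{w_{\max}} dw\left(1 - \frac{r_s}{d}w\right)^{-1/2}\left[\left(1 - \frac{r_s}{d}w\right)^{-1} - w^2\right]^{-1/2}.$$
Now, we can expand the two terms,
$$\left(1 - \frac{r_s}{d}w\right)^{-1/2} = 1 + \frac{r_s}{2d}w + O\left(\frac{r_s}{d}\right)^2,$$
$$\left(1 - \frac{r_s}{d}w\right)^{-1} = 1 + \frac{r_s}{d}w + O\left(\frac{r_s}{d}\right)^2,$$
so that
$$\Delta\phi = 2\int_{0}^{w_{\max}} dw\left(1 + \frac{r_s}{2d}w\right)\left(1 + \frac{r_s}{d}w - w^2\right)^{-1/2} + O\left(\frac{r_s}{d}\right)^2.$$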
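We can obviously now multiply out the bracket, working to first order in $r_s/d$,
$$\Delta\phi = 2\int_{0}^{w_{\max}} dw\left(1 + \frac{r_s}{d}w - w^2\right)^{-1/2} + \frac{r_s}{d}\int_{0}^{w_{\max}} dw\,w(1 - w^2)^{-1/2}.$$
Now, we can see that the integrand of the second integral is singular at its upper limit, $w_{\max} = 1$, although the singularity is integrable. If we look up the values of the integrals, we find
$$\int_{0}^{w_{\max}} dw\left(1 + \frac{r_s}{d}w - w^2\right)^{-1/2} = \frac{\pi}{2} + \frac{r_s}{2d},$$
$$\int_{0}^{w_{\max}} dw\,w(1 - w^2)^{-1/2} = 1.$$
Therefore, we see that
$$\Delta\phi = \pi + \frac{2r_s}{d},$$
and hence, the deflection angle,
$$\delta\phi_{\rm defl} = \frac{2r_s}{d}. \qquad (6.11)$$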
Therefore, we have derived the deflection angle of a photon’s trajectory, with impact parameter d with respect to a gravitating body of event horizon $r_s$.
To get a handle on how big this angle is, consider the Sun. $r_s \approx 3$ km, and suppose the photon just grazes the Sun’s surface. Then,
$$\delta\phi = \frac{2r_s}{d} = \frac{2 \cdot 3\ \text{km}}{7 \times 10^5\ \text{km}} \approx 10^{-5}\ \text{rad}.$$
This angle is equivalent to the observed height of a 1 m high object, viewed from 100 km away. That is, the effect is very small. However, this angle can be measured (best in solar eclipses), and the measured value has been confirmed to be closer to the GR prediction than to the Newtonian prediction (which is a factor of 2 smaller).
This is one of the tests of general relativity.
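The estimate above can be made slightly more precise. The sketch below evaluates (6.11) for a ray grazing the Sun, and converts the result to arcseconds. The numerical values of the Sun's Schwarzschild radius and physical radius are inputs assumed here for illustration, and the Newtonian figure is simply taken to be half of the GR one.

```python
import math

# Assumed illustrative inputs, in metres.
RS_SUN = 2953.25   # Schwarzschild radius of the Sun
R_SUN = 6.957e8    # radius of the Sun, used as the impact parameter d

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def deflection_angle(rs, d):
    """Deflection angle (6.11), delta phi = 2 rs / d, in radians."""
    return 2.0 * rs / d


gr_rad = deflection_angle(RS_SUN, R_SUN)
gr_arcsec = gr_rad * ARCSEC_PER_RAD
newton_arcsec = 0.5 * gr_arcsec
print(f"GR:        {gr_rad:.3e} rad = {gr_arcsec:.2f} arcsec")
print(f"Newtonian: {newton_arcsec:.2f} arcsec")
```

This reproduces the familiar figure of roughly 1.75 arcseconds for light grazing the solar limb.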
6.3 Perihelion Precession
Here we consider the motion of a planet about a star. Supposing that the orbit of the planet is elliptical, and that the “size” of the orbit is unchanged over many periods, does the “position” of the orbit change? That is, after each revolution, let us consider that $r_{\min}$ is the same, but is shifted in position by $\delta\phi_{\rm prec}$. Then, if Δφ is the total angle swept out between successive passages through $r_{\min}$, we have that
$$\delta\phi_{\rm prec} = \Delta\phi - 2\pi,$$
where we subtract 2π to make the Newtonian prediction give $\delta\phi_{\rm prec} = 0$.
We follow a similar tack as for light deflection, but we must take K = 1 as we are dealing with time-like objects. So, the effective potential is now
$$V_{\rm eff} = \frac{\ell^2}{2r^2}\left(1 - \frac{r_s}{r}\right) - \frac{r_s}{2r}.$$
We also use (6.6),
$$\frac{d\phi}{dr} = \frac{1}{r^2}\frac{\ell}{\sqrt{2}\,(E - V_{\rm eff})^{1/2}}, \qquad (6.12)$$
where the energy for time-like objects is
$$E = \frac{\varepsilon^2 - 1}{2}.$$
∆φ = 2
∫ rmax
rmin
drdφ
dr,
which is just
∆φ = 2
∫ rmax
rmin
dr`
r2
[√2(E − Veff)1/2
]−1
,
putting in the effective potential,
∆φ = 2
∫ rmax
rmin
dr`
r2
[2E − `2
r2
(1− rs
r
)+rs
r
]−1/2
.
If we now take the ` inside the square-root, and use the expression for E, then
∆φ = 2
∫ rmax
rmin
dr1
r2
[ε2
`2− 1
`2− 1
r2
(1− rs
r
)+
rs
r`2
]−1/2
.
Let us rewrite the square-rooted bit slightly,
ε2
`2− 1
r2
(1− rs
r
)− 1
`2
(1− rs
r
).
Let us change integration variables,
u ≡ 1
r,
hence,
∆φ = 2
∫ umax
umin
du
[ε2
`2− u2(1− rsu)− 1
`2(1− rsu)
]−1/2
.
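If we take out a common factor,
$$\Delta\phi = 2\int_{u_{\min}}^{u_{\max}} du\,(1 - r_s u)^{-1/2}\left[\frac{\varepsilon^2}{\ell^2}(1 - r_s u)^{-1} - \frac{1}{\ell^2} - u^2\right]^{-1/2}.$$
We now expand out, but we must go to higher order within the expression on the right,
$$\Delta\phi = 2\int_{u_{\min}}^{u_{\max}} du\left(1 + \frac{r_s u}{2}\right)\left[\frac{\varepsilon^2}{\ell^2}\left(1 + r_s u + r_s^2 u^2\right) - \frac{1}{\ell^2} - u^2\right]^{-1/2},$$
collecting terms,
$$\Delta\phi = 2\int_{u_{\min}}^{u_{\max}} du\left(1 + \frac{r_s u}{2}\right)\left[\frac{\varepsilon^2}{\ell^2}(1 + r_s u) - \frac{1}{\ell^2} - u^2\left(1 - \frac{\varepsilon^2 r_s^2}{\ell^2}\right)\right]^{-1/2},$$
thus,
$$\Delta\phi = 2\left(1 + \frac{\varepsilon^2 r_s^2}{2\ell^2}\right)\int_{u_{\min}}^{u_{\max}} du\left[\frac{\varepsilon^2}{\ell^2}(1 + r_s u) - \frac{1}{\ell^2} - u^2\right]^{-1/2} + r_s\int_{u_{\min}}^{u_{\max}} du\,u\left[\frac{\varepsilon^2}{\ell^2}(1 + r_s u) - \frac{1}{\ell^2} - u^2\right]^{-1/2}.$$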
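Now, by looking up the integrals, the first gives π, the second $\tfrac{\pi}{2}(u_{\min} + u_{\max})$. Now, the integration limits $u_{\min}$, $u_{\max}$ are the points at which the integrand diverges, i.e. the roots of the quadratic inside the square bracket. Therefore, one can easily see that the sum of the roots is
$$u_{\min} + u_{\max} = \frac{\varepsilon^2}{\ell^2}r_s,$$
and therefore
$$\Delta\phi = 2\pi\left(1 + \frac{\varepsilon^2 r_s^2}{2\ell^2}\right) + \frac{\pi\varepsilon^2 r_s^2}{2\ell^2}.$$
Hence, taking $\varepsilon^2 \approx 1$ for a non-relativistic orbit, we read off
$$\delta\phi_{\rm prec} = \frac{3\pi r_s^2}{2\ell^2} = \frac{6\pi G^2M^2}{\ell^2}.$$
Now, in getting a handle on how big this is, we appeal to standard ellipse-theory, the result of which allows us to write the angular momentum $\ell$ in terms of the semi-major axis a of the orbit, and the eccentricity e,
$$\ell^2 = GMa(1 - e^2).$$
Hence, the precession angle reads
$$\delta\phi_{\rm prec} = \frac{6\pi GM}{a(1 - e^2)}. \qquad (6.13)$$
See Table (6.1) for a comparison of the prediction and observations of these precession angles.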
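The predicted values can be reproduced from (6.13) with a few lines of arithmetic. The sketch below assumes the SI form $\delta\phi_{\rm prec} = 6\pi GM/(c^2 a(1 - e^2))$ per orbit, multiplies by the number of orbits per Julian century, and converts to arcseconds. The orbital elements (semi-major axis, eccentricity, period) and the solar gravitational parameter are approximate values supplied here for illustration; they are not taken from the notes.

```python
import math

# Assumed SI inputs.
GM_SUN = 1.32712440018e20  # m^3 s^-2
C = 2.99792458e8           # m s^-1
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi
DAYS_PER_CENTURY = 36525.0

# Approximate orbital elements: (semi-major axis [m], eccentricity, period [days]).
PLANETS = {
    "Mercury": (5.7909e10, 0.2056, 87.969),
    "Venus": (1.0821e11, 0.0068, 224.701),
    "Earth": (1.49598e11, 0.0167, 365.256),
}


def precession_per_orbit(a, e):
    """Perihelion advance per orbit in radians, (6.13) with c restored."""
    return 6.0 * math.pi * GM_SUN / (C**2 * a * (1.0 - e**2))


def precession_per_century_arcsec(a, e, period_days):
    """Perihelion advance accumulated over one Julian century, in arcseconds."""
    orbits = DAYS_PER_CENTURY / period_days
    return precession_per_orbit(a, e) * orbits * ARCSEC_PER_RAD


results = {name: precession_per_century_arcsec(*elems)
           for name, elems in PLANETS.items()}
for name, value in results.items():
    print(f"{name:8s} {value:5.1f} arcsec per century")
```

With these inputs the script gives values close to the "GR Prediction" column of Table (6.1).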
Planet     GR Prediction (per century)     Observation (per century)
Mercury    43′′                            43.1 ± 0.5′′
Venus      8.6′′                           8.4 ± 4.8′′
Earth      3.8′′                           5.0 ± 1.2′′
Table 6.1: The GR prediction of, and experimental observation of, the perihelion precession of various planets. The agreement is one of the most convincing experimental “proofs” of general relativity.
6.4 Black Holes
Let us consider what the mass and radius are, of a gravitating body for which the escape velocity is the speed of light. That is, what are M, R for which $v_{\rm esc} = c$?
Recall that the Newtonian expression for total energy is
$$E_N = \tfrac{1}{2}mv^2 - \frac{GMm}{r},$$
so that rearranging into the familiar form
$$\tfrac{1}{2}\left(\frac{dr}{dt}\right)^2 = \frac{E_N}{m} - \left(-\frac{GM}{r}\right), \qquad v = \frac{dr}{dt},$$
we see the presence of the effective potential. Now, escape velocity is when $E_N = 0$, which corresponds to
$$v_{\rm esc}^2 = \frac{2GM}{R},$$
which we require to be $c^2$, which, under the units of c = 1, is just the statement that
$$R = 2GM = r_s.$$
That is, we seem to have derived the Schwarzschild radius (which was a GR result) using Newtonian mechanics. This is actually just a coincidence, as we have neglected both SR and GR (i.e. no mention of mass-energy in the above derivation).
Let us return to the Schwarzschild metric, with the assumption that θ, φ are constant. Then, it reads
$$ds^2 = \left(1 - \frac{r_s}{r}\right)dt^2 - \left(1 - \frac{r_s}{r}\right)^{-1}dr^2.$$
Notice that this expression has two singularities: one at $r = r_s$ and one at r = 0.
Now, it is not immediately obvious whether these singularities are an artifact of how we have constructed our coordinate system, or if they are “true singularities”. So, a way of finding this out would be to construct a quantity that is independent of the coordinate system. Such a quantity is of course a scalar. Now, we want a scalar that is dependent upon the geometry of the system. Such quantities are the contracted Riemann and Ricci tensors, and the Ricci scalar. Now, as the Ricci tensor and scalar vanish identically for a vacuum solution, the useful test is the contracted Riemann tensor, in the form
$$R^{\alpha\beta\nu\mu}R_{\alpha\beta\nu\mu} = \frac{12r_s^2}{r^6}.$$
That is, we see that this coordinate-system independent quantity does not have a singularity as $r \to r_s$, but does have one for r → 0.
Therefore, we see that $r \to r_s$ is a removable coordinate singularity, whereby we can change coordinates so that the metric does not retain the singularity; and that r → 0 is an essential singularity. Now, although we shall not go into it at all, it is expected that a quantum theory of gravity will be able to “sort out” this essential singularity.
6.4.1 Null Geodesics
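Let us consider the case $\ell = 0$, and $ds^2 = 0$. Then, the metric is just
$$\left(1 - \frac{r_s}{r}\right)dt^2 - \left(1 - \frac{r_s}{r}\right)^{-1}dr^2 = 0,$$
which trivially rearranges into
$$\left(\frac{dr}{dt}\right)^2 = \left(1 - \frac{r_s}{r}\right)^2,$$
which is just
$$\frac{dr}{dt} = \pm\left(1 - \frac{r_s}{r}\right).$$
Notice that this is the radial null geodesic. So, we can solve this,
$$\begin{aligned}
t &= \pm\int\frac{dr}{1 - r_s/r} = \pm\int\frac{r\,dr}{r - r_s} = \pm\int dr\left(1 + \frac{r_s}{r - r_s}\right) \\
&= \pm\left[r + r_s\ln|r - r_s| + \text{const}\right] \\
&= \pm\left[r + r_s\ln\left|\frac{r}{r_s} - 1\right| + \text{const}\right].
\end{aligned}$$
Now, we define the tortoise coordinate
$$r_* \equiv r + r_s\ln\left|\frac{r}{r_s} - 1\right|, \qquad (6.14)$$
so that the geodesic reads
$$t = \pm r_* + \text{const}.$$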
Now, notice that for flat space, $r_s \to 0$. Hence, the geodesics read
$$t = \pm r + \text{const}.$$
Hence, we denote this as
$$u = t - r, \qquad v = t + r,$$
so that lines of u = const and v = const define the null geodesics. See Figure (6.4) for these lines.
Figure 6.4: Null geodesics for flat space, plotted in the (r, t) plane. Blue (left to right) lines are u = const, Red (right to left) lines are v = const. Photons move on these lines, and massive particles move within a light cone, defined by the lines. That is, the light cone is defined at an intersection of lines of v = const and u = const; where the particle’s future is everything above that point, within that cone, and its past is everything below that point, within the cone.
We say that u, v are the light-cone coordinates for flat space. Now, for flat space, the metric is
$$ds^2 = dt^2 - dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2).$$
Now, notice that
$$t = \tfrac{1}{2}(u + v), \qquad r = \tfrac{1}{2}(v - u).$$
Also that
$$dr = \frac{\partial r}{\partial u}du + \frac{\partial r}{\partial v}dv = \tfrac{1}{2}(dv - du), \qquad dt = \tfrac{1}{2}(dv + du).$$
Therefore, the metric reads
$$ds^2 = du\,dv - r^2(d\theta^2 + \sin^2\theta\,d\phi^2),$$
which is no longer diagonal.
6.4.2 Eddington-Finkelstein Coordinates
Now, let us return to computing the null geodesics, but for curved space. We shall still use the light-cone coordinates,
$$u = t - r_*, \qquad v = t + r_*,$$
with the tortoise coordinate
$$r_* = r + r_s\ln\left|\frac{r}{r_s} - 1\right|. \qquad (6.15)$$
From which we can compute
$$\frac{dr_*}{dr} = \frac{r}{r - r_s} \quad\Rightarrow\quad dr^2 = \left(1 - \frac{r_s}{r}\right)^2dr_*^2.$$
Figure 6.5: The tortoise coordinate (6.15), $r_*$ plotted against r. The position of $r_s$ is obvious.
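The behaviour of the tortoise coordinate is easy to explore numerically. The sketch below implements (6.15) and checks the derivative $dr_*/dr = r/(r - r_s)$ by central finite differences; the function names, the step size and the choice r_s = 1 are illustrative assumptions made here.

```python
import math


def tortoise(r, rs):
    """Tortoise coordinate r* = r + rs ln|r/rs - 1| of (6.15); undefined at r = rs."""
    return r + rs * math.log(abs(r / rs - 1.0))


def dtortoise_dr_numeric(r, rs, h=1e-6):
    """Central-difference estimate of d r*/d r."""
    return (tortoise(r + h, rs) - tortoise(r - h, rs)) / (2.0 * h)


rs = 1.0
for r in (1.001, 1.5, 3.0, 10.0):
    print(r, tortoise(r, rs), dtortoise_dr_numeric(r, rs), r / (r - rs))
```

As r approaches r_s from above, $r_*$ runs off to −∞ logarithmically, which is the feature that pushes the horizon infinitely far away in the $(t, r_*)$ description, while for r much larger than r_s the two coordinates differ only by a slowly growing logarithm.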
Now, the Schwarzschild metric may be written as (where we are suppressing the angular part)
$$ds^2 = \left(1 - \frac{r_s}{r}\right)\left[dt^2 - \frac{dr^2}{\left(1 - \frac{r_s}{r}\right)^2}\right],$$
which, using our derived relation for $dr_*^2$, is
$$ds^2 = \left(1 - \frac{r_s}{r}\right)\left[dt^2 - dr_*^2\right].$$
This, in terms of u, v is just
$$ds^2 = \left(1 - \frac{r_s}{r}\right)du\,dv.$$
Notice that this metric is no longer singular at $r = r_s$, but is still singular at r = 0.
Figure 6.6: The null geodesics for curved spacetime, plotted in the (r, t) plane. Blue lines are u = const and red lines are v = const. The future direction, for a light cone, is that where a red line is on the left, and a blue line on the right. Notice that for $r > r_s$, all future cones are pointing upwards, and that at $r < r_s$, all future cones point leftwards.
With reference to Figure (6.6), we can see the geodesics for curved spacetime. We have plotted the lines u = const and v = const. The interesting things to note from the plot:
• As r decreases towards $r_s$, the angle between a u and v line decreases. This means that the future (and past) light cone of a particle becomes sharper. This means that “stuff” must be closer to the particle for it to influence the particle, as the particle gets closer to the Schwarzschild radius.
• As a particle crosses $r = r_s$, light cones flip by 90°, and point towards the t-axis. That is, the future of the particle can only be for motion towards the origin. That is, the particle can never escape.
Therefore, we have seen that as a particle crosses the Schwarzschild radius, its light cone gets tilted so that its future is always within the Schwarzschild radius. That is, particles can get into this region, but never out.
Thus, we see that $r = r_s$ is some sort of membrane which allows one-way travel. This is the event horizon.
Therefore, we see how black holes “work”. We have only considered stationary black holes. To consider rotating holes, one must analyse the Kerr metric, which we shall not do here.
Hawking Radiation. Now, classically, particles cannot escape from a black hole, as we have just seen. However, quantum mechanically, they can tunnel out. According to quantum field theory, there is a “sea” of particle-anti-particle pairs being created and annihilated all the time, in vacuum (i.e. there is no true vacuum). Now, suppose one of these pairs were created on the event horizon, so that one of the particles is created inside the horizon, and one outside. Then, as the particle inside cannot get out (it is inside the horizon), it cannot annihilate with the one that was created outside the horizon. Therefore, the particle outside the horizon can escape. Now, the energy to create the particle-anti-particle pair came from the vacuum inside the horizon. Therefore, by the particle escaping, energy is removed from the black hole, and over time, the black hole evaporates. This is called Hawking radiation.
To properly understand this radiation requires a huge amount of QFT, which we shall not go into here.
This effect can be conceived in a rather tamer environment. Consider two metal plates, which possess opposite electric charge, where the space between the plates is “vacuum”. Now, the energy density due to the electric field may be ramped up so that it is high enough to create an electron-positron pair from the vacuum. This experiment, as far as I am aware, has not been done, but it is conceivable that it could be (if the idea of a sea of virtual particles is correct).
7 The Friedmann-Robertson-Walker Universe
We shall abbreviate the above name to FRW.
Now, we can start to consider the geometry of our universe. Historically, there were two theories for the universe.
The FRW universe was one based upon the cosmological principle: “Our universe is homogeneous and isotropic.” This means that the universe is pretty much the same everywhere you look, and in any direction. That is, the ensemble properties of the universe are invariant under both translation and rotation.
The competing theory was that of a steady state universe, proposed in 1948 by F.Hoyle, H.Bondi and T.Gold. The steady state theory was a more “perfect” version of the cosmological principle, imposing the condition that the universe be invariant under time translation as well as spatial translation/rotation. This means that the universe looks the same at any time.
The main difference between the theories is that the FRW universe started, and then expanded, whereas the steady state universe “always has been”. At the time these two theories were proposed, the church preferred FRW, with scientists preferring steady state.
The FRW universe model predicts some background radiation from the beginning event (i.e. the big bang), in the form of the cosmic microwave background (CMB). The CMB signature was predicted by Gamow and Alpher, and was observed by Penzias and Wilson, thereby providing evidence for the FRW universe.
The standard model of cosmology, today, uses the FRW model of the universe.
7.1 The FRW Metric
Schur’s theorem (which we state without proof) states that a globally isotropic $n$-dimensional manifold ($n > 2$) has a constant curvature $k$, and that the Riemann tensor has the form
\[ R_{\mu\nu\alpha\beta} = k\left(g_{\mu\alpha}g_{\nu\beta} - g_{\mu\beta}g_{\nu\alpha}\right). \]
Following this, one can construct an isotropic metric,
\[ ds^2 = dt^2 - a^2(t)\,d\sigma^2, \tag{7.1} \]
where $d\sigma^2$ is the line element for 3-dimensional space, and $a(t)$ is the scale factor. We define the Hubble parameter, noting its present value,
\[ H \equiv \frac{\dot{a}}{a}, \qquad H_0 = 73\ \mathrm{km\,s^{-1}\,Mpc^{-1}}; \]
where it is important to note that an overdot here denotes derivative with respect to coordinate time $t$. Furthermore, the metric actually looks like
\[ ds^2 = dt^2 - a^2(t)\left[\frac{dr^2}{1 - kr^2} + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right]. \tag{7.2} \]
Then, by a suitable coordinate transformation, the curvature constant $k$ can take on one of 3 values,
\[ k = \begin{cases} +1 & \text{closed} \\ \phantom{+}0 & \text{flat} \\ -1 & \text{open} \end{cases} \tag{7.3} \]
So, consider the values of $k$, to see how they actually correspond to the above “claimed” geometries.
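Closed Space Consider setting $k = 1$, and the transformation
\[ r = \sin\chi \quad\Rightarrow\quad dr = \cos\chi\,d\chi, \]
so that the metric looks like
\[ ds^2 = dt^2 - a^2(t)\left[\frac{\cos^2\chi\,d\chi^2}{1 - \sin^2\chi} + \sin^2\chi\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right], \]
which simplifies trivially down to
\[ ds^2 = dt^2 - a^2(t)\left[d\chi^2 + \sin^2\chi\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right]. \]
Now, consider taking a slice through $\theta$. That is, set $\theta = \pi/2$, then one finds that
\[ ds^2 = dt^2 - a^2(t)\left[d\chi^2 + \sin^2\chi\,d\phi^2\right], \]
where it is clear that the bracketed quantity is the line element of the 2-sphere. That is,
\[ d\chi^2 + \sin^2\chi\,d\phi^2 \quad\Rightarrow\quad \text{sphere}. \]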
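Open Space Consider setting $k = -1$, and the coordinate transformation
\[ r = \sinh\chi \quad\Rightarrow\quad dr = \cosh\chi\,d\chi. \]
Then, in a completely analogous manner to before (now using $1 - kr^2 = 1 + \sinh^2\chi = \cosh^2\chi$, and again setting $\theta = \pi/2$), we get the line element
\[ ds^2 = dt^2 - a^2(t)\left[d\chi^2 + \sinh^2\chi\,d\phi^2\right], \]
where we now notice that
\[ d\chi^2 + \sinh^2\chi\,d\phi^2 \quad\Rightarrow\quad \text{hyperboloid}. \]
That is, $k = -1$ corresponds to a geometry based upon the surface of a hyperboloid.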
(a) Sphere - Closed (b) Hyperboloid - Open
Figure 7.1: A visualisation of closed and open geometries.
Flat Space Let us set $k = 0$, and the transformation
\[ r = \chi, \]
then we have the line element
\[ ds^2 = dt^2 - a^2(t)\left[d\chi^2 + \chi^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right]. \]
If we set $\theta = \pi/2$ again, then the square-bracketed quantity is just
\[ d\chi^2 + \chi^2\,d\phi^2. \]
This line element is just that of plane polars, which is flat. Hence, we see that $k = 0$ corresponds to flat space.
These correspondences of $k$ with a particular geometry will become much clearer later on.
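The standard way to write the FRW metric, in light of these coordinate transformations, is
\[ ds^2 = dt^2 - a^2(t)\left[d\chi^2 + \begin{Bmatrix} \sin^2\chi \\ \chi^2 \\ \sinh^2\chi \end{Bmatrix}\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right], \qquad k = \begin{Bmatrix} +1 \\ 0 \\ -1 \end{Bmatrix}. \tag{7.4} \]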
7.2 Geodesics & Christoffel Symbols
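We can compute the geodesics, and read off the Christoffel symbols, from the effective Lagrangian formed from the FRW metric (7.2),
\[ L_{\mathrm{eff}} = \dot{t}^2 - a^2(t)\left[\frac{\dot{r}^2}{1 - kr^2} + r^2\left(\dot{\theta}^2 + \sin^2\theta\,\dot{\phi}^2\right)\right], \]
where an overdot denotes derivative with respect to the affine parameter, $\lambda$, say. Now, one will need to use the following relation
\[ \dot{a} = \frac{da}{d\lambda} = \frac{\partial a}{\partial t}\frac{\partial t}{\partial\lambda} = a'\dot{t}, \qquad a' \equiv \frac{da}{dt}. \]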
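Upon careful computation (applying the Euler-Lagrange equations to $L_{\mathrm{eff}}$ for each of $t, r, \theta, \phi$), one finds the four geodesics:
\[ \ddot{t} + \frac{aa'}{1 - kr^2}\dot{r}^2 + aa'r^2\dot{\theta}^2 + aa'r^2\sin^2\theta\,\dot{\phi}^2 = 0, \]
\[ \ddot{r} + \frac{kr}{1 - kr^2}\dot{r}^2 + 2\frac{a'}{a}\dot{t}\dot{r} - r(1 - kr^2)\dot{\theta}^2 - r\sin^2\theta\,(1 - kr^2)\dot{\phi}^2 = 0, \]
\[ \ddot{\theta} + 2\frac{a'}{a}\dot{t}\dot{\theta} + \frac{2}{r}\dot{r}\dot{\theta} - \sin\theta\cos\theta\,\dot{\phi}^2 = 0, \]
\[ \ddot{\phi} + 2\frac{a'}{a}\dot{t}\dot{\phi} + \frac{2}{r}\dot{r}\dot{\phi} + 2\cot\theta\,\dot{\theta}\dot{\phi} = 0. \]
This allows us to read off the non-zero components of the Christoffel symbols (those with two different lower indices are symmetric in them, which accounts for the factors of 2 above):
\[ \Gamma^t{}_{rr} = \frac{aa'}{1 - kr^2}, \qquad \Gamma^t{}_{\theta\theta} = aa'r^2, \qquad \Gamma^t{}_{\phi\phi} = aa'r^2\sin^2\theta, \]
\[ \Gamma^r{}_{rr} = \frac{kr}{1 - kr^2}, \qquad \Gamma^r{}_{tr} = \frac{a'}{a}, \qquad \Gamma^r{}_{\theta\theta} = -r(1 - kr^2), \qquad \Gamma^r{}_{\phi\phi} = -r\sin^2\theta\,(1 - kr^2), \]
\[ \Gamma^\theta{}_{t\theta} = \frac{a'}{a}, \qquad \Gamma^\theta{}_{r\theta} = \frac{1}{r}, \qquad \Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta, \]
\[ \Gamma^\phi{}_{t\phi} = \frac{a'}{a}, \qquad \Gamma^\phi{}_{r\phi} = \frac{1}{r}, \qquad \Gamma^\phi{}_{\theta\phi} = \cot\theta. \]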
Notice that using the definition of the Hubble parameter, $H = a'/a$, we see that
\[ \Gamma^r{}_{tr} = \Gamma^\theta{}_{t\theta} = \Gamma^\phi{}_{t\phi} = H. \]
This is the only section in which the derivative with respect to the affine parameter will be used; hence, an overdot from here on denotes derivative with respect to coordinate time $t$.
7.3 Cosmology in the FRW Universe
We now wish to consider what happens to spacetime in the FRW Universe. To do so, we shall need the Ricci tensor corresponding to the FRW metric, and some energy-momentum tensor.
So, following from the FRW metric (7.2), one can compute the associated components of the Ricci tensor. Doing so, one finds
\[ R_{00} = -3\frac{\ddot{a}}{a}, \tag{7.5} \]
\[ R_{0i} = R_{i0} = 0, \tag{7.6} \]
\[ R_{ij} = -\left(\frac{2k}{a^2} + \frac{\ddot{a}}{a} + \frac{2\dot{a}^2}{a^2}\right)g_{ij}. \tag{7.7} \]
The metric $g_{\mu\nu}$ is the FRW metric, which we note can be written as
\[ g_{00} = 1, \qquad g_{ij} = -a^2(t)\,\mathrm{diag}\left((1 - kr^2)^{-1},\ r^2,\ r^2\sin^2\theta\right). \]
Recall Einstein’s equation, in the form
\[ R_{\mu\nu} = 8\pi G\left(T_{\mu\nu} - \frac{1}{2}g_{\mu\nu}T\right), \qquad T \equiv g^{\mu\nu}T_{\mu\nu}. \]
We now use Weyl’s postulate, which is that the content of our Universe behaves as a perfect fluid. A perfect fluid is one in which there is no heat conduction or viscosity.
Recall that the energy-momentum tensor of a perfect fluid is given by
\[ T_{\mu\nu} = (\rho + P)u_\mu u_\nu - Pg_{\mu\nu}, \]
where $P$ is the pressure of the fluid, and $\rho$ the density. Hence, its trace is
\[ T = (\rho + P)u^\mu u_\mu - Pg^\mu{}_\mu = \rho + P - 4P, \]
that is,
\[ T = \rho - 3P. \]
In fact, this result can be obtained in a slightly easier way. Recall that in the comoving frame of the fluid, $u^\mu = (1, 0, 0, 0)$, and the mixed energy-momentum tensor is diagonal,
\[ T^\mu{}_\nu = \mathrm{diag}(\rho, -P, -P, -P). \]
Hence, its trace is just the sum of its components, $T = \rho - 3P$.
So, let us compute the bracketed bit of the Einstein equation,
\[ T_{\mu\nu} - \frac{1}{2}g_{\mu\nu}T = (\rho + P)u_\mu u_\nu - Pg_{\mu\nu} - \frac{1}{2}g_{\mu\nu}(\rho - 3P) = (\rho + P)u_\mu u_\nu - \frac{1}{2}g_{\mu\nu}(\rho - P). \]
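Hence, the Einstein equation reads
\[ R_{\mu\nu} = 8\pi G\left((\rho + P)u_\mu u_\nu - \frac{1}{2}g_{\mu\nu}(\rho - P)\right). \]
Now, consider the comoving frame of the fluid, then we have that
\[ T_{\mu\nu} = \mathrm{diag}\left(\rho, -Pg_{ij}\right), \qquad T = \rho - 3P, \]
and thus that
\[ T_{\mu\nu} - \frac{1}{2}g_{\mu\nu}T = \frac{1}{2}\mathrm{diag}\left(\rho + 3P,\ g_{ij}(P - \rho)\right). \]
Hence,
\[ T_{00} - \frac{1}{2}g_{00}T = \frac{1}{2}(\rho + 3P), \]
so the 00-component of the Einstein equation, using (7.5), is
\[ -3\frac{\ddot{a}}{a} = 8\pi G\,\frac{1}{2}(\rho + 3P), \]
trivially rearranging into
\[ \frac{\ddot{a}}{a} = -\frac{4\pi G}{3}(\rho + 3P). \tag{7.8} \]
This is known as Raychaudhuri’s equation.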
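Similarly, suppose we took the $ij$-part of the Einstein equation, using (7.7), then
\[ -\left(\frac{2k}{a^2} + \frac{\ddot{a}}{a} + \frac{2\dot{a}^2}{a^2}\right)g_{ij} = -8\pi G\,\frac{1}{2}g_{ij}(\rho - P), \]
from which we cancel out the metric $g_{ij}$,
\[ \frac{2k}{a^2} + \frac{\ddot{a}}{a} + \frac{2\dot{a}^2}{a^2} = 4\pi G(\rho - P). \]
Let us then insert Raychaudhuri’s equation for the middle term on the LHS,
\[ \frac{2k}{a^2} - \frac{4\pi G}{3}(\rho + 3P) + \frac{2\dot{a}^2}{a^2} = 4\pi G(\rho - P). \]
The pressure terms cancel, leaving $2\dot{a}^2/a^2 = (16\pi G/3)\rho - 2k/a^2$, which is
\[ \left(\frac{\dot{a}}{a}\right)^2 = \frac{8\pi G}{3}\rho - \frac{k}{a^2}. \tag{7.9} \]
This is known as the Friedmann equation. It is common to notate
\[ \frac{\dot{a}}{a} \equiv H, \]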
so that the Friedmann equation reads
\[ H^2 = \frac{8\pi G}{3}\rho - \frac{k}{a^2}. \]
In deriving these two equations, we jumped around a bit between comoving frames. These equations describe the expansion of the universe, in the comoving frame of the fluid.
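Let us see where the continuity equation
\[ \nabla_\nu T^\nu{}_\mu = 0 \]
can get us. So, this is just
\[ \partial_\nu T^\nu{}_\mu + \Gamma^\nu{}_{\nu\alpha}T^\alpha{}_\mu - \Gamma^\nu{}_{\mu\alpha}T^\alpha{}_\nu = 0, \]
where
\[ T^\mu{}_\nu = \mathrm{diag}(\rho, -P, -P, -P). \]
Now, the relevant Christoffel symbols are
\[ \Gamma^t{}_{tt} = 0, \qquad \Gamma^\theta{}_{t\theta} = \Gamma^\phi{}_{t\phi} = \Gamma^r{}_{tr} = H. \]
Now, let us take the $\mu = t$ component of the continuity equation,
\[ \partial_\nu T^\nu{}_t + \Gamma^\nu{}_{\nu\alpha}T^\alpha{}_t - \Gamma^\nu{}_{t\alpha}T^\alpha{}_\nu = 0, \]
that is,
\[ \partial_t T^t{}_t + \partial_i T^i{}_t + \Gamma^\nu{}_{\nu\alpha}T^\alpha{}_t - \Gamma^\nu{}_{t\alpha}T^\alpha{}_\nu = 0. \]
Now, the second term above is zero, as the energy-momentum tensor is diagonal. Hence, if we write that $T^\mu{}_\nu = \delta^\mu_\nu T^\mu{}_\nu$ (no sum), then
\[ \partial_t T^t{}_t + \Gamma^\nu{}_{\nu\alpha}\delta^\alpha_t T^\alpha{}_t - \Gamma^\nu{}_{t\alpha}\delta^\alpha_\nu T^\alpha{}_\nu = 0, \]
which is just
\[ \partial_t T^t{}_t + \Gamma^\nu{}_{\nu t}T^t{}_t - \Gamma^\nu{}_{t\nu}T^\nu{}_\nu = 0. \]
Now, the only non-zero Christoffel symbols of the form $\Gamma^\nu{}_{\nu t}$ are the $\Gamma^i{}_{it}$. Hence,
\[ \partial_t T^t{}_t + \Gamma^i{}_{it}T^t{}_t - \Gamma^i{}_{ti}T^i{}_i = 0. \]
Therefore, with reference to the above Christoffel symbols (each of the three $\Gamma^i{}_{it}$ equals $H$, and each $T^i{}_i = -P$), we see that this is just
\[ \partial_t\rho + 3H\rho + 3HP = 0, \]
which is
\[ \dot{\rho} = -3H(\rho + P), \tag{7.10} \]
which is known as the energy conservation equation, or the fluid equation.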
Hence, the three important equations we have derived, for a Universe filled with a perfect fluid, are:
• The Raychaudhuri equation:
\[ \frac{\ddot{a}}{a} = -\frac{4\pi G}{3}(\rho + 3P). \tag{7.11} \]
• The Friedmann equation:
\[ \left(\frac{\dot{a}}{a}\right)^2 = \frac{8\pi G}{3}\rho - \frac{k}{a^2}. \tag{7.12} \]
• The fluid equation:
\[ \dot{\rho} = -3\frac{\dot{a}}{a}(\rho + P). \tag{7.13} \]
The three equations are not independent of one another: using any two, one can derive the third.
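As a rough numerical sanity check of the statement that the three equations are tied together, the following Python sketch takes a trial scale factor, reads off the density from the flat ($k = 0$) Friedmann equation, and then checks that the Raychaudhuri and fluid equations are satisfied with $P = 0$. The trial solution $a(t) = t^{2/3}$, the choice of units $G = 1$, the finite-difference step and all function names are assumptions made for this illustration only.

```python
import math

G = 1.0  # assumption: units in which G = 1, purely for this check


def a(t):
    """Trial scale factor (assumed): flat, pressureless universe."""
    return t ** (2.0 / 3.0)


def derivative(f, t, h=1e-4):
    """Central finite-difference first derivative."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


def second_derivative(f, t, h=1e-4):
    """Central finite-difference second derivative."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h ** 2


def hubble(t):
    """H = (da/dt) / a."""
    return derivative(a, t) / a(t)


def rho(t):
    """Density from the Friedmann equation with k = 0."""
    return 3.0 * hubble(t) ** 2 / (8.0 * math.pi * G)


def raychaudhuri_residual(t):
    """(d2a/dt2)/a + (4 pi G / 3) rho, which should vanish for P = 0."""
    return second_derivative(a, t) / a(t) + 4.0 * math.pi * G / 3.0 * rho(t)


def fluid_residual(t):
    """d(rho)/dt + 3 H rho, which should vanish for P = 0."""
    return derivative(rho, t) + 3.0 * hubble(t) * rho(t)


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t, raychaudhuri_residual(t), fluid_residual(t))
```

Both residuals should come out at the level of the finite-difference error, which is what one expects if the density fixed by the Friedmann equation automatically satisfies the other two equations.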
7.3.1 Species Evolution & Densities
The components of the fluid are called “species”. That is, we could conceive that the fluid is composed of matter, radiation and possibly some other “stuff” (which we shall come to later).
Notice that, since $\dot{\rho} = \dot{a}\,\partial\rho/\partial a$, we can write the fluid equation as
\[ a\frac{\partial\rho}{\partial a} = -3(\rho + P). \]
Then, this can be solved for the evolution of $\rho$ as a function of scale factor $a$. We now consider three cases. We shall consider how the density of a particular species evolves, as a function of scale factor, if only that species exists in the Universe.
Matter Dominated FRW Universe Consider a Universe that is filled solely with matter. For matter, there is no associated pressure. Hence, $P_m = 0$, and the fluid equation becomes
\[ a\frac{\partial\rho_m}{\partial a} = -3\rho_m, \]
integrating,
\[ -3\int\frac{da}{a} = \int\frac{d\rho_m}{\rho_m} \quad\Rightarrow\quad -3\ln a = \ln\rho_m + \text{const}, \]
which is just
\[ \rho_m = \frac{\rho_{m,0}}{a^3}, \tag{7.14} \]
where $\rho_{m,0}$ is the (constant) density of matter when $a = 1$.
Radiation Dominated FRW Universe Radiation has the equation of state
\[ \rho_r = 3P_r, \]
which may be derived from black-body radiation theory. Hence, using this, the fluid equation reads
\[ a\frac{\partial\rho_r}{\partial a} = -4\rho_r, \]
and integrating as before results in
\[ \rho_r = \frac{\rho_{r,0}}{a^4}. \tag{7.15} \]
Vacuum Dominated FRW Universe The equation of state for vacuum is
\[ \rho_V = -P_V, \]
so that the fluid equation reads
\[ \dot{\rho}_V = 0, \]
hence, we see that $\rho_V = \text{const}$.
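The three cases above can be collected into one formula. If a species obeys $P = w\rho$ with constant $w$ (the symbol $w$ is introduced here for illustration and is not used elsewhere in these notes), then the fluid equation $a\,\partial\rho/\partial a = -3(1 + w)\rho$ integrates to $\rho = \rho_0\,a^{-3(1+w)}$. A minimal Python sketch of this, with hypothetical function and constant names:

```python
# Equation-of-state parameters w = P / rho for the three species above.
W_MATTER = 0.0          # P_m = 0
W_RADIATION = 1.0 / 3.0  # rho_r = 3 P_r
W_VACUUM = -1.0         # rho_V = -P_V


def density(a, rho_0, w):
    """Density of a single species with P = w * rho at scale factor a.

    rho_0 is the density at a = 1. Solves a d(rho)/da = -3 (1 + w) rho.
    """
    return rho_0 * a ** (-3.0 * (1.0 + w))


if __name__ == "__main__":
    # Halving the scale factor: matter scales as a^-3, radiation as a^-4,
    # and the vacuum density does not change.
    for name, w in (("matter", W_MATTER),
                    ("radiation", W_RADIATION),
                    ("vacuum", W_VACUUM)):
        print(name, density(0.5, 1.0, w))
```

Setting $w = 0$, $1/3$ and $-1$ recovers (7.14), (7.15) and the constant vacuum density respectively.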
Critical Density Recall the Friedmann equation, but let us set $k = 0$ (i.e. flat),
\[ H^2 = \frac{8\pi G}{3}\rho. \]
Then, let us define this $\rho$ to be $\rho_{\mathrm{crit}}$, so that
\[ \rho_{\mathrm{crit}} = \frac{3H^2}{8\pi G}. \tag{7.16} \]
That is, $\rho_{\mathrm{crit}}$ is the density required to make the Universe flat. If we take the present value of the Hubble parameter to be
\[ H_0 = 100h\ \mathrm{km\,s^{-1}\,Mpc^{-1}}, \]
then the critical density should have the value (if measured today)
\[ \rho_{\mathrm{crit}} = 10.54\,h^2\ \mathrm{keV\,cm^{-3}}. \]
We use the notation that a subscript “0” denotes the present value of a quantity. In particular, we define
\[ a_0 \equiv 1; \]
the present value of the scale factor is unity.
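The quoted number can be reproduced by evaluating (7.16) in SI units and converting the mass density to an energy density via $\rho c^2$. The sketch below does this in Python; the values of the physical constants are standard rounded figures supplied here as assumptions, and the function name is hypothetical.

```python
import math

# Rounded physical constants (assumed values, SI units).
G_NEWTON = 6.674e-11          # m^3 kg^-1 s^-2
SPEED_OF_LIGHT = 2.99792458e8  # m s^-1
MPC_IN_METRES = 3.0857e22      # metres per megaparsec
KEV_IN_JOULES = 1.602177e-16   # joules per keV


def critical_density_kev_per_cm3(h):
    """Critical energy density 3 H0^2 c^2 / (8 pi G) for H0 = 100 h km/s/Mpc."""
    hubble_si = 100.0 * h * 1.0e3 / MPC_IN_METRES          # s^-1
    mass_density = 3.0 * hubble_si ** 2 / (8.0 * math.pi * G_NEWTON)  # kg m^-3
    energy_density = mass_density * SPEED_OF_LIGHT ** 2    # J m^-3
    return energy_density / KEV_IN_JOULES / 1.0e6          # keV cm^-3


if __name__ == "__main__":
    print(critical_density_kev_per_cm3(1.0))   # close to 10.54
    print(critical_density_kev_per_cm3(0.73))  # for H0 = 73 km/s/Mpc
```

The $h^2$ scaling is explicit in the code: the only place $h$ enters is through the square of the Hubble parameter.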
Normalised Energy Densities Let us suppose that there are four species present in the Universe: matter, radiation, vacuum and curvature. Let us now define
\[ \Omega_m \equiv \frac{\rho_{m,0}}{\rho_{\mathrm{crit}}}, \qquad \Omega_r \equiv \frac{\rho_{r,0}}{\rho_{\mathrm{crit}}}, \qquad \Omega_V \equiv \frac{\rho_{V,0}}{\rho_{\mathrm{crit}}}, \qquad \Omega_k \equiv -\frac{k}{H_0^2 a_0^2}. \tag{7.17} \]
That is, the $\Omega_i$ are called the normalised energy densities of the species; they represent the current fraction of that species, in terms of the critical density. We impose the condition
\[ \Omega_m + \Omega_r + \Omega_V + \Omega_k = 1, \]
as the Universe appears to be flat, by measurement. The matter species is composed of both baryonic and dark matter; radiation is composed of both photons and neutrinos. We tend to call the vacuum species the cosmological constant, so that $\Omega_V = \Omega_\Lambda$. See Table (7.1) for the current values of various quantities.
Quantity    Current Accepted Value
Ωm          0.24
Ωb          0.04
ΩDM         0.20
Ωr          < 0.01
Ωk          0.05
ΩΛ          0.7

Table 7.1: Various quantities, as a fraction of ρcrit.
7.4 Age of the FRW Universe
Let us return to the Friedmann equation
\[ H^2 = \frac{8\pi G}{3}\rho - \frac{k}{a^2}. \]
If we then divide through by $H_0^2$,
\[ \frac{H^2}{H_0^2} = \frac{8\pi G}{3H_0^2}\rho - \frac{k}{a^2H_0^2}, \]
and multiply and divide the last expression on the RHS by $a_0^2$, we get
\[ \frac{H^2}{H_0^2} = \frac{8\pi G}{3H_0^2}\rho - \frac{k}{a_0^2H_0^2}\frac{a_0^2}{a^2}. \]
Now, we notice the presence of our definitions of $\rho_{\mathrm{crit}}$ and $\Omega_k$, so that
\[ \frac{H^2}{H_0^2} = \frac{\rho}{\rho_{\mathrm{crit}}} + \frac{\Omega_k}{a^2}, \]
after using that $a_0 = 1$. We now insert our derived evolutions of the various species $\rho_i$,
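\[ \frac{H^2}{H_0^2} = \frac{1}{\rho_{\mathrm{crit}}}\left(\frac{\rho_{m,0}}{a^3} + \frac{\rho_{r,0}}{a^4} + \rho_{V,0}\right) + \frac{\Omega_k}{a^2} = \frac{\Omega_m}{a^3} + \frac{\Omega_r}{a^4} + \Omega_V + \frac{\Omega_k}{a^2}. \]
Hence, if we set $H = H_0$ and $a = a_0$, then we have
\[ \Omega_m + \Omega_r + \Omega_V + \Omega_k = 1. \]
So, let us write our expression back in terms of the scale factor, so that
\[ \left(\frac{\dot{a}}{a}\right)^2 = H_0^2\left[\frac{\Omega_m}{a^3} + \frac{\Omega_r}{a^4} + \Omega_V + \frac{\Omega_k}{a^2}\right], \]
or,
\[ \frac{\dot{a}}{a} = H_0\left[\frac{\Omega_m}{a^3} + \frac{\Omega_r}{a^4} + \Omega_V + \frac{\Omega_k}{a^2}\right]^{1/2}. \]
Multiplying through by $a$, and pulling it inside the square-root,
\[ \dot{a} = H_0\left[\frac{\Omega_m}{a} + \frac{\Omega_r}{a^2} + \Omega_V a^2 + \Omega_k\right]^{1/2}. \tag{7.18} \]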
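Now, consider that
\[ t_0 = \int_0^{t_0}dt = \int_0^{a_0 = 1}\frac{dt}{da}\,da = \int_0^1\frac{da}{\dot{a}}. \]
Hence, we have that
\[ t_0 = \frac{1}{H_0}\int_0^1 da\left[\frac{\Omega_m}{a} + \frac{\Omega_r}{a^2} + \Omega_V a^2 + \Omega_k\right]^{-1/2}. \tag{7.19} \]
Therefore, this expression will give us the age of the Universe.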
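For a general mix of species the integral (7.19) has no simple closed form, but it is easy to evaluate numerically. The following Python sketch uses a plain midpoint rule (chosen here because it never evaluates the integrand at $a = 0$); the function names, the step count and the conversion factor from $1/H_0$ to gigayears are assumptions of this illustration rather than part of the notes.

```python
import math

# 1 / (1 km/s/Mpc) expressed in gigayears (rounded, assumed value).
HUBBLE_TIME_GYR_PER_UNIT = 977.8


def age_in_hubble_times(omega_m, omega_r, omega_v, omega_k, steps=100_000):
    """Evaluate H0 * t0 from (7.19) with the midpoint rule on 0 < a < 1."""
    da = 1.0 / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * da  # midpoint of the i-th sub-interval
        bracket = omega_m / a + omega_r / a ** 2 + omega_v * a ** 2 + omega_k
        total += da / math.sqrt(bracket)
    return total


def age_in_gyr(h0_km_s_mpc, omega_m, omega_r, omega_v, omega_k):
    """Age in gigayears for H0 given in km/s/Mpc."""
    hubble_time = HUBBLE_TIME_GYR_PER_UNIT / h0_km_s_mpc
    return hubble_time * age_in_hubble_times(omega_m, omega_r, omega_v, omega_k)


if __name__ == "__main__":
    # Matter only: should reproduce H0 * t0 = 2/3.
    print(age_in_hubble_times(1.0, 0.0, 0.0, 0.0))
    # Illustrative flat mix of matter and vacuum (values chosen for the example).
    print(age_in_gyr(73.0, 0.3, 0.0, 0.7, 0.0))
```

The matter-only call is a useful check on the routine, since that case can be integrated by hand.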
7.4.1 Age of Matter Dominated Universe
So, let us assume that $\Omega_m = 1$ (all other species are zero). Hence, the present age of the Universe may be given by
\[ t_0 = \frac{1}{H_0}\int_0^1 da\,a^{1/2} = \frac{2}{3H_0}. \]
Also notice that in the matter dominated universe, (7.18) looks quite simple,
\[ \dot{a} = H_0\sqrt{\Omega_m}\,a^{-1/2}, \]
which is easily solved to give
\[ a \propto t^{2/3}. \tag{7.20} \]
That is, if the Universe is matter dominated, then the scale factor evolves in time as $t^{2/3}$.
Another curious result is that for a vacuum dominated Universe, (7.18) gives $\dot{a} = H_0\sqrt{\Omega_V}\,a$, i.e. $\dot{a} \propto a$, which implies that
\[ a \propto e^{H_0\sqrt{\Omega_V}\,t}, \]
that is, in a vacuum dominated Universe, the scale factor grows exponentially with time.
7.4.2 Age of Matter & Curvature Dominated Universe
Here, we have a mixture of two species, such that
\[ \Omega_r = \Omega_V = 0, \qquad \Omega_m + \Omega_k = 1. \]
Let us introduce a rescaling of time, known as conformal time, whereby
\[ a\,d\eta = dt. \]
Hence,
\[ \eta = \int_0^t\frac{dt}{a} = \int_0^a\frac{da}{a\dot{a}}. \]
Notice that in writing this, we have that $\eta = \eta(a)$. We should then be able to invert it, so that $a = a(\eta)$. Notice that if we use conformal time, the FRW metric (7.2) can be written in the form
\[ ds^2 = a^2(\eta)\left[d\eta^2 - \frac{dr^2}{1 - kr^2} - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right] \sim a^2(\eta)\,g_{\mu\nu}dx^\mu dx^\nu. \]
That is, we have a conformal transformation of the metric. This is why we call $\eta$ conformal time.
Now, (7.18) in our model is
\[ \dot{a} = H_0\left[\frac{\Omega_m}{a} + \Omega_k\right]^{1/2}, \]
hence,
\[ a\dot{a} = H_0\left[\Omega_m a + \Omega_k a^2\right]^{1/2}. \]
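Therefore, using this,
\[ \eta = \frac{1}{H_0}\int_0^a\frac{da}{\sqrt{\Omega_m a + \Omega_k a^2}}. \]
To integrate this, we complete the square, giving
\[ \eta = \frac{1}{H_0}\int_0^a da\left\{\Omega_k\left[\left(a + \frac{\Omega_m}{2\Omega_k}\right)^2 - \frac{\Omega_m^2}{4\Omega_k^2}\right]\right\}^{-1/2}. \]
If we then define
\[ x \equiv \frac{2\Omega_k}{\Omega_m}a + 1, \]
so that $da = (\Omega_m/2\Omega_k)\,dx$ and $a = 0$ corresponds to $x = 1$, then we see that we can write
\[ \eta = \frac{1}{H_0}\int_1^x dx\,\frac{\Omega_m}{2\Omega_k}\left\{\Omega_k\,\frac{\Omega_m^2}{4\Omega_k^2}\left(x^2 - 1\right)\right\}^{-1/2} = \frac{1}{H_0\sqrt{\Omega_k}}\int_1^x\frac{dx}{\sqrt{x^2 - 1}}, \]
where we look up the value of the integral,
\[ \int_1^x\frac{dx}{\sqrt{x^2 - 1}} = \cosh^{-1}x. \]
Hence,
\[ \eta = \frac{1}{H_0\sqrt{\Omega_k}}\cosh^{-1}x. \]
Therefore,
\[ x = \cosh\left(\eta H_0\sqrt{\Omega_k}\right). \]
Hence,
\[ a(\eta) = \frac{\Omega_m}{2\Omega_k}\left[\cosh\left(\eta H_0\sqrt{\Omega_k}\right) - 1\right]. \]
Writing $\Omega_k = 1 - \Omega_m$, this reads
\[ a(\eta) = \frac{\Omega_m}{2(1 - \Omega_m)}\left[\cosh\left(\eta H_0\sqrt{1 - \Omega_m}\right) - 1\right], \qquad \Omega_k > 0. \tag{7.21} \]
Clearly, this only holds for $\Omega_k > 0$. If $\Omega_k < 0$, then the cosh becomes a cosine, and we have
\[ a(\eta) = \frac{\Omega_m}{2(\Omega_m - 1)}\left[1 - \cos\left(\eta H_0\sqrt{\Omega_m - 1}\right)\right], \qquad \Omega_k < 0. \tag{7.22} \]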
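One way to check (7.21) and (7.22) is to confirm numerically that they satisfy the equation they were derived from, namely $da/d\eta = a\dot{a} = H_0\sqrt{\Omega_m a + \Omega_k a^2}$ (valid while the scale factor is increasing). The Python sketch below does this with a finite difference; the function names are hypothetical, $H_0 = 1$ is an arbitrary choice of units, and the $\Omega_k = 0$ branch is the small-curvature limit of (7.21), added here for completeness rather than taken from the notes.

```python
import math


def scale_factor(eta, omega_m, h0=1.0):
    """a(eta) for a matter plus curvature universe, with omega_k = 1 - omega_m."""
    omega_k = 1.0 - omega_m
    if omega_k > 0.0:   # open, equation (7.21)
        root = math.sqrt(omega_k)
        return omega_m / (2.0 * omega_k) * (math.cosh(eta * h0 * root) - 1.0)
    if omega_k < 0.0:   # closed, equation (7.22)
        root = math.sqrt(-omega_k)
        return omega_m / (2.0 * -omega_k) * (1.0 - math.cos(eta * h0 * root))
    # Flat limit (assumption of this sketch): cosh(x) - 1 -> x^2 / 2.
    return omega_m * (h0 * eta) ** 2 / 4.0


def ode_residual(eta, omega_m, h0=1.0, step=1e-5):
    """da/d(eta) - H0 sqrt(omega_m a + omega_k a^2), expected to be ~ 0."""
    omega_k = 1.0 - omega_m
    slope = (scale_factor(eta + step, omega_m, h0)
             - scale_factor(eta - step, omega_m, h0)) / (2.0 * step)
    a = scale_factor(eta, omega_m, h0)
    return slope - h0 * math.sqrt(omega_m * a + omega_k * a ** 2)


if __name__ == "__main__":
    print(ode_residual(1.0, 0.3))  # open case
    print(ode_residual(1.0, 2.0))  # closed case, expanding phase
    print(ode_residual(1.0, 1.0))  # flat limit
```

For the closed case the check only applies on the expanding branch, $\eta H_0\sqrt{\Omega_m - 1} < \pi$, since the square root in the differential equation picks the positive sign.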
With reference to Figure (7.2), we see the two different types of Universes. It is clear from the analytic forms of the evolution of scale factor with conformal time, $a(\eta)$, that $\Omega_k > 0$ corresponds to an exponential increase in scale factor (7.21), and $\Omega_k < 0$ to an oscillatory scale factor (7.22).

Figure 7.2: A visualisation of closed and open universes (scale factor $a(\eta)$ plotted against conformal time $\eta$). Closed has $\Omega_k < 0$, and open $\Omega_k > 0$. The former is just a sinusoidal oscillation, the latter an exponential expansion.

Also, from the definition of $\Omega_k$,
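\[ \Omega_k = -\frac{k}{H_0^2 a_0^2}, \]
we see that
\[ \Omega_k > 0 \quad\Rightarrow\quad k < 0 \quad\Rightarrow\quad \text{open}, \tag{7.23} \]
\[ \Omega_k < 0 \quad\Rightarrow\quad k > 0 \quad\Rightarrow\quad \text{closed}, \tag{7.24} \]
which are in agreement with our previous statements of open and closed Universes.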
An oscillatory Universe will have a definite (conformal) time when it ends, when $a(\eta)$ hits the axis again,
\[ \cos\left(\eta_{\mathrm{tot}}H_0\sqrt{\Omega_m - 1}\right) = 1 \quad\Rightarrow\quad \eta_{\mathrm{tot}} = \frac{2\pi}{\sqrt{\Omega_m - 1}\,H_0}. \]
Hence, the actual total time is given by
\[ t_{\mathrm{tot}} = \int_0^{\eta_{\mathrm{tot}}}d\eta\,a(\eta), \]
which easily evaluates to
\[ t_{\mathrm{tot}} = \frac{\pi\Omega_m}{(\Omega_m - 1)^{3/2}H_0}. \]
Therefore, we have an expression for the total possible age of the Universe, if the Universe has a closed geometry. Hence, a small non-zero $k$ is sufficient to control the future “fate” of the Universe. That is, the Universe will either end up exponentially growing (the “heat death”), or will crunch back on itself (the “big crunch”).
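The claim that the integral “easily evaluates” can be checked by integrating the closed-universe scale factor (7.22) over one full cycle numerically and comparing with the closed form for $t_{\mathrm{tot}}$. This Python sketch does so with a midpoint rule; the function names, the default $H_0 = 1$ and the sample value $\Omega_m = 2$ are choices made for this illustration.

```python
import math


def closed_scale_factor(eta, omega_m, h0=1.0):
    """a(eta) from (7.22), valid for omega_m > 1."""
    root = math.sqrt(omega_m - 1.0)
    return omega_m / (2.0 * (omega_m - 1.0)) * (1.0 - math.cos(eta * h0 * root))


def total_lifetime_formula(omega_m, h0=1.0):
    """Closed-form t_tot = pi * omega_m / ((omega_m - 1)^(3/2) * H0)."""
    return math.pi * omega_m / ((omega_m - 1.0) ** 1.5 * h0)


def total_lifetime_numeric(omega_m, h0=1.0, steps=20_000):
    """Midpoint-rule estimate of the integral of a(eta) from 0 to eta_tot."""
    eta_tot = 2.0 * math.pi / (math.sqrt(omega_m - 1.0) * h0)
    d_eta = eta_tot / steps
    return sum(closed_scale_factor((i + 0.5) * d_eta, omega_m, h0)
               for i in range(steps)) * d_eta


if __name__ == "__main__":
    print(total_lifetime_numeric(2.0), total_lifetime_formula(2.0))
```

The two numbers should agree closely, because only the constant part of $1 - \cos$ survives when integrated over a whole period.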
7.5 Light in the FRW Universe
Consider the FRW metric, where we shall ignore all angular terms;
\[ ds^2 = dt^2 - a^2(t)\frac{dr^2}{1 - kr^2}. \]
Now, assuming flatness ($k = 0$), for light (i.e. null geodesics, $ds^2 = 0$), we have that the metric reduces to
\[ dt = a(t)\,dr. \]
Therefore, consider
\[ R = \int_0^R dr = \int_{t_e}^{t_o}\frac{dt}{a(t)}. \]
That is, the (coordinate) distance between two points that have photons sent between them. We have that $t_e$ is the time of emission of the photon, and $t_o$ the time of observation. Now, we shall assume that this distance is unchanged for pulses sent slightly after this first set, so that
\[ R = \int_{t_e + \delta t_e}^{t_o + \delta t_o}\frac{dt}{a(t)}. \]
Therefore, we have that
\[ \int_{t_e + \delta t_e}^{t_o + \delta t_o}\frac{dt}{a(t)} = \int_{t_e}^{t_o}\frac{dt}{a(t)}. \]
Now, the only non-zero contribution to this (via a general calculus mid-point theorem) is
\[ \frac{\delta t_o}{a(t_o)} - \frac{\delta t_e}{a(t_e)} = 0, \]
which easily rearranges to
\[ \frac{\delta t_o}{\delta t_e} = \frac{a(t_o)}{a(t_e)}. \]
Now, since the interval between successive pulses is the inverse of the frequency, we can express the LHS as a ratio of frequencies, so that
\[ \frac{\nu_e}{\nu_o} = \frac{a(t_o)}{a(t_e)} \equiv 1 + z. \]
Hence, we arrive at a standard relation in cosmology,
\[ \frac{\nu_e}{\nu_o} = 1 + z. \tag{7.25} \]
This is always $> 0$. Therefore, we see that the ratio of the “sent” frequency (i.e. the frequency that the light had when it was sent by the object) to the received frequency depends upon the redshift $z$ at which the light was emitted. The quantity $1 + z$ is just the ratio of the scale factor when the light was received to that when it was emitted. Hence, we see that the further away something is (so that the earlier its light was emitted, in an expanding Universe), the lower the frequency at which we see its light. That is, the wavelength increases. Hence, this is called the cosmological redshift effect. This is a different effect from gravitational redshift, because gravitational redshift occurs due to different distances from a gravitating body.
Expansion of Universe ⇒ Cosmological redshift,
Different distances up gravitational potential ⇒ Gravitational redshift.
To get a handle on the numbers involved, consider that the most distant quasar is at $z \approx 6.6$, and that recombination is at $z \approx 10^3$.
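To turn these redshifts into something more tangible, recall that $1 + z = a(t_o)/a(t_e)$ and $\nu_e/\nu_o = 1 + z$. The short Python sketch below (function names are hypothetical, and $a(t_o) = 1$ is assumed as elsewhere in these notes) gives the scale factor at emission and the factor by which wavelengths are stretched.

```python
def scale_factor_at_emission(z, a_observed=1.0):
    """Scale factor when the light was emitted, from 1 + z = a_o / a_e."""
    return a_observed / (1.0 + z)


def observed_frequency(nu_emitted, z):
    """Received frequency, from nu_e / nu_o = 1 + z."""
    return nu_emitted / (1.0 + z)


def wavelength_stretch(z):
    """Ratio of observed to emitted wavelength, which is 1 + z."""
    return 1.0 + z


if __name__ == "__main__":
    for label, z in (("quasar", 6.6), ("recombination", 1.0e3)):
        print(label, scale_factor_at_emission(z), wavelength_stretch(z))
```

So light from recombination was emitted when the Universe was roughly a thousandth of its present linear size, and arrives with its wavelength stretched by the same factor.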
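Notice that we can write
\[ z = \frac{\nu_e - \nu_o}{\nu_o} = \frac{a_o - a_e}{a_e}. \]
Also, recall that (non-relativistic) redshift is related to the velocity of the object,
\[ z = \frac{v}{c} = \frac{\delta a}{a}. \]
Hence, notice that, with $\delta t = R/c$ the light travel time, we may compute
\[ \frac{\delta a}{a} = \frac{\delta a/\delta t}{a}\,\delta t = H\frac{R}{c}. \]
Therefore,
\[ v = HR. \]
This is Hubble’s law, as derived from first principles from the FRW metric.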
7.6 Flatness Problem
Now, there are problems with the FRW Universe.
Recall that the fraction, today, of curvature’s contribution to the total density of the Universe is $\Omega_{k,0} < 10\%$. Also recall that we defined
\[ \Omega_k(t) \equiv -\frac{k}{H^2(t)\,a^2(t)}, \]
where $t$ is the time at which we are measuring. Hence, let us compute
\[ \frac{\Omega_k(t_0)}{\Omega_k(t_r)} = \frac{k/H_0^2a_0^2}{k/H_r^2a_r^2}; \]
the ratio of the curvature contributions today and in the radiation dominated epoch. Using $H_r a_r = \dot{a}_r$, this easily reduces to
\[ \frac{\Omega_k(t_0)}{\Omega_k(t_r)} = \frac{\dot{a}_r^2}{H_0^2a_0^2}. \]
Now, recalling that the scale factor, in the radiation dominated epoch, depends upon time as
\[ a_r = a_0\left(\frac{t_r}{t_0}\right)^{1/2} \quad\Rightarrow\quad \dot{a}_r = \frac{a_0}{2t_0}\left(\frac{t_r}{t_0}\right)^{-1/2}. \]
Also, recall that the Hubble parameter, in the radiation dominated epoch, goes as
\[ H_0 = \frac{1}{2t_0}. \]
Hence,
\[ \dot{a}_r = H_0a_0\left(\frac{t_r}{t_0}\right)^{-1/2}. \]
And therefore,
\[ \frac{\Omega_k(t_0)}{\Omega_k(t_r)} = \frac{t_0}{t_r}. \]
Putting some typical numbers in, one sees that
\[ \frac{\Omega_k(t_0)}{\Omega_k(t_r)} \approx \frac{10^{17}\ \mathrm{s}}{10^{-43}\ \mathrm{s}} = 10^{60}. \]
Hence,
\[ \Omega_k(t_0) = 10^{60}\,\Omega_k(t_r). \]
That is, the value of $\Omega_k$ is $10^{60}$ times what it was in the radiation epoch! This requires a very small (so called “fine-tuned”) curvature in the early epoch, so that the Universe could be $10^{60}$ times more curved now than it was.
This fine-tuning required is called the flatness problem.
7.6.1 Inflation
One way to “solve” the flatness problem is to introduce the concept of inflation. Suppose we allow an epoch before the radiation domination that was vacuum dominated (recall that $a_V(t) = a_ie^{Ht}$). In this case, we can compute that
\[ \frac{\Omega_k(t_r)}{\Omega_k(t_i)} = \frac{a_i^2}{a_r^2}, \]
after assuming that $H_i \approx H_r$. This gives
\[ \frac{\Omega_k(t_r)}{\Omega_k(t_i)} = e^{-2H(t_r - t_i)}, \]
a number we require to be less than $10^{-60}$. Therefore, we require $2H(t_r - t_i) > 60\ln 10 \approx 138$, that is,
\[ N_e \equiv H(t_r - t_i) \gtrsim 69. \]
That is, we require the number of e-folds to be roughly 60 to 70, in order for us to observe the flatness that we do today.
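The arithmetic behind the e-fold estimate can be made explicit. Taking the requirement $e^{-2H(t_r - t_i)} < 10^{-60}$ at face value, and counting one e-fold as one unit of $H(t_r - t_i)$ (an assumption about the counting convention), the Python sketch below computes the minimum number of e-folds for a given suppression factor. The function names are hypothetical.

```python
import math


def curvature_growth(t_now_seconds, t_early_seconds):
    """Ratio Omega_k(t_0) / Omega_k(t_r) = t_0 / t_r from the flatness estimate."""
    return t_now_seconds / t_early_seconds


def required_efolds(suppression):
    """Smallest N = H (t_r - t_i) such that exp(-2 N) equals `suppression`."""
    return -0.5 * math.log(suppression)


if __name__ == "__main__":
    growth = curvature_growth(1.0e17, 1.0e-43)   # about 1e60
    print(growth)
    print(required_efolds(1.0 / growth))          # about 69
```

The result, $30\ln 10 \approx 69$, is of the same order as the “about 60” usually quoted; the exact figure depends on the assumed times and on how the e-folds are counted.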
Basically, this idea of inflation gives a mechanism by which the Universe is able to stretch and flatten out, very quickly. In fact, inflation also aids in explaining the observed homogeneity of the Universe.
8 The General Theory of Relativity: Discussion
We have now come to a place whereby all the mathematical groundwork has been laid, for a “wordy” discussion about the general theory of relativity.
Before general relativity (or at least a few hundred years before Einstein, as general relativity went through a few people before Einstein, in various forms), gravity was some force that was present between two bodies having mass. As this was so, things that don’t have mass don’t interact with gravity. This means that things like photons are not affected by gravity, and that photons are not capable of generating a gravitational field. Also, the structure of spacetime was that space is flat, and time is just something to be moved through, at a constant rate; where the rate is the same for all observers.
General relativity somewhat starts off by letting space and time mix: spacetime. The whole collection of bits of spacetime is then what we call a manifold; further to this, allowing a meaning to the term “distance” in a manifold, we introduce a metric. We call a manifold (collection of points) that has a metric, a Riemannian manifold. We “used” to think of spacetime as being flat (Pythagoras’ theorem for distances between two points). A flat spacetime is described by a metric with constant components; taking the derivative of any one of them, with respect to any coordinate, is zero. Now, general relativity introduces the idea that a metric has components that depend on position. This means that in order to find the distance between two points, you not only have to know where the points are, but where you are relative to the origin of the coordinate system. This is in contrast with only needing to know the relative positions of the two points.
When one computes the derivative of something, one is computing the rate of change of something in a particular direction. Now, when one did this in a flat spacetime, the derivative of the metric didn’t do anything: its derivative was zero. In a position dependent metric, this is no longer true. One finds that there is an extra bit, added onto the differential of something, that is proportional to the derivative of the metric. That this extra bit exists is directly due to the metric being position dependent. Therefore, various combinations of this metric (in the form of differentials with respect to various coordinates) will give us a handle on the geometry of the manifold. A slightly curious thing is that a manifold does not require a higher dimension in which to curve. Usually, when one imagines a ball (as an example), one can see that the surface of the ball is curved round, through three dimensions, but the surface of the ball itself is two dimensional. Manifolds do not require this extra dimension (to those within the manifold itself) in which to curve.
Mathematically, we carry around the “extra bits” of the differential in the Christoffel symbols; and the “various combinations” of the differentials of the metric in the Riemann tensor.
Now, something that lives in a manifold, and moves in that manifold, will move along some sort of curve; which is fairly obvious. Now, the motion of something, with respect to a stationary observer, can be determined. In a flat spacetime, something will move along lines that are determined by Newton’s equations of motion. In a curved spacetime (i.e. a spacetime that doesn’t have all zero components of its Riemann tensor), the curves that things move along are changed; and the amount that they are changed by is proportional to the Christoffel symbols. These curves are geodesics. A geodesic, in flat spacetime, with no external forces (such as a rocket boost, or magnetic fields), is a straight line. A corresponding geodesic in curved spacetime is curved. This curvature of the path of the freely moving “thing” is due to the curvature of the spacetime.
So, this far in our discussion, we have seen that if a manifold is curved, then things don’t tend to move in straight lines within the manifold. That is, the geodesics are curved lines. A way to imagine this is to envisage a cube threaded with a 3D grid; beads move along the gridlines, but the gridlines are not straight. This is only an analogy, as the real geodesics are 4D. Then, we must consider what it is that does the curving. What thing, in a manifold, causes it to be curved?
The proposal of Einstein is that all forms of matter and energy (even though they are essentially the same) curve spacetime. The proposal equates the distribution of “stuff” (i.e. the things that do the curving, things that have mass & energy) with the geometry of the spacetime. That is, the distribution of mass-energy with combinations of the metric. This means that the more energy you put in a given place, the more the spacetime is curved (and hence the more curvy geodesics get). The distribution of mass-energy is carried around in the energy-momentum tensor, and the geometry in the Einstein tensor.
This curvature of spacetime, due to the distribution of mass-energy, is the “main idea” of general relativity.
Some of the consequences of this general theory include the “ability” of massless things, which have energy, to interact with gravity. This is because the massless things move through the spacetime, and gravity is just the curvature of spacetime. This allows the geodesic of a photon to be curved. Notice that this is in contrast with the previous flat spacetime we started off discussing. This gives the so-called “light deflection” effect. Another consequence is that things at different distances from the centre of a body doing the curving (the so-called gravitating mass) experience different rates of passage through time. This is because the position-dependent metric has different values at different positions (obviously). An example of this is that if we synchronize two clocks on the surface of the earth, then take one up from the surface of the earth and leave one on the surface, they will tell different times when brought back together.