The Geometry behind the Special Relativity Theory


Paul Anthony Fronteri

June 4, 2012

Bachelor Project in Mathematics

Abstract

This project is an attempt to approach known physical concepts from a mathematical point of view or, more precisely, within a geometrical framework. In particular, we study concepts from special relativity (Lorentz transformations), electromagnetism (skew-symmetry), and quantum mechanics and field theory (spinors).

Most of the material of the project is based on the book “The Geometry of Minkowski Space. An introduction to the Mathematics of the Special Relativity Theory” by Gregory L. Naber [1]. Other literature has also been used to complement the mathematical understanding, such as “Semi-Riemannian Geometry” by Barrett O’Neill [2], “Introduction to 2-Spinors in General Relativity” by Peter O’Donnell [3] and “Riemannian manifolds” by John M. Lee [4]. My education in physics allowed me to understand the content and motivation of Naber's book, but from my point of view it lacks mathematical logic and care. One of the aims of the project was to complement the physically interesting material of the book by a mathematical understanding of the geometry and the algebraic structures behind the physical processes.

The first chapter starts with the motivation for studying special relativity from a mathematical point of view. The second chapter gives an account of the geometry of Minkowski space and of the Lorentz transformations in this space. Sections 2.1 to 2.5 concern the basic theory, and Section 2.6 presents some practical examples of the use of the Lorentz transformation in order to connect the mathematical theory with physics. The third chapter goes further and introduces particles and charged particles, which naturally leads to the electric and magnetic fields (electromagnetism). As in the second chapter, Sections 3.1 and 3.2 deal with the mathematical framework, while Section 3.3 gives the physical applications of this framework. The last chapter adds one more structure that our particles possess, namely spin. Sections 4.1 to 4.4 are devoted to the presentation of the theory, and Section 4.5 gives a clear picture of the “double-valued” property of spinors by means of the null flag.

Paul Anthony Fronteri Bachelor Project in Mathematics - page i

Acknowledgements

I wish to thank my supervisor on this project, Irina Markina. Thank you for giving me an interesting, exciting and sometimes very hard project and, just as importantly, for having the patience and believing in me.

Paul Anthony Fronteri, 09 May 2012


Contents

1 Our world as a space-time

2 Geometry of Minkowski space and the Lorentz transformations
  2.1 Definitions
  2.2 Minkowski Space
  2.3 Lorentz Group
  2.4 Timelike Vectors and Curves
    2.4.1 Some properties of the world velocity and acceleration
  2.5 Spacelike Vectors
  2.6 Some Classical Examples
    2.6.1 Time dilation and relativity of simultaneity
    2.6.2 Special Lorentz Transformation and Boosts

3 Particles and Electromagnetic Fields
  3.1 Particles
  3.2 Charged particles and Electromagnetic Fields
  3.3 Interactions of particles and Charged particles in Electromagnetic Fields
    3.3.1 Relativistic Doppler and Compton effect
    3.3.2 Constant fields
    3.3.3 Variable fields and Maxwell’s Equations

4 Spin Transformations and Spinors
  4.1 Spin Transformation and Spinor map
  4.2 Representations of matrix groups
  4.3 Spin Space and spinor of valence
  4.4 Spinors and World Vectors
  4.5 Bivectors and Null Flag


1 Our world as a space-time

Our world, the universe, space-time or reality is a four-dimensional space endowed with a special kind of metric called the Minkowski metric. These are different names for one common concept, namely the place we live and breathe in. Nevertheless, the reality of our habitat has not always been thought of in this form. Some time ago one thought that the world was flat, a disk from whose edge sailors on the far seas had to be careful not to fall into the depths of nothing.

Following the geometry of Euclid, Descartes assigned to each observer $O$ a coordinate system $(x^1, x^2, x^3)$, which one could identify with the three space dimensions.

Galileo Galilei discovered that the laws of mechanics are the same for all observers. So, given an observation by one observer $O$, one could translate it to another observer $\hat{O}$ by the Galilean transformation. Time was disconnected from space itself. The ancient Greeks thought that time was not real; it was just an illusion of the mind, of the flow of the world. Newton made it more concrete by declaring it absolute.

The first to imagine the universe in more than three dimensions were mathematicians: Gauss, Riemann, Möbius, Lorentz, Poincaré and Minkowski, to mention just a few. They assumed that reality should not only have more dimensions but should also be endowed with curvature. The curvature could be positive, like that of the sphere $S^n$, or negative, like that of the hyperbolic space $H^n$. But what do the spaces of the mathematical imagination have to do with the real world we live in? It turned out that they are much more closely related than one could expect. The starting point in the understanding of this connection belongs to the physicist Albert Einstein, who invented the theories of special and general relativity.

As a first step, our three-dimensional space and time were no longer thought of as two separate concepts; they became parts of a bigger space. Namely, they formed space-time, a four-dimensional space furnished with special tools of measurement. Each event in the universe was described as one point in this space-time. The birth and death of Einstein are two different points in space-time, connected by the worldline of Einstein himself: the journey of his life in this universe of ours. However, binding the two concepts of space and time together was not all. They also became interconnected: space and time were no longer absolute, as they were in Newton's theory, but relative to each observer's viewpoint. Hence the name relativity theory.

The distance between two points in space-time was no longer given by the Euclidean distance, or its metric, but became a byproduct of the Minkowski metric. The transformation between two observers changed from the Galilean transformation to the Lorentz transformation in order to preserve the new type of metric in the universe.


2 Geometry of Minkowski space and the Lorentz transformations

This chapter goes through the general geometric structures of Minkowski space, such as the inner product, the null cone, future-directedness and causality. The Lorentz transformation and the Lorentz group are also defined. Properties of timelike and spacelike vectors are discussed, while their physical meaning is postponed to the end of the chapter.

2.1 Definitions

Let $V$ be an arbitrary vector space of dimension $n \geq 1$ over the real numbers $\mathbb{R}$, and let $v, \omega$ be elements (vectors) of this vector space.

Definition 2.1 (Bilinear form). A bilinear form on $V$ is a map $g : V \times V \to \mathbb{R}$ that is linear in each variable, i.e. such that $g(a_1 v_1 + a_2 v_2, \omega) = a_1 g(v_1, \omega) + a_2 g(v_2, \omega)$ and $g(v, a_1 \omega_1 + a_2 \omega_2) = a_1 g(v, \omega_1) + a_2 g(v, \omega_2)$ for all $a_1, a_2 \in \mathbb{R}$ and all $v, v_1, v_2, \omega, \omega_1, \omega_2 \in V$.

A bilinear form $g$ is symmetric if $g(v, \omega) = g(\omega, v)$ for all $v, \omega \in V$, and non-degenerate if $g(v, \omega) = 0$ for all $\omega \in V$ implies $v = 0$.

Definition 2.2 (Scalar product). A non-degenerate, symmetric, bilinear form $g$ is called a scalar product, and the value $g(v, \omega)$ is written as $v \cdot \omega$.

Note that Naber [1] uses the term inner product for the scalar product. In general, a scalar product $g$ is called positive definite (or an inner product) if $g(v, v) > 0$ whenever $v \neq 0$, negative definite if $g(v, v) < 0$ whenever $v \neq 0$, and indefinite if it is neither positive nor negative definite. A vector space $V$ together with a scalar product $g$, written $(V, g)$, is called a scalar space.

Definition 2.3 (Orthogonality). If $(V, g)$ is an $n$-dimensional scalar space, then two vectors $v, \omega \in V$ are said to be $g$-orthogonal (or just orthogonal) if $g(v, \omega) = 0$.

Definition 2.4 (Orthogonal complement). If $W$ is a subspace of $V$, then the orthogonal complement $W^{\perp}$ of $W$ in $V$ is defined by $W^{\perp} = \{ v \in V \mid g(v, \omega) = 0 \ \forall \omega \in W \}$.

Definition 2.5 (Quadratic form). The quadratic form associated with a scalar product $g$ on an $n$-dimensional vector space $V$ is the map $Q : V \to \mathbb{R}$ defined by $Q(v) = g(v, v) = v^2$, $v \in V$.

Definition 2.6 (Index of a bilinear form [2]). The index of a symmetric bilinear form $g$ defined on a vector space $V$ is the maximal dimension of a subspace of $V$ on which $g$ is negative definite.

Theorem 2.7 ([1] page 8, [2]). Let $V$ be an $n$-dimensional real vector space on which a non-degenerate, symmetric, bilinear form $g : V \times V \to \mathbb{R}$ is defined. Then there exists a basis $\{e_1, \ldots, e_n\}$ for $V$ such that $g(e_i, e_j) = 0$ if $i \neq j$ and $Q(e_i) = \pm 1$ for each $i = 1, \ldots, n$. Moreover, the number of basis vectors $e_i$ for which $Q(e_i) = -1$ is the same for any such basis and coincides with the index of $g$. Such a basis is called an orthonormal basis.


2.2 Minkowski Space

All the definitions in this section have been taken from Naber [1], pages 9-18 and 64-66.

Minkowski spacetime is a 4-dimensional real vector space $\mathcal{M}$ on which a non-degenerate, symmetric, bilinear form $g$ of index 1 is defined. The points of $\mathcal{M}$ are called events in physics, and $g$ is referred to as a Lorentz scalar product on $\mathcal{M}$.

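We make use of the Einstein summation convention: an index that appears once as a superscript and once as a subscript is summed over. To illustrate the convention, let the indices $a$ and $b$ range over the set $\{1, 2, 3, 4\}$; then

$$x^a e_a = \sum_{a=1}^{4} x^a e_a = x^1 e_1 + x^2 e_2 + x^3 e_3 + x^4 e_4,$$

$$\Lambda^a{}_b x^b = \sum_{b=1}^{4} \Lambda^a{}_b x^b = \Lambda^a{}_1 x^1 + \Lambda^a{}_2 x^2 + \Lambda^a{}_3 x^3 + \Lambda^a{}_4 x^4,$$

$$\eta_{ab} v^a \omega^b = \eta_{11} v^1 \omega^1 + \eta_{12} v^1 \omega^2 + \eta_{13} v^1 \omega^3 + \eta_{14} v^1 \omega^4 + \eta_{21} v^2 \omega^1 + \ldots + \eta_{44} v^4 \omega^4.$$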
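The summation convention can be mirrored by explicit loops. The following is a minimal Python sketch, not part of the original text. The names `ETA`, `contract_eta` and `apply_matrix` are illustrative assumptions, list positions 0..3 stand for the indices 1..4, and the metric components are assumed to be those of the Lorentz scalar product in an orthonormal basis.

```python
# Components eta_ab of the Lorentz scalar product in an orthonormal basis.
# Assumption: list positions 0..3 stand for the indices 1..4 of the text.
ETA = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, -1],
]


def contract_eta(v, w):
    """Return eta_ab v^a w^b: both repeated indices a and b are summed."""
    return sum(ETA[a][b] * v[a] * w[b] for a in range(4) for b in range(4))


def apply_matrix(lam, x):
    """Return the numbers Lambda^a_b x^b: only b is summed, a stays free."""
    return [sum(lam[a][b] * x[b] for b in range(4)) for a in range(4)]
```

A repeated index produces a sum and disappears from the result, while a free index (here $a$ in `apply_matrix`) labels the components of the result.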

An event $x \in \mathcal{M}$ expressed in the orthonormal basis $\{e_1, e_2, e_3, e_4\} = \{e_a\}$ of $\mathcal{M}$ is written as $x = x^a e_a = x^1 e_1 + x^2 e_2 + x^3 e_3 + x^4 e_4$. Here $(x^1, x^2, x^3, x^4)$ are called the coordinates of $x$ relative to the basis $\{e_a\}$, with spatial coordinates $(x^1, x^2, x^3)$ and time coordinate $x^4$. For two elements $v, \omega$ of $\mathcal{M}$ written relative to the same basis, $v = v^a e_a$ and $\omega = \omega^a e_a$, one has

$$g(v, \omega) = v^1 \omega^1 + v^2 \omega^2 + v^3 \omega^3 - v^4 \omega^4.$$

The metric written in matrix form is

$$\eta = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$$

with entries $\eta_{ab}$. Thus

$$\eta_{ab} = \begin{cases} 1 & a = b = 1, 2, 3, \\ -1 & a = b = 4, \\ 0 & \text{otherwise.} \end{cases}$$

One can then write $g(e_a, e_b) = \eta_{ab}$ and $g(v, \omega) = \eta_{ab} v^a \omega^b$. The inverse matrix to $\eta = \{\eta_{ab}\}$ is denoted by $\eta^{-1} = \{\eta^{ab}\} = \eta$.

Since the Lorentz scalar product is not positive definite, there exist nonzero vectors $v \in \mathcal{M}$ such that $g(v, v) = Q(v) = 0$. Such vectors are called null (or lightlike) vectors.

Theorem 2.8 ([1] page 10, [2]). Two nonzero null vectors $v$ and $\omega$ in $\mathcal{M}$ are orthogonal if and only if they are parallel, i.e. if there is $t \in \mathbb{R}$ such that $v = t\omega$.

Consider two distinct events $x_0$ and $x$ for which the displacement vector $v = x - x_0$ from $x_0$ to $x$ is null, i.e., $Q(v) = Q(x - x_0) = 0$. Relative to an orthonormal basis $\{e_a\}$ one has $x = x^a e_a$ and $x_0 = x_0^a e_a$, so that

$$(x^1 - x_0^1)^2 + (x^2 - x_0^2)^2 + (x^3 - x_0^3)^2 - (x^4 - x_0^4)^2 = 0.$$

This equation describes a cone in the four-dimensional space $\mathbb{R}^4$ with vertex at the point $(x_0^1, x_0^2, x_0^3, x_0^4)$.


Definition 2.9 (Null cone, null line). The null cone (or light cone) $C_N(x_0)$ at $x_0$ in $\mathcal{M}$ is defined by

$$C_N(x_0) = \{ x \in \mathcal{M} \mid Q(x - x_0) = 0 \}.$$

The null line (or physically, a light ray) $R_{x_0, x}$ containing $x_0$ and $x$ is defined by

$$R_{x_0, x} = \{ x_0 + t(x - x_0) \mid t \in \mathbb{R} \}.$$

Notice that $R_{x_0, x} = R_{x, x_0}$.

Theorem 2.10 ([1] page 12). Let $x_0$ and $x$ be two distinct events with $Q(x - x_0) = 0$. Then

$$R_{x_0, x} = C_N(x_0) \cap C_N(x).$$

A vector $v$ in $\mathcal{M}$ is said to be timelike if $Q(v) < 0$, and spacelike if $Q(v) > 0$ or if it is the zero vector $v = 0$. Further, if $v = x - x_0$ is a timelike displacement vector, then $x$ lies inside the null cone $C_N(x_0)$; if it is spacelike, then $x$ lies outside the cone. A timelike line $T = \{ x_0 + t(x - x_0) \mid t \in \mathbb{R} \}$ is a line in $\mathcal{M}$ for which $x - x_0$ is a timelike vector.
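The classification of vectors by the sign of $Q$ can be sketched in a few lines of Python. This is an illustration only: the function names are assumptions, vectors are given by their components relative to an orthonormal basis, and the exact comparison with zero is meaningful for integer or rational components (floating-point input would need a tolerance).

```python
def quadratic_form(v):
    """Q(v) = g(v, v) for v given by its components (v^1, v^2, v^3, v^4)."""
    v1, v2, v3, v4 = v
    return v1 * v1 + v2 * v2 + v3 * v3 - v4 * v4


def causal_character(v):
    """Classify v as 'timelike', 'spacelike' or 'null' following the text."""
    q = quadratic_form(v)
    if q < 0:
        return "timelike"
    if q > 0 or all(component == 0 for component in v):
        # The zero vector is counted as spacelike, as in the text.
        return "spacelike"
    return "null"
```

For instance, `causal_character((1, 0, 0, 1))` reports a null vector, since the spatial and the time contribution to $Q$ cancel.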

Theorem 2.11 ([1] page 16). Suppose that $v$ is timelike and $\omega$ is either timelike or null. Let $\{e_a\}$ be an orthonormal basis for $\mathcal{M}$ with $v = v^a e_a$ and $\omega = \omega^a e_a$. Then either

(a) $v^4 \omega^4 > 0$, in which case $g(v, \omega) < 0$, or

(b) $v^4 \omega^4 < 0$, in which case $g(v, \omega) > 0$.

Corollary 2.12 ([1] page 16). If a nonzero vector in $\mathcal{M}$ is orthogonal to a timelike vector, then it must be spacelike.

We have now found some basic building blocks for Minkowski space, but we would also like to have some sort of time orientation in $\mathcal{M}$. We therefore define the future and past directions.

Definition 2.13 (Future- and past-directed equivalence relation). Let $\tau$ denote the collection of all timelike vectors in $\mathcal{M}$. We define an equivalence relation $\sim$ on $\tau$: if $v$ and $\omega$ are in $\tau$, then $v \sim \omega$ if and only if $g(v, \omega) < 0$.

Note that if $v \sim \omega$, then the coordinates $v^4$ and $\omega^4$ have the same sign in any orthonormal basis of $\mathcal{M}$ by Theorem 2.11. Consequently, the collection $\tau$ is the union of two disjoint subsets, $\tau^+$ for positive values of $v^4$ (or $\omega^4$) and $\tau^-$ for negative values of $v^4$ (or $\omega^4$). They have the following properties: $v \sim \omega$ for all $v, \omega \in \tau^+$, $v \sim \omega$ for all $v, \omega \in \tau^-$, and $v \not\sim \omega$ if one of the vectors $v$ or $\omega$ is in $\tau^+$ and the other one belongs to $\tau^-$. The elements of $\tau^+$ are called future-directed and the elements of $\tau^-$ are called past-directed.

Definition 2.14 (Time cone, future time cone and past time cone). For each $x_0$ in $\mathcal{M}$ we define the time cone $C_T(x_0)$, the future time cone $C_T^+(x_0)$ and the past time cone $C_T^-(x_0)$ at $x_0$ by

$$C_T(x_0) = \{ x \in \mathcal{M} \mid Q(x - x_0) < 0 \},$$

$$C_T^+(x_0) = \{ x \in \mathcal{M} \mid x - x_0 \in \tau^+ \} = C_T(x_0) \cap \tau^+,$$

$$C_T^-(x_0) = \{ x \in \mathcal{M} \mid x - x_0 \in \tau^- \} = C_T(x_0) \cap \tau^-.$$

The time cone $C_T(x_0)$ is an open solid cone whose boundary is the null cone $C_N(x_0)$, and it is the disjoint union of $C_T^+(x_0)$ and $C_T^-(x_0)$.


Definition 2.15 (Future- and past-directed nonzero null vectors). A nonzero null vector $n$ is future-directed if $n \cdot v < 0$ for all $v \in \tau^+$, and past-directed if $n \cdot v > 0$ for all $v \in \tau^+$.

Note that one uses only $\tau^+$ to define the direction of a null vector. This is possible because the sign of the product $n \cdot v$ is the same for all $v \in \tau^+$.

Definition 2.16 (Future and past null cone at $x_0$). For any $x_0 \in \mathcal{M}$ we define the future null cone at $x_0$ by

$$C_N^+(x_0) = \{ x \in C_N(x_0) \mid x - x_0 \text{ is future-directed} \},$$

and the past null cone at $x_0$ by

$$C_N^-(x_0) = \{ x \in C_N(x_0) \mid x - x_0 \text{ is past-directed} \}.$$

Figure: Illustration of the future and past cone for a point. Time is the vertical axis, and space the others. As we see, we can define a plane of simultaneity, where all these events happen at the same time.

With the basic building blocks of Minkowski space and the time orientation in place, one can define a causal structure on $\mathcal{M}$.

Definition 2.17 (Chronological and causal relations). There exist two order relations on $\mathcal{M}$, called causality relations on $\mathcal{M}$. For two events $x$ and $y$ in $\mathcal{M}$ we say that

1) the event $x$ chronologically precedes $y$, and write $x \ll y$, if the displacement vector $y - x$ is timelike and future-directed;

2) the event $x$ causally precedes $y$, and write $x < y$, if the displacement vector $y - x$ is null and future-directed.
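The two relations can be tested directly on coordinates. The sketch below is an illustration under a stated assumption: events are given by their coordinates relative to an orthonormal basis whose vector $e_4$ is future-directed, so that a timelike or nonzero null vector is future-directed exactly when its fourth component is positive. The function names are hypothetical.

```python
def quadratic_form(v):
    """Q(v) for v = (v^1, v^2, v^3, v^4) in an orthonormal basis."""
    return v[0] ** 2 + v[1] ** 2 + v[2] ** 2 - v[3] ** 2


def displacement(x, y):
    """Components of the displacement vector y - x."""
    return tuple(yc - xc for xc, yc in zip(x, y))


def chronologically_precedes(x, y):
    """x << y: y - x is timelike and future-directed."""
    d = displacement(x, y)
    return quadratic_form(d) < 0 and d[3] > 0


def causally_precedes(x, y):
    """x < y: y - x is null and future-directed (exact arithmetic assumed)."""
    d = displacement(x, y)
    return quadratic_form(d) == 0 and d[3] > 0
```

With integer or rational coordinates the test `quadratic_form(d) == 0` is exact; for floating-point coordinates it would have to be replaced by a tolerance.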

Definition 2.18 (Causal automorphism). A map $F : \mathcal{M} \to \mathcal{M}$ is said to be a causal automorphism if it is bijective and both $F$ and $F^{-1}$ preserve the causal order $<$, i.e., $x < y$ if and only if $F(x) < F(y)$. Note that, in particular, $F$ is not assumed to be linear or continuous.

Definition 2.19 (Translation). A continuous map $T : \mathcal{M} \to \mathcal{M}$ is said to be a translation if $T(v) = v + v_0$ for some fixed $v_0 \in \mathcal{M}$. Geometrically, it is a function that moves every point a constant distance in the direction of $v_0$.

Definition 2.20 (Dilation). A linear transformation $K : \mathcal{M} \to \mathcal{M}$ is said to be a dilation if $K(v) = kv$ for some positive real number $k$. Geometrically, it is a function that scales (enlarges or shrinks) every vector by the factor $k$.

Definition 2.21 (Orthochronous). An orthogonal transformation $L : \mathcal{M} \to \mathcal{M}$ is said to be orthochronous if $x \cdot Lx < 0$ for all timelike or null vectors $x$.

The following theorem describes the structure of causal automorphisms: they are essentially compositions of translations, dilations and orthochronous orthogonal transformations. In this sense the causal automorphisms of Minkowski space are analogous to the Möbius transformations of Euclidean space.

Theorem 2.22 (Zeeman Theorem [1] page 66). Let $F : \mathcal{M} \to \mathcal{M}$ be a causal automorphism of $\mathcal{M}$. Then there exist an orthochronous orthogonal transformation $L : \mathcal{M} \to \mathcal{M}$, a translation $T : \mathcal{M} \to \mathcal{M}$, and a dilation $K : \mathcal{M} \to \mathcal{M}$ such that $F = T \circ K \circ L$.

2.3 Lorentz Group

We have now seen how the causal structure arises from the Lorentz metric in Minkowski space, together with the definitions of null, spacelike and timelike vectors. However, we still need to relate how two different observers $O$ and $\hat{O}$ in Minkowski space “interpret” different events. One must introduce a transformation rule between two observers, relating an event as viewed by the first observer to the same event as viewed by the second. In Minkowski space these transformations form the Lorentz group. We shall now look at some of its properties.

All the definitions in this section have been taken from Naber [1], pages 18-22.

Definition 2.23 (Linear and orthogonal transformations). If $\{e_a\}$ and $\{\hat{e}_a\}$ are two orthonormal bases for $\mathcal{M}$, then there is a unique linear transformation $L : \mathcal{M} \to \mathcal{M}$ such that $L(e_a) = \hat{e}_a$ for each $a = 1, 2, 3, 4$. A linear transformation $L : \mathcal{M} \to \mathcal{M}$ is said to be an orthogonal transformation of $\mathcal{M}$ if $g(Lx, Ly) = g(x, y)$ for all $x, y \in \mathcal{M}$; that is, it preserves the length of vectors.

Lemma 2.24 ([1] page 13, [2]). Let $L : \mathcal{M} \to \mathcal{M}$ be a linear transformation. Then the following are equivalent:

(a) $L$ is an orthogonal transformation.

(b) $L$ preserves the quadratic form of $\mathcal{M}$, i.e., $Q(Lx) = Q(x)$ for all $x \in \mathcal{M}$.

(c) $L$ carries any orthonormal basis of $\mathcal{M}$ onto another orthonormal basis of $\mathcal{M}$.

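We define the matrix $\Lambda = [\Lambda^a{}_b]_{a,b=1,2,3,4}$ associated with the orthogonal transformation $L$ and the orthonormal basis $\{e_a\}$ by

$$\Lambda = \begin{pmatrix} \Lambda^1{}_1 & \Lambda^1{}_2 & \Lambda^1{}_3 & \Lambda^1{}_4 \\ \Lambda^2{}_1 & \Lambda^2{}_2 & \Lambda^2{}_3 & \Lambda^2{}_4 \\ \Lambda^3{}_1 & \Lambda^3{}_2 & \Lambda^3{}_3 & \Lambda^3{}_4 \\ \Lambda^4{}_1 & \Lambda^4{}_2 & \Lambda^4{}_3 & \Lambda^4{}_4 \end{pmatrix},$$

meaning that for two orthonormal bases $\{e_a\}$ and $\{\hat{e}_a\}$ for $\mathcal{M}$, each element of $\{e_a\}$ can be expressed as a linear combination of the $\hat{e}_a$:

$$e_u = \Lambda^1{}_u \hat{e}_1 + \Lambda^2{}_u \hat{e}_2 + \Lambda^3{}_u \hat{e}_3 + \Lambda^4{}_u \hat{e}_4.$$

The orthogonality conditions $g(e_c, e_d) = \eta_{cd}$, $c, d = 1, 2, 3, 4$, are written as

$$\Lambda^1{}_c \Lambda^1{}_d + \Lambda^2{}_c \Lambda^2{}_d + \Lambda^3{}_c \Lambda^3{}_d - \Lambda^4{}_c \Lambda^4{}_d = \eta_{cd},$$

or, with the summation convention,

$$\Lambda^a{}_c \Lambda^b{}_d \eta_{ab} = \eta_{cd} \quad \text{and} \quad \Lambda^a{}_c \Lambda^b{}_d \eta^{cd} = \eta^{ab}. \tag{1}$$

With this matrix we can express the coordinates of an event relative to the basis $\{\hat{e}_a\}$ in terms of its coordinates relative to the basis $\{e_a\}$:

$$\hat{x}^a = \Lambda^a{}_b x^b, \quad a = 1, 2, 3, 4, \tag{2}$$

or, in more detail,

$$\begin{aligned}
\hat{x}^1 &= \Lambda^1{}_1 x^1 + \Lambda^1{}_2 x^2 + \Lambda^1{}_3 x^3 + \Lambda^1{}_4 x^4, \\
\hat{x}^2 &= \Lambda^2{}_1 x^1 + \Lambda^2{}_2 x^2 + \Lambda^2{}_3 x^3 + \Lambda^2{}_4 x^4, \\
\hat{x}^3 &= \Lambda^3{}_1 x^1 + \Lambda^3{}_2 x^2 + \Lambda^3{}_3 x^3 + \Lambda^3{}_4 x^4, \\
\hat{x}^4 &= \Lambda^4{}_1 x^1 + \Lambda^4{}_2 x^2 + \Lambda^4{}_3 x^3 + \Lambda^4{}_4 x^4.
\end{aligned}$$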

Definition 2.25 (General (homogeneous) Lorentz transformation). Any $4 \times 4$ matrix $\Lambda$ which satisfies

$$\Lambda^T \eta \Lambda = \eta, \tag{3}$$

where $T$ denotes the transpose, is called a general (homogeneous) Lorentz transformation.

Note that Eq. (3) is equivalent to Eq. (1). Moreover, one can find the inverse of $\Lambda$ by

$$\Lambda^{-1} = \eta \Lambda^T \eta.$$

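With this one can write the entries of the inverse matrix $\Lambda^{-1} = [\Lambda_a{}^b]_{a,b=1,2,3,4}$ relative to the orthonormal basis $\{e_a\}$ as

$$\Lambda^{-1} = \begin{pmatrix} \Lambda_1{}^1 & \Lambda_2{}^1 & \Lambda_3{}^1 & \Lambda_4{}^1 \\ \Lambda_1{}^2 & \Lambda_2{}^2 & \Lambda_3{}^2 & \Lambda_4{}^2 \\ \Lambda_1{}^3 & \Lambda_2{}^3 & \Lambda_3{}^3 & \Lambda_4{}^3 \\ \Lambda_1{}^4 & \Lambda_2{}^4 & \Lambda_3{}^4 & \Lambda_4{}^4 \end{pmatrix} = \begin{pmatrix} \Lambda^1{}_1 & \Lambda^2{}_1 & \Lambda^3{}_1 & -\Lambda^4{}_1 \\ \Lambda^1{}_2 & \Lambda^2{}_2 & \Lambda^3{}_2 & -\Lambda^4{}_2 \\ \Lambda^1{}_3 & \Lambda^2{}_3 & \Lambda^3{}_3 & -\Lambda^4{}_3 \\ -\Lambda^1{}_4 & -\Lambda^2{}_4 & -\Lambda^3{}_4 & \Lambda^4{}_4 \end{pmatrix}. \tag{4}$$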
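Condition (3) and the inverse formula $\Lambda^{-1} = \eta \Lambda^T \eta$ can be checked on a concrete matrix. The following Python sketch is an illustration, not part of the source: the helper names are assumptions, and the example matrix `BOOST` is a hypothetical boost along $e_1$ with speed $\beta = 3/5$ (so $\gamma = 5/4$), chosen so that exact rational arithmetic can be used.

```python
from fractions import Fraction

ETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]


def transpose(m):
    return [list(column) for column in zip(*m)]


def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def is_general_lorentz(lam):
    """Check Eq. (3): Lambda^T eta Lambda == eta (exact arithmetic assumed)."""
    return matmul(matmul(transpose(lam), ETA), lam) == ETA


def lorentz_inverse(lam):
    """Inverse computed as eta Lambda^T eta; valid only if Eq. (3) holds."""
    return matmul(matmul(ETA, transpose(lam)), ETA)


# Illustrative boost along e_1 with beta = 3/5 and gamma = 5/4.
BETA, GAMMA = Fraction(3, 5), Fraction(5, 4)
BOOST = [
    [GAMMA, 0, 0, -BETA * GAMMA],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [-BETA * GAMMA, 0, 0, GAMMA],
]
```

For this example the inverse returned by `lorentz_inverse(BOOST)` is the boost with the opposite speed, in agreement with the sign pattern of Eq. (4).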

Definition 2.26 (General Lorentz group $\mathcal{L}_{GH}$). The set of all general (homogeneous) Lorentz transformations forms a group under matrix multiplication. This group is called the general (homogeneous) Lorentz group and is denoted by $\mathcal{L}_{GH}$.

Note that physically not all transformations in $\mathcal{L}_{GH}$ are interesting. The most important are those which preserve the time orientation and also the orientation of the spacelike part of Minkowski space. We now define these special transformations inside $\mathcal{L}_{GH}$.

Setting $c = d = 4$ in Eq. (1), one obtains $(\Lambda^4{}_4)^2 = 1 + (\Lambda^1{}_4)^2 + (\Lambda^2{}_4)^2 + (\Lambda^3{}_4)^2$, which implies $(\Lambda^4{}_4)^2 \geq 1$. Consequently, $\Lambda^4{}_4 \geq 1$ or $\Lambda^4{}_4 \leq -1$.

Definition 2.27. An element $\Lambda$ of $\mathcal{L}_{GH}$ is said to be orthochronous if $\Lambda^4{}_4 \geq 1$ and non-orthochronous if $\Lambda^4{}_4 \leq -1$.

Theorem 2.28 ([1] page 18, [2]). Let $\Lambda = [\Lambda^a{}_b]_{a,b=1,2,3,4}$ be an element of $\mathcal{L}_{GH}$ and $\{e_a\}_{a=1,2,3,4}$ an orthonormal basis for $\mathcal{M}$. Then the following are equivalent:

(a) $\Lambda$ is orthochronous,

(b) $\Lambda$ preserves the time orientation of all null vectors, i.e., if $v = v^a e_a$ is a null vector, then the numbers $v^4$ and $\hat{v}^4 = \Lambda^4{}_b v^b$ have the same sign,

(c) $\Lambda$ preserves the time orientation of all timelike vectors.

Notice that if $\Lambda$ is non-orthochronous, then it reverses the time orientation of all timelike and all nonzero null vectors.

The determinant of $\Lambda$ is found by taking the determinant of both sides of Eq. (3):

$$\det(\Lambda^T) \det(\eta) \det(\Lambda) = \det(\eta), \qquad \det(\Lambda^T) = \det(\Lambda), \qquad (\det(\Lambda))^2 = 1,$$

leading to $\det(\Lambda) = 1$ or $\det(\Lambda) = -1$.

Definition 2.29 (Proper and improper Lorentz transformation). We call a Lorentz transformation proper if $\det \Lambda = 1$ and improper if $\det \Lambda = -1$.
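Definitions 2.27 and 2.29 together sort the elements of the general Lorentz group into four classes. The sketch below illustrates this in Python; the function names are assumptions, and the input matrix is assumed to satisfy Eq. (3) already, which the code does not verify.

```python
def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def classify(lam):
    """Return ('proper' | 'improper', 'orthochronous' | 'non-orthochronous').

    Assumption: lam is a general Lorentz transformation, so that
    det(lam) is +1 or -1 and the entry Lambda^4_4 is >= 1 or <= -1.
    """
    determinant_part = "proper" if det(lam) == 1 else "improper"
    time_part = "orthochronous" if lam[3][3] >= 1 else "non-orthochronous"
    return determinant_part, time_part
```

As an example, the spatial reflection with diagonal entries $(-1, -1, -1, 1)$ is classified as improper and orthochronous, while the time reversal with diagonal entries $(1, 1, 1, -1)$ is improper and non-orthochronous.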

An improper orthochronous Lorentz transformation reverses the spatial orientation, from a left-handed to a right-handed system or vice versa. All in all, the orthochronous transformations (the time-orientation-preserving elements of $\mathcal{L}_{GH}$) form a subgroup of $\mathcal{L}_{GH}$, and this subgroup has two components, consisting of the proper and the improper transformations. We are now ready for the following important definition.

Definition 2.30 (Lorentz group $\mathcal{L}$). The Lorentz group $\mathcal{L}$ is the subgroup of the general Lorentz group $\mathcal{L}_{GH}$ consisting of the proper, orthochronous Lorentz transformations.

As mentioned before, the elements of the Lorentz group are linear transformations of $\mathcal{M}$ that preserve the length of vectors, the time orientation and the space orientation. Notice that in the literature the general (homogeneous) group is often itself referred to as the Lorentz group. Further, one can define an admissible basis, or a reference frame for an observer, on which the Lorentz group acts.

Definition 2.31 (Admissible basis and reference frame). We define an admissible basis for $\mathcal{M}$ to be an orthonormal basis $\{e_1, e_2, e_3, e_4\}$ with $e_4$ timelike and future-directed and $\{e_1, e_2, e_3\}$ spacelike and “right-handed”, i.e., satisfying $(e_1 \times e_2) \cdot e_3 = 1$. An admissible basis is also called an admissible frame of reference. Any two such bases (frames) are related by a Lorentz transformation from the Lorentz group $\mathcal{L}$.


The Lorentz group $\mathcal{L}$ has an important subgroup $\mathcal{R}$, consisting of matrices $R = [R^a{}_b]$ of the form

$$R = \begin{pmatrix} & & & 0 \\ & [R^i{}_j] & & 0 \\ & & & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$

where $[R^i{}_j]_{i,j=1,2,3}$ is a unimodular orthogonal matrix, i.e., satisfying $\det [R^i{}_j] = 1$ and $[R^i{}_j]^T = [R^i{}_j]^{-1}$. The coordinate transformation associated with $R$ corresponds physically to a rotation of the spatial coordinate axes within a given frame of reference ([1] page 21, [5]). For this reason $\mathcal{R}$ is called the rotation subgroup of $\mathcal{L}$, and its elements are called rotations in $\mathcal{L}$.

Lemma 2.32 ([1] page 21, [5]). Let $\Lambda = [\Lambda^a{}_b]_{a,b=1,2,3,4}$ be a proper, orthochronous Lorentz transformation. Then the following are equivalent:

(a) $\Lambda$ is a rotation.

(b) $\Lambda^1{}_4 = \Lambda^2{}_4 = \Lambda^3{}_4 = 0$.

(c) $\Lambda^4{}_1 = \Lambda^4{}_2 = \Lambda^4{}_3 = 0$.

(d) $\Lambda^4{}_4 = 1$.

2.4 Timelike Vectors and Curves

In the next two sections we investigate how the voyage of an observer $O$ in Minkowski space, represented by a curve, behaves and what properties it has. First we look at properties of timelike vectors themselves, and then go on to study curves through the behavior of their velocity vectors.

The theory in this section has been taken, with some rewriting, from Naber [1], pages 46-156.

Definition 2.33 (Duration $\tau(v)$). For any timelike vector $v$ in $\mathcal{M}$ we define the duration $\tau(v)$ of $v$ by $\tau(v) = \sqrt{-Q(v)}$.

If $v$ is the displacement vector between two events $x_0$ and $x$, i.e. $v = x - x_0$, then $\tau(x - x_0)$ is to be interpreted physically as the time separation of $x_0$ and $x$ in any admissible frame of reference in which both events occur at the same spatial location. The duration $\tau(x - x_0)$ is also a lower bound for the temporal separation of $x_0$ and $x$ in any admissible frame, and it is called the proper time separation of $x_0$ and $x$.

Definition 2.34 (Time axis). A timelike line in Minkowski space which passes through the origin is called a time axis.

In general, if $T$ is a time axis, then there exists an admissible basis $\{e_a\}$ for $\mathcal{M}$ such that the subspace of $\mathcal{M}$ spanned by $e_4$ is $T$. Then the space $\mathrm{Span}\{e_4\}$ is timelike and $\mathrm{Span}\{e_4\}^{\perp}$ is spacelike by Corollary 2.12.


Theorem 2.35 (Reversed Schwarz Inequality [1] page 48, [2]). If $v$ and $\omega$ are timelike vectors in $\mathcal{M}$, then

$$(v \cdot \omega)^2 \geq v^2 \omega^2,$$

and equality holds if and only if $v$ and $\omega$ are linearly dependent.

Theorem 2.36 (Reversed Triangle Inequality [1] page 49, [2]). Let $v$ and $\omega$ be timelike vectors with the same time orientation (i.e. $v \cdot \omega < 0$). Then

$$\tau(v + \omega) \geq \tau(v) + \tau(\omega),$$

and equality holds if and only if $v$ and $\omega$ are linearly dependent.
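Both reversed inequalities can be checked numerically on a small example. The Python sketch below is an illustration only; the vectors `v` and `w` and all function names are assumptions chosen so that the durations come out as whole numbers.

```python
import math


def dot(v, w):
    """Lorentz scalar product in an orthonormal basis."""
    return v[0] * w[0] + v[1] * w[1] + v[2] * w[2] - v[3] * w[3]


def duration(v):
    """tau(v) = sqrt(-Q(v)), defined for timelike vectors only."""
    q = dot(v, v)
    if q >= 0:
        raise ValueError("duration is defined for timelike vectors only")
    return math.sqrt(-q)


# Two future-directed timelike vectors that are not parallel.
v = (3, 0, 0, 5)   # Q(v) = -16, so tau(v) = 4
w = (0, 4, 0, 5)   # Q(w) = -9,  so tau(w) = 3
total = tuple(a + b for a, b in zip(v, w))

same_orientation = dot(v, w) < 0
schwarz_holds = dot(v, w) ** 2 >= dot(v, v) * dot(w, w)
triangle_holds = duration(total) >= duration(v) + duration(w)
```

Here $\tau(v + \omega) = \sqrt{75}$, which exceeds $\tau(v) + \tau(\omega) = 7$; the inequality is strict because the two vectors are linearly independent.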

Lemma 2.37 ([1] page 50, [2]). The sum of any finite number of vectors in $\mathcal{M}$, all of which are timelike or null and all future-directed (respectively, past-directed), is timelike and future-directed (respectively, past-directed), except when all of the vectors are null and parallel, in which case the sum is null and future-directed (respectively, past-directed).

Corollary 2.38 ([1] page 50). Let $v_1, \ldots, v_n$ be timelike vectors, all with the same time orientation. Then

$$\tau(v_1 + v_2 + \cdots + v_n) \geq \tau(v_1) + \tau(v_2) + \cdots + \tau(v_n),$$

and equality holds if and only if $v_1, v_2, \ldots, v_n$ are all parallel.

Corollary 2.39 ([1] page 51, [2]). Let $v$ and $\omega$ be two non-parallel null vectors. Then $v$ and $\omega$ have the same time orientation if and only if $v \cdot \omega < 0$.

Definition 2.40 (Curve, worldline). Let $I \subseteq \mathbb{R}$ be an open interval. A continuous map $\alpha : I \to \mathcal{M}$ is a curve in $\mathcal{M}$. A curve in Minkowski space is often called a worldline.

Figure: Illustration of a curve, or worldline, in Minkowski space.


Relative to any admissible basis $\{e_a\}$ for $\mathcal{M}$, one can write $\alpha(t) = x^a(t) e_a$ for each $t \in I$. The curve $\alpha$ is smooth if each component function $x^a(t)$ is infinitely differentiable and if $\alpha$'s velocity vector

$$\alpha'(t) = \frac{dx^a}{dt} e_a$$

is nonzero for each $t \in I$.

A curve $\alpha : I \to \mathcal{M}$ is said to be spacelike, timelike or null if $\alpha'(t) \cdot \alpha'(t)$ is $> 0$, $< 0$ or $= 0$, respectively, for each $t$. A timelike or null curve $\alpha$ is future-directed (respectively, past-directed) if $\alpha'(t)$ is future-directed (respectively, past-directed) for each $t$. This notion can also be extended to intervals that contain one or both of their endpoints.

Definition 2.41 (Reparametrization of a curve). If $\alpha : I \to \mathcal{M}$ is a curve, $J \subseteq \mathbb{R}$ is another interval and $h : J \to I$, $t = h(s)$, is an infinitely differentiable function with $h'(s) > 0$ for each $s \in J$, then the curve $\beta = \alpha \circ h : J \to \mathcal{M}$ is called a reparametrization of $\alpha$.

Definition 2.42 (Proper time length). If $\alpha : [a, b] \to \mathcal{M}$ is a timelike curve in $\mathcal{M}$, then we define the proper time length of $\alpha$ by

$$L(\alpha) = \int_a^b |\alpha'(t) \cdot \alpha'(t)|^{1/2} \, dt = \int_a^b \sqrt{-\eta_{ab} \frac{dx^a}{dt} \frac{dx^b}{dt}} \, dt.$$

The proper time length $L(\alpha)$ is interpreted, by the Clock Hypothesis, as the time lapse between the events $\alpha(a)$ and $\alpha(b)$ as measured by an ideal standard clock carried along by an observer whose travel in Minkowski space is represented by the curve $\alpha$.
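The proper time length can be approximated numerically once the velocity vector of the curve is known. The following Python sketch uses a simple midpoint rule; it is an illustration under stated assumptions, and the function names as well as the two example worldlines (uniform motion with speed 0.6, and a hyperbolic worldline) are not taken from the source.

```python
import math


def dot(v, w):
    """Lorentz scalar product in an orthonormal basis."""
    return v[0] * w[0] + v[1] * w[1] + v[2] * w[2] - v[3] * w[3]


def proper_time_length(velocity, a, b, steps=1000):
    """Midpoint-rule approximation of the integral of |alpha'.alpha'|^(1/2)."""
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * h
        u = velocity(t)
        total += math.sqrt(abs(dot(u, u))) * h
    return total


def uniform_velocity(t):
    """Velocity of alpha(t) = (0.6 t, 0, 0, t): constant speed 0.6."""
    return (0.6, 0.0, 0.0, 1.0)


def hyperbolic_velocity(s):
    """Velocity of alpha(s) = (cosh s, 0, 0, sinh s)."""
    return (math.sinh(s), 0.0, 0.0, math.cosh(s))
```

For the uniform worldline over $0 \leq t \leq 10$ the result is close to $10\sqrt{1 - 0.36} = 8$, less than the coordinate time 10. For the hyperbolic worldline the integrand equals 1, so the parameter $s$ is itself the proper time.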

Theorem 2.43 ([1] page 53, [2]). Let $p$ and $q$ be two points in $\mathcal{M}$. Then $p - q$ is timelike and future-directed if and only if there exists a smooth, future-directed timelike curve $\alpha : [a, b] \to \mathcal{M}$ such that $\alpha(a) = q$ and $\alpha(b) = p$.

This leads us to the conclusion that acceleration has no effect on the rate of an ideal standard clock, i.e., that the instantaneous rate of such a clock depends only on its instantaneous speed and not on the rate at which the speed is changing.

In analogy with the proper time separation between two events, one can define the proper time function (arclength) on a curve.

Definition 2.44 (Proper time function). Let $\alpha : I \to \mathcal{M}$ be a smooth timelike curve. The proper time function $\tau(t)$ on $I$ is defined by

$$\tau = \tau(t) = \int_0^t |\alpha'(u) \cdot \alpha'(u)|^{1/2} \, du.$$

Thus $\frac{d\tau}{dt} = |\alpha'(t) \cdot \alpha'(t)|^{1/2}$ is positive and infinitely differentiable, since $\alpha$ is smooth and timelike. The inverse $t = h(\tau)$ therefore exists, and $\frac{dh}{d\tau} = \left( \frac{d\tau}{dt} \right)^{-1} > 0$. We conclude that $\tau$ is a legitimate parameter along $\alpha$. Notation: $\alpha(\tau) = x^a(\tau) e_a$.

Using the proper time function, we can reparametrize our curve and define analogues of the velocity and acceleration vectors, namely the world velocity and the world acceleration.

Definition 2.45 (World velocity, 4-velocity). The vector $\alpha' = \frac{dx^a}{d\tau} e_a$ of $\alpha$ is called the world velocity (or 4-velocity) of $\alpha$ and is denoted $U(\tau) = U^a e_a$.

Definition 2.46 (World acceleration, 4-acceleration). The second proper time derivative $\alpha'' = \frac{d^2 x^a}{d\tau^2} e_a$ of $\alpha$ is called the world acceleration (or 4-acceleration) of $\alpha$ and is denoted $A(\tau) = A^a e_a$.


2.4.1 Some properties of the world velocity and acceleration

This example has been taken, with some rewriting, from Naber [1], pages 56 to 58.

For the world velocity $U$ along a curve, the Lorentz scalar products with itself and with the world acceleration $A$ are

$$U \cdot U = -1 \quad \text{and} \quad U \cdot A = 0, \tag{5}$$

at each point along the curve $\alpha$.
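The two identities in Eq. (5) follow in a few lines from the definition of the proper time function. The derivation below is a sketch that assumes only that $\alpha$ is a smooth timelike curve, with $t$ an arbitrary parameter and $\tau$ its proper time.

```latex
% Assumption: \alpha is smooth and timelike, so that
% d\tau/dt = |\alpha'(t) \cdot \alpha'(t)|^{1/2} > 0 and
% \alpha'(t) \cdot \alpha'(t) < 0 for every t.
\begin{align*}
U &= \frac{d\alpha}{d\tau} = \frac{dt}{d\tau}\,\alpha'(t), \\
U \cdot U &= \left(\frac{dt}{d\tau}\right)^{2} \alpha'(t)\cdot\alpha'(t)
           = \frac{\alpha'(t)\cdot\alpha'(t)}{\lvert \alpha'(t)\cdot\alpha'(t)\rvert}
           = -1, \\
0 &= \frac{d}{d\tau}\,(U \cdot U) = 2\, U \cdot \frac{dU}{d\tau} = 2\, U \cdot A .
\end{align*}
```

The second identity says that the world acceleration is always orthogonal to the world velocity, so by Corollary 2.12 it is spacelike whenever it is nonzero.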

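For a given curve, an admissible observer would probably prefer to parametrize the curve by his time $x^4$ rather than by the proper time $\tau$. This gives

$$\frac{d\tau}{dx^4} = |\alpha'(x^4) \cdot \alpha'(x^4)|^{1/2} = \sqrt{1 - \left[ \left( \frac{dx^1}{dx^4} \right)^2 + \left( \frac{dx^2}{dx^4} \right)^2 + \left( \frac{dx^3}{dx^4} \right)^2 \right]} = \sqrt{1 - \beta^2(x^4)} = \gamma^{-1}(x^4) = \gamma^{-1},$$

where $\beta(x^4)$ is the usual instantaneous speed of the observer whose curve is $\alpha$ relative to the frame $S(x^1, x^2, x^3, x^4)$, and $\gamma = (1 - \beta^2)^{-1/2}$. We then get

$$U^i = \frac{dx^i}{d\tau} = \frac{dx^i}{dx^4} \frac{dx^4}{d\tau} = \gamma \frac{dx^i}{dx^4}, \quad i = 1, 2, 3,$$

and

$$U^4 = \frac{dx^4}{d\tau} = \gamma.$$

Thus

$$U = U^a e_a = \gamma \frac{dx^1}{dx^4} e_1 + \gamma \frac{dx^2}{dx^4} e_2 + \gamma \frac{dx^3}{dx^4} e_3 + \gamma e_4$$

or

$$U = (U^1, U^2, U^3, U^4) = \gamma \left( \frac{dx^1}{dx^4}, \frac{dx^2}{dx^4}, \frac{dx^3}{dx^4}, 1 \right).$$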
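The component formula for the world velocity can be checked against the identity $U \cdot U = -1$ of Eq. (5). The Python sketch below is an illustration; the function names are assumptions, and the 3-velocity is given in units in which the speed of light is 1, so that its length is the speed $\beta$.

```python
import math


def dot(v, w):
    """Lorentz scalar product in an orthonormal basis."""
    return v[0] * w[0] + v[1] * w[1] + v[2] * w[2] - v[3] * w[3]


def four_velocity(u):
    """Return U = gamma * (u^1, u^2, u^3, 1) for a 3-velocity u with |u| < 1."""
    beta_squared = sum(component * component for component in u)
    if beta_squared >= 1:
        raise ValueError("the speed must be smaller than 1")
    gamma = 1.0 / math.sqrt(1.0 - beta_squared)
    return tuple(gamma * component for component in u) + (gamma,)
```

For the speed $\beta = 0.6$ along $e_1$ one finds $\gamma = 1.25$ and $U = (0.75, 0, 0, 1.25)$, and indeed $0.75^2 - 1.25^2 = -1$.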

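Similarly, we get the world acceleration

$$A^i = \gamma \frac{d}{dx^4} \left( \gamma \frac{dx^i}{dx^4} \right), \quad i = 1, 2, 3,$$

and

$$A^4 = \gamma \frac{d}{dx^4} (\gamma),$$

so

$$A = (A^1, A^2, A^3, A^4) = \gamma \frac{d}{dx^4} \left( \gamma \frac{dx^1}{dx^4}, \gamma \frac{dx^2}{dx^4}, \gamma \frac{dx^3}{dx^4}, \gamma \right).$$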


At each fixed point $\alpha(\tau_0)$ along a timelike curve $\alpha$, the world velocity $U(\tau_0)$ is a future-directed unit timelike vector. It is often used as the timelike vector $e_4$ in some admissible basis for $\mathcal{M}$. Relative to such a basis, $U(\tau_0) = (0, 0, 0, 1)$. Letting $x_0^4 = x^4(\tau_0)$ we find

$$\left( \frac{dx^i}{dx^4} \right)_{x^4 = x_0^4} = 0, \quad i = 1, 2, 3,$$

and so $\beta(x_0^4) = 0$ and $\gamma(x_0^4) = 1$. One says that the frame of reference corresponding to such a basis is “momentarily at rest” (at $x^4 = x_0^4$). Any such frame of reference is called an instantaneous rest frame for $\alpha$ at $\alpha(\tau_0)$, a notion of importance in physics. In an instantaneous rest frame one then has $g(A, A) = |\vec{a}|^2$, where $\vec{a} = d\vec{u}/dx^4$ is the 3-acceleration in $S$, defined as the ordinary derivative of the 3-velocity in Eq. (6).

2.5 Spacelike Vectors

This part with theory has been taken, with some rewriting, from Naber [1] pages 61-63.

If the displacement $x - x_0$ between two events $x$ and $x_0$ is spacelike, i.e., $Q(x - x_0) > 0$, then $x$ lies outside the null cone at $x_0$. There is no admissible basis in which the spatial separation of the two events is zero, i.e., no admissible observer can experience both events: to do so, one would have to travel faster than the speed of light.

If $S$ is an admissible frame in which the temporal separation $\Delta x^4$ of $x$ and $x_0$ is some $r \in \mathbb{R}$, an observer in another admissible frame $\bar S$ will, in general, not agree on the temporal order of $x$ and $x_0$.

Definition 2.47 (Proper Spatial Separation S). For any two events $x$ and $x_0$ for which $Q(x - x_0) > 0$ (spacelike vector), one defines the proper spatial separation $S(x - x_0)$ of $x$ and $x_0$ by

\[
S(x - x_0) = \sqrt{Q(x - x_0)}.
\]

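Suppose that the spacelike displacement vector $x - x_0$ is orthogonal to a timelike line $T$ containing $x_0$, and let $x_1$ and $x_2$ be the events at which $T$ meets the null cone at $x$ ($x_1$ before $x_0$, $x_2$ after $x_0$). Then the proper spatial separation is

\[
S(x - x_0) = \tfrac{1}{2}\left(\tau(x_0 - x_1) + \tau(x_2 - x_0)\right).
\]

Physically, this is the distance to the event $x$ as measured by an admissible observer O with worldline $T$: half the proper time that elapses between the emission and the reception of light signals connecting O with $x$.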

Suppose now that $v$ and $\omega$ are nonzero vectors in $\mathcal{M}$ with $v \cdot \omega = 0$. If $v$ and $\omega$ are null, then they must be parallel. If, on the other hand, $v$ is timelike, then $\omega$ must be spacelike. And if $v$ and $\omega$ are both spacelike, then their proper spatial lengths satisfy the Pythagorean Theorem $S^2(v + \omega) = S^2(v) + S^2(\omega)$. This is proved in the Appendix, Exercise 1.5.3.


2.6 Some Classical Examples

2.6.1 Time dilation and relativity of simultaneity


Consider two admissible frames of reference $S$ and $\bar S$ with admissible bases $\{e_a\}$ and $\{\bar e_a\}$, respectively. Two events on the worldline of a point that is spatially (physically) at rest in $\bar S$ satisfy $\Delta\bar x^1 = \Delta\bar x^2 = \Delta\bar x^3 = 0$, and $\Delta\bar x^4$ is the time difference between these two events in $\bar S$. The coordinate differences in $S$ are then

\[
\Delta x^b = \Lambda_a{}^b\,\Delta\bar x^a = \Lambda_4{}^b\,\Delta\bar x^4.
\]

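From this and the fact that $\Lambda_4{}^4$ and $\Lambda^4{}_4$ are nonzero it follows that the ratio

\[
\frac{\Delta x^i}{\Delta x^4} = \frac{\Lambda_4{}^i}{\Lambda_4{}^4} = -\frac{\Lambda^4{}_i}{\Lambda^4{}_4}, \quad i = 1,2,3,
\]

is constant and independent of the particular point at rest in $\bar S$. Physically, these ratios are interpreted as the components of the ordinary velocity 3-vector of $\bar S$ relative to $S$:

\[
\vec u = u^1 e_1 + u^2 e_2 + u^3 e_3, \quad\text{where } u^i = \frac{\Lambda_4{}^i}{\Lambda_4{}^4} = -\frac{\Lambda^4{}_i}{\Lambda^4{}_4}, \quad i = 1,2,3. \tag{6}
\]

Similarly, the velocity 3-vector of $S$ relative to $\bar S$ is

\[
\vec{\bar u} = \bar u^1 \bar e_1 + \bar u^2 \bar e_2 + \bar u^3 \bar e_3, \quad\text{where } \bar u^i = \frac{\Lambda^i{}_4}{\Lambda^4{}_4} = -\frac{\Lambda_i{}^4}{\Lambda_4{}^4}, \quad i = 1,2,3.
\]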

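Next we observe that

\[
\sum_{i=1}^{3}\left(\frac{\Delta x^i}{\Delta x^4}\right)^2 = \left(\Lambda^4{}_4\right)^{-2}\sum_{i=1}^{3}\left(\Lambda^4{}_i\right)^2 = \left(\Lambda^4{}_4\right)^{-2}\left[\left(\Lambda^4{}_4\right)^2 - 1\right].
\]

Similarly,

\[
\sum_{i=1}^{3}\left(\frac{\Delta\bar x^i}{\Delta\bar x^4}\right)^2 = \left(\Lambda^4{}_4\right)^{-2}\left[\left(\Lambda^4{}_4\right)^2 - 1\right].
\]

Physically, we interpret these equalities as asserting that the velocity of $\bar S$ relative to $S$ and the velocity of $S$ relative to $\bar S$ have the same constant magnitude, which we denote by $\beta$. Thus $\beta^2 = 1 - (\Lambda^4{}_4)^{-2}$; in particular, $0 \le \beta^2 < 1$, and $\beta = 0$ if and only if $\Lambda$ is a rotation. Solving for $\Lambda^4{}_4$ (assuming $\Lambda$ is orthochronous) yields

\[
\Lambda^4{}_4 = \Lambda_4{}^4 = (1 - \beta^2)^{-1/2} = \gamma. \tag{7}
\]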

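Definition 2.48 (Direction 3-vector, direction cosine). Assuming that $\Lambda$ is not a rotation, one can define the direction 3-vector $\vec d$ of $\bar S$ relative to $S$ by

\[
\vec u = \beta\vec d = \beta(d^1 e_1 + d^2 e_2 + d^3 e_3), \quad d^i = u^i/\beta, \tag{8}
\]

where the $d^i$ are the direction cosines of the directed line segment along which the observer in $S$ sees $\bar S$ moving. Similarly, the direction 3-vector (and direction cosines) of $S$ relative to $\bar S$ is defined by

\[
\vec{\bar u} = \beta\vec{\bar d} = \beta(\bar d^1 \bar e_1 + \bar d^2 \bar e_2 + \bar d^3 \bar e_3), \quad \bar d^i = \bar u^i/\beta.
\]

Comparing (6) and (8) and using (7) we obtain

\[
\Lambda_4{}^i = -\Lambda^4{}_i = \beta(1 - \beta^2)^{-1/2}\, d^i, \quad i = 1,2,3, \tag{9}
\]

and similarly

\[
\Lambda^i{}_4 = -\Lambda_i{}^4 = \beta(1 - \beta^2)^{-1/2}\, \bar d^i, \quad i = 1,2,3. \tag{10}
\]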

Equations (7), (9) and (10) give the last row and column of $\Lambda$ in terms of physically measurable quantities. From (2) we obtain

\[
\Delta\bar x^4 = -\beta\gamma\,(d^1\Delta x^1 + d^2\Delta x^2 + d^3\Delta x^3) + \gamma\,\Delta x^4
\]

for any two events. In the special case of two events on the worldline of a point at rest in $S$ ($\Delta x^1 = \Delta x^2 = \Delta x^3 = 0$) this gives

\[
\Delta\bar x^4 = \gamma\,\Delta x^4 = \frac{1}{\sqrt{1 - \beta^2}}\,\Delta x^4.
\]

In particular, $\Delta\bar x^4 = \Delta x^4$ if and only if $\Lambda$ is a rotation. Any relative motion of $S$ and $\bar S$ gives rise to a time dilation effect according to the relation $\Delta\bar x^4 > \Delta x^4$.

Another interesting special case is that of two events which are simultaneous in $S$, i.e., $\Delta x^4 = 0$. Then

\[
\Delta\bar x^4 = -\beta\gamma\,(d^1\Delta x^1 + d^2\Delta x^2 + d^3\Delta x^3).
\]

Assuming that $\beta \neq 0$, this gives, in general, $\Delta\bar x^4 \neq 0$, meaning that the two events are not simultaneous in $\bar S$. The two observers agree on the simultaneity if and only if the spatial locations of the events satisfy a very special relation with respect to the direction along which $\bar S$ is moving, namely

\[
d^1\Delta x^1 + d^2\Delta x^2 + d^3\Delta x^3 = 0.
\]

This phenomenon is called the relativity of simultaneity.

2.6.2 Special Lorentz Transformation and Boosts


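We now look at the elements of the Lorentz group $\mathcal{L}$ whose direction cosines are given by $d^1 = 1$, $\bar d^1 = -1$ and $d^2 = \bar d^2 = d^3 = \bar d^3 = 0$, so that the direction vectors are $\vec d = e_1$ and $\vec{\bar d} = -\bar e_1$. This corresponds to the situation where the observer in $S$ sees $\bar S$ moving in the positive $x^1$-direction, and the observer in $\bar S$ sees $S$ moving in the negative $\bar x^1$-direction. The origins of both systems coincide at $x^4 = \bar x^4 = 0$, and two of the three spatial coordinates will be the same in both frames of reference.

Now, from Eqs. (7), (9) and (10) we find that the matrix of such a Lorentz transformation $\Lambda$ must have the form

\[
\Lambda = \begin{pmatrix}
\Lambda^1{}_1 & \Lambda^1{}_2 & \Lambda^1{}_3 & -\beta\gamma \\
\Lambda^2{}_1 & \Lambda^2{}_2 & \Lambda^2{}_3 & 0 \\
\Lambda^3{}_1 & \Lambda^3{}_2 & \Lambda^3{}_3 & 0 \\
-\beta\gamma & 0 & 0 & \gamma
\end{pmatrix},
\]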


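and, using the orthogonality conditions (1), $\Lambda$ must take the form

\[
\Lambda = \begin{pmatrix}
\gamma & 0 & 0 & -\beta\gamma \\
0 & \Lambda^2{}_2 & \Lambda^2{}_3 & 0 \\
0 & \Lambda^3{}_2 & \Lambda^3{}_3 & 0 \\
-\beta\gamma & 0 & 0 & \gamma
\end{pmatrix},
\]

where the $(2 \times 2)$-matrix $[\Lambda^i{}_j]_{i,j=2,3}$ is a rotation of the plane $\mathbb{R}^2$.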

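Definition 2.49 (Special Lorentz Transformation). Any Lorentz transformation with $\Lambda^2{}_4 = \Lambda^3{}_4 = \Lambda^4{}_2 = \Lambda^4{}_3 = 0$ and $[\Lambda^i{}_j]_{i,j=2,3}$ equal to the $(2 \times 2)$ identity matrix is called a special Lorentz transformation. In matrix form it is written as

\[
\Lambda = \begin{pmatrix}
\gamma & 0 & 0 & -\beta\gamma \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-\beta\gamma & 0 & 0 & \gamma
\end{pmatrix}, \tag{11}
\]

with the associated coordinate transformation

\[
\begin{aligned}
\bar x^1 &= (1 - \beta^2)^{-1/2}\,x^1 - \beta(1 - \beta^2)^{-1/2}\,x^4, \\
\bar x^2 &= x^2, \\
\bar x^3 &= x^3, \\
\bar x^4 &= -\beta(1 - \beta^2)^{-1/2}\,x^1 + (1 - \beta^2)^{-1/2}\,x^4.
\end{aligned} \tag{12}
\]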

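The inverse is

\[
\Lambda^{-1} = \begin{pmatrix}
\gamma & 0 & 0 & \beta\gamma \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
\beta\gamma & 0 & 0 & \gamma
\end{pmatrix},
\]

and the corresponding coordinate transformation is

\[
\begin{aligned}
x^1 &= (1 - \beta^2)^{-1/2}\,\bar x^1 + \beta(1 - \beta^2)^{-1/2}\,\bar x^4, \\
x^2 &= \bar x^2, \\
x^3 &= \bar x^3, \\
x^4 &= \beta(1 - \beta^2)^{-1/2}\,\bar x^1 + (1 - \beta^2)^{-1/2}\,\bar x^4.
\end{aligned}
\]

A special Lorentz transformation allows any $-1 < \beta < 1$. By choosing $\beta > 0$ when $\Lambda^1{}_4 < 0$ and $\beta < 0$ when $\Lambda^1{}_4 > 0$, all special Lorentz transformations can be written in the form (11).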

Definition 2.50 (Boost). For each real number $\beta$ with $-1 < \beta < 1$ we define $\gamma = \gamma(\beta) = (1 - \beta^2)^{-1/2}$ and

\[
\Lambda(\beta) = \begin{pmatrix}
\gamma & 0 & 0 & -\beta\gamma \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-\beta\gamma & 0 & 0 & \gamma
\end{pmatrix}.
\]

The matrix $\Lambda(\beta)$ is called a boost in the $x^1$-direction.

The composition of two boosts in the $x^1$-direction is another boost in the $x^1$-direction. Since also $\Lambda^{-1}(\beta) = \Lambda(-\beta)$, the collection of such special Lorentz transformations (boosts) forms a subgroup of the Lorentz group $\mathcal{L}$.

Note that the composition of two boosts in two different directions is, in general, not equivalent to a single boost in any direction.

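Suppose now that $-1 < \beta_1 \le \beta_2 < 1$. Then

\[
\left|\frac{\beta_1 + \beta_2}{1 + \beta_1\beta_2}\right| < 1, \tag{13}
\]

and

\[
\Lambda(\beta_1)\Lambda(\beta_2) = \Lambda\!\left(\frac{\beta_1 + \beta_2}{1 + \beta_1\beta_2}\right). \tag{14}
\]

Both Eq. (13) and Eq. (14) are proven in the Appendix, Exercise 1.3.14.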
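The composition law (14) can be checked numerically. The following is a minimal sketch in plain Python, assuming coordinates ordered $(x^1, x^2, x^3, x^4)$ as in (11); the helper names `boost`, `matmul`, `add_velocities` and `max_abs_diff`, and the sample speeds, are illustrative choices and not part of the text.

```python
import math

def boost(beta):
    """4x4 matrix of the boost in the x1-direction, Eq. (11)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, 0.0, 0.0, -beta * gamma],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [-beta * gamma, 0.0, 0.0, gamma],
    ]

def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def add_velocities(beta1, beta2):
    """Relativistic addition of velocities: the argument of Lambda in Eq. (14)."""
    return (beta1 + beta2) / (1.0 + beta1 * beta2)

def max_abs_diff(a, b):
    """Largest entrywise difference between two 4x4 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

beta1, beta2 = 0.6, 0.7            # sample speeds in units of c (arbitrary)
composed = matmul(boost(beta1), boost(beta2))
single = boost(add_velocities(beta1, beta2))
print(add_velocities(beta1, beta2))    # combined speed, still below 1
print(max_abs_diff(composed, single))  # zero up to rounding
```

Although $0.6 + 0.7 > 1$, the combined speed stays below 1, in line with (13).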

The physical interpretation is the following: if the speed of $\bar S$ relative to $S$ is $\beta_1$ and the speed of $\hat S$ relative to $\bar S$ is $\beta_2$, then the speed of $\hat S$ relative to $S$ is not $\beta_1 + \beta_2$, but rather

\[
\frac{\beta_1 + \beta_2}{1 + \beta_1\beta_2},
\]

which, for positive speeds, is always less than $\beta_1 + \beta_2$ unless $\beta_1\beta_2 = 0$. Equation (14) is called the relativistic addition of velocities formula [1] page 29, [2].

Although velocities are not directly additive, one can define a velocity parameter $\theta$ that is additive: if the velocity parameter of $\bar S$ relative to $S$ is $\theta_1$ and that of $\hat S$ relative to $\bar S$ is $\theta_2$, then the velocity parameter of $\hat S$ relative to $S$ is $\theta_1 + \theta_2$. The speed $\beta$ is then in one-to-one correspondence with $\theta$, $\beta = f(\theta)$. Additivity and (14) require that $f$ satisfy the functional equation

\[
f(\theta_1 + \theta_2) = \frac{f(\theta_1) + f(\theta_2)}{1 + f(\theta_1)f(\theta_2)}.
\]

This formula is reminiscent of the addition formula for the hyperbolic tangent, which suggests the change of variable

\[
\beta = \tanh(\theta) \quad\text{or}\quad \theta = \tanh^{-1}\beta.
\]

The hyperbolic form of the Lorentz transformation $\Lambda(\beta)$ [1] page 30, [2] (proven in the Appendix, Exercise 1.3.16) is

\[
L(\theta) = \begin{pmatrix}
\cosh(\theta) & 0 & 0 & -\sinh(\theta) \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-\sinh(\theta) & 0 & 0 & \cosh(\theta)
\end{pmatrix}.
\]
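The additivity of the velocity parameter can be illustrated with a few lines of Python. This is only a sketch: the function names `rapidity` and `hyperbolic_entries` and the sample speeds are assumptions made for the illustration.

```python
import math

def rapidity(beta):
    """Velocity parameter theta = artanh(beta), defined for -1 < beta < 1."""
    return math.atanh(beta)

def hyperbolic_entries(theta):
    """The two independent entries (cosh, sinh) of L(theta)."""
    return math.cosh(theta), math.sinh(theta)

beta1, beta2 = 0.6, 0.7                      # sample speeds (arbitrary)
theta_total = rapidity(beta1) + rapidity(beta2)
beta_total = math.tanh(theta_total)          # speed belonging to theta1 + theta2

# cosh(theta) and sinh(theta) reproduce gamma and beta*gamma of the boost (11)
gamma1 = 1.0 / math.sqrt(1.0 - beta1 ** 2)
cosh1, sinh1 = hyperbolic_entries(rapidity(beta1))
print(beta_total, (beta1 + beta2) / (1.0 + beta1 * beta2))
print(cosh1 - gamma1, sinh1 - beta1 * gamma1)
```

Adding the velocity parameters and converting back with tanh reproduces the relativistic sum of the two speeds, while the entries of $L(\theta)$ agree with those of $\Lambda(\beta)$.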

The boosts and the rotations, as subgroups, are building blocks of the Lorentz group. This is expressed in the following theorem.

Theorem 2.51 ([1] page 30). Let $\Lambda = [\Lambda^a{}_b]_{a,b=1,2,3,4}$ be a proper, orthochronous Lorentz transformation. Then there exist a real number $\theta$ and two rotations $R_1, R_2 \in \mathcal{R}$ such that

\[
\Lambda = R_1 L(\theta) R_2.
\]

The physical interpretation of Theorem 2.51 is that a Lorentz transformation from $S$ to $\bar S$ can be accomplished by (1) rotating the spatial axes of $S$ so that the positive $x^1$-direction coincides with the direction of motion of $\bar S$ relative to $S$, (2) boosting to the speed of $\bar S$ (relative to $S$), and (3) rotating the spatial axes until they coincide with those of $\bar S$.

3 Particles and Electromagnetic Fields

In this chapter we shall investigate how particles and electromagnetic fields are described in Minkowski space. The first section defines general properties of particles. In the second section we see how the charge of a particle leads to an electric and a magnetic field in Minkowski space.

3.1 Particles

All the definitions have been taken from Naber [1] pages 87-91.

Definition 3.1 (Material particle, proper mass). A material particle in $\mathcal{M}$ is a pair $(\alpha, m)$, where $\alpha : I \to \mathcal{M}$ is a timelike curve parametrized by proper time $\tau$ and $m$ is a positive real number called the proper mass of the particle.

One interprets the curve α as the trajectory of the particle.

Definition 3.2 (Free material particle). A free material particle is a material particle where

α has the form α(τ) = x0 + τU for some fixed event x0 and unit timelike world velocity vector

U(τ).

Definition 3.3 (World Momentum). The world momentum (or the energy-momentum) of a

material particle (α,m) is denoted by P and is defined by

\[
P = P(\tau) = mU(\tau).
\]

Notice that $P \cdot P = -m^2$.

Definition 3.4 (Relative 3-momentum). The world momentum in an arbitrary admissible basis $\{e_a\}$ is $P = P^a e_a$, or in other notation

\[
P = (P^1, P^2, P^3, P^4) = m\gamma(\vec u, 1) = (\vec p, m\gamma),
\]

where $\vec p = (P^1, P^2, P^3)$ is called the relative 3-momentum. The magnitude of the 3-momentum is given by its Euclidean norm, $|\vec p|^2 = (P^1)^2 + (P^2)^2 + (P^3)^2$.

The quantity $m\gamma = m/\sqrt{1 - \beta^2}$ is sometimes referred to as the "relativistic mass" of $(\alpha, m)$ relative to $\{e_a\}$.


Definition 3.5 (Total relativistic energy). The total relativistic energy of $(\alpha, m)$ in the basis $\{e_a\}$, denoted by $E$, is defined by

\[
E = -P \cdot e_4 = P^4 = m\gamma = m + \tfrac{1}{2}m\beta^2 + \dots
\]

When $\beta = 0$, the observer is in the instantaneous rest frame of the particle, and the total relativistic energy is $E = P^4 = m$.

Let $N$ be a future-directed null vector in $\mathcal{M}$, written as $N = N^a e_a$ in an admissible basis $\{e_a\}$. Then

\[
N = \epsilon(\vec e + e_4), \tag{15}
\]

where $\epsilon = -N \cdot e_4 = N^4$ and $\vec e$ is the direction 3-vector of $N$ relative to $\{e_a\}$, i.e.,

\[
\vec e = \left((N^1)^2 + (N^2)^2 + (N^3)^2\right)^{-1/2}\left(N^1 e_1 + N^2 e_2 + N^3 e_3\right).
\]

This property is proven in the Appendix, Exercise 1.8.3.

Definition 3.6 (Massless particle, photon). A photon, or in general a massless particle, in

M is a pair (α,N). A future-directed null vector N is called the photon world momentum

and α ∶ I →M is a null curve given by α(t) = x0 + tN for some fixed event x0 in M and all

t ∈ I.

A free particle is, in general, either a free material particle or a photon.

Definition 3.7 (Energy, frequency and wavelength of a photon). Relative to any admissible basis $\{e_a\}$ the positive real number

\[
\epsilon = -N \cdot e_4 = N^4
\]

is called the energy of the photon in $\{e_a\}$. The frequency $\nu$ and wavelength $\lambda$ of the photon in $\{e_a\}$ are defined by $\nu = \epsilon/h$ and $\lambda = 1/\nu$, where $h$ is the Planck constant.

3.2 Charged particles and Electromagnetic Fields

This section has been taken, with some rewriting, from Naber [1] pages 100-107 and 113-121.

Definition 3.8 (Charged Particle). A charged particle in $\mathcal{M}$ is a triple $(\alpha, m, e)$, where $(\alpha, m)$ is a material particle and $e$ is a nonzero real number called the charge of the particle. A free charged particle is a charged particle for which $(\alpha, m)$ is a free material particle.

Definition 3.9 (Skew-Symmetry). A linear transformation $F : \mathcal{M} \to \mathcal{M}$ which satisfies

\[
Fx \cdot y = -x \cdot Fy
\]

for all $x$ and $y$ in $\mathcal{M}$ is called skew-symmetric. In particular,

\[
Fx \cdot x = x \cdot Fx = 0. \tag{16}
\]


The Lorentz World Force Law is an equation of motion expressing the rate at which the particle's world momentum changes at each point of its curve as a linear function of the particle's world velocity. Mathematically it is expressed as

\[
\frac{dP}{d\tau} = eFU, \tag{17}
\]

where $U = U(\tau)$ is the particle's world velocity, $P = mU$ its world momentum, and $F : \mathcal{M} \to \mathcal{M}$ is a skew-symmetric linear transformation.

To see that $F$ has to be skew-symmetric, we rewrite the Lorentz world force law as

\[
FU = \frac{m}{e}\frac{dU}{d\tau} = \frac{m}{e}A.
\]

Dotting both sides with $U$ gives

\[
FU \cdot U = \frac{m}{e}A \cdot U = 0,
\]

by Eq. (5). Hence $Fu \cdot u$ is zero for all timelike $u$ in $\mathcal{M}$, which implies skew-symmetry of $F$.

Representing the skew-symmetric linear transformation $F$ as a matrix relative to an arbitrary admissible basis $\{e_a\}$ for $\mathcal{M}$, we have $Fe_b = F^a{}_b e_a = F^1{}_b e_1 + F^2{}_b e_2 + F^3{}_b e_3 + F^4{}_b e_4$. From skew-symmetry (16) we get $F^a{}_a = 0$ for $a = 1,2,3,4$ (no summation), i.e., the diagonal of the matrix of $F$ is zero. Also, $F^i{}_j = -F^j{}_i$ for $i,j = 1,2,3$ and $F^4{}_i = F^i{}_4$. The matrix of $F$ is then

\[
[F^a{}_b] = \begin{pmatrix}
0 & F^1{}_2 & F^1{}_3 & F^1{}_4 \\
-F^1{}_2 & 0 & F^2{}_3 & F^2{}_4 \\
-F^1{}_3 & -F^2{}_3 & 0 & F^3{}_4 \\
F^1{}_4 & F^2{}_4 & F^3{}_4 & 0
\end{pmatrix}. \tag{18}
\]

Relating it to the two 3-vectors for the electric field $\vec E = E^1 e_1 + E^2 e_2 + E^3 e_3$ and the magnetic field $\vec B = B^1 e_1 + B^2 e_2 + B^3 e_3$, we get $E^1 = F^1{}_4$, $E^2 = F^2{}_4$, $E^3 = F^3{}_4$, $B^1 = F^2{}_3$, $B^2 = -F^1{}_3$ and $B^3 = F^1{}_2$. This is well known from physics and can be found in [6] pages 569-573. Note that [6] uses $F_{ab}$ instead of $F^a{}_b$. The skew-symmetric matrix (18) of the Lorentz force $F$ is then written as

\[
[F^a{}_b] = \begin{pmatrix}
0 & B^3 & -B^2 & E^1 \\
-B^3 & 0 & B^1 & E^2 \\
B^2 & -B^1 & 0 & E^3 \\
E^1 & E^2 & E^3 & 0
\end{pmatrix}. \tag{19}
\]

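The quantities $\bar F^a{}_b$, $\bar E^i$ and $\bar B^i$, $i = 1,2,3$, are defined in the same way relative to another admissible basis $\{\bar e_a\}$. Thus if $\Lambda$ is a special Lorentz transformation, or more precisely a boost in the $x^1$-direction, that carries $\{e_a\}$ to $\{\bar e_a\}$, one gets $\bar F^a{}_b = \Lambda^a{}_\alpha \Lambda_b{}^\beta F^\alpha{}_\beta$, that is,

\[
\begin{aligned}
\bar E^1 &= E^1, & \bar E^2 &= \gamma(E^2 - \beta B^3), & \bar E^3 &= \gamma(E^3 + \beta B^2), \\
\bar B^1 &= B^1, & \bar B^2 &= \gamma(\beta E^3 + B^2), & \bar B^3 &= -\gamma(\beta E^2 - B^3).
\end{aligned}
\]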
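As a numerical cross-check of these component formulas, one can conjugate the matrix (19) with the boost (11), since $\Lambda_b{}^\beta$ denotes the entries of the inverse transformation. The sketch below does this in plain Python; the helper names (`field_matrix`, `boost`, `matmul`, `transform_fields`) and the sample field are illustrative assumptions.

```python
import math

def field_matrix(E, B):
    """Matrix (19) of F built from the 3-vectors E and B (index order 1,2,3,4)."""
    E1, E2, E3 = E
    B1, B2, B3 = B
    return [
        [0.0, B3, -B2, E1],
        [-B3, 0.0, B1, E2],
        [B2, -B1, 0.0, E3],
        [E1, E2, E3, 0.0],
    ]

def boost(beta):
    """Boost in the x1-direction, Eq. (11)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, 0.0, 0.0, -beta * g], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [-beta * g, 0.0, 0.0, g]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform_fields(E, B, beta):
    """Read E-bar and B-bar off Lambda F Lambda^{-1}, using Lambda^{-1}(beta) = Lambda(-beta)."""
    Fbar = matmul(matmul(boost(beta), field_matrix(E, B)), boost(-beta))
    E_bar = (Fbar[0][3], Fbar[1][3], Fbar[2][3])
    B_bar = (Fbar[1][2], -Fbar[0][2], Fbar[0][1])
    return E_bar, B_bar

# A purely electric field in one frame acquires a magnetic part in the boosted frame.
E_bar, B_bar = transform_fields((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), 0.6)
print(E_bar, B_bar)
```

For $\beta = 0.6$ (so $\gamma = 1.25$) the formulas above predict $\bar E^2 = \gamma E^2$ and $\bar B^3 = -\gamma\beta E^2$, which is what the matrix product returns up to rounding.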


Physically, one sees that even if the electric field (or the magnetic field) is zero for one observer, it is in general not zero for another. This result is shown in the Appendix, Exercise 2.2.1. The next theorem is also proven in the Appendix, Exercise 2.2.5.

Theorem 3.10 ([1] page 106). Let $F : \mathcal{M} \to \mathcal{M}$ be a skew-symmetric linear transformation and $\{e_a\}$ an arbitrary admissible basis for $\mathcal{M}$. With the matrix $[F^a{}_b]$ written in the form (19) and the $4 \times 4$ identity matrix $I$ we have

\[
\det\left([F^a{}_b] - \lambda I\right) = \lambda^4 + \left(|\vec B|^2 - |\vec E|^2\right)\lambda^2 - \left(\vec E \cdot \vec B\right)^2,
\]

where $|\vec E|^2 = (E^1)^2 + (E^2)^2 + (E^3)^2$, $|\vec B|^2 = (B^1)^2 + (B^2)^2 + (B^3)^2$ and $\vec E \cdot \vec B = E^1B^1 + E^2B^2 + E^3B^3$.

Consequently, the eigenvalues of $F$ are the real solutions of the equation

\[
\lambda^4 + \left(|\vec B|^2 - |\vec E|^2\right)\lambda^2 - \left(\vec E \cdot \vec B\right)^2 = 0.
\]

Moreover, the algebraic combinations $|\vec B|^2 - |\vec E|^2$ and $\vec E \cdot \vec B$ are invariant under Lorentz transformations. We say that $F$ is null if $|\vec B|^2 - |\vec E|^2 = \vec E \cdot \vec B = 0$; otherwise $F$ is called regular.
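The invariance of the two combinations can be tested on an example by applying the transformation rules for a boost in the $x^1$-direction, and the real eigenvalues follow from solving the quadratic in $\lambda^2$. The following Python sketch does both; the function names and the sample field values are assumptions chosen for illustration.

```python
import math

def boost_fields(E, B, beta):
    """Fields in the frame boosted along x1, using the component formulas of this section."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    E_bar = (E[0], g * (E[1] - beta * B[2]), g * (E[2] + beta * B[1]))
    B_bar = (B[0], g * (beta * E[2] + B[1]), -g * (beta * E[1] - B[2]))
    return E_bar, B_bar

def invariants(E, B):
    """Return (|B|^2 - |E|^2, E.B), the two coefficients appearing in Theorem 3.10."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    return dot(B, B) - dot(E, E), dot(E, B)

def real_eigenvalues(E, B):
    """Real roots of lambda^4 + p*lambda^2 - q^2 = 0, where (p, q) = invariants(E, B)."""
    p, q = invariants(E, B)
    lam_sq = (-p + math.sqrt(p * p + 4.0 * q * q)) / 2.0  # non-negative root for lambda^2
    lam = math.sqrt(max(lam_sq, 0.0))
    return (-lam, lam)

E, B = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)       # sample field (arbitrary)
print(invariants(E, B))
print(invariants(*boost_fields(E, B, 0.6)))    # same pair up to rounding
print(real_eigenvalues((0.0, 0.0, 2.0), (0.0, 0.0, 3.0)))
```

For parallel fields of magnitudes 2 (electric) and 3 (magnetic), the real eigenvalues come out as plus and minus the electric magnitude.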

For a nonzero skew-symmetric linear transformation $F$, null or regular, there exists a basis for $\mathcal{M}$, called a canonical basis for $F$, relative to which the matrix of $F$ is

\[
F_N = \begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & \alpha & 0 \\
0 & -\alpha & 0 & \alpha \\
0 & 0 & \alpha & 0
\end{pmatrix}
\quad\text{or}\quad
F_R = \begin{pmatrix}
0 & \delta & 0 & 0 \\
-\delta & 0 & 0 & 0 \\
0 & 0 & 0 & \epsilon \\
0 & 0 & \epsilon & 0
\end{pmatrix}, \tag{20}
\]

for null or regular $F$, respectively. These matrices are called the canonical forms of $F$.

Comparing with the matrix of $F$ in the form (19), we see that $F_N$ represents an observer who measures electric and magnetic 3-vectors that are perpendicular to each other and have the same magnitude, $\vec E = \alpha e_3$ and $\vec B = \alpha e_1$. The canonical form $F_R$ represents an observer who measures electric and magnetic 3-vectors in the same direction, of magnitudes $\epsilon$ and $\delta$ respectively, $\vec E = \epsilon e_3$ and $\vec B = \delta e_3$. Furthermore, since $F_R$ is regular, we can use Theorem 3.10 and obtain $|\vec B|^2 - |\vec E|^2 = \delta^2 - \epsilon^2$ and $\vec E \cdot \vec B = \delta\epsilon$.

Definition 3.11 (Principal null direction). If $F$ is regular, then there are two independent null directions spanned by eigenvectors of $F_R$, namely those of $e_3 \pm e_4$, called the principal null directions of $F_R$. If $F$ is null, then there is just one such null direction, namely that of $e_2 + e_4$, called the principal null direction of $F_N$.

Definition 3.12 (Energy-momentum transformation). Let $F$ be a nonzero, skew-symmetric linear transformation on $\mathcal{M}$. The linear transformation $T : \mathcal{M} \to \mathcal{M}$ defined by

\[
T = \frac{1}{4\pi}\left[\frac{1}{4}\operatorname{tr}\left(F^2\right) I - F^2\right]
\]

is called the energy-momentum transformation associated with $F$. Here $F^2 = F \circ F$, $I$ is the identity transformation, and $\operatorname{tr}(F^2)$ is the trace of $F^2$.


The energy-momentum transformation $T$ is symmetric with respect to the Lorentz scalar product, i.e.,

\[
Tx \cdot y = x \cdot Ty \tag{21}
\]

for all $x$ and $y$ in $\mathcal{M}$. This is proven in the Appendix, Exercise 2.5.1.

Moreover, it is trace free, i.e., $\operatorname{tr}(T) = 0$, and relative to an arbitrary admissible basis for $\mathcal{M}$ the matrix $[T^a{}_b]$ of $T$ has its entries given by

\[
T^a{}_b = \frac{1}{4\pi}\left[\frac{1}{4}F^\alpha{}_\beta F^\beta{}_\alpha\,\delta^a{}_b - F^a{}_\alpha F^\alpha{}_b\right], \quad a,b = 1,2,3,4.
\]

Relating $T$ to the two 3-vectors for the electric field $\vec E$ and the magnetic field $\vec B$ by the matrix (19), we get

\[
\begin{aligned}
T^1{}_1 &= \frac{1}{8\pi}\left[-(E^1)^2 + (E^2)^2 + (E^3)^2 - (B^1)^2 + (B^2)^2 + (B^3)^2\right], \\
T^2{}_2 &= \frac{1}{8\pi}\left[(E^1)^2 - (E^2)^2 + (E^3)^2 + (B^1)^2 - (B^2)^2 + (B^3)^2\right], \\
T^3{}_3 &= \frac{1}{8\pi}\left[(E^1)^2 + (E^2)^2 - (E^3)^2 + (B^1)^2 + (B^2)^2 - (B^3)^2\right], \\
T^4{}_4 &= -\frac{1}{8\pi}\left[|\vec E|^2 + |\vec B|^2\right].
\end{aligned}
\]

In classical electromagnetic theory, the quantity $-T^4{}_4 = \frac{1}{8\pi}\left[|\vec E|^2 + |\vec B|^2\right]$ is called the energy density of the electric and magnetic fields measured in the given frame of reference. The 3-vector

\[
\frac{1}{4\pi}\vec E \times \vec B = \frac{1}{4\pi}\left[(E^2B^3 - E^3B^2)e_1 + (E^3B^1 - E^1B^3)e_2 + (E^1B^2 - E^2B^1)e_3\right] = -\left(T^1{}_4 e_1 + T^2{}_4 e_2 + T^3{}_4 e_3\right)
\]

is the Poynting 3-vector and describes the energy density flux of the field; the minus sign follows from evaluating $(F^2)^i{}_4$ with the matrix (19). Also, the $3 \times 3$ matrix $[T^i{}_j]$, $i,j = 1,2,3$, is the Maxwell stress tensor of the field. From this we can physically interpret the energy-momentum transformation as a description of the energy content of the field $F$ (the electromagnetic field), relative to an admissible basis for $\mathcal{M}$.
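The entries of $T$ can be evaluated directly from its definition and compared with the field expressions. The sketch below, in plain Python with illustrative helper names and an arbitrary sample field, checks the energy density $-T^4{}_4$, the vanishing trace, and the relation between the mixed components $T^i{}_4$ and the components of $\frac{1}{4\pi}\vec E \times \vec B$; with the sign conventions of the matrix (19), the two agree up to an overall sign.

```python
import math

def field_matrix(E, B):
    """Matrix (19) of F built from E and B."""
    E1, E2, E3 = E
    B1, B2, B3 = B
    return [[0.0, B3, -B2, E1], [-B3, 0.0, B1, E2],
            [B2, -B1, 0.0, E3], [E1, E2, E3, 0.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def energy_momentum(E, B):
    """T = (1/4pi) [ (1/4) tr(F^2) I - F^2 ] as a 4x4 nested list."""
    F = field_matrix(E, B)
    F2 = matmul(F, F)
    quarter_trace = 0.25 * sum(F2[a][a] for a in range(4))
    return [[((quarter_trace if a == b else 0.0) - F2[a][b]) / (4.0 * math.pi)
             for b in range(4)] for a in range(4)]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0])

E, B = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)        # sample field (arbitrary)
T = energy_momentum(E, B)
energy_density = -T[3][3]                       # equals (|E|^2 + |B|^2) / (8 pi)
poynting = tuple(c / (4.0 * math.pi) for c in cross(E, B))
print(energy_density, poynting, [T[i][3] for i in range(3)])
```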

3.3 Interactions of particles and Charged particles in Electromagnetic Fields

In this section we investigate two important physical effects, namely the relativistic Doppler effect and the Compton effect. Next we investigate how charged particles behave in electromagnetic fields, both in a constant field and in a changing field. At the end we shall see that these relations actually satisfy the Maxwell equations.

3.3.1 Relativistic Doppler and Compton effect

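These examples have been taken, with some rewriting, from Naber [1] pages 90 to 97.

Let $\{e_a\}$ and $\{\bar e_a\}$ be two admissible bases, belonging to two observers $S$ and $\bar S$ respectively. A null vector $N$ as in (15) can then be written as $N = \epsilon(\vec e + e_4) = \bar\epsilon(\vec{\bar e} + \bar e_4)$, where $\epsilon = -N \cdot e_4$ and $\bar\epsilon = -N \cdot \bar e_4$. Then

\[
\bar\epsilon = \gamma\epsilon\left(1 - \beta(\vec e \cdot \vec d)\right),
\]

where $\vec d$ is the direction 3-vector of $\bar S$ relative to $S$.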


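We define $\theta$ as the angle between the direction of the photon for an observer $S$ and the direction of another observer $\bar S$ relative to $S$, i.e., $\vec e \cdot \vec d = \cos(\theta)$. From this we get the relativistic formula for the Doppler effect

\[
\frac{\bar\epsilon}{\epsilon} = \frac{\bar\nu}{\nu} = \gamma(1 - \beta\cos(\theta)) = \frac{1 - \beta\cos(\theta)}{\sqrt{1 - \beta^2}}.
\]

Some examples of interest:

\[
\begin{aligned}
\theta = 0 &\;\Rightarrow\; \frac{\bar\nu}{\nu} = \sqrt{\frac{1 - \beta}{1 + \beta}}, && (\vec d = \vec e), \\
\theta = \pi &\;\Rightarrow\; \frac{\bar\nu}{\nu} = \sqrt{\frac{1 + \beta}{1 - \beta}}, && (\vec d = -\vec e), \\
\theta = \frac{\pi}{2} &\;\Rightarrow\; \frac{\bar\nu}{\nu} = \frac{1}{\sqrt{1 - \beta^2}}, && (\vec d \cdot \vec e = 0).
\end{aligned} \tag{22}
\]

The classical theory predicts no Doppler shift in the case $\theta = \pi/2$, so the effect in Eq. (22) is called the transverse Doppler effect.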
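The three special cases can be reproduced from the general formula with a short Python sketch; the function name `doppler_ratio` and the sample speed are illustrative assumptions.

```python
import math

def doppler_ratio(beta, theta):
    """Frequency ratio nu_bar / nu = gamma * (1 - beta * cos(theta))."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (1.0 - beta * math.cos(theta))

beta = 0.6   # sample relative speed (arbitrary)
receding = doppler_ratio(beta, 0.0)             # observer moves along the photon direction
approaching = doppler_ratio(beta, math.pi)      # observer moves against the photon direction
transverse = doppler_ratio(beta, math.pi / 2)   # purely relativistic effect
print(receding, approaching, transverse)
```

For $\beta = 0.6$ the ratios are $\sqrt{0.4/1.6} = 0.5$, $\sqrt{1.6/0.4} = 2$ and $\gamma = 1.25$, up to rounding.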

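Similarly, one can define the angle $\bar\theta$ by $\cos(\bar\theta) = \vec{\bar e} \cdot \vec{\bar d}$, such that

\[
\cos(\bar\theta) = \frac{\beta - \cos(\theta)}{1 - \beta\cos(\theta)},
\]

or, defining the angle $\bar\theta' = \pi - \bar\theta$, we get the standard relativistic aberration formula

\[
\cos(\bar\theta') = \frac{\cos(\theta) - \beta}{1 - \beta\cos(\theta)}.
\]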

Definition 3.13 (Contact interaction). A contact interaction in $\mathcal{M}$ is a triple $(\mathcal{A}, x, \tilde{\mathcal{A}})$, where $\mathcal{A}$ and $\tilde{\mathcal{A}}$ are two finite sets of free particles, neither of which contains a pair of particles with linearly dependent world momenta, and $x$ is an event such that

(a) $x$ is the terminal point of all the particles in $\mathcal{A}$, i.e., for each free particle in $\mathcal{A}$ with $\alpha : [a, b] \to \mathcal{M}$, we have $\alpha(b) = x$,

(b) $x$ is the initial point of all the particles in $\tilde{\mathcal{A}}$, i.e., for each free particle in $\tilde{\mathcal{A}}$ with $\alpha : [a, b] \to \mathcal{M}$, we have $\alpha(a) = x$,

(c) the total world momentum of $\mathcal{A}$, i.e., the sum of the world momenta of all free particles in $\mathcal{A}$, equals the total world momentum of $\tilde{\mathcal{A}}$.

Physically, $x$ is the event of a collision of the particles in $\mathcal{A}$, and from this collision the particles in $\tilde{\mathcal{A}}$ emerge.

If a photon collides with a material particle, for example an electron with proper mass $m_e$, the photon is either absorbed by the electron or it changes its direction and its wavelength; the latter is the Compton effect, or Compton scattering. If the photon is deflected by an angle $\theta$ from its original path, the change in wavelength is

\[
\lambda' - \lambda = \frac{2h}{m_e}\sin^2(\theta/2),
\]

where the maximal shift $\Delta\lambda_{\max} = 2h/m_e$ is twice the Compton wavelength $h/m_e$ of the electron.
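A small Python sketch of the wavelength shift, in units with $c = 1$ as used throughout; the function name `compton_shift` and the default value `planck=1.0` are illustrative assumptions, not values from the text.

```python
import math

def compton_shift(theta, electron_mass, planck=1.0):
    """Wavelength shift lambda' - lambda = (2h / m_e) * sin^2(theta / 2), with c = 1."""
    return 2.0 * planck / electron_mass * math.sin(theta / 2.0) ** 2

m_e = 2.0   # sample proper mass in arbitrary units
forward = compton_shift(0.0, m_e)              # no deflection: no shift
right_angle = compton_shift(math.pi / 2, m_e)  # deflection by 90 degrees
backward = compton_shift(math.pi, m_e)         # back-scattering: maximal shift
print(forward, right_angle, backward)
```

The shift grows monotonically with the deflection angle and reaches $2h/m_e$ for back-scattering.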

Illustration of Compton scattering on an electron.

The Compton effect is important in physics because the classical theory of the scattering of light could not explain the shifts in wavelength that are observed at low intensity. To explain this effect, light must behave as if it consists of particles, or quanta, whose energy is proportional to the frequency.

3.3.2 Constant fields


We will now consider the simplest of all electromagnetic fields, namely constant electromagnetic fields. To simplify things even further, we look at fields which are purely electric ($\vec B = 0$) or purely magnetic ($\vec E = 0$). If the skew-symmetric linear transformation $F : \mathcal{M} \to \mathcal{M}$ is null, then it can be of this kind only if it is identically zero. Therefore, we will work with the regular case of the canonical form.

In general we have two real numbers $\epsilon \ge 0$ and $\delta \ge 0$ and an admissible basis $\{e_a\}$ for $\mathcal{M}$ such that the matrix of $F$ relative to $\{e_a\}$ takes the form $F_R$ in (20). As we have seen before, the fields $\vec E = \epsilon e_3$ and $\vec B = \delta e_3$ are then parallel to each other.

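Let $(\alpha, m, e)$ be a charged particle with world velocity $U = U(\tau) = U^a(\tau)e_a$ which satisfies the Lorentz force law (17) at each point of $\alpha$:

\[
\begin{pmatrix} dU^1/d\tau \\ dU^2/d\tau \\ dU^3/d\tau \\ dU^4/d\tau \end{pmatrix}
= \frac{e}{m}
\begin{pmatrix}
0 & \delta & 0 & 0 \\
-\delta & 0 & 0 & 0 \\
0 & 0 & 0 & \epsilon \\
0 & 0 & \epsilon & 0
\end{pmatrix}
\begin{pmatrix} U^1 \\ U^2 \\ U^3 \\ U^4 \end{pmatrix}
=
\begin{pmatrix} \omega U^2 \\ -\omega U^1 \\ \nu U^4 \\ \nu U^3 \end{pmatrix},
\]

that is,

\[
\frac{dU^1}{d\tau} = \omega U^2, \qquad \frac{dU^2}{d\tau} = -\omega U^1, \tag{23}
\]

and

\[
\frac{dU^3}{d\tau} = \nu U^4, \qquad \frac{dU^4}{d\tau} = \nu U^3, \tag{24}
\]

where $\omega = \delta e/m$ and $\nu = \epsilon e/m$.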


The solutions to these differential equations for purely magnetic fields ($\epsilon = 0$, $\delta \neq 0$) are

\[
\alpha(\tau) = \left(a\sin(\omega\tau + \varphi) + x^1_0,\; a\cos(\omega\tau + \varphi) + x^2_0,\; C^3\tau + x^3_0,\; C^4\tau + x^4_0\right),
\]

so that

\[
U(\tau) = \left(a\omega\cos(\omega\tau + \varphi),\; -a\omega\sin(\omega\tau + \varphi),\; C^3,\; C^4\right).
\]

Here $x_0 = (x^1_0, x^2_0, x^3_0, x^4_0)$ is a point in Minkowski space $\mathcal{M}$, and $a > 0$, $\varphi$, $C^3$, $C^4$ are constants. If $C^3 \neq 0$, then the trajectory of the particle in the $\{e_1, e_2, e_3\}$-space is a spiral in the $e_3$-direction (i.e., along the magnetic field lines). If $C^3 = 0$, the trajectory is a circle.

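For a purely electric field ($\epsilon \neq 0$, $\delta = 0$) we have the solutions

\[
\alpha(\tau) = \left(C^1\tau + x^1_0,\; C^2\tau + x^2_0,\; \frac{C^3}{\nu}\cosh(\nu\tau) + \frac{C^4}{\nu}\sinh(\nu\tau) + x^3_0,\; \frac{C^3}{\nu}\sinh(\nu\tau) + \frac{C^4}{\nu}\cosh(\nu\tau) + x^4_0\right).
\]

Thus

\[
U(\tau) = \left(C^1,\; C^2,\; C^3\sinh(\nu\tau) + C^4\cosh(\nu\tau),\; C^3\cosh(\nu\tau) + C^4\sinh(\nu\tau)\right).
\]

Again $x_0 = (x^1_0, x^2_0, x^3_0, x^4_0)$ is a point in Minkowski space $\mathcal{M}$, and $C^1$, $C^2$, $C^3$, $C^4$ are constants. Assuming the initial conditions $\alpha(0) = 0$ and $U(0) = e_1 + \sqrt{2}\,e_4$, we get the solution

\[
\alpha(\tau) = \left(\tau,\; 0,\; \frac{\sqrt 2}{\nu}\left(\cosh(\nu\tau) - 1\right),\; \frac{\sqrt 2}{\nu}\sinh(\nu\tau)\right).
\]

The trajectory of the particle in the $\{e_1, e_2, e_3\}$-space is the curve $\left(\tau, 0, \frac{\sqrt 2}{\nu}(\cosh(\nu\tau) - 1)\right)$, which is a catenary in the $(x^1x^3)$-plane.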
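The particular solution can be checked numerically: its world velocity should be a unit timelike vector and should satisfy Eq. (24). The following Python sketch uses central differences; the function names, the step size `h` and the sample values of $\nu$ and $\tau$ are assumptions made for the illustration.

```python
import math

SQRT2 = math.sqrt(2.0)

def position(tau, nu):
    """The curve alpha(tau) for the purely electric field with alpha(0) = 0."""
    return (tau, 0.0,
            SQRT2 / nu * (math.cosh(nu * tau) - 1.0),
            SQRT2 / nu * math.sinh(nu * tau))

def velocity(tau, nu):
    """World velocity U = d(alpha)/d(tau) of the curve above."""
    return (1.0, 0.0, SQRT2 * math.sinh(nu * tau), SQRT2 * math.cosh(nu * tau))

def lorentz_dot(v, w):
    """Lorentz scalar product with signature (+, +, +, -)."""
    return v[0] * w[0] + v[1] * w[1] + v[2] * w[2] - v[3] * w[3]

def force_law_residual(tau, nu, h=1e-6):
    """Largest violation of dU3/dtau = nu*U4 and dU4/dtau = nu*U3, via central differences."""
    up, dn, U = velocity(tau + h, nu), velocity(tau - h, nu), velocity(tau, nu)
    dU3 = (up[2] - dn[2]) / (2.0 * h)
    dU4 = (up[3] - dn[3]) / (2.0 * h)
    return max(abs(dU3 - nu * U[3]), abs(dU4 - nu * U[2]))

nu, tau = 0.5, 1.3   # sample values (arbitrary)
print(lorentz_dot(velocity(tau, nu), velocity(tau, nu)))   # close to -1
print(force_law_residual(tau, nu))                         # close to 0
```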

3.3.3 Variable fields and Maxwell’s Equations

This part with theory has been taken, with some rewriting, from Naber [1] pages 126-140.

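Since we now consider fields that depend on the point of $\mathcal{M}$, we define them on open subsets of $\mathcal{M}$. Open subsets of $\mathcal{M}$ are usually called regions in physics. Maxwell's equations in regions free of charge, in terms of the electric and magnetic fields $\vec E$ and $\vec B$, are

\[
\operatorname{div}\vec E = 0, \qquad \operatorname{curl}\vec B - \frac{\partial\vec E}{\partial x^4} = \vec 0, \tag{25}
\]

\[
\operatorname{div}\vec B = 0, \qquad \operatorname{curl}\vec E + \frac{\partial\vec B}{\partial x^4} = \vec 0, \tag{26}
\]

where div and curl are the divergence and curl in $\mathbb{R}^3$.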

Definition 3.14 (Assignment). Let $R$ be a region in $\mathcal{M}$. A smooth assignment $p \mapsto F(p)$ is a map which assigns to each $p \in R$ a linear transformation $F(p) : \mathcal{M} \to \mathcal{M}$.

Relative to an admissible basis $\{e_a\}$ of $\mathcal{M}$, $F(p)$ has a matrix $[F^a{}_b(p)]$. Let $F$ be an assignment. Then the divergence $\operatorname{div}F$ relative to the basis $\{e_a\}$ is

\[
(\operatorname{div}F)^b = \eta^{b\beta}F^\alpha{}_{\beta,\alpha}, \quad b = 1,2,3,4, \tag{27}
\]

where $,\alpha$ denotes the partial derivative $\partial/\partial x^\alpha$. Thus, $(\operatorname{div}F)^i = F^\alpha{}_{i,\alpha}$ for $i = 1,2,3$ and $(\operatorname{div}F)^4 = -F^\alpha{}_{4,\alpha}$. The equations (25) are equivalent to

\[
\operatorname{div}F = 0, \tag{28}
\]

and the equations (26) are equivalent to

\[
F_{ab,c} + F_{bc,a} + F_{ca,b} = 0, \quad a,b,c = 1,2,3,4. \tag{29}
\]

The equivalence between Eq. (28) and the first set of Maxwell's equations (25) is proven in the Appendix, Exercise 2.7.3, and the equivalence between (29) and (26) is proven in the Appendix, Exercise 2.7.5.

Definition 3.15 ((Non-constant) electromagnetic field). An electromagnetic field in a region $R \subseteq \mathcal{M}$ is a smooth assignment $p \mapsto F(p)$ of a skew-symmetric linear transformation to each point $p \in R$ satisfying both (28) and (29).

Theorem 3.16 ([1] page 138). Let $K : \mathcal{M} \to \mathcal{M}$ be a nonzero, skew-symmetric linear transformation of $\mathcal{M}$, $k$ a nonzero vector in $\mathcal{M}$ and $P : \mathbb{R} \to \mathbb{R}$ a smooth, nonconstant function. Then $F(x) = P(k \cdot x)K$ defines a smooth assignment of a skew-symmetric linear transformation to each $x \in \mathcal{M}$, and it satisfies the Maxwell equations if and only if

\[
k_b K^b{}_c = 0, \quad c = 1,2,3,4, \tag{30}
\]

and

\[
K_{ab}k_c + K_{bc}k_a + K_{ca}k_b = 0, \quad a,b,c = 1,2,3,4. \tag{31}
\]

Any $F(x)$ of the form given in Theorem 3.16 which satisfies (30) and (31) is an electromagnetic field and is called a simple plane electromagnetic wave.

Theorem 3.17 ([1] page 140). Let $K : \mathcal{M} \to \mathcal{M}$ be a nonzero, skew-symmetric linear transformation of $\mathcal{M}$, $k$ a nonzero vector in $\mathcal{M}$ and $P : \mathbb{R} \to \mathbb{R}$ a smooth, nonconstant function. Then $F(x) = P(k \cdot x)K$ defines a simple plane electromagnetic wave if and only if $K$ is null and $k$ is in the principal null direction of $K$.

This example has been taken, with some rewriting, from Naber [1] pages 141 to 142.

Let $K : \mathcal{M} \to \mathcal{M}$ be an arbitrary nonzero, null, skew-symmetric linear transformation and let $\{e_a\}$ be a canonical basis for $K$. Then $k = e_2 + e_4$ is along the principal null direction of $K$, so that

\[
F(x) = \sin(nk \cdot x)K = \sin(n(e_2 + e_4) \cdot x)K = \sin(n(x^2 - x^4))K,
\]

where $n$ is a positive integer, defines a simple plane electromagnetic wave. For some nonzero $\alpha \in \mathbb{R}$,

\[
[F^a{}_b] = \begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & \alpha\sin(n(x^2 - x^4)) & 0 \\
0 & -\alpha\sin(n(x^2 - x^4)) & 0 & \alpha\sin(n(x^2 - x^4)) \\
0 & 0 & \alpha\sin(n(x^2 - x^4)) & 0
\end{pmatrix}.
\]

Thus, $\vec E = \alpha\sin(n(x^2 - x^4))e_3$ and $\vec B = \alpha\sin(n(x^2 - x^4))e_1$.

For any electromagnetic field, each of the functions $F_{ab}$ satisfies the wave equation (proven in the Appendix, Exercise 2.7.17),

\[
\frac{\partial^2 F_{ab}}{(\partial x^1)^2} + \frac{\partial^2 F_{ab}}{(\partial x^2)^2} + \frac{\partial^2 F_{ab}}{(\partial x^3)^2} = \frac{\partial^2 F_{ab}}{(\partial x^4)^2}. \tag{32}
\]
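That the fields of this example solve the source-free Maxwell equations can be checked numerically with finite differences. In the sketch below the Faraday law is taken in the form $\operatorname{curl}\vec E + \partial\vec B/\partial x^4 = \vec 0$ and the Ampère law as $\operatorname{curl}\vec B - \partial\vec E/\partial x^4 = \vec 0$; the helper names, the step size `h`, the sample event and the values of `ALPHA` and `N` are illustrative assumptions.

```python
import math

ALPHA, N = 1.0, 2   # amplitude alpha and integer n of the wave: arbitrary sample values

def E(x):
    """Electric 3-vector of the wave at the event x = (x1, x2, x3, x4)."""
    return (0.0, 0.0, ALPHA * math.sin(N * (x[1] - x[3])))

def B(x):
    """Magnetic 3-vector of the wave at the event x."""
    return (ALPHA * math.sin(N * (x[1] - x[3])), 0.0, 0.0)

def partial(field, x, i, comp, h=1e-5):
    """Central difference of component `comp` of `field` along coordinate index i (0-based)."""
    up = list(x); up[i] += h
    dn = list(x); dn[i] -= h
    return (field(up)[comp] - field(dn)[comp]) / (2.0 * h)

def curl(field, x):
    d = lambda i, c: partial(field, x, i, c)
    return (d(1, 2) - d(2, 1), d(2, 0) - d(0, 2), d(0, 1) - d(1, 0))

def div(field, x):
    return sum(partial(field, x, i, i) for i in range(3))

def maxwell_residual(x):
    """Largest absolute residual of the four source-free Maxwell equations at x."""
    cE, cB = curl(E, x), curl(B, x)
    faraday = [cE[c] + partial(B, x, 3, c) for c in range(3)]   # curl E + dB/dx4
    ampere = [cB[c] - partial(E, x, 3, c) for c in range(3)]    # curl B - dE/dx4
    return max(abs(r) for r in faraday + ampere + [div(E, x), div(B, x)])

print(maxwell_residual((0.3, -1.2, 0.7, 0.5)))   # small, limited by the finite differences
```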

Illustration of a simple plane electromagnetic wave.

4 Spin Transformations and Spinors

In this chapter we introduce spin transformations, spinor maps and spinors. The concept of a spin transformation gives us the connection between SL(2,C) and the Lorentz group via the spinor map. Further, we will look at the spinor equivalents of both vectors and the electromagnetic field tensor. At the end we will discuss the null flag, which illustrates the "doubleness" of spinors.

This chapter builds on Naber [1], Section 1.7 and Chapter 3, as well as the first few chapters of O'Donnell [3]. The material has been organized and rewritten differently in order to give a clearer picture of the structure and properties of spinors, which are of great importance in physics, than the one given in Naber [1].

4.1 Spin Transformation and Spinor map

Intersecting the future (or past) null cone with a hyperplane of constant time $x^4 = C$ gives a sphere in $\mathbb{R}^3$. The observer at the origin would consider this sphere as his entire domain of vision, i.e., his celestial sphere [3]. Letting the hyperplane $x^4 = 1$ intersect the future null cone (or the hyperplane $x^4 = -1$ the past null cone) gives

\[
(x^1)^2 + (x^2)^2 + (x^3)^2 = 1,
\]

i.e., the Riemann sphere C.


Illustration of Riemann sphere and the stereographic projection [3].

Using the stereographic projection, one can construct a one-to-one correspondence between the extended complex plane and the Riemann sphere. Each point on C, except the "North pole", which corresponds to infinity, can then be identified with a single complex number, called the stereographic coordinate [3],

\[
\xi = X + iY = \frac{x^1 + ix^2}{1 - x^3}. \tag{33}
\]

The inverse relations to Eq. (33) are

\[
x^1 = \frac{\xi + \bar\xi}{\xi\bar\xi + 1}, \qquad x^2 = \frac{i(\bar\xi - \xi)}{\xi\bar\xi + 1}, \qquad x^3 = \frac{\xi\bar\xi - 1}{\xi\bar\xi + 1}.
\]
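The projection (33) and its inverse can be tried out with Python's built-in complex numbers. This is a minimal sketch; the function names `to_plane` and `to_sphere` and the sample points are illustrative choices.

```python
def to_plane(x1, x2, x3):
    """Stereographic coordinate xi of a point on the unit sphere, Eq. (33); x3 must not equal 1."""
    return complex(x1, x2) / (1.0 - x3)

def to_sphere(xi):
    """Inverse relations: the point (x1, x2, x3) on the unit sphere belonging to xi."""
    n = (xi * xi.conjugate()).real + 1.0          # xi * conj(xi) + 1
    x1 = (xi + xi.conjugate()).real / n
    x2 = (1j * (xi.conjugate() - xi)).real / n
    x3 = (n - 2.0) / n                            # (xi * conj(xi) - 1) / (xi * conj(xi) + 1)
    return (x1, x2, x3)

xi = to_plane(0.6, 0.0, 0.8)      # a sample point on the sphere: 0.36 + 0.64 = 1
print(xi, to_sphere(xi))
print(to_sphere(to_plane(0.0, 1.0, 0.0)))
```

Points on the equator are mapped to the unit circle, and the "South pole" $(0, 0, -1)$ is mapped to $\xi = 0$.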

A Möbius transformation [3], which is a conformal transformation of the Riemann sphere, leads to the map

\[
\xi \mapsto \hat\xi = \frac{a\xi + b}{c\xi + d}, \quad ad - bc \neq 0, \tag{34}
\]

which we call a spin transformation. The associated matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is called a spin matrix.

The next step is to regard the spin transformation $A$ as a member of a group. This is possible if we normalize the non-vanishing value $ad - bc$ by $ad - bc = 1$.

Definition 4.1 (Special linear group, Spin transformations). SL(2,C) denotes the set of all $(2 \times 2)$ matrices with complex entries and determinant 1, and is called the special linear group of order 2. Elements of SL(2,C) are spin transformations.

With this definition of a spin transformation, we are going to look at the connection between the special linear group SL(2,C) and the Lorentz group $\mathcal{L}$. To define this relation we use a special type of $2 \times 2$ matrices with complex entries, called Hermitian matrices.

Definition 4.2 (Hermitian). A $2 \times 2$ matrix $H$ with complex entries is said to be Hermitian if its conjugate transpose is the same as the original matrix, i.e., $H^{CT} = H$. We denote the set of Hermitian matrices by $H_2$.

Any Hermitian matrix $H$ in $H_2$ can be uniquely expressed in the form

\[
H = \begin{pmatrix} x^3 + x^4 & x^1 + ix^2 \\ x^1 - ix^2 & -x^3 + x^4 \end{pmatrix}, \tag{35}
\]


where xa for a = 1,2,3,4 are real numbers. This is shown in the Appendix, Exercise 1.7.1.

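Let $A \in SL(2,C)$ be a spin transformation, given in matrix form by

\[
A = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix},
\]

with conjugate transpose

\[
A^{CT} = \begin{pmatrix} \bar\alpha & \bar\gamma \\ \bar\beta & \bar\delta \end{pmatrix}.
\]

Each matrix $A \in SL(2,C)$ gives rise to a mapping $M_A : H_2 \to H_2$ defined by

\[
M_A(H) = AHA^{CT}
\]

for every $H \in H_2$. Since $M_A(H)$ is itself in $H_2$, it can be uniquely written in the form

\[
M_A(H) = \begin{pmatrix} \hat x^3 + \hat x^4 & \hat x^1 + i\hat x^2 \\ \hat x^1 - i\hat x^2 & -\hat x^3 + \hat x^4 \end{pmatrix}.
\]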

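Thus, the mapping $[x^a] \to [\hat x^a]$ defined by

\[
\begin{pmatrix} \hat x^3 + \hat x^4 & \hat x^1 + i\hat x^2 \\ \hat x^1 - i\hat x^2 & -\hat x^3 + \hat x^4 \end{pmatrix}
= A \begin{pmatrix} x^3 + x^4 & x^1 + ix^2 \\ x^1 - ix^2 & -x^3 + x^4 \end{pmatrix} A^{CT} \tag{36}
\]

is linear and preserves the quadratic form $\eta_{ab}x^ax^b$. By Lemma 2.24 it is therefore a general homogeneous Lorentz transformation. To find its matrix we write $h_{11} = x^3 + x^4$, $h_{12} = x^1 + ix^2$, $h_{21} = x^1 - ix^2$, $h_{22} = -x^3 + x^4$. Then

\[
\begin{pmatrix} h_{11} \\ h_{12} \\ h_{21} \\ h_{22} \end{pmatrix}
=
\begin{pmatrix}
0 & 0 & 1 & 1 \\
1 & i & 0 & 0 \\
1 & -i & 0 & 0 \\
0 & 0 & -1 & 1
\end{pmatrix}
\begin{pmatrix} x^1 \\ x^2 \\ x^3 \\ x^4 \end{pmatrix},
\]

or more compactly

\[
[h_{ij}] = G\,[x^a].
\]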

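Doing the same for $[\hat h_{ij}]$, the equation (36) is expressed as

\[
\begin{pmatrix} \hat h_{11} \\ \hat h_{12} \\ \hat h_{21} \\ \hat h_{22} \end{pmatrix}
=
\begin{pmatrix}
\alpha\bar\alpha & \alpha\bar\beta & \bar\alpha\beta & \beta\bar\beta \\
\alpha\bar\gamma & \alpha\bar\delta & \beta\bar\gamma & \beta\bar\delta \\
\bar\alpha\gamma & \bar\beta\gamma & \bar\alpha\delta & \bar\beta\delta \\
\gamma\bar\gamma & \gamma\bar\delta & \bar\gamma\delta & \delta\bar\delta
\end{pmatrix}
\begin{pmatrix} h_{11} \\ h_{12} \\ h_{21} \\ h_{22} \end{pmatrix},
\]

or more compactly as

\[
[\hat h_{ij}] = R_A\,[h_{ij}].
\]

The map $[x^a] \to [\hat x^a]$ is then the composition

\[
[x^a] \xrightarrow{\;G\;} [h_{ij}] \xrightarrow{\;R_A\;} [\hat h_{ij}] \xrightarrow{\;G^{-1}\;} [\hat x^a],
\]

and the Lorentz transformation $\Lambda_A \in \mathcal{L}$ generated by $A \in SL(2,C)$ is given by

\[
\Lambda_A = G^{-1}R_AG.
\]

To see that $\Lambda_A$ is orthochronous, we observe that the (4,4)-entry of $\Lambda_A$ is positive. Moreover, $\det\Lambda_A = \det R_A = 1$, so $\Lambda_A$ is proper.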

Definition 4.3 (Spinor map). The map Spin : $SL(2,C) \to \mathcal{L}$ given by $A \mapsto \Lambda_A$ is called the spinor map.

Note that, since $R_AR_B = R_{AB}$,

\[
\Lambda_A\Lambda_B = (G^{-1}R_AG)(G^{-1}R_BG) = G^{-1}R_AR_BG = \Lambda_{AB}.
\]

Thus, the spinor map preserves matrix multiplication, i.e., it is a group homomorphism of SL(2,C) to $\mathcal{L}$. It is not an isomorphism, since it is a double covering (two-to-one): if $A$ and $B$ are in SL(2,C) and $\Lambda_A = \Lambda_B$, then $A = \pm B$.
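The construction of $\Lambda_A$ can be made concrete by acting with $A$ on the Hermitian matrix (35) and reading off the new coordinates. The Python sketch below checks, on a sample, that the quadratic form is preserved, that $A$ and $-A$ give the same Lorentz transformation, and that the map respects products. The helper names and the sample matrices are assumptions made for the illustration, and the code works with $AHA^{CT}$ directly rather than with $G$ and $R_A$.

```python
def hermitian(x):
    """Hermitian matrix H of Eq. (35) for the event x = (x1, x2, x3, x4)."""
    x1, x2, x3, x4 = x
    return [[complex(x3 + x4), complex(x1, x2)],
            [complex(x1, -x2), complex(-x3 + x4)]]

def mul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conj_transpose(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def spin_action(A, x):
    """Coordinates of M_A(H) = A H A^CT, obtained by inverting Eq. (35)."""
    H = mul2(mul2(A, hermitian(x)), conj_transpose(A))
    return (H[0][1].real, H[0][1].imag,
            (H[0][0] - H[1][1]).real / 2.0, (H[0][0] + H[1][1]).real / 2.0)

def Q(x):
    """Quadratic form of Minkowski space, signature (+, +, +, -)."""
    return x[0] ** 2 + x[1] ** 2 + x[2] ** 2 - x[3] ** 2

def lorentz_matrix(A):
    """Matrix Lambda_A, assembled column by column from the images of the basis events."""
    basis = [tuple(1.0 if a == b else 0.0 for a in range(4)) for b in range(4)]
    cols = [spin_action(A, e) for e in basis]
    return [[cols[b][a] for b in range(4)] for a in range(4)]

A = [[1 + 1j, 1], [1j, 1]]                 # sample spin matrix with determinant 1
minus_A = [[-z for z in row] for row in A]
x = (0.3, -1.2, 0.7, 2.0)                  # sample event
print(Q(x), Q(spin_action(A, x)))          # equal up to rounding
print(spin_action(A, x) == spin_action(minus_A, x))
```

Because $A$ enters the action twice, once as $A$ and once as $A^{CT}$, the sign of $A$ drops out, which is the two-to-one property stated above.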

Definition 4.4 (Unitary). An element $A$ of SL(2,C) is said to be unitary if $A^{-1} = A^{CT}$.

Definition 4.5 (SU(2) [1] page 78, [5]). The set of all unitary matrices in SL(2,C) is denoted SU(2) and is a subgroup of SL(2,C).

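A rotation matrix $[R^i{}_j]_{i,j=1,2,3}$ can be represented by the "Euler angles" $\varphi_1$, $\theta$ and $\varphi_2$ as

\[
[R^i{}_j] = \begin{pmatrix}
c_{\varphi_2}c_{\varphi_1} - c_\theta s_{\varphi_1}s_{\varphi_2} & -c_{\varphi_2}s_{\varphi_1} - c_\theta c_{\varphi_1}s_{\varphi_2} & s_{\varphi_2}s_\theta \\
s_{\varphi_2}c_{\varphi_1} + c_\theta s_{\varphi_1}c_{\varphi_2} & -s_{\varphi_2}s_{\varphi_1} + c_\theta c_{\varphi_1}c_{\varphi_2} & -c_{\varphi_2}s_\theta \\
s_\theta s_{\varphi_1} & s_\theta c_{\varphi_1} & c_\theta
\end{pmatrix},
\]

where $c_\theta = \cos(\theta)$, $c_{\varphi_1} = \cos(\varphi_1)$, $c_{\varphi_2} = \cos(\varphi_2)$, $s_\theta = \sin(\theta)$, $s_{\varphi_1} = \sin(\varphi_1)$ and $s_{\varphi_2} = \sin(\varphi_2)$.

Moreover,

\[
A = \begin{pmatrix}
\cos\frac{\theta}{2}\,e^{\frac{1}{2}i(\varphi_1 + \varphi_2)} & i\sin\frac{\theta}{2}\,e^{-\frac{1}{2}i(\varphi_2 - \varphi_1)} \\
i\sin\frac{\theta}{2}\,e^{\frac{1}{2}i(\varphi_2 - \varphi_1)} & \cos\frac{\theta}{2}\,e^{-\frac{1}{2}i(\varphi_1 + \varphi_2)}
\end{pmatrix}
\]

is in SU(2) and maps onto

\[
\Lambda_A = \begin{pmatrix}
 & & & 0 \\
 & [R^i{}_j] & & 0 \\
 & & & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]

under the spinor map [1] page 151, [5].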


4.2 Representations of matrix groups

Definition 4.6 (Matrix group, order). A matrix group is a group consisting of invertible matrices, with matrix multiplication as the group operation. The order is the size of the matrices, i.e., an n-by-n matrix is a square matrix of order n.

Definition 4.7 (Representation and Carriers). A homomorphism of a group G into a matrix

group H of order m, is called a (finite-dimensional) representation of G. Let Vm be an m-

dimensional vector space (over R or C), where the group H acts as linear transformations on

Vm, (as a change of basis on Vm). The elements of Vm are called carriers of the representation.

Let us present some results concerning representations of matrix groups.

Definition 4.8 (Subrepresentation). A subspace W of the carrier space Vm which is invariant under the group action, i.e., which every matrix of the representation maps into itself, is called a subrepresentation.

Definition 4.9 (Irreducible). A representation is irreducible if it has only two subrepresentations, namely the zero-dimensional subspace {0} and the whole carrier space Vm.

Theorem 4.10 (Schur's Lemma [1] page 151, [7]). Let G and H be matrix groups of order n and m respectively, and let D ∶ G → H, taking the matrix G ∈ G to the matrix D(G) = D_G ∈ H, be an irreducible representation of G. If A is an m × m matrix which commutes with every D_G, i.e., A D_G = D_G A for every G in G, then A is a multiple of the identity matrix, i.e., A = λI for some (in general, complex) number λ.

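Corollary 4.11 ([1] page 152, [7]). Let G be a matrix group that contains −G for every G in G, and D ∶ G → H an irreducible representation of G. Then

\[
D_{-G} = \pm D_G.
\]

Using the spinor map of Definition 4.3 and its double-covering property, we obtain that any representation $D\colon \mathcal{L} \to \mathcal{H}$ of $\mathcal{L}$ is "lifted" to a representation $\tilde{D}$ of $SL(2,\mathbb{C})$,

\[
\tilde{D}\colon SL(2,\mathbb{C}) \to \mathcal{H}, \qquad \tilde{D} = D \circ \mathrm{Spin}.
\]

Thus there is a one-to-one correspondence between the representations of $\mathcal{L}$ and the representations of $SL(2,\mathbb{C})$ that satisfy $\tilde{D}_{-G} = \tilde{D}_G$ for all $G$ in $SL(2,\mathbb{C})$.

Now we present an example of an irreducible representation of $SL(2,\mathbb{C})$ whose carriers are complex polynomials. Let $G = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{C})$ and let $P_{mn}$ be the vector space of all complex polynomials $p(z,\bar{z}) = p_{rs}z^r\bar{z}^s$ of degree at most $m$ in $z$ and at most $n$ in $\bar{z}$. Define the linear transformation

\[
D^{(\frac{m}{2},\frac{n}{2})}_G \colon P_{mn} \to P_{mn}
\]

by

\[
D^{(\frac{m}{2},\frac{n}{2})}_G\bigl(p(z,\bar{z})\bigr) = D^{(\frac{m}{2},\frac{n}{2})}_G\bigl(p_{rs}z^r\bar{z}^s\bigr) = (bz+d)^m(\bar{b}\bar{z}+\bar{d})^n\, p(\omega,\bar{\omega}),
\]

where $\omega$ is the Möbius transformation of $z$ given by (34): $\omega = \frac{az+b}{cz+d}$.

Definition 4.12 (Spinor representation). A spinor representation of type $(m,n)$ of $SL(2,\mathbb{C})$ is the representation that takes $G \in SL(2,\mathbb{C})$ to the matrix of $D^{(\frac{m}{2},\frac{n}{2})}_G$, an element of $GL((m+1)(n+1),\mathbb{C})$, since $\dim P_{mn} = (m+1)(n+1)$.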

Theorem 4.13 ([1] page 153, [8]). For all m,n = 0,1,2, ... the spinor representation of type

(m,n) of SL(2,C) is irreducible and every finite-dimensional irreducible representation of

SL(2,C) is equivalent to some spinor representation of type (m,n).


4.3 Spin Space and spinor of valence

In this section we investigate the space of carriers of the representations of $SL(2,\mathbb{C})$ and its structure. The goal is to define a spinor of given valence and, in the next section, the spinor equivalent of a world vector, which includes both the vector structure and the spin structure of a particle with spin.

Definition 4.14 (Spin space [1] page 161, [3]). Spin space is a vector space $B$ over the complex numbers equipped with a map $\langle\,,\rangle\colon B \times B \to \mathbb{C}$ which satisfies the following properties:

1. It is non-degenerate, i.e., for any $\varphi \neq 0$ there exists $\psi \neq 0$ in $B$ such that $\langle\varphi,\psi\rangle \neq 0$.

2. It is skew-symmetric, i.e., $\langle\varphi,\psi\rangle = -\langle\psi,\varphi\rangle$ for all $\varphi,\psi$ in $B$.

3. It is bilinear, i.e., $\langle a\varphi + b\psi, \xi\rangle = a\langle\varphi,\xi\rangle + b\langle\psi,\xi\rangle$ for all $\varphi,\psi,\xi$ in $B$ and all $a,b$ in $\mathbb{C}$.

4. $\langle\varphi,\psi\rangle\xi + \langle\xi,\varphi\rangle\psi + \langle\psi,\xi\rangle\varphi = 0$ for all $\varphi,\psi,\xi$ in $B$.

An element of $B$ is called a spin-vector.

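Lemma 4.15 (Properties of Spin Space [1] page 161, [3]). Each of the following properties holds in the spin space $B$.

a) $\langle\varphi,\varphi\rangle = 0$ for every $\varphi$ in $B$.

b) Any pair $\varphi$, $\psi$ in $B$ which satisfies $\langle\varphi,\psi\rangle \neq 0$ forms a basis for $B$. In particular, $\dim B = 2$.

c) There exists a basis $\{s^1, s^0\}$ for $B$ which satisfies $\langle s^1, s^0\rangle = 1 = -\langle s^0, s^1\rangle$. Such a basis is called a spin frame for $B$.

d) If $\{s^1, s^0\}$ is a spin frame and $\varphi = \varphi_1 s^1 + \varphi_0 s^0 = \varphi_A s^A$, then $\varphi_1 = \langle\varphi, s^0\rangle$ and $\varphi_0 = -\langle\varphi, s^1\rangle$.

e) If $\{s^1, s^0\}$ is a spin frame and $\varphi = \varphi_A s^A$, $\psi = \psi_A s^A$, then

\[
\langle\varphi,\psi\rangle = \begin{vmatrix} \varphi_1 & \psi_1 \\ \varphi_0 & \psi_0 \end{vmatrix} = \varphi_1\psi_0 - \varphi_0\psi_1.
\]

f) Elements $\varphi$ and $\psi$ in $B$ are linearly independent if and only if $\langle\varphi,\psi\rangle \neq 0$.

g) If $\{s^1, s^0\}$ and $\{\hat{s}^1, \hat{s}^0\}$ are two spin frames with $s^1 = G_1{}^1\hat{s}^1 + G_0{}^1\hat{s}^0 = G_A{}^1\hat{s}^A$ and $s^0 = G_1{}^0\hat{s}^1 + G_0{}^0\hat{s}^0 = G_A{}^0\hat{s}^A$, i.e.,

\[
s^B = G_A{}^B\,\hat{s}^A, \qquad B = 0,1, \tag{37}
\]

then $G = [G_A{}^B] = \begin{pmatrix} G_1{}^1 & G_1{}^0 \\ G_0{}^1 & G_0{}^0 \end{pmatrix}$ is in $SL(2,\mathbb{C})$.

i) If $\{s^1, s^0\}$ and $\{\hat{s}^1, \hat{s}^0\}$ are two spin frames and $\varphi = \varphi_A s^A = \hat{\varphi}_A\hat{s}^A$, then

\[
\begin{pmatrix} \hat{\varphi}_1 \\ \hat{\varphi}_0 \end{pmatrix} = \begin{pmatrix} G_1{}^1 & G_1{}^0 \\ G_0{}^1 & G_0{}^0 \end{pmatrix}\begin{pmatrix} \varphi_1 \\ \varphi_0 \end{pmatrix},
\]

i.e.,

\[
\hat{\varphi}_A = G_A{}^B\varphi_B, \qquad A = 0,1,
\]

where $G_A{}^B$ is given by (37).

j) A linear transformation $T\colon B \to B$ preserves $\langle\,,\rangle$, i.e. satisfies $\langle T\varphi, T\psi\rangle = \langle\varphi,\psi\rangle$ for all $\varphi,\psi$ in $B$, if and only if the matrix $[T_A{}^B]$ of $T$ relative to any spin frame is in $SL(2,\mathbb{C})$.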
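Several parts of Lemma 4.15 can be tested numerically on components. The sketch below is an illustration only and not part of the original text: it represents a spin vector by its pair of components $(\varphi_1, \varphi_0)$ relative to a fixed spin frame and uses the determinant formula of part e). The helper names `bracket`, `apply` and `cyclic_sum` and the sample values are assumptions.

```python
def bracket(phi, psi):
    """<phi, psi> = phi_1 psi_0 - phi_0 psi_1 for components (phi_1, phi_0)."""
    return phi[0] * psi[1] - phi[1] * psi[0]


def apply(t, phi):
    """Apply a 2x2 matrix t to the component pair phi."""
    return (t[0][0] * phi[0] + t[0][1] * phi[1],
            t[1][0] * phi[0] + t[1][1] * phi[1])


def cyclic_sum(phi, psi, xi):
    """<phi,psi> xi + <xi,phi> psi + <psi,xi> phi (property 4 of Definition 4.14)."""
    c1, c2, c3 = bracket(phi, psi), bracket(xi, phi), bracket(psi, xi)
    return tuple(c1 * xi[k] + c2 * psi[k] + c3 * phi[k] for k in range(2))


phi = (1 + 2j, 3 + 0j)
psi = (-1j, 2 + 0j)
xi = (0.5 + 0j, 1 - 1j)
T = [[1 + 1j, 2], [0.5j, 1]]  # sample matrix with determinant 1

print(bracket(phi, phi))                         # part a): vanishes
print(cyclic_sum(phi, psi, xi))                  # property 4: zero vector
print(bracket(apply(T, phi), apply(T, psi)) - bracket(phi, psi))  # part j): zero
```

In this component picture $\langle T\varphi, T\psi\rangle = \det(T)\,\langle\varphi,\psi\rangle$, which is why preservation of the form is equivalent to $\det T = 1$.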

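The next step is to extend the notion of a tensor to multilinear maps acting on four different vector spaces associated with $B$. We start by defining these vector spaces. The first is the dual space $B^*$ of $B$. If $\{s^1, s^0\}$ is a spin frame for $B$, the dual basis $\{s_1, s_0\}$ of $B^*$ is determined by

\[
s_A(s^B) = \delta_A^B, \qquad A,B = 0,1.
\]

The elements of $B^*$ are called spin covectors. For each $\varphi$ in $B$ we define $\varphi^*$ in $B^*$ by

\[
\varphi^*(\psi) = \langle\varphi,\psi\rangle
\]

for every $\psi$ in $B$.

Lemma 4.16 ([1] page 164, [3]). Every element $\varphi^*$ of $B^*$ can be defined as above by some $\varphi$ in $B$.

Writing $\varphi^* = \varphi^A s_A$, so that $\varphi^A = \varphi^*(s^A) = \langle\varphi, s^A\rangle$, Lemma 4.15 d) gives

\[
\varphi^1 = -\varphi_0, \qquad \varphi^0 = \varphi_1,
\]

and by letting

\[
[G^A{}_B] = \begin{pmatrix} G^1{}_1 & G^1{}_0 \\ G^0{}_1 & G^0{}_0 \end{pmatrix} = \begin{pmatrix} G_0{}^0 & -G_0{}^1 \\ -G_1{}^0 & G_1{}^1 \end{pmatrix} = \left([G_A{}^B]^{-1}\right)^T,
\]

we find that

\[
\hat{\varphi}^A = G^A{}_B\,\varphi^B.
\]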

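Definition 4.17 (Conjugate vector space). The conjugate vector space is the vector space $\bar{B} = B \times \{1\}$, in which each element $\varphi$ of $B$ has a corresponding element denoted $(\varphi,1) = \bar{\varphi}$. Its linear structure is defined from that of $B$ by

\[
\bar{\varphi} + \bar{\psi} = \overline{\varphi + \psi}, \qquad c\,\bar{\varphi} = \overline{\bar{c}\,\varphi},
\]

for $\varphi$, $\psi$ in $B$ and $c \in \mathbb{C}$.

Then there is a bijective map $\tau\colon B \to \bar{B}$ taking $\varphi \mapsto \bar{\varphi}$ which is a conjugate (or anti-) isomorphism, i.e., it satisfies

\[
\varphi + \psi \mapsto \bar{\varphi} + \bar{\psi}, \qquad c\varphi \mapsto \bar{c}\,\bar{\varphi}.
\]

The elements of $\bar{B}$ are called conjugate spin vectors. We show some formalities of working with elements of $\bar{B}$. If $\{s^1, s^0\}$ is a spin frame of $B$, then their images $\{\bar{s}^{\dot 1}, \bar{s}^{\dot 0}\}$ form a basis for $\bar{B}$. Letting $\bar{s}^{\dot 1} = \bar{G}_{\dot 1}{}^{\dot 1}\hat{\bar{s}}^{\dot 1} + \bar{G}_{\dot 0}{}^{\dot 1}\hat{\bar{s}}^{\dot 0}$, and similarly for $\bar{s}^{\dot 0}$, we get the transformation rules

\[
\hat{\bar{s}}^{\dot X} = \bar{G}^{\dot X}{}_{\dot Y}\,\bar{s}^{\dot Y}, \quad \dot X = \dot 0,\dot 1, \qquad\text{and}\qquad \bar{s}^{\dot Y} = \bar{G}_{\dot X}{}^{\dot Y}\,\hat{\bar{s}}^{\dot X}, \quad \dot Y = \dot 0,\dot 1.
\]

If $\bar{\varphi} = \bar{\varphi}_{\dot Y}\bar{s}^{\dot Y} = \hat{\bar{\varphi}}_{\dot X}\hat{\bar{s}}^{\dot X}$, then

\[
\hat{\bar{\varphi}}_{\dot X} = \bar{G}_{\dot X}{}^{\dot Y}\,\bar{\varphi}_{\dot Y}, \quad \dot X = \dot 0,\dot 1, \qquad\text{and}\qquad \bar{\varphi}_{\dot Y} = \bar{G}^{\dot X}{}_{\dot Y}\,\hat{\bar{\varphi}}_{\dot X}, \quad \dot Y = \dot 0,\dot 1.
\]

The elements of the dual $\bar{B}^*$ of $\bar{B}$ are called conjugate spin covectors, and the bases dual to $\{\bar{s}^{\dot X}\}$ and $\{\hat{\bar{s}}^{\dot X}\}$ are denoted $\{\bar{s}_{\dot X}\}$ and $\{\hat{\bar{s}}_{\dot X}\}$ respectively. As before,

\[
\hat{\bar{s}}_{\dot X} = \bar{G}_{\dot X}{}^{\dot Y}\,\bar{s}_{\dot Y}, \quad \dot X = \dot 0,\dot 1, \qquad\text{and}\qquad \bar{s}_{\dot Y} = \bar{G}^{\dot X}{}_{\dot Y}\,\hat{\bar{s}}_{\dot X}, \quad \dot Y = \dot 0,\dot 1.
\]

If $\bar{\varphi}^* = \bar{\varphi}^{\dot Y}\bar{s}_{\dot Y} = \hat{\bar{\varphi}}^{\dot X}\hat{\bar{s}}_{\dot X}$, then

\[
\hat{\bar{\varphi}}^{\dot X} = \bar{G}^{\dot X}{}_{\dot Y}\,\bar{\varphi}^{\dot Y}, \quad \dot X = \dot 0,\dot 1, \qquad\text{and}\qquad \bar{\varphi}^{\dot Y} = \bar{G}_{\dot X}{}^{\dot Y}\,\hat{\bar{\varphi}}^{\dot X}, \quad \dot Y = \dot 0,\dot 1.
\]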

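Definition 4.18 (Spinor of valence). A spinor of valence $\begin{pmatrix} r & s \\ m & n \end{pmatrix} = (r,s;m,n)$ (also called a spinor with $m$ undotted lower indices, $n$ dotted lower indices, $r$ undotted upper indices, and $s$ dotted upper indices) is a multilinear functional

\[
\xi\colon \underbrace{B \times\cdots\times B}_{r\ \text{factors}} \times \underbrace{\bar{B} \times\cdots\times \bar{B}}_{s\ \text{factors}} \times \underbrace{B^* \times\cdots\times B^*}_{m\ \text{factors}} \times \underbrace{\bar{B}^* \times\cdots\times \bar{B}^*}_{n\ \text{factors}} \to \mathbb{C}.
\]

A spin-vector is considered as a particular case of a spinor of valence (1,0; 0,0), and a spin-covector is a spinor of valence (0,0; 1,0).

If $\{s^1, s^0\}$ is a spin frame (with associated bases $\{s_1, s_0\}$, $\{\bar{s}^{\dot 1}, \bar{s}^{\dot 0}\}$ and $\{\bar{s}_{\dot 1}, \bar{s}_{\dot 0}\}$ for $B^*$, $\bar{B}$ and $\bar{B}^*$), then the components of $\xi$ relative to $\{s^A\}$ are defined by

\[
\xi^{A_1\cdots A_r\,\dot X_1\cdots\dot X_s}{}_{B_1\cdots B_m\,\dot Y_1\cdots\dot Y_n} = \xi\left(s^{A_1},\ldots,s^{A_r},\ \bar{s}^{\dot X_1},\ldots,\bar{s}^{\dot X_s},\ s_{B_1},\ldots,s_{B_m},\ \bar{s}_{\dot Y_1},\ldots,\bar{s}_{\dot Y_n}\right),
\]

\[
A_1,\ldots,A_r,B_1,\ldots,B_m = 0,1, \qquad \dot Y_1,\ldots,\dot Y_n,\dot X_1,\ldots,\dot X_s = \dot 0,\dot 1.
\]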

4.4 Spinors and World Vectors

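Since there exists only a homomorphism between the special linear group $SL(2,\mathbb{C})$ and the Lorentz group $\mathcal{L}$, and not an isomorphism, not all spinors have a tensor counterpart [3].

Given a 2 × 2 Hermitian matrix with complex entries, as in (35), one can write it as a linear combination

\[
H = x^1\sigma_1 + x^2\sigma_2 + x^3\sigma_3 + x^4\sigma_4,
\]

where $\sigma_i$, $i = 1,2,3$, are the Pauli spin matrices

\[
\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},
\]

and $\sigma_4$ is the 2 × 2 identity matrix. This is also shown in the Appendix, Exercise 1.7.1.

Introducing a factor of $\frac{1}{\sqrt{2}}$ and the Infeld-van der Waerden symbols $\sigma_a{}^{A\dot X}$, for each $A = 1,0$ and $\dot X = \dot 1,\dot 0$, one obtains the matrices

\[
\sigma_1{}^{A\dot X} = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_2{}^{A\dot X} = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}, \quad
\sigma_3{}^{A\dot X} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad
\sigma_4{}^{A\dot X} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},
\]

\[
[\sigma_a{}^{A\dot X}] = \begin{pmatrix} \sigma_a{}^{1\dot 1} & \sigma_a{}^{1\dot 0} \\ \sigma_a{}^{0\dot 1} & \sigma_a{}^{0\dot 0} \end{pmatrix} = \frac{1}{\sqrt{2}}\,\sigma_a, \qquad a = 1,2,3,4.
\]

Note that $\sigma_a{}^{A\dot X}$ transforms as a world tensor in the index $a$, while in the indices $A$ and $\dot X$ it transforms as a spinor [3].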

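Definition 4.19 (Spinor equivalent to world vector). A spinor $V$ equivalent to a world vector $v \in \mathcal{M}$ relative to an admissible basis $\{e_a\}$ of $\mathcal{M}$ and a spin frame $\{s^A\}$ of $B$ has its components defined by

\[
V^{A\dot X} = \sigma_a{}^{A\dot X}v^a, \qquad A = 1,0, \quad \dot X = \dot 1,\dot 0.
\]

This defines a spinor of valence (1,1; 0,0) which is transformed as

\[
\hat{V}^{A\dot X} = G^A{}_B\,\bar{G}^{\dot X}{}_{\dot Y}\,V^{B\dot Y}, \qquad A = 0,1, \quad \dot X = \dot 0,\dot 1,
\]

or equivalently as

\[
V^{A\dot X} = G_B{}^A\,\bar{G}_{\dot Y}{}^{\dot X}\,\hat{V}^{B\dot Y}, \qquad A = 0,1, \quad \dot X = \dot 0,\dot 1.
\]

Furthermore, the spinor equivalent of any world vector is a Hermitian spinor, i.e., the matrix $[V^{A\dot X}]$ is Hermitian: $\overline{V^{A\dot X}} = V^{X\dot A}$.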

Theorem 4.20 ([1] page 185). Let $\{e_a\}$ be an admissible basis for $\mathcal{M}$ and $\{s^A\}$ a spin frame for $B$. The map which assigns to each vector $v \in \mathcal{M}$ ($v = v^a e_a$) its spinor equivalent ($V = V^{A\dot X}s_A \otimes \bar{s}_{\dot X}$, where $V^{A\dot X} = \sigma_a{}^{A\dot X}v^a$) is one-to-one and onto the set of all Hermitian spinors of valence (1,1; 0,0).

Theorem 4.21 ([1] page 185). Let $\{e_a\}$ be an admissible basis for $\mathcal{M}$ and $\{s^A\}$ a spin frame for $B$. The map which assigns to each covector $v^* \in \mathcal{M}^*$ ($v^* = v_a e^a$) its spinor equivalent ($V = V_{A\dot X}s^A \otimes \bar{s}^{\dot X}$, where $V_{A\dot X} = \sigma^a{}_{A\dot X}v_a$) is one-to-one and onto the set of all Hermitian spinors of valence (0,0; 1,1).

Theorem (4.21) is proven in the Appendix, Exercise 3.4.9.

Theorem 4.22 ([1] page 188, [3]). Let $\{e_a\}$ be an admissible basis for $\mathcal{M}$ and $\{s^A\}$ a spin frame for $B$. Let $v \in \mathcal{M}$ be a null vector ($v = v^a e_a$) and $V$ its spinor equivalent ($V = V^{A\dot X}s_A \otimes \bar{s}_{\dot X}$, where $V^{A\dot X} = \sigma_a{}^{A\dot X}v^a$). Then there exists a spin vector $\xi$ such that:

(a) if $v$ is future-directed, then

\[
V^{A\dot X} = \xi^A\bar{\xi}^{\dot X},
\]

(b) if $v$ is past-directed, then

\[
V^{A\dot X} = -\xi^A\bar{\xi}^{\dot X}.
\]

The process can also be reversed: every nonzero spin vector $\xi$ gives rise to a future-directed null vector $v$, which we call the flagpole of $\xi$.

The spin vector $\xi$ is not uniquely determined, since every nonzero spin vector in the family $\{e^{i\theta}\xi \mid \theta \in \mathbb{R}\}$ gives rise to the same flagpole. If we take $\nu^A = e^{i\theta}\xi^A$, so that $\bar{\nu}^{\dot X} = e^{-i\theta}\bar{\xi}^{\dot X}$, then $\nu^A\bar{\nu}^{\dot X} = e^{i\theta}\xi^A e^{-i\theta}\bar{\xi}^{\dot X} = \xi^A\bar{\xi}^{\dot X} = V^{A\dot X}$. We call $e^{i\theta}$ the phase factor of the corresponding member of the family.
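The passage from a spin vector to its flagpole can be made concrete. The sketch below is an illustration under stated assumptions, not part of the original text: it takes the components $(\xi^1, \xi^0)$, forms $V^{A\dot X} = \xi^A\bar{\xi}^{\dot X}$, and reads off $v^a$ using $[V^{A\dot X}] = \frac{1}{\sqrt{2}}H(v)$ with $H$ the Hermitian matrix of (35). The names `flagpole` and `lorentz_norm` are hypothetical.

```python
import cmath
import math


def flagpole(xi):
    """World vector (v1, v2, v3, v4) with V^{AX} = xi^A conj(xi)^X = H(v) / sqrt(2)."""
    xi1, xi0 = complex(xi[0]), complex(xi[1])   # components ordered (xi^1, xi^0)
    root2 = math.sqrt(2.0)
    h11 = root2 * (xi1 * xi1.conjugate()).real  # v3 + v4
    h22 = root2 * (xi0 * xi0.conjugate()).real  # -v3 + v4
    h12 = root2 * xi1 * xi0.conjugate()         # v1 + i v2
    return [h12.real, h12.imag, (h11 - h22) / 2, (h11 + h22) / 2]


def lorentz_norm(v):
    """Quadratic form eta_ab v^a v^b with signature (+, +, +, -)."""
    return v[0] ** 2 + v[1] ** 2 + v[2] ** 2 - v[3] ** 2


xi = (1 + 2j, 0.5 - 1j)
v = flagpole(xi)
rotated = tuple(cmath.exp(0.7j) * c for c in xi)   # same spin vector up to a phase

print(lorentz_norm(v))     # vanishes up to rounding: v is null
print(v[3] > 0)            # future-directed
print(flagpole(rotated))   # agrees with v up to rounding
```

The phase drops out because each entry of $V$ contains one factor of $\xi$ and one of $\bar{\xi}$, which is exactly the computation $\nu^A\bar{\nu}^{\dot X} = \xi^A\bar{\xi}^{\dot X}$ above.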


4.5 Bivectors and Null Flag

This example has been taken from Naber [1] pages 188 to 196.

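Let $F$ be an electromagnetic field tensor as defined in (19). The components of $F$ relative to $\{e_a\}$ are denoted by $F_{ab} = F(e_a, e_b)$. Skew-symmetry implies

\[
F_{ab} = \tfrac{1}{2}(F_{ab} - F_{ba}) = F_{[ab]}.
\]

For $A,B = 0,1$ and $\dot X,\dot Y = \dot 0,\dot 1$ we define

\[
F_{A\dot X B\dot Y} = F_{AB\dot X\dot Y} = \sigma^a{}_{A\dot X}\,\sigma^b{}_{B\dot Y}\,F_{ab}
\]

to be the components of the spinor equivalent of the electromagnetic field tensor relative to a spin frame $\{s^A\}$. The spinor equivalent of the electromagnetic field tensor has the properties:

\[
F_{ab} = \sigma_a{}^{A\dot X}\,\sigma_b{}^{B\dot Y}\,F_{A\dot X B\dot Y}, \qquad a,b = 1,2,3,4,
\]

\[
\bar{F}_{A\dot X B\dot Y} = F_{A\dot X B\dot Y}, \quad\text{i.e., } F_{A\dot X B\dot Y} \text{ is Hermitian},
\]

\[
F_{B\dot Y A\dot X} = -F_{A\dot X B\dot Y}.
\]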

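We define a symmetric spinor $\varphi_{AB}$ of valence (0,0; 2,0), i.e. $\varphi_{AB} = \varphi_{BA}$, by

\[
\varphi_{AB} = \tfrac{1}{2}F_{A\dot U B}{}^{\dot U}, \quad A,B = 0,1, \qquad\text{then}\qquad \bar{\varphi}_{\dot X\dot Y} = \tfrac{1}{2}F_{C\dot X}{}^{C}{}_{\dot Y}, \quad \dot X,\dot Y = \dot 0,\dot 1.
\]

Further, we define the spinor $\epsilon_{AB}$ by the matrix $\epsilon = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = [\epsilon_{AB}]$. By the properties of the spinor equivalent of the electromagnetic field, one can rewrite it as

\[
F_{A\dot X B\dot Y} = \epsilon_{AB}\,\bar{\varphi}_{\dot X\dot Y} + \varphi_{AB}\,\bar{\epsilon}_{\dot X\dot Y}, \tag{38}
\]

which gives a spinor of valence (0,0; 2,2). The spinor $\epsilon_{AB}$ is called the Levi-Civita spinor, and $\varphi_{AB}$ is the electromagnetic spinor or the Maxwell spinor [3]. Note that, in general, any skew-symmetric world tensor of covariant rank 2 and contravariant rank 0 is decomposable into the form of (38).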

As was mentioned in the last section, the flagpole cannot distinguish between spin vectors which differ by a phase factor. One needs another element to distinguish them. We write the Maxwell spinor $\varphi_{AB}$ by making use of a spin vector $\xi$ as

\[
\varphi_{AB} = \xi_A\xi_B.
\]

Furthermore, one can select a spin vector $\eta$ which, together with $\xi$, forms a spin frame $\{\xi,\eta\}$ for $B$. Then we define a spinor of valence (0,0; 1,1) by

\[
W_{A\dot X} = \eta_A\bar{\xi}_{\dot X} + \xi_A\bar{\eta}_{\dot X},
\]

which is Hermitian. Consequently, we define a covector $w^* \in \mathcal{M}^*$ by

\[
w_a = -\sigma_a{}^{A\dot X}W_{A\dot X} = -\sigma_a{}^{A\dot X}\left(\eta_A\bar{\xi}_{\dot X} + \xi_A\bar{\eta}_{\dot X}\right).
\]

Since the symmetric spinor $\varphi_{AB}$ gives rise to a spinor of valence (0,0; 2,2) as above, it determines the skew-symmetric tensor

\[
F_{ab} = v_b w_a - v_a w_b.
\]

Moreover, $w$ is orthogonal to $v$ and, since $v$ is null, $w$ is spacelike. However, $w$ is not uniquely determined, since our choice of spin frame $\{\xi,\eta\}$ is not unique. Choosing another spin frame $\{\xi,\hat{\eta}\}$, the new spin partner must be of the form

\[
\hat{\eta} = \eta + \lambda\xi, \qquad \lambda \in \mathbb{C}.
\]

The new vector $\hat{w}$ is then

\[
\hat{w}_a = w_a + (\lambda + \bar{\lambda})v_a, \qquad \hat{w} = w + (\lambda + \bar{\lambda})v.
\]

The vector $\hat{w}$ lies in the 2-dimensional plane spanned by $v$ and $w$ and is a spacelike vector orthogonal to $v$. Thus, $\xi$ uniquely determines a future-directed null vector $v$ and a 2-dimensional plane spanned by $v$ and any of the spacelike vectors $w, \hat{w}, \ldots$. This 2-dimensional plane lies in the 3-dimensional subspace $(\mathrm{Span}\{v\})^\perp$, which is tangent to the null cone along $v$. We draw a flag along $v$ to stress that it is a 2-dimensional plane. The pair consisting of $v$ and the 2-dimensional plane in $(\mathrm{Span}\{v\})^\perp$ is called the null flag of $\xi$.

Illustration of the spin vector as a null flag [3].

Changing the phase of $\xi$,

\[
\xi^A \to e^{i\theta}\xi^A \qquad (\theta \in \mathbb{R}),
\]

leaves the flagpole $v$ unchanged, but gives us a new $w$ by

\[
w \to (\cos 2\theta)\,w + (\sin 2\theta)\,u + rv, \qquad \theta, r \in \mathbb{R}.
\]

Here $w$ and $u$ are perpendicular spacelike vectors in the 3-dimensional space $(\mathrm{Span}\{v\})^\perp$, so that $(\cos 2\theta)w + (\sin 2\theta)u$ is a spacelike vector in the plane of $w$ and $u$ forming an angle $2\theta$ with $w$. The phase change thus gives a new vector $\hat{w}$ in the plane of $v$ and $(\cos 2\theta)w + (\sin 2\theta)u$, and the corresponding flag is rotated by $2\theta$ in the plane of $w$ and $u$.

Notice that if $\theta = \pi$, then the phase change carries $\xi$ to $-\xi$, but the null flag is rotated by $2\pi$, which means that it has returned to its original position. The null flag is therefore represented by both $\xi$ and $-\xi$. Hence, null flags represent spin vectors only "up to sign". This is a reflection of the double-value property of spinors, which has its roots in the fact that the spinor map is two-to-one.


References

[1] G.L. Naber. The geometry of Minkowski spacetime: An introduction to the mathematics of the special theory of relativity. Springer New York, 1992.

[2] B. O'Neill. Semi-Riemannian geometry: with applications to relativity, volume 103. Academic Pr, 1983.

[3] P.J. O'Donnell. Introduction to 2-spinors in general relativity. World Scientific Pub Co Inc, 2003.

[4] J.M. Lee. Riemannian manifolds: An introduction to curvature, volume 176. Springer Verlag, 1997.

[5] J. Stillwell. Naive Lie theory. Springer Verlag, 2008.

[6] J.R. Reitz, F.J. Milford, and R.W. Christy. Foundations of Electromagnetic Theory. Addison-Wesley Publishing Company, 1992.

[7] A.W. Knapp. Lie groups beyond an introduction, volume 140. Birkhauser, 2002.

[8] M. Carmeli and S. Malin. Infinite-dimensional representations of the Lorentz group: The complete series. International Journal of Theoretical Physics, 9(3):145–156, 1974.


Appendix: Solutions of some chosen Exercises from Naber [1]

This appendix is a supplement to the book of Naber [1], giving solutions to some of the problems which were left to the reader to solve.

Exercise 1.3.14, page 28:

Problems:

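Suppose $-1 < \beta_1 \le \beta_2 < 1$. Show that

a) $\left|\dfrac{\beta_1+\beta_2}{1+\beta_1\beta_2}\right| < 1$,

b) $\Lambda(\beta_1)\Lambda(\beta_2) = \Lambda\!\left(\dfrac{\beta_1+\beta_2}{1+\beta_1\beta_2}\right)$.

Solutions:

a) Fixing $\beta_2$ with $-1 < \beta_2 < 1$, one can solve the problem by analysing the function $f(\beta_1) = \dfrac{\beta_1+\beta_2}{1+\beta_2\beta_1}$.

Taking the derivative of $f$ gives

\[
\frac{\partial f}{\partial\beta_1} = \frac{(1+\beta_2\beta_1) - (\beta_1+\beta_2)\beta_2}{(1+\beta_2\beta_1)^2} = \frac{1+\beta_2\beta_1-\beta_2\beta_1-\beta_2^2}{1+2\beta_2\beta_1+\beta_2^2\beta_1^2} = \frac{1-\beta_2^2}{1+2\beta_2\beta_1+\beta_2^2\beta_1^2}.
\]

The numerator $(1-\beta_2^2)$ is always positive, and the denominator $(1+\beta_2\beta_1)^2$ stays positive as long as $\beta_1 \in [-1,1]$. Therefore the derivative is positive on our domain of definition, and $f$ is strictly increasing. The same is true if one holds $\beta_1$ constant and varies $\beta_2$ in $[-1,1]$. At the endpoints one has $f(-1) = \frac{\beta_2-1}{1-\beta_2} = -1$ and $f(1) = \frac{1+\beta_2}{1+\beta_2} = 1$. Since $f$ is strictly increasing, $-1 < f(\beta_1) < 1$ for $-1 < \beta_1 < 1$, which gives $\left|\frac{\beta_1+\beta_2}{1+\beta_1\beta_2}\right| < 1$, as one wanted.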

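b) In general one has that

\[
\Lambda(\beta) = \begin{pmatrix} \gamma & 0 & 0 & -\beta\gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\beta\gamma & 0 & 0 & \gamma \end{pmatrix}
\]

with $\beta \in (-1,1)$ and $\gamma = \dfrac{1}{(1-\beta^2)^{1/2}} \ge 1$.

Letting $\beta_{1+2} = \dfrac{\beta_1+\beta_2}{1+\beta_1\beta_2}$, one gets

\[
\gamma_{1+2} = \frac{1}{(1-\beta_{1+2}^2)^{1/2}} = \frac{1}{\left(1-\left(\frac{\beta_1+\beta_2}{1+\beta_1\beta_2}\right)^2\right)^{1/2}} = \frac{1}{\left(\frac{1-\beta_1^2-\beta_2^2+\beta_1^2\beta_2^2}{(1+\beta_1\beta_2)^2}\right)^{1/2}} = \frac{1+\beta_1\beta_2}{(1-\beta_1^2-\beta_2^2+\beta_1^2\beta_2^2)^{1/2}}.
\]

Multiplying $\Lambda(\beta_1)$ with $\Lambda(\beta_2)$ in the form above, one gets

\[
\Lambda(\beta_1)\Lambda(\beta_2) =
\begin{pmatrix} \gamma_1 & 0 & 0 & -\beta_1\gamma_1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\beta_1\gamma_1 & 0 & 0 & \gamma_1 \end{pmatrix}
\begin{pmatrix} \gamma_2 & 0 & 0 & -\beta_2\gamma_2 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\beta_2\gamma_2 & 0 & 0 & \gamma_2 \end{pmatrix}
=
\begin{pmatrix} \gamma_1\gamma_2(1+\beta_1\beta_2) & 0 & 0 & -\gamma_1\gamma_2(\beta_1+\beta_2) \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\gamma_1\gamma_2(\beta_1+\beta_2) & 0 & 0 & \gamma_1\gamma_2(1+\beta_1\beta_2) \end{pmatrix}.
\]

Our task is to show that this equals $\Lambda(\beta_{1+2})$.

First one observes that

\[
\gamma_1\gamma_2 = \frac{1}{(1-\beta_1^2)^{1/2}}\,\frac{1}{(1-\beta_2^2)^{1/2}} = \frac{1}{(1-\beta_1^2-\beta_2^2+\beta_1^2\beta_2^2)^{1/2}}.
\]

It is then easy to see that the (1,1) and (4,4) entries are

\[
\gamma_1\gamma_2(1+\beta_1\beta_2) = \frac{1+\beta_1\beta_2}{(1-\beta_1^2-\beta_2^2+\beta_1^2\beta_2^2)^{1/2}} = \gamma_{1+2}.
\]

The two remaining entries give

\[
-\gamma_1\gamma_2(\beta_1+\beta_2) = -\frac{\beta_1+\beta_2}{(1-\beta_1^2-\beta_2^2+\beta_1^2\beta_2^2)^{1/2}} = -\frac{\beta_1+\beta_2}{1+\beta_1\beta_2}\cdot\frac{1+\beta_1\beta_2}{(1-\beta_1^2-\beta_2^2+\beta_1^2\beta_2^2)^{1/2}} = -\beta_{1+2}\gamma_{1+2}.
\]

In matrix form one has

\[
\begin{pmatrix} \gamma_{1+2} & 0 & 0 & -\beta_{1+2}\gamma_{1+2} \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\beta_{1+2}\gamma_{1+2} & 0 & 0 & \gamma_{1+2} \end{pmatrix},
\]

which is $\Lambda(\beta_{1+2})$.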


Problem:

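Show that if $\beta = \tanh(\theta)$, then the hyperbolic form of the Lorentz transformation $\Lambda(\beta)$ is

\[
L(\theta) = \begin{pmatrix} \cosh\theta & 0 & 0 & -\sinh\theta \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\sinh\theta & 0 & 0 & \cosh\theta \end{pmatrix}.
\]

Solution:

One has from before that $\Lambda(\beta)$ is given by

\[
\begin{pmatrix} \gamma & 0 & 0 & -\beta\gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\beta\gamma & 0 & 0 & \gamma \end{pmatrix},
\]

and $\gamma = \dfrac{1}{(1-\beta^2)^{1/2}}$. Having $\beta = \tanh\theta = \dfrac{\sinh\theta}{\cosh\theta}$, one gets that

\[
\gamma = \frac{1}{\left(1-\left(\frac{\sinh\theta}{\cosh\theta}\right)^2\right)^{1/2}} = \frac{\cosh\theta}{\left(\cosh^2\theta-\sinh^2\theta\right)^{1/2}} = \cosh\theta.
\]

The second term is then $-\beta\gamma = -\dfrac{\sinh\theta}{\cosh\theta}\cosh\theta = -\sinh\theta$, giving us the matrix

\[
L(\theta) = \begin{pmatrix} \cosh\theta & 0 & 0 & -\sinh\theta \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\sinh\theta & 0 & 0 & \cosh\theta \end{pmatrix},
\]

as one wanted.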
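Both results on boosts, the composition rule of Exercise 1.3.14 and the hyperbolic form just derived, are easy to test numerically. The following sketch is an illustration only; the function names `boost`, `add_velocities` and `hyperbolic` and the sample velocities are assumptions, not notation from Naber [1].

```python
import math


def boost(beta):
    """The matrix Lambda(beta) with coordinates ordered (x1, x2, x3, x4)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, 0.0, 0.0, -beta * g],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-beta * g, 0.0, 0.0, g]]


def hyperbolic(theta):
    """The matrix L(theta) with cosh and sinh entries."""
    c, s = math.cosh(theta), math.sinh(theta)
    return [[c, 0.0, 0.0, -s],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-s, 0.0, 0.0, c]]


def add_velocities(b1, b2):
    """Relativistic addition (b1 + b2) / (1 + b1 b2)."""
    return (b1 + b2) / (1.0 + b1 * b2)


def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def close(p, q, tol=1e-12):
    """Entrywise comparison of two 4x4 matrices."""
    return all(abs(p[i][j] - q[i][j]) <= tol for i in range(4) for j in range(4))


b1, b2 = 0.6, 0.7
b12 = add_velocities(b1, b2)
print(abs(b12) < 1.0)                                   # part a)
print(close(matmul(boost(b1), boost(b2)), boost(b12)))  # part b)
print(close(hyperbolic(math.atanh(b1)), boost(b1)))     # hyperbolic form
```

In terms of $\theta = \operatorname{artanh}\beta$ the composition rule becomes plain addition, $\theta_{1+2} = \theta_1 + \theta_2$, which is one reason the hyperbolic form is convenient.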


Problem:

Let $x$, $x_0$ and $x_1$ be events for which $x - x_0$ and $x_1 - x$ are spacelike and orthogonal. Show that

\[
S^2(x_1 - x_0) = S^2(x_1 - x) + S^2(x - x_0).
\]

Solution:

One has that $S^2(x_1 - x_0) = S^2(x_1 - x + x - x_0)$, and from Naber [1] pages 61-62 this gives $S^2(x_1 - x_0) = Q(x_1 - x + x - x_0)$. Next one has that

\[
S^2(x_1 - x_0) = Q(x_1 - x + x - x_0) = Q(x_1 - x) + Q(x - x_0) + 2(x_1 - x)\cdot(x - x_0).
\]

Since $x - x_0$ and $x_1 - x$ are orthogonal to each other, $(x_1 - x)\cdot(x - x_0) = 0$, and since they are spacelike one has

\[
S^2(x_1 - x_0) = Q(x_1 - x) + Q(x - x_0) = S^2(x_1 - x) + S^2(x - x_0).
\]

Using the notation $v = x_1 - x$ and $w = x - x_0$, one sees that one has proven the Pythagorean Theorem,

\[
S^2(v + w) = S^2(v) + S^2(w),
\]

as one wanted.


Problems:

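Show that any Hermitian $H$ in $\mathbb{C}^{2\times 2}$ is uniquely expressible in the form

\[
H = \begin{pmatrix} x^3 + x^4 & x^1 + ix^2 \\ x^1 - ix^2 & -x^3 + x^4 \end{pmatrix},
\]

where $x^a$, $a = 1,2,3,4$, are real. Show, moreover, that the representation above is equivalent to

\[
H = x^1\sigma_1 + x^2\sigma_2 + x^3\sigma_3 + x^4\sigma_4,
\]

where $\sigma_i$, $i = 1,2,3$, are the Pauli spin matrices

\[
\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},
\]

and $\sigma_4$ is the 2 × 2 identity matrix.

Solutions:

First one shows that the matrix above is actually Hermitian, i.e. $H = H^{CT}$, where $C$ denotes conjugation and $T$ transposition:

\[
H^C = \begin{pmatrix} x^3 + x^4 & x^1 - ix^2 \\ x^1 + ix^2 & -x^3 + x^4 \end{pmatrix}, \qquad
\left(H^C\right)^T = \begin{pmatrix} x^3 + x^4 & x^1 + ix^2 \\ x^1 - ix^2 & -x^3 + x^4 \end{pmatrix} = H.
\]

In general, a 2 × 2 matrix $H$ with complex entries and its conjugate transpose have the form ($\bar{\alpha}$ is the conjugate of $\alpha$)

\[
H = \begin{pmatrix} \alpha & \beta \\ \delta & \gamma \end{pmatrix} \qquad\text{and}\qquad H^{CT} = \begin{pmatrix} \bar{\alpha} & \bar{\delta} \\ \bar{\beta} & \bar{\gamma} \end{pmatrix}.
\]

For these to be equal, i.e. for $H$ to be Hermitian, one needs $\beta = \bar{\delta}$ and $\delta = \bar{\beta}$, so $\beta$ and $\delta$ must be each other's conjugates. From this one concludes that $\beta = x^1 + ix^2$ and $\delta = x^1 - ix^2$, where $x^1 = \operatorname{Re}\beta$ and $x^2 = \operatorname{Im}\beta$ are real, is the unique way of writing $\beta$ and $\delta$. Moreover, one must have $\alpha = \bar{\alpha}$ and $\gamma = \bar{\gamma}$, so $\alpha$ and $\gamma$ are real. One can then write $\alpha$ and $\gamma$ as the system of equations $\alpha = x^3 + x^4$ and $\gamma = -x^3 + x^4$, where $x^3$ and $x^4$ are real. From basic linear algebra this system has the unique solution $x^3 = \tfrac{1}{2}(\alpha - \gamma)$ and $x^4 = \tfrac{1}{2}(\alpha + \gamma)$. One concludes that

\[
H = \begin{pmatrix} x^3 + x^4 & x^1 + ix^2 \\ x^1 - ix^2 & -x^3 + x^4 \end{pmatrix}
\]

is the unique way of writing a 2 × 2 Hermitian matrix with complex entries.

To show the last part, one multiplies each matrix $\sigma_a$ by the constant $x^a$, $a = 1,2,3,4$, and adds the results:

\[
H = x^1\sigma_1 + x^2\sigma_2 + x^3\sigma_3 + x^4\sigma_4
= \begin{pmatrix} 0 & x^1 \\ x^1 & 0 \end{pmatrix} + \begin{pmatrix} 0 & ix^2 \\ -ix^2 & 0 \end{pmatrix} + \begin{pmatrix} x^3 & 0 \\ 0 & -x^3 \end{pmatrix} + \begin{pmatrix} x^4 & 0 \\ 0 & x^4 \end{pmatrix}
= \begin{pmatrix} x^3 + x^4 & x^1 + ix^2 \\ x^1 - ix^2 & -x^3 + x^4 \end{pmatrix}.
\]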



Problem:

Let $N$ be a future-directed null vector in $\mathcal{M}$ and $\{e_a\}$ an admissible basis with $N = N^a e_a$. Show that

\[
N = \epsilon\,(\vec{e} + e_4),
\]

where $\epsilon = -N\cdot e_4 = N^4$ and $\vec{e}$ is the direction 3-vector of $N$ relative to $\{e_a\}$, i.e.,

\[
\vec{e} = \left((N^1)^2 + (N^2)^2 + (N^3)^2\right)^{-1/2}\left(N^1e_1 + N^2e_2 + N^3e_3\right).
\]

Solution:

Writing out the left side, one gets

\[
N = N^ae_a = N^1e_1 + N^2e_2 + N^3e_3 + N^4e_4.
\]

The right side is then

\[
\epsilon\,(\vec{e} + e_4) = N^4(\vec{e} + e_4) = \frac{N^4}{\left((N^1)^2 + (N^2)^2 + (N^3)^2\right)^{1/2}}\left(N^1e_1 + N^2e_2 + N^3e_3\right) + N^4e_4
\]

\[
= \frac{N^4}{N^4}\left(N^1e_1 + N^2e_2 + N^3e_3\right) + N^4e_4 = N^1e_1 + N^2e_2 + N^3e_3 + N^4e_4.
\]

The last line comes from the fact that the vector $N$ is null, so that $(N^1)^2 + (N^2)^2 + (N^3)^2 - (N^4)^2 = 0$, i.e. $(N^1)^2 + (N^2)^2 + (N^3)^2 = (N^4)^2$, together with $N^4 > 0$ because $N$ is future-directed.


Problems:

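With

\[
[F^a{}_b] = \begin{pmatrix} 0 & F^1{}_2 & F^1{}_3 & F^1{}_4 \\ -F^1{}_2 & 0 & F^2{}_3 & F^2{}_4 \\ -F^1{}_3 & -F^2{}_3 & 0 & F^3{}_4 \\ F^1{}_4 & F^2{}_4 & F^3{}_4 & 0 \end{pmatrix}
= \begin{pmatrix} 0 & B^3 & -B^2 & E^1 \\ -B^3 & 0 & B^1 & E^2 \\ B^2 & -B^1 & 0 & E^3 \\ E^1 & E^2 & E^3 & 0 \end{pmatrix}
\]

as in (19) and $\Lambda = \Lambda(\beta)$ for some $\beta \in (-1,1)$, show that

\[
\bar{E}^1 = E^1, \quad \bar{E}^2 = \gamma(E^2 - \beta B^3), \quad \bar{E}^3 = \gamma(E^3 + \beta B^2),
\]
\[
\bar{B}^1 = B^1, \quad \bar{B}^2 = \gamma(\beta E^3 + B^2), \quad \bar{B}^3 = -\gamma(\beta E^2 - B^3).
\]

Solutions:

The transformation $\Lambda(\beta)$ in matrix form is given by

\[
\Lambda(\beta) = \begin{pmatrix} \gamma & 0 & 0 & -\beta\gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\beta\gamma & 0 & 0 & \gamma \end{pmatrix}
\]

with $\beta \in (-1,1)$ and $\gamma = \dfrac{1}{(1-\beta^2)^{1/2}} \ge 1$. This transformation takes our skew-symmetric matrix given by (19) from the orthonormal basis $\{e_a\}$ to the orthonormal basis $\{\hat{e}_a\}$ by

\[
\bar{F}^a{}_b = \Lambda^a{}_\alpha\,\Lambda_b{}^\beta\,F^\alpha{}_\beta,
\]

where $\Lambda_b{}^\beta$ denotes the entry in row $\beta$ and column $b$ of the inverse of $\Lambda(\beta)$, which is given by (4),

\[
(\Lambda(\beta))^{-1} = \begin{pmatrix} \gamma & 0 & 0 & \beta\gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \beta\gamma & 0 & 0 & \gamma \end{pmatrix}.
\]

The components of the electric field are then

\[
\bar{E}^1 = \bar{F}^1{}_4 = \Lambda^1{}_\alpha\Lambda_4{}^\beta F^\alpha{}_\beta = \Lambda^1{}_1\Lambda_4{}^\beta F^1{}_\beta + \Lambda^1{}_4\Lambda_4{}^\beta F^4{}_\beta
= \gamma\Lambda_4{}^1F^1{}_1 + \gamma\Lambda_4{}^4F^1{}_4 - \beta\gamma\Lambda_4{}^1F^4{}_1 - \beta\gamma\Lambda_4{}^4F^4{}_4
\]
\[
= \gamma^2E^1 - (\beta\gamma)^2E^1 = \left(\frac{1}{1-\beta^2} - \frac{\beta^2}{1-\beta^2}\right)E^1 = E^1,
\]

\[
\bar{E}^2 = \bar{F}^2{}_4 = \Lambda^2{}_\alpha\Lambda_4{}^\beta F^\alpha{}_\beta = \Lambda^2{}_2\Lambda_4{}^1F^2{}_1 + \Lambda^2{}_2\Lambda_4{}^4F^2{}_4 = -\beta\gamma B^3 + \gamma E^2 = \gamma(E^2 - \beta B^3),
\]

\[
\bar{E}^3 = \bar{F}^3{}_4 = \Lambda^3{}_\alpha\Lambda_4{}^\beta F^\alpha{}_\beta = \Lambda^3{}_3\Lambda_4{}^\beta F^3{}_\beta = \Lambda_4{}^1F^3{}_1 + \Lambda_4{}^4F^3{}_4 = \beta\gamma B^2 + \gamma E^3 = \gamma(E^3 + \beta B^2).
\]

And the components of the magnetic field are

\[
\bar{B}^1 = \bar{F}^2{}_3 = \Lambda^2{}_\alpha\Lambda_3{}^\beta F^\alpha{}_\beta = \Lambda^2{}_2\Lambda_3{}^\beta F^2{}_\beta = \Lambda_3{}^3F^2{}_3 = B^1,
\]

\[
\bar{B}^2 = -\bar{F}^1{}_3 = -\Lambda^1{}_\alpha\Lambda_3{}^\beta F^\alpha{}_\beta = -\Lambda^1{}_1\Lambda_3{}^\beta F^1{}_\beta - \Lambda^1{}_4\Lambda_3{}^\beta F^4{}_\beta = -\gamma\Lambda_3{}^3F^1{}_3 + \beta\gamma\Lambda_3{}^3F^4{}_3 = \gamma B^2 + \beta\gamma E^3 = \gamma(\beta E^3 + B^2),
\]

\[
\bar{B}^3 = \bar{F}^1{}_2 = \Lambda^1{}_\alpha\Lambda_2{}^\beta F^\alpha{}_\beta = \Lambda^1{}_1\Lambda_2{}^\beta F^1{}_\beta + \Lambda^1{}_4\Lambda_2{}^\beta F^4{}_\beta = \gamma\Lambda_2{}^2F^1{}_2 - \beta\gamma\Lambda_2{}^2F^4{}_2 = \gamma B^3 - \beta\gamma E^2 = -\gamma(\beta E^2 - B^3).
\]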
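The six transformation formulas can be cross-checked by carrying out the matrix product $\Lambda F \Lambda^{-1}$ numerically. This is a minimal sketch, not part of the original solution; the helper names (`field_matrix`, `transform`) and the sample field values are assumptions chosen for illustration.

```python
import math


def boost(beta):
    """The matrix Lambda(beta); its inverse is boost(-beta)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, 0.0, 0.0, -beta * g],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-beta * g, 0.0, 0.0, g]]


def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def field_matrix(E, B):
    """The matrix [F^a_b] of (19) built from the 3-vectors E and B."""
    E1, E2, E3 = E
    B1, B2, B3 = B
    return [[0.0, B3, -B2, E1],
            [-B3, 0.0, B1, E2],
            [B2, -B1, 0.0, E3],
            [E1, E2, E3, 0.0]]


def transform(E, B, beta):
    """Read the transformed E and B off Lambda F Lambda^{-1}."""
    Fbar = matmul(matmul(boost(beta), field_matrix(E, B)), boost(-beta))
    Ebar = [Fbar[0][3], Fbar[1][3], Fbar[2][3]]
    Bbar = [Fbar[1][2], -Fbar[0][2], Fbar[0][1]]
    return Ebar, Bbar


E, B, beta = (1.0, 2.0, 3.0), (0.5, -1.0, 2.0), 0.6
g = 1.0 / math.sqrt(1.0 - beta * beta)
Ebar, Bbar = transform(E, B, beta)

# Closed-form expressions from the exercise, for comparison.
E_expected = [E[0], g * (E[1] - beta * B[2]), g * (E[2] + beta * B[1])]
B_expected = [B[0], g * (beta * E[2] + B[1]), -g * (beta * E[1] - B[2])]
print(Ebar, E_expected)
print(Bbar, B_expected)
```

The two printed pairs should agree up to rounding; the components along the boost direction, $E^1$ and $B^1$, are unchanged.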


Problems:

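Prove Theorem (3.10), Exercise 2.2.3 in Naber [1].

Solutions:

Having the matrix

\[
[F^a{}_b] = \begin{pmatrix} 0 & B^3 & -B^2 & E^1 \\ -B^3 & 0 & B^1 & E^2 \\ B^2 & -B^1 & 0 & E^3 \\ E^1 & E^2 & E^3 & 0 \end{pmatrix}
\]

gives us $[F^a{}_b] - \lambda I$ as

\[
\begin{pmatrix} -\lambda & B^3 & -B^2 & E^1 \\ -B^3 & -\lambda & B^1 & E^2 \\ B^2 & -B^1 & -\lambda & E^3 \\ E^1 & E^2 & E^3 & -\lambda \end{pmatrix}.
\]

The determinant of this matrix, expanded along the first row, is

\[
\det\left([F^a{}_b] - \lambda I\right) =
-\lambda\begin{vmatrix} -\lambda & B^1 & E^2 \\ -B^1 & -\lambda & E^3 \\ E^2 & E^3 & -\lambda \end{vmatrix}
- B^3\begin{vmatrix} -B^3 & B^1 & E^2 \\ B^2 & -\lambda & E^3 \\ E^1 & E^3 & -\lambda \end{vmatrix}
- B^2\begin{vmatrix} -B^3 & -\lambda & E^2 \\ B^2 & -B^1 & E^3 \\ E^1 & E^2 & -\lambda \end{vmatrix}
- E^1\begin{vmatrix} -B^3 & -\lambda & B^1 \\ B^2 & -B^1 & -\lambda \\ E^1 & E^2 & E^3 \end{vmatrix}.
\]

Expanding each 3 × 3 determinant along its first row and using $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$, one has

\[
\begin{aligned}
&-\lambda\left[-\lambda(\lambda^2 - (E^3)^2) - B^1(B^1\lambda - E^3E^2) + E^2(-B^1E^3 + \lambda E^2)\right] \\
&-B^3\left[-B^3(\lambda^2 - (E^3)^2) - B^1(-B^2\lambda - E^3E^1) + E^2(B^2E^3 + \lambda E^1)\right] \\
&-B^2\left[-B^3(B^1\lambda - E^3E^2) + \lambda(-B^2\lambda - E^3E^1) + E^2(B^2E^2 + B^1E^1)\right] \\
&-E^1\left[-B^3(-B^1E^3 + \lambda E^2) + \lambda(B^2E^3 + \lambda E^1) + B^1(B^2E^2 + B^1E^1)\right].
\end{aligned}
\]

Gathering the terms in the inner parentheses,

\[
\begin{aligned}
=\ &-\lambda\left[-\lambda^3 + \lambda(E^3)^2 - (B^1)^2\lambda + B^1E^3E^2 - E^2B^1E^3 + \lambda(E^2)^2\right] \\
&-B^3\left[-B^3\lambda^2 + B^3(E^3)^2 + B^1B^2\lambda + B^1E^3E^1 + E^2B^2E^3 + E^2\lambda E^1\right] \\
&-B^2\left[-B^3B^1\lambda + B^3E^3E^2 - B^2\lambda^2 - \lambda E^3E^1 + B^2(E^2)^2 + E^2B^1E^1\right] \\
&-E^1\left[B^3B^1E^3 - B^3\lambda E^2 + \lambda B^2E^3 + \lambda^2E^1 + B^1B^2E^2 + (B^1)^2E^1\right],
\end{aligned}
\]

and the next set of parentheses,

\[
\begin{aligned}
=\ &\lambda^4 - \lambda^2(E^3)^2 + (B^1)^2\lambda^2 - \lambda^2(E^2)^2 + (B^3)^2\lambda^2 \\
&-(B^3)^2(E^3)^2 - B^1B^2B^3\lambda - B^1B^3E^3E^1 - B^3E^2B^2E^3 - B^3E^2\lambda E^1 \\
&+B^2B^3B^1\lambda - B^2B^3E^3E^2 + (B^2)^2\lambda^2 + B^2\lambda E^3E^1 - (B^2)^2(E^2)^2 - B^2E^2B^1E^1 \\
&-E^1B^3B^1E^3 + E^1B^3\lambda E^2 - E^1\lambda B^2E^3 - \lambda^2(E^1)^2 - E^1B^1B^2E^2 - (B^1)^2(E^1)^2.
\end{aligned}
\]

Eliminating and rearranging terms,

\[
\begin{aligned}
&\lambda^4 + \lambda^2\left[(B^1)^2 + (B^2)^2 + (B^3)^2 - (E^1)^2 - (E^2)^2 - (E^3)^2\right] \\
&-(E^1)^2(B^1)^2 - E^1B^1B^2E^2 - E^1B^1E^3B^3 \\
&-(B^2)^2(E^2)^2 - B^2E^2B^1E^1 - E^2B^2E^3B^3 \\
&-(B^3)^2(E^3)^2 - E^3B^3E^1B^1 - E^3B^3E^2B^2,
\end{aligned}
\]

which in the end gives us

\[
= \lambda^4 + \left(|\vec{B}|^2 - |\vec{E}|^2\right)\lambda^2 - \left(\vec{E}\cdot\vec{B}\right)^2,
\]

where $|\vec{E}|^2 = (E^1)^2 + (E^2)^2 + (E^3)^2$, $|\vec{B}|^2 = (B^1)^2 + (B^2)^2 + (B^3)^2$ and $\vec{E}\cdot\vec{B} = E^1B^1 + E^2B^2 + E^3B^3$.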
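Because the expansion above is long, a numerical spot check is a useful safeguard. The sketch below is an illustration only: it evaluates $\det([F^a{}_b] - \lambda I)$ by cofactor expansion for sample values and compares it with the closed form. The function names and the sample values are assumptions.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def char_poly_direct(E, B, lam):
    """det([F^a_b] - lam I) computed from the matrix itself."""
    E1, E2, E3 = E
    B1, B2, B3 = B
    m = [[-lam, B3, -B2, E1],
         [-B3, -lam, B1, E2],
         [B2, -B1, -lam, E3],
         [E1, E2, E3, -lam]]
    return det(m)


def char_poly_formula(E, B, lam):
    """lam^4 + (|B|^2 - |E|^2) lam^2 - (E . B)^2."""
    e2 = sum(x * x for x in E)
    b2 = sum(x * x for x in B)
    eb = sum(x * y for x, y in zip(E, B))
    return lam ** 4 + (b2 - e2) * lam ** 2 - eb ** 2


E, B, lam = (1.0, 2.0, 3.0), (-1.0, 0.5, 2.0), 0.7
print(char_poly_direct(E, B, lam), char_poly_formula(E, B, lam))
```

The two printed numbers should agree up to rounding for any choice of $\vec{E}$, $\vec{B}$ and $\lambda$.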


Problem:

Prove that the energy-momentum transformation $T$ is symmetric with respect to the Lorentz scalar product, i.e.,

\[
Tx\cdot y = x\cdot Ty.
\]

Solution:

It is easiest to prove the symmetry by looking at the matrix $[T^a{}_b]$ of $T$ relative to an arbitrary admissible basis $\{e_a\}$ for $\mathcal{M}$. It is given by

\[
T^a{}_b = \frac{1}{4\pi}\left[\frac{1}{4}F^\alpha{}_\beta F^\beta{}_\alpha\,\delta^a_b - F^a{}_\alpha F^\alpha{}_b\right], \qquad a,b = 1,2,3,4,
\]

that is, $T = \frac{1}{4\pi}\left[\frac{1}{4}\operatorname{tr}(F^2)\,I - F^2\right]$.

One knows from before that $F$ is skew-symmetric with respect to the Lorentz scalar product, $Fx\cdot y = -x\cdot Fy$.

The first term of $T$ is a multiple of the identity, since $F^\alpha{}_\beta F^\beta{}_\alpha$ is a scalar that does not change when the summation indices are renamed. It is therefore trivially symmetric. For the last term $F^a{}_\alpha F^\alpha{}_b$, which is the matrix of $F^2$, one applies the skew-symmetry of $F$ twice:

\[
F(Fx)\cdot y = -Fx\cdot Fy = x\cdot F(Fy).
\]

The two sign changes cancel, so $F^2$ is symmetric. Since both terms are symmetric, one concludes that $Tx\cdot y = x\cdot Ty$.


Problems:

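Show that, if each $F(p)$ is skew-symmetric, then, in terms of the 3-vectors $\vec{E}$ and $\vec{B}$,

\[
(\operatorname{div}F)^i = \left(\frac{\partial\vec{E}}{\partial x^4} - \operatorname{curl}\vec{B}\right)\cdot e_i, \qquad i = 1,2,3,
\]
\[
(\operatorname{div}F)^4 = -\operatorname{div}\vec{E}.
\]

Solutions:

First one rewrites the divergence and curl in tensor notation:

\[
\operatorname{div}\vec{E} = \frac{\partial E^1}{\partial x^1} + \frac{\partial E^2}{\partial x^2} + \frac{\partial E^3}{\partial x^3} = E^1{}_{,1} + E^2{}_{,2} + E^3{}_{,3},
\]

\[
\operatorname{curl}\vec{B} = \left(\frac{\partial B^3}{\partial x^2} - \frac{\partial B^2}{\partial x^3}\right)e_1 + \left(\frac{\partial B^1}{\partial x^3} - \frac{\partial B^3}{\partial x^1}\right)e_2 + \left(\frac{\partial B^2}{\partial x^1} - \frac{\partial B^1}{\partial x^2}\right)e_3
= \left(B^3{}_{,2} - B^2{}_{,3}\right)e_1 + \left(B^1{}_{,3} - B^3{}_{,1}\right)e_2 + \left(B^2{}_{,1} - B^1{}_{,2}\right)e_3,
\]

\[
\frac{\partial\vec{E}}{\partial x^4} = \frac{\partial E^1}{\partial x^4}e_1 + \frac{\partial E^2}{\partial x^4}e_2 + \frac{\partial E^3}{\partial x^4}e_3 = E^1{}_{,4}\,e_1 + E^2{}_{,4}\,e_2 + E^3{}_{,4}\,e_3.
\]

Remembering that $e_i\cdot e_j = \delta_{ij}$, this gives the components

\[
\left(\frac{\partial\vec{E}}{\partial x^4} - \operatorname{curl}\vec{B}\right)\cdot e_1 = E^1{}_{,4} + B^2{}_{,3} - B^3{}_{,2}, \quad
\left(\frac{\partial\vec{E}}{\partial x^4} - \operatorname{curl}\vec{B}\right)\cdot e_2 = E^2{}_{,4} + B^3{}_{,1} - B^1{}_{,3}, \quad
\left(\frac{\partial\vec{E}}{\partial x^4} - \operatorname{curl}\vec{B}\right)\cdot e_3 = E^3{}_{,4} + B^1{}_{,2} - B^2{}_{,1}.
\]

The divergence of our skew-symmetric linear transformation

\[
F = [F^a{}_b] = \begin{pmatrix} F^1{}_1 & F^1{}_2 & F^1{}_3 & F^1{}_4 \\ F^2{}_1 & F^2{}_2 & F^2{}_3 & F^2{}_4 \\ F^3{}_1 & F^3{}_2 & F^3{}_3 & F^3{}_4 \\ F^4{}_1 & F^4{}_2 & F^4{}_3 & F^4{}_4 \end{pmatrix}
= \begin{pmatrix} 0 & B^3 & -B^2 & E^1 \\ -B^3 & 0 & B^1 & E^2 \\ B^2 & -B^1 & 0 & E^3 \\ E^1 & E^2 & E^3 & 0 \end{pmatrix}
\]

relative to any $\{e_a\}$ is given by

\[
(\operatorname{div}F)^b = \eta^{b\beta}F^\alpha{}_{\beta,\alpha}, \qquad b = 1,2,3,4,
\]

such that

\[
(\operatorname{div}F)^i = F^\alpha{}_{i,\alpha} \ \text{ for } i = 1,2,3, \qquad\text{and}\qquad (\operatorname{div}F)^4 = -F^\alpha{}_{4,\alpha}.
\]

This gives for $i = 1$:

\[
(\operatorname{div}F)^1 = F^\alpha{}_{1,\alpha} = F^1{}_{1,1} + F^2{}_{1,2} + F^3{}_{1,3} + F^4{}_{1,4} = -B^3{}_{,2} + B^2{}_{,3} + E^1{}_{,4},
\]

for $i = 2$,

\[
(\operatorname{div}F)^2 = F^\alpha{}_{2,\alpha} = B^3{}_{,1} - B^1{}_{,3} + E^2{}_{,4},
\]

for $i = 3$,

\[
(\operatorname{div}F)^3 = F^\alpha{}_{3,\alpha} = -B^2{}_{,1} + B^1{}_{,2} + E^3{}_{,4},
\]

and last for $i = 4$,

\[
(\operatorname{div}F)^4 = -F^\alpha{}_{4,\alpha} = -E^1{}_{,1} - E^2{}_{,2} - E^3{}_{,3} = -\operatorname{div}\vec{E}.
\]

Comparing the expressions for $i = 1,2,3$ with the components of $\frac{\partial\vec{E}}{\partial x^4} - \operatorname{curl}\vec{B}$ found above shows that they agree term by term.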

Exercise 2.7.5, page 129

Problem:

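Show that the second pair of Maxwell's equations (26) is equivalent to

\[
F_{ab,c} + F_{bc,a} + F_{ca,b} = 0, \qquad a,b,c = 1,2,3,4.
\]

Solution:

The components $F_{ab}$ relative to an admissible basis $\{e_a\}$ are given by

\[
F_{ab} = \eta_{ac}F^c{}_b, \qquad\text{or}\qquad F_{ib} = F^i{}_b \ \text{ for } i = 1,2,3 \quad\text{and}\quad F_{4b} = -F^4{}_b,
\]

where $F^c{}_b$ are the entries of the skew-symmetric matrix

\[
F = [F^c{}_b] = \begin{pmatrix} F^1{}_1 & F^1{}_2 & F^1{}_3 & F^1{}_4 \\ F^2{}_1 & F^2{}_2 & F^2{}_3 & F^2{}_4 \\ F^3{}_1 & F^3{}_2 & F^3{}_3 & F^3{}_4 \\ F^4{}_1 & F^4{}_2 & F^4{}_3 & F^4{}_4 \end{pmatrix}
= \begin{pmatrix} 0 & B^3 & -B^2 & E^1 \\ -B^3 & 0 & B^1 & E^2 \\ B^2 & -B^1 & 0 & E^3 \\ E^1 & E^2 & E^3 & 0 \end{pmatrix}.
\]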

This gives for i = 1:

F11 = F 11 = 0, F12 = F 1

2 = B3, F13 = F 1

3 = −B2, F14 = F 1

4 = E1,

and for i = 2,F21 = F 2

1 = −B3, F23 = F 2

3 = B1, F24 = F 2

4 = E2,

and for i = 3,F31 = F 1

1 = B2, F32 = F 3

2 = −B1, F34 = F 3

4 = E3,

and last for i = 4,

F41 = F 41 = E1

, F42 = F 42 = E2

, F43 = F 43 = E3

.

Showing one of the the cyclic rotations of these coordinates, and removing those that arezero:

For a = 1, and b = 1:

F11,c + F1c,1 + Fc1,1 = F12,1 + F21,1 + F13,1 + F31,1 + F14,1 − F41,1,

= B3,1 −B3

,1 −B2,1 +B2

,1 +E1,1 −E1

,1 = 0

for a = 1, and b = 2:

F12,c + F2c,1 + Fc1,2 = F12,1 + F21,1 + F12,2 + F21,2 + F12,3 + F23,1 + F31,2 + F12,4 + F24,1 − F41,2,

= B3,1 −B3

,1 +B3,2 −B3

,2 +B3,3 +B1

,1 +B2,2 +B3

,4 +E2,1 −E1

,2

= B3,3 +B1

,1 +B2,2 +B3

,4 +E2,1 −E1

,2

for a = 1, and b = 3:

F13,c + F3c,1 + Fc1,3 = F13,1 + F31,1 + F13,2 + F32,1 + F21,3 + F13,3 + F31,3 + F13,4 + F34,1 − F43,1,

= −B2,1 +B2

,1 −B2,2 −B1

,1 −B33 −B2

,3 +B2,3 −B2

,4 +E3,1 −E1

,3


= −B2,2 −B1

,1 −B33 −B2

,4 +E3,1 −E1

,3,

for a = 1, and b = 4:

F14,c − F4c,1 + Fc1,4 = F14,1 − F41,1 + F14,2 − F42,1 + F21,4 + F14,3 − F43,1 + F31,4 + F14,4 − F41,4,

= E1,1 −E1

,1 +E1,2 −E2

,1 −B3,4 +E1

,3 −E3,1 +B2

,4 +E1,4 −E1

,4

= E1,2 −E2

,1 −B3,4 +E1

,3 −E3,1 +B2

,4,

and collected:

B3,3 +B1

,1 +B2,2 +B3

,4 +E2,1 −E1

,2 −B2,2 −B1

,1 −B33

−B2,4 +E3

,1 −E1,3 +E1

,2 −E2,1 −B3

,4 +E1,3 −E3

,1 +B2,4 = 0

By symmetry, or pure calculations can one see that Fab,c + Fbc,a + Fca,b is also zero fora = 2,3,4.


Problem:

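Show that, for any electromagnetic field, each of the functions $F_{ab}$ satisfies the wave equation

\[
\frac{\partial^2F_{ab}}{(\partial x^1)^2} + \frac{\partial^2F_{ab}}{(\partial x^2)^2} + \frac{\partial^2F_{ab}}{(\partial x^3)^2} = \frac{\partial^2F_{ab}}{(\partial x^4)^2}.
\]

Solution:

Notation from vector calculus: $\operatorname{div}\vec{E} = \nabla\cdot\vec{E}$ and $\operatorname{curl}\vec{E} = \nabla\times\vec{E}$.

First one can identify each $F_{ab}$ as in Exercise 2.7.5: relative to a basis $\{e_a\}$ one has

\[
F_{ab} = \eta_{ac}F^c{}_b, \qquad\text{or}\qquad F_{ib} = F^i{}_b \ \text{ for } i = 1,2,3 \quad\text{and}\quad F_{4b} = -F^4{}_b.
\]

Therefore each $F_{ab}$ corresponds, up to a sign, to one component of the electric field $\vec{E}$ or the magnetic field $\vec{B}$. Rewriting the problem as

\[
\frac{\partial^2F_{ab}}{(\partial x^4)^2} - \left(\frac{\partial^2F_{ab}}{(\partial x^1)^2} + \frac{\partial^2F_{ab}}{(\partial x^2)^2} + \frac{\partial^2F_{ab}}{(\partial x^3)^2}\right) = 0,
\]

it is then easy to see that this is just, component-wise, the equations

\[
\frac{\partial^2\vec{E}}{(\partial x^4)^2} - \nabla^2\vec{E} = 0
\qquad\text{and}\qquad
\frac{\partial^2\vec{B}}{(\partial x^4)^2} - \nabla^2\vec{B} = 0
\]

for the electric and magnetic field, respectively.

One then proves these two equations by using the second equation of each pair of Maxwell's equations in (25) and (26), rewritten as

\[
\nabla\times\vec{E} = -\frac{\partial\vec{B}}{\partial x^4} \qquad\text{and}\qquad \nabla\times\vec{B} = \frac{\partial\vec{E}}{\partial x^4}.
\]

Taking the curl of the curl gives

\[
\nabla\times\left(\nabla\times\vec{E}\right) = -\frac{\partial}{\partial x^4}\left(\nabla\times\vec{B}\right) = -\frac{\partial^2\vec{E}}{(\partial x^4)^2}
\qquad\text{and}\qquad
\nabla\times\left(\nabla\times\vec{B}\right) = \frac{\partial}{\partial x^4}\left(\nabla\times\vec{E}\right) = -\frac{\partial^2\vec{B}}{(\partial x^4)^2}.
\]

For a general vector field $\vec{V}$ one has from vector calculus that

\[
\nabla\times\left(\nabla\times\vec{V}\right) = \nabla\left(\nabla\cdot\vec{V}\right) - \nabla^2\vec{V}.
\]

Since the first equations of the Maxwell's equations in (25) and (26) are

\[
\nabla\cdot\vec{E} = 0 \qquad\text{and}\qquad \nabla\cdot\vec{B} = 0,
\]

one finally has that

\[
\nabla\times\left(\nabla\times\vec{E}\right) = -\nabla^2\vec{E} \qquad\text{and}\qquad \nabla\times\left(\nabla\times\vec{B}\right) = -\nabla^2\vec{B}.
\]

Comparing with the curl-of-curl expressions above gives $\nabla^2\vec{E} = \frac{\partial^2\vec{E}}{(\partial x^4)^2}$ and $\nabla^2\vec{B} = \frac{\partial^2\vec{B}}{(\partial x^4)^2}$, which are the wave equations one wanted.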


Problem:

Complete the proof of Theorem (4.21), which is Theorem 3.4.2 in the book.

Solution:

The proof is almost the same as the proof of the preceding Theorem (4.20), which is in Naber [1] on page 184.

Proof:

Given a Hermitian spinor V of valence (0,0; 1,1), a spin frame {sA} and an admissiblebasis {ea}, one define our covector v

∗ ∈M∗ (with corresponding dual space basis {ea} anddual spin frame {sA}) by specifying its components in every admissible basis in the followingway:

Writing V = VAX

sA ⊗ s

X and defining the components va of v∗ relative to {ea} by

va = −VAXsA ⊗ s

XσAX

a , a = 1,2,3,4.Suppose that {ea} is another basis for the dual space, then let Λ ∈ L be the Lorentz

transformation taking us from {ea} to {ea}.


By the spinor map do one have that, Λ = Λ±G = Spin(±G) and let {sA} be the dual spin

frame related by to {sA} by G (or by −G). Then V = VAX

sA⊗ ¯sX , where V

AX= G B

AG

Y

XVBY

(−G gives the same components). One define the components of v∗ relative to {ea} by

va = −VAXσAX

a , a = 1,2,3,4.

To see that it transforms correctly one calculate,

−VAX

σAX

a = −G B

A GY

XVBY

σAX

a

= −�G B

A GY

XσAX

a �VBY

= −�Λ b

a σBY

b�V

BY

= Λ b

a �−VBYσBY

b�

= Λ b

a vb.

