
Branching Rules and Little Higgs models

Cathelijne ter Burg

Bachelor Thesis year 3

Supervisors: prof. dr. J. Stokman & prof. dr. E. Laenen

July 16, 2015

Image is from [28]

KdVI & IoP-ITFA

Faculteit der Natuurwetenschappen, Wiskunde en Informatica

Universiteit van Amsterdam

Abstract

This thesis will discuss branching rules for GL(n,C) and Little Higgs models and will revolve around

the concept of symmetry breaking. An introduction to representation theory is given after which the

results are applied to Sn and GL(n,C). The irreducible characters of GL(n,C) will be related to the

Schur polynomials which enables us to derive the branching rules for GL(n,C). The results are then

extended to SU(N) and some examples are given. Then I discuss spontaneous symmetry breaking

in physics, introduce Nambu-Goldstone bosons (NGB) and discuss the Higgs mechanism applied to

the Standard Model gauge group. The hierarchy problem and the need to search for physics beyond

the Standard Model are discussed. We focus on Little Higgs models, which offer a partial solution to the hierarchy problem by postulating new physics at the TeV scale and yield a naturally light Higgs through a mechanism called collective symmetry breaking. Collective symmetry breaking will be introduced via an SU(3)-based toy model, after which the "Littlest Higgs", based on a global SU(5) symmetry, is discussed. Branching rules for SU(5) will also be discussed in the framework of Grand Unified Theories.

Title: Branching rules and Little Higgs models.

Author: Cathelijne ter Burg, 10422722

Supervisors: Prof. dr. J. Stokman & Prof. dr. E. Laenen

Final date: 17-07-2015

IoP-ITFA & KdVI

University of Amsterdam

Science Park 105-107, 1098 XG Amsterdam

Contents

1 Introduction
2 Necessities from representation theory
2.1 Representations
2.2 Character theory
2.3 Modules and the group algebra
2.4 Restricted and induced representations
3 The irreducible representations of Sn
3.1 The symmetric group
3.2 Young diagrams and Young tableaux
3.3 Constructing the irreducible representations of Sn
3.4 Young subgroups, induced representations and Young's Rule
4 The irreducible representations of GL(V)
4.1 The irreducible characters of S_λ V
5 Branching Rules
5.1 Branching Rules for GL(n,C)
5.2 The irreducible representations of SU(n)
6 Lagrangians, symmetries and symmetry breaking
6.1 Lagrangian formalism
6.2 Symmetries
6.3 Symmetry breaking
6.3.1 Explicit symmetry breaking
6.3.2 Spontaneous symmetry breaking
7 Goldstone bosons and the Higgs mechanism
7.1 Local U(1) gauge theory
7.2 Abelian Higgs Mechanism
7.3 The Standard Model Higgs mechanism
7.3.1 Assigning mass to gauge bosons
7.3.2 Assigning mass to fermions
8 The Hierarchy problem
8.1 Naturalness
8.2 Hierarchy problem in the Higgs sector
9 Little Higgs models
9.1 Transformation of NGB
9.2 Constructing "The Simplest Little Higgs"
9.2.1 Adding the Gauge coupling
9.2.2 Adding the Yukawa coupling
9.2.3 The Higgs potential
9.2.4 Hypercharge and color
9.2.5 The gauge sector
9.2.6 Cancellation of the W boson loop
10 Representations, particle multiplets and symmetry breaking
10.1 SU(5) → SU(3)_C × SU(2)_W × U(1)_Y
11 The Littlest Higgs
11.1 Requirements for the model
11.2 The Gauge bosons
11.3 The Quartic Higgs potential and Higgs mass
11.4 Viability of "Littlest Higgs" and signatures in experiment
12 Summary
12.1 Part I
12.2 Part II
12.3 Acknowledgements
13 Popular summary (Dutch)
A Symmetric polynomials
A.1 Monomial symmetric polynomials
A.2 Complete symmetric polynomials
A.3 Elementary symmetric polynomials
A.4 Schur polynomials
A.5 Orthogonality
A.6 Relations among the symmetric polynomials
A.6.1 Skew Schur functions
B Lie groups and Lie algebras
B.1 Lie groups
B.2 Lie algebra
B.3 Examples
C Notation and relevant quantum numbers
C.1 Notation
C.2 Quantum numbers
C.2.1 Isospin
C.2.2 Weak isospin and weak hypercharge
D Feynman rules and calculating loop integrals
D.1 Superficial degree of divergence
D.2 Regularization schemes
D.2.1 Momentum Cut-off regularization
D.3 Calculation of quadratic divergent contributions to the Higgs mass

1 Introduction

Symmetries and symmetry groups play an important role in modern science, in mathematics as well as in physics. Mathematically, we say that an object has a symmetry when it is invariant under a transformation. A sphere in three dimensions, for instance, has rotational symmetry, i.e. it is invariant under the continuous symmetry group SO(3). In most cases, though, continuous symmetry groups are not easy to work with. Therefore, mathematicians "represent" their elements as linear transformations between vector spaces. Representation theory goes back to the late nineteenth century and finds applications in many fields, ranging from mathematics and statistics, both pure and applied, to physics. In the latter it has, among others, applications in particle physics. There, it has proved convenient to associate the transformations of different particles under a symmetry group with the different representations of that group. Each particle is assigned to a certain representation and is said to transform as the representation, or to "lie in the representation". As important as symmetries is the notion of symmetry breaking, meaning that the symmetry group is reduced to a smaller group. This will be an integral part of this thesis. Mathematically, symmetry breaking is described by branching rules. These describe how the restriction of an irreducible representation to a subgroup decomposes into irreducible representations of that subgroup. Put in terms of particles, they tell us how the particles transform under the reduced symmetry group. A symmetry group of special importance in particle physics is SU(N), as the Standard Model symmetry group is SU(3)_C × SU(2)_W × U(1)_Y.

The Standard Model describes the universe in terms of fermions (matter) and the forces between them, which are mediated by the exchange of bosons¹. With the discovery of the Higgs boson in 2012, all particles predicted by the Standard Model have been observed. The model is highly consistent with experimental data and stands as one of the biggest successes of modern science. However, it leaves some questions unanswered, one of them being the nature of dark matter, and physicists nowadays believe that it is only an effective theory, meaning that at some high energy scale it must be replaced by a more fundamental theory. One reason for expecting physics beyond the Standard Model comes from measurements of the coupling constants. They are very different at low energies, but at higher energies they seem to converge to a single point at around 10^15 GeV, indicating that the three forces were once united. Here we discuss a different motivation for expecting physics not far beyond the Standard Model, namely the hierarchy problem. Put briefly, the hierarchy problem refers to the vast difference between the energy scales of different physical theories, which causes the Higgs mass to acquire quadratically divergent quantum corrections. If the Standard Model is assumed to remain valid up to energies many orders of magnitude above the electroweak symmetry breaking scale, extreme fine-tuning is required to keep the Higgs mass at its measured value of ∼ 100 GeV, which is highly unnatural. In this thesis I will discuss a class of models that form a partial solution to the hierarchy problem. These are called "Little Higgs models" and they postulate new physics at the 1 TeV scale by introducing new particles. Little Higgs models will be the subject of the second part.

This thesis is organized as follows. The first part focuses on mathematics, the second on physics. I start by providing an overview of the results from representation theory that will be needed. In section 3 I discuss the symmetric group Sn. I introduce the Young diagrams corresponding to the partitions of n, and the Young tableaux, and show that by constructing a particular element in the group algebra, called the Young symmetrizer, we can construct all the irreducible representations of Sn, which are parametrized by the partitions. Then we turn our attention to the group GL(V) ≅ GL(n,C) in section 4. It turns out that the same Young symmetrizer can also be used to construct many of the irreducible representations of GL(n,C). In particular, I show that the irreducible characters are given by certain symmetric polynomials, called Schur polynomials. Once we have made this identification, determining branching rules in section 5 comes down to using known identities between these Schur polynomials. The required results about Schur polynomials can be found in Appendix A. Once we have these branching rules, I will discuss some results from Lie theory to argue that we can extend all results to SU(N) ⊂ U(N) ⊂ GL(N,C).

Then, in section 6, I introduce the field-theoretic Lagrangian, discuss symmetry breaking and introduce Nambu-Goldstone bosons. Section 7 introduces local symmetries, the covariant derivative and gauge fields, and discusses the Higgs mechanism in the Standard Model, which is responsible for the masses of the elementary fermions and the W± and Z0 bosons. Section 8 focuses on the hierarchy problem. In section 9 I discuss the first Little Higgs model, which is based on SU(3). It acts as a toy model to understand the physics. In section 10 I give a general introduction to the group SU(5) by discussing how the fundamental particles can be distributed over the irreducible SU(5) representations. For this, we use one of the branching rules derived in section 5. Finally, in section 11 I discuss the "Littlest Higgs", based on SU(5), which is the minimal model that could act as a viable extension of the Standard Model at the TeV scale.

¹ This includes the strong, weak and electromagnetic forces. It does not incorporate gravity.

2 Necessities from representation theory

This section serves as an overview of all results from representation theory that will be needed in the following sections. The results are mainly based on [1], [4], [5] and [6]². We begin by defining what we mean by a representation.

2.1 Representations

Definition 2.1. Let G be a group. A representation of G on a C-vector space V is a homomorphism ρ : G → GL(V) from G to the group of automorphisms of V, i.e. a map satisfying ρ(gh) = ρ(g)ρ(h) for all g, h ∈ G and ρ(1) = 1.

The dimension of the representation is the dimension of the vector space V. Here GL(V) is the group of all invertible linear transformations φ : V → V. We will often call V itself the representation of G and omit the symbol ρ, that is, write gv for ρ(g)v. A few representations that will be important are the following.

Definition 2.2. The trivial representation of G is the representation ρ : G → GL(C) such that

ρ(g) = 1 for all g ∈ G.

All groups have this one-dimensional representation. In the case where G = Sn, the symmetric group

on n letters, there is a second one-dimensional representation.

Definition 2.3. The sign representation of Sn is the representation sgn : Sn → GL(C) such that

\[
\operatorname{sgn}(g) =
\begin{cases}
1 & \text{if } g \text{ is an even permutation} \\
-1 & \text{if } g \text{ is an odd permutation}
\end{cases}
\tag{2.1}
\]

Definition 2.4. If X is a finite set and G acts on X on the left, then there is an associated permutation representation. If V is the vector space with basis {e_x : x ∈ X}, then G acts on V by

\[
g \cdot \sum_{x \in X} a_x e_x = \sum_{x \in X} a_x e_{gx}. \tag{2.2}
\]

Definition 2.5. The permutation representation corresponding to the left action of G on itself is called the regular representation.

This representation has dimension |G| and a basis given by {e_g : g ∈ G}.

Definition 2.6. A sub-representation of a representation V is a linear subspace W of V which is invariant under the action of G, i.e. g · w ∈ W for all g ∈ G and w ∈ W.

Definition 2.7. A representation V is called irreducible (or simple) if its only sub-representations are 0 and V itself. It is called indecomposable if it cannot be written as a direct sum of two nonzero sub-representations. It is called reducible if it has a proper nonzero sub-representation.

² Proofs have been omitted in agreement with Prof. Stokman, mainly in view of their length and relevance.

Example 2.1. Consider the symmetric group Sn on n letters and let e1, e2, . . . , en be the standard basis of C^n. Sn is a permutation group and thus has a natural permutation representation, in which it acts on C^n by permuting the indices of the basis vectors. Note that the one-dimensional subspace spanned by e1 + e2 + . . . + en is left invariant under the action of Sn. It has a complementary subspace {(x1, . . . , xn) : x1 + . . . + xn = 0}. This subspace is (n − 1)-dimensional and is also invariant, and hence a sub-representation. It is called the standard representation of Sn.
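The two invariant subspaces of Example 2.1 can be checked numerically. The following Python sketch is an illustration only and not part of the original text; it assumes n = 3, 0-based indices, and the helper names `perm_matrix` and `apply` are hypothetical.

```python
from itertools import permutations

def perm_matrix(sigma):
    """Matrix of sigma on C^n: column j has its 1 in row sigma[j], so e_j -> e_{sigma(j)}."""
    n = len(sigma)
    return [[1 if sigma[j] == i else 0 for j in range(n)] for i in range(n)]

def apply(matrix, vector):
    """Ordinary matrix-vector product."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]

n = 3
group = list(permutations(range(n)))  # all elements of S_3 as tuples of images

# The vector e_1 + ... + e_n is fixed by every permutation.
ones = [1] * n
trivial_ok = all(apply(perm_matrix(s), ones) == ones for s in group)

# Basis e_i - e_{i+1} of the subspace x_1 + ... + x_n = 0.
basis = [[1 if j == i else -1 if j == i + 1 else 0 for j in range(n)]
         for i in range(n - 1)]
# The image of each basis vector still has coordinate sum zero.
standard_ok = all(sum(apply(perm_matrix(s), b)) == 0 for s in group for b in basis)

print(trivial_ok, standard_ok)
```

Both flags should come out true, reflecting the splitting of C^3 into the trivial and the standard representation.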

Making new representations

Once we have two representations V and W it is possible to construct new representations. This is most easily done by taking their direct sum V ⊕ W. The tensor product V ⊗ W is also a representation, via g(v ⊗ w) = gv ⊗ gw. It follows that the nth tensor power V^{⊗n} is a representation as well. This nth tensor power has the exterior power and the symmetric power, denoted Λ^n V and Sym^n V respectively, as sub-representations. They are defined as follows.

Definition 2.8. The nth symmetric power Sym^n V is the subspace of V^{⊗n} spanned by

\[
\Big\{ \sum_{\sigma \in S_n} v_{\sigma(1)} \otimes \cdots \otimes v_{\sigma(n)} \;\Big|\; v_i \in V \Big\}.
\]

Definition 2.9. The nth exterior power Λ^n V is the subspace of V^{⊗n} spanned by

\[
\Big\{ \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma)\, v_{\sigma(1)} \otimes \cdots \otimes v_{\sigma(n)} \;\Big|\; v_i \in V \Big\}.
\]

A completely different way of constructing a new representation is through the dual representation V ∗

of V . This is the space of all linear maps φ : V → C.

Definition 2.10. If ρ_V : G → GL(V) is a representation of G on a vector space V, then the dual representation ρ_{V*} : G → GL(V*) is defined by

\[
\rho_{V^*}(g)(\varphi) = \varphi \circ \rho_V(g^{-1}).
\]

The vector space Hom(V,W) of linear maps from V to W can also be made into a representation through

\[
\rho_{\mathrm{Hom}(V,W)}(g)(\varphi) = \rho_W(g) \circ \varphi \circ \rho_V(g^{-1}).
\]

Comparing this with the defining map of the dual representation, we see that for W = C, the trivial representation, we can identify V* ≅ Hom(V,C). This is a special case of the general isomorphism

\[
\mathrm{Hom}(V,W) \cong V^* \otimes W.
\]

Complete reducibility and Schur’s lemma

Given a representation, we would now like to know how it is built up from its irreducible sub-representations or, put differently, how it decomposes into irreducible sub-representations. We start with the following result.

Proposition 2.1. If W is a sub-representation of a representation V of a finite group G, then there is a complementary invariant subspace W⊥ of V such that V = W ⊕ W⊥.

Proof. We define a Hermitian inner product H(·, ·) on V such that H(gv, gw) = H(v, w) for all g ∈ G and v, w ∈ V. We obtain such an inner product by taking any Hermitian inner product H0 on V and averaging it over G. That is, we define

\[
H(v,w) = \frac{1}{|G|} \sum_{g \in G} H_0(gv, gw).
\]

Let W⊥ be the orthogonal complement of W with respect to H, so that V = W ⊕ W⊥. If W is a sub-representation of V, then so is W⊥: for v ∈ W⊥ and g ∈ G we have

\[
H(gv, w) = H(v, g^{-1}w) = 0 \quad \text{for all } w \in W,
\]

since g^{-1}w ∈ W. Hence gv ∈ W⊥.
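The averaging step in this proof can be made concrete. The sketch below is an illustration under simplifying assumptions that are not in the original: it uses S_3 permuting the coordinates of Q^3, a real symmetric form in place of a Hermitian one, and an arbitrarily chosen weighted form `h0`. It shows that `h0` is not invariant, while its average over the group is.

```python
from fractions import Fraction
from itertools import permutations

def act(sigma, v):
    """Permutation action on coordinates: e_i -> e_{sigma(i)}, i.e. (gv)[sigma[i]] = v[i]."""
    out = [0] * len(v)
    for i, image in enumerate(sigma):
        out[image] = v[i]
    return out

def h0(v, w, weights=(1, 2, 3)):
    """A deliberately non-invariant inner product (hypothetical choice of weights)."""
    return sum(Fraction(c) * a * b for c, a, b in zip(weights, v, w))

group = list(permutations(range(3)))

def h(v, w):
    """Averaged form H(v, w) = (1/|G|) * sum over g of H0(gv, gw)."""
    return sum(h0(act(g, v), act(g, w)) for g in group) / len(group)

v, w = [1, 0, 2], [0, 1, -1]
h0_invariant = all(h0(act(g, v), act(g, w)) == h0(v, w) for g in group)
h_invariant = all(h(act(g, v), act(g, w)) == h(v, w) for g in group)

print(h0_invariant, h_invariant, h(v, w))
```

In this example the averaging replaces the weights (1, 2, 3) by their mean 2, so H is twice the standard dot product, which is manifestly invariant under permutations.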

This proposition tells us that we can in fact consider any representation as a direct sum of sub

representations, and by induction on the dimension it can be concluded that it can in fact be written

as a direct sum of irreducible sub representations.

Corollary 2.1. Any representation of a finite group G can be written as a direct sum of irreducibles.

This is a property that all finite groups have and it is called complete reducibility. It does not tell us

however, how it decomposes as a direct sum of its irreducibles and whether this decomposition will

in fact be unique. This we are told by Schur’s lemma. To formulate this we need to introduce the

G-homomorphism/intertwining operator.

Definition 2.11. Given two representations (ρ, V) and (π, W) of the same group G, an intertwining operator or G-homomorphism is a linear map ψ : V → W that intertwines the actions of G, i.e. for which

\[
\psi(\rho(g)v) = \pi(g)\psi(v) \quad \text{for all } g \in G,\ v \in V.
\]

Remark. Both the kernel Ker(ψ) and the image Im(ψ) of an intertwining operator ψ : V → W are sub-representations, of V and W respectively. The vector space of all intertwining operators is denoted by HomG(V,W); it is a subspace of Hom(V,W). We can now formulate Schur's lemma, which tells us under what conditions two irreducible representations V and W are equivalent.

Lemma 2.1 (Schur's lemma). If V and W are two irreducible representations of G and φ : V → W is a G-homomorphism, then

1. φ is an isomorphism or φ = 0.

2. If V = W, then φ = λI for some complex scalar λ, where I is the identity map.

Proof. The first part follows from the fact that the kernel and image of an intertwiner are invariant subspaces of V and W respectively. Since V and W are both irreducible, these cannot be proper nonzero subspaces. Therefore Ker(φ) is either 0 or V. If Ker(φ) = V then φ = 0; if φ ≠ 0 then Ker(φ) = 0, meaning that φ is injective. Similarly, Im(φ) is either 0 or W. If φ = 0 then Im(φ) = 0; otherwise Im(φ) = W and φ is surjective. Thus φ is an isomorphism or the zero map.

For (2), let λ be an eigenvalue³ of φ. It exists since C is algebraically closed. Then the operator φ − λ·Id has a non-zero kernel, and it is again a G-homomorphism. But then (1) implies that φ − λ·Id = 0. Thus φ is multiplication by the scalar λ.

Note that it follows from this lemma that the intertwiners between irreducible representations satisfy:

• HomG(V,W) = 0 if V is not isomorphic to W.

• HomG(V,W) ≅ C if V ≅ W.

Proposition 2.2. For any representation V of a finite group G there is a decomposition

\[
V = V_1^{\oplus a_1} \oplus \cdots \oplus V_k^{\oplus a_k},
\]

where the Vi are non-isomorphic irreducible representations. This decomposition of V into a direct sum of k factors is unique up to isomorphism, and so are the Vi that occur in the decomposition and their multiplicities ai.

Proof. Suppose W is another representation of G, with decomposition W = ⊕_j W_j^{⊕b_j}. Suppose further that φ : V → W is a G-homomorphism. Then by Schur's lemma φ must map the factor V_i^{⊕a_i} into the factor W_j^{⊕b_j} for which Vi ≅ Wj, since its components into all other factors are intertwiners between non-isomorphic irreducibles and therefore vanish. Applying this to the identity map φ : V → V gives the uniqueness.

2.2 Character theory

This section will discuss character theory, a convenient way to characterize representations.

Definition 2.12. If (ρ, V) is a representation of G, then its character χV is the function χV : G → C defined by

\[
\chi_V(g) = \mathrm{Tr}(\rho(g)|_V),
\]

i.e. the trace of ρ(g) on V.

The characters are class functions on the group G. The set of class functions on G, written Cclass(G), is the set of functions that are constant on the conjugacy classes of G. That characters are class functions follows from the cyclic property of the trace:

\[
\chi_V(hgh^{-1}) = \mathrm{Tr}(\rho(h)\rho(g)\rho(h)^{-1}) = \mathrm{Tr}(\rho(h)^{-1}\rho(h)\rho(g)) = \mathrm{Tr}(\rho(g)) = \chi_V(g).
\]

A few more results about characters that we will need are given by the following proposition.

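Proposition 2.3. Let V and W be two representations of G. Then

\[
\chi_{V \oplus W} = \chi_V + \chi_W, \qquad \chi_{V \otimes W} = \chi_V \cdot \chi_W,
\]

\[
\chi_{V^*}(g) = \overline{\chi_V(g)}, \qquad \chi_{\wedge^2 V}(g) = \frac{1}{2}\left[\chi_V(g)^2 - \chi_V(g^2)\right].
\]

³ λ is an eigenvalue of φ if it is a root of the characteristic polynomial det(φ − λ·Id).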

Proof. We consider a fixed element g ∈ G and compute the values of these characters on g. Let {λi} and {µj} be the eigenvalues of g on V and W respectively. The first two formulas follow from the observation that the eigenvalues of g on V ⊕ W are the λi together with the µj, and those on V ⊗ W are the products λiµj. Similarly, the inverses λi^{−1}, which equal the complex conjugates of the λi, are the eigenvalues of g on V*, since all eigenvalues are nth roots of unity, with n the order of g. Regarding the last formula, we observe that {λiλj : i < j} are the eigenvalues of g on ∧²V, and

\[
\sum_{i<j} \lambda_i \lambda_j = \frac{\left(\sum_i \lambda_i\right)^2 - \sum_i \lambda_i^2}{2}.
\]

The formula then follows since g² has eigenvalues λi².
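As a sanity check of the formula for the character of ∧²V, the following Python sketch compares both sides for the permutation representation of S_4 on C^4. This choice of example and all function names are assumptions made for illustration. The trace on ∧²V is computed directly on the basis e_i ∧ e_j with i < j: a basis vector contributes +1 if g fixes both indices and −1 if g swaps them.

```python
from itertools import combinations, permutations

def chi_perm(s):
    """Character of the permutation representation: the number of fixed points."""
    return sum(1 for i, image in enumerate(s) if image == i)

def compose(s, t):
    """Composition (s∘t)(i) = s(t(i)) of permutations stored as tuples of images."""
    return tuple(s[t[i]] for i in range(len(t)))

def chi_wedge2(s):
    """Trace of s on the basis e_i ^ e_j (i < j) of the second exterior power."""
    total = 0
    for i, j in combinations(range(len(s)), 2):
        if (s[i], s[j]) == (i, j):
            total += 1   # e_i ^ e_j is mapped to itself
        elif (s[i], s[j]) == (j, i):
            total -= 1   # e_i ^ e_j is mapped to e_j ^ e_i = -(e_i ^ e_j)
    return total

group = list(permutations(range(4)))
formula_holds = all(
    2 * chi_wedge2(s) == chi_perm(s) ** 2 - chi_perm(compose(s, s))
    for s in group
)
print(formula_holds)
```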

Characters have many applications. For one, they can be used to explicitly find the decomposition of

a representation into a direct sum of its irreducible sub-representations. In order to achieve this we

want to find a way to project V onto the irreducible representations to find out if those irreducibles

are in V , and if so, to determine the multiplicity that they appear with. For this we introduce the

first projection formula by setting:

\[
\varphi = \frac{1}{|G|} \sum_{g \in G} \rho(g) \in \mathrm{End}(V). \tag{2.3}
\]

Here |G| is the number of elements of the group G, so φ is the average of all endomorphisms ρ(g) : V → V. We further set, for any representation V of G,

\[
V^G = \{ v \in V : gv = v \text{ for all } g \in G \}. \tag{2.4}
\]

This is again a representation of G, and it is a direct sum of trivial sub-representations of V.

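Proposition 2.4. The map φ defined in (2.3) is a projection of V onto V^G.

Proof. Suppose first that v = φ(w) = (1/|G|) Σ_{g∈G} gw. Then

\[
hv = \frac{1}{|G|} \sum_{g \in G} hgw = \frac{1}{|G|} \sum_{g \in G} gw = v \quad \text{for any } h \in G,
\]

so Im(φ) ⊂ V^G. To prove "⊃", let v ∈ V^G. Then φ(v) = (1/|G|) Σ_{g∈G} v = v, so V^G ⊂ Im(φ). Since φ acts as the identity on its image V^G, it follows that φ ∘ φ = φ.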

With formula (2.3) we can explicitly find the direct sum of the trivial sub-representations in a given representation. In particular, the multiplicity of the trivial representation in the decomposition of V is the dimension of V^G. Since φ is a projection onto V^G, this dimension is the trace of φ. Therefore, writing m for the multiplicity, we have

\[
m = \mathrm{Trace}(\varphi) = \frac{1}{|G|} \sum_{g \in G} \mathrm{Trace}(\rho(g)) = \frac{1}{|G|} \sum_{g \in G} \chi_V(g). \tag{2.5}
\]
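The projection (2.3) and the multiplicity formula (2.5) can be tried out on a small case. The sketch below assumes the permutation representation of S_3 on C^3 as the example and uses exact rational arithmetic; the helper names are hypothetical. It builds φ as the average of the six permutation matrices, checks φ∘φ = φ, and reads off the multiplicity of the trivial representation as the trace.

```python
from fractions import Fraction
from itertools import permutations

def perm_matrix(sigma):
    """Matrix sending e_j to e_{sigma(j)}, with exact rational entries."""
    n = len(sigma)
    return [[Fraction(1 if sigma[j] == i else 0) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

n = 3
group = list(permutations(range(n)))
matrices = [perm_matrix(g) for g in group]

# phi = (1/|G|) * sum of rho(g), as in (2.3).
phi = [[sum(m[i][j] for m in matrices) / len(group) for j in range(n)] for i in range(n)]

is_projection = matmul(phi, phi) == phi
multiplicity = sum(phi[i][i] for i in range(n))  # trace of phi, as in (2.5)

print(is_projection, multiplicity)
```

Here every entry of φ equals 1/3, so φ projects onto the line spanned by e1 + e2 + e3 and the trivial representation occurs once.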

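We can do more with this idea. Let

\[
\mathrm{Hom}(V,W)^G = \{ G\text{-module homomorphisms } \varphi : V \to W \},
\]

which is the subspace of G-invariant elements of the representation Hom(V,W). It follows from Schur's lemma that if V is irreducible, dim(Hom(V,W)^G) is the multiplicity of V in W. Also, if V and W are both irreducible we have

\[
\dim \mathrm{Hom}(V,W)^G =
\begin{cases}
1 & \text{if } V \cong W \\
0 & \text{if } V \not\cong W
\end{cases}
\tag{2.6}
\]

Using proposition 2.3 we have

\[
\chi_{\mathrm{Hom}(V,W)}(g) = \overline{\chi_V(g)} \cdot \chi_W(g),
\]

where we used the fact that Hom(V,W) ≅ V* ⊗ W. By now applying (2.5) to the representation Hom(V,W) we deduce

\[
\frac{1}{|G|} \sum_{g \in G} \overline{\chi_V(g)}\, \chi_W(g) =
\begin{cases}
1 & \text{if } V \cong W \\
0 & \text{if } V \not\cong W
\end{cases}
\tag{2.7}
\]

where V and W are irreducible. This relation looks a lot like an inner product ⟨χV, χW⟩, and that is precisely what it is. It defines a Hermitian inner product on Cclass(G):

\[
\langle \alpha, \beta \rangle = \frac{1}{|G|} \sum_{g \in G} \overline{\alpha(g)}\, \beta(g). \tag{2.8}
\]

We will also write (α, β) for this inner product.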

Therefore we can reformulate (2.7) as:

Theorem 2.1. In terms of the inner product (2.8), the characters of the irreducible representations

are orthonormal.

Now let V ≅ V_1^{⊕a_1} ⊕ . . . ⊕ V_k^{⊕a_k} with the Vi distinct irreducible representations. Then χV = Σ_i a_i χ_{Vi}. Since the χ_{Vi} are linearly independent, we can conclude the following.

Corollary 2.2. Any representation is determined by its character.

By using (2.7) we can further deduce the following results.

Corollary 2.3. A representation V is irreducible if and only if (χV, χV) = 1.

Proof. The implication from left to right follows immediately from (2.7). For the other implication, let V ≅ V_1^{⊕a_1} ⊕ . . . ⊕ V_k^{⊕a_k} with the Vi distinct irreducible representations. Then (χV, χV) = Σ_i a_i², which equals 1 only if k = 1 and a_1 = 1.

Corollary 2.4. Let V have a decomposition as above. Then the multiplicity ai of Vi in V is the inner

product ai = (χV , χVi).

Proof. We have

\[
(\chi_V, \chi_{V_i}) = \sum_j a_j (\chi_{V_j}, \chi_{V_i}) = a_i,
\]

since (χ_{Vj}, χ_{Vi}) = 0 for i ≠ j and 1 for i = j.
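To see these corollaries at work, the sketch below decomposes the permutation representation of S_3 on C^3 using the well-known character table of S_3. The choice of example, the indexing of conjugacy classes by their number of fixed points, and all names are assumptions made for illustration. Since all character values involved are real, complex conjugation is omitted in the inner product.

```python
from fractions import Fraction
from itertools import permutations

group = list(permutations(range(3)))

def fixed_points(s):
    """Number of fixed points; for S_3 this also identifies the conjugacy class."""
    return sum(1 for i, image in enumerate(s) if image == i)

# Character table of S_3, indexed by number of fixed points:
# 3 -> identity, 1 -> transpositions, 0 -> 3-cycles.
irreducibles = {
    "trivial": {3: 1, 1: 1, 0: 1},
    "sign": {3: 1, 1: -1, 0: 1},
    "standard": {3: 2, 1: 0, 0: -1},
}
chars = {name: (lambda g, t=t: t[fixed_points(g)]) for name, t in irreducibles.items()}

def inner(alpha, beta):
    """(alpha, beta) = (1/|G|) * sum over g of alpha(g) * beta(g), for real-valued characters."""
    return Fraction(sum(alpha(g) * beta(g) for g in group), len(group))

# Orthonormality of the irreducible characters (Theorem 2.1).
orthonormal = all(
    inner(chars[a], chars[b]) == (1 if a == b else 0) for a in chars for b in chars
)

# Multiplicities a_i = (chi_V, chi_{V_i}) for the permutation character (Corollary 2.4).
chi_perm = fixed_points
multiplicities = {name: inner(chi_perm, chi) for name, chi in chars.items()}
norm_squared = inner(chi_perm, chi_perm)  # equals the sum of a_i^2

print(orthonormal, multiplicities, norm_squared)
```

The permutation representation contains the trivial and the standard representation once each, consistent with Example 2.1, and its norm squared is 2, so by Corollary 2.3 it is not irreducible.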

Another important result follows from the fixed point formula applied to the regular representation.

Theorem 2.2 (Fixed point formula). Let G be a finite group acting on a finite set X, and let V be the associated permutation representation as in definition 2.4. Then for all g ∈ G, χV(g) is the number of elements of X left fixed under the action of g.

Proof. Observe that the matrix M associated with the action of g is a permutation matrix. Suppose first that X = {x1, x2, x3} and that ρ(g) permutes the basis vectors of V by sending e_{x1} → e_{x3}, e_{x2} to itself and e_{x3} → e_{x1}. Then

\[
M = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}.
\]

In the general case, if g e_{xi} = e_{g xi} = e_{xj}, then M has a 1 in the i-th column and j-th row, and zeros in all other entries of that column. In particular, when xi is fixed by g, so that g e_{xi} = e_{xi}, M has a 1 in the i-th row and i-th column. Therefore the trace is the number of 1's on the diagonal, i.e. the number of elements left fixed by g.
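The example in the proof and the general statement can be checked in a few lines. This Python sketch is illustrative only; it uses 0-based indices and takes S_4 acting on four points as an assumed test case.

```python
from itertools import permutations

def perm_matrix(sigma):
    """Permutation matrix with a 1 in column j, row sigma[j] (e_j -> e_{sigma(j)})."""
    n = len(sigma)
    return [[1 if sigma[j] == i else 0 for j in range(n)] for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def fixed_points(sigma):
    return sum(1 for i, image in enumerate(sigma) if image == i)

# The example from the proof: x1 -> x3, x2 -> x2, x3 -> x1 (written 0-based).
swap_13 = (2, 1, 0)
m = perm_matrix(swap_13)

# Trace equals the number of fixed points for every element of S_4.
formula_holds = all(
    trace(perm_matrix(s)) == fixed_points(s) for s in permutations(range(4))
)
print(m, trace(m), formula_holds)
```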

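From this we deduce that the character χR of the regular representation R is given by

\[
\chi_R(g) =
\begin{cases}
0 & \text{if } g \neq e \\
|G| & \text{if } g = e
\end{cases}
\tag{2.9}
\]

Since (χR, χR) = |G|, R is irreducible only when G = {e}. If we let R = ⊕_i V_i^{⊕a_i} be a decomposition into distinct irreducibles Vi, we find

\[
a_i = (\chi_{V_i}, \chi_R) = \frac{1}{|G|}\, \chi_{V_i}(e)\, |G| = \dim(V_i),
\]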

which gives us the following corollary:

Corollary 2.5. Any irreducible representation V of G appears in the regular representation dim (V )

times. In particular this means that the regular representation contains all irreducibles.

Another consequence is the following:

\[
|G| = \dim(R) = \sum_i a_i \dim(V_i) = \sum_i (\dim V_i)^2.
\]
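For S_3 these statements are easy to confirm. The sketch below, an illustration with assumed names, computes the regular character via the fixed point formula applied to left multiplication on the group itself, and then checks the dimension count with the dimensions 1, 1 and 2 of the trivial, sign and standard representations.

```python
from itertools import permutations

group = list(permutations(range(3)))  # the six elements of S_3
identity = tuple(range(3))

def compose(g, h):
    """Composition (g∘h)(i) = g(h(i)), permutations stored as tuples of images."""
    return tuple(g[h[i]] for i in range(len(h)))

def chi_regular(g):
    """Character of the regular representation: fixed points of h -> gh on G."""
    return sum(1 for h in group if compose(g, h) == h)

regular_values = {g: chi_regular(g) for g in group}

# Dimensions of the irreducibles of S_3 (trivial, sign, standard).
dims = [1, 1, 2]
# a_i = (1/|G|) * chi_i(e) * chi_R(e), since chi_R vanishes away from e.
multiplicities = [d * regular_values[identity] // len(group) for d in dims]
sum_of_squares = sum(a * d for a, d in zip(multiplicities, dims))

print(regular_values[identity], multiplicities, sum_of_squares)
```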

To conclude this section on character theory there is one more result that we will need.

Proposition 2.5. The number of irreducible representations of G is equal to the number of conjugacy

classes of G. Equivalently, the characters form an orthonormal basis for the set of class functions

CClass(G).

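Proof. Let α : G → C be a class function with (α, χV) = 0 for all irreducible representations V. We have to show that α = 0. For this, consider for any representation V the endomorphism

\[
\varphi_{\alpha,V} = \sum_{g \in G} \alpha(g)\, g : V \to V.
\]

We want to apply Schur's lemma. For this we first have to show that φ_{α,V} is a G-homomorphism (intertwining operator):

\begin{align*}
\varphi_{\alpha,V}(hv) &= \sum_g \alpha(g)\, g(hv) \\
&= \sum_g \alpha(hgh^{-1})\, hgh^{-1}(hv) \\
&= h\Big(\sum_g \alpha(hgh^{-1})\, g(v)\Big) \\
&= h\Big(\sum_g \alpha(g)\, g(v)\Big) \\
&= h(\varphi_{\alpha,V}(v)).
\end{align*}

Here the second step reindexes the sum by g → hgh^{-1} and the fourth step uses that α is a class function. If V is irreducible, it follows from Schur's lemma (2.1) that φ_{α,V} = λ·Id, where

\begin{align*}
\lambda &= \frac{1}{\dim V}\, \mathrm{Trace}(\varphi_{\alpha,V}) \\
&= \frac{1}{\dim V} \sum_g \alpha(g)\, \chi_V(g) \\
&= \frac{|G|}{\dim V}\, \overline{(\alpha, \chi_{V^*})} \\
&= 0.
\end{align*}

Therefore φ_{α,V} = 0 for every irreducible V, and hence, by complete reducibility, Σ_{g∈G} α(g) g = 0 on any representation of G. This holds in particular for the regular representation R, in which the elements of G act as linearly independent operators, implying α(g) = 0 for all g ∈ G, as was to be shown.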

2.3 Modules and the group algebra

There is one particular choice for the vector space V that turns out to be very convenient, namely the group algebra C[G]. Before getting into the necessary details about the group algebra, I will first review the concepts of algebras and modules.

Definition 2.13. An associative algebra over C is a vector space A over C together with a bilinear map A × A → A, (a, b) ↦ ab, such that (ab)c = a(bc) for all a, b, c ∈ A.

Definition 2.14. Let A be an associative algebra with unit 1_A. A left A-module is a finite-dimensional vector space V over C together with a bilinear map A × V → V, (a, v) ↦ av, which satisfies a(bv) = (ab)v and 1_A v = v for all a, b ∈ A, v ∈ V.

Just as we can define a representation of a group G, we can define a representation of an algebra.

Definition 2.15. A representation of an algebra A (or equivalently a left A-module) is a vector space V together with an algebra homomorphism

\[
\varphi : A \to \mathrm{End}(V),
\]

i.e. a linear map such that φ(ab) = φ(a)φ(b) and φ(1_A) = 1.

Definition 2.16. The regular representation of A (also called the left regular A-module) is the vector space A itself, made into an A-module through the map A × A → A given by (a, b) ↦ ab for all a, b ∈ A.

As said, we now consider the particular case where the vectorspace A is taken to be the group algebra

C[G]. This is the vectorspace with eg|g ∈ G as set of basis vectors and multiplication defined by

eg · eh = egh. It consists of all element of the form

C[G] =

∑g∈G

egg|eg ∈ C

.

In this case it holds that C[G] modules correspond directly to representations of G over C since any

representation ρ : G→ GL(V ) can be linearly extended to a map ρ : C[G]→ End(V ) via the map:

ρ : C[G]→ End(V ), ρ

∑g∈G

egg

=∑g∈G

egρ(g) ∈ End(V ). (2.10)

12

Therefore the correspondence $\rho \mapsto \tilde\rho$, where $\tilde\rho$ denotes the linear extension of ρ to C[G], gives us an equivalence between the representations of G and C[G]-modules. Further, sub-representations correspond to submodules, irreducible representations to simple modules, etc. All statements about representations of G have an equivalent statement in terms of its group algebra.
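To make the multiplication rule $e_g \cdot e_h = e_{gh}$ concrete, here is a minimal Python sketch that is not part of the original text. It stores an element of C[S_3] as a dictionary from permutations to coefficients. The helper names `compose` and `multiply`, and the encoding of permutations as 0-indexed tuples, are assumptions made purely for illustration.

```python
def compose(p, q):
    """Return p∘q on {0, ..., n-1}: first apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def multiply(x, y):
    """Multiply two group-algebra elements given as {permutation: coefficient}."""
    result = {}
    for g, a in x.items():
        for h, b in y.items():
            gh = compose(g, h)
            result[gh] = result.get(gh, 0) + a * b
    # Drop zero coefficients so that equal elements compare equal as dicts.
    return {g: coeff for g, coeff in result.items() if coeff != 0}

e = (0, 1, 2)      # identity
s12 = (1, 0, 2)    # transposition (12), written on the points 0, 1, 2
s13 = (2, 1, 0)    # transposition (13)

x = {e: 1, s12: 1}     # e_(1) + e_(12)
y = {e: 1, s13: -1}    # e_(1) - e_(13)
product = multiply(x, y)
```

The product contains the four terms $e_{(1)} + e_{(12)} - e_{(13)} - e_{(132)}$, since (12)(13) = (132) with the convention that the right factor is applied first.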

Proposition 2.6. If $W_i$ are the irreducible representations of G, then we have an isomorphism of algebras:

$$\mathbb{C}[G] \cong \bigoplus_i \mathrm{End}(W_i).$$

Proof. As mentioned above, a map G → GL(V) of groups extends linearly to a map C[G] → End(V) of algebras. By applying this to each of the $W_i$ we find the canonical map

$$\varphi : \mathbb{C}[G] \to \bigoplus_i \mathrm{End}(W_i).$$

This map is an isomorphism. It is injective since the regular representation is faithful. Surjectivity follows from the observation that both sides have dimension $\sum_i (\dim W_i)^2$.

Remark. We can alternatively formulate this in terms of matrix algebras over the division ring C, because endomorphisms of finite dimensional vector spaces can be seen as matrices. If we denote by $n_i$ the dimension of $W_i$, then

$$\mathbb{C}[G] \cong \bigoplus_i \mathrm{Mat}_{n_i}(\mathbb{C}).$$

This relation holds in fact for the more general case where A is a semisimple algebra. It is a result due to Wedderburn. A proof can be found in [4], p. 26. Since it is rather long I will not include it here. It introduces the opposite algebra to show that $A \cong \bigoplus \mathrm{End}(V_i)$ and uses the observation that if we can decompose $V_i = \bigoplus_i n_i U_i$ into irreducibles $U_i$ with certain multiplicities $n_i$, then

$$\mathrm{End}(V_i) = \mathrm{End}\Big(\bigoplus_{i=1}^{k} n_i U_i\Big) \cong \bigoplus_{1 \le i,j \le k} \mathrm{Hom}(n_i U_i, n_j U_j) \cong \bigoplus_{i=1}^{k} \mathrm{End}(n_i U_i) \cong \bigoplus_{i=1}^{k} \mathrm{Mat}_{n_i}(\mathbb{C}),$$

where the third relation follows from Schur's lemma (2.1), since intertwiners between non-isomorphic irreducibles are zero. Any semisimple algebra A can in this way be written as a direct sum of matrix algebras over C.

Primitive idempotents in the center of the group algebra

I will now introduce an important type of element in an algebra A, called an idempotent element. These we will need for the construction of the irreducible representations of the symmetric group in the next section. In the following A will always be a unital, finite dimensional, associative, commutative algebra over C.

Definition 2.17. An idempotent element p ∈ A is an element that satisfies $p^2 = p$. Two idempotents $p_1, p_2$ are called mutually orthogonal if $p_1 p_2 = 0 = p_2 p_1$. An idempotent p is called primitive or minimal if $p = p_1 + p_2$, with $p_1$ and $p_2$ mutually orthogonal idempotents, implies $p_1 = 0$ or $p_2 = 0$. A set of mutually orthogonal idempotents $p_1, \ldots, p_n$ is complete if $p_1 + \ldots + p_n = 1$.

These idempotent elements generate left ideals in a commutative algebra A and these left ideals are

precisely its submodules. The irreducible submodules of A correspond to the minimal left ideals

generated by primitive idempotents. In fact we have the following lemma:


Lemma 2.2. Let A be an algebra. If V = U ⊕ W is a decomposition of V as a direct sum of A-submodules, then the projections of V onto U and W, denoted $p_u$ and $p_w$ respectively, satisfy

1. $p_u$ and $p_w$ are mutually orthogonal idempotents in A.

2. $1 = p_u + p_w$.

3. If p ∈ A is an idempotent then 1 − p is an idempotent as well, 1 = p + (1 − p) is a decomposition of 1 ∈ A as a sum of orthogonal idempotents, and V = pV ⊕ (1 − p)V is a decomposition of V as a direct sum of A-submodules.

Proof. Since (1) and (2) are obvious, we only prove (3). Suppose that p ∈ A is an idempotent. Then 1 − p ∈ A is also an idempotent, since $(1 - p)^2 = 1 - 2p + p^2 = 1 - p$. They are clearly orthogonal since $p(1 - p) = (1 - p)p = p - p^2 = 0$. Therefore Im(p) ∩ Im(1 − p) = pV ∩ (1 − p)V = 0 and V = Im(p) + Im(1 − p) = pV + (1 − p)V. Thus V = pV ⊕ (1 − p)V is a direct sum decomposition of V.

In the case where $\{p_i\}_{i=1,\ldots,n}$ is a complete set of orthogonal idempotents, the previous lemma gives that $V = \bigoplus_{i=1}^{n} p_i V$ is a decomposition of V as a direct sum of A-submodules. There are a few more results about these idempotents that we will need.

Lemma 2.3. Suppose $p_1, p_2 \in A$ are primitive idempotents. Then $p_1 p_2 = 0$ iff $p_1 \neq p_2$.

Proof. We will prove $p_1 p_2 \neq 0$ iff $p_1 = p_2$. For "⇒", let $p_1, p_2 \in A$ be primitive idempotents such that $p_1 p_2 \neq 0$. Then

$$p_1 = p_1 p_2 + p_1 (1 - p_2)$$

is a decomposition of $p_1$ into mutually orthogonal idempotents; here we use that A is commutative. Now, since $p_1$ is primitive and $p_1 p_2 \neq 0$, we must have $p_1 (1 - p_2) = 0$ and thus $p_1 = p_1 p_2$. Interchanging $p_1$ and $p_2$ gives $p_2 = p_2 p_1$, and thus $p_1 = p_2$. Conversely, if $p_1 = p_2$, then $p_1 p_2 = p_1^2 = p_1 \neq 0$.

Corollary 2.6. The primitive idempotents of an algebra A form a finite, linearly independent set.

Proof. Suppose $\sum_i \lambda_i p_i = 0$. Then $0 = p_j \sum_i \lambda_i p_i = \lambda_j p_j$ by Lemma 2.3. Thus $\lambda_j = 0$ for all j and the set is linearly independent. Since A is finite dimensional, the set $\{p_i\}$ is finite.

Proposition 2.7. Let $\{a_i\}$ be the set of primitive idempotents in A. Then

$$1 = \sum_i a_i.$$

Proof. We prove this by induction on the dimension of A. If dim(A) = 1 there is nothing to prove, since then 1 is the only primitive idempotent. Suppose now dim(A) > 1. If 1 ∈ A is primitive, then it is the only primitive idempotent: suppose a ∈ A were another primitive idempotent. Then $0 \neq a = a \cdot 1$ and thus a = 1 by Lemma 2.3. Therefore it remains to prove the induction step in the case that 1 ∈ A is not primitive. Then there exist nonzero, mutually orthogonal idempotents b, c ∈ A such that 1 = b + c. Now set $A(b) = Ab = \{ab : a \in A\}$ and $A(c) = Ac = \{ac : a \in A\}$. Then A(b), A(c) ⊂ A are subalgebras of A with unit elements b and c respectively. Further, we have A = A(b) + A(c), since 1 = b + c, and A(b) ∩ A(c) = 0, since bc = 0. Thus, viewed as vector spaces, we have

$$A = A(b) \oplus A(c).$$

We further conclude that A is isomorphic to the direct sum of the two subalgebras A(b) and A(c), since A(b)A(c) = 0. Now, since b ∈ A(b) and c ∈ A(c), we have $A(b) \neq 0 \neq A(c)$ and thus dim A(b), dim A(c) < dim(A). By the induction hypothesis, $b = \sum_i b_i$ and $c = \sum_j c_j$, with $\{b_i\}$ and $\{c_j\}$ the sets of primitive idempotents of A(b) and A(c) respectively. The observation that $\{b_i\} \cup \{c_j\}$ is the set of primitive idempotents of A(b) ⊕ A(c) completes the proof.

Proposition 2.8. Let A be an algebra and p ∈ A an idempotent element. Then the left ideal Ap is indecomposable if p is a primitive idempotent.

Proof. Suppose Ap were decomposable. Then $Ap = I_1 \oplus I_2$ for two nonzero A-submodules $I_1, I_2$ of Ap. Since p ∈ Ap, we can find unique $p_1 \in I_1$ and $p_2 \in I_2$ that satisfy $p = p_1 + p_2$. If $p_1 = 0$ then $p = p_2 \in I_2$, implying that $Ap = Ap_2 \subseteq I_2$ since $I_2$ is a left ideal, whereby $I_1 \oplus I_2 \subseteq I_2$, which contradicts the assumption $I_1 \neq 0$. Similar reasoning applies for $p_2 = 0$. Further, since $p_1 \in Ap$ we have $p_1 = p_1 p$, and thus

$$(1 - p_1) p_1 = p_1 - p_1^2 = p_1 p - p_1^2 = p_1 (p_1 + p_2) - p_1^2 = p_1 p_2.$$

However, $p_1 p_2 \in I_2$ and $(1 - p_1) p_1 \in I_1$, and we thus conclude that $p_1 - p_1^2 = p_1 p_2 = 0$ since $I_1 \cap I_2 = 0$. Thus $p_1^2 = p_1$ and $p_1 p_2 = 0$. Similarly, $p_2^2 = p_2$ and $p_2 p_1 = 0$ by interchanging $p_1$ and $p_2$. Thus $p_1$ and $p_2$ are orthogonal idempotents with $p_1 + p_2 = p$. Therefore p is not primitive.

Definition 2.18. A finite dimensional, associative, unital algebra A over C is semi simple if A is the

sum of its simple left ideals.

The group algebra C[G], for one, is semisimple. We can thus conclude that finding a complete set of primitive orthogonal idempotent elements in Z(C[G]), the center of the group algebra, and determining the left ideals they generate will give a decomposition in terms of its simple left ideals, i.e. simple submodules. That the idempotents lie in the center is important, since the group algebra is in general not commutative. These idempotents are defined as

$$p_\pi = \frac{\dim(V_\pi)}{|G|} \sum_{g \in G} \chi_\pi(g)\, e_g \in Z(\mathbb{C}[G]), \qquad (2.11)$$

where π is an irreducible representation of G. Then (see footnote 4):

Corollary 2.7. 1. $\{p_\pi\}$ is a linear basis of Z(C[G]).

2. $\sum_\pi p_\pi = e_e$, where $e_e \in \mathbb{C}[G]$ is the unit element.

3. $\{p_\pi\}$ is the set of primitive orthogonal idempotent elements of Z(C[G]).
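The statements of the corollary can be checked by hand for G = S_3. The following Python sketch is an illustration added here, not the proof referred to in the footnote. It builds the three elements $p_\pi$ of (2.11) from the well-known character table of S_3 (trivial, sign and 2-dimensional standard representation). The character values are looked up by the number of fixed points of a permutation, which is an encoding chosen only for this example, as are all function names.

```python
from fractions import Fraction
from itertools import permutations

def compose(p, q):
    """Return p∘q on {0, ..., n-1}: first apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def multiply(x, y):
    """Multiply two group-algebra elements given as {permutation: coefficient}."""
    result = {}
    for g, a in x.items():
        for h, b in y.items():
            gh = compose(g, h)
            result[gh] = result.get(gh, 0) + a * b
    return {g: coeff for g, coeff in result.items() if coeff != 0}

def add(x, y):
    """Add two group-algebra elements, dropping zero coefficients."""
    result = dict(x)
    for g, b in y.items():
        result[g] = result.get(g, 0) + b
    return {g: coeff for g, coeff in result.items() if coeff != 0}

def fixed_points(p):
    return sum(1 for i, image in enumerate(p) if i == image)

G = list(permutations(range(3)))
# Character values of S_3, keyed by the number of fixed points:
# 3 = identity, 1 = transposition, 0 = 3-cycle.
characters = {
    "trivial":  {3: 1, 1: 1, 0: 1},
    "sign":     {3: 1, 1: -1, 0: 1},
    "standard": {3: 2, 1: 0, 0: -1},
}

def central_idempotent(chi):
    """p_pi = dim(V_pi)/|G| * sum_g chi(g) e_g, as in (2.11)."""
    dim = chi[3]  # the character value at the identity is the dimension
    return {g: Fraction(dim * chi[fixed_points(g)], len(G))
            for g in G if chi[fixed_points(g)] != 0}

P = {name: central_idempotent(chi) for name, chi in characters.items()}

total = {}
for p in P.values():
    total = add(total, p)   # should equal the unit element e_e
```

Multiplying any two of the three elements with `multiply` gives zero for distinct representations and reproduces the element itself otherwise, and `total` reduces to the unit element, in agreement with items 2 and 3 for this small case.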

Footnote 4: I will not prove this here since its proof is rather long. It can be found in [5]. The proof relies on the observation that $f \mapsto \psi_f \equiv \sum_{g \in G} f(g) e_g$ defines an isomorphism between the set of class functions F(G) and Z(C[G]), and on the observations that the characters span the set of class functions and form a set of idempotent elements of F(G).


Now, we have also seen that the left regular C[G] module contains all simple submodules in its

decompositions. Thus, the problem of finding all irreducible representations amounts to constructing

such a complete set of idempotents of Z(C[G]). However, constructing such a set is not as easy as

it might seem. What we know is that there are as many irreducibles as conjugacy classes, but these

are not always easy to determine for a general group. In the case of the symmetric group though, we

will see in the next section that these conjugacy classes are in bijection with the partitions of n, and

these we can determine easily. First though, we will need to take a look at restricted and induced

representations.

2.4 Restricted and induced representations

Given a representation (ρ, V) of a group G and a subgroup H ⊂ G, we can consider the restricted representation, denoted $\mathrm{Res}^G_H(\rho)$. This is the representation of H defined by

$$\mathrm{Res}^G_H(\rho) : H \to GL(V), \qquad \mathrm{Res}^G_H(\rho) := \rho|_H. \qquad (2.12)$$

Just as restriction allows us to construct representations of subgroups, we can consider induced representations. This operation produces representations of G from representations of H. Here I will briefly discuss this particular construction. Let V be a representation of G and W ⊆ V an H-invariant subspace. We write G/H for the set of left cosets of H in G, i.e. the set of equivalence classes of G with respect to the equivalence relation g ∼ g′ iff $g^{-1} g' \in H$. Its elements are thus the left cosets $gH = \{gh : h \in H\}$, g ∈ G. Now, for any g ∈ G, the subspace g · W depends only on the left coset gH of g modulo H, since ghW = g(hW) = gW. For a coset σ ∈ G/H we therefore write σW for this subspace of V. Then the induced representation is defined as follows:

Definition 2.19. Let V and W be representations of G and H respectively, with H ⊆ G a subgroup and W ⊆ V. Then we say that V is induced by W if

$$V = \bigoplus_{\sigma \in G/H} \sigma W.$$

In this case we write $V = \mathrm{Ind}^G_H W$, or simply Ind W if there is no ambiguity.

From the previous section we know that representations of a group G have an exact equivalent in terms of its group algebra C[G]. In this sense the induced representation $\mathrm{Ind}^G_H W$ is defined as the left C[G]-module (see footnote 5)

$$\mathbb{C}[G] \otimes_{\mathbb{C}[H]} W \quad \text{with action} \quad \rho(a)(a' \otimes_{\mathbb{C}[H]} w) = (aa') \otimes_{\mathbb{C}[H]} w.$$

Two important induced representations that we will see later are the following.

Example 2.2. The permutation representation of G is induced from the trivial one dimensional

representation W of H.

Example 2.3. The regular representation of G is induced from the trivial representation on the trivial

subgroup.

Footnote 5: A proof can be found in [6]. Again it will not be included since it is rather long.


3 The irreducible representations of Sn

Equipped with all the results of the previous section we now turn to the symmetric group and in

particular to the construction of its irreducible representations. This construction will also allow us to

construct the irreducible representations of GL(V ) in section 4. First though, we need some results

concerning the symmetric group and in particular we discuss the Young diagrams. Results are from

[1], [4], [7] and [8].

3.1 The symmetric group

Recall that the symmetric group was defined as:

Definition 3.1. Let n ≥ 1 and write $S_n$ for the symmetric group on n letters. $S_n$ is the group that consists of all bijections of $\Omega_n = \{1, 2, \ldots, n\}$ onto itself, with composition as group operation. We have $\#S_n = n!$. The elements of $S_n$ are the permutations. If σ and π are two permutations, then π ∘ σ means that we first apply σ and then π.

We will write a permutation $\sigma \in S_n$ using cycle notation. Take for example the permutation σ = (142)(53)(6). This notation means that σ maps 1 → 4, 4 → 2 and 2 → 1, maps 5 → 3 and 3 → 5, and maps 6 to itself. The content of a cycle $(i_1, \ldots, i_r)$ we denote by I; this is the subset $I = \{i_1, \ldots, i_r\} \subseteq \Omega_n$ of cardinality r. The permutation σ consists of 3 disjoint cycles, where by disjoint we mean that their contents have empty intersection. Note that disjoint cycles commute. The length of a cycle is the number of elements it contains, and the identity element is represented by a cycle of length 1. The permutation σ thus consists of one 3-cycle, one 2-cycle and one 1-cycle.

Lemma 3.1. Any permutation σ ∈ Sn can be written as a product of disjoint cycles.

Definition 3.2. A partition λ of n is a sequence $\lambda = (\lambda_1, \lambda_2, \ldots)$ of nonnegative integers such that $\sum_i \lambda_i = n$ and $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \ldots$. We write λ ⊢ n. We refer to the length of λ by writing l(λ), where l(λ) is the largest i such that $\lambda_i \neq 0$.

Definition 3.3. Let $\sigma \in S_n$ and write σ as a product of disjoint cycles such that each i ∈ {1, 2, ..., n} is in one of the cycles. Collect the lengths of the disjoint cycles and put them in nonincreasing order. This again defines a partition c(σ) of n, which is called the cycle type of σ.

Lemma 3.2. Two permutations σ and τ are conjugate iff they have the same cycle type.

Recall that by proposition 2.5 we have for any group G that the number of irreducible representations

is equal to its number of conjugacy classes. In the case of G = Sn we further have the following result.

Proposition 3.1. The conjugacy classes of the symmetric group are in bijection with the partitions.

Proof. Write $S_n/\!\sim$ for the set of conjugacy classes and $P_n$ for the set of partitions of n. Now consider the map

$$S_n/\!\sim \;\to P_n; \qquad \mathrm{Ad}(S_n)(\tau) \mapsto c(\tau),$$

with c(τ) the cycle type of τ as in the above definition and $\mathrm{Ad}(S_n)(\tau)$ the orbit of τ under conjugation with $\sigma \in S_n$, i.e. $\mathrm{Ad}(\sigma)(\tau) = \sigma \tau \sigma^{-1}$. These orbits are the conjugacy classes of $S_n$. It remains to show that this map is well defined and bijective. Consider a cycle $(i_1, i_2, \cdots, i_r)$; then

$$\sigma (i_1, i_2, \cdots, i_r) \sigma^{-1} = (\sigma(i_1), \sigma(i_2), \cdots, \sigma(i_r))$$


for all $\sigma \in S_n$ and all subsets $I = \{i_1, \ldots, i_r\} \subseteq \Omega_n$ of cardinality r. It then follows that

$$\mathrm{Ad}(S_n)\sigma = \{\tau \in S_n : c(\tau) = c(\sigma)\},$$

and thus the map is well defined and injective. To show surjectivity, let $\lambda = (\lambda_1, \ldots, \lambda_m)$ be a partition of n, where m = l(λ). Choose further subsets

$$I_j = \{i^j_1, i^j_2, \ldots, i^j_{\lambda_j}\} \subset \Omega_n$$

of cardinality $\lambda_j$ such that $I_j \cap I_{j'} = \emptyset$ if $1 \le j \neq j' \le m$. Let $\sigma_j$ be the cycle of length $\lambda_j$ with content $I_j$, i.e.

$$\sigma_j = (i^j_1, i^j_2, \cdots, i^j_{\lambda_j}).$$

Then

$$\sigma \equiv \sigma_1 \sigma_2 \cdots \sigma_m \in S_n$$

is a product of disjoint cycles such that c(σ) = λ. Thus $\mathrm{Ad}(S_n)\sigma \mapsto c(\sigma)$ maps onto $P_n$ and surjectivity is also shown.

Corollary 3.1. The number of irreducible representations of the symmetric group is equal to the

number of partitions.
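For small n the bijection between conjugacy classes and partitions can be verified by brute force. The sketch below is an added illustration with hypothetical helper names. It computes the conjugacy classes of $S_n$ directly as orbits under conjugation, computes cycle types separately, and leaves it to the reader (or the accompanying checks) to compare the two. Permutations are encoded as 0-indexed tuples, so the example σ = (142)(53)(6) from the text becomes the tuple `(3, 0, 4, 1, 2, 5)`.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q on {0, ..., n-1}: first apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def cycle_type(p):
    """Cycle lengths of p in nonincreasing order, i.e. a partition of n."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = p[j]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

def partitions(n, largest=None):
    """Yield all partitions of n as nonincreasing tuples."""
    largest = n if largest is None else largest
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def conjugacy_classes(n):
    """Orbits of S_n under conjugation, computed without using cycle types."""
    group = list(permutations(range(n)))
    remaining, classes = set(group), []
    while remaining:
        tau = min(remaining)
        orbit = {compose(compose(s, tau), inverse(s)) for s in group}
        classes.append(orbit)
        remaining -= orbit
    return classes

sigma = (3, 0, 4, 1, 2, 5)        # (142)(53)(6) on the points 0..5
sigma_type = cycle_type(sigma)    # (3, 2, 1)
```

Each orbit returned by `conjugacy_classes` carries a single cycle type, and the cycle types that occur are exactly the partitions of n.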

This tells us that we can parametrize the irreducible representations by the partitions λ of n. In the next subsection we will discuss an alternative way to look at these partitions by associating Young diagrams with them. These are a combinatorial tool named after the British mathematician Alfred Young, who first introduced them, in particular to study the representations of the symmetric group.

3.2 Young diagrams and Young tableaux

The Young diagram is an array of boxes that is associated to a given partition λ = (λ1, λ2, .....) of n. A

Young diagram has λi boxes in the ith row which are lined up on the left. The conjugate partition λ′

of the partition λ is defined by interchanging the rows and columns of the Young diagram associated

to λ. Note that (λ′)′ = λ.

Example 3.1. The Young diagrams of the partition λ = (5, 4, 2) and that of its conjugate partition λ′ = (5, 4, 2)′ = (3, 3, 2, 2, 1) are respectively given by

□ □ □ □ □
□ □ □ □
□ □

and

□ □ □
□ □ □
□ □
□ □
□
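Interchanging rows and columns is easy to express in code. The following short Python sketch, with function names chosen here for illustration only, computes the conjugate partition by counting, for each column index, how many rows are long enough to reach that column, and renders a diagram as text.

```python
def conjugate(partition):
    """Conjugate partition: entry j counts the rows with more than j boxes."""
    if not partition:
        return ()
    return tuple(sum(1 for part in partition if part > j)
                 for j in range(partition[0]))

def young_diagram(partition):
    """Render the Young diagram as text, one row of boxes per line."""
    return "\n".join(" ".join("□" * part) for part in partition)

lam = (5, 4, 2)
lam_conjugate = conjugate(lam)   # (3, 3, 2, 2, 1)
```

Applying `conjugate` twice returns the original partition, in line with (λ′)′ = λ.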

Ordering on partitions

Given two partitions of n, λ = (λ1, λ2, ..., λl) and µ = (µ1, µ2, ..., µk) we distinguish two different

orderings. The dominance ordering and the lexicographic ordering.

Definition 3.4. We say that λ dominates µ in dominance ordering, written as λ ⊵ µ, if

$$\sum_{i=1}^{m} \lambda_i \ge \sum_{j=1}^{m} \mu_j \quad \text{for all } 1 \le m \le \max\{k, l\},$$

where we define $\lambda_i = 0$ if i > l and $\mu_j = 0$ if j > k.


In terms of Young diagrams we say that the Young diagram for λ dominates that for µ if there are at least as many boxes in the first m rows for λ as in the first m rows for µ, for all 1 ≤ m ≤ max{k, l}.

Example 3.2. The dominance ordering is a partial ordering for all n. For n ≤ 5 it is moreover a total ordering on the partitions of n. This is easily verified by drawing all the possible Young diagrams for n = 1, ..., 5. For example, if n = 2 the only possible Young diagrams, denoted here by their partitions, satisfy

(2) ⊵ (1, 1)

For n = 3 we have:

(3) ⊵ (2, 1) ⊵ (1, 1, 1)

Up to n = 5 we can arrange the Young diagrams in such a chain and compare any two of them. However, when arriving at n = 6, the dominance ordering is no longer total, which can be seen by considering the Young diagrams for λ = (2, 2, 2) and µ = (3, 1, 1, 1): neither dominates the other.
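The claim that dominance is a total order exactly up to n = 5 can be checked mechanically. The sketch below is an added illustration with hypothetical function names. It compares partial sums as in Definition 3.4 and lists the pairs of partitions of n that are incomparable.

```python
from itertools import accumulate, combinations

def partitions(n, largest=None):
    """Yield all partitions of n as nonincreasing tuples."""
    largest = n if largest is None else largest
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def dominates(lam, mu):
    """True if every partial sum of lam is >= the matching partial sum of mu."""
    length = max(len(lam), len(mu))
    lam_sums = accumulate(lam + (0,) * (length - len(lam)))
    mu_sums = accumulate(mu + (0,) * (length - len(mu)))
    return all(a >= b for a, b in zip(lam_sums, mu_sums))

def incomparable_pairs(n):
    """Pairs of partitions of n where neither dominates the other."""
    return [(lam, mu) for lam, mu in combinations(list(partitions(n)), 2)
            if not dominates(lam, mu) and not dominates(mu, lam)]

pairs_for_6 = incomparable_pairs(6)
```

For n ≤ 5 the list of incomparable pairs is empty, while for n = 6 it contains the pair (3, 1, 1, 1) and (2, 2, 2) mentioned in the example.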

Definition 3.5. We say that λ dominates µ in lexicographic ordering, written as λ > µ, if the first non-vanishing $\lambda_i - \mu_i$ is positive.

Note that these two orderings are closely related: the dominance ordering implies the lexicographic ordering, i.e. if λ ⊵ µ and λ ≠ µ then also λ > µ. The other implication does not hold.

We obtain a Young tableau from a given Young diagram by numbering the boxes with the numbers 1, ..., n, where each number is used exactly once. We refer to a Young tableau of shape λ by writing $T_\lambda$ and write $T_\lambda(i, j)$ for the number in the i-th row and j-th column ($1 \le j \le \lambda_i$).

Definition 3.6. Let λ ⊢ n and let $T_\lambda$ be a Young tableau. Then $T_\lambda$ is called

1. row standard if its filling is increasing along each row,

2. column standard if its filling is increasing along each column,

3. standard if it is row standard as well as column standard.

A Young tableau is further called semi-standard if its filling is nondecreasing along each row and strictly increasing along each column. Note that here it is allowed to place the same number in multiple boxes.

Example 3.3. The λ-tableau defined by

$$t_\lambda(i, j) = \begin{cases} \sum_{k=1}^{i-1} \lambda_k + j & \text{if } i \neq 1 \\ j & \text{if } i = 1 \end{cases}$$

for $1 \le j \le \lambda_i$ is a standard λ-tableau. For the partition λ = (4, 3, 2) it corresponds to:

1 2 3 4
5 6 7
8 9

Having defined the tableau, we write T(λ) for the set of all λ-tableaux. Given a permutation $\sigma \in S_n$ we can obtain a new tableau σT by defining this to be the tableau with the number σ(T(i, j)) in its (i, j)-th box. This defines a left action of $S_n$ on T(λ), i.e. a map $S_n \times T(\lambda) \to T(\lambda)$. In the next section we will use these Young tableaux for the construction of the irreducible representations of the symmetric group, following [1].
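The tableau $t_\lambda$ of example 3.3 and the action of a permutation on a tableau can be sketched in a few lines of Python. This is an illustration only; the function names and the representation of a permutation as a dictionary from entries to images are assumptions of this sketch.

```python
def canonical_tableau(shape):
    """Fill the rows of the diagram of `shape` with 1, 2, ..., n from left to right."""
    rows, offset = [], 0
    for length in shape:
        rows.append([offset + j for j in range(1, length + 1)])
        offset += length
    return rows

def is_standard(tableau):
    """True if entries increase along every row and down every column."""
    rows_ok = all(row[j] < row[j + 1]
                  for row in tableau for j in range(len(row) - 1))
    cols_ok = all(tableau[i][j] < tableau[i + 1][j]
                  for i in range(len(tableau) - 1)
                  for j in range(len(tableau[i + 1])))
    return rows_ok and cols_ok

def act(sigma, tableau):
    """Replace every entry T(i, j) by sigma(T(i, j))."""
    return [[sigma[entry] for entry in row] for row in tableau]

t = canonical_tableau((4, 3, 2))

# The transposition (1 5), written as a dictionary on the entries 1..9.
swap_1_5 = {i: i for i in range(1, 10)}
swap_1_5[1], swap_1_5[5] = 5, 1
moved = act(swap_1_5, t)
```

The tableau `t` is standard, while `moved` is still a λ-tableau but no longer standard, which illustrates that the action of $S_n$ does not preserve standardness.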


3.3 Constructing the irreducible representations of Sn

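The first step is to define the row and column stabilizer of the Young tableau $T_\lambda$. That is, we define the following subgroups:

$$P = P_\lambda = \{\sigma \in S_n : \sigma \text{ preserves each row of } T_\lambda\}, \qquad (3.1)$$

and

$$Q = Q_\lambda = \{\sigma \in S_n : \sigma \text{ preserves each column of } T_\lambda\}. \qquad (3.2)$$

Corresponding to these subgroups we introduce two elements in the group algebra $\mathbb{C}[S_n]$ by setting

$$a_\lambda = \sum_{\sigma \in P_\lambda} e_\sigma \quad \text{and} \quad b_\lambda = \sum_{\sigma \in Q_\lambda} \mathrm{sgn}(\sigma) \cdot e_\sigma, \qquad (3.3)$$

and we further define the Young symmetrizer $c_\lambda \in \mathbb{C}[S_n]$ to be

$$c_\lambda = a_\lambda \cdot b_\lambda = \sum_{\sigma \in P_\lambda,\, \tau \in Q_\lambda} \mathrm{sgn}(\tau) \cdot e_{\sigma\tau}. \qquad (3.4)$$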

This cλ generates a left ideal in the group algebra C[Sn]. This left ideal is a sub representation of the

regular Sn representation and we define it as follows:

Definition 3.7. We call Vλ = C[Sn]cλ the Specht module.

Example 3.4. This example demonstrates how $V_\lambda$ is computed. For λ = (n) we have $c_\lambda = a_\lambda = \sum_{\sigma \in S_n} e_\sigma$, so

$$V_{(n)} = \mathbb{C}[S_n] \sum_{\sigma \in S_n} e_\sigma = \mathbb{C} \cdot \sum_{\sigma \in S_n} e_\sigma,$$

which is the 1-dimensional trivial representation. For n ≥ 2 we have a second 1-dimensional representation, which we find by taking λ = (1, 1, ..., 1). Then $c_\lambda = b_\lambda$ and we have

$$V_{(1,1,\ldots,1)} = \mathbb{C}[S_n] \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) e_\sigma = \mathbb{C} \cdot \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) e_\sigma,$$

which is the sign representation. Taking λ = (2, 1), we find for $c_{(2,1)} \in \mathbb{C}[S_3]$

$$c_{(2,1)} = (e_{(1)} + e_{(12)}) \cdot (e_{(1)} - e_{(13)}) = e_{(1)} + e_{(12)} - e_{(13)} - e_{(132)}.$$

To find out which subspace this is, we multiply $c_{(2,1)}$ from the left by the basis elements of $\mathbb{C}[S_3]$. Then we find:

$$\begin{aligned}
e_{(1)} \, c_{(2,1)} &= e_{(1)} + e_{(12)} - e_{(13)} - e_{(132)} \\
e_{(12)} \, c_{(2,1)} &= e_{(12)} + e_{(1)} - e_{(132)} - e_{(13)} \\
e_{(13)} \, c_{(2,1)} &= e_{(13)} + e_{(123)} - e_{(1)} - e_{(23)} \\
e_{(23)} \, c_{(2,1)} &= e_{(23)} + e_{(132)} - e_{(123)} - e_{(12)} \\
e_{(123)} \, c_{(2,1)} &= e_{(123)} + e_{(13)} - e_{(23)} - e_{(1)} \\
e_{(132)} \, c_{(2,1)} &= e_{(132)} + e_{(23)} - e_{(12)} - e_{(123)}
\end{aligned}$$

Thus $\mathbb{C}[S_3] \cdot c_{(2,1)}$ is the 2-dimensional subspace spanned by the first and third vector, and we conclude that it is the standard representation we introduced in example 2.1.
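The six products in this example can be reproduced with a short computation. The Python sketch below is an added illustration; the dictionary encoding of group-algebra elements and the helper names are assumptions. Permutations act on the points 0, 1, 2, so the transposition (12) of the text is the tuple `(1, 0, 2)` and (13) is `(2, 1, 0)`.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q on {0, ..., n-1}: first apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def multiply(x, y):
    """Multiply two group-algebra elements given as {permutation: coefficient}."""
    result = {}
    for g, a in x.items():
        for h, b in y.items():
            gh = compose(g, h)
            result[gh] = result.get(gh, 0) + a * b
    return {g: coeff for g, coeff in result.items() if coeff != 0}

def add(x, y):
    result = dict(x)
    for g, b in y.items():
        result[g] = result.get(g, 0) + b
    return {g: coeff for g, coeff in result.items() if coeff != 0}

e, s12, s13 = (0, 1, 2), (1, 0, 2), (2, 1, 0)
a = {e: 1, s12: 1}       # a_(2,1) = e_(1) + e_(12)
b = {e: 1, s13: -1}      # b_(2,1) = e_(1) - e_(13)
c = multiply(a, b)       # Young symmetrizer c_(2,1)

# Left-multiply c by every basis element e_g and collect the distinct results.
G = list(permutations(range(3)))
distinct = []
for g in G:
    v = multiply({g: 1}, c)
    if v not in distinct:
        distinct.append(v)

# The distinct vectors sum to zero, so they span at most a 2-dimensional space.
relation = {}
for v in distinct:
    relation = add(relation, v)
```

Only three distinct vectors appear, they add up to zero, and any two of them have different supports and are therefore not proportional. The span is thus 2-dimensional, as stated in the example.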


We will see that, after a normalization, these cλ form the complete set of primitive orthogonal idem-

potents we set out for and that thus the Vλ are the irreducible representations.

Theorem 3.1. $c_\lambda$ is an idempotent up to scalar multiplication, i.e. $c_\lambda^2 = n_\lambda c_\lambda$, and the Specht module $V_\lambda$ is an irreducible representation of $S_n$. Every irreducible representation can be obtained in this way.

We need some more results to prove this theorem. In the following, the subscript λ is omitted when

it is clear it should be there, i.e. write a for aλ, etc. The idea behind the proof is to show that the

cλ are primitive idempotents up to a scalar multiple which we will call nλ. Further, we show that for

different partitions, λ, µ the product of the corresponding Young symmetrizers yields zero, meaning

they are mutually orthogonal. Lastly we show that the left ideals they generate are irreducible. We

start by observing that P and Q satisfy the following property which is clear from the way they are

defined.

1. For p ∈ P we have p · a = a · p = a

2. For q ∈ Q we have (sgn(q)q) · b = b · (sgn(q)q) = b

Lemma 3.3. For all p ∈ P and q ∈ Q we have p · c · (sgn(q)q) = c, and c is the only such element in $\mathbb{C}[S_n]$ up to scalar multiplication.

We first note that, since P and Q have trivial intersection, an element of $S_n$ can be written as a product p · q, p ∈ P, q ∈ Q, in at most one way. Therefore $c = \sum \pm e_g$, where the sum is over all elements $g \in S_n$ that can be written as a product p · q, with the coefficient being sgn(q). For one, the coefficient of $e_1$ in c is 1. In the proof we consider the tableau $T_\lambda = t_\lambda$ defined in example 3.3.

Proof. If $\sum n_g e_g$ satisfies this condition, then $n_{pgq} = \mathrm{sgn}(q) n_g$ for all g, p, q. In particular $n_{pq} = \mathrm{sgn}(q) n_1$. We therefore have to verify that $n_g = 0$ if g ∉ PQ. For such a g it is sufficient to find a transposition t such that p = t ∈ P and $q = g^{-1} t g \in Q$, since then g = pgq, so $n_g = -n_g$. We now define T′ = gT, i.e. the tableau obtained by replacing each entry i of T by g(i). The claim is that there are two distinct integers appearing in the same row of T and in the same column of T′, and that the element t is the transposition of these two integers. It remains to verify that if such a pair of integers did not exist, one could write g = pq for some p ∈ P, q ∈ Q. To show this, we take $p_1 \in P$ and $q'_1 \in Q' = gQg^{-1}$ such that the tableaux $p_1 T$ and $q'_1 T'$ have the same first row. This we repeat on the rest of the tableau. Then one gets p ∈ P and q′ ∈ Q′ so that pT = q′T′. Then pT = q′gT, from which it follows that p = q′g and thus g = pq, where $q = g^{-1} q'^{-1} g \in Q$.

In the following we will use the lexicographic ordering on the partitions λ and µ.

Lemma 3.4. 1. If λ > µ, then for all x ∈ C[Sn] we have that aλ ·x · bµ = 0. In particular we have

cλ · cµ = 0.

2. For all x ∈ C[Sn], cλ · x · cλ is a scalar multiple of cλ. In particular we have cλ · cλ = nλ · cλ,

for some nλ ∈ C.

Proof

1. We may take $x = g \in S_n$. Since $g b_\mu g^{-1}$ is the element constructed from gT′, with T′ the tableau used to construct $b_\mu$, it suffices to show that $a_\lambda b_\mu = 0$. The condition λ > µ implies that there are two integers in the same row of T and in the same column of T′. Let t be, as in the previous lemma, the transposition of these two integers. Then $a_\lambda = a_\lambda \cdot t$ and $t \cdot b_\mu = -b_\mu$, hence $a_\lambda \cdot b_\mu = a_\lambda \cdot t \cdot t \cdot b_\mu = -a_\lambda \cdot b_\mu$, as was required to show.

2. This follows from Lemma 3.3.

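Corollary 3.2. If λ < µ, then $c_\lambda \cdot \mathbb{C}[S_n] \cdot c_\mu = 0$; in particular $c_\lambda \cdot c_\mu = 0$.

Proof. We use the anti-involution (see footnote 6) $x \mapsto \hat{x}$ of $\mathbb{C}[S_n]$ that is induced by the map $g \mapsto g^{-1}$, $g \in S_n$. Noting that $a_\lambda, b_\lambda, a_\mu, b_\mu$ are fixed by it, so that e.g. $\widehat{c_\lambda} = \widehat{a_\lambda b_\lambda} = \hat{b}_\lambda \hat{a}_\lambda = b_\lambda a_\lambda$, we have

$$\widehat{c_\lambda x c_\mu} = \widehat{a_\lambda b_\lambda x a_\mu b_\mu} = \hat{b}_\mu \hat{a}_\mu \hat{x} \hat{b}_\lambda \hat{a}_\lambda = b_\mu a_\mu \hat{x} b_\lambda a_\lambda = 0,$$

since $a_\mu \hat{x} b_\lambda = 0$ by Lemma 3.4.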

Having shown that the $c_\lambda$ are idempotent elements (up to a scalar) that are mutually orthogonal, we now show that the $V_\lambda$ they define are irreducible. That is, we prove:

Lemma 3.5. 1. Each $V_\lambda$ is an irreducible representation of $S_n$.

2. If λ ≠ µ, then $V_\lambda$ and $V_\mu$ are not isomorphic.

Proposition 3.2. Let R be a ring and I ≠ 0 a left ideal of R. If I is a direct summand of R, then $I^2 \neq 0$.

Proof. Suppose that I is a direct summand of R. Then there exists a left ideal J such that I ⊕ J = R. In particular, we can find i ∈ I and j ∈ J such that i + j = 1. Multiplying both sides on the left with i gives $i = i^2 + ij$. Then $I^2 \neq 0$, for otherwise we would have i = ij ∈ I ∩ J = 0, so 1 = j ∈ J, and hence J = R, I = 0.

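Proof of Lemma 3.5.

1. We begin by noting that $c_\lambda V_\lambda \subset \mathbb{C} c_\lambda$ by Lemma 3.4. If $W \subset V_\lambda$ is a subrepresentation, then $c_\lambda W$ is either $\mathbb{C} c_\lambda$ or 0. If the first is true, then $c_\lambda \in c_\lambda W \subseteq W$, so $V_\lambda = \mathbb{C}[S_n] c_\lambda \subset W$. Otherwise $W \cdot W \subset \mathbb{C}[S_n] \cdot c_\lambda W = 0$, but then W = 0 by proposition 3.2. This shows in particular that $c_\lambda V_\lambda \neq 0$, i.e. that the number $n_\lambda \neq 0$.

2. We may assume λ > µ. Then $c_\lambda V_\lambda = \mathbb{C} c_\lambda \neq 0$, but $c_\lambda V_\mu = c_\lambda \mathbb{C}[S_n] c_\mu = 0$. So they cannot be isomorphic as $\mathbb{C}[S_n]$-modules.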

As a final step, we determine the factor $n_\lambda$ in $c_\lambda^2 = n_\lambda c_\lambda$.

Lemma 3.6. For any partition λ, $c_\lambda c_\lambda = n_\lambda c_\lambda$ with $n_\lambda = \frac{n!}{\dim(V_\lambda)}$.

Proof. Let F be right multiplication by $c_\lambda$ on $\mathbb{C}[S_n]$. Then, since F is multiplication by $n_\lambda$ on $V_\lambda$ and zero on Ker($c_\lambda$), the trace of F is $n_\lambda$ times the dimension of $V_\lambda$. But the coefficient of $e_g$ in $e_g c_\lambda$ is 1, so trace(F) = $|S_n|$ = n!.
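The relation $c_\lambda^2 = n_\lambda c_\lambda$ with $n_\lambda = n!/\dim(V_\lambda)$ can be checked numerically for small n. The sketch below is a brute-force illustration under stated assumptions, not the argument of the proof. It builds $c_\lambda$ from the tableau of example 3.3 (shifted to the entries 0, ..., n−1), computes $\dim V_\lambda$ as the rank of the vectors $e_g c_\lambda$ by Gaussian elimination over the rationals, and records whether $c_\lambda^2$ equals $n_\lambda c_\lambda$. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def compose(p, q):
    """Return p∘q on {0, ..., n-1}: first apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    """Sign of a permutation, from the parity of its number of inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def multiply(x, y):
    result = {}
    for g, a in x.items():
        for h, b in y.items():
            gh = compose(g, h)
            result[gh] = result.get(gh, 0) + a * b
    return {g: coeff for g, coeff in result.items() if coeff != 0}

def young_symmetrizer(shape):
    """c = a * b for the tableau filled row by row with 0, ..., n-1."""
    n = sum(shape)
    row_of, col_of, entry = {}, {}, 0
    for i, length in enumerate(shape):
        for j in range(length):
            row_of[entry], col_of[entry] = i, j
            entry += 1
    group = list(permutations(range(n)))
    a = {g: 1 for g in group
         if all(row_of[g[i]] == row_of[i] for i in range(n))}
    b = {g: sign(g) for g in group
         if all(col_of[g[i]] == col_of[i] for i in range(n))}
    return multiply(a, b)

def rank(vectors):
    """Rank of a list of vectors via forward elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    pivots = 0
    for col in range(len(rows[0])):
        pivot_row = next((r for r in range(pivots, len(rows))
                          if rows[r][col] != 0), None)
        if pivot_row is None:
            continue
        rows[pivots], rows[pivot_row] = rows[pivot_row], rows[pivots]
        for r in range(pivots + 1, len(rows)):
            if rows[r][col] != 0:
                factor = rows[r][col] / rows[pivots][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[pivots])]
        pivots += 1
    return pivots

def specht_dimension(c, n):
    """Dimension of C[S_n] c, the span of all e_g * c."""
    group = list(permutations(range(n)))
    vectors = []
    for g in group:
        gc = multiply({g: 1}, c)
        vectors.append([gc.get(h, 0) for h in group])
    return rank(vectors)

results = {}
for shape in [(3,), (2, 1), (1, 1, 1), (4,), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)]:
    n = sum(shape)
    c = young_symmetrizer(shape)
    dim = specht_dimension(c, n)
    n_lambda = Fraction(factorial(n), dim)
    squared_ok = multiply(c, c) == {g: n_lambda * coeff for g, coeff in c.items()}
    results[shape] = (dim, n_lambda, squared_ok)
```

For λ = (2, 1), for instance, the dimension is 2 and $n_\lambda = 3!/2 = 3$.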

We have thus shown that the elements $\tilde{c}_\lambda = \frac{\dim(V_\lambda)}{n!} c_\lambda$ (see footnote 7) form a mutually orthogonal set of primitive idempotents of Z(C[Sn]). This therefore proves the theorem, since they give us all the irreducible representations by letting λ vary over the partitions. In the remainder of the chapter I will discuss some more properties of the Specht modules. In particular, I will introduce the Young subgroup and discuss Young's Rule [1].

Footnote 6: An (anti-)involution is a function f that is its own inverse, i.e. f(f(x)) = x for all x in the domain of f.

Footnote 7: Compare with (2.11).


3.4 Young subgroups, induced representations and Young’s Rule

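We now introduce the Young subgroup. This is a subgroup of $S_n$ that is isomorphic to

$$S_\lambda = S_{\lambda_1} \times \ldots \times S_{\lambda_k}$$

for some partition $\lambda = (\lambda_1, \ldots, \lambda_k)$. There is one specific Young subgroup that is called the standard Young subgroup. It is defined as

$$S_\lambda = S_{\lambda_1} \times S_{\lambda_2} \times \ldots \times S_{\lambda_k},$$

where $S_{\lambda_1}$ acts on the set $\{1, 2, \ldots, \lambda_1\}$, $S_{\lambda_i}$ acts on the set

$$\Big\{ \sum_{j=1}^{i-1} \lambda_j + 1,\; \sum_{j=1}^{i-1} \lambda_j + 2,\; \ldots,\; \sum_{j=1}^{i-1} \lambda_j + \lambda_i \Big\}$$

and $S_{\lambda_k}$ acts on

$$\Big\{ \sum_{j=1}^{k-1} \lambda_j + 1,\; \sum_{j=1}^{k-1} \lambda_j + 2,\; \ldots,\; n \Big\}.$$

Since this is a subgroup, we can induce representations from it to representations of $S_n$. In particular, inducing the trivial representation of $S_\lambda$ to $S_n$ gives us the permutation representation.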

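Definition 3.8. We write $M_\lambda$ for the permutation representation obtained by inducing the trivial representation on $S_\lambda$ to $S_n$, i.e. $M_\lambda = \mathrm{Ind}^{S_n}_{S_\lambda}(1)$.

This $M_\lambda$ we can equivalently define as $M_\lambda = \mathbb{C}[S_n] a_\lambda$, with $a_\lambda$ as before. Further, since we have a surjection

$$M_\lambda = \mathbb{C}[S_n] a_\lambda \twoheadrightarrow V_\lambda = \mathbb{C}[S_n] a_\lambda b_\lambda, \qquad x \mapsto x \cdot b_\lambda$$

and an isomorphism

$$V_\lambda = \mathbb{C}[S_n] a_\lambda b_\lambda \cong \mathbb{C}[S_n] b_\lambda a_\lambda \subset \mathbb{C}[S_n] a_\lambda = M_\lambda,$$

we note that $V_\lambda$ appears in the decomposition of $M_\lambda$ for every partition λ. To see the isomorphism, note that right multiplication by $a_\lambda$ gives a map $\mathbb{C}[S_n] a_\lambda b_\lambda \to \mathbb{C}[S_n] b_\lambda a_\lambda$ and right multiplication by $b_\lambda$ gives a map back. The compositions of these maps are multiplications by non-zero scalars.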

There is in fact an explicit formula, known as Young’s Rule, that tells us how the permutation module

decomposes in terms of the irreducible Specht modules.

Theorem 3.2. (Young's Rule) The permutation module $M_\lambda$ decomposes as

$$M_\lambda = \bigoplus_{\mu \trianglerighteq \lambda} K_{\mu\lambda} V_\mu.$$

Definition 3.9. The numbers $K_{\mu\lambda}$ are called the Kostka numbers.

These Kostka numbers are defined combinatorially as the number of semi-standard tableaux of shape µ and content λ. That is, $K_{\mu\lambda}$ is the number of ways we can fill the boxes of the Young diagram for µ with $\lambda_1$ 1's, $\lambda_2$ 2's, up to $\lambda_k$ k's, in such a way that the entries in each row are nondecreasing and the entries in each column are strictly increasing.
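The combinatorial definition translates into a short recursive count. The sketch below is an added illustration, not taken from the thesis or its references. It uses the standard observation that in a semi-standard tableau the boxes containing the largest entry k form a horizontal strip (at most one box per column) at the outer edge of the shape; removing that strip leaves a semi-standard tableau with entries up to k − 1. The function names are assumptions of this sketch.

```python
from functools import lru_cache

def horizontal_strips(shape, size):
    """Yield all shapes nu inside `shape` such that shape/nu is a horizontal
    strip with `size` boxes, i.e. shape[i+1] <= nu[i] <= shape[i] for every row."""
    def rec(row, remaining, current):
        if row == len(shape):
            if remaining == 0:
                yield tuple(part for part in current if part > 0)
            return
        lower = shape[row + 1] if row + 1 < len(shape) else 0
        for nu_row in range(max(lower, shape[row] - remaining), shape[row] + 1):
            yield from rec(row + 1, remaining - (shape[row] - nu_row),
                           current + (nu_row,))
    yield from rec(0, size, ())

@lru_cache(maxsize=None)
def kostka(shape, content):
    """Number of semi-standard tableaux of the given shape and content."""
    if sum(shape) != sum(content):
        return 0
    if not content:
        return 1 if not shape else 0
    # Remove the boxes holding the largest entry, then recurse on the rest.
    return sum(kostka(nu, content[:-1])
               for nu in horizontal_strips(shape, content[-1]))

example = kostka((3, 2), (2, 1, 1, 1))
```

With `shape` playing the role of µ and `content` that of λ, `kostka((3, 2), (2, 1, 1, 1))` counts the fillings of the diagram (3, 2) with two 1's and one each of 2, 3 and 4.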


Proposition 3.3. Suppose λ, µ ⊢ n. Then the Kostka number $K_{\mu\lambda}$ is non-vanishing if and only if µ ⊵ λ. Further, $K_{\lambda\lambda} = 1$.

The property of the Kostka numbers stated in proposition 3.3 is important. Suppose we order the partitions as $\lambda^{(1)}, \lambda^{(2)}, \ldots$ from (n) up to $(1^n)$ (see footnote 8). Then this also gives us an ordering on the $M_\lambda$. Young's rule now says that the first module $M_{\lambda^{(1)}}$ will be equal to one copy of $V_{\lambda^{(1)}}$. The next module, $M_{\lambda^{(2)}}$, will contain this same $V_{\lambda^{(1)}}$ with a certain multiplicity, plus one copy of a new irreducible $V_{\lambda^{(2)}}$, etc.

Example 3.5. Consider the partition λ = (1, . . . , 1) Then Mλ is easily seen to be the regular rep-

resentation since we induce the trivial representation from the trivial subgroup. It therefore follows

that in this case Kµ(1,...,1) = dim(Vµ) since for the regular representation the irreducibles occur with

multiplicity being their dimension. This thus provides a way to determine the dimension of Vλ. It is

the number of ways to fill the Young diagram of λ with the numbers 1 to n, in such a way that all rows

and columns are increasing.

Example 3.6. Observe that $K_{(n)\lambda} = 1$, since there is only one semi-standard tableau of shape (n) for any given content. Then by Young's Rule we conclude that each permutation module $M_\lambda$ contains exactly one copy of the trivial representation $V_{(n)}$. See example 3.4.

The Kostka numbers are usually notated in a table. In the next example I will apply Young’s rule to

compute this table for S5.

Example 3.7. I will begin by giving the table of $K_{\mu\lambda}$ and then work out how its entries are obtained. Rows are labelled by the shape µ, columns by the content λ.

µ \ λ                  (5)       (4,1)       (3,2)     (3,1,1)     (2,2,1)   (2,1,1,1) (1,1,1,1,1)
(5)                      1           1           1           1           1           1           1
(4,1)                    0           1           1           2           2           3           4
(3,2)                    0           0           1           1           2           3           5
(3,1,1)                  0           0           0           1           1           3           6
(2,2,1)                  0           0           0           0           1           2           5
(2,1,1,1)                0           0           0           0           0           1           4
(1,1,1,1,1)              0           0           0           0           0           0           1

The values of the Kostka numbers are determined by counting the number of semi-standard Young tableaux of shape µ and content λ. When λ = (5) there is only one tableau that is semi-standard and has the number 1 in all five of its boxes, namely:

1 1 1 1 1

For the next partition λ = (4, 1) we can draw the following semi-standard tableaux with four 1's and one 2. These are

1 1 1 1 2

and

1 1 1 1
2

Footnote 8: Note that this ordering is only a total ordering for n ≤ 5. See example 3.2.


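Thus $K_{(5)(4,1)} = 1$ and $K_{(4,1)(4,1)} = 1$, all other Kostka numbers for this content being zero, so we indeed get a new Specht module $V_{(4,1)}$. I will not write out the diagrams for all partitions, since the result of doing so can be read off from the table, but I will write out the semi-standard tableaux of every shape µ for the content λ = (2, 1, 1, 1). The rows of a tableau are separated by a slash:

µ = (5): 1 1 2 3 4
µ = (4,1): 1 1 2 3 / 4, 1 1 2 4 / 3, 1 1 3 4 / 2
µ = (3,2): 1 1 2 / 3 4, 1 1 3 / 2 4, 1 1 4 / 2 3
µ = (3,1,1): 1 1 2 / 3 / 4, 1 1 3 / 2 / 4, 1 1 4 / 2 / 3
µ = (2,2,1): 1 1 / 2 3 / 4, 1 1 / 2 4 / 3
µ = (2,1,1,1): 1 1 / 2 / 3 / 4

We thus have $M_{(2,1,1,1)} \cong V_{(5)} \oplus 3V_{(4,1)} \oplus 3V_{(3,2)} \oplus 3V_{(3,1,1)} \oplus 2V_{(2,2,1)} \oplus V_{(2,1,1,1)}$, and as decomposition for the regular representation $M_{(1^5)}$ we find

$$M_{(1^5)} \cong V_{(5)} \oplus 4V_{(4,1)} \oplus 5V_{(3,2)} \oplus 6V_{(3,1,1)} \oplus 5V_{(2,2,1)} \oplus 4V_{(2,1,1,1)} \oplus V_{(1^5)}.$$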

Recall that we defined the permutation module as $M_\lambda = \mathrm{Ind}^{S_n}_{S_\lambda}(1)$. We can also consider $M'_\lambda = \mathrm{Ind}^{S_n}_{S_{\lambda'}}(\mathrm{sgn})$, i.e. the representation induced from the sign representation on the Young subgroup $S_{\lambda'}$. This induced representation we can also realize as $M'_\lambda = \mathbb{C}[S_n] b_\lambda$, and this representation also contains $V_\lambda$. In [2] it is shown that the Specht module is the only irreducible constituent that these two induced modules have in common and that this common constituent occurs with multiplicity one (which is reflected by the diagonal Kostka number being one). That is, it is shown that:

$$\mathrm{Ind}^{S_n}_{S_\lambda}(1) \cap \mathrm{Ind}^{S_n}_{S_{\lambda'}}(\mathrm{sgn}) = V_\lambda.$$

That this multiplicity is 1 is crucial in the construction. It ensures that when we consider this intersection we get exactly one copy of $V_\lambda$. One final result about the Specht modules concerns a formula for their dimension.

Theorem 3.3. (The Hook Length formula)

\[ \dim V_\lambda = \frac{n!}{l_1! \cdots l_k!} \prod_{1 \le i < j \le k} (l_i - l_j) = \frac{n!}{\prod_{(i,j) \in \lambda} h_{i,j}} \]

where $h_{i,j}$ is the hook length of the box with label (i, j), i.e. the number of boxes directly below or directly to the right of this box, counting the box itself once, and $l_i = \lambda_i + k - i$.

The first equality is a result that follows from the Frobenius character formula. We did not discuss this here; a proof can be found in [1] or [4]. The second equality follows from the observation that

\[ \frac{l_1!}{\prod_{1 < j \le k} (l_1 - l_j)} = \prod_{\substack{1 \le m \le l_1 \\ m \ne l_1 - l_j}} m \]

and noting that the factors m in this product are precisely the hook lengths $h_{1,j}$ of the boxes in the first row. Deleting the first row of the diagram and proceeding by induction proves the statement.

Example 3.8. Consider the partition λ = (4, 3, 1). Labeling the boxes by their hook length gives

6 4 3 1
4 2 1
1

Then for the dimension of the corresponding $S_8$ representation we find:

\[ \dim V_{(4,3,1)} = \frac{8!}{6 \cdot 4 \cdot 3 \cdot 4 \cdot 2} = 70 \]
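The hook length computation is easy to automate. The sketch below is only an illustration (the helper names `hook_lengths` and `specht_dimension` are my own); it recomputes the hook lengths of (4, 3, 1) and checks that the squares of the dimensions for n = 5 add up to 5!.

```python
from math import factorial, prod


def hook_lengths(shape):
    """Hook length of every box of the Young diagram, row by row."""
    # conj[j] is the length of column j (the conjugate partition).
    conj = [sum(1 for row in shape if row > j) for j in range(shape[0])] if shape else []
    return [
        [(shape[i] - j - 1) + (conj[j] - i - 1) + 1 for j in range(shape[i])]
        for i in range(len(shape))
    ]


def specht_dimension(shape):
    """Dimension of the Specht module: n! divided by the product of all hook lengths."""
    n = sum(shape)
    return factorial(n) // prod(h for row in hook_lengths(shape) for h in row)


print(hook_lengths((4, 3, 1)))
print(specht_dimension((4, 3, 1)))

partitions_of_5 = [(5,), (4, 1), (3, 2), (3, 1, 1), (2, 2, 1), (2, 1, 1, 1), (1, 1, 1, 1, 1)]
print(sum(specht_dimension(p) ** 2 for p in partitions_of_5) == factorial(5))
```

The last line uses the standard fact that the squares of the dimensions of the irreducible representations of a finite group sum to the order of the group.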


4 The irreducible representations of GL(V )

In this section we will focus on the irreducible representations of the group GL(V) ≅ GL(n,C). It turns out that there is a connection between these and the Specht modules we considered in the previous section. In particular, we will determine the irreducible characters of these irreducible GL(V) representations. We will follow [1].

Given a group G, let V be a representation of G, where we denote the action by $v \mapsto gv$. We can now consider the nth tensor power $V^{\otimes n}$, which is also a representation, and both G and $S_n$ act on this space. We have a left action of G given by $g(v_1 \otimes \dots \otimes v_n) = (gv_1) \otimes \dots \otimes (gv_n)$. We also have a right action of $S_n$ on $V^{\otimes n}$ given by $(v_1 \otimes \dots \otimes v_n)\sigma = v_{\sigma(1)} \otimes \dots \otimes v_{\sigma(n)}$, and it is easily seen that these actions commute. This $V^{\otimes n}$ is not irreducible though, and we would therefore like to break it up into irreducible representations of G. We will see that in the case where G = GL(V) this can actually be accomplished (see footnote 9). Due to the commutativity of the actions of $S_n$ and GL(V) on $V^{\otimes n}$ we expect there to be some kind of relation between the decomposition into irreducibles of $V^{\otimes n}$ when viewed as an $S_n$ representation and its decomposition when viewed as a GL(V) representation. This, we will see, is indeed the case, and to construct these irreducible GL(V) representations we will use the Young symmetrizer $c_\lambda$. Recall that it was defined as $c_\lambda = a_\lambda \cdot b_\lambda = \sum_{\sigma \in P_\lambda, \tau \in Q_\lambda} \mathrm{sgn}(\tau)\, e_{\sigma\tau}$. We can now define a new representation of GL(V), which we will denote by $S_\lambda V$, by computing the image of $c_\lambda$ on $V^{\otimes n}$. Thus:

\[ S_\lambda V = \mathrm{Im}(c_\lambda|_{V^{\otimes n}}). \]

This $S_\lambda V$ is also a subrepresentation of $V^{\otimes n}$.

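Definition 4.1. We call the functor $V \rightsquigarrow S_\lambda V$, which sends a representation V to $S_\lambda V$, the Schur functor. The representation $S_\lambda V$ is called the Weyl module.

By a functor we mean that a linear map φ : V → W between two vector spaces determines a linear map $S_\lambda(\varphi) : S_\lambda V \to S_\lambda W$ with $S_\lambda(\varphi \circ \psi) = S_\lambda(\varphi) \circ S_\lambda(\psi)$ and $S_\lambda(\mathrm{Id}_V) = \mathrm{Id}_{S_\lambda V}$.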

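Example 4.1. In this example I will demonstrate how $c_\lambda$ acts on $V^{\otimes n}$, by decomposing $V^{\otimes 2}$. We start by recalling that the nth symmetric power and the nth exterior power, denoted $\mathrm{Sym}^n V$ and $\wedge^n V$, are subrepresentations of $V^{\otimes n}$. We can realize them as GL(V) representations by considering the partitions (n) and $(1^n)$. In those cases we have for the Young symmetrizer $c_\lambda$ that $c_{(n)} = a_{(n)}$ and $c_{(1^n)} = b_{(1^n)}$. Then, if $v_1 \otimes \cdots \otimes v_n \in V^{\otimes n}$, $c_\lambda$ acts on this tensor by permuting the indices and we have for any n:

\[ a_{(n)}(v_1 \otimes \cdots \otimes v_n) = \sum_{\sigma} e_\sigma (v_1 \otimes \cdots \otimes v_n) = \sum_{\sigma} v_{\sigma(1)} \otimes \cdots \otimes v_{\sigma(n)}, \qquad\text{so that}\qquad c_{(n)} V^{\otimes n} = \mathrm{Sym}^n V \]

and

\[ b_{(1^n)}(v_1 \otimes \cdots \otimes v_n) = \sum_{\sigma} \mathrm{sign}(\sigma)\, e_\sigma (v_1 \otimes \cdots \otimes v_n) = \sum_{\sigma} \mathrm{sign}(\sigma)\, v_{\sigma(1)} \otimes \cdots \otimes v_{\sigma(n)}, \qquad\text{so that}\qquad c_{(1^n)} V^{\otimes n} = \wedge^n V \]

by definitions 2.8 and 2.9. Therefore these two partitions (n) and $(1^n)$ correspond, for any n, to the functors

\[ V \rightsquigarrow \mathrm{Sym}^n V \quad\text{and}\quad V \rightsquigarrow \wedge^n V \]

respectively.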

9 In the case of a general group G, though, the best we can hope for is to break it up into some subrepresentations.

This also immediately gives us the decomposition for n = 2: $V \otimes V = \mathrm{Sym}^2 V \oplus \wedge^2 V$. For n > 2 there will be additional spaces in the decomposition, which appear with a certain multiplicity m. For example, when n = 3, we have the additional symmetrizer

\[ c_{(2,1)} = 1 + e_{(12)} - e_{(13)} - e_{(132)}. \]

Its image on $V^{\otimes 3}$ is the vector space spanned by the vectors

\[ v_1 \otimes v_2 \otimes v_3 + v_2 \otimes v_1 \otimes v_3 - v_3 \otimes v_2 \otimes v_1 - v_3 \otimes v_1 \otimes v_2. \]

In the next subsection we will formulate a theorem that will enable us to determine the multiplicity m, and we will see that this multiplicity is related to the Specht module corresponding to the same partition.
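To make the action of $c_{(2,1)}$ concrete, the following sketch applies it to every basis tensor of $V^{\otimes 3}$ for a small V and computes the dimension of the image by exact Gaussian elimination. It is an illustration under stated assumptions: basis vectors are labelled 0, ..., k − 1, the right action is the one defined above, and the names `C_21`, `rank` and `image_rank` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Permutations of the three tensor slots as tuples p with p[i] = sigma(i) (0-based),
# with the coefficients of c_(2,1) = 1 + e_(12) - e_(13) - e_(132).
C_21 = {
    (0, 1, 2): 1,    # identity
    (1, 0, 2): 1,    # (12)
    (2, 1, 0): -1,   # (13)
    (2, 0, 1): -1,   # (132): 1 -> 3 -> 2 -> 1
}


def rank(rows):
    """Rank of a matrix with Fraction entries, by Gaussian elimination."""
    rows = [r[:] for r in rows]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def image_rank(k):
    """Dimension of the image of c_(2,1) on the third tensor power of a k-dimensional V."""
    basis = list(product(range(k), repeat=3))
    index = {t: i for i, t in enumerate(basis)}
    rows = []
    for t in basis:
        row = [Fraction(0)] * len(basis)
        for p, coeff in C_21.items():
            # (v_1 x v_2 x v_3) sigma = v_sigma(1) x v_sigma(2) x v_sigma(3)
            moved = tuple(t[p[i]] for i in range(3))
            row[index[moved]] += coeff
        rows.append(row)
    return rank(rows)


print(image_rank(2), image_rank(3))
```

For dim V = 2 the image is 2-dimensional and for dim V = 3 it is 8-dimensional, so that for instance 27 = 10 + 2 · 8 + 1 when dim V = 3, with the image of $c_{(2,1)}$ occurring twice.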

4.1 The irreducible characters of SλV .

We will now take a closer look at the representations SλV of GL(V ) we have constructed. As we

will see, they are indeed irreducible representations and their characters will be identified with certain

symmetric polynomials, called Schur polynomials. The needed results about these Schur polynomials,

some other symmetric polynomials and needed relations between them can be found in appendix A.

Theorem 4.1. (See footnote 10.)

1. Let $m_\lambda$ be the dimension of the irreducible representation $V_\lambda$ of $S_n$ corresponding to λ. Then

\[ V^{\otimes n} \cong \bigoplus_{\lambda} (S_\lambda V)^{\oplus m_\lambda} \]

2. Let k = dim V. For any semisimple g ∈ GL(V), the trace of g on $S_\lambda V$ is the value of the Schur polynomial on the eigenvalues $x_1, \dots, x_k$ of g on V, i.e.

\[ \chi_{S_\lambda V}(g) = s_\lambda(x_1, \dots, x_k) \]

3. Each $S_\lambda V$ is an irreducible representation of GL(V).

4. Let k = dim V. Then $S_\lambda V$ is zero if $\lambda_{k+1} \neq 0$. If $\lambda = (\lambda_1 \geq \dots \geq \lambda_k)$, then

\[ \dim S_\lambda V = s_\lambda(1, \dots, 1) = \prod_{1 \le i < j \le k} \frac{\lambda_i - \lambda_j + j - i}{j - i} \]

Before turning to the proof of this theorem, we have a closer look at what (1) and (2) say. We already stated that we expected that the representations of GL(V) and $S_n$ will be connected in some way due to their commuting actions on $V^{\otimes n}$. This is indeed reflected by (1). It says that as a GL(V) module $V^{\otimes n}$ decomposes into irreducible GL(V) submodules $S_\lambda V$, each occurring with multiplicity equal to the dimension of the corresponding Specht module $V_\lambda$, the irreducible representation of $S_n$ (see footnote 11).

10 The theorem and its proof are from [1].

11 In fact, this duality also holds the other way around (a fact that will not be proven here). As $S_n$ representation $V^{\otimes n} \cong \bigoplus_\lambda (V_\lambda)^{\oplus n_\lambda}$, where $n_\lambda$ is the dimension of $S_\lambda V$.

Regarding (2) we can already say something about a special case that is easy to see, namely the case λ = (n). Let ρ(g) be a semisimple endomorphism of V; this induces an endomorphism of $S_\lambda V$ and we want to compute its trace. For this we let $x_1, x_2, \dots, x_k$ be the eigenvalues of ρ(g) on V, where k = dim(V). Since ρ(g) is semisimple we may take it to be a diagonal matrix, so that $\chi_V(g) = x_1 + x_2 + \dots + x_k$. In the case λ = (n) we have $S_\lambda V = \mathrm{Sym}^n V$, on which ρ(g) acts diagonally on the monomials $e_{i_1} \cdots e_{i_n}$, $i_1 \le \dots \le i_n$, in a basis of eigenvectors, with eigenvalues $x_{i_1} \cdots x_{i_n}$. Hence $\chi_{\mathrm{Sym}^n V}$ is the complete symmetric polynomial of degree n, that is, the sum of all monomials of degree n in $x_1, \dots, x_k$. Thus we have the special case:

\[ \chi_{\mathrm{Sym}^n V} = h_{(n)}(x_1, \dots, x_k) \tag{4.1} \]
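As a small worked instance of (4.1), take k = 2 and n = 2. Assuming g is diagonal with eigenvectors $e_1, e_2$ (notation introduced only for this illustration), the three basis monomials of $\mathrm{Sym}^2 V$ are eigenvectors and the trace can be read off directly:

```latex
\begin{align*}
g\,(e_1 e_1) &= x_1^2\,(e_1 e_1), &
g\,(e_1 e_2) &= x_1 x_2\,(e_1 e_2), &
g\,(e_2 e_2) &= x_2^2\,(e_2 e_2), \\
\chi_{\mathrm{Sym}^2 V}(g) &= x_1^2 + x_1 x_2 + x_2^2 = h_{(2)}(x_1, x_2).
\end{align*}
```

Setting $x_1 = x_2 = 1$ recovers $\dim \mathrm{Sym}^2 V = 3$ for a two-dimensional V.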

We now turn to the proof. For this we translate the fact that the actions of GL(V) and $S_n$ on $V^{\otimes n}$ commute to the language of algebras. This we will do by introducing the commutator algebra. We will formulate the results for the general case and then apply them to the situation we have in the theorem. We consider a finite group G, later to be taken $S_n$, and let U be a right module over the algebra $A = \mathbb{C}[G]$. We define the commutator algebra B as:

\[ B = \mathrm{Hom}_G(U, U) = \{ \varphi : U \to U \;:\; \varphi(v \cdot g) = \varphi(v) \cdot g \text{ for all } v \in U,\ g \in G \} \tag{4.2} \]

It is the algebra of all endomorphisms φ of U that commute with the action of G. B acts on U from the left, and this action commutes with the right action of A on U. If now $U = \bigoplus_i U_i^{\oplus n_i}$ is the decomposition into non-isomorphic irreducible right A-modules $U_i$, then we find:

\[ B = \bigoplus_i \mathrm{Hom}_G(U_i^{\oplus n_i}, U_i^{\oplus n_i}) \cong \bigoplus_i \mathrm{Mat}_{n_i}(\mathbb{C}) \]

which follows from Schur's lemma (2.1) in the same way as before. If we now consider an additional left A-module W, we can construct a left B-module through the tensor product:

\[ U \otimes_A W = (U \otimes_{\mathbb{C}} W) \,/\, \text{subspace generated by } va \otimes w - v \otimes aw \]

This defines a left B-module by acting on the first factor: $b(v \otimes w) = (bv) \otimes w$. Having defined these we can now formulate the first lemma:

Lemma 4.1. Let U be a finite dimensional right A-module.

1. For any c ∈ A, the canonical map $U \otimes_A Ac \to Uc$ is an isomorphism of left B-modules.

2. If W = Ac is an irreducible left A-module, then $U \otimes_A W = Uc$ is an irreducible left B-module.

3. If $W_i = Ac_i$ are all distinct irreducible left A-modules, with $m_i$ being the dimension of $W_i$, then

\[ U \cong \bigoplus_i (U \otimes_A W_i)^{\oplus m_i} \cong \bigoplus_i (Uc_i)^{\oplus m_i} \]

is the decomposition of U into irreducible left B-modules.

A first observation to make is that Ac is a direct summand of A due to semi-simplicity.

proof

1. Consider the following commuting diagram:

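\[
\begin{array}{ccccc}
U \otimes_A A & \xrightarrow{\;\cdot c\;} & U \otimes_A Ac & \hookrightarrow & U \otimes_A A \\
\downarrow & & \downarrow & & \downarrow \\
U & \xrightarrow{\;\cdot c\;} & U \cdot c & \hookrightarrow & U
\end{array}
\tag{4.3}
\]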

The vertical mappings are (v ⊗ a) 7→ va. Since the left horizontal maps are right-multiplication

with c, these maps are surjective. The right horizontal maps are embeddings and thus injective.

The outer vertical maps are isomorphisms and thus it follows that the middle vertical map is

also an isomorphism.

2. We first prove the claim for the case where U is an irreducible A-module. Then we have

\[ B = \mathrm{Hom}_G(U, U) = \mathbb{C}. \]

Since B is one-dimensional its only submodules are 0 or B itself. Therefore it will be sufficient to show $\dim(U \otimes_A W) = 1$. By Wedderburn, we can identify

\[ A = \bigoplus_{i=1}^{r} \mathrm{Mat}_{n_i}(\mathbb{C}). \]

Now, by assumption W = Ac is an irreducible left A-module, and thus also a minimal left ideal of A. In a matrix algebra $\mathrm{Mat}_{n_i}(\mathbb{C})$ a primitive idempotent is an $n_i \times n_i$ matrix $E_{kk}$, $1 \le k \le n_i$, with all of its entries zero except for entry (k, k). In the direct sum of matrix algebras the primitive idempotents are then r-tuples (0, ..., 0, e, 0, ..., 0) with e a primitive idempotent of $\mathrm{Mat}_{n_i}(\mathbb{C})$ for some i as above. Now a minimal left ideal in A is of the form $\mathrm{Mat}_{n_i}(\mathbb{C}) E_{kk}$ and is isomorphic to one that consists of r-tuples of matrices with all entries zero except for entry i, i.e. $\bigoplus_{i=1}^{r} \mathrm{Mat}_{n_i}(\mathbb{C})\,(0, \dots, 0, e, 0, \dots, 0)$. In this entry i, all of the matrices have but one non-zero column, say column k. In the same way we can identify U with the minimal right ideal of r-tuples of matrices with all entries zero except for entry j, where in this entry the matrices are zero except for row l. Then $U \otimes_A W$ will be zero unless i = j, in which case it is isomorphic to the set of matrices that are zero except in entry (l, k). Then $\dim(U \otimes_A W) = 1$, which completes the proof when U is irreducible. For the more general case, we decompose $U = \bigoplus_i U_i^{\oplus n_i}$ into a sum of irreducible right A-modules, whereby

\[ U \otimes_A W = \bigoplus_i (U_i \otimes_A W)^{\oplus n_i} = \mathbb{C}^{n_k}, \]

for some k. Since this is irreducible over $B = \bigoplus_j \mathrm{Mat}_{n_j}(\mathbb{C})$ the proof is complete.

3. This is easily seen if we use the isomorphism $A \cong \bigoplus_i W_i^{\oplus m_i}$. This determines an isomorphism:

\[ U \cong U \otimes_A A \cong U \otimes_A \Big( \bigoplus_i W_i^{\oplus m_i} \Big) \cong \bigoplus_i (U \otimes_A W_i)^{\oplus m_i} \]

In the proof of theorem 4.1 we will apply this lemma to $U = V^{\otimes n}$ and set $G = S_n$. Thus, the commutator algebra B now consists of all endomorphisms of $V^{\otimes n}$ that commute with the action of $S_n$. But recall that we had commuting actions of GL(V) and $S_n$ on $V^{\otimes n}$. Therefore, in this particular context, the commutator algebra B is in effect determined by GL(V), and irreducible GL(V) subrepresentations will be irreducible B-submodules. The lemma now tells us how $V^{\otimes n}$ decomposes as a B-module, i.e. how it decomposes as a GL(V) representation.

To prove (3) though, we need one more lemma that relates the commutator algebra to GL(V), making the above statement precise. Before formulating the lemma we first make the observation that $B = \mathrm{End}_{S_n}(V^{\otimes n}) \subset \mathrm{End}(V^{\otimes n})$. Further, if φ ∈ End(V), then the operator

\[ \varphi^{\otimes n} : V^{\otimes n} \to V^{\otimes n}, \qquad \varphi^{\otimes n}(v_1 \otimes \dots \otimes v_n) = \varphi(v_1) \otimes \dots \otimes \varphi(v_n) \]

induced by φ : V → V is an intertwining operator, since it clearly commutes with the action of $S_n$. Thus $\{ \varphi^{\otimes n} : \varphi \in \mathrm{End}(V) \} \subset B$. The lemma now claims that $\mathrm{span}\{ \varphi^{\otimes n} : \varphi \in \mathrm{End}(V) \} = B$.

Lemma 4.2. 1. The commutator algebra B as linear subspace of End(V ⊗n) is spanned by End(V ).

2. A subspace of V ⊗n is a sub-B-module iff it is invariant under GL(V ).

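proof To prove (1) we note that if W is any finite dimensional vector space, then the subspace $\mathrm{Sym}^n(W) = \mathrm{span}\{ w \otimes \dots \otimes w : w \in W \} \subset W^{\otimes n}$ is the subspace of tensors that are invariant under $S_n$. In view of the discussion before the proof we apply this to W = End(V) and use that $\mathrm{End}(V) = \mathrm{Hom}(V, V) = V^* \otimes V$. Then:

\[ W^{\otimes n} = (V^* \otimes V)^{\otimes n} = (V^*)^{\otimes n} \otimes V^{\otimes n} = \mathrm{End}(V^{\otimes n}) \supseteq B = \mathrm{End}_{S_n}(V^{\otimes n}) = \mathrm{End}(V^{\otimes n})^{S_n} \]

To justify the last equality: $S_n$ acts on a map $\varphi : V^{\otimes n} \to V^{\otimes n}$ by

\[ (\varphi\sigma)(w) = \varphi(w\sigma)\sigma^{-1}, \]

and φ is $S_n$ invariant if

\[ (\varphi\sigma)(w) = \varphi(w), \quad\text{that is, iff}\quad \varphi(w\sigma) = \varphi(w)\sigma, \]

meaning that φ intertwines with the action of $S_n$. Thus, we also have an action of $S_n$ on $W^{\otimes n}$ and we deduce

\[ B = (W^{\otimes n})^{S_n} = \mathrm{Sym}^n(W) = \mathrm{span}\{ \varphi^{\otimes n} : \varphi \in \mathrm{End}(V) \}. \]

For (2) we let $P \subseteq V^{\otimes n}$ be a subspace and assume it is invariant under $\psi^{\otimes n}$ for all ψ ∈ GL(V), i.e. for all invertible ψ. Then P is a sub-B-module if it is invariant under the action of every b ∈ B, that is if bP ⊆ P for all b ∈ B. Now, from (1) we had $B = \mathrm{span}\{ \varphi^{\otimes n} : \varphi \in \mathrm{End}(V) \}$, so it suffices to show that $\varphi^{\otimes n} P \subseteq P$ for all φ ∈ End(V). If φ is invertible, this is certainly true since we assumed P to be GL(V) invariant. If φ is not invertible, we can approximate φ by mappings $\psi_i \in GL(V)$:

\[ \psi_i \to \varphi, \qquad \psi_i^{\otimes n} \to \varphi^{\otimes n}, \qquad i \to \infty. \]

Then

\[ \varphi^{\otimes n}(p) = \lim_{i \to \infty} \psi_i^{\otimes n}(p) \in P \quad\text{for all } p \in P, \]

since GL(V) is dense in End(V) and every linear subspace of a finite dimensional vector space is closed.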

Now we have all the machinery necessary to prove theorem 4.1. As said earlier, we set A = C[Sn] and

U = V ⊗n.

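Proof of theorem 4.1

1. This follows immediately from lemmas 4.1 (3) and 4.2 with the identifications $Uc_i = S_\lambda V$ and $m_i = \dim(Ac_i) = \dim(V_\lambda)$.

2. With lemma 4.1 (2) we have an isomorphism of GL(V) modules

\[ V^{\otimes n} \otimes_A V_\lambda \cong S_\lambda V \]

with $V_\lambda = Ac_\lambda$. Similarly, we can do this for $M_\lambda = Aa_\lambda$:

\[ V^{\otimes n} \otimes_A M_\lambda \cong V^{\otimes n} a_\lambda = \mathrm{Sym}^{\lambda_1} V \otimes \mathrm{Sym}^{\lambda_2} V \otimes \cdots \otimes \mathrm{Sym}^{\lambda_k} V \]

But recall Young's rule 3.2:

\[ M_\lambda = \bigoplus_{\mu \trianglerighteq \lambda} K_{\mu\lambda} V_\mu \]

Therefore we deduce:

\begin{align}
\mathrm{Sym}^{\lambda_1} V \otimes \mathrm{Sym}^{\lambda_2} V \otimes \cdots \otimes \mathrm{Sym}^{\lambda_k} V
&= V^{\otimes n} \otimes_A \bigoplus_{\mu \trianglerighteq \lambda} K_{\mu\lambda} V_\mu \tag{4.4} \\
&= \bigoplus_{\mu \trianglerighteq \lambda} \big( V^{\otimes n} \otimes_A K_{\mu\lambda} V_\mu \big) \notag \\
&= \bigoplus_{\mu \trianglerighteq \lambda} K_{\mu\lambda}\, S_\mu V \notag
\end{align}

But by (4.1) we know the trace on the left hand side of (4.4). It is the product $h_{(\lambda)} = h_{(\lambda_1)} \cdots h_{(\lambda_k)}$ of complete symmetric polynomials. Therefore

\[ h_{(\lambda)} = \sum_{\mu} K_{\mu\lambda}\, \mathrm{Trace}(S_\mu(g)) \]

where $\mathrm{Trace}(S_\mu(g)) = \chi_{S_\mu V}(g)$. But we also have the relation A.A.7

\[ h_\lambda = s_{(\lambda_1)} \cdots s_{(\lambda_k)} = \sum_{\mu} K_{\mu\lambda}\, s_\mu, \]

and since the matrix of Kostka numbers $(K_{\mu\lambda})$ is triangular with ones on the diagonal, and hence invertible, we can deduce that

\[ \chi_{S_\mu V}(g) = s_\mu(x_1, \dots, x_k). \]

3. We note that $S_\lambda V = V^{\otimes n} c_\lambda = U c_\lambda$. This $V^{\otimes n} c_\lambda$ is a subspace of $V^{\otimes n}$ and it is invariant under the action of GL(V). To see this, note that an element of $V^{\otimes n} c$ is of the form vc, for some $v \in V^{\otimes n}$. Then $g(vc) = g(v)c \in V^{\otimes n} c$ for g ∈ GL(V), showing that $V^{\otimes n} c$ is invariant under GL(V). Then it follows from lemmas 4.2 and 4.1 that $S_\lambda V$ is an irreducible sub-B-module, and hence an irreducible GL(V) representation.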

4. Let k = dim(V). The result from (2) gives us that if $\lambda = (\lambda_1, \dots, \lambda_n)$ with n > k and $\lambda_{k+1} \neq 0$, the trace of an endomorphism on $S_\lambda V$ is given by $s_\lambda(x_1, \dots, x_k, 0, \dots, 0)$. But this is zero. Part 2 also gives us that $\dim(S_\lambda V) = s_\lambda(1, \dots, 1)$. To evaluate this we use A.4 and the definition of the Vandermonde determinant A.4. Then we have:

\[ s_\lambda(1, x, x^2, \dots, x^{k-1}) = \prod_{1 \le i < j \le k} \frac{x^{\lambda_i + k - i} - x^{\lambda_j + k - j}}{x^{k-i} - x^{k-j}} = \prod_{1 \le i < j \le k} \frac{x^{\lambda_i - i} - x^{\lambda_j - j}}{x^{-i} - x^{-j}} \]

Taking now the limit x → 1 we have

\[ s_\lambda(1, \dots, 1) = \prod_{1 \le i < j \le k} \frac{\lambda_i - \lambda_j + j - i}{j - i} \]

This theorem is an important result. Having proved (2), we can determine the branching rules we

set out for. Branching rules describe how irreducible representations of a group G decompose into

irreducibles of a subgroup H when these are restricted to H. We will see that for GL(n,C) the

branching rules can be derived by making use of certain identities between the Schur polynomials. We

will do this in chapter 5. The result of (4) is also important. It provides a formula to compute the

dimension of the representation. The formula can be simplified however by making use of the hook

length hij . Then the above formula reduces to:

\[ \dim S_\lambda V = \prod_{(i,j) \in \lambda} \frac{k - i + j}{h_{ij}} \tag{4.5} \]

It follows from the observation that the dimension of the Specht module $V_\lambda$ is given by (3.3) and comparing this to the dimension formula above.

Example 4.2. Computing the dimension.

Consider the irreducible GL(5,C) representation with Young diagram λ = (4, 2, 2, 1). We first consider the numerator. Number the box in the upper left corner with the number 5 and fill the rest of the first column in decreasing order from top to bottom. Then fill the rows such that their numberings increase by one from left to right. The numerator becomes the product over the filling of the resulting Young tableau:

5 6 7 8
4 5
3 4
2

For the denominator we take the product over the numbering of the Young tableau that is numbered according to the hook length of each box:

7 5 2 1
4 2
3 1
1

The dimension of this representation thus becomes:

\[ \dim S_\lambda V = \frac{5 \cdot 6 \cdot 7 \cdot 8 \cdot 4 \cdot 5 \cdot 3 \cdot 4 \cdot 2}{7 \cdot 5 \cdot 2 \cdot 1 \cdot 4 \cdot 2 \cdot 3 \cdot 1 \cdot 1} = 480 \]
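Both dimension formulas, the product over pairs i < j from theorem 4.1 (4) and the hook length form (4.5), can be compared numerically. The sketch below is illustrative only; the function names are assumptions of this example and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def weyl_dimension(shape, k):
    """dim S_lambda(C^k) from the product over pairs i < j in Theorem 4.1 (4)."""
    lam = [part for part in shape if part > 0]
    if len(lam) > k:
        return 0  # more than k rows: the representation vanishes
    lam += [0] * (k - len(lam))
    dim = Fraction(1)
    for i in range(k):
        for j in range(i + 1, k):
            dim *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return int(dim)


def hook_content_dimension(shape, k):
    """dim S_lambda(C^k) as the product of (k - i + j) / h_ij over all boxes (i, j)."""
    lam = [part for part in shape if part > 0]
    numerator, denominator = 1, 1
    for i, row in enumerate(lam):
        for j in range(row):
            arm = row - j - 1                              # boxes to the right
            leg = sum(1 for r in lam[i + 1:] if r > j)     # boxes below
            numerator *= k - i + j
            denominator *= arm + leg + 1
    return numerator // denominator


print(weyl_dimension((4, 2, 2, 1), 5), hook_content_dimension((4, 2, 2, 1), 5))
```

Both functions return 480 for λ = (4, 2, 2, 1) and k = 5, in agreement with the example.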

We have further shown that the irreducible representations can be parametrized by Young diagrams

with at most n rows. However, we also know that there is only one possible way of numbering a

column of length n. Therefore, when denoting the representation by its Young diagram any column

of length n may be omitted. That is, taking n = 2 for instance, Young diagrams that differ only by columns of length 2 all represent the same irreducible representation. I will further write $1_2$ for the one-dimensional representation given by a single column of length 2, where the subscript refers to the n = 2 case.

5 Branching Rules

In this section we will determine some branching rules for GL(n,C). As said before, given a group G, an irreducible representation of G and a subgroup H ⊂ G, branching rules describe how this irreducible representation, when restricted to H, decomposes in terms of irreducible representations of H. These branching rules have many applications, in mathematics as well as in physics. In physics they are related to symmetry breaking, a phenomenon we will get back to in great detail in the second part of this thesis. For now we will determine three different branching rules for GL(n,C). It turns out that, once we have determined those, this also gives us the branching rules for SU(N), so we get those for free. This we will see in section 5.2.

5.1 Branching Rules for GL(n,C)

With the results of the previous section at hand we can begin with determining the branching rule for GL(n,C) → GL(n − 1,C). Determining this comes down to determining the multiplicities $d_{\lambda\mu}$ in:

\[ S_\lambda V|_{GL(n-1)} \cong \bigoplus_{\mu} (S_\mu V)^{\oplus d_{\lambda\mu}} \tag{5.1} \]

where $\lambda = (\lambda_1, \dots, \lambda_n)$ and $\mu = (\mu_1, \dots, \mu_{n-1})$. The first step is to rewrite (5.1) in terms of the characters. For this we observe that when evaluating the character $\chi_{S_\lambda V}(g) = \mathrm{Trace}(\rho_\lambda(g))$ on an element g ∈ GL(n,C) it is sufficient to evaluate it on diagonal matrices. That is, to evaluate:

\[ \chi_{S^n_\lambda V}\begin{pmatrix} x_1 & & \\ & \ddots & \\ & & x_n \end{pmatrix} = s_\lambda(x_1, \dots, x_n) \tag{5.2} \]

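To see this, observe that from linear algebra we know that every matrix in GL(n,C) is conjugate to a matrix in Jordan form. The character is therefore determined by its values on matrices in the canonical Jordan form. Since the diagonalizable elements further form a dense subset of GL(n,C) and the character is continuous, the character of any representation is determined by its values on the diagonal matrices. Further, we can embed GL(n − 1,C) in GL(n,C) as:

\[ GL(n-1,\mathbb{C}) \hookrightarrow GL(n,\mathbb{C}), \quad\text{given by}\quad g \mapsto \begin{pmatrix} g & \begin{matrix} 0 \\ \vdots \\ 0 \end{matrix} \\ \begin{matrix} 0 & \cdots & 0 \end{matrix} & 1 \end{pmatrix} \]

Therefore, taking g to be the diagonal matrix as above, this amounts to:

\[ \begin{pmatrix} x_1 & & \\ & \ddots & \\ & & x_{n-1} \end{pmatrix} \mapsto \begin{pmatrix} x_1 & & & \\ & \ddots & & \\ & & x_{n-1} & \\ & & & 1 \end{pmatrix} \tag{5.3} \]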

We now have all we need to rewrite (5.1) in terms of the characters. The right hand side of (5.1) follows immediately. These are the Schur polynomials in the variables $x_1, \dots, x_{n-1}$ corresponding to the partition µ, i.e. $s_\mu(x_1, \dots, x_{n-1})$. As for the left hand side, we have to determine the character of $S_\lambda V$ restricted to GL(n − 1). But compare (5.2) with (5.3). Then we deduce that restriction to GL(n − 1) of the irreducible character of $S_\lambda V$, which is $s_\lambda(x_1, \dots, x_n)$, amounts to setting $x_n = 1$. Therefore the multiplicities $d_{\lambda\mu}$ in (5.1) are determined by the following identity between Schur polynomials:

\[ s^n_\lambda(x_1, \dots, x_{n-1}, 1) = \sum_{\mu} d_{\lambda\mu}\, s^{n-1}_\mu(x_1, \dots, x_{n-1}). \tag{5.4} \]

Now, by (A.4) we have an identity

\[ s_\lambda(x_1, \dots, x_n) = \sum_{\mu \subseteq \lambda} s_\mu(x_1, \dots, x_{n-1})\, x_n^{|\lambda - \mu|} \tag{5.5} \]

where the sum is over all partitions µ for which λ − µ is a horizontal strip and l(µ) ≤ n − 1. Thus substituting $x_n = 1$ then gives us:

\[ s_\lambda(x_1, \dots, x_{n-1}, 1) = \sum_{\mu \subseteq \lambda} s_\mu(x_1, \dots, x_{n-1}) \tag{5.6} \]

and we see that the multiplicities $d_{\lambda\mu}$ are all one. Therefore, when determining the possible branchings of an irreducible GL(n) representation we do not have to worry about multiplicities, but we only have to determine which Young diagrams appear. This is determined by the condition that λ − µ is a horizontal strip. We can now state the branching rule for the decomposition of an irreducible GL(n) representation when restricted to GL(n − 1).

Definition 5.1. (Branching Rule) Let $S_\lambda(V)$ be an irreducible GL(n,C) representation. Then we have

\[ S_\lambda(V)|_{GL(n-1)} \cong \bigoplus_{\mu \subseteq \lambda} S_\mu(V) \]

where λ is a partition with at most n rows and the sum is over all partitions µ ⊆ λ such that λ − µ is a horizontal strip and l(µ) ≤ n − 1.
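As a small check of (5.5) and (5.6), consider n = 2 and λ = (2, 1). This example is added for illustration and uses only the standard expression of Schur polynomials as sums over semistandard tableaux. The partitions µ with at most one row for which λ − µ is a horizontal strip are µ = (1) and µ = (2):

```latex
\begin{align*}
s_{(2,1)}(x_1, x_2) &= x_1^2 x_2 + x_1 x_2^2
  = s_{(2)}(x_1)\, x_2^{1} + s_{(1)}(x_1)\, x_2^{2}, \\
s_{(2,1)}(x_1, 1) &= x_1^2 + x_1 = s_{(2)}(x_1) + s_{(1)}(x_1).
\end{align*}
```

So the two-dimensional GL(2,C) representation $S_{(2,1)}V$ splits into two one-dimensional GL(1,C) representations, each with multiplicity one.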

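Example 5.1. As an example I will demonstrate how we can derive the branching pattern of an irreducible GL(n,C) representation using Young diagrams. By theorem 4.1 the irreducible representations correspond to Young diagrams with at most n rows. Consider now the partition λ = (3, 2, 1). According to the branching rule, the possible GL(n − 1,C) representations correspond to those partitions µ that can be obtained by removing boxes from λ such that λ − µ is a horizontal strip. Since a horizontal strip contains at most one box per column, here we can remove at most the last box of each of the three rows. Writing the resulting diagrams as partitions, this gives us the following possibilities:

µ = (3,2,1), (2,2,1), (3,1,1), (3,2), (2,1,1), (2,2), (3,1), (2,1).

Then the decomposition into irreducible GL(n − 1,C) representations becomes, for n ≥ 4 so that l(µ) ≤ n − 1 holds for all of them:

\[ S_{(3,2,1)}V|_{GL(n-1)} \cong S_{(3,2,1)} \oplus S_{(2,2,1)} \oplus S_{(3,1,1)} \oplus S_{(3,2)} \oplus S_{(2,1,1)} \oplus S_{(2,2)} \oplus S_{(3,1)} \oplus S_{(2,1)} \]

Explicitly for n = 3 only the diagrams with at most two rows survive. Omitting columns of length 3 on the left hand side and columns of length 2 on the right hand side, the branching becomes:

\[ (3,2,1)_3 = (2,1)_3 \;\to\; (3,2)_2 + (2,2)_2 + (3,1)_2 + (2,1)_2 = 2\,(1)_2 + 1_2 + (2)_2 \]

As a check we can compute the dimensions on either side. Then we indeed find: 8 = 2 · 2 + 1 + 3.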
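The branching rule amounts to an interlacing condition $\lambda_1 \ge \mu_1 \ge \lambda_2 \ge \dots \ge \mu_{n-1} \ge \lambda_n$, which is simple to enumerate. The sketch below (with hypothetical helper names `branch` and `weyl_dimension`) lists the partitions µ for λ = (3, 2, 1) and compares dimensions on both sides.

```python
from fractions import Fraction


def weyl_dimension(shape, k):
    """dim S_lambda(C^k) from the product formula of Theorem 4.1 (4)."""
    lam = [part for part in shape if part > 0]
    if len(lam) > k:
        return 0
    lam += [0] * (k - len(lam))
    dim = Fraction(1)
    for i in range(k):
        for j in range(i + 1, k):
            dim *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return int(dim)


def branch(lam, n):
    """Partitions mu with n - 1 parts such that lam - mu is a horizontal strip."""
    lam = tuple(lam) + (0,) * (n - len(lam))

    def rec(i, prefix):
        if i == n - 1:
            yield tuple(prefix)
            return
        # Interlacing: lam[i] >= mu[i] >= lam[i + 1].
        for m in range(lam[i + 1], lam[i] + 1):
            yield from rec(i + 1, prefix + [m])

    yield from rec(0, [])


for n in (3, 4):
    pieces = list(branch((3, 2, 1), n))
    total = sum(weyl_dimension(mu, n - 1) for mu in pieces)
    print(n, pieces, total, weyl_dimension((3, 2, 1), n))
```

For n = 3 this yields the four partitions of the explicit example and the dimension count 8; for n = 4 all eight partitions appear.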

Branching a tensor product

Just as we used the Schur functions to determine the branching rule for GL(n,C) → GL(n − 1,C), we can determine the rule for the decomposition of a tensor product of two irreducible GL(n) representations in terms of irreducible GL(n) representations. The difference compared to the previous case is that the multiplicities will no longer all be one. This rule is, for partitions λ, µ, ν with at most n rows:

\[ S_\lambda \otimes S_\mu \cong \bigoplus_{\nu} (S_\nu)^{\oplus c^{\nu}_{\lambda\mu}} \]

which in terms of characters becomes:

\[ s_\lambda(x_1, \dots, x_n) \cdot s_\mu(x_1, \dots, x_n) = \sum_{\nu} c^{\nu}_{\lambda\mu}\, s_\nu(x_1, \dots, x_n) \tag{5.7} \]

But this is precisely the relation between products of Schur polynomials given by the Littlewood-Richardson rule A.6, and we conclude that the multiplicities are given by the Littlewood-Richardson coefficients.
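Two small instances of (5.7), added here as an illustration, are the products with the defining representation λ = (1), where all Littlewood-Richardson coefficients are 0 or 1:

```latex
\begin{align*}
s_{(1)} \cdot s_{(1)} &= s_{(2)} + s_{(1,1)}
  &&\Longleftrightarrow& V \otimes V &\cong \mathrm{Sym}^2 V \oplus \wedge^2 V, \\
s_{(2)} \cdot s_{(1)} &= s_{(3)} + s_{(2,1)}
  &&\Longleftrightarrow& \mathrm{Sym}^2 V \otimes V &\cong \mathrm{Sym}^3 V \oplus S_{(2,1)} V .
\end{align*}
```

For dim V = 3 the dimensions read 3 · 3 = 6 + 3 and 6 · 3 = 10 + 8, consistent with the dimension formula of theorem 4.1 (4).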

Branching GL(n+m) → GL(n) × GL(m)

As a last application we determine the branching rule for the restriction of an irreducible GL(n+m,C) representation to GL(n,C) × GL(m,C) × GL(1). Note that GL(1), whose compact form is U(1), can be identified with the center Z(GL(n+m)) of GL(n+m). We can embed GL(n,C) × GL(m,C) × GL(1) in GL(n+m,C) by reserving the upper left n × n block for GL(n,C) and the lower right m × m block for GL(m,C), and by embedding GL(1) along the diagonal. This way it will commute with both GL(n) and GL(m) and play no role in the branching. We may thus ignore it. The corresponding branching rule will now be:

\[ S_\lambda V|_{GL(n) \times GL(m)} \cong \bigoplus_{\mu,\nu} (S_\mu V \otimes S_\nu V)^{\oplus e_{\lambda\mu\nu}} \tag{5.8} \]

where λ, µ and ν are partitions with at most n + m, n and m rows respectively. In terms of Schur polynomials this becomes:

\[ s^{n+m}_\lambda(x_1, \dots, x_{n+m}) = \sum_{\mu,\nu} e_{\lambda\mu\nu}\, s^{n}_\mu(x_1, \dots, x_n) \cdot s^{m}_\nu(x_{n+1}, \dots, x_{n+m}). \]

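To determine the multiplicities we have a look at proposition A.2. Considering the two sets of variables $x^{(1)} = (x_1, \dots, x_n)$, $x^{(2)} = (x_{n+1}, \dots, x_{n+m})$ and setting µ = 0, the proposition gives us for the partition λ:

\[ s_\lambda(x^{(1)}, x^{(2)}) = \sum_{\nu} s_{\nu^{(1)}/\nu^{(0)}}(x^{(1)})\, s_{\nu^{(2)}/\nu^{(1)}}(x^{(2)}) \]

with the sum over all sequences $(\nu^{(0)}, \nu^{(1)}, \nu^{(2)})$ of partitions such that $\nu^{(0)} = \mu = 0$, $\nu^{(2)} = \lambda$ and $\nu^{(0)} \subseteq \nu^{(1)} \subseteq \nu^{(2)}$. Implementing these restrictions on the partitions we find:

\begin{align*}
s_\lambda(x^{(1)}, x^{(2)}) &= \sum_{\nu^{(1)} \subseteq \lambda} s_{\nu^{(1)}}(x^{(1)})\, s_{\lambda/\nu^{(1)}}(x^{(2)}) \\
&= \sum_{\nu^{(1)} \subseteq \lambda} s_{\nu^{(1)}}(x^{(1)}) \sum_{\mu} c^{\lambda}_{\nu^{(1)}\mu}\, s_\mu(x^{(2)}) \\
&= \sum_{\nu^{(1)}, \mu} c^{\lambda}_{\nu^{(1)}\mu}\, s_{\nu^{(1)}}(x^{(1)})\, s_\mu(x^{(2)})
\end{align*}

where in the second equality we used A.9. We can therefore identify the coefficients with the Littlewood-Richardson coefficients, i.e. the number of ways the Young diagram for λ can be obtained by strict µ expansion of the Young diagram of ν (see footnote 12). The branching rule thus becomes:

\[ S^{n+m}_\lambda(V)|_{GL(n) \times GL(m)} \cong \bigoplus_{\nu,\mu} c^{\lambda}_{\nu\mu}\, S^{n}_\nu(V) \otimes S^{m}_\mu(V) \tag{5.9} \]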

Now, although (5.7) and (5.9) may look very much alike in that they both have the Littlewood-

Richardson coefficients as multiplicities, the latter is a lot more complex to work with. In (5.7) two of

the three partitions that label the coefficients are known. In (5.9) only the final partition λ is known

which greatly increases the complexity as the partitions get larger. In the next section I will argue

that the results for GL(n) also hold for SU(n) and there I will also discuss an example on how to

handle (5.9).

5.2 The irreducible representations of SU(n)

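We have seen in the previous section how to derive some branching rules for GL(n,C). Here I will argue with a minor discussion (see footnote 13) that this also gives us the branching rules for SU(n). To show this we need some results from Lie theory, in particular Lie algebras (see footnote 14). Important here are the following three Lie algebras:

\begin{align*}
\mathfrak{gl}(n,\mathbb{C}) &= \{ n \times n \text{ complex matrices} \} \\
\mathfrak{sl}(n,\mathbb{C}) &= \{ X \mid X \text{ an } n \times n \text{ complex matrix with } \mathrm{tr}(X) = 0 \} \\
\mathfrak{su}(n) &= \{ X \in \mathfrak{sl}(n,\mathbb{C}) \mid X \text{ anti-hermitian} \}
\end{align*}

Given a real Lie algebra $\mathfrak{g}$ we can consider its complexification $\mathfrak{g}_{\mathbb{C}}$, defined as:

\[ \mathfrak{g}_{\mathbb{C}} := \mathfrak{g} \otimes_{\mathbb{R}} \mathbb{C} \equiv \mathfrak{g} \oplus i\mathfrak{g} \]

Then $\mathfrak{sl}(n,\mathbb{C})$ can be seen as the complexification of $\mathfrak{su}(n)$, i.e.

\[ \mathfrak{sl}(n,\mathbb{C}) = \mathfrak{su}(n) \oplus i\,\mathfrak{su}(n) \]

and $\mathfrak{gl}(n,\mathbb{C})$ is related to $\mathfrak{sl}(n,\mathbb{C})$ by:

\[ \mathfrak{gl}(n,\mathbb{C}) = \mathbb{C}I \oplus \mathfrak{sl}(n,\mathbb{C}) \]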

12 The definition and an example of strict expansion can be found in definition A.6.

13 Lie theory is not the main purpose of this thesis and therefore results are kept short.

14 The material from the first paragraph is based on notes from J. Stokman, [9] and [10].

where I is the n × n identity matrix and $\mathbb{C}I$ the center. Then, since gl(n,C) = CI ⊕ sl(n,C) and the center acts by scalars on an irreducible representation, an irreducible representation of gl(n,C) will remain irreducible when we restrict it to sl(n,C). Further, sl(n,C) is the complexification of su(n), so a complex subspace of a representation is invariant under su(n) if and only if it is invariant under sl(n,C). Thus, the restriction of irreducible sl(n,C) representations to su(n) will also remain irreducible. This observation is important. It tells us that when we have an irreducible representation of the Lie algebra gl(n,C), this automatically defines a complex irreducible representation of su(n). However, in all the previous we have been investigating the representations of the group GL(n,C), and not its Lie algebra. An irreducible representation of a group, though, also defines an irreducible representation of its Lie algebra. They are related to one another through the exponential map. That is, if $S_\lambda(V)$ is an irreducible GL(n,C) representation, then we have an irreducible gl(n,C) representation by differentiating the action on $v \in S_\lambda V$:

\[ X \cdot v = \frac{d}{dt}\Big|_{t=0} \exp(tX) \cdot v, \qquad X \in \mathfrak{gl}(n,\mathbb{C}),\ \exp(tX) \in GL(n,\mathbb{C}). \]

Conversely we can integrate an irreducible representation of the Lie algebra to get an irreducible representation of the Lie group. Thus, putting these results together, we conclude that equivalent irreducible representations of GL(n,C) define equivalent irreducible representations of SU(n). Also, inequivalent irreducible representations of GL(n) with the same action of the center give inequivalent irreducible representations of SU(n). All results that hold for the irreducible representations of GL(n,C) also hold for SU(n).

Example 5.2. Branching SU(5) → SU(3) × SU(2).

With this result and the branching rule (5.9) we can decompose an irreducible SU(5) representation in terms of irreducible SU(3) × SU(2) representations using Young tableaux. I will refer to the representations by their Young diagrams, written here as partitions, and by their dimension (denoted bold), and consider the 5 lowest dimensional representations $\mathbf{5}$, $\bar{\mathbf{5}}$, $\mathbf{10}$, $\mathbf{15}$ and $\mathbf{24}$. I will label the Young diagrams with a subscript 2, 3 or 5 to emphasize whether they should be seen as irreducible SU(2), SU(3) or SU(5) representations respectively. Consider now first $\mathbf{5} = (1)_5$. Then of course we have a rather trivial decomposition:

\[ (1)_5 = (1)_3 \otimes 1 + 1 \otimes (1)_2, \]

where (as representations) in each term the first factor belongs to SU(3) and the second to SU(2).

SU(5) has a second 5 dimensional representation, $\bar{\mathbf{5}}$, with Young diagram λ = (1, 1, 1, 1). To find out how it decomposes, suppose we started out with the irreducible SU(3) representation $\mathbf{3} = (1)_3$. Then the only way to obtain $\lambda = (1,1,1,1)_5$ by strict expansion with ν is when $\nu = (1,1,1)_2$. But this is zero in SU(2). However, the other three dimensional representation $\bar{\mathbf{3}} = (1,1)_3$ can be expanded to λ = (1, 1, 1, 1) with $(1,1)_2 = 1_2$. The second possibility is to expand $(1,1,1)_3 = 1_3$ using $(1)_2 = \mathbf{2}$. Thus:

\[ (1,1,1,1)_5 = (1,1)_3 \otimes (1,1)_2 + (1,1,1)_3 \otimes (1)_2 = (1,1)_3 \otimes 1 + 1 \otimes (1)_2 \tag{5.10} \]

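Next is the SU(5) representation $\mathbf{10} = (1,1)_5$. Of course, we can obtain this Young diagram by expanding $\mathbf{3} = (1)_3$ with $\mathbf{2} = (1)_2$ and similarly by expanding $(1,1)_3$ with $1_2$. Likewise, we can expand $(1,1)_2$ with $1_3$. Thus this representation decomposes as:

\[ (1,1)_5 = (1)_3 \otimes (1)_2 + (1,1)_3 \otimes 1 + 1 \otimes (1,1)_2 = (1)_3 \otimes (1)_2 + (1,1)_3 \otimes 1 + 1 \otimes 1 \tag{5.11} \]

The $\mathbf{15}$ has Young diagram $(2)_5$. This we can similarly obtain in three different ways by expanding $(2)_3$ using $1_2$, expanding $(1)_3$ using $(1)_2$ or expanding $(2)_2$ using $1_3$. Thus

\[ (2)_5 = (2)_3 \otimes 1 + (1)_3 \otimes (1)_2 + 1 \otimes (2)_2 \tag{5.12} \]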

For the 24 dimensional representation the Young diagram corresponds to the partition λ = (2, 1, 1, 1). We start with the trivial SU(3) representation $(1,1,1)_3 = 1_3$. Then we can obtain λ using the Young diagrams for the SU(2) representations given by $(1,1)_2 = 1_2$ and $(2)_2$, each of which gives one possible way of expansion. Starting with the 3 dimensional representation $(1)_3$, we see we cannot expand it to λ. However this representation can also be represented by $(2,1,1)_3$, which we can expand using $(1)_2$. The other 3 dimensional representation $(1,1)_3$ requires an expansion with $(2,1)_2 = \mathbf{2}$. Finally, we can start with the 8 dimensional representation $(2,1)_3$ and then we need $(1,1)_2 = 1_2$, which also gives one possible way of strict expansion. Thus:

\[ (2,1,1,1)_5 = 1 \otimes 1 + 1 \otimes (2)_2 + (1)_3 \otimes (1)_2 + (1,1)_3 \otimes (1)_2 + (2,1)_3 \otimes 1 \tag{5.13} \]

As a check we can compute dimensions. Then we find: 24 = 1 + 3 + 3 · 2 + 3 · 2 + 8.
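The dimension checks for all five decompositions can be carried out with the dimension formula of theorem 4.1 (4). The sketch below encodes each term of the SU(5) → SU(3) × SU(2) decompositions as a pair of partitions (before removing full columns), so that the box counts of the two factors add up to that of λ. The dictionary `DECOMPOSITIONS` and the function name `weyl_dimension` are assumptions of this illustration, not notation from the thesis.

```python
from fractions import Fraction


def weyl_dimension(shape, k):
    """dim S_lambda(C^k) from the product formula of Theorem 4.1 (4)."""
    lam = [part for part in shape if part > 0]
    if len(lam) > k:
        return 0
    lam += [0] * (k - len(lam))
    dim = Fraction(1)
    for i in range(k):
        for j in range(i + 1, k):
            dim *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return int(dim)


# SU(5) diagram -> list of (SU(3) diagram, SU(2) diagram) pairs.
DECOMPOSITIONS = {
    (1,): [((1,), ()), ((), (1,))],
    (1, 1, 1, 1): [((1, 1), (1, 1)), ((1, 1, 1), (1,))],
    (1, 1): [((1,), (1,)), ((1, 1), ()), ((), (1, 1))],
    (2,): [((2,), ()), ((1,), (1,)), ((), (2,))],
    (2, 1, 1, 1): [
        ((1, 1, 1), (1, 1)),
        ((1, 1, 1), (2,)),
        ((2, 1, 1), (1,)),
        ((1, 1), (2, 1)),
        ((2, 1), (1, 1)),
    ],
}

for lam, terms in DECOMPOSITIONS.items():
    lhs = weyl_dimension(lam, 5)
    rhs = [(weyl_dimension(nu, 3), weyl_dimension(mu, 2)) for nu, mu in terms]
    # Box counts must add up, and so must the dimensions.
    assert all(sum(nu) + sum(mu) == sum(lam) for nu, mu in terms)
    print(lam, lhs, rhs, sum(a * b for a, b in rhs))
```

Each printed line shows the SU(5) dimension followed by the (SU(3), SU(2)) dimensions of the terms and their total.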

I will apply the result of this branching rule when I discuss the symmetry group SU(5) in the context of Grand Unified Theories in section 10. There I show that we can use the (SU(3), SU(2)) decompositions of the irreducible SU(5) representations found above to assign the elementary fermions of the Standard Model to the irreducible representations of SU(5), and I will argue this in more detail. In the following section I will give an introduction to the Lagrangian formalism in field theory and discuss the phenomenon of symmetry breaking in physics.

6 Lagrangians, symmetries and symmetry breaking

This section will include a short introduction to the Lagrangian formalism in classical mechanics and

how it is generalized to obtain a relativistic field theory. Then I will discuss some examples on field

theoretic Lagrangians, define what we mean by spontaneous global symmetry breaking and show that

this phenomenon is accompanied by the appearance of massless particles. The material can be found

in [11], [12] and [13].

6.1 Lagrangian formalism

The classical mechanical Lagrangian of a system is defined as

\[ L = T - U \tag{6.1} \]

where T and U are the kinetic energy and potential respectively and L is a function of the coordinates $q_i$ and their time derivatives $\dot{q}_i$. The action S is defined as

\[ S = \int_{t_1}^{t_2} L(q_i, \dot{q}_i, t)\, dt \tag{6.2} \]

and the requirement of δS = 0 leads to the Euler-Lagrange equation (6.3) from which the equations of motion can be obtained:

\[ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_i} \right) - \frac{\partial L}{\partial q_i} = 0. \tag{6.3} \]
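As a brief illustration of (6.3), not taken from the thesis, consider a one-dimensional harmonic oscillator with mass m and spring constant k (both symbols are assumptions of this example):

```latex
\begin{align*}
L &= \tfrac{1}{2} m \dot{q}^2 - \tfrac{1}{2} k q^2, &
\frac{\partial L}{\partial \dot{q}} &= m \dot{q}, &
\frac{\partial L}{\partial q} &= -k q, \\
\frac{d}{dt}\left( m \dot{q} \right) + k q &= 0
\quad\Longrightarrow\quad m \ddot{q} = -k q .
\end{align*}
```

This is Newton's equation of motion for the oscillator, as expected.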

With this recipe at hand we can easily obtain a Lagrangian field theory. To do this, we introduce the functional $\mathcal{L}(\varphi(x), \partial_\mu \varphi(x))$, where $\partial_\mu$ is the usual shorthand notation for $\partial/\partial x^\mu$ and $x^\mu$ is the relativistic four-vector. Note that $\mathcal{L}$ is a Lagrangian density since we have

\[ S = \int d^4x\, \mathcal{L}(\varphi(x), \partial_\mu \varphi(x)) = \int dt \int d^3x\, \mathcal{L}(\varphi(x), \partial_\mu \varphi(x)) = \int dt\, L. \]

We can consider φ(x) as a generalized coordinate at each value of its argument x, and unlike the classical case we now have an infinite number of degrees of freedom. With this definition of the Lagrangian density we can rewrite (6.3) to obtain the Euler-Lagrange equation for a relativistic field theory

\[ \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \varphi_i)} \right) = \frac{\partial \mathcal{L}}{\partial \varphi_i}. \tag{6.4} \]

There is one slight difference between the Classical and Field-theoretic Lagrangian formulation though.

While in classical mechanics the Lagrangian can be explicitly derived using (6.1), in Field theories

the Lagrangian is often taken to be axiomatic. The following examples will discuss a few important

Lagrangians.

Example 6.1. Scalar (spin-0) field. The free field Lagrangian for a real scalar (spin-0) field is given by

$$\mathcal{L}_{scalar} = \frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \frac{1}{2}m^2\phi^2. \tag{6.5}$$

(In general, the Lagrangian will also include interaction terms of higher order in $\phi$, which is why we call this Lagrangian the free field Lagrangian.) By applying the Euler-Lagrange equation we obtain:

$$(\partial_\mu\partial^\mu + m^2)\phi = 0. \tag{6.6}$$

Equation (6.6) is called the Klein-Gordon equation and describes a spin-0 particle of mass m.
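As a rough numerical illustration of the Klein-Gordon equation, the sketch below checks that a plane wave with $\omega^2 = k^2 + m^2$ satisfies it. It assumes one space dimension, natural units and the metric signature (+,−); the function names and the sample values are chosen for illustration only.

```python
import math

def phi(t, x, k, m):
    """Plane wave cos(omega*t - k*x) with omega fixed by omega^2 = k^2 + m^2."""
    omega = math.sqrt(k * k + m * m)
    return math.cos(omega * t - k * x)

def klein_gordon_residual(t, x, k, m, h=1e-3):
    """Finite-difference estimate of (d^2/dt^2 - d^2/dx^2 + m^2) phi at (t, x)."""
    centre = phi(t, x, k, m)
    d2t = (phi(t + h, x, k, m) - 2 * centre + phi(t - h, x, k, m)) / h**2
    d2x = (phi(t, x + h, k, m) - 2 * centre + phi(t, x - h, k, m)) / h**2
    return d2t - d2x + m * m * centre

# Arbitrary sample point and parameters (illustrative values only).
residual = klein_gordon_residual(t=0.3, x=1.1, k=2.0, m=1.5)
print(abs(residual))  # small, limited by the finite-difference step
```

The residual is not exactly zero because the derivatives are approximated by finite differences; it shrinks as the step h is reduced (until rounding errors dominate).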


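Example 6.2. Vector (spin-1) field. Spin-1 particles are described in terms of a vector field $V^\mu$, with Lagrangian

$$\mathcal{L}_{proca} = -\frac{1}{4}(\partial^\mu V^\nu - \partial^\nu V^\mu)(\partial_\mu V_\nu - \partial_\nu V_\mu) + \frac{1}{2}M^2 V^\nu V_\nu \tag{6.7}$$

where M is the mass of the vector field. By defining the field-strength tensor $F^{\mu\nu}$

$$F^{\mu\nu} = \partial^\mu V^\nu - \partial^\nu V^\mu \tag{6.8}$$

we can rewrite (6.7) to get a neater expression

$$\mathcal{L}_{proca} = -\frac{1}{4}F^{\mu\nu}F_{\mu\nu} + \frac{1}{2}M^2 V^\nu V_\nu. \tag{6.9}$$

Applying the Euler-Lagrange equation yields:

$$\partial_\mu(\partial^\mu V^\nu - \partial^\nu V^\mu) + M^2 V^\nu = 0 \quad\Longleftrightarrow\quad \partial_\mu F^{\mu\nu} + M^2 V^\nu = 0 \tag{6.10}$$

which describes a spin-1 particle of mass M.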

Example 6.3. Spinor (spin-1/2) field. The Lagrangian for a spinor field $\psi$ is given by

$$\mathcal{L}_{fermion} = i\bar{\psi}\gamma^\mu\partial_\mu\psi - m\bar{\psi}\psi, \tag{6.11}$$

where $\bar{\psi} = \psi^\dagger\gamma^0$ is the adjoint spinor and the gamma matrices are defined in appendix C. Applying the Euler-Lagrange equation for $\bar{\psi}$ gives the Dirac equation describing a spin-1/2 particle of mass m. It reads:

$$i\gamma^\mu\partial_\mu\psi - m\psi = 0 \tag{6.12}$$

6.2 Symmetries

We say that the Lagrangian has a symmetry, when it is invariant under a certain type of transformation.

I will demonstrate this through some examples.

Example 6.4. Consider the following Lagrangian

$$\mathcal{L} = \frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - V(\phi) \quad\text{where}\quad V(\phi) = V(-\phi) \tag{6.13}$$

This Lagrangian has a discrete symmetry, since it is invariant under the parity transformation $\phi \to -\phi$.

Besides discrete symmetries, the Lagrangian can also have continuous symmetries. This is demonstrated in the following examples.^{15}

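Example 6.5. The Lagrangian for a complex scalar field $\phi = \frac{1}{\sqrt{2}}(\phi_1 + i\phi_2)$ is given by

$$\mathcal{L} = (\partial_\mu\phi)^*(\partial^\mu\phi) - V(\phi) \quad\text{where}\quad V(\phi) = \mu^2\phi^*\phi + \lambda(\phi^*\phi)^2. \tag{6.14}$$

This Lagrangian has a global U(1) symmetry. It is invariant under global phase transformations $\phi \to \phi' = e^{i\theta}\phi$, which we can easily see by looking at the modulus

$$\phi^*\phi \to \phi'^*\phi' = e^{-i\theta}e^{i\theta}\phi^*\phi = \phi^*\phi.$$

Since $\theta$ is constant, the phase factors cancel in the kinetic term in the same way.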
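The invariance of the potential under a global phase rotation can be illustrated with a few lines of Python. This is only a sketch for one arbitrary field value; the numbers chosen for $\mu^2$, $\lambda$, $\theta$ and $\phi$ are assumptions made for the illustration.

```python
import cmath

def potential(phi, mu2, lam):
    """V(phi) = mu^2 |phi|^2 + lambda |phi|^4 for a complex field value phi."""
    modulus_sq = (phi.conjugate() * phi).real
    return mu2 * modulus_sq + lam * modulus_sq**2

phi = complex(0.7, -1.2)   # arbitrary field value (illustrative)
theta = 0.83               # arbitrary constant phase (illustrative)
phi_rotated = cmath.exp(1j * theta) * phi

v_before = potential(phi, mu2=-1.0, lam=0.5)
v_after = potential(phi_rotated, mu2=-1.0, lam=0.5)
print(abs(v_before - v_after))  # agrees up to floating-point rounding
```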

^{15} Continuous symmetry groups are called Lie groups. They are discussed in appendix B.

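Example 6.6. As a final example, we take the Lagrangian for a nucleon

$$\mathcal{L} = \bar{p}(i\gamma^\mu\partial_\mu - m)p + \bar{n}(i\gamma^\mu\partial_\mu - m)n \tag{6.15}$$

where an equal mass for the proton and neutron is assumed.^{16} The Dirac $\gamma$ matrices are defined in appendix C. We can rewrite this as

$$\mathcal{L} = \bar{\psi}(i\gamma^\mu\partial_\mu - m)\psi \quad\text{where}\quad \psi = \begin{pmatrix} p \\ n \end{pmatrix} \tag{6.16}$$

and $\bar{\psi}$ represents the spinor conjugate $\psi^\dagger\gamma^0$. Lagrangian (6.16) is invariant under SU(2) transformations, which is the group of rotations in isospin space.^{17} It consists of transformations

$$\psi \to \exp\left(\frac{i\vec{\sigma}\cdot\vec{\alpha}}{2}\right)\psi \tag{6.17}$$

where $\vec{\sigma} = (\sigma_1, \sigma_2, \sigma_3)$ are the Pauli matrices and $\vec{\alpha} = (\alpha_1, \alpha_2, \alpha_3)$ are the parameters of the transformation.^{18}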

6.3 Symmetry breaking

The previous section introduced the concept of symmetries of the Lagrangian by means of some examples. These symmetries, however, can be broken, and there are two ways this can happen: a symmetry can be broken spontaneously or explicitly.

6.3.1 Explicit symmetry breaking

In this case the symmetry is broken by explicitly adding terms to the Lagrangian that violate the symmetry. For example, in a Lagrangian with a discrete $\phi \to -\phi$ symmetry, terms with odd powers of $\phi$ would explicitly break this symmetry. As another example consider the Lagrangian (6.15). We already noted that the symmetry was only approximate because we assumed an equal mass for the proton and neutron. The symmetry is explicitly broken when we distinguish between $m_p$ and $m_n$.

6.3.2 Spontaneous symmetry breaking

We speak of spontaneous symmetry breaking when the vacuum of the Lagrangian is not invariant under the full symmetry group of the Lagrangian. To explain what this means, let me first define what we mean by the vacuum being invariant under a symmetry group. From appendix B we know that we can write an element U of the symmetry group of the Lagrangian as

$$U = e^{i\alpha t}$$

where t is a group generator. The vacuum of the Lagrangian is now said to be invariant under transformations of the symmetry group when

$$e^{i\alpha t}\langle\phi_0\rangle = \langle\phi_0\rangle \tag{6.18}$$

^{16} Note that this makes the symmetry we consider an approximate symmetry since their masses are not exactly equal.
^{17} More on isospin can be found in appendix C.2.
^{18} See appendix B.

Now, if we consider infinitesimal transformations we can rewrite this as

$$(1 + i\alpha t)\langle\phi_0\rangle = \langle\phi_0\rangle \tag{6.19}$$

which is to say that for the vacuum to be invariant under a symmetry, the following condition for the generator t must hold:

$$t\langle\phi_0\rangle = 0. \tag{6.20}$$

Any generator that does not satisfy this condition is called a broken generator. The following examples illustrate spontaneously broken symmetries. In all these examples there will be a scalar field $\phi$ that acquires a vacuum expectation value (VEV) caused by a potential $V(\phi) = \mu^2\phi\phi^* + \lambda(\phi\phi^*)^2$. This potential is called a Mexican hat potential and we will later identify it with the Higgs potential. The VEV, however, will only be invariant under a subgroup of the symmetry group. Some generators of the full symmetry group that satisfied condition (6.20) before the field acquired a VEV will be broken, and the remaining generators, which do leave the vacuum invariant, are the generators of the unbroken subgroup. As we will see, there is a physical interpretation for these broken generators and this will be important for the Little Higgs models we discuss later.

Example 6.7. Breaking a global U(1) symmetry

Consider the Lagrangian

$$\mathcal{L} = (\partial_\mu\phi)^*(\partial^\mu\phi) - \mu^2\phi^*\phi - \lambda(\phi^*\phi)^2 \tag{6.21}$$

for a complex scalar field $\phi = \frac{1}{\sqrt{2}}(\phi_1 + i\phi_2)$. As stated earlier this Lagrangian is invariant under U(1) transformations. A look at the expression for the potential indicates we have to distinguish between the cases $\mu^2 > 0$ and $\mu^2 < 0$.

$\mu^2 > 0$: The case $\mu^2 > 0$ corresponds to a ground state $\langle\phi_0\rangle = 0$. In terms of the fields $\phi_1$ and $\phi_2$ the Lagrangian for small oscillations around this vacuum reads^{19}

$$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi_1)^2 + \frac{1}{2}(\partial_\mu\phi_2)^2 - \frac{1}{2}\mu^2(\phi_1^2 + \phi_2^2) - \frac{1}{4}\lambda(\phi_1^2 + \phi_2^2)^2 \tag{6.22}$$

and by comparing this to (6.5) we see that the $\mu^2 > 0$ case simply describes two particles, each of which has a mass $\mu$.

$\mu^2 < 0$: The case $\mu^2 < 0$ is more interesting. In this case the potential takes the form of the so-called Mexican hat potential, which has an unstable extremum at $\phi = 0$. Unlike the $\mu^2 > 0$ case, we now have an infinite number of vacua located at the rim of the hat satisfying

$$\sqrt{\phi_1^2 + \phi_2^2} = \sqrt{\frac{-\mu^2}{\lambda}} = v \tag{6.23}$$

and which are connected by rotational symmetry. They are therefore all equivalent so we are free to choose

$$\langle\phi_0\rangle = \begin{pmatrix}\phi_1 \\ \phi_2\end{pmatrix} = \begin{pmatrix} v \\ 0 \end{pmatrix} \tag{6.24}$$

as our ground state. We now consider small oscillations around the ground state by redefining the

field variables through η = φ1 − v and ξ = φ2. This gives us a parametrization of φ in terms of the

^{19} In field theory particles are described as oscillations around their ground state [12].

Figure 1: The potential $V(\phi) = \mu^2\phi^*\phi + \lambda(\phi^*\phi)^2$ for a complex scalar field for (a) $\mu^2 > 0$ and (b) $\mu^2 < 0$. The picture is from [13].

fluctuation fields $\eta$ and $\xi$:^{20}

$$\phi = \frac{1}{\sqrt{2}}(\eta + v + i\xi). \tag{6.25}$$

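The next step is to rewrite (6.21) in terms of $\eta$ and $\xi$ to obtain the Lagrangian for the small oscillations. To keep things clear, we treat the kinetic part and potential part separately. For the kinetic part we find:

$$\mathcal{L}_{kin} = (\partial_\mu\phi)^*(\partial^\mu\phi) = \frac{1}{2}\,\partial_\mu(\eta + v - i\xi)\,\partial^\mu(\eta + v + i\xi) = \frac{1}{2}(\partial_\mu\eta)^2 + \frac{1}{2}(\partial_\mu\xi)^2$$

where we used $\partial_\mu v = 0$. For the potential part we note $\phi^*\phi = \frac{1}{2}(\eta + v - i\xi)(\eta + v + i\xi) = \frac{1}{2}\left[(\eta + v)^2 + \xi^2\right]$.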

Then we find:

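$$\begin{aligned}
V(\phi) &= \mu^2\phi^*\phi + \lambda(\phi^*\phi)^2 \\
&= \frac{1}{2}(-\lambda v^2)\left[(\eta + v)^2 + \xi^2\right] + \frac{1}{4}\lambda\left[(\eta + v)^2 + \xi^2\right]^2 \\
&= -\frac{1}{4}\lambda v^4 + \lambda v^2\eta^2 + \lambda v\eta^3 + \frac{1}{4}\lambda\eta^4 + \frac{1}{4}\lambda\xi^4 + \lambda\eta v\xi^2 + \frac{1}{2}\lambda\eta^2\xi^2
\end{aligned} \tag{6.26}$$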
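The expansion in (6.26) is easy to get wrong by hand, so a quick numerical cross-check can be useful. The sketch below compares the potential evaluated directly from $\phi = (\eta + v + i\xi)/\sqrt{2}$ with the expanded right-hand side of (6.26), using $\mu^2 = -\lambda v^2$. The sample values of $\lambda$, v, $\eta$ and $\xi$ are arbitrary assumptions.

```python
def potential_direct(eta, xi, lam, v):
    """V = mu^2 |phi|^2 + lam |phi|^4 with mu^2 = -lam v^2, phi = (eta + v + i xi)/sqrt(2)."""
    mu2 = -lam * v**2
    modulus_sq = 0.5 * ((eta + v)**2 + xi**2)
    return mu2 * modulus_sq + lam * modulus_sq**2

def potential_expanded(eta, xi, lam, v):
    """Right-hand side of (6.26), written out term by term."""
    return (-0.25 * lam * v**4 + lam * v**2 * eta**2 + lam * v * eta**3
            + 0.25 * lam * eta**4 + 0.25 * lam * xi**4
            + lam * eta * v * xi**2 + 0.5 * lam * eta**2 * xi**2)

# Arbitrary sample points (eta, xi); lam and v are illustrative values.
samples = [(0.3, -0.2), (-1.1, 0.7), (0.0, 2.5)]
max_difference = max(
    abs(potential_direct(eta, xi, 0.8, 1.7) - potential_expanded(eta, xi, 0.8, 1.7))
    for eta, xi in samples
)
print(max_difference)  # zero up to floating-point rounding
```

Note that the expansion contains no term quadratic in $\xi$, which is the numerical counterpart of the statement that $\xi$ has no mass term.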

Since in (6.26) the 3rd and 4th order terms in $\eta$ and $\xi$ represent interaction terms and the constant term is irrelevant, we consider only the quadratic terms to obtain for the full Lagrangian in terms of the oscillation fields

$$\mathcal{L}_{s.o.} = \frac{1}{2}(\partial_\mu\eta)^2 + \frac{1}{2}(\partial_\mu\xi)^2 - \lambda v^2\eta^2 + \text{interaction terms}. \tag{6.27}$$

Comparing to (6.5) shows this corresponds to a massive $\eta$-particle with $m_\eta^2 = 2\lambda v^2 = -2\mu^2 > 0$ and a massless particle $\xi$, since there is no mass term $\frac{1}{2}m_\xi^2\xi^2$. This massless $\xi$-particle is called a

^{20} We have $\langle\eta_0\rangle = 0$ and $\langle\xi_0\rangle = 0$ so they indeed describe fluctuations around the vacuum.

Nambu-Goldstone boson (often abbreviated as NGB) and it is this particle that we identify with the broken generator. Such particles are predicted by the Goldstone theorem, which states that one Goldstone boson will appear for every broken generator of the original symmetry group. The appearance of massless particles might seem troublesome. However, we will see in sections 7.2 and 7.3 that they play an important role in the Higgs mechanism.

Now although the parametrization we used above worked perfectly well, we could also have chosen to

use the following parametrization for φ in terms of the two real fields η and ξ given by

$$\phi = \frac{1}{\sqrt{2}}(v + \eta)e^{i\xi/v}. \tag{6.28}$$

Here v is again the vacuum expectation value, $\eta$ parametrizes radial oscillations around v and $\xi$ rotations in the complex plane.^{21} Substituting this in (6.21) we obtain

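$$\begin{aligned}
\mathcal{L} &= \frac{1}{2}\,\partial_\mu\!\left[(\eta + v)e^{-i\xi/v}\right]\partial^\mu\!\left[(\eta + v)e^{i\xi/v}\right] - \frac{\mu^2}{2}(\eta + v)^2 + \frac{\mu^2}{4v^2}(\eta + v)^4 \\
&= \frac{1}{2}\partial_\mu\eta\,\partial^\mu\eta + \frac{(\eta + v)^2}{2v^2}\partial_\mu\xi\,\partial^\mu\xi - \frac{\mu^2}{2}(\eta + v)^2 + \frac{\mu^2}{4v^2}(\eta + v)^4 \\
&= \left[\frac{1}{2}\partial_\mu\eta\,\partial^\mu\eta + \mu^2\eta^2\right] + \left[\frac{1}{2}\partial_\mu\xi\,\partial^\mu\xi\right] + \left[\left(\frac{\eta^2}{2v^2} + \frac{\eta}{v}\right)\partial_\mu\xi\,\partial^\mu\xi + \frac{\mu^2}{4v^2}(4v\eta^3 + \eta^4)\right] - \frac{\mu^2v^2}{4}
\end{aligned}$$

where we used $\partial_\mu v = 0$ in the second line and $\lambda = -\mu^2/v^2$ throughout. As before we make the same identification of a massive $\eta$ particle and a Goldstone particle $\xi$, as was to be expected since the result should be independent of the chosen parametrization. We now have a second look at (6.28) to see if we can deduce some properties of $\xi$. Under a U(1) transformation we have:

$$\phi \to e^{i\alpha}\phi \quad\text{and}\quad \xi \to \xi + v\alpha$$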

We can conclude from this that any non-derivative term in $\xi$ would not be invariant under the U(1) transformations. Therefore, no mass term of $\xi$ can appear in the Lagrangian and $\xi$ must be massless. The only way for the Goldstone particle to acquire a small mass is when the symmetry is broken explicitly.^{22} In this case the Goldstone particle is called a pseudo-NGB. Also, we see from (6.28) that $\xi$ parametrizes a direction in field space along which the energy does not change, since a shift in $\xi$ does not change $\phi\phi^*$. It corresponds to walking along the rim of the hat, see figure 1, and this is the reason why the Goldstone particle remains massless. The $\eta$ field, on the other hand, parametrizes the radial direction and oscillations in this direction do change the energy, as figure 1 shows. The $\eta$ particle therefore acquires a mass.

Example 6.8. Breaking a global SU(2) symmetry. Things become more interesting when we look at the spontaneous breaking of an SU(2) symmetry. To have SU(2) invariance we have to consider a doublet consisting of two complex scalar fields $\Phi_1$ and $\Phi_2$:

$$\Phi = \begin{pmatrix}\Phi_1 \\ \Phi_2\end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix}\phi_1 + i\phi_2 \\ \phi_3 + i\phi_4\end{pmatrix}. \tag{6.29}$$

^{21} Note that (6.25) is the first order expansion of (6.28).
^{22} This shift of the Goldstone boson under the action of the broken generator is a general observation and the realization of the symmetry is called a shift symmetry.

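The expression for the Lagrangian of $\Phi_1$ is

$$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi_1)^2 + \frac{1}{2}(\partial_\mu\phi_2)^2 - \frac{1}{2}m_1^2(\phi_1^2 + \phi_2^2) - \frac{1}{4}\lambda(\phi_1^2 + \phi_2^2)^2.$$

A similar expression holds for $\Phi_2$. Using $\Phi_1^*\Phi_1 = \phi_1^2 + \phi_2^2$ (absorbing the normalization factor $1/\sqrt{2}$ of (6.29) into the fields) and assuming equal masses for $\Phi_1$ and $\Phi_2$ of $m_1 = m_2 = \mu$ we obtain the full Lagrangian for $\Phi$

$$\begin{aligned}
\mathcal{L} &= \frac{1}{2}(\partial_\mu\Phi_1)^*(\partial^\mu\Phi_1) + \frac{1}{2}(\partial_\mu\Phi_2)^*(\partial^\mu\Phi_2) - \frac{1}{2}\mu^2(\Phi_1^*\Phi_1 + \Phi_2^*\Phi_2) - \frac{1}{4}\lambda(\Phi_1^*\Phi_1 + \Phi_2^*\Phi_2)^2 \\
&= \frac{1}{2}(\partial_\mu\Phi^\dagger)(\partial^\mu\Phi) - \frac{1}{2}\mu^2(\Phi^\dagger\Phi) - \frac{1}{4}\lambda(\Phi^\dagger\Phi)^2.
\end{aligned} \tag{6.30}$$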

As before we have to distinguish between the cases $\mu^2 > 0$ and $\mu^2 < 0$. The former again corresponds to a vacuum expectation value of $\langle\Phi_0\rangle = 0$ and describes two particles each of mass $\mu$. The case $\mu^2 < 0$ has a vacuum expectation value of

$$\langle\Phi^\dagger\Phi\rangle_0 = \frac{-\mu^2}{\lambda} = v^2$$

and there are again an infinite number of vacuum states, lying on a circle of radius v in the $\Phi_1$-$\Phi_2$ plane. We choose our vacuum as

$$\langle\Phi_0\rangle = \begin{pmatrix} 0 \\ v \end{pmatrix}.$$

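Now, let's see how many generators get broken by this VEV. The full symmetry group SU(2) has three generators $T_i = \tau_i/2$ where $\tau_i$, i = 1,2,3 are the three Pauli matrices.^{23} Recall that a generator is broken by the vacuum state when condition (6.20) is not satisfied, that is, when

$$t\langle\Phi_0\rangle \neq 0$$

and that we will have as many Goldstone bosons as broken generators. For the generators of SU(2) we see that

$$T_1\begin{pmatrix} 0 \\ v \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ v \end{pmatrix} \neq 0 \tag{6.31}$$

$$T_2\begin{pmatrix} 0 \\ v \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} 0 \\ v \end{pmatrix} \neq 0 \tag{6.32}$$

$$T_3\begin{pmatrix} 0 \\ v \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 0 \\ v \end{pmatrix} \neq 0 \tag{6.33}$$

so all three SU(2) generators are broken, from which we expect there to be three Goldstone bosons.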
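The three matrix products in (6.31) to (6.33) can be reproduced with plain Python lists. This is a minimal sketch; the helper names `mat_vec` and `is_broken` and the value v = 1 are assumptions made for the illustration.

```python
def mat_vec(matrix, vector):
    """Multiply a 2x2 matrix (nested lists) with a 2-component vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(2)) for i in range(2)]

# SU(2) generators T_i = tau_i / 2 built from the Pauli matrices.
T1 = [[0, 0.5], [0.5, 0]]
T2 = [[0, -0.5j], [0.5j, 0]]
T3 = [[0.5, 0], [0, -0.5]]

v = 1.0            # arbitrary nonzero VEV (illustrative)
vacuum = [0, v]

def is_broken(generator, vacuum_state, tol=1e-12):
    """A generator is broken when it does not annihilate the vacuum, cf. (6.20)."""
    return any(abs(c) > tol for c in mat_vec(generator, vacuum_state))

broken = [is_broken(T, vacuum) for T in (T1, T2, T3)]
print(broken)  # [True, True, True]
```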

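We now set

$$\Xi = \begin{pmatrix}\xi_1 + i\xi_2 \\ \xi_3 + i\xi_4\end{pmatrix},$$

and vary $\Phi$ around its ground state by setting $\Phi - \langle\Phi_0\rangle = \Xi$. Then $\langle\Xi_0\rangle = 0$ and $\Phi$ in terms of these shifted fields becomes

$$\Phi = \begin{pmatrix}\xi_1 + i\xi_2 \\ \xi_3 + v + i\xi_4\end{pmatrix}. \tag{6.34}$$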

^{23} See appendix B.

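What remains is to substitute (6.34) in the Lagrangian (6.30). With $\partial_\mu v = 0$, we find for the kinetic part

$$\frac{1}{2}(\partial_\mu\Phi^\dagger)(\partial^\mu\Phi) = \frac{1}{2}\sum_{i=1}^{4}(\partial_\mu\xi_i)(\partial^\mu\xi_i). \tag{6.35}$$

For the potential V, which enters the Lagrangian (6.30) with a minus sign, we have with $\lambda = -\mu^2/v^2$:

$$V = \frac{1}{2}\mu^2(\Phi^\dagger\Phi) + \frac{1}{4}\lambda(\Phi^\dagger\Phi)^2 \tag{6.36}$$

$$= \frac{1}{2}\mu^2\left(\xi_1^2 + \xi_2^2 + (\xi_3 + v)^2 + \xi_4^2\right) + \frac{1}{4}\lambda\left(\xi_1^2 + \xi_2^2 + (\xi_3 + v)^2 + \xi_4^2\right)^2 \tag{6.37}$$

$$= \frac{1}{2}\mu^2\left(\sum_{i=1}^{4}\xi_i^2 + 2v\xi_3 + v^2\right) - \frac{\mu^2}{4v^2}\left(\sum_{i=1}^{4}\xi_i^2 + 2v\xi_3 + v^2\right)^2. \tag{6.38}$$

For the second term we have

$$\left(\sum_{i=1}^{4}\xi_i^2 + 2v\xi_3 + v^2\right)^2 = \left(\sum_{i=1}^{4}\xi_i^2\right)^2 + (2v\xi_3)^2 + v^4 + 2\left(\sum_{i=1}^{4}\xi_i^2\right)(2v\xi_3 + v^2) + 4v^3\xi_3$$

$$= 4v^2\xi_3^2 + v^4 + 2v^2\sum_{i=1}^{4}\xi_i^2 + 4v^3\xi_3 + \text{higher order terms} \tag{6.39}$$

where in the second line we neglected terms of order higher than two, since those represent interactions and we are interested in the masses. Putting (6.39) back in (6.38) we obtain for the potential

$$V = \frac{1}{2}\mu^2\left(\sum_{i=1}^{4}\xi_i^2 + 2v\xi_3 + v^2\right) - \frac{\mu^2}{4v^2}\left(4v^2\xi_3^2 + v^4 + 2v^2\sum_{i=1}^{4}\xi_i^2 + 4v^3\xi_3\right) \tag{6.40}$$

$$= \frac{1}{4}\mu^2v^2 - \mu^2\xi_3^2. \tag{6.41}$$

This tells us that $\xi_1$, $\xi_2$ and $\xi_4$ correspond to massless Goldstone particles since they have no mass term. Comparing $-\mu^2\xi_3^2$ with a mass term $\frac{1}{2}M^2\xi_3^2$ shows that $\xi_3$ has obtained a mass

$$M_{\xi_3} = \sqrt{-2\mu^2}.$$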
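The mass spectrum found above can be cross-checked numerically by taking second derivatives of the potential at the vacuum. The sketch below assumes the potential of (6.36) with $\lambda = -\mu^2/v^2$ and the field parametrization (6.34), and estimates $\partial^2 V/\partial\xi_i^2$ at $\xi = 0$ by central differences. Only the diagonal second derivatives are computed; the mixed derivatives vanish at this vacuum because V depends on the fields only through $\Phi^\dagger\Phi$, whose gradient there points along $\xi_3$. The values of $\mu^2$ and v are arbitrary assumptions.

```python
def potential(xi, mu2, v):
    """V = (1/2) mu^2 N + (1/4) lam N^2 with N = Phi^dagger Phi and lam = -mu^2 / v^2,
    for Phi = (xi1 + i xi2, xi3 + v + i xi4) as in (6.34)."""
    lam = -mu2 / v**2
    norm_sq = xi[0]**2 + xi[1]**2 + (xi[2] + v)**2 + xi[3]**2
    return 0.5 * mu2 * norm_sq + 0.25 * lam * norm_sq**2

def squared_masses(mu2, v, h=1e-4):
    """Central-difference estimates of d^2 V / d xi_i^2 at the vacuum xi = 0."""
    origin = [0.0] * 4
    result = []
    for i in range(4):
        plus, minus = origin[:], origin[:]
        plus[i], minus[i] = h, -h
        second = (potential(plus, mu2, v) - 2 * potential(origin, mu2, v)
                  + potential(minus, mu2, v)) / h**2
        result.append(second)
    return result

# Illustrative parameters with mu^2 < 0.
m2 = squared_masses(mu2=-2.0, v=1.5)
print(m2)  # approximately [0, 0, -2*mu2, 0]
```

Three directions come out flat (the Goldstone modes) and the $\xi_3$ direction has curvature close to $-2\mu^2$, in line with $M_{\xi_3}^2 = -2\mu^2$.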

Example 6.9. Breaking SO(N) → SO(N−1). As a final example we consider the case of breaking an SO(N) symmetry. We consider the Lagrangian

$$\mathcal{L} = (\partial_\mu\phi_i)^T(\partial^\mu\phi_i) - \mu^2\phi_i^T\phi_i - \lambda(\phi_i^T\phi_i)^2 \quad\text{where}\quad i = 1, \dots, N$$

and we choose

$$\langle\phi_0\rangle = \begin{pmatrix} 0 & \dots & 0 & v \end{pmatrix}^T \quad\text{with}\quad v = \sqrt{\frac{-\mu^2}{\lambda}}$$

as our VEV for the $\mu^2 < 0$ case. The Lagrangian is invariant under the SO(N) transformations

$$\phi \to e^{i\alpha T_a}\phi$$

where $T_a$ are the $\frac{1}{2}N(N-1)$ generators that have a single $-i$ above the diagonal and a corresponding $i$ below the diagonal such that the matrix is anti-symmetric. To determine the number of NGB we again determine how many generators are broken. For this we use (6.20):

$$t\langle\phi_0\rangle = 0.$$


Now, looking at our choice of < φ0 > we see that any generator Ta with a nonzero entry in the last

column will not satisfy this condition and is therefore a broken generator. It is easy to see that there

are thus N − 1 broken generators. The number of unbroken generators is therefore

$$\frac{1}{2}N(N-1) - (N-1) = \frac{1}{2}(N-1)(N-2)$$

which is the number of generators of SO(N − 1).
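The counting argument can be checked by explicitly building the generators described above and applying them to the vacuum. The following is a minimal sketch; the function names are hypothetical and v = 1 is an arbitrary choice.

```python
def so_n_generators(n):
    """All N(N-1)/2 generators of SO(N): -i at (a, b) and +i at (b, a) for a < b."""
    generators = []
    for a in range(n):
        for b in range(a + 1, n):
            matrix = [[0j] * n for _ in range(n)]
            matrix[a][b] = -1j
            matrix[b][a] = 1j
            generators.append(matrix)
    return generators

def count_broken(n, v=1.0):
    """Count generators T with T <phi_0> != 0 for <phi_0> = (0, ..., 0, v)."""
    vacuum = [0.0] * (n - 1) + [v]
    broken = 0
    for matrix in so_n_generators(n):
        image = [sum(row[j] * vacuum[j] for j in range(n)) for row in matrix]
        if any(abs(component) > 1e-12 for component in image):
            broken += 1
    return broken

# (total generators, broken generators) for a few values of N
counts = {n: (len(so_n_generators(n)), count_broken(n)) for n in (3, 4, 5)}
print(counts)  # {3: (3, 2), 4: (6, 3), 5: (10, 4)}
```

In each case the number of unbroken generators, total minus broken, equals $\frac{1}{2}(N-1)(N-2)$.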

In this section we examined the concepts of the spontaneous breaking of global symmetries, group

generators and Goldstone bosons. In the next section I will discuss these same concepts in the context

of local symmetries and discuss the Higgs mechanism. Things become a little more complicated when

we demand our Lagrangian to obey local symmetries, and we see that precisely this requirement will

resolve the problem of massless particles.


7 Goldstone bosons and the Higgs mechanism

The Higgs mechanism was first published by François Englert, Robert Brout and Peter Higgs in 1964 to explain why particles have mass. In the Higgs mechanism, the potential of the Higgs field is responsible for the spontaneous breaking of the symmetry group of the electroweak force $SU(2)_W \times U(1)_Y$ to $U(1)_{EM}$, the electromagnetic symmetry group. This version of the Higgs mechanism is called the Standard Model Higgs mechanism and is responsible for assigning mass to the $W^\pm$ and $Z^0$ gauge bosons of the electroweak force and the fundamental fermions while leaving the photon massless. Before demonstrating the Standard Model Higgs mechanism I will first introduce the Higgs mechanism for the Abelian case of U(1) as symmetry group.^{24}

First though, we have to impose local symmetries on our Lagrangian. This will require introducing so

called gauge fields and gauge bosons. These gauge bosons will be responsible for resolving the problem

of massless Goldstone particles.

7.1 Local U(1) gauge theory

Gauge theories are theories for which the Lagrangian is invariant under a group of local gauge transformations. They find their origin in Maxwell's equations for electromagnetism. There it was shown that for any scalar function $\lambda(\mathbf{r}, t)$, the transformations $\mathbf{A} \to \mathbf{A} + \nabla\lambda(\mathbf{r}, t)$ and $V \to V - \partial\lambda/\partial t$ leave $\mathbf{E}$ and $\mathbf{B}$ unchanged.^{25} These transformations are called gauge transformations. We already saw in section 6.2 that the Lagrangian (6.14) is invariant under the global gauge transformations

$$\phi \to e^{i\theta}\phi.$$

But what if we considered local transformations

$$\phi \to e^{i\theta(x)}\phi \tag{7.1}$$

by letting $\theta$ depend on the space-time coordinate $x^\mu$? Then by evaluating $\phi^*\phi$ and $\partial_\mu\phi$ under the transformation we observe that $\phi^*\phi$ remains unchanged. However, $\partial_\mu\phi$ transforms as:

$$\partial_\mu\phi \to e^{i\theta(x)}\left(\partial_\mu + i\partial_\mu\theta(x)\right)\phi$$

and the symmetry is clearly broken. To resolve this problem we introduce the covariant derivative $D_\mu$

$$D_\mu = \partial_\mu + ieA_\mu$$

Here e is the charge of the particle described by $\phi(x)$ and $A_\mu$ is the electromagnetic field that transforms as

$$A_\mu \to A_\mu - \frac{1}{e}\partial_\mu\theta(x) \tag{7.2}$$

Together (7.1) and (7.2) are the set of local gauge transformations. By replacing $\partial_\mu$ with $D_\mu$ we can now easily verify that the invariance is restored:

$$D_\mu\phi = (\partial_\mu + ieA_\mu)\phi \to e^{i\theta(x)}\left(\left(\partial_\mu + i\partial_\mu\theta(x)\right) + ie\left(A_\mu - \frac{1}{e}\partial_\mu\theta(x)\right)\right)\phi = e^{i\theta(x)}D_\mu\phi$$
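The covariance of $D_\mu\phi$ can be illustrated numerically in a toy setting with a single coordinate x. The functions chosen below for $\phi(x)$, $A(x)$ and $\theta(x)$, as well as the coupling value, are arbitrary assumptions for the sketch, and the derivatives are approximated by central differences.

```python
import cmath
import math

E_CHARGE = 0.9  # illustrative coupling, not a physical value

def phi(x):
    return complex(math.sin(x), 0.5 * x)

def gauge_a(x):
    return math.cos(2 * x)

def theta(x):
    return 0.3 * x**2

def derivative(f, x, h=1e-5):
    """Central finite-difference derivative of a (possibly complex) function."""
    return (f(x + h) - f(x - h)) / (2 * h)

def covariant(field, a_field, x):
    """D phi = (d/dx + i e A) phi evaluated at x."""
    return derivative(field, x) + 1j * E_CHARGE * a_field(x) * field(x)

def phi_prime(x):
    """Gauge-transformed field, cf. (7.1)."""
    return cmath.exp(1j * theta(x)) * phi(x)

def gauge_a_prime(x):
    """Gauge-transformed vector field, cf. (7.2)."""
    return gauge_a(x) - derivative(theta, x) / E_CHARGE

x0 = 0.7
lhs = covariant(phi_prime, gauge_a_prime, x0)
rhs = cmath.exp(1j * theta(x0)) * covariant(phi, gauge_a, x0)
plain_mismatch = abs(derivative(phi_prime, x0)
                     - cmath.exp(1j * theta(x0)) * derivative(phi, x0))
print(abs(lhs - rhs), plain_mismatch)
```

The covariant derivative picks up only the overall phase (the first number is tiny), whereas the ordinary derivative does not (the second number is of order one).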

^{24} The material in this section is based on [11], [12] and [13].
^{25} Recall that $\mathbf{B} = \nabla\times\mathbf{A}$ and $\mathbf{E} = -\nabla V - \partial\mathbf{A}/\partial t$.

The $A_\mu$ field is defined in such a way that it cancels the offending $i\partial_\mu\theta(x)$ and the locally invariant Lagrangian now reads:^{26}

$$\mathcal{L} = (D^*_\mu\phi^*)(D^\mu\phi) - \mu^2\phi^*\phi - \lambda(\phi^*\phi)^2 \tag{7.3}$$

What remains is that we have to include the Lagrangian for the vector field $A_\mu$ we introduced, which is given by the Proca Lagrangian (6.7)

$$\mathcal{L}_{proca} = -\frac{1}{4}(\partial^\mu A^\nu - \partial^\nu A^\mu)(\partial_\mu A_\nu - \partial_\nu A_\mu) + \frac{1}{2}M^2A^\nu A_\nu \tag{7.4}$$

Using the definition of the field strength (6.8) this becomes

$$\mathcal{L}_{proca} = -\frac{1}{4}F^{\mu\nu}F_{\mu\nu} + \frac{1}{2}M^2A^\nu A_\nu \tag{7.5}$$

A quick calculation shows that while the first term in (7.5) is invariant under (7.2), $A^\nu A_\nu$ is not. It thus follows that for $\mathcal{L}_{proca}$ to be invariant we must have M = 0, which gives us our final locally invariant Lagrangian:

$$\mathcal{L} = (D^*_\mu\phi^*)(D^\mu\phi) - \mu^2\phi^*\phi - \lambda(\phi^*\phi)^2 - \frac{1}{4}F^{\mu\nu}F_{\mu\nu} \tag{7.6}$$

7.2 Abelian Higgs Mechanism

I will now discuss the Higgs mechanism for a U(1) symmetry. The same techniques as in section 6.3.2 are used, only now we consider the locally gauge invariant case as derived in the previous section. We begin with the Lagrangian (7.6)

$$\mathcal{L} = \left[(\partial_\mu - ieA_\mu)\phi^*\right]\left[(\partial^\mu + ieA^\mu)\phi\right] - \mu^2\phi^*\phi - \lambda(\phi^*\phi)^2 - \frac{1}{4}F^{\mu\nu}F_{\mu\nu},$$

for a complex scalar field

$$\phi = \frac{1}{\sqrt{2}}(\phi_1 + i\phi_2). \tag{7.7}$$

Considering now only the case $\mu^2 < 0$ we take our VEV to be $\langle\phi_0\rangle = v/\sqrt{2}$ with $v = \sqrt{-\mu^2/\lambda}$ and parametrize $\phi$ as

$$\phi = \frac{1}{\sqrt{2}}(v + \eta)e^{i\xi/v}$$

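Substituting this in (7.6) gives

$$\mathcal{L} = \frac{1}{2}\left[(\partial_\mu - ieA_\mu)\left((v + \eta)e^{-i\xi/v}\right)\right]\left[(\partial^\mu + ieA^\mu)\left((v + \eta)e^{i\xi/v}\right)\right] - \frac{\mu^2}{2}(v + \eta)^2 + \frac{\mu^2}{4v^2}(v + \eta)^4 - \frac{1}{4}F^{\mu\nu}F_{\mu\nu}$$

where we used that $\lambda = -\mu^2/v^2$ in the 3rd term, and we can identify the 2nd and 3rd terms with a mass term $\mu^2\eta^2$ plus coupling terms and irrelevant constants. Working out the first term gives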

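$$\begin{aligned}
&\frac{1}{2}\left[(\partial_\mu - ieA_\mu)\left((v + \eta)e^{-i\xi/v}\right)\right]\left[(\partial^\mu + ieA^\mu)\left((v + \eta)e^{i\xi/v}\right)\right] \\
&= \frac{1}{2}\left[\partial_\mu\eta - ieA_\mu(v + \eta) - i\frac{\eta + v}{v}\partial_\mu\xi\right]\left[\partial^\mu\eta + ieA^\mu(v + \eta) + i\frac{\eta + v}{v}\partial^\mu\xi\right] \\
&= \frac{1}{2}\left[\partial_\mu\eta\,\partial^\mu\eta + e^2(\eta + v)^2A_\mu A^\mu + 2e\frac{(\eta + v)^2}{v}A_\mu\partial^\mu\xi + \frac{(\eta + v)^2}{v^2}\partial_\mu\xi\,\partial^\mu\xi\right]
\end{aligned}$$

^{26} Note that before we had $\partial^*_\mu = \partial_\mu$ so we did not have to distinguish when writing down our Lagrangian. For $D_\mu$ though we have $D^*_\mu = \partial_\mu - ieA_\mu \neq \partial_\mu + ieA_\mu = D_\mu$, so now we do have to distinguish.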


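The important terms in our analysis are the terms that are quadratic in $\eta$, $A_\mu$ and $\xi$; the other terms represent interactions, which we are not interested in. Thus, omitting the interaction terms we find

$$\frac{1}{2}\partial_\mu\eta\,\partial^\mu\eta + \frac{1}{2}\partial_\mu\xi\,\partial^\mu\xi + \frac{e^2v^2}{2}A_\mu A^\mu + evA_\mu\partial^\mu\xi$$

The relevant part of our Lagrangian thereby becomes

$$\mathcal{L} = \frac{1}{2}\left[\partial_\mu\eta\,\partial^\mu\eta + 2\mu^2\eta^2\right] + \frac{1}{2}\left[\partial_\mu\xi\,\partial^\mu\xi\right] - \frac{1}{4}F^{\mu\nu}F_{\mu\nu} + \frac{e^2v^2}{2}A_\mu A^\mu + evA_\mu\partial^\mu\xi + \dots$$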

From this we can read off that, just as before, the $\eta$-particle has a mass $\sqrt{-2\mu^2} > 0$ and the $\xi$ particle is massless. What's new is that the gauge field $A_\mu$ also seems to have acquired a mass ev, as seen from the 4th term. While this is all perfectly fine, there are two problems with this result. The first is the 5th term, which seems to represent some kind of coupling between $A_\mu$ and $\xi$, which is clearly unwanted. Secondly, the Goldstone boson is still present. Both problems can be resolved by a particular choice of gauge for $\theta(x)$ in (7.1) and (7.2), which were defined by

$$\phi \to e^{i\theta(x)}\phi, \qquad A_\mu \to A_\mu - \frac{1}{e}\partial_\mu\theta(x).$$

To see how we should pick $\theta(x)$, we rewrite the terms with $A_\mu$ and $\xi$ as

$$\frac{e^2v^2}{2}\left(A_\mu + \frac{1}{ev}\partial_\mu\xi\right)\left(A^\mu + \frac{1}{ev}\partial^\mu\xi\right).$$

By comparing this to the gauge transformation for $A_\mu$ we see we should pick $\theta(x) = -\xi(x)/v$, which corresponds to the transformation

$$\phi \to \phi' = e^{-i\xi/v}\phi = e^{-i\xi/v}\frac{1}{\sqrt{2}}(v + \eta)e^{i\xi/v} = \frac{1}{\sqrt{2}}(v + \eta).$$

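With this gauge choice we can rewrite the Lagrangian to obtain

$$\begin{aligned}
\mathcal{L} &= \frac{1}{2}\left[\partial_\mu\eta\,\partial^\mu\eta + 2\mu^2\eta^2\right] - \frac{1}{4}F^{\mu\nu}F_{\mu\nu} + \frac{e^2v^2}{2}\left(A_\mu + \frac{1}{ev}\partial_\mu\xi\right)\left(A^\mu + \frac{1}{ev}\partial^\mu\xi\right) + \dots \\
&= \frac{1}{2}\left[\partial_\mu\eta\,\partial^\mu\eta + 2\mu^2\eta^2\right] - \frac{1}{4}F^{\mu\nu}F_{\mu\nu} + \frac{e^2v^2}{2}A'_\mu A'^\mu + \dots
\end{aligned}$$

This choice for $\theta(x)$ is called the unitary gauge (abbreviated 'U-gauge') and in this gauge only the physical terms appear in the Lagrangian, since the choice $\theta(x) = -\xi(x)/v$ corresponds to choosing $\phi$ to be entirely real. With the Lagrangian written in this form we can draw the following conclusions:

• The $A_\mu$ field has acquired a mass ev.

• The $\eta$-field has a mass $\sqrt{2\lambda}\,v$.

• The Goldstone particle $\xi$ has disappeared!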
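The completion of the square that motivates the unitary gauge can be checked with a short numerical sketch. For simplicity it treats a single component of $A_\mu$ and of $\partial_\mu\xi$ as ordinary numbers (an assumption of the sketch, which ignores the index structure); the function names and sample values are illustrative.

```python
def quadratic_terms(a, dxi, e, v):
    """(1/2) dxi^2 + (e^2 v^2 / 2) a^2 + e v a dxi for one component of A and d(xi)."""
    return 0.5 * dxi**2 + 0.5 * e**2 * v**2 * a**2 + e * v * a * dxi

def completed_square(a, dxi, e, v):
    """(e^2 v^2 / 2) (a + dxi / (e v))^2, the combination absorbed into A'."""
    return 0.5 * e**2 * v**2 * (a + dxi / (e * v))**2

# Arbitrary sample values (a, dxi); e and v are illustrative.
samples = [(0.4, -1.3), (2.0, 0.25), (-0.7, 0.9)]
max_difference = max(
    abs(quadratic_terms(a, dxi, 0.3, 2.0) - completed_square(a, dxi, 0.3, 2.0))
    for a, dxi in samples
)
print(max_difference)  # zero up to floating-point rounding
```

The check makes explicit that the kinetic term of $\xi$ and the mixing term $evA_\mu\partial^\mu\xi$ are both absorbed into the mass term of the redefined field $A'_\mu$.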

The massless vector field $A_\mu$ before carried two degrees of freedom (transverse polarizations). It picks up a third degree of freedom (longitudinal polarization) when it acquires a mass. This extra degree of freedom came from the Goldstone boson $\xi$ that simultaneously disappeared from the spectrum. This is sometimes referred to as "the gauge field eating the Goldstone boson" [12]. The $\eta$ particle corresponds to the Higgs boson. Note that the VEV of the Higgs field, v, sets the scale for the mass of the gauge boson as well as the Higgs mass.


7.3 The Standard model Higgs mechanism

We saw in the previous section how the Higgs mechanism is used to generate a mass term for the

U(1) gauge boson. We defined a covariant derivative and introduced gauge transformations to make

the global U(1) symmetry a local symmetry. The non-zero VEV of the Higgs field then caused this

symmetry to be broken, and by redefining our gauge fields we removed the NGB from the spectrum

which simultaneously resulted in a mass term for the Aµ gauge boson. Here we discuss how the Higgs

mechanism is embedded in the electroweak sector of the Standard model. The non-zero VEV of the

Higgs field will trigger electroweak symmetry breaking (EWSB) SU(2)W × U(1)Y to U(1)EM . We

thus expect 3 Goldstone bosons to appear.

7.3.1 Assigning mass to gauge bosons

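Just as in all the examples we have discussed before, the Higgs Lagrangian reads:^{27}

$$\mathcal{L} = (D_\mu\phi)^\dagger(D^\mu\phi) - \mu^2\phi^\dagger\phi - \lambda(\phi^\dagger\phi)^2 \tag{7.8}$$

where the covariant derivative now reads

$$D_\mu = \partial_\mu + i\frac{g}{2}W^a_\mu\tau_a + i\frac{g'}{2}YB_\mu. \tag{7.9}$$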

Here $W^a_\mu$ are the three gauge bosons of the weak interaction and $\tau_a$ are the Pauli matrices, with $T_a = \tau_a/2$ being the generators of SU(2). Note that in the covariant derivative the summation convention is used for a = 1,2,3 and that $D_\mu$ is a 2 × 2 matrix. $B_\mu$ represents the gauge boson of U(1) corresponding to the weak hypercharge generator Y. Finally, g and g′ are the coupling constants of $SU(2)_L$ and $U(1)_Y$ respectively. After spontaneous symmetry breaking the three Goldstone bosons will become the degrees of freedom that mix with the 3 SU(2) gauge bosons to become the massive $W^+$, $W^-$ and $Z^0$ bosons, the photon will remain massless and the remaining degree of freedom will be identified with the scalar Higgs boson. Contrary to the Abelian case, the Higgs field is now a doublet of the complex scalar components $\phi^+$ and $\phi^0$

$$\phi = \begin{pmatrix}\phi^+ \\ \phi^0\end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix}\phi_1 + i\phi_2 \\ \phi_3 + i\phi_4\end{pmatrix} \tag{7.10}$$

and transforms as an SU(2) doublet. The charges +1 and 0 follow from the fact that the Higgs is supposed to give mass to the $W^+$, $W^-$ and $Z^0$ bosons. Therefore, one of the fields must necessarily be neutral while the other must be charged. In that way $\phi^+$ and $(\phi^+)^* = \phi^-$ become the massive degrees of freedom of the $W^\pm$. (The remaining field will become the massive degree of freedom for the Higgs.) The Higgs field must further have a weak hypercharge of Y = +1, which follows from the Gell-Mann-Nishijima formula

$$Q = T_3 + \frac{Y}{2}$$

that relates electric charge to weak isospin and weak hypercharge. (Note that, since we have an isospin doublet, the weak isospin $T_3$ has eigenvalues $\pm\frac{1}{2}$.) Now, as before we find for $\mu^2 < 0$ that the Higgs field has a nonzero VEV. As in the previous sections we choose our vacuum to be

$$\langle\phi_0\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 \\ v \end{pmatrix}$$

^{27} The Higgs Lagrangian also contains a term $\mathcal{L}_{Yukawa}$ for the coupling of the Higgs to fermions but this we will not use until section 7.3.2. For now we focus on the gauge bosons.


with $v = \sqrt{-\mu^2/\lambda}$. This VEV breaks the $SU(2)_L$ symmetry as well as the $U(1)_Y$ symmetry. However, it does remain invariant under the $U(1)_{EM}$ symmetry generated by the electric charge. In example 6.8 we already saw that all three SU(2) generators are broken by this VEV. In addition, the hypercharge generator Y is also broken by the VEV:

$$Y\begin{pmatrix} 0 \\ v \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 \\ v \end{pmatrix} \neq 0 \tag{7.11}$$
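How the individual generators and the combination $Q = T_3 + Y/2$ from the Gell-Mann-Nishijima formula act on this vacuum can be sketched in a few lines of Python. The helper `mat_vec` is hypothetical, and Y is represented by the unit matrix as in (7.11).

```python
def mat_vec(matrix, vector):
    """Multiply a 2x2 matrix (nested lists) with a 2-component vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(2)) for i in range(2)]

T3 = [[0.5, 0.0], [0.0, -0.5]]   # tau_3 / 2
Y = [[1.0, 0.0], [0.0, 1.0]]     # hypercharge Y = +1 on the doublet
Q = [[T3[i][j] + 0.5 * Y[i][j] for j in range(2)] for i in range(2)]

v = 246.0                         # VEV in GeV, the value quoted in section 7.3.1
vacuum = [0.0, v / 2**0.5]

t3_on_vacuum = mat_vec(T3, vacuum)
y_on_vacuum = mat_vec(Y, vacuum)
q_on_vacuum = mat_vec(Q, vacuum)
print(t3_on_vacuum, y_on_vacuum, q_on_vacuum)
```

Both $T_3$ and Y move the vacuum, while Q annihilates it: the lower component of the doublet carries electric charge zero.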

However, the linear combination $Q = T_3 + Y/2 = \frac{1}{2}(\tau_3 + Y)$ does leave the vacuum invariant and the vacuum is thus invariant under the $U(1)_{EM}$ symmetry. As we will see, the $W^1$ and $W^2$ gauge bosons will mix to become the massive charged $W^\pm$ bosons and $W^3$ and B will mix to become the massive neutral $Z^0$ boson and the massless photon $A_\mu$. As in section 7.2 we now expand $\phi$ about the minimum:

$$\phi = \exp\left(\frac{i\xi\cdot\tau}{v}\right)\frac{1}{\sqrt{2}}\begin{pmatrix} 0 \\ v + h \end{pmatrix}$$

and gauge away the NGB by turning to U-gauge. Then $\phi$ transforms as $\phi \to \phi' = U\phi$, where we choose the unitary matrix U to be $\exp\left(\frac{-i\xi\cdot\tau}{v}\right)$. We thus get:

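$$\phi \to U\phi = \exp\left(\frac{-i\xi\cdot\tau}{v}\right)\phi \tag{7.12}$$

$$= \exp\left(\frac{-i\xi\cdot\tau}{v}\right)\exp\left(\frac{i\xi\cdot\tau}{v}\right)\frac{1}{\sqrt{2}}\begin{pmatrix} 0 \\ v + h \end{pmatrix} \tag{7.13}$$

$$= \frac{1}{\sqrt{2}}\begin{pmatrix} 0 \\ v + h \end{pmatrix} \tag{7.14}$$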

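To determine the masses we note for the h particle that a mass term comes solely from the potential terms in (7.8), whereby it has a mass $M_H^2 = -2\mu^2 > 0$. To determine the masses of the gauge bosons we only have to consider the term $(D_\mu\phi)^\dagger(D^\mu\phi)$. Remembering to write everything in the U-gauge we obtain for $(D_\mu\phi)$

$$D_\mu\phi = \frac{1}{2\sqrt{2}}\left[2\partial_\mu + ig\tau_a W^a_\mu + ig'B_\mu\right]\begin{pmatrix} 0 \\ v + h \end{pmatrix} \tag{7.15}$$

$$= \frac{1}{2\sqrt{2}}\begin{pmatrix} 2\partial_\mu + igW^3_\mu + ig'B_\mu & ig\left[W^1_\mu - iW^2_\mu\right] \\ ig\left[W^1_\mu + iW^2_\mu\right] & 2\partial_\mu - igW^3_\mu + ig'B_\mu \end{pmatrix}\begin{pmatrix} 0 \\ v + h \end{pmatrix} \tag{7.16}$$

$$= \frac{1}{2\sqrt{2}}\begin{pmatrix} ig\left[W^1_\mu - iW^2_\mu\right](v + h) \\ \left(2\partial_\mu - igW^3_\mu + ig'B_\mu\right)(v + h) \end{pmatrix} \tag{7.17}$$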

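where we used that $T_a = \tau_a/2$. In this expression the gauge boson $B_\mu$ corresponds to the $U(1)_Y$ hypercharge generator Y and $W^a_\mu$ correspond to the three SU(2) generators $T_a$. We therefore get for $(D_\mu\phi)^\dagger(D^\mu\phi)$ the following expression:

$$(D_\mu\phi)^\dagger(D^\mu\phi) = \frac{1}{2}(\partial_\mu h)(\partial^\mu h) + \frac{1}{8}g^2\left(W^1_\mu + iW^2_\mu\right)\left(W^{1\mu} - iW^{2\mu}\right)(v + h)^2 \tag{7.18}$$

$$+ \frac{1}{8}\left(gW^3_\mu - g'B_\mu\right)\left(gW^{3\mu} - g'B^\mu\right)(v + h)^2 \tag{7.19}$$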


To determine the masses of the gauge bosons we have to look at terms that are quadratic in the fields. (We thus ignore terms that involve products of the gauge bosons and h. Those terms represent interactions and I'll get back to them later.) Then we get

$$\frac{1}{8}v^2g^2\left[(W^1_\mu)^2 + (W^2_\mu)^2\right] + \frac{1}{8}v^2\left[gW^3_\mu - g'B_\mu\right]\left[gW^{3\mu} - g'B^\mu\right] \tag{7.20}$$

This expression though does not yet contain the physical gauge bosons $W^\pm_\mu$, $Z^0_\mu$ and $A_\mu$. To obtain those we have to redefine the fields. Focusing on the first term we define the charged physical $W^\pm_\mu$ gauge fields as

$$W^\pm_\mu \equiv \frac{W^1_\mu \mp iW^2_\mu}{\sqrt{2}} \tag{7.21}$$

and it easily follows that

$$(W^1_\mu)^2 + (W^2_\mu)^2 = |W^+_\mu|^2 + |W^-_\mu|^2$$

The physical $W^\pm$ gauge bosons therefore get a mass of

$$M_{W^\pm} = \frac{gv}{2}.$$

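To obtain the masses for the physical $Z^0_\mu$ and $A_\mu$ we notice that we can rewrite the second term as

$$\frac{1}{8}v^2\left[gW^3_\mu - g'B_\mu\right]\left[gW^{3\mu} - g'B^\mu\right] = \frac{v^2}{8}\begin{pmatrix} W^3_\mu & B_\mu \end{pmatrix}\begin{pmatrix} g^2 & -gg' \\ -gg' & g'^2 \end{pmatrix}\begin{pmatrix} W^{3\mu} \\ B^\mu \end{pmatrix} \tag{7.22}$$

$$= \frac{v^2}{8}\begin{pmatrix} W^3_\mu & B_\mu \end{pmatrix} M \begin{pmatrix} W^{3\mu} \\ B^\mu \end{pmatrix} \tag{7.23}$$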

where in the second equality we have defined M as the mass matrix. Its diagonal elements are the

mass terms for the W 3 and B eigenstates. However, M is a non-diagonal matrix and its off-diagonal

elements couple together the W 3 and B fields causing them to mix. To find the masses of the actual

physical gauge bosons, we have to go to a basis in which M is diagonal. In this basis then, the

masses of the physical gauge bosons will be the eigenvalues of M. They are easily derived from the

characteristic equation.

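$$(g^2 - \lambda)(g'^2 - \lambda) - (gg')^2 = 0 \quad\to\quad \lambda = 0,\ \lambda = g^2 + g'^2$$

where $\lambda$ here denotes an eigenvalue of M, not the quartic coupling of the Higgs potential. Therefore, in this basis, the mass term in (7.23) can be rewritten as

$$\frac{1}{8}v^2\begin{pmatrix} A_\mu & Z_\mu \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & g^2 + g'^2 \end{pmatrix}\begin{pmatrix} A^\mu \\ Z^\mu \end{pmatrix} \tag{7.24}$$

where we have defined $A_\mu$ and $Z_\mu$ as the physical fields corresponding to the normalized eigenvectors of M. The masses of the physical gauge bosons can now be identified as

$$M_A = 0 \quad\text{and}\quad M_Z = \frac{1}{2}v\sqrt{g^2 + g'^2}.$$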

The physical Aµ and Zµ fields correspond to the normalized eigenvectors of M and these are found

to be:

$$\lambda = 0 \quad\to\quad \frac{1}{\sqrt{g^2 + g'^2}}\begin{pmatrix} g' \\ g \end{pmatrix} = \frac{g'W^3_\mu + gB_\mu}{\sqrt{g^2 + g'^2}} = A_\mu \tag{7.25}$$

and

$$\lambda = g^2 + g'^2 \quad\to\quad \frac{1}{\sqrt{g^2 + g'^2}}\begin{pmatrix} g \\ -g' \end{pmatrix} = \frac{gW^3_\mu - g'B_\mu}{\sqrt{g^2 + g'^2}} = Z_\mu \tag{7.26}$$

The physical fields are thus mixtures of the massless bosons that correspond to the $SU(2)_L$ and $U(1)_Y$ generators. Through the Higgs mechanism, the combination corresponding to the $Z_\mu$ boson has acquired a mass whereas the photon $A_\mu$ has remained massless.
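The eigenvalues and eigenvectors quoted in (7.25) and (7.26) can be verified by applying the mass matrix to them directly. The sketch below does this in plain Python; the numerical values of g and g′ are rough illustrative assumptions rather than fitted couplings, and the function names are hypothetical.

```python
import math

def mass_matrix_eigen(g, g_prime):
    """Closed-form eigenpairs of M = [[g^2, -g g'], [-g g', g'^2]] in the (W^3, B) basis,
    taken from (7.25) and (7.26)."""
    norm = math.hypot(g, g_prime)
    photon = (g_prime / norm, g / norm)       # eigenvalue 0
    z_boson = (g / norm, -g_prime / norm)     # eigenvalue g^2 + g'^2
    return (0.0, photon), (g * g + g_prime * g_prime, z_boson)

def apply_mass_matrix(g, g_prime, vector):
    """Return M applied to a (W^3, B) vector."""
    w3, b = vector
    return (g * g * w3 - g * g_prime * b,
            -g * g_prime * w3 + g_prime * g_prime * b)

g, g_prime, v = 0.65, 0.36, 246.0   # illustrative couplings; v in GeV

(lam_a, vec_a), (lam_z, vec_z) = mass_matrix_eigen(g, g_prime)
residual_a = apply_mass_matrix(g, g_prime, vec_a)   # should vanish
image_z = apply_mass_matrix(g, g_prime, vec_z)      # should equal lam_z * vec_z
m_w = 0.5 * g * v
m_z = 0.5 * v * math.sqrt(lam_z)
print(residual_a, m_w, m_z)
```

With these inputs the photon direction is annihilated by M, and the ratio `m_w / m_z` equals $g/\sqrt{g^2 + g'^2}$.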

Experimental verification of Higgs mechanism

We can rewrite the ratio of the coupling constants g and g′ in terms of the so-called Weinberg angle $\theta_W$ to parametrize the mixing of the $W^3_\mu$ and $B_\mu$ fields:

$$\frac{g'}{g} = \tan(\theta_W). \tag{7.27}$$

The parameter $\theta_W$ is not predicted by the Standard Model. Its value must be determined from experiment. It is found to be [12]:

$$\theta_W = 28.75^\circ \tag{7.28}$$

Then we can rewrite (7.25), (7.26) as

$$A_\mu = \cos(\theta_W)B_\mu + \sin(\theta_W)W^3_\mu \tag{7.29}$$

$$Z_\mu = -\sin(\theta_W)B_\mu + \cos(\theta_W)W^3_\mu \tag{7.30}$$

Similarly, we can use (7.27) to rewrite $M_Z$ and $M_W$ in terms of $\theta_W$, from which we obtain:

$$\frac{M_W}{M_Z} = \cos(\theta_W) \tag{7.31}$$

This prediction for the mass relation of the physical gauge bosons has been experimentally verified

and provides the most compelling argument for the Higgs mechanism to be correct. Further, the Higgs

acquires a mass

\[ m_h^2 = -2\mu^2 = 2\lambda v^2. \]

If we then use the relation $M_W = \frac{1}{2} g v$ and the measured values for MW and g, the Higgs VEV is found to be

\[ v = 246 \text{ GeV}. \]

The parameters µ and λ, though, are free parameters. The Standard Model provides no way to determine them, which is why it took so long to find the Higgs boson. It was eventually discovered on 4 July 2012. Measurements at the LHC determined its mass to be around 126 GeV. With this the parameter λ could also be determined.
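As a rough numerical illustration of the relations above, the following sketch evaluates the tree-level prediction MW = MZ cos(θW), the VEV from MW = gv/2, and the quartic coupling from m_h² = 2λv². The input values for MZ, MW and the electromagnetic coupling are assumed, approximate numbers that are not quoted in the text, and the relation e = g sin(θW) used to obtain g is the standard one rather than something derived in this section.

```python
import math

THETA_W_DEG = 28.75    # Weinberg angle quoted in (7.28)
M_Z = 91.19            # GeV, assumed approximate measured Z mass
M_W = 80.38            # GeV, assumed approximate measured W mass
ALPHA_EM = 1 / 128.0   # assumed electromagnetic coupling near the Z mass
M_H = 126.0            # GeV, Higgs mass quoted in the text

theta_w = math.radians(THETA_W_DEG)

# Tree-level prediction of (7.31)
m_w_predicted = M_Z * math.cos(theta_w)

# Standard relation e = g sin(theta_W) (assumption, not derived above)
e_charge = math.sqrt(4 * math.pi * ALPHA_EM)
g_weak = e_charge / math.sin(theta_w)

# M_W = g v / 2  and  m_h^2 = 2 lambda v^2
vev = 2 * M_W / g_weak
quartic = M_H ** 2 / (2 * vev ** 2)

print(f"predicted M_W = {m_w_predicted:.1f} GeV (input M_W = {M_W} GeV)")
print(f"g = {g_weak:.3f}, v = {vev:.0f} GeV, lambda = {quartic:.3f}")
```

The predicted W mass lands within about a GeV of the assumed measured value; the remaining difference is of the size one would expect from neglecting loop corrections.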

Coupling to the Gauge bosons

When determining the masses of the gauge bosons we only considered the terms that were quadratic in the gauge fields, and we ignored the terms in (7.19) that involved products of the gauge fields and h. Here we have a closer look at those interaction terms. We can rewrite the second term of (7.19) as:

\[ \frac{1}{4} g^2 W^-_\mu W^{+\mu} (v + h)^2 = \frac{1}{4} g^2 v^2 W^-_\mu W^{+\mu} + \frac{1}{2} g^2 v W^-_\mu W^{+\mu} h + \frac{1}{4} g^2 W^-_\mu W^{+\mu} hh \tag{7.32} \]

The first term, as before, gives the masses of the W± bosons. The terms $W^-_\mu W^{+\mu} h$ and $W^-_\mu W^{+\mu} hh$, however, give rise to triple and quartic couplings of the Higgs boson to the gauge bosons. Their coupling strengths can be read off to be

\[ g_{hWW} = \frac{1}{2} g^2 v = g m_W \quad\text{and}\quad g_{hhWW} = \frac{1}{4} g^2 = \frac{1}{2} \frac{g m_W}{v}. \]

The coupling of the Higgs to the W boson is thus proportional to the mass of the W boson, and the coupling to the Z boson is similarly found to be proportional to its mass.

7.3.2 Assigning mass to fermions

Apart from generating a mass for the gauge bosons, the Higgs mechanism is also responsible for generating a mass term for the fermions. A fermion mass term would be of the form $m_f \bar\psi\psi$, and this is not allowed to appear in the Lagrangian because it does not respect the SU(2)L × U(1)Y gauge symmetry. This can be seen when we decompose ψ into its left- and right-handed chiral states$^{28}$, obtaining

\[ m_f \bar\psi\psi = m_f (\bar\psi_R + \bar\psi_L)(\psi_L + \psi_R) = m_f (\bar\psi_R \psi_L + \bar\psi_L \psi_R). \]

However, in the Standard Model left-handed fermions are placed in SU(2) doublets (I = 1/2), while right-handed fermions are placed in SU(2) singlets (I = 0)$^{29}$. They therefore transform differently under SU(2)L × U(1)Y, and a mass term is thus not gauge invariant.

\[ \psi_L \to \psi'_L = \exp\left( -i \, \xi \cdot \tau / v \right) \psi_L, \qquad \psi_R \to \psi_R. \]

However, the two complex scalar fields in (7.10) are also placed in an SU(2) doublet, which transforms as in (7.14). Therefore, the combination $\bar\psi_L \phi$ is invariant under SU(2)L gauge transformations since the exponentials cancel. If we combine this with a right-handed singlet $\psi_R$, then the combination $\bar\psi_L \phi \psi_R$ and its hermitian conjugate $\bar\psi_R \phi^\dagger \psi_L$ will be invariant under SU(2)L and U(1)Y transformations. We can conclude therefore that a term of the form

\[ -\lambda_{Yuk} (\bar\psi_L \phi \psi_R + \bar\psi_R \phi^\dagger \psi_L) \tag{7.33} \]

will be invariant under the full gauge symmetry. Here λYuk is the Yukawa coupling between the Higgs field and the massless lepton and quark fields.

Lepton masses

To determine the masses of the leptons we write (7.33) in terms of the left- and right-handed lepton states. For the first family they are given by

\[ L = \begin{pmatrix} \nu_e \\ e \end{pmatrix}_L, \qquad R = e_R, \tag{7.34} \]

28 For left-handed fermions the spin is antiparallel to the momentum, and for right-handed fermions the spin is parallel to the momentum.

29 See Appendix (C.2.2).

where I will write ψL and ψR as L and R for simplicity. Since the derivation is the same for all

three families I will only discuss the first family. If now the Higgs potential is added this results in

spontaneous symmetry breaking and we can write the Higgs doublet in U-gauge:

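\[ \phi = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v + h \end{pmatrix}, \]

whereby (7.33) thus becomes

\begin{align*}
\mathcal{L}_{lepton} &= -\frac{\lambda_e}{\sqrt{2}} \left[ \begin{pmatrix} \bar\nu_e & \bar e \end{pmatrix}_L \begin{pmatrix} 0 \\ v + h \end{pmatrix} e_R + \bar e_R \begin{pmatrix} 0 & v + h \end{pmatrix} \begin{pmatrix} \nu_e \\ e \end{pmatrix}_L \right] \tag{7.35} \\
&= -\frac{\lambda_e (v + h)}{\sqrt{2}} \left[ \bar e_L e_R + \bar e_R e_L \right] \tag{7.36} \\
&= -\frac{\lambda_e (v + h)}{\sqrt{2}} \, \bar e e \tag{7.37}
\end{align*}

From this we see that the neutrino has remained massless, while the electron has acquired a mass of

\[ m_e = \frac{\lambda_e v}{\sqrt{2}}. \]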

Remark We see from (7.37) that the term $\bar L \phi R + \bar R \phi^\dagger L$ is only able to generate a mass for the fermion in the lower component of the SU(2)L doublet, since the non-zero VEV occurs in the lower component of φ. Since right-handed neutrinos have never been observed this is not a problem in the lepton case. However, when determining the quark masses we have to consider right-handed up quarks as well as right-handed down quarks, so we expect some difficulties to occur there.

Quark masses

The left-handed quark doublet and right-handed quark singlets are given by

\[ L = \begin{pmatrix} u \\ d \end{pmatrix}_L, \qquad R = u_R, \; d_R. \tag{7.38} \]

The derivation of the mass for the down-type quark proceeds in the same way as the derivation above for the electron mass, and gives a down-quark mass of

\[ m_d = \frac{\lambda_d v}{\sqrt{2}}. \]

Now, we turn to the up-quark, the upper component of the SU(2)L doublet. In view of the remark

above we need to reverse the order of the Higgs doublet. This we can accomplish by using the charge

conjugated doublet φc, see appendix C.1:

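\[ \phi_c = i\tau_2 \phi^* = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} (\phi^+)^* \\ (\phi^0)^* \end{pmatrix} = \begin{pmatrix} (\phi^0)^* \\ -(\phi^+)^* \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} \phi_3 - i\phi_4 \\ -\phi_1 + i\phi_2 \end{pmatrix} \tag{7.39} \]

This conjugate of the Higgs doublet transforms in exactly the same way as φ under SU(2)L transformations, as can easily be checked in U-gauge, so the combination $\bar\psi_L \phi_c$ is also invariant under SU(2)L transformations. Therefore, we can construct a gauge invariant mass term for the up-quark from

\[ -\lambda_u (\bar\psi_L \phi_c \psi_R + \bar\psi_R \phi_c^\dagger \psi_L) \tag{7.40} \]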
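The statement that the conjugate doublet transforms like φ itself can be checked numerically for an arbitrary SU(2) matrix. The sketch below is an illustration only: the helper names and the three-angle parametrization of U are hypothetical choices, not notation from the text. It compares iτ2(Uφ)* with U(iτ2φ*).

```python
import cmath
import math
import random

def su2_matrix(alpha, beta, gamma):
    """A general SU(2) matrix [[a, b], [-b*, a*]] with |a|^2 + |b|^2 = 1."""
    a = cmath.exp(1j * alpha) * math.cos(gamma)
    b = cmath.exp(1j * beta) * math.sin(gamma)
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def mat_vec(matrix, vector):
    """Multiply a 2x2 matrix with a 2-component vector."""
    return [sum(matrix[i][k] * vector[k] for k in range(2)) for i in range(2)]

def charge_conjugate(phi):
    """Return i*tau_2*phi^*, using i*tau_2 = [[0, 1], [-1, 0]] as in (7.39)."""
    return [phi[1].conjugate(), -phi[0].conjugate()]

def max_mismatch(u, phi):
    """Largest component of (U phi)_c - U (phi_c); zero if phi_c is a doublet."""
    lhs = charge_conjugate(mat_vec(u, phi))
    rhs = mat_vec(u, charge_conjugate(phi))
    return max(abs(x - y) for x, y in zip(lhs, rhs))

rng = random.Random(0)  # fixed seed so the check is reproducible
u = su2_matrix(*(rng.uniform(0.0, 2.0 * math.pi) for _ in range(3)))
phi = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
print("mismatch:", max_mismatch(u, phi))
```

The mismatch vanishes up to rounding because iτ2 U* = U iτ2 for every SU(2) matrix U, which is the algebraic content of the statement above.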


After spontaneous symmetry breaking we can write φc in U-gauge:

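\[ \phi_c = i\tau_2 \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ h + v \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} h + v \\ 0 \end{pmatrix} \tag{7.41} \]

and substituting this in (7.40) gives us:

\begin{align*}
\mathcal{L}_u &= -\frac{\lambda_u}{\sqrt{2}} \left[ \begin{pmatrix} \bar u & \bar d \end{pmatrix}_L \begin{pmatrix} v + h \\ 0 \end{pmatrix} u_R + \bar u_R \begin{pmatrix} h + v & 0 \end{pmatrix} \begin{pmatrix} u \\ d \end{pmatrix}_L \right] \tag{7.42} \\
&= -\frac{\lambda_u (v + h)}{\sqrt{2}} \left[ \bar u_L u_R + \bar u_R u_L \right] \tag{7.43} \\
&= -\frac{\lambda_u (v + h)}{\sqrt{2}} \, \bar u u, \tag{7.44}
\end{align*}

and we read off an up-quark mass of

\[ m_u = \frac{\lambda_u v}{\sqrt{2}}. \]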
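The mass formula m_f = λ_f v/√2 can be inverted to see how widely the Yukawa couplings are spread. The sketch below does this for a few fermions; the mass values are assumed, approximate numbers inserted only for illustration and are not taken from the text.

```python
import math

VEV_GEV = 246.0  # Higgs VEV quoted in the text

def yukawa_coupling(mass_gev, vev_gev=VEV_GEV):
    """Invert m_f = lambda_f * v / sqrt(2) for the Yukawa coupling lambda_f."""
    return math.sqrt(2) * mass_gev / vev_gev

# Assumed approximate fermion masses in GeV (illustrative inputs only).
fermion_masses = {
    "electron": 0.000511,
    "up": 0.0022,
    "down": 0.0047,
    "top": 173.0,
}

for name, mass in fermion_masses.items():
    print(f"{name:>8}: lambda = {yukawa_coupling(mass):.3g}")
```

With these inputs the top-quark coupling comes out close to one, while the electron coupling is about six orders of magnitude smaller. This is why the top quark dominates the loop corrections discussed in the next chapter.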


8 The Hierarchy problem

Hierarchy is an important concept in physics and is related to energy scales30. All physical theories

have their own energy domain where they are valid. The theory of the electroweak interaction has

an energy scale around ∼ 246 GeV, the VEV of the Higgs field. At energies around the Planck scale

$10^{19}$ GeV physicists know that the SM breaks down and they expect a Grand Unified Theory to take over at around $10^{16}$ GeV. Indications for this come from the precise measurements of the coupling

constants. At high energies they seem to converge to a single point hinting to new physics at around

this scale that unifies the electroweak force with the strong force. This GUT scale however, is far

greater than the electroweak scale ∼ 246 GeV and physicists are not sure what lies in the range

between the electroweak scale up to the GUT scale. Although this does not pose a direct problem for

physical theories physicists find it highly disturbing and have named this problem of the vast difference

between the electroweak scale and the GUT scale the hierarchy problem.

8.1 Naturalness

Before discussing how the hierarchy problem arises in the SM I will first discuss the notion of natural-

ness to explain why only the Higgs as a fundamental scalar particle is sensitive to the energy hierarchy.

It has to do with the Higgs being a fundamental scalar particle and that its mass is not protected by

any symmetry of the Standard Model. To explain what this means we need the definition of technical

naturalness formulated by Gerard ’t Hooft (1980) [17]:

• A parameter is naturally small if setting it to zero increases the symmetry of the theory.

We already saw that a gauge boson mass term in the Lagrangian is forbidden because it explicitly

breaks the gauge invariance. We concluded that the gauge bosons must therefore be massless. In other

words, setting their mass to zero increases the symmetry. The gauge bosons can only acquire a mass via

the Higgs mechanism that introduces a mass term in a gauge invariant way. Their masses are said to

be protected by the gauge symmetry. Similarly, as we saw in 7.3.2, a fermion mass term, which would be of the form $m_f \bar\psi\psi$, is not invariant under the local gauge symmetry SU(2)L × U(1)Y. However, when $m_f$ is set to zero, this allows the left- and right-handed parts of ψ to transform independently and the Lagrangian is said to have an additional chiral symmetry. The fermion mass is therefore also ’naturally small’ in the definition of ’t Hooft and is said to be protected by chiral symmetry. In terms of loop corrections$^{31}$ this manifests itself in that any correction to the gauge boson and fermion masses will be proportional to the mass, due to the terms that are allowed in the Lagrangian. The correction from loop diagrams will thus be multiplicative and small in the limit of small masses. For a scalar field φ though, like the Higgs field, there is no symmetry that forbids a mass term $\mu_0^2 \phi^* \phi$, and a correction to the mass resulting from loop diagrams will be additive.

8.2 Hierarchy problem in the Higgs sector

I will now discuss how the hierarchy problem arises in the Higgs sector. Suppose that the SM remains

valid up to the Planck scale and has a cut-off $\Lambda_{Pl} \sim 10^{19}$ GeV. We consider a scalar theory with Yukawa and gauge interactions and Lagrangian:

\[ \mathcal{L} = \frac{1}{2} (\partial_\mu \phi)^2 - \frac{1}{2} \mu_0^2 \phi^2 - \lambda \phi^4 + \text{interactions} \]

30 The arguments can be found in many textbooks such as [13].

31 See appendix D.

The parameter $\mu_0$ is the bare mass of the scalar φ resulting from tree diagrams. However, due to its couplings to the fermions, gauge bosons and its self-interaction, the bare mass $\mu_0^2$ receives quantum corrections at one-loop order. The physical mass $m_H$ is the bare mass plus these quantum corrections from loop diagrams (denoted by $\delta m^2$), i.e.

\[ m_H^2 = m_{0,tree}^2 + \delta m^2 \tag{8.1} \]

The main loop contributions to the Higgs mass come from the coupling of the Higgs to the top quark$^{32}$, the coupling to the $W^\pm_\mu$ and $Z^0_\mu$ gauge bosons and the Higgs self energy due to the quartic self coupling.

The corresponding one-loop Feynman diagrams are displayed in the figure below.

Figure 2: The three most significantly quadratically divergent contributions to the Higgs mass.

From left to right: The top-quark loop, the gauge boson loop and the Higgs self-energy.

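These three diagrams all depend quadratically on the cutoff Λ, as can be verified by power counting of the momenta$^{33}$, and their contributions are found to be [18]$^{34}$:

• top quark loop: $-\frac{3}{8\pi^2} \lambda_t^2 \Lambda^2$

• SU(2) gauge bosons: $\frac{9}{64\pi^2} g^2 \Lambda^2$

• Higgs loop: $\frac{1}{16\pi^2} \lambda^2 \Lambda^2$

Thus (8.1) has the form:

\[ m_H^2 = m_{0,tree}^2 + \Lambda^2 (a \lambda_t^2 + b g^2 + c \lambda^2) \]

or

\[ \frac{m_H^2}{\Lambda^2} = \frac{m_{0,tree}^2}{\Lambda^2} + (a \lambda_t^2 + b g^2 + c \lambda^2). \]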

Implementing the assumption that the SM remains valid up to the Planck scale we find that a tremen-

dous amount of fine-tuning between the bare mass and the coupling terms is needed to explain the

light mass of the Higgs boson. This is where the hierarchy problem arises in the SM. Mathemati-

cally it is not a problem but it is again highly unnatural and the SM does not give any hints why

32 The Higgs couples to all quarks, but because the coupling strength of the Higgs is proportional to the mass of the fermion it couples most strongly to the top quark.

33 See Appendix D.1.

34 A calculation based on cut-off regularization can be found in Appendix D.3.

this cancellation should take place. Solutions to this hierarchy problem rely on the assumption that

new physics appears at a much lower scale at the order of TeV. In [18] the contributions of the three

quadratically divergent diagrams have been calculated for a cut-off of Λ ∼ 10 TeV. Their contributions

are respectively $-(2 \text{ TeV})^2$, $(0.7 \text{ TeV})^2$ and $(0.5 \text{ TeV})^2$. With this cut-off of 10 TeV, fine-tuning of about one part in a hundred is needed, and again the hierarchy manifests itself. When the cut-off is taken to be 1 TeV the need for fine-tuning no longer arises.

We can also turn the argument around by demanding that we find a fine tuning of no more than

10% acceptable. Then a cut-off Λ ≈ 2 TeV is found. At this scale we would then expect new physics

to appear and to find new particles that would naturally cancel the divergent loop contributions.
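The numbers in the last two paragraphs can be reproduced roughly from the one-loop coefficients listed earlier. The sketch below is only an order-of-magnitude illustration: it assumes λt ≈ 1, g ≈ 0.65 and m_h ≈ 126 GeV, and it defines the amount of fine-tuning, as a hypothetical working measure, by the ratio of m_h² to the magnitude of the top-loop contribution.

```python
import math

def top_loop(cutoff_tev, lambda_t=1.0):
    """Top-loop contribution -3/(8 pi^2) * lambda_t^2 * Lambda^2, in TeV^2."""
    return -3.0 / (8.0 * math.pi ** 2) * lambda_t ** 2 * cutoff_tev ** 2

def gauge_loop(cutoff_tev, g=0.65):
    """SU(2) gauge-loop contribution 9/(64 pi^2) * g^2 * Lambda^2, in TeV^2."""
    return 9.0 / (64.0 * math.pi ** 2) * g ** 2 * cutoff_tev ** 2

def tuning(cutoff_tev, m_h_tev=0.126):
    """Working measure of fine-tuning: m_h^2 divided by |top-loop term|."""
    return m_h_tev ** 2 / abs(top_loop(cutoff_tev))

def cutoff_for_tuning(fraction, m_h_tev=0.126, lambda_t=1.0):
    """Cut-off (TeV) at which the tuning measure equals `fraction`."""
    coefficient = 3.0 / (8.0 * math.pi ** 2) * lambda_t ** 2
    return math.sqrt(m_h_tev ** 2 / (fraction * coefficient))

print(f"top loop at 10 TeV:   -({math.sqrt(-top_loop(10.0)):.2f} TeV)^2")
print(f"gauge loop at 10 TeV:  ({math.sqrt(gauge_loop(10.0)):.2f} TeV)^2")
print(f"tuning at 10 TeV: {tuning(10.0):.4f}")
print(f"tuning at 1 TeV:  {tuning(1.0):.2f}")
print(f"cut-off for 10% tuning: {cutoff_for_tuning(0.1):.1f} TeV")
```

Under these assumptions the top loop at 10 TeV is close to the quoted −(2 TeV)², the tuning at 10 TeV is at the per cent level or below, and a 10% tuning corresponds to a cut-off near 2 TeV, in line with the estimates quoted above.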

One successful solution to the hierarchy problem, is the idea of supersymmetry (SUSY) that states

that every SM particle has a supersymmetric partner. In SUSY the loop contributions of the SM

particles are cancelled by the loop contributions with a supersymmetric partner in the loop and so

the need for fine-tuning does not arise. All models that are to resolve the hierarchy problem must

introduce new physics at a scale far enough below the Planck scale for the amount of fine-tuning to

be reduced enough. If we indeed believe that an actual solution to this ’big’ hierarchy problem exists,

then physicists should have found observational evidence for new physics as they approach the cut-off

from below. The problem however, is that measurements give no evidence of new physics whatsoever.

This lack of evidence pushes up the lower limit for the cut-off above the TeV scale which reintroduces

a new less-severe hierarchy problem. This is called the Little Hierarchy problem. Any model that is to

successfully solve the hierarchy problem must also not reintroduce a Little Hierarchy problem. Here

we focus on a set of models that addresses this Little hierarchy problem by introducing particles at

the TeV scale that cancel the SM quadratic divergencies. These models are called Little Higgs models

and they realize the Higgs as a pseudo-NGB of a higher approximate global symmetry. This way the

Higgs becomes ’naturally light’.


9 Little Higgs models

Little Higgs models postulate the Higgs boson as the pseudo-NGB of some greater global symmetry

which is broken both spontaneously and explicitly. Here we will focus on the ”Simplest Little Higgs”

that involves the breaking of SU(3) to SU(2) and is a model to conceptually understand the mechanism

behind Little Higgs models and introduce the mechanism of collective symmetry breaking that will protect the Higgs mass from quadratically divergent corrections. In the calculations I will focus mostly on the gauge

sector of the model and explicitly show that the quadratic contribution to the Higgs mass from the

SM W bosons is successfully cancelled. The fermion sector will be discussed in lesser detail. First

though we have to know how the NGB transform under the broken and unbroken generators of

[SU(N)/SU(N − 1)], which will be the purpose of the next paragraph.

9.1 Transformation of NGB

As we already saw we can parametrize the Goldstone boson ξ(x) by writing

\[ \phi(x) = \frac{1}{\sqrt{2}} (f + \eta) e^{i\xi(x)/f}, \]

with f the VEV of φ and η representing the massive radial oscillations. We now generalize this to the breaking pattern SU(N) → SU(N − 1) and analyze how the NGB transform. The number of NGB is equal to the number of broken generators. Since SU(N) has $N^2 - 1$ generators, we should get a total of $(N^2 - 1) - ((N-1)^2 - 1) = 2N - 1$ NGB. We now use the following parametrization for the NGB [18]:

\[ \phi = e^{\frac{i}{f} \Pi} \phi_0 \quad\text{with}\quad \Pi = \begin{pmatrix} & & & \pi_1 \\ & 0 & & \vdots \\ & & & \pi_{N-1} \\ \pi_1^* & \cdots & \pi_{N-1}^* & \pi_0/\sqrt{2} \end{pmatrix} \quad\text{and VEV}\quad \phi_0 = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ f \end{pmatrix}. \tag{9.1} \]

Π is the Goldstone boson matrix, the fields π1 . . . πN−1 are complex and the field π0 is real. f represents the high symmetry breaking scale. Written in this form we can investigate how the NGB transform under the unbroken SU(N − 1) symmetries and the broken [SU(N)/SU(N − 1)] symmetries. As a first observation we note that we can make the unbroken SU(N − 1) transformations explicit, since we have an embedding:

\[ U_{N-1} = \begin{pmatrix} U_{N-1} & 0 \\ 0 & 1 \end{pmatrix}. \tag{9.2} \]

Looking first at how φ transforms under the unbroken SU(N − 1) transformations we find:

\[ \phi \to U_{N-1} \phi = (U_{N-1} e^{\frac{i}{f} \Pi} U_{N-1}^\dagger) U_{N-1} \phi_0 = e^{\frac{i}{f} U_{N-1} \Pi U_{N-1}^\dagger} \phi_0, \tag{9.3} \]

where the invariance of the vacuum under SU(N − 1) was used in the second equality. From this we see that the NGB transform linearly as $\Pi \to U_{N-1} \Pi U_{N-1}^\dagger$. Using (9.2) we can further deduce that π0 transforms as a singlet and $\vec\pi = (\pi_1, \dots, \pi_{N-1})^T$ transforms like

\[ \begin{pmatrix} 0 & \vec\pi \\ \vec\pi^\dagger & \pi_0 \end{pmatrix} \to U_{N-1} \begin{pmatrix} 0 & \vec\pi \\ \vec\pi^\dagger & \pi_0 \end{pmatrix} U_{N-1}^\dagger = \begin{pmatrix} 0 & U_{N-1} \vec\pi \\ \vec\pi^\dagger U_{N-1}^\dagger & \pi_0 \end{pmatrix}. \]

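$\vec\pi$ thus transforms in the fundamental representation of SU(N − 1), meaning that $\vec\pi \to U_{N-1} \vec\pi$. Now let us see how φ transforms under the broken SU(N) generators. By the BCH formula$^{35}$ any general SU(N) transformation can be decomposed into an SU(N)/SU(N − 1) transformation times an SU(N − 1) transformation, the latter leaving φ0 invariant. Therefore:

\begin{align*}
\phi \to U_{N/N-1} e^{\frac{i}{f} \Pi} \phi_0 &= \exp\left[ \frac{i}{f} \begin{pmatrix} 0 & \vec\alpha \\ \vec\alpha^\dagger & 0 \end{pmatrix} \right] \exp\left[ \frac{i}{f} \begin{pmatrix} 0 & \vec\pi \\ \vec\pi^\dagger & 0 \end{pmatrix} \right] \phi_0 \\
&\equiv \exp\left[ \frac{i}{f} \begin{pmatrix} 0 & \vec\pi' \\ \vec\pi'^\dagger & 0 \end{pmatrix} \right] U_{N-1}(\alpha, \pi) \phi_0 \quad\text{(by BCH)} \\
&= \exp\left[ \frac{i}{f} \begin{pmatrix} 0 & \vec\pi' \\ \vec\pi'^\dagger & 0 \end{pmatrix} \right] \phi_0
\end{align*}

and thus, again by BCH, we see that to first order $\vec\pi \to \vec\pi' = \vec\pi + \vec\alpha$, meaning that the NGB indeed shift under the broken symmetries. Just as in the abelian case this again ensures that the NGB can only have derivative interactions.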
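The counting of Goldstone bosons used at the start of this section, one per broken generator, is easy to tabulate. The short sketch below (function names are illustrative) computes the number of broken generators for SU(N) → SU(N − 1) and splits it into the N − 1 complex fields of $\vec\pi$ plus the single real field π0.

```python
def su_dimension(n):
    """Number of generators of SU(n), i.e. n^2 - 1."""
    return n * n - 1

def broken_generators(n):
    """Generators broken in SU(n) -> SU(n-1); one NGB for each of them."""
    return su_dimension(n) - su_dimension(n - 1)

for n in range(2, 7):
    total = broken_generators(n)
    complex_fields = n - 1   # pi_1 ... pi_{n-1}, two real components each
    real_fields = 1          # pi_0
    print(f"SU({n}) -> SU({n - 1}): {total} NGB "
          f"= 2*{complex_fields} + {real_fields}")
```

For N = 3 this gives the 5 Goldstone bosons of the SU(3) → SU(2) breaking on which the "Simplest Little Higgs" is built: one complex doublet and one real singlet.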

9.2 Constructing ”The Simplest Little Higgs”.

In the ”Simplest Little Higgs” the Higgs boson is realized as a NGB of a higher SU(3)W symmetry which is spontaneously broken to SU(2)W by letting φ acquire a VEV ∼ f . However, an exact

NGB realized this way will not suffice as an appropriate Higgs candidate, since the SM Higgs mass

is non-zero. Therefore, we must also break the symmetry explicitly, in order to realize the Higgs

as a pseudo-NGB. Increasing the symmetry to SU(3) will require introducing new heavy particles

with masses O(f) that will cancel the quadratic divergencies of the SM. First however, we need to

construct a Lagrangian to work with that is invariant under the full SU(3) symmetry and includes

only the NGB. But, as concluded in the previous section, this means that we can only have derivative

couplings. Adding couplings to the Gauge bosons and fermions and also the quartic Higgs coupling

will take extra care, and we focus on these in the next few sections. We must also determine at what

energy scale the model will be valid. Since we will introduce new particles, these will need to be heavy

and the symmetry breaking scale f must be a high energy scale36.

35 See appendix B.

36 Otherwise the LHC should have found these by now.

The Lagrangian and energy scale of the model.

Under an SU(N) symmetry, only the combinations $\phi^\dagger \phi$ and $\epsilon_{a_1 a_2 \cdots a_N} \phi^{a_1} \phi^{a_2} \cdots \phi^{a_N} = 0$ are invariant and we are left with the following Lagrangian [18]

\[ \mathcal{L} = \phi^\dagger \phi + f^2 |\partial_\mu \phi|^2 + O(\partial^4) \tag{9.4} \]

where $\phi^\dagger \phi = f^2$ is an irrelevant constant. To construct the NGB matrix Π we observe that the VEV of the φ field induces the spontaneous breaking of SU(3) to SU(2), resulting in 5 exact NGB corresponding to the 5 broken SU(3) generators. Noting that λ1, λ2, λ3 can be identified with the Pauli matrices, and thus remain unbroken, we can parametrize the NGB by:

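\[ \Pi = \pi^i \frac{1}{2} \lambda_i = \begin{pmatrix} \frac{\pi_8}{2\sqrt{3}} & 0 & \frac{1}{2}(\pi_4 - i\pi_5) \\ 0 & \frac{\pi_8}{2\sqrt{3}} & \frac{1}{2}(\pi_6 - i\pi_7) \\ \frac{1}{2}(\pi_4 + i\pi_5) & \frac{1}{2}(\pi_6 + i\pi_7) & -\frac{\pi_8}{\sqrt{3}} \end{pmatrix} \equiv \begin{pmatrix} \frac{\eta}{\sqrt{3}} \mathbb{1}_2 & h \\ h^\dagger & -\frac{2\eta}{\sqrt{3}} \end{pmatrix} \]

where we have defined h and η as the following combinations of NGB

\[ h = \begin{pmatrix} \frac{1}{2}(\pi_4 - i\pi_5) \\ \frac{1}{2}(\pi_6 - i\pi_7) \end{pmatrix}, \qquad \eta = \frac{\pi_8}{2}. \tag{9.5} \]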

The h field is a complex SU(2) doublet and represents the Higgs doublet. It transforms linearly under the unbroken SU(2) symmetry and shifts under the broken SU(3) generators. Note that h sits in the off-diagonal entries of Π that are not covered by the SU(2) generators, so that when breaking SU(3) into SU(2), h will always remain a Goldstone boson. The scalar field η is an SU(2) singlet. The field φ we can parametrize as

\[ \phi = e^{i\Pi/f} \phi_0 \quad\text{with VEV}\quad \phi_0 = \begin{pmatrix} 0 \\ 0 \\ f \end{pmatrix}. \]

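This φ we can expand in terms of h to see what interactions we get for h. Then, ignoring the η singlet, and reading h as a doublet that fills the upper two entries, we get:

\[ \phi = \exp\left\{ \frac{i}{f} \begin{pmatrix} 0_{2\times 2} & h \\ h^\dagger & 0 \end{pmatrix} \right\} \begin{pmatrix} 0 \\ 0 \\ f \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ f \end{pmatrix} + i \begin{pmatrix} h \\ 0 \end{pmatrix} - \frac{1}{2f} \begin{pmatrix} 0 \\ 0 \\ h^\dagger h \end{pmatrix} + \text{h.o.c.} \tag{9.6} \]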

If we now insert this back into the kinetic term of (9.4), then we are left with:

\begin{align*}
|\partial_\mu \phi|^2 &= \left| \, i \begin{pmatrix} \partial_\mu h \\ 0 \end{pmatrix} - \frac{1}{2f} \begin{pmatrix} 0 \\ 0 \\ \partial_\mu (h^\dagger h) \end{pmatrix} \right|^2 \\
&= |\partial_\mu h|^2 + \frac{1}{4f^2} |2 h^\dagger \partial_\mu h|^2 \\
&= |\partial_\mu h|^2 \left( 1 + \frac{1}{f^2} h^\dagger h \right)
\end{align*}

The first term represents the kinetic term of the bare h propagator. The second term, however, represents a loop correction to this kinetic term by contracting h into a loop and this correction is quadratically divergent. The Lagrangian therefore contains non-renormalizable interactions, which is unacceptable for an effective field theory. Therefore, to discover up to which energy the model remains valid as an effective field theory, and where we thus need a completion of the theory, we cut the divergence off at Λ. The divergent contribution to the kinetic term is then found to be [18]

\[ \frac{1}{f^2} \frac{\Lambda^2}{16\pi^2} \tag{9.7} \]

Now we can check at which energy (9.7) will become comparable to the contribution of the tree level

diagram. This will be the case when the correction becomes O(1), that is, for Λ ∼ 4πf . We therefore

expect the theory to be valid for f ≈ 1 TeV which corresponds to Λ ≈ 10 TeV. Above this energy, the

theory becomes non-renormalizable, i.e. the corrections become more important than the tree level diagram, and we need a high energy theory to take over$^{37}$.
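The estimate Λ ∼ 4πf follows from setting the loop factor (9.7) equal to one. A minimal numerical sketch, with f = 1 TeV taken from the text and hypothetical function names:

```python
import math

def loop_to_tree_ratio(cutoff_tev, f_tev):
    """Size of the loop correction (9.7) relative to the tree-level term."""
    return cutoff_tev ** 2 / (16.0 * math.pi ** 2 * f_tev ** 2)

def strong_coupling_cutoff(f_tev):
    """Cut-off at which the ratio above reaches one: Lambda = 4 * pi * f."""
    return 4.0 * math.pi * f_tev

f_tev = 1.0
cutoff = strong_coupling_cutoff(f_tev)
print(f"Lambda = 4 pi f = {cutoff:.1f} TeV for f = {f_tev} TeV")
print(f"ratio at 10 TeV: {loop_to_tree_ratio(10.0, f_tev):.2f}")
```

For f = 1 TeV this gives Λ ≈ 12.6 TeV, which the text rounds to Λ ≈ 10 TeV; at 10 TeV the loop term is already a sizeable fraction of the tree-level term.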

37It is important to note that this energy scale lies beyond the current scope of the LHC energy, meaning

that we have not yet been able to discover any new particles postulated by the theory. Had this not been


So far we have an effective theory with massless NGB that are not allowed to have any non-derivative

interactions. No coupling terms are allowed and also a mass term for h is forbidden. However, to

have a theory ’similar’ to the standard model Higgs we do need gauge-couplings, Yukawa couplings

and a quartic Higgs potential. In the next few sections I will focus on implementing those in a gauge

invariant way.

9.2.1 Adding the Gauge coupling

Beginning with the gauge couplings we try to implement the SU(3) symmetry by including the following covariant derivative$^{38}$:

\[ D_\mu = \partial_\mu - i g T^a W^a_\mu \tag{9.8} \]

where

\[ T^a = \frac{1}{2} \lambda^a = \begin{pmatrix} \frac{\sigma^a}{2} & 0 \\ 0 & 0 \end{pmatrix}. \]

Thus, we only add the SU(2) gauge bosons. However, simply rewriting the covariant derivative in

this rather trivial way has no effect. Expansion of |Dµφ|2 shows that we have the following 1-loop

diagrams.

Figure 3: The two quadratically divergent contributions to the Higgs mass coming from the

terms that couple the SU(2) gauge bosons to φ.

Its value is schematically found to be

\[ \frac{g^2}{16\pi^2} \Lambda^2 \phi^\dagger \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \phi = \frac{g^2}{16\pi^2} \Lambda^2 h^\dagger h. \]

As a second attempt we can try gauging the full SU(3) symmetry by now including all 8 SU(3) gauge bosons in the covariant derivative (9.8). Expansion leads again to the same quadratically divergent diagram coming from the fourth term in the expansion where we now have all eight gauge bosons. This gives:

\[ \frac{g^2}{16\pi^2} \Lambda^2 \phi^\dagger \phi = \frac{g^2}{16\pi^2} \Lambda^2 f^2, \]

which contains no mass term for the Higgs, but adds only a constant. However, the Higgs field is also

gone. The NGB’s that formed h have been gauged away by the 5 gauge bosons corresponding to the

broken SU(3) generators. Thus, adding a single set of NGB φ in combination with gauging the full

the case we should already have found new particles, which has not been the case, and the theory can therefore not be correct. This is the case for all models that are an extension to the Standard Model.

38 Recall we introduced it in sections 7.1 and 7.3.

SU(3) results in a quadratic divergence with no dependence on h. However, because the full SU(3)

was gauged, the NGB’s are also eaten by the 5 gauge bosons. These two results suggest that a way to

circumvent this problem is by adding 2 sets of NGB’s φ1 and φ2 and add only a single set of SU(3)

gauge bosons. This way both φ fields result in a spontaneous symmetry breaking of SU(3)→ SU(2),

resulting in 10 exact NGB’s of which only 5 are eaten.

Collective symmetry breaking: The Little Higgs trick

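As said, we add two sets of NGB’s, φ1 and φ2, parametrized by

\[ \phi_1 = e^{i\Pi_1/f_1} \begin{pmatrix} 0 \\ 0 \\ f_1 \end{pmatrix}, \qquad \phi_2 = e^{i\Pi_2/f_2} \begin{pmatrix} 0 \\ 0 \\ f_2 \end{pmatrix}, \qquad\text{where } f_1 = f_2 = f, \]

and add a single set of SU(3) gauge bosons by letting φ1 and φ2 both have the same covariant derivative,

\[ \mathcal{L} = |D_\mu \phi_1|^2 + |D_\mu \phi_2|^2. \]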

Since we have introduced two φ fields and a single set of gauge bosons, only one linear combination of

Π1 and Π2 will be eaten, while the other orthogonal combination will form the complex Higgs doublet.

In view of the previous attempt, both φ fields separately lead to the same quadratic divergent diagrams

and give a total quadratic divergence of

\[ \frac{g^2}{16\pi^2} \Lambda^2 (\phi_1^\dagger \phi_1 + \phi_2^\dagger \phi_2) = \frac{g^2}{16\pi^2} \Lambda^2 (f^2 + f^2) \tag{9.9} \]

and thus do not contribute to the Higgs mass. However, since we are now dealing with two fields we

can also draw the following diagram which has two gauge bosons in the loop.

Figure 4: The third possible diagram that contains both φ fields. The external fields are φ1

and φ2 and it has 2 gauge bosons in the loop. This diagram is the only one-loop diagram

that contains both of the φ fields.

By counting momenta we expect this diagram to be logarithmically divergent$^{39}$, and indeed its contribution is [18]

\[ \frac{g^4}{16\pi^2} \log\left( \frac{\Lambda^2}{\mu^2} \right) |\phi_1^\dagger \phi_2|^2, \tag{9.10} \]

where µ is a free renormalization scale. By expanding $|\phi_1^\dagger \phi_2|^2$ we can further show that it contains a tree level mass term for h. For this we need a more explicit expression for the φ fields. We already

39 See Appendix D.1.

noted that only one combination of Π1 and Π2 can be gauged away, since there is only one set of gauge

bosons. This we will call k. The orthogonal combination is identified with the Higgs and cannot be

gauged away. We thus choose the following parametrization for the fields.

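\[ \phi_1 = \exp\left[ \frac{i}{f} \begin{pmatrix} 0 & k \\ k^\dagger & 0 \end{pmatrix} \right] \exp\left[ \frac{i}{f} \begin{pmatrix} 0 & h \\ h^\dagger & 0 \end{pmatrix} \right] \begin{pmatrix} 0 \\ 0 \\ f \end{pmatrix} \tag{9.11} \]

and

\[ \phi_2 = \exp\left[ \frac{i}{f} \begin{pmatrix} 0 & k \\ k^\dagger & 0 \end{pmatrix} \right] \exp\left[ -\frac{i}{f} \begin{pmatrix} 0 & h \\ h^\dagger & 0 \end{pmatrix} \right] \begin{pmatrix} 0 \\ 0 \\ f \end{pmatrix} \tag{9.12} \]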

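where k and h are to be read as doublets. Working in U-gauge (recall we introduced it in section 7.1), we get the following expansion of $\phi_1^\dagger \phi_2$:

\begin{align*}
\phi_1^\dagger \phi_2 &= \begin{pmatrix} 0 & 0 & f \end{pmatrix} \exp\left[ -\frac{2i}{f} \begin{pmatrix} 0 & h \\ h^\dagger & 0 \end{pmatrix} \right] \begin{pmatrix} 0 \\ 0 \\ f \end{pmatrix} \\
&= \begin{pmatrix} 0 & 0 & f \end{pmatrix} \left[ 1 - \frac{2i}{f} \begin{pmatrix} 0 & h \\ h^\dagger & 0 \end{pmatrix} - \frac{2}{f^2} \begin{pmatrix} h h^\dagger & 0 \\ 0 & h^\dagger h \end{pmatrix} + \dots \right] \begin{pmatrix} 0 \\ 0 \\ f \end{pmatrix} \\
&= f^2 - 2 h^\dagger h + \dots
\end{align*}

Equation (9.10) therefore contains a mass term for h of the form

\[ \frac{g^4}{16\pi^2} \log\left( \frac{\Lambda^2}{\mu^2} \right) f^2 \, h^\dagger h. \]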

Recalling that we argued Λ ∼ 4πf for the theory to be renormalizable we can estimate the contribution

of this diagram for f ≈ 1 TeV. Doing so, and using µ ≈ O(vh) and Λ ≈ 10 TeV, one finds its value is

about 100 GeV, which is about the Standard Model Higgs mass [18].
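Both steps of this argument can be checked numerically. The sketch below first evaluates $\phi_1^\dagger \phi_2 = f^2 [\exp(-2iH/f)]_{33}$, with H the 3×3 matrix built from the doublet h, using a plain Taylor-series matrix exponential, and compares it with f² − 2h†h. It then estimates the mass scale from the logarithmic term. The function names are hypothetical, the test value of h is arbitrary, and the inputs g ≈ 0.65 and µ ≈ 246 GeV are assumed values; order-one factors are ignored in the mass estimate.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(matrix, terms=40):
    """Matrix exponential via its Taylor series (adequate for small entries)."""
    n = len(matrix)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, matrix)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def phi1_dagger_phi2(h, f):
    """f^2 times the (3,3) entry of exp(-2i H / f), H built from doublet h."""
    h1, h2 = h
    generator = [[0, 0, h1], [0, 0, h2],
                 [h1.conjugate(), h2.conjugate(), 0]]
    exponent = [[-2j * x / f for x in row] for row in generator]
    return f * f * mat_exp(exponent)[2][2]

def log_mass_estimate(g, f, cutoff, mu):
    """sqrt(g^4 / (16 pi^2) * log(Lambda^2 / mu^2)) * f, same units as f."""
    return math.sqrt(g ** 4 / (16 * math.pi ** 2)
                     * math.log(cutoff ** 2 / mu ** 2)) * f

h = (0.10 + 0.05j, 0.02 - 0.03j)   # arbitrary small doublet, in units of f
h_dagger_h = sum(abs(c) ** 2 for c in h)
print("exact   :", phi1_dagger_phi2(h, 1.0).real)
print("leading :", 1.0 - 2.0 * h_dagger_h)
print("mass estimate (GeV):", log_mass_estimate(0.65, 1000.0, 10000.0, 246.0))
```

The exact value equals f² cos(2|h|/f), whose expansion starts with f² − 2h†h, and the mass estimate comes out at roughly 90 GeV, consistent with the "about 100 GeV" quoted above.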

Collective symmetry breaking

What made the previous work and what is a key ingredient in all Little Higgs models is what is

called collective symmetry breaking. Let me explain what this means by investigating the relevant

symmetries of the model. The first thing to note is that without the gauge couplings the theory has

a global SU(3)1 × SU(3)2 symmetry, which is spontaneously broken to SU(2)1 × SU(2)2 by both

of the φ field VEV’s. The coset is thus [SU(3)/SU(2)]2 corresponding to 10 exact NGB’s. These

correspond to 2 singlets and 2 complex doublets (k and h) transforming under the unbroken SU(2)

symmetry. However, by introducing the SU(3) gauge interactions for both of the fields, we gauged

only the diagonal SU(3) subgroup of SU(3)1 × SU(3)2, and explicitly broke the global symmetry to

this gauged subgroup because φ1 and φ2 were no longer allowed to rotate independently. This can

be seen from the boson-scalar coupling term in the Lagrangian and the relative minus sign between

(9.11) and (9.12).

\[ |g W_\mu \phi_1|^2 + |g W_\mu \phi_2|^2 \]

This diagonal SU(3) is then spontaneously broken to SU(2) producing thus only 5 exact NGB,

corresponding to the k-fields that are eaten by the gauge bosons. The set that remains, h in the


parametrization, we just saw, acquires a mass term through the log-divergent loop diagram. This h is

thus a pseudo-NGB. What is crucial here is that both of the gauge couplings must be present. Suppose

that we were to set the gauge coupling of either of the φi to zero. Then the Lagrangian again has

two independent SU(3) symmetries that are spontaneously broken. This way we get 10 exact-NGB’s

of which 5 are eaten, leaving us with 5 exact-NGB’s and thus a massless h field. It is only when we

include the gauge coupling for both of the φ fields that the global [SU(3)]2 symmetry is explicitly

broken to its diagonal. This way only 5 exact-NGB appear and h will be realized as a pseudo-NGB,

resulting from the breaking of the approximate global symmetry. Thus, only when we include both φ

fields can we get a massive h field and we just saw that a diagram involving both of the fields is at

most logarithmically divergent.

This mechanism of realizing the Higgs as a pseudo-NGB is called collective symmetry breaking. It

realizes the Higgs as the NGB of a spontaneously broken global symmetry, that is also explicitly broken

making it a pseudo-NGB. ”Collectively” means that the explicit symmetry breaking can happen only in the case when two or more couplings are non-zero. In this way the Higgs mass is natural and protected, since setting either of the gauge couplings to zero restores the global symmetry and again makes the Higgs an exact NGB.

The same idea of this collective symmetry breaking can be applied when adding the Yukawa coupling

to the quarks which is what we will do next.

9.2.2 Adding the Yukawa coupling

Since the Yukawa coupling of the Higgs to the fermions is proportional to the fermion mass, the most

important contribution comes from the top-quark. Recall that the Yukawa Lagrangian was of the form

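\[ \mathcal{L}_{Yuk} = -\lambda_f \bar\psi_L \phi \psi_R. \]

We can then write down the following Lagrangian that involves both φi fields.

\[ \mathcal{L}_{Yuk} = -\lambda_1 \bar\psi_L \phi_1 \psi_{1R} - \lambda_2 \bar\psi_L \phi_2 \psi_{2R} \tag{9.13} \]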

To get SU(3) invariance, we enlarge the SU(2) doublets to triplets by adding a heavy top partner for

the SM top quark t, which we’ll denote by T . We thus get:

\[ \psi_L = \begin{pmatrix} t \\ b \\ T \end{pmatrix}_L, \qquad \psi_R = t_R, \; b_R, \; T_R. \tag{9.14} \]

As we will see, the right-handed top quark $t_R$ will mix with the heavy right-handed top quark $T_R$, such that the quadratically divergent top loop will be cancelled by this heavy top quark. I will thus refer to $t_R$ and $T_R$ as $t_{1R}$ and $t_{2R}$ to reflect their mixing. In the conventions used here, the Higgs field is assumed to obtain a VEV of $\begin{pmatrix} 0 & v \end{pmatrix}^T$. Therefore, we have to use the charge conjugates of the fields φ1 and φ2, i.e.

\[ \phi_i^c = \begin{pmatrix} i\tau_2 & 0 \\ 0 & 1 \end{pmatrix} \phi_i^*, \]

to prevent the top and down quark from mixing$^{40}$. This results in the following two terms for the top quark Yukawa Lagrangian$^{41}$:

\[ \mathcal{L}_{top} = -\lambda_1 \bar\psi_L \begin{pmatrix} i\tau_2 & 0 \\ 0 & 1 \end{pmatrix} \phi_1^* t_{1R} - \lambda_2 \bar\psi_L \begin{pmatrix} i\tau_2 & 0 \\ 0 & 1 \end{pmatrix} \phi_2^* t_{2R} \tag{9.15} \]

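Recalling the expansions for φ1 and φ2 derived in (9.6) we find for the conjugate fields:

\[ \phi_1^c = \begin{pmatrix} -\tau_2 h^* \\ f - \frac{h^\dagger h}{2f} \end{pmatrix} \quad\text{and}\quad \phi_2^c = \begin{pmatrix} \tau_2 h^* \\ f - \frac{h^\dagger h}{2f} \end{pmatrix} \]

Inserting this in (9.15) and for simplicity setting $\lambda_1 = \lambda_2 = \frac{\lambda}{\sqrt{2}}$ gives:

\[ \mathcal{L}_{top} = -\frac{\lambda}{\sqrt{2}} \begin{pmatrix} \bar t & \bar b \end{pmatrix}_L i\tau_2 h^* (i t_{1R} - i t_{2R}) - \frac{\lambda}{\sqrt{2}} \bar T_L \left( f - \frac{h^\dagger h}{2f} \right) (t_{1R} + t_{2R}) \tag{9.16} \]

where the factor of i is inserted so we can redefine the right-handed singlets as:

\[ t_R = \frac{1}{\sqrt{2}\, i} (t_{2R} - t_{1R}) \quad\text{and}\quad T_R = \frac{1}{\sqrt{2}} (t_{2R} + t_{1R}). \]

In terms of these redefined fields, (9.16) becomes:

\[ \mathcal{L}_{top} = -\lambda \left[ \begin{pmatrix} \bar t & \bar b \end{pmatrix}_L i\tau_2 h^* t_R + \bar T_L \left( f - \frac{h^\dagger h}{2f} \right) T_R \right] \tag{9.17} \]

The first term represents the Standard Model top-Yukawa coupling and we identify $\lambda = \lambda_t$. The second term includes a mass term for the heavy top quark and a coupling term. We can read off a mass of $\lambda_t f$ and a coupling constant of $\lambda_t/(2f)$.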

Cancelling the top loop

From (9.17) we see we can draw the following two diagrams that both contribute to the Higgs mass to first order.

Figure 5: The quadratically divergent contributions from the top-quark and its heavy top partner. They contribute equally to the Higgs mass with opposite sign and thus cancel.

The contribution of the first diagram to the Higgs mass we already know. It is

\[ -\frac{3 \lambda_t^2 \Lambda^2}{8\pi^2}. \]

40 In [18] the calculations are based on the VEV of $\begin{pmatrix} v & 0 \end{pmatrix}^T$. The calculations done here give exactly the same results since it is a matter of convention whether one works with φ or its charge conjugate.

41 We discard the terms $\bar R \phi^\dagger L$ because these would reintroduce the quadratic divergencies.

To see why the second diagram should cancel the first we make the following observations. First, the coupling terms in the Lagrangian differ by a relative minus sign. Further, the second diagram actually represents two diagrams since the heavy top and its antiparticle can both run in opposite order. Then, the two relevant couplings in the second diagram are $\lambda_t f$ and $\lambda_t/(2f)$. Therefore, the factors 2 and f that differ from the first diagram cancel and the contribution is the same apart from the relative minus sign, and the heavy top quark contribution indeed cancels the SM top quark contribution.

Symmetries

As in the gauge boson sector, the absence of quadratic divergencies can be understood by looking at the symmetries involved. Both of the Yukawa couplings separately preserve the SU(3) gauge symmetry. When both of the couplings are non-zero this forces the fields in both terms in (9.13) to be aligned, since the couplings force φi to transform like ψL. In this case there is only one symmetry, the diagonal SU(3) gauge symmetry. Suppose again we were to set either of the λi to zero. This results in two independent SU(3) symmetries and the symmetry is thus enhanced to [SU(3)]². Both φi spontaneously break this symmetry to [SU(2)]², resulting in two sets of 5 exact NGB’s. One set is eaten and the other forms the Little Higgs, which is an exact NGB and thus massless. If we then set both of the couplings to non-zero values, only the diagonal SU(3) symmetry remains. We will only get one set of 5 exact NGB, which are eaten, and the Higgs becomes a pseudo-NGB and receives a contribution to its mass from loop diagrams. This contribution can only come from diagrams that involve both of the couplings, which can at most be logarithmically divergent.

Down quark coupling

Having dealt with the top quark, it remains to include couplings for the other quarks. The couplings for the other up-type quarks can be added similarly to the method described above. For the down-type quarks we also have to use both of the φ fields. However, in this case we do not have to worry about symmetries and collective breaking, since the Yukawa coupling of the bottom quark is too small to contribute significantly to the Higgs mass. For Λ ∼ 10 TeV its one-loop diagram gives:

$$\frac{\lambda_b^2}{16\pi^2}\Lambda^2 \approx (30\ \text{GeV})^2$$

Both of the φ fields can be included by using an $\epsilon_{ijk}$ contraction:

$$\mathcal{L}_b = -\frac{\lambda_b}{2f}\,\epsilon_{ijk}\,\psi_L^i\,\phi_1^j\,\phi_2^k\,b_R$$

This epsilon contraction, though, immediately results in quadratic divergencies, because it breaks both of the SU(3) symmetries to the diagonal SU(3). This contribution is not problematic though, because for the heaviest down-type quark, i.e. the bottom quark, it is only ∼ (30 GeV)^2. Although it poses no problem, the fact that the SU(3) model is unable to cancel the down-type quark contribution should be considered a shortcoming of the SU(3) based model. In order to cancel all quadratic divergencies, models with larger symmetry groups have to be considered.


9.2.3 The Higgs potential

In our analysis of the SM Higgs mechanism, we have seen that the Higgs potential is responsible for

EWSB. Thus, the Little Higgs model must also include a potential large enough to achieve this. We

require this potential V = V (φ1, φ2) to have the following properties:

1. It must not contain a tree level mass term for the Higgs,

2. It must contain a quartic coupling for the Higgs doublet,

3. It must preserve the collective symmetry breaking of the SU(3)’s.

The last demand means that the quartic coupling must be generated when at least two couplings

are non-zero and setting either to zero will make the Higgs an exact NGB. Just as we saw with the

gauge and Yukawa coupling, this will ensure that contributions to the Higgs mass can at most be

logarithmically divergent. Constructing a potential that satisfies the above three properties is far from

trivial for an SU(3) model though. When both fields are included, the only nontrivial SU(3) invariant term is $\phi_1^\dagger\phi_2$. The other terms we can construct are $\phi_i^\dagger\phi_i = \text{const.}$ and $\epsilon_{ijk}\phi_i\phi_j\phi_k = 0$, which are clearly of no use. The $\phi_1^\dagger\phi_2$ term, however, immediately breaks both of the SU(3) symmetries to the diagonal subgroup. Expanding $\phi_1^\dagger\phi_2$ gives:

$$\phi_1^\dagger\phi_2 \approx f^2 - 2h^\dagger h + \frac{2}{3f^2}(h^\dagger h)^2$$

This contains a Higgs mass term as well as the quartic coupling, and this will always be the case.
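The coefficients in this expansion can be checked numerically. The sketch below assumes a parameterization that is not restated in this section, namely φ1,2 = exp(±iΘ/f)(0, 0, f)^T with Θ built from the doublet h without an extra factor 1/√2. Under that assumption φ1†φ2 = f² cos(2|h|/f), whose Taylor series reproduces the three terms quoted above. The function names are hypothetical.

```python
import math


def phi1_dag_phi2_exact(h_abs: float, f: float) -> float:
    """Closed form f^2 * cos(2|h|/f).

    Assumption (not stated in this section): phi_{1,2} = exp(+-i Theta/f) (0,0,f)^T
    with Theta = [[0, h], [h^dagger, 0]].
    """
    return f**2 * math.cos(2.0 * h_abs / f)


def phi1_dag_phi2_truncated(h_abs: float, f: float) -> float:
    """The three terms quoted in the text: f^2 - 2 h^dag h + 2 (h^dag h)^2 / (3 f^2)."""
    hh = h_abs**2
    return f**2 - 2.0 * hh + 2.0 * hh**2 / (3.0 * f**2)


if __name__ == "__main__":
    f, h_abs = 1000.0, 100.0  # illustrative values in GeV
    exact = phi1_dag_phi2_exact(h_abs, f)
    approx = phi1_dag_phi2_truncated(h_abs, f)
    # The difference is of order |h|^6 / f^4, i.e. tiny compared to f^2.
    print(exact, approx, exact - approx)
```

With a different normalization of Θ the numerical coefficients of the mass and quartic terms change, but both terms still appear together.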

A solution to this problem might be to tune the coefficient through:

$$\frac{A}{f^{2n-4}}(\phi_1^\dagger\phi_2)^2 \approx a_1 f^4 - a_2 f^2 h^\dagger h + a_3 (h^\dagger h)^2$$

Then, by varying A it should in principle be possible to generate a mass term small enough to prevent quadratic divergencies, or a quartic coupling large enough to induce EWSB. The combination of the two, however, turns out to be impossible. Next to its inability to cancel the down-type quark divergencies, this is another shortcoming of the SU(3) model. As with the down-type quark couplings, the problem of the quartic potential can be solved by enlarging the symmetry group. One such extension is the group SU(5), used in the Littlest Higgs. We will have a look at this model in section 11.

9.2.4 Hypercharge and color

It remains now to add color and hypercharge. Since all of the previous arguments are colorblind42, meaning that red, blue and green quarks carry the same electric charge and hypercharge, we can simply add the SU(3)color gauge group.

Hypercharge

In the Standard Model, the symmetry group of the electroweak interaction is SU(2)L × U(1)Y. In this model we embedded the weak interaction in SU(3)w, which is broken to SU(2)w by the VEV of the φ fields. Therefore, to include hypercharge Y, we gauge an additional U(1)X group, thereby enlarging the symmetry group to SU(3)color × SU(3)weak × U(1)X. It remains now to determine the correct combination of generators such that the hypercharge for the Higgs comes out as +1.43 For this we note that the SU(3) generator $T^8 = \frac{1}{2}\lambda_8$ is SU(2) invariant and has not been used so far. This leads us to define the following combination of generators that is invariant under the VEV ∼ (0, 0, 1) and produces the correct hypercharge:

$$Y = 2\left(\frac{1}{\sqrt{3}}T^8 - X\right), \quad\text{where}\quad T^8 = \frac{1}{2\sqrt{3}}\begin{pmatrix}1 & & \\ & 1 & \\ & & -2\end{pmatrix}$$

and we assigned φ the U(1)X quantum number −1/3 [18]. Then we get

$$Y = \begin{pmatrix}1 & & \\ & 1 & \\ & & 0\end{pmatrix}.$$

42 In the sense of QCD.
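The hypercharge assignment can be verified with exact rational arithmetic, because T^8/√3 = (1/6) diag(1, 1, −2) contains no square roots. The following is a minimal sketch; the function name is hypothetical, and the only inputs are the definition of Y and the U(1)X charge −1/3 quoted above.

```python
from fractions import Fraction as F

# T^8 / sqrt(3) = (1/6) * diag(1, 1, -2), so all entries are rational.
T8_OVER_SQRT3 = [F(1, 6), F(1, 6), F(-1, 3)]


def hypercharge_on_triplet(x_charge: F) -> list:
    """Diagonal of Y = 2 (T^8/sqrt(3) - X) on an SU(3) triplet with U(1)_X charge x_charge."""
    return [2 * (t8 - x_charge) for t8 in T8_OVER_SQRT3]


if __name__ == "__main__":
    # phi carries X = -1/3: the doublet components get Y = 1, the VEV direction Y = 0.
    print(hypercharge_on_triplet(F(-1, 3)))
```

The vanishing third entry is what makes Y unbroken by a VEV along (0, 0, 1).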

The η singlet

Until now we have ignored the η singlet. We can correct this by noting that, besides the linear combination that defines the hypercharge generator, there is a second, independent linear combination of $T^8$ and X. With X = −1/3 on φ it reads

$$-\frac{2}{\sqrt{3}}T^8 - X = \begin{pmatrix}0 & & \\ & 0 & \\ & & 1\end{pmatrix}$$

This combination will become the massive η Goldstone boson that will be eaten by a gauge boson after the high scale symmetry breaking [24].

9.2.5 The gauge sector

Now that we have assigned the fields φ1 and φ2 the U(1)X quantum number −1/3, we can write down a covariant derivative:

$$D_\mu = \partial_\mu + igA^a_\mu T_a + ig_X A^X_\mu X = \partial_\mu + igA^a_\mu T_a - \frac{ig_X}{3}A^X_\mu \qquad (9.18)$$

where $T_a = \lambda_a/2$ are the eight SU(3)W generators and X = −1/3 is the U(1)X generator. We will use it now to have a closer look at the gauge sector. Since we enlarged the SU(2)W × U(1)Y gauge group to SU(3)W × U(1)X, we expect there to be 5 extra gauge bosons that correspond to the 5 broken SU(3)W generators, with masses of order f. Here we will determine those masses and investigate whether the quadratic divergencies due to the SM gauge bosons are indeed cancelled.

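The masses

The masses of the gauge bosons result from the kinetic terms of the φi fields, $|D_\mu\phi_i|^2$:

$$\sum_{i=1}^{2}\left|\left(\partial_\mu + igA^a_\mu T_a - \frac{ig_X}{3}A^X_\mu\right)\phi_i\phi_i^\dagger\right|^2 \to \mathrm{Trace}\left(\sum_{i=1}^{2}\left|\left(igA^a_\mu T_a - \frac{ig_X}{3}A^X_\mu\right)\phi_i\phi_i^\dagger\right|^2\right) = \mathrm{Trace}\left(\left|igA^a_\mu T_a - \frac{ig_X}{3}A^X_\mu\right|^2\sum_{i=1}^{2}\phi_i\phi_i^\dagger\right) \qquad (9.19)$$

43 Recall that $\langle h\rangle = (0\ \ v/\sqrt{2})$ and $Q = T^3 + Y/2$.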

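where ∂µ is omitted since it plays no role in the masses. To evaluate this we determine what the two matrices $\sum_{i=1}^{2}\phi_i\phi_i^\dagger$ and $D_\mu$ look like. For $\sum_{i=1}^{2}\phi_i\phi_i^\dagger$ we find to order h/f:

$$\sum_{i=1}^{2}\phi_i\phi_i^\dagger = \begin{pmatrix}\langle hh^\dagger\rangle & \begin{matrix}0\\0\end{matrix}\\ \begin{matrix}0 & 0\end{matrix} & f^2 - \langle hh^\dagger\rangle\end{pmatrix} = \begin{pmatrix}0 & 0 & 0\\ 0 & v^2/2 & 0\\ 0 & 0 & f^2 - v^2/2\end{pmatrix} \qquad (9.20)$$

where in the second equality we assumed the VEV $(0\ \ v/\sqrt{2})^T$ for h. For $D_\mu$ we find: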

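$$D_\mu = \begin{pmatrix}
\frac{ig}{2}\left(A^3_\mu + \frac{A^8_\mu}{\sqrt{3}}\right) - \frac{1}{3}ig_XA^X_\mu & \frac{ig}{2}\left(A^1_\mu - iA^2_\mu\right) & \frac{ig}{2}\left(A^4_\mu - iA^5_\mu\right)\\
\frac{ig}{2}\left(A^1_\mu + iA^2_\mu\right) & \frac{ig}{2}\left(-A^3_\mu + \frac{A^8_\mu}{\sqrt{3}}\right) - \frac{1}{3}ig_XA^X_\mu & \frac{ig}{2}\left(A^6_\mu - iA^7_\mu\right)\\
\frac{ig}{2}\left(A^4_\mu + iA^5_\mu\right) & \frac{ig}{2}\left(A^6_\mu + iA^7_\mu\right) & -ig\frac{A^8_\mu}{\sqrt{3}} - \frac{1}{3}ig_XA^X_\mu
\end{pmatrix}$$

where we used $T_a = \lambda_a/2$. We can now evaluate (9.19). Letting h assume its VEV this becomes: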

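$$\begin{aligned}
\mathrm{Trace} ={}& \frac{1}{2}v^2\left[\left(g\left(-\frac{A^3_\mu}{2} + \frac{A^8_\mu}{2\sqrt{3}}\right) - \frac{A^X_\mu}{3}g_X\right)^2 + g^2\left(\frac{A^1_\mu}{2} - i\frac{A^2_\mu}{2}\right)\left(\frac{A^1_\mu}{2} + i\frac{A^2_\mu}{2}\right) + g^2\left(\frac{A^6_\mu}{2} - i\frac{A^7_\mu}{2}\right)\left(\frac{A^6_\mu}{2} + i\frac{A^7_\mu}{2}\right)\right]\\
&+ \left(f^2 - \frac{v^2}{2}\right)\left[\left(-g\frac{A^8_\mu}{\sqrt{3}} - g_X\frac{A^X_\mu}{3}\right)^2 + g^2\left(\frac{A^4_\mu}{2} - i\frac{A^5_\mu}{2}\right)\left(\frac{A^4_\mu}{2} + i\frac{A^5_\mu}{2}\right) + g^2\left(\frac{A^6_\mu}{2} - i\frac{A^7_\mu}{2}\right)\left(\frac{A^6_\mu}{2} + i\frac{A^7_\mu}{2}\right)\right]
\end{aligned}$$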

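We now define the following combinations for the SU(2) W± gauge bosons and the new heavy gauge bosons (W′)± and (W′)0, (W̄′)0:

$$W^\pm = \frac{A^1_\mu \mp iA^2_\mu}{\sqrt{2}}, \qquad (W')^\pm = \frac{A^4_\mu \pm iA^5_\mu}{\sqrt{2}}, \qquad (W')^0,\ (\bar{W}')^0 = \frac{A^6_\mu \mp iA^7_\mu}{\sqrt{2}}$$

and we can read off their masses to be:

$$M^2_{W^\pm} = \frac{1}{4}g^2v^2, \qquad M^2_{(W')^\pm} = \frac{1}{2}g^2f^2 - \frac{1}{4}g^2v^2, \qquad M^2_{(W')^0} = \frac{1}{2}g^2f^2 \qquad (9.21)$$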

which is in agreement with [19]. This clearly shows that the SM W± gauge bosons remain massless until EWSB, when the Higgs field assumes a VEV, whereas the four new heavy gauge bosons acquire masses of order f. With this we can already show that the quadratic divergencies due to the charged W± are cancelled by the new heavy W bosons, since these fields are already correct mass eigenstates. I will show this in section 9.2.6. First we focus on determining the masses and mass eigenstates of the other gauge bosons.
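The masses in (9.21) can be cross-checked directly from the off-diagonal Gell-Mann matrices and the diagonal matrix (9.20). The sketch below is an illustration under two stated assumptions: the mass term of each real field A^a is read as (1/2) M² (A^a)², and the numerical values of g, v and f are placeholders. The function and constant names are hypothetical.

```python
import math

# Non-zero entries of the off-diagonal Gell-Mann matrices lambda_a,
# stored as {(row, col): entry}. The generators are T_a = lambda_a / 2.
GELL_MANN = {
    1: {(0, 1): 1, (1, 0): 1},
    2: {(0, 1): -1j, (1, 0): 1j},
    4: {(0, 2): 1, (2, 0): 1},
    5: {(0, 2): -1j, (2, 0): 1j},
    6: {(1, 2): 1, (2, 1): 1},
    7: {(1, 2): -1j, (2, 1): 1j},
}


def mass_squared(a: int, g: float, v: float, f: float) -> float:
    """M^2 of the real field A^a, assuming the mass term is (1/2) M^2 (A^a)^2."""
    phi_diag = [0.0, v**2 / 2, f**2 - v**2 / 2]  # diagonal of (9.20)
    # For diagonal Phi: Tr(T^dagger T Phi) = sum_{j,k} |T_jk|^2 Phi_kk.
    trace = sum(abs(entry / 2) ** 2 * phi_diag[col]
                for (_row, col), entry in GELL_MANN[a].items())
    return 2 * g**2 * trace


if __name__ == "__main__":
    g, v, f = 0.65, 246.0, 1000.0  # placeholder values, GeV for v and f
    for a in sorted(GELL_MANN):
        print(a, math.sqrt(mass_squared(a, g, v, f)))
```

Fields 1 and 2 reproduce the W± mass, 4 and 5 the (W′)± mass, and 6 and 7 the mass of the neutral heavy pair.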


After the φ fields have assumed a VEV (and before EWSB), the neutral gauge bosons corresponding to the generators $T^3$, $T^8$ and $T^X$ mix to form the physical fields $W^3_\mu$, $B_\mu$ and $Z'_\mu$. The first two we have seen before. After SU(2)W × U(1)Y breaking, these two fields will mix to form the massless photon and the heavy $Z^0_\mu$ boson. However, there is now also a $Z'_\mu$ that will mix with $Z^0_\mu$ [19]. We now turn to computing the mass matrix. Since we let only φ attain a VEV, the relevant term to consider is:

$$f^2\left(-g\frac{A^8_\mu}{\sqrt{3}} - g_X\frac{A^X_\mu}{3}\right)^2 \qquad (9.22)$$

Thus we see that the gauge boson A3µ corresponding to the third SU(3) generator does not contribute

to the mixing. We can explain the absence of A3µ in the mixing by observing that it corresponds to the

third SU(3) generator, which we can also identify with the third SU(2) generator. Therefore, when

breaking SU(3)W to SU(2)W , A3µ will in itself correspond to an SU(2) generator and will therefore

remain a mass eigenstate. The other two, A8µ and AXµ do not correspond to SU(2) generators and

thus will mix to form the mass eigenstates. Let us now determine what those eigenstates will be. We

can rewrite (9.22) as44:

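$$f^2\left(\frac{gA^8}{\sqrt{3}} + \frac{g_XA^X}{3}\right)^2 = \frac{f^2}{9}\left(\sqrt{3}gA^8 + g_XA^X\right)^2 = \frac{f^2}{9}\begin{pmatrix}A^3 & A^8 & A^X\end{pmatrix}\begin{pmatrix}0 & 0 & 0\\ 0 & 3g^2 & \sqrt{3}gg_X\\ 0 & \sqrt{3}gg_X & g_X^2\end{pmatrix}\begin{pmatrix}A^3\\ A^8\\ A^X\end{pmatrix} \qquad (9.23)$$

Already familiar with this form from our analysis of the SM Higgs mechanism, we can immediately write down what the physical fields become:

$$A^3_\mu = A^3_\mu, \qquad B_\mu = \frac{-g_XA^8_\mu + \sqrt{3}gA^X_\mu}{\sqrt{3g^2 + g_X^2}}, \qquad Z'_\mu = \frac{\sqrt{3}gA^8_\mu + g_XA^X_\mu}{\sqrt{3g^2 + g_X^2}}$$

$A^3_\mu$ and $B_\mu$ remain massless after SU(3) breaking, as of course they should, and the $Z'_\mu$ acquires a mass:

$$M^2_{Z'} = \frac{2f^2}{9}\left(3g^2 + g_X^2\right) \qquad (9.24)$$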
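As a numerical sanity check of (9.23) and (9.24), the sketch below applies the non-trivial 2 × 2 block of the mass matrix to the B and Z′ directions. The values of g, g_X and f are placeholders, the helper names are hypothetical, and the factor 2 in M² assumes the mass term of a real field is written as (1/2) M² Z′².

```python
import math

SQRT3 = math.sqrt(3.0)


def a8_ax_block(g: float, g_x: float, f: float) -> list:
    """The (A^8, A^X) block of the matrix in (9.23), including the prefactor f^2/9."""
    pref = f**2 / 9.0
    return [[pref * 3 * g**2, pref * SQRT3 * g * g_x],
            [pref * SQRT3 * g * g_x, pref * g_x**2]]


def apply(matrix: list, vec: list) -> list:
    """Multiply a 2x2 matrix with a 2-vector."""
    return [matrix[0][0] * vec[0] + matrix[0][1] * vec[1],
            matrix[1][0] * vec[0] + matrix[1][1] * vec[1]]


def b_direction(g: float, g_x: float) -> list:
    """Components of B along (A^8, A^X)."""
    norm = math.sqrt(3 * g**2 + g_x**2)
    return [-g_x / norm, SQRT3 * g / norm]


def zprime_direction(g: float, g_x: float) -> list:
    """Components of Z' along (A^8, A^X)."""
    norm = math.sqrt(3 * g**2 + g_x**2)
    return [SQRT3 * g / norm, g_x / norm]


if __name__ == "__main__":
    g, g_x, f = 0.65, 0.40, 1000.0  # placeholder values
    m = a8_ax_block(g, g_x, f)
    print(apply(m, b_direction(g, g_x)))       # close to [0, 0]: B is massless
    print(apply(m, zprime_direction(g, g_x)))  # eigenvalue times the Z' direction
```

The Z′ eigenvalue is (f²/9)(3g² + g_X²), i.e. half of M²_{Z′} in (9.24) under the stated normalization.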

(W ′)±, (W ′)00 and the Z ′µ form the 5 massive gauge bosons that correspond to the 5 broken generators

in breaking SU(3)W × U(1)X → SU(2)W × U(1)Y and they all have masses of order f . The Bµ

corresponds to the hypercharge generator Y and after EWSB will mix with A3µ to form the photon Aµ

and heavy Z0. This we will investigate next. For this, we have to rewrite (9.18) in terms of the new

mass eigenstates. We can implement the W± bosons by observing that we have the following equality:

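$$A^1_\mu T_1 + A^2_\mu T_2 = \frac{1}{2}\left[(A^1_\mu + iA^2_\mu)(T_1 - iT_2) + (A^1_\mu - iA^2_\mu)(T_1 + iT_2)\right] = \frac{1}{\sqrt{2}}\left[W^-_\mu(T_1 - iT_2) + W^+_\mu(T_1 + iT_2)\right]$$

Similar expressions hold for the other W bosons, and what remains is $A^3_\mu T^3$. $Z'_\mu$ and $B_\mu$ are more tricky. They must come from rewriting

$$igA^8_\mu T^8 - \frac{ig_X}{3}A^X_\mu = c_1 Z'_\mu + c_2 B_\mu = c_1\frac{\sqrt{3}gA^8_\mu + g_XA^X_\mu}{\sqrt{3g^2 + g_X^2}} + c_2\frac{-g_XA^8_\mu + \sqrt{3}gA^X_\mu}{\sqrt{3g^2 + g_X^2}}$$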

44The space-time indices µ are omitted for clarity.


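Of course, for this covariant derivative to produce the correct mass eigenstates after EWSB, $c_2$ has to contain the hypercharge generator $Y = \frac{2}{\sqrt{3}}T^8 - 2X$ and a certain coupling that we can identify with the hypercharge coupling g′. The following combination will do:

$$\begin{aligned}
igA^8_\mu T^8 - \frac{ig_X}{3}A^X_\mu &\stackrel{?}{=} \frac{i}{\sqrt{3g^2 + g_X^2}}\left[Z'_\mu\left(\sqrt{3}g^2T^8 + g_X^2X\right) - \frac{\sqrt{3}}{2}gg_XB_\mu\left(\frac{2}{\sqrt{3}}T^8 - 2X\right)\right] \qquad (9.25)\\
&= \frac{i}{3g^2 + g_X^2}\left[\left(\sqrt{3}gA^8_\mu + g_XA^X_\mu\right)\left(\sqrt{3}g^2T^8 + g_X^2X\right)\right] - \frac{i}{3g^2 + g_X^2}\frac{\sqrt{3}}{2}gg_X\left[\left(-g_XA^8_\mu + \sqrt{3}gA^X_\mu\right)\left(\frac{2}{\sqrt{3}}T^8 - 2X\right)\right]\\
&= \frac{i}{3g^2 + g_X^2}\Big[\left(3g^3T^8A^8_\mu + \sqrt{3}gg_X^2A^8_\mu X + \sqrt{3}g^2g_XA^X_\mu T^8 + g_X^3A^X_\mu X\right)\\
&\qquad\qquad + \left(gg_X^2A^8_\mu T^8 - \sqrt{3}gg_X^2A^8_\mu X - \sqrt{3}g^2g_XA^X_\mu T^8 + 3g^2g_XA^X_\mu X\right)\Big]\\
&= \frac{i}{3g^2 + g_X^2}\left[\left(3g^3T^8A^8_\mu + gg_X^2A^8_\mu T^8\right) + \left(g_X^3A^X_\mu X + 3g^2g_XA^X_\mu X\right)\right]\\
&= \frac{i}{3g^2 + g_X^2}\left[\left(3g^2 + g_X^2\right)gA^8_\mu T^8 + \left(3g^2 + g_X^2\right)g_XA^X_\mu X\right]\\
&= igA^8_\mu T^8 - \frac{ig_X}{3}A^X_\mu
\end{aligned}$$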

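From (9.25) we deduce that the hypercharge coupling is given by45:

$$g' = \frac{-\sqrt{3}gg_X}{2\sqrt{3g^2 + g_X^2}} = \frac{-g_X}{2\sqrt{1 + \frac{g_X^2}{3g^2}}}.$$

It also gives us the expression for the Weinberg angle θw from (7.27):

$$\frac{g'}{g} = \tan(\theta_w) = t = \frac{-\sqrt{3}g_X}{2\sqrt{3g^2 + g_X^2}} = \frac{-\sqrt{3}}{2\sqrt{\frac{3g^2}{g_X^2} + 1}}$$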

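Implementing all of this leaves us with the following covariant derivative in terms of the new mass eigenstates:

$$\begin{aligned}
D_\mu ={}& \frac{ig}{\sqrt{2}}\left[W^-_\mu(T_1 - iT_2) + W^+_\mu(T_1 + iT_2)\right] \qquad (9.26)\\
&+ \frac{ig}{\sqrt{2}}\left[(W')^0_\mu(T_6 - iT_7) + (\bar{W}')^0_\mu(T_6 + iT_7)\right]\\
&+ \frac{ig}{\sqrt{2}}\left[(W')^+_\mu(T_4 - iT_5) + (W')^-_\mu(T_4 + iT_5)\right]\\
&+ igA^3_\mu T^3 + \frac{i}{\sqrt{3g^2 + g_X^2}}Z'_\mu\left(\sqrt{3}g^2T^8 + g_X^2X\right) + ig'B_\mu Y
\end{aligned}$$

Expressing $g_X$ in terms of g and t gives:

$$g_X = g\frac{2t}{\sqrt{1 - \frac{4t^2}{3}}}$$

and expressing further X in terms of $T^8$ and Y, the term with $Z'_\mu$ becomes

$$\frac{ig}{\sqrt{3 - 4t^2}}Z'_\mu\left(\sqrt{3}T^8 - 2t^2Y\right)$$

45 In agreement with [19].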

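With this, $D_\mu$ becomes in terms of g and t:

$$\begin{aligned}
D_\mu ={}& \frac{ig}{\sqrt{2}}\left[W^-_\mu(T_1 - iT_2) + W^+_\mu(T_1 + iT_2)\right] \qquad (9.27)\\
&+ \frac{ig}{\sqrt{2}}\left[(W')^0_\mu(T_6 - iT_7) + (\bar{W}')^0_\mu(T_6 + iT_7)\right]\\
&+ \frac{ig}{\sqrt{2}}\left[(W')^+_\mu(T_4 - iT_5) + (W')^-_\mu(T_4 + iT_5)\right]\\
&+ igA^3_\mu T^3 + \frac{ig}{\sqrt{3 - 4t^2}}Z'_\mu\left(\sqrt{3}T^8 - 2t^2Y\right) + igtB_\mu Y
\end{aligned}$$

With the expression for $g_X$ we can also rewrite the mass term we found in (9.24) as:

$$M^2_{Z'} = \frac{2f^2}{9}\left(3g^2 + g_X^2\right) = g^2f^2\frac{2}{3 - 4t^2}$$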

which is in agreement with [19]. Note however that when t = √3/2 ≈ 0.866 the mass of the heavy Z′ would become infinite, which is clearly unphysical. However, recall from (7.28) that θW = 28.75° and thus t = tan(θW) ≈ 0.55. Thus we can conclude this will not be the case.
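The chain of relations between g_X, t and M²_{Z′} can be checked numerically. The sketch below works with magnitudes only (the formula for g′ in the text carries a minus sign, which is dropped here as an explicit simplification) and uses placeholder values for g and f. All function names are hypothetical.

```python
import math


def g_x_from_t(g: float, t: float) -> float:
    """g_X = 2 g t / sqrt(1 - 4 t^2 / 3); only defined for t < sqrt(3)/2."""
    return 2 * g * t / math.sqrt(1 - 4 * t**2 / 3)


def tan_theta_w(g: float, g_x: float) -> float:
    """Magnitude of g'/g = sqrt(3) g_X / (2 sqrt(3 g^2 + g_X^2)); the sign is dropped."""
    return math.sqrt(3) * g_x / (2 * math.sqrt(3 * g**2 + g_x**2))


def m2_zprime_from_gx(g: float, g_x: float, f: float) -> float:
    """M^2_Z' in the form of (9.24)."""
    return 2 * f**2 / 9 * (3 * g**2 + g_x**2)


def m2_zprime_from_t(g: float, t: float, f: float) -> float:
    """M^2_Z' rewritten in terms of g and t."""
    return 2 * g**2 * f**2 / (3 - 4 * t**2)


if __name__ == "__main__":
    g, f = 0.65, 1000.0                    # placeholder values
    t = math.tan(math.radians(28.75))      # Weinberg angle quoted in the text
    g_x = g_x_from_t(g, t)
    print(t, g_x, tan_theta_w(g, g_x))
    print(m2_zprime_from_gx(g, g_x, f), m2_zprime_from_t(g, t, f))
```

Both expressions for M²_{Z′} agree, and the round trip t → g_X → t returns the starting value, as long as t stays below √3/2.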

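To find the masses and mass eigenstates after EWSB we follow the same procedure and now let the Higgs field acquire its VEV. Since we already determined the masses of the W bosons, we consider only the remaining three fields $A^3_\mu$, $Z'_\mu$ and $B_\mu$. Computing again $D_\mu$ we find:

$$\begin{pmatrix}
\frac{ig}{2}A^3_\mu + igtB_\mu + \frac{ig}{\sqrt{3 - 4t^2}}\left(\frac{1}{2} - 2t^2\right)Z'_\mu & 0 & 0\\
0 & -\frac{ig}{2}A^3_\mu + igtB_\mu + \frac{ig}{\sqrt{3 - 4t^2}}\left(\frac{1}{2} - 2t^2\right)Z'_\mu & 0\\
0 & 0 & -\frac{ig}{\sqrt{3 - 4t^2}}Z'_\mu
\end{pmatrix}$$

Using now (9.20), we find:

$$\begin{aligned}
\mathrm{Trace}\left(\sum_{i=1}^{2}|D_\mu\phi_i|^2\right) &= \left|-\frac{ig}{2}A^3_\mu + igtB_\mu + \frac{ig}{2\sqrt{3 - 4t^2}}(1 - 4t^2)Z'_\mu\right|^2\frac{v^2}{2} + \left|-\frac{ig}{\sqrt{3 - 4t^2}}Z'_\mu\right|^2\left(f^2 - \frac{v^2}{2}\right)\\
&= \frac{g^2v^2}{2}\left(-\frac{A^3_\mu}{2} + tB_\mu + \frac{1 - 4t^2}{2\sqrt{3 - 4t^2}}Z'_\mu\right)^2 + \frac{g^2\left(f^2 - \frac{v^2}{2}\right)}{3 - 4t^2}(Z'_\mu)^2. \qquad (9.28)
\end{aligned}$$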

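$A^3_\mu$ and $B_\mu$ will mix to form the massless photon $A_\mu$ and the neutral massive $Z^0_\mu$. As I mentioned earlier, $Z^0_\mu$ will also mix with the new heavy $Z'_\mu$. Therefore we distinguish between $Z'_\mu$ and $(Z_\mu)'$, respectively before and after EWSB. Rewriting (9.28) and using the mass matrix gives:

$$\begin{pmatrix}A^3_\mu & B_\mu & Z'_\mu\end{pmatrix} M \begin{pmatrix}(A^3)^\mu\\ B^\mu\\ (Z')^\mu\end{pmatrix} \Rightarrow \begin{pmatrix}A_\mu & Z^0_\mu & (Z_\mu)'\end{pmatrix} M_{\text{diag}} \begin{pmatrix}A^\mu\\ Z^{0\mu}\\ (Z^\mu)'\end{pmatrix}$$

where the mass matrix is given by:

$$M = \frac{g^2v^2}{2}\begin{pmatrix}
\frac{1}{4} & -\frac{t}{2} & -\frac{1 - 4t^2}{4\sqrt{3 - 4t^2}}\\
-\frac{t}{2} & t^2 & \frac{t(1 - 4t^2)}{2\sqrt{3 - 4t^2}}\\
-\frac{1 - 4t^2}{4\sqrt{3 - 4t^2}} & \frac{t(1 - 4t^2)}{2\sqrt{3 - 4t^2}} & \frac{(1 - 4t^2)^2}{4(3 - 4t^2)} + \frac{-1 + \frac{2f^2}{v^2}}{3 - 4t^2}
\end{pmatrix}$$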
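Two properties of this mass matrix can be checked without diagonalizing it: the combination 2t A³ + B is an exact null vector (the photon), and the trace equals g²f²/(3 − 4t²), which is the sum of the two non-zero eigenvalues. The sketch below builds the matrix entry by entry; the numerical values are placeholders and the function names are hypothetical.

```python
import math


def ewsb_neutral_mass_matrix(g: float, v: float, f: float, t: float) -> list:
    """The 3x3 matrix M in the basis (A^3, B, Z'), including the prefactor g^2 v^2 / 2."""
    r = math.sqrt(3 - 4 * t**2)
    c = 1 - 4 * t**2
    pref = g**2 * v**2 / 2
    rows = [
        [0.25, -t / 2, -c / (4 * r)],
        [-t / 2, t**2, t * c / (2 * r)],
        [-c / (4 * r), t * c / (2 * r),
         c**2 / (4 * r**2) + (2 * f**2 / v**2 - 1) / r**2],
    ]
    return [[pref * x for x in row] for row in rows]


def mat_vec(m: list, vec: list) -> list:
    """Multiply a 3x3 matrix with a 3-vector."""
    return [sum(m[i][j] * vec[j] for j in range(3)) for i in range(3)]


if __name__ == "__main__":
    g, v, f, t = 0.65, 246.0, 1000.0, 0.55  # placeholder values
    m = ewsb_neutral_mass_matrix(g, v, f, t)
    print(mat_vec(m, [2 * t, 1.0, 0.0]))            # photon direction: close to zero
    print(m[0][0] + m[1][1] + m[2][2])              # trace
    print(g**2 * f**2 / (3 - 4 * t**2))             # expected sum of eigenvalues
```

The vanishing of the first product confirms that the photon stays massless for any value of v and f.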


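The eigenvalues are computed using Mathematica and are to O(1/f) found to be:

$$\begin{aligned}
\lambda_1 &= 0,\\
\lambda_{2,3} &= \frac{g^2v^2}{2}\,\frac{2f^2\sqrt{3 - 4t^2} \pm 2f^2\sqrt{3 - 4t^2}\sqrt{1 - (3 - 4t^2)(1 + 4t^2)\frac{v^2}{2f^2}}}{2v^2(3 - 4t^2)\sqrt{3 - 4t^2}}\\
&= \frac{g^2}{2}\,\frac{f^2 \pm f^2\left(1 - \frac{1}{2}(3 - 4t^2)(1 + 4t^2)\frac{v^2}{2f^2}\right)}{3 - 4t^2}\\
&= \frac{g^2}{2}\left(\frac{f^2 \pm f^2}{3 - 4t^2}\right) \mp \frac{g^2v^2}{8}(1 + 4t^2)
\end{aligned}$$

Indeed, as of course it should, setting v = 0 reproduces the eigenvalues we found earlier when we let only φ assume a VEV. Thus in the correct basis of eigenstates: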

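$$\begin{pmatrix}A_\mu & Z^0_\mu & Z'_\mu\end{pmatrix}\begin{pmatrix}0 & 0 & 0\\ 0 & \frac{g^2v^2}{8}(1 + 4t^2) & 0\\ 0 & 0 & \frac{g^2f^2}{3 - 4t^2} - \frac{g^2v^2}{8}(1 + 4t^2)\end{pmatrix}\begin{pmatrix}A^\mu\\ Z^{0\mu}\\ Z'^\mu\end{pmatrix}$$

From this we identify46:

• a massless photon $A_\mu$,

• a massive neutral $Z^0$ boson with $M^2_{Z^0} = \frac{g^2v^2}{4}(1 + 4t^2)$,

• a heavy neutral $Z'$ boson with $M^2_{Z'} = g^2f^2\frac{2}{3 - 4t^2} - \frac{g^2v^2}{4}(1 + 4t^2)$.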

Computation of the corresponding eigenvectors results in the following linear combinations for the mass eigenstates after EWSB:

$$A = \frac{(2t)A^3_\mu + B_\mu}{4t^2 + 1}, \qquad Z^0 = \frac{xA^3_\mu + 2txB_\mu + Z'}{x^2 + 4t^2x^2 + 1}, \qquad Z' = \frac{yA^3_\mu + 2tyB_\mu + Z'}{y^2 + 4t^2 + 1}$$

where I defined:

$$x = -\frac{8}{(1 - 4t^2)(1 + 4t^2)\sqrt{3 - 4t^2}\,(v^2/f^2)}$$

and

$$y = \frac{(1 - 4t^2)\sqrt{3 - 4t^2}\,(v^2/f^2)}{8} = -\frac{1}{x(1 + 4t^2)}$$

9.2.6 Cancellation of the W boson loop

Using the expressions for the charged gauge bosons and (9.20), we can determine whether the quadratic divergencies due to the SM W± bosons are indeed cancelled by the new heavy bosons. The relevant part of the covariant derivative being $A^a_\mu T_a$, we must evaluate:

$$|D_\mu|^2\sum_{i=1}^{2}\phi_i\phi_i^\dagger = \frac{g^2}{2}\,\mathrm{Trace}\left[\begin{pmatrix}0 & W^+ & (W')^-\\ W^- & 0 & (W')^0\\ (W')^+ & (W')^0 & 0\end{pmatrix}^2\begin{pmatrix}0 & 0 & 0\\ 0 & \frac{v^2}{2} & 0\\ 0 & 0 & f^2 - \frac{v^2}{2}\end{pmatrix}\right].$$

where we assumed the Higgs VEV $(0\ \ v/\sqrt{2})^T$, i.e. $\langle hh^\dagger\rangle = v^2/2$. Then we find:

$$\begin{aligned}
|D_\mu|^2\sum_{i=1}^{2}\phi_i\phi_i^\dagger &= \frac{g^2}{2}\left\{\left[(W')^+(W')^- + W^0W^0\right]\left(f^2 - \frac{v^2}{2}\right) + \frac{v^2}{2}\left[W^+W^- + W^0W^0\right]\right\}\\
&= \frac{g^2}{2}\left\{W^0W^0f^2 + \frac{v^2}{2}\left[W^+W^- - (W')^+(W')^-\right]\right\} \qquad (9.29)
\end{aligned}$$

Thus, from (9.29) we see that once the Higgs field has assumed its VEV, the divergencies from the charged W± bosons are precisely cancelled by the heavy (W′)± bosons47. Looking back at the W± and (W′)± masses in (9.21), we see that this already gave us an indication of the cancellations shown here. We see the same relation between the masses of the SM Z0 and the new heavy Z′. Indeed, when rewriting the covariant derivative in terms of the eigenstates after EWSB, one can, similarly to the case of the W bosons, show that the quadratic divergencies from the SM Z0 are cancelled by the heavy Z′. I will not show this here.

Figure 6: The quadratically divergent contributions from the W bosons and the heavy W′. They couple equally to the Higgs but with opposite sign and thus cancel.

46 In agreement with [19].
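The structure of (9.29) can be reproduced with a few lines of arithmetic. For a Hermitian matrix with zero diagonal, a field sitting at positions (j, k) and (k, j) contributes to the trace with weight Φ_jj + Φ_kk, where Φ is the diagonal matrix (9.20). The sketch below uses this to read off the coefficients of W⁺W⁻, (W′)⁺(W′)⁻ and the neutral pair, and then isolates the part that depends on the Higgs VEV. Names and numerical values are illustrative assumptions.

```python
def charged_sector_coefficients(g: float, v: float, f: float) -> dict:
    """Coefficients of the three bilinears in (g^2/2) Tr(M^2 Phi), with Phi from (9.20)."""
    phi = [0.0, v**2 / 2, f**2 - v**2 / 2]
    # Positions of each complex field in the 3x3 matrix of section 9.2.6.
    positions = {"W": (0, 1), "W'": (0, 2), "W'0": (1, 2)}
    return {name: g**2 / 2 * (phi[j] + phi[k]) for name, (j, k) in positions.items()}


def higgs_dependent_part(g: float, v: float, f: float) -> dict:
    """Subtract the v = 0 value to keep only the coupling to the Higgs VEV."""
    with_vev = charged_sector_coefficients(g, v, f)
    without_vev = charged_sector_coefficients(g, 0.0, f)
    return {name: with_vev[name] - without_vev[name] for name in with_vev}


if __name__ == "__main__":
    g, v, f = 0.65, 246.0, 1000.0  # placeholder values
    print(charged_sector_coefficients(g, v, f))
    print(higgs_dependent_part(g, v, f))
```

The Higgs-dependent coefficients of W and W′ come out equal in size and opposite in sign, while the neutral pair couples only to f².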

By now we have introduced many aspects of the Little Higgs theories. We have seen here that an SU(3) based Little Higgs model is able to cancel the most dangerous quadratic divergencies from the Standard Model by introducing a heavy top partner T and additional heavy gauge bosons with masses at the TeV scale. I introduced the concept of collective symmetry breaking, which ensures that any contribution to the Higgs mass must contain both couplings, meaning the contribution can only be log-divergent. However, we also noted two shortcomings of the SU(3) based model, namely its inability to cancel the contribution of the down-type quarks and to generate a quartic potential. Resolving these problems requires extension of the gauge group. One such group is SU(5). In section 11 we will have a look at 'The Littlest Higgs', which is based on SU(5). Regarding the latter of these problems, I will show that this model is able to generate a quartic potential. The next section will serve as a general introduction to the SU(5) gauge group with respect to the transformation properties of particles.

47The term W 0W 0f2 might seem troublesome. It represents a coupling of the W 0W 0 to the φ fields after

they assumed their VEV. Its one loop diagram is quadratically divergent. However, this does not pose a

problem because under the 1 TeV scale the SM only contains the h doublet.


10 Representations, particle multiplets and symmetry breaking

In the "Simplest Little Higgs" we broke a global SU(3) symmetry to SU(2). We looked at the gauge sector of the model and saw that enlarging the gauge group SU(2)W to SU(3)W introduced 5 new heavy gauge bosons (and of course the heavy top partner). However, when postulating a model with a larger symmetry group, we need to indicate how the particles will transform under this enlarged symmetry group. This is where representation theory steps in. In the introduction I already mentioned that there is a link between representation theory and particle physics. The link appears when one considers particles as vectors in C^n. When a model has a certain symmetry group, this symmetry group has irreducible representations of different dimensions. These irreducible representations can be associated with the transformation properties of the different particles under the symmetry group. Each particle is assigned to a certain representation, which tells us how it transforms. The left-handed electron doublet, for one, is placed in the two-dimensional representation of SU(2)W, while the right-handed electron transforms in the trivial one-dimensional representation. Physically this corresponds to the observation that only left-handed particles feel the weak force. More explanation can be found in Appendix C.2 on Isospin. Recall now from sections 4 and 5.2 that the irreducible representations of SU(N) can be parametrized by Young diagrams with at most N rows, and that any column of N boxes may be omitted. Rows correspond to fully symmetric states and columns to fully antisymmetric states. The dimension of the irreducible representations can be computed using (4.5), and we will refer to the representations by their dimension, e.g. denote the eight-dimensional representation by 8. Another important result are the branching rules we derived for SU(N) → SU(N − 1) and SU(N + M) → SU(N) × SU(M). Their relation to the above will become clear with the following example, where I discuss how we can assign the gauge bosons from the previous section to the SU(3)W and SU(2)W representations48.
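Formula (4.5) is not reproduced in this section. As an illustration only, the sketch below computes SU(N) dimensions from a Young diagram with the standard hook-content formula, which may differ in form from (4.5) but gives the same numbers. A diagram is passed as a list of row lengths; the function name is hypothetical.

```python
def su_n_dimension(shape: list, n: int) -> int:
    """Dimension of the SU(n) irrep with Young diagram `shape` (row lengths, non-increasing).

    Uses the hook-content formula: product over boxes of (n + col - row) / hook length.
    """
    if not shape:
        return 1  # empty diagram: trivial representation
    # Column lengths of the diagram.
    cols = [sum(1 for row_len in shape if row_len > j) for j in range(shape[0])]
    numerator, denominator = 1, 1
    for i, row_len in enumerate(shape):
        for j in range(row_len):
            arm = row_len - j - 1   # boxes to the right in the same row
            leg = cols[j] - i - 1   # boxes below in the same column
            numerator *= n + j - i
            denominator *= arm + leg + 1
    return numerator // denominator


if __name__ == "__main__":
    print(su_n_dimension([2, 1], 3))        # adjoint of SU(3)
    for shape in ([1], [1, 1, 1, 1], [1, 1], [2], [2, 1, 1, 1]):
        print(shape, su_n_dimension(shape, 5))
```

The five SU(5) diagrams in the usage example are the ones whose dimensions are 5, 5, 10, 15 and 24.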

Example 10.1. The particles have to be assigned to the SU(3)W representations in such a way that the assignment is consistent with the SU(2)W transformation properties after spontaneous symmetry breaking. In other words, they have to be distributed over the SU(3) representations in such a way that, when we consider them as SU(2) representations, everything is consistent with the Standard Model. How irreducible SU(3) representations decompose into SU(2) representations is exactly what the branching rules tell us. The 8 gauge bosons $A^a_\mu$ of SU(3)W are placed in the 8, whose Young tableau has shape (2, 1). $A^X_\mu$ is a singlet in U(1)X. We already determined the branching of the 8 representation of SU(3) in example 5.1. We found:

8 = 2 · 2 + 1 + 3.

Thus after SU(3) breaking we have 2 SU(2) doublets, a singlet and a triplet. The triplet we associate with the Standard Model triplet $A^1_\mu, A^2_\mu, A^3_\mu$. The singlet corresponds to $Z'_\mu$. $B_\mu$ is a singlet under U(1)Y. What remains are the 2 doublets. To these we assign the new heavy W bosons: ((W′)+, (W′)−) and (W0, W0), which is in agreement with [19].

In the next paragraph we will have a more thorough look at the SU(5) case49. Knowing the SU(5) transformation properties is important for two reasons. First of all, we already noted that the Simplest Little Higgs as discussed above was unsuccessful in generating a quartic potential for the Higgs and in cancelling the divergencies that come from the down-type quarks. The minimal model that does succeed in this and is consistent with the Standard Model is called the "Littlest Higgs", and it is based on a global SU(5) symmetry. Secondly, SU(5) enters in Grand Unified Theories because it is the minimal symmetry group with the SM gauge group as a subgroup. Here I will discuss a part of an SU(5) based theory, focusing on transformation properties of elementary particles, and show that all elementary particles of the SM can be accommodated in the 5 lowest-dimensional SU(5) representations.

48 Recall that we also had U(1)X and U(1)Y. But since their generators commute with the generators of SU(3) and SU(2), they are not complicating factors.

49 It will be based on [14].

10.1 SU(5)→ SU(3)C × SU(2)W × U(1)Y .

We will now investigate the representations and particle multiplets of the gauge group SU(5), which has the Standard Model gauge group SU(3)C × SU(2)W × U(1)Y embedded in it. We can achieve this embedding by placing SU(3)C in the upper left 3 × 3 block and SU(2)W in the lower right 2 × 2 block, and embedding U(1) along the diagonal. Physically this embedding is motivated by the experimental observation that SU(3)C is completely blind with respect to the weak interaction SU(2)W × U(1)Y, meaning that red, blue and green quarks carry the same electric charge and weak hypercharge. This implies that their generators must behave as unit matrices with respect to one another. Further, the hypercharge generator must commute with both SU(3)C and SU(2)W, by which the generator for U(1)Y becomes $Y = \lambda_{24}$.50 This is because the leptons are color singlets and therefore the SU(3) generators must have zero eigenvalues.

Remark on notation In the following we will adopt a different notation for the Young diagrams that will prove useful. We will label the Young diagram by (p1, . . . , pn−1), where pi counts the number of columns of length i. The conjugate diagram can be obtained by reversing the order of these numbers and is of the same dimension as the original representation. Finally, we define the fundamental representations for any n to be the Young diagrams corresponding to exactly one pi = 1 and all others zero.

In terms of particles: if a right-handed particle transforms in the d representation, then the charge-conjugated left-handed particle (i.e. antiparticle) transforms in the conjugate representation, to which we will refer as $\bar{d}$. Note that for SU(2) the representations 2 and $\bar{2}$ coincide.

Now, back to SU(5). In order to accommodate all fermions and gauge bosons, it will be sufficient to consider only the 5 lowest-dimensional representations. Recall that in section 5.2 we already derived the branching of the 5 lowest-dimensional SU(5) representations in terms of Young diagrams. The result is shown in table 1.51 Note that the correct hypercharges are not yet assigned; this will be corrected later. First we focus on assigning the elementary fermions of the SM to the SU(5) representations.

Table 1: The (SU(3) × SU(2) × U(1)Y) decomposition of the irreducible SU(5) representations

Dimension     (SU(3), SU(2)) decomposition
$5$           $(3, 1) \oplus (1, 2)$
$\bar{5}$     $(\bar{3}, 1) \oplus (1, 2)$
$10$          $(3, 2) \oplus (\bar{3}, 1) \oplus (1, 1)$
$15$          $(6, 1) \oplus (3, 2) \oplus (1, 3)$
$24$          $(1, 1) \oplus (1, 3) \oplus (3, 2) \oplus (\bar{3}, 2) \oplus (8, 1)$

Assigning the elementary fermions

We begin by recalling that the left-handed lepton doublet $(\nu_e\ \ e^-)$ has (SU(3), SU(2))Y quantum numbers $(1, 2)_{-1}$. In the 5 representation the hypercharge generator is related to $\lambda_{24}$:

$$Y = \sqrt{\frac{5}{3}}\lambda_{24} = \mathrm{diag}\left(-\frac{2}{3}, -\frac{2}{3}, -\frac{2}{3}, 1, 1\right)$$

50 The explicit form of the SU(5) generators can be found in [14].

51 We can without problems add the hypercharge group because it commutes with SU(3) and SU(2).

The normalization is inserted so that the left-handed lepton doublet $(\nu_e\ \ e^-)_L$ can be assigned to the last two components of the $\bar{5}$ representation $(\psi_i)_L$, i = 1, . . . , 5. In this representation the hypercharge operator is given by −Y.52 Just as $\lambda_{24}$ corresponds to the hypercharge generator, we can identify the SU(2) isospin generator $T^3 = \tau_3/2$ with $\lambda_{23}$. The charge operator in the 5 representation can then be obtained by using the Gell-Mann-Nishijima relation. Then:

$$Q = T_3 + \frac{Y}{2} = \frac{\lambda_{23}}{2} + \sqrt{\frac{5}{12}}\lambda_{24} = \mathrm{diag}\left(-\frac{1}{3}, -\frac{1}{3}, -\frac{1}{3}, 1, 0\right) \qquad (10.1)$$
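The diagonal entries in (10.1) follow from adding the isospin and hypercharge diagonals. A minimal sketch with exact fractions is given below. It assumes that $\lambda_{23}$ = diag(0, 0, 0, 1, −1), which is consistent with (10.1) but is not written out in this section; the constant and function names are hypothetical.

```python
from fractions import Fraction as F

# Hypercharge in the 5 representation, Y = sqrt(5/3) * lambda_24.
Y_DIAG = [F(-2, 3), F(-2, 3), F(-2, 3), F(1), F(1)]
# Isospin T3 = lambda_23 / 2, assuming lambda_23 = diag(0, 0, 0, 1, -1).
T3_DIAG = [F(0), F(0), F(0), F(1, 2), F(-1, 2)]


def charge_diag() -> list:
    """Diagonal of Q = T3 + Y/2 (Gell-Mann-Nishijima relation) in the 5 representation."""
    return [t3 + y / 2 for t3, y in zip(T3_DIAG, Y_DIAG)]


if __name__ == "__main__":
    print(charge_diag())   # charges of the five components
    print(sum(Y_DIAG))     # a traceless generator sums to zero
```

The first three entries carry charge −1/3 and the last two carry charges 1 and 0.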

In the $\bar{5}$ representation the charge operator is then given by −Q, producing indeed the correct charges for the lepton doublet. The lepton anti-doublet, obtained by charge conjugation, $(e^C, -\nu^C_e)_R$ is assigned to the 5 quintuplet $(\psi_i)_R$, i = 1, . . . , 5. To the first three components we must assign an SU(3) color triplet of right-handed particles with charge −1/3 that transforms as a singlet under SU(2). This rules out everything apart from the triplet of down quarks $(d_r, d_b, d_g)_R$. The left-handed $\bar{5}$ quintuplet contains the conjugated left-handed antiparticles. Thus for i = 1, . . . , 5:

$$5 = (\psi_i)_R = \begin{pmatrix}d_r & d_b & d_g & e^+ & -\nu^C_e\end{pmatrix}_R, \qquad \bar{5} = (\psi^C_i)_L = \begin{pmatrix}d^C_r & d^C_b & d^C_g & e^- & \nu_e\end{pmatrix}_L$$

Now, for these assignments we did not yet explicitly need table 1. However, to assign the left-handed colored quark doublets

$$\begin{pmatrix}u_r & u_b & u_g\\ d_r & d_b & d_g\end{pmatrix}_L$$

and the singlets $e^+_L$ and $(u^C_r, u^C_b, u^C_g)_L$ to the SU(5) multiplets we will need it. Since the (SU(3), SU(2))Y transformation properties of the fermions are known, we can use the result of this table to assign the remaining fermions to the SU(5) multiplets by comparison of the transformation properties. When expressed in left-handed particles and antiparticles these are53:

$u_L, d_L$ : $(3, 2)_{1/3}$
$d^C_L$ : $(\bar{3}, 1)_{2/3}$
$u^C_L$ : $(\bar{3}, 1)_{-4/3}$
$\nu^C_L, e_L$ : $(1, 2)_{-1}$
$e^C_L$ : $(1, 1)_{2}$

Comparison with table 1 shows that the remaining fermions can be assigned to the antisymmetric 10 representation.

$$10 = \begin{pmatrix}(\bar{3}, 1) & (3, 2)\\ (3, 2) & (1, 1)\end{pmatrix} \qquad (10.2)$$

52 See appendix C.2.2.

Now, whereas the 5 and $\bar{5}$ representations were represented by one-component tensors $\psi_i$, the 10 is represented by an antisymmetric matrix, i.e. a two-component tensor $\psi_{ij}$ that satisfies $\psi_{ij} = -\psi_{ji}$. The left-handed color vector $(u^C_i)_L$, i = r, b, g = 1, 2, 3, can be turned into an antisymmetric 3 × 3 matrix using an epsilon contraction: $\sum_k \epsilon_{ijk}(u^C_k)_L$. Similarly, we can construct from $e^C_L$ an antisymmetric 2 × 2 matrix through $\epsilon_{ij}e^C_L$. We then get:

$$10 = \psi_{ij} = \frac{1}{\sqrt{2}}\begin{pmatrix}
0 & u^C_g & -u^C_b & -u_r & -d_r\\
-u^C_g & 0 & u^C_r & -u_b & -d_b\\
u^C_b & -u^C_r & 0 & -u_g & -d_g\\
u_r & u_b & u_g & 0 & e^+\\
d_r & d_b & d_g & -e^+ & 0
\end{pmatrix}_L$$

where the $1/\sqrt{2}$ is a normalization factor to compensate for every particle appearing twice. Thus the fermions of the Standard Model can all be assigned to the 5, $\bar{5}$ and 10 multiplets of SU(5). The second and third generations are handled similarly.

Correcting the hypercharge

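It remains now to correct the hypercharge for all the particles. From example C.1 and (C.6) it also follows that the hypercharge of a product of multiplets is the sum of their individual hypercharges. Using further that in the 5 representation $Y = \sqrt{5/3}\,\lambda_{24}$, we deduce that we can identify the hypercharges of (3, 1) and (1, 2) as −2/3 and 1 respectively. The hypercharges of the other multiplets can then be determined by taking tensor products of 5 = (3, 1) ⊕ (1, 2) and its conjugate and summing the hypercharges. Explicitly this gives:

$$\begin{aligned}
5 \times 5 &= \left[(3, 1)_{-2/3} \oplus (1, 2)_{1}\right] \times \left[(3, 1)_{-2/3} \oplus (1, 2)_{1}\right]\\
&= \left[(3 \times 3, 1 \times 1)_{-4/3} + (3 \times 1, 1 \times 2)_{1/3} + (1 \times 3, 2 \times 1)_{1/3} + (1 \times 1, 2 \times 2)_{2}\right]\\
&= \left[(6 \oplus \bar{3}, 1)_{-4/3} + 2 \cdot (3, 2)_{1/3} + (1, 1 \oplus 3)_{2}\right]\\
&= \left[(6, 1)_{-4/3} + (\bar{3}, 1)_{-4/3} + 2 \cdot (3, 2)_{1/3} + (1, 1)_{2} + (1, 3)_{2}\right] \qquad (10.3)
\end{aligned}$$

and

$$\begin{aligned}
\bar{5} \times 5 &= \left[(\bar{3}, 1)_{2/3} \oplus (1, 2)_{-1}\right] \times \left[(3, 1)_{-2/3} \oplus (1, 2)_{1}\right]\\
&= \left[(\bar{3} \times 3, 1 \times 1)_{0} + (\bar{3} \times 1, 1 \times 2)_{5/3} + (1 \times 3, 2 \times 1)_{-5/3} + (1 \times 1, 2 \times 2)_{0}\right]\\
&= \left[(8 \oplus 1, 1)_{0} + (\bar{3}, 2)_{5/3} + (3, 2)_{-5/3} + (1, 1 \oplus 3)_{0}\right]\\
&= \left[(8, 1)_{0} + (1, 1)_{0} + (\bar{3}, 2)_{5/3} + (3, 2)_{-5/3} + (1, 3)_{0} + (1, 1)_{0}\right] \qquad (10.4)
\end{aligned}$$

53 See appendix C.2.2.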
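The bookkeeping behind (10.3) and (10.4) is additive in the hypercharge and multiplicative in the dimensions, which makes it easy to check mechanically. The sketch below tracks only the total dimension and the hypercharge of each block of the product (it does not decompose 3 × 3 or 2 × 2 into irreducible pieces). The data layout and names are illustrative assumptions.

```python
from fractions import Fraction as F
from itertools import product

# Each block is ((dim under SU(3), dim under SU(2)), hypercharge).
FIVE = [((3, 1), F(-2, 3)), ((1, 2), F(1))]
FIVE_BAR = [((3, 1), F(2, 3)), ((1, 2), F(-1))]  # conjugate: same dimensions, opposite Y


def product_blocks(left: list, right: list) -> list:
    """Return (total dimension, hypercharge) for every pair of blocks in left x right."""
    blocks = []
    for (dims_l, y_l), (dims_r, y_r) in product(left, right):
        dimension = dims_l[0] * dims_l[1] * dims_r[0] * dims_r[1]
        blocks.append((dimension, y_l + y_r))
    return blocks


if __name__ == "__main__":
    print(product_blocks(FIVE, FIVE))      # blocks of 5 x 5
    print(product_blocks(FIVE_BAR, FIVE))  # blocks of 5bar x 5
```

In both cases the block dimensions add up to 25, and the hypercharges match the subscripts in (10.3) and (10.4).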

Assigning the gauge bosons

The gauge bosons of SU(5) correspond to the group generators and thus have to be assigned to the representation of dimension 5^2 − 1 = 24, and we expect there to be 12 gauge bosons in addition to the 8 gluons of SU(3), the 3 $W^i_\mu$ of SU(2) and the hypercharge boson $B_\mu$. We already derived the decomposition of this representation in terms of SU(3) × SU(2) representations:

$$24 = (1, 1)_{0} \oplus (1, 3)_{0} \oplus (3, 2)_{-5/3} \oplus (\bar{3}, 2)_{5/3} \oplus (8, 1)_{0}$$

With our knowledge of the transformation properties of the Standard Model gauge bosons we can immediately assign the SU(3) octet of gluons to the (8, 1), the 3 SU(2) gauge bosons $W^i_\mu$ to (1, 3) and $B_\mu$ to (1, 1). The remaining 12 gauge bosons are thus assigned to the (3, 2) and $(\bar{3}, 2)$, which are each 6-dimensional. The 12 new gauge bosons thus form 2 colored isospin doublets. We will denote this as:

$$(3, 2) = \begin{pmatrix}X_r & X_b & X_g\\ Y_r & Y_b & Y_g\end{pmatrix} \qquad (10.5)$$

We can embed the gauge bosons in the 24 representation in the same way we did for the fermions in the 10. Explicitly:

$$24 = \begin{pmatrix}(8, 1)_{0} & (\bar{3}, 2)_{5/3}\\ (3, 2)_{-5/3} & (1, 3)_{0}\end{pmatrix} + (1, 1)_{0}$$

where the matrix is a 5 × 5 traceless matrix.

Concluding word on SU(5)

We have seen here that as a Grand Unification gauge group, SU(5) is able to accommodate all

elementary fermions and the gauge bosons. In [14] the Lagrangian for an SU(5) based gauge theory

is derived. With this it is possible to derive relations, make predictions and to see if those are consis-

tent with the Standard Model. For one, they show that the SU(5) group predicts the quantization of

charge. Fixing for example the electron charge within the SU(5) gauge group, the charges of all other

particles can be determined. Off course, this is provided they are arranged in correspondence with

their quantum numbers and color. They also show that under the assumption of an unbroken SU(5)

gauge theory the prediction for the Weinberg angle is

sin2(θW ) =3

8

which deviates from the Standard model prediction sin2(θW ) ≈ 0.23. This, and the fact we know

that the SU(2) × U(1) symmetry is spontaneously broken by the Higgs field VEV, indicates that

83

at low energies SU(5) cannot be a correct symmetry. It must thus be spontaneously broken to the Standard Model gauge group at some energy scale; only above this energy is SU(5) an exact symmetry, and only there will the Weinberg angle approach this value. Since no hints of any influence of the additional gauge fields X and Y have been found in any experiment, the unbroken SU(5) symmetry is only realized at very high energies. When the experimental lower bound on the lifetime of the proton$^{54}$ is taken into account, it can be shown that the breaking of the symmetry must occur at about $10^{15}$ GeV.
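The two SU(5) statements quoted above from [14], charge quantization and $\sin^2(\theta_W) = 3/8$, can be illustrated with a short trace computation. The sketch below is not the derivation of [14]; it assumes the usual assignment of three anti-down quarks, the electron and its neutrino to the $\bar{5}$, and it uses the standard argument that at the unification scale $T_3$ and $Q$ are generators of the same simple group, so that $\sin^2(\theta_W) = \mathrm{Tr}(T_3^2)/\mathrm{Tr}(Q^2)$. The names `FIVE_BAR`, `charge_trace` and `sin2_theta_w` are made up for this illustration.

```python
from fractions import Fraction

# Assumed content of the SU(5) anti-fundamental representation, listed as
# (weak isospin T3, electric charge Q in units of the positron charge).
FIVE_BAR = [
    (Fraction(0), Fraction(1, 3)),     # anti-down quark, first color
    (Fraction(0), Fraction(1, 3)),     # anti-down quark, second color
    (Fraction(0), Fraction(1, 3)),     # anti-down quark, third color
    (Fraction(-1, 2), Fraction(-1)),   # electron
    (Fraction(1, 2), Fraction(0)),     # neutrino
]


def charge_trace(multiplet):
    """Tr(Q) over the multiplet; it must vanish for a generator of SU(5)."""
    return sum(charge for _, charge in multiplet)


def sin2_theta_w(multiplet):
    """Tr(T3^2) / Tr(Q^2), the unbroken-SU(5) value of sin^2(theta_W)."""
    t3_squared = sum(t3 ** 2 for t3, _ in multiplet)
    q_squared = sum(charge ** 2 for _, charge in multiplet)
    return t3_squared / q_squared


if __name__ == "__main__":
    print("Tr Q =", charge_trace(FIVE_BAR))
    print("sin^2(theta_W) =", sin2_theta_w(FIVE_BAR))
```

The vanishing trace of Q is the charge quantization condition: once the electron charge is fixed, the three colors of the anti-down quark must share the opposite charge equally.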

In the following section I will get back to SU(5) in the context of Little Higgs models where we discuss

the ’Littlest Higgs’.

54 The experimental lower bound is about $10^{34}$ years.


11 The Littlest Higgs

The Littlest Higgs is the minimal Little Higgs model that suffices as a theory to extend the SM. It is based on the coset SU(5)/SO(5) and was the first viable Little Higgs model, constructed in 2002 by Arkani-Hamed et al. [20].

11.1 Requirements for the model

In the Littlest Higgs the Higgs is realized as the pseudo-NGB of a global symmetry group G that is spontaneously broken to a global symmetry group H at the energy scale f ∼ 1 TeV. The model is required to be an extension of the SM and thus the subgroup H must contain a copy of the SM gauge group $SU(2)_W \times U(1)_Y$. Secondly, to prevent the Higgs mass from receiving quadratically divergent corrections, we assume that the group G contains two gauged copies of SU(2) × U(1), $G \supset G_1 \times G_2 = [SU(2) \times U(1)]^2$, which are broken to their diagonal subgroup, the SM gauge group. Each $G_i$ is required to commute with a different subgroup $X_i$ of G. In this way each of them on its own preserves enough of the global symmetry to ensure that the Higgs remains an exact NGB. Only when we include both of the gauge groups can the Higgs acquire a mass term, and this contribution can be at most logarithmically divergent$^{55}$. To implement this, $X_i$ is chosen to be SU(3). This way G contains two different subgroups that are each of the form $G_i \times X_i$, and each $X_i$ contains an SU(2) × U(1) subgroup.

Breaking the symmetry

To obtain the required symmetry breaking, we observe that the SU(5) Lie algebra has $5^2 - 1 = 24$ generators. 14 of those are symmetric, 10 of those are antisymmetric. The latter are precisely the 10 generators of the SO(5) Lie algebra. To break the 14 symmetric generators the following VEV for the Σ field is chosen$^{56}$:

\[ \langle \Sigma \rangle = \Sigma_0 = \begin{pmatrix} & & 1_{2\times 2} \\ & 1 & \\ 1_{2\times 2} & & \end{pmatrix} \tag{11.1} \]

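Under a general SU(5) transformation $U = e^{i\theta_a T_a}$, with $T_a = \tfrac{1}{2}\lambda_a$, it transforms as $\Sigma \to U \Sigma U^T$. To see why this VEV produces the correct result we redefine the SU(5) generators $\lambda_a$ by introducing the following unitary symmetric matrix [25], in which the corner entries multiply $1_{2\times 2}$:

\[ A = \frac{1}{2} \begin{pmatrix} 1+i & 0 & 1-i \\ 0 & 2 & 0 \\ 1-i & 0 & 1+i \end{pmatrix} \tag{11.2} \]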

Direct calculation shows that this matrix satisfies $\Sigma_0 = A^2$ and $A = A^T$. Redefining the SU(5) generators as $X_a \equiv A \lambda_a A^{-1}$ $^{57}$ we deduce:

\[ X_a \Sigma_0 = A \lambda_a A^{-1} A^2 = A \lambda_a A = \pm (A \lambda_a A)^T = \pm (X_a \Sigma_0)^T = \pm \Sigma_0 X_a^T. \tag{11.3} \]

The plus sign corresponds to a symmetric generator $\lambda_a$ and the minus sign to an antisymmetric generator. Using this we can show that the unbroken and broken generators respectively satisfy the following relations:

\[ T_a \Sigma_0 + \Sigma_0 T_a^T = 0 \quad \text{(unbroken generators)} \tag{11.4} \]

\[ T_a \Sigma_0 - \Sigma_0 T_a^T = 0 \quad \text{(broken generators)} \tag{11.5} \]

55 After we have identified the Goldstone bosons this will be made more precise.

56 There is no physical reason to justify this choice besides that it produces the required result.

57 These satisfy the SU(5) Lie algebra, as can be verified by direct calculation.

Showing this proceeds similarly to what we did in section 6.3.2. An unbroken symmetry $O = \exp(i\theta_a X_a)$ preserves the vacuum: $O \Sigma_0 O^T = \Sigma_0$. Expanding around the identity then leads to the following condition on the generators:

\[ \Sigma_0 = (1 + i\theta_a X_a)\,\Sigma_0\,(1 + i\theta_a X_a^T) = \Sigma_0 + i\theta_a (X_a \Sigma_0 + \Sigma_0 X_a^T) + O(\theta_a^2) \tag{11.6} \]

which implies relation (11.4) for the 10 unbroken generators. The 14 remaining generators satisfy (11.5).
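The counting of 10 unbroken and 14 broken generators can be checked numerically. The following is a minimal sketch, not part of the original derivation: it builds $\Sigma_0$ and $A$ as 5 × 5 matrices (taking the third row of $A$ as $(1-i, 0, 1+i)$, which is what $A = A^T$ and $A^2 = \Sigma_0$ require), uses an unnormalized Hermitian basis of su(5) as a stand-in for the $\lambda_a$, and tests conditions (11.4) and (11.5) for $X_a = A\lambda_a A^{-1}$. All function names are made up for this illustration.

```python
from itertools import combinations

N = 5


def zeros():
    return [[0j] * N for _ in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(a):
    return [[a[j][i] for j in range(N)] for i in range(N)]


def dagger(a):
    return [[a[j][i].conjugate() for j in range(N)] for i in range(N)]


def combine(a, b, sign):
    """Return a + sign * b entry by entry."""
    return [[a[i][j] + sign * b[i][j] for j in range(N)] for i in range(N)]


def is_zero(a, tol=1e-12):
    return all(abs(x) < tol for row in a for x in row)


# Vacuum (11.1): unit entries on the anti-diagonal blocks of sizes 2, 1, 2.
SIGMA0 = zeros()
for row, col in [(0, 3), (1, 4), (2, 2), (3, 0), (4, 1)]:
    SIGMA0[row][col] = 1 + 0j

# Matrix A of (11.2); the corner entries multiply the 2x2 identity.
A = zeros()
for k in (0, 1):
    A[k][k] = A[k + 3][k + 3] = (1 + 1j) / 2
    A[k][k + 3] = A[k + 3][k] = (1 - 1j) / 2
A[2][2] = 1 + 0j


def check_a():
    """Return (A^2 == Sigma0, A == A^T, A A^dagger == 1) as booleans."""
    identity = zeros()
    for k in range(N):
        identity[k][k] = 1 + 0j
    return (is_zero(combine(matmul(A, A), SIGMA0, -1)),
            is_zero(combine(A, transpose(A), -1)),
            is_zero(combine(matmul(A, dagger(A)), identity, -1)))


def generators():
    """Unnormalized Hermitian basis of su(5): symmetric and antisymmetric."""
    symmetric, antisymmetric = [], []
    for i, j in combinations(range(N), 2):
        s, a = zeros(), zeros()
        s[i][j] = s[j][i] = 1 + 0j
        a[i][j], a[j][i] = -1j, 1j
        symmetric.append(s)
        antisymmetric.append(a)
    for k in range(1, N):  # traceless diagonal generators
        d = zeros()
        for m in range(k):
            d[m][m] = 1 + 0j
        d[k][k] = -k + 0j
        symmetric.append(d)
    return symmetric, antisymmetric


def classify(lam):
    """Classify X = A lam A^-1 through conditions (11.4) and (11.5)."""
    x = matmul(matmul(A, lam), dagger(A))  # A^-1 = A^dagger (see check_a)
    x_sigma = matmul(x, SIGMA0)
    sigma_xt = matmul(SIGMA0, transpose(x))
    if is_zero(combine(x_sigma, sigma_xt, +1)):
        return "unbroken"
    if is_zero(combine(x_sigma, sigma_xt, -1)):
        return "broken"
    return "neither"


if __name__ == "__main__":
    sym, anti = generators()
    print(check_a())
    print(len(sym), {classify(g) for g in sym})
    print(len(anti), {classify(g) for g in anti})
```

The normalization of the basis matrices plays no role here, since (11.4) and (11.5) are linear in the generator.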

The second requirement is also fulfilled by this VEV. It breaks the gauged subgroup $G \supset G_1 \times G_2 = SU(2)_1 \times U(1)_1 \times SU(2)_2 \times U(1)_2$ down to its diagonal subgroup $SU(2)_W \times U(1)_Y$. To see this we consider the generators of each $G_i$, which are defined as follows. For the first subgroup $G_1 = SU(2)_1 \times U(1)_1$:

\[ Q_1^a = \frac{1}{2} \begin{pmatrix} \tau^a & \\ & 0_{3\times 3} \end{pmatrix}, \qquad Y_1 = \frac{1}{10}\,\mathrm{diag}(-3,-3,2,2,2), \tag{11.7} \]

and similarly for $G_2 = SU(2)_2 \times U(1)_2$:

\[ Q_2^a = \frac{1}{2} \begin{pmatrix} 0_{3\times 3} & \\ & -\tau^{aT} \end{pmatrix}, \qquad Y_2 = \frac{1}{10}\,\mathrm{diag}(-2,-2,-2,3,3). \tag{11.8} \]

This way $G_1$ preserves a global SU(3) symmetry in the lower 3 × 3 block and $G_2$ in the upper 3 × 3 block. The $SU(2)_W \times U(1)_Y$ generators are then given by:

\[ Q^a = \frac{1}{\sqrt{2}} (Q_1^a + Q_2^a) \quad \text{and} \quad Y = Y_1 + Y_2 \tag{11.9} \]

These combinations of generators satisfy (11.4) and are thus left unbroken by the VEV. The orthogonal combinations

\[ Q'^a = \frac{1}{\sqrt{2}} (Q_1^a - Q_2^a) \quad \text{and} \quad Y' = Y_1 - Y_2 \tag{11.10} \]

satisfy (11.5) and are broken by the VEV. The unbroken combinations of generators correspond to the SM W and B bosons; they remain massless until EWSB occurs. The broken combinations are related to the new heavy W′ and B′, which acquire masses of order f when the symmetry is broken at the high scale.
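That the sums of the gauge generators are unbroken and the differences broken can be verified directly from (11.7), (11.8) and the vacuum (11.1). The sketch below does this numerically; it is an illustration under the conventions stated in the text, with the Pauli matrices used for $\tau^a$ and with helper names invented here. The overall factor $1/\sqrt{2}$ is dropped because the conditions are linear.

```python
N = 5

# Pauli matrices, used here for tau^a.
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def zeros():
    return [[0j] * N for _ in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(a):
    return [[a[j][i] for j in range(N)] for i in range(N)]


def combine(a, b, sign):
    """Return a + sign * b entry by entry."""
    return [[a[i][j] + sign * b[i][j] for j in range(N)] for i in range(N)]


def is_zero(a, tol=1e-12):
    return all(abs(x) < tol for row in a for x in row)


def diag(entries):
    m = zeros()
    for k, value in enumerate(entries):
        m[k][k] = complex(value)
    return m


# Vacuum (11.1).
SIGMA0 = zeros()
for row, col in [(0, 3), (1, 4), (2, 2), (3, 0), (4, 1)]:
    SIGMA0[row][col] = 1 + 0j


def q1(a):
    """Q_1^a of (11.7): tau^a / 2 in the upper-left 2x2 block."""
    m = zeros()
    for i in range(2):
        for j in range(2):
            m[i][j] = PAULI[a][i][j] / 2
    return m


def q2(a):
    """Q_2^a of (11.8): -tau^{aT} / 2 in the lower-right 2x2 block."""
    m = zeros()
    for i in range(2):
        for j in range(2):
            m[3 + i][3 + j] = -PAULI[a][j][i] / 2
    return m


Y1 = diag([-0.3, -0.3, 0.2, 0.2, 0.2])
Y2 = diag([-0.2, -0.2, -0.2, 0.3, 0.3])


def vacuum_condition(t, sign):
    """Return T Sigma0 + sign * Sigma0 T^T; sign=+1 is (11.4), -1 is (11.5)."""
    return combine(matmul(t, SIGMA0), matmul(SIGMA0, transpose(t)), sign)


def is_unbroken(t):
    return is_zero(vacuum_condition(t, +1))


def is_broken(t):
    return is_zero(vacuum_condition(t, -1))


if __name__ == "__main__":
    for a in range(3):
        print(a, is_unbroken(combine(q1(a), q2(a), +1)),
              is_broken(combine(q1(a), q2(a), -1)))
    print(is_unbroken(combine(Y1, Y2, +1)), is_broken(combine(Y1, Y2, -1)))
```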

Goldstone Bosons

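Breaking SU(5) → SO(5) results in 14 Goldstone bosons. We can parametrize them by expanding around the VEV $\Sigma_0$, i.e.

\[ \Sigma = e^{i\Pi/f}\, \Sigma_0\, e^{i\Pi^T/f} = e^{2i\Pi/f}\, \Sigma_0, \tag{11.11} \]

where the second equality follows from (11.5) and f is the high symmetry breaking scale of order ∼ 1 TeV. The full Goldstone matrix is given by

\[ \Pi = \pi^a T^a = \begin{pmatrix} \chi + \frac{\eta}{2\sqrt{5}} & \frac{h^*}{\sqrt{2}} & \phi^\dagger \\ \frac{h^T}{\sqrt{2}} & -\frac{2\eta}{\sqrt{5}} & \frac{h^\dagger}{\sqrt{2}} \\ \phi & \frac{h}{\sqrt{2}} & \chi^T + \frac{\eta}{2\sqrt{5}} \end{pmatrix}. \tag{11.12} \]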


Here η is a real scalar field, $\chi = \chi^a \tau^a/2$ a Hermitian 2 × 2 matrix, h is the SM complex Higgs doublet

\[ h = \begin{pmatrix} h^+ \\ h^0 \end{pmatrix} \]

and φ is a heavy complex SU(2) Higgs triplet given by

\[ \phi = \begin{pmatrix} \phi^{++} & \frac{\phi^+}{\sqrt{2}} \\ \frac{\phi^+}{\sqrt{2}} & \phi^0 \end{pmatrix}. \]

Note that, similar to the SU(3) based model, the Higgs is again arranged in such a way that neither of the two sets of SU(2) generators involves h. In this way the Higgs will always remain an NGB when we break into the SU(2) subgroups. Together, these fields account for the 14 degrees of freedom (1 + 3 + 4 + 6 real fields) and they transform under the unbroken $SU(2)_W \times U(1)_Y$ as:

\[ 1_0 \oplus 3_0 \oplus 2_{\pm 1/2} \oplus 3_{\pm 1} \tag{11.13} \]

where the bold number denotes the dimension of the SU(2) representation and the subscript the hypercharge.

Symmetries

Let us try to understand why this setup of the model succeeds by analyzing the symmetries. When we break SU(5) → SO(5), 14 Goldstone bosons appear, transforming as in (11.13). The first two sets, η and χ, will be eaten by the heavy gauge bosons when the gauged $[SU(2) \times U(1)]^2$ is broken to the SM electroweak group. The trick in this Little Higgs model is that, by introducing two sets of gauge couplings$^{58}$ $g_1, g'_1$ and $g_2, g'_2$, we let the two SU(2) groups in the opposite corners of the Σ field mix. Only when both sets of couplings are non-zero can the Higgs acquire a mass term. To see this, observe that each of the $G_i$ gauge groups commutes with a different SU(3) global symmetry subgroup of SU(5). Suppose that we only include the gauge couplings $g_2$ and $g'_2$. Then the global SU(5) symmetry is explicitly broken to SU(3) × SU(2) × U(1), where SU(3) acts on the first three indices and SU(2) on the last two. This is then spontaneously broken to the electroweak group, thus producing 8 exact NGBs, corresponding to the eaten η and χ and the four that make up the Higgs doublet h. An analogous argument holds when only $g_1$ and $g'_1$ are turned on, except that SU(3) then acts on the last three indices and SU(2) on the first two. Since h shifts under the SU(3) symmetry, a mass term $hh^\dagger$ is forbidden, and thus neither of the two gauged subgroups alone can generate a Higgs potential.

When, however, we include both sets of gauge couplings, enough of the global symmetry is broken to allow h to acquire a potential, which can be at most logarithmically divergent at one-loop level. In this case SU(5) is explicitly broken to the gauged subgroup $[SU(2) \times U(1)]^2$, which then spontaneously breaks to the electroweak group, producing only 4 exact Goldstone bosons corresponding to η and χ and thus making the Higgs doublet a pseudo-NGB. The SU(2) triplet φ, however, is not protected by the global symmetry and can pick up a potential which is quadratically divergent. This does not pose a problem, because below the 1 TeV scale the model only contains the Higgs doublet h.

We will get back to the potential for h after we have shown the cancellation of the quadratic divergences in the gauge sector and determined the masses of the new heavy gauge bosons.

11.2 The Gauge bosons

We will now determine the masses of the new heavy gauge bosons W′ and B′ and see that they cancel the quadratic divergences from the SM W and B bosons. As usual, we will do this by considering the kinetic part of the Lagrangian and taking the trace:

\[ \mathcal{L}_{kin} = \frac{f^2}{8}\, \mathrm{Tr}\, |D_\mu \Sigma|^2 \tag{11.14} \]

where the coefficient is chosen such that the resulting scalar kinetic terms are normalized and the covariant derivative is given by:

\[ D_\mu \Sigma = \partial_\mu \Sigma - i \sum_{j=1}^{2} \left[ g_j W^a_{\mu j} (Q^a_j \Sigma + \Sigma Q^{aT}_j) + g'_j B_{\mu j} (Y_j \Sigma + \Sigma Y^T_j) \right] \tag{11.15} \]

58 $g_1$ and $g_2$ are the couplings for the two SU(2)'s and $g'_1$ and $g'_2$ are the couplings for the two U(1)'s of $G_1$ and $G_2$ respectively.

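To find the masses of the heavy gauge bosons and the corresponding mass eigenstates we have to consider the terms in (11.14) that are quadratic in the gauge fields and substitute $\Sigma = \Sigma_0$. Then, ignoring $\partial_\mu$ and omitting the space-time index µ, (11.14) becomes:

\[
\begin{aligned}
\mathcal{L}_{kin}(\Sigma = \Sigma_0) &= \frac{f^2}{8}\, \mathrm{Tr}\, \Bigg| \frac{1}{2} g_1 W_1^a \begin{pmatrix} & & \tau^a \\ & 0 & \\ \tau^{aT} & & \end{pmatrix} - \frac{1}{2} g_2 W_2^a \begin{pmatrix} & & \tau^a \\ & 0 & \\ \tau^{aT} & & \end{pmatrix} \\
&\qquad + \frac{1}{10} g'_1 B_1 \begin{pmatrix} & & -1_{2\times 2} \\ & 4 & \\ -1_{2\times 2} & & \end{pmatrix} + \frac{1}{10} g'_2 B_2 \begin{pmatrix} & & 1_{2\times 2} \\ & -4 & \\ 1_{2\times 2} & & \end{pmatrix} \Bigg|^2 \\
&= \frac{f^2}{8} \left[ g_1^2 (W_1^a)^2 + g_2^2 (W_2^a)^2 - 2 g_1 g_2 W_1^a W_2^a + \frac{1}{5} g_1'^2 B_1^2 + \frac{1}{5} g_2'^2 B_2^2 - \frac{2}{5} g'_1 g'_2 B_1 B_2 \right]
\end{aligned} \tag{11.16}
\]

where we used $\tau_a^2 = 1_{2\times 2}$. Rewriting this in terms of the mass matrix we obtain: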

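\[ \mathcal{L}_{kin}(\Sigma = \Sigma_0) = \frac{1}{2} \frac{f^2}{4} \begin{pmatrix} W_1^a & W_2^a \end{pmatrix} \begin{pmatrix} g_1^2 & -g_1 g_2 \\ -g_1 g_2 & g_2^2 \end{pmatrix} \begin{pmatrix} W_1^a \\ W_2^a \end{pmatrix} + \frac{1}{2} \frac{f^2}{20} \begin{pmatrix} B_1 & B_2 \end{pmatrix} \begin{pmatrix} g_1'^2 & -g'_1 g'_2 \\ -g'_1 g'_2 & g_2'^2 \end{pmatrix} \begin{pmatrix} B_1 \\ B_2 \end{pmatrix} \tag{11.17} \]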

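Being familiar with this form of the mass matrix, we can immediately rewrite (11.17) in terms of the physical fields W, W′, B, B′ and read off their masses. Then we have$^{59}$:

\[ \mathcal{L}_{kin}(\Sigma = \Sigma_0) = \frac{1}{2} \frac{f^2}{4} \begin{pmatrix} W & W' \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & g_1^2 + g_2^2 \end{pmatrix} \begin{pmatrix} W \\ W' \end{pmatrix} + \frac{1}{2} \frac{f^2}{20} \begin{pmatrix} B & B' \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & g_1'^2 + g_2'^2 \end{pmatrix} \begin{pmatrix} B \\ B' \end{pmatrix} \tag{11.18} \]

where we define the physical fields in terms of the mixing angles s, s′, c, c′:

\[ \begin{aligned} W &= s W_1 + c W_2, & W' &= -c W_1 + s W_2, \\ B &= s' B_1 + c' B_2, & B' &= -c' B_1 + s' B_2, \end{aligned} \tag{11.19} \]

with

\[ s = \frac{g_2}{\sqrt{g_1^2 + g_2^2}}, \quad c = \frac{g_1}{\sqrt{g_1^2 + g_2^2}}, \quad s' = \frac{g'_2}{\sqrt{g_1'^2 + g_2'^2}}, \quad c' = \frac{g'_1}{\sqrt{g_1'^2 + g_2'^2}}. \]

W and B thus remain massless and are identified with the SM gauge bosons. W′ and B′ are the new heavy gauge bosons and have masses at the TeV scale:

\[ M^2_{W'} = \frac{f^2}{4} (g_1^2 + g_2^2) \quad \text{and} \quad M^2_{B'} = \frac{f^2}{20} (g_1'^2 + g_2'^2) \]

59 Note that I omitted the superscript a for clarity.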
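The diagonalization leading from (11.17) to (11.18) can be made explicit in a few lines. The sketch below rotates the 2 × 2 mass matrix with the mixing angles of (11.19) and evaluates the heavy masses; the function names and the numerical coupling values are illustrative assumptions and are not taken from the text.

```python
import math


def mixing(g1, g2):
    """Return (s, c) as defined below (11.19)."""
    norm = math.hypot(g1, g2)
    return g2 / norm, g1 / norm


def mass_matrix(g1, g2):
    """The 2x2 matrix appearing in (11.17), without the factor f^2/4 or f^2/20."""
    return [[g1 * g1, -g1 * g2], [-g1 * g2, g2 * g2]]


def rotate(matrix, s, c):
    """Return R M R^T with R = [[s, c], [-c, s]], i.e. in the (light, heavy) basis."""
    r = [[s, c], [-c, s]]
    rm = [[sum(r[i][k] * matrix[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return [[sum(rm[i][k] * r[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


def heavy_masses(f, g1, g2, g1_prime, g2_prime):
    """Return (M_W', M_B') in the units of f."""
    m_w = f / 2 * math.hypot(g1, g2)
    m_b = f / math.sqrt(20) * math.hypot(g1_prime, g2_prime)
    return m_w, m_b


if __name__ == "__main__":
    g1, g2 = 0.7, 1.1  # illustrative couplings, not values from the text
    print(rotate(mass_matrix(g1, g2), *mixing(g1, g2)))
    print(heavy_masses(1000.0, g1, g2, 0.3, 0.4))  # f = 1 TeV, in GeV
```

The rotated matrix has a vanishing first row and column, which is the statement that W (and likewise B) stays massless at this stage.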

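To show the desired cancellation of the quadratic divergences we have to rewrite (11.16) in terms of the physical gauge bosons. For this we expand the Σ field around its VEV. When gauging away η and χ we get the following expansion for (11.11) in powers of 1/f:

\[ \Sigma = \Sigma_0 + \frac{2i}{f} \begin{pmatrix} \phi^\dagger & \frac{h^*}{\sqrt{2}} & 0 \\ \frac{h^\dagger}{\sqrt{2}} & 0 & \frac{h^T}{\sqrt{2}} \\ 0 & \frac{h}{\sqrt{2}} & \phi \end{pmatrix} + O\!\left(\frac{1}{f^2}\right) \tag{11.20} \]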

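Substituting this in (11.14) and using (11.19) results in the following expression for the couplings of the gauge bosons to two scalars (h and φ) [21]. For the W, W′ bosons:

\[
\begin{aligned}
\mathcal{L}_{kin}(W \cdot W) = {} & \frac{g^2}{4} \left[ W^a W^b - \frac{c^2 - s^2}{sc} W^a W'^b \right] \mathrm{Tr}\!\left[ h^\dagger h\, \delta^{ab} + 2\phi^\dagger \phi\, \delta^{ab} + 2\sigma^a \phi^\dagger \sigma^{bT} \phi \right] \\
& - \frac{g^2}{4} \left[ W'^a W'^a\, \mathrm{Tr}\!\left[ h^\dagger h + 2\phi^\dagger \phi \right] - \frac{c^4 + s^4}{2 s^2 c^2} W'^a W'^b\, \mathrm{Tr}\!\left[ 2\sigma^a \phi^\dagger \sigma^{bT} \phi \right] \right]
\end{aligned} \tag{11.21}
\]

and for the B, B′ bosons:

\[
\begin{aligned}
\mathcal{L}_{kin}(B \cdot B) = {} & g'^2 \left[ B^2 - \frac{c'^2 - s'^2}{s'c'} B B' \right] \mathrm{Tr}\!\left[ \frac{1}{4} h^\dagger h + \phi^\dagger \phi \right] \\
& - g'^2 \left[ B'^2\, \mathrm{Tr}\!\left[ \frac{1}{4} h^\dagger h \right] - \frac{(c'^2 + s'^2)^2}{4 s'^2 c'^2} B'^2\, \mathrm{Tr}\!\left[ \phi^\dagger \phi \right] \right]
\end{aligned} \tag{11.22}
\]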

These two expressions show that the quadratic divergences from the SM W and B bosons are indeed cancelled by the new heavy W′ and B′ bosons: W and W′ couple to the Higgs field with equal strength but opposite sign, and the same holds for B and B′. The contributions to the Higgs mass that remain uncancelled at one-loop order are those that include both the light and the heavy gauge boson. The only possible diagram is the one displayed in figure 7, which is logarithmically divergent.

11.3 The Quartic Higgs potential and Higgs mass

When analyzing the symmetries of the model we already mentioned that under the two SU(3) symmetries the SM Higgs doublet shifts, thereby forbidding a mass term $hh^\dagger$ and a potential at tree level. The heavy Higgs triplet φ is not protected by the symmetry that protects h and can pick up a quadratically divergent mass. Here we will make this a bit more precise.

The Coleman-Weinberg potential

Since we are unable to add a potential at tree level, the only option left is to generate the potential at one-loop level as a correction to interactions of the Higgs with the gauge bosons and fermions.


Figure 7: The logarithmically divergent contribution to the Higgs mass that has the light as

well as the heavy gauge boson in the loop.

These interactions explicitly break all the global symmetries that forbid the presence of a tree-level Higgs potential. Such a potential that is generated at loop level, but absent at tree level, is called a Coleman-Weinberg (CW) potential. The most important part of this potential can be parametrized as [21]:

\[ V = \lambda_{\phi^2} f^2\, \mathrm{Tr}(\phi^\dagger \phi) + i \lambda_{h\phi h} f \left( h \phi^\dagger h^T - h^* \phi h^\dagger \right) - \mu^2 h h^\dagger + \lambda_{h^4} (h h^\dagger)^2 \tag{11.23} \]

Quartic terms involving $\phi^4$ and $h^2\phi^2$ are not included because their contributions are small. There are quadratically divergent contributions to this potential, which are cut off at a scale Λ ∼ 4πf ∼ 10 TeV. They come from the gauge bosons as well as the fermions. For the gauge bosons this quadratically divergent contribution to the CW potential is [21]

\[ \mathcal{L}_c = c\, g_j^2 f^4 \sum_a \mathrm{Tr}\!\left[ (Q_j^a \Sigma)(Q_j^a \Sigma)^* \right] + c\, g_j'^2 f^4\, \mathrm{Tr}\!\left[ (Y_j \Sigma)(Y_j \Sigma)^* \right] \tag{11.24} \]

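The relevant parts of this potential can be found by expanding the Lagrangian (11.24) in terms of the fields h and φ and considering their global symmetry transformation properties. The $G_1$ gauge interactions leave the $SU(3)_1$ symmetry invariant, under which h and φ transform as:

\[ h_i \to h_i + f\varepsilon_i + \ldots, \qquad \phi_{ij} \to \phi_{ij} - i(\varepsilon_i h_j + \varepsilon_j h_i) \]

and similarly under $SU(3)_2$, left invariant by the $G_2$ gauge interactions, they transform as:

\[ h_i \to h_i + f\eta_i + \ldots, \qquad \phi_{ij} \to \phi_{ij} + i(\eta_i h_j + \eta_j h_i) \]

In the presence of both sets of gauge interactions the term that is left invariant under each of these transformations is, for the corresponding choice of sign:

\[ \left| \phi_{ij} \pm \frac{i}{2f} (h_i h_j + h_j h_i) \right|^2 \]

Expanding this and substituting the expression in (11.24) yields:

\[
\begin{aligned}
\mathcal{L}_c = {} & \frac{c}{2} (g_1^2 + g_1'^2) \left[ f^2\, \mathrm{Tr}(\phi^\dagger \phi) - \frac{if}{2} \left( h^T \phi^\dagger h - h^\dagger \phi h^* \right) + \frac{1}{4} (h h^\dagger)^2 + \ldots \right] \\
& + \frac{c}{2} (g_2^2 + g_2'^2) \left[ f^2\, \mathrm{Tr}(\phi^\dagger \phi) + \frac{if}{2} \left( h^T \phi^\dagger h - h^\dagger \phi h^* \right) + \frac{1}{4} (h h^\dagger)^2 + \ldots \right]
\end{aligned} \tag{11.25}
\]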


Remark. The same expression can be found when expanding Σ to quadratic order in φ and quartic order in h. Note further that in this expression the first term is $SU(3)_2$ invariant, whilst the second is $SU(3)_1$ invariant, which can be seen from the gauge couplings.

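The other quadratically divergent contribution to the one-loop CW potential comes from the fermion loops$^{60}$ and is given by [21]:

\[ \mathcal{L}_{c'} = -\frac{c'}{2} \lambda_f^2 f^4\, \varepsilon^{wx} \varepsilon_{yz} \varepsilon^{ijk} \varepsilon_{kmn}\, \Sigma_{iw} \Sigma_{jx} \Sigma^{*my} \Sigma^{*nz} + \ldots \]

where i, j, k, m, n = 1, 2, 3 and w, x, y, z = 4, 5. This expression is thus $SU(3)_1$ invariant and must therefore have the same form as the second term in (11.25), hence:

\[ \mathcal{L}_{c'} = -\frac{c'}{2} \lambda_1^2 \left[ f^2\, \mathrm{Tr}(\phi^\dagger \phi) + \frac{if}{2} \left( h^T \phi^\dagger h - h^\dagger \phi h^* \right) + \frac{1}{4} (h h^\dagger)^2 + \ldots \right] \tag{11.26} \]

Note that no Higgs mass term is present in either of the contributions (11.25) and (11.26), as a consequence of the collective symmetry breaking. However, there is a mass term for φ of order f:

\[ M_\phi^2 = \left( c\,(g_1^2 + g_1'^2 + g_2^2 + g_2'^2) - c' \lambda_1^2 \right) f^2 = 2 \lambda_{\phi^2} f^2 \]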

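We can further use the expressions (11.25) and (11.26) to determine the coefficients in (11.23). Explicitly:

\[
\begin{aligned}
\lambda_{\phi^2} &= \frac{c}{2} (g_1^2 + g_1'^2 + g_2^2 + g_2'^2) - \frac{c'}{2} \lambda_1^2 \\
\lambda_{h\phi h} &= -\frac{c}{4} (-g_1^2 - g_1'^2 + g_2^2 + g_2'^2) - \frac{c'}{4} \lambda_1^2 \\
\lambda_{h^4} &= \frac{c}{8} (g_1^2 + g_1'^2 + g_2^2 + g_2'^2) - \frac{c'}{8} \lambda_1^2 = \frac{1}{4} \lambda_{\phi^2}
\end{aligned} \tag{11.27}
\]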

The Higgs quartic potential

Electroweak symmetry breaking at the scale v is only possible if the parameter $\lambda_{\phi^2} > 0$. Otherwise a VEV of order f for the triplet φ will be generated, causing the electroweak symmetry to be broken at the higher scale f. Note that:

\[ \lambda_{\phi^2} > 0 \iff c\,(g_1^2 + g_1'^2 + g_2^2 + g_2'^2) - c' \lambda_1^2 > 0 \]

For energies below the mass of φ we can integrate φ out by calculating its equation of motion and substituting the solution back into the expression for the total potential, i.e. $\mathcal{L}_c + \mathcal{L}_{c'}$. Thus we differentiate the total potential with respect to φ, set the result to zero, and solve for φ. Doing so we find:

\[ \left[ c\,(g_1^2 + g_1'^2) - c' \lambda_f^2 \right] \left( \phi_{ij} + \frac{i}{f} h_i h_j \right) + \left[ c\,(g_2^2 + g_2'^2) - c' \lambda_f^2 \right] \left( \phi_{ij} - \frac{i}{f} h_i h_j \right) = 0 \]

Substituting the solution for φ in the total potential results in the following expression for the coefficient λ of the Higgs quartic potential $\lambda (hh^\dagger)^2$:

\[ \lambda = c\, \frac{\left( g_1^2 + g_1'^2 - \frac{c'}{c} \lambda_1^2 \right) \left( g_2^2 + g_2'^2 \right)}{g_1^2 + g_1'^2 + g_2^2 + g_2'^2 - \frac{c'}{c} \lambda_1^2} \]

60 In particular the top loop, since the other fermions have small Yukawa couplings.


This potential clearly reflects that a quartic Higgs potential can only be generated in the presence of both sets of couplings. Turning either set of gauge couplings off yields λ = 0 and no Higgs potential is generated. Any contribution to the Higgs mass parameter at one-loop level can thus at most be logarithmically divergent. A few remarks about the µ parameter have to be made though [21]. As in the Standard Model it is to be seen as a free parameter. At one-loop order there are no quadratically divergent contributions, only logarithmically divergent contributions of order $f^2 \log(\Lambda^2/f^2)/(4\pi)^2$, implying a natural hierarchy between the electroweak scale and the TeV scale ∼ f. The first quadratically divergent contribution to the parameter arises at two-loop order and is of order $\Lambda^2/(4\pi)^4$. This two-loop contribution could however be as large as the one-loop log contribution. The Higgs mass parameter $\mu^2$ should therefore be treated as a new free parameter $\mu^2 \sim f^2/(4\pi)^2$.
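To get a feeling for the two estimates quoted above, one can simply evaluate them. The sketch below assumes f = 1 TeV and Λ = 4πf as in the text and sets all order-one coefficients to 1, so it only indicates orders of magnitude; the function names are invented for this illustration.

```python
import math

LOOP_FACTOR = (4 * math.pi) ** 2


def one_loop_log(f, cutoff):
    """f^2 log(cutoff^2 / f^2) / (4 pi)^2, in the squared units of f."""
    return f ** 2 * math.log(cutoff ** 2 / f ** 2) / LOOP_FACTOR


def two_loop_quadratic(cutoff):
    """cutoff^2 / (4 pi)^4, in the squared units of the cutoff."""
    return cutoff ** 2 / LOOP_FACTOR ** 2


if __name__ == "__main__":
    f = 1000.0                # GeV, the scale f ~ 1 TeV used in the text
    cutoff = 4 * math.pi * f  # GeV, the cut-off 4 pi f
    print("one-loop log term:  mu ~", math.sqrt(one_loop_log(f, cutoff)), "GeV")
    print("two-loop quadratic: mu ~", math.sqrt(two_loop_quadratic(cutoff)), "GeV")
```

With Λ = 4πf the two-loop term equals $f^2/(4\pi)^2$ exactly and the one-loop term exceeds it only by the logarithm $\log(4\pi)^2$, so both correspond to µ of the order of 100 GeV, which is why µ² is treated as a free parameter of that size.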

The Higgs mass

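For $\mu^2 > 0$ $^{61}$ the Coleman-Weinberg potential will trigger EWSB. We assume the following VEVs for the Higgs fields h and φ:

\[ \langle h \rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v \end{pmatrix} \quad \text{and} \quad \langle \phi \rangle = \begin{pmatrix} 0 & 0 \\ 0 & -iv' \end{pmatrix} \tag{11.28} \]

Substitution of these VEVs in (11.23) yields:

\[ \lambda_{\phi^2} f^2 v'^2 - \lambda_{h\phi h} f v^2 v' - \mu^2 \frac{v^2}{2} + \lambda_{h^4} \frac{v^4}{4} \]

and by minimizing this with respect to $v^2$ and v′ we find:

\[ v^2 = \frac{\mu^2}{\lambda_{h^4} - \lambda_{h\phi h}^2/\lambda_{\phi^2}} \quad \text{and} \quad v' = \frac{\lambda_{h\phi h} v^2}{2 \lambda_{\phi^2} f} \]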

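The mass of the heavy Higgs triplet is found to be

\[ M_\phi^2 = 2 \lambda_{\phi^2} f^2 \]

where the constants are given in (11.27). As with the tree-level quartic Higgs potential, the tree-level mass term for h will appear once φ is integrated out. Re-expressing $v' = \lambda_{h\phi h} (v+H)^2 / (2 \lambda_{\phi^2} f)$, where now H is the mass eigenstate, and parametrizing h and φ as:

\[ \langle h \rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v + H \end{pmatrix} \quad \text{and} \quad \langle \phi \rangle = \begin{pmatrix} 0 & 0 \\ 0 & -iv' \end{pmatrix} \tag{11.29} \]

results in the following mass for the Higgs doublet [21]:

\[ M_H^2 = 2 v^2 \left( \lambda_{h^4} - \lambda_{h\phi h}^2/\lambda_{\phi^2} \right) = 2\mu^2 \]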
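The minimization and the Higgs mass formula can be cross-checked numerically from the potential obtained by inserting (11.28) into (11.23). The sketch below evaluates the gradient of that potential at the quoted minimum and obtains the light mass from the Hessian by eliminating the heavy v′ direction (a Schur complement, which is one way of integrating φ out at tree level). The parameter values in the usage section are illustrative assumptions, chosen with $\lambda_{h^4} = \lambda_{\phi^2}/4$ as in (11.27); they are not values from the text.

```python
def minimum(f, lam_phi2, lam_hphih, lam_h4, mu2):
    """Closed-form minimum (v, v') quoted in the text."""
    lam_eff = lam_h4 - lam_hphih ** 2 / lam_phi2
    v_squared = mu2 / lam_eff
    v_prime = lam_hphih * v_squared / (2 * lam_phi2 * f)
    return v_squared ** 0.5, v_prime


def gradient(v, v_prime, f, lam_phi2, lam_hphih, lam_h4, mu2):
    """(dV/dv, dV/dv') of the potential with the VEVs (11.28) inserted."""
    d_v = -2 * lam_hphih * f * v * v_prime - mu2 * v + lam_h4 * v ** 3
    d_vp = 2 * lam_phi2 * f ** 2 * v_prime - lam_hphih * f * v ** 2
    return d_v, d_vp


def light_mass_squared(v, v_prime, f, lam_phi2, lam_hphih, lam_h4, mu2):
    """V_vv - V_vv'^2 / V_v'v': curvature along v after eliminating v'."""
    v_vv = -2 * lam_hphih * f * v_prime - mu2 + 3 * lam_h4 * v ** 2
    v_vvp = -2 * lam_hphih * f * v
    v_vpvp = 2 * lam_phi2 * f ** 2
    return v_vv - v_vvp ** 2 / v_vpvp


if __name__ == "__main__":
    # Illustrative parameters: f in GeV, mu^2 in GeV^2, couplings dimensionless.
    params = dict(f=1000.0, lam_phi2=1.0, lam_hphih=0.2, lam_h4=0.25, mu2=1.0e4)
    v, v_prime = minimum(**params)
    print("v, v' =", v, v_prime)
    print("gradient =", gradient(v, v_prime, **params))
    print("M_H^2 =", light_mass_squared(v, v_prime, **params),
          "vs 2 mu^2 =", 2 * params["mu2"])
```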

11.4 Viability of ’Littlest Higgs’ and signatures in experiment

In the previous sections we discussed the ”Littlest Higgs”. I discussed the group-theoretical setup and showed that the quadratically divergent Standard Model W and B loops are cancelled by the new heavy W′

61 Note the relative minus sign in comparison to earlier calculations.


and B′ gauge bosons with masses at the TeV scale. Unlike the SU(3) based model, this model was indeed able to generate a quartic potential that included both sets of couplings. Any contribution to the Higgs mass at one-loop order must therefore be at most logarithmically divergent. What we did not discuss were the fermion sector, electroweak symmetry breaking and the resulting masses for the gauge bosons. I will conclude the discussion of this model by saying a few words about the latter, because it turns out to lead to some problems and to reintroduce the hierarchy problem. In [21] the explicit masses for the light and heavy gauge bosons after EWSB can be found. Additional mixing of the light and heavy gauge bosons causes the masses of the Standard Model W and Z to gain corrections of $O(v^2/f^2)$. Recall now, however, the relation (7.31) between the Weinberg angle and the masses of the Standard Model W and Z bosons. If we substitute the masses as found in [21], this also gives an additional contribution to the Weinberg angle of order $v^2/f^2$. Explicitly:

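\[
\begin{aligned}
\frac{M_W^2}{M_Z^2} &= \cos^2(\theta_W) \left[ 1 + \frac{v^2}{f^2} \frac{5}{4} (c'^2 - s'^2)^2 - 4 \frac{v'^2}{v^2} \right] && (11.30) \\
&= \cos^2(\theta_W) \left[ 1 + \frac{v^2}{f^2} \frac{5}{4} \left( \frac{g_1'^2 - g_2'^2}{g_1'^2 + g_2'^2} \right)^2 - 4 \frac{v'^2}{v^2} \right] && (11.31)
\end{aligned}
\]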

Electroweak precision measurements force f > 4 TeV, which reintroduces the hierarchy problem! Observe, though, that this contribution vanishes entirely in the case of $g'_1 = g'_2$ and v′ = 0. Fortunately, a solution was found by imposing an additional discrete symmetry on the model, called T-parity. Particles can be either T-even or T-odd, and applying the operator T maps them to plus or minus themselves respectively. The assignment is such that all SM particles can be chosen T-even and all other particles T-odd. It turns out that when this symmetry is imposed on the model, it also imposes constraints on the parameters which indeed set $g'_1 = g'_2$ and v′ = 0. Another nice feature of the model with T-parity is that the lightest T-odd particle it contains could be a candidate for Dark Matter. In particular, this is the heavy B′, called the 'heavy photon' [22]. Very precise measurements have further been made by several experiments, and imposing T-parity makes the model highly consistent with electroweak precision data. All of these features make the Littlest Higgs with T-parity a very compelling theory to describe physics up to the cut-off scale of Λ ∼ 10 TeV. Above this scale it still remains unclear what new physics we can expect. Various possibilities have been proposed, one of them being supersymmetry, all of which are broken at scales high enough not to be in conflict with experiments.

In the end it will only be experiment that can tell whether Little Higgs models are correct as effective theories. Recently the CMS collaboration published a paper on their analysis of LHC data with a center of mass energy of 8 TeV. They searched for signs of heavy W′ and Z′ bosons that decay into the Standard Model Higgs and a W or Z [23]. Unfortunately they have not yet found any signs of decaying heavy W′ and Z′ bosons. At 95% confidence level they excluded masses for W′ in the interval [1.0, 1.6] TeV and for Z′ in the intervals [1.0, 1.1] and [1.3, 1.5] TeV. However, with its latest upgrade the LHC can now reach a beam energy of 6.5 TeV (13 TeV in total), so it should become clear in the near future whether Little Higgs theories have their place in nature.


12 Summary

Here I will summarize what we have seen in each section. I divided the summary into two parts: part I focuses on the mathematics and part II on the physics.

12.1 Part I

We set out to determine the irreducible representations of GL(V) ≅ GL(n,C). For this we observed that GL(V) as well as the symmetric group act on the space $V^{\otimes n}$ and that their actions commute. Therefore, we started by determining the irreducible representations of the symmetric group $S_n$, since, unlike for other groups, its conjugacy classes are in bijection with the partitions λ of n. We first showed that we could visualize the partitions with Young diagrams and introduced the Young tableaux. Then we constructed an idempotent element $c_\lambda$, called the Young symmetrizer. We proved that, by letting λ vary over the partitions, these elements form a mutually orthogonal set of central, primitive idempotents and associated to each $c_\lambda$ an irreducible representation $V_\lambda = C[S_n]c_\lambda$, called the Specht module. Letting λ vary over the partitions then gave us all irreducible representations. I then discussed a few more results about these Specht modules. I discussed Young's rule and introduced the Kostka numbers, which give the multiplicities of the different Specht modules appearing in its decomposition. I concluded with the hook length formula to compute the dimension of the representation.

In Section 4 I discussed the irreducible representations of GL(V). We proved that many of its irreducible representations could be obtained by using the same Young symmetrizer. We denoted these representations by $S_\lambda V$; they could be obtained by computing $c_\lambda V^{\otimes n}$. A second important result from this section was that the irreducible characters of these irreducible representations were identified with the Schur polynomials, certain symmetric polynomials. This would make it possible to make the branching rules in section 5 concrete by using known identities between these polynomials. We further saw a formula to compute the dimension of the irreducible representations and proved that in the decomposition of $V^{\otimes n}$ each $S_\lambda V$ occurs with a multiplicity given by the dimension of the corresponding Specht module.

Section 5 discussed some branching rules. We reformulated the problem in terms of characters and compared these to identities between Schur polynomials that can be found in appendix A. I further gave some explicit examples of branching patterns in terms of Young diagrams. Paragraph 5.2 discussed how all of the obtained results also hold for SU(n), the special unitary group. For this we used some results from Lie theory. We argued that irreducible representations of a Lie group can be differentiated to yield irreducible representations of its Lie algebra, and conversely that an irreducible representation of the Lie algebra can be integrated to an irreducible representation of the Lie group. The second observation was that (in)equivalent irreducible representations of the Lie algebra gl(n) can be restricted to yield (in)equivalent irreducible representations of the Lie algebra su(n). These two arguments completed the statement that irreducible representations of GL(n) yield irreducible representations of SU(n). That is, the irreducible representations of SU(n) can be visualized by Young diagrams with columns of at most length n, we have a formula to compute their dimensions, and we further have three branching rules.


12.2 Part II

Section 6 served as an introduction to the Lagrangian formalism in field theories. I discussed some examples of field theoretic Lagrangians and introduced spontaneous symmetry breaking. We derived a condition on the group generators and saw that generators that did not leave the vacuum invariant were broken generators. Then we did some examples on the spontaneous breaking of a global symmetry and saw that this led to the appearance of massless particles, called Nambu-Goldstone particles, that corresponded to the broken generators. Then, chapter 7 discussed the Higgs mechanism. We considered local symmetries, introduced gauge fields and the covariant derivative, and saw that by changing to U-gauge we could remove the massless Goldstone particles from the particle spectrum. They became the longitudinal degrees of freedom of the gauge bosons, which thereby became massive. The fermions acquired masses as a result of a constant resistance against the Higgs field. Then I introduced the hierarchy problem and quantum corrections to the Higgs mass in section 8. With a short calculation I showed that an unacceptable amount of fine-tuning was needed to keep the Higgs mass at around the EWSB scale. I discussed the related Little Hierarchy problem, stating that new physics should appear not far above the TeV scale, for else it would reintroduce a hierarchy problem.

Sections 9 and 11 focused on two Little Higgs models as a solution to the Little Hierarchy problem. Little Higgs models postulate the Higgs as an NGB of an approximate global symmetry. To let the Higgs acquire a mass term we then explicitly broke the symmetry, but only broke it collectively, meaning that at least two sets of couplings must be nonzero. This way divergent contributions to its mass could be at most logarithmically divergent. The first model we looked at was an SU(3) based model in which the Standard Model $SU(2)_W$ was embedded in $SU(3)_W$. It turned out to have some shortcomings though, and for that reason we looked at the Littlest Higgs, based on a global SU(5) symmetry, in section 11. In the preceding section 10 I related the branching rules for SU(N) as derived in the first part to the transformation properties of the elementary Standard Model particles and showed they could be embedded in the 5 lowest dimensional SU(5) representations.

12.3 Acknowledgements

Lastly, I would like to thank my two supervisors, Eric Laenen and Jasper Stokman, for their support and motivation. They were always available within a couple of days, and their explanations of subjects I found difficult were always helpful. I have worked on these subjects with much pleasure and enthusiasm, and they have given me many new insights.


13 Popular summary (Dutch)

Symmetriebreken is het begrip wat centraal staat in mijn scriptie. De twee onderwerpen die behandeld

worden zijn ”vertakkingsregels” en ”Little Higgs” modellen. Vertakkingsregels vallen onder een tak van

de wiskunde die de representatietheorie wordt genoemd. De representatietheorie bestudeerd zogeheten

symmetrie structuren en vertakkingsregels beschrijven het fenomeen van symmetriebreken.

Ten tweede heb ik Little Higgs modellen bestudeerd. De ontdekking van het Higgs deeltje was erg

belangrijk want het bewees dat het Higgs mechanisme, verantwoordelijk voor de massa van deeltjes,

correct was. Het enige probleem met het Higgs deeltje is echter dat zijn massa gevoelig is voor het

zogeheten hierarchie probleem. Hier zal ik een schets geven van dit hierachie probleem inhoudt, hoe

het gerelateerd is aan de Higgs massa en hoe Little Higgs modellen een mogelijke oplossing zijn.

Vertakkingregels

Laat ik beginnen met uitleggen wat de vertakkingsregels inhouden. Hiervoor moeten we eerst weten

wat we bedoelen met symmetrien en symmetrie groepen. We nemen als voorbeeld de groep die de

symmetrien van een driehoek beschrijft. Deze groep wordt de dihedrale groep, D3 genoemd. Een

symmetrie is hierbij gedefineerd als een werking die je op de driehoek kan uitvoeren die de driehoek

onveranderd laat. Je gaat gemakkelijk na dat er 6 zulke symmetrie werkingen zijn. De eenheidswerking

e, rotaties r1 en r2 om resp. 120 en 240 en spiegelingen s1, s2 en s3 om de drie spiegelings-assen. Twee

Figure 8: De 6 verschillende symmetrien van een driehoek.

van deze werkingen na elkaar uitgevoerd vormen ook een symmetrie-transformatie van de driehoek die

in de groep zit. Als laatste is er nog een eenheidselement. Hier is dat het element e.

Definition 13.1. Een groep is een verzameling G voorzien van een bewerking en van een eenheid-

selement 1 zodat voldaan is aan bepaalde rekenregels.

De verzameling e, r1, r2, s1, s2, s3 en de groepsoperatie vormen zo een groep, waarbinnen we kunnen

rekenen. Een ondergroep H van G definieren we als volgt.

Definition 13.2. Een deelverzameling H van G noemen we een ondergroep van G als H met de

bewerking van G en hetzelfde eenheidselement een groep vormt.

Nu was dit kleine groep met eindig veel elementen. Het is een discrete symmetrie en je kan gemakkelijk

rekenen met deze groep. Dit rekenen wordt echter snel ingewikkelder als de symmetrie-groepen in-

96

gewikkelder worden. Hiervoor gebruiken we de representatietheorie. In de representatietheorie worden

alle groepselementen gezien als lineaire transformaties tussen vectorruimten.

Definition 13.3. Een verzameling V van vectoren voorzien van optelling en scalaire vermenigvuldiging

heet een vectorruimte als voldaan is aan bepaalde reken eigenschappen.

Such a transformation between vector spaces is represented by a matrix. As an example, consider the 3-dimensional representation of the group S3. This is the group of permutations of 3 letters, with elements e, (12), (23), (13), (123), (132). In this notation (123) means that 1 → 2, 2 → 3 and 3 → 1. In representation theory all group elements are represented by matrices that send vectors in a vector space V to new vectors. A 3-dimensional vector space can be pictured as a coordinate system, and we write a vector as

\[ \vec{x} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} = x\,\vec{e}_1 + y\,\vec{e}_2 + z\,\vec{e}_3 , \]

with $\vec{e}_i$ the unit vectors of length 1. The group S3 then acts on a vector by permuting the indices. The element g = (12), for example, interchanges the indices 1 and 2 and keeps 3 fixed. A representation of this element could then be the matrix

\[ \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]

Then:

\[ \vec{x} \to \vec{x}' = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \cdot \vec{x} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} y \\ x \\ z \end{pmatrix} \]

In the same way the other elements can be represented by matrices. The map that sends all elements to a set of matrices is called the representation, and when it is clear which map is meant we only record the dimension of the vector space on which the elements are represented. In the case above this is the 3-dimensional representation. It may seem as though information is lost by writing the group elements in this way, but that is not the case: all information contained in the group is preserved. Finally, we need the notions of a subrepresentation and an irreducible representation. We speak of a subrepresentation if there is a subspace of V that is left invariant under the action of the group. In the example above this is, for instance, the subspace spanned by $\vec{v} = \vec{e}_1 + \vec{e}_2 + \vec{e}_3$: permuting the indices clearly has no effect on it. A representation is called irreducible if its only subrepresentations are the space V itself and 0. In a certain sense these irreducible representations can be regarded as atoms, because every representation can be expressed in terms of irreducible representations. It is therefore the irreducible representations of a group that mathematicians are interested in.
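The permutation representation of S3 described above can be checked explicitly. The following Python sketch (the function names are illustrative and not taken from the thesis) builds the 3 × 3 permutation matrix of every element, verifies that matrix multiplication reproduces the composition of permutations, and confirms that the vector e1 + e2 + e3 is left fixed by every element.

```python
from itertools import permutations

def perm_matrix(p):
    """Matrix that sends the basis vector e_j to e_{p[j]} (indices start at 0)."""
    n = len(p)
    return [[1 if p[j] == i else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def act(m, v):
    """Apply the matrix m to the column vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def compose(p, q):
    """Permutation obtained by first applying q and then p."""
    return tuple(p[q[j]] for j in range(len(q)))

S3 = list(permutations(range(3)))  # the six elements of S3

# The transposition (12) swaps the first two coordinates: (x, y, z) -> (y, x, z).
swap12 = (1, 0, 2)
swapped = act(perm_matrix(swap12), [1, 2, 3])

# A representation must respect the group multiplication.
is_homomorphism = all(
    matmul(perm_matrix(p), perm_matrix(q)) == perm_matrix(compose(p, q))
    for p in S3 for q in S3
)

# The line spanned by e1 + e2 + e3 is an invariant subspace.
fixes_diagonal = all(act(perm_matrix(p), [1, 1, 1]) == [1, 1, 1] for p in S3)

print(swapped, is_homomorphism, fixes_diagonal)
```

Because the diagonal line is invariant, the 3-dimensional representation is not irreducible: it contains a 1-dimensional subrepresentation.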

We now have everything needed to explain branching rules. These describe the phenomenon of symmetry breaking, in which the symmetry group of a system is reduced to a smaller symmetry group. In general an irreducible representation does not remain irreducible when it is restricted to the smaller group. What we want to know in such a case is how an irreducible representation of the large symmetry group splits into irreducible representations of the smaller symmetry group. In my thesis I was particularly interested in the phenomenon of spontaneous symmetry breaking. To picture spontaneous symmetry breaking, think of a pencil balanced on its tip. This is a symmetric state. Eventually, however, the pencil will fall over, because the resulting, less symmetric state is much more stable. In general the most stable states are those with the lowest energy.

The aim of my thesis was to derive the branching rules for the group SU(n). This symmetry group plays a major role in the Standard Model. Just as above, the elementary particles can be pictured as vectors that transform into new vectors under the action of matrices. One can show that the transformations that are allowed (i.e. that do not violate any laws of nature) yield precisely these groups. Knowing the irreducible representations of the group SU(n) therefore tells you how the elementary particles behave.

The Standard Model is one of the greatest successes of modern physics. It describes all matter particles (fermions) and the interactions between them through the exchange of force carriers (bosons). There are four fundamental forces: the strong nuclear force, the weak nuclear force, the electromagnetic force and gravity. The force carrier of the electromagnetic force, for example, is the photon. The term electromagnetism indicates that electricity and magnetism are in fact two manifestations of the same force. Physicists later realized that the weak force and the electromagnetic force can also be viewed as a single force: the electroweak force. In daily life, however, we perceive these two forces as two completely different phenomena. This is because the two forces are unified only at the high temperatures of the early universe. As the universe cooled, the symmetry of the electroweak force was broken to that of the electromagnetic force. The vacuum is not empty but filled with a Higgs field, a hypothetical energy field that can be pictured as a sea of Higgs particles. This Higgs field is responsible for the mass of all particles. Right after the Big Bang the Higgs field was symmetric and all elementary particles were completely massless.

As the universe cooled, symmetry breaking took place, whereby the symmetry of the electroweak force was broken to the symmetry of the electromagnetic force. This happened because the Higgs field acquired a vacuum expectation value (vev). To explain what this means we compare the potential of the Higgs field with an ordinary potential. (Recall that a potential tells you the energy of a given configuration.) The Higgs potential is also called the Mexican hat potential. Figure 9 (a) shows an ordinary potential. As with the pencil, the most stable state here is the state with the lowest energy. A particle at the origin does not acquire a vev because it is already in the state of lowest energy (the minimum of the potential). Figure 9 (b) shows the Higgs potential. The origin is now no longer the minimum of the potential, and the Higgs field will move to a state in which it has a lower energy. The symmetry that was present before$^{62}$ is broken to a smaller symmetry$^{63}$, and we say that the Higgs vev breaks the electroweak symmetry. After this the Higgs field became a viscous force field, and the matter particles acquired a mass through a continuous resistance against this field. The force carriers of the weak force acquired a mass through interaction with the Higgs particle. The force carrier of the electromagnetic force (the photon), belonging to the symmetry that remained, stayed massless. This process is called the Higgs mechanism. For a long time it was unclear whether this mechanism was correct, but it was eventually verified with the discovery of the Higgs particle.

62 This symmetry is the group SU(2) × U(1), the electroweak symmetry.

63 The remaining symmetry is U(1), the electromagnetic symmetry.

Figure 9: Two potentials. (a) An ordinary potential in which the origin is the state of lowest energy. (b) The Higgs potential in which the origin is no longer the state of lowest energy [29].

Physicists think that at even higher temperatures the electroweak force and the strong nuclear force were also unified into a single force, and that at a still higher energy scale, called the Planck scale, gravity was unified as well. Physicists know that the Standard Model is no longer valid around the Planck scale and are still searching for a theory that describes these unifications.

The hierarchy problem arises when one looks at the different energy scales at which the various unifications take place. Expressed in electronvolts (eV), electroweak unification occurs at an energy of order 100 GeV, unification with the strong nuclear force at about $10^{16}$ GeV and with gravity at $10^{19}$ GeV. This gigantic difference between the energy scales is not a problem in itself; it is merely very unnatural, and physicists have no explanation for why the scales lie so far apart. A problem does arise when one tries to determine the Higgs mass. The Standard Model Higgs particle has a measured mass of about 125 GeV (recall $E = mc^2$). However, the Higgs particle also interacts with virtual particles, and this leads to so-called quantum corrections to the Higgs mass. In my thesis I show that these corrections are proportional to $\Lambda^2$, where $\Lambda$ is the energy up to which the Standard Model is valid. Because the only energy scale at which we know that the Standard Model breaks down and new physics appears is the Planck scale, these contributions to the Higgs mass are $\sim (10^{19})^2$, and the only way to explain why the physically measured mass $M_{\mathrm{phys}} = 125$ GeV satisfies $M_{\mathrm{phys}}^2 = M_0^2 + (10^{19})^2$ is for the "classical" mass $M_0$ to be fine-tuned very precisely, to as many as 30 decimal places.

This fine-tuning is of course a very unnatural way to explain the Higgs mass, and physicists are looking for a theory that removes the need for it. Such theories are based on a larger (global) symmetry group that is spontaneously broken to the Standard Model symmetry group. Little Higgs models form one such class of models. In these, new particles are introduced that also interact with the Higgs and cancel precisely the contribution discussed above. Secondly, the Higgs particle is regarded as a Goldstone boson. These are massless particles that arise when spontaneous symmetry breaking occurs, a phenomenon that I also demonstrate in my thesis. The Higgs particle, however, is not massless. Therefore the larger symmetry is broken not only spontaneously but also explicitly, in a very special way, by adding certain terms to the equations. This is done in such a way that the Higgs particle obtains its mass in a much more natural manner and the need for fine-tuning disappears.

To conclude, let me explain the name of the models. It is important that the energy at which these larger symmetry groups are broken lies around the TeV scale, since this is the limit up to which no fine-tuning is needed. Models in which the symmetry breaking takes place at much higher scales can solve the hierarchy problem as sketched above, but in turn introduce a "little" hierarchy problem. Little Higgs models do have their symmetry breaking around 1 TeV and thereby solve the little hierarchy problem.


A Symmetric polynomials

This appendix discusses certain symmetric polynomials, in particular the Schur polynomials. These have applications in the representation theory of GL(n,C), as they are the characters of the irreducible representations. Here I will discuss some important identities between these Schur polynomials$^{64}$. We consider functions in the variables $x_1, \dots, x_k$ indexed by partitions $\lambda = (\lambda_1 \geq \dots \geq \lambda_k \geq 0)$ of n into at most k parts or, in terms of Young diagrams, Young diagrams with at most k rows. There are several choices of basis for $\Lambda_n$, the ring of symmetric polynomials in n variables. Here I will list a few of these bases and formulate some results about them and the relations between them. The first are the monomial symmetric polynomials.

A.1 Monomial symmetric polynomials

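Definition A.1. For each $\alpha = (\alpha_1, \dots, \alpha_n)$ we denote by $x^\alpha$ the monomial

\[ x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}. \]

Let $\lambda$ be a partition of length ≤ n. Then the monomial symmetric polynomial is

\[ m_\lambda = \sum_{\alpha} x^\alpha, \tag{A.1} \]

where the sum is over all distinct permutations $\alpha = (\alpha_1, \dots, \alpha_n)$ of $(\lambda_1, \dots, \lambda_n)$.

For example, in three variables:

\[ m_{(1,1)} = x_1x_2 + x_1x_3 + x_2x_3 \]

\[ m_{(2,0)} = x_1^2 + x_2^2 + x_3^2 \]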

A.2 Complete symmetric polynomials

Definition A.2. The complete symmetric polynomial $h_\lambda$ is defined as

\[ h_\lambda = h_{\lambda_1} h_{\lambda_2} \cdots h_{\lambda_k}, \]

with $h_r = \sum_{|\lambda| = r} m_\lambda$ the r-th complete symmetric polynomial, which is the sum of all monomials of total degree r in the variables $x_1, x_2, \dots, x_n$.

The generating function for $h_r$ is:

\[ H(t) = \sum_{r \geq 0} h_r t^r = \prod_{i=1}^{n} (1 - x_i t)^{-1} \tag{A.2} \]

Taking the same partitions and 3 variables we find:

\[ h_{(1,1)} = h_1 h_1 = m_{(1)} m_{(1)} = (x_1 + x_2 + x_3)^2 \]

\[ h_{(2,0)} = h_2 h_0 = h_2 = m_{(2,0)} + m_{(1,1)} = x_1^2 + x_2^2 + x_3^2 + x_1x_2 + x_1x_3 + x_2x_3 \]

64 All results can be found in [3].

A.3 Elementary symmetric polynomials

Third are the elementary symmetric polynomials. Unlike the previous two, these are parametrized by the partitions conjugate to $\lambda$, which we denote by $\lambda'$.

Definition A.3. The elementary symmetric polynomial $e_{\lambda'}$ is defined as

\[ e_{\lambda'} = e_{\lambda'_1} e_{\lambda'_2} \cdots e_{\lambda'_n}, \]

with $e_r$ the r-th elementary symmetric polynomial, which is the sum of all products of r distinct variables $x_i$, so that $e_0 = 1$ and

\[ e_r = \sum_{i_1 < i_2 < \dots < i_r} x_{i_1} x_{i_2} \cdots x_{i_r} = m_{(1^r)}. \]

The generating function for $e_r$ is:

\[ E(t) = \sum_{r \geq 0} e_r t^r = \prod_{i=1}^{n} (1 + x_i t) \tag{A.3} \]

For example, in three variables:

\[ e_{(1,1)} = e_1 e_1 = m_{(1)} m_{(1)} = (x_1 + x_2 + x_3)^2 \]

\[ e_{(2,0)} = e_2 e_0 = e_2 = m_{(1,1)} = x_1x_2 + x_1x_3 + x_2x_3 \]
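The three families defined so far are easy to evaluate numerically, which gives a quick check of the examples. The sketch below is an illustration only: the function names `e`, `h` and `m` and the sample point (2, 3, 5) are my own choices. It also checks the relation $\sum_{r=0}^{k} (-1)^r e_r h_{k-r} = 0$ for k ≥ 1, which is not stated in the text but follows from multiplying the generating functions (A.2) and (A.3), since E(−t)H(t) = 1.

```python
from itertools import combinations, combinations_with_replacement, permutations
from math import prod

def e(r, xs):
    """Elementary symmetric polynomial e_r evaluated at the numbers xs."""
    return sum(prod(c) for c in combinations(xs, r))

def h(r, xs):
    """Complete symmetric polynomial h_r evaluated at the numbers xs."""
    return sum(prod(c) for c in combinations_with_replacement(xs, r))

def m(lam, xs):
    """Monomial symmetric polynomial m_lambda evaluated at the numbers xs."""
    lam = tuple(lam) + (0,) * (len(xs) - len(lam))  # pad with zeros
    exponents = set(permutations(lam))              # distinct permutations only
    return sum(prod(x ** a for x, a in zip(xs, alpha)) for alpha in exponents)

xs = (2, 3, 5)

# h_2 = m_(2,0) + m_(1,1), as in the example for the complete polynomials.
h2_check = h(2, xs) == m((2,), xs) + m((1, 1), xs)

# e_2 = m_(1,1), as in the example for the elementary polynomials.
e2_check = e(2, xs) == m((1, 1), xs)

# Consequence of E(-t) H(t) = 1: alternating sums of e_r h_{k-r} vanish.
alternating = [sum((-1) ** r * e(r, xs) * h(k - r, xs) for r in range(k + 1))
               for k in (1, 2, 3)]

print(h2_check, e2_check, alternating)
```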

A.4 Schur polynomials

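Last are the Schur polynomials. They are defined via a determinantal formula. We let $x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}$ be a monomial in a finite number of variables $x_1, \dots, x_n$ and consider the polynomial $a_\alpha$ obtained by anti-symmetrizing $x^\alpha$. That is,

\[ a_\alpha(x_1, \dots, x_n) = \sum_{\sigma \in S_n} \varepsilon(\sigma) \cdot \sigma(x^\alpha). \]

This polynomial is skew-symmetric, since $\sigma(a_\alpha) = \varepsilon(\sigma) a_\alpha$. In particular, it vanishes unless all $\alpha_i$, $i = 1, \dots, n$, are distinct. We may therefore assume $\alpha_1 > \alpha_2 > \dots > \alpha_n \geq 0$ and write $\alpha = \lambda + \delta$ with $\lambda$ a partition with $l(\lambda) \leq n$ and $\delta = (n-1, n-2, \dots, 1, 0)$. Then

\[ a_\alpha = a_{\lambda+\delta} = \sum_{\sigma} \varepsilon(\sigma) \cdot \sigma(x^{\lambda+\delta}), \]

which we may write as the following determinant:

\[ a_{\lambda+\delta} = \det\left(x_i^{\lambda_j + n - j}\right)_{1 \leq i,j \leq n} =
\begin{vmatrix}
x_1^{\lambda_1+n-1} & x_2^{\lambda_1+n-1} & \cdots & x_n^{\lambda_1+n-1} \\
x_1^{\lambda_2+n-2} & x_2^{\lambda_2+n-2} & \cdots & x_n^{\lambda_2+n-2} \\
\vdots & \vdots & \ddots & \vdots \\
x_1^{\lambda_n} & x_2^{\lambda_n} & \cdots & x_n^{\lambda_n}
\end{vmatrix} \]

This determinant is divisible by the Vandermonde determinant, which is the product of all differences $x_i - x_j$ with $i < j$, since $a_{\lambda+\delta}$ is divisible by each of these differences separately. The Vandermonde determinant is:

\[ \prod_{1 \leq i < j \leq n} (x_i - x_j) =
\begin{vmatrix}
x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1} \\
x_1^{n-2} & x_2^{n-2} & \cdots & x_n^{n-2} \\
\vdots & \vdots & \ddots & \vdots \\
1 & 1 & \cdots & 1
\end{vmatrix}
= \det\left(x_i^{n-j}\right) = a_\delta \tag{A.4} \]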

The quotient of these two determinants, $a_{\lambda+\delta}/a_\delta$, is symmetric and is called the Schur function in the variables $x_1, \dots, x_n$ corresponding to the partition $\lambda$ of length ≤ n; it is homogeneous of degree $|\lambda|$.

Definition A.4. The Schur function $s_\lambda$ in n variables, where $l(\lambda) \leq n$, is defined as

\[ s_\lambda(x_1, \dots, x_n) = \frac{\det\left(x_i^{\lambda_j + n - j}\right)}{\det\left(x_i^{n-j}\right)} = \frac{a_{\lambda+\delta}}{a_\delta}, \]

and the Schur functions form a basis of the ring of symmetric functions.

We further have the following identities that relate the Schur polynomials to the complete symmetric polynomials and the elementary symmetric polynomials. They are

\[ s_\lambda = \det\left(h_{\lambda_i - i + j}\right) \quad \text{and} \quad s_\lambda = \det\left(e_{\lambda'_i - i + j}\right). \]

With this we see that we have the special cases:

\[ s_{(n)} = h_n \quad \text{and} \quad s_{(1^n)} = e_n. \]
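Definition A.4 can be evaluated directly as a ratio of two determinants. The sketch below is illustrative only (the helper names and the sample point are assumptions, and the determinant uses the plain permutation expansion, which is adequate for a handful of variables). It evaluates $s_\lambda$ at distinct integers with exact rational arithmetic and compares the special cases $s_{(r)} = h_r$ and $s_{(1^r)} = e_r$.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement, permutations
from math import prod

def sign(p):
    """Sign of a permutation of 0..n-1, computed by counting inversions."""
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

def det(mat):
    """Determinant via the sum over permutations (fine for small matrices)."""
    n = len(mat)
    return sum(sign(p) * prod(mat[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def schur(lam, xs):
    """s_lambda(xs) = a_{lambda+delta} / a_delta; the xs must be distinct."""
    n = len(xs)
    lam = tuple(lam) + (0,) * (n - len(lam))
    # With 0-based j the exponent lambda_j + n - j becomes lam[j] + n - 1 - j.
    numerator = det([[x ** (lam[j] + n - 1 - j) for j in range(n)] for x in xs])
    vandermonde = det([[x ** (n - 1 - j) for j in range(n)] for x in xs])
    return Fraction(numerator, vandermonde)

def e(r, xs):
    """Elementary symmetric polynomial e_r."""
    return sum(prod(c) for c in combinations(xs, r))

def h(r, xs):
    """Complete symmetric polynomial h_r."""
    return sum(prod(c) for c in combinations_with_replacement(xs, r))

xs = (2, 3, 5)
print(schur((2,), xs), h(2, xs))        # s_(2) should equal h_2
print(schur((1, 1), xs), e(2, xs))      # s_(1,1) should equal e_2
print(schur((2, 1), xs))                # a genuinely new polynomial
```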

A.5 Orthogonality

The Schur polynomials in fact form an orthonormal basis of Λ with respect to a scalar product on Λ that we will define in a moment. First, though, we have the following two identities for the expansion of the product

\[ \prod_{i,j} (1 - x_i y_j)^{-1}, \]

where $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$ are two sets of variables. These are:

1. $\prod_{i,j} (1 - x_i y_j)^{-1} = \sum_\lambda h_\lambda(x) m_\lambda(y)$

2. $\prod_{i,j} (1 - x_i y_j)^{-1} = \sum_\lambda s_\lambda(x) s_\lambda(y)$

We now define the scalar product $\langle u, v \rangle$ on Λ by requiring that the bases $(h_\lambda)$ and $(m_\lambda)$ are dual to each other:

\[ \langle h_\lambda, m_\mu \rangle = \delta_{\lambda\mu} \tag{A.5} \]

Then we have:

Proposition A.1. For any two bases $(u_\lambda)$ and $(v_\lambda)$, indexed by partitions $\lambda$ of n ≥ 0, the following are equivalent:

1. $\langle u_\lambda, v_\mu \rangle = \delta_{\lambda\mu}$ for all $\lambda, \mu$

2. $\prod_{i,j} (1 - x_i y_j)^{-1} = \sum_\lambda u_\lambda(x) v_\lambda(y)$

Combined with the second identity above, this gives

\[ \langle s_\lambda, s_\mu \rangle = \delta_{\lambda\mu} \tag{A.6} \]

Since the Schur functions form an orthonormal basis, we can expand any symmetric function $f \in \Lambda$ in terms of its scalar products with the $s_\lambda$, i.e.

\[ f = \sum_\lambda \langle f, s_\lambda \rangle s_\lambda \]

In the next section I will discuss some relations between Schur polynomials that will be needed. Then, in section A.6.1, I will define the skew Schur function.

A.6 Relations among the symmetric polynomials

Two relations that I will discuss here are the Pieri rule and the Littlewood-Richardson rule. The Pieri rule tells us how to multiply a Schur polynomial by a basic Schur polynomial $s_{(m)} = h_m$.

Definition A.5. Pieri rule

\[ s_\lambda \cdot s_{(m)} = \sum_\nu s_\nu \]

with the sum over all partitions $\nu$ whose Young diagram can be obtained from the Young diagram of $\lambda$ by adding a total of m boxes to the rows, with no two boxes in the same column.

Example A.1. Consider the product of Schur polynomials $s_{(2,1)} \cdot s_{(2)}$. Expanding $\lambda = (2,1)$ by adding 2 boxes according to the rule gives the following possibilities for the Young diagram of $\nu$: (4,1), (3,2), (3,1,1) and (2,2,1),

and thus $s_{(2,1)} \cdot s_{(2)} = s_{(4,1)} + s_{(3,2)} + s_{(3,1,1)} + s_{(2,2,1)}$.
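The Pieri rule lends itself to a short enumeration. Adding boxes with no two in the same column is equivalent to the interlacing condition $\nu_1 \geq \lambda_1 \geq \nu_2 \geq \lambda_2 \geq \dots$, which the sketch below uses row by row. The function name `pieri` is an illustrative choice, not notation from the thesis.

```python
def pieri(lam, m):
    """Partitions nu obtained from lam by adding m boxes, no two in one column."""
    lam = list(lam) + [0]  # at most one new row can be started
    results = []

    def build(i, remaining, nu):
        if i == len(lam):
            if remaining == 0:
                results.append(tuple(k for k in nu if k > 0))
            return
        # Row i may not grow beyond the ORIGINAL length of the row above it;
        # otherwise two new boxes would end up in the same column.
        upper = lam[i - 1] if i > 0 else lam[0] + remaining
        for k in range(lam[i], min(upper, lam[i] + remaining) + 1):
            build(i + 1, remaining - (k - lam[i]), nu + [k])

    build(0, m, [])
    return results

# Reproduces Example A.1: s_(2,1) * s_(2).
print(sorted(pieri((2, 1), 2)))
```

Each partition in the output contributes one Schur polynomial, with coefficient 1, to the product.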

Applying the Pieri rule inductively to $h_\lambda = h_{\lambda_1} \cdots h_{\lambda_k}$ gives the following identity:

\[ h_\lambda = s_{(\lambda_1)} \cdots s_{(\lambda_k)} = \sum_\mu K_{\mu\lambda} s_\mu, \tag{A.7} \]

where the coefficients $K_{\mu\lambda}$ are the Kostka numbers, the number of semi-standard tableaux of shape $\mu$ and content $\lambda$. The second rule is the Littlewood-Richardson rule, which tells us how to multiply two general Schur polynomials and is thus a generalization of the Pieri rule. It gives the expansion of a product of two Schur polynomials in terms of Schur polynomials.

Definition A.6. Littlewood-Richardson rule

\[ s_\lambda \cdot s_\mu = \sum_\nu c^{\nu}_{\lambda\mu} s_\nu \]

Here $\lambda \vdash n$, $\mu \vdash m$ and the summation is over all partitions $\nu$ of n + m. The Littlewood-Richardson coefficients $c^{\nu}_{\lambda\mu}$ that appear in the expansion are defined as the number of ways the Young diagram of $\lambda$ can be expanded to the Young diagram of $\nu$ by a strict $\mu$-expansion. By this we mean the following. If $\mu = (\mu_1, \dots, \mu_k)$, we get a $\mu$-expansion by first adding $\mu_1$ boxes as described in the Pieri rule and putting a 1 in these boxes, then repeating this for $\mu_2$ and putting a 2 in those boxes, and so on up to the last $\mu_k$ boxes, which get a k. The expansion is called strict if, when the integers in the boxes are listed from right to left, starting with the top row and working down, and one looks at the first t entries in this list (for any t), each integer p between 1 and k − 1 occurs at least as many times as the next integer p + 1.

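Example A.2. Consider the product $s_{(2,1)} \cdot s_{(2,1)}$. Strict (2,1)-expansion of the Young diagram of (2,1) gives the following possibilities for $\nu$ (rows are separated by a slash, and 0 marks a box of the original diagram):

0 0 1 1 / 0 2

0 0 1 1 / 0 / 2

0 0 1 / 0 1 2

0 0 1 / 0 1 / 2

0 0 1 / 0 2 / 1

0 0 1 / 0 / 1 / 2

0 0 / 0 1 / 1 2

0 0 / 0 1 / 1 / 2

Therefore: $s_{(2,1)} \cdot s_{(2,1)} = s_{(4,2)} + s_{(4,1,1)} + s_{(3,3)} + 2s_{(3,2,1)} + s_{(3,1,1,1)} + s_{(2,2,2)} + s_{(2,2,1,1)}$.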
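The expansion in Example A.2 can be spot-checked numerically by evaluating both sides at a few points. This is a sanity check rather than a proof, and the helper names and sample points below are assumptions made for the illustration. Four variables are used so that every partition on the right-hand side (at most four rows) is visible.

```python
from fractions import Fraction
from itertools import permutations
from math import prod

def sign(p):
    """Sign of a permutation of 0..n-1, computed by counting inversions."""
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

def det(mat):
    """Determinant via the sum over permutations (fine for small matrices)."""
    n = len(mat)
    return sum(sign(p) * prod(mat[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def schur(lam, xs):
    """Schur polynomial as a ratio of determinants; needs len(lam) <= len(xs)."""
    n = len(xs)
    lam = tuple(lam) + (0,) * (n - len(lam))
    numerator = det([[x ** (lam[j] + n - 1 - j) for j in range(n)] for x in xs])
    vandermonde = det([[x ** (n - 1 - j) for j in range(n)] for x in xs])
    return Fraction(numerator, vandermonde)

# Right-hand side of Example A.2 as (coefficient, partition) pairs.
expansion = [
    (1, (4, 2)), (1, (4, 1, 1)), (1, (3, 3)), (2, (3, 2, 1)),
    (1, (3, 1, 1, 1)), (1, (2, 2, 2)), (1, (2, 2, 1, 1)),
]

def both_sides(xs):
    """Return (left-hand side, right-hand side) evaluated at the point xs."""
    lhs = schur((2, 1), xs) ** 2
    rhs = sum(c * schur(nu, xs) for c, nu in expansion)
    return lhs, rhs

for point in [(1, 2, 3, 5), (2, 3, 5, 7)]:
    print(point, both_sides(point))
```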

A.6.1 Skew Schur functions

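We now introduce the skew Schur function $s_{\lambda/\mu}$ by defining:

\[ \langle s_{\lambda/\mu}, s_\nu \rangle = \langle s_\lambda, s_\mu s_\nu \rangle \tag{A.8} \]

They can be expanded in terms of Schur polynomials through the relation

\[ s_{\lambda/\mu} = \sum_\nu c^{\lambda}_{\mu\nu} s_\nu \tag{A.9} \]

where the coefficients are the Littlewood-Richardson coefficients defined above. We further have

Definition A.7. $s_{\lambda/\mu}(x_1, \dots, x_n) = 0$ unless $0 \leq \lambda'_i - \mu'_i \leq n$ for all i, where $\lambda'$ and $\mu'$ denote the conjugate partitions.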

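Schur functions in more sets of variables

In the following we will consider three sets of independent variables $x = (x_1, x_2, \dots)$, $y = (y_1, y_2, \dots)$, $z = (z_1, z_2, \dots)$. Then we have:

\begin{align*}
\sum_{\lambda,\mu} s_{\lambda/\mu}(x) s_\lambda(z) s_\mu(y) &= \sum_\mu s_\mu(y) s_\mu(z) \cdot \prod_{i,k} (1 - x_i z_k)^{-1} \\
&= \prod_{j,k} (1 - y_j z_k)^{-1} \prod_{i,k} (1 - x_i z_k)^{-1} \\
&= \sum_\lambda s_\lambda(x, y) s_\lambda(z)
\end{align*}

where $s_\lambda(x, y)$ is now the Schur function in the set of variables $(x_1, x_2, \dots, y_1, y_2, \dots)$. Comparing the coefficients of $s_\lambda(z)$, and using that $s_\lambda(x, y)$ is symmetric under exchanging the roles of x and y, we conclude that:

\[ s_\lambda(x, y) = \sum_\mu s_\mu(x) s_{\lambda/\mu}(y) \tag{A.10} \]

In fact, this can be made more general:

\[ s_{\lambda/\mu}(x, y) = \sum_\nu s_{\lambda/\nu}(x) s_{\nu/\mu}(y) \tag{A.11} \]

with the sum over partitions $\nu$ such that $\lambda \supset \nu \supset \mu$.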

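Proof. We have

\begin{align*}
\sum_\mu s_{\lambda/\mu}(x, y) s_\mu(z) &= s_\lambda(x, y, z) \\
&= \sum_\nu s_{\lambda/\nu}(x) s_\nu(y, z) \\
&= \sum_{\mu,\nu} s_{\lambda/\nu}(x) s_{\nu/\mu}(y) s_\mu(z),
\end{align*}

and comparing the coefficients of $s_\mu(z)$ on both sides gives (A.11).

This we can generalize as follows:

Proposition A.2. Let $\lambda, \mu$ be partitions and let $x^{(1)}, \dots, x^{(n)}$ be n sets of variables. Then

\[ s_{\lambda/\mu}(x^{(1)}, \dots, x^{(n)}) = \sum_{(\nu)} \prod_{i=1}^{n} s_{\nu^{(i)}/\nu^{(i-1)}}(x^{(i)}) \]

with the sum over all sequences $(\nu) = (\nu^{(0)}, \dots, \nu^{(n)})$ of partitions such that $\nu^{(0)} = \mu$, $\nu^{(n)} = \lambda$ and $\nu^{(0)} \subset \nu^{(1)} \subset \dots \subset \nu^{(n)}$.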

With this formula we can derive what happens if we were to set the variable $x_n$ in the Schur function $s_\lambda(x_1, \dots, x_n)$ to 1. For this, consider the two sets of variables $x = (x_1, \dots, x_{n-1})$ and $y = (x_n)$, and let $\lambda = (\lambda_1, \dots, \lambda_n)$ be a partition with $l(\lambda) \leq n$. For the single variable $x_n$ we have $s_{\lambda/\mu}(x_n) = 0$ unless $\lambda - \mu$ is a horizontal strip, by Definition A.7, in which case $s_{\lambda/\mu}(x_n) = x_n^{|\lambda - \mu|}$. Thus, with (A.10) we have

\[ s_\lambda(x_1, \dots, x_n) = \sum_{\mu \subseteq \lambda} s_\mu(x_1, \dots, x_{n-1}) s_{\lambda/\mu}(x_n) = \sum_{\mu \subseteq \lambda} s_\mu(x_1, \dots, x_{n-1}) x_n^{|\lambda - \mu|} \tag{A.12} \]

where the sum is over all partitions $\mu$ for which $\lambda - \mu$ is a horizontal strip and $l(\mu) \leq n - 1$, by Definition A.7, from which the effect of setting $x_n = 1$ can be deduced.

Definition A.8. For partitions $\mu \subseteq \lambda$, the skew diagram $\lambda - \mu$ is called a horizontal strip if it has at most one box in each column.
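Equation (A.12) can be made concrete: $\lambda - \mu$ is a horizontal strip with $l(\mu) \leq n - 1$ exactly when $\mu$ interlaces $\lambda$, i.e. $\lambda_1 \geq \mu_1 \geq \lambda_2 \geq \dots \geq \mu_{n-1} \geq \lambda_n$. The sketch below lists these $\mu$ and checks (A.12) numerically at one point. It is an illustration under my own naming (`interlacing`, `branch`), with the Schur polynomial evaluated as the ratio of determinants from Definition A.4.

```python
from fractions import Fraction
from itertools import permutations, product
from math import prod

def sign(p):
    """Sign of a permutation of 0..n-1, computed by counting inversions."""
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

def det(mat):
    """Determinant via the sum over permutations (fine for small matrices)."""
    n = len(mat)
    return sum(sign(p) * prod(mat[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def schur(lam, xs):
    """Schur polynomial as a ratio of determinants; the xs must be distinct."""
    n = len(xs)
    lam = tuple(lam) + (0,) * (n - len(lam))
    numerator = det([[x ** (lam[j] + n - 1 - j) for j in range(n)] for x in xs])
    vandermonde = det([[x ** (n - 1 - j) for j in range(n)] for x in xs])
    return Fraction(numerator, vandermonde)

def interlacing(lam):
    """All mu with lam_1 >= mu_1 >= lam_2 >= ... >= mu_{n-1} >= lam_n."""
    ranges = [range(lam[i + 1], lam[i] + 1) for i in range(len(lam) - 1)]
    return [tuple(mu) for mu in product(*ranges)]

def branch(lam, xs):
    """Right-hand side of (A.12): sum of s_mu(x_1..x_{n-1}) * x_n^{|lam|-|mu|}."""
    lam = tuple(lam) + (0,) * (len(xs) - len(lam))
    *rest, last = xs
    return sum(schur(mu, rest) * last ** (sum(lam) - sum(mu))
               for mu in interlacing(lam))

lam, xs = (2, 1, 0), (2, 3, 5)
print(sorted(interlacing(lam)))          # the mu appearing in the branching
print(schur(lam, xs), branch(lam, xs))   # both sides of (A.12)
```

For $\lambda = (2,1,0)$ the four partitions found are the pieces into which the corresponding 8-dimensional representation of GL(3,C) splits when restricted to GL(2,C), with dimensions 2 + 1 + 3 + 2 = 8.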

B Lie groups and Lie algebras

A class of groups of great importance in physics are the symmetry groups, in particular the continuous symmetry groups, such as the rotations. These continuously generated groups are called Lie groups. A familiar example of a Lie group is the 3-dimensional rotation group SO(3), which consists of the rotation matrices and depends on three parameters. Other examples are the unitary group U(N), consisting of all unitary N × N matrices, and its subgroup SU(N), the special unitary group, whose matrices in addition have determinant one. The special unitary groups are very important in physics. For example, the symmetry group of the Standard Model is SU(3) × SU(2) × U(1). In this appendix I will discuss a few basic results concerning Lie groups, group generators and the associated Lie algebra. In particular I will discuss the cases of SU(2) and SU(3)$^{65}$.

65 The material discussed is based on [27].

B.1 Lie groups

Consider a Lie group G with elements $g(\xi)$, where $\xi$ is a parameter. Because Lie groups are analytic groups, we can choose the parametrization such that any two elements multiply as:

\[ g(\xi_1) g(\xi_2) = g(\xi_1 + \xi_2), \]

which implies the following properties for $g(\xi)$:

\[ g(0) = I \quad \text{and} \quad (g(\xi))^{-1} = g(-\xi). \]

Performing a Taylor expansion around the identity gives:

\[ g(\xi) = g(0) + g'(0)(\xi - 0) + O(\xi^2) = I + \xi t + O(\xi^2), \tag{B.1} \]

where

\[ t \equiv \left. \frac{dg(\xi)}{d\xi} \right|_{\xi = 0}. \]

This t is called the generator of the group. We can obtain a nicer expression for $g(\xi)$ by rewriting (B.1) as an exponential:

\[ g(\xi) = g(\xi/n)^n = \lim_{n \to \infty} \left( I + \frac{\xi}{n} t \right)^n = \exp(\xi t), \]

which may immediately be extended to an n-parameter Lie group:

\[ g(\xi_1, \dots, \xi_n) = \exp(\xi_a t_a), \]

where the summation convention is used. Just as t is the generator of the one-parameter Lie group, the $t_a$ are the generators of the n-parameter Lie group, and they are linearly independent. In the following we take a closer look at these generators, which will lead us to introduce the Lie algebra.
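The step from the infinitesimal form (B.1) to the exponential can be illustrated numerically for rotations in the plane. In the sketch below the generator t = [[0, −1], [1, 0]] of SO(2) and the value ξ = 1 are my own choice of example; the code raises I + (ξ/n)t to the n-th power and compares the result with the exact rotation matrix exp(ξt).

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_power(m, n):
    """m raised to the integer power n by repeated squaring."""
    size = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    while n > 0:
        if n & 1:
            result = matmul(result, m)
        m = matmul(m, m)
        n >>= 1
    return result

def approx_exp(t, xi, n):
    """(I + xi * t / n)^n, which tends to exp(xi * t) as n grows."""
    size = len(t)
    step = [[(1.0 if i == j else 0.0) + xi * t[i][j] / n for j in range(size)]
            for i in range(size)]
    return mat_power(step, n)

t = [[0.0, -1.0], [1.0, 0.0]]   # generator of rotations in the plane
xi = 1.0                        # rotation angle in radians

g = approx_exp(t, xi, 2 ** 20)
exact = [[math.cos(xi), -math.sin(xi)], [math.sin(xi), math.cos(xi)]]
max_error = max(abs(g[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(g, max_error)
```

The error shrinks roughly like 1/n, so doubling n halves the deviation from the exact rotation.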

B.2 Lie algebra

Let G be a Lie group with elements $g(\xi_1, \dots, \xi_n)$ and generators $t_a$. Using the Baker-Campbell-Hausdorff formula$^{66}$, any product of two elements can be expressed as:

\begin{align}
g(\xi_1, \dots, \xi_n) \cdot g(\zeta_1, \dots, \zeta_n) &= \exp(\xi_a t_a) \cdot \exp(\zeta_b t_b) \tag{B.2} \\
&= \exp\left( \xi_a t_a + \zeta_b t_b + \tfrac{1}{2} \xi_a \zeta_b [t_a, t_b] + \text{higher order commutators} \right) \tag{B.3}
\end{align}

However, G is a group, so it must be closed under multiplication. The product in (B.2) must therefore again be a group element, that is, we must have

\[ g(\xi_1, \dots, \xi_n) \cdot g(\zeta_1, \dots, \zeta_n) = g(\eta_1, \dots, \eta_n) = \exp(\eta_a t_a). \tag{B.4} \]

But this is possible if and only if any commutator of generators can again be written as a linear combination of generators. We thus conclude that the generators must close under commutation:

\[ [t_a, t_b] = f_{abc} t_c \tag{B.5} \]

where the $f_{abc}$ are called the structure constants. With this property the generators form a basis of the Lie algebra $\mathfrak{g}$ associated with the Lie group G.

66 The BCH formula [10] relates the product of the exponentials of two operators A and B to their commutator [A,B]. The formula can be expressed as $\exp(A) \cdot \exp(B) = \exp(A + B + \tfrac{1}{2}[A,B] + \text{higher order commutators of } A \text{ and } B)$. Thus, in the case where A and B commute, i.e. [A,B] = 0, we recover the usual identity $\exp(A)\exp(B) = \exp(A + B)$.

B.3 Examples

Important Lie groups in physics are U(N) and its subgroup SU(N). U(N) is the group of all unitary N × N matrices and SU(N) is the group of unitary N × N matrices with unit determinant. The group U(N) has $N^2$ independent generators. This can be seen by considering a general N × N matrix: it is described by $N^2$ complex numbers and thus depends on $2N^2$ real parameters. The unitarity condition imposes $N^2$ constraints on these parameters, so the number of independent parameters is $2N^2 - N^2 = N^2$. SU(N) has $N^2 - 1$ generators, since we have the extra constraint of unit determinant. Here I will discuss the particular cases of SU(2) and SU(3).

SU(2)

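According to the above discussion we can write the group elements U of SU(2) as:

\[ U = \exp(\xi_a t_a) \]

The unitarity constraint, $UU^\dagger = \mathrm{Id}$, and the unit determinant constraint place the following two requirements on the 3 group generators:

\[ t_a = -t_a^\dagger, \qquad \mathrm{Trace}(t_a) = 0 \]

Thus, the generators of SU(2) must be traceless anti-Hermitian matrices. In physics one usually extracts a factor of i and writes $U = \exp(i \xi_a t_a)$, so that the generators become traceless Hermitian matrices. Explicitly:

\[ t_a = \tfrac{1}{2} \tau_a \tag{B.6} \]

where the factor of $\tfrac{1}{2}$ is a normalization convention and the $\tau_a$ are the Pauli matrices given by:

\[ \tau_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \tau_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \tau_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{B.7} \]

In this convention the commutation relations read $[t_i, t_j] = i \varepsilon_{ijk} t_k$, so that, up to the conventional factor of i, the structure constants for SU(2) are $f_{ijk} = \varepsilon_{ijk}$.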
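The SU(2) commutation relations can be verified directly from the Pauli matrices. The sketch below assumes the common physics convention of Hermitian generators $t_a = \tau_a/2$, for which the commutator is $[t_a, t_b] = i \varepsilon_{abc} t_c$ (with anti-Hermitian generators $-i\tau_a/2$ the factor i is absorbed and the relation takes the form (B.5)). The helper names are illustrative.

```python
# Pauli matrices as nested lists of (complex) numbers.
tau = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Hermitian generators t_a = tau_a / 2 (physics convention, an assumption here).
t = [[[entry / 2 for entry in row] for row in m] for m in tau]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def eps(a, b, c):
    """Levi-Civita symbol for indices taken from 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2

def commutation_relations_hold():
    """Check [t_a, t_b] = i * eps_abc * t_c for all a, b."""
    for a in range(3):
        for b in range(3):
            lhs = commutator(t[a], t[b])
            for i in range(2):
                for j in range(2):
                    rhs = sum(1j * eps(a, b, c) * t[c][i][j] for c in range(3))
                    if abs(lhs[i][j] - rhs) > 1e-12:
                        return False
    return True

traceless = all(abs(m[0][0] + m[1][1]) < 1e-12 for m in t)
print(commutation_relations_hold(), traceless)
```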

SU(3)

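SU(3) has 8 independent generators, expressed in terms of the Gell-Mann matrices as $t_a = \tfrac{1}{2} \lambda_a$. The Gell-Mann matrices are given by the following set of matrices:

\[ \lambda_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
\lambda_2 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
\lambda_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
\lambda_4 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} \]

\[ \lambda_5 = \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, \quad
\lambda_6 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad
\lambda_7 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad
\lambda_8 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix} \]

We thus see that we can embed SU(2) in SU(3) by identifying the three Pauli matrices with the upper-left 2 × 2 blocks of $\lambda_1$, $\lambda_2$ and $\lambda_3$.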

SO(N)

SO(N) is the group of all N × N orthogonal matrices U, i.e. $U^T U = \mathrm{Id}$, with unit determinant. In physics SO(N) rotations are also called isometries because they leave lengths invariant. The generators of SO(N) are antisymmetric, traceless N × N matrices. There are $\tfrac{1}{2}N(N-1)$ independent such matrices, each with a single 1 above the diagonal and a corresponding −1 below it, so that the matrix is antisymmetric. Explicitly, for SO(3) the three generators are:

\[ \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
\begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad
\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} \]

C Notation and relevant quantum numbers

C.1 Notation

gamma matrices

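The gamma matrices $\gamma^\mu$ ($\mu = 0, \dots, 3$) are given by [13]:

\[ \gamma^0 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \quad
\gamma^1 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} \]

\[ \gamma^2 = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & i & 0 & 0 \\ -i & 0 & 0 & 0 \end{pmatrix}, \quad
\gamma^3 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} \]

They satisfy the Dirac algebra:

\[ (\gamma^0)^2 = \mathrm{Id}, \quad (\gamma^k)^2 = -\mathrm{Id}, \quad \gamma^{0\dagger} = \gamma^0, \quad \gamma^{k\dagger} = -\gamma^k, \quad \{\gamma^\mu, \gamma^\nu\} = \gamma^\mu \gamma^\nu + \gamma^\nu \gamma^\mu = 2 g^{\mu\nu} \]

where $g^{\mu\nu}$ is the metric tensor

\[ g^{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \]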

Some relevant identities they satisfy are [12]:

• $g_{\mu\nu} g^{\mu\nu} = 4$

• $\gamma_\mu \gamma^\mu = 4$

• $\mathrm{tr}(\gamma^\mu) = 0$

• the trace of any product of an odd number of $\gamma^\mu$ is zero

• $\mathrm{tr}(\gamma_\mu \gamma^\mu) = 4\,\mathrm{tr}(\mathrm{Id}) = 16$, since $\mathrm{tr}(\mathrm{Id}) = 4$

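• $\mathrm{tr}(\gamma^\mu \gamma^\nu) = 4 g^{\mu\nu}$

• $\mathrm{tr}(\slashed{a}\slashed{b}) = 4\, a \cdot b$, where $\slashed{a} \equiv a_\mu \gamma^\mu$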
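The Dirac algebra and the trace identities above can be checked by direct matrix multiplication. The sketch below assembles the gamma matrices in the standard Dirac representation from 2 × 2 blocks built out of the Pauli matrices (the helper names are illustrative) and tests $\{\gamma^\mu, \gamma^\nu\} = 2 g^{\mu\nu}$, $\mathrm{tr}(\gamma^\mu) = 0$ and $\mathrm{tr}(\gamma^\mu \gamma^\nu) = 4 g^{\mu\nu}$.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
sigma = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def neg(m):
    """Entrywise negative of a matrix."""
    return [[-x for x in row] for row in m]

def block(a, b, c, d):
    """4x4 matrix assembled from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(4))

# Dirac representation: gamma^0 = diag(1, 1, -1, -1), gamma^k = [[0, s_k], [-s_k, 0]].
gamma = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in sigma]
metric = [1, -1, -1, -1]   # diagonal of g^{mu nu}

def dirac_algebra_holds():
    """Check gamma^mu gamma^nu + gamma^nu gamma^mu = 2 g^{mu nu} * Id."""
    for mu in range(4):
        for nu in range(4):
            a, b = matmul(gamma[mu], gamma[nu]), matmul(gamma[nu], gamma[mu])
            expected = 2 * metric[mu] if mu == nu else 0
            for i in range(4):
                for j in range(4):
                    if a[i][j] + b[i][j] != (expected if i == j else 0):
                        return False
    return True

traces_vanish = all(trace(g) == 0 for g in gamma)
trace_identity = all(
    trace(matmul(gamma[mu], gamma[nu])) == (4 * metric[mu] if mu == nu else 0)
    for mu in range(4) for nu in range(4)
)
print(dirac_algebra_holds(), traces_vanish, trace_identity)
```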

C.2 Quantum numbers

In the Standard Model, hypercharge Y, electric charge Q and the third isospin component $I_3$ are related through the Gell-Mann-Nishijima formula [12]:

\[ Q = I_3 + \frac{Y}{2} \tag{C.1} \]

where Y = B + S, with B the baryon number and S the strangeness.

C.2.1 isospin

Isospin I was introduced in 1932 by Heisenberg to explain the similarity in mass of the proton and neutron. Heisenberg considered the proton and neutron as two distinct states of a single particle, which he called the nucleon, that would be indistinguishable apart from their difference in electric charge. That is, he considered the nucleon as the following linear combination of the proton and neutron:

\[ N = \alpha \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \beta \begin{pmatrix} 0 \\ 1 \end{pmatrix} \]

where

\[ p = e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad n = e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \]

Any transformation that mixes these two basis vectors yields a new linear combination of p and n, i.e. a new state of the nucleon. In analogy with the notation for spin S he introduced the concept of isospin I with third component $I_3$. The nucleon is assigned isospin $\tfrac{1}{2}$, and the third component $I_3$ thereby has eigenvalues $+\tfrac{1}{2}$, corresponding to the proton, and $-\tfrac{1}{2}$, corresponding to the neutron. The idea behind this notation was that the proton and neutron are affected equally by the strong nuclear force. The charge independence of the strong nuclear force was then seen as invariance under unitary transformations in isospin space. One such transformation would be replacing all protons by neutrons and vice versa. Explicitly, the allowed transformations were realized to be SU(2) transformations. Put differently, the space spanned by p and n is invariant under SU(2) transformations. The same reasoning holds for the pions: the two charged pions $\pi^\pm$ and the neutral $\pi^0$ are placed in an isospin triplet with I = 1 to explain their similarity in mass.

C.2.2 Weak isospin and weak hypercharge

Just as isospin is a conserved quantum number for the strong interaction, weak isospin T is conserved in the weak interaction. In the same way we can form weak isospin doublets for the left-handed quarks and leptons and weak isospin singlets for the right-handed particles. For the first family (the other two families are analogous) this gives:

\[ \psi_L = \begin{pmatrix} \nu_e \\ e^- \end{pmatrix}_L, \begin{pmatrix} u \\ d \end{pmatrix}_L \qquad \psi_R = e_R, u_R, d_R \tag{C.2} \]

An equivalent of (C.1) relates the weak hypercharge Y to the electric charge Q and the third weak isospin component $T_3$. Explicitly:

\[ Q = T_3 + \frac{Y}{2} \tag{C.3} \]

The lepton doublet, for one, is assigned weak isospin $\tfrac{1}{2}$, with eigenvalues $T_3 = \pm\tfrac{1}{2}$, where the particle with the highest charge is assigned the highest eigenvalue. The electron doublet is thus found to have a hypercharge of −1. The singlet $e_R$, on the other hand, has weak isospin 0 and thus a hypercharge of −2. Continuing for the other particles we find the following hypercharges for the first-family Standard Model particles.

• νL, eL have Y = −1

• eR has Y = −2

• uL, dL have Y = 1/3

• uR has Y = 4/3

• dR has Y = −2/3

• Higgs field φ has Y = 1
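The hypercharge assignments in this list can be cross-checked against (C.3) with exact fractions. In the sketch below the $T_3$ values follow from the doublet and singlet structure in (C.2), while the electric charges (and the split of the Higgs doublet into a charged and a neutral component) are the standard values, entered here as assumptions of the check rather than taken from the text.

```python
from fractions import Fraction as F

# (name, T3, Y, expected electric charge Q in units of e)
particles = [
    ("nu_L",   F(1, 2),  F(-1),    F(0)),
    ("e_L",    F(-1, 2), F(-1),    F(-1)),
    ("e_R",    F(0),     F(-2),    F(-1)),
    ("u_L",    F(1, 2),  F(1, 3),  F(2, 3)),
    ("d_L",    F(-1, 2), F(1, 3),  F(-1, 3)),
    ("u_R",    F(0),     F(4, 3),  F(2, 3)),
    ("d_R",    F(0),     F(-2, 3), F(-1, 3)),
    ("phi^+",  F(1, 2),  F(1),     F(1)),   # upper Higgs doublet component
    ("phi^0",  F(-1, 2), F(1),     F(0)),   # lower Higgs doublet component
]

def charge(t3, y):
    """Electric charge from Q = T3 + Y/2."""
    return t3 + y / 2

# Names of all particles whose listed hypercharge would give the wrong charge.
mismatches = [name for name, t3, y, q in particles if charge(t3, y) != q]
print(mismatches)   # an empty list means every assignment is consistent
```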

I will conclude this appendix with an example about charge conjugation[14].

Example C.1. Charge conjugation. The antiparticle fields of spin-1/2 particles can be obtained by applying the charge conjugation operator $C = i\gamma^2\gamma^0$.

ψC = CψT

= Cγ0ψ∗, ψC

= ψC (C.4)

We can apply charge conjugation to the chirality eigenstates to conclude that

\[ (\psi_L)^C = (\psi^C)_R \quad \text{and} \quad (\psi_R)^C = (\psi^C)_L \tag{C.5} \]

which follows from the definition of C and the properties of the γ-matrices. Thus, applying charge conjugation to a right-handed particle gives a left-handed antiparticle and vice versa. Applying charge conjugation to an isospin doublet makes things a bit more complex. Since charge conjugation involves complex conjugation, it reverses the sign of the eigenvalues of all generators of the SU(2) symmetry. This can be seen by first looking at the $U(1)_Y$ hypercharge generator. Applying complex conjugation to a U(1) group element yields:

\[ [\exp(i\alpha Y)]^* = \exp(-i\alpha Y) = \exp[i\alpha(-Y)] \tag{C.6} \]

Therefore, if we apply charge conjugation to the left-handed doublet $\psi_L = \begin{pmatrix} \nu_e \\ e^- \end{pmatrix}_L$, the doublet

\[ \psi' = \begin{pmatrix} \nu_e^C \\ e^+ \end{pmatrix}_R \]

would have isospin $T_3 = -1/2$ and $T_3 = +1/2$ for the upper and lower components respectively, which is not right. We can obtain the correct result by using the matrix

\[ i\tau_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \]

to reverse the order of the doublet $\psi'$ $^{67}$. Then the correct charge-conjugated isospin doublet is given by:

\[ \psi_R = i\tau_2 \begin{pmatrix} \nu_e^C \\ e^+ \end{pmatrix}_R = \begin{pmatrix} e^+ \\ -\nu_e^C \end{pmatrix}_R \tag{C.7} \]

67 A derivation can be found in [14].

D Feynman rules and calculating loop integrals

In particle physics, any measurable quantity is proportional to the square of the matrix element $-i\mathcal{M}$. This matrix element is obtained by consistently applying the Feynman rules, which can be derived from the Lagrangian. QED, QCD and QFD each have their own set of rules. Here I will give an overview of those rules and of how the matrix element can be computed.

For a given process, one draws all diagrams that are consistent with the Feynman rules. Interactions are represented by vertices and propagators by lines connecting the vertices. Bosonic propagators are drawn as wiggly lines and fermionic propagators as solid lines. Each propagator and vertex is then associated with a certain factor: the vertices carry the coupling constants, and the different propagators each have their own factor. I list the relevant factors below [12], [13]. The relevant propagator terms are:

• Spin-0 propagator: $\frac{i}{p^2 - m^2}$

• Spin-$\tfrac{1}{2}$ propagator: $\frac{i}{\slashed{p} - m} = \frac{i(\slashed{p} + m)}{p^2 - m^2}$

• Spin-1 propagator: $\frac{-i}{p^2 - m^2} \left[ g_{\mu\nu} - \frac{p_\mu p_\nu}{m^2} \right]$ $^{68}$

Remark These propagator terms are derived from the Klein-Gordon, Dirac and Proca equations we

have seen in section 6.1. One takes the free field equations (6.6), (6.12) and (6.10) and applies the

prescription pµ ↔ i∂µ to convert them to momentum space. The propagator is then i-times the inverse.

The relevant vertex terms are found by simply removing the fields of the interaction in the Lagrangian.

I derived a few in the analysis of the Standard model Higgs mechanism. Those needed here are:

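• QED vertex: $-iQe\gamma^\mu = -iq\gamma^\mu$

• $H f\bar{f}$ Yukawa vertex: $-i\dfrac{\lambda_f}{\sqrt{2}}$

• Higgs 4 vertex: $-i\dfrac{\lambda}{4}$

• $HH$-$W^+W^-$ 4 vertex: $i\dfrac{g^2}{4}\,g_{\mu\nu}$

• $HH$-$ZZ$ 4 vertex: $i\dfrac{g^2 + g'^2}{4}\,g_{\mu\nu}$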

The matrix element of a particular diagram is then the product of all the terms that can be associated with the diagram. Since multiple diagrams can be possible for a process, the total matrix element is the sum of the matrix elements of the separate diagrams. However, loop diagrams also have to be taken into account. These are to be seen as corrections to the tree-level diagrams and come with a factor of $\pm i\int \frac{d^4p}{(2\pi)^4}$. Here the sign is $-$ when a fermion runs in the loop and $+$ when a boson does, and we are to integrate over all momenta. This, however, leads to infinities, and infinities are not allowed in any physical theory. In this context physicists often speak of regularizing and renormalizing; these terms are related but have very different meanings.

Regularization methods deal with infinite integrals by splitting off the infinite part from the finite part. There are different types of regularization schemes, all with their advantages and disadvantages. Here I will briefly discuss cut-off regularization. Renormalization happens after the regularization has taken place and involves absorbing the infinities into the parameters. This means that the bare

⁶⁸ Note that the photon propagator is thus $-i\dfrac{g_{\mu\nu}}{p^2}$.


coupling constants and masses are reparametrized into the physical coupling constants and masses

that we actually measure.

D.1 Superficial degree of divergence

Often, physicists are only interested in the type of divergence and not in the particular value of the integral. The type of divergence is characterized by the superficial degree of divergence of the diagram, which is quickly determined by counting the powers of momentum p. In a loop diagram we can have the following contributions of p:

• a loop contributes 4 powers of p through $\pm i\int \frac{d^4p}{(2\pi)^4}$

• a fermion propagator contributes −1 powers of p through $\dfrac{\slashed{p}}{p^2 - m^2}$, where $\slashed{p} = p^\mu\gamma_\mu$

• a boson propagator (either scalar or massive vector boson) contributes −2 powers of p through $\dfrac{1}{p^2 - m^2}$

• a vertex containing a derivative with respect to p contributes −1 powers of p.

The superficial degree of divergence D is now defined to be the sum of powers of the momenta from these contributions, i.e.

$$D = 4L - P_{\text{fermion}} - 2P_{\text{boson}} - V_{\partial/\partial p} \tag{D.1}$$

We expect the integral to converge when D < 0. When D = 0 we expect the diagram to diverge logarithmically, when D = 1 we expect a linear divergence and when D = 2 a quadratic divergence. However, the actual degree of divergence may be lower due to cancellations from divergent subdiagrams or cancellations required by symmetries. The correction to the electron self-energy, for example, is superficially linearly divergent (D = 1, from one fermion and one photon propagator in a single loop) but turns out to be only logarithmically divergent [24].
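The counting rule (D.1) is easy to mechanize. The sketch below is a direct transcription of the formula; the function names and the example diagrams are chosen here for illustration and are not taken from the references.

```python
def superficial_degree(loops, fermion_props, boson_props, derivative_vertices=0):
    """Evaluate D = 4L - P_fermion - 2 P_boson - V as in (D.1)."""
    return 4 * loops - fermion_props - 2 * boson_props - derivative_vertices


def divergence_type(degree):
    """Translate D into the expected behaviour of the loop integral."""
    if degree < 0:
        return "convergent"
    names = {0: "logarithmic", 1: "linear", 2: "quadratic"}
    return names.get(degree, f"degree {degree}")


# One loop with two fermion propagators, e.g. the top loop of section D.3.
top_loop = superficial_degree(loops=1, fermion_props=2, boson_props=0)
print(top_loop, divergence_type(top_loop))  # 2 quadratic

# One loop with four fermion propagators.
box = superficial_degree(loops=1, fermion_props=4, boson_props=0)
print(box, divergence_type(box))  # 0 logarithmic

# One loop with three boson propagators.
triangle = superficial_degree(loops=1, fermion_props=0, boson_props=3)
print(triangle, divergence_type(triangle))  # -2 convergent
```

As stressed in the text, D is only an upper estimate: symmetries can lower the actual degree of divergence.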

D.2 Regularization schemes

D.2.1 Momentum Cut-off regularization

In the cut-off regularization scheme we only evaluate the integral up to a cut-off momentum Λ, and

at the end send Λ→∞. This Λ is the energy scale where the laws of physics break down. Beyond it

we don’t know how nature behaves, so we don’t even try computing it. It is a very effective way of

regulating an infinite integral, albeit a bit primitive. This type of regularization is used to compute

the divergent contributions to the Higgs mass.
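As an illustration of the scheme (a sketch added here, not taken from the references), consider a Euclidean one-loop integral with two massive scalar propagators, for which D = 0. Using the four-dimensional measure $d^4k_E = 2\pi^2 k^3\,dk$ after the angular integration and cutting the radial integral off at Λ gives

$$\int_{|k_E|<\Lambda} \frac{d^4k_E}{(2\pi)^4}\,\frac{1}{(k_E^2+m^2)^2} = \frac{1}{8\pi^2}\int_0^\Lambda \frac{k^3\,dk}{(k^2+m^2)^2} = \frac{1}{16\pi^2}\left[\log\left(1+\frac{\Lambda^2}{m^2}\right) - \frac{\Lambda^2}{\Lambda^2+m^2}\right]$$

The result is finite for every finite Λ and grows like $\log\Lambda^2$ as Λ→∞, which is the logarithmic divergence expected from the power counting.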

D.3 Calculation of quadratic divergent contributions to the Higgs mass

Here I will demonstrate the calculation of the top-loop divergent integral that contributes to the Higgs

mass, since this forms the largest contribution.

The calculation simplifies greatly if we neglect the momenta of the external particles and the masses of intermediate particles. These do not play a role in the dominant behaviour of the diagram, which is determined to leading order by the momenta running through the loop.

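$$\begin{aligned}
M &= 3\left(-i\frac{\lambda_t}{\sqrt{2}}\right)^2 (-i)\int\frac{d^4k}{(2\pi)^4}\,\mathrm{Tr}\left(\frac{i\,k_\mu\gamma^\mu}{k^2}\,\frac{i\,k_\nu\gamma^\nu}{k^2}\right)\\
&= -i\,\frac{3\lambda_t^2}{2}\int\frac{d^4k}{(2\pi)^4}\,\mathrm{Tr}\left(\frac{k_\mu k_\nu\gamma^\mu\gamma^\nu}{k^4}\right)\\
&= -i\,\frac{3\lambda_t^2}{2}\int\frac{d^4k}{(2\pi)^4}\,\frac{4k^2}{k^4}
\end{aligned}$$

where in the third equality we evaluated the trace: $\mathrm{Tr}(k_\mu k_\nu\gamma^\mu\gamma^\nu) = 4k^2$. We now perform a so-called Wick rotation on the energy component to convert the integral over Minkowski space to an integral over Euclidean space-time. This amounts to the substitution

$$k^0 \to i k^0_E$$

such that $k^2 = -(k^0_E)^2 - |\mathbf{k}|^2 = -k_E^2$, where now $k_E^2$, with $k_E = (k^0_E, \mathbf{k})$, is defined through the positive definite Euclidean scalar product.⁶⁹ Implementing this substitution in $\int d^4k$ gives

$$\int d^4k = \int dk^0\int d^3\mathbf{k} = \int i\,dk^0_E\int d^3\mathbf{k}$$

and thus: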

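$$\begin{aligned}
M &= -i\,\frac{3\lambda_t^2}{2}\int\frac{d^4k}{(2\pi)^4}\,\frac{4k^2}{k^4}\\
&= -i\,\frac{3\lambda_t^2}{2}\,4\,\frac{1}{(2\pi)^4}\int i\,dk^0_E\int d^3\mathbf{k}\;\frac{-k_E^2}{k_E^4}\\
&= -\frac{3\lambda_t^2}{2}\,4\,\frac{1}{(2\pi)^4}\int d^4k_E\;\frac{k_E^2}{k_E^4}
\end{aligned}$$

We now convert the integral to spherical coordinates by noting that $d^4k = k^3\,dk\,d\Omega$, with $\int d\Omega = 2\pi^2$. Then we get, no longer writing the E subscript: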

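$$\begin{aligned}
M &= -\frac{3\lambda_t^2}{2}\,4\,\frac{1}{(2\pi)^4}\int d^4k\;\frac{k^2}{k^4}\\
&= -\frac{3\lambda_t^2}{2}\,4\,\frac{1}{(2\pi)^4}\int k^3\,dk\,d\Omega\;\frac{k^2}{k^4}\\
&= -\frac{3\lambda_t^2}{2}\,4\,\frac{2\pi^2}{16\pi^4}\int_0^\Lambda k\,dk\\
&= -\frac{3\lambda_t^2}{8\pi^2}\,\Lambda^2
\end{aligned}$$

where in the third line we implemented the cut-off Λ. A more thorough calculation also involves the mass of the top quark and the momentum of the external Higgs particle.⁷⁰ The loop integral we have to compute corresponds again to the diagram with momenta p + k and k running through the loop.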
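As a quick check of the simplified result above, the numerical prefactor can be assembled with exact fractions. The variable names are illustrative; each factor is read off from one step of the calculation, with the common factor $\lambda_t^2\Lambda^2/\pi^2$ stripped off.

```python
from fractions import Fraction

# Every factor below multiplies lambda_t^2 * Lambda^2 / pi^2 in the result.
overall_factor = Fraction(3, 2)      # overall 3 times lambda_t^2 / 2 from the two vertices
trace_factor = 4                     # Tr(k-slash k-slash) = 4 k^2
angular_factor = Fraction(2, 16)     # 2 pi^2 / (2 pi)^4 = (2 / 16) / pi^2
radial_integral = Fraction(1, 2)     # int_0^Lambda k dk = Lambda^2 / 2

coefficient = overall_factor * trace_factor * angular_factor * radial_integral
print(coefficient)  # 3/8
```

Together with the overall minus sign this gives $-\frac{3\lambda_t^2}{8\pi^2}\Lambda^2$.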

⁶⁹ Recall that in Minkowski space the square of the four-momentum $k = (k^0, \mathbf{k})$ is defined as $k^2 = (k^0)^2 - |\mathbf{k}|^2$.

⁷⁰ The calculation is based on [24].


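The full calculation then amounts to the following integral:

$$\begin{aligned}
M &= 3\left(-i\frac{\lambda_t}{\sqrt{2}}\right)^2 (-i)\int\frac{d^4k}{(2\pi)^4}\,\mathrm{Tr}\left[\frac{i(\slashed{p}+\slashed{k}+m)}{(p+k)^2-m^2}\;\frac{i(\slashed{k}+m)}{k^2-m^2}\right]\\
&= -i\,\frac{3\lambda_t^2}{2}\int\frac{d^4k}{(2\pi)^4}\,\mathrm{Tr}\left[\frac{(\slashed{p}+\slashed{k}+m)}{(p+k)^2-m^2}\;\frac{(\slashed{k}+m)}{k^2-m^2}\right]
\end{aligned}\tag{D.2}$$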

The first thing to tackle is the denominator. For this we use Feynman's trick, which handles terms that are a product of multiple propagators. It allows us to complete the square and makes the calculation easier. In its simplest form, that is for two propagators, it says:

$$\frac{1}{ab} = \int_0^1 dx\,\frac{1}{[xa+(1-x)b]^2} = \int_0^1 dx\,\frac{1}{[b+(a-b)x]^2}$$
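The two-propagator form of Feynman's trick can be checked numerically. The sketch below uses a simple midpoint rule and arbitrary positive test values for a and b; both the integration method and the numbers are choices made for this example. The check assumes that a and b have the same sign, so that the integrand has no pole on the interval [0, 1].

```python
def feynman_integrand(x, a, b):
    """Integrand 1 / [x a + (1 - x) b]^2 of Feynman's trick."""
    return 1.0 / (x * a + (1.0 - x) * b) ** 2


def midpoint_integral(func, lower, upper, steps=10_000):
    """Approximate the integral of func over [lower, upper] with the midpoint rule."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


a, b = 2.0, 5.0  # arbitrary test values with the same sign
lhs = 1.0 / (a * b)
rhs = midpoint_integral(lambda x: feynman_integrand(x, a, b), 0.0, 1.0)
print(lhs, rhs)  # the two numbers should agree to many digits
```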

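Thus letting $a = (p+k)^2 - m^2$ and $b = k^2 - m^2$, so that $b + (a-b)x = k^2 - m^2 + (p^2 + 2p\cdot k)x$, we can rewrite (D.2) as

$$-i\,\frac{3\lambda_t^2}{2}\int\frac{d^4k}{(2\pi)^4}\int_0^1 dx\;\frac{\mathrm{Tr}\left[(\slashed{p}+\slashed{k}+m)(\slashed{k}+m)\right]}{\left[k^2 - m^2 + (p^2 + 2p\cdot k)x\right]^2}\tag{D.3}$$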

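We now make a change of variables by setting $l = k + px$. Then $dl = dk$ and we obtain:

$$\begin{aligned}
k^2 - m^2 + (p^2 + 2p\cdot k)x &= l^2 + p^2x^2 - 2l\cdot p\,x - m^2 + p^2x + 2p\cdot l\,x - 2p^2x^2\\
&= l^2 + p^2x(1-x) - m^2
\end{aligned}$$

for the denominator, and for the numerator we get:

$$(\slashed{p}+\slashed{k}+m)(\slashed{k}+m) = \left(\slashed{l} + \slashed{p}(1-x) + m\right)\left(\slashed{l} - \slashed{p}x + m\right)$$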
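The shift $l = k + px$ can be verified for explicit four-vectors. The sketch below compares the combined denominator $k^2 - m^2 + (p^2 + 2p\cdot k)x$ with $l^2 + p^2x(1-x) - m^2$ using the Minkowski scalar product; the test momenta, mass and value of x are arbitrary choices made for this example.

```python
def dot(u, v):
    """Minkowski scalar product with signature (+, -, -, -)."""
    return u[0] * v[0] - sum(ui * vi for ui, vi in zip(u[1:], v[1:]))


def denominators(p, k, m, x):
    """Return the denominator before and after the shift l = k + p x."""
    l = [ki + x * pi for ki, pi in zip(k, p)]
    before = dot(k, k) - m ** 2 + (dot(p, p) + 2 * dot(p, k)) * x
    after = dot(l, l) + dot(p, p) * x * (1 - x) - m ** 2
    return before, after


p = (3.0, 0.5, -1.0, 2.0)   # arbitrary external momentum
k = (1.5, -0.3, 0.7, 0.2)   # arbitrary loop momentum
before, after = denominators(p, k, m=1.2, x=0.35)
print(before, after)  # the two numbers should agree up to rounding
```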

We can now simplify things by noting that to leading order in l the numerator becomes $l_\mu\gamma^\mu\, l_\nu\gamma^\nu$. Evaluating the trace yields $4l^2$. Then (D.3) becomes to leading order in l:

$$-i\,\frac{3\lambda_t^2}{2}\int_0^1 dx\int\frac{d^4l}{(2\pi)^4}\;\frac{4l^2}{\left[l^2 + p^2x(1-x) - m^2\right]^2}$$

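As before we do a Wick rotation to convert the integral to Euclidean space. Thus we make the substitution $l^0 = i l^0_E$, $l^2 = -l_E^2$, so that $d^4l = i\,d^4l_E$. Then we get:

$$-\frac{3\lambda_t^2}{2}\int_0^1 dx\int\frac{d^4l_E}{16\pi^4}\;\frac{4l_E^2}{\left[l_E^2 + m^2 - p^2x(1-x)\right]^2}$$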

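Making further the substitution $\Delta = m^2 - p^2x(1-x)$ and converting the integral to spherical coordinates, i.e. $d^4l = 2\pi^2 l^3\,dl$ after the angular integration, we get, no longer writing the E subscript:

$$-\frac{3\lambda_t^2}{4\pi^2}\int_0^1 dx\int_0^\Lambda dl\;\frac{l^5}{\left[l^2+\Delta\right]^2}\tag{D.4}$$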

where we also implemented the cut-off Λ. The integral over l can be evaluated. Its value is:

$$\int_0^\Lambda dl\;\frac{l^5}{\left[l^2+\Delta\right]^2} = \left[\frac{l^2}{2} - \frac{\Delta^2}{2(l^2+\Delta)} - \Delta\log(l^2+\Delta)\right]_0^\Lambda$$

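and thus (D.4) becomes:

$$-\frac{3\lambda_t^2}{4\pi^2}\int_0^1 dx\left[\frac{\Lambda^2}{2} - \frac{\Delta^2}{2(\Lambda^2+\Delta)} + \frac{\Delta}{2} - \Delta\log\frac{\Lambda^2+\Delta}{\Delta}\right]$$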


After the integration over x this becomes to leading order in Λ:

$$-\frac{3\lambda_t^2\Lambda^2}{8\pi^2}$$

which is in agreement with our first calculation and literature [18].
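The radial integral in (D.4) and its leading behaviour can be checked numerically. The sketch below compares the antiderivative quoted above with a midpoint-rule integration for a fixed positive Δ; the numerical values of Δ and Λ are arbitrary choices, and treating Δ as a constant (rather than a function of x) is a simplification made for this example.

```python
import math


def radial_integrand(l, delta):
    """Integrand l^5 / (l^2 + Delta)^2 of the radial integral."""
    return l ** 5 / (l * l + delta) ** 2


def antiderivative(l, delta):
    """l^2/2 - Delta^2 / (2 (l^2 + Delta)) - Delta log(l^2 + Delta)."""
    shifted = l * l + delta
    return l * l / 2 - delta ** 2 / (2 * shifted) - delta * math.log(shifted)


def closed_form(cutoff, delta):
    """Definite integral from 0 to the cut-off via the antiderivative."""
    return antiderivative(cutoff, delta) - antiderivative(0.0, delta)


def midpoint_integral(func, lower, upper, steps=20_000):
    """Approximate the integral of func over [lower, upper] with the midpoint rule."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


delta, cutoff = 1.0, 10.0  # arbitrary test values with Delta > 0
numeric = midpoint_integral(lambda l: radial_integrand(l, delta), 0.0, cutoff)
print(numeric, closed_form(cutoff, delta))  # should agree closely

# For a large cut-off the Lambda^2 / 2 term dominates.
ratio = closed_form(1000.0, delta) / (1000.0 ** 2 / 2)
print(ratio)  # close to 1
```

Multiplying the dominant $\Lambda^2/2$ by the prefactor $-\frac{3\lambda_t^2}{4\pi^2}$ reproduces the quadratically divergent term quoted above.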


References

[1] W. Fulton., J. Harris, Representation theory: A first course, Springer-Verlag, New York Inc, 1991

[2] G. James, A. Kerber, The Representation Theory of the Symmetric Group, Addison-Wesley Publishing Company, Massachusetts, 1981

[3] I. G. Macdonald, Symmetric Functions and Hall Polynomials 2nd Ed., Oxford university Press,

New York, 1995

[4] P. Etingof, Introduction to Representation Theory, January 10, 2011, http://math.mit.edu/~etingof/replect.pdf

[5] J. Stokman, Algebra 3; Representatietheorie. Aanvulling 2, University of Amsterdam




[9] T. Brocker., T. Dieck, Representations of Compact Lie Groups, Graduate Texts in Mathematics,

Springer-Verlag,

[10] J. Fuchs, C. Schweigert, Symmetries, Lie Algebras and Representations, pp. 137, 156, Cambridge University Press, New York, 1997

[11] C. Quigg, Gauge theories of the strong, weak and electromagnetic interactions, The Benjamin-

Cummings Publishing Company, Canada, 1983.

[12] D.J. Griffiths, Introduction to Elementary Particles, 2nd Rev. Ed., Wiley-VCH, Weinheim, 2008

[13] M. Thomson, Modern Particle Physics, Cambridge University Press, Cornwall, 2014

[14] W. Greiner, B. Müller, Gauge Theory of Weak Interactions, p. 305, Springer, Fourth Edition, Thun, 1986

[15] http://isites.harvard.edu/fs/docs/icb.topic1146666.files/IV-4-SpontaneousSymmetryBreaking.pdf

[16] C. ter Burg., S. Bakker, Project Wiskunde, Variatierekening, Tweede-jaars project Bachelor

Wiskunde ; UvA ; 2014

[17] G. ’t Hooft, Naturalness, chiral symmetry and spontaneous symmetry breaking, inspire-hep/144074v1, 1980

[18] M. Schmaltz., D. Tucker-Smith, Little Higgs Review, arXiv:hep-ph/0502182v1, 2005

[19] M. Schmaltz, The simplest little Higgs, arXiv:hep-ph/0407143v2, 2004

[20] N. Arkani-Hamed, et al, The Littlest Higgs, arXiv:hep-ph/0206021v2, 2002

[21] T. Han, et al, Phenomenology of the Little Higgs Model, arXiv:hep-ph/0301040v4, 2004

[22] A. Birkedal, et al, Little Higgs Dark Matter, arXiv:hep-ph/0603077v3, 2012


[23] CMS Collaboration, Search for a massive resonance decaying into a Higgs boson and a W or Z boson in hadronic final states in proton-proton collisions at √s = 8 TeV, arXiv:hep-ex/1506.01443v1, 2015

[24] J. Lukkezen, Little Higgs Phenomenology, Master Thesis, Universiteit van Amsterdam, 2008

[25] M. Brak, The Hierarchy Problem in the Standard Model and Little Higgs Theories, Master Thesis,

Universiteit van Amsterdam, 2004

[26] I. van Vulpen, The Standard Model Higgs Boson, University of Amsterdam, 2013-2014.

[27] E. Laenen, Lecture Notes Quantum Field Theory: Appendix C: Introduction to group theory, University of Amsterdam

[28] http://www.quantumdiaries.org/2012/07/01/the-hierarchy-problem-why-the-higgs-has-a-snowballs-chance-in-hell/

[29] http://www.quantumdiaries.org/2011/11/21/why-do-we-expect-a-higgs-boson-part-i-electroweak-symmetry-breaking/
