Branching Rules and Little Higgs models
Cathelijne ter Burg
Bachelor Thesis year 3
Supervisors: prof. dr. J. Stokman & prof. dr. E. Laenen
July 16, 2015
Image is from [28]
KdVI & IoP-ITFA
Faculteit der Natuurwetenschappen, Wiskunde en Informatica
Universiteit van Amsterdam
Abstract
This thesis will discuss branching rules for GL(n,C) and Little Higgs models and will revolve around
the concept of symmetry breaking. An introduction to representation theory is given after which the
results are applied to Sn and GL(n,C). The irreducible characters of GL(n,C) will be related to the
Schur polynomials which enables us to derive the branching rules for GL(n,C). The results are then
extended to SU(N) and some examples are given. Then I discuss spontaneous symmetry breaking
in physics, introduce Nambu-Goldstone bosons (NGB) and discuss the Higgs mechanism applied to
the Standard Model gauge group. The hierarchy problem and the need to search for physics beyond
the Standard Model are discussed. We focus on Little Higgs models that are a partial solution to
the hierarchy problem by postulating new physics at the TeV scale, and yield a naturally light Higgs
through a mechanism called collective symmetry breaking. Collective symmetry breaking will be
introduced via an SU(3) based toy model after which the "Littlest Higgs", based on a global SU(5)
symmetry, is discussed. Branching rules for SU(5) will also be discussed in the framework of Grand
Unified Theories.
Title: Branching rules and Little Higgs models.
Author: Cathelijne ter Burg, 10422722
Supervisors: Prof. dr. J. Stokman & Prof. dr. E. Laenen
Final date: 17-07-2015
IoP-ITFA & KdVI
University of Amsterdam
Science Park 105-107, 1098 XG Amsterdam
Contents
1 Introduction
2 Necessities from representation theory
2.1 Representations
2.2 Character theory
2.3 Modules and the group algebra
2.4 Restricted and induced representations
3 The irreducible representations of Sn
3.1 The symmetric group
3.2 Young diagrams and Young tableaux
3.3 Constructing the irreducible representations of Sn
3.4 Young subgroups, induced representations and Young's Rule
4 The irreducible representations of GL(V )
4.1 The irreducible characters of S_λ V
5 Branching Rules
5.1 Branching Rules for GL(n,C)
5.2 The irreducible representations of SU(n)
6 Lagrangians, symmetries and symmetry breaking
6.1 Lagrangian formalism
6.2 Symmetries
6.3 Symmetry breaking
6.3.1 Explicit symmetry breaking
6.3.2 Spontaneous symmetry breaking
7 Goldstone bosons and the Higgs mechanism
7.1 Local U(1) gauge theory
7.2 Abelian Higgs Mechanism
7.3 The Standard Model Higgs mechanism
7.3.1 Assigning mass to gauge bosons
7.3.2 Assigning mass to fermions
8 The Hierarchy problem
8.1 Naturalness
8.2 Hierarchy problem in the Higgs sector
9 Little Higgs models
9.1 Transformation of NGB
9.2 Constructing "The Simplest Little Higgs"
9.2.1 Adding the Gauge coupling
9.2.2 Adding the Yukawa coupling
9.2.3 The Higgs potential
9.2.4 Hypercharge and color
9.2.5 The gauge sector
9.2.6 Cancellation of the W boson loop
10 Representations, particle multiplets and symmetry breaking
10.1 SU(5) → SU(3)_C × SU(2)_W × U(1)_Y
11 The Littlest Higgs
11.1 Requirements for the model
11.2 The Gauge bosons
11.3 The Quartic Higgs potential and Higgs mass
11.4 Viability of 'Littlest Higgs' and signatures in experiment
12 Summary
12.1 Part I
12.2 Part II
12.3 Acknowledgements
13 Popular summary (Dutch)
A Symmetric polynomials
A.1 Monomial symmetric polynomials
A.2 Complete symmetric polynomials
A.3 Elementary symmetric polynomials
A.4 Schur polynomials
A.5 Orthogonality
A.6 Relations among the symmetric polynomials
A.6.1 Skew Schur functions
B Lie groups and Lie algebras
B.1 Lie groups
B.2 Lie algebra
B.3 Examples
C Notation and relevant quantum numbers
C.1 Notation
C.2 Quantum numbers
C.2.1 Isospin
C.2.2 Weak isospin and weak hypercharge
D Feynman rules and calculating loop integrals
D.1 Superficial degree of divergence
D.2 Regularization schemes
D.2.1 Momentum Cut-off regularization
D.3 Calculation of quadratic divergent contributions to the Higgs mass
1 Introduction
Symmetries and symmetry groups play an important role in modern science, in mathematics as well
as in physics. Mathematically, we say an object has a symmetry when it is invariant under a
transformation. A sphere in three dimensions, for example, has rotational symmetry, i.e. it is
invariant under the continuous symmetry group SO(3). In most cases, though, continuous symmetry
groups are not easy to work with. Therefore, mathematicians "represent" their elements as linear
transformations of vector spaces. Representation theory goes back to the late nineteenth century and
finds applications in many fields, ranging from pure and applied mathematics and statistics to
physics. In physics it is used, among other places, in particle physics. There it has proved
convenient to associate the transformations of different particles under a symmetry group with the
different representations of that group. Each particle is assigned to a certain representation and
is said to transform as that representation, or to "lie in the representation". As important as
symmetries is the notion of symmetry breaking, meaning that the symmetry group is reduced to a
smaller group. This will be an integral theme throughout this thesis. Mathematically, symmetry
breaking is described by branching rules. These describe how the restriction of an irreducible
representation to a subgroup decomposes into irreducible representations of that subgroup. Put in
terms of particles, they tell us how the particles transform under the reduced symmetry group. A
symmetry group of special importance in particle physics is SU(N), as the Standard Model symmetry
group is SU(3)_C × SU(2)_W × U(1)_Y .
The Standard Model describes the universe in terms of fermions (matter) and the forces between
them, which are mediated by the exchange of bosons¹. With the discovery of the Higgs boson in 2012,
all particles predicted by the Standard Model have been observed. The model is highly consistent
with experimental data and stands as one of the biggest successes of modern science. However, it
still leaves some questions unanswered, one of them being the nature of dark matter, and physicists
nowadays believe that it is only an effective theory, meaning that at some high energy scale it must
be replaced by a more fundamental theory. One reason for expecting physics beyond the Standard Model
comes from measurements of the coupling constants. They are very different at low energies but at
higher energies seem to converge to a single point at around 10^15 GeV, indicating that the three
forces were once unified. Here we discuss a different motivation for expecting physics not far
beyond the Standard Model, namely the hierarchy problem. Put briefly, the hierarchy problem refers
to the vast difference between the energy scales of different physical theories, which causes the
Higgs mass to acquire quadratically divergent quantum corrections. If the Standard Model is assumed
to remain valid up to energies many orders of magnitude above the electroweak symmetry breaking
scale, extreme fine-tuning is needed to keep the Higgs mass at its measured value ∼ 100 GeV, which
is highly unnatural. In this thesis I will discuss a class of models that form a partial solution
to the hierarchy problem. These are called 'Little Higgs models' and they postulate new physics at
the 1 TeV scale by introducing new particles. Little Higgs models will be the subject of the second
part.
¹ This includes the strong, weak and electromagnetic forces. It does not incorporate gravity.
This thesis is organized as follows. The first part focuses on mathematics, the second on physics.
I start by providing an overview of the results from representation theory that will be needed. In
section 3 I discuss the symmetric group Sn. I introduce the Young diagrams corresponding to the
partitions of n, and the Young tableaux, and show that by constructing a particular element in the
group algebra of Sn, called the Young symmetrizer, we can construct all the irreducible
representations of Sn, which are parametrized by the partitions. Then we turn our attention to the
group GL(V ) ∼= GL(n,C) in section 4. It turns out that the same Young symmetrizer can also be used
to construct many of the irreducible representations of GL(n,C). In particular, I show that the
irreducible characters are given by certain symmetric polynomials, called Schur polynomials. Once we
have made this identification, determining branching rules in section 5 comes down to using known
identities between these Schur polynomials. The required results about Schur polynomials can be
found in Appendix A. Once we have these branching rules, I will discuss some results from Lie theory
to argue that we can extend all results to SU(N) ⊂ U(N) ⊂ GL(N,C).
Then, in section 6, I will introduce the field theoretic Lagrangian, discuss symmetry breaking and
introduce Nambu-Goldstone bosons. Section 7 will introduce local symmetries, the covariant
derivative and gauge fields, and discuss the Higgs mechanism in the Standard Model, which is
responsible for the masses of the elementary fermions and the W±, Z0 bosons. Section 8 will focus on
the hierarchy problem. In section 9 I start by discussing the first Little Higgs model, which is
based on SU(3). It will act as a toy model to comprehend the physics. Later, in section 11, I will
discuss the 'Littlest Higgs', based on SU(5), which is the minimal model that could act as a viable
extension of the Standard Model at the TeV scale. In section 10 I will give a general introduction
to the group SU(5) by discussing how the fundamental particles can be distributed over the
irreducible SU(5) representations. For this, we will use one of the branching rules derived in
section 5.
2 Necessities from representation theory
This section will serve as an overview of all results from representation theory that will be needed in
the following sections. The results are mainly based on [1], [4], [5] and [6]². We begin by defining what
we mean by a representation.
2.1 Representations
Definition 2.1. Let G be a group. A representation of G on a C-vector space V is a homomorphism
ρ : G → GL(V ) from G to the group of automorphisms of V , i.e. a map such that ρ(gh) = ρ(g)ρ(h)
for all g, h ∈ G and ρ(1) = 1.
The dimension of the representation is the dimension of the vector space V . Here GL(V ) is the group
of all invertible linear transformations φ : V → V . We will often call V itself the representation
of G and omit the symbol ρ, that is, write gv for ρ(g)v. A few representations that will be
important are the following.
Definition 2.2. The trivial representation of G is the representation ρ : G → GL(C) such that
ρ(g) = 1 for all g ∈ G.
All groups have this one-dimensional representation. In the case where G = Sn, the symmetric group
on n letters, there is a second one-dimensional representation.
Definition 2.3. The sign representation of Sn is the representation sgn : Sn → GL(C) such that:
$$\operatorname{sgn}(g) = \begin{cases} 1 & \text{if } g \text{ is an even permutation} \\ -1 & \text{if } g \text{ is an odd permutation} \end{cases} \tag{2.1}$$
Definition 2.4. If X is a finite set and G acts on X on the left, then there is an associated permutation
representation. If V is the vector space with basis $\{e_x : x \in X\}$, then G acts on V by
$$g \cdot \sum_{x \in X} a_x e_x = \sum_{x \in X} a_x e_{gx}. \tag{2.2}$$
Definition 2.5. The permutation representation corresponding to the left action of G on itself is
called the regular representation.
This representation has dimension |G| and has a basis given by $\{e_g : g \in G\}$.
Definition 2.6. A sub-representation of a representation V is a linear subspace W of V which is
invariant under the action of G, i.e. g · w ∈ W for all g ∈ G and w ∈ W .
Definition 2.7. A representation V is called irreducible (or simple) if its only sub-representations
are 0 and V itself. It is called indecomposable if it cannot be written as a direct sum of two nonzero
sub-representations. It is called reducible if it has a proper nonzero sub-representation.
² Proofs have been omitted in agreement with Prof. Stokman, mainly because of their length and
limited relevance.
Example 2.1. Consider the symmetric group Sn on n letters and let e1, e2, . . . , en be the standard
basis of C^n. Sn is a permutation group and thus has a natural permutation representation, in which
it acts on C^n by permuting the indices of the basis vectors. Note that the one-dimensional subspace
spanned by e1 + e2 + . . . + en is left invariant under the action of Sn. It has a complementary
subspace {(x1, . . . , xn) : x1 + . . . + xn = 0}. This subspace is (n − 1)-dimensional and is also
invariant, hence a sub-representation. It is called the standard representation of Sn.
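Example 2.1 can be checked by hand for small n. The following is a minimal sketch for S3, assuming permutations are stored as tuples p with p[i] the image of i (the helper names are illustrative and not part of the thesis). It checks that permutation matrices multiply like the permutations themselves, that e1 + e2 + e3 is fixed, and that a sample vector with coordinate sum zero stays in the sum-zero subspace.

```python
from itertools import permutations

def perm_matrix(p):
    """Matrix of the permutation p: column j has its 1 in row p[j], so e_j -> e_{p[j]}."""
    n = len(p)
    return [[1 if p[j] == i else 0 for j in range(n)] for i in range(n)]

def compose(p, q):
    """Composition p after q (first apply q, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def apply(m, v):
    """Matrix-vector product."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

S3 = list(permutations(range(3)))

# rho(pq) = rho(p) rho(q): the permutation matrices form a representation.
is_homomorphism = all(
    perm_matrix(compose(p, q)) == matmul(perm_matrix(p), perm_matrix(q))
    for p in S3 for q in S3
)

# The line spanned by e1 + e2 + e3 is fixed pointwise.
fixes_all_ones = all(apply(perm_matrix(p), [1, 1, 1]) == [1, 1, 1] for p in S3)

# A sample vector with x1 + x2 + x3 = 0 is mapped to vectors with the same property.
sample = [2, -3, 1]
preserves_sum_zero = all(sum(apply(perm_matrix(p), sample)) == 0 for p in S3)

print(is_homomorphism, fixes_all_ones, preserves_sum_zero)
```

The check of the sum-zero subspace uses a single sample vector only; the general statement follows because permuting coordinates does not change their sum.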
Making new representations
Once we have two representations V and W it is possible to construct new representations. This
is most easily done by taking their direct sum V ⊕ W . The tensor product V ⊗ W is also a
representation via g(v ⊗ w) = gv ⊗ gw. It follows that the nth tensor power $V^{\otimes n}$ is
also a representation. This nth tensor power has the exterior power and the symmetric power, denoted
$\Lambda^n V$ and $\operatorname{Sym}^n V$ respectively, as sub-representations. They are defined as:
Definition 2.8. The nth symmetric power is the subspace of $V^{\otimes n}$ spanned by
$$\Big\{ \sum_{\sigma \in S_n} v_{\sigma(1)} \otimes \cdots \otimes v_{\sigma(n)} \;\Big|\; v_i \in V \Big\}$$
Definition 2.9. The nth exterior power is the subspace of $V^{\otimes n}$ spanned by
$$\Big\{ \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma)\, v_{\sigma(1)} \otimes \cdots \otimes v_{\sigma(n)} \;\Big|\; v_i \in V \Big\}$$
A completely different way of constructing a new representation is through the dual space V ∗
of V . This is the space of all linear maps φ : V → C.
Definition 2.10. If $\rho_V : G \to GL(V)$ is a representation of G on a vector space V , then the dual
representation $\rho_{V^*} : G \to GL(V^*)$ is defined by
$$\rho_{V^*}(g)(\varphi) = \varphi \circ \rho_V(g^{-1})$$
The vector space Hom(V,W ) can also be made into a representation through the map
$$\rho_{\operatorname{Hom}(V,W)}(g)(\varphi) = \rho_W(g) \circ \varphi \circ \rho_V(g^{-1})$$
If we compare this equation with the defining map for the dual representation, we see that in the
case W = C, the trivial representation, we can make the identification
V ∗ ∼= Hom(V,C). This is in fact a special case of the general isomorphism
$$\operatorname{Hom}(V, W) \cong V^* \otimes W$$
Complete reducibility and Schur’s lemma
Given a representation we would now like to know how it is built up from its irreducible sub-
representations, or put differently, how it decomposes in terms of its irreducible sub-representations.
We start with the following result.
Proposition 2.1. If W is a sub-representation of a representation V of a finite group G, then there is a
complementary invariant subspace W⊥ of V such that V = W ⊕ W⊥.
Proof To prove this, we define a Hermitian inner product H(·, ·) on V such that H(gv, gw) = H(v, w)
for all g ∈ G and v, w ∈ V . We can obtain such a Hermitian inner product by taking any Hermitian
inner product H0 on V and averaging it over G. That is, we define it as
$$H(v, w) = \frac{1}{|G|} \sum_{g \in G} H_0(gv, gw)$$
Let W⊥ be the orthogonal complement of W with respect to H. If W is a sub-representation of V , then
W⊥ is also a sub-representation of V , since
$$gv \in W^\perp \iff H(gv, w) = 0 \text{ for all } w \in W \iff H(v, g^{-1}w) = 0 \text{ for all } w \in W,$$
and the last condition holds for every v ∈ W⊥ because $g^{-1}w \in W$. We can thus describe W⊥ using W .
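The averaging step in this proof can be illustrated on a very small case. The sketch below is only an illustration under stated assumptions: the group is the hypothetical two-element group {1, g} acting on a real two-dimensional space by a matrix that is not orthogonal, so the standard inner product H0 is not invariant, while the averaged H is. The invariant line W is spanned by e1, and the vector u = (1, −2) spans its H-orthogonal complement.

```python
from fractions import Fraction

# Hypothetical example: g has order two and fixes e1, but is not orthogonal.
IDENTITY = [[1, 0], [0, 1]]
G_MATRIX = [[1, 1], [0, -1]]
GROUP = [IDENTITY, G_MATRIX]

def apply(m, v):
    """Matrix-vector product."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def h0(v, w):
    """Standard inner product (real vectors, so no complex conjugation is needed)."""
    return sum(a * b for a, b in zip(v, w))

def h(v, w):
    """Averaged inner product H(v, w) = 1/|G| * sum over g of H0(gv, gw)."""
    return Fraction(sum(h0(apply(m, v), apply(m, w)) for m in GROUP), len(GROUP))

e1 = [1, 0]   # spans the invariant subspace W, since g e1 = e1
e2 = [0, 1]
u = [1, -2]   # candidate spanning vector for the complement of W with respect to H

# H0 is not invariant under g, but the averaged form H is (checked on sample vectors).
h0_is_invariant = h0(apply(G_MATRIX, e2), apply(G_MATRIX, e2)) == h0(e2, e2)
h_is_invariant = all(
    h(apply(G_MATRIX, v), apply(G_MATRIX, w)) == h(v, w)
    for v in (e1, e2, [3, 5]) for w in (e1, e2, [-2, 7])
)

# u is H-orthogonal to W, and g maps u to -u, so the complement is invariant.
u_is_orthogonal_to_w = h(u, e1) == 0
g_preserves_complement = apply(G_MATRIX, u) == [-x for x in u]

print(h0_is_invariant, h_is_invariant, u_is_orthogonal_to_w, g_preserves_complement)
```

Note that the complement with respect to the standard inner product, spanned by e2, is not invariant here, which is why the averaging is needed.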
This proposition tells us that any representation with a proper nonzero sub-representation is a
direct sum of two sub-representations, and by induction on the dimension we conclude that it can be
written as a direct sum of irreducible sub-representations.
Corollary 2.1. Any representation of a finite group G can be written as a direct sum of irreducibles.
This is a property that all finite groups have and it is called complete reducibility. It does not
tell us, however, how a representation decomposes as a direct sum of irreducibles, nor whether this
decomposition is unique. This is settled by Schur's lemma. To formulate it we need to introduce the
notion of a G-homomorphism or intertwining operator.
Definition 2.11. Given two representations (ρ, V ) and (π,W ) of the same group G, an intertwining
operator or G-homomorphism is a linear map φ : V → W that intertwines the actions of G, i.e.
for which the following holds:
φ(ρ(g)v) = π(g)(φ(v)), for all g ∈ G, v ∈ V
Remark Both the kernel and the image of φ, denoted Ker(φ) and Im(φ), are sub-representations of V
and W respectively. The vector space of all intertwining operators is denoted by HomG(V,W ), which is
a subspace of Hom(V,W ). We can now formulate Schur's lemma, telling us under what conditions two
irreducible representations V and W will be equivalent.
Lemma 2.1. Schur's lemma If V and W are two irreducible representations of G and φ : V → W is
a G-homomorphism, then
1. φ is an isomorphism or φ = 0.
2. If V = W , then φ = λI for some complex scalar λ, where I is the identity map.
Proof The first part of the lemma follows from the fact that the kernel and the image of an
intertwiner are invariant subspaces of V and W respectively. Since V and W are both irreducible,
these cannot be proper nonzero subspaces. Therefore, Ker(φ) is either 0 or V . If Ker(φ) = V then
φ = 0; if φ ≠ 0 then Ker(φ) = 0, meaning φ is injective. Similarly, Im(φ) is either 0 or W . If φ = 0
then Im(φ) = 0; otherwise Im(φ) = W and φ is surjective. Thus, if both V and W are irreducible, φ is
an isomorphism or the zero map.
For (2) we let λ be an eigenvalue³ of φ. It exists since C is algebraically closed. Then the operator
φ − λ·Id has a non-zero kernel, and it is again a G-homomorphism. But then (1) implies that
φ − λ·Id = 0. Thus φ is multiplication by the scalar λ.
Note that with this lemma it follows that the intertwiners between irreducible representations satisfy:
• HomG(V,W ) = 0 if V is not isomorphic to W .
• HomG(V,W ) ∼= C if V ∼= W .
Proposition 2.2. For any representation V of a finite group G there is a decomposition
$$V = V_1^{\oplus a_1} \oplus \cdots \oplus V_k^{\oplus a_k}$$
where the Vi are non-isomorphic irreducible representations. This decomposition of V into a direct
sum of k factors is unique up to isomorphism, and so are the Vi that occur in the decomposition and
their multiplicities ai.
proof Suppose W is another representation of G with decomposition $W = \bigoplus_j W_j^{\oplus b_j}$.
Suppose further that φ : V → W is a G-homomorphism. Then by Schur's lemma φ must map the factor
$V_i^{\oplus a_i}$ into that factor $W_j^{\oplus b_j}$ for which $V_i \cong W_j$, since the component of φ
from Vi to any Wj that is not isomorphic to Vi is zero. By applying this to the identity map
φ : V → V the uniqueness follows.
2.2 Character theory
This section will discuss character theory, a convenient way to characterize representations.
Definition 2.12. If (ρ, V ) is a representation of G, then its character χV is the function
χV : G → C defined by
$$\chi_V(g) = \operatorname{Tr}(\rho(g)|_V)$$
i.e. the trace of ρ(g) on V .
The characters are class functions on the group G. The set of class functions on G, written Cclass(G),
is the set of functions that are constant on the conjugacy classes of G. That characters are class
functions can be seen as follows:
$$\chi_V(hgh^{-1}) = \operatorname{Tr}(\rho(h)\rho(g)\rho(h)^{-1}) = \operatorname{Tr}(\rho(h)^{-1}\rho(h)\rho(g)) = \operatorname{Tr}(\rho(g)) = \chi_V(g).$$
A few more results about characters that we will need are given by the following proposition.
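Proposition 2.3. Let V and W be two representations of G. Then
$$\chi_{V \oplus W} = \chi_V + \chi_W, \qquad \chi_{V \otimes W} = \chi_V \cdot \chi_W,$$
$$\chi_{V^*}(g) = \overline{\chi_V(g)}, \qquad \chi_{\wedge^2 V}(g) = \tfrac{1}{2}\left[\chi_V(g)^2 - \chi_V(g^2)\right]$$
³ λ is an eigenvalue of φ if it is a root of the characteristic polynomial det(φ − λ Id).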
proof To prove this we consider a fixed element g ∈ G and compute the values of these characters on
g. Let $\{\lambda_i\}$ and $\{\mu_j\}$ be the eigenvalues of g on V and W respectively. Then the
first two formulas follow from the observation that $\{\lambda_i\} \cup \{\mu_j\}$ are the eigenvalues
of g on V ⊕ W and $\{\lambda_i \mu_j\}$ those of g on V ⊗ W . Similarly
$\lambda_i^{-1} = \overline{\lambda_i}$ are the eigenvalues of g on V ∗, since all eigenvalues are nth
roots of unity, with n the order of g. Regarding the last formula we observe that
$\{\lambda_i \lambda_j : i < j\}$ are the eigenvalues of g on $\wedge^2 V$ and
$$\sum_{i<j} \lambda_i \lambda_j = \frac{\left(\sum_i \lambda_i\right)^2 - \sum_i \lambda_i^2}{2}$$
Then the formula follows since $g^2$ has eigenvalues $\lambda_i^2$.
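The exterior square formula can be tested on a concrete case. As a rough sketch, take V to be the permutation representation of S4 on C^4, so that χV(g) is the number of fixed points of g (the trace of a permutation matrix). The helper `wedge2_character` below is an illustrative name; it computes the trace on ∧²V directly in the basis ei ∧ ej with i < j, using ej ∧ ei = −ei ∧ ej.

```python
from itertools import combinations, permutations

def fixed_points(p):
    """Number of i with p[i] == i, i.e. the trace of the permutation matrix of p."""
    return sum(1 for i, image in enumerate(p) if image == i)

def square(p):
    """The permutation p composed with itself."""
    return tuple(p[p[i]] for i in range(len(p)))

def wedge2_character(p):
    """Trace of p on the exterior square, computed in the basis e_i ^ e_j (i < j)."""
    total = 0
    for i, j in combinations(range(len(p)), 2):
        if (p[i], p[j]) == (i, j):      # e_i ^ e_j is mapped to itself
            total += 1
        elif (p[i], p[j]) == (j, i):    # e_i ^ e_j is mapped to e_j ^ e_i = -(e_i ^ e_j)
            total -= 1
    return total

S4 = list(permutations(range(4)))

# Compare the direct computation with (chi(g)^2 - chi(g^2)) / 2 for every g in S4.
formula_holds = all(
    2 * wedge2_character(p) == fixed_points(p) ** 2 - fixed_points(square(p))
    for p in S4
)
print(formula_holds)
```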
Characters have many applications. For one, they can be used to explicitly find the decomposition of
a representation into a direct sum of its irreducible sub-representations. In order to achieve this we
want to find a way to project V onto the irreducible representations to find out if those irreducibles
are in V , and if so, to determine the multiplicity that they appear with. For this we introduce the
first projection formula by setting:
$$\varphi = \frac{1}{|G|} \sum_{g \in G} \rho(g) \in \operatorname{End}(V). \tag{2.3}$$
Here |G| represents the number of elements of the group G. φ thus represents the average of all
endomorphisms ρ(g) : V → V . We further set for any representation V of G
$$V^G = \{ v \in V : gv = v \text{ for all } g \in G \} \tag{2.4}$$
This is a sub-representation of V and it is a direct sum of trivial sub-representations of V .
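Proposition 2.4. The map φ defined in (2.3) is a projection of V onto $V^G$.
proof Suppose first that $v = \varphi(w) = \frac{1}{|G|} \sum_{g \in G} gw$. Then:
$$hv = \frac{1}{|G|} \sum_{g \in G} hgw = \frac{1}{|G|} \sum_{g \in G} gw = v \quad \text{for any } h \in G$$
so $\operatorname{Im}(\varphi) \subset V^G$. To prove "⊃", we let $v \in V^G$; then
$\varphi(v) = \frac{1}{|G|} \sum_{g \in G} v = v$, so $V^G \subset \operatorname{Im}(\varphi)$. Further,
$\varphi \circ \varphi = \varphi$.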
With formula (2.3) we can explicitly find the direct sum of the trivial sub-representations in a given
representation. In particular, the multiplicity of the trivial representation appearing in the
decomposition of V is the dimension of $V^G$. Since φ is a projection onto $V^G$, this dimension is the
trace of φ. Therefore, writing m for the multiplicity, we have:
$$m = \operatorname{Trace}(\varphi) = \frac{1}{|G|} \sum_{g \in G} \operatorname{Trace}(g) = \frac{1}{|G|} \sum_{g \in G} \chi_V(g) \tag{2.5}$$
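Formula (2.5) can be evaluated directly for the permutation representation of Sn on C^n from Example 2.1. The sketch below assumes that the character of a permutation matrix is its number of fixed points (this is Theorem 2.2 further on), and averages it over the group with exact fractions. The function name `trivial_multiplicity` is illustrative.

```python
from fractions import Fraction
from itertools import permutations

def fixed_points(p):
    """Character of the permutation representation at p: the number of fixed points."""
    return sum(1 for i, image in enumerate(p) if image == i)

def trivial_multiplicity(n):
    """Average of the character over S_n, i.e. formula (2.5) for the action on C^n."""
    group = list(permutations(range(n)))
    return Fraction(sum(fixed_points(p) for p in group), len(group))

# The invariant vectors of C^n are the multiples of e1 + ... + en, so we expect m = 1.
multiplicities = {n: trivial_multiplicity(n) for n in range(1, 6)}
print(multiplicities)
```

For S3 the sum reads (3 + 1 + 1 + 1 + 0 + 0)/6 = 1, in agreement with the one-dimensional invariant subspace found in Example 2.1.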
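We can do more with this idea. We let
$$\operatorname{Hom}(V, W)^G = \{ G\text{-module homomorphisms } \varphi : V \to W \}$$
Then it follows by Schur's lemma that if V is irreducible, $\dim(\operatorname{Hom}(V, W)^G)$ is the
multiplicity of V in W . Also, if V and W are both irreducible we have
$$\dim(\operatorname{Hom}(V, W)^G) = \begin{cases} 1 & \text{if } V \cong W \\ 0 & \text{if } V \not\cong W \end{cases} \tag{2.6}$$
Using proposition 2.3 we have
$$\chi_{\operatorname{Hom}(V,W)}(g) = \overline{\chi_V(g)} \cdot \chi_W(g)$$
where we used the fact $\operatorname{Hom}(V, W) \cong V^* \otimes W$. By now applying (2.5) we deduce: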
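$$\frac{1}{|G|} \sum_{g \in G} \overline{\chi_V(g)}\, \chi_W(g) = \begin{cases} 1 & \text{if } V \cong W \\ 0 & \text{if } V \not\cong W \end{cases} \tag{2.7}$$
where V and W are irreducible. This relation looks a lot like an inner product (χV , χW ) of the two
characters, and that is precisely what it is. It is a Hermitian inner product on Cclass(G):
$$(\alpha, \beta) = \frac{1}{|G|} \sum_{g \in G} \overline{\alpha(g)}\, \beta(g) \tag{2.8}$$
Therefore we can reformulate (2.7) as: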
Theorem 2.1. In terms of the inner product (2.8), the characters of the irreducible representations
are orthonormal.
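For S3 the orthonormality can be verified by brute force. The sketch below assumes the three characters of S3 in the following form: the trivial character, the sign character, and the character of the standard representation, taken as (number of fixed points) − 1 because C^3 splits as trivial ⊕ standard (Example 2.1 and Proposition 2.3). All three are real-valued, so the complex conjugation in the inner product is omitted.

```python
from fractions import Fraction
from itertools import permutations

S3 = list(permutations(range(3)))

def fixed_points(p):
    return sum(1 for i, image in enumerate(p) if image == i)

def sign(p):
    """Sign of a permutation, computed from its number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

characters = {
    "trivial": lambda p: 1,
    "sign": sign,
    "standard": lambda p: fixed_points(p) - 1,
}

def inner(alpha, beta):
    """Inner product of two real-valued class functions on S3."""
    return Fraction(sum(alpha(p) * beta(p) for p in S3), len(S3))

# Gram matrix of the three characters; orthonormality means it is the identity matrix.
gram = {(a, b): inner(characters[a], characters[b]) for a in characters for b in characters}
for pair, value in gram.items():
    print(pair, value)
```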
We now let $V \cong V_1^{\oplus a_1} \oplus \cdots \oplus V_k^{\oplus a_k}$ with the Vi distinct irreducible
representations; then $\chi_V = \sum_i a_i \chi_{V_i}$. Since the $\chi_{V_i}$ are linearly independent, we
can conclude the following.
Corollary 2.2. Any representation is determined by its character.
By using (2.7) we can further deduce the following results.
Corollary 2.3. A representation V is irreducible iff (χV , χV ) = 1.
proof The implication from left to right follows immediately from (2.7). For the other implication we
let $V \cong V_1^{\oplus a_1} \oplus \cdots \oplus V_k^{\oplus a_k}$ with the Vi distinct irreducible
representations. Then $(\chi_V, \chi_V) = \sum_i a_i^2$, which is 1 only if k = 1 and a1 = 1, i.e. only
if V is irreducible.
Corollary 2.4. Let V have a decomposition as above. Then the multiplicity ai of Vi in V is the inner
product ai = (χV , χVi).
proof We have that
$$(\chi_V, \chi_{V_i}) = \sum_j a_j (\chi_{V_j}, \chi_{V_i}) = a_i$$
since $(\chi_{V_j}, \chi_{V_i}) = 0$ for i ≠ j and 1 when i = j.
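Corollary 2.4 turns decomposition into a finite computation. As an illustration, the sketch below decomposes the tensor square of the standard representation of S3. It assumes the same three real-valued characters of S3 as before (trivial, sign, and fixed points minus one), and uses Proposition 2.3 for the character of a tensor product.

```python
from fractions import Fraction
from itertools import permutations

S3 = list(permutations(range(3)))

def fixed_points(p):
    return sum(1 for i, image in enumerate(p) if image == i)

def sign(p):
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def standard(p):
    """Character of the two-dimensional standard representation of S3."""
    return fixed_points(p) - 1

irreducible_characters = {"trivial": lambda p: 1, "sign": sign, "standard": standard}

def inner(alpha, beta):
    """Inner product of real-valued class functions on S3."""
    return Fraction(sum(alpha(p) * beta(p) for p in S3), len(S3))

def tensor_square(p):
    """Character of standard (x) standard: the product of the characters."""
    return standard(p) ** 2

# Multiplicity of each irreducible in the tensor square, via a_i = (chi_V, chi_{V_i}).
multiplicities = {
    name: inner(tensor_square, chi) for name, chi in irreducible_characters.items()
}
print(multiplicities)
```

Each irreducible occurs once, so the four-dimensional tensor square splits as trivial ⊕ sign ⊕ standard, consistent with the dimension count 1 + 1 + 2 = 4.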
Another important result follows from the fixed point formula applied to the regular representation.
Theorem 2.2. Fixed point formula Let G be a finite group, and X a finite set. Let V be the
permutation representation as in definition 2.4. Then for all g ∈ G, χV (g) is the number of elements
in X left fixed under the action of g.
proof Observe that the matrix M that is associated with the action of g is a permutation matrix.
Suppose first that $X = \{x_1, x_2, x_3\}$ and ρ(g) permutes the basis vectors of V by sending
$e_{x_1} \to e_{x_3}$, $e_{x_2}$ to itself and $e_{x_3} \to e_{x_1}$. Then
$$M = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}$$
In the general case: if $g e_{x_i} = e_{g x_i} = e_{x_j}$, then the matrix M has a 1 in the i-th column
and j-th row, and zeros in all other entries of that column. In particular, when $x_i$ is held fixed by
g, so that $g e_{x_i} = e_{g x_i} = e_{x_i}$, M has a 1 in the i-th row and i-th column. Therefore, the
trace is the number of 1's on the diagonal, i.e. the number of elements left fixed by g.
Then we deduce that the character of the regular representation, χR, is given by:
$$\chi_R(g) = \begin{cases} 0 & \text{if } g \neq e \\ |G| & \text{if } g = e \end{cases} \tag{2.9}$$
Thus R is irreducible only when G = {e}, and if we let $R = \bigoplus_i V_i^{\oplus a_i}$ be a
decomposition into distinct irreducibles Vi we find
$$a_i = (\chi_{V_i}, \chi_R) = \frac{1}{|G|}\, \overline{\chi_{V_i}(e)}\, |G| = \dim(V_i),$$
which gives us the following corollary:
Corollary 2.5. Any irreducible representation V of G appears in the regular representation dim (V )
times. In particular this means that the regular representation contains all irreducibles.
Another consequence is the following:
$$|G| = \dim(R) = \sum_i a_i \dim(V_i) = \sum_i (\dim(V_i))^2$$
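These statements about the regular representation can be checked for S3. The sketch below computes χR by counting the fixed points of left multiplication on the group itself (the fixed point formula), and then the multiplicities of the irreducibles. It assumes the three real-valued characters of S3 used earlier (trivial, sign, and fixed points minus one for the standard representation).

```python
from fractions import Fraction
from itertools import permutations

S3 = list(permutations(range(3)))
IDENTITY = (0, 1, 2)

def compose(p, q):
    """Composition p after q."""
    return tuple(p[q[i]] for i in range(len(q)))

def fixed_points(p):
    return sum(1 for i, image in enumerate(p) if image == i)

def sign(p):
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def regular_character(g):
    """Fixed point formula for G acting on itself: count the h with g*h == h."""
    return sum(1 for h in S3 if compose(g, h) == h)

irreducible_characters = {
    "trivial": lambda p: 1,
    "sign": sign,
    "standard": lambda p: fixed_points(p) - 1,
}

# Multiplicity of each irreducible in R, and its dimension chi(e).
multiplicities = {
    name: Fraction(sum(chi(g) * regular_character(g) for g in S3), len(S3))
    for name, chi in irreducible_characters.items()
}
dimensions = {name: chi(IDENTITY) for name, chi in irreducible_characters.items()}
sum_of_squares = sum(d * d for d in dimensions.values())

print(multiplicities, dimensions, sum_of_squares)
```

Each irreducible appears as often as its dimension, and 1² + 1² + 2² = 6 = |S3|.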
To conclude this section on character theory there is one more result that we will need.
Proposition 2.5. The number of irreducible representations of G is equal to the number of conjugacy
classes of G. Equivalently, the irreducible characters form an orthonormal basis for the set of class
functions Cclass(G).
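proof Take α : G → C a class function with (α, χV ) = 0 for all irreducible representations V . We
have to show that α = 0. For this consider the endomorphism
$$\varphi_{\alpha,V} = \sum_{g \in G} \alpha(g)\, g : V \to V$$
We now want to apply Schur's lemma. For this we first have to show that $\varphi_{\alpha,V}$ is a
G-homomorphism (intertwining operator):
$$\begin{aligned}
\varphi_{\alpha,V}(hv) &= \sum_{g} \alpha(g)\, g(hv) \\
&= \sum_{g} \alpha(hgh^{-1})\, hgh^{-1}(hv) \\
&= h\Big(\sum_{g} \alpha(hgh^{-1})\, g(v)\Big) \\
&= h\Big(\sum_{g} \alpha(g)\, g(v)\Big) \\
&= h(\varphi_{\alpha,V}(v))
\end{aligned}$$
In the second step the sum is re-indexed by g → hgh⁻¹, and in the fourth step we use that α is a
class function. It now follows by Schur's lemma (2.1) that $\varphi_{\alpha,V} = \lambda \cdot \mathrm{Id}$
for irreducible V , where
$$\begin{aligned}
\lambda &= \frac{1}{\dim V} \operatorname{Trace}(\varphi_{\alpha,V}) \\
&= \frac{1}{\dim V} \sum_{g} \alpha(g)\, \chi_V(g) \\
&= \frac{|G|}{\dim V}\, \overline{(\alpha, \chi_{V^*})} \\
&= 0
\end{aligned}$$
since V ∗ is irreducible as well. Therefore $\varphi_{\alpha,V} = 0$ for every irreducible V and hence
$\sum_{g \in G} \alpha(g)\, g = 0$ on any representation. This also holds for the regular
representation R, and in this representation the elements of G are linearly independent, implying
α(g) = 0 for all g ∈ G as was to be shown.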
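For the symmetric groups the count in Proposition 2.5 can be made explicit. The sketch below computes the conjugacy classes of Sn by brute force for small n and compares their number with the number of partitions of n, which is the set that will parametrize the irreducible representations of Sn in section 3. The helper names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Composition p after q."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Split a finite group (given as a list of permutations) into conjugacy classes."""
    remaining = set(group)
    classes = []
    while remaining:
        g = next(iter(remaining))
        cls = {compose(compose(h, g), inverse(h)) for h in group}
        classes.append(cls)
        remaining -= cls
    return classes

def count_partitions(n, largest=None):
    """Number of partitions of n into parts of size at most `largest`."""
    if largest is None:
        largest = n
    if n == 0:
        return 1
    return sum(count_partitions(n - k, k) for k in range(1, min(n, largest) + 1))

class_counts = {}
for n in (3, 4, 5):
    group = list(permutations(range(n)))
    classes = conjugacy_classes(group)
    # The classes partition the group, so their sizes add up to n!.
    assert sum(len(c) for c in classes) == len(group)
    class_counts[n] = len(classes)

partition_counts = {n: count_partitions(n) for n in (3, 4, 5)}
print(class_counts, partition_counts)
```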
2.3 Modules and the group algebra
There is one particular choice for the vector space V that turns out to be very convenient. This is
when V is taken to be the group algebra C[G]. Before getting into the necessary details about the
group algebra, I will first review the concepts of algebras and modules.
Definition 2.13. An associative algebra over C is a vector space A over C together with a bilinear
map A × A → A, (a, b) ↦ ab, such that (ab)c = a(bc).
Definition 2.14. Let A be an algebra with unit 1A. A left A-module is a finite dimensional vector
space V over C together with a map A × V → V , (a, v) ↦ av, which is bilinear and satisfies
a(bv) = (ab)v and 1A v = v for all a, b ∈ A, v ∈ V .
Just as we can define a representation of a group G, we can define a representation of an algebra.
Definition 2.15. A representation of an algebra A (or equivalently a left A-module) is a vector space
V together with an algebra homomorphism
φ : A → End(V )
i.e. a linear map such that φ(ab) = φ(a)φ(b) and φ(1A) = 1.
Definition 2.16. The regular representation of A (also called the left regular A-module) is the
vector space A itself, made into an A-module through the map A × A → A, given by (a, b) ↦ ab for all
a, b ∈ A.
As said, we now consider the particular case where the algebra A is taken to be the group algebra
C[G]. This is the vector space with $\{e_g \mid g \in G\}$ as set of basis vectors and multiplication
defined by $e_g \cdot e_h = e_{gh}$. It consists of all elements of the form
$$\mathbb{C}[G] = \Big\{ \sum_{g \in G} a_g e_g \;\Big|\; a_g \in \mathbb{C} \Big\}.$$
In this case C[G]-modules correspond directly to representations of G over C, since any
representation ρ : G → GL(V ) can be linearly extended to a map $\tilde\rho : \mathbb{C}[G] \to \operatorname{End}(V)$ via:
$$\tilde\rho : \mathbb{C}[G] \to \operatorname{End}(V), \qquad \tilde\rho\Big(\sum_{g \in G} a_g e_g\Big) = \sum_{g \in G} a_g\, \rho(g) \in \operatorname{End}(V). \tag{2.10}$$
Therefore the correspondence $\rho \mapsto \tilde\rho$ gives us an equivalence between the representations
of G and C[G]-modules. Further, sub-representations correspond to submodules, irreducible
representations to simple submodules, etc. All statements about representations of G have an
equivalent statement in terms of its group algebra.
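To make the group algebra concrete, the sketch below models elements of C[S3] as dictionaries mapping group elements to coefficients, with the product obtained by extending the group multiplication bilinearly. It then checks on one example that the linear extension (2.10) of the permutation representation respects products. The names `multiply` and `extend` are illustrative, and integer coefficients are used for simplicity.

```python
def compose(p, q):
    """Composition p after q of permutations stored as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def multiply(a, b):
    """Product in the group algebra: elements are dicts {group element: coefficient}."""
    out = {}
    for g, x in a.items():
        for h, y in b.items():
            gh = compose(g, h)
            out[gh] = out.get(gh, 0) + x * y
    return {g: c for g, c in out.items() if c != 0}

def perm_matrix(p):
    """Permutation matrix sending e_j to e_{p[j]}."""
    n = len(p)
    return [[1 if p[j] == i else 0 for j in range(n)] for i in range(n)]

def extend(a, n=3):
    """Linear extension of the permutation representation to the group algebra."""
    total = [[0] * n for _ in range(n)]
    for g, coeff in a.items():
        m = perm_matrix(g)
        for i in range(n):
            for j in range(n):
                total[i][j] += coeff * m[i][j]
    return total

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

E = (0, 1, 2)   # identity
T = (1, 0, 2)   # transposition swapping 0 and 1
C = (1, 2, 0)   # 3-cycle 0 -> 1 -> 2 -> 0
S = (0, 2, 1)   # transposition swapping 1 and 2

a = {E: 2, T: -1}
b = {C: 3, S: 1}
product = multiply(a, b)

# The extended map is an algebra homomorphism on this example.
extension_is_multiplicative = extend(product) == matmul(extend(a), extend(b))

# Unlike the group itself, the group algebra has zero divisors: (e + t)(e - t) = 0.
zero_divisor_product = multiply({E: 1, T: 1}, {E: 1, T: -1})

print(product, extension_is_multiplicative, zero_divisor_product)
```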
Proposition 2.6. If Wi are the irreducible representations of G then we have an isomorphism of
algebras:
$$\mathbb{C}[G] \cong \bigoplus_i \operatorname{End}(W_i)$$
proof As mentioned above, a group homomorphism G → GL(V ) extends linearly to an algebra
homomorphism C[G] → End(V ). By applying this to each of the Wi we find the canonical map:
$$\varphi : \mathbb{C}[G] \to \bigoplus_i \operatorname{End}(W_i)$$
This map is an isomorphism. It is injective since the regular representation is faithful and
decomposes into the Wi. Surjectivity then follows from the observation that both sides have dimension
$\sum_i (\dim(W_i))^2$.
Remark. We can alternatively formulate this in terms of matrix algebras over the division ring C,
because endomorphisms of vector spaces can be seen as matrices. If we denote by ni the dimension of
Wi, then:
$$\mathbb{C}[G] \cong \bigoplus_i \operatorname{Mat}_{n_i}(\mathbb{C}).$$
This relation in fact holds in the more general case where A is a semisimple algebra. It is a result
due to Wedderburn. A proof can be found in [4], p. 26. Since it is rather long I will not include it
here. It introduces the opposite algebra to show that $A \cong \bigoplus \operatorname{End}(V_i)$ and uses
the observation that if we can decompose a module $V = \bigoplus_{i=1}^{k} n_i U_i$ into irreducibles Ui
with multiplicities ni, then
$$\operatorname{End}(V) = \operatorname{End}\Big(\bigoplus_{i=1}^{k} n_i U_i\Big) \cong \bigoplus_{1 \le i,j \le k} \operatorname{Hom}(n_i U_i, n_j U_j) \cong \bigoplus_{i=1}^{k} \operatorname{End}(n_i U_i) \cong \bigoplus_{i=1}^{k} \operatorname{Mat}_{n_i}(\mathbb{C})$$
where the third relation follows from Schur's lemma (2.1), since intertwiners between non-isomorphic
irreducibles are zero. Any semisimple algebra A can in this way be written as a direct sum of matrix
algebras over C.
Primitive idempotents in the center of the group algebra
I will now introduce an important type of element in an algebra A, the idempotent elements.
These we will need for the irreducible representations of the symmetric group in the next section. In
the following A will always be a unital, finite dimensional, associative, commutative algebra over C.
Definition 2.17. An idempotent element p ∈ A is an element that satisfies $p^2 = p$. Two idempotents
$p_1, p_2$ are called mutually orthogonal if $p_1 p_2 = 0 = p_2 p_1$. An idempotent p is called primitive or
minimal if $p = p_1 + p_2$, with $p_1$ and $p_2$ mutually orthogonal idempotents, implies $p_1 = 0$ or $p_2 = 0$. A set of
mutually orthogonal idempotents $p_1, \ldots, p_n$ is complete if $p_1 + \ldots + p_n = 1$.
These idempotent elements generate left ideals in a commutative algebra A and these left ideals are
precisely its submodules. The irreducible submodules of A correspond to the minimal left ideals
generated by primitive idempotents. In fact we have the following lemma:
Lemma 2.2. Let A be an algebra. If V = U ⊕ W is a decomposition of V as a direct sum of
A-submodules, then the projections of V onto U and W, denoted $p_u$ and $p_w$ respectively, satisfy
• $p_u$ and $p_w$ are mutually orthogonal idempotents in A.
• 1 = pu + pw
• if p ∈ A is an idempotent then 1− p is an idempotent as well, 1 = p+ (1− p) is a decomposition
of 1 ∈ A as sum of orthogonal idempotents and V = pV ⊕ (1− p)V is a decomposition of V as
a direct sum of A-submodules.
Proof Since (1) and (2) are obvious, we only prove (3). Suppose that p ∈ A is an idempotent, then
also 1 − p ∈ A is an idempotent, since we have $(1-p)^2 = 1 - 2p + p^2 = 1 - p$. They are clearly
orthogonal since $p(1-p) = (1-p)p = p - p^2 = 0$. Therefore, Im(p) ∩ Im(1 − p) = pV ∩ (1 − p)V = 0
and V = Im(p) + Im(1 − p) = pV + (1 − p)V. Thus V = pV ⊕ (1 − p)V is a direct sum decomposition
of V.
In the case where $\{p_i\}_{i=1,\ldots,n}$ is a complete set of orthogonal idempotents, then by the previous
lemma we have that $V = \bigoplus_{i=1}^{n} p_i V$ is a decomposition of V as a direct sum of A-submodules. There are a
few more results about these idempotents that we will need.
Lemma 2.3. Suppose $p_1, p_2 \in A$ are primitive idempotents. Then $p_1 p_2 = 0$ iff $p_1 \neq p_2$.
Proof We will prove $p_1 p_2 \neq 0$ iff $p_1 = p_2$. For "⇒", let $p_1, p_2 \in A$ be primitive idempotents such
that $p_1 p_2 \neq 0$. Then
\[ p_1 = p_1 p_2 + p_1(1 - p_2) \]
is a decomposition of $p_1$ into mutually orthogonal idempotents; here we use that A is commutative. Now, since
$p_1$ is primitive and $p_1 p_2 \neq 0$ we must have $p_1(1 - p_2) = 0$ and thus $p_1 = p_1 p_2$. Also $p_2 = p_2 p_1$ by
interchanging $p_1$ and $p_2$, and thus $p_1 = p_2$. Conversely, if $p_1 = p_2$, then $p_1 p_2 = p_1^2 = p_1 \neq 0$.
Corollary 2.6. The primitive idempotents of an algebra A form a finite, linearly independent set.
Proof Suppose $\sum_i \lambda_i p_i = 0$, then $0 = p_j \sum_i \lambda_i p_i = \lambda_j p_j$ by lemma 2.3. Thus $\lambda_j = 0$ for all j and the
set is thus linearly independent. Since A is finite dimensional, the set $\{p_i\}$ is a finite set.
Proposition 2.7. Let $\{a_i\}$ be the set of primitive idempotents in A. Then
\[ 1 = \sum_i a_i \]
Proof We prove this by induction on the dimension of A. If dim(A) = 1 there is nothing to prove
since then 1 is the only primitive idempotent. Suppose now dim(A) > 1. If 1 ∈ A is primitive, then it
is the only primitive idempotent. To see this, suppose a ∈ A were another primitive idempotent. Then
$0 \neq a = a \cdot 1$ and thus a = 1 by lemma 2.3. Therefore, it remains to prove the induction step in the
case that 1 ∈ A is not primitive. Then there exist nonzero, orthogonal idempotents b, c ∈ A such that 1 = b + c. Now set $A(b) = Ab = \{ab : a \in A\}$ and $A(c) = Ac = \{ac : a \in A\}$. Then
A(b), A(c) ⊂ A are subalgebras of A with unit elements b and c respectively. Further, we have that
A = A(b) + A(c), since 1 = b + c, and A(b) ∩ A(c) = 0 since bc = 0. Thus, viewed as vectorspaces we
have
\[ A = A(b) \oplus A(c) \]
We further conclude that A is isomorphic to the direct sum of the two subalgebras A(b) and A(c)
since A(b)A(c) = 0. Now, since b ∈ A(b) and c ∈ A(c) we have A(b) ≠ 0 ≠ A(c) and thus dim A(b),
dim A(c) < dim(A). By the induction hypothesis, $b = \sum_i b_i$ and $c = \sum_j c_j$ with $\{b_i\}$ and $\{c_j\}$ the sets of
primitive idempotents of A(b) and A(c) respectively. Then the observation that $\{b_i\} \cup \{c_j\}$ is the set of
primitive idempotents of A(b) ⊕ A(c) completes the proof.
Proposition 2.8. Let A be an algebra and p ∈ A an idempotent element. Then the left ideal Ap is
indecomposable if p is a primitive idempotent.
Proof Suppose Ap were decomposable. Then $Ap = I_1 \oplus I_2$ for two nonzero A-submodules of Ap.
Since p ∈ Ap, we can find unique $p_1 \in I_1$ and $p_2 \in I_2$ that satisfy $p = p_1 + p_2$. If $p_1 = 0$ then
$p = p_2 \in I_2$, implying that $Ap = Ap_2 \subseteq I_2$, since $I_2$ is a left ideal, whereby $I_1 \oplus I_2 \subset I_2$, which
contradicts the assumption $I_1 \neq 0$. Similar reasoning applies for $p_2 = 0$. Further, since $p_1 \in Ap$ we have
$p_1 = p_1 p$, and thus
\[ (1 - p_1)p_1 = p_1 - p_1^2 = p_1 p - p_1^2 = p_1(p_1 + p_2) - p_1^2 = p_1 p_2. \]
However, $p_1 p_2 \in I_2$ and $(1 - p_1)p_1 \in I_1$, and we thus conclude that $p_1 - p_1^2 = p_1 p_2 = 0$ since $I_1 \cap I_2 = 0$.
Thus $p_1^2 = p_1$ and $p_1 p_2 = 0$. Similarly, $p_2^2 = p_2$ and $p_2 p_1 = 0$ by interchanging $p_1$ and $p_2$. Thus $p_1$ and
$p_2$ are orthogonal idempotents with $p_1 + p_2 = p$. Therefore p is not primitive.
Definition 2.18. A finite dimensional, associative, unital algebra A over C is semi simple if A is the
sum of its simple left ideals.
The group algebra C[G], for one, is semi-simple. We can thus conclude that finding a complete set of
primitive orthogonal idempotent elements in Z(C[G]), the center of the group algebra, and determining
the left ideals they generate will give a decomposition in terms of its simple left ideals, i.e. simple
submodules. That the idempotents be in the center is important, since the group algebra is in
general not commutative. These idempotents are defined as:
\[ p_\pi = \frac{\dim(V_\pi)}{|G|} \sum_{g \in G} \chi_\pi(g)\, e_g \in Z(\mathbb{C}[G]) \tag{2.11} \]
where π is an irreducible representation of G. Then⁴:
Corollary 2.7.
1. $\{p_\pi\}$ is a linear basis of Z(C[G]).
2. $\sum_\pi p_\pi = e_e$, where $e_e \in \mathbb{C}[G]$ is the unit element.
3. $\{p_\pi\}$ is the set of primitive orthogonal idempotent elements of Z(C[G]).
⁴ I will not prove this here since its proof is rather long. It can be found in [5]. The proof relies on the
observation that $f \mapsto \psi_f \equiv \sum_{g \in G} f(g) e_g$ defines an isomorphism between the set of class functions F(G) and
Z(C[G]), and on the observations that the characters span the set of class functions and form a set of idempotent
elements of F(G).
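The three statements of Corollary 2.7 can be checked by hand for a small group. The following is a minimal sketch for G = S3, assuming the standard character table of S3 (trivial, sign and the 2-dimensional standard representation); the helper names `compose`, `multiply` and `central_idempotent` are illustrative and not taken from the references. Since all characters of S3 are real, no complex conjugation is needed in (2.11) here.

```python
from fractions import Fraction
from itertools import permutations

# Permutations of {0, 1, 2} stored as tuples: p[i] is the image of i.
G = list(permutations(range(3)))
identity = (0, 1, 2)

def compose(p, q):
    """Return p∘q, i.e. apply q first and then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def fixed_points(p):
    return sum(1 for i, image in enumerate(p) if i == image)

# Characters of the three irreducible representations of S3.
characters = {
    "trivial": lambda p: 1,
    "sign": sign,
    "standard": lambda p: fixed_points(p) - 1,
}

def multiply(x, y):
    """Product in C[G]; elements are dicts {group element: coefficient}."""
    out = {}
    for g, a in x.items():
        for h, b in y.items():
            gh = compose(g, h)
            out[gh] = out.get(gh, 0) + a * b
    return {g: c for g, c in out.items() if c != 0}

def central_idempotent(chi):
    """The element p_pi of (2.11) for the character chi."""
    dim = chi(identity)
    return {g: Fraction(dim * chi(g), len(G)) for g in G if chi(g) != 0}

idempotents = {name: central_idempotent(chi) for name, chi in characters.items()}
```

With these definitions, `multiply(p, p) == p` holds for each of the three elements, products of two different ones vanish, and their sum is the unit element of C[S3].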
Now, we have also seen that the left regular C[G] module contains all simple submodules in its
decompositions. Thus, the problem of finding all irreducible representations amounts to constructing
such a complete set of idempotents of Z(C[G]). However, constructing such a set is not as easy as
it might seem. What we know is that there are as many irreducibles as conjugacy classes, but these
are not always easy to determine for a general group. In the case of the symmetric group though, we
will see in the next section that these conjugacy classes are in bijection with the partitions of n, and
these we can determine easily. First though, we will need to take a look at restricted and induced
representations.
2.4 Restricted and induced representations
Given a representation (ρ, V) of a group G and a subgroup H ⊂ G we can consider the restricted representation,
denoted $\mathrm{Res}^G_H(\rho)$. This is the representation of H defined by
\[ \mathrm{Res}^G_H(\rho) : H \to GL(V), \qquad \mathrm{Res}^G_H(\rho) := \rho|_H \tag{2.12} \]
In the same way that the above operation of restriction allows us to construct representations of subgroups, we can consider induced representations. This operation produces representations of G from representations of H. Here I will briefly discuss this particular construction. For
this we let V be a representation of G and W ⊆ V an H-invariant subspace. We write G/H for the set of left
cosets of H in G, i.e. the set of equivalence classes of G w.r.t. the equivalence relation g ∼ g′
iff $g^{-1}g' \in H$. Its elements are thus the left cosets $gH = \{gh : h \in H\}$, g ∈ G. Now, for any g ∈ G,
the subspace g · W depends only on the left coset gH of g modulo H, since ghW = g(hW) = gW.
For a coset σ ∈ G/H we can therefore write σW for the corresponding subspace of V. Then the induced representation is
defined as follows:
Definition 2.19. Let V and W be representations of G and H respectively, with H ⊆ G a
subgroup and W ⊆ V. Then we say that V is induced by W if:
\[ V = \bigoplus_{\sigma \in G/H} \sigma W \]
In this case we write $V = \mathrm{Ind}^G_H W$, or simply Ind W if there is no ambiguity.
From the previous section we know that representations of a group G have an exact equivalent in
terms of its group algebra C[G]. In this sense, the induced representation $\mathrm{Ind}^G_H W$ is defined as
the left C[G]-module⁵
\[ \mathbb{C}[G] \otimes_{\mathbb{C}[H]} W \quad \text{with action} \quad \rho(a)(a' \otimes_{\mathbb{C}[H]} w) = (aa') \otimes_{\mathbb{C}[H]} w. \]
Two important induced representations that we will see later are the following.
Example 2.2. The permutation representation of G is induced from the trivial one dimensional
representation W of H.
Example 2.3. The regular representation of G is induced from the trivial representation on the trivial
subgroup.
⁵ A proof can be found in [6]. Again it will not be included since it is rather long.
3 The irreducible representations of Sn
Equipped with all the results of the previous section we now turn to the symmetric group and in
particular to the construction of its irreducible representations. This construction will also allow us to
construct the irreducible representations of GL(V) in section 4. First though, we need some results
concerning the symmetric group, and in particular we discuss Young diagrams. Results are from
[1], [4], [7] and [8].
3.1 The symmetric group
Recall that the symmetric group was defined as:
Definition 3.1. Let n ≥ 1 and write $S_n$ for the symmetric group on n letters. $S_n$ is the group that
consists of all bijections of $\Omega_n = \{1, 2, \ldots, n\}$ onto itself under composition. We have $\#S_n = n!$. The
elements of $S_n$ are the permutations. If σ and π represent two permutations, then π ∘ σ means that
we first apply σ and then π.
We will write a permutation $\sigma \in S_n$ using cycle notation. Take, for example, the permutation σ =
(142)(53)(6). This notation means that σ maps 1 → 4, 4 → 2, 2 → 1, 5 → 3 and 3 → 5, and maps 6 to
itself. The content of a cycle $(i_1, \ldots, i_r)$ we denote by I; this is the ordered subset $I = \{i_1, \ldots, i_r\} \subseteq \Omega_n$ of cardinality r. The permutation σ consists of 3 disjoint cycles, where by disjoint we mean that
their contents have empty intersection. Note that disjoint cycles commute. The length of a cycle is
the number of elements it contains, and the identity element is a cycle of length 1. The permutation
σ thus consists of one 3-cycle, one 2-cycle and one 1-cycle.
Lemma 3.1. Any permutation σ ∈ Sn can be written as a product of disjoint cycles.
Definition 3.2. A partition λ of n is a sequence $\lambda = (\lambda_1, \lambda_2, \ldots)$ of nonnegative integers such that $\sum_i \lambda_i = n$ and $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \ldots$. We write λ ⊢ n. We refer to the length of λ by writing l(λ), where
l(λ) is the largest i such that $\lambda_i \neq 0$.
Definition 3.3. Let $\sigma \in S_n$ and write σ as a product of disjoint cycles such that each $i \in \{1, 2, \ldots, n\}$ is in one of the cycles. Collect the lengths of the disjoint cycles and put them in nonincreasing order.
This again defines a partition c(σ) of n, which is called the cycle type of σ.
Lemma 3.2. Two permutations σ and τ are conjugate iff they have the same cycle type.
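The cycle type and its invariance under conjugation are easy to experiment with. The sketch below is only an illustration: permutations are stored 0-based as tuples of images (so the permutation (142)(53)(6) of the text becomes `(3, 0, 4, 1, 2, 5)`), and the function names `cycle_type` and `conjugate_by` are my own.

```python
from itertools import permutations

def cycle_type(perm):
    """Cycle type of a permutation of {0, ..., n-1} given as a tuple of images."""
    seen = set()
    lengths = []
    for start in range(len(perm)):
        if start in seen:
            continue
        length = 0
        i = start
        while i not in seen:
            seen.add(i)
            i = perm[i]
            length += 1
        lengths.append(length)
    # Nonincreasing order, so that the result is a partition of n.
    return tuple(sorted(lengths, reverse=True))

def conjugate_by(sigma, tau):
    """Return sigma ∘ tau ∘ sigma^{-1}."""
    inverse = [0] * len(sigma)
    for i, image in enumerate(sigma):
        inverse[image] = i
    return tuple(sigma[tau[inverse[i]]] for i in range(len(sigma)))

# sigma = (142)(53)(6) from the text, shifted to 0-based labels.
sigma = (3, 0, 4, 1, 2, 5)
sigma_type = cycle_type(sigma)

# The distinct cycle types occurring in S5, one per conjugacy class.
types_in_s5 = {cycle_type(p) for p in permutations(range(5))}
```

Here `sigma_type` is the partition (3, 2, 1) of 6, and `types_in_s5` has 7 elements, one for each partition of 5.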
Recall that by proposition 2.5 we have for any group G that the number of irreducible representations
is equal to its number of conjugacy classes. In the case of G = Sn we further have the following result.
Proposition 3.1. The conjugacy classes of the symmetric group are in bijection with the partitions.
Proof Write $S_n/\sim$ for the set of conjugacy classes and $P_n$ for the set of partitions of n. Now consider
the map
\[ S_n/\sim \;\to\; P_n; \qquad \mathrm{Ad}(S_n)(\tau) \mapsto c(\tau) \]
with c(τ) the cycle type of τ as in the above definition and $\mathrm{Ad}(S_n)(\tau)$ the orbit of τ under conjugation
with $\sigma \in S_n$, i.e. $\mathrm{Ad}(\sigma)(\tau) = \sigma\tau\sigma^{-1}$. These orbits are the conjugacy classes of $S_n$. It remains to
show that this map is well defined and bijective. Consider a cycle $(i_1, i_2, \cdots, i_r)$, then
\[ \sigma (i_1, i_2, \cdots, i_r) \sigma^{-1} = (\sigma(i_1), \sigma(i_2), \cdots, \sigma(i_r)) \]
for all $\sigma \in S_n$ and all subsets $I = \{i_1, \ldots, i_r\} \subseteq \Omega_n$ of cardinality r. It then follows that:
\[ \mathrm{Ad}(S_n)\sigma = \{\tau \in S_n : c(\tau) = c(\sigma)\} \]
and thus the map is well defined and injective. To show surjectivity, let $\lambda = (\lambda_1, \ldots, \lambda_m)$ be a
partition of n, where m = l(λ). Choose further subsets
\[ I_j = \{ i^j_1, i^j_2, \ldots, i^j_{\lambda_j} \} \subset \Omega_n \]
of cardinality $\lambda_j$ such that $I_j \cap I_{j'} = \emptyset$ if $1 \le j \neq j' \le m$. Let $\sigma_j$ be the cycle of length $\lambda_j$ with content $I_j$, i.e.
\[ \sigma_j = (i^j_1, i^j_2, \cdots, i^j_{\lambda_j}) \]
Then
\[ \sigma \equiv \sigma_1 \sigma_2 \cdots \sigma_m \in S_n \]
is a product of disjoint cycles such that c(σ) = λ. Thus $\mathrm{Ad}(S_n)\sigma \mapsto c(\sigma)$ maps onto $P_n$ and surjectivity
is also shown.
Corollary 3.1. The number of irreducible representations of the symmetric group is equal to the
number of partitions.
This tells us that we can parametrize the irreducible representations with the partitions λ of n. In
the next subsection we will discuss an alternative way to look at these partitions by associating them
with Young diagrams. These are a combinatorial tool named after the British mathematician Alfred Young
who first introduced them, in particular to study the representations of the symmetric group.
3.2 Young diagrams and Young tableaux
The Young diagram is an array of boxes that is associated to a given partition λ = (λ1, λ2, .....) of n. A
Young diagram has λi boxes in the ith row which are lined up on the left. The conjugate partition λ′
of the partition λ is defined by interchanging the rows and columns of the Young diagram associated
to λ. Note that (λ′)′ = λ.
Example 3.1. The Young diagrams of the partition λ = (5, 4, 2) and that of its conjugate partition
λ′ = (5, 4, 2)′ = (3, 3, 2, 2, 1) are respectively given by
□□□□□
□□□□
□□
and
□□□
□□□
□□
□□
□
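Conjugation of a partition is simple to compute: the j-th part of λ′ is the number of rows of λ that are longer than j − 1 boxes. The following sketch (with illustrative function names `conjugate` and `draw`) reproduces Example 3.1.

```python
def conjugate(partition):
    """Conjugate partition: the column lengths of the Young diagram."""
    if not partition:
        return ()
    return tuple(sum(1 for part in partition if part > col)
                 for col in range(partition[0]))

def draw(partition):
    """Plain-text Young diagram with one row of boxes per part."""
    return "\n".join("□" * part for part in partition)

lam = (5, 4, 2)
lam_conjugate = conjugate(lam)
```

Applying `conjugate` twice returns the original partition, in line with (λ′)′ = λ.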
Ordering on partitions
Given two partitions of n, $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_l)$ and $\mu = (\mu_1, \mu_2, \ldots, \mu_k)$, we distinguish two different
orderings: the dominance ordering and the lexicographic ordering.
Definition 3.4. We say that λ dominates µ in dominance ordering, written as λ ⊵ µ, if
\[ \sum_{i=1}^{m} \lambda_i \ge \sum_{j=1}^{m} \mu_j \quad \text{for all } 1 \le m \le \max\{k, l\}, \]
where we define $\lambda_i = 0$ if i > l and $\mu_j = 0$ if j > k.
In terms of Young diagrams we say that the Young diagram for λ dominates that for µ if there are
at least as many boxes in the first m rows for λ as in the first m rows for µ, for all 1 ≤ m ≤ max{k, l}.
Example 3.2. This dominance ordering is a partial ordering for all n. However, for n ≤ 5, it is
also a total ordering on the partitions. This is easily verified by drawing all the possible
Young diagrams for n = 1, . . . , 5. For example, if n = 2 we see that the only possible Young
diagrams satisfy
(2) ⊵ (1, 1)
For n = 3 we have:
(3) ⊵ (2, 1) ⊵ (1, 1, 1)
Up to n = 5 we can construct such a chain of Young diagrams, so that any two Young diagrams can be compared. However, when arriving at n = 6, the dominance ordering is no longer a total ordering,
which can be seen by considering the Young diagrams for λ = (2, 2, 2) and µ = (3, 1, 1, 1).
Definition 3.5. We say that λ dominates µ in lexicographic ordering, written as λ > µ, if the first
non-vanishing $\lambda_i - \mu_i$ is positive.
Note that these two orderings are closely related: the dominance ordering implies the
lexicographic ordering, i.e. if λ ⊵ µ and λ ≠ µ then also λ > µ. The other implication does not hold.
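Both orderings can be compared by brute force for small n. The sketch below is an illustration with hypothetical helper names (`partitions`, `dominates`, `is_total`); it relies on the fact that Python compares tuples lexicographically, which for two partitions of the same n coincides with the lexicographic ordering of Definition 3.5.

```python
def partitions(n, largest=None):
    """Yield all partitions of n as nonincreasing tuples, largest first in lexicographic order."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def dominates(lam, mu):
    """True if lam dominates mu in the dominance ordering (same n assumed)."""
    length = max(len(lam), len(mu))
    lam = lam + (0,) * (length - len(lam))
    mu = mu + (0,) * (length - len(mu))
    total_lam = total_mu = 0
    for a, b in zip(lam, mu):
        total_lam += a
        total_mu += b
        if total_lam < total_mu:
            return False
    return True

def is_total(n):
    """True if any two partitions of n are comparable in dominance ordering."""
    parts = list(partitions(n))
    return all(dominates(a, b) or dominates(b, a) for a in parts for b in parts)
```

For instance, `is_total(n)` holds for n = 1, ..., 5 but fails for n = 6, where (2, 2, 2) and (3, 1, 1, 1) are incomparable.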
We obtain a Young tableau from a given Young diagram by numbering the boxes with the numbers
1, . . . , n, where each number is used exactly once. We refer to a Young tableau of shape λ by writing $T_\lambda$
and write $T_\lambda(i, j)$ for the number in the i-th row and j-th column ($1 \le j \le \lambda_i$).
Definition 3.6. Let λ ⊢ n and let $T_\lambda$ be a Young tableau. Then $T_\lambda$ is called
1. row standard if its filling is increasing along each row,
2. column standard if its filling is increasing along each column,
3. standard if it is row standard as well as column standard.
A Young tableau is further called semi-standard if its filling is nondecreasing along each row and strictly
increasing along each column. Note that here it is allowed to place the same number in multiple boxes.
Example 3.3. The λ-tableau defined by
\[ t_\lambda(i, j) = \begin{cases} \sum_{k=1}^{i-1} \lambda_k + j & \text{if } i \neq 1 \\ j & \text{if } i = 1 \end{cases} \]
for $1 \le j \le \lambda_i$ is a standard λ-tableau. For the partition λ = (4, 3, 2) it corresponds to:
1 2 3 4
5 6 7
8 9
Having defined the tableau we write T(λ) for the set of all λ-tableaux. Given a permutation $\sigma \in S_n$ we can obtain a new tableau σT by defining this to be the tableau with the number σ(T(i, j)) in the
(i, j)-th box. This defines a left action of $S_n$ on T(λ), i.e. $S_n \times T(\lambda) \to T(\lambda)$. In the
next section we will use these Young tableaux for the construction of the irreducible representations of
the symmetric group, following [1].
3.3 Constructing the irreducible representations of Sn
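The first step is to define the row and column stabilizer of the Young tableau $T_\lambda$. That is, we define
the following subgroups:
\[ P = P_\lambda = \{\sigma \in S_n : \sigma \text{ preserves each row of } T_\lambda\}, \tag{3.1} \]
and
\[ Q = Q_\lambda = \{\sigma \in S_n : \sigma \text{ preserves each column of } T_\lambda\}. \tag{3.2} \]
Corresponding to these subgroups we introduce two elements in the group algebra $\mathbb{C}[S_n]$ by setting:
\[ a_\lambda = \sum_{\sigma \in P_\lambda} e_\sigma \quad \text{and} \quad b_\lambda = \sum_{\sigma \in Q_\lambda} \mathrm{sgn}(\sigma) \cdot e_\sigma \tag{3.3} \]
and we further define the Young symmetrizer $c_\lambda \in \mathbb{C}[S_n]$ to be
\[ c_\lambda = a_\lambda \cdot b_\lambda = \sum_{\sigma \in P_\lambda,\, \tau \in Q_\lambda} \mathrm{sgn}(\tau) \cdot e_{\sigma\tau}. \tag{3.4} \]
This $c_\lambda$ generates a left ideal in the group algebra $\mathbb{C}[S_n]$. This left ideal is a subrepresentation of the
regular $S_n$ representation and we define it as follows:
Definition 3.7. We call $V_\lambda = \mathbb{C}[S_n] c_\lambda$ the Specht module.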
Example 3.4. This example demonstrates how $V_\lambda$ is computed. For λ = (n) we have $c_\lambda = a_\lambda = \sum_{\sigma \in S_n} e_\sigma$, so
\[ V_{(n)} = \mathbb{C}[S_n] \sum_{\sigma \in S_n} e_\sigma = \mathbb{C} \cdot \sum_{\sigma \in S_n} e_\sigma \]
which is the 1-dimensional trivial representation. For n ≥ 2 we have a second 1-dimensional representation, which we find by taking λ = (1, 1, . . . , 1). Then $c_\lambda = b_\lambda$ and we have
\[ V_{(1,1,\ldots,1)} = \mathbb{C}[S_n] \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) e_\sigma = \mathbb{C} \cdot \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) e_\sigma \]
which is the sign representation. Taking λ = (2, 1), we find for $c_{(2,1)} \in \mathbb{C}[S_3]$,
\[ c_{(2,1)} = (e_{(1)} + e_{(12)}) \cdot (e_{(1)} - e_{(13)}) = e_{(1)} + e_{(12)} - e_{(13)} - e_{(132)}. \]
To find out which subspace this is we multiply $c_{(2,1)}$ from the left by the basis elements of $\mathbb{C}[S_3]$. Then we find:
\begin{align*}
e_{(1)}\, c_{(2,1)} &= e_{(1)} + e_{(12)} - e_{(13)} - e_{(132)} \\
e_{(12)}\, c_{(2,1)} &= e_{(12)} + e_{(1)} - e_{(132)} - e_{(13)} \\
e_{(13)}\, c_{(2,1)} &= e_{(13)} + e_{(123)} - e_{(1)} - e_{(23)} \\
e_{(23)}\, c_{(2,1)} &= e_{(23)} + e_{(132)} - e_{(123)} - e_{(12)} \\
e_{(123)}\, c_{(2,1)} &= e_{(123)} + e_{(13)} - e_{(23)} - e_{(1)} \\
e_{(132)}\, c_{(2,1)} &= e_{(132)} + e_{(23)} - e_{(12)} - e_{(123)}
\end{align*}
Thus $\mathbb{C}[S_3] \cdot c_{(2,1)}$ is the 2-dimensional subspace spanned by the first and third vector, and we conclude
that it is the standard representation we introduced in example 2.1.
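The computation for λ = (2, 1) can be reproduced mechanically. The sketch below is an illustration only: it uses 0-based labels (so the tableau with rows 1 2 and 3 becomes rows {0, 1} and {2}), stores group algebra elements as dictionaries, and the helper names `multiply`, `preserves` and `rank` are my own choices. It builds the elements a, b and c = a·b of (3.3) and (3.4), and then computes the dimension of the span of the six products of a basis element with c.

```python
from fractions import Fraction
from itertools import permutations

n = 3
G = list(permutations(range(n)))  # p[i] is the image of i; labels are 0-based

def compose(p, q):
    """Return p∘q, i.e. apply q first and then p."""
    return tuple(p[q[i]] for i in range(n))

def sign(p):
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def multiply(x, y):
    """Product in C[S_n]; elements are dicts {permutation: coefficient}."""
    out = {}
    for g, a in x.items():
        for h, b in y.items():
            gh = compose(g, h)
            out[gh] = out.get(gh, 0) + a * b
    return {g: c for g, c in out.items() if c != 0}

def preserves(p, blocks):
    """True if p maps every block (a row or a column of the tableau) to itself."""
    return all({p[i] for i in block} == set(block) for block in blocks)

# Tableau of shape (2,1) with first row 1 2 and second row 3, written 0-based.
rows = [{0, 1}, {2}]
columns = [{0, 2}, {1}]

a = {p: 1 for p in G if preserves(p, rows)}
b = {p: sign(p) for p in G if preserves(p, columns)}
c = multiply(a, b)

def rank(vectors):
    """Rank of a list of group algebra elements, by Gaussian elimination over Q."""
    matrix = [[Fraction(v.get(g, 0)) for g in G] for v in vectors]
    r = 0
    for col in range(len(G)):
        pivot = next((i for i in range(r, len(matrix)) if matrix[i][col] != 0), None)
        if pivot is None:
            continue
        matrix[r], matrix[pivot] = matrix[pivot], matrix[r]
        for i in range(len(matrix)):
            if i != r and matrix[i][col] != 0:
                factor = matrix[i][col] / matrix[r][col]
                matrix[i] = [x - factor * y for x, y in zip(matrix[i], matrix[r])]
        r += 1
    return r

specht_dimension = rank([multiply({g: 1}, c) for g in G])
c_squared = multiply(c, c)
```

In this sketch `specht_dimension` comes out as 2, and `c_squared` equals 3·c, so that c/3 is an idempotent.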
We will see that, after a normalization, these cλ form the complete set of primitive orthogonal idem-
potents we set out for and that thus the Vλ are the irreducible representations.
Theorem 3.1. $c_\lambda$ is an idempotent up to scalar multiplication, i.e. $c_\lambda^2 = n_\lambda c_\lambda$, and the Specht module
$V_\lambda$ is an irreducible representation of $S_n$. Every irreducible representation can be obtained in this way.
We need some more results to prove this theorem. In the following, the subscript λ is omitted when
it is clear it should be there, i.e. write a for aλ, etc. The idea behind the proof is to show that the
cλ are primitive idempotents up to a scalar multiple which we will call nλ. Further, we show that for
different partitions, λ, µ the product of the corresponding Young symmetrizers yields zero, meaning
they are mutually orthogonal. Lastly we show that the left ideals they generate are irreducible. We
start by observing that P and Q satisfy the following property which is clear from the way they are
defined.
1. For p ∈ P we have p · a = a · p = a
2. For q ∈ Q we have (sgn(q)q) · b = b · (sgn(q)q) = b
Lemma 3.3. For all p ∈ P and q ∈ Q we have p · c · (sgn(q)q) = c, and c is the only such element in
$\mathbb{C}[S_n]$ up to scalar multiplication.
We first note that since P and Q have trivial intersection, an element of $S_n$ can be written as a
product p · q, p ∈ P, q ∈ Q in at most one way. Therefore, $c = \sum \pm e_g$ where the sum is over all elements
$g \in S_n$ that can be written as a product p · q, with the coefficient being sgn(q). In particular, the coefficient
of $e_1$ in c is 1. In the proof we consider the tableau $T_\lambda = t_\lambda$ defined in example 3.3.
Proof If $\sum n_g e_g$
satisfies this condition, then we have that $n_{pgq} = \mathrm{sgn}(q)\, n_g$ for all g, p, q. In particular we have $n_{pq} = \mathrm{sgn}(q)\, n_1$. We therefore have to verify that $n_g = 0$ if g ∉ PQ. For such a g it is sufficient to find a
transposition t such that p = t ∈ P and $q = g^{-1}tg \in Q$, since then g = pgq, so $n_g = -n_g$. We now
define T′ = gT, i.e. the tableau obtained by replacing each entry i of T by g(i). The claim is that
there are two distinct integers appearing in the same row of T and in the same column of T′, and that
the element t is the transposition of these two integers. It remains to verify that if such a pair of integers
did not exist, then one could write g = pq for some p ∈ P, q ∈ Q. To show this, we take $p_1 \in P$ and $q'_1 \in Q' = gQg^{-1}$ such that the tableaux $p_1 T$ and $q'_1 T'$ have the same first row. This we repeat
on the rest of the tableau. Then one gets p ∈ P and q′ ∈ Q′ so that pT = q′T′. Then pT = q′gT, from
which it follows that p = q′g and thus g = pq where $q = g^{-1} q'^{-1} g \in Q$.
In the following we will use the lexicographic ordering on the partitions λ and µ.
Lemma 3.4. 1. If λ > µ, then for all x ∈ C[Sn] we have that aλ ·x · bµ = 0. In particular we have
cλ · cµ = 0.
2. For all x ∈ C[Sn], cλ · x · cλ is a scalar multiple of cλ. In particular we have cλ · cλ = nλ · cλ,
for some nλ ∈ C.
Proof
1. Take $x = g \in S_n$. Then, since $g b_\mu g^{-1}$ is the element constructed from gT′, with T′ the tableau
used to construct $b_\mu$, it suffices to show that $a_\lambda b_\mu = 0$. When λ > µ, there are
two integers in the same row of T and in the same column of T′. Let t, as in the previous
lemma, be the transposition of these two integers. Then $a_\lambda = a_\lambda \cdot t$ and $t \cdot b_\mu = -b_\mu$, hence
$a_\lambda \cdot b_\mu = a_\lambda \cdot t \cdot t \cdot b_\mu = -a_\lambda \cdot b_\mu$, as was required to show.
2. This follows from Lemma 3.3
Corollary 3.2. If λ < µ, then cλ · C[Sn] · cµ = 0; in particular cλ · cµ = 0.
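Proof We use the anti-involution⁶ $\hat{\ }$ of $\mathbb{C}[S_n]$ that is induced by the map $g \mapsto g^{-1}$, $g \in S_n$.
Noting that $a_\lambda, b_\lambda, a_\mu, b_\mu$ are fixed points of this map, so that e.g. $\hat{c}_\lambda = \widehat{(a_\lambda b_\lambda)} = \hat{b}_\lambda \hat{a}_\lambda = b_\lambda a_\lambda$, we have $\widehat{(c_\lambda x c_\mu)} = \widehat{(a_\lambda b_\lambda x a_\mu b_\mu)} = \hat{b}_\mu \hat{a}_\mu \hat{x} \hat{b}_\lambda \hat{a}_\lambda = b_\mu a_\mu \hat{x} b_\lambda a_\lambda = 0$ since $a_\mu \hat{x} b_\lambda = 0$.
Having shown that the $c_\lambda$ are idempotent elements (up to a scalar) that are mutually orthogonal, we now show that
the $V_\lambda$ they define are irreducible. That is, we prove: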
Lemma 3.5. 1. Each $V_\lambda$ is an irreducible representation of $S_n$.
2. If λ ≠ µ, then $V_\lambda$ and $V_\mu$ are not isomorphic.
Proposition 3.2. Let R be a ring and I ≠ 0 a left ideal of R. If I is a direct summand of R, then
$I^2 \neq 0$.
Proof Suppose that I is a direct summand of R, then there exists a left ideal J such that I ⊕ J = R.
In particular, we can find i ∈ I and j ∈ J such that i + j = 1. Then $i = i^2 + ij$ by multiplying both
sides on the left with i. Then $I^2 \neq 0$, for else we had i = ij ∈ I ∩ J = 0, 1 = j ∈ J, and hence
J = R, I = 0.
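Proof of lemma 3.5.
1. We begin by noting that $c_\lambda V_\lambda \subset \mathbb{C} c_\lambda$ by Lemma 3.4. If $W \subset V_\lambda$ is a subrepresentation, then
either $c_\lambda W$ is $\mathbb{C} c_\lambda$ or 0. If the first is true, then $c_\lambda \in c_\lambda W \subseteq W$, so $V_\lambda = \mathbb{C}[S_n] c_\lambda \subset W$.
Otherwise $W \cdot W \subset \mathbb{C}[S_n] \cdot c_\lambda W = 0$, but then W = 0 by proposition 3.2. This shows in
particular that $c_\lambda V_\lambda \neq 0$, i.e. that the number $n_\lambda \neq 0$.
2. We may assume λ > µ. Then $c_\lambda V_\lambda = \mathbb{C} c_\lambda \neq 0$, but $c_\lambda V_\mu = c_\lambda \mathbb{C}[S_n] c_\mu = 0$. So they cannot be
isomorphic as $\mathbb{C}[S_n]$-modules.
As a final step, we determine the factor $n_\lambda$ in $c_\lambda^2 = n_\lambda c_\lambda$.
Lemma 3.6. For any partition λ, $c_\lambda c_\lambda = n_\lambda c_\lambda$ with $n_\lambda = n!/\dim(V_\lambda)$.
Proof Let F be right multiplication by $c_\lambda$ on $\mathbb{C}[S_n]$. Then, since F is multiplication by $n_\lambda$ on $V_\lambda$, and
zero on Ker($c_\lambda$), the trace of F is $n_\lambda$ times the dimension of $V_\lambda$. But the coefficient of $e_g$ in $e_g c_\lambda$ is 1,
so trace(F) = $|S_n|$ = n!.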
We have thus shown that the elements $\tilde{c}_\lambda = \frac{\dim(V_\lambda)}{n!}\, c_\lambda$ (see footnote 7) form a mutually orthogonal set of primitive idempotents of $Z(\mathbb{C}[S_n])$. This therefore proves the theorem, since they give us all the irreducible
representations by letting λ vary over the partitions. In the remainder of the chapter I will discuss
some more properties of the Specht modules. In particular, I will introduce the Young subgroup
and discuss Young's Rule [1].
⁶ An (anti-)involution is a map f that is its own inverse, i.e. f(f(x)) = x for all x in the domain
of f.
⁷ Compare with (2.11).
3.4 Young subgroups, induced representations and Young’s Rule
We now introduce the Young subgroup. This is a subgroup of $S_n$ that is isomorphic to
\[ S_\lambda = S_{\lambda_1} \times \ldots \times S_{\lambda_k} \]
for some partition $\lambda = (\lambda_1, \ldots, \lambda_k)$. There is one specific Young subgroup that is called the standard
Young subgroup. It is defined as
\[ S_\lambda = S_{\lambda_1} \times S_{\lambda_2} \times \ldots \times S_{\lambda_k} \]
where $S_{\lambda_1}$ acts on the set $\{1, 2, \ldots, \lambda_1\}$, $S_{\lambda_i}$ acts on the set
\[ \Big\{ \sum_{j=1}^{i-1} \lambda_j + 1,\; \sum_{j=1}^{i-1} \lambda_j + 2,\; \ldots,\; \sum_{j=1}^{i-1} \lambda_j + \lambda_i \Big\} \]
and $S_{\lambda_k}$ acts on
\[ \Big\{ \sum_{j=1}^{k-1} \lambda_j + 1,\; \sum_{j=1}^{k-1} \lambda_j + 2,\; \ldots,\; n \Big\}. \]
Since this is a subgroup, we can induce representations of it to representations of $S_n$. In particular,
inducing the trivial representation on each of the $S_{\lambda_i}$ to $S_n$ gives us the permutation representation.
Definition 3.8. We write $M_\lambda$ for the permutation representation obtained by inducing the trivial
representation on $S_\lambda$ to $S_n$, i.e. $M_\lambda = \mathrm{Ind}_{S_\lambda}^{S_n}(1)$.
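This $M_\lambda$ we can equivalently define as $M_\lambda = \mathbb{C}[S_n] a_\lambda$, with $a_\lambda$ as before. Further, since we have a
surjection
\[ M_\lambda = \mathbb{C}[S_n] a_\lambda \twoheadrightarrow V_\lambda = \mathbb{C}[S_n] a_\lambda b_\lambda, \qquad x \mapsto x \cdot b_\lambda \]
and an isomorphism
\[ V_\lambda = \mathbb{C}[S_n] a_\lambda b_\lambda \cong \mathbb{C}[S_n] b_\lambda a_\lambda \subset \mathbb{C}[S_n] a_\lambda = M_\lambda \]
we note that $V_\lambda$ appears in the decomposition of $M_\lambda$ for every partition λ. To see the isomorphism,
note that right multiplication by $a_\lambda$ gives a map $\mathbb{C}[S_n] a_\lambda b_\lambda \to \mathbb{C}[S_n] b_\lambda a_\lambda$ and right multiplication by
$b_\lambda$ gives a map back. The compositions of these two maps are multiplications by non-zero scalars.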
There is in fact an explicit formula, known as Young’s Rule, that tells us how the permutation module
decomposes in terms of the irreducible Specht modules.
Theorem 3.2. (Young's Rule) The permutation module $M_\lambda$ decomposes as
\[ M_\lambda = \bigoplus_{\mu \trianglerighteq \lambda} K_{\mu\lambda} V_\mu \]
Definition 3.9. The numbers $K_{\mu\lambda}$ are called the Kostka numbers.
These Kostka numbers are defined combinatorially as the number of semi-standard tableaux of shape
µ and content λ. That is, it is the number of ways we can fill the boxes of the Young diagram for µ
with $\lambda_1$ 1's, $\lambda_2$ 2's, up to $\lambda_k$ k's, in such a way that the entries in each row are nondecreasing and the
entries in each column are strictly increasing.
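The combinatorial definition translates directly into a counting procedure. The sketch below is one possible way to do this, not taken from the references: in a semi-standard tableau the boxes carrying the largest entry form a horizontal strip (at most one box per column), so one can remove that strip and recurse on the smaller shape. The names `horizontal_strip_removals` and `kostka` are illustrative.

```python
from functools import lru_cache

def horizontal_strip_removals(mu, size):
    """Yield the partitions nu such that mu/nu is a horizontal strip with `size` boxes."""
    def build(row, remaining, prefix):
        if row == len(mu):
            if remaining == 0:
                yield tuple(part for part in prefix if part > 0)
            return
        # The new row length must satisfy mu[row + 1] <= part <= mu[row].
        lower = mu[row + 1] if row + 1 < len(mu) else 0
        for part in range(mu[row], lower - 1, -1):
            removed = mu[row] - part
            if removed > remaining:
                break
            yield from build(row + 1, remaining - removed, prefix + (part,))
    yield from build(0, size, ())

@lru_cache(maxsize=None)
def kostka(mu, lam):
    """Number of semi-standard tableaux of shape mu and content lam."""
    if sum(mu) != sum(lam):
        return 0
    if not lam:
        return 1
    # Remove the boxes filled with the largest entry, then recurse.
    return sum(kostka(nu, lam[:-1])
               for nu in horizontal_strip_removals(mu, lam[-1]))
```

For example, `kostka((4, 1), (2, 1, 1, 1))` counts the semi-standard tableaux of shape (4, 1) filled with two 1's and one each of 2, 3 and 4.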
Proposition 3.3. Suppose λ, µ ⊢ n. Then the Kostka number $K_{\mu\lambda}$ is non-vanishing if and only if
µ ⊵ λ. Further, $K_{\lambda\lambda} = 1$.
The property of the Kostka numbers as stated in proposition 3.3 is important. Suppose we consider
an ordering on the partitions $\lambda^1, \lambda^2, \ldots$ from (n) up to $(1^n)$⁸. Then this will also give us an ordering
on the $M^\lambda$. Young's rule now says that the first module $M^{\lambda^1}$ will be equal to one copy of $V^{\lambda^1}$. The next
module, $M^{\lambda^2}$, will contain this same $V^{\lambda^1}$ with a certain multiplicity, plus one copy of a new irreducible
$V^{\lambda^2}$, etc.
Example 3.5. Consider the partition λ = (1, . . . , 1) Then Mλ is easily seen to be the regular rep-
resentation since we induce the trivial representation from the trivial subgroup. It therefore follows
that in this case Kµ(1,...,1) = dim(Vµ) since for the regular representation the irreducibles occur with
multiplicity being their dimension. This thus provides a way to determine the dimension of Vλ. It is
the number of ways to fill the Young diagram of λ with the numbers 1 to n, in such a way that all rows
and columns are increasing.
Example 3.6. Observe that $K_{(n)\lambda} = 1$, since there is only one semi-standard tableau of shape (n) for any given content. Then by Young's
Rule we conclude that each permutation module $M_\lambda$ contains exactly one copy of the trivial representation $V_{(n)}$. See example 3.4.
The Kostka numbers are usually notated in a table. In the next example I will apply Young’s rule to
compute this table for S5.
Example 3.7. I will begin by giving the table of $K_{\mu\lambda}$ (rows labelled by µ, columns by λ) and then work out how its entries are obtained.
µ \ λ | (5) | (4,1) | (3,2) | (3,1,1) | (2,2,1) | (2,1,1,1) | (1,1,1,1,1)
(5) | 1 | 1 | 1 | 1 | 1 | 1 | 1
(4,1) | 0 | 1 | 1 | 2 | 2 | 3 | 4
(3,2) | 0 | 0 | 1 | 1 | 2 | 3 | 5
(3,1,1) | 0 | 0 | 0 | 1 | 1 | 3 | 6
(2,2,1) | 0 | 0 | 0 | 0 | 1 | 2 | 5
(2,1,1,1) | 0 | 0 | 0 | 0 | 0 | 1 | 4
(1,1,1,1,1) | 0 | 0 | 0 | 0 | 0 | 0 | 1
The values of the Kostka numbers are determined by counting the number of semi-standard Young tableaux of shape
µ and content λ. When λ = (5) there is only one tableau that is semi-standard and has 5 times the
number 1 as numbering of its boxes, namely:
1 1 1 1 1
For the next partition λ = (4, 1) we can draw the following semi-standard tableaux with 4 times a 1 and
one 2. These are
1 1 1 1 2
and
1 1 1 1
2
⁸ Note that this ordering is only a total ordering for n ≤ 5. See example 3.2.
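Thus $K_{(5)(4,1)} = 1$ and $K_{(4,1)(4,1)} = 1$, all other Kostka numbers being zero, so we indeed get a new
Specht module $V^{(4,1)}$. I will not write out the diagrams for all partitions, since the result of doing
so can be read off from the table, but I will write out the semi-standard tableaux for µ for the filling
λ = (2, 1, 1, 1). Writing the rows of each tableau from top to bottom, separated by a slash, these are:
µ = (5): 1 1 2 3 4
µ = (4,1): 1 1 2 3 / 4, 1 1 2 4 / 3, 1 1 3 4 / 2
µ = (3,2): 1 1 2 / 3 4, 1 1 3 / 2 4, 1 1 4 / 2 3
µ = (3,1,1): 1 1 2 / 3 / 4, 1 1 3 / 2 / 4, 1 1 4 / 2 / 3
µ = (2,2,1): 1 1 / 2 3 / 4, 1 1 / 2 4 / 3
µ = (2,1,1,1): 1 1 / 2 / 3 / 4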
We thus have $M^{(2,1,1,1)} \cong V^{(5)} \oplus 3V^{(4,1)} \oplus 3V^{(3,2)} \oplus 3V^{(3,1,1)} \oplus 2V^{(2,2,1)} \oplus V^{(2,1,1,1)}$, and as decomposition
for the regular representation $M^{(1^5)}$ we find
\[ M^{(1^5)} \cong V^{(5)} \oplus 4V^{(4,1)} \oplus 5V^{(3,2)} \oplus 6V^{(3,1,1)} \oplus 5V^{(2,2,1)} \oplus 4V^{(2,1,1,1)} \oplus V^{(1^5)}. \]
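As a consistency check on these two decompositions one can compare dimensions. The check below is an addition to the text and uses two facts: the module induced from the trivial representation of $S_\lambda$ has dimension $n!/(\lambda_1! \cdots \lambda_k!)$, the index of $S_\lambda$ in $S_n$, and the dimensions of the $V^\mu$ are the entries 1, 4, 5, 6, 5, 4, 1 of the last column of the table (see example 3.5).

```latex
\begin{align*}
\dim M^{(2,1,1,1)} &= \frac{5!}{2!\,1!\,1!\,1!} = 60
  = 1 + 3\cdot 4 + 3\cdot 5 + 3\cdot 6 + 2\cdot 5 + 1\cdot 4, \\
\dim M^{(1^5)} &= 5! = 120
  = 1^2 + 4^2 + 5^2 + 6^2 + 5^2 + 4^2 + 1^2 .
\end{align*}
```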
Recall we defined the permutation module as $\mathrm{Ind}_{S_\lambda}^{S_n}(1)$. We can also consider
$M'_\lambda = \mathrm{Ind}_{S_{\lambda'}}^{S_n}(\mathrm{sgn})$, i.e. the representation induced from the sign representation on the Young subgroup $S_{\lambda'}$. This induced representation we can also realize as $M'_\lambda = \mathbb{C}[S_n] b_\lambda$, and this representation
also includes $V_\lambda$. In [2] it is shown that the Specht module is the only irreducible constituent that these
two induced modules have in common and that this common constituent occurs with multiplicity one
(which is reflected by the diagonal Kostka number being one). That is, it is shown that:
\[ \mathrm{Ind}_{S_\lambda}^{S_n}(1) \cap \mathrm{Ind}_{S_{\lambda'}}^{S_n}(\mathrm{sgn}) = V_\lambda \]
That this multiplicity is 1 is crucial in the construction. It ensures that when we consider this
intersection we get exactly one copy of $V_\lambda$. One final result about the Specht modules concerns a
formula for their dimension.
Theorem 3.3. (The Hook Length formula)
\[ \dim V_\lambda = \frac{n!}{l_1! \cdots l_k!} \prod_{1 \le i < j \le k} (l_i - l_j) = \frac{n!}{\prod_{i \le \lambda_j} h_{i,j}} \]
where $h_{i,j}$ is the hook length of the box with label (i, j), i.e. the number of boxes directly below or to
the right of it, counting the box itself once, and $l_i = \lambda_i + k - i$.
The first equality is a result that follows from the Frobenius character formula. We did not discuss
this here; a proof can be found in [1] or [4]. The second equality follows from the observation that
\[ \frac{l_1!}{\prod_{1 < j \le k} (l_1 - l_j)} = \prod_{1 \le m \le l_1,\; m \neq l_1 - l_j} m \]
and noting that the factors m in this product are precisely the hook lengths $h_{i,1}$. Deleting the first
row of the diagram and proceeding by induction proves the statement.
Example 3.8. Consider the partition λ = (4, 3, 1). Labeling the boxes by their hook length gives
6 4 3 1
4 2 1
1
Then for the dimension of the corresponding $S_8$ representation we find:
\[ \dim V_{(4,3,1)} = \frac{8!}{6 \cdot 4 \cdot 3 \cdot 4 \cdot 2} = 70 \]
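Both forms of the formula in Theorem 3.3 are straightforward to evaluate. The following sketch is an illustration with hypothetical function names; it indexes boxes 0-based by (row, column) and computes the dimension once from the hook lengths and once from the numbers $l_i = \lambda_i + k - i$.

```python
from math import factorial

def hook_lengths(partition):
    """Hook lengths of the Young diagram, listed row by row."""
    columns = ([sum(1 for part in partition if part > c)
                for c in range(partition[0])] if partition else [])
    return [
        # boxes to the right + boxes below + the box itself
        [(partition[r] - c - 1) + (columns[c] - r - 1) + 1
         for c in range(partition[r])]
        for r in range(len(partition))
    ]

def dimension(partition):
    """dim V_lambda as n! divided by the product of all hook lengths."""
    product = 1
    for row in hook_lengths(partition):
        for h in row:
            product *= h
    return factorial(sum(partition)) // product

def dimension_from_l(partition):
    """dim V_lambda from the first expression, with l_i = lambda_i + k - i."""
    k = len(partition)
    l = [partition[i] + k - (i + 1) for i in range(k)]
    numerator = factorial(sum(partition))
    for i in range(k):
        for j in range(i + 1, k):
            numerator *= l[i] - l[j]
    denominator = 1
    for value in l:
        denominator *= factorial(value)
    return numerator // denominator
```

For λ = (4, 3, 1) both functions return 70, in agreement with Example 3.8.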
4 The irreducible representations of GL(V )
In this section we will focus on the irreducible representations of the group GL(V) ≅ GL(n,C). It turns out that there is a connection between these and the Specht modules we considered in the previous section. In particular, we will determine the characters of these irreducible GL(V) representations. We will follow [1].
Given a group $G$, suppose we have a representation of $G$ on a vector space $V$, which we denote by $v \mapsto gv$. We can now consider the $n$th tensor power $V^{\otimes n}$, which is also a representation, and both $G$ and $S_n$ act on this space. We have a left action of $G$ given by $g(v_1 \otimes \cdots \otimes v_n) = (gv_1) \otimes \cdots \otimes (gv_n)$. We also have a right action of $S_n$ on $V^{\otimes n}$ given by $(v_1 \otimes \cdots \otimes v_n)\sigma = v_{\sigma(1)} \otimes \cdots \otimes v_{\sigma(n)}$, and it is easily seen that these two actions commute. This $V^{\otimes n}$ is not irreducible though, and we would therefore like to break it up into irreducible representations of $G$. We will see that in the case where $G = GL(V)$ this can actually be accomplished⁹. Due to the commutativity of the actions of $S_n$ and $GL(V)$ on $V^{\otimes n}$ we expect there to be some kind of relation between the decomposition into irreducibles of $V^{\otimes n}$ when viewed as $S_n$ representation and its decomposition when viewed as $GL(V)$ representation. This, we will see, is indeed the case, and to construct these irreducible $GL(V)$ representations we will use the Young symmetrizer $c_\lambda$. Recall that it was defined as
\[ c_\lambda = a_\lambda \cdot b_\lambda = \sum_{\sigma \in P_\lambda,\, \tau \in Q_\lambda} \mathrm{sgn}(\tau)\, e_{\sigma\tau}. \]
We can now define a new representation of $GL(V)$, which we will denote by $S_\lambda V$, as the image of $c_\lambda$ on $V^{\otimes n}$. Thus:
\[ S_\lambda V = \mathrm{Im}(c_\lambda|_{V^{\otimes n}}). \]
Since the actions of $GL(V)$ and $S_n$ commute, this $S_\lambda V$ is a subrepresentation of $V^{\otimes n}$.
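Definition 4.1. We call the functor $V \rightsquigarrow S_\lambda V$, which sends a representation $V$ to $S_\lambda V$, the Schur functor. The representation $S_\lambda V$ is called the Weyl module.
By a functor we mean that a linear map $\phi : V \to W$ between two vector spaces determines a linear map $S_\lambda(\phi) : S_\lambda V \to S_\lambda W$ with $S_\lambda(\phi \circ \psi) = S_\lambda(\phi) \circ S_\lambda(\psi)$ and $S_\lambda(\mathrm{Id}_V) = \mathrm{Id}_{S_\lambda V}$.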
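Example 4.1. In this example I will demonstrate how $c_\lambda$ acts on $V^{\otimes n}$, by decomposing $V^{\otimes 2}$. We start by recalling that the $n$th symmetric and exterior powers, denoted $\mathrm{Sym}^n V$ and $\bigwedge^n V$, are subrepresentations of $V^{\otimes n}$. We can realize them as $GL(V)$ representations by considering the partitions $(n)$ and $(1^n)$. In those cases we have for the Young symmetrizer $c_\lambda$ that $c_{(n)} = a_{(n)}$ and $c_{(1^n)} = b_{(1^n)}$. Then, if $v_1 \otimes \cdots \otimes v_n \in V^{\otimes n}$, $c_\lambda$ acts on this tensor by permuting the indices and we have for any $n$:
\[ a_{(n)}(v_1 \otimes \cdots \otimes v_n) = \sum_{\sigma \in S_n} e_\sigma (v_1 \otimes \cdots \otimes v_n) = \sum_{\sigma \in S_n} v_{\sigma(1)} \otimes \cdots \otimes v_{\sigma(n)}, \qquad \text{so} \quad c_{(n)} V^{\otimes n} = \mathrm{Sym}^n V \]
and
\[ b_{(1^n)}(v_1 \otimes \cdots \otimes v_n) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, e_\sigma (v_1 \otimes \cdots \otimes v_n) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, v_{\sigma(1)} \otimes \cdots \otimes v_{\sigma(n)}, \qquad \text{so} \quad c_{(1^n)} V^{\otimes n} = \textstyle\bigwedge^n V \]
by definitions 2.8 and 2.9. Therefore these two partitions $(n)$ and $(1^n)$ correspond, for any $n$, to the functors
\[ V \rightsquigarrow \mathrm{Sym}^n V \quad \text{and} \quad V \rightsquigarrow \textstyle\bigwedge^n V \]
respectively.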
⁹ In the case of a general group G though, the best we can then hope for is to break it up into some subrepresentations.
This also immediately gives us the decomposition for $n = 2$: $V \otimes V = \mathrm{Sym}^2 V \oplus \bigwedge^2 V$. For $n > 2$ there will be additional spaces in the decomposition, which appear with a certain multiplicity $m$. For example, when $n = 3$, we have the additional symmetrizer
\[ c_{(2,1)} = 1 + e_{(12)} - e_{(13)} - e_{(132)}. \]
Its image on $V^{\otimes 3}$ is the vector space spanned by the vectors
\[ v_1 \otimes v_2 \otimes v_3 + v_2 \otimes v_1 \otimes v_3 - v_3 \otimes v_2 \otimes v_1 - v_3 \otimes v_1 \otimes v_2. \]
In the next subsection we will formulate a theorem that will enable us to determine the multiplicity $m$, and we will see that this multiplicity is related to the Specht module corresponding to the same partition.
4.1 The irreducible characters of SλV .
We will now take a closer look at the representations SλV of GL(V ) we have constructed. As we
will see, they are indeed irreducible representations and their characters will be identified with certain
symmetric polynomials, called Schur polynomials. The needed results about these Schur polynomials,
some other symmetric polynomials and needed relations between them can be found in appendix A.
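Theorem 4.1.¹⁰
1. Let $m_\lambda$ be the dimension of the irreducible representation $V_\lambda$ of $S_n$ corresponding to $\lambda$. Then
\[ V^{\otimes n} \cong \bigoplus_\lambda (S_\lambda V)^{\oplus m_\lambda} \]
2. Let $k = \dim V$. For any semisimple $g \in GL(V)$, the trace of $g$ on $S_\lambda V$ is the value of the Schur polynomial on the eigenvalues $x_1, \dots, x_k$ of $g$ on $V$, i.e.
\[ \chi_{S_\lambda V}(g) = s_\lambda(x_1, \dots, x_k) \]
3. Each $S_\lambda V$ is an irreducible representation of $GL(V)$.
4. Let $k = \dim V$. Then $S_\lambda V$ is zero if $\lambda_{k+1} \neq 0$. If $\lambda = (\lambda_1 \geq \dots \geq \lambda_k)$, then
\[ \dim S_\lambda V = s_\lambda(1, \dots, 1) = \prod_{1 \le i < j \le k} \frac{\lambda_i - \lambda_j + j - i}{j - i} \]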
Before turning to the proof of this theorem, we have a closer look at what (1) and (2) say. We already stated that we expected the representations of $GL(V)$ and $S_n$ to be connected in some way due to their commuting actions on $V^{\otimes n}$. This is indeed reflected by (1). It says that as $GL(V)$-module, $V^{\otimes n}$ decomposes into irreducible $GL(V)$-submodules $S_\lambda V$, each occurring with a multiplicity equal to the dimension of the corresponding Specht module, the irreducible representation of $S_n$¹¹.
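Statement (1) can be checked at the level of dimensions: it implies $k^n = \sum_{\lambda \vdash n} m_\lambda \dim S_\lambda V$ with $k = \dim V$. The sketch below is an illustration only and the function names are hypothetical. It takes $m_\lambda$ from the hook length formula (Theorem 3.3) and $\dim S_\lambda V$ from the product formula in part (4) of the theorem.

```python
from fractions import Fraction
from math import factorial


def partitions(n, max_part=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def dim_specht(shape):
    """m_lambda = dim V_lambda via the hook length formula."""
    if not shape:
        return 1
    column_lengths = [sum(1 for row in shape if row > j) for j in range(shape[0])]
    hook_product = 1
    for i, row in enumerate(shape):
        for j in range(row):
            hook_product *= (row - j - 1) + (column_lengths[j] - i - 1) + 1
    return factorial(sum(shape)) // hook_product


def dim_weyl(shape, k):
    """dim S_lambda V for dim V = k via the product over pairs i < j."""
    if len(shape) > k:
        return 0  # S_lambda V vanishes if lambda has more than k rows
    lam = list(shape) + [0] * (k - len(shape))
    result = Fraction(1)
    for i in range(k):
        for j in range(i + 1, k):
            result *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return int(result)


def tensor_power_dimension(k, n):
    """Sum of m_lambda * dim S_lambda V over all partitions lambda of n."""
    return sum(dim_specht(lam) * dim_weyl(lam, k) for lam in partitions(n))


print(tensor_power_dimension(3, 3), 3 ** 3)  # 27 27
```

For $k = 3$ and $n = 3$ the three contributions are $1 \cdot 10$, $2 \cdot 8$ and $1 \cdot 1$, for the partitions $(3)$, $(2,1)$ and $(1,1,1)$.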
¹⁰ The theorem and its proof are from [1].
¹¹ In fact, this duality also holds the other way around (a fact that will not be proven here). As $S_n$ representation, $V^{\otimes n} \cong \bigoplus_\lambda (V_\lambda)^{\oplus n_\lambda}$, where $n_\lambda$ is the dimension of $S_\lambda V$.
Regarding (2) we can already say something about a special case that is easy to see, namely the case $\lambda = (n)$. We let $\rho(g)$ be a semisimple endomorphism of $V$; we know this leads to an endomorphism of $S_\lambda V$, and we want to compute the trace of this endomorphism. For this we let $x_1, x_2, \dots, x_k$ be the eigenvalues of $\rho(g)$ on $V$, where $k = \dim(V)$. Now in the case where $\rho(g)$ is a diagonal matrix, we have $\chi_V(g) = x_1 + x_2 + \dots + x_k$. Then in the case $\lambda = (n)$, $S_\lambda V = \mathrm{Sym}^n V$, and the symmetrized products of $n$ eigenvectors form a basis of eigenvectors of $\mathrm{Sym}^n V$ whose eigenvalues are the monomials of degree $n$ in $x_1, \dots, x_k$. Hence $\chi_{\mathrm{Sym}^n V}$ is the sum of all these monomials, which is the complete symmetric polynomial of degree $n$. Thus we have the special case:
\[ \chi_{\mathrm{Sym}^n V} = h_{(n)}(x_1, \dots, x_k) \tag{4.1} \]
We now turn to the proof. For this we translate the fact that the actions of $GL(V)$ and $S_n$ on $V^{\otimes n}$ commute to the language of algebras. This we will do by introducing the commutator algebra. We will formulate the results for the general case and then apply them to the situation we have in the theorem. We consider a finite group $G$, later to be taken $S_n$, and let $U$ be a right module over the algebra $A = \mathbb{C}[G]$. We define the commutator algebra $B$ as:
\[ B = \mathrm{Hom}_G(U, U) = \{ \phi : U \to U \mid \phi(v \cdot g) = \phi(v) \cdot g \text{ for all } v \in U,\ g \in G \} \tag{4.2} \]
It is the algebra of all the endomorphisms $\phi$ of $U$ that commute with the action of $G$. $B$ acts on $U$ from the left, and this action commutes with the right action of $A$ on $U$. If now $U = \bigoplus_i U_i^{\oplus n_i}$ is a decomposition into irreducible right $A$-modules, with the $U_i$ pairwise non-isomorphic, then we find:
\[ B = \bigoplus_i \mathrm{Hom}_G(U_i^{\oplus n_i}, U_i^{\oplus n_i}) \cong \bigoplus_i \mathrm{Mat}_{n_i}(\mathbb{C}) \]
which follows from Schur's lemma (2.1) in the same way as before. If we now consider an additional left $A$-module $W$, we can construct a left $B$-module through the tensor product:
\[ U \otimes_A W = U \otimes_{\mathbb{C}} W \,/\, \text{subspace generated by } va \otimes w - v \otimes aw \]
This defines a left $B$-module by acting on the first factor: $b(v \otimes w) = (bv) \otimes w$. Having defined these we can now formulate the first lemma:
Lemma 4.1. Let $U$ be a finite dimensional right $A$-module.
1. For any $c \in A$, the canonical map $U \otimes_A Ac \to Uc$ is an isomorphism of left $B$-modules.
2. If $W = Ac$ is an irreducible left $A$-module, then $U \otimes_A W = Uc$ is an irreducible left $B$-module.
3. If $W_i = Ac_i$ are the distinct irreducible left $A$-modules, with $m_i$ being the dimension of $W_i$, then
\[ U \cong \bigoplus_i (U \otimes_A W_i)^{\oplus m_i} \cong \bigoplus_i (Uc_i)^{\oplus m_i} \]
is the decomposition of $U$ into irreducible left $B$-modules.
A first observation to make is that $Ac$ is a direct summand of $A$ due to semisimplicity.
Proof.
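1. Consider the following commuting diagram:
\[
\begin{array}{ccccc}
U \otimes_A A & \xrightarrow{\ \cdot c\ } & U \otimes_A Ac & \hookrightarrow & U \otimes_A A \\
\downarrow & & \downarrow & & \downarrow \\
U & \xrightarrow{\ \cdot c\ } & U \cdot c & \hookrightarrow & U
\end{array}
\tag{4.3}
\]
The vertical mappings are $v \otimes a \mapsto va$. Since the left horizontal maps are right-multiplication with $c$, these maps are surjective. The right horizontal maps are embeddings and thus injective. The outer vertical maps are isomorphisms and thus it follows that the middle vertical map is also an isomorphism.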
2. We first prove the claim for the case where $U$ is an irreducible $A$-module. Then we have
\[ B = \mathrm{Hom}_G(U, U) = \mathbb{C}. \]
Since $B$ is one-dimensional, it will be sufficient to show that $\dim(U \otimes_A W) \le 1$. By Wedderburn, we can identify
\[ A = \bigoplus_{i=1}^{r} \mathrm{Mat}_{n_i}(\mathbb{C}). \]
Now, by assumption $W = Ac$ is an irreducible left $A$-module, and thus also a minimal left ideal of $A$. In a matrix algebra $\mathrm{Mat}_{n_i}(\mathbb{C})$ a primitive idempotent is an $n_i \times n_i$ matrix $E_{kk}$, $1 \le k \le n_i$, with all of its entries zero except for entry $(k, k)$. In the direct sum of matrix algebras the primitive idempotents are then $r$-tuples $(0, \dots, 0, e, 0, \dots, 0)$ with $e$ a primitive idempotent of $\mathrm{Mat}_{n_i}(\mathbb{C})$ for some $i$ as above. Now a minimal left ideal in $A$ is of the form $\mathrm{Mat}_{n_i}(\mathbb{C}) E_{kk}$ and is isomorphic to one that consists of $r$-tuples of matrices with all entries zero except for entry $i$, i.e. $\bigoplus_{i=1}^{r} \mathrm{Mat}_{n_i}(\mathbb{C}) \cdot (0, \dots, 0, e, 0, \dots, 0)$. In this entry $i$, all of the matrices have but one non-zero column, say column $k$. In the same way we can identify $U$ with the minimal right ideal of $r$-tuples of matrices with all entries zero except for entry $j$, where in this entry the matrices are zero except for row $l$. Then $U \otimes_A W$ will be zero unless $i = j$, in which case it is isomorphic to the set of matrices that are zero except in entry $(l, k)$. Then $\dim(U \otimes_A W) \le 1$, which completes the proof when $U$ is irreducible. For the more general case, we decompose $U = \bigoplus_i U_i^{\oplus n_i}$ into a sum of irreducible right $A$-modules, whereby
\[ U \otimes_A W = \bigoplus_i (U_i \otimes_A W)^{\oplus n_i} = \mathbb{C}^{n_k} \]
for some $k$. Since this is irreducible over $B = \bigoplus_j \mathrm{Mat}_{n_j}(\mathbb{C})$ the proof is complete.
3. This is easily seen if we use the isomorphism $A \cong \bigoplus_i W_i^{\oplus m_i}$. This determines an isomorphism:
\[ U \cong U \otimes_A A \cong U \otimes_A \Big( \bigoplus_i W_i^{\oplus m_i} \Big) \cong \bigoplus_i (U \otimes_A W_i)^{\oplus m_i} \]
In the proof of theorem 4.1 we will apply this lemma to $U = V^{\otimes n}$ and set $G = S_n$. Thus, the commutator algebra $B$ now consists of all endomorphisms of $V^{\otimes n}$ that commute with the action of $S_n$. But recall that we had commuting actions of $GL(V)$ and $S_n$ on $V^{\otimes n}$. Therefore, in this particular context, the commutator algebra $B$ contains the image of $GL(V)$, and we will see that it is in fact spanned by it, so that irreducible $GL(V)$ subrepresentations are the same as irreducible $B$-submodules. The lemma now tells us how $V^{\otimes n}$ decomposes as a $B$-module, i.e. how it decomposes as $GL(V)$ representation.
To prove (3) though, we need one more lemma that relates the commutator algebra to $GL(V)$, making the above statement precise. Before formulating the lemma we first make the observation that $B = \mathrm{End}_{S_n}(V^{\otimes n}) \subset \mathrm{End}(V^{\otimes n})$. Further, if $\phi \in \mathrm{End}(V)$, then the operator
\[ \tilde{\phi} : V^{\otimes n} \to V^{\otimes n}, \quad \tilde{\phi}(v_1 \otimes \cdots \otimes v_n) = \phi(v_1) \otimes \cdots \otimes \phi(v_n) \]
induced by $\phi : V \to V$ is an intertwining operator, since it clearly commutes with the action of $S_n$. Thus, $\{\tilde{\phi} : \phi \in \mathrm{End}(V)\} \subset B$. The lemma now claims that $\mathrm{span}\{\tilde{\phi} : \phi \in \mathrm{End}(V)\} = B$.
Lemma 4.2. 1. The commutator algebra $B$, as linear subspace of $\mathrm{End}(V^{\otimes n})$, is spanned by the image of $\mathrm{End}(V)$, i.e. by the operators $\tilde{\phi}$ with $\phi \in \mathrm{End}(V)$.
2. A subspace of $V^{\otimes n}$ is a sub-$B$-module iff it is invariant under $GL(V)$.
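Proof. To prove (1) we note that if $W$ is any finite dimensional vector space, then the subspace $\mathrm{Sym}^n(W) = \mathrm{span}\{w \otimes \cdots \otimes w : w \in W\} \subset W^{\otimes n}$ is the subspace of tensors that are invariant under $S_n$. In view of the discussion before the lemma we apply this to $W = \mathrm{End}(V)$ and use that $\mathrm{End}(V) = \mathrm{Hom}(V, V) = V^* \otimes V$. Then:
\[ W^{\otimes n} = (V^* \otimes V)^{\otimes n} = (V^*)^{\otimes n} \otimes V^{\otimes n} = \mathrm{End}(V^{\otimes n}) \supseteq B = \mathrm{End}_{S_n}(V^{\otimes n}) = \mathrm{End}(V^{\otimes n})^{S_n} \]
To justify the last equality: $S_n$ acts on $\phi \in \mathrm{End}(V^{\otimes n})$ by
\[ (\phi^\sigma)(w) = \phi(w\sigma)\sigma^{-1}, \]
and $\phi$ is $S_n$-invariant if
\[ (\phi^\sigma)(w) = \phi(w), \quad \text{that is iff} \quad \phi(w\sigma) = \phi(w)\sigma, \]
meaning that $\phi$ intertwines the action of $S_n$. Under the identification above this is the action of $S_n$ on $W^{\otimes n}$, and we deduce
\[ B = (W^{\otimes n})^{S_n} = \mathrm{Sym}^n(W) = \mathrm{span}\{\tilde{\phi} : \phi \in \mathrm{End}(V)\}. \]
For (2) we let $P \subseteq V^{\otimes n}$ be a subspace and assume that it is invariant under $\tilde{\psi}$ for all invertible $\psi$, i.e. for all $\psi \in GL(V)$. Then $P$ is a sub-$B$-module if it is invariant under the action of $B$, that is if $bP \subseteq P$ for all $b \in B$. Now, from (1) we had $B = \mathrm{span}\{\tilde{\phi} : \phi \in \mathrm{End}(V)\}$, so it suffices to show that $\tilde{\phi} P \subseteq P$ for all $\phi \in \mathrm{End}(V)$. If $\phi$ is invertible, this is certainly true since we assumed $P$ to be $GL(V)$-invariant. If $\phi$ is not invertible, we can approximate $\phi$ by mappings $\psi_i \in GL(V)$:
\[ \psi_i \to \phi \quad \text{and hence} \quad \tilde{\psi}_i \to \tilde{\phi}, \quad i \to \infty. \]
Then
\[ \tilde{\phi}(p) = \lim_{i \to \infty} \tilde{\psi}_i(p) \in P \quad \text{for all } p \in P, \]
since $GL(V)$ is dense in $\mathrm{End}(V)$ and every linear subspace of the finite dimensional space $V^{\otimes n}$ is closed. Conversely, a sub-$B$-module is invariant under $GL(V)$, since $\tilde{\psi} \in B$ for every $\psi \in GL(V)$.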
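Now we have all the machinery necessary to prove theorem 4.1. As said earlier, we set $A = \mathbb{C}[S_n]$ and $U = V^{\otimes n}$.
Proof of theorem 4.1
1. This follows immediately from lemmas 4.1 (3) and 4.2 with the identifications $Uc_i = S_\lambda V$ and $m_i = \dim(Ac_i) = \dim(V_\lambda)$.
2. With lemma 4.1 (2) we have an isomorphism of $GL(V)$-modules
\[ V^{\otimes n} \otimes_A V_\lambda \cong S_\lambda V \]
with $V_\lambda = Ac_\lambda$. Similarly, we can do this for $M_\lambda = Aa_\lambda$:
\[ V^{\otimes n} \otimes_A M_\lambda \cong V^{\otimes n} a_\lambda = \mathrm{Sym}^{\lambda_1} V \otimes \mathrm{Sym}^{\lambda_2} V \otimes \cdots \otimes \mathrm{Sym}^{\lambda_k} V \]
But recall Young's rule 3.2:
\[ M_\lambda = \bigoplus_{\mu \trianglerighteq \lambda} K_{\mu\lambda} V_\mu \]
Therefore we deduce:
\[
\begin{aligned}
\mathrm{Sym}^{\lambda_1} V \otimes \mathrm{Sym}^{\lambda_2} V \otimes \cdots \otimes \mathrm{Sym}^{\lambda_k} V
&= V^{\otimes n} \otimes_A \bigoplus_{\mu \trianglerighteq \lambda} K_{\mu\lambda} V_\mu \\
&= \bigoplus_{\mu \trianglerighteq \lambda} \left( V^{\otimes n} \otimes_A K_{\mu\lambda} V_\mu \right) \\
&= \bigoplus_{\mu \trianglerighteq \lambda} K_{\mu\lambda} S_\mu V
\end{aligned}
\tag{4.4}
\]
But by (4.1) we know the trace on the left hand side of (4.4). Since the trace on a tensor product is the product of the traces, it is the product $h_\lambda = h_{\lambda_1} \cdots h_{\lambda_k}$ of complete symmetric polynomials. Therefore
\[ h_\lambda = \sum_\mu K_{\mu\lambda}\, \mathrm{Trace}(S_\mu(g)) \]
where $\mathrm{Trace}(S_\mu(g)) = \chi_{S_\mu V}(g)$. But we also have the relation (A.7)
\[ h_\lambda = h_{\lambda_1} \cdots h_{\lambda_k} = \sum_\mu K_{\mu\lambda} s_\mu, \]
and since the matrix of Kostka numbers is unitriangular, and hence invertible, we can deduce that
\[ \chi_{S_\mu V}(g) = s_\mu(x_1, \dots, x_k). \]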
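3. We note that $S_\lambda V = V^{\otimes n} c_\lambda = Uc_\lambda$. This $V^{\otimes n} c_\lambda$ is a subspace of $V^{\otimes n}$ and it is invariant under the action of $GL(V)$. To see this, note that an element of $V^{\otimes n} c$ is of the form $vc$ for some $v \in V^{\otimes n}$. Then $g(vc) = (gv)c \in V^{\otimes n} c$ for $g \in GL(V)$, since the two actions commute, showing that $V^{\otimes n} c$ is invariant under $GL(V)$. By lemma 4.1 (2), $S_\lambda V$ is an irreducible $B$-module, and then by lemma 4.2 (2) it is an irreducible $GL(V)$ representation.
4. Let $k = \dim(V)$. The result from (2) gives us that if $\lambda = (\lambda_1, \dots, \lambda_n)$ with $n > k$ and $\lambda_{k+1} \neq 0$, then the trace of an endomorphism on $S_\lambda V$ is given by $s_\lambda(x_1, \dots, x_k, 0, \dots, 0)$. But this is zero; in particular the trace of the identity, which is $\dim S_\lambda V$, is zero. Part 2 also gives us that $\dim(S_\lambda V) = s_\lambda(1, \dots, 1)$. To evaluate this we use A.4 and the definition of the Vandermonde determinant A.4. Then we have:
\[ s_\lambda(1, x, x^2, \dots, x^{k-1}) = \prod_{1 \le i < j \le k} \frac{x^{\lambda_i + k - i} - x^{\lambda_j + k - j}}{x^{k-i} - x^{k-j}} = \prod_{1 \le i < j \le k} \frac{x^{\lambda_i - i} - x^{\lambda_j - j}}{x^{-i} - x^{-j}} \]
Taking now the limit $x \to 1$ in each factor we have
\[ s_\lambda(1, \dots, 1) = \prod_{1 \le i < j \le k} \frac{\lambda_i - \lambda_j + j - i}{j - i} \]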
This theorem is an important result. Having proved (2), we can determine the branching rules we set out for. Branching rules describe how irreducible representations of a group $G$ decompose into irreducibles of a subgroup $H$ when these are restricted to $H$. We will see that for GL(n,C) the branching rules can be derived by making use of certain identities between the Schur polynomials. We will do this in chapter 5. The result of (4) is also important. It provides a formula to compute the dimension of the representation. The formula can be simplified however by making use of the hook lengths $h_{ij}$. Then the above formula reduces to:
\[ \dim S_\lambda V = \prod_{(i,j) \in \lambda} \frac{k - i + j}{h_{ij}} \tag{4.5} \]
where the product is over all boxes $(i, j)$ of the Young diagram of $\lambda$. It follows from the observation that the dimension of the Specht module $V_\lambda$ is given by theorem 3.3 and comparing this to the dimension formula above.
Example 4.2. Computing the dimension.
Consider the irreducible GL(5,C) representation with Young diagram $\lambda = (4, 2, 2, 1)$. We first consider the numerator. Number the box in the upper left corner with the number 5 and fill the rest of the first column in decreasing order from top to bottom. Then fill the rows such that their numbering increases by one in each step from left to right. The numerator becomes the product over the filling of the resulting Young tableau:

5 6 7 8
4 5
3 4
2

For the denominator we take the product over the numbering of the Young tableau that is numbered according to the hook length of each box:

7 5 2 1
4 2
3 1
1

The dimension of this representation thus becomes:
\[ \dim S_\lambda V = \frac{5 \cdot 6 \cdot 7 \cdot 8 \cdot 4 \cdot 5 \cdot 3 \cdot 4 \cdot 2}{7 \cdot 5 \cdot 2 \cdot 1 \cdot 4 \cdot 2 \cdot 3 \cdot 1 \cdot 1} = 480 \]
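The two dimension formulas can be compared numerically. The following sketch is an illustration with hypothetical function names. It evaluates both the product over pairs $i < j$ from theorem 4.1 (4) and the quotient of box fillings used in this example, for $\lambda = (4, 2, 2, 1)$ and $k = 5$.

```python
from fractions import Fraction


def dim_product_formula(shape, k):
    """Product over 1 <= i < j <= k of (lambda_i - lambda_j + j - i) / (j - i)."""
    if len(shape) > k:
        return 0
    lam = list(shape) + [0] * (k - len(shape))
    result = Fraction(1)
    for i in range(k):
        for j in range(i + 1, k):
            result *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return int(result)


def dim_hook_formula(shape, k):
    """Product over all boxes of (k - i + j) divided by the hook lengths."""
    column_lengths = [sum(1 for row in shape if row > j) for j in range(shape[0])]
    numerator = 1
    denominator = 1
    for i, row in enumerate(shape):
        for j in range(row):
            numerator *= k - i + j  # the filling 5 6 7 8 / 4 5 / ... for k = 5
            denominator *= (row - j - 1) + (column_lengths[j] - i - 1) + 1
    return numerator // denominator


print(dim_product_formula((4, 2, 2, 1), 5))  # 480
print(dim_hook_formula((4, 2, 2, 1), 5))     # 480
```

The row and column indices are zero-based in the code, which leaves the difference $j - i$ and hence every factor $k - i + j$ unchanged.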
We have further shown that the irreducible representations can be parametrized by Young diagrams with at most $n$ rows. However, we also know that there is only one possible way of numbering a column of length $n$. Therefore, when denoting the representation by its Young diagram, any column of length $n$ may be omitted (for GL(n,C) such a column corresponds to a factor of the determinant, which is trivial on SU(n)). That is, taking $n = 2$ for instance, Young diagrams that differ only by columns of length 2, such as
(1), (2,1), (3,2), etc.
all represent the same irreducible representation. I will further write $(1,1) = 1_2$ for the one-dimensional representation, where the subscript refers to the $n = 2$ case.
5 Branching Rules
In this section we will determine some branching rules for GL(n,C). As said before, given a group $G$, an irreducible representation of $G$ and a subgroup $H \subset G$, branching rules describe how this irreducible representation, when restricted to $H$, decomposes in terms of irreducible representations of $H$. These branching rules have many applications, both in mathematics and in physics. In physics they are related to symmetry breaking, a phenomenon we will get back to in great detail in the second part of this thesis. For now we will determine three different branching rules for GL(n,C). It turns out that, once we have determined those, this also gives us the branching rules for SU(N), so we get those for free. This we will see in section 5.2.
5.1 Branching Rules for GL(n,C)
With the results of the previous section at hand we can begin with determining the branching rule for GL(n,C) → GL(n−1,C). Determining this comes down to determining the multiplicities $d_{\lambda\mu}$ in:
\[ S_\lambda V|_{GL(n-1)} \cong \bigoplus_\mu (S_\mu V)^{\oplus d_{\lambda\mu}} \tag{5.1} \]
where $\lambda = (\lambda_1, \dots, \lambda_n)$ and $\mu = (\mu_1, \dots, \mu_{n-1})$. The first step is to rewrite (5.1) in terms of the characters. For this we observe that when evaluating the character $\chi_{S_\lambda V}(g) = \mathrm{Trace}(\rho_\lambda(g))$ on an element $g \in GL(n,\mathbb{C})$ it is sufficient to evaluate it on diagonal matrices. That is, to evaluate:
\[ \chi_{S^n_\lambda V} \begin{pmatrix} x_1 & & \\ & \ddots & \\ & & x_n \end{pmatrix} = s_\lambda(x_1, \dots, x_n) \tag{5.2} \]
To see this, observe that from linear algebra we know that every matrix in GL(n,C) is conjugate to a matrix in Jordan form. Since a character is constant on conjugacy classes, it is determined by its values on matrices in the canonical Jordan form. Since the diagonalizable elements further form a dense subset of GL(n,C) and characters are continuous, any character is determined by its values on the diagonal matrices. Further, we can embed GL(n−1,C) in GL(n,C) as:
\[ GL(n-1,\mathbb{C}) \hookrightarrow GL(n,\mathbb{C}), \quad \text{given by} \quad g \mapsto \begin{pmatrix} g & 0 \\ 0 & 1 \end{pmatrix} \]
Therefore, taking $g$ to be a diagonal matrix as above, this amounts to:
\[ \begin{pmatrix} x_1 & & \\ & \ddots & \\ & & x_{n-1} \end{pmatrix} \mapsto \begin{pmatrix} x_1 & & & \\ & \ddots & & \\ & & x_{n-1} & \\ & & & 1 \end{pmatrix} \tag{5.3} \]
We now have all we need to rewrite (5.1) in terms of the characters. The right hand side of (5.1) follows immediately. These are the Schur polynomials in the variables $x_1, \dots, x_{n-1}$ corresponding to the partition $\mu$, i.e. $s_\mu(x_1, \dots, x_{n-1})$. As for the left hand side, we have to determine the character of $S_\lambda V$ restricted to GL(n−1). But compare (5.2) with (5.3). Then we deduce that restriction to GL(n−1) of the irreducible character of $S_\lambda V$, which is $s_\lambda(x_1, \dots, x_n)$, amounts to setting $x_n = 1$. Therefore the multiplicities $d_{\lambda\mu}$ in (5.1) are determined by the following identity between Schur polynomials:
\[ s^n_\lambda(x_1, \dots, x_{n-1}, 1) = \sum_\mu d_{\lambda\mu}\, s^{n-1}_\mu(x_1, \dots, x_{n-1}). \tag{5.4} \]
Now, by (A.4) we have an identity
\[ s_\lambda(x_1, \dots, x_n) = \sum_{\mu \subseteq \lambda} s_\mu(x_1, \dots, x_{n-1})\, x_n^{|\lambda - \mu|} \tag{5.5} \]
where the sum is over all partitions $\mu$ for which $\lambda - \mu$ is a horizontal strip and $l(\mu) \le n - 1$. Substituting $x_n = 1$ then gives us:
\[ s_\lambda(x_1, \dots, x_{n-1}, 1) = \sum_{\mu \subseteq \lambda} s_\mu(x_1, \dots, x_{n-1}) \tag{5.6} \]
and we see that the multiplicities $d_{\lambda\mu}$ of the partitions that occur are all one. Therefore, when determining the possible branchings of an irreducible GL(n) representation we do not have to worry about multiplicities; we only have to determine which Young diagrams appear. And this is determined by the condition that $\lambda - \mu$ is a horizontal strip. We can now state the branching rule for the decomposition of an irreducible GL(n) representation when restricted to GL(n−1).
Definition 5.1. (Branching Rule) Let $S_\lambda(V)$ be an irreducible GL(n,C) representation. Then we have
\[ S_\lambda(V)|_{GL(n-1)} \cong \bigoplus_{\mu \subseteq \lambda} S_\mu(V) \]
where $\lambda$ has at most $n$ rows and the sum is over all partitions $\mu \subseteq \lambda$ such that $\lambda - \mu$ is a horizontal strip and $l(\mu) \le n - 1$.
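Example 5.1. As an example I will demonstrate how we can derive the branching pattern of an irreducible GL(n,C) representation using Young diagrams. By theorem 4.1 the irreducible representations correspond to Young diagrams with at most $n$ rows. Consider now the partition $\lambda = (3, 2, 1)$. According to the branching rule, the possible GL(n−1,C) representations correspond to those partitions $\mu$ that can be obtained by removing boxes from $\lambda$ such that $\lambda - \mu$ is a horizontal strip, i.e. by removing at most one box from each column. For $\lambda = (3, 2, 1)$ this means that the last box of each row is either removed or kept. Written as partitions, this gives the following possibilities for $\mu$:
(3,2,1), (3,2), (3,1,1), (3,1), (2,2,1), (2,2), (2,1,1), (2,1)
where the partitions with three rows only occur for $n \ge 4$. Then the decomposition into irreducible GL(n−1,C) representations becomes:
\[ S_{(3,2,1)} \to S_{(3,2,1)} + S_{(3,2)} + S_{(3,1,1)} + S_{(3,1)} + S_{(2,2,1)} + S_{(2,2)} + S_{(2,1,1)} + S_{(2,1)} \]
Explicitly for $n = 3$, where $\mu$ has at most two rows, the branching becomes, after omitting columns of length 3 in $\lambda$ and columns of length 2 in $\mu$:
\[ (3,2,1) = (2,1) \to (3,2) + (3,1) + (2,2) + (2,1) = 2 \cdot (1) + 1_2 + (2) \]
As a check we can compute the dimensions on either side. Then we indeed find: 8 = 2 · 2 + 1 + 3.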
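The branching rule is straightforward to enumerate, because $\lambda - \mu$ is a horizontal strip exactly when the parts interlace: $\lambda_1 \ge \mu_1 \ge \lambda_2 \ge \mu_2 \ge \dots$. The sketch below is an illustration only, with hypothetical function names. It lists the partitions $\mu$ for $\lambda = (3, 2, 1)$ and verifies the dimension count, using the product formula of theorem 4.1 (4).

```python
from fractions import Fraction
from itertools import product


def dim_gl(shape, n):
    """Dimension of the GL(n) representation labelled by `shape` (padded with zeros)."""
    lam = list(shape) + [0] * (n - len(shape))
    result = Fraction(1)
    for i in range(n):
        for j in range(i + 1, n):
            result *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return int(result)


def restrict(shape, n):
    """All mu with n-1 (possibly zero) parts such that shape/mu is a horizontal strip."""
    lam = list(shape) + [0] * (n - len(shape))
    # Interlacing condition: lam[i] >= mu[i] >= lam[i + 1] for every row i.
    ranges = [range(lam[i + 1], lam[i] + 1) for i in range(n - 1)]
    return [tuple(mu) for mu in product(*ranges)]


for n in (3, 4):
    pieces = restrict((3, 2, 1), n)
    print(n, pieces)
    print(dim_gl((3, 2, 1), n), "=", " + ".join(str(dim_gl(mu, n - 1)) for mu in pieces))
```

For $n = 3$ this yields the four partitions (2,1), (2,2), (3,1), (3,2) with dimensions 2, 1, 3, 2, in agreement with 8 = 2 · 2 + 1 + 3. For $n = 4$ all eight partitions appear, written with a trailing zero where $\mu$ has fewer than three rows.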
Branching a tensor product
Just as we used the Schur functions to determine the branching rule for GL(n,C) → GL(n−1,C), we can determine the branching rule for the decomposition of a tensor product of two irreducible GL(n) representations in terms of irreducible GL(n) representations. The difference compared to the previous case is that the multiplicities will no longer be one. This branching rule is, for partitions $\lambda, \mu, \nu$ with at most $n$ rows:
\[ S_\lambda \otimes S_\mu \cong \bigoplus_\nu (S_\nu)^{\oplus c^\nu_{\lambda\mu}} \]
which in terms of characters becomes:
\[ s_\lambda(x_1, \dots, x_n) \cdot s_\mu(x_1, \dots, x_n) = \sum_\nu c^\nu_{\lambda\mu}\, s_\nu(x_1, \dots, x_n) \tag{5.7} \]
But this is precisely the relation between products of Schur polynomials given by the Littlewood-Richardson rule A.6, and we conclude that the multiplicities are given by the Littlewood-Richardson coefficients.
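To make (5.7) concrete, the Littlewood-Richardson coefficients can be computed by brute force. The sketch below is my own illustration with hypothetical function names, and it does not use the strict expansion of appendix A. Instead it uses the standard equivalent description: $c^\nu_{\lambda\mu}$ is the number of fillings of the skew diagram $\nu/\lambda$ with content $\mu$ that are weakly increasing along rows, strictly increasing down columns, and whose reading word (rows from top to bottom, each row from right to left) is a lattice word. As an example it decomposes $S_{(2,1)} \otimes S_{(2,1)}$ for GL(3).

```python
from fractions import Fraction


def partitions(n, max_len, max_part=None):
    """Yield the partitions of n with at most max_len parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    if max_len == 0:
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, max_len - 1, first):
            yield (first,) + rest


def dim_gl(shape, n):
    """Dimension of the GL(n) representation labelled by `shape`."""
    lam = list(shape) + [0] * (n - len(shape))
    result = Fraction(1)
    for i in range(n):
        for j in range(i + 1, n):
            result *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return int(result)


def lr_coefficient(lam, mu, nu):
    """Count Littlewood-Richardson fillings of nu/lam with content mu."""
    if len(lam) > len(nu) or sum(nu) != sum(lam) + sum(mu):
        return 0
    lam = list(lam) + [0] * (len(nu) - len(lam))
    if any(a > b for a, b in zip(lam, nu)):
        return 0
    # Boxes of nu/lam in reading order: top to bottom, right to left.
    cells = [(i, j) for i in range(len(nu)) for j in range(nu[i] - 1, lam[i] - 1, -1)]
    filling, counts = {}, [0] * len(mu)

    def extend(position):
        if position == len(cells):
            return 1
        i, j = cells[position]
        total = 0
        for entry in range(len(mu)):
            if counts[entry] >= mu[entry]:
                continue  # content mu would be exceeded
            if entry > 0 and counts[entry] + 1 > counts[entry - 1]:
                continue  # reading word would stop being a lattice word
            right, above = filling.get((i, j + 1)), filling.get((i - 1, j))
            if right is not None and entry > right:
                continue  # rows must increase weakly from left to right
            if above is not None and entry <= above:
                continue  # columns must increase strictly downwards
            filling[(i, j)] = entry
            counts[entry] += 1
            total += extend(position + 1)
            counts[entry] -= 1
            del filling[(i, j)]
        return total

    return extend(0)


def tensor_product(lam, mu, n):
    """Multiplicities c^nu_{lam mu} for all nu with at most n rows."""
    result = {}
    for nu in partitions(sum(lam) + sum(mu), n):
        c = lr_coefficient(lam, mu, nu)
        if c:
            result[nu] = c
    return result


decomposition = tensor_product((2, 1), (2, 1), 3)
print(decomposition)
print(sum(c * dim_gl(nu, 3) for nu, c in decomposition.items()), dim_gl((2, 1), 3) ** 2)
```

The result contains $(3,2,1)$ with multiplicity two, and the dimensions add up as 64 = 27 + 10 + 10 + 2 · 8 + 1.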
Branching GL(n+m) → GL(n) × GL(m)
As a last application we determine the branching rule for the restriction of an irreducible GL(n+m,C) representation to GL(n,C) × GL(m,C) × GL(1). Note that GL(1) = $\mathbb{C}^\times$, whose compact form is U(1), is the center Z[GL(n+m)] of GL(n+m). We can embed GL(n,C) × GL(m,C) × GL(1) in GL(n+m,C) by reserving the upper left $n \times n$ block for GL(n,C) and the lower right $m \times m$ block for GL(m,C), and embedding GL(1) along the diagonal as scalar matrices. This way it will commute with both GL(n) and GL(m) and play no role in the branching. We may thus ignore it. The corresponding branching rule will now be:
\[ S_\lambda V|_{GL(n) \times GL(m)} \cong \bigoplus_{\mu,\nu} (S_\mu V \otimes S_\nu V)^{\oplus e^\lambda_{\mu\nu}} \tag{5.8} \]
where $\lambda$, $\mu$ and $\nu$ have at most $n + m$, $n$ and $m$ rows respectively. In terms of Schur polynomials this becomes:
\[ s^{n+m}_\lambda(x_1, \dots, x_{n+m}) = \sum_{\mu,\nu} e^\lambda_{\mu\nu}\, s^n_\mu(x_1, \dots, x_n) \cdot s^m_\nu(x_{n+1}, \dots, x_{n+m}). \]
To determine the multiplicities we have a look at proposition A.2. Considering the two sets of variables $x^{(1)} = (x_1, \dots, x_n)$, $x^{(2)} = (x_{n+1}, \dots, x_{n+m})$ and setting $\mu = 0$, the proposition gives us for the partition $\lambda$:
\[ s_\lambda(x^{(1)}, x^{(2)}) = \sum_\nu s_{\nu^{(1)}/\nu^{(0)}}(x^{(1)})\, s_{\nu^{(2)}/\nu^{(1)}}(x^{(2)}) \]
with the sum over all sequences $(\nu^{(0)}, \nu^{(1)}, \nu^{(2)})$ of partitions such that $\nu^{(0)} = \mu = 0$, $\nu^{(2)} = \lambda$ and $\nu^{(0)} \subseteq \nu^{(1)} \subseteq \nu^{(2)}$. Implementing these restrictions on the partitions we find:
\[
\begin{aligned}
s_\lambda(x^{(1)}, x^{(2)}) &= \sum_{\nu^{(1)} \subseteq \lambda} s_{\nu^{(1)}}(x^{(1)})\, s_{\lambda/\nu^{(1)}}(x^{(2)}) \\
&= \sum_{\nu^{(1)} \subseteq \lambda} s_{\nu^{(1)}}(x^{(1)}) \sum_\mu c^\lambda_{\nu^{(1)}\mu}\, s_\mu(x^{(2)}) \\
&= \sum_{\nu^{(1)}, \mu} c^\lambda_{\nu^{(1)}\mu}\, s_{\nu^{(1)}}(x^{(1)})\, s_\mu(x^{(2)})
\end{aligned}
\]
where in the second equality we used A.9. We can therefore identify the coefficients with the Littlewood-Richardson coefficients, i.e. the number of ways the Young diagram for $\lambda$ can be obtained by strict $\mu$-expansion of the Young diagram of $\nu$¹². The branching rule thus becomes:
\[ S^{n+m}_\lambda(V)|_{GL(n) \times GL(m)} \cong \bigoplus_{\nu,\mu} c^\lambda_{\nu\mu}\, S^n_\nu(V) \otimes S^m_\mu(V) \tag{5.9} \]
Now, although (5.7) and (5.9) may look very much alike in that they both have the Littlewood-
Richardson coefficients as multiplicities, the latter is a lot more complex to work with. In (5.7) two of
the three partitions that label the coefficients are known. In (5.9) only the final partition λ is known
which greatly increases the complexity as the partitions get larger. In the next section I will argue
that the results for GL(n) also hold for SU(n) and there I will also discuss an example on how to
handle (5.9).
5.2 The irreducible representations of SU(n)
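We have seen in the previous section how to derive some branching rules for GL(n,C). Here I will argue with a brief discussion¹³ that this also gives us the branching rules for SU(n). To show this we need some results from Lie theory, in particular about Lie algebras¹⁴. Important here are the following three Lie algebras:
\[
\begin{aligned}
\mathfrak{gl}(n,\mathbb{C}) &= \{ n \times n \text{ complex matrices} \} \\
\mathfrak{sl}(n,\mathbb{C}) &= \{ X \mid X \text{ an } n \times n \text{ complex matrix with } \mathrm{tr}(X) = 0 \} \\
\mathfrak{su}(n) &= \{ X \in \mathfrak{sl}(n,\mathbb{C}) \mid X \text{ an anti-Hermitian matrix} \}
\end{aligned}
\]
Given a real Lie algebra $\mathfrak{g}$ we can consider its complexification $\mathfrak{g}_{\mathbb{C}}$, defined as:
\[ \mathfrak{g}_{\mathbb{C}} := \mathfrak{g} \otimes_{\mathbb{R}} \mathbb{C} \equiv \mathfrak{g} \oplus i\mathfrak{g} \]
Then $\mathfrak{sl}(n,\mathbb{C})$ can be seen as the complexification of $\mathfrak{su}(n)$, i.e.
\[ \mathfrak{sl}(n,\mathbb{C}) = \mathfrak{su}(n) \oplus i\,\mathfrak{su}(n) \]
and $\mathfrak{gl}(n,\mathbb{C})$ is related to $\mathfrak{sl}(n,\mathbb{C})$ by:
\[ \mathfrak{gl}(n,\mathbb{C}) = \mathbb{C}I \oplus \mathfrak{sl}(n,\mathbb{C}) \]
¹² The definition and an example of strict expansion can be found in definition A.6.
¹³ Lie theory is not the main purpose of this thesis and therefore results are kept short.
¹⁴ The material from the first paragraph is based on notes from J. Stokman, [9] and [10].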
where $I$ is the $n \times n$ identity matrix and $\mathbb{C}I$ the center. Then, since $\mathfrak{gl}(n,\mathbb{C}) = \mathbb{C}I \oplus \mathfrak{sl}(n,\mathbb{C})$ and the center acts by scalars on an irreducible representation (Schur's lemma), an irreducible representation of $\mathfrak{gl}(n,\mathbb{C})$ will remain irreducible when we restrict it to $\mathfrak{sl}(n,\mathbb{C})$. Further, $\mathfrak{sl}(n,\mathbb{C})$ is the complexification of $\mathfrak{su}(n)$, so a subspace of a complex representation is invariant under $\mathfrak{sl}(n,\mathbb{C})$ iff it is invariant under $\mathfrak{su}(n)$. Thus, the restriction of irreducible $\mathfrak{sl}(n,\mathbb{C})$ representations to $\mathfrak{su}(n)$ will also remain irreducible. This observation is important. It tells us that when we have an irreducible representation of the Lie algebra $\mathfrak{gl}(n,\mathbb{C})$, this automatically defines a complex irreducible representation of $\mathfrak{su}(n)$. However, in all the previous we have been investigating the representations of the group GL(n,C), and not of its Lie algebra. An irreducible representation of a group, though, also defines an irreducible representation of its Lie algebra. They are related to one another through the exponential map. That is, if $S_\lambda(V)$ is an irreducible GL(n,C) representation, then we have an irreducible $\mathfrak{gl}(n,\mathbb{C})$ representation by differentiating the action on $v \in S_\lambda V$:
\[ X \cdot v = \frac{d}{dt}\Big|_{t=0} \exp(tX) \cdot v, \quad X \in \mathfrak{gl}(n,\mathbb{C}),\ \exp(tX) \in GL(n,\mathbb{C}). \]
Conversely we can integrate an irreducible representation of the Lie algebra to get an irreducible representation of the Lie group. Thus, putting these results together, we conclude that equivalent irreducible representations of GL(n,C) define equivalent irreducible representations of SU(n). Also, inequivalent irreducible representations of GL(n) with the same action of the center give inequivalent irreducible representations of SU(n). In this sense the results that hold for the irreducible representations of GL(n,C) also hold for SU(n).
Example 5.2. Branching SU(5) → SU(3) × SU(2).
With this result and the branching rule (5.9) we can decompose an irreducible SU(5) representation in terms of irreducible SU(3) × SU(2) representations using Young diagrams. I will refer to the representations by their Young diagrams, written here as partitions, and by their dimension, and consider the 5 lowest dimensional representations $\mathbf{5}$, $\bar{\mathbf{5}}$, $\mathbf{10}$, $\mathbf{15}$ and $\mathbf{24}$. I will label the Young diagrams with a subscript 2, 3 or 5 to emphasize whether they should be seen as irreducible SU(2), SU(3) or SU(5) representations respectively.
Consider now first $\mathbf{5} = (1)_5$. Then of course we have a rather trivial decomposition
\[ (1)_5 = (1)_3 \otimes 1 + 1 \otimes (1)_2 \]
where (as representations) in each term the first factor belongs to SU(3) and the second to SU(2).
SU(5) has a second 5 dimensional representation $\bar{\mathbf{5}}$ with Young diagram $\lambda = (1, 1, 1, 1)$. To find out how it decomposes, suppose we started out with the irreducible SU(3) representation $\mathbf{3} = (1)_3$. Then the only way to obtain $\lambda = (1, 1, 1, 1)_5$ by strict expansion with $\nu$ is when $\nu = (1, 1, 1)_2$. But this is zero in SU(2). However, the other three dimensional representation $\bar{\mathbf{3}} = (1,1)_3$ can be expanded to $\lambda = (1, 1, 1, 1)$ with $(1,1)_2 = 1_2$. The second possibility is to expand $(1,1,1)_3 = 1_3$ using $\mathbf{2} = (1)_2$. Thus:
\[ (1,1,1,1)_5 = (1,1)_3 \otimes (1,1)_2 + (1,1,1)_3 \otimes (1)_2 = (1,1)_3 \otimes 1 + 1 \otimes (1)_2 \tag{5.10} \]
Next is the SU(5) representation $\mathbf{10} = (1,1)_5$. Of course, we can obtain this Young diagram by expanding $\mathbf{3} = (1)_3$ with $\mathbf{2} = (1)_2$ and similarly by expanding $\bar{\mathbf{3}} = (1,1)_3$ with $1_2$. Likewise, we can expand $(1,1)_2$ with $1_3$. Thus this representation decomposes as:
\[ (1,1)_5 = (1)_3 \otimes (1)_2 + (1,1)_3 \otimes 1 + 1 \otimes (1,1)_2 = (1)_3 \otimes (1)_2 + (1,1)_3 \otimes 1 + 1 \otimes 1 \tag{5.11} \]
The $\mathbf{15}$ has Young diagram $(2)_5$. This we can similarly obtain in three different ways: by expanding $(2)_3$ using $1_2$, expanding $(1)_3$ using $(1)_2$, or expanding $(2)_2$ using $1_3$. Thus
\[ (2)_5 = (2)_3 \otimes 1 + (1)_3 \otimes (1)_2 + 1 \otimes (2)_2 \tag{5.12} \]
For the 24 dimensional representation the Young diagram corresponds to the partition $\lambda = (2, 1, 1, 1)$. We start with the trivial SU(3) representation $(1,1,1)_3 = 1_3$. Then we can obtain $\lambda$ using the Young diagrams for the SU(2) representations given by $(1,1)_2 = 1_2$ and $(2)_2$, each of which gives one possible way of expansion. Starting with the 3 dimensional representation $(1)_3$, we see we cannot expand it to $\lambda$. However this representation can also be represented by $(2,1,1)_3$, which we can expand using $(1)_2$. The other 3 dimensional representation $(1,1)_3$ requires an expansion with $(2,1)_2 = (1)_2$. Finally, we can start with the 8 dimensional representation $(2,1)_3$ and then we need $(1,1)_2 = 1_2$, which also gives one possible way of strict expansion. Thus:
\[ (2,1,1,1)_5 = 1 \otimes 1 + 1 \otimes (2)_2 + (2,1,1)_3 \otimes (1)_2 + (1,1)_3 \otimes (2,1)_2 + (2,1)_3 \otimes 1 \tag{5.13} \]
As a check we can compute dimensions. Then we find: 24 = 1 + 3 + 3 · 2 + 3 · 2 + 8.
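The dimension checks for all five decompositions can be automated. In the sketch below the table `branchings` is an assumption on my part: it encodes the decompositions of this example as pairs of partitions (SU(3) factor, SU(2) factor), with `()` denoting the trivial representation. The dimensions are computed from the product formula of theorem 4.1 (4); the function name `dim_su` is hypothetical.

```python
from fractions import Fraction


def dim_su(shape, n):
    """Dimension of the SU(n) representation with Young diagram `shape`."""
    if len(shape) > n:
        return 0
    lam = list(shape) + [0] * (n - len(shape))
    result = Fraction(1)
    for i in range(n):
        for j in range(i + 1, n):
            result *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return int(result)


# SU(5) diagram -> list of (SU(3) diagram, SU(2) diagram), as read from the example.
branchings = {
    (1,): [((1,), ()), ((), (1,))],
    (1, 1, 1, 1): [((1, 1), (1, 1)), ((1, 1, 1), (1,))],
    (1, 1): [((1,), (1,)), ((1, 1), ()), ((), (1, 1))],
    (2,): [((2,), ()), ((1,), (1,)), ((), (2,))],
    (2, 1, 1, 1): [((1, 1, 1), (1, 1)), ((1, 1, 1), (2,)), ((2, 1, 1), (1,)),
                   ((1, 1), (2, 1)), ((2, 1), (1, 1))],
}

for shape, pieces in branchings.items():
    terms = [dim_su(nu, 3) * dim_su(mu, 2) for nu, mu in pieces]
    print(dim_su(shape, 5), "=", " + ".join(str(t) for t in terms))
```

Each printed line should balance, for instance 24 = 1 + 3 + 6 + 6 + 8 for the last entry. Note that this only checks dimensions; it does not by itself verify the multiplicities given by (5.9).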
I will apply the result of this branching rule when I discuss the symmetry group SU(5) in the context of Grand Unified Theories in section 10. There I show in more detail that we can use the (SU(3), SU(2)) decompositions of the irreducible SU(5) representations found above to assign the elementary fermions of the Standard Model to the irreducible representations of SU(5).
In the following section I will give an introduction to the Lagrangian formalism in field theory and discuss the phenomenon of symmetry breaking in physics.
6 Lagrangians, symmetries and symmetry breaking
This section will include a short introduction to the Lagrangian formalism in classical mechanics and
how it is generalized to obtain a relativistic field theory. Then I will discuss some examples on field
theoretic Lagrangians, define what we mean by spontaneous global symmetry breaking and show that
this phenomenon is accompanied by the appearance of massless particles. The material can be found
in [11], [12] and [13].
6.1 Lagrangian formalism
The classical mechanical Lagrangian of a system is defined as
\[ L = T - U \tag{6.1} \]
where $T$ and $U$ are the kinetic and potential energy respectively, and $L$ is a function of the coordinates $q_i$ and their time derivatives $\dot{q}_i$. The action $S$ is defined as
\[ S = \int_{t_1}^{t_2} L(q_i, \dot{q}_i, t)\, dt \tag{6.2} \]
and the requirement $\delta S = 0$ leads to the Euler-Lagrange equation (6.3), from which the equations of motion can be obtained:
\[ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_i} \right) - \frac{\partial L}{\partial q_i} = 0. \tag{6.3} \]
With this recipe at hand we can easily obtain a Lagrangian field theory. To do this, we introduce the functional $\mathcal{L}(\phi(x), \partial_\mu \phi(x))$, where $\partial_\mu$ is the usual shorthand notation for $\partial/\partial x^\mu$ and $x^\mu$ is the relativistic four-vector. Note that $\mathcal{L}$ is a Lagrangian density since we have
\[ S = \int d^4x\, \mathcal{L}(\phi(x), \partial_\mu \phi(x)) = \int dt \int d^3x\, \mathcal{L}(\phi(x), \partial_\mu \phi(x)) = \int dt\, L. \]
We can consider $\phi(x)$ as a generalized coordinate at each value of its argument $x$, and unlike the classical case we now have an infinite number of degrees of freedom. With this definition of the Lagrangian density we can rewrite (6.3) to obtain the Euler-Lagrange equation for a relativistic field theory
\[ \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_i)} \right) = \frac{\partial \mathcal{L}}{\partial \phi_i}. \tag{6.4} \]
There is one slight difference between the classical and the field-theoretic Lagrangian formulation though. While in classical mechanics the Lagrangian can be explicitly derived using (6.1), in field theories the Lagrangian is often taken to be axiomatic. The following examples will discuss a few important Lagrangians.
Example 6.1. Scalar (spin-0) field. The free field Lagrangian for a real scalar (spin-0) field is given by
\[ \mathcal{L}_{\text{scalar}} = \frac{1}{2} \partial_\mu \phi\, \partial^\mu \phi - \frac{1}{2} m^2 \phi^2. \tag{6.5} \]
(In general, this will also include interaction terms of higher order in $\phi$, which is why we call this Lagrangian the free field Lagrangian.) By applying the Euler-Lagrange equation we obtain:
\[ (\partial_\mu \partial^\mu + m^2)\phi = 0 \tag{6.6} \]
Equation (6.6) is called the Klein-Gordon equation and describes a spin-0 particle of mass $m$.
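The step from the scalar Lagrangian (6.5) to the Klein-Gordon equation can be written out explicitly. The following short derivation applies the Euler-Lagrange equation (6.4); the metric symbol $\eta^{\alpha\beta}$ used to raise the index is notation I introduce here and is not taken from the text.

```latex
\begin{align*}
\frac{\partial \mathcal{L}_{\text{scalar}}}{\partial(\partial_\mu \phi)}
  &= \frac{\partial}{\partial(\partial_\mu \phi)}
     \left( \tfrac{1}{2}\, \eta^{\alpha\beta}\, \partial_\alpha \phi\, \partial_\beta \phi \right)
   = \partial^\mu \phi, \\
\frac{\partial \mathcal{L}_{\text{scalar}}}{\partial \phi}
  &= -m^2 \phi, \\
\partial_\mu \left( \frac{\partial \mathcal{L}_{\text{scalar}}}{\partial(\partial_\mu \phi)} \right)
  - \frac{\partial \mathcal{L}_{\text{scalar}}}{\partial \phi}
  &= \partial_\mu \partial^\mu \phi + m^2 \phi = 0 .
\end{align*}
```

The factor $\tfrac{1}{2}$ in the kinetic term cancels because $\partial_\mu \phi$ appears twice in the product, and the relative sign between the two terms of (6.5) fixes the sign of the mass term in the resulting field equation.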
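Example 6.2. Vector (spin-1) field. Spin-1 particles are described in terms of a vector field $V_\mu$, with Lagrangian
\[ \mathcal{L}_{\text{proca}} = -\frac{1}{4} (\partial^\mu V^\nu - \partial^\nu V^\mu)(\partial_\mu V_\nu - \partial_\nu V_\mu) + \frac{1}{2} M^2 V^\nu V_\nu \tag{6.7} \]
where $M$ is the mass of the vector field. By defining the field-strength tensor $F^{\mu\nu}$
\[ F^{\mu\nu} = \partial^\mu V^\nu - \partial^\nu V^\mu \tag{6.8} \]
we can rewrite (6.7) to get a neater expression
\[ \mathcal{L}_{\text{proca}} = -\frac{1}{4} F^{\mu\nu} F_{\mu\nu} + \frac{1}{2} M^2 V^\nu V_\nu. \tag{6.9} \]
Applying the Euler-Lagrange equation yields:
\[ \partial_\mu (\partial^\mu V^\nu - \partial^\nu V^\mu) + M^2 V^\nu = 0, \quad \text{i.e.} \quad \partial_\mu F^{\mu\nu} + M^2 V^\nu = 0 \tag{6.10} \]
which describes a spin-1 particle of mass $M$.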
Example 6.3. Spinor (spin-1/2) field. The Lagrangian for a spinor field ψ is given by
\[ \mathcal{L}_{\text{fermion}} = i\bar{\psi}\gamma^\mu\partial_\mu\psi - m\bar{\psi}\psi, \tag{6.11} \]
where $\bar{\psi} = \psi^\dagger\gamma^0$ is the adjoint spinor and the gamma matrices are defined in appendix C. Applying the Euler-Lagrange equation for $\bar{\psi}$ gives the Dirac equation, which describes a spin-1/2 particle of mass m. It reads:
\[ i\gamma^\mu\partial_\mu\psi - m\psi = 0. \tag{6.12} \]
6.2 Symmetries
We say that the Lagrangian has a symmetry, when it is invariant under a certain type of transformation.
I will demonstrate this through some examples.
Example 6.4. Consider the following Lagrangian
\[ \mathcal{L} = \frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - V(\phi) \quad\text{where}\quad V(\phi) = V(-\phi). \tag{6.13} \]
This Lagrangian has a discrete symmetry, since it is invariant under the parity transformation φ→ −φ.
Besides discrete symmetries, the Lagrangian can also have continuous symmetries. This is demonstrated in the following examples.¹⁵
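Example 6.5. The Lagrangian for a complex scalar field $\phi = \frac{1}{\sqrt{2}}(\phi_1 + i\phi_2)$ is given by
\[ \mathcal{L} = (\partial_\mu\phi)^*(\partial^\mu\phi) - V(\phi) \quad\text{where}\quad V(\phi) = \mu^2\phi^*\phi + \lambda(\phi^*\phi)^2. \tag{6.14} \]
This Lagrangian has a global U(1) symmetry. It is invariant under global phase transformations $\phi \to \phi' = e^{i\theta}\phi$, which we can easily see by looking at the modulus
\[ \phi^*\phi \to \phi'^*\phi' = e^{-i\theta}e^{i\theta}\phi^*\phi = \phi^*\phi. \]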
¹⁵ Continuous symmetry groups are called Lie groups. They are discussed in appendix B.
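Example 6.6. As a final example, we take the Lagrangian for a nucleon
\[ \mathcal{L} = \bar{p}(i\gamma^\mu\partial_\mu - m)p + \bar{n}(i\gamma^\mu\partial_\mu - m)n \tag{6.15} \]
where an equal mass for the proton and neutron is assumed.¹⁶ The Dirac γ-matrices are defined in appendix C. We can rewrite this as
\[ \mathcal{L} = \bar{\psi}(i\gamma^\mu\partial_\mu - m)\psi \quad\text{where}\quad \psi = \begin{pmatrix} p \\ n \end{pmatrix} \tag{6.16} \]
and $\bar{\psi}$ represents the spinor conjugate $\psi^\dagger\gamma^0$. Lagrangian (6.16) is invariant under SU(2) transformations, which form the group of rotations in isospin space.¹⁷ It consists of transformations
\[ \psi \to \exp\left(\frac{i\,\vec{\sigma}\cdot\vec{\alpha}}{2}\right)\psi \tag{6.17} \]
where $\vec{\sigma} = (\sigma_1, \sigma_2, \sigma_3)$ are the Pauli matrices and $\vec{\alpha} = (\alpha_1, \alpha_2, \alpha_3)$ is a vector of real parameters.¹⁸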
6.3 Symmetry breaking
The previous section introduced the concept of symmetries of the Lagrangian by means of some examples. These symmetries, however, can be broken, and there are two ways this can happen: a symmetry can be broken spontaneously or explicitly.
6.3.1 Explicit symmetry breaking
In this case the symmetry is broken by explicitly adding terms to the Lagrangian that violate the symmetry. For example, in a Lagrangian with a discrete φ → −φ symmetry, terms with odd powers of φ would explicitly break this symmetry. As another example, consider the Lagrangian (6.15). We already noted that the symmetry is only approximate because we assumed an equal mass for the proton and neutron. The symmetry is explicitly broken when we distinguish between $m_p$ and $m_n$.
6.3.2 Spontaneous symmetry breaking
We speak of spontaneous symmetry breaking when the vacuum of the Lagrangian is not invariant under the full symmetry group of the Lagrangian. To explain what this means, let me first define what we mean by the vacuum being invariant under a symmetry group. From appendix B we know that we can write an element U of the symmetry group of the Lagrangian as
\[ U = e^{i\alpha t} \]
where t is a group generator. The vacuum of the Lagrangian is now said to be invariant under transformations of the symmetry group when
\[ e^{i\alpha t}\langle\phi_0\rangle = \langle\phi_0\rangle \tag{6.18} \]
¹⁶ Note that this makes the symmetry we consider an approximate symmetry, since their masses are not exactly equal.
¹⁷ More on isospin can be found in appendix C.2.
¹⁸ See appendix B.
where t is a generator of the symmetry group. Now, if we consider infinitesimal transformations we can rewrite this as
\[ (1 + i\alpha t)\langle\phi_0\rangle = \langle\phi_0\rangle \tag{6.19} \]
which is to say that for the vacuum to be invariant under a symmetry, the following condition for the generator t must hold
\[ t\,\langle\phi_0\rangle = 0. \tag{6.20} \]
Any generator that does not satisfy this condition is called a broken generator. The following examples illustrate spontaneously broken symmetries. In all of them there is a scalar field φ that acquires a vacuum expectation value (VEV) caused by a potential $V(\phi) = \mu^2\phi\phi^* + \lambda(\phi\phi^*)^2$. This potential is called a Mexican hat potential and we will later identify it with the Higgs potential. The VEV, though, will only be invariant under a subgroup of the symmetry group. Some generators of the full symmetry group, which satisfy condition (6.20) as long as the VEV is zero, are broken once the field acquires a nonzero VEV. The remaining generators, which still leave the vacuum invariant, generate the unbroken subgroup. As we will see, there is a physical interpretation for these broken generators and this will be important for the Little Higgs models we discuss later.
Example 6.7. Breaking a global U(1) symmetry
Consider the Lagrangian
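\[ \mathcal{L} = (\partial_\mu\phi)^*(\partial^\mu\phi) - \mu^2\phi^*\phi - \lambda(\phi^*\phi)^2 \tag{6.21} \]
for a complex scalar field $\phi = \frac{1}{\sqrt{2}}(\phi_1 + i\phi_2)$. As stated earlier this Lagrangian is invariant under U(1)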
transformations. A look at the expression for the potential indicates we have to distinguish between
the cases µ2 > 0 and µ2 < 0.
µ² > 0: The case µ² > 0 corresponds to a ground state $\langle\phi_0\rangle = 0$. In terms of the fields $\phi_1$ and $\phi_2$ the Lagrangian for small oscillations around this vacuum reads¹⁹
\[ \mathcal{L} = \frac{1}{2}(\partial_\mu\phi_1)^2 + \frac{1}{2}(\partial_\mu\phi_2)^2 - \frac{1}{2}\mu^2(\phi_1^2 + \phi_2^2) - \frac{1}{4}\lambda(\phi_1^2 + \phi_2^2)^2 \tag{6.22} \]
and by comparing this to (6.5) we see that the µ² > 0 case simply describes two particles, each of which has a mass µ.
µ² < 0: The case µ² < 0 is more interesting. In this case the potential takes the form of the so-called Mexican hat potential, which is unstable at φ = 0. Unlike the µ² > 0 case, we now have an infinite number of vacua located at the rim of the hat, satisfying
\[ \sqrt{\phi_1^2 + \phi_2^2} = \sqrt{\frac{-\mu^2}{\lambda}} = v \tag{6.23} \]
and connected by rotational symmetry. They are therefore all equivalent, so we are free to choose
\[ \langle\phi_0\rangle = \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} = \begin{pmatrix} v \\ 0 \end{pmatrix} \tag{6.24} \]
as our ground state. We now consider small oscillations around the ground state by redefining the field variables through $\eta = \phi_1 - v$ and $\xi = \phi_2$. This gives us a parametrization of φ in terms of the fluctuation fields η and ξ²⁰:
\[ \phi = \frac{1}{\sqrt{2}}(\eta + v + i\xi). \tag{6.25} \]
¹⁹ In field theory particles are described as oscillations around their ground state [12].
Figure 1: The $V(\phi) = \mu^2\phi^*\phi + \lambda(\phi^*\phi)^2$ potential for a complex scalar field for (a) µ² > 0 and (b) µ² < 0. The picture is from [13].
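The next step is to rewrite (6.21) in terms of η and ξ to obtain the Lagrangian for the small oscillations. To keep things clear, we treat the kinetic part and potential part separately. For the kinetic part we find:
\[ \mathcal{L}_{\text{kin}} = (\partial_\mu\phi)^*(\partial^\mu\phi) = \frac{1}{2}\,\partial_\mu(\eta + v - i\xi)\,\partial^\mu(\eta + v + i\xi) = \frac{1}{2}(\partial_\mu\eta)^2 + \frac{1}{2}(\partial_\mu\xi)^2 \]
where we used $\partial_\mu v = 0$. For the potential part we note $\phi^*\phi = \frac{1}{2}(\eta + v - i\xi)(\eta + v + i\xi) = \frac{1}{2}[(\eta + v)^2 + \xi^2]$. Then we find: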
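\[ \begin{aligned} \mathcal{L}_{\text{pot}} &= \mu^2\phi^*\phi + \lambda(\phi^*\phi)^2 \\ &= \frac{1}{2}(-\lambda v^2)[(\eta + v)^2 + \xi^2] + \frac{1}{4}\lambda[(\eta + v)^2 + \xi^2]^2 \\ &= -\frac{1}{4}\lambda v^4 + \lambda v^2\eta^2 + \lambda v\eta^3 + \frac{1}{4}\lambda\eta^4 + \frac{1}{4}\lambda\xi^4 + \lambda v\eta\xi^2 + \frac{1}{2}\lambda\eta^2\xi^2 \end{aligned} \tag{6.26} \]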
Since in (6.26) the 3rd and 4th order terms in η and ξ represent interaction terms and the constant
term is irrelevant we consider only the quadratic terms to obtain for the full Lagrangian in terms of
the oscillation field
\[ \mathcal{L}_{\text{s.o.}} = \frac{1}{2}(\partial_\mu\eta)^2 + \frac{1}{2}(\partial_\mu\xi)^2 - \lambda v^2\eta^2 + \text{interaction terms}. \tag{6.27} \]
Comparing to (6.5) shows this corresponds to a massive η-particle with $m_\eta^2 = 2\lambda v^2 = -2\mu^2 > 0$ and a massless particle ξ, since there is no mass term $\frac{1}{2}m_\xi^2\xi^2$. This massless ξ-particle is called a Nambu-Goldstone boson (often abbreviated as NGB) and it is this particle that we identify with the broken generator. NGBs are predicted by the Goldstone theorem, which states that one Goldstone boson appears for every broken generator of the original symmetry group. The appearance of massless particles might seem troublesome. However, we will see in sections 7.2 and 7.3 that they play an important role in the Higgs mechanism.
²⁰ We have $\langle\eta_0\rangle = 0$ and $\langle\xi_0\rangle = 0$, so they indeed describe fluctuations around the vacuum.
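The mass spectrum found above can be cross-checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates the potential of (6.21) on the real fields φ1 and φ2 and estimates its second derivatives at the vacuum (v, 0) by finite differences. The parameter values and the helper names `potential` and `hessian` are illustrative assumptions.

```python
def potential(phi1, phi2, mu2, lam):
    """V = mu^2 phi* phi + lam (phi* phi)^2 with phi = (phi1 + i phi2)/sqrt(2)."""
    mod2 = 0.5 * (phi1 ** 2 + phi2 ** 2)
    return mu2 * mod2 + lam * mod2 ** 2


def hessian(f, x, y, h=1e-4):
    """Central finite-difference second derivatives (f_xx, f_yy, f_xy) at (x, y)."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    return fxx, fyy, fxy


# Illustrative (assumed) parameters with mu^2 < 0, so the symmetry is broken.
mu2, lam = -1.0, 0.5
v = (-mu2 / lam) ** 0.5  # vacuum expectation value from (6.23)

# Second derivatives of V at the chosen vacuum (phi1, phi2) = (v, 0).
# The radial direction gives m_eta^2, the angular direction gives m_xi^2.
m_eta2, m_xi2, mixing = hessian(lambda a, b: potential(a, b, mu2, lam), v, 0.0)

print("m_eta^2 =", m_eta2, "(expected", -2 * mu2, ")")
print("m_xi^2  =", m_xi2, "(expected 0)")
```

The radial curvature reproduces $m_\eta^2 = -2\mu^2$ and the curvature along the rim vanishes, which is the numerical counterpart of the massless Goldstone mode.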
Now, although the parametrization we used above worked perfectly well, we could also have chosen the following parametrization for φ in terms of the two real fields η and ξ:
\[ \phi = \frac{1}{\sqrt{2}}(v + \eta)e^{i\xi/v}. \tag{6.28} \]
Here v is again the vacuum expectation value, η parametrizes radial oscillations around v and ξ rotations in the complex plane.²¹ Substituting this in (6.21) we obtain
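\[ \begin{aligned} \mathcal{L} &= \frac{1}{2}\,\partial_\mu\!\left[(\eta + v)e^{-i\xi/v}\right]\partial^\mu\!\left[(\eta + v)e^{i\xi/v}\right] - \frac{\mu^2}{2}(\eta + v)^2 + \frac{\mu^2}{4v^2}(\eta + v)^4 \\ &= \frac{1}{2}\partial_\mu\eta\,\partial^\mu\eta + \frac{(\eta + v)^2}{2v^2}\partial_\mu\xi\,\partial^\mu\xi - \frac{\mu^2}{2}(\eta + v)^2 + \frac{\mu^2}{4v^2}(\eta + v)^4 \\ &= \left[\frac{1}{2}\partial_\mu\eta\,\partial^\mu\eta + \mu^2\eta^2\right] + \left[\frac{1}{2}\partial_\mu\xi\,\partial^\mu\xi\right] + \left[\left(\frac{\eta^2}{2v^2} + \frac{\eta}{v}\right)\partial_\mu\xi\,\partial^\mu\xi + \frac{\mu^2}{4v^2}(4v\eta^3 + \eta^4)\right] - \frac{\mu^2v^2}{4} \end{aligned} \]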
where we used $\partial_\mu v = 0$ and $\lambda = -\mu^2/v^2$. As before we make the same identification of a massive η particle and a Goldstone particle ξ, as was to be expected since the result should be independent of the chosen parametrization. We now have a second look at (6.28) to see if we can deduce some properties of ξ. Under a U(1) transformation we have:
\[ \phi \to e^{i\alpha}\phi \quad\text{and}\quad \xi \to \xi + \alpha v. \]
We can conclude from this that any non-derivative term in ξ would not be invariant under the U(1) transformations. Therefore, no mass term for ξ can appear in the Lagrangian and ξ must be massless. The only way for the Goldstone particle to acquire a small mass is for the symmetry to be broken explicitly.²² In this case the Goldstone particle is called a pseudo-NGB. Also, we see from (6.28) that ξ parametrizes a direction in field space along which the energy does not change, since a shift in ξ does not change φφ*. It corresponds to walking along the rim of the hat, see figure 1, and this is the reason why the Goldstone particle remains massless. The field η, on the other hand, parametrizes the radial direction, and oscillations in this direction do change the energy, as figure 1 shows. The η particle therefore acquires a mass.
Example 6.8. Breaking a global SU(2) symmetry. Things become more interesting when we look at the spontaneous breaking of an SU(2) symmetry. To have SU(2) invariance we have to consider a doublet consisting of two complex scalar particles $\Phi_1$ and $\Phi_2$:
\[ \Phi = \begin{pmatrix} \Phi_1 \\ \Phi_2 \end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} \phi_1 + i\phi_2 \\ \phi_3 + i\phi_4 \end{pmatrix}. \tag{6.29} \]
²¹ Note that (6.25) is the first order expansion of (6.28).
²² This shift of the Goldstone boson under the action of the broken generator is a general observation, and this realization of the symmetry is called a shift symmetry.
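The expression for the Lagrangian of $\Phi_1$ is
\[ \mathcal{L} = \frac{1}{2}(\partial_\mu\phi_1)^2 + \frac{1}{2}(\partial_\mu\phi_2)^2 - \frac{1}{2}m_1^2(\phi_1^2 + \phi_2^2) - \frac{1}{4}\lambda(\phi_1^2 + \phi_2^2)^2. \]
A similar expression holds for $\Phi_2$. Using $\Phi_1^*\Phi_1 = \phi_1^2 + \phi_2^2$ (with the normalization factor $1/\sqrt{2}$ of (6.29) absorbed into the fields) and assuming equal masses $m_1 = m_2 = \mu$ for $\Phi_1$ and $\Phi_2$, we obtain the full Lagrangian for Φ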
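\[ \begin{aligned} \mathcal{L} &= \frac{1}{2}(\partial_\mu\Phi_1)^*(\partial^\mu\Phi_1) + \frac{1}{2}(\partial_\mu\Phi_2)^*(\partial^\mu\Phi_2) - \frac{1}{2}\mu^2(\Phi_1^*\Phi_1 + \Phi_2^*\Phi_2) - \frac{1}{4}\lambda(\Phi_1^*\Phi_1 + \Phi_2^*\Phi_2)^2 \\ &= \frac{1}{2}(\partial_\mu\Phi^\dagger)(\partial^\mu\Phi) - \frac{1}{2}\mu^2(\Phi^\dagger\Phi) - \frac{1}{4}\lambda(\Phi^\dagger\Phi)^2. \end{aligned} \tag{6.30} \]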
As before we have to distinguish between the cases µ² > 0 and µ² < 0. The former again corresponds to a vacuum expectation value of $\langle\Phi_0\rangle = 0$ and describes two particles each of mass µ. The case µ² < 0 has a vacuum expectation value of
\[ \langle\Phi^\dagger\Phi\rangle_0 = \frac{-\mu^2}{\lambda} = v^2 \]
and there is again an infinite number of vacuum states lying on a circle of radius v in the $\Phi_1$-$\Phi_2$ plane. We choose our vacuum as
\[ \langle\Phi_0\rangle = \begin{pmatrix} 0 \\ v \end{pmatrix}. \]
Now, let's see how many generators get broken by this VEV. The full symmetry group SU(2) has three generators $T_i = \tau_i/2$, where $\tau_i$, i = 1, 2, 3, are the three Pauli matrices.²³ Recall that a generator is broken by the vacuum state when condition (6.20) is not satisfied, that is, when
\[ t\,\langle\Phi_0\rangle \neq 0 \]
and that we will have as many Goldstone bosons as broken generators. For the generators of SU(2) we see that
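\begin{align}
T_1\begin{pmatrix} 0 \\ v \end{pmatrix} &= \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ v \end{pmatrix} \neq 0 \tag{6.31} \\
T_2\begin{pmatrix} 0 \\ v \end{pmatrix} &= \frac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} 0 \\ v \end{pmatrix} \neq 0 \tag{6.32} \\
T_3\begin{pmatrix} 0 \\ v \end{pmatrix} &= \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 0 \\ v \end{pmatrix} \neq 0 \tag{6.33}
\end{align}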
so all three SU(2) generators are broken, from which we expect there to be three Goldstone bosons. We now set
\[ \Xi = \begin{pmatrix} \xi_1 + i\xi_2 \\ \xi_3 + i\xi_4 \end{pmatrix}, \]
and vary Φ around its ground state by setting $\Phi - \langle\Phi_0\rangle = \Xi$. Then $\langle\Xi_0\rangle = 0$ and Φ in terms of these shifted fields becomes
\[ \Phi = \begin{pmatrix} \xi_1 + i\xi_2 \\ \xi_3 + v + i\xi_4 \end{pmatrix}. \tag{6.34} \]
²³ See appendix B.
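What remains is to substitute (6.34) in the Lagrangian (6.30). With $\partial_\mu v = 0$, we find for the kinetic part
\[ \frac{1}{2}(\partial_\mu\Phi^\dagger)(\partial^\mu\Phi) = \frac{1}{2}\sum_{i=1}^{4}(\partial_\mu\xi_i)(\partial^\mu\xi_i). \tag{6.35} \]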
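For the potential part (which enters the Lagrangian with a minus sign) we have, with $\lambda = -\mu^2/v^2$:
\begin{align}
\mathcal{L}_{\text{pot}} &= \frac{1}{2}\mu^2(\Phi^\dagger\Phi) + \frac{1}{4}\lambda(\Phi^\dagger\Phi)^2 \tag{6.36} \\
&= \frac{1}{2}\mu^2\left(\xi_1^2 + \xi_2^2 + (\xi_3 + v)^2 + \xi_4^2\right) + \frac{1}{4}\lambda\left(\xi_1^2 + \xi_2^2 + (\xi_3 + v)^2 + \xi_4^2\right)^2 \tag{6.37} \\
&= \frac{1}{2}\mu^2\left(\sum_{i=1}^{4}\xi_i^2 + 2v\xi_3 + v^2\right) - \frac{\mu^2}{4v^2}\left(\sum_{i=1}^{4}\xi_i^2 + 2v\xi_3 + v^2\right)^2. \tag{6.38}
\end{align}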
For the second term we have
\begin{align}
\left(\sum_{i=1}^{4}\xi_i^2 + 2v\xi_3 + v^2\right)^2 &= \left(\sum_{i=1}^{4}\xi_i^2\right)^2 + (2v\xi_3)^2 + v^4 + 2\left(\sum_{i=1}^{4}\xi_i^2\right)(2v\xi_3 + v^2) + 4v^3\xi_3 \notag \\
&= 4v^2\xi_3^2 + v^4 + 2v^2\sum_{i=1}^{4}\xi_i^2 + 4v^3\xi_3 + \text{higher order terms} \tag{6.39}
\end{align}
where in the second line we neglected terms of order higher than two, since those represent interactions and we are interested in the masses. Putting (6.39) back in (6.38) we obtain for the Lagrangian
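\begin{align}
\mathcal{L}_{\text{pot}} &= \frac{1}{2}\mu^2\left(\sum_{i=1}^{4}\xi_i^2 + 2v\xi_3 + v^2\right) - \frac{\mu^2}{4v^2}\left(4v^2\xi_3^2 + v^4 + 2v^2\sum_{i=1}^{4}\xi_i^2 + 4v^3\xi_3\right) \tag{6.40} \\
&= \frac{1}{4}\mu^2v^2 - \mu^2\xi_3^2. \tag{6.41}
\end{align}
This tells us that $\xi_1$, $\xi_2$ and $\xi_4$ correspond to massless Goldstone particles, since they have no mass term, and that $\xi_3$ has obtained a mass
\[ M_{\xi_3} = \sqrt{-2\mu^2}. \]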
Example 6.9. Breaking SO(N) → SO(N−1). As a final example we consider the case of breaking an SO(N) symmetry. We consider the Lagrangian
\[ \mathcal{L} = (\partial_\mu\phi_i)^T(\partial^\mu\phi_i) - \mu^2\phi_i^T\phi_i - \lambda(\phi_i^T\phi_i)^2 \quad\text{where}\quad i = 1, \dots, N \]
and we choose
\[ \langle\phi_0\rangle = \begin{pmatrix} 0 & \cdots & 0 & v \end{pmatrix}^T \quad\text{with}\quad v = \sqrt{\frac{-\mu^2}{\lambda}} \]
(with v in the N-th entry) as our VEV for the µ² < 0 case. The Lagrangian is invariant under the SO(N) transformations
\[ \phi \to e^{i\alpha T_a}\phi \]
where $T_a$ are the $\frac{1}{2}N(N-1)$ generators that have a single −i above the diagonal and a corresponding i below the diagonal, such that the matrix is anti-symmetric. To determine the number of NGBs we again determine how many generators are broken. For this we use (6.20):
\[ t\,\langle\phi_0\rangle = 0. \]
Now, looking at our choice of $\langle\phi_0\rangle$ we see that any generator $T_a$ with a nonzero entry in the last column will not satisfy this condition and is therefore a broken generator. It is easy to see that there are thus N − 1 broken generators. The number of unbroken generators is therefore
\[ \frac{1}{2}N(N-1) - (N-1) = \frac{1}{2}(N-1)(N-2) \]
which is the number of generators of SO(N − 1).
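The counting of broken generators can be checked by brute force for small N. The sketch below is an illustration under stated assumptions rather than part of the original argument: it builds the generators exactly as described above (a single −i above the diagonal and the matching i below), applies each to the vacuum with v in the last entry, and counts how many fail condition (6.20). The function names are hypothetical.

```python
def so_n_generators(n):
    """Return the n(n-1)/2 generators of SO(n): -i at (a, b), +i at (b, a), a < b."""
    gens = []
    for a in range(n):
        for b in range(a + 1, n):
            t = [[0j] * n for _ in range(n)]
            t[a][b] = -1j
            t[b][a] = 1j
            gens.append(t)
    return gens


def apply(matrix, vector):
    """Matrix-vector product for nested lists."""
    return [sum(row[k] * vector[k] for k in range(len(vector))) for row in matrix]


def count_generators(n, v=1.0, tol=1e-12):
    """Return (broken, unbroken) generator counts for the vacuum (0, ..., 0, v)."""
    vacuum = [0.0] * (n - 1) + [v]
    gens = so_n_generators(n)
    broken = sum(
        1 for t in gens if any(abs(c) > tol for c in apply(t, vacuum))
    )
    return broken, len(gens) - broken


for n in range(2, 7):
    broken, unbroken = count_generators(n)
    print(f"SO({n}): {broken} broken, {unbroken} unbroken")
```

For each N the broken count equals N − 1 and the unbroken count equals the dimension of SO(N − 1), in line with the formula above.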
In this section we examined the concepts of the spontaneous breaking of global symmetries, group
generators and Goldstone bosons. In the next section I will discuss these same concepts in the context
of local symmetries and discuss the Higgs mechanism. Things become a little more complicated when
we demand our Lagrangian to obey local symmetries, and we see that precisely this requirement will
resolve the problem of massless particles.
7 Goldstone bosons and the Higgs mechanism
The Higgs mechanism was first published by François Englert, Robert Brout and Peter Higgs in 1964 to explain why particles have mass. In the Higgs mechanism, the potential of the Higgs field is responsible for the spontaneous breaking of the symmetry group of the electroweak force, $SU(2)_W \times U(1)_Y$, to $U(1)_{EM}$, the electromagnetic symmetry group. This version of the Higgs mechanism is called the Standard Model Higgs mechanism and is responsible for assigning mass to the $W^\pm$ and $Z^0$ gauge bosons of the electroweak force and to the fundamental fermions, while leaving the photon massless. Before demonstrating the Standard Model Higgs mechanism I will first introduce the Higgs mechanism for the Abelian case, with U(1) as symmetry group.²⁴
First, though, we have to impose local symmetries on our Lagrangian. This will require introducing so-called gauge fields and gauge bosons. These gauge bosons will be responsible for resolving the problem of massless Goldstone particles.
7.1 Local U(1) gauge theory
Gauge theories are theories for which the Lagrangian is invariant under a group of local gauge transformations. They find their origin in Maxwell's equations for electromagnetism. There it was shown that for any scalar function λ(r, t), the transformations $\mathbf{A} \to \mathbf{A} + \nabla\lambda(\mathbf{r}, t)$ and $V \to V - \partial\lambda/\partial t$ leave E and B unchanged.²⁵ These transformations are called gauge transformations. We already saw in section 6.2 that the Lagrangian (6.14) is invariant under the global gauge transformations
\[ \phi \to e^{i\theta}\phi. \]
But what if we considered local transformations
\[ \phi \to e^{i\theta(x)}\phi \tag{7.1} \]
by letting θ depend on the space-time coordinate $x^\mu$? Then by evaluating $\phi^*\phi$ and $\partial_\mu\phi$ under the transformation we observe that $\phi^*\phi$ remains unchanged. However, $\partial_\mu\phi$ transforms as
\[ \partial_\mu\phi \to e^{i\theta(x)}(\partial_\mu + i\partial_\mu\theta(x))\phi \]
and the symmetry is clearly broken. To resolve this problem we introduce the covariant derivative $D_\mu$
\[ D_\mu = \partial_\mu + ieA_\mu. \]
Here e is the charge of the particle described by φ(x) and $A_\mu$ is the electromagnetic field, which transforms as
\[ A_\mu \to A_\mu - \frac{1}{e}\partial_\mu\theta(x). \tag{7.2} \]
Together (7.1) and (7.2) are the set of local gauge transformations. By replacing $\partial_\mu$ with $D_\mu$ we can now easily verify that the invariance is restored:
\[ D_\mu\phi = (\partial_\mu + ieA_\mu)\phi \to e^{i\theta(x)}\left((\partial_\mu + i\partial_\mu\theta(x)) + ie\left(A_\mu - \frac{1}{e}\partial_\mu\theta(x)\right)\right)\phi = e^{i\theta(x)}D_\mu\phi. \]
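The covariance of $D_\mu\phi$ can also be checked numerically. The following minimal sketch restricts everything to a single coordinate x and uses finite differences. The field profiles `phi`, `gauge_a` and `theta` and the value of the charge are arbitrary assumptions chosen for illustration only.

```python
import cmath

E_CHARGE = 0.3  # illustrative value for the charge e (assumption)


def phi(x):
    """Hypothetical complex scalar field profile."""
    return (1.0 + 0.5 * x) * cmath.exp(0.7j * x * x)


def gauge_a(x):
    """Hypothetical gauge field component A(x)."""
    return 0.2 * x + 0.1


def theta(x):
    """Hypothetical local phase theta(x)."""
    return 0.4 * x ** 3


def derivative(f, x, h=1e-5):
    """Central finite-difference derivative (works for complex-valued f)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def covariant_derivative(field, gauge, x):
    """D phi = (d/dx + i e A) phi, the one-dimensional analogue of D_mu."""
    return derivative(field, x) + 1j * E_CHARGE * gauge(x) * field(x)


def phi_transformed(x):
    """phi -> exp(i theta(x)) phi, as in (7.1)."""
    return cmath.exp(1j * theta(x)) * phi(x)


def gauge_a_transformed(x):
    """A -> A - (1/e) d(theta)/dx, as in (7.2)."""
    return gauge_a(x) - derivative(theta, x) / E_CHARGE


x0 = 0.8
phase = cmath.exp(1j * theta(x0))

# Covariant derivative: transforms with the same phase as phi itself.
lhs = covariant_derivative(phi_transformed, gauge_a_transformed, x0)
rhs = phase * covariant_derivative(phi, gauge_a, x0)

# Ordinary derivative: picks up the extra i * theta'(x) * phi term.
plain_mismatch = derivative(phi_transformed, x0) - phase * derivative(phi, x0)

print("covariant mismatch:", abs(lhs - rhs))
print("ordinary mismatch: ", abs(plain_mismatch))
```

The covariant mismatch is at the level of the finite-difference error, while the ordinary derivative fails to transform covariantly by an amount of order one.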
²⁴ The material in this section is based on [11], [12] and [13].
²⁵ Recall that $\mathbf{B} = \nabla \times \mathbf{A}$ and $\mathbf{E} = -\nabla V - \partial\mathbf{A}/\partial t$.
The $A_\mu$ field is defined in such a way that it cancels the offending $i\partial_\mu\theta(x)$, and the locally invariant Lagrangian now reads²⁶:
\[ \mathcal{L} = (D_\mu^*\phi^*)(D^\mu\phi) - \mu^2\phi^*\phi - \lambda(\phi^*\phi)^2 \tag{7.3} \]
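What remains is to include the Lagrangian for the vector field $A_\mu$ we introduced, which is given by the Proca Lagrangian (6.7)
\[ \mathcal{L}_{\text{Proca}} = -\frac{1}{4}(\partial^\mu A^\nu - \partial^\nu A^\mu)(\partial_\mu A_\nu - \partial_\nu A_\mu) + \frac{1}{2}M^2A^\nu A_\nu. \tag{7.4} \]
Using the definition of the field strength (6.8) this becomes
\[ \mathcal{L}_{\text{Proca}} = -\frac{1}{4}F^{\mu\nu}F_{\mu\nu} + \frac{1}{2}M^2A^\nu A_\nu. \tag{7.5} \]
A quick calculation shows that while the first term in (7.5) is invariant under (7.2), $A^\nu A_\nu$ is not. It thus follows that for $\mathcal{L}_{\text{Proca}}$ to be invariant we must have M = 0, which gives us our final locally invariant Lagrangian:
\[ \mathcal{L} = (D_\mu^*\phi^*)(D^\mu\phi) - \mu^2\phi^*\phi - \lambda(\phi^*\phi)^2 - \frac{1}{4}F^{\mu\nu}F_{\mu\nu} \tag{7.6} \]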
7.2 Abelian Higgs Mechanism
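I will now discuss the Higgs mechanism for a U(1) symmetry. The same techniques as in section 6.3.2 are used, only now we consider the locally gauge-invariant case derived in the previous section. We begin with the Lagrangian (7.6)
\[ \mathcal{L} = \left[(\partial_\mu - ieA_\mu)\phi^*\right]\left[(\partial^\mu + ieA^\mu)\phi\right] - \mu^2\phi^*\phi - \lambda(\phi^*\phi)^2 - \frac{1}{4}F^{\mu\nu}F_{\mu\nu}, \]
for a complex scalar field
\[ \phi = \frac{1}{\sqrt{2}}(\phi_1 + i\phi_2). \tag{7.7} \]
Considering now only the case µ² < 0, we take our VEV to be $\langle\phi_0\rangle = v/\sqrt{2}$ (that is, $\phi_1 = v$ and $\phi_2 = 0$, as in (6.24)) with $v = \sqrt{-\mu^2/\lambda}$, and parametrize φ as
\[ \phi = \frac{1}{\sqrt{2}}(v + \eta)e^{i\xi/v}. \]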
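Substituting this in (7.6) gives
\[ \mathcal{L} = \frac{1}{2}\left[(\partial_\mu - ieA_\mu)\left((v + \eta)e^{-i\xi/v}\right)\right]\left[(\partial^\mu + ieA^\mu)\left((v + \eta)e^{i\xi/v}\right)\right] - \frac{\mu^2}{2}(v + \eta)^2 + \frac{\mu^2}{4v^2}(v + \eta)^4 - \frac{1}{4}F^{\mu\nu}F_{\mu\nu} \]
where we used $\lambda = -\mu^2/v^2$ in the third term. We can identify the second and third terms with a mass term $\mu^2\eta^2$ plus coupling terms and irrelevant constants. Working out the first term gives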
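\[ \begin{aligned} &\frac{1}{2}\left[(\partial_\mu - ieA_\mu)\left((v + \eta)e^{-i\xi/v}\right)\right]\left[(\partial^\mu + ieA^\mu)\left((v + \eta)e^{i\xi/v}\right)\right] \\ &\quad= \frac{1}{2}\left[\partial_\mu\eta - ieA_\mu(v + \eta) - i\frac{\eta + v}{v}\partial_\mu\xi\right]\left[\partial^\mu\eta + ieA^\mu(v + \eta) + i\frac{\eta + v}{v}\partial^\mu\xi\right] \\ &\quad= \frac{1}{2}\left[\partial_\mu\eta\,\partial^\mu\eta + e^2(\eta + v)^2A_\mu A^\mu + 2e\frac{(\eta + v)^2}{v}A_\mu\partial^\mu\xi + \frac{(\eta + v)^2}{v^2}\partial_\mu\xi\,\partial^\mu\xi\right] \end{aligned} \]
²⁶ Note that before we had $\partial_\mu^* = \partial_\mu$, so we did not have to distinguish when writing down our Lagrangian. For $D_\mu$, though, we have $D_\mu^* = \partial_\mu - ieA_\mu \neq \partial_\mu + ieA_\mu = D_\mu$, so now we do have to distinguish.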
The important terms in our analysis are those quadratic in η, $A_\mu$ and ξ; the other terms represent interactions, which we are not interested in here. Thus, omitting the interaction terms we find
\[ \frac{1}{2}\partial_\mu\eta\,\partial^\mu\eta + \frac{1}{2}\partial_\mu\xi\,\partial^\mu\xi + \frac{e^2v^2}{2}A_\mu A^\mu + evA_\mu\partial^\mu\xi. \]
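The relevant part of our Lagrangian thereby becomes
\[ \mathcal{L} = \frac{1}{2}\left[\partial_\mu\eta\,\partial^\mu\eta + 2\mu^2\eta^2\right] + \frac{1}{2}\left[\partial_\mu\xi\,\partial^\mu\xi\right] - \frac{1}{4}F^{\mu\nu}F_{\mu\nu} + \frac{e^2v^2}{2}A_\mu A^\mu + evA_\mu\partial^\mu\xi + \dots \]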
From this we can read off that, just as before, the η-particle has a mass $\sqrt{-2\mu^2} > 0$ and the ξ-particle is massless. What is new is that the gauge field $A_\mu$ also seems to have acquired a mass ev, as seen from the fourth term. While this is all perfectly fine, there are two problems with this result. The first is the fifth term, which seems to represent some kind of coupling between $A_\mu$ and ξ and is clearly unwanted. Secondly, the Goldstone boson is still present. Both problems can be resolved by a particular choice of gauge for θ(x) in the transformations (7.1) and (7.2), which were defined by
\[ \phi \to e^{i\theta(x)}\phi, \qquad A_\mu \to A_\mu - \frac{1}{e}\partial_\mu\theta(x). \]
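To see how we should pick θ(x), we rewrite the terms with $A_\mu$ and ξ as
\[ \frac{e^2v^2}{2}\left(A_\mu + \frac{1}{ev}\partial_\mu\xi\right)\left(A^\mu + \frac{1}{ev}\partial^\mu\xi\right). \]
By comparing this to the gauge transformation for $A_\mu$ we see we should pick θ(x) = −ξ(x)/v, which corresponds to the transformation
\[ \phi \to \phi' = e^{-i\xi/v}\phi = e^{-i\xi/v}\frac{1}{\sqrt{2}}(v + \eta)e^{i\xi/v} = \frac{1}{\sqrt{2}}(v + \eta). \]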
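With this gauge choice we can rewrite the Lagrangian to obtain
\[ \begin{aligned} \mathcal{L} &= \frac{1}{2}\left[\partial_\mu\eta\,\partial^\mu\eta + 2\mu^2\eta^2\right] - \frac{1}{4}F^{\mu\nu}F_{\mu\nu} + \frac{e^2v^2}{2}\left(A_\mu + \frac{1}{ev}\partial_\mu\xi\right)\left(A^\mu + \frac{1}{ev}\partial^\mu\xi\right) \\ &= \frac{1}{2}\left[\partial_\mu\eta\,\partial^\mu\eta + 2\mu^2\eta^2\right] - \frac{1}{4}F^{\mu\nu}F_{\mu\nu} + \frac{e^2v^2}{2}A'_\mu A'^\mu + \dots \end{aligned} \]
where $A'_\mu = A_\mu + \frac{1}{ev}\partial_\mu\xi$ is the gauge-transformed field.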
This choice for θ(x) is called the unitary gauge (abbreviated 'U-gauge'), and in this gauge only the physical terms appear in the Lagrangian, since the choice θ(x) = −ξ(x)/v corresponds to choosing φ to be entirely real. With the Lagrangian written in this form we can draw the following conclusions:
• The $A_\mu$ field has acquired a mass ev.
• The η-field has a mass $\sqrt{2\lambda}\,v$.
• The Goldstone particle ξ has disappeared!
The massless vector field $A_\mu$ carried two degrees of freedom (transverse polarizations) before. It picks up a third degree of freedom (longitudinal polarization) when it acquires a mass. This extra degree of freedom came from the Goldstone boson ξ, which simultaneously disappeared from the spectrum. Prosaically this is sometimes referred to as "the gauge field eating the Goldstone boson" [12]. The η particle corresponds to the Higgs boson. Note that the VEV of the Higgs field, v, sets the scale for the mass of the gauge boson as well as the Higgs mass.
7.3 The Standard model Higgs mechanism
We saw in the previous section how the Higgs mechanism is used to generate a mass term for the U(1) gauge boson. We defined a covariant derivative and introduced gauge transformations to make the global U(1) symmetry a local symmetry. The non-zero VEV of the Higgs field then caused this symmetry to be broken, and by redefining our gauge fields we removed the NGB from the spectrum, which simultaneously resulted in a mass term for the $A_\mu$ gauge boson. Here we discuss how the Higgs mechanism is embedded in the electroweak sector of the Standard Model. The non-zero VEV of the Higgs field will trigger electroweak symmetry breaking (EWSB) of $SU(2)_W \times U(1)_Y$ to $U(1)_{EM}$. We thus expect 3 Goldstone bosons to appear.
7.3.1 Assigning mass to gauge bosons
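Just as in all the examples we have discussed before, the Higgs Lagrangian reads²⁷:
\[ \mathcal{L} = (D_\mu\phi)^\dagger(D^\mu\phi) - \mu^2\phi^\dagger\phi - \lambda(\phi^\dagger\phi)^2 \tag{7.8} \]
where the covariant derivative now reads
\[ D_\mu = \partial_\mu + i\frac{g}{2}W^a_\mu\tau_a + i\frac{g'}{2}YB_\mu. \tag{7.9} \]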
Here $W^a_\mu$ are the three gauge bosons of the weak interaction and $\tau_a$ are the Pauli matrices, with $T_a = \tau_a/2$ being the generators of SU(2). Note that in the covariant derivative the summation convention is used for a = 1, 2, 3 and that $D_\mu$ is a 2 × 2 matrix. $B_\mu$ represents the gauge boson of U(1), corresponding to the weak hypercharge generator Y. Finally, g and g′ are the coupling constants of $SU(2)_L$ and $U(1)_Y$ respectively. After spontaneous symmetry breaking the three Goldstone bosons will become the degrees of freedom that mix with the three SU(2) gauge bosons to become the massive $W^+$, $W^-$ and $Z^0$ bosons, the photon will remain massless, and the remaining degree of freedom will be identified with the scalar Higgs boson. Contrary to the Abelian case, the Higgs field is now a doublet of the complex scalar components $\phi^+$ and $\phi^0$
\[ \phi = \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} \phi_1 + i\phi_2 \\ \phi_3 + i\phi_4 \end{pmatrix} \tag{7.10} \]
and transforms as an SU(2) doublet. The charges +1 and 0 follow from the fact that the Higgs is supposed to give mass to the $W^+$, $W^-$ and $Z^0$ bosons. Therefore, one of the fields must necessarily be neutral while the other must be charged. In that way $\phi^+$ and $(\phi^+)^* = \phi^-$ become the massive degrees of freedom of the $W^\pm$. (The remaining field will become the massive degree of freedom for the Higgs.) The Higgs field must further have a weak hypercharge of Y = +1, which follows from the Gell-Mann-Nishijima formula
\[ Q = T_3 + \frac{Y}{2} \]
that relates electric charge to weak isospin and weak hypercharge. (Note that, since we have an isospin doublet, the weak isospin $T_3$ has eigenvalues $\pm\frac{1}{2}$.) Now, as before, we find for µ² < 0 that the Higgs field has a nonzero VEV. As in the previous sections we choose our vacuum to be
\[ \langle\phi_0\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 \\ v \end{pmatrix} \]
with $v = \sqrt{-\mu^2/\lambda}$. This VEV breaks the $SU(2)_L$ symmetry as well as the $U(1)_Y$ symmetry. However, it does remain invariant under the $U(1)_{EM}$ symmetry generated by the electric charge. In example 6.8 we already saw that all three SU(2) generators are broken by this VEV. In addition, the hypercharge generator Y is also broken by the VEV:
\[ Y\begin{pmatrix} 0 \\ v \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 \\ v \end{pmatrix} \neq 0. \tag{7.11} \]
²⁷ The Higgs Lagrangian also contains a term $\mathcal{L}_{\text{Yukawa}}$ for the coupling of the Higgs to fermions, but this we will not use until section 7.3.2. For now we focus on the gauge bosons.
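However, the linear combination $Q = T_3 + Y/2 = \frac{1}{2}(\tau_3 + Y)$ does leave the vacuum invariant, and the vacuum is thus invariant under the $U(1)_{EM}$ symmetry. As we will see, the $W^1$ and $W^2$ gauge bosons will mix to become the massive charged $W^\pm$ bosons, and $W^3$ and B will mix to become the massive neutral $Z^0$ boson and the massless photon $A_\mu$. As in section 7.2 we now expand φ about the minimum:
\[ \phi = \exp\left(\frac{i\,\xi\cdot\tau}{v}\right)\frac{1}{\sqrt{2}}\begin{pmatrix} 0 \\ v + h \end{pmatrix} \]
and gauge away the NGBs by turning to U-gauge. Then φ transforms as φ → φ′ = Uφ, where we choose the unitary matrix U to be $\exp\left(-\frac{i\,\xi\cdot\tau}{v}\right)$. We thus get:
\begin{align}
\phi \to U\phi &= \exp\left(-\frac{i\,\xi\cdot\tau}{v}\right)\phi \tag{7.12} \\
&= \exp\left(-\frac{i\,\xi\cdot\tau}{v}\right)\exp\left(\frac{i\,\xi\cdot\tau}{v}\right)\frac{1}{\sqrt{2}}\begin{pmatrix} 0 \\ v + h \end{pmatrix} \tag{7.13} \\
&= \frac{1}{\sqrt{2}}\begin{pmatrix} 0 \\ v + h \end{pmatrix} \tag{7.14}
\end{align}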
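The statements about which generators are broken by the electroweak vacuum can be verified with a few lines of arithmetic. This is a small sketch under stated assumptions: the VEV is set to v = 1 in arbitrary units, Y is represented by the 2 × 2 identity as in (7.11), and the helper names are hypothetical.

```python
def matvec(m, vec):
    """Product of a 2x2 matrix with a 2-component vector."""
    return [m[0][0] * vec[0] + m[0][1] * vec[1],
            m[1][0] * vec[0] + m[1][1] * vec[1]]


def is_broken(generator, vacuum, tol=1e-12):
    """A generator is broken when it does not annihilate the vacuum, cf. (6.20)."""
    return any(abs(c) > tol for c in matvec(generator, vacuum))


V_EV = 1.0  # VEV in arbitrary units (assumption, only the direction matters)
VACUUM = [0.0, V_EV / 2 ** 0.5]

PAULI = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}
# SU(2) generators T_a = tau_a / 2 and hypercharge Y = identity on the doublet.
T = {a: [[x / 2 for x in row] for row in tau] for a, tau in PAULI.items()}
Y = [[1, 0], [0, 1]]
# Electric charge Q = T_3 + Y/2.
Q = [[T[3][i][j] + Y[i][j] / 2 for j in range(2)] for i in range(2)]

STATUS = {
    "T1": is_broken(T[1], VACUUM),
    "T2": is_broken(T[2], VACUUM),
    "T3": is_broken(T[3], VACUUM),
    "Y": is_broken(Y, VACUUM),
    "Q": is_broken(Q, VACUUM),
}
for name, broken in STATUS.items():
    print(name, "broken" if broken else "unbroken")
```

Only Q annihilates the vacuum, so of the four generators of $SU(2)_L \times U(1)_Y$ one combination stays unbroken, which matches the three expected Goldstone bosons.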
To determine the masses, we note that for the h particle a mass term comes solely from the potential terms of (7.8), whereby it has a mass $M_H^2 = -2\mu^2 > 0$. To determine the masses of the gauge bosons we only have to consider the term $(D_\mu\phi)^\dagger(D^\mu\phi)$. Remembering to write everything in the U-gauge, we obtain for $D_\mu\phi$
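\begin{align}
D_\mu\phi &= \frac{1}{2\sqrt{2}}\left[2\partial_\mu + ig\tau_aW^a_\mu + ig'B_\mu\right]\begin{pmatrix} 0 \\ v + h \end{pmatrix} \tag{7.15} \\
&= \frac{1}{2\sqrt{2}}\begin{pmatrix} 2\partial_\mu + igW^3_\mu + ig'B_\mu & ig[W^1_\mu - iW^2_\mu] \\ ig[W^1_\mu + iW^2_\mu] & 2\partial_\mu - igW^3_\mu + ig'B_\mu \end{pmatrix}\begin{pmatrix} 0 \\ v + h \end{pmatrix} \tag{7.16} \\
&= \frac{1}{2\sqrt{2}}\begin{pmatrix} ig[W^1_\mu - iW^2_\mu](v + h) \\ (2\partial_\mu - igW^3_\mu + ig'B_\mu)(v + h) \end{pmatrix} \tag{7.17}
\end{align}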
where we used $T_a = \tau_a/2$. In this expression the gauge boson $B_\mu$ corresponds to the $U(1)_Y$ hypercharge generator Y and the $W^a_\mu$ correspond to the three SU(2) generators $T_a$. We therefore get for $(D_\mu\phi)^\dagger(D^\mu\phi)$ the following expression:
\begin{align}
(D_\mu\phi)^\dagger(D^\mu\phi) = {} & \frac{1}{2}(\partial_\mu h)(\partial^\mu h) + \frac{1}{8}g^2(W^1_\mu + iW^2_\mu)(W^{1\mu} - iW^{2\mu})(v + h)^2 \tag{7.18} \\
& + \frac{1}{8}(gW^3_\mu - g'B_\mu)(gW^{3\mu} - g'B^\mu)(v + h)^2 \tag{7.19}
\end{align}
To determine the masses of the gauge bosons we have to look at terms that are quadratic in the fields. (We thus ignore terms that involve products of the gauge bosons and h. Those terms represent interactions and I will get back to them later.) Then we get
\[ \frac{1}{8}v^2g^2\left[(W^1_\mu)^2 + (W^2_\mu)^2\right] + \frac{1}{8}v^2\left[gW^3_\mu - g'B_\mu\right]\left[gW^{3\mu} - g'B^\mu\right]. \tag{7.20} \]
This expression, though, does not yet contain the physical gauge bosons $W^\pm_\mu$, $Z^0_\mu$ and $A_\mu$. To obtain those we have to redefine the fields. Focusing on the first term, we define the charged physical $W^\pm_\mu$ gauge fields as
\[ W^\pm_\mu \equiv \frac{W^1_\mu \mp iW^2_\mu}{\sqrt{2}} \tag{7.21} \]
and it easily follows that
\[ (W^1_\mu)^2 + (W^2_\mu)^2 = |W^+_\mu|^2 + |W^-_\mu|^2. \]
The physical $W^\pm$ gauge bosons therefore get a mass of
\[ M_{W^\pm} = \frac{gv}{2}. \]
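To obtain the masses for the physical $Z^0_\mu$ and $A_\mu$ we notice that we can rewrite the second term as
\begin{align}
\frac{1}{8}v^2\left[gW^3_\mu - g'B_\mu\right]\left[gW^{3\mu} - g'B^\mu\right] &= \frac{v^2}{8}\begin{pmatrix} W^3_\mu & B_\mu \end{pmatrix}\begin{pmatrix} g^2 & -gg' \\ -gg' & g'^2 \end{pmatrix}\begin{pmatrix} W^{3\mu} \\ B^\mu \end{pmatrix} \tag{7.22} \\
&= \frac{v^2}{8}\begin{pmatrix} W^3_\mu & B_\mu \end{pmatrix}M\begin{pmatrix} W^{3\mu} \\ B^\mu \end{pmatrix} \tag{7.23}
\end{align}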
where in the second equality we have defined M as the mass matrix. Its diagonal elements are the
mass terms for the W 3 and B eigenstates. However, M is a non-diagonal matrix and its off-diagonal
elements couple together the W 3 and B fields causing them to mix. To find the masses of the actual
physical gauge bosons, we have to go to a basis in which M is diagonal. In this basis then, the
masses of the physical gauge bosons will be the eigenvalues of M. They are easily derived from the
characteristic equation.
\[ (g^2 - \lambda)(g'^2 - \lambda) - (gg')^2 = 0 \quad\to\quad \lambda = 0, \quad \lambda = g^2 + g'^2 \]
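Therefore, in this basis, the mass term in (7.23) can be rewritten as
\[ \frac{1}{8}v^2\begin{pmatrix} A_\mu & Z_\mu \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & g^2 + g'^2 \end{pmatrix}\begin{pmatrix} A^\mu \\ Z^\mu \end{pmatrix} \tag{7.24} \]
where we have defined $A_\mu$ and $Z_\mu$ as the physical fields corresponding to the normalized eigenvectors of M. The masses of the physical gauge bosons can now be identified as
\[ M_A = 0 \quad\text{and}\quad M_Z = \frac{1}{2}v\sqrt{g^2 + g'^2}. \]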
The physical $A_\mu$ and $Z_\mu$ fields correspond to the normalized eigenvectors of M, and these are found to be:
\[ \lambda = 0 \;\to\; \frac{1}{\sqrt{g^2 + g'^2}}\begin{pmatrix} g' \\ g \end{pmatrix} = \frac{g'W^3_\mu + gB_\mu}{\sqrt{g^2 + g'^2}} = A_\mu \tag{7.25} \]
and
\[ \lambda = g^2 + g'^2 \;\to\; \frac{1}{\sqrt{g^2 + g'^2}}\begin{pmatrix} g \\ -g' \end{pmatrix} = \frac{gW^3_\mu - g'B_\mu}{\sqrt{g^2 + g'^2}} = Z_\mu \tag{7.26} \]
The physical fields are thus mixtures of the massless bosons that correspond to the $SU(2)_L$ and $U(1)_Y$ generators. Through the Higgs mechanism, the combination corresponding to the $Z_\mu$ boson has acquired a mass, whereas the photon $A_\mu$ has remained massless.
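The diagonalization of the mass matrix M can be reproduced numerically. The sketch below is illustrative only: the coupling values are assumptions picked for the example, and the helper names are hypothetical. It computes the eigenvalues of the 2 × 2 matrix from its trace and determinant and checks that the vectors of (7.25) and (7.26) are eigenvectors.

```python
import math


def mass_matrix(g, g_prime):
    """Matrix M of (7.22) in the (W^3, B) basis."""
    return [[g * g, -g * g_prime], [-g * g_prime, g_prime * g_prime]]


def eigenvalues_2x2_symmetric(m):
    """Eigenvalues of a real symmetric 2x2 matrix from trace and determinant."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(trace * trace - 4 * det, 0.0))
    return (trace - disc) / 2, (trace + disc) / 2


def matvec(m, vec):
    """Product of a 2x2 matrix with a 2-component vector."""
    return [m[0][0] * vec[0] + m[0][1] * vec[1],
            m[1][0] * vec[0] + m[1][1] * vec[1]]


g, g_prime = 0.65, 0.36  # illustrative coupling values (assumption)
m = mass_matrix(g, g_prime)
low, high = eigenvalues_2x2_symmetric(m)

norm = math.hypot(g, g_prime)
photon = [g_prime / norm, g / norm]    # eigenvector of (7.25), eigenvalue 0
z_boson = [g / norm, -g_prime / norm]  # eigenvector of (7.26), eigenvalue g^2 + g'^2

print("eigenvalues:", low, high)
print("M * photon :", matvec(m, photon))
print("M * z_boson:", matvec(m, z_boson))
```

The zero eigenvalue belongs to the photon combination and the eigenvalue $g^2 + g'^2$ to the Z combination, so multiplying by $v^2/4$ gives $M_A^2 = 0$ and $M_Z^2 = \frac{1}{4}v^2(g^2 + g'^2)$.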
Experimental verification of Higgs mechanism
We can rewrite the ratio of the coupling constants g and g′ in terms of the so-called Weinberg angle $\theta_W$, which parametrizes the mixing of the $W^3_\mu$ and $B_\mu$ fields:
\[ \frac{g'}{g} = \tan(\theta_W). \tag{7.27} \]
The parameter $\theta_W$ is not predicted by the Standard Model. Its value must be determined from experiment. It is found to be [12]:
\[ \theta_W = 28.75^\circ \tag{7.28} \]
Then we can rewrite (7.25), (7.26) as
\begin{align}
A_\mu &= \cos(\theta_W)B_\mu + \sin(\theta_W)W^3_\mu \tag{7.29} \\
Z_\mu &= -\sin(\theta_W)B_\mu + \cos(\theta_W)W^3_\mu \tag{7.30}
\end{align}
Similarly, we can use (7.27) to rewrite $M_Z$ and $M_W$ in terms of $\theta_W$, from which we obtain:
\[ \frac{M_W}{M_Z} = \cos(\theta_W) \tag{7.31} \]
This prediction for the mass relation of the physical gauge bosons has been experimentally verified and provides the most compelling argument for the Higgs mechanism to be correct. Further, the Higgs acquires a mass
\[ m_h^2 = -2\mu^2 = 2\lambda v^2. \]
If we then use the relation $M_W = gv/2$ and the measured values for $M_W$ and g, the Higgs VEV is found to be
\[ v = 246 \text{ GeV}. \]
The parameters µ and λ, though, are free parameters. The Standard Model provides no way to determine them, which is why it took so long to find the Higgs boson. Its discovery was eventually announced on 4 July 2012. Measurements at the LHC determined its mass to be around 126 GeV. With this the parameter λ could also be determined.
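The numbers quoted above can be combined in a short calculation. This is a rough sketch, not a precision determination: v, the Higgs mass and the Weinberg angle are taken from the text, while the W and Z masses are approximate measured values that I supply as assumptions (they are not given in this section). The function names are hypothetical.

```python
import math

V_EV_GEV = 246.0      # Higgs VEV from the text
M_HIGGS_GEV = 126.0   # Higgs mass from the text
THETA_W_DEG = 28.75   # Weinberg angle from (7.28)
M_W_GEV = 80.4        # approximate measured W mass (assumption, not from the text)
M_Z_GEV = 91.19       # approximate measured Z mass (assumption, not from the text)


def quartic_coupling(m_h, v):
    """lambda from m_h^2 = 2 lambda v^2."""
    return m_h ** 2 / (2 * v ** 2)


def mu_squared(m_h):
    """mu^2 from m_h^2 = -2 mu^2 (in GeV^2)."""
    return -(m_h ** 2) / 2


def weak_coupling(m_w, v):
    """g from M_W = g v / 2."""
    return 2 * m_w / v


lam = quartic_coupling(M_HIGGS_GEV, V_EV_GEV)
mu2 = mu_squared(M_HIGGS_GEV)
g_weak = weak_coupling(M_W_GEV, V_EV_GEV)

predicted_ratio = math.cos(math.radians(THETA_W_DEG))  # right-hand side of (7.31)
measured_ratio = M_W_GEV / M_Z_GEV                      # left-hand side of (7.31)

print(f"lambda ~ {lam:.3f}, mu^2 = {mu2:.0f} GeV^2, g ~ {g_weak:.3f}")
print(f"cos(theta_W) = {predicted_ratio:.4f}, M_W/M_Z = {measured_ratio:.4f}")
```

With these inputs the two sides of (7.31) agree to better than one percent; the remaining difference depends on how the mixing angle is defined and on corrections beyond this tree-level treatment.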
Coupling to the Gauge bosons
When determining the masses of the gauge bosons we only considered the terms that were quadratic in the gauge fields, and we ignored the terms in (7.18) and (7.19) that involve products of the gauge fields and h. Here we have a closer look at those interaction terms. Using (7.21) we can rewrite the second term of (7.18) as:
\[ \frac{1}{4}g^2W^-_\mu W^{+\mu}(v + h)^2 = \frac{1}{4}g^2v^2W^-_\mu W^{+\mu} + \frac{1}{2}g^2vW^-_\mu W^{+\mu}h + \frac{1}{4}g^2W^-_\mu W^{+\mu}hh \tag{7.32} \]
The first term, as before, gives the masses of the $W^\pm$ bosons. The terms $W^-_\mu W^{+\mu}h$ and $W^-_\mu W^{+\mu}hh$, however, give rise to triple and quartic couplings of the Higgs boson to the gauge bosons. Their coupling strengths can be read off to be
\[ g_{hWW} = \frac{1}{2}g^2v = g\,m_W \quad\text{and}\quad g_{hhWW} = \frac{1}{4}g^2 = \frac{1}{2}\frac{g\,m_W}{v}. \]
The coupling of the Higgs to the W boson is thus proportional to the mass of the W boson, and the coupling to the Z boson is similarly found to be proportional to its mass.
7.3.2 Assigning mass to fermions
Apart from generating a mass for the gauge bosons, the Higgs mechanism is also responsible for generating a mass term for the fermions. A fermion mass term would be of the form $m_f\bar{\psi}\psi$, and this is not allowed to appear in the Lagrangian because it does not respect the $SU(2)_L \times U(1)_Y$ gauge symmetry. This can be seen when we decompose ψ into its left- and right-handed chiral states,²⁸ obtaining
\[ m_f\bar{\psi}\psi = m_f(\bar{\psi}_R + \bar{\psi}_L)(\psi_L + \psi_R) = m_f(\bar{\psi}_R\psi_L + \bar{\psi}_L\psi_R). \]
However, in the Standard Model left-handed fermions are placed in SU(2) doublets (I = 1/2), while right-handed fermions are placed in SU(2) singlets (I = 0).²⁹ They therefore transform differently under $SU(2)_L \times U(1)_Y$, and a mass term is thus not gauge invariant:
\[ \psi_L \to \psi'_L = \exp\left(-\frac{i\,\xi\cdot\tau}{v}\right)\psi_L, \qquad \psi_R \to \psi_R. \]
However, the two complex scalar fields in (7.10) are also placed in an SU(2) doublet, which transforms as in (7.12). Therefore, the combination $\bar{\psi}_L\phi$ is invariant under $SU(2)_L$ gauge transformations, since the exponentials cancel. If we combine this with a right-handed singlet $\psi_R$, then the combination $\bar{\psi}_L\phi\psi_R$ and its hermitian conjugate $\bar{\psi}_R\phi^\dagger\psi_L$ will be invariant under $SU(2)_L$ and $U(1)_Y$ transformations. We can conclude therefore that a term of the form
\[ -\lambda_{\text{Yuk}}(\bar{\psi}_L\phi\psi_R + \bar{\psi}_R\phi^\dagger\psi_L) \tag{7.33} \]
will be invariant under the full gauge symmetry. Here $\lambda_{\text{Yuk}}$ is the Yukawa coupling between the Higgs field and the massless lepton and quark fields.
Lepton masses
To determine the masses of the leptons we write (7.33) in terms of the left- and right-handed lepton states. For the first family they are given by
\[ L = \begin{pmatrix} \nu_e \\ e \end{pmatrix}_L, \qquad R = e_R. \tag{7.34} \]
28For left handed fermions the spin is antiparallel to its momenta and for righthanded fermions the spin is
parallel to the momenta.29See Appendix (C.2.2).
56
where I will write ψL and ψR as L and R for simplicity. Since the derivation is the same for all
three families I will only discuss the first family. If now the Higgs potential is added this results in
spontaneous symmetry breaking and we can write the Higgs doublet in U-gauge:
φ =1√2
(0
v + h
),
whereby (7.33) thus becomes
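\begin{align}
\mathcal{L}_{lepton} &= -\frac{\lambda_e}{\sqrt{2}} \left[ \begin{pmatrix} \bar\nu_e & \bar e \end{pmatrix}_L \begin{pmatrix} 0 \\ v + h \end{pmatrix} e_R + \bar e_R \begin{pmatrix} 0 & v + h \end{pmatrix} \begin{pmatrix} \nu_e \\ e \end{pmatrix}_L \right] \tag{7.35} \\
&= -\frac{\lambda_e (v + h)}{\sqrt{2}} \left[ \bar e_L e_R + \bar e_R e_L \right] \tag{7.36} \\
&= -\frac{\lambda_e (v + h)}{\sqrt{2}}\, \bar e e. \tag{7.37}
\end{align}
From this we see that the neutrino has remained massless, while the electron has acquired a mass of
\[ m_e = \frac{\lambda_e v}{\sqrt{2}}. \]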
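To get a feeling for the size of the Yukawa couplings implied by this relation, the following sketch inverts $m_f = \lambda_f v/\sqrt{2}$. The VEV of 246 GeV and the masses used (electron about 0.511 MeV, top quark about 173 GeV) are approximate values assumed for illustration; the function names are hypothetical.

```python
import math

V_EV = 246.0  # Higgs VEV in GeV (approximate, assumed)

def yukawa_from_mass(mass_gev, v=V_EV):
    """Invert m_f = lambda_f * v / sqrt(2) to obtain the Yukawa coupling."""
    return math.sqrt(2.0) * mass_gev / v

def mass_from_yukawa(lam, v=V_EV):
    """Fermion mass generated by a Yukawa coupling lam after symmetry breaking."""
    return lam * v / math.sqrt(2.0)

lambda_e = yukawa_from_mass(0.000511)  # electron, assumed mass in GeV
lambda_t = yukawa_from_mass(173.0)     # top quark, assumed mass in GeV
```

The electron coupling comes out of order $10^{-6}$ while the top coupling is of order one, which is why the top quark dominates the loop corrections discussed in section 8.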
Remark We see from (7.37) that the term $\bar L\phi R + \bar R\phi^\dagger L$ is only able to generate a mass for the fermion in the lower component of the $SU(2)_L$ doublet, since the non-zero VEV occurs in the lower component of $\phi$. Since right-handed neutrinos have never been observed, this is not a problem in the lepton case. However, when determining the quark masses we have to consider right-handed up quarks as well as right-handed down quarks, so we expect some difficulties to occur there.
Quark masses
The left-handed quark doublet and right-handed quark singlets are given by
\[ L = \begin{pmatrix} u \\ d \end{pmatrix}_L, \qquad R = u_R,\ d_R. \tag{7.38} \]
The derivation of the mass for the down-type quark proceeds as in the derivation above for the electron mass, and gives a down-quark mass of
\[ m_d = \frac{\lambda_d v}{\sqrt{2}}. \]
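Now we turn to the up quark, the upper component of the $SU(2)_L$ doublet. In view of the remark above we need to reverse the order of the components of the Higgs doublet. This we can accomplish by using the charge conjugated doublet $\phi^c$, see appendix C.1:
\[ \phi^c = i\tau_2\phi^* = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} (\phi^+)^* \\ (\phi^0)^* \end{pmatrix} = \begin{pmatrix} (\phi^0)^* \\ -(\phi^+)^* \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} \phi_3 - i\phi_4 \\ -\phi_1 + i\phi_2 \end{pmatrix}. \tag{7.39} \]
This conjugate of the Higgs doublet transforms in exactly the same way as $\phi$ under $SU(2)_L$ transformations, as can easily be checked in U-gauge, so the combination $\bar\psi_L\phi^c$ is also invariant under $SU(2)_L$ transformations. Therefore, we can construct a gauge invariant mass term for the up quark from
\[ -\lambda_u\left(\bar\psi_L\phi^c\psi_R + \bar\psi_R(\phi^c)^\dagger\psi_L\right). \tag{7.40} \]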
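The claim that $\phi^c = i\tau_2\phi^*$ transforms like $\phi$ can be checked numerically for a generic SU(2) rotation. The sketch below is a minimal check in plain Python; the test doublet, the rotation angle and axis, and all function names are arbitrary choices made for illustration.

```python
import math

def su2(theta, nx, ny, nz):
    """SU(2) matrix exp(-i*theta/2 * n.tau) for a unit vector n = (nx, ny, nz)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[complex(c, -s * nz), complex(-s * ny, -s * nx)],
            [complex(s * ny, -s * nx), complex(c, s * nz)]]

def matvec(m, v):
    """Apply a 2x2 matrix to a doublet."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def conjugate_doublet(phi):
    """phi^c = i tau_2 phi^*, with i tau_2 = [[0, 1], [-1, 0]]."""
    return [phi[1].conjugate(), -phi[0].conjugate()]

phi = [0.3 + 0.4j, -0.2 + 0.7j]   # arbitrary test doublet
U = su2(1.1, 0.6, 0.0, 0.8)       # arbitrary rotation about a unit axis

lhs = conjugate_doublet(matvec(U, phi))   # (U phi)^c
rhs = matvec(U, conjugate_doublet(phi))   # U phi^c
max_diff = max(abs(a - b) for a, b in zip(lhs, rhs))
```

Both orders of operation agree up to rounding, which is the statement that $\phi^c$ is again an SU(2) doublet.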
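After spontaneous symmetry breaking we can write $\phi^c$ in U-gauge:
\[ \phi^c = i\tau_2 \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ h + v \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} h + v \\ 0 \end{pmatrix}, \tag{7.41} \]
and substituting this in (7.40) gives us:
\begin{align}
\mathcal{L}_u &= -\frac{\lambda_u}{\sqrt{2}} \left[ \begin{pmatrix} \bar u & \bar d \end{pmatrix}_L \begin{pmatrix} v + h \\ 0 \end{pmatrix} u_R + \bar u_R \begin{pmatrix} h + v & 0 \end{pmatrix} \begin{pmatrix} u \\ d \end{pmatrix}_L \right] \tag{7.42} \\
&= -\frac{\lambda_u (v + h)}{\sqrt{2}} \left[ \bar u_L u_R + \bar u_R u_L \right] \tag{7.43} \\
&= -\frac{\lambda_u (v + h)}{\sqrt{2}}\, \bar u u, \tag{7.44}
\end{align}
and we read off an up-quark mass of
\[ m_u = \frac{\lambda_u v}{\sqrt{2}}. \]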
8 The Hierarchy problem
Hierarchy is an important concept in physics and is related to energy scales³⁰. All physical theories have their own energy domain in which they are valid. The theory of the electroweak interaction has an energy scale of around 246 GeV, the VEV of the Higgs field. Physicists know that the SM must break down at energies around the Planck scale, $10^{19}$ GeV, and they expect a Grand Unified Theory to take over at around $10^{16}$ GeV. Indications for this come from precise measurements of the coupling constants. At high energies they seem to converge to a single point, hinting at new physics around this scale that unifies the electroweak force with the strong force. This GUT scale, however, is far greater than the electroweak scale of about 246 GeV, and physicists are not sure what lies in the range between the electroweak scale and the GUT scale. Although this does not pose a direct problem for physical theories, physicists find it highly disturbing and have named this problem of the vast difference between the electroweak scale and the GUT scale the hierarchy problem.
8.1 Naturalness
Before discussing how the hierarchy problem arises in the SM I will first discuss the notion of naturalness, to explain why only the Higgs is sensitive to the energy hierarchy. This has to do with the Higgs being a fundamental scalar particle whose mass is not protected by any symmetry of the Standard Model. To explain what this means we need the definition of technical naturalness formulated by Gerard ’t Hooft (1980) [17]:
• A parameter is naturally small if setting it to zero increases the symmetry of the theory.
We already saw that a gauge boson mass term in the Lagrangian is forbidden because it explicitly breaks the gauge invariance. We concluded that the gauge bosons must therefore be massless. In other words, setting their mass to zero increases the symmetry. The gauge bosons can only acquire a mass via the Higgs mechanism, which introduces a mass term in a gauge invariant way. Their masses are said to be protected by the gauge symmetry. Similarly, as we saw in 7.3.2, a fermion mass term, which would be of the form $m_f\bar\psi\psi$, is not invariant under the local gauge symmetry $SU(2)_L \times U(1)_Y$. However, when $m_f$ is set to zero, the left- and right-handed parts of $\psi$ can transform independently and the Lagrangian is said to have an additional chiral symmetry. The fermion mass is therefore also 'naturally small' in the sense of ’t Hooft and is said to be protected by chiral symmetry. In terms of loop corrections³¹ this manifests itself in that any correction to the gauge boson and fermion masses will be proportional to the mass itself, due to the terms that are allowed in the Lagrangian. The correction from loop diagrams will thus be multiplicative, and small in the limit of small masses. For a scalar field $\phi$ like the Higgs field, though, there is no symmetry that forbids a mass term $\mu_0^2\phi^*\phi$, and a correction to the mass resulting from loop diagrams will be additive.
8.2 Hierarchy problem in the Higgs sector
I will now discuss how the hierarchy problem arises in the Higgs sector. Suppose that the SM remains valid up to the Planck scale and has a cut-off $\Lambda_{Pl} \sim 10^{19}$ GeV. We consider a scalar theory with Yukawa and gauge interactions and Lagrangian
\[ \mathcal{L} = \frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2}\mu_0^2\phi^2 - \lambda\phi^4 + \text{interactions}. \]
The parameter $\mu_0$ is the bare mass of the scalar $\phi$ resulting from tree diagrams. However, due to its couplings to the fermions and gauge bosons and its self-interaction, the bare mass squared $\mu_0^2$ receives quantum corrections at one-loop order. The physical mass $m_H$ is given by the bare mass plus these quantum corrections from loop diagrams (denoted by $\delta m^2$), i.e.
\[ m_H^2 = m_{0,tree}^2 + \delta m^2. \tag{8.1} \]
³⁰ The arguments can be found in many textbooks, such as [13].
³¹ See Appendix D.
The main loop contributions to the Higgs mass come from the coupling of the Higgs to the top quark³², the coupling to the $W^\pm_\mu$ and $Z^0_\mu$ gauge bosons, and the Higgs self-energy due to the quartic self-coupling. The corresponding one-loop Feynman diagrams are displayed in the figure below.
Figure 2: The three most significant quadratically divergent contributions to the Higgs mass. From left to right: the top-quark loop, the gauge boson loop and the Higgs self-energy.
These three diagrams all depend quadratically on the cut-off $\Lambda$, as can be verified by power counting of the momenta³³, and their contributions are found to be [18]³⁴:
• top quark loop: $-\frac{3}{8\pi^2}\lambda_t^2\Lambda^2$
• SU(2) gauge bosons: $\frac{9}{64\pi^2}g^2\Lambda^2$
• Higgs loop: $\frac{1}{16\pi^2}\lambda^2\Lambda^2$
Thus (8.1) has the form
\[ m_H^2 = m_{0,tree}^2 + \Lambda^2\left(a\lambda_t^2 + b g^2 + c\lambda^2\right) \quad\text{or}\quad \frac{m_H^2}{\Lambda^2} = \frac{m_{0,tree}^2}{\Lambda^2} + \left(a\lambda_t^2 + b g^2 + c\lambda^2\right). \]
Implementing the assumption that the SM remains valid up to the Planck scale, we find that a tremendous amount of fine-tuning between the bare mass and the coupling terms is needed to explain the light mass of the Higgs boson. This is where the hierarchy problem arises in the SM. Mathematically it is not a problem, but it is again highly unnatural and the SM does not give any hint why this cancellation should take place. Solutions to this hierarchy problem rely on the assumption that new physics appears at a much lower scale, of the order of a TeV. In [18] the contributions of the three quadratically divergent diagrams have been calculated for a cut-off of $\Lambda \sim 10$ TeV. Their contributions are respectively $-(2\ \text{TeV})^2$, $(0.7\ \text{TeV})^2$ and $(0.5\ \text{TeV})^2$. With this cut-off of 10 TeV, a fine-tuning of about one part in a hundred is needed, and again the hierarchy manifests itself. When the cut-off is taken to be 1 TeV the need for fine-tuning no longer arises.
³² The Higgs couples to all quarks, but because the coupling strength of the Higgs is proportional to the mass of the fermion it couples most strongly to the top quark.
³³ See Appendix D.1.
³⁴ A calculation based on cut-off regularization can be found in Appendix D.3.
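The quoted sizes of the three contributions can be reproduced, at the level of an estimate, with a few lines of arithmetic. The couplings used below ($\lambda_t \approx 1$, $g \approx 0.65$, $\lambda \approx 0.6$) and the Higgs mass of 0.125 TeV are assumptions made for this illustration and are not taken from [18]; the function names are hypothetical.

```python
import math

def quadratic_corrections(cutoff_tev, lam_t=1.0, g=0.65, lam=0.6):
    """One-loop quadratically divergent shifts of m_H^2, in TeV^2."""
    l2 = cutoff_tev**2
    top = -3.0 / (8.0 * math.pi**2) * lam_t**2 * l2   # top-quark loop
    gauge = 9.0 / (64.0 * math.pi**2) * g**2 * l2     # SU(2) gauge boson loop
    higgs = 1.0 / (16.0 * math.pi**2) * lam**2 * l2   # Higgs self-energy
    return top, gauge, higgs

def fine_tuning(m_h_tev, cutoff_tev):
    """Ratio m_H^2 / |largest correction|; small values mean strong tuning."""
    largest = max(abs(c) for c in quadratic_corrections(cutoff_tev))
    return m_h_tev**2 / largest

top, gauge, higgs = quadratic_corrections(10.0)
tuning_10tev = fine_tuning(0.125, 10.0)
tuning_1tev = fine_tuning(0.125, 1.0)
```

For a 10 TeV cut-off the top loop is of order $(2\ \text{TeV})^2$ and the tuning measure is at the percent level or below, while for a 1 TeV cut-off it is of order ten percent or more with these inputs.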
We can also turn the argument around by demanding that we find a fine tuning of no more than
10% acceptable. Then a cut-off Λ ≈ 2 TeV is found. At this scale we would then expect new physics
to appear and to find new particles that would naturally cancel the divergent loop contributions.
One prominent solution to the hierarchy problem is the idea of supersymmetry (SUSY), which states that every SM particle has a supersymmetric partner. In SUSY the loop contributions of the SM particles are cancelled by the loop contributions with a supersymmetric partner in the loop, and so the need for fine-tuning does not arise. All models that are to resolve the hierarchy problem must introduce new physics at a scale far enough below the Planck scale for the amount of fine-tuning to be reduced sufficiently. If we indeed believe that an actual solution to this 'big' hierarchy problem exists, then physicists should have found observational evidence for new physics as they approach the cut-off from below. The problem, however, is that measurements give no evidence of new physics whatsoever. This lack of evidence pushes the lower limit for the cut-off above the TeV scale, which introduces a new, less severe hierarchy problem. This is called the Little Hierarchy problem. Any model that is to successfully solve the hierarchy problem must also avoid reintroducing a Little Hierarchy problem. Here we focus on a set of models that addresses this Little Hierarchy problem by introducing particles at the TeV scale that cancel the SM quadratic divergences. These models are called Little Higgs models and they realize the Higgs as a pseudo-NGB of a larger approximate global symmetry. This way the Higgs becomes 'naturally light'.
9 Little Higgs models
Little Higgs models postulate the Higgs boson as the pseudo-NGB of some larger global symmetry which is broken both spontaneously and explicitly. Here we will focus on the "Simplest Little Higgs", which involves the breaking of SU(3) to SU(2). It serves as a model to understand conceptually the mechanism behind Little Higgs models, and to introduce the mechanism of collective symmetry breaking that protects the Higgs mass from quadratically divergent corrections. In the calculations I will focus mostly on the gauge sector of the model and explicitly show that the quadratic contribution to the Higgs mass from the SM W bosons is successfully cancelled. The fermion sector will be discussed in less detail. First, though, we have to know how the NGB transform under the broken and unbroken generators of [SU(N)/SU(N − 1)], which is the purpose of the next section.
9.1 Transformation of NGB
As we already saw, we can parametrize the Goldstone boson $\xi(x)$ by writing
\[ \phi(x) = \frac{1}{\sqrt{2}}(f + \eta)\, e^{i\xi(x)/f}, \]
with f the VEV of $\phi$ and $\eta$ representing the massive radial oscillations. We now generalize this to the breaking pattern SU(N) → SU(N − 1) and analyze how the NGB transform. Their number is equal to the number of broken generators. Since SU(N) has $N^2 - 1$ generators, we get a total of $(N^2 - 1) - ((N-1)^2 - 1) = 2N - 1$ NGB. We use the following parametrization for the NGB [18]:
\[ \phi = e^{i\Pi/f}\phi_0 \quad\text{with}\quad \Pi = \begin{pmatrix} & & & \pi_1 \\ & 0_{(N-1)\times(N-1)} & & \vdots \\ & & & \pi_{N-1} \\ \pi_1^* & \cdots & \pi_{N-1}^* & \pi_0/\sqrt{2} \end{pmatrix} \quad\text{and VEV}\quad \phi_0 = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ f \end{pmatrix}. \tag{9.1} \]
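$\Pi$ is the Goldstone boson matrix, the fields $\pi_1, \dots, \pi_{N-1}$ are complex and the field $\pi_0$ is real; f represents the high symmetry breaking scale. Written in this form we can investigate how the NGB transform under the unbroken SU(N − 1) symmetries and the broken [SU(N)/SU(N − 1)] symmetries. As a first observation we note that we can make the unbroken SU(N − 1) transformations explicit, since they are embedded in SU(N) as
\[ \hat U_{N-1} = \begin{pmatrix} U_{N-1} & 0 \\ 0 & 1 \end{pmatrix}, \tag{9.2} \]
where the hat denotes the embedded N × N matrix. Looking first at how $\phi$ transforms under the unbroken SU(N − 1) transformations we find
\[ \phi \to \hat U_{N-1}\phi = \left(\hat U_{N-1}\, e^{i\Pi/f}\, \hat U_{N-1}^\dagger\right)\hat U_{N-1}\phi_0 = e^{i\hat U_{N-1}\Pi\hat U_{N-1}^\dagger/f}\,\phi_0, \tag{9.3} \]
where the invariance of the vacuum under SU(N − 1) was used in the second equality. From this we see that the NGB transform linearly as $\Pi \to \hat U_{N-1}\Pi\hat U_{N-1}^\dagger$. Using (9.2) we can further deduce that $\pi_0$ transforms as a singlet and that $\vec\pi = (\pi_1, \dots, \pi_{N-1})^T$ transforms like
\[ \begin{pmatrix} 0 & \vec\pi \\ \vec\pi^\dagger & \pi_0 \end{pmatrix} \to \hat U_{N-1} \begin{pmatrix} 0 & \vec\pi \\ \vec\pi^\dagger & \pi_0 \end{pmatrix} \hat U_{N-1}^\dagger = \begin{pmatrix} 0 & U_{N-1}\vec\pi \\ \vec\pi^\dagger U_{N-1}^\dagger & \pi_0 \end{pmatrix}. \]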
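$\vec\pi$ thus transforms in the fundamental representation of SU(N − 1), meaning that $\vec\pi \to U_{N-1}\vec\pi$. Now let us see how $\phi$ transforms under the broken SU(N) generators. By the BCH formula³⁵, any general SU(N) transformation can be decomposed into an SU(N)/SU(N − 1) transformation times an SU(N − 1) transformation, the latter leaving $\phi_0$ invariant. Therefore:
\begin{align*}
\phi \to U_{N/N-1}\, e^{i\Pi/f}\phi_0 &= \exp\left[\frac{i}{f}\begin{pmatrix} 0 & \vec\alpha \\ \vec\alpha^\dagger & 0 \end{pmatrix}\right] \exp\left[\frac{i}{f}\begin{pmatrix} 0 & \vec\pi \\ \vec\pi^\dagger & 0 \end{pmatrix}\right]\phi_0 \\
&\equiv \exp\left[\frac{i}{f}\begin{pmatrix} 0 & \vec\pi' \\ \vec\pi'^\dagger & 0 \end{pmatrix}\right] U_{N-1}(\alpha, \pi)\,\phi_0 \quad\text{(by BCH)} \\
&= \exp\left[\frac{i}{f}\begin{pmatrix} 0 & \vec\pi' \\ \vec\pi'^\dagger & 0 \end{pmatrix}\right]\phi_0,
\end{align*}
and thus, again by BCH, we see that to first order $\vec\pi \to \vec\pi' = \vec\pi + \vec\alpha$, meaning that the NGB indeed shift under the broken symmetries. Just as in the abelian case, this ensures that the NGB can only have derivative interactions.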
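The counting of Goldstone bosons used in this section is easy to tabulate. The sketch below simply counts broken generators for SU(N) → SU(N − 1) and compares with the field content of $\Pi$ (N − 1 complex fields plus one real field); the function names are hypothetical.

```python
def su_generators(n):
    """Number of generators of SU(n), i.e. n^2 - 1."""
    return n * n - 1

def goldstone_count(n):
    """Number of broken generators for the breaking SU(n) -> SU(n-1)."""
    return su_generators(n) - su_generators(n - 1)

def fields_in_pi(n):
    """Real degrees of freedom in Pi: (n-1) complex fields plus one real field."""
    return 2 * (n - 1) + 1

counts = {n: goldstone_count(n) for n in range(2, 7)}
```

For N = 3 this gives 5 NGB, which is the case used for the SU(3) → SU(2) model in the remainder of this chapter.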
9.2 Constructing "The Simplest Little Higgs"
In the "Simplest Little Higgs" the Higgs boson is realized as a NGB of a larger $SU(3)_W$ symmetry which is spontaneously broken to $SU(2)_W$ by letting $\phi$ acquire a VEV $\sim f$. However, an exact NGB realized this way will not suffice as an appropriate Higgs candidate, since the SM Higgs mass is non-zero. Therefore, we must also break the symmetry explicitly, in order to realize the Higgs as a pseudo-NGB. Enlarging the symmetry to SU(3) will require introducing new heavy particles with masses O(f) that will cancel the quadratic divergences of the SM. First, however, we need to construct a Lagrangian to work with that is invariant under the full SU(3) symmetry and includes only the NGB. But, as concluded in the previous section, this means that we can only have derivative couplings. Adding couplings to the gauge bosons and fermions, and also the quartic Higgs coupling, will take extra care, and we focus on these in the next few sections. We must also determine at what energy scale the model will be valid. Since we will introduce new particles, these will need to be heavy, and the symmetry breaking scale f must be a high energy scale³⁶.
The Lagrangian and energy scale of the model.
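Under an SU(N) symmetry, only the combinations $\phi^\dagger\phi$ and $\epsilon^{a_1 a_2\cdots a_N}\phi_{a_1}\phi_{a_2}\cdots\phi_{a_N} = 0$ are invariant, and we are left with the following Lagrangian [18]
\[ \mathcal{L} = \phi^\dagger\phi + f^2|\partial_\mu\phi|^2 + O(\partial^4), \tag{9.4} \]
where $\phi^\dagger\phi = f^2$ is an irrelevant constant. To construct the NGB matrix $\Pi$ we observe that the VEV of the $\phi$ field induces the spontaneous breaking of SU(3) to SU(2), resulting in 5 exact NGB corresponding to the 5 broken SU(3) generators. Noting that $\lambda_1, \lambda_2, \lambda_3$ can be identified with the Pauli matrices, and thus remain unbroken, we can parametrize the NGB by
\[ \Pi = \pi^i\,\frac{1}{2}\lambda_i = \begin{pmatrix} \frac{\pi_8}{2\sqrt{3}} & 0 & \frac{1}{2}(\pi_4 - i\pi_5) \\ 0 & \frac{\pi_8}{2\sqrt{3}} & \frac{1}{2}(\pi_6 - i\pi_7) \\ \frac{1}{2}(\pi_4 + i\pi_5) & \frac{1}{2}(\pi_6 + i\pi_7) & -\frac{\pi_8}{\sqrt{3}} \end{pmatrix} \equiv \begin{pmatrix} \frac{\eta}{\sqrt{3}}\,\mathbb{1}_2 & h \\ h^\dagger & -\frac{2\eta}{\sqrt{3}} \end{pmatrix}, \]
where i runs over the broken generators (i = 4, ..., 8), the last form is written in 2 + 1 block notation, and we have defined h and $\eta$ as the following combinations of NGB:
\[ h = \begin{pmatrix} \frac{1}{2}(\pi_4 - i\pi_5) \\ \frac{1}{2}(\pi_6 - i\pi_7) \end{pmatrix}, \qquad \eta = \frac{\pi_8}{2}. \tag{9.5} \]
³⁵ See appendix B.
³⁶ Otherwise the LHC should have found these by now.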
The h field is a complex SU(2) doublet and represents the Higgs doublet. It transforms linearly under the unbroken SU(2) symmetry and shifts under the broken SU(3) generators. Note that h sits in the off-diagonal block, which is not generated by the unbroken SU(2) generators, so that when SU(3) is broken to SU(2), h will always remain a Goldstone boson. The scalar field $\eta$ is an SU(2) singlet. The field $\phi$ we can parametrize as
\[ \phi = e^{i\Pi/f}\phi_0 \quad\text{with VEV}\quad \phi_0 = \begin{pmatrix} 0 \\ 0 \\ f \end{pmatrix}. \]
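This $\phi$ we can expand in terms of h to see what interactions we get for h. Then, ignoring the $\eta$ singlet and using block notation, we get:
\[ \phi = \exp\left[\frac{i}{f}\begin{pmatrix} 0 & h \\ h^\dagger & 0 \end{pmatrix}\right]\begin{pmatrix} 0 \\ f \end{pmatrix} = \begin{pmatrix} 0 \\ f \end{pmatrix} + i\begin{pmatrix} h \\ 0 \end{pmatrix} - \frac{1}{2f}\begin{pmatrix} 0 \\ h^\dagger h \end{pmatrix} + \text{higher-order terms}. \tag{9.6} \]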
If we now insert this back into the kinetic term of (9.4), then we are left, schematically, with:
\[ |\partial_\mu\phi|^2 = \left| i\begin{pmatrix} \partial_\mu h \\ 0 \end{pmatrix} - \frac{1}{2f}\begin{pmatrix} 0 \\ \partial_\mu(h^\dagger h) \end{pmatrix} \right|^2 = |\partial_\mu h|^2 + \frac{1}{4f^2}|2h^\dagger\partial_\mu h|^2 = |\partial_\mu h|^2\left(1 + \frac{1}{f^2}h^\dagger h\right). \]
The first term represents the kinetic term of the bare h propagator. The second term, however, represents a loop correction to this kinetic term, obtained by contracting h into a loop, and this correction is quadratically divergent. The Lagrangian therefore contains non-renormalizable interactions, so it can only be regarded as an effective field theory with a limited range of validity. To discover up to which energy the model remains valid as an effective field theory, and where we thus need a completion of the theory, we cut the divergence off at $\Lambda$. The divergent contribution to the kinetic term is then found to be [18]
\[ \frac{1}{f^2}\,\frac{\Lambda^2}{16\pi^2}. \tag{9.7} \]
Now we can check at which energy (9.7) becomes comparable to the contribution of the tree-level diagram. This will be the case when the correction becomes O(1), that is, for $\Lambda \sim 4\pi f$. For $f \approx 1$ TeV we therefore expect the theory to be valid up to $\Lambda \approx 10$ TeV. Above this energy the effective description breaks down, i.e. the loop corrections become as important as the tree-level diagram, and we need a high-energy theory to take over³⁷.
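The estimate $\Lambda \sim 4\pi f$ follows directly from setting (9.7) equal to one. A minimal sketch, with hypothetical function names and f = 1 TeV as the illustrative input:

```python
import math

def loop_factor(cutoff_tev, f_tev):
    """Relative size Lambda^2 / (16 pi^2 f^2) of the correction (9.7)."""
    return cutoff_tev**2 / (16.0 * math.pi**2 * f_tev**2)

def cutoff_from_f(f_tev):
    """Cut-off at which the correction (9.7) becomes O(1): Lambda = 4 pi f."""
    return 4.0 * math.pi * f_tev

cutoff = cutoff_from_f(1.0)          # in TeV, for f = 1 TeV
ratio_at_cutoff = loop_factor(cutoff, 1.0)
```

For f = 1 TeV this gives a cut-off of about 12.6 TeV, which is rounded to 10 TeV in the text.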
³⁷ It is important to note that this energy scale lies beyond the current reach of the LHC, meaning that we have not yet been able to discover any new particles postulated by the theory. Had the scale been within reach, we should already have found new particles; since none have been found, the theory would then have been ruled out. This holds for all models that extend the Standard Model.
So far we have an effective theory with massless NGB that are not allowed to have any non-derivative interactions. No coupling terms are allowed and a mass term for h is also forbidden. However, to have a theory 'similar' to the Standard Model Higgs we do need gauge couplings, Yukawa couplings and a quartic Higgs potential. In the next few sections I will focus on implementing those in a gauge invariant way.
9.2.1 Adding the Gauge coupling
Beginning with the gauge couplings, we try to implement the SU(3) symmetry by including the following covariant derivative³⁸:
\[ D_\mu = \partial_\mu - i g T^a W^a_\mu, \tag{9.8} \]
where
\[ T^a = \frac{1}{2}\lambda^a = \begin{pmatrix} \sigma^a/2 & 0 \\ 0 & 0 \end{pmatrix}. \]
Thus, we only add the SU(2) gauge bosons. However, simply rewriting the covariant derivative in this rather trivial way does not remove the quadratic divergence. Expansion of $|D_\mu\phi|^2$ shows that we have the following one-loop diagrams.
Figure 3: The two quadratically divergent contributions to the Higgs mass coming from the terms that couple the SU(2) gauge bosons to $\phi$.
Their value is schematically found to be
\[ \frac{g^2}{16\pi^2}\Lambda^2\,\phi^\dagger\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}\phi = \frac{g^2}{16\pi^2}\Lambda^2\, h^\dagger h. \]
As a second attempt we can try gauging the full SU(3) symmetry by including all 8 SU(3) gauge bosons in the covariant derivative (9.8). Expansion leads again to the same quadratically divergent diagram, coming from the fourth term in the expansion, where we now have all eight gauge bosons. This gives:
\[ \frac{g^2}{16\pi^2}\Lambda^2\,\phi^\dagger\phi = \frac{g^2}{16\pi^2}\Lambda^2 f^2, \]
which contains no mass term for the Higgs, but adds only a constant. However, the Higgs field is also gone. The NGB that formed h have been gauged away by the 5 gauge bosons corresponding to the broken SU(3) generators. Thus, adding a single set of NGB $\phi$ in combination with gauging the full SU(3) results in a quadratic divergence with no dependence on h, but because the full SU(3) was gauged, the NGB are also eaten by the 5 gauge bosons. These two results suggest that a way to circumvent this problem is to add 2 sets of NGB, $\phi_1$ and $\phi_2$, and only a single set of SU(3) gauge bosons. This way both $\phi$ fields result in a spontaneous symmetry breaking SU(3) → SU(2), giving 10 exact NGB of which only 5 are eaten.
³⁸ Recall we introduced it in sections 7.1 and 7.3.
Collective symmetry breaking: The Little Higgs trick
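As said, we add two sets of NGB, $\phi_1$ and $\phi_2$, parametrized by
\[ \phi_1 = e^{i\Pi_1/f_1}\begin{pmatrix} 0 \\ 0 \\ f_1 \end{pmatrix}, \qquad \phi_2 = e^{i\Pi_2/f_2}\begin{pmatrix} 0 \\ 0 \\ f_2 \end{pmatrix}, \qquad\text{where } f_1 = f_2 = f, \]
and add a single set of SU(3) gauge bosons by letting $\phi_1$ and $\phi_2$ both have the same covariant derivative,
\[ \mathcal{L} = |D_\mu\phi_1|^2 + |D_\mu\phi_2|^2. \]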
Since we have introduced two $\phi$ fields and a single set of gauge bosons, only one linear combination of $\Pi_1$ and $\Pi_2$ will be eaten, while the orthogonal combination will form the complex Higgs doublet. In view of the previous attempt, both $\phi$ fields separately lead to the same quadratically divergent diagrams and give a total quadratic divergence of
\[ \frac{g^2}{16\pi^2}\Lambda^2\left(\phi_1^\dagger\phi_1 + \phi_2^\dagger\phi_2\right) = \frac{g^2}{16\pi^2}\Lambda^2\left(f^2 + f^2\right), \tag{9.9} \]
and thus do not contribute to the Higgs mass. However, since we are now dealing with two fields we can also draw the following diagram, which has two gauge bosons in the loop.
Figure 4: The third possible diagram, which contains both $\phi$ fields. The external fields are $\phi_1$ and $\phi_2$ and it has 2 gauge bosons in the loop. This diagram is the only one-loop diagram that contains both of the $\phi$ fields.
By counting momenta we expect this diagram to be logarithmically divergent³⁹, and indeed its contribution is [18]
\[ \frac{g^4}{16\pi^2}\log\left(\frac{\Lambda^2}{\mu^2}\right)|\phi_1^\dagger\phi_2|^2, \tag{9.10} \]
where $\mu$ is a free renormalization scale. By expanding $|\phi_1^\dagger\phi_2|^2$ we can further show that it contains a mass term for h. For this we need a more explicit expression for the $\phi$ fields. We already noted that only one combination of $\Pi_1$ and $\Pi_2$ can be gauged away, since there is only one set of gauge bosons. This we will call k. The orthogonal combination is identified with the Higgs and cannot be gauged away. We thus choose the following parametrization for the fields.
³⁹ See Appendix D.1.
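\[ \phi_1 = \exp\left[\frac{i}{f}\begin{pmatrix} 0 & k \\ k^\dagger & 0 \end{pmatrix}\right] \exp\left[\frac{i}{f}\begin{pmatrix} 0 & h \\ h^\dagger & 0 \end{pmatrix}\right]\begin{pmatrix} 0 \\ 0 \\ f \end{pmatrix} \tag{9.11} \]
and
\[ \phi_2 = \exp\left[\frac{i}{f}\begin{pmatrix} 0 & k \\ k^\dagger & 0 \end{pmatrix}\right] \exp\left[-\frac{i}{f}\begin{pmatrix} 0 & h \\ h^\dagger & 0 \end{pmatrix}\right]\begin{pmatrix} 0 \\ 0 \\ f \end{pmatrix} \tag{9.12} \]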
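where k and h are to be read as doublets. Working in U-gauge (recall we introduced it in section 7.1), where k is gauged away, we get the following expansion of $\phi_1^\dagger\phi_2$:
\begin{align*}
\phi_1^\dagger\phi_2 &= \begin{pmatrix} 0 & 0 & f \end{pmatrix} \exp\left[-\frac{2i}{f}\begin{pmatrix} 0 & h \\ h^\dagger & 0 \end{pmatrix}\right]\begin{pmatrix} 0 \\ 0 \\ f \end{pmatrix} \\
&= f^2 \begin{pmatrix} 0 & 1 \end{pmatrix}\left[ 1 - \frac{2i}{f}\begin{pmatrix} 0 & h \\ h^\dagger & 0 \end{pmatrix} - \frac{2}{f^2}\begin{pmatrix} hh^\dagger & 0 \\ 0 & h^\dagger h \end{pmatrix} + \dots \right]\begin{pmatrix} 0 \\ 1 \end{pmatrix} \\
&= f^2 - 2h^\dagger h + \dots
\end{align*}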
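This expansion can be checked numerically by exponentiating the 3 × 3 matrices explicitly. The sketch below is a minimal check in plain Python under the assumptions k = 0 (U-gauge), f = 1 and an arbitrary small test doublet h; the matrix exponential is a truncated Taylor series written out by hand, and all function names are hypothetical. It compares $\phi_1^\dagger\phi_2$ with $f^2 - 2h^\dagger h + \frac{2}{3f^2}(h^\dagger h)^2$ and with the closed form $f^2\cos(2|h|/f)$.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(m, terms=40):
    """Matrix exponential via a truncated Taylor series (adequate for small norms)."""
    n = len(m)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def phi_field(h, f, sign):
    """phi = exp(sign * i * Pi / f) (0, 0, f)^T, with Pi built from the doublet h."""
    h1, h2 = h
    pi = [[0, 0, h1], [0, 0, h2], [h1.conjugate(), h2.conjugate(), 0]]
    u = expm([[sign * 1j * x / f for x in row] for row in pi])
    return [u[i][2] * f for i in range(3)]

f = 1.0
h = (0.05 + 0.02j, -0.03 + 0.04j)          # arbitrary small test doublet
phi1, phi2 = phi_field(h, f, +1), phi_field(h, f, -1)
overlap = sum(a.conjugate() * b for a, b in zip(phi1, phi2))
hh = sum(abs(x) ** 2 for x in h)           # h^dagger h
approx = f**2 - 2 * hh + 2 * hh**2 / (3 * f**2)
closed_form = f**2 * math.cos(2 * math.sqrt(hh) / f)
```

The overlap is real and agrees with the truncated expansion up to terms of order $(h^\dagger h)^3/f^4$.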
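Equation (9.10) therefore contains a mass term for h, with a mass squared of order
\[ \frac{g^4}{16\pi^2}\log\left(\frac{\Lambda^2}{\mu^2}\right)f^2. \]
Recalling that we argued $\Lambda \sim 4\pi f$ for the effective theory to remain valid, we can estimate the contribution of this diagram for $f \approx 1$ TeV. Doing so, and using $\mu$ of the order of the electroweak VEV and $\Lambda \approx 10$ TeV, one finds a mass of about 100 GeV, which is of the order of the Standard Model Higgs mass [18].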
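The order of magnitude of this estimate can be reproduced as follows. The sketch drops all O(1) factors and assumes g ≈ 0.65 and μ ≈ 246 GeV, neither of which is fixed by the text; the function name is hypothetical.

```python
import math

def log_divergent_mass(g, f_gev, cutoff_gev, mu_gev):
    """Estimate m ~ sqrt(g^4 / (16 pi^2) * f^2 * log(Lambda^2 / mu^2)) in GeV."""
    m_squared = (g**4 / (16.0 * math.pi**2)
                 * f_gev**2
                 * math.log(cutoff_gev**2 / mu_gev**2))
    return math.sqrt(m_squared)

# Assumed inputs: g ~ 0.65, f = 1 TeV, Lambda = 10 TeV, mu ~ 246 GeV
m_h_estimate = log_divergent_mass(g=0.65, f_gev=1000.0,
                                  cutoff_gev=10000.0, mu_gev=246.0)
```

With these inputs the estimate lands near 100 GeV, far below the TeV-scale contributions found for the quadratically divergent diagrams.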
Collective symmetry breaking
What made the previous construction work, and what is a key ingredient in all Little Higgs models, is called collective symmetry breaking. Let me explain what this means by investigating the relevant symmetries of the model. The first thing to note is that without the gauge couplings the theory has a global $SU(3)_1 \times SU(3)_2$ symmetry, which is spontaneously broken to $SU(2)_1 \times SU(2)_2$ by the VEVs of the two $\phi$ fields. The coset is thus $[SU(3)/SU(2)]^2$, corresponding to 10 exact NGB. These correspond to 2 singlets and 2 complex doublets (k and h) transforming under the unbroken SU(2) symmetry. However, by introducing the SU(3) gauge interactions for both of the fields, we gauged only the diagonal SU(3) subgroup of $SU(3)_1 \times SU(3)_2$, and explicitly broke the global symmetry to this gauged subgroup, because $\phi_1$ and $\phi_2$ are no longer allowed to rotate independently. This can be seen from the boson-scalar coupling term in the Lagrangian,
\[ |g W_\mu\phi_1|^2 + |g W_\mu\phi_2|^2, \]
and the relative minus sign between (9.11) and (9.12). This diagonal SU(3) is then spontaneously broken to SU(2), producing only 5 exact NGB, corresponding to the k fields that are eaten by the gauge bosons. The set that remains, h in the parametrization, acquires a mass term through the log-divergent loop diagram, as we just saw. This h is thus a pseudo-NGB. What is crucial here is that both of the gauge couplings must be present. Suppose that we were to set the gauge coupling of either of the $\phi_i$ to zero. Then the Lagrangian again has two independent SU(3) symmetries that are spontaneously broken. This way we get 10 exact NGB of which 5 are eaten, leaving us with 5 exact NGB and thus a massless h field. It is only when we include the gauge coupling for both of the $\phi$ fields that the global $[SU(3)]^2$ symmetry is explicitly broken to its diagonal subgroup. This way only 5 exact NGB appear and h will be realized as a pseudo-NGB, resulting from the breaking of the approximate global symmetry. Thus, only when we include both $\phi$ fields can we get a massive h field, and we just saw that a diagram involving both of the fields is at most logarithmically divergent.
This mechanism of realizing the Higgs as a pseudo-NGB is called collective symmetry breaking. It realizes the Higgs as the NGB of a spontaneously broken global symmetry that is also explicitly broken, making it a pseudo-NGB. "Collectively" means that the explicit symmetry breaking can happen only when two or more couplings are non-zero. In this way the Higgs mass is natural and protected, since setting either of the gauge couplings to zero restores the global symmetry and again makes the Higgs an exact NGB.
The same idea of collective symmetry breaking can be applied when adding the Yukawa coupling to the quarks, which is what we will do next.
9.2.2 Adding the Yukawa coupling
Since the Yukawa coupling of the Higgs to the fermions is proportional to the fermion mass, the most important contribution comes from the top quark. Recall that the Yukawa Lagrangian was of the form
\[ \mathcal{L}_{Yuk} = -\lambda_f\bar\psi_L\phi\psi_R. \]
We can then write down the following Lagrangian that involves both $\phi_i$ fields:
\[ \mathcal{L}_{Yuk} = -\lambda_1\bar\psi_L\phi_1\psi_{1R} - \lambda_2\bar\psi_L\phi_2\psi_{2R}. \tag{9.13} \]
To get SU(3) invariance, we enlarge the SU(2) doublets to triplets by adding a heavy top partner for the SM top quark t, which we will denote by T. We thus get:
\[ \psi_L = \begin{pmatrix} t \\ b \\ T \end{pmatrix}_L, \qquad \psi_R = t_R,\ b_R,\ T_R. \tag{9.14} \]
As we will see, the right-handed top quark $t_R$ will mix with the heavy right-handed top quark $T_R$, such that the quadratically divergent top loop will be cancelled by this heavy top quark. I will thus refer to $t_R$ and $T_R$ as $t_{1R}$ and $t_{2R}$ to reflect their mixing. In the conventions used here, the Higgs field is assumed to obtain a VEV of $(0\ \ v)^T$. Therefore, we have to use the charge conjugates of the fields $\phi_1$ and $\phi_2$, i.e.
\[ \phi_i^c = \begin{pmatrix} i\tau_2 & 0 \\ 0 & 1 \end{pmatrix}\phi_i^*, \]
to prevent the top and down quark from mixing⁴⁰. This results in the following two terms for the top quark Yukawa Lagrangian⁴¹:
\[ \mathcal{L}_{top} = -\lambda_1\bar\psi_L\begin{pmatrix} i\tau_2 & 0 \\ 0 & 1 \end{pmatrix}\phi_1^*\, t_{1R} - \lambda_2\bar\psi_L\begin{pmatrix} i\tau_2 & 0 \\ 0 & 1 \end{pmatrix}\phi_2^*\, t_{2R}. \tag{9.15} \]
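Recalling the expansions for $\phi_1$ and $\phi_2$ derived in (9.6), we find for the conjugate fields:
\[ \phi_1^c = \begin{pmatrix} -\tau_2 h^* \\ f - \frac{h^\dagger h}{2f} \end{pmatrix} \quad\text{and}\quad \phi_2^c = \begin{pmatrix} \tau_2 h^* \\ f - \frac{h^\dagger h}{2f} \end{pmatrix}. \]
Inserting this in (9.15) and for simplicity setting $\lambda_1 = \lambda_2 = \lambda/\sqrt{2}$ gives:
\[ \mathcal{L}_{top} = -\frac{\lambda}{\sqrt{2}}\begin{pmatrix} \bar t & \bar b \end{pmatrix}_L i\tau_2 h^*\left(i t_{1R} - i t_{2R}\right) - \frac{\lambda}{\sqrt{2}}\,\bar T_L\left(f - \frac{h^\dagger h}{2f}\right)\left(t_{1R} + t_{2R}\right), \tag{9.16} \]
where the factor of i is inserted so we can redefine the right-handed singlets as:
\[ t_R = \frac{1}{\sqrt{2}\, i}\left(t_{2R} - t_{1R}\right) \quad\text{and}\quad T_R = \frac{1}{\sqrt{2}}\left(t_{2R} + t_{1R}\right). \]
In terms of these redefined fields, (9.16) becomes:
\[ \mathcal{L}_{top} = -\lambda\left[\begin{pmatrix} \bar t & \bar b \end{pmatrix}_L i\tau_2 h^*\, t_R + \bar T_L\left(f - \frac{h^\dagger h}{2f}\right)T_R\right]. \tag{9.17} \]
The first term represents the Standard Model top Yukawa coupling and we identify $\lambda = \lambda_t$. The second term includes a mass term for the heavy top quark and a coupling term. We can read off a mass of $\lambda_t f$ and a coupling constant of $\lambda_t/(2f)$.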
Cancelling the top loop
From (9.17) we see that we can draw the following two diagrams, which both contribute to the Higgs mass at first order.
Figure 5: The quadratically divergent contributions from the top quark and its heavy top partner. They contribute equally to the Higgs mass with opposite sign and thus cancel.
The contribution of the first diagram to the Higgs mass we already know. It is
\[ -\frac{3\lambda_t^2\Lambda^2}{8\pi^2}. \]
To see why the second diagram should cancel the first we make the following observations. The first observation is that the coupling terms in the Lagrangian differ by a relative minus sign. Further, the second diagram actually represents two diagrams, since the heavy top and its antiparticle can run around the loop in either order. Finally, the two relevant couplings in the second diagram are $\lambda_t f$ and $\lambda_t/(2f)$. The factors of 2 and f by which it differs from the first diagram therefore cancel, so the contribution is the same apart from the relative minus sign, and the heavy top quark contribution indeed cancels the SM top quark contribution.
⁴⁰ In [18] the calculations are based on the VEV $(v\ \ 0)^T$. The calculations done here give exactly the same results, since it is a matter of convention whether one works with $\phi$ or its charge conjugate.
⁴¹ We discard the terms $\bar R\phi^\dagger L$ because these would reintroduce the quadratic divergences.
Symmetries
As in the gauge boson sector, the absence of quadratic divergences can be understood by looking at the symmetries involved. Both of the Yukawa couplings separately preserve the SU(3) gauge symmetry. When both of the couplings are non-zero, this forces the fields in both terms in (9.13) to be aligned, since the couplings force $\phi_i$ to transform like $\psi_L$. In this case there is only one symmetry, the diagonal SU(3) gauge symmetry. Suppose again that we were to set either of the $\lambda_i$ to zero. This results in two independent SU(3) symmetries and the symmetry is thus enhanced to $[SU(3)]^2$. Both $\phi_i$ spontaneously break this symmetry to $[SU(2)]^2$, resulting in two sets of 5 exact NGB. One set is eaten and the other forms the Little Higgs, which is an exact NGB and thus massless. If we then take both of the couplings to be non-zero, only the diagonal SU(3) symmetry remains. We will only get one set of 5 exact NGB, which are eaten, and the Higgs becomes a pseudo-NGB and receives a contribution to its mass from loop diagrams. This contribution can only come from diagrams that involve both of the couplings, which can at most be logarithmically divergent.
Down quark coupling
Having dealt with the top quark, it remains to include couplings for the other quarks. The coupling for the other up-type quarks can be added in the same way as described above. For the down-type quarks we also have to use both of the $\phi$ fields. In this case, however, we do not have to worry about symmetries and collective breaking, since the Yukawa coupling of the bottom quark is too small to give a significant contribution to the Higgs mass. For $\Lambda \sim 10$ TeV its one-loop diagram gives:
\[ \frac{\lambda_b^2}{16\pi^2}\Lambda^2 \approx (30\ \text{GeV})^2. \]
Both of the $\phi$ fields can be included by using an $\epsilon_{ijk}$ contraction:
\[ \mathcal{L}_b = -\frac{\lambda_b}{2f}\,\epsilon_{ijk}\,\bar\psi_L^i\,\phi_1^j\,\phi_2^k\, b_R. \]
This epsilon contraction, though, immediately results in quadratic divergences because it breaks both of the SU(3) symmetries to the diagonal SU(3). This contribution is not problematic though, because for the heaviest down-type quark, i.e. the bottom quark, it is only $\sim (30\ \text{GeV})^2$. Although it poses no problem, the fact that the SU(3) model is unable to cancel the down quark coupling should be considered a shortcoming of the SU(3) based model. In order to cancel all quadratic divergences, models with larger symmetry groups have to be considered.
9.2.3 The Higgs potential
In our analysis of the SM Higgs mechanism, we have seen that the Higgs potential is responsible for
EWSB. Thus, the Little Higgs model must also include a potential large enough to achieve this. We
require this potential V = V (φ1, φ2) to have the following properties:
1. It must not contain a tree level mass term for the Higgs,
2. It must contain a quartic coupling for the Higgs doublet,
3. It must preserve the collective symmetry breaking of the SU(3)’s.
The last demand means that the quartic coupling must be generated when at least two couplings
are non-zero and setting either to zero will make the Higgs an exact NGB. Just as we saw with the
gauge and Yukawa coupling, this will ensure that contributions to the Higgs mass can at most be
logarithmically divergent. Constructing a potential that satisfies the above three properties is far from
trivial for an SU(3) model though. When both fields are included, the only nontrivial SU(3) invariant
term is φ†1φ2. The other terms we can construct are φ†iφi = Const. and εijkφiφjφk = 0, which are
clearly of no use. The φ†1φ2 term, however, immediately breaks both of the SU(3) symmetries to the
diagonal subgroup. Expanding φ†1φ2 gives:
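$$\phi_1^\dagger\phi_2 \approx f^2 - 2h^\dagger h + \frac{2}{3f^2}(h^\dagger h)^2$$
and this contains a Higgs mass term as well as the quartic coupling, and this will always be the case.
A solution to this problem might be to tune the coefficient through:
$$\frac{A}{f^{2n-4}}(\phi_1^\dagger\phi_2)^2 \approx a_1 f^4 - a_2 f^2 h^\dagger h + a_3 (h^\dagger h)^2$$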
Then, by varying A it should in principle be possible to generate a mass term small enough to prevent
quadratic divergencies, or a quartic coupling large enough to induce EWSB. The combination of the
two however, turns out to be impossible. Next to its shortcoming to cancel the down-type quark
divergencies, this is another shortcoming of the SU(3) model. As with its inability to cancel the
down-type quark couplings, the problem of the quartic potential can be solved by increasing the
symmetry group. One such extension is the group SU(5), in the Littlest Higgs. We will have a look
at this model in section 11.
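The expansion of φ†1φ2 quoted above can be sanity-checked numerically. The sketch below assumes that, in the nonlinear parametrization of the φ fields used earlier, the Higgs dependence of φ†1φ2 resums to f² cos(2|h|/f); this closed form is an assumption made here because it reproduces the quoted series term by term. The function names are illustrative only.

```python
import math

def phi1_dagger_phi2_closed_form(h_abs, f):
    """Assumed closed form f^2 * cos(2|h|/f) (not taken from the text)."""
    return f**2 * math.cos(2.0 * h_abs / f)

def phi1_dagger_phi2_truncated(h_abs, f):
    """Series quoted in the text: f^2 - 2 h†h + (2 / (3 f^2)) (h†h)^2."""
    hh = h_abs**2
    return f**2 - 2.0 * hh + 2.0 * hh**2 / (3.0 * f**2)

def quartic_term(h_abs, f):
    """Size of the quartic piece alone, for comparison with the truncation error."""
    return 2.0 * h_abs**4 / (3.0 * f**2)
```

For |h| well below f the truncation error is of order |h|⁶/f⁴, far smaller than the quartic term itself, so the mass term and the quartic coupling indeed come out of the same invariant with fixed relative size.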
9.2.4 Hypercharge and color
It remains now to add color and hypercharge. Since all of the previous arguments are colorblind42,
meaning that red, blue and green quarks carry the same electric charge and hypercharge, we can
simply add the SU(3)color gauge group.
42In the sense of QCD.
Hypercharge
In the standard model, the symmetry group of the electroweak interaction is SU(2)L × U(1)Y . In
this model we embedded the weak interaction in SU(3)w which is broken to SU(2)w by the VEV of
the φ fields. Therefore, to include hypercharge Y , we gauge an additional U(1)X group, thereby
enlarging the symmetry group to SU(3)color × SU(3)weak × U(1)X . It remains now to determine the correct
combination of generators such that the hypercharge for the Higgs comes out as +1.43 For this we
note that the SU(3) generator $T^8 = \frac{1}{2}\lambda_8$ is SU(2) invariant and has not been used so far. This leads
us to define the following combination of generators that is invariant under the VEV ∼ (0, 0, 1)
and produces the correct hypercharge:
$$Y = 2\left(\frac{1}{\sqrt{3}}T^8 - X\right), \quad\text{where}\quad T^8 = \frac{1}{2\sqrt{3}}\begin{pmatrix} 1 & & \\ & 1 & \\ & & -2 \end{pmatrix}$$
and we assigned φ the U(1)X quantum number −1/3 [18]. Then we get
$$Y = \begin{pmatrix} 1 & & \\ & 1 & \\ & & 0 \end{pmatrix}.$$
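The hypercharge assignment can be verified with exact arithmetic. The following is a minimal sketch; the helper name is hypothetical, and it uses only the definitions given above (T⁸/√3 = (1/6) diag(1, 1, −2) and X = −1/3 on the triplet).

```python
from fractions import Fraction

def hypercharge_diagonal(x_charge=Fraction(-1, 3)):
    """Diagonal of Y = 2 * (T8 / sqrt(3) - X) acting on an SU(3) triplet.

    T8 / sqrt(3) = (1/6) * diag(1, 1, -2) is rational, so Fractions are exact.
    """
    t8_over_sqrt3 = [Fraction(1, 6), Fraction(1, 6), Fraction(-2, 6)]
    return [2 * (entry - x_charge) for entry in t8_over_sqrt3]
```

The first two entries give the Higgs doublet hypercharge +1, and the vanishing third entry shows that Y annihilates the VEV direction (0, 0, 1).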
The η singlet
Until now we have ignored the η singlet. We can correct this by noting that in defining the
linear combination for the hypercharge generator we also have the opposite linear combination
$$2\left(-\frac{1}{\sqrt{3}}T^8 - X\right) = \begin{pmatrix} 0 & & \\ & 0 & \\ & & 1 \end{pmatrix}$$
This combination will become the massive η goldstone boson that will be eaten by a gauge boson after
the high scale symmetry breaking [24].
9.2.5 The gauge sector
Now that we have assigned the fields φ1 and φ2 the U(1)X quantum number of −1/3 we can write
down a covariant derivative:
$$D_\mu = \partial_\mu + igA^a_\mu T_a + ig_X A^X_\mu X = \partial_\mu + igA^a_\mu T_a - \frac{ig_X}{3}A^X_\mu \qquad (9.18)$$
where Ta = λa/2 are the eight SU(3)W generators and X = −1/3 is the U(1)X generator. We will
use it now to have a closer look at the gauge sector. Since we enlarged the SU(2)W × U(1)Y gauge
group to SU(3)W ×U(1)X we expect there to be 5 extra gauge bosons that correspond to the 5 broken
SU(3)W generators, with masses of order f . Here we will determine those masses and investigate whether the
quadratic divergencies due to the SM gauge bosons are indeed cancelled.
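43Recall that $\langle h\rangle = (0, v/\sqrt{2})$ and $Q = T^3 + Y/2$.
The masses
The masses of the gauge bosons result from the kinetic terms of the φi fields, |Dµφi|2:
$$\sum_{i=1}^{2}\left|\left(\partial_\mu + igA^a_\mu T_a - \frac{ig_X}{3}A^X_\mu\right)\phi_i\phi_i^\dagger\right|^2 \to \mathrm{Trace}\left(\sum_{i=1}^{2}\left|\left(igA^a_\mu T_a - \frac{ig_X}{3}A^X_\mu\right)\phi_i\phi_i^\dagger\right|^2\right)$$
$$= \mathrm{Trace}\left(\left|igA^a_\mu T_a - \frac{ig_X}{3}A^X_\mu\right|^2\sum_{i=1}^{2}\phi_i\phi_i^\dagger\right) \qquad (9.19)$$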
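where ∂µ is omitted since it plays no role in the masses. To evaluate this we determine what the two
matrices $\sum_{i=1}^{2}\phi_i\phi_i^\dagger$ and Dµ look like. For $\sum_{i=1}^{2}\phi_i\phi_i^\dagger$ we find to order h/f:
$$\sum_{i=1}^{2}\phi_i\phi_i^\dagger = \begin{pmatrix} \langle hh^\dagger\rangle & \begin{matrix} 0 \\ 0 \end{matrix} \\ \begin{matrix} 0 & 0 \end{matrix} & f^2 - \langle hh^\dagger\rangle \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & v^2/2 & 0 \\ 0 & 0 & f^2 - v^2/2 \end{pmatrix} \qquad (9.20)$$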
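where in the second equality we assumed the VEV for h of $(0, v/\sqrt{2})^T$. For Dµ we find:
$$D_\mu = \begin{pmatrix}
\frac{ig}{2}\left(A^3_\mu + \frac{A^8_\mu}{\sqrt{3}}\right) - \frac{1}{3}ig_XA^X_\mu & \frac{ig}{2}(A^1_\mu - iA^2_\mu) & \frac{ig}{2}(A^4_\mu - iA^5_\mu) \\
\frac{ig}{2}(A^1_\mu + iA^2_\mu) & \frac{ig}{2}\left(-A^3_\mu + \frac{A^8_\mu}{\sqrt{3}}\right) - \frac{1}{3}ig_XA^X_\mu & \frac{ig}{2}(A^6_\mu - iA^7_\mu) \\
\frac{ig}{2}(A^4_\mu + iA^5_\mu) & \frac{ig}{2}(A^6_\mu + iA^7_\mu) & -\frac{igA^8_\mu}{\sqrt{3}} - \frac{1}{3}ig_XA^X_\mu
\end{pmatrix}$$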
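where we used Ta = λa/2. We can now evaluate (9.19). Letting h assume its VEV this becomes:
$$\mathrm{Trace} = \frac{1}{2}v^2\left[\left(g\left(-\frac{A^3_\mu}{2} + \frac{A^8_\mu}{2\sqrt{3}}\right) - \frac{A^X_\mu}{3}g_X\right)^2 + g^2\left(\frac{A^1_\mu}{2} - i\frac{A^2_\mu}{2}\right)\left(\frac{A^1_\mu}{2} + i\frac{A^2_\mu}{2}\right) + g^2\left(\frac{A^6_\mu}{2} - i\frac{A^7_\mu}{2}\right)\left(\frac{A^6_\mu}{2} + i\frac{A^7_\mu}{2}\right)\right]$$
$$+ \left(f^2 - \frac{v^2}{2}\right)\left[\left(-g\frac{A^8_\mu}{\sqrt{3}} - g_X\frac{A^X_\mu}{3}\right)^2 + g^2\left(\frac{A^4_\mu}{2} - i\frac{A^5_\mu}{2}\right)\left(\frac{A^4_\mu}{2} + i\frac{A^5_\mu}{2}\right) + g^2\left(\frac{A^6_\mu}{2} - i\frac{A^7_\mu}{2}\right)\left(\frac{A^6_\mu}{2} + i\frac{A^7_\mu}{2}\right)\right]$$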
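We now define the following combinations for the SU(2) W± gauge bosons and the new heavy gauge
bosons $(W')^\pm$ and $(W')^{0,\bar{0}}$:
$$W^\pm = \frac{A^1_\mu \mp iA^2_\mu}{\sqrt{2}}, \qquad (W')^\pm = \frac{A^4_\mu \pm iA^5_\mu}{\sqrt{2}}, \qquad (W')^{0,\bar{0}} = \frac{A^6_\mu \mp iA^7_\mu}{\sqrt{2}}$$
and we can read off their masses to be:
$$M^2_{W^\pm} = \frac{1}{4}g^2v^2, \qquad M^2_{(W')^\pm} = \frac{1}{2}g^2f^2 - \frac{1}{4}g^2v^2, \qquad M^2_{(W')^{0,\bar{0}}} = \frac{1}{2}g^2f^2 \qquad (9.21)$$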
which is in agreement with [19]. This clearly shows that the SM W± gauge bosons remain massless
until EWSB when the Higgs field assumes a VEV, whereas the four new heavy gauge bosons acquire
masses of order f . With this we can already show that quadratic divergencies due to the charged W±
are cancelled by the new heavy W bosons since they all represent correct mass eigenstates. I will show
this in section 9.2.6. First we focus on determining the masses and mass eigenstates of the other gauge
bosons.
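The masses in (9.21) can be reproduced numerically from the trace in (9.19). The sketch below is only an illustration: it builds g AᵃTₐ from the matrix form of Dµ given above (dropping the U(1)X piece, which does not affect the charged bosons), switches on one real component at a time, and reads off M² from Tr(D†D Σφφ†). All function names are hypothetical.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def gauge_matrix(fields, g):
    """g * A^a T_a with T_a = lambda_a / 2; `fields` maps a in 1..8 to the value of A^a."""
    A = lambda a: fields.get(a, 0.0)
    s3 = math.sqrt(3.0)
    h = 0.5 * g
    return [
        [h * (A(3) + A(8) / s3), h * (A(1) - 1j * A(2)), h * (A(4) - 1j * A(5))],
        [h * (A(1) + 1j * A(2)), h * (-A(3) + A(8) / s3), h * (A(6) - 1j * A(7))],
        [h * (A(4) + 1j * A(5)), h * (A(6) + 1j * A(7)), -g * A(8) / s3],
    ]

def mass_squared(component, g, v, f):
    """M^2 of the complex boson containing A^component, from Tr(D†D * sum_i phi_i phi_i†)."""
    amplitude = 1.0
    d = gauge_matrix({component: amplitude}, g)
    dd = matmul(dagger(d), d)
    sigma = [0.0, v**2 / 2.0, f**2 - v**2 / 2.0]  # diagonal of (9.20)
    trace = sum((dd[i][i] * sigma[i]).real for i in range(3))
    # With a single real component switched on, W W* = amplitude^2 / 2.
    return trace / (amplitude**2 / 2.0)
```

Calling `mass_squared(1, g, v, f)`, `mass_squared(4, g, v, f)` and `mass_squared(6, g, v, f)` gives the three entries of (9.21); the first two add up to the third, which is the mass relation behind the cancellation discussed in section 9.2.6.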
After the φ fields have assumed a VEV (and before EWSB) the neutral gauge bosons corresponding to
the generators T 3, T 8 and TX mix to form the physical fields $W^3_\mu$, $B_\mu$ and $Z'_\mu$. The first two we have
seen before. After SU(2)W × U(1)Y breaking, these two fields will mix to form the massless photon
and heavy $Z^0_\mu$ boson. However, there is now also a $Z'_\mu$ that will mix with $Z^0_\mu$ [19]. We now turn to
computing the mass matrix. Since we let only φ attain a VEV the relevant term to consider is:
$$f^2\left(-\frac{gA^8_\mu}{\sqrt{3}} - \frac{g_XA^X_\mu}{3}\right)^2 \qquad (9.22)$$
Thus we see that the gauge boson A3µ corresponding to the third SU(3) generator does not contribute
to the mixing. We can explain the absence of A3µ in the mixing by observing that it corresponds to the
third SU(3) generator, which we can also identify with the third SU(2) generator. Therefore, when
breaking SU(3)W to SU(2)W , A3µ will in itself correspond to an SU(2) generator and will therefore
remain a mass eigenstate. The other two, A8µ and AXµ do not correspond to SU(2) generators and
thus will mix to form the mass eigenstates. Let us now determine what those eigenstates will be. We
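can rewrite (9.22) as44:
$$f^2\left(\frac{gA^8_\mu}{\sqrt{3}} + \frac{g_XA^X_\mu}{3}\right)^2 = \frac{f^2}{9}\left(\sqrt{3}gA^8_\mu + g_XA^X_\mu\right)^2 = \frac{f^2}{9}\begin{pmatrix} A^3 & A^8 & A^X \end{pmatrix}\begin{pmatrix} 0 & 0 & 0 \\ 0 & 3g^2 & \sqrt{3}gg_X \\ 0 & \sqrt{3}gg_X & g_X^2 \end{pmatrix}\begin{pmatrix} A^3 \\ A^8 \\ A^X \end{pmatrix} \qquad (9.23)$$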
Already familiar with this form from our analysis of the SM Higgs mechanism we can immediately
write down what the physical fields become:
$$A^3_\mu = A^3_\mu, \qquad B_\mu = \frac{-g_XA^8_\mu + \sqrt{3}gA^X_\mu}{\sqrt{3g^2 + g_X^2}}, \qquad Z'_\mu = \frac{\sqrt{3}gA^8_\mu + g_XA^X_\mu}{\sqrt{3g^2 + g_X^2}}$$
$A^3_\mu$ and $B_\mu$ remain massless after SU(3) breaking, as of course they should, and the $Z'_\mu$ acquires a
mass:
$$M^2_{Z'} = \frac{2f^2}{9}\left(3g^2 + g_X^2\right) \qquad (9.24)$$
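The diagonalization of the matrix in (9.23) can be checked directly. In the sketch below (helper names are illustrative) the Bµ direction is verified to be a null vector, and the Z′µ direction an eigenvector with eigenvalue (f²/9)(3g² + g_X²); the mass in (9.24) is twice this eigenvalue, following the normalization used in the text for a real field.

```python
import math

def neutral_mass_matrix(g, gx, f):
    """Matrix of (9.23) in the basis (A3, A8, AX), including the f^2 / 9 prefactor."""
    s3 = math.sqrt(3.0)
    pref = f**2 / 9.0
    return [
        [0.0, 0.0, 0.0],
        [0.0, pref * 3.0 * g**2, pref * s3 * g * gx],
        [0.0, pref * s3 * g * gx, pref * gx**2],
    ]

def apply(matrix, vec):
    return [sum(matrix[i][j] * vec[j] for j in range(3)) for i in range(3)]

def b_and_zprime_directions(g, gx):
    """Unit vectors for B and Z' in the basis (A3, A8, AX), as defined below (9.23)."""
    norm = math.sqrt(3.0 * g**2 + gx**2)
    b = [0.0, -gx / norm, math.sqrt(3.0) * g / norm]
    zp = [0.0, math.sqrt(3.0) * g / norm, gx / norm]
    return b, zp

def zprime_mass_squared(g, gx, f):
    """M^2_Z' as in (9.24): twice the non-zero eigenvalue of the matrix."""
    m = neutral_mass_matrix(g, gx, f)
    _, zp = b_and_zprime_directions(g, gx)
    mz = apply(m, zp)
    eigenvalue = sum(mz[i] * zp[i] for i in range(3))
    return 2.0 * eigenvalue
```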
$(W')^\pm$, $(W')^{0,\bar{0}}$ and the $Z'_\mu$ form the 5 massive gauge bosons that correspond to the 5 broken generators
in breaking SU(3)W × U(1)X → SU(2)W × U(1)Y and they all have masses of order f . The Bµ
corresponds to the hypercharge generator Y and after EWSB will mix with $A^3_\mu$ to form the photon Aµ
and heavy Z0. This we will investigate next. For this, we have to rewrite (9.18) in terms of the new
mass eigenstates. We can implement the W± bosons by observing that we have the following equality:
$$A^1_\mu T_1 + A^2_\mu T_2 = \frac{1}{2}\left[(A^1_\mu + iA^2_\mu)(T_1 - iT_2) + (A^1_\mu - iA^2_\mu)(T_1 + iT_2)\right] = \frac{1}{\sqrt{2}}\left[W^-_\mu(T_1 - iT_2) + W^+_\mu(T_1 + iT_2)\right]$$
Similar expressions hold for the other W bosons and what remains is $A^3_\mu T^3$. $Z'_\mu$ and $B_\mu$ are more
tricky. They must come from rewriting
$$igA^8_\mu T^8 - \frac{ig_X}{3}A^X_\mu = c_1 Z'_\mu + c_2 B_\mu = c_1\frac{\sqrt{3}gA^8_\mu + g_XA^X_\mu}{\sqrt{3g^2 + g_X^2}} + c_2\frac{-g_XA^8_\mu + \sqrt{3}gA^X_\mu}{\sqrt{3g^2 + g_X^2}}$$
44The space-time indices µ are omitted for clarity.
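Of course, for this covariant derivative to produce the correct mass eigenstates after EWSB, c2 has
to contain the hypercharge generator $Y = \frac{2}{\sqrt{3}}T^8 - 2X$ and a certain coupling g that we can identify
with the hypercharge coupling g′. The following combination will do:
$$igA^8_\mu T^8 - \frac{ig_X}{3}A^X_\mu \overset{?}{=} \frac{i}{\sqrt{3g^2 + g_X^2}}\left[Z'_\mu\left(\sqrt{3}g^2T^8 + g_X^2X\right) - \frac{\sqrt{3}}{2}gg_XB_\mu\left(\frac{2}{\sqrt{3}}T^8 - 2X\right)\right] \qquad (9.25)$$
$$= \frac{i}{3g^2 + g_X^2}\left[\left(\sqrt{3}gA^8_\mu + g_XA^X_\mu\right)\left(\sqrt{3}g^2T^8 + g_X^2X\right)\right] - \frac{i}{3g^2 + g_X^2}\frac{\sqrt{3}}{2}gg_X\left[\left(-g_XA^8_\mu + \sqrt{3}gA^X_\mu\right)\left(\frac{2}{\sqrt{3}}T^8 - 2X\right)\right]$$
$$= \frac{i}{3g^2 + g_X^2}\left[\left(3g^3T^8A^8_\mu + \sqrt{3}gg_X^2A^8_\mu X + \sqrt{3}g^2g_XA^X_\mu T^8 + g_X^3A^X_\mu X\right) + \left(gg_X^2A^8_\mu T^8 - \sqrt{3}gg_X^2A^8_\mu X - \sqrt{3}g^2g_XA^X_\mu T^8 + 3g^2g_XA^X_\mu X\right)\right]$$
$$= \frac{i}{3g^2 + g_X^2}\left[\left(3g^3T^8A^8_\mu + gg_X^2A^8_\mu T^8\right) + \left(g_X^3A^X_\mu X + 3g^2g_XA^X_\mu X\right)\right]$$
$$= \frac{i}{3g^2 + g_X^2}\left[\left(3g^2 + g_X^2\right)gA^8_\mu T^8 + \left(3g^2 + g_X^2\right)g_XA^X_\mu X\right]$$
$$= igA^8_\mu T^8 - \frac{ig_X}{3}A^X_\mu$$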
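From (9.25) we deduce that the hypercharge coupling is given by45:
$$g' = \frac{-\sqrt{3}gg_X}{2\sqrt{3g^2 + g_X^2}} = \frac{-g_X}{2\sqrt{1 + \frac{g_X^2}{3g^2}}}.$$
It also gives us the expression for the Weinberg angle θw from (7.27):
$$\frac{g'}{g} = \tan(\theta_w) = t = \frac{-\sqrt{3}g_X}{2\sqrt{3g^2 + g_X^2}} = \frac{-\sqrt{3}}{2\sqrt{\frac{3g^2}{g_X^2} + 1}}$$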
Implementing all of this leaves us with the following covariant derivative in terms of the new mass
eigenstates:
$$D_\mu = \frac{ig}{\sqrt{2}}\left[W^-_\mu(T_1 - iT_2) + W^+_\mu(T_1 + iT_2)\right] \qquad (9.26)$$
$$+ \frac{ig}{\sqrt{2}}\left[(\bar{W}')^0_\mu(T_6 - iT_7) + (W')^0_\mu(T_6 + iT_7)\right]$$
$$+ \frac{ig}{\sqrt{2}}\left[(W')^+_\mu(T_4 - iT_5) + (W')^-_\mu(T_4 + iT_5)\right]$$
$$+ igA^3_\mu T^3 + \frac{i}{\sqrt{3g^2 + g_X^2}}Z'_\mu\left(\sqrt{3}g^2T^8 + g_X^2X\right) + ig'B_\mu Y$$
Expressing gX in terms of g and t gives:
$$g_X = g\frac{2t}{\sqrt{1 - \frac{4t^2}{3}}}.$$
Expressing further X in terms of T 8 and Y , the term with $Z'_\mu$ becomes
$$\frac{ig}{\sqrt{3 - 4t^2}}Z'_\mu\left(\sqrt{3}T^8 - 2t^2Y\right)$$
45In agreement with [19].
With this, Dµ becomes in terms of g and t:
$$D_\mu = \frac{ig}{\sqrt{2}}\left[W^-_\mu(T_1 - iT_2) + W^+_\mu(T_1 + iT_2)\right] \qquad (9.27)$$
$$+ \frac{ig}{\sqrt{2}}\left[(\bar{W}')^0_\mu(T_6 - iT_7) + (W')^0_\mu(T_6 + iT_7)\right]$$
$$+ \frac{ig}{\sqrt{2}}\left[(W')^+_\mu(T_4 - iT_5) + (W')^-_\mu(T_4 + iT_5)\right]$$
$$+ igA^3_\mu T^3 + \frac{ig}{\sqrt{3 - 4t^2}}Z'_\mu\left(\sqrt{3}T^8 - 2t^2Y\right) + igtB_\mu Y$$
With the expression for gX we can also rewrite the mass term we found in (9.24) as:
$$M^2_{Z'} = \frac{2f^2}{9}\left(3g^2 + g_X^2\right) = g^2f^2\frac{2}{3 - 4t^2}$$
which is in agreement with [19]. Note however that when $t = \sqrt{3}/2 \approx 0.866$ the mass of the heavy Z′
would become infinite, which is clearly unphysical. However, recall from (7.28) that θW = 28.75° and
thus t ≈ 0.52. Thus we can conclude this will not be the case.
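The change of variables from (g, g_X) to (g, t) can be checked numerically. The following sketch (function names are illustrative, and the overall sign of t is dropped since only t² enters the masses) verifies that inverting the expression for t recovers g_X and that both forms of M²_Z′ agree.

```python
import math

def tan_theta(g, gx):
    """|t| = |g'/g| with g' = sqrt(3) g gX / (2 sqrt(3 g^2 + gX^2)); sign dropped."""
    return math.sqrt(3.0) * gx / (2.0 * math.sqrt(3.0 * g**2 + gx**2))

def gx_from_t(g, t):
    """gX = 2 g t / sqrt(1 - 4 t^2 / 3)."""
    return 2.0 * g * t / math.sqrt(1.0 - 4.0 * t**2 / 3.0)

def mz_prime_sq_from_couplings(g, gx, f):
    """(9.24): M^2 = (2 f^2 / 9) (3 g^2 + gX^2)."""
    return 2.0 * f**2 / 9.0 * (3.0 * g**2 + gx**2)

def mz_prime_sq_from_t(g, t, f):
    """Same mass expressed through t: M^2 = 2 g^2 f^2 / (3 - 4 t^2)."""
    return 2.0 * g**2 * f**2 / (3.0 - 4.0 * t**2)
```

Since |t| defined this way is always below √3/2 for finite g_X, the pole at t = √3/2 corresponds to g_X → ∞ and is never reached for finite couplings.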
To find the masses and mass eigenstates after EWSB we follow the same procedure and let now the
Higgs field acquire its VEV. Since we already determined the masses of the W bosons, we consider
only the remaining three fields $A^3_\mu$, $Z'_\mu$ and $B_\mu$. Computing again Dµ we find:
$$\begin{pmatrix}
\frac{ig}{2}A^3_\mu + igtB_\mu + \frac{ig}{\sqrt{3 - 4t^2}}\left(\frac{1}{2} - 2t^2\right)Z'_\mu & 0 & 0 \\
0 & -\frac{ig}{2}A^3_\mu + igtB_\mu + \frac{ig}{\sqrt{3 - 4t^2}}\left(\frac{1}{2} - 2t^2\right)Z'_\mu & 0 \\
0 & 0 & -\frac{ig}{\sqrt{3 - 4t^2}}Z'_\mu
\end{pmatrix}$$
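Using now (9.20), we find:
$$\mathrm{Trace}\left(\sum_{i=1}^{2}|D_\mu\phi_i|^2\right) = \left(-\frac{ig}{2}A^3_\mu + igtB_\mu + \frac{ig}{2\sqrt{3 - 4t^2}}(1 - 4t^2)Z'_\mu\right)^2\frac{v^2}{2} + \left(-\frac{ig}{\sqrt{3 - 4t^2}}Z'_\mu\right)^2\left(f^2 - \frac{v^2}{2}\right)$$
$$= \frac{g^2v^2}{2}\left(-\frac{A^3_\mu}{2} + tB_\mu + \frac{1 - 4t^2}{2\sqrt{3 - 4t^2}}Z'_\mu\right)^2 + \frac{g^2\left(f^2 - \frac{v^2}{2}\right)}{3 - 4t^2}(Z'_\mu)^2. \qquad (9.28)$$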
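$A^3_\mu$ and $B_\mu$ will mix to form the massless photon Aµ and neutral massive $Z^0_\mu$. As I mentioned earlier,
$Z^0_\mu$ will also mix with the new heavy $Z'_\mu$. Therefore we distinguish between $(Z')_\mu$ and $(Z_\mu)'$, respectively
before and after EWSB. Rewriting (9.28) and using the mass matrix gives:
$$\begin{pmatrix} A^3_\mu & B_\mu & Z'_\mu \end{pmatrix} M \begin{pmatrix} (A^3)_\mu \\ B_\mu \\ (Z')_\mu \end{pmatrix} \Rightarrow \begin{pmatrix} A_\mu & Z^0_\mu & (Z_\mu)' \end{pmatrix} M_{\mathrm{diag}} \begin{pmatrix} A_\mu \\ Z^0_\mu \\ (Z_\mu)' \end{pmatrix}$$
where the mass matrix is given by:
$$M = \frac{g^2v^2}{2}\begin{pmatrix}
\frac{1}{4} & -\frac{t}{2} & -\frac{1 - 4t^2}{4\sqrt{3 - 4t^2}} \\
-\frac{t}{2} & t^2 & \frac{t(1 - 4t^2)}{2\sqrt{3 - 4t^2}} \\
-\frac{1 - 4t^2}{4\sqrt{3 - 4t^2}} & \frac{t(1 - 4t^2)}{2\sqrt{3 - 4t^2}} & \frac{(1 - 4t^2)^2}{4(3 - 4t^2)} + \frac{-1 + \frac{2f^2}{v^2}}{3 - 4t^2}
\end{pmatrix}$$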
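The eigenvalues are computed using Mathematica and are to O(1/f) found to be:
$$\lambda_1 = 0, \qquad \lambda_{2,3} = \frac{g^2v^2}{2}\,\frac{2f^2\sqrt{3 - 4t^2} \pm 2f^2\sqrt{3 - 4t^2}\sqrt{1 - (3 - 4t^2)(1 + 4t^2)\frac{v^2}{2f^2}}}{2v^2(3 - 4t^2)\sqrt{3 - 4t^2}}$$
$$= \frac{g^2}{2}\,\frac{f^2 \pm f^2\left(1 - \frac{1}{2}(3 - 4t^2)(1 + 4t^2)\frac{v^2}{2f^2}\right)}{3 - 4t^2}$$
$$= \frac{g^2}{2}\left(\frac{f^2 \pm f^2}{3 - 4t^2}\right) \mp \frac{g^2v^2}{8}(1 + 4t^2)$$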
Indeed, as of course it should, setting v = 0 reproduces the eigenvalues we found earlier when we let
only φ assume a VEV. Thus in the correct basis of eigenstates:
$$\frac{g^2v^2}{2}\begin{pmatrix} A_\mu & Z^0_\mu & Z'_\mu \end{pmatrix}\begin{pmatrix} 0 & 0 & 0 \\ 0 & \frac{g^2v^2}{8}(1 + 4t^2) & 0 \\ 0 & 0 & \frac{g^2f^2}{3 - 4t^2} - \frac{g^2v^2}{8}(1 + 4t^2) \end{pmatrix}\begin{pmatrix} A_\mu \\ Z^0_\mu \\ Z'_\mu \end{pmatrix}$$
From this we identify46:
• a massless photon Aµ,
• a massive neutral Z0 boson with $M^2_{Z^0} = \frac{g^2v^2}{4}(1 + 4t^2)$,
• a heavy neutral Z′ boson with $M^2_{Z'} = g^2f^2\frac{2}{3 - 4t^2} - \frac{g^2v^2}{4}(1 + 4t^2)$.
Computation of the corresponding eigenvectors results in the following linear combinations for the
mass eigenstates after EWSB:
$$A = \frac{(2t)A^3_\mu + B_\mu}{4t^2 + 1}, \qquad Z^0 = \frac{xA^3_\mu + 2txB_\mu + Z'}{x^2 + 4t^2x^2 + 1}, \qquad (Z)' = \frac{yA^3_\mu + 2tyB_\mu + Z'}{y^2 + 4t^2 + 1}$$
where I defined:
$$x = -\frac{8}{(1 - 4t^2)(1 + 4t^2)\sqrt{3 - 4t^2}\,(v^2/f^2)}$$
and
$$y = \frac{(1 - 4t^2)\sqrt{3 - 4t^2}\,(v^2/f^2)}{8} = -\frac{1}{x(1 + 4t^2)}$$
9.2.6 Cancellation of the W boson loop
Using the expressions for the charged gauge bosons and (9.20) we can determine whether the quadratic
divergencies due to the SM W± bosons are indeed cancelled by the new heavy bosons. The relevant
part of the covariant derivative being $A^a_\mu T_a$, we must evaluate:
$$|D_\mu|^2\sum_{i=1}^{2}\phi_i\phi_i^\dagger = \frac{g^2}{2}\mathrm{Trace}\left[\begin{pmatrix} 0 & W^+ & (W')^- \\ W^- & 0 & (W')^0 \\ (W')^+ & (\bar{W}')^0 & 0 \end{pmatrix}^2\begin{pmatrix} 0 & 0 & 0 \\ 0 & \frac{v^2}{2} & 0 \\ 0 & 0 & f^2 - \frac{v^2}{2} \end{pmatrix}\right].$$
46In agreement with [19].
where we assumed the Higgs VEV of $(0, v/\sqrt{2})^T$, i.e. $\langle hh^\dagger\rangle = v^2/2$. Then we find:
$$|D_\mu|^2\sum_{i=1}^{2}\phi_i\phi_i^\dagger = \frac{g^2}{2}\left[(W')^+(W')^- + W^0\bar{W}^0\right]\left(f^2 - \frac{v^2}{2}\right) + \frac{v^2}{2}\left[W^+W^- + W^0\bar{W}^0\right] = \frac{g^2}{2}W^0\bar{W}^0f^2 + \frac{v^2}{2}\left[W^+W^- - (W')^+(W')^-\right] \qquad (9.29)$$
Thus, from (9.29) we see that once the Higgs field has assumed its VEV the divergencies from the
charged W± bosons are precisely cancelled by the heavy (W′)± bosons47. Looking back at the W±
and (W′)± masses in (9.21) we see that this already gave us an indication of the cancellations shown
here. We see the same relation between the masses of the SM Z0 and the new heavy Z′. Indeed,
when rewriting the covariant derivative in terms of the eigenstates after EWSB, one can, similarly to
the case of the W bosons, show that the quadratic divergencies from the SM Z0 are cancelled by the
heavy Z′. I will not show this here.
Figure 6: The quadratically divergent contributions from the W bosons and the heavy W′.
They couple equally to the Higgs but with opposite sign and thus cancel.
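The sign structure behind this cancellation can be made explicit numerically. The sketch below (hypothetical function names) evaluates (g²/2) Tr(W² Σφφ†) for the off-diagonal boson matrix of this section and extracts the coefficient of v², i.e. the coupling of each boson pair to the Higgs VEV. It is a check of the algebra only, not a loop calculation.

```python
def quadratic_form(w_plus, w_prime_plus, w_zero, g, v, f):
    """(g^2 / 2) * Tr(W^2 * diag(0, v^2/2, f^2 - v^2/2)) for given field amplitudes."""
    wp, wpp, w0 = complex(w_plus), complex(w_prime_plus), complex(w_zero)
    w = [
        [0j, wp, wpp.conjugate()],
        [wp.conjugate(), 0j, w0],
        [wpp, w0.conjugate(), 0j],
    ]
    sigma = [0.0, v**2 / 2.0, f**2 - v**2 / 2.0]
    # Diagonal entries of W^2, weighted by the diagonal of sum_i phi_i phi_i^dagger.
    diag_sq = [sum(w[i][k] * w[k][i] for k in range(3)) for i in range(3)]
    return (g**2 / 2.0) * sum((diag_sq[i] * sigma[i]).real for i in range(3))

def v2_coefficient(w_plus, w_prime_plus, w_zero, g, f):
    """Coefficient of v^2; the form is linear in v^2, so one finite difference is exact."""
    at_zero = quadratic_form(w_plus, w_prime_plus, w_zero, g, 0.0, f)
    at_one = quadratic_form(w_plus, w_prime_plus, w_zero, g, 1.0, f)
    return at_one - at_zero
```

Switching on only W± gives a coefficient +g²/4, only (W′)± gives −g²/4, and the neutral heavy pair has no v² dependence at all, which is the equal-and-opposite coupling shown in Figure 6.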
By now we have introduced many aspects of the Little Higgs theories. We have seen here that an
SU(3) based Little Higgs model is able to cancel the most dangerous quadratic divergencies from the
Standard Model by introducing a heavy top partner T and additional heavy gauge bosons with masses
at the TeV scale. I introduced the concept of collective symmetry breaking, which ensures that any
contribution to the Higgs mass must contain both couplings meaning the contribution can only be log-
divergent. However, we also noted two shortcomings of the SU(3) based model, namely its inability to
cancel the contribution of the down-type quark and to generate a quartic potential. Resolving these
problems requires extension of the gauge group. One such group is SU(5). In section 11 we will have
a look at ’The Littlest Higgs’ that is based on SU(5). Regarding the latter of these problems I will
show that this model is able to generate a quartic potential. The next section will serve as a general
introduction to the SU(5) gauge group with respect to the transformation properties of particles.
47The term W 0W 0f2 might seem troublesome. It represents a coupling of the W 0W 0 to the φ fields after
they assumed their VEV. Its one loop diagram is quadratically divergent. However, this does not pose a
problem because under the 1 TeV scale the SM only contains the h doublet.
10 Representations, particle multiplets and symmetry breaking
In the ”Simplest Little Higgs” we broke a global SU(3) symmetry to SU(2). We looked at the
gauge sector of the model and saw that enlarging the gauge group SU(2)W to SU(3)W introduced
5 new heavy gauge bosons (and of course the heavy top partner). However, when postulating a
model with a larger symmetry group, we need to indicate how the particles will transform under this
enlarged symmetry group. This is where representation theory steps in. In the introduction I
already mentioned there is a link between representation theory and particle physics. The observation
follows when one considers particles as vectors in Cn. When a model has a certain symmetry group,
this symmetry group has irreducible representations of different dimensions. These irreducible
representations can be associated with the transformation properties of the different particles under the
symmetry group. Each particle is assigned to a certain representation which tells us how it transforms.
The left-handed electron doublet, for one, is placed in the two-dimensional representation of SU(2)W
while the right-handed electron transforms as the trivial one-dimensional representation. Physically
this corresponds to the observation that only left-handed particles feel the weak force. More
explanation can be found in Appendix C.2 on Isospin. Recall now from sections 4 and 5.2 that
the irreducible representations of SU(N) can be parametrized by Young diagrams with at most
N rows. Rows correspond to fully symmetric states, columns to fully antisymmetric states, and
columns of length N may be omitted. The dimension of
the irreducible representations can be computed using (4.5) and we will refer to the representations
by their dimension, e.g. denote the eight dimensional representation by 8. Another important result
are the branching rules we derived for SU(N) → SU(N − 1) and SU(N + M) → SU(N) × SU(M).
Their relation to the previous will become clear with the following example where I discuss how we can
assign the gauge bosons from the previous section to the SU(3)W and SU(2)W representations48.
Example 10.1. The particles have to be assigned to the SU(3)W representations in such a way that
it is consistent with the SU(2)W transformation properties after spontaneous symmetry breaking. In
other words, they have to be distributed over the SU(3) representations in such a way that, when
we consider them as SU(2) representations, everything is consistent with the Standard Model. This
decomposition of irreducible SU(3) representations into SU(2) representations is given by the
branching rules. The 8 gauge bosons $A^a_\mu$ of SU(3)W are placed in the 8, whose Young diagram has rows
of length 2 and 1. $A^X_\mu$ is a singlet in U(1)X . We already determined the branching of the 8 representation
of SU(3) in example 5.1. We found:
8 = 2 · 2 + 1 + 3.
Thus after SU(3) breaking we have 2 SU(2) doublets, a singlet and a triplet. The triplet we associate
with the standard model triplet of $A^1_\mu$, $A^2_\mu$, $A^3_\mu$. The singlet corresponds to $Z'_\mu$. $B_\mu$ is a singlet under
U(1)Y . What remains are the 2 doublets. To these we assign the new heavy charged W bosons:
$((W')^+, (W')^-)$ and $(W^0, \bar{W}^0)$, which is in agreement with [19].
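The branching 8 = 2 · 2 + 1 + 3 can be reproduced mechanically. The sketch below assumes the interlacing form of the GL(n) → GL(n − 1) branching rule (a partition (l1, l2, l3) restricts to all (m1, m2) with l1 ≥ m1 ≥ l2 ≥ m2 ≥ l3) and uses that the SU(2) dimension of (m1, m2) is m1 − m2 + 1. The function name is illustrative.

```python
def branch_su3_to_su2(partition):
    """Sorted SU(2) dimensions in the restriction of the SU(3) irrep (l1, l2, l3)."""
    l1, l2, l3 = partition
    dims = []
    for m1 in range(l2, l1 + 1):        # l1 >= m1 >= l2
        for m2 in range(l3, l2 + 1):    # l2 >= m2 >= l3
            dims.append(m1 - m2 + 1)
    return sorted(dims)
```

For the adjoint, `branch_su3_to_su2((2, 1, 0))` returns one singlet, two doublets and one triplet, and the dimensions add up to 8 as they must.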
In the next paragraph we will have a more thorough look at the SU(5) case49. Knowing the SU(5)
transformation properties is important for two reasons. First of all, we already noted that the Simplest
Little Higgs as discussed above was unsuccessful in generating a quartic potential for the Higgs and
canceling the divergencies that come from the down-type quarks. The minimal model that does
succeed in this and is consistent with the Standard Model is called the ”Littlest Higgs” and it is based on
a global SU(5) symmetry. Secondly, SU(5) enters in Grand Unified Theories because it is the minimal
symmetry group with the SM gauge group as a subgroup. Here I will discuss a part of an SU(5) based
theory focusing on transformation properties of elementary particles and show that all elementary
particles of the SM can be accommodated in the 5 lowest dimensional SU(5) representations.
48Recall that we also had U(1)X and U(1)Y . But, since their generators commute with the generators of
SU(3) and SU(2) they aren’t complicating factors.
49It will be based on [14].
10.1 SU(5)→ SU(3)C × SU(2)W × U(1)Y .
We will now investigate the representations and particle multiplets of the gauge group SU(5) that
has the standard model gauge group SU(3)c × SU(2)w × U(1)Y embedded in it. We can achieve this
embedding by embedding SU(3)C in the left upper 3 × 3 block and SU(2)w in the right lower 2 × 2
block and embed U(1) along the diagonal. Physically this embedding is motivated by the experimental
observation that the SU(3)C is completely blind with respect to the weak interaction SU(2)W×U(1)Y ,
meaning that red, blue and green quarks carry the same electric charge and weak hypercharge. This
implies that their generators must behave as unit matrices with respect to one another. Further, the
hyper charge generator must commute with both SU(3)C and SU(2)W by which the generator for
U(1)Y becomes $Y = \lambda_{24}$.50 This is because the leptons are color singlets and therefore the SU(3)
generators must have zero eigenvalues.
Remark on notation In the following we will adopt a different notation for the Young diagrams
that will prove useful. We will label the Young diagram by (p1, . . . , pn−1) where pi counts the number
of rows of length i. The conjugate diagram can be obtained by reversing the order of these numbers
and is of the same dimension as the original representation. Finally, we define the fundamental
representation for any n to be the Young diagram corresponding to exactly one pi = 1 and all others zero.
In terms of particles: if a right-handed particle transforms in the d representation then the charge
conjugated left-handed particle (i.e. antiparticle) transforms in the conjugated representation, to which
we will refer as $\bar{d}$. Note that for SU(2) the representations 2 and $\bar{2}$ coincide.
Now, back to SU(5). In order to accommodate all fermions and gauge bosons, it will be sufficient to
consider only the 5 lowest dimensional representations. Recall that in section 5.2 we already derived
the branching of the 5 lowest dimensional SU(5) representations in terms of Young diagrams. The
result is shown in table 1.51 Note that the correct hypercharges are not yet assigned; this will later be
corrected. First we focus on assigning the elementary fermions of the SM to the SU(5) representations.
Table 1: The (SU(3)×SU(2)×U(1)Y ) decomposition of the irreducible SU(5) representations
SU(5) representation (dimension)    (SU(3), SU(2)) decomposition
5                                   (3,1) ⊕ (1,2)
$\bar{5}$                           ($\bar{3}$,1) ⊕ (1,2)
10                                  (3,2) ⊕ ($\bar{3}$,1) ⊕ (1,1)
15                                  (6,1) ⊕ (3,2) ⊕ (1,3)
24                                  (1,1) ⊕ (1,3) ⊕ (3,2) ⊕ ($\bar{3}$,2) ⊕ (8,1)
Assigning the elementary fermions
We begin by recalling that the left-handed lepton doublet $(\nu_e, e^-)$ has (SU(3), SU(2))Y quantum
numbers $(1,2)_{-1}$. In the 5 representation the hypercharge generator is related to λ24:
$$Y = \sqrt{\frac{5}{3}}\lambda_{24} = \mathrm{diag}\left(-\frac{2}{3}, -\frac{2}{3}, -\frac{2}{3}, 1, 1\right)$$
50The explicit form of the SU(5) generators can be found in [14].
51We can without problems add the hypercharge group because it commutes with SU(3) and SU(2).
The normalization is inserted so the left-handed lepton doublet $(\nu_e, e^-)_L$ can be assigned to the
last two components of the $\bar{5}$ representation $(\psi_i)_L$, i = 1, . . . 5. In this representation the hypercharge
operator is given by −Y 52. Just as λ24 corresponds to the hypercharge generator we can identify the
SU(2) isospin generator $T^3 = \tau_3/2$ with λ23. The charge operator in the 5 representation can then be
obtained by using the Gell-Mann-Nishijima relation. Then:
$$Q = T_3 + \frac{Y}{2} = \frac{\lambda_{23}}{2} + \sqrt{\frac{5}{12}}\lambda_{24} = \mathrm{diag}\left(-\frac{1}{3}, -\frac{1}{3}, -\frac{1}{3}, 1, 0\right) \qquad (10.1)$$
In the $\bar{5}$ representation the charge operator is then given by −Q, producing indeed the correct charges
for the lepton doublet. The lepton anti-doublet, obtained by charge conjugation, $(e^C, -\nu^C_e)_R$ is
assigned to the 5 quintuplet $(\psi_i)_R$, i = 1, . . . 5. To the first three components we must assign an SU(3)
color triplet of right-handed particles with charge −1/3 that transforms as a singlet under SU(2).
This rules out everything apart from the triplet of down quarks $(d_r, d_b, d_g)_R$. The left-handed $\bar{5}$
quintuplet contains the conjugated left-handed antiparticles. Thus for i = 1, . . . , 5:
$$5 = (\psi_i)_R = \begin{pmatrix} d_r & d_b & d_g & e^+ & -\nu^C_e \end{pmatrix}_R, \qquad \bar{5} = (\psi^C_i)_L = \begin{pmatrix} d^C_r & d^C_b & d^C_g & e^- & \nu_e \end{pmatrix}_L$$
Now, for these assignments we did not yet explicitly need table 1. However, to assign the left-handed
color quark doublets $\begin{pmatrix} u_r, u_b, u_g \\ d_r, d_b, d_g \end{pmatrix}_L$ and the singlets $e^+_L$ and $(u^C_r, u^C_b, u^C_g)_L$ to the SU(5)
multiplets we will need it. Since the (SU(3), SU(2))Y transformation properties of the fermions are
known, we can use the result of this table to assign the remaining fermions to the SU(5) multiplets by
comparison of the transformation properties. When expressed in left-handed particles and antiparticles
these are53:
$u_L, d_L$ : $(3,2)_{1/3}$
$d^C_L$ : $(\bar{3},1)_{2/3}$
$u^C_L$ : $(\bar{3},1)_{-4/3}$
$\nu^C_L, e_L$ : $(1,2)_{-1}$
$e^C_L$ : $(1,1)_{2}$
Comparison with table 1 shows that the remaining fermions can be assigned to the antisymmetric 10
representation.
$$10 = \begin{pmatrix} (\bar{3},1) & (3,2) \\ (3,2) & (1,1) \end{pmatrix} \qquad (10.2)$$
52See appendix C.2.2.
Now, whereas the 5 and $\bar{5}$ representations were represented by one-component tensors $\psi_i$, the 10
is represented by an antisymmetric matrix, i.e. a two component tensor $\psi_{ij}$ that satisfies $\psi_{ij} = -\psi_{ji}$.
The left-handed color vector $(u^C_i)_L$, i = r, b, g = 1, 2, 3 can be turned into an antisymmetric 3×3 matrix
using an epsilon contraction: $\sum_k \epsilon_{ijk}(u^C_k)_L$. Similarly, we can construct from the $e^C_L$ an antisymmetric
2 × 2 matrix through $\epsilon_{ij}e^C$. We then get:
$$10 = \psi_{ij} = \frac{1}{\sqrt{2}}\begin{pmatrix}
0 & u^C_g & -u^C_b & -u_r & -d_r \\
-u^C_g & 0 & u^C_r & -u_b & -d_b \\
u^C_b & -u^C_r & 0 & -u_g & -d_g \\
u_r & u_b & u_g & 0 & e^+ \\
d_r & d_b & d_g & -e^+ & 0
\end{pmatrix}_L$$
where the $\frac{1}{\sqrt{2}}$ is a normalization factor to compensate for every particle appearing twice. Thus the
fermions of the standard model can all be assigned to the 5, $\bar{5}$ and 10 multiplets of SU(5). The second
and third generations are handled similarly.
Correcting the hypercharge
It remains now to correct the hypercharge for all the particles. From example C.1 and (C.6) it
also follows that the hypercharge of a product of multiplets is the sum of their individual
hypercharges. Using further that in the 5 representation $Y = \sqrt{\frac{5}{3}}\lambda_{24}$ we deduce that we can then identify
the hypercharges of (3,1) and (1,2) as −2/3 and 1 respectively. The hypercharges of the other
multiplets can then be determined by taking tensor products of 5 = (3,1) ⊕ (1,2) and its conjugate
and summing the hypercharges. Explicitly this gives:
$$5 \times 5 = [(3,1)_{-2/3} \oplus (1,2)_{1}] \times [(3,1)_{-2/3} \oplus (1,2)_{1}]$$
$$= [(3 \times 3, 1 \times 1)_{-4/3} + (3 \times 1, 1 \times 2)_{1/3} + (1 \times 3, 2 \times 1)_{1/3} + (1 \times 1, 2 \times 2)_{2}]$$
$$= [(6 \oplus \bar{3}, 1)_{-4/3} + 2 \cdot (3,2)_{1/3} + (1, 1 \oplus 3)_{2}]$$
$$= [(6,1)_{-4/3} + (\bar{3},1)_{-4/3} + 2 \cdot (3,2)_{1/3} + (1,1)_{2} + (1,3)_{2}] \qquad (10.3)$$
and
$$\bar{5} \times 5 = [(\bar{3},1)_{2/3} \oplus (1,2)_{-1}] \times [(3,1)_{-2/3} \oplus (1,2)_{1}]$$
$$= [(\bar{3} \times 3, 1 \times 1)_{0} + (\bar{3} \times 1, 1 \times 2)_{5/3} + (1 \times 3, 2 \times 1)_{-5/3} + (1 \times 1, 2 \times 2)_{0}]$$
$$= [(8 \oplus 1, 1)_{0} + (\bar{3},2)_{5/3} + (3,2)_{-5/3} + (1, 1 \oplus 3)_{0}]$$
$$= [(8,1)_{0} + (1,1)_{0} + (\bar{3},2)_{5/3} + (3,2)_{-5/3} + (1,3)_{0} + (1,1)_{0}] \qquad (10.4)$$
53See appendix C.2.2.
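The bookkeeping in (10.3) and (10.4) amounts to multiplying dimensions and adding hypercharges block by block. The sketch below does only that (it tracks dimensions, so it does not distinguish a triplet from an anti-triplet, and it does not split the products into irreducible pieces); the names are illustrative.

```python
from fractions import Fraction as F
from itertools import product

# Each block: ((dim under SU(3), dim under SU(2)), hypercharge).
FIVE = [((3, 1), F(-2, 3)), ((1, 2), F(1))]
FIVE_BAR = [((3, 1), F(2, 3)), ((1, 2), F(-1))]  # colour factor is the anti-triplet

def product_blocks(a, b):
    """Blocks of a tensor product as (SU(3) dim, SU(2) dim, summed hypercharge)."""
    return [(da[0] * db[0], da[1] * db[1], ya + yb) for (da, ya), (db, yb) in product(a, b)]

def total_dimension(blocks):
    return sum(d3 * d2 for d3, d2, _ in blocks)
```

`product_blocks(FIVE, FIVE)` gives the hypercharges −4/3, 1/3, 1/3 and 2 of the second line of (10.3), and `product_blocks(FIVE_BAR, FIVE)` gives 0, 5/3, −5/3 and 0 as in (10.4); in both cases the total dimension is 25.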
Assigning the gauge bosons
The gauge bosons of SU(5) correspond to the group generators and thus have to be assigned to
the representation of dimension $5^2 - 1 = 24$ and we expect there to be 12 gauge bosons in addition
to the 8 gluons of SU(3), 3 $W^i_\mu$ of SU(2) and the hypercharge boson Bµ. We already derived the
decomposition of this representation in terms of SU(3) × SU(2) representations.
$$24 = (1,1)_{0} \oplus (1,3)_{0} \oplus (3,2)_{-5/3} \oplus (\bar{3},2)_{5/3} \oplus (8,1)_{0}$$
With our knowledge of the transformation properties of the standard model gauge bosons we can
immediately assign the SU(3) octet of gluons to the (8,1), the 3 SU(2) gauge bosons $W^i_\mu$ to (1,3)
and Bµ to (1,1). The remaining 12 gauge bosons are thus assigned to the (3,2) and ($\bar{3}$,2), which
are each 6 dimensional. The 12 new gauge bosons thus form 2 colored isospin doublets. We
will denote this as:
$$(3,2) = \begin{pmatrix} X_r & X_b & X_g \\ Y_r & Y_b & Y_g \end{pmatrix} \qquad (10.5)$$
We can embed the gauge bosons in the 24 representation in the same way we did for the fermions in
the 10. Explicitly:
$$24 = \begin{pmatrix} (8,1)_{0} & (\bar{3},2)_{5/3} \\ (3,2)_{-5/3} & (1,3)_{0} \end{pmatrix} + (1,1)_{0}$$
where the matrix is a 5 × 5 traceless matrix.
Concluding word on SU(5)
We have seen here that as a Grand Unification gauge group, SU(5) is able to accommodate all
elementary fermions and the gauge bosons. In [14] the Lagrangian for an SU(5) based gauge theory
is derived. With this it is possible to derive relations, make predictions and to see if those are
consistent with the Standard Model. For one, they show that the SU(5) group predicts the quantization of
charge. Fixing for example the electron charge within the SU(5) gauge group, the charges of all other
particles can be determined. Of course, this is provided they are arranged in correspondence with
their quantum numbers and color. They also show that under the assumption of an unbroken SU(5)
gauge theory the prediction for the Weinberg angle is
$$\sin^2(\theta_W) = \frac{3}{8}$$
which deviates from the Standard Model prediction $\sin^2(\theta_W) \approx 0.23$. This, and the fact we know
that the SU(2) × U(1) symmetry is spontaneously broken by the Higgs field VEV, indicates that
at low energies SU(5) cannot be a correct symmetry. It must thus be spontaneously broken to
the Standard Model gauge group at some energy scale, and only above this energy is SU(5) an exact
symmetry and will the Weinberg angle approach this value. Since no hints of any influence of the
additional gauge fields X and Y have been found at any experiment, the unbroken SU(5) symmetry is
only realized at very high energies. When the well-known lifetime of the proton54 is taken into account
it can be shown that the breaking of the symmetry must occur at about $10^{15}$ GeV.
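The value 3/8 can be recovered from the charge assignments of the 5 alone. The sketch below relies on the standard group-theoretic statement that in a unified theory sin²θW = Tr(T₃²)/Tr(Q²) when the traces are taken over a complete multiplet; this relation is not derived in this text and is used here as an assumption. The function name is hypothetical.

```python
from fractions import Fraction as F

def sin2_theta_w_unified():
    """Tr(T3^2) / Tr(Q^2) evaluated on the 5 of SU(5) (assumed unification relation)."""
    t3 = [F(0)] * 3 + [F(1, 2), F(-1, 2)]
    q = [F(-1, 3)] * 3 + [F(1), F(0)]   # diagonal of Q from (10.1)
    return sum(x * x for x in t3) / sum(x * x for x in q)
```

With Tr(T₃²) = 1/2 and Tr(Q²) = 4/3 the ratio is exactly 3/8, in line with the prediction quoted above.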
In the following section I will get back to SU(5) in the context of Little Higgs models where we discuss
the ’Littlest Higgs’.
54It is about $10^{34}$ years.
11 The Littlest Higgs
The Littlest Higgs is the minimal Little Higgs model that suffices as a theory to extend the SM. It is
based on a coset [SU(5)/SO(5)] and was the first viable Little Higgs model constructed in 2002 by
Arkani-Hamed et al. [20].
11.1 Requirements for the model
In the Littlest Higgs the Higgs is realized as the pseudo-NGB of a global symmetry group G that is
spontaneously broken to a global symmetry group H at the energy scale f ∼ 1 TeV. The model is
required to be an extension of the SM and thus the subgroup H must contain a copy of the SM gauge
group SU(2)W ×U(1)Y . Secondly, to prevent the Higgs from receiving quadratic divergencies, we
assume that the group G contains two gauged copies of SU(2)×U(1): $G \supset G_1 \times G_2 = [SU(2) \times U(1)]^2$
that are diagonally broken to the SM gauge group. Both Gi are required to commute with a different
subgroup Xi of G. In this way they both preserve enough of the global symmetry to ensure that the
Higgs remains an exact NGB. Only when we include both of the gauge groups can the Higgs acquire a
mass term and this contribution can be at most logarithmically divergent55. To implement this, Xi is
chosen to be SU(3). This way G contains two different subgroups that are each of the form Gi ×Xi
and each Xi contains an SU(2)× U(1) subgroup.
Breaking the symmetry
To obtain the required symmetry breaking, we observe that the SU(5) Lie algebra has $5^2 - 1 = 24$
generators. 14 of those are symmetric, 10 of those are antisymmetric. The latter are precisely the 10
generators of the SO(5) Lie algebra. To break the 14 symmetric generators the following VEV for the
Σ field is chosen56:
$$\langle\Sigma\rangle = \Sigma_0 = \begin{pmatrix} & & 1_{2\times 2} \\ & 1 & \\ 1_{2\times 2} & & \end{pmatrix} \qquad (11.1)$$
Under a general SU(5) transformation $U = e^{i\theta_aT_a}$, with $T_a = \frac{1}{2}\lambda_a$, it transforms as $\Sigma \to U\Sigma U^T$. To
see why this VEV produces the correct result we redefine the SU(5) generators λa by introducing the
following unitary symmetric matrix [25]:
$$A = \frac{1}{2}\begin{pmatrix} 1 + i & 0 & 1 - i \\ 0 & 2 & 0 \\ 1 - i & 0 & 1 + i \end{pmatrix} \qquad (11.2)$$
Direct calculation shows that this matrix satisfies $\Sigma_0 = A^2$ and $A = A^T$. Redefining the SU(5)
generators as $X_a \equiv A\lambda_aA^{-1}$ 57 we deduce:
$$X_a\Sigma_0 = A\lambda_aA^{-1}A^2 = A\lambda_aA = \pm(A\lambda_aA)^T = \pm(X_a\Sigma_0)^T = \pm\Sigma_0X_a^T. \qquad (11.3)$$
Obviously, the plus sign corresponds to a symmetric generator and the minus sign to an antisymmetric
generator. Using this we can show that the unbroken and broken generators respectively satisfy the
following relations:
$$T_a\Sigma_0 + \Sigma_0T_a^T = 0 \quad \text{(unbroken generators)} \qquad (11.4)$$
$$T_a\Sigma_0 - \Sigma_0T_a^T = 0 \quad \text{(broken generators)} \qquad (11.5)$$
Showing this is similar to what we did in section 6.3.2. An unbroken symmetry $O = \exp(i\theta_aX_a)$
preserves the vacuum: $O\Sigma_0O^T = \Sigma_0$. Expanding around the identity then leads to the following
condition on the generators:
$$\Sigma_0 = (1 + i\theta_aX_a)\Sigma_0(1 + i\theta_aX_a^T) = \Sigma_0 + i\theta_a(X_a\Sigma_0 + \Sigma_0X_a^T) + O(\theta_a^2) \qquad (11.6)$$
which clearly implies the above relations for the 10 unbroken generators. The 14 remaining generators
satisfy (11.5).
55After we have identified the Goldstone bosons this will be made more precise.
56There is no physical reason to justify this choice besides that it produces the required result.
57These satisfy the SU(5) Lie algebra as can be verified by direct calculation.
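The properties of A used in (11.3) can be confirmed by direct multiplication. The sketch below writes A and Σ₀ as explicit 5 × 5 matrices, reading each entry of the 3 × 3 block form as a multiple of the identity on the corresponding block, with third block row (1 − i, 0, 1 + i) as required by A = Aᵀ. The helper names are illustrative.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def dagger(a):
    return [[x.conjugate() for x in row] for row in transpose(a)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(len(a)) for j in range(len(a)))

def identity(n):
    return [[(1 + 0j) if i == j else 0j for j in range(n)] for i in range(n)]

def build_sigma0():
    """Sigma_0 with 2x2 identity blocks on the anti-diagonal and a 1 in the centre."""
    s = [[0j] * 5 for _ in range(5)]
    for k in range(2):
        s[k][k + 3] = 1 + 0j
        s[k + 3][k] = 1 + 0j
    s[2][2] = 1 + 0j
    return s

def build_a():
    """A = (1/2) [[(1+i) 1, 0, (1-i) 1], [0, 2, 0], [(1-i) 1, 0, (1+i) 1]] in block form."""
    p, m = (1 + 1j) / 2, (1 - 1j) / 2
    a = [[0j] * 5 for _ in range(5)]
    for k in range(2):
        a[k][k], a[k][k + 3] = p, m
        a[k + 3][k], a[k + 3][k + 3] = m, p
    a[2][2] = 1 + 0j
    return a
```

With these definitions A² reproduces Σ₀, A equals its transpose, and A†A is the identity.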
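The second requirement is also fulfilled by this VEV. It breaks the gauged subgroup $G_1 \times G_2 = SU(2)_1 \times U(1)_1 \times SU(2)_2 \times U(1)_2$ of G down to its diagonal subgroup $SU(2)_W \times U(1)_Y$. To see this we consider the generators of each $G_i$, which are defined as follows. For the first subgroup $G_1 = SU(2)_1 \times U(1)_1$:
\[
Q_1^a = \frac{1}{2}
\begin{pmatrix}
\tau^a & \\
 & 0_{3\times 3}
\end{pmatrix},
\qquad
Y_1 = \frac{1}{10}\,\mathrm{diag}(-3,-3,2,2,2), \tag{11.7}
\]
and similarly for $G_2 = SU(2)_2 \times U(1)_2$:
\[
Q_2^a = \frac{1}{2}
\begin{pmatrix}
0_{3\times 3} & \\
 & -\tau^{aT}
\end{pmatrix},
\qquad
Y_2 = \frac{1}{10}\,\mathrm{diag}(-2,-2,-2,3,3). \tag{11.8}
\]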
This way $G_1$ preserves a global SU(3) symmetry in the lower 3 × 3 block and $G_2$ in the upper 3 × 3 block. The $SU(2)_W \times U(1)_Y$ generators are then given by:
\[
Q^a = \frac{1}{\sqrt{2}}\left(Q_1^a + Q_2^a\right) \quad \text{and} \quad Y = Y_1 + Y_2 \tag{11.9}
\]
These combinations satisfy (11.4) and are thus left unbroken by the VEV. The orthogonal combinations
\[
Q'^a = \frac{1}{\sqrt{2}}\left(Q_1^a - Q_2^a\right) \quad \text{and} \quad Y' = Y_1 - Y_2 \tag{11.10}
\]
satisfy (11.5) and are broken by the VEV. The unbroken combinations correspond to the SM W and B bosons, which remain massless until EWSB occurs. The broken combinations correspond to the new heavy W′ and B′, which acquire masses of order f when the high-scale symmetry breaking occurs.
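The claim that the sum of the two sets of generators is unbroken while the difference is broken can be tested directly against (11.4) and (11.5). This is a small sketch in plain Python, assuming the explicit forms of (11.7) and (11.8) and the Pauli matrices for τ^a; the overall normalisation 1/√2 is dropped because it does not affect the test.

```python
# Sketch: check (11.4) for Q1 + Q2, Y1 + Y2 and (11.5) for Q1 - Q2, Y1 - Y2.

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(x):
    return [list(row) for row in zip(*x)]

def combine(x, y, sign):
    """Return x + sign * y entry by entry."""
    n = len(x)
    return [[x[i][j] + sign * y[i][j] for j in range(n)] for i in range(n)]

def is_zero(x, tol=1e-12):
    return all(abs(z) < tol for row in x for z in row)

SIGMA0 = [[0, 0, 0, 1, 0],
          [0, 0, 0, 0, 1],
          [0, 0, 1, 0, 0],
          [1, 0, 0, 0, 0],
          [0, 1, 0, 0, 0]]
PAULI = [[[0, 1], [1, 0]],
         [[0, -1j], [1j, 0]],
         [[1, 0], [0, -1]]]

def embed(block, offset, factor):
    """Place factor * block (2x2) on the diagonal of a 5x5 zero matrix."""
    m = [[0] * 5 for _ in range(5)]
    for i in range(2):
        for j in range(2):
            m[offset + i][offset + j] = factor * block[i][j]
    return m

def diag(entries):
    return [[entries[i] if i == j else 0 for j in range(5)] for i in range(5)]

Q1 = [embed(tau, 0, 0.5) for tau in PAULI]               # upper-left block, (11.7)
Q2 = [embed(transpose(tau), 3, -0.5) for tau in PAULI]   # lower-right block, (11.8)
Y1 = diag([-0.3, -0.3, 0.2, 0.2, 0.2])
Y2 = diag([-0.2, -0.2, -0.2, 0.3, 0.3])

def satisfies(t, sign):
    """True if T Sigma0 + sign * Sigma0 T^T vanishes (+1: unbroken, -1: broken)."""
    return is_zero(combine(matmul(t, SIGMA0), matmul(SIGMA0, transpose(t)), sign))

sum_is_unbroken = (all(satisfies(combine(q1, q2, +1), +1) for q1, q2 in zip(Q1, Q2))
                   and satisfies(combine(Y1, Y2, +1), +1))
difference_is_broken = (all(satisfies(combine(q1, q2, -1), -1) for q1, q2 in zip(Q1, Q2))
                        and satisfies(combine(Y1, Y2, -1), -1))
```

Both flags come out true, and the difference Y1 − Y2 fails the unbroken condition, as expected for a broken generator.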
Goldstone Bosons
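Breaking SU(5) → SO(5) results in 14 Goldstone bosons. We can parametrize them by expanding around the VEV $\Sigma_0$, i.e.
\[
\Sigma = e^{i\Pi/f}\,\Sigma_0\,e^{i\Pi^T/f} = e^{2i\Pi/f}\,\Sigma_0, \tag{11.11}
\]
where the second equality follows from (11.5) and f is the high symmetry breaking scale, of order 1 TeV. The full Goldstone matrix is given by
\[
\Pi = \pi^a T_a =
\begin{pmatrix}
\chi + \dfrac{\eta}{2\sqrt{5}} & \dfrac{h^*}{\sqrt{2}} & \phi^\dagger \\
\dfrac{h^T}{\sqrt{2}} & -\dfrac{2\eta}{\sqrt{5}} & \dfrac{h^\dagger}{\sqrt{2}} \\
\phi & \dfrac{h}{\sqrt{2}} & \chi^T + \dfrac{\eta}{2\sqrt{5}}
\end{pmatrix}.
\tag{11.12}
\]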
Here η is a real scalar field, $\chi = \chi^a\tau^a/2$ is a Hermitian 2 × 2 matrix, h is the SM complex Higgs doublet and φ is a heavy complex SU(2) Higgs triplet:
\[
h = \begin{pmatrix} h^+ \\ h^0 \end{pmatrix},
\qquad
\phi = \begin{pmatrix} \phi^{++} & \dfrac{\phi^+}{\sqrt{2}} \\ \dfrac{\phi^+}{\sqrt{2}} & \phi^0 \end{pmatrix}.
\]
Note that, similar to the SU(3) based model, the Higgs is again arranged in such a way that neither of the SU(2) generators includes h. In this way the Higgs will always remain an NGB when we break into the SU(2) subgroups. Together, these fields account for the 14 degrees of freedom and they transform under the unbroken $SU(2)_W \times U(1)_Y$ as:
\[
\mathbf{1}_0 \oplus \mathbf{3}_0 \oplus \mathbf{2}_{\pm\frac{1}{2}} \oplus \mathbf{3}_{\pm 1} \tag{11.13}
\]
where the bold number denotes the number of fields and the subscript the hypercharge.
Symmetries
Let's try to understand why this setup of the model succeeds by analyzing the symmetries. When we break SU(5) → SO(5), 14 Goldstone bosons appear, transforming as in (11.13). The first two sets, η and χ, will be eaten by the heavy gauge bosons when the gauged $[SU(2)\times U(1)]^2$ is broken to the SM electroweak group. The trick in this Little Higgs model is that by introducing two sets of gauge couplings⁵⁸, $g_1, g'_1$ and $g_2, g'_2$, we let the two SU(2) groups in the opposite corners of the Σ field mix. Only when both sets of couplings are non-zero can the Higgs acquire a mass term. To see this, observe that each of the $G_i$ gauge groups commutes with a different SU(3) global symmetry subgroup of SU(5). Suppose that we only include the gauge couplings $g_2$ and $g'_2$. Then the global SU(5) symmetry is explicitly broken to SU(3) × SU(2) × U(1), where SU(3) acts on the first three indices and SU(2) on the last two. This is then spontaneously broken to the electroweak group, producing 8 exact NGBs: the eaten η and χ, and the four that make up the Higgs doublet h. An analogous argument holds when only $g_1$ and $g'_1$ are turned on, except that SU(3) then acts on the last three indices and SU(2) on the first two. Since h shifts under the SU(3) symmetry, a mass term $hh^\dagger$ is forbidden and thus neither of the two gauged subgroups alone can generate a Higgs potential.
When, however, we include both sets of gauge couplings, enough of the global symmetry is broken to allow h to acquire a potential, which can be at most logarithmically divergent at one-loop level. In this case SU(5) is explicitly broken to the gauged subgroup $[SU(2)\times U(1)]^2$, which then spontaneously breaks to the electroweak group, producing only 4 exact Goldstone bosons, corresponding to η and χ, and thus making the Higgs doublet a pseudo-NGB. The SU(2) triplet φ, however, is not protected by the global symmetry and can pick up a potential which is quadratically divergent at one-loop level. This does not pose a problem, because below the 1 TeV scale the model only contains the Higgs doublet h.
We will get back to the potential for h after we have shown the cancellation of the quadratic divergences in the gauge sector and determined the masses of the new heavy gauge bosons.
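The Goldstone counting used in this argument is simple bookkeeping with group dimensions: the number of Goldstone bosons equals the number of broken generators. A short sketch (the helper names are illustrative, not from the text):

```python
# Sketch: count Goldstone bosons as (generators before) - (generators after).

def dim_su(n):
    """Number of generators of SU(n)."""
    return n * n - 1

def dim_so(n):
    """Number of generators of SO(n)."""
    return n * (n - 1) // 2

DIM_U1 = 1
DIM_ELECTROWEAK = dim_su(2) + DIM_U1          # SU(2)_W x U(1)_Y

# Global breaking SU(5) -> SO(5).
ngb_global = dim_su(5) - dim_so(5)

# Real degrees of freedom of eta, chi, h (complex doublet), phi (complex triplet).
field_content = 1 + 3 + 4 + 6

# Only one set of gauge couplings switched on:
# SU(3) x SU(2) x U(1) -> SU(2)_W x U(1)_Y.
ngb_one_set = (dim_su(3) + dim_su(2) + DIM_U1) - DIM_ELECTROWEAK

# Both sets switched on: [SU(2) x U(1)]^2 -> SU(2)_W x U(1)_Y.
ngb_both_sets = 2 * (dim_su(2) + DIM_U1) - DIM_ELECTROWEAK
```

The three counts are 14, 8 and 4, so with both sets of couplings present the four components of h are no longer exact Goldstone bosons.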
11.2 The Gauge bosons
We will now determine the masses of the new heavy gauge bosons W′ and B′ and see that they cancel the quadratic divergences from the SM W and B bosons. As usual, we will do this by considering the kinetic part of the Lagrangian and taking the trace:
\[
\mathcal{L}_{kin} = \frac{f^2}{8}\,\mathrm{Tr}\,|D_\mu\Sigma|^2 \tag{11.14}
\]
where the coefficient is chosen such that the resulting scalar terms are normalized and the covariant derivative is given by:
\[
D_\mu\Sigma = \partial_\mu\Sigma - i\sum_{j=1}^{2}\left[g_j W^a_{\mu j}\left(Q^a_j\Sigma + \Sigma Q^{aT}_j\right) + g'_j B_{\mu j}\left(Y_j\Sigma + \Sigma Y_j^T\right)\right] \tag{11.15}
\]
⁵⁸ $g_1$ and $g_2$ are the couplings for the two SU(2)'s and $g'_1$ and $g'_2$ are the couplings for the two U(1)'s of $G_1$ and $G_2$ respectively.
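To find the masses of the heavy gauge bosons and the corresponding mass eigenstates we have to consider the terms in (11.14) that are quadratic in the gauge fields and substitute $\Sigma = \Sigma_0$. Then, ignoring $\partial_\mu$ and omitting the space-time index μ, (11.14) becomes:
\[
\mathcal{L}_{kin}(\Sigma = \Sigma_0) = \frac{f^2}{8}\,\mathrm{Tr}\left[
\frac{1}{2}g_1 W_1^a \begin{pmatrix} & & \tau^a \\ & 0 & \\ \tau^{aT} & & \end{pmatrix}
- \frac{1}{2}g_2 W_2^a \begin{pmatrix} & & \tau^a \\ & 0 & \\ \tau^{aT} & & \end{pmatrix}
+ \frac{1}{10}g'_1 B_1 \begin{pmatrix} & & -\mathbb{1}_{2\times 2} \\ & 4 & \\ -\mathbb{1}_{2\times 2} & & \end{pmatrix}
+ \frac{1}{10}g'_2 B_2 \begin{pmatrix} & & \mathbb{1}_{2\times 2} \\ & -4 & \\ \mathbb{1}_{2\times 2} & & \end{pmatrix}
\right]^2
\]
\[
= \frac{f^2}{8}\left[g_1^2 (W_1^a)^2 + g_2^2 (W_2^a)^2 - 2g_1 g_2 W_1^a W_2^a + \frac{1}{5}g_1'^2 B_1^2 + \frac{1}{5}g_2'^2 B_2^2 - \frac{2}{5}g'_1 B_1\, g'_2 B_2\right] \tag{11.16}
\]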
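where we used $\tau_a^2 = \mathbb{1}_{2\times 2}$. Rewriting this in terms of the mass matrix we obtain:
\[
\mathcal{L}_{kin}(\Sigma = \Sigma_0) = \frac{1}{2}\frac{f^2}{4}
\begin{pmatrix} W_1^a & W_2^a \end{pmatrix}
\begin{pmatrix} g_1^2 & -g_1 g_2 \\ -g_1 g_2 & g_2^2 \end{pmatrix}
\begin{pmatrix} W_1^a \\ W_2^a \end{pmatrix}
+ \frac{1}{2}\frac{f^2}{20}
\begin{pmatrix} B_1 & B_2 \end{pmatrix}
\begin{pmatrix} g_1'^2 & -g'_1 g'_2 \\ -g'_1 g'_2 & g_2'^2 \end{pmatrix}
\begin{pmatrix} B_1 \\ B_2 \end{pmatrix}
\tag{11.17}
\]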
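Since this form of the mass matrix is familiar, we can immediately rewrite (11.17) in terms of the physical fields W, W′, B, B′ and read off their masses. Then we have⁵⁹:
\[
\mathcal{L}_{kin}(\Sigma = \Sigma_0) = \frac{1}{2}\frac{f^2}{4}
\begin{pmatrix} W & W' \end{pmatrix}
\begin{pmatrix} 0 & 0 \\ 0 & g_1^2 + g_2^2 \end{pmatrix}
\begin{pmatrix} W \\ W' \end{pmatrix}
+ \frac{1}{2}\frac{f^2}{20}
\begin{pmatrix} B & B' \end{pmatrix}
\begin{pmatrix} 0 & 0 \\ 0 & g_1'^2 + g_2'^2 \end{pmatrix}
\begin{pmatrix} B \\ B' \end{pmatrix}
\tag{11.18}
\]
where we define the physical fields in terms of the mixing angles s, s′, c, c′:
\[
W = sW_1 + cW_2, \qquad W' = -cW_1 + sW_2, \qquad
B = s'B_1 + c'B_2, \qquad B' = -c'B_1 + s'B_2 \tag{11.19}
\]
with
\[
s = \frac{g_2}{\sqrt{g_1^2 + g_2^2}}, \qquad
c = \frac{g_1}{\sqrt{g_1^2 + g_2^2}}, \qquad
s' = \frac{g'_2}{\sqrt{g_1'^2 + g_2'^2}}, \qquad
c' = \frac{g'_1}{\sqrt{g_1'^2 + g_2'^2}}.
\]
⁵⁹ Note that I omitted the superscript a for clarity.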
W and B thus remain massless and are identified with the SM gauge bosons. W′ and B′ are the new heavy gauge bosons and have masses at the TeV scale:
\[
M_{W'}^2 = \frac{f^2}{4}\left(g_1^2 + g_2^2\right) \quad \text{and} \quad M_{B'}^2 = \frac{f^2}{20}\left(g_1'^2 + g_2'^2\right)
\]
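To show the desired cancellation of the quadratic divergences we have to rewrite (11.16) in terms of the physical gauge bosons. For this we expand the Σ field around its VEV. When gauging away η and χ we get the following expansion for (11.11) in powers of 1/f:
\[
\Sigma = \Sigma_0 + \frac{2i}{f}
\begin{pmatrix}
\phi^\dagger & \dfrac{h^*}{\sqrt{2}} & 0 \\
\dfrac{h^\dagger}{\sqrt{2}} & 0 & \dfrac{h^T}{\sqrt{2}} \\
0 & \dfrac{h}{\sqrt{2}} & \phi
\end{pmatrix}
+ O\!\left(\frac{1}{f^2}\right) \tag{11.20}
\]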
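Substituting this in (11.14) and using (11.19) results in the following expression for the couplings of the gauge bosons to two scalars (h and φ) [21]. For the W, W′ bosons:
\[
\mathcal{L}_{kin}(W\cdot W) = \frac{g^2}{4}\left[W^a W^b - \frac{c^2 - s^2}{sc}\,W^a W'^b\right]\mathrm{Tr}\left[h^\dagger h\,\delta^{ab} + 2\phi^\dagger\phi\,\delta^{ab} + 2\sigma^a\phi^\dagger\sigma^{bT}\phi\right]
- \frac{g^2}{4}\left[W'^a W'^a\,\mathrm{Tr}\left[h^\dagger h + 2\phi^\dagger\phi\right] - \frac{c^4 + s^4}{2s^2c^2}\,W'^a W'^b\,\mathrm{Tr}\left[2\sigma^a\phi^\dagger\sigma^{bT}\phi\right]\right] \tag{11.21}
\]
and for the B, B′ bosons:
\[
\mathcal{L}_{kin}(B\cdot B) = g'^2\left[B^2 - \frac{c'^2 - s'^2}{s'c'}\,BB'\right]\mathrm{Tr}\left[\frac{1}{4}h^\dagger h + \phi^\dagger\phi\right]
- g'^2\left[B'^2\,\mathrm{Tr}\left[\frac{1}{4}h^\dagger h\right] - \frac{(c'^2 + s'^2)^2}{4s'^2c'^2}\,B'^2\,\mathrm{Tr}\left[\phi^\dagger\phi\right]\right] \tag{11.22}
\]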
These two expressions show that the divergences from the SM W and B bosons are indeed cancelled by the new heavy W′ and B′ bosons. W and W′ couple to the Higgs field with equal strength but opposite sign, and the same holds for B and B′. The contributions to the Higgs mass that remain uncancelled at one-loop order are those that include both the light and the heavy gauge boson. The only possible diagram is the one displayed in figure 7, which is logarithmically divergent.
11.3 The Quartic Higgs potential and Higgs mass
When analyzing the symmetries of the model we already mentioned that under the two SU(3) symmetries the SM Higgs doublet shifts, thereby forbidding a mass term $hh^\dagger$ and a potential at tree level. The heavy Higgs triplet φ is not protected by the symmetry that protects h and can pick up a quadratically divergent mass. Here we will make this a bit more precise.
The Coleman-Weinberg potential
Since we are unable to add a potential at tree level, the only option left is to generate the potential at one-loop level as a correction to the interactions of the Higgs with the gauge bosons and fermions. These interactions explicitly break all the global symmetries that forbid the presence of a tree level Higgs potential. Such a potential that is generated at loop level, but absent at tree level, is called a Coleman-Weinberg potential. The most important part of this potential can be parametrized as [21]:
\[
V = \lambda_{\phi^2} f^2\,\mathrm{Tr}(\phi^\dagger\phi) + i\lambda_{h\phi h} f\left(h\phi^\dagger h^T - h^*\phi h^\dagger\right) - \mu^2 hh^\dagger + \lambda_{h^4}(hh^\dagger)^2 \tag{11.23}
\]
Figure 7: The logarithmically divergent contribution to the Higgs mass that has the light as well as the heavy gauge boson in the loop.
Quartic terms involving $\phi^4$ and $h^2\phi^2$ are not included because their contributions are small. There are quadratically divergent contributions to this potential, which are cut off at a scale Λ ∼ 4πf ∼ 10 TeV. They come from the gauge bosons as well as the fermions. For the gauge bosons this quadratically divergent contribution to the CW potential is [21]
\[
\mathcal{L}_c = c\,g_j^2 f^4 \sum_a \mathrm{Tr}\left[(Q_j^a\Sigma)(Q_j^a\Sigma)^*\right] + c\,g_j'^2 f^4\,\mathrm{Tr}\left[(Y_j\Sigma)(Y_j\Sigma)^*\right] \tag{11.24}
\]
The relevant parts of this potential can be found by expanding the Lagrangian (11.24) in terms of the fields h and φ and considering their global symmetry transformation properties. The $G_1$ gauge interactions leave the $SU(3)_1$ symmetry invariant, under which h and φ transform as:
\[
h_i \to h_i + f\epsilon_i + \ldots, \qquad
\phi_{ij} \to \phi_{ij} - i\left(\epsilon_i h_j + \epsilon_j h_i\right)
\]
and similarly under $SU(3)_2$, left invariant by the $G_2$ gauge interactions, they transform as:
\[
h_i \to h_i + f\eta_i + \ldots, \qquad
\phi_{ij} \to \phi_{ij} + i\left(\eta_i h_j + \eta_j h_i\right)
\]
In the presence of both sets of gauge interactions the term that is left invariant under both these transformations is:
\[
\left|\phi_{ij} \pm \frac{i}{2f}\left(h_i h_j + h_j h_i\right)\right|^2
\]
Expanding this and substituting the expression in (11.24) yields:
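\[
\mathcal{L}_c = \frac{c}{2}\left(g_1^2 + g_1'^2\right)\left[f^2\,\mathrm{Tr}(\phi^\dagger\phi) - \frac{if}{2}\left(h^T\phi^\dagger h - h^\dagger\phi h^*\right) + \frac{1}{4}(hh^\dagger)^2 + \ldots\right]
+ \frac{c}{2}\left(g_2^2 + g_2'^2\right)\left[f^2\,\mathrm{Tr}(\phi^\dagger\phi) + \frac{if}{2}\left(h^T\phi^\dagger h - h^\dagger\phi h^*\right) + \frac{1}{4}(hh^\dagger)^2 + \ldots\right] \tag{11.25}
\]
Remark. The same expression can be found when expanding Σ to quadratic order in φ and quartic order in h. Note further that in this expression the first term is $SU(3)_2$ invariant, whilst the second is $SU(3)_1$ invariant, which can be seen from the gauge couplings.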
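The other quadratically divergent contribution to the one-loop CW potential comes from the fermion loops⁶⁰ and is given by [21]:
\[
\mathcal{L}_{c'} = -\frac{c'}{2}\lambda_f^2 f^4\,\epsilon^{wx}\epsilon_{yz}\epsilon^{ijk}\epsilon_{kmn}\,\Sigma_{iw}\Sigma_{jx}\Sigma^{*my}\Sigma^{*nz} + \ldots
\]
where i, j, k, m, n = 1, 2, 3 and w, x, y, z = 4, 5. This expression is thus $SU(3)_1$ invariant and must therefore have the same form as the second term in (11.25), hence:
\[
\mathcal{L}_{c'} = -\frac{c'}{2}\lambda_1^2\left[f^2\,\mathrm{Tr}(\phi^\dagger\phi) + \frac{if}{2}\left(h^T\phi^\dagger h - h^\dagger\phi h^*\right) + \frac{1}{4}(hh^\dagger)^2 + \ldots\right] \tag{11.26}
\]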
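Note that no Higgs mass term is present in either of the contributions (11.25) and (11.26), as a consequence of the collective symmetry breaking. However, there is a mass term for φ of order f:
\[
M_\phi^2 = \left(c\left(g_1^2 + g_1'^2 + g_2^2 + g_2'^2\right) - c'\lambda_1^2\right)f^2 = 2\lambda_{\phi^2}f^2
\]
We can further use the expressions (11.25) and (11.26) to determine the coefficients in (11.23). Explicitly:
\[
\lambda_{\phi^2} = \frac{c}{2}\left(g_1^2 + g_1'^2 + g_2^2 + g_2'^2\right) - \frac{c'}{2}\lambda_1^2
\]
\[
\lambda_{h\phi h} = -\frac{c}{4}\left(-g_1^2 - g_1'^2 + g_2^2 + g_2'^2\right) - \frac{c'}{4}\lambda_1^2
\]
\[
\lambda_{h^4} = \frac{c}{8}\left(g_1^2 + g_1'^2 + g_2^2 + g_2'^2\right) - \frac{c'}{8}\lambda_1^2 = \frac{1}{4}\lambda_{\phi^2} \tag{11.27}
\]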
The Higgs quartic potential
Electroweak symmetry breaking at the scale v is only possible if the parameter $\lambda_{\phi^2} > 0$. Otherwise a VEV of order f for the triplet φ will be generated, causing the electroweak symmetry to be broken at the higher scale f. Note that:
\[
\lambda_{\phi^2} > 0 \iff c\left(g_1^2 + g_1'^2 + g_2^2 + g_2'^2\right) - c'\lambda_1^2 > 0
\]
For energies below the mass of φ we can integrate φ out by calculating its equation of motion and substituting the solution back into the expression for the total potential, i.e. $\mathcal{L}_c + \mathcal{L}_{c'}$. Thus we differentiate the total potential with respect to φ, set the result to zero, and solve for φ. Doing so we find:
\[
\left[c\left(g_1^2 + g_1'^2\right) - c'\lambda_f^2\right]\left(\phi_{ij} + \frac{i}{f}h_i h_j\right) + \left[c\left(g_2^2 + g_2'^2\right) - c'\lambda_f^2\right]\left(\phi_{ij} - \frac{i}{f}h_i h_j\right) = 0
\]
Substituting the solution for φ in the total potential results in the following expression for the coefficient λ of the Higgs quartic potential $\lambda(hh^\dagger)^2$ at tree level:
\[
\lambda = c\,\frac{\left(g_1^2 + g_1'^2 - \frac{c'}{c}\lambda_1^2\right)\left(g_2^2 + g_2'^2\right)}{g_1^2 + g_1'^2 + g_2^2 + g_2'^2 - \frac{c'}{c}\lambda_1^2}
\]
⁶⁰ In particular the top loop, since the other fermions have small Yukawa couplings.
This potential clearly reflects that a quartic Higgs potential can only be generated in the presence of both sets of couplings. Turning either set of gauge couplings off yields λ = 0 and no Higgs potential is generated. Any contribution to the Higgs mass parameter at one-loop level can thus at most be logarithmically divergent. A few remarks about the μ parameter have to be made though [21]. As in the Standard Model it is to be seen as a free parameter. At one-loop order there are no quadratically divergent contributions, only logarithmically divergent contributions of order $f^2\log(\Lambda^2/f^2)/(4\pi)^2$, implying a natural hierarchy between the electroweak scale and the TeV scale ∼ f. The first quadratically divergent contribution to the parameter arises at two-loop order and is of order $\Lambda^2/(4\pi)^4$. This two-loop contribution could however be as large as the one-loop log contribution. The Higgs mass parameter $\mu^2$ should therefore be treated as a new free parameter $\mu^2 \sim f^2/(4\pi)^2$.
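To get a feeling for the scales involved, the estimates Λ ∼ 4πf and μ² ∼ f²/(4π)² can be evaluated for a sample value of f. The sketch below assumes f = 1 TeV purely for illustration; since both relations are order-of-magnitude estimates, the numbers only indicate scales.

```python
import math

# Assumed illustrative value of the symmetry breaking scale.
F_TEV = 1.0

# Cut-off estimate Lambda ~ 4 pi f, in TeV.
CUTOFF_TEV = 4 * math.pi * F_TEV

# Natural size of the Higgs mass parameter, mu ~ f / (4 pi), in GeV.
MU_GEV = 1000.0 * F_TEV / (4 * math.pi)

# Size of the two-loop quadratic term Lambda^2 / (4 pi)^4 relative to f^2 / (4 pi)^2:
# with Lambda = 4 pi f the ratio is exactly 1, i.e. the two are comparable.
TWO_LOOP_OVER_ONE_LOOP = (CUTOFF_TEV ** 2 / (4 * math.pi) ** 4) / (F_TEV ** 2 / (4 * math.pi) ** 2)
```

For f = 1 TeV this gives a cut-off of roughly 12.6 TeV and μ of roughly 80 GeV, which is why μ² is naturally of electroweak size in this setup.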
The Higgs mass
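For $\mu^2 > 0$⁶¹ the Coleman-Weinberg potential will trigger EWSB. We assume the following VEVs for the Higgs fields h and φ:
\[
\langle h \rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 \\ v \end{pmatrix}
\quad \text{and} \quad
\langle \phi \rangle = \begin{pmatrix} 0 & 0 \\ 0 & -iv' \end{pmatrix} \tag{11.28}
\]
Substitution of these VEVs in (11.23) yields:
\[
\lambda_{\phi^2} f^2 v'^2 - \lambda_{h\phi h} f v^2 v' - \mu^2\frac{v^2}{2} + \lambda_{h^4}\frac{v^4}{4}
\]
and by minimizing this with respect to $v^2$ and $v'$ we find:
\[
v^2 = \frac{\mu^2}{\lambda_{h^4} - \lambda_{h\phi h}^2/\lambda_{\phi^2}}
\quad \text{and} \quad
v' = \frac{\lambda_{h\phi h}\,v^2}{2\lambda_{\phi^2} f}
\]
The mass of the heavy Higgs triplet is found to be
\[
M_\phi^2 = 2\lambda_{\phi^2} f^2
\]
where the constants are given in (11.27). As with the tree level quartic Higgs potential, the tree level mass term for h will appear once φ is integrated out. Re-expressing $v' = \dfrac{\lambda_{h\phi h}(v+H)^2}{2\lambda_{\phi^2} f}$, where now H is the mass eigenstate, and parametrizing h and φ as:
\[
\langle h \rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 \\ v + H \end{pmatrix}
\quad \text{and} \quad
\langle \phi \rangle = \begin{pmatrix} 0 & 0 \\ 0 & -iv' \end{pmatrix} \tag{11.29}
\]
results in the following mass for the Higgs doublet [21]:
\[
M_H^2 = 2v^2\left(\lambda_{h^4} - \lambda_{h\phi h}^2/\lambda_{\phi^2}\right) = 2\mu^2
\]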
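The minimization can be checked by inserting the closed-form values of v and v′ into the partial derivatives of the potential evaluated on the VEVs. The following sketch does this for arbitrary illustrative parameter values (they are assumptions, chosen only so that λ_{h⁴} = λ_{φ²}/4 as in (11.27)); it also confirms the relation M_H² = 2μ².

```python
import math

def stationary_point(lam_phi2, lam_hphih, lam_h4, mu2, f):
    """Closed-form (v, v') quoted in the text."""
    lam_eff = lam_h4 - lam_hphih ** 2 / lam_phi2
    v2 = mu2 / lam_eff
    v_prime = lam_hphih * v2 / (2 * lam_phi2 * f)
    return math.sqrt(v2), v_prime

def gradient(v, v_prime, lam_phi2, lam_hphih, lam_h4, mu2, f):
    """Partial derivatives of
    V = lam_phi2 f^2 v'^2 - lam_hphih f v^2 v' - mu2 v^2 / 2 + lam_h4 v^4 / 4
    with respect to v and v'."""
    dv = -2 * lam_hphih * f * v * v_prime - mu2 * v + lam_h4 * v ** 3
    dv_prime = 2 * lam_phi2 * f ** 2 * v_prime - lam_hphih * f * v ** 2
    return dv, dv_prime

# Illustrative (assumed) parameters; masses in GeV.
PARAMS = dict(lam_phi2=1.0, lam_hphih=0.2, lam_h4=0.25, mu2=100.0 ** 2, f=1000.0)

V, V_PRIME = stationary_point(**PARAMS)
DV, DV_PRIME = gradient(V, V_PRIME, **PARAMS)

LAM_EFF = PARAMS["lam_h4"] - PARAMS["lam_hphih"] ** 2 / PARAMS["lam_phi2"]
HIGGS_MASS_SQUARED = 2 * V ** 2 * LAM_EFF
```

Both derivatives vanish up to rounding, and `HIGGS_MASS_SQUARED` equals twice `mu2`, in line with the last equation above.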
11.4 Viability of ’Littlest Higgs’ and signatures in experiment
In the previous sections we discussed the "Littlest Higgs". I discussed the group-theoretical setup and showed that the quadratically divergent Standard Model W and B loops are cancelled by the new heavy W′ and B′ gauge bosons with masses at the TeV scale. Unlike the SU(3) based model, this model was indeed able to generate a quartic potential that included both sets of couplings. Any contribution to the Higgs mass at one-loop order must therefore be logarithmically divergent. What we did not discuss were the fermion sector, electroweak symmetry breaking and the resulting masses for the gauge bosons. I will conclude the discussion of this model by saying a few words about the latter, because it turns out to lead to some problems and to reintroduce the hierarchy problem. In [21] the explicit masses for the light and heavy gauge bosons after EWSB can be found. Additional mixing of the light and heavy gauge bosons causes the masses of the Standard Model W and Z to gain corrections of $O(v^2/f^2)$. Recall now, however, the relation (7.31) between the Weinberg angle and the masses of the Standard Model W and Z bosons. If we substitute the masses as found in [21], this also gives an additional contribution to the Weinberg angle of order $v^2/f^2$. Explicitly:
\[
\frac{M_W^2}{M_Z^2} = \cos^2\theta_W\left[1 + \frac{v^2}{f^2}\,\frac{5}{4}\left(c'^2 - s'^2\right)^2 - 4\frac{v'^2}{v^2}\right] \tag{11.30}
\]
\[
= \cos^2\theta_W\left[1 + \frac{v^2}{f^2}\,\frac{5}{4}\left(\frac{g_1'^2 - g_2'^2}{g_1'^2 + g_2'^2}\right)^2 - 4\frac{v'^2}{v^2}\right] \tag{11.31}
\]
⁶¹ Note the relative minus sign in comparison to earlier calculations.
Electroweak precision measurements force f > 4 TeV, which reintroduces the hierarchy problem! Observe though, that this contribution vanishes entirely in the case of $g'_1 = g'_2$ and $v' = 0$. Fortunately, a solution was found by imposing an additional discrete symmetry on the model, called T-parity. Particles can be either T-even or T-odd, and applying the operator T maps them to plus or minus themselves respectively. The assignment is such that all SM particles can be chosen T-even and all other particles T-odd. It turns out that when this symmetry is imposed on the model, it also imposes constraints on the parameters, which indeed set $g'_1 = g'_2$ and $v' = 0$. Another nice feature of the model with T-parity is that the lightest T-odd particle it contains could be a candidate for Dark Matter. In particular, this is the heavy B′, called the 'heavy photon' [22]. Very precise measurements have further been made by several experiments, and imposing T-parity makes the model highly consistent with electroweak precision data. All of these features make the Littlest Higgs with T-parity a very compelling theory to describe physics up to the cut-off scale of Λ ∼ 10 TeV. Above this scale it still remains unclear what new physics we can expect. Various possibilities have been proposed, one of them being supersymmetry, all of which are broken at scales high enough not to be in conflict with experiments.
In the end it will only be experiment that can tell whether Little Higgs models are correct as effective theories. Recently the CMS collaboration published a paper on their analysis of LHC data with a center of mass energy of 8 TeV. They searched for signs of heavy W′ and Z′ that decayed into the Standard Model Higgs and a W or Z [23]. Unfortunately they found no sign of such decays and, at 95% confidence level, excluded masses for W′ in the interval [1.0, 1.6] TeV and for Z′ in the intervals [1.0, 1.1] and [1.3, 1.5] TeV. However, with its latest upgrade the LHC can now reach a beam energy of 6.5 TeV (13 TeV in total), so it should become clear in the near future whether Little Higgs theories have their place in nature.
12 Summary
Here I will summarize what we have seen in each section. I divided the summary in two parts: Part I focuses on the mathematics and Part II on the physics.
12.1 Part I
We set out to determine the irreducible representations of GL(V) ≅ GL(n,C). For this we observed that GL(V) as well as the symmetric group act on the space $V^{\otimes n}$ and that their actions commute. Therefore, we started with determining the irreducible representations of the symmetric group $S_n$, since, unlike for other groups, its conjugacy classes are in bijection with the partitions λ of n. We first showed that we could visualize the partitions with Young diagrams and introduced the Young tableaux. Then we constructed an idempotent element $c_\lambda$, called the Young symmetrizer. We proved that, by letting λ vary over the partitions, these elements form a mutually orthogonal set of central, primitive idempotents, and associated to each $c_\lambda$ an irreducible representation $V_\lambda = \mathbb{C}[S_n]c_\lambda$, called the Specht module. Letting λ vary over the partitions then gave us all irreducible representations. I then discussed a few more results about these Specht modules. I discussed Young's rule and introduced the Kostka numbers that give the multiplicities of the different Specht modules appearing in its decomposition. I concluded with the hook length formula to compute the dimension of the representation.
In Section 4 I discussed the irreducible representations of GL(V). We proved that many of its irreducible representations could be obtained by using the same Young symmetrizer. We denoted these representations by $S_\lambda V$ and they could be obtained by computing $c_\lambda V^{\otimes n}$. A second important result from this section was that the irreducible characters of these irreducible representations were identified with the Schur polynomials, certain symmetric polynomials. This would make it possible to make the branching rules in section 5 concrete by using known identities between these polynomials. We further saw a formula to compute the dimension of the irreducible representations and proved that in the decomposition of $V^{\otimes n}$ each $S_\lambda V$ occurs with a multiplicity given by the dimension of the corresponding Specht module.
Section 5 discussed some branching rules. We reformulated the problem in terms of characters and compared these to identities between Schur polynomials that can be found in appendix A. I further gave some explicit examples of branching patterns in terms of Young diagrams. Paragraph 5.2 discussed how all of the obtained results also hold for SU(n), the special unitary group. For this we used some results from Lie theory. We argued that irreducible representations of a Lie group can be differentiated to yield irreducible representations of its Lie algebra, and conversely that an irreducible representation of the Lie algebra can be integrated to an irreducible representation of the Lie group. The second observation was that (in)equivalent irreducible representations of the Lie algebra gl(n) can be restricted to yield (in)equivalent irreducible representations of the Lie algebra su(n). These two arguments completed the statement that irreducible representations of GL(n) yield irreducible representations of SU(n). That is, the irreducible representations of SU(n) can be visualized by Young diagrams with columns of at most length n, we have a formula to compute their dimensions, and we have three branching rules.
12.2 Part II
Section 6 served as an introduction to the Lagrangian formalism in field theories. I discussed some examples of field theoretic Lagrangians and introduced spontaneous symmetry breaking. We derived a condition on the group generators and saw that generators that did not leave the vacuum invariant were broken generators. Then we did some examples on the spontaneous breaking of a global symmetry and saw that this led to the appearance of massless particles, called Nambu-Goldstone particles, that corresponded to the broken generators. Then, section 7 discussed the Higgs mechanism. We considered local symmetries, introduced gauge fields and the covariant derivative and saw that by changing to U-gauge we could remove the massless Goldstone particles from the particle spectrum. They became the longitudinal degrees of freedom of the gauge bosons, which thereby became massive. The fermions acquired masses as a result of a constant resistance against the Higgs field. Then I introduced the hierarchy problem and quantum corrections to the Higgs mass in section 8. With a short calculation I showed that an unacceptable amount of fine-tuning was needed to keep the Higgs mass at around the EWSB scale. I discussed the related Little Hierarchy problem, stating that new physics should appear not far above the TeV scale, for otherwise it would reintroduce a hierarchy problem.
Sections 9 and 11 focused on two Little Higgs models as a solution to the Little Hierarchy problem. Little Higgs models postulate the Higgs as an NGB of an approximate global symmetry. To let the Higgs acquire a mass term we then explicitly broke the symmetry, but only broke it collectively, meaning that at least two sets of couplings must be nonzero. This way divergent contributions to its mass could be at most logarithmically divergent. The first model we looked at was an SU(3) based model in which the Standard Model $SU(2)_W$ was embedded in $SU(3)_W$. It turned out to have some shortcomings though, and for that reason we looked at the Littlest Higgs, based on a global SU(5) symmetry, in section 11. In the preceding section 10 I related the branching rules for SU(N) as derived in the first part to the transformation properties of the elementary Standard Model particles and showed that they could be embedded in the 5 lowest dimensional SU(5) representations.
12.3 Acknowledgements
Lastly, I would like to thank my two supervisors Eric Laenen and Jasper Stokman for their support and motivation. They were always available within a couple of days, and their explanations of subjects I found difficult were always helpful. I have worked on the subjects with much pleasure and enthusiasm and it has provided many new insights for me.
13 Popular summary
Symmetry breaking is the central concept of my thesis. The two subjects that are treated are "branching rules" and "Little Higgs" models. Branching rules belong to a branch of mathematics called representation theory. Representation theory studies so-called symmetry structures, and branching rules describe the phenomenon of symmetry breaking.
Secondly, I studied Little Higgs models. The discovery of the Higgs particle was very important because it proved that the Higgs mechanism, which is responsible for the mass of particles, is correct. The only problem with the Higgs particle, however, is that its mass is sensitive to the so-called hierarchy problem. Here I will sketch what this hierarchy problem entails, how it is related to the Higgs mass and how Little Higgs models are a possible solution.
Branching rules
Let me begin by explaining what branching rules are. For this we first need to know what we mean by symmetries and symmetry groups. As an example we take the group that describes the symmetries of a triangle. This group is called the dihedral group $D_3$. A symmetry is defined here as an operation that you can perform on the triangle and that leaves the triangle unchanged. It is easy to check that there are 6 such symmetry operations: the identity operation e, rotations $r_1$ and $r_2$ by 120° and 240° respectively, and reflections $s_1$, $s_2$ and $s_3$ in the three reflection axes. Two of these operations performed one after the other again form a symmetry transformation of the triangle that lies in the group. Finally, there is an identity element; here this is the element e.
Figure 8: The 6 different symmetries of a triangle.
Definition 13.1. A group is a set G equipped with an operation and an identity element 1 such that certain rules of calculation are satisfied.
The set {e, r1, r2, s1, s2, s3} together with the group operation thus forms a group, within which we can calculate. A subgroup H of G is defined as follows.
Definition 13.2. A subset H of G is called a subgroup of G if H, with the operation of G and the same identity element, forms a group.
This was a small group with finitely many elements. It is a discrete symmetry and it is easy to calculate with this group. Such calculations quickly become more complicated, however, as the symmetry groups become more complicated. For this we use representation theory. In representation theory all group elements are viewed as linear transformations between vector spaces.
Definition 13.3. A set V of vectors equipped with addition and scalar multiplication is called a vector space if certain rules of calculation are satisfied.
Such a transformation between vector spaces is represented by a matrix. As an example, let us look at the 3-dimensional representation of the group $S_3$. This is the group of permutations of 3 letters, with elements e, (12), (23), (13), (123), (132). In this notation (123) means that 1 → 2, 2 → 3 and 3 → 1. In representation theory all group elements are represented as matrices that send vectors in a vector space V to new vectors. You can picture a 3-dimensional vector space as a coordinate system, and we write a vector as $\vec{x} = (x, y, z)^T = x\vec{e}_1 + y\vec{e}_2 + z\vec{e}_3$, with $\vec{e}_i$ the unit vectors of length 1. The group $S_3$ then acts on a vector by permuting the indices. The element g = (12), for example, swaps the indices 1 and 2 and keeps 3 fixed. A representation of this element could then be the matrix
\[
\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
Then:
\[
\vec{x} \to \vec{x}' = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}\cdot\vec{x}
= \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}\cdot\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} y \\ x \\ z \end{pmatrix}
\]
In the same way the other elements can be represented by matrices. The map that sends all elements to a set of matrices is called the representation, and when it is clear which map is meant we only record the dimension of the vector space on which we represent the elements. In the case above this is the 3-dimensional representation. It may seem as if you lose information by representing the group elements in this way, but that is not the case: all information contained in the group is preserved. Finally, we need the notions of a subrepresentation and an irreducible representation. We speak of a subrepresentation if there is a subspace of V that is left invariant under the action of the group. In the example above this is, for instance, the subspace spanned by $\vec{v} = \vec{e}_1 + \vec{e}_2 + \vec{e}_3$: permuting the indices has no effect on it. We call a representation irreducible if its only subrepresentations are the space V itself and 0. In a certain sense these irreducible representations can be regarded as atomic, because every representation can be expressed in terms of irreducible representations. It is therefore the irreducible representations of a group that mathematicians are interested in.
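The 3-dimensional permutation representation of S3 is small enough to write out in full. The sketch below (plain Python, with 0-based indices as an implementation choice) builds the six permutation matrices, checks that composing permutations corresponds to multiplying matrices, and confirms that the vector e1 + e2 + e3 is left unchanged by every group element.

```python
from itertools import permutations

def perm_matrix(perm):
    """Matrix sending basis vector e_i to e_{perm[i]} (0-based indices)."""
    n = len(perm)
    return [[1 if perm[col] == row else 0 for col in range(n)] for row in range(n)]

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(matrix, vector):
    return [sum(matrix[i][k] * vector[k] for k in range(len(vector)))
            for i in range(len(matrix))]

def compose(p, q):
    """Permutation 'first q, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

S3 = list(permutations(range(3)))

# Representation property: matrix of a composition = product of the matrices.
is_representation = all(
    perm_matrix(compose(p, q)) == matmul(perm_matrix(p), perm_matrix(q))
    for p in S3 for q in S3
)

# The line spanned by e1 + e2 + e3 is an invariant subspace.
ones_is_invariant = all(apply(perm_matrix(p), [1, 1, 1]) == [1, 1, 1] for p in S3)

# The element (12) swaps the first two coordinates.
SWAP_12 = perm_matrix((1, 0, 2))
SWAPPED = apply(SWAP_12, [1, 2, 3])
```

Because an invariant line exists, this 3-dimensional representation is not irreducible: it splits into the 1-dimensional trivial representation and a 2-dimensional one.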
We now have everything we need to explain branching rules. These describe the phenomenon of symmetry breaking, in which the symmetry group of a system is reduced to a smaller symmetry group. In general an irreducible representation will not remain irreducible when it is restricted to the smaller group. What we are interested in in such a case is therefore how an irreducible representation of the large symmetry group splits into irreducible representations of the smaller symmetry group. In my thesis I was particularly interested in the phenomenon of spontaneous symmetry breaking. To picture spontaneous symmetry breaking you can think of a pencil that is balanced on its tip. This is a symmetric state. Eventually, however, the pencil will fall over, because the resulting, less symmetric state is much more stable. In general the most stable states are those that require the least amount of energy.
The aim of my thesis was to derive the branching rules for the group SU(n). This symmetry group plays a major role in the Standard Model. Just as above, you can picture the elementary particles as vectors that transform into new vectors under the action of matrices. One can show that the transformations that are allowed (i.e. that do not violate the laws of nature) give exactly these groups. Knowing the irreducible representations of the group SU(n) therefore tells you how the elementary particles behave.
The Standard Model is one of the greatest successes of modern physics. It describes all matter particles (fermions) and the interactions between them through the exchange of force carriers (bosons). There are 4 fundamental forces: the strong nuclear force, the weak nuclear force, the electromagnetic force and gravity. The force carrier of the electromagnetic force, for example, is the photon. The term electromagnetism indicates that electricity and magnetism are in fact two manifestations of the same force. Physicists later realized that the weak force and the electromagnetic force could also be seen as one force: the electroweak force. In daily life, however, we perceive these two forces as two completely different phenomena. This is because the two forces are only unified at the high temperatures of the early universe. As the universe cooled, the symmetry of the electroweak force was broken to that of the electromagnetic force. The vacuum is not empty but filled with a Higgs field, a hypothetical energy field that you can picture as a sea of Higgs particles. This Higgs field is responsible for the mass of all particles. Just after the Big Bang the Higgs field was symmetric and all elementary particles were completely massless.
As the universe cooled, symmetry breaking took place, through which the symmetry of the electroweak force was broken to the symmetry of the electromagnetic force. This happened because the Higgs field acquired a vacuum expectation value (VEV). To explain what this means we compare the potential for the Higgs field with a normal potential. (Recall that a potential tells you the energy of a given configuration.) The Higgs potential is also called the Mexican hat potential. In figure 9 (a) you see a normal potential. As with the pencil, the most stable state here is the state with the lowest energy. A particle at the origin does not acquire a VEV because it is already in the state with the lowest energy (the minimum of the potential). Figure 9 (b) shows the Higgs potential. The origin is now no longer the minimum of the potential and the Higgs will move to the state in which it has a lower energy. The symmetry that was there before⁶² is broken to a smaller symmetry⁶³ and we say that the Higgs VEV breaks the electroweak symmetry. After this the Higgs field became a viscous force field and the matter particles acquired a mass through continuous resistance against this field. The force carriers of the weak force acquired a mass through interaction with the Higgs particle. The force carrier of the electromagnetic force (the photon), which belongs to the symmetry that remained, stayed massless. This process is called the Higgs mechanism. For a long time it remained unclear whether this mechanism was correct, but it was finally verified with the discovery of the Higgs particle.
Figure 9: Two potentials. (a) A normal potential in which the origin is the state with the lowest energy. (b) The Higgs potential in which the origin is no longer the state with the lowest energy [29].
⁶² This symmetry is the group SU(2) × U(1), the electroweak symmetry.
⁶³ The remaining symmetry is U(1), the electromagnetic symmetry.
Fysici denken dat bij nog hogere temperaturen ook de elektrozwakke kracht en de sterke kernkracht
geunificeerd waren tot een kracht en dat bij een nog hogere energieschaal, die de Planck schaal wordt
genoemd, ook de zwaartekracht genificeerd was. Fysici weten dat het Standaard model niet meer geldig
is rond de Planck schaal en zijn nog steeds op zoek naar een theorie die deze unificaties beschrijft.
The hierarchy problem arises when you look at the different energy scales at which the various
unifications take place. Expressed in electronvolts (eV), electroweak unification occurs at an energy
of order 100 GeV, unification with the strong nuclear force at about $10^{16}$ GeV, and unification
with gravity at $10^{19}$ GeV. This gigantic difference between the energy scales is not a problem in
itself; it is merely very unnatural, and physicists have no explanation for why the scales lie so far
apart. A problem does arise when one tries to determine the Higgs mass. The Standard Model Higgs
particle has a measured mass of about 125 GeV (recall $E = mc^2$). However, the Higgs particle also
interacts with virtual particles, and this leads to so-called quantum corrections to the Higgs mass. In
my thesis I show that these corrections are proportional to $\Lambda^2$, where $\Lambda$ is the energy
up to which the Standard Model is valid. Because the only energy scale at which we know that the
Standard Model ceases to be valid and new physics appears is the Planck scale, these contributions to
the Higgs mass are $\sim (10^{19})^2$, and the only way to explain why the physically measured mass
$M_{\text{Phys}} = 125$ GeV satisfies $M_{\text{Phys}}^2 = M_0^2 + (10^{19})^2$ is for the "classical"
mass $M_0$ to be fine-tuned very precisely, to some 30 decimal places.
This fine-tuning is of course a very unnatural way to explain the Higgs mass, and physicists are looking
for a theory that removes the need for it. Such theories are based on a larger (global) symmetry group
that is spontaneously broken to the Standard Model symmetry group. Little Higgs models are one such
class of models. In these models new particles are introduced that also interact with the Higgs and
cancel precisely the contribution discussed above. Second, the Higgs particle is regarded as a
Goldstone boson. These are massless particles that arise when spontaneous symmetry breaking occurs.
I also demonstrate this phenomenon in my thesis. The Higgs particle, however, is not massless.
Therefore the larger symmetry is, besides being broken spontaneously, also broken in a very special way
by explicitly adding certain terms to the equations. This is done in such a way that the Higgs particle
obtains its mass in a much more natural manner and the need for fine-tuning disappears. To conclude,
let me explain the name of the models. It is important that the energy at which these larger symmetry
groups are broken lies around the TeV scale, since this is the limit up to which no fine-tuning is
needed. Models in which the symmetry breaking takes place at much higher scales can indeed solve the
hierarchy problem as sketched above, but in turn introduce a "little" hierarchy problem. Little Higgs
models do have their symmetry breaking around 1 TeV and thereby solve the little hierarchy problem.
A Symmetric polynomials
This appendix discusses certain symmetric polynomials, in particular the Schur polynomials. These
have applications in the representation theory of GL(n,C), as they are the characters of the
irreducible representations. Here I will discuss some important identities between these Schur
polynomials^64. We consider polynomials in the variables $x_1, \dots, x_k$ indexed by partitions
$\lambda = (\lambda_1 \geq \dots \geq \lambda_k \geq 0)$ of n into at most k parts or, in terms of
Young diagrams, Young diagrams with at most k rows. There are several choices of bases for
$\Lambda_n$, the ring of symmetric polynomials in n variables. Here I will list a few of these bases
and formulate some results about them and the relations between them. The first are the monomial
symmetric polynomials.
A.1 Monomial symmetric polynomials
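Definition A.1. For each $\alpha = (\alpha_1, \dots, \alpha_n)$ we denote by $x^\alpha$ the monomial
\[ x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}. \]
Let $\lambda$ be a partition of length $\leq n$. Then the monomial symmetric polynomial
\[ m_\lambda = \sum_{\alpha} x^\alpha \tag{A.1} \]
is the sum over all distinct permutations $\alpha = (\alpha_1, \dots, \alpha_n)$ of
$(\lambda_1, \dots, \lambda_n)$.
For example, in three variables:
\[ m_{(1,1)} = x_1x_2 + x_1x_3 + x_2x_3, \qquad m_{(2,0)} = x_1^2 + x_2^2 + x_3^2. \]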
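Definition A.1 can be cross-checked with a few lines of Python. The following is only a minimal sketch: the helper name `monomial_symmetric` and the choice to represent a monomial by its exponent tuple are illustrative assumptions, not notation from the text.

```python
from itertools import permutations

def monomial_symmetric(lam, n):
    """Exponent tuples of the monomials in m_lambda(x_1, ..., x_n).

    Hypothetical helper: each tuple (a_1, ..., a_n) stands for the monomial
    x_1^a_1 * ... * x_n^a_n, which appears in m_lambda with coefficient 1.
    """
    padded = tuple(lam) + (0,) * (n - len(lam))  # pad lambda with zeros to length n
    return set(permutations(padded))             # keep distinct permutations only

m11 = monomial_symmetric((1, 1), 3)  # x1x2 + x1x3 + x2x3
m20 = monomial_symmetric((2, 0), 3)  # x1^2 + x2^2 + x3^2
```

The two sets reproduce the three-variable examples given above.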
A.2 Complete symmetric polynomials
Definition A.2. The complete symmetric polynomial $h_\lambda$ is defined as
\[ h_\lambda = h_{\lambda_1} h_{\lambda_2} \cdots h_{\lambda_k} \]
with $h_r = \sum_{|\lambda| = r} m_\lambda$ the $r$th complete symmetric polynomial, which is the sum
of all monomials of total degree $r$ in the variables $x_1, x_2, \dots, x_n$.
The generating function for $h_r$ is:
\[ H(t) = \sum_{r \geq 0} h_r t^r = \prod_{i=1}^{n} (1 - x_i t)^{-1} \tag{A.2} \]
Taking the same partitions and 3 variables we find:
\[ h_{(1,1)} = h_1 h_1 = m_{(1)} m_{(1)} = (x_1 + x_2 + x_3)^2 \]
\[ h_{(2,0)} = h_2 h_0 = h_2 = m_{(2,0)} + m_{(1,1)} = x_1^2 + x_2^2 + x_3^2 + x_1x_2 + x_1x_3 + x_2x_3 \]
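The generating function (A.2) can be checked numerically at a sample point. This is a minimal sketch, assuming the arbitrary point x = (2, 3, 5); the function names `h` and `H_coefficients` are illustrative.

```python
from itertools import combinations_with_replacement
from math import prod

def h(r, xs):
    """h_r evaluated at the point xs: sum of all monomials of total degree r."""
    return sum(prod(c) for c in combinations_with_replacement(xs, r))

def H_coefficients(xs, order):
    """Taylor coefficients of prod_i 1/(1 - x_i t) up to and including t^order."""
    coeffs = [1] + [0] * order
    for x in xs:
        # multiplying a power series by 1/(1 - x t): new[k] = old[k] + x * new[k-1]
        for k in range(1, order + 1):
            coeffs[k] += x * coeffs[k - 1]
    return coeffs

xs = (2, 3, 5)                   # arbitrary sample point (an assumption)
series = H_coefficients(xs, 3)   # coefficients of t^0, ..., t^3
direct = [h(r, xs) for r in range(4)]
```

Both lists agree, and `h(1, xs) ** 2` equals the value of $h_{(1,1)}$ at this point.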
64 All results can be found in [3].
A.3 Elementary symmetric polynomials
Third are the elementary symmetric polynomials. Unlike the previous two, these are parametrized by
the partition conjugate to $\lambda$, which we denote by $\lambda'$.
Definition A.3. The elementary symmetric polynomial $e_{\lambda'}$ is defined as
\[ e_{\lambda'} = e_{\lambda'_1} e_{\lambda'_2} \cdots e_{\lambda'_n} \]
with $e_r$ the $r$th elementary symmetric polynomial, which is the sum of all products of $r$ distinct
variables $x_i$, so that $e_0 = 1$ and
\[ e_r = \sum_{i_1 < i_2 < \dots < i_r} x_{i_1} x_{i_2} \cdots x_{i_r} = m_{(1^r)}. \]
The generating function for $e_r$ is:
\[ E(t) = \sum_{r \geq 0} e_r t^r = \prod_{i=1}^{n} (1 + x_i t) \tag{A.3} \]
For example, in three variables:
\[ e_{(1,1)} = e_1 e_1 = m_{(1)} m_{(1)} = (x_1 + x_2 + x_3)^2 \]
\[ e_{(2,0)} = e_2 e_0 = e_2 = m_{(1,1)} = x_1x_2 + x_1x_3 + x_2x_3 \]
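Comparing (A.2) and (A.3) gives $H(t)E(-t) = 1$, which in terms of coefficients reads $\sum_{r=0}^{k} (-1)^r e_r h_{k-r} = 0$ for $k \geq 1$. The sketch below checks this consequence at the arbitrary sample point x = (2, 3, 5); the helper names are illustrative assumptions.

```python
from itertools import combinations, combinations_with_replacement
from math import prod

def e(r, xs):
    """e_r at the point xs: sum of all products of r distinct variables."""
    return sum(prod(c) for c in combinations(xs, r))

def h(r, xs):
    """h_r at the point xs: sum of all monomials of total degree r."""
    return sum(prod(c) for c in combinations_with_replacement(xs, r))

xs = (2, 3, 5)  # arbitrary sample point (an assumption)

# coefficient of t^k in H(t) E(-t), for k = 1, ..., 4; each should vanish
checks = [
    sum((-1) ** r * e(r, xs) * h(k - r, xs) for r in range(k + 1))
    for k in range(1, 5)
]
```

Note that `e(r, xs)` is automatically zero for r larger than the number of variables, as it should be.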
A.4 Schur polynomials
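Last are the Schur polynomials. They are defined via a determinantal formula. We let
$x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}$ be a monomial in a finite number of variables
$x_1, \dots, x_n$ and consider the polynomial $a_\alpha$ obtained by antisymmetrizing $x^\alpha$.
That is,
\[ a_\alpha(x_1, \dots, x_n) = \sum_{\sigma \in S_n} \varepsilon(\sigma) \cdot \sigma(x^\alpha). \]
This polynomial is skew-symmetric, since $\sigma(a_\alpha) = \varepsilon(\sigma) a_\alpha$. In
particular, it vanishes unless all $\alpha_i$, $i = 1, \dots, n$, are distinct. We may therefore
assume $\alpha_1 > \alpha_2 > \dots > \alpha_n \geq 0$ and write $\alpha = \lambda + \delta$ with
$\lambda$ a partition with $l(\lambda) \leq n$ and $\delta = (n-1, n-2, \dots, 1, 0)$. Then
\[ a_\alpha = a_{\lambda+\delta} = \sum_{\sigma} \varepsilon(\sigma) \cdot \sigma(x^{\lambda+\delta}), \]
which we may write as the following determinant:
\[ a_{\lambda+\delta} = \det\left(x_i^{\lambda_j+n-j}\right)_{1 \leq i,j \leq n} =
\begin{vmatrix}
x_1^{\lambda_1+n-1} & x_2^{\lambda_1+n-1} & \cdots & x_n^{\lambda_1+n-1} \\
x_1^{\lambda_2+n-2} & x_2^{\lambda_2+n-2} & \cdots & x_n^{\lambda_2+n-2} \\
\vdots & \vdots & \ddots & \vdots \\
x_1^{\lambda_n} & x_2^{\lambda_n} & \cdots & x_n^{\lambda_n}
\end{vmatrix} \]
This determinant is divisible by the Vandermonde determinant, which is the product of all the
differences $x_i - x_j$ with $i < j$, since $a_{\lambda+\delta}$ is divisible by each of these
separately. It is given by:
\[ \prod_{1 \leq i < j \leq n} (x_i - x_j) =
\begin{vmatrix}
x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1} \\
x_1^{n-2} & x_2^{n-2} & \cdots & x_n^{n-2} \\
\vdots & \vdots & \ddots & \vdots \\
1 & 1 & \cdots & 1
\end{vmatrix}
= \det\left(x_i^{n-j}\right) = a_\delta \tag{A.4} \]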
The quotient of these two determinants, $a_{\lambda+\delta}/a_\delta$, is symmetric and is called the
Schur function in the variables $x_1, \dots, x_n$ corresponding to the partition $\lambda$ of length
$\leq n$; it is homogeneous of degree $|\lambda|$.
Definition A.4. The Schur function $s_\lambda$ in n variables, where $l(\lambda) \leq n$, is defined as
\[ s_\lambda(x_1, \dots, x_n) = \frac{\det\left(x_i^{\lambda_j+n-j}\right)}{\det\left(x_i^{n-j}\right)}
= \frac{a_{\lambda+\delta}}{a_\delta} \]
and the Schur functions form a basis of the ring of symmetric functions.
We further have the following identities that relate the Schur polynomials to the complete symmetric
polynomials and the elementary symmetric polynomials. They are
\[ s_\lambda = \det\left(h_{\lambda_i - i + j}\right) \quad \text{and} \quad
s_\lambda = \det\left(e_{\lambda'_i - i + j}\right). \]
With this we see that we have the special cases
\[ s_{(n)} = h_n \quad \text{and} \quad s_{(1^n)} = e_n. \]
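The quotient of determinants in Definition A.4 is easy to evaluate at a point with distinct coordinates. The following is a minimal sketch under that assumption (the point x = (2, 3, 5) and all function names are illustrative); it compares the result with the special case $s_{(2)} = h_2$ and with the determinant $\det(h_{\lambda_i - i + j}) = h_2h_1 - h_3h_0$ for $\lambda = (2,1)$.

```python
from fractions import Fraction
from itertools import combinations_with_replacement, permutations
from math import prod

def det(matrix):
    """Determinant via the Leibniz formula (adequate for small matrices)."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        total += (-1) ** inversions * prod(matrix[i][perm[i]] for i in range(n))
    return total

def schur(lam, xs):
    """s_lambda at the point xs as a_{lambda+delta} / a_delta; xs must be distinct."""
    n = len(xs)
    lam = tuple(lam) + (0,) * (n - len(lam))
    numerator = det([[x ** (lam[j] + n - 1 - j) for j in range(n)] for x in xs])
    vandermonde = det([[x ** (n - 1 - j) for j in range(n)] for x in xs])
    return Fraction(numerator, vandermonde)

def h(r, xs):
    """h_r at the point xs."""
    return sum(prod(c) for c in combinations_with_replacement(xs, r))

xs = (2, 3, 5)  # arbitrary sample point with distinct integer entries
jacobi_trudi_21 = h(2, xs) * h(1, xs) - h(3, xs) * h(0, xs)
```

Exact integer and `Fraction` arithmetic is used so that the comparison involves no rounding.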
A.5 Orthogonality
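The Schur polynomials in fact form an orthonormal basis of $\Lambda$ with respect to the scalar
product on $\Lambda$ that we will define in a moment. First, though, we have the following two
identities for the expansion of the product
\[ \prod_{i,j} (1 - x_i y_j)^{-1}, \]
where $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$ are two sets of variables. These are:
1. $\prod_{i,j} (1 - x_i y_j)^{-1} = \sum_\lambda h_\lambda(x) m_\lambda(y)$
2. $\prod_{i,j} (1 - x_i y_j)^{-1} = \sum_\lambda s_\lambda(x) s_\lambda(y)$
We now define the scalar product $\langle u, v \rangle$ on $\Lambda$ by requiring that the bases
$(h_\lambda)$ and $(m_\lambda)$ be dual to each other:
\[ \langle h_\lambda, m_\mu \rangle = \delta_{\lambda\mu} \tag{A.5} \]
Then we have:
Proposition A.1. For any two bases $(u_\lambda)$ and $(v_\lambda)$, indexed by the partitions
$\lambda$ of $n \geq 0$, the following are equivalent:
1. $\langle u_\lambda, v_\mu \rangle = \delta_{\lambda\mu}$ for all $\lambda, \mu$
2. $\prod_{i,j} (1 - x_i y_j)^{-1} = \sum_\lambda u_\lambda(x) v_\lambda(y)$
Combining this with the second identity above, we therefore have
\[ \langle s_\lambda, s_\mu \rangle = \delta_{\lambda\mu} \tag{A.6} \]
Since the Schur functions form an orthonormal basis, we can expand any symmetric function
$f \in \Lambda$ in terms of its scalar products with the $s_\lambda$, i.e.
\[ f = \sum_\lambda \langle f, s_\lambda \rangle s_\lambda. \]
In the next section I will discuss some relations between Schur polynomials that will be needed. Then,
in section A.6.1, I will define the skew Schur function.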
A.6 Relations among the symmetric polynomials
Two relations that I will discuss here are the Pieri rule and the Littlewood-Richardson rule. The Pieri
rule tells us how to multiply a Schur polynomial by a basic Schur polynomial $s_{(m)} = h_m$.
Definition A.5. Pieri rule
\[ s_\lambda \cdot s_{(m)} = \sum_\nu s_\nu \]
with the sum over all partitions $\nu$ whose Young diagram can be obtained from the Young diagram of
$\lambda$ by adding a total of m boxes to the rows, with no two boxes in the same column.
Example A.1. Consider the product of Schur polynomials $s_{(2,1)} \cdot s_{(2)}$. Expanding
$\lambda = (2,1)$ by adding 2 boxes according to the rule gives the following possibilities for
$\nu$: (4,1), (3,2), (3,1,1) and (2,2,1),
and thus $s_{(2,1)} \cdot s_{(2)} = s_{(4,1)} + s_{(3,2)} + s_{(3,1,1)} + s_{(2,2,1)}$.
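The Pieri rule is simple enough to enumerate mechanically. The sketch below is one possible implementation (the name `pieri` is illustrative); it uses the fact that "no two new boxes in the same column" is equivalent to the condition that row i of $\nu$ is no longer than row i-1 of $\lambda$.

```python
def pieri(lam, m):
    """Partitions nu obtained from lam by adding m boxes, no two in one column."""
    lam = list(lam) + [0]  # such an expansion can open at most one new row
    found = []

    def grow(row, boxes_left, nu):
        if row == len(lam):
            if boxes_left == 0:
                found.append(tuple(part for part in nu if part > 0))
            return
        # the new length of this row may not exceed the old length of the row above
        cap = lam[row] + boxes_left
        if row > 0:
            cap = min(cap, lam[row - 1])
        for length in range(lam[row], cap + 1):
            grow(row + 1, boxes_left - (length - lam[row]), nu + [length])

    grow(0, m, [])
    return sorted(found, reverse=True)

terms = pieri((2, 1), 2)  # shapes appearing in s_(2,1) * s_(2)
```

For $\lambda = (2,1)$ and m = 2 this returns the four shapes of Example A.1.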
Applying the Pieri rule inductively to $h_\lambda = h_{\lambda_1} \cdots h_{\lambda_k}$ gives the
following identity:
\[ h_\lambda = s_{(\lambda_1)} \cdots s_{(\lambda_k)} = \sum_\mu K_{\mu\lambda} s_\mu, \tag{A.7} \]
where the coefficients $K_{\mu\lambda}$ are the Kostka numbers, i.e. the number of semi-standard
tableaux of shape $\mu$ and content $\lambda$. The second rule is the Littlewood-Richardson rule, which
tells us how to multiply two general Schur polynomials and is thus a generalization of the Pieri rule.
It gives the expansion of a product of two Schur polynomials in terms of Schur polynomials.
Definition A.6. Littlewood-Richardson rule
\[ s_\lambda \cdot s_\mu = \sum_\nu c^\nu_{\lambda\mu} s_\nu \]
Here $\lambda \vdash n$, $\mu \vdash m$ and the summation is over all partitions $\nu$ of $n + m$. The
Littlewood-Richardson coefficients $c^\nu_{\lambda\mu}$ that appear in the expansion are defined as the
number of ways the Young diagram for $\lambda$ can be expanded to the Young diagram for $\nu$ by a
strict $\mu$-expansion. By this we mean the following. If $\mu = (\mu_1, \dots, \mu_k)$, we get a
$\mu$-expansion by first adding $\mu_1$ boxes as described in the Pieri rule and putting a 1 in these
boxes, then repeating for $\mu_2$ and putting a 2 in those boxes, and so on up to the last $\mu_k$
boxes, in which we put a k. The expansion is called strict if, when the integers in the boxes are
listed from right to left, starting with the top row and working down, and one looks at the first t
entries in this list (for any t), each integer p between 1 and k-1 occurs at least as many times as
the next integer p+1.
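Example A.2. Consider the product $s_{(2,1)} \cdot s_{(2,1)}$. Strict (2,1)-expansion of the Young
diagram of $\lambda = (2,1)$ gives the following possibilities for $\nu$. The boxes of the original
diagram are marked 0 and the rows of each tableau are separated by a slash:
0 0 1 1 / 0 2          ν = (4,2)
0 0 1 1 / 0 / 2        ν = (4,1,1)
0 0 1 / 0 1 2          ν = (3,3)
0 0 1 / 0 1 / 2        ν = (3,2,1)
0 0 1 / 0 2 / 1        ν = (3,2,1)
0 0 1 / 0 / 1 / 2      ν = (3,1,1,1)
0 0 / 0 1 / 1 2        ν = (2,2,2)
0 0 / 0 1 / 1 / 2      ν = (2,2,1,1)
Therefore: $s_{(2,1)} \cdot s_{(2,1)} = s_{(4,2)} + s_{(4,1,1)} + s_{(3,3)} + 2s_{(3,2,1)} + s_{(3,1,1,1)} + s_{(2,2,2)} + s_{(2,2,1,1)}$.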
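The expansion in Example A.2 can be checked numerically by evaluating both sides through the determinantal formula for the Schur polynomials. This is only a sketch: it assumes four variables (so that every shape with up to four rows contributes) and the arbitrary sample point x = (2, 3, 5, 7); the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import prod

def det(matrix):
    """Determinant via the Leibniz formula (adequate for 4 x 4 matrices)."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        total += (-1) ** inversions * prod(matrix[i][perm[i]] for i in range(n))
    return total

def schur(lam, xs):
    """s_lambda at the point xs as a ratio of alternants; xs must be distinct."""
    n = len(xs)
    lam = tuple(lam) + (0,) * (n - len(lam))
    numerator = det([[x ** (lam[j] + n - 1 - j) for j in range(n)] for x in xs])
    vandermonde = det([[x ** (n - 1 - j) for j in range(n)] for x in xs])
    return Fraction(numerator, vandermonde)

xs = (2, 3, 5, 7)  # arbitrary sample point (an assumption)

# shapes and multiplicities read off from Example A.2
expansion = {
    (4, 2): 1, (4, 1, 1): 1, (3, 3): 1, (3, 2, 1): 2,
    (3, 1, 1, 1): 1, (2, 2, 2): 1, (2, 2, 1, 1): 1,
}
lhs = schur((2, 1), xs) ** 2
rhs = sum(mult * schur(nu, xs) for nu, mult in expansion.items())
```

Agreement at one point does not prove the identity, but it is a useful guard against a miscounted tableau.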
A.6.1 Skew Schur functions
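We now introduce the skew Schur function $s_{\lambda/\mu}$ by defining:
\[ \langle s_{\lambda/\mu}, s_\nu \rangle = \langle s_\lambda, s_\mu s_\nu \rangle \tag{A.8} \]
They can be expanded in terms of Schur polynomials through the relation
\[ s_{\lambda/\mu} = \sum_\nu c^\lambda_{\mu\nu} s_\nu \tag{A.9} \]
where the coefficients are the Littlewood-Richardson coefficients defined above. We further have
Definition A.7. $s_{\lambda/\mu}(x_1, \dots, x_n) = 0$ unless $0 \leq \lambda'_i - \mu'_i \leq n$ for
all i, where $\lambda'$ and $\mu'$ denote the conjugate partitions.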
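Schur functions in several sets of variables
In the following we will consider three sets of independent variables $x = (x_1, x_2, \dots)$,
$y = (y_1, y_2, \dots)$, $z = (z_1, z_2, \dots)$. Then we have:
\begin{align*}
\sum_{\lambda,\mu} s_{\lambda/\mu}(x) s_\lambda(z) s_\mu(y)
&= \sum_\mu s_\mu(y) s_\mu(z) \cdot \prod_{i,k} (1 - x_i z_k)^{-1} \\
&= \prod_{j,k} (1 - y_j z_k)^{-1} \prod_{i,k} (1 - x_i z_k)^{-1} \\
&= \sum_\lambda s_\lambda(x, y) s_\lambda(z)
\end{align*}
where $s_\lambda(x, y)$ is now the Schur function in the set of variables
$(x_1, x_2, \dots, y_1, y_2, \dots)$. We thus conclude that:
\[ s_\lambda(x, y) = \sum_\mu s_\mu(x) s_{\lambda/\mu}(y) \tag{A.10} \]
In fact, this can be made more general:
\[ s_{\lambda/\mu}(x, y) = \sum_\nu s_{\lambda/\nu}(x) s_{\nu/\mu}(y) \tag{A.11} \]
with the sum over partitions $\nu$ such that $\lambda \supset \nu \supset \mu$.
Proof. We have
\begin{align*}
\sum_\mu s_{\lambda/\mu}(x, y) s_\mu(z) &= s_\lambda(x, y, z) \\
&= \sum_\nu s_{\lambda/\nu}(x) s_\nu(y, z) \\
&= \sum_{\mu,\nu} s_{\lambda/\nu}(x) s_{\nu/\mu}(y) s_\mu(z),
\end{align*}
and comparing the coefficients of $s_\mu(z)$ on both sides gives (A.11).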
This we can generalize as follows:
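Proposition A.2. Let $\lambda, \mu$ be partitions and let $x^{(1)}, \dots, x^{(n)}$ be n sets of
variables. Then
\[ s_{\lambda/\mu}(x^{(1)}, \dots, x^{(n)}) = \sum_{(\nu)} \prod_{i=1}^{n} s_{\nu^{(i)}/\nu^{(i-1)}}(x^{(i)}) \]
with the sum over all sequences $(\nu) = (\nu^{(0)}, \dots, \nu^{(n)})$ of partitions such that
$\nu^{(0)} = \mu$, $\nu^{(n)} = \lambda$ and $\nu^{(0)} \subset \nu^{(1)} \subset \dots \subset \nu^{(n)}$.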
With this formula we can derive what happens if we set the variable $x_n$ in the Schur function
$s_\lambda(x_1, \dots, x_n)$ to 1. For this, consider the two sets of variables
$x = (x_1, \dots, x_{n-1})$ and $y = (x_n)$, and let $\lambda = (\lambda_1, \dots, \lambda_n)$ be a
partition with $l(\lambda) \leq n$. For the single variable $x_n$ we have $s_{\lambda/\mu}(x_n) = 0$
unless $\lambda - \mu$ is a horizontal strip, by definition A.7. In that case
$s_{\lambda/\mu}(x_n) = x_n^{|\lambda - \mu|}$. Thus, with (A.10) we have
\[ s_\lambda(x_1, \dots, x_n) = \sum_{\mu \subseteq \lambda} s_\mu(x_1, \dots, x_{n-1}) s_{\lambda/\mu}(x_n)
= \sum_{\mu \subseteq \lambda} s_\mu(x_1, \dots, x_{n-1}) x_n^{|\lambda - \mu|} \tag{A.12} \]
where the sum is over all partitions $\mu$ for which $\lambda - \mu$ is a horizontal strip and
$l(\mu) \leq n - 1$, from which the effect of setting $x_n = 1$ can be deduced.
Definition A.8. For $\mu \subseteq \lambda$, the skew diagram $\lambda - \mu$ is called a horizontal
strip if it has at most one box in each column.
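Setting all variables equal to 1 in (A.12) turns it into a statement about dimensions, which can be checked directly. The sketch below relies on two standard facts that are assumptions as far as this appendix is concerned: $\lambda - \mu$ is a horizontal strip exactly when $\lambda_1 \geq \mu_1 \geq \lambda_2 \geq \mu_2 \geq \dots$, and $s_\lambda(1, \dots, 1)$ is given by the Weyl dimension formula. The function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def restrict(lam, n):
    """All mu (padded to n-1 parts) with lam - mu a horizontal strip and l(mu) <= n-1."""
    lam = tuple(lam) + (0,) * (n - len(lam))
    # interlacing condition: lam_i >= mu_i >= lam_{i+1} for i = 1, ..., n-1
    return list(product(*[range(lam[i + 1], lam[i] + 1) for i in range(n - 1)]))

def dimension(lam, n):
    """s_lam(1, ..., 1) in n variables via the Weyl dimension formula."""
    lam = tuple(lam) + (0,) * (n - len(lam))
    result = Fraction(1)
    for i in range(n):
        for j in range(i + 1, n):
            result *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return result

lam, n = (2, 1), 3
branches = restrict(lam, n)                       # the mu appearing in (A.12)
total = sum(dimension(mu, n - 1) for mu in branches)
```

For $\lambda = (2,1)$ and n = 3 the four partitions found have dimensions adding up to 8, the value of $s_{(2,1)}(1,1,1)$.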
B Lie groups and Lie algebras
An important class of groups in physics consists of the symmetry groups, in particular the continuous
symmetry groups such as the rotations. These continuously generated groups are called Lie groups. A
familiar example of a Lie group is the 3-dimensional rotation group SO(3), which consists of the
rotation matrices and depends on three parameters. Other examples are the unitary group U(N),
consisting of all unitary N × N matrices, and its subgroup SU(N), the special unitary group, whose
matrices in addition have determinant one. The special unitary groups are very important groups in
physics. For example, the symmetry group of the Standard Model is SU(3) × SU(2) × U(1). In this
appendix I will discuss a few basic results concerning Lie groups, group generators and the associated
Lie algebra. In particular I will discuss the cases of SU(2) and SU(3)^65.
65 The material discussed is based on [27].
B.1 Lie groups
Consider a Lie group G with elements $g(\xi)$, where $\xi$ is a real parameter. Because Lie groups are
analytic groups, the parametrization can be chosen such that any two elements multiply as
\[ g(\xi_1) g(\xi_2) = g(\xi_1 + \xi_2), \]
which implies the following properties for $g(\xi)$:
\[ g(0) = I \quad \text{and} \quad (g(\xi))^{-1} = g(-\xi). \]
Performing a Taylor expansion around the identity gives:
\[ g(\xi) = g(0) + g'(0)(\xi - 0) + O(\xi^2) = I + \xi t + O(\xi^2), \tag{B.1} \]
where
\[ t \equiv \left. \frac{dg(\xi)}{d\xi} \right|_{\xi=0}. \]
This t is called the generator of the group. We can obtain a nicer expression for $g(\xi)$ by rewriting
(B.1) as an exponential:
\[ g(\xi) = g(\xi/n)^n = \lim_{n \to \infty} \left( I + \frac{\xi}{n} t \right)^n = \exp(\xi t), \]
which may immediately be extended to an n-parameter Lie group:
\[ g(\xi_1, \dots, \xi_n) = \exp(\xi_a t_a), \]
where the summation convention is used. Just as t is the generator of the one-parameter Lie group,
the $t_a$ are the generators of the n-parameter Lie group, and they are linearly independent. In the
following we will have a closer look at these generators, which will lead us to introduce the Lie
algebra.
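The limit $(I + \xi t/n)^n \to \exp(\xi t)$ can be watched numerically. The sketch below assumes, purely as an illustration, the generator of rotations in the plane, $t = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, for which $\exp(\xi t)$ is the rotation matrix over the angle $\xi$; the function names are not from the text.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def approx_exp(t, xi, k):
    """(I + xi * t / n)^n with n = 2**k, computed by squaring k times."""
    n = 2 ** k
    dim = len(t)
    g = [[(1.0 if i == j else 0.0) + xi * t[i][j] / n for j in range(dim)]
         for i in range(dim)]
    for _ in range(k):
        g = matmul(g, g)
    return g

t = [[0.0, -1.0], [1.0, 0.0]]      # assumed example: generator of SO(2)
xi = 1.0
g = approx_exp(t, xi, 20)          # n = 2**20 factors
exact = [[math.cos(xi), -math.sin(xi)], [math.sin(xi), math.cos(xi)]]
error = max(abs(g[i][j] - exact[i][j]) for i in range(2) for j in range(2))
```

The deviation from the exact rotation matrix shrinks roughly like $\xi^2/(2n)$ as n grows.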
B.2 Lie algebra
Let G be a Lie group with elements $g(\xi_1, \dots, \xi_n)$ and generators $t_a$. Using the
Baker-Campbell-Hausdorff formula^66, any product of two elements can be expressed as:
\begin{align}
g(\xi_1, \dots, \xi_n) \cdot g(\zeta_1, \dots, \zeta_n) &= \exp(\xi_a t_a) \cdot \exp(\zeta_b t_b) \tag{B.2} \\
&= \exp\left\{ \xi_a t_a + \zeta_b t_b + \tfrac{1}{2} \xi_a \zeta_b [t_a, t_b] + \text{higher order commutators} \right\} \tag{B.3}
\end{align}
However, G is a group, so it must be closed under multiplication. The product in (B.2) must therefore
again be a group element, that is, we must have
\[ g(\xi_1, \dots, \xi_n) \cdot g(\zeta_1, \dots, \zeta_n) = g(\eta_1, \dots, \eta_n) = \exp(\eta_a t_a). \tag{B.4} \]
But this is possible if and only if any commutator of generators can again be written as a linear
combination of generators. We thus conclude that the generators must close under commutation:
\[ [t_a, t_b] = f_{abc} t_c \tag{B.5} \]
where the $f_{abc}$ are called the structure constants. With this property the generators form a basis
of the Lie algebra $\mathfrak{g}$ associated with the Lie group G.
66 The BCH formula [10] relates the product of the exponentials of two operators A and B to their
commutator [A,B]. The formula can be expressed as
$\exp(A) \cdot \exp(B) = \exp(A + B + \tfrac{1}{2}[A,B] + \text{higher order commutators of A and B})$.
Thus, in the case where A and B commute, i.e. [A,B] = 0, we recover the usual identity
$\exp(A)\exp(B) = \exp(A+B)$.
B.3 Examples
Important Lie groups in physics are U(N) and its subgroup SU(N). U(N) is the group of all unitary
N × N matrices and SU(N) is the group of unitary N × N matrices with unit determinant. The group U(N)
has $N^2$ independent generators. SU(N) therefore has $N^2 - 1$ generators, since we have the extra
constraint of unit determinant. The number of generators can be seen by considering a general N × N
matrix. It is described by $N^2$ complex numbers and thus depends on $2N^2$ real parameters. The
unitarity condition imposes $N^2$ constraints on the parameters, and thus the number of independent
parameters is $2N^2 - N^2 = N^2$. Here I will discuss the particular cases of SU(2) and SU(3).
SU(2)
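According to the above discussion we can write the group elements U of SU(2) as:
\[ U = \exp(\xi_a t_a) \]
The unitarity constraint, $UU^\dagger = \mathrm{Id}$, and the unit determinant constraint place the
following two requirements on the 3 group generators:
\[ t_a = -t_a^\dagger, \qquad \mathrm{Tr}(t_a) = 0 \]
Thus, the generators of SU(2) must be traceless anti-Hermitian matrices. Explicitly:
\[ t_a = -\frac{i}{2} \tau_a \tag{B.6} \]
where the factor of $\tfrac{1}{2}$ is a conventional normalization, the factor $-i$ makes the
Hermitian matrices $\tau_a$ anti-Hermitian, and the $\tau_a$ are the Pauli matrices given by:
\[ \tau_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\tau_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
\tau_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{B.7} \]
The structure constants for SU(2) are $f_{ijk} = \varepsilon_{ijk}$.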
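The closure relation (B.5) with $f_{abc} = \varepsilon_{abc}$ can be verified directly from the Pauli matrices. The sketch below assumes the anti-Hermitian normalization $t_a = -\tfrac{i}{2}\tau_a$, which is the choice compatible with $U = \exp(\xi_a t_a)$ being unitary; in the common physics convention $U = \exp(i\xi_a t_a)$ with Hermitian $t_a = \tau_a/2$ an extra factor i appears in the commutator. All names are illustrative.

```python
tau = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# assumed normalization: t_a = -(i/2) tau_a (traceless and anti-Hermitian)
t = [[[-0.5j * entry for entry in row] for row in m] for m in tau]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def epsilon(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def closes_with_epsilon():
    """Check [t_a, t_b] = epsilon_abc t_c for all a, b."""
    for a in range(3):
        for b in range(3):
            lhs = commutator(t[a], t[b])
            for i in range(2):
                for j in range(2):
                    rhs = sum(epsilon(a, b, c) * t[c][i][j] for c in range(3))
                    if abs(lhs[i][j] - rhs) > 1e-12:
                        return False
    return True

closed = closes_with_epsilon()
```

The same loop structure would work for any set of generators once the structure constants are supplied.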
SU(3)
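SU(3) has 8 independent generators $t_a$, $a = 1, \dots, 8$, which are expressed in terms of the
Gell-Mann matrices $\lambda_a$ in the same way as for SU(2), with $\lambda_a/2$ taking the place of
$\tau_a/2$. The Gell-Mann matrices are given by the following set of matrices:
\[ \lambda_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
\lambda_2 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
\lambda_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
\lambda_4 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \]
\[ \lambda_5 = \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, \quad
\lambda_6 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad
\lambda_7 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad
\lambda_8 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix} \]
We thus see that we can embed SU(2) in SU(3) by identifying the three Pauli matrices with the
upper-left 2 × 2 blocks of $\lambda_1$, $\lambda_2$ and $\lambda_3$.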
SO(N)
SO(N) is the group of all N × N orthogonal matrices U, i.e. $U^T U = \mathrm{Id}$, with unit
determinant. In physics SO(N) rotations are also called isometries because they leave lengths
invariant. The generators of SO(N) are antisymmetric (and hence traceless) N × N matrices. A basis is
given by the $\tfrac{1}{2}N(N-1)$ matrices with a single 1 above the diagonal and a corresponding -1
below it, such that the matrix is antisymmetric. Explicitly, for SO(3) the three generators are:
\[ \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
\begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad
\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} \]
C Notation and relevant quantum numbers
C.1 Notation
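Gamma matrices
The gamma matrices $\gamma^0, \dots, \gamma^3$ are given by [13]:
\[ \gamma^0 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \quad
\gamma^1 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}, \]
\[ \gamma^2 = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & i & 0 & 0 \\ -i & 0 & 0 & 0 \end{pmatrix}, \quad
\gamma^3 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} \]
They satisfy the Dirac algebra:
\[ (\gamma^0)^2 = \mathrm{Id}, \quad (\gamma^k)^2 = -\mathrm{Id}, \quad \gamma^{0\dagger} = \gamma^0, \quad
\gamma^{k\dagger} = -\gamma^k, \quad \{\gamma^\mu, \gamma^\nu\} = \gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2g^{\mu\nu} \]
where $g^{\mu\nu}$ is the metric tensor
\[ g^{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \]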
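Some relevant identities they satisfy are [12]:
• $g_{\mu\nu} g^{\mu\nu} = 4$
• $\gamma_\mu \gamma^\mu = 4$
• $\mathrm{tr}(\gamma^\mu) = 0$
• the trace of any product of an odd number of $\gamma^\mu$ is zero
• $\mathrm{tr}(\gamma_\mu \gamma^\mu) = 16$ (summed over $\mu$)
• $\mathrm{tr}(\gamma^\mu \gamma^\nu) = 4g^{\mu\nu}$
• $\mathrm{tr}(\slashed{a}\slashed{b}) = 4a \cdot b$ where $\slashed{a} \equiv a_\mu \gamma^\mu$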
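The anticommutation relation and the trace identity $\mathrm{tr}(\gamma^\mu\gamma^\nu) = 4g^{\mu\nu}$ can be checked by explicit matrix multiplication. The sketch below assumes the Dirac representation, in which $\gamma^k$ has the block form $\begin{pmatrix} 0 & \sigma_k \\ -\sigma_k & 0 \end{pmatrix}$ with $\sigma_k$ the Pauli matrices; the helper names are illustrative.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

sigma = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def gamma_from_sigma(s):
    """4 x 4 matrix with block form [[0, s], [-s, 0]]."""
    return [
        [0, 0, s[0][0], s[0][1]],
        [0, 0, s[1][0], s[1][1]],
        [-s[0][0], -s[0][1], 0, 0],
        [-s[1][0], -s[1][1], 0, 0],
    ]

gamma0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
gamma = [gamma0] + [gamma_from_sigma(s) for s in sigma]
metric = [1, -1, -1, -1]  # diagonal of g^{mu nu}

def clifford_ok():
    """Check gamma^mu gamma^nu + gamma^nu gamma^mu = 2 g^{mu nu} * Id entry by entry."""
    for mu in range(4):
        for nu in range(4):
            ab = matmul(gamma[mu], gamma[nu])
            ba = matmul(gamma[nu], gamma[mu])
            for i in range(4):
                for j in range(4):
                    target = 2 * metric[mu] if (mu == nu and i == j) else 0
                    if ab[i][j] + ba[i][j] != target:
                        return False
    return True

traces_ok = all(
    trace(matmul(gamma[mu], gamma[nu])) == (4 * metric[mu] if mu == nu else 0)
    for mu in range(4) for nu in range(4)
)
contracted_trace = sum(metric[mu] * trace(matmul(gamma[mu], gamma[mu])) for mu in range(4))
```

All entries are integers or Gaussian integers, so the comparisons are exact.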
C.2 Quantum numbers
In the Standard Model, hypercharge Y, electric charge Q and the third isospin component $I_3$ are
related through the Gell-Mann-Nishijima formula [12]:
\[ Q = I_3 + \frac{Y}{2} \tag{C.1} \]
where Y = B + S, with B the baryon number and S the strangeness.
C.2.1 Isospin
Isospin I was introduced in 1932 by Heisenberg to explain the similarity in the masses of the proton
and the neutron. Heisenberg considered the proton and neutron as two distinct states of a single
particle, which he called the nucleon; the two states would be indistinguishable apart from their
difference in electric charge. That is, he considered the nucleon as the following linear combination
of the proton and neutron:
\[ N = \alpha \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \beta \begin{pmatrix} 0 \\ 1 \end{pmatrix} \]
where
\[ p = e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad n = e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \]
Any transformation that mixes these two basis vectors yields a new linear combination of p and n,
i.e. a new state of the nucleon. In analogy with the notation for spin S he introduced the concept of
isospin I with third component $I_3$. The nucleon is assigned isospin $\tfrac{1}{2}$, and the third
component $I_3$ thereby has eigenvalues $+\tfrac{1}{2}$, corresponding to the proton, and
$-\tfrac{1}{2}$, corresponding to the neutron. The idea behind this notation was that the proton and
neutron are affected equally by the strong nuclear force. The charge independence of the strong
nuclear force was then seen as invariance under unitary transformations in isospin space. One such
transformation would be replacing all protons by neutrons and vice versa. Explicitly, the allowed
transformations were realized to be SU(2) transformations. Put differently, the space spanned by p and
n is invariant under SU(2) transformations. The same reasoning holds for the pions. The two charged
pions $\pi^\pm$ and the neutral $\pi^0$ are placed in an isospin triplet with I = 1, to explain the
similarity of their masses.
C.2.2 Weak isospin and weak hypercharge
Just as isospin is a conserved quantum number for the strong interaction, weak isospin T is conserved
in the weak interaction. In the same way we can form weak isospin doublets for the left-handed quarks
and leptons and weak isospin singlets for the right-handed particles. For the first family (the other
two families are analogous) this gives:
\[ \psi_L = \begin{pmatrix} \nu_e \\ e^- \end{pmatrix}_L, \ \begin{pmatrix} u \\ d \end{pmatrix}_L
\qquad \psi_R = e_R, \ u_R, \ d_R \tag{C.2} \]
An equivalent of (C.1) relates the weak hypercharge Y to the electric charge Q and the third weak
isospin component $T_3$. Explicitly:
\[ Q = T_3 + \frac{Y}{2} \tag{C.3} \]
The lepton doublet, for one, is assigned weak isospin $\tfrac{1}{2}$, with eigenvalues
$T_3 = \pm\tfrac{1}{2}$, where the particle with the highest charge is assigned the highest eigenvalue.
The electron doublet is thus found to have a hypercharge of -1. The singlet $e_R$, on the other hand,
has weak isospin 0 and thus a hypercharge of -2. Continuing for the other particles, we find the
following hypercharges for the first-family Standard Model particles.
• νL, eL have Y = −1
• eR has Y = −2
• uL, dL have Y = 1/3
• uR has Y = 4/3
• dR has Y = −2/3
• Higgs field φ has Y = 1
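The fermion hypercharges in this list follow from solving (C.3) for Y, i.e. $Y = 2(Q - T_3)$. The short sketch below redoes that arithmetic with exact fractions. The quark charges +2/3 and -1/3 are the standard values and are an assumption here, since the text does not quote them; the dictionary keys are illustrative labels.

```python
from fractions import Fraction as F

# (electric charge Q, third weak isospin component T3) for the first family;
# the quark charges +2/3 and -1/3 are standard values assumed for this sketch
particles = {
    "nu_L": (F(0), F(1, 2)),
    "e_L": (F(-1), F(-1, 2)),
    "e_R": (F(-1), F(0)),
    "u_L": (F(2, 3), F(1, 2)),
    "d_L": (F(-1, 3), F(-1, 2)),
    "u_R": (F(2, 3), F(0)),
    "d_R": (F(-1, 3), F(0)),
}

def hypercharge(q, t3):
    """Solve Q = T3 + Y/2 for Y."""
    return 2 * (q - t3)

Y = {name: hypercharge(q, t3) for name, (q, t3) in particles.items()}
```

Both members of each left-handed doublet come out with the same hypercharge, as they must.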
I will conclude this appendix with an example about charge conjugation [14].
Example C.1. Charge conjugation. The antiparticle fields of spin-1/2 particles can be obtained by
applying the charge conjugation operator $C = i\gamma^2\gamma^0$:
\[ \psi^C = C\bar{\psi}^T = C\gamma^0\psi^*, \qquad \bar{\psi}^C = \overline{\psi^C} \tag{C.4} \]
We can apply charge conjugation to the chirality eigenstates to conclude that
\[ (\psi_L)^C = (\psi^C)_R \quad \text{and} \quad (\psi_R)^C = (\psi^C)_L \tag{C.5} \]
which follows from the definition of C and the properties of the $\gamma$-matrices. Thus, applying
charge conjugation to a right-handed particle gives a left-handed antiparticle and vice versa.
Applying charge conjugation to an isospin doublet makes things a bit more complex. Since charge
conjugation involves complex conjugation, it reverses the sign of the eigenvalues of all generators of
the SU(2) symmetry. This can be seen by first looking at the $U(1)_Y$ hypercharge generator. Applying
complex conjugation to a U(1) group operator yields:
\[ [\exp(i\alpha Y)]^* = \exp(-i\alpha Y) = \exp[i\alpha(-Y)] \tag{C.6} \]
Therefore, if we apply charge conjugation to the left-handed doublet
$\psi_L = \begin{pmatrix} \nu_e \\ e^- \end{pmatrix}_L$, the doublet
$\psi' = \begin{pmatrix} \nu_e^C \\ e^+ \end{pmatrix}_R$ would have isospin $T_3 = -1/2$ and
$T_3 = +1/2$ for the upper and lower components respectively, which is not right. We can obtain the
correct result by using the matrix $i\tau_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ to reverse
the order of the doublet $\psi'$^67. Then the correct charge-conjugated isospin doublet is given by:
\[ \psi_R = i\tau_2 \begin{pmatrix} \nu_e^C \\ e^+ \end{pmatrix}_R = \begin{pmatrix} e^+ \\ -\nu_e^C \end{pmatrix}_R \tag{C.7} \]
67 A derivation can be found in [14].
D Feynman rules and calculating loop integrals
In particle physics, any measurable quantity is proportional to the square of the matrix element
$-iM$. This matrix element is obtained by consistently applying the Feynman rules, which can be derived
from the Lagrangian. QED, QCD and QFD each have their own set of rules. Here I will give an overview
of those rules and of how the matrix element can be computed.
For a given process, one draws all diagrams that are consistent with the Feynman rules. Interactions
are represented by vertices and propagators by lines connecting the vertices. Bosonic propagators are
drawn as wiggly lines and fermionic propagators as solid lines. Each propagator and each vertex, with
its coupling constant, is then associated with a certain factor. I list the relevant factors below
[12], [13]. The relevant propagator terms are:
• Spin-0 propagator: $\dfrac{i}{p^2 - m^2}$
• Spin-$\tfrac{1}{2}$ propagator: $\dfrac{i}{\slashed{p} - m} = \dfrac{i(\slashed{p} + m)}{p^2 - m^2}$
• Spin-1 propagator: $\dfrac{-i}{p^2 - m^2}\left[ g_{\mu\nu} - \dfrac{p_\mu p_\nu}{m^2} \right]$^68
Remark. These propagator terms are derived from the Klein-Gordon, Dirac and Proca equations we
have seen in section 6.1. One takes the free field equations (6.6), (6.12) and (6.10) and applies the
prescription $p_\mu \leftrightarrow i\partial_\mu$ to convert them to momentum space. The propagator is
then i times the inverse.
The relevant vertex terms are found by simply removing the fields from the interaction terms in the
Lagrangian. I derived a few in the analysis of the Standard Model Higgs mechanism. Those needed here
are:
• QED vertex: $-iQe\gamma^\mu = -iq\gamma^\mu$
• $H f\bar{f}$ Yukawa vertex: $-i\dfrac{\lambda_f}{\sqrt{2}}$
• Higgs 4 vertex: $-i\dfrac{\lambda}{4}$
• $HH W^+W^-$ 4 vertex: $i\dfrac{g^2}{4} g_{\mu\nu}$
• $HH ZZ$ 4 vertex: $i\dfrac{g^2 + g'^2}{4} g_{\mu\nu}$
The matrix element of a particular diagram is then the product of all the factors associated with the
diagram. Since multiple diagrams can contribute to a process, the total matrix element is the sum of
the matrix elements of the separate diagrams. However, loop diagrams also have to be taken into
account. These are to be seen as corrections to the tree-level diagrams and come with a factor of
$\pm i\int \frac{d^4p}{(2\pi)^4}$. Here the sign is - when a fermion runs in the loop and + when a
boson does, and we are to integrate over all momenta. This, however, leads to infinities, and
infinities are not allowed in any physical theory. In this context physicists often speak of
regularizing and renormalizing; these terms are indeed related, although they have very different
meanings.
Regularization methods deal with infinite integrals by splitting off the infinite part from the finite
part. There are different types of regularization schemes, all with their advantages and disadvantages.
Here I will briefly discuss cut-off regularization. Renormalization happens after the regularization
has taken place and involves absorbing the infinities into the parameters. This means that the bare
coupling constants and masses are reparametrized in terms of the physical coupling constants and masses
that we actually measure.
68 Note that the photon propagator is thus $-i\dfrac{g_{\mu\nu}}{p^2}$.
D.1 Superficial degree of divergence
Often, physicists are only interested in the type of divergence and not in the particular value of the
integral. This is captured by the superficial degree of divergence of the diagram. The superficial
degree of divergence is quickly determined by counting the powers of momentum p. In a loop diagram we
can have the following contributions of p.
• a loop contributes 4 powers of p through $\pm i\int \frac{d^4p}{(2\pi)^4}$
• a fermion propagator contributes -1 powers of p through $\frac{\slashed{p}}{p^2 - m^2}$, where
$\slashed{p} = p_\mu\gamma^\mu$.
• a boson (either scalar or massive vector boson) contributes -2 powers of p through $\frac{1}{p^2 - m^2}$
• a vertex containing a derivative with respect to p contributes -1 powers of p.
The superficial degree of divergence D is now defined to be the sum of the powers of the momenta from
these contributions, i.e.
\[ D = 4L - P_{\text{fermion}} - 2P_{\text{boson}} - V_{\partial/\partial p}, \tag{D.1} \]
where L is the number of loops, $P_{\text{fermion}}$ and $P_{\text{boson}}$ are the numbers of fermion
and boson propagators, and $V_{\partial/\partial p}$ is the number of such derivative vertices.
We expect the integral to converge when D < 0. When D = 0 we expect the diagram to diverge
logarithmically, when D = 1 we expect linear divergence and when D = 2 quadratic divergence.
However, the actual degree of divergence may be less, due to cancellations from divergent subdiagrams
or cancellations required by symmetries. The correction to the electron self-energy, for example, has
D = 4 - 1 - 2 = 1 and is thus expected to be linearly divergent, but turns out to be only
logarithmically divergent [24].
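The counting in (D.1) is mechanical, so it can be written as a small helper. This is a minimal sketch: the function names and the textual labels are illustrative, and the two example diagrams (the top-quark loop in the Higgs self-energy with two fermion propagators, and the one-loop electron self-energy with one fermion and one photon propagator) are counted by hand as an assumption.

```python
def superficial_degree(loops, fermion_props, boson_props, derivative_vertices=0):
    """Superficial degree of divergence D as in (D.1)."""
    return 4 * loops - fermion_props - 2 * boson_props - derivative_vertices

def describe(d):
    """Expected behaviour of the integral for a given D."""
    if d < 0:
        return "convergent"
    return {0: "logarithmic", 1: "linear", 2: "quadratic"}.get(d, f"degree {d}")

# top-quark loop correcting the Higgs mass: one loop, two fermion propagators
top_loop = superficial_degree(loops=1, fermion_props=2, boson_props=0)

# one-loop electron self-energy: one loop, one fermion and one photon propagator
electron_self_energy = superficial_degree(loops=1, fermion_props=1, boson_props=1)
```

As the text stresses, D is only an upper estimate; symmetries can reduce the actual divergence.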
D.2 Regularization schemes
D.2.1 Momentum Cut-off regularization
In the cut-off regularization scheme we only evaluate the integral up to a cut-off momentum Λ, and
at the end send Λ→∞. This Λ is the energy scale where the laws of physics break down. Beyond it
we don’t know how nature behaves, so we don’t even try computing it. It is a very effective way of
regulating an infinite integral, albeit a bit primitive. This type of regularization is used to compute
the divergent contributions to the Higgs mass.
D.3 Calculation of quadratic divergent contributions to the Higgs mass
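Here I will demonstrate the calculation of the top-loop divergent integral that contributes to the
Higgs mass, since this forms the largest contribution.
The calculation simplifies greatly if we neglect the momenta of the external particles and the masses
of the intermediate particles. These do not affect the leading divergence of the diagram, which is
determined by the momenta running through the loop.
\begin{align*}
M &= 3\left(-i\frac{\lambda_t}{\sqrt{2}}\right)^2 (-i)\int \frac{d^4k}{(2\pi)^4}\,
\mathrm{Tr}\left( \frac{i\,k_\mu\gamma^\mu}{k^2}\, \frac{i\,k_\nu\gamma^\nu}{k^2} \right) \\
&= (-i)\frac{3\lambda_t^2}{2} \int \frac{d^4k}{(2\pi)^4}\,
\mathrm{Tr}\left( \frac{k_\mu k_\nu \gamma^\mu\gamma^\nu}{k^4} \right) \\
&= (-i)\frac{3\lambda_t^2}{2} \int \frac{d^4k}{(2\pi)^4}\, \frac{4k^2}{k^4}
\end{align*}
where in the last equality we evaluated the trace: $\mathrm{Tr}(k_\mu k_\nu \gamma^\mu\gamma^\nu) = 4k^2$.
We now perform a so-called Wick rotation of the energy component to convert the integral over
Minkowski space into an integral over Euclidean space-time. This amounts to the substitution
\[ k^0 \to i k^0_E \]
such that $k^2 = -(k^0_E)^2 - |\mathbf{k}|^2 = -k_E^2$, where now $k_E = (k^0_E, \mathbf{k})$ is
equipped with the positive definite Euclidean scalar product^69. Implementing this substitution gives
for $\int d^4k$
\[ \int d^4k = \int dk^0 \int d^3\mathbf{k} = \int i\,dk^0_E \int d^3\mathbf{k} \]
and thus:
\begin{align*}
M &= -i\frac{3\lambda_t^2}{2} \int \frac{d^4k}{(2\pi)^4}\, \frac{4k^2}{k^4} \\
&= -i\frac{3\lambda_t^2}{2}\, \frac{4}{(2\pi)^4} \int i\,dk^0_E \int d^3\mathbf{k}\, \frac{-k_E^2}{k_E^4} \\
&= -\frac{3\lambda_t^2}{2}\, \frac{4}{(2\pi)^4} \int d^4k_E\, \frac{k_E^2}{k_E^4}
\end{align*}
We now convert the integral to spherical coordinates by noting that $d^4k = k^3\,dk\,d\Omega$, with
$\int d\Omega = 2\pi^2$. Then we get, no longer writing the E subscript:
\begin{align*}
M &= -\frac{3\lambda_t^2}{2}\, \frac{4}{(2\pi)^4} \int d^4k\, \frac{k^2}{k^4} \\
&= -\frac{3\lambda_t^2}{2}\, \frac{4}{(2\pi)^4} \int k^3\,dk\,d\Omega\, \frac{k^2}{k^4} \\
&= -\frac{3\lambda_t^2}{2}\, 4\, \frac{2\pi^2}{16\pi^4} \int_0^\Lambda k\,dk \\
&= -\frac{3\lambda_t^2}{8\pi^2}\Lambda^2
\end{align*}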
where in the third line we implemented the cut-off $\Lambda$. A more thorough calculation also involves the mass of the top quark and the momentum of the external Higgs particle.$^{70}$ The loop integral we have to compute again corresponds to the diagram with momenta $p + k$ and $k$ running through the loop.
$^{69}$ Recall that in Minkowski space the square of the four-momentum $k = (k^0, \mathbf{k})$ is defined as $k^2 = (k^0)^2 - |\mathbf{k}|^2$.
$^{70}$ The calculation is based on [24].
The full calculation then amounts to the following integral:
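$$
\begin{aligned}
M &= 3\left(\frac{-i\lambda_t}{\sqrt{2}}\right)^2 (-i)\int \frac{d^4k}{(2\pi)^4}\,\mathrm{Tr}\left(\frac{i\left((p+k)\gamma + m\right)}{(p+k)^2 - m^2}\,\frac{i\left(k\gamma + m\right)}{k^2 - m^2}\right) \\
&= (-i)\frac{3\lambda_t^2}{2}\int \frac{d^4k}{(2\pi)^4}\,\mathrm{Tr}\left(\frac{\left((p+k)\gamma + m\right)\left(k\gamma + m\right)}{\left((p+k)^2 - m^2\right)\left(k^2 - m^2\right)}\right) \qquad \text{(D.2)}
\end{aligned}
$$
Here $k\gamma$ is shorthand for $k_\mu\gamma^\mu$. The first thing to tackle is the denominator. For this we use Feynman's trick, which handles terms that are a product of several propagators by combining them into a single denominator. It allows us to complete the square and makes the calculation easier. In its simplest form, for two propagators, it says:
$$
\frac{1}{ab} = \int_0^1 dx\,\frac{1}{\left[xa + (1-x)b\right]^2} = \int_0^1 dx\,\frac{1}{\left[b + (a-b)x\right]^2}
$$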
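The two-propagator identity is easy to test numerically. The sketch below is only an illustration under stated assumptions: `a` and `b` are taken real and of the same sign, so that the denominator has no zero on $0 \le x \le 1$, and the integral is approximated with a composite Simpson rule. The function names and test values are my own.

```python
# Numerical check of 1/(ab) = int_0^1 dx / [x a + (1 - x) b]^2.
# Assumption: a and b are real with the same sign (no pole on [0, 1]).

def simpson(f, lo, hi, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def feynman_rhs(a, b):
    """Right-hand side: integral over x of 1 / [x a + (1 - x) b]^2."""
    return simpson(lambda x: 1.0 / (x * a + (1.0 - x) * b) ** 2, 0.0, 1.0)

a, b = 2.0, 5.0  # arbitrary test values
print(1.0 / (a * b), feynman_rhs(a, b))
```

The two printed numbers should agree to many digits. In the loop integral itself the denominators can vanish, which is where the $i\epsilon$ prescription of the propagators would enter; that is not modelled in this sketch.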
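Thus letting $a = (p+k)^2 - m^2$ and $b = k^2 - m^2$ we can rewrite (D.2) as
$$
(-i)\frac{3\lambda_t^2}{2}\int \frac{d^4k}{(2\pi)^4}\int_0^1 dx\,\frac{\mathrm{Tr}\left[\left((p+k)\gamma + m\right)\left(k\gamma + m\right)\right]}{\left[k^2 - m^2 + (p^2 + 2pk)x\right]^2} \qquad \text{(D.3)}
$$
We now make a change of variables by setting $l = k + px$. Then $d^4l = d^4k$ and we obtain:
$$
\begin{aligned}
k^2 - m^2 + (p^2 + 2pk)x &= l^2 + p^2x^2 - 2lpx - m^2 + p^2x + 2plx - 2p^2x^2 \\
&= l^2 + p^2x(1-x) - m^2
\end{aligned}
$$
for the denominator, and for the numerator we get:
$$
\left((p+k)\gamma + m\right)\left(k\gamma + m\right) = \left((l + p(1-x))\gamma + m\right)\left((l - px)\gamma + m\right)
$$
We can now simplify things by noting that to leading order in $l$ the numerator becomes $l_\mu\gamma^\mu l_\nu\gamma^\nu$. Evaluating the trace yields $4l^2$. Then (D.3) becomes, to leading order in $l$:
$$
(-i)\frac{3\lambda_t^2}{2}\int_0^1 dx\int \frac{d^4l}{(2\pi)^4}\,\frac{4l^2}{\left[l^2 + p^2x(1-x) - m^2\right]^2}
$$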
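As before we perform a Wick rotation to convert the integral to Euclidean space. Thus we make the substitution $l^0 = i l^0_E$, so that $l^2 = -l_E^2$ and $d^4l = i\,d^4l_E$. Then we get:
$$
-\frac{3\lambda_t^2}{2}\int_0^1 dx\int \frac{d^4l_E}{16\pi^4}\,\frac{4l_E^2}{\left[l_E^2 + m^2 - p^2x(1-x)\right]^2}
$$
Making the further substitution $\Delta = m^2 - p^2x(1-x)$ and converting the integral to spherical coordinates, i.e. $d^4l = 2\pi^2 l^3\,dl$ after the angular integration, we get, again dropping the $E$ subscript:
$$
-\frac{3\lambda_t^2}{4\pi^2}\int_0^1 dx\int_0^\Lambda dl\,\frac{l^5}{\left[l^2 + \Delta\right]^2} \qquad \text{(D.4)}
$$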
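where we also implemented the cut-off $\Lambda$. The integral over $l$ can be evaluated explicitly:
$$
\int_0^\Lambda dl\,\frac{l^5}{\left[l^2 + \Delta\right]^2} = \left[\frac{l^2}{2} - \frac{\Delta^2}{2(l^2 + \Delta)} - \Delta\log(l^2 + \Delta)\right]_0^\Lambda
$$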
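The antiderivative above can be compared against direct numerical integration. The following is a minimal sketch, assuming $\Delta > 0$ (that is, $p^2 x(1-x) < m^2$) so that the integrand is regular on the whole range; the numerical values of `delta` and `cutoff` are arbitrary illustrative choices in arbitrary units.

```python
import math

# Compare the closed form of int_0^Lambda dl l^5 / (l^2 + Delta)^2
# with a composite Simpson approximation. Assumption: Delta > 0.

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def antiderivative(l, delta):
    """l^2/2 - Delta^2 / (2 (l^2 + Delta)) - Delta log(l^2 + Delta)."""
    s = l ** 2 + delta
    return l ** 2 / 2 - delta ** 2 / (2 * s) - delta * math.log(s)

def radial_integral_closed(cutoff, delta):
    """Closed form evaluated between 0 and the cut-off."""
    return antiderivative(cutoff, delta) - antiderivative(0.0, delta)

def radial_integral_numeric(cutoff, delta):
    """Direct numerical integration of the same integrand."""
    return simpson(lambda l: l ** 5 / (l ** 2 + delta) ** 2, 0.0, cutoff)

delta, cutoff = 0.5, 3.0
print(radial_integral_closed(cutoff, delta), radial_integral_numeric(cutoff, delta))

# For a cut-off much larger than sqrt(Delta) the Lambda^2 / 2 term dominates.
big_cutoff = 1000.0
print(radial_integral_closed(big_cutoff, delta) / (big_cutoff ** 2 / 2))
```

The first line prints two values that should coincide up to the integration error, and the second ratio approaches one as the cut-off grows, which is the statement that the quadratic term is the leading one.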
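Inserting the limits, (D.4) becomes:
$$
-\frac{3\lambda_t^2}{4\pi^2}\int_0^1 dx\left[\frac{\Lambda^2}{2} - \frac{\Delta^2}{2(\Lambda^2 + \Delta)} + \frac{\Delta}{2} - \Delta\log\frac{\Lambda^2 + \Delta}{\Delta}\right]
$$
After the integration over $x$ this becomes, to leading order in $\Lambda$:
$$
-\frac{3\lambda_t^2\Lambda^2}{8\pi^2}
$$
which is in agreement with our first calculation and with the literature [18].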
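To get a feeling for the size of this term, the result can be evaluated for sample inputs. The sketch below reads the expression as a correction to the Higgs mass squared and assumes a top Yukawa coupling of order one and a cut-off of 10 TeV; both numbers are illustrative assumptions rather than values fixed by the calculation above, and the function name is my own.

```python
import math

def top_loop_correction(lambda_t, cutoff):
    """Leading cut-off dependence -3 lambda_t^2 Lambda^2 / (8 pi^2)."""
    return -3.0 * lambda_t ** 2 * cutoff ** 2 / (8.0 * math.pi ** 2)

lambda_t = 1.0          # assumed top Yukawa coupling of order one
cutoff_tev = 10.0       # assumed cut-off scale in TeV
higgs_mass_tev = 0.125  # measured Higgs mass of roughly 125 GeV, in TeV

correction = top_loop_correction(lambda_t, cutoff_tev)
print("correction [TeV^2]:", correction)
print("sqrt(|correction|) [TeV]:", math.sqrt(abs(correction)))
print("|correction| / m_h^2:", abs(correction) / higgs_mass_tev ** 2)
```

Because the correction scales with $\Lambda^2$, doubling the cut-off quadruples it. Under these assumptions its magnitude is far larger than the square of the observed Higgs mass, which is the hierarchy problem in numerical form.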
References
[1] W. Fulton, J. Harris, Representation Theory: A First Course, Springer-Verlag, New York, 1991
[2] G. James, A. Kerber, The Representation Theory of the Symmetric Group, Addison-Wesley Publishing Company, Massachusetts, 1981
[3] I. G. Macdonald, Symmetric Functions and Hall Polynomials, 2nd Ed., Oxford University Press, New York, 1995
[4] P. Etingof, Introduction to Representation Theory, January 10, 2011, http://math.mit.edu/~etingof/replect.pdf
[5] J. Stokman, Algebra 3; Representatietheorie. Aanvulling 2, University of Amsterdam
[6] J. Stokman, Algebra 3; Representatietheorie. Aanvulling 3, University of Amsterdam
[7] J. Stokman, Algebra 3; Representatietheorie. Aanvulling 4, University of Amsterdam
[8] J. Stokman, Algebra 3; Representatietheorie. Aanvulling 5, University of Amsterdam
[9] T. Bröcker, T. tom Dieck, Representations of Compact Lie Groups, Graduate Texts in Mathematics, Springer-Verlag
[10] J. Fuchs, C. Schweigert, Symmetries, Lie Algebras and Representations, pp. 137, 156, Cambridge University Press, New York, 1997
[11] C. Quigg, Gauge Theories of the Strong, Weak and Electromagnetic Interactions, The Benjamin-Cummings Publishing Company, Canada, 1983
[12] D. Griffiths, Introduction to Elementary Particles, 2nd Rev. Ed., Wiley-VCH, Weinheim, 2008
[13] M. Thomson, Modern Particle Physics, Cambridge University Press, Cornwall, 2014
[14] W. Greiner, B. Müller, Gauge Theory of Weak Interactions, p. 305, Springer, Fourth Edition, Thun, 1986
[15] http://isites.harvard.edu/fs/docs/icb.topic1146666.files/IV-4-SpontaneousSymmetryBreaking.pdf
[16] C. ter Burg, S. Bakker, Project Wiskunde, Variatierekening, Tweede-jaars project Bachelor Wiskunde, UvA, 2014
[17] G. 't Hooft, Naturalness, chiral symmetry and spontaneous symmetry breaking, inspire-hep/144074v1, 1980
[18] M. Schmaltz, D. Tucker-Smith, Little Higgs Review, arXiv:hep-ph/0502182v1, 2005
[19] M. Schmaltz, The simplest little Higgs, arXiv:hep-ph/0407143v2, 2004
[20] N. Arkani-Hamed, et al, The Littlest Higgs, arXiv:hep-ph/0206021v2, 2002
[21] T. Han, et al, Phenomenology of the Little Higgs Model, arXiv:hep-ph/0301040v4, 2004
[22] A. Birkedal, et al, Little Higgs Dark Matter, arXiv:hep-ph/0603077v3, 2012
[23] CMS Collaboration, Search for a massive resonance decaying into a Higgs boson and a W or Z boson in hadronic final states in proton-proton collisions at $\sqrt{s} = 8$ TeV, arXiv:hep-ex/1506.01443v1, 2015
[24] J. Lukkezen, Little Higgs Phenomenology, Master Thesis, Universiteit van Amsterdam, 2008
[25] M. Brak, The Hierarchy Problem in the Standard Model and Little Higgs Theories, Master Thesis,
Universiteit van Amsterdam, 2004
[26] I. van Vulpen, The Standard Model Higgs Boson, University of Amsterdam, 2013-2014.
[27] E. Laenen, Lecture Notes Quantum Field Theory: Appendix C: Introduction to group theory, University of Amsterdam
[28] http://www.quantumdiaries.org/2012/07/01/the-hierarchy-problem-why-the-higgs-has-a-snowballs-chance-in-hell/
[29] http://www.quantumdiaries.org/2011/11/21/why-do-we-expect-a-higgs-boson-part-i-electroweak-symmetry-breaking/