Joachim Wehler
Lie Algebras
in Mathematics and Physics
DRAFT, Release 0.8
May 1, 2017
I have prepared these notes for the students of my lecture. The lecture took place during the winter semester 2016/17 at the mathematical department of LMU (Ludwig-Maximilians-Universität) in Munich.
Compared to the oral lecture in class these written notes contain some additional material.
Please report any errors or typos to [email protected]
Release notes:
• Release 0.8: Addition to Proposition 7.26
• Release 0.7: Several typos corrected, in particular: Last line in Table 7.1.
• Release 0.6: Several typos corrected.
• Release 0.5: Proof of Lemma 1.11 corrected, Chapter 8 added.
Contents
Part I General Lie Algebra Theory
1  Matrix functions ............................................. 3
   1.1  Power series of matrices ............................... 3
   1.2  Exponential of matrices ................................ 10
   1.3  Logarithm of matrices .................................. 15

2  Jordan theory ................................................ 21
   2.1  Jordan decomposition ................................... 21
   2.2  Surjectivity of the complex exponential map ............ 28
   2.3  Jordan normal form ..................................... 30

3  Fundamentals of Lie algebra theory ........................... 35
   3.1  Definitions and first examples ......................... 35
   3.2  Lie algebras of the classical groups ................... 38
   3.3  The adjoint representation of a Lie algebra ............ 52

4  Nilpotent Lie algebras and Solvable Lie algebras ............. 57
   4.1  Engel's theorem for nilpotent Lie algebras ............. 57
   4.2  Lie's theorem for solvable Lie algebras ................ 66
   4.3  Heisenberg algebra in 1-dimensional quantum mechanics .. 73

5  Killing form and Semisimple Lie algebras ..................... 79
   5.1  Trace of endomorphisms ................................. 79
   5.2  Fundamentals of semisimple Lie algebras ................ 83
   5.3  Weyl's theorem on complete reducibility ................ 90

Part II  Structure of Complex Semisimple Lie Algebras

6  Cartan decomposition ......................................... 105
   6.1  Toral subalgebra ....................................... 105
   6.2  sl(2,C)-modules ........................................ 107
   6.3  Decomposition into eigenspaces ......................... 115

7  Root systems ................................................. 123
   7.1  Abstract root system ................................... 123
   7.2  Action of the Weyl group ............................... 131
   7.3  Coxeter graph .......................................... 136

8  Classification of complex semisimple Lie algebras ............ 151
   8.1  The root system of a semisimple Lie algebra ............ 151
   8.2  Root system of the complex Lie algebras from A,B,C,D-series 167
   8.3  Outlook ................................................ 181

References ...................................................... 183
Part I
General Lie Algebra Theory
Chapter 1
Matrix functions
The paradigm of Lie algebras is the vector space of matrices with the commutator of two matrices as Lie bracket. These concrete examples even cover all abstract finite-dimensional Lie algebras, which are the focus of these notes. Nevertheless it is useful to consider Lie algebras from an abstract viewpoint as a separate algebraic structure like groups or rings.
We denote by K either the field R of real numbers or the field C of complex numbers. Both fields have characteristic 0. The relevant difference is the fact that C is algebraically closed, i.e. each polynomial with complex coefficients of degree n has exactly n complex roots, counted with multiplicity. This result allows one to transform matrices over C to certain standard forms by transformations which make use of the eigenvalues.
1.1 Power series of matrices
In high school every pupil learns the equation

    exp(x) · exp(y) = exp(x + y).

This formula holds for all real numbers x, y. Hence exponentiation defines a map

    exp : R → R*.

Students who attended a class on complex analysis will remember that exponentiation is also defined for complex numbers. Hence exponentiation extends to a map

    exp : C → C*.
This map satisfies the same functional equation. The reason for the seamless transition from the real to the complex field is the fact that the exponential map is defined by a power series

    exp(z) = ∑_{ν=0}^∞ (1/ν!)·z^ν

which converges not only for all real numbers but for all complex numbers too.

In order to generalize the exponential map further we try to exponentiate arguments which are strict upper triangular matrices

        ( 0  p  0 )        ( 0  0  0 )
    P = ( 0  0  0 ),   Q = ( 0  0  q )   ∈ n(3,K).
        ( 0  0  0 )        ( 0  0  0 )
We compute

    exp(P) = ∑_{ν=0}^∞ (1/ν!)·P^ν = 1 + P,   exp(Q) = ∑_{ν=0}^∞ (1/ν!)·Q^ν = 1 + Q,

i.e.

             ( 1  p  0 )             ( 1  0  0 )
    exp(P) = ( 0  1  0 ),   exp(Q) = ( 0  1  q ).
             ( 0  0  1 )             ( 0  0  1 )

Note P^ν = Q^ν = 0 for all ν ≥ 2. Hence we do not need to care about questions of convergence. To check the functional equation we compute on one hand

                    ( 1  p  0 )   ( 1  0  0 )   ( 1  p  pq )
    exp(P)·exp(Q) = ( 0  1  0 ) · ( 0  1  q ) = ( 0  1  q  ).
                    ( 0  0  1 )   ( 0  0  1 )   ( 0  0  1  )

On the other hand

                                              ( 1  p  (pq)/2 )
    exp(P+Q) = 1 + P + Q + (1/2)·(P+Q)²  =    ( 0  1  q      ).
                                              ( 0  0  1      )

Apparently the functional equation is not satisfied:

    exp(P+Q) ≠ exp(P)·exp(Q).

Fortunately the defect can be repaired by introducing the matrix

                         ( 0  0  (pq)/2 )
    C := (1/2)·[P,Q] =   ( 0  0  0      ).
                         ( 0  0  0      )

Here [P,Q] := PQ − QP denotes the commutator of the matrices P and Q. Then

    exp(P+Q+(1/2)·[P,Q]) = exp(P+Q+C) = 1 + (P+Q+C) + (1/2)·(P+Q+C)²

                           ( 1  p  pq )
                         = ( 0  1  q  ) = exp(P)·exp(Q).
                           ( 0  0  1  )
The exponential map can be defined on the strict upper triangular matrices
exp : n(3,K)→ GL(3,K).
It satisfies the functional equation in the generalized form:
exp(P+Q+1/2 · [P,Q]) = exp(P) · exp(Q).
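The corrected functional equation can also be checked numerically. The following sketch is my own illustration, not part of the notes; all function names are ad hoc. It verifies exp(P+Q+(1/2)·[P,Q]) = exp(P)·exp(Q) for the sample values p = 2, q = 3. Since the matrices are strict upper triangular, the exponential series is a finite sum and no convergence questions arise.

```python
# Check exp(P + Q + [P,Q]/2) = exp(P) * exp(Q) for the strict upper
# triangular matrices of this section, with sample values p = 2, q = 3.

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(c, A):
    return [[c * x for x in row] for row in A]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def exp_nilpotent(A):
    # A is strict upper triangular 3x3, hence A^3 = 0: the series stops early.
    n = len(A)
    result, term = identity(n), identity(n)
    for k in range(1, n):
        term = mat_scale(1.0 / k, mat_mul(term, A))
        result = mat_add(result, term)
    return result

p, q = 2.0, 3.0
P = [[0.0, p, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
Q = [[0.0, 0.0, 0.0], [0.0, 0.0, q], [0.0, 0.0, 0.0]]

commutator = mat_add(mat_mul(P, Q), mat_scale(-1.0, mat_mul(Q, P)))  # [P,Q]
lhs = exp_nilpotent(mat_add(mat_add(P, Q), mat_scale(0.5, commutator)))
rhs = mat_mul(exp_nilpotent(P), exp_nilpotent(Q))
```

Both sides evaluate to the same unipotent matrix with upper right entry pq.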
The goal of the present section is to extend the exponential map to all matrices. Therefore we need a concept of convergence for sequences and series of matrices. The basic ingredient is the norm of a matrix.
Definition 1.1 (Norm). We consider the K-vector space K^n with the Euclidean norm

    ‖x‖ := √( ∑_{i=1}^n |x_i|² )    for x = (x₁, ..., x_n) ∈ K^n.
On the K-algebra M(n×n,K) we introduce the operator norm
‖A‖ := sup{‖Ax‖ : x ∈Kn and ‖x‖ ≤ 1}.
Note that ‖A‖< ∞ due to compactness of the unit ball
{x ∈Kn : ‖x‖ ≤ 1}.
Intuitively, the operator norm of A is the maximal factor by which the linear map determined by A stretches the vectors of the unit ball of K^n.
The operator norm ‖A‖ only depends on the endomorphism which is represented by the matrix A. Hence all matrices which represent the same endomorphism with respect to different bases have the same norm.
The K-vector space M(n×n,K) of all matrices with components from K is an associative K-algebra with respect to the matrix product

    A·B ∈ M(n×n,K)

because

    (A·B)·C = A·(B·C)

for matrices A, B, C ∈ M(n×n,K).
Proposition 1.2 (The associative algebra of matrices). The matrix algebra (M(n×n,K), ‖·‖) is a normed associative algebra:
1. ‖A‖ = 0 iff A = 0
2. ‖A+B‖ ≤ ‖A‖ + ‖B‖
3. ‖λ·A‖ = |λ|·‖A‖ for λ ∈ K
4. ‖A·B‖ ≤ ‖A‖·‖B‖
5. ‖1‖ = 1 with 1 ∈ M(n×n,K) the unit matrix.
6. For a matrix A = (a_ij) ∈ M(n×n,K) holds

    ‖A‖_sup ≤ ‖A‖ ≤ n·‖A‖_sup

with

    ‖A‖_sup := sup{|a_ij| : 1 ≤ i, j ≤ n}.
Proof. ad 6) For j = 1, ..., n denote by e_j the j-th canonical basis vector of K^n. Then for all i = 1, ..., n

    ‖A‖ ≥ ‖A e_j‖ = √( ∑_{k=1}^n |a_kj|² ) ≥ |a_ij|,

hence

    ‖A‖ ≥ ‖A‖_sup.

Moreover, the Cauchy-Schwarz inequality implies

    ‖Ax‖² = ∑_{i=1}^n | ∑_{j=1}^n a_ij·x_j |² ≤ ∑_{i=1}^n ( ∑_{j=1}^n |a_ij|² · ∑_{j=1}^n |x_j|² ) ≤ n²·sup_{i,j} |a_ij|² · ‖x‖²,

hence

    ‖A‖ ≤ n·‖A‖_sup.
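The two-sided estimate of part 6 can be illustrated numerically. The sketch below is my own addition, not from the notes; the example matrix is arbitrary. It estimates the operator norm by power iteration on AᵀA, whose largest eigenvalue equals ‖A‖².

```python
import math

# Estimate the operator norm of A by power iteration on A^T A: the
# largest eigenvalue of A^T A equals ||A||^2.  Then check
# ||A||_sup <= ||A|| <= n * ||A||_sup  (Proposition 1.2, part 6).

def op_norm(A, iters=200):
    n = len(A)
    AtA = [[sum(A[k][i] * A[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
    x = [1.0] * n
    for _ in range(iters):
        y = [sum(AtA[i][j] * x[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(v * v for v in y))
        x = [v / length for v in y]
    # Rayleigh quotient at the converged vector gives the top eigenvalue.
    lam = sum(x[i] * sum(AtA[i][j] * x[j] for j in range(n)) for i in range(n))
    return math.sqrt(lam)

A = [[1.0, 2.0], [3.0, 4.0]]
sup_norm = max(abs(entry) for row in A for entry in row)   # ||A||_sup = 4
norm = op_norm(A)                                          # ~ 5.465
```

For this A the operator norm is approximately 5.465, squeezed between ‖A‖_sup = 4 and n·‖A‖_sup = 8 as the proposition predicts.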
Remark 1.3. 1. Because the operator norm ‖A‖ and the sup-norm of the entries of A can be mutually estimated by each other, the following two convergence concepts are equivalent for a sequence (A_ν)_{ν∈N} of matrices A_ν ∈ M(n×n,K), ν ∈ N, and a matrix A ∈ M(n×n,K):

• lim_{ν→∞} ‖A_ν − A‖ = 0, i.e. lim_{ν→∞} A_ν = A (norm convergence).
• The sequence (A_ν)_{ν∈N} converges to A component-by-component.

2. In particular each Cauchy sequence of matrices is convergent: The matrix algebra

    (M(n×n,K), ‖·‖)

is complete, i.e. a Banach algebra.
3. The vector space M(n×n,K) is finite-dimensional, its dimension is n². Hence any two norms are equivalent in the sense that they dominate each other. Therefore the structure of a topological vector space on M(n×n,K) does not depend on the choice of the norm.
Because (M(n×n,K), ‖·‖) is a normed algebra according to Proposition 1.2, concepts from analysis like convergence, Cauchy sequence, infinite series, and continuous function also apply to matrices.
We recall the fundamental properties of a complex power series

    ∑_{ν=0}^∞ c_ν·z^ν

with domain of convergence D ⊂ C:

• The series is absolutely convergent in D, i.e. also ∑_{ν=0}^∞ |c_ν|·|z|^ν is convergent for z ∈ D.
• The series converges compactly in D, i.e. the convergence is uniform on each compact subset of D.
• The series is infinitely often differentiable in D; its derivative can be obtained by termwise differentiation.
Lemma 1.4 (Convergent matrix series). Consider a power series

    ∑_{ν=0}^∞ c_ν·z^ν

with coefficients c_ν ∈ K, ν ∈ N, and radius of convergence R > 0. Let

    B(R) := {A ∈ M(n×n,K) : ‖A‖ < R}

be the open ball in M(n×n,K) around zero with radius R. Consider a matrix A ∈ B(R):

• The series

    f(A) := ∑_{ν=0}^∞ c_ν·A^ν := lim_{n→∞} ∑_{ν=0}^n c_ν·A^ν ∈ M(n×n,K)

is convergent; it is absolutely and compactly convergent in B(R).

• The series satisfies [f(A), A] = 0 with the commutator

    [f(A), A] := f(A)·A − A·f(A).
• The function

    f : B(R) → M(n×n,K),  A ↦ f(A),

is continuous.
Proof. We apply the Cauchy criterion: For N > M holds

    ‖ ∑_{ν=0}^N c_ν·A^ν − ∑_{ν=0}^M c_ν·A^ν ‖ ≤ ∑_{ν=M+1}^N |c_ν|·‖A‖^ν.

If r := ‖A‖ < R then ∑_{ν=0}^∞ |c_ν|·r^ν converges. Hence the Cauchy criterion is satisfied and the limit

    lim_{n→∞} ∑_{ν=0}^n c_ν·A^ν

exists.

The equation [A, f(A)] = 0 about the commutator follows by taking the limit of the equation

    [A, ∑_{ν=0}^n c_ν·A^ν] = ∑_{ν=0}^n c_ν·[A, A^ν] = 0    for each n ∈ N.

The remaining statements about convergence follow from the estimate

    ‖f(A)‖ ≤ ∑_{ν=0}^∞ |c_ν|·‖A‖^ν

and the corresponding properties of the complex power series. Continuity of f is a consequence of the compact convergence, q.e.d.
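The commutation relation [f(A), A] = 0 is easy to observe numerically. The following sketch is my own illustration, not part of the notes; it uses the truncated exponential series as a concrete f, since every polynomial in A commutes with A and the property survives the limit.

```python
# Check [f(A), A] = 0 for a truncated power series f(A), here the
# exponential series; any polynomial in A commutes with A.

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(c, A):
    return [[c * x for x in row] for row in A]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(A, terms=40):
    n = len(A)
    result, term = identity(n), identity(n)
    for k in range(1, terms):
        term = mat_scale(1.0 / k, mat_mul(term, A))
        result = mat_add(result, term)
    return result

A = [[0.1, 0.2], [0.3, 0.4]]
fA = mat_exp(A)
comm = mat_add(mat_mul(fA, A), mat_scale(-1.0, mat_mul(A, fA)))  # [f(A), A]
max_entry = max(abs(x) for row in comm for x in row)             # ~ 0
```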
Definition 1.5 (Derivative of matrix functions). Let I ⊂ R be an open interval. A matrix function

    A : I → M(n×n,K)

is differentiable at a point t ∈ I iff the limit

    lim_{h→0} (A(t+h) − A(t))/h

exists. In this case we employ the notation

    A′(t) := (dA/dt)(t) := lim_{h→0} (A(t+h) − A(t))/h ∈ M(n×n,K).
Lemma 1.6 (Differentiation of convergent matrix series). Consider a power series

    f(z) = ∑_{ν=0}^∞ c_ν·z^ν,   c_ν ∈ K,

with radius of convergence R > 0. Let I ⊂ R be an open interval and

    A : I → B(R) ⊂ M(n×n,K)

a differentiable function with [A′(t), A(t)] = 0 for all t ∈ I. Then also the function

    f∘A : I → M(n×n,K),  t ↦ f(A(t)) := ∑_{ν=0}^∞ c_ν·A^ν(t),

is differentiable for all t ∈ I and

    (d/dt) f(A(t)) = f′(A(t))·A′(t) = A′(t)·f′(A(t)).

Here f′ denotes the derivative of f term by term.
Proof. The series

    f(A(t)) = ∑_{ν=0}^∞ c_ν·A^ν(t)

and

    f′(A(t)) = ∑_{ν=1}^∞ ν·c_ν·A^{ν−1}(t)

are convergent because ‖A(t)‖ < R. Due to the compact convergence we may differentiate f(A(t)) term by term. The proof of the lemma reduces to proving

    (d/dt) A^ν(t) = ν·A^{ν−1}(t)·A′(t) = ν·A′(t)·A^{ν−1}(t).

The proof goes by induction on ν ∈ N. For the induction step assume the claim to hold for ν ∈ N. Analogous to the proof of the product rule from calculus:

    (d/dt) A^{ν+1}(t) = lim_{h→0} (A^{ν+1}(t+h) − A^{ν+1}(t))/h
    = lim_{h→0} (A^ν(t+h)·A(t+h) − A^ν(t)·A(t))/h
    = lim_{h→0} (A^ν(t+h)·A(t+h) − A^ν(t)·A(t+h) + A^ν(t)·A(t+h) − A^ν(t)·A(t))/h
    = lim_{h→0} ( (A^ν(t+h) − A^ν(t))/h · A(t+h) ) + lim_{h→0} ( A^ν(t) · (A(t+h) − A(t))/h )
    = ((d/dt) A^ν(t))·A(t) + A^ν(t)·A′(t) = ν·A^{ν−1}(t)·A′(t)·A(t) + A^ν(t)·A′(t)
    = (ν+1)·A^ν(t)·A′(t) = (ν+1)·A′(t)·A^ν(t).

To obtain the third last equation we have employed the induction assumption, q.e.d.
1.2 Exponential of matrices
The exponential series

    exp(z) = ∑_{ν=0}^∞ z^ν/ν!,   z ∈ C,

has radius of convergence R = ∞.
Definition 1.7 (Exponential of matrices). For any matrix A ∈ M(n×n,K) one defines the exponential

    exp(A) := ∑_{ν=0}^∞ A^ν/ν! ∈ M(n×n,K).
Proposition 1.8 (Derivative of the exponential). Consider an open interval I ⊂ R and a differentiable function

    A : I → M(n×n,K)

with [A′(t), A(t)] = 0 for all t ∈ I. Then for all t ∈ I

    (d/dt)(exp A(t)) = A′(t)·exp A(t) = exp A(t)·A′(t).
Proof. Lemma 1.6 with the power series

    f(z) = ∑_{ν=0}^∞ (1/ν!)·z^ν

and its derivative

    f′(z) = f(z)

implies

    (d/dt)(exp A(t)) = ( ∑_{ν=0}^∞ (1/ν!)·A^ν(t) )·A′(t) = exp A(t)·A′(t) = A′(t)·exp A(t).
In order to derive some fundamental properties of the exponential map we recall the following matrix representation of certain endomorphisms.
Definition 1.9 (Diagonal and triangular form of endomorphisms). Consider a finite-dimensional vector space V.

1. A matrix A = (a_ij) ∈ M(n×n,K) is an upper triangular matrix iff a_ij = 0 for all i > j. An endomorphism f ∈ End(V) is triagonalizable iff it is representable by an upper triangular matrix from M(n×n,K).

2. A matrix A = (a_ij) ∈ M(n×n,K) is a diagonal matrix iff a_ij = 0 for all i ≠ j. An endomorphism f ∈ End(V) is diagonalizable iff it is representable by a diagonal matrix from M(n×n,K).
Lemma 1.10 (Diagonalization and triagonalization). Consider a complex, finite-dimensional vector space V.

1. An endomorphism f ∈ End(V) is diagonalizable if all its eigenvalues are distinct.

2. Every endomorphism f ∈ End(V) is triagonalizable.

3. For each endomorphism f ∈ End(V) a sequence (f_ν)_{ν∈N} of diagonalizable endomorphisms f_ν ∈ End(V) exists with

    f = lim_{ν→∞} f_ν.
Proof. ad 1) Any family of eigenvectors (v_j)_j of f belonging to pairwise distinct eigenvalues λ_j is linearly independent: Assume the existence of an equation

    α₁·v₁ + ... + α_k·v_k = 0

with k ∈ N minimal such that α_i ∈ K, α_i ≠ 0, i = 1, ..., k. After applying f we obtain a second equation

    α₁·λ₁·v₁ + ... + α_k·λ_k·v_k = 0.

Subtracting the second equation from λ_k times the first equation gives

    α₁·(λ_k−λ₁)·v₁ + ... + α_{k−1}·(λ_k−λ_{k−1})·v_{k−1} = 0.

The resulting equation has fewer terms ≠ 0 than the original equation, a contradiction. Hence V has a basis of eigenvectors of f. The matrix representing f with respect to this basis is diagonal.
ad 2) By induction on n = dim V we show the existence of a flag of V invariant with respect to f: A flag of V is a sequence of subspaces

    V₀ = {0} ⊊ V₁ ⊊ ... ⊊ V_n = V.

It is invariant with respect to f iff for all i = 0, ..., n

    f(V_i) ⊂ V_i.

The endomorphism f has at least one complex eigenvalue with eigenvector v₁ ∈ V. Set V₁ := C·v₁. Extend v₁ to a basis (v_i)_{i=1,...,n} of V. Define

    p : V → V₁

as the projection with respect to this basis and consider

    W := span_C < v₂, ..., v_n >

and the restriction

    g := (f − p∘f)|W.

Then

    g : W → W

is well-defined. The induction assumption applied to W provides a flag of W

    W₀ = {0} ⊊ W₁ ⊊ ... ⊊ W_{n−1} = W

invariant with respect to g. Then

    V₀ = {0} ⊊ V₁ ⊊ V₂ := V₁+W₁ ⊊ ... ⊊ V_i := V₁+W_{i−1} ⊊ ... ⊊ V_n := V₁+W_{n−1} = V

is a flag of V. From the definition of g follows

    f|W = g + p∘(f|W).

Hence for each i = 1, ..., n

    f(V_i) = f(V₁+W_{i−1}) ⊂ f(V₁) + (f|W)(W_{i−1}) ⊂ V₁ + g(W_{i−1}) + V₁
           = V₁ + g(W_{i−1}) ⊂ V₁ + W_{i−1} = V_i

and the flag is invariant with respect to f.
ad 3) According to part 2) represent f by an upper triangular matrix A

    A = ∆ + N

with a diagonal matrix ∆ = diag(λ₁, ..., λ_n) and a strict upper triangular matrix

    N = (a_ij),   a_ij = 0 if i ≥ j.

For i = 1, ..., n define successively sequences (a^i_ν)_{ν∈N} converging to zero such that the sets

    {λ_j + a^j_ν : j = 1, ..., i−1 and ν ∈ N}

and

    {λ_i + a^i_ν : ν ∈ N}

are disjoint. Such a choice is possible because the field C is uncountably infinite. Define

    A_ν := N + diag(λ₁+a¹_ν, ..., λ_n+aⁿ_ν),   ν ∈ N.

Each matrix A_ν is upper triangular with eigenvalues

    λ_i + a^i_ν,   i = 1, ..., n,

and for each fixed ν ∈ N all eigenvalues of A_ν are pairwise distinct. Hence A_ν represents a diagonalizable endomorphism f_ν due to part 1) and

    A = lim_{ν→∞} A_ν,

i.e.

    f = lim_{ν→∞} f_ν,

q.e.d.
The name flag shall evoke the idea of a physical flag V = V₂. The whole flag comprises besides its bunting also the flagpole V₁ ⊊ V₂. And the whole flagpole V₁ comprises its basepoint V₀ ⊊ V₁.
Proposition 1.11 (Exponential of matrices). The exponential of matrices A, B ∈ M(n×n,C) satisfies the following rules:

1. Functional equation in the Abelian case: If [A,B] = 0 then

    exp(A+B) = exp(A)·exp(B).

2. Transposition: (exp A)^T = exp(A^T)

3. Base change: If S ∈ GL(n,C) then

    exp(S·A·S⁻¹) = S·exp(A)·S⁻¹.

4. Determinant and trace: det(exp A) = exp(tr A).
Proof. ad 1: First we apply the binomial theorem using [A,B] = 0. Secondly we invoke the Cauchy product formula for absolutely convergent infinite matrix series:

    exp(A+B) = ∑_{ν=0}^∞ (A+B)^ν/ν! = ∑_{ν=0}^∞ ∑_{µ=0}^ν (A^{ν−µ}/(ν−µ)!)·(B^µ/µ!)
             = ( ∑_{ν=0}^∞ A^ν/ν! )·( ∑_{ν=0}^∞ B^ν/ν! ) = exp(A)·exp(B).
The Cauchy product formula is well known in the context of absolutely convergent series of real or complex numbers, e.g. see [9, §8]. In the context of absolutely convergent matrix series the proof of the Cauchy product formula is literally the same.
ad 2: Take lim_{N→∞} of

    ( ∑_{ν=0}^N A^ν/ν! )^T = ∑_{ν=0}^N (A^T)^ν/ν!.

ad 3: Take lim_{N→∞} of

    S·( ∑_{ν=0}^N A^ν/ν! )·S⁻¹ = ∑_{ν=0}^N (S·A^ν·S⁻¹)/ν! = ∑_{ν=0}^N (S·A·S⁻¹)^ν/ν!.
ad 4: According to Lemma 1.10 and part 3) of the proposition we may assume an upper triangular matrix

        ( λ₁     * )
    A = (    ⋱     ),   λ_i ∈ C, i = 1, ..., n.
        ( 0     λ_n )

We obtain

          ( λ₁^ν      * )
    A^ν = (     ⋱       ),   ν ∈ N,
          ( 0      λ_n^ν )

and conclude

                                         ( e^{λ₁}        * )
    det(exp A) = det( ∑_{ν=0}^∞ A^ν/ν! ) = det (    ⋱           ).
                                         ( 0        e^{λ_n} )

Hence

    det(exp A) = ∏_{i=1}^n e^{λ_i} = e^{λ₁+...+λ_n} = exp(tr A).
Corollary 1.12 (Exponential map). The exponential defines a map

    exp : M(n×n,K) → GL(n,K),  A ↦ exp(A),

i.e. exp A is invertible with exp(A)⁻¹ = exp(−A).
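The rules det(exp A) = exp(tr A) and exp(A)⁻¹ = exp(−A) are easy to verify numerically with the truncated series. The sketch below is my own illustration, not part of the notes; the example matrix is arbitrary.

```python
import math

# Check det(exp A) = exp(tr A) and exp(A) * exp(-A) = 1 for a 2x2 example.

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(c, A):
    return [[c * x for x in row] for row in A]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(A, terms=40):
    n = len(A)
    result, term = identity(n), identity(n)
    for k in range(1, terms):
        term = mat_scale(1.0 / k, mat_mul(term, A))
        result = mat_add(result, term)
    return result

A = [[1.0, 2.0], [0.5, -0.3]]
E = mat_exp(A)
det_E = E[0][0] * E[1][1] - E[0][1] * E[1][0]
trace_A = A[0][0] + A[1][1]
gap = abs(det_E - math.exp(trace_A))                 # ~ 0
product = mat_mul(E, mat_exp(mat_scale(-1.0, A)))    # ~ unit matrix
```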
1.3 Logarithm of matrices
Recall the complex power series of the logarithm

    log(1+z) = ∑_{ν=1}^∞ ((−1)^{ν+1}/ν)·z^ν,   z ∈ C,

and the geometric series

    ∑_{ν=0}^∞ z^ν,   z ∈ C.

Both power series have radius of convergence R = 1.
Definition 1.13 (Logarithm and geometric series of matrices). For a matrix A ∈ M(n×n,K) with ‖A‖ < 1 one defines its logarithm

    log(1+A) := ∑_{ν=1}^∞ (−1)^{ν+1}·(A^ν/ν) ∈ M(n×n,K)

and its geometric series

    ∑_{ν=0}^∞ A^ν ∈ M(n×n,K).
Corollary 1.14 (Derivative of the logarithm).

1. For a matrix A ∈ M(n×n,K) with ‖A‖ < 1 the matrix 1−A is invertible with

    (1−A)⁻¹ = ∑_{ν=0}^∞ A^ν.

2. Consider an open interval I ⊂ R and a differentiable function

    B : I → M(n×n,K)

with ‖B(t)−1‖ < 1 and [B′(t), B(t)] = 0 for all t ∈ I. Then for all t ∈ I the inverse B(t)⁻¹ exists and

    (d/dt)(log B(t)) = B(t)⁻¹·B′(t) = B′(t)·B(t)⁻¹.
Proof. ad 1) For each n ∈ N

    (1−A)·∑_{ν=0}^n A^ν = ∑_{ν=0}^n A^ν − ∑_{ν=1}^{n+1} A^ν = 1 − A^{n+1}.

Because ‖A^{n+1}‖ ≤ ‖A‖^{n+1} → 0 for ‖A‖ < 1 it follows

    (1−A)·∑_{ν=0}^∞ A^ν = lim_{n→∞} (1 − A^{n+1}) = 1.

ad 2) For t ∈ I define

    A := 1 − B(t).

Because ‖A‖ = ‖1−B(t)‖ < 1 apply part 1) to A:

    B(t) = 1 − A

has the inverse

    B(t)⁻¹ = ∑_{ν=0}^∞ A^ν = ∑_{ν=0}^∞ (1−B(t))^ν.

In addition,

    log(B(t)) = log(1 + (B(t)−1))

is well-defined. Lemma 1.6 with the power series

    f(1+z) = ∑_{ν=1}^∞ ((−1)^{ν+1}/ν)·z^ν

and its derivative

    f′(1+z) = ∑_{ν=0}^∞ (−z)^ν

implies

    (d/dt)(log B(t)) = (d/dt) log(1+(B(t)−1)) = ( ∑_{ν=0}^∞ (1−B(t))^ν )·B′(t) = B(t)⁻¹·B′(t) = B′(t)·B(t)⁻¹.
Proposition 1.15 (Exponential and logarithm as inverse maps). We have

1. log(exp A) = A for A ∈ M(n×n,K) if ‖A‖ < log 2,
2. exp(log B) = B for B ∈ M(n×n,K) if ‖B−1‖ < 1.
Proof. ad 1) The assumption ‖A‖ < log 2 implies

    ‖exp(A) − 1‖ = ‖ ∑_{ν=1}^∞ A^ν/ν! ‖ ≤ ∑_{ν=1}^∞ ‖A‖^ν/ν! = exp(‖A‖) − 1 < 2 − 1 = 1.

Hence

    log(exp A) = log(1 + (exp(A)−1))

is a well-defined convergent power series. In order to prove the equality

    log(exp A) = A

we first consider a diagonal matrix

    A = diag(λ₁, ..., λ_n).

Then

    log(exp(A)) = diag(log(e^{λ₁}), ..., log(e^{λ_n})) = diag(λ₁, ..., λ_n) = A.
According to Lemma 1.10 an arbitrary matrix A ∈ M(n×n,C) is triagonalizable, i.e. an invertible matrix S ∈ GL(n,C) exists with

    S·A·S⁻¹ = B

an upper triangular matrix. According to the same lemma a sequence of diagonalizable matrices (B_ν)_{ν∈N} exists with

    B = lim_{ν→∞} B_ν.

For each ν ∈ N an invertible matrix S_ν ∈ GL(n,C) exists with

    S_ν·B_ν·S_ν⁻¹ = ∆_ν

a diagonal matrix. Because ‖A‖ = ‖B‖ we may assume: For ν ∈ N sufficiently large also

    ‖∆_ν‖ = ‖B_ν‖ < log 2

and

    B_ν = S_ν⁻¹·∆_ν·S_ν = S_ν⁻¹·log(exp ∆_ν)·S_ν = log(exp(S_ν⁻¹·∆_ν·S_ν)) = log(exp B_ν).

Taking the limit and using the continuity of the functions log and exp gives

    B = log(exp(B))

and

    A = S⁻¹·B·S = S⁻¹·log(exp(B))·S = log(exp(S⁻¹·B·S)) = log(exp(A)).
ad 2) By assumption exp(log(B)) is a convergent power series. The proof of the equality

    exp(log B) = B

goes along the same lines as the proof of part 1): First one proves the equality for diagonal matrices. Secondly one approximates the triagonalizable matrix B by a sequence of diagonalizable matrices. And finally one uses the continuity of functions which are given by power series.
Remark 1.16 (Counterexample). In Proposition 1.15, part 1) the assumption ‖A‖ < log 2 cannot be dropped: Consider the matrix

    A = 2πi·1 ∈ M(n×n,C)

with ‖A‖ = 2π > log 2. We have

    exp(A) = e^{2πi}·1 = 1,

hence

    log(exp(A)) = log(1) = 0 ≠ A.
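Both the inverse-map property and the counterexample can be reproduced numerically; Python's built-in complex numbers make the same series code work over C. The sketch below is my own illustration, not part of the notes, and the example matrices are ad hoc.

```python
import math

# log(exp A) = A holds for ||A|| < log 2 but fails for A = 2*pi*i*1,
# matching Remark 1.16.

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(c, A):
    return [[c * x for x in row] for row in A]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(A, terms=60):
    n = len(A)
    result, term = identity(n), identity(n)
    for k in range(1, terms):
        term = mat_scale(1.0 / k, mat_mul(term, A))
        result = mat_add(result, term)
    return result

def mat_log_1p(B, terms=200):
    # log(1 + (B - 1)) via the power series; needs ||B - 1|| < 1.
    n = len(B)
    X = mat_add(B, mat_scale(-1.0, identity(n)))
    result = [[0.0] * n for _ in range(n)]
    power = identity(n)
    for k in range(1, terms):
        power = mat_mul(power, X)
        result = mat_add(result, mat_scale((-1.0) ** (k + 1) / k, power))
    return result

A_small = [[0.1, 0.2], [0.0, 0.1]]          # small norm: log o exp = id
recovered = mat_log_1p(mat_exp(A_small))    # ~ A_small

A_big = [[2j * math.pi, 0.0], [0.0, 2j * math.pi]]   # norm 2*pi > log 2
E = mat_exp(A_big)                                   # ~ unit matrix
recovered_big = mat_log_1p(E)                        # ~ 0, not A_big
```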
Proposition 1.17 (Lie product formula). For any two matrices A, B ∈ M(n×n,C) the exponential satisfies

    exp(A+B) = lim_{ν→∞} ( exp(A/ν)·exp(B/ν) )^ν.

In the proof we will use the standard notation f(ν) = O(1/ν²) iff for ν → ∞ the quotient

    f(ν)/(1/ν²) = ν²·f(ν)

remains bounded.
Proof. We consider the Taylor expansions

    exp(A/ν) = 1 + A/ν + O(1/ν²),
    exp(B/ν) = 1 + B/ν + O(1/ν²),

and

    exp(A/ν)·exp(B/ν) = 1 + A/ν + B/ν + O(1/ν²).

For large ν ∈ N we have

    ‖exp(A/ν)·exp(B/ν) − 1‖ < log 2.

Therefore the logarithm is well-defined:

    log( exp(A/ν)·exp(B/ν) ) = log( 1 + A/ν + B/ν + O(1/ν²) ) = A/ν + B/ν + O(1/ν²).

Hence on one hand,

    (exp∘log)( exp(A/ν)·exp(B/ν) ) = exp( A/ν + B/ν + O(1/ν²) ).

On the other hand, Proposition 1.15 implies

    (exp∘log)( exp(A/ν)·exp(B/ν) ) = exp(A/ν)·exp(B/ν).

We get

    exp(A/ν)·exp(B/ν) = exp( A/ν + B/ν + O(1/ν²) )

and

    ( exp(A/ν)·exp(B/ν) )^ν = ( exp( A/ν + B/ν + O(1/ν²) ) )^ν = exp( A + B + O(1/ν) ).

Taking lim_{ν→∞} and using lim_{ν→∞} O(1/ν) = 0 and the continuity of the exponential proves the claim

    lim_{ν→∞} ( exp(A/ν)·exp(B/ν) )^ν = exp(A+B).
Chapter 2
Jordan theory
Any vector space will be assumed finite-dimensional if not stated otherwise.
2.1 Jordan decomposition
We will derive the results for the general case of an endomorphism of a vector space. But we will often identify an endomorphism and its matrix with respect to a certain basis.
Definition 2.1 (Minimal and characteristic polynomial of an endomorphism). Denote by V a K-vector space and by f ∈ End(V) an endomorphism.

• The minimal polynomial of f is the unique monic polynomial of minimal degree

    p_min(T) ∈ K[T] with p_min(f) = 0.

Note that such a polynomial exists because the family (f^ν)_{ν∈N} is linearly dependent due to dim End(V) < ∞.

• The characteristic polynomial of f is the polynomial

    p_char(T) := det(T − f) ∈ K[T].

Because K[T] is a principal ideal domain the minimal polynomial p_min(T) is the monic generator of the ideal in K[T] of all polynomials which annihilate f. The roots of the characteristic polynomial p_char(T) are the eigenvalues of f.
We will later prove the theorem of Cayley-Hamilton: The characteristic polynomial annihilates the endomorphism, i.e. p_char(f) = 0. As a consequence, p_min(T) divides p_char(T) in K[T]. In addition, both polynomials have the same irreducible factors.
Example 2.2 (Minimal and characteristic polynomial).

1. The endomorphism f ∈ End(K²) with matrix

    A = ( λ  0 )
        ( 0  λ ),   λ ∈ K,

has respectively minimal polynomial and characteristic polynomial

    p_min(T) = T − λ,   p_char(T) = (T − λ)² ∈ K[T].

More generally: Any diagonalizable complex endomorphism with pairwise distinct eigenvalues λ_i of algebraic multiplicity e_i ∈ N, i = 1, ..., r, has respectively minimal polynomial and characteristic polynomial

    p_min(T) = ∏_{i=1}^r (T − λ_i),   p_char(T) = ∏_{i=1}^r (T − λ_i)^{e_i} ∈ K[T].

2. Conversely, consider an endomorphism f ∈ End(V) of a complex vector space. If its minimal polynomial p_min(T) ∈ C[T] splits completely into pairwise different linear factors then f is diagonalizable.

3. The endomorphism f ∈ End(K²) with matrix

    A = ( λ  b )
        ( 0  λ ),   λ ∈ K, b ∈ K*,

has equal minimal polynomial and characteristic polynomial

    p_min(T) = p_char(T) = (T − λ)² ∈ K[T].

Note that the minimal polynomial p_min(T) is not irreducible.
The central aim of Jordan theory is to decompose a complex vector space V intosubspaces which are eigenspaces or generalized eigenspaces of a given endomor-phism f ∈ End(V ).
Definition 2.3 (Eigenspaces and generalized eigenspaces). Consider a K-vector space V and a fixed endomorphism f ∈ End(V).

• For λ ∈ K define the eigenspace of f with respect to λ as the subspace

    V_λ(f) := ker[f − λ] ⊂ V

and the generalized eigenspace of f with respect to λ as the subspace

    V^λ(f) := ⋃_{n∈N} ker[(f − λ)^n] ⊂ V.
Remark 2.4. 1. All generalized eigenspaces V^λ(f) are f-invariant, i.e.

    f(V^λ(f)) ⊂ V^λ(f),

because

    [(f − λ)^n, f] = 0.

2. Every non-zero generalized eigenspace V^λ(f) contains at least one eigenvector v of f with eigenvalue λ: Take a non-zero vector v₀ ∈ V^λ(f) and choose n ∈ N maximal with

    v := (f − λ)^n(v₀) ≠ 0.
Definition 2.5 (Nilpotent endomorphism). Consider a vector space V. An endomorphism f ∈ End(V) is nilpotent iff f^n(V) = 0 for a suitable n ∈ N.
Referring to Definition 1.9 concerning diagonalizable endomorphisms of a complex vector space we emphasize that we will use the terms diagonalizable and semisimple as synonyms. For a diagonalizable endomorphism f ∈ End(V) the complex vector space V decomposes as a direct sum of eigenspaces of f, i.e.

    V = ⊕_{λ∈K} V_λ(f).
Note that the zero morphism is the only endomorphism which is both semisimpleand nilpotent.
It is part of the following Theorem 2.6 that a complex vector space V splits as adirect sum of generalized eigenspaces for any given endomorphism f ∈ End(V ). Ifall generalized eigenspaces are even eigenspaces then the endomorphism is namedsemisimple.
All these properties can be read off from a matrix representation of f with respect to a suitable basis of V.
Theorem 2.6 (Jordan decomposition). Let V be a complex vector space and f ∈ End(V) an endomorphism.

1. Being a complex polynomial, the minimal polynomial p_min(T) ∈ C[T] of f splits as

    p_min(T) = ∏_{i=1}^r (T − λ_i)^{k_i} ∈ C[T],   λ_i ∈ C, k_i ∈ N, i = 1, ..., r.

The splitting induces a corresponding splitting of V into the sum of generalized eigenspaces

    V = ⊕_{i=1}^r V^{λ_i}(f).

2. A unique decomposition

    f = f_s + f_n   (Jordan decomposition)

exists with a semisimple endomorphism f_s ∈ End(V) and a nilpotent endomorphism f_n ∈ End(V) such that both satisfy [f_s, f_n] = 0. For each i = 1, ..., r

    V^{λ_i}(f) = V_{λ_i}(f_s).

3. The components f_s and f_n depend on f in a polynomial way, i.e. polynomials

    p_s(T), p_n(T) ∈ C[T]

exist with p_s(0) = p_n(0) = 0 such that

    f_s = p_s(f) and f_n = p_n(f).

In particular, [f, g] = 0 for an endomorphism g ∈ End(V) implies

    [f_s, g] = [f_n, g] = 0.
Proof. We take the proof from [19, Theor. 5.3.3].

i) Taking as starting point the minimal polynomial p_min(T) of f we introduce for i = 1, ..., r the polynomials

    p_i(T) := p_min(T)/(T − λ_i)^{k_i} ∈ C[T].

By construction no polynomial of positive degree is a common divisor of these polynomials. Hence their greatest common divisor is a unit from C[T]. Because the ring C[T] is a principal ideal domain, polynomials r_i(T) ∈ C[T], i = 1, ..., r, exist with

    1 = ∑_{i=1}^r r_i(T)·p_i(T).

For each i = 1, ..., r we define the endomorphism

    E_i := r_i(f)∘p_i(f) ∈ End(V).

Then

    id_V = ∑_{i=1}^r E_i ∈ End(V).

By construction, for i ≠ j the minimal polynomial p_min(T) divides the polynomial

    r_i(T)·p_i(T)·r_j(T)·p_j(T),

hence E_i∘E_j = 0. As a consequence

    E_i² = E_i ∘ ∑_{j=1}^r E_j = E_i ∘ id_V = E_i.

Therefore the family (E_i)_{i=1,...,r} comprises pairwise commuting projectors onto subspaces V_i ⊂ V with

    V = ⊕_{i=1}^r V_i.

On each vector space V_i, i = 1, ..., r, the corresponding linear map E_i, being a projector, acts as identity; hence the endomorphism

    λ_i·(E_i|V_i) ∈ End(V_i)

acts as multiplication by λ_i. We define the polynomial

    p_s(T) := ∑_{i=1}^r λ_i·r_i(T)·p_i(T) ∈ C[T]

and the endomorphism

    f_s := p_s(f) ∈ End(V).

Then f_s ∈ End(V) is semisimple with

    V_i = V_{λ_i}(f_s).

In order to obtain the nilpotent component of f we define the polynomial

    p_n(T) := T − p_s(T) ∈ C[T]

and the endomorphism

    f_n := p_n(f) = f − f_s ∈ End(V)

satisfying

    f = f_s + f_n.

The two definitions

    f_s := p_s(f),  f_n := p_n(f)

imply

    [f, f_s] = [f, f_n] = 0.

In order to prove the nilpotency of f_n we consider the restrictions to a fixed but arbitrary subspace V_i. Here

    f_n = f − f_s = (f − λ_i) − (f_s − λ_i).

Because both summands commute, the binomial theorem implies
    f_n^{k_i} = ∑_{ν=0}^{k_i} (k_i choose ν)·(−1)^{k_i−ν}·(f − λ_i)^ν ∘ (f_s − λ_i)^{k_i−ν}.

Here the second factor (f_s − λ_i)^{k_i−ν} vanishes on V_i for 0 ≤ ν < k_i. The first factor (f − λ_i)^ν vanishes on V_i for ν = k_i because for v ∈ V_i

    (f−λ_i)^{k_i}(v) = (f−λ_i)^{k_i}(E_i(v)) = (f−λ_i)^{k_i}(r_i(f)∘p_i(f))(v) = (r_i(f)∘p_min(f))(v) = 0,

i.e. V_i ⊂ V^{λ_i}(f). As a consequence of the binomial theorem

    f_n^{k_i}(V_i) = 0.

Hence the restriction f_n|V_i is nilpotent for every i = 1, ..., r, which implies the nilpotency of f_n ∈ End(V).
ii) Consider an arbitrary, but fixed i = 1, ..., r. We show V^{λ_i}(f) = V_i:

We just proved V_i ⊂ V^{λ_i}(f). In order to prove the opposite inclusion V^{λ_i}(f) ⊂ V_i we use the last result and in addition the direct sum decomposition

    V = ⊕_{j=1}^r V_j.

Consider v ∈ V^{λ_i}(f) and decompose

    v = ∑_{j=1}^r v_j,   v_j ∈ V_j for j = 1, ..., r.

For any n ∈ N

    (f − λ_i)^n(v) = ∑_{j=1}^r (f − λ_i)^n(v_j).

Assume v_j ≠ 0 for an index j = 1, ..., r. Because (f − λ_i)^n(v) = 0 implies

    (f − λ_i)^n(v_j) = 0,

i.e.

    v_j ∈ V^{λ_i}(f),

a maximal n_j ∈ N exists with

    v_j′ := (f − λ_i)^{n_j}(v_j) ≠ 0 and (f − λ_i)(v_j′) = 0,

i.e.

    0 ≠ v_j′ ∈ V_{λ_i}(f)

is an eigenvector. We obtain for large n

    0 = (f − λ_j)^n(v_j′) = (λ_i − λ_j)^n · v_j′.

Here the left equality is due to v_j′ ∈ V_j ⊂ V^{λ_j}(f) with the last inclusion already proven above. And the right equality is due to v_j′ ∈ V_{λ_i}(f).

Hence λ_i = λ_j, i.e. j = i, and v = v_i ∈ V_i. This finishes the proof of V^{λ_i}(f) ⊂ V_i. As a consequence V_i = V^{λ_i}(f) and

    V ≅ ⊕_{i=1}^r V_i ≅ ⊕_{i=1}^r V^{λ_i}(f) ≅ ⊕_{i=1}^r V_{λ_i}(f_s).
iii) Uniqueness of the Jordan decomposition f = f_s + f_n: Assume a second decomposition f = f′_s + f′_n with the properties stated in part 2) of the theorem. Then

    [f, f′_s] = [f′_s, f′_s] + [f′_n, f′_s] = 0

and similarly [f, f′_n] = 0. Hence by part 3), which has already been proved,

    [f_s, f′_s] = [f_n, f′_n] = 0.

As a consequence

    f_s − f′_s = f′_n − f_n ∈ End(V)

is an endomorphism which is both nilpotent and semisimple. Hence it is zero.
iv) The polynomials p_s and p_n may be chosen with p_s(0) = p_n(0) = 0:

On one hand, if λ_i = 0 for one index i = 1,...,r then

{0} ≠ V_0 = ker f.

Hence f|V_0 = 0 and also f_s|V_0 = 0, because f_s acts on V^0(f) ⊃ V_0 as multiplication by the eigenvalue 0. Therefore p_s(0)·id = p_s(f)|V_0 = f_s|V_0 = 0, i.e. the polynomial p_s(T) ∈ C[T] has no constant term: p_s(0) = 0. From the definition p_n(T) := T − p_s(T) follows p_n(0) = −p_s(0) = 0.
On the other hand, if λ_i ≠ 0 for all indices i = 1,...,r then p_min(0) ≠ 0: Otherwise

p_min(T) = T·q(T) with q ∈ C[T], q ≠ 0, deg q < deg p_min.

In particular q(f) ≠ 0, i.e. a vector v ∈ V exists with q(f)(v) ≠ 0. As a consequence

0 = p_min(f) = f ∘ q(f)

implies

0 ≠ q(f)(v) ∈ ker f,

a contradiction, because 0 is not an eigenvalue of f.

One replaces the polynomials p_s and p_n by

p_s := p_s − (p_s(0)/p_min(0))·p_min
and

p_n := p_n − (p_n(0)/p_min(0))·p_min.

Then

p_s(0) = p_n(0) = 0

without changing

f_s = p_s(f) and f_n = p_n(f),
q.e.d.
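The decomposition can be computed for a concrete matrix. The following sketch is not part of the text; it assumes the computer algebra library sympy and extracts f_s and f_n from the Jordan normal form instead of from the interpolation polynomials p_s, p_n used in the proof:

```python
import sympy as sp

# A 4x4 matrix which is neither semisimple nor nilpotent
# (its eigenvalue 4 has a defective eigenspace).
A = sp.Matrix([[5, 4, 2, 1],
               [0, 1, -1, -1],
               [-1, -1, 3, 0],
               [1, 1, -1, 2]])

P, J = A.jordan_form()                       # A = P * J * P**-1
D = sp.diag(*[J[i, i] for i in range(4)])    # diagonal (semisimple) part of J
fs = P * D * P.inv()                         # semisimple part of A
fn = A - fs                                  # nilpotent part of A

print(fn)                                    # non-zero: A is not semisimple
print(fs * fn == fn * fs)                    # the two parts commute
```

The uniqueness statement of the theorem guarantees that this construction and the polynomial construction of the proof give the same pair (f_s, f_n).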
2.2 Surjectivity of the complex exponential map
Theorem 2.7 (Surjectivity of the complex exponential map). The complex exponential map

exp : M(n×n,C) → GL(n,C), A ↦ exp A,

is surjective.
Proof. Consider a fixed but arbitrary matrix B ∈ GL(n,C). According to Theorem 2.6 the vector space Cⁿ splits into the direct sum of generalized eigenspaces

Cⁿ ≅ ⊕_λ V^λ(B).

The matrix B leaves each generalized eigenspace V^λ(B) invariant; denote by B_λ := B|V^λ(B) the restriction.
Without loss of generality we may assume B = B_λ with a fixed λ ∈ C∗. The matrix

N := B − λ·1

is nilpotent and

B = λ·1 + N = λ·(1 + (1/λ)·N).
Moreover

A_1 := log(1 + (1/λ)·N) = ∑_{ν=1}^{∞} ((−1)^{ν+1}/ν)·(N^ν/λ^ν)
is well-defined because the sum is finite. We have

exp A_1 = exp(log(1 + (1/λ)·N)) = 1 + (1/λ)·N

according to Proposition 1.15. After choosing a complex logarithm µ ∈ C with e^µ = λ and setting

A := µ·1 + A_1,
Proposition 1.11 implies
exp A = exp(µ·1) · exp A_1 = λ · exp A_1 = λ·(1 + (1/λ)·N) = B,

q.e.d.
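The construction in the proof is concrete enough to run numerically. The following numpy sketch (the choices of λ and N are arbitrary illustrations, and a truncated power series stands in for the matrix exponential) builds A exactly as in the proof and checks exp A = B:

```python
import numpy as np

def expm_series(A, terms=60):
    # Truncated power series for exp(A); adequate for small matrices.
    E = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ A / k
        E += term
    return E

# B = lambda*1 + N with N nilpotent, as in the proof.
lam = -2.0 + 0j                                      # any non-zero eigenvalue
N = np.array([[0, 1, 0], [0, 0, 3], [0, 0, 0]], dtype=complex)
B = lam * np.eye(3) + N

# A_1 = log(1 + N/lambda): the series terminates because N is nilpotent.
M = N / lam
A1 = np.zeros((3, 3), dtype=complex)
power = np.eye(3, dtype=complex)
for nu in range(1, 3):                               # N^3 = 0, two terms suffice
    power = power @ M
    A1 += (-1) ** (nu + 1) / nu * power

mu = np.log(lam)                                     # a complex logarithm of lambda
A = mu * np.eye(3) + A1

print(np.allclose(expm_series(A), B))                # → True
```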
Remark 2.8. In general the exponential map is not surjective.
1. Complex case: Set

sl(2,C) := {A ∈ M(2×2,C) : tr A = 0} and SL(2,C) := {B ∈ GL(2,C) : det B = 1}

and consider

exp : sl(2,C) → SL(2,C).

The map is well-defined due to Proposition 1.11. The matrix

B = ( −1 b ; 0 −1 ) ∈ SL(2,C), b ∈ C∗,

has no inverse image.

2. Real case: Consider

exp : M(2×2,R) → GL⁺(2,R) := {B ∈ GL(2,R) : det B > 0}.

The matrix

B = ( −1 b ; 0 −1 ) ∈ GL⁺(2,R), b ∈ R∗,

has no inverse image.
Proof. If a matrix A ∈ M(n×n,K) has the eigenvector v with eigenvalue λ, then the matrix exp A has the same vector as eigenvector with eigenvalue e^λ: The equation A·v = λ·v and its iterates A^ν·v = λ^ν·v, ν > 1, imply

(∑_{ν=1}^{N} A^ν/ν!) · v = (∑_{ν=1}^{N} λ^ν/ν!) · v ∈ Kⁿ.

Taking the limit N → ∞ on both sides proves the claim.
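This spectral behaviour of the exponential can be checked numerically; the sketch below assumes numpy and uses a truncated series as a stand-in for the matrix exponential:

```python
import numpy as np

def expm_series(A, terms=60):
    # Truncated exponential series; adequate for matrices of small norm.
    E = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ A / k
        E += term
    return E

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
w, V = np.linalg.eig(A)
lam, v = w[0], V[:, 0]          # an eigenpair of A

# exp(A) has the same eigenvector v, with eigenvalue e^lam.
print(np.allclose(expm_series(A) @ v, np.exp(lam) * v))   # → True
```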
ad 1. Assume the existence of a matrix A ∈ sl(2,C) with exp A = B.

The two complex eigenvalues λ_i, i = 1,2, of A satisfy

0 = tr A = λ_1 + λ_2.

The case 0 = λ_1 = λ_2 is excluded because then exp A has the single eigenvalue e⁰ = 1. But B has the single eigenvalue −1.

Hence λ_1 ≠ λ_2. Then A is diagonalizable, i.e. a matrix S ∈ GL(2,C) exists with
A = S · ( λ_1 0 ; 0 λ_2 ) · S⁻¹.
Because both eigenvalues of B are equal to −1 we obtain

B = exp A = S · ( e^{λ_1} 0 ; 0 e^{λ_2} ) · S⁻¹ = S · ( −1 0 ; 0 −1 ) · S⁻¹ = −1,

a contradiction.
ad 2. Assume the existence of a matrix A ∈ M(2×2,R) with exp A = B.

The real matrix A has two complex eigenvalues λ_i, i = 1,2. On one hand, if both eigenvalues are real, then exp A has the positive eigenvalues e^{λ_1}, e^{λ_2} > 0, while B has the single eigenvalue −1, a contradiction. On the other hand, if the eigenvalues are non-real, then λ_2 = λ̄_1 ≠ λ_1 and A is diagonalizable over C; because both eigenvalues of B equal −1, we obtain analogously to the first part the contradiction

exp A = −1 ≠ B,

q.e.d.
2.3 Jordan normal form
The Jordan decomposition from Theorem 2.6 is the first step in proving the existence of the Jordan normal form of a matrix representing an endomorphism f ∈ End(V). The second step considers each generalized eigenspace V^λ(f) separately and constructs a suitable basis for it. The main ingredient for this step is the theory of modules over principal ideal domains.
A generalized eigenspace V^λ(f) carries a second algebraic structure besides its structure as a complex vector space. This second structure encodes the operation of f on V^λ(f). It makes V^λ(f) a module over the ring C[T] of polynomials: Define

C[T] × V^λ(f) → V^λ(f), (p, v) ↦ p·v := p(f)(v).
As usual, p(f) ∈ End(V) denotes the endomorphism which results from the polynomial p by replacing the indeterminate T by the endomorphism f. E.g., a power T^n acts as the n-fold composition

f^n := f ∘ ... ∘ f.
The endomorphism f restricts on each generalized eigenspace to an endomorphism f_λ ∈ End(V^λ(f)). Here its minimal polynomial is

q_{f_λ}(T) = (T − λ)^e ∈ C[T]

with an exponent e ∈ N.
For a ring R and an R-module M an element v ∈ M is a torsion element if a non-zero r ∈ R exists with r·v = 0. The module M is a torsion module if all its elements are torsion elements. The ring R := C[T] is a principal ideal domain and the R-module M := V^λ(f) is a finitely generated torsion module.
The structure of a finitely generated torsion module over a principal ideal domainis well-known, cf. [22, Chap. XV, Theor. 4].
Lemma 2.9 (Torsion module over a principal ideal domain). Consider a principal ideal domain R and a finitely generated torsion R-module M ≠ {0}. Then a finite decreasing sequence of ideals I_j ⊂ R, j = 1,...,k, exists,

I_j ⊃ I_{j+1}, j = 1,...,k−1,

such that

M ≅ ⊕_{j=1}^{k} R/I_j.

The principal ideals I_j = ⟨q_j⟩ of R, j = 1,...,k, are uniquely determined.
Proposition 2.10 (Principal modules derived from an endomorphism). Consider a finite dimensional complex vector space M and an endomorphism f ∈ End(M). Assume that M with the C[T]-module structure induced by f is principal, i.e. an element v ∈ M exists with

M = C[T]·v.

If

q_f(T) = (T − λ)^d, λ ∈ C, d ∈ N,

is the minimal polynomial of f, then the family

B := (w_k)_{k=1,...,d} with w_k := (f − λ)^{d−k}(v)

is a basis of M as a complex vector space. With respect to this basis the endomorphism f has the matrix

J = ( λ  1          0
         λ  1
             ⋱    1
      0          λ ) ∈ M(d×d,C).
Proof. The family B is linearly independent: A relation of linear dependence would provide a polynomial p ∈ C[T] with p(f)(v) = 0 and deg p < d. If w = h(f)(v) ∈ M with a polynomial h ∈ C[T] denotes an arbitrary element, then

p(f)(w) = (p(f) ∘ h(f))(v) = (h(f) ∘ p(f))(v) = 0,

i.e. p(f) = 0 ∈ End(M). This contradicts the minimality of q_f.
The map

C[T] → M, T^n ↦ f^n(v),

induces an isomorphism of complex vector spaces C[T]/(q_f) ≅ M. Hence dim M = d, which completes the proof that B is a basis.
Setting w_0 := 0 one has for all k = 1,...,d

f(w_k) = (f − λ)(w_k) + λ·w_k = w_{k−1} + λ·w_k,

which proves the form of the matrix, q.e.d.
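For a concrete principal module the basis B and the resulting Jordan box can be computed directly. The numpy sketch below is an illustration, not part of the text; the companion matrix and the cyclic vector v = e_1 are hypothetical choices realizing the minimal polynomial (T − 2)³:

```python
import numpy as np

# Companion matrix of (T - 2)^3 = T^3 - 6T^2 + 12T - 8;
# the module C^3 is principal with cyclic vector v = e_1.
C = np.array([[0, 0, 8],
              [1, 0, -12],
              [0, 1, 6]], dtype=float)
lam, d = 2.0, 3
v = np.array([1.0, 0.0, 0.0])

# Basis w_k = (f - lam)^(d-k)(v), k = 1,...,d, collected as columns of P.
Nmat = C - lam * np.eye(3)
P = np.column_stack([np.linalg.matrix_power(Nmat, d - k) @ v
                     for k in range(1, d + 1)])

J = np.linalg.solve(P, C @ P)   # matrix of f with respect to the basis B
print(np.round(J, 10))          # the Jordan box with lambda = 2
```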
Theorem 2.11 (Jordan normal form). Let V be an n-dimensional complex vector space and

f ∈ End(V)

an endomorphism. Then a basis of V exists such that f has the matrix
J = diag( J_{1,1}, ..., J_{1,i(1)}, J_{2,1}, ..., J_{2,i(2)}, ..., J_{r,1}, ..., J_{r,i(r)} ) ∈ M(n×n,C),

a block diagonal matrix with Jordan boxes J_{j,i} for j = 1,...,r and i = 1,...,i(j) of the form

J_{j,i} = ( λ_j  1          0
                λ_j  1
                      ⋱   1
            0             λ_j ),  λ_j ∈ C.
Proof. Part 1) Decomposition of V as a direct sum of generalized eigenspaces: Denote by

q_f(T) = ∏_{i=1}^{r} (T − λ_i)^{k_i} ∈ C[T]

the minimal polynomial of the endomorphism f ∈ End(V). According to Theorem 2.6 we have

V = ⊕_{i=1}^{r} V^{λ_i}(f)
with non-zero generalized eigenspaces V^{λ_i}(f).

Part 2) Decomposition of a generalized eigenspace V^λ(f) into a direct sum of principal torsion modules: Consider a generalized eigenspace E := V^{λ_i}(f) and denote by

q(T) = (T − λ_i)^e ∈ C[T]

the minimal polynomial of the restriction f|E. Then E splits according to Lemma 2.9 as a sum

E ≅ ⊕_{j=1}^{k} C[T]/I_j

of principal modules over the ring C[T]. The invariant of C[T]/I_k is the minimal polynomial q(T), the other invariants are some of its divisors

(T − λ_i)^d, d < e.
Part 3) The structure of a principal torsion module W ⊂ V with respect to the operation of f has been clarified in Proposition 2.10. With respect to a suitable basis the operation of f on W is given by a Jordan box J_{j,i}, q.e.d.
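Computer algebra systems implement exactly this normal form. As a small sketch with sympy (an assumed dependency, not part of the text), one can hide a known Jordan matrix behind a change of basis and recover it:

```python
import sympy as sp

# A known Jordan normal form J0: one 2x2 box for the eigenvalue 2
# and two 1x1 boxes for the eigenvalue 3.
J0 = sp.Matrix([[2, 1, 0, 0],
                [0, 2, 0, 0],
                [0, 0, 3, 0],
                [0, 0, 0, 3]])
S = sp.Matrix([[1, 1, 0, 0],        # an invertible change of basis
               [0, 1, 1, 0],
               [0, 0, 1, 1],
               [0, 0, 0, 1]])
A = S * J0 * S.inv()

P, J = A.jordan_form()              # A = P * J * P**-1
print(J)                            # the Jordan boxes of J0, possibly reordered
```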
Many domains from mathematics contribute to Jordan theory: From Linear Algebra we need the theory of eigenvalues, from Algebra we use results about polynomial rings and divisibility in principal ideal domains. Starting point of the Jordan decomposition is the splitting of the minimal polynomial into linear factors. This result can be proven by methods from Complex Analysis.
For extending the Jordan decomposition to obtain the Jordan normal form of a complex endomorphism one has to invoke the theory of modules over principal ideal domains. These results belong to the field of Commutative Algebra.
The decomposition of a complex endomorphism into the sum of its semisimple and its nilpotent part points to the two fundamental classes of Lie algebras. Semisimple Lie algebras will be studied in Part II. Nilpotent - and slightly more general - solvable Lie algebras are the subject of Chapter 4 from the present Part I.
Chapter 3
Fundamentals of Lie algebra theory
In this chapter the base field is either K= R or K= C.
3.1 Definitions and first examples
In general, in the associative algebra M(n×n,K) the product of two matrices is not commutative:

AB ≠ BA, A,B ∈ M(n×n,K).

The commutator

[A,B] := AB − BA ∈ M(n×n,K)

is a measure for the degree of non-commutativity. The commutator depends K-linearly on both matrices A and B. In addition, it satisfies the following rules:
1. Permutation of two matrices: [A,B]+ [B,A] = 0.
2. Cyclic permutation of three matrices: [A, [B,C]]+ [B, [C,A]]+ [C, [A,B]] = 0.
These properties make the pair
gl(n,K) := (M(n×n,K), [−,−])
the prototype of a Lie algebra.
Definition 3.1 (Lie algebra). A Lie algebra over the field K is a K-vector space L together with a K-bilinear map

[−,−] : L×L → L (Lie bracket)
such that
1. [x,x] = 0 for all x ∈ L
2. [x, [y,z]]+ [y, [z,x]]+ [z, [x,y]] = 0 for all x,y,z ∈ L (Jacobi identity).
A morphism of Lie algebras is a K-linear map f : L_1 → L_2 between two K-Lie algebras with

f([x_1,x_2]) = [f(x_1), f(x_2)], x_1,x_2 ∈ L_1.
Note that the Lie bracket satisfies

[x,y] + [y,x] = 0 for all x,y ∈ L.

For the proof one expands 0 = [x+y, x+y] using the bilinearity of the bracket.
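Both defining properties of the bracket on gl(n,K) can be verified numerically for random matrices; a minimal numpy check, assumed here purely as an illustration:

```python
import numpy as np

def bracket(A, B):
    # The commutator, i.e. the Lie bracket of gl(n, K).
    return A @ B - B @ A

rng = np.random.default_rng(1)
X, Y, Z = (rng.standard_normal((3, 3)) for _ in range(3))

jacobi = (bracket(X, bracket(Y, Z))
          + bracket(Y, bracket(Z, X))
          + bracket(Z, bracket(X, Y)))

print(np.allclose(bracket(X, Y) + bracket(Y, X), 0))   # antisymmetry: True
print(np.allclose(jacobi, 0))                          # Jacobi identity: True
```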
We have just seen that the Lie algebra gl(n,K) derives from the associative algebra M(n×n,K). Reversing the viewing direction one can ask for a generalization of this derivation. We will see that every finite-dimensional Lie algebra derives from an associative algebra. This is an easy result when we admit infinite-dimensional associative algebras, see Chapter ??. Much more difficult is to prove that L derives from a matrix algebra, hence from a finite-dimensional associative algebra, see Theorem ??.
Definition 3.2 (Basic algebraic concepts). Consider a Lie algebra L over K. Onedefines:
• A subspace M ⊂ L is a subalgebra of L iff [M,M] ⊂M, i.e. iff M is closed withrespect to the Lie bracket of L.
• A subspace I ⊂ L is an ideal of L iff [L, I]⊂ I, i.e. if I is L-invariant.
• The normalizer of a subalgebra M ⊂ L is the subalgebra
NL(M) := {x ∈ L : [x,M]⊂M} ⊂ L,
i.e. the largest subalgebra which includes M as an ideal.
• The center of L isZ(L) := {x ∈ L : [x,L] = 0}.
It collects those elements which commute with all elements from L.
• The centralizer CL(Y ) of a subset Y ⊂ L is the largest subalgebra of L whichcommutes with Y
CL(Y ) := {x ∈ L : [x,Y ] = 0}
• The derived algebra or commutator algebra of L is
[L,L] := spanK{[x,y] : x,y ∈ L}.
If [L,L] = 0 then L is Abelian.
For a morphism f : L_1 → L_2 between two K-Lie algebras the kernel ker(f) ⊂ L_1 is an ideal.
Apparently the derived algebra is an ideal [L,L]⊂ L. And the defining propertiesof the Lie bracket imply that the center is an ideal Z(L)⊂ L too.
A Lie algebra is a vector space with the Lie bracket as additional operator. Thisadditional algebraic structure resembles the multiplication within a group or a ring.The Lie algebra concept of the commutator is taken from group theory while idealcomes from ring theory.
Besides the Lie algebra of all matrices gl(n,K) we will consider a series of im-portant subalgebras.
Definition 3.3 (Distinguished Lie algebras of matrices). We distinguish the following subalgebras of the Lie algebra gl(n,K):
1. The subalgebra of strict upper triangular matrices

n(n,K) := {A = (a_ij) ∈ gl(n,K) : a_ij = 0 if i ≥ j}.

Each strict upper triangular matrix has the form

( 0     ∗
     ⋱
  0     0 ).

2. The subalgebra of upper triangular matrices

t(n,K) := {A = (a_ij) ∈ gl(n,K) : a_ij = 0 if i > j}.

Each upper triangular matrix has the form

( ∗     ∗
     ⋱
  0     ∗ ).

3. The subalgebra of diagonal matrices

d(n,K) := {A = (a_ij) ∈ gl(n,K) : a_ij = 0 if i ≠ j}.

Each diagonal matrix has the form

( ∗     0
     ⋱
  0     ∗ ).
All three vector spaces of matrices are closed with respect to the commutator ofmatrices, hence they are subalgebras of gl(n,K).
Moreover n(n,K) ⊊ t(n,K) because all diagonal elements of a strict upper triangular matrix vanish. We will see in Chapter 4.1 and Chapter 4.2 that the latter two examples are the prototypes of the important classes of nilpotent and solvable Lie algebras, respectively.
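Closure under the commutator is easy to test numerically. The numpy sketch below (an illustration, not part of the text) checks n(4,R) and additionally observes that the commutator of two upper triangular matrices has vanishing diagonal, i.e. [t,t] ⊂ n - a first hint at the solvability discussed in Chapter 4:

```python
import numpy as np

rng = np.random.default_rng(2)

# Two strict upper triangular matrices: their commutator stays in n(4, R).
A = np.triu(rng.standard_normal((4, 4)), k=1)
B = np.triu(rng.standard_normal((4, 4)), k=1)
C = A @ B - B @ A
print(np.allclose(np.tril(C), 0))   # strictly upper triangular again: True

# Two upper triangular matrices: the commutator even loses its diagonal.
U = np.triu(rng.standard_normal((4, 4)))
V = np.triu(rng.standard_normal((4, 4)))
D = U @ V - V @ U
print(np.allclose(np.tril(D), 0))   # [t, t] ⊂ n(4, R): True
```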
3.2 Lie algebras of the classical groups
The classical semisimple Lie algebras originate as infinitesimal generators of theclassical groups.
Definition 3.4 (1-parameter group and infinitesimal generator). For a matrix X ∈ M(n×n,K) the differentiable group morphism

f_X : (R,+) → (GL(n,K), ·), t ↦ exp(t·X),

with derivative

(d/dt exp(t·X))(0) = X (see Proposition 1.8)

is named a 1-parameter group with infinitesimal generator X.
Note: The general definition of a 1-parameter group of an arbitrary Lie group G only requires a continuous group morphism

f : R → G.

Nevertheless, one can show: All 1-parameter groups of G have the form f(t) = exp(t·X) with an element X ∈ Lie G from the Lie algebra of G.
Definition 3.5 (Matrix group and 1-parameter subgroup). A matrix group is aclosed subgroup H ⊂ GL(n,K).
For a matrix X ∈M(n×n,K) the group morphism
fX : R→ GL(n,K), t 7→ exp(t ·X),
is a 1-parameter subgroup of the matrix group H iff fX (t) ∈ H for all t ∈ R.
The term closed subgroup of G := GL(n,K) refers to the relative topology of G, i.e. a subgroup H ⊂ G is closed iff for any sequence (A_ν)_{ν∈N} of matrices A_ν ∈ H with

A = lim_{ν→∞} A_ν ∈ G

also A ∈ H.
Lemma 3.6 (The classical groups and the Lie algebra of their infinitesimalgenerators). In the following r ∈ N and the index m = m(r) ∈ N fixes the dimen-sion of the ambient Lie algebra
M := gl(m,K)
and the dimension of the ambient matrix group
G := GL(m,K).
i) Series Ar,r ≥ 1,m := r+1: The infinitesimal generators of all 1-parameter sub-groups of the special linear group
SL(m,K) := {g ∈ G : det g = 1} ⊂ G
span the subalgebra of traceless matrices
sl(m,K) := {X ∈M : tr X = 0} ⊂M.
ii) Series B_r, r ≥ 2, m := 2r+1: The infinitesimal generators of all 1-parameter subgroups of the special orthogonal group

SO(m,K) := {g ∈ G : g·gᵀ = 1, det g = 1}

span the subalgebra of skew-symmetric matrices

so(m,K) := {X ∈ M : X + Xᵀ = 0}.
iii) Series C_r, r ≥ 3, m = 2r: The infinitesimal generators of all 1-parameter subgroups of the symplectic group

Sp(m,K) := {g ∈ G : gᵀ·σ·g = σ}

with

σ := ( 0 1 ; −1 0 ), σ⁻¹ = ( 0 −1 ; 1 0 ) = −σ, with 1 ∈ GL(r,K) the unit matrix,

span the subalgebra

sp(m,K) := {X ∈ M : Xᵀ·σ + σ·X = 0}.
iv) Series D_r, r ≥ 4, m = 2r: The infinitesimal generators of all 1-parameter subgroups of the special orthogonal group

SO(m,K) := {g ∈ G : g·gᵀ = 1, det g = 1}

span the subalgebra of skew-symmetric matrices
so(m,K) := {X ∈ M : X + Xᵀ = 0}.
v) Special unitary group: The infinitesimal generators of all 1-parameter subgroups of the special unitary group

SU(m) := {g ∈ GL(m,C) : g·g∗ = 1, det g = 1}, g∗ := ḡᵀ the Hermitian conjugate,

span the real Lie algebra of traceless skew-Hermitian matrices

su(m) := {X ∈ gl(m,C) : X + X∗ = 0, tr X = 0}.
The discrimination between the two series Br and Dr of skew-symmetric matrixalgebras is motivated by the classification of their Dynkin diagrams in Chapter 8.
Proof. We show for all cases that any infinitesimal generator of a 1-parameter subgroup of the matrix group belongs to the Lie algebra in question. Vice versa, Proposition 1.11 implies that any element of the Lie algebra is an infinitesimal generator of a 1-parameter subgroup of the matrix group.
ad i) Taking the derivative d/dt on both sides of the equation

1 = det(exp(t·X)) = exp(tr(t·X))

gives

0 = (tr X) · exp(tr(t·X)).

Hence for t = 0:

0 = tr X.
Moreover

sl(m,K) = ker(tr : gl(m,K) → K)

is the kernel of the trace map to K, considered as a morphism to the Abelian Lie algebra K. Hence sl(m,K) ⊂ gl(m,K) is even an ideal and in particular a subalgebra.
ad ii) and iv) Taking the derivative of

1 = exp(t·X) · exp(t·X)ᵀ = exp(t·X) · exp(t·Xᵀ)

and using the product rule gives

0 = X · exp(t·X) · exp(t·Xᵀ) + exp(t·X) · Xᵀ · exp(t·Xᵀ).

Hence for t = 0:

X + Xᵀ = 0.
And for X, Y ∈ so(m,K):
[X,Y] + [X,Y]ᵀ = XY − YX + (XY − YX)ᵀ = XY − YX + YᵀXᵀ − XᵀYᵀ =

X(Y + Yᵀ) − (X + Xᵀ)Yᵀ − Y(X + Xᵀ) + (Yᵀ + Y)Xᵀ = 0

after inserting XYᵀ − XYᵀ and YXᵀ − YXᵀ.
ad iii) Taking the derivative of

σ = exp(t·X)ᵀ · σ · exp(t·X) = exp(t·Xᵀ) · σ · exp(t·X)

gives

0 = Xᵀ · exp(t·Xᵀ) · σ · exp(t·X) + exp(t·Xᵀ) · σ · X · exp(t·X).

Hence for t = 0:

0 = Xᵀ·σ + σ·X.
And for X, Y ∈ sp(m,K):

[X,Y]ᵀ·σ + σ·[X,Y] = (YᵀXᵀ − XᵀYᵀ)·σ + σ·(XY − YX) = 0

after inserting Xᵀ·σ·Y − Xᵀ·σ·Y and Yᵀ·σ·X − Yᵀ·σ·X and using Xᵀ·σ = −σ·X, q.e.d.
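The infinitesimal criteria of the lemma can be tested numerically: exponentiating a matrix from each Lie algebra should land in the corresponding group. A numpy sketch (a truncated series stands in for exp; the random matrices are arbitrary illustrations):

```python
import numpy as np

def expm_series(A, terms=80):
    # Truncated exponential series; adequate for the matrices used below.
    E = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ A / k
        E += term
    return E

rng = np.random.default_rng(3)

# sl(3, R): traceless, so det(exp X) = exp(tr X) = 1.
X = rng.standard_normal((3, 3))
X -= (np.trace(X) / 3) * np.eye(3)
print(np.isclose(np.linalg.det(expm_series(X)), 1.0))

# so(3, R): skew-symmetric, so exp(Y) is orthogonal.
Y = rng.standard_normal((3, 3))
Y = Y - Y.T
g = expm_series(Y)
print(np.allclose(g @ g.T, np.eye(3)))

# sp(4, R): Z^T σ + σ Z = 0 holds for Z = -σ S with S symmetric,
# and then exp(Z)^T σ exp(Z) = σ.
sigma = np.block([[np.zeros((2, 2)), np.eye(2)],
                  [-np.eye(2), np.zeros((2, 2))]])
S = rng.standard_normal((4, 4))
S = S + S.T
Z = -0.2 * (sigma @ S)          # small norm keeps the series accurate
h = expm_series(Z)
print(np.allclose(h.T @ sigma @ h, sigma))
```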
The reason for introducing these series and for the distinction between the two series B_r and D_r will become clear later in Chapter 8. The lower bounds on the index r ∈ N have been chosen in order to avoid duplicates. Otherwise we would have

A_1 = B_1 = C_1, B_2 = C_2, A_3 = D_3.
• Elements of the complex matrix groups of the series B_r, D_r preserve the C-bilinear form on Cᵐ

(z,w) = ∑_{j=1}^{m} z_j·w_j.
• Elements of the complex matrix groups of the series C_r preserve the C-bilinear form on Cᵐ

(x,y) = ∑_{i=1}^{r} (x_i·y_{r+i} − x_{r+i}·y_i).
• Elements of the special unitary group SU(m) preserve the sesquilinear (Hermitian) form on Cᵐ

(z,w) = ∑_{j=1}^{m} z_j·w̄_j.

Here the form (z,w) is C-linear with respect to the first component and C-antilinear with respect to the second component.
Over the complex numbers non-degenerate quadratic forms always transform to these normal forms by a change of basis. Therefore one can restrict the investigation to the latter. The situation changes over the real numbers. Here one has to consider different normal forms, depending on the signature of the quadratic form in question.
Proposition 3.7 (Topology of SO(3,R),SU(2),SL(2,C)).
1. The matrix group SO(3,R) is connected, the matrix group O(3,R) has two con-nected components.
2. The matrix group SU(2) - considered as a topological space - is the 3-sphere:

SU(2) ≅ S³ := {x ∈ R⁴ : ‖x‖ = 1}.
In particular, SU(2) is simply connected.
3. The matrix group SL(2,C) - considered as a topological space - is a trivial fibre bundle over the base C²\{0} with fibre C. In particular,

SL(2,C) ≅ S³×R³

and SL(2,C) is simply connected.
Proof. ad 1) We show that the matrix group SO(3,R) is path-connected: For a given rotation matrix we choose an orthonormal basis of R³ with the rotation axis as third basis element. Then we may assume

A = (  cos δ  sin δ  0
      −sin δ  cos δ  0
        0       0    1 ) ∈ SO(3,R), δ ∈ [0, 2π[.
The path

γ : [0,1] → SO(3,R), t ↦ (  cos(t·δ)  sin(t·δ)  0
                           −sin(t·δ)  cos(t·δ)  0
                               0          0     1 ) ∈ SO(3,R)

connects γ(0) = 1 with γ(1) = A in SO(3,R).

Each matrix A ∈ O(3,R) has det A = ±1. Hence O(3,R) has two connected components.
ad 2) Consider a matrix

A = ( z w ; u v ) ∈ SU(2).
Then

A⁻¹ = ( v −w ; −u z ) and A∗ = ( z̄ ū ; w̄ v̄ ).

Hence

SU(2) = {A ∈ GL(2,C) : A∗ = A⁻¹} = { ( z w ; −w̄ z̄ ) : z,w ∈ C, |z|² + |w|² = 1 } ≅ S³ ⊂ C² ≅ R⁴
is homeomorphic to the 3-dimensional unit sphere.

The unit sphere S³ is compact, connected and simply connected. Simply connected means the vanishing of the fundamental group

π₁(S³, ∗) = 0

or equivalently: Any closed path in S³ is contractible in S³ to one point. Intuitively, simple connectedness means that S³ has no "holes".

The result π₁(S³, ∗) = 0 is a consequence of the theorem by Seifert and van Kampen, see [30, Kap. 5.3].
ad 3) On one hand, projecting a matrix onto its last column defines the differentiable map

p : SL(2,C) → B := C²\{0}, A = ( a z ; c w ) ↦ (z, w).
On the other hand, one considers the open covering U = (U₁, U₂) of the base B := C²\{0} with

U₁ := {(z,w) ∈ B : z ≠ 0}, U₂ := {(z,w) ∈ B : w ≠ 0}.
One obtains a well-defined map

f : SL(2,C) → B×C,

A = ( a z ; c w ) ↦ (p(A), (a − w̄·r⁻²)·z⁻¹) if p(A) ∈ U₁, and A ↦ (p(A), (c + z̄·r⁻²)·w⁻¹) if p(A) ∈ U₂,

with r² := |z|² + |w|². Both formulas agree for p(A) ∈ U₁ ∩ U₂ because det A = aw − cz = 1.
The map f is a differentiable isomorphism - in particular a homeomorphism. Bydefinition, it satisfies pr1 ◦ f = p.
Note:

B = C²\{0} ≅ S³ × ]0,∞[ ≅ S³ × R.

Because S³ is simply connected according to part 2), also B and finally SL(2,C) ≅ B×C are simply connected, q.e.d.
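The identification of SU(2) with S³ is easy to check computationally. A numpy sketch (an illustration, not part of the text) maps a random point of S³ to a matrix and verifies the two group conditions:

```python
import numpy as np

rng = np.random.default_rng(4)

# A random point on S^3 ⊂ R^4, read as (z, w) ∈ C^2 with |z|^2 + |w|^2 = 1.
x = rng.standard_normal(4)
x /= np.linalg.norm(x)
z, w = complex(x[0], x[1]), complex(x[2], x[3])

# The corresponding matrix from the parametrization in the proof.
U = np.array([[z, w],
              [-w.conjugate(), z.conjugate()]])

print(np.allclose(U @ U.conj().T, np.eye(2)))   # unitary: True
print(np.isclose(np.linalg.det(U), 1.0))        # determinant 1: True
```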
Note: We will see in Example 3.8 that SO(3,R) is not simply connected.
Several relations exist between the classical groups. These relations can be made explicit by differentiable group morphisms. Hence it is advantageous not to study each group in isolation. Instead, it is recommended to focus on the relations between the classical groups and to study those properties which the groups have in common.
We give two examples in low dimension.
Example 3.8 (SU(2) as two-fold covering of SO(3,R)).

i) In order to obtain a suitable morphism

Φ : SU(2) → SO(3,R)

we have to find out first: How does a complex matrix U ∈ SU(2) act on the 3-dimensional real space R³?
We define the real vector space of complex selfadjoint two-by-two matrices withtrace zero
Herm0(2) := {X ∈M(2×2,C) : X = X∗, tr X = 0}.
The family of Pauli matrices (σ_j)_{j=1,2,3} is a basis of Herm₀(2):

σ₁ := ( 0 1 ; 1 0 ), σ₂ := ( 0 −i ; i 0 ), σ₃ := ( 1 0 ; 0 −1 ).
The Pauli matrices have the non-zero commutator

[σ₁, σ₂] = 2·i·σ₃

and all further non-zero commutators result from cyclic permutation. Concerning the associative product one has

σ_j·σ_k = i·∑_{l=1}^{3} (ε_{jkl}·σ_l) + δ_{jk}·1

with the value ε_{jkl} depending on the sign of the permutation (j,k,l):

ε_{jkl} = 1 if sign(j,k,l) = +1, ε_{jkl} = −1 if sign(j,k,l) = −1, ε_{jkl} = 0 otherwise.
We define the map

α : R³ → Herm₀(2), x = (x₁,x₂,x₃) ↦ X := ∑_{j=1}^{3} x_j·σ_j = ( x₃, x₁ − i·x₂ ; x₁ + i·x₂, −x₃ ).
On R3 we consider the Euclidean quadratic form
q_E : R³ → R, x = (x₁,x₂,x₃) ↦ ∑_{j=1}^{3} x_j².
Correspondingly, on the real vector space Herm₀(2) we consider the quadratic form

q_H : Herm₀(2) → R, X ↦ −det X,

i.e.

q_H(X) = a² + |b|² for X = ( a b ; b̄ −a ), a ∈ R, b ∈ C.
The map

α : (R³, q_E) → H := (Herm₀(2), q_H)

is an isometric isomorphism of Euclidean spaces, i.e. α is an isomorphism of vector spaces with

q_H(α(x)) = q_E(x), x ∈ R³.
By means of the isometric isomorphism α we identify with O(3) the group of isometries of H

O(H) := {g ∈ GL(Herm₀(2)) : q_H(g(X)) = q_H(X) for all X ∈ Herm₀(2)}.
ii) We define a group morphism

Φ : SU(2) → O(H), B ↦ Φ_B,

with

Φ_B : Herm₀(2) → Herm₀(2), X ↦ B·X·B⁻¹.

Note B·X·B⁻¹ ∈ Herm₀(2), because B⁻¹ = B∗ and X∗ = X imply

(B·X·B⁻¹)∗ = (B·X·B∗)∗ = B·X∗·B∗ = B·X·B⁻¹

and tr(B·X·B⁻¹) = tr X = 0. And in addition det(B·X·B⁻¹) = det X, i.e. Φ_B preserves q_H.
Because SU(2) ≅ S³ is connected due to Proposition 3.7 and Φ is continuous with Φ(1) = id, we have

Φ : SU(2) → SO(H)

with SO(H) ⊂ O(H) the connected component of the neutral element e ∈ O(H); note SO(H) ≅ SO(3,R).
iii) Each differentiable group homomorphism of matrix groups
Ψ : G1→ G2
induces a morphism of the corresponding Lie algebras
ψ := Lie Ψ : Lie G1→ Lie G2.
The Lie algebra Lie G of a matrix group G is the tangent space Lie G = T_e G at the neutral element e ∈ G, and ψ is the linearisation of Ψ at e.
In order to determine the linearisation of Φ : SU(2) → SO(H) we compute for

B = exp A, A ∈ su(2) = Lie SU(2), X ∈ Herm₀(2):

Φ_B(X) = Φ_{exp A}(X) = exp(A)·X·exp(−A) = (1 + A + O(A²))·X·(1 − A + O(A²)) =

= X + A·X − X·A + O(A²) = X + [A,X] + O(A²).
As linearisation with respect to A we obtain the morphism of Lie algebras

ϕ := Lie Φ : su(2) → o(H), ϕ(A)(X) = [A,X].
Here o(H) ⊂ gl(Herm0(2)) is the subalgebra of the infinitesimal generators ofall 1-parameter subgroups of O(H).
The vector space su(2) has as basis the family (i·σ_j)_{j=1,2,3}:

i·σ₁ = ( 0 i ; i 0 ), i·σ₂ = ( 0 1 ; −1 0 ), i·σ₃ = ( i 0 ; 0 −i ).
The basis elements of su(2) map as follows:

ϕ(i·σ₁) : H → H,

ϕ(i·σ₁)(σ₁) = 0, ϕ(i·σ₁)(σ₂) = i·[σ₁,σ₂] = −2·σ₃, ϕ(i·σ₁)(σ₃) = i·[σ₁,σ₃] = −i·[σ₃,σ₁] = 2·σ₂.

With respect to the basis (σ_j)_{j=1,2,3} of Herm₀(2) we obtain

ϕ(i·σ₁) = 2·( 0 0 0 ; 0 0 1 ; 0 −1 0 ) ∈ o(H)
and similarly

ϕ(i·σ₂) = 2·( 0 0 −1 ; 0 0 0 ; 1 0 0 ), ϕ(i·σ₃) = 2·( 0 1 0 ; −1 0 0 ; 0 0 0 ) ∈ o(H).
Hence the family (ϕ(i·σ_j))_{j=1,2,3} is linearly independent in o(H). Because both Lie algebras have dimension 3, as a consequence

ϕ : su(2) → o(H)

is an isomorphism of Lie algebras.
iv) The morphismΦ : SU(2)→ SO(3,R)
is a local isomorphism at 1 ∈ SU(2) because its tangent map is bijective due to part iii). In particular, Φ is an open map. The image Φ(SU(2)) ⊂ SO(3,R) is compact because SU(2) is compact due to Proposition 3.7.
As a consequence, the open and closed subset Φ(SU(2))⊂ SO(3,R) equals theconnected set SO(3,R), i.e. the map Φ is surjective.
v) We are left with calculating the kernel of Φ . We claim:
ker Φ = {±1} ⊂ SU(2).
For the proof consider an arbitrary but fixed matrix

B = ( z w ; −w̄ z̄ ) ∈ SU(2)
with B·X·B⁻¹ = X for all X ∈ Herm₀(2). Then

B⁻¹ = B∗ = ( z̄ −w ; w̄ z ).
Choosing for X successively the basis elements σ₁, σ₂ ∈ Herm₀(2) we obtain from

B·σ₁·B⁻¹ = σ₁

the restriction w = 0. Then

B·σ₂·B⁻¹ = σ₂

implies the further restriction z² = 1, i.e. z = ±1 = z̄.
vi) The group SU(2) is simply connected according to Proposition 3.7. Hence the map

Φ : SU(2) → SO(3,R)

is the universal covering space of SO(3,R), i.e. SU(2) is a double cover of SO(3,R).
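The morphism Φ can be implemented directly. The numpy sketch below (an illustration, not part of the text) builds a random B ∈ SU(2), computes the matrix of Φ_B in the Pauli basis via the trace form tr(σ_a·σ_b) = 2·δ_ab, and checks that the result lies in SO(3); it also confirms that B and −B give the same rotation:

```python
import numpy as np

s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]]),
     np.array([[1, 0], [0, -1]], dtype=complex)]

def Phi(B):
    # Matrix of X ↦ B X B^{-1} = B X B* in the basis (σ1, σ2, σ3) of Herm_0(2),
    # reading off coefficients with tr(σ_a σ_b) = 2 δ_ab.
    return np.array([[np.trace(B @ s[j] @ B.conj().T @ s[k]).real / 2
                      for j in range(3)] for k in range(3)])

# A random B ∈ SU(2) via the S^3 parametrization of Proposition 3.7.
rng = np.random.default_rng(5)
x = rng.standard_normal(4)
x /= np.linalg.norm(x)
z, w = complex(x[0], x[1]), complex(x[2], x[3])
B = np.array([[z, w], [-w.conjugate(), z.conjugate()]])

R = Phi(B)
print(np.allclose(R @ R.T, np.eye(3)))     # orthogonal: True
print(np.isclose(np.linalg.det(R), 1.0))   # det 1, i.e. R ∈ SO(3): True
print(np.allclose(Phi(-B), R))             # B and -B give the same rotation
```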
Nearly the same methods allow us to compute the universal covering space of the connected component of the neutral element of the Lorentz group:

The Lorentz group is the matrix group of isometries of the Minkowski space M

O(1,3) := {f ∈ GL(4,R) : q_M(f(x)) = q_M(x) for all x ∈ R⁴}.

The group O(1,3) has 4 connected components. The connected component of 1 ∈ O(1,3) is the proper orthochronous Lorentz group L↑₊. The term indicates that elements from

L↑₊ = {B = (b_ij)_{0≤i,j≤3} ∈ O(1,3) : det B = 1, b₀₀ ≥ 1}

keep the orientation of vectors and the sign of their time component.
Proposition 3.9 (Universal covering space of the Lorentz group). The proper orthochronous Lorentz group has the universal covering space

Ψ : SL(2,C) → L↑₊

with a suitable two-fold covering map Ψ, which is a group homomorphism.
Proof. i) Minkowski space as a vector space of matrices: Let M := (R⁴, q_M) denote the Minkowski space with the quadratic form of signature (1,3)

q_M : R⁴ → R, q_M(x) := x₀² − (x₁² + x₂² + x₃²), x = (x₀,...,x₃).
Let H := (Herm(2), q_H) denote the real vector space of Hermitian matrices

Herm(2) := {X ∈ M(2×2,C) : X = X∗}

equipped with the real quadratic form

q_H : Herm(2) → R, X ↦ det X,

i.e.

q_H(X) = a·d − |b|² for X = ( a b ; b̄ d ), a,d ∈ R, b ∈ C.
We set σ₀ := 1 ∈ Herm(2) and denote by σ_j ∈ Herm(2), j = 1,2,3, the Pauli matrices. The family (σ_j)_{j=0,...,3} is a basis of the vector space Herm(2). The map

β : M → H, x = (x₀,...,x₃) ↦ X := ∑_{j=0}^{3} x_j·σ_j = ( x₀+x₃, x₁ − i·x₂ ; x₁ + i·x₂, x₀−x₃ )

is an isometric isomorphism, i.e. an isomorphism of vector spaces satisfying

q_H(β(x)) = q_M(x), x ∈ R⁴,

because

q_H(β(x)) = det β(x) = x₀² − (x₁² + x₂² + x₃²) = q_M(x).
By means of the isometric isomorphism β we identify the group of isometries of H

O(H) := {g ∈ GL(Herm(2)) : q_H(g(X)) = q_H(X) for all X ∈ Herm(2)}

with O(1,3) and denote by

L↑₊(H) ⊂ O(H)

the connected component of the neutral element id_H ∈ O(H).
ii) Defining the map: The map

Ψ : SL(2,C) → O(H), B ↦ Ψ_B,

with

Ψ_B : H → H, X ↦ B·X·B∗,

is a well-defined morphism of matrix groups. We have Ψ(SL(2,C)) ⊂ L↑₊(H) because SL(2,C) is connected due to Proposition 3.7.
iii) The tangent map: Denote by

o(H) ⊂ gl(Herm(2)) := (End(Herm(2)), [−,−])

the subalgebra of the infinitesimal generators of all 1-parameter subgroups of O(H). And let

ψ := Lie Ψ : sl(2,C) → o(H)

be the tangent map of Ψ at 1 ∈ SL(2,C). The linearisation of Ψ at 1 ∈ SL(2,C) has the form

ψ(A)(X) = A·X + X·A∗, A ∈ sl(2,C), X ∈ Herm(2).
It is an isomorphism of real Lie algebras: The family (A_j)_{j=1,...,6} with

A_j := σ_j if j = 1,2,3, and A_j := i·σ_{j−3} if j = 4,5,6,

is a basis of sl(2,C) considered as a real vector space. Explicit computation of the matrices representing ψ(A_j), j = 1,...,6, shows: The family (ψ(A_j))_{j=1,...,6} is a linearly independent family in the vector space End(Herm(2)).
iv) Covering space: The map

Ψ : SL(2,C) → L↑₊(H)

is open because its tangent map at 1 ∈ SL(2,C) is bijective, i.e. Ψ is a local isomorphism. Hence the image Ψ(SL(2,C)) ⊂ L↑₊(H) is open. It is also closed because

L↑₊(H) = ⋃_{g ∈ L↑₊(H)} g·Ψ(SL(2,C))

represents the complement L↑₊(H) \ Ψ(SL(2,C)) as a union of open subsets. Hence

Ψ : SL(2,C) → L↑₊(H)

is surjective.
v) 2-fold covering: The kernel is

ker Ψ = {±1} ⊂ SL(2,C):

For the proof one evaluates the condition

Ψ_B(X) = X

for B ∈ SL(2,C) and all X ∈ Herm(2), in particular for the matrices X from the basis (σ_j)_{j=0,...,3} of Herm(2).
As a consequence of part i) - part v) and the simple connectedness of SL(2,C) from Proposition 3.7:

Ψ : SL(2,C) → L↑₊(H)

is the universal covering space of the proper orthochronous Lorentz group. It is a two-fold covering space, q.e.d.
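The action Ψ_B and the invariance of the Minkowski form can be checked numerically. A numpy sketch (the normalization of a random matrix to determinant 1 is an illustrative choice, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(6)

# A random B ∈ SL(2, C): divide an invertible matrix by a square root of its det.
M = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
B = M / np.sqrt(np.linalg.det(M))

# β : R^4 → Herm(2) from the proof, built on σ_0 = 1 and the Pauli matrices.
def beta(x):
    x0, x1, x2, x3 = x
    return np.array([[x0 + x3, x1 - 1j * x2],
                     [x1 + 1j * x2, x0 - x3]])

x = rng.standard_normal(4)
X = beta(x)
Y = B @ X @ B.conj().T                         # Ψ_B(X) = B X B*

qM = x[0]**2 - x[1]**2 - x[2]**2 - x[3]**2
print(np.isclose(np.linalg.det(X).real, qM))   # q_H = det agrees with q_M: True
print(np.isclose(np.linalg.det(Y).real, qM))   # Ψ_B preserves the form: True
```

The invariance holds because det(B·X·B∗) = det B · det X · det B∗ = |det B|²·det X = det X for det B = 1.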
For arbitrary dimension we have the following topological results, e.g. see [17,Chap. 10, §2] and [19, Chap. 17]:
Remark 3.10 (Topological properties of the classical groups).
1. General linear group, m ≥ 1: The group GL(m,C) is connected. The real group GL(m,R) has two connected components. The component of the identity is not simply connected for m ≥ 2.
2. Series Ar, r ≥ 1, m = r+1: For K ∈ {R,C} the group SL(m,K) is connected and not compact. The complex group SL(m,C) is simply connected - cf. Proposition 3.7 for m = 2 - while the real group SL(m,R) is not simply connected for m ≥ 2.
3. Series Br, r ≥ 2, m = 2r+1, and series Dr, r ≥ 4, m = 2r: For K ∈ {R,C} the group SO(m,K) is connected, but not simply connected. The real group SO(m,R) is compact.
4. Series Cr : r ≥ 3,m = 2r:For K ∈ {R,C} the group Sp(m,K) is connected.
5. Special unitary group, m≥ 1:The group SU(m) is compact, connected and simply connected. Note that SU(m)is a real matrix group.
No complex group from this list is compact, because any complex, connectedand compact Lie group is Abelian, see [19, Prop. 15.3.7].
Definition 3.11 (Real form of a complex Lie algebra, complexification of a realLie algebra).
i) Consider a complex Lie algebra L. By restricting scalars from C to R the Liealgebra L can be considered a real Lie algebra LR.
ii) Consider a real Lie algebra M. Then the complexification of M is the complexLie algebra
C⊗R M
with Lie bracket
[z₁⊗m₁, z₂⊗m₂] := (z₁·z₂)⊗[m₁,m₂], z₁,z₂ ∈ C, m₁,m₂ ∈ M.
iii) A real form of the complex Lie algebra L is a real subalgebra M ⊂ LR such thatthe complex linear map
C⊗R M→ L,1⊗m 7→ m, i⊗m 7→ i ·m,
is an isomorphism of complex Lie algebras.
Proposition 3.12 (Compact real form). Each complex Lie algebra from Lemma 3.6has a real form which is the Lie algebra of a compact matrix group:
1. General linear group, m≥ 1: The real Lie algebra
u(m) = {X ∈ gl(m,C) : X + X∗ = 0}
is a compact real form of gl(m,C). u(m) is the Lie algebra of infinitesimal gen-erators of the unitary group U(m) which is a compact matrix group.
2. Series Ar, m=r+1: The real Lie algebra su(m) is a compact real form of sl(m,C). su(m)is the Lie algebra of infinitesimal generators of the compact matrix group SU(m).
3. Series Br, m = 2r+1: The real Lie algebra so(m,R) is a compact real form of so(m,C). so(m,R) is the Lie algebra of infinitesimal generators of the compact real matrix group SO(m,R).
4. Series Cr, m = 2r: The real Lie algebra

sp(m) := u(m) ∩ sp(m,C) = {X ∈ gl(m,C) : X + X∗ = 0 and Xᵀ·σ + σ·X = 0}

is a compact real form of sp(m,C). sp(m) is the Lie algebra of infinitesimal generators of the compact symplectic group

Sp(m) := U(m) ∩ Sp(m,C).

Here U(m) denotes the unitary group

U(m) := {g ∈ GL(m,C) : g·g∗ = 1}.
5. Series Dr, m = 2r: The real Lie algebra so(m,R) is a compact real form of so(m,C). so(m,R) is the Lie algebra of infinitesimal generators of the compact real matrix group SO(m,R).
Proof. All groups SU(m), Sp(m), SO(m,R) are closed subgroups of a suitable unitary group U(n). Hence it suffices to prove the compactness of U(n): A matrix A ∈ U(n) preserves the Hermitian norm of Cⁿ, hence its operator norm satisfies ‖A‖ = 1. Therefore U(n) is a bounded and closed subset of M(n×n,C), i.e. compact, q.e.d.
Remark 3.13 (Exponential map of classical Lie groups). On one hand, in general the exponential map

exp : Lie G → G

is not surjective for a connected classical group G: According to Example 2.8 the matrix

B = ( −1  b )
    (  0 −1 ) ∈ SL(2,R), b ∈ R∗,

is not of the form B = exp A with a matrix A ∈ sl(2,R).

On the other hand, the exponential map

exp : Lie G → G

is surjective for a connected matrix group G if Lie G is the compact real form of a complex Lie algebra, see [17, Chap. II, Prop. 6.10].
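The obstruction behind Example 2.8 can be made quantitative: a matrix A ∈ sl(2,R) has eigenvalues ±μ with μ real or purely imaginary, hence tr(exp A) = 2·cosh(μ) or 2·cos(θ) is always ≥ −2, while the matrix B above has trace −2 without being −1. The following sketch (own illustration, not part of the lecture, assuming NumPy) samples random traceless matrices and confirms the trace bound:

```python
import numpy as np

def expm(X, terms=80):
    """Matrix exponential by the truncated power series."""
    result, term = np.eye(2), np.eye(2)
    for k in range(1, terms):
        term = term @ X / k
        result = result + term
    return result

rng = np.random.default_rng(1)
for _ in range(200):
    a, b, c = rng.uniform(-2, 2, size=3)
    A = np.array([[a, b], [c, -a]])       # traceless: A lies in sl(2,R)
    # every matrix in the image of exp : sl(2,R) -> SL(2,R) has trace >= -2
    assert np.trace(expm(A)) >= -2.0 - 1e-9
```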
3.3 The adjoint representation of a Lie algebra
Every associative matrix algebra gives rise to a Lie algebra with Lie bracket
[A,B] := A ·B−B ·A.
In the opposite direction, a representation of a Lie algebra maps the Lie algebra to a matrix algebra with Lie bracket the commutator of matrices. Hence the question: Which Lie algebras are isomorphic to a Lie algebra of matrices, i.e. do injective representations always exist?
Definition 3.14 (Representation, adjoint representation). Consider a K-Lie algebra L.
1. A representation of L on a K-vector space V is a morphism of Lie algebras
ρ : L→ gl(V ) := (End(V ), [−,−]).
In particular,
ρ([x,y]) = [ρ(x),ρ(y)] = ρ(x)◦ρ(y)−ρ(y)◦ρ(x).
The vector space V is named an L-module with respect to the multiplication
L×V →V,(x,m) 7→ x ·m := ρ(x)(m).
Then
[x,y] ·m = ρ([x,y])(m) = [ρ(x),ρ(y)](m) = ρ(x)(ρ(y)(m))−ρ(y)(ρ(x)(m)) =
x · (y ·m)− y · (x ·m).
The representation is faithful iff ρ is injective, i.e. iff ρ embeds L into a Lie algebra of matrices.
2. The adjoint representation of L is the map
ad : L→ gl(L),x 7→ ad x,
defined as

ad x : L → L, y ↦ (ad x)(y) := [x,y].
Lemma 3.15 (Adjoint representation). The adjoint representation is a representa-tion, i.e. a morphism of Lie algebras.
Proof. We have to show
ad [x,y] = [ad x,ad y] : L→ L.
Consider z ∈ L. On one hand,
ad[x,y](z) = [[x,y],z] =−[[y,z],x]− [[z,x],y] (Jacobi identity)
On the other hand
[ad x,ad y](z) = (ad x ◦ad y−ad y ◦ad x)(z) = [x, [y,z]]− [y, [x,z]].
Both results are equal because the Lie bracket is antisymmetric, q.e.d.
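The morphism property can be double-checked numerically (own illustration, not part of the lecture; it assumes NumPy). The sketch represents ad X as a matrix acting on the row-major flattening of gl(3,R) via Kronecker products and verifies ad [x,y] = [ad x, ad y] for random x, y:

```python
import numpy as np

def ad(X):
    """Matrix of ad X on gl(n), entries flattened row-major:
    vec([X, Y]) = (X (x) 1 - 1 (x) X^T) vec(Y)."""
    n = X.shape[0]
    I = np.eye(n)
    return np.kron(X, I) - np.kron(I, X.T)

rng = np.random.default_rng(2)
x, y = rng.standard_normal((2, 3, 3))
bracket = x @ y - y @ x

# ad is a morphism of Lie algebras: ad [x,y] = [ad x, ad y]
assert np.allclose(ad(bracket), ad(x) @ ad(y) - ad(y) @ ad(x))
```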
The adjoint representation maps any abstract Lie algebra to a Lie algebra of endomorphisms of a vector space, i.e. to a Lie algebra of matrices. But in general the adjoint representation is not injective. The kernel of the adjoint representation of a Lie algebra L is the center Z(L). Nevertheless Theorem ?? will show that also an injective morphism into a Lie algebra of matrices exists.
The characteristic feature of a Lie algebra L is the Lie bracket. It refines the underlying vector space of L. In order to study the Lie bracket one studies how each single element x ∈ L acts on L as an endomorphism ad x.

This procedure is similar to the study of number fields Q ⊂ K. The multiplication on K refines the underlying Q-vector space structure. And one studies the multiplication by considering the Q-endomorphisms which result from the multiplication by all elements x ∈ K. Norm and trace of these endomorphisms are important parameters in algebraic number theory.
The concept of a representation is a concept of fundamental importance in thetheory of Lie algebras and their use in physics. We will study many examples inlater chapters. The scope of the concept is not restricted to Lie algebra theory.
For all x ∈ L the element ad x is not only an endomorphism of L, but also a derivation of L: With respect to the Lie bracket it satisfies a rule similar to the Leibniz rule for the derivative of the product of two functions.
Definition 3.16 (Derivation). Let L be a Lie algebra. A derivation of L is an endomorphism D : L → L which satisfies the product rule
D([y,z]) = [D(y),z]+ [y,D(z)].
Lemma 3.17 (Adjoint representation and derivation). Consider a Lie algebra L.For every x ∈ L
ad x : L→ L
is a derivation of L.
Proof. Set D := ad x ∈ End(L). The Jacobi identity implies
D([y,z]) := (ad x)([y,z]) = [x, [y,z]] =−[y, [z,x]]− [z, [x,y]] = [y, [x,z]]+ [[x,y],z] =
= [y,(ad x)(z)]+ [(ad x)(y),z] = [y,D(z)]+ [D(y),z].
Derivations arising from the adjoint representation are named inner derivations; all other derivations are named outer derivations.
Lemma 3.18 (Derivation). Let L be a Lie algebra. The set of all derivations of L
Der(L) := {D ∈ End(L) : D derivation}
is a subalgebra Der(L)⊂ gl(L).
Proof. Apparently Der(L) ⊂ End(L) is a subspace. In order to show that Der(L) is even a subalgebra, we have to prove: If D1,D2 ∈ Der(L) then [D1,D2] ∈ Der(L).
[D1,D2]([x,y]) = (D1 ◦D2)([x,y])− (D2 ◦D1)([x,y]) =
= D1([D2(x),y]+ [x,D2(y)])−D2([D1(x),y]+ [x,D1(y)]) =
= [D1(D2(x)),y]+ [D2(x),D1(y)]+ [D1(x),D2(y)]+ [x,D1(D2(y))]
−[D2(D1(x)),y]− [D1(x),D2(y)]− [D2(x),D1(y)]− [x,D2(D1(y))] =
= [[D1,D2](x),y]+ [x, [D1,D2](y)].
Chapter 4
Nilpotent Lie algebras and Solvable Lie algebras
If not stated otherwise, all Lie algebras and vector spaces in this chapter will be assumed finite-dimensional.
4.1 Engel’s theorem for nilpotent Lie algebras
Recall Definition 2.5: An endomorphism f ∈ End(V) of a vector space V is nilpotent iff an index n ∈ N exists with f^n = 0.
Definition 4.1 (Ad-nilpotency). Consider a Lie algebra L. An element x ∈ L is ad-nilpotent iff the endomorphism of L
ad x : L→ L,y 7→ [x,y],
is nilpotent.
Nilpotency and ad-nilpotency refer to two different structures: Nilpotency iterates the associative product, while ad-nilpotency iterates the Lie product. If the Lie algebra results from a matrix algebra both concepts are related: A Lie algebra of nilpotent endomorphisms of a vector space acts “nilpotently” on itself by the adjoint representation.
Lemma 4.2 (Nilpotency implies ad-nilpotency). Consider a vector space V and an element x ∈ L ⊂ gl(V). If the endomorphism x : V → V is nilpotent, then also
ad x : L→ L
is nilpotent, i.e. the element x ∈ L is ad-nilpotent.
Proof. The endomorphism x ∈ End(V) acts on End(V) by left composition and right composition
l : End(V )→ End(V ),y 7→ x ◦ y,
r : End(V )→ End(V ),y 7→ y ◦ x.
Then

ad x = l − r
because for all y ∈ End(V )
(ad x)(y) = [x,y] = (l− r)(y).
Nilpotency of x implies that both actions are nilpotent. Hence l^N = r^N = 0 for a suitable exponent N ∈ N.
Both actions commute: For all y ∈ End(V )
[l,r](y) = x◦ (y◦ x)− (x◦ y)◦ x = 0.
Hence the binomial theorem implies

(ad x)^{2N} = (l − r)^{2N} = Σ_{ν=0}^{2N} \binom{2N}{ν} · l^ν · r^{2N−ν} · (−1)^{2N−ν} = 0
because in each summand at least one of the two exponents ν and 2N−ν is at least N, hence at least one factor vanishes, q.e.d.
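The binomial argument of the proof can be tested on a concrete example (own illustration, not part of the lecture, assuming NumPy): for a strictly upper triangular x ∈ gl(3,R) one has x^3 = 0, so with N = 3 the proof yields (ad x)^{2N} = 0; in fact already (ad x)^5 vanishes, since in (l−r)^5 one of the exponents is ≥ 3.

```python
import numpy as np

def ad(X):
    """vec([X,Y]) = (X (x) 1 - 1 (x) X^T) vec(Y), row-major flattening."""
    n = X.shape[0]
    I = np.eye(n)
    return np.kron(X, I) - np.kron(I, X.T)

# x strictly upper triangular in gl(3,R), hence nilpotent with x^3 = 0
x = np.array([[0., 1., 2.],
              [0., 0., 3.],
              [0., 0., 0.]])
assert np.allclose(np.linalg.matrix_power(x, 3), 0)

# ad x is nilpotent as an endomorphism of the 9-dimensional space gl(3,R)
assert np.allclose(np.linalg.matrix_power(ad(x), 5), 0)
```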
A further relation between the associative structure of the matrix algebra M(n×n,C) and the Lie algebra gl(n,C) derives from the Jordan decomposition of endomorphisms according to Theorem 2.6.
Proposition 4.3 (Jordan decomposition of the adjoint representation). Consider an n-dimensional complex vector space V and the Jordan decomposition
f = fs + fn
of an endomorphism f ∈ End(V ). Then
ad f = ad fs +ad fn
is the Jordan decomposition of
ad f : End(V) → End(V), g ↦ [f,g],
i.e. ad fs is semisimple, ad fn is nilpotent and [ad fs,ad fn] = 0.
Proof. According to Lemma 4.2 the endomorphism ad fn is nilpotent. In order to show that ad fs is semisimple we choose a basis (v1, ...,vn) of V consisting of eigenvectors of fs with corresponding eigenvalues λ1, ...,λn.
Let (Ei j)1≤i, j≤n denote the standard base of End(V ) relatively to (v1, ...,vn), i.e.
Ei j(vk) = δ jkvi.
For 1≤ i, j,k ≤ n:
((ad fs)(Ei j))(vk) = [ fs,Ei j](vk) = fs(Ei j(vk)) − Ei j( fs(vk)) = ( fs − λk)(Ei j(vk)) =

= ( fs − λk)(δ jk vi) = (λi − λ j) · δ jk vi = (λi − λ j) · Ei j(vk).

Hence

(ad fs)(Ei j) = (λi − λ j) · Ei j

and ad fs acts diagonally on gl(V).
Because ad : L→ gl(L),L := gl(V ), is a morphism of Lie algebras:
[ad fs,ad fn] = ad([ fs, fn]) = 0
q.e.d.
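The diagonal action of ad fs can be confirmed numerically (own illustration, not part of the lecture, assuming NumPy): for a diagonal matrix with eigenvalues λ1,...,λn the eigenvalues of ad fs on gl(n) are exactly the differences λi − λj.

```python
import numpy as np

lam = np.array([1.0, 2.0, 5.0])             # eigenvalues of a semisimple f_s
fs = np.diag(lam)
I = np.eye(3)
ad_fs = np.kron(fs, I) - np.kron(I, fs.T)   # matrix of ad f_s on gl(3)

# ad f_s is diagonal with eigenvalues lambda_i - lambda_j
expected = sorted(li - lj for li in lam for lj in lam)
got = sorted(np.linalg.eigvals(ad_fs).real)
assert np.allclose(got, expected)
```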
Using the adjoint representation we generalize the concept of a semisimple endomorphism to elements of an arbitrary complex Lie algebra.
Definition 4.4 (Semisimple element of a complex Lie algebra). Consider a complex Lie algebra L. An element x ∈ L is semisimple iff the endomorphism ad x ∈ End(L) is semisimple.
It is a trivial fact that a nilpotent endomorphism of a vector space V ≠ {0} has an eigenvector with eigenvalue zero. Theorem 4.5 strongly generalizes this fact: It proves the existence of a common eigenvector for a whole Lie algebra of nilpotent endomorphisms.
Theorem 4.5 (Annihilation of a common eigenvector). Consider a vector space V ≠ {0} and a Lie subalgebra L ⊂ gl(V) of endomorphisms.

If each endomorphism x ∈ L is nilpotent, then all x ∈ L annihilate a common eigenvector, i.e. a nonzero vector v ∈ V exists with

x(v) = 0 for all x ∈ L.
Proof. The proof goes by induction on dim L; the dimension of the finite-dimensional vector space V is arbitrary. As noted before the theorem is true for dim L = 1.
For the induction step assume dim L > 1 and the theorem true for all Lie algebras of smaller dimension.
i) By assumption all x ∈ L are nilpotent endomorphisms of V. Hence by Lemma 4.2 each action
ad x : L→ L,x ∈ L,
is nilpotent.

If K ⊊ L is a proper subalgebra, then (ad x)(K) ⊂ K for all x ∈ K. Hence the Lie algebra K with dim K < dim L acts on the vector space L/K by an action which derives from the adjoint representation of L:

ad x : L/K → L/K, y+K ↦ [x,y]+K, x ∈ K.

Nilpotency of each ad x : L → L, x ∈ K, implies the nilpotency of each induced map ad x : L/K → L/K, x ∈ K.

The induction hypothesis applied to the Lie algebra K and the vector space V := L/K provides a common eigenvector

y+K ∈ L/K

for all induced maps ad x, x ∈ K, belonging to the eigenvalue zero. In particular y ∉ K but [x,y] ∈ K for all x ∈ K. As a consequence

y ∈ NL(K) \ K,

i.e. K is properly contained in its normalizer.
ii) We now choose a proper Lie subalgebra I ⊊ L of maximal dimension. Then NL(I) = L by part i) and maximality, i.e. I ⊂ L is an ideal. Hence we have a canonical projection of Lie algebras

π : L → L/I.

In case dim L/I > 1 the inverse image π−1(K) ⊂ L of a 1-dimensional subalgebra K ⊂ L/I would be a Lie subalgebra of L properly containing I, a contradiction to the maximality of I. Hence dim L/I = 1, which implies the decomposition
L = I +K · x0
with a suitable element x0 ∈ L \ I.

By induction assumption the elements of I annihilate a common non-zero eigenvector, i.e.

W := {v ∈ V : x(v) = 0 for all x ∈ I} ≠ {0}.
In addition the subspace W ⊂ V is stable with respect to the endomorphism x0 ∈ End(V), i.e. for all w ∈ W holds x0(w) ∈ W: For all x ∈ I
x(x0(w)) = x0(x(w))− [x0,x](w) = 0.
Here the first summand vanishes because w ∈W and x ∈ I. The second summandvanishes because I ⊂ L is an ideal and therefore [x0,x] ∈ I. The restriction
x0|W : W →W
is a nilpotent endomorphism and has an eigenvector v0 ∈ W, v0 ≠ 0, with eigenvalue 0. From the decomposition L = I + K·x0 follows x(v0) = 0 for all x ∈ L. This finishes the induction step, q.e.d.
The fact that any Lie algebra L ⊂ gl(V) of nilpotent endomorphisms annihilates a common eigenvector allows the simultaneous triangularization of all endomorphisms of L to strict upper triangular form.

Corollary 4.6 (Simultaneous strict triangularization of nilpotent endomorphisms). Consider an n-dimensional K-vector space V and a Lie subalgebra L ⊂ gl(V) of endomorphisms of V.
Then the following properties are equivalent:
1. Each endomorphism x ∈ L is nilpotent.
2. A flag (Vi)i=0,...,n of subspaces of V exists such that L(Vi)⊂Vi−1 for all i= 1, ...,n.
3. The Lie algebra L is isomorphic to a subalgebra of the Lie algebra of strict uppertriangular matrices
n(n,K) = { ( 0     ∗ )
           (    ⋱    ) ∈ gl(n,K) }.
           ( 0     0 )
In particular, any endomorphism x ∈ L has zero trace.
Proof. 1) ⟹ 2): The proof goes by induction on n = dim V. The implication is valid for n = 0.

Assume n > 0 and part 2) valid for all vector spaces of less dimension. Set V0 := {0}. According to Theorem 4.5 all elements from L annihilate a common eigenvector v1 ∈ V; set V1 := K·v1. Consider the quotient
W :=V/Kv1
with canonical projection

π : V → W.
Any endomorphism x ∈ L ⊂ gl(V) annihilates V1. Hence it induces an endomorphism x̄ : W → W such that the following diagram commutes:

V --x--> V
|π       |π
↓        ↓
W --x̄--> W

In particular, the endomorphism x̄ ∈ End(W) is nilpotent. The induction assumption applied to W with dim W < dim V and L̄ := {x̄ : x ∈ L} ⊂ gl(W) provides a flag (Wi)i=0,...,n−1 of subspaces of W with
x̄(Wi) ⊂ Wi−1 for all i = 1, ...,n−1.

Now define

Vi := π−1(Wi−1), i = 1, ...,n.

Then

x̄(Wi) ⊂ Wi−1

implies

x(Vi) ⊂ Vi−1, i = 1, ...,n.
2) ⟹ 3): For the proof one extends successively a basis (v1) of V1 to a basis (vi)i=1,...,n of Vn = V.

3) ⟹ 1): The proof is obvious, q.e.d.
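For a concrete example of the flag property (own illustration, not part of the lecture, assuming NumPy): all elements of n(3,R) map Vi into Vi−1 for the standard flag Vi = span⟨e1,...,ei⟩, and in particular they annihilate the common eigenvector e1. By bilinearity of the matrix action it suffices to check a basis of n(3,R).

```python
import numpy as np

def E(i, j, n=3):
    """Elementary matrix with a single entry 1 at position (i, j)."""
    M = np.zeros((n, n))
    M[i, j] = 1.0
    return M

L = [E(0, 1), E(0, 2), E(1, 2)]     # basis of n(3,R)
e = np.eye(3)                        # columns e_1, e_2, e_3

for x in L:
    # common annihilated vector: x(e_1) = 0 for all x in L
    assert np.allclose(x @ e[:, 0], 0)
    # flag property: x(V_{i+1}) subset V_i, i.e. components beyond V_i vanish
    for i in (1, 2):
        v = x @ e[:, i]
        assert np.allclose(v[i:], 0)
```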
From a categorical point of view the concept of a short exact sequence is a useful tool to handle injective or surjective morphisms and their kernels and cokernels, respectively.
Definition 4.7 (Chain complex and exact sequence of Lie algebra morphisms).
1. A chain complex of Lie algebras is a sequence of Lie algebra morphisms
(fi : Li → Li+1)i∈Z

such that for all i ∈ Z the composition fi+1 ◦ fi = 0, i.e.

im[fi−1 : Li−1 → Li] ⊂ ker[fi : Li → Li+1].
2. A chain complex of Lie algebras (fi : Li → Li+1)i∈Z is exact or an exact sequence of Lie algebra morphisms if for all i ∈ Z

im[fi−1 : Li−1 → Li] = ker[fi : Li → Li+1].
3. A short exact sequence of Lie algebra morphisms is an exact sequence of the form

0 → L0 −f0→ L1 −f1→ L2 → 0.

A short exact sequence

0 → L0 −f0→ L1 −f1→ L2 → 0
expresses the following fact about two morphisms:
• f0 : L0→ L1 is injective,
• f1 : L1 → L2 is surjective and
• im f0 = ker f1, in particular L2 ≅ L1/L0.
A Lie algebra is Abelian when the commutator of any two elements vanishes. Nilpotent Lie algebras are a first step to generalize this property: A Lie algebra is named nilpotent when a number N ∈ N exists such that all N-fold commutators vanish.
Definition 4.8 (Descending central series and nilpotent Lie algebra). Consider a Lie algebra L.

1. The descending central series of L is the sequence (CiL)i∈N of ideals CiL ⊂ L, inductively defined as
C0L := L and Ci+1L := [L,CiL], i ∈ N.
2. The Lie algebra L is nilpotent iff an index i0 ∈ N exists with Ci0L = 0. The smallest index with this property is named the length of the descending central series.
3. An ideal I ⊂ L is nilpotent if I is nilpotent when considered as a Lie algebra.
One easily verifies Ci+1L ⊂ CiL and that each CiL ⊂ L, i ∈ N, is an ideal.
The Lie algebra n(n,K) of strictly upper triangular matrices and all subalgebras of n(n,K) are nilpotent Lie algebras.
For a Lie algebra L one obtains for any i ∈ N a short exact sequence
0 → CiL/Ci+1L → L/Ci+1L → L/CiL → 0.
By definition of the descending central series, the Lie algebra on the left-hand side is always contained in the center of the Lie algebra in the middle, i.e.

CiL/Ci+1L ⊂ Z(L/Ci+1L).

As a consequence the Lie algebra on the left-hand side is Abelian. For a nilpotent Lie algebra L and minimal index i0 with Ci0L = 0 the exact sequence

0 → Ci0−1L → L → L/Ci0−1L → 0

presents L as a central extension of L/Ci0−1L. Successively one obtains L as a finite sequence of central extensions with Abelian kernels, starting with the Abelian Lie algebra L/C1L.
Lemma 4.9 shows the reverse direction: A finite sequence of central extensions of Abelian Lie algebras leads to a nilpotent Lie algebra.
Lemma 4.9 (Central extension of a nilpotent Lie algebra). Consider a short exact sequence
0→ I→ L→ L/I→ 0
with a Lie algebra L and an ideal I ⊂ L contained in the center of L, i.e. I ⊂ Z(L).
If L/I is nilpotent then L is nilpotent too.
Proof. The center Z(L) ⊂ L is an Abelian, hence nilpotent ideal; therefore also I ⊂ Z(L) is nilpotent. Set L′ := L/I and consider the canonical projection
π : L→ L′.
The descending central series of L and L′ = π(L) relate as
Ci(π(L)) = π(CiL), i ∈ N.
If Ci0L′ = 0 then π(Ci0L) = 0, i.e. Ci0L ⊂ I ⊂ Z(L), hence Ci0+1L = [L,Ci0L] = 0, q.e.d.
If two Lie algebras from a short exact sequence
0→ L0→ L1→ L2→ 0
are nilpotent, one cannot conclude that the third Lie algebra is nilpotent too.
Example 4.10. The descending central series of the Lie algebra

L := t(2,K) = { ( ∗ ∗ )
                ( 0 ∗ ) ∈ gl(2,K) }

is

C1L = [L,L] = n(2,K) = K · ( 0 1 )
                           ( 0 0 )

and

C2L = [L,n(2,K)] = n(2,K) = C1L ≠ 0.
In the exact sequence
0→ n(2,K)→ t(2,K)→ d(2,K)→ 0
both the ideal of strictly upper triangular matrices
I := n(2,K)⊂ L
and the quotient of diagonal matrices
L/I ' d(2,K)
are nilpotent. But the Lie algebra L of upper triangular matrices is not nilpotent.
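The two key computations of the example — the commutator of upper triangular matrices is strictly upper triangular, and bracketing with L reproduces the generator of n(2,R), so the central series never reaches 0 — can be checked directly (own illustration, not part of the lecture, assuming NumPy):

```python
import numpy as np

E11 = np.array([[1., 0.], [0., 0.]])
E12 = np.array([[0., 1.], [0., 0.]])

def bracket(a, b):
    return a @ b - b @ a

# C^1 L = [L, L] lies in n(2,R): the commutator of two upper
# triangular matrices has zero diagonal
rng = np.random.default_rng(3)
a = np.triu(rng.standard_normal((2, 2)))
b = np.triu(rng.standard_normal((2, 2)))
c = bracket(a, b)
assert np.allclose(np.diag(c), 0)

# [L, n(2,R)] = n(2,R): the generator E12 is reproduced,
# hence C^i L = n(2,R) != 0 for all i >= 1
assert np.allclose(bracket(E11, E12), E12)
```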
Lemma 4.11. Consider a short exact sequence of Lie algebras
0→ L0→ L1π−→ L2→ 0.
If the Lie algebra L1 is nilpotent, then so are L0 and L2.
Proof. For each index i ∈ N we have
CiL0 ⊂CiL1 and π(CiL1) =CiL2.
Hence CiL1 = 0 implies CiL0 = 0 and CiL2 = 0, q.e.d.
We now prove Engel’s theorem. It is the main result about nilpotent Lie algebras. It characterizes the nilpotency of a Lie algebra by the ad-nilpotency of all its elements. Engel’s theorem follows as a corollary from Theorem 4.5.
Theorem 4.12 (Engel’s theorem for nilpotent Lie algebras). A Lie algebra L is nilpotent if and only if every element x ∈ L is ad-nilpotent.
Proof. i) Assume that every endomorphism
ad x : L→ L,x ∈ L,
is nilpotent. According to Corollary 4.6 the Lie algebra ad L is isomorphic to a subalgebra of n(n,K), n = dim L, of strictly upper triangular matrices, hence nilpotent. Due to Lemma 4.9 the exact sequence
0→ Z(L)→ L ad−→ ad L→ 0
implies the nilpotency of L.
ii) Suppose L to be nilpotent. An index i0 ∈ N exists with Ci0L = 0. Hence (ad x)^{i0} = 0 for all x ∈ L, i.e. ad x is nilpotent, q.e.d.
Corollary 4.13. Consider a nilpotent Lie algebra L and an ideal I ⊂ L, I ≠ {0}. Then

Z(L) ∩ I ≠ {0}.
Proof. According to Theorem 4.12 all endomorphisms
ad x : L→ L,x ∈ L,
are nilpotent. Because I ⊂ L is an ideal each restriction
(ad x)|I : I→ I,x ∈ L,
is well-defined and also nilpotent. According to Theorem 4.5 a non-zero element x0 ∈ I exists with
[L,x0] = 0,
i.e. 0 ≠ x0 ∈ Z(L) ∩ I, q.e.d.
Proposition 4.14 (Normalizer in nilpotent Lie algebras). Any proper subalgebra M ⊊ L of a nilpotent Lie algebra L is properly contained in its normalizer: M ⊊ NL(M).
Proof. We consider the descending central series of L. Because M ⊊ L = C0L we start with

M ⊊ M + C0L.

Due to Ci0L = 0 for a suitable index i0 ∈ N we end with

M = M + Ci0L.

Let i < i0 be the largest index with

M ⊊ M + CiL.

Then

M = M + Ci+1L.

Therefore

[M + CiL, M] ⊂ [M,M] + [CiL,M] ⊂ M + Ci+1L = M

because M is a Lie subalgebra. We obtain

M ⊊ M + CiL ⊂ NL(M), q.e.d.
4.2 Lie’s theorem for solvable Lie algebras
Definition 4.15 (Derived series and solvable Lie algebra). Consider a Lie algebra L.
1. The derived series of L is the sequence (DiL)i∈N inductively defined as
D0L := L and Di+1L := [DiL,DiL], i ∈ N.
2. The Lie algebra L is solvable iff an index i0 ∈ N exists with Di0L = 0. The smallest index with this property is named the length of the derived series.
3. An ideal I ⊂ L is solvable iff I is solvable when considered as Lie algebra.
By induction on i ∈ N one easily shows that each DiL, i ∈ N, is an ideal in L. Note that D1L = [L,L] is the derived algebra. Comparing the derived series with the descending central series one has DiL ⊂ CiL for all i ∈ N. Hence for Lie algebras:
Abelian ⟹ nilpotent ⟹ solvable.
Lemma 4.16 (Solvability and short exact sequences). Consider an exact sequence of Lie algebras
0→ L0→ L1π−→ L2→ 0.
If any two terms of the sequence are solvable then the third term is solvable too.
Proof. We prove that solvability of both L0 and L2 implies solvability of L1. Assume Di0L2 = 0 and Dj0L0 = 0. For all i ∈ N
DiL2 = Diπ(L1) = π(DiL1).
Hence Di0L2 = 0 implies Di0L1 ⊂ L0. Then

Di0+j0L1 = Dj0(Di0L1) ⊂ Dj0L0 = 0.
The proof of the other cases is similar to the proof of Lemma 4.11.
Example 4.10 demonstrates that an analogous statement concerning the nilpotency of the Lie algebra in the middle is not valid.
Corollary 4.17 (Solvable ideals). Consider a Lie algebra L. The sum a+b of two solvable ideals a,b ⊂ L is solvable.
Proof. We apply Lemma 4.16: The quotient a/(a∩b) of the solvable Lie algebra a is solvable too. By means of the isomorphism

(a+b)/b ≅ a/(a∩b)

the exact sequence

0 → b → a+b → (a+b)/b → 0

implies the solvability of a+b, q.e.d.
Definition 4.18 (Radical of a Lie algebra). The unique solvable ideal of L which is maximal with respect to all solvable ideals of L is denoted rad L, the radical of L.
To verify that the concept is well-defined take rad L to be a solvable ideal which is not properly contained in any other solvable ideal. For an arbitrary solvable ideal S ⊂ L also
S+ rad L
is solvable by Lemma 4.16. The inclusion
rad L⊂ S+ rad L
implies by maximality of rad L that
rad L = S+ rad L.
Hence S⊂ rad L.
The following Lemma 4.19 prepares the proof of Lie’s theorem about solvable Lie algebras.
Lemma 4.19. Consider a K-vector space V and a Lie subalgebra L ⊂ gl(V) of endomorphisms of V. Consider an ideal I ⊂ L. Assume: All endomorphisms
x : I→ I,x ∈ I,
have a common eigenvector v ∈V . Then the eigenvector equation
x(v) = λ (x) · v,x ∈ I,
defines a linear functional λ : I→K such that
λ ([x,y]) = 0
for all x ∈ I,y ∈ L.
Proof. We choose an arbitrary but fixed y ∈ L and consider all x ∈ I.
i) Common triangularization of all endomorphisms x ∈ I on a subspace W ⊂V :Let n ∈ N∗ be a maximal index such that the family of iterates
B = (v,y(v),y2(v), ...,yn−1(v))
is linearly independent. Denote by W ⊂ V the n-dimensional vector space spannedby B. By definition W is stable with respect to y, i.e. y(W )⊂W .
We claim: The family
(Wi := span⟨v, y(v), ..., y^{i−1}(v)⟩)i=0,...,n
is a flag of W invariant with respect to all endomorphisms x : W →W,x ∈ I.
We prove x(Wi) ⊂ Wi by induction on i = 0, ...,n. For the induction step i ↦ i+1 we consider x ∈ I and decompose
x(yi(v)) = (x◦ y)(yi−1(v)) = [x,y](yi−1(v))+(y◦ x)(yi−1(v)).
The induction assumption applied to y^{i−1}(v) ∈ Wi proves: For the first summand

[x,y](y^{i−1}(v)) ∈ I(Wi) ⊂ Wi ⊂ Wi+1

because y^{i−1}(v) ∈ Wi and [x,y] ∈ I, and for the second summand

(y ◦ x)(y^{i−1}(v)) = y(x(y^{i−1}(v))) ∈ y(Wi) ⊂ Wi+1.

Hence x(y^i(v)) ∈ Wi+1.
With respect to the basis B of W each restricted endomorphism x|W : W → W, x ∈ I, is represented by an upper triangular matrix

m(x|W) ∈ t(n,K).
ii) The diagonal elements of the upper triangular matrices: We claim that all diagonal elements are equal to λ(x), i.e.

m(x|W) = ( λ(x)       ∗  )
          (       ⋱      ) ∈ M(n×n,K), x ∈ I.
          ( 0       λ(x) )
The proof of the equation

x(y^i(v)) ≡ λ(x) · y^i(v) mod Wi

goes by induction on i = 0,...,n. For the induction step i−1 ↦ i consider the decomposition from above
x(yi(v)) = [x,y](yi−1(v))+(y◦ x)(yi−1(v)).
For the first summand[x,y](yi−1(v)) ∈Wi.
For the second summand
(y ◦ x)(y^{i−1}(v)) = y(x(y^{i−1}(v))) ≡ λ(x) · y^i(v) mod Wi

because by induction assumption

x(y^{i−1}(v)) ≡ λ(x) · y^{i−1}(v) mod Wi−1

and by definition y(Wi−1) ⊂ Wi.
iii) Vanishing of the trace of a commutator: We have shown: All elements x ∈ I act on the subspace W ⊂ V as endomorphisms with

trace(x|W) = n · λ(x).
For all x ∈ I also [y,x] ∈ I because I ⊂ L is an ideal. But the trace of a commutator of two endomorphisms vanishes. Hence for all x ∈ I

0 = tr([y|W, x|W]) = n · λ([y,x]),

which proves λ([y,x]) = 0, q.e.d.
The present section deals with eigenvalues of certain endomorphisms. We need the fact that a polynomial with coefficients from K splits completely into linear factors over K. Therefore we assume that the underlying field K is algebraically closed, i.e. we now consider complex Lie algebras.
Theorem 4.20 (Common eigenvector of a complex solvable Lie algebra of endomorphisms). Consider a complex vector space V ≠ {0} and a complex solvable Lie subalgebra L ⊂ gl(V). Then L has a common eigenvector: A vector v ∈ V, v ≠ 0, and a linear functional λ : L → C exist such that

x(v) = λ(x) · v for all x ∈ L.
The proof of this theorem will imitate the proof of Engel’s theorem.
Proof. Without restriction we may assume L ≠ {0}. Then the proof goes by induction on dim L. The claim trivially holds for dim L = 0. Hence we assume dim L > 0 and suppose that the claim is true for smaller dimension.
i) Construction of an ideal I ⊂ L of codimL(I) = 1: The derived series of L ends with Di0L = 0, hence it starts with D1L ⊊ D0L, i.e. L/[L,L] ≠ 0. Let

π : L → L/[L,L]

be the canonical projection of Lie algebras. Fortunately the Lie algebra L/[L,L] is Abelian. Hence any arbitrarily chosen vector subspace D ⊂ L/[L,L] of codimension 1 is even an ideal D ⊂ L/[L,L]. Then I := π−1(D) is an ideal of L with
codimL(I) := dim L−dim I = 1.
ii) Subspace of eigenvector candidates: We restrict the action of L on V to an action of I. By induction assumption applied to I we get an element 0 ≠ v ∈ V and a linear functional λ : I → C with

x(v) = λ(x) · v for all x ∈ I.

We define the non-zero subspace of V

W := {v ∈ V : x(v) = λ(x) · v for all x ∈ I}.
The subspace W ⊂ V is stable under the action of L, i.e. for v ∈ W and y ∈ L holds y(v) ∈ W: For arbitrary x ∈ I, v ∈ W, y ∈ L we have
x(y(v)) = [x,y](v) + y(x(v)) = [x,y](v) + λ(x) · y(v) = λ([x,y]) · v + λ(x) · y(v).

Here [x,y](v) = λ([x,y]) · v because [L,I] ⊂ I and v ∈ W. Lemma 4.19 implies λ([x,y]) = 0, hence x(y(v)) = λ(x) · y(v), i.e. y(v) ∈ W.
iii) Complex endomorphisms have at least one eigenvector: According to the choice of the ideal I ⊂ L in part i) an element x0 ∈ L exists with
L = I⊕C · x0.
By part ii) the subspace W is stable under the action of the restricted endomorphism x0|W. Because the field C is algebraically closed an eigenvector v0 ∈ W of x0 with eigenvalue λ′ exists:
x0(v0) = λ′ · v0.
The linear functional λ : I → C extends to a linear functional λ : L → C by defining λ(x0) := λ′. Then
x(v0) = λ(x) · v0 for all x ∈ L, q.e.d.
A corollary of Theorem 4.20 is Lie’s theorem about the simultaneous triangularization of a complex solvable Lie algebra of endomorphisms.
Theorem 4.21 (Lie’s theorem about complex solvable Lie algebras of endomorphisms). Consider an n-dimensional complex vector space V. Each solvable complex Lie algebra L ⊂ gl(V) is isomorphic to a subalgebra of the Lie algebra of upper triangular matrices
t(n,C) = { ( ∗     ∗ )
           (    ⋱    ) ∈ gl(n,C) }.
           ( 0     ∗ )
Proof. Similar to the proof of Lemma 1.10, part ii), we construct a flag (Vi)i=1,...,n of subspaces, each stable with respect to all endomorphisms x : V → V, x ∈ L. Successively we extend a basis (v1) of V1 to a basis B = (v1, ...,vn) of V. With respect to the basis B all endomorphisms x ∈ L are represented by upper triangular matrices, q.e.d.
Corollary 4.22 (Theorem of Cayley-Hamilton). Consider an endomorphism of an n-dimensional vector space V
f : V →V.
The characteristic polynomial pchar(T ) ∈ C[T ] of f annihilates f , i.e.
pchar( f ) = 0 ∈ End(V ).
Proof. If the base field is R we consider the complex linear endomorphism

f ⊗ id ∈ End(V ⊗R C).

Hence we may prove the claim for a complex vector space V. The characteristic polynomial of f splits as
pchar(T) = ∏_{j=1}^{n} (T − λj) ∈ C[T].
After choosing a suitable basis B = (v1, ...,vn) of V the endomorphism f is represented by the upper triangular matrix

( λ1     ∗ )
(    ⋱     ) ∈ gl(n,C).
( 0     λn )
For i = 1, ...,n we define the subspace Vi := span⟨v1, ...,vi⟩ and prove by induction

(∏_{j=1}^{i} (f − λj))|Vi = 0.
This holds for i = 1.

To prove the induction step i−1 ↦ i: Decompose an arbitrary element v ∈ Vi as

v = w + µ · vi, w ∈ Vi−1, µ ∈ C.

Then

(f − λi)(v) = (f − λi)(w) + µ · (f − λi)(vi).
Here the first summand belongs to Vi−1 because w ∈ Vi−1, and the second summand also belongs to Vi−1 because

f(vi) ≡ λi · vi mod Vi−1.

Therefore

(f − λi)(v) ∈ Vi−1
and by induction assumption

(∏_{j=1}^{i} (f − λj))(v) = (∏_{j=1}^{i−1} (f − λj))((f − λi)(v)) = 0.
Eventually we obtain

pchar(f)(v) = (∏_{j=1}^{n} (f − λj))(v) = 0 for all v ∈ V, q.e.d.
Corollary 4.23 (Chain of ideals). Any n-dimensional complex solvable Lie algebra L has a chain of ideals of length n

0 = I0 ⊊ I1 ⊊ ... ⊊ In = L.
Proof. Consider the subalgebra ad L ⊂ gl(L). Lemma 4.16 applied to the exact sequence

0 → Z(L) → L −ad→ ad L → 0

shows that ad L is solvable. According to Theorem 4.21 a flag (Ii), i = 0, ...,n, of subspaces of L exists such that each Ii is stable with respect to the action of all endomorphisms ad x ∈ End(L), x ∈ L. Hence each Ii ⊂ L, i = 0, ...,n, is an ideal, q.e.d.
Lemma 4.24 (Solvability and nilpotency). Consider a complex Lie algebra L. Then L is solvable iff its derived algebra DL = [L,L] is nilpotent.
Proof. The derived series of L relates to the descending central series of DL as

DiL ⊂ Ci−1(DL)

for all i ≥ 1. The proof goes by induction on i ∈ N. The inclusion holds for i = 1. Induction step i ↦ i+1:

Di+1L := [DiL,DiL] ⊂ [Ci−1(DL),Ci−1(DL)] ⊂ [DL,Ci−1(DL)] = Ci(DL).
i) Assume DL is nilpotent. The vanishing of Ci0(DL) for an index i0 ∈ N implies the vanishing of Di0+1L. Hence L is solvable.
ii) Assume L solvable and dim L = n. Then also ad L ⊂ gl(L) is solvable according to Lemma 4.16. Hence, after a suitable choice of basis,

ad L ⊂ t(n,C).

If x ∈ DL = [L,L] then ad x ∈ n(n,C). By Engel’s theorem, see 4.12, the Lie algebra DL is nilpotent, q.e.d.
Note that the Lemma is also valid for the base field R.
4.3 Heisenberg algebra in 1-dimensional quantum mechanics
A corner stone of quantum mechanics is the canonical commutation relation

[P,Q] = −iℏ
with P the momentum operator, Q the position operator and the reduced Planck constant

ℏ = h/2π = 1.054 × 10^{−27} erg · sec.
This relation has been introduced by Born and Jordan and termed strict quantum condition (German: verschärfte Quantenbedingung), see [1, Equation (38)].
Definition 4.25 (Heisenberg algebra). The Heisenberg algebra heisn from n-dimensional quantum mechanics is the real Lie algebra of dimension 2n+1
• spanned as a vector space by the set
{Pj : j = 1, ...,n}∪{Q j : j = 1, ...,n}∪{Z}
• with the Lie bracket defined by
[Pj,Qk] = δjk · Z and [Pj,Z] = [Qj,Z] = [Pj,Pk] = [Qj,Qk] = 0, 1 ≤ j,k ≤ n.
Lemma 4.26 (Heisenberg algebra).

i) The short exact sequence
0 → R −ι→ heisn −π→ Rn × Rn → 0

with

ι(1) := Z and π(Z) := 0,

π(Pj) := (ej,0), π(Qj) := (0,ej), ej ∈ Rn, 1 ≤ j ≤ n, the canonical basis vectors,
represents heisn as a central extension of the Abelian Lie algebra Rn × Rn ≅ R2n. Hence heisn is a nilpotent Lie algebra.
ii) The Heisenberg algebra heis1 is isomorphic to n(3,R), the Lie algebra of strictly upper triangular matrices, by the isomorphism of Lie algebras

φ : heis1 −∼→ n(3,R)

with

φ(P) := E12 = ( 0 1 0 )        φ(Q) := E23 = ( 0 0 0 )        φ(Z) := E13 = ( 0 0 1 )
              ( 0 0 0 ),                      ( 0 0 1 ),                      ( 0 0 0 )
              ( 0 0 0 )                       ( 0 0 0 )                       ( 0 0 0 ).
Note. More generally, heisn is isomorphic to a subalgebra of n(n+2,R). In particular, the Heisenberg algebra is nilpotent.
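The defining relations of heis1 can be verified directly in the matrix model (own illustration, not part of the lecture, assuming NumPy):

```python
import numpy as np

P = np.array([[0., 1., 0.], [0., 0., 0.], [0., 0., 0.]])   # phi(P) = E12
Q = np.array([[0., 0., 0.], [0., 0., 1.], [0., 0., 0.]])   # phi(Q) = E23
Z = np.array([[0., 0., 1.], [0., 0., 0.], [0., 0., 0.]])   # phi(Z) = E13

def bracket(a, b):
    return a @ b - b @ a

# Heisenberg relations: [P, Q] = Z and Z is central
assert np.allclose(bracket(P, Q), Z)
assert np.allclose(bracket(P, Z), 0)
assert np.allclose(bracket(Q, Z), 0)
```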
The Heisenberg algebra captures the kinematics of the quantum mechanical problem. For a complete description which also covers the dynamics of the problem one needs a further operator: the Hamiltonian H of the problem and its relation to the kinematical operators.
Remark 4.27 (1-dimensional quantum mechanics).

i) In classical mechanics a 1-dimensional system is basically described by a pair (q,p) of canonical coordinates with respect to the Hamilton function h(q,p) of the system.
For any two functions f = f(q,p,t) and g = g(q,p,t) the Poisson bracket is defined as

{f,g} := (∂f/∂q)·(∂g/∂p) − (∂f/∂p)·(∂g/∂q).
The Poisson bracket {f,g} is antisymmetric in its two arguments. It satisfies the Jacobi identity with respect to three arguments
{ f1,{ f2, f3}}+{ f2,{ f3, f1}}+{ f3,{ f1, f2}}= 0.
Hamilton’s equations of motion are

ṗ(t) = {p,h} = −∂h/∂q and q̇(t) = {q,h} = ∂h/∂p.
More generally, a function f(q,p) has the time derivative

ḟ(t) := d f(q(t),p(t))/dt = {f,h}.
As a result, the vector space E(M) of infinitely often differentiable functions on phase space M with the Poisson bracket as Lie bracket is a Lie algebra, named the Poisson algebra.
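The Jacobi identity for the Poisson bracket can be tested exactly on polynomial functions of (q,p) (own illustration, not part of the lecture; the sketch represents polynomials as dictionaries mapping exponent pairs to coefficients):

```python
# Polynomials in (q, p) as dicts {(i, j): coeff} meaning coeff * q^i * p^j
def mul(f, g):
    h = {}
    for (i, j), a in f.items():
        for (k, l), b in g.items():
            h[(i + k, j + l)] = h.get((i + k, j + l), 0) + a * b
    return h

def add(f, g, sign=1):
    h = dict(f)
    for m, b in g.items():
        h[m] = h.get(m, 0) + sign * b
    return h

def d_dq(f):
    return {(i - 1, j): i * a for (i, j), a in f.items() if i > 0}

def d_dp(f):
    return {(i, j - 1): j * a for (i, j), a in f.items() if j > 0}

def poisson(f, g):
    """Poisson bracket {f,g} = f_q * g_p - f_p * g_q."""
    return add(mul(d_dq(f), d_dp(g)), mul(d_dp(f), d_dq(g)), sign=-1)

def is_zero(f):
    return all(abs(c) < 1e-12 for c in f.values())

# Three test functions on phase space
f1 = {(2, 0): 1, (0, 1): 3}        # q^2 + 3p
f2 = {(1, 1): 2}                   # 2qp
f3 = {(0, 2): 1, (1, 0): -1}       # p^2 - q

# Jacobi identity {f1,{f2,f3}} + {f2,{f3,f1}} + {f3,{f1,f2}} = 0
jac = add(add(poisson(f1, poisson(f2, f3)),
              poisson(f2, poisson(f3, f1))),
          poisson(f3, poisson(f1, f2)))
assert is_zero(jac)
```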
ii) First quantization passes from classical mechanics to quantum mechanics by replacing the canonical variables and the Hamilton function (p,q,h) by linear operators

(P,Q,H).

Because [P,Q] = −iℏ the resulting operators do not commute in general. The Heisenberg algebra formalizes the algebraic properties of the momentum operator P and the position operator Q and their commutator Z := [P,Q]. First quantization replaces the Poisson bracket {−,h} with the Hamilton function by the commutator [−,H] with the Hamilton operator.
How can one extend the Heisenberg algebra to include also the Hamiltonian H?

One of the most important constructions to obtain a new Lie algebra from existing Lie algebras is the semidirect product.
76 4 Nilpotent Lie algebras and Solvable Lie algebras
Definition 4.28 (Semidirect product). Consider two Lie algebras I and M with Lie brackets [−,−]I and [−,−]M respectively, together with a morphism of Lie algebras
θ : M→ Der(I).
The semidirect product of I and M with respect to θ is the Lie algebra
I oθ M := (L, [−,−])
with vector space L := I⊕M and Lie bracket
[−,−] : L×L→ L
[(i1,m1),(i2,m2)] :=([i1, i2]I +θ(m1)i2−θ(m2)i1, [m1,m2]M), i1, i2 ∈ I,m1,m2 ∈M.
The semidirect product is part of an exact sequence

0 → I → I oθ M −p→ M → 0
which has a morphism

s : M → I oθ M, m ↦ (0,m),

with p ◦ s = idM. Hence one can identify I ⊂ I oθ M with an ideal and M = s(M) ⊂ I oθ M with a subalgebra.
The following Definition 4.29 shows how to obtain the Lie algebra of 1-dimensional quantum mechanics from the Hamiltonian H and the Heisenberg algebra heis1.
Definition 4.29 (Lie algebra of the 1-dimensional quantum mechanics). The Liealgebra of 1-dimensional quantum mechanics quant1 is the semidirect product ofthe Heisenberg algebra heis1 and the 1-dimensional Lie algebra RH
quant1 = heis1 oθ RH
via the Lie algebra morphism
θ : RH → Der(heis1), H ↦ δ.
The derivation

δ : heis1 → heis1

is defined as

δ(P) := Q, δ(Q) := −P, δ(Z) := 0.
Proposition 4.30 (Solvability of the Lie algebra of 1-dimensional quantum me-chanics).
4.3 Heisenberg algebra in 1-dimensional quantum mechanics 77
i) The Lie algebra quant1 of 1-dimensional quantum mechanics is solvable. Itsderived algebra is the Heisenberg algebra
D1(quant1) = heis1.
The remaining components of the derived series of L := quant1 are
D2L = span<Z>, D3L = {0}.
ii) The Lie algebra is not nilpotent, because

C1L = C2L = heis1 ≠ {0}.
iii) The commutators are
• [P,Q] = Z.• [P,H] =−Q and [Q,H] = P.• Z(L) = R ·Z.
Proof. From the definition of L results the exact sequence

0 → heis1 −j→ L −p→ R·H → 0

defined by

j(i) := (i,0), p((i,m)) := m.
The exact sequence has the section
s : R·H → L, m ↦ (0,m),
with p ◦ s = idRH. We consider heis1 an ideal of L via j and R·H a subalgebra of L via s. Then the
commutators of the kinematical operators with the Hamiltonian are
[P,H] = [(P,0),(0,H)] = (−δ(P),0) = −Q

[Q,H] = [(Q,0),(0,H)] = (−δ(Q),0) = P

[Z,H] = [(Z,0),(0,H)] = (−δ(Z),0) = 0, q.e.d.
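The derived and lower central series of quant1 can be computed mechanically from the commutators in Proposition 4.30. The following numpy sketch (the basis order (P, Q, Z, H) is an arbitrary choice, not from the lecture) confirms solvability and non-nilpotency:

```python
import numpy as np

# Structure constants of quant1 on the ordered basis (P, Q, Z, H):
# [P,Q] = Z, [P,H] = -Q, [Q,H] = P, and Z is central.
C = np.zeros((4, 4, 4))
C[0, 1, 2], C[1, 0, 2] = 1, -1
C[0, 3, 1], C[3, 0, 1] = -1, 1
C[1, 3, 0], C[3, 1, 0] = 1, -1

def bracket(x, y):
    return np.einsum('i,j,ijk->k', x, y, C)

def product_span(A, B):
    """Spanning vectors of [span A, span B]; the bracket is bilinear."""
    return [bracket(a, b) for a in A for b in B]

def dim(vecs):
    return int(np.linalg.matrix_rank(np.array(vecs)))

L = list(np.eye(4))
D1 = product_span(L, L)            # derived algebra D^1 L = [L, L]
D2 = product_span(D1, D1)
D3 = product_span(D2, D2)
assert (dim(D1), dim(D2), dim(D3)) == (3, 1, 0)   # heis1, span<Z>, {0}: solvable

C2 = product_span(L, D1)           # lower central series: C^2 L = [L, C^1 L]
assert dim(C2) == 3                # again heis1: the series is constant, L not nilpotent
```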
Chapter 5
Killing form and Semisimple Lie algebras
All vector spaces are assumed finite-dimensional if not stated otherwise. The composition f2 ◦ f1 of two endomorphisms will be denoted as a product f2 · f1 or also f2 f1.
5.1 Trace of endomorphisms
Lemma 5.1 (Simple properties of the trace). Consider a vector space V and endomorphisms x, y, z ∈ End(V).

1. For nilpotent x holds tr x = 0.

2. With respect to two endomorphisms the trace is symmetric, i.e.

tr(xy) = tr(yx).

In particular, tr[x,y] = 0.

3. With respect to cyclic permutation the trace is invariant, i.e.

tr(xyz) = tr(yzx).

4. With respect to the commutator the trace is “associative”:

tr([x,y]z) = tr(x[y,z]).
Proof. 1) The endomorphism x can be represented by a strict upper triangular matrix, e.g. due to Corollary 4.6.
2) Consider matrix representations and note

tr(xy) = ∑_{i,j} xij yji = ∑_{i,j} yji xij = tr(yx).
3) According to part 2) we have
tr((xy)z) = tr(z(xy)) = tr((zx)y) = tr(y(zx)).
4) We have[x,y]z = xyz− yxz and x[y,z] = xyz− xzy.
The ordering yxz results from the ordering xzy by cyclic permutation. Hence part 3) implies

tr(yxz) = tr(xzy),

which proves the claim.
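The four trace identities are easy to confirm numerically. A small numpy check on random matrices (an illustration, not part of the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
x, y, z = (rng.standard_normal((4, 4)) for _ in range(3))

def comm(a, b):
    return a @ b - b @ a

tr = np.trace
assert np.isclose(tr(x @ y), tr(y @ x))                    # 2. symmetry
assert np.isclose(tr(comm(x, y)), 0.0)                     #    hence tr[x,y] = 0
assert np.isclose(tr(x @ y @ z), tr(y @ z @ x))            # 3. cyclic invariance
assert np.isclose(tr(comm(x, y) @ z), tr(x @ comm(y, z)))  # 4. "associativity"
```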
Definition 5.2 (Killing form). Let L be a Lie algebra. Consider a representation ρ : L → gl(V) on a K-vector space V. The trace form
of ρ is the symmetric bilinear map
β : L×L→K,β (x,y) := tr (ρ(x)ρ(y)).
The Killing form of L
κ : L×L→K,κ(x,y) := tr (ad(x)ad(y))
is the trace form of the adjoint representation ad : L → gl(L).
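As a concrete example, the Killing form of sl(2,C) in the basis (e, h, f) with [h,e] = 2e, [h,f] = −2f, [e,f] = h can be computed from the ad matrices. The following numpy sketch reproduces the well-known values κ(e,f) = 4 and κ(h,h) = 8 and checks nondegeneracy (the coordinate bookkeeping is an assumption of this sketch, not from the lecture):

```python
import numpy as np

# sl(2) basis e, h, f with [h,e] = 2e, [h,f] = -2f, [e,f] = h
e = np.array([[0., 1.], [0., 0.]])
f = np.array([[0., 0.], [1., 0.]])
h = np.array([[1., 0.], [0., -1.]])
basis = [e, h, f]

def comm(a, b):
    return a @ b - b @ a

def coords(m):
    """Coordinates of a traceless 2x2 matrix in the basis (e, h, f)."""
    return np.array([m[0, 1], m[0, 0], m[1, 0]])

def ad(x):
    """Matrix of ad x in the basis (e, h, f)."""
    return np.column_stack([coords(comm(x, b)) for b in basis])

kappa = np.array([[np.trace(ad(a) @ ad(b)) for b in basis] for a in basis])
assert np.isclose(kappa[0, 2], 4.0)        # kappa(e, f) = 4
assert np.isclose(kappa[1, 1], 8.0)        # kappa(h, h) = 8
assert abs(np.linalg.det(kappa)) > 1e-9    # the Killing form is nondegenerate
```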
The following Lemma 5.3 prepares the proof of Theorem 5.4.
Lemma 5.3 (Trace criterion for nilpotency). Consider an n-dimensional complex vector space V, two subspaces A ⊂ B ⊂ End(V) and the subspace of endomorphisms

M := {x ∈ End(V) : [x,B] ⊂ A}.

If x ∈ M and

tr(xy) = 0

for all y ∈ M, then x ∈ End(V) is nilpotent.
Proof. i) Jordan decomposition for endomorphisms of V: Consider an arbitrary but fixed endomorphism

x : V → V

which satisfies the assumption [x,B] ⊂ A. According to Theorem 2.6 a unique Jordan decomposition exists

x = xs + xn.
For a suitable basis B of V the semisimple part xs is represented by a diagonal matrix
xs = diag(λ1, ...,λn)
and the nilpotent part xn by a strict upper triangular matrix. Denote by
E := spanQ < λ1, ...,λn >⊂ C
the Q-vector subspace generated by the eigenvalues of xs. In order to show xs = 0 we will show E = {0}, or equivalently E∗ = {0} for the dual space E∗ of Q-linear functionals

f : E → Q.
ii) Jordan decomposition for endomorphisms of End(V): According to Proposition 4.3
ad x = ad xs +ad xn
is the Jordan decomposition of ad x. A polynomial
ps(T) ∈ C[T], ps(0) = 0,

exists with

ad xs = ps(ad x).
Consider a Q-linear functional f : E→Q. The endomorphism
y : V →V
defined with respect to B by the matrix
y = diag(f(λ1), ..., f(λn))

is semisimple by construction.

According to Proposition 4.3, both ad xs and ad y act on End(V) as diagonal matrices with respect to the standard basis (Eij)1≤i,j≤n of End(V) corresponding to B:
(ad xs)(Ei j) = (λi−λ j) ·Ei j and (ad y)(Ei j) = ( f (λi)− f (λ j)) ·Ei j.
Choose an interpolation polynomial q(T ) ∈ C[T ] with supporting points
(λi−λ j, f (λi)− f (λ j))1≤i, j≤n.
By construction q(0) = 0 and
ad y = q(ad xs).
Therefore

ad y = (q ◦ ps)(ad x)
and also the polynomial (q◦ ps)(T ) ∈ C[T ] satisfies (q◦ ps)(0) = 0.
From
[x,B] = (ad x)(B)⊂ A
follows

[y,B] = (ad y)(B) ⊂ A,
i.e. y ∈M.
iii) Trace computation: The assumption
0 = tr(xy) = tr(xs · y) + tr(xn · y) = tr(xs · y) = ∑_{i=1}^{n} λi · f(λi) ∈ E ⊂ C

implies

0 = f(∑_{i=1}^{n} λi · f(λi)) = ∑_{i=1}^{n} f(λi)² ∈ Q.
Therefore all f(λi), i = 1, ..., n, vanish and f = 0 because the family (λi)i=1,...,n spans E, q.e.d.
Theorem 5.4 (Cartan’s trace criterion for solvability). Consider a complex vec-tor space V and a complex subalgebra L ⊂ gl(V ). The following properties areequivalent:
• L is solvable• The trace form
β : L×L→ C,β (x,y) := tr (xy),
satisfies β ([L,L],L) = 0.
Proof. 1) Assume β([L,L],L) = 0.

i) Reduction to the case of nilpotent endomorphisms: According to Lemma 4.24 it suffices to prove the nilpotency of the derived algebra DL = [L,L]. According to Engel's theorem, see Theorem 4.12, for this purpose it suffices to prove: Every element x ∈ [L,L] is ad-nilpotent. We prove that all endomorphisms

x : V → V, x ∈ [L,L],

are nilpotent and apply Lemma 4.2.
ii) Trace criterion for nilpotency: Consider a fixed but arbitrary element x ∈ [L,L] ⊂ End(V). To show nilpotency of x we apply the trace criterion from Lemma 5.3 with A := [L,L], B := L and
M := {x ∈ End(V ) : [x,L]⊂ [L,L]}.
Apparently x ∈ [L,L]⊂M. It remains to show for all endomorphisms y ∈M:
tr (xy) = 0.
Because [L,L] has generators [u,v], u, v ∈ L, we may assume x = [u,v]. According to Lemma 5.1, part 4)

tr(xy) = tr([u,v]y) = tr(u[v,y]) = tr([v,y]u).

But [v,y] ∈ [L,L] because y ∈ M. By assumption β([L,L],L) = 0, which proves tr(xy) = 0.
2) Assume L solvable. According to Lie's theorem, see Theorem 4.21, the Lie algebra L is isomorphic to a subalgebra of upper triangular matrices. Denote by (Vi)i=1,...,n a corresponding flag of L-invariant subspaces of V.

For x ∈ L and [u,v] ∈ [L,L]

[u,v](Vi) ⊂ Vi−1

and

x[u,v](Vi) ⊂ x(Vi−1) ⊂ Vi−1, i = 1, ..., n.
Therefore x · [u,v] is nilpotent and
0 = tr (x[u,v]) = tr ([u,v]x).
Theorem 5.4 is valid also in the real case.
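The solvable direction of the criterion can be illustrated on t(2), the solvable algebra of upper triangular 2×2 matrices, where every bracket [u,v] is a multiple of E12 and hence β([L,L],L) = 0. A small numpy check (an illustration under these assumptions, not part of the lecture):

```python
import numpy as np

# t(2): upper triangular 2x2 matrices, a solvable subalgebra of gl(2)
E11 = np.array([[1., 0.], [0., 0.]])
E12 = np.array([[0., 1.], [0., 0.]])
E22 = np.array([[0., 0.], [0., 1.]])
basis = [E11, E12, E22]

def comm(a, b):
    return a @ b - b @ a

# every bracket [u, v] is a multiple of E12, and tr(E12 z) = 0
# for upper triangular z, so beta([L,L], L) = 0 on all generators
for u in basis:
    for v in basis:
        for z in basis:
            assert np.isclose(np.trace(comm(u, v) @ z), 0.0)
```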
5.2 Fundamentals of semisimple Lie algebras
Definition 5.5 (Simple and semisimple Lie algebras). Consider a Lie algebra L.
1. L is semisimple iff L has no Abelian ideal I 6= 0.
2. L is simple iff L is not Abelian and L has no ideal other than 0 and L.
These concepts apply also to an ideal I ⊂ L when the ideal is considered as a Lie algebra.
Note: The trivial Lie algebra L = {0} is semisimple. A semisimple Lie algebra L ≠ {0} is not Abelian.
If L has a solvable ideal I ≠ {0} then L has an Abelian ideal ≠ {0} too: Let i ∈ N be the largest index from the derived series of I with DiI ≠ {0}. Then DiI ⊂ L is an Abelian ideal because Di+1I = [DiI, DiI] = {0}. The reverse implication is obvious: Each Abelian ideal is solvable. As a consequence:
L semisimple ⇐⇒ rad(L) = 0.
The equivalence above shows a certain complementarity between semisimplicity and solvability. It makes it possible to prove semisimplicity by using results about solvable Lie algebras. An application of this heuristic is Cartan's criterion for semisimplicity, see Theorem 5.11.
A simple application of the above characterization is the following Proposition 5.6.
Proposition 5.6 (Adjoint representation of a semisimple Lie algebra).
The adjoint representation of a semisimple Lie algebra is faithful. In particular,
L' ad(L)⊂ Der(L)
and L is isomorphic to a subalgebra of matrices.
Proof. According to Proposition 3.18 the adjoint representation maps to the Lie algebra of derivations. The kernel is the center
ker[ad : L→ Der(L)] = Z(L)
which is an Abelian ideal of L. Therefore Z(L) = 0, and the adjoint representationis injective, q.e.d.
Any Lie algebra becomes semisimple after dividing out its radical.
Proposition 5.7. For any Lie algebra L the quotient L/rad(L) is semisimple.
Proof. An ideal

I ⊂ L/rad(L)

has the form

I = J/rad(L)

with an ideal J ⊂ L. If I is Abelian then [I,I] = 0, hence [J,J] ⊂ rad(L), and notably the ideal [J,J] ⊂ L is solvable. The latter property implies J solvable because Di+1J = Di[J,J]. As a consequence J ⊂ rad(L) and I = 0, q.e.d.
Definition 5.8 (Orthogonal space). Consider a Lie algebra L and a symmetric bi-linear map
β : L×L→K.
i) The orthogonal space of a vector subspace M ⊂ L with respect to β is
M⊥ := {x ∈ L : β (x,M) = 0}.
The nullspace of β is L⊥.
ii) The form β is nondegenerate if it has trivial nullspace, i.e. L⊥ = {0}.
Lemma 5.9 (Orthogonal complement of ideals). Consider a complex vector space V, a Lie algebra L ⊂ gl(V) and the symmetric bilinear form
β : L×L→ C,β (x,y) := tr(xy).
For an ideal I ⊂ L the orthogonal space I⊥ ⊂ L of β is an ideal too.
Proof. Consider x ∈ I⊥. For arbitrary y ∈ L and u ∈ I
β ([x,y],u) = β (x, [y,u])
according to Lemma 5.1. Because [y,u] ∈ I and x ∈ I⊥
β (x, [y,u]) = 0.
Hence [x,y] ∈ I⊥, q.e.d.
The main step in characterizing semisimplicity of a Lie algebra by its Killingform is the following Proposition.
Proposition 5.10 (Nondegeneracy of the trace form for semisimple Lie algebras). Consider a complex vector space V and a semisimple Lie algebra L ⊂ gl(V). Then the trace form
β : L×L→ C,β (x,y) := tr(xy),
is nondegenerate.
Proof. The nullspace
S := L⊥ = {x ∈ L : tr(xy) = 0 for all y ∈ L}
is an ideal according to Lemma 5.9, notably [S,S] ⊂ S. We apply Cartan's trace criterion, see Theorem 5.4, to the subalgebra S ⊂ gl(V). For x, y, z ∈ S

tr([x,y]z) = tr(x[y,z]) = 0

because x ∈ S. Therefore the ideal S ⊂ L is solvable. Eventually the semisimplicity of L implies S = 0, q.e.d.
The principal theorem of the present section is the following Cartan criterion forsemisimplicity. It derives from Cartan’s trace criterion for solvability.
Theorem 5.11 (Cartan criterion for semisimplicity). For a complex Lie algebra Lthe following properties are equivalent:
• L is semisimple• The Killing form κ of L is nondegenerate.
Proof. i) Assume L semisimple. According to Proposition 5.6 the adjoint representation identifies L with a subalgebra of gl(L). Therefore κ is nondegenerate according to Proposition 5.10.
ii) Assume κ nondegenerate, i.e. having nullspace S = 0. Consider an Abelian ideal I ⊂ L. We claim I ⊂ S: For arbitrary fixed x ∈ I and arbitrary y ∈ L the composition

ad(x) ◦ ad(y) : L → I ⊂ L

is a nilpotent endomorphism of L, because

L −ad y→ L −ad x→ I −ad y→ I −ad x→ [I, I] = {0}.
Hence κ(x,y) = tr(ad(x) · ad(y)) = 0. As a consequence x ∈ S. Hence I ⊂ S, and S = 0 implies I = 0. Therefore the null-ideal is the only Abelian ideal of L, which forces L to be semisimple, q.e.d.
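Conversely, for a Lie algebra which is not semisimple the Killing form degenerates, consistent with the criterion. A minimal example is the 2-dimensional solvable algebra with [x,y] = y; the following numpy sketch (the basis and ad matrices are spelled out as an assumption of this illustration) computes its Gram matrix:

```python
import numpy as np

# basis (x, y) with [x, y] = y; ad matrices, columns = coordinates of the brackets
ad_x = np.array([[0., 0.], [0., 1.]])    # [x,x] = 0, [x,y] = y
ad_y = np.array([[0., 0.], [-1., 0.]])   # [y,x] = -y, [y,y] = 0
ads = [ad_x, ad_y]

kappa = np.array([[np.trace(a @ b) for b in ads] for a in ads])
# kappa = [[1, 0], [0, 0]]: y lies in the nullspace, so kappa is degenerate
assert np.allclose(kappa, [[1., 0.], [0., 0.]])
assert np.isclose(np.linalg.det(kappa), 0.0)
```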
Lemma 5.12 (Killing form of an ideal). Consider a Lie algebra L and an ideal I ⊂ L. Then the Killing form of I is the restriction of the Killing form of L to I.

Proof. Extend a base of the vector subspace I ⊂ L to a base of L. For x ∈ I the corresponding matrix representations of

ad x : L → I and ad(x)|I : I → I

show

tr(ad x) = tr(ad(x)|I).
As a consequence, for x,y ∈ I
κ(x,y) = tr(ad(x)ad(y)) = tr((ad(x)ad(y))|I) = κI(x,y).
The first step on the way to representing a semisimple Lie algebra as a finite direct sum of simple Lie algebras is the following Proposition 5.13.
Proposition 5.13 (Direct sum decomposition of a semisimple Lie algebra). Consider a complex semisimple Lie algebra L and an ideal I ⊂ L.
1. Then L decomposes as a direct sum of ideals
L = I⊕ I⊥.
The notation means:
• The vector space of L decomposes as the direct sum of the vector subspaces of I and I⊥, and
• [I, I⊥] = 0.
2. The ideal I is semisimple.
Proof. ad 1) First we consider the underlying vector spaces. The Killing form κ of L induces a map of vector spaces

F : L → L∗, x ↦ κ(x, −).

The map is injective because the Killing form is nondegenerate according to Cartan's criterion for semisimplicity, Theorem 5.11. It is also surjective because dim L = dim L∗. With respect to this isomorphism

F(I⊥) = {ϕ ∈ L∗ : ϕ(I) = 0} = (L/I)∗.

We obtain

dim I⊥ = dim (L/I)∗ = dim (L/I) = dim L − dim I,

i.e. dim L = dim I + dim I⊥.
Secondly, we show I ∩ I⊥ = {0}: We apply Cartan's trace criterion for solvability, Theorem 5.4, to the ideal

J := I ∩ I⊥

considered as a Lie algebra. Denote by κ the Killing form of L and by κJ the Killing form of J. According to Lemma 5.12 the Killing form κJ is the restriction of κ.
Consider x,y,z ∈ J. Then x ∈ J ⊂ I⊥ and [y,z] ∈ J ⊂ I. Therefore
κJ([x,y],z) = κJ(x, [y,z]) = κ(x, [y,z]) = 0
by definition of I⊥. The Cartan criterion implies that the ideal J ⊂ L is solvable. Semisimplicity of L implies J = {0}.
As a consequence, the dimension formula of vector spaces
dim L = dim I +dim I⊥ = dim(I + I⊥)+dim (I∩ I⊥)
implies the direct sum decomposition of vector spaces
L = I⊕ I⊥.
Thirdly, to compute the Lie bracket we use that both I and I⊥ are ideals of L, thelatter due to Lemma 5.9. We have
[I, I⊥]⊂ (I∩ I⊥) = {0}.
ad 2) In order to prove semisimplicity of I by applying the Cartan criterion to I, see Theorem 5.11, we have to show: If x ∈ I with κI(x, I) = 0 then x = 0.

We first derive κ(x,L) = 0: Due to Lemma 5.12 the Killing form of I is the restriction of the Killing form of L to I:

κI = κ|I×I : I×I → K.
Due to part 1)
κ(x,L) = κ(x, I)+κ(x, I⊥) = κ(x, I) = κI(x, I)
because by definition of I⊥
κ(x, I⊥) = 0.
Therefore κI(x, I) = 0 implies κ(x,L) = 0. The Cartan criterion applied to the semisimple Lie algebra L shows x = 0, q.e.d.
Theorem 5.14 (Decomposition of a semisimple Lie algebra into simple sum-mands). For a Lie algebra L the following properties are equivalent:
• L is semisimple• L decomposes as a finite direct sum
L = ⊕_{α∈A} Iα, card A < ∞,
of simple ideals Iα ⊂ L.
If these conditions are satisfied then:
1. Each simple ideal I ⊂ L is one of the distinguished ideals Iα ,α ∈ A. In particular,the representation of L as direct sum of simple ideals is uniquely determined upto the order of the summands.
2. Each ideal I ⊂ L is a finite direct sum of simple ideals. In particular, for each ideal I ⊂ L an ideal J ⊂ L exists with L = I ⊕ J.
3. L = [L,L], i.e. L equals its derived algebra.
Proof. i) Assume L semisimple. In case L = {0} take the empty sum with A = ∅.
Assume L ≠ {0}. If L has no proper ideals then L itself is simple. Otherwise we choose a proper ideal

{0} ⊊ I1 ⊊ L
of minimal dimension. Proposition 5.13 implies a direct sum representation
L = I1 ⊕ I1⊥.
The two ideals I1 and I⊥1 have the following properties:
1. Any ideal of I1 or of I1⊥ is also an ideal of L because of the direct sum representation.
2. The ideal I1 is simple, because any proper ideal of I1 would be a proper ideal of L of dimension less than dim I1.
3. Both ideals are semisimple due to Proposition 5.13.
Starting with I1⊥ the decomposition can be iterated until the last summand in the direct sum representation

L = I1 ⊕ I2 ⊕ ... ⊕ In

does not contain any proper ideal. The decomposition stops after finitely many steps because each step decreases the dimension of the ideals in question.
For the rest of the proof presuppose the direct sum representation of L.
ii) Consider an arbitrary ideal I ⊂ L: The intersections
I∩ Iα ,α ∈ A,
are ideals of the simple Lie algebras Iα, α ∈ A. Hence for each α ∈ A either

I ∩ Iα = Iα

or

I ∩ Iα = {0}.

As a consequence a subset A′ ⊂ A exists such that I is the direct sum

I = ⊕_{α∈A′} Iα.
If I is a simple ideal then

I ∩ Iα0 = Iα0
for exactly one index α0 ∈ A.
iii) The orthogonal decomposition of L implies the decomposition

[L,L] = [⊕_{α∈A} Iα, ⊕_{β∈A} Iβ] = ⊕_{α,β∈A} [Iα, Iβ] = ⊕_{α∈A} [Iα, Iα].
For each α ∈ A either [Iα , Iα ] = {0} or [Iα , Iα ] = Iα because the simple algebra Iα
has no proper ideals. The case [Iα, Iα] = {0} is excluded because Iα is not Abelian. As a consequence
[L,L] = ⊕_{α∈A} Iα = L.
iv) In order to derive the semisimplicity of L consider an Abelian ideal I ⊂ L. According to part ii) it has the form

I = ⊕_{α∈A′} Iα

with a subset A′ ⊂ A. An argument similar to part iii) proves

[I, I] = I.
Because I is Abelian we have [I,I] = {0}, which implies I = {0}. Therefore L is semisimple because no Abelian ideals distinct from {0} exist, q.e.d.
5.3 Weyl’s theorem on complete reducibility
Analogously to the decomposition of a semisimple Lie algebra into a direct sum of simple Lie algebras, we decompose arbitrary finite-dimensional representations of a semisimple Lie algebra L as a direct sum of irreducible representations of L. The main result is Weyl's Theorem 5.23.
Definition 5.15 (Basic definitions). Consider a K-Lie algebra L and a representation
ρ : L→ gl(V )
with a vector space V. According to Definition 3.14 the latter is named an L-module with respect to ρ. As a shorthand one often writes the module operation as

L × V → V, (x,v) ↦ xv := x.v := ρ(x)(v).
1. A submodule W of V is a subspace W ⊂V stable under the action of L, i.e.
ρ(L)(W )⊂W.
2. An L-module V is reducible iff a proper submodule {0} ⊊ W ⊊ V exists. Otherwise V is irreducible.
3. A submodule W of V has a complement iff a submodule W ′ ⊂V exists with
V =W ⊕W ′.
4. An L-module V is completely reducible iff a decomposition exists

V = ⊕_{j=1}^{k} Wj

with irreducible L-modules Wj, j = 1, ..., k.
Applying the standard constructions from linear algebra to L-modules generates a series of new L-modules based on existing L-modules.
Definition 5.16 (Derived representations). Consider a Lie algebra L. Two representations of L
ρ : L→ gl(V ) and σ : L→ gl(W )
with corresponding L-modules V and W induce further representations of L in a canonical manner:
1. Dual representation ρ∗ : L→ gl(V ∗) with - note the minus sign -
(ρ∗(x)λ )(v) :=−λ (ρ(x)v), (x ∈ L,λ ∈V ∗,v ∈V ).
Corresponding module: Dual module V ∗ with (x.λ )(v) =−λ (x.v).
2. Tensor product ρ⊗σ : L→ gl(V ⊗W ) with
(ρ⊗σ)(x)(v⊗w) := ρ(x)v⊗w+ v⊗σ(x)w, (x ∈ L,v ∈V,w ∈W ).
Corresponding module: Tensor product V ⊗W with x.(u⊗v) = x.u⊗v+u⊗x.v.
3. Exterior product ρ ∧ ρ : L → gl(∧2V) with

(ρ ∧ ρ)(x)(v1 ∧ v2) := ρ(x)v1 ∧ v2 + v1 ∧ ρ(x)v2, (x ∈ L, v1, v2 ∈ V).

Corresponding module: Exterior product ∧2V with x.(u ∧ v) = x.u ∧ v + u ∧ x.v.
4. Symmetric product Sym2(ρ) : L → gl(Sym2V) with

(Sym2(ρ)(x))(v1 · v2) := (ρ(x)v1) · v2 + v1 · (ρ(x)v2), (x ∈ L, v1, v2 ∈ V).
Corresponding module: Symmetric product Sym2V with x.(u · v) = (x.u) · v+u · (x.v).
5. Hom-representation HomK(ρ,σ) := τ : L→ gl(HomK(V,W )) with
(τ(x) f )(v) := σ(x) f (v)− f (ρ(x)v), (x ∈ L, f ∈ HomK(V,W ),v ∈V ).
Corresponding module: Module of homomorphisms HomK(V,W ) with
(x. f )(v) = x. f (v)− f (x.v).
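That these constructions indeed preserve the Lie bracket can be spot-checked numerically. The following numpy sketch verifies bracket compatibility for the dual and the tensor product construction, with random traceless 2×2 matrices standing in for ρ(x) and ρ(y) (an illustration, not part of the lecture):

```python
import numpy as np

rng = np.random.default_rng(1)

def comm(a, b):
    return a @ b - b @ a

def dual(m):
    """Matrix of the dual representation on coordinates of V*: rho*(x) = -rho(x)^T."""
    return -m.T

def tensor_rep(a, b):
    """Matrix of (rho ⊗ sigma)(x) = rho(x) ⊗ 1 + 1 ⊗ sigma(x)."""
    return np.kron(a, np.eye(2)) + np.kron(np.eye(2), b)

# random traceless 2x2 matrices standing in for rho(x), rho(y)
x, y = (rng.standard_normal((2, 2)) for _ in range(2))
x -= np.trace(x) / 2 * np.eye(2)
y -= np.trace(y) / 2 * np.eye(2)

# dual representation preserves the bracket: [rho*(x), rho*(y)] = rho*([x,y])
assert np.allclose(comm(dual(x), dual(y)), dual(comm(x, y)))
# tensor product representation preserves the bracket as well
assert np.allclose(comm(tensor_rep(x, x), tensor_rep(y, y)),
                   tensor_rep(comm(x, y), comm(x, y)))
```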
The notations emphasize the tight relationship to the notations used in Commutative Algebra for modules over rings.
Remark 5.17 (Induced representations).
1. It remains to check that the constructions in Definition 5.16 actually yield L-modules. As an example we consider the case of the dual module V∗ and verify that the induced map
(ρ∗(x)λ )(v) :=−λ (ρ(x)v)
preserves the Lie bracket. Claim: For all x,y ∈ L, f ∈V ∗ and v ∈V
([x,y]. f )(v) = (x.y. f − y.x. f )(v)
We compute
([x,y]. f )(v) =− f ([x,y].v) =− f (x.y.v)+ f (y.x.v) = (x. f )(y.v)− (y. f )(x.v) =
=−(y.x. f )(v)+(x.y. f )(v) = (x.y. f − y.x. f )(v).
2. One can show that the canonical isomorphism of vector spaces
V∗ ⊗ W ≅ HomK(V,W), f ⊗ w ↦ f(−) · w
is an isomorphism of L-modules.
3. For a map f ∈ HomK(V,W ): L. f = 0 ⇐⇒ f : V →W is L-linear.
Theorem 5.18 (Lemma of Schur). Consider a complex Lie algebra L and an irreducible representation ρ : L → gl(V) on a complex vector space V.
Any endomorphism f ∈ End(V) commuting with all elements of the representation is a multiple of the identity, i.e. if for all x ∈ L
[ f ,ρ(x)] = 0
then a complex number µ ∈ C exists with
f = µ · id.
Proof. The endomorphism f has a complex eigenvalue µ. Its eigenspace W ⊂ V is also an L-submodule: If w ∈ W and f(w) = µ · w then for all x ∈ L
( f ◦ρ(x))(w) = (ρ(x)◦ f )(w) = ρ(x)(µ ·w) = µ ·ρ(x)w.
Now W 6= {0} and V irreducible imply W =V .
Definition 5.19 (Casimir element). Consider a complex semisimple Lie algebra L and a faithful representation ρ : L → gl(V) on a complex vector space V. The trace form of ρ
β : L×L→ C,β (x,y) := tr(ρ(x)ρ(y)),
is nondegenerate according to Proposition 5.10. For a base (xi)i=1,...,n of L denote by (yj)j=1,...,n the dual base with respect to β, i.e. β(xi, yj) = δij. The Casimir element of ρ is defined as the endomorphism
cρ := ∑_{i=1}^{n} ρ(xi)ρ(yi) ∈ End(V).
Note: The Casimir element does not depend on the choice of the basis (xi)i=1,...,n.
Remark 5.20 (Reduction to faithful representations). If the representation ρ is not faithful one considers the direct decomposition

L = ker ρ ⊕ L′

with L′ := (ker ρ)⊥ ⊂ L. The Lie algebra L′ is semisimple according to Proposition 5.13. The restricted representation ρ|L′ : L′ → gl(V) is faithful.
Proposition 5.21 (Properties of the Casimir element). The Casimir element cρ ∈ End(V) of a faithful representation ρ of a complex semisimple Lie algebra L has the following properties:
• Commutation: The Casimir element commutes with all elements of the representation:
[cρ ,ρ(L)] = 0
• Trace: tr(cρ) = dim L
• Scalar: For an irreducible representation ρ holds

cρ = (dim L / dim V) · idV.
Proof. Set n = dim L and

cρ = ∑_{i=1}^{n} ρ(xi)ρ(yi) ∈ End(V)

with bases (xi)i=1,...,n and (yj)j=1,...,n of L, which are dual with respect to the trace form β of ρ.
i) Commutation: For x ∈ L we show [ρ(x), cρ] = 0: Define the coefficients (aij) and (bjk) according to

[x, xi] = ∑_{j=1}^{n} aij · xj, [x, yj] = ∑_{k=1}^{n} bjk · yk.

Because the families (xj)1≤j≤n and (yk)1≤k≤n are dual bases with respect to β

aij = β(∑_{k=1}^{n} aik · xk, yj) = β([x, xi], yj) = −β([xi, x], yj) = −β(xi, [x, yj]) = −β(xi, ∑_{k=1}^{n} bjk · yk) = −bji.
Here we made use of the associativity of the trace form according to Lemma 5.1. To compute

[ρ(x), cρ] = ∑_{i=1}^{n} [ρ(x), ρ(xi)ρ(yi)]

we use the formula [A,BC] = [A,B]C + B[A,C] for endomorphisms A, B, C ∈ End(V). For each summand

[ρ(x), ρ(xi)ρ(yi)] = [ρ(x), ρ(xi)]ρ(yi) + ρ(xi)[ρ(x), ρ(yi)]

and therefore

[ρ(x), cρ] = ∑_{i=1}^{n} ([ρ(x), ρ(xi)]ρ(yi) + ρ(xi)[ρ(x), ρ(yi)])

= ∑_{i=1}^{n} (ρ([x, xi])ρ(yi) + ρ(xi)ρ([x, yi]))

= ∑_{i,j=1}^{n} aij · ρ(xj)ρ(yi) + ∑_{i,k=1}^{n} bik · ρ(xi)ρ(yk)

= ∑_{i,j=1}^{n} aij · ρ(xj)ρ(yi) + ∑_{i,j=1}^{n} bji · ρ(xj)ρ(yi)

= ∑_{i,j=1}^{n} (aij + bji) · ρ(xj)ρ(yi) = 0.
Here we have changed in the second sum the summation indices (i,k) ↦ (j,i).
ii) Trace: We have

tr(cρ) = ∑_{i=1}^{n} tr(ρ(xi)ρ(yi)) = ∑_{i=1}^{n} β(xi, yi) = n = dim L.
iii) Scalar: For an irreducible representation ρ we get with part i) and the Lemma of Schur, Theorem 5.18, cρ = µ · idV, and with part ii)

tr(cρ) = µ · dim V = dim L, q.e.d.
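All three properties can be verified for the adjoint representation of sl(2,C), where dim L = dim V = 3, so the Casimir element should equal the identity. A numpy sketch using the ad matrices in the basis (e, h, f) (the explicit matrices and the dual-basis computation are assumptions of this illustration):

```python
import numpy as np

# ad matrices of sl(2) in the basis (e, h, f): [h,e]=2e, [h,f]=-2f, [e,f]=h
ad_e = np.array([[0., -2., 0.], [0., 0., 1.], [0., 0., 0.]])
ad_h = np.array([[2., 0., 0.], [0., 0., 0.], [0., 0., -2.]])
ad_f = np.array([[0., 0., 0.], [-1., 0., 0.], [0., 2., 0.]])
ads = [ad_e, ad_h, ad_f]

# trace form of the adjoint representation = Killing form
kappa = np.array([[np.trace(a @ b) for b in ads] for a in ads])

# dual basis: y_i = sum_j (kappa^{-1})_{ij} x_j satisfies beta(x_i, y_j) = delta_ij
inv = np.linalg.inv(kappa)
duals = [sum(inv[i, j] * ads[j] for j in range(3)) for i in range(3)]

c = sum(a @ d for a, d in zip(ads, duals))   # Casimir element of ad
assert np.isclose(np.trace(c), 3.0)          # tr c = dim L = 3
assert np.allclose(c, np.eye(3))             # c = (dim L / dim V) id = id
for a in ads:
    assert np.allclose(c @ a, a @ c)         # c commutes with ad(L)
```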
Lemma 5.22 (Representations of semisimple Lie algebras are traceless). Consider a complex semisimple Lie algebra L and a representation ρ : L → gl(V).
• Then ρ(L)⊂ sl(V ), i.e. tr(ρ(x)) = 0 for all x ∈ L.
• In particular, 1-dimensional representations of L are trivial, i.e. if dim V = 1then ρ = 0.
Proof. Due to semisimplicity we have L = [L,L], see Theorem 5.14. If x = [u,v] ∈ L then

tr(ρ(x)) = tr(ρ([u,v])) = tr([ρ(u), ρ(v)]) = 0

according to Lemma 5.1; for general x ∈ [L,L] the claim follows by linearity of the trace. For 1-dimensional V, i.e. V = C, for all x ∈ L

0 = tr(ρ(x)) = ρ(x), q.e.d.
Theorem 5.23 (Weyl’s theorem on complete reducibility). Every finite-dimensional L-module of a complex semisimple Lie algebra L is completely reducible.
Proof. We have to show: Any submodule of an L-module has a complement.

Case 1 (Codimension 1): Consider pairs (V, W) with an L-module V and a submodule W ⊂ V with codimV W = 1, i.e. exact sequences of L-modules

0 → W → V → C → 0.

Note that the 1-dimensional quotient V/W is a trivial L-module by Lemma 5.22. We show by induction on n := dim W that W has a complement. We have to distinguish two subcases:
Subcase 1a), W reducible: Consider a proper submodule {0} ⊊ W′ ⊊ W, inducing the exact sequence of L-modules

0 → W/W′ → V/W′ → C → 0.
Because dim(W/W′) < dim W the pair (V/W′, W/W′) satisfies the induction assumption. We obtain a submodule W̃ ⊂ V with

V/W′ = W/W′ ⊕ W̃/W′.

We have the exact sequence

0 → W′ → W̃ → W̃/W′ ≅ C → 0.

Because dim W′ < dim W the pair (W̃, W′) satisfies the induction assumption. We obtain a submodule

X ⊂ W̃ with W̃ = W′ ⊕ X.

We claim

V = W ⊕ X.

For the proof, on one hand,

dim V = dim W + dim W̃ − dim W′ and dim W̃ = dim W′ + dim X,

which implies

dim V = dim W + dim X.

On the other hand, X ⊂ W̃ implies

(W ∩ X) ⊂ (W ∩ W̃).

The first direct sum decomposition implies

{0} = (W/W′) ∩ (W̃/W′) = (W ∩ W̃)/W′, hence (W ∩ W̃) ⊂ W′,

therefore

(W ∩ X) ⊂ (W ∩ W̃) ⊂ W′.

As a consequence

W ∩ X = (W ∩ X) ∩ W′ = W ∩ (X ∩ W′).

Using

X ∩ W′ = {0},

due to the second direct sum decomposition, we obtain

W ∩ X = {0}.

Therefore

V = W ⊕ X,

which finishes the induction step for reducible W.
Subcase 1b), W irreducible: Assume that the representation ρ : L → gl(V) defines the L-module structure. According to Remark 5.20 we may assume ρ faithful. We consider the Casimir element of ρ

cρ := ∑_{j=1}^{dim L} ρ(xj)ρ(yj) ∈ End(V).
• Concerning V: The Casimir element cρ : V → V is L-linear, i.e. for x ∈ L, v ∈ V,

(cρ ◦ ρ(x))(v) = (ρ(x) ◦ cρ)(v),

because [cρ, ρ(x)] = 0. Therefore

X := ker cρ ⊂ V

is an L-submodule.
• Concerning W: The 1-dimensional L-module V/W is trivial according to Lemma 5.22. Hence

ρ(L)(V) ⊂ W.

By definition of cρ the submodule W ⊂ V is also stable with respect to cρ, i.e. cρ(W) ⊂ W. The irreducibility of W implies, according to Proposition 5.21, that

cρ|W = µ · idW.
• Concerning W ⊂ V: If we extend a basis (w1, ..., wn) of the vector subspace W to a basis B = (w1, ..., wn, v) of the vector space V, then with respect to B the Casimir element has the block matrix

m(cρ) =
( µ · 1n  ∗ )
( 0       0 )

with the n×n unit matrix 1n; the last row vanishes because ρ(L)(V) ⊂ W implies cρ(V) ⊂ W.
From

tr cρ = dim L

follows µ ≠ 0, and in particular

W ∩ X = {0}.

From det m(cρ) = 0 we get

X = ker cρ ≠ {0}.

Because codimV W = 1 we get

V = W ⊕ X,

which finishes the induction step for irreducible W.
Case 2 (Arbitrary codimension): Consider an arbitrary proper submodule

{0} ⊊ W ⊊ V.

We claim the existence of an L-linear map

f0 : V → W

such that f0|W = idW. Then X := ker f0 is a complement of W because V/X ≅ W.
The idea is to consider the induced L-module HomC(V,W) and to reduce the claim to a problem concerning a pair of L-modules

(Ṽ, W̃)

with codimṼ W̃ = 1, which can be solved by case 1.

Consider the following submodules of the L-module HomC(V,W):

Ṽ := {f ∈ HomC(V,W) : f|W = λ · idW, λ ∈ C},

W̃ := {f ∈ Ṽ : f|W = 0}.

In order to prove that Ṽ is an L-module, consider f ∈ HomC(V,W) with f|W = λ · idW and x ∈ L, w ∈ W:

(x.f)(w) = x.f(w) − f(x.w) = λ · (x.w) − λ · (x.w) = 0.

Therefore even L.Ṽ ⊂ W̃, and both Ṽ and W̃ are L-modules. By definition

codimṼ W̃ = 1,

because W̃ as a submodule of Ṽ is cut out by the one additional equation λ = 0.

In this context we can apply the result of case 1: A complement C · f0 of W̃ exists, i.e.

Ṽ = W̃ ⊕ C · f0,

and without restriction f0|W = idW. The L-module C · f0 is 1-dimensional, hence trivial according to Lemma 5.22. According to Remark 5.17 the equality L.f0 = 0 implies the L-linearity of

f0 : V → W.

Therefore X := ker f0 ⊂ V is a submodule. Due to our considerations at the beginning of case 2)

V = W ⊕ X, q.e.d.
Theorem 5.24 (Jordan decomposition in semisimple Lie algebras). Consider a vector space V and a semisimple Lie algebra L ⊂ gl(V).
If an element x ∈ L considered as endomorphism
x : V → V, v ↦ x(v),
has the Jordan decomposition
x = xs + xn ∈ End(V )
then both summands belong to L, i.e. xs,xn ∈ L.
Proof. i) Calculations with the Lie algebra gl(V). In particular, the adjoint operators ad are taken with respect to gl(V).
Consider an element x ∈ L as endomorphism x : V → V and its Jordan decomposition
x = xs + xn.
Proposition 4.3 implies the Jordan decomposition
ad x = ad xs +ad xn
with polynomials ps(T ), pn(T ) ∈ C[T ] without constant term such that
ad xs = ps(ad x),ad xn = pn(ad x).
Therefore (ad x)(L)⊂ L implies
[xs,L] = (ad xs)(L)⊂ L and [xn,L] = (ad xn)(L)⊂ L.
Consider the normalizer of the subalgebra L⊂ gl(V )
N := Ngl(V )L⊂ gl(V ).
Then L⊂ N is an ideal by definition and
xs,xn ∈ N.
ii) Calculations with the semisimple Lie algebra L.

We have xs, xn ∈ N. But unfortunately, in general L ⊊ N. Therefore we will construct an L-submodule

L ⊂ L̃ ⊂ N

containing xs and xn, and show - a posteriori - L̃ = L.

For an arbitrary submodule W ⊂ V of the L-module V we define the L-module

LW := {y ∈ gl(V) : y(W) ⊂ W and tr(y|W) = 0}.

We define

L̃ := N ∩ ⋂_{W⊂V} LW = ⋂_{W⊂V} (N ∩ LW).

With N also L̃ is an L-module.

• L ⊂ L̃: Lemma 5.22 implies tr(y|W) = 0 for all y ∈ L. As a consequence

L ⊂ ⋂_{W⊂V} LW and L ⊂ L̃.

• Moreover xs, xn ∈ L̃: Because the subspace W is stable with respect to the endomorphism x : V → V, the same is true for its Jordan components, i.e.

xs(W) ⊂ W and xn(W) ⊂ W.

With xn also the restriction xn|W is nilpotent, therefore tr(xn|W) = 0, altogether xn ∈ LW. As a consequence also xs = x − xn ∈ LW. We obtain

xs, xn ∈ L̃

because xs, xn ∈ N and xs, xn ∈ LW for all L-submodules W ⊂ V.

It remains to show L̃ = L. According to Weyl’s theorem on complete reducibility, Theorem 5.23, an L-submodule M ⊂ L̃ exists with

L̃ = L ⊕ M.

We claim M = 0. Consider an arbitrary but fixed y ∈ M. Because L̃ ⊂ N we have

[L, M] ⊂ [L, L̃] ⊂ L,

and [L, M] ⊂ M because M is an L-submodule; hence [L, M] ⊂ L ∩ M = {0}, i.e. the action of L on M is trivial. Therefore [L, y] = 0. For any irreducible L-submodule W ⊂ V, on one hand Schur’s Lemma, Theorem 5.18, implies y|W = µ · idW. On the other hand, y ∈ LW implies tr(y|W) = 0. Therefore y|W = 0.

The decomposition of V as a direct sum of irreducible L-modules shows y = 0. We obtain M = 0 and L̃ = L, q.e.d.
The previous Theorem 5.24 assures: For a semisimple Lie algebra L the Jordan decomposition of elements from ad L induces a decomposition of elements from L.
Definition 5.25 (Abstract Jordan decomposition). Consider a complex semisimple Lie algebra L. Its adjoint representation provides an isomorphism

ad : L ≅ ad L ⊂ gl(L).
For x ∈ L consider the element ad x ∈ ad(L) as endomorphism
ad x : L → L, y ↦ [x,y]
and denote by

ad(x) = fs + fn ∈ End(L)

the corresponding Jordan decomposition. Due to Theorem 5.24, applied to the semisimple Lie algebra ad L:
fs ∈ ad(L) and fn ∈ ad(L).
5.3 Weyl’s theorem on complete reducibility 101
Introducing the elements s := ad⁻¹(fs) ∈ L and n := ad⁻¹(fn) ∈ L provides the decomposition
x = s + n
with [s,n] = 0. This decomposition is named the abstract Jordan decomposition of x ∈ L.
In the abstract case, neither x ∈ L nor its components s and n from the abstract Jordan decomposition are endomorphisms of a vector space. But Corollary 5.26 will show: For any representation
ρ : L → gl(V)
the abstract Jordan decomposition of x ∈ L induces the Jordan decomposition of ρ(x) ∈ End(V).
Corollary 5.26 (Jordan decomposition of representations). Consider a complexvector space V and a representation
ρ : L→ gl(V )
of a complex semisimple Lie algebra L. If x ∈ L has the abstract Jordan decomposition
x = s+n
then ρ(x) ∈ End(V ) has the Jordan decomposition
ρ(x) = ρ(s)+ρ(n).
Proof. i) Calculations with the semisimple Lie algebra ρ(L). By definition the endomorphism
ad s : L→ L
is semisimple, i.e. the eigenvectors of ad s ∈ End(L) span L. Therefore the eigenvectors of ad ρ(s) ∈ End(ρ(L)) span ρ(L) and the endomorphism
ad ρ(s) ∈ End(ρ(L))
is semisimple. Analogously, by definition the endomorphism
ad n ∈ End(L)
is nilpotent. Therefore
ad ρ(n) ∈ End(ρ(L))
is nilpotent because (ad ρ(n))^k ◦ ρ = ρ ◦ (ad n)^k. Apparently,
[ρ(s),ρ(n)] = ρ([s,n]) = 0
which implies [ad ρ(s),ad ρ(n)] = 0. Summing up,
ad ρ(x) = ad ρ(s)+ad ρ(n)
is the Jordan decomposition of the element ad ρ(x) ∈ ad ρ(L) considered as an endomorphism
ad ρ(x) : ρ(L) → ρ(L).
The semisimplicity of the Lie algebra ρ(L) implies the isomorphy
ad : ρ(L) ⥲ ad(ρ(L))
and the abstract Jordan decomposition
ρ(x) = ρ(s)+ρ(n)
of ρ(x) ∈ ρ(L).
ii) Jordan decomposition in End(V). Consider the Jordan decomposition of the endomorphism ρ(x) ∈ ρ(L) ⊂ End(V)
ρ(x) = fs + fn.
Because ρ(L) is semisimple, Theorem 5.24 applies to ρ(L) and implies fs, fn ∈ ρ(L). From the uniqueness of the Jordan decomposition it follows that
fs = ρ(s) and fn = ρ(n).
Part II
Structure of Complex Semisimple Lie Algebras
Chapter 6
Cartan decomposition
The base field in this chapter is K = C, the field of complex numbers. All Lie algebras are complex Lie algebras if not stated otherwise.
6.1 Toral subalgebra
Consider a Lie algebra L. We recall from Definition 4.4 that an element x ∈ L is named semisimple iff the endomorphism ad x ∈ gl(L) is semisimple, i.e. diagonalizable.
Definition 6.1 (Toral subalgebra). Consider a Lie algebra L.
• A toral subalgebra of L is a subalgebra T ⊂ L with all elements x ∈ T semisimple.

• A toral subalgebra T ⊂ L is a maximal toral subalgebra iff T is not properly included in any other toral subalgebra of L.
We recall from Linear Algebra the following Lemma 6.2. It states that the restriction of a diagonalizable endomorphism to an invariant subspace stays diagonalizable.
Lemma 6.2 (Diagonalizable endomorphisms). Consider a vector space V and a diagonalizable endomorphism
x : V →V.
If W ⊂V is a stable subspace, i.e. x(W )⊂W, then also the restriction
x|W : W →W
is diagonalizable.
Proof. By assumption V decomposes as a direct sum of eigenspaces of x
V = ⊕_{i=1}^{k} V_{λi}(x)
with pairwise distinct eigenvalues λi, i = 1,...,k. In order to prove
W = W ∩ ⊕_{i=1}^{k} V_{λi}(x) = ⊕_{i=1}^{k} (W ∩ V_{λi}(x))
we show: If w = v1 + ... + vk ∈ W with eigenvectors vi of x belonging to pairwise distinct eigenvalues λi, then vi ∈ W for all i = 1,...,k.
The claim holds for k = 1. For the induction step k−1 → k consider
w = v1 + ... + vk ∈ W
with
x(w) = λ1 · v1 + ... + λk · vk ∈ W
and apply the induction assumption to
x(w) − λ1 · w = (λ2 − λ1) · v2 + ... + (λk − λ1) · vk ∈ W.
We obtain v2,...,vk ∈ W. The representation w = v1 + ... + vk ∈ W then also implies v1 ∈ W, q.e.d.
We prove that the abstract Jordan decomposition of a semisimple Lie algebra L, cf. Definition 5.25, implies the existence of a maximal toral subalgebra of L.
Proposition 6.3 (Toral subalgebras). Consider a semisimple Lie algebra L ≠ {0}.

1. The Lie algebra L has a non-zero toral subalgebra, in particular also a non-zero maximal toral subalgebra.

2. Each toral subalgebra of L is Abelian.
Proof. ad 1) Existence of non-zero toral subalgebras: Not every element x ∈ L is ad-nilpotent. Otherwise ad L and also L ≅ ad L were nilpotent according to Engel's theorem, see Theorem 4.12. But the property [L,L] = L, see Theorem 5.14, shows that L is not nilpotent. Choose an element x ∈ L with ad xs ≠ 0. The abstract Jordan decomposition x = s + n provides a non-zero semisimple element s ∈ L. The 1-dimensional subalgebra
C · s ⊂ L
is a toral subalgebra.
ad 2) Toral subalgebras are Abelian: Consider a toral subalgebra T ⊂ L. Choosing an arbitrary but fixed x ∈ T we have to show: The restriction
adT(x) : T → T, y ↦ ad(x)(y) = [x,y]
is zero, i.e. each 0 ≠ y ∈ T is an eigenvector of ad(x) belonging to the eigenvalue zero. Because ad(x) ∈ End(L) is diagonalizable and T ⊂ L is stable with respect to ad(x), also the restriction
adT (x) : T → T
is diagonalizable according to Lemma 6.2. Therefore it suffices to prove
adT (x)(y) = 0
for all eigenvectors y ∈ T of adT(x). Assume the existence of an eigenvector y ∈ T of adT(x) satisfying
adT (x)(y) = [x,y] = λ · y.
Also adT(y) ∈ End(T) is diagonalizable. Hence T decomposes as a direct sum of eigenspaces of adT(y) belonging to pairwise distinct eigenvalues (λj)j∈J. A corresponding representation
x = ∑_{j∈J} αj · yj
with eigenvectors yj, j ∈ J, and coefficients αj ∈ C gives
−λ · y = −[x,y] = [y,x] = adT(y)(x) = ∑_{j∈J, λj≠0} αj · λj · yj.
The element y ≠ 0 is an eigenvector of adT(y) with eigenvalue zero. Such an eigenvector cannot be represented as a linear combination of eigenvectors of adT(y) having eigenvalues different from zero. Therefore λ = 0.
6.2 sl(2,C)-modules
The 3-dimensional complex Lie algebra sl(2,C) is the prototype of a complex semisimple Lie algebra. Also its representation theory is of fundamental importance:

• The representation theory of sl(2,C) is the means to clarify the structure of general semisimple Lie algebras, see Proposition 8.3.

• The representation theory of sl(2,C) is paradigmatic for the representation theory of general semisimple Lie algebras, see Chapter ??.

Therefore Proposition 6.4 presents the structure of sl(2,C) in a form which generalizes to arbitrary complex semisimple Lie algebras.
Proposition 6.4 (Structure of sl(2,C)). The Lie algebra L := sl(2,C) has the standard basis B := (h, x, y) with matrices
h = (1 0; 0 −1), x = (0 1; 0 0), y = (0 0; 1 0) ∈ sl(2,C).
• The non-zero commutators of the elements from B are
[h,x] = 2x, [h,y] =−2y, [x,y] = h.
Therefore, with respect to B the matrices of the adjoint representation ad : L → gl(L) are
ad h = (0 0 0; 0 2 0; 0 0 −2), ad x = (0 0 1; −2 0 0; 0 0 0), ad y = (0 −1 0; 0 0 0; 2 0 0).
• The element h∈ L is semisimple: ad h is a diagonal matrix. It has the eigenvalues
0,α := 2,−α
with eigenspaces respectively
H := L0 := C ·h,Lα := C · x,L−α := C · y.
In particular
sl(2,C) = H ⊕ L2 ⊕ L−2
as a direct sum of complex vector spaces.
• With respect to the basis B the Killing form of sl(2,C) has the symmetric matrix
4 · (2 0 0; 0 0 1; 0 1 0) ∈ M(3×3, Z).
The Killing form is nondegenerate. In particular, sl(2,C) is a semisimple Lie algebra.
• The Abelian subalgebra H ⊂ L is a maximal toral subalgebra of sl(2,C).
• The restriction κH := κ|H×H of the Killing form to H is positive definite. The scalar product κH induces the isomorphism
H ⥲ H*, h ↦ κ(h,−) = 8 · h*.
The element tα ∈ H with κH(tα,−) = α = 2 · h* is
tα = h/4.
It satisfies
h = 2 · tα / κ(tα, tα).
Proof. Only the following issues need a separate proof.
• In order to prove that H ⊂ sl(2,C) is a maximal toral subalgebra we show CL(H) = H:
[λ ·h+µ · x+ν · y,h] =−2µ · x+2ν · y = 0 ⇐⇒ µ = ν = 0 (λ ,µ,ν ∈ C)
• The restricted Killing form κH is given by the positive scalar κ(h,h) = 8.
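All matrix assertions of Proposition 6.4 can also be verified by a short machine computation. The following sketch (the helper names bracket and ad_matrix are ours) computes the ad matrices and the Killing form matrix with numpy:

```python
import numpy as np

# Standard basis of sl(2,C) from Proposition 6.4.
h = np.array([[1, 0], [0, -1]], dtype=float)
x = np.array([[0, 1], [0, 0]], dtype=float)
y = np.array([[0, 0], [1, 0]], dtype=float)

def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    return a @ b - b @ a

# The non-zero commutators of the standard basis.
assert np.allclose(bracket(h, x), 2 * x)
assert np.allclose(bracket(h, y), -2 * y)
assert np.allclose(bracket(x, y), h)

basis = [h, x, y]

def ad_matrix(z):
    """Matrix of ad z with respect to the basis (h, x, y).

    A traceless 2x2 matrix with rows (p, q), (r, -p) equals
    p*h + q*x + r*y, so coordinates can be read off directly."""
    cols = []
    for b in basis:
        c = bracket(z, b)
        cols.append([c[0, 0], c[0, 1], c[1, 0]])
    return np.array(cols).T

ad = {name: ad_matrix(z) for name, z in zip("hxy", basis)}

# ad h is diagonal with the eigenvalues 0, 2, -2.
assert np.allclose(ad["h"], np.diag([0.0, 2.0, -2.0]))

# Killing form kappa(a, b) = tr(ad a o ad b) in the basis (h, x, y).
kappa = np.array([[np.trace(ad[a] @ ad[b]) for b in "hxy"] for a in "hxy"])
assert np.allclose(kappa, 4 * np.array([[2, 0, 0], [0, 0, 1], [0, 1, 0]]))
```

In particular kappa[0, 0] = κ(h,h) = 8, the positive scalar representing κH.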
The element h ∈ L := sl(2,C) is semisimple by definition because ad h ∈ End(L) is semisimple. Therefore h ∈ L coincides with its semisimple component in the abstract Jordan decomposition of L. As a consequence h acts as a semisimple endomorphism on any L-module V, cf. Corollary 5.26. Therefore any L-module decomposes as a direct sum of eigenspaces with respect to the action of h.
Definition 6.5 (Weight, weight space and primitive element). Consider an sl(2,C)-module V. The semisimple element h ∈ H acts diagonally on V, see Corollary 5.26.
1. Denote by
Vλ := {v ∈ V : h.v = λ · v}, λ ∈ C,
the eigenspaces with respect to the action of h. If Vλ ≠ {0} then Vλ is named a weight space of V, the corresponding eigenvalue λ ∈ C is a weight of V.
2. A non-zero element e ∈ Vλ is named a primitive element of V with weight λ if x.e = 0 with respect to the action of the element x from the standard basis B of sl(2,C).
For any sl(2,C)-module V not only the action of h ∈ sl(2,C) on V but also the action of the two other elements of the standard basis can be described easily.
Proposition 6.6 (Action of the standard basis). Consider an sl(2,C)-module V ≠ {0}. Denote by B = (h, x, y) the standard basis of sl(2,C) according to Proposition 6.4.
1. The module decomposes as a direct sum of weight spaces
V = ⊕_{λ weight} Vλ.
2. The module V has at least one primitive element e ∈V .
3. If e ∈ V is a primitive element of V with weight λ ∈ C then the successive action of y ∈ B on e generates the elements
ei := (1/i!) · y^i.e, i ≥ 0.
These elements satisfy:
a. Weight: h.ei = (λ −2i) · ei.
b. Lowering the weight: y.ei = (i+1) · ei+1.
c. Raising the weight: x.ei = (λ − i+1) · ei−1,e−1 := 0.
d. Irreducible submodule: Denote by imax the largest index i ∈ N with
ei ≠ 0.
Then the family
(ei)0≤i≤imax
is linearly independent and spans an irreducible submodule W ⊂ V with
dim W = imax + 1.
All weight spaces of W are 1-dimensional.
Proof. ad 2) Choose a weight space Vκ and a non-zero element v ∈Vκ .
Because [h,x] = 2x we have
h.(x.v) = [h,x].v+ x.(h.v) = (2x).v+ x.(κ · v) = 2 · (x.v)+κ · (x.v) = (κ +2)(x.v),
i.e. applying x ∈ B raises the weight by 2. Because V has finite dimension, the last non-zero element from the sequence (x^k.v)k∈N is a primitive element e ∈ V. Denote by λ its weight.
ad 3)
a) The formula
h.ei = (λ − 2i) · ei
follows by induction on i ∈ N in an analogous way from the commutator [h,y] = −2y: For v ∈ Vκ
h.(y.v) = [h,y].v + y.(h.v) = −(2y).v + y.(κ · v) = −2 · (y.v) + κ · (y.v) = (κ − 2)(y.v),
i.e. applying y ∈ B lowers the weight by 2.
b) The formula
y.ei = (i+1) · ei+1
follows from the definition of the family (ei)i≥0.
c) With the convention e−1 := 0 the formula
x.ei = (λ − i+1) · ei−1
follows by induction on i ∈ N by using [x,y] = h ∈ sl(2,C) and using part a) and part b):
i · x.ei = x.(y.ei−1) = [x,y].ei−1 + y.(x.ei−1) = h.ei−1 + y.((λ − i+2) · ei−2) =
= (λ −2(i−1)) · ei−1 +(i−1)(λ − i+2) · ei−1 = i(λ − i+1) · ei−1
and after dividing by i
x.ei = (λ − i + 1) · ei−1.
d) Because V is finite dimensional, an index imax with the maximality property exists. Due to part a) elements ei with pairwise distinct index belong to different weight spaces. Hence they are linearly independent. The previous formulas from parts a) - c) show that W ⊂ V is a submodule. The formula for the dimension of W follows from the choice of imax.
In order to prove the irreducibility of W consider a non-zero submodule W′ ⊂ W. The element h acts diagonally on W′. Hence W′ contains at least one of the eigenvectors ei. Formula 3c) shows that W′ contains all elements ej with 0 ≤ j ≤ i. Formula 3b) shows that W′ contains all elements ej with i ≤ j. Therefore W′ = W and W is irreducible, q.e.d.
The construction of the sl(2,C)-module V (λ ) from Proposition 6.6 generates allirreducible sl(2,C)-modules.
Theorem 6.7 (Classification of irreducible modules).
1. Consider an sl(2,C)-module V. The weight λ of a primitive element of V is an integer.
2. For any λ ∈ N an irreducible sl(2,C)-module V(λ) with highest weight λ exists. It decomposes as the direct sum of 1-dimensional weight spaces
V(λ) = ⊕_{i=0}^{λ} Vλ−2i.
In particular, all weights of V (λ ) are integers.
3. The map to the isomorphy classes of sl(2,C)-modules
N → {[V] : V irreducible sl(2,C)-module}, λ ↦ [V(λ)],
is bijective.
Proof. 1. Choose a primitive element e ∈ V with weight λ ∈ C and consider the family (ei)i≥0 from Proposition 6.6: If imax is the largest index with eimax ≠ 0 then formula 3c) shows:
0 = x.eimax+1 = (λ − imax) · eimax.
Therefore λ = imax ∈ N.
2. Choose a vector space V of dimension λ + 1 with basis e0,...,eλ. Define the action of sl(2,C) on V by the formulas from Proposition 6.6:
• h.ei = (λ − 2i) · ei
• y.ei = (i+1) · ei+1
• x.ei = (λ − i + 1) · ei−1, e−1 := 0.
One checks that these definitions satisfy
• h.(x.ei) − x.(h.ei) = 2x.ei
• h.(y.ei) − y.(h.ei) = −2y.ei
• x.(y.ei) − y.(x.ei) = h.ei.
Accordingly V becomes an sl(2,C)-module with primitive element e0 of weight λ. This module, denoted by V(λ), is spanned by the images of e0 under the successive action of y. Therefore V(λ) is irreducible according to Proposition 6.6, part 3d.
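The check of the three bracket relations is easily mechanized: the formulas determine matrices for h, x, y on the basis e0,...,eλ, and the relations can be verified numerically. A sketch with numpy (the function name irreducible_module is our own):

```python
import numpy as np

def irreducible_module(lam):
    """Matrices of h, x, y acting on V(lam) with basis e_0, ..., e_lam,
    defined by the three formulas above."""
    n = lam + 1
    H = np.diag([float(lam - 2 * i) for i in range(n)])
    X = np.zeros((n, n))
    Y = np.zeros((n, n))
    for i in range(n - 1):
        Y[i + 1, i] = i + 1        # y.e_i = (i+1) * e_{i+1}
        X[i, i + 1] = lam - i      # x.e_{i+1} = (lam-(i+1)+1) * e_i
    return H, X, Y

def bracket(a, b):
    return a @ b - b @ a

# The bracket relations of sl(2,C) hold, so V(lam) is indeed a module.
for lam in range(6):
    H, X, Y = irreducible_module(lam)
    assert np.allclose(bracket(H, X), 2 * X)
    assert np.allclose(bracket(H, Y), -2 * Y)
    assert np.allclose(bracket(X, Y), H)
```

For λ = 2 the matrix H is diag(2, 0, −2), reproducing the weight space decomposition of the adjoint representation from Proposition 6.4.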
3. According to part 2) the map from the theorem is well-defined.
The map is surjective: Consider an irreducible sl(2,C)-module V. Consider a primitive element e ∈ V with weight λ ∈ N and the submodule
V(λ) ≅ span⟨y^i.e : i ∈ N⟩ ⊂ V.
The irreducibility of V implies V ≅ V(λ).
The map is injective: According to the formula
1 + λ = dim V(λ)
the highest weight of V(λ) determines its dimension. Therefore λ1 ≠ λ2 implies V(λ1) ≇ V(λ2), q.e.d.
Example 6.8 (Explicit representation by homogeneous polynomials). Set L := sl(2,C).
1. The irreducible L-module of highest weight λ = 0 is the 1-dimensional vector space C with the trivial representation
ρ : sl(2,C)→{0} ⊂ gl(C).
Its weight space decomposition is
V (0) =V0 ' C.
Any non-zero element e ∈ C is a primitive element.
2. The irreducible L-module of highest weight λ = 1 is the 2-dimensional vector space C² with the tautological representation
f : L = sl(2,C) ↪−→ gl(C2).
Its weight space decomposition is
V (1) =V1⊕V−1
with
V1 = C · (1, 0)ᵗ, V−1 = C · (0, 1)ᵗ
and primitive element
e = (1, 0)ᵗ.
3. The irreducible L-module of highest weight λ = 2 is the 3-dimensional vector space L itself, considered as an L-module with respect to the adjoint representation
ad : L→ gl(L).
Proposition 6.4 shows the weight space decomposition
V (2) = L = L2⊕L0⊕L−2
with primitive element x ∈B.
4. In general, the irreducible L-module V(λ) of highest weight λ ∈ N is isomorphic to the complex vector space of all homogeneous polynomials P(u,v) ∈ C[u,v] of degree n = λ.
The vector space C[u,v] of polynomials in two variables has a basis of monomials (u^µ · v^ν)µ,ν∈N. A homogeneous polynomial of degree n ∈ N is an element
P(u,v) = ∑_{µ+ν=n} aµ,ν · u^µ · v^ν ∈ C[u,v], aµ,ν ∈ C.
Denote by
Poln ⊂ C[u,v]
the subspace of homogeneous polynomials of degree n. One has dim Poln = n + 1 because the family of monomials (u^{n−i} · v^i)i=0,...,n is a basis of Poln.
When identifying the canonical basis of C² with the two variables u and v, the tautological representation of L acts on
Pol1 ≅ C·u ⊕ C·v
according to
h.u = u,h.v =−v;x.u = 0,x.v = u;y.u = v,y.v = 0.
More generally, we define an action of z ∈ L on Poln by the differential operator
Dz := (z.u) · ∂/∂u + (z.v) · ∂/∂v,
notably for u^{n−i} · v^i ∈ Poln
h.(u^{n−i} · v^i) = u · (n−i) · u^{n−i−1} · v^i − i · v · u^{n−i} · v^{i−1} = (n−2i) · u^{n−i} · v^i,
y.(u^{n−i} · v^i) = v · (n−i) · u^{n−i−1} · v^i = (n−i) · u^{n−i−1} · v^{i+1},
x.(u^{n−i} · v^i) = u · u^{n−i} · i · v^{i−1} = i · u^{n−i+1} · v^{i−1}.
We show that the map
L × Poln → Poln, (z, P) ↦ Dz(P),
defines the irreducible L-module structure V(n) on Poln: We set e := u^n ∈ Poln and define for i = 0,...,n
ei := (1/i!) · (y^i.e) = (1/i!) · n · (n−1) · ... · (n−i+1) · u^{n−i} · v^i = (n choose i) · u^{n−i} · v^i.
We obtain
h.ei = (n−2i) · ei,
y.ei = y.((n choose i) · u^{n−i} · v^i) = (n choose i) · (n−i) · u^{n−i−1} · v^{i+1} = (i+1) · (n choose i+1) · u^{n−i−1} · v^{i+1} = (i+1) · ei+1,
x.ei = x.((n choose i) · u^{n−i} · v^i) = i · (n choose i) · u^{n−i+1} · v^{i−1} = (n−i+1) · (n choose i−1) · u^{n−i+1} · v^{i−1} = (n−i+1) · ei−1,
which proves Poln ≅ V(n) with a primitive element e := u^n ∈ Poln.
One checks that the isomorphy of vector spaces
Polλ = Sym^λ(C·u ⊕ C·v) ≅ Sym^λ C²
induces an isomorphy of L-modules between Polλ and the symmetric power with exponent λ of the tautological L-module.
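The action of Dz on Poln can also be checked mechanically. In the following sketch (the storage convention and the name D are our own) a homogeneous polynomial of degree n is stored as the coefficient vector (a0,...,an) of ∑ ai · u^{n−i} · v^i, and Dz is implemented by formal partial differentiation; the assertions reproduce the formulas for h.ei, y.ei, x.ei derived above:

```python
from math import comb

# Tautological action of the standard basis on the variables (u, v),
# read off from h.u = u, h.v = -v; x.u = 0, x.v = u; y.u = v, y.v = 0.
# The value z.u (resp. z.v) is the linear form a*u + b*v stored as (a, b).
ACTION = {
    "h": {"u": (1, 0), "v": (0, -1)},
    "x": {"u": (0, 0), "v": (1, 0)},
    "y": {"u": (0, 1), "v": (0, 0)},
}

def D(z, p, n):
    """D_z = (z.u) d/du + (z.v) d/dv on Pol_n.

    A polynomial is stored as the coefficient vector (a_0, ..., a_n)
    of p = sum_i a_i * u^(n-i) * v^i."""
    q = [0.0] * (n + 1)
    for i, a in enumerate(p):
        if a == 0:
            continue
        au, bu = ACTION[z]["u"]     # z.u = au*u + bu*v
        # (z.u) * d/du (u^(n-i) v^i) = (au*u + bu*v) * (n-i) * u^(n-i-1) v^i
        q[i] += a * (n - i) * au
        if i + 1 <= n:
            q[i + 1] += a * (n - i) * bu
        av, bv = ACTION[z]["v"]     # z.v = av*u + bv*v
        # (z.v) * d/dv (u^(n-i) v^i) = (av*u + bv*v) * i * u^(n-i) v^(i-1)
        if i - 1 >= 0:
            q[i - 1] += a * i * av
        q[i] += a * i * bv
    return q

n = 5
for i in range(n + 1):
    e_i = [comb(n, j) if j == i else 0 for j in range(n + 1)]
    e_up = [comb(n, j) if j == i + 1 else 0 for j in range(n + 1)]
    e_down = [comb(n, j) if j == i - 1 else 0 for j in range(n + 1)]
    assert D("h", e_i, n) == [(n - 2 * i) * c for c in e_i]     # h.e_i = (n-2i) e_i
    assert D("y", e_i, n) == [(i + 1) * c for c in e_up]        # y.e_i = (i+1) e_{i+1}
    assert D("x", e_i, n) == [(n - i + 1) * c for c in e_down]  # x.e_i = (n-i+1) e_{i-1}
```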
6.3 Decomposition into eigenspaces
For a semisimple Lie algebra L we denote by (L,T) the pair with a fixed maximal toral subalgebra T ⊂ L. Any toral subalgebra is Abelian according to Proposition 6.3. Because T is Abelian all endomorphisms
ad h : L → L, h ∈ T,
are simultaneously diagonalizable. Therefore the whole Lie algebra L splits as a direct sum of their common eigenspaces. This decomposition is called the root space decomposition of L with respect to T.
We recall from Lemma 5.1 that the Killing form of L is “associative”
κ([x,y],z) = κ(x, [y,z]),x,y,z ∈ L.
Definition 6.9 (Root space decomposition). Consider a pair (L,T ) with a semisim-ple Lie algebra L and a maximal toral subalgebra T ⊂ L.
1. For a complex linear functional
α : T → C
set
Lα := {x ∈ L : [h,x] = α(h) · x for all h ∈ T}.
If α ≠ 0 and Lα ≠ 0 then α is a root of (L,T) and Lα is the corresponding root space.
2. The set of all roots of (L,T ) is denoted
Φ := {α : T → C : α ≠ 0, Lα ≠ 0}.
3. The vector space decomposition
L = L0 ⊕ ⊕_{α∈Φ} Lα
is the root space decomposition of (L,T).
In the root space decomposition the zero eigenspace L0 plays a distinguished role. By definition L0 equals the centralizer CL(T) of the maximal toral subalgebra T of the pair (L,T). Proposition 6.3 implies T ⊂ CL(T). The main result of the present section is the proof of the opposite inclusion CL(T) ⊂ T. See Theorem 6.13 for the equality
T =CL(T ).
The main steps of the proof are:
• The centralizer CL(T) contains with each element also its semisimple and its nilpotent summand. Therefore one can consider both types of elements separately.
• The case of semisimple elements is trivial due to the maximality of T with respectto semisimple elements.
• If non-zero nilpotent elements of CL(T) exist, they cannot belong to T. A posteriori, as a subset of T the centralizer CL(T) cannot contain any non-zero nilpotent elements because all elements of a toral subalgebra are semisimple.
• Because the subalgebra CL(T) is Abelian, all its nilpotent elements belong to the null space of the Killing form restricted to CL(T).
• The Killing form is nondegenerate on CL(T ).
The following Lemmata 6.10 and 6.11 as well as Proposition 6.12 prepare the proof of Theorem 6.13.
Lemma 6.10 (Orthogonality of root spaces). Consider a pair (L,T) with a semisimple Lie algebra L and a maximal toral subalgebra T ⊂ L. Consider the corresponding root space decomposition from Definition 6.9
L = L0 ⊕ ⊕_{α∈Φ} Lα.
For two functionals α, β ∈ T*:

• [Lα, Lβ] ⊂ Lα+β.

• If β ≠ −α then
κ(Lα, Lβ) = 0.
Proof. Assume x ∈ Lα ,y ∈ Lβ ,h ∈ T :
[h, [x,y]] =−([x, [y,h]]+ [y, [h,x]]) = [x,β (h) · y]− [y,α(h) · x]
= β (h) · [x,y]+α(h) · [x,y] = (α +β )(h) · [x,y] ∈ Lα+β .
For the proof of the second claim choose an element h ∈ T with (α + β)(h) ≠ 0. For arbitrary elements x ∈ Lα, y ∈ Lβ:
κ([h,x],y) =−κ([x,h],y) =−κ(x, [h,y]),
α(h) ·κ(x,y) =−β (h) ·κ(x,y),
(α +β )(h) ·κ(x,y) = 0
which implies κ(x,y) = 0. Hence κ(Lα ,Lβ ) = 0.
Lemma 6.11 (Centralizer of a maximal toral subalgebra). Consider a pair (L,T )with L a semisimple Lie algebra and T ⊂ L a maximal toral subalgebra.
i) The centralizer CL(T) contains with each element x ∈ CL(T) also the semisimple and the nilpotent part
s, n ∈ CL(T)
s,n ∈CL(T )
from the abstract Jordan decomposition x = s+n.
ii) Any semisimple element x ∈CL(T ) belongs to T .
Proof. Set C :=CL(T ).
i) If x ∈ C with abstract Jordan decomposition x = s + n then also s, n ∈ C: The abstract Jordan decomposition makes use of the fact that
ad x = ad s+ad n ∈ End(L)
is the Jordan decomposition of the endomorphism ad x ∈ End(L). In particular
ad s = ps(ad x) and ad n = pn(ad x)
with polynomials ps(T ), pn(T ) ∈ C[T ].
As a consequence: If h ∈ T with (ad x)(h) = 0 then also
(ad s)(h) = 0 and (ad n)(h) = 0.
Hence s,n ∈C.
ii) All semisimple elements x ∈ C belong to T: A semisimple element x ∈ C commutes with all elements from T. Therefore x and
all elements from T are simultaneously diagonalizable. In particular, all elementsfrom the vector space
T +C · x⊂ L
are diagonalizable. The maximality of T implies
T +C · x⊂ T,
i.e. x ∈ T , q.e.d.
The strategy to prove Theorem 6.13 uses the fact that the Killing form restricted to a maximal toral subalgebra is nondegenerate. The following Proposition 6.12 proves this result in two steps. Nevertheless, Theorem 6.13 will show a posteriori that both statements of the proposition coincide.
Proposition 6.12 (Non-degenerateness of the Killing form of a maximal toral subalgebra). Consider a pair (L,T) with L a semisimple Lie algebra and T ⊂ L a maximal toral subalgebra.
The restriction of the Killing form to the centralizer of T
κC := κ|CL(T )×CL(T ) : CL(T )×CL(T )→ C
and the restriction to the maximal toral subalgebra T
κT := κ|T ×T : T ×T → C
are nondegenerate.
Proof. 1) Restriction to CL(T): By definition CL(T) = L0. Assume x ∈ L0 with κ(x, L0) = 0. But also for roots β ∈ Φ holds κ(x, Lβ) = 0 according to Lemma 6.10. Hence
κ(x,L) = 0.
Because the Killing form is nondegenerate on L according to the Cartan criterion, see Theorem 5.11, the last equation implies x = 0.
2) Restriction to T: Assume an element h ∈ T with κT(h,T) = 0. Because T ⊂ C and because the restricted Killing form κC is nondegenerate according to the first part, it suffices to show
κC(h,C) = 0.
For a nilpotent element n ∈ C the endomorphism ad n ∈ End(L) is nilpotent. Because [n,h] = 0 also [ad n, ad h] = 0. Therefore the endomorphism
(ad h)◦ (ad n) ∈ End(L)
is nilpotent. Hence
κC(h,n) = tr((ad h) ◦ (ad n)) = 0
by Corollary 5.1. For a general element x ∈C with abstract Jordan decomposition
x = s+n
we have s, n ∈ C due to Lemma 6.11, part i), and s ∈ T due to Lemma 6.11, part ii). Hence
κC(h,x) = κC(h,s)+κC(h,n) = κT (h,s)+κC(h,n) = 0
which implies h = 0, q.e.d.
Theorem 6.13 (A maximal toral subalgebra equals its centralizer). Consider asemisimple Lie algebra L and a maximal toral subalgebra T ⊂ L. Then T =CL(T ).
Proof. According to Proposition 6.3 the toral subalgebra T is Abelian, hence T ⊂ CL(T). It remains to prove the opposite inclusion
CL(T )⊂ T.
Set C :=CL(T ).
Due to Lemma 6.11 it suffices to prove that any nilpotent element n ∈ C belongs to T. For the proof we use the non-degenerateness of the restricted Killing form κT according to Proposition 6.12.
i) T ∩ [C,C] = {0}: By definition of the centralizer [C,T ] = {0}. Therefore
κC(T, [C,C]) = κC([T,C],C) = 0.
In particular,
0 = κC(T, T ∩ [C,C]) = κT(T, T ∩ [C,C]).
Non-degenerateness of κT according to Proposition 6.12 implies T ∩ [C,C] = {0}.
ii) The centralizer C is a nilpotent Lie algebra: According to Engel's theorem, see Theorem 4.12, it suffices to show that for each element x ∈ C the endomorphism adC x ∈ End(C) is nilpotent.
For an element x ∈ C consider the abstract Jordan decomposition x = s + n. On one hand, Lemma 6.11 implies s ∈ T. Hence [s,C] = 0 by definition. On the other hand, ad n ∈ End(L) is nilpotent and a posteriori also the restriction (ad n)|C ∈ End(C). Hence
adC x = adC n
is nilpotent.
iii) The centralizer C is Abelian: We argue by indirect proof. Assume on the contrary [C,C] ≠ 0. Due to part ii) the Lie algebra C is nilpotent. We apply Corollary 4.13 to the ideal
{0} ≠ I := [C,C] ⊂ C
and obtain an element
0 ≠ x ∈ Z(C) ∩ [C,C].
The element x is not semisimple, because semisimple elements from C belong to T according to Lemma 6.11, but T ∩ [C,C] = {0} according to part i). Therefore n ≠ 0 in the abstract Jordan decomposition
x = s+n
and n∈C according to Lemma 6.11. Moreover x∈ Z(C) implies n∈ Z(C) accordingto Theorem 2.6. The nilpotency of ad n and the commutator [n,y] = 0 for all y ∈Cimply the nilpotency of
(ad n)◦ (ad y).
As a consequence κC(n,y) = 0 according to Lemma 5.1. We obtain κC(n,C) = 0. Proposition 6.12 implies n = 0, a contradiction.
iv) C = T: We argue by indirect proof. Consider an element x ∈ C \ T with abstract Jordan decomposition
x = s + n.
The same reasoning as in part iii) shows n ≠ 0 and concludes - using [C,C] = 0 from part iii) - that n = 0, a contradiction, q.e.d.
Consider a pair (L,T) with a complex semisimple Lie algebra L and a maximal toral subalgebra T ⊂ L. Theorem 6.13 allows to replace in the root space decomposition of L from Definition 6.9 the eigenspace L0 = CL(T) by T.
Definition 6.14 (Cartan decomposition). Consider (L,T) with a semisimple Lie algebra L and a maximal toral subalgebra T ⊂ L. Denote by Φ the root set of (L,T).
Then the decomposition as a direct sum of vector spaces
L = T ⊕ ⊕_{α∈Φ} Lα
is the Cartan decomposition of L with respect to T .
Hence a semisimple Lie algebra L decomposes as the direct sum of a maximal toral subalgebra T and the root spaces Lα of its roots α ∈ Φ.
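As a concrete illustration of the Cartan decomposition consider L = sl(3,C) with T the subalgebra of traceless diagonal matrices (that T is a maximal toral subalgebra we take for granted in this sketch). The matrix units Eij, i ≠ j, are simultaneous eigenvectors, and the roots are the functionals αij(h) = hi − hj; a numerical check with numpy:

```python
import numpy as np

def E(i, j, n=3):
    """Matrix unit E_ij with a single entry 1 at position (i, j)."""
    m = np.zeros((n, n))
    m[i, j] = 1.0
    return m

# A basis of the maximal toral subalgebra T of sl(3,C):
# the traceless diagonal matrices.
h1 = np.diag([1.0, -1.0, 0.0])
h2 = np.diag([0.0, 1.0, -1.0])

# [h, E_ij] = (h_i - h_j) * E_ij for diagonal h, so each E_ij, i != j,
# spans the root space of the root alpha_ij(h) = h_i - h_j.
for i in range(3):
    for j in range(3):
        if i == j:
            continue
        for h in (h1, h2):
            d = np.diag(h)
            assert np.allclose(h @ E(i, j) - E(i, j) @ h,
                               (d[i] - d[j]) * E(i, j))
```

The 8-dimensional Lie algebra sl(3,C) thus splits as the 2-dimensional T plus six 1-dimensional root spaces C · Eij.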
Definition 6.15 (Cartan subalgebra). Consider a Lie algebra L. A Cartan subalgebra H of L is a nilpotent subalgebra H ⊂ L equal to its normalizer, i.e. H = NL(H).
Lemma 6.16 (Cartan subalgebras of a semisimple Lie algebra). For a semisimple Lie algebra L any maximal toral subalgebra T ⊂ L is a Cartan subalgebra of L.
Proof. i) According to Proposition 6.3 any toral subalgebra T ⊂ L is Abelian, in particular nilpotent.
ii) The Cartan decomposition of L with respect to T represents an arbitrary element x ∈ NL(T) uniquely as
x = xT + ∑_{α∈Φ} xα, xT ∈ T, xα ∈ Lα.
For all h ∈ T
[h,x] = [h,xT] + ∑_{α∈Φ} α(h) · xα = ∑_{α∈Φ} α(h) · xα ∈ T.
For any α ∈ Φ an element h ∈ T exists with α(h) ≠ 0, therefore xα = 0. We obtain x = xT.
Therefore NL(T )⊂ T . Apparently T ⊂ NL(T ). As a consequence
NL(T ) = T.
Remark 6.17 (Cartan subalgebras of a semisimple Lie algebra). For a semisimple Lie algebra L the concepts Cartan subalgebra and maximal toral subalgebra are equivalent, see [20, Chapter 15.3].
In the following we will employ the standard notation H for Cartan subalgebras and denote also a fixed maximal toral subalgebra of a semisimple Lie algebra by the letter H.
Chapter 7
Root systems
Our point of departure is the Cartan decomposition of a complex semisimple Lie algebra
L = H ⊕ ⊕_{α∈Φ} Lα,
see Definition 6.14. We separate the concept of roots from the Lie algebra as its origin and study the properties of Φ in the context of abstract root systems. Here we follow Serre's guide [27]. Different from many other authors Serre introduces a root system alone by its set of reflections. The existence of an invariant scalar product is then a consequence and not a prerequisite. Serre develops the properties of a root system by focusing on the real vector space V spanned by the root system. He does not oscillate between V and its dual V* like many other textbooks.
The base field in the present chapter is R; all vector spaces are real and finite dimensional.
7.1 Abstract root system
The ambient space of an abstract root system is a real vector space V. The reader may conceive of V as the vector space spanned by the root set of a semisimple Lie algebra. When following this conception, roots are conceived as linear functionals.
Definition 7.1 (Symmetry). Consider a vector space V. A symmetry of V with vector α ∈ V, α ≠ 0, is an automorphism
σα : V →V
with
1. σα(α) =−α
2. The subspace
Hα := {x ∈ V : σα(x) = x}
of elements fixed by σα is a hyperplane in V, i.e. codimV Hα = 1.
Apparently the hyperplane Hα is a complement of the real line R ·α ⊂ V . Andthe symmetry σα is completely determined by the choice of α and Hα .
The symmetry defines a linear functional α∗ ∈V ∗ such that for all x ∈V
σα(x) = x−α∗(x) ·α :
If
x = µ1 · α + µ2 · v, v ∈ Hα,
then
σα(x) = −µ1 · α + µ2 · v = x − 2µ1 · α.
The functional α∗ satisfies α∗(α) = 2 and ker α∗ = Hα .
Conversely, if 0 ≠ α ∈ V and a linear functional α* ∈ V* with α*(α) = 2 are given, then the definition
σα(x) := x − α*(x) · α, x ∈ V,
defines a symmetry σα with pointwise fixed hyperplane Hα := ker α*.
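The correspondence between a symmetry σα and a pair (α, α*) with α*(α) = 2 can be checked on coordinates. A small numpy sketch (the function name symmetry is ours) verifies σα(α) = −α, the pointwise fixed hyperplane, and that σα is an involution:

```python
import numpy as np

def symmetry(alpha, alpha_star):
    """The symmetry sigma_alpha(x) = x - alpha_star(x) * alpha,
    for a functional alpha_star with alpha_star(alpha) = 2."""
    assert np.isclose(alpha_star @ alpha, 2.0)
    return lambda x: x - (alpha_star @ x) * alpha

alpha = np.array([1.0, 1.0])
alpha_star = np.array([1.0, 1.0])      # alpha_star(alpha) = 1 + 1 = 2
sigma = symmetry(alpha, alpha_star)

assert np.allclose(sigma(alpha), -alpha)    # sigma(alpha) = -alpha
fixed = np.array([1.0, -1.0])               # spans ker alpha_star = H_alpha
assert np.allclose(sigma(fixed), fixed)     # the hyperplane is fixed pointwise
for x in np.eye(2):                         # sigma is an involution
    assert np.allclose(sigma(sigma(x)), x)
```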
Definition 7.2 (Root system). A root system of a vector space V is a subset Φ ⊂Vwith the following properties:
• (R1) Finite and generating: The set Φ is finite, 0 ∉ Φ, and spanR Φ = V.
• (R2) Invariance under distinguished symmetries: For each α ∈Φ a symmetry σα
of V with vector α exists which leaves Φ invariant, i.e. σα(Φ)⊂Φ .
• (R3) Cartan integers: For all α, β ∈ Φ integer values < β, α > ∈ Z exist with
σα(β) = β − < β, α > · α.
The integers
< β, α > ∈ Z, α, β ∈ Φ,
are named the Cartan integers of Φ.
• (R4) Reducedness: For each α ∈ Φ the only roots proportional to α are α itself and −α, i.e.
(R · α) ∩ Φ = {α, −α}.
The dimension of V is the rank of the root system, the elements of Φ are the rootsof V .
In the literature condition (R4) is considered an additional requirement for a reduced root system. Because all root systems in these notes will be reduced we omit the attribute reduced.
Note the particular Cartan integers < α, α > = 2 for any α ∈ Φ.
Lemma 7.3. The symmetry σα required in Definition 7.2, part (R2), is uniquely determined by (R1).
Proof. Assume two symmetries σ1, σ2 of V with vector α satisfying σi(Φ) ⊂ Φ, i = 1,2. Consider the automorphism
u := σ2 ◦ σ1.
It satisfies u(Φ) = Φ. Moreover, u induces the identity on the quotient V/Rα because
u(x) = σ2(σ1(x)) = σ2(x − α1*(x) · α) = x − α1*(x) · α − α2*(x − α1*(x) · α) · α ∈ x + Rα.
As a consequence, u has the single eigenvalue 1. Because u|Φ is a permutation of Φ an exponent n ∈ N exists with
(u|Φ)^n = id|Φ.
Because Φ spans V we obtain u^n = id.
On one hand, the minimal polynomial pmin(T) of u divides the characteristic polynomial pchar(T). Hence pmin(T) has only the zero λ = 1. On the other hand, pmin(T) - being the minimal polynomial - divides the polynomial T^n − 1 because u^n − id = 0. The latter polynomial has the value 1 as a simple zero. As a consequence
pmin(T) = T − 1
and u = id, q.e.d.
Definition 7.4 (Weyl group). The Weyl group W of a root system Φ of V is the subgroup of GL(V) generated by all symmetries σα, α ∈ Φ.
The Weyl group permutes the elements of the finite set Φ. Because Φ spans V, the restriction of w ∈ W to Φ determines w, hence W is a finite group.
Lemma 7.5 (Invariant scalar product). Let Φ be a root system of a vector space V. Then a scalar product (−,−) exists on V which is invariant under the Weyl group W of Φ. With respect to any invariant scalar product the Cartan integers satisfy
< β, α > = 2 · (β, α)/(α, α), α, β ∈ Φ.
Proof. Take an arbitrary scalar product B on V and define for x, y ∈ V the scalar product
(x, y) := ∑_{w∈W} B(w(x), w(y))
as average over the Weyl group. For two roots α, β ∈ Φ the corresponding Cartan integer is defined by
σα(β ) = β−< β ,α > α.
Because σα = σα⁻¹ and σα(α) = −α, the invariance of the scalar product gives
(σα(α), β) = (α, σα(β)),
−(α, β) = (α, β) − < β, α > · (α, α),
which proves
< β, α > = 2 · (α, β)/(α, α).
Note that the Cartan integers < β, α > are linear in the first argument β but not necessarily in the second argument α.
Now that we have a Euclidean space (V, (−,−)) of the root system Φ we can define the length of and the angle between two roots. We choose a fixed invariant scalar product.
Definition 7.6 (Length of and angle between roots). Consider a root system Φ and the corresponding Euclidean vector space (V, (−,−)).
1. The length of a root α ∈Φ is defined as
‖α‖ := √(α, α).
2. The angle 0 < θ = ∠(α, β) ≤ π included between two roots α, β ∈ Φ is defined by
cos(θ) := (α, β)/(‖α‖ · ‖β‖).
Lemma 7.7 (Possible angles and length ratios of two roots). Consider a root system Φ and two non-proportional roots α, β ∈ Φ. The only possible angles ∠(α,β) included by α and β and their ratios of length are displayed in the following table. Here the last column indicates the type of the Lie algebra with the base ∆ = {α, β}.
No. | < α,β > | < β,α > | ∠(α,β) | ‖β‖²/‖α‖² | ∆ = {α,β}
 1  |    0    |    0    |  π/2   |  undef.   | A1 × A1
 2  |    1    |    1    |  π/3   |    1      |
 3  |   −1    |   −1    |  2π/3  |    1      | A2
 4  |    1    |    2    |  π/4   |    2      |
 5  |   −1    |   −2    |  3π/4  |    2      | B2
 6  |    1    |    3    |  π/6   |    3      |
 7  |   −1    |   −3    |  5π/6  |    3      | G2

Table 7.1 Angles and length of roots
Proof. Employing the Cartan integers < α,β >,< β ,α > ∈ Z we have
4 > 4 · cos²θ = (2 · (α, β) / ‖β‖²) · (2 · (β, α) / ‖α‖²) = < α, β > · < β, α > ≥ 0.
Both Cartan integers have the same sign, and we may assume ‖β‖ ≥ ‖α‖, i.e.

|< α, β >| ≤ |< β, α >|

because

‖β‖² / ‖α‖² = < β, α > / < α, β > = |< β, α >| / |< α, β >|.
Table 7.1 shows how the included angle θ = ∠(α, β) determines the length ratio ‖β‖/‖α‖, with the only exception of the case θ = π/2, q.e.d.
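The case distinction of Lemma 7.7 can be double-checked numerically. The following sketch enumerates the seven admissible pairs of Cartan integers from Table 7.1 and recomputes the angle from 4 · cos²θ = < α,β > · < β,α > (the sign of cos θ equals the common sign of the two Cartan integers) and the squared length ratio from the quotient of the Cartan integers:

```python
import math

rows = []
for p, q in [(0, 0), (1, 1), (-1, -1), (1, 2), (-1, -2), (1, 3), (-1, -3)]:
    m = p * q                                    # = 4 cos^2(theta), in {0,1,2,3}
    cos = math.copysign(math.sqrt(m) / 2, p) if m else 0.0
    theta = math.acos(cos)                       # included angle
    ratio = q / p if p else None                 # ||beta||^2 / ||alpha||^2
    rows.append((p, q, round(theta / math.pi, 4), ratio))

for row in rows:
    print(row)
# e.g. (-1, -3, 0.8333, 3.0): angle 5*pi/6 and squared length ratio 3, as for G2
```

Each printed tuple reproduces one row of Table 7.1 (the angle is shown as a fraction of π).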
Example 7.8 (Root systems of rank ≤ 3).
• Rank = 1: The only root system is A1.
• Rank = 2: Figure 7.1 displays all root systems with rank = 2.
• Rank = 3: See the figures in [15, Chap. 8.9] as one example.
Lemma 7.9 (Roots with acute angle). If two non-proportional roots α, β ∈ Φ include an acute angle, i.e. (α, β) > 0, then also α − β ∈ Φ.
Proof. According to Table 7.1 the assumption (α,β )> 0 implies
< β ,α >= 1 or < α,β >= 1.
In the first case σα(β) = β − α, in the second case σβ(α) = α − β. In both cases α − β ∈ Φ, because the Weyl group leaves Φ invariant and with each root also its negative belongs to Φ.
Fig. 7.1 Root systems with Rank = 2
Definition 7.10 (Base of a root system, positive and negative roots). Consider aroot system Φ of a vector space V .
i) A set ∆ = {α1, ...,αr} of roots αi ∈ Φ , i = 1, ...,r, is a base of Φ and theelements of ∆ are named simple roots iff
• The family (αi)_{i=1,...,r} is a basis of V.
• Each root β ∈ Φ has a representation

β = ∑_{i=1}^{r} k_i · αi,  k_i ∈ Z,

with either all integer coefficients k_i ≥ 0 or all k_i ≤ 0.
ii) With respect to a base ∆ a root β is
• positive, β � 0, iff all ki ≥ 0• negative, β ≺ 0, iff all ki ≤ 0.
The subset Φ+ ⊂ Φ is defined as the set of all positive roots and the subset Φ− ⊂ Φ as the set of all negative roots.
Theorem 7.11 (Existence of a base). Every root system Φ of a vector space V hasa base ∆ .
Proof. The construction of a candidate for ∆ is straightforward. But the proof that the candidate is indeed a base will take several steps.
i) Construction of ∆: Because Φ is finite, a linear functional t ∈ V∗ exists with t(α) ≠ 0 for all α ∈ Φ. Set

Φ_t^+ := {α ∈ Φ : t(α) > 0}

and call α ∈ Φ_t^+ decomposable iff

α = α1 + α2 with α1, α2 ∈ Φ_t^+

and indecomposable otherwise. We claim that the set

∆ := {α ∈ Φ_t^+ : α indecomposable}
is a base of Φ .
ii) Representation of elements from Φ_t^+: Each β ∈ Φ_t^+ has the form

β = ∑_{α∈∆} k_α · α

with all k_α ≥ 0: Otherwise consider the set C ≠ ∅ of all β ∈ Φ_t^+ which lack such a representation. Choose an element β ∈ C with t(β) > 0 minimal. By construction β is decomposable. Hence β = β1 + β2 with β1, β2 ∈ Φ_t^+ and β1 ∈ C or β2 ∈ C. We get

t(β) = t(β1) + t(β2)

which implies

0 < t(β1), t(β2) < t(β),

a contradiction to the minimality of t(β).
iii) Angle between elements from ∆: We claim that two different roots α ≠ β from ∆ are either orthogonal or include an obtuse angle, i.e. (α, β) ≤ 0. Otherwise (α, β) > 0, and Lemma 7.9 provides the root γ := α − β ∈ Φ. If γ ∈ Φ_t^+, then α = β + γ is decomposable, contradicting α ∈ ∆. Hence −γ ∈ Φ_t^+, which implies that β = α + (−γ) is decomposable, contradicting β ∈ ∆. The contradiction proves the claim.
iv) Linear independence: A finite subset A ⊂ V whose elements α, β ∈ A satisfy

t(α) > 0 and (α, β) ≤ 0 for α ≠ β

is linearly independent. Assume the existence of a representation
0 = ∑_{α∈A} n_α · α
with coefficients n_α ∈ R for all α ∈ A. Separating summands with positive coefficients from those with negative coefficients gives an equation
∑_{α∈A1} k_α · α = ∑_{α∈A2} k_α · α =: v ∈ V
with disjoint subsets A1,A2 ⊂ A and all kα ≥ 0. Then
(v, v) = ∑_{α∈A1, β∈A2} k_α · k_β · (α, β) ≤ 0.
Hence v = 0. Now

0 = t(v) = ∑_{α∈A1} k_α · t(α)

with t(α) > 0 for all α ∈ A1 implies k_α = 0 for all α ∈ A1. Similarly k_α = 0 for all α ∈ A2.
The sequence of all steps i) until iv) proves the claim of the theorem, q.e.d.
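Step i) of the proof is easy to carry out on a computer. The following sketch does it for the root system A2, realized (an illustrative choice, not taken from the text) inside Z³ with roots e_i − e_j, and with a concrete functional t that is nonzero on all roots:

```python
def e(i, j):
    # the root e_i - e_j of A2 inside Z^3
    v = [0, 0, 0]
    v[i], v[j] = 1, -1
    return tuple(v)

roots = [e(i, j) for i in range(3) for j in range(3) if i != j]

def t(v):
    # a functional with t(alpha) != 0 for every root alpha
    return 3 * v[0] + 2 * v[1] + 1 * v[2]

pos = [a for a in roots if t(a) > 0]            # Phi_t^+

def decomposable(a):
    # does a = a1 + a2 hold with a1, a2 in Phi_t^+ ?
    return any(tuple(x + y for x, y in zip(b, c)) == a
               for b in pos for c in pos)

base = sorted(a for a in pos if not decomposable(a))
print(base)  # prints [(0, 1, -1), (1, -1, 0)], i.e. e_1 - e_2 and e_0 - e_1
```

The two indecomposable positive roots found are the familiar simple roots of A2; the third positive root e_0 − e_2 is their sum and is discarded as decomposable.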
The proof of Theorem 7.11 constructs a base ∆ by starting from a certain functional t ∈ V∗. We show that any base can be obtained in this way.
Lemma 7.12. Consider a base ∆ of a root system Φ of a vector space V .
Then a functional t ∈ V∗ exists with t(α) ≠ 0 for all α ∈ Φ such that

Φ+ = Φ_t^+ := {α ∈ Φ : t(α) > 0}

and

∆ = {α ∈ Φ_t^+ : α indecomposable}.
Proof. With respect to ∆ we have the splitting
Φ = Φ+ ∪ Φ−.
In particular, the elements from ∆ form a basis of the vector space V. Hence a functional t ∈ V∗ exists with t(α) > 0 for all α ∈ ∆. Set
Φ_t^+ := {α ∈ Φ : t(α) > 0},  Φ_t^− := {α ∈ Φ : t(α) < 0}.
From

∆ ⊂ Φ_t^+

follows

Φ+ ⊂ Φ_t^+ and Φ− ⊂ Φ_t^−.
And the decomposition

Φ+ ∪ Φ− = Φ = Φ_t^+ ∪ Φ_t^−

implies

Φ+ = Φ_t^+ and Φ− = Φ_t^−.
As a consequence, the indecomposable elements of Φ+ and of Φ_t^+ coincide, i.e. ∆ = {α ∈ Φ_t^+ : α indecomposable}, q.e.d.
Corollary 7.13 (Simple roots include an obtuse angle). Consider a base ∆ of aroot system Φ .
Any two different roots α ≠ β ∈ ∆ are either orthogonal or include an obtuse angle, i.e. (α, β) ≤ 0.
Proof. According to Lemma 7.12 a suitable functional t ∈V ∗ exists with
∆ = {α ∈Φ : t(α)> 0 and α indecomposable}.
Part iii) in the proof of Theorem 7.11 shows (α, β) ≤ 0, because all elements from ∆ belong to the same half-space Φ_t^+, q.e.d.
7.2 Action of the Weyl group
Our aim in the present chapter is the classification of all possible root systems. Thisresult will be achieved by two means.
The first means is the integrality of the Cartan integers of a root system Φ . Thisfact restricts the angle and the relative length of two roots, see Lemma 7.7. Thesecond means is the Weyl group which identifies equivalent representations of thesame root system Φ . Besides the action of the Weyl group W on Φ
W × Φ → Φ, (w, α) ↦ w(α)
we study the action on the symmetries σα , on the Cartan integers < α,β > and onthe bases ∆ of Φ . We will show: A root system is characterized by the matrix ofits Cartan integers, the Cartan matrix. Only finitely many types of Cartan matricesexist.
Proposition 7.14 (Action of the Weyl group on the roots). Consider a root system Φ
of a vector space V and denote by W its Weyl group. Denote by ∆ a fixed base of Φ .Then
1. Mapping ∆ to half-spaces: For any functional t ∈ V ∗ an element w ∈ W existswith
w(t)(α)≥ 0
for all α ∈ ∆ .
2. Transitive action on bases: Consider a second base ∆ ′ of Φ . Then an element w ∈Wexists with
w(∆) = ∆′.
3. Any root belongs to a base: For any root α ∈Φ an element w ∈W exists with
w(α) ∈ ∆ .
4. Symmetries of roots from ∆ are generators: The Weyl group W is generated bythe symmetries σα of the roots α ∈ ∆ .
Proof. Denote by W∆ ⊂ W the subgroup generated by the symmetries σα of theroots α ∈ ∆ .
i) The symmetry σα of any root α ∈∆ leaves the set Φ+ \{α} invariant: Assumethat ∆ comprises at least two roots. Consider an element β ∈Φ+ \{α}. It has arepresentation
β = ∑_{γ∈∆} k_γ · γ,  k_γ ≥ 0 for all γ ∈ ∆.
Because β is not proportional to α we have kγ > 0 for at least one γ ∈ ∆ \{α}. Weget
σα(β) = β − < β, α > α = (∑_{γ∈∆} k_γ · γ) − < β, α > α = (k_α − < β, α >) · α + ∑_{γ∈∆\{α}} k_γ · γ.
Hence also σα(β) is not proportional to α and has at least one coefficient k_γ > 0. According to Definition 7.10, part ii), all coefficients are non-negative, hence σα(β) ∈ Φ+ \ {α}.
ii) Consider the distinguished element, half the sum of all positive roots,

ρ := (1/2) · ∑_{β∈Φ+} β.
According to part i) each symmetry σα ,α ∈ ∆ , permutes all positive roots differentfrom α and σα(α) =−α . Hence
σα(ρ) = ρ−α.
iii) Part 1) of the Proposition is satisfied even with an element w ∈ W∆: For a given functional t ∈ V∗ we choose an element w ∈ W∆ with w(t)(ρ) maximal, in particular

w(t)(ρ) ≥ (σα ◦ w)(t)(ρ) = w(t)(σα(ρ))
for all α ∈ ∆ . Using part ii) we get
w(t)(ρ) ≥ w(t)(σα(ρ)) = w(t)(ρ − α) = w(t)(ρ) − w(t)(α),
which implies w(t)(α)≥ 0.
iv) Part 2) of the Proposition is satisfied even with an element w ∈ W∆: According to Lemma 7.12 a functional t′ ∈ V∗ exists with t′(α′) > 0 for all α′ ∈ ∆′. Due to part iii) an element w ∈ W∆ exists such that the functional t := w(t′) satisfies
t(α)≥ 0
for all α ∈ ∆ . We havet(α) = t ′(w−1(α)).
With α ∈ Φ also w^{-1}(α) runs through the roots of Φ. Because t′(β) ≠ 0 for all roots β ∈ Φ, also t(α) ≠ 0 for all roots α ∈ Φ, and t(α) > 0 for all α ∈ ∆. Due to Lemma 7.12 the bases ∆′ and ∆ are induced from the functionals t′ and t, respectively. From t := w(t′) follows w(∆′) = ∆, hence ∆′ = w^{-1}(∆) with w^{-1} ∈ W∆.
v) Part 3) of the Proposition is satisfied even with an element w ∈ W∆: For fixed α ∈ Φ we find an element t0 ∈ V∗ with t0(α) = 0 but t0(β) ≠ 0 for all roots β not proportional to α.
A functional t ∈ V∗ close to t0 exists with t(α) = ε > 0 and |t(β)| > ε for all roots β not proportional to α. Hence α ∈ Φ_t^+ is indecomposable: in a decomposition α = β1 + β2 with βi ∈ Φ_t^+ each summand would satisfy 0 < t(βi) < ε, while |t(βi)| > ε for all roots not proportional to α. Denote by ∆(t) the base of Φ induced by t according to Theorem 7.11. By part iv) an element w ∈ W∆ exists with w(∆(t)) = ∆. From α ∈ ∆(t) we obtain w(α) ∈ ∆.
vi) We prove W∆ = W: It suffices to show σα ∈ W∆ for all α ∈ Φ. We choose an arbitrary root α ∈ Φ. According to part v) an element w ∈ W∆ exists with β := w(α) ∈ ∆.
Then

σβ = σ_{w(α)} = w ◦ σα ◦ w^{-1}, hence

σα = w^{-1} ◦ σβ ◦ w ∈ W∆, q.e.d.
Lemma 7.15 (Action of the Weyl group on Cartan integers and on symmetries).Consider a root system Φ of a vector space V with Weyl group W . Let α,β ∈ Φ
denote two roots with w(α) = β for a suitable w ∈W .
1. The Cartan integers are linear in the first argument. The corresponding linearfunctionals transform as
<−,β >=<−,α > ◦w−1.
2. The corresponding symmetries transform as
σβ = w◦σα ◦w−1.
Proof. Choose a scalar product (−,−) on V invariant with respect to W , seeLemma 7.5. Then
< −, β > = 2 · (−, β) / (β, β) = 2 · (−, w(α)) / (β, β) = 2 · (w^{-1}(−), α) / (α, α) = < −, α > ◦ w^{-1},

using the invariance (−, w(α)) = (w^{-1}(−), α) and (β, β) = (w(α), w(α)) = (α, α);
and accordingly for any v ∈V : On the left-hand side
σβ(v) = v − < v, β > · β = w(w^{-1}(v)) − < w^{-1}(v), α > · w(α) = w(w^{-1}(v) − < w^{-1}(v), α > · α).
And on the right-hand side
(w ◦ σα ◦ w^{-1})(v) = w(σα(w^{-1}(v))) = w(w^{-1}(v) − < w^{-1}(v), α > · α), q.e.d.
In accordance with Lemma 7.15 about the action on the Cartan integers onedefines the action of W on the dual space V ∗:
W × V∗ → V∗, (w, t) ↦ w(t) := t ◦ w^{-1}.
Definition 7.16 (Cartan matrix). Consider a root system Φ of a vector space V anda base ∆ = {α1, ...,αr} of Φ .
The Cartan matrix of ∆ is the matrix of the Cartan integers of the roots from ∆
Cartan(∆) := (< αi, αj >)_{1≤i,j≤r} ∈ M(r × r, Z).
Note that the Cartan matrix is not necessarily symmetric. All diagonal elementsof the Cartan matrix have the value
< αi,αi >= 2.
For i ≠ j only values < αi, αj > ≤ 0
are possible according to Corollary 7.13. Moreover, these values are restricted to theset
{0,−1,−2,−3}
according to Lemma 7.7.
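As a sketch of these constraints, the Cartan matrix of a base of B2 can be computed directly from an explicit realization (a standard choice, assumed here for illustration) with a long root α1 = (1, −1) and a short root α2 = (0, 1) in R²:

```python
from fractions import Fraction

alpha1, alpha2 = (1, -1), (0, 1)      # assumed base of B2 in R^2

def ip(x, y):
    # standard scalar product, exact over the rationals
    return sum(Fraction(a) * b for a, b in zip(x, y))

def cartan(a, b):
    # < a, b > = 2 (a, b) / (b, b), an integer for roots of a root system
    return int(2 * ip(a, b) / ip(b, b))

delta = [alpha1, alpha2]
matrix = [[cartan(a, b) for b in delta] for a in delta]
print(matrix)  # prints [[2, -2], [-1, 2]]: diagonal 2, off-diagonal in {0,-1,-2,-3}
```

The resulting matrix is not symmetric, as noted above, and its off-diagonal entries lie in the admissible set {0, −1, −2, −3}.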
The Cartan matrix is defined with reference to a base ∆ and with reference toa numbering of its elements. Lemma 7.17 shows that for any two bases of Φ anynumbering of the elements of the first base induces a numbering of the elements ofthe second base such that the respective Cartan matrices are equal.
Lemma 7.17. Any two bases ∆ ,∆ ′ of a root system Φ have the same Cartan matrix.
More specifically, an element w∈W of the Weyl group of Φ exists with w(∆) = ∆ ′
and
Cartan(∆) = (< αi, αj >)_{1≤i,j≤r} = Cartan(∆′) = (< w(αi), w(αj) >)_{1≤i,j≤r},  r = rank Φ.
Proof. According to Proposition 7.14 an element w∈W exists with w(∆) = ∆ ′, i.e.
∆ = {α1, ...,αr} =⇒ ∆′ = {w(α1), ...,w(αr)}.
According to Lemma 7.15
<−,w(α j)>=<−,α j > ◦w−1
which implies
< w(αi),w(α j)>=< w−1(w(αi)),α j >=< αi,α j >,q.e.d.
The Cartan matrix encodes the full information of the root system Φ , notably thedimension of its ambient space V :
Proposition 7.18 (Characterization of the root system by its Cartan matrix). Consider a root system Φ of a Euclidean space V and a base ∆ = {α1, ..., αr} of Φ.
If ∆ ′ = {β1, ...,βr} is a base of a root system Φ ′ of a second Euclidean space V ′
and

f : ∆ → ∆′
a bijective map with βi = f (αi), i = 1, ...,r, and
Cartan(∆) = (< αi, αj >)_{1≤i,j≤r} = (< βi, βj >)_{1≤i,j≤r} = Cartan(∆′),
then a unique isomorphism F : V → V ′ of vector spaces exists with F(Φ) = Φ ′
and F |∆ = f .
Proof. i) Determination of F : Because the elements from ∆ form a basis of thevector space V we may define
F : V →V ′
as the uniquely determined linear extension of f . And because ∆ and ∆ ′ have thesame cardinality the linear map F is an isomorphism.
ii) We show that the Weyl groups W and W ′ are conjugate via F , i.e.
W ′ = F ◦W ◦F−1 or W ′ ◦F = F ◦W :
Consider two simple roots α,β ∈ ∆ and the corresponding symmetries:
(σ_{f(α)} ◦ F)(β) = σ_{f(α)}(f(β)) = f(β) − < f(β), f(α) > f(α)

(F ◦ σα)(β) = F(σα(β)) = F(β − < β, α > α) = f(β) − < β, α > f(α)

Because the Cartan matrices coincide, < f(β), f(α) > = < β, α >.
Hence for every root α ∈ ∆
σ f (α) ◦F = F ◦σα .
Moreover, for two roots α1,α2 ∈ ∆ :
(σ_{f(α2)} ◦ σ_{f(α1)}) ◦ F = σ_{f(α2)} ◦ (σ_{f(α1)} ◦ F) = σ_{f(α2)} ◦ (F ◦ σ_{α1}) = (σ_{f(α2)} ◦ F) ◦ σ_{α1} = (F ◦ σ_{α2}) ◦ σ_{α1} = F ◦ (σ_{α2} ◦ σ_{α1}).
The Weyl groups W and W ′ are generated by the symmetries of the elements fromrespectively ∆ and ∆ ′, see Proposition 7.14. Hence
W ′ ◦F = F ◦W .
iii) We show F(Φ) ⊂ Φ′: Consider an arbitrary root β ∈ Φ. According to Proposition 7.14 elements α ∈ ∆ and w ∈ W exist with
β = w(α).
Due to part ii) an element w′ ∈ W′ exists such that

F(β) = (F ◦ w)(α) = (w′ ◦ F)(α) = w′(f(α)) ∈ w′(∆′) ⊂ Φ′.

The same argument applied to F^{-1} shows F^{-1}(Φ′) ⊂ Φ, hence F(Φ) = Φ′, q.e.d.
7.3 Coxeter graph
There is a simple data structure from discrete mathematics derived from the Cartan matrix of a root system. The data structure is a weighted graph called the Coxeter graph of the root system. Recall that the angle θ between two distinct roots α ≠ β ∈ ∆ is determined by

< α, β > · < β, α > = 4 · cos²(θ) ∈ {0,1,2,3}

with π/2 ≤ θ < π.
Definition 7.19 (Coxeter graph). Consider a root system Φ of a vector space V with a base ∆. The Coxeter graph of Φ with respect to ∆ is the undirected weighted graph
Coxeter(Φ) = (N,E)
with
• vertex set N := ∆
• and set E of weighted edges: A weighted edge (e, m) ∈ E is a pair with an edge e = {α, β} joining two vertices in N iff

α ≠ β and < α, β > · < β, α > ≠ 0.

The corresponding edge weight is defined as m := < α, β > · < β, α >.
If (−,−) denotes an invariant scalar product of Φ then
< α, β > · < β, α > = 4 · cos²θ

with θ the angle included by α and β satisfying

(α, β) = ‖α‖ · ‖β‖ · cos θ,  0 < θ ≤ π.
According to Lemma 7.7 we have m ∈ {1,2,3}.
Theorem 7.20 (Classification of connected Coxeter graphs). If the Coxeter graph of a root system Φ with respect to a base ∆ is connected then it belongs to exactly one of the classes from Figure 7.2 - for a suitable numbering of the elements of the base:

• Series Ar, r ≥ 1: Each pair of subsequent roots includes the angle 2π/3.

• Series Br, r ≥ 2, or series Cr, r ≥ 3: The first r − 2 pairs of subsequent roots include the angle 2π/3, the last two roots include the angle 3π/4.

• Series Dr, r ≥ 4: The first r − 3 pairs of subsequent roots include the angle 2π/3, root α_{r−2} includes with each of the roots α_{r−1} and α_r the angle 2π/3.

• G2: ∆ = (α1, α2). The two roots include the angle 5π/6.

• F4: ∆ = (α1, ..., α4). The pairs (α1, α2) and (α3, α4) include the angle 2π/3, the pair (α2, α3) includes the angle 3π/4.
Fig. 7.2 Connected Coxeter graphs (annotation for weight = 1 suppressed)
• Series Er, r ∈ {6,7,8}: ∆ = (α1, ..., αr). The first r − 2 pairs of subsequent roots include the angle 2π/3. In addition, the distinguished root α3 includes the angle 2π/3 with the root αr.
The range of r for the types Ar, Br, Cr, Dr has been chosen to avoid duplicates for low values of r.
Note. The Coxeter graph, employing the product of Cartan integers as its weights, does not encode the relative length of two roots, i.e. their length ratio. The length ratio derives from the quotient of the Cartan integers

‖β‖² / ‖α‖² = < β, α > / < α, β >.
Hence the length ratio derives from the Cartan matrix, but the Coxeter graph does not encode the full information about the root system. We will see that a base of the root systems belonging to the types Br, Cr, G2, F4 is made up of roots with different lengths. As a consequence, after complementing the Coxeter graph by the information about the length ratio, the series Br and Cr of root systems will differ for r ≥ 3, see Theorem 7.25.
But in any case, the (absolute) length of a root is not defined because an invariant scalar product of a root system is not uniquely determined.
For the proof of Theorem 7.20 we capture the characteristic properties of the Coxeter graph in the following definition, which embeds the graph into a finite-dimensional Euclidean space. With respect to an invariant scalar product (−,−) of a root system Φ the angle

θ := ∠(α, β),  α, β ∈ Φ,

is determined by

cos θ = (α, β) / (‖α‖ · ‖β‖) or 4 · cos²(θ) = 4 · (α, β)² / (‖α‖² · ‖β‖²).
Definition 7.21. Consider a Euclidean space (V, (−,−)). An undirected weighted connected graph (N, E) in V is admissible if

• vertex set

N = {v1, ..., vr} ⊂ V

with a linearly independent family (vi)_{i=1,...,r} of unit vectors satisfying (vi, vj) ≤ 0 for 1 ≤ i ≠ j ≤ r

• and edge set

E = {(e, m) : e = {vi, vj} iff i ≠ j and (vi, vj) ≠ 0, with weight m := 4 · (vi, vj)² satisfying m ∈ {1,2,3}}.
Proof (of Theorem 7.20). The proof will classify the connected admissible graphsof all root systems Φ .
i) Any subgraph of an admissible graph obtained by removing a subset of verticesand their incident edges is admissible.
ii) An admissible graph has fewer edges than vertices:
Consider the element
v := ∑_{i=1}^{r} vi ∈ V.
Then v ≠ 0 because the family (vi)_{i=1,...,r} is linearly independent. We get
0 < (v, v) = ∑_{i=1}^{r} (vi, vi) + 2 · ∑_{1≤i<j≤r} (vi, vj) = r + 2 · ∑_{1≤i<j≤r} (vi, vj).
For an edge {vi, vj} with weight m = 4 · (vi, vj)² ∈ {1,2,3} and (vi, vj) ≤ 0 follows

2 · (vi, vj) ≤ −1,

therefore

0 < r − |E|, i.e. |E| < r.
iii) An admissible graph has no cycles: A cycle would be an admissible graph according to part i). But the number of its vertices equals the number of its edges, in contradiction to part ii).
iv) For any vertex of an admissible graph the weighted sum of incident edges is at most 3: Denote by
Inc(v) := {(e,m) ∈ E : e incident with v}
the set of weighted edges incident to vertex v. We have to show
∑_{(e,m)∈Inc(v)} m ≤ 3.
Denote by {w1, ...,wk} the set of vertices adjacent to v.
Fig. 7.3 Vertices adjacent to v
Because an admissible graph is cycle free due to part iii) we have
(wi,w j) = 0
for 1 ≤ i ≠ j ≤ k, i.e. the family (wi)_{i=1,...,k} is orthonormal in (V, (−,−)). The family (v, w1, ..., wk) is linearly independent as a subfamily of the linearly independent family of all vertices. Choose a vector
w0 ∈ span < v,w1, ...,wk >
of unit length, orthogonal to (wi)_{i=1,...,k}, and represent
v = ∑_{i=0}^{k} (v, wi) · wi.
Here (v, w0) ≠ 0 because v is linearly independent from (wi)_{i=1,...,k}. We obtain
1 = (v, v) = ∑_{i=0}^{k} (v, wi)²

and

∑_{i=1}^{k} (v, wi)² < 1.
As a consequence
∑_{(e,m)∈Inc(v)} m = ∑_{i=1}^{k} 4 · (v, wi)² < 4.
v) Blowing down a simple path, i.e. all its edges have weight = 1, results in a newadmissible graph (N′,E ′).
Fig. 7.4 Blow-down of a simple path
Denote by C = {w1, ...,wn} ⊂ N the vertices of the path. By assumption for i =1, ...,n−1
4 · (wi, wi+1)² = 1, i.e. 2 · (wi, wi+1) = −1.
The graph (N′, E′) resulting from blowing down the original path has vertex set

N′ = (N \ C) ∪ {w0} with the new vertex w0 := ∑_{i=1}^{n} wi,

and edge set

E′ = E′_1 ∪ E′_2
E′_1 := {(e, m) ∈ E : e = {vi, vj}, vi, vj ∈ N \ C}

E′_2 := {(e′, m) : e′ = {v, w0}, v ∈ N \ C, and (e, m) ∈ E exists with e = {v, wi} for some i = 1, ..., n}.
We show that the graph (N′,E ′) in V is admissible:
Linear independence of the vertex set N′ is obvious. Moreover, we compute
(w0, w0) = ∑_{1≤i,j≤n} (wi, wj) = ∑_{i=1}^{n} (wi, wi) + 2 · ∑_{1≤i<j≤n} (wi, wj) =

= n + 2 · ∑_{i=1}^{n−1} (wi, wi+1) = n − (n−1) = 1.

Here (wi, wj) = 0 for j > i + 1, because an edge between non-subsequent vertices of the path would close a cycle.
In (N,E) any vertex w from N \C is adjacent to at most one vertex from the path,because an admissible graph is cycle free according to part iii). Hence
• either (w, w0) = 0
• or exactly one index i = 1, ..., n exists with (w, wi) ≠ 0, and then (w, w0) = (w, wi).

In either case holds

4 · (w, w0)² ∈ {0,1,2,3}.
vi) A connected admissible graph does not contain any subgraph of the form:
• a) A vertex with more than 3 adjacent vertices.
• b) Two edges with weight = 2.
• c) One edge with weight = 2 and one vertex with three incident edges.
• d) Two distinct vertices, both having three incident edges.
Case a) directly contradicts part iv). In any of the subgraphs b)-d) it would be possible to blow down a path to a vertex with weighted sum of incident edges ≥ 4, which contradicts parts iv) and v).
vii) Any connected admissible graph belongs to one of the following types:
• a) All edges have weight = 1, no vertex has 3 incident edges.• b) A single edge has weight = 2, all other edges have weight = 1.• c) A single edge, it has weight 3.• d) All edges have weight = 1, a single vertex has 3 incident edges, each with
weight = 1.
The result follows from excluding the subgraphs from part vi).

viii) All connected admissible graphs for type vii,a) belong to series Ar: Obvious.
ix) All connected admissible graphs for type vii,b) belong to series Br or F4:
Consider the two vectors
Fig. 7.5 Non-admissible graphs
u := ∑_{i=1}^{p} i · ui and v := ∑_{i=1}^{q} i · vi.
They are linearly independent. Using 2 · (ui, ui+1) = −1 we compute

(u, u) = ∑_{1≤i,j≤p} i · j · (ui, uj) = ∑_{i=1}^{p} i² · (ui, ui) + 2 · ∑_{i=1}^{p−1} i(i+1) · (ui, ui+1) =

= ∑_{i=1}^{p} i² + 2 · ∑_{i=1}^{p−1} (−1/2) · i(i+1) = ∑_{i=1}^{p} i² − ∑_{i=1}^{p−1} i² − ∑_{i=1}^{p−1} i =

= p² − (1/2) · p(p−1) = (p/2)(p+1).
Analogously

(v, v) = (q/2)(q+1).
Because 4 · (up, vq)² = 2 we obtain

(u, v)² = (p · up, q · vq)² = p² · q² · (up, vq)² = (1/2) · p² · q².
Employing the Cauchy-Schwarz inequality (u, v)² < (u, u) · (v, v), strict because u and v are linearly independent, gives
Fig. 7.6 Admissible graphs
(1/2) · p²q² < (p/2)(p+1) · (q/2)(q+1).

Multiplying both sides with 2 implies

p² · q² < (1/2)p(p+1)q(q+1) = (1/2)pq(p+1)(q+1)

pq < (1/2)(p+1)(q+1) = (1/2)pq + (1/2)(p+q+1)

pq < p + q + 1

(p−1)(q−1) − 2 < 0

and eventually

(p−1)(q−1) < 2.
This restriction allows only the possibilities
(p, q) = (1, q ≥ 2),  (p, q) = (p ≥ 2, 1),  (p, q) = (2, 2),  (p, q) = (1, 1).
The first two give the same Coxeter graph. It is of type Br, r ≥ 3, equal to type Cr. The third possibility gives F4. The fourth possibility is of type B2.
x) The only connected admissible graph containing an edge with weight = 3 isthe graph G2: This result follows from part iv).
xi) All connected admissible graphs for type vii,d) belong to series Dr orseries Er,r ∈ {6,7,8}:
As in the proof of part ix) we define the vectors
u := ∑_{i=1}^{r−1} i · ui,  v := ∑_{i=1}^{q−1} i · vi  and  w := ∑_{i=1}^{p−1} i · wi.
Here x denotes the branch vertex of the graph, whose three arms carry the vertices ui, vi, wi. Note r, q, p ≥ 2. The three vectors (u, v, w) are pairwise orthogonal and the four vectors (u, v, w, x) are linearly independent. Denote by θ1, θ2, θ3 the respective angles between the vector x and u, v, w. Similar to the proof of part iv) we obtain
1 > ∑_{i=1}^{3} cos²(θi).
Analogously to the proof of part ix) we have

(u, u) = (r/2)(r−1),  (v, v) = (q/2)(q−1),  (w, w) = (p/2)(p−1).
Using in addition 4 · (x, u_{r−1})² = 1 we obtain

cos²(θ1) = (x, u)² / (‖x‖² · ‖u‖²) = (r−1)² · (x, u_{r−1})² / ‖u‖² = ((r−1)² · (1/4)) / ((r/2)(r−1)) = (1/2) · (1 − (1/r))
and analogously for cos²(θ2) and cos²(θ3). Hence

1 > ∑_{i=1}^{3} cos²(θi) = (1/2) · [(1 − (1/r)) + (1 − (1/q)) + (1 − (1/p))]

or

1/r + 1/q + 1/p > 1.
W.l.o.g. we may assume

p ≤ q ≤ r

and p ≥ 2, because p = 1 reproduces type Ar. Hence
1 < 1/r + 1/q + 1/p ≤ 3/p ≤ 3/2

which implies

1 < 3/p, or 2 ≤ p < 3, i.e.

p = 2.
We obtain
1 < 1/r + 1/q + 1/2, or 1/2 < 1/r + 1/q ≤ 2/q,

hence

2 ≤ q < 4.
In case q = 3 we have r < 6. In case q = 2 the parameter r may have any value ≥ 2.
Summing up: When r ≥ q≥ p then the only possibilities for (r,q, p) are
(5, 3, 2),  (4, 3, 2),  (3, 3, 2),  (r ≥ 2, 2, 2).
These possibilities refer to the series E8,E7,E6 or to the series Dr, q.e.d.
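The two elementary inequalities at the heart of parts ix) and xi) can be checked by brute-force enumeration. The following sketch (the upper bounds are arbitrary cut-offs) confirms that the only solutions are the ones listed above:

```python
from fractions import Fraction

# Part ix): (p-1)(q-1) < 2 forces p = 1, q = 1, or (p, q) = (2, 2).
sols_b = [(p, q) for p in range(1, 20) for q in range(1, 20)
          if (p - 1) * (q - 1) < 2]
assert all(p == 1 or q == 1 or (p, q) == (2, 2) for p, q in sols_b)

# Part xi): 1/r + 1/q + 1/p > 1 with r >= q >= p >= 2.
sols_d = [(r, q, p) for p in range(2, 30) for q in range(p, 30)
          for r in range(q, 30)
          if Fraction(1, r) + Fraction(1, q) + Fraction(1, p) > 1]

# Besides the D-series solutions (r, 2, 2) only E6, E7, E8 remain.
exceptional = sorted(s for s in sols_d if s[1] != 2)
print(exceptional)  # prints [(3, 3, 2), (4, 3, 2), (5, 3, 2)]
```

Exact rational arithmetic via `Fraction` avoids any floating-point doubt in the strict inequality.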
Definition 7.22 (Irreducible root system). Consider a root system Φ of a vectorspace V and denote by (−,−) a scalar product on V invariant with respect to theWeyl group W of Φ .
The root system Φ is reducible if a decomposition
Φ = Φ1 ∪ Φ2,  Φ1 ≠ ∅, Φ2 ≠ ∅,
exists with (Φ1, Φ2) = 0. Otherwise Φ is irreducible. Analogously defined are the terms reducible and irreducible for a base ∆ of Φ.
Proposition 7.23 (Irreducibility and connectedness of the Coxeter graph). Consider a root system Φ of a vector space V and a base ∆ of Φ.
i) Φ is irreducible if and only if ∆ is irreducible.
ii) Φ is irreducible if and only if its Coxeter graph is connected.
Proof. i) Suppose Φ reducible with decomposition
Φ = Φ1 ∪ Φ2,  Φ1 ≠ ∅, Φ2 ≠ ∅.
Define ∆i := Φi∩∆ , i = 1,2. Then
∆ = ∆1∪∆2
and (∆1, ∆2) = 0. Assume ∆1 = ∅; then ∆ = ∆2 ⊂ Φ2 and (Φ1, ∆2) = (Φ1, Φ2) = 0. Because V = span ∆ = span ∆2 we even get
(Φ1,V ) = (Φ1,∆2) = 0.
Therefore Φ1 = ∅ by nondegeneracy of the scalar product, which is excluded. As a consequence ∆1 ≠ ∅ and similarly ∆2 ≠ ∅. The decomposition
∆ = ∆1∪∆2
proves the reducibility of ∆ .
For the opposite direction suppose ∆ reducible with decomposition
∆ = ∆1∪∆2.
Denote by W the Weyl group of Φ . Define
Φi := W (∆i); i = 1,2.
According to Proposition 7.14 any root β ∈ Φ has the form β = w(α) for suitable w ∈ W and α ∈ ∆. Therefore
Φ = Φ1∪Φ2.
The Weyl group is generated by the symmetries σα ,α ∈ ∆ . Explicit calculationshows:
• The orthogonality (∆1,∆2) = 0 implies that for two roots αi ∈ ∆i : i = 1,2 thecorresponding symmetries σα1 and σα2 commute.
• If α1 ∈ ∆1 then for a root α ∈ span ∆1 also σα1(α) ∈ span ∆1.• If α2 ∈ ∆2 then σα2(α) = α for any root α ∈ span ∆1.
As a consequence W(∆1) ⊂ span ∆1 and similarly W(∆2) ⊂ span ∆2. The orthogonality (∆1, ∆2) = 0 implies the orthogonality

(Φ1, Φ2) = 0, notably Φ1 ∩ Φ2 = ∅.
Because ∆i ≠ ∅ and id ∈ W, also Φi ≠ ∅, i = 1,2. Therefore Φ is reducible with decomposition
Φ = Φ1∪Φ2.
ii) The statement of part ii) follows from part i): by definition two simple roots are joined by an edge of the Coxeter graph iff they are not orthogonal, hence ∆ is irreducible iff the Coxeter graph is connected, q.e.d.
The Dynkin diagram of a root system complements the Coxeter graph by the information about the length ratio of the elements from a base. This information can be encoded by an orientation of the edges, pointing from the long root to the short root in case two non-orthogonal roots have different length.
Definition 7.24 (Dynkin diagram of a root system). Consider a root system Φ andits Coxeter graph Coxeter(Φ) = (N,E). The Dynkin diagram of Φ is the directedweighted graph
Dynkin(Φ) := (N,ED)
with
• Vertex set: N = ∆.
• Edge set: Edge (e, m) ∈ ED iff e = (α, β) with

< α, β > · < β, α > ≠ 0 and ‖α‖² / ‖β‖² = < α, β > / < β, α > ≥ 1.

In this case m := < α, β > · < β, α > satisfies m ∈ {1,2,3}.
Note: The Coxeter graph and the Dynkin diagram of a root system have the sameset of vertices. In the Dynkin diagram an oriented edge e = (α,β ) is a pair; it isoriented from the long root α to the short root β , ‖α‖ ≥ ‖β‖. If both roots have thesame length, then also the edge with the inverse orientation belongs to the Dynkindiagram. In this case the two oriented edges from the Dynkin diagram are equivalentto the corresponding single undirected edge from the Coxeter graph. Therefore,often the figure of the Dynkin diagram suppresses the arrows of the two oppositeorientations.
Theorem 7.25 (Classification of connected Dynkin diagrams). Consider an irreducible root system Φ. Then its Dynkin diagram belongs to one of the following types, see Figure 7.7:
• Series Ar,r ≥ 1• Series Br,r ≥ 2• Series Cr,r ≥ 3• Series Dr,r ≥ 4• G2• F4• Series Er,r ∈ {6,7,8}
Fig. 7.7 The Dynkin diagrams of irreducible root systems
Notably, the two types Br and Cr are distinguished by the length ratio of their roots.
Proof. The statement follows from the classification of Coxeter graphs according to Theorem 7.20 and the restriction of the length ratio of two roots from a base according to Lemma 7.7: A weighted edge (e, m) from the Coxeter graph links two roots with length ratio ≠ 1 if and only if m ∈ {2,3}. Therefore the Dynkin diagrams distinguish between the two series Br and Cr, q.e.d.
The Dynkin diagram of an irreducible root system complements the Coxeter graph by the length ratio of the roots from a base. The Dynkin diagram contains the full information of the Cartan matrix of the root system. Therefore from the Dynkin
diagram of a root system Φ one can reconstruct a base ∆ and the Cartan matrix of Φ
by the algorithm from Proposition 7.26.
Proposition 7.26 (Characterisation of a root system by its Dynkin diagram).Consider a root system Φ and its Dynkin diagram
Dynkin(Φ) = (N,ED).
A base ∆ of Φ and the Cartan matrix of Φ with respect to a numbering of theelements from ∆
Cartan(∆) = (< αi, αj >)_{1≤i,j≤r}
can be recovered from the Dynkin diagram by the following algorithm:
1. Set ∆ := N.
2. For any vertex α ∈ N set
< α,α >:= 2.
3. For a pair of vertices α, β ∈ N, α ≠ β, not joined by an edge (e, m) ∈ ED set
< α,β >:= 0.
4. For any edge (e,m) ∈ ED oriented from α to β set
< α, β > := −m,  < β, α > := −1.
Remember that an edge without orientation from the figure of a Dynkin diagramreplaces two oriented edges of the diagram.
Proof. The proof evaluates for each pair of roots the relation between the includedangle and the length ratio of the roots, see Lemma 7.7, q.e.d.
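The algorithm of Proposition 7.26 is short enough to state in code. In the following sketch a Dynkin diagram is given as a list of vertices and a list of directed weighted edges (long root, short root, m); an undirected edge of weight 1 is entered as two opposite oriented edges, as the remark above suggests. The vertex names are placeholders chosen here.

```python
def cartan_from_dynkin(vertices, edges):
    n = len(vertices)
    idx = {v: k for k, v in enumerate(vertices)}
    C = [[0] * n for _ in range(n)]       # step 3: < a, b > = 0 without an edge
    for v in vertices:
        C[idx[v]][idx[v]] = 2             # step 2: < a, a > = 2
    for a, b, m in edges:                 # step 4: edge oriented from a to b
        C[idx[a]][idx[b]] = -m
        C[idx[b]][idx[a]] = -1
    return C

# G2: a single edge of weight 3 from the long root a1 to the short root a2.
print(cartan_from_dynkin(["a1", "a2"], [("a1", "a2", 3)]))
# prints [[2, -3], [-1, 2]]

# B3: chain a1 - a2 of weight 1 (entered with both orientations) and an edge
# of weight 2 from the long root a2 to the short root a3.
print(cartan_from_dynkin(["a1", "a2", "a3"],
                         [("a1", "a2", 1), ("a2", "a1", 1), ("a2", "a3", 2)]))
# prints [[2, -1, 0], [-1, 2, -2], [0, -1, 2]]
```

For a weight-1 edge the two opposite orientations both write −1 into the two symmetric positions, so entering them twice is harmless.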
Chapter 8
Classification of complex semisimple Lie algebras
The objective of the present chapter is to classify complex semisimple Lie algebras by their Dynkin diagrams, more precisely by the Dynkin diagrams of their root systems. The result is one of the highlights of Lie algebra theory. It completely encodes the structure of these Lie algebras by a certain finite graph, a data structure from discrete mathematics.
We consider a fixed pair (L,H) with a complex semisimple Lie algebra L and amaximal toral subalgebra H ⊂ L. We denote by Φ the root set corresponding to theCartan decomposition of (L,H)
L = H ⊕ ⊕_{α∈Φ} Lα.
From the Cartan criterion we know that a complex Lie algebra is semisimple ifand only if its Killing form is nondegenerate. We even have the stricter result fromProposition 6.12: For a semisimple Lie algebra L also the restriction of the Killingform to a maximal toral subalgebra H
κ_H := κ|_{H×H} : H × H → C
is nondegenerate.
In this chapter L denotes a complex semisimple Lie algebra and H ⊂ L a maximaltoral subalgebra, if not stated otherwise.
8.1 The root system of a semisimple Lie algebra
Roots are linear functionals on H. Therefore we have to translate properties of Hinto properties of its dual space H∗. This transfer is achieved by the nondegenerateKilling form κH :
Each linear functional λ : H → C is uniquely represented by an element tλ ∈ Hwith λ = κH(tλ ,−), i.e.
λ(h) = κ_H(t_λ, h) for all h ∈ H.
The map

H → H∗, t_λ ↦ λ,
is a linear isomorphism of complex vector spaces. In particular, for each root α ∈Φ
of L its inverse image under this isomorphism is denoted tα ∈ H, i.e.
α = κH(tα ,−).
This formula relates roots being elements from the dual space H∗ to elements tα ∈Hof the maximal toral algebra.
Next we transfer the Killing form from the maximal toral subalgebra to a nondegenerate bilinear form on the dual space H∗:
Definition 8.1 (Bilinear form on H∗). For the pair (L,H) the nondegenerate Killingform κH on H induces a nondegenerate symmetric bilinear form on the dual space H∗
(−,−) : H∗ × H∗ → C,  (λ, µ) := κ_H(t_λ, t_µ).
Combining the definition of (λ ,µ) with the definition of tµ and tλ results in thefollowing formula
(λ ,µ) = κH(tλ , tµ) = λ (tµ) = µ(tλ ).
The objective of this section is to show that the root set Φ of (L,H) satisfies theaxioms for a root system of a real vector space V in the sense of Definition 7.2. Eachroot α ∈Φ is a complex linear functional
α : H→ C,
hence an element of the dual space H∗. The latter is a complex vector space. Byrestricting scalars from C to the subfield R⊂C we obtain a real vector space (H∗)R.We define the real vector space
V := spanR Φ ⊂ (H∗)R.
We will show that the root set Φ of (L,H) has the following properties:
• (R1) Finite and generating: |Φ |< ∞, 0 /∈Φ , dimR V = dimC H∗.
• (R2) Invariance under symmetries: Each α ∈Φ defines a symmetry σα : V →Vwith vector α ∈V and σα(Φ)⊂Φ .
• (R3) Cartan integers: The coefficients < β ,α >,α,β ∈Φ , from the representa-tion
σα(β ) = β−< β ,α > ·α
are integers, i.e. < β ,α >∈ Z.
• (R4) Reducedness: For each α ∈Φ holds Φ ∩R ·α = {±α}.
To motivate the separate steps of deriving the root system from the Lie algebra, we first consider the example of sl(3,C).
Example 8.2 (The root system of sl(3,C)).
Set L := sl(3,C).
i) Maximal toral subalgebra: The subalgebra
H := span < h1 := E11−E22,h2 := E22−E33 >⊂ L
is a maximal toral subalgebra. It satisfies dim H = 2.
ii) Cartan decomposition: The Cartan decomposition of (L,H) is
L = H ⊕ ⊕_{α∈{α1,α2,α3}} (Lα ⊕L−α).
All root spaces are 1-dimensional. The root set Φ = {±α1,±α2,±α3} satisfies:
• Lα1 = span < x1 := E12 >,L−α1 = span < y1 := E21 >
α1 : H→ C,α1(h1) = 2,α1(h2) =−1
• Lα2 = span < x2 := E23 >,L−α2 = span < y2 := E32 >
α2 : H→ C,α2(h1) =−1,α2(h2) = 2
• Lα3 = span < x3 := E13 >,L−α3 = span < y3 := E31 >
α3 : H→ C,α3(h1) = 1,α3(h2) = 1
Therefore α3 = α1 +α2.
iii) Killing form: To compute the Killing form κ and its restriction
κH : H×H→ C
one can use the formula
154 8 Classification of complex semisimple Lie algebras
κ(z1,z2) = 2n · tr(z1 ◦ z2),z1,z2 ∈ sl(n,C),
with n = 3, see [20, Chapter 6, Ex. 7]. E.g.,
tr(h1 ◦h2) = tr((E11−E22)◦ (E22−E33)) = tr(−E22 ◦E22) =−tr(E22) =−1.
Therefore
(κH(hi,h j)1≤i, j≤2) = 6 ·
( 2 −1 )
( −1 2 ).
iv) Bilinear form on H∗: With respect to the isomorphism

H ∼−→ H∗, tλ 7→ λ ,

we get

tα1 = (1/6) ·h1, tα2 = (1/6) ·h2.

The family (α1,α2) is linearly independent and a basis of H∗. According to Definition 8.1, on H∗ the induced bilinear form
(−,−) : H∗×H∗→ C
with respect to the basis (α1,α2) of V has the matrix
((αi,α j)1≤i, j≤2) = (αi(tα j)1≤i, j≤2) = (1/6) ·
( 2 −1 )
( −1 2 ).

In particular, all roots αi, i = 1,2,3, have the same length:
(αi,αi) = 2 · (1/6) = 1/3.
We define the real vector space
V := spanR < α1,α2 >⊂ (H∗)R.
Apparently dimR V = dimC H∗. The restriction of the bilinear form (−,−) to V is a real positive definite form, hence a scalar product. We now prove by explicit computation: The root set Φ of (L,H) is a root system of the real vector space V in the sense of Definition 7.2.

v) Distinguished symmetries: Consider the Euclidean space (V,(−,−)). For i = 1,2,3 define the maps

σi : V →V, x 7→ x−6 ·(x,αi) ·αi.

Here the factor 6 has been chosen in order to get
σi(αi) =−αi.
Moreover
σ1(α2) = α3,σ1(α3) = σ1(α1)+σ1(α2) =−α1 +α2 +α1 = α2
σ2(α1) = α3,σ2(α3) = α1
σ3(α1) = α1−6 · (α1,α3) ·α3 = α1− (α1 +α2) =−α2,
σ3(α2) = α2− (α1 +α2) =−α1.
For i = 1,2,3 we define the symmetries
σαi := σi and σ−αi := σαi .
Then σα(Φ)⊂Φ for all α ∈Φ .
Each reflection leaves the scalar product invariant, because for α ∈Φ ,x,y ∈V,
(σα(x),σα(y)) = (x−6(x,α) ·α,y−6(y,α) ·α) =
(x,y)−6(y,α)(x,α)−6(x,α)(α,y)+36(x,α)(y,α)(α,α) = (x,y)
using (α,α) = 1/3.
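The reflection computations above can be replayed with exact rational arithmetic. A small sketch in coordinates with respect to (α1,α2); the Gram matrix is the one from part iv), and the helper names are ours:

```python
# Check of Example 8.2 v): in the basis (alpha1, alpha2) the bilinear form has
# Gram matrix G = (1/6) * [[2, -1], [-1, 2]]; the maps
# sigma_i(x) = x - 6 * (x, alpha_i) * alpha_i permute Phi and preserve (-, -).

from fractions import Fraction

G = [[Fraction(2, 6), Fraction(-1, 6)], [Fraction(-1, 6), Fraction(2, 6)]]

def form(x, y):  # (x, y) = x^T G y in coordinates w.r.t. (alpha1, alpha2)
    return sum(x[i] * G[i][j] * y[j] for i in range(2) for j in range(2))

def sigma(alpha, x):  # the distinguished reflection with vector alpha
    c = 6 * form(x, alpha)
    return (x[0] - c * alpha[0], x[1] - c * alpha[1])

a1, a2, a3 = (1, 0), (0, 1), (1, 1)
Phi = [a1, a2, a3, (-1, 0), (0, -1), (-1, -1)]

assert sigma(a1, a2) == a3 and sigma(a2, a1) == a3 and sigma(a3, a1) == (0, -1)
# Each sigma_i maps Phi onto Phi and leaves the form invariant:
for a in (a1, a2, a3):
    assert all(sigma(a, b) in Phi for b in Phi)
    assert all(form(sigma(a, x), sigma(a, y)) == form(x, y) for x in Phi for y in Phi)
```

The assertions reproduce σ1(α2) = α3, σ3(α1) = −α2 and the invariance of the scalar product.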
vi) Cartan integers: From the symmetries of part v) one reads off the Cartan numbers
< αi,αi >= 2, i = 1,2, and < α1,α2 >=< α2,α1 >=−1
which are integers indeed.
vii) Reducedness: Apparently for each α ∈Φ the only roots proportional to α
are ±α .
viii) Base: Apparently the set ∆ := {α1,α2} is a base of Φ . The Cartan matrix with respect to ∆ and the given numbering is

Cartan(Φ) =
( 2 −1 )
( −1 2 ).

According to Lemma 7.7 both roots from ∆ have the same length and include the angle (2/3)π . Therefore Coxeter graph and Dynkin diagram of Φ contain the same information. According to the classification from Theorem 7.25 the root system of sl(3,C) has type A2. The scalar product (−,−) induced from the Killing form is invariant with respect to the Weyl group W = < σα1 ,σα2 >.
The following two propositions collect the main properties of the root set Φ of a semisimple complex Lie algebra. They generalize the result from Proposition 6.4 about the structure of sl(2,C).
Proposition 8.3 considers the complex linear structure of L and its canonical subalgebras sl(2,C) ⊂ L, while Proposition 8.4 considers the integrality properties of the root set Φ with respect to the bilinear form induced by the Killing form, see Definition 8.1. These properties assure that Φ is a root system in the sense of Definition 7.2. They allow to apply the classification of Coxeter graphs from Theorem 7.20 and of Dynkin diagrams from Theorem 7.25 to obtain the classification of complex semisimple Lie algebras in Propositions 8.6 - 8.9 and Remark ??.

Proposition 8.3 (Semisimple Lie algebras as sl(2,C)-modules). Consider a pair (L,H) with L a complex semisimple Lie algebra and H ⊂ L a maximal toral subalgebra. Denote by (−,−) the nondegenerate bilinear form on H∗ derived from the restricted Killing form κH on H according to Definition 8.1. The Cartan decomposition

L = H ⊕ ⊕_{α∈Φ} Lα
from Definition 6.14 has the following properties:
1. Generating: spanCΦ = H∗.
2. Existence of negative root: For each α ∈Φ also −α ∈Φ .
3. Duality of root spaces: For each α ∈ Φ the vector spaces Lα and L−α are dual with respect to the Killing form, i.e. the bilinear map
Lα ×L−α → C,(x,y) 7→ κ(x,y),
is nondegenerate. The distinguished element tα ∈ H satisfies the equation
[x,y] = κ(x,y) · tα
valid for any x ∈ Lα ,y ∈ L−α . In particular,
[Lα ,L−α ] = C · tα .
4. Subalgebras sl(2,C): Consider an arbitrary but fixed α ∈Φ . The map
α : [Lα ,L−α ]∼−→ C
is an isomorphism. If hα ∈ [Lα ,L−α ] denotes the uniquely determined element with α(hα) = 2 then

hα = (2/(α,α)) · tα .
For each non-zero element xα ∈ Lα a unique element yα ∈ L−α exists with
[xα ,yα ] = hα .
The remaining commutators are
[hα ,xα ] = 2 · xα , [hα ,yα ] =−2 · yα .
As a consequence, the subalgebra
Sα := spanC < hα ,xα ,yα >⊂ L
is isomorphic to sl(2,C) via the Lie algebra morphism
Sα ∼−→ sl(2,C)

hα 7→
( 1 0 )
( 0 −1 ),
xα 7→
( 0 1 )
( 0 0 ),
yα 7→
( 0 0 )
( 1 0 ).
The Lie algebra L is an Sα -module with respect to the module multiplication
Sα ×L→ L,(z,u) 7→ ad(z)(u) = [z,u]
which results from restricting the adjoint representation of L.
Proof. ad 1) The Killing form defines an isomorphism H ∼−→ H∗. If Φ does not span H∗ then span Φ ⊊ H∗ is a proper subspace. A non-zero linear functional

h ∈ (H∗)∗ ≅ H

exists which vanishes on this subspace, i.e. h ≠ 0 and α(h) = 0 for all α ∈ Φ . We obtain [h,Lα ] = 0 for all α ∈ Φ . In addition [h,H] = 0 because the toral subalgebra H is Abelian, see Proposition 6.3. As a consequence [h,L] = 0, i.e.
h ∈ Z(L).
We obtain h = 0 because the center of the semisimple Lie algebra L is trivial. This contradiction proves span Φ = H∗.

In order to prove the remaining issues we recall from Lemma 6.10 for two linear functionals λ ,µ ∈ H∗ with λ +µ ≠ 0 the orthogonality

κ(Lλ ,Lµ) = 0.

As a consequence, for two roots α,β with β ≠ −α , i.e. α +β ≠ 0, we have

κ(Lα ,H) = 0 and κ(Lα ,Lβ ) = 0.

ad 2) Consider α ∈ Φ and assume on the contrary −α /∈ Φ . Then α +β ≠ 0 for all β ∈Φ . We obtain

κ(Lα ,L) = 0

contradicting the nondegeneracy of the Killing form, see Theorem 5.11.
ad 3)
• If x ∈ Lα satisfies κ(x,L−α) = 0 then also κ(x,L) = 0, which implies x = 0. Hence the canonical map

Lα → (L−α)∗, x 7→ κ(x,−),

is injective. Interchanging the role of the two roots α and −α shows that the map is also surjective.

• We have

[Lα ,L−α ] ⊂ L0 = H

according to Lemma 6.10. The Killing form is associative according to Lemma 5.1. Hence for all h ∈ H
κ(h, [x,y]) = κ([h,x],y) = κ(α(h)x,y) = α(h)κ(x,y) = κ(tα ,h) ·κ(x,y) =
= κ(h, tα) ·κ(x,y) = κ(h,κ(x,y) · tα).
This equation holds for elements from H, hence it is satisfied also by the restricted Killing form: For all h ∈ H

κH(h, [x,y]) = κH(h,κ(x,y) · tα).

Nondegeneracy of the restricted Killing form κH implies
[x,y] = κ(x,y) · tα .
The duality between Lα and L−α provides for each element xα ∈ Lα , xα ≠ 0, an element yα ∈ L−α such that

κ(xα ,yα) ≠ 0

proving in particular

[Lα ,L−α ] = C · tα .
ad 4)
• For an arbitrary but fixed x ∈ Lα , x ≠ 0, one can choose an element y ∈ L−α with κ(x,y) ≠ 0 because Lα and L−α are dual according to part 3). If z := [x,y] then

z = κ(x,y) · tα ≠ 0

according to the formula from part 3).

We claim α(z) ≠ 0: Assume on the contrary α(z) = 0. Then define the subalgebra
S :=< x,y,z >⊂ L.
Its Lie bracket satisfies
[z,x] = α(z)x = 0, [z,y] =−α(z)y = 0, [x,y] = z.
Apparently the Lie algebra S is nilpotent, in particular solvable. Semisimplicity of L implies that the adjoint representation

ad : L ↪−→ gl(L)

embeds L and its subalgebra S into the matrix Lie algebra gl(L):

S ≅ ad(S) ⊂ gl(L).

According to Lie's theorem, see Theorem 4.21, the solvable subalgebra ad(S) embeds into the subalgebra of upper triangular matrices. The element

ad(z) = ad[x,y],

being a commutator, is even a strict upper triangular matrix. Therefore ad(z) is a nilpotent endomorphism, i.e. z ∈ L is nilpotent. Because H is a toral subalgebra, the element z ∈ H is also semisimple, hence z = 0. This contradiction proves the claim.
Therefore the linear map
α : [Lα ,L−α ]→ C,h 7→ α(h),
between 1-dimensional vector spaces is an isomorphism. We define hα ∈ [Lα ,L−α ] as the uniquely determined element with

α(hα) = 2.

The relation α(tα) = (α,α) implies

hα = (2/(α,α)) · tα .
• For an arbitrary element xα ∈ Lα , xα ≠ 0, according to part 3) a unique element yα ∈ L−α exists with
[xα ,yα ] = hα .
We obtain

[hα ,xα ] = α(hα) · xα = 2 · xα

[hα ,yα ] = −α(hα) · yα = −2 · yα

[xα ,yα ] = hα .
Therefore the subalgebra Sα :=< xα ,yα ,hα > is isomorphic to sl(2,C), q.e.d.
Because L is an sl(2,C)-module with respect to each subalgebra Sα the results from section 6.2 about the structure of irreducible sl(2,C)-modules can be applied. These results imply a series of integrality and rationality properties of L, see Proposition 8.4. For two roots α,β ∈Φ we will often employ the formula
β (hα) = 2 ·β (tα)/(α,α) = 2 ·κH(tβ , tα)/(α,α) = 2 ·(β ,α)/(α,α).
It derives from the relation between hα and tα from Proposition 8.3 and from the defining relation κH(tβ ,−) = β . The numbers β (hα) will turn out to be the Cartan integers < β ,α > of the root system.

Proposition 8.4 (Integrality and rationality properties of the root set). Consider a pair (L,H) with L a complex semisimple Lie algebra and H ⊂ L a maximal toral subalgebra. The root set Φ from the Cartan decomposition

L = H ⊕ ⊕_{α∈Φ} Lα
has the following properties:
1. Root spaces are 1-dimensional: dim Lα = 1 for each root α ∈Φ .
2. Integrality: For each pair α,β ∈Φ
β (hα) ∈ Z and β −β (hα) ·α ∈Φ .
3. Rationality and scalar product: Consider a basis B = (α1, ...,αr) of the complex vector space H∗ made up from roots αi ∈ Φ , i = 1, ...,r. Then any root β ∈ Φ is a rational combination of elements from B, i.e.

Φ ⊂ VQ := spanQ{α1, ...,αr}

and

dimQ VQ = dimC H.

The bilinear form (−,−) restricts from H∗ to the rational subspace VQ ⊂ H∗ and its restriction
(−,−)Q : VQ×VQ→Q
is positive definite, i.e. a scalar product.
4. Symmetries: By extending scalars from Q to R define the real vector space
V := R⊗QVQ = spanRΦ ⊂ H∗
and the scalar product (−,−)R on V . For each root α ∈Φ the map
σα : V →V, v 7→ σα(v) := v−2 ·((v,α)R/(α,α)R) ·α,
is a symmetry of V with vector α and Cartan integers
< β ,α >= β (hα) ∈ Z,β ∈Φ .
It satisfies σα(Φ) ⊂ Φ . In addition: Each symmetry σα leaves invariant the scalar product (−,−)R.
5. Proportional roots: If α ∈Φ then the only roots proportional to α are ±α .
6. α-string through β : For each pair of non-proportional roots α,β ∈Φ let p,q ∈ N be the largest numbers such that
β − p ·α ∈Φ and β +q ·α ∈Φ .
Then all functionals from the string
β + k ·α ∈ H∗,−p≤ k ≤ q,
are roots.
Moreover, for two roots α ∈Φ ,β ∈Φ with α +β ∈Φ:
[Lα ,Lβ ] = Lα+β .
Fig. 8.1 α-string through β
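The string property can be illustrated on a concrete example. The sketch below uses the root system of type B2 in R² merely as an explicitly computable test case (our own choice, not part of the text) and also checks the standard relation p − q = < β ,α >:

```python
# Unbroken alpha-strings, checked on the root system of type B2 in R^2:
# for non-proportional roots the set of k with beta + k*alpha in Phi is an
# interval -p <= k <= q, and p - q equals the Cartan integer <beta, alpha>.

from fractions import Fraction

Phi = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, 1), (-1, -1)]

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

def cartan(beta, alpha):  # <beta, alpha> = 2 (beta, alpha) / (alpha, alpha)
    return Fraction(2 * dot(beta, alpha), dot(alpha, alpha))

def string(beta, alpha):
    # largest p, q with beta - p*alpha and beta + q*alpha still roots
    p = q = 0
    while (beta[0] - (p + 1) * alpha[0], beta[1] - (p + 1) * alpha[1]) in Phi:
        p += 1
    while (beta[0] + (q + 1) * alpha[0], beta[1] + (q + 1) * alpha[1]) in Phi:
        q += 1
    return p, q

for alpha in Phi:
    for beta in Phi:
        if beta in (alpha, (-alpha[0], -alpha[1])):
            continue
        p, q = string(beta, alpha)
        # every intermediate functional beta + k*alpha is again a root ...
        assert all((beta[0] + k * alpha[0], beta[1] + k * alpha[1]) in Phi
                   for k in range(-p, q + 1))
        # ... and the string length is governed by the Cartan integer
        assert p - q == cartan(beta, alpha)
```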
Proof. ad 1) According to Proposition 8.3, part 3) the root spaces Lα and L−α are dual with respect to the Killing form κ .
In order to show dim Lα = 1 we assume on the contrary
dim Lα = dim L−α > 1.
Consider an element xα ∈ Lα , xα ≠ 0, and the corresponding subalgebra Sα from Proposition 8.3, part 4). An element hα ∈ Sα exists with [hα ,xα ] = 2 · xα , i.e.
α(hα) = 2.
The linear functional κ(xα ,−) on L−α is non-zero. Because dim L−α > 1 the functional has a non-trivial kernel, i.e. an element y ∈ L−α , y ≠ 0, exists with κ(xα ,y) = 0. The latter formula implies [xα ,y] = 0, see Proposition 8.3, part 3). As a consequence y ∈ L is a primitive element of an irreducible Sα -submodule of L with weight

(−α)(hα) = −2 < 0.

The latter property contradicts the fact that all irreducible sl(2,C)-modules have a non-negative highest weight, see Theorem 6.7.

ad 2) Consider an element 0 ≠ y ∈ Lβ . Considered as an element of the root space Lβ it satisfies
[hα ,y] = β (hα) · y.
When considered as an element of the Sα -module L the element y ∈ L has weight β (hα). Theorem 6.7 implies

β (hα) ∈ Z.

We set p := β (hα) and

z := yα^p . y if p ≥ 0 and z := xα^(−p) . y if p ≤ 0.
Theorem 6.7 implies z ≠ 0. Because
z ∈ Lβ−p·α
we get β −β (hα) ·α ∈Φ .
ad 3)
• Each root β ∈Φ has a unique representation
β = ∑_{i=1}^{r} ci ·αi

with complex coefficients ci ∈ C, i = 1, ...,r. We claim that the coefficients are even rational, i.e. ci ∈ Q for i = 1, ...,r.

Multiplying the representation of β successively for j = 1, ...,r by

2/(α j,α j)

and applying the bilinear form (−,α j) to the resulting equation gives a linear system of equations
b = A · c
for the vector of indeterminates c := (c1, ...,cr)>. The left-hand side is the vector
b = ( 2 ·(β ,α j)/(α j,α j) )>.

The coefficient matrix is

A = ( 2 ·(αi,α j)/(α j,α j) ), row index j, column index i,

and

c = (ci)>

is the vector of indeterminates. The system is defined over the ring Z because for two roots γ,δ ∈Φ

2 ·(γ,δ )/(δ ,δ ) = γ(hδ ) ∈ Z

according to part 2). The coefficient matrix A is invertible as an element from GL(r,Q): It is obtained by multiplying for j = 1, ...,r the row with index j of the matrix

((αi,α j))1≤i, j≤r ∈ GL(r,C)

by the non-zero scalar 2/(α j,α j). The latter matrix represents the bilinear form (−,−) on H∗, which is nondegenerate because it corresponds to the Killing form κH(−,−) on H. As a consequence, the solution of the linear system of equations is unique and is already defined over the base field Q, i.e.

c j ∈ Q for all j = 1, ...,r.

• We claim that the bilinear form (−,−) is rational, i.e. it is defined over the field Q, and positive definite. For any two roots α,β ∈Φ we have to show
(α,β ) ∈Q and (α,α)> 0.
For two linear functionals λ ,µ ∈ H∗ we compute
(λ ,µ) = κH(tλ , tµ) = tr(ad tλ ◦ad tµ).
In order to evaluate the trace we employ the definition of a root space: If z ∈ Lγ then

(ad tµ)(z) = γ(tµ) · z

and

(ad tλ ◦ ad tµ)(z) = γ(tλ ) · γ(tµ) · z.
Using the Cartan decomposition
L = H ⊕ ⊕_{γ∈Φ} Lγ
and observing [H,H] = 0 we obtain
(λ ,µ) = tr(ad tλ ◦ ad tµ) = ∑_{γ∈Φ} γ(tλ ) · γ(tµ).
We apply this formula to the computation of (α,α) using
2 ·(γ,α)/(α,α) = 2 ·γ(tα)/(α,α) = γ(hα) =: < γ,α > ∈ Z

according to part 2). As a consequence

4 ·γ(tα)² = (α,α)² · < γ,α >² .
We obtain
(α,α) = ∑_{γ∈Φ} γ(tα)² = (1/4) ·(α,α)² · ∑_{γ∈Φ} < γ,α >²

and because of (α,α) ≠ 0

1 = (1/4) ·(α,α) · ∑_{γ∈Φ} < γ,α >² .

In particular

(α,α) = 4 ·( ∑_{γ∈Φ} < γ,α >² )^{−1} ∈ Q and (α,α) > 0

and

(β ,α) = (1/2) ·(α,α) · < β ,α > ∈ Q.
ad 4) Due to the characterization α(hα) = 2 from Proposition 8.3 the map σα is a symmetry of V with vector α . Due to part 2) its Cartan numbers are integers

< β ,α > = 2 ·(β ,α)/(α,α) = β (hα) ∈ Z.

The inclusion σα(Φ)⊂Φ has been proven in part 2). In order to prove the invariance of the scalar product with respect to the Weyl group it suffices to consider three roots α,β ,γ ∈Φ :
(σα(β ),σα(γ)) = (β −β (hα)α,γ− γ(hα) ·α) =
= (β ,γ)−β (hα)(α,γ)− γ(hα)(β ,α)+β (hα)γ(hα)(α,α)
After using
(α,γ) = (1/2)(α,α)γ(hα) and (α,β ) = (1/2)(α,α)β (hα)
we obtain

(σα(β ),σα(γ)) = (β ,γ).

ad 5) Assume on the contrary the existence of a pair of proportional roots different from ±α . W.l.o.g. we may assume α ∈Φ and β = t ·α ∈Φ , 0 < t < 1, t ∈ R. From β (hα) ∈ Z and

β (hα) ·α = β −σα(β ) = t ·(α −σα(α)) = 2t ·α

follows 2t ∈ Z.
Hence the two roots from any pair of proportional roots belong to a set
{±α,±(1/2)α},α ∈Φ .
We may now assume α ∈Φ and check whether the linear functional 2α ∈ H∗
belongs to Φ .
Consider an element z ∈ L2α . Then
[hα ,z] = 2α(hα) · z = 4 · z.
Because 3α /∈Φ we have xα .z = 0.
With respect to the sl(2,C)-module structure of L induced from the action of
Sα =< xα ,yα ,hα >,
see Proposition 8.3, part 4), we have
• hα .z = 2 ·2 · z, because z ∈ L2α .
• xα .z ∈ L3α . But 3α /∈Φ , hence xα .z = 0.
• yα .(xα .z) ∈ Lα , hence yα .(xα .z) ∈ C · xα .
As a consequence
4z = hα .z = [xα ,yα ].z = xα .(yα .z)− yα .(xα .z) = xα .(yα .z) ∈ C ·[xα ,xα ] = {0}

because yα .z ∈ Lα = C · xα . This implies z = 0. The conclusion L2α = 0 implies 2α /∈Φ .
ad 6) Consider the Sα -module
M := ⊕_{k=−p}^{q} Lβ+kα .
Non-zero elements from Lβ+kα have weight with respect to Sα
(β + kα)(hα) = β (hα)+2k.
The functional β + kα is a root of L if and only if Lβ+kα ≠ 0. Then Lβ+kα is 1-dimensional according to part 1). Proposition 6.6 implies that M is an irreducible Sα -module of dimension

m+1 = p+q+1.

In particular, for every −p ≤ k ≤ q the number β (hα)+2k is a weight of M and every functional β + kα is a root of L.

In addition, Proposition 6.6 implies that the shift of weight spaces of M

xα : Lβ → Lβ+α , v 7→ xα .v,

is an isomorphism. In particular [Lα ,Lβ ] = Lα+β , q.e.d.
Theorem 8.5 (Root system of a complex semisimple Lie algebra). Consider a pair (L,H) with L a complex semisimple Lie algebra and H ⊂ L a maximal toral subalgebra.

The root set Φ from the Cartan decomposition

L = H ⊕ ⊕_{α∈Φ} Lα
is a root system of the real vector space
V := spanRΦ ⊂ (H∗)R.
It is named the root system of the Lie algebra L. The root system attaches to each α ∈Φ the symmetry

σα : V →V, β 7→ β −β (hα) ·α

with Cartan integers

< β ,α > = β (hα) ∈ Z.
The bilinear form (−,−) on V , defined as
(α,β ) := κ(tα , tβ ),α,β ∈Φ ,
is an invariant scalar product with respect to the Weyl group of Φ .
Proof. The proof collects the results from Proposition 8.4, q.e.d.
8.2 Root system of the complex Lie algebras from A,B,C,D-series
We will show in the present section that the root systems of type Ar,Br,Cr,Dr are the root systems of the complex Lie algebras belonging to the classical groups of the corresponding types, see Lemma 3.6. We follow [15, Chap. 7.7] and [17, Chap. X, §3].
The Lie algebras of the classical groups are subalgebras of the Lie algebra gl(n,C). We introduce the following notation for the elements of the canonical basis of the vector space M(n×n,C):

Ei, j ∈ M(n×n,C)

is the matrix with entry = 1 at place (i, j) and entry = 0 at all other places. Our matrix computations are based on the formula
Ei, j ·Ek,l = δ j,k ·Ei,l ,1≤ i, j,k, l ≤ n.
The family

(Ei,i)1≤i≤n

is a basis of the subspace of diagonal matrices d(n,C). Denote the elements of the dual basis by

εi := (Ei,i)∗ ∈ d(n,C)∗, i = 1, ...,n.

Proposition 8.6 (Type Ar). The Lie algebra L := sl(r+1,C), r ≥ 1, has the following characteristics:

1. dim L = (r+1)² −1.

2. The subalgebra
H := d(r+1,C)∩L
is a maximal toral subalgebra with dim H = r.
3. The family (hi)i=1,...,r with
hi := Ei,i−Ei+1,i+1
is a basis of H.
4. Define the functionals
εi := εi|H ∈ H∗, i = 1, ...,r+1.
Then the root system Φ of L has the elements
εi− ε j , 1≤ i ≠ j ≤ r+1.
The corresponding root spaces are 1-dimensional, generated by the elements
Ei, j , 1≤ i ≠ j ≤ r+1.
A base of Φ is the set ∆ := {αi : 1≤ i≤ r} with
αi := εi− εi+1.
The positive roots are the elements of Φ+ = {εi − ε j : i < j}. They have the representation

εi− ε j = ∑_{k=i}^{j−1} αk ∈ Φ+.
5. For each positive root α := εi− ε j ∈Φ+ the subalgebra
Sα ≅ sl(2,C)
is generated by the three elements
hα := Ei,i−E j, j,xα := Ei, j,yα := E j,i.
6. The Cartan matrix of Φ referring to the base ∆ is
Cartan(∆) =
2 −1 0 . . . 0−1 2 −1 . . . 0
. . .. . .
. . .0 . . . −1 2 −10 . . . 0 −1 2
∈M((r+1)× (r+1),Z).
All roots α ∈ ∆ have the same length. The only pairs of roots from ∆ , which arenot orthogonal, are
(αi,αi+1), i = 1, ...,r−1.
Each pair includes the angle2π
3. In particular the Dynkin diagram of the root
system Φ has type Ar from Theorem 7.25.
Proof. 4) For i ≠ j elements h ∈ H act on Ei, j according to
[h,Ei, j] = h◦Ei, j−Ei, j ◦h = εi(h) ·Ei, j− ε j(h) ·Ei, j = (εi(h)− ε j(h)) ·Ei, j
5) According to part 4) the commutators are
[hα ,xα ] = [hα ,Ei, j] = (εi(hα)− ε j(hα))Ei, j = 2 ·Ei, j = 2 · xα
[hα ,yα ] = [hα ,E j,i] = (ε j(hα)− εi(hα))E j,i =−2 ·Ei, j =−2 · yα
[xα ,yα ] = [Ei, j,E j,i] = Ei,i−E j, j = hα .
6) The Cartan matrix has the entries β (hα), α,β ∈ ∆ . We have to consider

• βk = εk− εk+1, 1≤ k ≤ r, and
• α j = ε j− ε j+1, 1≤ j ≤ r, with corresponding elements hα j = E j, j−E j+1, j+1.
βk(hα j) = (εk− εk+1)(E j, j−E j+1, j+1) = δk, j−δk, j+1−δk+1, j +δk+1, j+1 = 2δk, j−δk, j+1−δk+1, j =

= 2, if k = j; −1, if |k− j| = 1; 0, if |k− j| ≥ 2.

If α ≠ β and < α,β >, < β ,α > ≠ 0 then the symmetry of the Cartan matrix implies

1 = < β ,α > / < α,β > = ‖β‖²/‖α‖².

Hence all simple roots have the same length. The angles between two simple roots can be read off from the Cartan matrix. Hence the root system Φ has the Dynkin diagram Ar from Theorem 7.25, q.e.d.
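The Cartan matrix can equally be computed from the Euclidean coordinates of the simple roots. A minimal sketch, with r = 4 as an arbitrary choice (helper names are ours):

```python
# Cross-check of Proposition 8.6, part 6): build the Cartan matrix of A_r
# directly from the simple roots alpha_i = eps_i - eps_{i+1} in R^(r+1).

r = 4

def alpha(i):  # coordinate vector of eps_i - eps_{i+1} in R^(r+1), i = 1..r
    v = [0] * (r + 1)
    v[i - 1], v[i] = 1, -1
    return v

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# Cartan integer <alpha_k, alpha_j> = 2 (alpha_k, alpha_j) / (alpha_j, alpha_j)
C = [[2 * dot(alpha(k), alpha(j)) // dot(alpha(j), alpha(j))
      for j in range(1, r + 1)] for k in range(1, r + 1)]

# tridiagonal with 2 on the diagonal and -1 on both neighbouring diagonals
assert all(C[k][j] == (2 if k == j else -1 if abs(k - j) == 1 else 0)
           for k in range(r) for j in range(r))
```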
In dealing with the Lie algebra sp(2r,C) we note that X ∈ sp(2r,C) iff
σ ·X> ·σ = X
with
σ :=
( 0 1 )
( −1 0 )

(in block notation with r× r blocks) and σ^{−1} = −σ . This condition is equivalent to

X =
( A B )
( C −A> )

with symmetric matrices B = B> and C = C>. The following proposition will refer to this type of decomposition.
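This equivalence can be sanity-checked numerically. A sketch for r = 2; the matrix helpers and the sample entries of A, B, C are our own:

```python
# Sanity check of the block description of sp(2r, C): with
# sigma = [[0, I], [-I, 0]] the condition sigma * X^T * sigma = X holds for
# X = [[A, B], [C, -A^T]] with B, C symmetric (shown here for r = 2).

r = 2

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def block(A, B, C, D):  # assemble a 2r x 2r matrix from four r x r blocks
    return [A[i] + B[i] for i in range(r)] + [C[i] + D[i] for i in range(r)]

I = [[1 if i == j else 0 for j in range(r)] for i in range(r)]
Z = [[0] * r for _ in range(r)]
sigma = block(Z, I, [[-x for x in row] for row in I], Z)

A = [[1, 2], [3, 4]]
B = [[5, 6], [6, 7]]          # symmetric
C = [[8, 9], [9, 10]]         # symmetric
negAT = [[-A[j][i] for j in range(r)] for i in range(r)]
X = block(A, B, C, negAT)

assert matmul(matmul(sigma, transpose(X)), sigma) == X

B_bad = [[5, 6], [0, 7]]      # not symmetric -> condition fails
X_bad = block(A, B_bad, C, negAT)
assert matmul(matmul(sigma, transpose(X_bad)), sigma) != X_bad
```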
Proposition 8.7 (Type Cr). The Lie algebra L := sp(2r,C), r ≥ 3, has the following characteristics:
1. dim L = r(2r+1).
2. The subalgebra
H := {
( D 0 )
( 0 −D )
∈ L : D ∈ d(r,C)}

is a maximal toral subalgebra with dim H = r.

3. The family

(hi := Ei,i−Er+i,r+i)1≤i≤r
is a basis of H.
4. Define the functionals
εi := εi|H ∈ H∗, i = 1, ...,2r.
Then the elements
Ei, j−Er+ j,r+i , 1≤ i ≠ j ≤ r, (matrix of type A),

Ei,r+ j +E j,r+i , 1≤ i, j ≤ r, (matrix of type B),

Er+i, j +Er+ j,i , 1≤ i, j ≤ r, (matrix of type C),

generate the 1-dimensional root spaces belonging to the respective root

α = εi− ε j (1≤ i ≠ j ≤ r), εi + ε j (1≤ i, j ≤ r), −εi− ε j (1≤ i, j ≤ r).

The root system Φ is

Φ = {εi−ε j : 1≤ i ≠ j ≤ r}∪{±(εi +ε j) : 1≤ i < j ≤ r}∪{±2 ·εi : 1≤ i ≤ r}

A base of Φ is the set

∆ := {α1, ...,αr}

with - note the different form of the last root αr -

• α j := ε j− ε j+1, 1≤ j ≤ r−1,
• αr = 2 · εr

The set of positive roots is Φ+ = {εi± ε j : 1≤ i < j ≤ r}∪{2 · εi : 1≤ i ≤ r}.
5. For each positive root α ∈Φ+ the subalgebra
Sα = < hα ,xα ,yα > ≅ sl(2,C)
has the generators:
• If α := εi− ε j ∈Φ+,1≤ i < j ≤ r, then
hα := hi−h j = Ei,i−E j, j−Er+i,r+i +Er+ j,r+ j , xα := Ei, j−Er+ j,r+i , yα := E j,i−Er+i,r+ j .

• If α := εi + ε j ∈Φ+, 1≤ i < j ≤ r, then

hα := hi +h j = Ei,i +E j, j−Er+i,r+i−Er+ j,r+ j , xα := Ei,r+ j +E j,r+i , yα := Er+i, j +Er+ j,i .
• If α := 2εi ∈Φ ,1≤ i≤ r, then
hα := Ei,i−Er+i,r+i,xα := Ei,r+i,yα := Er+i,i.
6. The Cartan matrix of the root system Φ referring to the basis ∆ is
Cartan(∆) =
( 2 −1 0 . . . 0 )
( −1 2 −1 . . . 0 )
( . . . )
( 0 . . . −1 2 −1 0 )
( 0 . . . 0 −1 2 −1 )
( 0 . . . 0 −2 2 )
∈ M(r× r,Z),

entry < αi,α j > at position (row,column) = ( j, i). Note the distinguished entry −2 in the last row: The Cartan matrix is not symmetric. All roots α j ∈ ∆ , 1 ≤ j ≤ r−1, have equal length, they are the short roots. The root αr is the long root:

√2 = ‖αr‖/‖α j‖ , 1≤ j ≤ r−1.

The only pairs of simple roots which are not orthogonal are

(αi,αi+1), i = 1, ...,r−1.

For i = 1, ...,r−2 these pairs include the angle 2π/3, while the pair (αr−1,αr) includes the angle 3π/4. In particular the Dynkin diagram of the root system Φ has type Cr according to Theorem 7.25.
Proof. 1) Using the representation
X =
( A B )
( C −A> )
∈ sp(2r,C)

with symmetric matrices B = B> and C = C> we obtain from the number of free parameters for A,B,C

dim L = r² +2 ·( (r² − r)/2 + r ) = 2r² + r = r(2r+1).
4) Note εi(h) =−εr+i(h) for all h ∈ H. For h ∈ H the commutators are:
[h,Ei, j−Er+ j,r+i] = h ·(Ei, j−Er+ j,r+i)−(Ei, j−Er+ j,r+i) ·h =

= εi(h)Ei, j− εr+ j(h)Er+ j,r+i− ε j(h)Ei, j + εr+i(h)Er+ j,r+i =

= (εi(h)− ε j(h))Ei, j−(εr+ j(h)− εr+i(h))Er+ j,r+i =

= (εi(h)− ε j(h))Ei, j +(ε j(h)− εi(h))Er+ j,r+i =

= (εi(h)− ε j(h))(Ei, j−Er+ j,r+i)
[h,Ei,r+ j +E j,r+i] = εi(h)Ei,r+ j + ε j(h)E j,r+i− εr+ j(h)Ei,r+ j− εr+i(h)E j,r+i =
(εi(h)− εr+ j(h))Ei,r+ j +(ε j(h)− εr+i(h))E j,r+i =
= (εi(h)+ ε j(h))Ei,r+ j +(ε j(h)+ εi(h))E j,r+i =
(εi(h)+ ε j(h))(Ei,r+ j +E j,r+i)
[h,Er+i, j +Er+ j,i] = εr+i(h)Er+i, j + εr+ j(h)Er+ j,i− ε j(h)Er+i, j− εi(h)Er+ j,i =
= (εr+i(h)− ε j(h))Er+i, j +(εr+ j(h)− εi(h))Er+ j,i =
= (−εi(h)− ε j(h))Er+i, j +(−ε j(h)− εi(h))Er+ j,i =
= (−εi(h)− ε j(h))(Er+i, j +Er+ j,i)
The positive roots have the base representation
εi− ε j = ∑_{k=i}^{j−1} αk , 1≤ i < j ≤ r,

2 · εi = αr + ∑_{k=i}^{r−1} 2 ·αk , 1≤ i ≤ r−1.

Part 6) The Cartan matrix has the entries β (hα), α,β ∈ ∆ . We have to consider the roots

• βk = εk− εk+1, 1≤ k ≤ r−1, and
• βr = 2εr

and the roots

• α j = ε j− ε j+1, 1≤ j ≤ r−1, with elements hα j = h j−h j+1 and
• αr = 2εr with element hαr = hr.
Accordingly, we calculate the cases:
• For 1≤ j,k ≤ r−1:
βk(hα j) = (εk− εk+1)(E j, j−E j+1, j+1) = δk, j−δk, j+1−δk+1, j +δk+1, j+1 =
= 2 ·δk, j−δk, j+1−δk, j−1
• For 1≤ j ≤ r−1,k = r:
βr(hα j) = 2 · εr(E j, j−E j+1, j+1) = 2(δr, j−δr, j+1) =−2δr, j+1
• For 1≤ k ≤ r−1, j = r:
βk(hαr) = (εk− εk+1)(Er,r) = δk,r−δk+1,r =−δk+1,r
For j = k = r:

βr(hαr) = 2 · εr(Er,r) = 2.

If α ≠ β and < α,β >, < β ,α > ≠ 0 then

β (hα)/α(hβ ) = ‖β‖²/‖α‖².

Hence

2 = (−2)/(−1) = αr(hαr−1)/αr−1(hαr) = ‖αr‖²/‖αr−1‖² , q.e.d.
In dealing with the Lie algebra so(2r,C) it is useful to consider a matrix M ∈ so(2r,C) as a scheme having 2×2-matrices as entries. Hence we introduce the non-commutative ring
R := M(2×2,C)
and consider the matrices
M = ((a jk)1≤ j,k≤r) ∈ M(r× r,R) ≅ M(2r×2r,C)

with entries

a jk ∈ R, 1≤ j,k ≤ r.
For each 1≤ j,k ≤ r we introduce the matrix
E j,k ∈M(r× r,R)
with only one nonzero entry, namely 1 ∈ R at place ( j,k). We distinguish the following elements from R
h :=
( 0 −i )
( i 0 ),

s := (1/2) ·
( i −1 )
( −1 −i ),

t := (1/2) ·
( i 1 )
( 1 −i ),

u := (1/2) ·
( i 1 )
( −1 i ),

v := (1/2) ·
( i −1 )
( 1 i )

satisfying
• s = s>,h · s = s,s ·h =−s
• t = t>,h · t =−t, t ·h = t
• u> = v,h ·u = u ·h = u
• v> = u,h · v = v ·h =−v
• [s, t] =−h
• u² − v² = −h
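Since these are identities between concrete 2×2 matrices, they can be verified mechanically. A sketch with plain complex arithmetic (helper names are ours):

```python
# Verify the relations among h, s, t, u, v in R = M(2x2, C).

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def T(a):  # transpose
    return [[a[j][i] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

i = 1j
h = [[0, -i], [i, 0]]
s = scale(0.5, [[i, -1], [-1, -i]])
t = scale(0.5, [[i, 1], [1, -i]])
u = scale(0.5, [[i, 1], [-1, i]])
v = scale(0.5, [[i, -1], [1, i]])

assert s == T(s) and mul(h, s) == s and mul(s, h) == scale(-1, s)
assert t == T(t) and mul(h, t) == scale(-1, t) and mul(t, h) == t
assert T(u) == v and mul(h, u) == u and mul(u, h) == u
assert T(v) == u and mul(h, v) == scale(-1, v) and mul(v, h) == scale(-1, v)
assert sub(mul(s, t), mul(t, s)) == scale(-1, h)   # [s, t] = -h
assert sub(mul(u, u), mul(v, v)) == scale(-1, h)   # u^2 - v^2 = -h
```

All products involve only dyadic rationals times units, so the floating-point comparisons are exact.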
Proposition 8.8 (Type Dr). The Lie algebra L := so(2r,C), r ≥ 4, has the following characteristics:
1. dim L = r(2r−1).
2. The subalgebra
H := spanC < h ·E j, j : 1≤ j ≤ r > ⊂ d(r,R)∩L
is a maximal toral subalgebra with dim H = r.
3. For any pair 1≤ j < k ≤ r each of the four elements
s · (E j,k−Ek, j), t · (E j,k−Ek, j),u ·E j,k− v ·Ek, j,v ·E j,k−u ·Ek, j
generates the 1-dimensional root space belonging to the respective root
α = ε j + εk , −ε j− εk , ε j− εk , −ε j + εk , respectively.
Here ε j := (h ·E j, j)∗ ∈ H∗, 1≤ j ≤ r, are the dual functionals.
4. The root system Φ of L has the elements
−εk− εn,εk + εn,−εk + εn,εk− εn,1≤ k < n≤ r,
for short
Φ = {±εk± εn : 1≤ k < n≤ r} (each combination of signs).
A base of Φ is the set

∆ := {α1, ...,αr}

with - note the different form of the last root αr -

• α j := ε j− ε j+1, 1≤ j ≤ r−1,
• αr = εr−1 + εr
The set of positive roots is Φ+ = {εk± εn : 1≤ k < n≤ r}.
5. For each positive root α := ε j + εk ∈Φ+,1≤ j < k ≤ r, the subalgebra
Sα ≅ sl(2,C)
is generated by the three elements
hα := h · (E j, j +Ek,k),xα := s · (E j,k−Ek, j),yα := t · (E j,k−Ek, j).
For each positive root α := ε j− εk ∈Φ+,1≤ j < k ≤ r, the subalgebra
Sα ≅ sl(2,C)
is generated by the three elements
hα := h · (E j, j−Ek,k),xα := u ·E j,k− v ·Ek, j,yα := v ·E j,k−u ·Ek, j.
6. The Cartan matrix of the root system Φ referring to the basis ∆ is
Cartan(∆) =
( 2 −1 0 . . . 0 )
( −1 2 −1 . . . 0 )
( . . . )
( 0 . . . −1 2 −1 −1 )
( 0 . . . 0 −1 2 0 )
( 0 . . . 0 −1 0 2 )
∈ M(r× r,Z).
Note the distinguished entries in the last row and in the last column. All roots α ∈ ∆ have the same length. The only pairs of simple roots which are not orthogonal are
(αi,αi+1), i = 1, ...,r−2, and (αr−2,αr).
These pairs include the angle 2π/3. In particular the Dynkin diagram of the root system Φ has type Dr from Theorem 7.25.
Proof. 3) It suffices to consider only pairs of indices (k, l) with k < l because the generators belonging to (k, l) and (l,k) differ only by sign.
The general element of H has the form
z = ∑_{ν=1}^{r} aν ·h ·Eν ,ν , aν ∈ C.
H acts on s · (E j,k−Ek, j) according to
[z,s ·(E j,k−Ek, j)] = ε j(z) ·hs ·E j,k−εk(z) ·hs ·Ek, j−εk(z) ·sh ·E j,k+ε j(z) ·sh ·Ek, j =
ε j(z) · s ·E j,k− εk(z) · s ·Ek, j + εk(z) · s ·E j,k− ε j(z) · s ·Ek, j =
= (ε j(z)+ εk(z)) · s · (E j,k−Ek, j)
H acts on t · (E j,k−Ek, j) according to
[z, t ·(E j,k−Ek, j)] = ε j(z) ·ht ·E j,k−εk(z) ·ht ·Ek, j−εk(z) ·th ·E j,k+ε j(z) ·th ·Ek, j =
−ε j(z) · t ·E j,k + εk(z) · t ·Ek, j− εk(z) · t ·E j,k + ε j(z) · t ·Ek, j =
= (−ε j(z)− εk(z)) · t · (E j,k−Ek, j)
H acts on u ·E j,k− v ·Ek, j according to
[z,u ·E j,k−v ·Ek, j] = ε j(z)·hu ·E j,k−εk(z)·hv·Ek, j−εk(z)·uh ·E j,k+ε j(z)·vh·Ek, j =
ε j(z) ·u ·E j,k + εk(z) · v ·Ek, j− εk(z) ·u ·E j,k− ε j(z) · v ·Ek, j =
= (ε j(z)− εk(z)) · (u ·E j,k− v ·Ek, j)
H acts on v ·E j,k−u ·Ek, j according to
[z,v ·E j,k−u ·Ek, j] = ε j(z)·hv ·E j,k−εk(z)·hu ·Ek, j−εk(z)·vh·E j,k+ε j(z)·uh ·Ek, j =
−ε j(z) · v ·E j,k− εk(z) ·u ·Ek, j + εk(z) · v ·E j,k + ε j(z) ·u ·Ek, j =

= (−ε j(z)+ εk(z)) ·(v ·E j,k−u ·Ek, j).
The positive roots have the base representation
εi− ε j = ∑_{k=i}^{j−1} αk , 1≤ i < j ≤ r,

εi + ε j = ∑_{k=i}^{r−2} αk + ∑_{k= j}^{r} αk , 1≤ i < j ≤ r.
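The base representations can be confirmed coordinatewise. A sketch for r = 4 (helpers are ours), writing each εi as a standard basis vector of R^r:

```python
# Check the base representations for D_r, here r = 4: express each positive
# root eps_i +/- eps_j in the base alpha_k = eps_k - eps_{k+1} (k < r),
# alpha_r = eps_{r-1} + eps_r.

r = 4

def alpha(k):  # coordinate vector in R^r
    v = [0] * r
    if k < r:
        v[k - 1], v[k] = 1, -1
    else:
        v[r - 2], v[r - 1] = 1, 1
    return v

def add(*vs):
    return [sum(c) for c in zip(*vs)]

def eps_sum(i, j, sign):  # eps_i + sign * eps_j
    v = [0] * r
    v[i - 1] += 1
    v[j - 1] += sign
    return v

for i in range(1, r + 1):
    for j in range(i + 1, r + 1):
        # eps_i - eps_j = alpha_i + ... + alpha_{j-1}
        assert add(*[alpha(k) for k in range(i, j)]) == eps_sum(i, j, -1)
        # eps_i + eps_j = sum_{k=i}^{r-2} alpha_k + sum_{k=j}^{r} alpha_k
        terms = [alpha(k) for k in range(i, r - 1)] + [alpha(k) for k in range(j, r + 1)]
        assert add(*terms) == eps_sum(i, j, +1)
```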
5) For α := ε j + εk ∈Φ+,1≤ j < k ≤ r, the commutators are
[hα ,xα ] = [h ·(E j, j+Ek,k),s·(E j,k−Ek, j)]= hs·(E j, j+Ek,k)(E j,k−Ek, j)−sh·(E j,k−Ek, j)(E j, j+Ek,k)=
= s · (E j,k−Ek, j)+ s · (E j,k−Ek, j) = 2 · xα
[hα ,yα ] = [h ·(E j, j+Ek,k), t ·(E j,k−Ek, j)]= −t ·(E j,k−Ek, j)−t ·(E j,k−Ek, j)=−2 ·yα
[xα ,yα ] = [s ·(E j,k−Ek, j), t ·(E j,k−Ek, j)] = [s, t] ·(E j,k−Ek, j)² = (−h) ·(−E j, j−Ek,k) = hα
For α := ε j− εk ∈Φ+,1≤ j < k ≤ r, the commutators are
[hα ,xα ] = [h ·(E j, j−Ek,k),u ·E j,k−v ·Ek, j)= hu ·E j,k+hv·Ek, j+uh·E j,k+vh·Ek, j =
u ·E j,k− v ·Ek, j +u ·E j,k− v ·Ek, j = 2u ·E j,k−2v ·Ek, j = 2 · xα
[hα ,yα ] = [h·(E j, j−Ek,k),v·E j,k−u·Ek, j] = hv·E j,k+hu ·Ek, j+vh·E j,k+uh·Ek, j =
−v ·E j,k +u ·Ek, j− v ·E j,k +u ·Ek, j = 2u ·Ek, j−2v ·E j,k =−2 · yα
[xα ,yα ] = [u ·E j,k− v ·Ek, j,v ·E j,k−u ·Ek, j] = −u²E j, j− v²Ek,k + v²E j, j +u²Ek,k =

= (v² −u²) ·E j, j +(u² − v²) ·Ek,k = h ·(E j, j−Ek,k) = hα
6) The Cartan matrix has the entries β (hα),α,β ∈ ∆ . We have to consider
• α j = ε j− ε j+1, 1≤ j ≤ r−1, with elements hα j = h ·(E j, j−E j+1, j+1) and
• αr = εr−1 + εr with element hαr = h ·(Er−1,r−1 +Er,r)
and
• βk = εk− εk+1, 1≤ k ≤ r−1, and
• βr = εr−1 + εr .
If 1≤ j,k ≤ r−1 then
(εk−εk+1)(h ·(E j, j−E j+1, j+1)) = δk, j−δk, j+1−δk+1, j +δk+1, j+1 = 2δk, j−δk, j+1−δk+1, j =

= 2, if k = j; −1, if |k− j| = 1; 0, if |k− j| ≥ 2.
If k = r and 1≤ j ≤ r−1 then
(εr−1 + εr)(h ·(E j, j−E j+1, j+1)) = δr−1, j−δr−1, j+1 +δr, j−δr, j+1 = −δr−2, j =

= −1, if j = r−2; 0, if j ≠ r−2.
If 1≤ k ≤ r−1 and j = r then
(εk− εk+1)(h · (Er−1,r−1 +Er,r)) = δk,r−1 +δk,r−δk+1,r−1−δk+1,r =−δk,r−2
= −1, if k = r−2; 0, if k ≠ r−2.
If k = j = r then

(εr−1 + εr)(h ·(Er−1,r−1 +Er,r)) = δr−1,r−1 +δr,r = 2.

If α ≠ β and < α,β >, < β ,α > ≠ 0 then

< α,β > / < β ,α > = ‖α‖²/‖β‖² , q.e.d.
In order to investigate the Lie algebra L := so(2r+1,C) we use the same notations and employ matrices similar to those introduced to study so(2r,C). More specifically: For 1≤ j,k ≤ r we extend the matrix

E j,k ∈ M(r× r,R) ≅ M(2r×2r,C)

by zero to the matrix

Ẽ j,k ∈ M((2r+1)×(2r+1),C).

Like E j,k the matrix Ẽ j,k has non-zero entries only at the two places (2 j−1,2k−1) and (2 j,2k), both entries with value = 1.
Proposition 8.9 (Type Br). The Lie algebra L := so(2r+1,C), r ≥ 2, has the following characteristics:
1. dim L = r(2r+1).
2. The subalgebra
H := spanC < h · Ẽ j, j : 1≤ j ≤ r > ⊂ d(2r+1,C)∩L
is a maximal toral subalgebra with dim H = r.
3. For 1 ≤ j ≤ r denote by

ε_j := (h·E_{j,j})* ∈ H*
the dual functionals. For any pair 1 ≤ j < k ≤ r each of the four elements

s·(E_{j,k} − E_{k,j}), t·(E_{j,k} − E_{k,j}), u·E_{j,k} − v·E_{k,j}, v·E_{j,k} − u·E_{k,j}

generates the 1-dimensional root space belonging to the respective root

α = ε_j + ε_k, α = −ε_j − ε_k, α = ε_j − ε_k, α = −ε_j + ε_k.
In addition, for each index j = 1,...,r a 1-dimensional root space is generated by the matrix

X_j ∈ so(2r+1,C)

with exactly four non-zero entries: the vector

B_1 := (i, 1)^T ∈ M(2×1,C)

at places (2j−1,2r+1) and (2j,2r+1), and the vector B_1^T at places (2r+1,2j−1) and (2r+1,2j).
And in addition, for each index j = 1,...,r a 1-dimensional root space is generated by the matrix

Y_j ∈ so(2r+1,C)

with exactly four non-zero entries: the vector

B_2 := (i, −1)^T ∈ M(2×1,C)

at places (2j−1,2r+1) and (2j,2r+1), and the vector B_2^T at places (2r+1,2j−1) and (2r+1,2j).
For j = 1,...,r the respective roots belonging to X_j and Y_j are

α = ε_j and α = −ε_j.
4. The root system Φ of L is
Φ = {±εk± εn : 1≤ k < n≤ r}∪{±ε j : j = 1, ...,r}.
A base of Φ is the set

∆ = {α_1, ..., α_r}
with - note the different form of the last root -
• α_j := ε_j − ε_{j+1}, 1 ≤ j ≤ r−1,
• α_r := ε_r.
The set of positive roots is
Φ+ = {εi± ε j : 1≤ i < j ≤ r}∪{ε j : 1≤ j ≤ r}.
5. For a positive root α ∈Φ+ the subalgebra
S_α = < h_α, x_α, y_α > ≅ sl(2,C)
has the generators:
• If α = ε_j + ε_k, 1 ≤ j < k ≤ r, then

h_α := h·(E_{j,j} + E_{k,k}), x_α := s·(E_{j,k} − E_{k,j}), y_α := t·(E_{j,k} − E_{k,j}).

• If α = ε_j − ε_k, 1 ≤ j < k ≤ r, then

h_α := h·(E_{j,j} − E_{k,k}), x_α := u·E_{j,k} − v·E_{k,j}, y_α := v·E_{j,k} − u·E_{k,j}.
• If α = ε_j, 1 ≤ j ≤ r, then

h_α := 2h·E_{j,j}, x_α := X_j, y_α := Y_j.
6. The Cartan matrix of the root system Φ referring to the base ∆ is

Cartan(∆) =

(  2 −1  0 ···  0  0 )
( −1  2 −1 ···  0  0 )
(        ···        )
(  0 ··· −1  2 −2 )
(  0 ···  0 −1  2 )

∈ M(r×r, Z).
Note the distinguished entries in the last row and the last column. The roots α_j, j = 1,...,r−1, have equal length; they are the long roots. The root α_r is the single short root. The length ratio is

‖α_j‖ / ‖α_r‖ = √2, j = 1,...,r−1.
The non-orthogonal pairs of roots from ∆ are
(α j,α j+1), j = 1, ...,r−1.
Each pair (α_j, α_{j+1}), j = 1,...,r−2, encloses the angle 2π/3, while the last pair (α_{r−1}, α_r) encloses the angle 3π/4. The Dynkin diagram of Φ has type B_r from Theorem 7.25.
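The claims of part 6, together with the dimension count of parts 1, 2 and 4, can be verified numerically. In the sketch below, ε_j is encoded as the j-th unit vector of R^r (an encoding chosen for illustration), and the Cartan integer is computed as ⟨β,α⟩ := 2(β,α)/(α,α), which agrees with β(h_α) for this root system:

```python
import numpy as np

r = 4                                    # sample rank for B_4
e = np.eye(r)
# Base of B_r: alpha_j = eps_j - eps_{j+1} (j < r), alpha_r = eps_r (short root).
alpha = [e[j] - e[j + 1] for j in range(r - 1)] + [e[r - 1]]

def cartan(b, a):
    return int(round(2 * (b @ a) / (a @ a)))    # Cartan integer <beta, alpha>

C = np.array([[cartan(b, a) for a in alpha] for b in alpha])
print(C)
assert C[r - 2][r - 1] == -2 and C[r - 1][r - 2] == -1   # distinguished entries

# Length ratio sqrt(2) and the angles 2*pi/3, 3*pi/4
assert np.isclose(np.linalg.norm(alpha[0]) / np.linalg.norm(alpha[-1]), np.sqrt(2))
def angle(a, b):
    return np.arccos(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
assert np.isclose(angle(alpha[0], alpha[1]), 2 * np.pi / 3)
assert np.isclose(angle(alpha[r - 2], alpha[r - 1]), 3 * np.pi / 4)

# |Phi| + dim H = dim so(2r+1,C): 2r^2 roots plus rank r gives r(2r+1)
num_roots = 4 * (r * (r - 1) // 2) + 2 * r    # ±eps_k ± eps_n and ±eps_j
assert num_roots + r == r * (2 * r + 1)
```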
Proof. The canonical embedding
so(2l,C) ↪−→ so(2l +1,C)
extends a matrix X ∈ so(2l,C) by adding a zero column as last column and a zero row as last row. Therefore most of the proof follows from the corresponding statements in Proposition 8.8. In addition:
3) The general element of H has the form
z = ∑_{ν=1}^{r} a_ν·h·E_{ν,ν},  a_ν ∈ C.
In addition to the action of H on the root space elements from Proposition 8.8 one has the action on the additional elements X_j and Y_j: For 1 ≤ j ≤ r the element z ∈ H acts according to
[z,X j] = ε j(z) ·X j, [z,Yj] =−ε j(z) ·Yj.
4) The positive roots have the base representation:
ε_j = ∑_{k=j}^{r} α_k, 1 ≤ j ≤ r,

ε_i − ε_j = ∑_{k=i}^{j−1} α_k, 1 ≤ i < j ≤ r,

ε_i + ε_j = ∑_{k=i}^{j−1} α_k + 2·∑_{k=j}^{r} α_k, 1 ≤ i < j ≤ r.
5) Note:

B_1·B_2^T − B_2·B_1^T = 2·h ∈ M(2×2,C).
6) The computation of the Cartan matrix is similar to the computation in theproof of Proposition 8.8, q.e.d.
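The base representations in step 4) can be checked for all positive roots at once (a small sketch; ε_j is again encoded as the j-th unit vector, an encoding chosen here for illustration):

```python
import numpy as np

r = 4
e = np.eye(r)
alpha = [e[j] - e[j + 1] for j in range(r - 1)] + [e[r - 1]]
A = np.column_stack(alpha)               # change-of-basis matrix Delta -> eps

# Positive roots of B_r: eps_i - eps_j, eps_i + eps_j (i < j), and eps_j.
pos = [e[i] - e[j] for i in range(r) for j in range(i + 1, r)]
pos += [e[i] + e[j] for i in range(r) for j in range(i + 1, r)]
pos += [e[j] for j in range(r)]

for beta in pos:
    c = np.linalg.solve(A, beta)         # coefficients w.r.t. the base Delta
    # every positive root is a non-negative integer combination of Delta
    assert np.allclose(c, np.round(c)) and (np.round(c) >= 0).all()
print(len(pos), "positive roots checked")
```

Each positive root indeed has non-negative integer coordinates with respect to ∆, as a base of Φ requires.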
8.3 Outlook
According to Theorem 8.5 the root set of any complex semisimple Lie algebra is a root system. The previous section shows that all Dynkin diagrams from the A,B,C,D-series, see Theorem 7.25, are realized by the root system of a semisimple Lie algebra.
At least the following related issues remain:
• Show that the exceptional Dynkin diagrams of type F_4, G_2, E_r with r = 6,7,8 also result from root systems, see [20, Chap. 12.1].
• Show that the exceptional root systems, too, can be obtained from semisimple Lie algebras.
• Show that a semisimple Lie algebra is uniquely determined by its root system.
References
The main references for these notes are
• Lie algebra: Hall [15], Humphreys [20] and Serre [27]
• Lie group: Serre [28].
In addition,
• References with focus on mathematics: [2], [3], [5], [17], [18], [19], [20], [32], [33]
• References with focus on physics: [1], [11], [13], [14], [24]
• References with focus on both mathematics and physics: [12], [15], [29]
1. Born, Max; Jordan, Pascual: Zur Quantenmechanik. Zeitschrift für Physik 34, 858-888 (1925)
2. Bourbaki, Nicolas: Éléments de mathématique. Groupes et Algèbres de Lie. Chapitre I. Diffusion C.C.L.S., Paris (without year)
3. Bourbaki, Nicolas: Éléments de mathématique. Groupes et Algèbres de Lie. Chapitre VII, VIII. Algèbres de Lie. Diffusion C.C.L.S., Paris (without year)
4. Bourbaki, Nicolas: Éléments de mathématique. Groupes et Algèbres de Lie. Chapitre II, III. Algèbres de Lie. Diffusion C.C.L.S., Paris (without year)
5. Bourbaki, Nicolas: Elements of Mathematics. Algebra II. Chapters 4-7. Springer, Berlin (2003)
6. Bourbaki, Nicolas: Elements of Mathematics. General Topology. Chapters 1-4. Springer, Berlin (1989)
7. Bröcker, Theodor; tom Dieck, Tammo: Representations of Compact Lie Groups. Springer, New York (1985)
8. Dugundji, James: Topology. Allyn and Bacon, Boston (1966)
9. Forster, Otto: Analysis 1. Differential- und Integralrechnung einer Veränderlichen. Vieweg, Reinbek bei Hamburg (1976)
10. Forster, Otto: Komplexe Analysis. Universität Regensburg, Regensburg (1973)
11. Georgi, Howard: Lie Algebras in Particle Physics. 2nd ed., Westview (1999)
12. Gilmore, Robert: Lie Groups, Lie Algebras, and Some of Their Applications. Dover Publications, Mineola (2005)
13. Grawert, Gerald: Quantenmechanik. Akademische Verlagsanstalt, Wiesbaden (1977)
14. Hall, Brian: Quantum Theory for Mathematicians. Springer, New York (2013)
15. Hall, Brian: Lie Groups, Lie Algebras, and Representations. An Elementary Introduction. 2nd ed., Springer, Heidelberg (2015)
16. Hatcher, Allen: Algebraic Topology. Cambridge University Press, Cambridge (2002). Download https://www.math.cornell.edu/~hatcher/AT/AT.pdf
17. Helgason, Sigurdur: Differential Geometry, Lie Groups and Symmetric Spaces. Academic Press, New York (1978)
18. Hilgert, Joachim; Neeb, Karl-Hermann: Lie-Gruppen und Lie-Algebren. Braunschweig (1991)
19. Hilgert, Joachim; Neeb, Karl-Hermann: Structure and Geometry of Lie Groups. New York (2012)
20. Humphreys, James E.: Introduction to Lie Algebras and Representation Theory. Springer, New York (1972)
21. Kac, Victor: Introduction to Lie Algebras, http://math.mit.edu/classes/18.745/index.html. Cited 6 July 2016
22. Lang, Serge: Algebra. Addison-Wesley, Reading (1965)
23. Lang, Serge: SL(2,R). Springer, New York (1985)
24. Messiah, Albert: Quantum Mechanics. Volume 1. North-Holland Publishing Company, Amsterdam (1970)
25. Montgomery, Deane; Zippin, Leo: Topological Transformation Groups. Interscience Publishers, New York (1955)
26. Narasimhan, Raghavan: Several Complex Variables. The University of Chicago Press, Chicago (1971)
27. Serre, Jean-Pierre: Complex Semisimple Lie Algebras. Reprint of the 1987 edition, Springer, Berlin (2001)
28. Serre, Jean-Pierre: Lie Algebras and Lie Groups. 1964 Lectures given at Harvard University. 2nd edition, Springer, Berlin (2006)
29. Schottenloher, Martin: Geometrie und Symmetrie in der Physik. Leitmotiv der Mathematischen Physik. Vieweg, Braunschweig (1995)
30. Stöcker, Ralph; Zieschang, Heiner: Algebraische Topologie. Eine Einführung. Teubner, Stuttgart (1988)
31. Spanier, Edward: Algebraic Topology. Tata McGraw-Hill, New Delhi (Repr. 1976)
32. Varadarajan, Veeravalli S.: Lie Groups, Lie Algebras, and their Representations. Springer, New York (1984)
33. Weibel, Charles A.: An Introduction to Homological Algebra. Cambridge University Press, Cambridge (1994)