
Appendix A: Mathematical tools

The following pages provide a brief summary of the mathematical material used in the lectures of Advanced Physical Chemistry II. No rigorous proofs are provided, although the proof of statements is generally either outlined, or an illustrative example is sketched. The reader unfamiliar with the material will find a distillation of the "bare-bones" facts, and may wish to read up on more details in the appropriate mathematical physics or mathematics textbook, such as Butkov, Applied Mathematics, or, for the not-faint-of-heart, Jeffreys & Jeffreys, Mathematical Physics.

A.1 Complex variable analysis

1.1 Complex numbers

The Schrödinger equation is a complex-valued differential equation, and its solutions are complex functions. This behavior of the wavefunction is very important as the basis of quantum mechanical interference effects, and a solid understanding of complex variable analysis is necessary to really appreciate the properties of wavefunctions.

We start with the usual definition i² = −1, in terms of which a general complex number can be written as z = x + iy. For a product of two complex numbers, the standard rules of algebra then yield

(x + iy)(c + id) = (xc − yd) + i(xd + yc)   A-1

Fig. A-1. A complex number z = x + iy in the complex plane, with real axis R, imaginary axis I, and modulus |z| = r.

The complex conjugate of z = x + iy is defined as z* = x − iy, with zz* = x² + y².   A-2

The latter expression looks like the square modulus of a cartesian vector in two dimensions. Thus, we can view complex numbers as 2-D cartesian vectors, with length given by

|z| = √(|z|²) = √(zz*)   A-3

Addition and multiplication follow the usual associative rules.

1.2 Functions of a complex variable: g(z) = g(x + iy)

Functions of a complex variable can be defined in terms of analytic continuation of real functions into the complex plane. In practice, this means that a complex variable can be substituted into standard expressions instead of a real variable, and manipulated using the algebra outlined in 1.1. For example, the exponential function becomes

e^z = e^{x+iy} = e^x e^{iy};
e^{iy} = 1 + iy − (1/2)y² − (i/3!)y³ + …
       = (1 − (1/2)y² + …) + i(y − (1/3!)y³ + …)   A-4

The Taylor expansion in the last line corresponds to exp[iy] = cos[y] + i sin[y], the famous Euler formula for the exponentiation of an imaginary number. Note that the square modulus of exp[iy] is always 1, and y can be interpreted as an angular phase factor as it varies from 0 to 2π. Any complex number can be written in this exponential form:

x + iy = re^{iθ},   r = √(x² + y²),   θ = tan⁻¹(y/x).   A-5

In a similar way, any function of a complex variable can be expressed in terms of functions of real variables. For example,

ln z = ln(x + iy) = ln(re^{iθ}) = ln r + ln e^{iθ} = ln r + iθ,   A-6

so the natural logarithm of a complex number automatically yields r and θ separated into its real and imaginary parts. Since the cos[] and sin[] functions in the Euler formula cycle after 2π, it is clear that some complex functions will be multivalued, just as the real square-root function is two-valued. Consider the square root function

f = √z = z^{1/2} = (re^{iθ})^{1/2} = r^{1/2} e^{iθ/2}   A-7

This function does not return to its original value after we traverse a circle in the complex plane and return to the x axis:

θ → 0 to 2π:   f → r^{1/2} to −r^{1/2}
θ → 2π to 4π:  f → −r^{1/2} to +r^{1/2}
}  two different functions.   A-8

However, we like to think of them as one function defined on a two-sheeted Riemann surface, as shown in the plot below. The seam where the sheets interleave is the "branch cut."

Fig. A-2. Double-leafed Riemann sheet for a double-valued complex function such as √z.
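As a concrete illustration (a minimal numerical sketch, assuming numpy is available), one can follow r^{1/2} e^{iθ/2} continuously around the origin and watch it land on the second sheet:

```python
# Sketch: follow the continuous branch of sqrt(z) around the origin (A-7, A-8).
import numpy as np

r = 2.0
theta = np.linspace(0.0, 4.0 * np.pi, 9)     # two full loops around z = 0
f = np.sqrt(r) * np.exp(1j * theta / 2.0)    # r^(1/2) e^(i theta/2)

for th, val in zip(theta, f):
    print(f"theta = {th / np.pi:3.1f} pi   f = {val:+.3f}")
# After one loop (theta = 2 pi), f = -sqrt(2): the second sheet; only after
# two loops (theta = 4 pi) does f return to its starting value +sqrt(2).
```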


1.3 Derivatives of complex functions

Let f(z) = F(x + iy) = u(x,y) + iv(x,y); what is ∂f/∂z?

Calculus of complex functions is built along the same lines as real calculus; however, derivatives can now be taken in many directions in the complex plane, as opposed to along a single variable axis. This introduces additional constraints on complex functions in order to satisfy continuity. As usual,

∂f/∂z ≈ Δf/Δz = [F(x + Δx + i(y + Δy)) − F(x + iy)] / (Δx + iΔy)
      = { u(x + Δx, y + Δy) − u(x,y) + i[v(x + Δx, y + Δy) − v(x,y)] } / (Δx + iΔy)   A-9

This derivative should be uniquely defined in terms of z, no matter what the direction of approach. For simplicity, consider derivatives taken along the real (x) and imaginary (y) axes:

Let Δy = 0 (approach along the real axis)  ⇒  ∂f/∂z = ∂u/∂x + i ∂v/∂x
Let Δx = 0 (approach along the imaginary axis)  ⇒  ∂f/∂z = −i ∂u/∂y + ∂v/∂y
⇒  ∂u/∂x = ∂v/∂y  and  ∂v/∂x = −∂u/∂y   A-10

Continuity (identity) of the derivatives can only be satisfied simultaneously if the real and imaginary parts of the two derivatives can be identified, resulting in the Cauchy-Riemann relations in the last line of A-10. Therefore, not just any combination of x's, y's and i's is a proper complex function of the variable z = x + iy; e.g. f = x² + iy is not an analytic complex function. To verify the Cauchy-Riemann equations, consider for example the ln[] function:

f(z) = ln z,   f'(z) = 1/z = 1/(x + iy) = (x − iy)/(x² + y²)
⇒  u = x/(x² + y²),   ∂u/∂x = 1/(x² + y²) − 2x²/(x² + y²)² = (y² − x²)/(x² + y²)²
   v = −y/(x² + y²),  ∂v/∂y = −1/(x² + y²) + 2y²/(x² + y²)² = (y² − x²)/(x² + y²)²   A-11

and similarly for the other Cauchy-Riemann relation. To summarize: generally we can take derivatives as for any real function, but NOT ANY u(x,y) + iv(x,y) is a legal complex function.
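A quick symbolic check of A-11 (a sketch assuming sympy is available; u and v are the real and imaginary parts of f'(z) = 1/z):

```python
# Verify the Cauchy-Riemann relations (A-10) for u + iv = 1/z as in A-11.
import sympy as sp

x, y = sp.symbols('x y', real=True)
u = x / (x**2 + y**2)        # Re(1/z)
v = -y / (x**2 + y**2)       # Im(1/z)

print(sp.simplify(sp.diff(u, x) - sp.diff(v, y)))   # du/dx - dv/dy = 0
print(sp.simplify(sp.diff(v, x) + sp.diff(u, y)))   # dv/dx + du/dy = 0
```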

1.4 Complex integrals

Integrals in the complex plane follow a contour line C (x = x(t), y = y(t)), much like line integrals in the x-y plane. However, since f(z) cannot be a completely arbitrary function of the real and imaginary parts of z, contour integrals have certain special properties. We can write any contour integral as two line integrals over vector functions F and G to separate real and imaginary parts:

∫_C f(z) dz = ∫_C (u + iv)(dx + i dy) = ∫_C (u dx − v dy) + i ∫_C (v dx + u dy) = ∫_C F·dx + i ∫_C G·dx   A-12


where F = u(x,y) x̂ − v(x,y) ŷ and G = v(x,y) x̂ + u(x,y) ŷ. Finally, if we choose a specific curve C with x = x(t) and y = y(t), the line integrals become

∫_C f(z) dz = ∫_{C(t)} [u (dx/dt) − v (dy/dt)] dt + i ∫_{C(t)} [v (dx/dt) + u (dy/dt)] dt
            = ∫_C F·(dx/dt) dt + i ∫_C G·(dx/dt) dt.   A-13

Fig. A-3. Integration contour in the complex plane, with one contribution to the integral indicated at z = x + iy.

This formula is the general recipe for evaluating complex integrals along any curve in the complex plane. Consider the following example, integrating the function exp[z] along a parabolic path y = x² in the complex plane from x = y = 0 to x = y = 1:

e^z = e^{x+iy} = e^x (cos y + i sin y),   x = t, y = t²:
∫_{(0,0)}^{(1,1)} e^z dz = ∫_0^1 dt (e^t cos t² − 2t e^t sin t²) + i ∫_0^1 dt (e^t sin t² + 2t e^t cos t²)   A-14

Fig. A-4. Integration contour in the example, running from (0,0) to (1,1).

Sometimes, variable substitution helps. In particular, when the path is circular, a common case, the use of polar coordinates is recommended. For example, consider the integral of the function f(z) = 1/z around the origin on a circle of radius r. Then z = re^{iθ}, r = const., dz = ire^{iθ} dθ,   A-15   and we have

∫_C dz f(z) = ∮ dz/z = ∫_0^{2π} (ire^{iθ} dθ)/(re^{iθ}) = i ∫_0^{2π} dθ = 2πi.   A-16
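The same result can be checked numerically by discretizing the parametrization z = re^{iθ} (a sketch assuming numpy; the radius r is arbitrary, as the printout confirms):

```python
# Numerical check of A-16: the contour integral of 1/z around the origin is 2*pi*i.
import numpy as np

theta = np.linspace(0.0, 2.0 * np.pi, 2001)
for r in (0.5, 1.0, 3.0):
    z = r * np.exp(1j * theta)
    dz_dtheta = 1j * r * np.exp(1j * theta)
    integral = np.trapz(dz_dtheta / z, theta)
    print(r, integral, 2j * np.pi)     # ~ 2*pi*i for every radius
```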

Integrals around closed contours such as circles are identically zero when the function being integrated is analytic (i.e. continuous and differentiable) everywhere inside the closed contour:

∮_C dz f(z) = 0,  since  ∮_C F·dx = ∫_S (∇ × F)·ds   (Stokes' theorem):
∮_C (u dx − v dy) = ∫_S dx dy [∇ × (u, −v, 0)]_z = −∫_S dx dy (∂u/∂y + ∂v/∂x) = 0   (Cauchy's theorem)   A-17


The first line uses Stokes' theorem (vide infra) to turn the line integral around a closed contour into a surface integral of the curl of F over the enclosed surface. The second line then uses the Cauchy-Riemann relations for a differentiable complex function to show that this surface integral must be zero.

Even if f(z) has a singularity, the integral is still independent of the choice of contour, if both contours include the same singularity:

0 = ∮_C f(z) dz = −∮_{C2} f(z) dz + ∮_{C1} f(z) dz  ⇒  ∮_{C1} f(z) dz = ∮_{C2} f(z) dz   A-18

We can write the first identity because the integrals along the two lines connecting the contours cancel.

Fig. A-5. The integral around the overall contour is zero since it contains no singularity; therefore the integrals around C1 and C2, which both enclose the singularity, are equal.

What about the actual value of these two integrals? If there is a singularity inside C, at z0, we expand in a Laurent series

f(z) = Σ_{n=n_min}^{∞} c_n (z − z0)^n = … + c_{−2}/(z − z0)² + c_{−1}/(z − z0) + c0 + c1(z − z0) + c2(z − z0)² + …   A-19

The difference between a Laurent series and a Taylor series lies in the negative powers of z, which allow one to account for singularities. Note that n does not start at −∞ in the summation. In that case, the singularity would be "essential," and what follows does not apply.

If there is a singularity at z0 (i.e. some of the c's with negative subscripts are nonzero), a closed contour integral can generally be written as

∮_C f(z) dz = Σ_{n=n_min}^{∞} ∮_{circle around z0} dz c_n (z − z0)^n = Σ_{n=n_min}^{∞} ∮_{circle around 0} dz' c_n z'^n   A-20

Note that the contour C has been replaced by a circle enclosing the same singularity, since the two integrals have the same value. If n ≥ 0, the integrand is a polynomial, and therefore analytic; the integral vanishes. If n < 0, we have

∮_C c_n z^n dz = c_n ∫_0^{2π} dθ ie^{iθ} e^{inθ} = c_n i ∫_0^{2π} dθ e^{i(n+1)θ}  = 0 if n ≠ −1
                                                                                = 2πi c_{−1} if n = −1   A-21
(taking a unit-radius circle; the result is independent of the radius).

Thus only the n = −1 term contributes to the integral if a function has a nonessential singularity inside the contour C:

⇒ ∮_C f(z) dz = 2πi c_{−1}(z0)   (simple singularity at z0),   where c_{−1}(z0) ≡ Res(f(z0)).   A-22

The remaining problem is to calculate the c_{−1} coefficient, or residue, of the function f(z) at the point z0.


For a simple pole:   lim_{z→z0} (z − z0) f(z) = Res(f(z0))   A-23

e.g. if f(z) = g(z)/(z − z0)  ⇒  Res{f(z)}_{z0} = g(z0).   A-24

In some cases, there may be a singularity at ∞ and the contour includes ∞ (e.g. the common case of an integral along the real axis from −∞ to +∞). In such a case, one can make a change of variables z' = 1/z, and find the residue at z' = 0, which is the same as the residue at z = ∞.

Consider the following example of a contour integration: the function 1/[a² + x²] is to be integrated from −∞ to +∞. We proceed by replacing the real integral by a contour integral with the contour as shown in the figure: a half circle with radius R which we allow to go to infinity. C1 is the part of the contour we want (the real axis), and C2 goes to zero as 1/R:

ex:  ∫_{−∞}^{∞} dx/(a² + x²) = ?   Replacing the real integral by a contour integral,
∮_C dz/(z² + a²) = ∮_C dz/[(z + ia)(z − ia)] = ∫_{C1} dz/[(z + ia)(z − ia)] + ∫_{C2} dz/[(z + ia)(z − ia)].
The second of these terms vanishes:  ∫_0^{π} Rie^{iθ} dθ/R² → 0 as R → ∞
⇒ ∮_C dz/[(z + ia)(z − ia)] = ∫_{−∞}^{∞} dx/(a² + x²);   the left-hand side can be evaluated using
Res(1/[(z + ia)(z − ia)])_{z=ia} = 1/(2ia)  ⇒  ∫_{−∞}^{∞} dx/(a² + x²) = 2πi/(2ia) = π/a   A-25

Fig. A-6. Integration contour in the example: C1 runs along the real axis, C2 is the half circle closing through the upper half plane around the pole at +ia. The radius of C2 goes to ∞, so it does not contribute to the integral.
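As a sanity check of A-25 (a sketch assuming scipy is available for the quadrature):

```python
# Compare the residue-theorem result pi/a with direct numerical quadrature (A-25).
import numpy as np
from scipy.integrate import quad

for a in (0.5, 1.0, 2.0):
    val, err = quad(lambda x: 1.0 / (a**2 + x**2), -np.inf, np.inf)
    print(a, val, np.pi / a)           # quadrature reproduces pi/a
```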

1.5 Stationary phase integration

Consider an integral of the type I = ∫_{−∞}^{∞} dx e^{iS(x)}, where S is some real function of x. These types of integrals often come up when calculating approximate Ψ matrix elements or path integrals: the integrand oscillates wildly in most places except where S has a minimum or maximum, where the exponential function varies very slowly. One suspects that only those regions contribute significantly to the integral. To see how, expand S about a minimum S0 such that S0' vanishes:

Let S = S0 + S0'(x − x0) + ½ S0''(x − x0)² + …
I ≈ e^{iS0} ∫_{−∞}^{∞} e^{(i/2) S0'' (x − x0)²} dx = e^{iS0} √(2πi/S0'')   ( using ∫_{−∞}^{∞} e^{−a²x²} dx = √π/a with a² = S0''/2i )   A-26

Only the quadratic term contributes to the integral in lowest order, and it can be evaluated using the Gaussian integral shown in parentheses.

Fig. A-7. The major contribution to the integral of exp[iS(x)] comes from the point of stationary phase, where S' = 0.

A.2 Fourier transforms

2.1 Definition

Fourier transforms play an important role in quantum mechanics as they convert representations from one conjugate variable to another (e.g. x to p, or ω to t). Many different definitions are in use. The one given below uses symmetric normalization factors (favored in mathematics); in parentheses, the normalization often used in physics and chemistry is also shown.


F.T.(g(t)) = (1/√(2π)) ∫_{−∞}^{∞} e^{iωt} g(t) dt = G(ω)   ( or 1·∫dt …, or ∫dx e^{−ikx} … )
F.T.⁻¹(G(ω)) = (1/√(2π)) ∫_{−∞}^{∞} dω e^{−iωt} G(ω) = g(t)   ( or (1/2π)·∫ …, or ∫dk e^{ikx} … )

ex:  F.T.(sin ω't) = (1/√(2π)) ∫_{−∞}^{∞} e^{iωt} sin ω't dt = (1/√(2π)) ∫_{−∞}^{∞} e^{iωt} (1/2i)(e^{iω't} − e^{−iω't}) dt
= (1/√(2π)) (1/2i) ∫_{−∞}^{∞} (e^{i(ω+ω')t} − e^{i(ω−ω')t}) dt
= (1/2i)(1/√(2π)) · 2π (δ(ω + ω') − δ(ω − ω'))   A-27

The example shows that the function sin ω't only has frequency contributions at ω = ±ω'. The frequency spectrum is "infinitely" narrow and given by delta-distributions. In general, the function should be absolutely integrable (i.e. ∫|g| dt is finite) in order to have a well-defined Fourier transform. Note that for position x, as opposed to time, the F.T. and its inverse are often defined with the sign switched, to be compatible with wave expressions of the type exp[ikx − iωt]. The ± definition is arbitrary, but of course needs to be used consistently throughout.

2.2 Some useful properties

The following equations summarize a series of properties of Fourier transforms:

-Time shift yields frequency phase factor:

F.T.(g(t − t0)) = (1/√(2π)) ∫_{−∞}^{∞} e^{+iωt} g(t − t0) dt = (1/√(2π)) e^{+iωt0} ∫_{−∞}^{∞} e^{+iωt'} g(t') dt' = e^{+iωt0} G(ω)   A-28

-F.T. of derivative turns into product factor:

F.T.(g'(t)) = (1/√(2π)) ∫_{−∞}^{∞} dt g'(t) e^{+iωt} = (1/√(2π)) { [g(t) e^{+iωt}]_{−∞}^{∞} − iω ∫_{−∞}^{∞} e^{+iωt} g(t) dt }
= −iω F.T.(g(t));   in general, F.T.(g^{(n)}(t)) = (−iω)^n F.T.(g(t))   A-29

-F.T. of a convolution is the product of the F.T.s:

g∗h = (1/√(2π)) ∫_{−∞}^{∞} dt' g(t') h(t − t')  ⇒  F.T.(g∗h) = (1/2π) ∫ dt e^{iωt} ∫ dt' g(t') h(t − t')
= (1/2π) ∫_{−∞}^{∞} dt' g(t') ∫_{−∞}^{∞} dt e^{iωt} h(t − t')   (let t'' = t − t', dt = dt'')
= (1/2π) ∫_{−∞}^{∞} dt' g(t') ∫_{−∞}^{∞} dt'' e^{iω(t''+t')} h(t'')
= (1/√(2π)) ∫_{−∞}^{∞} dt' e^{iωt'} g(t') · (1/√(2π)) ∫_{−∞}^{∞} dt'' e^{iωt''} h(t'') = G(ω) H(ω)   A-30

The convolution function occurs in many applications, for example the increase in observed linewidths due to convolution of the natural lineshape with a "broadening factor" due to laser resolution, molecular motion (Doppler effect), etc. In a laser autocorrelation measurement, the output intensity is obtained by convolving the input electric fields with respect to one another:

I(t0) = ∫_{−∞}^{∞} dt A(t) A(t0 − t)  ⇒  ∫ I(t0) dt0 = ∫ |A(ω)|² dω   A-31

The formula at the bottom is a form of the Wiener-Khintchine theorem, a fundamental theorem of statistical mechanics relating the integral of power over time to the power in the frequency domain. These F.T. formulas are practical because of a computer algorithm known as the FFT, developed by Cooley and Tukey in the early 1960s, which allows rapid evaluation of Fourier transforms.
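A minimal FFT sketch of the convolution theorem A-30 (assuming numpy; discrete transforms and circular convolution stand in for the continuous case):

```python
# The FFT of a (circular) convolution equals the product of the FFTs.
import numpy as np

n = 256
g = np.random.rand(n)
h = np.random.rand(n)

via_fft = np.real(np.fft.ifft(np.fft.fft(g) * np.fft.fft(h)))
direct = np.array([sum(g[j] * h[(i - j) % n] for j in range(n)) for i in range(n)])

print(np.allclose(via_fft, direct))   # True
```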

A.3 Linear algebra, function spaces, Dirac notation

Quantum mechanics uses complex-valued wavefunctions to describe the evolution of a system, operators that act on these wavefunctions to transform them into other wavefunctions, and expectation values of operators, which correspond to observable quantities. This is directly analogous to vectors, matrices that act on them to yield other vectors, and scalar products of vectors with such transformed vectors:

|Ψ⟩ ↔ v
Â|Ψ⟩ = |Ψ'⟩ ↔ Av = v'
⟨Ψ|Â|Ψ⟩ = ⟨Ψ|Ψ'⟩ ↔ v·(Av) = v·v'   A-32

The analogy is twofold: the function space of wavefunctions is directly analogous to the n-dimensional vector spaces of linear algebra; and by using basis set representations of wavefunctions and operators, the quantum mechanical problem in effect becomes a linear algebra problem. We briefly review these issues below.

3.1 Hilbert space

Among the many notations attempted for quantum mechanics, the most powerful has proven to be in terms of a "function space," in which the functions can be thought of as vectors, much like n-dimensional vectors in ordinary vector space. To be useful for quantum mechanics, such a function space h must have a number of properties:

-The functions Ψ must have a norm |Ψ|. In practice, ∫dx Ψk*Ψk = 1 is the square of this norm (∫dx Ψk*Ψk' = δ(k − k') for free particles).

-There exists a scalar product operation ⟨ | ⟩ which maps any two functions Ψ, Ψ' ∈ h onto the complex numbers, with properties analogous to ordinary scalar products. In practice, this is done by an operation ∫dx Ψ*(x)_. The complete set of operations ∫dx Ψ*(x)_ for all Ψ ∈ h also forms a space, the dual space h* of h. The recipe for computing the scalar product of Ψ and Ψ' is as follows: take the operation ∫dx Ψ*(x)_ dual to Ψ, and insert Ψ' in the placeholder _. The result is the scalar product ⟨Ψ|Ψ'⟩, a complex number. In particular, √⟨Ψ|Ψ⟩ is the positive-definite norm of Ψ. Also, ⟨Ψ'|Ψ⟩ = ⟨Ψ|Ψ'⟩*.

The first of these properties is sufficient for a "Banach space" h. An entity H = (h; ⟨ | ⟩), which consists of a Banach space on which a scalar product is defined, is known as a "Hilbert space." These concepts are entirely analogous to the ones from ordinary linear algebra. For example, an ordinary vector v is an element of a vector space V. The norm of v is the length of that vector. The dual space of V has elements v· which are operations acting on the v ∈ V, and the scalar product between v and v' is defined as v·v'. The norm is √(v·v).

For a Hilbert space, we can define a complete orthonormal basis {Ψi}. Orthonormal implies ⟨Ψi|Ψj⟩ = δij. Completeness implies that any function Ψ ∈ H can be expressed as a linear combination of these basis functions. (If not, we would add the part of Ψ that's left over as another basis function to obtain completeness!) This implies that Ψ can be fully characterized by its scalar products ⟨Ψi|Ψ⟩ with the basis functions Ψi. These scalar products can be assembled into a vector c with coefficients ⟨Ψi|Ψ⟩ and the same dimension as the Hilbert space. This allows one to reduce function spaces to ordinary vector spaces, as discussed below.

3.2 Dirac notation and operators

Functions such as Ψ can be expressed in terms of many different kinds of variables, such as x or p, depending on what is most convenient. Very often, one only wants to discuss the functions themselves, irrespective of which variables they are represented in terms of. It is then convenient to use the bracket notation of Dirac and replace

Ψ ∈ H by the ket |Ψ⟩
∫dx Ψ*(x)_ ∈ H* by the bra ⟨Ψ|.   A-33

Note that while |Ψ⟩ is indeed the function Ψ as an abstract member of the function space H, ⟨Ψ| is not Ψ*(x) or any function at all, but rather an overlap operation in the dual space H*. Lack of realization of this fact leads to endless confusion and misuse of the Dirac notation.

We now have a compact representation for functions and their overlap integrals (or scalar products) with other functions. The final ingredient is the addition of linear operators, which can turn functions into new functions. Again, the best analogy is conventional vector space: v' = a v, where a is a scalar. The new vector v' points in the same direction as v, but has different normalization. Similarly, Ψ' = a Ψ. To change the orientation of a vector, we need to apply a matrix M: v' = M v. M may induce a translation, rotation, or shearing of the vector. Similarly, we can have Ψ' = MΨ, where M is a linear operator (i.e. it acts on Ψ to the first power as shown). A wide class of linear operators is allowed, such as differentiation, multiplication by another function, etc. Consider for example the angular-variable function Ψ(ϕ) acted on by the operator exp[α d/dϕ]:

e^{α d/dϕ} Ψ(ϕ) = (1 + α d/dϕ + (α²/2) d²/dϕ² + …) Ψ = Ψ(ϕ + α)   A-34

Clearly, exp[α d/dϕ] is the wavefunction equivalent of a rotation matrix for a vector, rotating the wavefunction by an angle α.

The analogy with linear algebra vector spaces is even more direct. Let us express wavefunctions Ψ and Ψ' in terms of their coefficients ci and ci' in a basis {Ψi}, as vectors c and c'. Then ci = ∫dx Ψi*Ψ and ci' = ∫dx Ψi*Ψ'. We can compute "matrix elements" of any operator M using the dual basis: Mij ≡ ⟨Ψi|M|Ψj⟩. If Ψ' = MΨ, then

∫dx Ψi* Ψ' = ∫dx Ψi* MΨ
Σj cj' ∫dx Ψi* Ψj = Σj cj ∫dx Ψi* M Ψj   A-35
ci' = Σj ⟨Ψi|M|Ψj⟩ cj   or   c' = Mc   or   |Ψ'⟩ = M|Ψ⟩

The first step simply involves application of the "overlap operator"; in the second step, Ψ and Ψ' are expressed in terms of the basis; the third step switches to Dirac notation for the overlap integrals, AND assumes the basis functions are orthonormal. Steps 4 and 5 are the same as step 3, in pure algebra or pure Dirac notation.

Thus, acting with an operator on a wavefunction can be replaced by a linear algebra problem of matrix multiplication if we write the wavefunction as a vector of basis set coefficients, and the operator as a matrix of overlap integrals. Dirac notation is an intermediate form of notation which allows us to write equations in terms of the abstract wavefunction, its "dual" bra, and abstract operators without reference to a basis, while maintaining the simplicity of pure matrix notation.

In the above, the dual space overlap operation ∫dx Ψ*(x)_ is referred to as an operation, not an operator. This is because its application to a wavefunction yields a scalar (the overlap coefficient ci) rather than another function, as an operator would. We can write an operator which is the direct analog of the dual space operations as follows:

Pi = Ψi ∫dx Ψi*_ = |Ψi⟩⟨Ψi|   A-36

When applied to a function Ψ, Pi computes its overlap with Ψi and returns the function Ψi times that overlap. It effectively projects out the part of Ψ that looks like Ψi and returns it. Hence Pi is referred to as a projection operator. Generally, operators can be constructed by combining functions ∈ h and duals ∈ h* in "reverse" order from matrix elements.
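The following numpy sketch makes A-35 and A-36 concrete, using a discretized sine basis on [0,1] as a stand-in orthonormal basis (the basis choice and grid are illustrative assumptions):

```python
# Coefficients c_i = <psi_i|psi> and the projector P_1 = |psi_1><psi_1|.
import numpy as np

n = 200
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]
basis = [np.sqrt(2.0) * np.sin(k * np.pi * x) for k in (1, 2, 3)]  # orthonormal

psi = 0.6 * basis[0] + 0.8 * basis[2]                  # a state in the subspace
c = np.array([dx * np.dot(b, psi) for b in basis])     # overlap coefficients
print(c)                                               # ~ [0.6, 0.0, 0.8]

P1 = dx * np.outer(basis[0], basis[0])                 # projection operator
print(np.allclose(P1 @ psi, 0.6 * basis[0], atol=1e-2))  # projects onto psi_1
```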

3.3 Hermitean operators

Since observable quantities must be real (as far as we know) and are interpreted as expectation values in quantum mechanics, operators with real expectation values are of particular interest. For a matrix to have this property, it must be Hermitean, i.e. it must equal its complex conjugate-transpose: M = M†. An analogous property must therefore hold for Hermitean operators. For any operator,

⟨Ψ|F|Φ⟩ = ⟨Ψ|FΦ⟩ = ⟨FΦ|Ψ⟩* = ⟨Φ|F†Ψ⟩* = ⟨Φ|F†|Ψ⟩*   A-37
The last step can be taken as the definition of the hermitean conjugate F† of F. In the special case of hermitean operators, ⟨Ψ|F|Φ⟩ = ⟨Φ|F|Ψ⟩* must hold for any matrix elements in any basis representation, or the expectation values will not be real. This immediately implies that F = F† must hold for a hermitean operator, i.e. in its matrix representation it must equal its own complex conjugate-transpose. This can easily be shown in matrix notation:

Let ⟨Ψ|F|Φ⟩ = Σ_{i,j} ci* Fij dj, or, with Ψ = Σi ci Ψi and Φ = Σi di Ψi,

= (c1*  c2*  ⋯) ( f11  f12 ; f21  f22 ; ⋯ ) ( d1 ; d2 ; ⋯ ) = c1* f11 d1 + c1* f12 d2 + …   A-38

We claim that the complex conjugate transpose matrix element of operator f†, namely

(d1*  d2*  ⋯) ( f†11  f†12 ; f†21  f†22 ; ⋯ ) ( c1 ; c2 ; ⋯ ) = d1* f†11 c1 + d1* f†12 c2 + d2* f†21 c1 + …,
with (F†)12 = f21*, (F†)21 = f12*, etc.,   A-39

should be the same. As shown by the example of f12, comparison of the two explicit expansions confirms that this is indeed the case, if we make use of the hermitean property of the operator's matrix representation.

3.4 Other useful matrix properties

-Inverse and transpose of products:
(AB)⁻¹ = B⁻¹A⁻¹
(AB)† = B†A†   A-40

The latter can be shown easily by direct evaluation:

[ ( a11  a12 ; a21  a22 ) ( b11  b12 ; b21  b22 ) ]†
= ( a11 b11 + a12 b21   a11 b12 + a12 b22 ; a21 b11 + a22 b21   a21 b12 + a22 b22 )†
= ( a11* b11* + a12* b21*   a21* b11* + a22* b21* ; a11* b12* + a12* b22*   a21* b12* + a22* b22* ) = B†A†

-Antihermitean property of commutators:


Let A and B be hermitean; then
[A,B]† = (AB)† − (BA)† = B†A† − A†B† = BA − AB = −[A,B]   A-41

This implies that [A,B] is anti-Hermitean: its Hermitean conjugate equals minus itself. [A,B] can therefore be written as iC, where C is a Hermitean operator.

-Diagonalization of Hermitean matrices by unitary matrices:
Let M⁻¹HM = Λ, with H hermitean ⇒ (M⁻¹HM)† = Λ† = Λ ⇒ M†H(M⁻¹)† = Λ, or M⁻¹ = M†,  M†M = I   A-42

The last property defines M to be a unitary matrix.

3.5 Behavior of the determinant of matrices

i) the determinant of A is the sum over all possible products of matrix elements including only one from each row and column, with a sign alternation for odd and even permutations:

det( a11 a12 a13 ⋯ ; a21 a22 a23 ⋯ ; a31 a32 a33 ⋯ ; ⋮ ⋱ )
= a11 det( a22 a23 ⋯ ; a32 a33 ⋯ ; ⋮ ⋱ ) − a12 det( a21 a23 ⋯ ; a31 a33 ⋯ ; ⋮ ⋱ ) + a13 (⋯) − ⋯   A-43

ex:  det( a11  a12 ; a21  a22 ) = a11 a22 − a12 a21   A-44

Note: det(A) is unchanged if we add one row to another; exchanging two rows changes its sign.

ii) det(AB) = det(A) det(B)   A-45

iii) det(A⁻¹) = 1/det(A), since det(AA⁻¹) = det(I) = 1 = det(A) det(A⁻¹)   A-46

iv) ⇒ a similarity transformation leaves the determinant unchanged: det(M⁻¹GM) = det(M⁻¹) det(G) det(M) = det(G)   A-47

v) ⇒ the determinant of a matrix is just the product of its eigenvalues, since diagonalization is just a similarity transformation:
det(G) = Π_i γ_i

vi) For any matrix, det(M̃) = det(M); det(M*) = det(M)*   A-48


vii) For unitary matrices, det(M⁻¹) = det(M̃*) = det(M)*:   A-49
det(M⁻¹M) = det(M⁻¹) det(M) = 1 ⇒ det(M) det(M)* = 1, or det(M) = e^{iθ}.
Diagonalize M = diag(λ1, λ2, ⋯); M† = M⁻¹ ⇒ λ_i λ_i* = 1, or λ_i = e^{iθ_i}:   A-50
all eigenvalues of a unitary matrix have unit modulus.

3.6 Matrix diagonalization

Many problems in quantum mechanics can be posed as eigenvalue problems; the most famous is the solution of the time-independent Schrödinger equation for energy eigenvalues. The eigenvalue problem is the following: which functions in H are unchanged by application of a certain operator G, other than by a multiplicative factor λ, referred to as the eigenvalue? If we convert wavefunctions to vectors m, the question becomes: which vectors are unchanged by application of a matrix G, except for a multiplicative factor λ?

For hermitean and other well-conditioned matrices, the number of such vectors generally equals the dimension of the vector space. (This is of course infinite for a Hilbert space with infinitely many basis functions.) We can group all these vectors m into a matrix M, and all the eigenvalues into a diagonal matrix Λ, in which case the problem is posed as

GM = MΛ  or  M⁻¹GM = Λ,   A-51a
where the columns of M are the vectors m, and the diagonal elements of Λ are the eigenvalues. Note that the matrix M appears on the LEFT-hand side of Λ. For a single vector m, we can write this explicitly in terms of a set of linear equations to be solved: Gm = λm, or
(G − Iλ)m = 0;  explicitly,  for i = 1…n:  Σ_{j=1}^{n} (G_ij − δ_ij λ) x_j = 0   A-51b

It is not generally possible for n sums over n x_i variables to be simultaneously zero, unless all x_i are identically zero:

ex:  z + 5x + 2y = 0  →  z = −5x − 2y
     2z + 8x + 10y = 0  →  z = −4x − 5y
     z + 2x + 2y = 0
⇒ −x + 3y = 0, or x = 3y, z = −17y;  ∴ −17y + 6y + 2y = −9y = 0, so y = 0.   A-52

However, if one of the equations is redundant (can be expressed in terms of the others), then a solution is possible, where all x_i can be determined as multiples of one of them:

ex:  z + 5x + 2y = 0
     2z + 8x + 10y = 0
     4z + 18x + 14y = 0
⇒ x = 3y, z = −17y;  ∴ −68y + 54y + 14y = 0 for any y!   A-53


A condition for such redundancy is that the determinant of the coefficient matrix be zero, or det(G − Iλ) = 0, to make the set of equations singular. det(G − Iλ) is a polynomial in λ of order n, where n is the size of the matrix. The eigenvalues λ can be found by solving the polynomial, and can then be inserted into the linear equations above to yield the components of the eigenvectors.
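A short numpy sketch of the eigenvalue problem A-51 (the 2×2 matrix is an arbitrary example):

```python
# GM = M*Lambda, and each eigenvalue is a root of det(G - I*lambda) = 0.
import numpy as np

G = np.array([[2.0, 1.0], [1.0, 3.0]])
lam, M = np.linalg.eigh(G)              # eigenvalues, column eigenvectors

print(np.allclose(G @ M, M @ np.diag(lam)))       # GM = M Lambda (A-51a)
for l in lam:
    print(np.linalg.det(G - l * np.eye(2)))       # ~ 0: the secular determinant
```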

A.4 Tensors

4.1 Outer products; contraction

To illustrate how tensors arise in quantum mechanics, consider a system with 2 degrees of freedom spanned by the bases {ϕ_i(q)} and {χ_i(Q)}, with Hamiltonian operator H(q,Q). All basis functions of this Hamiltonian can be written as a product ϕ_i χ_j and are now classified by TWO indices (one for each coordinate). Such a basis is an outer product basis, and the number of its basis functions is the product of the numbers of functions in the individual bases from which it is constructed. The H matrix is now indexed by FOUR indices, and can be grouped as a matrix of matrices,

where, for example,
H_{15,23} = ⟨ϕ_1|⟨χ_2|H|χ_3⟩|ϕ_5⟩ = ⟨ϕ_1 χ_2|H|ϕ_5 χ_3⟩.   A-54
The upper left-hand block is a submatrix diagonal in ϕ_1; within such a block, only the indices for the basis {χ} change; from block to block, the indices in the basis {ϕ} change.

Such an entity with more than two indices is a tensor. In this case, H has 4 indices, and is a 4th rank tensor. Each block is an entire basis matrix for basis {χ}, and corresponds to one particular ⟨ϕ_i|H|ϕ_j⟩ matrix element. Just as the basis {ϕ_i χ_j} is labeled an outer product basis, H can be written as a sum over outer products of two matrices. The outer product of two n by n matrices A and B is obtained by putting replicas of the entire B matrix into each matrix element position a_ij of A, and multiplying by a_ij. Clearly, the dimension of this newly constructed 4th rank tensor is n² instead of n.

If we take diagonal matrix elements with respect to just one function ϕ, e.g. ϕ_1, we are left with a Hamiltonian where the q degree of freedom has been integrated out:

⟨ϕ_1|H(q,Q)|ϕ_1⟩ = H_eff(Q) = H_11(Q)   A-55
Such a Hamiltonian is known as an "effective Hamiltonian" in the Q degree of freedom. Integration (or the taking of matrix elements) reduces the number of indices by two. Our 4th rank tensor matrix for operator H(q,Q) is reduced back to an ordinary matrix (2nd rank tensor) for operator H_11(Q). H_11 is referred to as a "reduced" or "contracted" tensor. A typical application of this would be to take the total Hamiltonian of a molecule and take its expectation value with the ground electronic state wavefunction to eliminate the electronic coordinates, leaving a "ground state Hamiltonian" which depends only on nuclear coordinates.
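A numpy sketch of the outer product and its contraction (the separable two-mode Hamiltonian H = Hq⊗1 + 1⊗HQ is a hypothetical example, chosen so the contracted result can be predicted):

```python
# Build a 4-index H as a matrix of matrices (np.kron) and contract over phi_1.
import numpy as np

nq, nQ = 3, 4
rng = np.random.default_rng(0)
Hq = rng.normal(size=(nq, nq)); Hq = Hq + Hq.T      # hermitean q-block
HQ = rng.normal(size=(nQ, nQ)); HQ = HQ + HQ.T      # hermitean Q-block
H = np.kron(Hq, np.eye(nQ)) + np.kron(np.eye(nq), HQ)

T = H.reshape(nq, nQ, nq, nQ)          # expose the FOUR indices of eq. A-54
Heff = T[0, :, 0, :]                   # <phi_1|H|phi_1> = Heff(Q), eq. A-55
print(np.allclose(Heff, Hq[0, 0] * np.eye(nQ) + HQ))   # True
```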


4.2 Pseudotensors

Pseudotensors differ from tensors in that they do not invert their sign if the coordinate system is inverted. For example, consider the angular momentum "vector" J = r × p. If one inverts coordinates, r → −r and p → −p, but J ↛ −J. Instead, J = (−r) × (−p) = r × p = J. Thus the angular momentum "vector" does not flip direction in an inverted coordinate system. In fact, it is not a vector at all, but a pseudovector: the right-hand rule changes to a left-hand rule in an inverted coordinate system.

A more general way of writing angular momentum makes use of the Levi-Civita pseudotensor ε_ijk, which has elements equal to +1, −1 and 0 to yield the correct cross product result:

(a × b)_i = Σ_{j=l, k=m} ε_ijk a_l b_m   (ε = Levi-Civita tensor)   A-56

The product ε_ijk a_l b_m is a fifth rank pseudotensor. In the summation, two pairs of indices are set equal to one another and summed; this is a twofold contraction, resulting in a loss of two indices each time. The final result is a 1st rank pseudotensor, or a pseudovector with only one index.

The Kronecker δ is a second rank tensor closely related to ε in structure: it equals 1 when its two indices i and j are equal, and zero otherwise; it can thus be thought of as the unit matrix. Both ε and δ are useful in writing shorthand matrix manipulations, particularly those involving angular momentum operators.

4.3 Spatial tensors

Rotational properties in 3 dimensions are of great practical importance in quantum mechanics (e.g. angular momentum) and optics (e.g. anisotropic crystals). Consider for example the total dipole moment of an assembly of molecules, which has a permanent and an induced component:

µ_tot = µ_0 + µ_ind = µ_0 + χ⁽¹⁾·E + χ⁽²⁾:EE + χ⁽³⁾⋮EEE,
e.g.  χ⁽¹⁾·E → diag(χ_xx, χ_yy, χ_zz)(E_x, E_y, E_z)ᵀ,   (χ⁽²⁾:EE)_i → Σ_{j,k} χ_ijk E_j E_k   A-57

χ⁽¹⁾ is a matrix or 2nd rank tensor, indicating that the induced dipole does not necessarily point in the same direction as the incident electric field. The second-order susceptibility χ⁽²⁾ is a third rank tensor, and the double dot product with two electric fields means a double contraction, so that five indices (3 on χ, 1 each on the two fields) are reduced to one again (the dipole is a vector).

Angular momentum operators are block-diagonal in J and have 1, 2, 3, … independent elements for J = 0, 1/2, 1, …. In order to manipulate quantities such as dipoles or susceptibilities, it is useful to decompose them into operators block-diagonal in J, such that one can do calculations on one value of J at a time. (Of course, J may not be conserved in external fields, so there will be off-diagonal elements, but the J-representation can nonetheless be useful as a starting point.) Such tensors are known as spherical tensors. The spherical tensor rank is given by the corresponding value of J.

For example, consider a general second rank tensor written symbolically as


M = ( xx  xy  xz
      yx  yy  yz
      zx  zy  zz )   A-58

Do these elements transform in independent sets under rotation? The answer is yes; there are three sets to be formed from the 9 elements: 1 scalar (1 element), 1 spherical vector (3 elements), and one spherical 2nd rank tensor (5 elements); note that there are still 9 independent elements in toto.

1 element, scalar:  Tr(M) = xx + yy + zz   A-59

3 elements, vector:  [ ½(xy − yx), ½(xz − zx), ½(yz − zy) ] = ½(M − Mᵀ)   A-60

5 elements, symmetric traceless tensor:
½(M + Mᵀ) − (1/3) δ Tr(M) = ( xx − ⅓Tr      ½(xy + yx)   ½(xz + zx)
                              ½(xy + yx)    yy − ⅓Tr     ½(yz + zy)
                              ½(xz + zx)    ½(yz + zy)   zz − ⅓Tr )   A-61

The trace of any matrix (like its determinant) is invariant under rotations, and hence a zero-rank spherical tensor. If we write the 9 independent elements as a vector, the 9×9 transformation matrix will be block diagonal:

( 1×1
       3×3
             5×5 )   A-62
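A numpy sketch of the decomposition A-59 to A-61 (any 3×3 matrix works; a random one is used here):

```python
# Split a 3x3 matrix into scalar, antisymmetric-"vector", and symmetric
# traceless parts; the three irreducible pieces recombine to the original.
import numpy as np

M = np.random.default_rng(1).normal(size=(3, 3))

scalar = np.trace(M) / 3.0 * np.eye(3)                      # 1 element  (A-59)
vector = 0.5 * (M - M.T)                                    # 3 elements (A-60)
tensor = 0.5 * (M + M.T) - np.trace(M) / 3.0 * np.eye(3)    # 5 elements (A-61)

print(np.allclose(scalar + vector + tensor, M))             # True: 1 + 3 + 5 = 9
```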

A.5 Green functions

Consider the inhomogeneous differential equation O_x f(x) = a(x), where O_x is a differential operator, e.g.
O_x = ∂²/∂x² + k²   A-63

The solutions f(x) will be different for each function a(x), but can all be derived from a single function, the Green function G of O_x. Specifically, the Green function is the solution that corresponds to a(x) being a δ distribution or "infinitely narrow" impulse. G is therefore the "impulse response" of an operator such as A-63. Any inhomogeneous solution can then be obtained by integrating G over a number of such impulses. This effectively recovers a(x) as a sum over impulses.

Assume that you have solved the "delta-homogeneous" problem O_x G(x,x') = δ(x − x'). Multiplying both sides of this differential equation by a(x') and integrating:

O_x ∫_{−∞}^{∞} dx' G(x,x') a(x') = ∫_{−∞}^{∞} dx' [O_x G(x,x')] a(x') = ∫_{−∞}^{∞} dx' δ(x − x') a(x') = a(x)   A-64

Therefore


f(x) = ∫_{−∞}^{∞} dx' G(x,x') a(x')   A-65

is indeed the solution to the inhomogeneous equation, and can be found by simple integration. A common example is the wave equation in eq. A-63 above. Explicitly,

O_x = ∂²/∂x² + k²  ⇒  G(x,x') = (1/2ik) e^{ik|x−x'|},
since  O_x G = (−k²/2ik) e^{ik|x−x'|} + (k²/2ik) e^{ik|x−x'|} = 0 if x ≠ x',   A-66
and O_x G diverges if x = x',

because the derivative of e^{ik|x−x'|} is discontinuous at x = x', yielding a δ-function (infinite) second derivative.
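A numerical sketch of A-64 to A-66 (numpy assumed; the Gaussian source a(x) is an arbitrary example), constructing f = ∫ G a dx' and checking the ODE by finite differences:

```python
# Solve f'' + k^2 f = a(x) via the Green function G = e^{ik|x-x'|}/(2ik).
import numpy as np

k = 3.0
x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]
a = np.exp(-x**2)                                   # impulse-like source

G = np.exp(1j * k * np.abs(x[:, None] - x[None, :])) / (2j * k)
f = G @ a * dx                                      # eq. A-65 on a grid

lhs = np.gradient(np.gradient(f, dx), dx) + k**2 * f
print(np.allclose(lhs.real[750:1250], a[750:1250], atol=1e-2))   # True (interior)
```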

Another common example is the equation
∇²f(r) = a(r).   A-67
The Green function for the corresponding "delta-homogeneous" equation is simply

G(r,r') = −(1/4π) · 1/|r − r'|,   A-68

which can also be verified by direct substitution into A-67 with a(r) = δ(r − r').
Closely related to the Green function is the Green operator, which comes up frequently in perturbation theory and scattering theory. For example, according to the Schrödinger equation, (H − E_i)Ψ_i = 0 if E_i is an eigenvalue and Ψ_i an eigenfunction. If ϕ_n is an arbitrary function, and E an arbitrary energy, let

(H − E)ϕ_n = ϕ_{n−1} = Σ_i a_i Ψ_i
Solving for ϕ_n:  ϕ_n = [1/(H − E)] ϕ_{n−1} = G(E) ϕ_{n−1}. Clearly, if E ≈ E_j and a_j ≠ 0, G(E) acting on ϕ_{n−1} will yield
G(E) ϕ_{n−1} = Σ_i [a_i/(E_i − E)] Ψ_i ≈ [a_j/(E_j − E)] Ψ_j   if E ≈ E_j   A-69

Thus, ϕ_n = Σ_i b_i Ψ_i will have b_j ≫ b_i. After every iteration, ϕ will approach an eigenfunction more closely. G(E) is the Green operator of the (differential) operator H. This iterative process is known as inverse iteration.
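A numpy sketch of inverse iteration with G(E) (the random hermitean matrix is a stand-in Hamiltonian):

```python
# Repeated application of G(E) = (H - E)^-1 converges onto the eigenvector
# whose eigenvalue lies closest to the guess E.
import numpy as np

rng = np.random.default_rng(2)
H = rng.normal(size=(6, 6)); H = H + H.T
eigs = np.linalg.eigvalsh(H)
E = eigs[2] + 0.1                                   # guess near one eigenvalue

phi = rng.normal(size=6)
for _ in range(8):
    phi = np.linalg.solve(H - E * np.eye(6), phi)   # phi_n = G(E) phi_{n-1}
    phi /= np.linalg.norm(phi)

print(phi @ H @ phi, eigs[np.argmin(np.abs(eigs - E))])   # converged eigenvalue
```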

Sometimes it is useful to represent the propagator in a Green operator representation. It is given by

e^{−iHt/ℏ} = (1/2πi) ∫ dλ e^{λt} iℏ (iℏλ − H)⁻¹ = (1/2πi) ∫ dλ e^{λt} iℏ G(iℏλ).   A-70
This can be derived by using the Laplace transform

L(f(t)) = L(k) = ∫_0^{∞} dt f(t) e^{−kt},   f(t) = (1/2πi) ∫_{c−i∞}^{c+i∞} dk L(k) e^{kt},

which is the complex-frequency analog of the Fourier transform. In analogy to eq. A-29, it has the easily derived property

L(dΨ(t)/dt) = −Ψ(0) + k L(Ψ(t)).
Inserting this into the Schrödinger equation, one obtains

L(Ψ(t)) = iℏ (iℏk − H)⁻¹ Ψ(0),
and inserting this into the definition of the Laplace transform, one obtains A-70. The Green function formulation can be useful during manipulations using complete sets of eigenstates.


A.6 Vector Identities

∇a = Σ_i (∂a/∂x_i) x̂_i   (gradient)   A-71
∇·a = Σ_i ∂a_i/∂x_i   (divergence)   A-72

∇ × a = det( x̂ ŷ ẑ ; ∂_x ∂_y ∂_z ; a_x a_y a_z ) = (∂a_z/∂y − ∂a_y/∂z) x̂ + …   (curl)   A-73

Fig. A-8. Behavior of different derivatives of a vector field, for the cases ∇×v = 0, ∇·v ≠ 0; ∇×v = 0, ∇·v = 0; and ∇×v ≠ 0, ∇·v = 0.

∇ × (∇a) = 0,   ∇·(∇ × a) = 0   A-74
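A symbolic check of A-74 (sympy assumed; the scalar a and vector field v are arbitrary smooth examples):

```python
# curl(grad a) = 0 and div(curl v) = 0 for any smooth fields.
import sympy as sp
from sympy.vector import CoordSys3D, gradient, divergence, curl

C = CoordSys3D('C')
a = sp.sin(C.x) * C.y**2 * sp.exp(C.z)
v = C.x * C.y * C.i + sp.cos(C.y * C.z) * C.j + C.z**3 * C.k

print(curl(gradient(a)))                        # 0 (zero vector)
print(sp.simplify(divergence(curl(v))))         # 0
```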

Gauss's Law:  ∮_S j·dŝ = −∂Q/∂t,  where Q = ∫_V ρ d³r,  relates surface and volume integrals:   A-75
⇒ ∫_V ∇·j d³r = −(∂/∂t) ∫_V ρ d³r   A-76
⇒ ∇·j = −∂ρ/∂t,   A-77
where dŝ = r̂ r² sinθ dθ dφ and dV = d³r = r² dr sinθ dθ dφ in spherical coordinates.

Fig. A-9. Relationship between the vector flow field out of a surface and the scalar density contained within that surface.

A final useful identity is
∫ (f j·∇g + g j·∇f) dr = 0   A-78

where f and g are two well-behaved scalar functions, and j is a vector function with zero divergence in the closed integration volume V, which vanishes on the surface bounding V and in the volume outside V (i.e. j is localized inside the volume V). A-78 can be derived by integrating the second term by parts, yielding gfj| − ∫dr[f ∇·(g j)]. The first part of this vanishes because j is zero on the bounding surface of V. The second term becomes −∫dr[f g ∇·j + f j·∇g] = −∫dr f j·∇g after applying the chain rule and using the zero divergence of j. This term then cancels the first term in A-78.

Many useful identities can be derived from A-78. For example, if f = x_i and g = x_j (i, j = x, y, z), then we have

∫ dr (x_i j_j + x_j j_i) = 0   A-79

This can be used to turn integrals involving dot products into integrals involving cross products, as in the second line of the following derivation:

∫ (x'·x) j_i dr = Σ_j x'_j ∫ x_j j_i dr
= −½ Σ_j ∫ x'_j (x_i j_j − x_j j_i) dr
= −½ Σ_{j,k} ∫ ε_ijk x'_j (x × j)_k dr
= −½ [ ∫ x' × (x × j) dr ]_i   A-80

This relationship is useful for formulas involving scalar or vector potentials.

A.7 Lagrange multipliers

Sometimes one wishes to minimize a function f(x_i), subject to a constraint g(x_i) = 0. A typical example would be the variational principle, where the energy is to be minimized subject to the constraint that Ψ remains normalized to unity (g = |Ψ(x)|² − 1 = 0); otherwise, picking just any function Ψ would allow the energy to drop to −∞. We consider the proper minimization process for the example of two variables x and y.

Let C be the contour defined by g(x,y)=0. Then ∇g⊥C always. To minimize f(x,y), we let

(d/dt) f(x(t), y(t)) = 0 = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt) = ∇f · g'   A-81

From the second line it follows that the gradient of f must be perpendicular to g', which is parallel to C and perpendicular to the gradient of g; thus the gradients of f and g are parallel, or

∇f = λ∇g  →  ∇(f − λg) = 0   A-82

where λ is the proportionality factor between the two gradients: the Lagrange multiplier. Thus, minimizing f subject to the constraint g = 0 means solving the equation obtained by setting the gradient of f − λg equal to zero.


Fig. A-10. Minimization of a surface f(x,y) subject to a constraint g(x,y) = 0. The global minimum is shown by a transparent circle, the desired constrained minimum by a black circle.

A.8 Correspondence principles

Although the correspondence principles between classical and quantum mechanics are not purely mathematical rules, we summarize here some of the more important ones by equation or name, noting some restrictions:

A → Â   A-83
AB → (ÂB̂ + B̂Â)/2   A-84
[ , ]_Poisson → (1/iℏ) [ , ]   A-85
∫ dp dq → h Tr{ }   A-86
Wigner-Weyl rule   A-87
Podolsky rule [Phys. Rev. 32, 812 (1928)]   A-88

For many operators A, one can just replace the classical quantity directly by its quantum analog, e.g. x → x̂ and p → p̂. This is subject to the restriction that one must choose a representation and eventually express one of the conjugate operators in terms of the other (e.g. p in terms of ∂/∂x).

Combinations such as xp lead to a problem due to non-commutation, but the second formula guarantees that a Hermitean operator is obtained, and often works. However, symmetrization and hermiticity alone are no guarantee that the resulting quantum operator is the correct analog of the desired classical function!

The third and fourth formulas show how the classical Poisson bracket is converted to acommutator, and how a phase space integral becomes a trace operation.

The Wigner-Weyl and Podolsky rules are far more general than the first two methods; however, even they are not applicable in all cases, and no general correspondence formula which takes a classical expression into a correct quantum mechanical version is known to date.


A.9 Calculus of variations

The variational principle is familiar from basic quantum mechanics: given an energy
E[Ψ] = ∫ dx Ψ*(x) Ĥ Ψ(x),   A-89

find the function Ψ which remains normalized and yields the lowest energy E[Ψ]. In this case, the energy is actually a function of a function, or a functional. Note that the functional E[Ψ] does not depend on the variable x used for the function Ψ, since it is integrated out; it depends only on the function Ψ, no matter what variables Ψ is expressed in. For example, if we transformed our Hamiltonian H from cartesian to polar coordinates, the minimized energy would still be the same.

Functionals can be manipulated by rules not unlike the algebra and calculus of ordinary functions. Perhaps the most important problem of functional calculus is the calculus of variations, or the finding of extrema and their stabilities. In particular, we will consider the example of finding a minimum for a functional S. S could be an energy (variational principle, Hartree-Fock equations), the classical action (Hamilton's principle of least action), or the path taken by a light ray (geometrical optics). We will consider the last of these examples here.

Problem: given a functional
S[y, ẏ, x] = ∫_{x_i}^{x_f} f(y(x), ẏ(x), x) dx   A-90

find the function y that minimizes (or maximizes) S subject to the constraint that y(x_i) and y(x_f) are constant, i.e. the trial functions y remain fixed at the endpoints. This requires the variation of S with y to be zero, or

δS/δy = 0   A-91

A small variation in S can be expanded to first order as

S + δS = ∫_{x_i}^{x_f} f(y + δy, ẏ + δẏ, x) dx ≈ ∫_{x_i}^{x_f} { f(y, ẏ, x) + (∂f/∂y) δy + (∂f/∂ẏ) δẏ } dx
≈ S + ∫_{x_i}^{x_f} { (∂f/∂y) δy + (∂f/∂ẏ) δẏ } dx;   letting u = ∂f/∂ẏ, dv = δẏ dx  ⇒   A-92
≈ S + [ δy (∂f/∂ẏ) ]_{x_i}^{x_f} − ∫_{x_i}^{x_f} { (d/dx)(∂f/∂ẏ) − ∂f/∂y } δy dx.

The second term in the last line vanishes because the trial functions y are supposed to remain fixed at the endpoints. We therefore have

δS/δy = ( −∫_{x_i}^{x_f} dx { (d/dx)(∂f/∂ẏ) − ∂f/∂y } δy ) / δy.   A-93

This must hold for all variations δy that perturb y from the ideal minimizing function, and can therefore only be satisfied if the term in curly brackets inside the integral vanishes. This yields a differential equation which can be solved for the desired function y:

(d/dx)(∂f/∂ẏ) − ∂f/∂y = 0   A-94


This equation is universal for the problem posed above.
Consider the example of a photon propagating in the x-y plane on a path y(x). Let us assume the principle of "least time": photons always propagate such as to minimize the travel time between starting and final point. We need to minimize

t[y] = ∫_{x_i}^{x_f} f(ẏ(x)) dx,   A-95

where t is the time functional, and the path length f over an interval dx is given by the usual formula f(y, ẏ, x) = √(1 + (dy/dx)²). Inserting f into the master differential equation, the second term vanishes, and the first term yields after a little manipulation ẏ = const. Freely propagating photons travel in a straight line!

To make this more interesting, consider a photon traveling through an interface from refractive index 1 to index n (say, into a piece of glass). The speed during the first segment is c, during the second segment c/n. Which path will be followed? The time functional now is

t = (1/c)[1 + tan²θ_i]^{1/2} + (n/c)[1 + tan²θ_f]^{1/2},   A-96
since we already know that within each medium at least, the photon must travel in a straight line. The effective path length therefore consists of two straight line segments joining at the interface,

f = ct = [1 + tan²θ_i]^{1/2} + n[1 + (d − tanθ_i)²]^{1/2}.   A-97
Inserting into the master equation yields df/dθ_i = 0, and trigonometric manipulation of the resulting equation yields

sinθ_i / sinθ_f = n   (Snell's law).   A-98

Similarly, the propagation of light by the rules of geometrical optics can be found through any system as long as the refractive index n(x,y) (or n(x,y,z) in three dimensions) is known.

Fig. A-11. Geometrical construction for Snell's law (unit depths on either side of the interface between index 1 and index n, horizontal offset d, angles θ_i and θ_f); it was already proved that at constant refractive index light travels in a straight path, so this path must consist of two straight segments.
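The least-time argument can also be run numerically (a sketch assuming scipy; unit depths on both sides of the interface as in Fig. A-11):

```python
# Minimize the travel time over the interface crossing point s and recover
# sin(theta_i)/sin(theta_f) = n (eqs. A-96 to A-98).
import numpy as np
from scipy.optimize import minimize_scalar

n, d = 1.5, 1.0

def ct(s):                      # c * (travel time) for crossing point s
    return np.hypot(1.0, s) + n * np.hypot(1.0, d - s)

s = minimize_scalar(ct, bounds=(0.0, d), method='bounded').x
sin_i = s / np.hypot(1.0, s)
sin_f = (d - s) / np.hypot(1.0, d - s)
print(sin_i / sin_f, n)         # ratio equals n: Snell's law
```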

A.10 The Great Orthogonality Theorem

This theorem can be used to prove most of the useful facts about group tables, such as the fact that the number of classes equals the number of irreducible representations, that the sum of the squares of the dimensions of the irreducible representations equals the number of group elements, and so forth. The proof given here is a more detailed version of the one outlined in Tinkham, Group Theory and Quantum Mechanics.


The theorem:

Let Γ and Γ' be two irreducible representations, and let Û_R^Γ and Û_R^Γ' be the corresponding symmetry operators for symmetry operation "R"; let h be the number of group elements, and let n_i be the dimensionality of Γ_i (e.g. 2 for a 2×2 irreducible matrix representation). Then

Σ_R (U_R^{Γi})*_{kl} (U_R^{Γj})_{mp} = (h/n_i) δ_ij δ_km δ_lp,   A-99

that is, the matrix elements of the operators U for the different R form orthogonal vectors as a function of representation and of position within the matrix (note: (U_R†)_{ij} = (U_R*)_{ji}).

Proof:

1) Schur's first lemma: Let a group G have symmetry elements R; if two irreducible representations with matrices U_R^Γ and U_R^Γ' satisfy U_R^Γ A = A U_R^Γ' for all R, then either the matrix A = 0 or the two representations are equivalent (clearly, if U_R^Γ and U_R^Γ' do not even have the same dimension, then A is a rectangular zero matrix).

Before we prove Schur's first lemma, let's introduce a little notation:

Let an operator Û_R act on |χ⟩:  Û_R|χ⟩ = |χ'⟩.
If |χ⟩ can be represented in a basis {|Ψ_k⟩} (e.g. the n degenerate eigenfunctions of H corresponding to Û_R^Γ), we can write |χ⟩ = Σ_i c_i^Ψ |Ψ_i⟩. In matrix notation, we have

|χ⟩ = ( c_1^Ψ ; c_2^Ψ ; c_3^Ψ ; ⋮ )   ← rows labeled |Ψ_1⟩, |Ψ_2⟩, |Ψ_3⟩, …   A-100

where the rows are labeled by their basis functions. If

|χ'⟩ = ( c'_1^Ψ ; c'_2^Ψ ; ⋮ ), then we can write c'_i^Ψ = Σ_j (U_R^Γ)_{ij} c_j^Ψ. This is just the usual matrix-vector multiplication we have been doing all along ((U_R^Γ)_{ij} is just the ith row, jth column matrix element of Û_R in the {|Ψ_i⟩} basis).

basis).But what if instead of transforming |χ >, we wanted to transform the basis functions {|Ψi >}

themselves? Picking one of them, say |Ψ i > we can write in both abstract and matrix notation

Û_R|Ψ_i⟩ = |Ψ'_i⟩,  or, in matrix notation, applying the matrix (U_R^Γ) to the coefficient vector (0, …, c_i^Ψ = 1, …, 0)ᵀ yields the column ((U_R^Γ)_{1i}, (U_R^Γ)_{2i}, (U_R^Γ)_{3i}, …)ᵀ, whose rows are again labeled |Ψ_1⟩, |Ψ_2⟩, |Ψ_3⟩, etc.   A-101


or Û_R|Ψ_i⟩ = Σ_j |Ψ_j⟩ (U_R^Γ)_{ji}, which looks like the usual formula "turned around." Keep in mind this difference between operating on a function expressed in terms of constant basis functions, and operating on the basis functions themselves to change them into a new basis set.

Proof of 1) (Schur's first lemma):

Let {|Ψ_i⟩} be a linearly independent basis of n functions for the n×n representation U_R^Γ of Û_R ∈ G:
Û_R|Ψ_i⟩ = Σ_j |Ψ_j⟩ (U_R^Γ)_{ji}   A-102

If U_R^Γ were reducible to (or equivalent to) U_R^Γ', we could find a set of m ≤ n functions {|ϕ_i⟩} in terms of the |Ψ_i⟩ (i.e. |ϕ_k⟩ = Σ_j |Ψ_j⟩ a_jk) such that
Û_R|ϕ_i⟩ = Σ_j |ϕ_j⟩ (U_R^Γ')_{ji}   A-103

where Γ' is one of the representations to which Γ can be reduced. Using A-102, it follows that
Û_R|ϕ_k⟩ = Σ_j Û_R|Ψ_j⟩ a_jk = Σ_{i,j} |Ψ_i⟩ (U_R^Γ)_{ij} a_jk   A-104
= Σ_j |ϕ_j⟩ (U_R^Γ')_{jk}   (from A-103)
= Σ_{i,j} |Ψ_i⟩ a_ij (U_R^Γ')_{jk}   A-105

where the last equality is obtained by expanding ϕ in terms of Ψ. Since the {|Ψ_i⟩} are linearly independent, the two sums A-104 and A-105 can only be equal if
Σ_j (U_R^Γ)_{ij} a_jk = Σ_j a_ij (U_R^Γ')_{jk}   or   U_R^Γ A = A U_R^Γ'

Thus, if U_R^Γ is indeed reducible (i.e. the U_R^Γ for all R can be brought into the same block-diagonal form), a nonzero matrix A exists whose elements are the transformation coefficients from |Ψ_i⟩ → |ϕ_i⟩. The converse of all these steps also holds: if A is nonzero, U_R^Γ must be reducible. Thus, A must be zero or else U_R^Γ is reducible to U_R^Γ' (or equivalent to it, if m = n).

QED.

2) Schur's second lemma: If a matrix A commutes with all irreducible matrices U_R^Γ of a representation Γ, it must be a multiple of the unit matrix:
U_R^Γ A = A U_R^Γ  (all R)  ⇒  A = λ·I

Proof of 2): Let AM = MΛ, where Λ is the diagonal eigenvalue matrix of A and M the matrix of column eigenvectors. Then
A U_R^Γ M = U_R^Γ A M = U_R^Γ M Λ.
Thus, the column vectors of U_R^Γ M must also be eigenvectors of A, with the same eigenvalues as the corresponding columns of M. Thus the eigenvectors of A belonging to a certain eigenvalue λ are all transformed among themselves by the representations U_R^Γ; but this would imply that Γ is reducible, unless all eigenvalues of A are identical.


It follows that A is a matrix all of whose eigenvalues are λ; A = λ·I is the only matrix which satisfies this, or else the assumption of irreducibility of the U_R^Γ cannot hold.

QED.

To recap these two lemmas in words: the U^Γ A = A U^Γ' of Schur's first lemma is a generalized similarity transformation, where A can be a rectangular matrix (if m ≠ n). It basically says that two independent irreducible representations are not related by a generalized similarity transformation (i.e. A = 0 is the only consistent form of A in that case). Schur's second lemma states that only a multiple of the identity can commute with all group operations in an irreducible representation, or else that representation is not irreducible.

Now that those preambles are out of the way, we return to the proof of the theorem proper:

Let X be an arbitrary matrix; then A = Σ_R U_R^Γ X (U_R^Γ)⁻¹ is a multiple of the unit matrix,   A-106

since  U_S^Γ A = Σ_R U_S^Γ U_R^Γ X (U_R^Γ)⁻¹
= Σ_R (U_S^Γ U_R^Γ) X ((U_R^Γ)⁻¹ (U_S^Γ)⁻¹) U_S^Γ
= Σ_R (U_{SR}^Γ) X (U_{SR}^Γ)⁻¹ U_S^Γ   [2]
= Σ_R (U_R^Γ) X (U_R^Γ)⁻¹ U_S^Γ   [3]
= A U_S^Γ   (and A-106 follows by lemma 2).

In step [2] we used the fact that (AB)⁻¹ = B⁻¹A⁻¹ and that U_{SR}^Γ = U_S^Γ U_R^Γ is the group member ∈ G corresponding to the combined operations R followed by S. In step [3], we used the group closure property: the sum is over all R, so whether we sum over U_{SR}^Γ or U_R^Γ makes no difference, since each group element will be included exactly once either way.
Now let all elements of X be zero except X_kl = 1; from eq. A-106 it follows that

λ_kl δ_ij = ( Σ_R U_R^Γ X (U_R^Γ)⁻¹ )_{ij}   (we just call λ = λ_kl for later convenience)
= Σ_R (U_R^Γ)_{ik} ((U_R^Γ)⁻¹)_{lj}
= Σ_R (U_R^Γ)_{ik} (U_R^Γ)*_{jl}   (the U's are unitary).   A-107

Now we calculate the value of λ_kl; summing both sides of the last equation over j = i,
Σ_{R,i} (U_R^Γ)_{ik} (U_R^Γ)*_{il} = λ_kl Σ_i δ_ii = n λ_kl   (n = dimension of U_R^Γ, n = Σ_i δ_ii)   A-108

But also,
Σ_{R,i} (U_R^Γ)_{ik} (U_R^Γ)*_{il} = Σ_{R,i} (U_R^Γ)_{ik} ((U_R^Γ)⁻¹)_{li}
= Σ_{R,i} ((U_R^Γ)⁻¹)_{li} (U_R^Γ)_{ik}
= Σ_R (U_{R⁻¹·R}^Γ)_{lk}
= Σ_R (U_I^Γ)_{lk},  where I is the identity,
= Σ_R δ_lk
= h δ_lk   (h = number of group elements R)   A-109

Combining A-108 and A-109, it follows that λ_kl = (h/n) δ_lk and

Σ_R (U_R^Γ)_{ik} (U_R^{Γ†})_{lj} = (h/n) δ_ij δ_lk,   which proves the GOT for Γ = Γ'.   A-110

We still need to show that this sum over R vanishes if Γ ≠ Γ'. Following the exact sequence of steps starting with eq. A-106, but using lemma 1) instead of lemma 2),
A = Σ_R U_R^Γ X (U_R^Γ')⁻¹  ⇒  U_R^Γ A = A U_R^Γ'  ⇒  A = 0

if Γ and Γ' are independent irreducible representations, as per lemma 1). Again letting all elements of X be zero except X_kl = 1, we have
(A)_{ij} = 0 = Σ_R (U_R^Γ)_{ik} ((U_R^Γ')⁻¹)_{lj}   (in analogy to eq. A-107)
= Σ_R (U_R^Γ)_{ik} (U_R^{Γ'†})_{lj}   (since the U are unitary).

Thus a factor δ_{Γ,Γ'} can be added to eq. A-110 above to complete the theorem.

QED; that's all, folks!
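As a closing numerical sanity check of A-99 (numpy assumed), one can sum over the six 2×2 matrices of the E irrep of C3v, realized here as three rotations and three reflections:

```python
# Great Orthogonality Theorem for the E irrep of C3v: h = 6, n = 2, so
# sum_R U_R[k,l]* U_R[m,p] = (h/n) delta_km delta_lp = 3 delta_km delta_lp.
import numpy as np

def rot(t):
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

def refl(t):    # reflection across a line at angle t/2
    return np.array([[np.cos(t), np.sin(t)], [np.sin(t), -np.cos(t)]])

angles = (0.0, 2 * np.pi / 3, -2 * np.pi / 3)
group = [rot(t) for t in angles] + [refl(t) for t in angles]
h, n = len(group), 2

ok = True
for k in range(n):
    for l in range(n):
        for m in range(n):
            for p in range(n):
                s = sum(U[k, l] * U[m, p] for U in group)  # real irrep: U* = U
                ok = ok and np.isclose(s, (h / n) * (k == m) * (l == p))
print(ok)   # True
```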