
Technische Universität München

Physics Department

E18 Group

Fundamentals of Partial Wave Analysis

and an Application to Heavy-Meson Decay

using the No-U-Turn Sampler

Arseniy Tsipenyuk

Supervisor: Prof. Stephan Paul, Ph. D.

Advisor: Dr. rer. nat. Daniel Greenwald

Submission date: January 30, 2016


I assure the single-handed composition of this master's thesis, supported only by declared resources.

Garching,

Abstract

This thesis aims to provide a gentle introduction to some concepts appearing in model-dependent descriptions of hadronic decays of heavy mesons and discusses a proof-of-concept application of the No-U-Turn sampler [1] to the analysis of such decays. The emphasis is on the basic dynamical parts of the waves and on the general presentation of phase-space coordinates. We include a brief description of the No-U-Turn sampler along with some examples presented in the Stan programming language using our module stan_pwa. The latter may be used for data generation and fitting of model-dependent three-body decays (with fixed masses and widths of resonances).


Contents

I Introduction

II Modelling Partial Waves
1 Partial wave analysis
2 Isobar decomposition formalism
3 Quantum-mechanical spinless unstable system
4 Phase space volume element
5 Relativistic corrections and multiple decay channels
6 Summary

III Monte Carlo Methods
7 Likelihood function in PWA
8 Basic sampling methods
9 HMC sampling methods

IV Computational Implementation
10 Structure
11 Data generation and sampling examples

Bibliography


Acknowledgments

I would like to thank my supervisor Stephan Paul for the possibility to work on this project.

I especially thank Daniel Greenwald for his help, patience, and fruitful discussions during the work on this thesis. Without his guidance this thesis would have lost a great deal of its quality.

Finally, I thank Irina, Anna, and Ilya for their love, patience, and support.


Chapter I

Introduction

We can learn a lot about the nature of CP violation from the hadronic decays of heavy mesons. This task requires careful phenomenological descriptions of intermediate resonances and nonresonant structures. Increasingly large experimental data sets allow the testing of ever more refined models. A typical likelihood function for such analyses may contain hundreds of parameters and be evaluated over data sets of size $10^5$ and larger. This presents a challenge for numerical sampling programs and motivates the need for a clear distinction between the various arguments used in models (be they mathematical, physical, or probabilistic), as well as for sampling algorithms that operate efficiently in the corresponding phase space.

This thesis aims to provide a gentle introduction to some concepts appearing in model-dependent descriptions of hadronic decays of heavy mesons and discusses a proof-of-concept application of the No-U-Turn sampler (NUTS) [1] to the analysis of such decays.

The model-dependent description of partial wave analysis (PWA) is presented foremost in a quantum-mechanical setting, with an attempt at a clear distinction between the concepts entering the amplitudes. Our emphasis is on the basic dynamical parts of the waves and on the general presentation of phase-space coordinates.

This thesis includes a brief description of Hamiltonian Monte Carlo (HMC) methods along with some examples presented in the Stan programming language [2] using our module stan_pwa [3]. The latter may be used for data generation and fitting of model-dependent three-body decays (with fixed masses and widths of resonances). The module is designed in a way that allows extension to n-body decays, mass and width fitting, and model-independent descriptions, although these features remain to be implemented.


Chapter II

Modelling Partial Waves in Heavy Meson Decay

1 Partial wave analysis

Partial wave analysis (PWA) is a technique used in scattering theory. In a strict sense, PWA describes the expansion of the total amplitude for elastic scattering of spinless particles in the center-of-mass system into Legendre polynomials.¹ In a broader sense, the total amplitude of a quantum system is decomposed into a sum of partial waves describing various components of the scattering process. These components may themselves be further decomposed into more partial waves, and so on, until there is a model describing the "smallest" partial waves.

The most common decomposition of the total amplitude is the decomposition into eigenfunctions of some quantum operator. For example, consider a parent particle $P$ (like a $B$ or a $D$ meson) decaying into daughter particles $a$ and $b$. The total amplitude of the decaying particle may be expressed as a sum of S, P, D, and so on, partial waves, characterized by the relative orbital angular momentum $L_{ab}$ between $a$ and $b$. The S, P, and D waves correspond to $L_{ab}$ equal to 0, 1, and 2, respectively. (Higher terms are usually negligible in meson decay processes and shall be ignored in further discussions.) Such a partial wave decomposition is used, among others, in [6].

The goal of this chapter is to describe a slightly more complicated case, in which the parent spin-0 particle $P$ decays into three pseudoscalar daughter particles $a$, $b$, $c$. We want to write the amplitude of this decay process as a sum of some other amplitudes. There are, of course, many such decompositions. We shall focus on one particular partial wave decomposition: the isobar formalism with Breit-Wigner resonance parametrization, which we shall apply to the decay $D \to \pi^+\pi^-\pi^+$. This application motivates the restrictions we put on the parent and daughter particles. However, some of the formulas that shall be considered below apply to general cases, and the restrictions on $P$, $a$, $b$, $c$ shall be lifted where possible.

¹ See eq. (3.4) below or [4, Section 46.5.3], [5, Vol. 3, Thm. XI.51].

[Figure II.1: Three-body decay as a sequential chain of two-body decays. The diagram shows $P$ decaying into $R$ and $c$, followed by $R$ decaying into $a$ and $b$.]

2 Isobar decomposition formalism

In the isobar formalism, the decay $P \to abc$ is described as a sequence of two-body decays via an intermediate resonance $R$ (see Fig. II.1). The isobar formalism requires that these intermediate resonances possess well-defined quantum numbers such as isospin, G parity, spin, parity, or charge, and behave as particles under space transformations. In other words, resonances may be classified into scalars, vectors, and so forth.²

Just as in the two-body decay, the total amplitude is usually decomposed into S, P, and D waves referring to $L_{Rc}$ (the orbital angular momentum between $R$ and $c$, or, to be more precise, the total angular momentum between the system $ab$ and $c$):

\[ A_{P \to abc} = \sum_{i \in \{S, P, D\}} a_i A_i, \tag{2.1} \]

where the $a_i$ are complex constants relating the amplitudes $A_i$ to each other. Note that eq. (2.1) does not use the isobar formalism, in the sense that the orbital angular momentum between $ab$ and $c$ is an observable quantity without any assumptions on the $ab$ system.

Each of the S, P, and D waves may contain several resonances $R_j$. Usually, the S wave will be the most populated wave and is decomposed into partial waves corresponding to particular resonances or groups of closely placed resonances:

\[ A_S = \sum_j a_j A_{R_j}. \tag{2.2} \]

The difficulty is then to find a heuristic model describing the resonance amplitudes $A_{R_j}$; this task shall constitute most of this chapter.
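The coherent sums in eqs. (2.1) and (2.2) can be illustrated with a short sketch. The line shape `breit_wigner` below is a textbook nonrelativistic form used purely as a placeholder assumption; it is not the parametrization derived later in this chapter, and all names and numbers are hypothetical.

```python
from typing import Callable, Sequence


def breit_wigner(energy: float, e0: float, width: float) -> complex:
    """Placeholder line shape 1 / (e0 - E - i*width/2); an assumption for illustration."""
    return 1.0 / complex(e0 - energy, -width / 2.0)


def coherent_sum(
    coeffs: Sequence[complex],
    amplitudes: Sequence[Callable[[float], complex]],
    energy: float,
) -> complex:
    """Evaluate sum_j a_j * A_j(E), the structure of eqs. (2.1) and (2.2)."""
    return sum(a * amp(energy) for a, amp in zip(coeffs, amplitudes))


def intensity(coeffs, amplitudes, energy: float) -> float:
    """Observable intensity |sum_j a_j A_j(E)|^2, including interference terms."""
    return abs(coherent_sum(coeffs, amplitudes, energy)) ** 2


if __name__ == "__main__":
    # Two hypothetical resonances with a relative complex coupling.
    amps = [
        lambda e: breit_wigner(e, 1.0, 0.2),
        lambda e: breit_wigner(e, 1.3, 0.4),
    ]
    coeffs = [1.0, 0.5j]
    for e in (0.9, 1.0, 1.15, 1.3):
        print(e, intensity(coeffs, amps, e))
```

Because the amplitudes are summed before squaring, the intensity generally differs from the sum of the individual intensities; this interference is what makes the relative complex constants $a_j$ physically meaningful.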

² Thus, in the context of the isobar formalism, the difference between resonances and decaying particles is a rather delicate subject (or a matter of preference). In [4, Section 47.3], the resonance is characterized by its complex pole position and by its residues. The same, though, applies to any unstable system: as pointed out in [7, §134], any decaying particle has complex energy poles.


There is a crucial difference between eq. (2.1) and eq. (2.2): the former decomposes the total amplitude into partial waves corresponding to observable states and is exact, while the latter decomposes an amplitude into partial waves corresponding to resonances that cannot be observed directly due to their short lifetimes. The total angular momentum between $ab$ and $c$ may be measured; we expect, therefore, some form of spectral decomposition of $A_{P \to abc}$ into eigenvalues and eigenfunctions of $L_{R_j c}$ to be valid.³ As such, eq. (2.1) is a natural generalization of the canonical partial wave expansion that shall be presented below, see eq. (3.4). In contrast to eq. (2.1), the resonances $R_j$ appearing in eq. (2.2) represent decaying states that are not observed directly. In short, eq. (2.1) is a decomposition of $A_{P \to abc}$ into orbital angular momentum eigenstates, while eq. (2.2) is an approximation to a decomposition of $A_S$ into energy eigenstates.

3 Quantum-mechanical spinless unstable system

There are many ways to model the amplitudes $A_{R_j}$ that appear in eq. (2.2). In practice, the model depends heavily on the specific properties of the resonance (like its width and its distance to neighbouring resonances in the energy spectrum). In this subsection we derive a simplified heuristic quantum-mechanical model describing the decay $R \to ab$ based on [7, 8], also using [9, § 4.5] and [10]. A modern approach to scattering problems may be found in [11]; for a more mathematically rigorous treatment of nonrelativistic two-body decay see, e.g., [12, 13, 14, 15]. Throughout this discussion, $R$, $a$, $b$ are assumed to be pseudoscalar spinless particles; since we consider only one resonance, we shall write $A$ instead of $A_R$. The goals of this subsection are:

i) to familiarize the reader with the notation and recall some basic results from quantum mechanics and scattering theory;

ii) to motivate the factors into which $A$ is usually decomposed: Breit-Wigner functions, Blatt-Weisskopf factors, and Zemach tensors.

A typical quantum-mechanical amplitude is a function $A(x, t)$ with $x \in \mathbb{R}^3$. In particle decay (as well as in scattering problems) it is much more convenient to use spherical coordinates and energy, performing a Fourier transform in the time coordinate: $A = A(E, r, \phi, \theta)$. To describe $A$, we perform the separation $A(E, r, \phi, \theta) = \rho(E, r) Z(\phi, \theta)$. The factor $Z$, which will be generalized to Zemach tensors in later subsections, will not be discussed yet; in fact, we choose our model so that $Z$ has the simplest possible form, in order to avoid complications.

In this chapter, two cases of the radial Schrödinger equation will play an important role. In one case, the solution focuses on the behaviour of $\rho(E, r)$ near a certain energy $E_0$, the energy of the resonance. This solution will lead to the Breit-Wigner dynamical function. In the other case, the solution focuses on the barrier transmission properties of $\rho(E, r)$ for some fixed $r$. This solution leads to the Blatt-Weisskopf barrier factors. Some heuristic arguments combine these two approaches.

³ Since the operator $L_{R_j c}$ is not bounded, the spectral theorem for bounded operators does not apply in this case in a mathematically rigorous way. See, for example, [5, vol. I].

The mathematical setting of a quantum-mechanical system is fixed by its Hamiltonian and its boundary conditions. For both cases above, the boundary conditions are the same and the potential energy terms of the Hamiltonian are quite similar. The main difference is that these two cases describe different aspects of the desired solution.

3.1 Requirements on the potential

The Hamiltonian of an unstable quantum-mechanical system is characterized by a potential $U(r)$ with the following heuristic restrictions.

First, $U(r)$ should describe the short-range interaction between the particles $a$ and $b$. The potential barrier should be sufficiently large to ensure the existence of a quasi-steady state of the Schrödinger equation. This quasi-steady state will be interpreted as the wavefunction of the resonance $R$.

Second, $U(r)$ should tend to zero sufficiently fast for large $r$ to ensure the asymptotic freedom of $a$ and $b$. In fact, $U(r)$ should decay exponentially fast with the distance. We skip this derivation here and refer the reader to [7] instead.

Third, as indicated by the chosen notation, $U(r)$ should be spherically symmetric. This is a technical constraint that significantly simplifies the calculations below. Spherical symmetry requires that $a$ and $b$ are spinless and that any nonspherical inner structure of these particles is ignored. (For example, if $a$ were a particle with an elliptical form, this would be ignored.)

3.2 Relevant solutions of the Schrödinger equation

It is a common technique to separate the centrally symmetric Schrödinger equation into radial and spherical parts:

\[ \psi(E, r, \theta, \phi) = \sum_{l,m} \rho_l(E, r)\, Y_{lm}(\theta, \phi), \tag{3.1} \]

where $l$ and $m$ are the quantum numbers connected to orbital angular momentum and $Y_{lm}$ are the spherical harmonics. The radial part $\rho_l$ must satisfy

\[ \frac{d^2 (r\rho_l)}{dr^2} + \left( \frac{2m}{\hbar^2}\bigl(E - U(r)\bigr) - \frac{l(l+1)}{r^2} \right) r\rho_l = 0. \tag{3.2} \]

This equation is written in the center-of-mass frame, and

\[ m \equiv \frac{m_a m_b}{m_a + m_b} \]

is the reduced mass of $a$ and $b$ (not to be confused with the quantum number $m$ in eq. (3.1)). Note that this equation is equivalent to the one-dimensional Schrödinger equation with the effective potential

\[ U_l(r) = U(r) + \frac{\hbar^2}{2m} \frac{l(l+1)}{r^2}. \tag{3.3} \]

The first term is sometimes known as the dynamic potential, the second as the centrifugal (or kinematic) potential. The former depends on the properties of the resonance and the nature of the underlying interactions; the latter reflects the behaviour of the outgoing final-state particles.
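As a small illustration of the effective potential with its dynamic and centrifugal parts, the following sketch evaluates both terms in natural units ($\hbar = 1$). The function names and the square-well shape of the dynamic part are assumptions made only for this example.

```python
def reduced_mass(m_a: float, m_b: float) -> float:
    """Reduced mass m = m_a * m_b / (m_a + m_b)."""
    return m_a * m_b / (m_a + m_b)


def centrifugal_potential(r: float, l: int, m: float, hbar: float = 1.0) -> float:
    """Centrifugal (kinematic) term hbar^2 * l(l+1) / (2 m r^2)."""
    return hbar ** 2 * l * (l + 1) / (2.0 * m * r ** 2)


def effective_potential(r: float, l: int, m: float, dynamic) -> float:
    """Effective potential: dynamic part U(r) plus the centrifugal term."""
    return dynamic(r) + centrifugal_potential(r, l, m)


def square_well(r: float, depth: float = 1.0, radius: float = 5.0) -> float:
    """Hypothetical dynamic potential: a well of given depth inside `radius`."""
    return -depth if r <= radius else 0.0


if __name__ == "__main__":
    m = reduced_mass(0.14, 0.14)  # two equal masses, roughly pion-like, in GeV
    for l in (0, 1, 2):
        print(l, [round(effective_potential(r, l, m, square_well), 4) for r in (2.0, 6.0, 12.0)])
```

For $l = 0$ the centrifugal term vanishes, so the barrier seen by an S wave comes from the dynamic potential alone, while higher $l$ add a repulsive $1/r^2$ tail.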

If one additionally knows that solutions of the Schrödinger equation must be axially symmetric (that is, do not depend on $\phi$), solution (3.1) further simplifies to

\[ \psi(E, r, \theta, \phi) = \sum_{l=0}^{\infty} (2l+1)\, \rho_l(E, r)\, P_l(\cos\theta) \tag{3.4} \]

with Legendre polynomials $P_l$ defined as

\[
\begin{aligned}
P_l(\cos\theta) &= \frac{1}{2^l\, l!} \frac{d^l}{(d\cos\theta)^l} (\cos^2\theta - 1)^l \\
&= \frac{(2l)!}{2^l (l!)^2} \left( \cos^l\theta - \frac{l(l-1)}{2(2l-1)} \cos^{l-2}\theta + \frac{l(l-1)(l-2)(l-3)}{2 \cdot 4\,(2l-1)(2l-3)} \cos^{l-4}\theta - \ldots \right).
\end{aligned}
\tag{3.5}
\]

In particular, the first three Legendre polynomials have the form

\[
P_0(x) = 1, \qquad P_1(x) = x, \qquad \text{and} \qquad P_2(x) = \frac{1}{2}(3x^2 - 1).
\tag{3.6}
\]

We shall later use the following formula for the integral over products of Legendre polynomials:

\[ \int_{-1}^{1} P_l(x) P_{l'}(x)\, dx = \frac{2}{2l+1}\, \delta_{ll'}. \tag{3.7} \]
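The explicit forms in eq. (3.6) and the normalization integral of the Legendre polynomials, in the form $\int_{-1}^{1} P_l P_{l'}\,dx = 2\delta_{ll'}/(2l+1)$, can be checked numerically. The sketch below uses the standard three-term recurrence and a composite Simpson rule; both helper functions are written for this example only.

```python
def legendre(l: int, x: float) -> float:
    """P_l(x) from the recurrence (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def simpson(f, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def overlap(l1: int, l2: int) -> float:
    """Integral of P_l1(x) * P_l2(x) over [-1, 1]."""
    return simpson(lambda x: legendre(l1, x) * legendre(l2, x), -1.0, 1.0)


if __name__ == "__main__":
    for l in range(4):
        print(l, overlap(l, l), 2.0 / (2 * l + 1))
    print("P_1 with P_2:", overlap(1, 2))
```

The diagonal overlaps reproduce $2/(2l+1)$ and the off-diagonal ones vanish, which is the property used when integrating the squared scattering amplitude over angles.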

The partial wave expansion separates the dependencies on the variables $E$ and $\theta$. The fact that these variables may be separated is the reason why splitting into partial waves by $l$ is so successful. It is equivalent to the observation that the operators of energy and angular momentum commute with each other. This fact allows us to phenomenologically adjust the functions $\rho$ and $P$ to the needs of the decay in question. For example, in the decay of particles with spin the angular factor is represented not by Legendre polynomials but by Zemach tensors or other angular functions, yet the dynamical approximation to $\rho$ that we shall derive in Section 3.5 remains valid. It also means (and this is highly relevant in applications) that various resonances with the same $l$ may share the same angular function, which may be cached and reused in the calculation of partial waves belonging to different resonances. The function $P_l$ and its generalizations that depend on $\theta$ (but not on $E$) are called angular functions. Usually, the function $\rho$ is further separated into dynamic and kinematic factors (both independent of $\theta$). The former describe forces that act on the final-state particles while they are still bound together as a resonance. The latter describe kinematic properties. The distinction between the variables $\theta$ and $E$ is not so apparent in the relativistic treatment of particle decay, since in that case $\theta$ and $E$ are written in terms of other, relativistically invariant variables (see Section 4.2).

For the case $U(r) = 0$, equation (3.2) may be rewritten as

\[ \frac{d^2 (r\rho_l)}{k^2\, dr^2} = \left( \frac{l(l+1)/k^2}{r^2} - 1 \right) (r\rho_l) \tag{3.8} \]

with the wave vector $k$ defined by the relation

\[ k \equiv \sqrt{2mE}/\hbar. \tag{3.9} \]

General solutions of this equation are given by

\[ \rho_l(k, r) = c_l(k)\, \frac{r^l}{k^l} \left( \frac{d}{r\, dr} \right)^{l} \frac{\sin\bigl(kr + \delta_l(k)\bigr)}{kr}, \tag{3.10} \]

where $\delta_l(k)$ is an overall phase and $c_l(k)$ is a normalization factor. These factors are determined by the boundary conditions of the Schrödinger equation. (The factor $k^{-l-1}$ is introduced into the solution to be consistent with [10] in the definition of the Blatt-Weisskopf functions. Note that [7] uses the factor $(-k)^{-l}$ instead.)

The solutions (3.10) are real-valued. Instead of writing $c_l(k) \sin\bigl(kr + \delta_l(k)\bigr)$ in eq. (3.10), we could have written a complex-valued combination $a e^{i\phi_a} e^{ikx} + b e^{i\phi_b} e^{-ikx}$ for $a$, $b$, $\phi_a$, and $\phi_b \in \mathbb{R}$. However, this form does not add any new physical solutions, since

\[
\begin{aligned}
\left| a e^{i\phi_a} e^{ikx} + b e^{i\phi_b} e^{-ikx} \right|^2 &= \left| a e^{i2\delta} e^{ikx} + b e^{-ikx} \right|^2 \\
&= \left| a e^{i(kx+\delta)} + b e^{-i(kx+\delta)} \right|^2 = \left| c \sin(kx + \delta + \delta_c) \right|^2
\end{aligned}
\tag{3.11}
\]

for $2\delta = \phi_a - \phi_b$, $\delta_c = \arctan(a/b)$, and $c \cos\delta_c = a$. Every complex solution may be transformed to a physically equivalent real-valued solution by multiplication with a corresponding phase factor. The form $c_l(k) \sin\bigl(kr + \delta_l(k)\bigr)$ appearing in the solutions (3.10) is more convenient for further calculations.


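We shall later employ the fact that the lowest term in the expansion of $\rho_l(r)$ is $r^l$, since

\[
\begin{aligned}
\left( \frac{1}{r} \frac{d}{dr} \right)^{l} \frac{\sin(kr)}{kr} &= \left( \frac{1}{r} \frac{d}{dr} \right)^{l} \sum_{n=0}^{\infty} \frac{(-1)^n (kr)^{2n}}{(2n+1)!} \\
&= (-1)^l k^{2l}\, \frac{2 \cdot 4 \cdots 2l}{(2l+1)!} + O(r) = \frac{(-1)^l k^{2l}}{(2l+1)!!} + O(r),
\end{aligned}
\tag{3.12}
\]

where we have used the double factorial denoting a product of numbers with the same parity; i.e., $n!! \equiv n \cdot (n-2) \cdot \ldots \cdot \alpha$, where $\alpha$ is 1 for odd $n$ and 2 for even $n$.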

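Another property of $\rho(r)$ that we shall require is its asymptotic expansion. For $r \to \infty$, the slowest decaying term of $\rho(r)$ is the one in which the derivative is always applied to $\sin\bigl(kr + \delta_l(k)\bigr)$:

\[ \rho(r) = c_l(k)\, \frac{r^l}{k^l}\, \frac{1}{kr}\, \frac{1}{r^l} \left[ \left( \frac{d}{dr} \right)^{l} \sin\bigl(kr + \delta_l(k)\bigr) + O\!\left( \frac{1}{r} \right) \right] \quad \text{for } r \to \infty. \tag{3.13} \]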

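We can simplify this expression using the fact that

\[
\begin{aligned}
\frac{d}{dr} \sin\bigl(kr + \delta_l(k)\bigr) &= k \cos\bigl(kr + \delta_l(k)\bigr) = -k \sin\left( kr - \frac{\pi}{2} + \delta_l(k) \right) \\
\Rightarrow \quad \left( \frac{d}{dr} \right)^{l} \sin\bigl(kr + \delta_l(k)\bigr) &= \ldots = (-k)^l \sin\left( kr - \frac{\pi l}{2} + \delta_l(k) \right)
\end{aligned}
\]

to obtain the following asymptotic expansion:

\[
\begin{aligned}
\rho(r) &= c_l(k)\, \frac{1}{k^l}\, \frac{1}{kr}\, (-k)^l \sin\left( kr - \frac{\pi l}{2} + \delta_l(k) \right) + O(r^{-2}) \\
&= c_l(k) (-1)^l\, \frac{\sin\left( kr - \frac{\pi l}{2} + \delta_l(k) \right)}{kr} + O(r^{-2}).
\end{aligned}
\tag{3.14}
\]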

In scattering problems, the outgoing particles are considered to be asymptotically free after scattering. Therefore, we can combine the partial wave expansion (3.4) with expansion (3.14) to obtain the following asymptotic expression for the amplitude of a scattered particle:

\[
\begin{aligned}
\psi(k, r, \theta, \phi) &\approx \sum_l (2l+1)\, \frac{1}{kr}\, \frac{a_l(k)}{2i} \left( e^{i\left(kr - \frac{\pi l}{2} + \delta_l(k)\right)} - e^{-i\left(kr - \frac{\pi l}{2} + \delta_l(k)\right)} \right) P_l(\cos\theta) \\
&= \sum_l (2l+1)\, \frac{1}{r} \left( A_l(k) e^{ikr} + B_l(k) e^{-ikr} \right) P_l(\cos\theta)
\end{aligned}
\tag{3.15}
\]

as $r \to \infty$, where the functions

\[ a_l(k) = c_l(k) (-1)^l, \quad \delta_l(k), \quad A_l(k) = \frac{a_l(k) e^{i\delta_l(k)}}{2ik}, \quad B_l(k) = \frac{a_l(k) e^{-i\delta_l(k)}}{2ik} \tag{3.16} \]

are parametrizations describing the properties of the scattering problem. The dependence of the solutions on $k$ (up to the factors $e^{\pm ikr}$) is absorbed into the factors $A_l$ and $B_l$, which satisfy the relation

\[ \frac{A_l(k)}{B_l(k)} = e^{2i\delta_l(k)}. \tag{3.17} \]

When $a_l(k) = 0$, the fraction is understood to be 1 and $\delta_l(k)$ is chosen to be 0, so that relation (3.17) is still valid.

Any solution of the scattering problem in a spherically symmetric potential can be asymptotically expressed by (3.15).

As an example, consider the free Schrödinger equation with the boundary condition $e^{ikz}$. The solution is then simply the plane wave $e^{ikz}$, but it is also given by eq. (3.4) combined with eq. (3.10). Equating the two solutions, with $e^{ikz}$ expanded as a series, yields

\[ \sum_{l=0}^{\infty} \frac{(ikr\cos\theta)^l}{l!} = \sum_{l=0}^{\infty} (2l+1)\, c_l(k)\, \frac{r^l}{k^l} \left( \frac{d}{r\, dr} \right)^{l} \frac{\sin(kr)}{kr}\, P_l(\cos\theta). \tag{3.18} \]

The right-hand side has phases $\delta_l(k) = 0$, since otherwise the solutions would diverge at $kr = 0$, in contrast to the left-hand side of the equation. Compare the coefficients on both sides of the equation for a particular $l = l_0$. On the right-hand side, the factor $(r\cos\theta)^{l_0}$ appears in the expansion only for $l = l_0$. For $l > l_0$ the radial solution $\rho_l(k, r)$ contains only factors $r^m$ with $m \ge l$, see eq. (3.12), and for $l < l_0$ the Legendre polynomial $P_l(\cos\theta)$ contains only factors $\cos^m\theta$ with $m \le l$, see eq. (3.5). Therefore, it must hold that

\[ \frac{(ik)^{l_0}}{l_0!} = (2l_0+1)\, c_{l_0}(k)\, \underbrace{\frac{1}{k^{l_0}} \frac{(-1)^{l_0} k^{2l_0}}{(2l_0+1)!!}}_{\text{from } \rho_{l_0}}\; \underbrace{\frac{(2l_0)!}{2^{l_0} (l_0!)^2}}_{\text{from } P_{l_0}}. \tag{3.19} \]

One can easily show by induction that $\dfrac{(2l)!}{2^l\, l!\, (2l-1)!!} = 1$. Thus we obtain

\[ c_l(k) = (-i)^l. \]
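The coefficient comparison leading to $c_l(k) = (-i)^l$ can be verified mechanically. The sketch below solves eq. (3.19) for $c_{l_0}$ at $k = 1$ (both sides scale as $k^{l_0}$, so the result does not depend on $k$) and also checks the factorial identity used in the last step. The helper names are chosen for this example.

```python
from math import factorial


def double_factorial(n: int) -> int:
    """n!! = n (n-2) (n-4) ..., with the convention (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def factorial_identity_holds(l: int) -> bool:
    """Check (2l)! == 2^l * l! * (2l-1)!! using exact integer arithmetic."""
    return factorial(2 * l) == 2 ** l * factorial(l) * double_factorial(2 * l - 1)


def c_from_eq_3_19(l: int) -> complex:
    """Solve eq. (3.19) for c_l at k = 1."""
    lhs = 1j ** l / factorial(l)
    from_rho = (-1) ** l / double_factorial(2 * l + 1)
    from_legendre = factorial(2 * l) / (2 ** l * factorial(l) ** 2)
    return lhs / ((2 * l + 1) * from_rho * from_legendre)


if __name__ == "__main__":
    for l in range(5):
        print(l, factorial_identity_holds(l), c_from_eq_3_19(l), (-1j) ** l)
```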

By inserting these coefficients into eq. (3.18), we obtain the expansion of the plane wave into spherical waves. From eq. (3.14), it follows that

\[ e^{ikz} = \sum_{l=0}^{\infty} (2l+1) (-i)^l (-1)^l\, \frac{\sin\left( kr - \frac{\pi l}{2} \right)}{kr}\, P_l(\cos\theta) + O(r^{-2}) \tag{3.20} \]

for $r \to \infty$. This expression illustrates how to expand a plane wave into spherical waves at large distances. We have discussed it in such detail because in scattering problems it is often necessary to separate the initial condition (usually given by a plane wave) from the scattered particle (given by a spherical wave); expression (3.20) will allow us to do that.
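The plane wave expansion can be tested numerically at finite $kr$. With $c_l = (-i)^l$, the radial functions of eq. (3.10) at $\delta_l = 0$ coincide with $i^l j_l(kr)$, where $j_l$ is the standard spherical Bessel function, so the expansion takes the familiar form $e^{ikr\cos\theta} = \sum_l (2l+1)\, i^l j_l(kr) P_l(\cos\theta)$. The sketch below evaluates $j_l$ from its power series, which is an implementation choice for this example and adequate only for moderate arguments.

```python
import cmath


def legendre(l: int, x: float) -> float:
    """P_l(x) via the three-term recurrence."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def spherical_jn(l: int, x: float, terms: int = 60) -> float:
    """j_l(x) = x^l * sum_n (-x^2/2)^n / (n! (2l+2n+1)!!), for moderate x."""
    term = 1.0
    for n in range(1, l + 1):  # builds x^l / (2l+1)!!
        term *= x / (2 * n + 1)
    total = 0.0
    for n in range(terms):
        total += term
        term *= -(x * x / 2.0) / ((n + 1) * (2 * l + 2 * n + 3))
    return total


def plane_wave_partial_sum(k: float, r: float, cos_theta: float, l_max: int = 40) -> complex:
    """Partial-wave sum for exp(i k r cos(theta)) truncated at l_max."""
    x = k * r
    return sum(
        (2 * l + 1) * 1j ** l * spherical_jn(l, x) * legendre(l, cos_theta)
        for l in range(l_max + 1)
    )


if __name__ == "__main__":
    k, r, c = 0.6, 5.0, 0.3
    print(plane_wave_partial_sum(k, r, c), cmath.exp(1j * k * r * c))
```

Only partial waves with $l$ up to roughly $kr$ contribute appreciably, which is one way to see why a few low partial waves suffice at small momenta.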

Solutions (3.10) may be expressed more conveniently for the description of particle decays in terms of spherical Hankel functions:

\[ \rho_l^{\pm}(k, r) = \pm i k r\, c_l(k)\, h_l^{\pm}(kr), \tag{3.21} \]

where $h_l^{+} \equiv h_l^{(1)}$ and $h_l^{-} \equiv h_l^{(2)}$ denote the spherical Hankel functions of the first and second kind. For real-valued arguments, $h_l^{(1)}(x) = h_l^{(2)}(x)^{*}$. The explicit forms of the first three spherical Hankel functions are

\[
\begin{aligned}
h_0^{(1)}(x) &= -i\, \frac{e^{ix}}{x}, \\
h_1^{(1)}(x) &= -\frac{x + i}{x} \cdot \frac{e^{ix}}{x}, \quad \text{and} \\
h_2^{(1)}(x) &= i\, \frac{x^2 + 3ix - 3}{x^2} \cdot \frac{e^{ix}}{x},
\end{aligned}
\tag{3.22}
\]

which are plotted in Fig. II.2. The following properties are important:

\[ h_l^{+}(x) \propto x^{-l-1} + O(x^{-l}) \quad \text{for positive } x \to 0, \text{ and} \tag{3.23a} \]

\[ h_l^{+}(x) = -i\, \frac{e^{i\left(x - \frac{\pi l}{2}\right)}}{x} + O\!\left( \frac{1}{x^2} \right) \quad \text{for } x \to \infty, \tag{3.23b} \]

from which it follows that $|x h_l^{+}(x)| \to 1$ as $x \to \infty$. For proofs of these and other properties of spherical Hankel functions we refer the reader to [7] and references therein.

The convenience of the spherical Hankel functions for scattering and particle decay is explained by their asymptotic behaviour: the $h_l^{+}$ satisfy the boundary condition of outgoing free particles, which we shall exploit in the following subsection.
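The explicit forms of the first three spherical Hankel functions and their limiting behaviour can be checked directly. The following sketch hard-codes $h_l^{(1)}$ for $l \le 2$ as given in eq. (3.22); the function name `hankel1` is chosen for this example.

```python
import cmath


def hankel1(l: int, x: float) -> complex:
    """Spherical Hankel function of the first kind for l = 0, 1, 2, eq. (3.22)."""
    outgoing = cmath.exp(1j * x) / x
    if l == 0:
        return -1j * outgoing
    if l == 1:
        return -(x + 1j) / x * outgoing
    if l == 2:
        return 1j * (x * x + 3j * x - 3) / (x * x) * outgoing
    raise ValueError("only l = 0, 1, 2 are implemented in this sketch")


def leading_asymptotic(l: int, x: float) -> complex:
    """Leading large-x term -i * exp(i (x - pi l / 2)) / x of eq. (3.23b)."""
    return -1j * cmath.exp(1j * (x - cmath.pi * l / 2)) / x


if __name__ == "__main__":
    for l in range(3):
        for x in (0.01, 1.0, 100.0):
            print(l, x, abs(x * hankel1(l, x)), abs(x ** (l + 1) * hankel1(l, x)))
```

At large $x$ the product $|x\, h_l^{+}(x)|$ approaches 1 for every $l$, whereas at small $x$ it is $|x^{l+1} h_l^{+}(x)|$ that stays finite; the second behaviour is the origin of the centrifugal suppression discussed next.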

3.3 Transmission coefficient and Blatt-Weisskopf factors

Consider the following potential approximating a resonance with effective radius $R$:

\[
U(r) =
\begin{cases}
-U_{r \le R}(r) & \text{for } r \le R; \\[4pt]
0 + \dfrac{\hbar^2}{2m} \dfrac{l(l+1)}{r^2} & \text{for } r > R,
\end{cases}
\tag{3.24}
\]

which is a combination of a potential well and the effective potential (3.3) corresponding to the free Schrödinger equation (see Fig. II.3). The radius $R$ is a quantity introduced ad hoc: it is the distance at which the interaction between the daughter particles becomes negligible (in comparison to their energies).


[Figure II.2: Hankel functions $h_l(kR)$ for $R = 5$ GeV$^{-1}$ and $l = 0, 1, 2$. Plotted against the wave vector $k$ (in GeV) are the absolute value of $h_l(kR)$ and the phase of $h_l \cdot e^{-i(kR - l\pi/2)}$. The rationale for choosing the given $R$ is presented in Section 3.3.]

[Figure II.3: Potential of an unstable system. The Schrödinger equation potential is shown on an arbitrary linear energy scale against the radius (in GeV$^{-1}$) for $l = 0, 1, 2$, with the well depth $-U_{r \le R}$ marked.]

In applications to meson decays it is usually chosen to be 1 fm ≈ 5 GeV$^{-1}$. As before, $m$ is the reduced mass of the daughter particles. The potential $U_{r \le R}$ describes the interaction caused by the strong forces between the daughter particles "inside" the resonance. The boundary condition for an outgoing particle is

\[ \psi(k, r, \theta, \phi) \to c_{k,\theta,\phi}\, \frac{e^{ikr}}{r} \quad \text{as } r \to \infty \tag{3.25} \]

with a complex factor $c$ that does not depend on $r$. Outside the well, solutions of the problem are given by spherical Hankel functions of the first kind. The solution for $r \le R$ is unknown without further assumptions on $U_{r \le R}$. One usually requires that the solution and its derivative are continuous for all $r \in \mathbb{R}_{+}$ (in particular, for $r = R$).

An important quantity characterizing the described system is the transmission coefficient. It is defined as the relative probability of a particle to penetrate some potential barrier and to be detected on the other side of the barrier:

\[ T(x, y) \equiv \lim_{h \to 0} \frac{P(\text{Particle in } [y; y+h])}{P(\text{Particle in } [x; x+h])}, \]

where $y - x$ is the distance that must be traveled by the particle. In other words, if the probability of detecting a particle at $x$ is $\alpha$, the probability of detecting a particle at $y$ is $\alpha T(x, y)$. In most cases, the convenient quantity to use is the single-argument transmission coefficient $T(x) \equiv \lim_{y \to \infty} T(x, y)$, which is the relative probability of the particle leaving the system.

For stationary solutions $\psi$ with infinite support, the transmission coefficient is given by

\[ T(x, y) = \left| \frac{\psi(y)}{\psi(x)} \right|^2. \tag{3.26} \]

For solutions in the potential (3.24) with orbital angular momentum $l$ and wave vector $k$, the transmission coefficient has the form

\[ T_{l,k}(R) = \lim_{y \to \infty} \left| \frac{k(R+y)\, h_l^{+}\bigl(k(R+y)\bigr)}{kR\, h_l^{+}(kR)} \right|^2 = \left| \frac{1}{kR\, h_l^{+}(kR)} \right|^2, \tag{3.27} \]

which is the relative probability of a particle leaving the system, assuming knowledge of its wave vector and orbital angular momentum.

In a broader sense, the transmission coefficient may be used to compare the probability to detect the system not in a certain region of physical space $[x, x+h]$, but in any region of the phase space (for example, in $[x, x+h] \times [k, k+h]$). Consider solutions (3.21) with $c_l(k) = 1/h_l^{+}(1)$; that is, we choose the simplest possible coefficients and normalize the solutions to have absolute value 1 at $kr = 1$:

\[ \rho_l(k, r) = i k r\, h_l^{+}(kr) / h_l^{+}(1). \tag{3.28} \]

The probability of a particle being at $R$ and leaving the system with wave vector $k$, relative to the probability of the particle being at $R$ and leaving the system with wave vector $k_0$, is

\[ T_l(k, k_0; R, \infty) \equiv \left| \frac{1}{i k R\, h_l^{+}(kR)} \right|^2 \Big/ \left| \frac{1}{i k_0 R\, h_l^{+}(k_0 R)} \right|^2 = \left| \frac{i k_0 R\, h_l^{+}(k_0 R)}{i k R\, h_l^{+}(kR)} \right|^2. \tag{3.29} \]

The wave vector $k_0$ is the reference wave vector. A common choice is to take the limit $k_0 \to \infty$, analogously to $y \to \infty$ in eq. (3.27), which leads to

\[ T_l(k; R) \equiv T_l(k, \infty; R, \infty) = \left| \frac{1}{kR\, h_l^{+}(kR)} \right|^2. \tag{3.30} \]
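To make the transmission coefficient $T_l(k; R) = |1/(kR\, h_l^{+}(kR))|^2$ concrete, the sketch below evaluates it for $l \le 2$ with the radius $R = 5$ GeV$^{-1}$ quoted in the text. The closed forms in the comments follow from the explicit Hankel functions of eq. (3.22); the function names are chosen for this example.

```python
import cmath


def hankel1(l: int, x: float) -> complex:
    """Spherical Hankel function of the first kind for l = 0, 1, 2, eq. (3.22)."""
    outgoing = cmath.exp(1j * x) / x
    if l == 0:
        return -1j * outgoing
    if l == 1:
        return -(x + 1j) / x * outgoing
    if l == 2:
        return 1j * (x * x + 3j * x - 3) / (x * x) * outgoing
    raise ValueError("only l = 0, 1, 2 are implemented in this sketch")


def transmission(l: int, k: float, radius: float = 5.0) -> float:
    """T_l(k; R) = 1 / |k R h_l^+(k R)|^2 with k in GeV and R in 1/GeV."""
    x = k * radius
    return 1.0 / abs(x * hankel1(l, x)) ** 2


# Closed forms implied by eq. (3.22), with x = k R:
#   l = 0:  T = 1
#   l = 1:  T = x^2 / (x^2 + 1)
#   l = 2:  T = x^4 / (x^4 + 3 x^2 + 9)

if __name__ == "__main__":
    for k in (0.05, 0.2, 1.0, 3.0):
        print(k, [round(transmission(l, k), 6) for l in range(3)])
```

The S wave is transmitted without suppression, while the P and D waves are suppressed at small $kR$ like $(kR)^2$ and $(kR)^4$, respectively; this is the centrifugal barrier at work.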


As before, $R$ is the space coordinate at which the comparison is made. It is convenient to introduce the following definitions:

\[
\begin{aligned}
F_l(k, r) &= \frac{1}{(kr)^l\, |\rho_l(k, r)|}; \\
B_l(k, r) &= \frac{1}{|\rho_l(k, r)|}; \\
B'_l(k, r, k_0, r_0) &= \left| \frac{F_l(k, r)}{F_l(k_0, r_0)} \right|.
\end{aligned}
\tag{3.31}
\]

If the solution of the Schrödinger equation may be written in the form $\rho_l(k, r) = \rho_l(kr)$, we use the shorthand notations $F_l(kr)$, $B_l(kr)$, and $B'_l(kr, k_0 r_0)$. The functions $F_l$ are known as barrier factors in [4]⁴ and the functions $B_l$ and $B'_l$ are known as Blatt-Weisskopf barrier factors in [16]. Their explicit forms for $l \le 2$ are listed in Table II.1. The advantages of these notations are twofold: they underline the probabilistic interpretation of the subject and they extract the asymptotic properties given by eqs. (3.23a) and (3.23b) from the Hankel functions. The latter can be seen by inserting eq. (3.31) in eq. (3.28):

\[ |\rho_l(k, r)| = B_l(kr)^{-1} = (kr)^{-l} F_l(kr)^{-1} = (kr)^{-l} \underbrace{\left| h_l^{+}(kr) \Big/ \left( \frac{1}{(kr)^l} \cdot \frac{e^{ikr}}{kr} \right) \right|}_{(*)}. \]

In the part $(*)$ of the equality, we have divided $h_l$ by the factors that are common to all Hankel functions. The remaining part of the numerator (see eq. (3.22)) is the part that appears in the Blatt-Weisskopf factors, see Table II.1. Using this notation, we can rewrite the transition factors as

\[
\begin{aligned}
T_l(k, k_0; R, \infty) &= \frac{(kR)^{2l} F_l(kR)^2}{(k_0 R)^{2l} F_l(k_0 R)^2} = \frac{(kR)^{2l}}{(k_0 R)^{2l}}\, B'_l(kR, k_0 R)^2; \\
T_l(k; R) &= (kR)^{2l} F_l(kR)^2 = B_l(kR)^2.
\end{aligned}
\tag{3.32}
\]

These transition factors describe the probability distribution of the solution in the variable $k$. The amplitude $A(k)$ describing the decay $P \to Rc \to abc$ will contain factors $B_l(kR)$ and $B_l(ka)$ to account for the probability that $R$ overcomes the centrifugal potential of the system $Rc$, and that $a$ overcomes the centrifugal potential of the system $ab$.

The assumption $c_l(k) = c_l$ in eq. (3.28) is necessary to obtain the Blatt-Weisskopf factors. Remember that multiplying solutions (3.28) by any function $c(k)$ that depends only on the wave vector yields another set of valid solutions for the Schrödinger equation in question. It is unclear why this solution basis yields an appropriate description of the transition probabilities with dependence only on $k$. In other words, the Schrödinger equation prescribes the dependence of the solutions on $r$ and not on $k$, but it is the dependence on $k$ that is investigated in decay experiments. Any solution $c(k)\rho(k, r)$ could be used to derive different transition factors. A notable property of the solutions (3.28) is their symmetry under the exchange of the variables $k$ and $r$, which would be lost if the solutions were multiplied by any function of $k$.

⁴ Note that the normalization we have chosen in eq. (3.28) leads to a different normalization of the Blatt-Weisskopf factors compared to [4]. To obtain the normalization used in [4], use $c_l(k) = 1/h_l^{+}(k_0 R)$ instead. This difference is of no importance for practical sampling tasks.

Table II.1: Blatt-Weisskopf form factors.

l   F_l(x)                      B_l(x)                         B'_l(x, x_0)
0   1                           1                              1
1   \sqrt{2/(x+1)}              \sqrt{2x/(x+1)}                \sqrt{(x_0+1)/(x+1)}
2   \sqrt{13/(x^2+3x+9)}        \sqrt{13x^2/(x^2+3x+9)}        \sqrt{(x_0^2+3x_0+9)/(x^2+3x+9)}

Another point to notice is that the Blatt-Weisskopf factors ignore the phase components of solutions (3.28).
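The entries of Table II.1 can be reproduced from the normalized solutions (3.28). The sketch below computes $B_l = 1/|\rho_l|$ and $F_l = B_l/(kr)^l$ from the explicit Hankel functions and compares them with the tabulated expressions, under the assumption that the table argument $x$ is to be read as $(kr)^2$; with this reading the normalization constants 2 and 13 come out as $|h_1^{+}(1)|^2$ and $|h_2^{+}(1)|^2$. All function names are chosen for this example.

```python
import cmath
import math


def hankel1(l: int, x: float) -> complex:
    """Spherical Hankel function of the first kind for l = 0, 1, 2, eq. (3.22)."""
    outgoing = cmath.exp(1j * x) / x
    if l == 0:
        return -1j * outgoing
    if l == 1:
        return -(x + 1j) / x * outgoing
    if l == 2:
        return 1j * (x * x + 3j * x - 3) / (x * x) * outgoing
    raise ValueError("only l = 0, 1, 2 are implemented in this sketch")


def rho(l: int, k: float, r: float) -> complex:
    """Normalized outgoing solution i k r h_l^+(k r) / h_l^+(1), eq. (3.28)."""
    x = k * r
    return 1j * x * hankel1(l, x) / hankel1(l, 1.0)


def factor_b(l: int, k: float, r: float) -> float:
    """B_l = 1 / |rho_l|, eq. (3.31)."""
    return 1.0 / abs(rho(l, k, r))


def factor_f(l: int, k: float, r: float) -> float:
    """F_l = 1 / ((k r)^l |rho_l|), eq. (3.31)."""
    return factor_b(l, k, r) / (k * r) ** l


# Table II.1 entries, with the ASSUMPTION that the table argument is z = (k r)^2.
TABLE_B = {
    0: lambda z: 1.0,
    1: lambda z: math.sqrt(2 * z / (z + 1)),
    2: lambda z: math.sqrt(13 * z * z / (z * z + 3 * z + 9)),
}
TABLE_F = {
    0: lambda z: 1.0,
    1: lambda z: math.sqrt(2 / (z + 1)),
    2: lambda z: math.sqrt(13 / (z * z + 3 * z + 9)),
}

if __name__ == "__main__":
    r = 5.0
    for l in range(3):
        for k in (0.1, 0.5, 2.0):
            z = (k * r) ** 2
            print(l, k, factor_b(l, k, r), TABLE_B[l](z))
```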

3.4 Scattering amplitude and the complex energy sheet

Solution (3.4) remains valid when multiplied by any function of $k$. This is inconvenient because in the end it is the dependence on $k$ and $\theta$ that is measured by an observer. There is further insight to be gained into the dependence of the amplitude on $k$ and $\theta$; to do so, we establish the link between solution (3.4) and the scattering amplitude $f$. Some of its properties can then be deduced from physical constraints. Due to crossing, particle decay and non-elastic scattering are processes with very similar descriptions; this justifies the dual interpretation in the following discussion.

Consider the scattering of a particle $a$ by the particle $b$ (or, equivalently, the scattering of a fictitious particle with the reduced mass $m$ by the potential $U$ described in Section 3.1).


3.4.1 The scattering amplitude

The asymptotic state of a particle scattered by a centrally symmetric potential is given by

\[ \psi(k, r, \theta) = e^{ikz} + \frac{f(k, r, \theta)}{r}\, e^{ikr} + O(r^{-2}), \quad r \to \infty, \tag{3.33} \]

where $\theta$ is the angle between the incoming and the scattered particle, and $f(k, r, \theta)$ is the scattering amplitude (further shortened as $f(k, \theta)$, since in most applications $f$ will be independent of $r$). The first summand in this formula describes a free particle propagating in the direction $z \equiv r\cos\theta$; the second describes a spherical wave propagating far away from the scattering potential. Expression (3.33) is in complete accordance with the free solutions we have shown earlier, since it is a superposition of eq. (3.18) and eq. (3.21) in the form of eq. (3.23b).

If we assume that a detector is shielded from the incoming particle flow (as is usually the case in scattering experiments), the probability of detecting the scattered particle at a certain point in space (the cross-section) is given by $|f(\theta)/r|^2$. Hence, the probability of the scattered particle being detected on the infinitesimal surface $r^2 d\Omega$ is given by

\[ d\sigma = |f(\theta)|^2\, d\Omega = |f(\theta)|^2 \sin\theta\, d\theta\, d\phi, \tag{3.34} \]

where $d\Omega$ denotes the spherical surface element. Both this expression and eq. (3.33) may be used as defining equations for the scattering amplitude.

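We now have two expressions describing a scattered particle at asymptotically large distances: eq. (3.15) in terms of partial wave amplitudes and eq. (3.33) in terms of the scattering amplitude. Equating the two yields a relationship for large $r$:

\[
\begin{aligned}
&\sum_{l=0}^{\infty} (2l+1)\, \frac{a_l(k)}{2ikr} \left( e^{i\left(kr - \frac{\pi l}{2} + \delta_l(k)\right)} - e^{-i\left(kr - \frac{\pi l}{2} + \delta_l(k)\right)} \right) P_l(\cos\theta) \\
&\quad = \sum_{l=0}^{\infty} (2l+1)\, \frac{i^l}{2ikr} \left( e^{i\left(kr - \frac{\pi l}{2}\right)} - e^{-i\left(kr - \frac{\pi l}{2}\right)} \right) P_l(\cos\theta) + \frac{f(\theta)}{r}\, e^{ikr}.
\end{aligned}
\tag{3.35}
\]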

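The coefficients of the terms $e^{-ikr}$ on both sides must be equal; therefore

\[ a_l(k) = i^l e^{i\delta_l(k)}, \tag{3.36} \]

and eq. (3.35) simplifies to

\[ \frac{e^{ikr}}{r} \sum_{l=0}^{\infty} (2l+1)\, \frac{i^l e^{i\delta_l(k)}}{2ik}\, e^{i\left(-\frac{\pi l}{2} + \delta_l(k)\right)} P_l(\cos\theta) = \frac{e^{ikr}}{r} \sum_{l=0}^{\infty} (2l+1)\, \frac{i^l}{2ik}\, e^{-i\frac{\pi l}{2}} P_l(\cos\theta) + \frac{e^{ikr}}{r}\, f(\theta). \tag{3.37} \]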


Moving f to one side of the equation and all summations on the other, obtain

f(θ) =1

2ik

∞∑l=0

(2l + 1)(e2iδl(k) − 1)Pl(cos θ). (3.38)

Note that both expressions used in eq. (3.35) are asymptotic expansions of the solution. They both contain only the 1/r term of the expansion, so the relationship (3.38) is only valid to order 1/r. It connects the scattering amplitude to phases of the solution of the Schrödinger equation; as such, it is the natural connection between the Schrödinger equation and scattering problems.

The quantity

\[
f_l(k) \equiv \frac{1}{2ik} \left( e^{2i\delta_l(k)} - 1 \right) \tag{3.39}
\]

is the partial scattering amplitude with orbital angular momentum l. It is the part of the scattering that depends on k and does not depend on θ.

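Integrating eq. (3.34) we obtain the total cross section:

\[
\sigma = \int_0^{2\pi} \int_{-1}^{1} |f(\theta)|^2 \, d(\cos\theta)\, d\varphi
= 2\pi \sum_{l=0}^{\infty} \frac{2}{2l+1}\, \frac{(2l+1)^2}{4k^2}\, \left| e^{2i\delta_l(k)} - 1 \right|^2
= \frac{\pi}{k^2} \sum_{l=0}^{\infty} (2l+1) \left| e^{2i\delta_l(k)} - 1 \right|^2. \tag{3.40}
\]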

We have used (3.7) to integrate over Legendre polynomials.
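The relation between eqs. (3.38) and (3.40) can be checked numerically. The following is a minimal sketch, assuming a truncated list of phase shifts (the values used in the test are arbitrary, not taken from any fit); the function names are hypothetical. It builds f(θ) from eq. (3.38) with NumPy's Legendre utilities and compares the angular integral of |f|² with the partial-wave sum.

```python
import numpy as np
from numpy.polynomial import legendre


def scattering_amplitude(k, deltas, cos_theta):
    """f(theta) from eq. (3.38), truncated at l_max = len(deltas) - 1."""
    deltas = np.asarray(deltas, dtype=float)
    l = np.arange(len(deltas))
    # Coefficient of P_l(cos theta) is (2l + 1) f_l(k), with f_l from eq. (3.39).
    coeffs = (2 * l + 1) * (np.exp(2j * deltas) - 1) / (2j * k)
    return legendre.legval(cos_theta, coeffs)


def sigma_partial_wave_sum(k, deltas):
    """Right-hand side of eq. (3.40)."""
    deltas = np.asarray(deltas, dtype=float)
    l = np.arange(len(deltas))
    return np.pi / k**2 * np.sum((2 * l + 1) * np.abs(np.exp(2j * deltas) - 1) ** 2)


def sigma_angular_integral(k, deltas, n_nodes=32):
    """Integral of |f|^2 over the sphere, using Gauss-Legendre nodes in cos(theta)."""
    x, w = legendre.leggauss(n_nodes)
    f = scattering_amplitude(k, deltas, x)
    # The integrand does not depend on phi, which contributes a factor 2 pi.
    return 2 * np.pi * np.sum(w * np.abs(f) ** 2)
```

The quadrature is exact here because |f|² is a polynomial in cos θ of degree 2·l_max, well below what 32 nodes can resolve for a handful of partial waves.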

3.4.2 The scattering amplitude as function on the complex energy sheet

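Consider solution (3.15) as a function of energy:

\[
\psi(E, r, \theta, \varphi) = \sum_{l=0}^{\infty} (2l+1)\, \frac{P_l(\cos\theta)}{r}\,
\underbrace{\left( A_l(E)\, e^{ir\sqrt{2mE}/\hbar} + B_l(E)\, e^{-ir\sqrt{2mE}/\hbar} \right)}_{\chi_l(E)}. \tag{3.41}
\]

The function $\chi_l(E) \equiv r\rho_l(E, r)$ is chosen to be real-valued according to the remark following eq. (3.11). Since $\chi_l(E) = \chi_l^*(E)$,

\[
A_l^*(E) = B_l(E). \tag{3.42}
\]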

Formally, we can consider ψ for any complex values E by looking at

\[
\chi_l(E) = A_l(E)\, e^{-r\,\mathrm{sqrt}(-2mE)/\hbar} + B_l(E)\, e^{r\,\mathrm{sqrt}(-2mE)/\hbar}, \tag{3.43}
\]

where sqrt is an analytical function satisfying $(\mathrm{sqrt}(x))^2 = x$ for any x in its domain. (The analyticity of the nonrelativistic quantum-mechanical amplitude follows from causality: see, e.g., [17, 18].) Solution (3.41) modified by eq. (3.43) remains an asymptotic solution of the Schrödinger equation (3.2) for r → ∞, cf. eqs. (3.4) and (3.15).

[Figure II.4: Analytical continuation of the square root function. Each panel is a surface over the complex plane with axes Re(E) and Im(E), each ranging from −4 to 4.
a) Panels Re(sqrt+(E)) and Im(sqrt+(E)): analytical continuation of the square root defined by sqrt+(1) = 1 on C \ R<0, leading to the so-called physical sheet.
b) Panels Re(sqrt−(E)) and Im(sqrt−(E)): analytical continuation of the square root defined by sqrt−(1) = −1 on C \ R<0, leading to the so-called unphysical sheet.]

There are two particularly common ways to define such a sqrt function on the domain C \ R<0 (see Fig. II.4): the analytic continuation defined by sqrt+(1) ≡ 1 leading to the so-called physical sheet and the analytic continuation defined by sqrt−(1) ≡ −1 leading to the so-called unphysical sheet. These functions are not well-defined on the negative real axis. To define the square root of −1, it is common to shift the real axis by iε into the complex plane. We employ here the same convention as in [4, Ch. 47]: the real line is shifted into the lower complex semi-plane of the physical sheet, which leads to sqrt±(−1) = −i. This convention leads to the minus sign under the sqrt in eq. (3.43) in contrast to eq. (3.41).

Both sqrt+ and sqrt−, when inserted in eq. (3.43), yield valid solutions to the Schrödinger equation. For positive real values of E, the difference between sqrt+ and sqrt− manifests itself as an exchange of the coefficients in front of $e^{ikx}$ and $e^{-ikx}$. The difference between the physical and unphysical sheets becomes apparent for complex energy values; we return to this point in Section 3.5.

The cut of the domain of sqrt± is known as the branch cut. It is possible to change the branch cut, for example, to the positive real axis, and analytically continue the function sqrt+ to some function sqrt_c along the negative real axis. Then, the function sqrt_c would coincide with sqrt+ on the second quadrant of the complex plane and with sqrt− on the third quadrant of the complex plane. If an analytical function is continuously extended “through the branch cut,” its domain changes from the physical to the unphysical sheet of the complex plane (or vice versa).
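The three branches discussed above can be written down explicitly. This is a small sketch using Python's standard `cmath` module, whose `sqrt` is the principal value with the cut on the negative real axis; the function names `sqrt_plus`, `sqrt_minus`, and `sqrt_c` are hypothetical labels for sqrt+, sqrt−, and the continuation with the cut moved to the positive real axis. It does not implement the iε convention for the negative real axis described in the text.

```python
import cmath


def sqrt_plus(z):
    """Branch with sqrt_plus(1) = 1: principal value, cut on the negative real axis."""
    return cmath.sqrt(z)


def sqrt_minus(z):
    """Branch with sqrt_minus(1) = -1: the other solution of w**2 = z."""
    return -cmath.sqrt(z)


def sqrt_c(z):
    """Branch with the cut on the positive real axis.

    It is continuous across the negative real axis, equals sqrt_plus in the
    upper half-plane and sqrt_minus in the lower half-plane.
    """
    return 1j * cmath.sqrt(-z)
```

Evaluating `sqrt_c` at a point in the second quadrant reproduces `sqrt_plus`, and at a point in the third quadrant it reproduces `sqrt_minus`, which is the behaviour described for the continuation through the cut.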

3.5 Single channel Breit-Wigner formula

We are now ready to return to the original task of finding an amplitude modeling the decay R → ab. We consider the Schrödinger equation with the potential specified in Section 3.1 and search for a quasi-stationary solution describing the decaying particle R. Such a quasi-stationary solution should describe a system with an average lifetime τ = 1/ω, where ω is the decay probability of the system. The quasi-stationarity condition implies dψ/dt ≈ 0 for small times t ≪ τ. The solution is then

\[
\psi(x, t) = \psi(x)\, e^{-\frac{i}{\hbar} E t} = \psi(x)\, e^{-\frac{i}{\hbar} E_0 t - \frac{\Gamma}{2\hbar} t}, \tag{3.44}
\]

where $E = E_0 - i\Gamma/2$ is a complex eigenvalue of the Hamiltonian describing the system. We cannot assume that E is real since the energy of a decaying state is not an observable quantity. The probability of detecting the particle within some volume V is

\[
P(R \in V) = e^{-\frac{\Gamma}{\hbar} t} \int_V |\psi(x)|^2 \, dx; \tag{3.45}
\]

which, by the definition of decay probability, leads to

\[
\omega = \Gamma. \tag{3.46}
\]

We can conclude that Γ > 0, since we are considering an unstable system with τ > 0. The spatial solution ψ(x) has the form (3.4). As shown in eq. (3.41), the radial part of the solution is asymptotically

\[
\rho_l(E, r) = \frac{1}{r} \left( A_l(E)\, e^{-r\,\mathrm{sqrt}(-2mE)} + B_l(E)\, e^{r\,\mathrm{sqrt}(-2mE)} \right). \tag{3.47}
\]

If we choose the analytic continuation of sqrt to be sqrt+, then

\[
\mathrm{Im}\left( \mathrm{sqrt}(-2m(E_0 - i\Gamma/2)) \right) < 0
\]

(see Fig. II.4) and the term $e^{-r\,\mathrm{sqrt}(-2mE)}$ diverges for r → ∞. Analogously, for sqrt = sqrt−, the term $e^{r\,\mathrm{sqrt}(-2mE)}$ diverges for large r. In either case the solution ψ cannot be normalized by the condition $\int |\psi|^2 = 1$. The underlying reason is that we are considering an unstable system with complex energy eigenvalues. Such a problem has boundary conditions that correspond to outgoing particles, which manifests itself as “outward probability flow” into infinity.

We introduced the quantity Γ to characterize the complex energy eigenvalue. In general, it is much more convenient to define Γ as the decay rate in the rest frame of the particle:

\[
\Gamma = \frac{\text{number of decays per unit time}}{\text{total number of present particles}}. \tag{3.48}
\]

This definition leads to the survival probability, i.e., the probability that a particle has not decayed before time $t_0$:

\[
P(t_0) = e^{-t_0 \Gamma} \tag{3.49}
\]

in the rest frame of the decaying particle. In the relativistic treatment, Γ must be divided by the Lorentz factor γ to ensure the Lorentz invariance of the survival probability. A general width $\Gamma_{\mathrm{gen}}$ is not a constant, but rather a function $\Gamma_{\mathrm{gen}}(k)$ which satisfies $\Gamma_{\mathrm{gen}}(k_0) = \Gamma$ for the wave vector $k_0 = \sqrt{2mE_0}$ related to the eigenvalue $E_0 - i\Gamma/2$.

We enforce a boundary condition for eigenfunctions corresponding to $E = E_0 - i\Gamma/2$: for large r, the solution must contain only outgoing waves $e^{ikr}$ for the decay particles a and b that leave the system. This condition is equivalent to

\[
\mathrm{Im}(ikr) = \mathrm{Im}\left( -r\,\mathrm{sqrt}(-2m(E_0 - i\Gamma/2)) \right) > 0, \tag{3.50}
\]

which is satisfied for sqrt = sqrt−. The imaginary part of sqrt− in the second complex plane quadrant is less than 0. To eliminate the unwanted terms in eq. (3.47), we require

\[
B_l(E_0 - i\Gamma/2) = 0, \tag{3.51}
\]

which allows us to approximate $\rho_l$ around the eigenvalue $E_0 - i\Gamma/2$.

A Taylor expansion of $B_l(E)$ around the pole $E_0 - i\Gamma/2$ yields

\[
B_l(E) = 0 + b_l\,(E - E_0 + i\Gamma/2) + O(|E - E_0 + i\Gamma/2|^2) \tag{3.52}
\]

for some complex constant $b_l$. For values of E on the real line, this expansion is valid only for small $|E - E_0|$ and for narrow widths Γ. Using eq. (3.42), the radial part of the solution becomes

\[
\rho_l(E, r) \approx \frac{1}{r} \left[ b_l^* \left( E - E_0 - \frac{i}{2}\Gamma \right) e^{ikr} + b_l \left( E - E_0 + \frac{i}{2}\Gamma \right) e^{-ikr} \right]. \tag{3.53}
\]


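The phase of this solution is determined using eq. (3.17):

\[
e^{2i\delta_l(E)} \approx e^{2i\delta_l^{(0)}} \left( \frac{E - E_0 - \frac{i}{2}\Gamma}{E - E_0 + \frac{i}{2}\Gamma} \right)
= e^{2i\delta_l^{(0)}} \left( 1 - \frac{i\Gamma}{E - E_0 + \frac{i}{2}\Gamma} \right), \tag{3.54}
\]

where $e^{2i\delta_l^{(0)}} \equiv b_l^*/b_l$. We use this result in eq. (3.38) to obtain an approximation of the scattering amplitude:

\[
\begin{aligned}
f(\theta) &= \frac{1}{2ik} \sum_{l=0}^{\infty} (2l+1) \left( e^{2i\delta_l(k)} - 1 \right) P_l(\cos\theta) \\
&\approx \frac{1}{2ik} \sum_{l=0}^{\infty} (2l+1)\, e^{2i\delta_l^{(0)}} \left( 1 - \frac{i\Gamma}{E - E_0 + \frac{i}{2}\Gamma} - e^{-2i\delta_l^{(0)}} \right) P_l(\cos\theta) \\
&= f^{(0)}(\theta) - \sum_{l=0}^{\infty} (2l+1)\, e^{2i\delta_l^{(0)}}\, \frac{\Gamma/2}{k\left( E - E_0 + \frac{i}{2}\Gamma \right)}\, P_l(\cos\theta)
\end{aligned} \tag{3.55}
\]

with the term

\[
f^{(0)}(\theta) \equiv \frac{1}{2ik} \sum_{l=0}^{\infty} (2l+1) \left( e^{2i\delta_l^{(0)}} - 1 \right) P_l(\cos\theta) \tag{3.56}
\]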

describing the scattering amplitude far away from the resonance. This term does not depend on the resonance energy and is sometimes known as potential scattering. The sum in eq. (3.55) is called the resonance scattering. Potential scattering only influences the elastic contribution to the scattering amplitude and does not appear in resonance decays. So the l-th partial decay amplitude, defined by eq. (3.39), is

\[
f_l(k) = \frac{e^{2i\delta_l^{(0)}}\, \Gamma}{2k\left( E - E_0 + \frac{i}{2}\Gamma \right)}. \tag{3.57}
\]

Using eq. (3.54) and the formula $e^{2i\arctan x} = \frac{e^{i\arctan x}}{e^{-i\arctan x}} = \frac{1+ix}{1-ix}$, the scattering phase may be expressed in the form

\[
\delta_l = \delta_l^{(0)} - \arctan\frac{\Gamma}{2(E - E_0)},
\]

from which we can see that the phase changes by π in the resonance region (see Fig. II.5).
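The phase motion can be made concrete with a short numerical sketch. It assumes the toy values E0 = 1.0 GeV and Γ = 0.1 GeV quoted in the caption of Fig. II.5 and a vanishing background phase; the function names are hypothetical. To obtain a phase that is continuous through E = E0, the sketch uses `arctan2`, which agrees with the arctan expression above up to an integer multiple of π (a shift that leaves exp(2iδ) unchanged).

```python
import numpy as np


def s_factor(E, E0, Gamma, delta0=0.0):
    """exp(2 i delta_l) near the resonance, eq. (3.54)."""
    return np.exp(2j * delta0) * (E - E0 - 0.5j * Gamma) / (E - E0 + 0.5j * Gamma)


def phase_shift(E, E0, Gamma, delta0=0.0):
    """Continuous resonance phase.

    Equal to delta0 - arctan(Gamma / (2 (E - E0))) up to a multiple of pi.
    """
    return delta0 + np.arctan2(0.5 * Gamma, E0 - E)


# Assumed toy parameters (GeV), as in the caption of Fig. II.5.
energies = np.linspace(0.5, 1.5, 1001)
delta = phase_shift(energies, E0=1.0, Gamma=0.1)
```

With these inputs the phase rises monotonically, passes through π/2 at E = E0, and approaches π well above the resonance.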

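The resonance scattering contribution to the total cross section in the l-th partial wave follows from inserting the resonance term of eq. (3.54) into (3.40):

\[
\sigma_l = \frac{4\pi}{k^2}\, (2l+1)\, \frac{(\Gamma/2)^2}{(E - E_0)^2 + (\Gamma/2)^2}. \tag{3.58}
\]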


[Plot titled "Dynamical, non-relativistic part of f0(1350)". Horizontal axis: energy in GeV, from 0.0 to 4.0; left vertical axis: absolute value of f, arbitrary scale; right vertical axis: phase of f, from 0 to π.]

Figure II.5: Partial decay amplitude f0(k) modeling the toy resonance f0(1000) decaying into two pions. The resonance is modeled with the energy E0 = 1.0 GeV and width Γ = 0.1 GeV; the reduced mass of the outgoing particles is $m = m_\pi^2/(2m_\pi) = 70$ MeV. The outgoing particles are pions as in the example D → π+π−π+ that is discussed in Section 6.1 in more detail.

The derived approximation is valid for r →∞, E − E0 → 0, and Γ→ 0.

At $E = E_0 - i\Gamma/2$, equation (3.53) simplifies to

\[
\rho_l(E, r) \approx -\frac{1}{r}\, i\Gamma\, b_l^*\, e^{ikr}. \tag{3.59}
\]

This solution cannot be normalized over $\mathbb{R}^3$. A possible normalization condition would be

\[
\int_0^R |\rho_l(r)|^2\, r^2 \, dr = 1. \tag{3.60}
\]

Here, the function is integrated over the region in space describing the decaying particle “contained” in the ball with radius R. We can use this condition to roughly estimate the interaction radius of the resonance.

The total probability current of the outgoing wave (integrated over the sphere with radius R) is proportional to the probability density and to the group velocity v of the wave:

\[
v\, |\rho_l(r)|^2\, r^2 = v\, |\Gamma b_l|^2. \tag{3.61}
\]

We have omitted the factors coming from integration over spherical angles such as $\int_0^\pi P_l(\cos\theta)\, d\theta = \sqrt{2/(2l+1)}$. However, our notation is consistent with the normalization chosen in (3.60). In the 0-th order of the Taylor expansion in Γ,

\[
k = -\mathrm{sqrt}_-\left( -2m(E_0 - i\Gamma/2) \right) = \underbrace{\sqrt{2mE_0}}_{\equiv k_0} + O(\Gamma), \tag{3.62}
\]

which means that the group velocity is approximately given by $v = \omega/k_0 = \Gamma/k_0$.

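The probability current (3.61) may be interpreted as the decay probability, i.e.,

\[
v\, |\Gamma b_l|^2 = \omega = \Gamma, \tag{3.63}
\]

which leads to

\[
|b_l|^2 = \frac{1}{v\Gamma} \approx \frac{k_0}{\Gamma^2}. \tag{3.64}
\]

If we evaluate eq. (3.60) for $k \approx k_0$, we obtain

\[
R \approx \frac{1}{|\Gamma b_l|^2} \approx \frac{1}{\sqrt{2mE_0}}. \tag{3.65}
\]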

For example, for a resonance with mass E0 = 1.3 GeV decaying into two pions with reduced mass $m = m_\pi^2/(m_\pi + m_\pi) \approx 70$ MeV, the interaction radius is approximately $1/\sqrt{2mE_0} \approx 2.3\ \mathrm{GeV}^{-1} \approx 0.5$ fm. This method can be used to roughly estimate the interaction radius between the daughter particles. As mentioned in Section 3.3, R is usually set to 1 fm. It appears explicitly only in Blatt-Weisskopf form factors and has a relatively minor impact on the form of scattering amplitudes.

4 Phase space volume element

Here we discuss the variables in terms of which we describe the partial wave amplitudes. Rather than describing the decay in terms of angles between resonances and energies thereof, we can employ some Lorentz-invariant quantities such as products of four-momenta of final state particles. When working with such derived variables, it is important to verify all constraints that must be imposed on them (such as energy-momentum conservation or on-shell conditions for the final state particles).

4.1 Scattering matrix

Instead of working with scattering amplitudes, in nonrelativistic scattering it is common to use the S matrix, defined as a unitary operator describing the overlap of incoming and outgoing wave vectors. When the scattering probability (or in the decay context, the particle decay rate) is written in terms of quantities defined via the S matrix, it becomes apparent how phase space factors come into the decay description. For the process ab → 12 ... n, let $k_a, k_b$ denote the wave vectors of a, b at time T before scattering, and $k'_1, k'_2, \dots, k'_n$ the wave vectors at time T after scattering. The overlap of in and out states is given by

\[
\langle k'_1 \dots k'_n | S | k_a k_b \rangle \equiv \lim_{T \to \infty} \langle k'_1 \dots k'_n | e^{-iH(2T)} | k_a k_b \rangle. \tag{4.1}
\]

The S matrix describes the interaction of the in and out states; to ensure the overall probability conservation, it must be unitary (see eq. (5.23)). The part of the S matrix corresponding to particle interactions is known as the T matrix, defined by

\[
S \equiv 1 + iT. \tag{4.2}
\]

The T matrix may be further decomposed into kinematical and dynamical parts of the scattering:

\[
\langle k'_1 \dots k'_n | iT | k_a k_b \rangle = (2\pi)^4\, \delta^4\Big( k_a + k_b - \sum_{i=1}^{n} k'_i \Big)\, \frac{\mathcal{M}(k_a, k_b \to k'_1, \dots, k'_n)}{\left( 2E_a\, 2E_b \prod_{i=1}^{n} 2E'_i \right)^{1/2}} \tag{4.3}
\]

with the invariant matrix element $\mathcal{M}$ describing the dynamic part of the scattering; $E'_i = \sqrt{\mathbf{p}_i^2 + m_i^2}$ are the energies of the outgoing particles (with analogous formulas for $E_a$, $E_b$).

Using this definition, we can express the differential cross-section in terms of the invariant matrix element $\mathcal{M}$, just as eq. (3.34) connects the differential cross section to the scattering amplitude. For the decay of a single unstable particle with four-momentum P = (M, 0, 0, 0) into n particles with four-momenta $p_i = (E_i, \mathbf{p}_i)$ we obtain

\[
d\sigma = \frac{(2\pi)^4}{2M}\, |\mathcal{M}(P \to p_1 \dots p_n)|^2\, d\Phi_n, \tag{4.4}
\]

where

\[
d\Phi_n = \delta^{(4)}\Big( P - \sum_{i=1}^{n} p_i \Big) \prod_{i=1}^{n} \delta(p_i^2 - m_i^2)\, d^4p_i
= \delta^{(4)}\Big( P - \sum_{i=1}^{n} p_i \Big) \prod_{i=1}^{n} \frac{d^3p_i}{(2\pi)^3\, 2E_i} \tag{4.5}
\]

is the differential element of the n-body phase space volume. For derivation of this expression we refer the reader to [9, Ch. 4.5] (see, in particular, eqs. (4.79), (4.80), and (4.86) therein). The factor 1/(2M) is introduced by convention; note its similarity to the factors $1/(2E_i)$. With this notation, the interpretation of the formula (4.5) remains the same with respect to crossing. The invariant matrix element resembles the relativistic analog of the scattering amplitude.

4.2 Phase space volume element

An explicit derivation of the phase space volume element $d\Phi_n$ is often a relatively tedious task. We describe only some of the results based on [19] and refer the interested reader to this source and references therein.


The goal is to express the integral

\[
d\Phi_n = \delta^{(4)}\Big( P - \sum_{i=1}^{n} p_i \Big) \prod_{i=1}^{n} \delta(p_i^2 - m_i^2)\, d^4p_i \tag{4.6}
\]

in terms of Lorentz-invariant scalars. We shall use invariant square masses defined as

\[
m_{ij}^2 = (p_i + p_j)^2 = m_i^2 + 2p_i p_j + m_j^2 = m_i^2 + m_j^2 + 2(E_i E_j - \mathbf{p}_i \cdot \mathbf{p}_j), \tag{4.7}
\]

with indices $1 \le i < j \le n$. (Obviously, $m_{ij}^2 = m_{ji}^2$.) The invariant square mass of l particles numbered by 1, ..., l is analogously defined as

\[
m_{1 \dots l}^2 = \Big( \sum_{i=1}^{l} p_i \Big)^2 = \sum_{i=1}^{l} m_i^2 + 2 \sum_{1 \le j < i \le l} p_i p_j. \tag{4.8}
\]

We shall also use the shorthand notation

\[
p_{ij}^2 \equiv p_i p_j = \frac{1}{2} \left( m_{ij}^2 - m_i^2 - m_j^2 \right). \tag{4.9}
\]

The variables $m_{1 \dots l}$, $m_{ij}$, $p_{ij}$ are Lorentz-invariant scalars. Note that $dm_{ij}^2 = 2\,dp_{ij}^2$ due to eq. (4.7). It is convenient to write all coordinates $p_{ij}^2$ in the matrix form

\[
\chi_{p_1 \dots p_n} = \begin{pmatrix}
p_{11}^2 & p_{12}^2 & \dots & p_{1n}^2 \\
p_{21}^2 & p_{22}^2 & \dots & p_{2n}^2 \\
\vdots & \vdots & \ddots & \vdots \\
p_{n1}^2 & p_{n2}^2 & \dots & p_{nn}^2
\end{pmatrix}. \tag{4.10}
\]

Due to the definition of $p_{ij}^2$, the matrix is symmetric and has (n + 1)n/2 independent entries. If particles $p_i$ are on-shell, then $p_{ii}^2 = m_i^2$ and the number of independent entries is reduced to n(n − 1)/2.

Note that the variables $\chi_{p_1 \dots p_n}$ do not contain any information about the relative direction of vector P with respect to final state particles. The choice of these variables is therefore insufficient to describe the case when the parent particle has spin. It is necessary to add some angle variables (such as Euler angles; see below) to the variables $\chi_{p_1 \dots p_n}$.

Our goal is to write $d\Phi_n$ in terms of the $dp_{ij}^2$ and to formulate a criterion to check whether a point $\chi_{p_1 \dots p_n}$ belongs to the phase space in question.

Consider the decay of a (pseudo-)scalar particle with mass M in its energy eigenstate $E_P = \sqrt{\mathbf{p}^2 + M^2}$. It is described by a scalar field in Fock space:

\[
|\mathbf{p}\rangle = a^\dagger(\mathbf{p})\, |0\rangle, \tag{4.11}
\]

where $\mathbf{p}$ is the momentum of the particle in some reference frame F. It is parametrized by a single degree of freedom (e.g. its energy eigenvalue in F).


Four-momentum conservation in F for the decay P → p1 ... pn with known energy $E_P$ and unknown momenta $p_1 \dots p_n$ requires:

\[
P = p_1 + \dots + p_n. \tag{4.12}
\]

The initial particle is assumed to be on shell. We also require that final-state particles are on shell:

\[
p_i^2 = m_i^2, \qquad i = 1 \dots n. \tag{4.13}
\]

Let us count the number of free variables $p_{ij}^2$ that are necessary to satisfy eqs. (4.12) and (4.13): There are 4n variables describing the final particle four-momenta. But 4 quantities are fixed by eq. (4.12), n quantities by eq. (4.13), and 3 quantities by the choice of the overall orientation of all three-momenta. Only $E_P$ (or $|\mathbf{p}|$) is known for the decay; by writing $(E, \mathbf{p})$ we implicitly fixed the orientation of the decay with respect to some coordinate system. These last 3 fixed variables are usually expressed in the form of Euler angles⁵. So, eqs. (4.12) and (4.13) leave 4n − 4 − n = 3n − 4 variables free; the phase space is then a (3n − 4)-dimensional manifold $\Phi_{3n-4} \subseteq \mathbb{R}^{4n}$. By choosing the orientation of the system, we choose a representation of the space $\Phi_{3n-7} = \Phi_{3n-4}/SO(3)$, in total leaving 3n − 4 − 3 = 3n − 7 free variables. Since we consider the quotient space $\Phi_{3n-4}/SO(3)$, the Euler angles do not appear explicitly in formulas of the decay amplitudes.

If the parent particle has spin, there is no SO(3) rotation invariance of the initial state, thus the total number of free variables is 3n − 4, and Euler angles appear explicitly in the decay amplitudes.

It is relatively simple to pick a phase space point in the parent particle rest frame. We may freely choose n − 1 three-momenta $\mathbf{p}_i$, set

\[
\mathbf{p}_n = -\sum_{i=1}^{n-1} \mathbf{p}_i,
\]

set the energies

\[
E_i = \sqrt{\mathbf{p}_i^2 + m_i^2},
\]

⁵ It is well known that a rigid n-body in $\mathbb{R}^3$ may be oriented in any desirable way by means of three consecutive rotations; the angles of these rotations are known as Euler angles α, β, γ. The “rigid body” we are rotating is the parallelotope defined by three-vectors $\mathbf{p}_1, \dots, \mathbf{p}_n$. The condition fixing Euler angles may be written in the form

\[
(e_1\ e_2\ e_3) = \Theta_{x,\gamma}\, \Theta_{z,\beta}\, \Theta_{x,\alpha} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \tag{4.14}
\]

where $e_1, e_2, e_3$ are the basis vectors of the coordinates in which eq. (4.12) is written, and $\Theta_{x,\alpha}$, $\Theta_{z,\beta}$, $\Theta_{x,\gamma}$ are the elementary rotations around the x, z, and (again) x axes by angles α, β, γ, respectively. As an example, if two vectors $\mathbf{p}_i, \mathbf{p}_j$ are not back-to-back, we could choose $e_1 = \mathbf{p}_i/|\mathbf{p}_i|$, $e_2 = (\mathbf{p}_i \times \mathbf{p}_j)/|\mathbf{p}_i \times \mathbf{p}_j|$, $e_3 = e_1 \times e_2$.


    n                        3    4    5    6
    (a) 3n − 7               2    5    8   11
    (b) 3n − 4               5    8   11   14
    (c) n(n − 1)/2           3    6   10   15
    (d) (n² − 7n + 14)/2     1    1    2    4

Table II.2: Dimension of phase space for n-body decay of
(a) spinless parent particle;
(b) spinful parent particle.
Coordinates of the phase space are transformed to:
(c) $\chi_{p_1 \dots p_n} \in \mathbb{R}^{n \times n}$ with n(n − 1)/2 free coordinates after taking all momenta on-shell.
Number of free coordinates of χ that need to be fixed is
(d) n(n − 1)/2 − (3n − 7).

and rescale the momenta according to

\[
\mathbf{p}_i = \mathbf{p}_i\, \frac{\sum_{i=1}^{n} E_i}{M},
\]

to obtain a point $(p_1, \dots, p_n)$ that satisfies all the physical constraints of the problem (on-shell conditions and four-momentum conservation).
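A possible implementation of this recipe is sketched below. It is an illustration under stated assumptions, not the procedure used in the thesis: instead of the one-step rescaling written above, the sketch searches numerically (by bisection) for a common scale factor of all three-momenta such that the energies add up to M, which keeps every particle on shell for nonzero masses. The function names are hypothetical, the initial momenta are drawn from a Gaussian for convenience, and the resulting points are not distributed uniformly in phase space.

```python
import numpy as np


def total_energy(scale, momenta, masses):
    """Sum of on-shell energies after scaling all three-momenta by `scale`."""
    return np.sum(np.sqrt(scale**2 * np.sum(momenta**2, axis=1) + masses**2))


def sample_rest_frame_point(M, masses, rng, tol=1e-12):
    """Return an (n, 4) array of four-momenta (E, px, py, pz) in the parent rest frame."""
    masses = np.asarray(masses, dtype=float)
    if M <= masses.sum():
        raise ValueError("decay is kinematically forbidden")
    n = len(masses)
    momenta = rng.normal(size=(n, 3))
    momenta[-1] = -momenta[:-1].sum(axis=0)  # three-momentum conservation

    # The total energy grows monotonically with the scale: bracket, then bisect.
    lo, hi = 0.0, 1.0
    while total_energy(hi, momenta, masses) < M:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if total_energy(mid, momenta, masses) < M:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol * hi:
            break

    momenta = momenta * (0.5 * (lo + hi))
    energies = np.sqrt(np.sum(momenta**2, axis=1) + masses**2)
    return np.column_stack([energies, momenta])
```

Calling `sample_rest_frame_point(1.8696, [0.13957] * 3, np.random.default_rng(1))` gives one candidate point for a three-pion final state; the parent mass here is an illustrative value in GeV, not a number taken from the text.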

Note that if phase-space points are generated according to this scheme, they will contain unnecessary multiplicities (configurations which are equivalent up to rigid body rotation by Euler angles). To eliminate these multiplicities, it is necessary to satisfy the constraint (4.14).

Overall, to check that a point $(p_1 \dots p_n)$ belongs to the phase space, n + 4 + 3 conditions must be verified.

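Consider the following function describing coordinate transformation. For

\[
w = \begin{pmatrix} p_1 \\ \vdots \\ p_n \end{pmatrix}
= \begin{pmatrix}
p_1^0 & p_1^1 & p_1^2 & p_1^3 \\
\vdots & \vdots & \vdots & \vdots \\
p_n^0 & p_n^1 & p_n^2 & p_n^3
\end{pmatrix}, \tag{4.15}
\]

define the transformation

\[
\xi: \mathbb{R}^{4n} \to \mathbb{R}^{n(n+1)/2} \subset \mathbb{R}^{n \times n}; \qquad
w \mapsto w \cdot w^T = \begin{pmatrix}
p_{11}^2 & p_{12}^2 & \dots & p_{1n}^2 \\
p_{21}^2 & p_{22}^2 & \dots & p_{2n}^2 \\
\vdots & \vdots & \ddots & \vdots \\
p_{n1}^2 & p_{n2}^2 & \dots & p_{nn}^2
\end{pmatrix} = \chi_{p_1 \dots p_n}. \tag{4.16}
\]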


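The map $\xi: \mathbb{R}^{4n} \to \xi(\mathbb{R}^{4n})$ is continuously differentiable and bijective, and it holds that

\[
\frac{d\xi(w)}{dw} = \sqrt{\det\left( D\xi(w)^T D\xi(w) \right)} = \sqrt{\det(4 w^T w)}. \tag{4.17}
\]

This formula corresponds to the substitution formula on manifolds, with $\det(D\xi(w)^T D\xi(w))$ being the Gramian determinant of ξ (see [20]). We can use this to rewrite equation (4.5) with the normalization factor M from eq. (4.4); all non-invariant quantities are written in the parent particle rest frame system:

\[
\begin{aligned}
\frac{d\Phi_n}{M} &= \frac{1}{M}\, \delta^{(4)}\Big( P - \sum_{i=1}^{n} p_i \Big) \prod_{i=1}^{n} \delta(p_i^2 - m_i^2)\, \frac{dw}{d\xi(w)}\, d\xi(w) \\
&\overset{(a)}{=} \frac{1}{M}\, \delta^{(1)}\Big( M - \sum_{i=1}^{n} E_i \Big) \prod_{i=1}^{n} \delta(p_i^2 - m_i^2)\, \frac{\sum_{1 \le i \le j \le n} dp_{ij}}{\sqrt{\det(4 w w^T)}} \\
&\overset{(b)}{=} \delta^{(1)}\Big( M^2 - \sum_{i=1}^{n} M E_i \Big)\, \frac{\sum_{1 \le i < j \le n} dp_{ij}}{\sqrt{\det(4 w w^T)}} \\
&\overset{(c)}{=} \delta^{(1)}\Big( M^2 - \sum_{i=1}^{n} m_i^2 - 2 \sum_{1 \le j < i \le n} p_{ij} \Big)\, \frac{\sum_{1 \le i < j \le n} dp_{ij}}{\sqrt{\det(4 w w^T)}}.
\end{aligned} \tag{4.18}
\]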

In (a), we used eq. (4.17) and evaluated the three-dimensional integral over the first δ function, which fixes the Euler angles. In (b), we used the scaling properties of the δ function and evaluated all the on-shell δ functions. And in (c), we used the fact that

\[
M^2 = P \sum_{i=1}^{n} p_i. \tag{4.19}
\]

Using these transformations, we have rewritten phase space dΦ in terms of Lorentz invariant variables. Note that the resulting formula completely coincides with the analog formula in [19] for n = 4. We conjecture that for n > 4 eq. (4.18) is consistent with the analog equation in [19].

Formula (4.18) has one major drawback compared to (4.6): it is much more complicated to check that a point

\[
\chi = \begin{pmatrix}
m_1^2 & p_{12}^2 & \dots & p_{1n}^2 \\
p_{12}^2 & m_2^2 & \dots & p_{2n}^2 \\
\vdots & \vdots & \ddots & \vdots \\
p_{1n}^2 & p_{2n}^2 & \dots & m_n^2
\end{pmatrix} \in \mathbb{R}^{n(n-1)/2}
\]

belongs to the phase space than to check that $(p_1, \dots, p_n)$ belongs to the phase space. By fixing the diagonal entries of χ we have automatically satisfied the on-shell conditions.

Since the phase space has 3n − 7 dimensions, we still need to fix

\[
\frac{n(n-1)}{2} - (3n - 7) = \frac{n^2 - 7n + 14}{2} \tag{4.20}
\]

variables. We need to check that

i) four-momentum conservation holds,

ii) Euler angle multiplicity is eliminated, and

iii) χ is physical: there exists a vector $w = (p_1 \dots p_n) \in \mathbb{R}^{4n}$ with $p_i^0 > 0$ for all i = 1 ... n such that χ = ξ(w).

For three- or four-body decay the relationships between all entries of χ are fixed by a single equation (see Table II.2). For example, we can choose the following equation:

\[
P^2 = \Big( \sum_{i=1}^{n} p_i \Big)^2, \tag{4.21}
\]

which in the parent-particle rest frame is equivalent to

\[
M^2 = \sum_{i=1}^{n} m_i^2 + 2 \sum_{1 \le j < i \le n} p_i p_j, \tag{4.22}
\]

which is equivalent to the energy conservation appearing in eq. (4.18). It is beneficial to rewrite this equation in terms of invariant square masses. To do so, note that each $p_k$ appears in the last sum of (4.22) exactly n − 1 times. For example, the terms containing $p_1$ are $p_1 p_2, p_1 p_3, \dots, p_1 p_n$. Therefore, by the definition of invariant square mass

\[
(n-1) \sum_i m_i^2 + 2 \sum_{1 \le j < i \le n} p_i p_j = \sum_{1 \le j < i \le n} m_{ij}^2. \tag{4.23}
\]

(For ease of notation, the variables i, j are assumed to be in the set 1, ..., n unless explicitly stated otherwise.) Inserting this into eq. (4.22), we obtain the energy conservation relation

\[
M^2 + (n-2) \sum_i m_i^2 = \sum_{1 \le j < i \le n} m_{ij}^2. \tag{4.24}
\]
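Relation (4.24) is easy to verify numerically for any set of on-shell four-momenta, with $M^2$ taken as the square of their sum. The sketch below assumes the metric signature (+, −, −, −); the helper names are hypothetical.

```python
import itertools

import numpy as np

# Assumed Minkowski metric with signature (+, -, -, -).
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def minkowski_dot(p, q):
    """Lorentz-invariant product of two four-vectors (E, px, py, pz)."""
    return p @ METRIC @ q


def invariant_mass_sum_rule(four_momenta):
    """Return (left-hand side, right-hand side) of eq. (4.24)."""
    p = np.asarray(four_momenta, dtype=float)
    n = len(p)
    total = p.sum(axis=0)
    parent_mass_sq = minkowski_dot(total, total)
    mass_sq = np.array([minkowski_dot(pi, pi) for pi in p])
    lhs = parent_mass_sq + (n - 2) * mass_sq.sum()
    # Sum of all pairwise invariant square masses m_ij^2 = (p_i + p_j)^2.
    rhs = sum(
        minkowski_dot(p[i] + p[j], p[i] + p[j])
        for i, j in itertools.combinations(range(n), 2)
    )
    return lhs, rhs
```

The two returned numbers agree for any n ≥ 2, since the identity only relies on eq. (4.7) and on P being the sum of the final-state momenta.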

In higher dimensions, further relationships between entries of χ must be invoked. An example is the equation

\[
(p_k + p_l)^2 = \Big( P - \sum_{i \neq k,l} p_i \Big)^2, \tag{4.25}
\]

which can be resolved in the parent particle rest frame to

\[
\begin{aligned}
m_k^2 + m_l^2 + 2 p_k p_l &= M^2 + \sum_{i \neq k,l} m_i^2 - 2 \sum_{i \neq k,l} \sum_j p_j p_i + 2 \sum_{i \neq k,l}\ \sum_{j < i,\ j \neq k,l} p_j p_i \\
&= M^2 + \sum_{i \neq k,l} m_i^2 - \sum_{i \neq k,l} (p_k p_i + p_l p_i),
\end{aligned}
\]

or, in terms of χ,

\[
\sum_{i \neq k} p_{ki}^2 + \sum_{i \neq l} p_{li}^2 = M^2 - \sum_{i \neq k,l} m_i^2 - m_k^2 - m_l^2. \tag{4.26}
\]

Of course, there are other relationships that may be used; the most common are products of different four-vector sums.

The question whether the point belongs to the phase space is closely related to the properties of the matrix χ. It is possible to show [19] that χ is physical (in the sense defined above) if and only if χ has one positive, three negative, and n − 4 zero eigenvalues. The rank of the physical matrix χ does not exceed four (see definition (4.16)). The fact that, of the four remaining eigenvalues, one must be positive and three must be negative is connected to the structure of Minkowski space. If one of these four eigenvalues is equal to zero, χ will lie on the boundary of the range of ξ.
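The eigenvalue statement can be illustrated numerically. The sketch below assumes that the product in eq. (4.16) is taken with the Minkowski metric diag(1, −1, −1, −1), so that $\chi_{ij} = p_i p_j$; the function names and the relative tolerance used to decide which eigenvalues count as zero are choices made for this example.

```python
import numpy as np

# Assumed Minkowski metric with signature (+, -, -, -).
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def gram_matrix(four_momenta):
    """chi_ij = p_i . p_j for an (n, 4) array of four-momenta (E, px, py, pz)."""
    w = np.asarray(four_momenta, dtype=float)
    return w @ METRIC @ w.T


def eigenvalue_signature(chi, rel_tol=1e-9):
    """Count (positive, negative, zero) eigenvalues of the symmetric matrix chi."""
    eig = np.linalg.eigvalsh(chi)
    scale = np.max(np.abs(eig))
    positive = int(np.sum(eig > rel_tol * scale))
    negative = int(np.sum(eig < -rel_tol * scale))
    return positive, negative, len(eig) - positive - negative
```

For n generic on-shell momenta with n ≥ 4 the function returns (1, 3, n − 4), in line with the criterion quoted from [19].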

This eigenvalue condition may be reformulated in the following form:

\[
\Delta_i > 0 \ \text{for } i \in \{1, 2, 3, 4\}; \qquad \Delta_i = 0 \ \text{for } i \in \{5, \dots, n\}; \tag{4.27}
\]

where $\Delta_{n-i}$ is the i-th coefficient in the characteristic polynomial of χ, that is, the coefficient before $\lambda^i$, where λ is the variable of the characteristic polynomial. Explicitly, these coefficients are determined by the conditions

\[
\Delta_l = (-1)^{l-1} \sum \det \text{ of all } (l \times l) \text{ diagonal minors of } \chi. \tag{4.28}
\]

In particular,

\[
\begin{aligned}
\Delta_1 &= \sum_{i=1}^{n} m_i^2, \\
\Delta_2 &= (-1)^1 \sum_{1 \le i < j \le n} \left( m_i^2 m_j^2 - p_{ij}^2\, p_{ij}^2 \right), \\
\Delta_n &= (-1)^{n-1} \det \chi.
\end{aligned} \tag{4.29}
\]

For proof of eq. (4.27), see [19].

The conditions on $\Delta_1$ and $\Delta_2$ are valid for any physical χ = ξ(w). Indeed, $\Delta_1 > 0$ holds since all masses are positive, and the condition $\Delta_2 > 0$ follows⁶ from the equivalence

\[
(-1)\left( m_i^2 m_j^2 - p_{ij}^2\, p_{ij}^2 \right) > 0 \ \Leftrightarrow\ 2 m_i m_j < 2 p_i p_j \ \Leftrightarrow\ (m_i + m_j)^2 < (p_i + p_j)^2 = m_{ij}^2. \tag{4.30}
\]

⁶ An alternative way to check $\Delta_2 > 0$ is to use the equality

\[
p_{ij}^2\, p_{ij}^2 - m_i^2 m_j^2 = m_{ij}^2\, q_{(ij) \to ij}^2,
\]

where $q_{(ij) \to ij} > 0$ is the breakup momentum of the system (ij) into i and j that is defined by eq. (4.37).


The latter follows for any particle four-momenta $p_i$ and $p_j$ from the definition of $m_{ij}^2$ in the ij-rest frame system:

\[
m_{ij}^2 = (p_i + p_j)^2 = (E_i + E_j)^2 > (m_i + m_j)^2. \tag{4.31}
\]

The inequalities for $\Delta_3, \dots, \Delta_n$ become increasingly complicated, but their interpretation remains similar: they relate together all possible momenta combinations and ensure the resulting constraints for corresponding combinations of $p_{ij}^2$.

Condition (4.27) allows us to check whether a point χ is physical; in addition, we still need to ensure energy conservation by, for example, ensuring that eq. (4.22) is valid. The condition on the Euler angles is usually fixed intrinsically by the choice of 3n − 7 free variables. From a practical point of view, eq. (4.27) has one major drawback: to check it, it is necessary to calculate all entries of χ, while in practice calculations are performed only with 3n − 7 free variables (entries of χ or functions thereof).

4.3 Some specific cases

4.3.1 Two-body decay

For two-body decay, the above discussion yields trivial results. The invariant square mass of the decay particles is constant: $m_{12}^2 = M^2$. For spinful parent particles, the total number of variables is 3n − 4 = 2. For scalar parent particles, the total number of variables is 3n − 6 = 0, since the third Euler angle has no meaning for a back-to-back decay. Therefore, one must work directly with eq. (4.5) and evaluate the integral over the $\delta^{(4)}$ function.

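Let $\mathbf{p}_1$ and $\mathbf{p}_2$ be the particle momenta in the parent-particle rest frame; integrating eq. (4.4) over the components of $\mathbf{p}_2$ and setting $\mathbf{p}_2 = -\mathbf{p}_1$ yields

\[
d\sigma = \frac{(2\pi)^4}{2M}\, |\mathcal{M}|^2\, \delta^{(1)}(M - E_1 - E_2)\, \frac{d|\mathbf{p}_1|\, \mathbf{p}_1^2\, d\Omega}{(2\pi)^3\, 2E_1\, (2\pi)^3\, 2E_2}, \tag{4.32}
\]

where $E_1 = \sqrt{|\mathbf{p}_1|^2 + m_1^2}$ and $E_2 = \sqrt{|\mathbf{p}_1|^2 + m_2^2}$. To perform the remaining integration, we use the following property of the δ function:

\[
\delta(f(x)) = \sum_i \frac{\delta(x - x_i)}{|f'(x_i)|},
\]

where the summation is performed over all zeros $x_i$ of f, assuming non-vanishing derivative at these points; in our case, the integration over $|\mathbf{p}_1|$ replaces $\delta(M - E_1(|\mathbf{p}_1|) - E_2(|\mathbf{p}_1|))$ by the factor

\[
\left( \frac{|\mathbf{p}_1|}{E_1} + \frac{|\mathbf{p}_1|}{E_2} \right)^{-1} = \frac{E_1 E_2}{M |\mathbf{p}_1|}.
\]

Inserting this into eq. (4.32), we obtain the following expression for the differential cross-section:

\[
d\sigma = \frac{|\mathbf{p}_1|}{32\pi^2 M^2}\, |\mathcal{M}|^2\, d\Omega, \tag{4.33}
\]

where dΩ = d cos θ dφ is the solid angle of particle 1. The factor

\[
\frac{(2\pi)^{-4}}{16}\, \frac{|\mathbf{p}_1|}{M} \tag{4.34}
\]

that up to a constant appears in eq. (4.33) is known as the phase space volume factor.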

For a general process R → ab, the magnitude of the momentum of a daughter particle in the rest frame of R is called the breakup momentum:

\[
q_{R \to ab}(m_{ab}^2, m_a^2, m_b^2) = |\mathbf{p}_a| = |\mathbf{p}_b|. \tag{4.35}
\]

In this reference frame, four-momentum conservation may be written in the form

\[
p_a + p_b = (m_{ab}, 0, 0, 0).
\]

Therefore,

\[
m_{ab} E_a = (p_a + p_b)\, p_a = p_a^2 + p_a p_b = m_a^2 + \frac{m_{ab}^2 - m_a^2 - m_b^2}{2},
\]

from which follows

\[
E_a = \frac{m_{ab}^2 + m_a^2 - m_b^2}{2 m_{ab}}, \quad \text{and} \tag{4.36}
\]

\[
q_{R \to ab}^2(m_{ab}^2, m_a^2, m_b^2) = E_a^2 - m_a^2
= \frac{\left( m_{ab}^2 - (m_a + m_b)^2 \right) \left( m_{ab}^2 - (m_a - m_b)^2 \right)}{4 m_{ab}^2}. \tag{4.37}
\]
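Equations (4.36) and (4.37) translate directly into code. The following sketch takes squared masses as arguments, mirroring the argument list of $q_{R \to ab}$; the function names are hypothetical.

```python
import math


def daughter_energy(m2_ab, m2_a, m2_b):
    """Energy of particle a in the rest frame of the (ab) system, eq. (4.36)."""
    return (m2_ab + m2_a - m2_b) / (2.0 * math.sqrt(m2_ab))


def breakup_momentum_squared(m2_ab, m2_a, m2_b):
    """Squared breakup momentum from the factorized form of eq. (4.37)."""
    m_a, m_b = math.sqrt(m2_a), math.sqrt(m2_b)
    return (m2_ab - (m_a + m_b) ** 2) * (m2_ab - (m_a - m_b) ** 2) / (4.0 * m2_ab)
```

The result is negative below the threshold $m_{ab} < m_a + m_b$, so a caller that needs q itself should decide explicitly how to treat that region.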

For nonrelativistic cases for light final state particles, the breakup momentum and energy of the resonance are related by

\[
q_{R \to ab} = E_R/2. \tag{4.38}
\]

For three-body decay to abc, the energy $E_c$ of the particle c in the ab-rest frame may be calculated in a similar way:

\[
m_{ab} E_c = (p_a + p_b)\, p_c = p_a p_c + p_b p_c = \frac{1}{2} \left( m_{ac}^2 - m_a^2 - m_c^2 + m_{bc}^2 - m_b^2 - m_c^2 \right),
\]

and

\[
|\mathbf{p}_c|^2 = E_c^2 - m_c^2. \tag{4.39}
\]

This energy is necessary to determine spin-relativistic corrections to Legendre polynomials in Section 5.2.


With ħ = 1, the breakup momentum is equal to the wave vector k that we used in the Breit-Wigner formula given by eqs. (3.55) and (3.57) and in the Blatt-Weisskopf functions, eq. (3.31). In particular, in an isobar decay process

\[
R_n = P \to R_{n-1}\, p_n \to R_{n-2}\, p_{n-1}\, p_n \to \dots \to p_1 \dots p_n
\]

it is necessary to calculate all breakup momenta

\[
q^2_{R_{i+1} \to R_i p_{i+1}}\left( m^2_{p_1 \dots p_{i+1}},\ m^2_{p_1 \dots p_i},\ m^2_{p_{i+1}} \right)
\]

to describe amplitudes of resonances $R_i$, i = 1 ... n − 1.

4.3.2 Phase space of three-body decay

For three-body decay,

\[
\chi = \begin{pmatrix}
m_1^2 & p_{12}^2 & p_{13}^2 \\
p_{12}^2 & m_2^2 & p_{23}^2 \\
p_{13}^2 & p_{23}^2 & m_3^2
\end{pmatrix}.
\]

The total number of free variables is 3n − 7 = 2. A common choice is

\[
\begin{aligned}
m_{12}^2 &= (p_1 + p_2)^2 = m_1^2 + m_2^2 + 2 p_{12}^2; \\
m_{23}^2 &= (p_2 + p_3)^2 = m_2^2 + m_3^2 + 2 p_{23}^2.
\end{aligned} \tag{4.40}
\]

These variables may be constrained in an explicit way (rather than resorting to conditions (4.27)). First, fix the range of $m_{12}^2$:

\[
(m_1 + m_2)^2 \le m_{12}^2 \le (M - m_3)^2. \tag{4.41}
\]

The first inequality is the same as eq. (4.30); the second follows from energy conservation in the rest frame of the decaying particle. The value of $m_{23}^2$ for fixed $m_{12}^2$ is maximal when momenta of particles 2 and 3 are antiparallel and minimal when their momenta are parallel:

\[
(E_2^* + E_3^*)^2 - (|\mathbf{p}_2^*| + |\mathbf{p}_3^*|)^2 \le m_{23}^2 \le (E_2^* + E_3^*)^2 - (|\mathbf{p}_2^*| - |\mathbf{p}_3^*|)^2. \tag{4.42}
\]

Here, particle energies and momenta are written in the 1, 2-rest frame and can be calculated as follows:

\[
\begin{aligned}
m_{12} E_2^* &= (p_1 + p_2)\, p_2 = p_1 p_2 + m_2^2 = \frac{m_{12}^2 - m_1^2 - m_2^2}{2} + m_2^2 = \frac{m_{12}^2 - m_1^2 + m_2^2}{2}; \\
m_{12} E_3^* &= (p_1 + p_2)\, p_3 = p_1 p_3 + p_2 p_3 \\
&= \frac{1}{2} \left( m_{13}^2 - m_1^2 - m_3^2 + m_{23}^2 - m_2^2 - m_3^2 \right) = \frac{M^2 - m_{12}^2 - m_3^2}{2},
\end{aligned} \tag{4.43}
\]

where we have used the energy conservation relation eq. (4.24) for n = 3-body decay; absolute values of momenta are determined by the on-shell relation $|\mathbf{p}_i^*| = \sqrt{(E_i^*)^2 - m_i^2}$.

A scatter plot in coordinates $(m_{12}^2, m_{23}^2)$ is called a Dalitz plot. Fig. II.6 illustrates phase space plotted using these coordinates. Examples of some amplitudes presented as Dalitz plots are presented in Fig. IV.7 and Fig. IV.8.
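The boundary of the Dalitz plot follows from eqs. (4.41) to (4.43). The sketch below returns the allowed range of $m_{23}^2$ for a given $m_{12}^2$; it uses the on-shell relation $|\mathbf{p}_i^*|^2 = (E_i^*)^2 - m_i^2$ for the momenta in the 1, 2-rest frame. The function name is hypothetical, and the caller is assumed to pass a value of $m_{12}^2$ inside the range (4.41).

```python
import math


def m23_squared_limits(m12_sq, M, m1, m2, m3):
    """Lower and upper limit of m23^2 at fixed m12^2, eq. (4.42).

    m12_sq is assumed to lie in [(m1 + m2)^2, (M - m3)^2], eq. (4.41).
    """
    m12 = math.sqrt(m12_sq)
    # Energies of particles 2 and 3 in the (1, 2) rest frame, eq. (4.43).
    e2 = (m12_sq - m1**2 + m2**2) / (2.0 * m12)
    e3 = (M**2 - m12_sq - m3**2) / (2.0 * m12)
    # max(..., 0) guards against tiny negative values from rounding at the edges.
    p2 = math.sqrt(max(e2**2 - m2**2, 0.0))
    p3 = math.sqrt(max(e3**2 - m3**2, 0.0))
    lower = (e2 + e3) ** 2 - (p2 + p3) ** 2
    upper = (e2 + e3) ** 2 - (p2 - p3) ** 2
    return lower, upper
```

Scanning $m_{12}^2$ over the range (4.41) and plotting both limits traces the outline of the region shown in Fig. II.6.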


Figure II.6: Phase space of three-body decay plotted in invariant square mass coordinates.

5 Relativistic corrections and multiple decay channels

The decay amplitude given by eqs. (3.55) and (3.57) approximates the decay only for a single decay channel in nonrelativistic coordinates. In this section we discuss possible improvements on this approximation.

5.1 Relativistic corrections to dynamical factors

Consider the single-channel decay R → ab. There are three corrections that are commonly applied to the Breit-Wigner approximation. Two apply to the decay rate in the denominator and numerator of eq. (3.55). The third correction changes from a dependence on energy to dependence on invariant square mass for the two-body decay and is presented in Section 5.4 for didactical reasons.

In the spirit of eq. (3.48), we can rewrite the decay rate as

Γ(q) = The probability that the particle decays in unit time

= P (Virtual states a and b are realized “inside the resonance”)×P (Particles a and b overcome the Blatt-Weisskopf barrier)×The phase space volume of the two-body decay. (5.1)

The first factor is the width that we have used before; in this generalized context it is sometimes known as the partial or reduced width, $\Gamma_r$. It is determined by short-range interactions in the resonance. The second factor is the transmission coefficient defined in eq. (3.31). The third component is the phase space volume from eq. (4.34), normalized to be 1 at the energy $E_0$ where the resonance peaks. Overall, this leads to

$$
\Gamma(q) = \Gamma_r\, \frac{(qR)^{2l}}{(q_0 R)^{2l}}\, B'_l(kR, k_0 R)^2\, \frac{q/M}{q_0/M_0}
\tag{5.2}
$$

where $q = q_{R\to ab}(m_{ab}^2, m_a^2, m_b^2)$ is the breakup momentum of the particle R given by eqs. (4.35) and (4.37); $M = m_{ab}$ is the invariant mass of the resonance; $M_0 = E_0$ is the model-dependent mass of the resonance (by $E_0$ we mean the real part of the complex energy eigenvalue of the resonance; cf. eqs. (3.55) and (3.57)); and $q_0 = q_{M_0\to ab}(M_0^2, m_a^2, m_b^2)$ is the model-dependent breakup momentum.
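A minimal sketch of the mass-dependent width of eq. (5.2) follows. Eqs. (3.31) and (4.37) are not reproduced in this section, so two ingredients are assumptions: the breakup momentum is taken in its standard two-body form, and the normalized barrier ratios $B'_l$ are taken in the commonly used Blatt-Weisskopf form for $l = 0, 1, 2$. All function and parameter names are hypothetical.

```python
import math


def breakup_momentum(m_sq, ma, mb):
    """Standard two-body breakup momentum (assumed form of eq. (4.37))."""
    lam = (m_sq - (ma + mb) ** 2) * (m_sq - (ma - mb) ** 2)
    if lam <= 0.0:
        return 0.0  # below threshold there is no real breakup momentum
    return math.sqrt(lam) / (2.0 * math.sqrt(m_sq))


def barrier_ratio_sq(l, q, q0, radius):
    """Squared normalized barrier factor B'_l(qR, q0 R)^2 (assumed standard form)."""
    z, z0 = (q * radius) ** 2, (q0 * radius) ** 2
    if l == 0:
        return 1.0
    if l == 1:
        return (1.0 + z0) / (1.0 + z)
    if l == 2:
        return (9.0 + 3.0 * z0 + z0**2) / (9.0 + 3.0 * z + z**2)
    raise ValueError("only l = 0, 1, 2 are implemented in this sketch")


def running_width(m, m0, gamma_r, ma, mb, l, radius):
    """Mass-dependent width following the structure of eq. (5.2)."""
    q = breakup_momentum(m * m, ma, mb)
    q0 = breakup_momentum(m0 * m0, ma, mb)
    centrifugal = (q / q0) ** (2 * l)
    phase_space = (q / m) / (q0 / m0)
    return gamma_r * centrifugal * barrier_ratio_sq(l, q, q0, radius) * phase_space


# Illustrative rho(770)-like inputs in GeV and GeV^-1.
M_PI = 0.13957
print(running_width(0.770, 0.770, 0.149, M_PI, M_PI, 1, 5.0))
print(running_width(0.900, 0.770, 0.149, M_PI, M_PI, 1, 5.0))
```

By construction the width equals $\Gamma_r$ at $M = M_0$ and vanishes below the two-body threshold, which is the normalization described in the text.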

This correction requires us to replace the width in eqs. (3.55) and (3.57) by the expression (5.2). Note that the width in the numerator of (3.57) is altered differently due to the fact that actual decays are multichanneled.

The Breit-Wigner formula (3.57) is part of the solution (3.4); as such, it may be multiplied by any function of $k = q_{R\to ab}$. Mathematically, this corresponds to another choice of the coefficients $a_l$ in eq. (3.36), which were chosen to satisfy the definition of the scattering amplitude (3.33).

It is common to perform the following heuristic modification of the Breit-Wigner partial amplitude with orbital angular momentum $L_{ab}$:

$$
f_{L_{ab},\mathrm{mod}}(R \to ab) = f_{L_{ab}}(q, \theta)\,\sqrt{P(a \text{ and } b \text{ leave the system})} = f_{L_{ab}}(q, \theta)\, B_{L_{ab}}(q).
\tag{5.3}
$$

The second factor is the Blatt-Weisskopf factor defined in (3.31) and corresponds to the probability amplitude for the particles a and b to overcome the barrier set by the orbital angular momentum $L_{ab}$. In terms of quantum field theory, this correction corresponds to the normalized vertex function. In particular, if the resonance R occurs as an intermediate step in the larger process $P \to Rc \to abc$ (which can, due to crossing, be thought of as the scattering process $P\bar{c} \to R \to ab$, where $\bar{c}$ denotes the antiparticle of c), the amplitude function must be modified to

$$
f_{L_{ab},\mathrm{mod}}(P \to Rc \to ab) = f_{L_{ab}}(q, \theta)\, B_{L_{Pc}}(q)\, B_{L_{ab}}(q).
\tag{5.4}
$$

Due to crossing, the orbital angular momentum $L_{Pc}$ in the scattering Pc is equal to the orbital angular momentum $L_{Rc}$ in the decay of P. Therefore, the modification $B_{L_{Pc}}(q)$ accounts for the probability of R and c overcoming the kinematic potential of the parent particle.

In Section 5.3 we will show that for multichannel decay, the total width appearing in the numerator of (3.57) can be approximately replaced (by imposing time-reversal symmetry and unitarity) by $\sqrt{\Gamma_a \Gamma_b}$, the square root of the product of the partial widths of channels a and b. The modification in eq. (5.4) is completely analogous⁷ to eq. (5.2), with probabilistic changes applied mutatis mutandis to the partial widths. In other words, modifications (5.2) and (5.4) have the same underlying probabilistic argument (5.1): they alter the width in the numerator and in the denominator of eq. (3.55) in the same way, but the width in the numerator must be replaced in a multichannel decay to satisfy unitarity.

Note that Blatt-Weisskopf factors are real-valued only in zeroth-order approximation. This is consistent with the constant-phase approximation in eq. (3.55) that represents (as a nonrelativistic approximation) the normalized vertex function (cf. Section 5.3).

5.2 Generalized angular functions

To introduce spin interactions to the scattering amplitude, it is necessary to generalize the factors $P_l(\cos\theta)$ in eq. (3.55) to a function Z that is called the angular distribution or spin part of the amplitude. There are two common methods to make this generalization: the Zemach formalism and the helicity formalism. Both are discussed in [21]; a more recent description of the helicity formalism is formulated in [22].

In the Zemach formalism, the amplitude is written down in terms of three-dimensional spin tensors, each in the rest frame of the decaying resonance. The amplitude is noncovariant. In the helicity formalism, the amplitude is formulated as a sum over the helicity eigenstates.

Both formalisms yield identical descriptions for the case when all final state particles are spinless. Consider the decay

$$
\begin{aligned}
P &\to R + c, \\
R &\to a + b.
\end{aligned}
\tag{5.5}
$$

Let J be the spin of the parent particle; a, b, c are spinless. The explicit forms of the angular distributions for some cases are given in Table II.3⁸.

The angular distributions are written in terms of the angle θ between the three-momenta of a and c and the ratio z between the modulus of the momentum of c and the energy of the resonance:

$$
\cos\theta \equiv \frac{p_a p_c}{|p_a||p_c|}, \qquad z \equiv \frac{|p_c|}{m_{ab}}.
\tag{5.6}
$$

The values $|p_a|$ and $|p_c|$ are defined by eqs. (4.37) and (4.39), and an expression for $p_a p_c$ follows from the definition of invariant square mass:

$$
p_a p_c = \frac{1}{2}\left(m_{ac}^2 - m_a^2 - m_c^2\right).
\tag{5.7}
$$

⁷ Up to the normalization factor $c = B_l(q_0 R)/(q_0 R)^l$.

⁸ This table is taken from [23]; its extension may be found in [21] with the same notation for z and θ as presented in this thesis.


J → L_Rc + L_ab    Angular distribution Z(θ, z)
0 → 0 + 0          $1$
0 → 1 + 1          $(1 + z^2)\cos^2\theta$
0 → 2 + 2          $\left(z^2 + \tfrac{3}{2}\right)^2 \left(\cos^2\theta - 1/3\right)^2$

Table II.3: Angular distributions of the decay P → abc. Variable θ is the angle between particles a and c in the rest frame of the resonance R, and z is the ratio between the modulus of the momentum of the bachelor particle and the total energy.

Compare Table II.3 with eq. (3.6): the parts of Z that depend on θ are (up to a constant factor) squared Legendre polynomials $P_{L_{Rc}}$ under the given restrictions on P, a, b, c. The necessity of taking the square of Legendre polynomials is connected to the fact that we consider two consecutive decays P → Rc, R → ab instead of one (as discussed previously). The factor $\sqrt{1 + z^2} = E_P/m_{ab}$ is the quotient of parent and resonance energies (in the ab-rest frame); it may be interpreted as a relativistic correction to the angular distribution.
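The entries of Table II.3 are simple enough to tabulate directly. The sketch below assumes spinless P, a, b, c as in the table; the function names are hypothetical, and the Legendre polynomial is included only to illustrate the proportionality noted above for the case 0 → 2 + 2.

```python
def zemach_angular_distribution(l, cos_theta, z):
    """Angular distribution Z(theta, z) from Table II.3 for 0 -> l + l."""
    if l == 0:
        return 1.0
    if l == 1:
        return (1.0 + z * z) * cos_theta**2
    if l == 2:
        return (z * z + 1.5) ** 2 * (cos_theta**2 - 1.0 / 3.0) ** 2
    raise ValueError("Table II.3 lists only l = 0, 1, 2")


def legendre_p2(x):
    """Second Legendre polynomial P_2(x) = (3 x^2 - 1) / 2."""
    return 0.5 * (3.0 * x * x - 1.0)


# For l = 2 the theta-dependent part equals (4/9) * P_2(cos theta)^2.
c = 0.3
theta_part = (c * c - 1.0 / 3.0) ** 2
print(theta_part, 4.0 / 9.0 * legendre_p2(c) ** 2)
```

The ratio z enters only through the prefactors, so setting z = 0 recovers the purely angular dependence.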

5.3 Multiple channel Breit-Wigner formula

Our goal now is to generalize the formulas (3.55) and (3.57) to the case when the decay happens through multiple channels: $R \to a b_i$. A possible interpretation of this process in the decay context is the following: each decay channel has a partial width $\Gamma_i$ (corresponding to the probability of R decaying to a and $b_i$). The goal is to determine how the $\Gamma_i$ alter formula (3.55).

Crucially, total probability must be conserved. This condition is usually formulated in terms of the scattering matrix. Therefore, we proceed as follows: i) define the generalization of the scattering amplitude (3.33) for the multiple channel case; ii) recall the definition of the scattering matrix; iii) establish the relationship between scattering amplitude and scattering matrix, and iv) formulate the unitarity condition in terms of the amplitude and use this to generalize (3.55) to the case of multiple channels.

5.3.1 Scattering amplitude for multiple channels

As mentioned in Section 3.4, it is simpler to discuss the matter in the context of scattering rather than particle decay. Consider particle a scattered by the central symmetric potential U(r). There is a probability that a is absorbed by the potential and particle $b_i$ is emitted. (We also refer to a, $b_i$ as channels.) We write the wave corresponding to a as

$$
\psi_a(k_a, \theta, r) = e^{i k_a z} + f_{aa}(\theta)\,\frac{e^{i k_a r}}{r}.
\tag{5.8}
$$

Analogously to (3.33), $k_a$ is the wave vector of the incoming wave, which propagates in the direction $z = r\cos\theta$. The first summand corresponds to the incoming wave, the second corresponds to the elastically scattered wave. The cross section for elastic scattering in channel a is similar to eq. (3.34):

$$
d\sigma_{aa} = |f_{aa}|^2\, d\Omega.
\tag{5.9}
$$

The wave corresponding to the other particle $b \in \{b_i\}$ is

$$
\psi_b(k_b, \theta, r) = f_{ab}(\theta)\,\sqrt{\frac{m_a}{m_b}}\,\frac{e^{i k_b r}}{r}
\tag{5.10}
$$

with the wave vector $k_b$ corresponding to b, and $m_a$ and $m_b$ the particle masses. Wave vectors are written in the system where U(r) is at rest (in terms of particle decay: in the resonance rest frame system). The factor $\sqrt{m_a/m_b}$ is introduced to yield a more convenient form of the cross section in eq. (5.11). The cross section for inelastic scattering into channel b is given by the probability current of $\psi_b$ in $d\Omega$ normalized by the incoming probability current:

$$
d\sigma_{ab} = \frac{v_b |\psi_b|^2}{v_a |e^{i k_a r}|^2}\, r^2\, d\Omega = |f_{ab}|^2\, \frac{p_b}{p_a}\, d\Omega.
\tag{5.11}
$$

Here, $v_a$ and $v_b$ are the group velocities of the corresponding wave packets.

For ease of future reference, the wavefunctions for a and b may be generalized as

$$
\psi_c = \delta_{ac}\, e^{i k_c z} + f_{ac}(\theta)\,\sqrt{\frac{m_a}{m_c}}\,\frac{e^{i k_c r}}{r},
\tag{5.12}
$$

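with index c standing for a or $b_i$. The generalization of eq. (3.38) may be derived mutatis mutandis and has the form

$$
f_{ac}(\theta) = \frac{1}{2i\sqrt{k_a k_c}} \sum_{l=0}^{\infty} (2l + 1)\left(e^{i(\delta_l^{(a)} + \delta_l^{(c)})} - \delta_{ac}\right) P_l(\cos\theta).
\tag{5.13}
$$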

The partial scattering amplitude has the form

$$
f_{ac}^{(l)}(\theta) = \frac{1}{2i\sqrt{k_a k_c}}\left(e^{i(\delta_l^{(a)} + \delta_l^{(c)})} - \delta_{ac}\right),
\tag{5.14}
$$

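and the generalization of eq. (3.57) has the form

$$
f_{ac}^{(l)} = \frac{1}{2ik_a}\left(e^{2i\delta_l^{(a)}} - 1\right)\delta_{ac} + \frac{1}{2\sqrt{k_a k_c}}\, e^{i(\delta_l^{(a)} + \delta_l^{(c)})}\, \frac{\Gamma M_{ac}^{(l)}}{E - E_0 + i\Gamma/2},
\tag{5.15}
$$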

where the coefficients $M_{ac}^{(l)}$ relate different channels to each other (in later discussion they will be set to satisfy unitarity) and $\delta_{ac}$ is the Kronecker delta.

The phases $\delta_l^{(a)}$, $\delta_l^{(c)}$ are constants (cf. derivation of eq. (3.55)); these phases correspond to the normalized vertex functions in a quantum field theory approach. The first summand is the amplitude for elastic scattering and does not appear in resonance processes. In resonant decay, the partial scattering amplitude is given by

$$
f_{ac}^{(l)} = \frac{1}{2\sqrt{k_a k_c}}\, \frac{\Gamma M_{ac}}{E - E_0 + i\Gamma/2}.
\tag{5.16}
$$

We use renormalized constants $M_{ac} = M_{ac}^{(l)}$ for ease of notation, which absorb all information about the coupling of different channels. For more refined approximations, $\delta_l^{(a)}$, $\delta_l^{(c)}$ and $M_{ac}$ are all functions of $k_a$, $k_c$, or both.

5.3.2 Scattering matrix

It is common [4] to use the following basis to describe amplitudes:

$$
|\mathbf{k}\rangle = \sqrt{2E_k}\, e^{i\mathbf{k}\cdot\mathbf{x}},
\tag{5.17}
$$

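with the normalization

$$
\langle \mathbf{k}'|\mathbf{k}\rangle = \int d^3x\, \sqrt{2E_k\, 2E_{k'}}\; e^{i(\mathbf{k} - \mathbf{k}')\cdot\mathbf{x}} = 2E_k (2\pi)^3 \delta(\mathbf{k} - \mathbf{k}').
\tag{5.18}
$$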

Any wavepacket $|\phi\rangle$ may be written in the form

$$
|\phi\rangle = \int \frac{d^3k}{(2\pi)^3}\, \frac{1}{\sqrt{2E_k}}\, \phi(\mathbf{k})\, |\mathbf{k}\rangle,
\tag{5.19}
$$

where $\phi(\mathbf{k})$ is the Fourier transform of the spatial particle distribution and the factor $\sqrt{2E_k}$ is introduced to ensure the normalization $\langle\phi|\phi\rangle = 1$ for any normalized states with

$$
\int \frac{d^3k}{(2\pi)^3}\, |\phi(\mathbf{k})|^2 = 1.
\tag{5.20}
$$

Consider a system of two particles described by wave vectors $\mathbf{k}_1$ and $\mathbf{k}_2$ at time T before scattering and n particles described by wave vectors $\mathbf{k}'_1, \ldots, \mathbf{k}'_n$ after scattering. The eigenvalues $S_{ba}$ of the S matrix describe the probability that initial state a is scattered into the state b:

$$
\langle k_b|S|k_a\rangle = \langle k_b|S_{ba}|k_a\rangle.
\tag{5.21}
$$

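Assume that for any incoming state a, the outgoing states $|b\rangle = S|a\rangle$ are properly normalized, from which it follows that diagonal entries of S must be equal to 1:

$$
1 = \langle b|b\rangle = \langle a|S^\dagger S|a\rangle = \sum_c \langle a|S^\dagger|c\rangle\langle c|S|a\rangle = \sum_c \langle a|c\rangle\langle c|a\rangle\, S^\dagger_{ac} S_{ca} = |S_{aa}|^2.
\tag{5.22}
$$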


Analogously, for two states a and b

$$
\langle b|S^\dagger S|a\rangle = \delta_{ab}|S_{aa}|^2 = \langle b|I|a\rangle,
\tag{5.23}
$$

which means that S is unitary.

The scattering operator itself is independent from the coordinate choice (5.17), but its eigenvalues $S_{ab}$ depend on the choice of states a and b. Equation (5.12) is easier to write down in terms of incoming and outgoing angle-independent spherical waves $|k\rangle_s$ rather than in terms of the plane waves $|\mathbf{k}\rangle$ defined by eq. (5.17). We define spherical wave kets by

$$
\langle x|k\rangle_s = \langle r|k\rangle_s \times \langle\Omega|k\rangle_s = \frac{e^{ikr}}{r}\,\langle\Omega|k\rangle_s = \frac{e^{ikr}}{r},
\tag{5.24}
$$

where we have separated the radial and angular components of the space vector. These states are normalized by the condition

$$
\text{the probability flux through a sphere with radius } r \;=\; 4\pi,
\tag{5.25}
$$

or

$$
v\; {}_s\langle k|k\rangle_s = v \int_\Omega |\psi(r, \Omega)|^2\, d\Omega = 4\pi,
\tag{5.26}
$$

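where ψ is the wavefunction of the momentum eigenstate, and $v = k/m$ is its group velocity. Set

$$
{}_s\langle k'|k\rangle_s = I_r\left[\sum_\Omega \langle k'|r\rangle\langle r|k\rangle \times \langle k'|\Omega\rangle\langle\Omega|k\rangle\right] = \delta(k - k') \int_{S_r(0)} d\Omega = 4\pi\,\delta(k - k').
\tag{5.27}
$$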

The summation $I_r \sum_\Omega$ is a notation we use to denote the sum over the projection operators $\langle r|$, $\langle\Omega|$. Summation over Ω corresponds to the integration over the sphere $\int_{S_r(0)} r^2\, d\Omega$. Summation over r is, strictly speaking, not a summation, since we consider states on a sphere with a fixed radius, but rather an operator with the definition

$$
I_r\left[r^2 \langle k'|r\rangle\langle r|k\rangle\right] = I_r\left[r^2\, \frac{e^{i(k - k')r}}{r^2}\right] = \delta(k - k'),
\tag{5.28}
$$

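which can be motivated by the following argument: instead of considering outgoing waves on a sphere $S_r$, consider them in the set

$$
S_{r, r+\varepsilon} = \bigcup_{r' \in [r, r+\varepsilon]} S_{r'}.
$$

The normalized radial part of ${}_s\langle k'|k\rangle_s$ is given by

$$
\frac{1}{\varepsilon} \int_r^{r+\varepsilon} e^{i(k - k')r'}\, dr' = \delta(k - k') + O(\varepsilon),
\tag{5.29}
$$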


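leading to eq. (5.28) for infinitesimal values of ε.

As before, let $|a\rangle$ be the incident particle; let $S|a\rangle = \sum |c\rangle$, $c \in \{a\} \cup \{b_i\}$, be any scattered state. We want to calculate the amplitude of detecting an outgoing momentum eigenstate $|k_b\rangle_s$, $b \in \{b_i\}$. With the introduced notations,

$$
\begin{aligned}
\langle r, \Omega|a\rangle &= e^{i k_a z} = e^{i k_a r\cos\theta}, \\
\langle r, \Omega|k_b\rangle &= \frac{e^{i k_b r}}{r}; \\
\langle r, \Omega|S|a\rangle &= \sum_c \langle r, \Omega|c\rangle = \sum_c \psi_c(k_c, r, \Omega),
\end{aligned}
\tag{5.30}
$$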

where the scattered amplitudes $\psi_c$ are defined by eq. (5.12). The entries of the S matrix are defined as

$$
S_{ba} \times {}_s\langle k_b|k_a\rangle_s = {}_s\langle k_b|S|k_a\rangle_s.
\tag{5.31}
$$

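Explicitly, from our definition of the scattering amplitude we get

$$
\begin{aligned}
{}_s\langle k_b|a\rangle &= I_r\left[\int_\Omega \frac{e^{-i k_b r}}{r}\, e^{i k_a r\cos\theta}\, r^2\, d\Omega\right]
= I_r\left[\frac{e^{-i k_b r}}{r}\,(2\pi r^2)\,\frac{e^{i k_a r} - e^{-i k_a r}}{i k_a r}\right] \\
&= \frac{2\pi}{i k_a}\bigl(\delta(k_a - k_b) - \delta(k_a + k_b)\bigr)
= \frac{1}{2 i k_a}\bigl({}_s\langle k_b|k_a\rangle_s - {}_s\langle k_b|{-k_a}\rangle_s\bigr).
\end{aligned}
\tag{5.32}
$$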

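On the other hand,

$$
\begin{aligned}
{}_s\langle k_b|S|a\rangle &= I_r\left[\int_\Omega {}_s\langle k_b|r, \Omega\rangle\langle r, \Omega|S|a\rangle\, r^2\, d\Omega\right] \\
&= I_r\left[\int_\Omega \frac{e^{-i k_b r}}{r} \sum_c \left(\delta_{ac}\, e^{i k_a r\cos\theta} + \sqrt{\frac{m_a}{m_c}}\, f_{ac}(\theta)\, \frac{e^{i k_c r}}{r}\right) r^2\, d\Omega\right] \\
&= {}_s\langle k_b|a\rangle + \sum_c I_r\left[\int_\Omega \left(\sqrt{\frac{m_a}{m_c}}\, f_{ac}(\theta)\, \frac{e^{i(k_c - k_b)r}}{r^2}\right) r^2\, d\Omega\right] \\
&= {}_s\langle k_b|a\rangle + \sum_c \underbrace{4\pi\,\delta(k_b - k_c)}_{= {}_s\langle k_b|k_c\rangle_s}\, \frac{1}{4\pi} \int_\Omega \sqrt{\frac{m_a}{m_c}}\, f_{ac}(\theta)\, d\Omega.
\end{aligned}
\tag{5.33}
$$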

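To arrive at eq. (5.31), we multiply eq. (5.33) by $2ik_a$ and apply eq. (5.32):

$$
\begin{aligned}
&{}_s\langle k_b|S|k_a\rangle_s - {}_s\langle k_b|S|{-k_a}\rangle_s \\
&\quad = {}_s\langle k_b|k_a\rangle_s - {}_s\langle k_b|{-k_a}\rangle_s + 2ik_a\; {}_s\langle k_b|k_b\rangle_s\, \frac{1}{4\pi} \int_\Omega \sqrt{\frac{m_a}{m_b}}\, f_{ab}(\theta)\, d\Omega,
\end{aligned}
\tag{5.34}
$$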

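from which finally follows

$$
{}_s\langle k_b|S|k_a\rangle_s = {}_s\langle k_b|k_a\rangle_s + 2ik_a\; {}_s\langle k_b|k_b\rangle_s\, \frac{1}{4\pi} \int_\Omega \sqrt{\frac{m_a}{m_b}}\, f_{ab}(\theta)\, d\Omega.
\tag{5.35}
$$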


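Therefore, the S matrix may be written in terms of scattering amplitudes using the equation

$$
S = 1 + 2ikf \quad\text{with the operator}\quad f_{ab}|k_a\rangle = \frac{1}{4\pi} \int_\Omega \sqrt{\frac{m_a}{m_b}}\, f_{ab}(\theta)\, d\Omega\; |k_b\rangle_s.
\tag{5.36}
$$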

This derivation may be generalized to angular-dependent scattering (in particular, for spin-dependent cases), see [7, §124]. We see that the operator f is a nonrelativistic analogue of the T matrix (4.2). The unitarity condition may be expressed in the well-known form:

$$
S^\dagger S = 1 \;\Leftrightarrow\; f - f^\dagger = 2ik\, f f^\dagger.
\tag{5.37}
$$

Or, in a coordinate-dependent way:

$$
f_{ab} - f^*_{ba} = 2i \sum_c k_c\, f_{ac} f^*_{cb}.
\tag{5.38}
$$

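We apply $\langle k_b| \cdot |k_a\rangle$ to this equation to verify that a similar condition holds for the scattering amplitudes $f_{ab}$:

$$
\begin{aligned}
&\frac{\langle k_b|k_b\rangle}{4\pi} \int_\Omega \sqrt{\frac{m_a}{m_b}}\, f_{ab}(\theta)\, d\Omega - \frac{\langle k_a|k_a\rangle}{4\pi} \int_\Omega \sqrt{\frac{m_a}{m_b}}\, f^*_{ba}(\theta)\, d\Omega \\
&\quad = \frac{\langle k_a|k_a\rangle}{4\pi}\, \frac{\langle k_b|k_b\rangle}{4\pi}\; 2i \sum_c k_c \int_{\Omega_a} \sqrt{\frac{m_a}{m_c}}\, f_{ac}(\theta_a)\, d\Omega_a \int_{\Omega_b} \sqrt{\frac{m_c}{m_b}}\, f^*_{cb}(\theta_b)\, d\Omega_b.
\end{aligned}
\tag{5.39}
$$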

Dividing by $\sqrt{m_a/m_b}$ and omitting the integration (we perform summation over all eigenstates c, which by assumption form a basis for all eigenstates, so that the equation above also holds for the integrands alone), we get:

$$
f_{ab} - f^*_{ba} = 2i \sum_c k_c\, f_{ac} f^*_{cb},
\tag{5.40}
$$

or in terms of partial scattering amplitudes:

$$
f^{(l)}_{ab} - f^{(l)*}_{ba} = 2i \sum_c k_c\, f^{(l)}_{ac} f^{(l)*}_{cb}.
\tag{5.41}
$$

If time-reversal symmetry is applicable, the scattering cross section of eq. (5.11) is invariant under exchange of channels a and b: $f^{(l)}_{ab} = f^{(l)}_{ba}$. Then the unitarity condition may be written in the form

$$
\operatorname{Im} f^{(l)}_{ab} = \sum_c k_c\, f^{(l)}_{ac} f^{(l)*}_{bc}.
\tag{5.42}
$$


5.3.3 Unitarity condition for Breit-Wigner approximation

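Let us examine a process that is time symmetric. Inserting the multi-channel Breit-Wigner approximation (5.16) into the unitarity condition (5.42), we obtain

$$
-\frac{\Gamma M_{ab}}{2\sqrt{k_a k_b}\,(E - E_0 + i\Gamma/2)} + \frac{\Gamma M^*_{ab}}{2\sqrt{k_a k_b}\,(E - E_0 - i\Gamma/2)} = \sum_c \frac{2i k_c\, \Gamma^2 M_{ac} M^*_{bc}}{2\sqrt{k_a k_c^2 k_b}\,\bigl((E - E_0)^2 + \Gamma^2/4\bigr)}.
\tag{5.43}
$$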

Multiplying by $2\sqrt{k_a k_b}\,\Gamma^{-1}\bigl((E - E_0)^2 + \Gamma^2/4\bigr)$ simplifies this to

$$
(E - E_0)(M^*_{ab} - M_{ab}) + i\Gamma \sum_c (M_{ab} + M^*_{ab}) = 2i\Gamma \sum_c M_{ac} M^*_{bc}.
\tag{5.44}
$$

This equation is satisfied for all energies E only if $M_{ab}$ is real-valued. In this case,

$$
M_{ab} = \sum_c M_{ac} M_{cb}.
\tag{5.45}
$$

Matrix M is symmetric and real-valued, therefore it is diagonalizable to a corresponding diagonal matrix D:

$$
M = U^T D U
\tag{5.46}
$$

with some orthogonal matrix U. Note that

$$
M = M^2 \;\Rightarrow\; D = D^2;
\tag{5.47}
$$

in particular, the entries of D may be 0 or 1. We assume that only one diagonal element of the matrix is nonzero⁹, $D_{ii} = 1$. Then

$$
M_{ab} = \sum_{c,d} U^T_{ac} D_{cd} U_{db} = U^T_{ai} U_{ib} = U_{ia} U_{ib}.
\tag{5.48}
$$

Introducing the notation

$$
|U_{ia}| = \sqrt{\frac{\Gamma_a}{\Gamma}}
\tag{5.49}
$$

yields

$$
M_{ab} = \pm\frac{\sqrt{\Gamma_a \Gamma_b}}{\Gamma},
\tag{5.50}
$$

⁹ In the case when two (or more) different entries $D_{jj}$ and $D_{kk}$ are equal to 1, the coefficient $M_{ab}$ corresponds to the state with degenerate complex eigenvalue $E_0 - i\Gamma/2$, i.e. channels j and k have exactly the same energy. This case is unphysical, since it occurs if two different resonances have exactly the same complex energy pole.


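where we have defined partial widths $\Gamma_a$ satisfying the relation

$$
\sum_a \Gamma_a = \sum_a |U_{ia}|^2\, \Gamma = \Gamma \sum_a U_{ia} U_{ia} = \Gamma
\tag{5.51}
$$

due to the orthogonality of U. With these coefficients, eq. (5.16) reads

$$
f^{(l)}_{ab} = \frac{1}{2\sqrt{k_a k_b}}\, \frac{\Gamma M_{ab}}{E - E_0 + i\Gamma/2}.
\tag{5.52}
$$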

The overall probability amplitude has the form

$$
f^{(l)}_{ab} = \frac{1}{2\sqrt{k_a k_b}}\, \frac{\sqrt{\Gamma_a \Gamma_b}}{E - E_0 + i\Gamma/2},
\tag{5.53}
$$

where the widths $\Gamma_a$, $\Gamma_b$, and Γ are often modified according to the rules described in Section 5.1. Note that in the rest frame system of the resonance, $k_a = k_b = q_{R\to ab}$.
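The algebra of eqs. (5.45) to (5.51) can be checked numerically. The sketch below builds the coupling matrix $M_{ab} = \sqrt{\Gamma_a\Gamma_b}/\Gamma$ (positive sign chosen) for a hypothetical set of partial widths and evaluates the amplitude of eq. (5.53); the function names and the numerical values are assumptions made for illustration.

```python
import math


def coupling_matrix(partial_widths):
    """M_ab = sqrt(Gamma_a * Gamma_b) / Gamma with Gamma the sum of partial widths."""
    total = sum(partial_widths)
    return [[math.sqrt(ga * gb) / total for gb in partial_widths]
            for ga in partial_widths]


def matmul(a, b):
    """Plain-Python product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def multichannel_breit_wigner(energy, e0, partial_widths, momenta, a, b):
    """Partial amplitude of eq. (5.53) for channels with indices a and b."""
    total = sum(partial_widths)
    numerator = math.sqrt(partial_widths[a] * partial_widths[b])
    prefactor = 1.0 / (2.0 * math.sqrt(momenta[a] * momenta[b]))
    return prefactor * numerator / complex(energy - e0, total / 2.0)


widths = [0.10, 0.03, 0.02]  # hypothetical partial widths in GeV
M = coupling_matrix(widths)
M2 = matmul(M, M)
print(max(abs(M[i][j] - M2[i][j]) for i in range(3) for j in range(3)))
print(multichannel_breit_wigner(1.0, 1.0, widths, [0.4, 0.3, 0.2], 0, 1))
```

The matrix is a rank-one projector: it satisfies $M = M^2$ and has unit trace, which is the statement $\sum_a \Gamma_a = \Gamma$ of eq. (5.51).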

5.4 Relativistic coordinates in the Breit-Wigner formula

One of the most important corrections in the relativistic Breit-Wigner formula is the change to invariant coordinates. We introduce it last because it cannot be motivated by quantum-mechanical reasoning (as has been done with the previous modifications). We return to the case of a single resonance with squared momentum $p^2 = m_{ab}^2$ and outgoing states a and b. The propagator of such a resonance has the form

$$
\frac{1}{m_{ab}^2 - M^2 + i\, m_{ab}\, \Gamma(m_{ab})}.
\tag{5.54}
$$

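In nonrelativistic cases this expression is reduced to

$$
\begin{aligned}
\frac{1}{E_R^2 - M^2 + iM\Gamma} &= \frac{1}{(E_R - M)(E_R + M) + iM\Gamma} \\
&= \frac{1}{2E_R(E_R - M) + iM\Gamma} = \frac{1}{q_{R\to ab}(E_R - M) + iM\Gamma},
\end{aligned}
\tag{5.55}
$$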

where we used eq. (4.38). The resulting equation coincides with the approximation eq. (3.57). It is valid up to the orders $(m_{ab} - M)^2$ and $(m_{ab} - M)\Gamma$, where the latter is appropriate due to the assumption $\Gamma \ll M$. Other factors that appear in the nonrelativistic description arise in quantum field theory as vertex functions or other corrections to the propagator. The treatment of multiple channel decays and unitarity requires a more involved approach, see [4] and references therein. In the case when two resonances $R \to \mathrm{out}_1$ and $R \to \mathrm{out}_2$ are close together, the dynamic factor is often modeled by a Flatté resonance

$$
\frac{1}{m_R^2 - M^2 + i\left(q_{R\to\mathrm{out}_1}\, g_1^2 + q_{R\to\mathrm{out}_2}\, g_2^2\right)},
\tag{5.56}
$$

with coupling constants $g_1$ and $g_2$ satisfying $g_1^2 + g_2^2 = M\Gamma$, and q denoting the corresponding breakup momenta.
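The two propagators of eqs. (5.54) and (5.56) can be compared side by side in a few lines. In this sketch the width and the breakup momenta are passed in as plain numbers, so the mass dependence $\Gamma(m_{ab})$ and the continuation of q below threshold are left to the caller; all names and numerical inputs are hypothetical.

```python
import math


def relativistic_breit_wigner(m_ab_sq, mass, width):
    """Propagator of eq. (5.54); `width` is Gamma evaluated at m_ab."""
    return 1.0 / complex(m_ab_sq - mass**2, math.sqrt(m_ab_sq) * width)


def flatte(m_sq, mass, q1, q2, g1_sq, g2_sq):
    """Flatte form of eq. (5.56) for two channels with real breakup momenta."""
    return 1.0 / complex(m_sq - mass**2, q1 * g1_sq + q2 * g2_sq)


# On the pole mass the Breit-Wigner propagator is purely imaginary.
bw = relativistic_breit_wigner(0.770**2, 0.770, 0.149)
print(bw)

# Inverting the Flatte propagator recovers its denominator.
denominator = 1.0 / flatte(1.0, 0.98, 0.3, 0.1, 0.2, 0.4)
print(denominator)
```

Keeping the couplings as $g_1^2$ and $g_2^2$ mirrors eq. (5.56) and makes the constraint $g_1^2 + g_2^2 = M\Gamma$ easy to impose when choosing inputs.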


6 Summary

6.1 Example: D → π+π−π+

The results obtained in previous sections may be applied to the three-bodydecay D → π+π−π+ as follows.

Throughout this section we shall describe this process as P → abc and P → 123 on equal footing to ensure consistency with previous sections. The Roman notation is favorable when the number of outgoing particles is fixed; the numbered notation is easier to use when the number of outgoing particles may vary.

We choose a vector w consisting of $N_v = 3n - 7 = 2$ free variables that describe the decay; usually, the vector $w = (m_{12}^2, m_{23}^2)$ is taken. Resonance structures in these channels are particularly easy to see: there are no known resonances with charge +2, which is why $m_{13}^2$ is not used. Let W denote the corresponding phase space. The decay is physical if and only if $w \in W$.

Let Θ denote the vector of model parameters that we intend to fit, such as complex numbers multiplying resonance amplitudes. Let Ξ denote the vector of model parameters that we take from other analyses, such as spins or radii of particular resonances.

Depending on the analysis, some parameters, such as masses and widths of certain resonances, may be fitted or taken from other analyses. We restrict the description to the case $\Theta = (a_1, \ldots, a_{N_r})$, $\Xi = (R_1, \ldots, R_{N_r})$, where $N_r$ is the number of resonances in our model, $a_i$ are the complex numbers multiplying the partial waves corresponding to each resonance, and $R_i = (J_i, r_i, M_i, \Gamma_i)$ are the vectors containing the spin, radius, mass, and width of the i-th resonance.

The likelihood of the decay is, up to scaling, equal to the cross section of the decay. The boson π+ appears twice among the decay products; the total amplitude must be invariant with respect to exchange of particles 1 and 3. This leads to the unnormalized likelihood function

$$
\begin{aligned}
L(w \mid \Theta, \Xi) &= 1_W(w)\, |A_{\mathrm{sym}}(w, \Theta, \Xi)|^2 \\
&= 1_W(w)\, \left|A\bigl((m_{12}^2, m_{23}^2), \Theta, \Xi\bigr) + A\bigl((m_{32}^2, m_{21}^2), \Theta, \Xi\bigr)\right|^2.
\end{aligned}
\tag{6.1}
$$
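A minimal sketch of eq. (6.1) is given below. The characteristic function $1_W$ is implemented with the standard Dalitz-plot boundary (an assumption, since the boundary is not written out in this section), and `toy_amplitude` is a hypothetical stand-in for the unsymmetrized amplitude A; it is not the model of this thesis.

```python
import math


def in_phase_space(M, m1, m2, m3, m12_sq, m23_sq):
    """Characteristic function 1_W(w) for w = (m12^2, m23^2)."""
    if not ((m1 + m2) ** 2 <= m12_sq <= (M - m3) ** 2):
        return False
    m12 = math.sqrt(m12_sq)
    e2 = (m12_sq - m1**2 + m2**2) / (2.0 * m12)
    e3 = (M**2 - m12_sq - m3**2) / (2.0 * m12)
    p2 = math.sqrt(max(e2**2 - m2**2, 0.0))
    p3 = math.sqrt(max(e3**2 - m3**2, 0.0))
    s = (e2 + e3) ** 2
    return s - (p2 + p3) ** 2 <= m23_sq <= s - (p2 - p3) ** 2


def likelihood(w, amplitude, M, m1, m2, m3):
    """Unnormalized, symmetrized likelihood in the spirit of eq. (6.1)."""
    m12_sq, m23_sq = w
    if not in_phase_space(M, m1, m2, m3, m12_sq, m23_sq):
        return 0.0
    # Exchanging particles 1 and 3 swaps the roles of m12^2 and m23^2.
    a_sym = amplitude(m12_sq, m23_sq) + amplitude(m23_sq, m12_sq)
    return abs(a_sym) ** 2


def toy_amplitude(m12_sq, m23_sq):
    """Hypothetical single Breit-Wigner-like term in m12^2 (illustration only)."""
    return 1.0 / complex(m12_sq - 0.770**2, 0.770 * 0.149)


M_D, M_PI = 1.8696, 0.13957  # GeV, illustrative values
print(likelihood((1.0, 1.2), toy_amplitude, M_D, M_PI, M_PI, M_PI))
print(likelihood((3.4, 3.4), toy_amplitude, M_D, M_PI, M_PI, M_PI))
```

The second call lies outside the phase space and therefore returns zero; the first is unchanged when the two coordinates are swapped, as required for identical bosons.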

This function may be used to generate data points for given parameters Θ, Ξ. To sample the parameter Θ for a given data set, this function must be normalized (i.e., divided by $\int_W L\, dw$; this will be discussed in the next chapter). The unsymmetrized amplitude is decomposed into partial waves characterized by the orbital angular momentum L between ab and c and into amplitudes corresponding to specific resonances:

$$
A(w, \Theta, \Xi) = a_{\mathrm{nonres}} + \sum_{L_j = 0, 1, 2} A_{L_j}(w, \Theta, \Xi) = a_{\mathrm{nonres}} + \sum_{i = 1, \ldots, N_r} a_i\, A_{R_i}(w, \Xi).
\tag{6.2}
$$

The factor $a_{\mathrm{nonres}}$ is a complex number that describes the direct decay of D into the final states without intermediate resonances. Note that this factor is purely heuristic; it may not be constant and may be modeled in some other way.

Resonance          |a_i|   Phase, deg   J_i   r_i, GeV^-1   M_i, GeV   Γ_i, GeV
f0(600) (sigma)    3.7     -3           0     5             0.800      0.800
f0(1370)           1.3     -21          0     5             1.350      0.350
f0(1500)           1.1     -44          0     5             1.507      0.109
ρ(770)             1.0     0            1     5             0.770      0.149
f2(1270)           2.1     -123         2     5             1.275      0.185

Special cases      |a_i|   Phase, deg   J_i   r_i, GeV^-1   M_i, GeV   Γ_980,1, GeV   Γ_980,2, GeV
a_nonres           1.36    150.1        -     -             -          -              -
f0(980)            1.4     12           0     5             0.980      0.329          2·0.329

Table II.4: Resonances of the D → π+π−π+ decay used in the model.

The last expression does not explicitly contain the orbital angular momentum $L = L_{R_i c}$ between the resonance $R_i$ and the bachelor particle c. However, since all the initial and final particles in our specific decay have spin 0, L is uniquely determined by (and equal to) the spin of the resonance $R_i$ by total angular momentum conservation. In general cases, it is necessary to add entries of $L_{R_i c}$ as known parameters to the vector Ξ. The amplitudes $A_{R_i}$ do not depend on the parameter vector Θ. This greatly simplifies the normalization of the model in numeric calculations. Unfortunately, in most cases it is beneficial to fit the masses and widths of resonances; in that case, $A_{R_i} = A_{R_i}(w, \Theta, \Xi)$.

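According to our previous discussions, the amplitude of a resonance is given by

$$
\begin{aligned}
A_{R_i}(w, \Xi) = {} & B_{L_{ab}}(c_R\, q_{R\to ab})\, B_{L_{Rc}}(c_P\, q_{P\to Rc}) \\
& \times f_{L_{Rc}}(m_{12}^2, M_i, G_i)\, Z(J_i, L_{ab}, L_{Rc}, q_{R\to ab}, q_{P\to Rc}).
\end{aligned}
\tag{6.3}
$$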

The factors $B_l$ are the Blatt-Weisskopf functions given by eq. (3.31) with breakup momentum q and interaction radius $c_R = c_P = 1$ fm; $f_l$ is the partial scattering amplitude, also known as the relativistic Breit-Wigner dynamical factor and given by eq. (5.54); and Z is the function that describes the angular dependence of the scattering. We use the Zemach tensor formalism to describe $Z(w) = Z(\theta, z)$ (see Table II.3). As before, $q_{R\to ab}(m_{ab}^2, m_a^2, m_b^2)$ denotes the breakup momentum of the particle R into particles a and b in the rest frame of R and is given by eq. (4.37).

The equation (6.3) describing the resonance f0(980) has a slightly different form. It is described by a modified partial scattering amplitude, namely, the Flatté resonance (5.56). The parameters Θ, Ξ used in the implementation are given in Table II.4.

6.2 Model-dependent generalizations

Consider the general n-body decay

$$
P \to R_{n-1}\, p_n \to R_{n-2}\, p_{n-1}\, p_n \to \ldots \to p_1 \ldots p_n.
$$

As before, let Θ denote the vector of parameters that we desire to fit, and Ξ denote the vector of parameters that are taken externally.

A possible way to search for a model-dependent function of the decay is the following.

1. Parametrizing the phase space W. If the parent particle is spinless, choose a vector w consisting of 3n−7 variables that span the phase space. If possible, these variables are chosen to be Lorentz-invariant; commonly, invariant square masses of two or more daughter particles. (A vector particle P requires 3n−4 variables. Three of them are usually chosen to be the Euler angles, which are not Lorentz-invariant.) Explicitly write down the characteristic function $1_W(w)$. There are some considerations that need to be taken into account:

i) The given data points $w_i$ must be transformed into the corresponding phase space points $w_i$ (however, only once per simulation, so this does not need to be computationally efficient).

ii) The invariant square masses corresponding to the resonances, $m_{R_i}^2 = m_{1\ldots(i-1)}^2$, will be calculated for each data point (again, once per simulation).

iii) The resonance amplitudes must be easy to evaluate from w, since this will be done multiple times (for every data point $w_i$ for every iteration in parameter Θ, i.e. at every step of the simulation).

Given a data point $w_i$, it may be beneficial to precalculate not only the phase space points $w_i$ but also all derived quantities often used in the simulation (such as $m_{R_i}^2$ or others).

2. Symmetrizing the amplitude. If some of the particles $p_1 \ldots p_n$ are identical bosons, the amplitude must be symmetrized just as in equation (6.1).

3. Combining various factors. The model-dependent description is constructed from the following factors:

i) Breit-Wigner (or other) dynamical factors corresponding to each resonance $R_i \to R_{i-1}\, p_i$:

$$
f_{L_{R_{i-1} p_i}}(m_{R_{i-1}}^2, M_i, G_i),
\tag{6.4}
$$

where $m_{R_{i-1}}^2 = m_{1\ldots(i-1)}^2$ is the invariant square mass of all the particles comprising the resonance $R_i$. The number of dynamical factors is equal to the number of resonances, namely n−1. The mass of the resonance is $M_i$, and $G_i$ is its partial width. In general, $G_i$ is a function of $M_i$ and $\Gamma_i$; the latter is the "reduced partial width" described in Section 3.5 and corresponds to the decay probability of $R_i$ into $R_{i-1}$ and $p_i$. The partial width $G_i$ corresponds to the total decay probability of $R_i$; usually, it is modeled as a sum of decay probabilities of only several channels that contribute to the decay of $R_i$ the most. This is justified by the fact that linear changes $\Gamma_i + \Delta\Gamma_i$ cause relatively small contributions to the amplitude that are of order $\Delta\Gamma_i / M_i^2$.

ii) Blatt-Weisskopf factors must be calculated for every vertex of the isobar decay chain; for $R_i \to R_{i-1}\, p_i$,

$$
B_{L_{R_{i-1} p_i}}(q_{R_i \to R_{i-1} p_i}, \Xi_i)
\tag{6.5}
$$

with the breakup momentum q, and $\Xi_i$ is the short-hand notation used to explicitly underline the dependence on the properties of the resonance $R_i$ (such as spin $J_i$ and radius $r_i$). In total, there are n of the Blatt-Weisskopf factors (for the decay $P \to R_{n-1}\, p_n$ and for every resonance decay).

iii) Collect angular dependence factors that are dependent on the decay in question. To each dynamical factor there is an angular dependence factor, i.e., there are n−1 of them.

iv) A nonresonant component may be added to the description (i.e. the component $a_{\mathrm{nonres}}$ from the previous section). It may be constant or modeled in some other way.

4. Adding background. An incoherent background may be added to the model, modifying eq. (6.1) to

$$
L_{\mathrm{w.bgr.}}(w \mid \Theta, \Xi) = 1_W(w) \times \left(|A_{\mathrm{sym}}(w, \Theta, \Xi)|^2 + \sum_i \left|a_{\mathrm{bg},i}\, A_{\mathrm{bg},i,\mathrm{sym}}(w, \Theta_{\mathrm{bg}}, \Xi_{\mathrm{bg}})\right|^2\right),
\tag{6.6}
$$

where the background amplitude is calculated similarly to the amplitude $A_{\mathrm{sym}}$, but it describes all other processes i with the same final particle state as the considered process. This component is sometimes known as the incoherently summed-in background. The vector of unknown parameters $\Theta_{\mathrm{bg}}$ contains the amplitudes $a_{\mathrm{bg},i}$; these may be fitted as real rather than complex numbers, since the background amplitudes $A_{\mathrm{bg},i,\mathrm{sym}}$ are usually modeled as tree processes (processes without intermediate resonances). The vectors $\Theta_{\mathrm{bg}}$ and $\Xi_{\mathrm{bg}}$ may share some parameters with Θ and Ξ. For example, when modeling the decay parametrized in Table II.4, a background component ρ may be added to the description. Its mass and width will be the same as in the coherently summed amplitude, but the coefficient in front of the amplitude, $a_{\mathrm{bg},\rho}$, will differ from the coherently summed coefficient $a_\rho$.

Chapter III

Monte Carlo Methods

This chapter provides a brief presentation of Markov Chain Monte Carlo methods, with a focus on the No-U-Turn Sampler algorithm, based on [24, 25, 1]. No proofs will be presented; no attempt at mathematical rigor is made.

7 Likelihood function in PWA

In the previous chapter we discussed a model that describes partial amplitudes in meson decay. In this section we discuss how such a model (or any other) is embedded into a probabilistic context.

Previous results motivate the following tasks:

i) Given density $L_{\mathrm{gen}}(w) \equiv L(w|\Theta, \Xi)$ and parameters Θ and Ξ, generate a set of data points $w_i$, $i = 1 \ldots N_w$, distributed according to the described probability.

ii) Given a set of data points $w_i$, $i = 1 \ldots N_w$, and fixed parameters Ξ, sample a set of parameters Θ distributed according to the likelihood $L_s(\Theta) \equiv L(\Theta|\{w_i\}_i, \Xi)$.
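Task i) can be sketched with plain rejection sampling, which is one possible approach and is not claimed to be the method used in the thesis software. The function below assumes that an upper bound `density_max` on the unnormalized density is known and that the phase space fits into a rectangular box; both the function name and the toy density are hypothetical.

```python
import random


def generate_events(density, bounds, density_max, n_events, rng):
    """Draw n_events points distributed according to an unnormalized density.

    `bounds` is a pair of (low, high) intervals enclosing the phase space and
    `density_max` must be an upper bound of `density` inside that box.
    """
    (x_lo, x_hi), (y_lo, y_hi) = bounds
    events = []
    while len(events) < n_events:
        w = (rng.uniform(x_lo, x_hi), rng.uniform(y_lo, y_hi))
        # Accept the candidate with probability density(w) / density_max.
        if rng.uniform(0.0, density_max) < density(w):
            events.append(w)
    return events


def toy_density(w):
    """Hypothetical density: nonzero only on the triangle x + y < 1."""
    x, y = w
    return 1.0 + x if x + y < 1.0 else 0.0


rng = random.Random(2016)
events = generate_events(toy_density, ((0.0, 1.0), (0.0, 1.0)), 2.0, 200, rng)
print(len(events), events[0])
```

Because the density already contains the characteristic function of the allowed region, points outside it are never accepted; the normalization of the density is not needed for generation.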

The functions L are non-negative and Lebesgue integrable. They are not necessarily probability densities since they are, generally, not normalized. The normalized probability density corresponding to some function f shall be denoted by f.

By Bayes' theorem, the normalized sampling likelihood (for one given data point $w_i$) is given by

$$
L_s(\Theta|w_i, \Xi) = \frac{L_{\mathrm{gen}}(w_i|\Theta, \Xi)\, L(\Theta|\Xi)}{L(w_i|\Xi)}.
\tag{7.1}
$$

The probability density distributions $L(\Theta|\Xi)$ and $L(w_i|\Xi)$ are prior distributions. The former is usually chosen to be uniform with a certain support V (determined experimentally); the latter is usually chosen to be uniform with the support equal to the phase space W. Note that Bayes' theorem yields the normalized function $L_s$. For uniform priors eq. (7.1) may be written explicitly in the form

$$
L_s(\Theta|w_i, \Xi) = \frac{L_{\mathrm{gen}}(w_i|\Theta, \Xi)}{\int L_{\mathrm{gen}}(w|\Theta, \Xi)\, dw}\; \frac{1_V(\Theta)}{\mathrm{vol}(V)} \cdot \frac{\mathrm{vol}(W)}{1_W(w_i)}.
\tag{7.2}
$$

Sampling of Θ with the likelihood $L_s(\Theta|w_i, \Xi)$ is not altered when it is multiplied by some constant $c(w_i, \Xi)$ that does not depend on Θ. Therefore, volume factors are usually omitted during the calculation of the likelihood $L_s$. The phase space factor $1_W(w_i)$ is usually omitted as superfluous for sampling, since data points are supposed to belong to the phase space. (It often appears implicitly in $L_{\mathrm{gen}}$.) The factor $1_V(\Theta)$ usually enters sampling programs not in a direct way in $L_s$ but on a higher level, when parameters Θ are declared and their boundaries are set. If the prior on Θ is not uniform, the corresponding density must appear explicitly in $L_s$.

Consider the following example for the choice of prior: complex constants $a_i$ appearing in front of partial wave amplitudes $A_{R_i}$ in eq. (6.2) may be declared as two real variables describing the real and imaginary parts of $a_i$, or as two real variables describing the amplitude and phase of $a_i$. This choice sets the priors of $a_i \in \Theta$ to be uniform in Cartesian or polar coordinates, respectively.

With these remarks, the PWA likelihood with uniform priors is usually written in the form

$$
L_s(\Theta|w_i, \Xi) = \frac{L_{\mathrm{gen}}(w_i|\Theta, \Xi)}{\int L_{\mathrm{gen}}(w|\Theta, \Xi)\, dw}.
\tag{7.3}
$$

Note that writing the equation in this form is an abuse of notation, since the likelihood $L_s$ is no longer normalized to one.

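Since data events are assumed to be independent, the likelihood of Θ given the data set $\{w_i\}$, $i = 1 \ldots N_w$, has the form

$$
L_s(\Theta|\{w_i\}, \Xi) = \prod_{i=1}^{N_w} \frac{L_{\mathrm{gen}}(w_i|\Theta, \Xi)}{\int L_{\mathrm{gen}}(w|\Theta, \Xi)\, dw}.
\tag{7.4}
$$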
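In practice the product in eq. (7.4) is usually evaluated as a sum of logarithms to avoid numerical underflow. The sketch below is a generic illustration under that assumption; the function name and the exponential toy density with its analytic normalization are hypothetical.

```python
import math


def log_likelihood(theta, data, density, normalization):
    """Logarithm of eq. (7.4): sum over events of log(L_gen / integral of L_gen)."""
    log_norm = math.log(normalization(theta))
    return sum(math.log(density(w, theta)) - log_norm for w in data)


def toy_density(w, theta):
    """Hypothetical unnormalized density exp(-theta * w) on w >= 0."""
    return math.exp(-theta * w)


def toy_normalization(theta):
    """Analytic integral of the toy density over w >= 0."""
    return 1.0 / theta


data = [0.5, 1.5]
print(log_likelihood(1.0, data, toy_density, toy_normalization))
```

Multiplying the density and its integral by the same constant leaves the result unchanged, which is why constant volume factors can be dropped from $L_s$.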

It is beneficial, if possible, to calculate the explicit form of the function

$$
I_{\mathrm{gen}}(\Theta) = \int L_{\mathrm{gen}}(w|\Theta, \Xi)\, dw.
\tag{7.5}
$$

For example, consider the likelihood described in Section 6.1:

$$
L_{\mathrm{gen}}(w|\Theta, \Xi) = \left|\sum_{k=1}^{N_R} a_k\, A_{R_k}(w, \Xi)\right|^2 = \sum_{k,l=1}^{N_R} a_k a_l^*\, A_{R_k}(w, \Xi)\, A^*_{R_l}(w, \Xi).
\tag{7.6}
$$


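In this case, the entries of $\Theta = (a_1 \ldots a_{N_R})$ do not appear as variables in the partial waves $A_{R_k}$, and the integral (7.5) may be written explicitly as

$$
I_{\mathrm{gen}}(\Theta) = \int \sum_{k,l=1}^{N_R} a_k a_l^*\, A_{R_k}(w, \Xi)\, A^*_{R_l}(w, \Xi)\, dw = \Theta^T I\, \Theta^*,
\tag{7.7}
$$

with a matrix $I \in \mathbb{C}^{N_R \times N_R}$ that may be calculated before sampling:

$$
I_{kl} = \int_W A_{R_k}(w, \Xi)\, A^*_{R_l}(w, \Xi)\, dw.
\tag{7.8}
$$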

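The entries $I_{kl}$ are usually calculated using Monte Carlo integration: for a set of events $u_i$, $i = 1 \ldots N_{\mathrm{NCI}}$, distributed uniformly in the phase space W,

$$
I_{kl} \approx \frac{1}{N_{\mathrm{NCI}}} \sum_{i=1}^{N_{\mathrm{NCI}}} A_{R_k}(u_i, \Xi)\, A^*_{R_l}(u_i, \Xi).
\tag{7.9}
$$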
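The precomputation described by eqs. (7.7) to (7.9) can be sketched as follows. The two toy amplitudes and the unit square standing in for the phase space W are assumptions made for illustration; as in eq. (7.9), the constant volume factor of W is not included.

```python
import random


def integration_matrix(amplitudes, events):
    """Monte Carlo estimate of I_kl following eq. (7.9)."""
    n, n_events = len(amplitudes), len(events)
    matrix = [[0j] * n for _ in range(n)]
    for u in events:
        values = [amp(u) for amp in amplitudes]
        for k in range(n):
            for l in range(n):
                matrix[k][l] += values[k] * values[l].conjugate() / n_events
    return matrix


def normalization(theta, matrix):
    """Quadratic form sum_kl a_k I_kl a_l^* of eq. (7.7); real by construction."""
    n = len(theta)
    total = sum(theta[k] * matrix[k][l] * theta[l].conjugate()
                for k in range(n) for l in range(n))
    return total.real


# Hypothetical amplitudes on a unit-square stand-in for the phase space.
amplitudes = [
    lambda u: 1.0 / complex(u[0] - 0.5, 0.1),
    lambda u: complex(u[1], 0.3),
]
rng = random.Random(1)
events = [(rng.random(), rng.random()) for _ in range(2000)]
theta = [complex(1.0, 0.0), complex(0.3, -0.7)]

I = integration_matrix(amplitudes, events)
direct = sum(abs(sum(a * amp(u) for a, amp in zip(theta, amplitudes))) ** 2
             for u in events) / len(events)
print(normalization(theta, I), direct)
```

The quadratic form reproduces the direct event-by-event average on the same sample, but after the matrix is built its cost no longer depends on the number of integration events.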

Note that this integration is computationally expensive, and the fact that it does not occur during sampling is highly advantageous for this particular model.

For general models it is not possible to perform the integration (7.5) before the sampling, and one must take Monte Carlo integrals for each evaluation of $L_s(\Theta)$:

$$
I_{\mathrm{gen}}(\Theta) = \int L_{\mathrm{gen}}(w|\Theta, \Xi)\, dw \approx \frac{1}{N_{\mathrm{NCI}}} \sum_{i=1}^{N_{\mathrm{NCI}}} L_{\mathrm{gen}}(u_i|\Theta, \Xi).
\tag{7.10}
$$

The probability that a detector registers an event depends on the position of this event in the phase space. To account for this, one may replace the model probability measure by a modified measure:

\[ L_{\mathrm{gen}}(w)\,dw \to L_{\mathrm{gen}}(w)\,h(w)\,dw, \tag{7.11} \]

where $h : W \to [0, 1]$ is the detector efficiency. This modification enters explicitly during simulated data generation. The probability of an event $w_i \in dw$ occurring and being detected is

\[ \frac{L_{\mathrm{gen}}(w_i)\,h(w_i)\,dw}{\int_W L_{\mathrm{gen}}(w)\,h(w)\,dw}, \tag{7.12} \]

while the probability of an event $w_i \in dw$ occurring (but not necessarily being detected) is

\[ L_{\mathrm{gen,\,mod}}(w_i) = \frac{L_{\mathrm{gen}}(w_i)\,dw}{\int_W L_{\mathrm{gen}}(w)\,h(w)\,dw}. \tag{7.13} \]


Note that accounting for detector efficiency does not change the sampling likelihood $L_s$ in the form (7.1) (the probabilities entering Bayes' theorem are connected only to particle physics and not to detector properties): $h$ enters $L_s$ only through the normalization (7.13) in eq. (7.4), which changes to

\[ L_s(\Theta|w_i,\Xi) = \prod_{i=1}^{N_w} \frac{L_{\mathrm{gen}}(w_i|\Theta,\Xi)}{\int L_{\mathrm{gen}}(w|\Theta,\Xi)\,h(w)\,dw}. \tag{7.14} \]

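Using Monte Carlo integration (up to constant factors that do not depend on $\Theta$),

\[ \int L_{\mathrm{gen}}(w|\Theta,\Xi)\,h(w)\,dw \approx \sum_{i=1}^{N_{\mathrm{NCI}}} L_{\mathrm{gen}}(u_i|\Theta,\Xi)\,h(u_i) \approx \sum_{i=1}^{N_{\mathrm{NCI}}} L_{\mathrm{gen}}(\tilde{u}_i|\Theta,\Xi), \tag{7.15} \]

where $u_i$, $i = 1 \ldots N_{\mathrm{NCI}}$, are events distributed with density $1_W$ (uniformly in the phase space), and $\tilde{u}_i$, $i = 1 \ldots N_{\mathrm{NCI}}$, are events distributed with density $h \cdot 1_W$. This means that in parameter sampling we may incorporate detector efficiency by taking a modified set of events for the Monte Carlo integration in eqs. (7.9) and (7.10).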

According to Section 6.2, a general density $L$ will contain further background parameters and have the form $L(w|\Theta,\Theta_{\mathrm{bg}},\Xi,\Xi_{\mathrm{bg}})$. The general form of the likelihood $L_s$ is still valid, since without loss of generality we may redefine $\Theta$ as the concatenation of the original vector $\Theta$ and $\Theta_{\mathrm{bg}}$, and likewise for $\Xi$ and $\Xi_{\mathrm{bg}}$.

In the special case when only the complex constants in front of the amplitudes are fitted, the model has the form described in Section 6.1:

\[ L_{\mathrm{gen}}(w|\Theta,\Xi) = \left| \sum_{k=1}^{N_R} a_k A_{R_k}(w,\Xi) \right|^2 + \sum_{k=1}^{N_{\mathrm{bg}}} b_k B_k(w,\Xi), \tag{7.16} \]

where we write the background contribution in a slightly different form compared to eq. (6.6), introducing real-valued constants $b_k$ and real-valued background functions $B_k$. (This allows us to model nonresonant structures in the background.) Note that for a given data set $\{w_i\}_i$ the amplitudes $A_{R_k}(w_i,\Xi)$ and the functions $B_k(w_i,\Xi)$ may be evaluated and saved before the sampling, which saves computation time. With the complex-valued parameter vector $\Theta = (a_1 \ldots a_{N_R})$ and the real-valued background parameter vector $\Theta_{\mathrm{bg}} = (b_1 \ldots b_{N_{\mathrm{bg}}})$, the integral (7.5) may be written explicitly as

\[ I_{\mathrm{gen}}(\Theta,\Theta_{\mathrm{bg}}) = \int \sum_{k,l=1}^{N_R} a_k a_l^{*} A_{R_k}(w,\Xi) A_{R_l}^{*}(w,\Xi)\,dw + \int \sum_{k=1}^{N_{\mathrm{bg}}} b_k B_k(w,\Xi)\,dw = \Theta^{\mathsf T} I\,\Theta^{*} + I_{\mathrm{bg}} \cdot \Theta_{\mathrm{bg}}, \tag{7.17} \]


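with the matrix $I \in \mathbb{C}^{N_R \times N_R}$ defined by eq. (7.8), and the vector $I_{\mathrm{bg}} \in \mathbb{R}^{N_{\mathrm{bg}}}$ defined by

\[ I_{k,\mathrm{bg}} = \int_W B_k(w,\Xi)\,dw. \tag{7.18} \]

The likelihood function, up to the detector efficiency correction, is the same as in eq. (7.4):

\[ L_s(\Theta,\Theta_{\mathrm{bg}}|w_i, i = 1 \ldots N_w, \Xi) = \prod_{i=1}^{N_w} \frac{L_{\mathrm{gen}}(w_i|\Theta,\Theta_{\mathrm{bg}},\Xi)}{I_{\mathrm{gen}}(\Theta,\Theta_{\mathrm{bg}})}. \tag{7.19} \]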

In general discussions we shall omit the explicit dependence on $\Theta_{\mathrm{bg}}$, implying its concatenation with $\Theta$.
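As an illustration of eqs. (7.16) to (7.19), the logarithm of $L_s$ can be evaluated from arrays that are prepared once before sampling. The sketch below is a hedged example, not the implementation used in this thesis: the function name `log_likelihood` and the random arrays standing in for precomputed amplitudes and background functions are assumptions made for the illustration, and the random "data" are not drawn from the model.

```python
import numpy as np

def log_likelihood(theta, theta_bg, A_data, B_data, I, I_bg):
    """ln L_s of eq. (7.19) for the model (7.16).

    A_data: (N_w, N_R) complex amplitudes A_{R_k}(w_i), precomputed.
    B_data: (N_w, N_bg) real background functions B_k(w_i), precomputed.
    I, I_bg: integrals (7.8) and (7.18), precomputed.
    """
    density = np.abs(A_data @ theta) ** 2 + B_data @ theta_bg    # L_gen(w_i)
    norm = np.real(theta @ I @ theta.conj()) + I_bg @ theta_bg   # I_gen
    return np.sum(np.log(density)) - len(A_data) * np.log(norm)

rng = np.random.default_rng(2)
N_w, N_mc, N_R, N_bg = 200, 5000, 3, 2

def random_amplitudes(n):
    # Hypothetical placeholder for evaluated partial waves
    return rng.normal(size=(n, N_R)) + 1j * rng.normal(size=(n, N_R))

A_data, A_mc = random_amplitudes(N_w), random_amplitudes(N_mc)
B_data = rng.uniform(0.5, 1.5, size=(N_w, N_bg))
B_mc = rng.uniform(0.5, 1.5, size=(N_mc, N_bg))
I = A_mc.T @ A_mc.conj() / N_mc          # eq. (7.9)
I_bg = B_mc.mean(axis=0)                 # Monte Carlo version of eq. (7.18)

theta = np.array([1.0, 0.5j, -0.3 + 0.2j])
theta_bg = np.array([0.1, 0.2])
ll = log_likelihood(theta, theta_bg, A_data, B_data, I, I_bg)
ll_scaled = log_likelihood(2 * theta, 4 * theta_bg, A_data, B_data, I, I_bg)
```

Because of the normalization by $I_{\mathrm{gen}}$, rescaling all $a_k$ by a common factor $c$ and all $b_k$ by $|c|^2$ leaves the value unchanged, so `ll` and `ll_scaled` coincide.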

Both of the tasks set in the beginning of the chapter, namely the generation of data points and the sampling of likelihood parameters, correspond to the same mathematical task of sampling a variable $x$ (that is, $w$ or $\Theta$) according to a certain distribution $f$ (that is, $L_{\mathrm{gen}}(w|\Theta_0,\Xi)$ or $L_s(\Theta|w_i, i = 1 \ldots N_w, \Xi)$). Some of the numerical approaches to this challenge constitute the rest of this chapter.

The generation of data points is not needed for practical purposes (the data is provided by the experiment), but it is useful for checking the algorithms for self-consistency: if a data set $\{w_i\}_i$ is generated using the starting parameters $\Theta_0$, the sampled parameters $\{\Theta_j\}_j$ should peak around $\Theta_0$. Analysis of the distribution of such $\{\Theta_j\}_j$ may be used to gain further insight into the properties of the considered model $L_{\mathrm{gen}}$.

8 Basic sampling methods

8.1 Direct sampling

A natural way to sample points $q_i$ distributed with density $f \in L^1(\mathbb{R})$ and cumulative distribution function $F(q) = \int_{-\infty}^{q} f(q')\,dq'$ is to sample points $r_i$ with known distribution density $u \in L^1(\mathbb{R})$ and cumulative distribution $U$ and then to transform $r_i$ to $q_i$ in an appropriate way.

Let $r_i$ be uniformly distributed between 0 and 1 with density $u$. We want to calculate samples $q_i$ using $r_i$. Assume that the random variables $r$ and $q$ are distributed with densities $u$ and $f$. Then, samples $r_i$ and $q_i$ are also distributed according to these densities if and only if $P(r_i < r) = P(q_i < q)$. This leads to the relationship

\[ r = \int_{-\infty}^{r} u(r')\,dr' = P(r_i < r) = P(q_i < q) = F(q). \tag{8.1} \]

One can check that setting $q_i = F^{-1}(r_i)$ yields samples $q_i$ with the distribution $f$.


Figure III.1: Cumulative distributions $U(r)$ and $F(q)$ of the densities $u(r) = 1_{[0,1]}(r)$ (left) and $f(q) = \frac{1}{\lambda} e^{-q/\lambda} \cdot 1_{[0,\infty)}(q)$ (right) illustrate how to transform 10 uniformly distributed samples to samples with some other distribution: for a sample $r_i$, calculate the value of its cumulative distribution, shown by the dashed grey line. The intersection of this line with $F(q)$ yields the value $q_i = F^{-1}(r_i)$.

We illustrate this claim with the following example. If

\[ f(q) = \frac{1}{\lambda} e^{-q/\lambda} \cdot 1_{[0,\infty)}(q), \tag{8.2} \]

its cumulative distribution is given by

\[ r = F(q) = \left(1 - e^{-q/\lambda}\right) \cdot 1_{[0,\infty)}(q), \tag{8.3} \]

and the samples

\[ q_i = -\lambda \ln(1 - r_i) \tag{8.4} \]

are distributed with density $f$ for samples $r_i$ uniformly distributed between 0 and 1, see Fig. III.1.
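Eq. (8.4) translates directly into code. The following is a minimal sketch using NumPy; the function name `sample_exponential` and the value $\lambda = 2$ are chosen only for this illustration.

```python
import numpy as np

def sample_exponential(lam, size, rng):
    """Direct sampling of f(q) = exp(-q/lam)/lam via q = F^{-1}(r), eq. (8.4)."""
    r = rng.uniform(0.0, 1.0, size=size)   # r_i uniform in [0, 1)
    return -lam * np.log(1.0 - r)

rng = np.random.default_rng(1)
lam = 2.0
q = sample_exponential(lam, 100000, rng)
sample_mean = q.mean()   # should be close to the exact mean, lam
```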

This method is well suited for density functions whose cumulative distribution has a known inverse. For the more complicated densities that usually appear in PWA models we must resort to other methods; a widely used class of methods is Markov Chain Monte Carlo (MCMC).

8.2 Metropolis-Hastings Algorithm

The main idea of MCMC algorithms is to start with a sampling point $q_i$ and propose the next point with a certain probability that is chosen in a way that ensures the overall desired distribution. Each algorithm is characterized by the way the point $q_{i+1}$ is chosen. This change is called an update. Using multiple updates, one can construct a chain of samples with the desired distribution.

One of the simplest updates that preserves the distribution is the Metropolis-Hastings update. The Metropolis-Hastings algorithm does the following:


1) Initialize:

i) Choose a starting point $q = q_0$.

ii) Choose a probability density $g_{\mathrm{prop}}(\tilde{q}|q)$ for proposing a new state $\tilde{q}$ given the current state $q$. Usually, $g_{\mathrm{prop}}$ is uniform or Gaussian and centered around $q$.

2) Update:

i) Generate a proposition $\tilde{q}$ with distribution $g_{\mathrm{prop}}(\tilde{q}|q_i)$.

ii) Calculate the Hastings ratio

\[ r(q_i, \tilde{q}) = \frac{f(\tilde{q})\,g_{\mathrm{prop}}(q_i|\tilde{q})}{f(q_i)\,g_{\mathrm{prop}}(\tilde{q}|q_i)}. \tag{8.5} \]

iii) If $r \geq 1$, accept the proposition: $q_{i+1} = \tilde{q}$. If $r < 1$, accept the proposition $q_{i+1} = \tilde{q}$ with probability $r$; otherwise, reject the proposition and set $q_{i+1} = q_i$.

Since the Hastings ratio is undefined for $f(q) = 0$, one must choose the initial value in the interior of the support of $f$. Also, the density $g_{\mathrm{prop}}$ must satisfy the condition

\[ g_{\mathrm{prop}}(\tilde{q}|q) \neq 0 \text{ for some } q \text{ and } \tilde{q} \;\Rightarrow\; g_{\mathrm{prop}}(q|\tilde{q}) \neq 0 \]

to ensure the reversibility of the algorithm. The algorithm then gradually explores the support, automatically rejecting propositions with probability density 0.

The algorithm must be reversible to ensure that the distribution $f$ is preserved during sampling. In other words, for $q$ distributed according to $f$, the probability of being in state $q$ and moving to $\tilde{q}$ must be equal to the probability of being in state $\tilde{q}$ and moving to $q$ (detailed balance): then, the Metropolis-Hastings algorithm yields samples distributed according to $f$. For a proof of this statement, also known as the Metropolis-Hastings theorem, see [24].

In applications with a symmetric proposition density, the last two steps of the update may be reformulated in the following way:

a ∼ uniform(0, 1)
if f(q_proposed) / f(q) > a:
    q_new = q_proposed  # Accept
else:
    q_new = q  # Reject

or, multiplying the condition by the denominator in the Hastings ratio,

u ∼ uniform(0, f(q))
if f(q_proposed) > u:
    q_new = q_proposed  # Accept
else:
    q_new = q  # Reject

The variable $u$ is called the slice variable in [1]. The implementation of the Metropolis-Hastings algorithm is presented in Fig. III.2.

import numpy as np

# Given:
#   initial position q, probability density f(),
#   width sigma of the Gaussian proposition

def update(f, q, sigma=1.0):
    q_prop = np.random.normal(q, sigma)
    u = np.random.uniform(0, f(q))
    if f(q_prop) > u:
        q_new = q_prop  # Accept
    else:
        q_new = q  # Reject
    return q_new

Figure III.2: Python pseudocode of the Metropolis-Hastings algorithm with Gaussian proposition $g_{\mathrm{prop}}$.

An appropriate choice of the proposition density $g_{\mathrm{prop}}$ is crucial to obtain samples that converge quickly towards $f$. If $g_{\mathrm{prop}}$ is too narrow, the parameter space is explored very slowly; if $g_{\mathrm{prop}}$ is too wide, many propositions are rejected and the algorithm does not advance.

In practice, $g_{\mathrm{prop}}$ is usually chosen to be symmetric with respect to the exchange of $q$ and $\tilde{q}$, so that the Hastings ratio simplifies to $r = f(\tilde{q})/f(q)$.
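To make the update concrete, the sketch below chains many Metropolis-Hastings updates with a symmetric Gaussian proposition and applies them to the exponential density (8.2) with $\lambda = 1$. The function name `metropolis_chain`, the proposition width, and the chain length are assumptions chosen for this illustration only.

```python
import numpy as np

def metropolis_chain(f, q0, sigma, n_samples, rng):
    """Chain of Metropolis-Hastings updates with symmetric Gaussian proposition."""
    q, fq = q0, f(q0)
    chain = np.empty(n_samples)
    for i in range(n_samples):
        q_prop = rng.normal(q, sigma)
        f_prop = f(q_prop)
        # Slice-variable form of the accept/reject step
        if f_prop > rng.uniform(0.0, fq):
            q, fq = q_prop, f_prop      # Accept
        chain[i] = q                    # On rejection the old state is repeated
    return chain

def f(q):
    # Density (8.2) with lambda = 1; zero outside the support
    return np.exp(-q) if q >= 0 else 0.0

rng = np.random.default_rng(4)
chain = metropolis_chain(f, q0=1.0, sigma=2.0, n_samples=50000, rng=rng)
chain_mean = chain.mean()   # should approach the exact mean 1 for long chains
```

Propositions outside the support have $f = 0$ and are rejected automatically, so the chain never leaves $[0, \infty)$.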

Quantifying the performance of an algorithm is complicated and is discussed at length in the literature. Which algorithm performs best depends heavily on the form of the density $f$. The densities appearing in meson decay PWA are characterized by the following properties:

i) Their supports are bounded connected (in many cases convex) sets. It is usually relatively easy to choose a point inside the support, but estimating the size of the support is generally neither feasible nor precise.

ii) The state $q$ is usually a high-dimensional vector. Even in the simplest case described in Section 6.1 without background, one requires 12 parameters; in many practical tasks, the number of parameters is on the order of hundreds.


iii) Evaluation of $f = L_s$ is numerically costly (for $N_w \sim 200\,000$ data points, each evaluation of $L_s$ contains $O(N_w)$ basic numerical operations); the cost of evaluating $\nabla f$ is comparable to that of $f$. (The latter will become relevant in the Hamiltonian Monte Carlo algorithms discussed below.)

The challenge in meson decay PWA is to choose an algorithm that performs most efficiently for densities of the described type with the least amount of necessary tuning. A prominent algorithm is Hamiltonian Monte Carlo (HMC) [25] and its variant, the No-U-Turn Sampler (NUTS) [1].

9 HMC sampling methods

9.1 Hamiltonian Monte Carlo

The Metropolis-Hastings algorithm is characterized by the fact that a proposition $\tilde{q}$ is chosen according to a certain distribution $g_{\mathrm{prop}}$ that must be defined for all states $q$ before the sampling starts. For certain sampling problems it is more efficient to use a proposition $\tilde{q}$ chosen according to the local properties of the density $f$ in the region around the current state $q$. Hamiltonian Monte Carlo and its variants exploit this idea.

As before, let $f$ be the probability density of the sampled distribution and $q$ be the current state of the Markov chain. For simplicity, let $q$ be one-dimensional. Hamiltonian Monte Carlo uses the following routine.

1) Initialize:

i) Choose a starting point $q = q_0$.

ii) Choose a probability density $g_{\mathrm{mom}}(p|q)$ to be used to generate fictitious momenta $p$ (see interpretation below) given the current state $q$. Usually, $g_{\mathrm{mom}}(p|q) = g_{\mathrm{mom}}(p)$ is uniform or Gaussian and centered around 0.

iii) Choose a step size $\varepsilon \in \mathbb{R}_{>0}$ and a number of steps $L \in \mathbb{N}$ (see interpretation below).

2) Update:

i) Interpret the parameter $q$ as the position of some fictitious particle. Assign some mass $m$ to this particle. The simplest choice is to set $m = 1$.

ii) Assign a momentum $p$ to the fictitious particle, chosen randomly with distribution $g_{\mathrm{mom}}(p|q)$.

iii) Assign the following energy (i.e., Hamiltonian) to the fictitious particle:

\[ H(q, p) = K(p) + U(q) = \frac{p^2}{2m} - \ln[f(q)], \tag{9.1} \]

where $K$ is the kinetic energy of the fictitious particle and $U$ is its potential energy. Here and in the following we use the kinetic energy $K(p) = \frac{p^2}{2m}$. Depending on the sampling problem, it may be beneficial to consider other energies. In general, $f(q)$ is replaced by $\pi(q) L(q|D)$, where $\pi(q)$ is the prior on $q$ (it is already included in $L_s$ in eq. (7.1)) and $L$ is the likelihood of the state $q$ given some data $D$.

iv) Evolve the state of the particle for the time $\varepsilon L$ using Hamilton's equations

\[ \partial_t q = \partial_p H, \qquad \partial_t p = -\partial_q H. \tag{9.2} \]

In practice, one usually uses the leapfrog method: for $p^{(0)} = p$, $q^{(0)} = q$, $i = 0 \ldots L-1$,

\[ \begin{aligned} p^{(i+1/2)} &= p^{(i)} + \frac{\varepsilon}{2}\,\partial_t p^{(i)} = p^{(i)} - \frac{\varepsilon}{2}\,\partial_q U(q^{(i)}); \\ q^{(i+1)} &= q^{(i)} + \varepsilon\,\frac{p^{(i+1/2)}}{m}; \\ p^{(i+1)} &= p^{(i+1/2)} - \frac{\varepsilon}{2}\,\partial_q U(q^{(i+1)}). \end{aligned} \tag{9.3} \]

The resulting point with its momentum reversed, $(q^{(L)}, -p^{(L)})$, is chosen as the proposition of the HMC algorithm. The inversion of the sign of $p^{(L)}$ does not change this particular variant of HMC (the kinetic energy is even in $p$), but it is more convenient from the theoretical point of view: it is necessary to ensure the reversibility of the algorithm (i.e., that it preserves the distribution $f$).

For future reference, note that a leapfrog step backwards in time is obtained by replacing $\varepsilon$ with $-\varepsilon$:

\[ \begin{aligned} p^{(i-1/2)} &= p^{(i)} + \frac{\varepsilon}{2}\,\partial_q U(q^{(i)}); \\ q^{(i-1)} &= q^{(i)} - \varepsilon\,\frac{p^{(i-1/2)}}{m}; \\ p^{(i-1)} &= p^{(i-1/2)} + \frac{\varepsilon}{2}\,\partial_q U(q^{(i-1)}). \end{aligned} \tag{9.4} \]

v) Calculate the ratio

\[ r(q^{(0)}, p^{(0)}, q^{(L)}, -p^{(L)}) = \frac{e^{-H(q^{(L)}, -p^{(L)})}}{e^{-H(q^{(0)}, p^{(0)})}} = e^{H(q^{(0)}, p^{(0)}) - H(q^{(L)}, -p^{(L)})}. \tag{9.5} \]

vi) Perform the accept/reject step of the Metropolis-Hastings algorithm as described in Step 2.iii) of that algorithm. Note that $r > 1$ means that the energy of the proposed state is smaller than that of the current one.


import numpy as np

# Given:
#   initial position q_0, mass vector m,
#   probability density f() and its gradient grad_f()

def U(q):
    return -np.log(f(q))

def grad_U(q):
    # Gradient of -ln(f) is -grad(f) / f
    return -grad_f(q) / f(q)

def leapfrog(U, grad_U, epsilon, L, m, p_0, q_0):
    p = p_0
    q = q_0
    # Half step for momentum at the beginning
    p = p - epsilon * grad_U(q) / 2
    for i in range(L):
        # Full step for the position
        q = q + epsilon * p / m
        # Full step for momentum, except
        # at the end of trajectory
        if i != L - 1:
            p = p - epsilon * grad_U(q)
    # Make a half step for momentum at the end
    p = p - epsilon * grad_U(q) / 2
    return [q, p]

def hmc_update(U, grad_U, epsilon, L, current_q):
    # Random momentum: Gaussian with mean 0 and variance m
    current_p = np.random.normal(np.zeros_like(current_q), np.sqrt(m))
    [q, p] = leapfrog(U, grad_U, epsilon, L, m,
                      p_0=current_p, q_0=current_q)
    # Evaluate Hamiltonian at start and end of the trajectory
    # (elementwise multiplication and division, summed over components)
    current_H = np.sum(current_p * current_p / 2 / m) + U(current_q)
    proposed_H = np.sum(p * p / 2 / m) + U(q)
    u = np.random.uniform(0, np.exp(-current_H))
    if np.exp(-proposed_H) > u:
        q_new = q  # Accept
    else:
        q_new = current_q  # Reject
    return q_new

Figure III.3: Python pseudocode of the Hamiltonian Monte Carlo algorithm.


The pseudocode for this algorithm is presented in Fig. III.3.

The choice of the potential energy $U = -\ln(f)$ is motivated by the interpretation of the canonical distribution in statistical mechanics: given the energy function $H(q, p)$ for a state $(q, p)$, the canonical distribution over states has the probability density

\[ P(q, p) = \frac{1}{Z} \exp\left(-\frac{H(q, p)}{T}\right) = \frac{1}{Z} \exp\left(-\frac{K(p)}{T}\right) \exp\left(-\frac{U(q)}{T}\right), \tag{9.6} \]

where $T$ is the temperature of the system (measured in units with Boltzmann's constant equal to 1) and $Z$ is a normalizing constant. The probability of detecting the system in state $q$ is therefore

\[ f(q) \propto \exp\left(-\frac{U(q)}{T}\right). \tag{9.7} \]

Inverting this equation (with the arbitrary temperature $T = 1$) yields $U = -\ln f$. Note that the conventional choice of the kinetic energy corresponds to a posterior momentum distribution given by a Gaussian with mean 0 and variance $m$:

\[ K(p) = \frac{p^2}{2m} = -\ln e^{-\frac{p^2}{2m}}. \]

The prior $g_{\mathrm{mom}}(p)$ is commonly chosen to coincide with the posterior: $g_{\mathrm{mom}}(p) = e^{-\frac{p^2}{2m}}$. In that case the values of $p$ may be chosen using direct sampling.

The authors of [25] prove that the proposition of HMC keeps the distribution $f(q)$ invariant. The following properties of Hamiltonian dynamics are important for the proof:

i) Reversibility: the mapping $T_s : (q(t), p(t)) \to (q(t+s), p(t+s))$ is invertible; the inverse is given by $T_{-s}$.

ii) Conservation of energy: $dH/dt = 0$.

iii) Liouville's theorem: the volume of any region $R \subseteq \{(q, p)\}$ in phase space is invariant in time: $\mathrm{vol}(T_s(R)) = \mathrm{vol}(R)$.

Due to these properties Hamiltonian dynamics is used in Markov Chain Monte Carlo algorithms.
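The first two properties can be checked numerically for the leapfrog discretization (9.3). The sketch below assumes a one-dimensional standard Gaussian target, so that $U(q) = q^2/2$, and unit mass; the step size and the number of steps are arbitrary choices for this illustration. The leapfrog scheme is exactly reversible (up to rounding), while the energy is conserved only approximately.

```python
def leapfrog(grad_U, q, p, epsilon, L, m=1.0):
    """L leapfrog steps of size epsilon, eq. (9.3)."""
    p = p - 0.5 * epsilon * grad_U(q)
    for i in range(L):
        q = q + epsilon * p / m
        if i != L - 1:
            p = p - epsilon * grad_U(q)
    p = p - 0.5 * epsilon * grad_U(q)
    return q, p

def U(q):
    return 0.5 * q ** 2          # -ln f for a standard Gaussian, up to a constant

def grad_U(q):
    return q

def H(q, p):
    return 0.5 * p ** 2 + U(q)

q0, p0 = 1.0, 0.5
q1, p1 = leapfrog(grad_U, q0, p0, epsilon=0.1, L=20)
# Reverse the momentum and integrate again: the trajectory retraces itself
q2, p2 = leapfrog(grad_U, q1, -p1, epsilon=0.1, L=20)
energy_error = abs(H(q1, p1) - H(q0, p0))   # small, but not exactly zero
```

After the second call, `(q2, -p2)` coincides with the starting state `(q0, p0)`, which is the reversibility that the momentum flip in the HMC proposition relies on.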

Let us now generalize to a state vector $\vec{q}$ having $N_{\mathrm{par}}$ entries. The masses of the fictitious particles $\vec{m} = (m_1 \ldots m_{N_{\mathrm{par}}})$ are usually adjusted with some additional routine to reflect the scale of each dimension. Note that the masses, appearing in the definition of the kinetic energy and in the leapfrog steps, always couple with the momenta $\vec{p} = (p_1 \ldots p_{N_{\mathrm{par}}})$. This coupling between momenta and masses allows a certain freedom in the choice of these variables.

We may further tune these parameters if we use a mass matrix $M \in \mathbb{R}^{N_{\mathrm{par}} \times N_{\mathrm{par}}}$ with kinetic energy

\[ K(\vec{p}) = \frac{1}{2}\,\vec{p} \cdot M^{-1} \vec{p}. \]

This modification is useful for correlated parameters $\vec{q}$; it corresponds to Hamiltonian dynamics with a non-Euclidean metric.

Hamiltonian Monte Carlo explores a phase space with $N_{\mathrm{par}}$ dimensions more rapidly than the Metropolis-Hastings algorithm for higher values of $N_{\mathrm{par}}$, resulting in a smaller correlation between samples. In HMC, the number of likelihood function evaluations is roughly $O(N_{\mathrm{par}}^{5/4})$ per update, while in Metropolis it is $O(N_{\mathrm{par}}^{2})$ per update, cf. [1] and references therein. This improvement comes at a cost: HMC requires multiple evaluations of $\nabla f(q)$ during each update. This does not present a problem in PWA, since in these cases evaluations of $\nabla f(q)$ and $f(q)$ need comparable computation time.

The parameters $L$ and $\varepsilon$ need to be hand-tuned to achieve efficient HMC sampling. The step size $\varepsilon$ must be as large as possible to move through the phase space as fast as possible, but small enough not to lose stability in the leapfrog algorithm. An unstable leapfrog algorithm will not conserve energy and may propose states unlikely to be accepted, which results in wasted calculations.

The number of steps $L$ should be small enough to spare unnecessary evaluations of $f$ and $\nabla f$ and to prevent trajectories from doubling back on themselves, but still large enough to minimize the correlation between samples.

If the distribution $g_{\mathrm{mom}}$ yields momenta that are too small, the algorithm moves slowly through the phase space and yields correlated samples. If the distribution $g_{\mathrm{mom}}$ yields momenta that are too large, the leapfrog propagation will be less stable. These effects may be counteracted by tuning $L$ and $\varepsilon$; a good choice of these parameters is crucial for good performance of HMC.

For a detailed discussion of HMC algorithms and their variants the reader is referred to [25] and references therein. We focus here on only one variant: the No-U-Turn Sampler.

9.2 No-U-Turn Sampler

The No-U-Turn Sampler addresses the issue of automatically adjusting $\varepsilon$ and $L$ in the HMC algorithm.

9.2.1 Adaptively setting L

The No-U-Turn Sampler, as its name implies, is designed to prevent trajectories from turning around and heading back towards their starting points. The following criterion indicates that an HMC chain that starts in the state $(\vec{q}_0, \vec{p}_0)$ and is in the state $(\vec{q}, \vec{p})$ is doubling back on itself:

\[ (\vec{q} - \vec{q}_0) \cdot \vec{p} < 0. \tag{9.8} \]

If the algorithm is continued at the point $\vec{q}$, the progress it makes away from the point $\vec{q}_0$ is proportional to $(\vec{q} - \vec{q}_0) \cdot \vec{p}$, which (for unit masses) is the time derivative of half the squared distance $|\vec{q} - \vec{q}_0|^2 / 2$. When this quantity becomes less than zero, the trajectory in the phase space performs a "U-turn" and starts doubling back on itself.
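Criterion (9.8) can be illustrated by integrating a single trajectory until it starts to double back. The sketch below assumes a two-dimensional standard Gaussian target with unit masses, for which the exact trajectory starting at the origin turns around after a quarter period $t = \pi/2$; the function name `steps_until_u_turn` is hypothetical. It shows only the naive forward stopping rule, not the reversible tree-building of NUTS.

```python
import numpy as np

def steps_until_u_turn(grad_U, q0, p0, epsilon, max_steps=10000):
    """Number of leapfrog steps until (q - q0) . p < 0, eq. (9.8)."""
    q, p = q0.copy(), p0.copy()
    for step in range(1, max_steps + 1):
        p = p - 0.5 * epsilon * grad_U(q)
        q = q + epsilon * p
        p = p - 0.5 * epsilon * grad_U(q)
        if np.dot(q - q0, p) < 0:
            return step
    return max_steps

def grad_U(q):
    return q                      # U(q) = |q|^2 / 2

q0 = np.zeros(2)
p0 = np.array([1.0, 0.5])
epsilon = 0.01
steps = steps_until_u_turn(grad_U, q0, p0, epsilon)
turn_time = steps * epsilon       # close to pi / 2 for this target
```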

Another criterion that should be checked is that the algorithm is not at a point that will be rejected with high certainty. Let

\[ u \sim \mathrm{uniform}\left(0, \exp\left(-H(\vec{q}_0, \vec{p}_0)\right)\right) \tag{9.9} \]

be the slice variable. The proposition $(\vec{q}, \vec{p})$ is rejected if

\[ \exp(-H(\vec{q}, \vec{p})) < u \;\Leftrightarrow\; \exp(-H(\vec{q}, \vec{p}) - \ln(u)) < \exp(0) \;\Leftrightarrow\; -H(\vec{q}, \vec{p}) - \ln(u) < 0. \]

In particular, if

\[ -H(\vec{q}, \vec{p}) - \ln(u) < -\Delta_{\max} \tag{9.10} \]

for some large positive $\Delta_{\max}$, then the algorithm is in a phase space region that is highly unlikely to contribute to sampling. This may occur if the step size $\varepsilon$ is too large and the errors of the leapfrog algorithm have accumulated in time. The authors of [1] suggest setting $\Delta_{\max} \sim 1000$ so that criterion (9.10) interferes with sampling only if the simulation becomes highly inaccurate.

The naive approach would be to let the HMC update run until (9.8) or (9.10) occurs. Unfortunately, such an algorithm does not guarantee time reversibility, which may lead to the algorithm converging to a wrong distribution. NUTS restores the reversibility by running the algorithm forward and backward at random, building a tree $B$ of possible states, each leaf of which corresponds to a state provided by the leapfrog algorithm. NUTS employs the following doubling routine for a given state $(\vec{q}_0, \vec{p}_0)$, which is also graphically displayed in Fig. III.4.

1) Initialize:

i) Set the tree $B = [(\vec{q}_0, \vec{p}_0)]$.

ii) Set the bookkeeping variable $j = 0$. The tree length shall be kept equal to $2^j$. During the initialization, there is only one leaf on the tree, namely, $(\vec{q}_0, \vec{p}_0)$.

iii) Set the leftmost and the rightmost leaves of the tree: $(\vec{q}_-, \vec{p}_-) = (\vec{q}_0, \vec{p}_0)$, $(\vec{q}_+, \vec{p}_+) = (\vec{q}_0, \vec{p}_0)$. The leftmost leaf shall be used to propagate the states backwards in time, the rightmost leaf shall be used to propagate the states forwards in time.


Figure III.4: Illustration from [1], p. 1355, describing a binary tree built via repeated doubling: the figure at the top represents a two-dimensional trajectory evolved over four doublings, and the figures below represent the evolution of the binary tree. The directions chosen were forward (orange node), backward (yellow nodes), backward (blue nodes), and forward (green nodes).

2) Double the tree until one of the stopping criteria is met:

i) Check eq. (9.8) in the following form:

\[ (\vec{q}_+ - \vec{q}_-) \cdot \vec{p}_- < 0 \quad \text{or} \quad (\vec{q}_+ - \vec{q}_-) \cdot \vec{p}_+ < 0. \tag{9.11} \]

One of these inequalities is met if continuing the calculation would reduce the distance between $\vec{q}_+$ and $\vec{q}_-$. If any of these inequalities or inequality (9.10) is satisfied, proceed directly to step 3). Otherwise, continue with 2ii).

ii) Choose a random direction $v = \pm 1$.

iiia) If $v = +1$, start from $(\vec{q}_+, \vec{p}_+)$ and perform $2^j$ leapfrog steps forwards in time. Add all the intermediate states $(\vec{q}_i, \vec{p}_i)$ to the tree $B$. Reset the rightmost leaf of the tree $(\vec{q}_+, \vec{p}_+)$ to the last state obtained after the $2^j$ leapfrog steps. The new tree length becomes $2^j + 2^j = 2^{j+1}$. Increase $j$ by 1.

iiib) If $v = -1$, start from $(\vec{q}_-, \vec{p}_-)$ and perform $2^j$ leapfrog steps backwards in time. Add all the intermediate states $(\vec{q}_{-i}, \vec{p}_{-i})$ to the tree $B$. Reset the leftmost leaf of the tree $(\vec{q}_-, \vec{p}_-)$ to the last state obtained after the $2^j$ leapfrog steps. The new tree length becomes $2^j + 2^j = 2^{j+1}$. Increase $j$ by 1.

iv) Return to 2i).

3) Choose the new proposition:

i) Choose the slice variable $u \sim \mathrm{uniform}\left(0, \exp\left(-H(\vec{q}_0, \vec{p}_0)\right)\right)$.

ii) Uniformly randomly select a leaf $(\vec{q}_{\mathrm{prop}}, \vec{p}_{\mathrm{prop}})$ from the tree $B$.

iii) If $\exp\left(-H(\vec{q}_{\mathrm{prop}}, \vec{p}_{\mathrm{prop}})\right) > u$, accept the proposition: $(\vec{q}, \vec{p}) = (\vec{q}_{\mathrm{prop}}, \vec{p}_{\mathrm{prop}})$. Otherwise, remove $(\vec{q}_{\mathrm{prop}}, \vec{p}_{\mathrm{prop}})$ from $B$ and return to step 3ii). Note that there is at least one leaf that can be accepted, namely, the original $(\vec{q}_0, \vec{p}_0) \in B$.

Instead of keeping all the leaves until a U-turn occurs and then choosing a proposition, one could pick a leaf $(\vec{q}_{\mathrm{prop}}, \vec{p}_{\mathrm{prop}})$ each time the tree is doubled. Namely, one can choose the slice variable $u$ during the initialization and set $(\vec{q}_{\mathrm{prop}}, \vec{p}_{\mathrm{prop}}) = (\vec{q}_0, \vec{p}_0)$. Assume that the tree (with length $2^j$) has $n$ points $(\vec{q}_i, \vec{p}_i)$ satisfying $\exp\left(-H(\vec{q}_i, \vec{p}_i)\right) > u$. Double the tree, keeping the new leaves $(\vec{q}_k, \vec{p}_k)$; for each new leaf $(\vec{q}_k, \vec{p}_k)$ satisfying

\[ \exp\left(-H(\vec{q}_k, \vec{p}_k)\right) > u, \tag{9.12} \]

set $n = n + 1$. After that, set $(\vec{q}_{\mathrm{prop}}, \vec{p}_{\mathrm{prop}}) = (\vec{q}_k, \vec{p}_k)$ with probability $1/n$. This is equivalent to uniformly choosing $(\vec{q}_{\mathrm{prop}}, \vec{p}_{\mathrm{prop}})$ from all points on the tree satisfying eq. (9.12). The pseudocode corresponding to this implementation is shown in Fig. III.3.
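The equivalence claimed above is the standard reservoir-sampling argument and can be checked empirically. In the sketch below the leaves are replaced by the integers 0 to 3, all assumed to satisfy (9.12); the function name `progressive_choice` is hypothetical.

```python
import numpy as np

def progressive_choice(candidates, rng):
    """Keep one candidate; the n-th valid candidate replaces it with probability 1/n."""
    chosen, n = None, 0
    for c in candidates:
        n += 1
        if rng.uniform() < 1.0 / n:
            chosen = c
    return chosen

rng = np.random.default_rng(3)
counts = np.zeros(4, dtype=int)
for _ in range(40000):
    counts[progressive_choice(range(4), rng)] += 1
frequencies = counts / counts.sum()   # each entry should be close to 1/4
```

Only the current proposition and the counter $n$ need to be stored, so the full tree never has to be kept in memory.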

9.2.2 Adaptively tuning ε

The algorithm briefly described in this section is used in [1] to adapt the step size $\varepsilon$ in HMC methods; it is known as the dual-averaging variant of stochastic optimization with vanishing adaptation. (The reader is referred to the aforementioned reference for the origins and details of the algorithm.)

Suppose the quantity $H_t$ describes some behaviour of an MCMC algorithm at iteration $t \geq 1$; denote its expectation by

\[ h(y) = \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} E[H_t|y], \tag{9.13} \]

with the expectation $E[H_t|y]$ at iteration $t$ and some tunable parameter $y$. An appropriate choice for the tunable parameter is $y = \log \varepsilon$. The reason for this choice is twofold: the algorithm described below works best if the tunable parameter is real-valued and not constrained (it does not make sense to consider negative or zero-valued step sizes); also, this choice prevents the algorithm from trying step sizes that are too short and would waste calculations.

For example, if $\alpha_t$ is the Metropolis acceptance probability (i.e., the probability $\min(1, r_t)$ for the Hastings ratio $r_t$), one could define

\[ H_t = \alpha - \alpha_t, \tag{9.14} \]

where $\alpha$ is the desired acceptance probability.


import numpy as np

# Given: initial position q_0, mass vector m

def is_U_turn(q_l, p_l, q_r, p_r):
    # Criterion (9.11)
    return (np.dot(q_r - q_l, p_l) < 0 or np.dot(q_r - q_l, p_r) < 0)

def is_improbable(q, p, U, u, Delta_max):
    # Criterion (9.10)
    return (-np.sum(p * p / 2 / m) - U(q) - np.log(u)) < -Delta_max

def nuts_update(U, grad_U, epsilon, q_0):
    p_0 = np.random.normal(np.zeros_like(q_0), np.sqrt(m))
    H_0 = np.sum(p_0 * p_0 / 2 / m) + U(q_0)
    q, q_prop, q_r, q_l = q_0, q_0, q_0, q_0
    p, p_prop, p_r, p_l = p_0, p_0, p_0, p_0
    U_q, U_q_r, U_q_l = U(q), U(q_r), U(q_l)
    grad_U_q, grad_U_q_r, grad_U_q_l = \
        grad_U(q), grad_U(q_r), grad_U(q_l)
    u = np.random.uniform(0, np.exp(-H_0))
    Delta_max = 1000
    j = 0
    n = 1  # The initial point satisfies the Metropolis
           # acceptance test: exp(-H_0) > u
    while (not is_U_turn(q_l, p_l, q_r, p_r) and
           not is_improbable(q, p, U, u, Delta_max)):
        # Random direction: -1 or 1
        v = 2 * np.random.randint(2) - 1
        if v > 0:
            p, q, U_q, grad_U_q = p_r, q_r, U_q_r, grad_U_q_r
        else:
            p, q, U_q, grad_U_q = p_l, q_l, U_q_l, grad_U_q_l
        # Average acceptance probability, helpful quantity
        # used in optimization, see eq. (9.24).
        acceptance = 0
        for i in range(2**j):
            # Half step for momentum at the beginning
            p = p - v * epsilon * grad_U_q / 2
            # Make a full step for the position
            q = q + v * epsilon * p / m
            # Update values of the potential
            grad_U_q = grad_U(q)
            U_q = U(q)
            # Make a half step for momentum
            p = p - v * epsilon * grad_U_q / 2
            # Calculate acceptance probability
            H_q = np.sum(p * p / 2 / m) + U_q
            single_leaf_acceptance = min(1, np.exp(H_0 - H_q))
            acceptance = acceptance + single_leaf_acceptance
            # Check the point for Hastings ratio criterion
            if np.exp(-H_q) > u:
                # Update the proposition with appropriate
                # probability
                n = n + 1
                if np.random.uniform(0, 1) < 1.0 / n:
                    p_prop = p
                    q_prop = q
        # Update variables
        if v > 0:
            p_r, q_r, U_q_r, grad_U_q_r = p, q, U_q, grad_U_q
        else:
            p_l, q_l, U_q_l, grad_U_q_l = p, q, U_q, grad_U_q
        # Tree doubled for j-th time
        acceptance = acceptance / 2**j
        j = j + 1
    # Stopping criteria met
    return [[q_prop, p_prop], acceptance]

Figure III.3: Python pseudocode of the No-U-Turn Sampler.


In general, $H_t$ is chosen such that its expectation $h(y)$ is equal to zero for an "ideally" tuned parameter $y$. The reason for this choice is the following claim: if $h$ is a nondecreasing function of $y$ and certain other criteria are met (see references in [1]), the update

\[ y_{t+1} = y_t - \eta_t H_t \tag{9.15} \]

leads to the desired behaviour

\[ h(y_t) \to 0 \quad \text{for } t \to \infty. \tag{9.16} \]

Here, $\eta_t \in \mathbb{R}$ is the adaptation step size satisfying

\[ \sum_t \eta_t = \infty, \qquad \sum_t \eta_t^2 < \infty. \tag{9.17} \]

An appropriate choice for the adaptation step size is

\[ \eta_t = t^{-\kappa}, \tag{9.18} \]

with $\kappa \in (0.5, 1]$. For such $\eta_t$, the adaptation $\eta_t H_t$ goes towards 0 for large $t$, $y$ converges to a certain limiting value, and the asymptotic behaviour of the sampler is not altered by the adaptation.

Note that the adjustments to $y$ quickly become smaller:

\[ y_{t+1} - y_t = -\eta_t H_t. \tag{9.19} \]

In practice, one does not let the adaptation run for the whole sampling, but only for a warm-up phase, after which the tunable parameters are fixed and the algorithm runs in its stationary phase.

The optimal values of the parameter $y$ are often quite different for the warm-up phase and the stationary phase. The drawback of eq. (9.15) is that the steps $\eta_t$ diminish rapidly, giving too much weight to the early iterations of the warm-up phase. This leads to a $y$ that is more suited for the warm-up phase, which is the opposite of what we want.

The dual-averaging scheme works around this by introducing two quantities, $z = \log \varepsilon_{\text{warm-up}}$, the warm-up step size, and $y = \log \varepsilon$, the stationary step size, in the following manner:

\[ \begin{aligned} z_{t+1} &= \mu - \frac{\sqrt{t}}{\gamma}\,\frac{1}{t + t_0} \sum_{i=1}^{t} H_i, \\ y_{t+1} &= \eta_t z_{t+1} + (1 - \eta_t)\,y_t. \end{aligned} \tag{9.20} \]

Here, $\mu \in \mathbb{R}$ is the freely chosen point towards which $z$ is shrunk; $\gamma > 0$ determines how fast $z$ goes towards $\mu$; $t_0 > 0$ is the damping parameter that stabilizes early iterations of the algorithm; and $\eta_t$ is defined in eq. (9.18).


The asymptotic behaviour of $z$ is given by

\[ z_{t+1} - z_t = O(-t^{-0.5} H_t); \tag{9.21} \]

in other words, it corresponds to eq. (9.15) with an appropriate adaptation step size. The corresponding behaviour of $y$ is

\[ \begin{aligned} y_{t+1} - y_t &= \eta_t z_{t+1} + (1 - \eta_t)\,y_t - y_t = \eta_t z_{t+1} - \eta_t y_t \\ &= \eta_t z_{t+1} - \eta_t\left(\eta_{t-1} z_t + (1 - \eta_{t-1})\,y_{t-1}\right) \\ &= \eta_t z_{t+1} - \eta_t \eta_{t-1} z_t + O(\eta_t \eta_{t-1} \eta_{t-2}). \end{aligned} \tag{9.22} \]

In plain words, $y$ exponentially damps the early iterations of $z$, giving more weight to the later contributions of the warm-up.

The authors of [1] use the following parameters for an initial step size guess $\varepsilon_0$:

\[ \mu = \log(10\,\varepsilon_0), \quad \gamma = 0.05, \quad t_0 = 10, \quad \kappa = 0.75, \tag{9.23} \]

and $H_t$ as in eq. (9.14) with the optimal Metropolis acceptance rate $\alpha = 0.65$ and $\eta_t$ as in eq. (9.18). The parameter $\mu$ is arbitrary; by choosing it to be larger than $\log \varepsilon_0$ one makes the algorithm try out larger step sizes first. In general, this spares some computations, since trying out small step sizes is more costly (more computations are needed for the same chain length $\varepsilon L$). A pseudocode implementation of this warm-up routine may be found in Fig. III.2.
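The recursion (9.20) with the parameters (9.23) can be tried out on a toy problem in which the acceptance probability is a known, deterministic function of the step size. In the sketch below, the function `dual_averaging` and the model $\alpha(\varepsilon) = e^{-\varepsilon}$ are assumptions made purely for illustration; in a real sampler $\alpha_t$ would be a noisy quantity returned by each HMC or NUTS update. For this model the ideal step size is $-\ln 0.65 \approx 0.43$.

```python
import numpy as np

def dual_averaging(accept_prob, eps0, n_iter, target=0.65,
                   gamma=0.05, t0=10.0, kappa=0.75):
    """Dual averaging, eq. (9.20): returns the stationary step size exp(y)."""
    mu = np.log(10.0 * eps0)
    z, y, H_sum = np.log(eps0), 0.0, 0.0
    for t in range(1, n_iter + 1):
        # H_t = alpha - alpha_t, eq. (9.14), evaluated at the warm-up step size
        H_sum += target - accept_prob(np.exp(z))
        z = mu - np.sqrt(t) / gamma * H_sum / (t + t0)   # z_{t+1}
        eta = t ** (-kappa)                              # eq. (9.18)
        y = eta * z + (1.0 - eta) * y                    # y_{t+1}
    return np.exp(y)

def toy_acceptance(eps):
    # Hypothetical acceptance probability, decreasing in the step size
    return np.exp(-eps)

eps_bar = dual_averaging(toy_acceptance, eps0=1.0, n_iter=2000)
eps_ideal = -np.log(0.65)
```

Since $H_t$ is evaluated at $\exp(z)$, the sampler runs with the warm-up step size during adaptation, and only the averaged value `eps_bar` is kept for the stationary phase.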

To find a reasonable initial guess for $y_0 = \log \varepsilon_0$, one may use the following algorithm: choose $\varepsilon = 1$ and some starting point $q_0$, and perform a leapfrog step in a random direction. If the resulting Hastings ratio $r$ is larger than 1/2, double the step size and repeat the routine until the Hastings ratio becomes smaller than 1/2. If $r$ is smaller than 1/2, halve the step size and repeat until $r$ becomes larger than 1/2. (Pseudocode in Fig. III.2.)

Acceptance probability in NUTS. In NUTS there is no single rejection state, and one must find another way to estimate the Metropolis acceptance probability. For each update starting in the state $(\vec{q}_0, \vec{p}_0)$, one can define

\[ H_t^{\mathrm{NUTS}} = \frac{1}{2^{j-1}} \sum_{i = 2^{j-1} \ldots 2^{j}} \min\left\{1, \exp\left(H(\vec{q}_0, \vec{p}_0) - H(\vec{q}_i, \vec{p}_i)\right)\right\}, \tag{9.24} \]

where $\{(\vec{q}_i, \vec{p}_i)\}_{i = 2^{j-1} \ldots 2^{j}}$ is the set of all states explored during the final doubling of the NUTS tree. Then, one can set $H_t = H_t^{\mathrm{NUTS}}$ and use the NUTS algorithm warm-up similarly to the warm-up sketched in Fig. III.2.


# Given:
#   initial value vector q_0, mass vector m with same dimension
#   real-valued potential function U(q_0)
#   its gradient grad_U(q_0) (returns q_0-like vector)
#   function that performs one leapfrog step with step size epsilon:
#     [q, p] = leapfrog(U, grad_U, q_0, p_0, m, epsilon)
#   function 'hmc_update', as in Fig. III.3.
#   function 'nuts_update', as in Fig. III.3.

def H(q, p):
    return sum(p * p / 2 / m) + U(q)

def find_reasonable_delta(q_0):
    delta = 1
    p_0 = random.normal(zeros_like(q_0), ones_like(q_0))
    [q, p] = leapfrog(U, grad_U, q_0, p_0, m, delta)
    # Variable 'a' determines whether acceptance rate is
    # too high (a=1) or too low (a=-1)
    a = 2 * (exp(-H(q, p) + H(q_0, p_0)) > 0.5) - 1
    # Adjust step size until acceptance crosses 0.5
    while exp(-H(q, p) + H(q_0, p_0))**a > 0.5**a:
        delta = 2**a * delta
        [q, p] = leapfrog(U, grad_U, q_0, p_0, m, delta)
    return delta

# Warm-up using classical HMC with chains of length 'L'
def warmup_hmc_update(q_0, M_w, L):
    delta = find_reasonable_delta(q_0)
    mu = log(10 * delta)
    epsilon, H_0 = 1, 0
    gamma, t_0, kappa, target = 0.05, 10, 0.75, 0.65
    z_old, y_old, H_old = log(delta), log(epsilon), H_0
    for k in range(1, M_w + 1):  # start at 1: k**(-kappa) is undefined for k = 0
        [p, q] = hmc_update(U, grad_U, epsilon, L, q_0)
        # p_0 is the momentum drawn at the start of this update
        acceptance = min(1, exp(-H(q, p) + H(q_0, p_0)))
        # Running average of (target - measured) acceptance
        H_new = (1 - 1 / (k + t_0)) * H_old + 1 / (k + t_0) * (target - acceptance)
        z_new = mu - sqrt(k) / gamma * H_new
        y_new = k**(-kappa) * z_new + (1 - k**(-kappa)) * y_old
        y_old, z_old, H_old = y_new, z_new, H_new
        delta, epsilon = exp(z_new), exp(y_new)
    return [[p, q], epsilon]

# Warm-up using NUTS
def warmup_nuts_update(q_0, M_w):
    delta = find_reasonable_delta(q_0)
    mu = log(10 * delta)
    epsilon, H_0 = 1, 0
    gamma, t_0, kappa, target = 0.05, 10, 0.75, 0.65
    z_old, y_old, H_old = log(delta), log(epsilon), H_0
    for k in range(1, M_w + 1):
        # 'acceptance' is the statistic of eq. (9.24)
        [[p, q], acceptance] = nuts_update(U, grad_U, epsilon, q_0)
        H_new = (1 - 1 / (k + t_0)) * H_old + 1 / (k + t_0) * (target - acceptance)
        z_new = mu - sqrt(k) / gamma * H_new
        y_new = k**(-kappa) * z_new + (1 - k**(-kappa)) * y_old
        y_old, z_old, H_old = y_new, z_new, H_new
        delta, epsilon = exp(z_new), exp(y_new)
    return [[p, q], epsilon]

Figure III.2: Pythonian pseudocode of the HMC warm-up.
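For readers who want to run the warm-up, the following is a minimal self-contained sketch of the same dual-averaging scheme for a one-dimensional standard normal target with unit mass. It is not the stan_pwa or Stan implementation: the helper names (`leapfrog`, `hmc_step`, `warmup`), the toy potential, the seed and the number of iterations are assumptions made for illustration. Following Hoffman and Gelman [1], the trajectories use the current (non-averaged) step size, and the averaged step size is returned at the end.

```python
import math
import random


def leapfrog(q, p, eps, grad_u):
    """One leapfrog step for unit mass."""
    p = p - 0.5 * eps * grad_u(q)
    q = q + eps * p
    p = p - 0.5 * eps * grad_u(q)
    return q, p


def hmc_step(q, eps, n_steps, u, grad_u, rng):
    """One HMC update; returns the new state and the acceptance probability."""
    p0 = rng.gauss(0.0, 1.0)
    q_new, p_new = q, p0
    for _ in range(n_steps):
        q_new, p_new = leapfrog(q_new, p_new, eps, grad_u)
    diff = (u(q) + 0.5 * p0 * p0) - (u(q_new) + 0.5 * p_new * p_new)
    alpha = 1.0 if diff >= 0.0 else math.exp(diff)
    return (q_new if rng.random() < alpha else q), alpha


def warmup(q, eps0, n_warmup, n_steps, u, grad_u, rng,
           target=0.65, gamma=0.05, t0=10.0, kappa=0.75):
    """Dual-averaging step-size adaptation; returns (state, averaged step size)."""
    mu = math.log(10.0 * eps0)
    log_eps, log_eps_bar, h_bar = math.log(eps0), 0.0, 0.0
    for k in range(1, n_warmup + 1):
        q, alpha = hmc_step(q, math.exp(log_eps), n_steps, u, grad_u, rng)
        # Running average of (target - measured) acceptance.
        h_bar = (1.0 - 1.0 / (k + t0)) * h_bar + (target - alpha) / (k + t0)
        log_eps = mu - math.sqrt(k) / gamma * h_bar
        weight = k ** (-kappa)
        log_eps_bar = weight * log_eps + (1.0 - weight) * log_eps_bar
    return q, math.exp(log_eps_bar)


def u(q):
    """Toy potential: negative log-density of a standard normal."""
    return 0.5 * q * q


def grad_u(q):
    return q


q_end, eps = warmup(0.0, 1.0, 500, 10, u, grad_u, random.Random(0))
print("adapted step size:", eps)
```

For this toy target the leapfrog integrator becomes unstable for step sizes above 2, so the adapted value is expected to settle below that limit.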

Chapter IV

Computational Implementation

10 Structure

To implement our PWA in an HMC algorithm we use the Stan [2, 26] software package. Stan is a probabilistic programming language implemented in C++. It provides full Bayesian statistical inference, in particular via the HMC and NUTS algorithms. We use CmdStan-2.9.0 [27], a command-line interface to Stan, to develop a partial wave analysis toolkit. It is intended as a proof of concept for HMC/NUTS sampling in partial wave analysis.

Stan includes an automatic differentiation library that calculates the gradients of the likelihood function that are necessary for HMC sampling.

In CmdStan, each sampling program consists of a *.stan file written in Stan syntax. Using make, this file is translated into a *.cpp file and then compiled to an executable sampling program. This program is then called with arguments such as a file containing data points, the desired number of samples, etc., and returns a file containing the desired samples. Data points used by the sampler must be declared in the syntax of the R language and are usually stored in files with the extension *.data.R. Fig. IV.1 illustrates the structure of a Stan model.

We developed an extension module to Stan called stan_pwa. To do so, we added functions to Stan's mathematical library. For each model, these functions are bundled into a likelihood function that is exposed to Stan's parser and is directly callable from *.stan files.

The detailed structure is as follows. The main components of stan_pwa are a library written in C++ (stan_pwa/src), Python utility scripts (stan_pwa/bin), and some PWA example models (stan_pwa/models). The C++ library contains Breit-Wigner, Blatt-Weisskopf, and some other functions necessary to define the model specified in Section 6.1. The utility scripts handle the input and output data, precalculate Monte Carlo integrals like eq. (7.9), and precalculate amplitudes as discussed after eq. (7.16). Each example model has its own src and bin folders as well as a stan folder. Files in src tie model-dependent amplitudes to the likelihood function; files in bin contain utility scripts specific to the defined model; files in stan are the files compiled and used by Stan. Fig. IV.2 illustrates the structure of a Stan PWA model using stan_pwa.

[Figure IV.1: Structure of a Stan sampling model. Diagram labels: stan->c++ converter; Samplers: NUTS, HMC; Autodiff; Math libraries; User input; example.stan; example.cpp; example.data.R; example; output.csv.]

[Figure IV.2: Structure of a Stan sampling model using the stan_pwa module. Legend: circled items must be adjusted by the user; boxed items are program files; blue items are stan_pwa files. Diagram labels: stan->c++ converter; Samplers: NUTS, HMC; Autodiff; Math libraries; STAN_data_generator.stan; STAN_data_generator.cpp; STAN_data_generator; STAN_data_generator.data.R; generated_data.csv; stan_pwa/bin: evaluate amplitudes at generated events; stan_pwa/src/model.hpp; STAN_amplitude_fitting.stan; STAN_amplitude_fitting.cpp; STAN_amplitude_fitting; STAN_amplitude_fitting.data.R; output1.csv, output2.csv, ... (multiple chains).]


11 Data generation and sampling examples

11.1 Gaussian distribution example

A simple sampling task may be performed with CmdStan-2.9.0 as follows. These instructions are valid for a bash session; note that they are not written in the spirit of elegant Stan programming but are structured in a way that makes the transition to PWA easier.

Assume we would like to generate 10 events distributed according to a Gaussian with mean 0 and width σ = 2; after that, we would like to sample σ using the obtained events.

First, download CmdStan and create a directory where your model will be stored:

..$ wget https://github.com/stan-dev/cmdstan/releases/download/v2.9.0/cmdstan-2.9.0.zip
..$ unzip cmdstan-2.9.0.zip
..$ cd cmdstan-2.9.0
..$ mkdir -p gaussian_example/stan
..$ cd gaussian_example/stan

In this directory, create a file STAN_data_generator.stan with the contents specified in Fig. IV.3.

Each *.stan file consists of various blocks; some of them are listed below.

• Block functions allows the user to define custom functions.

• Block data declares variables whose values are known before the sampling; they must be specified in a corresponding *.data.R file.

• Block parameters declares variables that we intend to sample.

• Block model specifies the likelihood function (or, to be precise, its logarithm).

For more elegant ways to set the model and precise information on variable declaration and syntax, cf. [26].

Create a file STAN_data_generator.data.R with the following contents: "sigma <- 2.0" (omitting the quotation marks). To generate the data, compile the Stan file and call it:

..$ cd ../.. # Back to the cmdstan-2.9.0 folder
..$ make gaussian_example/stan/STAN_data_generator

..$ ./STAN_data_generator sample num_samples=10 \
        data file=STAN_data_generator.data.R

The last command creates a file output.csv with the generated events (stored in the last column of the file) and additional sampling information.


functions {
  real my_f_gen(real x, real sigma) {
    return exp( - x*x / 2 / sigma / sigma );
  }
}
data { real<lower=0> sigma; }
parameters { real y; }
model {
  real log_f;
  log_f <- 0;
  log_f <- log_f + log( my_f_gen(y, sigma) );
  increment_log_prob(log_f);
}

Figure IV.3: Example of a data generation file in the Stan modeling language.

To use these events to sample the Gaussian width σ, create a file STAN_amplitude_fitting.stan with the contents specified in Fig. IV.4.

The difference between data generation and parameter fitting manifests in the sampling program as follows:

• Variables in the blocks data and parameters exchange their places. One must also account for the fact that there are multiple data points.

• The likelihood function must be normalized (the integral ∫ f_gen(y, sigma) dy must be the same for all values of sigma). During data generation the value of sigma is fixed, and there is no need to normalize the likelihood. During parameter sampling the value of sigma is changing, and the normalization of f_gen must be adjusted accordingly.
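The effect of the missing normalization can be checked numerically. The sketch below is plain Python rather than Stan; the function names and the grid of trial widths are illustrative assumptions, and the y values are the placeholder entries from the data file listing in this section. It scans a grid of sigma values and locates the maximum of the log-likelihood with and without the normalization term.

```python
import math


def log_f_gen(y, sigma):
    """Logarithm of the unnormalized Gaussian shape exp(-y^2 / (2 sigma^2))."""
    return -y * y / (2.0 * sigma * sigma)


def log_norm(sigma):
    """Logarithm of the normalization integral sqrt(2 pi) * sigma."""
    return math.log(math.sqrt(2.0 * math.pi) * sigma)


def log_likelihood(ys, sigma, normalized=True):
    total = sum(log_f_gen(y, sigma) for y in ys)
    if normalized:
        total -= len(ys) * log_norm(sigma)
    return total


# Placeholder values from the listing above.
ys = [-0.18, 0.32, 0.10, 0.09, -0.30, 0.14, -0.38, 0.51, 0.13]
grid = [0.05 * k for k in range(1, 201)]  # trial widths from 0.05 to 10

best_norm = max(grid, key=lambda s: log_likelihood(ys, s, True))
best_unnorm = max(grid, key=lambda s: log_likelihood(ys, s, False))
print("normalized maximum at sigma =", best_norm)
print("unnormalized maximum at sigma =", best_unnorm)
```

Without the normalization the log-likelihood grows monotonically with sigma, so the maximum sits at the upper edge of the grid regardless of the data; with the normalization it sits near the root mean square of the data.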

Create a file STAN_amplitude_fitting.data.R with contents

N <- 10
y <- c(-0.18, 0.32, 0.10, 0.09, -0.30, 0.14, -0.38, 0.51, 0.13)

replacing the placeholder entries of y with the data stored in the last column of output.csv; y must contain exactly N entries.
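Copying the last column by hand is error-prone, so the step can be scripted. The sketch below is an illustrative assumption, not the stan_pwa utility script: the function names are hypothetical, and the embedded CSV text is a made-up stand-in for output.csv. It assumes only that comment lines start with # and that the first remaining line is the header.

```python
import csv


def last_column(csv_text):
    """Return (column name, values) for the last column of CmdStan-style CSV text."""
    lines = [ln for ln in csv_text.splitlines()
             if ln.strip() and not ln.lstrip().startswith("#")]
    reader = csv.reader(lines)
    header = next(reader)
    values = [float(row[-1]) for row in reader]
    return header[-1], values


def to_rdump(name, values):
    """Format the values as the two-line *.data.R content used in the text."""
    body = ", ".join(repr(v) for v in values)
    return "N <- {}\n{} <- c({})\n".format(len(values), name, body)


# Made-up stand-in for output.csv; real files contain more columns and comments.
sample = """# hypothetical header comment
lp__,accept_stat__,y
-0.1,0.9,-0.18
# hypothetical mid-file comment
-0.2,1.0,0.32
"""

name, values = last_column(sample)
print(to_rdump(name, values), end="")
```

Because N is derived from the number of parsed values, the declared size and the array length cannot get out of step.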


functions {
  real my_f_gen(real x, real sigma) {
    return exp( - x*x / 2 / sigma / sigma );
  }
  real my_norm(real sigma) {
    return sqrt(2.0 * pi()) * sigma;
  }
}
data {
  int<lower=0> N; // Number of events
  real y[N];
}
parameters { real<lower=0.> sigma; }
model {
  real log_f;
  log_f <- 0;
  for (n in 1:N)
    log_f <- log_f +
      log( my_f_gen(y[n], sigma) / my_norm(sigma) );
  increment_log_prob(log_f);
}

Figure IV.4: Example of a parameter fitting file in the Stan modeling language.

To sample sigma, compile the Stan file and call it:

..$ cd ../.. # Back to the cmdstan-2.9.0 folder
..$ make gaussian_example/stan/STAN_amplitude_fitting

..$ ./STAN_amplitude_fitting sample num_samples=1000 \
        data file=STAN_amplitude_fitting.data.R

This model can be generated and fitted more simply using the utility scripts from stan_pwa. After installing the module (requirements and installation instructions may be found at https://github.com/atsipenyuk/stan_pwa), you may run the example:



[Figure IV.5: Data generated from a Gaussian distribution with width σ = 2 (left; number of samples vs. y); the parameter σ fitted to the generated data (right; number of samples vs. sigma).]

..$ cd models/example_models/gaussian_example/
..$ ./../../../build.sh # Run Stan's make
..$ ./../../../generate.sh 10000 # Generate 10000 pts
..$ ./../../../prepare_for_fitting.sh # Copy pts to *.data.R file
..$ ./../../../fit.py # Fit sigma, convert results to *.root file
..$ root output/output.root
root [1] t->Draw("sigma", "", "", 1000, 0);

The last command should display a histogram of the parameter sigma, which should peak around 2.0; see Fig. IV.5.
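The position and width of this peak can be cross-checked without a sampler. For a zero-mean Gaussian the maximum-likelihood estimate of σ is the root mean square of the events, and its asymptotic standard error is σ/√(2N). The sketch below uses Python's own random generator instead of the Stan-generated events; the function names and the seed are assumptions.

```python
import math
import random


def sigma_mle(ys):
    """Maximum-likelihood width of a zero-mean Gaussian: sqrt(mean(y^2))."""
    return math.sqrt(sum(y * y for y in ys) / len(ys))


def sigma_std_error(sigma_hat, n):
    """Asymptotic standard error of the width estimate, sigma / sqrt(2 n)."""
    return sigma_hat / math.sqrt(2.0 * n)


rng = random.Random(1)  # arbitrary seed, for reproducibility only
ys = [rng.gauss(0.0, 2.0) for _ in range(10000)]
s = sigma_mle(ys)
print("sigma estimate:", s)
print("expected spread:", sigma_std_error(s, len(ys)))
```

For 10000 events the expected spread is about 0.014, which is the scale on which the histogram of sampled sigma values should be concentrated around 2.0.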

11.2 Toy PWA example

In the previous example we defined the likelihood function in the *.stan file. However, if the number of user-defined functions becomes too large, as in the case of PWA models, the *.stan file quickly becomes unreadable since the Stan language does not support file inclusion or object-oriented programming. Instead, one can define the likelihood function in C++ and then expose it to the Stan parser. (For detailed information, see https://github.com/stan-dev/stan/wiki/Contributing-New-Functions-to-Stan.) The user-defined function must be templated so that Stan's automatic differentiation can calculate the gradient of the likelihood function. For example, the function my_f_gen defined in Fig. IV.3 may be implemented in C++ as presented in Fig. IV.6. The file containing the implemented function is called my_f_gen.hpp and is placed directly in the cmdstan-2.9.0 folder. In order to expose my_f_gen to Stan, one must then include the C++ file in the Stan library, for example, by adding the lines


#include <math.h> /* exp */
#include <boost/math/tools/promotion.hpp> /* promote_args */
namespace stan { namespace math {
  template <typename T0, typename T1>
  inline
  typename boost::math::tools::promote_args<T0,T1>::type
  my_f_gen(const T0 &x, const T1 &sigma) {
    return exp( -x*x / 2.0 / sigma / sigma );
  }
} }

Figure IV.6: Example of templated C++ function that may be parsed by Stan.

/* Relative path from cmdstan-2.9.0/stan_2.9.0/src
   or any other folder included to compiler options
   in cmdstan-2.9.0/makefile */
#include <../../my_f_gen.hpp>

to the file cmdstan-2.9.0/stan_2.9.0/lib/stan_math_2.9.0/stan/math.hpp. One must also expose the function to the Stan parser by adding the lines

// Tell Stan that my_f_gen returns a double
// and requires two arguments of type double
add("my_f_gen", DOUBLE_T, DOUBLE_T, DOUBLE_T);

to the file cmdstan-2.9.0/stan_2.9.0/src/stan/lang/function_signatures.h.
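The reason for templating can be illustrated outside of C++. If a function is written only in terms of generic arithmetic, it can be evaluated on plain numbers or on a number type that carries derivative information. The sketch below uses forward-mode dual numbers in Python as a stand-in; Stan itself uses reverse-mode automatic differentiation with its own variable type, so the class `Dual` and the helper `exp` are purely illustrative assumptions.

```python
import math


class Dual:
    """Number carrying a value and its derivative with respect to one input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __neg__(self):
        return Dual(-self.value, -self.deriv)

    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.value * o.value,
                    self.deriv * o.value + self.value * o.deriv)

    __rmul__ = __mul__

    def __truediv__(self, other):
        o = Dual.lift(other)
        return Dual(self.value / o.value,
                    (self.deriv * o.value - self.value * o.deriv)
                    / (o.value * o.value))

    def __rtruediv__(self, other):
        return Dual.lift(other) / self


def exp(x):
    """exp that accepts both floats and Dual numbers."""
    if isinstance(x, Dual):
        e = math.exp(x.value)
        return Dual(e, e * x.deriv)
    return math.exp(x)


def my_f_gen(x, sigma):
    """Same expression as in Fig. IV.6, written without fixing the number type."""
    return exp(-x * x / 2.0 / sigma / sigma)


plain = my_f_gen(1.0, 2.0)             # ordinary evaluation
dual = my_f_gen(1.0, Dual(2.0, 1.0))   # also tracks d/dsigma
print(plain, dual.value, dual.deriv)
```

The same function body serves both calls; only the argument type decides whether a derivative is propagated, which is the role the template parameters T0 and T1 play in Fig. IV.6.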

The module stan_pwa introduces various likelihood functions in exactly the same way. The link stan_pwa/src/model.hpp is included into Stan; it points to the specific model that we want to use, which is stored in the file stan_pwa/models/*/src/model.hpp.

An example of data generated using stan_pwa according to the model described in Table II.4 is presented in Fig. IV.7. The biggest peak at $m^2_{\pi^+\pi^-} \approx 1.9$ GeV$^2$ corresponds to the resonance $f_0(1370)$ interfering with the resonance $f_2(1270)$ at $m^2_{\pi^+\pi^-} \approx 1.6$ GeV$^2$. The distinct spin-1 $\rho$ peaks are visible at $m^2_{\pi^+\pi^-} \approx 0.6$ GeV$^2$. The narrow lines at $m^2_{\pi^+\pi^-} \approx 1$ GeV$^2$ correspond to the resonance $f_0(980)$. The barely visible peak near $m^2_{\pi^+\pi^-} \approx 2.2$ GeV$^2$ may be interpreted as $f_0(1500)$, although it is harder to recognize due to its interference with $f_2(1270)$ and $f_0(1370)$.

[Figure IV.7: Simulated data for D+ → π+π−π+ decay, 100 000 events. Panels: m²(π+π−) vs. m²(π+π−), axes labelled in GeV; number of samples vs. m²(π+π−).]

Consider the following example model: let a D+ meson decay into three pions via two fictitious Breit-Wigner resonances $f_0(1000)$ and $f_0(1200)$, each with a width of 100 MeV; the index denotes the spin of the resonance and the argument the energy of its peak in MeV. Fix the complex number in front of $f_0(1000)$ to 1 and the complex number in front of $f_0(1200)$ to $1 \cdot e^{i\pi \cdot 45/180} \approx 0.7 + 0.7i$. The following instructions show how to generate data according to this model and how to sample the complex number in front of $f_0(1200)$ (the complex number in front of $f_0(1000)$ is kept fixed as the reference parameter).


..$ cd cmdstan-2.9.0/stan_pwa/models/two_toy_res
..$ ./../../../relink_model.sh # Use the correct model.hpp file
..$ ./../../../build.sh # Build executable files
..$ ./../../../generate.sh 100000 # Generate 100 000 events
..$ root output/generated_data.root # You may look at the data
root [1] t->Draw("y.2:y.1>>hh(100,0,3,100,0,3)","","COLZ", 20000, 0);
root [2] .q
..$ # To evaluate amplitudes, python library of our model must be made
..$ ./../../../wrap_python.sh # Creates build/model.so
..$ # Evaluate amplitudes and Monte Carlo integrals
..$ ./../../../prepare_for_fitting.sh
..$ # Fitting 1000 warmup + 1000 samples using this model
..$ # with 100 000 data pts requires ca. 45 min. on a home computer
..$ ./../../../fit.py -c 2 # Run two sampling chains, 1000 samples
..$ ./../../../merge_output_chains.sh
..$ root output/output.root
root [1] t->Draw("y.2:y.1","","COLZ", 20000, 0);
root [2] .q

The results of such a run are presented in Fig. IV.8. The algorithm correctly finds the fitted parameter; we can observe the correlation between the real and imaginary parts of the parameter.
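To see why the data constrain the complex coefficient, and in particular its phase, it helps to evaluate the interference of the two resonances directly. The sketch below is a strongly simplified stand-in for the model of this subsection: it uses a constant-width relativistic Breit-Wigner shape in a single invariant mass squared and ignores phase space, barrier factors and symmetrization. The function names and this choice of line shape are assumptions, not the stan_pwa implementation.

```python
import cmath
import math


def breit_wigner(m2, m0, width):
    """Constant-width Breit-Wigner 1 / (m0^2 - m^2 - i m0 width), in GeV units."""
    return 1.0 / complex(m0 * m0 - m2, -m0 * width)


def intensity(m2, a_1200):
    """|1 * BW(1000) + a_1200 * BW(1200)|^2 at invariant mass squared m2."""
    amp = breit_wigner(m2, 1.0, 0.1) + a_1200 * breit_wigner(m2, 1.2, 0.1)
    return abs(amp) ** 2


# Coefficient used for data generation in the text: magnitude 1, phase 45 degrees.
a_true = cmath.rect(1.0, math.radians(45.0))

for m2 in (0.9, 1.0, 1.21, 1.44, 1.6):
    print(m2, intensity(m2, a_true), intensity(m2, -a_true))
```

Flipping the sign of the coefficient leaves the individual resonance shapes unchanged but alters the intensity between the two peaks, which is the information the fit uses to determine the real and imaginary parts.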

11.3 Caveats and further development

The advantages of Stan include powerful sampling algorithms and many native features that ease the optimization of models and allow the user to check sampling information (such as the autocorrelation of samples, the number of gradient steps, and the parameters of the NUTS/HMC algorithms). Some features have not yet been realized (such as the possibility to set priors for NUTS without the warm-up phase) but are announced for Stan-3.0.

The main drawback of Stan for PWA is the necessity of bundling together various parts: the sampling program must be partially written in Stan and partially implemented in C++, and data files and initialization files are stored as *.csv and *.R files. The scripts that perform calculations between data generation and amplitude fitting are necessary to check models for self-consistency. In larger models these scripts become unjustifiably obscure.

[Figure IV.8: 100 000 simulated events for the fictitious resonances $f_0(1000)$ and $f_0(1200)$ (top; m²(π+π−) vs. m²(π+π−), axes labelled in GeV) and the fitted complex amplitude $a_{f_0(1200)}$ in Cartesian coordinates (bottom; Im(a_f_0(1200)) vs. Re(a_f_0(1200))). The generating amplitude is shown in red.]

The module stan_pwa is best suited to the analysis of PWA sampling properties such as the correlation between different sampling parameters or the choice of appropriate priors. Models can be easily maintained and modified as long as the structure of the sampling (described in Section 6.1) is preserved; in other words, as long as the argument types of the likelihood function do not need to be modified. For example, it is relatively easy to introduce new types of resonances or backgrounds (one must only alter C++ files). It is more time-consuming to allow fitting of masses and widths: an integration routine must be exposed to Stan. It is even more time-consuming to implement models with a different likelihood structure such as model-independent fitting: although some parts of the code may be reused, appropriate adjustments must be made in C++ files, Stan files, and data handling scripts (preliminary investigations suggest that one will also have to implement more involved initialization routines).

Bibliography

[1] Matthew D. Hoffman and Andrew Gelman. The No-U-Turn Sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15:1351–1381, 2014.

[2] Stan Development Team. Stan: A C++ library for probability and sampling, version 2.9.0, 2015.

[3] Arseniy Tsipenyuk. stan_pwa: Partial wave analysis module for Stan, 2015.

[4] K.A. Olive et al. (Particle Data Group). Review of particle physics. Chinese Physics C, 38(9):090001, 2014.

[5] M. Reed and B. Simon. Methods of Modern Mathematical Physics. Academic Press Ltd., 2011.

[6] The physics of the B factories. EPJC, 3 2014.

[7] L.D. Landau and E.M. Lifshitz. Quantum Mechanics: Non-Relativistic Theory, volume 3. Pergamon Press, 3rd edition, 1977.

[8] John M. Blatt and Victor F. Weisskopf. Theoretical Nuclear Physics. John Wiley and Sons, Inc., 1952.

[9] Michael E. Peskin and Daniel V. Schroeder. An Introduction to Quantum Field Theory. Westview Press, 1995.

[10] Frank v. Hippel and C. Quigg. Centrifugal-barrier effects in resonance partial decay widths, shapes, and production amplitudes. Physical Review D, 5:624–638, 2 1972.

[11] Harald Friedrich. Scattering Theory. Springer, 2013.

[12] J. Dittrich and P. Exner. A non-relativistic model of two-particle decay. I. Galilean invariance. Czech. J. Phys., 37:503, 5 1987.

[13] J. Dittrich and P. Exner. A non-relativistic model of two-particle decay. II. Reduced resolvent. Czech. J. Phys., 37:1028, 5 1987.


[14] J. Dittrich and P. Exner. A non-relativistic model of two-particle decay. III. The pole approximation. Czech. J. Phys., 38:591, 5 1988.

[15] J. Dittrich and P. Exner. A non-relativistic model of two-particle decay. IV. Relation to the scattering theory, spectral concentration, and bound states. Czech. J. Phys., 39:121, 5 1989.

[16] J. Beringer et al. (Particle Data Group). Review of particle physics. Physical Review D, 86:010001, 2012.

[17] N.G. van Kampen. S matrix and causality condition. I. Maxwell field. Physical Review, 89:1072–1079, 3 1953.

[18] N.G. van Kampen. S matrix and causality condition. II. Nonrelativistic particles. Physical Review, 91:1267–1276, 5 1953.

[19] N. Byers and C. N. Yang. Physical regions in invariant variables for n particles and the phase-space volume element. Rev. Mod. Phys., 36:595–609, Apr 1964.

[20] S. Kobayashi. Transformation Groups in Differential Geometry. Springer, 1972.

[21] V. Filippini, A. Fontana, and A. Rotondi. Covariant spin tensors in meson spectroscopy. Physical Review D, 51:2247, 1995.

[22] S. U. Chung and J. Friedrich. Covariant helicity-coupling amplitudes: A new formulation. Physical Review D, 78:074027, 2008.

[23] K. Nakamura et al. (Particle Data Group). Review of particle physics. JPG, 37:075021, 2010.

[24] Charles J. Geyer. Introduction to Markov chain Monte Carlo. In Steve Brooks, Andrew Gelman, Galin L. Jones, and Xiao-Li Meng, editors, Handbook of Markov Chain Monte Carlo, chapter 1, pages 3–48. CRC Press, 2011.

[25] Radford M. Neal. MCMC using Hamiltonian dynamics. In Steve Brooks, Andrew Gelman, Galin L. Jones, and Xiao-Li Meng, editors, Handbook of Markov Chain Monte Carlo, chapter 5, pages 113–162. CRC Press, 2011.

[26] Stan Development Team. Stan Modeling Language Users Guide and Reference Manual, Version 2.9.0, 2015.

[27] Stan Development Team. CmdStan: the command-line interface to Stan, version 2.9.0, 2015.
