Numerical Methods for Data Analysis

Michael O. Distler
[email protected]

Bosen (Saar), August 29 - September 3, 2010

- Fundamentals
    - Probability distributions
    - Expectation values, error propagation
    - Parameter estimation
- Regression analysis
    - Maximum likelihood
    - Linear Regression
- Advanced topics


Some statistics books, papers, etc.

- Volker Blobel and Erich Lohrmann: Statistische und numerische Methoden der Datenanalyse, Teubner Verlag (1998)
- Siegmund Brandt: Datenanalyse, BI Wissenschaftsverlag (1999)
- Philip R. Bevington: Data Reduction and Error Analysis for the Physical Sciences, McGraw-Hill (1969)
- Roger J. Barlow: Statistics, John Wiley & Sons (1993)
- Glen Cowan: Statistical Data Analysis, Oxford University Press (1998)
- Frederick James: Statistical Methods in Experimental Physics, 2nd Edition, World Scientific (2006)

Wes Metzger’s lecture notes: www.hef.kun.nl/~wes/stat_course/statist.pdf

Glen Cowan’s lecture notes: www.pp.rhul.ac.uk/~cowan/stat_course.html

Particle Physics Booklet: http://pdg.lbl.gov/


Introduction

Data analysis in nuclear and particle physics

- Observe events of a certain type
- Measure characteristics of each event
- Theories predict distributions of these properties up to free parameters
- Some tasks of data analysis:
    - Estimate (measure) the parameters;
    - Quantify the uncertainty of the parameter estimates;
    - Test the extent to which the predictions of a theory are in agreement with the data.


Introduction

Philosophy of Science

Karl R. Popper (born 28 July 1902 in Vienna, Austria; died 17 September 1994 in London, England) coined the term critical rationalism. At the heart of his philosophy of science lies the account of the logical asymmetry between verification and falsifiability. Logik der Forschung, 1934.

→ Existence of a true value of measured quantities and derived values.


Theory of probability

Probability theory, mathematics:

→ Kolmogorov axioms

Classical interpretation, frequentist probability:

Pragmatic definition of probability:

$$p(E) = \lim_{N \to \infty} \frac{n(E)}{N}$$

where n(E) = number of occurrences of event E and N = number of trials (experiments).

- Experiments have to be repeatable (in principle).
- Disadvantage: Strictly speaking, one cannot make statements on the probability of any true value. Only upper and lower limits are possible, given a certain confidence level.


Theory of probability

- Probability theory, mathematics
- Classical interpretation, frequentist probability
- Bayesian statistics, subjective probability:
  Prior subjective assumptions enter into the calculation of the probability of a hypothesis H.

$$p(H) = \text{degree of belief that } H \text{ is true}$$

Metaphorically speaking: probabilities are the ratio of the (maximum) wager and the anticipated prize in a bet.


Suppose there is a town with green and yellow taxicabs. In a hit-and-run accident a man was hurt, and a witness saw a green cab.

In court the lawyer of the taxi company impeaches the credibility of the witness because of the lighting conditions. A test showed that under similar conditions 10% of the witnesses confuse the color of the cabs.

Would you believe the witness?

What if there were 20 times more yellow cabs than green cabs? Would you still believe the witness?


taxicabs      witness sees ...     statement is ...
200 yellow    180 × “yellow”
               20 × “green”        20/29 = 69% wrong
 10 green       9 × “green”         9/29 = 31% true
                1 × “yellow”
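The numbers in the table follow from Bayes' theorem. The following is a minimal Python sketch of that calculation, assuming (as the table does) that the witness confuses the colors at the same 10% rate for both kinds of cab. The function name `posterior_green` is chosen here for illustration only.

```python
from fractions import Fraction

def posterior_green(n_green, n_yellow, error_rate):
    """P(cab was green | witness says "green") via Bayes' theorem.

    Assumes the witness mislabels either color with the same error_rate.
    """
    # Green cabs correctly reported as "green"
    says_green_and_green = n_green * (1 - error_rate)
    # Yellow cabs wrongly reported as "green"
    says_green_and_yellow = n_yellow * error_rate
    return says_green_and_green / (says_green_and_green + says_green_and_yellow)

# Town from the table: 10 green cabs, 200 yellow cabs, 10% confusion rate
p_true = posterior_green(10, 200, Fraction(1, 10))
print(p_true, float(p_true))       # probability that the statement is true
print(1 - p_true)                  # probability that the statement is wrong
```

With equal numbers of green and yellow cabs the same function gives 9/10, which shows how strongly the prior (the fleet composition) drives the result.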

Disadvantage: Prior hypotheses influence the probability.
Advantages for rare and one-time events, like noisy signals or catastrophe modeling.

In this lecture we will focus on classical statistics, i.e. error estimates have to be understood as confidence regions.


Combining probabilities

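Two kinds of events are given: A and B. The probability of A is p(A), that of B is p(B). Then the probability of A or B is:

$$p(A \text{ or } B) = p(A) + p(B) - p(A \text{ and } B)$$

If A and B are mutually exclusive, then p(A and B) = 0.

Example: Drawing from a deck of German Skat cards (32 cards).

$$p(\text{Ace or spades}) = \frac{4}{32} + \frac{8}{32} - \frac{1}{32} = \frac{11}{32}$$

Special case: $B = \bar{A}$ (A will NOT occur). Since A and $\bar{A}$ are mutually exclusive:

$$p(A \text{ or } \bar{A}) = p(A) + p(\bar{A}) = 1$$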
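The Skat example can be checked by brute-force counting. This is a small Python sketch, assuming the usual 32-card Skat deck (ranks 7 to Ace in four suits). The helper `prob` and the English suit names are illustrative choices, not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Assumed 32-card Skat deck: 8 ranks in each of 4 suits
ranks = ["7", "8", "9", "10", "J", "Q", "K", "A"]
suits = ["clubs", "spades", "hearts", "diamonds"]
deck = list(product(ranks, suits))

def prob(event):
    """Probability of an event as (favourable cards) / (all cards)."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_ace(card):
    return card[0] == "A"

def is_spade(card):
    return card[1] == "spades"

# Direct count of "Ace or spades"
p_or = prob(lambda c: is_ace(c) or is_spade(c))
# Same quantity from p(A) + p(B) - p(A and B)
p_sum = prob(is_ace) + prob(is_spade) - prob(lambda c: is_ace(c) and is_spade(c))
print(p_or, p_sum)
```

Both routes give the same fraction, because the ace of spades would otherwise be counted twice.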


Combining probabilities

Joint probability of A and B occurring simultaneously:

$$p(A \text{ and } B) = p(A) \cdot p(B|A)$$

p(B|A) is called the conditional probability. If A and B are independent, one gets p(B|A) = p(B) and therefore

$$p(A \text{ and } B) = p(A) \cdot p(B)$$


Death in the mountains

In a book on the mountaineering achievements of Reinhold Messner one reads the following: “If you consider that the probability of dying in an expedition to an eight-thousander is 3.4%, then Messner had a probability of 3.4% · 29 = 99% to be killed during his 29 expeditions.”

That cannot be right. What if Messner sets off on a 30th expedition? The same rule would give 3.4% · 30 = 102%.

The probability of surviving an expedition is obviously 1 − 0.034 = 0.966. If one assumes that the various expeditions represent independent events, the probability of surviving all 29 expeditions is:

$$P = 0.966^{29} = 0.367$$
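The arithmetic can be reproduced in a few lines. This is a Python sketch under the same assumption of independent expeditions; the variable names are illustrative.

```python
p_death = 0.034          # probability of dying on one expedition (from the quote)
n_expeditions = 29

# Naive (wrong) rule from the book: simply add up the probabilities
naive = p_death * n_expeditions

# Correct rule for independent events: multiply the survival probabilities
p_survive_all = (1 - p_death) ** n_expeditions
p_die_at_some_point = 1 - p_survive_all

print(f"naive sum:         {naive:.3f}")
print(f"survive all 29:    {p_survive_all:.3f}")
print(f"die at some point: {p_die_at_some_point:.3f}")
```

The naive sum exceeds 1 as soon as a 30th expedition is added, whereas the product of survival probabilities always stays between 0 and 1.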


Definitions

Probability mass function (pmf) and probability density function (pdf) of a measured value (= random variable):

[Figure: two panels with horizontal range 0 to 30 and vertical range 0 to 0.1. Left: an example pmf f(n) plotted against n. Right: an example pdf f(x) plotted against x.]

- f(n): discrete
- f(x): continuous

Normalization:

$$f(n) \ge 0, \quad \sum_n f(n) = 1 \qquad\qquad f(x) \ge 0, \quad \int_{-\infty}^{\infty} f(x)\,dx = 1$$

Probability:

$$p(n_1 \le n \le n_2) = \sum_{n=n_1}^{n_2} f(n) \qquad\qquad p(x_1 \le x \le x_2) = \int_{x_1}^{x_2} f(x)\,dx$$


Definitions

Cumulative distribution function (CDF):

$$F(x) = \int_{-\infty}^{x} f(x')\,dx', \qquad F(-\infty) = 0, \quad F(\infty) = 1$$

Example: Decay time t of a radioactive nucleus with mean lifetime τ:

$$f(t) = \frac{1}{\tau} e^{-t/\tau}, \qquad F(t) = 1 - e^{-t/\tau}$$

[Figure: f(t) · 12 s and F(t) plotted against t/s for 0 ≤ t ≤ 50 s.]
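The relation between f(t) and F(t) can be verified numerically. The Python sketch below integrates the exponential density with a simple trapezoidal rule and compares the result with the closed form. The value τ = 12 s is an assumption suggested by the scaling in the figure, and the helper names are illustrative.

```python
import math

TAU = 12.0  # assumed mean lifetime in seconds (illustrative value)

def pdf(t, tau=TAU):
    """Exponential decay-time density f(t) = exp(-t/tau) / tau for t >= 0."""
    return math.exp(-t / tau) / tau if t >= 0 else 0.0

def cdf(t, tau=TAU):
    """Closed-form CDF F(t) = 1 - exp(-t/tau) for t >= 0."""
    return 1.0 - math.exp(-t / tau) if t >= 0 else 0.0

def integrate(f, a, b, steps=10_000):
    """Composite trapezoidal rule for the integral of f from a to b."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

for t in (6.0, 12.0, 36.0):
    print(t, integrate(pdf, 0.0, t), cdf(t))
```

At t = τ the CDF equals 1 − 1/e, i.e. about 63% of the nuclei have decayed after one mean lifetime.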


Expectation values and moments

Mean: If a random variable X takes on the values $X_1, X_2, \ldots, X_n$ with probability $p(X_i)$, then the expected value of X (“mean”) is

$$\bar{X} = \langle X \rangle = \sum_{i=1}^{n} X_i \cdot p(X_i)$$

The expected value of an arbitrary function h(x) for a continuous random variable is:

$$E[h(x)] = \int_{-\infty}^{\infty} h(x) \cdot f(x)\,dx$$

The mean is the expected value of x:

$$E[x] = \bar{x} = \int_{-\infty}^{\infty} x \cdot f(x)\,dx$$


Expectation values and moments

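$$\text{standard deviation} = \left\{ \text{mean of (deviation from } \bar{x})^2 \right\}^{1/2}$$

$$\sigma^2 = \overline{(x - \bar{x})^2} = \int_{-\infty}^{\infty} (x - \bar{x})^2 \cdot f(x)\,dx$$

$$= \int_{-\infty}^{\infty} (x^2 - 2x\bar{x} + \bar{x}^2) \cdot f(x)\,dx = \overline{x^2} - 2\bar{x}\,\bar{x} + \bar{x}^2 = \overline{x^2} - \bar{x}^2$$

σ² = variance, σ = standard deviation

Discrete distributions:

$$\sigma^2 = \frac{1}{N} \left( \sum x^2 - \frac{\left(\sum x\right)^2}{N} \right)$$

Attention: This is the definition of the variance! To get an unbiased estimate of the variance, the prefactor 1/N is replaced by 1/(N−1).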
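The difference between the two prefactors is easy to see on a small sample. The Python sketch below implements the formula above with 1/N and with 1/(N−1) and cross-checks both against the standard library's `statistics.pvariance` and `statistics.variance`. The data values are made up for illustration.

```python
import statistics

def variance_definition(xs):
    """Variance with prefactor 1/N (the definition)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / n

def variance_unbiased(xs):
    """Variance estimate with prefactor 1/(N-1) (unbiased for a sample)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative sample

print(variance_definition(data), statistics.pvariance(data))
print(variance_unbiased(data), statistics.variance(data))
```

Note that the form with Σx² − (Σx)²/N can lose precision when the mean is large compared with the spread; for such data a two-pass calculation around the mean is the safer choice.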


Expectation values and moments

Moments are the expected values of $x^n$ and of $(x - \langle x \rangle)^n$. They are called the nth algebraic moment $\mu_n$ and the nth central moment $\mu'_n$, respectively.

Skewness v(x) is a measure of the asymmetry of the probability distribution of a random variable x:

$$v = \frac{\mu'_3}{\sigma^3} = \frac{E[(x - E[x])^3]}{\sigma^3}$$

Kurtosis is a measure of the “peakedness” of the probability distribution of a random variable x:

$$\gamma_2 = \frac{\mu'_4}{\sigma^4} - 3 = \frac{E[(x - E[x])^4]}{\sigma^4} - 3$$
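As an illustration of these definitions, the following Python sketch evaluates skewness and kurtosis for a discrete distribution given as values and probabilities. The fair and the loaded die are made-up examples, and the function names are illustrative.

```python
def central_moment(values, probs, order):
    """n-th central moment E[(x - E[x])^n] of a discrete distribution."""
    mean = sum(x * p for x, p in zip(values, probs))
    return sum((x - mean) ** order * p for x, p in zip(values, probs))

def skewness(values, probs):
    """v = mu'_3 / sigma^3"""
    sigma = central_moment(values, probs, 2) ** 0.5
    return central_moment(values, probs, 3) / sigma ** 3

def excess_kurtosis(values, probs):
    """gamma_2 = mu'_4 / sigma^4 - 3"""
    variance = central_moment(values, probs, 2)
    return central_moment(values, probs, 4) / variance ** 2 - 3.0

faces = [1, 2, 3, 4, 5, 6]
fair = [1 / 6] * 6
loaded = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]   # hypothetical die favouring the six

print(skewness(faces, fair), excess_kurtosis(faces, fair))
print(skewness(faces, loaded), excess_kurtosis(faces, loaded))
```

The fair die is symmetric, so its skewness vanishes, and its flat shape gives a negative γ₂. The loaded die has its long tail towards small values and therefore a negative skewness.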


Binomial distribution

The binomial distribution is the discrete probability distribution of the number of successes r in a sequence of n independent yes/no experiments, each of which yields success with probability p (Bernoulli experiment).

$$P(r) = \binom{n}{r} p^r \cdot (1 - p)^{n-r}$$

P(r) is normalized. Proof: Binomial theorem with q = 1 − p.

The mean of r is:

$$\langle r \rangle = E[r] = \sum_{r=0}^{n} r\,P(r) = np$$

The variance σ² is:

$$V[r] = E[(r - \langle r \rangle)^2] = \sum_{r=0}^{n} (r - \langle r \rangle)^2\,P(r) = np(1 - p)$$
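The three statements (normalization, mean, variance) can be checked by summing over the distribution directly. A short Python sketch, with n = 10 and p = 0.6 picked as illustrative values:

```python
import math

def binomial_pmf(r, n, p):
    """P(r) = C(n, r) * p^r * (1 - p)^(n - r)"""
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)

n, p = 10, 0.6   # illustrative values
pmf = [binomial_pmf(r, n, p) for r in range(n + 1)]

norm = sum(pmf)                                             # should be 1
mean = sum(r * P for r, P in enumerate(pmf))                # should be n*p
var = sum((r - mean) ** 2 * P for r, P in enumerate(pmf))   # should be n*p*(1-p)

print(norm, mean, var)
print(n * p, n * p * (1 - p))
```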


Poisson distribution

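The Poisson distribution is given by:

$$P(r) = \frac{\mu^r e^{-\mu}}{r!}$$

The mean is:

$$\langle r \rangle = \mu$$

The variance is:

$$V[r] = \sigma^2 = np = \mu$$

Here np = µ refers to the binomial distribution in the limit p → 0 with np = µ held fixed, where the binomial variance np(1 − p) becomes np.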

[Figure: four panels showing the Poisson distribution P(r) for r = 0 to 10, with µ = 0.5, µ = 1, µ = 2 and µ = 4.]
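A quick numerical check of the Poisson mean and variance, together with a comparison against a binomial distribution with large n and small p. This is a Python sketch; µ = 2, the cutoff of 60 terms and n = 10000 are illustrative choices.

```python
import math

def poisson_pmf(r, mu):
    """P(r) = mu^r * exp(-mu) / r!"""
    return mu ** r * math.exp(-mu) / math.factorial(r)

def binomial_pmf(r, n, p):
    """P(r) = C(n, r) * p^r * (1 - p)^(n - r)"""
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)

mu = 2.0
rs = range(60)   # terms beyond r = 60 are negligible for mu = 2

mean = sum(r * poisson_pmf(r, mu) for r in rs)
var = sum((r - mean) ** 2 * poisson_pmf(r, mu) for r in rs)
print(mean, var)   # both should be close to mu

# Binomial with many trials and small p, keeping n*p = mu fixed
n = 10_000
max_diff = max(abs(binomial_pmf(r, n, mu / n) - poisson_pmf(r, mu))
               for r in range(11))
print(max_diff)    # largest pointwise difference for r = 0..10
```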


Law of large numbers

The law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times.

According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.

We perform n independent experiments (Bernoulli trials) in which the result j occurs $n_j$ times. With the relative frequency $h_j = n_j/n$:

$$p_j = E[h_j] = E[n_j/n]$$

Since $n_j$ follows a binomial distribution, the variance of $h_j$ is:

$$V[h_j] = \sigma^2(h_j) = \sigma^2(n_j/n) = \frac{1}{n^2} \cdot \sigma^2(n_j) = \frac{1}{n^2} \cdot n p_j (1 - p_j)$$

From the product $p_j(1 - p_j)$, which is $\le \frac{1}{4}$, we can deduce the law of large numbers:

$$\sigma^2(h_j) < 1/n$$
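The shrinking variance of the relative frequency can be observed in a small simulation. The Python sketch below repeats an n-trial experiment many times and compares the observed variance of h with p(1 − p)/n and with the bound 1/n. The probability p = 0.3, the seed and the number of repetitions are arbitrary illustrative choices.

```python
import random

def relative_frequency(n, p, rng):
    """Relative frequency h = n_j / n of a result with probability p in n trials."""
    hits = sum(1 for _ in range(n) if rng.random() < p)
    return hits / n

def variance_of_h(n, p, repeats, rng):
    """Sample variance of h over many repetitions of the n-trial experiment."""
    hs = [relative_frequency(n, p, rng) for _ in range(repeats)]
    mean = sum(hs) / repeats
    return sum((h - mean) ** 2 for h in hs) / (repeats - 1)

rng = random.Random(1)   # fixed seed, only for reproducibility
p = 0.3                  # assumed probability of result j

print("n, observed variance, p(1-p)/n, bound 1/n")
for n in (10, 100, 1000):
    observed = variance_of_h(n, p, repeats=500, rng=rng)
    print(n, observed, p * (1 - p) / n, 1 / n)
```

Each tenfold increase in n should reduce the observed variance by roughly a factor of ten, up to statistical fluctuations of the simulation itself.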


The central limit theorem

The central limit theorem (CLT) states conditions under which the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed.

Let $x_i$ be a sequence of n independent and identically distributed random variables, each having finite values of expectation µ and variance σ² > 0.

In the limit n → ∞ the random variable $w = \sum_{i=1}^{n} x_i$ will be normally distributed with mean $\langle w \rangle = n \langle x \rangle$ and variance $V[w] = n\sigma^2$.


Illustration: The central limit theorem

[Figure: four panels for N = 1, N = 2, N = 3 and N = 10, each on the range −3 to 3 and compared with a Gaussian.]

The sum of uniformly distributed random variables and the standard normal distribution.
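The convergence shown in the figure can be reproduced with a short simulation. The Python sketch below sums N uniform random numbers, standardizes the sum to mean 0 and variance 1, and measures the fraction of samples within ±1. For a standard normal distribution this fraction is 0.6827, while for a single uniform variable (N = 1) it is 1/√3 ≈ 0.577. Seed and sample size are arbitrary choices.

```python
import math
import random

def standardized_sum(n_terms, rng):
    """Sum of n_terms uniform(0,1) variables, shifted and scaled to mean 0, variance 1."""
    w = sum(rng.random() for _ in range(n_terms))
    mean = n_terms * 0.5                 # n * <x>
    sigma = math.sqrt(n_terms / 12.0)    # sqrt(n * sigma^2) with sigma^2 = 1/12
    return (w - mean) / sigma

def fraction_within_one_sigma(n_terms, samples, rng):
    """Fraction of standardized sums that fall inside the interval (-1, 1)."""
    inside = sum(1 for _ in range(samples)
                 if abs(standardized_sum(n_terms, rng)) < 1.0)
    return inside / samples

rng = random.Random(0)   # fixed seed, only for reproducibility
for n_terms in (1, 2, 3, 10):
    print(n_terms, fraction_within_one_sigma(n_terms, 20_000, rng))
```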


Special probability densities

Uniform distribution: This probability density is constant between the limits x = a and x = b:

$$f(x) = \begin{cases} \dfrac{1}{b-a} & a \le x < b \\ 0 & \text{otherwise} \end{cases}$$

Mean and variance:

$$\langle x \rangle = E[x] = \frac{a + b}{2}, \qquad V[x] = \sigma^2 = \frac{(b - a)^2}{12}$$
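The variance follows from V[x] = E[x²] − E[x]². A short derivation, written out here as a worked step that is not on the original slide:

$$E[x^2] = \int_a^b \frac{x^2}{b-a}\,dx = \frac{b^3 - a^3}{3(b-a)} = \frac{a^2 + ab + b^2}{3}$$

$$V[x] = E[x^2] - E[x]^2 = \frac{a^2 + ab + b^2}{3} - \frac{(a+b)^2}{4} = \frac{4a^2 + 4ab + 4b^2 - 3a^2 - 6ab - 3b^2}{12} = \frac{(b-a)^2}{12}$$

For the interval from 0 to 1 this gives σ² = 1/12.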


Gaussian distribution

The most important probability distribution, also called the normal distribution:

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

The Gaussian distribution has two parameters, the mean µ and the variance σ². The probability distribution with mean µ = 0 and variance σ² = 1 is named the standard normal distribution, or N(0,1) for short.

The Gaussian distribution can be derived from the binomial distribution for large values of n and r, and similarly from the Poisson distribution for large values of µ.


Gaussian distribution

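$$\int_{-1}^{1} dx\, N(0,1) = 0.6827 = (1 - 0.3173)$$

$$\int_{-2}^{2} dx\, N(0,1) = 0.9545 = (1 - 0.0455)$$

$$\int_{-3}^{3} dx\, N(0,1) = 0.9973 = (1 - 0.0027)$$

FWHM (full width at half maximum): useful to estimate the standard deviation:

$$\mathrm{FWHM} = 2\sigma\sqrt{2 \ln 2} = 2.355\,\sigma$$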
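These numbers can be reproduced with the error function, since the integral of N(0,1) from −k to k equals erf(k/√2). A minimal Python sketch using only the standard library; the function name `coverage` is illustrative.

```python
import math

def coverage(k):
    """Integral of the standard normal density N(0,1) from -k to +k."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    inside = coverage(k)
    print(f"+-{k} sigma: inside = {inside:.4f}, outside = {1 - inside:.4f}")

# Full width at half maximum in units of sigma
fwhm_factor = 2.0 * math.sqrt(2.0 * math.log(2.0))
print(f"FWHM / sigma = {fwhm_factor:.3f}")
```

In practice the FWHM relation is used the other way round: read the width of a peak at half its height and divide by 2.355 to estimate σ.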


Gaussian distribution

[Figure: two panels on the horizontal range 0 to 14.]

Left side: The binomial distribution for n = 10 and p = 0.6 in comparison to the Gaussian distribution for µ = np = 6 and $\sigma = \sqrt{np(1-p)} = \sqrt{2.4}$.

Right side: The Poisson distribution for µ = 6 and $\sigma = \sqrt{6}$ in comparison to the Gaussian distribution.
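The comparison in the two panels can be tabulated directly. The Python sketch below evaluates the binomial and Poisson probabilities next to the Gaussian density with matching mean and standard deviation, using the parameters from the caption. The function names are illustrative.

```python
import math

def binomial_pmf(r, n, p):
    """Binomial probability; zero outside 0 <= r <= n."""
    if r < 0 or r > n:
        return 0.0
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)

def poisson_pmf(r, mu):
    """Poisson probability mu^r * exp(-mu) / r!"""
    return mu ** r * math.exp(-mu) / math.factorial(r)

def gauss_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

n, p = 10, 0.6
mu_b, sigma_b = n * p, math.sqrt(n * p * (1 - p))   # 6 and sqrt(2.4)
mu_p, sigma_p = 6.0, math.sqrt(6.0)

print(" r  binomial  gauss     poisson   gauss")
for r in range(15):
    print(f"{r:2d}  {binomial_pmf(r, n, p):.4f}    {gauss_pdf(r, mu_b, sigma_b):.4f}    "
          f"{poisson_pmf(r, mu_p):.4f}    {gauss_pdf(r, mu_p, sigma_p):.4f}")
```

The agreement is best near the mean and worse in the tails, where the discrete distributions are asymmetric and the Gaussian is not.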
