27
G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate Lectures on Particle Phys University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway, University of Londo [email protected] www.pp.rhul.ac.uk/~cowan Course web page: www.pp.rhul.ac.uk/~cowan/stat_course.html

G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

Embed Size (px)

DESCRIPTION

G. Cowan Computing and Statistical Data Analysis / Stat 9 3 Variance of LS estimators In most cases of interest we obtain the variance in a manner similar to ML. E.g. for data ~ Gaussian we have and so or for the graphical method we take the values of  where 1.0

Citation preview

Page 1: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 1

Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits

London Postgraduate Lectures on Particle Physics;

University of London MSci course PH4515

Glen CowanPhysics DepartmentRoyal Holloway, University of [email protected]/~cowan

Course web page:www.pp.rhul.ac.uk/~cowan/stat_course.html

Page 2: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 2

Example of least squares fit

Fit a polynomial of order p:

Page 3: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 3

Variance of LS estimatorsIn most cases of interest we obtain the variance in a mannersimilar to ML. E.g. for data ~ Gaussian we have

and so

or for the graphical method we take the values of θ where

1.0

Page 4: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 4

Two-parameter LS fit

Page 5: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 5

Goodness-of-fit with least squaresThe value of the χ2 at its minimum is a measure of the levelof agreement between the data and fitted curve:

It can therefore be employed as a goodness-of-fit statistic totest the hypothesized functional form λ(x; θ).

We can show that if the hypothesis is correct, then the statistic t = χ2

min follows the chi-square pdf,

where the number of degrees of freedom is

nd = number of data points - number of fitted parameters

Page 6: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 6

Goodness-of-fit with least squares (2)The chi-square pdf has an expectation value equal to the number of degrees of freedom, so if χ2

min ≈ nd the fit is ‘good’.

More generally, find the p-value:

E.g. for the previous example with 1st order polynomial (line),

whereas for the 0th order polynomial (horizontal line),

This is the probability of obtaining a χ2min as high as the one

we got, or higher, if the hypothesis is correct.

Page 7: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 7

Goodness-of-fit vs. statistical errors

Page 8: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 8

Goodness-of-fit vs. stat. errors (2)

Page 9: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 9

LS with binned data

Page 10: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 10

LS with binned data (2)

Page 11: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 11

LS with binned data — normalization

Page 12: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 12

LS normalization example

Page 13: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 13

Using LS to combine measurements

Page 14: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 14

Combining correlated measurements with LS

Page 15: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 15

Example: averaging two correlated measurements

Page 16: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 16

Negative weights in LS average

Page 17: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 17

Interval estimation — introduction

Often use +/ the estimated standard deviation of the estimator.In some cases, however, this is not adequate:

estimate near a physical boundary, e.g., an observed event rate consistent with zero.

In addition to a ‘point estimate’ of a parameter we should report an interval reflecting its statistical uncertainty.

Desirable properties of such an interval may include:communicate objectively the result of the experiment;have a given probability of containing the true parameter;provide information needed to draw conclusions aboutthe parameter possibly incorporating stated prior beliefs.

We will look briefly at Frequentist and Bayesian intervals.

Page 18: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 18

Frequentist confidence intervals

Consider an estimator for a parameter θ and an estimate

We also need for all possible θ its sampling distribution

Specify upper and lower tail probabilities, e.g., = 0.05, = 0.05,then find functions u(θ) and v(θ) such that:

Page 19: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 19

Confidence interval from the confidence belt

Find points where observed estimate intersects the confidence belt.

The region between u(θ) and v(θ) is called the confidence belt.

This gives the confidence interval [a, b]

Confidence level = 1 = probability for the interval tocover true value of the parameter (holds for any possible true θ).

Page 20: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 20

Confidence intervals by inverting a testConfidence intervals for a parameter θ can be found by defining a test of the hypothesized value θ (do this for all θ):

Specify values of the data that are ‘disfavoured’ by θ (critical region) such that P(data in critical region) ≤ for a prespecified , e.g., 0.05 or 0.1.

If data observed in the critical region, reject the value θ .

Now invert the test to define a confidence interval as:

set of θ values that would not be rejected in a test ofsize (confidence level is 1 ).

The interval will cover the true value of θ with probability ≥ 1 .

Equivalent to confidence belt construction; confidence belt is acceptance region of a test.

Page 21: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

Computing and Statistical Data Analysis / Stat 9 21

Relation between confidence interval and p-value

Equivalently we can consider a significance test for eachhypothesized value of θ, resulting in a p-value, pθ..

If pθ < , then we reject θ.

The confidence interval at CL = 1 – consists of those values of θ that are not rejected.

E.g. an upper limit on θ is the greatest value for which pθ ≥

In practice find by setting pθ = and solve for θ.

G. Cowan

Page 22: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 22

Confidence intervals in practiceThe recipe to find the interval [a, b] boils down to solving

→ a is hypothetical value of θ such that

→ b is hypothetical value of θ such that

Page 23: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 23

Meaning of a confidence interval

Page 24: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 24

Central vs. one-sided confidence intervals

Page 25: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 25

Intervals from the likelihood function In the large sample limit it can be shown for ML estimators:

defines a hyper-ellipsoidal confidence region,

If then

(n-dimensional Gaussian, covariance V)

Page 26: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 26

Approximate confidence regions from L(θ) So the recipe to find the confidence region with CL = 1 is:

For finite samples, these are approximate confidence regions.

Coverage probability not guaranteed to be equal to ;

no simple theorem to say by how far off it will be (use MC).

Remember here the interval is random, not the parameter.

Page 27: G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate

G. Cowan Computing and Statistical Data Analysis / Stat 9 27

Example of interval from ln L(θ) For n=1 parameter, CL = 0.683, Q = 1.