Statistics in Particle Physics
20-29 November 2006 Tatsuo Kawamoto
ICEPP, University of Tokyo
Outline
1. Introduction
2. Probability
3. Distributions
4. Fitting and extracting parameters
5. Combination of measurements
6. Errors, limits and confidence intervals
7. Likelihood, ANN, and that sort of thing
References
• Textbooks on statistics in HEP
• PDG review (Probability, Statistics)
• Relevant scientific papers
1. Introduction
Why bother with statistics?
It’s not fundamental.
As soon as we come to present the results of an experiment, we face a few questions like:
• What is the size of the uncertainty?
• How do we combine results from different runs?
• Have we discovered something new?
• If it is not a discovery, what can we say from the experiment?
Prescriptions for these problems often involve considerations based on statistics.
Particle Physics
Study of elementary particles that have been discovered
- Quarks
- Leptons
- Gauge bosons
- Hadrons
And anything that has not been discovered
- Higgs
- Supersymmetry
- Extra dimensions
Goals of experiments
For each particle we want to know, e.g.:
What are its properties? (mass, lifetime, spin, …)
What are its decay modes?
How does it interact with other particles?
Does it exist at all?
Observations result from the fundamental rules of nature;
these are random, quantum mechanical processes.
Also, detector effects (resolution, efficiency, …) are often of a random nature.
Systematic uncertainty is a subtle subject, but we have to do our best to say something about it, and treat it reasonably.
Template for an experiment
To study X
• Arrange for X to occur, e.g. colliding beams
• Record events that might be X: trigger, data acquisition
• Reconstruct momentum, energy, … of visible particles
• Select events that could be X by applying CUTS
  Efficiency < 100%, Background > 0
• Study distributions of interesting variables
• Compare with / fit to theoretical distributions
• Infer the value of the parameter and its uncertainty
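The effect of a cut can be illustrated with a toy example. This is only a sketch under invented assumptions: the discriminating variable, its Gaussian shapes for signal and background, the cut value and the function name `toy_selection` are all hypothetical and are not taken from any real analysis.

```python
import random


def toy_selection(n_signal=10000, n_background=10000, cut=1.0, seed=1):
    """Apply a single cut to toy signal and background samples.

    Assumption: signal is Gaussian(mean=2, sigma=1) and background is
    Gaussian(mean=0, sigma=1) in some hypothetical variable x.
    """
    rng = random.Random(seed)
    signal = [rng.gauss(2.0, 1.0) for _ in range(n_signal)]
    background = [rng.gauss(0.0, 1.0) for _ in range(n_background)]

    # Keep events with x above the cut.
    n_sig_pass = sum(1 for x in signal if x > cut)
    n_bkg_pass = sum(1 for x in background if x > cut)

    efficiency = n_sig_pass / n_signal                # fraction of signal kept
    purity = n_sig_pass / (n_sig_pass + n_bkg_pass)   # signal fraction after the cut
    return efficiency, n_bkg_pass, purity


efficiency, n_bkg_pass, purity = toy_selection()
print(f"efficiency = {efficiency:.3f}")
print(f"background events passing = {n_bkg_pass}")
print(f"purity = {purity:.3f}")
```

With these toy shapes the cut keeps most but not all of the signal and lets some background through, which is the "Efficiency < 100%, Background > 0" situation.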
Implications
• Essentially counting numbers
• Uncertainties of measurements are understood
• Distributions are reproduced to reasonable accuracy
We don’t use:
• Student’s t
• F test
• Markov chains
• …
Tools
• Monte Carlo simulation: know in principle → know in practice
  Simple, beautiful underlying physics
  Unbeautiful effects (higher order, fragmentation, …)
  Ugly detector imperfections (resolution, efficiency)
• Likelihood: fundamental tool to handle probability
• Fitting: χ², likelihood, goodness of fit
• Toy Monte Carlo: handle complicated likelihoods
Extracting parameters
Example:
mZ = 91.1853 ± 0.0029 GeV
ΓZ = 2.4947 ± 0.0041 GeV
σhad = 41.82 ± 0.044 nb
E. Hubble
Combining results
Discovery or placing limits
Likelihood, Artificial Neural Net
Use as much information as possible
Example: W+W- → qqqq
There are other important things which we don’t cover
• Blind analysis
• Unfolding
• …
2. Probability
What is it?
Mathematical
P(A) is a number obeying the rules:
Kolmogorov axioms
Ai are disjoint events
Lemma
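For reference, the Kolmogorov axioms in their standard textbook form are written out below. The symbol S for the full sample space is notation assumed here, and the two lemma lines are the usual consequences of the axioms; they may not be exactly the ones shown in the lecture.

```latex
\begin{align}
  & P(A) \ge 0 \quad \text{for every event } A \\
  & P(S) = 1 \\
  & P\Big(\bigcup_i A_i\Big) = \sum_i P(A_i)
      \quad \text{if the } A_i \text{ are disjoint} \\[1ex]
  & \text{Lemma:} \quad P(\bar{A}) = 1 - P(A), \qquad
    P(A \cup B) = P(A) + P(B) - P(A \cap B)
\end{align}
```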
Mathematical
And, that’s almost it.
Classical Laplace, …
Given by symmetry for equally-likely outcomes, for which we are equally undecided.
Classify things into a certain number of equally-likely cases, and count the number of favorable cases.
P(A) = number of equally-likely favorable cases / total number of cases
From considerations of games of chance
Tossing a coin P(H)=1/2, throwing a die P(1)=1/6
How to handle continuous variables ?
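The counting rule above can be sketched in a few lines of Python. The helper name `classical_probability` is hypothetical, and the two-dice case is an extra illustration not taken from the slides.

```python
from fractions import Fraction
from itertools import product


def classical_probability(outcomes, is_favorable):
    """P(A) = favorable cases / total cases, assuming equally-likely outcomes."""
    outcomes = list(outcomes)
    n_favorable = sum(1 for o in outcomes if is_favorable(o))
    return Fraction(n_favorable, len(outcomes))


# Tossing a coin: P(H)
p_head = classical_probability(["H", "T"], lambda o: o == "H")

# Throwing a die: P(1)
p_one = classical_probability(range(1, 7), lambda o: o == 1)

# Extra illustration: two dice, P(sum = 7)
p_sum7 = classical_probability(product(range(1, 7), repeat=2),
                               lambda o: sum(o) == 7)

print(p_head, p_one, p_sum7)
```

The approach only works when the outcomes can be enumerated and are equally likely, which is why continuous variables are a problem for this definition.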
Frequentist
Probability is the limit of frequency (taken over some ensemble)
The event A either occurs or does not. Relative frequency of occurrence
Law of large numbers
An example of throwing a die
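A minimal simulation of this example, assuming a fair die modelled with Python's pseudo-random generator; the function name `relative_frequency` and the seed are arbitrary choices made for illustration. The relative frequency of the face 1 should approach 1/6 as the number of throws grows.

```python
import random


def relative_frequency(n_throws, face=1, seed=2006):
    """Fraction of n_throws of a fair six-sided die that show `face`."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_throws) if rng.randint(1, 6) == face)
    return hits / n_throws


for n in (10, 100, 1000, 100000):
    print(f"{n:>7d} throws: frequency = {relative_frequency(n):.4f}  (1/6 = {1/6:.4f})")
```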
The frequency definition is tied to some ensemble of ‘events’
Can’t say things like:
• It will probably rain tomorrow
• Probability of LHC collision in November 2007
• Probability of existence of SUSY
• …
But one can say:
• The statement ‘It will rain tomorrow’ is probably true
• …
We will come back to this later in the discussion of confidence level
Bayesian or Subjective probability
P(A) is the degree of belief in A
A can be anything:
Rain, LHC completion, SUSY, ….
You bet depending on odds P vs 1-P
Bayes theorem
Often used in subjective probability discussions
Conditional probability P(A|B)
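For reference, the standard definition of conditional probability and the resulting Bayes theorem are written out below, assuming P(B) > 0. This is the usual textbook form and may differ in notation from the lecture.

```latex
\begin{align}
  P(A \mid B) &= \frac{P(A \cap B)}{P(B)} \\
  P(A \mid B) &= \frac{P(B \mid A)\, P(A)}{P(B)}
\end{align}
```

The second line follows from writing P(A ∩ B) in two ways, as P(A|B) P(B) and as P(B|A) P(A).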
Thomas Bayes 1702-1761
Bayes theorem: how does it work?
Initial belief P(Theory) is modified by experimental results
If the Result is negative, P(Result|Theory)=0, the Theory is killed:
P(Theory|Result)=0
It’s an extreme case. We will come back to this later in the discussion of confidence level
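The update of belief can be sketched as follows. The two hypotheses, their prior values and the likelihood numbers are invented purely for illustration, and `bayes_update` is a hypothetical helper, not something from the lecture.

```python
def bayes_update(prior, likelihood):
    """Return P(H | Result) for each hypothesis H.

    prior:      dict H -> P(H)
    likelihood: dict H -> P(Result | H)
    """
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalised.values())  # P(Result)
    if evidence == 0:
        raise ValueError("Result is impossible under every hypothesis")
    return {h: value / evidence for h, value in unnormalised.items()}


# Invented numbers: the observed Result cannot occur if "Theory" is true.
prior = {"Theory": 0.5, "Alternative": 0.5}
likelihood = {"Theory": 0.0, "Alternative": 0.2}

posterior = bayes_update(prior, likelihood)
print(posterior)
```

Because P(Result|Theory) = 0, the posterior for "Theory" is zero whatever the prior was, as long as some other hypothesis can explain the result.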
Fun with Bayes theorem - 1: Monty Hall problem
• There are 3 doors
• Behind one of these, there is a prize (a car, etc.)
• Behind each of the other two, there is a goat (you lose)
• You choose 1 door, whichever you like (you bet), say, Nr 1.
• A door, either Nr 2 or Nr 3, is then opened to reveal a goat; if there are goats behind both, the door is chosen at random. Say Nr 3 is opened.
• Then you are asked whether you stay with Nr 1 or switch to Nr 2.
Should you stay or switch?
One would say:
You don’t know anyway whether the prize is behind Nr 1 or Nr 2. They are equally probable.
Staying and switching give equal chances.
But the correct strategy is to switch
A ‘classical’ reasoning (count the number of cases)
Before the door is opened
After the door is opened
Odds to win : stay 1/3 switch 2/3
Using Bayes theorem
P(Ci) = 1/3 : Prize is behind door i
P(Ok) : Door k is opened
We want to know P(C1| O3) vs P(C2| O3)
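A short sketch of this calculation, assuming you picked door 1 and door 3 was opened. The likelihoods P(O3|Ci) follow from the rules of the game: if the prize is behind door 1 the host opens door 2 or 3 at random, if it is behind door 2 the host must open door 3, and if it is behind door 3 that door is never opened. The variable names are chosen for illustration.

```python
from fractions import Fraction

# Prior: the prize is equally likely to be behind each door.
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# P(O3 | Ci): probability that door 3 is opened, given that you chose
# door 1 and the prize is behind door i.
likelihood_o3 = {1: Fraction(1, 2), 2: Fraction(1), 3: Fraction(0)}

# P(O3) = sum_i P(O3 | Ci) P(Ci)
p_o3 = sum(likelihood_o3[i] * prior[i] for i in prior)

# Bayes theorem: P(Ci | O3) = P(O3 | Ci) P(Ci) / P(O3)
posterior = {i: likelihood_o3[i] * prior[i] / p_o3 for i in prior}

print("P(C1|O3) =", posterior[1])  # stay
print("P(C2|O3) =", posterior[2])  # switch
```

Exact fractions are used so that the result can be compared directly with the counting argument: 1/3 for staying, 2/3 for switching.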
Exercise
A disease X (maybe AIDS, SARS, …)
P(X) = 0.001 (prior probability)
P(no X) = 0.999
Consider a test for X
P(+ | X) = 0.998
P(+ | no X) = 0.03
If the test result were +, how worried should you be?
i.e. what is P(X | +)?
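One way to check your answer is a small helper that applies Bayes theorem to a two-hypothesis test. The function name `posterior_given_positive` is hypothetical; the inputs are the numbers given in the exercise.

```python
def posterior_given_positive(prior_x, p_pos_given_x, p_pos_given_no_x):
    """P(X | +) from Bayes theorem for the two hypotheses X and no X."""
    # Total probability of a positive test result.
    p_pos = p_pos_given_x * prior_x + p_pos_given_no_x * (1.0 - prior_x)
    return p_pos_given_x * prior_x / p_pos


p_x_given_pos = posterior_given_positive(
    prior_x=0.001,
    p_pos_given_x=0.998,
    p_pos_given_no_x=0.03,
)
print(f"P(X | +) = {p_x_given_pos:.4f}")
```

Try varying the prior or the false-positive rate P(+ | no X) to see which of the inputs drives the result.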
http://home.cern.ch/kawamoto/lecture06.html