
ASTM21 Statistical tools in astrophysics

Lecture notes by L. Lindegren (Lund Observatory, 2015)

These lecture notes are organised into chapters covering the most important subject areas of the course. The lectures loosely follow the same plan.

Chapter 1: Introduction - What the course is about - What is probability?

Chapter 2: Random variables and probability distributions

Chapter 3: Random number generators

Chapter 4: Sample statistics

Chapter 5: Maximum Likelihood Estimation

Chapter 6: Least Squares Estimation

Chapter 7: Hypothesis testing

Chapter 8: Time series analysis - Power spectrum and periodogram

Chapter 9: Monte Carlo, resampling, and Bayesian analysis


Chapter 1: Introduction

What the course is about (syllabus)

Literature

Statistics versus probability theory

What is probability?

Kolmogorov’s axioms

Probability laws

Probability as degree of plausibility

The Venn diagram

The 2×2 contingency table


Introduction - What the course is about

Statistical tools in astrophysics (7.5 ECTS)

Intended learning outcomes (translated from the official course syllabus in Swedish)

Having completed the course, the student should be acquainted with:
• basic concepts in probability theory and statistics
• a number of the most common discrete and continuous probability distributions and their applications in physics and astronomy
• numerical methods to generate pseudorandom numbers for different distributions
• common graphical methods to present data, distributions, and uncertainties
• the principle of maximum likelihood
• confidence intervals and similar error estimates

and be able to:
• compute and interpret elementary statistical quantities
• apply the maximum likelihood method to simple estimation problems
• fit a non-linear mathematical model to given data
• derive confidence intervals in problems involving parameter estimation or fitting
• analyze irregular time series to find periodic variations
• apply hypothesis testing in relation to simple models


Statistical tools in astrophysics

Course literature:

• Wall, J.V., Jenkins, C.R.: Practical statistics for astronomers, 2nd ed. Cambridge Univ Press, Cambridge 2012
  - lots of astronomical applications. The 2nd edition contains a lot of useful material not in the 1st ed.

• Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P. (eds.): Numerical recipes. The art of scientific computing (3rd edition). Cambridge Univ Press, Cambridge 2007
  2nd edition freely available on http://www.nr.com/oldverswitcher.html
  - mainly for the texts (2nd edition is nearly equivalent in the relevant parts)

  Especially relevant:
  Ch. 4.0-4.3, 7.0-7.3, 14.0-14.6, 15.0-15.7, 17.0-17.1, 18.0-18.3 (3rd ed.)
  Ch. 4.0-4.3, 7.0-7.3, 14.0-14.6, 15.0-15.7, 16.0-16.1, 17.0-17.3 (2nd ed.)

Supplementary reading:
There are many good textbooks on statistics. Some general handbooks on mathematics and physics/engineering also contain very useful chapters on statistics. One particularly good example is:
• Riley, K.F., Hobson, M.P., Bence, S.J.: Mathematical Methods for Physics and Engineering, Ch. 30 Probability and Ch. 31 Statistics (3rd edition, Cambridge 2006)


Statistics versus probability theory

Statistics is about analysing, presenting, and interpreting data (observations).

Probability theory provides the mathematical basis for doing statistics.

If you know the properties of the population and how it is sampled, probability theory can tell you exactly what to expect of the sample. In statistics you want to describe or draw conclusions about the population from the given sample (data).

[Figure: diagram relating population and sample, with arrows labelled "probability theory" and "statistics".]


Probability theory versus statistical inference (example)

Probability theory:

MATLAB:
>> betarnd(1.2, 3.1, [n,1])   % n random numbers drawn from Beta(1.2, 3.1)
ans =
   0.210440603062501
   0.178896280366952
   0.475489742288650
   0.831842959598777
   0.037603717601781
   0.200819405379855
   ....

Statistics:

Name            Eccentricity
gamma Leo A b   0.144
NN Ser c        0.220
HAT-P-2 b       0.5171
HAT-P-24 b      0.067
HD 11506 b      0.30
tau Boo b       0.023
...             ...

• Is e ~ Beta(1.2, 3.1) a good model for the distribution of eccentricities?
• Is e ~ Beta(α, β) a good model? For which parameters α, β? Confidence region for α, β?
• Is e ~ FancyModel(α, β, γ) better?
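The two directions in this example can be sketched in code. The following is a minimal Python sketch, not part of the original notes. It draws a sample from Beta(1.2, 3.1), which is the probability-theory direction and mirrors the MATLAB call above, and then estimates α and β back from that sample, which is the statistics direction. The method of moments is used only because it is short; it is an assumption of this sketch, not the estimator prescribed by the course. The names `sample_beta` and `fit_beta_moments`, the seed, and the sample size are illustrative.

```python
import random
import statistics


def sample_beta(alpha, beta, n, seed=0):
    """Probability theory: known model -> simulated sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.betavariate(alpha, beta) for _ in range(n)]


def fit_beta_moments(data):
    """Statistics: sample -> estimated parameters (method of moments)."""
    m = statistics.fmean(data)      # sample mean
    v = statistics.variance(data)   # sample variance
    # For Beta(a, b): mean = a/(a+b), variance = mean*(1-mean)/(a+b+1),
    # hence a + b = mean*(1-mean)/variance - 1.
    total = m * (1.0 - m) / v - 1.0
    return m * total, (1.0 - m) * total


sample = sample_beta(1.2, 3.1, 20000)
alpha_hat, beta_hat = fit_beta_moments(sample)
print(f"alpha ~ {alpha_hat:.2f}, beta ~ {beta_hat:.2f}")
```

With a large simulated sample the estimates should land close to the values used to generate it. The harder questions listed above (goodness of fit, confidence regions, model comparison) are not addressed by this sketch.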

[Figure: probability density of Beta(α, β) as a function of x, for (α, β) = (1.2, 3.1) and (α, β) = (1.0, 3.0).]


What is probability?

Frequentist’s definition:

$$\text{probability} = \frac{\text{number of favorable events}}{\text{total number of events}}$$

(= frequency “in the long run”).

Describes the outcomes of repeated, identical (but random) experiments: throwing dice, drawing coloured balls from an urn, making repeated noisy measurements.

Subjectivist’s definition:

probability = degree of belief (plausibility)

Reflects our state of knowledge, using the logic of scientific reasoning: theory A is more plausible than theory B, but with new data this might be reversed.
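The frequentist definition above can be illustrated with a simulated die. The following is a minimal Python sketch, assuming a fair six-sided die and taking "throwing a six" as the favorable event; the function name `relative_frequency` and the seed are illustrative choices, not part of the notes.

```python
import random


def relative_frequency(n_throws, seed=1):
    """Fraction of simulated throws of a fair die that show a six."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    favorable = sum(1 for _ in range(n_throws) if rng.randint(1, 6) == 6)
    return favorable / n_throws


# The relative frequency should settle near 1/6 as the number of throws grows.
for n in (10, 1000, 100000):
    print(n, relative_frequency(n))
```

For small n the ratio fluctuates strongly; only "in the long run" does it approach 1/6.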


Axiomatic definition of probability (Andrey Kolmogorov, 1933)

Given a sample space Ω (= the set of all possible outcomes ω of a random experiment), the probability of any event A (= a subset of Ω) is a real number P(A).

The mapping A → P(A) satisfies the three axioms:

1. 0 ≤ P(A) ≤ 1

2. P(Ω) = 1

3. If A1, A2, ... are disjoint (mutually exclusive) events, then

$$P(A_1 \cup A_2 \cup \cdots) = \sum_{i=1}^{\infty} P(A_i)$$

These three axioms are sufficient to derive all the laws of probability theory.

Note: The axioms do not tell us what the values P are, only that the laws of probability apply to P, provided that the mapping A → P(A) is done according to these rules.
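As a concrete check of the three axioms, here is a minimal Python sketch on a finite sample space. The equal weights of a fair die are an assumption of the sketch: as noted above, the axioms themselves do not supply the values of P. The names `OMEGA`, `WEIGHT`, and `prob` are illustrative.

```python
from fractions import Fraction

# Sample space of one throw of a die; equal weights are an assumption.
OMEGA = frozenset(range(1, 7))
WEIGHT = {outcome: Fraction(1, 6) for outcome in OMEGA}


def prob(event):
    """P(A) for an event A given as a set of outcomes (a subset of OMEGA)."""
    return sum((WEIGHT[outcome] for outcome in event), Fraction(0))


# Three mutually exclusive events.
A1, A2, A3 = {1, 2}, {3}, {5, 6}

print(0 <= prob(A1) <= 1)                                      # axiom 1
print(prob(OMEGA) == 1)                                        # axiom 2
print(prob(A1 | A2 | A3) == prob(A1) + prob(A2) + prob(A3))    # axiom 3
```

Exact fractions are used instead of floats so that the equalities hold without rounding error. Any other set of non-negative weights summing to 1 would pass the same checks.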


A few examples of probability laws

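Notation:

$A \cup B$ = union of A and B ("A or B")

$A \cap B$ = intersection of A and B ("A and B")

$\bar{A} = \Omega \setminus A$ = complement of A ("not A")

$$P(\bar{A}) = 1 - P(A)$$

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \quad \text{(definition of conditional probability)}$$

$$P(A \cap B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$$

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \quad \text{(Bayes’ rule)}$$

$$P(A \cap B) = P(A)\,P(B) \;\Leftrightarrow\; A \text{ and } B \text{ are independent} \quad \text{(definition of independence)}$$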
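These laws can be verified numerically on a small example. The following Python sketch assumes a fair die with A = "even number" and B = "at most four"; the events and the helper names `prob` and `cond` are illustrative choices, not part of the notes.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))   # outcomes of one throw of a fair die (assumed)


def prob(event):
    """P(event) when all outcomes in OMEGA are equally likely."""
    return Fraction(len(event), len(OMEGA))


def cond(a, b):
    """Conditional probability P(a | b) = P(a and b) / P(b)."""
    return prob(a & b) / prob(b)


A = frozenset({2, 4, 6})      # "even number"
B = frozenset({1, 2, 3, 4})   # "at most four"

print(prob(OMEGA - A) == 1 - prob(A))                    # complement
print(prob(A | B) == prob(A) + prob(B) - prob(A & B))    # union
print(prob(A & B) == cond(A, B) * prob(B))               # product rule
print(cond(A, B) == cond(B, A) * prob(A) / prob(B))      # Bayes' rule
print(prob(A & B) == prob(A) * prob(B))                  # independence
```

For this particular choice of events the last line is also true, so A and B happen to be independent; that is a property of the example, not of the laws.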


Probability as degree of plausibility

The quantitative rules of “common sense” were investigated by Keynes (1929), Jeffreys (1939), Cox (1946), Jaynes (2003), and others. Jaynes showed that if we accept the three assumptions:

1. degrees of plausibility are represented by real numbers, with a higher number representing a higher degree of plausibility;

2. there is a qualitative correspondence with common sense (for example, if A ⇒ B, and B is seen to be true, then this increases the plausibility of A);

3. plausibility is logically consistent: if a conclusion can be reached in more than one way, then every possible way must lead to the same result;

then it follows that the degree of plausibility satisfies Kolmogorov’s axioms.

Thus probability theory provides a mathematical basis for plausible reasoning, even without its underlying concepts (random experiments, sample space, etc).
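The common-sense example in assumption 2 can be checked against the probability laws. The short derivation below is an illustration added here, assuming P(B) > 0 and reading "A ⇒ B" as "whenever A occurs, B occurs", so that A ∩ B = A:

```latex
\begin{align*}
A \Rightarrow B \quad &\text{implies} \quad A \cap B = A, \\
P(A \mid B) &= \frac{P(A \cap B)}{P(B)} = \frac{P(A)}{P(B)} \ge P(A).
\end{align*}
```

The inequality holds because P(B) ≤ 1, and it is strict unless P(B) = 1 or P(A) = 0. Learning that B is true therefore cannot lower the plausibility of A, in agreement with the qualitative rule.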

"Probability theory is nothing but common sense reduced to calculation."
Pierre Simon de Laplace (1819)


The Venn diagram

[Figure: Venn diagram with the sample space Ω drawn as a rectangle containing two overlapping closed curves A and B; the overlap is labelled A ∩ B.]

The Venn diagram is a useful device to illustrate the possible outcomes of an experiment.
• The rectangle represents the sample space (Ω).
• Each possible outcome (ω) of an experiment corresponds to a point in the rectangle.
• Closed curves like A and B represent events (collections of outcomes, or subsets of sample space). Non-overlapping curves represent mutually exclusive events.

It is useful to think of the area enclosed by a curve as being proportional to the probability of the event. The usefulness stems from the following: let S(A) = area enclosed by curve A. Then it is easily seen that the mapping A → P(A) ≡ S(A)/S(Ω) satisfies Kolmogorov’s axioms.
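The area interpretation can be tried out numerically. The following Python sketch assumes a unit square as Ω and two overlapping circles of radius 0.25 as the events A and B; these shapes, the seed, and the names `in_circle` and `venn_counts` are illustrative. Areas are estimated by scattering random points over the square and counting how many fall inside each curve.

```python
import math
import random

RADIUS = 0.25   # both circles lie entirely inside the unit square


def in_circle(x, y, cx, cy, r=RADIUS):
    """True if the point (x, y) lies inside the circle centred on (cx, cy)."""
    return (x - cx) ** 2 + (y - cy) ** 2 <= r ** 2


def venn_counts(n_points, seed=2):
    """Count random points falling in A, B, A and B, and A or B."""
    rng = random.Random(seed)
    n_a = n_b = n_ab = n_union = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()   # a random outcome in Omega
        a = in_circle(x, y, 0.4, 0.5)       # event A
        b = in_circle(x, y, 0.6, 0.5)       # event B
        n_a += a
        n_b += b
        n_ab += a and b
        n_union += a or b
    return n_a, n_b, n_ab, n_union


N = 200000
n_a, n_b, n_ab, n_union = venn_counts(N)
print("P(A) estimate:", n_a / N, " exact area:", math.pi * RADIUS ** 2)
print("P(A or B) = P(A) + P(B) - P(A and B):", n_union == n_a + n_b - n_ab)
```

The estimate of P(A) is only approximate, but the addition law holds exactly for the counts, because every point in the overlap is counted once in A and once in B.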


The 2×2 contingency table

The Venn diagram is not so convenient for representing the numerical values of the probabilities. If there are only two events (A and B), a 2×2 contingency table may be more useful:

            $B$                    $\bar{B}$                    Sum
$A$         $P(A \cap B)$          $P(A \cap \bar{B})$          $P(A)$
$\bar{A}$   $P(\bar{A} \cap B)$    $P(\bar{A} \cap \bar{B})$    $P(\bar{A})$
Sum         $P(B)$                 $P(\bar{B})$                 $1$

Note that the conditional probabilities are the fractional values along a row or column: P(A|B) = P(A ∩ B)/P(B), etc.
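A small numerical table shows how the marginal sums and conditional probabilities are read off. The following Python sketch uses made-up joint probabilities chosen only so that they sum to 1; the dictionary `joint` and the other variable names are illustrative.

```python
from fractions import Fraction

# Hypothetical joint probabilities for the four cells of the table.
joint = {
    ("A", "B"): Fraction(3, 10),
    ("A", "notB"): Fraction(1, 5),
    ("notA", "B"): Fraction(1, 10),
    ("notA", "notB"): Fraction(2, 5),
}

# Marginal probabilities are the row and column sums.
p_a = joint[("A", "B")] + joint[("A", "notB")]
p_b = joint[("A", "B")] + joint[("notA", "B")]
total = sum(joint.values())

# Conditional probabilities are fractions along a column or a row.
p_a_given_b = joint[("A", "B")] / p_b   # along column B
p_b_given_a = joint[("A", "B")] / p_a   # along row A

print("P(A) =", p_a, " P(B) =", p_b, " total =", total)
print("P(A|B) =", p_a_given_b, " P(B|A) =", p_b_given_a)
```

With these numbers P(A|B) differs from P(A), so A and B are not independent in this made-up table.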