36
Lecture 2 : Searching for correlations

Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

Lecture 2 :Searching for correlations

Page 2: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• Lecture 1 : basic descriptive statistics

• Lecture 2 : searching for correlations

• Lecture 3 : hypothesis testing and model-fitting

• Lecture 4 : Bayesian inference

The dark energy puzzleThese lectures

Page 3: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• Correlation coefficient and its error

• How to quantify the significance of a correlation

• Bootstrap error estimates

• Non-parametric correlation tests

• Common pitfalls when searching for correlations

• Comparing two distributions

The dark energy puzzleLecture 2 : searching for correlations

Page 4: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• Two variables are correlated if they share a statistical dependence / relationship

• For example, measurements of temperature at noon and 1pm every day are correlated, because they both lie consistently above the mean daily temperature

• Correlations between variables are important because they indicate some underlying physical relationship between those variables

The dark energy puzzleWhat is a correlation?

Page 5: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

The dark energy puzzleWhat is a correlation?

Uncorrelated variables !

Page 6: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

The dark energy puzzleWhat is a correlation?

Trick for generating correlated variables :x, z unit gaussian variablesy = x.r + z (1-r2)1/2

(x,y) have corr. coefficient r

Page 7: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• Example in astronomy : black-hole / bulge relation

The dark energy puzzle

Blac

k ho

le m

ass

(sol

ar m

asse

s)

Bulge velocity dispersion (km/s)

What is a correlation?

Page 8: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• Describes the strength of the correlation between (x, y)

• Means :

• Standard deviations :

• Definition of correlation coefficient :

The dark energy puzzleCorrelation coefficient

Page 9: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• No correlation [P(x,y) separable into f(x) g(y)] :

• Complete correlation :

• Complete anti-correlation :

• Possible range is

The dark energy puzzleCorrelation coefficient

Page 10: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

The dark energy puzzleLies, damn lies and statistics

Why was this poor statistics? Correlation is not the same as causation. Other dietary or lifestyle habits

could be a third variable! [ ... more examples later ... ]

Example 3

12/08/12 11:05 PMBBC NEWS | Health | Eating pizza 'cuts cancer risk'

Page 1 of 2http://news.bbc.co.uk/2/hi/health/3086013.stm

Home News Sport Radio TV Weather Languages

Low graphics | Accessibility help

Before people startdialling the pizza takeaway,they should consider thatpizza can be high in saturatedfat, salt and calories

Nicola O'Connor, Cancer ResearchUK

Pizzas are covered with a potentiallyprotective tomato sauce

[an error occurred while processing this directive]

One-Minute World News

News services Your news when youwant it

News Front Page

Africa

Americas

Asia-Pacific

Europe

Middle East

South Asia

UK

Business

Health

Medical notes

Science &Environment

Technology

Entertainment

Also in the news

-----------------

Video and Audio

-----------------

Programmes

Have Your Say

In Pictures

Country Profiles

Special Reports

RELATED BBC SITES

SPORT

WEATHER

ON THIS DAY

EDITORS' BLOG

Last Updated: Tuesday, 22 July, 2003, 10:28 GMT 11:28 UK

E-mail this to a friend Printable version

Eating pizza 'cuts cancer risk'

Italian researchers say

eating pizza could protect

against cancer.

Researchers claim eating pizza

regularly reduced the risk of

developing oesophageal cancer

by 59%.

The risk of developing colon

cancer also fell by 26% and

mouth cancer by 34%, they

claimed.

The secret could be lycopene, an antioxidant chemical in

tomatoes, which is thought to offer some protection against

cancer, and which gives the fruit its traditional red colour.

But some experts cast doubt

on the idea that pizza

consumption was the

explanation for why some

people did not develop cancer.

They said other foods or

dietary habits could play a part.

Lifestyle

The researchers looked at 3,300 people who had developed

cancer of the mouth, oesophagus, throat or colon and 5,000

people who had not developed cancer.

They were asked about their eating habits, and how often

they ate pizza.

Those who ate pizza at least once a week had less chance of

developing cancer, they found.

Dr Silvano Gallus, of the Mario Negri Institute for

Pharmaceutical Research in Milan, who led the research: "We

knew that tomato sauce could offer protection against certain

tumours, but we did not expect pizza as a complete meal

also to offer such protective powers."

Nicola O'Connor, of Cancer Research UK, told BBC News

Online: "This study is interesting and the results should

probably be looked at in the context of what we already know

about the Mediterranean diet and it's association with a lower

risk of certain types of cancer.

"But before people start dialling the pizza takeaway, they

SEE ALSO:

Scientists design 'anti-cancer'tomato 20 Jun 02 | Health

GM tomatoes 'offer health boost' 30 Aug 01 | Science/Nature

Oral cancer attacked by tomatoes 21 Dec 00 | Health

RELATED INTERNET LINKS:

International Journal of Cancer

Cancer Research UK

The BBC is not responsible for thecontent of external internet sites

TOP HEALTH STORIES

Stem cell method put to the test

Hospitals 'eyeing private market'

Low vitamin D 'Parkinson's link'

| News feeds

Page 11: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• We can estimate the Pearson product-moment correlation coefficient as :

• The possible range of values is

The dark energy puzzleEstimating the correlation coefficient

c.f.

Page 12: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

•If the correlation is statistically significant :

• 0 < |r| < 0.3 is a “weak correlation”

• 0.3 < |r| < 0.7 is a “moderate correlation”

• 0.7 < |r| < 1.0 is a “strong correlation”

The dark energy puzzleEstimating the correlation coefficient

Page 13: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• Assumption : (x,y) are drawn from a bivariate Gaussian distribution about an underlying linear relation :

• If this model is true, then the uncertainty in the measured value of r is

The dark energy puzzleEstimating the correlation coefficient

Page 14: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• Example 1 : who discovered the distance-redshift relation, Lemaitre (1927) or Hubble (1929)?

The dark energy puzzleEstimating the correlation coefficient

!

!

!"#!$%&&'(!)*'+,-(.!!/(012345(!167!!!!

8(6-95-:+,;!

!"#$%&'(&)*+,-.&/,0++*&+1&2+3456"6$+7"*&8&944*$:%&;"60:3"6$,<.&=7$#:><$6?&+1&60:&@$6A"6:><>"7%.&

B+0"77:<C5>D.&/+560&91>$,"(&

"#$%&'(%(&E7:&+1&60:&D>:"6:<6&%$<,+#:>$:<&+1&3+%:>7&6$3:<&$<&60"6&+1&60:&:F4"7%$7D&=7$#:><:.&&"*3+<6&$7#">$"C*?&"66>$C56:%&6+&

G5CC*:&HIJKJL(&@0"6&$<&7+6&A$%:*?&-7+A7&$<&60"6&60:&+>$D$7"*&6>:"6$<:&C?&':3"M6>:&&&HIJKNL&&,+76"$7:%&&"&>$,0&15<$+7&+1&&C+60&

60:+>?&"7%&+1&+C<:>#"6$+7(&&O0:&P>:7,0&4"4:>&A"<&&3:6$,5*+5<*?&,:7<+>:%&&A0:7&&4>$76:%&&$7&Q7D*$<0&R&&"**&%$<,5<<$+7<&+1&>"%$"*&

#:*+,$6$:<&"7%&%$<6"7,:<&&!"#$%&'(%)(*+%,-*.&%(/0-*-1"2%%$(&(*/-#"&-3#%3,%%456%7%%A:>:&+3$66:%(&P"<,$7"6$7D&$7<$D06<&">:&D*:"7:%&

1>+3&"&*:66:>&>:,:76*?&1+57%&$7&60:&':3"M6>:&">,0$#:<(&&97&"44:"*&$<&3"%:&1+>&"&':3"M6>:&O:*:<,+4:.&6+&0+7+5>&&60:&%$<,+#:>:>&+1&

60:&:F4"7%$7D&57$#:><:(&

!"#$%&'"()*+,-.(!($(&/"0'"&12$3((4$4"'5(&

O0:&6$6*:&+1&60:&+>$D$7"*&IJKN&4"4:>&$7%$,"6:<&6+&60:&>:"%:>&60"6&60:&,+76:76&A$**&C:&"&15<$+7&+1&C+60&

60:+>?&"7%&+1&+C<:>#"6$+7S&&

48#%57$#:><&0+3+DT7:&%:&3"<<:&,+7<6"76:&:6&%:&>"?+7&,>+$<<"76.&>:7%"76&,+346:&%:&*"&#$6:<<:&&>"%$"*:&

%:<&&7UC5*:5<:<&:F6>"R9"2"1&-:;(.6%<'-1'%-.%&*"#.2"&($%-#%&'(%=#92-.'%&3#9;(>%&';.?&&

4@%'3/39(#(3;.%;#-)(*.(%3,%13#.&"#&%/"..%"#$%-#1*(".-#9%*"$-;.%"113;#&-#9%1+>&60:&>"%$"*&#:*+,$6?&+1&

:F6>"R9"2"1&-1%#(A;2"(6&

':3"M6>:&<4:76&60:&?:"><&IJKVWX&"6&60:&G">#">%&2+**:D:&EC<:>#"6+>?(&G:&0"%&"7&:F,:**:76&1+57%"6$+7&$7&

+C<:>#"6$+7"*&"<6>+7+3?.&A>$6$7D&"C+56&6:>3<&<5,0&"<&60:&:11:,6$#:&6:34:>"65>:<&+1&<6"><.&6>$D+7+3:6>$,&

4">"**"F:<.&3+#$7DR,*5<6:>&4">"**"F:<.&"C<+*56:&C+*+3:6>$,&3"D7$65%:<.&%A">1&C>"7,0&<6"><.&D$"76&C>"7,0&

<6"><.&"7%&60:&*$-:(&&&

O+&<4:"-&+1&':3"M6>:&HIJKNL&"<&"&3+<6&>:3">-"C*:&"7%&"C<+*56:*?&C>$**$"76&60:+>:6$,"*&4"4:>&+7*?.&$<&"&

D>"#:&$7Y5<6$,:&6+&60:&&#:>?&&6$6*:(&&Z+6&+7*?&%+:<&':3"M6>:&60:+>:6$,"**?&%:>$#:&"&*$7:">&>:*"6$+7<0$4&

C:6A::7&60:&>"%$"*&#:*+,$6$:<&+1&D"*"F$:<&"7%&60:$>&%$<6"7,:<&$7&60:&"C+#:&4"4:>.&C56&0:&$<&:"D:>&6+&

%:6:>3$7:&60:&>"6:&"6&A0$,0&60:&=7$#:><:&:F4"7%<(&&&':3"M6>:&HIJKNL&,">:15**?&5<:<&60:&>"%$"*&#:*+,$6$:<&

+1&&VK&:F6>"D"*",6$,&7:C5*":&&6"C5*"6:%&$7&/6>[3C:>D&&HIJKXL.&"7%&0:&,+7#:>6<&&"44">:76&3"D7$65%:<&3&

$76+&%$<6"7,:&\*+D&>&]&^(K3&_&V(^V`&1+**+A$7D&G5CC*:&&HIJKaL(&O0:&",65"*&#"*5:&A0$,0&':3"M6>:&+C6"$7<&$7&

IJKN&1+>&60:&>"6:&+1&:F4"7<$+7&+1&60:&=7$#:><:&&$<&aKX&-3W<W;4,&b&XNX&-3W<W;4,&A$60&%$11:>:76&

A:$D06$7D&1",6+><&HP$D5>:&IL(&

arXiv:1106.3928 etc.

Page 15: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• Example 1 : who discovered the distance-redshift relation, Lemaitre (1927) or Hubble (1929)?

The dark energy puzzleEstimating the correlation coefficient

Page 16: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

The dark energy puzzle

• Part (a) : For each dataset, find the Pearson product-moment correlation coefficient and its error

• Lemaitre : r = 0.38 +/- 0.15

• Hubble : r = 0.79 +/- 0.13

Estimating the correlation coefficient

Page 17: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• What is the probability of obtaining the measured value of r if the true correlation is zero? (also depends on N)

• In order to determine whether the correlation is significant, calculate

• This obeys the Student’s t probability distribution with number of degrees of freedom

• Consult tables (2-tailed test) with these two numbers

The dark energy puzzleIs a correlation significant?

Page 18: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

The dark energy puzzleIs a correlation significant?

Distribution of twhen rho=0 Number of data

points is crucial !

Page 19: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• Standard tables list the critical values that t must exceed, as a function of nu, for the hypothesis that the two variables are unrelated to be rejected at a particular level of statistical significance (e.g. 95%, 99%)

The dark energy puzzleIs a correlation significant?

“Two-tailed test” :correlation or anti-correlation

Page 20: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

The dark energy puzzle

• Part (b) : Determine the statistical significance of the correlation

• Lemaitre : t=2.60 , nu=40, prob=1.3e-2 (2.5 sigma)

• Hubble : t=6.03 , nu=22 , prob=4.5e-6 (4.6 sigma)

Is a correlation significant?

Page 21: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

The dark energy puzzleLinear regression line

• The regression line is the linear fit that minimizes the sum of the squares of the y-residuals

• With intercept [y = a + b x] :

• Without intercept [y = b x] :

Page 22: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

The dark energy puzzle

• Part (c) : Determine linear least-squares regression lines of the form V = HD and v = H’D + C

• Lemaitre : H=414.9 , H’=221.7 , C=316.5 Hubble : H=423.9 , H’=453.9 , C=-40.4

Linear regression line

Page 23: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• Aside : Hubble and Lemaitre both found values of H0 ~ 420 km/s/Mpc with independent techniques ! How could they both be wrong? [Example of statistical bias]

• Today would probably indicate confirmation bias, but Hubble didn’t even cite Lemaitre’s result!

• Lemaitre : assumed galaxy apparent magnitude was standard candle - scuppered by Malmquist bias

• Hubble : used “brightest stars” as standard candles, but could not distinguish brightest star from HII region (systematic error bias due to aperture effect)

The dark energy puzzleLinear regression line

Page 24: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

The dark energy puzzleBootstrap errors and probabilities

• Bootstrap statistics allow us to determine parameter errors and probability distributions using just the data

• If we have N data points, repeatedly draw at random samples of N points (with replacement)

• Re-compute the parameter of interest for each bootstrap sample

• The distribution of the re-computed parameters estimates the uncertainty in the measurement from the original sample

Page 25: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

The dark energy puzzleBootstrap errors and probabilities

• Applied to our example : create 1000 bootstrap samples and measure the correlation coefficients

68% conf +/- 0.11

68% conf +/- 0.07

Page 26: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

The dark energy puzzleBootstrap errors and probabilities

• Applied to our example : create 1000 bootstrap samples and do a linear regression fit

68% conf +/- 50

68% conf +/- 42

Page 27: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• If we do not want to assume (x,y) are drawn from a bivariate Gaussian we can use a non-parametric correlation test

• Let (Xi,Yi) be the rank of (xi,yi) in the overall order such that

• Find Spearman rank correlation coefficient

• Compare to standard tables with

The dark energy puzzleNon-parametric correlation coefficient

D [Mpc] Rank0.03 1.00.03 2.00.21 3.00.26 4.00.28 5.50.28 5.50.45 7.00.50 8.50.50 8.50.63 10.00.80 11.00.90 13.50.90 13.50.90 13.50.90 13.51.00 16.01.10 17.51.10 17.51.40 19.01.70 20.02.00 22.52.00 22.52.00 22.52.00 22.5

1

Page 28: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

The dark energy puzzleNon-parametric correlation coefficient

• Standard tables list the critical values that rs must exceed, as a function of nu, for the hypothesis that the two variables are unrelated to be rejected at a particular level of statistical significance (e.g. 95%, 99%)

Page 29: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

The dark energy puzzle

• Part (e) : Determine the Spearman rank cross-correlation coefficient and its statistical significance

• Lemaitre : rs=0.42 , nu=40 , prob=6.0e-3 (2.8 sigma)

• Hubble : rs=0.80 , nu=22 , prob=3.4e-6 (4.7 sigma)

• [Results very similar to Pearson product-moment correlation coefficient, but fewer assumptions]

Non-parametric correlation coefficient

Page 30: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• Selection effects leading to spurious correlations, for example Malmquist bias

The dark energy puzzleIssues with correlations

Distance modulus (redshift)

log 1

0(ra

dio

lum

inos

ity)

Selection effectsin this region

Selection effectsin this region

Page 31: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• Is the correlation driven by a small number of outliers, so is not robust?

The dark energy puzzleIssues with correlations

Mean, variance, correlation coefficientand regression line are all identical

“Rule ofthumb”

Page 32: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• Correlation does not necessarily imply causation

• Sleeping in your shoes causes you to wake up with a headache! [third variable - you were probably having a few drinks the night before...]

• Ice cream sales cause drowning! [third variable - hotter weather means more people are at the beach...]

• Having grey hair causes cancer! [third variable - age...]

• Obesity causes global warming! [third variable - everyone is getting richer...]

The dark energy puzzleIssues with correlations

Page 33: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

The dark energy puzzleLies, damn lies and statistics

“We don’t accept the idea that there are harmful agents in

tobacco”[Phillip Morris,1964]

Why was this poor statistics? Cigarette companies were attempting to invoke a “third variable”. But

correlation does sometimes imply causation, if it can be demonstrated by independent lines of evidence

Example 4

Page 34: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• Are two samples drawn from the same distribution?

• Example 2 : samples of flux densities measured at random positions [A] and galaxy positions [B]

The dark energy puzzleComparing two distributions

Page 35: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• Part (a) : Are their means consistent?

• Calculate t statistic and no. of degrees of freedom :

• Compare to Student’s t distribution

• t = 0.31, nu = 661.5, prob of consistency = 0.76

• Small print : assumes (x,y) are normally-distributed populations

The dark energy puzzleComparing two distributions

Page 36: Lecture 2 : Searching for correlationsastronomy.swin.edu.au/~cblake/StatsLecture2.pdf · 2012. 8. 18. · •Lecture 1 : basic descriptive statistics •Lecture 2 : searching for

• Part (b) : Are the full distributions consistent?

• The Kolmogorov-Smirnov test considers the maximum value of the absolute difference between the cumulative probability distributions

The dark energy puzzleComparing two distributions

Max diff = D = 0.067

Prob(Q) = 0.427