Assignment #8 Chapter 14: 26 Chapter 15: 18, 27 Due next Friday Nov. 27 th by 2pm in your TA’s homework box

Assignment #8 - Department of Zoology, UBCmfscott/lectures/17_Regression.pdfAssignment #8 Chapter 14: 26 Chapter 15: ... – Final draft due at start of your registered lab section



Assignment #8

Chapter 14: 26 Chapter 15: 18, 27 Due next Friday Nov. 27th by 2pm in your TA’s homework box


Assignment #9

Chapter 16: 20 Chapter 17: 33 Not Due! Just for practice. Answers will be posted on Friday Dec. 4th


Reading

For Today: Chapter 17 For Thursday: Chapter 17


Lab Report

•  Posted on web-site
•  Dates
–  Rough draft due to TA’s homework box Monday Nov. 16th
–  Rough draft returned in your registered lab section this week
–  Final draft due at start of your registered lab section next week
→ MUST HAND IN ROUGH DRAFT WITH FINAL DRAFT (penalty -10 points)
•  10% of course grade
–  Rough draft - 5%
–  Final draft - 5%
–  If you’re happy with your rough draft mark, you can tell your TA to use it for the final draft → YOU MUST TELL YOUR TA
•  Read the “Writing a Lab Report” section of your lab notebook for guidance!!


Chapter 16 Review


Correlation: r

•  r is called the “correlation coefficient”

•  Describes the relationship between two numerical variables

•  Parameter: ρ (rho) Estimate: r

•  −1 ≤ ρ ≤ 1 and −1 ≤ r ≤ 1


Estimating the correlation coefficient

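$$ r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2}\,\sqrt{\sum (Y_i - \bar{Y})^2}} $$

Numerator: “Sum of products”

Each sum under a square root in the denominator: “Sum of squares”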
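
As a minimal sketch of the formula above, the following Python computes r from the sum of products and the two sums of squares. The function name `pearson_r` and the five data points are made up for illustration; they are not from the lecture.

```python
import math

def pearson_r(x, y):
    """Correlation coefficient r from the sum of products and sums of squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # "Sum of products": sum of (Xi - Xbar)(Yi - Ybar)
    sum_products = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # "Sums of squares" of X and of Y
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return sum_products / math.sqrt(ss_x * ss_y)

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 3))  # 0.775
```

Here the sum of products is 6 and the sums of squares are 10 and 6, so r = 6 / sqrt(60).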


Standard error of r

$$ SE_r = \sqrt{\frac{1 - r^2}{n - 2}} $$


If ρ = 0,...

r is normally distributed with mean 0

Therefore, we test a null hypothesis of no correlation using:

$$ t = \frac{r}{SE_r} $$

with df = n − 2
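
A short sketch of this test in Python, assuming only the two formulas SE_r = sqrt((1 − r²)/(n − 2)) and t = r / SE_r. The function name `correlation_t` and the inputs r = 0.5, n = 27 are hypothetical. The resulting t would still be compared against a critical value from a t table with n − 2 degrees of freedom.

```python
import math

def correlation_t(r, n):
    """Return (SE_r, t, df) for testing H0: rho = 0."""
    df = n - 2
    se_r = math.sqrt((1 - r ** 2) / df)
    t = r / se_r
    return se_r, t, df

# Hypothetical values, for illustration only
se_r, t, df = correlation_t(r=0.5, n=27)
print(round(se_r, 4), round(t, 3), df)  # 0.1732 2.887 25
```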


Hypotheses

H0: X and Y are not correlated (ρ = 0). HA: X and Y are correlated (ρ ≠ 0).


Correlation assumes...

• Random sample

• X is normally distributed with equal variance for all values of Y

• Y is normally distributed with equal variance for all values of X


Bivariate Normal Distribution

•  The relationship between X and Y is linear

•  The cloud of points in a scatter plot of X and Y has a circular or elliptical shape

•  The frequency distributions of X and Y separately are normal


Most Frequent departures from bivariate normal distribution


Chapter 16 Continued: Correlation between numerical variables


Spearman's rank correlation

•  An alternative to correlation that does not make so many assumptions


Example: Spearman's rs

VERSIONS:
1. Boy climbs up rope, climbs down again
2. Boy climbs up rope, seems to vanish, re-appears at top, climbs down again
3. Boy climbs up rope, seems to vanish at top
4. Boy climbs up rope, vanishes at top, reappears somewhere the audience was not looking
5. Boy climbs up rope, vanishes at top, reappears in a place which has been in full view


Example: Spearman's rs


Hypotheses

H0: The difficulty of the described trick is not correlated with the time elapsed since it was observed.

HA: The difficulty of the described trick is correlated with the time elapsed since it was observed.


Years Elapsed   Rank Years   Impressiveness Score   Rank Impressiveness
2               1            1                      2
5               3.5          1                      2
5               3.5          1                      2
4               2            2                      5
17              5.5          2                      5
17              5.5          2                      5
31              13           3                      7
20              7            4                      12.5
22              8            4                      12.5
25              9            4                      12.5
28              10.5         4                      12.5
29              12           4                      12.5
34              14.5         4                      12.5
43              17           4                      12.5
44              18           4                      12.5
46              19           4                      12.5
34              14.5         4                      12.5
28              10.5         5                      19.5
39              16           5                      19.5
50              20.5         5                      19.5
50              20.5         5                      19.5
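
The rank columns in the table use midranks: tied values share the average of the ranks they would otherwise occupy. A minimal Python sketch of that ranking rule follows, applied to the years-elapsed column. The helper name `midranks` is an assumption made for illustration.

```python
def midranks(values):
    """Rank values from smallest to largest, giving ties their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Find the run of tied values beginning at sorted position `start`
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end hold ranks start+1..end+1; use their mean
        average_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    return ranks

# Years-elapsed column, in the row order of the table
years = [2, 5, 5, 4, 17, 17, 31, 20, 22, 25, 28,
         29, 34, 43, 44, 46, 34, 28, 39, 50, 50]
year_ranks = midranks(years)
print(year_ranks[:4])  # [1.0, 3.5, 3.5, 2.0]
```

The two observations of 5 years would occupy ranks 3 and 4, so each receives 3.5.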


Finding rs

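$$ \sum_{i=1}^{n} (R_i - \bar{R})(S_i - \bar{S}) = \left( \sum R_i S_i \right) - \frac{\sum R_i \sum S_i}{n} = 566 $$

$$ \sum_{i=1}^{n} (R_i - \bar{R})^2 = \sum (R_i^2) - \frac{\left( \sum R_i \right)^2}{n} = 767.5 $$

$$ \sum_{i=1}^{n} (S_i - \bar{S})^2 = \sum (S_i^2) - \frac{\left( \sum S_i \right)^2}{n} = 678.5 $$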


$$ r_S = \frac{566}{\sqrt{(767.5)(678.5)}} = 0.784 $$

rS(0.05,21) = 0.434
rS(0.01,21) = 0.550

Since rS = 0.784 is greater than 0.550, P < 0.01

We reject the null hypothesis

There is a positive correlation between the impressiveness score and number of years elapsed
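
To check the arithmetic, here is a sketch that recomputes rS from the two rank columns of the table using the shortcut formulas. The variable names are illustrative; the ranks are copied from the table in row order.

```python
import math

# Rank columns copied from the table (R = rank of years, S = rank of impressiveness)
R = [1, 3.5, 3.5, 2, 5.5, 5.5, 13, 7, 8, 9, 10.5,
     12, 14.5, 17, 18, 19, 14.5, 10.5, 16, 20.5, 20.5]
S = [2, 2, 2, 5, 5, 5, 7] + [12.5] * 10 + [19.5] * 4
n = len(R)

# Shortcut formulas for the sum of products and the sums of squares
sum_products = sum(r * s for r, s in zip(R, S)) - sum(R) * sum(S) / n
ss_R = sum(r ** 2 for r in R) - sum(R) ** 2 / n
ss_S = sum(s ** 2 for s in S) - sum(S) ** 2 / n

r_s = sum_products / math.sqrt(ss_R * ss_S)
print(sum_products, ss_R, ss_S, round(r_s, 3))  # 566.0 767.5 678.5 0.784
```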


Spearman’s rank correlation for n >100

$$ SE[r_S] = \sqrt{\frac{1 - r_S^2}{n - 2}} $$

$$ t = \frac{r_S}{SE[r_S]} $$

df = n − 2


Attenuation: The estimated correlation will be lower if X or Y are estimated with error

[Figure: three scatter plots: real correlation; Y estimated with measurement error; X and Y estimated with measurement error]
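
Attenuation can be illustrated with a small simulation. The sketch below is not from the lecture: the sample size, the noise levels, and the seed are all arbitrary assumptions. It generates correlated values, adds independent measurement error to Y and then to X as well, and compares the three estimated correlations.

```python
import math
import random

def pearson_r(x, y):
    """Correlation coefficient from the sum of products and sums of squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return sp / math.sqrt(ssx * ssy)

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 5000

# Simulated "real" values: Y depends linearly on X plus some scatter
x_true = [rng.gauss(0, 1) for _ in range(n)]
y_true = [xi + rng.gauss(0, 0.5) for xi in x_true]

# Independent measurement error (standard deviation 1) added to Y, and to X
y_measured = [yi + rng.gauss(0, 1) for yi in y_true]
x_measured = [xi + rng.gauss(0, 1) for xi in x_true]

r_real = pearson_r(x_true, y_true)
r_y_error = pearson_r(x_true, y_measured)
r_both_error = pearson_r(x_measured, y_measured)
print(round(r_real, 2), round(r_y_error, 2), round(r_both_error, 2))
```

Under these assumptions the estimated correlation shrinks toward zero at each step, although the underlying relationship between X and Y is unchanged.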


Correlation depends on range


Chapter 17: Regression


Regression

•  Predicts Y from X

•  Linear regression assumes that the relationship between X and Y can be described by a line


Correlation vs. regression


Regression assumes...

•  Random sample

•  Y is normally distributed with equal variance for all values of X


The least squares regression line is the line for which the sum of all the squared deviations in Y is smallest


The parameters of linear regression

$$ Y = \alpha + \beta X $$

α: Intercept
β: Slope


[Figure: example regression lines with positive β, negative β, and β = 0, and lines with higher α and lower α]


Estimating a regression line

Y = a + b X


Nomenclature

Residual: $Y_i - \hat{Y}_i$

Predicted Value: $\hat{Y}_i$

Data Point: $(X_i, Y_i)$


Finding the "least squares" regression line

Minimize:

$$ SS_{residual} = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 $$


Best estimate of the slope

$$ b = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} $$

(= "Sum of products" over "Sum of squares of X")


Remember the shortcuts:

$$ \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = \left( \sum X_i Y_i \right) - \frac{\sum X_i \sum Y_i}{n} $$

$$ \sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum (X_i^2) - \frac{\left( \sum X_i \right)^2}{n} $$


Finding a

$$ \bar{Y} = a + b\bar{X} $$

So..

$$ a = \bar{Y} - b\bar{X} $$


Example: Predicting age based on radioactivity in teeth

Many above ground nuclear bomb tests in the ‘50s and ‘60s may have left a radioactive signal in developing teeth. Is it possible to predict a person’s age based on dental 14C?

Data from 1965 to present from Spalding et al. 2005. Forensics: age written in teeth by nuclear tests. Nature 437: 333–334.


Teeth data:

Δ14C    Date of Birth
622     1963.5
262     1971.7
471     1963.7
112     1990.5
285     1975
439     1970.2
363     1972.6
391     1971.8
89      1985.5
109     1983.5
91      1990.5
127     1987.5
99      1990.5
110     1984.5
123     1983.5
105     1989.5


Teeth data:

Let X be the Δ14C, and Y be the year of birth.

$$ \sum X = 3798, \quad \sum Y = 31674 $$

$$ \sum X^2 = 1340776, \quad \sum (XY) = 7495223 $$

$$ \sum Y^2 = 62704042 $$

$$ n = 16 $$

$$ \bar{X} = 237.375, \quad \bar{Y} = 1979.63 $$


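$$ \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = \left( \sum X_i Y_i \right) - \frac{\sum X_i \sum Y_i}{n} = 7495223 - \frac{(3798)(31674)}{16} = -23393 $$

$$ \sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum (X_i^2) - \frac{\left( \sum X_i \right)^2}{n} = 1340776 - \frac{(3798)^2}{16} = 439226 $$

$$ b = \frac{-23393}{439226} = -0.053 $$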


Calculating a

$$ a = \bar{Y} - b\bar{X} = 1979.63 - (-0.053)(237.375) = 1992.2 $$
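
The slope and intercept can be recomputed directly from the teeth table. The sketch below uses the deviation form of the formulas rather than the shortcuts, and the variable names are illustrative. Because it keeps the slope at full precision instead of rounding it to −0.053 first, the intercept comes out near 1992.3 rather than 1992.2.

```python
# Teeth data from the table: X = delta-14C, Y = date of birth
X = [622, 262, 471, 112, 285, 439, 363, 391,
     89, 109, 91, 127, 99, 110, 123, 105]
Y = [1963.5, 1971.7, 1963.7, 1990.5, 1975, 1970.2, 1972.6, 1971.8,
     1985.5, 1983.5, 1990.5, 1987.5, 1990.5, 1984.5, 1983.5, 1989.5]
n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n

# Slope: sum of products over sum of squares of X
sum_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
ss_x = sum((x - mean_x) ** 2 for x in X)
b = sum_products / ss_x

# Intercept: the least squares line passes through (mean_x, mean_y)
a = mean_y - b * mean_x
print(round(b, 3), round(a, 1))  # -0.053 1992.3
```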


Predicted Values

The predicted value of Y from a regression line (Y hat) estimates the mean value of Y for all individuals having a given value of X

[Figure: values of Y marked on the regression line at X1, X2, X3, and X4]


$$ \hat{Y} = 1992.2 - 0.053X $$


Predicting Y from X

If a cadaver has a tooth with Δ14C content equal to 200, what does the regression line predict its year of birth to be?

$$ \hat{Y} = 1992.2 - 0.053X = 1992.2 - 0.053(200) = 1981.6 $$
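
A small sketch of the same prediction in Python, using the rounded coefficients a = 1992.2 and b = −0.053 from the lecture. The function name `predict_birth_year` is hypothetical. The second half computes a residual (observed minus predicted) for one row of the teeth table.

```python
def predict_birth_year(delta_14c, a=1992.2, b=-0.053):
    """Predicted year of birth from the fitted line Y-hat = a + b * X."""
    return a + b * delta_14c

# Prediction for a tooth with delta-14C of 200
y_hat = predict_birth_year(200)
print(round(y_hat, 1))  # 1981.6

# Residual for one observed data point from the teeth table: (285, 1975)
residual = 1975 - predict_birth_year(285)
print(round(residual, 3))  # -2.095
```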


Testing hypotheses about regression

H0: β = 0 HA: β ≠ 0


b has a t distribution

Confidence interval for a slope:

$$ b \pm t_{\alpha(2),df} \, SE_b $$

Hypothesis tests can use t:

$$ t = \frac{b - \beta_0}{SE_b} $$


Standard error of a slope

$$ SE_b = \sqrt{\frac{MS_{residual}}{\sum (X_i - \bar{X})^2}} $$

$$ MS_{residual} = SS_{residual} / df_{residual} $$


Sums of squares for regression

$$ SS_{Total} = \sum Y_i^2 - \frac{\left( \sum Y_i \right)^2}{n} $$

$$ SS_{regression} = b \sum (X_i - \bar{X})(Y_i - \bar{Y}) $$

$$ SS_{residual} + SS_{regression} = SS_{Total} $$

With n − 2 degrees of freedom for the residual


Radioactive teeth: Sums of squares

$$ SS_{Total} = \sum Y_i^2 - \frac{\left( \sum Y_i \right)^2}{n} = 62704042 - \frac{(31674)^2}{16} = 1339.75 $$

$$ SS_{regression} = b \sum (X_i - \bar{X})(Y_i - \bar{Y}) = (-0.053)(-23393) = 1239.8 $$


Teeth: Sums of squares

$$ SS_{residual} = SS_{Total} - SS_{regression} = 1339.75 - 1239.8 = 99.9 $$

$$ df_{residual} = 16 - 2 = 14 $$


Calculating residual mean squares

$$ MS_{residual} = SS_{residual} / df_{residual} $$

$$ MS_{residual} = \frac{99.9}{14} = 7.1 $$


Standard error of a slope

$$ SE_b = \sqrt{\frac{MS_{residual}}{\sum (X_i - \bar{X})^2}} = \sqrt{\frac{7.1}{439226}} = 0.004 $$
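
The chain from sums of squares to SE_b can be sketched in Python directly from the teeth table. This is an illustrative recomputation, not the lecture's own code, and it carries full precision instead of rounding at each step. Computed this way, SS_Total comes out near 1399.9 (the expression 62704042 − 31674²/16 evaluates to 1399.75 rather than 1339.75), so the residual mean square and SE_b are somewhat larger than the hand-calculated values shown on the slides.

```python
import math

# Teeth data from the table: X = delta-14C, Y = date of birth
X = [622, 262, 471, 112, 285, 439, 363, 391,
     89, 109, 91, 127, 99, 110, 123, 105]
Y = [1963.5, 1971.7, 1963.7, 1990.5, 1975, 1970.2, 1972.6, 1971.8,
     1985.5, 1983.5, 1990.5, 1987.5, 1990.5, 1984.5, 1983.5, 1989.5]
n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n

sum_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
ss_x = sum((x - mean_x) ** 2 for x in X)
b = sum_products / ss_x

# Partition the variation in Y
ss_total = sum((y - mean_y) ** 2 for y in Y)
ss_regression = b * sum_products
ss_residual = ss_total - ss_regression

# Residual mean square and the standard error of the slope
df_residual = n - 2
ms_residual = ss_residual / df_residual
se_b = math.sqrt(ms_residual / ss_x)
print(round(ss_total, 1), round(ss_regression, 1),
      round(ms_residual, 1), round(se_b, 4))  # 1399.9 1245.9 11.0 0.005
```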


Example: 95% confidence interval for slope with teeth example

$$ b \pm t_{\alpha(2),df} \, SE_b = b \pm t_{0.05(2),14} \, SE_b = -0.053 \pm 2.14(0.004) = -0.053 \pm 0.0086 $$
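
As a sketch, the interval is the slope estimate plus or minus the critical t value times SE_b. The function name `slope_confidence_interval` is an assumption; the critical value is passed in as a number read from a t table rather than computed.

```python
def slope_confidence_interval(b, se_b, t_critical):
    """Confidence interval b +/- t * SE_b; t_critical comes from a t table."""
    half_width = t_critical * se_b
    return b - half_width, b + half_width

# Values from the teeth example: b = -0.053, SE_b = 0.004, t_0.05(2),14 = 2.14
lower, upper = slope_confidence_interval(-0.053, 0.004, 2.14)
print(round(lower, 4), round(upper, 4))  # -0.0616 -0.0444
```

The interval excludes zero, which agrees with rejecting H0: β = 0 at the 5% level.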


Confidence bands: confidence intervals for predictions of mean Y for given X


Prediction intervals: confidence intervals for predictions of individual Y for given X


Hypothesis tests on slopes

H0: β = 0
HA: β ≠ 0

$$ t = \frac{b - \beta_0}{SE_b} $$

$$ t = \frac{-0.053 - 0}{0.004} = -13.25 $$

$$ t_{0.0001(2),14} = \pm 5.36 $$

Since |t| = 13.25 is greater than 5.36, we can reject H0, P < 0.0001
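
The same test as a short Python sketch, using the lecture's rounded values for b and SE_b. The function name is hypothetical, and the critical value 5.36 is taken as quoted rather than computed. For a two-tailed test it is the absolute value of t that is compared with the critical value.

```python
def slope_t_statistic(b, se_b, beta_0=0.0):
    """t statistic for testing H0: beta = beta_0."""
    return (b - beta_0) / se_b

t = slope_t_statistic(-0.053, 0.004)
t_critical = 5.36  # t_0.0001(2),14 as quoted in the lecture
reject_h0 = abs(t) > t_critical
print(round(t, 2), reject_h0)  # -13.25 True
```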


r² measures the fraction of the variation in Y that is explained by the regression line

r² is the “coefficient of determination”: it is the square of the correlation coefficient r
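
The two descriptions of r² can be checked against each other on the teeth data. The sketch below, with illustrative variable names, computes r² once as the square of the correlation coefficient and once as SS_regression / SS_Total, at full precision from the tabulated data.

```python
import math

# Teeth data from the table: X = delta-14C, Y = date of birth
X = [622, 262, 471, 112, 285, 439, 363, 391,
     89, 109, 91, 127, 99, 110, 123, 105]
Y = [1963.5, 1971.7, 1963.7, 1990.5, 1975, 1970.2, 1972.6, 1971.8,
     1985.5, 1983.5, 1990.5, 1987.5, 1990.5, 1984.5, 1983.5, 1989.5]
n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n

sum_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
ss_x = sum((x - mean_x) ** 2 for x in X)
ss_y = sum((y - mean_y) ** 2 for y in Y)  # this is SS_Total

# Route 1: square of the correlation coefficient
r = sum_products / math.sqrt(ss_x * ss_y)
r_squared_from_r = r ** 2

# Route 2: fraction of the total sum of squares explained by the line
ss_regression = (sum_products / ss_x) * sum_products  # b times the sum of products
r_squared_from_ss = ss_regression / ss_y

print(round(r, 3), round(r_squared_from_r, 3), round(r_squared_from_ss, 3))  # -0.943 0.89 0.89
```

Both routes give the same value, about 0.89 for these data.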
