7

Click here to load reader

S2_MEI

  • Upload
    john-ng

  • View
    215

  • Download
    0

Embed Size (px)

Citation preview

Page 1: S2_MEI

8/10/2019 S2_MEI

http://slidepdf.com/reader/full/s2mei 1/7

 

MEI Structured Mathematics

Module Summary Sheets

Statistics 2(Version B: reference to new book)

Topic 1: The Poisson Distribution

Topic 2: The Normal Distribution

Topic 3: Samples and Hypothesis Testing1. Test for population mean of a Normal distribution

2. Contingency Tables and the Chi-squared Test

Topic 4: Bivariate Data

 Purchasers have the licence to make multiple copies for usewithin a single establishment  

 MEI November, 2005

MEI, Oak House, 9 Epsom Centre, White Horse Business Park, Trowbridge, Wiltshire. BA14 0XG.

Company No. 3265490 England and Wales Registered with the Charity Commission, number 1058911

Tel: 01225 776776. Fax: 01225 775755.

MEI Mathematics in Education and Industry

Page 2: S2_MEI

8/10/2019 S2_MEI

http://slidepdf.com/reader/full/s2mei 2/7

Page 3: S2_MEI

8/10/2019 S2_MEI

http://slidepdf.com/reader/full/s2mei 3/7

E.g. The distribution of masses of adult males

may be modelled by a Normal distribution with

mean 75 kg and standard deviation 8 kg. Findthe probability that a man chosen at random

will have mass between 70 kg and 90 kg.

E.g. Find the probability that when a die isthrown 30 times there are at least 10 sixes.

Using the binomial distribution requires

P(30 sixes) + P(29 sixes) + ..........+ P(10 sixes).

However, using N(!", !"#) where ! = 30 and

 " =1/6, gives N(5, 4.167).

P( $ > 9.5) = P( & > ' 1)

where ' 1 = (9.5  5)/2.04 = 2.205

= 1  0.9863 = 0.0137

 N.B. a continuity correction is applied because

the original distribution (binomial) is being

approximated by a continuous distribution

(Normal).

For N(0,1), P( & < ' 1) can be found from tables

(Students Handbook, page 44)

E.g. P( & < 0.7) = 0.7580

P( & > 0.7) = 1 0.7580 = 0.2420

For N(2,9) [  = 2,    = 3]

P( $ < 5) = P( & < ' 1) where ' 1 = (5  2)/3 = 1

= 0.8413 

Modelling Many distributions in the real world, such as adult

heights or intelligence quotients, can be modelled well bya Normal distribution with appropriate mean and

variance. 

Given the mean    and standard deviation the Normal

distribution N(  ) may often be used.

When the underlying distribution is discrete then the

 Normal distribution may often be used, but in this case acontinuity correction must be applied. This requires us to

take the mid-point between successive possible values

when working with continuous distribution tables.

E.g. P( $ ) > 30 means P( $  > 30.5) if $  can take only

integer values.

The Normal distribution, N( ), is a continuous,

symmetric distribution with mean       and standard

deviation   The standard Normal distribution N(0,1) has

mean 0 and standard deviation 1.

P(X< (1) is represented by the area under the curve

 below (1.

(It is a special case of a continuous probability densityfunction which is a topic in Statistics 3.)

The area under the standard Normal distribution curve

can be found from tables.

To find the area under any other Normal distributioncurve, the values need to be standardised by the formula

The Normal approximation to the binomial

distribution. 

This is a valid process provided (i) ! is large,

(ii)  " is not too close to 0 or 1.

Mean = !", Variance = !"#.

The approximation will be N(!", !"#).

A continuity correction must be applied because we are

approximating a discrete distribution by a continuousdistribution. 

Statistics 2

Version B: page 3

Competence statements N1, N2, N3, N5

© MEI

Exercise 2A

Q. 4

Exercise 2B

Q. 1

 )(*+",- ./0

 1*2- 34

Exercise 2B

Q. 4

References:

Chapter 2

Pages 32-44

References:

Chapter 2

Pages 49, 50

 )(*+",- ./3

 1*2- 56

78++*9: 7. ;<"=> . The Normal Distribution

 (    ' 

 

References:

Chapter 2

Pages 50-52

1 2

1

2

We require P(70< 90) P( )

70 75

where 0.6258

90 75and 1.875

8

P(70 90)

P( 1.875) P 0.625

 P( 1.875) 1 P( 0.625)

0.9696 (1 0.7340) 0.7036

 $ ' & ' 

 ' 

 ' 

 $ 

 & & 

 & & 

The Normal approximation to the Poisson

distribution. 

This is a valid process provided   is sufficiently large for

the distribution to be reasonably symmetric. A goodguideline is if         is at least 10.

For a Poisson distribution, mean = variance =   

The approximation will be N( .

As with the Binomial Distribution, a continuity

correction must be applied because we are approximat-

ing a discrete distribution by a continuous distribution. 

References:

Chapter 2

Pages 52-54

Exercise 2BQ. 11

Exercise 2C

Q. 1, 6, 7

E.g. A large firm has 50 telephone lines. On

average, 40 lines are in use at once and the dis-

tribution may be modelled by Poisson(40). Find

the probability of there not being enough lines.

1

1

The distribution is Poisson(40).

Approximate by N(40,40).

Then we require P( 50) P( )

50.5 40where 1.66

40

P( 50) 1 0.9515 0.0485

 $ & ' 

 ' 

 $ 

Page 4: S2_MEI

8/10/2019 S2_MEI

http://slidepdf.com/reader/full/s2mei 4/7

!"# %&'()&*+(&,- ,. '/012# 0#/-' "# $ %&%'($)*&+ ,$- ./ ,&0/((/0 .- $ 1&2,$( 0*3)2*.'4

)*&+ $+0 3$,%(/3 &# 3*5/ n $2/ )$6/+ #2&, )7/

 %&%'($)*&+8 )7/+ )7/ 0*3)2*.')*&+ &# ,/$+3 &# )7/3/

3$,%(/3 *3 $(3& 1&2,$(9

:)$)*3)*;3 <

=/23*&+ >? %$@/ A

B&,%/)/+;/ 3)$)/,/+)3 1C

D EF"

FG/2;*3/ HI

J9 KL*M8L***M

Summary S2 Topic 3  Samples and Hypothesis Testing: Estimating the population mean of a Normal distribution

 

N/#/2/+;/3?

B7$%)/2 H

O$@/3 CP4QK

<

<

"# )7/ %$2/+) %&%'($)*&+ *3 1 8 )7/+ )7/

3$,%(*+@ 0*3)2*.')*&+ &# ,/$+3 *3 1 8 9n

 

  

FG$,%(/ H9KO$@/ QR

341,("#'&' (#'( .,) ("# 0#/- +'&-5 ("# 6,)0/2

%&'()&*+(&,-

S/3)3 &+ )7/ ,/$+ '3*+@ $ 3*+@(/ 3$,%(/9

TR *3   U  R V7/2/  R *3 3&,/ 3%/;*#*/0 W$('/9

TK ,$- ./ &+/ )$*(/0? X R &2 Y R 

&2 )V& )$*(/0?    R9

"+ &)7/2 V&2038 @*W/+ )7/ ,/$+ &# )7/ 3$,%(/ )$6/+ V/

$36 )7/ Z'/3)*&+8 [B&'(0 )7/ ,/$+ &# )7/ %$2/+) %&%'($)*&+ ./ V7$) V/ )7*+6 *) *3\]

I()/2+$)*W/(-8 *# )7/ W$('/ &# )7/ ,/$+ &# )7/ 3$,%(/(*/3 *+3*0/ )7/ $;;/%)$+;/ 2/@*&+ )7/+ V/ V&'(0 $;;/%)

TR8 .') *# *) ($- *+ )7/ ;2*)*;$( 2/@*&+ )7/+ V/ V&'(02/^/;) TR *+ #$W&'2 &# TK9

I()/2+$)*W/(-8 ;$(;'($)/ )7/ %2&.$.*(*)- )7$) )7/ W$('/ *3

@2/$)/2 )7$+ )7/ W$('/ #&'+0 $+0 3// *# *) (/33 )7$+ )7/3*@+*#*;$+;/ (/W/( ./ '3/09

N/#/2/+;/3?

B7$%)/2 H

O$@/3 QK4QH

7-,8- /-% #'(&0/(#% '(/-%/)% %#9&/(&,- S7/ 7-%&)7/3*3 )/3) 0/3;2*./0 $.&W/ 2/Z'*2/3 )7/ W$('/

&# )7/ 3)$+0$20 0/W*$)*&+ &# )7/ %$2/+) %&%'($)*&+9

"+ 2/$(*)- )7/ 3)$+0$20 0/W*$)*&+ &# )7/ %$2/+)

 %&%'($)*&+ V*(( '3'$((- +&) ./ 6+&V+ $+0 V*(( 7$W/ )&

 ./ /3)*,$)/0 #2&, )7/ 3$,%(/ 0$)$9"# )7/ 3$,%(/ 3*5/ *3 3'##*;*/+)(- ($2@/8 )7/ 3909 &# )7/

3$,%(/ ,$- ./ '3/0 $3 )7/ 3909 &# )7/ %$2/+) %&%'($4

)*&+9

I @&&0 @'*0/(*+/ *3 )& 2/Z'*2/ n  _R9

N/#/2/+;/3?

B7$%)/2 H

O$@/3 QH4QA

FG$,%(/ H9HO$@/ QA

FG/2;*3/ HI

J9 C

F9@9 I %&%'($)*&+ 7$3 W$2*$+;/ KC9 ") *3 2/Z'*2/0 )&

)/3) $) )7/ R9_` (/W/( &# 3*@+*#*;$+;/ V7/)7/2 )7/,/$+ &# )7/ %&%'($)*&+ ;&'(0 ./ KR &2 V7/)7/2 *)

*3 (/33 )7$+ )7*39 I 2$+0&, 3$,%(/ 3*5/ <_ 7$3 $,/$+ &# P9C9

I()/2+$)*W/(-8 *# )7/ ,/$+ *3 KR )7/+ )7/3$,%(*+@ 0*3)2*.')*&+ &# ,/$+3 *3 1LKR8 R9CAM

<

<

:'%%&3/ )7/ %$2/+) %&%'($)*&+ *3 1 8 8 )7/+ )7/

3$,%(*+@ 0*3)2*.')*&+ &# ,/$+3 *3 1 8 9S7/

;2*)*;$( W$('/3 $2/ )7/2/#&2/ V7/2/ )7/

W$('/ &# 0/%/+03 &+ )7/ (/W/( &# 3*@+*#*;$+;/

$+0 V7

n

k n

 

  

  

/)7/2 *) *3 $ &+/ &2 )V&4)$*(/0 )/3)9

a/ )7/2/#&2/ ;$(;'($)/ )7/ W$('/ $+0

;&,%$2/ *) )& )7/ W$('/ #&'+0 *+ )$.(/39

 x

 z n

 

 

 : 

R

K

R

T ? KR

T ? KR

<9_P L#&2 R9_ ̀(/W/(8 K4)$*(/0 )/3)M

AB2*)*;$( W$('/ *3 KR <9_P KR <9_P Q9bHC

_

:*+;/ P9C Y Q9bHC V/ $;;/%) T c)7/2/ *3 +& /W*0/+;/

$) )7/ R9_ ̀(/W/( &# 3*@+*#*;$+;/ )7$) )7/ ,/$+ *3

(

n

 

 

 

/33 )7$+ KR9

F9@9 ") *3 )7&'@7) )7$) )7/ %$2/+) %&%'($)*&+ *3 1&24

,$((- 0*3)2*.')/0 V*)7 ,/$+ <R9I 2$+0&, 3$,%(/ &# _R 0$)$ *)/,3 7$3 $ 3$,%(/,/$+ &# <A9< $+0 3909 P9H9

"3 )7/2/ $+- /W*0/+;/ $) )7/ R9K` 3*@+*#*;$+;/

(/W/( )7$) )7/ ,/$+ &# )7/ %&%'($)*&+ *3 +&) <R\

R

S7/+ OL P9CM K K

:*+;/ R9RARK R9RR_ V/ $;;/%) T

 X  

R

K

T ? <R

T ? <R

L1&)/ )7$) $()7&'@7 )7/ ,/$+ &# )7/ 3$,%(/

*3 @2/$)/2 )7$+ )7/ %2&%&3/0 ,/$+8 V/ 0& +&)

7$W/ Y<R ./;$'3/ &# )7/ V&20*+@ &# )7/

Z'/3)*&+9M

<A9< <R H9_QPP9H

_R

B2*)*;$( W$('/ #2&,

 x z 

n

 

 

 

  

 : 

R K

)$.(/3 #&2 )V&4)$*(/08

R9K` 3*@+*#*;$+;/ (/W/( *3 H9<Q

:*+;/ H9_QPYH9<Q V/ 2/^/;) T *+ #$W&'2 &# T 9

S7/2/ *3 /W*0/+;/ )7$) )7/ ,/$+ &# )7/

 %&%'($)*&+ *3 +&) <R9

F9@9 "# )7/ %$2/+) %&%'($)*&+ *3 1LKR8 KCM $+0 $

3$,%(/ &# 3*5/ <_ 7$3 ,/$+ P9C8 )7/+ )7*3 W$('/

;&,/3 #2&, )7/ 3$,%(*+@ 0*3)2*.')*&+ &# ,/$+3V7*;7 *3 1LKR8 R9CAM9

Page 5: S2_MEI

8/10/2019 S2_MEI

http://slidepdf.com/reader/full/s2mei 5/7

F9@9 "# 02*W*+@ )& B&((/@/ $+0 )7/ 0*3)$+;/ (*W/0

$2/ +&) $33&;*$)/0 /W/+)3 )7/+ *# &+/ 3)'0/+) *3

;7&3/+ $) 2$+0&, )7/ /3)*,$)/0 %2&.$.*(*)*/3 $2/

"+ $ 3*,*($2 V$- )7/ /+)2*/3 *+ )7/ &)7/2 )72//

 .&G/3 $2/ ;$(;'($)/0 )& @*W/ )7/ #&((&V*+@?

a/ )/3) )7/ 7-%&)7/3/3?

TR? )7/ )V& /W/+)3 $2/ +&) $33&;*$)/0TK? S7/ )V& /W/+)3 $2/ $33&;*$)/09

"# )7/ )/3) *3 $) )7/ _` (/W/( )7/+ )7/ )$.(/3 &+

 %$@/ A_ &# )7/ EF" :)'0/+)3d T$+0.&&6 @*W/3 )7/;2*)*;$( W$('/ &# H9PAK Lv U KM9

:*+;/ A9K<KA Y H9PAK V/ 2/^/;) )7/ +'(( 7-%&)7/4

3*38 TR8 $+0 ;&+;('0/ )7$) )7/2/ *3 /W*0/+;/ )7$) )7/

)V& /W/+)3 $2/ $33&;*$)/09

"# )7/ )/3) V/2/ $) )7/ K` 3*@+*#*;$+;/ (/W/( )7/+V/ V&'(0 ;&+;('0/ )7$) )7/2/ V$3 +&) /+&'@7 /W*4

0/+;/ )& 2/^/;) )7/ +'(( 7-%&)7/3*39

F9@9 $ @2&'% &# _R 3)'0/+)3 V$3 3/(/;)/0 $) 2$+0&,#2&, )7/ V7&(/ %&%'($)*&+ &# 3)'0/+)3 $) $

B&((/@/9 F$;7 V$3 $36/0 V7/)7/2 )7/- 02&W/ )&

B&((/@/ &2 +&) $+0 V7/)7/2 )7/- (*W/0 ,&2/ )7$+

&2 (/33 )7$+ KR 6, #2&, )7/ B&((/@/9 S7/ 2/3'()3

$2/ 37&V+ *+ )7*3 )$.(/9

!"#  ; <(/(&'(&= >?"&@'A+/)#% '(/(&'(&=B

S7*3 3)$)*3)*; ,/$3'2/3 7&V #$2 $%$2) $2/ )7/ 3/) &#

&.3/2W/0 $+0 /G%/;)/0 #2/Z'/+;*/39

?,-(&-5#-=4 !/*2#'

:'%%&3/ )7/ /(/,/+)3 &# $ %&%'($)*&+ 7$W/ < 3/)3 &# 0*34

)*+;) ;7$2$;)/2*3)*;3 e X, Y f8 /$;7 3/) ;&+)$*+*+@ $ #*+*)/

+',./2 &# 0*3;2/)/ ;7$2$;)/2*3)*;3 X  U e xK8 x<8g98 xmf $+0

Y  U e yK8 y<8gg98 ynf )7/+ /$;7 /(/,/+) &# )7/ %&%'($)*&+

V*(( 7$W/ $ %$*2 &# ;7$2$;)/2*3)*;3 L xi8 y jM9

S7/ #2/Z'/+;- &# )7/3/ m  n %$*23 L xi8 y jM ;$+ ./ )$.'($)/0

*+)& $+ m  n =,-(&-5#-=4 (/*2#9

S7/ 0/)5&-/2 (,(/2' $2/ )7/ 3', &# )7/ 2&V3 $+0 )7/ 3',

&# )7/ ;&(',+3 $+0 *) *3 '3'$( )& $00 $ 2&V $+0 $ ;&(',+

#&2 )7/3/9

S7/ 2/Z'*2/,/+) *3 )& 0/)/2,*+/ )7/ /G)/+) )& V7*;7 )7/W$2*$.(/3 $2/ 2/($)/09

"# )7/- $2/ +&) 2/($)/0 .') *+0/%/+0/+)8 )7/+ )7/&2/)*;$(

 %2&.$.*(*)*/3 ;$+ ./ /3)*,$)/0 #2&, )7/ 3$,%(/ 0$)$9h&' +&V 7$W/ )V& )$.(/38 &+/ ;&+)$*+*+@ )7/ $;)'$(

L&.3/2W/0M #2/Z'/+;*/3 $+0 )7/ &)7/2 ;&+)$*+*+@ )7/ /3)*4

,$)/0 /G%/;)/0 #2/Z'/+;*/3 .$3/0 &+ )7/ $33',%)*&+ )7$)

)7/ W$2*$.(/3 $2/ *+0/%/+0/+)9

!"# "41,("#'&' (#'(

TR? S7/ W$2*$.(/3 $2/ +&) $33&;*$)/09

TK? S7/ W$2*$.(/3 $2/ $33&;*$)/09

:)$)*3)*;3 <

=/23*&+ >? %$@/ _

B&,%/)/+;/ 3)$)/,/+)3 TK8 T<

D EF"

FG/2;*3/ H>J9 A8 _

N/#/2/+;/3?

B7$%)/2 H

O$@/3 PK4P_

N/#/2/+;/3?

B7$%)/2

O$@/3 PQ4b<

Summary S2 Topic 3  Samples and Hypothesis Testing2: Contingency Tables and the Chi-squared Test  

 yK  y<  yn

 xK  f K8K  f K8<  f K8n

 x<  f <8K  f <8<  f 28n

 xm  f m8K  f m8<  f m,n

 1/$2/2

)7$+ KR 6, 

i'2)7/2 )7$+ KR

6,

j2*W/3  KK  KQ  ;C

j&/3 +&)

02*W/ 

K_  Q  ;;

;D ;E FG

<P <AOL02*W/3M 8OL(*W/3 #'2)7/2 )7$+ KR 6,M

_R _R

<P <A$+0 OL02*W/3 $+0 (*W/3 #'2)7/2 )7$+ KR 6,M U_R _R

R9<CPP

:& &') &# _R %/&%(/ V/ V&'(0 /G%/;) _R R9<CPP

  U KH9AA

   1/$2/2

)7$+ KR 6, 

i'2)7/2 )7$+ KR

6,

j2*W/3  KA9_C  KH9AA  ;C

j&/3 +&)02*W/  KK9AA  KR9_C  ;;

;D ;E FG

<

<

<

&.3/2W/0 #2/Z'/+;- 4 /G%/;)/0 #2/Z'/+;-

/G%/;)/0 #2/Z'/+;-

o e

e

 X 

 f f 

 f 

H#5)##' ,. .)##%,0

S7/ 0*3)2*.')*&+ 0/%/+03 &+ )7/ +',./2 &# #2// W$2*$.(/3

)7/2/ $2/8 ;$((/0 )7/ 0/@2//3 &# #2//0&,8  9

S7*3 *3 )7/ +',./2 &# ;/((3 (/33 )7/ +',./2 &# 2/3)2*;)*&+3

 %($;/0 &+ )7/ 0$)$9

i&2 $ << )$.(/ 3';7 $3 )7/ /G$,%(/ @*W/+ )7/ +',./2 &#;/((3 )& ./ #*((/0 *3 A8 .') )7/ &W/2$(( )&)$( *3 _R V7*;7 *3 $

2/3)2*;)*&+ $+0 )7/ %2&%&2)*&+3 #&2 /$;7 W$2*$.(/ V/2/ $(3&

/3)*,$)/0 #2&, )7/ 0$)$8 @*W*+@ )V& #'2)7/2 2/3)2*;)*&+39

:& )7/ +',./2 &# 0/@2//3 &# #2//0&, *+ )7/ /G$,%(/ *3 K9

"+ @/+/2$( )7/ +',./2 &# 0/@2//3 &# #2//0&, #&2 $+ m  n 

)$.(/ *3 Lm  KMLn KM9

<

<

< < < <KK KA9_C KQ KH9AA K_ KK9AA Q KR9_C

KA9_C KH9AA KK9AA KR9_C

R9PQRA R9bAHR K9KRQP K9<RR<A9K<KA

o e

e

 f f  X 

 f 

N/#/2/+;/3?

B7$%)/2 H

O$@/ P_

FG/2;*3/ HB

J9 K8 P

Page 6: S2_MEI

8/10/2019 S2_MEI

http://slidepdf.com/reader/full/s2mei 6/7

For the data above:

For the data above:

r  can be found directly with an appropriatecalculator.

Example 1

The length ( x cm) and width ( y cm) of leaves

of a tree were measured and recorded as

follows:

 x  2.1 2.3 2.7 3.0 3.4 3.9

 y  1.1 1.3 1.4 1.6 1.9 1.7

The scatter graph is drawn as shown.

Pearson’s Product Moment Correlation Coefficient  provides a standardised measure of covariance.

The pmcc lies between -1 and +1.

Correlation and Regression If the x and y values are both regarded as values of randomvariables, then the analysis is correlation. Choose a sample

from a population and measure two attributes.

If the x value is non-random (e.g. time at fixed intervals)

then the analysis is regression. Choose the value of one vari-able and measure the corresponding value of another. 

Bivariate Data are pairs of values ( x, y) associated with a

single item.

e.g. lengths and widths of leaves.The individual variables  x and  y may be discrete or

continuous.

A scatter diagram is obtained by plotting the points ( x1, y1),( x2, y2) etc.Correlation is a measure of the linear association between

the variables.

There is said to be perfect correlation if all the points lie on

a line.

Statistics 2

Version B: page 6

Competence statements b1, b2, b3

© MEI

 Example 4.1

 Page 112

Exercise 4A

Q. 2

References:

Chapter 4

Pages 111-114

References:

Chapter 4

Pages 104-109

References:

Chapter 4

Pages 110-111

Summary S2 Topic 4   Bivariate Data 1 

 _ _ 

 _ _ 

A line of best fit is a line drawn to fit the set of data points

as closely as possible.

This line will pass through the mean point ( , ) where

 is the mean of the values and is the mean of the

 x y

 x x y

 values. y

x

2.0 3.0 4.01.0

1.5

y2.0

The mean point is ( ) which is (2.9, 1.5)

The line of best fit is drawn through the point (2.9,1.5)

 _ _ 

 x, y

17.4; 9.0; 26.9717.4 9.0

26.97 0.876

 xy xy

 x y xy

S S 

2

2

52.76 2.3

13.92 0.42

0.870.885

2.3 0.42

 xx

 yy

 xy

 xx yy

 x S 

 y S 

S r 

S S 

E.g. 50 students are selected at random and

their heights and weights are measured. This

will require correlation analysis.A ball is bounced 5 times from each of a

number of different heights and the height is

recorded. This will require regressionanalysis.

,

The alternative form for is

= =

 xy i i xx i i

 yy i i

 xy

i i

 xy i i i i

S x x y y S x x x x

S y y y y

 x yS x y x y n x y

n

 _ _ _ _ 

 _ _ 

 _ _ 

Notation for pairs of observations ( , ).n x y

 _ _ 

2 2 _ _ 

 _ _ 

 _ _ 2 2 2 2

.

i i xy

 xx yy

i i

i i

i i

 x x y yS 

r S S 

 x x y y

 x y n x y

 x n x y n y

Page 7: S2_MEI

8/10/2019 S2_MEI

http://slidepdf.com/reader/full/s2mei 7/7

For the data of Example 1:

r = 0.885

We wish to test the hypothesis that there is

 positive correlation between lengths and

widths of the leaves of the tree.

Ho:   = 0 There is no correlation betweenthe two variables.

H1:   > 0 There is positive correlation between the two variables

(1-tailed test).

From the Students’ Handbook, the critical

value for n = 8 at 5% level (one tailed test) is

0.6215.

Since 0.885>0.6215 there is evidence that H0 

can be rejected and that there is positive

correlation between the two variables.

Example 2 

2 judges ranked 5 competitors as follows:

Competitor A B C D E

Judge 1 1 3 4 2 5

Judge 2 2 3 1 4 5d   1 0 -3 2 0

d 2  1 0 9 4 0

The least squares regression line For each value of x the value of y given and the value on the

line may be different by an amount called the residual.

If the data pair is ( xi ,yi) where the line of best fit is

 y = a + bx then yi - (a + bxi )  = ei giving the residual ei .

The least squares regression line is the line that minimises the

sum of the squares of the residuals.

Spearman’s coefficient of rank correlation

If the data do not look linear when plotted on a scatter graph

(but appear to be reasonably monotomic), or if the rank order

instead of the values is given, then the Pearson correlation

coefficient is not appropriate.

Instead, Spearman’s rank correlation coefficient should beused. It is usually calculated using the formula

where d  is the difference in ranks for each data pair.

This coefficient is used:

(i) when only ranked data are available,

(ii) the data cannot be assumed to be linear.

In the latter case, the data should be ranked.

Where r  s has been found the hypothesis test is set up in the

same way. The condition here is that the sample is random.Make sure that you use the right tables!

Tied Ranks If two ranks are tied in, say, the 3rd place then each should be

given the rank 31/2. 

Testing a parent population correlation by means of a

sample where r  has been found

The value of r  found for a sample can be used to test

hypotheses about the value of   , the correlation in the parent population.

Conditions:

(i) the values of x and y must be taken from a bivariate

 Normal distribution,(ii) the data must be a random sample.

An indication that a bivariate Normal distribution is a validmodel is shown by a scatter plot which is roughly elliptical

with the points denser near the middle. 

Statistics 2

Version B:

 page 7

Competence

statements

 b4, b5, b6, b7, b8© MEI

Exercise 4CQ. 3

 Example 4.3

 Page 134

References:

Chapter 4

Pages 132-134

References:

Chapter 4

Pages 142-144

Summary S2 Topic 4  Bivariate Data 2 

2

2

61

( 1)

i

 s

d r 

n n

 _ 

The equation of the line is ( ) xy

 xx

S  y y x x

 _ 

52

1

6 1414 1 0.3

5 24 s

 xd r 

 x

0

1

1

H : = 0 There is no correlation between

the two variables.

H : 0 There is correlation between

the two variables (2 - tailed test.)

Or :

H : > 0 Ther 

  

 

  

1

e is positive correlation between

the two variables (1- tailed test.)

Or :

H : < 0 There is negative correlation between

the two variables (1- tailed test.)

  

References:

Chapter 4

Pages 118-124

For the data of example 2:

r  s = 0.3

Ho:   = 0 There is no correlation between

the two variables.

H1:   > 0 There is positive correlation

 between the two variables

(1-tailed test).

For n = 5 at the 5% level (1 tailed test), the

critical value is 0.9.

Since 0.3 < 0.9 we are unable to reject Ho 

and conclude that there is no evidence tosuggest correlation.

Exercise 4C

Q. 5

Exercise 4D

Q. 4

E.g. For the data  x  1 2 3 4 5

 y  1.1 2.4 3.6 4.7 6.1

2

 _ _ 

 _ 2 2 2

 _ _ 

15, 17.9, 55, 65

15 17.93, 3.58

5 5

55 6 3 10

65 5 3 3.58 11.3

11.33.58 3 1.13 0.19

10

 xx

 xy

 x y x xy

 x y

S x n x

S xy n x y

 y x y x