Correlation analysis ppt


Prof Anil Mishra Krupajal Management Studies Bhubaneswar-2


CORRELATION ANALYSIS

It is a statistical measure which shows the relationship between two or more variables moving in the same or in opposite directions.

04/08/23 anilmishra5555@rediffmail.com

Visual Displays and Correlation Analysis

• The sample correlation coefficient (r) measures the degree of linearity in the relationship between X and Y.

-1 ≤ r ≤ +1

• r = 0 indicates no linear relationship

• In Excel, use =CORREL(array1,array2), where array1 is the range for X and array2 is the range for Y.

[Figure: scatter plots illustrating a strong negative relationship and a strong positive relationship]

Types of correlation

• Positive & negative

• Simple, multiple & partial

• Linear & non-linear

Methods of correlation

• Scatter diagram

• Product moment or covariance

• Rank correlation

• Concurrent deviation

Scatter diagram

• Perfectly +ve

• Less-degree +ve (weak positive correlation)

• High degree +ve (strong positive correlation)

• Perfectly -ve

• High degree -ve (strong negative correlation)

• Less degree -ve (weak negative correlation)

• Zero degree

Karl Pearson correlation coefficient

$$ r = \frac{\operatorname{cov}(x, y)}{\sigma_x \, \sigma_y} $$

$$ r = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}} $$

Where

$$ x = X - \bar{X} \quad \text{and} \quad y = Y - \bar{Y} $$

Problem

From the following data, find the coefficient of correlation by Karl Pearson's method.

X: 6 2 10 4 8

Y: 9 11 5 8 7

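Sol.

X     Y     x = X-6   y = Y-8   x²    y²    x.y
6     9      0         1         0     1      0
2     11    -4         3        16     9    -12
10    5      4        -3        16     9    -12
4     8     -2         0         4     0      0
8     7      2        -1         4     1     -2
30    40     0         0        40    20    -26   (totals)

Sol. cont.

$$ \bar{X} = \frac{\sum X}{N} = \frac{30}{5} = 6, \qquad \bar{Y} = \frac{\sum Y}{N} = \frac{40}{5} = 8 $$

$$ r = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}} = \frac{-26}{\sqrt{40 \times 20}} = \frac{-26}{\sqrt{800}} = -0.92 $$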
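The worked solution above can be checked with a few lines of Python. This is only a sketch of the deviation-from-mean calculation for the five data pairs in the problem; the variable names are illustrative and not part of the slides.

```python
import math

# Data from the problem
X = [6, 2, 10, 4, 8]
Y = [9, 11, 5, 8, 7]

n = len(X)
mean_x = sum(X) / n  # 30 / 5 = 6
mean_y = sum(Y) / n  # 40 / 5 = 8

# Deviations from the means: x = X - mean(X), y = Y - mean(Y)
x = [xi - mean_x for xi in X]
y = [yi - mean_y for yi in Y]

sum_xy = sum(a * b for a, b in zip(x, y))  # sum of x.y
sum_x2 = sum(a * a for a in x)             # sum of x squared
sum_y2 = sum(b * b for b in y)             # sum of y squared

r = sum_xy / math.sqrt(sum_x2 * sum_y2)
print(sum_xy, sum_x2, sum_y2, round(r, 2))
```

The sums reproduce the totals row of the table (-26, 40 and 20), and r rounds to -0.92.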

Direct method

$$ r = \frac{N \sum XY - \sum X \cdot \sum Y}{\sqrt{N \sum X^2 - (\sum X)^2} \cdot \sqrt{N \sum Y^2 - (\sum Y)^2}} $$
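A minimal sketch of the direct method, applied to the data of the Karl Pearson problem (X: 6 2 10 4 8, Y: 9 11 5 8 7). The function name pearson_direct is an illustrative choice, not something defined in the slides.

```python
import math

def pearson_direct(X, Y):
    """Pearson's r from raw sums, without computing the means first."""
    n = len(X)
    sum_x, sum_y = sum(X), sum(Y)
    sum_xy = sum(a * b for a, b in zip(X, Y))
    sum_x2 = sum(a * a for a in X)
    sum_y2 = sum(b * b for b in Y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r_direct = pearson_direct([6, 2, 10, 4, 8], [9, 11, 5, 8, 7])
print(round(r_direct, 2))
```

For this data the numerator is 5(214) - 30(40) = -130 and the denominator is sqrt(200) x sqrt(100), giving the same r of about -0.92 as the deviation method.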

Short-cut method

$$ r = \frac{N \sum d_x d_y - \sum d_x \cdot \sum d_y}{\sqrt{N \sum d_x^2 - (\sum d_x)^2} \cdot \sqrt{N \sum d_y^2 - (\sum d_y)^2}} $$

Where

$$ d_x = X - A \quad \text{and} \quad d_y = Y - A $$

A = assumed mean (chosen separately for the X series and the Y series)
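The short-cut method can be sketched as below, again on the data of the Karl Pearson problem. The assumed means 5 (for X) and 7 (for Y) are arbitrary values picked for illustration; any other choice should give the same r.

```python
import math

def pearson_shortcut(X, Y, a_x, a_y):
    """Pearson's r from deviations taken about assumed means a_x and a_y."""
    n = len(X)
    dx = [v - a_x for v in X]
    dy = [v - a_y for v in Y]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(p * q for p, q in zip(dx, dy))
    sum_dx2 = sum(p * p for p in dx)
    sum_dy2 = sum(q * q for q in dy)
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = (math.sqrt(n * sum_dx2 - sum_dx ** 2)
                   * math.sqrt(n * sum_dy2 - sum_dy ** 2))
    return numerator / denominator

X = [6, 2, 10, 4, 8]
Y = [9, 11, 5, 8, 7]

r_a = pearson_shortcut(X, Y, a_x=5, a_y=7)    # illustrative assumed means
r_b = pearson_shortcut(X, Y, a_x=10, a_y=0)   # a different, equally valid choice
print(round(r_a, 2), round(r_b, 2))
```

Both calls return about -0.92, which illustrates that the result does not depend on which assumed means are used.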

Product moment method

$$ r = \sqrt{b_{xy} \cdot b_{yx}} $$

where

$$ b_{xy} = \frac{\sum xy}{\sum y^2}, \qquad b_{yx} = \frac{\sum xy}{\sum x^2} $$

r takes the same sign as the two regression coefficients.
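As a sketch, the two regression coefficients and their geometric mean can be computed from the sums of the Karl Pearson problem (sum of x.y = -26, sum of x² = 40, sum of y² = 20). Attaching the sign with math.copysign is an implementation choice made here, since the square root alone is always non-negative.

```python
import math

# Sums of deviations from the worked Karl Pearson problem
sum_xy = -26
sum_x2 = 40
sum_y2 = 20

b_xy = sum_xy / sum_y2  # regression coefficient of x on y
b_yx = sum_xy / sum_x2  # regression coefficient of y on x

# Geometric mean of the two coefficients, carrying their common sign
r = math.copysign(math.sqrt(b_xy * b_yx), b_xy)
print(b_xy, b_yx, round(r, 2))
```

Here b_xy = -1.3 and b_yx = -0.65, so r is about -0.92, in agreement with the other methods.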

Spearman's Rank correlation

$$ R = 1 - \frac{6 \sum D^2}{N(N^2 - 1)} $$

where

$$ D = R_x - R_y, \qquad R_x = \text{rank of } X, \qquad R_y = \text{rank of } Y $$

Problem

Calculate Spearman's rank correlation coefficient between advertising cost and sales from the following data.

Advt. cost: 39 65 62 90 82 75 25 98 36 78

Sales (lakhs): 47 53 58 86 62 68 60 91 51 84

Sol.

X     Y     R-x   R-y   D     D²
39    47     8    10    -2     4
65    53     6     8    -2     4
62    58     7     7     0     0
90    86     2     2     0     0
82    62     3     5    -2     4
75    68     5     4     1     1
25    60    10     6     4    16
98    91     1     1     0     0
36    51     9     9     0     0
78    84     4     3     1     1

ΣD² = 30

Sol. cont.

$$ R = 1 - \frac{6 \sum D^2}{N^3 - N} = 1 - \frac{6 \times 30}{10^3 - 10} = 1 - \frac{2}{11} = \frac{9}{11} = 0.82 $$
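The ranking and the final calculation above can be reproduced with a short sketch. The helper ranks_desc is a hypothetical name; it gives rank 1 to the largest value (as in the table) and assumes there are no tied values.

```python
def ranks_desc(values):
    """Rank 1 goes to the largest value; assumes all values are distinct."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

cost = [39, 65, 62, 90, 82, 75, 25, 98, 36, 78]
sales = [47, 53, 58, 86, 62, 68, 60, 91, 51, 84]

n = len(cost)
r_x = ranks_desc(cost)
r_y = ranks_desc(sales)

# Sum of squared rank differences
sum_d2 = sum((a - b) ** 2 for a, b in zip(r_x, r_y))

R = 1 - 6 * sum_d2 / (n ** 3 - n)
print(sum_d2, round(R, 2))
```

The sum of squared differences comes out as 30 and R rounds to 0.82, matching the table.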

In case of equal rank

$$ R = 1 - \frac{6\left[\sum D^2 + \frac{1}{12}(m^3 - m) + \frac{1}{12}(m^3 - m) + \cdots\right]}{N(N^2 - 1)} $$

where m = no. of times an item is repeated (one correction term for each repeated value)

Problem

A psychologist wanted to compare two methods, A and B, of teaching. He selected a random sample of 22 students and grouped them into 11 pairs so that the students in a pair had approximately equal scores in an intelligence test. In each pair, one student was taught by method A and the other by method B, and both were examined after the course. The marks obtained by them are as follows. Calculate the rank correlation coefficient.

Pair: 1 2 3 4 5 6 7 8 9 10 11

A: 24 29 19 14 30 19 27 30 20 28 11

B: 37 35 16 26 23 27 19 20 16 11 21

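Sol.

A     B     R-A    R-B    D      D²
24    37     6      1      5     25
29    35     3      2      1      1
19    16     8.5    9.5   -1      1
14    26    10      4      6     36
30    23     1.5    5     -3.5   12.25
19    27     8.5    3      5.5   30.25
27    19     5      8     -3      9
30    20     1.5    7     -5.5   30.25
20    16     7      9.5   -2.5    6.25
28    11     4     11     -7     49
11    21    11      6      5     25

ΣD² = 225

Sol. cont.

In series A the items 19 and 30 are each repeated twice, and in series B 16 is repeated twice, so there are three correction terms with m = 2.

$$ R = 1 - \frac{6\left[\sum D^2 + \frac{1}{12}(2^3 - 2) + \frac{1}{12}(2^3 - 2) + \frac{1}{12}(2^3 - 2)\right]}{11(11^2 - 1)} = 1 - \frac{6(225 + 1.5)}{1320} = 1 - \frac{1359}{1320} = -0.0295 $$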
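A sketch of the tied-rank calculation for this problem follows. The helper names are illustrative; tied values receive the average of the positions they occupy (so the two 30s in series A each get rank 1.5), and one correction term (m³ - m)/12 is added for every repeated value in either series.

```python
from collections import Counter

def average_ranks_desc(values):
    """Rank 1 goes to the largest value; tied values share the average rank."""
    order = sorted(values, reverse=True)
    positions = {}
    for pos, v in enumerate(order, start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every value that occurs m > 1 times."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

A = [24, 29, 19, 14, 30, 19, 27, 30, 20, 28, 11]
B = [37, 35, 16, 26, 23, 27, 19, 20, 16, 11, 21]

n = len(A)
r_a = average_ranks_desc(A)
r_b = average_ranks_desc(B)

sum_d2 = sum((p - q) ** 2 for p, q in zip(r_a, r_b))
correction = tie_correction(A) + tie_correction(B)

R = 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))
print(sum_d2, correction, round(R, 4))
```

With these definitions the sum of squared differences is 225, the total correction is 1.5, and R is about -0.0295, i.e. practically no rank correlation between the two methods.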

Properties of correlation coefficient

• r always lies between +1 & -1, i.e. -1 ≤ r ≤ +1

• Two independent variables are uncorrelated, but the converse is not true

• r is independent of change in origin and scale

• r is the G.M. of the two regression coefficients

• r is symmetrical
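Two of these properties, independence of origin and scale and symmetry, can be illustrated numerically. The sketch below reuses the data of the Karl Pearson problem; the shifts (6 and 8) and the positive scale factors (2 and 3) are arbitrary values chosen for illustration.

```python
import math

def pearson(X, Y):
    """Pearson's r computed from deviations about the means."""
    n = len(X)
    mean_x, mean_y = sum(X) / n, sum(Y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(X, Y))
    s_xx = sum((a - mean_x) ** 2 for a in X)
    s_yy = sum((b - mean_y) ** 2 for b in Y)
    return s_xy / math.sqrt(s_xx * s_yy)

X = [6, 2, 10, 4, 8]
Y = [9, 11, 5, 8, 7]

# Change of origin and scale: U = (X - 6) / 2, V = (Y - 8) / 3
U = [(a - 6) / 2 for a in X]
V = [(b - 8) / 3 for b in Y]

r_xy = pearson(X, Y)
r_uv = pearson(U, V)   # same value after shifting and rescaling
r_yx = pearson(Y, X)   # same value with the roles of X and Y swapped
print(math.isclose(r_xy, r_uv), math.isclose(r_xy, r_yx))
```

Note that the invariance holds as stated for scale factors of the same sign; rescaling only one variable by a negative factor reverses the sign of r.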

Probable error

$$ SE(r) = \frac{1 - r^2}{\sqrt{n}} $$

$$ PE(r) = 0.6745 \times SE(r) \quad \text{or} \quad PE(r) = 0.6745 \, \frac{1 - r^2}{\sqrt{n}} $$
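A minimal sketch of the two formulas, evaluated for r = -0.92 and n = 5 from the Karl Pearson problem earlier in the slides. The function names are illustrative.

```python
import math

def standard_error(r, n):
    """SE(r) = (1 - r^2) / sqrt(n)."""
    return (1 - r ** 2) / math.sqrt(n)

def probable_error(r, n):
    """PE(r) = 0.6745 * SE(r)."""
    return 0.6745 * standard_error(r, n)

se = standard_error(-0.92, 5)
pe = probable_error(-0.92, 5)
print(round(se, 4), round(pe, 4))
```

For these inputs SE(r) is about 0.069 and PE(r) is about 0.046.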

The coefficient of determination

It is the primary way we can measure the extent, or strength, of the association that exists between two variables x and y, because we have used a sample of points to develop regression lines.

It is denoted by r².
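As a worked illustration, the coefficient of determination for the Karl Pearson problem earlier in the slides can be obtained from the same sums (Σxy = -26, Σx² = 40, Σy² = 20). The percentage reading given afterwards is the standard interpretation of r² and is not stated on the slide itself.

$$ r^2 = \frac{(\sum xy)^2}{\sum x^2 \cdot \sum y^2} = \frac{(-26)^2}{40 \times 20} = \frac{676}{800} = 0.845 $$

That is, roughly 84.5% of the variation in Y is associated with its linear relationship with X.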

Thank you
