Correlation and Regression -Aakriti Agarwal Roll No. 13004 BMS 1A

Correlation analysis

Page 1: Correlation analysis

Correlation and Regression

-Aakriti Agarwal

Roll No. 13004

BMS 1A

Page 2: Correlation analysis

Correlation

• Correlation refers to statistical relationships involving two random variables or sets of data

• The correlation coefficient is denoted by ‘r’ and ranges from -1 to +1

• It indicates the direction and the strength of the linear relationship between two variables

Page 3: Correlation analysis

Coefficient of Correlation

The coefficient of correlation can be:

• perfect negative: r = -1

• strong negative: -1 < r < 0 and r closer to -1

• weak negative: -1 < r < 0 and r closer to 0

• no linear correlation: r = 0 (the variables are uncorrelated, though not necessarily independent)

• weak positive: 0 < r < 1 and r closer to 0

• strong positive: 0 < r < 1 and r closer to 1

• perfect positive: r = 1
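
The categories above can be sketched as a small Python helper. The slides do not say where "weak" ends and "strong" begins, so the cutoff of 0.5 used here is purely an assumption for illustration, and the function name `describe_r` is hypothetical.

```python
def describe_r(r, strong_cutoff=0.5):
    """Label a correlation coefficient using the categories from the slide.

    strong_cutoff is an assumed threshold (not given in the slides):
    |r| at or above it is called "strong", below it "weak".
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    if r == 1.0:
        return "perfect positive"
    if r == -1.0:
        return "perfect negative"
    if r == 0.0:
        return "no linear correlation"
    strength = "strong" if abs(r) >= strong_cutoff else "weak"
    sign = "positive" if r > 0 else "negative"
    return f"{strength} {sign}"


print(describe_r(0.863399))  # value from the Pearson example in this deck
print(describe_r(-0.2))
```

Different fields use different cutoffs, so the threshold is left as a parameter rather than fixed.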

Page 4: Correlation analysis

Methods to calculate the Correlation Coefficient

• Karl Pearson

• Spearman

Page 5: Correlation analysis

Karl Pearson

$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \, \sum_{i=1}^{n}(y_i - \bar{y})^2}}$$

n = number of pairs of observations
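
A minimal Python sketch of Karl Pearson's formula may help make the calculation concrete. It uses the marketing expenditure and sales figures from this deck's Excel example; the function name `pearson_r` and the variable names are illustrative choices, not part of the original slides.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations.
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Data from the deck, 2001 to 2013.
expenditure = [8, 10.5, 11, 12, 12.9, 13.5, 11.6, 10.9, 13, 14, 15.3, 16, 17]
sales = [9.1, 10.1, 9.3, 9.9, 11.3, 10.9, 11.6, 12.5, 14, 14.5, 15, 15.6, 16.2]

r = pearson_r(expenditure, sales)
print(round(r, 4))
```

The three sums in the function correspond to the three column totals that the spreadsheet version computes.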

Page 6: Correlation analysis

Data for Calculation in MS Excel

Year    Marketing Expenditure (in Rs. Lakhs)    Sales (in Unit Lakhs)
2001    8                                       9.1
2002    10.5                                    10.1
2003    11                                      9.3
2004    12                                      9.9
2005    12.9                                    11.3
2006    13.5                                    10.9
2007    11.6                                    11.6
2008    10.9                                    12.5
2009    13                                      14
2010    14                                      14.5
2011    15.3                                    15
2012    16                                      15.6
2013    17                                      16.2

[Figure: chart of Expenditure (in Lakhs) and Sales (in Lakhs) against Year, 2000 to 2015; vertical axis in Lakhs, 0 to 18]

Page 7: Correlation analysis

Pearson in MS Excel

r = $H$16/($E$16*$G$16)^0.5

where $H$16 = ∑(x_i − x̄)(y_i − ȳ), $E$16 = ∑(x_i − x̄)², $G$16 = ∑(y_i − ȳ)², and ^0.5 takes the square root.

Year    x       y       x_i − x̄         (x_i − x̄)²    y_i − ȳ      (y_i − ȳ)²    (x_i − x̄)(y_i − ȳ)
2001    8       9.1     -4.746153846    22.52598      -3.20769     10.28929      15.2242
2002    10.5    10.1    -2.246153846    5.045207      -2.20769     4.873905      4.958817
2003    11      9.3     -1.746153846    3.049053      -3.00769     9.046213      5.251893
2004    12      9.9     -0.746153846    0.556746      -2.40769     5.796982      1.796509
2005    12.9    11.3    0.153846154     0.023669      -1.00769     1.015444      -0.15503
2006    13.5    10.9    0.753846154     0.568284      -1.40769     1.981598      -1.06118
2007    11.6    11.6    -1.146153846    1.313669      -0.70769     0.500828      0.811124
2008    10.9    12.5    -1.846153846    3.408284      0.192308     0.036982      -0.35503
2009    13      14      0.253846154     0.064438      1.692308     2.863905      0.429586
2010    14      14.5    1.253846154     1.57213       2.192308     4.806213      2.748817
2011    15.3    15      2.553846154     6.52213       2.692308     7.248521      6.87574
2012    16      15.6    3.253846154     10.58751      3.292308     10.83929      10.71266
2013    17      16.2    4.253846154     18.09521      3.892308     15.15006      16.55728
∑                                       73.33231                   74.44923      63.79538

x̄ = 12.74615

ȳ = 12.30769

r = 0.863399
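
As a check on the spreadsheet result, the three column totals can be substituted directly into Karl Pearson's formula. The totals are the ones reported in the table; only the rounded intermediate value of the square root is added here.

$$r = \frac{63.79538}{\sqrt{73.33231 \times 74.44923}} \approx \frac{63.79538}{73.8887} \approx 0.8634$$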

Page 8: Correlation analysis

Spearman

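$$r = 1 - \frac{6\left[\sum D^2 + \frac{1}{12}\sum_{i}\left(m_i^3 - m_i\right)\right]}{n^3 - n}$$

m = number of times a value is repeated (one m_i for each group of tied observations)

D = Rank 1 − Rank 2

n = number of pairs of observations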
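
The Spearman formula, including the tie-correction term, can be sketched in Python as follows. This is only an illustration under stated assumptions: tied values are given the mean of the ranks they occupy (the usual convention, which the slides do not spell out), and the helper names `average_ranks`, `tie_term` and `spearman_r` are hypothetical.

```python
from collections import Counter


def average_ranks(values):
    """Rank values in ascending order (1 = smallest).

    Assumption: tied values share the mean of the ranks they occupy.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def tie_term(values):
    """Sum of (m^3 - m) / 12 over every group of m tied values."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)


def spearman_r(xs, ys):
    """Spearman rank correlation with the tie correction from the slide."""
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    correction = tie_term(xs) + tie_term(ys)
    return 1 - 6 * (sum_d2 + correction) / (n ** 3 - n)


# Data from the deck, 2001 to 2013 (no repeated values, so the correction is 0).
expenditure = [8, 10.5, 11, 12, 12.9, 13.5, 11.6, 10.9, 13, 14, 15.3, 16, 17]
sales = [9.1, 10.1, 9.3, 9.9, 11.3, 10.9, 11.6, 12.5, 14, 14.5, 15, 15.6, 16.2]
print(round(spearman_r(expenditure, sales), 4))
```

When no value is repeated, the correction term vanishes and the expression reduces to 1 − 6∑D² / (n³ − n).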

Page 9: Correlation analysis

Spearman in MS Excel

∑D²: =SUM(F3:F15), which gives 64

r: =1-(6*F17)/($A$1*($A$1^2-1)), with ∑D² in F17 and the number of pairs of observations in $A$1

Year    x       y       Rank 1    Rank 2    D (R1 − R2)    D²
2001    8       9.1     1         1         0              0
2002    10.5    10.1    2         4         -2             4
2003    11      9.3     4         2         2              4
2004    12      9.9     6         3         3              9
2005    12.9    11.3    7         6         1              1
2006    13.5    10.9    9         5         4              16
2007    11.6    11.6    5         7         -2             4
2008    10.9    12.5    3         8         -5             25
2009    13      14      8         9         -1             1
2010    14      14.5    10        10        0              0
2011    15.3    15      11        11        0              0
2012    16      15.6    12        12        0              0
2013    17      16.2    13        13        0              0
∑D²                                                        64

r = 0.824176
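
The spreadsheet result can be checked by hand. With 13 pairs of observations, ∑D² = 64 and no repeated values (so the tie-correction term is zero), the arithmetic works out as:

$$r = 1 - \frac{6 \times 64}{13\,(13^2 - 1)} = 1 - \frac{384}{2184} \approx 0.8242$$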

Page 10: Correlation analysis

Are Correlation and Causation the same?

Page 11: Correlation analysis

Correlation ≠ Causation

If it were, these would be true...

Page 12: Correlation analysis

Practical Applications

Page 13: Correlation analysis

Practical Application

Correlation is used in:

• Business

• Government

• Education

• Medicine

• Agriculture

Page 14: Correlation analysis

Business

• Marketing Expenditure and Sales Volume correlation (to measure the efficiency of marketing department)

• Correlation between prices of two securities in the stock market.

• Correlation between the price of a commodity and its supply (or demand).

Page 15: Correlation analysis

Government

• Year on Year Revenue and Expenditure Correlation (to forecast revenue based on expenditure)

• Tool in formulating various Economic Policies by correlating past trends.

• Yardstick to measure performance (Correlation between Planned and Actual Revenue)

Page 16: Correlation analysis

Education Models

• Forecasting student intake into elementary education (correlation between birth rate data and enrollment in elementary grades)

• Forecasting student dropout flows at different levels of education (intermediate, graduate, postgraduate)

Page 17: Correlation analysis

Medicine

• Finding out the after-effects of interactions between different medicines.

• Estimating the best treatment where various methods are applicable (correlation between individual treatments' results and severity of disease).

Page 18: Correlation analysis

Agriculture

• Correlation between certain weather conditions and Productivity.

• Correlation between irrigation and Productivity.

• Correlation between price and production or price and demand, to study demand supply pattern of crops in different seasons.

Page 19: Correlation analysis

Conclusion

• Correlation is one of many effective ways of forecasting possible outcomes based on past observations.

• However, other statistical methods also need to be applied to get a complete picture of the situation.

Page 20: Correlation analysis

Thank You