
Regression Analysis


DESCRIPTION

Regression analysis is used to predict a continuous dependent variable from one or more independent variables. If the dependent variable is dichotomous, logistic regression should be used instead. (If the split between the two levels of the dependent variable is close to 50-50, logistic and linear regression usually give similar results.) The independent variables can be either continuous or dichotomous. Independent variables with more than two levels can also be used, but they must first be converted into variables that have only two levels. This is called dummy coding and will be discussed later. Regression analysis is usually applied to naturally occurring variables rather than experimentally manipulated ones, although it can be used with either. One point to keep in mind is that regression analysis by itself cannot establish causal relationships among the variables: although the terminology is that X "predicts" Y, we cannot say that X "causes" Y.
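As a rough illustration of the dummy coding mentioned above, the sketch below turns one three-level variable into two 0/1 indicator variables. The function name `dummy_code`, the `region` data and the choice of reference level are all hypothetical; this is one common way to do it, not necessarily the scheme the slides go on to use.

```python
def dummy_code(values, reference):
    """Return one 0/1 indicator column for each level except the reference level."""
    levels = sorted(set(values) - {reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}

# Hypothetical categorical predictor with three levels
region = ["north", "south", "east", "south", "north"]

# "north" is the reference level, so it is represented by zeros in both columns
dummies = dummy_code(region, reference="north")
for level, column in dummies.items():
    print(level, column)
```

A variable with k levels needs only k - 1 indicators, because the reference level is identified by all indicators being zero.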


Page 1: Regression Analysis

Regression analysis is a mathematical measure of the average relationship between two or more variables in terms of the original units of the data.

Types of Regression

(i) Simple Regression (two variables at a time)

(ii) Multiple Regression (more than two variables at a time)

Linear Regression: If the regression curve is a straight line, there is a linear regression between the variables.

Non-linear Regression / Curvilinear Regression: If the regression curve is not a straight line, there is a non-linear regression between the variables.

Page 2: Regression Analysis

Simple Linear Regression Model & its Estimation

A simple linear regression model is based on a single independent variable. Its general form is:

$$Y_t = \alpha + \beta X_t + \varepsilon_t$$

Here

$\alpha$ = intercept

$\beta$ = slope (regression coefficient)

$Y_t$ = dependent variable or regressand

$X_t$ = independent variable or regressor

$\varepsilon_t$ = random error or disturbance term

Importance of the error term:

(i) It captures the effect on the dependent variable of all variables not included in the model.

(ii) It captures any specification error related to the assumed linear functional form.

(iii) It captures the effects of unpredictable random components present in the dependent variable.

Page 3: Regression Analysis

Estimation of the Model

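Deviations from the means are written $y_t = Y_t - \bar{Y}$ and $x_t = X_t - \bar{X}$.

| $Y_t$: Sales (thousands of units) | $X_t$: Advertising expenditure (million Rs.) | $y_t$ | $x_t$ | $x_t y_t$ | $x_t^2$ |
|---|---|---|---|---|---|
| 37 | 4.5 | -7.14286 | -0.64286 | 4.591837 | 0.413265 |
| 48 | 6.5 | 3.857143 | 1.357143 | 5.234694 | 1.841837 |
| 45 | 3.5 | 0.857143 | -1.64286 | -1.40816 | 2.69898 |
| 36 | 3 | -8.14286 | -2.14286 | 17.44898 | 4.591837 |
| 25 | 2.5 | -19.1429 | -2.64286 | 50.59184 | 6.984694 |
| 55 | 8.5 | 10.85714 | 3.357143 | 36.44898 | 11.27041 |
| 63 | 7.5 | 18.85714 | 2.357143 | 44.44898 | 5.556122 |
| $\sum Y_t = 309$ | $\sum X_t = 36$ | | | $\sum x_t y_t = 157.357$ | $\sum x_t^2 = 33.357$ |

$\bar{Y} = 309/7 = 44.1429$, $\bar{X} = 36/7 = 5.1429$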
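The columns of the table above can be reproduced with a few lines of Python. This is a minimal sketch using only the sales and advertising figures from the slide; the variable names are illustrative.

```python
# Data from the slide: sales (thousand units) and advertising expenditure (million Rs.)
Y = [37, 48, 45, 36, 25, 55, 63]
X = [4.5, 6.5, 3.5, 3, 2.5, 8.5, 7.5]

n = len(Y)
Y_bar = sum(Y) / n
X_bar = sum(X) / n

# Deviations from the means, their products and the squared x deviations
y = [Yt - Y_bar for Yt in Y]
x = [Xt - X_bar for Xt in X]
xy = [xt * yt for xt, yt in zip(x, y)]
xx = [xt ** 2 for xt in x]

sum_xy = sum(xy)
sum_xx = sum(xx)

for row in zip(Y, X, y, x, xy, xx):
    print("  ".join(f"{v:10.4f}" for v in row))
print(f"sum x*y = {sum_xy:.3f}, sum x^2 = {sum_xx:.3f}")
```

Recomputing the sums this way is a quick check on hand-worked totals before they are used in the slope and intercept formulas.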

Page 4: Regression Analysis

Estimation of the Model

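$$\hat{\beta} = \frac{\sum x_t y_t}{\sum x_t^2} = \frac{157.357}{33.357} = 4.717$$

$$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X} = 44.143 - (4.717)(5.143) = 19.882$$

Then the estimated simple linear regression model is

$$\hat{Y}_t = 19.882 + 4.717 X_t$$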
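The estimation step can be sketched in Python as follows. The helper `fit_simple_ols` and the names `alpha_hat` and `beta_hat` are illustrative, and the value `X_new = 5.0` is a hypothetical advertising level used only to show a prediction.

```python
# Data from the slide: sales (thousand units) and advertising expenditure (million Rs.)
Y = [37, 48, 45, 36, 25, 55, 63]
X = [4.5, 6.5, 3.5, 3, 2.5, 8.5, 7.5]

def fit_simple_ols(X, Y):
    """Least-squares intercept and slope for Y = alpha + beta * X + error."""
    n = len(X)
    X_bar = sum(X) / n
    Y_bar = sum(Y) / n
    sum_xy = sum((Xt - X_bar) * (Yt - Y_bar) for Xt, Yt in zip(X, Y))
    sum_xx = sum((Xt - X_bar) ** 2 for Xt in X)
    beta_hat = sum_xy / sum_xx              # slope = sum(x*y) / sum(x^2)
    alpha_hat = Y_bar - beta_hat * X_bar    # intercept = Y_bar - slope * X_bar
    return alpha_hat, beta_hat

alpha_hat, beta_hat = fit_simple_ols(X, Y)
print(f"Y_hat = {alpha_hat:.3f} + {beta_hat:.3f} * X")

# Fitted value at a hypothetical advertising expenditure of 5.0 million Rs.
X_new = 5.0
print(f"predicted sales at X = {X_new}: {alpha_hat + beta_hat * X_new:.3f}")

# Least-squares residuals sum to zero when an intercept is included
residuals = [Yt - (alpha_hat + beta_hat * Xt) for Xt, Yt in zip(X, Y)]
```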

Page 5: Regression Analysis
Page 6: Regression Analysis
Page 7: Regression Analysis
Page 8: Regression Analysis
Page 9: Regression Analysis
Page 10: Regression Analysis
Page 11: Regression Analysis

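$$\hat{\beta} = \frac{\sum x_t y_t}{\sum x_t^2} = \frac{157.357}{33.357} = 4.717, \qquad \hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X} = 44.143 - (4.717)(5.143) = 19.882$$

$$\hat{Y}_t = 19.882 + 4.717 X_t$$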

Page 12: Regression Analysis
Page 13: Regression Analysis
Page 14: Regression Analysis
Page 15: Regression Analysis
Page 16: Regression Analysis

General Formula for First Order Coefficients

$$r_{YX.W} = \frac{r_{XY} - r_{XW}\, r_{YW}}{\sqrt{(1 - r_{XW}^2)(1 - r_{YW}^2)}}$$

General Formula for Second Order Coefficients

$$r_{YX.WO} = \frac{r_{XY.O} - r_{XW.O}\, r_{YW.O}}{\sqrt{(1 - r_{XW.O}^2)(1 - r_{YW.O}^2)}}$$
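The first-order and second-order formulas share the same form, so one small function can serve for both: the second-order coefficient is obtained by feeding first-order coefficients into it. The sketch below assumes hypothetical zero-order correlations among four variables X, Y, W and O; the numbers are illustrative and do not come from the slides.

```python
from math import sqrt

def partial_r(r_xy, r_xw, r_yw):
    """Partial correlation of X and Y after removing one further variable W."""
    return (r_xy - r_xw * r_yw) / sqrt((1 - r_xw ** 2) * (1 - r_yw ** 2))

# Hypothetical zero-order (simple) correlations
r_XY, r_XW, r_YW = 0.6, 0.5, 0.4
r_XO, r_YO, r_WO = 0.3, 0.2, 0.1

# First-order coefficients, each holding O constant
r_XY_O = partial_r(r_XY, r_XO, r_YO)
r_XW_O = partial_r(r_XW, r_XO, r_WO)
r_YW_O = partial_r(r_YW, r_YO, r_WO)

# Second-order coefficient: the same formula applied to the first-order values
r_XY_WO = partial_r(r_XY_O, r_XW_O, r_YW_O)

# Removing W first and O second should give the same second-order value
r_XY_OW = partial_r(
    partial_r(r_XY, r_XW, r_YW),
    partial_r(r_XO, r_XW, r_WO),
    partial_r(r_YO, r_YW, r_WO),
)
print(round(r_XY_WO, 4), round(r_XY_OW, 4))
```

Building higher orders from lower ones in this way is also why errors accumulate as the order rises.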

Page 17: Regression Analysis

Partial Correlation

Remarks:

1. Partial correlation coefficients lie between -1 and 1.

2. Partial correlation coefficients are calculated on the basis of zero-order coefficients (simple correlations), in which no variable is held constant.

Limitations:

1. The calculation of partial correlation coefficients presumes a linear relationship between the variables. In real situations this condition does not always hold.

2. The reliability of partial correlation coefficients decreases as their order goes up: second-order partial coefficients are less dependable than first-order ones. The number of observations used for the gross (zero-order) correlations should therefore be large.

3. It involves a lot of calculation, and its analysis is not easy.

Page 18: Regression Analysis

Partial Correlation

Example: From the following data calculate $r_{12.3}$.

x1 : 4 0 1 1 1 3 4 1

x2 : 2 0 2 4 2 3 3 0

x3 : 1 4 2 2 3 0 4 0

Solution:

$$\bar{X}_1 = \frac{16}{8} = 2, \quad \bar{X}_2 = \frac{16}{8} = 2 \quad \text{and} \quad \bar{X}_3 = \frac{16}{8} = 2$$
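A minimal Python sketch of this calculation, using the three series exactly as printed above and the standard first-order formula. One caveat: the x1 values as printed sum to 15, whereas the slide's working uses 16, so the figure produced here may not match the slide's own (unrecovered) answer. The helper `pearson` is illustrative.

```python
from math import sqrt

# Series as printed on the slide
x1 = [4, 0, 1, 1, 1, 3, 4, 1]
x2 = [2, 0, 2, 4, 2, 3, 3, 0]
x3 = [1, 4, 2, 2, 3, 0, 4, 0]

def pearson(a, b):
    """Simple (zero-order) correlation coefficient between two series."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    s_ab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    s_bb = sum((bi - b_bar) ** 2 for bi in b)
    return s_ab / sqrt(s_aa * s_bb)

r12 = pearson(x1, x2)
r13 = pearson(x1, x3)
r23 = pearson(x2, x3)

# First-order partial correlation of variables 1 and 2, holding variable 3 constant
r12_3 = (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))
print(f"r12 = {r12:.4f}, r13 = {r13:.4f}, r23 = {r23:.4f}, r12.3 = {r12_3:.4f}")
```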

Page 19: Regression Analysis

Partial Correlation

Page 20: Regression Analysis

Multiple Correlation

The fluctuations in a given series usually do not depend on a single factor or cause. For example, wheat yield depends not only on rainfall but also on the fertilizer used, sunshine, etc. The association between such a series and the several variables causing these fluctuations is known as multiple correlation.

It is also defined as "the correlation between several variables."

Coefficient of Multiple Correlation:

Let there be three variables X1, X2 and X3. Let X1 be the dependent variable, depending upon the independent variables X2 and X3. The multiple correlation coefficients are defined as follows:

R1.23 = multiple correlation with X1 as the dependent variable and X2 and X3 as independent variables

R2.13 = multiple correlation with X2 as the dependent variable and X1 and X3 as independent variables

R3.12 = multiple correlation with X3 as the dependent variable and X1 and X2 as independent variables

Page 21: Regression Analysis

Calculation of Multiple Correlation Coefficient

General Formula

For example

Page 22: Regression Analysis

Remarks

• The multiple correlation coefficient is a non-negative coefficient.

• Its value ranges between 0 and 1. It cannot assume a negative value.

• If R1.23 = 0, then r12 = 0 and r13 = 0.

• R1.23 ≥ |r12| and R1.23 ≥ |r13|.

• R1.23 is the same as R1.32.

• (R1.23)² = coefficient of multiple determination.

• If there are three independent variables and one dependent variable, the formula for finding the multiple correlation is

$$R_{1.234}^2 = 1 - (1 - r_{14}^2)(1 - r_{13.4}^2)(1 - r_{12.34}^2)$$
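The three-regressor formula can be sketched in Python by building the partial correlations recursively and multiplying the unexplained shares. The zero-order correlations in `R0` are hypothetical, and the function names are illustrative. The sketch also computes the same quantity with the regressors entered in a different order (r12, r13.2, r14.23), which should give an identical result.

```python
from math import sqrt

# Hypothetical zero-order correlations among variables 1..4 (illustrative only)
R0 = {(1, 2): 0.6, (1, 3): 0.5, (1, 4): 0.4,
      (2, 3): 0.3, (2, 4): 0.2, (3, 4): 0.1}

def partial_r(i, j, controls=()):
    """Partial correlation r_{ij.controls}, removing one control variable at a time."""
    if not controls:
        return R0[(min(i, j), max(i, j))]
    *rest, k = controls
    r_ij = partial_r(i, j, tuple(rest))
    r_ik = partial_r(i, k, tuple(rest))
    r_jk = partial_r(j, k, tuple(rest))
    return (r_ij - r_ik * r_jk) / sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))

def multiple_r_squared(dep, order):
    """R^2 of `dep` on the variables in `order`, entering one regressor at a time."""
    unexplained = 1.0
    for pos, var in enumerate(order):
        r = partial_r(dep, var, tuple(order[:pos]))
        unexplained *= 1 - r ** 2
    return 1 - unexplained

r2_a = multiple_r_squared(1, [4, 3, 2])  # uses r14, r13.4, r12.34
r2_b = multiple_r_squared(1, [2, 3, 4])  # uses r12, r13.2, r14.23
print(f"R^2 (order 4,3,2) = {r2_a:.6f}, R^2 (order 2,3,4) = {r2_b:.6f}")
```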

Page 23: Regression Analysis

Limitation

Page 24: Regression Analysis

Advantages of Multiple Correlation

Page 25: Regression Analysis

Example

Given the following data:

X1: 3 5 6 8 12 14

X2: 16 10 7 4 3 2

X3: 90 72 54 42 30 12

Compute the coefficient of multiple correlation of X3 on X1 and X2.
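One way to work this example is sketched below. It uses the standard three-variable expression R²3.12 = (r13² + r23² - 2·r12·r13·r23) / (1 - r12²), which is assumed here because the slide's own general formula did not survive extraction. The helper `pearson` is illustrative.

```python
from math import sqrt

# Data from the example
X1 = [3, 5, 6, 8, 12, 14]
X2 = [16, 10, 7, 4, 3, 2]
X3 = [90, 72, 54, 42, 30, 12]

def pearson(a, b):
    """Simple (zero-order) correlation coefficient between two series."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    s_ab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    s_bb = sum((bi - b_bar) ** 2 for bi in b)
    return s_ab / sqrt(s_aa * s_bb)

r12 = pearson(X1, X2)
r13 = pearson(X1, X3)
r23 = pearson(X2, X3)

# Multiple correlation of X3 (dependent) on X1 and X2 (independent)
R2_3_12 = (r13 ** 2 + r23 ** 2 - 2 * r12 * r13 * r23) / (1 - r12 ** 2)
R_3_12 = sqrt(R2_3_12)
print(f"r12 = {r12:.4f}, r13 = {r13:.4f}, r23 = {r23:.4f}")
print(f"R3.12 = {R_3_12:.4f}")
```

As the remarks on multiple correlation lead one to expect, the result is at least as large as |r13| and |r23|.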

Page 26: Regression Analysis

Example

Page 27: Regression Analysis

Example

Page 28: Regression Analysis

Types of Correlation


r12.3 is the correlation between variables 1 and 2 with variable 3 removed from both variables. To illustrate this, run separate regressions using X3 as the independent variable and X1 and X2 as dependent variables. Next, compute residuals for regression...
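The residual-based description above can be sketched in Python: regress X1 on X3 and X2 on X3, then correlate the two sets of residuals. The sketch reuses the X1, X2, X3 series from the Page 25 example purely as sample data and compares the result with the first-order formula; the helper names are illustrative.

```python
from math import sqrt

# Sample data (the series from the Page 25 example)
X1 = [3, 5, 6, 8, 12, 14]
X2 = [16, 10, 7, 4, 3, 2]
X3 = [90, 72, 54, 42, 30, 12]

def mean(v):
    return sum(v) / len(v)

def pearson(a, b):
    """Simple (zero-order) correlation coefficient between two series."""
    a_bar, b_bar = mean(a), mean(b)
    s_ab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    s_bb = sum((bi - b_bar) ** 2 for bi in b)
    return s_ab / sqrt(s_aa * s_bb)

def residuals(dep, indep):
    """Residuals from the least-squares line of `dep` on `indep`."""
    d_bar, i_bar = mean(dep), mean(indep)
    slope = (sum((i - i_bar) * (d - d_bar) for i, d in zip(indep, dep))
             / sum((i - i_bar) ** 2 for i in indep))
    intercept = d_bar - slope * i_bar
    return [d - (intercept + slope * i) for i, d in zip(indep, dep)]

# Remove variable 3 from variables 1 and 2, then correlate what is left
e1 = residuals(X1, X3)
e2 = residuals(X2, X3)
r_resid = pearson(e1, e2)

# The same quantity from the first-order partial correlation formula
r12, r13, r23 = pearson(X1, X2), pearson(X1, X3), pearson(X2, X3)
r_formula = (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))
print(f"residual method: {r_resid:.6f}, formula: {r_formula:.6f}")
```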