Chapter 1 Regression Analysis[1]

Page 1: Chapter 1 Regression Analysis[1]

“Regression is a statistical technique which establishes a functional relationship between two or more variables in the form of an equation to estimate the value of one variable based on the value of another variable.”

Page 2: Chapter 1 Regression Analysis[1]

Regression Analysis

• Simple Linear Regression Model

$y = \beta_0 + \beta_1 x + \varepsilon$

• Simple Linear Regression Equation

$y = \beta_0 + \beta_1 x$

• Estimated Simple Linear Regression Equation

$\hat{y} = b_0 + b_1 x$

Page 3: Chapter 1 Regression Analysis[1]

Principle of least squares technique

Page 4: Chapter 1 Regression Analysis[1]

Case 1:

Graph 1
Observed points: (4,8); (8,1); (12,6)
Estimated points: (4,6); (8,5); (12,4)

Graph 2
Observed points: (4,8); (8,1); (12,6)
Estimated points: (4,2); (8,5); (12,8)

Page 5: Chapter 1 Regression Analysis[1]

Error (graph 1)       Error (graph 2)
8 - 6 = 2             8 - 2 = 6
1 - 5 = -4            1 - 5 = -4
6 - 4 = 2             6 - 8 = -2
Total error = 0       Total error = 0

Page 6: Chapter 1 Regression Analysis[1]

Absolute error (graph 1)     Absolute error (graph 2)
|8 - 6| = 2                  |8 - 2| = 6
|1 - 5| = 4                  |1 - 5| = 4
|6 - 4| = 2                  |6 - 8| = 2
Total absolute error = 8     Total absolute error = 12

Page 7: Chapter 1 Regression Analysis[1]

Case 2:

Graph 1
Observed points: (2,4); (6,7); (10,2)
Estimated points: (2,4); (6,3); (10,2)

Graph 2
Observed points: (2,4); (6,7); (10,2)
Estimated points: (2,5); (6,4); (10,3)

Page 8: Chapter 1 Regression Analysis[1]

Absolute error (graph 1)     Absolute error (graph 2)
|4 - 4| = 0                  |4 - 5| = 1
|7 - 3| = 4                  |7 - 4| = 3
|2 - 2| = 0                  |2 - 3| = 1
Total absolute error = 4     Total absolute error = 5

Page 9: Chapter 1 Regression Analysis[1]

Squared error (graph 1)      Squared error (graph 2)
(4 - 4)^2 = 0                (4 - 5)^2 = 1
(7 - 3)^2 = 16               (7 - 4)^2 = 9
(2 - 2)^2 = 0                (2 - 3)^2 = 1

Sum of squared errors = 16 (Graph 1)

Sum of squared errors = 11 (Graph 2)
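The three criteria used in Case 1 and Case 2 (total error, total absolute error, sum of squared errors) can be recomputed with a short sketch. The observed and estimated y-values are copied from the slides; the names `cases` and `error_totals` are introduced here for illustration only.

```python
# Observed y-values and each candidate line's estimates, copied from Case 1 and Case 2.
cases = {
    "Case 1": {
        "observed": [8, 1, 6],
        "graph 1": [6, 5, 4],
        "graph 2": [2, 5, 8],
    },
    "Case 2": {
        "observed": [4, 7, 2],
        "graph 1": [4, 3, 2],
        "graph 2": [5, 4, 3],
    },
}


def error_totals(observed, estimated):
    """Return (sum of errors, sum of absolute errors, sum of squared errors)."""
    errors = [y - y_hat for y, y_hat in zip(observed, estimated)]
    return (
        sum(errors),
        sum(abs(e) for e in errors),
        sum(e * e for e in errors),
    )


results = {}
for case, data in cases.items():
    for graph in ("graph 1", "graph 2"):
        results[(case, graph)] = error_totals(data["observed"], data[graph])
        total, total_abs, total_sq = results[(case, graph)]
        print(f"{case}, {graph}: error={total}, absolute={total_abs}, squared={total_sq}")
```

In Case 1 the plain error total is 0 for both lines, so it cannot tell them apart. In Case 2 the absolute-error total favours graph 1 (4 against 5) while the squared-error total favours graph 2 (11 against 16), because squaring penalises the single large miss at x = 6 more heavily.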

Page 10: Chapter 1 Regression Analysis[1]

Least Squares Method

• Least Squares Criterion

$\min \sum (y_i - \hat{y}_i)^2$

where:

$y_i$ = observed value of the dependent variable for the i-th observation

$\hat{y}_i$ = estimated value of the dependent variable for the i-th observation

Page 11: Chapter 1 Regression Analysis[1]

• Slope for the Estimated Regression Equation

$b_1 = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$

where:

x = value of independent variable for the i-th observation
y = value of dependent variable for the i-th observation
n = total number of observations

• y-Intercept for the Estimated Regression Equation

$b_0 = \bar{y} - b_1 \bar{x}$

where:

$\bar{y}$ = mean value for the dependent variable
$\bar{x}$ = mean value for the independent variable
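A brief sketch of where the slope and intercept formulas come from, assuming the usual calculus derivation (the slides do not show it; $S$ is a name introduced here for the sum of squared errors):

```latex
\begin{align*}
S(b_0, b_1) &= \sum (y - b_0 - b_1 x)^2 \\
\frac{\partial S}{\partial b_0} = 0 &\;\Rightarrow\; \sum y = n b_0 + b_1 \sum x \\
\frac{\partial S}{\partial b_1} = 0 &\;\Rightarrow\; \sum xy = b_0 \sum x + b_1 \sum x^2 \\
b_1 &= \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - \left(\sum x\right)^2},
\qquad b_0 = \bar{y} - b_1 \bar{x}
\end{align*}
```

Dividing the first of the two equations by n gives the intercept formula directly; substituting it into the second and solving for $b_1$ gives the slope formula.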

Page 12: Chapter 1 Regression Analysis[1]

• Simple Linear Regression

Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown below.

Number of TV Ads     Number of Cars Sold
1                    14
3                    24
2                    18
1                    17
3                    27
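As a minimal sketch, the slope and intercept formulas from the Least Squares Method slides can be applied to the Reed Auto sample. The function name `fit_simple_regression` is hypothetical and chosen here for illustration.

```python
# Reed Auto sample: number of TV ads (x) and number of cars sold (y).
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]


def fit_simple_regression(x, y):
    """Return (b0, b1) from the sums-based least squares formulas."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # Slope: (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: y-bar minus slope times x-bar
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


b0, b1 = fit_simple_regression(tv_ads, cars_sold)
print(f"y-hat = {b0} + {b1} x")  # y-hat = 10.0 + 5.0 x
```

For this sample the sums are Σx = 10, Σy = 100, Σxy = 220 and Σx² = 24, so the slope is 100/20 = 5 and the intercept is 20 - 5(2) = 10.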

Page 13: Chapter 1 Regression Analysis[1]

• The HRD manager of a company wants to find a measure which he can use to fix the monthly income of persons applying for a job in the production department. As an experimental project, he collected data on 7 persons from that department, recording their years of service and their monthly income (in 000’s).

Years of experience 11 7 9 5 8 6 10

Income 10 8 6 5 9 7 11

Page 14: Chapter 1 Regression Analysis[1]

• Find the regression equation of income on years of service.

• What starting income would you recommend for a person applying for the job after having served in a similar capacity in another company for 13 years?

• Do you think other factors are to be considered (in addition to the years of service) in fixing the income? Explain.
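One possible way to work the first two questions is sketched below, using the deviation form of the slope, which is algebraically equivalent to the sums formula. The helper name `predict_income` is an assumption made for this example.

```python
# HRD sample: years of service (x) and monthly income in thousands (y).
years = [11, 7, 9, 5, 8, 6, 10]
income = [10, 8, 6, 5, 9, 7, 11]

n = len(years)
mean_x = sum(years) / n
mean_y = sum(income) / n

# Sums of products and squares of deviations from the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, income))
s_xx = sum((x - mean_x) ** 2 for x in years)

b1 = s_xy / s_xx            # slope of income on years of service
b0 = mean_y - b1 * mean_x   # intercept


def predict_income(years_of_service):
    """Estimated monthly income (in thousands) from the fitted line."""
    return b0 + b1 * years_of_service


print(f"income-hat = {b0} + {b1} * years")
print(f"estimate for 13 years: {predict_income(13)}")
```

With these data both means are 8, the slope works out to 21/28 = 0.75 and the intercept to 2, so the estimate for 13 years is 11.75 (thousand). Note that 13 years lies outside the observed range of 5 to 11 years, so this is an extrapolation and should be read with caution.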

Page 15: Chapter 1 Regression Analysis[1]

Properties of regression lines and their coefficients:

1. The correlation coefficient is the geometric mean of the two regression coefficients (y on x and x on y).

2. The sign of the correlation coefficient is the same as that of the regression coefficients.

3. Regression coefficients are independent of a change of origin but not of a change of scale.
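These properties can be checked numerically. The sketch below reuses the years-of-service and income figures from the HRD example; the function name `coefficients` and the particular shift (5) and scale factor (10) are arbitrary choices made for illustration.

```python
import math

x = [11, 7, 9, 5, 8, 6, 10]   # years of service
y = [10, 8, 6, 5, 9, 7, 11]   # monthly income (thousands)


def coefficients(x, y):
    """Return (b_yx, b_xy, r): both regression coefficients and the correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    b_yx = s_xy / s_xx                  # regression coefficient of y on x
    b_xy = s_xy / s_yy                  # regression coefficient of x on y
    r = s_xy / math.sqrt(s_xx * s_yy)   # correlation coefficient
    return b_yx, b_xy, r


b_yx, b_xy, r = coefficients(x, y)

# Property 1 and 2: r is the geometric mean of b_yx and b_xy, carrying their sign.
geometric_mean = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

# Property 3: shift the origin of both variables, then rescale x only.
shifted = coefficients([a - 5 for a in x], [b - 5 for b in y])
scaled = coefficients([a * 10 for a in x], y)

print(b_yx, b_xy, r, geometric_mean)
print("after shifting origin:", shifted)
print("after scaling x by 10:", scaled)
```

Shifting the origin leaves both regression coefficients unchanged, while multiplying x by 10 divides the coefficient of y on x by 10 and multiplies the coefficient of x on y by 10. The correlation coefficient is unaffected in both cases.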

Page 16: Chapter 1 Regression Analysis[1]

In finance, it is of interest to look at the relationship between Y, a stock’s average return, and X, the overall market return. The slope coefficient computed by linear regression is called the stock’s beta by investment analysts. A beta greater than 1 indicates that the stock is relatively sensitive to changes in the market; a beta less than 1 indicates that the stock is relatively insensitive. For the following data, compute the beta and suggest market trend.

X (%): 10 12 8 15 9 11 8 10 13 11

Y (%): 11 15 3 18 10 12 6 7 18 13
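A worked sketch of the beta calculation, assuming beta is the least squares slope of Y on X as described above and using the sums-based slope formula from the Least Squares Method slides:

```latex
\begin{align*}
n &= 10, \quad \sum x = 107, \quad \sum y = 113, \quad \sum xy = 1301, \quad \sum x^2 = 1189 \\
b_1 &= \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}
     = \frac{10(1301) - (107)(113)}{10(1189) - (107)^2}
     = \frac{919}{441} \approx 2.08
\end{align*}
```

Since this slope is greater than 1, the criterion stated in the problem would classify the stock as relatively sensitive to changes in the market.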

Page 17: Chapter 1 Regression Analysis[1]

Multiple regression Analysis

Page 18: Chapter 1 Regression Analysis[1]

• A linear regression equation with more than one independent variable is called a multiple regression model.

Page 19: Chapter 1 Regression Analysis[1]

The linear regression equation with k independent variables takes the form:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_k x_k + \varepsilon$

where

y is the value of the dependent variable to be estimated,
$\beta_0$ is a constant,
$\beta_1, \beta_2, \dots, \beta_k$ are the regression coefficients associated with each of the independent variables $x_k$,
$\varepsilon$ is the random error due to chance.

Page 20: Chapter 1 Regression Analysis[1]

Let the fitted linear regression equation be

$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$

which minimizes the sum of squares of errors (SSE) $\sum (y - \hat{y})^2$,

where

$\hat{y}$ is the estimated value of the dependent variable y,
$b_1, b_2, b_3, \dots, b_k$ are the partial regression coefficients and are obtained by the principle of least squares technique.

Page 21: Chapter 1 Regression Analysis[1]

• Let us consider the case with two independent variables and one dependent variable.

Page 22: Chapter 1 Regression Analysis[1]

The multiple linear regression model involving two independent variables is:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$

where

y is the dependent variable,
$x_1$ and $x_2$ are the independent variables,
$\varepsilon$ is the random error due to chance,
$\beta_0$ is the y-intercept,
$\beta_1, \beta_2$ are the regression coefficients.

Page 23: Chapter 1 Regression Analysis[1]

Let the fitted multiple linear regression equation be

$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$

or

$\hat{y} = b_0 + b_{y1.2}\, x_1 + b_{y2.1}\, x_2$

where

$\hat{y}$ is the estimated value of the dependent variable y,
$x_1, x_2$ are the independent variables,
$b_0, b_1, b_2$ are the unknown constants and are determined by the principle of least squares technique, which minimizes the sum of squares of errors (SSE) $\sum (y - \hat{y})^2$.

Page 24: Chapter 1 Regression Analysis[1]

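By solving the following equations, the values of $b_0$, $b_1$, $b_2$ can be determined:

$\sum y = n b_0 + b_{y1.2} \sum x_1 + b_{y2.1} \sum x_2$

$\sum x_1 y = b_0 \sum x_1 + b_{y1.2} \sum x_1^2 + b_{y2.1} \sum x_1 x_2$

$\sum x_2 y = b_0 \sum x_2 + b_{y1.2} \sum x_1 x_2 + b_{y2.1} \sum x_2^2$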

Page 25: Chapter 1 Regression Analysis[1]

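Let the fitted multiple linear regression equation be

$y = b_0 + b_1 x_1 + b_2 x_2$

or

$y = b_0 + b_{y1.2}\, x_1 + b_{y2.1}\, x_2$ ---(1)

$\bar{y} = b_0 + b_{y1.2}\, \bar{x}_1 + b_{y2.1}\, \bar{x}_2$ ---(2)

(1) - (2):

$(y - \bar{y}) = b_{y1.2}\,(x_1 - \bar{x}_1) + b_{y2.1}\,(x_2 - \bar{x}_2)$

$Y = b_{y1.2}\, X_1 + b_{y2.1}\, X_2$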

Page 26: Chapter 1 Regression Analysis[1]

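$b_{y1.2} = \dfrac{\sum X_1 Y \sum X_2^2 - \sum X_2 Y \sum X_1 X_2}{\sum X_1^2 \sum X_2^2 - \left(\sum X_1 X_2\right)^2}$

$b_{y2.1} = \dfrac{\sum X_2 Y \sum X_1^2 - \sum X_1 Y \sum X_1 X_2}{\sum X_1^2 \sum X_2^2 - \left(\sum X_1 X_2\right)^2}$

where

$Y = y - \bar{y}$
$X_1 = x_1 - \bar{x}_1$
$X_2 = x_2 - \bar{x}_2$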

Page 27: Chapter 1 Regression Analysis[1]

• The marketing manager of a company wants to predict demand for a product. He strongly believes that demand (Y) is highly influenced by the annual average price (X1) of the product (in units) and advertising expenditure (X2) (Rs in lakh). He has collected the past data given below to study the effect of these factors on demand:

Y:  4 6 7 9 13 15
X1: 15 12 8 6 4 3
X2: 30 24 20 14 10 4
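A minimal sketch of how the deviation-form formulas for the two partial regression coefficients could be applied to this data set. The function names `mean` and `fit_two_predictors` are hypothetical, and the intercept is recovered from the equation for the means, $b_0 = \bar{y} - b_{y1.2}\bar{x}_1 - b_{y2.1}\bar{x}_2$.

```python
demand = [4, 6, 7, 9, 13, 15]            # Y
price = [15, 12, 8, 6, 4, 3]             # X1, annual average price
advertising = [30, 24, 20, 14, 10, 4]    # X2, advertising expenditure


def mean(values):
    return sum(values) / len(values)


def fit_two_predictors(y, x1, x2):
    """Return (b0, b_y1_2, b_y2_1) using deviations from the means."""
    y_bar, x1_bar, x2_bar = mean(y), mean(x1), mean(x2)
    dev_y = [v - y_bar for v in y]
    dev_1 = [v - x1_bar for v in x1]
    dev_2 = [v - x2_bar for v in x2]

    s11 = sum(a * a for a in dev_1)                   # sum of X1^2
    s22 = sum(b * b for b in dev_2)                   # sum of X2^2
    s12 = sum(a * b for a, b in zip(dev_1, dev_2))    # sum of X1*X2
    s1y = sum(a * c for a, c in zip(dev_1, dev_y))    # sum of X1*Y
    s2y = sum(b * c for b, c in zip(dev_2, dev_y))    # sum of X2*Y

    denominator = s11 * s22 - s12 ** 2
    b_y1_2 = (s1y * s22 - s2y * s12) / denominator
    b_y2_1 = (s2y * s11 - s1y * s12) / denominator
    b0 = y_bar - b_y1_2 * x1_bar - b_y2_1 * x2_bar
    return b0, b_y1_2, b_y2_1


b0, b_y1_2, b_y2_1 = fit_two_predictors(demand, price, advertising)
print(f"y-hat = {b0:.4f} + ({b_y1_2:.4f}) x1 + ({b_y2_1:.4f}) x2")
```

Working in deviations removes the intercept from the system, so only two equations in two unknowns have to be solved. Price and advertising move closely together in this sample, so the individual coefficients should be interpreted with care.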

Page 28: Chapter 1 Regression Analysis[1]

Ex: Christmas week is a critical period for most ski resorts. Because many students and adults are free from other obligations, they are able to spend several days indulging in their favorite pastime, skiing. A large proportion of gross revenue is earned during this period. A ski resort in Vermont wanted to determine the effect that weather had on its sales of lift tickets. The manager of the resort collected data on the number of lift tickets sold during Christmas week (y), the total snowfall in inches (x1), and the average temperature in degrees Fahrenheit (x2) for the past 10 years. Develop the multiple regression model.

Page 29: Chapter 1 Regression Analysis[1]

Tickets     Snowfall     Temperature
6835        19           11
7870        15           -19
6173        7            36
7979        11           22
7639        19           14
7167        2            -20
8094        21           39
9903        19           27
9788        18           26
9557        20           16

Page 30: Chapter 1 Regression Analysis[1]

• The Federal Reserve is performing a preliminary study to determine the relationship between certain economic indicators and annual percentage change in the gross national product (GNP). Two such indicators being examined are the amount of the federal government’s deficit (in billions of dollars) and the Dow Jones Industrial Average (the mean value over the year). Data for 6 years follow:

Page 31: Chapter 1 Regression Analysis[1]

Change in GNP (%):               2.5    -1.0   4.0    1.0    1.5    3.0
Federal Deficit ($ billions):    100.0  400.0  120.0  200.0  180.0  80.0
Dow Jones:                       2850   2100   3300   2400   2550   2700

i) Calculate the least squares equation that best describes the data.

ii) What % change in GNP would be expected in a year in which the federal deficit was $240 billion and the mean Dow Jones value was 3000?
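One way to approach parts (i) and (ii) is to build the three normal equations for two independent variables and solve them directly. The sketch below uses plain Gaussian elimination with partial pivoting, a choice made here for self-containment rather than something the slides prescribe; the function names are hypothetical.

```python
gnp_change = [2.5, -1.0, 4.0, 1.0, 1.5, 3.0]                   # y, % change in GNP
deficit = [100.0, 400.0, 120.0, 200.0, 180.0, 80.0]            # x1, $ billions
dow_jones = [2850.0, 2100.0, 3300.0, 2400.0, 2550.0, 2700.0]   # x2


def sum_products(a, b):
    return sum(p * q for p, q in zip(a, b))


def normal_equations(y, x1, x2):
    """Return the 3x3 coefficient matrix and right-hand side of the normal equations."""
    n = len(y)
    matrix = [
        [n, sum(x1), sum(x2)],
        [sum(x1), sum_products(x1, x1), sum_products(x1, x2)],
        [sum(x2), sum_products(x1, x2), sum_products(x2, x2)],
    ]
    rhs = [sum(y), sum_products(x1, y), sum_products(x2, y)]
    return matrix, rhs


def solve_linear_system(matrix, rhs):
    """Solve matrix * b = rhs by Gaussian elimination with partial pivoting."""
    size = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(size):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, size):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, size + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    solution = [0.0] * size
    for row in range(size - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, size))
        solution[row] = (aug[row][size] - known) / aug[row][row]
    return solution


matrix, rhs = normal_equations(gnp_change, deficit, dow_jones)
b0, b1, b2 = solve_linear_system(matrix, rhs)
forecast = b0 + b1 * 240 + b2 * 3000   # deficit of $240 billion, Dow Jones at 3000

print(f"y-hat = {b0:.4f} + ({b1:.6f}) x1 + ({b2:.6f}) x2")
print(f"expected % change in GNP: {forecast:.2f}")
```

With only six observations the fitted coefficients are rough estimates, and the forecast should be treated as indicative rather than precise.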

Page 32: Chapter 1 Regression Analysis[1]

• Multiple correlation analysis:

It is a measure of association between a dependent variable and several independent variables taken together.

Page 33: Chapter 1 Regression Analysis[1]

The coefficient of multiple correlation is given by

$R_{y.12} = \sqrt{\dfrac{r_{y1}^2 + r_{y2}^2 - 2\, r_{y1}\, r_{y2}\, r_{12}}{1 - r_{12}^2}}$

Its value always lies between 0 and 1.

Page 34: Chapter 1 Regression Analysis[1]

• Coefficient of multiple determination:

It is the proportion of the total variation in the values of the dependent variable y that is accounted for, or explained, by the independent variables in the multiple regression model.

• The square of coefficient of multiple correlation is called Coefficient of multiple determination.
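The formula for the coefficient of multiple correlation can be tried on the ski resort data from the earlier example. This is a sketch only: the helper names `deviations` and `correlation` are assumptions, and the pairwise correlations are the ordinary product-moment correlations.

```python
import math

tickets = [6835, 7870, 6173, 7979, 7639, 7167, 8094, 9903, 9788, 9557]   # y
snowfall = [19, 15, 7, 11, 19, 2, 21, 19, 18, 20]                        # x1, inches
temperature = [11, -19, 36, 22, 14, -20, 39, 27, 26, 16]                 # x2, degrees F


def deviations(values):
    """Deviations of each value from the mean of the list."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def correlation(a, b):
    """Product-moment correlation coefficient between two lists."""
    dev_a, dev_b = deviations(a), deviations(b)
    s_ab = sum(p * q for p, q in zip(dev_a, dev_b))
    s_aa = sum(p * p for p in dev_a)
    s_bb = sum(q * q for q in dev_b)
    return s_ab / math.sqrt(s_aa * s_bb)


r_y1 = correlation(tickets, snowfall)
r_y2 = correlation(tickets, temperature)
r_12 = correlation(snowfall, temperature)

# Coefficient of multiple determination, then its square root R.
r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
multiple_r = math.sqrt(r_squared)

print(f"r_y1={r_y1:.4f}, r_y2={r_y2:.4f}, r_12={r_12:.4f}")
print(f"R = {multiple_r:.4f}, R^2 = {r_squared:.4f}")
```

The value of `r_squared` is the share of the variation in ticket sales explained jointly by snowfall and temperature; it equals 1 - SSE/SST for the fitted two-variable regression.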