Page 1: Chapter 12

Chapter 12

Multiple Regression Analysis and Model Building

Page 2: Chapter 12

Chapter 12 - Chapter Outcomes

After studying the material in this chapter, you should be able to:

• Understand the general concepts behind model building using multiple regression analysis.

• Apply multiple regression analysis to business decision-making situations.

• Analyze the computer output for a multiple regression model and test the significance of the independent variables in the model.

Page 3: Chapter 12

Chapter 12 - Chapter Outcomes (continued)

After studying the material in this chapter, you should be able to:

• Recognize potential problems when using multiple regression analysis and take steps to correct them.

• Incorporate qualitative variables into the regression model by using dummy variables.

Page 4: Chapter 12

Multiple Regression Analysis

SIMPLE LINEAR REGRESSION MODEL (POPULATION MODEL)

$$y = \beta_0 + \beta_1 x + \varepsilon$$

where:
y = Value of the dependent variable
x = Value of the independent variable
$\beta_0$ = Population's y-intercept
$\beta_1$ = Slope of the population regression line
$\varepsilon$ = Error term, or residual

Page 5: Chapter 12

Multiple Regression Analysis

ESTIMATED SIMPLE LINEAR REGRESSION MODEL

$$\hat{y}_i = b_0 + b_1 x$$

where:
$b_0$ = Estimated y-intercept
$b_1$ = Estimated slope coefficient

Page 6: Chapter 12

Multiple Regression Analysis

A residual, or prediction error, is the difference between the actual value of y and the predicted value of y.

$$e = y - \hat{y}$$
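As a minimal sketch of how the estimated model and its residuals fit together, the code below computes b0 and b1 with the usual least squares formulas and then evaluates e = y - ŷ for each observation. The function names and the data are hypothetical and do not come from the chapter.

```python
from statistics import mean


def fit_simple_regression(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # estimated slope coefficient
    b0 = y_bar - b1 * x_bar   # estimated y-intercept
    return b0, b1


def residuals(x, y, b0, b1):
    """Return e = y - y_hat for every observation."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


# Hypothetical sample data, used only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

b0, b1 = fit_simple_regression(x, y)
e = residuals(x, y, b0, b1)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print("residuals:", [round(ei, 3) for ei in e])
```

For a least squares line that includes an intercept, the residuals sum to zero apart from floating-point rounding, which is a quick sanity check on the fit.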

Page 7: Chapter 12

Multiple Regression Analysis

The standard error of the estimate refers to the standard deviation of the model errors. The standard error measures the dispersion of the actual values of the dependent variable around the fitted regression plane.

Page 8: Chapter 12

Multiple Regression Analysis

MULTIPLE REGRESSION MODEL (POPULATION MODEL)

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$$

where:
$\beta_0$ = Population's regression constant
$\beta_j$ = Population's regression coefficient for variable j; j = 1, 2, …, k
k = Number of independent variables
$\varepsilon$ = Model error

Page 9: Chapter 12

Multiple Regression Analysis

ESTIMATED MULTIPLE REGRESSION MODEL

$$\hat{y}_i = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$$

Page 10: Chapter 12

Multiple Regression Analysis

A model is a representation of an actual system using either a physical or mathematical portrayal.

Page 11: Chapter 12

Model Specification

• Decide what you want to do and select the dependent variable.

• List the potential independent variables for your model.

• Gather the sample data (observations) for all variables.

Page 12: Chapter 12

Multiple Regression Analysis

The correlation coefficient is a quantitative measure of the strength of the linear relationship between two variables. The correlation coefficient, r, ranges between -1.0 and +1.0.

Page 13: Chapter 12

Multiple Regression Analysis

CORRELATION COEFFICIENT

One x variable with y:

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$
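The correlation formula can be sketched directly in code. The function below is an illustrative implementation using only the Python standard library. The same function applies to one x variable with y or to one x variable with another x. The data values are hypothetical.

```python
from math import sqrt
from statistics import mean


def correlation(x, y):
    """Sample correlation coefficient r between two equal-length sequences."""
    x_bar, y_bar = mean(x), mean(y)
    numerator = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    denominator = sqrt(
        sum((a - x_bar) ** 2 for a in x) * sum((b - y_bar) ** 2 for b in y)
    )
    return numerator / denominator


# Hypothetical observations: square feet, age, and sale price of five houses.
sq_feet = [1500.0, 1800.0, 2100.0, 2400.0, 3000.0]
age = [30.0, 22.0, 15.0, 10.0, 4.0]
price = [120000.0, 139000.0, 171000.0, 185000.0, 240000.0]

print("r(sq_feet, price) =", round(correlation(sq_feet, price), 3))
print("r(sq_feet, age)   =", round(correlation(sq_feet, age), 3))
```

A value of r between two independent variables that is close to -1.0 or +1.0 is an early warning of the multicollinearity problem discussed later in the chapter.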

Page 14: Chapter 12

Multiple Regression Analysis

CORRELATION COEFFICIENT

One x variable with another x:

$$r = \frac{\sum (x_i - \bar{x}_i)(x_j - \bar{x}_j)}{\sqrt{\sum (x_i - \bar{x}_i)^2 \sum (x_j - \bar{x}_j)^2}}$$

Page 15: Chapter 12

Multiple Regression Analysis (Example 12-1)

Multiple Regression Model:

$$\hat{y} = 31{,}127.6 + 63.1(\text{Sq. feet}) - 1{,}144.4(\text{Age}) - 8{,}410.4(\text{Bedrooms}) + 3{,}522.0(\text{Bathrooms}) + 28{,}203.5(\text{Garage})$$

House Characteristics:
$x_1$ = Square feet = 2,100
$x_2$ = Age = 15
$x_3$ = Number of bedrooms = 4
$x_4$ = Number of baths = 3
$x_5$ = Size of garage = 2

Point Estimate for Sale Price:

$$\hat{y} = 31{,}127.6 + 63.1(2{,}100) - 1{,}144.4(15) - 8{,}410.4(4) + 3{,}522.0(3) + 28{,}203.5(2)$$

$$\hat{y} = \$179{,}802.70$$
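The point estimate in Example 12-1 can be reproduced with a few lines of code. This is a sketch under one assumption: the signs of the coefficients were lost in the slide text, so they are taken here to be negative for Age and Bedrooms and positive for the rest, which agrees with the signs of the t-statistics reported from Figure 12-7. The dictionary keys and function name are illustrative.

```python
# Coefficients from Example 12-1 (signs assumed, see the note above).
INTERCEPT = 31127.6
COEFFICIENTS = {
    "sq_feet": 63.1,
    "age": -1144.4,
    "bedrooms": -8410.4,
    "bathrooms": 3522.0,
    "garage": 28203.5,
}


def predict_price(house):
    """Point estimate y_hat = b0 + sum of b_j * x_j for one house."""
    return INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in house.items())


house = {"sq_feet": 2100, "age": 15, "bedrooms": 4, "bathrooms": 3, "garage": 2}
print(f"Estimated sale price: ${predict_price(house):,.2f}")
```

With the rounded coefficients shown on the slide, the result is about $179,803. The slide reports $179,802.70, presumably because the original computation used unrounded coefficients.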

Page 16: Chapter 12

Coefficient of Determination

MULTIPLE COEFFICIENT OF DETERMINATION

The percentage of variation in the dependent variable explained by the independent variables in the regression model:

$$R^2 = \frac{\text{Sum of squares regression}}{\text{Total sum of squares}} = \frac{SSR}{TSS}$$
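A short sketch of the R-squared computation follows. It obtains SSR as TSS - SSE, which holds for a least squares fit that includes an intercept. The function name and the data are hypothetical.

```python
from statistics import mean


def r_squared(y, y_hat):
    """Multiple coefficient of determination R^2 = SSR / TSS."""
    y_bar = mean(y)
    tss = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # sum of squares error
    ssr = tss - sse                                        # sum of squares regression
    return ssr / tss


# Hypothetical actual values and fitted values from some regression model.
actual = [10.0, 12.0, 15.0, 19.0, 24.0]
fitted = [9.5, 12.5, 15.5, 18.5, 24.0]
print("R-squared =", round(r_squared(actual, fitted), 4))
```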

Page 17: Chapter 12

Model Diagnosis

• Is the overall model significant?

• Are the individual variables significant?

• Is the standard deviation of the model error too large to provide meaningful results?

• Is multicollinearity a problem?

Page 18: Chapter 12

Is the Model Significant?

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$$

$$H_A: \text{At least one } \beta_i \text{ does not equal } 0$$

If the null hypothesis is true, the overall regression model is not useful for predictive purposes.

Page 19: Chapter 12

Is the Model Significant?

F-TEST STATISTIC

$$F = \frac{SSR / k}{SSE / (n - k - 1)} = \frac{MSR}{MSE}$$

where:
SSR = Sum of squares regression
SSE = Sum of squares error
n = Number of data points
k = Number of independent variables
Degrees of freedom = $D_1 = k$ and $D_2 = n - k - 1$
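The F-test statistic is simple arithmetic once SSR and SSE are known. The sketch below uses made-up sums of squares purely to show the calculation. Comparing F with a critical value from an F table with D1 = k and D2 = n - k - 1 degrees of freedom is left to the reader.

```python
def f_statistic(ssr, sse, n, k):
    """F = MSR / MSE with k and n - k - 1 degrees of freedom."""
    msr = ssr / k             # mean square regression
    mse = sse / (n - k - 1)   # mean square error
    return msr / mse


# Hypothetical sums of squares for a model with k = 2 variables and n = 13 points.
f_value = f_statistic(ssr=100.0, sse=50.0, n=13, k=2)
print(f"F = {f_value:.2f} with D1 = 2 and D2 = {13 - 2 - 1}")
```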

Page 20: Chapter 12

Is the Model Significant?

ADJUSTED R-SQUARED

A measure of the percentage of explained variation in the dependent variable that takes into account the relationship between the number of cases and the number of independent variables in the regression model.

$$R\text{-sq(adj)} = R_A^2 = 1 - (1 - R^2)\left(\frac{n - 1}{n - k - 1}\right)$$

where:
n = Number of data points
k = Number of independent variables
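To illustrate how the adjustment penalizes extra variables, the sketch below evaluates the adjusted R-squared formula for the same R-squared with different numbers of independent variables. All input values are hypothetical.

```python
def adjusted_r_squared(r2, n, k):
    """R-sq(adj) = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Same R-squared and sample size, increasing number of independent variables.
for k in (2, 4, 8):
    print(f"k = {k}: adjusted R-squared = {adjusted_r_squared(0.80, 21, k):.4f}")
```

Holding R-squared and n fixed, the adjusted value falls as k grows, so a variable that adds little explanatory power can lower the adjusted R-squared.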

Page 21: Chapter 12

Are the Individual Variables Significant?

$$H_0: \beta_i = 0$$

$$H_A: \beta_i \neq 0 \quad \text{for all } i$$

Page 22: Chapter 12

Are the Individual Variables Significant?

t-TEST FOR SIGNIFICANCE OF EACH REGRESSION COEFFICIENT

$$t = \frac{b_i - 0}{s_{b_i}}$$

where:
$b_i$ = Sample slope coefficient for the ith independent variable
$s_{b_i}$ = Estimate of the standard error for the ith sample slope coefficient
n - k - 1 = Degrees of freedom

Page 23: Chapter 12

Are the Individual Variables Significant? (From Figure 12-7)

$$H_0: \beta_i = 0, \text{ given all other variables are already in the model}$$

$$H_A: \beta_i \neq 0, \text{ given all other variables are already in the model}$$

$$\alpha = 0.02$$

$$d.f. = n - k - 1 = 319 - 5 - 1 = 313$$

Rejection regions: $\alpha/2 = 0.01$ in each tail, with critical values $-t_{.01} = -2.364$ and $t_{.01} = 2.364$ on either side of 0.

Decision Rule: If $-2.364 \le t \le 2.364$, do not reject $H_0$. Otherwise, reject $H_0$.

Page 24: Chapter 12

Are the Individual Variables Significant? (From Figure 12-7)

For $\beta_1$: Calculated t = 15.70. Since 15.70 > 2.364, reject $H_0$.

For $\beta_2$: Calculated t = -10.15. Since -10.15 < -2.364, reject $H_0$.

For $\beta_3$: Calculated t = -2.80. Since -2.80 < -2.364, reject $H_0$.

For $\beta_4$: Calculated t = 2.23. Since 2.23 < 2.364, do not reject $H_0$.

For $\beta_5$: Calculated t = 9.87. Since 9.87 > 2.364, reject $H_0$.
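The five decisions above follow mechanically from the decision rule, as the sketch below shows. The t-values and the critical value 2.364 are taken from the slides. The function name and the dictionary layout are illustrative.

```python
T_CRITICAL = 2.364  # two-tailed critical value used on the slides (alpha = 0.02)

# Calculated t-values for coefficients 1 through 5, as reported from Figure 12-7.
t_values = {1: 15.70, 2: -10.15, 3: -2.80, 4: 2.23, 5: 9.87}


def decide(t, t_critical=T_CRITICAL):
    """Two-tailed decision rule: reject H0 when t falls outside +/- t_critical."""
    return "reject H0" if abs(t) > t_critical else "do not reject H0"


for i, t in t_values.items():
    print(f"beta_{i}: t = {t:>6.2f} -> {decide(t)}")
```

Only the fourth variable (number of baths) fails the test at this significance level, given that the other variables are already in the model.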

Page 25: Chapter 12

Is the Standard Deviation of the Regression Model Too Large?

ESTIMATE FOR THE STANDARD DEVIATION OF THE MODEL

$$s_\varepsilon = \sqrt{\frac{SSE}{n - k - 1}} = \sqrt{MSE}$$

where:
SSE = Sum of squares error
n = Sample size
k = Number of independent variables
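As a small worked sketch, the estimate of the model's standard deviation can be computed from SSE, n, and k as follows. The input numbers are hypothetical.

```python
from math import sqrt


def standard_error_of_estimate(sse, n, k):
    """s = sqrt(SSE / (n - k - 1)) = sqrt(MSE)."""
    return sqrt(sse / (n - k - 1))


# Hypothetical values: SSE = 160 from a model with k = 4 variables and n = 21 points.
s = standard_error_of_estimate(160.0, 21, 4)
print(f"s = {s:.4f}")
```

Whether this value is too large is a judgment made against the scale of the dependent variable. A common rough guide is that predictions fall within about two standard errors of the actual values most of the time.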

Page 26: Chapter 12

Is Multicollinearity A Problem?

Multicollinearity refers to the situation when high correlation exists between two independent variables. This means the two variables contribute redundant information to the multiple regression model. When highly correlated independent variables are included in the regression model, they can adversely affect the regression results.

Page 27: Chapter 12

Some Indications of Severe Multicollinearity

• Incorrect signs on the coefficients.

• A sizable change in the values of the previous coefficients when a new variable is added to the model.

• A variable previously significant in the model becomes insignificant when a new independent variable is added.

• The estimate of the standard deviation of the model increases when a variable is added to the model.

Page 28: Chapter 12

Is Multicollinearity A Problem?

The variance inflation factor (VIF) is a measure of how much the variance of an estimated regression coefficient increases if the independent variables are correlated. A VIF equal to one for a given independent variable indicates that this independent variable is not correlated with the remaining independent variables in the model. The greater the multicollinearity, the larger the VIF will be.

Page 29: Chapter 12

Is Multicollinearity A Problem?

VARIANCE INFLATION FACTOR

$$VIF = \frac{1}{1 - R_j^2}$$

where:
$R_j^2$ = Coefficient of determination when the jth independent variable is regressed against the remaining k - 1 independent variables.
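The VIF formula can be sketched as a one-line function. The R-squared inputs below are hypothetical and only show how quickly the VIF grows as the jth variable becomes more predictable from the others.

```python
def variance_inflation_factor(r_squared_j):
    """VIF = 1 / (1 - R_j^2) for the jth independent variable."""
    if not 0.0 <= r_squared_j < 1.0:
        raise ValueError("R_j^2 must be in the interval [0, 1)")
    return 1.0 / (1.0 - r_squared_j)


# Hypothetical R_j^2 values from regressing x_j on the other independent variables.
for r2 in (0.0, 0.5, 0.8, 0.95):
    print(f"R_j^2 = {r2:.2f} -> VIF = {variance_inflation_factor(r2):.2f}")
```

An R_j^2 of zero gives a VIF of exactly one, matching the statement that such a variable is uncorrelated with the remaining independent variables.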

Page 30: Chapter 12

Multiple Regression Analysis

CONFIDENCE INTERVAL FOR THE REGRESSION COEFFICIENT

$$b_i \pm t_{\alpha/2} \, s_{b_i}$$

where:
$b_i$ = Point estimate for the regression coefficient $\beta_i$
$t_{\alpha/2}$ = Critical t-value for a $1 - \alpha$ confidence interval
$s_{b_i}$ = The standard error of the ith regression coefficient

Page 31: Chapter 12

Multiple Regression Analysis (Example from Figure 12-9)

$$b_i \pm t_{\alpha/2} \, s_{b_i}$$

$$63.06 \pm 1.967(4.017)$$

$$63.06 \pm 7.90$$

Confidence interval: $55.16 to $70.97
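The interval arithmetic in this example can be checked with a short sketch. The inputs 63.06, 1.967, and 4.017 come from the slide. The function name is illustrative.

```python
def coefficient_interval(b_i, t_critical, s_b):
    """Confidence interval b_i +/- t * s_b for one regression coefficient."""
    margin = t_critical * s_b
    return b_i - margin, b_i + margin


lower, upper = coefficient_interval(b_i=63.06, t_critical=1.967, s_b=4.017)
print(f"margin = {1.967 * 4.017:.2f}")
print(f"interval = ({lower:.2f}, {upper:.2f})")
```

From these rounded inputs the limits come out to about 55.16 and 70.96. The slide shows $70.97 for the upper limit, which presumably reflects unrounded values in the original output.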

Page 32: Chapter 12

Using Qualitative Independent Variables

A dummy variable is a variable that is assigned a value equal to 0 or 1 depending on whether the observation possesses a given characteristic or not.

Page 33: Chapter 12

Using Qualitative Independent Variables (Example 12-2)

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$

where:
y = Salary
$x_1$ = Age

Dummy Variable:

$$x_2 = \begin{cases} 1 & \text{if MBA} \\ 0 & \text{if not} \end{cases}$$

Estimated Regression:

$$\hat{y} = 6{,}974 + 2{,}055 x_1 + 35{,}236 x_2$$

Page 34: Chapter 12

Using Qualitative Independent Variables (Example 12-2)

If No MBA:

$$\hat{y} = 6{,}974 + 2{,}055 x_1 + 35{,}236(0)$$

$$\hat{y} = 6{,}974 + 2{,}055 x_1$$

If MBA:

$$\hat{y} = 6{,}974 + 2{,}055 x_1 + 35{,}236(1)$$

$$\hat{y} = 42{,}210 + 2{,}055 x_1$$
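The effect of the dummy variable in Example 12-2 can be sketched in code: switching x2 from 0 to 1 shifts the intercept by 35,236 while the slope on age stays the same. The coefficients come from the slide, with the positive signs assumed from the MBA intercept of 42,210. The function name and the sample age are illustrative.

```python
B0, B1, B2 = 6974, 2055, 35236  # estimated coefficients from Example 12-2


def predict_salary(age, has_mba):
    """y_hat = b0 + b1 * age + b2 * x2, where x2 = 1 if MBA and 0 if not."""
    x2 = 1 if has_mba else 0
    return B0 + B1 * age + B2 * x2


age = 40  # hypothetical age used only for illustration
without_mba = predict_salary(age, has_mba=False)
with_mba = predict_salary(age, has_mba=True)
print(f"Age {age}, no MBA: {without_mba:,}")
print(f"Age {age}, MBA:    {with_mba:,}")
print(f"Difference:       {with_mba - without_mba:,}")
```

The difference between the two predictions equals b2 at every age, which is why the two fitted lines are parallel.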

Page 35: Chapter 12

Using Qualitative Independent Variables (Figure 12-11)

[Figure 12-11: Salary (vertical axis, 0 to 200,000) plotted against Age (horizontal axis, 0 to 70), showing two parallel regression lines. MBAs: $\hat{y} = 42{,}210 + 2{,}055 x_1$. Non-MBAs: $\hat{y} = 6{,}974 + 2{,}055 x_1$. The vertical distance between the lines is $b_2$ = 35,236, the regression coefficient on the dummy variable.]

Page 36: Chapter 12

Stepwise Regression

Stepwise regression refers to a method which develops the least squares regression equation in steps, either through forward selection, backward elimination, or through standard stepwise regression.

Page 37: Chapter 12

Stepwise Regression

The coefficient of partial determination is the measure of the marginal contribution of each independent variable, given that other independent variables are in the model.

Page 38: Chapter 12

Best Subsets Regression

$C_p$ STATISTIC

$$C_p = \frac{(1 - R_p^2)(n - T)}{1 - R_T^2} - (n - 2p)$$

where:
p = (Number of independent variables in the model) + 1
T = 1 + The total number of independent variables to be considered for inclusion in the model
$R_p^2$ = Coefficient of multiple determination for the model with p parameters
$R_T^2$ = Coefficient of multiple determination for the model that contains all T parameters
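The Cp formula can be sketched as a small function. The R-squared values and sample size below are hypothetical. The argument names mirror the symbols defined on the slide.

```python
def cp_statistic(r2_p, r2_full, n, p, t):
    """C_p = (1 - R_p^2)(n - T) / (1 - R_T^2) - (n - 2p)."""
    return (1.0 - r2_p) * (n - t) / (1.0 - r2_full) - (n - 2 * p)


# Hypothetical study: n = 30 observations, 4 candidate variables (T = 5).
n, t, r2_full = 30, 5, 0.90
candidates = {2: 0.70, 3: 0.86, 4: 0.89, 5: 0.90}  # p -> R_p^2 of a candidate subset
for p, r2_p in candidates.items():
    print(f"p = {p}: Cp = {cp_statistic(r2_p, r2_full, n, p, t):.2f}")
```

For the full model, where p = T and the two R-squared values coincide, the formula reduces to Cp = T. Subset models with Cp close to or below p are usually the ones considered further.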

Page 39: Chapter 12

Key Terms

• Adjusted R-Squared

• Correlation Coefficient

• Correlation Matrix

• Dummy Variables

• Multicollinearity

• Multiple Coefficient of Determination

• Multiple Regression Model

Page 40: Chapter 12

Key Terms (continued)

• Residual (Prediction Error)

• Standard Error of the Estimate

• Standardized Residual

• Variance Inflation Factor