Copyright © 2011 Pearson Education, Inc. Building Regression Models Chapter 24

Building Regression Models

Chapter 24

24.1 Identifying Explanatory Variables

What explanatory variables belong in a regression model for stock returns?

Initial model motivated by theory such as CAPM

Seek additional variables that improve fit and produce better predictions

The process is typically complicated by correlated explanatory variables (i.e., collinearity)

24.1 Identifying Explanatory Variables

The Initial Model

Build a model that describes returns on Sony stock

CAPM provides a theoretical starting point: use % change for the whole stock market as an explanatory variable
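As a minimal sketch of this starting point, the Python below converts a price series into percentage changes and fits a simple regression of the stock's % change on the market's % change by least squares. The index levels and the intercept and slope used to generate the stock returns are made up for illustration; they are not the Sony data.

```python
def pct_change(prices):
    """Percentage change between consecutive prices."""
    return [100.0 * (b - a) / a for a, b in zip(prices, prices[1:])]


def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


# Hypothetical month-end index levels (not real data).
market_prices = [100.0, 104.0, 101.0, 107.0, 103.0, 110.0]
market_pct = pct_change(market_prices)

# Hypothetical stock returns generated with intercept 0.5 and slope 1.2.
stock_pct = [0.5 + 1.2 * r for r in market_pct]

intercept, slope = fit_line(market_pct, stock_pct)
print(f"intercept = {intercept:.3f}, slope (beta) = {slope:.3f}")
```

In the CAPM setting the slope is the stock's beta and the intercept is its alpha; with real returns the fit would not be exact as it is here.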

24.1 Identifying Explanatory Variables

The Initial Model – Scatterplot

Association appears linear, two outliers identified.

24.1 Identifying Explanatory Variables

The Initial Model – Timeplot of Residuals

Locates outliers in time (Dec. 1999 and Apr. 2003). No evidence of dependence.

24.1 Identifying Explanatory Variables

The Initial Model – Regression Results

24.1 Identifying Explanatory Variables

The Initial Model – Residual Plot

Aside from the two outliers, residuals have similar variances.

24.1 Identifying Explanatory Variables

The Initial Model – Check Normality

Aside from the two outliers, residuals are nearly normal.

24.1 Identifying Explanatory Variables

The Initial Model – Proceed to Inference

Estimates are consistent with CAPM.

The estimated intercept is not significantly different from zero with a p-value of 0.6964.

The estimated slope is highly significant with a p-value less than 0.0001.
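The inference step can be sketched in Python as follows. The function computes the t-statistics for the intercept and slope of a simple regression from the usual standard-error formulas. The data are hypothetical, chosen so that the intercept is small and the slope is large. P-values would come from a t distribution with n - 2 degrees of freedom, which is not computed here; a rough guide is that |t| above about 2 indicates significance at the 5% level.

```python
import math


def simple_regression_t(x, y):
    """Return (b0, b1, t0, t1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    # Residual standard deviation uses n - 2 degrees of freedom.
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))
    se_b1 = s_e / math.sqrt(sxx)
    se_b0 = s_e * math.sqrt(1.0 / n + x_bar ** 2 / sxx)
    return b0, b1, b0 / se_b0, b1 / se_b1


# Hypothetical % changes (not the Sony data).
market = [-4.0, -3.0, -2.0, -1.0, 1.0, 2.0, 3.0, 4.0]
noise = [1.0, 1.0, -1.0, -1.0, -1.0, -1.0, 1.0, 1.0]
stock = [0.3 + 1.5 * m + e for m, e in zip(market, noise)]

b0, b1, t0, t1 = simple_regression_t(market, stock)
print(f"intercept {b0:.2f} (t = {t0:.2f}), slope {b1:.2f} (t = {t1:.2f})")
```

With these made-up numbers the intercept's t-statistic is well below 2 while the slope's is far above it, the same pattern the slide reports for the Sony regression.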

24.1 Identifying Explanatory Variables

Identifying Other Variables

Research in finance suggests other variables should be added to the initial model.

Three such variables are the percentage change in the DJIA (Dow % Change), the difference in performance between small and large companies (Small-Big), and the difference in performance between growth and value stocks (High-Low).

24.1 Identifying Explanatory Variables

Correlation Matrix

24.1 Identifying Explanatory Variables

Scatterplot Matrix

24.1 Identifying Explanatory Variables

Identifying Other Variables

The correlation matrix indicates that percentage changes in the DJIA and in the whole market index are highly correlated.

The scatterplot matrix indicates that the association between the response and each of these variables appears linear.
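A correlation matrix like the one described above can be computed as in this sketch. The column names follow the slides, but the values are invented so that the first two columns move almost together and the third is unrelated to both.

```python
import math


def correlation(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Nested dict of pairwise correlations, keyed by column name."""
    names = list(columns)
    return {a: {b: correlation(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical data (not the values behind the slides).
data = {
    "Market % Change": [-4.0, -3.0, -2.0, -1.0, 1.0, 2.0, 3.0, 4.0],
    "Dow % Change": [-3.5, -3.5, -1.5, -1.5, 1.5, 1.5, 3.5, 3.5],
    "High-Low": [1.0, 1.0, -1.0, -1.0, -1.0, -1.0, 1.0, 1.0],
}

corr = correlation_matrix(data)
for name, row in corr.items():
    print(f"{name:16s}", "  ".join(f"{v:6.3f}" for v in row.values()))
```

A large off-diagonal entry between two explanatory variables, as between the first two columns here, is the first hint of collinearity.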

24.1 Identifying Explanatory Variables

Adding Explanatory Variables

The data consist of 168 observations with four candidate explanatory variables.

Begin model building by including all four variables in the multiple regression model.

24.1 Identifying Explanatory Variables

MRM with All Four Explanatory Variables

24.1 Identifying Explanatory Variables

Residual Plot: Residuals vs. Fitted Values

Outliers are still present; however, this and other residual plots show the conditions for MRM are satisfied.

24.1 Identifying Explanatory Variables

MRM with All Four Explanatory Variables

The F-statistic is 21.59 with p-value of 0.0001; this multiple regression equation explains statistically significant variation in percentage changes in the value of Sony stock.

Based on the t-statistics, only the variable Small-Big improves a regression that contains all of the other explanatory variables.
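The overall F-statistic and R² carry the same information once the sample size and the number of explanatory variables are known, since F = (R²/k) / ((1 - R²)/(n - k - 1)). The sketch below inverts that relation using the figures quoted on the slides (F = 21.59, n = 168, k = 4). It assumes the regression includes an intercept and that the F-statistic tests all four slopes at once.

```python
def r_squared_from_f(f_stat, n, k):
    """Invert F = (R^2 / k) / ((1 - R^2) / (n - k - 1)) for R^2."""
    return k * f_stat / (k * f_stat + (n - k - 1))


# Figures quoted on the slides: F = 21.59, 168 observations, 4 variables.
r2 = r_squared_from_f(21.59, n=168, k=4)
print(f"implied R^2 = {r2:.3f}")
```

Under those assumptions the implied R² is about 0.35, so the four variables together account for roughly a third of the variation in the response even though only one individual t-statistic is significant.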

24.1 Identifying Explanatory Variables

MRM with All Four Explanatory Variables

Adding other explanatory variables to the initial model alters the slope for Market % Change.

This once important variable is no longer statistically significant in explaining percentage changes in the value of Sony stock.

24.2 Collinearity

Marginal and Partial Slopes

There is a high correlation between Market % Change and Dow % Change (r = 0.89).

This collinearity produces imprecise estimates of the partial slopes.

It explains the difference between the marginal and partial slopes for Market % Change.
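To illustrate how collinearity separates marginal and partial slopes, the sketch below uses the closed-form solution for a regression with two explanatory variables. The data are hypothetical: the response is generated with a small slope on the market variable and a large slope on the Dow variable, and the two explanatory variables are highly correlated.

```python
def cross(a, b):
    """Sum of cross-products of deviations from the means."""
    a_bar, b_bar = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))


def marginal_slope(x, y):
    """Slope of x in the simple regression of y on x alone."""
    return cross(x, y) / cross(x, x)


def partial_slopes(x1, x2, y):
    """Slopes of x1 and x2 in the multiple regression of y on both."""
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y = cross(x1, y), cross(x2, y)
    det = s11 * s22 - s12 ** 2  # near zero when x1 and x2 are collinear
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det


# Hypothetical % changes (not the Sony data).
market = [-4.0, -3.0, -2.0, -1.0, 1.0, 2.0, 3.0, 4.0]
dow = [-3.5, -3.5, -1.5, -1.5, 1.5, 1.5, 3.5, 3.5]
noise = [0.5, 0.5, -0.5, -0.5, -0.5, -0.5, 0.5, 0.5]
stock = [0.2 * m + 1.0 * d + e for m, d, e in zip(market, dow, noise)]

marginal = marginal_slope(market, stock)
b_market, b_dow = partial_slopes(market, dow, stock)
print(f"marginal slope for market: {marginal:.3f}")
print(f"partial slopes: market {b_market:.3f}, dow {b_dow:.3f}")
```

The marginal slope for the market variable absorbs the effect of the correlated Dow variable, so it is much larger than the partial slope. The small determinant in `partial_slopes` is also why the partial slopes are estimated imprecisely when the two variables are nearly redundant.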

24.2 Collinearity

Variance Inflation Factor (VIF)

Variance inflation factor: quantifies the amount of unique variation in each explanatory variable and measures the effect of collinearity.

The VIF for $x_j$ is

$$\mathrm{VIF}(x_j) = \frac{1}{1 - R_j^2}$$

24.2 Collinearity

Results for Sony Stock Value Example

24.2 Collinearity

Results for Sony Stock Value Example

Is High-Low not statistically significant because it is redundant or simply unrelated to the response?

Because it has a VIF near 1, collinearity has little effect on this variable (not redundant).

Generally, VIF > 5 or 10 suggests redundancy.
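A VIF can be computed by regressing each explanatory variable on the others and taking 1 / (1 - R²) from that auxiliary regression. The sketch below does this with a small hand-written least-squares solver so that it needs only the standard library; in practice a statistics package would normally be used. The three columns are hypothetical: the first two are nearly redundant and the third is unrelated to both.

```python
def ols(rows, y):
    """Least-squares coefficients; each row must start with 1.0 (intercept)."""
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [u - f * v for u, v in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


def vif(columns, j):
    """VIF of column j: regress it on the other columns, then 1 / (1 - R^2)."""
    target = columns[j]
    others = [c for i, c in enumerate(columns) if i != j]
    rows = [[1.0] + [c[t] for c in others] for t in range(len(target))]
    coef = ols(rows, target)
    fitted = [sum(b * v for b, v in zip(coef, row)) for row in rows]
    mean = sum(target) / len(target)
    sse = sum((y - f) ** 2 for y, f in zip(target, fitted))
    sst = sum((y - mean) ** 2 for y in target)
    return sst / sse  # equals 1 / (1 - R^2) because R^2 = 1 - SSE / SST


# Hypothetical explanatory variables (not the Sony data).
market = [-4.0, -3.0, -2.0, -1.0, 1.0, 2.0, 3.0, 4.0]
dow = [-3.5, -3.5, -1.5, -1.5, 1.5, 1.5, 3.5, 3.5]
high_low = [1.0, 1.0, -1.0, -1.0, -1.0, -1.0, 1.0, 1.0]

columns = [market, dow, high_low]
vifs = [vif(columns, j) for j in range(len(columns))]
for name, value in zip(["market", "dow", "high_low"], vifs):
    print(f"VIF({name}) = {value:.2f}")
```

In this made-up example the two nearly redundant columns have VIFs far above the 5-to-10 guideline, while the unrelated column has a VIF of 1, mirroring the reasoning applied to High-Low on the slide.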

24.2 Collinearity

Signs of Collinearity

$R^2$ increases less than we’d expect.

Slopes of correlated explanatory variables in the model change dramatically.

The F-statistic is more impressive than individual t-statistics.

24.2 Collinearity

Signs of Collinearity (Continued)

Standard errors for partial slopes are larger than those for marginal slopes.

Variance inflation factors increase.

24.2 Collinearity

Remedies for Collinearity

Remove redundant explanatory variables.

Re-express explanatory variables (e.g., use the average of Market % Change and Dow % Change as an explanatory variable).

Do nothing if the explanatory variables are significant with sensible estimates.

24.3 Removing Explanatory Variables

Issues

After adding several explanatory variables to a model, some of those added and some of those originally present may not be statistically significant.

Remove those variables for which both statistics and substance indicate removal (e.g., remove Dow % Change rather than Market % Change).

4M Example 24.1: MARKET SEGMENTATION

Motivation

In which magazine should a manufacturer of a new mobile phone advertise? One of the candidates has an older audience. The manufacturer collects consumer ratings of the new phone design along with the consumers’ ages and reported incomes.

4M Example 24.1: MARKET SEGMENTATION

Method

Use multiple regression with ratings as the response and age and income as the explanatory variables. Examine the correlation matrix and scatterplot matrix.

4M Example 24.1: MARKET SEGMENTATION

Method

There is a high correlation between age and income that implies collinearity.

4M Example 24.1: MARKET SEGMENTATION

Method

Association is linear with no outliers.

4M Example 24.1: MARKET SEGMENTATION

Mechanics – Estimation Results

4M Example 24.1: MARKET SEGMENTATION

Mechanics – Examine Plots

MRM conditions are satisfied.

4M Example 24.1: MARKET SEGMENTATION

Mechanics

The F-statistic has a p-value less than 0.0001, so the model explains statistically significant variation in the ratings. Although collinear, both predictors (age and income) are statistically significant.

4M Example 24.1: MARKET SEGMENTATION

Message

The manufacturer should advertise in the magazine with younger subscribers. Based on the 95% confidence interval for the slope of Age, an affluent audience that is younger by 20 years assigns, on average, ratings that are 1 to 2 points higher than the older, affluent audience.

The slope for Age changes sign when adjusted for differences in income. Substantively, this makes sense because younger customers with money find the new design attractive.

4M Example 24.2: RETAIL PROFITS

Motivation

A chain of pharmacies is looking to expand into a new community. It has data for 110 cities on the following variables: income, disposable income, birth rate, social security recipients, cardiovascular deaths and percentage of local population aged 65 or more.

4M Example 24.2: RETAIL PROFITS

Method

Use multiple regression. The response variable is profit. Examine the correlation matrix and the scatterplot matrix.

4M Example 24.2: RETAIL PROFITS

Method

Several high correlations are present (shaded in table) and indicate the presence of collinearity.

4M Example 24.2: RETAIL PROFITS

Method

This partial scatterplot matrix identifies communities that are distinct from others.

Linearity and no lurking variables conditions are met.

4M Example 24.2: RETAIL PROFITS

Mechanics – Estimation Results

4M Example 24.2: RETAIL PROFITS

Mechanics – Examine Plots

These and other plots (not shown here) indicate that all MRM conditions are satisfied.

4M Example 24.2: RETAIL PROFITS

Mechanics

The F-statistic indicates that this collection of explanatory variables explains statistically significant variation in profits. The VIFs indicate that some explanatory variables are redundant and should be removed (one at a time) from the model.

4M Example 24.2: RETAIL PROFITS

Mechanics – Simplified Model

This multiple regression separates the effects of birth rates from age (and income). It reveals that cities with higher birth rates produce higher profits when compared to cities with lower birth rates but comparable income and local population above 65.

4M Example 24.2: RETAIL PROFITS

Message

Three characteristics of the local community affect estimated profits: disposable income, age and birth rates. Increases in each of these lead to higher profits. The data show that the pharmacy chain will have to trade off these characteristics in selecting a site for expansion.

Best Practices

Begin a regression analysis by looking at plots.

Use the F-statistic for the overall model and a t-statistic for each explanatory variable.

Learn to recognize the presence of collinearity.

Don’t fear collinearity – understand it.

Pitfalls

Do not remove explanatory variables at the first sign of collinearity.

Don’t remove several explanatory variables from your model at once.
