You want to examine the linear dependency of the annual sales of produce stores on their size in square footage. Sample data for seven stores were obtained

Page 1

Simple Linear Regression: $\hat{Y} = b_0 + b_1 X_1$

You want to examine the linear dependency of the annual sales of produce stores on their size in square footage. Sample data for seven stores were obtained. Find the equation of the straight line that fits the data best.

Store   Square Feet   Annual Sales ($1000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760
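
The best-fitting straight line can be found with the usual least-squares formulas. The following is a minimal sketch in Python, assuming the seven (square feet, annual sales) pairs listed above; the variable names are illustrative and do not come from the slides.

```python
# Least-squares fit of Annual Sales ($1000) on Square Feet for the seven stores.
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
mean_x = sum(square_feet) / n
mean_y = sum(annual_sales) / n

# Sum of squares of X and sum of cross-products, both taken about the means.
s_xx = sum((x - mean_x) ** 2 for x in square_feet)
s_xy = sum((x - mean_x) * (y - mean_y)
           for x, y in zip(square_feet, annual_sales))

b1 = s_xy / s_xx            # slope
b0 = mean_y - b1 * mean_x   # intercept

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

The printed coefficients should agree, up to rounding, with the intercept and slope reported in the Excel printout for this example.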

Page 2

Which is the dependent Y variable?

A. The Store Number

B. The Square Footage of the Store

C. The Annual Sales of the Store

Page 3

Which is the independent X variable?

A. The Store Number

B. The Square Footage of the Store

C. The Annual Sales of the Store

Page 4

Scatter Diagram: Example

[Figure: Excel scatter plot of Annual Sales ($000) against Square Feet for the seven stores]

Page 5

Equation for the Sample Regression Line: Example

$\hat{Y}_i = b_0 + b_1 X_i = 1636.415 + 1.487 X_i$

From Excel Printout:

                Coefficients
Intercept       1636.414726
X Variable 1    1.486633657

Page 6

Graph of the Sample Regression Line: Example

[Figure: scatter plot of Annual Sales ($000) against Square Feet with the fitted line $\hat{Y}_i = 1636.415 + 1.487 X_i$]

Page 7

Interpretation of Results:

The model estimates that for each increase of one square foot in the size of the store, the expected annual sales increase by 1.487 thousand dollars, that is, $1,487.

$\hat{Y}_i = 1636.415 + 1.487 X_i$

If a new 2,000 square foot produce store is built, the model predicts that the expected annual sales at the store will be:

1636.415 + 1.487 × 2000 = 4,610.415, or about $4,610 thousand ($4.61 million)
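
The same prediction can be written as a small function. This is only a sketch: the function name `predict_sales` is hypothetical, and the coefficients are the rounded values shown on the slide.

```python
# Coefficients rounded as on the slide; sales are measured in $1000s.
B0 = 1636.415   # intercept
B1 = 1.487      # slope: change in sales ($1000s) per additional square foot


def predict_sales(square_feet: float) -> float:
    """Predicted annual sales (in $1000s) for a store of the given size."""
    return B0 + B1 * square_feet


predicted = predict_sales(2000)
print(f"{predicted:,.3f} thousand dollars")
```

Predictions are most trustworthy for store sizes inside the range of the sample (roughly 1,300 to 5,600 square feet); 2,000 square feet falls within it.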

Page 8

The Coefficient of Determination

• r² = 94.2%

• Measures the proportion of variation in Y (e.g. Annual Sales) that is explained by the linear regression model with the independent variable X (e.g. square feet)

• Describes the explanatory power of the simple linear regression model; it does not imply that X causes the changes in Y.
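
The 94.2% figure can be reproduced from the seven data points of the produce-store example. The sketch below assumes the standard definition r² = 1 − SSE/SST, which the slides do not spell out; variable names are illustrative.

```python
# Coefficient of determination for the produce-store regression.
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
mean_x = sum(square_feet) / n
mean_y = sum(annual_sales) / n

s_xx = sum((x - mean_x) ** 2 for x in square_feet)
s_xy = sum((x - mean_x) * (y - mean_y)
           for x, y in zip(square_feet, annual_sales))
b1 = s_xy / s_xx
b0 = mean_y - b1 * mean_x

# SST: total variation of Y about its mean.
sst = sum((y - mean_y) ** 2 for y in annual_sales)
# SSE: variation left unexplained by the fitted line.
sse = sum((y - (b0 + b1 * x)) ** 2
          for x, y in zip(square_feet, annual_sales))

r_squared = 1 - sse / sst
print(f"r^2 = {r_squared:.1%}")
```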

Page 9

Significant Coefficients

• If a linear relationship between the dependent and independent variables does not exist, the true value of the slope should be 0. To test whether this is true, look at the 95% confidence interval for the independent variable’s coefficient. If the interval does not contain 0, the slope is significantly different from 0, as is the case here for X Variable 1:

              Coefficients  Standard Error  t Stat    P-value   Lower 95%   Upper 95%
Intercept     1636.414726   451.4953308     3.624433  0.015149  475.80903   2797.02042
X Variable 1  1.486633657   0.164999212     9.009944  0.000281  1.06248968  1.91077763
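
The standard error, t statistic and 95% confidence interval for the slope can be recomputed from the seven data points. This is a minimal sketch using the textbook formulas for simple linear regression; the critical value 2.570582 (Student's t, 5 degrees of freedom, two-sided 95%) is hard-coded from a t table as an assumption rather than computed.

```python
import math

square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
mean_x = sum(square_feet) / n
mean_y = sum(annual_sales) / n
s_xx = sum((x - mean_x) ** 2 for x in square_feet)
s_xy = sum((x - mean_x) * (y - mean_y)
           for x, y in zip(square_feet, annual_sales))
b1 = s_xy / s_xx
b0 = mean_y - b1 * mean_x

# Residual variance estimate with n - 2 degrees of freedom.
sse = sum((y - (b0 + b1 * x)) ** 2
          for x, y in zip(square_feet, annual_sales))
mse = sse / (n - 2)

se_b1 = math.sqrt(mse / s_xx)   # standard error of the slope
t_stat = b1 / se_b1             # tests H0: true slope = 0

T_CRIT = 2.570582               # assumption: t table value for df = 5, 95% two-sided
lower = b1 - T_CRIT * se_b1
upper = b1 + T_CRIT * se_b1

# The slope is significant at the 5% level when 0 lies outside the interval.
slope_is_significant = not (lower <= 0 <= upper)
print(f"SE = {se_b1:.4f}, t = {t_stat:.3f}, "
      f"95% CI = ({lower:.4f}, {upper:.4f}), significant = {slope_is_significant}")
```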

Page 10

Multiple Linear Regression Model with n independent variables

• Equation: $\hat{Y}_i = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n$

• Adjusted R-square:

– Describes the explanatory power of the multiple regression, after compensating for sample size and the number of independent variables in the model

Page 11

Restaurant Sales Exercise (ChiliDogRegress.xlsx)

• The ChiliDog Hut fast food chain wants to identify a good location for a new restaurant

• They have identified three possible independent variables that could have a relationship with the annual sales (in $1,000s) of a restaurant

– # of other fast food stores in 1 mile radius

– # of schools and businesses in 1 mile radius

– $ spent on advertising per year

• Help ChiliDog identify the regression model that forecasts annual sales the best

Page 12

Simple Linear Regression on # of Other Restaurants in 1 mile radius

Regression Statistics
Multiple R         0.8356977
R Square           0.6983906
Adjusted R Square  0.6606894
Standard Error     24.352403
Observations       10

                                   Coefficients  Standard Error  t Stat       P-value      Lower 95%  Upper 95%
Intercept                          38.664063     22.5752821      1.712672393  0.125128067  -13.3946   90.72276
# of competitors in 1 mile radius  8.4570313     1.9649261       4.303994564  0.002601674  3.925904   12.98816

Page 13

What Conclusions Can You Make From the Simple Linear Regression?

A. More competing restaurants in the 1 mile radius hurt ChiliDog Hut’s sales

B. The # of competing restaurants in the 1 mile radius has little impact on ChiliDog Hut’s sales

C. There is a positive correlation between the # of competing restaurants in the 1 mile radius and ChiliDog Hut’s sales

D. Increasing the # of competing restaurants in the 1 mile radius will increase ChiliDog Hut’s sales

Page 14

Multiple Regression on $ Spent on Advertising & # of Other Restaurants in 1 mile radius

SUMMARY OUTPUT

Regression Statistics
Multiple R         0.8926645
R Square           0.7968499
Adjusted R Square  0.738807
Standard Error     21.366032
Observations       10

ANOVA
            df  SS          MS           F            Significance F
Regression  2   12534.4487  6267.224338  13.72863895  0.003779
Residual    7   3195.55132  456.5073321
Total       9   15730

                                     Coefficients  Standard Error  t Stat       P-value      Lower 95%  Upper 95%
Intercept                            78.879957     29.4792254      2.675781191  0.031732659  9.172666   148.5872
$ spent on advertising (in $1,000s)  -5.7649964    3.12989773      -1.84191208  0.108041529  -13.166    1.636036
# of competitors in 1 mile radius    8.2173432     1.72886864      4.753017667  0.002076223  4.129218   12.30547
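
Several of the summary figures in this printout follow directly from the ANOVA sums of squares. The sketch below recomputes them from the printed SS and df values, assuming the standard relationships F = MSR/MSE, R² = SSR/SST and standard error = √MSE; variable names are illustrative.

```python
import math

# Values copied from the ANOVA section of the Excel printout.
ss_regression = 12534.4487
ss_residual = 3195.55132
df_regression = 2    # number of independent variables
df_residual = 7      # observations - independent variables - 1 = 10 - 2 - 1

ss_total = ss_regression + ss_residual

# Mean squares: each sum of squares divided by its degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_stat = ms_regression / ms_residual       # overall F statistic
r_squared = ss_regression / ss_total       # share of variation explained
standard_error = math.sqrt(ms_residual)    # standard error of the regression

print(f"F = {f_stat:.4f}, R^2 = {r_squared:.4f}, SE = {standard_error:.3f}")
```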

Page 15

Coefficient of Determination: Adjusted R-square

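• Proportion of variation in Y around its mean that is accounted for by the regression model, adjusted for the number of independent variables

– Adj. R² <= R² <= 1 (adjusted R² can fall slightly below 0 when the model explains very little)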

• Describes the explanatory power of the multiple linear regression model, after compensating for sample size and the number of independent variables.
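
The adjustment can be illustrated with the two ChiliDog models. The sketch assumes the standard formula Adj. R² = 1 − (1 − R²)(N − 1)/(N − k − 1), where N is the number of observations and k the number of independent variables; the slides do not state the formula, and the function name is hypothetical.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R-square for a model with an intercept and n_predictors variables."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# R Square values taken from the two Excel printouts (10 observations each).
simple_model = adjusted_r_squared(0.6983906, n_obs=10, n_predictors=1)
multiple_model = adjusted_r_squared(0.7968499, n_obs=10, n_predictors=2)

print(f"competitors only:          {simple_model:.7f}")
print(f"advertising + competitors: {multiple_model:.7f}")
```

Because the penalty grows with each added variable, comparing adjusted R-square values (rather than R-square) is the fairer way to choose between models with different numbers of independent variables.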