Why Design? (Why not just observe and model?)
Q: Why experimental design? A: To avoid multicollinearity.
Issues: (1) Testing joint importance versus individual significance
(2) Prediction versus modeling individual effects
(3) Collinearity (correlation among inputs)
Example: Hypothetical company’s sales Y depend on TV advertising X1 and Radio Advertising X2.
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + e$
Jointly critical (can’t omit both!!)
A two-engine plane can still fly if engine #1 fails.
A two-engine plane can still fly if engine #2 fails.
Neither is critical individually.
data sales;
  input store TV radio sales;
  (more code)
cards;
1 869 868 9089
2 836 820 8290
(more data)
40 969 961 10130

proc g3d data=sales;
  scatter radio*TV=sales / shape=sval color=cval zmin=8000;
run;
[Figure: 3-D scatter plot with axes TV, Radio, and Sales]
Conclusion: Can predict well with just TV, just radio, or both!
SAS code: proc reg data=next; model sales = TV radio;
Analysis of Variance

                              Sum of         Mean
Source             DF        Squares       Square   F Value   Pr > F
Model               2       32660996     16330498    358.84   <.0001   (Can’t omit both)
Error              37        1683844        45509
Corrected Total    39       34344840

Root MSE 213.32908   R-Square 0.9510   (Explaining 95% of variation in sales)

Parameter Estimates

                   Parameter     Standard
Variable     DF     Estimate        Error   t Value   Pr > |t|
Intercept     1    531.11390    359.90429      1.48     0.1485
TV            1      5.00435      5.01845      1.00     0.3251   (can omit TV)
radio         1      4.66752      4.94312      0.94     0.3512   (can omit radio)
Estimated Sales = 531 + 5.0 TV + 4.7 radio with error variance 45509 (standard deviation 213).
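The summary numbers in the Analysis of Variance table follow from the sums of squares and degrees of freedom alone. The short sketch below recomputes them as an arithmetic check; the variable names are illustrative and are not part of the SAS output.

```python
import math

# Values copied from the Analysis of Variance table above.
ss_model, df_model = 32660996, 2
ss_error, df_error = 1683844, 37
ss_total = ss_model + ss_error      # corrected total sum of squares

ms_model = ss_model / df_model      # mean square for the model
ms_error = ss_error / df_error      # mean square error = estimated error variance
f_value = ms_model / ms_error       # overall F test of TV and radio jointly
r_square = ss_model / ss_total      # share of variation in sales explained
root_mse = math.sqrt(ms_error)      # estimated error standard deviation

print(f"F = {f_value:.2f}, R-Square = {r_square:.4f}, Root MSE = {root_mse:.2f}")
```

The large F value is the joint test (can't omit both); it says nothing about whether either input can be dropped on its own.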
TV is approximately equal to radio, so approximately:
Estimated Sales = 531 + 9.7 TV or
Estimated Sales = 531 + 9.7 radio
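The step behind the two single-input equations is a substitution: if radio is treated as equal to TV in every store (an approximation suggested by the data, not an exact identity), the two slopes add.

```latex
\widehat{\text{Sales}}
  = 531 + 5.0\,\text{TV} + 4.7\,\text{radio}
  \approx 531 + (5.0 + 4.7)\,\text{TV}
  = 531 + 9.7\,\text{TV}
  \qquad (\text{radio} \approx \text{TV})
```

Substituting TV with radio instead gives the second form, 531 + 9.7 radio.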
Regression
The REG Procedure
Model: MODEL1
Dependent Variable: sales

Number of Observations Read   40
Number of Observations Used   40

Analysis of Variance

                              Sum of         Mean
Source             DF        Squares       Square   F Value   Pr > F
Model               2       32660996     16330498    358.84   <.0001
Error              37        1683844        45509
Corrected Total    39       34344840

Root MSE          213.32908    R-Square   0.9510
Dependent Mean         9955    Adj R-Sq   0.9483
Coeff Var           2.14291

Parameter Estimates

                   Parameter     Standard
Variable     DF     Estimate        Error   t Value   Pr > |t|
Intercept     1    531.11390    359.90429      1.48     0.1485
TV            1      5.00435      5.01845      1.00     0.3251
radio         1      4.66752      4.94312      0.94     0.3512
Design
The REG Procedure
Model: MODEL1
Dependent Variable: SALES

Number of Observations Read   40
Number of Observations Used   40

Analysis of Variance

                              Sum of         Mean
Source             DF        Squares       Square   F Value   Pr > F
Model               2       32641505     16320753    358.66   <.0001
Error              37        1683699        45505
Corrected Total    39       34325204

Root MSE          213.31990    R-Square   0.9509
Dependent Mean        10300    Adj R-Sq   0.9483
Coeff Var           2.07111

Parameter Estimates

                   Parameter     Standard
Variable     DF     Estimate        Error   t Value   Pr > |t|
Intercept     1    530.72803    366.53079      1.45     0.1560
TV            1      5.00492      0.25552     19.59     <.0001
Radio         1      4.66742      0.25552     18.27     <.0001
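The two outputs have nearly identical coefficient estimates and error variance; what changes is the standard error of each slope. The sketch below recomputes the t values from the printed estimates and standard errors and compares the standard errors. It assumes the "Regression" output comes from the observed (collinear) data and the "Design" output from a designed experiment; the dictionary and function names are illustrative.

```python
# (estimate, standard error) pairs copied from the two Parameter Estimates tables.
regression = {"TV": (5.00435, 5.01845), "radio": (4.66752, 4.94312)}
design = {"TV": (5.00492, 0.25552), "radio": (4.66742, 0.25552)}

def t_value(estimate, std_error):
    """t statistic for testing that a single coefficient is zero."""
    return estimate / std_error

for name in ("TV", "radio"):
    t_reg = t_value(*regression[name])
    t_des = t_value(*design[name])
    # How many times larger the standard error is without a designed experiment.
    se_ratio = regression[name][1] / design[name][1]
    print(f"{name}: t (Regression) = {t_reg:.2f}, t (Design) = {t_des:.2f}, "
          f"SE ratio = {se_ratio:.1f}")
```

With roughly the same slope and the same error variance, a standard error about 19 times smaller is what turns a non-significant t value near 1 into one near 19.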
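Design matrix: -1 for low level, +1 for high level; 12 observations.

Unbalanced design. Columns of X are (1, X1, X2, X1X2). The 12 rows are 5 copies of (1, 1, 1, 1), 1 copy of (1, 1, -1, -1), 1 copy of (1, -1, 1, -1), and 5 copies of (1, -1, -1, 1).

$$
(X'X)^{-1} =
\begin{pmatrix}
 0.15 & 0 & 0 & -0.10 \\
 0 & 0.15 & -0.10 & 0 \\
 0 & -0.10 & 0.15 & 0 \\
 -0.10 & 0 & 0 & 0.15
\end{pmatrix}
$$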
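Balanced design. Columns of X are (1, X1, X2, X1X2). The 12 rows are 3 copies each of (1, 1, 1, 1), (1, 1, -1, -1), (1, -1, 1, -1), and (1, -1, -1, 1).

$$
(X'X)^{-1} =
\begin{pmatrix}
 0.083 & 0 & 0 & 0 \\
 0 & 0.083 & 0 & 0 \\
 0 & 0 & 0.083 & 0 \\
 0 & 0 & 0 & 0.083
\end{pmatrix}
$$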
Unbalanced: $\mathrm{Var}\big(0.5\,(\text{estimated effect})\big) = 0.15\,\sigma^2$
Balanced: $\mathrm{Var}\big(0.5\,(\text{estimated effect})\big) = 0.08333\,\sigma^2$
Observations per cell (X1 by X2), unbalanced design:

        High   Low
High       5     1
Low        1     5

Observations per cell (X1 by X2), balanced design:

        High   Low
High       3     3
Low        3     3
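The variance factors 0.15 and 0.08333 can be recomputed from the cell counts alone. The sketch below builds the design rows from the counts, forms X'X, and inverts it with exact fractions. It assumes the design matrix columns are (1, X1, X2, X1X2) and that the first count is the High/High cell; the function names are illustrative, not taken from the slides.

```python
from fractions import Fraction

def design_rows(n_hh, n_hl, n_lh, n_ll):
    """Rows (1, X1, X2, X1*X2) for a two-factor design coded -1 (low), +1 (high)."""
    cells = [((1, 1), n_hh), ((1, -1), n_hl), ((-1, 1), n_lh), ((-1, -1), n_ll)]
    return [[1, x1, x2, x1 * x2] for (x1, x2), n in cells for _ in range(n)]

def cross_product(rows):
    """Compute X'X from the list of design rows."""
    k = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]

def inverse(matrix):
    """Invert a small nonsingular matrix by Gauss-Jordan elimination in exact fractions."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

unbalanced = inverse(cross_product(design_rows(5, 1, 1, 5)))
balanced = inverse(cross_product(design_rows(3, 3, 3, 3)))

# The diagonal entry for X1 multiplies sigma^2 to give Var(estimated coefficient).
print("unbalanced:", float(unbalanced[1][1]), " balanced:", float(balanced[1][1]))
```

In the unbalanced design X1 and X2 are correlated (most observations have both high or both low), which inflates the variance factor from 1/12 to 3/20; the balanced design makes the columns orthogonal.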