1
Powerpoint 2
REGRESSION, REGRESSION, REGRESSION!
2
REGRESSION / CORRELATION
Objective: To measure the degree of association between variables and/or to predict the value of one variable from knowledge of the values of (an)other variable(s).
Relationships: (1) Functional, (2) Statistical
3
Functional Relationship:
Y = f(X), an exact relationship; no “error”.
e.g., Y = -25 + .10X
Y = $ savings; X = $ spent at B&N during the year
(joining the Barnes & Noble book club)
4
Statistical Relationship: (true only “on the average”)
Y = PRODUCTION vs. X = LABOR HOURS: Linear
Y = PHYSICAL ABILITY vs. X = AGE: Non-linear (upside-down U-shape)
5
Consider the following data, which represent the sales of a product (adjusted for trend) over the last 8 sales periods:
Y = sales (millions), last 8 sales periods:
116 109 117 112 122 113 108 115
Ȳ = 114 (average of the 8 sales amounts)
What would (should) one predict for the next sales period? Probably, one would be hard pressed, in this case, to justify choosing other than Ȳ = 114. How good will this prediction be?
6
WE DON’T
KNOW!!!!!
7
But, we can get an idea by looking at how well we would have done, had we been using this 114 all along:

TSS = Total Sum of Squares

TSS = Σ (from j=1 to n) (Yj − Ȳ)² = 144

 Y      Ȳ     (Y − Ȳ)   (Y − Ȳ)²     [Y − Ȳ = prediction error/residual]
116    114       2          4
109    114      -5         25
117    114       3          9
112    114      -2          4
122    114       8         64
113    114      -1          1
108    114      -6         36
115    114       1          1
Ȳ = 114   sum:   0    TSS = 144
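The TSS tally above can be sketched in a few lines of Python (an illustration only; the course itself does these computations in Excel or SPSS):

```python
# Sales for the last 8 periods, from the slide above.
sales = [116, 109, 117, 112, 122, 113, 108, 115]

y_bar = sum(sales) / len(sales)          # the mean prediction, 114
errors = [y - y_bar for y in sales]      # prediction errors / residuals
tss = sum(e ** 2 for e in errors)        # Total Sum of Squares = 144
```

The errors sum to zero (a property of the mean), while their squares accumulate to TSS = 144.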
8
Two ways to look at the “TSS”:
1) A measure of the “mis-prediction” (prediction error) using Ȳ as the predictor.
2) A measure of the “Total Variability in the System” (the amount by which the 8 data values aren’t all the same).
When the TSS is larger (when the data vary more), you have more reason to investigate what is driving Y.
9
Consider using X, advertising, to “help” predict Y:

[Scatter Diagram: Y (105 to 125) plotted against X (0 to 4)]

Y: 116 109 117 112 122 113 108 115
X:   2   1   3   1   4   2   1   2

Ȳ = 114, X̄ = 2
10
Consider a Linear or Straight Line Statistical relationship between the two variables, and then consider finding the “best fitting line” to the data. Call this line:
Yc = a+bX
Yc = “Computed Y” or “Predicted Y”
Y is called the Dependent Variable; X is called the Independent Variable.
11
What do we mean by “best fitting”?
Answer: The “Least Squares” line, i.e., the line which minimizes the sum of the squares of the distances from the “dots”, Y, to the “line”, Yc. Hence, the MATH problem is to minimize

Σ (from j=1 to n) (Yj − Ycj)²

[Figure: a data point such as Y1 = 7 versus its fitted value Yc1 = 5 at X1]
12
To find this Least Squares line, we theoretically need calculus.
However, as a practical matter, every text gives the answer, and, more importantly, we will get the result using Excel, or SPSS, or other software - NOT “BY HAND.”
(There is an arithmetic formula for “b” and “a” in terms of the sum of the X’s, the sum of the Y’s, the sum of the X•Y’s, etc., but with software available, we never use it.)
13
[Plot: the data with the least squares line Yc = 106 + 4X; Y axis 105 to 125, X axis 0 to 4]
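As a check on the “NOT BY HAND” point, NumPy’s `polyfit` (a sketch; any regression routine would do) recovers the same line from the 8 data points:

```python
import numpy as np

y = np.array([116, 109, 117, 112, 122, 113, 108, 115], dtype=float)
x = np.array([2, 1, 3, 1, 4, 2, 1, 2], dtype=float)

# Degree-1 least squares fit: returns (slope, intercept)
b, a = np.polyfit(x, y, deg=1)   # b = 4, a = 106, i.e., Yc = 106 + 4X
```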
14–22
[Excel regression output for this example, showing the intercept and slope]
23
So, using X in the best way, we have a prediction line of Yc = 106 + 4X. How good are the predictions we’ll get using this line? Suppose we had been using it:

 Y     X    Yc = 106+4X   (Y−Yc)   (Y−Yc)²      (Y−Ȳ)   (Y−Ȳ)²
116    2       114           2        4            2        4
109    1       110          -1        1           -5       25
117    3       118          -1        1            3        9
112    1       110           2        4           -2        4
122    4       122           0        0            8       64
113    2       114          -1        1           -1        1
108    1       110          -2        4           -6       36
115    2       114           1        1            1        1
            sums:            0    SSE = 16      0     TSS = 144
24
So, SSE = Σ(Y − Yc)² = 16.
SSE = Sum of Squares “due to error”
That is, we use X in the best way possible, and still do not get perfect prediction. The amount of “mis-prediction” still remaining, measured by sum of squares, is 16. This must be due to factors other than advertising (X). (Perhaps: size of sales force, number of retail outlets, strategy of competition, interest rates, etc.)
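The SSE column above can be reproduced directly from the fitted line (a Python sketch of the same arithmetic):

```python
sales = [116, 109, 117, 112, 122, 113, 108, 115]
adv   = [2, 1, 3, 1, 4, 2, 1, 2]

yc = [106 + 4 * x for x in adv]                       # predictions from Yc = 106 + 4X
sse = sum((y - p) ** 2 for y, p in zip(sales, yc))    # Sum of Squares "due to error" = 16
```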
25
We call all these other factors “ERROR”. That is, “error” is the collective name of all variables (factors) not used in making the prediction.
SSE is also called “SUM OF SQUARED RESIDUALS” or “RESIDUAL SUM OF SQUARES”.
26
We have TSS = 144 and SSE = 16.
TSS - SSE = 128
What happened to the other 128? We call this “SSA”: (“SSR” in text)
SSA = TSS - SSE = 128
SSA = Sum of squares “due to X” or “Attributed to X”.
27
So, TSS = SSA + SSE

Total Variability = Variability Attributed to X + Variability due to ERROR
28
We have

r² = SSA / TSS = 128/144 = .89

r² is called the “Coefficient of Determination”, and is interpreted as the “proportion of variability in Y explained by X” or “... explained by the relationship between Y and X expressed in the regression line”.
29
0 ≤ r² ≤ 1, where r² = SSA/TSS.

Of course, 1 − r² = SSE/TSS = .11, interpreted as the proportion of variability in Y unexplained by X (and still present).

Define r = SQRT(r²). r = correlation (coefficient). Here r = SQRT(.89) = .943.
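The r² arithmetic, in one short sketch:

```python
tss, sse = 144, 16
ssa = tss - sse          # 128: variability attributed to X
r2 = ssa / tss           # coefficient of determination, about .89
r = r2 ** 0.5            # correlation; takes the sign of b (here b = +4, so r = +.943)
```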
30
But, r can be + or - !!
SQRT(.89) = +.943 or -.943.
It takes on the sign of b in Yc = a+bX.
A value of r near 1 or -1 is suggestive of a strong linear relationship between Y and X. A value of r near 0 is suggestive of no linear relationship between Y and X.
-1 ≤ r ≤ 1
31
Note that the sign of r indicates the direction of the relationship (if any). A “+” indicates that Y and X move in the same direction; a “-” indicates that they move in opposite directions. Some people refer to a positive r as a “positive relationship” and a negative r as an “inverse relationship”.
32
[Six scatter plots of Y against X, illustrating r = +1, r = −1, r = +.8, r = −.65, and two different patterns with r = 0]
33
Note that a high r2 does not necessarily mean CAUSE/EFFECT.
Frequently we have “spurious correlations” – two variables which are highly related in terms of r2, but only because they are both “driven” by a third variable.
“Classic” example:
Number of TEACHERS vs. Number of quarts of LIQUOR SOLD
34
35
[Output: R, R², SSA, SSE, TSS]
36
THE MODEL
In order to get a measure of prediction error (e.g., confidence intervals, hypothesis testing), we must make some assumptions about the distribution of points scattered about the regression line. These assumptions are usually couched in what is called a “statistical model.”
37
We specify μY•X = A + BX

where μY•X is the mean or average value of Y for a given X. We have a (true) slope of B and (true) intercept of A; A and B are parameters, the exact values of which we’ll never know.
38
This says that if we set X = 1 (for example) and sample an infinite number of Y’s (hence finding μY•1), and then set X = 2 and find μY•2, X = 3 and find μY•3, etc., all the μY•X fall exactly on a straight line.

[Figure: the (TRUE) average of Y, μY•X, plotted against X]
39
But, we never find μY•X. For a given X, we observe a value of Y which differs from μY•X, in the same way that when we observe any random variable value, it does not equal μ but is some point governed by some probability law.

[Figure: density f(Y) centered at μY•X]
40
The way we write this in a formal way is:

Y = μY•X + ε = A + BX + ε

where ε is the difference between an individual Y and the mean of Y, all given a specific X. ε is, basically, the collective impact of all the factors other than X.
41
Example: Suppose that Y = weight, X = height, and μY•X=70″ = 160 lbs.
Then a person 70″ tall with a weight of 168 pounds has a “personal ε” of 8 lbs. If his/her weight were 158 lbs., his/her personal ε would be -2 pounds.
Of course, since ε = Y − μY•X, and we don’t know μY•X, we don’t really know anybody’s personal ε.
42
We find the LS line,
Yc = a + bX
a estimates A
b estimates B
Yc estimates μY•X, and Y itself.
43
We usually make the following assumptions,
which are called
“the standard assumptions.”
1) NORMALITY2) HOMOSCEDASTICITY3) INDEPENDENCE
44
Assumption 1:
Given a value of X, the probability distribution of Y is normal.
(e.g., with Y = weight and X = height, for any given height (say 70″), the Y’s are normal around μY•X=70 (say, 160 lbs.))

[Figure: normal curve of Y centered at 160]
45
Assumption 2:
The standard deviation of ε (which we don’t know), usually called σY•X, is constant for all values of X. The characteristic of having σY•X constant is referred to as “Homoscedasticity.”
47
48
Combining assumptions 1 & 2, we have the Y’s being normally distributed with μY•X as mean (and, correspondingly, average error of 0) and constant standard deviation σY•X.
Of course, as you know, neither μY•X nor σY•X is known.
μY•X is estimated by Yc = a + bX.
σY•X is estimated by “Sy•x”.
Sy•x is called the “Standard Error of Estimate,”

Sy•x = SQRT( SSE / (n − 2) )
49
The SSE makes intuitive sense, in that SSE is a variability due to error. The [n-2] (instead of [n-1], the denominator of S in most previous applications) is really a degrees of freedom number. The df = n minus a degree of freedom for each parameter estimated from the data. Here, there are 2 such parameters, A and B (estimated by a and b, respectively).
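For our example (SSE = 16, n = 8), the Standard Error of Estimate can be sketched as:

```python
sse, n = 16, 8
s_yx = (sse / (n - 2)) ** 0.5   # sqrt(SSE / df) with df = n - 2; about 1.63
```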
50
Later, when we have a model of
Y = A + B1X1 + B2X2 + ε,
the df will be [n − 3].
We usually get Sy•x from the computer output.
Here, Sy•x = 1.63 (see output on next page).
51
52
[Output with Sy•x highlighted]
53
Assumption 3:
The Y values are independent of one another. (This is often a problem when the data form a time series).
In the real world these assumptions may never be exactly true, but are often close enough to true to make our statistical analysis (which follows) valid.
Investigation has shown that moderate departures from assumptions 1 and 2 do not appreciably affect results (i.e., assumptions 1 and 2 are “Robust”). In terms of large departures, there are ways to recognize them and do the appropriate (but more complex) analysis.
54
CONFIDENCE INTERVALS
95% Confidence Intervals for A and B
55
56
[Output: the confidence intervals for A and B, now added to the earlier output (which you had before)]
57
Of greater interest (usually) is a confidence interval for the prediction of the next “period.” This is done by:

Yc ± t(1−α/2) • Sy•x, with (n−2) df

This formula is an excellent approximation when n is “large” (virtually always in MK) and the value of X at which we are predicting isn’t dramatically far from the center [X̄] of our data.

Recall: Yc = 106 + 4X. For 95% confidence, and X = 3, we have:

118 ± 2.447(1.63), or 118 ± 3.99     [2.447 = TINV(.05, 6)]
58
(EXCEL COMMAND)
TINV(.05, 6) = 2.447
In general: TINV(α, df)
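With SciPy available (an assumption; Excel’s TINV gives the same number), the interval at X = 3 can be sketched as:

```python
from scipy.stats import t

t_crit = t.ppf(1 - 0.05 / 2, df=6)   # two-sided 95%, 6 df; matches TINV(.05, 6) = 2.447
yc = 106 + 4 * 3                      # prediction at X = 3, i.e., 118
half_width = t_crit * 1.63            # t * Sy.x, about 3.99
```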
59
Hypothesis Testing
To test: H0: B = 0 vs. H1: B ≠ 0   (Note: B = 0 is the same as “X & Y NOT RELATED”)

With Y = A + BX + ε, we compute

tcalc = (b − BH0) / sb = (b − 0) / sb

and accept H0 if |tcalc| < t(1−α/2) with (n−2) df;
reject H0 if |tcalc| > t(1−α/2) with (n−2) df.
60
If α = .05, we have t = 2.447 with 6 df.
In our problem, tcalc = 6.93 (see output on next page), and we reject H0.
We’ll refer to this as the “t-test.”
(All we really need to do is to examine the p-value.)
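The deck reads tcalc off the output; for completeness, here is a sketch of how that 6.93 arises, using the usual simple-regression formula sb = Sy•x / SQRT(Σ(x − x̄)²):

```python
x = [2, 1, 3, 1, 4, 2, 1, 2]
n = len(x)

x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)   # sum of squared x-deviations = 8
s_yx = (16 / (n - 2)) ** 0.5               # standard error of estimate, about 1.63
s_b = s_yx / sxx ** 0.5                    # standard error of the slope b
t_calc = (4 - 0) / s_b                     # b = 4 under H0: B = 0; about 6.93
```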
61
62
P-value (called “significance” by SPSS)
63
Here, where μY•X = A + BX, there’s only one B, and thus the H’s above are the same as the previous
H0: B = 0, H1: B ≠ 0.
To test H0: all B’s = 0 vs. H1: not all B’s = 0, we have a different procedure.
64
However, for the future, where μY•X = A + B1X1 + B2X2, and “all B’s = 0” means B1 = B2 = 0, and there is a difference between “B = 0” and “all B’s = 0,” we introduce:
H0: all B’s = 0
H1: not all B’s = 0
65
To test the above, we determine
Fcalc
We get Fcalc from the output!!! Yeah!!!!
66
And we accept H0 if Fcalc < F(1−α) with (1, n−2) df;
reject H0 if Fcalc > F(1−α) with (1, n−2) df,
where F(1−α) is the appropriate value from the F table.
More easily: examine the p-value of the F-test (next page).

[F distribution figure: α = 0.05, critical value F.95(1, 6) = 5.99]
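For this example, the F statistic can be sketched from the sums of squares already computed (note that in simple regression Fcalc equals tcalc²):

```python
tss, sse, n = 144, 16, 8
ssa = tss - sse
f_calc = ssa / (sse / (n - 2))   # MSA / MSE with (1, n-2) df; 48, well above 5.99
```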
67
68
Fcalc and p-value
69
MULTIPLE REGRESSION
When there is more than one independent variable (X), we call our regression analysis by the term “Multiple Regression.” With a single independent variable, we call it “Simple Regression.”
70
μY•X = A + B1X1 + B2X2 + • • • + Bk−1Xk−1
Y = μY•X + ε
Least Squares hyperplane (“line”):
Yc = a + b1x1 + b2x2 + • • • + bk−1xk−1
NOTE: k−1 = Number of X’s; k = Number of parameters
71
Example:
Y = Job Performance
X1 = Score on (entrance) Test 1
X2 = Score on Test 2
X3 = Score on Test 3
or
Y = Sales
X1 = Advertising
X2 = Number of sales people
X3 = Number of competitors
We assume that Computer software gives us all (or nearly all) the numerical results.
72
Typically, we wish to perform two types of Hypothesis Tests:
First: F-test (Y = A + B1X1 + • • • + Bk−1Xk−1 + ε)
H0: B1 = B2 = B3 = . . . = Bk−1 = 0
H1: not all B’s = 0
73
In “English”: H0: The X’s collectively do not help us predict Y.
H1: At least one of the darn X’s helps us predict Y!
We call this, reasonably so, a “TEST OF THE OVERALL MODEL”
H0 : B1 = B2 = B3 = . . . = Bk-1 = 0H1 : not all B’s = 0
74
If we accept H0 that the X’s collectively do not help us predict Y, we probably discontinue formal statistical analysis.
However, if we reject H0 (i.e., the “F is significant”), then we are likely to want a series of t-tests:
H0: B1 = 0 vs. H1: B1 ≠ 0,
H0: B2 = 0 vs. H1: B2 ≠ 0,
• • •
H0: Bk−1 = 0 vs. H1: Bk−1 ≠ 0
75
These are called “Tests for individual X’s.” The test is answering: (using B1 as an example)
H0 : Variable X1 is NOT helping us predict Y, above and beyond the other variables in the model.
H1 : X1 IS INDEED helping us predict Y, above and beyond the other variables in the model.
76
Sometimes a result looks “strange”:
Y = weight, X1 = height, X2 = pant length
F-Test: SIGNIFICANT
t1: NOT SIGNIFICANT
t2: NOT SIGNIFICANT
So, note: we’re answering whether a variable gives us INCREMENTAL value.
77
Y = Weight, X1 = Height, X2 = Pant Length.
If I know a person’s X1, height, do I get additional predictive value about Y, weight, from knowing pant length? No; hence, we accept H0: B2 = 0 (t2 not sign.).
78
If I know X2, pant length, do I get additional predictive value about Y from knowing height? (Also) No; hence we accept H0: B1 = 0 (t1 not sign.).
79
When the X’s themselves are highly interrelated (the fact that leads to the strange-looking, but not really strange, result), we call this MULTICOLLINEARITY.
80
Y/X1: R² = .5
Y/X2: R² = .4
Y/X1, X2: R² = ?

Ans: between .5 and .9.
(In some unusual, “strange” cases, R² may exceed .9.)

If X1 and X2 are not overlapping in the information provided, R² = .9; if X2 tells us a total subset of what X1 tells us, R² = .5.
Another “look” at this issue:
81
If you have
Y/X1: R² = .70
Y/X2: R² = .72
Y/X1, X2: R² = .73,
then:
1) The F test is significant because the X’s together tell us (an estimate of) 73% of what’s going on with Y.
2) t1 (likely) not sign., because the gain of .01 (.73 − .72 [with only X2]) is judged by the t-test as too easily due to the “luck of the draw.” (Actually, it depends on the sample size.)
3) t2, similarly.
82
Example: Y = Job performance; X1 = Test 1 score; X2 = Test 2 score; X3 = Test 3 score.

 X1    X2    X3     Y
100    95    87    88
 99    99    98    80
101   103   101    96
 93    95    91    76
 95   102    88    80
 95    94    84    73
  .     .     .     .
n = 25
83
X1 X2 X3 Y
84
85
86
LEAST SQUARES LINE
So, Yc =
-106.34 + 1.02•X1 + .137•X2 + .87•X3
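A hedged sketch of how such a hyperplane is fit, using `numpy.linalg.lstsq` on the six illustrative rows shown earlier (the full n = 25 dataset is not reproduced in the deck, so these coefficients will not match −106.34, 1.02, .137, .87):

```python
import numpy as np

# The six data rows shown on the earlier slide (X1, X2, X3 | Y)
X = np.array([[100,  95,  87],
              [ 99,  99,  98],
              [101, 103, 101],
              [ 93,  95,  91],
              [ 95, 102,  88],
              [ 95,  94,  84]], dtype=float)
y = np.array([88, 80, 96, 76, 80, 73], dtype=float)

A = np.column_stack([np.ones(len(y)), X])      # prepend the intercept column
coefs, *_ = np.linalg.lstsq(A, y, rcond=None)  # (a, b1, b2, b3)
y_hat = A @ coefs                              # fitted values
```

A least squares fit always leaves residuals orthogonal to every predictor column, which is a handy sanity check on the result.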
88
To test: H0: B1 = B2 = B3 = 0 vs. H1: not all B’s = 0, α = .05.
From the output: Fcalc = 47.598.
Since p-value = .000000001528 < .05, we reject H0.
89
To Test:
H0: B1 = 0 vs. H1: B1 ≠ 0;  H0: B2 = 0 vs. H1: B2 ≠ 0;  H0: B3 = 0 vs. H1: B3 ≠ 0
We have
tcalc1 = 3.65 (p = .0015); tcalc2 = .80 (p = .4314); tcalc3 = 3.57 (p = .0018)
t(1−α/2) = 2.08, α = .05, with 21 df = 25 − 4
[Accept H0 if tcalc falls between −2.08 and 2.08]
90
For B1 and B3 we reject H0; for B2 we accept H0.
Conclusion in Practical Terms?
X1 (Test 1) and X3 (Test 3) each gives us incremental predictive value about PERFORMANCE, Y.
X2 (Test 2) is either irrelevant or redundant.
91
An added benefit of the analysis was to indicate how the tests should be weighted: the best fit occurs if the tests are weighted
(1.02, .137, .87)
(assuming we retain Test 2). This is equivalent to weights of
(1.02/2.027, .137/2.027, .87/2.027), where 2.027 = 1.02 + .137 + .87,
or
(.50, .07, .43).
The present weights were (1/3, 1/3, 1/3).
92
“PROBLEM IN NOTES”
Consider the following model: Y = A + B1X1 + B2X2 + B3X3 + ε
Y = Sales Revenue (in units of $100,000)
X1 = Expenditure on TV advertising (in units of $10,000)
X2 = Expenditure on Web advertising (in units of $10,000)
X3 = Expenditure on Newspaper advertising (in units of $10,000)
Refer to the computer output following the questions.
1. What is the least squares line (hyperplane)?
2. What revenue do I expect (in dollars) with no advertising in any of the three media?
3. If $10,000 more were allocated to advertising, which medium should receive it to generate the most additional revenue?
93
4) What percent of the variability in revenue is due to factors other than the expenditures in the three advertising media?
5) If management decided to spend the same amount of money on each of the three types of media, how much total money would have to be spent to generate an expected revenue of $40,000,000?
6) Test H0: B1 = B2 = B3 = 0 vs. H1: not all B’s = 0, at α = .05. What is your conclusion in practical terms?
7) For each variable, test H0: B = 0 vs. H1: B ≠ 0, at α = .05. What are your conclusions in practical terms?
94
95
Dummy Variables (Indicator, Categorical)

Ex: Y = A + B1X1 + B2X2 + ε
Y = $ spent on DVDs/mo.
X1 = Disposable Income / yr.
X2 = Sex: Male X2 = 1, Female X2 = 0
96
We Get Yc = a + b1X1+ b2X2
For any given X1, income, we predict Y as follows:
Male: Yc = a + b1X1 +b2(1) = a + b1X1 + b2
Female: Yc = a + b1X1 +b2(0) = a + b1X1 + 0
How is b2 to be interpreted?
We can test, of course, Ho: B2 = 0 vs. H1: B2 ≠ 0.
97
Ans: The (estimated) amount spent by a Male, above that which would be spent by a Female, given the same X1 value (income). (Of course, if b2 is negative, it says that we estimate that Females spend more than Males, at equal incomes.)
If we had defined X2 = 1 for F’s
X2 = 0 for M’s ,
then b2 would reverse sign, and have the opposite meaning.
98
Remember that a variable is a “dummy” variable because of definition and interpretation. The computer treats a variable whose values are 0 and 1, just like any other variable.
Our data are, perhaps,
Y X1 X2
20 50 1
18 40 1
33 65 0
24 49 0
21 62 1
• • •
• • •
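A sketch confirming the “just like any other variable” point, fit on the five toy rows above (income value 55 below is an arbitrary choice for illustration):

```python
import numpy as np

y  = np.array([20, 18, 33, 24, 21], dtype=float)   # $ spent on DVDs/mo.
x1 = np.array([50, 40, 65, 49, 62], dtype=float)   # income
x2 = np.array([ 1,  1,  0,  0,  1], dtype=float)   # dummy: 1 = Male, 0 = Female

A = np.column_stack([np.ones(len(y)), x1, x2])
(a, b1, b2), *_ = np.linalg.lstsq(A, y, rcond=None)

# At any common income, the Male and Female predictions differ by exactly b2:
male   = a + b1 * 55 + b2
female = a + b1 * 55
```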
99
Note that we have 2 categories, (M, F), but only one dummy variable.
This is necessary, as is the general situation of C categories, (C−1) dummy variables.
This is because of computational issues involved in matrix inversion: with all C dummies (plus an intercept), the columns would be linearly dependent.
100
Example
YC = a + b1X1 + b2X2 + b3X3 + b4X4 + b5X5
Y = Water Usage; X1 = Temp.; X2 = Amount Produced; X3 = # People Employed

          X4   X5
Plant 1    1    0
Plant 2    0    1
Plant 3    0    0
101
Let a + b1X1 + b2X2 + b3X3 = G
Then we predict: (for a given X1, X2 , X3)
FOR PLANT 1: G + b4(1) + b5(0) = G + b4
FOR PLANT 2: G + b4(0) + b5(1) = G + b5
FOR PLANT 3: G + b4(0) + b5(0) = G
How do we interpret b4? b5?
102
STEPWISE REGRESSION
A “variation” of multiple regression to pick the “best” model.
103
Y/X1, X2, X3, X4
Step 1:
Internal: Run separate simple regressions with each X; pick the best (best = highest R²):
           R²
Y/X1      .45
Y/X2      .50
Y/X3      .48
Y/X4      .28
External: Y/X2, R² = .50
104
Step 2:
Internal:
              R²
Y/X2, X1     .59
Y/X2, X3     .68
Y/X2, X4     .70
External: Y/X2, X4, R² = .70
105
Step 3:
Internal:
                  R²
Y/X2, X4, X1     .77
Y/X2, X4, X3     .73
External: Y/X2, X4, X1, R² = .77

NOTE: If at any stage, the best variable to enter is not significant by the t-test, the ALGORITHM STOPS (and does not bring that variable in!!!). You select a p-value (p_in), and if the p-value of the entering variable > p_in (i.e., the variable is not significant), the variable does not enter and the algorithm stops.
106
Also: there’s a step 3b (and 4b, 5b, etc.)
Step 3b) Now that we’ve entered our third variable, the software goes back and re-examines previously entered variables to see if any should be DELETED. (Specify a “p to go out,” p_out, so that if the p-value of a variable in our model > p_out, the variable is deleted.)
Algorithm continues until it stops!!!!
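The forward part of the algorithm can be sketched as follows. This is an illustration only: the entry rule is simplified to a minimum R² gain (`min_gain`) instead of the deck’s t-test with p_in, and no deletion (p_out) step is included:

```python
import numpy as np

def r_squared(cols, y):
    """R-squared of the least squares fit of y on an intercept plus cols."""
    A = np.column_stack([np.ones(len(y))] + cols)
    coefs, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coefs
    return 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

def forward_stepwise(X, y, min_gain=0.01):
    remaining = list(range(X.shape[1]))
    chosen, best_r2 = [], 0.0
    while remaining:
        # "Internal" step: try each remaining X alongside those already entered
        scores = {j: r_squared([X[:, k] for k in chosen + [j]], y)
                  for j in remaining}
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best_r2 < min_gain:
            break                         # entering variable not worth it: STOP
        chosen.append(j_best)             # "External" step: the variable enters
        remaining.remove(j_best)
        best_r2 = scores[j_best]
    return chosen, best_r2
```

On synthetic data where only the first column drives Y, the procedure enters that column first and then stops.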
107
108
Output for the example with three tests and job performance
109–110
[Stepwise regression output: the KEY summary]
111
Variable 1: Y = GRADUATE GPA
Variable 2: X1 = UNDERGRAD GPA
Variable 3: X2 = QUANTITATIVE GMAT
Variable 4: X3 = VERBAL GMAT
Variable 5: X4 = COLLEGE SELECTIVITY (0 = Less Selective, 1 = More Selective)

  Y      X1      X2      X3     X4
3.50    3.60   600.00  580.00   0.00
3.90    3.60   680.00  670.00   1.00
  .       .       .       .      .
3.20    2.90   440.00  430.00   1.00
112
Detailed Summary of Stepwise Analysis

Step   Ent. Var.               LS Line                                     R²
1      X1 (UNDERGRAD GPA)      Yc = .85 + .73X1                           .609
2      X2 (QUANT GMAT)         Yc = .585 + .53X1 + .00165X2               .833
3      X4 (COLLEGE SEL.)       Yc = 1.197 + .309X1 + .00163X2 + .284X4    .915

STOP! If we bring in Verbal GMAT, R² = .919
113
PRACTICE PROBLEM
Y = COMPANY ABC’s SALES ($millions)
X1 = OVERALL INDUSTRY SALES ($billions)
X2 = COMPANY ABC’s ADVERTISING ($millions)
X3 = SPECIAL PROMOTION BY CHIEF COMPETITOR: 0 = YES, 1 = NO
A STEPWISE REGRESSION WAS RUN WITH THESE RESULTS:
STEP 1: VARIABLE ENTERING: X1, Yc = 205+16•X1, R2 = .48
STEP 2: VARIABLE ENTERING: X2, Yc = 183+11•X1+10•X2, R2 = .64
STEP 3: VARIABLE ENTERING: X3, Yc = 180+10•X1+8•X2+65•X3, R2 = .68
A) If ABC’s advertising is to be same next year as this year (i.e., X2 held constant), and we do not know (in advance) the value of X3, what would we predict to be the increase in ABC’s sales if overall industry sales (X1) increase by $1 billion?
a) 10 b) 11 c) 16
B) Based on the given information, we can conclude that the R2 between Y and X2 (the exact value of which we cannot determine from the given information) is between:
a) .16 and .48 b) .16 and .64 c) .48 and .64 d) none of these
C) Answer part B) if the regression results above were NOT part of a stepwise procedure, but simply a set of multiple regression results.