View
218
Download
2
Embed Size (px)
Citation preview
1
Module II
Lecture 4:F-Tests
Graduate School 2004/2005Quantitative Research Methods
Gwilym [email protected]
2
Plan: (1) Testing a set of linear restrictions –
the general case (2) Testing homogenous Restrictions (3) Testing for a relationship – Special
Case of Homogenous Restrictions (4) Testing for Structural Breaks
3
(1) Testing a set of linear Restrictions - The General Procedure
Suppose we want to test whether there are any country specific effects in the relationship between inflation and the money supply:
INFL = a + b MS + g1 COUNTRY1 + …. + g42 COUNTRY2
– I.e. we want to test the following null hypothesis:• H0: g1 = g2 = g3 =…. = g42 = 0
Then we can think of this as being equivalent to comparing two regressions, one restricted and one unrestricted:
4
The Unrestricted regression is:INFL = a + b MS + g1 COUNTRY1 + …. + g42
COUNTRY2
The Restricted regression is:INFL = a + b MS
We can test whether all the g coefficients equal zero using the F-test:
5
The General formula for F:
UU
URrdf
dfdf dfRSS
rRSSRSSFF
U /
/)(numerator
rdenominato
Where:
RSSU = restricted residual sum of squares
= RSS under H1
RSSR = unrestricted residual sum of squares
= RSS under H0
r = number of restrictions = diff. in no. parametersbetween restricted and unrestricted equations
dfu = df from unrestricted regression = n - k where k is allcoefficients including the intercept.
NB RSSR is always greater than RSSU since imposing a restriction on an equation can never reduce the RSS
6
Using the F-test:
If the null hypothesis is true (i.e. restrictions are satisfied) then we would expect the restricted and unrestricted regressions to give similar results– I.e. RSSR and RSSU will be similar– so we accept H0 when the test statistic gives a small
value for F. But if one of the restrictions does not hold, then the
restricted regression will have had an invalid restriction imposed upon it and will be mispecified. higher residual variation higher RSSR
– so we reject H0 when the test stat. gives a large value
7
Test Procedure: (i) Compute RSSU
– Run the unrestricted form of the regression in SPSS and take a note of the residual sum of squares = RSSU
(ii) Compute RSSR
– Run the restricted form of the regression in SPSS and take a note of the residual sum of squares = RSSR
(iii) Calculate r and dfU
(iv) Substitute RSSU, RSSR, r and dfU in the equation for F and find the significance level associated with the value of F you have calculated.
8
Example 1: Ho: no country effects (R and U regressions have the same dependent variable)
Model Summary
.373a .139 .132 1.895Model1
R R SquareAdjustedR Square
Std. Errorof the
Estimate
Predictors: (Constant), CNTRY_3, CNTRY_2,CNTRY_1, money supply
a.
ANOVAb
296.772 4 74.193 20.652 .000a
1835.811 511 3.593
2132.583 515
Regression
Residual
Total
Model1
Sum ofSquares df
MeanSquare F Sig.
Predictors: (Constant), CNTRY_3, CNTRY_2, CNTRY_1, money supplya.
Dependent Variable: inflationb.
Coefficientsa
3.281 1.219 2.692 .007
3.787E-02 .055 .037 .693 .489
-1.716 .692 -.127 -2.479 .014
-4.446 .562 -.330 -7.914 .000
-1.384 .573 -.103 -2.413 .016
(Constant)
money supply
CNTRY_1
CNTRY_2
CNTRY_3
Model1
B Std. Error
UnstandardizedCoefficients
Beta
Standardized
Coefficients
t Sig.
Dependent Variable: inflationa.
Step (i) RSSU = 1835.811
9
Step (ii) RSSR = 2097.722
Model Summary
.128a .016 .014 2.020Model1
R R SquareAdjustedR Square
Std. Errorof the
Estimate
Predictors: (Constant), money supplya.
ANOVAb
34.860 1 34.860 8.542 .004a
2097.722 514 4.081
2132.583 515
Regression
Residual
Total
Model1
Sum ofSquares df
MeanSquare F Sig.
Predictors: (Constant), money supplya.
Dependent Variable: inflationb.
Coefficientsa
1.031 1.000 1.031 .303
.132 .045 .128 2.923 .004
(Constant)
money supply
Model1
B Std. Error
UnstandardizedCoefficients
Beta
Standardized
Coefficients
t Sig.
Dependent Variable: inflationa.
10
Step (iii) r and dfu
r = number of restrictions
= difference in no. of parameters between
the restricted and unrestricted equations
= 3
dfu = df from unrestricted regression = nU - kU
where k is total number of all coefficients including the intercept
= 516 - 5 = 511
11
(iv) Substitute RSSU, RSSR, r and dfU in the formula for F
F = (RSSR - RSSU) / r = (2097.722 - 1835.811)/3
RSSU/dfU 1835.811 / 511
= 87.304 3.593 = 24.298 dfnumerator = r = 3 dfdenominator = dfU = 511 From Tables, we know that at P = 0.01, the value for
F[3,511] would be 3.88 (I.e. Prob(F > 3.88) = 0.01) F we have calculated is > 3.88, so we know that P < 0.01
(I.e. Prob(F > 24.298) <0.01) Reject Ho
12
F Critical ValuesDegrees of freedom in the numerator
p 1 2 3 4 5 6 7
0.100 2.73 2.33 2.11 1.97 1.88 1.80 1.75200 0.050 3.89 3.04 2.65 2.42 2.26 2.14 2.06
0.025 5.10 3.76 3.18 2.85 2.63 2.47 2.350.010 6.76 4.71 3.88 3.41 3.11 2.89 2.730.001 11.15 7.15 5.63 4.81 4.29 3.92 3.65
1000 0.100 2.71 2.31 2.09 1.95 1.85 1.78 1.720.050 3.85 3.00 2.61 2.38 2.22 2.11 2.020.025 5.04 3.70 3.13 2.80 2.58 2.42 2.300.010 6.66 4.63 3.80 3.34 3.04 2.82 2.660.001 10.89 6.96 5.46 4.65 4.14 3.78 3.51D
egre
es o
f fre
edom
in th
e de
nom
inat
or
13
F Critical ValuesDegrees of freedom in the numerator
p 1 2 3 4 5 6 7
0.100 2.73 2.33 2.11 1.97 1.88 1.80 1.75200 0.050 3.89 3.04 2.65 2.42 2.26 2.14 2.06
0.025 5.10 3.76 3.18 2.85 2.63 2.47 2.350.010 6.76 4.71 3.88 3.41 3.11 2.89 2.730.001 11.15 7.15 5.63 4.81 4.29 3.92 3.65
1000 0.100 2.71 2.31 2.09 1.95 1.85 1.78 1.720.050 3.85 3.00 2.61 2.38 2.22 2.11 2.020.025 5.04 3.70 3.13 2.80 2.58 2.42 2.300.010 6.66 4.63 3.80 3.34 3.04 2.82 2.660.001 10.89 6.96 5.46 4.65 4.14 3.78 3.51D
egre
es o
f fre
edom
in th
e de
nom
inat
or
14
Alternatively use Excel calculator:F-Tests.xls
Restricted ModelANOVAModel Sum of Squaresdf Mean SquareF Sig.
1 Regression 34.86042 1 34.86042 8.541767 0.003624Residual 2097.722 514 4.081172Total 2132.583 515
a Predictors: (Constant), money supplyb Dependent Variable: inflation
Unrestricted ModelANOVAModel Sum of Squaresdf Mean SquareF Sig.
1 Regression 296.7719 4 74.19296 20.65169 0Residual 1835.811 511 3.592585Total 2132.583 515
a Predictors: (Constant), CNTRY_3, CNTRY_2, CNTRY_1, money supplyb Dependent Variable: inflation
First Paste ANOVA tables of U and R models:
15
Second, check cell formulas, & let Excel do the rest:
F Test
r = k U - k R = 3
k u = 5
n = 516
df u = n - k u = 511
= 511
F = 24.30111338
Sig F = 1.02857E-14
UU
URrdf dfRSS
rRSSRSSF
U /
/)(
16
Example 2: Ho: b2 + b3 = 1(R and U regressions have different dependent variables)
(i) Compute RSSU
– Run the unrestricted form of the regression in SPSS and take a note of the residual sum of squares = RSSU
(ii) Compute RSSR
– Run the restricted form of the regression in SPSS by:• substituting the restrictions into the equation
• rearrange the equation so that each parameter appears only once
• create new variables where necessary and estimate by OLS
– and take a note of the residual sum of squares = RSSR
(iii) Calculate r and dfU
(iv) Calculate F and find the significance level
17
Unrestricted regression:
y = b1 + b2x2 + b3x3 + u• H0: b2 + b3 = 1; • If H0 is true, then: b3 = 1 - b2 and:
y = b1 + b2x2 + (1-b2)x3 + u
= b1 + b2x2 + x3- b2x3 + u
= b1 + b2(x2 - x3)+ x3 + u
y - x3= b1 + b2(x2 - x3)+ u Restricted regression: z = b1 + b2(v)+ u where z = y - x3; v = x2 - x3
18
Example 3: Ho: b2 = b3
(R and U regressions have the same dependent variable) Unrestricted regression:
y = b1 + b2x2 + b3x3 + u• H0: b2 = b3; H1: b2 b3
• If H0 is true, then:
y = b1 + b2x2 + b2x3 + u
= b1 + b2(x2 + x3) + u
Restricted regression: y = b1 + b2(w)+ u where w = x2 + x3;
19
Example 4: Ho: b3 = b2 + 1
(R and U regressions have the different dependent variables) Unrestricted regression:
Infl = b1 + b2MS_GDP + b3MP_GDP + u• Ho: b3 = b2 + 1
• If H0 is true, then:
Infl = b1 + b2MS_GDP + (b2+1)MP_GDP + u
= b1 + b2MS_GDP + b2 MP_GDP + 1MP_GDP + u
= b1 + b2(MS_GDP + MP_GDP) + MP_GDP + u
Infl - MP_GDP = b1 + b2(MS_GDP + MP_GDP) + u
Restricted regression: z = b1 + b2(v)+ u where z = Infl - MP_GDP ;
v = MS_GDP + MP_GDP
20
Step (i) RSSU = 2069.060ANOVAb
63.523 2 31.761 7.875 .000a
2069.060 513 4.033
2132.583 515
Regression
Residual
Total
Model1
Sum ofSquares df
MeanSquare F Sig.
Predictors: (Constant), MP_GDP, MS_GDPa.
Dependent Variable: inflationb.
Model Summary
.173a .030 .026 2.008Model1
R R SquareAdjustedR Square
Std. Errorof the
Estimate
Predictors: (Constant), MP_GDP, MS_GDPa.
Coefficientsa
-6.706 3.419 -1.961 .050
3.836 1.573 .116 2.438 .015
7.543 4.018 .089 1.877 .061
(Constant)
MS_GDP
MP_GDP
Model1
B Std. Error
UnstandardizedCoefficients
Beta
Standardized
Coefficients
t Sig.
Dependent Variable: inflationa.
21
Step (ii) RSSR = 2070.305
SPSS syntax for creating Z and VCOMPUTE Z = Infl - MP_GDP. EXECUTE.COMPUTE V = MS_GDP + MP_GDP. EXECUTE.
SPSS syntax for Restricted Regression:
REGRESSION /MISSING LISTWISE /STATISTICS COEFF OUTS R ANOVA /NOORIGIN /DEPENDENT Z /METHOD=ENTER V .
SPSS ANOVA Output:
ANOVAb
55.701 1 55.701
2070.305 514 4.028
2126.006 515
Regression
Residual
Total
Model1
Sum ofSquares df
MeanSquare
Predictors: (Constant), Va.
Dependent Variable: Zb.
22
Step (iii) r and dfu
r = number of restrictions
= difference in no. of parameters between
the restricted and unrestricted equations
= 1
dfu = df from unrestricted regression = nU - kU
where k is total number of all coefficients including the intercept
= 516 - 3 = 513
23
(iv) Substitute RSSU, RSSR, r and dfU in the formula for F
F = (RSSR - RSSU) / r = (2070.305 - 2069.060)/1
RSSU/dfU 2069.060 / 513
= 1.245 / 4.033 = 0.309 dfnumerator = r = 1 dfdenominator = dfU = 513 From Excel =FDIST(0.309,1,513), we know that Prob(F >
0.309) = 0.58 (I.e. 58% chance of Type I Error) – I.e. if we reject H0 then there is more than a one in
two chance that we have rejected H0 incorrectly
Accept H0 that b3 = b2 + 1
24
Using F-Tests.xls:
Restricted ModelANOVAModel Sum of Squaresdf Mean SquareF Sig.
1 Regression 55.7013 1 55.7013 13.82911 0.000222Residual 2070.305 514 4.02783Total 2126.006 515
a Predictors: (Constant), Vb Dependent Variable: Z
Unrestricted ModelANOVAModel Sum of Squaresdf Mean SquareF Sig.
1 Regression 63.52287 2 31.76143 7.874889 0.000428Residual 2069.06 513 4.033255Total 2132.583 515
a Predictors: (Constant), MP_GDP, MS_GDPb Dependent Variable: inflation
25
F Test
r = k U - k R = 1
k u = 3
n = 516
df u = n - k u = 513
= 513
F = 0.308674039
Sig F = 0.578737192
UU
URrdf dfRSS
rRSSRSSF
U /
/)(
26
(2) Testing a set of linear Restrictions - When the Restrictions are Homogenous
When linear restrictions are homogenous:– e.g. H0: b2 = b3 = 0
– e.g. H0: b2 = b3
we do not need to transform the dependent variable of the restricted equation.
For restrictions of this type – I.e. where the dependent variable is the same in the
restricted and unrestricted regressions
we can re-write our F-ratio test statistic in terms of R2s:
27
F-ratio test statistic for homogenous restrictions:
UU
RUrdf
dfdf dfR
rRRFF
U /1
/)(2
22numerator
rdenominato
Where:
RSSU = unrestricted residual sum of squares
= RSS under H1
RSSR = unrestricted residual sum of squares
= RSS under H0
r = number of restrictions = diff. in no. parametersbetween restricted and unrestricted equations
dfu = df from unrestricted regression = n - k where k is allcoefficients including the intercept.
28
Proof of simpler formula for homogenous restrictions:
If the dependent variable is the same in both the restricted and unrestricted equations, then the TSS will be the same
We can then make use of the fact that RSS = (1 - R2) TSS, which implies that:
RSSR = (1- RR2) TSS
RSSU = (1- RU2) TSS
29
Proof continued...
Substituting RSSR = (1- RR2) TSS and RSSU =
(1- RU2) TSS into our original formula for the
F-ratio, we find that:
UU
RU
UU
UR
UU
URrdf
dfR
rRR
dfR
rRR
dfTSSR
rTSSRTSSRF
U
/)1(
/)(
/)1(
/)11(
,/])1[(
/])1()1[(
2
22
2
22
2
22
30
Example 1: Ho: no country effects (R and U regressions have the same dependent variable)
Our approach to this restriction when we tested it above was to use the RSSs as follows:
Since it is a homogenous restriction (I.e. dep var is same in restricted and unrestricted models), we shall now attempt the same test but using the R2 formulation of the F-ratio formula:
F = (RSSR - RSSU) / r = (2097.722 - 1835.811)/3 RSSU/dfU 1835.811 / 511
= 24.301Prob(F > F[3,511] 24.298) = 1.028E-14 Reject H0
31
Unrestricted model: RU2 = 0.139
Restricted model: RR2 = 0.016
Model Summary
.373a .139 .132 1.895Model1
R R SquareAdjustedR Square
Std. Errorof the
Estimate
Predictors: (Constant), CNTRY_3, CNTRY_2,CNTRY_1, money supply
a.
Model Summary
.128a .016 .014 2.020Model1
R R SquareAdjustedR Square
Std. Errorof the
Estimate
Predictors: (Constant), money supplya.
32
F = (RU2 - RR
2) / r = (0.139 - 0.016)/3 = 0.041 = 24.301
(1-RU2)
/dfU (1- 0.139) / 511 0.0017
33
Restricted ModelANOVAModel Sum of Squaresdf Mean SquareF Sig.
1 Regression 34.86042 1 34.86042 8.541767 0.003624Residual 2097.722 514 4.081172Total 2132.583 515
a Predictors: (Constant), money supplyb Dependent Variable: inflation
Model SummaryModel R R Square Adjusted R SquareStd. Error of the Estimate
1 0.127854 0.016347 0.014433 2.020191a Predictors: (Constant), money supply
Unrestricted ModelANOVAModel Sum of Squaresdf Mean SquareF Sig.
1 Regression 296.7719 4 74.19296 20.65169 0Residual 1835.811 511 3.592585Total 2132.583 515
a Predictors: (Constant), CNTRY_3, CNTRY_2, CNTRY_1, money supplyb Dependent Variable: inflation
Model SummaryModel R R Square Adjusted R SquareStd. Error of the Estimate
1 0.373043 0.139161 0.132422 1.895412a Predictors: (Constant), CNTRY_3, CNTRY_2, CNTRY_1, money supply
34
F Test Statistic for homogenous restrictions:
r = k U - k R = 3
k u = 5
n = 516
df u = n - k u = 511
= 511
F = 24.30111
Sig F = 1.03E-14
UU
RUrdf
dfdf dfR
rRRFF
U /1
/)(2
22numerator
rdenominato
35
(3) Testing a set of linear Restrictions - When the Restrictions say that bi = 0 i
A special case of homogenous restrictions is where we test for the existence of a relationship– I.e. H0: all slope coefficients are zero:
Unrestricted regression:
y = b1 + b2x2 + b3x3 + u• H0: b2 = b3 = 0; • If H0 is true, then y = b1
In this case, Restricted regression does no explaining at all and so RR
2 = 0
36
And the homogenous restriction F-ratio test statistic reduces to:
This is the F-test we came across in MII Lecture 2, and is the one automatically calculated in the SPSS ANOVA table
UU
U
UU
Urdf
dfdf
dfR
rR
dfR
rRFF
U
/1
/)(
/1
/)0(
2
2
2
2numerator
rdenominato
where,
r = k -1
dfU = n - k
37
(4) Testing for Structural Breaks
The F-test also comes into play when we want to test whether the estimated coefficients change significantly if we split the sample in two at a given point
These tests are sometimes called “Chow Tests” after one of its proponents.
There are actually two versions of the test:– Chow’s first test– Chow’s second test
38
(a) Chow’s First TestUse where n2 > k
(1) Run the regression on the first set of data (i = 1, 2, 3, … n1) & let its RSS be RSSn1
(2) Run the regression on the second set of data (i = n1+1, n1+2, …, end of data) & let its RSS be RSSn2
(3) Run the regression on the two sets of data combined (i = 1, …, end of data) & let its RSS be RSSn1 + n2
39
(4) Compute RSSU, RSSR, r and dfU:
– RSSU = RSSn1 + RSSn2
– RSSR = RSSn1 + n2
– r = k = total no. of coeffts including the constant
– dfU = n1 + n2 -2k
(5) Use RSSU, RSSR, r and dfU to calculate F using the general formula for F and find the sig. Level:
UU
URrdf
dfdf dfRSS
rRSSRSSFF
U /
/)(numerator
rdenominato
40
(b) Chow’s Second TestUse where n2 < k (I.e. when you have insufficient observations on 2nd subsample to do Chow’s 1st test)
(1) Run the regression on the first set of data (i = 1, 2, 3, … n1) & let its RSS be RSSn1
(2) Run the regression on the two sets of data combined (i = 1, …, end of data) & let its RSS be RSSn1 + n2
41
(3) Compute RSSU, RSSR, r and dfU:
– RSSU = RSSn1
– RSSR = RSSn1 + n2
– r = n2
– dfU = n1 - k
(4) Use RSSU, RSSR, r and dfU to calculate F using the general formula for F and find the sig.:
UU
URrdf
dfdf dfRSS
rRSSRSSFF
U /
/)(numerator
rdenominato
42
Example of Chow’s 1st Test:n1: before 1986: n2: 1986 and after
-8.183 4.140
3.648 4.043
8.767 3.617
-3.406 1.547
-7.164 .551
-3.585 .544
.214 .536
.320 .538
.873 .567
-7.85E-02 .539
9.764E-02 .542
7.878E-02 .550
(Constant)
MS_GDP
MP_GDP
CNTRY_1
CNTRY_2
CNTRY_3
CNTRY_4
CNTRY_5
CNTRY_6
CNTRY_7
CNTRY_8
CNTRY_9
Model1
B Std. Error
UnstandardizedCoefficients
Dependent Variable: inflationa.
Coefficients
16.393 4.718
-5.806 4.969
-6.240 5.993
-.167 1.857
-.633 .632
1.943 .642
-.195 .601
-.400 .609
-.559 .624
-.130 .607
-.111 .615
6.658E-02 .624
(Constant)
MS_GDP
MP_GDP
CNTRY_1
CNTRY_2
CNTRY_3
CNTRY_4
CNTRY_5
CNTRY_6
CNTRY_7
CNTRY_8
CNTRY_9
Model1
B Std. Error
UnstandardizedCoefficients
Dependent Variable: inflationa.
43
ANOVA from n 1
ANOVAModel Sum of Squaresdf Mean SquareF Sig.
1 Regression 670.1102 11 60.91911 31.2705 4.53E-43Residual 563.0107 289 1.948134Total 1233.121 300
a Predictors: (Constant), CNTRY_9, CNTRY_8, CNTRY_7, CNTRY_6, CNTRY_5, CNTRY_4, CNTRY_3, CNTRY_2, CNTRY_1, MP_GDP, MS_GDPb Dependent Variable: inflation
ANOVA from n 2
ANOVAModel Sum of Squaresdf Mean SquareF Sig.
1 Regression 65.9947 11 5.999519 3.428609 0.000217Residual 355.2176 203 1.749841Total 421.2124 214
a Predictors: (Constant), CNTRY_9, CNTRY_8, CNTRY_7, CNTRY_6, CNTRY_5, CNTRY_4, CNTRY_3, CNTRY_2, CNTRY_1, MP_GDP, MS_GDPb Dependent Variable: inflation
ANOVA from n1 + n2ANOVAModel Sum of Squaresdf Mean SquareF Sig.
1 Regression 300.8519 11 27.35017 7.525389 0Residual 1831.731 504 3.634387Total 2132.583 515
44
k = 12
r = k = 12
n1 = 301n2 = 215df u = n - k = 492
RSS R = RSSn1+n2 = 1831.730825
RSSU = RSSn1 + RSSn2 = 918.2283021
F = 40.78898826
Sig F = 4.18611E-66
UU
URrdf dfRSS
rRSSRSSF
U /
/)(
45
Summary:
(1) Testing a set of linear restrictions – the general case
(2) Testing homogenous Restrictions (3) Testing for a relationship – Special
Case of Homogenous Restrictions (4) Testing for Structural Breaks
46
Reading
Kennedy (1998) “A Guide to Econometrics”, Chapters 4 and 6
Maddala, G.S. (1992) “Introduction to Econometrics” p. 170-177