Homework2-AK


Homework #2, ECO 7427

    ANSWER KEY

    Prof. Sarah Hamersma

1. For your data work this week, I would like you to do exercise 5.4 in Wooldridge (all parts). The data are available on the shared drive in the Economics department or on Wooldridge's website at: http://www.msu.edu/~ec/faculty/wooldridge/book2.htm. The paper you will replicate is Card (1995). It was published in a book, but there are copies of the working paper version online (NBER #4483) for your reference.

    Wooldridge Ch 5-4

    Here is my Stata code, with the verbal answers to the questions embedded in it.

-------------------------------------------------------------------------------
       log:  G:\Wooldridge5-4.log
  log type:  text
 opened on:  16 Feb 2005, 11:03:07

    .

    .

    . * Sarah Hamersma

    . * 2/15/05

    . * program name: Wooldridge5-4.do

    .

    . * This program provides an answer key to Wooldridge question 5.4

    .

. #delimit ;
delimiter now ;
. use "H:\Wooldridge Data\CARD.DTA", clear;

    . * Part a ;

    . gen logwage = log(wage);

. regress logwage educ exper expersq black south smsa reg661 reg662
> reg663 reg664 reg665 reg666 reg667 reg668 smsa66;

      Source |       SS       df       MS              Number of obs =    3010
-------------+------------------------------           F( 15,  2994) =   85.48
       Model |  177.695591    15  11.8463727           Prob > F      =  0.0000
    Residual |  414.946054  2994  .138592536           R-squared     =  0.2998
-------------+------------------------------           Adj R-squared =  0.2963
       Total |  592.641645  3009  .196956346           Root MSE      =  .37228

------------------------------------------------------------------------------
     logwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0746933   .0034983    21.35   0.000     .0678339    .0815527
       exper |    .084832   .0066242    12.81   0.000     .0718435    .0978205
     expersq |   -.002287   .0003166    -7.22   0.000    -.0029079   -.0016662


       black |  -.1990123   .0182483   -10.91   0.000    -.2347927   -.1632318
       south |   -.147955   .0259799    -5.69   0.000    -.1988952   -.0970148
        smsa |   .1363845   .0201005     6.79   0.000     .0969724    .1757967
      reg661 |  -.1185698   .0388301    -3.05   0.002     -.194706   -.0424335
      reg662 |  -.0222026   .0282575    -0.79   0.432    -.0776088    .0332036
      reg663 |   .0259703   .0273644     0.95   0.343    -.0276846    .0796251
      reg664 |  -.0634942   .0356803    -1.78   0.075    -.1334546    .0064662
      reg665 |   .0094551   .0361174     0.26   0.794    -.0613623    .0802725
      reg666 |   .0219476   .0400984     0.55   0.584    -.0566755    .1005708
      reg667 |  -.0005887   .0393793    -0.01   0.988     -.077802    .0766245
      reg668 |  -.1750058   .0463394    -3.78   0.000     -.265866   -.0841456
      smsa66 |   .0262417   .0194477     1.35   0.177    -.0118905    .0643739
       _cons |   4.739377   .0715282    66.26   0.000     4.599127    4.879626

    ------------------------------------------------------------------------------

. * This lines up perfectly with Card's Table 2, column 2. The only
> * difference is that Card uses "expersq/100" as his regressor, so
> * his coefficient is exactly 100 times the size of ours (but this
> * affects the std error the same way, so the significance level
> * is identical). This is a useful place to note that it can be easier
> * for the reader if you scale variables whose coefficients would
> * otherwise be very small, to make them easier to interpret, which
> * is what Card did. ;

. * Part b ;
. regress educ exper expersq black south smsa reg661 reg662
> reg663 reg664 reg665 reg666 reg667 reg668 smsa66 nearc4;

      Source |       SS       df       MS              Number of obs =    3010
-------------+------------------------------           F( 15,  2994) =  182.13
       Model |  10287.6179    15  685.841194           Prob > F      =  0.0000
    Residual |  11274.4622  2994  3.76568542           R-squared     =  0.4771
-------------+------------------------------           Adj R-squared =  0.4745
       Total |  21562.0801  3009  7.16586243           Root MSE      =  1.9405

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       exper |  -.4125334   .0336996   -12.24   0.000    -.4786101   -.3464566
     expersq |   .0008686   .0016504     0.53   0.599    -.0023674    .0041046
       black |  -.9355287   .0937348    -9.98   0.000     -1.11932   -.7517377
       south |  -.0516126   .1354284    -0.38   0.703    -.3171548    .2139296
        smsa |   .4021825   .1048112     3.84   0.000     .1966732    .6076918
      reg661 |   -.210271   .2024568    -1.04   0.299    -.6072395    .1866975
      reg662 |  -.2889073   .1473395    -1.96   0.050    -.5778042   -.0000105
      reg663 |  -.2382099   .1426357    -1.67   0.095    -.5178838    .0414639
      reg664 |   -.093089   .1859827    -0.50   0.617    -.4577559    .2715779
      reg665 |  -.4828875   .1881872    -2.57   0.010    -.8518767   -.1138982
      reg666 |  -.5130857   .2096352    -2.45   0.014    -.9241293   -.1020421
      reg667 |  -.4270887   .2056208    -2.08   0.038    -.8302611   -.0239163
      reg668 |   .3136204   .2416739     1.30   0.194    -.1602433    .7874841
      smsa66 |   .0254805   .1057692     0.24   0.810    -.1819071    .2328682
      nearc4 |   .3198989   .0878638     3.64   0.000     .1476194    .4921785
       _cons |   16.84852   .2111222    79.80   0.000     16.43456    17.26248

    ------------------------------------------------------------------------------

. * This lines up with Card as well. The coefficient seems reasonably
> * large - about 1/3 year added education (on a base of about 13 years)
> * for those living near a college -
> * and it contributes to the regression meaningfully. One way to assert


> * this would be with an F-test of the effect of including the instruments;
. * Since there is only one instrument, we can simply look at a t-test of
> * the instrument in the first-stage regression, and we can see that it
> * has statistical explanatory power (see Wooldridge top of p. 105). The
> * size and significance suggest a reasonably strong instrument. ;
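The first-stage logic above (with one instrument, a t-test on the instrument in the first stage is the relevance check) can be sketched on simulated data. This is a hypothetical stand-in for CARD.DTA, not the real data: the 0.32 first-stage effect, the share near a college, and the spread of educ are all invented to mimic the magnitudes in the output.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
# Invented stand-in for the Card data: nearc4 shifts schooling by ~1/3 year
nearc4 = rng.binomial(1, 0.68, n).astype(float)
educ = 12.0 + 0.32 * nearc4 + rng.normal(0.0, 1.9, n)

# First-stage regression of educ on a constant and the instrument
Z = np.column_stack([np.ones(n), nearc4])
beta, *_ = np.linalg.lstsq(Z, educ, rcond=None)
resid = educ - Z @ beta
sigma2 = resid @ resid / (n - Z.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(Z.T @ Z)))
t_stat = beta[1] / se[1]   # well above 2: the instrument has explanatory power
```

The same arithmetic is what Stata reports in the nearc4 row of the first-stage output (coefficient over standard error).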

. * Part c ;
. ivreg logwage exper expersq black south smsa reg661 reg662
> reg663 reg664 reg665 reg666 reg667 reg668 smsa66 (educ = nearc4);

    Instrumental variables (2SLS) regression

      Source |       SS       df       MS              Number of obs =    3010
-------------+------------------------------           F( 15,  2994) =   51.01
       Model |  141.146813    15  9.40978752           Prob > F      =  0.0000
    Residual |  451.494832  2994  .150799877           R-squared     =  0.2382
-------------+------------------------------           Adj R-squared =  0.2343
       Total |  592.641645  3009  .196956346           Root MSE      =  .38833

    ------------------------------------------------------------------------------

     logwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1315038   .0549637     2.39   0.017     .0237335    .2392742
       exper |   .1082711   .0236586     4.58   0.000     .0618824    .1546598
     expersq |  -.0023349   .0003335    -7.00   0.000    -.0029888    -.001681
       black |  -.1467757   .0538999    -2.72   0.007    -.2524603   -.0410912
       south |  -.1446715   .0272846    -5.30   0.000      -.19817     -.091173
        smsa |   .1118083    .031662     3.53   0.000     .0497269    .1738898
      reg661 |  -.1078142   .0418137    -2.58   0.010    -.1898007   -.0258278
      reg662 |  -.0070465   .0329073    -0.21   0.830    -.0715696    .0574767
      reg663 |   .0404445   .0317806     1.27   0.203    -.0218694    .1027585
      reg664 |  -.0579172   .0376059    -1.54   0.124    -.1316532    .0158189
      reg665 |   .0384577   .0469387     0.82   0.413    -.0535777     .130493
      reg666 |   .0550887   .0526597     1.05   0.296    -.0481642    .1583416
      reg667 |    .026758   .0488287     0.55   0.584    -.0689832    .1224992
      reg668 |  -.1908912   .0507113    -3.76   0.000    -.2903238   -.0914586
      smsa66 |   .0185311   .0216086     0.86   0.391    -.0238381    .0609003
       _cons |   3.773965    .934947     4.04   0.000     1.940762    5.607169

------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   exper expersq black south smsa reg661 reg662 reg663 reg664
               reg665 reg666 reg667 reg668 smsa66 nearc4
------------------------------------------------------------------------------

. * The new estimate of the return to education is almost twice as high (13%
> * vs. 7.5% before). The 95% confidence interval here is (.024, .239). The
> * earlier one was (.068, .082). We have a lot less precision with the IV
> * procedure. However, this lack of precision is appropriate. The precision
> * in part (a) is false in the sense that the estimates are not even
> * consistent, since educ is endogenous (plus they are biased). When we
> * account for the endogeneity, we use a two-stage IV procedure that will
> * result in less precision but consistent estimates. ;

    . * Part d ;

. regress educ exper expersq black south smsa reg661 reg662
> reg663 reg664 reg665 reg666 reg667 reg668 smsa66 nearc4 nearc2;

      Source |       SS       df       MS              Number of obs =    3010
-------------+------------------------------           F( 16,  2993) =  170.99
       Model |  10297.1164    16  643.569774           Prob > F      =  0.0000


    Residual |  11264.9637  2993  3.76377002           R-squared     =  0.4776
-------------+------------------------------           Adj R-squared =  0.4748
       Total |  21562.0801  3009  7.16586243           Root MSE      =    1.94

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       exper |  -.4122915   .0336914   -12.24   0.000    -.4783521   -.3462309
     expersq |   .0008479     .00165     0.51   0.607    -.0023874    .0040832
       black |  -.9451729   .0939073   -10.06   0.000    -1.129302   -.7610434
       south |  -.0419115   .1355316    -0.31   0.757    -.3076561    .2238331
        smsa |   .4013708   .1047858     3.83   0.000     .1959113    .6068303
      reg661 |  -.1687829   .2040832    -0.83   0.408    -.5689404    .2313747
      reg662 |   -.269031   .1478324    -1.82   0.069    -.5588944    .0208325
      reg663 |  -.1902114   .1457652    -1.30   0.192    -.4760216    .0955987
      reg664 |   -.037715   .1891745    -0.20   0.842    -.4086403    .3332102
      reg665 |  -.4371387   .1903306    -2.30   0.022    -.8103307   -.0639467
      reg666 |  -.5022265   .2096933    -2.40   0.017    -.9133841   -.0910688
      reg667 |  -.3775317    .207922    -1.82   0.070    -.7852162    .0301529
      reg668 |   .3820043   .2454171     1.56   0.120    -.0991991    .8632076
      smsa66 |   .0000782   .1069445     0.00   0.999    -.2096139    .2097704
      nearc4 |   .3205819   .0878425     3.65   0.000      .148344    .4928197
      nearc2 |   .1229986   .0774256     1.59   0.112    -.0288142    .2748114
       _cons |   16.77306   .2163481    77.53   0.000     16.34885    17.19727
------------------------------------------------------------------------------

. * The variable nearc4 seems to be more strongly related to educ, and
> * the relationship is more precisely estimated for nearc4 ;
. ivreg logwage exper expersq black south smsa reg661 reg662
> reg663 reg664 reg665 reg666 reg667 reg668 smsa66 (educ = nearc4 nearc2);

    Instrumental variables (2SLS) regression

      Source |       SS       df       MS              Number of obs =    3010
-------------+------------------------------           F( 15,  2994) =   47.07
       Model |     100.869    15  6.72459998           Prob > F      =  0.0000
    Residual |  491.772645  2994   .16425272           R-squared     =  0.1702
-------------+------------------------------           Adj R-squared =  0.1660
       Total |  592.641645  3009  .196956346           Root MSE      =  .40528

------------------------------------------------------------------------------
     logwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1570594   .0525782     2.99   0.003     .0539662    .2601525
       exper |   .1188149   .0228061     5.21   0.000     .0740977     .163532
     expersq |  -.0023565   .0003475    -6.78   0.000    -.0030379   -.0016751
       black |  -.1232778     .05215    -2.36   0.018    -.2255313   -.0210243
       south |  -.1431945   .0284448    -5.03   0.000    -.1989678   -.0874212
        smsa |    .100753   .0315193     3.20   0.001     .0389512    .1625548
      reg661 |   -.102976   .0434224    -2.37   0.018    -.1881167   -.0178353
      reg662 |  -.0002286   .0337943    -0.01   0.995     -.066491    .0660337
      reg663 |   .0469556    .032649     1.44   0.150    -.0170612    .1109724
      reg664 |  -.0554084   .0391828    -1.41   0.157    -.1322364    .0214196
      reg665 |   .0515041   .0475678     1.08   0.279    -.0417647     .144773
      reg666 |   .0699968   .0533049     1.31   0.189    -.0345212    .1745148
      reg667 |   .0390596   .0497499     0.79   0.432    -.0584878     .136607
      reg668 |  -.1980371    .052535    -3.77   0.000    -.3010454   -.0950287
      smsa66 |   .0150626    .022336     0.67   0.500    -.0287328     .058858
       _cons |   3.339687   .8945377     3.73   0.000     1.585716    5.093658

------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   exper expersq black south smsa reg661 reg662 reg663 reg664


               reg665 reg666 reg667 reg668 smsa66 nearc4 nearc2
------------------------------------------------------------------------------

. * The new estimate of the returns to educ is even higher, at 15.7%. There
> * is again sufficient precision to say with confidence that there is a
> * positive effect of education on earnings, but the confidence interval is
> * still fairly wide at (.054, .260). It's notable, though, that the bottom
> * end of the interval is not that much lower (in magnitude) than the point
> * estimate in the OLS where we ignored endogeneity. Although we cannot be
> * certain, this seems to suggest that the uncorrected OLS estimate was
> * likely an underestimate of the real return to education. ;

    . * Part e ;

    . regress iq nearc4;

      Source |       SS       df       MS              Number of obs =    2061
-------------+------------------------------           F(  1,  2059) =   12.13
       Model |  2869.62905     1  2869.62905           Prob > F      =  0.0005
    Residual |  487188.423  2059  236.614096           R-squared     =  0.0059
-------------+------------------------------           Adj R-squared =  0.0054
       Total |  490058.052  2060  237.892258           Root MSE      =  15.382

------------------------------------------------------------------------------
          iq |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      nearc4 |     2.5962   .7454966     3.48   0.001     1.134195    4.058206
       _cons |   100.6106   .6274557   160.35   0.000     99.38014    101.8412

    ------------------------------------------------------------------------------

. * They are correlated - probably not that shocking, given profs' smart kids!
> * We might be concerned about this correlation because IQ could affect
> * wages directly, so nearc4 could be picking up IQ effects, which would
> * make nearc4 correlated with the error in the outcome equation (that
> * would be very bad!).
> * But wait...there's still part f ;

    . * Part f ;

    . regress iq nearc4 smsa66 reg661 reg662 reg669;

      Source |       SS       df       MS              Number of obs =    2061
-------------+------------------------------           F(  5,  2055) =   12.79
       Model |  14792.5727     5  2958.51453           Prob > F      =  0.0000
    Residual |   475265.48  2055   231.27274           R-squared     =  0.0302
-------------+------------------------------           Adj R-squared =  0.0278
       Total |  490058.052  2060  237.892258           Root MSE      =  15.208

------------------------------------------------------------------------------
          iq |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      nearc4 |   .8680808   .8216913     1.06   0.291    -.7433537    2.479515
      smsa66 |   1.354527   .8027961     1.69   0.092    -.2198513    2.928906
      reg661 |   4.768099   1.546809     3.08   0.002     1.734623    7.801576
      reg662 |    5.80812   .9017539     6.44   0.000     4.039673    7.576566
      reg669 |   1.844655   1.151703     1.60   0.109    -.4139708    4.103281
       _cons |   99.38472   .7016631   141.64   0.000     98.00868    100.7608

    ------------------------------------------------------------------------------

. * I'm not sure why we didn't use the whole set of dummies here...anyway,
> * this is good - IQ and nearc4 no longer appear to be partially correlated.
> * Or, at least, there is not a strong enough correlation for us to be


> * able to measure it precisely. The point here is that it is important
> * for us to control for 1966 location and regional dummies in the outcome
> * equation because these soak up the effects of IQ in a way that allows
> * the instrument to end up uncorrelated with the error in the outcome
> * equation (which is required in order for us to legitimately use IV). ;
. log close;

       log:  G:\Wooldridge5-4.log
  log type:  text
 closed on:  16 Feb 2005, 11:03:08

2. Please write me a description of the distinction between a proxy and an instrument. Specifically, tell me about differences in the assumptions required for each to be valid and differences in the type of situation in which it would be useful to use such a variable.

    Proxy:

Use a proxy when you need a representative for an omitted variable for which you don't have direct data (such as using IQ to proxy for ability in a returns-to-education context). You place it directly into the regression to represent the variable you don't have data for. The interpretation of the coefficient is the predictive power of the proxy on the outcome. Note that we still cannot measure the effect of the omitted variable; a proxy just acts as a control so that the other coefficients in the regression aren't biased.

Assumptions:
a) uncorrelated with the error in the outcome equation (this is also called "redundant" in the structural equation: if we had the real variable, this one would be redundant)
b) correlated with the omitted variable
   - more specifically, it should be closely related enough to the omitted variable that the other Xs have no power for predicting the omitted variable once the proxy is taken into account (though there's no way to check this exactly, since we don't have data on the omitted variable)

    Instrument:

Use an instrument when you have data on an endogenous variable that you think is correlated with some other omitted variable in your outcome equation, causing the regression estimates to be biased. An instrument is used to represent the endogenous variable (NOT the omitted one), and if we consider the single-variable case, the instrument is put into the outcome equation directly. While we could look at the coefficient on the instrument, we are typically interested in the effect of the endogenous variable, which we get by dividing the coefficient on the instrument in the outcome equation by the coefficient on the instrument from the first stage.
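That "reduced form divided by first stage" arithmetic is easy to verify on simulated data. A minimal sketch (every coefficient below is invented for illustration): with one instrument and no other regressors, the ratio of the two slopes is exactly the indirect-least-squares / 2SLS estimate, and it recovers the true effect while plain OLS does not.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
ability = rng.normal(size=n)                  # unobserved: makes educ endogenous
z = rng.binomial(1, 0.5, n).astype(float)     # instrument (think nearc4)
educ = 12.0 + 0.5 * z + 0.8 * ability + rng.normal(size=n)
logwage = 1.0 + 0.10 * educ + 0.5 * ability + rng.normal(0.0, 0.3, n)

def ols_slope(y, x):
    """Slope coefficient from a bivariate OLS regression with a constant."""
    X = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

first_stage = ols_slope(educ, z)       # effect of instrument on educ
reduced_form = ols_slope(logwage, z)   # effect of instrument on the outcome
beta_iv = reduced_form / first_stage   # indirect least squares = 2SLS here
beta_ols = ols_slope(logwage, educ)    # biased upward by omitted ability
```

Here beta_iv lands near the true 0.10 while beta_ols is pushed well above it, mirroring (in reverse direction) the OLS-vs-IV contrast in the Card exercise.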

  • 8/3/2019 Homework2-AK

    7/9

    7

Assumptions:
a) uncorrelated with the error in the outcome equation (this is also called "redundant" in the structural equation: if we had a clean version of the endogenous regressor (without its implicit correlation with some other omitted variable), then the instrument would be redundant)
b) correlated with the endogenous variable (and NOT with the omitted variable that is causing the endogenous regressor to be endogenous; if it is correlated with the omitted variable, it will fail to meet assumption (a))

In terms of comparing the two, one clear similarity is in the assumptions (particularly the first one). However, a clear difference is that they are used to fix different problems. In one case (instrument) we have a variable of interest, but it is correlated with some omitted variable, preventing us from estimating the effect properly. We want a representative that will get rid of the endogenous part of the variable of interest. In the case of a proxy, controlling for the omitted variable itself is of interest, and we are looking for a way to do this with some substitute because the data are not available. In a very practical sense, these are distinct in that there can be no "first stage" in a proxy setting because we do not have data on the variable we are trying to represent (and if we did, we wouldn't need the proxy!).
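To make the proxy side of the contrast concrete, here is a small simulation (all numbers invented): education is driven partly by unobserved "ability," and a noisy IQ measure stands in for ability. Plugging the proxy into the regression pulls the education coefficient back toward the truth; it does not remove the bias completely, because the proxy is measured with error, which matches the caveat in assumption (b) above.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
ability = rng.normal(size=n)                            # the omitted variable
educ = 12.0 + 1.0 * ability + rng.normal(size=n)        # endogenous regressor
iq = 100.0 + 15.0 * ability + rng.normal(0.0, 8.0, n)   # noisy proxy for ability
logwage = 1.0 + 0.10 * educ + 0.5 * ability + rng.normal(0.0, 0.3, n)

def ols_coefs(y, *cols):
    """OLS slope coefficients (constant included, then dropped from output)."""
    X = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(X, y, rcond=None)[0][1:]

b_naive = ols_coefs(logwage, educ)[0]       # full omitted-variable bias
b_proxy = ols_coefs(logwage, educ, iq)[0]   # proxy soaks up much of the bias
```

The naive coefficient is well above the true 0.10; adding the proxy moves it substantially back toward 0.10 without ever estimating the effect of ability itself, which is exactly the "proxy as control" role described above.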

3. Regarding Lott's work: I would like you to tell me if you think this (his website defense) is a sufficient argument for choosing not to use clustering in the analysis. Do your best to convince me of your position by explaining why the analysis does or does not need clustering.

This was a hard question. I gave substantial partial credit for wrong answers that were well thought-out. But please do make sure you read this so you know the right answer.

Outline of answer:
a) when clustering standard errors is still needed, even with dummies
b) explanation of what dummies can and cannot successfully fix
c) explanation of why clustering will make SEs bigger even if it's unneeded

John Lott's analysis uses county-level data from several states and looks at the impact of state-level treatments. Note that he does not use individual data at all: the unit of observation is the county. This means that when he refers to using county fixed effects, this is equivalent to an individual fixed effect from the perspective of his sample, where each observation is a county. He argues that including county fixed effects implicitly includes state fixed effects. This argument is correct. However, this only moves us one step closer to the real question: does including state fixed effects mean you don't need clustering at the state level? The answer is that you still may need clustering.

  • 8/3/2019 Homework2-AK

    8/9

    8

State fixed effects are an important component of an analysis that uses state-level treatments. There may be correlated outcomes Y within a state that are not picked up by observable Xs. This can be thought of as an omitted variables (endogeneity) problem, so if this is the case and we do not include state fixed effects, our estimates of the treatment effect will be biased and inconsistent (not to mention the standard errors!). Including a state fixed effect allows us to explain some of this variation. Econometrically, it will force the expected value of the residuals within each state to be zero (if they averaged something else, this would have been incorporated into the estimate of the fixed effect by construction).

Suppose that these state fixed effects properly fix the point estimates (i.e., there is no longer an omitted variables problem). What does the error structure look like now? Well, within each state there are several counties. We can estimate a regression and look at the residuals within each state: they will average zero (as noted above), but depending on the state they might be spread widely or distributed narrowly around zero. This is a heteroskedasticity problem; solve it with the "robust" option to fix your standard errors.
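What the "robust" correction actually computes is the White/sandwich variance estimator. A sketch of the HC0 form on invented data whose error spread grows with the regressor (this is an illustration of the formula, not of Stata's internals):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2_000
x = rng.normal(size=n)
# Error spread depends on x: classical OLS standard errors are wrong here
y = 1.0 + 2.0 * x + rng.normal(size=n) * (1.0 + np.abs(x))

X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
u = y - X @ beta
XtX_inv = np.linalg.inv(X.T @ X)

# Classical variance: assumes one common error variance for every observation
se_ols = np.sqrt(np.diag(XtX_inv * (u @ u / (n - 2))))

# HC0 sandwich: the "meat" uses each observation's own squared residual
meat = X.T @ (X * (u ** 2)[:, None])
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
```

Because the noisy observations sit where x has high leverage, se_robust exceeds se_ols on the slope, which is the correction's whole point.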

Where does the clustering come in? It is worth noting that the clustering problem would have been HUGE if we ignored the fixed effects to start with, and so including them does make the problem smaller (which is why some of our intuition suggested that it could fix the problem). However, it may still remain. The issue is that we have controlled only for a very specific form of correlation among observations within a state: we have controlled for a form of correlation in which every observation in the state has a common (state-level) component of variance and a random component that is individual-specific (or, in Lott's case, county-specific). We have assumed all states have this same within-state correlation structure. It is conceivable, though, that there are other correlations among counties in a state that are not picked up by this very simple model of correlation. Wooldridge, in his paper "Cluster-Sample Methods in Applied Econometrics," says that an example would be something that is somehow related to the other Xs in the regression, such as if people within certain states tend to have certain Xs that are related to certain error-term patterns, which could cause a complication in the relationships among the errors within a state. Clustering the standard errors, along with making them robust (Stata does this automatically), will address this problem. (However, let me note that in the case described by Wooldridge there, it seems there may also be an endogeneity problem if there is some nonrandom relationship between Xs and error terms.)
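The clustered correction itself is the same sandwich idea, except the scores X'u are summed within each state before forming the "meat." A sketch on invented data where the treatment is constant within a state and the errors share a state-level shock, so the classical SE badly understates the truth:

```python
import numpy as np

rng = np.random.default_rng(4)
n_states, per_state = 50, 60           # counties nested in states
n = n_states * per_state
state = np.repeat(np.arange(n_states), per_state)

treat = rng.normal(size=n_states)[state]   # state-level "treatment"
# Errors share a common state shock plus a county-level idiosyncratic part
u = 0.7 * rng.normal(size=n_states)[state] + rng.normal(size=n)
y = 1.0 + 0.5 * treat + u

X = np.column_stack([np.ones(n), treat])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta
XtX_inv = np.linalg.inv(X.T @ X)
se_ols = np.sqrt(np.diag(XtX_inv * (resid @ resid / (n - 2))))

# Cluster sandwich: sum scores X'u within each state, then (X'X)^-1 M (X'X)^-1
meat = np.zeros((2, 2))
for g in range(n_states):
    score = X[state == g].T @ resid[state == g]
    meat += np.outer(score, score)
se_cluster = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
```

With 60 counties per state sharing one shock, se_cluster comes out several times larger than se_ols on the treatment coefficient: exactly the situation where ignoring clustering overstates precision.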

Mitchell Petersen's paper "Estimating Standard Errors in Finance Panel Data Sets: Comparing Approaches" also addresses this issue and gives a nice example of a situation in which clustering is still needed in the presence of fixed effects. He examines the use of various standard error corrections in the presence of different types of error correlation.

Some key insights are on pages 6-8, Section IV (starting page 23), and Section V (starting page 26). I have pasted the most transparent part of the paper for our purposes below. Petersen's point is that adding firm (or, in our case, county) fixed effects will fix everything IF the only correlation among counties is a fixed, time-invariant component. (This echoes Wooldridge.) He notes that this will fail if there is a gradually-changing firm effect. I would add that the same may be true for geographically-based correlations (nearby counties may be more correlated with each other than distant counties, even within a state).

  • 8/3/2019 Homework2-AK

    9/9

    9

Once we include the firm effects, the OLS standard errors are unbiased .... The clustered standard errors are unbiased with and without the fixed effects (see Kezdi, 2004, for examples where the clustered standard errors are too large in a fixed effect model). This conclusion, however, depends on the firm effect being fixed. If the firm effect decays over time, the firm dummies no longer fully capture the within cluster dependence and OLS standard errors are still biased (see Table 5 - Panel A, columns II-IV). In these simulations, the firm effect decays over time (in column II, 61 percent of the firm effect dissipates after 9 years). Once the firm effect is temporary, the OLS standard errors again underestimate the true standard errors even when firm dummies are included in the regression (Wooldridge, 2003; Baker, Stein, and Wurgler, 2003). (p. 28)

If it happens that you are certain that the error structure of your data is perfectly picked up with fixed effects, you will not need to cluster. Moreover, you will not want to cluster. Why? Your standard errors will get unnecessarily larger. But why should they change, especially given the way Moulton (1990) presented the formula for the adjustment (which seems to imply that if there is no correlation, no adjustment is made)? The intuition here is that anytime we allow for more flexibility of estimation, it costs us something. Estimating these flexible standard errors causes a loss of efficiency; you can think of it as using up some of our observations (degrees of freedom) to calculate these special standard errors. This would lead us to want to KNOW whether we need clustering, since we wouldn't want to use it unnecessarily.
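Moulton's adjustment makes the intuition explicit. For a regressor that is constant within clusters of size m with intraclass error correlation rho, the SE inflation factor is sqrt(1 + (m - 1) * rho): zero correlation means no adjustment at all, but even a small rho bites hard when clusters are large. A quick sketch (the cluster size of 60 is just an example):

```python
import numpy as np

def moulton_factor(m, rho):
    """SE inflation for a cluster-constant regressor: sqrt(1 + (m - 1) * rho)."""
    return np.sqrt(1.0 + (m - 1) * rho)

no_corr = moulton_factor(60, 0.0)      # rho = 0: no adjustment at all
small_corr = moulton_factor(60, 0.05)  # even rho = 0.05 roughly doubles the SE
```

This is why the formula "seems to imply" no adjustment under zero correlation: it literally makes none, and the cost of clustering in that case is the estimation noise discussed above, not the formula itself.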

A newer (2004) paper by Lott posted on his website contains an appendix with his argument for why any correlation in his errors is taken care of with his fixed effects (though he now includes clustering throughout the paper, for comparability with other work and to be conservative about his estimates). He does a test for correlation of errors to argue his point (which another econometrics colleague and I have found to be weak at best). This seems a step in the right direction: the idea being that one must still make some kind of argument for choosing NOT to cluster, even when fixed effects are included. The argument that fixed effects are included is not itself a sufficient reason to avoid clustering.