6305.Homework 3

  • 7/30/2019 6305.Homework 3

    6305: Applied Econometrics for Policy Analysis

    Homework 3

    Due on 26 April 2013

    Maximum Marks: 100

    April 16, 2013

    1 Oaxaca-Blinder Decomposition

    For this set of questions, use the dataset called Q1.dta. This is the dataset that DiNardo and Pischke used for their paper on returns to pencils. We will use it to look at male-female wage differentials.

    1. First, compute the mean hourly wages of males and females for 1979 and 1985. Comment on the wage differential. Whom does it favour, and how does it change between 1979 and 1985?

    2. To what extent do the above figures indicate statistical discrimination against one group? Justify your answer using insights from what you know from this course.

    3. Perform the Oaxaca-Blinder decomposition separately for 1979 and 1985. In your regression, include the following variables: exp, expsq, school, schoolsq, married, sit, computer, pencil, teleph, calc, hammer, city, civser. Make sure to compute standard errors clustered on occupation. Show the output of the regressions and the decomposition.

    4. Write a short paragraph contrasting the results for females and males within each year (1979 and 1985). What insights can you offer to explain these patterns?

    5. Write a short paragraph on the trends, i.e. comparing the signs on the covariates within each group, for males and females, between 1979 and 1985. Provide hypotheses on why we see what we see.

    6. Using the results of the decomposition into endowment effects and coefficient effects, together with your insights above, comment on the nature of discrimination or unexplained male-female wage differentials.

    7. What are the assumptions underlying your inferences? How would you persuade yourself that these are reasonable assumptions to make?
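The mechanics of the decomposition requested in question 3 can be illustrated on a toy example. The sketch below is a minimal pure-Python illustration, assuming a single regressor (school) and invented numbers rather than Q1.dta. It does not reproduce the full covariate list or the clustered standard errors, and the function names ols_simple and oaxaca_twofold are hypothetical. It uses group A's coefficients as the reference, which is only one of several possible conventions.

```python
from statistics import mean


def ols_simple(x, y):
    """Return (intercept, slope) from a one-regressor OLS fit."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def oaxaca_twofold(x_a, y_a, x_b, y_b):
    """Twofold decomposition of mean(y_a) - mean(y_b).

    explained   = difference in endowments, priced at group A coefficients
    unexplained = difference in coefficients, evaluated at group B endowments
    """
    a0, a1 = ols_simple(x_a, y_a)
    b0, b1 = ols_simple(x_b, y_b)
    gap = mean(y_a) - mean(y_b)
    explained = (mean(x_a) - mean(x_b)) * a1
    unexplained = (a0 - b0) + mean(x_b) * (a1 - b1)
    return gap, explained, unexplained


# Invented data (NOT from Q1.dta): years of schooling and log hourly wage.
school_m, lwage_m = [10, 12, 14, 16], [2.0, 2.3, 2.5, 2.9]
school_f, lwage_f = [9, 11, 13, 15], [1.8, 2.0, 2.1, 2.4]

gap, explained, unexplained = oaxaca_twofold(school_m, lwage_m, school_f, lwage_f)
print("gap:", gap)
print("explained (endowments):", explained)
print("unexplained (coefficients):", unexplained)
```

Because each group's OLS line passes through that group's means, the two components add up to the raw gap exactly. Swapping the reference group generally changes the split, which is worth keeping in mind when interpreting the unexplained part.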

    2 Difference-in-Differences Estimate

    For this set of questions, use the dataset called Q2.dta. This is a dataset of housing prices in North Andover, Massachusetts. We will look at the impact that a new garbage incinerator had (or did not have) on housing prices. The construction of the incinerator was announced in 1979; it was constructed in 1981 and became operational in 1985. Housing values represent prices of houses sold in 1978 and 1981, before the announcement and during construction respectively.

    1. Here are some preliminary questions to set up the problem.

    (a) What hypothesis would you hold about the direction of the impact on housing values of the coming of the incinerator?

    (b) What would you use as the outcome variable of interest (i.e., Y)?

    (c) What would you use as the proxy that represents the cause (or the treatment variable, i.e., X)? Justify your choice. There are many options here; use your imagination (but not too much imagination).

    2. For this part of the problem, follow the instructions carefully.

    (a) Using the variable rprice (price in real terms), compute the following four quantities: (i) the mean housing value of the control units in 1978, (ii) the mean housing value of the treated units in 1978, (iii) the mean housing value of the control units in 1981, and (iv) the mean housing value of the treated units in 1981. For this part of the analysis, use nearinc as the treatment variable.

    (b) Compute the difference-in-differences estimate of the impact of being close to the incinerator. Express your finding in a single complete sentence that a layperson can easily understand.

    (c) What is the key assumption under which the above findings hold?

    (d) Now run two regressions, where you regress rprice on nearinc, separately for 1978 and 1981. Write out these two regression models using the coefficients you have estimated.

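    $$rprice_{78} = \alpha_{78} + \beta_{78}\, nearinc_{78} + \varepsilon_{78} \qquad (1)$$

    $$rprice_{81} = \alpha_{81} + \beta_{81}\, nearinc_{81} + \varepsilon_{81} \qquad (2)$$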

    (e) To test whether this difference-in-differences estimate is statistically significant, use a pooled regression model. For this, pool all the data and run the model

    $$rprice = \alpha + \beta\, nearinc + \delta\, (y81 \times nearinc) + \gamma\, y81 + \varepsilon \qquad (3)$$

    Determine the difference-in-differences estimate from this model and comment on its statistical significance.

    (f) Compare this estimate to the one computed without the regression model. Is the regression-based estimate smaller or larger? Why? Show this algebraically.

    (g) Now, expand this model to control for covariates that are also time invariant. In particular, run the model including the following covariates: distance from the interstate highway (intst), the number of rooms (rooms), the number of bathrooms (baths), house area (area), and land size (land). Distances are in feet and areas are in square feet.

    What happens to the size of the DD estimate?

    What happens to the standard error of the DD estimate?

    Is your result now stronger or weaker after adding the covariates?

    What happens to the coefficient on nearinc?

    What does this say about the drivers of housing value?

    (h) As a final exercise, estimate the following model

    $$\log(price) = \alpha + \beta\, nearinc + \delta\, (y81 \times nearinc) + \gamma\, y81 + \varepsilon \qquad (4)$$

    using log(price) as the relevant outcome variable. Write out the results by substituting the coefficients you find into the above model. Can you express the average treatment effect, as represented by the difference-in-differences estimate, in percentage terms?

    (i) Explain whether, with this dataset, you would contemplate using individual fixed effects to control for unobserved heterogeneity. Justify your answer.

    (j) What, if any, are the potential problems with the treatment variable nearinc?
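The arithmetic behind parts (a), (b), (d) and (h) of question 2 can be sketched in a few lines. The example below is a minimal pure-Python illustration on invented prices, not on Q2.dta. The dictionary keys mirror the variable names rprice, nearinc and y81 used above, while the function names are hypothetical. It computes the difference-in-differences estimate from the four group means, shows that the same number is the difference between the two year-specific slopes on nearinc, and converts a log-point coefficient into an exact percentage change.

```python
import math
from statistics import mean

# Invented observations (NOT from Q2.dta).
rows = [
    {"rprice": 80, "nearinc": 0, "y81": 0}, {"rprice": 84, "nearinc": 0, "y81": 0},
    {"rprice": 60, "nearinc": 1, "y81": 0}, {"rprice": 66, "nearinc": 1, "y81": 0},
    {"rprice": 100, "nearinc": 0, "y81": 1}, {"rprice": 104, "nearinc": 0, "y81": 1},
    {"rprice": 70, "nearinc": 1, "y81": 1}, {"rprice": 72, "nearinc": 1, "y81": 1},
]


def group_mean(rows, nearinc, y81):
    """Mean rprice for one treatment-by-year cell."""
    return mean(r["rprice"] for r in rows
                if r["nearinc"] == nearinc and r["y81"] == y81)


def did_from_means(rows):
    """(treated - control in 1981) minus (treated - control in 1978)."""
    gap_81 = group_mean(rows, 1, 1) - group_mean(rows, 0, 1)
    gap_78 = group_mean(rows, 1, 0) - group_mean(rows, 0, 0)
    return gap_81 - gap_78


def slope_on_nearinc(rows, y81):
    """OLS slope of rprice on nearinc using only one year's observations."""
    sub = [r for r in rows if r["y81"] == y81]
    x = [r["nearinc"] for r in sub]
    y = [r["rprice"] for r in sub]
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def pct_effect(log_point_coef):
    """Exact percentage change implied by a coefficient in a log-outcome model."""
    return 100 * (math.exp(log_point_coef) - 1)


print("DiD from means:", did_from_means(rows))
print("Slope difference:", slope_on_nearinc(rows, 1) - slope_on_nearinc(rows, 0))
```

With a binary regressor, the OLS slope is the difference in group means, so the two printed numbers agree. The same logic is why the interaction coefficient in a pooled model without further covariates reproduces the estimate from the four means.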

    3 Quantile Regression

    For this section, use the same dataset Q2.dta. In keeping with the spirit of the findings on the impact of an incinerator on house values, we might believe that, despite the shared features of the neighbourhood, the impact differs at different housing values, so that the distribution of housing values is affected rather than merely the average. In this section, you will perform a quantile regression to assess heterogeneous impacts of the incinerator. We will not push this logic too far because we have not really covered quantile regression in the context of pooled data that combine cross sections from two time periods. We will therefore keep it simple.

    (a) Estimate a quantile regression model of rprice on nearinc, y81, y81nrinc, area, land, rooms, cbd, intst, baths, dist, wind, age, agesq. Compute these for the quintiles, using 100 repetitions.

    (b) Let us focus on the value of the coefficient on the interaction y81 × nearinc (y81nrinc), and on the coefficients on y81 and nearinc. Perform a test of equality of these across the quintiles. Feel free to choose those quintiles that you find interesting.

    (c) Write a brief paragraph summarizing your findings, focussing mainly on the three covariates above. Feel free to comment on any others that you find interesting.

    (d) Focussing on the three key covariates identified above, are these quantile coefficients statistically significantly different from the least squares estimates?

    (e) Using sqreg, please provide figures for the quantile regression you have just performed.
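The idea that an effect can differ across the price distribution can be seen in a stripped-down setting. The sketch below is a minimal pure-Python illustration, assuming a model with only an intercept and a treatment dummy and using invented prices rather than Q2.dta. It is not sqreg: it has no covariates, no bootstrap and no test of equality, and the function names are hypothetical. It minimizes the check (pinball) loss that defines quantile regression. With a single dummy regressor the problem separates by group, so the coefficient at each quantile is the gap between the two groups' quantiles.

```python
def pinball_loss(residuals, tau):
    """Check loss: weight tau on positive residuals, (1 - tau) on negative ones."""
    return sum(tau * r if r >= 0 else (tau - 1) * r for r in residuals)


def quantile_fit_constant(y, tau):
    """Constant that minimizes the check loss; a minimizer always lies at a data point."""
    return min(sorted(y), key=lambda c: pinball_loss([yi - c for yi in y], tau))


def quantile_dummy_regression(y, d, tau):
    """Quantile regression of y on an intercept and a 0/1 dummy d.

    The model is saturated, so the loss separates into one problem per group:
    intercept = tau-quantile of the d == 0 group,
    slope     = tau-quantile of the d == 1 group minus the intercept.
    """
    y0 = [yi for yi, di in zip(y, d) if di == 0]
    y1 = [yi for yi, di in zip(y, d) if di == 1]
    intercept = quantile_fit_constant(y0, tau)
    slope = quantile_fit_constant(y1, tau) - intercept
    return intercept, slope


# Invented prices (NOT from Q2.dta): five control and five treated houses.
price = [10, 20, 30, 40, 50, 5, 10, 15, 20, 25]
near = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

for tau in (0.25, 0.5, 0.75):
    print(tau, quantile_dummy_regression(price, near, tau))
```

In this toy data the estimated gap widens at higher quantiles, which is the kind of pattern the equality tests in part (b) are designed to detect.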
