Economics 470: Economic Fluctuations and Forecastsfaculty.wwu.edu/kriegj/Econ470/Midterm/470 Midterm, Winter 2018... · Economics 470: Economic Fluctuations and Forecasts Take Home

Economics 470: Economic Fluctuations and Forecasts Take Home Midterm

Directions

This test will be posted at noon on Thursday, February 22nd. Your answers will be due at 12:05 PM on Tuesday, February 27th in our usual classroom. You may submit the exam before then in my office. Late submissions will not be accepted.

You are welcome to use any print or internet resources available. You are not to discuss this exam with others (and that means all other sentient beings). If you are unsure whether something would break these simple rules, think of the worst-case scenario: if you break these rules, I will give you a zero on this exam.

I will be happy to help you with clarifying questions but will not answer questions that elucidate the material. Please address questions via e-mail ([email protected]); I will post answers to clarifying questions at http://faculty.wwu.edu/kriegj/Econ470/470%20Page.html and, if they are necessary to completing the exam, will e-mail the entire class at your official WWU e-mail address.

Your answers may be typed or legibly handwritten. Feel free to submit Stata output if it helps explain your answer (it is easiest to copy and paste the Stata output, as a picture, into a Word file that contains your answers). Please answer on blank, white copy paper and include only your student ID number as an identifier. Do not include your name on the exam. Please start a new page for each problem, though you may combine parts of a problem onto a single page (so a complete exam will have at least 4 pages turned in, since there are 4 problems). Points possible are in parentheses after each question.

If you are looking for a diversion from taking this exam, remember Adam Wright and I are presenting our work on the impacts of marijuana legalization in Parks Hall 441 at 4pm on Friday.

1. You want to estimate the following very simple time series regression model:

$$Y_t = \beta_0 + \beta_1 t + \varepsilon_t$$

where $\varepsilon_t$ is an independently and identically distributed error term with mean 0 and variance $\sigma^2$. The independent variable $t$ is simply a time series count variable that is equal to 1 for the first observation, 2 for the second, 3 for the third, … through the final observation of $T$ for the $T$th (Teeth???).

a. Is the OLS estimator of $\beta_1$ unbiased and efficient? (6)

For OLS to be unbiased and efficient, we need the classical assumptions to be true:

1. $E[\varepsilon] = 0$
2. $Var[\varepsilon] = \sigma^2$
3. $Cov[\varepsilon, X] = 0$
4. $Cov[\varepsilon_i, \varepsilon_j] = 0$ for all $i \neq j$
5. Linearity

Since none of these assumptions are violated, OLS will produce unbiased and efficient estimates of $\beta_1$.


b. A friend who has not taken Econ. 470 asks why you are bothering to estimate this model with OLS. She points out that one possible method of estimating β1 is to compute the following:

$$\dot{\beta} = \frac{\sum_{t=2}^{T} (y_t - y_{t-1})}{T - 1}$$

Is your friend right? In other words, is β̇ an unbiased estimator of β1? (6)

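$$
\begin{aligned}
E[\dot{\beta}] &= E\left[\frac{\sum_{t=2}^{T}(y_t - y_{t-1})}{T-1}\right] \\
&= E\left[\frac{\sum_{t=2}^{T}\left(\beta_0 + \beta_1 t + \varepsilon_t - \beta_0 - \beta_1 (t-1) - \varepsilon_{t-1}\right)}{T-1}\right] \\
&= E\left[\frac{\sum_{t=2}^{T}\left(\beta_1 (t - t + 1) + \varepsilon_t - \varepsilon_{t-1}\right)}{T-1}\right] \\
&= E\left[\frac{\sum_{t=2}^{T}\left(\beta_1 + \varepsilon_t - \varepsilon_{t-1}\right)}{T-1}\right] \\
&= E\left[\frac{(T-1)\beta_1 + \sum_{t=2}^{T}(\varepsilon_t - \varepsilon_{t-1})}{T-1}\right] \\
&= E\left[\frac{(T-1)\beta_1}{T-1}\right] = \beta_1
\end{aligned}
$$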

So yes, β̇ is an unbiased estimator of β1.
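The unbiasedness result can also be checked by simulation. The sketch below is only an illustration: the parameter values (β0 = 2, β1 = 0.5, σ = 3, T = 20), the seed, and the helper name `simulate_estimators` are assumptions chosen for the example, not part of the exam. It draws many samples from the model and averages the friend's estimator and the OLS slope across them.

```python
import random
import statistics

def simulate_estimators(beta0, beta1, sigma, T, rng):
    """Draw one sample from Y_t = beta0 + beta1*t + eps_t.

    Returns (beta_dot, beta_ols): the friend's estimator and the OLS slope.
    """
    t = list(range(1, T + 1))
    y = [beta0 + beta1 * ti + rng.gauss(0.0, sigma) for ti in t]

    # Friend's estimator: the average of the first differences of y.
    beta_dot = sum(y[i] - y[i - 1] for i in range(1, T)) / (T - 1)

    # OLS slope: sum((t - tbar)(y - ybar)) / sum((t - tbar)^2).
    t_bar = sum(t) / T
    y_bar = sum(y) / T
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    return beta_dot, sxy / sxx

rng = random.Random(470)  # fixed seed so the run is repeatable
draws = [simulate_estimators(2.0, 0.5, 3.0, 20, rng) for _ in range(20000)]

mean_dot = statistics.fmean(d[0] for d in draws)
mean_ols = statistics.fmean(d[1] for d in draws)
print("average of beta-dot:", round(mean_dot, 3))
print("average of OLS slope:", round(mean_ols, 3))
```

Both averages should sit close to the assumed true slope of 0.5, which is what unbiasedness means in repeated samples.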

c. (BONUS—Don’t work on this until you get the rest done). Is β̇ more or less efficient than the OLS estimator of β1? (4)

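We know that $Var[\hat{\beta}_1] = \dfrac{\hat{\sigma}^2}{\sum_{t=1}^{T}(t-\bar{t})^2}$. In this case $\hat{\sigma}^2 = \dfrac{\sum_{t=1}^{T}\hat{\varepsilon}_t^2}{T-2}$. Recalling that the sum of the first $T$ consecutive integers is $T(T+1)/2$ and the sum of the first $T$ consecutive integers squared is $T(T+1)(2T+1)/6$ means we can re-write the denominator as

$$\sum_{t=1}^{T}(t-\bar{t})^2 = \frac{T(T+1)(2T+1)}{6} - \frac{T(T+1)^2}{4} = \frac{T(T^2-1)}{12}$$

or the entire variance as:

$$Var[\hat{\beta}_1] = \frac{12\sigma^2}{T(T^2-1)}$$

The variance of $\dot{\beta}$ is:

$$
\begin{aligned}
Var[\dot{\beta}] &= E\left[(\dot{\beta} - E[\dot{\beta}])^2\right] = E\left[\left(\frac{\sum_{t=2}^{T}(\beta_1 + \varepsilon_t - \varepsilon_{t-1} - \beta_1)}{T-1}\right)^2\right] \\
&= E\left[\left(\frac{\sum_{t=2}^{T}(\varepsilon_t - \varepsilon_{t-1})}{T-1}\right)^2\right] = E\left[\left(\frac{\varepsilon_2 - \varepsilon_1 + \varepsilon_3 - \varepsilon_2 + \varepsilon_4 - \varepsilon_3 + \cdots + \varepsilon_T - \varepsilon_{T-1}}{T-1}\right)^2\right] \\
&= E\left[\left(\frac{\varepsilon_T - \varepsilon_1}{T-1}\right)^2\right] = \frac{2\sigma^2}{(T-1)^2}
\end{aligned}
$$

Since $\sigma^2$ is the same in the OLS case and in your friend's case, $Var[\dot{\beta}] > Var[\hat{\beta}_1]$ when

$$\frac{2\sigma^2}{(T-1)^2} > \frac{12\sigma^2}{T(T^2-1)}$$

or when $T(T+1) > 6(T-1)$, which is the same as $(T-2)(T-3) > 0$. This holds for every integer $T > 3$, so $Var[\dot{\beta}] > Var[\hat{\beta}_1]$, meaning that OLS is more efficient than your friend's approach whenever there are more than three observations. (For $T = 2$ and $T = 3$ the two estimators coincide, so their variances are equal.)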

It turns out that there is an easier way of seeing why $\hat{\beta}_1$ is the more efficient estimator. Consider for a moment the equation for $\dot{\beta}$:

$$\dot{\beta} = \frac{\sum_{t=2}^{T}(y_t - y_{t-1})}{T-1} = \frac{y_2 - y_1 + y_3 - y_2 + \cdots + y_T - y_{T-1}}{T-1} = \frac{y_T - y_1}{T-1}$$

If you think about it, $\dfrac{y_T - y_1}{T-1}$ is simply the change in Y between the first and last observations, divided by one less than the total number of observations. However, since $t$ is on the x-axis in this case, this equation is also the rise divided by the run: the slope of the line between the first and last observations in our data set. Given this, it is easy to see why $Var[\dot{\beta}]$ exceeds $Var[\hat{\beta}_1]$ in any sample of more than three observations: $\dot{\beta}$ is determined by the placement of only two points while $\hat{\beta}_1$ is determined by the placement of all T observations (or T – 1 if we include an intercept in the OLS regression).
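The efficiency comparison can be checked exactly for any sample size. The sketch below computes $\sum (t-\bar{t})^2$ directly rather than relying on a closed form, and uses $2\sigma^2/(T-1)^2$ for the friend's estimator (the variance of $(y_T - y_1)/(T-1)$). The function names are hypothetical, and σ² is set to 1 because it cancels in the ratio.

```python
from fractions import Fraction

def var_ols_slope(T, sigma2=1):
    """Var of the OLS slope when the regressor is t = 1, ..., T."""
    t_bar = Fraction(T + 1, 2)
    sxx = sum((t - t_bar) ** 2 for t in range(1, T + 1))
    return Fraction(sigma2) / sxx

def var_beta_dot(T, sigma2=1):
    """Var of (y_T - y_1) / (T - 1) with iid errors of variance sigma2."""
    return Fraction(2) * Fraction(sigma2) / (T - 1) ** 2

def variance_ratio(T):
    """Var[beta-dot] / Var[OLS slope]; values above 1 favor OLS."""
    return var_beta_dot(T) / var_ols_slope(T)

for T in (2, 3, 4, 10, 100):
    ratio = variance_ratio(T)
    print(T, ratio, round(float(ratio), 3))
```

The ratio equals 1 at T = 2 and T = 3, where the two estimators are the same formula, and grows with T after that.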

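2. Consider the model:

$$Y_t = \beta + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \phi_3 Y_{t-3} + \varepsilon_t$$

where $\varepsilon_t \sim N(0, \sigma^2)$.

a. Under what condition(s) is $Y_t$ a stationary series? (4)

A necessary condition is that $1 - \sum_i \phi_i \neq 0$ (this will be obvious in part b). More completely, all roots of $1 - \phi_1 z - \phi_2 z^2 - \phi_3 z^3 = 0$ must lie outside the unit circle.

b. When $Y_t$ is a stationary series, what is its mean? (4)

Taking expectations of both sides and making use of the stationarity assumption:

$$E[Y_t] = \beta + \phi_1 E[Y_t] + \phi_2 E[Y_t] + \phi_3 E[Y_t]$$

so

$$E[Y_t] = \frac{\beta}{1 - \phi_1 - \phi_2 - \phi_3}$$

c. When $Y_t$ is stationary, what is ρ(1) and ρ(2) for this series? (4)

Multiplying the demeaned model by $Y_{t-1}$ and by $Y_{t-2}$, taking expectations, and dividing by the variance gives the first two Yule-Walker equations:

$$\rho(1) = \phi_1 + \phi_2 \rho(1) + \phi_3 \rho(2)$$

$$\rho(2) = \phi_1 \rho(1) + \phi_2 + \phi_3 \rho(1)$$

so

$$\rho(1) = \frac{\phi_1 + \phi_2 \phi_3}{1 - \phi_2 - \phi_1 \phi_3 - \phi_3^2}$$

and

$$\rho(2) = \frac{\phi_1^2 + \phi_2 - \phi_2^2 + \phi_1 \phi_3}{1 - \phi_2 - \phi_1 \phi_3 - \phi_3^2}$$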
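As a check on the closed forms for ρ(1) and ρ(2), the sketch below solves the first two Yule-Walker equations of an AR(3) as a 2×2 linear system and compares the result with the formulas. The coefficient values (0.5, 0.2, 0.1) are hypothetical and chosen only because they give a stationary series; the function names are likewise illustrative.

```python
def ar3_rho_closed_form(p1, p2, p3):
    """rho(1) and rho(2) of a stationary AR(3) from the closed-form expressions."""
    d = 1 - p2 - p1 * p3 - p3 ** 2
    rho1 = (p1 + p2 * p3) / d
    rho2 = (p1 ** 2 + p2 - p2 ** 2 + p1 * p3) / d
    return rho1, rho2

def ar3_rho_yule_walker(p1, p2, p3):
    """Solve the first two Yule-Walker equations by Cramer's rule.

    (1 - p2) * rho1 -  p3 * rho2 = p1
    -(p1 + p3) * rho1 +      rho2 = p2
    """
    a11, a12, b1 = 1 - p2, -p3, p1
    a21, a22, b2 = -(p1 + p3), 1.0, p2
    det = a11 * a22 - a12 * a21
    rho1 = (b1 * a22 - a12 * b2) / det
    rho2 = (a11 * b2 - a21 * b1) / det
    return rho1, rho2

phis = (0.5, 0.2, 0.1)  # hypothetical coefficients, not from the exam
closed = ar3_rho_closed_form(*phis)
solved = ar3_rho_yule_walker(*phis)
print("closed form :", closed)
print("Yule-Walker :", solved)
```

The two routes should agree to floating-point precision for any coefficients that satisfy the stationarity condition.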

d. Depending on your answers to the above three questions, discuss how you would forecast future values of $Y_t$. Explain why your process produces unbiased forecasts. (4)

Estimate the phis using either OLS or ARIMA (they are identical processes) and then apply these phis to the correct lag of Y. This will be unbiased since OLS produces unbiased estimates of the coefficients AND the future values of the error term are, on average, zero.

3. I have posted 300 quarterly observations of a data set which I created on your website entitled “470 Midterm Monte Carlo Problem.” Please forecast 2319q4 and 2320q1. Provide 95% confidence intervals for your forecasts. I will give extra credit to confidence intervals that are constructed by hand (i.e., not using Stata’s stdf or stdp routines). I will give additional credit to answers that describe the process you used to arrive at your forecasts. (12)

I constructed this process as $Y_t = -62.5 + .05t + 25q + X_t$ where $X_t = .8X_{t-1} + e_t$ and $e_t$ is normally distributed with mean 0 and variance 100. The addition of the $X_t$ makes this an AR(1) process with a trend (of .05 units per period) and seasonality (that changes 25 units over each quarter). After creating a trend (gen t = _n) and a quarter indicator (xi i.q or gen q = 1 if quarter ==1, etc.), I find:

(setting optimization to BHHH)
(switching optimization to BFGS)
Iteration 6:   log likelihood = -1131.6992

ARIMA regression

Sample:  2244q4 - 2319q3              Number of obs   =        300
                                      Wald chi2(3)    =    4931.76
Log likelihood = -1131.699            Prob > chi2     =     0.0000

                           OPG
y               Coef.   Std. Err.        z    P>|z|    [95% Conf. Interval]
y
  t          .0983341   .0326942      3.01    0.003    .0342546    .1624135
  q          25.26229   .3749582     67.37    0.000    24.52739    25.99719
  _cons     -24.13648   6.080061     -3.97    0.000   -36.05318   -12.21978
ARMA
  ar
    L1.      .7924468   .0345344     22.95    0.000    .7247606    .8601331
/sigma       10.50395   .4601936     22.83    0.000    9.601992    11.40592

Note: The test of the variance against zero is one sided, and the two-sided
      confidence interval is truncated at zero.

A correlogram of these residuals gives:

. predict resid, resid

. corrgram resid

LAG        AC        PAC         Q     Prob>Q
  1     0.0219    0.0219    .14509    0.7033
  2     0.0046    0.0040    .15153    0.9270
  3    -0.0154   -0.0157     .2239    0.9736
  4    -0.0465   -0.0462    .88468    0.9267
  5     0.0081    0.0107    .90507    0.9699
  6     0.0234    0.0232    1.0743    0.9826
  7    -0.0654   -0.0702    2.3985    0.9345
  8     0.0116    0.0119    2.4404    0.9645
  9     0.0498    0.0548    3.2125    0.9553
 10     0.0203    0.0224    3.3407    0.9722
 11     0.0487    0.0435    4.0838    0.9674
 12    -0.0266   -0.0300    4.3059    0.9773
 13     0.0217    0.0344    4.4549    0.9853
 14    -0.0581   -0.0669    5.5237    0.9771
 15     0.0490    0.0583    6.2877    0.9745
 16     0.0177    0.0194     6.388    0.9833
 17    -0.0935   -0.1078    9.1872    0.9342
 18     0.0541    0.0639    10.126    0.9277
 19    -0.0083   -0.0090    10.149    0.9492
 20     0.0066    0.0064    10.163    0.9651
 21    -0.0517   -0.0843     11.03    0.9622
 22    -0.0243   -0.0099    11.222    0.9714
 23     0.0060    0.0273    11.233    0.9807

I would forecast this by first removing the trend and seasonality and then forecasting what is left over:


. reg y t q

Source            SS        df          MS        Number of obs   =       300
                                                  F(2, 297)       =    429.34
Model        257168.29       2   128584.145       Prob > F        =    0.0000
Residual     88950.118     297   299.495347       R-squared       =    0.7430
                                                  Adj R-squared   =    0.7413
Total       346118.408     299   1157.58665       Root MSE        =    17.306

y               Coef.   Std. Err.        t    P>|t|    [95% Conf. Interval]
t            .0976848   .0115374      8.47    0.000    .0749794    .1203902
q            25.09002   .8936776     28.08    0.000    23.33128    26.84876
_cons       -24.03389    3.00416     -8.00    0.000   -29.94603   -18.12175

. predict leftover, resid
(2 missing values generated)

. gen lleft = leftover[_n-1]
(2 missing values generated)

. reg leftover lleft

Source            SS        df          MS        Number of obs   =       299
                                                  F(1, 297)       =    500.55
Model       55663.2605       1   55663.2605       Prob > F        =    0.0000
Residual    33027.3289     297   111.203128       R-squared       =    0.6276
                                                  Adj R-squared   =    0.6264
Total       88690.5895     298   297.619428       Root MSE        =    10.545

leftover        Coef.   Std. Err.        t    P>|t|    [95% Conf. Interval]
lleft        .7926116    .035427     22.37    0.000    .7228918    .8623314
_cons       -.0044739   .6098538     -0.01    0.994   -1.204656    1.195708

My forecast for periods 301 and 302:

$$\hat{Y}_{301} = -24.033 + .0976 \times 301 + 25.09 \times 4 - .0044 + .7922 \times 18.60 = 120.44$$

$$\hat{Y}_{302} = -24.033 + .0976 \times 302 + 25.09 \times 1 - .0044 + .7922 \times 14.72 = 42.18$$

where 18.60 is leftover in period 300 and 14.72 is the predicted value of leftover in period 301.

The standard errors of this are easy to write down, but difficult to compute (because we are in a world of multiple independent variables). The forecast error is:

$$\varepsilon^f_{t+1} = Y_{t+1} - \hat{Y}_{t+1} = \beta_0 + \beta_1 (t+1) + \beta_2 q + \beta_3 Y_t + \varepsilon_{t+1} - \left(\hat{\beta}_0 + \hat{\beta}_1 (t+1) + \hat{\beta}_2 q + \hat{\beta}_3 Y_t\right)$$

So:

$$\varepsilon^f_{t+1} = \beta_0 - \hat{\beta}_0 + (\beta_1 - \hat{\beta}_1)(t+1) + (\beta_2 - \hat{\beta}_2) q + (\beta_3 - \hat{\beta}_3) Y_t + \varepsilon_{t+1}$$

In some ways, this is a nicer setup than what you usually see. Because we actually know all values of the variables on the right hand side of the equation ($t$, $q$, and $Y_t$) we won’t have any forecast error arising from having to estimate these variables. If OLS and/or the ARIMA process produce unbiased estimates of the betas, and if the error term is mean zero (all of which are reasonable assumptions), the average forecast error term is zero, or $E[\varepsilon^f_{t+1}] = 0$. The forecast variance is then:

$$\sigma_f^2 = E\left[(\varepsilon^f_{t+1})^2\right] - \left(E[\varepsilon^f_{t+1}]\right)^2 = E\left[(\varepsilon^f_{t+1})^2\right]$$

$$\sigma_f^2 = E\left[\left(\beta_0 - \hat{\beta}_0 + (\beta_1 - \hat{\beta}_1)(t+1) + (\beta_2 - \hat{\beta}_2) q + (\beta_3 - \hat{\beta}_3) Y_t + \varepsilon_{t+1}\right)^2\right]$$

With a little work, it is clear that the forecast variance is equal to:

$$\sigma_f^2 = Var(\hat{\beta}_0) + Var(\hat{\beta}_1)(t+1)^2 + Var(\hat{\beta}_2) q^2 + Var(\hat{\beta}_3) Y_t^2 + \sigma^2 + \text{a bunch of covariances of all of these terms}$$

Solving this involves knowing the form of the variances and covariances or some considerable linear algebra. However, if we are willing to assume that our estimated coefficients equal the true coefficients (perhaps a heroic assumption), then all of the variances and covariances turn out to be zero and we are left with $\sigma^2$ which, in our case, was estimated at $10.503^2 = 110.313$, not far from the 100 I used to create this data set. So, our confidence interval for period 301 is

120.44 ± 1.96 × 10.503 = {99.86, 141.02}

Forecasting period 302 is slightly harder in that we have to forecast period 301 in order to forecast period 302. This means that any forecast error we make in period 301 is compounded into our forecast error of period 302. This is easy to see in a simple AR(1) forecast two periods out:

$$Y_{t+2} = \phi Y_{t+1} + \varepsilon_{t+2} = \phi(\phi Y_t + \varepsilon_{t+1}) + \varepsilon_{t+2}$$

Since we know neither $\varepsilon_{t+2}$ nor $\varepsilon_{t+1}$ at time $t$ when we forecast $Y_{t+2}$, our forecast is $\hat{Y}_{t+2} = \phi^2 Y_t$.

So our forecast error is $\varepsilon^f_{t+2} = \phi \varepsilon_{t+1} + \varepsilon_{t+2}$ and the forecast variance is $(1 + \phi^2)\sigma^2$.
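The way forecast errors compound in an AR(1) generalizes to any horizon: with φ and σ treated as known, the h-step forecast variance is σ² times the sum of φ^(2j) for j = 0, ..., h−1. The sketch below applies that to the estimates quoted in the text (φ = .7922, σ = 10.503); the function name `ar1_forecast_se` is hypothetical, and ignoring coefficient uncertainty is the same simplifying assumption made above.

```python
import math

def ar1_forecast_se(phi, sigma, h):
    """Standard error of an h-step-ahead AR(1) forecast.

    Treats phi and sigma as known, so only future shocks contribute:
    variance = sigma^2 * (1 + phi^2 + ... + phi^(2(h-1))).
    """
    variance = sigma ** 2 * sum(phi ** (2 * j) for j in range(h))
    return math.sqrt(variance)

phi_hat, sigma_hat = 0.7922, 10.503  # estimates quoted in the text

se1 = ar1_forecast_se(phi_hat, sigma_hat, 1)  # one period ahead
se2 = ar1_forecast_se(phi_hat, sigma_hat, 2)  # two periods ahead
print("one-step standard error:", round(se1, 3))
print("two-step standard error:", round(se2, 2))
```

A 95% interval is then the point forecast plus or minus 1.96 times the matching standard error.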

In our case, the forecast variance would be $(1 + .7922^2) \times 110.313 \approx 179.54$, a standard error of about 13.40, so our confidence interval for period 302 is

42.18 ± 1.96 × 13.40 = {15.92, 68.44}

4. I have posted data from the St. Louis Federal Reserve Bank on the velocity of M2. You may find the original data set (and a short description of velocity) at: https://fred.stlouisfed.org/series/M2V. This is quarterly data beginning in 1959q1 and continuing through 2017q4. Please forecast 2018q1 and 2018q2. Provide 95% confidence intervals for your forecasts. I will give extra credit to confidence intervals that are constructed by hand (i.e., not using Stata’s stdf or stdp routines). I will give additional credit to answers that describe the process you used to arrive at your forecasts. (12)

My initial correlogram screams AR(2):

. corrgram VM2

LAG        AC        PAC         Q     Prob>Q
  1     0.9834    1.0035    231.14    0.0000
  2     0.9611   -0.4366    452.86    0.0000
  3     0.9357   -0.0457    663.91    0.0000
  4     0.9088   -0.0343    863.86    0.0000
  5     0.8807   -0.0856    1052.4    0.0000
  6     0.8518    0.0161    1229.6    0.0000
  7     0.8227   -0.0133    1395.7    0.0000
  8     0.7934   -0.0053    1550.7    0.0000
  9     0.7659    0.0699    1695.9    0.0000
 10     0.7393   -0.0452    1831.7    0.0000
 11     0.7118   -0.1350    1958.2    0.0000
 12     0.6831   -0.0371    2075.2    0.0000
 13     0.6559    0.1579    2183.5    0.0000

I estimate an AR(2):

. arima VM2, ar(1/2)

Iteration 8:   log likelihood =  606.58255

ARIMA regression

                                      Wald chi2(2)    =   18490.81
Log likelihood = 606.5825             Prob > chi2     =     0.0000

                           OPG
VM2             Coef.   Std. Err.        z    P>|z|    [95% Conf. Interval]
VM2
  _cons      1.733572   .1865255      9.29    0.000    1.367989    2.099155
ARMA
  ar
    L1.      1.431532   .0491312     29.14    0.000    1.335236    1.527827
    L2.     -.4388779   .0495756     -8.85    0.000   -.5360443   -.3417115
/sigma       .0183191    .000587     31.21    0.000    .0171686    .0194696

One thing about this that bothers me some is that the coefficients on the two AR terms sum to almost one. This also screams something to me: unit root (see the answer to 2a of this exam). When I look at the residuals from this model, I find something that looks like white noise:

. predict resid, resid

. corrgram resid

LAG        AC        PAC         Q     Prob>Q
  1    -0.0079   -0.0079    .01492    0.9028
  2     0.0147    0.0146    .06706    0.9670
  3     0.0200    0.0204    .16378    0.9832
  4     0.1110    0.1109    3.1445    0.5339
  5     0.0030    0.0047    3.1466    0.6774
  6     0.0198    0.0142    3.2425    0.7778
  7     0.0305    0.0294    3.4713    0.8383
  8    -0.0529   -0.0662    4.1604    0.8424
  9    -0.0291   -0.0352    4.3699    0.8854
 10     0.0986    0.0972    6.7882    0.7453
 11     0.1421    0.1447    11.828    0.3767
 12    -0.1437   -0.1387    17.004    0.1494
 13     0.0181    0.0102    17.086    0.1954
 14     0.0687    0.0621    18.282    0.1942
 15    -0.0203   -0.0474    18.387    0.2429

At this point, I’m close to being done. We might add some quarterly dummies (to check for seasonality) and a time trend. When I do this, nothing changes:

. arima VM2 time _Iq_2-_Iq_4, ar(1/2)

ARIMA regression

                                      Wald chi2(6)    =   18770.18
Log likelihood = 607.578              Prob > chi2     =     0.0000

                           OPG
VM2             Coef.   Std. Err.        z    P>|z|    [95% Conf. Interval]
VM2
  time      -.0009805   .0022242     -0.44    0.659   -.0053399     .003379
  _Iq_2      .0018009   .0018879      0.95    0.340   -.0018994    .0055012
  _Iq_3       .001086   .0021982      0.49    0.621   -.0032223    .0053943
  _Iq_4      -.000392   .0018215     -0.22    0.830   -.0039622    .0031781
  _cons      1.849891   .3939403      4.70    0.000    1.077783       2.622
ARMA
  ar
    L1.      1.428297   .0519387     27.50    0.000    1.326499    1.530096
    L2.      -.435679   .0523995     -8.31    0.000   -.5383802   -.3329779
/sigma       .0182422   .0006322     28.85    0.000    .0170031    .0194813

So, I return to the original model to calculate forecast variances.

In an AR(2) model, my forecast of $Y_{t+1}$ is $\phi_1 Y_t + \phi_2 Y_{t-1}$ while the actual $Y_{t+1}$ is $\phi_1 Y_t + \phi_2 Y_{t-1} + \varepsilon_{t+1}$. Assuming my estimated φs are equal to the actual φs, my forecast error is $\varepsilon_{t+1}$ and my forecast variance is $\sigma^2$. Using my preferred model, that is simply $.0183^2 \approx .0003$.

My forecast of period $Y_{t+2}$ is a little more complicated for the same reason discussed in the answer to problem #3 of this exam. My forecast of $Y_{t+2}$ is $\phi_1 Y_{t+1} + \phi_2 Y_t$, but notice that I don’t observe $Y_{t+1}$ at the time I am forecasting two periods into the future. So, I actually have to forecast $Y_{t+2}$ using

$$\phi_1(\phi_1 Y_t + \phi_2 Y_{t-1}) + \phi_2 Y_t = (\phi_1^2 + \phi_2) Y_t + \phi_1 \phi_2 Y_{t-1}$$

The actual value is $Y_{t+2} = (\phi_1^2 + \phi_2) Y_t + \phi_1 \phi_2 Y_{t-1} + \phi_1 \varepsilon_{t+1} + \varepsilon_{t+2}$. The difference between my forecast and the actual value, the forecast error, is $\phi_1 \varepsilon_{t+1} + \varepsilon_{t+2}$, so the forecast variance must be $(1 + \phi_1^2)\sigma^2$. Using my preferred model, this is $(1 + 1.431^2) \times .0183^2 \approx .0010$.

In this problem, we’re asked to forecast the next two time periods. The last two in the data set are:

        t         VM2
235.    2017q3    1.427
236.    2017q4    1.431

One thing I need to remember is that the arima routine does not report to me the intercept, but instead the sample mean of the data. Thus, the “constant” reported above of 1.73 isn’t the intercept, but instead the average of VM2 over the dataset. To be clear, the mean of a stationary AR(2) process is given by:

$$E[Y] = \beta_0 + \phi_1 E[Y] + \phi_2 E[Y] \quad\text{so}\quad E[Y] = \frac{\beta_0}{1 - \phi_1 - \phi_2}$$

In our case, Stata tells us that E[Y] = 1.73 and since it also reveals $\phi_1$ and $\phi_2$ we can figure out that $\beta_0 = 1.73 \times (1 - 1.431 + .438) \approx .012$.

So, my forecasts for the next two periods are:

2018q1: .012 + 1.431 × 1.431 - .438 × 1.427 = 1.435
2018q2: .012 + 1.431 × 1.435 - .438 × 1.431 = 1.439

My confidence intervals use the square roots of the forecast variances, .0183 for one period ahead and about .032 for two periods ahead:

2018q1: 1.435 ± 1.96 × .0183 = {1.399, 1.471}
2018q2: 1.439 ± 1.96 × .032 = {1.376, 1.502}
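The AR(2) forecasts and intervals can be recomputed from the reported estimates. The sketch below takes the mean, AR coefficients and /sigma from the `arima VM2, ar(1/2)` output and the last two observations from the data listing; the function names are hypothetical, the intercept is recovered as mean × (1 − φ1 − φ2), and the coefficients are treated as known when computing standard errors.

```python
import math

def ar2_forecasts(mean, phi1, phi2, y_t, y_tm1):
    """One- and two-step AR(2) point forecasts from a reported process mean."""
    beta0 = mean * (1 - phi1 - phi2)         # intercept implied by the mean
    f1 = beta0 + phi1 * y_t + phi2 * y_tm1   # forecast of Y_{t+1}
    f2 = beta0 + phi1 * f1 + phi2 * y_t      # forecast of Y_{t+2}, built on f1
    return beta0, f1, f2

def ar2_forecast_ses(phi1, sigma):
    """Forecast standard errors at horizons 1 and 2, coefficients taken as known."""
    return sigma, sigma * math.sqrt(1 + phi1 ** 2)

def interval(point, se, z=1.96):
    """Symmetric interval: point forecast plus or minus z standard errors."""
    return point - z * se, point + z * se

# Estimates from the arima VM2, ar(1/2) output; 2017q4 and 2017q3 from the data.
beta0, f1, f2 = ar2_forecasts(1.733572, 1.431532, -0.4388779, 1.431, 1.427)
se1, se2 = ar2_forecast_ses(1.431532, 0.0183191)

print("implied intercept:", round(beta0, 4))
print("2018q1:", round(f1, 3), [round(v, 3) for v in interval(f1, se1)])
print("2018q2:", round(f2, 3), [round(v, 3) for v in interval(f2, se2)])
```

Because φ1 + φ2 is so close to one, the implied intercept is sensitive to rounding in the coefficients, which is why the sketch uses the unrounded estimates.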
