Forecasting the Probability of Recession


May 13, 2013

Andrew Gellert

Arman Oganisian

Economic Forecasting

ECN 409-001

Dr. Fang Dong



Abstract

Building on previous research, we estimate a probit model to forecast the probability of recession one month ahead. We use data from the St. Louis Federal Reserve's FRED database to estimate four different models. We choose the optimal model based on its overall fit and its ability to make in-sample predictions of turning points between recession and expansion. The optimal model is then used to generate 18 out-of-sample forecasts from October 2011 to March 2013. These forecasts demonstrate the model's ability to capture real events: the predicted probability of recession jumped in periods of instability and dropped during periods of stability.

Introduction

The paper begins with a literature review surveying some key papers that build probit models of the probability of recession. Some papers have built dynamic models which exploit the autocorrelation structure of the binary dependent variable. Others use various financial explanatory variables, such as the yield curve, to capture the so-called wisdom of the crowd contained in liquid secondary markets.

We estimate four different models. Two of them are static models and the other

two are nonhomogeneous Markov processes. The main model is described in our

Model section, which is followed by a brief description of our data.

The next section estimates the four models and compares their in-sample fits as

well as their ability to predict turning points in the economy. Model 1 is chosen as the

optimal model because it exhibits the best fit and turning-point predictions.

In the final section, we use model 1 to generate 18 out-of-sample forecasts from October 2011 to March 2013. The model demonstrates a fine ability to capture the real macroeconomic risks stemming from the European sovereign debt crisis, which raged from fall 2011 to fall 2012. During the crisis, predictions became very volatile, fluctuating around 50%. After the German constitutional court approved Greek bailout funds and the probability of a Greek exit declined, the model's predictions decreased to about 10% and have stayed there through March 2013.

Literature Review

Our model has several features that we borrow from previous models. First, we include the slope of the yield curve¹ as an explanatory variable. Second, we include a lagged recession dummy as an explanatory variable, transforming the model into a first-order nonhomogeneous Markov process. Finally, we use a probit model, which is the most widely used probability model in similar research.

Dueker (1997) presents a theoretical argument for why the slope of the yield curve contains forward-looking information about the economy. He argues that the yields of long-term and short-term securities, because they are traded on a liquid secondary market, contain the so-called wisdom of the crowd. The yield curve, which plots the yields of bonds with different maturities, normally slopes upwards. Longer-maturity debt carries a larger risk of the issuer defaulting and, thus, the market prices the debt at a premium to shorter-maturity bonds². When the economic outlook dims, the yield curve may flatten or invert. This is because investors expect looser monetary policy (i.e. lower short-term rates), so they choose to sell their short-term debt and buy long-term debt to lock in higher yields. This causes short-term rates to rise and long-term rates to decline.

¹ Unless otherwise noted, hereafter "yield curve" refers to the difference in yields between a 10-year government bond and a 3-month T-bill.

² Longer-maturity debt is priced at a premium because other risks, such as inflation spikes, also increase with time.



Thus, the yield curve flattens or its slope (the long-term yield minus the short-term yield) turns negative. The magnitude of the decline depends on the crowd's view of the severity and duration of the coming downturn.³
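As a small illustration of the slope measure discussed above, the following Python sketch computes the spread between a long-term and a short-term yield and labels the shape of the curve. The function names, the 0.25-point "flat" band, and the sample yields are assumptions made for illustration; none of them come from the papers cited here.

```python
def yield_curve_slope(long_yield_pct, short_yield_pct):
    """Slope of the yield curve: long-term yield minus short-term yield (percentage points)."""
    return long_yield_pct - short_yield_pct


def classify_slope(slope, flat_band=0.25):
    """Label the curve. The width of the 'flat' band is an arbitrary illustrative choice."""
    if slope < 0:
        return "inverted"
    if slope <= flat_band:
        return "flat"
    return "normal"


# Hypothetical (10-year yield, 3-month yield) pairs, in percent.
sample_yields = [(4.5, 2.0), (3.1, 3.0), (3.0, 3.6)]
labels = [classify_slope(yield_curve_slope(long_y, short_y))
          for long_y, short_y in sample_yields]
print(labels)
```

A flat or inverted label in this sketch corresponds to the situation Dueker associates with a dimming economic outlook.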

There is a wide body of literature devoted to producing multi-period forecasts of the probability of recession using the yield curve. Estrella and Mishkin (1998) examine the out-of-sample forecasting performance of several financial variables, including the yield curve (the spread between the 10-year and 3-month Treasury yields), the NYSE composite stock index, the Commerce Department leading index, and the Stock-Watson leading index.

They evaluate the performance of the variables using a pseudo-R² computed after estimating their probit model by maximum likelihood.⁴ At short forecasting horizons of one to three quarters, stock prices have superior forecasting ability. Beyond this period, however, the slope of the yield curve dominates. The pseudo-R² for the yield curve is only .072 for 1-quarter-ahead forecasts but increases to .295 for 4-quarters-ahead forecasts. The NYSE index's pseudo-R², by contrast, is .161 for 1-quarter-ahead forecasts but declines to .016 for 4-quarters-ahead forecasts.
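To make the fit measure concrete, here is a minimal Python sketch of a pseudo-R² of the kind described above. It assumes the measure takes the form 1 - (log L_u / log L_c)^(-(2/T) log L_c), where L_u is the unconstrained maximized likelihood, L_c is the maximized likelihood of an intercept-only model, and T is the number of observations. The function name and the sample log-likelihoods are hypothetical.

```python
def estrella_pseudo_r2(loglik_u, loglik_c, n_obs):
    """Pseudo-R2 = 1 - (logLu / logLc) ** (-(2 / T) * logLc).

    loglik_u: maximized log-likelihood of the full model (<= 0 for a binary model)
    loglik_c: maximized log-likelihood of the intercept-only model (< 0)
    n_obs:    number of observations T
    """
    if loglik_c >= 0 or loglik_u > 0 or loglik_u < loglik_c:
        raise ValueError("expected loglik_c <= loglik_u <= 0 with loglik_c < 0")
    ratio = loglik_u / loglik_c            # lies between 0 and 1
    exponent = -(2.0 / n_obs) * loglik_c   # positive because loglik_c < 0
    return 1.0 - ratio ** exponent


# Hypothetical log-likelihoods, for illustration only.
no_fit = estrella_pseudo_r2(-164.2, -164.2, 415)   # full model no better than the constant
some_fit = estrella_pseudo_r2(-71.2, -164.2, 415)
print(no_fit, some_fit)
```

The measure equals 0 when the covariates add nothing to the intercept-only model and approaches 1 as the full model's log-likelihood approaches zero.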

Dueker runs Estrella and Mishkin's probit model using monthly 30-year Treasury yields (as opposed to the quarterly 10-year Treasury yields) and confirms their results. He finds that the yield curve's predictive power is optimized with a lag of 9 months. The yield curve becomes the dominant predictor after 3 months (1 quarter), which is consistent with Estrella and Mishkin. Before ending his paper, he presents a probit model augmented with a Markov switching process. He argues that this may be superior since it exploits the autocorrelation structure of the binary dependent variable.

³ The yield curve is not foolproof. Monetary policy is not based solely on the expected future state of the economy. It is also based on inflation expectations and pure randomness.

⁴ $\text{pseudo-}R^2 = 1 - \left(\frac{\log L_u}{\log L_c}\right)^{-\frac{2}{T}\log L_c}$, where 0 and 1 correspond to no fit and perfect fit, respectively ($L_u$ and $L_c$ are the unconstrained and intercept-only maximized likelihoods, and $T$ is the number of observations). The use of this metric is justified in Estrella and Mishkin's paper, as is the use of Newey-West standard errors to handle autocorrelated forecast errors.

Chauvet and Potter (2001) criticize the use of Estrella and Mishkin's probit model, claiming that the model is misspecified in two fundamental ways: (1) estimated parameters are not constant over time and (2) the model does not properly account for autocorrelated errors. Estrella, Rodrigues, and Schich (2003), however, examine both U.S. and German data and find no evidence of breakpoints. To address these issues, Chauvet and Potter develop a computationally difficult method based on Bayesian numerical methods (Kauppi, 2008).

According to Kauppi, this approach and other similar approaches are problematic in their interpretation, practical implementation, and flexibility. Instead, he builds a dynamic probit model by including a lagged dependent variable as an explanatory variable, thus modeling the economy as a first-order nonhomogeneous Markov chain. It is nonhomogeneous because the transition matrix varies with the slope of the yield curve. He finds no evidence of parameter instability, provided that the apparent serial dependence of the recession indicator is taken into account by using the lagged dependent variable as an explanatory variable.

Kauppi's model is

$$P(y_t = 1) = \Phi(\omega + \delta y_{t-1} + \beta x_{t-12}).$$

Here $x_{t-12}$ is the lagged yield curve, $y_{t-1}$ is the lagged dependent variable, and $\Phi(\cdot)$ is the cumulative distribution function of $N(0, \sigma^2)$. The model predicts the probability of recession 12 months ahead, where $y_t = 1$ indicates a state of recession. This probability varies with respect to the slope of the yield curve.

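The probabilities output by the model form the transition matrix, which, again, varies with respect to the slope of the yield curve. In a two-state Markov chain (state 1 is recession and state 2 is no recession), the transition probabilities from one state to another can be expressed in a 2x2 matrix:⁵

$$
\begin{pmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{pmatrix}
=
\begin{pmatrix} P_{11} & 1 - P_{11} \\ 1 - P_{22} & P_{22} \end{pmatrix}
=
\begin{pmatrix}
\Phi(\omega + \delta + \beta x_{t-12}) & 1 - \Phi(\omega + \delta + \beta x_{t-12}) \\
\Phi(\omega + \beta x_{t-12}) & 1 - \Phi(\omega + \beta x_{t-12})
\end{pmatrix}
$$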
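The following Python sketch shows how such a transition matrix could be evaluated for a given yield-curve value. It is illustrative only: the parameter names omega, delta and beta, the coefficient values, and the use of the standard normal c.d.f. (unit variance) are all assumptions, not estimates from Kauppi's paper or from ours.

```python
import math


def norm_cdf(z):
    """Standard normal c.d.f., written with math.erf from the standard library."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def transition_matrix(omega, delta, beta, x_lagged):
    """2x2 matrix with state 1 = recession and state 2 = no recession.

    Row i holds the probabilities of moving from state i to states 1 and 2.
    """
    p11 = norm_cdf(omega + delta + beta * x_lagged)  # recession -> recession
    p21 = norm_cdf(omega + beta * x_lagged)          # no recession -> recession
    return [[p11, 1.0 - p11],
            [p21, 1.0 - p21]]


# Hypothetical coefficients: a negative beta means a steeper curve lowers recession risk.
steep = transition_matrix(omega=-1.5, delta=3.0, beta=-0.5, x_lagged=2.0)
inverted = transition_matrix(omega=-1.5, delta=3.0, beta=-0.5, x_lagged=-1.0)
print(steep)
print(inverted)
```

Because the entries depend on the lagged yield curve, the matrix changes from month to month, which is what makes the chain nonhomogeneous.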

We extend Kauppi's dynamic probit model by adding a causal dimension in the form of several leading indicators of consumption, housing, and investment. We will see whether this extension significantly improves the model's probability forecasts.

Model

We will construct a time series model that outputs a 1-month-ahead forecast of the probability of recession. Our main model takes the following form:

$$\Pr(R_{t+1} = 1) = \Phi(X\beta + \varepsilon), \qquad \varepsilon \sim \text{i.i.d. } N(0, \sigma^2)$$

Where:

$\Phi(\cdot)$ = c.d.f. of the normal distribution.

n = 414

k = 4

R = a column vector containing n observations of either 0 or 1, where 0 indicates a state of no recession in time t+1 and 1 indicates recession in time t+1.

$\beta$ = a vector containing 5 coefficients and one constant term to be estimated.

X = a 415x6 matrix of the following independent variables at time t (except for the yield curve and lagged dependent variable, which are lagged): housing starts (hs), industrial production index (ip), consumer sentiment (cs), yield curve (yc) (lagged 11 months), and the dependent variable (lagged 1 month).

⁵ P11 is the probability of moving from a state of recession in period t to another state of recession in period t+1. P12 is the probability of moving from a state of recession in period t to a state of no recession in period t+1.

As mentioned at the end of the literature review, our research seeks to improve

previous models using the yield curve and Markov-switching by adding several

covariates.

We include housing starts as a leading indicator of the housing sector, which is a large component of residential investment and, consequently, GDP. Building permits would be an equally valid and nearly interchangeable leading indicator: a simple correlation coefficient indicates that the two variables are correlated with r = .98.

The industrial production index is a good leading indicator of the industrial sector (which includes manufacturing, mining, and utilities). This is an interest-sensitive sector, so it is particularly useful when dealing with the business cycle. Adverse shocks to the economy tend to hit this sector before others. Thus, IP is a good leading indicator of the economy as a whole. Additionally, we include the Michigan sentiment survey as a leading indicator of consumption activity, which accounts for some 70% of total U.S. output.

We also include the yield curve (lagged 11 months) and the dependent variable (lagged 1 month) for reasons outlined in the literature review. We lag the yield curve 11 months because previous research shows that the yield curve's predictive power is optimal at a 3-4 quarter horizon. Since the yield curve value used at time t is the value from t-11, the forecast target t+1 (the 1-month-ahead forecast) lies 12 months after that observation and so falls in the optimal time horizon. For a 2-months-ahead forecast, a 10-month lag must be used for the yield curve.

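Thus, the previous model,

$$\Pr(R_{t+1} = 1) = \Phi(\beta_0 + \beta_1 HS_t + \beta_2 IP_t + \beta_3 CS_t + \beta_4 YC_{t-11} + \beta_5 R_t + \varepsilon), \qquad \varepsilon \sim \text{i.i.d. } N(0, \sigma^2),$$

which predicts the probability of recession in t+1, can be used to predict f steps ahead with the following generalized model:

$$\Pr(R_{t+f} = 1) = \Phi(\beta_0 + \beta_1 HS_t + \beta_2 IP_t + \beta_3 CS_t + \beta_4 YC_{t-(12-f)} + \beta_5 \Pr(R_{t+(f-1)} = 1) + \varepsilon)$$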
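The recursion in the generalized model can be sketched in Python as follows. This is a rough illustration under stated assumptions: the error term is set to its mean of zero, the standard normal c.d.f. is used, and the coefficients and input values are hypothetical placeholders, not our estimates. The helper names are also invented for this sketch.

```python
import math


def norm_cdf(z):
    """Standard normal c.d.f. via math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def yield_curve_lag(f):
    """Months of lag applied to the yield curve for an f-step-ahead forecast: 12 - f."""
    if not 1 <= f <= 12:
        raise ValueError("this sketch only covers horizons of 1 to 12 months")
    return 12 - f


def iterated_forecasts(coefs, hs_t, ip_t, cs_t, yc_by_lag, r_t, horizon):
    """Forecast Pr(R_{t+f} = 1) for f = 1..horizon using only information dated t.

    yc_by_lag maps a lag in months to the yield-curve value observed that many
    months before t. For f = 1 the known state R_t is used; for larger f the
    previous step's predicted probability takes its place.
    """
    b0, b1, b2, b3, b4, b5 = coefs
    previous = float(r_t)
    forecasts = []
    for f in range(1, horizon + 1):
        yc = yc_by_lag[yield_curve_lag(f)]
        index = b0 + b1 * hs_t + b2 * ip_t + b3 * cs_t + b4 * yc + b5 * previous
        previous = norm_cdf(index)
        forecasts.append(previous)
    return forecasts


# Hypothetical coefficients and inputs, for illustration only.
HYPOTHETICAL_COEFS = (-0.5, -0.001, 0.0, -0.01, -0.4, 2.5)
yc_history = {11: 1.5, 10: 1.2, 9: 0.8}
forecasts = iterated_forecasts(HYPOTHETICAL_COEFS, 1000.0, 90.0, 80.0,
                               yc_history, r_t=0, horizon=3)
print(forecasts)
```

Note that HS, IP and CS stay fixed at their time-t values for every horizon, which is the information-set limitation of this setup.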

This model assumes that all forecasts are made in time t, so that the information set available at time t excludes all information that becomes available after this period. This is a major drawback because the information set does not grow with the prediction horizon, which decreases the accuracy of forecasts with high f. The most accurate forecast, therefore, is the forecast for f=1. If the forecast for period t+f were instead made in period t+(f-1), this would not be the case.

We will also estimate three other variations of this model. One (model 1) omits the lagged recession variable. The third includes only the yield curve, and the fourth includes only the yield curve and the lagged recession variable. Model 1, which will prove superior to the other three models, does not succumb to the shortcomings of the model described above. Since it is not a dynamic model, previous predictions are not explanatory variables; the information set depends only on values of HS, IP, CS, and YC, which are exogenously determined.



Data

We retrieved all of our data from the Federal Reserve Economic Data (FRED) database at the St. Louis Federal Reserve Bank. All data series are seasonally adjusted and recorded on a monthly basis. Our sample period is from March 1, 1977 to September 1, 2011. Summary statistics are available in the table below along with the expected sign of each variable's coefficient. All lags were constructed in Excel.

Table 1: Data Summary

Variable               N    Mean     St. Dev.  Minimum  Maximum  Expected Sign
Housing Starts         415  1459.95  405.98    478      2273     -
Industrial Production  415  72.15    17.68     46.6     100.7    -
Consumer Sentiment     415  85.39    13.28     51.7     112      -
Yield Curve            415  1.16     1.21      -3.1     3.4      -
Recession              415  0.13     0.34      0        1        +
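The lag structure of the data set can be sketched in Python as follows. This is only an illustration of the alignment: the lags in this paper were built in Excel, and the function name and toy series here are hypothetical.

```python
def build_rows(hs, ip, cs, yc, rec, yc_lag=11):
    """Pair each month's regressors with next month's recession state.

    For month t the regressors are HS_t, IP_t, CS_t, YC_{t - yc_lag} and R_t,
    and the target is R_{t+1}. Months without a lagged yield curve or without
    a following month are dropped.
    """
    n = len(rec)
    if not (len(hs) == len(ip) == len(cs) == len(yc) == n):
        raise ValueError("all series must have the same length")
    rows = []
    for t in range(yc_lag, n - 1):
        features = (hs[t], ip[t], cs[t], yc[t - yc_lag], rec[t])
        target = rec[t + 1]
        rows.append((features, target))
    return rows


# Toy monthly series of length 15 (values are just index markers).
months = list(range(15))
toy_yc = [100 + m for m in months]
toy_rec = [0] * 14 + [1]
rows = build_rows(months, months, months, toy_yc, toy_rec)
print(rows[0])   # first usable month is t = 11
```

With an 11-month yield-curve lag and a one-month-ahead target, the first 11 months and the final month of the raw series cannot form complete rows.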

The chart below is a time series of the 11-month lagged yield curve from March

1, 1977 to September 1, 2011, a key component of our data. Since the series is lagged, the

yield curve is flat or negative during or right before recession (indicated in blue).

[Figure 1: Yield Curve Slope Dips Into Nega... Axes: yield curve slope (%) by month, March 1977 to March 2011; recessions shaded in blue.]

Estimation and Results

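We estimate the four models and compare the fits of each model. The fourth one is almost identical to Kauppi's model.

Model 1: $\Pr(R_{t+1} = 1) = \Phi(\beta_0 + \beta_1 HS + \beta_2 IP + \beta_3 CS + \beta_4 YC + \varepsilon)$

Model 2: $\Pr(R_{t+1} = 1) = \Phi(\beta_0 + \beta_1 HS + \beta_2 IP + \beta_3 CS + \beta_4 YC + \beta_5 R_t + \varepsilon)$

Model 3: $\Pr(R_{t+1} = 1) = \Phi(\beta_0 + \beta_1 YC + \varepsilon)$

Model 4: $\Pr(R_{t+1} = 1) = \Phi(\beta_0 + \beta_1 YC + \beta_2 R_t + \varepsilon)$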

The estimation results can be found in the appendix to this paper. In model 1, all

the coefficients of the independent variables are significant with p



Table 2: Max-Rescaled R²

Model    Type                 Max-rescaled R²
Model 1  Static               0.6605
Model 2  Dynamic              0.8585
Model 3  Static Yield Curve   0.2831
Model 4  Dynamic Yield Curve  0.8286

Clearly, the dynamic model with all of the explanatory variables (model 2) achieves the best fit. Indeed, its R² value is slightly higher than that of model 4, which is an imitation of Kauppi's model.⁷

A graphical analysis of the models' predictions against actual events is useful. Figure 2 plots the two static models' predictions of recession in time t. The blue areas indicate recession in time t. Model 1 fits the data better than model 3. This indicates that a model with both the yield curve and the selected independent variables outperforms the yield curve alone. However, both are extremely volatile. Model 1's predictions were very volatile from 1995 to 2001, before the recession in the early 2000s.

Nevertheless, model 1 is very good at identifying turning points in the economy. When the economy was not yet in a recession in the early 1980s, the model predicted an 84.5% chance of recession in the following month, up from only 32% in its previous prediction, and a recession did begin that month. For each of the 3 months before the 2008 recession, the model predicted a 65% chance of recession. While the economy was still in the 2008 recession, the model predicted a 5% chance of recession the next month. The recession did end the following month.

[Figure 2: Static Models. Series: Model 1 and Model 3. Axes: predicted probability of recession (0% to 100%) by month, March 1977 to March 2011; recessions shaded in blue.]

⁷ This model is just an imitation. Kauppi used Bayesian estimation and a pseudo-R². Thus, our results are not directly comparable. The highest R² out of all of Kauppi's models was .77. Kauppi and Estrella use the same R²:

$$\text{pseudo-}R^2 = 1 - \left(\frac{\log L_u}{\log L_c}\right)^{-\frac{2}{T}\log L_c}$$

Here, $L_u$ is the unconstrained maximized likelihood function. $L_c$ is the constrained likelihood function with the constraint that all coefficients, except the constant, are zero. $T$ is the number of observations.

As another example, the model assigned a 7% chance of recession for the month of March 1990. Its prediction for the next month jumped to 20%, then to 40% the month after, and it eventually called a 55% chance of recession in August. Indeed, a recession did begin in August. While the nation was still in recession, the model predicted a 27% chance of recession for the next month, down from 40% the month before. The recession did end the next month.
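The turning-point comparison described in these examples was done by inspection. One way it might be mechanized is sketched below in Python, assuming a 50% cutoff for calling a recession; the cutoff, the function names, and the two toy probability paths are assumptions for illustration, not results from our models.

```python
def turning_points(states):
    """Indices t at which the recession state differs from the state in t-1."""
    return [t for t in range(1, len(states)) if states[t] != states[t - 1]]


def turning_point_hits(probs, states, threshold=0.5):
    """For each turning point t, report whether the forecast for month t,
    cut at `threshold`, matches the new state in month t."""
    hits = []
    for t in turning_points(states):
        predicted_state = 1 if probs[t] >= threshold else 0
        hits.append((t, predicted_state == states[t]))
    return hits


# Toy data: 1 = recession. Each path holds the forecast issued for that month.
states = [0, 0, 1, 1, 1, 0, 0]
responsive_path = [0.07, 0.20, 0.55, 0.90, 0.80, 0.27, 0.10]
sticky_path = [0.04, 0.04, 0.04, 0.97, 0.97, 0.97, 0.05]

responsive_hits = turning_point_hits(responsive_path, states)
sticky_hits = turning_point_hits(sticky_path, states)
print(responsive_hits)
print(sticky_hits)
```

A path that simply echoes last month's state, like the second toy path, misses both the start and the end of the toy recession even though it is correct in most months.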



Figure 3 plots the probability of recession in time t as predicted by models 2 and 4. Again, the shaded area represents recession in time t. The dynamic models' forecasts are much less volatile than the static forecasts. However, they do a poor job of identifying turning points in the economy. Model 2 assigned a 4% probability of recession to August 1981, yet a recession did start that month. That recession ended in December 1982, but model 2 had assigned a 97% chance of recession to that month. Model 4 made a similar blunder with that recession: it assigned a 5% probability of recession to August 1981 and a 97% probability of recession to December 1982. For every month between the first and last months of the recession, the model consistently predicted recession probabilities above 90%.

Both dynamic models follow this pattern for every recession in our sample. We believe this occurs because of the large coefficient on the recession lag. In both dynamic models, it is the largest and most significant coefficient. This emphasis on the recessionary state in the current month as a predictor of recession next month is a flaw: the models are unable to correctly predict turning points.

[Figure 3: Dynamic Models. Series: Model 2 and Model 4. Vertical axis: predicted probability of recession (0% to 100%).]

Thus, we decided to select our optimal model from the static group. We select the model with the highest max-rescaled R², model 1, whose value of .66 makes it the best-fitting static model.

Out-of-Sample Predictions

While the in-sample forecasts are good, it remains to be seen whether the out-of-sample forecasts are accurate. Figure 4 plots model 1's in-sample and out-of-sample predictions of the probability of recession. The out-of-sample portion of the line is marked in red with an arrow pointing to the future. We make 18 out-of-sample predictions.

[Figure 4: Model 1 with Out-of-Sample Predictions. Series: Model 1. Axes: predicted probability of recession (0% to 100%) by month, March 1977 to March 2013.]

The forecasts are volatile in the first months of the out-of-sample period. The model predicted a 55% chance of recession in October 2011, the first month in the out-of-sample forecast period. It fluctuated widely around 50% over the following months before falling to around 10% from December 2012 to March 2013. This fluctuation and apparent uncertainty is not without cause, and we do not believe that it reflects inaccuracies in the model.

Instead, the period of uncertainty, from October 2011 to November 2012, corresponds to the uncertainty surrounding the European sovereign debt crisis. Yields on long-term Spanish, Greek, and Italian bonds were soaring throughout this period. Analysts were entertaining the possibility of contagion, as U.S. banks with large stakes in European sovereign debt were at risk. Similarly, experts and political leaders were questioning the very existence of the European Union. There was widespread fear that a Greek exit from the euro would spark capital flight out of the continent and plunge the EU into a recession, which in turn would cause a double-dip recession in the United States. These fears largely subsided after the German constitutional court decided that a Greek bailout was legal. The fear of a Greek exit and subsequent macroeconomic shocks receded. The model reflects this with lower recession probability forecasts. The average forecast from December 2012 to March 2013 was 10%.

Conclusion

Both in-sample and out-of-sample predictions confirm that model 1 is the superior performer, as discussed in the previous section. Furthermore, it is worth noting that model 1, which includes additional explanatory variables, is superior to model 3 in terms of in-sample fit. Thus, a model which augments the yield curve with IP, HS, and CS is superior to a model which includes YC as the sole explanatory variable. Model 1 is also superior to both dynamic models, which fail to predict turning points in the economy.



Bibliography

1. Arturo Estrella & Frederic S. Mishkin, 1996. "Predicting U.S. recessions: financial variables as leading indicators," Research Paper 9609, Federal Reserve Bank of New York.
   o http://www.albany.edu/~xl843228/teaching/ECON350/EstrellaMishkin1998.pdf

2. Arturo Estrella & Frederic S. Mishkin, 1996. "The Yield Curve as a Predictor of U.S. Recessions," Current Issues in Economics and Finance, Federal Reserve Bank of New York, issue Jun.
   o http://www.newyorkfed.org/research/current_issues/ci2-7.pdf

3. Heikki Kauppi, 2008. "Yield-Curve Based Probit Models for Forecasting U.S. Recessions: Stability and Dynamics," Discussion Papers 31, Aboa Centre for Economics.
   o http://ethesis.helsinki.fi/julkaisut/eri/hecer/disc/221/yieldcur.pdf

4. Marcelle Chauvet & Simon Potter, 2005. "Forecasting recessions using the yield curve," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 24(2), pages 77-103.
   o http://www.newyorkfed.org/research/staff_reports/sr134.pdf

5. Michael Dueker, 1997. "Strengthening the case for the yield curve as a predictor of U.S. recessions," Review, Federal Reserve Bank of St. Louis, issue Mar, pages 41-51.
   o http://research.stlouisfed.org/publications/review/97/03/9703md.pdf

6. Arturo Estrella & Anthony P. Rodrigues & Sebastian Schich, 2000. "How stable is the predictive power of the yield curve? evidence from Germany and the United States," Staff Reports 113, Federal Reserve Bank of New York.
   o http://www.newyorkfed.org/research/staff_reports/sr113.pdf



Appendix

Model estimates:

The LOGISTIC Procedure: MODEL 1

Model Information

Data Set WORK.SET1

Response Variable rec Recession Dummy (dependent)

Number of Response Levels 2

Model binary probit

Optimization Technique Fisher's scoring

Number of Observations Read 415

Number of Observations Used 415

Response Profile

Ordered Total

Value rec Frequency

1 0 359

2 1 56

Probability modeled is rec=1.

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

Intercept

Intercept and

Criterion Only Covariates

AIC 330.406 152.452

SC 334.435 172.594

-2 Log L 328.406 142.452

R-Square 0.3611 Max-rescaled R-Square 0.6605
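As a cross-check on the fit statistics above, the following Python sketch recomputes the generalized R-Square and its max-rescaled version from the two -2 Log L values and the number of observations. It assumes the usual definitions, R² = 1 - exp(-(LR)/n) with LR the likelihood-ratio statistic, rescaled by its maximum attainable value 1 - exp(-(-2 Log L of the intercept-only model)/n). The function name is hypothetical.

```python
import math


def rescaled_r2(neg2ll_intercept_only, neg2ll_model, n_obs):
    """Return (generalized R-Square, max-rescaled R-Square).

    Both inputs are -2 * log-likelihood values, as printed in the
    Model Fit Statistics table.
    """
    lr_stat = neg2ll_intercept_only - neg2ll_model
    r2 = 1.0 - math.exp(-lr_stat / n_obs)
    r2_max = 1.0 - math.exp(-neg2ll_intercept_only / n_obs)
    return r2, r2 / r2_max


# Values taken from the MODEL 1 output above.
r2, r2_rescaled = rescaled_r2(328.406, 142.452, 415)
print(round(r2, 4), round(r2_rescaled, 4))
```

Under these definitions the result should agree with the reported R-Square and Max-rescaled R-Square up to rounding.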



The LOGISTIC Procedure

Testing Global Null Hypothesis: BETA=0

Test Chi-Square DF Pr > ChiSq

Likelihood Ratio 185.9540 4




Model Information

Data Set WORK.SET1

Response Variable rec Recession Dummy (dependent)


Model binary probit




Response Profile

Ordered Total

Value rec Frequency

1 0 359

2 1 56





Intercept

Intercept and


AIC 330.406 77.425

SC 334.435 101.595

-2 Log L 328.406 65.425





Model Information

Data Set WORK.SET1

Response Variable rec Recession Dummy (dependent)


Model binary probit




Response Profile

Ordered Total

Value rec Frequency

1 0 359

2 1 56





Intercept

Intercept and


AIC 330.406 262.622

SC 334.435 270.679

-2 Log L 328.406 258.622


