By Kasper Jørgensen

BOND RISK PREMIA AND THE MACROECONOMY

A PhD thesis submitted to

School of Business and Social Sciences, Aarhus University,

in partial fulfilment of the requirements of

the PhD degree in

Economics and Business Economics

May 2018

CREATES: Center for Research in Econometric Analysis of Time Series

This version: July 19, 2018 © Kasper Jørgensen

PREFACE

This dissertation is the outcome of my graduate studies at the Department of Economics and Business Economics at Aarhus University during the period September 2013 through May 2018. I am grateful to the department and the Center for Research in Econometric Analysis of Time Series (CREATES) funded by the National Research Foundation (DNRF78) for providing an outstanding research environment and generous financial support for participation in numerous conferences and courses.

A number of people deserve special mention. First and foremost, I would like to

thank Martin M. Andreasen for his encouragement, valuable insights, and for always

being extremely helpful. It has been a privilege to work with and learn from you. The

first and third chapters of this dissertation are the result of our joint effort. I truly hope

that we can continue to collaborate in the years to come.

In the fall of 2016, I had the great pleasure to visit James D. Hamilton at the

Department of Economics at University of California, San Diego. I am indebted to

Jim for hosting me, his guidance, and for valuable comments on my work. I would

also like to thank the Department of Economics at UCSD for its hospitality.

The faculty at the Department of Economics and Business Economics at Aarhus

University deserves my gratitude for providing an inspiring research environment.

During my graduate studies, I have had the privilege of being surrounded by many

great colleagues, and I am grateful to all of them for creating an outstanding academic

and social environment. I especially want to thank Alexander, Bo, Carsten,

Christian, Jakob, Johan, Jonas, Niels, Mikkel, and Thomas for contributing to making

my graduate studies a memorable time.

Finally and most importantly, I would like to thank my family for always being

supportive and understanding. My most heartfelt thank you goes to my girlfriend

Anna. Thank you for your encouragement, endless support, patience, and for bearing

with me being absent-minded at times. Also thank you for our many trips and

experiences in California and elsewhere. It means everything to me.

Kasper Jørgensen

Aarhus, May 2018

UPDATED PREFACE

The pre-defense took place on June 26, 2018 in Aarhus. I am grateful to the members

of the assessment committee, Professor Joachim Grammig (University of Tuebingen),

Professor Claus Munk (Copenhagen Business School), and Professor Stig Vinther

Møller (Aarhus University) for their careful reading of the dissertation and their

many insightful comments and suggestions. Some of the suggestions have been

incorporated into the present version of the dissertation while others remain for

future work.

Kasper Jørgensen

Aarhus, July 2018

CONTENTS

Summary

Danish summary

1 The Importance of Timing Attitudes in Consumption-Based Asset Pricing Models
1.1 Introduction
1.2 A Long-Run Risk Model
1.3 Estimation Results: The Long-Run Risk Model
1.4 A New Keynesian Model
1.5 Conclusion
1.6 References
Appendix

2 How Learning from Macroeconomic Experiences Shapes the Yield Curve
2.1 Introduction
2.2 An Illustrative Consumption-Based Model
2.3 Bond Return Predictability
2.4 Term Premia
2.5 The Decline in the Equilibrium Real Rate
2.6 Conclusion
2.7 References
Appendix

3 Bond Risk Premia at the Zero Lower Bound
3.1 Introduction
3.2 Bond Return Predictability at the ZLB
3.3 A Shadow Rate Model
3.4 Regime-Dependent Market Prices of Risk
3.5 Economic Implications
3.6 Conclusion
3.7 References
Appendix

SUMMARY

This dissertation consists of three self-contained chapters, all of which are concerned with understanding the macroeconomics of risk premia in bond markets and, potentially, financial markets more generally. Yields on long-term bonds consist of two components: (i) average expected short-term interest rates over the lifetime of the bond and (ii) risk premia. Low long-term yields could then signal (i) low expected short-term interest rates, perhaps because of weak growth prospects, or (ii) low risk premia, perhaps because of low overall economic uncertainty. Clearly, the policy implications are very different. For this reason, accurate decompositions of yields are of critical importance for monetary policy makers.
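In symbols, the two-component decomposition of a long-term yield can be sketched as follows. The notation is an assumption introduced only for illustration: $y_t^{(n)}$ is the $n$-period yield, $i_{t+j}$ the one-period short rate, and $TP_t^{(n)}$ the risk (term) premium.

```latex
y_t^{(n)}
  = \underbrace{\frac{1}{n}\sum_{j=0}^{n-1} E_t\left[\, i_{t+j} \,\right]}_{\text{(i) average expected short rate}}
  + \underbrace{TP_t^{(n)}}_{\text{(ii) risk premium}}
```

A given low value of $y_t^{(n)}$ is therefore consistent with either a low first term or a low second term, which is why the decomposition matters for policy.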

Chapter 1, "The Importance of Timing Attitudes in Consumption-Based Asset Pricing Models" (joint work with Martin M. Andreasen), studies a new utility kernel within the Epstein and Zin (1989) and Weil (1990) framework.¹ Epstein-Zin-Weil preferences are widely used because they separate the intertemporal elasticity of substitution (IES) and relative risk aversion (RRA), which otherwise have a perfect inverse relationship when using standard expected utility. It is well known that Epstein-Zin-Weil preferences achieve this separation by imposing a timing attitude on the household. This embedded constraint implies that the standard implementation of Epstein-Zin-Weil preferences determines (i) the IES, (ii) the RRA, and (iii) the timing attitude using only two parameters. This raises the question: do Epstein-Zin-Weil preferences perform well because they separate the IES from RRA or because they imply a timing attitude? We augment the standard power-utility kernel in Epstein-Zin-Weil with a constant, which allows a more flexible specification of the timing attitude. We then show that the mechanism enabling Epstein-Zin-Weil preferences to explain asset prices is not to separate the IES from RRA, but to introduce a strong timing attitude. These new preferences resolve a puzzle in the long-run risk model, where consumption growth is too strongly correlated with the price-dividend ratio and the risk-free rate. The proposed preferences also enable a New Keynesian model to match equity and bond premia with a low RRA of 5.

¹ Chapter 1 has a revise and resubmit invitation from the Journal of Monetary Economics.

In chapter 2, "How Learning from Macroeconomic Experiences Shapes the Yield Curve", I link the shape of the yield curve to macroeconomic fundamentals. I show that constant-gain learning measures of inflation and consumption growth expectations capture the long-run variation in the level and slope of the yield curve, respectively. Controlling for the macroeconomic expectation factors, I extract cyclical level and slope yield curve factors. The four factors decompose the usual level and slope factors into trend and cycle components. This dynamic distinction is important for extracting accurate measures of risk premia in long-term bonds. The four factors predict excess returns with $R^2$ values of up to 56%, and subsume and add to the predictive information in the most popular bond return predictors. The macroeconomic expectation factors predominantly capture variation in the expectations hypothesis component of long-term yields, that is, the long-run short rate expectations. The cyclical level and slope factors capture risk premium variation. As a result, my decomposition of long-term yields implies cyclical term premia. Cyclical term premia are in line with macro-finance priors and risk premia in other asset classes (Fama and French, 1989), but in contrast to the popular affine term structure models.
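The constant-gain learning idea mentioned above can be illustrated with the textbook recursion, in which the expectation is revised by a fixed fraction of the latest surprise. This is only a minimal sketch: the function name, the gain, and the initial value are hypothetical, and the chapter's exact specification may differ.

```python
def constant_gain_expectation(series, gain, initial):
    """Textbook constant-gain update: est <- est + gain * (obs - est).

    `gain` and `initial` are illustrative inputs, not values from the thesis.
    """
    if not 0.0 < gain <= 1.0:
        raise ValueError("gain should lie in (0, 1]")
    estimate = float(initial)
    path = []
    for observation in series:
        # Revise the expectation by a fixed share of the forecast error.
        estimate = estimate + gain * (observation - estimate)
        path.append(estimate)
    return path
```

A small gain makes the estimate move slowly, so it tracks low-frequency movements in the observed series rather than period-to-period noise.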

Finally, in chapter 3, "Bond Risk Premia at the Zero Lower Bound" (joint work with Martin M. Andreasen and Andrew C. Meldrum), we study the dynamics of bond risk premia at the zero lower bound (ZLB). The classical studies by Fama and Bliss (1987) and Campbell and Shiller (1991) relate the slope of the yield curve to risk premia in bonds. However, the recent episodes with prolonged periods of short-term interest rates being restricted by their ZLB pose a challenge to this linear relation. As the short end of the yield curve becomes constrained from below, this in turn generates a "slope compression effect", meaning that a given slope of the yield curve carries a stronger signal at the ZLB. Furthermore, the recent low interest rate environment has called for unconventional monetary policies. This is likely to affect the compensation for risk required by bond investors, meaning that we may also have a "price of risk effect". In predictive regressions of excess bond returns onto yield spreads, we document a structural break in regression coefficients over the recent low interest rate regime. The standard three-factor shadow rate model fails to account for this empirical pattern. Instead, we propose a shadow rate model with market prices of risk that switch across non-binding and binding zero lower bound regimes. Our shadow rate model with regime-dependent market prices of risk is consistent with the provided regression evidence. The regime-switching shadow rate model suggests that markets expected monetary policy lift-off to occur later than otherwise thought.

References

Campbell, J. Y., Shiller, R. J., 1991. Yield spreads and interest rate movements: A bird’s eye view. Review of Economic Studies Vol. 58(3), 495–514.

Epstein, L., Zin, S., 1989. Substitution, risk aversion and the temporal behavior of consumption and asset returns: A theoretical framework. Econometrica Vol. 57, 937–969.

Fama, E. F., Bliss, R. R., 1987. The information in long-maturity forward rates. American Economic Review Vol. 77(4), 680–692.

Fama, E. F., French, K. R., 1989. Business conditions and expected returns on stocks and bonds. Journal of Financial Economics Vol. 25, 23–49.

Weil, P., 1990. Non-expected utility in macroeconomics. Quarterly Journal of Economics Vol. 105(1), 29–42.

DANISH SUMMARY

This dissertation consists of three independent chapters, all of which concern the macroeconomics behind risk premia in bond markets and, potentially, financial markets more generally. The yield on long-term bonds consists of two components: (i) the average expected short-term interest rate over the lifetime of the bond and (ii) a risk premium. Low yields can thus signal (i) low expected short-term interest rates, for example because of weak growth prospects, or (ii) low risk premia, for example because of low overall economic uncertainty. The implications for economic policy are obviously different. For this reason, accurate decompositions of yields are extremely important for monetary policy makers.

Chapter 1, "The Importance of Timing Attitudes in Consumption-Based Asset Pricing Models" (joint work with Martin M. Andreasen), studies a new utility kernel within the framework of Epstein and Zin (1989) and Weil (1990).¹ Epstein-Zin-Weil preferences are widely used because they separate the intertemporal elasticity of substitution (IES) and relative risk aversion (RRA), which otherwise have a perfect inverse relationship under expected utility. It is well known that Epstein-Zin-Weil preferences achieve this separation by imposing a timing attitude on the household. This embedded constraint means that Epstein-Zin-Weil preferences determine (i) the IES, (ii) the RRA, and (iii) the timing attitude using only two parameters. This raises the question: do Epstein-Zin-Weil preferences work well because they separate the IES and RRA or because they imply a timing attitude? We add a constant to the power utility function in standard Epstein-Zin-Weil preferences, which allows a more flexible specification of the timing attitude. We then show that the mechanism that enables Epstein-Zin-Weil preferences to explain asset prices is not the separation of the IES and RRA, but rather the introduction of a strong timing attitude. The new preferences resolve a problem in the long-run risk model, where consumption growth is too strongly correlated with the price-dividend ratio and the risk-free rate. The proposed preferences also enable a New Keynesian model to match equity and bond premia with a low RRA of 5.

¹ Chapter 1 has a revise and resubmit invitation from the Journal of Monetary Economics.

In chapter 2, "How Learning from Macroeconomic Experiences Shapes the Yield Curve", I link the shape of the yield curve to underlying macroeconomic conditions. I show that constant-gain learning measures of inflation and consumption growth expectations capture long-run variation in the level and slope of the yield curve, respectively. After controlling for the macroeconomic expectation factors, I extract cyclical level and slope factors. The four factors decompose the usual level and slope factors into trend and cycle factors. This dynamic distinction is important for obtaining accurate measures of the risk premium in long-term bonds. The four factors predict excess returns with an explanatory power of up to 56%, and subsume and add to the information in the most popular bond return predictors. The macroeconomic expectation factors predominantly capture the expectations hypothesis component of long-term yields, i.e. long-horizon expectations of the yield on short-term bonds. The cyclical level and slope factors capture variation in the risk premium. As a result, the decomposition of long-term yields implies cyclical bond risk premia. Cyclical bond risk premia are in line with macro-finance intuition and with risk premia in other asset classes (Fama and French, 1989), but in contrast to the results from the popular affine term structure models.

Finally, in chapter 3, "Bond Risk Premia at the Zero Lower Bound" (joint work with Martin M. Andreasen and Andrew C. Meldrum), we study the dynamics of bond risk premia at the lower bound on nominal interest rates. The classical studies by Fama and Bliss (1987) and Campbell and Shiller (1991) relate the slope of the yield curve to bond risk premia. However, the recent episodes in which nominal interest rates were persistently restricted by their lower bound pose a challenge to this linear relation. When the short end of the yield curve becomes constrained from below, this generates a "slope compression effect", which means that a given slope of the yield curve carries a stronger signal at the lower bound on nominal interest rates. In addition, the recent low interest rate environment has called for unconventional monetary policies. This is likely to affect the compensation that bond investors require for bearing risk, which means that we may also have a "price of risk effect". In predictive regressions of excess bond returns on yield spreads, we document a structural break in the regression coefficients over the recent low interest rate regime. The three-factor shadow rate model cannot explain this empirical pattern. Instead, we propose a shadow rate model with prices of risk that switch across binding and non-binding lower bound regimes. Our shadow rate model with regime-dependent prices of risk is consistent with the documented regression evidence. The regime-switching shadow rate model suggests that the bond market expected monetary policy rate increases to occur later than previously thought.


CHAPTER 1

THE IMPORTANCE OF TIMING ATTITUDES IN CONSUMPTION-BASED ASSET PRICING MODELS

REVISE & RESUBMIT INVITATION FROM JOURNAL OF MONETARY ECONOMICS

Martin M. Andreasen
Aarhus University, CREATES, and the Danish Finance Institute

Kasper Jørgensen
Aarhus University and CREATES

Abstract

A new utility kernel for Epstein-Zin-Weil preferences is proposed to disentangle the intertemporal elasticity of substitution (IES), the relative risk aversion (RRA), and the timing attitude. We then show that the mechanism enabling Epstein-Zin-Weil preferences to explain asset prices is not to separate the IES from RRA, but to introduce a strong timing attitude. These new preferences resolve a puzzle in the long-run risk model, where consumption growth is too strongly correlated with the price-dividend ratio and the risk-free rate. The proposed preferences also enable a New Keynesian model to match equity and bond premia with a low RRA of 5.

1.1 Introduction

Following the seminal work of Epstein and Zin (1989) and Weil (1990), a large number

of consumption-based models use so-called Epstein-Zin-Weil preferences to explain

asset prices (see Bansal and Yaron, 2004; Gourio, 2012, to name just a few). An

important property of these preferences is to disentangle relative risk aversion (RRA)

and the intertemporal elasticity of substitution (IES) which otherwise have an inverse

relationship when using expected utility. It is also well-known that the separation of

the IES and RRA in Epstein-Zin-Weil preferences is achieved by imposing a timing

attitude on the household, which prefers either early or late resolution of uncertainty.

This embedded constraint implies that Epstein-Zin-Weil preferences determine i)

the IES, ii) the RRA, and iii) the timing attitude using only two parameters. However,

experimental evidence suggests that the timing attitude has an independent effect

on decision making beyond what is implied by RRA, and that the timing attitude is

unrelated to RRA (see, for instance, Chew and Ho, 1994; van Winden, Krawczyk, and

Hopfensitz, 2011). This raises the question: do Epstein-Zin-Weil preferences perform

well because they separate the IES from RRA or because they imply a timing attitude?

We address this question in the present paper and explore whether a more flexible specification of the timing attitude helps to explain asset prices. We study these questions by augmenting the power-utility kernel adopted in Epstein and Zin (1989) and Weil (1990) with a constant $u_0$ to account for other aspects than consumption $C_t$ when modeling the household’s contemporaneous utility level. The benefit of this extension of the utility kernel $u(C_t)$ is to obtain greater flexibility in setting $u''(C_t) C_t / u'(C_t)$ and $u'(C_t) C_t / u(C_t)$ compared to the traditional specification of Epstein-Zin-Weil preferences, where one parameter determines both ratios. Much attention in the literature has been devoted to $u''(C_t) C_t / u'(C_t)$, because it controls the IES. The ratio $u'(C_t) C_t / u(C_t)$, on the other hand, is often ignored but is the main focus of the present paper, because it determines how the household’s timing attitude affects RRA. Thus, adding a constant to the utility kernel allows us to disentangle the IES, the RRA, and the timing attitude.

We start by studying the asset pricing implications of our new utility kernel in the

long-run risk model of Bansal and Yaron (2004). Using an analytical second-order

perturbation approximation, we first show that the household’s timing attitude has

a separate effect on asset prices beyond the IES and RRA, which is consistent with

the experimental evidence cited above. Estimation results for the standard long-run

risk model confirm the finding in Beeler and Campbell (2012) that consumption

growth in the model is too highly correlated with the price-dividend ratio due to its

strong reliance on long-run risk. We further show that this property of the model

also makes the contemporaneous correlation between consumption growth and the

risk-free rate too high, and these findings therefore question the empirical support

for the required degree of long-run risk in the model of Bansal and Yaron (2004). An

important empirical finding in the present paper is to show that our utility kernel

resolves these puzzles, because it reduces the reliance on long-run risk and instead

makes the household display strong preferences for early resolution of uncertainty.

The ability of our extended model to match means, standard deviations, and autocorrelations

is nearly identical to the standard long-run risk model, suggesting that

our extension is identified from contemporaneous correlations, which the literature

mostly ignores when taking the long-run risk model to the data. Another important

finding is that the satisfying performance of the long-run risk model is hardly affected

by lowering RRA from 10 to 5 once $u_0$ is included in the utility kernel. In contrast, the

fit of the standard long-run risk model deteriorates with an RRA of 5. However, our

results also show that the timing premium of Epstein, Farhi, and Strzalecki (2014)

is very high for this model (even with our extension) and it easily implies that the

household is willing to give up 80% of lifetime consumption to have all uncertainty

resolved in the following period.

We also study the asset pricing implications of our new utility kernel in a New Keynesian dynamic stochastic general equilibrium (DSGE) model, where consumption and dividends are determined endogenously. Our estimates reveal that the proposed utility kernel in this setting resolves the puzzlingly high RRA required in many DSGE models to explain asset prices. More precisely, the model matches the equity premium and the bond premium (i.e. the mean and variability of the 10-year nominal term premium) with a low RRA of 5. The mechanism explaining this substantial improvement of the New Keynesian model is similar to the one offered in the long-run risk model, namely that our new utility kernel allows strong preferences for early resolution of uncertainty to coincide with low RRA. We also find that changing RRA has a very small effect on the model’s ability to match the data when using our new utility kernel. As in the long-run risk model, this suggests that it is not the high RRA in the traditional formulation of Epstein-Zin-Weil preferences that helps to match asset prices, but instead the strong timing attitude that is induced by high RRA. We also find that the timing premium in the New Keynesian model is on the order of 5% to 10% due to the endogenous labor supply, consumption habits, and a low IES. Our extension preserves this property of the New Keynesian model and hence matches asset prices with a low RRA and a low timing premium.

Conducting a number of counterfactual experiments, we study the asset pricing implications of the timing attitude and long-run risk in the two considered models. To examine the effects of the timing attitude, we set the Epstein-Zin-Weil parameter to zero in both models such that the RRA is tightly linked to the IES. This modification generates a small reduction in RRA for the two models, but both models are now unable to explain asset prices. A second counterfactual re-introduces strong preferences for early resolution of uncertainty but omits long-run risk. Here, we also find that the two models cannot match asset prices, although the IES, the RRA, and the timing attitude are identical to their estimated values in both models. These experiments, and our remaining analysis, therefore suggest that the mechanism enabling Epstein-Zin-Weil preferences to explain asset prices is not to separate the IES from RRA, but to introduce strong preferences for early resolution of uncertainty to amplify effects of long-run risk.

The remainder of this paper is organized as follows. Section 1.2 introduces our new

utility kernel within the long-run risk model. Section 1.3 estimates this extension of

the long-run risk model and studies its empirical performance. Section 1.4 considers

a New Keynesian model with the proposed utility kernel and explores its empirical

performance. Concluding comments are provided in Section 1.5.¹

1.2 A Long-Run Risk Model

The representative household is introduced in Section 1.2.1, and the exogenous

processes for consumption and dividends are specified in Section 1.2.2. We present

the new utility kernel in Section 1.2.3 and derive the IES and RRA. The asset pricing

properties of the proposed utility kernel are explored analytically in Section 1.2.4.

1.2.1 The Representative Household

Consider a household with recursive preferences as in Epstein and Zin (1989) and Weil (1990). Using the formulation in Rudebusch and Swanson (2012), the value function $V_t$ is given by

$$V_t = u_t + \beta \left( E_t\left[ V_{t+1}^{1-\alpha} \right] \right)^{\frac{1}{1-\alpha}} \tag{1.1}$$

for $u_t > 0$, where $E_t[\cdot]$ is the conditional expectation given information in period $t$.² Here, $\beta \in (0,1)$ and $u_t \equiv u(C_t)$ denotes the utility kernel as a function of consumption $C_t$. For higher values of $\alpha \in \mathbb{R} \setminus \{1\}$, these preferences generate higher risk aversion when $u_t > 0$ for a given IES, and vice versa for $u_t < 0$.
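To make the role of $\alpha$ in (1.1) concrete, the following minimal sketch evaluates the risk-adjusted expectation $(E_t[V_{t+1}^{1-\alpha}])^{1/(1-\alpha)}$ for a discrete distribution of next-period values. The function names and the two-state example are illustrative assumptions, and only the $u_t > 0$ branch is covered.

```python
import numpy as np

def certainty_equivalent(values, probs, alpha):
    """Return (E[V^(1-alpha)])^(1/(1-alpha)) for strictly positive V.

    Hypothetical helper for the u_t > 0 case of (1.1).
    """
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    if np.isclose(alpha, 1.0):
        raise ValueError("alpha = 1 is excluded in (1.1)")
    if np.any(values <= 0.0):
        raise ValueError("this sketch covers positive continuation values only")
    moment = np.dot(probs, values ** (1.0 - alpha))
    return float(moment ** (1.0 / (1.0 - alpha)))

def value_today(u_t, beta, next_values, probs, alpha):
    """One step of the recursion (1.1): V_t = u_t + beta * certainty equivalent."""
    return u_t + beta * certainty_equivalent(next_values, probs, alpha)
```

With $\alpha = 0$ the adjustment reduces to the ordinary expectation, while larger $\alpha$ penalizes dispersion in $V_{t+1}$ more heavily and lowers $V_t$ for the same distribution.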

Another important property of (1.1) is to endow the household with preferences for resolution of uncertainty. This behavioral property is determined by the aggregation function in (1.1), i.e. by $f\left(u_t, E_t\left[V_{t+1}^{1-\alpha}\right]\right) \equiv u_t + \beta \left(E_t\left[(V_{t+1})^{1-\alpha}\right]\right)^{\frac{1}{1-\alpha}}$, where the household displays preferences for early (late) resolution of uncertainty if $f(\cdot,\cdot)$ is convex (concave) in its second argument (see Weil, 1990). The formulation in (1.1) therefore implies preferences for early (late) resolution of uncertainty if $\alpha > 0$ ($\alpha < 0$).³ Given that $\alpha$ controls the degree of curvature in $f(\cdot,\cdot)$ with respect to $E_t\left[V_{t+1}^{1-\alpha}\right]$, it seems natural to consider $\alpha$ as measuring the strength of the household’s timing attitude. Another and slightly more intuitive measure for temporal resolution of uncertainty is the timing premium $\Pi_t$ of Epstein et al. (2014), which is the fraction of lifetime consumption that the household is willing to give up to have all uncertainty resolved in the following period. Epstein et al. (2014) show that $\Pi_t$ depends on the strength of the timing attitude $\alpha$ and the amount of consumption uncertainty. Thus, it may be useful to think of $\alpha$ as controlling the ’price of timing risk’, whereas the law of motion for consumption controls the ’quantity of timing risk’. However, the timing premium is generally not available in closed form, and we will therefore rely on the household’s timing attitude $\alpha$ when studying the analytical properties of the proposed preferences.

¹ All technical derivations and proofs are deferred to an online appendix.

² When $u_t < 0$, we define $V_t = u_t - \beta \left( E_t\left[ (-V_{t+1})^{1-\alpha} \right] \right)^{\frac{1}{1-\alpha}}$ as in Rudebusch and Swanson (2012).

³ The opposite sign restrictions apply when $u_t < 0$.

The household has access to a complete market for state contingent claims $A_{t+1}$. Resources are spent on $C_t$ and $A_{t+1}$, and we therefore have the budget restriction $C_t + E_t\left[M_{t,t+1} A_{t+1}\right] = A_t$, where $M_{t,t+1}$ denotes the real stochastic discount factor.

1.2.2 Consumption and Dividends

The process for consumption is specified to be compatible with production economies displaying balanced growth. Hence, we let $C_t \equiv Z_t \times \tilde{C}_t$, where $Z_t > 0$ is the balanced growth path of technology, or simply the productivity level. The variable $\tilde{C}_t$ introduces cyclical consumption risk, which in production economies originates from demand-related shocks, monetary policy shocks, or short-lived supply shocks (see, for instance, Justiniano and Primiceri, 2008).

Inspired by the work of Bansal and Yaron (2004), we let

$$\begin{aligned}
\log Z_{t+1} &= \log Z_t + \log \mu_z + x_t + \sigma_z \sigma_t \varepsilon_{z,t+1} \\
x_{t+1} &= \rho_x x_t + \sigma_x \sigma_t \varepsilon_{x,t+1} \\
\sigma_{t+1}^2 &= 1 - \rho_\sigma + \rho_\sigma \sigma_t^2 + \sigma_\sigma \varepsilon_{\sigma,t+1}
\end{aligned} \tag{1.2}$$

where $\sigma_t^2$ introduces stochastic volatility. Here, $\varepsilon_{i,t+1} \sim NID(0,1)$ for $i \in \{z, x, \sigma\}$ with $|\rho_x| < 1$ and $|\rho_\sigma| < 1$.⁴ Thus, $x_t$ introduces persistent changes in the growth rate of $Z_t$ and captures long-run productivity risk. The innovation $\varepsilon_{z,t}$ does not generate any persistence in the growth rate of $Z_t$ and is therefore referred to as short-run productivity risk.⁵ Variation in consumption around $Z_t$ is specified as in Bansal et al. (2010) by letting $\log \tilde{C}_{t+1} = \rho_c \log \tilde{C}_t + \sigma_c \sigma_t \varepsilon_{c,t+1}$, where $\varepsilon_{c,t} \sim NID(0,1)$ and $|\rho_c| < 1$.

The process for dividends $D_t$ is given by $\Delta d_{t+1} = \log \mu_d + \phi_x x_t + \phi_c \tilde{c}_t + \sigma_d \sigma_t \varepsilon_{d,t+1}$, where $d_{t+1} \equiv \log D_{t+1}$ and $\varepsilon_{d,t} \sim NID(0,1)$. Here, $\phi_x$ and $\phi_c$ capture firm leverage in relation to long-run and cyclical risk, respectively, as in Bansal et al. (2010). For completeness, all innovations are assumed to be mutually uncorrelated at all leads and lags.

⁴ Although (1.2) does not enforce $\sigma_t^2 \geq 0$, we nevertheless maintain this specification for comparison with Bansal and Yaron (2004) and Bansal, Kiku, and Yaron (2010).

⁵ Hence, we follow the terminology from the long-run risk model (see, for instance, Bansal et al. (2010)), although variation in $\varepsilon_{z,t}$ has a permanent effect on the level of $Z_t$.
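The law of motion in (1.2) can be simulated directly, which helps to see how $x_t$ and $\sigma_t^2$ drive productivity growth. The sketch below is illustrative only: the default parameter values are hypothetical placeholders rather than estimates from this chapter, and truncating $\sigma_t^2$ at zero before taking the square root is an added assumption, since (1.2) itself does not enforce non-negativity.

```python
import numpy as np

def simulate_productivity(T, mu_z=1.0015, rho_x=0.98, rho_sigma=0.99,
                          sigma_z=0.007, sigma_x=0.0003, sigma_sigma=0.0,
                          seed=0):
    """Simulate log Z_t, x_t and sigma_t^2 from (1.2) for T periods.

    All default parameter values are hypothetical placeholders.
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal((T, 3))  # columns: eps_z, eps_x, eps_sigma
    log_z = np.zeros(T + 1)
    x = np.zeros(T + 1)
    sig2 = np.ones(T + 1)  # start sigma^2 at its unconditional mean of one
    for t in range(T):
        # Assumption: truncate at zero so the square root is well defined.
        sig = np.sqrt(max(sig2[t], 0.0))
        log_z[t + 1] = log_z[t] + np.log(mu_z) + x[t] + sigma_z * sig * eps[t, 0]
        x[t + 1] = rho_x * x[t] + sigma_x * sig * eps[t, 1]
        sig2[t + 1] = 1.0 - rho_sigma + rho_sigma * sig2[t] + sigma_sigma * eps[t, 2]
    return log_z, x, sig2
```

Setting `sigma_sigma=0.0` switches off stochastic volatility, so $\sigma_t^2$ stays at one and the process reduces to a homoskedastic long-run risk specification.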


1.2.3 The Utility Kernel

To motivate our new utility kernel for disentangling the IES, the RRA, and the timing attitude, it is useful to start with the general expression for RRA. Recall that RRA measures the amount that the household is willing to pay to avoid a risky gamble over wealth. With recursive preferences as formulated in (1.1), the general expression for RRA in the steady state (ss) is given by (see Swanson, 2018)

$$RRA = -\left. \frac{u''(C_t)\, C_t}{u'(C_t)} \right|_{ss} + \alpha \left. \frac{u'(C_t)\, C_t}{u(C_t)} \right|_{ss}. \tag{1.3}$$

Hence, the RRA depends on the timing attitude $\alpha$ and the two ratios $u''(C_t) C_t / u'(C_t)$ and $u'(C_t) C_t / u(C_t)$. The first term in (1.3) is the familiar expression for the inverse of the IES, where the IES measures the percentage change in consumption growth from a one percent change in the real interest rate in the absence of uncertainty. The second term in (1.3) is controlled by the timing attitude $\alpha$ and the ratio $u'(C_t) C_t / u(C_t)$. The presence of the ratio $u'(C_t) C_t / u(C_t)$ in this second term is rarely mentioned, but this ratio plays a key role for RRA because it determines how the household’s timing attitude $\alpha$ affects risk aversion. That is, for a given IES and a given timing attitude $\alpha$, the ratio $u'(C_t) C_t / u(C_t)$ determines the RRA. This property of $u'(C_t) C_t / u(C_t)$ appears to have been largely overlooked in the literature, because much focus has been devoted to the power utility kernel $\frac{1}{1-1/\psi} C_t^{1-1/\psi}$, where $\psi$ determines both $u''(C_t) C_t / u'(C_t)$ and $u'(C_t) C_t / u(C_t)$.
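As a worked illustration of why $\psi$ pins down both ratios under the power kernel, the short derivation below substitutes $u(C_t) = \frac{1}{1-1/\psi} C_t^{1-1/\psi}$ into (1.3). It uses only the definitions stated in the text and is meant as a sketch, not as a replacement for the derivations in the online appendix.

```latex
\begin{aligned}
u'(C_t) &= C_t^{-1/\psi}, \qquad
u''(C_t) = -\tfrac{1}{\psi}\, C_t^{-1/\psi - 1}, \\[4pt]
\frac{u''(C_t)\, C_t}{u'(C_t)} &= -\frac{1}{\psi}, \qquad
\frac{u'(C_t)\, C_t}{u(C_t)} = 1 - \frac{1}{\psi}, \\[4pt]
RRA &= \frac{1}{\psi} + \alpha \left( 1 - \frac{1}{\psi} \right).
\end{aligned}
```

Both ratios are constants that depend only on $\psi$, so for a given IES the RRA can be changed only through $\alpha$, which simultaneously changes the timing attitude.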

This observation suggests that the IES, the RRA, and the timing attitude may be disentangled by considering a utility kernel, where the ratios $u''(C_t) C_t / u'(C_t)$ and $u'(C_t) C_t / u(C_t)$ can be determined separately. A simple way to achieve this separation is to let

$$u(C_t) = u_0 Z_t^{1-1/\psi} + \frac{1}{1-1/\psi} C_t^{1-1/\psi}, \tag{1.4}$$

where the constant $u_0 \in \mathbb{R}$ augments the standard power kernel. To prevent this constant from diminishing relative to the utility from consumption as the economy grows, it is necessary to scale $u_0$ by $Z_t^{1-1/\psi}$ to ensure a balanced growth path in the model.⁶ In this modified utility kernel, the constant $u_0$ determines $u'(C_t) C_t / u(C_t)$, whereas the ratio $u''(C_t) C_t / u'(C_t)$ and the IES are controlled by $\psi$ as in the conventional power kernel.

The presence of $u_0$ in (1.4) may be motivated by accounting for other aspects than consumption when modeling household utility. We provide two examples. First, the household may enjoy utility from government spending $G_t$ on roads, public parks, law and order, etc. When this spending grows with the size of the economy, i.e. $G_t = g_{ss} Z_t$ where $g_{ss} \in \mathbb{R}_+$, and the utility from $G_t$ is separable from $C_t$, then conditions for balanced growth imply a utility kernel of the form $\frac{g_{ss}^{1-1/\psi}}{1-1/\psi} Z_t^{1-1/\psi} + \frac{1}{1-1/\psi} C_t^{1-1/\psi}$ as captured by (1.4). Second, the household may also consume home-produced goods $C_{h,t}$ that are made using the technology $L_{ss} Z_t$, where $L_{ss}$ denotes a fixed supply of labor. When utility from home-produced goods is separable from $C_t$, conditions for balanced growth dictate a utility kernel of the form $\frac{L_{ss}^{1-1/\psi}}{1-1/\psi} Z_t^{1-1/\psi} + \frac{1}{1-1/\psi} C_t^{1-1/\psi}$, which also has the structure captured by (1.4).

⁶ The kernel in (1.4) is obviously not the only way to separately determine $u''(C_t)C_t/u'(C_t)$ and $u'(C_t)C_t/u(C_t)$. A previous version of this paper studied a utility kernel that modifies the standard power utility kernel by changing $u'(\cdot)$ and $u''(\cdot)$ as opposed to the level of $u(\cdot)$ as in (1.4). However, this alternative specification is slightly more complicated than (1.4), and we therefore prefer the specification in (1.4), which we are grateful to the associate editor, Eric Swanson, for proposing.

It is straightforward to show that RRA with (1.4) is given by

$$\mathrm{RRA} = \frac{1}{\psi} + \alpha\,\frac{1-\frac{1}{\psi}}{1+u_0\left(1-\frac{1}{\psi}\right)}, \tag{1.5}$$

which reduces to the familiar expression $\frac{1}{\psi} + \alpha\left(1-\frac{1}{\psi}\right)$ when $u_0 = 0$. Thus, a high value of $u_0$ reduces RRA, and vice versa. To understand the intuition behind this effect, consider the case where $u_0$ is high, such that $u'(C_t)C_t/u(C_t)$ is low, and hence variation in $C_t$ has only a small effect on the overall utility level across the business cycle. This implies that the value function attains a high and stable level even when faced with a risky gamble, and the household is therefore only willing to pay a small amount to avoid this gamble, i.e. it has a low RRA. Thus, by varying $u_0$, we can separately set RRA, for a given IES and timing attitude $\alpha$.
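As a small numerical illustration of (1.5), the following Python sketch computes the ratio $u'(C_t)C_t/u(C_t)$ in the steady state and the timing attitude $\alpha$ required to hit a target RRA for a given IES and $u_0$. The function names are hypothetical and chosen for this illustration only; the formulas are those in (1.5).

```python
def kernel_ratio(ies: float, u0: float = 0.0) -> float:
    """Ratio u'(C)C/u(C) in steady state as it appears in (1.5)."""
    a = 1.0 - 1.0 / ies          # 1 - 1/psi
    return a / (1.0 + u0 * a)


def rra(ies: float, alpha: float, u0: float = 0.0) -> float:
    """Relative risk aversion from (1.5)."""
    return 1.0 / ies + alpha * kernel_ratio(ies, u0)


def alpha_for_target_rra(target_rra: float, ies: float, u0: float = 0.0) -> float:
    """Invert (1.5): the timing attitude alpha that delivers target_rra."""
    return (target_rra - 1.0 / ies) / kernel_ratio(ies, u0)


if __name__ == "__main__":
    # Power kernel (u0 = 0) versus the augmented kernel with a positive u0.
    print(alpha_for_target_rra(10.0, 1.5))          # standard long-run risk case
    print(alpha_for_target_rra(10.0, 1.5, u0=9.87)) # a much larger alpha is needed
```

With an IES of 1.5 and RRA of 10, the power kernel ($u_0 = 0$) requires $\alpha = 28$, whereas a positive $u_0$ lowers the ratio $u'(C_t)C_t/u(C_t)$ and therefore allows a much larger $\alpha$ at the same RRA.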

1.2.4 Understanding Asset Prices

To explain how the IES, the RRA, and the timing attitude affect asset prices, we follow

Bansal and Yaron (2004) and consider a simplified version of the long-run risk model

without stochastic volatility, i.e. σσ = 0. The presence of u0 in (1.4) implies that

we cannot obtain the household’s wealth in closed form and hence eliminate the

value function from the stochastic discount factor using the procedure in Epstein

and Zin (1989). We are therefore unable to obtain an analytical expression for asset

prices by the log-normal method as in Bansal and Yaron (2004). Instead, we use the

perturbation method to derive an analytical second-order approximation to the long-

run risk model around the steady state. In the interest of space, we only provide the

solution for the value function $v_t \equiv \log V_t$, the mean of the risk-free rate $r^f_t \equiv \log R^f_t$, and the mean of equity return $r^m_t \equiv \log R^m_t$ in excess of the risk-free rate.

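Proposition 1. The second-order approximation to $v_t$ around the steady state is given by

$$v_t = v_{ss} + v_c c_t + v_x x_t + \tfrac{1}{2} v_{cc} c_t^2 + \tfrac{1}{2} v_{xx} x_t^2 + v_{cx} c_t x_t + \tfrac{1}{2} v_{\sigma\sigma},$$

where

$$\begin{aligned}
v_{ss} &= \log\left(\left|u_0 + \frac{1}{1-\frac{1}{\psi}}\right|\right) - \log\left(1-\kappa_0\right), \\
v_c &= \frac{1-\kappa_0}{1-\kappa_0\rho_c}\,\frac{1-\frac{1}{\psi}}{1+u_0\left(1-\frac{1}{\psi}\right)}, \\
v_x &= \frac{\kappa_0}{1-\kappa_0\rho_x}\left(1-\frac{1}{\psi}\right), \\
v_{cc} &= \frac{1-\kappa_0}{1-\kappa_0\rho_c^2}\,\frac{\left(1-\frac{1}{\psi}\right)^2}{1+u_0\left(1-\frac{1}{\psi}\right)} - \left[\frac{1-\frac{1}{\psi}}{1+u_0\left(1-\frac{1}{\psi}\right)}\,\frac{1-\kappa_0}{1-\kappa_0\rho_c}\right]^2, \\
v_{xx} &= \frac{\kappa_0}{1-\kappa_0\rho_x^2}\,\frac{1-\kappa_0}{\left(1-\kappa_0\rho_x\right)^2}\left(1-\frac{1}{\psi}\right)^2, \\
v_{cx} &= \frac{\left(1-\frac{1}{\psi}\right)^2}{1+u_0\left(1-\frac{1}{\psi}\right)}\,\frac{1-\kappa_0}{1-\kappa_0\rho_c}\left[\frac{\rho_c\kappa_0}{1-\kappa_0\rho_c\rho_x} - \frac{\kappa_0}{1-\kappa_0\rho_x}\right], \\
v_{\sigma\sigma} &= \frac{\kappa_0}{1-\kappa_0}\left[v_{cc}\sigma_c^2 + (1-\alpha)\,v_c^2\sigma_c^2 + v_{xx}\sigma_x^2 + (1-\alpha)\,v_x^2\sigma_x^2 + (1-\alpha)\left(1-\frac{1}{\psi}\right)^2\sigma_z^2\right],
\end{aligned}$$

with $\kappa_0 \equiv \beta\mu_z^{1-\frac{1}{\psi}}$.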
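The coefficients in Proposition 1 are simple enough to evaluate directly. The following Python sketch codes them up as stated; the function name, the argument names, and the parameter values in the usage example are illustrative assumptions (the values are loosely based on the benchmark estimates reported later in Table 1.1), and the sketch requires $\psi \neq 1$ and $\kappa_0 < 1$.

```python
import math


def value_function_loadings(psi, alpha, u0, beta, mu_z,
                            rho_c, rho_x, sigma_c, sigma_x, sigma_z):
    """Coefficients of the second-order approximation in Proposition 1."""
    a = 1.0 - 1.0 / psi                    # 1 - 1/psi
    k0 = beta * mu_z ** a                  # kappa_0
    ratio = a / (1.0 + u0 * a)             # (1 - 1/psi) / (1 + u0 (1 - 1/psi))

    v_ss = math.log(abs(u0 + 1.0 / a)) - math.log(1.0 - k0)
    v_c = (1.0 - k0) / (1.0 - k0 * rho_c) * ratio
    v_x = k0 / (1.0 - k0 * rho_x) * a
    v_cc = (1.0 - k0) / (1.0 - k0 * rho_c ** 2) * a * ratio - v_c ** 2
    v_xx = (k0 / (1.0 - k0 * rho_x ** 2)
            * (1.0 - k0) / (1.0 - k0 * rho_x) ** 2 * a ** 2)
    v_cx = a * v_c * (rho_c * k0 / (1.0 - k0 * rho_c * rho_x)
                      - k0 / (1.0 - k0 * rho_x))
    v_ss_corr = k0 / (1.0 - k0) * (
        v_cc * sigma_c ** 2 + (1.0 - alpha) * v_c ** 2 * sigma_c ** 2
        + v_xx * sigma_x ** 2 + (1.0 - alpha) * v_x ** 2 * sigma_x ** 2
        + (1.0 - alpha) * a ** 2 * sigma_z ** 2
    )
    return {"v_ss": v_ss, "v_c": v_c, "v_x": v_x, "v_cc": v_cc,
            "v_xx": v_xx, "v_cx": v_cx, "v_sigma_sigma": v_ss_corr}


# Illustrative parameter values (assumptions for this sketch).
PARAMS = dict(psi=1.5, alpha=28.0, beta=0.9991, mu_z=1.0016,
              rho_c=0.9748, rho_x=0.9899,
              sigma_c=0.0027, sigma_x=1.16e-4, sigma_z=0.0014)

if __name__ == "__main__":
    base = value_function_loadings(u0=0.0, **PARAMS)
    shifted = value_function_loadings(u0=9.87, **PARAMS)
    for name in base:
        print(f"{name:14s} u0=0: {base[name]: .6g}   u0=9.87: {shifted[name]: .6g}")
```

Comparing the two parameterizations shows the comparative statics discussed in the text: a higher $u_0$ raises $v_{ss}$ and lowers $v_c$, while $v_x$ is unaffected.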

The steady state of the value function vss is obviously increasing in u0, whereas

the loadings v c and v cx are decreasing in u0. That is, a higher value of u0 raises the

level of the value function and makes it less responsive as argued above. The lower

value of v c is further seen to reduce the contribution from cyclical consumption in

the risk correction vσσ. A key determinant for the size of vσσ is the timing attitude

α, which has a negative impact on vσσ through cyclical, short- and long-run risk,

because α> 1 for plausible levels of RRA with uss ≡ u(Css ) > 0.

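Proposition 2. The unconditional mean of the risk-free rate $E\left[r^f_t\right]$ and the ex ante equity premium $E\left[r^m_{t+1} - r^f_t\right]$ in a second-order approximation around the steady state are given by

$$E\left[r^f_t\right] = r_{ss} - \frac{1}{2}\alpha v_x^2\sigma_x^2 - \frac{1}{2}\left[1 + (\alpha-1)\left(1-\frac{1}{\psi^2}\right)\right]\sigma_z^2 - \frac{1}{2}\left(\frac{1}{\psi^2} + \frac{1}{\psi}2\alpha v_c + \alpha v_c^2\right)\sigma_c^2$$

and

$$E\left[r^m_{t+1} - r^f_t\right] = \alpha\kappa_1 v_x\,\frac{\phi_x - \frac{1}{\psi}}{1-\kappa_1\rho_x}\,\sigma_x^2 + \left(\alpha v_c + \frac{1}{\psi}\right)\frac{\phi_c + (1-\rho_c)\frac{1}{\psi}}{1-\kappa_1\rho_c}\,\kappa_1\sigma_c^2 .$$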

Proposition 2 shows that the mean risk-free rate is given by its steady state level $r_{ss} = -\log\beta + \frac{1}{\psi}\log\mu_z$ minus uncertainty corrections for each of the shocks affecting consumption. The first term $-\frac{1}{2}\alpha v_x^2\sigma_x^2$ corrects for long-run risk; it is negative and grows in absolute value with the timing attitude $\alpha$. The second uncertainty correction in $E\left[r^f_t\right]$ relates to short-run risk and is also negative if $\psi > 1$ and $\alpha > 1$. The final term in $E\left[r^f_t\right]$ corrects for cyclical risk; it is also negative and becomes larger (in absolute terms) when $\psi$ falls and $\alpha$ increases. The effect of $u_0$ enters in the uncertainty correction for $c_t$ through $v_c$, where a lower value of $u_0$ gives a high RRA and a high $v_c$ that results in a large uncertainty correction from cyclical risk.
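To make the decomposition of the mean risk-free rate concrete, the following Python sketch evaluates the expression for $E[r^f_t]$ in Proposition 2 term by term. It takes the loadings $v_c$ and $v_x$ as given inputs; the function name and the numbers in the usage example are illustrative assumptions, not estimates from the paper.

```python
import math


def mean_risk_free_rate(psi, alpha, beta, mu_z, v_c, v_x,
                        sigma_c, sigma_x, sigma_z):
    """Second-order mean of the risk-free rate from Proposition 2.

    Returns the total and the three uncertainty corrections
    (each correction is subtracted from the steady-state level).
    """
    r_ss = -math.log(beta) + math.log(mu_z) / psi
    long_run = 0.5 * alpha * v_x ** 2 * sigma_x ** 2
    short_run = 0.5 * (1.0 + (alpha - 1.0) * (1.0 - 1.0 / psi ** 2)) * sigma_z ** 2
    cyclical = 0.5 * (1.0 / psi ** 2 + 2.0 * alpha * v_c / psi
                      + alpha * v_c ** 2) * sigma_c ** 2
    total = r_ss - long_run - short_run - cyclical
    return {"r_ss": r_ss, "long_run": long_run, "short_run": short_run,
            "cyclical": cyclical, "mean_rf": total}


if __name__ == "__main__":
    # Hypothetical inputs, monthly frequency.
    out = mean_risk_free_rate(psi=1.5, alpha=28.0, beta=0.999, mu_z=1.0016,
                              v_c=0.005, v_x=30.0,
                              sigma_c=0.0027, sigma_x=1.2e-4, sigma_z=0.0015)
    for key, value in out.items():
        print(f"{key:10s} {value: .6e}")
```

For $\psi > 1$ and $\alpha > 1$ all three corrections are positive numbers that are subtracted from $r_{ss}$, so the mean risk-free rate lies below its steady-state level.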

The equity premium depends positively on long-run risk if $\phi_x > \frac{1}{\psi}$ and $\psi > 1$, where the latter requirement is needed to ensure that $v_x > 0$. We also note that this uncertainty correction is increasing in i) the persistence of $x_t$ as determined by $\rho_x$, ii) the timing attitude $\alpha$, and iii) firm leverage $\phi_x$. The second term in $E\left[r^m_{t+1} - r^f_t\right]$ is also positive and corrects for cyclical risk. The size of this term increases in i) the persistence of $c_t$ as determined by $\rho_c$, ii) the timing attitude $\alpha$, iii) firm leverage $\phi_c$, and iv) the loading $v_c$. The latter implies that a lower value of $u_0$ (to increase the RRA and $v_c$) also increases the contribution of cyclical risk in the equity premium.

To summarize our insights from these analytical expressions, recall that existing models tend to generate too low equity premia and too high risk-free rates. Given identical returns for equity and the risk-free rate under certainty equivalence, we thus require a positive uncertainty correction in $E\left[r^m_{t+1} - r^f_t\right]$ and a negative uncertainty correction in $E\left[r^f_t\right]$ to resolve the equity premium and risk-free rate puzzles. The long-run risk model does exactly so for a high timing attitude $\alpha$ and a high RRA, provided the IES is larger than one. The proposed utility kernel also shows that the household’s timing attitude $\alpha$ has a separate effect on asset prices beyond the IES and RRA consistent with the evidence in Chew and Ho (1994) and van Winden et al. (2011).

1.3 Estimation Results: The Long-Run Risk Model

This section studies the ability of the long-run risk model to explain key features

of the post-war U.S. economy. We first describe the model solution and estimation

methodology in Section 1.3.1. The estimation results for the standard long-run risk

model are provided in Section 1.3.2 as a natural benchmark. Section 1.3.3 considers

our extension of the long-run risk model, while Section 1.3.4 studies the performance

of the model on moments that are not included in the estimation. We finally consider

a number of counterfactuals in Section 1.3.5.

1.3.1 Model Solution and Estimation Methodology

Pohl, Schmedders, and Wilms (2018) show that the widely used log-normal method

to approximate the solution to the long-run risk model may not always be sufficiently

accurate. Our extension allows α to take on even larger values than traditionally

considered, and this may generate even stronger nonlinearities in the long-run risk

model than reported in Pohl et al. (2018). We address this challenge by using a second-

order projection solution, where we exploit properties of quadratic systems with

Gaussian innovations to analytically carry out the required integration. Avoiding


numerical integration allows us to greatly reduce the execution time of this projection

solution to a few seconds, which makes the approximation sufficiently fast to be

used inside an estimation routine. Appendix A.2 provides further details on this

approximation, which constitutes a new numerical contribution to the literature. We

also show in Appendix A.3 that this second-order projection solution is more accurate

than the widely used log-normal method, and that it generally performs as well as a

highly accurate fifth-order projection solution.
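The details of the analytical integration are in Appendix A.2 and are not reproduced here. As a hedged illustration of why quadratic systems with Gaussian innovations admit closed-form expectations, the following Python sketch uses the standard identity $E[\exp(a\varepsilon + \tfrac{1}{2}b\varepsilon^2)] = (1-b)^{-1/2}\exp\left(\frac{a^2}{2(1-b)}\right)$ for $\varepsilon \sim N(0,1)$ and $b < 1$, and checks it against brute-force numerical integration. This identity is an assumption about the type of result being exploited, not a statement of the exact formulas used in the paper; all function names are hypothetical.

```python
import math


def gaussian_exp_quadratic(a: float, b: float) -> float:
    """Closed form of E[exp(a*e + 0.5*b*e**2)] for e ~ N(0, 1), valid for b < 1."""
    if b >= 1.0:
        raise ValueError("the expectation is finite only for b < 1")
    return math.exp(a * a / (2.0 * (1.0 - b))) / math.sqrt(1.0 - b)


def gaussian_exp_quadratic_numeric(a: float, b: float,
                                   half_width: float = 12.0,
                                   n: int = 20001) -> float:
    """Same expectation by the trapezoid rule on [-half_width, half_width]."""
    h = 2.0 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        e = -half_width + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        # Integrand: exp(a e + b e^2 / 2) times the standard normal kernel.
        total += weight * math.exp(a * e + 0.5 * b * e * e - 0.5 * e * e)
    return total * h / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    a, b = 0.3, 0.2
    print(gaussian_exp_quadratic(a, b), gaussian_exp_quadratic_numeric(a, b))
```

The closed form requires a single function evaluation, whereas the numerical version needs thousands of evaluations per expectation, which illustrates the kind of saving in execution time that avoiding numerical integration can deliver.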

The estimation is carried out on quarterly data, as this data frequency strikes

a good balance between getting a reasonably long sample and providing reliable

measures of consumption and dividend growth. Consistent with the common cal-

ibration procedure for the long-run risk model, we let one period in the model

correspond to one month and time-aggregate the theoretical moments to a quarterly

frequency. When simulating model moments, Bansal and Yaron (2004) enforce the

non-negativity of $\sigma_t^2$ by replacing negative draws with a small positive number. We

follow their procedure and set this small number to $\sigma_\sigma^2$.

Our quarterly data set is from 1947Q1 to 2014Q4, where we use the same five variables as in Bansal and Yaron (2004): i) the log-transformed price-dividend ratio $pd_t$, ii) the real risk-free rate $r^f_t$, iii) the market return $r^m_t$, iv) consumption growth $\Delta c_t$, and v) dividend growth $\Delta d_t$. All variables are stored in this order in $data_t$ with dimension $5 \times 1$. We explore whether the model can match the means, variances, contemporaneous covariances, and persistence in these five variables, as well as the ability of $pd_t$ to forecast excess market return $ex_t \equiv r^m_t - r^f_t$ and the inability of $pd_t$ to forecast dividend growth. To ease the estimation, the values of $\mu_z$ and $\mu_d$ are calibrated to match the sample mean of consumption growth and dividend growth, respectively. Hence, for the estimation we let

$$q_t \equiv \begin{bmatrix} \widetilde{data}_t' \\ vec\left(data_t\, data_t'\right)' \\ diag\left(data_t\, data_{t-1}'\right)' \\ \left(ex_t - \overline{ex}\right) \times pd_{t-1} \\ \Delta d_t \times pd_{t-1} \end{bmatrix},$$

where $\widetilde{data}_t$ contains the first three elements of $data_t$, $diag(\cdot)$ denotes the diagonal elements of a matrix, and $\overline{ex}$ is the sample average of $ex_t$. The model is estimated by simulated method of moments (SMM), where the model-implied moments $\frac{1}{S}\sum_{s=1}^{S} q_s$ are computed by simulation using $S = 250{,}000$ monthly observations. We adopt the conventional two-step implementation of SMM and use a diagonal weighting matrix in a preliminary first step, where moments related to consumption and dividend growth have a relatively high weight to ensure that the model does not match asset prices at the expense of a distorted fit to macro fundamentals. Based on these estimates, we then obtain our final estimates using the optimal weighting matrix computed by the Newey-West estimator with 15 lags.
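The construction of the moment vector $q_t$ can be sketched in a few lines of Python. The sketch below follows the ordering described in the text (first three elements of $data_t$, the vectorized outer product, the own first-order cross products, and the two predictability moments). The function names are hypothetical, and the sketch keeps the duplicate entries that arise because the outer product is symmetric; whether the estimation drops these duplicates is not stated in the text.

```python
def moment_vector(data_t, data_lag, ex_bar):
    """Build q_t from two consecutive observations.

    data_t and data_lag are sequences ordered as [pd, rf, rm, dc, dd];
    ex_bar is the sample average of the excess return rm - rf.
    """
    n = len(data_t)
    pd_lag = data_lag[0]
    ex_t = data_t[2] - data_t[1]

    first_three = list(data_t[:3])
    # vec(.) stacks the columns of the outer product data_t data_t'.
    outer = [data_t[i] * data_t[j] for j in range(n) for i in range(n)]
    # Diagonal of data_t data_{t-1}': own first-order cross products.
    own_lag = [data_t[i] * data_lag[i] for i in range(n)]
    predictability = [(ex_t - ex_bar) * pd_lag, data_t[4] * pd_lag]
    return first_three + outer + own_lag + predictability


def average_moments(sample):
    """Average q_t over a sample given as a list of observations."""
    excess = [obs[2] - obs[1] for obs in sample]
    ex_bar = sum(excess) / len(excess)
    qs = [moment_vector(sample[t], sample[t - 1], ex_bar)
          for t in range(1, len(sample))]
    return [sum(col) / len(qs) for col in zip(*qs)]


if __name__ == "__main__":
    # Three made-up quarterly observations, for illustration only.
    toy = [[3.5, 0.002, 0.020, 0.005, 0.006],
           [3.4, 0.003, -0.010, 0.004, 0.001],
           [3.6, 0.001, 0.030, 0.006, 0.009]]
    print(len(average_moments(toy)))
```

With five variables the vector has 3 + 25 + 5 + 2 = 35 entries in this sketch. The same function can be applied to simulated model data to obtain the model-implied counterpart of the sample moments.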


A preliminary analysis reveals that $\sigma_\sigma$ is badly identified. Given that the long-run risk model requires high persistence in $\sigma_t^2$, we occasionally find that large estimates of $\sigma_\sigma$ generate a fairly low probability of $\sigma_t^2$ being non-negative (e.g., $\Pr\left(\sigma_t^2 \ge 0\right) \approx 60\%$), making (1.2) a poor approximation for the evolution of $\sigma_t^2$. Therefore, we impose an upper bound of 0.999 on $\rho_\sigma$ as in Bansal, Kiku, and Yaron (2012) and set the value of $\sigma_\sigma$ to 0.05. This value of $\sigma_\sigma$ ensures that $\Pr\left(\sigma_t^2 \ge 0\right)$ is at least 83% with $\rho_\sigma \le 0.999$.⁷

1.3.2 The Benchmark Model

As a natural benchmark, we first consider the standard long-run risk model by letting $u_0 = 0$ in (1.4). For comparability with nearly all calibrations of this model, we let the IES = 1.5 and RRA = 10 by setting $\alpha$ appropriately using (1.5). The estimates in the second column of Table 1.1 show that $x_t$ generates a small but very persistent component in consumption growth with $\sigma_x = 1.16 \times 10^{-4}$ and $\rho_x = 0.990$. As in the calibration of Bansal et al. (2012), $\sigma_t^2$ displays high persistence with $\rho_\sigma = 0.9983$. Cyclical consumption risk is mean-reverting with $\rho_c = 0.975$ and fairly volatile with $\sigma_c = 0.0027$. We also note that the constraint on the effective discount factor $\beta^* \equiv \beta\mu_z^{1-1/\psi} < 1$ is binding, because a high value of $\beta$ is needed to generate a low risk-free rate.

Table 1.1 also reports the timing premium $\Pi_t$ of Epstein et al. (2014). We find that $\Pi_{ss} = 70\%$, meaning that the household is willing to give up 70% of its lifetime consumption to know all future realizations of consumption in the following period. This level of the timing premium is somewhat higher than the reported 31% for the long-run risk model in Epstein et al. (2014), but lower than 77% as implied by the calibrated version of the long-run risk model in Bansal et al. (2012).⁸

Column three in Table 1.2 verifies the common finding in the literature that the standard long-run risk model with IES = 1.5 and RRA = 10 is able to explain several asset pricing moments. In particular, the model provides a very satisfying fit to the means and standard deviations of the price-dividend ratio and market return. However, the risk-free rate has an elevated mean (1.96% vs. 0.83%) and displays insufficient variability with a standard deviation of 0.75% compared to 2.22% in the data. Table 1.2 also shows that our estimated version of the long-run risk model matches the standard deviation and persistence in consumption and dividend growth, although the auto-correlation for dividend growth is somewhat higher than in the data (0.52 vs. 0.40). It is, however, within the 95% confidence interval [0.27, 0.52], which is

⁷ For comparison, Bansal et al. (2012) let $\sigma_\sigma = 0.0378$, and our calibration is thus very similar to their preferred value of $\sigma_\sigma$.

⁸ The difference in the timing premium reported in Epstein et al. (2014) and the implied value from the calibration in Bansal et al. (2012) is mainly explained by the considered values of $\beta$ and $\rho_\sigma$. Epstein et al. (2014) use $\rho_\sigma = 0.987$ and $\beta = 0.9980$, but increasing $\rho_\sigma$ to 0.999 as in Bansal et al. (2012) raises $\Pi_{ss}$ from 31% to 50%. If we also increase $\beta$ to 0.9989 as in Bansal et al. (2012), then $\Pi_{ss} = 82\%$ and hence close to the 77% in Bansal et al. (2012). Slightly different values of $\sigma_z$ and $\sigma_x$ in Bansal et al. (2012) and Epstein et al. (2014) account for the remaining difference.


Table 1.1: The Long-Run Risk Model: The Structural Parameters

Estimation results using data from 1947Q1 to 2014Q4 and a second-order projection approximation. The model has a monthly time frequency with model-implied moments time-aggregated to a quarterly time frequency based on a simulated sample of 250,000 monthly observations. The reported estimates are from the second step in SMM with the optimal weighting matrix estimated by the Newey-West estimator using 15 lags. Standard errors are reported in parentheses, except when an estimate is on the boundary and its standard error is not available (n.a.). The values of µz and µd are calibrated to match the sample moments of consumption and dividend growth, respectively, implying µz = 1.0016 and µd = 1.0020. The value of σσ is set to 0.05. The timing premium at the steady state (Πss) is defined as in (1.8) and computed based on a second-order projection of the value function, and the utility level when uncertainty is resolved in the following period is computed by simulation using antithetic sampling with 10,000 draws and 15,000 terms to approximate the lifetime utility stream.

                    Benchmark Model             Extended Model
                    (1)           (2)           (3)           (4)           (5)           (6)           (7)           (8)
RRA                 5             10            5             5             5             10            10            10
IES                 1.5           1.5           1.1           1.5           2.0           1.1           1.5           2.0

u0                  -             -             71.37         24.72         9.91          33.22         9.87          2.56
                                                (3.36)        (3.09)        (0.64)        (4.12)        (0.90)        (0.30)
β                   0.9991        0.9991        0.9995        0.9991        0.9988        0.9995        0.9991        0.9988
                    (n.a.)        (n.a.)        (n.a.)        (n.a.)        (n.a.)        (n.a.)        (n.a.)        (n.a.)
ρc                  0.7577        0.9748        0.9810        0.9831        0.9828        0.9805        0.9832        0.9809
                    (0.3681)      (0.0209)      (0.0075)      (0.0027)      (0.0086)      (0.0048)      (0.0071)      (0.0104)
ρx                  0.9926        0.9899        0.9822        0.9684        0.9774        0.9928        0.9675        0.9849
                    (0.0024)      (0.0041)      (0.0254)      (0.0003)      (0.0100)      (0.0017)      (0.0003)      (0.0157)
ρσ                  0.9986        0.9983        0.9974        0.9990        0.9990        0.9990        0.9990        0.9986
                    (0.0011)      (0.0025)      (0.0081)      (n.a.)        (n.a.)        (n.a.)        (n.a.)        (0.0047)
φx                  3.2053        4.3843        3.5511        4.595         4.3246        3.3772        4.5664        4.0767
                    (0.2223)      (0.0621)      (3.5511)      (0.2558)      (1.0974)      (2.4230)      (0.7024)      (0.8778)
φc                  2.4172        0.2396        0.2745        0.2737        0.2716        0.3263        0.2630        0.2763
                    (0.0751)      (0.1219)      (0.0620)      (0.0028)      (0.0976)      (0.0537)      (0.0839)      (0.0786)
σc                  0.00001       0.0027        0.0030        0.0027        0.0027        0.0026        0.0027        0.0028
                    (n.a.)        (0.0008)      (0.0006)      (0.0003)      (0.0005)      (0.0003)      (0.0004)      (0.0008)
σz                  0.0020        0.0014        0.0013        0.0016        0.0016        0.0020        0.0016        0.0015
                    (0.0003)      (0.0012)      (0.0011)      (0.0004)      (0.0006)      (0.0002)      (0.0003)      (0.0010)
σd                  0.0125        0.0116        0.0116        0.0107        0.0106        0.0108        0.0107        0.0108
                    (0.0004)      (0.0010)      (0.0009)      (0.0001)      (0.0008)      (0.0007)      (0.0011)      (0.0018)
σx                  1.57×10^-4    1.16×10^-4    1.03×10^-4    1.20×10^-4    1.20×10^-4    0.46×10^-4    1.23×10^-4    1.02×10^-4
                    (2.34×10^-5)  (2.30×10^-5)  (6.56×10^-5)  (0.70×10^-5)  (4.17×10^-5)  (4.56×10^-5)  (1.35×10^-5)  (5.04×10^-5)

Memo
Pr(σ²t ≥ 0)         86.9%         89.2%         93.1%         82.6%         82.6%         82.6%         82.6%         86.9%
u′(Ct)Ct/u(Ct)|ss   0.333         0.333         0.012         0.036         0.084         0.023         0.078         0.219
Πss                 72%           70%           93%           86%           75%           99%           86%           73%
α                   13.00         28.00         336.96        120.10        53.59         402.04        120.08        43.36


Table 1.2: The Long-Run Risk Model: Fit of Moments

The model has a monthly time frequency with model-implied moments time-aggregated to a quarterly time frequency using the same procedure as in Bansal and Yaron (2004). All means and standard deviations are expressed in annualized percent, except for the price-dividend ratio. That is, the relevant moments are multiplied by 400, except for the standard deviation of the market return that is multiplied by 200. All model-implied moments in columns (2) to (9) are from the unconditional distribution computed using a simulated sample of 250,000 monthly observations, whereas the empirical data moments in column (1) are the empirical sample moments. In column (1), figures in parentheses refer to the standard error of the empirical moment, computed based on a block bootstrap using 5,000 draws and a block length of 32 quarters.

                            Data            Benchmark Model   Extended Model
                            (1)             (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)
RRA                                         5        10       5        5        5        10       10       10
IES                                         1.5      1.5      1.1      1.5      2.0      1.1      1.5      2.0

Means
pd_t                        3.495 (0.122)   3.491    3.297    3.277    3.290    3.284    3.294    3.286    3.278
r^f_t                       0.831 (0.547)   1.839    1.959    2.162    1.693    1.591    1.676    1.706    1.628
r^m_t                       6.919 (1.879)   5.703    6.320    6.318    6.262    6.300    6.215    6.276    6.330
Δc_t                        1.905 (0.244)   1.894    1.902    1.905    1.897    1.896    1.894    1.897    1.900
Δd_t                        2.391 (0.975)   2.354    2.377    2.398    2.363    2.357    2.358    2.363    2.370

Stds
pd_t                        0.421 (0.068)   0.419    0.342    0.262    0.284    0.302    0.263    0.285    0.297
r^f_t                       2.224 (0.397)   1.142    0.750    0.698    0.495    0.451    0.588    0.496    0.466
r^m_t                       16.45 (1.138)   14.10    14.48    14.83    14.80    14.73    14.77    14.78    14.694
Δc_t                        2.035 (0.172)   2.054    2.062    2.033    2.012    2.022    2.076    2.013    2.034
Δd_t                        9.391 (1.531)   9.222    9.045    8.995    8.807    8.808    8.785    8.801    8.779

Persistence
corr(pd_t, pd_t-1)          0.982 (0.056)   0.985    0.976    0.957    0.963    0.968    0.957    0.964    0.967
corr(r^f_t, r^f_t-1)        0.866 (0.035)   0.987    0.978    0.964    0.951    0.966    0.981    0.949    0.975
corr(r^m_t, r^m_t-1)        0.084 (0.048)   0.017    0.012    0.003    0.003    0.006    0.000    0.003    0.006
corr(Δc_t, Δc_t-1)          0.306 (0.118)   0.718    0.378    0.269    0.257    0.286    0.240    0.258    0.289
corr(Δd_t, Δd_t-1)          0.396 (0.063)   0.467    0.523    0.529    0.552    0.555    0.544    0.552    0.553


Table 1.2: The Long-Run Risk Model: Fit of Moments (continued)

                            Data            Benchmark Model   Extended Model
                            (1)             (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)
RRA                                         5        10       5        5        5        10       10       10
IES                                         1.5      1.5      1.1      1.5      2.0      1.1      1.5      2.0

Correlations
corr(pd_t, r^f_t)           0.035 (0.212)   0.913    0.668    -0.084   0.040    0.303    -0.052   0.033    0.367
corr(pd_t, r^m_t)           0.058 (0.062)   0.185    0.212    0.284    0.256    0.236    0.288    0.255    0.243
corr(pd_t, Δc_t)            0.025 (0.080)   0.652    0.366    0.139    0.107    0.148    0.118    0.106    0.187
corr(pd_t, Δd_t)            -0.017 (0.095)  0.499    0.535    0.635    0.605    0.573    0.663    0.604    0.586
corr(r^f_t, r^m_t)          0.023 (0.044)   0.164    0.083    -0.021   -0.006   0.013    -0.072   -0.005   0.004
corr(r^f_t, Δc_t)           0.161 (0.080)   0.789    0.468    0.305    0.253    0.289    0.200    0.256    0.230
corr(r^f_t, Δd_t)           -0.168 (0.093)  0.565    0.336    -0.035   0.009    0.088    -0.163   0.011    0.072
corr(r^m_t, Δc_t)           0.233 (0.054)   0.135    0.395    0.623    0.592    0.556    0.558    0.597    0.554
corr(r^m_t, Δd_t)           0.104 (0.050)   0.296    0.294    0.296    0.290    0.289    0.292    0.289    0.289
corr(Δc_t, Δd_t)            0.062 (0.0496)  0.465    0.236    0.069    0.075    0.107    0.028    0.076    0.107
corr(r^m_t - r^f_t, pd_t-1) -0.134 (0.048)  -0.017   -0.014   0.006    -0.002   -0.009   0.011    -0.002   -0.007
corr(Δd_t, pd_t-1)          -0.0163 (0.104) 0.467    0.498    0.586    0.562    0.533    0.616    0.560    0.545

Goodness of fit
Q^step2                     -               0.0632   0.0621   0.0624   0.0591   0.0592   0.0616   0.0592   0.0593
J-test: P-value             -               10.93%   26.44%   20.20%   24.78%   24.59%   21.24%   24.55%   24.49%
Q^scaled                    -               3.35     2.26     1.89     1.54     1.53     1.62     1.54     1.61


derived from the reported standard error for each of the sample moments in Table 1.2 shown in parentheses and computed using a block bootstrap.

The last part of Table 1.2 shows the contemporaneous correlations. We find that consumption growth is too highly correlated with the price-dividend ratio (0.37 vs. 0.03). This is similar to the finding reported in Beeler and Campbell (2012). We also find that consumption growth is too strongly correlated with the risk-free rate (0.47 vs. 0.16). Conventional two-sided t-tests further show that the differences in $corr\left(pd_t, \Delta c_t\right)$ and $corr\left(r^f_t, \Delta c_t\right)$ have t-statistics of 4.26 and 3.84, respectively.⁹

To understand why consumption growth is too highly correlated with $pd_t$ and $r^f_t$, recall that the standard long-run risk model relies on the power utility kernel with an IES = 1.5 and RRA = 10. Equation (1.5) then implies a relatively low timing attitude with $\alpha = 28$. To explain the market return, the model therefore requires high persistence in $x_t$ to amplify the long-run risk channel (see Section 1.2.4). But such a high level of persistence in $x_t$ makes consumption growth too highly correlated with the price-dividend ratio and the risk-free rate. To see this, consider the analytical approximation in Section 1.2.4, which implies

$$cov\left(\Delta c_t, pd_t\right) = \frac{\phi_x - \frac{1}{\psi}}{1-\kappa_1\rho_x}\,\rho_x\,\frac{\sigma_x^2}{1-\rho_x^2} + \frac{(1-\rho_c)^2}{1-\kappa_1\rho_c}\,\frac{1}{\psi}\,\frac{\sigma_c^2}{1-\rho_c^2} \tag{1.6}$$

and

$$cov\left(\Delta c_t, r^f_t\right) = \frac{1}{\psi}\left[\rho_x\,\frac{\sigma_x^2}{1-\rho_x^2} - (1-\rho_c)^2\,\frac{\sigma_c^2}{1-\rho_c^2}\right], \tag{1.7}$$

both of which are increasing in $\rho_x$ for the parameter values in Table 1.1. Hence, an undesirable effect of the high persistence in $x_t$ is to amplify the comovement of consumption growth with $pd_t$ and $r^f_t$.

The tight link between the timing attitude $\alpha$ and the degree of long-run risk is seen clearly when estimating the model with RRA = 5, as shown in the first column of Table 1.1. This lower level of RRA weakens the effect from the timing attitude, as $\alpha$ falls from 28 to 13. To match asset prices, we therefore find an increase in the degree of long-run risk compared to the benchmark specification with RRA = 10, as $\sigma_x$ increases from $1.16 \times 10^{-4}$ to $1.57 \times 10^{-4}$ and $\rho_x$ increases from 0.990 to 0.993. The second column in Table 1.2 shows that this increase in long-run risk produces too much auto-correlation in consumption growth (0.72 vs. 0.31) and amplifies $corr\left(\Delta c_t, pd_t\right)$ and $corr\left(\Delta c_t, r^f_t\right)$ further.

⁹ Using the log-normal method and the calibration in Bansal and Yaron (2004), the long-run risk model implies $corr\left(pd_t, \Delta c_t\right) = 0.547$ and $corr\left(r^f_t, \Delta c_t\right) = 0.581$. The corresponding empirical moments on annual data are 0.061 and 0.356, respectively. The slightly modified calibration in Bansal et al. (2012) with less long-run risk gives $corr\left(pd_t, \Delta c_t\right) = 0.368$ and $corr\left(r^f_t, \Delta c_t\right) = 0.473$. Thus, the elevated correlations for $corr\left(pd_t, \Delta c_t\right)$ and $corr\left(r^f_t, \Delta c_t\right)$ also appear in calibrated versions of the long-run risk model using annual data.


1.3.3 The Extended Model

We next introduce $u_0$ in the utility kernel and re-estimate the long-run risk model when conditioning on the familiar values of RRA = 10 and IES = 1.5. Column seven in Table 1.1 shows that we find $u_0 = 9.87$ with a standard error of 0.90, meaning that $u_0$ is statistically different from zero at all conventional significance levels. With $u_0 = 9.87$, the key ratio $\left. u'(C_t)C_t/u(C_t)\right|_{ss}$ is much lower than in the benchmark version of the model (0.078 vs. 0.333), and this allows the timing attitude $\alpha$ to increase from 28 to 120 while keeping RRA at 10. Less long-run risk is therefore needed to match asset prices and this explains the fall in $\rho_x$ from 0.990 to 0.968. As a result, $corr\left(pd_t, \Delta c_t\right)$ falls from 0.37 to 0.10 and $corr\left(r^f_t, \Delta c_t\right)$ falls from 0.47 to 0.26, implying that both moments are no longer significantly different from their empirical moments. We also see improvements in the ability of the model to match $corr\left(pd_t, r^f_t\right)$, $corr\left(r^f_t, \Delta d_t\right)$, $corr\left(\Delta c_t, \Delta d_t\right)$, and the mean of $r^f_t$. On the other hand, the fit to $corr\left(r^m_t, \Delta c_t\right)$, $corr\left(r^m_t - r^f_t, pd_{t-1}\right)$, $corr\left(\Delta d_t, pd_{t-1}\right)$, and the standard deviations of $pd_t$ and $r^f_t$ worsen slightly when including $u_0$.

To evaluate the overall goodness of fit for the long-run risk model, Table 1.2 also reports the value of the objective function $Q^{step2}$ in step 2 of our SMM estimation and the related p-value for the J-test for model misspecification. The benchmark model and our extension are not rejected by the data, but we note that the J-test has low power given our short sample (T = 271). The values of $Q^{step2}$ are unfortunately not comparable across models, because they are computed for model-specific weighting matrices. To facilitate model comparison, we therefore introduce the following measure for goodness of fit

$$Q^{scaled} = \sum_{i=1}^{n}\left(\frac{m_i^{data} - m_i^{model}}{1 + m_i^{data}}\right)^2,$$

where $m_i^{data}$ and $m_i^{model}$ refer to the scaled moments in the data and the model, respectively, as reported in Table 1.2.¹⁰ Although the moments in $Q^{scaled}$ are weighted differently than in the estimation, $Q^{scaled}$ may nevertheless serve as a natural summary statistic for model comparison from an economic perspective. We find that the benchmark model implies $Q^{scaled} = 2.26$, but allowing for $u_0$ in the utility kernel gives $Q^{scaled} = 1.54$. This corresponds to a 32% improvement in model fit from disentangling the timing attitude $\alpha$ from the IES and RRA.
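The goodness-of-fit measure is easy to compute once the scaled moments are collected in two lists. The following Python sketch implements the formula for $Q^{scaled}$ as stated above and reproduces the percentage improvement from the two reported values; the function names and the two-element input in the usage example are illustrative assumptions.

```python
def q_scaled(data_moments, model_moments):
    """Sum of squared moment gaps, each standardized by 1 + the data moment."""
    if len(data_moments) != len(model_moments):
        raise ValueError("both moment lists must have the same length")
    return sum(((d - m) / (1.0 + d)) ** 2
               for d, m in zip(data_moments, model_moments))


def relative_improvement(q_old: float, q_new: float) -> float:
    """Fractional reduction in the fit measure when moving from q_old to q_new."""
    return (q_old - q_new) / q_old


if __name__ == "__main__":
    # Made-up moments: gaps of 0.5 and -0.1, scaled by 2.0 and 1.0.
    print(q_scaled([1.0, 0.0], [0.5, 0.1]))
    # Reported values of the measure for the benchmark and extended model.
    print(relative_improvement(2.26, 1.54))
```

Standardizing by $1 + m_i^{data}$ keeps the weight on a moment bounded when the data moment is close to zero, which is the case for several of the correlations in Table 1.2.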

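¹⁰ The difference $m_i^{data} - m_i^{model}$ in $Q^{scaled}$ is standardized by $1 + m_i^{data}$, as opposed to just $m_i^{data}$, to ensure that moments close to zero do not get very large weights.

A natural way to extend the timing premium of Epstein et al. (2014) to the utility kernel in (1.4) is to define $\Pi_t$ implicitly as

$$\begin{aligned}
V_t ={}& \left[u_0 Z_t^{1-1/\psi} + \frac{1}{1-\frac{1}{\psi}}\,C_t^{1-\frac{1}{\psi}}\right]\left(1-\Pi_t\right)^{1-\frac{1}{\psi}} \\
&+ \beta\left(E_t\left[\left(\sum_{i=1}^{\infty}\beta^{i-1}\left[u_0 Z_{t+i}^{1-1/\psi} + \frac{C_{t+i}^{1-\frac{1}{\psi}}}{1-\frac{1}{\psi}}\right]\left(1-\Pi_t\right)^{1-\frac{1}{\psi}}\right)^{1-\alpha}\right]\right)^{1/(1-\alpha)}.
\end{aligned} \tag{1.8}$$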

That is, we combine $Z_t^{1-\frac{1}{\psi}} u_0$ and the utility from $C_t$ when computing $\Pi_t$, because $Z_t^{1-\frac{1}{\psi}} u_0$ is a reduced-form term that captures other aspects of consumption than included in $C_t$ (see Section 1.2.3). This implies that $\Pi_t$ measures the fraction of overall lifetime consumption that the household is willing to pay to have all uncertainty resolved in the following period. Clearly, equation (1.8) reduces to the definition of $\Pi_t$ in Epstein et al. (2014) when $u_0 = 0$. Table 1.1 shows that $\Pi_{ss}$ increases from 70% to 86% when introducing $u_0$ in the utility kernel when RRA = 10 and IES = 1.5. That is, the pronounced increase in the timing attitude $\alpha$ from 28 to 120 more than outweighs the effects from less long-run risk and leads to an even higher timing premium.

The remaining columns in Tables 1.1 and 1.2 explore the robustness of these findings to lowering the IES to 1.1, increasing the IES to 2, and reducing RRA to 5. We emphasize the following two results. First, lowering RRA from 10 to 5 hardly affects the model’s ability to match asset prices once $u_0$ is included in the utility kernel. For instance, we find $Q^{scaled} = 1.54$ when the IES = 1.5 for both levels of RRA. In contrast, when using the traditional utility kernel with an RRA of 5, the model’s ability to match the data deteriorates as $Q^{scaled}$ increases from 2.26 to 3.35. Second, the effects of changing the IES are generally also small, in particular for RRA = 10. Thus, we find that the satisfying ability of the long-run risk model to match asset prices extends to the case of a lower IES of 1.1 and a lower RRA of 5, once $u_0$ is included in the utility kernel. However, separating these three behavioral characteristics in the utility function does not alleviate the problem of seemingly implausibly high levels of the timing premium, which remains very high (i.e. above 70%) for all considered specifications of the IES and RRA.

1.3.4 Additional Model Implications

In addition to the moments used in the estimation, the long-run risk model is also

frequently evaluated based on its ability to reproduce several stylized relationships for

the U.S. stock market. Following Beeler and Campbell (2012), we first study the ability

of the price-dividend ratio to explain past and future consumption growth. Figure 1.1 shows that, in the standard long-run risk model, past and future consumption growth are too highly correlated with the price-dividend ratio compared to the empirical evidence. A similar finding is reported in Beeler and Campbell (2012) for two calibrated

versions of this model. In contrast, our extension of the long-run risk model implies

that past and future consumption growth display the same low correlations with

the price-dividend ratio as seen in the data. Figure 1.1 considers the case where the

IES = 1.5 and RRA = 5 in our extension of the long-run risk model, but the results are

robust to using any of the other specifications for the IES and RRA reported in Table

1.1. Thus, disentangling the timing attitude from the IES and RRA is also supported

by these stylized regressions, because a higher timing attitude reduces the amount of

long-run risk and hence the degree of predictability in consumption growth.

Figure 1.1: Properties of Consumption Growth and Volatility

All model-implied moments are computed given the estimated parameters in Table 1.1 using a simulated sample path of 1,000,000 observations. The conditional volatility $\sigma_t$ is estimated by $|u_t|$, where $u_t$ is the residual from the OLS regression $\Delta c_t = \alpha + \sum_{j=1}^{5} \beta^{(j)} \Delta c_{t-j} + u_t$. All the 95 percent confidence bands are computed using a block bootstrap applied jointly to the regressand and the regressor with a block length of $2 \times j$ lags.

[Figure 1.1: four panels, not reproduced. Horizontal axis: forecast horizon j in quarters.]

The last two charts in Figure 1.1 explore the relationship between consumption

volatility and the price-dividend ratio. We find that our extension of the long-run risk

model preserves the good performance of the benchmark model and implies that

i) a high price-dividend ratio predicts future low volatility and ii) high uncertainty

forecasts a low price-dividend ratio.
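The confidence bands in Figure 1.1 rely on a block bootstrap in which the regressand and the regressor are resampled jointly. The sketch below illustrates one way to do this in Python for the correlation between x_t and y_{t+j}, using a block length of 2 × j as stated in the figure note. The function names, the toy data, and the use of overlapping blocks with percentile bands are assumptions made for illustration; these implementation details are not spelled out in the text.

```python
import math
import random


def correlation(x, y):
    """Sample correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def block_bootstrap_band(x, y, block_len, draws=200, level=0.95, seed=0):
    """Percentile band for corr(x, y).

    Pairs (x_t, y_t) are resampled jointly in overlapping blocks
    (assumed scheme), so the link between the two series is preserved.
    """
    rng = random.Random(seed)
    n = len(x)
    starts = range(n - block_len + 1)
    stats = []
    for _ in range(draws):
        xs, ys = [], []
        while len(xs) < n:
            s = rng.choice(starts)
            xs.extend(x[s:s + block_len])
            ys.extend(y[s:s + block_len])
        stats.append(correlation(xs[:n], ys[:n]))
    stats.sort()
    lower = stats[int((1.0 - level) / 2.0 * draws)]
    upper = stats[int((1.0 + level) / 2.0 * draws) - 1]
    return lower, upper


# Toy data (hypothetical): y_{t+j} = 0.5 * x_t + noise at horizon j = 2.
j = 2
n_obs = 400
data_rng = random.Random(1)
x = [data_rng.gauss(0.0, 1.0) for _ in range(n_obs)]
y = [data_rng.gauss(0.0, 1.0) for _ in range(j)]
y += [0.5 * x[t] + data_rng.gauss(0.0, 1.0) for t in range(n_obs - j)]

x_lag, y_lead = x[:-j], y[j:]  # align x_t with y_{t+j}
lo, hi = block_bootstrap_band(x_lag, y_lead, block_len=2 * j)
print(f"corr = {correlation(x_lag, y_lead):.3f}, band = [{lo:.3f}, {hi:.3f}]")
```

Resampling whole (x_t, y_{t+j}) pairs in blocks keeps both the cross-series link and the short-run serial dependence intact, which is the reason for applying the bootstrap jointly.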


1.3.5 The Key Mechanisms

We next consider a number of experiments to illustrate some of the key mechanisms

in the model. Here, we apply the estimated version of the model in column four of

Table 1.1 with an IES of 1.5 and a RRA of 5.

Table 1.3: The Long-Run Risk Model: Analyzing the Extended Model

The model has a monthly time frequency with model-implied moments time-aggregated to a quarterly time frequency using the same procedure as in Bansal and Yaron (2004). All means and standard deviations are expressed in annualized percent by multiplying by 400, except for the standard deviation of the market return that is multiplied by 200. The moments are from the unconditional distribution computed using a simulated sample of 250,000 monthly observations. Unless stated otherwise, all parameters attain the estimated values from column (4) in Table 1.1, meaning that the IES = 1.5 and RRA = 5.

            (1)       (2)       (3)       (4)       (5)       (6)         (7)
            u0 = 10   u0 = 20   u0 = û0   σx = 0    σσ = 0    β = 0.998   α = 0

Means
  pd_t      66.31     15.71     3.29      14.04     4.46      3.16        8.50
  r^f_t     2.18      1.86      1.69      2.24      2.17      3.21        2.39
  r^m_t     2.36      2.36      6.26      2.38      3.60      6.80        2.40

Stds
  pd_t      0.31      0.31      0.28      0.30      0.27      0.28        0.82
  r^f_t     0.47      0.48      0.50      0.24      0.43      0.50        0.45
  r^m_t     17.99     18.02     14.80     17.02     16.18     14.55       22.22

Memo
  RRA       5         5         5         5         5         5           0.67
  Πss       46%       79%       86%       18%       33%       34%         0%
  α         56.33     99.67     120.10    120.10    120.10    120.10      0

The first experiment we consider is to gradually increase u0 to its estimated value

of 24.72. Table 1.3 shows that a higher value of u0 generates a substantial increase in

the required timing attitude α to ensure a constant RRA. This in turn has desirable

effects on the level of asset prices because a higher value of α reduces E[pdt ] as well

as E[r ft ] and increases E[r mt ]. To understand these effects of increasing α for a given

level of RRA, recall that the household is indifferent to resolution of uncertainty when

α= 0. Now suppose we increase α to make the household prefer early resolution of

uncertainty, but without affecting the RRA. This modification increases the variability

of the value function and hence increases the precautionary motive. The one-period

risk-free bond therefore becomes more attractive, and this reduces the risk-free rate

as shown in Proposition 2. On the other hand, uncertain future dividends from equity

become less attractive for higher values of α due to the presence of long-run risk.

A household with strong preferences for early resolution of uncertainty therefore

requires a larger compensation for holding equity compared to the case of α= 0 and


this explains the increase in E[r mt ] for higher values of α.
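The link between u0 and the timing attitude in the first experiment can be checked with a few lines of arithmetic. The sketch below assumes the utility kernel of the endowment model takes the form u(C) = u0 + C^(1−1/ψ)/(1−1/ψ) and that RRA = 1/IES + α u'(C)C/u(C), which is how equation (1.5) is described elsewhere in the chapter but is not restated in this section; it also uses the steady-state value C = 1 for the long-run risk model. The function names are hypothetical.

```python
def utility_ratio(u0, psi, c=1.0):
    """u'(C) * C / u(C) for the assumed kernel u(C) = u0 + C**(1-1/psi) / (1-1/psi)."""
    power = 1.0 - 1.0 / psi
    u = u0 + c ** power / power
    return c ** power / u


def required_alpha(rra, u0, psi, c=1.0):
    """Timing attitude that delivers a target RRA.

    Assumes RRA = 1/psi + alpha * u'(C) * C / u(C).
    """
    return (rra - 1.0 / psi) / utility_ratio(u0, psi, c)


# Columns (1) to (3) of Table 1.3: IES = 1.5, RRA = 5, increasing u0.
for u0 in (10.0, 20.0, 24.72):
    print(f"u0 = {u0:5.2f}  ->  alpha = {required_alpha(5.0, u0, 1.5):.2f}")
```

Under these assumptions the implied values of α are close to the 56.33, 99.67, and 120.10 reported in the memo rows of Table 1.3, and setting α = 0 leaves RRA = 1/1.5 ≈ 0.67, as in column (7).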

The second experiment we consider is to omit long-run risk by letting σx = 0. The

fourth column in Table 1.3 shows that this modification has profound implications,

as the model now generates too high a level for the price-dividend ratio (14.04

vs. 3.50 in the data) and the risk-free rate (2.24% vs. 0.83% in the data), whereas the

average market return is too low (2.38% vs. 6.92% in the data). Omitting long-run

risk also has a large effect on the timing premium, which falls from 86% to 18%.

Thus, disentangling the timing attitude α from the IES and RRA does not alleviate the

reliance on long-run risk in the model.

Our third experiment imposes σσ = 0 to evaluate the importance of stochastic

volatility. The fifth column in Table 1.3 shows that the mean of the price-dividend

ratio increases to 4.46 and the mean market return falls to 3.66%. We also find that

the timing premium decreases from 86% to 33%. This shows that stochastic volatility

may have a much larger impact on the timing premium in long-run risk models than

suggested by the results in Epstein et al. (2014). Thus, stochastic volatility remains an

important feature of the long-run risk model, even when the timing attitude is set

independently of the IES and RRA.

The fourth experiment explores whether the high subjective discount factor β

may help to explain the high timing premium in the long-run risk model. We address

this question in the sixth column of Table 1.3 by reducing β from its estimated value

of 0.9991 to 0.9980 as considered in Epstein et al. (2014). This small change in β

reduces the timing premium from 86% to 34%, which is in the neighborhood of the

31% reported in Epstein et al. (2014). However, a β of 0.9980 gives too high a mean for

the risk-free rate (3.21% vs. 0.83% in the data), and hence makes the model unable

to resolve the risk-free rate puzzle. This result explains why our estimation prefers a

high β, although it implies high timing premia.

Our final experiment studies the effect of the timing attitude by letting α = 0,

implying that the household is indifferent between early and late resolution of uncer-

tainty. The seventh column of Table 1.3 shows that this modification only lowers the

RRA from 5 to 0.67, but it nevertheless has a profound impact on the model despite

the presence of long-run risk. That is, the model is simply unable to match asset

prices without strong preferences for early resolution of uncertainty.

1.4 A New Keynesian Model

To provide further support for the considered Epstein-Zin-Weil preferences, we next

show that they also help explain asset prices in an otherwise standard New Keynesian

model. The processes for consumption and dividends are here determined within

the model, whereas they are assumed to be exogenously given in the long-run risk

model. We proceed by presenting our New Keynesian model in Section 1.4.1, the

adopted estimation routine in Section 1.4.2, and the estimation results in Section

1.4.3. We finally examine the key mechanisms in our extended New Keynesian model

in Section 1.4.4.

1.4.1 Model Description

1.4.1.1 Household

The household is similar to the one considered in Section 1.2 except for a variable

labor supply Lt . To match the persistence in consumption growth, we follow much

of the New Keynesian tradition and allow for exogenous consumption habits of the

form bCt−1. These modifications are included in the new utility kernel by letting

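$$
u(C_t, L_t) = u_0 Z_t^{1-\frac{1}{\psi}} + \frac{\left(C_t - bC_{t-1}\right)^{1-\frac{1}{\psi}}}{1-\frac{1}{\psi}} + \phi_0 Z_t^{1-\frac{1}{\psi}} \frac{\left(1 - L_t\right)^{1-\frac{1}{\phi}}}{1-\frac{1}{\phi}} \tag{1.9}
$$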

with $\phi_0 > 0$ and $\phi \in \mathbb{R} \setminus \{1\}$, which reduces to the specification in Rudebusch and Swanson (2012) when $u_0 = b = 0$. The constant $u_0$ does not affect the IES at the steady state, $\psi\left(1 - \frac{b}{\mu_{Z,ss}}\right)$, where consumption habits reduce the IES compared to the value implied by $\psi$. The expression for the RRA is slightly more involved than the one provided in (1.5) due to consumption habits and the variable labor supply, where the latter gives the household an additional margin to absorb shocks. For the Epstein-Zin-Weil preferences in (1.1), Swanson (2018) shows that RRA in the steady state is given by

$$
RRA = \frac{1}{IES}\left(1 + \frac{W_t}{Z_t}\Lambda_t\right)^{-1}\Bigg|_{ss} + \alpha \frac{u_C(C_t, L_t)\, C_t}{u(C_t, L_t)}\Bigg|_{ss},
$$

where $\Lambda_t \equiv -\frac{u_L(C_t, L_t)\, u_{CC}(C_t, L_t)}{u_C(C_t, L_t)\, u_{LL}(C_t, L_t)}$ accounts for the labor margin. When inserting for the utility kernel in (1.9) we get

$$
RRA = \frac{1}{IES + \phi \frac{W_{ss}(1 - L_{ss})}{C_{ss}}} + \alpha \frac{1 - \frac{1}{\psi}}{1 - \frac{b}{\mu_{Z,ss}} + \frac{1 - \frac{1}{\psi}}{C_{ss}}\left[u_0 C_{ss}^{\frac{1}{\psi}}\left(1 - \frac{b}{\mu_{Z,ss}}\right)^{\frac{1}{\psi}} + \frac{(1 - L_{ss}) W_{ss}}{1 - \frac{1}{\phi}}\right]}. \tag{1.10}
$$

Here, $C_{ss}$ and $W_{ss}$ refer to the steady state of consumption and the real wage in the normalized economy without trending variables, and $\mu_{Z,ss}$ denotes the deterministic trend in consumption and productivity, which we specify below in (1.11). Equation (1.10) shows that $u_0$, also with consumption habits and a variable labor supply, controls RRA through the ratio $u_C(C_t)\, C_t / u(C_t)$.

The real budget constraint for the household is given by $E_t\left[M_{t,t+1} \frac{X_{t+1}}{\pi_{t+1}}\right] + C_t = \frac{X_t}{\pi_t} + W_t L_t + D_t$, where $M_{t,t+1}$ is the nominal stochastic discount factor, $X_t$ is nominal state-contingent claims, $\pi_t$ denotes gross inflation, $W_t$ is the real wage, and $D_t$ is real dividend payments from firms.
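The steady-state expressions for the IES and RRA can be evaluated directly. The sketch below codes the IES formula ψ(1 − b/µZ,ss) and the structure of equation (1.10) in Python, and compares the implied IES with the memo row of Table 1.4. The function names are hypothetical, and the steady-state wage `W_SS_PLACEHOLDER` is a placeholder chosen for illustration because the text does not report it; only Css = 0.80 is stated for the New Keynesian model.

```python
# (psi, b, mu_Z_ss, IES_ss as reported) for columns (1) to (6) of Table 1.4.
TABLE_1_4 = [
    (0.1084, 0.7248, 1.0029, 0.030),
    (0.2088, 0.7588, 1.0038, 0.051),
    (0.4040, 0.7912, 1.0052, 0.086),
    (0.1835, 0.4867, 1.0049, 0.095),
    (0.3039, 0.5575, 1.0050, 0.135),
    (0.5169, 0.5785, 1.0051, 0.219),
]


def ies_ss(psi, b, mu_z):
    """Steady-state IES with habits: psi * (1 - b / mu_Z_ss)."""
    return psi * (1.0 - b / mu_z)


def rra_ss(alpha, psi, b, mu_z, phi, u0, c_ss, w_ss, l_ss):
    """Steady-state RRA following the structure of equation (1.10)."""
    habit = 1.0 - b / mu_z
    curvature = 1.0 - 1.0 / psi
    labor_term = 1.0 / (ies_ss(psi, b, mu_z) + phi * w_ss * (1.0 - l_ss) / c_ss)
    bracket = (u0 * c_ss ** (1.0 / psi) * habit ** (1.0 / psi)
               + (1.0 - l_ss) * w_ss / (1.0 - 1.0 / phi))
    ratio = curvature / (habit + curvature / c_ss * bracket)  # u_C * C / u at the steady state
    return labor_term + alpha * ratio


computed_ies = [ies_ss(psi, b, mu) for psi, b, mu, _ in TABLE_1_4]
for (psi, b, mu, reported), value in zip(TABLE_1_4, computed_ies):
    print(f"reported IES = {reported:.3f}, computed IES = {value:.4f}")

# Column (4) of Table 1.4 with C_ss = 0.80; the wage is a hypothetical placeholder.
W_SS_PLACEHOLDER = 1.78
rra_col4 = rra_ss(alpha=-28.43, psi=0.1835, b=0.4867, mu_z=1.0049, phi=0.25,
                  u0=-938.67, c_ss=0.80, w_ss=W_SS_PLACEHOLDER, l_ss=0.3381)
print(f"RRA with placeholder wage: {rra_col4:.2f}")
```

The computed IES values agree with the reported ones up to rounding. With the placeholder wage, the formula returns a value near the targeted RRA of 5, but that check depends on the assumed wage and should be read as an illustration only.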


1.4.1.2 Firms

Final output $Y_t$ is produced by a perfectly competitive representative firm, which combines differentiated intermediate goods $Y_t(i)$ using $Y_t = \left(\int_0^1 Y_t(i)^{\frac{\eta-1}{\eta}} di\right)^{\frac{\eta}{\eta-1}}$ with $\eta > 1$. This implies that the demand for the $i$th good is $Y_t(i) = \left(\frac{P_t(i)}{P_t}\right)^{-\eta} Y_t$, where $P_t \equiv \left(\int_0^1 P_t(i)^{1-\eta} di\right)^{\frac{1}{1-\eta}}$ denotes the aggregate price level and $P_t(i)$ is the price of the $i$th good.

Intermediate firms produce the differentiated goods using $Y_t(i) = Z_t A_t K_{ss}^{\theta} L_t(i)^{1-\theta}$, where $K_{ss}$ and $L_t(i)$ denote capital and labor services at the $i$th firm, respectively. Productivity shocks are allowed to have the traditional stationary component $A_t$, but also a non-stationary component $Z_t$ to generate long-run risk in the model. For the stationary shocks, we let $\log A_{t+1} = \rho_A \log A_t + \sigma_A \varepsilon_{A,t+1}$, where $|\rho_A| < 1$, $\sigma_A > 0$, and $\varepsilon_{A,t+1} \sim NID(0,1)$. Similarly for the non-stationary shocks, we introduce $\mu_{Z,t+1} = Z_{t+1}/Z_t$ and let

$$
\log\left(\frac{\mu_{Z,t+1}}{\mu_{Z,ss}}\right) = \rho_Z \log\left(\frac{\mu_{Z,t}}{\mu_{Z,ss}}\right) + \sigma_Z \varepsilon_{Z,t+1}, \tag{1.11}
$$

where $|\rho_Z| < 1$, $\sigma_Z > 0$, and $\varepsilon_{Z,t+1} \sim NID(0,1)$.11
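The two productivity processes are simple AR(1) recursions and are easy to simulate. The sketch below does so in Python using the point estimates from column (4) of Table 1.4; the function names, the seeds, the sample length, and the choice to start each process at its steady state are assumptions made for illustration.

```python
import math
import random

# Point estimates from column (4) of Table 1.4.
RHO_A, SIGMA_A = 0.9927, 0.0214
RHO_Z, SIGMA_Z = 0.2084, 0.0053
MU_Z_SS = 1.0049


def simulate_ar1(rho, sigma, n, seed=0):
    """Simulate x_{t+1} = rho * x_t + sigma * eps_{t+1}, eps ~ N(0, 1), starting at x = 0."""
    rng = random.Random(seed)
    x, path = 0.0, []
    for _ in range(n):
        x = rho * x + sigma * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def unconditional_std(rho, sigma):
    """Unconditional standard deviation of a stationary AR(1)."""
    return sigma / math.sqrt(1.0 - rho ** 2)


n_periods = 1000
log_a = simulate_ar1(RHO_A, SIGMA_A, n_periods, seed=1)         # log A_t
log_mu_gap = simulate_ar1(RHO_Z, SIGMA_Z, n_periods, seed=2)    # log(mu_Z,t / mu_Z,ss)
mu_z = [MU_Z_SS * math.exp(v) for v in log_mu_gap]              # gross growth rate of Z_t

print(f"unconditional std of log A_t:          {unconditional_std(RHO_A, SIGMA_A):.4f}")
print(f"unconditional std of log(mu_Z/mu_ss):  {unconditional_std(RHO_Z, SIGMA_Z):.5f}")
```

Because ρA is close to one, the stationary component has a far larger unconditional standard deviation than its innovation, whereas the growth-rate process in (1.11) stays close to its innovation scale.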

Intermediate firms can freely adjust their labor demand at the given market wage

Wt and are therefore able to meet demand in every period. Similar to Andreasen

(2012), price stickiness is introduced as in Rotemberg (1982), where $\xi \geq 0$ controls the size of firms’ real cost $\frac{\xi}{2}\left(\frac{P_t(i)}{P_{t-1}(i)\pi_{ss}} - 1\right)^2 Y_t$ when changing the optimal nominal price $P_t(i)$ of the good they produce.12

1.4.1.3 The Central Bank and Aggregation

The central bank sets the one-period nominal interest rate $i_t$ according to $i_t = i_{ss} + \beta_\pi \log\left(\frac{\pi_t}{\pi_{ss}}\right) + \beta_y \log\left(\frac{Y_t}{Z_t Y_{ss}}\right)$, based on a desire to close the inflation and output gap.

Note that the inflation gap accounts for steady-state inflation πss , and that the output

gap is expressed in deviation from the steady state level of output in the normalized

economy Yss without trending variables.
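The policy rule can be written as a one-line function. The sketch below does so in Python with the estimates of βπ and βy from column (1) of Table 1.4; the function name and the steady-state rate `I_SS_EXAMPLE` are hypothetical, since the text does not report the steady-state interest rate.

```python
import math

BETA_PI, BETA_Y = 1.4588, 0.0209  # column (1) of Table 1.4
PI_SS = 1.0635                    # column (1) of Table 1.4
I_SS_EXAMPLE = 0.0125             # hypothetical steady-state rate, for illustration only


def policy_rate(pi_t, normalized_output_ratio, i_ss=I_SS_EXAMPLE,
                pi_ss=PI_SS, beta_pi=BETA_PI, beta_y=BETA_Y):
    """i_t = i_ss + beta_pi * log(pi_t / pi_ss) + beta_y * log(Y_t / (Z_t * Y_ss)).

    `normalized_output_ratio` stands for Y_t / (Z_t * Y_ss).
    """
    return (i_ss
            + beta_pi * math.log(pi_t / pi_ss)
            + beta_y * math.log(normalized_output_ratio))


at_steady_state = policy_rate(PI_SS, 1.0)
after_inflation_rise = policy_rate(PI_SS * 1.01, 1.0)
print(f"response to a 1% inflation gap: {after_inflation_rise - at_steady_state:.5f}")
```

Since βπ exceeds one, the nominal rate moves by more than the inflation gap, while the small value of βy implies only a weak response to the output gap.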

11 The specification of long-run productivity risk adopted in the endowment model, i.e. (1.2), could also be used in the New Keynesian model, but we prefer the more parsimonious specification in (1.11) for comparability with the existing DSGE literature (see, for instance, Justiniano and Primiceri, 2008). This difference explains the slightly different notation used in (1.11) for $\mu_{Z,t}$, $\mu_{Z,ss}$, $\sigma_Z$, and $\varepsilon_{Z,t+1}$ compared to the corresponding parameters in (1.2).

12 Specifying nominal rigidities by Calvo pricing as in Rudebusch and Swanson (2012) gives largely similar results to those reported below. The considered specification is chosen because the solution to the New Keynesian model with Rotemberg pricing is approximated more accurately by the perturbation method than with Calvo pricing. The reason seems to be that Calvo (unlike Rotemberg) pricing induces a price dispersion index as an extra state variable that makes the New Keynesian model very nonlinear in certain areas of the state space, as shown in Andreasen and Kronborg (2018).


Summing across all firms and assuming that δKss Zt units of output are used

to maintain the constant capital stock as in Rudebusch and Swanson (2012), the

resource constraint becomes $C_t + Z_t \delta K_{ss} = \left(1 - \frac{\xi}{2}\left(\frac{\pi_t}{\pi_{ss}} - 1\right)^2\right) Y_t$.

1.4.1.4 Equity and Bond Prices

Equity is defined as a claim on aggregate dividends from firms, i.e. $D_t = Y_t - W_t L_t$, and its real price is therefore $1 = E_t\left[M_{t,t+1} R^m_{t+1}\right]$ where $R^m_{t+1} = \left(D_{t+1} + P^m_{t+1}\right)/P^m_t$.

The price in period $t$ of a default-free zero-coupon bond $B^{(n)}_t$ maturing in $n$ periods with a face value of one dollar is $B^{(n)}_t = E_t\left[\frac{M_{t,t+1}}{\pi_{t+1}} B^{(n-1)}_{t+1}\right]$ for $n = 1, ..., N$ with $B^{(0)}_t = 1$. Its yield to maturity is $i^{(n)}_t = -\frac{1}{n} \log B^{(n)}_t$. Following Rudebusch and Swanson (2012), we define term premia as $\Psi^{(n)}_t = i^{(n)}_t - \tilde{i}^{(n)}_t$, where $\tilde{i}^{(n)}_t$ is the yield to maturity on a zero-coupon bond $\tilde{B}^{(n)}_t$ under risk-neutral valuation, i.e. $\tilde{B}^{(n)}_t = e^{-i_t} E_t\left[\tilde{B}^{(n-1)}_{t+1}\right]$ with $\tilde{B}^{(0)}_t = 1$.
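The risk-neutral recursion and the yield formula can be illustrated outside the model. The sketch below applies them in Python to a hypothetical two-state Markov chain for the short rate; the chain, the quarterly rates, the observed yield, and the function names are all assumptions for illustration and are unrelated to the estimated model.

```python
import math


def risk_neutral_prices(rates, transition, n_max):
    """prices[n][s]: risk-neutral price of an n-period bond in state s.

    Implements B~(n) = exp(-i) * E[B~(n-1) next period] with B~(0) = 1.
    """
    k = len(rates)
    prices = [[1.0] * k]
    for _ in range(n_max):
        prev = prices[-1]
        prices.append([
            math.exp(-rates[s]) * sum(transition[s][q] * prev[q] for q in range(k))
            for s in range(k)
        ])
    return prices


def yield_to_maturity(price, n):
    """Yield per period: -(1/n) * log(price)."""
    return -math.log(price) / n


def term_premium(observed_yield, risk_neutral_yield):
    """Term premium: observed yield minus risk-neutral yield."""
    return observed_yield - risk_neutral_yield


# Hypothetical quarterly short rates in a low and a high state.
RATES = [0.010, 0.020]
TRANSITION = [[0.95, 0.05],
              [0.05, 0.95]]

prices = risk_neutral_prices(RATES, TRANSITION, 40)
rn_yield_low = yield_to_maturity(prices[40][0], 40)  # 40 quarters = 10 years
OBSERVED_YIELD_EXAMPLE = 0.0165                      # hypothetical observed 40-quarter yield
print(f"risk-neutral yield: {rn_yield_low:.5f}")
print(f"term premium:       {term_premium(OBSERVED_YIELD_EXAMPLE, rn_yield_low):.5f}")
```

With a constant short rate the risk-neutral yield equals that rate at every maturity, so any term premium must come from the difference between the pricing kernel and risk-neutral discounting.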

1.4.2 Model Solution and Estimation Methodology

We approximate the model solution by a third-order perturbation solution. The model

is estimated by GMM using unconditional first and second moments computed as

in Andreasen, Fernández-Villaverde, and Rubio-Ramírez (2018). The selected series

describing the macro economy and the bond market are given by $\Delta c_t$, $\pi_t$, $i_t$, $i^{(40)}_t$,

$\Psi^{(40)}_t$, and $\log L_t$, where one period in the model corresponds to one quarter. The 10-year

nominal interest rate and its term premium (obtained from Adrian, Crump, and

Moench, 2013) are available from 1961Q3, leaving us with quarterly data from 1961Q3

to 2014Q4. We include all means, variances, and first-order auto-covariances of these

six variables for the estimation, in addition to five contemporaneous covariances

related to the correlations reported at the end of Table 1.5. To examine whether

our New Keynesian model is able to match the equity premium, we also include

the mean of the net market return $r^m_t = \log R^m_t$ in the set of moments. Finally, the

GMM estimation is implemented using the conventional two-step procedure for

moment-based estimators as outlined in Section 1.3.1.
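The size of the moment set follows from the description above. The short Python sketch below enumerates it; the variable labels are hypothetical shorthand, the five covariance pairs are read off the correlations at the end of Table 1.5, and the parameter counts assume that the estimated parameters are exactly the rows of Table 1.4.

```python
VARIABLES = ["dc", "pi", "i", "i40", "tp40", "logL"]
CROSS_PAIRS = [("dc", "pi"), ("dc", "i"), ("pi", "i"), ("i", "i40"), ("i40", "tp40")]

moments = []
for kind in ("mean", "variance", "autocovariance(1)"):
    moments += [(kind, v) for v in VARIABLES]
moments += [("covariance", pair) for pair in CROSS_PAIRS]
moments.append(("mean", "rm"))  # mean of the net market return

# Assumed parameter counts: rows of Table 1.4 without u0 (benchmark) and with u0 (extended).
N_PARAMS = {"benchmark": 11, "extended": 12}
overidentifying = {name: len(moments) - k for name, k in N_PARAMS.items()}

print(f"number of moments: {len(moments)}")
print(f"over-identifying restrictions: {overidentifying}")
```

Under these assumptions the estimation uses 24 moments, which leaves the model over-identified in both the benchmark and the extended specification.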

We estimate all structural parameters in the model except for a few badly identi-

fied parameters. That is, we let δ= 0.025 and θ = 1/3 as typically considered for the

U.S. economy. We also let η= 6 to get an average markup of 20% and impose ϕ= 0.25

to match a Frisch labor supply elasticity in the neighborhood of 0.5. The ratio of

capital to output in the steady state is set to 2.5 as in Rudebusch and Swanson (2012).

We follow Andreasen (2012) and set ξ based on a linearized version of the model to

match a Calvo parameter of αp = 0.75, giving an average duration for prices of four


quarters.13 Finally, the estimates of the subjective discount factor for all considered

specifications of the New Keynesian model hit the upper bound for this parameter

and we therefore simply let β= 0.9995.
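The calibrated values can be checked with a few lines of arithmetic. The Python sketch below computes the markup implied by η, the price duration implied by αp, the Frisch elasticity ϕ(1/L − 1) at the estimated steady-state labor supply, and the Rotemberg parameter ξ. The mapping for ξ follows one reading of footnote 13 and should be treated as an assumption, as should the function names; the parameter values for ξ are taken from column (1) of Table 1.4.

```python
def markup(eta):
    """Net steady-state markup eta / (eta - 1) - 1."""
    return eta / (eta - 1.0) - 1.0


def calvo_duration(alpha_p):
    """Average duration of prices, in quarters, for Calvo parameter alpha_p."""
    return 1.0 / (1.0 - alpha_p)


def frisch_elasticity(phi, l_ss):
    """Frisch labor supply elasticity phi * (1 / L - 1)."""
    return phi * (1.0 / l_ss - 1.0)


def rotemberg_xi(theta, eta, alpha_p, beta, mu_z, psi):
    """Rotemberg cost parameter matching a Calvo parameter alpha_p.

    Assumed mapping (reading of footnote 13):
    xi = (1 - theta + eta*theta) * (eta - 1) * alpha_p
         / ((1 - alpha_p) * (1 - theta) * (1 - alpha_p * beta * mu_z**(1 - 1/psi))).
    """
    growth_adjusted_beta = beta * mu_z ** (1.0 - 1.0 / psi)
    numerator = (1.0 - theta + eta * theta) * (eta - 1.0) * alpha_p
    denominator = (1.0 - alpha_p) * (1.0 - theta) * (1.0 - alpha_p * growth_adjusted_beta)
    return numerator / denominator


xi_col1 = rotemberg_xi(theta=1.0 / 3.0, eta=6.0, alpha_p=0.75,
                       beta=0.9995, mu_z=1.0029, psi=0.1084)
print(f"markup:            {markup(6.0):.2%}")
print(f"price duration:    {calvo_duration(0.75):.1f} quarters")
print(f"Frisch elasticity: {frisch_elasticity(0.25, 0.3375):.3f}")
print(f"xi (column 1):     {xi_col1:.1f}")
```

The first three numbers reproduce the 20% markup, the four-quarter price duration, and a Frisch elasticity in the neighborhood of 0.5 mentioned in the text.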

1.4.3 Estimation Results

1.4.3.1 A Standard Power Utility Kernel

We first consider the standard implementation of Epstein-Zin-Weil preferences with

u0 = 0 and condition the estimation of the New Keynesian model on different values

of RRA. Table 1.4 shows that we get fairly standard estimates when RRA = 5. That

is, we find strong habits (b = 0.72), very persistent technology shocks (ρA = 0.99

and ρZ = 0.33), and a central bank that assigns more weight to stabilizing inflation

than output (βπ = 1.46 and βy = 0.02). Table 1.5 shows that the model does well in

matching the mean and variability of inflation, the short rate, the 10-year interest rate,

and the 10-year term premium. The model-implied level of the market return is 3.61%

and reasonably close to the empirical value of 5.53%, when accounting for its large

standard error of 2.01% computed by a block bootstrap. However, the model also

generates too much variability in consumption growth (2.35% vs. 1.80%) and labor

supply (2.85% vs. 1.62%), predicts too strong autocorrelation in consumption growth

(0.73 vs. 0.53), and is unable to match the negative correlation between consumption

growth and inflation (0.19 vs. −0.18). Tables 1.4 and 1.5 also show that increasing RRA

to 10 does not materially affect the estimates and performance of the New Keynesian

model. Thus, these results simply reiterate the finding in Rudebusch and Swanson (2008)

that the standard New Keynesian model with low RRA struggles to match key asset

pricing moments without distorting the fit to the macro economy.

We next increase RRA to 60, although such an extreme level of risk aversion is

hard to justify based on micro-evidence. Table 1.5 shows that the New Keynesian

model now reproduces all means without generating too much variability in the

macro economy, except for a slightly elevated standard deviation in labor supply

(2.45% vs. 1.62%). Thus, a high RRA of 60 implies that the model delivers a better

overall fit to the data with Q scaled = 0.34 compared to Q scaled = 0.76 when RRA =

5. To compute the timing premium in our New Keynesian model we must extend

the definition in Epstein et al. (2014) to account for an endogenous labor supply.

The labor margin gives the household an extra dimension to absorb shocks and

this affects its willingness to pay for getting uncertainty resolved in the following

period. The problem is thus very similar to the one considered in Swanson (2018) for

extending expressions of RRA to account for a variable labor supply, and we therefore

follow his approach and use the equilibrium condition for the consumption-leisure

13 The mapping is $\xi = \frac{(1-\theta+\eta\theta)(\eta-1)\alpha_p}{(1-\alpha_p)(1-\theta)\left(1-\alpha_p \beta \mu_{Z,ss}^{1-\frac{1}{\psi}}\right)}$ as derived in the online appendix.


Table 1.4: The New Keynesian Model: The Structural Parameters

Estimation results using data from 1961Q3 to 2014Q4 using a third-order perturbation approximation with model-implied moments computed as in Andreasen et al. (2018). The reported estimates are from the second step of GMM with the optimal weighting matrix estimated by the Newey-West estimator with 15 lags. Standard errors are in parentheses. The estimates of β are for all specifications on the boundary 0.9995 and therefore not reported below. The timing premium at the steady state (Πss) is computed based on (1.12) and a third-order perturbation approximation, where the utility level when uncertainty is resolved in the following period is computed by simulation using antithetic sampling with 5,000 draws and 10,000 terms to approximate the lifetime utility stream.

                  Benchmark Model                     Extended Model
              (1)        (2)        (3)        (4)        (5)        (6)
              RRA = 5    RRA = 10   RRA = 60   RRA = 5    RRA = 10   RRA = 60

u0            -          -          -          −938.67    −294.23    −24.503
                                               (215.71)   (36.214)   (5.4683)
ψ             0.1084     0.2088     0.4040     0.1835     0.3039     0.5169
              (0.0109)   (0.0216)   (0.1440)   (0.0375)   (0.0470)   (0.0527)
b             0.7248     0.7588     0.7912     0.4867     0.5575     0.5785
              (0.0165)   (0.0186)   (0.0302)   (0.0255)   (0.0308)   (0.0485)
βπ            1.4588     1.4326     1.4597     1.4229     1.3814     1.3263
              (0.0568)   (0.0734)   (0.2267)   (0.0381)   (0.0576)   (0.0563)
βy            0.0209     0.0294     0.0565     0.0192     0.0228     0.0563
              (0.0036)   (0.0053)   (0.0242)   (0.0042)   (0.0106)   (0.0184)
µZ,ss         1.0029     1.0038     1.0052     1.0049     1.0050     1.0051
              (0.0002)   (0.0003)   (0.0003)   (0.0004)   (0.0004)   (0.0004)
πss           1.0635     1.0458     1.0311     1.0683     1.0458     1.0290
              (0.0053)   (0.0033)   (0.0026)   (0.0119)   (0.0047)   (0.0023)
Lss           0.3375     0.3371     0.3368     0.3381     0.3369     0.3378
              (0.0009)   (0.0007)   (0.0007)   (0.0014)   (0.0009)   (0.0015)
ρA            0.9910     0.9885     0.9867     0.9927     0.9896     0.9818
              (0.0009)   (0.0011)   (0.0012)   (0.0011)   (0.0013)   (0.0011)
ρZ            0.3254     0.5185     0.7883     0.2084     0.3653     0.4434
              (0.0817)   (0.0718)   (0.0783)   (0.1196)   (0.1780)   (0.5371)
σA            0.0168     0.0143     0.0116     0.0214     0.0177     0.0098
              (0.0008)   (0.0010)   (0.0023)   (0.0017)   (0.0013)   (0.0012)
σZ            0.0098     0.0070     0.0017     0.0053     0.0028     0.0019
              (0.0010)   (0.0010)   (0.0010)   (0.0024)   (0.0008)   (0.0019)

Memo
IESss         0.030      0.051      0.086      0.095      0.135      0.219
u′C/u |ss     −1.93      −1.82      −1.59      −0.10      −0.08      −0.21
Πss           0.1%       4%         10%        6%         11%        17%
α             −1.27      −4.17      −36.43     −28.43     −103.6     −275.9


Table 1.5: The New Keynesian Model: Fit of Moments

All variables are expressed in annualized terms in percent, except for the mean of log(Lt). All model-implied moments in columns (2) to (7) are from the unconditional distribution, whereas the empirical data moments in column (1) are given by the sample means. In column (1), figures in parentheses refer to the standard error of the empirical moment, computed based on a block bootstrap using 5,000 draws and a block length of 32 quarters.

                                                Benchmark Model                Extended Model
                              (1)               (2)      (3)      (4)      (5)      (6)      (7)
                              Data              RRA=5    RRA=10   RRA=60   RRA=5    RRA=10   RRA=60

Means (in pct)
  ∆c_t                        1.975  (0.276)    1.142    1.497    2.069    1.970    1.984    2.048
  π_t                         3.890  (0.793)    3.856    3.789    3.672    3.792    3.781    3.474
  i_t                         4.999  (0.994)    5.090    5.104    5.161    5.178    5.164    5.115
  i^(40)_t                    6.497  (0.904)    6.509    6.510    6.551    6.512    6.516    6.513
  Ψ^(40)_t                    1.663  (0.355)    1.745    1.775    1.768    1.672    1.755    1.777
  log L_t                     −1.081 (0.004)    −1.080   −1.080   −1.081   −1.080   −1.080   −1.080
  r^m_t                       5.527  (2.012)    3.607    3.829    3.907    4.669    4.166    3.515

Stds (in pct)
  ∆c_t                        1.802  (0.122)    2.352    2.259    1.444    2.146    1.877    1.400
  π_t                         2.716  (0.612)    2.493    2.601    2.899    2.273    2.536    2.997
  i_t                         3.173  (0.579)    3.045    2.944    2.935    2.374    2.651    2.848
  i^(40)_t                    2.621  (0.532)    2.635    2.618    2.592    2.360    2.542    2.573
  Ψ^(40)_t                    1.165  (0.170)    0.967    0.864    0.874    1.000    0.894    0.870
  log L_t                     1.619  (0.163)    2.853    2.697    2.450    2.506    2.509    2.082

Persistence
  corr(∆c_t, ∆c_t−1)          0.529  (0.083)    0.727    0.757    0.764    0.479    0.538    0.527
  corr(π_t, π_t−1)            0.953  (0.056)    0.943    0.958    0.960    0.977    0.972    0.972
  corr(i_t, i_t−1)            0.949  (0.031)    0.913    0.926    0.911    0.954    0.952    0.955
  corr(i^(40)_t, i^(40)_t−1)  0.976  (0.031)    0.989    0.987    0.985    0.989    0.987    0.980
  corr(Ψ^(40)_t, Ψ^(40)_t−1)  0.937  (0.032)    0.991    0.988    0.986    0.993    0.989    0.982
  corr(log L_t, log L_t−1)    0.932  (0.476)    0.751    0.767    0.800    0.875    0.868    0.871


Table 1.5: The New Keynesian Model: Fit of Moments (continued)

                                                Benchmark Model                Extended Model
                              (1)               (2)      (3)      (4)      (5)      (6)      (7)
                              Data              RRA=5    RRA=10   RRA=60   RRA=5    RRA=10   RRA=60

Correlations
  corr(∆c_t, π_t)             −0.184 (0.150)    0.193    0.017    −0.167   −0.104   −0.180   −0.185
  corr(∆c_t, i_t)             0.021  (0.199)    0.239    0.020    −0.241   −0.110   −0.203   −0.255
  corr(π_t, i_t)              0.703  (0.074)    0.966    0.969    0.959    0.925    0.970    0.977
  corr(i_t, i^(40)_t)         0.900  (0.048)    0.809    0.854    0.878    0.912    0.939    0.961
  corr(i^(40)_t, Ψ^(40)_t)    0.757  (0.148)    0.900    0.958    0.988    0.815    0.921    0.976

Goodness of fit
  Q^Step2                     -                 0.061    0.062    0.060    0.050    0.059    0.061
  J-test: P-value             -                 0.453    0.437    0.467    0.552    0.399    0.373
  Q^scaled                    -                 0.758    0.445    0.344    0.258    0.280    0.305

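trade-off. This implies that the value function can be expressed in consumption units as

$$
V_t = Z_t^{1-\frac{1}{\psi}} u_0 + \frac{1}{1-\frac{1}{\psi}}\left(C_t - bC_{t-1}\right)^{1-\frac{1}{\psi}} + Z_t^{1-\frac{1}{\psi}} \frac{\phi_0^{\phi}}{1-\frac{1}{\phi}} \frac{Z_t^{\left(1-\frac{1}{\psi}\right)(\phi-1)}}{W_t^{\phi-1}} \left(C_t - bC_{t-1}\right)^{\frac{1}{\psi}(\phi-1)} + \beta\left(E_t\left[V_{t+1}^{1-\alpha}\right]\right)^{\frac{1}{1-\alpha}}, \tag{1.12}
$$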

and it is then straightforward to compute the timing premium. Table 1.4 shows that

the timing premium at the steady state Πss is 0.1% with RRA = 5, 4% with RRA = 10,

and only 10% with RRA = 60. Note also that this increase in Πss coincides with higher

levels of the timing attitude, as the absolute value of α increases gradually for higher

RRA. Importantly, the timing premium in the New Keynesian model is substantially

lower than in the long-run risk model, even when considering an extreme RRA of 60.

To explore whether the labor margin helps to account for the low timing premium

in the New Keynesian model, we next condition on the reported estimates in Table 1.4

with RRA = 60 and change the Frisch labor supply elasticity ϕ(1/Lt − 1) by considering

different values of ϕ. It is a priori not obvious how the timing premium should be

affected by changing the variability of the labor supply. As argued by Swanson (2018)

in the context of RRA, a higher labor supply elasticity allows the household to better

self-insure against bad productivity shocks to reduce the variability in consumption.

This effect should therefore reduce the timing premium for higher values of ϕ. But, a

more volatile labor supply also makes the household’s value function more uncertain

through the direct effect of leisure in the utility kernel, and this effect should therefore


increase the timing premium for higher values of ϕ. Panel A in Table 1.6 shows that

the second effect dominates, as the timing premium is 10% for ϕ = 0.25, 35% for

ϕ= 0.50, and 94% for ϕ= 0.75. These computations are conditioned on a RRA of 60

by appropriately changing the timing attitude α, which increases substantially in

absolute terms for higher values of ϕ. Panel B of Table 1.6 adopts another approach

by conditioning on α = −36 and instead letting RRA vary as we change the value of ϕ.

When using this alternative benchmark, we find a much more gradual increase in the

timing premium when increasing ϕ, showing that the main effect of the labor margin

operates through the timing attitude α. Two other features of the New Keynesian

model that also may have a sizable impact on the timing premium are consumption

habits and the low estimate of ψ. Both features help to generate a low IES, which

reduces the timing premium as shown in Epstein et al. (2014). Panel C in Table

1.6 shows that low consumption habits and higher values of ψ increase the timing

premium. For instance, we find that the timing premium is 45% with b = 0 and

ψ= 0.75.

Thus, the labor margin, consumption habits, and a low estimate of ψ help to

generate a low timing premium in the New Keynesian model.

1.4.3.2 The Extended Utility Kernel

We next let u0 be a free parameter and estimate the New Keynesian model when

conditioning on a RRA of 5. The fourth column in Table 1.4 shows that u0 = −939

with a standard error of 216. Hence, we clearly reject the null hypothesis of

u0 = 0 (t-statistic = −4.35) and therefore the standard utility kernel. This means

that accounting for other aspects than consumption and leisure when modeling

household utility also helps the New Keynesian model to explain postwar U.S. data.

The estimate of u0 is clearly larger (in absolute terms) than any of the estimates of u0

in the long-run risk model, but such a direct comparison is not particularly useful

because of the structural differences between the two models. For instance, the New

Keynesian model implies Css = 0.80, includes habits, and gives a substantial utility

contribution from leisure (as ϕ0 = 41.49 to match Lss ), whereas the long-run risk

model has Css = 1 and abstracts from both habits and leisure. Instead, it is much more

informative to study the value of $u_C(C_t, L_t)\, C_t / u(C_t, L_t)\big|_{ss}$, because both models determine u0 from this ratio to attain a given level of RRA. Table 1.4 shows that our large estimate of u0 gives a fairly low value of $u_C(C_t, L_t)\, C_t / u(C_t, L_t)\big|_{ss} = -0.10$, which is remarkably close to the corresponding ratio in the long-run risk model, which is 0.08 with IES = 1.5 and RRA = 10. Thus, the large estimate of u0 in the New

Keynesian model is in this sense in line with our results for the long-run risk model.
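The test of u0 = 0 quoted above is a simple ratio of the estimate to its standard error. The Python sketch below reproduces it for the three extended-model columns of Table 1.4; the dictionary layout and the two-sided 5 percent critical value of 1.96 from the normal approximation are choices made here for illustration.

```python
# (estimate, standard error) of u0 from columns (4) to (6) of Table 1.4.
U0_ESTIMATES = {
    "RRA = 5": (-938.67, 215.71),
    "RRA = 10": (-294.23, 36.214),
    "RRA = 60": (-24.503, 5.4683),
}
CRITICAL_VALUE = 1.96  # two-sided 5% level, normal approximation

t_stats = {label: est / se for label, (est, se) in U0_ESTIMATES.items()}
for label, t in t_stats.items():
    verdict = "reject" if abs(t) > CRITICAL_VALUE else "do not reject"
    print(f"{label}: t = {t:.2f} -> {verdict} u0 = 0")
```

The first ratio matches the t-statistic of −4.35 reported in the text, and the remaining two columns also lie well beyond the critical value.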

We generally find small effects on most of the structural parameters from includ-

ing u0. The main exceptions are smaller consumption habits (b = 0.49), a reduction

in the amount of long-run productivity risk (ρZ and σZ fall), and more risk related to

stationary productivity shocks (ρA and σA increase). We also find a large increase


Table 1.6: The New Keynesian Model: Analysis of Timing Premium

In Panel A, the timing premium is computed for different values of ϕ and a RRA = 60, while the remaining parameters are as reported in column (3) of Table 1.4. In Panel B, the timing premium is computed for different values of ϕ and with a constant timing attitude of α = −36.42, while the remaining parameters are as reported in column (3) of Table 1.4. In Panel C, the timing premium is computed for different values of ψ and b, while all the remaining parameters are as reported in column (3) of Table 1.4. The timing premium is computed based on (1.12) and a third-order perturbation approximation, while the utility level when uncertainty is resolved in the following period is computed by simulation using antithetic sampling with 5,000 draws and 10,000 terms to approximate the lifetime utility stream.

Panel A: RRA = 60
                 ϕ = 0.75   ϕ = 0.50   ϕ = 0.25   ϕ = 0.10
  Πss            94%        35%        10%        0.0%
  α              −267.34    −94.20     −36.42     −17.06
  std(∆c_t)      1.615      1.25       1.44       2.80
  std(log L_t)   14.52      3.95       2.45       7.84
  ϕ0             81.475     61.96      27.25      2.32

Panel B: α = −36.42
                 ϕ = 0.75   ϕ = 0.50   ϕ = 0.25   ϕ = 0.10
  Πss            20%        11%        10%        0.0%
  RRA            8.91       23.96      60         123.21
  std(∆c_t)      1.386      1.41       1.44       2.31
  std(log L_t)   3.247      2.62       2.45       7.98
  ϕ0             81.475     61.96      27.25      2.32

Panel C: Timing premium Πss
                 ψ = 0.25   ψ = ψ̂      ψ = 0.5    ψ = 0.75
  b = 0          15%        23%        27%        45%
  b = 0.25       13%        18%        21%        34%
  b = 0.5        11%        14%        15%        22%
  b = b̂          9%         10%        10%        9%

in the timing attitude, as α increases from −1.3 to −28.4 when RRA = 5. However, this

increase does not generate a substantially higher timing premium, which remains

low at 6% with RRA = 5.

Table 1.5 shows that including u0 in the New Keynesian model enables the model

to match all means and standard deviations, except for the labor supply that displays

the same degree of variability as in the standard New Keynesian model with RRA =

60. Subject to this qualification, the New Keynesian model now explains the equity

premium with a low RRA = 5 and a low timing premium of 6%. The model also

matches the mean and the standard deviation of the 10-year nominal term premium,

implying that we also explain the bond premium puzzle with low RRA and low timing

premium. The auto- and contemporaneous correlations are also well matched, and

the proposed extension of the New Keynesian model therefore has better overall


fit with Q scaled = 0.26 compared to Q scaled = 0.34 for the standard New Keynesian

model with RRA = 60.

The final two columns of Tables 1.4 and 1.5 study the effects of higher RRA when

allowing for an unrestricted timing attitude α through u0. We find that higher RRA

does not improve the performance of the New Keynesian model. Actually, its perfor-

mance worsens slightly with Q scaled increasing from 0.26 to 0.31 when changing RRA

from 5 to 60. This suggests that it is not the high RRA in the traditional formulation

of Epstein-Zin-Weil preferences that helps the New Keynesian model match asset

prices, but instead the high timing attitude α that is induced by the high RRA.14

1.4.4 The Key Mechanisms

We next run three experiments to explore some of the key mechanisms in the New

Keynesian model with the extended utility kernel in (1.9). The first experiment con-

sidered in Table 1.7 illustrates the implications of gradually increasing u0. As for the

long-run risk model, a numerically larger value of u0 lowers u′(Ct)Ct/u(Ct) and

allows for strong preferences for early resolution of uncertainty through a high α

without affecting RRA. The large value ofα then amplifies the existing risk corrections

and enables the model to explain asset prices with low RRA.

Our second experiment abstracts from long-run productivity risk by letting σZ = 0.

The fourth column in Table 1.7 shows that this modification has very large effects as

the model now is unable to explain both the level and variability of $\pi_t$, $i_t$, $i^{(40)}_t$, and

$\Psi^{(40)}_t$. Thus, long-run risk is also an essential feature of the New Keynesian model.

Our final experiment omits Epstein-Zin-Weil preferences by letting α = 0 to make the household indifferent between early and late resolution of uncertainty. Although this modification has only a small effect on RRA (reducing it from 5 to 2.2), it nevertheless has a profound impact on the model, which largely displays the same properties as when omitting long-run productivity risk. In other words, the New Keynesian model is unable to explain asset prices without Epstein-Zin-Weil preferences, and hence without strong preferences for early resolution of uncertainty.

Thus, we confirm the result from the long-run risk model, namely that the main

effect of Epstein-Zin-Weil preferences with our extended utility kernel is not to sepa-

rate the IES from RRA but instead to introduce strong preferences for early resolution

of uncertainty. This finding also helps to clarify why consumption habits may struggle

to match asset prices in DSGE models, although they allow for additional flexibility

in setting the IES and RRA (see Rudebusch and Swanson, 2008). The reason is

that consumption habits do not introduce preferences for early resolution of uncer-

tainty, which we find are essential to explain asset prices in a standard New Keynesian

model.

14 The accuracy of the third-order perturbation solution used to estimate the New Keynesian model is discussed in Appendix A.4.


Table 1.7: The New Keynesian Model: Analyzing the Key Mechanisms
All moments are computed using a third-order perturbation and represented as in Table 1.5. Unless stated otherwise, all parameters attain the estimated values from column (4) in Table 1.4.

            (1)        (2)          (3)        (4)        (5)
            u0 = 0     u0 = −450    u0 = u0    σZ = 0     α = 0

Means
∆ct         1.970      1.970        1.970      1.970      1.970
πt          20.674     12.355       3.792      21.856     23.751
it          29.230     17.378       5.178      30.913     33.617
i(40)t      30.441     18.649       6.512      32.111     34.198
Ψ(40)t      1.549      1.608        1.672      1.535      0.918
log Lt      −1.074     −1.077       −1.080     −1.074     −1.073
rmt         9.681      7.183        4.669      9.956      10.444

Stds
∆ct         3.170      2.622        2.146      2.849      3.372
πt          3.407      2.535        2.273      2.746      4.494
it          4.236      2.852        2.374      3.390      5.847
i(40)t      4.040      2.856        2.360      3.315      5.138
Ψ(40)t      0.831      0.963        1.000      0.936      0.466
log Lt      12.475     6.632        2.506      13.269     14.751

Memo
RRA         5          5            5          5          2.2
Πss         1%         3%           6%         0.0%       0.0%
α           −1.685     −14.51       −28.43     −28.43     0

1.5 Conclusion

The present paper highlights the importance of the timing attitude for consumption-

based asset pricing. To isolate the effects of the timing attitude, we propose a slightly

more general formulation of Epstein-Zin-Weil preferences than considered previously

to disentangle the timing attitude from the IES and RRA. We then show that this

extension enables us to explain several asset pricing puzzles in both endowment and

production economies. In particular, we resolve a puzzle in the long-run risk model

where consumption growth is too highly correlated with the price-dividend ratio and

the risk-free rate. We also resolve the need for high RRA in DSGE models by enabling

an otherwise standard New Keynesian model to match the equity premium and the

bond premium with a low RRA of 5. Our analysis also reveals that the reason Epstein-Zin-Weil preferences help to explain asset prices is not because they separate the

IES from RRA, but because they introduce strong preferences for early resolution of

uncertainty in the presence of long-run risk.


Acknowledgements

We thank Ravi Bansal, John Cochrane, Mette Trier Damgaard, Wouter den Haan,

James D. Hamilton, Alexander Meyer-Gohde, Claus Munk, Olaf Posch, Morten Ravn,

and Eric Swanson for useful comments and discussions. We acknowledge access to

computer facilities provided by the Danish Center for Scientific Computing (DCSC).

We acknowledge support from CREATES - Center for Research in Econometric Analy-

sis of Time Series (DNRF78), funded by the Danish National Research Foundation.


1.6 References

Adrian, T., Crump, R. K., Moench, E., 2013. Pricing the term structure with linear

regressions. Journal of Financial Economics Vol. 110, 110–138.

Andreasen, M. M., 2012. On the effects of rare disasters and uncertainty shocks for risk premia in non-linear DSGE models. Review of Economic Dynamics Vol. 15, 295–316.

Andreasen, M. M., Fernández-Villaverde, J., Rubio-Ramírez, J. F., 2018. The pruned state space system for non-linear DSGE models: Theory and empirical applications to estimation. Review of Economic Studies Vol. 85, 1–49.

Andreasen, M. M., Kronborg, A., 2018. The extended perturbation method. Working

Paper.

Bansal, R., Kiku, D., Yaron, A., 2010. Long-run risks, the macroeconomy, and asset

prices. American Economic Review, Papers and Proceedings Vol. 100, 1–5.

Bansal, R., Kiku, D., Yaron, A., 2012. An empirical evaluation of the long-run risks

model for asset prices. Critical Finance Review Vol. 1, 183–221.

Bansal, R., Yaron, A., 2004. Risks for the long run: A potential resolution of asset pricing puzzles. Journal of Finance Vol. 59, 1481–1509.

Beeler, J., Campbell, J., 2012. The long-run risks model and aggregate asset prices: An

empirical assessment. Critical Finance Review Vol. 1, 141–182.

Chew, S. H., Ho, J. L., 1994. Hope: An empirical study of attitude toward the timing of

uncertainty resolution. Journal of Risk and Uncertainty Vol. 8, 267–288.



937–969.

Epstein, L. G., Farhi, E., Strzalecki, T., 2014. How much would you pay to resolve

long-run risk? American Economic Review Vol. 104, 2680–2697.

Gourio, F., 2012. Disaster risk and business cycles. American Economic Review Vol.

102, 2734–2766.

Justiniano, A., Primiceri, G. E., 2008. The time-varying volatility of macroeconomic

fluctuations. American Economic Review Vol. 98, 604–641.

Levintal, O., 2017. Fifth-order perturbation solution to DSGE models. Journal of Economic Dynamics and Control Vol. 80, 1–16.


Pohl, W., Schmedders, K., Wilms, O., 2018. Higher-order effects in asset pricing models

with long-run risks. Journal of Finance Forthcoming.

Rotemberg, J. J., 1982. Monopolistic price adjustment and aggregate output. Review

of Economic Studies Vol. 49, 517–531.

Rudebusch, G., Swanson, E., 2012. The bond premium in a DSGE model with long-run real and nominal risks. American Economic Journal: Macroeconomics Vol. 4, No. 1, 105–143.

Rudebusch, G. D., Swanson, E. T., 2008. Examining the bond premium puzzle with a DSGE model. Journal of Monetary Economics Vol. 55, 111–126.

Swanson, E. T., 2018. Risk aversion, risk premia, and the labor margin with generalized

recursive preferences. Review of Economic Dynamics Vol. 28, 290–321.

van Winden, F., Krawczyk, M., Hopfensitz, A., 2011. Investment, resolution of risk,

and the role of affect. Journal of Economic Psychology Vol. 32, 918–939.

Weil, P., 1990. Non-expected utility in macroeconomics. Quarterly Journal of Eco-

nomics Vol. 1, 29–42.


Appendix

A.1 The Long-Run Risk Model: A Perturbation Approximation

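Proposition A.1.1. The second-order approximation to $\mathit{ev}_t \equiv \log E_t\left[e^{(1-\alpha)\left(v_{t+1}+\left(1-\frac{1}{\psi}\right)\log\mu_{z,t+1}\right)}\right]$ with $\mu_{z,t} \equiv Z_t/Z_{t-1}$ around the steady state is given by

\[
\mathit{ev}_t = \mathit{ev}_{ss} + \mathit{ev}_c c_t + \mathit{ev}_x x_t + \tfrac{1}{2}\mathit{ev}_{cc} c_t^2 + \tfrac{1}{2}\mathit{ev}_{xx} x_t^2 + \mathit{ev}_{cx} c_t x_t + \tfrac{1}{2}\mathit{ev}_{\sigma\sigma},
\]

where

\begin{align*}
\mathit{ev}_{ss} &= (1-\alpha)\left[\log\left(\left|u_0 + \frac{1}{1-\frac{1}{\psi}}\right|\right) - \log\left(1-\kappa_0\right) + \left(1-\frac{1}{\psi}\right)\log\mu_z\right] \\
\mathit{ev}_c &= (1-\alpha)\rho_c \frac{1-\kappa_0}{1-\kappa_0\rho_c}\, \frac{1-\frac{1}{\psi}}{1+u_0\left(1-\frac{1}{\psi}\right)} \\
\mathit{ev}_x &= \frac{1-\alpha}{1-\kappa_0\rho_x}\left(1-\frac{1}{\psi}\right) \\
\mathit{ev}_{cc} &= (1-\alpha)\rho_c^2 \frac{1-\kappa_0}{1-\kappa_0\rho_c^2}\, \frac{\left(1-\frac{1}{\psi}\right)^2}{1+u_0\left(1-\frac{1}{\psi}\right)} - (1-\alpha)\rho_c^2\left[\frac{1-\frac{1}{\psi}}{1+u_0\left(1-\frac{1}{\psi}\right)}\, \frac{1-\kappa_0}{1-\kappa_0\rho_c}\right]^2 \\
\mathit{ev}_{xx} &= (1-\alpha)\rho_x^2 \frac{\kappa_0}{1-\kappa_0\rho_x^2}\, \frac{1-\kappa_0}{\left(1-\kappa_0\rho_x\right)^2}\left(1-\frac{1}{\psi}\right)^2 \\
\mathit{ev}_{cx} &= (1-\alpha)\rho_x\rho_c v_{xc} \\
\mathit{ev}_{\sigma\sigma} &= \frac{1-\alpha}{1-\kappa_0}\left[v_{cc}\sigma_c^2 + (1-\alpha)v_c^2\sigma_c^2 + v_{xx}\sigma_x^2 + (1-\alpha)v_x^2\sigma_x^2 + (1-\alpha)\left(1-\frac{1}{\psi}\right)^2\sigma_z^2\right]
\end{align*}

Proposition A.1.2. The second-order approximations to the risk-free rate $r^f_t$ and the expected equity return $r^{m,e}_t$ around the steady state are given by

\begin{align*}
r^f_t &= r_{ss} + r_c c_t + r_x x_t + \tfrac{1}{2} r^f_{\sigma\sigma} \\
r^{m,e}_t &= r_{ss} + r_c c_t + r_x x_t + \tfrac{1}{2} r^{m,e}_{\sigma\sigma}
\end{align*}

where

\begin{align*}
r_{ss} &= -\log\beta + \frac{1}{\psi}\log\mu_z \\
r_c &= -\left(1-\rho_c\right)\frac{1}{\psi} \\
r_x &= \frac{1}{\psi} \\
r^f_{\sigma\sigma} &= -\alpha v_x^2\sigma_x^2 - \left[1-(1-\alpha)\left(1-\frac{1}{\psi}\right)\left(1+\frac{1}{\psi}\right)\right]\sigma_z^2 - \left(\frac{1}{\psi^2} + \frac{2}{\psi}\alpha v_c + \alpha v_c^2\right)\sigma_c^2 \\
r^{m,e}_{\sigma\sigma} &= -\left(1-\kappa_1\right)\mathit{pd}_{\sigma\sigma} + \kappa_1\left(\mathit{pd}_{cc} + \mathit{pd}_c^2\right)\sigma_c^2 + \kappa_1\left(\mathit{pd}_{xx} + \mathit{pd}_x^2\right)\sigma_x^2 + \sigma_d^2
\end{align*}

with $\kappa_1 \equiv \frac{e^{\mathit{pd}_{ss}}}{1+e^{\mathit{pd}_{ss}}}$.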

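Proposition A.1.3. The second-order approximation to the log-transformed price-dividend ratio $\mathit{pd}_t$ around the steady state is given by

\[
\mathit{pd}_t = \mathit{pd}_{ss} + \mathit{pd}_c c_t + \mathit{pd}_x x_t + \tfrac{1}{2}\mathit{pd}_{cc} c_t^2 + \tfrac{1}{2}\mathit{pd}_{xx} x_t^2 + \mathit{pd}_{cx} c_t x_t + \tfrac{1}{2}\mathit{pd}_{\sigma\sigma},
\]

where

\begin{align*}
\mathit{pd}_{ss} &= \log\frac{\kappa_1}{1-\kappa_1} \\
\mathit{pd}_c &= \frac{\phi_c + \left(1-\rho_c\right)\frac{1}{\psi}}{1-\kappa_1\rho_c} \\
\mathit{pd}_x &= \frac{\phi_x - \frac{1}{\psi}}{1-\kappa_1\rho_x}
\end{align*}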

pdc c = −pd 2c +

2κ1ρc (1−ρc ) 1ψ+φc+φcκ1ρc

1−κ1ρcpdc −

(1−ρ2

c

)1ψ2 −φc (1−ρc ) 1

ψ−2(1−ρc ) 1ψ2

1−κ1ρ2c

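\begin{align*}
\mathit{pd}_{xx} &= -\mathit{pd}_x^2 + \frac{\left(\phi_x - \frac{1}{\psi}\right)^2}{1-\kappa_1\rho_x^2} + 2\kappa_1\rho_x \frac{\phi_x - \frac{1}{\psi}}{1-\kappa_1\rho_x^2}\, \mathit{pd}_x \\
\mathit{pd}_{cx} &= -\mathit{pd}_c\,\mathit{pd}_x + \kappa_1\rho_c\,\mathit{pd}_c \frac{\phi_x - \frac{1}{\psi}}{1-\kappa_1\rho_c\rho_x} + \frac{\left(\frac{1}{\psi}\left(1-\rho_c\right) + \phi_c\right)\left(\phi_x - \frac{1}{\psi} + \rho_x\kappa_1\mathit{pd}_x\right)}{1-\kappa_1\rho_c\rho_x} \\
\mathit{pd}_{\sigma\sigma} &= \frac{\sigma_d^2}{1-\kappa_1} + \frac{\sigma_z^2}{1-\kappa_1}\left[\alpha + (1-\alpha)\frac{1}{\psi^2}\right] \\
&\quad + \frac{\sigma_c^2}{1-\kappa_1}\left[\alpha v_c^2 - 2\alpha\kappa_1\mathit{pd}_c v_c + \kappa_1\mathit{pd}_{cc} + \kappa_1\mathit{pd}_c^2 + 2\alpha\frac{1}{\psi}v_c - 2\kappa_1\mathit{pd}_c\frac{1}{\psi} + \frac{1}{\psi^2}\right] \\
&\quad + \frac{\sigma_x^2}{1-\kappa_1}\left[\alpha v_x^2 - 2\alpha\kappa_1\mathit{pd}_x v_x + \kappa_1\mathit{pd}_{xx} + \kappa_1\mathit{pd}_x^2\right]
\end{align*}

with $\kappa_1 \equiv \frac{e^{\mathit{pd}_{ss}}}{1+e^{\mathit{pd}_{ss}}}$.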

A.2 The Long-Run Risk Model: Second-Order Projection Approximation

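The long-run risk model may be summarized by the following four equilibrium equations:

\begin{align*}
\tilde{V}_t &= u_0 + \frac{1}{1-\frac{1}{\psi}} C_t^{1-\frac{1}{\psi}} + \beta\, \widetilde{EV}_t^{\frac{1}{1-\alpha}} \\
\widetilde{EV}_t &= E_t\left[\tilde{V}_{t+1}^{1-\alpha}\, \mu_{z,t+1}^{\left(1-\frac{1}{\psi}\right)(1-\alpha)}\right] \\
1 &= E_t\left[\beta\left(\frac{\widetilde{EV}_t^{\frac{1}{1-\alpha}}}{\tilde{V}_{t+1}}\, \mu_{z,t+1}^{-\left(1-\frac{1}{\psi}\right)}\right)^{\alpha}\left(\frac{C_{t+1}}{C_t}\right)^{-\frac{1}{\psi}} \mu_{z,t+1}^{-\frac{1}{\psi}} R^f_t\right] \\
(P/D)_t &= E_t\left[\beta\left(\frac{\widetilde{EV}_t^{\frac{1}{1-\alpha}}}{\tilde{V}_{t+1}}\, \mu_{z,t+1}^{-\left(1-\frac{1}{\psi}\right)}\right)^{\alpha}\left(\frac{C_{t+1}}{C_t}\right)^{-\frac{1}{\psi}} \mu_{z,t+1}^{-\frac{1}{\psi}} \left((P/D)_{t+1}\mu_{d,t+1} + \mu_{d,t+1}\right)\right]
\end{align*}

as the market return is given by $R^m_t = \left((P/D)_t + 1\right)\frac{1}{(P/D)_{t-1}}\mu_{d,t}$. Here, $EV_t \equiv E_t\left[V_{t+1}^{1-\alpha}\right]$, $\tilde{V}_t \equiv V_t / Z_t^{1-\frac{1}{\psi}}$, and $\widetilde{EV}_t \equiv EV_t / Z_t^{\left(1-\frac{1}{\psi}\right)(1-\alpha)}$. We consider a second-order log-approximation to the four control variables in the model, i.e. $v_t = g^v_0 + \mathbf{g}^v_s \mathbf{s}_t + \tfrac{1}{2}\mathbf{s}_t' \mathbf{g}^v_{ss}\mathbf{s}_t$, $\mathit{ev}_t = g^{ev}_0 + \mathbf{g}^{ev}_s \mathbf{s}_t + \tfrac{1}{2}\mathbf{s}_t' \mathbf{g}^{ev}_{ss}\mathbf{s}_t$, $r_t = g^r_0 + \mathbf{g}^r_s \mathbf{s}_t + \tfrac{1}{2}\mathbf{s}_t' \mathbf{g}^r_{ss}\mathbf{s}_t$, and $\mathit{pd}_t = g^{pd}_0 + \mathbf{g}^{pd}_s \mathbf{s}_t + \tfrac{1}{2}\mathbf{s}_t' \mathbf{g}^{pd}_{ss}\mathbf{s}_t$, where $v_t \equiv \log\tilde{V}_t$, $\mathit{ev}_t \equiv \log\widetilde{EV}_t$, $r_t \equiv \log R^f_t$, and $\mathit{pd}_t \equiv \log(P/D)_t$. The law of motion for the states is known and given by

\begin{align*}
\underbrace{\begin{bmatrix} c_{t+1} \\ x_{t+1} \\ \sigma^2_{t+1} \end{bmatrix}}_{\mathbf{s}_{t+1}}
&= \underbrace{\begin{bmatrix} 0 \\ 0 \\ 1-\rho_\sigma \end{bmatrix}}_{\mathbf{h}_0}
+ \underbrace{\begin{bmatrix} \rho_c & 0 & 0 \\ 0 & \rho_x & 0 \\ 0 & 0 & \rho_\sigma \end{bmatrix}}_{\mathbf{h}_s}
\underbrace{\begin{bmatrix} c_t \\ x_t \\ \sigma^2_t \end{bmatrix}}_{\mathbf{s}_t}
+ \underbrace{\begin{bmatrix} \sigma_c\sigma^+_t & 0 & 0 \\ 0 & \sigma_x\sigma^+_t & 0 \\ 0 & 0 & \sigma_\sigma \end{bmatrix}}_{\boldsymbol{\eta}_t}
\underbrace{\begin{bmatrix} \epsilon_{c,t+1} \\ \epsilon_{x,t+1} \\ \epsilon_{\sigma,t+1} \end{bmatrix}}_{\boldsymbol{\epsilon}_{t+1}} \\
&\Updownarrow \\
\mathbf{s}_{t+1} &= \mathbf{h}_0 + \mathbf{h}_s\mathbf{s}_t + \boldsymbol{\eta}_t\boldsymbol{\epsilon}_{t+1}, \tag{A.1}
\end{align*}

where $\mathbf{s}_t$ is a matrix of size $n_s \times 1$ and $\sigma^+_t \equiv \sqrt{\max\left(\sigma^2_t, 0\right)}$. Below, we use the notation $\left[\, g^v_s(1,c) \;\; g^v_s(1,x) \;\; g^v_s(1,\sigma^2) \,\right]$ to index the elements in $\mathbf{g}^v_s$ and similarly for $\mathbf{g}^{ev}_s$, $\mathbf{g}^r_s$, and $\mathbf{g}^{pd}_s$. Also, $g^v_{ss}(c,c)$ denotes the element on the first row and first column of the matrix $\mathbf{g}^v_{ss}$, and so forth. To derive the approximation, we exploit the following result which we prove in the online appendix: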

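Proposition A.2.1. Let $a \in \mathbb{R}$, $\mathbf{b}$ be a $1 \times n_s$ matrix, and $\mathbf{C}$ a symmetric $n_s \times n_s$ matrix. Given (A.1), we then have that

\begin{align*}
E_t\left[\exp\left\{a + \mathbf{b}\mathbf{s}_{t+1} + \mathbf{s}_{t+1}'\mathbf{C}\mathbf{s}_{t+1}\right\}\right]
&= \exp\left\{a + \mathbf{b}\mathbf{h}_0 + \mathbf{h}_0'\mathbf{C}\mathbf{h}_0 + \left(2\mathbf{h}_0'\mathbf{C}\mathbf{h}_s + \mathbf{b}\mathbf{h}_s\right)\mathbf{s}_t + \mathbf{s}_t'\mathbf{h}_s'\mathbf{C}\mathbf{h}_s\mathbf{s}_t\right\} \\
&\quad \times \exp\left\{\tfrac{1}{2}\left(\mathbf{b}\boldsymbol{\eta}_t + 2\mathbf{h}_0'\mathbf{C}\boldsymbol{\eta}_t + 2\mathbf{s}_t'\mathbf{h}_s'\mathbf{C}\boldsymbol{\eta}_t\right)\left(\mathbf{I} - 2\boldsymbol{\eta}_t'\mathbf{C}\boldsymbol{\eta}_t\right)^{-1}\left(\mathbf{b}\boldsymbol{\eta}_t + 2\mathbf{h}_0'\mathbf{C}\boldsymbol{\eta}_t + 2\mathbf{s}_t'\mathbf{h}_s'\mathbf{C}\boldsymbol{\eta}_t\right)'\right\} \\
&\quad \times \left|\mathbf{I} - 2\boldsymbol{\eta}_t'\mathbf{C}\boldsymbol{\eta}_t\right|^{-\frac{1}{2}}
\end{align*}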
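The closed form in Proposition A.2.1 can be checked numerically. The sketch below is illustrative only and is not the authors' code: it assumes standard normal innovations in (A.1), and the function names are hypothetical. It evaluates the right-hand side of the proposition and, separately, a brute-force Monte Carlo estimate of the left-hand side, so that the two can be compared for any parameter values for which I − 2η′Cη is positive definite.

```python
import numpy as np


def expected_exp_quadratic(a, b, C, h0, hs, eta, s):
    """Closed form of E_t[exp(a + b s' + s'^T C s')] from Proposition A.2.1,
    where s' = h0 + hs s + eta eps and eps ~ N(0, I) (assumed)."""
    m = h0 + hs @ s                          # conditional mean of s_{t+1}
    lam = (b + 2.0 * m @ C) @ eta            # b eta + 2 h0' C eta + 2 s' hs' C eta
    A = np.eye(eta.shape[1]) - 2.0 * eta.T @ C @ eta
    if np.any(np.linalg.eigvalsh(A) <= 0.0):
        raise ValueError("I - 2 eta' C eta must be positive definite")
    quad = 0.5 * lam @ np.linalg.solve(A, lam)
    return np.exp(a + b @ m + m @ C @ m + quad) / np.sqrt(np.linalg.det(A))


def monte_carlo_exp_quadratic(a, b, C, h0, hs, eta, s, n_draws=200_000, seed=0):
    """Brute-force simulation of the same conditional expectation."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal((n_draws, eta.shape[1]))
    s_next = h0 + hs @ s + eps @ eta.T       # each row is one draw of s_{t+1}
    quad = np.einsum("ij,jk,ik->i", s_next, C, s_next)
    return np.mean(np.exp(a + s_next @ b + quad))
```

In the scalar case with b = 0, h0 = hs = 0, and η = 1, the closed form collapses to the familiar (1 − 2C)^(−1/2), which gives a quick sanity check before using the matrix version.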

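The projection approximation can be implemented sequentially by first obtaining $v_t$ and then $\mathit{ev}_t$, after which $r_t$ and $\mathit{pd}_t$ are easily computed using the expressions for $v_t$ and $\mathit{ev}_t$. To conserve space, we only show how to solve for $v_t$, as the remaining three control variables are obtained in a similar way. We first note that the expression for the scaled value function reads

\begin{align*}
\tilde{V}_t &= u_0 + \frac{1}{1-\frac{1}{\psi}} C_t^{1-\frac{1}{\psi}} + \beta E_t\left[\tilde{V}_{t+1}^{1-\alpha}\, \mu_{z,t+1}^{\left(1-\frac{1}{\psi}\right)(1-\alpha)}\right]^{\frac{1}{1-\alpha}} \\
&\Updownarrow \\
\exp\left\{v(\mathbf{s}_t)\right\} &= u_0 + \frac{1}{1-\frac{1}{\psi}}\exp\left\{\left(1-\frac{1}{\psi}\right)c_t\right\} \\
&\quad + \beta E_t\left[\exp\left\{(1-\alpha)v(\mathbf{s}_{t+1})\right\}\exp\left\{\left(1-\frac{1}{\psi}\right)(1-\alpha)\log\mu_{z,t+1}\right\}\right]^{\frac{1}{1-\alpha}},
\end{align*}

where $v(\mathbf{s}_t) \equiv \log\tilde{V}(\mathbf{s}_t)$. Due to the independence of the shocks, it is possible to integrate out $\epsilon_{z,t+1}$ manually as we have

\begin{align*}
\exp\left\{v(\mathbf{s}_t)\right\} &= u_0 + \frac{1}{1-\frac{1}{\psi}}\exp\left\{\left(1-\frac{1}{\psi}\right)c_t\right\} \\
&\quad + \beta\left\{E_t\left[\exp\left\{v(\mathbf{s}_{t+1})\right\}^{(1-\alpha)}\right]\exp\left\{\left(1-\frac{1}{\psi}\right)\left(\log\mu_z + x_t\right)\right\}^{1-\alpha} \right. \\
&\qquad \left. \times \exp\left\{\frac{1}{2}\left(1-\frac{1}{\psi}\right)^2(1-\alpha)^2\sigma_z^2\left(\sigma^+_t\right)^2\right\}\right\}^{\frac{1}{1-\alpha}}, \\
&\Updownarrow \\
\exp\left\{v(\mathbf{s}_t)\right\} &= u_0 + \frac{1}{1-\frac{1}{\psi}}\exp\left\{\left(1-\frac{1}{\psi}\right)c_t\right\} \\
&\quad + \beta E_t\left[\exp\left\{v(\mathbf{s}_{t+1}) + \left(1-\frac{1}{\psi}\right)\left(\log\mu_z + x_t\right) + \frac{1}{2}\left(1-\frac{1}{\psi}\right)^2(1-\alpha)\sigma_z^2\left(\sigma^+_t\right)^2\right\}^{1-\alpha}\right]^{\frac{1}{1-\alpha}}.
\end{align*}

To avoid numerical overflow of $\exp\{\cdot\}^{(1-\alpha)}$, given the large values of $v(\mathbf{s}_{t+1})$ and $\alpha$, we scale this term by $\tilde{V}_t$. That is,

\begin{align*}
\exp\left\{v(\mathbf{s}_t)\right\} &= u_0 + \frac{1}{1-\frac{1}{\psi}}\exp\left\{\left(1-\frac{1}{\psi}\right)c_t\right\} \\
&\quad + \tilde{V}_t\,\beta E_t\left[\left(\frac{\exp\left\{v(\mathbf{s}_{t+1}) + \left(1-\frac{1}{\psi}\right)\left(\log\mu_z + x_t\right) + \frac{1}{2}\left(1-\frac{1}{\psi}\right)^2(1-\alpha)\sigma_z^2\left(\sigma^+_t\right)^2\right\}}{\tilde{V}_t}\right)^{1-\alpha}\right]^{\frac{1}{1-\alpha}}.
\end{align*}

Focusing on the last term we have

\begin{align*}
&\beta E_t\left[\left(\exp\left\{-v_t + v_{t+1} + \left(1-\frac{1}{\psi}\right)\left(\log\mu_z + x_t\right) + \frac{1}{2}\left(1-\frac{1}{\psi}\right)^2(1-\alpha)\sigma_z^2\left(\sigma^+_t\right)^2\right\}\right)^{1-\alpha}\right]^{\frac{1}{1-\alpha}} \\
&= \beta E_t\left[\exp\left\{-v(\mathbf{s}_t)(1-\alpha) + \left(g^v_0 + \mathbf{g}^v_s\mathbf{s}_{t+1} + \tfrac{1}{2}\mathbf{s}_{t+1}'\mathbf{g}^v_{ss}\mathbf{s}_{t+1}\right)(1-\alpha) \right.\right. \\
&\qquad \left.\left. + \left(1-\frac{1}{\psi}\right)(1-\alpha)\left(\log\mu_z + x_t\right) + \frac{1}{2}\left(1-\frac{1}{\psi}\right)^2(1-\alpha)^2\sigma_z^2\left(\sigma^+_t\right)^2\right\}\right]^{\frac{1}{1-\alpha}} \\
&= \beta\exp\left\{\left(1-\frac{1}{\psi}\right)\left(\log\mu_z + x_t\right) + \frac{1}{2}\left(1-\frac{1}{\psi}\right)^2(1-\alpha)\sigma_z^2\left(\sigma^+_t\right)^2\right\} \\
&\quad \times E_t\left[\exp\left\{(1-\alpha)\left(g^v_0 - v(\mathbf{s}_t)\right) + (1-\alpha)\mathbf{g}^v_s\mathbf{s}_{t+1} + \mathbf{s}_{t+1}'\frac{(1-\alpha)\mathbf{g}^v_{ss}}{2}\mathbf{s}_{t+1}\right\}\right]^{\frac{1}{1-\alpha}}
\end{align*}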


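To apply Proposition A.2.1, let $a \equiv (1-\alpha)\left(g^v_0 - v(\mathbf{s}_t)\right)$, $\mathbf{b} \equiv (1-\alpha)\mathbf{g}^v_s$, and $\mathbf{C} \equiv \frac{(1-\alpha)\mathbf{g}^v_{ss}}{2}$. This implies

\begin{align*}
&E_t\left[\exp\left\{(1-\alpha)\left(g^v_0 - v(\mathbf{s}_t)\right) + (1-\alpha)\mathbf{g}^v_s\mathbf{s}_{t+1} + \mathbf{s}_{t+1}'\frac{(1-\alpha)\mathbf{g}^v_{ss}}{2}\mathbf{s}_{t+1}\right\}\right] \\
&= \exp\left\{(1-\alpha)\left(g^v_0 - v(\mathbf{s}_t) + \mathbf{g}^v_s\mathbf{h}_0 + \mathbf{h}_0'\frac{\mathbf{g}^v_{ss}}{2}\mathbf{h}_0 + \left(\mathbf{h}_0'\mathbf{g}^v_{ss}\mathbf{h}_s + \mathbf{g}^v_s\mathbf{h}_s\right)\mathbf{s}_t + \mathbf{s}_t'\mathbf{h}_s'\frac{\mathbf{g}^v_{ss}}{2}\mathbf{h}_s\mathbf{s}_t\right)\right\} \\
&\quad \times \exp\left\{\frac{1}{2}(1-\alpha)^2\left(\mathbf{g}^v_s\boldsymbol{\eta}_t + \mathbf{h}_0'\mathbf{g}^v_{ss}\boldsymbol{\eta}_t + \mathbf{s}_t'\mathbf{h}_s'\mathbf{g}^v_{ss}\boldsymbol{\eta}_t\right)\left(\mathbf{I} - \boldsymbol{\eta}_t'(1-\alpha)\mathbf{g}^v_{ss}\boldsymbol{\eta}_t\right)^{-1} \right. \\
&\qquad \left. \times \left(\mathbf{g}^v_s\boldsymbol{\eta}_t + \mathbf{h}_0'\mathbf{g}^v_{ss}\boldsymbol{\eta}_t + \mathbf{s}_t'\mathbf{h}_s'\mathbf{g}^v_{ss}\boldsymbol{\eta}_t\right)'\right\} \\
&\quad \times \left|\mathbf{I} - \boldsymbol{\eta}_t'(1-\alpha)\mathbf{g}^v_{ss}\boldsymbol{\eta}_t\right|^{-\frac{1}{2}}.
\end{align*}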

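Hence, the Euler residual for the log-transformed value function, $R^v(\mathbf{s}_t)$, reads

\begin{align*}
R^v(\mathbf{s}_t) &= -\exp\left\{g^v_0 + \mathbf{g}^v_s\mathbf{s}_t + \tfrac{1}{2}\mathbf{s}_t'\mathbf{g}^v_{ss}\mathbf{s}_t\right\} + u_0 + \frac{1}{1-\frac{1}{\psi}}\exp\left\{\left(1-\frac{1}{\psi}\right)c_t\right\} \\
&\quad + \beta\tilde{V}_t\exp\left\{\left(1-\frac{1}{\psi}\right)\left(\log\mu_z + x_t\right) + \frac{1}{2}\left(1-\frac{1}{\psi}\right)^2(1-\alpha)\sigma_z^2\left(\sigma^+_t\right)^2\right\} \\
&\quad \times \exp\left\{\left(g^v_0 - v(\mathbf{s}_t)\right) + \mathbf{g}^v_s\mathbf{h}_0 + \mathbf{h}_0'\frac{\mathbf{g}^v_{ss}}{2}\mathbf{h}_0 + \left(\mathbf{h}_0'\mathbf{g}^v_{ss}\mathbf{h}_s + \mathbf{g}^v_s\mathbf{h}_s\right)\mathbf{s}_t + \mathbf{s}_t'\mathbf{h}_s'\frac{\mathbf{g}^v_{ss}}{2}\mathbf{h}_s\mathbf{s}_t\right\} \\
&\quad \times \exp\left\{\frac{(1-\alpha)}{2}\left(\mathbf{g}^v_s\boldsymbol{\eta}_t + \mathbf{h}_0'\mathbf{g}^v_{ss}\boldsymbol{\eta}_t + \mathbf{s}_t'\mathbf{h}_s'\mathbf{g}^v_{ss}\boldsymbol{\eta}_t\right) \right. \\
&\qquad \left. \times \left(\mathbf{I} - \boldsymbol{\eta}_t'(1-\alpha)\mathbf{g}^v_{ss}\boldsymbol{\eta}_t\right)^{-1}\left(\mathbf{g}^v_s\boldsymbol{\eta}_t + \mathbf{h}_0'\mathbf{g}^v_{ss}\boldsymbol{\eta}_t + \mathbf{s}_t'\mathbf{h}_s'\mathbf{g}^v_{ss}\boldsymbol{\eta}_t\right)'\right\} \\
&\quad \times \left|\mathbf{I} - \boldsymbol{\eta}_t'(1-\alpha)\mathbf{g}^v_{ss}\boldsymbol{\eta}_t\right|^{-\frac{1}{2(1-\alpha)}}.
\end{align*}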

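We then determine $g^v_0$, $\mathbf{g}^v_s$, and $\mathbf{g}^v_{ss}$ as follows:

• Construct a multi-dimensional grid for the states based on the Cartesian set $S_s \equiv S_c \times S_x \times S_{\sigma^2_t}$.

• Generate $N_s$ points $\left\{\mathbf{s}^i_t\right\}_{i=1}^{N_s}$ from the set $S_s$.

• Determine $g^v_0$, $\mathbf{g}^v_s$, and $\mathbf{g}^v_{ss}$ by solving the nonlinear least squares problem
\[
\left(g^v_0, \mathbf{g}^v_s, \mathbf{g}^v_{ss}\right) = \arg\min \sum_{i=1}^{N_s}\left(R^v\left(\mathbf{s}^i_t\right)\right)^2.
\]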

The grid for the state variables Ss is constructed using 10 points uniformly distributed along each dimension, implying Ns = 1,000. The upper and lower bounds along each dimension are determined following a simulation of the states to cover the maximum and minimum levels. We evaluate Rv(st) across all Ns points simultaneously by using a vectorized implementation in MATLAB, where the symbolic toolbox is used to analytically compute the matrix products, matrix inversions, and determinants in the expression for Rv(st).


A.3 The Long-Run Risk Model: Accuracy of Solution

This section evaluates the accuracy of the adopted second-order projection approximation for each of the eight estimated versions of the long-run risk model in Table 1.1. The performance of this approximation is benchmarked to the widely used log-normal method, a first-order projection solution, and a highly accurate fifth-order projection solution. As in Pohl et al. (2018), we focus on means and standard deviations for $\mathit{pd}_t$, $r^f_t$, and $r^m_t$, because these moments are most sensitive to the adopted approximation method. The results are summarized in Table A.1, where we highlight the following results. First, the log-normal method generally underpredicts $E[\mathit{pd}_t]$, generates too high values of $E[r^m_t]$, and overpredicts the variability in $\mathit{pd}_t$. Hence, we reproduce the key findings of Pohl et al. (2018) on our estimated models. Second, a first-order projection solution generally implies that these errors go in the opposite direction, as it overpredicts $E[\mathit{pd}_t]$ and underpredicts $E[r^m_t]$. Third, the proposed second-order projection solution displays no systematic biases and produces moments that are nearly identical to those from the fifth-order projection solution. The main exception is for the extended model with IES = 1.1 and RRA = 10, where we see somewhat larger deviations.

A.4 The New Keynesian Model: Accuracy of Solution

We evaluate the accuracy of the adopted third-order perturbation approximation by

computing unit-free Euler-equation errors on a grid of 1,000 points. The accuracy

of this solution is benchmarked to a standard first-order approximation and a fifth-

order approximation using the codes of Levintal (2017). Table A.2 reports the root

mean squared Euler-equation errors (RMSEs) for the six estimated versions of the

New Keynesian model in Table 1.4. We generally find that a third-order approxima-

tion improves the accuracy of the linearized solution, both for the Euler-equations

relating to the macro part of the model and for the 40 Euler-equations describing

bond prices. This improvement is particularly evident for bond prices. Increasing

the approximation order from three to five provides only a small improvement to

the macro part of the model when RRA equals 10 and 60, while accuracy actually

deteriorates slightly for RRA = 5. We find even smaller effects on bond prices of going

from third to fifth order, where accuracy only increases for the benchmark model

with RRA = 60 and the extended model with RRA = 10. Thus, these results indicate

that little would be gained by considering a fifth-order approximation. Moreover, going to fifth order is computationally much more demanding than the adopted third-order approximation and would therefore make a formal estimation of the New Keynesian model infeasible.
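To make the accuracy measure concrete, the sketch below computes unit-free Euler-equation errors and their RMSE for a deliberately simple one-state toy model, not for the New Keynesian model itself. Everything model-specific is an assumption made for illustration: the Euler equation 1 = E_t[β exp(−γ Δc_{t+1}) R_t], the growth process Δc_{t+1} = μ + ρ s_t + σ ε_{t+1} with standard normal ε, and all function names. Conditional expectations use a 7-point Gauss-Hermite rule, as reported for Table A.2.

```python
import numpy as np


def gauss_hermite_expectation(f, n_nodes=7):
    """Approximate E[f(eps)] for eps ~ N(0, 1) with Gauss-Hermite quadrature."""
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    # hermgauss targets the weight exp(-x^2), hence the sqrt(2) and sqrt(pi) terms
    return np.sum(w * f(np.sqrt(2.0) * x)) / np.sqrt(np.pi)


def euler_errors(log_R, grid, beta, gamma, mu, rho, sigma):
    """Unit-free errors E_t[M_{t+1}] * R_t - 1 at every grid point (toy model)."""
    errors = np.empty(len(grid))
    for i, s in enumerate(grid):
        def sdf(eps, s=s):
            # stochastic discount factor beta * exp(-gamma * consumption growth)
            return beta * np.exp(-gamma * (mu + rho * s + sigma * eps))
        errors[i] = gauss_hermite_expectation(sdf) * np.exp(log_R(s)) - 1.0
    return errors


def rmse(errors):
    """Root mean squared Euler-equation error across the grid."""
    return np.sqrt(np.mean(np.asarray(errors) ** 2))
```

In this toy model a policy for log R_t that omits the precautionary term −γ²σ²/2 produces the same error exp(γ²σ²/2) − 1 at every grid point, whereas the exact policy produces errors at the level of floating-point noise.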


Table A.1: The Long-Run Risk Model: Accuracy of Moments
This table reports unconditional moments for the eight estimated versions of the long-run risk model in Table 1.1 when using the log-normal method as well as a first-, second-, and fifth-order projection solution with log-transformed variables. The projection approximations are computed by minimizing the squared Euler-equation errors on a grid of 1,000 points, with 10 points uniformly distributed along each dimension between its maximum and minimum level in a simulated sample of 250,000 observations. The fifth-order projection solution is computed using complete Chebyshev polynomials. The log-normal method is implemented using a first-order projection approximation of the value function and the traditional log-linear approximation of the price-dividend ratio at the unconditional mean of the price-dividend level, which is obtained by iterating on the approximated loadings.

                         RRA = 5                                    RRA = 10
                         Means               Stds                   Means               Stds
                         pdt    rft    rmt   pdt    rft    rmt      pdt    rft    rmt   pdt    rft    rmt

Benchmark Model: IES = 1.5
  Log-normal method      3.27   1.83   6.18  0.52   1.15   15.87    3.12   1.95   6.81  0.42   0.75   15.83
  1st order              3.76   1.83   4.87  0.38   1.15   13.61    3.58   1.95   5.32  0.32   0.75   14.13
  2nd order              3.49   1.84   5.70  0.42   1.14   14.10    3.30   1.96   6.32  0.34   0.75   14.48
  5th order              3.49   1.84   5.72  0.44   1.14   14.47    3.31   1.96   6.28  0.36   0.75   14.89

Extended Model: IES = 1.1
  1st order              3.80   2.16   4.72  0.26   0.70   15.39    3.59   1.56   5.23  0.27   0.59   15.38
  2nd order              3.28   2.16   6.32  0.26   0.70   14.83    3.29   1.68   6.22  0.26   0.59   14.77
  5th order              3.27   2.16   6.36  0.27   0.70   15.08    4.26   1.67   3.83  0.29   0.59   16.99

Extended Model: IES = 1.5
  1st order              3.81   1.63   4.68  0.28   0.50   15.69    3.81   1.64   4.68  0.28   0.50   15.69
  2nd order              3.29   1.69   6.26  0.28   0.50   14.80    3.29   1.71   6.28  0.29   0.50   14.78
  5th order              3.28   1.69   6.33  0.29   0.49   15.00    3.28   1.71   6.33  0.29   0.50   15.00

Extended Model: IES = 2.0
  1st order              3.66   1.52   5.06  0.29   0.47   15.16    3.64   1.58   5.10  0.28   0.48   14.89
  2nd order              3.28   1.59   6.30  0.30   0.45   14.73    3.28   1.63   6.33  0.30   0.47   14.69
  5th order              3.29   1.59   6.30  0.31   0.45   15.00    3.28   1.63   6.32  0.31   0.47   15.02


Table A.2: The New Keynesian Model: Euler-Equation Errors
This table reports the root mean squared unit-free Euler-equation errors (RMSEs) on a grid of 1,000 points for a first-, third-, and fifth-order perturbation approximation. The grid is constructed by considering 10 points uniformly between −2×σx,i and 2×σx,i for each state dimension, where σx,i denotes the standard deviation of the i’th state in a log-linearized solution. Conditional expectations in the Euler-equations are evaluated by Gauss-Hermite quadratures using 7 points. The considered model parameters are those reported in Table 1.4. The RMSEs for the 12 equations describing the model without bond prices are summarized under the label ’Macro Part’, while the RMSEs for the 40 equations describing all bond prices are summarized under the label ’Bond Prices’. The label ’Total’ refers to the RMSEs for the entire model.

                 Benchmark Model                   Extended Model
                 (1)        (2)        (3)         (4)        (5)        (6)
                 RRA = 5    RRA = 10   RRA = 60    RRA = 5    RRA = 10   RRA = 60

Macro Part:
  1st order      0.0253     0.0275     0.0703      0.2756     0.4170     0.2889
  3rd order      0.1163     0.0565     0.0215      0.1375     0.0579     0.0196
  5th order      0.1274     0.0474     0.0182      0.1525     0.0508     0.0149

Bond Prices:
  1st order      0.0426     0.0466     0.1197      0.4717     0.7125     0.4921
  3rd order      0.0013     0.0014     0.0046      0.0014     0.0016     0.0021
  5th order      0.0076     0.0020     0.0033      0.0056     0.0014     0.0038

Total:
  1st order      0.0382     0.0418     0.1072      0.4224     0.6381     0.4408
  3rd order      0.0543     0.0264     0.0108      0.0642     0.0271     0.0093
  5th order      0.0599     0.0222     0.0089      0.0714     0.0237     0.0077

CHAPTER 2

HOW LEARNING FROM MACROECONOMIC EXPERIENCES SHAPES THE YIELD CURVE


Abstract

I link constant-gain learning expectations of inflation and consumption growth to the

long-run variation in the level and slope of the U.S. Treasury yield curve, respectively.

The variation in yields that is orthogonal to the consumption-based equilibrium

factors has a two-factor structure with cyclical level and slope factor interpretation.

The four factors predict excess returns with R2’s up to 56%, and subsume and add to

the predictive information in the most popular bond return predictors. My four-factor

model implies cyclical term premia, because the macroeconomic expectations drive

time-variation in long-run short rate expectations that captures the trend component

of long-term yields. The cyclicality of term premia contrasts with the implications of the

workhorse affine term structure model.


2.1 Introduction

Accurate decompositions of long-term nominal yields into expected short rates and term premia are crucial for understanding expected returns in bond markets and the pass-through of conventional and unconventional monetary policy. The central question is what information should go into this decomposition. I address this question in the present paper.

I uncover a novel factor structure in U.S. Treasury bonds. Two key term structure

factors have a clear equilibrium-based interpretation. Consumption-based asset

pricing models generically imply that time-varying consumption growth and inflation

expectations are nominal yield curve factors. My main contribution is to show that

expected inflation and consumption growth capture the most persistent component

— the trend component — of yield curve level and slope, respectively. Importantly,

the macroeconomic expectations drive time-variation in long-horizon expected

short rates. Long-horizon short rate expectations capture crucial variation in the

expectation hypothesis component of the yield curve. Controlling for long-horizon

short rate expectations, I extract yield curve level and slope cycles. The two cycle

factors capture the least persistent variation in the level and slope of yields, and are

important sources of risk premium variation. The dynamic trend and cycle distinction

in yield curve level and slope greatly improves the measure of bond risk premia.

My work builds on the macro-finance literature studying bond risk premia. A

recent literature has challenged the conventional wisdom that the current yield curve

spans all relevant information for estimating risk premia.1 This paper is most closely

related to the work of Cieslak and Povala (2015) and Bauer and Rudebusch (2017a).

Cieslak and Povala (2015) control for expected inflation and extract a measure of

bond risk premia. I show that further controlling for expected consumption growth

improves upon the bond risk premia measure, because it captures additional vari-

ation in long-run real rate expectations. Bauer and Rudebusch (2017a) control for

expected inflation and extract additional variation in the long-run real rate — the

equilibrium real rate — from the yield curve. I relate this equilibrium real rate varia-

tion to underlying macroeconomic fundamentals and provide an interpretation of

the apparent decline in the equilibrium real rate.

I start from a simple consumption-based argument. Underlying a bond investor’s

behavior is the economics of the fundamental consumption-savings trade-off. High

expected consumption tomorrow disincentivizes real savings, since the expected

marginal utility from consumption tomorrow is comparatively low. Equilibrium bond
prices exactly balance this trade-off. Thus, expected consumption growth is an

important factor of equilibrium bond prices. As bonds are nominal, expected inflation

1 The macroeconomic variables that have been found to have additional information about bond risk include the output gap (Cooper and Priestley, 2008), factors from a large macroeconomic data set (Ludvigson and Ng, 2009), Treasury bond supply (Greenwood and Vayanos, 2014), economic activity and inflation (Joslin, Priebsch, and Singleton, 2014), and trend inflation (Cieslak and Povala, 2015).


is another important factor. Beyond the macroeconomic expectations, other factors

affecting the marginal utility of consumption are potentially priced into the term

structure.

Macroeconomic expectations are not directly observable, so a modeling choice is needed. One approach is to use survey responses. Piazzesi, Salomao, and Schneider (2015), Buraschi, Piatti, and Whelan (2017), and Cieslak (2017) document that consensus survey responses are biased forecasts of bond market variables. This is consistent with the interpretation that the consensus survey response is unrepresentative of the marginal bond investor’s expectations. Therefore, I pursue a different approach. Nagel and Malmendier (2016) provide a natural micro-foundation for constant-gain learning expectations based on a learning-from-experience argument. People are uncertain about the true data generating process, but learn from their experiences. That is, people overweight macroeconomic data experienced over their lifetime compared to the macroeconomic history they did not experience. This implies that macroeconomic history is down-weighted as new generations emerge and old generations pass.
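The down-weighting of history can be made concrete with the textbook constant-gain recursion, in which the expectation is revised by a fixed fraction of the latest forecast error. The sketch below is illustrative only: the function names, the gain value, and the data are hypothetical, and the exact specification used later in the chapter may differ. It shows that the recursion places weight gain × (1 − gain)^k on the observation k periods back, so older data receive geometrically smaller weights.

```python
def constant_gain_update(prev, observation, gain):
    """One step of constant-gain learning: move toward the new observation."""
    return prev + gain * (observation - prev)


def constant_gain_path(observations, gain, initial):
    """Sequence of expectations after each observation has been processed."""
    estimate = initial
    path = []
    for obs in observations:
        estimate = constant_gain_update(estimate, obs, gain)
        path.append(estimate)
    return path


def implied_weights(gain, n_lags):
    """Weight placed on the observation k periods back, for k = 0, ..., n_lags - 1."""
    return [gain * (1.0 - gain) ** k for k in range(n_lags)]


# Hypothetical quarterly inflation observations (percent) and a hypothetical gain.
inflation = [2.1, 2.4, 3.0, 2.8, 2.2]
expectations = constant_gain_path(inflation, gain=0.02, initial=2.0)
```

A smaller gain spreads the weights over a longer history and yields a smoother, more persistent expectation, which is the sense in which such expectations can capture a slow-moving trend.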

Empirically, constant-gain learning expectations for consumption growth and

inflation are significant factors for 1- through 10-year U.S. Treasury bonds. Across

the maturity spectrum, the macroeconomic expectations capture 80-89% of the total

variation in yields from 1971-2015. Expected inflation loads almost equally on short-

and long-maturity yields. This is consistent with the common interpretation of a level

factor. Expected consumption growth loads heavily on short-maturity yields and less

on longer-maturity yields. This is consistent with the common interpretation of a

slope factor. I follow Cieslak and Povala (2015) and extract maturity-specific yield

curve cycles as the variation in yields orthogonal to the macroeconomic expectations.

The maturity-specific cycle factors have a two-factor structure. These two yield curve

cycle factors also admit an interpretation of level and slope factors, respectively. In

fact, the four factors decompose the first two principal components of the yield

curve — that is, the usual measures of level and slope factors. Expected inflation

and consumption growth capture the trend component of the level and slope of the

yield curve, respectively. The two yield curve cycle factors capture less persistent

components of the level and slope of the yield curve. The four factors fully span the

conventional level and slope factors of the term structure, but provide an important

dynamic distinction. The macro-founded level and slope trends predominantly cap-

ture the expectation hypothesis component of long-term yields, whereas the cycle

factors capture risk premium variation.

Following Cochrane and Piazzesi (2005), I combine these four factors into a

single bond return predictor. Across maturities, I find that the predictor holds vast

information about future bond returns with R2’s as high as 56%. The identified bond

return predictor subsumes and improves upon the information in the first three

principal components of yields, the Cochrane and Piazzesi (2005) factor formed


from five forward rates, and the Cieslak and Povala (2015) interest cycle factor. This

added predictability is a robust feature of the data, as it holds (i) in- and out-of-

sample, (ii) when controlling for small-sample distortions, (iii) across sub-samples,

(iv) for different holding periods, (v) across data sets that differ in how zero-coupon

bonds are constructed, and (vi) for macroeconomic expectations computed from an

optimal-gain learning algorithm using the Kalman filter recursions.

Finally, I use the model to decompose the 10-year yield into term premia and

expected short rates. I benchmark the results against the workhorse affine term

structure model (Vasicek, 1977; Duffie and Kan, 1996; Dai and Singleton, 2000; Duffee,

2002). My model implies average expected short rates that move at a lower frequency,

whereas term premia move at a higher frequency. This is because — in my model

— the short rate reverts towards a moving long-run mean. The long-run nominal

short rate reverts to the sum of the equilibrium real rate and long-run inflation

expectations. Both the equilibrium real rate and long-run inflation expectations have

time-variation driven by the two macroeconomic expectations factors. I find that

the equilibrium real rate has declined in recent years. This is consistent with recent

evidence (Hamilton, Harris, Hatzius, and West, 2016; Holston, Laubach, and Williams,

2017; Bauer and Rudebusch, 2017a). In contrast, the affine term structure model

implies a fixed long-run nominal short rate. As a result, long-horizon expectations

display stronger mean reversion. Term premia, as the residual component, then

capture some of the trend component in the 10-year yield. A cyclical term premium

is consistent with the behaviour of risk premia in other asset classes (Fama and

French, 1989).

The paper proceeds as follows. Section 2.2 outlines an illustrative consumption-

based model of the nominal term structure. Section 2.3 shows that the identified

factors predict bond returns both in- and out-of-sample, and that they subsume the

information in the most popular predictors in the literature. Section 2.4 decomposes

long-maturity yields into short rate expectations and term premia, while Section

2.5 discusses the model-implied equilibrium real rate. Finally, Section 2.6 provides

concluding remarks.

2.2 An Illustrative Consumption-Based Model

I argue that expected consumption growth and inflation are two key factors that

help explain variation in the nominal term structure. The economic intuition comes

from a simple consumption-based equilibrium model, where the key ingredient is a

marginal bond investor that uses constant-gain learning to update macroeconomic

expectations.


2.2.1 The Consumption-Savings Trade-Off

The standard consumption-based asset pricing model with time-separable log utility implies a nominal (log) stochastic discount factor $m^{\$}_{t,t+1}$ given by

\[
m^{\$}_{t,t+1} = \log\delta - \Delta c_{t+1} - \pi_{t+1}, \tag{2.1}
\]

where $\delta$ denotes a subjective discount factor, $\Delta c_{t+1}$ denotes consumption growth, and $\pi_{t+1}$ denotes the net inflation rate. Consider the decomposition of consumption growth and inflation into conditionally expected and unexpected components,

\[
\begin{aligned}
\Delta c_{t+1} &= \tau_{c,t} + \varepsilon_{c,t+1} \\
\pi_{t+1} &= \tau_{\pi,t} + \varepsilon_{\pi,t+1},
\end{aligned} \tag{2.2}
\]

where $\tau_{c,t} = E_t\left[\Delta c_{t+1}\right]$ and $\tau_{\pi,t} = E_t\left[\pi_{t+1}\right]$ are expectations conditional on information available to the marginal investor at time $t$, and $\varepsilon_{c,t+1}$ and $\varepsilon_{\pi,t+1}$ are forecast errors.

Equations (2.1) and (2.2) summarize the consumption-savings trade-off. In times of

high expected consumption growth postponing current consumption is less desirable,

since the marginal utility from consuming tomorrow is expected to be comparably

low. That is, saving in bonds is unattractive in times of high expected consumption

growth, and vice versa. Similarly, high expected inflation discourages savings in

nominally denoted bonds, since the real purchasing power of the promised payments

is expected to be low. These arguments imply that expected consumption growth $\tau_{c,t}$

and expected inflation $\tau_{\pi,t}$ are important factors of equilibrium yield curves.
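As a rough worked illustration of why both expectations enter yields, consider the one-period nominal yield implied by (2.1) and (2.2). The derivation assumes, purely for illustration, that the forecast errors are conditionally jointly normal with constant variance $\sigma^2 = \mathrm{Var}_t(\varepsilon_{c,t+1} + \varepsilon_{\pi,t+1})$. Neither this distributional assumption nor the symbol $\sigma^2$ appears in the text above, and $y^{(1)}_t$ here denotes the yield over one period of the discount factor.

```latex
\begin{aligned}
y^{(1)}_t &= -\log E_t\!\left[\exp\!\left(m^{\$}_{t,t+1}\right)\right] \\
          &= -\log\delta + \tau_{c,t} + \tau_{\pi,t}
             - \log E_t\!\left[\exp\!\left(-\varepsilon_{c,t+1} - \varepsilon_{\pi,t+1}\right)\right] \\
          &= -\log\delta + \tau_{c,t} + \tau_{\pi,t} - \tfrac{1}{2}\sigma^2 .
\end{aligned}
```

Under these assumptions the short yield moves one-for-one with expected consumption growth and expected inflation, which is the consumption-savings intuition in its simplest form.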

Of course, marginal utility from consumption might exhibit time-variation due

to other factors as well. Popular reasons are external habits as in Campbell and

Cochrane (1999) and Wachter (2006) or demand shocks as in Albuquerque, Eichen-

baum, and Rebelo (2016), Creal and Wu (2016), and Schorfheide, Song, and Yaron

(2017). Incorporating this feature implies a nominal stochastic discount factor given

by

\[
m^{\$}_{t,t+1} = \log\delta - \tau_{c,t} - \varepsilon_{c,t+1} - \tau_{\pi,t} - \varepsilon_{\pi,t+1} + \Delta\gamma_{t+1}, \tag{2.3}
\]

where $\Delta\gamma_{t+1}$ denotes demand shock growth, which in general can depend on a set of additional factors $P_t$ as well as expected macroeconomic conditions $\tau_t = [\tau_{c,t}\ \ \tau_{\pi,t}]'$. The nominal stochastic discount factor implies that, for demand shock processes that

do not depend on expected consumption growth and inflation in an exactly offsetting

manner, the main consumption-savings intuition continues to hold true. Outside

of such knife-edge restrictions on demand shock growth, expected macroeconomic

conditions continue to be factors of equilibrium yields — and any other nominal

asset. Furthermore, the set of factors Pt reflecting additional time-variation in the

marginal utility from consumption is potentially priced into the term structure of

yields. That is, the equilibrium arguments suggest a yield curve factor structure given

by

\[
y^{(n)}_t = A_n + B_{\tau,n}\tau_t + B_{P,n}P_t, \tag{2.4}
\]

where $y^{(n)}_t$ denotes the yield-to-maturity on an $n$-year bond.²

2.2.2 Learning from Macroeconomic Experiences

The marginal bond investor’s expectations of future consumption growth and in-

flation are not directly observable, and a modelling choice is therefore needed to

obtain these expectations. Here, I follow a large literature in macroeconomics and

finance that emphasizes informational frictions (Mankiw and Reis, 2002; Sims, 2003;

Woodford, 2003, among others). I construct the marginal investor’s macroeconomic

expectations using the popular constant-gain learning algorithm and let

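\[
\begin{aligned}
\tau_{c,t+1} &= \tau_{c,t} + \nu_c\,\varepsilon_{c,t+1} \\
\tau_{\pi,t+1} &= \tau_{\pi,t} + \nu_\pi\,\varepsilon_{\pi,t+1},
\end{aligned} \tag{2.5}
\]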

where εc,t+1 and επ,t+1 are the realized forecast errors from (2.2). The two constant-

gain parameters νc and νπ summarize how fast forecast errors are incorporated

into expected consumption growth and expected inflation, respectively. In this way,

the constant-gain learning algorithm captures the idea that the bond investor is

uncertain about the true data generating process and learns from his experiences, in

which case the algorithm has been shown to provide a robust optimal prediction rule

(Evans, Honkapohja, and Williams, 2010).
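The updating rule in (2.5) can be sketched in a few lines. This is only an illustration of the recursion, not the code behind the results: the function name, the initial belief `tau0`, and the toy input series are hypothetical choices.

```python
import numpy as np


def constant_gain_path(series, gain, tau0=0.0):
    """Iterate tau_{t+1} = tau_t + gain * (x_{t+1} - tau_t), as in (2.2) and (2.5).

    tau0 is a hypothetical initial belief; the text does not specify one.
    Element t of the output is the belief held after observing series[t].
    """
    series = np.asarray(series, dtype=float)
    tau = np.empty(series.size)
    belief = tau0
    for t, realized in enumerate(series):
        forecast_error = realized - belief          # epsilon in (2.2)
        belief = belief + gain * forecast_error     # update in (2.5)
        tau[t] = belief
    return tau


# Toy input: a constant growth rate, so beliefs converge slowly towards it.
demo_growth = np.full(500, 0.02)
demo_path = constant_gain_path(demo_growth, gain=0.016)
```

With a gain of 0.016 each new observation moves the belief by only 1.6% of the forecast error, which is why the resulting expectations track low-frequency movements rather than month-to-month noise.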

Nagel and Malmendier (2016) provide a natural micro-foundation for the constant-

gain learning algorithm. Individuals overweight consumption growth and inflation

experienced during their lifetimes compared to the macroeconomic history they did

not experience when forming expectations. This behavior implies that macroeco-

nomic history is down-weighted as new generations emerge and old generations

pass. Further, Nagel and Malmendier (2016) show that the average learning from

experience forecast matches closely that of the constant-gain learning algorithm.

Usually, constant-gain learning is motivated with structural shifts or other forms of

parameter instability. Instead, the learning from experiences motivation builds on

the psychological evidence on personal experience and availability bias. The psycho-

logical aspects of personal experiences and availability are discussed in e.g. Tversky

and Kahneman (1973).

2.3 Bond Return Predictability

In this section, I discuss the identification of the term structure factors and present

evidence on their information for current yields and expected bond returns. In particular, I examine how learning from macroeconomic experiences helps measure bond risk premia. I use data on consumption of non-durables and service goods, core CPI, and end-of-month unsmoothed Fama and Bliss (1987) yields for 1- through 10-year maturities. The data span November 1971 through December 2014, giving a total of T = 518 observations. I provide details on the data construction in the online appendix.

² I derive the equilibrium factor structure from the marginal bond investor's optimization problem in the online appendix accompanying this paper. Further, I elaborate on the restrictions on the factor loadings $A_n$, $B_{\tau,n}$, and $B_{P,n}$ that rule out arbitrage opportunities. The online appendix is available on my website or upon request.

2.3.1 Identification: Macroeconomic Expectations Factors

As described in Section 2.2, expected consumption growth and inflation are identified

solely from the macroeconomic data using the constant-gain learning algorithm. By

recursive substitution of (2.2) and (2.5), the constant-gain learning algorithm implies

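\[
\begin{aligned}
\tau_{c,t} &= \nu_c \sum_{i=0}^{t-1} (1-\nu_c)^i\, \Delta c_{t-i} \\
\tau_{\pi,t} &= \nu_\pi \sum_{i=0}^{t-1} (1-\nu_\pi)^i\, \pi_{t-i}.
\end{aligned} \tag{2.6}
\]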

In principle, the macroeconomic expectations thus depend on their entire history.

However, I follow Cieslak and Povala (2015) and truncate the sums in (2.6) after

K = 120 terms, i.e. only the most recent 10 years of data are used. As the distant

past is heavily downweighted, truncating the sums has no meaningful effect on the

measurement of τc,t and τπ,t . Varying K between 100 and 150 terms verifies that the

results are not sensitive to this choice.
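A minimal sketch of the truncated sums, assuming a monthly input series and exactly K terms in each sum. The function name and the choice to return NaN before K observations are available are illustrative assumptions.

```python
import numpy as np


def truncated_constant_gain(series, gain, K=120):
    """Truncated version of (2.6): tau_t = gain * sum_{i<K} (1-gain)^i * x_{t-i}.

    Entries with fewer than K past observations are left as NaN
    (a hypothetical convention, not taken from the text).
    """
    series = np.asarray(series, dtype=float)
    weights = gain * (1.0 - gain) ** np.arange(K)   # weight on x_{t-i}, i = 0..K-1
    tau = np.full(series.size, np.nan)
    for t in range(K - 1, series.size):
        window = series[t - K + 1 : t + 1][::-1]    # x_t, x_{t-1}, ..., x_{t-K+1}
        tau[t] = weights @ window
    return tau


# The weights sum to 1 - (1 - gain)**K rather than to one; whether they are
# renormalized is not stated in the text, so they are left as they are here.
demo_truncated = truncated_constant_gain(np.ones(200), gain=0.016, K=120)
```

Re-running such a function for a range of K values is one simple way to check the sensitivity to the truncation point.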

For now, I fix the constant-gain parameters νc = νπ = 0.016, which is consistent

with typical parameter values considered in the literature. Piazzesi et al. (2015) use a

constant-gain parameter of 0.026, Kozicki and Tinsley (2001), Faust and Wright (2013),

and Cieslak and Povala (2015) use 0.013, and Orphanides and Williams (2005) and

Nagel and Malmendier (2016) use 0.005. Hence, my choice is well within the range

of often considered values. Further, Branch and Evans (2006) find that a common

constant-gain parameter close to zero provides the best forecasting rule for inflation

and real activity measured by GDP growth. The results I present here are not sensitive

to the equality constraint, nor to varying the constant-gain parameters between 0.005

and 0.05. Figure 2.1 plots the computed expectations in blue, the cross-section of

expectations from the Survey of Professional Forecasters (SPF) in red, along with the

realized series in black.³ Consumption growth series are in the leftmost graph, and

inflation series are in the rightmost graph. For both series, the constant-gain learning

expectations capture the low-frequency movements in the underlying series.

The constant-gain learning expectations also align well with the cross-section of expectations from the Survey of Professional Forecasters. However, Figure 2.1 also highlights some limitations to using survey data directly. In particular, it is unclear which moment of the cross-sectional distribution of survey responses provides a good proxy for the marginal bond investor's expectations. Piazzesi et al. (2015), Buraschi et al. (2017), and Cieslak (2017) document that the consensus survey response implies repeated forecast errors for bond market variables, which is consistent with the interpretation that the consensus survey response is unrepresentative of the marginal bond investor's expectations. Greenwood and Shleifer (2014) document a similar result for stock market variables.

³ For Figure 2.1, I have computed the constant-gain learning expectations using real personal consumption expenditures and all-item CPI inflation for comparison with the SPF responses. All remaining analysis in this text is done using real per capita consumption of non-durables and service goods and core CPI inflation.

Figure 2.1: Consumption Growth, Inflation, and Expectations

Consumption growth is the log growth rate of real personal consumption expenditures and inflation is the log growth rate of all-item CPI for the period 1971:11 - 2014:12. Both are sampled at monthly frequency. Survey expectations are from the Survey of Professional Forecasters (SPF). Constant-gain learning expectations for consumption growth and inflation are computed as $\tau_{c,t} = \nu_c \sum_{i=0}^{K} (1-\nu_c)^i \Delta c_{t-i}$ and $\tau_{\pi,t} = \nu_\pi \sum_{i=0}^{K} (1-\nu_\pi)^i \pi_{t-i}$, respectively. Parameters are $\nu_c = \nu_\pi = 0.016$ and $K = 120$. Shaded areas indicate NBER recession dates.

Although the marginal bond investor may have irrational conditional expecta-

tions, in the sense that the constant-gain learning forecasts for consumption growth

and inflation could be systematically wrong, I find that this is in fact not the case.

Regressions of consumption growth and inflation onto their perceived conditional

expectations, i.e.

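\[
\begin{aligned}
\Delta c_{t+h} &= \rho^{(h)}_{0,c} + \rho^{(h)}_{1,c}\,\tau_{c,t} + \varepsilon_{c,t+h} \\
\pi_{t+h} &= \rho^{(h)}_{0,\pi} + \rho^{(h)}_{1,\pi}\,\tau_{\pi,t} + \varepsilon_{\pi,t+h},
\end{aligned} \tag{2.7}
\]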

show that the constant-gain learning algorithm is not significantly misspecified.

Across different forecasting horizons, the constants $\rho^{(h)}_{0,c}$ and $\rho^{(h)}_{0,\pi}$ are never significantly different from zero, and the slope coefficients $\rho^{(h)}_{1,c}$ and $\rho^{(h)}_{1,\pi}$ are never significantly different from one. Thus, in the sense of Mincer and Zarnowitz (1969), the constant-gain learning forecasts are rational forecasts of consumption growth and inflation. And so, the marginal bond investor has a weak form of rational expectations.

Table 2.1: Mincer and Zarnowitz (1969) Regressions

Results are for the regressions $\Delta c_{t+h/12} = \rho^{(h)}_{0,c} + \rho^{(h)}_{1,c}\tau_{c,t} + \varepsilon_{c,t+h/12}$ and $\pi_{t+h/12} = \rho^{(h)}_{0,\pi} + \rho^{(h)}_{1,\pi}\tau_{\pi,t} + \varepsilon_{\pi,t+h/12}$. Newey and West (1987) corrected t-statistics in absolute values using ceil(1.5 × h) lags for the null hypotheses $H_0: \rho^{(h)}_{0,i} = 0$ and $H_0: \rho^{(h)}_{1,i} = 1$ for $i \in \{c,\pi\}$ are in parentheses.

Panel A: $\Delta c_{t+h/12}$

                            h = 1       h = 3       h = 6       h = 12
$\rho^{(h)}_{0,c}$        −0.0009      0.0014      0.0048      0.0104
                          (0.3057)    (0.3401)    (0.9303)    (1.6535)
$\rho^{(h)}_{1,c}$         1.1214      0.9761      0.7610      0.3966
                          (0.6973)    (0.0974)    (0.7828)    (1.7093)
$R^2$                      0.20        0.15        0.09        0.02

Panel B: $\pi_{t+h/12}$

                            h = 1       h = 3       h = 6       h = 12
$\rho^{(h)}_{0,\pi}$      −0.0022     −0.0012      0.0002      0.0028
                          (0.8651)    (0.3370)    (0.0368)    (0.4207)
$\rho^{(h)}_{1,\pi}$       1.1674      1.1394      1.0979      1.0225
                          (1.7841)    (1.0392)    (0.5602)    (0.0955)
$R^2$                      0.59        0.56        0.52        0.44
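Regressions of the form (2.7) with Newey and West (1987) standard errors can be sketched with NumPy alone. This is an illustrative implementation under stated assumptions: the function names are hypothetical, the target is simply the series shifted by h observations (the exact construction of the h-month-ahead target used here is not reproduced), and the simulated data are made up.

```python
import math
import numpy as np


def newey_west_cov(X, resid, lags):
    """HAC covariance of OLS coefficients using Bartlett kernel weights."""
    T = X.shape[0]
    Xu = X * resid[:, None]                      # moment contributions x_t * u_t
    S = Xu.T @ Xu / T                            # lag-0 term
    for j in range(1, lags + 1):
        w = 1.0 - j / (lags + 1.0)               # Bartlett weight
        G = Xu[j:].T @ Xu[:-j] / T               # lag-j autocovariance
        S += w * (G + G.T)
    Q_inv = np.linalg.inv(X.T @ X / T)
    return Q_inv @ S @ Q_inv / T


def mincer_zarnowitz(realized, forecast, h):
    """Regress realized_{t+h} on a constant and forecast_t.

    Returns the coefficients and absolute t-statistics for
    H0: intercept = 0 and H0: slope = 1, using ceil(1.5 * h) lags.
    """
    y = np.asarray(realized, dtype=float)[h:]
    f = np.asarray(forecast, dtype=float)[:-h]
    X = np.column_stack([np.ones_like(f), f])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    se = np.sqrt(np.diag(newey_west_cov(X, resid, math.ceil(1.5 * h))))
    return beta, abs(beta[0] / se[0]), abs((beta[1] - 1.0) / se[1])


# Simulated example: a persistent forecast that is unbiased by construction.
rng = np.random.default_rng(0)
T = 2000
f_sim = np.zeros(T)
for t in range(1, T):
    f_sim[t] = 0.9 * f_sim[t - 1] + rng.normal()
realized_sim = np.zeros(T)
realized_sim[1:] = f_sim[:-1] + 0.5 * rng.normal(size=T - 1)
beta_mz, t_const, t_slope = mincer_zarnowitz(realized_sim, f_sim, h=1)
```

In the simulated example the intercept should be close to zero and the slope close to one, which is the pattern the test looks for in the actual forecasts.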

Further, the two macroeconomic expectations series combine to capture the

low-frequency movements in the yield curve very well. The top-left panel of Figure 2.2 depicts the level of the yield curve measured by a maturity-averaged yield, $\bar{y}_t = \frac{1}{10}\sum_{i=1}^{10} y^{(i)}_t$. Fitted values from the simple regression of the maturity-averaged yield

onto a constant and expected inflation are plotted in blue. Expected inflation captures

well the low-frequency variation in the level of the yield curve. The expected inflation

factor is statistically significant with a Newey and West (1987) t-statistic (corrected

using 18 lags) of 15.06 and explains 85% of the total variation.

The top-right panel of Figure 2.2 depicts the slope of the yield curve measured by

the 10-year yield spread, $y^{(10)}_t - y^{(1)}_t$. Fitted values from the simple regression of the

10-year yield spread onto a constant and expected consumption growth are plotted

in blue. Expected consumption growth captures well the low-frequency variation in

the slope of the yield curve. The expected consumption growth factor is negatively

related to the slope of the yield curve, and is statistically significant with a Newey and

West (1987) t-statistic of -5.01. Expected consumption growth explains 22% of the

total variation in the 10-year yield spread.

The bottom panel of Figure 2.2 plots the factor loadings from the maturity-specific regressions

\[
y^{(n)}_t = \alpha^{(n)} + \beta^{(n)}_c\,\tau_{c,t} + \beta^{(n)}_\pi\,\tau_{\pi,t} + e^{(n)}_t, \tag{2.8}
\]

as a function of maturity. For $n = 1,2,\ldots,10$, the $R^2$'s associated with the regressions in (2.8) range between 80% and 89%. The loadings on the inflation expectation factor range from 1.62 for the short maturities to 1.44 for the longest maturity. This is consistent with the interpretation of inflation expectations as a level factor. Loadings for consumption growth expectations are large at the short end (1.43) and smaller at the long end (0.53), and are thus consistent with an interpretation as a slope factor. Both macroeconomic expectation factors are statistically significant across all maturities.

Figure 2.2: Fitted Expectations and Factor Loadings

Yields are unsmoothed Fama-Bliss yields for the period 1971:11 - 2014:12. The fitted expectations are obtained as the fitted values from the regressions (i) $\bar{y}_t = \alpha + \beta\tau_{\pi,t} + e_t$ and (ii) $y^{(10)}_t - y^{(1)}_t = \alpha + \beta\tau_{c,t} + e_t$, where $\bar{y}_t = \frac{1}{10}\sum_{i=1}^{10} y^{(i)}_t$. Shaded areas indicate NBER recession dates. Factor loadings are from the regressions $y^{(n)}_t = \alpha^{(n)} + \beta^{(n)}_c\tau_{c,t} + \beta^{(n)}_\pi\tau_{\pi,t} + e^{(n)}_t$.

2.3.2 Identification: Yield Curve Cycle Factors

As the Pt factors are unobserved, I take them to be orthogonal to the macroeconomic

expectations. This implies that the factors can be extracted from the yield curve

information that is orthogonal to the macroeconomic expectations. This approach

is similar to Cieslak and Povala (2015). The cycle factors Pt are only identified up to

a rotation. I fix a particular rotation by assuming that the cycle factors are the first

$n_P$ principal components of the variation that is orthogonal to the macroeconomic expectations. I simply define $P_t = W_e e_t$, where $W_e$ is an $n_P \times N$ matrix of weights and $e_t$ is a vector with the $N$ yield residuals. That is, I extract maturity-specific cycle factors as the information in yields orthogonal to the macroeconomic expectations, i.e.

\[
e^{(n)}_t = y^{(n)}_t - \hat{\alpha}^{(n)} - \hat{\beta}^{(n)}_c\,\tau_{c,t} - \hat{\beta}^{(n)}_\pi\,\tau_{\pi,t}. \tag{2.9}
\]

I then perform a singular value decomposition to obtain the $P_t$'s.⁴
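The two steps, residuals from (2.9) followed by a principal component extraction, can be sketched as below. The function name, the array layout, and the simulated inputs are hypothetical. Standardizing the residuals before the decomposition is one way to work with the correlation matrix, which means the weights here apply to standardized rather than raw residuals.

```python
import numpy as np


def cycle_factors(yields, tau, n_factors=2):
    """Extract cycle factors from yield residuals.

    yields: (T, N) array of yields; tau: (T, 2) array with columns tau_c, tau_pi.
    Returns factors P (T, n_factors), weights W (n_factors, N),
    variance shares per component, and the residuals.
    """
    yields = np.asarray(yields, dtype=float)
    X = np.column_stack([np.ones(len(tau)), tau])
    coef, *_ = np.linalg.lstsq(X, yields, rcond=None)    # maturity-by-maturity OLS
    resid = yields - X @ coef                            # e_t^{(n)} as in (2.9)
    z = resid / resid.std(axis=0, ddof=1)                # correlation-matrix scaling
    _, s, Vt = np.linalg.svd(z, full_matrices=False)
    W = Vt[:n_factors]                                   # principal component weights
    P = z @ W.T                                          # P_t = W e_t, row by row
    share = s**2 / np.sum(s**2)
    return P, W, share, resid


# Simulated example with two hidden factors and small measurement noise.
rng = np.random.default_rng(1)
T, N = 400, 10
tau_sim = rng.normal(size=(T, 2))
latent = rng.normal(size=(T, 2))
loadings = np.vstack([np.ones(N), np.linspace(-1.0, 1.0, N)])
yields_sim = (0.05 + tau_sim @ rng.normal(size=(2, N))
              + latent @ loadings + 0.01 * rng.normal(size=(T, N)))
P_sim, W_sim, share_sim, resid_sim = cycle_factors(yields_sim, tau_sim)
```

By construction the extracted factors are orthogonal to the macroeconomic expectations in sample, since they are linear combinations of regression residuals.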

Table 2.2 reveals that the maturity-specific residuals from (2.9) are highly correlated across maturities, have very similar first-order autoregressive coefficients, and that the first two principal components, $P^{(1)}_t$ and $P^{(2)}_t$, explain 99% of the total variation in $e^{(1)}_t, e^{(2)}_t, \ldots, e^{(10)}_t$. For this reason, I let $n_P = 2$ and consider $P^{(1)}_t$ and $P^{(2)}_t$ as the yield curve cycle factors. Figure 2.3 plots the time series dynamics and factor loadings related to $P^{(1)}_t$ and $P^{(2)}_t$, respectively. The factor loadings admit an interpretation of $P^{(1)}_t$ as a cycle level factor and $P^{(2)}_t$ as a cycle slope factor. However, the time series dynamics of $P^{(1)}_t$ in particular are somewhat different from those of the typical level factor in affine term structure models. Although the autocorrelation of $P^{(1)}_t$ is high at 0.94, it is smaller than that of the typical level factor in affine term structure models, which is often close to unity (see Christensen, Diebold, and Rudebusch, 2011; Duffee, 2011a, among many others). The same is true for the $P^{(2)}_t$ factor: the autocorrelation is high at 0.946, but smaller than for the typical slope factor in affine term structure models.

⁴ I perform the singular value decomposition after normalizing the variance-covariance matrix to obtain the correlation matrix.

Table 2.2: Descriptive Statistics: Latent Factors

Latent factors $P_t$ are obtained from the residuals $e^{(n)}_t = y^{(n)}_t - \hat{\beta}^{(n)}_0 - \hat{\beta}^{(n)}_c\tau_{c,t} - \hat{\beta}^{(n)}_\pi\tau_{\pi,t}$. $P^{(i)}_t$ is the $i$'th principal component of the vector of yield residuals $e^{(n)}_t$ for $n = 1,2,\ldots,10$.

Panel A: Correlation matrix

          e(1)   e(2)   e(3)   e(4)   e(5)   e(6)   e(7)   e(8)   e(9)   e(10)
e(1)      1.00
e(2)      0.98   1.00
e(3)      0.94   0.99   1.00
e(4)      0.90   0.96   0.99   1.00
e(5)      0.85   0.93   0.97   0.99   1.00
e(6)      0.81   0.90   0.95   0.98   0.99   1.00
e(7)      0.78   0.87   0.93   0.96   0.98   0.99   1.00
e(8)      0.74   0.84   0.91   0.95   0.97   0.98   0.99   1.00
e(9)      0.73   0.83   0.90   0.93   0.96   0.98   0.98   0.99   1.00
e(10)     0.70   0.80   0.87   0.91   0.94   0.96   0.97   0.98   0.98   1.00

Panel B: Autocorrelation

          e(1)   e(2)   e(3)   e(4)   e(5)   e(6)   e(7)   e(8)   e(9)   e(10)
AR(1)     0.95   0.94   0.94   0.94   0.93   0.93   0.93   0.93   0.93   0.93

Panel C: Variation explained by PCs

          P(1)   P(2)   P(3)   P(4)   P(5)   P(6)   P(7)   P(8)   P(9)   P(10)
%         93.0   6.00   0.49   0.17   0.12   0.08   0.06   0.04   0.03   0.02

Figure 2.3: Cycle Factors and Factor Loadings

This figure plots the time series of $P^{(1)}_t$ and $P^{(2)}_t$, where $P^{(1)}_t$ and $P^{(2)}_t$ are obtained from the singular value decomposition $P_t = W_e e_t$. Here, $e_t$ is the vector of residuals from the restricted regressions, i.e. $e^{(n)}_t = y^{(n)}_t - \hat{\alpha}^{(n)} - \hat{\beta}^{(n)}_c\tau_{c,t} - \hat{\beta}^{(n)}_\pi\tau_{\pi,t}$. The factor loadings are obtained from the regressions $y^{(n)}_t = \alpha^{(n)} + \beta^{(n)}_c\tau_{c,t} + \beta^{(n)}_\pi\tau_{\pi,t} + \beta^{(n)}_1 P^{(1)}_t + \beta^{(n)}_2 P^{(2)}_t + e^{(n)}_t$.

2.3.3 An Excess Return Predictor

Macroeconomic expectations and yield curve cycles provide a useful decomposition

of the yield curve for measuring bond risk premia. The standard practice for assessing

the usefulness is predictive regressions of the general form

\[
rx^{(n)}_{t+1} = Z_t\theta^{(n)} + u^{(n)}_{t+1}, \tag{2.10}
\]

where $p^{(n)}_t$ denotes the $n$-year log bond price at time $t$ and $rx^{(n)}_{t+1} = p^{(n-1)}_{t+1} - p^{(n)}_t + p^{(1)}_t$ denotes the excess holding return: borrow in the 1-year bond, buy an $n$-year bond, and sell it one year later. $Z_t$ denotes a set of common explanatory variables. The model-implied explanatory variables for the predictive regressions are $Z_t = [1\ \ \tau_{c,t}\ \ \tau_{\pi,t}\ \ P^{(1)}_t\ \ P^{(2)}_t]$. I use these variables to construct a single excess return forecasting factor by running regressions of the form

\[
\overline{rx}_{t+1} = Z_t\theta + u_{t+1}, \tag{2.11}
\]

and obtain the fitted values as in Cochrane and Piazzesi (2005). The left-hand-side variable is the maturity-averaged excess return, $\overline{rx}_t = \frac{1}{9}\sum_{i=2}^{10} rx^{(i)}_t$.

Table 2.3: Excess Return Forecasting Factor Identifying Regression

This table reports regression coefficients for $\overline{rx}_{t+1} = \theta_0 + \theta_c\tau_{c,t} + \theta_\pi\tau_{\pi,t} + \theta_1 P^{(1)}_t + \theta_2 P^{(2)}_t + u_{t+1}$. Forecasting horizon is yearly, but observations are sampled monthly. Newey and West (1987) corrected t-statistics using 18 lags are in parentheses () and Hansen and Hodrick (1980) corrected t-statistics using 12 lags are in brackets []. $\chi^2_{NW}$ and $\chi^2_{HH}$ are Wald test statistics for the null hypothesis $H_0: \theta_c = \theta_\pi = \theta_1 = \theta_2 = 0$, where the variance-covariance matrix is Newey and West (1987) and Hansen and Hodrick (1980) corrected using 18 and 12 lags, respectively. The 5-percent and 1-percent critical values for a $\chi^2$ distribution with 2 degrees of freedom are 5.99 and 9.21, respectively. The 5-percent and 1-percent critical values for a $\chi^2$ distribution with 3 degrees of freedom are 7.81 and 11.35, respectively. The 5-percent and 1-percent critical values for a $\chi^2$ distribution with 4 degrees of freedom are 9.49 and 13.28, respectively.

Regression output

       θ0 × 100     θc          θπ          θ1          θ2          χ2(NW)   χ2(HH)   R2
(1)     6.3527     -2.3326     -0.1896      -           -            3.90     3.07    0.04
       (2.6490)   (-1.7133)   (-0.2682)     -           -
       [2.3333]   [-1.5288]   [-0.2347]     -           -
(2)     1.7529      -           -           1.4820     -3.7313      61.2     51.3     0.48
       (2.5697)     -           -          (7.7245)   (-5.9825)
       [2.2371]     -           -          [7.1352]   [-5.2526]
(3)     5.6881     -2.3739      -           1.4727     -3.7138     101.4     85.8     0.52
       (4.0576)   (-2.9925)     -          (8.6278)   (-6.6317)
       [3.5824]   [-2.6483]     -          [8.1041]   [-5.8389]
(4)     6.0004     -2.1697     -0.1783      1.4723     -3.7130      97.6     81.2     0.52
       (3.7788)   (-2.4534)   (-0.3605)    (8.2961)   (-6.6890)
       [3.3063]   [-2.1822]   [-0.3137]    [7.7424]   [-5.8920]

I consider four specifications. First, I report a restricted version using only the two factors capturing

macroeconomic expectations. Neither of the two factors is a statistically significant

predictor. The t-statistics for $\tau_{c,t}$ and $\tau_{\pi,t}$ are -1.71 (-1.53) and -0.27 (-0.23) using

Newey and West (1987) (Hansen and Hodrick, 1980) corrected standard errors with

18 (12) lags, respectively. The Wald test cannot reject the null of joint insignificance

at any of the usual significance levels. This result suggests that the macroeconomic

expectations predominantly capture the expectations hypothesis components in long-term

yields. In a second restricted version, I use only the two latent factors. These two

factors combine to capture an impressive 48% of the total variation in the maturity-

averaged excess returns. Both factors are highly statistically significant predictors.

The t-statistics for $P^{(1)}_t$ and $P^{(2)}_t$ are 7.72 (7.14) and -5.98 (-5.25), respectively. The

Wald test for joint insignificance rejects at minuscule significance levels. This result

suggests that the cycle factors are important sources of risk premium variation. Next, I

include conditional expectations for consumption growth. The predictive coefficient

is -2.37 and it is significant with a t-statistic of -2.99 (-2.65). Augmenting the two

latent factors with expected consumption growth increases the explained variation

to 52% of the total variation in maturity-averaged excess returns. Finally, including

inflation expectations as well does not add any significant explanatory power. In fact,

the predictive coefficient for τπ,t is statistically insignificant for any conventional

significance level. For these reasons, I use the third specification and construct the single excess return forecasting factor as $RF_t = \hat{\theta}_0 + \hat{\theta}_c\tau_{c,t} + \hat{\theta}_1 P^{(1)}_t + \hat{\theta}_2 P^{(2)}_t$. Although $\tau_{\pi,t}$ does not directly affect $RF_t$, it is still critical for identifying $P^{(1)}_t$ and $P^{(2)}_t$. Thus, $\tau_{\pi,t}$ still indirectly affects $RF_t$.
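The excess return definition under (2.10) and the fitted-value construction in (2.11) can be sketched as follows. The sketch assumes annualized log yields in decimals on a monthly grid, so that log prices are $p^{(n)}_t = -n\,y^{(n)}_t$ and a one-year holding period spans 12 rows. The function names and demo inputs are hypothetical, and only plain OLS is used.

```python
import numpy as np


def excess_returns(yields, maturities, horizon=12):
    """One-year excess returns rx^{(n)}_{t+1} = p^{(n-1)}_{t+1} - p^{(n)}_t + p^{(1)}_t.

    yields: (T, N) log yields; maturities: consecutive integer years starting at 1.
    Returns a dict mapping n to a series of length T - horizon.
    """
    yields = np.asarray(yields, dtype=float)
    mats = np.asarray(maturities)
    prices = -mats * yields                           # p_t^{(n)} = -n * y_t^{(n)}
    col = {int(n): j for j, n in enumerate(mats)}
    rx = {}
    for n in (int(m) for m in mats if m >= 2):
        p_next = prices[horizon:, col[n - 1]]         # p_{t+1}^{(n-1)}
        p_now = prices[:-horizon, col[n]]             # p_t^{(n)}
        p_short = prices[:-horizon, col[1]]           # p_t^{(1)}
        rx[n] = p_next - p_now + p_short
    return rx


def return_forecasting_factor(rx_bar, Z):
    """Fitted values and coefficients from regressing rx_bar on a constant and Z.

    Z must already be aligned with the start of the holding period,
    for example Z[:-12] when rx_bar comes from excess_returns above.
    """
    X = np.column_stack([np.ones(len(Z)), Z])
    theta, *_ = np.linalg.lstsq(X, rx_bar, rcond=None)
    return X @ theta, theta


# Flat and constant yield curve: every excess return is zero.
rx_flat = excess_returns(np.full((36, 10), 0.05), np.arange(1, 11))
rx_bar_flat = np.mean([rx_flat[n] for n in range(2, 11)], axis=0)

# Exact linear relation: the regression recovers the coefficients.
rng = np.random.default_rng(2)
Z_demo = rng.normal(size=(50, 3))
target_demo = 0.01 + Z_demo @ np.array([1.0, -2.0, 0.5])
fitted_demo, theta_demo = return_forecasting_factor(target_demo, Z_demo)
```

With monthly sampling and a yearly horizon the residuals overlap, which is why the reported t-statistics use Newey and West (1987) and Hansen and Hodrick (1980) corrections rather than plain OLS standard errors.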

I compute an additional return factor $RF^{\star}_t$ as the fitted values from the predictive regression

\[
\overline{rx}_{t+1} = \underset{(3.82)}{0.06} - \underset{(-2.45)}{2.17}\,\tau^{\star}_{c,t} - \underset{(-0.37)}{0.18}\,\tau^{\star}_{\pi,t} + \underset{(8.22)}{1.47}\,P^{(1),\star}_t - \underset{(6.68)}{3.69}\,P^{(2),\star}_t + u_{t+1}, \tag{2.12}
\]

where stars indicate that the constant-gain parameters $\nu_c$, $\nu_\pi$ are calibrated to minimize the squared prediction errors. That is, the constant-gain parameters are calibrated to bond market data, which exactly reflect the expectations of the marginal bond investor. The optimal constant-gain parameters are $\nu^{\star}_c = 0.0158$ and $\nu^{\star}_\pi = 0.0164$, and these values are of course reflected in the choice of the common constant-gain parameter $\nu_c = \nu_\pi = 0.016$ used throughout. Imposing a common constant-gain parameter of $\nu_c = \nu_\pi = 0.016$ comes at a fairly low loss in predicted variation. The regression in (2.12) achieves $R^2 = 0.52$, which is only marginally higher than the $R^2$ from specifications three and four in Table 2.3.

2.3.4 Other Excess Return Predictors

I consider three sets of prominent excess return predictors identified in the literature.

These predictors are based on the studies of Litterman and Scheinkman (1991),

Cochrane and Piazzesi (2005), and Cieslak and Povala (2015), respectively.

The first set of predictors comprises the contemporaneous information in the yield curve summarized by the first three principal components, commonly referred to as level, slope, and curvature. This is a widely used model for decompositions of the yield curve. That is, $Z_t = [1\ \ PC_{1,t}\ \ PC_{2,t}\ \ PC_{3,t}]$.

The second set of predictors summarizes the information in the current yield curve by forward rates. I use five forward rates with maturities $n = 1,2,3,4,5$. That is, $Z_t = [1\ \ f^{(1,1)}_t\ \ f^{(2,1)}_t\ \ f^{(3,1)}_t\ \ f^{(4,1)}_t\ \ f^{(5,1)}_t]$, where $f^{(n,h)}_t$ denotes the forward rate in period $t$ for loans between period $t+n-h$ and $t+n$.
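One-year forward rates can be backed out from zero-coupon yields. The sketch below assumes the standard log definition $f^{(n,1)}_t = p^{(n-1)}_t - p^{(n)}_t$ with $p^{(0)}_t = 0$ and $p^{(n)}_t = -n\,y^{(n)}_t$. That formula is not spelled out in the text, and the function name and array layout are hypothetical.

```python
import numpy as np


def one_year_forward_rates(yields, max_n=5):
    """Forward rates f^{(n,1)}_t for n = 1..max_n from annual-maturity log yields.

    yields: (T, N) array whose column j holds the (j+1)-year yield, N >= max_n.
    """
    yields = np.asarray(yields, dtype=float)
    T = yields.shape[0]
    mats = np.arange(1, max_n + 1)
    # Log prices, with a leading column of zeros for the maturing bond p^{(0)}.
    prices = np.column_stack([np.zeros(T), -mats * yields[:, :max_n]])
    return prices[:, :-1] - prices[:, 1:]        # p^{(n-1)} - p^{(n)}


# A 2% one-year yield and a 3% two-year yield imply a 4% forward rate for year two.
fwd_demo = one_year_forward_rates(np.array([[0.02, 0.03]]), max_n=2)
fwd_flat = one_year_forward_rates(np.full((3, 10), 0.05))
```

The first forward rate equals the one-year yield, so the five forward rates span the same information as the first five yields.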

The final set of predictors captures the information in the current yield curve that is orthogonal to constant-gain inflation expectations. To this end, the predictors are constrained versions of the latent factors $P_t$ identified in this paper, obtained by imposing $\beta^{(n)}_c = 0$ in (2.8). Cieslak and Povala (2015) label these maturity-specific cycle factors $c^{(n)}_t$, and summarize their information by $c^{(1)}_t$ and $\bar{c}_t = \frac{1}{9}\sum_{i=2}^{10} c^{(i)}_t$. That is, $Z_t = [1\ \ c^{(1)}_t\ \ \bar{c}_t]$.

I obtain a single return predicting factor associated with each of the three sets of yield curve factors $Z_t$ as the fitted values from regressions of the form in (2.11). These three predictive factors are labelled $PC_t$, $CP_t$, and $CF_t$, respectively. Three principal components of the yield curve, five forward rates, and two maturity-specific cycle factors capture 17%, 20%, and 43% of the total variation in maturity-averaged excess returns, respectively.

2.3.5 Predictive Regressions

Table 2.4 and Table 2.5 show the results from the predictive regressions in (2.10) with a constant and $PC_t$, $CP_t$, $CF_t$, $RF_t$, and $RF^{\star}_t$ as predictors, respectively. Table 2.4 covers the full November 1971 - December 2014 sample. Table 2.5 takes the apparent yield curve shift documented in Rudebusch and Wu (2007) into account by considering the shorter January 1990 - December 2014 sample. The results suggest that all four excess return predictors have significant predictive power for all maturities considered. However, judged by the reported t-statistics and predictive $R^2$'s, the $RF_t$ factor is the strongest predictor across all maturities. Over the full sample, accounting for consumption growth expectations improves the predictive power of the $CF_t$ factor by 4-5 percentage points for the shortest maturities and by 7 percentage points for the longest maturities. In relative terms, this amounts to sizeable improvements in the range of 10−19% depending on the maturity considered. Compared to the $PC_t$ and $CP_t$ factors, the predictive power more than doubles for all maturities by considering the $RF_t$ factor. Over the shorter sample, the improvements in predictive power over the $CF_t$ factor are even more pronounced. The $RF_t$ factor improves on the $CF_t$ factor by 5−7 percentage points for the shortest maturities and by as much as 20 percentage points for the longest maturities. In relative terms, the improvements are as large as 32−57% depending on the maturity considered. Compared to the $PC_t$ and $CP_t$ factors, the predictive power more than doubles for all maturities except for the 2-year bond by considering the $RF_t$ factor.

The four excess return predictors naturally share some of the same predictive

information. This is the case, since they are all constructed using the same yield

curve and (potentially) macroeconomic data. However, it is possible that the different

predictors complement each other. For this reason, I run the multivariate regressions

\[
rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{RF}\,RF_t + \theta^{(n)}_F\,F_t + u^{(n)}_{t+1}, \tag{2.13}
\]

where $F_t$ in turn refers to $PC_t$, $CP_t$, and $CF_t$. I present the regression results based on

the longer sample, since this includes the Volcker inflation event and thus is the most

challenging sample. The results are similar or even stronger on the shorter sample.


Table 2.4: Predictive Regressions, 1971-2015

This table reports regression coefficients for $rx^{(n)}_{t+1} = Z_t\theta^{(n)} + u^{(n)}_{t+1}$. Forecasting horizon is yearly, but observations are sampled monthly. Newey and West (1987) corrected t-statistics using 18 lags are in parentheses () and Hansen and Hodrick (1980) corrected t-statistics using 12 lags are in brackets [].

Maturities    2          3          4          5          6          7          8          9          10

Panel A: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{PC} PC_t + u^{(n)}_{t+1}$

θ(PC)         0.2067     0.3964     0.6156     0.8267     1.0372     1.2052     1.4411     1.5794     1.6916
             (2.6865)   (2.7124)   (2.9553)   (3.2749)   (3.3838)   (3.3864)   (3.7655)   (3.7299)   (3.4650)
             [2.3667]   [2.3760]   [2.5854]   [2.8684]   [2.9710]   [2.9726]   [3.3034]   [3.2835]   [3.0464]
R2            0.10       0.11       0.13       0.15       0.17       0.17       0.19       0.19       0.17

Panel B: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{CP} CP_t + u^{(n)}_{t+1}$

θ(CP)         0.2157     0.4295     0.6719     0.8366     1.0404     1.2053     1.4036     1.5201     1.6771
             (3.7464)   (3.8851)   (4.3975)   (4.4450)   (4.5814)   (4.6946)   (4.9545)   (4.7644)   (4.7869)
             [3.3321]   [3.4343]   [3.8833]   [3.9252]   [4.0813]   [4.1865]   [4.4253]   [4.2785]   [4.2726]
R2            0.13       0.15       0.19       0.19       0.20       0.20       0.21       0.21       0.20

Panel C: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{CF} CF_t + u^{(n)}_{t+1}$

θ(CF)         0.2345     0.4510     0.6588     0.8412     1.0301     1.2066     1.3906     1.5213     1.6660
             (5.8566)   (6.3891)   (6.8434)   (7.2674)   (7.6226)   (8.0163)   (8.0546)   (7.8907)   (7.8089)
             [5.2505]   [5.7265]   [6.1187]   [6.5173]   [6.8757]   [7.2428]   [7.3064]   [7.1624]   [7.0939]
R2            0.33       0.37       0.40       0.42       0.44       0.46       0.47       0.46       0.44

Panel D: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{RF} RF_t + u^{(n)}_{t+1}$

θ(RF)         0.2302     0.4450     0.6486     0.8336     1.0236     1.2015     1.4005     1.5293     1.6876
             (6.2104)   (7.3131)   (8.1573)   (8.8618)   (9.4483)  (10.1541)  (10.3266)   (9.9930)  (10.3324)
             [5.5887]   [6.6121]   [7.3809]   [8.0615]   [8.6629]   [9.3382]   [9.5838]   [9.2297]   [9.6175]
R2            0.37       0.41       0.44       0.47       0.50       0.52       0.54       0.53       0.51

Panel E: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{RF^{\star}} RF^{\star}_t + u^{(n)}_{t+1}$

θ(RF⋆)        0.2278     0.4426     0.6476     0.8328     1.0221     1.2043     1.3989     1.5269     1.6972
             (5.9877)   (7.0617)   (7.8005)   (8.4570)   (9.0050)   (9.6335)   (9.8434)   (9.5450)   (9.8765)
             [5.3840]   [6.3777]   [7.0408]   [7.6625]   [8.2178]   [8.8096]   [9.0782]   [8.7721]   [9.1315]
R2            0.36       0.41       0.44       0.47       0.50       0.52       0.54       0.53       0.52

Table 2.5: Predictive Regressions, 1990-2015

This table reports regression coefficients for $rx^{(n)}_{t+1} = Z_t\theta^{(n)} + u^{(n)}_{t+1}$. Forecasting horizon is yearly, but observations are sampled monthly. Newey and West (1987) corrected t-statistics using 18 lags are in parentheses () and Hansen and Hodrick (1980) corrected t-statistics using 12 lags are in brackets [].

Maturities    2          3          4          5          6          7          8          9          10

Panel A: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{PC} PC_t + u^{(n)}_{t+1}$

θ(PC)         0.2232     0.4038     0.6259     0.8311     1.0109     1.2145     1.4076     1.5636     1.7194
             (2.0039)   (2.0695)   (2.4189)   (2.7856)   (2.9673)   (3.2068)   (3.5625)   (3.7857)   (3.8269)
             [1.7360]   [1.7964]   [2.1087]   [2.4482]   [2.6307]   [2.8697]   [3.2340]   [3.4892]   [3.5287]
R2            0.11       0.11       0.13       0.15       0.15       0.17       0.18       0.18       0.19

Panel B: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{CP} CP_t + u^{(n)}_{t+1}$

θ(CP)         0.2227     0.4078     0.6407     0.8535     1.0287     1.2306     1.4060     1.5548     1.6552
             (2.3551)   (2.5280)   (3.0302)   (3.5498)   (3.7550)   (4.0599)   (4.4149)   (4.6013)   (4.4917)
             [2.0375]   [2.1933]   [2.6406]   [3.1313]   [3.3537]   [3.6730]   [4.0787]   [4.3331]   [4.2470]
R2            0.13       0.12       0.15       0.17       0.18       0.19       0.20       0.20       0.19

Panel C: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{CF} CF_t + u^{(n)}_{t+1}$

θ(CF)         0.1572     0.3715     0.6015     0.8346     1.0439     1.2551     1.4281     1.5839     1.6552
             (2.3912)   (2.9137)   (3.4706)   (4.0743)   (4.5679)   (5.0014)   (5.3250)   (5.4416)   (5.5712)
             [2.1453]   [2.6160]   [3.1198]   [3.6915]   [4.1908]   [4.6390]   [4.9895]   [5.1587]   [5.3296]
R2            0.11       0.17       0.23       0.28       0.31       0.34       0.35       0.36       0.36

Panel D: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{RF} RF_t + u^{(n)}_{t+1}$

θ(RF)         0.1491     0.3549     0.5820     0.8148     1.0337     1.2476     1.4307     1.6114     1.7759
             (3.3827)   (4.2065)   (5.1274)   (6.0775)   (6.7765)   (7.4030)   (7.7950)   (8.1552)   (8.3024)
             [3.1809]   [3.9295]   [4.7679]   [5.6782]   [6.4071]   [7.0624]   [7.5099]   [7.9237]   [8.1237]
R2            0.14       0.23       0.31       0.39       0.45       0.49       0.52       0.54       0.56

Panel E: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{RF^{\star}} RF^{\star}_t + u^{(n)}_{t+1}$

θ(RF⋆)        0.1569     0.3596     0.5865     0.8151     1.0317     1.2434     1.4287     1.6107     1.7675
             (3.4902)   (4.3139)   (5.2037)   (6.0681)   (6.7201)   (7.2527)   (7.6151)   (7.9892)   (8.1217)
             [3.2458]   [4.0108]   [4.8154]   [5.6413]   [6.3212]   [6.8763]   [7.2817]   [7.7078]   [7.9039]
R2            0.16       0.24       0.32       0.40       0.45       0.50       0.53       0.55       0.56


In Figure 2.4, I plot the p-values associated with the null hypothesis $H_0: \theta^{(n)}_F = 0$. Across all maturities, the null hypothesis cannot be rejected at any conventional significance level. These results are clear-cut evidence that $PC_t$, $CP_t$, and $CF_t$ do not contain additional predictive power, as $RF_t$ subsumes all information embedded in these predictors.

Figure 2.4: Additional Information in Prominent Excess Return Predictors

This figure plots the p-values associated with the null hypothesis $H_0: \theta^{(n)}_F = 0$, where $\theta^{(n)}_F$ is estimated from the regressions $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{RF} RF_t + \theta^{(n)}_F F_t + u^{(n)}_{t+1}$, with $F_t$ being $PC_t$, $CP_t$, and $CF_t$ in turn. The leftmost graph is based on Newey and West (1987) corrected standard errors with 18 lags, and the rightmost graph is based on Hansen and Hodrick (1980) corrected standard errors with 12 lags.

2.3.6 The Unspanned Factor Phenomenon

The RFt factor builds on a four-factor structure. Principal component analysis of the

yield curve should capture this. However, allowing for a fourth principal component

in the PCt factor does not change the results much. In fact, even a fifth factor does

not add much. Essentially, a five-factor model is equivalent to the C Pt factor, which

is built from five forward rates.

Consider, for the sake of exposition, a two-factor affine model. That is, the yield curve has the form

$$y^{(n)}_t = A_n + B_{1,n} P_{1,t} + B_{2,n} P_{2,t}, \tag{2.14}$$

where $A_n$, $B_{1,n}$, and $B_{2,n}$ are factor loadings. Suppose that $B_{2,n} = aB_{1,n}$. Then the model in equation (2.14) does not allow recovery of $P_{1,t}$ and $P_{2,t}$ using yield data only. Instead, the model suggests a one-factor structure, i.e.

$$y^{(n)}_t = A_n + B_{1,n} P_{1,t} + aB_{1,n} P_{2,t} = A_n + B_{1,n} P_t, \tag{2.15}$$

where $P_t = P_{1,t} + aP_{2,t}$. Thus, the model in equation (2.15) is only invertible for $P_t$. Assuming dynamics for $P_t$ of AR(1) form, $P_{t+1} = \phi P_t + v_{t+1}$, the one-factor model implies yield forecasts of the form

$$E_t\left[y^{(n)}_{t+h} \mid P_t\right] = A_n + B_{1,n}\phi^h P_t. \tag{2.16}$$

These forecasts may be severely misspecified if the two pricing factors $P_{1,t}$ and $P_{2,t}$ have distinct dynamics, e.g. $P_{i,t+1} = \phi_i P_{i,t} + v_{i,t+1}$ for $i \in \{1,2\}$ and $\phi_1 \neq \phi_2$. In this case, the correct forecast would be

$$E_t\left[y^{(n)}_{t+h} \mid P_{1,t}, P_{2,t}\right] = A_n + B_{1,n}\phi_1^h P_{1,t} + B_{2,n}\phi_2^h P_{2,t}. \tag{2.17}$$

Hence, forecasting solely based on $P_t$ induces the predictable forecast error

$$E_t\left[y^{(n)}_{t+h} \mid P_{1,t}, P_{2,t}\right] - E_t\left[y^{(n)}_{t+h} \mid P_t\right] = B_{1,n}\left(\phi_1^h - \phi^h\right) P_{1,t} + B_{2,n}\left(\phi_2^h - \phi^h\right) P_{2,t}. \tag{2.18}$$
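As a quick numerical check of equation (2.18), the sketch below computes the two forecasts (net of the common intercept $A_n$) for one set of parameter values and compares their difference with the closed-form expression. All parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def forecast_gap(B1, a, phi1, phi2, phi, h, P1, P2):
    """Return (difference of forecasts, right-hand side of (2.18)).

    The intercept A_n is common to both forecasts and cancels.
    """
    B2 = a * B1                                       # collinear loadings
    two_factor = B1 * phi1**h * P1 + B2 * phi2**h * P2   # (2.17) minus A_n
    one_factor = B1 * phi**h * (P1 + a * P2)             # (2.16) minus A_n
    closed_form = (B1 * (phi1**h - phi**h) * P1
                   + B2 * (phi2**h - phi**h) * P2)
    return two_factor - one_factor, closed_form

# Hypothetical values: a persistent and a less persistent factor.
gap, formula = forecast_gap(B1=1.0, a=0.5, phi1=0.99, phi2=0.90,
                            phi=0.95, h=5, P1=1.0, P2=1.0)

# With identical dynamics the one-factor forecast is correct.
gap_same, _ = forecast_gap(B1=1.0, a=0.5, phi1=0.95, phi2=0.95,
                           phi=0.95, h=5, P1=1.0, P2=1.0)
```

The forecast error vanishes only when both factors share the persistence assumed for the aggregate factor, which is the sense in which collinear loadings hide information that matters for forecasting.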

If one of the factors were observed, say $P_{1,t}$, then equation (2.14) could be inverted for $P_{2,t}$. Controlling for $P_{1,t}$ in the forecasting regression would then eliminate the forecast error, although it would leave the cross-sectional yield curve fit unchanged. Following this line of logic, the macroeconomic factors might appear as risks unspanned by the current yield curve even if they are perfectly priced into the term structure. The invertibility issue naturally generalizes to more than two factors and potentially explains why the $RF_t$ factor provides superior forecasts. I find that both $\tau_{\pi,t}$ and $P^{(1)}_t$ are level factors, and both $\tau_{c,t}$ and $P^{(2)}_t$ are slope factors. To provide some evidence on this, I run the regressions

$$\begin{aligned}
PC_{1,t} &= \xi_{0,1} + \xi_{1,1}\tau_{c,t} + \xi_{2,1}\tau_{\pi,t} + \xi_{3,1}P^{(1)}_t + \vartheta_{1,t} \\
PC_{2,t} &= \xi_{0,2} + \xi_{1,2}\tau_{c,t} + \xi_{2,2}\tau_{\pi,t} + \xi_{3,2}P^{(2)}_t + \vartheta_{2,t}.
\end{aligned} \tag{2.19}$$

The $R^2$ of the first regression is literally equal to 1, and the $R^2$ from the second regression is 0.998. This strongly suggests that the usual level and slope factors are linear combinations of the term structure factors identified in this paper. Further, restricted versions of the regressions in (2.19), imposing $\xi_{1,1} = \xi_{1,2} = \xi_{2,1} = \xi_{2,2} = 0$, show that the latent factors $P^{(1)}_t$ and $P^{(2)}_t$ contain very different information from that in the level $PC_{1,t}$ and slope $PC_{2,t}$ factors. These restricted regressions have $R^2$'s of 0.12 and 0.71, respectively. Finally, the level $PC_{1,t}$ and slope $PC_{2,t}$ factors have autoregressive coefficients of 0.995 and 0.962, respectively. In comparison, the cycle factors $P^{(1)}_t$ and $P^{(2)}_t$ have autoregressive coefficients of 0.940 and 0.946, respectively. Thus, the macroeconomic factors capture the low-frequency variation in the level and slope factors, and the yield curve cycle factors capture the higher-frequency variation.


The restrictions in Duffee (2011b) and Joslin et al. (2014) for non-invertibility of

the factor loading matrix are not exact in the data, and so in theory (linear combinations

of) the macroeconomic factors should be identifiable from yield data. However, the

studies by Duffee (2011b), Cieslak and Povala (2015), and Bauer and Rudebusch

(2017b) document that it is difficult to recover such factors when yields are measured

with small measurement errors. In particular, this is expected to be the case when the

factor loading matrix has close to linearly dependent columns, as I document is the

case here.

2.3.7 The Bauer and Hamilton (2018) Test

The predictive regressions for bond returns are known to be plagued by small-sample distortions. I have applied both Newey and West (1987) and Hansen and Hodrick (1980) corrections when computing standard errors to account for these problems. However, Bauer and Hamilton (2018) argue that even these robust standard errors are subject to serious small-sample distortions. To overcome the problem, they propose a bootstrap procedure. The bootstrap is conducted under the null hypothesis that only the first three principal components contain predictive information about future bond returns. I follow their example and implement the bootstrap procedure. The intuition is to re-sample from the first three principal components and construct bootstrap yield curves. For each bootstrap sample, I re-run the predictive regressions with the additional proposed factors, which by construction should have no additional explanatory power. I compute the critical value for the Wald test as the 95th percentile of the bootstrap distribution of $\chi^2_{NW}$ and $\chi^2_{HH}$ for each maturity. That is, the value of the Wald statistic that one would expect to exceed less than 5% of the time if the three principal components were the only factors with predictive power for excess returns.⁵ In Figure 2.5, I plot the bootstrapped critical values along with the empirical values for the Wald test statistics. The bootstrap exercise indicates severe small-sample distortions, as argued by Bauer and Hamilton (2018). In fact, the critical value at the 5 percent significance level is more than three times larger than what asymptotic theory suggests. However, the empirical Wald test statistics are far bigger than the critical values. Thus, the null that the first three principal components of yields are the only factors with predictive power for excess returns is strongly rejected. Accordingly, the improvement in performance from considering the proposed forecasting factor is not an artifact of small-sample distortions.
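The logic of a bootstrap under the null can be sketched as follows. This is a deliberately simplified illustration and not the Bauer and Hamilton (2018) procedure itself: it assumes VAR(1) dynamics with i.i.d. resampled residuals, simulates the additional factors independently of the principal components, generates returns from a fitted regression on the principal components rather than from bootstrap yield curves, and uses a homoskedastic Wald statistic instead of the Newey-West or Hansen-Hodrick versions. All function names and the simulated data are hypothetical.

```python
import numpy as np

def fit_var1(Z):
    """OLS VAR(1): Z_t = c + Phi Z_{t-1} + e_t. Returns (c, Phi, residuals)."""
    X = np.column_stack([np.ones(len(Z) - 1), Z[:-1]])
    coef, *_ = np.linalg.lstsq(X, Z[1:], rcond=None)
    return coef[0], coef[1:].T, Z[1:] - X @ coef

def simulate_var1(c, Phi, resid, z0, T, rng):
    """Residual-bootstrap path of length T started at z0."""
    Z = np.empty((T, len(z0)))
    Z[0] = z0
    draws = resid[rng.integers(0, len(resid), size=T - 1)]
    for t in range(1, T):
        Z[t] = c + Phi @ Z[t - 1] + draws[t - 1]
    return Z

def ssr(y, X):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return e @ e

def wald_stat(y, X_null, X_extra):
    """Homoskedastic Wald statistic for excluding X_extra."""
    ssr_r = ssr(y, X_null)
    ssr_u = ssr(y, np.column_stack([X_null, X_extra]))
    return len(y) * (ssr_r - ssr_u) / ssr_u

def bootstrap_critical_value(y, pcs, extra, n_boot=500, level=95, seed=0):
    """Percentile of the Wald statistic when only pcs predict y."""
    rng = np.random.default_rng(seed)
    T = len(y)
    X_null = np.column_stack([np.ones(T), pcs])
    b, *_ = np.linalg.lstsq(X_null, y, rcond=None)
    u = y - X_null @ b
    var_pc, var_f = fit_var1(pcs), fit_var1(extra)
    stats = np.empty(n_boot)
    for i in range(n_boot):
        pcs_b = simulate_var1(*var_pc, pcs[0], T, rng)
        extra_b = simulate_var1(*var_f, extra[0], T, rng)  # independent of pcs_b
        Xb = np.column_stack([np.ones(T), pcs_b])
        y_b = Xb @ b + u[rng.integers(0, T, size=T)]       # null imposed here
        stats[i] = wald_stat(y_b, Xb, extra_b)
    return np.percentile(stats, level)

# Simulated illustration with persistent predictors (hypothetical data).
rng = np.random.default_rng(1)
T = 200
pcs, extra = np.zeros((T, 3)), np.zeros((T, 2))
for t in range(1, T):
    pcs[t] = 0.9 * pcs[t - 1] + rng.standard_normal(3)
    extra[t] = 0.9 * extra[t - 1] + rng.standard_normal(2)
y = pcs @ np.array([0.5, -0.3, 0.1]) + rng.standard_normal(T)
crit = bootstrap_critical_value(y, pcs, extra, n_boot=200)
```

The empirical Wald statistic would then be compared with `crit` rather than with the asymptotic chi-squared critical value.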

⁵ I provide details on the bootstrap procedure in the online appendix.

Figure 2.5: The Bauer and Hamilton (2018) Bootstrap Test

This figure plots bootstrapped critical values for the Wald statistics $\chi^2_{NW}$ and $\chi^2_{HH}$ for testing the hypothesis that the factors $\tau_{c,t}$, $P^{(1)}_t$, and $P^{(2)}_t$ have no predictive power beyond that in the first three principal components of yields. Formally, the critical values are for the hypothesis $H_0: \theta^{(n)}_F = 0$ in the regressions $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{PC} PC_t + \theta^{(n)}_F F_t + u^{(n)}_{t+1}$, where $PC_t = [PC_{1,t}\ PC_{2,t}\ PC_{3,t}]'$ and $F_t = [\tau_{c,t}\ P^{(1)}_t\ P^{(2)}_t]'$.

2.3.8 Out-of-Sample Predictability: Excess Returns

Although the in-sample projections show significant predictive power of the return forecasting factor, it is unclear if the predictive gains could have been realized over the sample period. In fact, it has proven remarkably difficult to outperform the no-predictability benchmark, see e.g. Thornton and Valente (2012). For this reason, I perform a pseudo out-of-sample exercise. The predictive ability is evaluated over the latest 25 years of data, i.e. I start the test period in January 1990. Factor identification and predictive coefficients are recursively updated on the expanding training sample. For comparison, I implement univariate regressions of each of the excess returns onto the forecasting factors $PC_t$, $CP_t$, $CF_t$, and $RF_t$. The $RF_t$ factor is based on the constant-gain parameters $\nu_c = \nu_\pi = 0.016$, and thus is subject to some look-ahead bias. I also implement an $RF^\star_t$ factor, where the constant-gain parameters are calibrated to the initial training sample (November 1971 - January 1990). I present the results with root mean squared forecast errors as the measure of forecast performance. All numbers are relative to the root mean squared forecasting errors from using the historical average for the same maturity as forecast, i.e. relative to a weak expectation hypothesis benchmark that rules out predictability. This means that numbers less than one imply that the specific model's predictions outperform the expectation hypothesis. I implement the encompassing test in Harvey et al. (1998), where the null hypothesis is that the $RF_t$ factor encompasses the other prominent bond return factors.
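The expanding-window exercise can be sketched as follows. The sketch is a simplification under stated assumptions: it takes the forecasting factor as given (the recursive factor identification is not reproduced), uses a single predictor, and aligns the data so that `y[t]` is the return realized over the `h` months after `t` and `x[t]` is the predictor observed at `t`. The function name and the simulated data are hypothetical.

```python
import numpy as np

def relative_rmse(y, x, first_origin, h=12):
    """Out-of-sample RMSE of a predictive regression relative to the
    historical-mean benchmark, using an expanding training window.

    At forecast origin t only pairs (x[s], y[s]) with s + h <= t are
    observable, because y[s] is realized h periods after s.
    """
    err_model, err_bench = [], []
    for t in range(first_origin, len(y)):
        y_tr, x_tr = y[: t - h + 1], x[: t - h + 1]
        X = np.column_stack([np.ones(len(x_tr)), x_tr])
        b, *_ = np.linalg.lstsq(X, y_tr, rcond=None)
        err_model.append(y[t] - (b[0] + b[1] * x[t]))
        err_bench.append(y[t] - y_tr.mean())       # no-predictability forecast
    mse_model = np.mean(np.square(err_model))
    mse_bench = np.mean(np.square(err_bench))
    return np.sqrt(mse_model / mse_bench)

# Simulated illustration (hypothetical data).
rng = np.random.default_rng(2)
T = 300
x = rng.standard_normal(T)
y = 1.0 * x + 0.2 * rng.standard_normal(T)         # strongly predictable
ratio = relative_rmse(y, x, first_origin=100)
noise = rng.standard_normal(T)                     # unrelated predictor
ratio_noise = relative_rmse(y, noise, first_origin=100)
```

A ratio below one means that the predictive regression beats the historical-mean benchmark, which is how the entries in Table 2.6 should be read.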

Table 2.6: Out-of-Sample Prediction Results: Excess Returns

This table reports out-of-sample root mean squared forecasting errors for the predictive regressions $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_F F_t + u^{(n)}_{t+1}$ for $F_t$ being $PC_t$, $CP_t$, $CF_t$, $RF_t$, and $RF^\star_t$. Forecasting horizon is yearly, but observations are sampled monthly. All numbers are relative to the root mean squared forecasting errors from using the historical mean for the same maturity as forecast. Test statistics for the hypothesis that $RF_t$ encompasses a given variable are in parentheses (), and test statistics for the hypothesis that $RF^\star_t$ encompasses a given variable are in brackets []. See Harvey, Leybourne, and Newbold (1998). Standard errors are Newey and West (1987) corrected with 18 lags. The test sample is 1990:1 to 2014:12. Factor identification and predictive coefficients are recursively updated as the training sample expands.

                     PC_t       CP_t       CF_t       RF_t       RF⋆_t

rx(2)_{t+1}          1.173      1.129      0.982      0.944      0.938
                     (0.621)    (0.427)    (-1.840)
                     [-0.543]   [-1.074]   [-0.465]

rx(3)_{t+1}          1.162      1.141      0.976      0.929      0.916
                     (0.782)    (0.806)    (-1.665)
                     [-0.441]   [-0.843]   [-0.084]

rx(4)_{t+1}          1.147      1.145      0.955      0.907      0.875
                     (0.840)    (1.133)    (-1.607)
                     [-0.288]   [-0.560]   [0.242]

rx(5)_{t+1}          1.139      1.138      0.950      0.891      0.852
                     (0.919)    (1.351)    (-1.426)
                     [-0.206]   [-0.501]   [-0.546]

rx(6)_{t+1}          1.138      1.147      0.957      0.878      0.850
                     (0.897)    (1.487)    (-1.281)
                     [-0.433]   [-0.673]   [0.558]

rx(7)_{t+1}          1.128      1.151      0.941      0.884      0.825
                     (0.865)    (1.651)    (-1.394)
                     [-0.191]   [-0.295]   [0.689]

rx(8)_{t+1}          1.137      1.159      0.958      0.880      0.834
                     (0.720)    (1.539)    (-1.277)
                     [-0.510]   [-0.663]   [0.655]

rx(9)_{t+1}          1.113      1.142      0.968      0.866      0.837
                     (0.584)    (1.468)    (-1.065)
                     [-0.853]   [-0.954]   [0.696]

rx(10)_{t+1}         1.110      1.160      0.974      0.905      0.833
                     (0.506)    (1.605)    (-1.167)
                     [-0.539]   [-0.391]   [0.979]

The first two columns reiterate the finding that in-sample predictability is difficult to realize over the sample period in real time. Neither the first three principal components nor the five forward rates outperform the no-predictability benchmark. In fact, the two models have root mean squared forecasting errors that are in the order of 11-17% larger than those of the forecasts from simple historical averages. The third column shows that the $CF_t$ factor from Cieslak and Povala (2015) does outperform the benchmark. Across maturities, the gains are in the order of 2-6%. Finally, in the two rightmost columns, the return forecasting factor developed in this paper delivers root mean squared forecasting errors that are in the order of 6-17% smaller than the no-predictability benchmark. For all maturities considered, the $RF^\star_t$ factor delivers the lowest root mean squared error, and the factor does particularly well for the longer maturities. This is especially true for maturities beyond five years, where the gain in root mean squared error relative to the $CF_t$ factor is greater than 10%. Across all maturities, the null hypothesis that the $RF^\star_t$ factor (or the $RF_t$ factor) encompasses the other prominent bond return factors is never rejected.

2.3.9 Further Robustness

In addition to the above results, I have conducted a range of robustness checks. The

results are reported in the online appendix accompanying the paper. Here, I briefly

summarize these results.

First, holding periods of three and six months are also often considered in the literature, because they rely less on overlapping data compared to the benchmark one-year holding period. It is comforting that the conclusions are not sensitive to the length of the holding period. For both the three-month and six-month holding periods, the relative ranking of the four considered predictors remains unchanged in terms of predicted variation. The relative improvements of the $RF_t$ factor over the $CF_t$ factor remain comparable to those achieved over the one-year holding period. The relative improvements are in the order of 5-26% and 5-22% for the three-month and six-month holding periods, respectively.

Second, I have also checked that the results are not only present when using Fama and Bliss (1987) zero-coupon yields by implementing the predictive regressions using the Gurkaynak, Sack, and Wright (2007) dataset. The two datasets are highly correlated, and the results from the predictive regressions inherit the similarity of the two datasets.

Third, I have documented that the out-of-sample results do not only hold for returns. I have done an out-of-sample exercise for yield levels similar to the one above. Because of the extreme persistence in yields, a great number of studies have documented difficulties in improving forecasting performance over a simple random walk. The factor structure from this paper outperforms the random walk benchmark, and provides the best forecasts. The forecasts from the learning from macroeconomic experiences model encompass the forecasts from the other factor models.

Finally, I have checked the robustness against variations on the learning algorithm for the macroeconomic expectations. In particular, I have implemented an optimal-gain learning algorithm and a recursive least squares algorithm. The optimal-gain learning algorithm sets the gain parameters using the Kalman filter recursions, whereas the recursive least squares algorithm applies a decreasing gain parameter $\nu_{c,t} = \nu_{\pi,t} = t^{-1}$. The two algorithms are different in an important way. Recursive least squares implies equal weighting of all historical macroeconomic data at each point in time, and thus does not capture down-weighting of macroeconomic data as new generations emerge and old generations pass. This is unlike the constant-gain learning algorithm (and potentially the optimal-gain learning algorithm). I construct bond return forecasting factors using the same methodology as outlined for the constant-gain learning algorithm. The bond return factor implied by the recursive least squares algorithm only does marginally better than level, slope, and curvature. In contrast, the bond return factor implied by the optimal-gain learning algorithm predicts a similar proportion of the excess return variation as the $RF_t$ factor studied here.
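The difference between constant-gain learning and recursive least squares can be made concrete with a short sketch. It assumes the standard recursive updating rule $\tau_t = \tau_{t-1} + \nu_t (x_t - \tau_{t-1})$ for the mean of an observed series $x_t$; this form, the function names, and the initial value are assumptions made for illustration and are not taken from the model section of the paper.

```python
import numpy as np

def learn(x, gain, tau0=0.0):
    """Recursive mean estimate tau_t = tau_{t-1} + gain(t) * (x_t - tau_{t-1}).

    `gain` maps the 1-based observation index t to the gain nu_t.
    """
    tau = np.empty(len(x))
    prev = tau0
    for t, xt in enumerate(x, start=1):
        prev = prev + gain(t) * (xt - prev)
        tau[t - 1] = prev
    return tau

def constant_gain(t, nu=0.016):
    """Constant gain: past data are down-weighted geometrically."""
    return nu

def decreasing_gain(t):
    """Recursive least squares gain 1/t: all past data weighted equally."""
    return 1.0 / t

x_demo = np.array([1.0, 3.0, 5.0, 7.0])
rls_path = learn(x_demo, decreasing_gain)   # running sample mean
cg_path = learn(x_demo, constant_gain)      # slow-moving, recency-weighted
```

With the gain $t^{-1}$ the estimate equals the running sample mean, so an observation from the distant past carries the same weight as the most recent one. With a constant gain $\nu$, the weight on an observation $k$ periods back is $\nu(1-\nu)^k$, which is the down-weighting of old data referred to above.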

2.3.10 Predicting Short Rate Changes

A successful decomposition of the yield curve should account for both expected excess returns and expected short rates. To address whether the model-implied factors also capture variation in future short rates well, I run the regressions

$$\Delta y^{(1)}_{t,t+h} = Z_t\omega + \eta_{t+h}, \tag{2.20}$$

where $\Delta y^{(1)}_{t,t+h} = y^{(1)}_{t+h} - y^{(1)}_t$ and $h = 1,2,3,4,5$ in turn. Again, I consider four sets of short rate factors. First, I consider the level $PC_{1,t}$, slope $PC_{2,t}$, and curvature $PC_{3,t}$ factors. Second, I use the five forward rates. Third, I use inflation expectations and the short rate cycle factor $c^{(1)}_t$. Finally, I use the four term structure factors identified in this paper. Panel D in Table 2.7 indicates that my four term structure factors have strong predictive power for future short rate changes 1 through 5 years ahead. The four factors capture 34-65% of the variation in future short rate changes across the different horizons. The two latent factors are eminently important, whereas the two factors capturing macroeconomic expectations are not significant predictors for any of the considered horizons. In fact, the more parsimonious specification that omits the two macroeconomic factors does equally well. Although the two macroeconomic factors do not add much in the predictive regressions, they are critical for the strong predictive power. This is because the macroeconomic factors are key to identifying the latent factors $P^{(1)}_t$ and $P^{(2)}_t$. This is also evident from panel A, where the short rate changes are regressed on the first three principal components of yields, i.e. without considering the macroeconomic factors. The predicted variation is in the range of 4-24%, and thus the predictive power is much weaker. Panel B reveals that the predictive power in the five forward rates, by and large, resembles that of the first three principal components. In contrast, panel C shows that conditioning on inflation expectations increases the predictable variation across all horizons compared to the first three principal components of yields. Although expected inflation and the short rate cycle factor capture 15-48% of the future short rate variation, the predictive power is still not as strong as the predictive power when accounting for both expected consumption growth and expected inflation.


Table 2.7: Short Rate Changes: Predictive Regressions

This table reports regression coefficients for $\Delta y^{(1)}_{t,t+h} = Z_t\omega + \eta_{t+h}$. Results are for $h = 1,2,3,4,5$ years, but observations are sampled monthly. Newey and West (1987) corrected t-statistics using 18 lags are in parentheses () and Hansen and Hodrick (1980) corrected t-statistics using 12 lags are in brackets [].

Horizon, h      1           2           3           4           5

Panel A: $Z_t = [1\ PC_{1,t}\ PC_{2,t}\ PC_{3,t}]$.

R2              0.04        0.11        0.20        0.24        0.19

Panel B: $Z_t = [1\ f^{(1,1)}_t\ f^{(2,1)}_t\ f^{(3,1)}_t\ f^{(4,1)}_t\ f^{(5,1)}_t]$.

R2              0.07        0.10        0.15        0.21        0.17

Panel C: $Z_t = [1\ \tau_{\pi,t}\ c^{(1)}_t]$.

R2              0.15        0.34        0.46        0.48        0.35

Panel D: $Z_t = [1\ \tau_{c,t}\ \tau_{\pi,t}\ P^{(1)}_t\ P^{(2)}_t]$.

ω_c             0.3188      0.2468      0.0423      0.4379      1.5887
                (1.1265)    (0.7048)    (0.0878)    (0.6686)    (2.0484)
                [0.9920]    [0.6219]    [0.0765]    [0.5848]    [1.7975]

ω_π             -0.0364     -0.0394     -0.0417     -0.0909     -0.1595
                (-0.2598)   (-0.2610)   (-0.2298)   (-0.5030)   (-0.9469)
                [-0.2273]   [-0.2294]   [-0.1994]   [-0.4364]   [-0.8242]

ω_1             -0.3104     -0.5336     -0.5802     -0.5913     -0.5155
                (-5.9047)   (-6.5370)   (-5.5188)   (-4.9728)   (-3.9219)
                [-5.4721]   [-5.7382]   [-4.9181]   [-4.4031]   [-3.4568]

ω_2             0.1261      -0.0844     -0.4643     -0.6893     -0.6138
                (0.8234)    (-0.3104)   (-1.8681)   (-3.3185)   (-2.3110)
                [0.7257]    [-0.2711]   [-1.6531]   [-2.9547]   [-2.0275]

R2              0.34        0.56        0.62        0.65        0.56

2.4 Term Premia

The results suggest that the four-factor model is well-suited for decompositions of long-term yields into an expectation hypothesis component, i.e. average expected short rates over the lifetime of the bond, and a term premium component. This decomposition is summarized by

$$y^{(n)}_t = \frac{1}{n} E_t\left[\sum_{i=0}^{n-1} y^{(1)}_{t+i}\right] + \psi^{(n)}_t, \tag{2.21}$$

where $\psi^{(n)}_t$ denotes the n-year term premium. For the 10-year yield, the term premium is then the difference between two strategies: buying the 10-year bond, and rolling over 1-year bonds as they expire. To evaluate the roll-over strategy it is convenient to have a dynamically

complete model. The constant-gain learning algorithm provides a forecasting model

for the macroeconomic expectation factors. I use an annual VAR(1) sampled at the

monthly frequency for the cycle factors; that is, time t cycle factors are regressed on

their one-year lagged values, but the model is estimated using monthly observations.

Cochrane and Piazzesi (2005) argue that the dynamic properties of bond risk premia

are more evident at the annual horizon. The one-year horizon is also consistent with

the previous predictive regressions. The VAR(1) model is usual practice for affine term

structure models (Dai and Singleton, 2000; Duffee, 2002). For comparison, I have

implemented the three-factor affine term structure model using the same procedure.

The dynamically complete model for the decomposition in (2.21) does not enforce no-arbitrage restrictions on the factor loading estimates in the short rate equation. Adrian et al. (2013) and Joslin, Le, and Singleton (2013) note that enforcing no-arbitrage restrictions has little effect on the implied decomposition.⁶ In Figure 2.6, I plot the 10-year yield along with the resulting decompositions into expected short rates and term premia. The three-factor affine term structure model implies a stationary VAR for the factors. This implies long-horizon short rate expectations that display strong mean reversion. As term premia are extracted as the residual component, the estimates of term premia will account for some of the trend component in the 10-year yield. On the other hand, the constant-gain learning expectations imply time-variation in the long-run mean of the nominal short rate. To see this, consider

$$y^{(1)}_\infty = \lim_{h\to\infty} E_t\left[y^{(1)}_{t+h}\right] = \alpha^{(1)} + \beta^{(1)}_\tau \tau_t + \beta^{(1)}_P \left(I - \Phi_P\right)^{-1}\mu_P, \tag{2.22}$$

where $(I - \Phi_P)^{-1}\mu_P$ is the unconditional mean of the VAR(1) system for the cycle factors; $\Phi_P$ is the autoregressive loading matrix and $\mu_P$ the constant term. The long-run mean of the macroeconomic factors is $\lim_{h\to\infty} E_t[\tau_{t+h}] = \tau_t$, because of the constant-gain learning structure. The time-variation in the long-run short rate expectations now captures the trend component in the 10-year yield. In turn, term premia estimates are more cyclical. This is because they are predominantly driven by the cycle factors, as is also evident from the previous predictive regressions. Allowing for the dynamic trend and cycle distinction of the affine term structure measures of level and slope thus implies very different term premia dynamics. That expectations of short rates capture the trend component in long-term yields is consistent with the observed trends in survey responses (Kozicki and Tinsley, 2001; Kim and Orphanides, 2012). A cyclical term premium is consistent with the behaviour of risk premia in other assets (Fama and French, 1989).

⁶ I have implemented a version of the model enforcing no-arbitrage restrictions. The results are very similar. The results are also similar for a monthly VAR model for the cycle factors. For details on these robustness results, see the online appendix.

Figure 2.6: Model-Implied Term Premia

This figure plots the decomposition of model-implied yields with 10 years to maturity into term premia and short rate expectations. The leftmost graph plots the decomposition for the proposed learning from macroeconomic experiences model, and the rightmost graph plots the decomposition for the benchmark three-factor affine term structure model.
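The decomposition in (2.21), combined with the long-run mean in (2.22), can be sketched numerically. The sketch assumes a short rate equation of the form $y^{(1)}_t = \alpha^{(1)} + \beta^{(1)}_\tau \tau_t + \beta^{(1)}_P P_t$, an annual VAR(1) for the cycle factors $P_{t+1} = \mu_P + \Phi_P P_t + e_{t+1}$, and $E_t[\tau_{t+i}] = \tau_t$ at all horizons. All parameter values are hypothetical and are not estimates from the paper.

```python
import numpy as np

def expected_short_rates(alpha, beta_tau, beta_P, tau_t, P_t, mu_P, Phi_P, n):
    """E_t[y1_{t+i}] for i = 0, ..., n-1 (annual steps)."""
    P_bar = np.linalg.solve(np.eye(len(mu_P)) - Phi_P, mu_P)  # (I - Phi)^{-1} mu
    dev = P_t - P_bar                 # current deviation from the VAR mean
    path = np.empty(n)
    for i in range(n):
        # E_t[P_{t+i}] = P_bar + Phi^i (P_t - P_bar); tau is expected to stay put
        path[i] = alpha + beta_tau @ tau_t + beta_P @ (P_bar + dev)
        dev = Phi_P @ dev
    return path

def term_premium(y_n, path):
    """Residual of the n-year yield over average expected short rates."""
    return y_n - path.mean()

# Hypothetical parameter values, in percent.
alpha = 0.5
beta_tau = np.array([1.2, 1.0])       # loadings on (tau_c, tau_pi)
beta_P = np.array([1.0, -0.5])        # loadings on the two cycle factors
tau_t = np.array([2.0, 2.5])
P_t = np.array([1.0, 0.5])
mu_P = np.array([0.0, 0.0])
Phi_P = np.array([[0.5, 0.0], [0.1, 0.6]])

path = expected_short_rates(alpha, beta_tau, beta_P, tau_t, P_t, mu_P, Phi_P, n=10)
psi_10 = term_premium(6.5, path)      # 6.5 is a hypothetical 10-year yield
long_run = alpha + beta_tau @ tau_t   # limit in (2.22) when the VAR mean is zero
```

Because the cycle factors mean-revert while $\tau_t$ is expected to stay where it is, the expected short rate path converges to a long-run level that moves with $\tau_t$. The trend in long-term yields is therefore attributed to short rate expectations rather than to the residual term premium.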

The mechanism generating the expected short rate and term premia decom-

position — that aligns with macro-finance priors — is the underlying equilibrium

consumption-based structure. In the long run, the yield curve level and slope are

determined by the underlying economics behind the consumption-savings decision.

The level and slope deviate from this equilibrium yield curve over the short run, but

do so in cycles that mean-revert towards the long-run equilibrium. This mechanism

is different from other efforts that have been made to bring the affine term structure

model decompositions into alignment with the macro-finance priors. Cochrane and

Piazzesi (2008) and Bauer (2017) restrict market prices of risk, Bauer, Rudebusch,

and Wu (2012) bias correct the factor VAR, Kim and Orphanides (2012) incorporate

survey measures of yield curve expectations, and Duffee (2011b) and Christensen

and Rudebusch (2016) impose a unit root on the level factor. All are attempts to

increase the persistence of long-horizon short rate expectations, but there is no direct

underlying economic equilibrium interpretation.

2.5 The Decline in the Equilibrium Real Rate

Subtracting inflation expectations from the model of the nominal term structure of yields implies that a model of the real term structure naturally follows. Measuring the equilibrium real rate, i.e. the real short rate that would prevail over the long run, is of great importance for both policy makers and investors who are assessing the current stance of the economy. The equilibrium real rate $r^\star_t$ is defined to be

$$r^\star_t = \lim_{h\to\infty} E_t\left[r_{t+h}\right], \tag{2.23}$$

where $r_t = y^{(1)}_t - E_t[\pi_{t+1}]$ denotes the 1-year real short rate. Using the model structure, the equilibrium real rate is given by

$$r^\star_t = \alpha^{(1)} + \beta^{(1)}_{\tau,c}\tau_{c,t} + \left(\beta^{(1)}_{\tau,\pi} - 1\right)\tau_{\pi,t} + \beta^{(1)}_P\left(I - \Phi_P\right)^{-1}\mu_P, \tag{2.24}$$

since long-run inflation expectations are given by $\lim_{h\to\infty} E_t[\pi_{t+h}] = \tau_{\pi,t}$. Importantly, the model implies time-variation in the equilibrium real rate. This is consistent with a recent literature documenting a decline in the equilibrium real rate over the last 15-20 years (Hamilton et al., 2016; Holston et al., 2017; Bauer and Rudebusch, 2017a, among others). Time-variation in the equilibrium real rate is a novel feature of the proposed model, since equilibrium models of the term structure generally imply a constant equilibrium real rate, see Wachter (2006), Piazzesi and Schneider (2007), and Bansal and Shaliastovich (2013). The same is true for the workhorse three-factor affine term structure model. Here, both long-run inflation and consumption growth expectations affect the equilibrium real rate.
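Equation (2.24) and the attribution of a change in the equilibrium real rate to its two macroeconomic drivers can be sketched as follows. The parameter values are hypothetical and are not the estimates behind the reported results; the function names are likewise illustrative.

```python
import numpy as np

def equilibrium_real_rate(alpha, beta_c, beta_pi, tau_c, tau_pi,
                          beta_P, mu_P, Phi_P):
    """r*_t from (2.24): long-run nominal short rate minus long-run inflation."""
    P_bar = np.linalg.solve(np.eye(len(mu_P)) - Phi_P, mu_P)
    return alpha + beta_c * tau_c + (beta_pi - 1.0) * tau_pi + beta_P @ P_bar

def change_contributions(beta_c, beta_pi, d_tau_c, d_tau_pi):
    """Split a change in r* into consumption and inflation expectation parts.

    The constant and the unconditional mean of the cycle factors do not
    vary over time, so they drop out of the change.
    """
    return {"consumption": beta_c * d_tau_c,
            "inflation": (beta_pi - 1.0) * d_tau_pi}

# Hypothetical parameter values, in percent.
alpha, beta_c, beta_pi = 0.2, 1.5, 1.3
beta_P = np.array([1.0, -0.5])
mu_P = np.array([0.1, 0.0])
Phi_P = np.array([[0.5, 0.0], [0.1, 0.6]])

r_start = equilibrium_real_rate(alpha, beta_c, beta_pi, 2.5, 3.0, beta_P, mu_P, Phi_P)
r_end = equilibrium_real_rate(alpha, beta_c, beta_pi, 1.5, 2.0, beta_P, mu_P, Phi_P)
parts = change_contributions(beta_c, beta_pi, 1.5 - 2.5, 2.0 - 3.0)
```

Because (2.24) is linear in $\tau_{c,t}$ and $\tau_{\pi,t}$, the two contributions add up exactly to the total change in the equilibrium real rate.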

Figure 2.7 plots the estimated time series for the equilibrium real rate $r^\star_t$. Consistent with the recent evidence, the equilibrium real rate was fairly constant around the 3-4% level until 1990. After 1990 the equilibrium real rate has been gradually declining. Over the recent recession the equilibrium real rate even went into negative territory and has remained remarkably low since then. This extraordinarily low equilibrium real rate is consistent with the view expressed in Yellen (2015).

The model-implied decline in the equilibrium real rate comes from two sources. First, during the 1990s, inflation expectations were still gradually coming down, which affected the real rate as the gap between the central bank's inflation target and inflation expectations narrowed. Second, after 2000 consumption growth expectations have declined and caused a fall in the real rate. Of the estimated 241 basis point decline in the equilibrium real rate over the period 2000-2015, the biggest contributor was the drop in expected consumption growth, accounting for 196 basis points. This result is consistent with the interpretation of a declining equilibrium real rate because of a slowdown in productivity, see e.g. Holston et al. (2017).

The decomposition is also consistent with the recent interpretation of Bauer and Rudebusch (2017a). Although nominal yields started to fall in the early to mid-1980s, this decline was initially due to inflation expectations coming down, as the equilibrium real rate remained fairly constant over this period. As inflation expectations settled around 2% during the 1990s, the equilibrium real rate started to fall, causing nominal yields to continue their downward trajectory. The main difference compared to Bauer and Rudebusch (2017a) is the structural underpinning of the decline in the equilibrium real rate.

Figure 2.7: Equilibrium Real Rate

This figure plots the model-implied equilibrium real rate, $r^\star_t$. The equilibrium real rate is defined to be $r^\star_t = \lim_{h\to\infty} E_t[r_{t+h}]$, where $r_t = y^{(1)}_t - E_t[\pi_{t+1}]$ denotes the real short rate. Using the model structure, the equilibrium real rate is given by $r^\star_t = \alpha^{(1)} + \beta^{(1)}_{\tau,c}\tau_{c,t} + (\beta^{(1)}_{\tau,\pi} - 1)\tau_{\pi,t} + \beta^{(1)}_P(I - \Phi_P)^{-1}\mu_P$, since the long-run consumption growth and inflation expectations are given by $\lim_{h\to\infty} E_t[\Delta c_{t+h}] = \tau_{c,t}$ and $\lim_{h\to\infty} E_t[\pi_{t+h}] = \tau_{\pi,t}$, respectively.

2.6 Conclusion

Usual measures of level and slope of the yield curve each have a separate trend and cycle component. Level and slope trends have a clear equilibrium interpretation: inflation expectations capture the level trend, and consumption growth expectations capture the slope trend. Accounting for the trend components identifies two yield curve cycles: a level cycle and a slope cycle.

The yield curve cycles are important sources of variation in bond risk premia. Decomposing the trend and cycle in the level and slope of the yield curve helps predict excess bond returns with $R^2$'s as high as 56%. The model-implied measure of bond risk subsumes and adds to the information in the most successful measures in the literature. More importantly, the decomposition implies bond risk premia that are predominantly cyclical; this is in contrast to the implications of the workhorse affine term structure models.

Macroeconomic expectations capture the most persistent variations in the yield curve, and imply time-variation in the equilibrium real rate. The macroeconomic expectations provide an equilibrium underpinning for the recent decline in the long-run real rate. Equilibrium real rate variation driven by macroeconomic expectations predominantly affects the expectation hypothesis component in long-term yields. The cycle variation, i.e. deviations from the consumption-based equilibrium yield curve, captures the risk premia component. Affine term structure models using yield data only understate the persistence of the expectation hypothesis component, because there is no potential driver of variation in the equilibrium real rate.

Acknowledgements

I thank Martin M. Andreasen, Michael D. Bauer, Ambrogio Cesa-Bianchi, Anna Cieslak, Hans Dewachter, Tom Engsted, James D. Hamilton, Emanuel Möench, and Stig V. Møller for insightful comments, and Anh Le for sharing his zero-coupon bond data. Remarks and suggestions from seminar participants at Aarhus University, Bank of England, Deutsche Bundesbank, Fed Board, KU Leuven, Norwegian School of Economics (NHH), University of Gothenburg and conference participants at the 11th International Conference on Computational and Financial Econometrics are also much appreciated. I acknowledge support from CREATES - Center for Research in Econometric Analysis of Time Series (DNRF78), funded by the Danish National Research Foundation.


2.7 References

Adrian, T., Crump, R. K., Moench, E., 2013. Pricing the term structure with linear

regressions. Journal of Financial Economics Vol. 110, 110–138.

Albuquerque, R., Eichenbaum, M., Rebelo, S., 2016. Valuation risk and asset pricing.

Journal of Finance Vol. 71, 2861–2904.

Bansal, R., Shaliastovich, I., 2013. A long-run risk explanation of predictability puzzles

in bond and currency markets. Review of Financial Studies Vol. 26, 1–33.

Bauer, M. D., 2017. Restrictions on risk prices in dynamic term structure models. Journal of Business and Economic Statistics Forthcoming.

Bauer, M. D., Hamilton, J. D., 2018. Robust bond risk premia. Review of Financial

Studies Vol. 31, 399–448.

Bauer, M. D., Rudebusch, G. D., 2017a. Interest rates under falling stars. Working

Paper.

Bauer, M. D., Rudebusch, G. D., 2017b. Resolving the spanning puzzle in macro-finance term structure models. Review of Finance Vol. 21, 511–553.

Bauer, M. D., Rudebusch, G. D., Wu, J. C., 2012. Correcting estimation bias in dynamic

term structure models. Journal of Business and Economic Statistics Vol. 3, 454–467.

Branch, W., Evans, G. W., 2006. A simple recursive forecasting model. Economics Letters Vol. 91, 158–166.

Buraschi, A., Piatti, I., Whelan, P., 2017. Expected term structures. Working Paper.

Campbell, J. Y., Cochrane, J. H., 1999. By force of habit: A consumption-based explanation of aggregate stock market behavior. Journal of Political Economy Vol. 107, 205–251.



Christensen, J. H. E., Diebold, F. X., Rudebusch, G. D., 2011. The affine arbitrage-free class of Nelson-Siegel term structure models. Journal of Econometrics Vol. 164, 4–20.

Christensen, J. H. E., Rudebusch, G. D., 2016. Modeling yields at the zero lower bound:

Are shadow rates the solution? In: Hillebrand, E., Koopman, S. J. (Eds.), Dynamic

Factor Models (Advances in Econometrics, vol. 35). Emerald Publishing Group,

75–125.


Cieslak, A., 2017. Short-rate expectations and unexpected returns in Treasury bonds. Review of Financial Studies Forthcoming.

Cieslak, A., Povala, P., 2015. Expected returns in Treasury bonds. Review of Financial Studies Vol. 28, 2859–2901.

Cochrane, J. H., Piazzesi, M., 2005. Bond risk premia. American Economic Review Vol.

95, 138–160.

Cochrane, J. H., Piazzesi, M., 2008. Decomposing the yield curve. Working Paper.

Cooper, I., Priestley, R., 2008. Time-varying risk premiums and the output gap. Review

of Financial Studies Vol. 22, 2801–2833.

Creal, D. D., Wu, J. C., 2016. Bond risk premia in consumption-based models. Working

Paper.

Dai, Q., Singleton, K., 2000. Specification analysis of affine term structure models.


Dai, Q., Singleton, K. J., 2002. Expectation puzzles, time-varying risk premia, and

affine models of the term structure. Journal of Financial Economics Vol. 63, 415–

441.

Duffee, G., 2002. Term premia and interest rate forecasts in affine models. Journal of Finance Vol. 57, 405–443.

Duffee, G. R., 2011a. Forecasting with the term structure: The role of no-arbitrage

restrictions. Working paper.

Duffee, G. R., 2011b. Information in (and not in) the term structure. Review of Finan-

cial Studies Vol. 24, 2895–2934.

Duffie, D., Kan, R., 1996. A yield-factor model of interest rates. Mathematical Finance

Vol. 6, 379–406.

Evans, G. W., Honkapohja, S., Williams, N., 2010. Generalized stochastic gradient

learning. International Economic Review Vol. 51, 237–262.





Faust, J., Wright, J. H., 2013. Forecasting inflation. In: Elliott, G., Timmermann, A.

(Eds.), Handbook of Economic Forecasting. Elsevier, 3–51.


Greenwood, R., Shleifer, A., 2014. Expectations of returns and expected returns. Re-

view of Financial Studies Vol. 27, 714–746.

Greenwood, R., Vayanos, D., 2014. Bond supply and excess bond returns. Review of

Financial Studies Vol. 27, 663–713.

Gurkaynak, R. S., Sack, B., Wright, J. H., 2007. The U.S. Treasury yield curve: 1961 to the present. Journal of Monetary Economics Vol. 54, 2291–2304.

Hamilton, J. D., 1994. Time Series Analysis. Princeton University Press.

Hamilton, J. D., Harris, E. S., Hatzius, J., West, K. D., 2016. The equilibrium real funds

rate: Past, present, and future. IMF Economic Review Vol. 64, 660–707.

Hansen, L. P., Hodrick, R. J., 1980. Forward exchange rates as optimal predictors of

future spot rates: An econometric analysis. Journal of Political Economy Vol. 88,

829–853.

Harvey, D. S., Leybourne, S. J., Newbold, P., 1998. Tests for forecast encompassing.

Journal of Business and Economic Statistics Vol. 16, 254–259.

Holston, K., Laubach, T., Williams, J. C., 2017. Measuring the natural rate of inter-

est: International trends and determinants. Journal of International Economics

Forthcoming.

Joslin, S., Le, A., Singleton, K. J., 2013. Why Gaussian macro-finance term structure models are (nearly) unconstrained factor-VARs. Journal of Financial Economics Vol. 109, 604–622.

Joslin, S., Priebsch, M., Singleton, K. J., 2014. Risk premiums in dynamic term structure

models with unspanned macro risk. Journal of Finance Vol. 69, 453–468.

Joslin, S., Singleton, K. J., Zhu, H., 2011. A new perspective on Gaussian dynamic term structure models. Review of Financial Studies Vol. 24, 926–970.

Kim, D. H., Orphanides, A., 2012. Term structure estimation with survey data on

interest rate forecasts. Journal of Financial and Quantitative Analysis Vol. 47, 241–

272.

Kozicki, S., Tinsley, P., 2001. Shifting endpoints in the term structure of interest rates.

Journal of Monetary Economics Vol. 47, 613–652.

Le, A., Singleton, K. J., 2013. The structure of risks in equilibrium affine models of

bond yields. Working paper.

Litterman, R., Scheinkman, J., 1991. Common factors affecting bond returns. Journal

of Fixed Income Vol. 1, 54–61.


Ludvigson, S. C., Ng, S., 2009. Macro factors in bond risk premia. Review of Financial

Studies Vol. 22, 5027–5067.

Mankiw, G., Reis, R., 2002. Sticky information versus sticky prices: A proposal to replace the New Keynesian Phillips curve. Quarterly Journal of Economics Vol. 117, 1295–1328.

Mincer, J., Zarnowitz, V., 1969. The evaluation of economic forecasts. In: Mincer, J.

(Ed.), Handbook of Economic Forecasting. New York: National Bureau of Economic

Research, 81–111.

Nagel, S., Malmendier, U., 2016. Learning from inflation experiences. Quarterly Jour-

nal of Economics Vol. 131, 53–87.

Newey, W. K., West, K. D., 1987. A simple, positive semi-definite, heteroskedasticity

and autocorrelation consistent covariance matrix. Econometrica Vol. 55(3), 703–

708.

Orphanides, A., Williams, J. C., 2005. The decline of activist stabilization policy: Natu-

ral rate misperceptions, learning, and expectations. Journal of Economic Dynamics

and Control Vol. 29, 1927–1950.

Piazzesi, M., Salomao, J., Schneider, M., 2015. Trend and cycle in bond premia. Work-

ing Paper.

Piazzesi, M., Schneider, M., 2007. Equilibrium yield curves. In: NBER Macroeco-

nomics Annual 2006, Volume 21. MIT Press, 389–472.

Rudebusch, G. D., Williams, J. C., 2009. Forecasting recessions: The puzzle of the

enduring power of the yield curve. Journal of Business and Economic Statistics Vol.

27, 492–503.

Rudebusch, G. D., Wu, T., 2007. Accounting for a shift in term structure behavior with

no-arbitrage and macro-finance models. Journal of Money, Credit, and Banking

Vol. 39, 395–422.

Schorfheide, F., Song, D., Yaron, A., 2017. Identifying long run risks: A Bayesian mixed-frequency approach. Econometrica Forthcoming.

Sims, C. A., 2003. Implications of rational inattention. Journal of Monetary Economics

Vol. 50, 665–690.

Thornton, D. L., Valente, G., 2012. Out-of-sample predictions of bond excess returns

and forward rates: An asset allocation perspective. Review of Financial Studies Vol.

25, 3141–3168.


Tversky, A., Kahneman, D., 1973. Availability: A heuristic for judging frequency and

probability. Cognitive Psychology (4), 207–232.

Vasicek, O., 1977. An equilibrium characterization of the term structure. Journal of Financial Economics Vol. 5, 177–188.

Wachter, J. A., 2006. A consumption-based model of the term structure of interest

rates. Journal of Financial Economics Vol. 79, 365–399.

Woodford, M., 2003. Optimal interest-rate smoothing. Review of Economic Studies

Vol. 70, 861–886.

Yellen, J. L., 2015. Normalizing Monetary Policy: Prospects and Perspectives. Speech at the conference “The New Normal for Monetary Policy” at the Federal Reserve Bank of San Francisco, March 27, url: https://www.federalreserve.gov/newsevents/speech/yellen20150327a.htm.

Zellner, A., 1962. An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. Journal of the American Statistical Association Vol. 57, 348–368.



Appendix

B.1 Model Details

B.1.1 The Stochastic Discount Factor

This section derives the real stochastic discount factor. The representative household

has lifetime utility Vt given by

$$V_t = E_t\left[\sum_{i=0}^{\infty}\delta^{i}\,\Gamma_{t+i}\log C_{t+i}\right] = \Gamma_t\log C_t + \delta E_t\left[V_{t+1}\right], \tag{B.1}$$

where $V_t$ denotes the value function, $C_t$ denotes the consumption good, and $\Gamma_t$ is an exogenous demand shock. The subjective discount factor is denoted by $\delta$. A complete market for state-contingent claims $A_t(s)$, each paying one unit of consumption in period $t$ if state $s$ materializes, is available to the household. Thus, the budget constraint reads

$$C_t + E_t\left[M_{t,t+1}A_{t+1}\right] = A_t, \tag{B.2}$$

where all terms are measured in real terms, i.e. consumption units. Mt ,t+1 denotes

the real stochastic discount factor. For notational convenience the state dependence

is suppressed. This implies that the maximization problem reads

$$\max_{C_t,\,A_{t+1}} V_t = \Gamma_t\log C_t + \delta E_t\left[V_{t+1}\right] \quad \text{s.t.} \quad C_t + E_t\left[M_{t,t+1}A_{t+1}\right] = A_t, \tag{B.3}$$

and a standard no-Ponzi game condition. The Bellman equation reads

$$V(A_t) = \max_{C_t,\,A_{t+1}}\left\{\Gamma_t\log C_t + \delta E_t\left[V_{t+1}\right] + \Lambda_t\left(A_t - C_t - E_t\left[M_{t,t+1}A_{t+1}\right]\right)\right\}. \tag{B.4}$$

First order conditions are

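$$\begin{aligned}
\Gamma_t C_t^{-1} - \Lambda_t &= 0 \\
-\Lambda_t M_{t,t+1}P(s) + \delta\frac{\partial V_{t+1}}{\partial A_{t+1}}P(s) &= 0,
\end{aligned} \tag{B.5}$$

where $P(s)$ denotes the probability of state $s$. By the envelope theorem, it follows that $\partial V_t/\partial A_t = \Lambda_t$. Thus,

$$M_{t,t+1} = \delta\frac{\Lambda_{t+1}}{\Lambda_t} = \delta\left(\frac{C_{t+1}}{C_t}\right)^{-1}\frac{\Gamma_{t+1}}{\Gamma_t}, \tag{B.6}$$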

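which holds for all states $s$. The log real stochastic discount factor then reads

$$m_{t+1} = \log\delta - \Delta c_{t+1} + \Delta\gamma_{t+1}, \tag{B.7}$$

where $\Delta\gamma_{t+1} = \log\Gamma_{t+1} - \log\Gamma_t$. This implies a (log) nominal stochastic discount factor given by

$$m^{\$}_{t+1} = \log\delta - \Delta c_{t+1} + \Delta\gamma_{t+1} - \pi_{t+1}, \tag{B.8}$$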


where πt+1 is net inflation. Finally, plugging in for the consumption growth and infla-

tion dynamics as perceived by the representative household, the nominal stochastic

discount factor is given by

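$$m^{\$}_{t+1} = \log\delta - \tau_{c,t} - \varepsilon_{c,t+1} + \Delta\gamma_{t+1} - \tau_{\pi,t} - \varepsilon_{\pi,t+1}, \tag{B.9}$$

where $\tau_{c,t}$ and $\tau_{\pi,t}$ are the perceived conditional expectations of consumption growth and inflation, respectively, and $\varepsilon_{c,t+1}$ and $\varepsilon_{\pi,t+1}$ are the corresponding forecast errors.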

B.1.2 Exogenous Demand Shocks

The exogenous demand shock growth is specified as

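$$\begin{aligned}
-\Delta\gamma_{t+1} ={}& \gamma_0 + \gamma_\tau\tau_t + \gamma_P P_t + \tfrac{1}{2}\lambda'_{\tau,t}\lambda_{\tau,t} + \left(\lambda'_{\tau,t}\Sigma_\tau^{-1} - [1\;\,1]\right)\varepsilon_{t+1} \\
& + \tfrac{1}{2}\lambda'_{P,t}\lambda_{P,t} + \lambda'_{P,t}\Sigma_P^{-1}v_{t+1},
\end{aligned} \tag{B.10}$$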

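where $\tau_t = [\tau_{c,t}\;\,\tau_{\pi,t}]'$ and $\varepsilon_t = [\varepsilon_{c,t}\;\,\varepsilon_{\pi,t}]' \sim N(0, \Sigma_\tau\Sigma_\tau')$. $P_t$ is an $n_P \times 1$ vector of latent factors. $v_{t+1} \sim N(0, \Sigma_P\Sigma_P')$ are one-step ahead forecast errors associated with $P_t$. This specification implies a nominal stochastic discount factor given by

$$m^{\$}_{t+1} = -\alpha - \beta_\tau\tau_t - \beta_P P_t - \tfrac{1}{2}\lambda'_{\tau,t}\lambda_{\tau,t} - \lambda'_{\tau,t}\Sigma_\tau^{-1}\varepsilon_{t+1} - \tfrac{1}{2}\lambda'_{P,t}\lambda_{P,t} - \lambda'_{P,t}\Sigma_P^{-1}v_{t+1}, \tag{B.11}$$

where $\alpha = \gamma_0 - \log\delta$, $\beta_\tau = \gamma_\tau + [1\;\,1]$, and $\beta_P = \gamma_P$. The market prices of risk (or risk sensitivity functions) are given by

$$\begin{aligned}
\Sigma_\tau\lambda_{\tau,t} &= \lambda_{0,\tau} + \lambda_{1,\tau}\tau_t \\
\Sigma_P\lambda_{P,t} &= \lambda_{0,P} + \lambda_{1,P}P_t,
\end{aligned} \tag{B.12}$$

where $\lambda_{0,\tau}$, $\lambda_{1,\tau}$, $\lambda_{0,P}$, and $\lambda_{1,P}$ are parameters.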

B.1.3 Perceived Laws of Motion

The household recursively updates conditional expectations using a constant-gain

learning algorithm, i.e.

τt+1 = τt +νεt+1, (B.13)

where ν is the constant-gain matrix. This matrix is assumed diagonal. The latent

factors have law of motion

Pt+1 =µP+ΦPPt + vt+1. (B.14)

The innovations vt are unrelated to εt at all leads and lags.
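A minimal Python sketch of the constant-gain recursion in (B.13) for a single element of τ (the gain matrix is diagonal, so each element updates separately). It assumes the forecast error is the realized value minus the previous expectation, in line with (B.9). The function names, the gain value, and the data are hypothetical and serve only to illustrate how a constant gain down-weights older observations geometrically.

```python
def constant_gain_update(tau, observation, gain):
    """One step of tau_{t+1} = tau_t + gain * eps_{t+1},
    with eps_{t+1} = observation_{t+1} - tau_t (assumed forecast error)."""
    return tau + gain * (observation - tau)


def constant_gain_path(observations, tau0, gain):
    """Apply the update recursively, starting from the initial belief tau0."""
    tau = tau0
    path = [tau]
    for x in observations:
        tau = constant_gain_update(tau, x, gain)
        path.append(tau)
    return path


def implied_weights(gain, n_obs):
    """Weight placed on the i-th most recent observation: gain * (1 - gain)**i."""
    return [gain * (1.0 - gain) ** i for i in range(n_obs)]


# Hypothetical example: belief starts at 0 and the household observes 1.0 twice.
path = constant_gain_path([1.0, 1.0], tau0=0.0, gain=0.5)
weights = implied_weights(0.5, 3)
```

With a gain of 0.5 the belief moves halfway toward each new observation, and the implied weights halve with every step back in time. A smaller gain spreads the weight over a longer history.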


B.1.4 The Nominal Short Rate

The pricing equation for the nominal short rate it is given by

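$$E_t\left[e^{m^{\$}_{t+1}}\right] = e^{-i_t}. \tag{B.15}$$

Since the log stochastic discount factor is conditionally normal, and since $E\left[e^{X}\right] = e^{\mu + \frac{1}{2}\sigma^2}$ holds for any $X \sim N(\mu, \sigma^2)$, it follows that

$$i_t = -E_t\left[m^{\$}_{t+1}\right] - \tfrac{1}{2}V_t\left[m^{\$}_{t+1}\right]. \tag{B.16}$$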

The conditional expectation of the stochastic discount factor is

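$$E_t\left[m^{\$}_{t+1}\right] = -\alpha - \beta_\tau\tau_t - \beta_P P_t - \tfrac{1}{2}\lambda'_{\tau,t}\lambda_{\tau,t} - \tfrac{1}{2}\lambda'_{P,t}\lambda_{P,t}, \tag{B.17}$$

and so, the innovation is given by

$$m^{\$}_{t+1} - E_t\left[m^{\$}_{t+1}\right] = -\lambda'_{\tau,t}\Sigma_\tau^{-1}\varepsilon_{t+1} - \lambda'_{P,t}\Sigma_P^{-1}v_{t+1}. \tag{B.18}$$

It follows that

$$V_t\left[m^{\$}_{t+1}\right] = \lambda'_{\tau,t}\lambda_{\tau,t} + \lambda'_{P,t}\lambda_{P,t}, \tag{B.19}$$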

and so, the short nominal interest rate is given by

it =α+βττt +βPPt . (B.20)
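As a brief check of this last step, substituting the conditional mean (B.17) and the conditional variance (B.19) into (B.16) shows that the quadratic risk-price terms cancel exactly:

```latex
\begin{aligned}
i_t &= -E_t\left[m^{\$}_{t+1}\right] - \tfrac{1}{2}V_t\left[m^{\$}_{t+1}\right] \\
    &= \alpha + \beta_\tau\tau_t + \beta_P P_t
       + \tfrac{1}{2}\lambda'_{\tau,t}\lambda_{\tau,t} + \tfrac{1}{2}\lambda'_{P,t}\lambda_{P,t}
       - \tfrac{1}{2}\left(\lambda'_{\tau,t}\lambda_{\tau,t} + \lambda'_{P,t}\lambda_{P,t}\right) \\
    &= \alpha + \beta_\tau\tau_t + \beta_P P_t .
\end{aligned}
```

The variance in (B.19) uses $\lambda'_{\tau,t}\Sigma_\tau^{-1}\left(\Sigma_\tau\Sigma_\tau'\right)\Sigma_\tau^{-1\prime}\lambda_{\tau,t} = \lambda'_{\tau,t}\lambda_{\tau,t}$, and likewise for the latent-factor block.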

B.1.5 Affine Yield Curve Representation

For all bonds with maturity longer than one period, the pricing equations are given

recursively as

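$$E_t\left[e^{m^{\$}_{t+1} + p^{(n-1)}_{t+1}}\right] = e^{p^{(n)}_t}. \tag{B.21}$$

Inserting for the stochastic discount factor, the pricing equations read

$$E_t\left[e^{-\alpha - \beta_\tau\tau_t - \beta_P P_t - \frac{1}{2}\lambda'_{\tau,t}\lambda_{\tau,t} - \lambda'_{\tau,t}\Sigma_\tau^{-1}\varepsilon_{t+1} - \frac{1}{2}\lambda'_{P,t}\lambda_{P,t} - \lambda'_{P,t}\Sigma_P^{-1}v_{t+1} + p^{(n-1)}_{t+1}}\right] = e^{p^{(n)}_t}. \tag{B.22}$$

Then, guessing and verifying a solution for log bond prices that is linear in the state variables, i.e. $p^{(n)}_t = a_n + b_{\tau,n}\tau_t + b_{P,n}P_t$, implies that the left-hand side exponent reads

$$\begin{aligned}
& -\alpha - \beta_\tau\tau_t - \beta_P P_t - \tfrac{1}{2}\lambda'_{\tau,t}\lambda_{\tau,t} - \lambda'_{\tau,t}\Sigma_\tau^{-1}\varepsilon_{t+1} - \tfrac{1}{2}\lambda'_{P,t}\lambda_{P,t} - \lambda'_{P,t}\Sigma_P^{-1}v_{t+1} \\
& + a_{n-1} + b_{\tau,n-1}\tau_{t+1} + b_{P,n-1}P_{t+1}.
\end{aligned} \tag{B.23}$$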


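Inserting the laws of motion of the state variables, it follows that

$$\begin{aligned}
& -\alpha - \beta_\tau\tau_t - \beta_P P_t - \tfrac{1}{2}\lambda'_{\tau,t}\lambda_{\tau,t} - \lambda'_{\tau,t}\Sigma_\tau^{-1}\varepsilon_{t+1} - \tfrac{1}{2}\lambda'_{P,t}\lambda_{P,t} - \lambda'_{P,t}\Sigma_P^{-1}v_{t+1} \\
& + a_{n-1} + b_{\tau,n-1}\tau_t + b_{\tau,n-1}\nu\varepsilon_{t+1} + b_{P,n-1}\mu_P + b_{P,n-1}\Phi_P P_t + b_{P,n-1}v_{t+1},
\end{aligned} \tag{B.24}$$

or

$$\begin{aligned}
& -\alpha + a_{n-1} + b_{P,n-1}\mu_P + \left(b_{\tau,n-1} - \beta_\tau\right)\tau_t + \left(b_{P,n-1}\Phi_P - \beta_P\right)P_t \\
& - \tfrac{1}{2}\lambda'_{\tau,t}\lambda_{\tau,t} - \tfrac{1}{2}\lambda'_{P,t}\lambda_{P,t} + \left(b_{\tau,n-1}\nu - \lambda'_{\tau,t}\Sigma_\tau^{-1}\right)\varepsilon_{t+1} + \left(b_{P,n-1} - \lambda'_{P,t}\Sigma_P^{-1}\right)v_{t+1}.
\end{aligned} \tag{B.25}$$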

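Thus, applying the log-normal expectation formula, it follows that

$$\begin{aligned}
E_t\left[e^{m^{\$}_{t+1} + p^{(n-1)}_{t+1}}\right] = \exp\Big\{ & -\alpha + a_{n-1} + b_{P,n-1}\mu_P + \left(b_{\tau,n-1} - \beta_\tau\right)\tau_t + \left(b_{P,n-1}\Phi_P - \beta_P\right)P_t \\
& - \tfrac{1}{2}\lambda'_{\tau,t}\lambda_{\tau,t} - \tfrac{1}{2}\lambda'_{P,t}\lambda_{P,t} \\
& + \tfrac{1}{2}\left(b_{\tau,n-1}\nu - \lambda'_{\tau,t}\Sigma_\tau^{-1}\right)\Sigma_\tau\Sigma_\tau'\left(b_{\tau,n-1}\nu - \lambda'_{\tau,t}\Sigma_\tau^{-1}\right)' \\
& + \tfrac{1}{2}\left(b_{P,n-1} - \lambda'_{P,t}\Sigma_P^{-1}\right)\Sigma_P\Sigma_P'\left(b_{P,n-1} - \lambda'_{P,t}\Sigma_P^{-1}\right)' \Big\}.
\end{aligned} \tag{B.26}$$

Expanding the last two quadratic forms,

$$\begin{aligned}
E_t\left[e^{m^{\$}_{t+1} + p^{(n-1)}_{t+1}}\right] = \exp\Big\{ & -\alpha + a_{n-1} + b_{P,n-1}\mu_P + \left(b_{\tau,n-1} - \beta_\tau\right)\tau_t + \left(b_{P,n-1}\Phi_P - \beta_P\right)P_t \\
& - \tfrac{1}{2}\lambda'_{\tau,t}\lambda_{\tau,t} + \tfrac{1}{2}b_{\tau,n-1}\nu\Sigma_\tau\Sigma_\tau'\nu' b'_{\tau,n-1} + \tfrac{1}{2}\lambda'_{\tau,t}\lambda_{\tau,t} - b_{\tau,n-1}\nu\Sigma_\tau\lambda_{\tau,t} \\
& - \tfrac{1}{2}\lambda'_{P,t}\lambda_{P,t} + \tfrac{1}{2}b_{P,n-1}\Sigma_P\Sigma_P' b'_{P,n-1} + \tfrac{1}{2}\lambda'_{P,t}\lambda_{P,t} - b_{P,n-1}\Sigma_P\lambda_{P,t} \Big\},
\end{aligned} \tag{B.27}$$

and then rearranging,

$$\begin{aligned}
E_t\left[e^{m^{\$}_{t+1} + p^{(n-1)}_{t+1}}\right] = \exp\Big\{ & -\alpha + a_{n-1} + b_{P,n-1}\mu_P + \tfrac{1}{2}b_{\tau,n-1}\nu\Sigma_\tau\Sigma_\tau'\nu' b'_{\tau,n-1} + \tfrac{1}{2}b_{P,n-1}\Sigma_P\Sigma_P' b'_{P,n-1} \\
& + \left(b_{\tau,n-1} - \beta_\tau\right)\tau_t - b_{\tau,n-1}\nu\Sigma_\tau\lambda_{\tau,t} \\
& + \left(b_{P,n-1}\Phi_P - \beta_P\right)P_t - b_{P,n-1}\Sigma_P\lambda_{P,t} \Big\}.
\end{aligned} \tag{B.28}$$

Inserting for the market prices of risk, $\lambda_{\tau,t}$ and $\lambda_{P,t}$, implies

$$\begin{aligned}
E_t\left[e^{m^{\$}_{t+1} + p^{(n-1)}_{t+1}}\right] = \exp\Big\{ & -\alpha + a_{n-1} + b_{P,n-1}\mu_P + \tfrac{1}{2}b_{\tau,n-1}\nu\Sigma_\tau\Sigma_\tau'\nu' b'_{\tau,n-1} + \tfrac{1}{2}b_{P,n-1}\Sigma_P\Sigma_P' b'_{P,n-1} \\
& + \left(b_{\tau,n-1} - \beta_\tau\right)\tau_t - b_{\tau,n-1}\nu\lambda_{0,\tau} - b_{\tau,n-1}\nu\lambda_{1,\tau}\tau_t \\
& + \left(b_{P,n-1}\Phi_P - \beta_P\right)P_t - b_{P,n-1}\lambda_{0,P} - b_{P,n-1}\lambda_{1,P}P_t \Big\},
\end{aligned} \tag{B.29}$$

or

$$\begin{aligned}
E_t\left[e^{m^{\$}_{t+1} + p^{(n-1)}_{t+1}}\right] = \exp\Big\{ & -\alpha + a_{n-1} + b_{P,n-1}\left(\mu_P - \lambda_{0,P}\right) - b_{\tau,n-1}\nu\lambda_{0,\tau} \\
& + \tfrac{1}{2}b_{\tau,n-1}\nu\Sigma_\tau\Sigma_\tau'\nu' b'_{\tau,n-1} + \tfrac{1}{2}b_{P,n-1}\Sigma_P\Sigma_P' b'_{P,n-1} \\
& + \left[b_{\tau,n-1}\left(I - \nu\lambda_{1,\tau}\right) - \beta_\tau\right]\tau_t \\
& + \left[b_{P,n-1}\left(\Phi_P - \lambda_{1,P}\right) - \beta_P\right]P_t \Big\}.
\end{aligned} \tag{B.30}$$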


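Thus, for

$$\begin{aligned}
e^{a_n + b_{\tau,n}\tau_t + b_{P,n}P_t} = \exp\Big\{ & -\alpha + a_{n-1} + b_{P,n-1}\left(\mu_P - \lambda_{0,P}\right) - b_{\tau,n-1}\nu\lambda_{0,\tau} \\
& + \tfrac{1}{2}b_{\tau,n-1}\nu\Sigma_\tau\Sigma_\tau'\nu' b'_{\tau,n-1} + \tfrac{1}{2}b_{P,n-1}\Sigma_P\Sigma_P' b'_{P,n-1} \\
& + \left[b_{\tau,n-1}\left(I - \nu\lambda_{1,\tau}\right) - \beta_\tau\right]\tau_t \\
& + \left[b_{P,n-1}\left(\Phi_P - \lambda_{1,P}\right) - \beta_P\right]P_t \Big\}
\end{aligned} \tag{B.31}$$

to hold for all states, it must be that

$$\begin{aligned}
a_n &= -\alpha + a_{n-1} + b_{P,n-1}\left(\mu_P - \lambda_{0,P}\right) - b_{\tau,n-1}\nu\lambda_{0,\tau} + \tfrac{1}{2}b_{\tau,n-1}\nu\Sigma_\tau\Sigma_\tau'\nu' b'_{\tau,n-1} + \tfrac{1}{2}b_{P,n-1}\Sigma_P\Sigma_P' b'_{P,n-1} \\
b_{\tau,n} &= b_{\tau,n-1}\left(I - \nu\lambda_{1,\tau}\right) - \beta_\tau \\
b_{P,n} &= b_{P,n-1}\left(\Phi_P - \lambda_{1,P}\right) - \beta_P.
\end{aligned} \tag{B.32}$$

Then yields are affine in the state variables, i.e.

$$y^{(n)}_t = A_n + B_{\tau,n}\tau_t + B_{P,n}P_t, \tag{B.33}$$

where $A_n = -\frac{1}{n}a_n$, $B_{\tau,n} = -\frac{1}{n}b_{\tau,n}$, and $B_{P,n} = -\frac{1}{n}b_{P,n}$.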
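The recursions in (B.32) and the yield loadings in (B.33) can be sketched numerically as follows. This is an illustrative NumPy sketch, not the estimation code used in the thesis: the function names are hypothetical, and the initial condition $a_0 = 0$, $b_{\tau,0} = 0$, $b_{P,0} = 0$ is an assumption (a bond at maturity has a log price of zero), chosen so that the one-period yield reproduces the short rate in (B.20).

```python
import numpy as np


def bond_loadings(n_max, alpha, beta_tau, beta_P, nu, lam0_tau, lam1_tau,
                  mu_P, Phi_P, lam0_P, lam1_P, Sigma_tau, Sigma_P):
    """Iterate the log-price recursions (B.32) for maturities 1..n_max.

    Assumed initial condition: a_0 = 0, b_tau_0 = 0, b_P_0 = 0.
    """
    k_tau, k_P = len(beta_tau), len(beta_P)
    a = np.zeros(n_max + 1)
    b_tau = np.zeros((n_max + 1, k_tau))
    b_P = np.zeros((n_max + 1, k_P))
    eye = np.eye(k_tau)
    for n in range(1, n_max + 1):
        bt, bp = b_tau[n - 1], b_P[n - 1]
        a[n] = (-alpha + a[n - 1]
                + bp @ (mu_P - lam0_P)
                - bt @ nu @ lam0_tau
                + 0.5 * bt @ nu @ Sigma_tau @ Sigma_tau.T @ nu.T @ bt
                + 0.5 * bp @ Sigma_P @ Sigma_P.T @ bp)
        b_tau[n] = bt @ (eye - nu @ lam1_tau) - beta_tau
        b_P[n] = bp @ (Phi_P - lam1_P) - beta_P
    return a, b_tau, b_P


def yield_loadings(a, b_tau, b_P):
    """Convert log-price loadings to yield loadings as in (B.33)."""
    n = np.arange(1, len(a))
    return -a[1:] / n, -b_tau[1:] / n[:, None], -b_P[1:] / n[:, None]


# Hypothetical parameter values, for illustration only.
alpha = 0.01
beta_tau = np.array([1.0, 1.0])
beta_P = np.array([0.3, 0.7])
a, b_tau, b_P = bond_loadings(
    n_max=5, alpha=alpha, beta_tau=beta_tau, beta_P=beta_P,
    nu=np.diag([0.02, 0.03]),
    lam0_tau=np.zeros(2), lam1_tau=np.zeros((2, 2)),
    mu_P=np.zeros(2), Phi_P=0.9 * np.eye(2),
    lam0_P=np.zeros(2), lam1_P=np.zeros((2, 2)),
    Sigma_tau=0.001 * np.eye(2), Sigma_P=0.001 * np.eye(2))
A, B_tau, B_P = yield_loadings(a, b_tau, b_P)
```

With $\lambda_{1,\tau} = 0$ the recursion gives $b_{\tau,n} = -n\beta_\tau$, so the yield loading on $\tau_t$ is the same at every maturity. This is one way to see why the random-walk expectation factors act as level-type factors in this setup.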

B.2 Data Construction

I use data that are standard in the literature. Monthly real per capita consumption growth is constructed from the NIPA tables at the U.S. BEA. I use data on non-durables and services. Each series is divided by its price index before the two are added together. Finally, the sum is divided by the population size, which is downloaded from the Federal Reserve Bank of St. Louis (FRED) database. Inflation is constructed from core U.S. CPI, which is also downloaded from the FRED database. Consumption growth and inflation are both calculated as year-over-year log differences of the consumption and CPI levels. Real per capita consumption is available from January 1959, whereas core U.S. CPI is available from January 1957.
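The construction steps above can be sketched in a few lines of Python. The function and argument names are hypothetical, and the inputs are assumed to be aligned monthly series of nominal expenditures, price indexes, and population; the sketch only mirrors the arithmetic described in the text (deflate, add, divide by population, take 12-month log differences).

```python
import math


def real_per_capita_consumption(nondurables, nondurables_price,
                                services, services_price, population):
    """Deflate each component by its own price index, add, divide by population."""
    return [(nd / p_nd + sv / p_sv) / pop
            for nd, p_nd, sv, p_sv, pop in zip(nondurables, nondurables_price,
                                               services, services_price,
                                               population)]


def yoy_log_growth(levels, periods_per_year=12):
    """Year-over-year log difference: log(x_t) - log(x_{t-12}) for monthly data."""
    return [math.log(levels[t]) - math.log(levels[t - periods_per_year])
            for t in range(periods_per_year, len(levels))]


# Hypothetical example: a level series growing at a steady 2% per year.
levels = [100.0 * 1.02 ** (t / 12) for t in range(24)]
growth = yoy_log_growth(levels)
```

The first twelve observations are lost to the differencing, which is why the macroeconomic series need to start at least a year before the first expectation is formed.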

Yield curve data are end-of-month unsmoothed Fama and Bliss (1987) zero-coupon yields constructed in Le and Singleton (2013) for maturities of 12, 24, 36, 48, 60, 72, 84, 96, 108, and 120 months. I start the sample in November 1971 as in Cieslak and Povala (2015). Prior to this period, data on bond prices with maturities longer than 5 years are sparse (see Fama and Bliss (1987) and Le and Singleton (2013)) and thus excluded here. Data are available through December 2014, giving a total of $T = 518$ observations.


B.3 Robustness Results

B.3.1 Sample split - 1985:M1 to 2014:M12

Table B.1: Predictive Regressions

The table reports regression coefficients for $rx^{(n)}_{t+1} = Z_t\theta^{(n)} + u^{(n)}_{t+1}$. Newey and West (1987) corrected t-statistics using 18 lags are in parentheses () and Hansen and Hodrick (1980) corrected t-statistics using 12 lags are in brackets [].

Maturities             2         3         4         5         6         7         8         9         10

Panel A: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{PC}PC_t + u^{(n)}_{t+1}$.
$\theta^{(n)}_{PC}$    0.1850    0.3561    0.5844    0.7843    1.0243    1.2119    1.4241    1.6581    1.7718
                      (1.8970)  (1.9504)  (2.2649)  (2.5405)  (2.6756)  (2.8665)  (3.0617)  (3.1168)  (3.1577)
                      [1.6590]  [1.7056]  [1.9843]  [2.2296]  [2.3581]  [2.5346]  [2.7165]  [2.7738]  [2.8085]
$R^2$                  0.11      0.11      0.14      0.16      0.18      0.19      0.21      0.22      0.21

Panel B: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{CP}CP_t + u^{(n)}_{t+1}$.
$\theta^{(n)}_{CP}$    0.2036    0.3865    0.6370    0.8142    1.0452    1.2124    1.3986    1.6246    1.6779
                      (3.0063)  (3.0499)  (3.6125)  (3.8840)  (3.8411)  (4.0733)  (4.1501)  (4.0617)  (3.9262)
                      [2.6529]  [2.7022]  [3.2098]  [3.4772]  [3.4522]  [3.6854]  [3.7730]  [3.6881]  [3.5794]
$R^2$                  0.15      0.15      0.20      0.20      0.22      0.23      0.24      0.25      0.22

Panel C: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{CF}CF_t + u^{(n)}_{t+1}$.
$\theta^{(n)}_{CF}$    0.1774    0.3813    0.6043    0.8036    1.0188    1.2346    1.4155    1.6012    1.7633
                      (3.9934)  (4.6492)  (5.3454)  (6.0371)  (6.6336)  (7.0493)  (7.3807)  (7.1803)  (7.3176)
                      [3.7148]  [4.4221]  [5.1594]  [6.0092]  [6.8913]  [7.3253]  [7.9218]  [7.8297]  [8.0063]
$R^2$                  0.17      0.22      0.27      0.30      0.32      0.36      0.36      0.36      0.37

Panel D: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{RF}RF_t + u^{(n)}_{t+1}$.
$\theta^{(n)}_{RF}$    0.1678    0.3668    0.5876    0.7864    1.0099    1.2306    1.4187    1.6252    1.8070
                      (4.5082)  (5.5549)  (6.2337)  (6.6635)  (7.3047)  (7.8272)  (8.3678)  (8.5503)  (9.7131)
                      [4.3015]  [5.4834]  [6.1619]  [6.6618]  [7.5589]  [8.0382]  [8.9703]  [9.4654]  [11.5637]
$R^2$                  0.21      0.27      0.34      0.38      0.42      0.47      0.49      0.50      0.51
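For readers who want to reproduce t-statistics of the kind reported in parentheses, the following is a minimal NumPy sketch of OLS with Newey and West (1987) standard errors based on Bartlett weights. The function name is hypothetical, no small-sample correction is applied, and the exact implementation behind the tables may differ in such details.

```python
import numpy as np


def ols_newey_west(y, X, lags):
    """OLS coefficients with Newey-West (Bartlett kernel) standard errors.

    X must already contain a constant column if an intercept is wanted.
    Returns (beta, standard errors, t-statistics).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    Xu = X * u[:, None]                 # rows are x_t * u_t
    S = Xu.T @ Xu                       # lag-0 (White) term
    for j in range(1, lags + 1):
        w = 1.0 - j / (lags + 1.0)      # Bartlett weight
        G = Xu[j:].T @ Xu[:-j]          # lag-j autocovariance of x_t * u_t
        S += w * (G + G.T)
    cov = XtX_inv @ S @ XtX_inv
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se


# Hypothetical simulated data, for illustration only.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = X @ np.array([0.5, 2.0]) + rng.normal(size=200)
beta, se, tstat = ols_newey_west(y, X, lags=18)
```

With `lags=0` the estimator reduces to White's heteroskedasticity-robust covariance. The 18 lags used in the table exceed the 12-month overlap in annual excess returns measured at a monthly frequency.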


B.3.2 Holding Period: 3 months


The table reports regression coefficients for $rx^{(n)}_{t+3/12} = Z_t\theta^{(n)} + u^{(n)}_{t+3/12}$. Newey and West (1987) corrected t-statistics using 5 lags are in parentheses () and Hansen and Hodrick (1980) corrected t-statistics using 3 lags are in brackets [].

Maturities             2         3         4         5         6         7         8         9         10

Panel A: $rx^{(n)}_{t+3/12} = \theta^{(n)}_0 + \theta^{(n)}_{PC}PC_t + u^{(n)}_{t+3/12}$.
$\theta^{(n)}_{PC}$    0.0941    0.3425    0.5141    0.7765    0.8228    1.1322    1.3727    1.6261    1.6256
                      (0.9361)  (1.6912)  (1.8841)  (2.2702)  (1.9810)  (2.3908)  (2.6350)  (2.9719)  (2.5602)
                      [0.9280]  [1.6512]  [1.8314]  [2.1904]  [1.8838]  [2.2800]  [2.4988]  [2.8433]  [2.4292]
$R^2$                  0.01      0.02      0.02      0.03      0.03      0.04      0.04      0.05      0.04

Panel B: $rx^{(n)}_{t+3/12} = \theta^{(n)}_0 + \theta^{(n)}_{CP}CP_t + u^{(n)}_{t+3/12}$.
$\theta^{(n)}_{CP}$    0.2071    0.4389    0.5372    0.8639    0.9149    1.0930    1.3189    1.4520    1.5026
                      (2.2254)  (2.4662)  (2.1792)  (2.9784)  (2.7577)  (2.8590)  (3.0874)  (3.1406)  (2.8884)
                      [2.1504]  [2.4024]  [2.1543]  [2.9875]  [2.7823]  [2.8884]  [3.2231]  [3.1706]  [2.9137]
$R^2$                  0.06      0.06      0.05      0.07      0.06      0.06      0.07      0.07      0.06

Panel C: $rx^{(n)}_{t+3/12} = \theta^{(n)}_0 + \theta^{(n)}_{CF}CF_t + u^{(n)}_{t+3/12}$.
$\theta^{(n)}_{CF}$    0.1595    0.3861    0.5847    0.7982    0.9052    1.1224    1.3137    1.4918    1.5460
                      (4.3066)  (5.0386)  (5.4537)  (5.7831)  (5.4811)  (5.8550)  (6.1013)  (6.3384)  (5.8329)
                      [4.1907]  [4.8226]  [5.2015]  [5.5012]  [5.1700]  [5.5158]  [5.6985]  [5.9749]  [5.4895]
$R^2$                  0.07      0.10      0.11      0.12      0.11      0.13      0.14      0.14      0.13

Panel D: $rx^{(n)}_{t+3/12} = \theta^{(n)}_0 + \theta^{(n)}_{RF}RF_t + u^{(n)}_{t+3/12}$.
$\theta^{(n)}_{RF}$    0.1684    0.3728    0.5865    0.7791    0.8902    1.1121    1.2960    1.4979    1.5528
                      (4.5246)  (5.0169)  (5.6531)  (5.8983)  (5.6622)  (6.1930)  (6.4921)  (6.7693)  (6.3247)
                      [4.2596]  [4.6984]  [5.3015]  [5.5413]  [5.3029]  [5.7750]  [6.0340]  [6.3137]  [5.8956]
$R^2$                  0.09      0.10      0.13      0.13      0.12      0.14      0.15      0.16      0.15


B.3.3 Holding Period: 6 months


The table reports regression coefficients for $rx^{(n)}_{t+6/12} = Z_t\theta^{(n)} + u^{(n)}_{t+6/12}$. Newey and West (1987) corrected t-statistics using 9 lags are in parentheses () and Hansen and Hodrick (1980) corrected t-statistics using 6 lags are in brackets [].

Maturities             2         3         4         5         6         7         8         9         10

Panel A: $rx^{(n)}_{t+6/12} = \theta^{(n)}_0 + \theta^{(n)}_{PC}PC_t + u^{(n)}_{t+6/12}$.
$\theta^{(n)}_{PC}$    0.0678    0.3144    0.5166    0.7565    0.9342    1.1630    1.3214    1.5722    1.6580
                      (1.4846)  (2.6257)  (2.8141)  (3.1122)  (3.2951)  (3.3767)  (3.3504)  (3.7370)  (3.3813)
                      [1.3671]  [2.4352]  [2.6083]  [2.8760]  [3.0525]  [3.1090]  [3.0792]  [3.4430]  [3.0954]
$R^2$                  0.02      0.05      0.06      0.08      0.08      0.09      0.09      0.10      0.09

Panel B: $rx^{(n)}_{t+6/12} = \theta^{(n)}_0 + \theta^{(n)}_{CP}CP_t + u^{(n)}_{t+6/12}$.
$\theta^{(n)}_{CP}$    0.1173    0.3557    0.5495    0.8029    0.9654    1.1313    1.3009    1.4644    1.5969
                      (3.1620)  (3.6965)  (4.0030)  (4.6613)  (4.8839)  (4.7665)  (4.9991)  (5.2116)  (4.9227)
                      [3.0978]  [3.6211]  [3.8732]  [4.5280]  [4.7505]  [4.6037]  [4.8465]  [5.0401]  [4.7328]
$R^2$                  0.08      0.11      0.11      0.14      0.14      0.14      0.14      0.14      0.14

Panel C: $rx^{(n)}_{t+6/12} = \theta^{(n)}_0 + \theta^{(n)}_{CF}CF_t + u^{(n)}_{t+6/12}$.
$\theta^{(n)}_{CF}$    0.1064    0.3472    0.5646    0.7780    0.9265    1.1348    1.3162    1.4923    1.6082
                      (4.7192)  (5.9206)  (6.4044)  (6.7024)  (6.9512)  (7.0751)  (7.2896)  (7.4352)  (7.0344)
                      [4.5394]  [5.7175]  [6.1081]  [6.3045]  [6.6013]  [6.6643]  [6.8042]  [6.9119]  [6.5324]
$R^2$                  0.13      0.19      0.22      0.24      0.24      0.25      0.26      0.27      0.26

Panel D: $rx^{(n)}_{t+6/12} = \theta^{(n)}_0 + \theta^{(n)}_{RF}RF_t + u^{(n)}_{t+6/12}$.
$\theta^{(n)}_{RF}$    0.1119    0.3399    0.5574    0.7559    0.9070    1.1250    1.3097    1.5019    1.6082
                      (4.8136)  (5.9278)  (6.5530)  (6.8471)  (7.1438)  (7.4625)  (7.8117)  (8.0493)  (7.6499)
                      [4.5512]  [5.6603]  [6.2077]  [6.3965]  [6.7329]  [6.9948]  [7.2702]  [7.4720]  [7.1026]
$R^2$                  0.15      0.21      0.24      0.25      0.25      0.28      0.29      0.30      0.29


B.3.4 Gurkaynak et al. (2007) data






Maturities             2         3         4         5         6         7         8         9         10

Panel A: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{PC}PC_t + u^{(n)}_{t+1}$.
$\theta^{(n)}_{PC}$    0.1956    0.3995    0.6059    0.8105    1.0119    1.2093    1.4023    1.5907    1.7744
                      (2.5902)  (2.8533)  (3.0878)  (3.2960)  (3.4813)  (3.6428)  (3.7783)  (3.8864)  (3.9675)
                      [2.2848]  [2.5055]  [2.7071]  [2.8894]  [3.0541]  [3.1993]  [3.3226]  [3.4227]  [3.4995]
$R^2$                  0.09      0.12      0.14      0.16      0.17      0.18      0.19      0.20      0.20

Panel B: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{CP}CP_t + u^{(n)}_{t+1}$.
$\theta^{(n)}_{CP}$    0.1947    0.3955    0.5969    0.7982    1.0001    1.2024    1.4043    1.6048    1.8032
                      (2.7828)  (3.1165)  (3.4108)  (3.6765)  (3.9193)  (4.1372)  (4.3253)  (4.4800)  (4.5994)
                      [2.4583]  [2.7406]  [2.9945]  [3.2262]  [3.4397]  [3.6321]  [3.7993]  [3.9379]  [4.0464]
$R^2$                  0.10      0.13      0.15      0.17      0.19      0.21      0.22      0.23      0.24

Panel C: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{CF}CF_t + u^{(n)}_{t+1}$.
$\theta^{(n)}_{CF}$    0.2304    0.4436    0.6438    0.8352    1.0201    1.1996    1.3742    1.5440    1.7092
                      (5.6404)  (6.1164)  (6.5804)  (7.0062)  (7.3516)  (7.5938)  (7.7312)  (7.7761)  (7.7470)
                      [5.0562]  [5.4695]  [5.8861]  [6.2794]  [6.6071]  [6.8433]  [6.9822]  [7.0331]  [7.0124]
$R^2$                  0.32      0.36      0.39      0.42      0.44      0.45      0.47      0.47      0.48

Panel D: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{RF}RF_t + u^{(n)}_{t+1}$.
$\theta^{(n)}_{RF}$    0.2299    0.4412    0.6403    0.8317    1.0177    1.1992    1.3762    1.5483    1.7155
                      (6.3949)  (7.5433)  (8.4357)  (9.1154)  (9.5904)  (9.8788)  (10.0109) (10.0203) (9.9376)
                      [5.7936]  [6.8626]  [7.6983]  [8.3480]  [8.8150]  [9.1069]  [9.2459]  [9.2614]  [9.1832]
$R^2$                  0.36      0.40      0.44      0.47      0.49      0.51      0.53      0.53      0.54

B.3.5 Out-of-Sample Predictability: Yield Levels

Because of the extreme persistence in yields, a great number of studies have documented difficulties in improving forecasting performance over a simple random walk, i.e. $y^{(n)}_{t+h} = y^{(n)}_t$ for all $h$. Here, I show that accounting for the macroeconomic origin of such persistence does in fact improve forecasting performance. Using the same methodology as before, I construct forecasts from the regressions

$$y^{(n)}_{t+h} = Z_t\varrho^{(n)} + \zeta^{(n)}_{t+h}, \tag{B.34}$$

estimated on an expanding sample. As before, I start the test period in January 1990. The set of predictor variables $Z_t$ is in turn i) the first three principal components, ii) the first five forward rates, iii) inflation expectations and two interest rate cycles, and finally iv) the proposed four factors, i.e. expected consumption growth, expected inflation, and two latent factors.

Table B.5: Out-of-Sample Prediction Results: Yield Level

This table reports out-of-sample root mean squared forecasting errors for the predictive regressions $y^{(n)}_{t+h} = Z_t\varrho^{(n)} + \zeta^{(n)}_{t+h}$ for $Z_t$ being i) the first three principal components, ii) five forward rates, iii) inflation expectations and two interest rate cycles, iv) and consumption growth expectations, inflation expectations, and two latent factors, (v) optimal and consumption growth expectations, inflation expectations, and two latent factors. All numbers are relative to the root mean squared forecasting errors from using a simple random walk for the same maturity as forecast. Test statistics for the hypothesis that (iv) encompasses a given variable are in parentheses (), and that (v) encompasses a given variable are in brackets []. See Harvey et al. (1998). Standard errors are Newey and West (1987) corrected with ceil(h × 1.5) lags. The test sample is 1990:1 to 2014:12. Factor identification and predictive coefficients are recursively updated as the training sample expands.

                        (i)        (ii)       (iii)      (iv)       (v)

Panel A: 5-year yield
h = 3/12               1.076      1.0742     1.020      0.988      0.984
                      (-0.068)   (-0.200)   (0.488)
                      [0.075]    [-0.062]   [0.858]
h = 6/12               1.112      1.116      0.978      0.933      0.927
                      (0.350)    (0.333)    (0.498)
                      [0.458]    [0.455]    [0.949]
h = 1                  1.249      1.233      1.008      0.926      0.924
                      (0.064)    (-0.169)   (0.537)
                      [-0.039]   [-0.253]   [0.7253]

Panel B: 7-year yield
h = 3/12               1.052      1.056      0.966      0.964      0.962
                      (0.057)    (-0.124)   (-0.608)
                      [0.1101]   [-0.056]   [-0.294]
h = 6/12               1.085      1.081      0.929      0.917      0.912
                      (0.381)    (0.071)    (-0.254)
                      [0.349]    [0.063]    [0.077]
h = 1                  1.228      1.215      0.957      0.895      0.896
                      (-0.011)   (-0.280)   (0.277)
                      [-0.259]   [-0.527]   [0.477]

Panel C: 10-year yield
h = 3/12               1.053      1.167      1.014      1.011      1.006
                      (-2.304)   (0.269)    (-0.648)
                      [-2.010]   [0.334]    [-0.330]
h = 6/12               1.091      1.107      0.928      0.942      0.937
                      (-0.513)   (-0.597)   (-1.283)
                      [-0.471]   [-0.563]   [-1.066]
h = 1                  1.227      1.206      0.931      0.926      0.929
                      (-0.522)   (-1.042)   (-0.825)
                      [-0.733]   [-1.226]   [-0.724]

I use root mean squared forecasting errors as the measure of forecast performance. The numbers are relative to the root mean squared forecasting errors from the simple random walk for the same maturity. I implement the encompassing test in Harvey et al. (1998), where the null hypothesis is that expected consumption growth, expected inflation, and two latent factors encompass the other prominent bond return factors.
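The forecasting exercise can be sketched as follows: an expanding-window regression produces the h-step forecast at each origin, and its root mean squared error is divided by that of the random walk forecast. This is an illustrative NumPy sketch with hypothetical function names; it covers only the relative RMSE, not the recursive factor identification or the Harvey et al. (1998) encompassing test.

```python
import numpy as np


def expanding_forecasts(y, Z, h, first_origin):
    """Expanding-window h-step forecasts of y from predictors Z.

    At each origin t, y[s + h] is regressed on [1, Z[s]] using only pairs
    with s + h <= t, and the fit is used to forecast y[t + h].
    Returns (model forecasts, realized values, random walk forecasts).
    """
    y = np.asarray(y, dtype=float)
    T = len(y)
    X = np.column_stack([np.ones(T), np.asarray(Z, dtype=float)])
    preds, actual, rw = [], [], []
    for t in range(first_origin, T - h):
        coef, *_ = np.linalg.lstsq(X[: t - h + 1], y[h: t + 1], rcond=None)
        preds.append(X[t] @ coef)
        actual.append(y[t + h])
        rw.append(y[t])                 # random walk: forecast equals today's value
    return np.array(preds), np.array(actual), np.array(rw)


def relative_rmse(preds, actual, benchmark):
    """RMSE of the model forecasts divided by RMSE of the benchmark forecasts."""
    def rmse(f):
        return np.sqrt(np.mean((actual - f) ** 2))
    return rmse(preds) / rmse(benchmark)


# Hypothetical data in which y[t + h] is an exact linear function of Z[t].
rng = np.random.default_rng(0)
T, h = 120, 3
Z = rng.normal(size=(T, 2))
y = np.zeros(T)
y[h:] = 2.0 + Z[:-h] @ np.array([3.0, -1.0])
preds, actual, rw = expanding_forecasts(y, Z, h, first_origin=40)
ratio = relative_rmse(preds, actual, rw)
```

A ratio below one means the model beats the random walk, which is how the entries in Table B.5 are read.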


The first two columns confirm the finding that the random walk is difficult to beat with the usual term structure factors. Across maturities, the performance is always worse than the random walk forecast by 5-25%. The performance worsens as the forecast horizon increases, which is consistent with the term structure factors displaying excess mean reversion compared to what is seen in yields. Column (iii) shows that accounting for persistent inflation expectations does improve the forecasting performance relative to the usual term structure factors. Relative to the random walk forecast, the forecasting performance is also improved at the six and twelve month horizons by 4-7% for the 7-year and 10-year yields. For the 5-year yield, the forecasting performance is comparable to the random walk. Finally, column (iv) shows that controlling for expected consumption growth further improves the forecasting performance. Accounting for the macroeconomic expectation factors improves forecasting performance relative to the random walk benchmark across maturities and forecasting horizons, except for the 10-year yield at the three month horizon. The forecasting performance relative to the random walk benchmark improves with the forecasting horizon, which is consistent with the interpretation that macroeconomic expectations capture the most persistent source of variation in yields. The null that expected consumption growth, expected inflation, and two latent factors encompass the other prominent bond return factors is not rejected (with the exception of the level, slope, and curvature factors for the 10-year yield at the three month horizon).

B.4 The Bauer and Hamilton (2018) Bootstrap

1. Estimate a VAR(1) for the level, slope, and curvature factors, i.e.

$$PC_{t+1} = \mu_{PC} + \Phi_{PC}PC_t + v_{PC,t+1}, \tag{B.35}$$

and a VAR(1) for the proposed new factors. In this case, estimate a VAR(1) for $F_t = \left[\tau_{c,t}\;\,P^{(1)}_t\;\,P^{(2)}_t\right]'$, i.e.

$$F_{t+1} = \mu_F + \Phi_F F_t + v_{F,t+1}. \tag{B.36}$$

Finally, estimate the measurement equation for yields under the null, i.e.

$$y^{(n)}_t = \alpha^{(n)}_{PC} + \beta^{(n)}_{PC}PC_t + \kappa^{(n)}_t. \tag{B.37}$$

2. Draw 10,000 bootstrap samples, each with length $T = 518$ as in the original sample, by iterating on

$$\begin{aligned}
PC^{*}_{s+1} &= \mu_{PC} + \Phi_{PC}PC^{*}_s + v^{*}_{PC,s+1} \\
F^{*}_{s+1} &= \mu_F + \Phi_F F^{*}_s + v^{*}_{F,s+1},
\end{aligned} \tag{B.38}$$


where $\left[v^{*}_{PC,s}\;\,v^{*}_{F,s}\right]'$ is drawn with replacement from the joint empirical distribution of $\left[v_{PC,t}\;\,v_{F,t}\right]'$. Finally, the bootstrapped yield curve is computed from

$$y^{*,(n)}_s = \alpha^{(n)}_{PC} + \beta^{(n)}_{PC}PC^{*}_s + \kappa^{*,(n)}_s, \tag{B.39}$$

where $\kappa^{*,(n)}_s \overset{i.i.d.}{\sim} N\left(0, \sigma^2_{\kappa,n}\right)$. The measurement error standard deviation, $\sigma_{\kappa,n}$, is set to the standard error of the empirical measurement errors, $\kappa^{(n)}_t$.

3. For each bootstrap sample, run the regressions$^{7}$

$$rx^{*,(n)}_{s+1} = \theta^{(n)}_0 + \theta^{(n)}_{PC}PC^{*}_s + \theta^{(n)}_F F^{*}_s + u^{(n)}_{s+1}, \tag{B.40}$$

and compute the Wald statistics $\chi^{2*}_{NW}$ and $\chi^{2*}_{HH}$ associated with the null hypothesis $H_0: \theta^{(n)}_F = 0$.
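The simulation part of the three steps above can be sketched as follows. This is a simplified NumPy illustration, not the code behind the reported results: the function names are hypothetical, the starting values of the simulated paths are left to the caller, and the final Wald statistics are omitted. The excess return function follows the definition in the footnote attached to step 3, with `step` assumed to be the number of observations per year.

```python
import numpy as np


def fit_var1(X):
    """OLS estimate of X[t+1] = mu + Phi X[t] + v[t+1]; returns (mu, Phi, residuals)."""
    X = np.asarray(X, dtype=float)
    R = np.column_stack([np.ones(len(X) - 1), X[:-1]])
    coef, *_ = np.linalg.lstsq(R, X[1:], rcond=None)
    return coef[0], coef[1:].T, X[1:] - R @ coef


def bootstrap_factors(var_pc, var_f, pc0, f0, T, rng):
    """Simulate both factor sets, resampling the residual pairs jointly (same row index)."""
    (mu_pc, Phi_pc, v_pc), (mu_f, Phi_f, v_f) = var_pc, var_f
    idx = rng.integers(0, len(v_pc), size=T - 1)
    PC = np.empty((T, len(mu_pc)))
    F = np.empty((T, len(mu_f)))
    PC[0], F[0] = pc0, f0
    for s in range(T - 1):
        PC[s + 1] = mu_pc + Phi_pc @ PC[s] + v_pc[idx[s]]
        F[s + 1] = mu_f + Phi_f @ F[s] + v_f[idx[s]]
    return PC, F


def bootstrap_yields(PC, alpha, beta, sigma_kappa, rng):
    """Yields under the null: affine in PC plus i.i.d. Gaussian measurement error."""
    noise = rng.normal(size=(len(PC), len(alpha))) * sigma_kappa
    return alpha + PC @ beta.T + noise


def excess_return(y_n, y_nm1, y_1, n, step=12):
    """rx(n) = n*y(n)_s - (n-1)*y(n-1)_{s+1} - y(1)_s, one year being `step` observations."""
    return n * y_n[:-step] - (n - 1) * y_nm1[step:] - y_1[:-step]


# Hypothetical simulated "data", used here only to exercise the functions.
rng = np.random.default_rng(0)
X = np.zeros((200, 2))
for t in range(199):
    X[t + 1] = np.array([0.1, -0.2]) + np.array([[0.5, 0.1], [0.0, 0.8]]) @ X[t] + rng.normal(size=2)
mu, Phi, v = fit_var1(X)
PC_star, F_star = bootstrap_factors((mu, Phi, v), (mu, Phi, v), X[0], X[0], 50,
                                    np.random.default_rng(1))
```

Because only the principal components enter the simulated yields, the additional factors have no true predictive content in the bootstrap samples, so the simulated Wald statistics trace out the small-sample distribution under the null.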

B.5 Other Learning Algorithms

I consider two other learning algorithms outlined in Branch and Evans (2006). Both

learning algorithms allow for time-variation in the gain parameter, but in two impor-

tantly different ways.

B.5.1 Recursive Least Squares

The recursive least squares algorithm is a simple formulation of the recursively updated mean. Here,

$$\tau_t = \tau_{t-1} + \nu_t\left(M_t - \tau_{t-1}\right), \tag{B.41}$$

with $\nu_t = t^{-1}$. This structure implies

$$\tau_t = \frac{1}{t}\sum_{i=0}^{t-1}M_{t-i}. \tag{B.42}$$

This implies that all of macroeconomic history receives equal weight at any point in time. In this way, recent macroeconomic experiences are not weighted more heavily than distant ones.
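A short Python sketch makes the equivalence between the recursion (B.41) with $\nu_t = t^{-1}$ and the sample mean (B.42) concrete. The function name and the data are hypothetical; the initial value of the recursion is irrelevant because the first gain equals one.

```python
def recursive_mean(observations):
    """Recursive least squares with gain 1/t: tau_t = tau_{t-1} + (M_t - tau_{t-1}) / t."""
    tau = 0.0
    path = []
    for t, m in enumerate(observations, start=1):
        tau = tau + (m - tau) / t
        path.append(tau)
    return path


# Hypothetical observations.
data = [1.0, 2.0, 3.0, 6.0]
path = recursive_mean(data)
running_means = [sum(data[: t + 1]) / (t + 1) for t in range(len(data))]
```

Each element of `path` coincides with the corresponding running sample mean, so an observation from the start of the sample carries the same weight as the most recent one.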

B.5.2 Kalman Filter Learning Rule

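The Kalman Filter learning rule sets the gain parameter, $\nu_t$, in an optimal way given the observed data. Here, the recursions are given by

$$\begin{aligned}
\tau_t &= \tau_{t-1} + \nu_t\left(M_t - \tau_{t-1}\right) \\
\nu_t &= \frac{P_{t-1}}{\sigma^2 + P_{t-1}} \\
P_t &= P_{t-1} - \frac{P^2_{t-1}}{\sigma^2 + P_{t-1}} + \chi^2\sigma^2.
\end{aligned} \tag{B.43}$$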

$^{7}$The bootstrap excess returns are computed as $rx^{*,(n)}_{s+1} = n\,y^{*,(n)}_s - (n-1)\,y^{*,(n-1)}_{s+1} - y^{*,(1)}_s$.


$P_t$ is an estimate of the variance of $\tau_t$, and $\nu_t$ here is the optimal Kalman gain (Hamilton, 1994). I follow Branch and Evans (2006) and estimate $\sigma^2$ as the variance of the residuals from an AR(1) model for $M_t$. I set $\chi_c = 0.009$ and $\chi_\pi = 0.016$. This is in line with evidence that inflation experienced greater structural change over the sample period compared to real activity (Branch and Evans, 2006).
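The scalar recursions of the Kalman Filter learning rule can be sketched as below. This is an illustrative Python sketch under stated assumptions: the gain is computed from the previous-period variance, $\nu_t = P_{t-1}/(\sigma^2 + P_{t-1})$, which is an assumption about the intended timing, and the initial values `tau0` and `P0` as well as the numbers in the example are hypothetical.

```python
def kalman_learning(observations, tau0, P0, sigma2, chi):
    """Time-varying-gain learning with a scalar Kalman filter.

    gain_t = P_{t-1} / (sigma2 + P_{t-1})
    tau_t  = tau_{t-1} + gain_t * (M_t - tau_{t-1})
    P_t    = P_{t-1} - P_{t-1}**2 / (sigma2 + P_{t-1}) + chi**2 * sigma2
    """
    tau, P = tau0, P0
    taus, gains = [], []
    for m in observations:
        gain = P / (sigma2 + P)
        tau = tau + gain * (m - tau)
        P = P - P ** 2 / (sigma2 + P) + chi ** 2 * sigma2
        taus.append(tau)
        gains.append(gain)
    return taus, gains


def steady_state_gain(sigma2, chi):
    """Limit of the gain: solve P**2 = chi**2 * sigma2 * (sigma2 + P) for P > 0."""
    P = sigma2 * (chi ** 2 + (chi ** 4 + 4.0 * chi ** 2) ** 0.5) / 2.0
    return P / (sigma2 + P)


# Hypothetical example with a deliberately large chi so the gain settles quickly.
taus, gains = kalman_learning([0.02] * 200, tau0=0.0, P0=1.0, sigma2=1.0, chi=0.5)
limit = steady_state_gain(1.0, 0.5)
```

The gain starts from whatever the initial variance implies and settles at a constant that increases with $\chi$, so a larger $\chi$ corresponds to faster discounting of past data, in the spirit of a constant-gain rule.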

B.5.3 Results

I construct two bond return factors, $RF^{RLS}_t$ and $RF^{KF}_t$, following the same procedure as in the main paper. The results from the maturity-specific predictive regressions are reported below.






Maturities             2         3         4         5         6         7         8         9         10

Panel A: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{RLS}RF^{RLS}_t + u^{(n)}_{t+1}$.
$\theta^{(n)}_{RLS}$   0.2197    0.4289    0.6261    0.8355    1.0337    1.1810    1.4288    1.5610    1.6852
                      (3.3135)  (3.7794)  (4.1957)  (4.6530)  (4.7691)  (4.6853)  (5.0825)  (4.9936)  (4.7582)
                      [2.9172]  [3.3283]  [3.7036]  [4.1120]  [4.2206]  [4.1352]  [4.4942]  [4.4051]  [4.2038]
$R^2$                  0.16      0.18      0.20      0.23      0.25      0.24      0.27      0.27      0.25

Panel B: $rx^{(n)}_{t+1} = \theta^{(n)}_0 + \theta^{(n)}_{KF}RF^{KF}_t + u^{(n)}_{t+1}$.
$\theta^{(n)}_{KF}$    0.2154    0.4319    0.6379    0.8285    1.0215    1.2036    1.3974    1.5400    1.7239
                      (5.5073)  (6.6604)  (7.6016)  (8.3371)  (9.1651)  (9.9005)  (10.0281) (10.2180) (10.7496)
                      [4.9678]  [6.0326]  [6.8962]  [7.5999]  [8.4485]  [9.1668]  [9.3866]  [9.5759]  [10.1642]
$R^2$                  0.33      0.39      0.44      0.48      0.51      0.53      0.55      0.56      0.55

B.6 Information in Cycle Factors

The natural question to ask is, what kind of variation do the latent factors capture? In order to address this question, I augment the Mincer and Zarnowitz (1969) regressions for consumption and inflation with the two latent factors.

Table B.7 shows that the latent slope factor predicts future consumption growth one month, one quarter, and half a year ahead with a negative regression slope. The significance of the added predictive power increases with the horizon, and is statistically significant at the 10 percent significance level for $h = 3/12$ and at the 5 percent significance level for $h = 6/12$. Both latent factors are strong predictors of future inflation at the 1 percent significance level for all horizons. Rudebusch and


Table B.7: Augmented Predictive Macroeconomic Regressions

Results are for the regressions $\Delta c_{t+h} = \rho^{(h)}_{0,c} + \rho^{(h)}_{1,c}\tau_{c,t} + \rho^{(h)}_{2,c}P^{(1)}_t + \rho^{(h)}_{3,c}P^{(2)}_t + \varepsilon_{c,t+h}$ and $\pi_{t+h} = \rho^{(h)}_{0,\pi} + \rho^{(h)}_{1,\pi}\tau_{\pi,t} + \rho^{(h)}_{2,\pi}P^{(1)}_t + \rho^{(h)}_{3,\pi}P^{(2)}_t + \varepsilon_{\pi,t+h}$. Newey and West (1987) corrected t-statistics in absolute values using 5 lags for the null hypotheses $H_0: \rho^{(h)}_{2,i} = 0$ and $H_0: \rho^{(h)}_{3,i} = 0$ for $i \in \{c, \pi\}$ are in parentheses.

Panel A: $\Delta c_{t+h}$
                       h = 1/12    h = 3/12    h = 6/12
$\rho^{(h)}_{2,c}$     −0.0097     −0.0069      0.0050
                       (0.2234)    (0.3401)    (0.1387)
$\rho^{(h)}_{3,c}$     −0.1441     −0.2043     −0.2765
                       (1.2701)    (1.8521)    (2.5723)
$R^2$                   0.21        0.17        0.13

Panel B: $\pi_{t+h}$
                       h = 1/12    h = 3/12    h = 6/12
$\rho^{(h)}_{2,\pi}$   −0.1220     −0.1361     −0.1717
                       (2.8195)    (3.0415)    (3.4322)
$\rho^{(h)}_{3,\pi}$    0.8233      0.8781      0.9483
                       (5.6388)    (5.4770)    (5.3548)
$R^2$                   0.68        0.67        0.64

Williams (2009) document that the yield curve has information useful for forecasting

recessions above and beyond that of professional forecasters. This is in line with the

interpretation of the latent factors governing variation in marginal utilities that is not

reflected in macroeconomic expectations. Because the latent factors are reflected in

marginal utilities of consumption, they are priced into the bond market, although

the factors are not processed when the representative household forms conditional

expectations of macroeconomic variables. In this sense, the representative household

has irrational conditional expectations, although expectations are not systematically

wrong and thus not irrational in the sense of Mincer and Zarnowitz (1969).


B.7 Term Premia Estimates

B.7.1 Unrestricted Model: Monthly VAR

Table B.8: Unrestricted Model: Learning from Macro Experiences

This table reports the estimated model parameters. Estimation is by OLS.

Short Rate Parameters:

$\alpha^{(12)}$     $\beta^{(12)}_c$    $\beta^{(12)}_\pi$    $\beta^{(12)}_1$    $\beta^{(12)}_2$
−0.0255             1.4319              1.5902                0.2815              0.7154
(0.0007)            (0.0365)            (0.0132)              (0.0058)            (0.0211)

Time Series Parameters:

$\mu_P(1,1)$        $\mu_P(2,1)$        $\Phi_P(1,1)$    $\Phi_P(1,2)$    $\Phi_P(2,1)$    $\Phi_P(2,2)$
4.26×10⁻⁵           2.35×10⁻⁵           0.9318           0.0472           −0.0002          0.9467
(0.0005)            (0.0002)            (0.0178)         (0.0504)         (0.0062)         (0.0188)

Table B.9: Unrestricted Model: Three-factor DTSM

Short Rate Parameters:

$\alpha^{(12)}$       $\beta^{(12)}_1$      $\beta^{(12)}_2$      $\beta^{(12)}_3$
0.0001 (0.0002)       0.3103 (0.0007)       0.6461 (0.0046)       0.5451 (0.0183)

Time Series Parameters:

$\mu_{PC}(1,1)$       $\Phi_{PC}(1,1)$      $\Phi_{PC}(1,2)$      $\Phi_{PC}(1,3)$
0.0019 (0.0020)       0.9918 (0.0084)       0.0435 (0.0450)       −0.0079 (0.1989)

$\mu_{PC}(2,1)$       $\Phi_{PC}(2,1)$      $\Phi_{PC}(2,2)$      $\Phi_{PC}(2,3)$
−0.0005 (0.0007)      0.0006 (0.0025)       0.9580 (0.0155)       −0.0543 (0.0794)

$\mu_{PC}(3,1)$       $\Phi_{PC}(3,1)$      $\Phi_{PC}(3,2)$      $\Phi_{PC}(3,3)$
0.0012 (0.0003)       −0.0037 (0.0010)      0.0112 (0.0076)       0.8188 (0.0320)


Figure B.1: Model Implied Term Premia

This figure reports the decomposition of model implied yields with 10 years to maturity into term premia and short rate expectations. The leftmost graph plots the decomposition for the proposed learning from macroeconomic experiences model, and the rightmost graph plots the decomposition for the benchmark three-factor DTSM.


Figure B.2: Equilibrium Real Rate

This figure reports the model implied equilibrium real rate, $r^\star_t$. The equilibrium real rate is defined to be $r^\star_t = \lim_{h\to\infty} E_t[r_{t+h}]$, where $r_t = y^{(12)}_t - E_t[\pi_{t+12}]$ denotes the real short rate. Using the model structure, the equilibrium real rate is given by

$$ r^\star_t = \alpha^{(12)} + \beta^{(12)}_{\tau,c}\,\tau_{c,t} + \left(\beta^{(12)}_{\tau,\pi} - 1\right)\tau_{\pi,t} + \beta^{(12)}_{P}\left(I - \Phi_P\right)^{-1}\mu_P, $$

since the long-run consumption growth and inflation expectations are given by $\lim_{h\to\infty} E_t[\Delta c_{t+h}] = \tau_{c,t}$ and $\lim_{h\to\infty} E_t[\pi_{t+h}] = \tau_{\pi,t}$, respectively.
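The closed-form expression for the equilibrium real rate can be evaluated directly once the parameters are available. The following is a minimal sketch, not the author's code: the function name `equilibrium_real_rate` and the argument layout are hypothetical, and the trend values passed in the usage lines are made-up inputs. The parameter values in the usage lines are the point estimates reported in Table B.8.

```python
import numpy as np


def equilibrium_real_rate(alpha, beta_tau, beta_P, tau, mu_P, Phi_P):
    """Evaluate r*_t = alpha + b_c*tau_c + (b_pi - 1)*tau_pi + beta_P (I - Phi_P)^{-1} mu_P.

    beta_tau = (b_c, b_pi) and tau = (tau_c, tau_pi); all names are illustrative.
    """
    beta_tau = np.asarray(beta_tau, dtype=float)
    beta_P = np.asarray(beta_P, dtype=float)
    tau = np.asarray(tau, dtype=float)
    mu_P = np.asarray(mu_P, dtype=float)
    Phi_P = np.asarray(Phi_P, dtype=float)

    # Long-run mean of the latent factors: solve (I - Phi_P) x = mu_P.
    P_bar = np.linalg.solve(np.eye(len(mu_P)) - Phi_P, mu_P)

    # The real rate nets out expected inflation, hence (b_pi - 1) on tau_pi.
    real_loadings = beta_tau - np.array([0.0, 1.0])
    return alpha + real_loadings @ tau + beta_P @ P_bar


# Usage with the Table B.8 point estimates; the trend values are hypothetical.
r_star = equilibrium_real_rate(
    alpha=-0.0255,
    beta_tau=[1.4319, 1.5902],
    beta_P=[0.2815, 0.7154],
    tau=[0.02, 0.02],  # assumed (tau_c, tau_pi), for illustration only
    mu_P=[4.26e-5, 2.35e-5],
    Phi_P=[[0.9318, 0.0472], [-0.0002, 0.9467]],
)
print(r_star)
```

Solving the linear system rather than forming the inverse explicitly is a numerical convenience and does not change the formula.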

B.7.2 Unrestricted Model: Annual VAR

Table B.10: Unrestricted Model: Learning from Macro Experiences

Short Rate Parameters:

$\alpha^{(1)}$        $\beta^{(1)}_c$       $\beta^{(1)}_\pi$     $\beta^{(1)}_1$       $\beta^{(1)}_2$
−0.0255 (0.0007)      1.4319 (0.0365)       1.5902 (0.0132)       0.2815 (0.0058)       0.7154 (0.0211)

Time Series Parameters:

$\mu_P(1,1)$          $\mu_P(2,1)$          $\Phi_P(1,1)$         $\Phi_P(1,2)$         $\Phi_P(2,1)$         $\Phi_P(2,2)$
0.0007 (0.0043)       0.0002 (0.0014)       0.2515 (0.1200)       0.7253 (0.3332)       −0.0512 (0.0398)      0.5884 (0.1073)


Table B.11: Unrestricted Model: Three-factor DTSM

Short Rate Parameters:

$\alpha^{(1)}$        $\beta^{(1)}_1$       $\beta^{(1)}_2$       $\beta^{(1)}_3$
0.0001 (0.0002)       0.3103 (0.0007)       0.6461 (0.0046)       0.5451 (0.0183)

Time Series Parameters:

$\mu_{PC}(1,1)$       $\Phi_{PC}(1,1)$      $\Phi_{PC}(1,2)$      $\Phi_{PC}(1,3)$
0.0234 (0.0234)       0.8999 (0.0964)       0.6319 (0.4474)       0.4884 (1.7878)

$\mu_{PC}(2,1)$       $\Phi_{PC}(2,1)$      $\Phi_{PC}(2,2)$      $\Phi_{PC}(2,3)$
−0.0041 (0.0058)      0.0004 (0.0202)       0.5971 (0.1097)       −0.5058 (0.4039)

$\mu_{PC}(3,1)$       $\Phi_{PC}(3,1)$      $\Phi_{PC}(3,2)$      $\Phi_{PC}(3,3)$
0.0035 (0.0011)       −0.0121 (0.0035)      −0.0063 (0.0280)      0.3793 (0.0717)

B.7.3 No-Arbitrage: Annual VAR

B.7.3.1 Estimation

I estimate the model by maximum likelihood treating the four factors as observed.
All yields included in the estimation are taken to be measured with a small amount of
noise, i.e.

$$ y^o_t - y_t = \varepsilon_t, \tag{B.44} $$

where $y^o_t$ is the vector of observed yields and $y_t$ is the vector of model-implied yields.
The measurement errors $\varepsilon_t \sim N(0, \sigma^2_\varepsilon I)$ are uncorrelated with other innovations
at all leads and lags. I implement the model using an annual VAR(1) for the two
latent pricing factors, but sampled at the monthly frequency. That is, the latent
pricing factors are regressed on a constant and their one-year lagged values. These
regressions are estimated using the monthly observations. As suggested by the results
in Cochrane and Piazzesi (2005) and Cochrane and Piazzesi (2008), the dynamics
of risk premia are clearer at an annual horizon. Further, the annual horizon is
consistent with the one-year holding period considered throughout this text. This
implies the likelihood function

$$ \log L(\Theta) = \sum_{t=1}^{T} \log f_y\left(y^o_t \mid P_t, \tau_t, \Theta\right) + \log f_P\left(P_t \mid P_{t-12}, \Theta\right), \tag{B.45} $$

where $f_y$ and $f_P$ are normal pdfs. The structure of the model admits a separation of
the parameters $\Theta$ into time series parameters governing the dynamics of the factors
and cross-sectional parameters determining the factor loadings. This separation is
such that $\Theta_1 = \{\alpha, \beta_\tau, \beta_P, \lambda_{0,\tau}, \lambda_{1,\tau}, \lambda_{0,P}, \lambda_{1,P}, \Sigma_P\}$ and $\Theta_2 = \{\mu_P, \Phi_P\}$, and the
likelihood reads

$$ \log L(\Theta) = \sum_{t=1}^{T} \log f_y\left(y^o_t \mid P_t, \tau_t, \Theta_1\right) + \log f_P\left(P_t \mid P_{t-12}, \Theta_1, \Theta_2\right). \tag{B.46} $$

The only content of $\Theta_1$ which affects $f_P$ is the covariance matrix $\Sigma_P$, and from the
well-known Zellner (1962) result, this covariance matrix does not affect the maximum
likelihood estimates of the parameters in $\Theta_2$. Thus, as suggested in Joslin, Singleton,
and Zhu (2011), I adopt a two-step procedure where, firstly, I estimate $\Theta_2$ by simple
OLS, and then secondly, maximize over $\Theta_1$ conditional on the estimates of $\Theta_2$. Here,
the first step provides good starting values for $\Sigma_P$ in the second step.
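The first step of the two-step procedure is an ordinary least squares regression of the monthly factor observations on a constant and their 12-month lagged values. The sketch below illustrates that step only, under the assumption that the factors are stored as a T-by-k array of monthly observations; the function name `fit_lagged_var` is hypothetical, and the second step (numerical maximization over the cross-sectional parameters) is not shown.

```python
import numpy as np


def fit_lagged_var(P, lag=12):
    """OLS of P_t on a constant and P_{t-lag}.

    P is a (T, k) array of monthly factor observations. Returns (mu, Phi, Sigma),
    where Sigma is the residual covariance with divisor equal to the sample size.
    """
    P = np.asarray(P, dtype=float)
    Y = P[lag:]                                        # left-hand side: P_t
    X = np.column_stack([np.ones(len(Y)), P[:-lag]])   # regressors: [1, P_{t-lag}]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    mu = coef[0]
    Phi = coef[1:].T                                   # row i: loadings of factor i
    resid = Y - X @ coef
    Sigma = resid.T @ resid / len(Y)                   # candidate starting value for Sigma_P
    return mu, Phi, Sigma
```

Because every monthly observation with a 12-month-old counterpart enters the regression, the residuals overlap; the sketch makes no correction for this and only returns point estimates.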

B.7.3.2 Three-factor ATSM: Estimation Results

Table B.12: No-Arbitrage Restrictions: Three-factor ATSM

This table reports the estimated model parameters when enforcing the no-arbitrage restrictions. Asymptotic standard errors computed from the outer product of the gradient of the likelihood function are provided in parentheses.

Short-rate Parameters:

$\alpha$              $\beta_{PC}(1,1)$     $\beta_{PC}(1,2)$     $\beta_{PC}(1,3)$
0.0001 (0.0521)       0.3106 (0.1213)       0.6461 (0.1532)       0.5250 (0.1659)

Market Price of Risk Parameters:

$\lambda_{0,PC}(1,1)$   $\lambda_{1,PC}(1,1)$   $\lambda_{1,PC}(1,2)$   $\lambda_{1,PC}(1,3)$
0.0225 (0.0136)         −0.1067 (0.0437)        1.4613 (0.2466)         1.0281 (0.6762)

$\lambda_{0,PC}(2,1)$   $\lambda_{1,PC}(2,1)$   $\lambda_{1,PC}(2,2)$   $\lambda_{1,PC}(2,3)$
−0.0035 (0.0037)        −0.0080 (0.0122)        −0.0590 (0.0685)        0.7876 (0.2655)

$\lambda_{0,PC}(3,1)$   $\lambda_{1,PC}(3,1)$   $\lambda_{1,PC}(3,2)$   $\lambda_{1,PC}(3,3)$
0.0040 (0.0007)         −0.0138 (0.0024)        0.0011 (0.0144)         −0.3700 (0.0582)

Variance-Covariance of Innovations:

$\Sigma_{PC}(1,1)$    $\Sigma_{PC}(2,1)$    $\Sigma_{PC}(2,2)$
0.0398 (0.0014)       0.0055 (0.0005)       0.0088 (0.0004)

$\Sigma_{PC}(3,1)$    $\Sigma_{PC}(3,2)$    $\Sigma_{PC}(3,3)$
−0.0001 (0.0001)      0.0006 (0.0001)       0.0024 (0.0001)

Time-Series Parameters:

$\mu_{PC}(1,1)$       $\Phi_{PC}(1,1)$      $\Phi_{PC}(1,2)$      $\Phi_{PC}(1,3)$
0.0234 (0.0081)       0.8999 (0.0267)       0.6319 (0.1984)       0.4884 (0.6260)

$\mu_{PC}(2,1)$       $\Phi_{PC}(2,1)$      $\Phi_{PC}(2,2)$      $\Phi_{PC}(2,3)$
−0.0041 (0.0021)      0.0004 (0.0081)       0.5971 (0.0571)       −0.5058 (0.1911)

$\mu_{PC}(3,1)$       $\Phi_{PC}(3,1)$      $\Phi_{PC}(3,2)$      $\Phi_{PC}(3,3)$
0.0035 (0.0006)       −0.0121 (0.0022)      −0.0063 (0.0121)      0.3793 (0.0468)


B.7.3.3 Learning from Macroeconomic Experiences: Estimation Results

Table B.13: No-Arbitrage Restrictions: Model Parameters

This table reports the estimated model parameters when enforcing the no-arbitrage restrictions. Asymptotic standard errors computed from the outer product of the score function are provided in parentheses.

Short-rate Parameters:

$\alpha$              $\beta_\tau(1,1)$     $\beta_\tau(1,2)$     $\beta_P(1,1)$        $\beta_P(1,2)$
−0.0263 (0.0108)      1.4536 (0.2738)       1.6019 (0.1529)       0.2833 (0.1113)       0.7113 (0.2513)

Market Price of Risk Parameters:

$\lambda_{0,\tau}(1,1)$   $\lambda_{0,\tau}(2,1)$   $\lambda_{1,\tau}(1,1)$   $\lambda_{1,\tau}(1,2)$   $\lambda_{1,\tau}(2,1)$   $\lambda_{1,\tau}(2,2)$
−0.0033 (0.0080)          −0.0088 (0.0162)          0.0635 (0.1499)           −0.0634 (0.0603)          0.1035 (0.0212)           0.0570 (0.0336)

$\lambda_{0,P}(1,1)$      $\lambda_{0,P}(2,1)$      $\lambda_{1,P}(1,1)$      $\lambda_{1,P}(1,2)$      $\lambda_{1,P}(2,1)$      $\lambda_{1,P}(2,2)$
0.0287 (0.0935)           0.0029 (0.0107)           −0.7827 (0.1039)          1.6033 (0.4191)           −0.1059 (0.0316)          0.0834 (0.0815)

Variance-Covariance of Innovations:

$\Sigma_P(1,1)$       $\Sigma_P(2,1)$       $\Sigma_P(2,2)$
0.0304 (0.0010)       0.0045 (0.0005)       0.0085 (0.0004)

Time-Series Parameters:

$\mu_P(1,1)$          $\mu_P(2,1)$          $\Phi_P(1,1)$         $\Phi_P(1,2)$         $\Phi_P(2,1)$         $\Phi_P(2,2)$
0.0007 (0.0019)       0.0002 (0.0005)       0.2515 (0.0569)       0.7253 (0.1686)       −0.0512 (0.0168)      0.5884 (0.0513)


B.7.3.4 Model-implied Term Premia and Equilibrium Real Rate

Figure B.3: Model Implied Term Premia

This figure reports the decomposition of model implied yields with 10 years to maturity into term premia and short rate expectations. The leftmost graph plots the decomposition for the proposed learning from macroeconomic experiences model, and the rightmost graph plots the decomposition for the benchmark three-factor DTSM.


Figure B.4: Equilibrium Real Rate

This figure reports the model implied equilibrium real rate, $r^\star_t$. The equilibrium real rate is defined to be $r^\star_t = \lim_{h\to\infty} E_t[r_{t+h}]$, where $r_t = y^{(1)}_t - E_t[\pi_{t+1}]$ denotes the real short rate. Using the model structure, the equilibrium real rate is given by

$$ r^\star_t = \alpha + \beta_{\tau,c}\,\tau_{c,t} + \left(\beta_{\tau,\pi} - 1\right)\tau_{\pi,t} + \beta_{P}\left(I - \Phi_P\right)^{-1}\mu_P, $$

since the long-run consumption growth and inflation expectations are given by $\lim_{h\to\infty} E_t[\Delta c_{t+h}] = \tau_{c,t}$ and $\lim_{h\to\infty} E_t[\pi_{t+h}] = \tau_{\pi,t}$, respectively.

B.7.3.5 Model-Implied Term Premia: Specification Tests

Since the contribution of Dai and Singleton (2002), a popular way of assessing whether
model-implied term premia are well-specified is by means of the ordinary Campbell and
Shiller (1991) regressions, i.e.

$$ y^{(n-h)}_{t+h} - y^{(n)}_t = \theta^{(n)}_0 + \theta^{(n)}_1 \frac{h}{n-h}\left(y^{(n)}_t - y^{(h)}_t\right) + \varepsilon^{(n)}_{t+h}, \tag{B.47} $$

and the adequately risk-adjusted Campbell and Shiller (1991) regressions, i.e.

$$ y^{(n-h)}_{t+h} - y^{(n)}_t - \Delta\psi^{(n-h)}_{t,t+h} + \frac{h}{n-h}\,\zeta^{(n-h,h)}_t = \delta^{(n)}_0 + \delta^{(n)}_1 \frac{h}{n-h}\left(y^{(n)}_t - y^{(h)}_t\right) + e^{(n)}_{t+h}, \tag{B.48} $$

where $\Delta\psi^{(n-h)}_{t,t+h} = \psi^{(n-h)}_{t+h} - \psi^{(n-h)}_t$ and $\zeta^{(n,h)}_t = f^{(n,h)}_t - E_t[r_{t+n}]$ is the forward premium.
The intuition behind these specification tests is straightforward. If model-implied
term premia are correctly specified, they should account for the empirically
observed deviations from the expectations hypothesis, e.g. the evidence against the
expectations hypothesis presented in Campbell and Shiller (1991). That is, model-implied
yields should display the same pattern of negative and decreasing regression
slopes $\theta^{(n)}_1$ as a function of maturity as seen in the data. Further, adequately
risk-adjusting yields, i.e. accounting for $\Delta\psi^{(n-h)}_{t,t+h}$ and $\zeta^{(n,h)}_t$ as in (B.48), should recover
$\delta^{(n)}_1 = 1$ for all n as implied by the expectations hypothesis.
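The ordinary regression in (B.47) amounts to a bivariate OLS regression once the yield series are aligned in time. The following is a minimal sketch under the assumption that the caller has already shifted the lead yield so that all three arrays share the same time index; the function name `campbell_shiller_slope` and its arguments are hypothetical, and no standard errors are computed.

```python
import numpy as np


def campbell_shiller_slope(y_long, y_short, y_lead, n, h):
    """OLS intercept and slope of the ordinary Campbell-Shiller regression.

    y_long[t]  : n-period yield at time t
    y_short[t] : h-period yield at time t
    y_lead[t]  : (n-h)-period yield at time t+h, aligned to row t by the caller
    """
    y_long, y_short, y_lead = (
        np.asarray(a, dtype=float) for a in (y_long, y_short, y_lead)
    )
    lhs = y_lead - y_long                       # change in the long yield
    rhs = (h / (n - h)) * (y_long - y_short)    # maturity-scaled yield spread
    X = np.column_stack([np.ones_like(rhs), rhs])
    (theta0, theta1), *_ = np.linalg.lstsq(X, lhs, rcond=None)
    return theta0, theta1
```

The risk-adjusted version in (B.48) could reuse the same function by subtracting the model-implied adjustment terms from `y_lead - y_long` before the regression; those terms come from the estimated model and are not constructed here.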

I simulate 100,000 observations from my four-factor model and the three-factor

Gaussian DTSM. To simulate from the macroeconomic variables in my model, I fit an

annual VAR(1) sampled at a monthly frequency. I use the estimated annual VAR(1)

to sample from consumption growth and inflation, and I then compute implied

conditional expectations as perceived by the representative household using the

constant-gain algorithm. In the leftmost column of Figure B.5, I report results for

my four-factor model. The model-implied ordinary Campbell and Shiller (1991)

regression slopes are negative and decreasing with maturity as seen in the data.

Although negative and decreasing, the model-implied regression slopes are slightly

above those observed in the data. However, the model-implied regression slopes are

not outside the 95% confidence intervals, with the exception of the 10-year maturity

which is a borderline case. Thus it is likely that the model could have generated the

pattern observed in the data. In the same way, the risk-adjusted regressions show

that the expectations hypothesis implied unity regression slopes are within the 95%

confidence intervals for all maturities except the 10-year bond, which is again a

borderline case. And so, the proposed model has well-specified model-implied risk

corrections judged from the metric proposed in Dai and Singleton (2002).
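The constant-gain step used in the simulation is not spelled out in this appendix. As a rough illustration only, the sketch below assumes the generic constant-gain recursion $\tau_t = \tau_{t-1} + \nu\,(x_t - \tau_{t-1})$ with gain $\nu$; the specification used in the chapter may differ, and the function name `constant_gain_expectations` and the initial value `tau0` are hypothetical.

```python
import numpy as np


def constant_gain_expectations(x, gain, tau0):
    """Assumed recursion: tau_t = tau_{t-1} + gain * (x_t - tau_{t-1}).

    x may be a 1-D series or a (T, k) array of simulated macro variables;
    tau0 is the (assumed) initial belief with matching dimension.
    """
    x = np.asarray(x, dtype=float)
    tau = np.empty_like(x)
    prev = np.asarray(tau0, dtype=float)
    for t in range(len(x)):
        # Beliefs move a fixed fraction of the way towards the latest observation.
        prev = prev + gain * (x[t] - prev)
        tau[t] = prev
    return tau
```

With a small gain, such as the value of 0.016 that Table B.16 reports for the data generating process, beliefs under this assumed rule adjust slowly to new observations.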


Figure B.5: Ordinary and Risk-Adjusted Campbell-Shiller Regressions

This figure plots ordinary and risk-adjusted Campbell and Shiller (1991) regressions. Learning from Macroeconomic Experiences is in the leftmost column, and the three-factor DTSM is in the rightmost column. Model-implied ordinary Campbell and Shiller (1991) regression coefficients are obtained from simulated samples of 100,000 observations. Confidence bounds are computed from Hansen and Hodrick (1980) corrected standard errors using 12 lags.

B.7.3.6 Out-of-Sample Forecasting Performance

Another way to study the specification of conditional moments is by evaluating how

well the models predict future yields out of sample. Although I have documented

out-of-sample forecasting gains by using the four yield curve factors identified in this

paper, it is interesting to see how well the models perform when enforcing the no-arbitrage

restrictions. I use the data from November 1971 through January 1990 to estimate

the model parameters and then use the model structure to forecast yields 12 months

ahead. I then add one month of data, re-estimate the model parameters, and use the

model structure to forecast yields 12 months ahead. I iterate on this procedure until

December 2013 as the last year of data is reserved for evaluating the forecasts.

I evaluate the performance over two test samples. The first is the period 1990-

2008, which excludes the period with yields near their zero lower bound (ZLB) for the

shortest maturities. Second, I consider the full period 1990-2015 which includes the

ZLB period. Over both test samples, my model shows sizeable improvements over the

standard three-factor Gaussian DTSM. Over the pre-ZLB period the model further

systematically outperforms the random walk benchmark. Here, the improvements

are in the order of 11-23%, whereas the three-factor Gaussian DTSM has root mean

squared forecasting errors that are in the order of 16-25% greater than the random


walk. Over the full test sample, the relative ranking of the two models favours my

model over the standard three-factor Gaussian DTSM for all maturities. However,

the random walk does better when including the ZLB period. This is to be expected,

since short rates are constrained by the ZLB and thus are extraordinarily persistent

in this part of the sample. The rest of the maturity spectrum inherits much of this

persistence, and this implies that a random walk forecasts extraordinarily well. Over

the full test sample, my model and the random walk display very similar forecasting

performance. Over the sample where yields were unconstrained by their ZLB my

model shows sizeable gains over both the standard three-factor Gaussian DTSM and

the random walk. I cannot reject the null that the learning from macroeconomic

experience model forecasts encompass the standard three factor model forecasts for

any of the considered maturities.
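The expanding-window design and the relative root mean squared error metric can be sketched as follows. This is an illustration of the evaluation loop only: in place of the term structure models, it uses a simple direct autoregressive forecast as a stand-in, and the function name `relative_rmse_expanding` and its arguments are hypothetical.

```python
import numpy as np


def _rmse(errors):
    return np.sqrt(np.mean(np.square(errors)))


def relative_rmse_expanding(y, first_train, horizon=12):
    """RMSE of expanding-window forecasts relative to a random walk.

    y is a monthly series for one maturity. The stand-in forecasting model
    regresses y_{s+horizon} on a constant and y_s using data up to `end` only.
    """
    y = np.asarray(y, dtype=float)
    model_err, rw_err = [], []
    for end in range(first_train, len(y) - horizon):
        # Training pairs (y_s, y_{s+horizon}) with s + horizon <= end.
        x_train = y[: end + 1 - horizon]
        z_train = y[horizon : end + 1]
        X = np.column_stack([np.ones_like(x_train), x_train])
        (a, b), *_ = np.linalg.lstsq(X, z_train, rcond=None)
        target = y[end + horizon]
        model_err.append(target - (a + b * y[end]))
        rw_err.append(target - y[end])   # random walk: no-change forecast
    return _rmse(model_err) / _rmse(rw_err)
```

A value below one means the stand-in model beats the random walk for that series, which is how the entries in Table B.14 are scaled.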

Table B.14: No-Arbitrage Restrictions: Out-of-Sample Forecasts

This table reports one-year-ahead out-of-sample forecasts for (i) the three-factor DTSM and (ii) the learning from macroeconomic experiences four-factor model when enforcing the no-arbitrage restrictions. All numbers are relative to the root mean squared forecasting errors from using a simple random walk for the same maturity as forecast. Test statistics for the hypothesis that the learning from macroeconomic experiences forecast encompasses the three factor model forecasts are in parentheses. Parameters are re-estimated as the training sample expands.

Maturity, n     1990-2008 Sample               1990-2015 Sample
                (i)                 (ii)       (i)                 (ii)
1               1.246 (0.214)       0.891      1.297 (0.125)       0.998
2               1.212 (0.158)       0.875      1.297 (0.470)       1.048
3               1.181 (−0.177)      0.845      1.279 (0.593)       1.057
4               1.162 (−0.471)      0.816      1.254 (0.489)       1.048
5               1.166 (−0.493)      0.796      1.247 (0.586)       1.035
6               1.170 (−0.567)      0.787      1.236 (0.459)       1.024
7               1.171 (−0.674)      0.776      1.219 (0.402)       1.010
8               1.173 (−0.782)      0.767      1.212 (0.357)       1.005
9               1.190 (−0.789)      0.773      1.215 (0.403)       1.006
10              1.204 (−0.779)      0.771      1.207 (0.504)       0.983


B.7.3.7 Simulation Evidence

I argued that term structure data is insufficient to identify the true risk factors, i.e. a

principal component analysis would account for all relevant variation in the cross-

section of bond yields, but would fail to identify the true underlying factor structure.

Here, I provide some simulation evidence to support this statement further. I simulate

1,000 test data sets from the proposed four-factor model and the standard three-factor

Gaussian DTSM with T = 518 observations as in the original sample. I simulate using

the model parameters as estimated from the data and the fitted annual VAR(1) for

consumption growth and inflation. The simulated macroeconomic variables are then

used to form conditional expectations using the constant-gain learning algorithm.

Then, I perform a singular value decomposition for each test data set to detect the

model-implied factor structure. If the factor loading matrices do not suffer from

invertibility issues the singular value decomposition should identify (i) the three-

factor structure in the Gaussian DTSM, and (ii) the four-factor structure in my model.

In Table B.15, I report the average variation explained by each principal compo-

nent across the test data sets. The singular value decompositions correctly identify

three risk factors in the test data sets simulated from the Gaussian DTSM. Although

the variation explained by the third principal component is low, the significant drop

in explanatory power occurs when going from three to four factors. However, the singular value

decomposition of the test data sets simulated from my model displays a two-factor

structure. This is evidence that yield data is insufficient to recover the true under-

lying factor structure. Further, in column (iii) I find that when conditioning on the

macroeconomic factors, the two-factor structure remains. This is consistent with

the identification of the latent factors from the variation that is orthogonal to the

macroeconomic factors. That is, only when conditioning on the macroeconomic

factors is it possible to identify the true four-factor structure in the data generating

process.
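The variance shares reported in Table B.15 can be obtained from the singular values of the demeaned yield panel. The sketch below shows one way to do this, with an optional step that first removes the variation spanned by a set of conditioning factors by least squares; the function name `explained_variance_shares` and the argument `conditioning` are hypothetical, and the projection step is an assumption about how the conditioning is carried out.

```python
import numpy as np


def explained_variance_shares(yields, conditioning=None):
    """Share of total variance explained by each principal component.

    yields       : (T, N) panel of yields
    conditioning : optional (T, k) array; if given, PCs are computed on the
                   residuals from regressing yields on these factors
    """
    Y = np.asarray(yields, dtype=float)
    Y = Y - Y.mean(axis=0)
    if conditioning is not None:
        Z = np.asarray(conditioning, dtype=float)
        Z = Z - Z.mean(axis=0)
        # Keep only the variation orthogonal to the conditioning factors.
        coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
        Y = Y - Z @ coef
    s = np.linalg.svd(Y, compute_uv=False)
    # Squared singular values are proportional to the PC variances.
    return s**2 / np.sum(s**2)
```

Applied to each simulated data set and averaged across data sets, the unconditional call corresponds to the layout of columns (i) and (ii), and the call with macroeconomic factors passed as `conditioning` corresponds to column (iii).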

I then investigate the properties of the implied model factor structure, if the

econometrician misspecifies the learning parameters. The investigation is based on

1,000 test economies with T = 518 observations each as in the original sample. Table

B.16 reports the explained variation by each principal component of the residual vari-

ation in yields after conditioning on the (misspecified) conditional macroeconomic

expectations.

In conclusion, misspecification of the learning parameter does not wash away

the factor structure in yields conditional on macroeconomic expectation factors.

Next, I investigate if misspecification of the learning parameter induces bias in

the VAR transition matrix of the latent factor dynamics. Table B.17 reports the bias

estimates obtained from the simulated test data sets from above.

In conclusion, small misspecifications do not induce severe bias in the transition
matrix of the VAR dynamics for the latent variables. However, extreme misspecification
of the learning parameters can induce a bias.


Table B.15: Model-Implied Factor Structure: Simulation Evidence

This table reports the average variation explained by the principal components of 1,000 test data sets obtained by simulating from (i) the three-factor DTSM, and (ii) the learning from macroeconomic experiences four-factor model. The third column (iii) reports the average variation explained by the principal components of the 1,000 test data sets obtained by simulating from the learning from macroeconomic experiences four-factor model and conditioning on the macroeconomic expectation factors. Each test data set consists of T = 518 observations as in the original sample, and the simulations are conducted using the estimated parameter values.

Variation explained by i'th PC

       Three-factor model    Four-factor model    Four-factor model
                             unconditional        conditional on macro factors
i      (i)                   (ii)                 (iii)
1      98.913%               96.081%              92.877%
2      0.983%                3.507%               6.233%
3      0.065%                0.075%               0.158%
4      0.007%                0.063%               0.143%
5      0.006%                0.057%               0.130%
6      0.006%                0.053%               0.118%
7      0.006%                0.048%               0.106%
8      0.005%                0.044%               0.093%
9      0.005%                0.039%               0.079%
10     0.004%                0.033%               0.062%

Table B.16: Simulation Evidence: Factor Structure

This table reports the average variation explained by the principal components of 1,000 test data sets obtained by simulating from the learning from macroeconomic experiences model. Principal components are of the residual variation in model-implied yields after conditioning on the (misspecified) conditional macroeconomic expectations. The true data generating process has ν = 0.016.

Variation explained by i'th PC

i      ν = 0.016    ν = 0.018    ν = 0.020    ν = 0.050
1      92.872%      92.893%      92.946%      94.202%
2      6.237%       6.221%       6.176%       5.125%
3      0.159%       0.158%       0.156%       0.119%
4      0.143%       0.142%       0.141%       0.106%
5      0.130%       0.129%       0.128%       0.097%
6      0.118%       0.117%       0.116%       0.088%
7      0.106%       0.106%       0.105%       0.080%
8      0.094%       0.093%       0.092%       0.072%
9      0.078%       0.079%       0.078%       0.062%
10     0.062%       0.061%       0.061%       0.051%


Table B.17: Simulation Evidence: Bias in VAR Dynamics

This table reports the estimated bias induced in the VAR transition matrix of the latent factor dynamics by misspecification of the learning parameter ν. The true data generating process has ν = 0.021.

Bias, ν = 0.016          Bias, ν = 0.018
-0.011   -0.004          -0.007   -0.008
 0.000   -0.019           0.001   -0.019

Bias, ν = 0.020          Bias, ν = 0.050
 0.002   -0.018           0.199   -0.221
 0.002   -0.020           0.024   -0.034

CHAPTER 3

BOND RISK PREMIA AT THE ZERO LOWER BOUND

Martin M. Andreasen
Aarhus University, CREATES, and the Danish Finance Institute

Andrew C. Meldrum
Board of Governors of the Federal Reserve System

Abstract

We study bond risk premia at the zero lower bound. In predictive regressions of

excess bond returns onto yield spreads, we document a structural break in regression

coefficients over the recent low interest rate regime. The standard three-factor shadow

rate model fails to account for this empirical pattern. Instead, we propose a shadow

rate model with market prices of risk that switch across non-binding and binding zero

lower bound regimes. Our shadow rate model with regime-dependent market prices

of risk is consistent with the provided regression evidence. The regime-switching

shadow rate model suggests that markets expected monetary policy lift-off to occur

later than otherwise thought.


3.1 Introduction

The classical studies by Fama and Bliss (1987) and Campbell and Shiller (1991) relate

the slope of the current yield curve to risk premia in bond markets. This evidence is in

essence based on predictive linear regressions of excess bond returns onto measures

of yield curve slope. However, the recent episodes with prolonged periods of short-

term interest rates being restricted by their zero lower bound (ZLB) across several

countries pose a serious challenge to this linear relation. As the short end of the

yield curve becomes constrained from below, this in turn affects the slope of the yield

curve and generates a "slope compression effect". That is, the slope of the yield curve

is flatter than it would otherwise have been in the absence of a ZLB, meaning that a

given slope of the yield curve carries a stronger signal at the ZLB. Furthermore, the

recent low interest rate environment has called for unconventional monetary policies

such as forward guidance and quantitative easing. This is likely to affect the required

compensations for risk by bond investors, meaning that we also may have a "price of

risk effect".

We study modified regressions of 1-year excess bond holding period returns onto

the yield spread that allow for separate intercepts and regression slope coefficients

over non-binding and binding ZLB periods. Our modified excess bond return regres-

sions reveal a structural break in the predictive regression coefficients. In "normal"

times — when the federal funds rate is deemed unrestricted by its lower bound — we

find the typical pattern. The regression slope coefficients are positive and increasing

in the maturity of the considered bond. This pattern is amplified when the federal

funds rate is restricted by its ZLB; the regression slope coefficients are larger and

increase faster in the maturity of the considered bond. The estimated differences in

predictive regression coefficients are statistically significant for bonds with maturities

in the 3- through 10-year range. This new empirical fact is robust to: (i) measuring

the slope of the yield curve by a forward spread or the second principal component

of yields instead of the yield spread, (ii) including yield curve level and curvature

factors as control variables, and (iii) including macroeconomic information as control

variables.1

1 We focus on real activity measured by the Chicago Fed National Activity Index (Joslin et al., 2014) and trend inflation (Cieslak and Povala, 2015). Other macroeconomic variables that have been found to have predictive information about future bond returns include the output gap (Cooper and Priestley, 2008), factors extracted from a large macroeconomic data set (Ludvigson and Ng, 2009), and Treasury bond supply (Greenwood and Vayanos, 2014).

Dai and Singleton (2002) use the classical Campbell-Shiller regression evidence
as a specification test for the popular affine term structure models developed in
Duffie and Kan (1996) and Duffee (2002). In a similar spirit, we use our modified
excess return regressions to conduct a specification test of the popular shadow rate
model (SRM) developed in Black (1995) that enforces the ZLB.2 The idea is simple; a
well-specified model of bond risk premia at the ZLB should capture the model-free
empirical patterns from our modified excess return regressions. The SRM seems like
a natural candidate, since it maintains the properties of the affine term structure
model away from the ZLB.

2 Kim and Singleton (2012), Christensen and Rudebusch (2015), Bauer and Rudebusch (2016), and Wu and Xia (2016) study multi-factor versions of the SRM. The ZLB may also be enforced by affine models with square-root processes, as in Cox, Ingersoll, and Ross (1985) and Dai and Singleton (2000), or by quadratic term structure models as in Ahn, Dittmar, and Gallant (2002) and Leippold and Wu (2002). More recently, a number of alternative ways to enforce the ZLB have been proposed by Feunou, Fontaine, Le, and Lundblad (2015), Filipovic, Larsson, and Trolle (2017), and Monfort, Pegoraro, Renne, and Roussellet (2017).

Our main results from estimating a standard three-factor SRM on monthly U.S.
data from January 1990 through December 2017 are as follows. First, the standard
three-factor SRM does not pass our new specification test. The SRM does achieve
differences in regression coefficients when regressing excess bond returns onto the
yield spread across non-binding and binding ZLB periods. These differences, however,
are quantitatively far too small to explain the patterns observed empirically. That is,
the slope compression effect of the standard SRM is not strong enough to explain the
structural break in bond risk premia dynamics at the ZLB. Second, we also show that
the standard SRM is unable to match this break because it implies spells of binding
ZLB that are too short-lived. This is contradictory to one of the arguments that have
popularized the SRM: its capability to generate potentially long spells of zero
short-term interest rates. The implication is that the ZLB regime is characterized by strong
mean-reversion forces pulling the short-term interest rate towards its unconditional
mean, which is far above zero in our sample. These mean-reversion forces contaminate the
ZLB dynamics and explain why the standard three-factor SRM fails our specification
test.

To evaluate the importance of the price of risk effect, we propose a three-factor
SRM with regime-dependent market prices of risk. Here, the marginal bond investor
requires different risk compensations at and away from the ZLB. The ZLB has called
for unconventional policies such as forward guidance and large-scale asset purchases.
Contrary to conventional policies, these policies aim at affecting the slope of the
yield curve through long-maturity bond prices. As these unconventional policies
change the nature of risks in bond markets, we expect bond investors to update the
compensations they require to take on this risk.

We document the following results from estimating our three-factor SRM with
regime-dependent market prices of risk. First, the regime switch in market prices of
risk across non-binding and binding ZLB periods is statistically significant. Second,
our extension of the SRM passes the proposed specification test. Our extension of
the SRM matches the empirical patterns in regression slope coefficients both at and
away from the ZLB. One particular reason is that the extended model implies durations
of ZLB spells that are potentially much longer than implied by the standard SRM.
This is because the extended model implies mean-reversion towards a much lower
long-run short rate within the ZLB regime. That is, the SRM with regime-dependent
market prices of risk does not have very strong mean-reversion forces that pull the
model away from the ZLB regime. Third, the extended SRM implies expected excess
bond returns that were up to 43% more volatile over the recent ZLB episode compared
to their SRM-implied counterparts. The additional volatility in bond risk premia does
not show up in the pre-ZLB regime. The model-implied conditional expectations of
excess bond returns from the SRM with regime-dependent market prices of risk do
better than those from the SRM in Mincer and Zarnowitz (1969)-type specification tests over
the ZLB period. Over the non-binding ZLB period, the two models show comparable
performance. Finally, the SRM with regime-dependent market prices of risk implies
monetary policy lift-off probabilities that were substantially lower than their SRM
counterparts all the way up until shortly before the fact. This suggests that markets
expected monetary policy lift-off to occur later than otherwise thought.

The paper proceeds as follows. Section 3.2 presents our new empirical evidence

on bond risk premia at the ZLB. We conduct our specification test of the baseline

SRM in Section 3.3. In Section 3.4, we present our extension of the SRM with regime-

dependent market prices of risk and show that it passes our proposed specification

test. Section 3.5 discusses the economic implication of our SRM specification with

regime-dependent market prices of risk, while Section 3.6 concludes.

3.2 Bond Return Predictability at the ZLB

Bond risk premia measure the risk compensation required by bond investors to hold

bonds with long maturities. By now, there is substantial evidence for sizeable and

time-varying bond risk premia. The strongest and most robust predictor appears to

be the slope of the yield curve (Fama and Bliss, 1987; Campbell and Shiller, 1991).

More recently, a number of other yield curve and macroeconomic variables have

been found to be significant predictors as well (e.g. Cochrane and Piazzesi, 2005;

Ludvigson and Ng, 2009; Joslin et al., 2014; Cieslak and Povala, 2015). The common
practice for assessing the time-variation and dynamics of bond risk premia is by means of
predictive regressions, where the typical specifications admit the representation

$$ rx^{(n)}_{t+1} = \beta^{(n)}_0 + \beta^{(n)}_1 S_t + \gamma^{(n)\prime} Z_t + \varepsilon^{(n)}_{t+1}. \tag{3.1} $$

Here, $rx^{(n)}_{t+1} = p^{(n-1)}_{t+1} - p^{(n)}_t + p^{(1)}_t$ denotes excess holding period returns (borrow in the
1-year bond, buy an n-year bond, and sell it one year later), and $p^{(n)}_t$ is the log price
of a bond with n years to maturity at time t. The variable $S_t$ denotes the slope
of the current yield curve, and $Z_t$ denotes a vector of control variables including
additional yield curve factors or macroeconomic variables. The forecast errors $\varepsilon^{(n)}_{t+1}$
are orthogonal to the included regressors. The expectations hypothesis predicts
$\beta^{(n)}_1 = \gamma^{(n)} = 0$ for all n, whereas consistent deviations from this null hypothesis are evidence
of predictable time-variation in bond risk premia.
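The excess return definition can be computed directly from a panel of zero-coupon yields. The sketch below assumes annualized, continuously compounded yields, so that the log price of an m-year bond is $p^{(m)}_t = -m\,y^{(m)}_t$; that convention, the function names, and the array layout are assumptions made for illustration.

```python
import numpy as np


def log_price(yields, m):
    """Log price of an m-year zero-coupon bond: p = -m * y (assumed convention)."""
    return -m * yields[:, m - 1]


def excess_returns(yields, n, step=1):
    """One-year excess log return on an n-year bond, for n >= 2.

    yields[t, m-1] : annualized continuously compounded m-year yield at row t
    step           : number of rows per year (1 for annual data, 12 for monthly)
    """
    y = np.asarray(yields, dtype=float)
    p_n = log_price(y, n)          # price of the n-year bond bought at t
    p_n1 = log_price(y, n - 1)     # the same bond one year later has n-1 years left
    p_1 = log_price(y, 1)          # funding cost: minus the 1-year yield
    return p_n1[step:] - p_n[:-step] + p_1[:-step]
```

With monthly data, `step=12` produces overlapping annual returns, which is why the regressions in the text rely on autocorrelation-robust standard errors.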

The specification in (3.1) imposes a linear relation between the current slope of
the yield curve and bond risk premia. Prolonged ZLB episodes, as experienced
recently, are an unavoidable yield curve non-linearity. Thus, the relation between
the slope of the yield curve and bond risk premia is likely to inherit this non-linearity.
This poses a serious challenge to the linear relation in (3.1).

One interpretation is in terms of a shadow rate model. As the short end of the

yield curve becomes constrained from below, this would in turn constrain the slope

of the yield curve — a slope compression effect. Following this line of reasoning, the

information in the observable slope of the yield curve is quite different when the ZLB

is binding. In particular, the signal from a given slope of the current yield curve about

future bond returns is expected to be stronger when the lower bound is binding. This

is because the current yield curve is flatter than it would have been, had there been no

lower bound on the short rate. That is, the so-called shadow rate is negative, whereas

the short rate is zero.

An alternative interpretation is in terms of re-pricing of yield curve risks. The ZLB

period has called for unconventional monetary policies such as forward-guidance

and large scale asset purchases. Contrary to conventional policy, the unconventional

policies aim at affecting longer-term yields. These policy differences could trigger

changes in required risk compensations by the marginal bond investor. In particular,

the nature of duration — or yield curve slope — risk is potentially very different with

such unconventional policies being active components of monetary policy. Changes

in market prices of risk alter the transmission of risk factors to required risk premia

and ultimately yield curve dynamics.

3.2.1 Modified Excess Bond Return Regressions

In our baseline specification, we initially strip down the specification in (3.1) to focus

on the information in the yield curve slope variable. That is, we impose γ(n) = 0 and

study the modification of the regression in (3.1),

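$$rx^{(n)}_{t+1} = \beta^{(n)}_0 + \beta^{(n)}_1\left(y^{(n)}_t - y^{(1)}_t\right) + \delta^{(n)}_0 I_{\{r_t<c\}} + \delta^{(n)}_1 I_{\{r_t<c\}}\left(y^{(n)}_t - y^{(1)}_t\right) + \varepsilon^{(n)}_{t+1}. \tag{3.2}$$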

Here, $I_{\{\cdot\}}$ is the indicator function, which takes on a value of one if the short rate $r_t$ falls below a certain threshold $c$. The threshold value is set to $c = 0.01$ to let the indicator function identify periods where the short rate is taken to be constrained by the ZLB. We take the yield spread, $y^{(n)}_t - y^{(1)}_t$, as a measure of yield curve slope. This specification immediately allows for an assessment of the difference in bond risk premia dynamics when the short rate is deemed to be constrained from below. Here, the main null hypothesis of interest is $H_0: \delta^{(n)}_1 = 0$. We implement the regressions in (3.2) for maturities of $n = 2, 3, \ldots, 10$ years using end-of-month U.S. nominal zero-coupon Treasury yields from Gurkaynak et al. (2007). We take the effective federal funds rate to be the short rate. The sample is January 1990 through December 2017. The start date is chosen to avoid the structural break in U.S. yield dynamics during the 1980s (Rudebusch and Wu, 2007).
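As an illustration of how the regression in (3.2) can be set up, the following is a minimal sketch using ordinary least squares. The function name and argument layout are assumptions made for the example; the caller is assumed to have aligned the return realized at $t+1$ with the spread and short rate observed at $t$.

```python
import numpy as np

def modified_return_regression(rx, spread, short_rate, c=0.01):
    """OLS estimates of (3.2) for a single maturity.

    rx         : excess returns realized at t+1, aligned with regressors at t.
    spread     : yield spread y^{(n)}_t - y^{(1)}_t.
    short_rate : short rate r_t used in the indicator I{r_t < c}.
    """
    rx, spread, short_rate = (np.asarray(a, dtype=float)
                              for a in (rx, spread, short_rate))
    zlb = (short_rate < c).astype(float)          # indicator I{r_t < c}
    X = np.column_stack([np.ones_like(spread),    # beta_0
                         spread,                  # beta_1
                         zlb,                     # delta_0
                         zlb * spread])           # delta_1
    coef, *_ = np.linalg.lstsq(X, rx, rcond=None)
    return dict(zip(("beta0", "beta1", "delta0", "delta1"), coef))
```

The slope that applies when the indicator is switched on is `beta1 + delta1`, so testing $\delta^{(n)}_1 = 0$ amounts to testing whether the slope differs across the two sub-samples.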


Figure 3.1: Modified Bond Return Regressions: Yield Spread

This figure reports modified 1-year excess bond return regression coefficients estimated over the January 1990 through December 2017 sample. The estimated specification is $rx^{(n)}_{t+1} = \beta^{(n)}_0 + \beta^{(n)}_1\left(y^{(n)}_t - y^{(1)}_t\right) + \delta^{(n)}_0 I_{\{r_t<c\}} + \delta^{(n)}_1 I_{\{r_t<c\}}\left(y^{(n)}_t - y^{(1)}_t\right) + \varepsilon^{(n)}_{t+1}$. Confidence bands are constructed using the 2.5 and 97.5 percentiles from 5,000 block bootstrap repetitions with a block length of 24 months. Each bootstrap sample is required to have a minimum of 50 ZLB observations.

Figure 3.1 plots the estimated regression coefficients along with 95% confidence

bands for the difference between non-ZLB and ZLB periods. The confidence bands

are computed using a block bootstrap with 5,000 repetitions and a block window

of two years of data.3 Prior to the ZLB period, we find the usual empirical pattern;

the slope regression coefficients are greater than zero and increasing as a function

of maturity. Both the intercept and slope regression coefficients are significantly

different when the ZLB is binding. In particular, we find that the usual empirical

pattern is amplified at the ZLB. The slope regression coefficients are larger and

increase faster as a function of maturity when the ZLB is binding. This evidence

suggests that the slope of the current yield curve carries a stronger signal about bond

risk premia when the shortest maturities are constrained from below. Additionally,

the regression intercepts are consistently smaller over the ZLB period. The differences

3 We use a stationary block bootstrap, where the data is resampled in blocks of consecutive observations of both the left- and right-hand side variables of the regressions. This way we account for possible time series dependencies and preserve cross-sectional dependencies in the data. We require each bootstrap sample to have a minimum of 50 ZLB observations, since this achieves a conservative lower bound on the fraction of binding ZLB periods of 15% in simulated samples. In the data, binding ZLB periods account for 31% of the observations.


in intercepts and regression slope coefficients are statistically significant at a 5% level

for all maturities beyond three years.
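The bootstrap confidence bands described above can be sketched as follows. This is only an illustration under stated assumptions: it uses a stationary bootstrap with geometrically distributed block lengths and treats 24 months as the mean block length, which may differ from the exact scheme used for the figures. The function names, the `stat_fn` callback, and the rejection rule for samples with too few ZLB observations are hypothetical.

```python
import numpy as np

def stationary_bootstrap_indices(T, mean_block, rng):
    """Row indices for one stationary-bootstrap sample (blocks wrap around)."""
    idx = np.empty(T, dtype=int)
    idx[0] = rng.integers(T)
    for t in range(1, T):
        if rng.random() < 1.0 / mean_block:
            idx[t] = rng.integers(T)          # start a new block
        else:
            idx[t] = (idx[t - 1] + 1) % T     # continue the current block
    return idx

def bootstrap_bands(stat_fn, data, zlb, n_boot=5000, mean_block=24,
                    min_zlb=50, seed=0):
    """2.5 and 97.5 percentiles of stat_fn over accepted bootstrap samples.

    data : (T, k) array; whole rows are resampled so that left- and
           right-hand side variables stay together.
    zlb  : (T,) boolean array flagging ZLB observations.
    """
    data = np.asarray(data)
    zlb = np.asarray(zlb, dtype=bool)
    rng = np.random.default_rng(seed)
    draws = []
    while len(draws) < n_boot:
        idx = stationary_bootstrap_indices(len(zlb), mean_block, rng)
        if zlb[idx].sum() < min_zlb:
            continue                          # too few ZLB observations: redraw
        draws.append(stat_fn(data[idx]))
    return np.percentile(np.array(draws), [2.5, 97.5], axis=0)
```

Here `stat_fn` would typically run the regression in (3.2) on the resampled rows and return the coefficient vector of interest.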

3.2.2 Robustness

For robustness, we implement the regressions in (3.2) with alternative measures of yield curve slope; (i) the forward spread $f^{(n)}_t - y^{(1)}_t$, where $f^{(n)}_t = p^{(n-1)}_t - p^{(n)}_t$ denotes the forward rate at time $t$ for a loan between time $t+n-1$ and $t+n$, and (ii) the second principal component of the yield curve. We further implement the regressions controlling for typical macroeconomic variables; (i) inflation trend (Cieslak and Povala, 2015) and (ii) real activity measured by the Chicago Fed National Activity Index (Joslin et al., 2014).
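The two alternative slope measures can be computed along the following lines. This is a minimal sketch, assuming a hypothetical yield panel with consecutive annual maturities starting at one year; the function names are illustrative. Note that the sign of a principal component is not identified, so the second component may need to be flipped to align with the yield spread.

```python
import numpy as np

def forward_spread(yields, maturities):
    """Forward spreads f^{(n)}_t - y^{(1)}_t for n = 2, ..., N.

    Uses f^{(n)}_t = p^{(n-1)}_t - p^{(n)}_t with p^{(n)}_t = -n * y^{(n)}_t.
    Assumes maturities = [1, 2, ..., N] years (illustrative assumption).
    """
    yields = np.asarray(yields, dtype=float)
    n = np.asarray(maturities, dtype=float)
    log_prices = -n * yields
    forwards = log_prices[:, :-1] - log_prices[:, 1:]
    return forwards - yields[:, [0]]

def principal_components(yields, k=3):
    """First k principal component scores of the demeaned yield panel."""
    Y = np.asarray(yields, dtype=float)
    Y = Y - Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return Y @ Vt[:k].T          # column 1 (zero-based) is the slope factor
```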

Figure 3.2: Modified Bond Return Regressions: Other Slope Measures

This figure reports modified 1-year excess returns regression coefficients estimated over the January 1990 through December 2017 sample using two alternative measures of yield curve slope; (i) forward spread and (ii) the second principal component of yields. Confidence bands are constructed using the 2.5 and 97.5 percentiles from 5,000 block bootstrap repetitions with a block length of 30 months. Each bootstrap sample is required to have a minimum of 50 ZLB observations.


Across the different measures of yield curve slope and controlling for the macroeconomic variables, the results remain largely consistent with the findings from Figure 3.1. The slope of the yield curve predicts higher bond risk premia and the regression slopes are increasing in time to maturity. At the ZLB, this effect is amplified: a given slope of the yield curve predicts higher bond risk premia when the ZLB binds, and


Figure 3.3: Modified Bond Return Regressions: Macroeconomic Controls

This figure reports modified 1-year excess returns regression coefficients estimated over the January 1990

through December 2017 sample using yield spreads as measure of yield curve slope. Two macroeconomic

variables are further included as control variables; inflation trend and the Chicago Fed National Activity

Index. Confidence bands are constructed using the 2.5 and 97.5 percentiles from 5,000 block bootstrap

repetitions with a block length of 30 months. Each bootstrap sample is required to have a minimum of 50

ZLB observations.

the effect increases faster with maturity. The differences in slope regression coefficients are statistically significant across the specifications considered. For the forward

spread regressors, the differences are significant at a 5% level for maturities beyond

three years. For the second principal component, the difference is significant at a 5%

level for maturities beyond two years. Controlling for macroeconomic variables, the

differences remain statistically significant at a 5% level for maturities beyond two

years.

The baseline result is further robust to (i) extending the data sample back to November 1971, (ii) holding periods of 3 and 6 months, and (iii) thresholds of 25 and 50 bps for identifying when the short rate was constrained by its lower bound. For the

regression with the second principal component as measure of slope, the results are

also robust to including level and curvature as controls; that is, controlling for the first

and third principal components. These results are relegated to the online appendix.


3.3 A Shadow Rate Model

Following Dai and Singleton (2002), the empirical patterns observed above serve as a

natural specification test for the popular SRM; if the SRM constitutes a well-specified

model of the yield curve and bond risk premia dynamics, the model should account

for the apparent differences in slope regression coefficients observed in the actual

data.

We consider the SRM outlined in e.g. Kim and Singleton (2012), Christensen and

Rudebusch (2015), Bauer and Rudebusch (2016), and Wu and Xia (2016). Following

Black (1995), the short rate $r_t$ is modelled as

$$r_t = \max\{s_t, 0\}, \qquad s_t = \alpha + \beta' X_t, \tag{3.3}$$

where $s_t$ denotes the shadow rate that is affine in $N_X$ pricing factors $X_t$. We assume that no arbitrage opportunities exist, and thus that there exists a risk-neutral measure $\mathbb{Q}$. The pricing factors have risk-neutral first-order vector auto-regression (VAR) dynamics given by

$$X_{t+1} = \Phi\mu + (I - \Phi) X_t + \Sigma\varepsilon^{\mathbb{Q}}_{t+1}, \tag{3.4}$$

where $\varepsilon^{\mathbb{Q}}_{t+1}$ is an i.i.d. standard normally distributed vector. $I$ denotes the identity matrix, and $\Sigma$ is a lower triangular matrix with dimension $N_X \times N_X$ identifying the covariance of the factor innovations. Further, the pricing factors have dynamics under the physical probability measure $\mathbb{P}$ given by

$$X_{t+1} = h_0 + h_X X_t + \Sigma\varepsilon^{\mathbb{P}}_{t+1}, \tag{3.5}$$

where $\varepsilon^{\mathbb{P}}_{t+1}$ is an i.i.d. standard normally distributed vector. This assumption implies an essentially affine stochastic discount factor (Duffee, 2002). The price of an $n$-year bond is given recursively as $P^{(n)}_t = E^{\mathbb{Q}}_t\left[\exp(-r_t) P^{(n-1)}_{t+1}\right]$ with yield to maturity $y^{(n)}_t = -n^{-1}\log P^{(n)}_t$. Yields do not have closed-form expressions, and we therefore use the second-order approximation of Priebsch (2013). For identification, we impose the standard restrictions: (i) $\beta' = \mathbf{1}$, (ii) $\mu = 0$, (iii) $\Sigma$ is lower triangular, and (iv) $\Phi$ is in Jordan form with increasing diagonal elements. The physical measure transition parameters, $h_0$ and $h_X$, are free parameters.
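To make the short-rate specification in (3.3) and the physical dynamics in (3.5) concrete, the following sketch simulates the factors and applies the truncation at zero. It is illustrative only: the function name is hypothetical, the simulation starts at the unconditional factor mean (which assumes a stationary $h_X$), and no bond pricing is attempted here.

```python
import numpy as np

def simulate_shadow_rate(alpha, beta, h0, hX, Sigma, T, seed=0):
    """Simulate X_{t+1} = h0 + hX X_t + Sigma eps_{t+1} and r_t = max(s_t, 0)."""
    beta, h0, hX, Sigma = (np.asarray(a, dtype=float)
                           for a in (beta, h0, hX, Sigma))
    rng = np.random.default_rng(seed)
    N = len(h0)
    X = np.zeros((T, N))
    X[0] = np.linalg.solve(np.eye(N) - hX, h0)   # unconditional mean (assumes stationarity)
    for t in range(T - 1):
        X[t + 1] = h0 + hX @ X[t] + Sigma @ rng.standard_normal(N)
    shadow = alpha + X @ beta                    # s_t = alpha + beta' X_t
    short = np.maximum(shadow, 0.0)              # r_t = max{s_t, 0}
    return X, shadow, short
```

Whenever the simulated shadow rate is negative, the short rate sits at zero, which is the source of the non-linearity in the model.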

3.3.1 Estimation

We estimate the model using the sequential regression (SR) approach of Andreasen

and Christensen (2015). The SR approach has known asymptotic properties and

is computationally easy to implement even for non-linear dynamic term structure

models. This is in contrast to the alternative non-linear filtering techniques and

quasi-maximum likelihood estimators.

Our data set is as above. We consider a three-factor specification, that is $N_X = 3$. The SR approach is constructed for large cross-sections. For this reason, we include more yields than typically considered in the estimation. That is, we include $N_y = 25$ yields at each date; yields in the 3-month through 2.5-year range at three-month intervals, and yields in the 3-year through 10-year range at six-month intervals. All maturities are taken to be observed with a small measurement error $v^{(n)}_t$, i.e. $y^{(n)}_t = g_n(X_t) + v^{(n)}_t$. Here, $g_n(X_t)$ is the function relating the factors to the cross-section of yields. This function is governed by the risk-neutral VAR dynamics in (3.4). We assume the measurement errors are mean zero and have a finite, positive-definite variance-covariance matrix.

The SR approach has three steps. In step 1, the risk-neutral parameters and latent

factors are estimated jointly from cross-sectional regressions. For given values of risk-neutral parameters $\theta_1 = \left[\alpha \;\; \mathrm{diag}(\Phi) \;\; \mathrm{vech}(\Sigma)'\right]$, we estimate the factors at each

point in time by cross-sectional non-linear regressions. Estimates of the risk-neutral

parameters θ1 are obtained from minimizing the pooled squared residuals from the

cross-sectional regressions.

In step 2, the P dynamics are estimated using the factor estimates from step 1.

The physical measure factor dynamics h0, hX , and Σ can be estimated by a linear

regression that is adequately modified for estimation uncertainty in the factors.

Both step 1 and step 2 provide consistent estimates of Σ. In step 3, we condition

on Σstep2, since unreported results show that Σstep2 tends to be the most efficient

estimate.4 The remaining risk-neutral parameters and factor estimates are then

updated by re-running step 1 conditional on Σstep2. Finally, the estimates of h0 and

hX are updated by re-running step 2 using the new factor estimates. See Andreasen

and Christensen (2015) for further details on the estimation procedure.

3.3.2 Cross-Sectional Fit

The model provides a good fit to the cross-section of yields at each point in time.

Short maturities are heavily affected by the ZLB restriction, whereas long maturities

are less restricted and move more freely.

Tables 3.1 and 3.2 report the estimated model parameters. The estimates are in

line with the typical findings; the largest eigenvalue of the risk-neutral transition

matrix I−Φ is close to unity, and the two remaining eigenvalues are smaller and

nearly identical (Christensen et al., 2011). The physical measure dynamics also show

the typical high persistence for the factors. Model-implied yields track closely the

observed data both at short maturities and the long maturities. In the short end of

the maturity-spectrum, the model occasionally produces 3-month yields that are

too low. This happens in particular when the shadow rate is negative and the short

rate is truncated at zero, whereas over our sample the observed 3-month yield stays

slightly positive. These deviations are, however, only in the order of magnitude of a

few basis points. The fit is even closer at the long end of the maturity-spectrum.

4 This is consistent with the results in Joslin et al. (2011) for affine term structure models and Andreasen and Meldrum (2018) for SRMs.


Table 3.1: Risk-Neutral Parameters

Asymptotic standard errors are provided in parentheses and are robust to yield measurement errors displaying heteroscedasticity in the time series dimension, cross-sectional correlation, and autocorrelation. We use ωD = 5 and ωT = 10 in the estimator provided in Andreasen and Christensen (2015).

        SRM                R-SRM
α       0.0168 (0.0070)    0.0169 (0.0072)
Φ11     0.0012 (0.0007)    0.0012 (0.0007)
Φ22     0.0461 (0.0020)    0.0467 (0.0020)
Φ33     0.0555 (0.0017)    0.0553 (0.0018)

Table 3.2: Time Series Parameters

This table presents the estimated model parameters. The estimation procedures are outlined in Andreasen and Christensen (2015) and Andreasen, Engsted, Møller, and Sander (2016). Asymptotic standard errors are provided in parentheses and are computed as described in Andreasen et al. (2016).

Panel A: SRM

Row      h0                          hX(·,1)             hX(·,2)             hX(·,3)
(1,·)    −1.57×10^−4 (1.53×10^−4)    0.9875 (0.0116)     0.0188 (0.0104)     0.0179 (0.0123)
(2,·)    4.99×10^−4 (0.0016)         0.0556 (0.1143)     1.0362 (0.1605)     0.0965 (0.1769)
(3,·)    −9.42×10^−4 (0.0016)        −0.0865 (0.1128)    −0.1037 (0.1607)    0.8267 (0.1761)

Row      Σ(·,1)                      Σ(·,2)                   Σ(·,3)
(1,·)    3.16×10^−4 (2.29×10^−5)     ·                        ·
(2,·)    −0.0012 (3.94×10^−4)        0.0038 (2.34×10^−4)      ·
(3,·)    9.42×10^−4 (3.86×10^−4)     −0.0039 (2.50×10^−4)     2.27×10^−4 (2.79×10^−5)

Panel B: R-SRM

Row      h(1)0                       h(1)X(·,1)          h(1)X(·,2)          h(1)X(·,3)
(1,·)    −2.45×10^−4 (2.32×10^−4)    0.9814 (0.0176)     0.0033 (0.0164)     8.17×10^−4 (0.0201)
(2,·)    −0.0010 (0.0027)            −0.0654 (0.2081)    0.9466 (0.2612)     0.0182 (0.2906)
(3,·)    4.15×10^−4 (0.0027)         0.0229 (0.2039)     −0.0164 (0.2580)    0.8997 (0.2850)

Row      h(2)0                       h(2)X(·,1)          h(2)X(·,2)          h(2)X(·,3)
(1,·)    −4.24×10^−4 (8.59×10^−4)    0.9707 (0.0503)     0.0282 (0.0428)     0.0294 (0.0405)
(2,·)    −0.0074 (0.0114)            −0.4066 (0.6831)    0.7539 (0.5785)     −0.1712 (0.5634)
(3,·)    0.0068 (0.0116)             0.3698 (0.6940)     0.14388 (0.5959)    1.0631 (0.5814)

Row      Σ(·,1)                      Σ(·,2)                   Σ(·,3)
(1,·)    3.12×10^−4 (2.25×10^−5)     ·                        ·
(2,·)    −0.0013 (3.95×10^−4)        0.0041 (2.37×10^−4)      ·
(3,·)    0.0010 (3.86×10^−4)         −0.0042 (2.53×10^−4)     2.19×10^−4 (2.85×10^−5)

Figure 3.4 shows the root mean squared measurement errors in basis points for

all the included maturities. Over the entire sample, the measurement errors are only

a few basis points for all maturities. This pattern largely remains over the binding and

non-binding ZLB periods.


Figure 3.4: In-Sample Fit: SRM

This figure reports root mean squared measurement errors in basis points for all maturities considered in

the estimation.
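The fit statistic reported in Figure 3.4 is a root mean squared error per maturity, expressed in basis points. A minimal sketch is given below, assuming yields are stored as decimals (so that 0.0001 equals one basis point); the function name and the optional `mask` argument for restricting the sample to ZLB or non-ZLB dates are illustrative.

```python
import numpy as np

def rmse_basis_points(observed, fitted, mask=None):
    """Root mean squared measurement error per maturity, in basis points.

    observed, fitted : (T, N) arrays of yields in decimals.
    mask             : optional (T,) boolean array selecting a sub-sample.
    """
    err = np.asarray(observed, dtype=float) - np.asarray(fitted, dtype=float)
    if mask is not None:
        err = err[np.asarray(mask, dtype=bool)]
    return 1e4 * np.sqrt(np.mean(err ** 2, axis=0))
```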

3.3.3 The Modified Linear Projection Test

We expose the SRM to our modified linear projection test. If the SRM is a well-

specified model, we would expect the non-linearity imposed by the truncation of

the short rate to account for the apparent shift in the regression coefficient that we

observe in the data. Specifically, we ask if the population regression coefficients δ(n)1

implied by the SRM match the pattern from Figure 3.1.

Taking the estimated model parameters at face value, we simulate a sample path

of length T = 1,000,000 and run the regressions in (3.2) to identify the population

regression coefficients δ(n)1 . Figure 3.5 plots the results. The SRM fails to account for

the differences in regression coefficients. In particular, the differences in regression

intercepts are negative but far too small to line up with the data. Regression slope

differences are mostly positive — at least for the longer maturities — but again far too

small to be in line with what we observe in the data. Thus, the model predicts changes

in regression coefficients that are in the right direction, but the changes are too small

to account for the empirical patterns. This means that the slope compression effect is

insufficient to generate the observed structural break in bond risk premia dynamics

at the ZLB.


Figure 3.5: Modified Linear Projection Test: SRM

This figure reports the results from the modified linear projection test. The population regression coefficients for the shadow rate model are computed from a simulated sample of length T = 1,000,000. Simulations are for the estimated parameter values. The regression specification is $rx^{(n)}_{t+1} = \beta^{(n)}_0 + \beta^{(n)}_1\left(y^{(n)}_t - y^{(1)}_t\right) + \delta^{(n)}_0 I_{\{r_t<c\}} + \delta^{(n)}_1 I_{\{r_t<c\}}\left(y^{(n)}_t - y^{(1)}_t\right) + \varepsilon^{(n)}_{t+1}$. Confidence bands are constructed using the 2.5 and 97.5 percentiles from 5,000 block bootstrap repetitions with a block length of 24 months. Each bootstrap sample is required to have a minimum of 50 ZLB observations.

3.3.4 Diagnosing the Failure

The model-implied regression coefficients do not account for the apparent shift

in regression coefficients. However, the SRM does drive a change in the regression

coefficients in the correct direction. This is consistent with the intuition of truncation

of the yield curve slope from below. To address whether it is in fact the non-linearity

at the ZLB that drives the shift in regression coefficients, consider the same model

without imposing a ZLB. In this case, the model has the well-known closed-form solutions for yields that are affine in the underlying risk factors, $y^{(n)}_t = A_n + B_n X_t$, where the factor loadings are given by the usual recursions. We redo the modified

linear projection test on simulated data from the affine yield curve specification using

the same model parameters and simulated state vectors as for the SRM above.

Figure 3.6 shows that the non-linearity imposed by the shadow rate specification

only results in minor changes in the regression coefficients. One particular reason for

this is highlighted in Table 3.3. The estimated SRM implies durations for ZLB spells

that are very short-lived. Because short rates at zero are far from the unconditional

mean of the risk factors, the VAR(1) dynamics of the factors imply strong mean


Figure 3.6: SRM vs. ATSM

The population regression coefficients for the shadow rate and affine models are computed from a simulated sample of length T = 1,000,000. Simulations are for the estimated parameter values. The regression specification is $rx^{(n)}_{t+1} = \beta^{(n)}_0 + \beta^{(n)}_1\left(y^{(n)}_t - y^{(1)}_t\right) + \delta^{(n)}_0 I_{\{r_t<c\}} + \delta^{(n)}_1 I_{\{r_t<c\}}\left(y^{(n)}_t - y^{(1)}_t\right) + \varepsilon^{(n)}_{t+1}$.

Table 3.3: Duration of ZLB Spells

The model-implied distributions of durations of ZLB spells are obtained from simulating a sample path of length T = 1,000,000 from each model at the estimated parameter values. Durations are measured as consecutive months with the short rate falling below the threshold c = 0.01.

         Mean     50th pct.   75th pct.   95th pct.   99th pct.
SRM      13.10    4           14          58          98
R-SRM    18.65    6           26          75          106

reversion and hence short ZLB spells. In fact, the model-implied median duration of

a ZLB spell is only 4 months and the 95th percentile duration is 58 months. As short rates are expected to lift off from zero quickly, such episodes do not affect longer maturities much. This result suggests that the non-linearity is unlikely to capture the shift in dynamics of bond risk premia at the ZLB. Instead, it suggests that the underlying bond risk premia dynamics of the SRM (and of the affine model for that matter) are misspecified.
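The spell durations in Table 3.3 are counts of consecutive months in which the short rate lies below the threshold. A minimal sketch of that count on a simulated or observed short-rate path is given below; the function name is illustrative.

```python
import numpy as np

def zlb_spell_durations(short_rate, c=0.01):
    """Lengths of consecutive runs with short_rate < c, in sample periods."""
    below = np.asarray(short_rate, dtype=float) < c
    # Pad with False so that spells touching either end of the sample are closed.
    padded = np.concatenate(([False], below, [False]))
    change = np.diff(padded.astype(int))
    starts = np.flatnonzero(change == 1)     # first period of each spell
    ends = np.flatnonzero(change == -1)      # first period after each spell
    return ends - starts
```

Summary statistics such as the mean and percentiles then follow from `np.mean` and `np.percentile` applied to the returned array.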


3.4 Regime-Dependent Market Prices of Risk

To guide intuition, consider the affine model that does not enforce the ZLB. In this

case, the coefficient from regressing excess holding returns onto the yield spread is

given in closed form by

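$$\beta^{(n)}_1 = \frac{(n-1)\, B_{n-1} \lambda_X V[X_t] \left(\tfrac{1}{n} B_n + \beta'\right)'}{\left(\tfrac{1}{n} B_n + \beta'\right) V[X_t] \left(\tfrac{1}{n} B_n + \beta'\right)'}, \tag{3.6}$$

where $\lambda_X = h_X - (I - \Phi)$ denotes the market price of risk loading on the factors $X_t$.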

Thus, a re-pricing of risk has the potential to capture the observed shift in the regres-

sion coefficients. Re-pricing of risk is motivated by the restrictions that the ZLB puts

on conventional short rate policies. As policy makers turn toward unconventional

policies aimed at affecting long-maturity bond prices, the marginal bond investor realizes that the risks he faces have changed and updates his required risk compensations. In particular, the nature of duration risk is likely to have been somewhat different over the recent U.S. ZLB episode, where the Federal Reserve engaged in unconventional policies such as forward-guidance and large scale asset purchases. To the extent that this led bond investors to adjust their required duration risk compensation, such policies could cause a shift in bond risk premia dynamics.

We adapt the two-regime risk factor dynamics in Andreasen et al. (2016) to fit our setting. In particular, we introduce regime-dependent market prices of risk, which in turn imply regime-dependent factor dynamics. Except for the $\mathbb{P}$-measure dynamics, the model is defined as for the standard SRM. That is, we consider a regime-switching shadow rate model (the R-SRM hereafter), where the factor dynamics are given by

$$X_{t+1} = I_{\{r_t \geq c\}} h^{(1)}_0 + I_{\{r_t \geq c\}} h^{(1)}_X X_t + I_{\{r_t < c\}} h^{(2)}_0 + I_{\{r_t < c\}} h^{(2)}_X X_t + \Sigma\varepsilon^{\mathbb{P}}_{t+1}. \tag{3.7}$$

This specification is consistent with regime-dependent market prices of risk, while

leaving the risk-neutral distribution — and hence largely the cross-sectional yield

curve fit — unaffected. At the same time, the regime-dependent market price of risk

specification allows for potentially longer durations of ZLB spells. The ZLB regime can exhibit extraordinarily high factor autocorrelation, which can prolong the spells at zero short-term interest rates. Further, the parameters in the vector $h^{(2)}_0$ capture the point that the risk factors mean-revert towards in the ZLB regime. This point can potentially be low, thus further prolonging the duration of ZLB spells.
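The regime-dependent factor dynamics in (3.7) can be simulated with a small extension of a standard VAR loop, where the regime in force at time t is selected by comparing the truncated short rate with the threshold c. The sketch below is illustrative: the function name and the starting value `X0` are assumptions, and the parameter values in any call are hypothetical rather than the estimates reported in the tables.

```python
import numpy as np

def simulate_rsrm(alpha, beta, h0_1, hX_1, h0_2, hX_2, Sigma, X0, T,
                  c=0.01, seed=0):
    """Simulate (3.7): regime 1 applies when r_t >= c, regime 2 when r_t < c."""
    beta, h0_1, hX_1, h0_2, hX_2, Sigma, X0 = (
        np.asarray(a, dtype=float)
        for a in (beta, h0_1, hX_1, h0_2, hX_2, Sigma, X0))
    rng = np.random.default_rng(seed)
    N = len(X0)
    X = np.empty((T, N))
    r = np.empty(T)
    X[0] = X0
    for t in range(T):
        r[t] = max(alpha + beta @ X[t], 0.0)           # r_t = max{s_t, 0}
        if t + 1 < T:
            h0, hX = (h0_2, hX_2) if r[t] < c else (h0_1, hX_1)
            X[t + 1] = h0 + hX @ X[t] + Sigma @ rng.standard_normal(N)
    return X, r
```

Feeding the simulated short-rate path into a spell-length count gives the model-implied distribution of ZLB durations under the two-regime dynamics.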

3.4.1 Estimation

Since the set of parameters governing the risk-neutral dynamics is unaffected by the extension, only the second step of the SR approach has to be adjusted. We adjust the time-series regression in step 2 to account for the regime-dependence of the market prices of risk, while continuing to adequately adjust for estimation uncertainty in the risk factors. Importantly, the $\mathbb{P}$ dynamics remain given in closed form. Thus, the estimation of the regime-dependent dynamic parameters $h^{(1)}_0$, $h^{(2)}_0$, $h^{(1)}_X$, and $h^{(2)}_X$ comes at no additional complexity or computational cost. See Andreasen et al. (2016) for further details.

3.4.2 Cross-Sectional Fit and Significance

Table 3.1 shows that the risk-neutral parameter estimates are largely unaffected by

the shift in market prices of risk. The extension maintains the good cross-sectional fit

of yields at each point in time. Model-implied yields track closely the observed data

and are hardly distinguishable from the baseline SRM. This result is not surprising,

since only the $\mathbb{P}$ dynamics are affected by the regime-switch at the ZLB. The shift in $\mathbb{P}$ dynamics affects the risk-neutral parameters through their common variance-covariance matrix of factor innovations $\Sigma$, although this effect is small. Measurement errors remain in the order of a few basis points, and the maturity-specific pattern mirrors that of the baseline model. The similarity also remains over the periods with binding and non-binding ZLB. This evidence is in Figure 3.7.

Figure 3.7: In-Sample Fit: R-SRM

This figure reports root mean squared measurement errors in basis points for all maturities considered in

the estimation.

The extension is nevertheless important for understanding the dynamics of the

yield curve when the short end is constrained from below. The two regimes are

different in potentially important ways. Our dynamic distinction — at and away


from the ZLB — is statistically significant with a 0.03 p-value based on a Wald test.

That is, we reject the null hypothesis that $h^{(1)}_0 = h^{(2)}_0$ and $h^{(1)}_X = h^{(2)}_X$ at a five percent significance level. In particular, the regime distinction implies very different long-run

shadow rate means within each regime. The shadow rate mean-reverts towards a

long-run mean of 3.4% in the "normal" regime, whereas the shadow rate reverts

towards a long-run mean of 0.4% in the ZLB regime. In the baseline SRM the shadow

rate reverts towards a long-run mean of 2.2%. This difference in dynamics near

the ZLB implies that we should expect to see longer durations of ZLB spells in the

regime-switching SRM. Table 3.3 shows that this is in fact the case. The distribution

of durations of spells at the ZLB is substantially shifted compared to the baseline

SRM. In particular, the short rate is expected to remain at the ZLB for a prolonged

period. The mean and median durations are roughly 40-50% longer than in the baseline SRM, and the upper percentiles shift out as well.

For example, the 75th and 95th percentiles increase from 14 and 58 months to 26

and 75 months, respectively. Further, the dynamic differences have the potential to

explain why we find that a given yield spread predicts higher bond returns at the ZLB

compared to "normal" times.
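One way to compute a within-regime long-run shadow rate mean is to evaluate the shadow rate at the fixed point of the corresponding VAR, that is $\alpha + \beta'(I - h_X)^{-1} h_0$. The sketch below makes that calculation explicit. It is an assumption that this is how the reported long-run means are obtained, and the formula requires the regime's transition matrix to have no unit eigenvalue; the function name is illustrative.

```python
import numpy as np

def long_run_shadow_mean(alpha, beta, h0, hX):
    """Shadow rate at the fixed point of X = h0 + hX X for one regime."""
    beta, h0, hX = (np.asarray(a, dtype=float) for a in (beta, h0, hX))
    factor_mean = np.linalg.solve(np.eye(len(h0)) - hX, h0)   # (I - hX)^{-1} h0
    return alpha + beta @ factor_mean
```

Applying the function once with the regime-1 parameters and once with the regime-2 parameters gives the two regime-specific means being compared.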

3.4.3 The Modified Linear Projection Test

As for the baseline SRM, we expose our extension to the modified linear projection

test. We do so by simulating a sample path of length T = 1,000,000 taking our estimated model parameters as given. The resulting regression coefficients are plotted in

Figure 3.8.

The extended model with regime-dependent market prices of risk does much

better in terms of matching the empirical regression coefficients. Across all maturities,

the model-implied population coefficients are always within the 95% confidence

intervals. This is the case both for the shift in regression intercept and regression

slope on the yield spread.

3.4.4 Interpretation

To obtain insights into which parameters are driving the success of the R-SRM in

passing the modified linear projection test, we conduct a range of experiments. First,

consider the counterfactual exercise, where we omit regime-switching by imposing

the same factor dynamics over the ZLB regime as estimated over the non-binding

ZLB regime. Under this assumption, we re-compute the model-implied regression

coefficients over the ZLB regime by simulating a sample path of length T = 1,000,000.

We then in turn allow for subsets of the parameters governing the factor dynamics

to switch to their estimated ZLB regime values. This is to isolate the effect on the

model-implied regression coefficients.

The first two rows of Table 3.4 reiterate the finding that the single-regime model

does not capture the regression evidence in the data. In particular, the single-regime


Figure 3.8: Modified Linear Projection Test: R-SRM

This figure reports the results from the modified linear projection test. The population regression coefficients for the shadow rate model are computed from a simulated sample of length T = 1,000,000. Simulations are for the estimated parameter values. The regression specification is $rx^{(n)}_{t+1} = \beta^{(n)}_0 + \beta^{(n)}_1\left(y^{(n)}_t - y^{(1)}_t\right) + \delta^{(n)}_0 I_{\{r_t<c\}} + \delta^{(n)}_1 I_{\{r_t<c\}}\left(y^{(n)}_t - y^{(1)}_t\right) + \varepsilon^{(n)}_{t+1}$. Confidence bands are constructed using the 2.5 and 97.5 percentiles from 5,000 block bootstrap repetitions with a block length of 24 months. Each bootstrap sample is required to have a minimum of 50 ZLB observations.

model struggles to generate the large regression coefficients for the 5- to 10-year maturities (e.g. 3.88 vs. 7.34 in the data for the 10-year bond). Row (3) in Table 3.4 allows

for the regime-switch in the intercepts of the factor dynamics only. This specification

substantially increases the regression coefficients across the maturity-spectrum. In

fact, the model-implied regression coefficients overshoot the data estimates for the

short- to medium-term maturities. For the 2-year bond, the model-implied regression

coefficient increases to 3.12 compared to the empirical value of 1.76. The intuition

behind this result is that the regime-shift in the constants affects the unconditional

factor means within the ZLB regime. As the shift in constants drives down the point

that the short rate mean-reverts towards, this in turn implies longer durations at the ZLB, since there are no strong mean-reversion forces pulling the short rate away from the ZLB regime. Row (4) in Table 3.4 instead allows only the transition coefficient of the yield curve slope factor to shift. This specification again increases the regression coefficients, although the effect is less than for the intercepts. The regression

coefficient for the 2-year bond excess return largely matches the data counterpart

(1.87 vs. 1.76 in the data), whereas the longer-term bond excess return regression

coefficients fall short of the data counterparts (4.44 vs. 7.34 in the data). This pattern


Table 3.4: Model-Implied Linear Projection Experiments

This table presents the effects of the regime-switching parameters on the model-implied excess return regression coefficients. The regression coefficients are obtained by simulating a sample path with length T = 1,000,000 from the R-SRM evaluated at the specified parameters. The regression specification is $rx^{(n)}_{t+1} = \beta^{(n)}_0 + \beta^{(n)}_1\left(y^{(n)}_t - y^{(1)}_t\right) + \delta^{(n)}_0 I_{\{r_t<c\}} + \delta^{(n)}_1 I_{\{r_t<c\}}\left(y^{(n)}_t - y^{(1)}_t\right) + \varepsilon^{(n)}_{t+1}$. The baseline specification has factor dynamics that are equal across the two regimes and estimated based on the non-binding ZLB period. Column 1 outlines which parameters are allowed to switch to the values estimated over the binding ZLB period.

                                         β(2)1 + δ(2)1    β(5)1 + δ(5)1    β(10)1 + δ(10)1
(1) Data                                 1.76             4.66             7.34
(2) No regime-switch                     1.45             2.46             3.88
(3) h(1)0 → h(2)0                        3.12             5.61             7.58
(4) h(1)X(2,·) → h(2)X(2,·)              1.87             2.89             4.44
(5) h(1)X → h(2)X                        2.13             3.34             4.63
(6) h(1)0, h(1)X → h(2)0, h(2)X          2.08             4.26             6.44

remains in row (5), where the entire factor transition matrix is allowed to switch.

Finally, the specification with both intercept and transition parameters switching

strikes a good balance between matching the short-, medium-, and long-term excess

return regression coefficients within the ZLB regime. The 2-, 5- and 10-year model-

implied regression coefficients are 2.08, 4.26, and 6.44 compared to the empirical

counterparts of 1.76, 4.66, and 7.34. The experiments highlight the importance of

prolonging the durations of the ZLB spells. Here, this is largely achieved by changing

the point towards which the short rate mean-reverts within the ZLB regime.

3.5 Economic Implications

The differences in market prices of risk are economically important. Because the

market prices of risk affect the factor dynamics, and thus ultimately yield dynamics

and persistence, the extended model predicts important differences in expected bond

returns compared to the standard SRM. These differences have important implications for measuring bond risk premia and doing inference on market expectations

for the monetary policy lift-off horizon.

3.5.1 Bond Risk Premia

The regime-dependent market prices of risk imply differences in yield curve dynamics

and model-implied bond risk premia.

Figure 3.9 shows the model-implied expected bond excess returns for 3-, 5-, 7-, and 10-year bonds. The two model-implied estimates of expected bond returns are of course highly correlated. For the reported maturities, the correlation is in the range 0.91-0.93. However, the two models do display important differences. In particular, over the ZLB period, the R-SRM implies expected excess returns that display greater variability than the SRM counterpart. The variability in the R-SRM implied expected excess returns is 19-43% higher than that of the SRM counterparts, depending on the considered maturity. For comparison, the variability of the R-SRM implied expected excess returns over the full sample is only 6-8% greater than that of the SRM counterparts; this difference is primarily driven by the ZLB period.

Figure 3.9: Model-Implied Expected Excess Returns

This figure plots the time series of model-implied expected excess returns for different maturities. The model-implied expectations are computed from 2,000 Monte Carlo draws from the R-SRM and SRM models, respectively.

There is of course no information in Figure 3.9 that is useful for judging which of the two models provides a better measure of bond risk premia. Our proposed specification test suggests that the R-SRM has well-specified bond risk premia, whereas the standard SRM struggles to capture the empirical properties of excess return dynamics. Another popular specification test for conditional expectations is the Mincer and Zarnowitz (1969) regression. For this particular application, the test is based on a regression of realized excess returns onto a constant and the model-implied conditional expected excess return. That is, we run the regressions

\[ rx^{(n)}_{t+1} = \phi^{(n)}_0 + \phi^{(n)}_1 E_t\left[ rx^{(n)}_{t+1} \right] + \varepsilon^{(n)}_{t+1}. \tag{3.8} \]

A well-specified measure of conditional expectations should return a constant and

regression slope of 0 and 1, respectively. Unreported results show that there is little

evidence to distinguish between the SRM and R-SRM over the full sample, as both models return constants slightly above 0 and regression slopes slightly below 1. For both models, we cannot reject that the model-implied expected excess returns are well specified when accounting for estimation uncertainty. However, we are particularly interested in the period with a binding ZLB. Therefore, we perform a modified Mincer and Zarnowitz (1969) test. We redo the regressions over two subsamples: (i) the periods with a non-binding ZLB and (ii) the period with a binding ZLB. Formally, the modified Mincer and Zarnowitz (1969) test reads

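\[ rx^{(n)}_{t+1} = I_{\{r_t \ge c\}} \nu^{(n)}_0 + I_{\{r_t \ge c\}} \nu^{(n)}_1 E_t\left[ rx^{(n)}_{t+1} \right] + I_{\{r_t < c\}} \rho^{(n)}_0 + I_{\{r_t < c\}} \rho^{(n)}_1 E_t\left[ rx^{(n)}_{t+1} \right] + \varepsilon^{(n)}_{t+1}. \tag{3.9} \]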

As before, a given month is deemed to be constrained by the ZLB if the short rate is

below the threshold level c = 0.01.
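
The mechanics of (3.9) can be sketched in a few lines. Because every regressor in (3.9) is interacted with one of the two indicators, the least-squares point estimates coincide with those from separate fits on each subsample, which is what the sketch below does. The function names and the dictionary keys are illustrative assumptions, and the sketch covers point estimates only, not the treatment of estimation uncertainty.

```python
def ols_line(x, y):
    """Return (intercept, slope) from a least-squares fit of y on a constant and x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def modified_mincer_zarnowitz(realized, expected, short_rate, c=0.01):
    """Fit the Mincer-Zarnowitz line separately for r_t >= c and r_t < c.

    realized[k] is the realized excess return, expected[k] the model-implied
    expectation formed one period earlier, and short_rate[k] the short rate
    at the time the expectation was formed.
    """
    groups = {"non_binding": ([], []), "binding": ([], [])}
    for rx, e_rx, r in zip(realized, expected, short_rate):
        key = "binding" if r < c else "non_binding"
        groups[key][0].append(e_rx)  # regressor: model-implied expectation
        groups[key][1].append(rx)    # regressand: realized excess return
    # (intercept, slope) pairs: (nu_0, nu_1) and (rho_0, rho_1) in (3.9)
    return {key: ols_line(xs, ys) for key, (xs, ys) in groups.items()}
```

A well-specified model would return pairs close to (0, 1) in both regimes.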

Figure 3.10: Mincer-Zarnowitz Regressions

This figure plots the results from the Mincer and Zarnowitz (1969) regressions, $rx^{(n)}_{t+1} = \phi^{(n)}_0 + \phi^{(n)}_1 E_t[rx^{(n)}_{t+1}] + \varepsilon^{(n)}_{t+1}$, where $E_t[rx^{(n)}_{t+1}]$ are model-implied expectations from the R-SRM and SRM. The regressions are done over two subsamples: (i) non-binding ZLB periods and (ii) binding ZLB periods. The model-implied expectations are computed from 2,000 Monte Carlo draws from the R-SRM and SRM models, respectively.

For both models, the constants $\nu^{(n)}_0$ and $\rho^{(n)}_0$ are all very close to zero. Over the subsample with non-binding ZLB, the regression coefficients $\nu^{(n)}_1$ are similar for the two models. As for the full sample, the regression coefficients are below 1 for both models. The binding ZLB period displays larger differences between the two models. The R-SRM has regression coefficients $\rho^{(n)}_1$ that are somewhat closer to unity than the SRM, although estimation uncertainty unavoidably makes the differences statistically insignificant due to the short sample period. Thus, the R-SRM does somewhat better than the SRM over the ZLB period without compromising the performance over the non-binding ZLB period.

3.5.2 Lift-Off Probability

A key feature that enables the R-SRM to match the dynamics of bond risk premia at

the ZLB is the prolonged durations of ZLB episodes. One popular application of the

baseline SRM is to extract market expectations about the timing of policy rate lift-off

from the ZLB (e.g. Bauer and Rudebusch, 2016). Obviously, it is important to specify

the dynamics of the short rate correctly to obtain the most accurate predictions of

the timing of anticipated lift-off.

Figure 3.11: Lift-Off Probabilities

This figure plots the time series of R-SRM and SRM implied lift-off probabilities over the one-year horizon. Probabilities are computed from 100,000 Monte Carlo draws from the model-implied distribution of the short rate. Lift-off is defined as the short rate exceeding the threshold level c = 0.01.

We compute model-implied lift-off probabilities at each point in time by simula-

tion. Starting from the estimated state vector, we simulate forward 100,000 sample

paths and calculate the proportion of sample paths where the short rate exceeds the

threshold value c = 0.01 one year from the considered date.
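
As a rough illustration of this simulation step, the sketch below uses a deliberately simplified setting: a single Gaussian AR(1) shadow rate at a monthly frequency, with no regime switching. The parameter names and all numerical values are hypothetical and are not the estimated R-SRM or SRM. In this simplified case the probability is also available in closed form, which gives a check on the Monte Carlo estimate.

```python
import math
import random


def liftoff_probability(s0, mean, persistence, vol, horizon=12, c=0.01,
                        n_paths=20_000, seed=0):
    """Share of simulated paths with the short rate above c after `horizon` steps."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_paths):
        s = s0
        for _ in range(horizon):
            # Hypothetical AR(1) shadow rate: s' = mean + persistence*(s - mean) + vol*eps
            s = mean + persistence * (s - mean) + vol * rng.gauss(0.0, 1.0)
        if max(s, 0.0) > c:  # short rate r = max(s, 0)
            hits += 1
    return hits / n_paths


def liftoff_probability_exact(s0, mean, persistence, vol, horizon=12, c=0.01):
    """Closed-form counterpart, available because s_{t+h} is Gaussian in this sketch."""
    m = mean + persistence ** horizon * (s0 - mean)
    var = vol ** 2 * sum(persistence ** (2 * k) for k in range(horizon))
    z = (c - m) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # P(s_{t+h} > c)
```

Lowering the point towards which the shadow rate mean-reverts (the `mean` argument) lowers the lift-off probability, which is the qualitative mechanism discussed in the text.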

As expected, the R-SRM generally implies lower lift-off probabilities than the SRM.

This result comes from the lower degree of mean-reversion within the ZLB regime

for the R-SRM. However, over the early part of the period with binding ZLB, the two

models generally agree that lift-off was an unlikely event. The two models disagree

with a wider margin in early to mid-2014 and afterwards. The SRM assigns substantial

probability (around 60%) to the short rate exceeding 1 percent as early as December

2015. On the other hand, the R-SRM assigns much smaller probability to this event;

around 20%. The Federal Reserve first raised its target range to 25-50 bps in December

2015, and proclaimed further increases in the federal funds rate target range at a

steady pace onwards. Over the course of 2015 and 2016, the R-SRM assigns increasing

probability of the short rate exceeding 1 percent. The same pattern is evident for the

SRM. The exact probabilities, however, still display substantial differences across the

two models. The SRM model assigns probabilities around 80% by the end of 2016,

whereas the R-SRM assigns a probability of around 50% to the short rate exceeding

1 percent by the end of 2017. By early 2017, the R-SRM catches up with the SRM as

both models assign a high probability of lift-off of approximately 80%. The actual fed

funds target range increased to 1-1.25 percent in June 2017.

The R-SRM suggests that simply extrapolating the short rate dynamics from the pre-ZLB period when forming short rate projections at the ZLB may lead to lift-off probabilities that exceed those of market participants. The baseline SRM assigns substantial probabilities to the short rate exceeding 1 percent as early as two to three years before it actually did. In comparison, the R-SRM, which estimates the lift-off probabilities from the yield curve dynamics estimated over the actual ZLB period, suggests that the SRM lift-off probabilities may be overstated. Further, the R-SRM suggests that the anticipated timing of lift-off may have been later than otherwise thought.

3.6 Conclusion

We document a structural break in the regression coefficients from linear projections of excess bond returns onto yield spreads. This empirical observation serves as a natural specification test of standard SRMs. As the short end of the yield curve becomes constrained from below, the slope of the yield curve will be flatter than it would have been in the absence of the lower bound. We find that the slope compression effect of the three-factor SRM is not strong enough to explain the regression-based evidence. One particular reason for this is that the SRM implies ZLB spells that are very short-lived. This implies that strong mean-reversion forces dominate the dynamics of yields and bond risk premia near the ZLB.

Instead, we propose an SRM with regime-dependent market prices of risk. The regime-dependence of risk prices is motivated by the changing nature of bond market risks over the recent ZLB episode. Unconventional monetary policies have, unlike conventional short rate policies, aimed at affecting longer-term bond prices. This changing nature of risks in bond markets may have induced investors to update their required risk compensations.

We find a significant switch in market prices of risk at the ZLB. Further, the SRM with regime-dependent market prices of risk matches the regression-based empirical evidence. Compared to the standard SRM, our extension implies longer durations of ZLB episodes. This is because the short rate mean-reverts towards a much lower long-run mean within the ZLB regime. Contrary to the standard SRM, the extension does not have strong mean-reversion forces pulling the short rate away from the ZLB regime.

The extension has some interesting economic implications. First, model-implied bond risk premia were up to 43% more volatile over the recent ZLB episode than their SRM counterparts. Second, the regime-switching model assigns substantially lower probability to monetary policy lift-off all the way up until shortly before the fact.

Acknowledgements

Andreasen acknowledges financial support from the Danish e-Infrastructure Cooperation (DeIC). Andreasen and Jørgensen acknowledge financial support to CREATES (Center for Research in Econometric Analysis of Time Series; DNRF78) from the Danish National Research Foundation. The analysis and conclusions are those of the authors and do not indicate concurrence by the Board of Governors of the Federal Reserve System or other members of the research staff of the Board.

3.7 References

Ahn, D. H., Dittmar, R. F., Gallant, R. A., 2002. Quadratic term structure models: Theory

and evidence. Review of Financial Studies Vol. 15, 243–288.

Andreasen, M. M., Christensen, B. J., 2015. The SR approach: A new estimation procedure for non-linear and non-Gaussian dynamic term structure models. Journal of Econometrics Vol. 184, 420–451.

Andreasen, M. M., Engsted, T., Møller, S. V., Sander, M., 2016. Bond market asym-

metries across recessions and expansions: New evidence on risk premia. Working

paper.

Andreasen, M. M., Meldrum, A. C., 2018. A shadow rate or a quadratic policy rule? The best way to enforce the zero lower bound in the United States. Journal of Financial and Quantitative Analysis Forthcoming.

Bauer, M. D., Rudebusch, G. D., 2016. Monetary policy expectations at the zero lower

bound. Journal of Money, Credit, and Banking Vol. 48, 1439–1465.

Black, F., 1995. Interest rates as options. Journal of Finance Vol. 50, 1371–1376.



Christensen, J. H. E., Diebold, F. X., Rudebusch, G. D., 2011. The affine arbitrage-free class of Nelson-Siegel term structure models. Journal of Econometrics Vol. 164, 4–20.

Christensen, J. H. E., Rudebusch, G. D., 2015. Estimating shadow-rate term structure

models with near-zero yields. Journal of Financial Econometrics Vol. 13, 226–259.

Cieslak, A., Povala, P., 2015. Expected returns in treasury bonds. Review of Financial

Studies Vol. 28, 2859–2901.

Cochrane, J. H., Piazzesi, M., 2005. Bond risk premia. American Economic Review Vol.

95, 138–160.

Cooper, I., Priestley, R., 2008. Time-varying risk premiums and the output gap. Review

of Financial Studies Vol. 22, 2801–2833.

Cox, J. C., Ingersoll, J. E., Ross, S. A., 1985. A theory of the term structure of interest

rates. Econometrica Vol. 53, 385–407.

Dai, Q., Singleton, K., 2000. Specification analysis of affine term structure models.



Dai, Q., Singleton, K. J., 2002. Expectation puzzles, time-varying risk premia, and

affine models of the term structure. Journal of Financial Economics Vol. 63, 415–

441.

Duffee, G., 2002. Term premia and interest rate forecast in affine models. Journal of

Finance Vol. 57, 405–443.

Duffie, D., Kan, R., 1996. A yield-factor model of interest rates. Mathematical Finance

Vol. 6, 379–406.



Feunou, B., Fontaine, J. S., Le, A., Lundblad, C., 2015. Tractable term-structure models

and the zero lower bound. Working Paper.

Filipovic, D., Larsson, M., Trolle, A. B., 2017. Linear-rational term structure models.


Greenwood, R., Vayanos, D., 2014. Bond supply and excess bond returns. Review of

Financial Studies Vol. 27, 663–713.

Gurkaynak, R. S., Sack, B., Wright, J. H., 2007. The U.S. Treasury yield curve: 1961 to the present. Journal of Monetary Economics Vol. 54, 2291–2304.

Joslin, S., Priebsch, M., Singleton, K. J., 2014. Risk premiums in dynamic term structure

models with unspanned macro risk. Journal of Finance Vol. 69, 453–468.

Joslin, S., Singleton, K. J., Zhu, H., 2011. A new perspective on Gaussian dynamic term structure models. Review of Financial Studies Vol. 24, 926–970.

Kim, D. H., Singleton, K. J., 2012. Term structure models and the zero bound: An empirical investigation of Japanese yields. Journal of Econometrics Vol. 170, 32–49.

Leippold, M., Wu, L., 2002. Asset pricing under the quadratic class. Journal of Finan-

cial and Quantitative Analysis Vol. 37, 271–295.

Ludvigson, S. C., Ng, S., 2009. Macro factors in bond risk premia. Review of Financial

Studies Vol. 22, 5027–5067.

Mincer, J., Zarnowitz, V., 1969. The evaluation of economic forecasts. In: Mincer, J.

(Ed.), Handbook of Economic Forecasting. New York: National Bureau of Economic

Research, 81–111.

Monfort, A., Pegoraro, F., Renne, J. P., Roussellet, G., 2017. Staying at zero with affine

processes: A new dynamic term structure model. Journal of Econometrics Vol. 20,

348–366.

Priebsch, M. A., 2013. Computing arbitrage-free yields in multi-factor Gaussian shadow-rate term structure models. Board of Governors of the Federal Reserve System (U.S.) Finance and Economics Discussion Series 2013-63.

Rudebusch, G. D., Wu, T., 2007. Accounting for a shift in term structure behavior with

no-arbitrage and macro-finance models. Journal of Money, Credit, and Banking

Vol. 39, 395–422.

Wu, J. C., Xia, F. D., 2016. Measuring the macroeconomic impact of monetary policy

at the zero lower bound. Journal of Money, Credit, and Banking Vol. 48, 253–291.


Appendix

C.1 Robustness Results

C.1.1 Extended Sample: November 1971 - December 2017

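Figure C.1: Modified Bond Return Regressions: Yield Spread

This figure reports modified 1-year excess bond return regression coefficients estimated over the November 1971 through December 2017 sample. The estimated specification is $rx^{(n)}_{t+1} = \beta^{(n)}_0 + \beta^{(n)}_1 (y^{(n)}_t - y^{(1)}_t) + \delta^{(n)}_1 I_{\{r_t<c\}} (y^{(n)}_t - y^{(1)}_t) + \varepsilon^{(n)}_{t+1}$.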


C.1.2 Holding Period: 3 Months




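[Figure: modified excess bond return regression coefficients, 3-month holding period. Right-hand side of the estimated specification: $\beta^{(n)}_0 + \beta^{(n)}_1 (y^{(n)}_t - y^{(1)}_t) + \delta^{(n)}_1 I_{\{r_t<c\}} (y^{(n)}_t - y^{(1)}_t) + \varepsilon^{(n)}$.]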





C.1.3 Holding Period: 6 Months




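[Figure: modified excess bond return regression coefficients, 6-month holding period. Right-hand side of the estimated specification: $\beta^{(n)}_0 + \beta^{(n)}_1 (y^{(n)}_t - y^{(1)}_t) + \delta^{(n)}_1 I_{\{r_t<c\}} (y^{(n)}_t - y^{(1)}_t) + \varepsilon^{(n)}$.]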




C.1.4 Threshold: 25 bps




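[Figure: modified excess bond return regression coefficients, ZLB threshold of 25 bps. Right-hand side of the estimated specification: $\beta^{(n)}_0 + \beta^{(n)}_1 (y^{(n)}_t - y^{(1)}_t) + \delta^{(n)}_1 I_{\{r_t<c\}} (y^{(n)}_t - y^{(1)}_t) + \varepsilon^{(n)}$.]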





C.1.5 Threshold: 50 bps




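[Figure: modified excess bond return regression coefficients, ZLB threshold of 50 bps. Right-hand side of the estimated specification: $\beta^{(n)}_0 + \beta^{(n)}_1 (y^{(n)}_t - y^{(1)}_t) + \delta^{(n)}_1 I_{\{r_t<c\}} (y^{(n)}_t - y^{(1)}_t) + \varepsilon^{(n)}$.]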




C.1.6 Controlling for Level and Curvature

Figure C.6: Modified Bond Return Regressions: Second PC


1990 through December 2017 sample. The estimated specification is $rx^{(n)}_{t+1} = \beta^{(n)}_0 + \beta^{(n)}_1 PC_{1,t} + \beta^{(n)}_2 PC_{2,t} + \beta^{(n)}_3 PC_{3,t} + \delta^{(n)}_1 I_{\{r_t<c\}} PC_{2,t} + \varepsilon^{(n)}_{t+1}$. Confidence bands are constructed using the 2.5 and 97.5 percentiles from 5,000 block bootstrap repetitions with a block length of 24 months. Each bootstrap sample is required to have a minimum of 50 ZLB observations.
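
One way the resampling step could be organized is sketched below. The text does not specify the bootstrap variant or how the ZLB requirement is enforced, so the use of overlapping (moving) blocks and of simple redrawing until the requirement is met are assumptions of this sketch, as are the function names.

```python
import random


def block_bootstrap_indices(n_obs, block_length, rng):
    """Concatenate randomly placed blocks of consecutive indices, trimmed to n_obs."""
    indices = []
    while len(indices) < n_obs:
        start = rng.randrange(n_obs - block_length + 1)
        indices.extend(range(start, start + block_length))
    return indices[:n_obs]


def draw_sample_with_min_zlb(is_zlb, block_length=24, min_zlb=50, seed=0,
                             max_tries=10_000):
    """Redraw bootstrap samples until at least `min_zlb` ZLB months are included."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        idx = block_bootstrap_indices(len(is_zlb), block_length, rng)
        if sum(is_zlb[i] for i in idx) >= min_zlb:
            return idx
    raise RuntimeError("no bootstrap sample met the ZLB requirement")
```

The regression would then be re-estimated on each accepted index set, and the 2.5 and 97.5 percentiles of the resulting coefficients would form the bands.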

C.2 Shadow Rate Model

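Define the shadow rate $s_t$ to be a linear function of the $N_X \times 1$ vector of pricing factors $X_t$, i.e.

\[ s_t = \alpha + \beta' X_t. \tag{C.1} \]

The short rate is given by

\[ r_t = \max\{s_t, 0\}, \tag{C.2} \]

where 0 is assumed to be the lower bound on the short rate. The risk factors have a VAR(1) law of motion under the risk-neutral $\mathbb{Q}$ measure

\[ X_{t+1} = \Phi\mu + (I - \Phi) X_t + \Sigma \varepsilon^{\mathbb{Q}}_{t+1}, \tag{C.3} \]

where $\varepsilon^{\mathbb{Q}}_{t+1} \sim NID(0, I)$. Here, $\Phi$ is an $N_X \times N_X$ matrix, $\mu$ is an $N_X \times 1$ vector, and $\Sigma$ is an $N_X \times N_X$ matrix. Under the assumption of an essentially affine stochastic discount factor, the physical measure $\mathbb{P}$ dynamics are

\[ X_{t+1} = h_0 + h_X X_t + \Sigma \varepsilon^{\mathbb{P}}_{t+1}, \tag{C.4} \]

where $\varepsilon^{\mathbb{P}}_{t+1} \sim NID(0, I)$. Here, $h_X$ is an $N_X \times N_X$ matrix and $h_0$ is an $N_X \times 1$ vector.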

C.2.1 Risk-Neutral Pricing

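No-arbitrage implies that the price $P^{(n)}_t$ of an $n$-period zero-coupon bond at time $t$ is given by

\[ P^{(n)}_t = E^{\mathbb{Q}}_t\left[ \exp\{-r_t\} P^{(n-1)}_{t+1} \right], \tag{C.5} \]

or by recursive substitution

\[ P^{(n)}_t = E^{\mathbb{Q}}_t\left[ \exp\left\{ -\sum_{i=0}^{n-1} r_{t+i} \right\} \right]. \tag{C.6} \]

Since the yield to maturity is defined as $y^{(n)}_t = -\frac{1}{n} \log P^{(n)}_t$, we have

\[ y^{(n)}_t = -\frac{1}{n} \log E^{\mathbb{Q}}_t\left[ \exp\left( -\sum_{i=0}^{n-1} r_{t+i} \right) \right]. \tag{C.7} \]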

C.2.2 Cumulant Approximation

The quantity $\log E^{\mathbb{Q}}_t\left[ \exp\left( -\sum_{i=0}^{n-1} r_{t+i} \right) \right]$ appearing in (C.7) is the conditional cumulant-generating function under $\mathbb{Q}$, evaluated at $-1$, of the random variable $R^N_t \equiv \sum_{i=0}^{n-1} r_{t+i}$. It can be expressed in terms of its series representation, i.e.

\[ \log E^{\mathbb{Q}}_t\left[ \exp\left( -R^N_t \right) \right] = \sum_{j=1}^{\infty} (-1)^j \frac{\kappa^{\mathbb{Q}}_j}{j!}, \tag{C.8} \]

where $\kappa^{\mathbb{Q}}_j$ is the $j$'th cumulant of $R^N_t$ under $\mathbb{Q}$. An approximation to (C.7) can therefore be computed by truncating the sum in (C.8) after a finite number of terms.

C.2.2.1 First-Order Approximation

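The first-order approximation of (C.7) reads

\[ y^{(n)}_t = \frac{1}{n} E^{\mathbb{Q}}_t\left[ \sum_{i=0}^{n-1} r_{t+i} \right]. \tag{C.9} \]

In order to compute the expectation in (C.9), notice that if $Z \sim N(\mu, \sigma^2)$ then we have

\[ E\left[ \max\{Z, 0\} \right] = \mu \Phi\left( \frac{\mu}{\sigma} \right) + \sigma \phi\left( \frac{\mu}{\sigma} \right). \tag{C.10} \]

Here and below, $\Phi(\cdot)$ and $\phi(\cdot)$ applied to a scalar argument denote the standard normal distribution function and density, not the matrix $\Phi$ in (C.3).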
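
A quick numerical check of (C.10) can be reassuring before it is used repeatedly. The sketch below compares the closed form with a midpoint-rule integral of $\max\{z, 0\}$ against the normal density; the function names, the integration range, and the step count are illustrative choices.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def expected_max_zero(mu, sigma):
    """Closed form (C.10) for E[max{Z, 0}] with Z ~ N(mu, sigma^2)."""
    z = mu / sigma
    return mu * norm_cdf(z) + sigma * norm_pdf(z)


def expected_max_zero_numeric(mu, sigma, n_steps=20_000, width=10.0):
    """Midpoint-rule integral of max(z, 0) times the density over mu +/- width*sigma."""
    lo = mu - width * sigma
    h = 2.0 * width * sigma / n_steps
    total = 0.0
    for k in range(n_steps):
        z = lo + (k + 0.5) * h
        total += max(z, 0.0) * norm_pdf((z - mu) / sigma) / sigma * h
    return total
```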


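Note that since $X_t$ is normal, $s_t$ is normal as well. Thus (C.10) is applicable for evaluating the expectation in (C.9). We simply need to evaluate the conditional mean and standard deviation of $s_t$. Firstly, the conditional mean can be computed as

\[ E^{\mathbb{Q}}_t\left[ s_{t+i} \right] = \alpha + \beta' E^{\mathbb{Q}}_t\left[ X_{t+i} \right], \tag{C.11} \]

where $E^{\mathbb{Q}}_t\left[ X_{t+i} \right]$ can be computed by the recursive substitutions

\[ \begin{aligned}
X_{t+2} &= \Phi\mu + (I-\Phi) X_{t+1} + \Sigma\varepsilon^{\mathbb{Q}}_{t+2} \\
&= \Phi\mu + (I-\Phi)\left( \Phi\mu + (I-\Phi) X_t + \Sigma\varepsilon^{\mathbb{Q}}_{t+1} \right) + \Sigma\varepsilon^{\mathbb{Q}}_{t+2} \\
&= \Phi\mu + (I-\Phi)\Phi\mu + (I-\Phi)^2 X_t + (I-\Phi)\Sigma\varepsilon^{\mathbb{Q}}_{t+1} + \Sigma\varepsilon^{\mathbb{Q}}_{t+2} \\
X_{t+3} &= \Phi\mu + (I-\Phi) X_{t+2} + \Sigma\varepsilon^{\mathbb{Q}}_{t+3} \\
&= \Phi\mu + (I-\Phi)\left( \Phi\mu + (I-\Phi)\Phi\mu + (I-\Phi)^2 X_t + (I-\Phi)\Sigma\varepsilon^{\mathbb{Q}}_{t+1} + \Sigma\varepsilon^{\mathbb{Q}}_{t+2} \right) + \Sigma\varepsilon^{\mathbb{Q}}_{t+3} \\
&= \Phi\mu + (I-\Phi)\Phi\mu + (I-\Phi)^2\Phi\mu + (I-\Phi)^3 X_t + (I-\Phi)^2\Sigma\varepsilon^{\mathbb{Q}}_{t+1} + (I-\Phi)\Sigma\varepsilon^{\mathbb{Q}}_{t+2} + \Sigma\varepsilon^{\mathbb{Q}}_{t+3} \\
&\;\;\vdots \\
X_{t+i} &= \sum_{j=0}^{i-1} (I-\Phi)^j \Phi\mu + (I-\Phi)^i X_t + \sum_{j=0}^{i-1} (I-\Phi)^j \Sigma\varepsilon^{\mathbb{Q}}_{t+i-j}.
\end{aligned} \tag{C.12} \]

This implies

\[ E^{\mathbb{Q}}_t\left[ X_{t+i} \right] = \sum_{j=0}^{i-1} (I-\Phi)^j \Phi\mu + (I-\Phi)^i X_t. \tag{C.13} \]

Hence,

\[ E^{\mathbb{Q}}_t\left[ s_{t+i} \right] = \alpha + \beta' \sum_{j=0}^{i-1} (I-\Phi)^j \Phi\mu + \beta' (I-\Phi)^i X_t. \tag{C.14} \]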
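
The closed form (C.13) can be checked against a direct iteration of the expected law of motion. The sketch below does this with plain Python lists so that it needs no external library; the helper names are illustrative, and the parameter values used to exercise it are hypothetical.

```python
def mat_vec(a, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def vec_add(u, v):
    """Element-wise sum of two vectors."""
    return [ui + vi for ui, vi in zip(u, v)]


def identity_minus(phi):
    """Return the matrix I - Phi as a list of rows."""
    n = len(phi)
    return [[(1.0 if r == c else 0.0) - phi[r][c] for c in range(n)] for r in range(n)]


def conditional_mean_recursive(x_t, phi, mu, i):
    """Iterate E[X_{t+k+1}] = Phi*mu + (I - Phi) E[X_{t+k}] for k = 0, ..., i-1."""
    a = identity_minus(phi)
    const = mat_vec(phi, mu)
    mean = list(x_t)
    for _ in range(i):
        mean = vec_add(const, mat_vec(a, mean))
    return mean


def conditional_mean_closed_form(x_t, phi, mu, i):
    """Evaluate (C.13): sum_{j<i} (I - Phi)^j Phi*mu + (I - Phi)^i X_t."""
    a = identity_minus(phi)
    term = mat_vec(phi, mu)   # (I - Phi)^0 Phi*mu
    total = [0.0] * len(x_t)
    power_x = list(x_t)       # (I - Phi)^0 X_t
    for _ in range(i):
        total = vec_add(total, term)
        term = mat_vec(a, term)
        power_x = mat_vec(a, power_x)
    return vec_add(total, power_x)
```

When $\Phi$ is invertible and $I - \Phi$ has eigenvalues inside the unit circle, both versions converge to $\mu$ as the horizon grows, which is the sense in which $\mu$ is the long-run mean under $\mathbb{Q}$.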

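The conditional variance can be computed as

\[ V^{\mathbb{Q}}_t\left[ s_{t+i} \right] = \beta' V^{\mathbb{Q}}_t\left[ X_{t+i} \right] \beta, \tag{C.15} \]

where $V^{\mathbb{Q}}_t\left[ X_{t+i} \right]$ can be computed from (C.12) as

\[ \begin{aligned}
V^{\mathbb{Q}}_t\left[ X_{t+i} \right] &= E^{\mathbb{Q}}_t\left[ \left( X_{t+i} - E^{\mathbb{Q}}_t[X_{t+i}] \right) \left( X_{t+i} - E^{\mathbb{Q}}_t[X_{t+i}] \right)' \right] \\
&= E^{\mathbb{Q}}_t\left[ \left( \sum_{j=0}^{i-1} (I-\Phi)^j \Sigma\varepsilon^{\mathbb{Q}}_{t+i-j} \right) \left( \sum_{j=0}^{i-1} (I-\Phi)^j \Sigma\varepsilon^{\mathbb{Q}}_{t+i-j} \right)' \right].
\end{aligned} \tag{C.16} \]

Since the innovations are independent across time it follows that

\[ \begin{aligned}
V^{\mathbb{Q}}_t\left[ X_{t+i} \right] &= E^{\mathbb{Q}}_t\left[ \sum_{j=0}^{i-1} (I-\Phi)^j \Sigma\varepsilon^{\mathbb{Q}}_{t+i-j} \left( \varepsilon^{\mathbb{Q}}_{t+i-j} \right)' \Sigma' \left( (I-\Phi)^j \right)' \right] \\
&= \sum_{j=0}^{i-1} (I-\Phi)^j \Sigma\Sigma' \left( (I-\Phi)^j \right)'.
\end{aligned} \tag{C.17} \]

Equation (C.17) implies that the variances can be computed recursively for $i = 1, \dots, N$ by

\[ \begin{aligned}
V^{\mathbb{Q}}_t\left[ X_{t+i} \right] &= \sum_{j=0}^{i-1} (I-\Phi)^j \Sigma\Sigma' \left( (I-\Phi)^j \right)' \\
&= \sum_{j=0}^{i-2} (I-\Phi)^j \Sigma\Sigma' \left( (I-\Phi)^j \right)' + (I-\Phi)^{i-1} \Sigma\Sigma' \left( (I-\Phi)^{i-1} \right)' \\
&= V^{\mathbb{Q}}_t\left[ X_{t+i-1} \right] + (I-\Phi)^{i-1} \Sigma\Sigma' \left( (I-\Phi)^{i-1} \right)',
\end{aligned} \tag{C.18} \]

with the initial condition $V^{\mathbb{Q}}_t\left[ X_t \right] = 0$. Applying (C.10) it follows that

\[ E^{\mathbb{Q}}_t\left[ r_{t+i} \right] = E^{\mathbb{Q}}_t\left[ \max\{s_{t+i}, 0\} \right] = \mu_{t,t+i} \Phi\left( \frac{\mu_{t,t+i}}{\sigma_{t,t+i}} \right) + \sigma_{t,t+i} \phi\left( \frac{\mu_{t,t+i}}{\sigma_{t,t+i}} \right), \tag{C.19} \]

where $\mu_{t,t+i} = \alpha + \beta' E^{\mathbb{Q}}_t\left[ X_{t+i} \right]$ and $\sigma^2_{t,t+i} = \beta' V^{\mathbb{Q}}_t\left[ X_{t+i} \right] \beta$. It is then straightforward to compute

\[ E^{\mathbb{Q}}_t\left[ \sum_{i=0}^{n-1} r_{t+i} \right] = \sum_{i=0}^{n-1} E^{\mathbb{Q}}_t\left[ r_{t+i} \right], \tag{C.20} \]

using the result from (C.19). This completes the computation of the first-order approximation.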
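
The steps (C.13), (C.18), (C.19), and (C.20) can be strung together as in the sketch below. To keep it short, it assumes a single pricing factor, so that $\Phi$, $\mu$, $\Sigma$, and $\beta$ are scalars; the function name and argument names are illustrative. The term for $i = 0$ is handled separately because $r_t$ is known at time $t$ and $\sigma_{t,t} = 0$.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def first_order_yield(x_t, n, alpha, beta, phi, mu, sigma):
    """First-order yield (C.9) for a scalar factor, in per-period units."""
    a = 1.0 - phi              # scalar counterpart of (I - Phi)
    mean_x, var_x = x_t, 0.0   # E_t[X_t] = X_t and V_t[X_t] = 0
    total = 0.0
    for i in range(n):
        m = alpha + beta * mean_x            # mu_{t,t+i}
        sd = abs(beta) * math.sqrt(var_x)    # sigma_{t,t+i}
        if sd == 0.0:
            total += max(m, 0.0)             # i = 0: the short rate is known
        else:
            total += m * norm_cdf(m / sd) + sd * norm_pdf(m / sd)  # (C.19)
        # Advance to horizon i + 1: variance via (C.18), mean via the VAR(1)
        var_x += (a ** i) * sigma * sigma * (a ** i)
        mean_x = phi * mu + a * mean_x
    return total / n                         # (C.20) divided by n, as in (C.9)
```

Far above the bound the result collapses to the average expected shadow rate, while near the bound it rises with `sigma`, reflecting the option-like payoff in (C.2).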

C.2.2.2 Second-Order Approximation

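The second-order approximation of (C.7) reads

\[ y^{(n)}_t = \frac{1}{n} \left( E^{\mathbb{Q}}_t\left[ \sum_{i=0}^{n-1} r_{t+i} \right] - \frac{1}{2} V^{\mathbb{Q}}_t\left[ \sum_{i=0}^{n-1} r_{t+i} \right] \right). \tag{C.21} \]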
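
To see how (C.9) and (C.21) follow from (C.8), it may help to write out the leading terms of the series, using the standard facts that the first cumulant is the mean and the second cumulant is the variance. The display below is a sketch of that intermediate step and is not part of the original derivation.

```latex
\begin{aligned}
\log E^{\mathbb{Q}}_t\left[ \exp\left( -R^N_t \right) \right]
  &= -\kappa^{\mathbb{Q}}_1 + \tfrac{1}{2}\kappa^{\mathbb{Q}}_2 - \tfrac{1}{6}\kappa^{\mathbb{Q}}_3 + \dots,
  \qquad \kappa^{\mathbb{Q}}_1 = E^{\mathbb{Q}}_t\left[ R^N_t \right],
  \quad \kappa^{\mathbb{Q}}_2 = V^{\mathbb{Q}}_t\left[ R^N_t \right], \\
y^{(n)}_t
  &= -\frac{1}{n} \log E^{\mathbb{Q}}_t\left[ \exp\left( -R^N_t \right) \right]
  \approx \frac{1}{n} \left( \kappa^{\mathbb{Q}}_1 - \tfrac{1}{2}\kappa^{\mathbb{Q}}_2 \right).
\end{aligned}
```

Keeping only $\kappa^{\mathbb{Q}}_1$ gives the first-order expression, and keeping both terms gives the second-order expression.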


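The second term can be computed by noting that

\[ \begin{aligned}
V^{\mathbb{Q}}_t\left[ \sum_{i=0}^{n-1} r_{t+i} \right] &= E^{\mathbb{Q}}_t\left[ \left( \sum_{i=0}^{n-1} r_{t+i} \right)^2 \right] - E^{\mathbb{Q}}_t\left[ \sum_{i=0}^{n-1} r_{t+i} \right]^2 \\
&= E^{\mathbb{Q}}_t\left[ \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} r_{t+i} r_{t+j} \right] - E^{\mathbb{Q}}_t\left[ \sum_{i=0}^{n-1} r_{t+i} \right]^2 \\
&= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} E^{\mathbb{Q}}_t\left[ r_{t+i} r_{t+j} \right] - \left( \sum_{i=0}^{n-1} E^{\mathbb{Q}}_t\left[ r_{t+i} \right] \right)^2.
\end{aligned} \tag{C.22} \]

Equation (C.22) implies that only $E^{\mathbb{Q}}_t\left[ r_{t+i} r_{t+j} \right]$ needs to be computed, since $E^{\mathbb{Q}}_t\left[ r_{t+i} \right]$ is already known from the first-order approximation. Inserting for the short rate it follows that

\[ E^{\mathbb{Q}}_t\left[ r_{t+i} r_{t+j} \right] = E^{\mathbb{Q}}_t\left[ \max\{s_{t+i}, 0\} \max\{s_{t+j}, 0\} \right]. \tag{C.23} \]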

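Note that if

\[ \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix} \sim N\left( \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{bmatrix} \right), \tag{C.24} \]

then

\[ \begin{aligned}
E\left[ \max\{Z_1, 0\} \max\{Z_2, 0\} \right] ={}& \left( \mu_1\mu_2 + \sigma_{12} \right) \Phi^d_2\left( -\zeta_1, -\zeta_2; \chi \right) \\
&+ \sigma_2 \mu_1 \phi(\zeta_2) \Phi\left( \frac{\zeta_1 - \chi\zeta_2}{\sqrt{1-\chi^2}} \right) \\
&+ \sigma_1 \mu_2 \phi(\zeta_1) \Phi\left( \frac{\zeta_2 - \chi\zeta_1}{\sqrt{1-\chi^2}} \right) \\
&+ \sigma_1 \sigma_2 \sqrt{\frac{1-\chi^2}{2\pi}} \, \phi\left( \sqrt{\frac{\zeta_1^2 - 2\chi\zeta_1\zeta_2 + \zeta_2^2}{1-\chi^2}} \right),
\end{aligned} \tag{C.25} \]

where $\zeta_j = \frac{\mu_j}{\sigma_j}$, $\chi = \frac{\sigma_{12}}{\sigma_1\sigma_2}$ and $\Phi^d_2(\zeta_1, \zeta_2; \chi) = 1 - \Phi(\zeta_1) - \Phi(\zeta_2) + \Phi_2(\zeta_1, \zeta_2; \chi)$. Thus to evaluate $E^{\mathbb{Q}}_t\left[ r_{t+i} r_{t+j} \right]$ it is necessary to compute $Cov^{\mathbb{Q}}_t\left[ s_{t+i}, s_{t+j} \right]$. From (C.1) it follows that

\[ Cov^{\mathbb{Q}}_t\left[ s_{t+i}, s_{t+j} \right] = \beta' Cov^{\mathbb{Q}}_t\left[ X_{t+i}, X_{t+j} \right] \beta, \tag{C.26} \]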

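and from (C.12) that

\[ \begin{aligned}
Cov^{\mathbb{Q}}_t\left[ X_{t+i}, X_{t+j} \right] &= Cov^{\mathbb{Q}}_t\left[ \sum_{k=0}^{i-1} (I-\Phi)^k \Sigma\varepsilon^{\mathbb{Q}}_{t+i-k}, \sum_{h=0}^{j-1} (I-\Phi)^h \Sigma\varepsilon^{\mathbb{Q}}_{t+j-h} \right] \\
&= \sum_{k=0}^{i-1} \sum_{h=0}^{j-1} Cov^{\mathbb{Q}}_t\left[ (I-\Phi)^k \Sigma\varepsilon^{\mathbb{Q}}_{t+i-k}, (I-\Phi)^h \Sigma\varepsilon^{\mathbb{Q}}_{t+j-h} \right].
\end{aligned} \tag{C.27} \]

Since the innovations are independent across time, only terms where $i - k = j - h$, or $k = i - j + h$, are non-zero. Hence, for $i \ge j$,

\[ \begin{aligned}
Cov^{\mathbb{Q}}_t\left[ X_{t+i}, X_{t+j} \right] &= \sum_{h=0}^{j-1} Cov^{\mathbb{Q}}_t\left[ (I-\Phi)^{i-j+h} \Sigma\varepsilon^{\mathbb{Q}}_{t+j-h}, (I-\Phi)^h \Sigma\varepsilon^{\mathbb{Q}}_{t+j-h} \right] \\
&= \sum_{h=0}^{j-1} (I-\Phi)^{i-j+h} \Sigma \, Cov^{\mathbb{Q}}_t\left[ \varepsilon^{\mathbb{Q}}_{t+j-h}, \varepsilon^{\mathbb{Q}}_{t+j-h} \right] \Sigma' \left( (I-\Phi)^h \right)' \\
&= \sum_{h=0}^{j-1} (I-\Phi)^{i-j+h} \Sigma\Sigma' \left( (I-\Phi)^h \right)'.
\end{aligned} \tag{C.28} \]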

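Covariances can then be computed recursively as

\[ Cov^{\mathbb{Q}}_t\left[ X_{t+i+1}, X_{t+j} \right] = (I-\Phi) \, Cov^{\mathbb{Q}}_t\left[ X_{t+i}, X_{t+j} \right], \tag{C.29} \]

with the initial condition $Cov^{\mathbb{Q}}_t\left[ X_{t+j}, X_{t+j} \right] = V^{\mathbb{Q}}_t\left[ X_{t+j} \right]$. Using these results, it is straightforward to compute

\[ \begin{aligned}
E^{\mathbb{Q}}_t\left[ \max\{s_{t+i}, 0\} \max\{s_{t+j}, 0\} \right] ={}& \left[ \mu_{t,t+i}\mu_{t,t+j} + \sigma_{t,t+i,t+j} \right] \Phi^d_2\left( -\zeta_{t,t+i}, -\zeta_{t,t+j}; \chi_{t,t+i,t+j} \right) \\
&+ \sigma_{t,t+j} \mu_{t,t+i} \phi\left( \zeta_{t,t+j} \right) \Phi\left( \frac{\zeta_{t,t+i} - \chi_{t,t+i,t+j}\zeta_{t,t+j}}{\sqrt{1-\chi^2_{t,t+i,t+j}}} \right) \\
&+ \sigma_{t,t+i} \mu_{t,t+j} \phi\left( \zeta_{t,t+i} \right) \Phi\left( \frac{\zeta_{t,t+j} - \chi_{t,t+i,t+j}\zeta_{t,t+i}}{\sqrt{1-\chi^2_{t,t+i,t+j}}} \right) \\
&+ \sigma_{t,t+i} \sigma_{t,t+j} \sqrt{\frac{1-\chi^2_{t,t+i,t+j}}{2\pi}} \, \phi\left( \sqrt{\frac{\zeta^2_{t,t+i} - 2\chi_{t,t+i,t+j}\zeta_{t,t+i}\zeta_{t,t+j} + \zeta^2_{t,t+j}}{1-\chi^2_{t,t+i,t+j}}} \right),
\end{aligned} \tag{C.30} \]

where $\sigma_{t,t+i,t+j} = \beta' Cov^{\mathbb{Q}}_t\left[ X_{t+i}, X_{t+j} \right] \beta$, $\zeta_{t,t+i} = \frac{\mu_{t,t+i}}{\sigma_{t,t+i}}$, and $\chi_{t,t+i,t+j} = \frac{\sigma_{t,t+i,t+j}}{\sigma_{t,t+i}\sigma_{t,t+j}}$.

This completes the computation of the second-order approximation.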
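
The bivariate formula (C.25) is easy to mistype, so a numerical cross-check may be useful. The sketch below evaluates the closed form, computing the decumulative bivariate normal probability by one-dimensional quadrature, and compares it with a direct integral that conditions on $Z_1$ and applies (C.10) to $Z_2$ given $Z_1$. All function names, the midpoint rule, and the integration ranges are illustrative assumptions rather than the numerical method used for the reported results.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def midpoint(f, lo, hi, n_steps=20_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n_steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n_steps))


def upper_orthant(a, b, chi):
    """P(U > a, V > b) for standard normals U, V with correlation chi."""
    root = math.sqrt(1.0 - chi * chi)
    return midpoint(lambda u: norm_pdf(u) * norm_cdf((chi * u - b) / root),
                    a, max(a, 0.0) + 10.0)


def product_closed_form(mu1, mu2, s1, s2, s12):
    """Right-hand side of (C.25); s1, s2 are standard deviations, s12 the covariance."""
    z1, z2 = mu1 / s1, mu2 / s2
    chi = s12 / (s1 * s2)
    root = math.sqrt(1.0 - chi * chi)
    quad = (z1 * z1 - 2.0 * chi * z1 * z2 + z2 * z2) / (1.0 - chi * chi)
    return ((mu1 * mu2 + s12) * upper_orthant(-z1, -z2, chi)
            + s2 * mu1 * norm_pdf(z2) * norm_cdf((z1 - chi * z2) / root)
            + s1 * mu2 * norm_pdf(z1) * norm_cdf((z2 - chi * z1) / root)
            + s1 * s2 * math.sqrt((1.0 - chi * chi) / (2.0 * math.pi))
            * norm_pdf(math.sqrt(quad)))


def product_numeric(mu1, mu2, s1, s2, s12):
    """Integrate z1 * density(z1) * E[max{Z2, 0} | Z1 = z1] over z1 > 0."""
    chi = s12 / (s1 * s2)
    cond_sd = s2 * math.sqrt(1.0 - chi * chi)

    def integrand(z1):
        cond_mean = mu2 + s12 / (s1 * s1) * (z1 - mu1)
        ratio = cond_mean / cond_sd
        inner = cond_mean * norm_cdf(ratio) + cond_sd * norm_pdf(ratio)  # (C.10)
        return z1 * norm_pdf((z1 - mu1) / s1) / s1 * inner

    return midpoint(integrand, 0.0, max(mu1, 0.0) + 10.0 * s1)
```

Both functions require $|\chi| < 1$; the case $i = j$ in (C.30), where the two shadow rates coincide, has to be handled separately.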
