
How Efficient is the Kalman Filter at

Estimating Affine Term Structure Models?†

Jens H. E. Christensen

Jose A. Lopez

and

Glenn D. Rudebusch

Federal Reserve Bank of San Francisco

101 Market Street, Mailstop 1130

San Francisco, CA 94105

Preliminary and incomplete draft. Comments are welcome.

Abstract

We perform a carefully orchestrated simulation study to analyze the bias of the Kalman

filter in estimating arbitrage-free Nelson-Siegel (AFNS) models with and without stochas-

tic volatility. For Gaussian AFNS models, we document significant finite-sample bias in

the estimated mean-reversion parameters. Since the Kalman filter is consistent and ef-

ficient for that model class, this exercise provides a measure of the finite-sample bias

that will affect any estimator. For AFNS models with stochastic volatility, significant

finite-sample upward estimation bias remains, but it is not materially larger than in the

Gaussian model. Hence, we recommend estimation based on the Kalman filter for both

types of AFNS models and corresponding affine term structure models in general.

JEL Classification: C13, C58, G12, G17.

Keywords: arbitrage-free Nelson-Siegel models, finite-sample bias, stochastic volatility

†We thank seminar participants at the Second Humboldt Copenhagen Conference on Financial Econometrics for comments on an earlier draft of this paper. The views in this paper are solely the responsibility of the authors and should not be interpreted as reflecting the views of the Federal Reserve Bank of San Francisco or the Board of Governors of the Federal Reserve System.

This version: August 24, 2015.

1 Introduction

Interest rate volatility is a topic of great research interest given its role in derivatives pricing

and portfolio risk management. However, as compared to the empirical results presented

in the extensive GARCH literature, the results of modeling interest rate volatility within

the more commonly used affine, arbitrage-free models of the term structure have been less

clear-cut, partly due to the difficulty in estimating their parameters.

Estimation of flexible affine term structure models is complicated and time consuming,

partly due to the fairly large number of parameters, and partly due to the latent nature of the

state variables in such models. The latter causes the estimation to be plagued by numerous

local maxima that are distinct in the sense that they are not invariant affine transformations1

of each other and therefore may have very different economic implications; see Duffee (2011)

and Kim and Orphanides (2012) for discussions of these issues.

To overcome those problems, Christensen et al. (2011, henceforth CDR) introduce the

affine arbitrage-free class of Nelson-Siegel term structure models (henceforth referred to as

AFNS models). These are affine term structure models that preserve the level, slope, and

curvature factor loading structure in the bond yield function known from the standard Nelson

and Siegel (1987) yield curve model. These models are easy to estimate because the role

of each factor is predetermined and does not vary for any admissible set of parameters.

Furthermore, in that model class, the state variables are Gaussian with constant volatility.

As a consequence, the models can be estimated with the standard Kalman filter, which

is equivalent to exact maximum likelihood estimation and therefore is both efficient and

consistent in the limit. However, despite its consistency and efficiency, the Kalman filter

remains subject to any unavoidable finite-sample bias.

In a recent paper, Christensen et al. (2014a, henceforth CLR) generalize the AFNS

model framework introduced in CDR by incorporating stochastic volatility into the state

variables. These models are also easy to estimate, again due to the imposed Nelson-Siegel

factor loading structure. CLR estimate their models using the standard Kalman filter and

report model fit on par with the original Gaussian AFNS model. Now, though, the Kalman

filter is no longer efficient and potentially inconsistent because it only approximates the true

probability distribution of the state variables by matching the first and second moment,

essentially treating the state variables as if they were Gaussian. Thus, in addition to any

finite-sample bias, there is potential for added bias arising from the fact that the Kalman filter

is only an approximation to the true likelihood function. Despite this concern, Kalman filter-

based estimation of affine term structure models with stochastic volatility is relatively common

in empirical term structure analysis,2 but the size of any bias in realistic three-factor settings

1 See Dai and Singleton (2000) for the definition of this concept.

2 For examples, see Duffee (1999), Driessen (2005), Feldhütter and Lando (2008), and Christensen et al. (2015).

has not been studied in detail in the existing term structure literature (to the best of our

knowledge). In this paper, we focus on the AFNS model classes with and without stochastic

volatility. This provides us with an ideal setting to study both the finite-sample bias and the

added bias from using the Kalman filter for estimation of affine non-Gaussian models. As an

alternative, Joslin et al. (2011) and Hamilton and Wu (2012) provide identification schemes

that facilitate the estimation of affine Gaussian models in that they avoid the filtering of the

unobserved latent factors.3 However, it is not obvious if or how those approaches extend to

affine non-Gaussian models. Thus, the AFNS-based identification of affine Gaussian models

provided by CDR and extended by CLR to affine non-Gaussian models remains an important

contribution without which the analysis in this paper would not have been feasible.4

Because interest rates are highly persistent, empirical autoregressive models, including

dynamic term structure models, suffer from substantial small-sample estimation bias. Specif-

ically, model estimates will generally be biased toward a dynamic system that displays much

less persistence than the true process (so estimates of the real-world mean-reversion matrix,

$K^P$, are upward biased). Furthermore, if the degree of interest rate persistence is underestimated,

future short rates would be expected to revert to their mean too quickly, causing

their expected longer-term averages to be too stable. Therefore, the bias in the estimated

dynamics distorts the decomposition of yields and contaminates estimates of long-maturity

term premiums.
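The direction of this bias can be illustrated with a small Monte Carlo experiment on a univariate AR(1) process, which is only a stand-in for the multivariate dynamics studied in the paper. The sketch below assumes a hypothetical monthly persistence of 0.98, uses OLS with an intercept rather than the Kalman filter, and compares ten-year and forty-year samples; all names and parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def simulate_ar1_panel(rho, n_obs, n_rep, rng):
    """Simulate n_rep independent zero-mean AR(1) paths of length n_obs."""
    x = np.empty((n_rep, n_obs))
    # Draw the first observation from the stationary distribution.
    x[:, 0] = rng.normal(scale=1.0 / np.sqrt(1.0 - rho**2), size=n_rep)
    eps = rng.normal(size=(n_rep, n_obs))
    for t in range(1, n_obs):
        x[:, t] = rho * x[:, t - 1] + eps[:, t]
    return x

def ols_ar1(x):
    """Row-wise OLS slope from regressing x_t on a constant and x_{t-1}."""
    y, lag = x[:, 1:], x[:, :-1]
    y_c = y - y.mean(axis=1, keepdims=True)
    lag_c = lag - lag.mean(axis=1, keepdims=True)
    return (lag_c * y_c).sum(axis=1) / (lag_c**2).sum(axis=1)

def implied_kappa(rho, dt=1.0 / 12.0):
    """Continuous-time mean-reversion rate implied by a monthly AR(1) coefficient."""
    return -np.log(rho) / dt

rho_true = 0.98  # hypothetical persistence, chosen only for illustration
rng = np.random.default_rng(0)
short = float(ols_ar1(simulate_ar1_panel(rho_true, 120, 2000, rng)).mean())
long_ = float(ols_ar1(simulate_ar1_panel(rho_true, 480, 2000, rng)).mean())

print(f"true rho {rho_true}: mean estimate {short:.3f} (T=120), {long_:.3f} (T=480)")
print(f"true kappa {implied_kappa(rho_true):.3f}: implied by mean estimates "
      f"{implied_kappa(short):.3f} (T=120), {implied_kappa(long_):.3f} (T=480)")
```

Because the autoregressive coefficient is underestimated on average, the implied mean-reversion rate is overestimated, and the effect shrinks as the sample lengthens.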

To study this finite-sample problem in detail, we start out simulating and estimating

Gaussian AFNS models for which the Kalman filter is an efficient estimator as already noted.

We simulate short ten-year and long forty-year samples to study the finite-sample bias problem

directly. We allow for low and high noise to assess how data quality affects our conclusions.

Furthermore, for the benchmark Gaussian AFNS model, we also analyze samples at weekly

frequency in addition to the monthly frequency used throughout, but since this turns out to

matter little for our conclusions, we do not repeat this exercise for the models with stochastic

volatility. We then proceed to simulate and estimate AFNS models with stochastic volatility

in a similarly careful way.

Our findings can be summarized as follows.

In the Gaussian AFNS model, there is a significant finite-sample upward bias in the

estimates of the mean-reversion rate of the Nelson-Siegel level factor due to its near unit-root

property. In addition, there is a more modest finite-sample upward estimation bias in the

mean-reversion parameters for the slope and curvature factors, owing to their lower persistence.

Importantly, there is no finite-sample bias in the estimated mean parameters of any of the

factors. Furthermore, all parameters that relate to the model’s Q-dynamics used for pricing

3 Andreasen and Christensen (2015) offer an alternative way of estimating non-Gaussian term structure models.

4 The related literature includes Duan and Simonato (1995), Lund (1997), De Jong (2000), Duffee and Stanton (2004), and Duffee and Stanton (2008), among others.

and fitting the cross section of yields are well determined and without any measurable bias.

This property turns out to hold for non-Gaussian models as well. However, the accuracy of

the estimated Q-dynamics is affected by the amount of noise in the data. Finally, the data

frequency plays no role for these conclusions as both weekly and monthly simulated data

produce similar results. However, in the weekly samples, the parameter standard deviations

estimated from the optimized likelihood function in the Kalman filter tend to be too low. This

makes the upward biased mean-reversion parameters appear even more significant than they

are, which complicates model selection. Hence, we document one of the unusual situations

where more data do not necessarily lead to better inference. For selecting the appropriate

specification of the mean-reversion matrix, which matters for forecast performance, term

premium decompositions, etc., we therefore recommend relying on monthly rather than weekly

data.

We then proceed to simulate and estimate AFNS models with stochastic volatility gener-

ated by the level factor in one set of exercises, and with stochastic volatility generated by the

curvature factor in another set of exercises.

First, we find that the finite-sample upward bias in the estimated mean-reversion parame-

ters is not materially different in the models with stochastic volatility relative to the Gaussian

AFNS model. The intuition behind this result is that the time series properties of the three

state variables are primarily determined by the Nelson-Siegel factor loading structure, which

is almost identical for all AFNS models with and without stochastic volatility. For similar

reasons we also see little bias in the estimated mean parameters in these models.

Second, we analyze in detail the ability of the Kalman filter to estimate the volatility

sensitivity parameters that determine the degree to which the stochastic volatility factor

affects the volatility of the unconstrained factors in each model. For U.S. Treasury yields,

these sensitivity parameters are often estimated to be negligible (see CLR for an example) and

we report similar results. To assess whether this is a general weakness of the Kalman filter

when applied to models with stochastic volatility, we perform separate simulation experiments

with large values for the sensitivity parameters. Our results show that the Kalman filter is in

fact able to estimate them with some accuracy. Thus, when their estimated values are tiny

and insignificant, it is most likely because the data call for them to be so.

Third, in general, it is the case that the parameters that primarily affect the models’

fit to the cross section of yields tend to have small or no bias, but their accuracy varies

positively with the quality of the data. We note one exception though. In the AFNS model

with stochastic volatility generated by the curvature factor, the mean of the curvature factor

under the risk-neutral Q measure is not well identified. However, we show that this can be

solved at practically no cost by fixing it at a low value that is exactly high enough that the

curvature factor does not reach its zero lower bound.

Another key finding is that the Kalman filter is as efficient at filtering state variables in


non-Gaussian models as it is at filtering in Gaussian models, in particular under optimal con-

ditions with high-quality data. As a consequence, the fit of the AFNS models with stochastic

volatility is as good as, if not better than, the fit of the Gaussian AFNS model.

Finally, in light of the low interest rate environment in recent years, we emphasize that

our study has no bearing on how Kalman filter-based estimations perform when yields are near

their lower bound and exhibit asymmetric behavior for that reason. This is a task that we

leave for future research. Still, the results we report could serve as a useful benchmark even

for that kind of exercise.

The rest of the paper is structured as follows. Section 2 describes our sample of U.S.

Treasury yields and motivates our focus on the Nelson-Siegel yield curve model, while Section

3 briefly details the original Gaussian AFNS model of the term structure. Section 4 goes on

to describe the five classes of AFNS models with stochastic volatility dynamics introduced

in CLR. Section 5 details the estimation methodology, while Section 6 describes the simu-

lation study. Section 7 contains the results from the simulation exercises for the Gaussian

AFNS model, while Sections 8 and 9 contain the results for the AFNS models with stochastic

volatility generated by the level and curvature factor, respectively. Section 10 concludes the

paper.

2 Motivation for the Nelson-Siegel Model

In this section, we motivate our focus on the Nelson-Siegel yield curve model using principal

components analysis. Recall that principal components analysis decomposes the observed

data into a number of factors equal to the number of time series and ranks those factors

according to how much of the observed variation each factor explains.

The specific Treasury yields we analyze to obtain realistic parameter sets to be used

in our simulation exercises are zero-coupon yields constructed by the method described in

Gürkaynak et al. (2007) and briefly detailed here.5 For each business day a zero-coupon yield

curve of the Svensson (1995)-type

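$$y_t(\tau) = \beta_0 + \frac{1 - e^{-\lambda_1 \tau}}{\lambda_1 \tau}\,\beta_1 + \left[\frac{1 - e^{-\lambda_1 \tau}}{\lambda_1 \tau} - e^{-\lambda_1 \tau}\right]\beta_2 + \left[\frac{1 - e^{-\lambda_2 \tau}}{\lambda_2 \tau} - e^{-\lambda_2 \tau}\right]\beta_3$$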

is fitted to price a large pool of underlying off-the-run Treasury bonds. Thus, for each busi-

ness day, we have the fitted values of the four coefficients (β0(t), β1(t), β2(t), β3(t)) and two

parameters (λ1(t), λ2(t)). From this data set zero-coupon yields for any relevant maturity

can be calculated. As demonstrated by Gürkaynak et al. (2007), this discount function prices

the underlying pool of bonds extremely well. By implication, the zero-coupon yields derived

from this approach constitute a very good approximation to the true underlying Treasury

5 The Board of Governors of the Federal Reserve updates the data on its website at http://www.federalreserve.gov/pubs/feds/2006/index.html.

[Figure 1 plot: 10-year, 5-year, 1-year, and 3-month yields; vertical axis: rate in percent; horizontal axis: 1988 to 2008.]

Figure 1: Time Series of Treasury Yields.
Illustration of the weekly observed Treasury zero-coupon bond yields covering the period from December 4, 1987, to January 2, 2009. The yields shown have maturities: three-month, one-year, five-year, and ten-year.

Maturity     Mean    Std. dev.   Skewness   Kurtosis
in months    in %    in %
3            4.52    2.02         0.03      2.41
6            4.61    2.05        -0.01      2.40
12           4.77    2.04        -0.04      2.41
24           5.03    1.95        -0.03      2.43
36           5.24    1.86         0.02      2.39
60           5.58    1.72         0.15      2.25
84           5.85    1.62         0.26      2.13
120          6.16    1.52         0.36      2.05

Table 1: Summary Statistics of Treasury Yields.
Summary statistics for the sample of weekly observed Treasury zero-coupon bond yields covering the period from December 4, 1987, to January 2, 2009.

zero-coupon yield curve.6
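As a rough illustration of how zero-coupon yields can be computed from the fitted coefficients, the following sketch evaluates the Svensson curve in the form written above. The function name and the parameter values in the usage line are hypothetical and are not taken from the Gürkaynak et al. (2007) data set.

```python
import numpy as np

def svensson_yield(tau, beta0, beta1, beta2, beta3, lam1, lam2):
    """Svensson zero-coupon yield for maturities tau (same time unit as 1/lam)."""
    tau = np.atleast_1d(np.asarray(tau, dtype=float))
    x1, x2 = lam1 * tau, lam2 * tau
    # (1 - exp(-x)) / x, written with expm1 for accuracy at short maturities.
    slope = -np.expm1(-x1) / x1
    hump1 = slope - np.exp(-x1)
    hump2 = -np.expm1(-x2) / x2 - np.exp(-x2)
    return beta0 + beta1 * slope + beta2 * hump1 + beta3 * hump2

# Hypothetical coefficients for a single day, in percent.
maturities = np.array([0.25, 0.5, 1.0, 2.0, 3.0, 5.0, 7.0, 10.0])
example_curve = svensson_yield(maturities, 6.0, -2.0, 1.0, 0.5, 0.6, 0.1)
print(np.round(example_curve, 3))
```

In this parameterization the yield tends to β0 + β1 as the maturity goes to zero and to β0 as the maturity grows without bound.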

To have the most active part of the maturity spectrum represented, we construct Treasury

zero-coupon bond yields with the following maturities: 3-month, 6-month, 1-year, 2-year, 3-

year, 5-year, 7-year, and 10-year. We use weekly data (Fridays) and limit our sample to the

6 D’Amico and King (2013) show that the Svensson functional form has had some difficulty at times in fitting the underlying bond prices since the peak of the financial crisis. This explains why we end our sample on January 2, 2009. Furthermore, we emphasize that we merely use the U.S. Treasury yields to obtain realistic parameter sets to be used in the model simulations. Hence, ultimately, the accuracy of the Svensson smoothed curve does not matter for our exercise and the conclusions we draw.

Maturity                 Loading on
in months    First P.C.   Second P.C.   Third P.C.
3              -0.38        -0.44          0.52
6              -0.39        -0.38          0.19
12             -0.40        -0.25         -0.21
24             -0.38        -0.03         -0.47
36             -0.36         0.12         -0.42
60             -0.33         0.33         -0.11
84             -0.30         0.44          0.18
120            -0.27         0.53          0.45
% explained    94.12         5.58          0.27

Table 2: Eigenvectors of the First Three Principal Components of Treasury Yields.
The loadings of yields of various maturities on the first three principal components are shown. The final row shows the proportion of all bond yield variability accounted for by each principal component. The data consist of weekly U.S. Treasury zero-coupon bond yields from December 4, 1987, to January 2, 2009.

period from December 4, 1987, to January 2, 2009. The summary statistics are provided in

Table 1, while Figure 1 illustrates the constructed time series of the three-month, one-year,

five-year, and ten-year Treasury zero-coupon yields.

Researchers have typically found that three factors are sufficient to model the time-

variation in the cross section of Treasury bond yields (e.g., Litterman and Scheinkman, 1991).

Indeed, for our weekly Treasury bond data, 99.97% of the total variation is accounted for by

three factors. Table 2 reports the eigenvectors that correspond to the first three principal

components of our data. The first principal component accounts for 94.1% of the variation in

the Treasury bond yields, and its loading across maturities is uniformly negative. Thus, like

a level factor, a shock to this component changes all yields in the same direction irrespective

of maturity. The second principal component accounts for 5.6% of the variation in these data

and has sizable negative loadings for the shorter maturities and sizable positive loadings for

the long maturities. Thus, like a slope factor, a shock to this component steepens or flattens

the yield curve. Finally, the third component, which accounts for only 0.3% of the variation,

has a U-shaped factor loading as a function of maturity, which is naturally interpreted as a

curvature factor.

In summary, three factors can explain 99.97% of the variation in this set of

Treasury bond yields, and they have properties consistent with an interpretation of level,

slope, and curvature as in the Nelson-Siegel model detailed in the following.
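A principal components decomposition of this kind can be sketched as an eigendecomposition of the sample covariance matrix of the yields. The example below runs on synthetic data generated from three hypothetical factors, not on the Treasury sample, and whether the paper works with the covariance or the correlation matrix is an assumption here.

```python
import numpy as np

def yield_pca(yields):
    """Eigenvectors (columns) and variance shares from a (n_obs, n_maturities) array."""
    cov = np.cov(yields, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
    order = np.argsort(eigval)[::-1]          # reorder from largest to smallest
    eigval, eigvec = eigval[order], eigvec[:, order]
    return eigvec, eigval / eigval.sum()

# Synthetic yields built from hypothetical level, slope, and curvature factors.
rng = np.random.default_rng(0)
tau = np.array([0.25, 0.5, 1.0, 2.0, 3.0, 5.0, 7.0, 10.0])
lam = 0.5
slope_load = (1.0 - np.exp(-lam * tau)) / (lam * tau)
curv_load = slope_load - np.exp(-lam * tau)
factors = rng.normal(size=(1000, 3)) * np.array([1.0, 0.5, 0.3])
synthetic = factors @ np.vstack([np.ones_like(tau), slope_load, curv_load])
synthetic += 0.01 * rng.normal(size=synthetic.shape)   # small measurement noise

loadings, share = yield_pca(synthetic)
print(np.round(100 * share[:3], 2))
```

The sign of each eigenvector is arbitrary, which is why a uniformly negative first loading, as in Table 2, still represents a level factor.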


3 The AFNS Model with Constant Volatility

In this section, we briefly review the AFNS model with constant volatility, throughout referred

to as the AFNS0 specification.7,8 We start from a standard continuous-time affine arbitrage-free structure (Duffie and Kan, 1996) that underlies all the models to be estimated in this paper. To represent an affine diffusion process, define a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t), Q)$, where the filtration $(\mathcal{F}_t) = \{\mathcal{F}_t : t \geq 0\}$ satisfies the usual conditions (Williams, 1997). The state variables $X_t$ are assumed to be a Markov process defined on a set $M \subset \mathbb{R}^n$ that solves the following stochastic differential equation (SDE)9

$$dX_t = K^Q(t)\left[\theta^Q(t) - X_t\right]dt + \Sigma(t) D(X_t, t)\,dW^Q_t, \tag{1}$$

where $W^Q$ is a standard Brownian motion in $\mathbb{R}^n$, the information of which is contained in the filtration $(\mathcal{F}_t)$. The drift terms $\theta^Q : [0, T] \to \mathbb{R}^n$ and $K^Q : [0, T] \to \mathbb{R}^{n \times n}$ are bounded, continuous functions.10 Similarly, the volatility matrix $\Sigma : [0, T] \to \mathbb{R}^{n \times n}$ is assumed to be a bounded, continuous function, while $D : M \times [0, T] \to \mathbb{R}^{n \times n}$ is assumed to have the following diagonal structure

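$$D(X_t, t) = \begin{pmatrix} \sqrt{\gamma^1(t) + \delta^1(t) X_t} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sqrt{\gamma^n(t) + \delta^n(t) X_t} \end{pmatrix},$$

where

$$\gamma(t) = \begin{pmatrix} \gamma^1(t) \\ \vdots \\ \gamma^n(t) \end{pmatrix}, \qquad \delta(t) = \begin{pmatrix} \delta^1_1(t) & \cdots & \delta^1_n(t) \\ \vdots & \ddots & \vdots \\ \delta^n_1(t) & \cdots & \delta^n_n(t) \end{pmatrix},$$

$\gamma : [0, T] \to \mathbb{R}^n$ and $\delta : [0, T] \to \mathbb{R}^{n \times n}$ are bounded, continuous functions, and $\delta^i(t)$ denotes the $i$th row of the $\delta(t)$-matrix. Finally, the instantaneous risk-free rate is assumed to be an affine function of the state variables

$$r_t = \rho_0(t) + \rho_1(t)' X_t,$$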

7 Our nomenclature follows CLR and draws on Dai and Singleton (2000). Our $\mathrm{AFNS}_n$ models are members of their $A_n(3)$ class of models, which have three state variables and $n$ square-root processes.

8 This model has been shown to exhibit both good in-sample fit and out-of-sample forecast accuracy for various yield curves. The empirical analysis conducted in CDR is based on unsmoothed Fama-Bliss data for nominal Treasury yields. Christensen et al. (2010) examine yields for nominal and real Treasuries as per Gürkaynak et al. (2007, 2010), while Christensen et al. (2014b) examine short-term LIBOR and highly-rated banks’ and financial firms’ corporate bond rates.

9 The affine property applies to bond prices; therefore, affine models only impose structure on the factor dynamics under the pricing measure.

10 Stationarity of the state variables is ensured if all the eigenvalues of $K^Q(t)$ are positive (if complex, the real component should be positive); see Ahn et al. (2002). However, stationarity is not a necessary requirement for the process to be well defined.

where $\rho_0 : [0, T] \to \mathbb{R}$ and $\rho_1 : [0, T] \to \mathbb{R}^n$ are bounded, continuous functions.

Duffie and Kan (1996) prove that zero-coupon bond prices in this framework are exponential-affine functions of the state variables

$$P(t, T) = E^Q_t\left[\exp\left(-\int_t^T r_u\,du\right)\right] = \exp\left(B(t, T)' X_t + A(t, T)\right),$$

where B(t, T ) and A(t, T ) are the solutions to the following system of ordinary differential

equations (ODEs)

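$$\frac{dB(t, T)}{dt} = \rho_1 + (K^Q)' B(t, T) - \frac{1}{2} \sum_{j=1}^{n} \left(\Sigma' B(t, T) B(t, T)' \Sigma\right)_{j,j} (\delta^j)', \qquad B(T, T) = 0, \tag{2}$$

$$\frac{dA(t, T)}{dt} = \rho_0 - B(t, T)' K^Q \theta^Q - \frac{1}{2} \sum_{j=1}^{n} \left(\Sigma' B(t, T) B(t, T)' \Sigma\right)_{j,j} \gamma^j, \qquad A(T, T) = 0, \tag{3}$$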

and the possible time-dependence of the parameters is suppressed in the notation. These

pricing functions imply that the zero-coupon yields are given by

$$y(t, T) = -\frac{1}{T - t} \log P(t, T) = -\frac{B(t, T)'}{T - t} X_t - \frac{A(t, T)}{T - t}.$$

As per CDR, assume that the instantaneous risk-free rate is defined by

rt = Lt + St.

In addition, assume that the state variables Xt = (Lt, St, Ct) are described by the following

system of SDEs under the risk-neutral Q-measure

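$$\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix} \left[ \begin{pmatrix} \theta^Q_1 \\ \theta^Q_2 \\ \theta^Q_3 \end{pmatrix} - \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt + \Sigma \begin{pmatrix} dW^{L,Q}_t \\ dW^{S,Q}_t \\ dW^{C,Q}_t \end{pmatrix}, \qquad \lambda > 0.$$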

Then, zero-coupon bond yields are given by

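$$y(t, T) = L_t + \left(\frac{1 - e^{-\lambda(T - t)}}{\lambda(T - t)}\right) S_t + \left(\frac{1 - e^{-\lambda(T - t)}}{\lambda(T - t)} - e^{-\lambda(T - t)}\right) C_t - \frac{A(t, T)}{T - t}.$$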
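The factor loadings in this yield function are simple to evaluate numerically. The sketch below computes the level, slope, and curvature loadings and the implied yields for a given state vector; the yield-adjustment term is passed in as an optional array and defaults to zero, since its closed form is given in CDR rather than here. Function names and the example values are illustrative assumptions.

```python
import numpy as np

def ns_loadings(tau, lam):
    """Level, slope, and curvature loadings for maturities tau; shape (len(tau), 3)."""
    tau = np.atleast_1d(np.asarray(tau, dtype=float))
    x = lam * tau
    slope = -np.expm1(-x) / x            # (1 - exp(-x)) / x
    curvature = slope - np.exp(-x)
    return np.column_stack([np.ones_like(tau), slope, curvature])

def afns_yields(state, tau, lam, yield_adjustment=None):
    """Yields L + slope * S + curvature * C, plus an optional adjustment term.

    yield_adjustment stands for -A(t, T) / (T - t) at each maturity; it is
    treated as zero when omitted, which is an approximation.
    """
    fitted = ns_loadings(tau, lam) @ np.asarray(state, dtype=float)
    if yield_adjustment is not None:
        fitted = fitted + np.asarray(yield_adjustment, dtype=float)
    return fitted

# Hypothetical state (L, S, C) in percent and a hypothetical decay rate.
example_tau = np.array([0.25, 1.0, 5.0, 10.0])
example_yields = afns_yields([5.0, -2.0, 1.0], example_tau, lam=0.5)
print(np.round(example_yields, 3))
```

As the maturity shrinks to zero, the slope loading tends to one and the curvature loading to zero, so the yield tends to L_t + S_t, in line with the definition of the instantaneous risk-free rate.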

This result defines the class of AFNS0 models derived in CDR and the additional term in

the yield function is a so-called yield-adjustment term that represents convexity effects due

to Jensen’s inequality; see CDR for details. To complete the model, we need to specify the

risk premium structure that generates the connection to the dynamics under the real-world

P -measure. To that end, it is important to note that there are no restrictions on the dynamic

drift components under the empirical P -measure. Therefore, beyond the requirement of

constant volatility, we are free to choose the dynamics under the P -measure. To facilitate


the empirical implementation, we follow CDR and limit our focus to the essentially affine risk

premium introduced in Duffee (2002). In the Gaussian framework, this specification implies

that the risk premiums Γt depend linearly on the state variables; that is,

$$\Gamma_t = \gamma^0 + \gamma^1 X_t,$$

where $\gamma^0 \in \mathbb{R}^3$ and $\gamma^1 \in \mathbb{R}^{3 \times 3}$ contain unrestricted parameters. The relationship between real-world yield curve dynamics under the P-measure and risk-neutral dynamics under the Q-measure is given by

$$dW^Q_t = dW^P_t + \Gamma_t\,dt.$$

Thus, we can write the P-dynamics of the state variables as

$$dX_t = K^P(\theta^P - X_t)\,dt + \Sigma\,dW^P_t,$$

where both KP and θP are allowed to vary freely relative to their counterparts under the

Q-measure. Following CDR, we identify this class of models by fixing the means under the

Q-measure at zero, i.e., θQ = 0.11 Furthermore, CDR show that Σ cannot be more than a

triangular matrix for the model to be identified. Thus, the maximally flexible specification of

the original AFNS model has Q-dynamics given by

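$$\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & -\lambda & \lambda \\ 0 & 0 & -\lambda \end{pmatrix} \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} dt + \begin{pmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & 0 \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix} \begin{pmatrix} dW^{L,Q}_t \\ dW^{S,Q}_t \\ dW^{C,Q}_t \end{pmatrix},$$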

while its P -dynamics are given by

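$$\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix} = \begin{pmatrix} \kappa^P_{11} & \kappa^P_{12} & \kappa^P_{13} \\ \kappa^P_{21} & \kappa^P_{22} & \kappa^P_{23} \\ \kappa^P_{31} & \kappa^P_{32} & \kappa^P_{33} \end{pmatrix} \left[ \begin{pmatrix} \theta^P_1 \\ \theta^P_2 \\ \theta^P_3 \end{pmatrix} - \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt + \begin{pmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & 0 \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix} \begin{pmatrix} dW^{L,P}_t \\ dW^{S,P}_t \\ dW^{C,P}_t \end{pmatrix}.$$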
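To make the P-dynamics concrete, the following sketch simulates the three Gaussian factors with a simple Euler-Maruyama scheme on a monthly grid. This is an assumption made for brevity: Gaussian models of this type also admit an exact discretization, and the scheme used in the paper's simulation study may differ. All parameter values below are hypothetical.

```python
import numpy as np

def simulate_gaussian_factors(kappa_p, theta_p, sigma, x0, n_steps, dt, rng):
    """Euler-Maruyama path of dX = K^P (theta^P - X) dt + Sigma dW^P."""
    kappa_p = np.asarray(kappa_p, dtype=float)
    theta_p = np.asarray(theta_p, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    path = np.empty((n_steps + 1, theta_p.size))
    path[0] = x0
    for t in range(n_steps):
        shock = rng.normal(size=theta_p.size) * np.sqrt(dt)   # Brownian increment
        drift = kappa_p @ (theta_p - path[t])
        path[t + 1] = path[t] + drift * dt + sigma @ shock
    return path

# Hypothetical parameters: a persistent level factor and faster slope and curvature.
kappa_example = np.diag([0.1, 0.5, 1.0])
theta_example = np.array([0.06, -0.02, -0.01])
sigma_example = np.array([[0.005, 0.0, 0.0],
                          [0.002, 0.010, 0.0],
                          [0.001, 0.003, 0.020]])   # lower triangular, as in the model
factor_path = simulate_gaussian_factors(
    kappa_example, theta_example, sigma_example,
    x0=theta_example, n_steps=120, dt=1.0 / 12.0,
    rng=np.random.default_rng(1),
)
print(factor_path.shape)
```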

The main limitation of the AFNS0 class of models is that it is characterized by a constant

volatility matrix Σ. CLR modify the AFNS0 model in a straightforward fashion in order to

incorporate stochastic volatility. The key assumption to preserving the desirable Nelson-Siegel

factor loading structure in the zero-coupon bond yield function is to maintain the KQ mean-

reversion matrix under the Q-measure. Furthermore, all model classes will be characterized

by an instantaneous risk-free rate defined as the sum of the first two factors

rt = Lt + St.

11CDR demonstrate that this choice is without loss of generality.


The details of the AFNS models with stochastic volatility are briefly provided in the following

section.

4 Five AFNS Specifications with Stochastic Volatility

In this section, we present five AFNS specifications with stochastic volatility that vary de-

pending on whether they contain one, two, or three stochastic volatility factors and on the

identity of those factors. For each model class, we derive the maximally flexible specifica-

tion that can be obtained using the extended affine risk premium specification introduced in

Cheridito et al. (2007).

4.1 AFNS Models with One Stochastic Volatility Factor

There are two AFNS stochastic volatility specifications that allow just one factor to exhibit

stochastic volatility. The first, denoted as the AFNS1-L model, allows only the level factor

to exhibit stochastic volatility. The state variables in this specification follow this system of

stochastic differential equations under the risk-neutral Q-measure:12

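$$\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix} = \begin{pmatrix} \varepsilon & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix} \left[ \begin{pmatrix} \theta^Q_1 \\ \theta^Q_2 \\ \theta^Q_3 \end{pmatrix} - \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt$$

$$\qquad + \begin{pmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & 0 \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix} \begin{pmatrix} \sqrt{L_t} & 0 & 0 \\ 0 & \sqrt{1 + \beta_{21} L_t} & 0 \\ 0 & 0 & \sqrt{1 + \beta_{31} L_t} \end{pmatrix} \begin{pmatrix} dW^{L,Q}_t \\ dW^{S,Q}_t \\ dW^{C,Q}_t \end{pmatrix},$$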

where the level factor Lt is a square-root process with stochastic volatility that affects the

instantaneous volatility of the two other factors through the volatility sensitivity parameters,

β21 and β31.
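The role of the volatility sensitivity parameters can be seen by computing the instantaneous covariance matrix of the factors implied by this specification, which is the product of the volatility term with its own transpose. The sketch below is a direct transcription of that product; the function name and the numerical values are hypothetical.

```python
import numpy as np

def afns1l_instantaneous_cov(sigma, level, beta21, beta31):
    """Instantaneous covariance (Sigma D)(Sigma D)' in the AFNS1-L specification."""
    scale = np.array([level, 1.0 + beta21 * level, 1.0 + beta31 * level])
    if np.any(scale < 0.0):
        raise ValueError("square-root arguments must be non-negative")
    vol = np.asarray(sigma, dtype=float) @ np.diag(np.sqrt(scale))
    return vol @ vol.T

# Hypothetical lower-triangular Sigma.
sigma_l = np.array([[0.10, 0.00, 0.00],
                    [0.02, 0.20, 0.00],
                    [0.01, 0.03, 0.30]])
cov_base = afns1l_instantaneous_cov(sigma_l, level=1.0, beta21=0.0, beta31=0.0)
cov_high = afns1l_instantaneous_cov(sigma_l, level=2.0, beta21=0.5, beta31=0.0)
print(np.round(cov_base, 4))
print(np.round(cov_high, 4))
```

With both sensitivity parameters set to zero, the level factor still scales the first column of the volatility matrix, so it continues to affect the other factors through the off-diagonal elements of Σ.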

For the factor loadings in the zero-coupon bond prices, B1(t, T ) is the solution to

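$$\begin{aligned}
\frac{dB^1(t, T)}{dt} ={}& 1 + \varepsilon B^1(t, T) - \frac{1}{2}\sigma_{11}^2 B^1(t, T)^2 - \frac{1}{2}\sigma_{21}^2 B^2(t, T)^2 - \frac{1}{2}\sigma_{31}^2 B^3(t, T)^2 \\
& - \sigma_{21}\sigma_{11} B^1(t, T) B^2(t, T) - \sigma_{31}\sigma_{11} B^1(t, T) B^3(t, T) - \sigma_{21}\sigma_{31} B^2(t, T) B^3(t, T) \\
& - \frac{1}{2}\beta_{21}\left[\sigma_{22}^2 B^2(t, T)^2 + \sigma_{32}^2 B^3(t, T)^2 + 2\sigma_{22}\sigma_{32} B^2(t, T) B^3(t, T)\right] - \frac{1}{2}\beta_{31}\sigma_{33}^2 B^3(t, T)^2,
\end{aligned}$$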

12 Note that we cannot set $\kappa^Q_{11}$ to zero as that would eliminate the drift of $L_t$ and cause this process to remain at zero once it hits zero, which it will P-a.s. Instead, we fix this parameter at a small, but positive, $\varepsilon = 10^{-6}$, to get close to the unit-root property imposed in the AFNS0 model.

while B2(t, T ) and B3(t, T ) are given by

$$B^2(t, T) = -\left(\frac{1 - e^{-\lambda(T - t)}}{\lambda}\right),$$

$$B^3(t, T) = (T - t)e^{-\lambda(T - t)} - \left(\frac{1 - e^{-\lambda(T - t)}}{\lambda}\right).$$
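These closed forms can be checked numerically. In this specification the quadratic terms enter only the equation for the first loading, so the second and third loadings solve the linear system dB²/dt = 1 + λB² and dB³/dt = −λB² + λB³ with zero terminal values. The sketch below integrates that system in time to maturity with a hand-written fourth-order Runge-Kutta scheme and compares the result with the formulas; the solver and the value of λ are illustrative assumptions.

```python
import numpy as np

def closed_form_b(tau, lam):
    """Closed-form B^2 and B^3 as functions of time to maturity tau = T - t."""
    decay = np.exp(-lam * tau)
    b2 = -(1.0 - decay) / lam
    b3 = tau * decay - (1.0 - decay) / lam
    return np.array([b2, b3])

def integrate_b(tau, lam, n_steps=2000):
    """RK4 solution of dB/ds = -(rho_1 + (K^Q)' B), B(0) = 0, with s = T - t."""
    rho1 = np.array([1.0, 0.0])                  # r_t loads on S_t but not on C_t
    kq_t = np.array([[lam, 0.0], [-lam, lam]])   # transpose of the (S, C) block of K^Q

    def rhs(b):
        # The sign flips relative to the ODE in t because s runs backward in t.
        return -(rho1 + kq_t @ b)

    b = np.zeros(2)
    h = tau / n_steps
    for _ in range(n_steps):
        k1 = rhs(b)
        k2 = rhs(b + 0.5 * h * k1)
        k3 = rhs(b + 0.5 * h * k2)
        k4 = rhs(b + h * k3)
        b = b + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return b

b_numeric = integrate_b(5.0, 0.5)
b_exact = closed_form_b(5.0, 0.5)
print(b_numeric, b_exact)
```

Dividing −B²(t, T) and −B³(t, T) by T − t gives the Nelson-Siegel slope and curvature loadings.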

The last two factor loadings match exactly the factor loadings of the slope and curvature

factors in the Nelson-Siegel zero-coupon yield function, while the ODE for B1(t, T ) contains

quadratic elements related to the stochastic volatility of Lt. The A(t, T )-function in the

yield-adjustment term in this class of models must solve the following ODE:

$$\frac{dA(t, T)}{dt} = -B(t, T)' K^Q \theta^Q - \frac{1}{2}\sigma_{22}^2 B^2(t, T)^2 - \frac{1}{2}\left(\sigma_{32}^2 + \sigma_{33}^2\right) B^3(t, T)^2 - \sigma_{22}\sigma_{32} B^2(t, T) B^3(t, T).$$

To estimate this model, we specify the dynamics under the real-world P -measure as the

measure change dWQ = dWPt + Γtdt. Note that we are limited to the essentially affine risk

premium structure introduced by Duffee (2002) for this particular model class.13 Given this

limitation, the maximally flexible affine P -dynamics are, in general, given by

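$$\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix} = \begin{pmatrix} \kappa^P_{11} & 0 & 0 \\ \kappa^P_{21} & \kappa^P_{22} & \kappa^P_{23} \\ \kappa^P_{31} & \kappa^P_{32} & \kappa^P_{33} \end{pmatrix} \left[ \begin{pmatrix} \theta^P_1 \\ \theta^P_2 \\ \theta^P_3 \end{pmatrix} - \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt$$

$$\qquad + \begin{pmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & 0 \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix} \begin{pmatrix} \sqrt{L_t} & 0 & 0 \\ 0 & \sqrt{1 + \beta_{21} L_t} & 0 \\ 0 & 0 & \sqrt{1 + \beta_{31} L_t} \end{pmatrix} \begin{pmatrix} dW^{L,P}_t \\ dW^{S,P}_t \\ dW^{C,P}_t \end{pmatrix}.$$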

For the first factor with stochastic volatility, there is a restriction on the mean parameter $\theta^P_1$ that we implement as14

$$\theta^P_1 = \frac{\varepsilon \cdot \theta^Q_1}{\kappa^P_{11}}.$$

Furthermore, for this process to be well-defined under both probability measures, we require that

$$\kappa^P_{11}\theta^P_1 > 0 \quad \text{and} \quad \varepsilon \cdot \theta^Q_1 > 0.$$

These two inequalities are satisfied provided $\kappa^P_{11} > 0$ and $\theta^Q_1 > 0$. These restrictions ensure

13 We choose not to use the extended affine risk premium specification for this particular specification because of the restriction imposed on $\kappa^Q_{11}$ to obtain a level factor structure as similar as possible to the one in the Nelson-Siegel model. If we were to do so, we would expect the Feller condition for $L_t$ to be violated under the Q-measure as $L_t$ would approach a unit-root process (CLR observe such violations in the AFNS3 model to be detailed later despite imposing Feller conditions on all three state variables under both probability measures), but we stress that this is a self-imposed restriction based on the above concern, and not a theoretical necessity.

14 A similar approach is used in the other model classes with stochastic volatility generated by the level factor.

that the Lt-process will move into positive territory whenever it hits the zero lower bound.

Finally, we identify this class of models by fixing θQ2 = θQ3 = 0, that is, we eliminate the Q-

means of the unconstrained processes as in CDR. These restrictions allow the corresponding

means under the P -measure to be determined in the estimation.

The natural next AFNS one-factor stochastic volatility specification would allow the slope

factor to exhibit stochastic volatility. However, examination of the matrix

$$K^Q = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix},$$

shows that St cannot be a square-root process with Ct as an unconstrained process, if the

important off-diagonal element $\kappa^Q_{23}$ is to remain equal to $-\lambda$, which generates the unique factor loading of the curvature factor in the AFNS model. Thus, there is no admissible

AFNS1-S model. Instead, we turn to the AFNS1-C model by allowing the curvature factor

to be a stochastic volatility factor. This approach preserves the properties of the level and

slope factors, allows the curvature factor to continue to serve as the stochastic mean of the

slope factor under the pricing measure, and designates the curvature factor to be the source

of stochastic volatility in the model.

For the AFNS1-C model, we assume that the state variables Xt are described under the

risk-neutral Q-measure as:

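$$\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix} \left[ \begin{pmatrix} \theta^Q_1 \\ \theta^Q_2 \\ \theta^Q_3 \end{pmatrix} - \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt$$

$$\qquad + \begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ 0 & \sigma_{22} & \sigma_{23} \\ 0 & 0 & \sigma_{33} \end{pmatrix} \begin{pmatrix} \sqrt{1 + \beta_{13} C_t} & 0 & 0 \\ 0 & \sqrt{1 + \beta_{23} C_t} & 0 \\ 0 & 0 & \sqrt{C_t} \end{pmatrix} \begin{pmatrix} dW^{L,Q}_t \\ dW^{S,Q}_t \\ dW^{C,Q}_t \end{pmatrix}.$$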

The curvature factor here is a square-root process that induces stochastic volatility in the

other two factors through the volatility sensitivity parameters, β13 and β23.

In this model class, the first two factor loadings are identical to those in the AFNS0 model,

while B3(t, T ) is the solution to:

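$$\begin{aligned}
\frac{dB^3(t, T)}{dt} ={}& -\lambda B^2(t, T) + \lambda B^3(t, T) - \frac{1}{2}\sigma_{13}^2 B^1(t, T)^2 - \frac{1}{2}\sigma_{23}^2 B^2(t, T)^2 - \frac{1}{2}\sigma_{33}^2 B^3(t, T)^2 \\
& - \sigma_{13}\sigma_{23} B^1(t, T) B^2(t, T) - \sigma_{13}\sigma_{33} B^1(t, T) B^3(t, T) - \sigma_{23}\sigma_{33} B^2(t, T) B^3(t, T) \\
& - \frac{1}{2}\beta_{13}\sigma_{11}^2 B^1(t, T)^2 - \frac{1}{2}\beta_{23}\left[\sigma_{12}^2 B^1(t, T)^2 + \sigma_{22}^2 B^2(t, T)^2 + 2\sigma_{12}\sigma_{22} B^1(t, T) B^2(t, T)\right].
\end{aligned}$$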

The A(t, T )-function in the yield-adjustment term in this class of models solves the ODE:

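$$\frac{dA(t, T)}{dt} = -B(t, T)' K^Q \theta^Q - \frac{1}{2}\left(\sigma_{11}^2 + \sigma_{12}^2\right) B^1(t, T)^2 - \frac{1}{2}\sigma_{22}^2 B^2(t, T)^2 - \sigma_{12}\sigma_{22} B^1(t, T) B^2(t, T).$$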

We estimate this model using the extended affine risk premium specification such that

the measure change is dWQ = dWPt + Γtdt. The maximally flexible affine P -dynamics are,

in general, given by

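$$\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix} = \begin{pmatrix} \kappa^P_{11} & \kappa^P_{12} & \kappa^P_{13} \\ \kappa^P_{21} & \kappa^P_{22} & \kappa^P_{23} \\ 0 & 0 & \kappa^P_{33} \end{pmatrix} \left[ \begin{pmatrix} \theta^P_1 \\ \theta^P_2 \\ \theta^P_3 \end{pmatrix} - \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt$$

$$\qquad + \begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ 0 & \sigma_{22} & \sigma_{23} \\ 0 & 0 & \sigma_{33} \end{pmatrix} \begin{pmatrix} \sqrt{1 + \beta_{13} C_t} & 0 & 0 \\ 0 & \sqrt{1 + \beta_{23} C_t} & 0 \\ 0 & 0 & \sqrt{C_t} \end{pmatrix} \begin{pmatrix} dW^{L,P}_t \\ dW^{S,P}_t \\ dW^{C,P}_t \end{pmatrix}.$$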

To keep the model arbitrage-free, Ct cannot be allowed to hit the zero lower bound. This

outcome is ensured by requiring that the parameters for the Ct-process satisfy the Feller

condition under both probability measures; i.e.,

$$
\kappa^P_{33}\theta^P_3 > \frac{1}{2}\sigma_{33}^2 \quad \text{and} \quad \lambda\theta^Q_3 > \frac{1}{2}\sigma_{33}^2.
$$

Finally, we identify this class of models by fixing θQ1 = θQ2 = 0, which allows the means

under the P -measure of the unconstrained factors to vary freely and be determined in the

estimation.
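The two Feller conditions above are simple inequalities, so a candidate parameter set can be screened before it is used. The following is a minimal sketch, not the authors' code; the function names and the numerical values are hypothetical and serve only to exercise the check.

```python
def feller_holds(drift_constant: float, sigma: float) -> bool:
    """Return True when drift_constant > 0.5 * sigma**2 (strict inequality)."""
    return drift_constant > 0.5 * sigma ** 2


def afns1c_feller(kappa33_p, theta3_p, lam, theta3_q, sigma33):
    """Check the AFNS1-C Feller conditions for C_t under both measures."""
    under_p = feller_holds(kappa33_p * theta3_p, sigma33)  # kappa^P_33 theta^P_3
    under_q = feller_holds(lam * theta3_q, sigma33)        # lambda theta^Q_3
    return under_p and under_q


# Hypothetical parameter values (not estimates from the paper).
print(afns1c_feller(kappa33_p=0.8, theta3_p=0.05, lam=0.5, theta3_q=0.06, sigma33=0.1))
print(afns1c_feller(kappa33_p=0.8, theta3_p=0.005, lam=0.5, theta3_q=0.06, sigma33=0.1))
```

The second call fails the check under the P-measure because 0.8 × 0.005 = 0.004 is below 0.5 × 0.1² = 0.005.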

4.2 AFNS Models with Two Stochastic Volatility Factors

Our third and fourth classes of stochastic volatility models allow for two stochastic volatility

factors. Although there are three potential specifications, the specification with just the level

and slope factors exhibiting stochastic volatility is not admissible because it does not permit

the important off-diagonal element $\kappa^Q_{23}$ to equal $-\lambda$, which is the unique characteristic of the curvature factor in the original AFNS model. Instead, stochastic volatility is associated with

either level and curvature or slope and curvature. The first of these specifications, denoted

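AFNS2-LC, has factor dynamics under the risk-neutral Q-measure given by$^{15}$

$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
=
\begin{pmatrix} \varepsilon & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix}
\left[ \begin{pmatrix} \theta^Q_1 \\ \theta^Q_2 \\ \theta^Q_3 \end{pmatrix} - \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt
+
\begin{pmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & \sigma_{23} \\ 0 & 0 & \sigma_{33} \end{pmatrix}
\begin{pmatrix} \sqrt{L_t} & 0 & 0 \\ 0 & \sqrt{1+\beta_{21}L_t+\beta_{23}C_t} & 0 \\ 0 & 0 & \sqrt{C_t} \end{pmatrix}
\begin{pmatrix} dW^{L,Q}_t \\ dW^{S,Q}_t \\ dW^{C,Q}_t \end{pmatrix}.
$$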

The level and curvature factors, Lt and Ct, exhibit stochastic volatility and induce time-

varying volatility in the slope factor, St, via the volatility sensitivity parameters, β21 and

β23.

The factor loadings in the zero-coupon bond price function are the unique solutions to

the following set of ODEs:

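$$
\begin{aligned}
\frac{dB^1(t,T)}{dt} ={}& 1 + \varepsilon B^1(t,T) - \frac{1}{2}\sigma_{11}^2 B^1(t,T)^2 - \frac{1}{2}\sigma_{21}^2 B^2(t,T)^2 \\
& - \sigma_{11}\sigma_{21} B^1(t,T)B^2(t,T) - \frac{1}{2}\beta_{21}\sigma_{22}^2 B^2(t,T)^2, \\
\frac{dB^2(t,T)}{dt} ={}& 1 + \lambda B^2(t,T), \\
\frac{dB^3(t,T)}{dt} ={}& -\lambda B^2(t,T) + \lambda B^3(t,T) - \frac{1}{2}\sigma_{33}^2 B^3(t,T)^2 - \frac{1}{2}\sigma_{23}^2 B^2(t,T)^2 \\
& - \sigma_{23}\sigma_{33} B^2(t,T)B^3(t,T) - \frac{1}{2}\beta_{23}\sigma_{22}^2 B^2(t,T)^2,
\end{aligned}
$$

where we note that the solution to $B^2(t,T)$ is simply

$$
B^2(t,T) = -\frac{1 - e^{-\lambda(T-t)}}{\lambda}.
$$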

Hence, St preserves its role as a slope factor. The A(t, T )-function is the solution to:

$$
\frac{dA(t,T)}{dt} = -B(t,T)'K^Q\theta^Q - \frac{1}{2}\sigma_{22}^2 B^2(t,T)^2.
$$

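Using the extended affine risk premium structure, the maximally flexible affine P-dynamics are given by

$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
=
\begin{pmatrix} \kappa^P_{11} & 0 & 0 \\ \kappa^P_{21} & \kappa^P_{22} & \kappa^P_{23} \\ \kappa^P_{31} & 0 & \kappa^P_{33} \end{pmatrix}
\left[ \begin{pmatrix} \theta^P_1 \\ \theta^P_2 \\ \theta^P_3 \end{pmatrix} - \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt
+
\begin{pmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & \sigma_{23} \\ 0 & 0 & \sigma_{33} \end{pmatrix}
\begin{pmatrix} \sqrt{L_t} & 0 & 0 \\ 0 & \sqrt{1+\beta_{21}L_t+\beta_{23}C_t} & 0 \\ 0 & 0 & \sqrt{C_t} \end{pmatrix}
\begin{pmatrix} dW^{L,P}_t \\ dW^{S,P}_t \\ dW^{C,P}_t \end{pmatrix}.
$$

$^{15}$ Note that, as before, we fix $\varepsilon = 10^{-6}$ to approximate the unit-root property imposed in the standard AFNS0 model.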

For the level factor, the condition $\varepsilon \cdot \theta^Q_1 = \kappa^P_{11}\theta^P_1$ must be satisfied. Furthermore, to keep this model class arbitrage-free, $C_t$ cannot hit the zero boundary, which is prevented by requiring that the parameters for the $C_t$-process satisfy the Feller condition under both probability measures; i.e.,$^{16}$

$$
\kappa^P_{31}\theta^P_1 + \kappa^P_{33}\theta^P_3 > \frac{1}{2}\sigma_{33}^2 \quad \text{and} \quad \lambda\theta^Q_3 > \frac{1}{2}\sigma_{33}^2.
$$

Finally, to have a well-defined $C_t$-process, the effect of the level factor on the drift of the curvature factor must be positive, which we impose with the $\kappa^P_{31} \le 0$ constraint. This condition implies that the two square-root processes cannot be negatively correlated. To identify this model class, we fix the $\theta^Q_2$ mean at zero.

The second AFNS specification with two volatility factors allows the slope and curvature

factors to be square-root processes while the level factor remains unconstrained. The factor

dynamics of this AFNS2-SC model under the Q-measure are:

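$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
=
\begin{pmatrix} 0 & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix}
\left[ \begin{pmatrix} \theta^Q_1 \\ \theta^Q_2 \\ \theta^Q_3 \end{pmatrix} - \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt
+
\begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ 0 & \sigma_{22} & 0 \\ 0 & 0 & \sigma_{33} \end{pmatrix}
\begin{pmatrix} \sqrt{1+\beta_{12}S_t+\beta_{13}C_t} & 0 & 0 \\ 0 & \sqrt{S_t} & 0 \\ 0 & 0 & \sqrt{C_t} \end{pmatrix}
\begin{pmatrix} dW^{L,Q}_t \\ dW^{S,Q}_t \\ dW^{C,Q}_t \end{pmatrix}.
$$

Note that the square-root processes, $S_t$ and $C_t$, are positively correlated through the off-diagonal element $\kappa^Q_{23} = -\lambda < 0$. Beyond generating their own stochastic volatility, these two factors induce instantaneous volatility for $L_t$ via the volatility sensitivities, $\beta_{12}$ and $\beta_{13}$.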

For the first factor loading in the zero-coupon bond price function, this structure implies

that

B1(t, T ) = −(T − t),

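which preserves the role of the level factor. The next two factor loadings are the unique solutions to:

$$
\begin{aligned}
\frac{dB^2(t,T)}{dt} ={}& 1 + \lambda B^2(t,T) - \frac{1}{2}\sigma_{22}^2 B^2(t,T)^2 - \frac{1}{2}\sigma_{12}^2 B^1(t,T)^2 \\
& - \sigma_{12}\sigma_{22} B^1(t,T)B^2(t,T) - \frac{1}{2}\beta_{12}\sigma_{11}^2 B^1(t,T)^2, \\
\frac{dB^3(t,T)}{dt} ={}& -\lambda B^2(t,T) + \lambda B^3(t,T) - \frac{1}{2}\sigma_{33}^2 B^3(t,T)^2 - \frac{1}{2}\sigma_{13}^2 B^1(t,T)^2 \\
& - \sigma_{13}\sigma_{33} B^1(t,T)B^3(t,T) - \frac{1}{2}\beta_{13}\sigma_{11}^2 B^1(t,T)^2.
\end{aligned}
$$

$^{16}$ For $L_t$, we just need to ensure that the process does not turn negative, which is achieved provided that $\varepsilon \cdot \theta^Q_1 > 0$ and $\kappa^P_{11}\theta^P_1 > 0$.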

The A(t, T )-function in the yield-adjustment term is the solution to

$$
\frac{dA(t,T)}{dt} = -B(t,T)'K^Q\theta^Q - \frac{1}{2}\sigma_{11}^2 B^1(t,T)^2.
$$

Using the extended affine risk premium specification, the maximally flexible affine P -dynamics

can be written as

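$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
=
\begin{pmatrix} \kappa^P_{11} & \kappa^P_{12} & \kappa^P_{13} \\ 0 & \kappa^P_{22} & \kappa^P_{23} \\ 0 & \kappa^P_{32} & \kappa^P_{33} \end{pmatrix}
\left[ \begin{pmatrix} \theta^P_1 \\ \theta^P_2 \\ \theta^P_3 \end{pmatrix} - \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt
+
\begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ 0 & \sigma_{22} & 0 \\ 0 & 0 & \sigma_{33} \end{pmatrix}
\begin{pmatrix} \sqrt{1+\beta_{12}S_t+\beta_{13}C_t} & 0 & 0 \\ 0 & \sqrt{S_t} & 0 \\ 0 & 0 & \sqrt{C_t} \end{pmatrix}
\begin{pmatrix} dW^{L,P}_t \\ dW^{S,P}_t \\ dW^{C,P}_t \end{pmatrix}.
$$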

To keep this class of models arbitrage-free, the slope and curvature factors, St and Ct, must

avoid hitting the zero-boundary. This outcome is ensured by imposing the Feller condition

on their parameters as follows:

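$$
\kappa^P_{22}\theta^P_2 + \kappa^P_{23}\theta^P_3 > \frac{1}{2}\sigma_{22}^2; \quad
\lambda\theta^Q_2 - \lambda\theta^Q_3 > \frac{1}{2}\sigma_{22}^2; \quad
\kappa^P_{33}\theta^P_3 + \kappa^P_{32}\theta^P_2 > \frac{1}{2}\sigma_{33}^2; \quad \text{and} \quad
\lambda\theta^Q_3 > \frac{1}{2}\sigma_{33}^2.
$$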

Furthermore, for St and Ct to be well defined, the sign of the effect they have on each other

must be positive, which we impose using the constraints $\kappa^P_{23} \le 0$ and $\kappa^P_{32} \le 0$. This implies that the two square-root processes cannot be negatively correlated. Finally, we identify this

class of models by fixing $\theta^Q_1 = 0$, which allows $\theta^P_1$ to vary freely.

4.3 AFNS Models with Three Stochastic Volatility Factors

In the fifth and final AFNS3 specification, all three factors exhibit stochastic volatility. The

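dynamics of $X_t$ are described under the Q-measure as$^{17}$

$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
=
\begin{pmatrix} \varepsilon & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix}
\left[ \begin{pmatrix} \theta^Q_1 \\ \theta^Q_2 \\ \theta^Q_3 \end{pmatrix} - \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt
+
\begin{pmatrix} \sigma_{11} & 0 & 0 \\ 0 & \sigma_{22} & 0 \\ 0 & 0 & \sigma_{33} \end{pmatrix}
\begin{pmatrix} \sqrt{L_t} & 0 & 0 \\ 0 & \sqrt{S_t} & 0 \\ 0 & 0 & \sqrt{C_t} \end{pmatrix}
\begin{pmatrix} dW^{L,Q}_t \\ dW^{S,Q}_t \\ dW^{C,Q}_t \end{pmatrix}.
$$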

In this model class, the factor loadings in the zero-coupon bond price function are given by

the unique solution to

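$$
\begin{aligned}
\frac{dB^1(t,T)}{dt} &= 1 + \varepsilon B^1(t,T) - \frac{1}{2}\sigma_{11}^2 B^1(t,T)^2, \\
\frac{dB^2(t,T)}{dt} &= 1 + \lambda B^2(t,T) - \frac{1}{2}\sigma_{22}^2 B^2(t,T)^2, \\
\frac{dB^3(t,T)}{dt} &= -\lambda B^2(t,T) + \lambda B^3(t,T) - \frac{1}{2}\sigma_{33}^2 B^3(t,T)^2,
\end{aligned}
$$

while the $A(t,T)$-function in the yield-adjustment term is given by the solution to:

$$
\frac{dA(t,T)}{dt} = -B(t,T)'K^Q\theta^Q.
$$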
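The AFNS3 loading ODEs can be integrated numerically. The sketch below uses a classical fourth-order Runge-Kutta scheme in time to maturity $\tau = T - t$. It assumes the terminal condition $B(T,T) = 0$, which is consistent with the closed-form loadings quoted earlier but is not stated in this section. The function name, the step count, and the example values of $\sigma$ are hypothetical.

```python
import math


def afns3_loadings(tau, lam, sigma, eps=1e-6, n_steps=2000):
    """Integrate the AFNS3 loading ODEs in time to maturity tau = T - t.

    Assumes B(T, T) = 0. Since dB/dtau = -dB/dt, the right-hand sides
    below are the negatives of the ODEs written in calendar time t.
    """
    s1, s2, s3 = sigma

    def rhs(b):
        b1, b2, b3 = b
        return (
            -1.0 - eps * b1 + 0.5 * s1 ** 2 * b1 ** 2,
            -1.0 - lam * b2 + 0.5 * s2 ** 2 * b2 ** 2,
            lam * b2 - lam * b3 + 0.5 * s3 ** 2 * b3 ** 2,
        )

    h = tau / n_steps
    b = (0.0, 0.0, 0.0)
    for _ in range(n_steps):  # classical fourth-order Runge-Kutta step
        k1 = rhs(b)
        k2 = rhs(tuple(x + 0.5 * h * k for x, k in zip(b, k1)))
        k3 = rhs(tuple(x + 0.5 * h * k for x, k in zip(b, k2)))
        k4 = rhs(tuple(x + h * k for x, k in zip(b, k3)))
        b = tuple(
            x + h / 6.0 * (a + 2.0 * c + 2.0 * d + e)
            for x, a, c, d, e in zip(b, k1, k2, k3, k4)
        )
    return b


# lambda from Table 4; the sigma values are illustrative only.
print(afns3_loadings(tau=5.0, lam=0.53650, sigma=(0.05, 0.05, 0.10)))
```

With all volatilities set to zero, the second loading collapses to the closed form $-(1 - e^{-\lambda\tau})/\lambda$, which gives a quick check on the sign conventions.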

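Applying the extended affine risk premium specification, the maximally flexible affine P-dynamics are given by

$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
=
\begin{pmatrix} \kappa^P_{11} & 0 & 0 \\ \kappa^P_{21} & \kappa^P_{22} & \kappa^P_{23} \\ \kappa^P_{31} & \kappa^P_{32} & \kappa^P_{33} \end{pmatrix}
\left[ \begin{pmatrix} \theta^P_1 \\ \theta^P_2 \\ \theta^P_3 \end{pmatrix} - \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt
+
\begin{pmatrix} \sigma_{11} & 0 & 0 \\ 0 & \sigma_{22} & 0 \\ 0 & 0 & \sigma_{33} \end{pmatrix}
\begin{pmatrix} \sqrt{L_t} & 0 & 0 \\ 0 & \sqrt{S_t} & 0 \\ 0 & 0 & \sqrt{C_t} \end{pmatrix}
\begin{pmatrix} dW^{L,P}_t \\ dW^{S,P}_t \\ dW^{C,P}_t \end{pmatrix}.
$$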

For $L_t$, the constraint $\varepsilon \cdot \theta^Q_1 = \kappa^P_{11}\theta^P_1$ must be satisfied. The limited risk premium specification due to the near unit-root property of $L_t$ also implies that $S_t$ and $C_t$ cannot impact the drift of $L_t$ once $\kappa^Q_{12}$ and $\kappa^Q_{13}$ have been fixed at zero. We need these restrictions in order to match the Nelson-Siegel factor loading structure as closely as possible.

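To keep this model class arbitrage-free, $S_t$ and $C_t$ must not hit their zero lower bounds. We ensure this by imposing the Feller condition on their parameters under both probability measures, i.e.,$^{18}$

$$
\kappa^P_{21}\theta^P_1 + \kappa^P_{22}\theta^P_2 + \kappa^P_{23}\theta^P_3 > \frac{1}{2}\sigma_{22}^2; \quad
\lambda\theta^Q_2 - \lambda\theta^Q_3 > \frac{1}{2}\sigma_{22}^2; \quad
\kappa^P_{31}\theta^P_1 + \kappa^P_{32}\theta^P_2 + \kappa^P_{33}\theta^P_3 > \frac{1}{2}\sigma_{33}^2; \quad \text{and} \quad
\lambda\theta^Q_3 > \frac{1}{2}\sigma_{33}^2.
$$

$^{17}$ Note that we again fix $\varepsilon = 10^{-6}$ to approximate the unit-root property imposed in the AFNS0 model.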

Furthermore, to have well-defined processes for St and Ct, the sign of the effect that the factors

have on each of these two factors must be positive, which we impose with the restrictions

$\kappa^P_{21} \le 0$, $\kappa^P_{23} \le 0$, $\kappa^P_{31} \le 0$, and $\kappa^P_{32} \le 0$. Note that these restrictions imply that the three square-root processes cannot be negatively correlated.

5 Estimation Methodology

The stochastic volatility models described in the previous section are estimated using the

Kalman filter algorithm. In affine term structure models, zero-coupon yields are affine func-

tions of the state variables such that

$$
y_t(\tau) = -\frac{1}{\tau}B(\tau)'X_t - \frac{1}{\tau}A(\tau) + \varepsilon_t(\tau),
$$

where εt(τ) represents i.i.d. Gaussian white noise measurement errors. The conditional mean

for multi-dimensional affine continuous-time diffusion processes is given by

$$
E^P[X_T | X_t] = \left(I - \exp(-K^P(T-t))\right)\theta^P + \exp(-K^P(T-t))X_t, \tag{4}
$$

where $\exp(-K^P(T-t))$ is a matrix exponential. In general, the conditional covariance matrix for affine diffusion processes is given by

$$
V^P[X_T | X_t] = \int_t^T \exp(-K^P(T-s))\,\Sigma D(E^P[X_s|X_t])D(E^P[X_s|X_t])'\Sigma'\,\exp(-(K^P)'(T-s))\,ds. \tag{5}
$$
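Equations (4) and (5) can be evaluated numerically as a cross-check on any analytical implementation. The sketch below is illustrative only: it uses a simple Taylor-series matrix exponential and a trapezoid rule for the integral, whereas the paper relies on analytical solutions. The helper names (`expm`, `conditional_mean`, `conditional_cov`, `d_fun`) are hypothetical. The example uses the Gaussian AFNS0 estimates from Table 4, for which $D(\cdot)$ is the identity matrix.

```python
import numpy as np


def expm(a, terms=30):
    """Matrix exponential via scaling and squaring with a Taylor series."""
    a = np.asarray(a, dtype=float)
    norm = np.linalg.norm(a, ord=np.inf)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    scaled = a / (2 ** squarings)
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for k in range(1, terms):
        term = term @ scaled / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result


def conditional_mean(x, k_p, theta_p, dt):
    """Equation (4): E[X_{t+dt} | X_t = x]."""
    e = expm(-k_p * dt)
    return (np.eye(len(x)) - e) @ theta_p + e @ x


def conditional_cov(x, k_p, theta_p, sigma, d_fun, dt, n_grid=400):
    """Equation (5) by the trapezoid rule; s is time elapsed since t."""
    vals = []
    for s in np.linspace(0.0, dt, n_grid + 1):
        d = d_fun(conditional_mean(x, k_p, theta_p, s))  # D(E[X_s | X_t])
        e = expm(-k_p * (dt - s))
        vals.append(e @ sigma @ d @ d.T @ sigma.T @ e.T)
    vals = np.array(vals)
    h = dt / n_grid
    return h * (vals[0] / 2.0 + vals[1:-1].sum(axis=0) + vals[-1] / 2.0)


# Gaussian AFNS0 example with the Table 4 estimates; D is the identity.
K = np.array([[0.03943, 0.0, 0.0], [0.0, 0.43102, -0.69198], [0.0, 0.0, 0.83341]])
theta = np.array([0.07242, -0.03173, -0.01873])
Sigma = np.diag([0.00570, 0.00888, 0.02728])
x_now = theta.copy()
print(conditional_mean(x_now, K, theta, 1.0 / 12.0))
print(conditional_cov(x_now, K, theta, Sigma, lambda m: np.eye(3), 1.0 / 12.0))
```

For a stochastic volatility model, `d_fun` would return the diagonal matrix of square roots evaluated at the conditional mean, as in equation (5).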

Stationarity of the system under the P -measure is ensured if the real components of all

the eigenvalues of KP are positive, and this condition is imposed in all estimations. For this

reason, we can start the Kalman filter at the unconditional mean and covariance matrix$^{19}$

$$
\hat{X}_0 = \theta^P \quad \text{and} \quad \hat{\Sigma}_0 = \int_0^\infty e^{-K^P s}\,\Sigma D(\theta^P)D(\theta^P)'\Sigma'\,e^{-(K^P)'s}\,ds.
$$

$^{18}$ For $L_t$, we just need to ensure that the process does not become negative, which is assured if $\varepsilon \cdot \theta^Q_1 > 0$ and $\kappa^P_{11}\theta^P_1 > 0$.

$^{19}$ In the estimation, we calculate the conditional and unconditional covariance matrices using the analytical solutions provided in Fisher and Gilles (1996).

However, the introduction of stochastic volatility implies that the factors are no longer simply Gaussian. We choose to approximate the true probability distribution of the state variables with the first and second moments and use the Kalman filter algorithm as if the state variables were Gaussian.$^{20}$ Thus, the state equation is given by

$$
X_t = \left(I - \exp(-K^P\Delta t)\right)\theta^P + \exp(-K^P\Delta t)X_{t-1} + \eta_t, \quad \eta_t \sim N(0, V_{t-1}),
$$

where ∆t is the time between observations and Vt−1 is the conditional covariance matrix given

in equation (5). However, the discrete nature of the state equation can cause the square-root

processes to become negative despite the fact that the parameter sets are forced to satisfy

Feller conditions and other nonnegativity restrictions. Whenever this happens, we follow the

literature and simply truncate those processes at zero; see Duffee (1999) for an example.
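One prediction and update step of such a filter can be sketched as follows. This is a generic textbook Kalman recursion under stated assumptions, not the authors' implementation: the function name and argument names are hypothetical, `h` is taken to be the measurement error covariance matrix, and the choice to truncate both the predicted and the updated state at zero is an assumption, since the text only says that negative values of the square-root processes are truncated.

```python
import numpy as np


def kalman_step(x, p, y, a, b, phi, c, q, h, sqrt_idx=()):
    """One step for y_t = a + b x_t + eps_t and x_t = c + phi x_{t-1} + eta_t.

    Here c = (I - exp(-K dt)) theta and phi = exp(-K dt); q is the conditional
    covariance V_{t-1}; sqrt_idx lists the square-root factors.
    """
    # Prediction from the approximate conditional mean and covariance.
    x_pred = c + phi @ x
    p_pred = phi @ p @ phi.T + q
    for i in sqrt_idx:
        x_pred[i] = max(x_pred[i], 0.0)      # truncate at zero (assumed placement)
    # Update with the observed yields.
    v = y - (a + b @ x_pred)                 # prediction error
    f = b @ p_pred @ b.T + h                 # prediction error covariance
    gain = p_pred @ b.T @ np.linalg.inv(f)
    x_upd = x_pred + gain @ v
    p_upd = p_pred - gain @ b @ p_pred
    for i in sqrt_idx:
        x_upd[i] = max(x_upd[i], 0.0)
    _, logdet = np.linalg.slogdet(f)
    loglik = -0.5 * (len(y) * np.log(2.0 * np.pi) + logdet + v @ np.linalg.solve(f, v))
    return x_upd, p_upd, loglik


# Scalar toy example with hypothetical inputs.
x1, p1, ll = kalman_step(
    x=np.array([0.0]), p=np.eye(1), y=np.array([1.0]), a=np.zeros(1), b=np.eye(1),
    phi=np.eye(1), c=np.zeros(1), q=np.zeros((1, 1)), h=np.eye(1),
)
print(x1, p1, ll)
```

Summing `loglik` over all observation dates gives the quasi log likelihood that is maximized over the parameters.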

In the Kalman filter estimations, the error structure is given by

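$$
\begin{pmatrix} \eta_t \\ \varepsilon_t \end{pmatrix} \sim N\left[ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} V_{t-1} & 0 \\ 0 & H \end{pmatrix} \right],
$$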

where H is assumed to be a diagonal matrix of the measurement error standard deviations,

σε, that are specific to each yield maturity when we perform estimations with the Treasury

yield data described in Section 2, while σε is assumed to be uniform for all yield maturities

in the simulated yield samples as discussed below. The linear least-squares optimality of the

Kalman filter requires that the white noise transition and measurement errors be orthogonal

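to the initial state; i.e., $E[f_0\eta_t'] = 0$ and $E[f_0\varepsilon_t'] = 0$. Finally, the standard deviations of the estimated parameters are calculated as

$$
\Sigma(\hat{\psi}) = \frac{1}{T}\left[ \frac{1}{T}\sum_{t=1}^{T} \frac{\partial \log l_t(\hat{\psi})}{\partial \psi} \frac{\partial \log l_t(\hat{\psi})}{\partial \psi}' \right]^{-1},
$$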

where ψ̂ denotes the optimal parameter set.
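The formula above is the outer product of gradients estimator. A minimal sketch follows, assuming the per-observation score vectors have already been computed (for example by numerical differentiation of the log likelihood contributions) and that the reported standard deviations are the square roots of the diagonal of $\Sigma(\hat{\psi})$. The function name and the toy scores are hypothetical.

```python
import numpy as np


def opg_standard_errors(scores):
    """scores: (T, p) array whose row t is d log l_t / d psi at the optimum."""
    scores = np.asarray(scores, dtype=float)
    t = scores.shape[0]
    outer_mean = scores.T @ scores / t       # (1/T) sum_t s_t s_t'
    cov = np.linalg.inv(outer_mean) / t      # (1/T) [ ... ]^{-1}
    return np.sqrt(np.diag(cov))


# Toy scores for a single parameter over four observations.
print(opg_standard_errors([[2.0], [-2.0], [2.0], [-2.0]]))
```

In the toy case the average squared score is 4, so the variance is (1/4) × (1/4) = 0.0625 and the standard deviation is 0.25.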

6 Simulation study

To study the efficiency of the Kalman filter in estimating affine term structure models with

and without stochastic volatility, we undertake a carefully orchestrated simulation study, the

details of which are provided below.

$^{20}$ A few notable examples of papers that follow this approach include Duffee (1999), Driessen (2005), Feldhütter and Lando (2008), and Christensen et al. (2015).

First, we search for a realistic parameter set for each AFNSi model class to use in the simulations. From CDR it follows that neither maximally flexible models nor parsimonious independent-factors models appear to reflect the true dynamics of the state variables: the former perform poorly out of sample, and the latter are counterfactual in that the state variables do appear to be correlated. For that reason, we look for parsimonious specifications in between these two extremes. For each model class, we go through a general-to-specific model selection procedure using the Bayesian Information Criterion defined as

$$
BIC(k) = -2\log L + k \log T,
$$

where k is the number of estimated parameters, while T is the number of observations in the

data. As described in Section 2, our data sample contains T = 1,101 weekly observations.

Since CDR report limited gains in terms of forecasting performance from allowing for flexible

specifications of the volatility matrix Σ, we restrict this matrix to be diagonal throughout.
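As a small worked example, the criterion can be evaluated with the log likelihood values and parameter counts reported for the AFNS0 specifications in Table 3 and T = 1,101. The function name `bic` and the dictionary labels are hypothetical.

```python
import math


def bic(log_likelihood: float, k: int, n_obs: int) -> float:
    """Bayesian Information Criterion: -2 log L + k log T."""
    return -2.0 * log_likelihood + k * math.log(n_obs)


T = 1101
specs = {  # (log L, k) as reported in Table 3
    "(1) unrestricted": (51042.41, 24),
    "(6) preferred": (51035.98, 19),
    "(7)": (51015.27, 18),
}
for name, (log_l, k) in specs.items():
    print(name, round(bic(log_l, k, T), 1))
```

Specification (6) attains the lowest value of the three, which matches its selection as the preferred AFNS0 specification.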

Based on the estimated parameters from the preferred specification for each model class,

we perform two sets of simulations. In the first, we simulate N = 1,000 sample paths for the

three state variables observed at a monthly frequency over a ten-year period. In the other,

we repeat this, but simulate over a forty-year period.$^{21}$

In a second step, these simulated factor paths are converted into simulated zero-coupon

yields observed at a monthly frequency with the following eight maturities, 0.25, 0.5, 1, 2, 3,

5, 7, and 10 years. Finally, a Gaussian i.i.d. measurement error is added to each bond yield.

To study the role, if any, of the data quality, we consider two values for the measurement

error standard deviation, $\sigma_\varepsilon$. In one simulated data sample, this standard deviation is fixed uniformly at 1 basis point; in the other, it is fixed uniformly at 10 basis points, which is at the upper end of the noise we observe in the Treasury yield data. To make the results as comparable as possible across model classes, the simulated measurement errors are kept the same; that is, they are the same for the ten- and forty-year samples, respectively, regardless of the model class being simulated and of the size of the measurement error standard deviation.

We now turn to the details of the simulation of the factor paths. The continuous-time

P -dynamics are, in general, given by

$$
dX_t = K^P(\theta^P - X_t)dt + \Sigma D(X_t)dW^P_t.
$$

For both restricted square-root processes and unconstrained processes we approximate the

continuous-time process using the Euler approximation.$^{22}$ To exemplify, for a restricted square-root process,

$$
dX^i_t = \kappa^P_{ii}(\theta^P_i - X^i_t)dt + \kappa^P_{ij}(\theta^P_j - X^j_t)dt + \sigma_{ii}\sqrt{X^i_t}\,dW^{P,i}_t,
$$

the algorithm is

$$
X^i_t = X^i_{t-1} + \kappa^P_{ii}(\theta^P_i - X^i_{t-1})\Delta t + \kappa^P_{ij}(\theta^P_j - X^j_{t-1})\Delta t + \sigma_{ii}\sqrt{X^i_{t-1}}\sqrt{\Delta t}\,z^i_t, \quad z^i_t \sim N(0,1).
$$

$^{21}$ For the Gaussian AFNS0 model class we also take out weekly observations from the simulated paths. The results presented later show that increasing the sampling frequency does not materially alter any of the results. For that reason we do not analyze weekly samples for the non-Gaussian AFNS model classes.

$^{22}$ Thompson (2008) is an example.

We fix ∆t at a uniform value of 0.0001, which is equivalent to approximately 27 shocks per day

to each process through the Brownian motion. As Feller conditions and other non-negativity

requirements are imposed in the estimations performed with the observed Treasury yields, the

parameter sets used in the simulations satisfy all non-negativity requirements, so the “true”

underlying continuous-time process never becomes negative P -a.s. However, for the discretely

observed process above there is always a positive, but usually very small, probability that

the approximation will become negative. Whenever this happens, we truncate the simulated

square-root processes at 0 similar to what we do in the model estimations.
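The Euler scheme with truncation can be sketched for a single square-root factor as follows. This is a simplified illustration: it omits the cross term $\kappa^P_{ij}$, the parameter values are hypothetical, and the rule for picking monthly observations (the grid point nearest each month end) is an assumption, since the text does not say how the fine grid is mapped to monthly dates.

```python
import numpy as np


def simulate_sqrt_euler(x0, kappa, theta, sigma, years, dt=1e-4, seed=0):
    """Euler path of dX = kappa (theta - X) dt + sigma sqrt(X) dW, truncated at 0."""
    rng = np.random.default_rng(seed)
    n_steps = int(round(years / dt))
    path = np.empty(n_steps + 1)
    path[0] = x0
    shocks = rng.standard_normal(n_steps)
    for t in range(1, n_steps + 1):
        prev = path[t - 1]
        step = kappa * (theta - prev) * dt + sigma * np.sqrt(prev) * np.sqrt(dt) * shocks[t - 1]
        path[t] = max(prev + step, 0.0)      # truncate at zero if the step overshoots
    # Monthly observations: grid point nearest each month end (an assumption).
    idx = np.round(np.arange(12 * years + 1) / 12.0 / dt).astype(int)
    return path[idx]


# Hypothetical parameters that satisfy the Feller condition (0.04 > 0.005).
monthly = simulate_sqrt_euler(x0=0.05, kappa=0.8, theta=0.05, sigma=0.1, years=10)
print(len(monthly), monthly.min() >= 0.0)
print(round(1.0 / 1e-4 / 365.0, 1))          # Euler steps per calendar day
```

The last line reproduces the arithmetic behind the step size: 10,000 steps per year divided by 365 days is roughly 27 shocks per day.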

As for the starting point of the simulation algorithm, X0, we ideally want to draw it

from the unconditional joint distribution of the three state variables. However, with the

exception of the Gaussian AFNS0 model, we do not know the unconditional distribution

of Xt = (Lt, St, Ct). To overcome this problem, we take the estimated value of the three

state variables at the end of the observed Treasury yield sample and simulate the three state

variables according to the algorithm above for 100 years and repeat this 1,000 times. This

effectively gives us random draws from the joint unconditional distribution of $X_t = (L_t, S_t, C_t)$.

These starting values are identical for both the ten- and forty-year simulated samples within

each model class, again in an attempt to make the results as comparable as possible.

In the final step, we use the 1,000 simulated samples from each exercise as input into a

corresponding number of Kalman filter estimations where we use the true parameters as the

starting point for each optimization. Since we are estimating the true model in each case,

this provides us with a clean read of the properties of the Kalman filter as an estimator, not

impacted by any errors related to model misspecification.

7 Results for the Gaussian AFNS0 Model

In this section, we describe our estimation results based on the simulated data of the Gaussian

AFNS0 model that serves as the benchmark in our analysis. For this model class, the Kalman

filter is a consistent and efficient estimator equivalent to exact maximum likelihood estimation.

This allows us to study whether there is any finite-sample bias in the estimated parameters.

Due to the efficiency of the Kalman filter, such finite-sample bias will affect any estimator.

Hence, these results provide an ideal background for understanding the bias in Kalman filter-

based estimations of non-Gaussian AFNS models with stochastic volatility.

To begin, the result of the model selection for the Gaussian AFNS0 model is reported in

Table 3. The statistics in the table show that the preferred specification according to the

Bayesian Information Criterion has P -dynamics given by

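$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
=
\begin{pmatrix} \kappa^P_{11} & 0 & 0 \\ 0 & \kappa^P_{22} & \kappa^P_{23} \\ 0 & 0 & \kappa^P_{33} \end{pmatrix}
\left[ \begin{pmatrix} \theta^P_1 \\ \theta^P_2 \\ \theta^P_3 \end{pmatrix} - \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt
+
\begin{pmatrix} \sigma_{11} & 0 & 0 \\ 0 & \sigma_{22} & 0 \\ 0 & 0 & \sigma_{33} \end{pmatrix}
\begin{pmatrix} dW^{L,P}_t \\ dW^{S,P}_t \\ dW^{C,P}_t \end{pmatrix}.
$$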

Alternative                                               Goodness-of-fit statistics
specifications                                     log L     k    p-value          BIC
(1) Unrestricted K^P                           51,042.41    24       n.a.   -101,916.7
(2) κ^P_{12} = 0                               51,042.40    23     0.8875   -101,923.7
(3) κ^P_{12} = κ^P_{32} = 0                    51,042.40    22     0.8875   -101,930.7
(4) κ^P_{12} = κ^P_{32} = κ^P_{31} = 0         51,042.23    21     0.5598   -101,937.4
(5) κ^P_{31} = ... = κ^P_{21} = 0              51,037.57    20     0.0023   -101,935.1
(6) κ^P_{31} = ... = κ^P_{13} = 0              51,035.98    19     0.0745   -101,938.9
(7) κ^P_{31} = ... = κ^P_{23} = 0              51,015.27    18   < 0.0001   -101,904.5

Table 3: Evaluation of Alternative Specifications of the AFNS0 Model.
There are seven alternative estimated specifications of the AFNS0 model with constant volatility. Each specification is listed with its maximum log likelihood (log L), number of parameters (k), the p-value from a likelihood ratio test of the hypothesis that it differs from the specification above with one more free parameter, and the Bayesian information criterion (BIC).

K^P          K^P_{·,1}    K^P_{·,2}    K^P_{·,3}    θ^P          Σ
K^P_{1,·}    0.03943      0            0            0.07242      Σ_{1,·}    0.00570
             (0.07332)                              (0.01703)               (0.00009)
K^P_{2,·}    0            0.43102      -0.69198     -0.03173     Σ_{2,·}    0.00888
                          (0.11962)    (0.08121)    (0.01271)               (0.00020)
K^P_{3,·}    0            0            0.83341      -0.01873     Σ_{3,·}    0.02728
                                       (0.22767)    (0.00676)               (0.00047)

Table 4: Parameter Estimates for the Preferred AFNS0 Model.
The estimated parameters of the K^P-matrix, the θ^P-vector, and the Σ-matrix for the preferred AFNS0 model according to the Bayesian Information Criterion are shown. The Q-related parameter is λ = 0.53650 (0.00363). The numbers in parentheses are the estimated standard deviations of the parameter estimates. The maximum log likelihood value is 51,035.98.

The estimated dynamic parameters for this specification are reported in Table 4. Relative

to the unrestricted model, the likelihood ratio test statistic for the five restrictions imposed jointly in the preferred specification is

$$
LR_{BIC} = 2[51{,}042.41 - 51{,}035.98] = 12.86 \sim \chi^2(5).
$$

The probability of observing at least 12.86 with five degrees of freedom is 0.0247. Thus,

the five restrictions are not jointly supported by the data at the 5% level, but they are not

overwhelmingly rejected either.
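The tail probability quoted above can be reproduced with the standard library alone. The sketch below uses the closed-form upper tail of a chi-square distribution with five degrees of freedom, $\mathrm{erfc}(\sqrt{x/2}) + \sqrt{2/\pi}\,e^{-x/2}(\sqrt{x} + x^{3/2}/3)$; the function name is hypothetical.

```python
import math


def chi2_sf_5df(x: float) -> float:
    """Upper-tail probability of a chi-square variable with five degrees of freedom."""
    tail = math.erfc(math.sqrt(x / 2.0))
    poly = math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0) * (math.sqrt(x) + x ** 1.5 / 3.0)
    return tail + poly


lr = 2.0 * (51042.41 - 51035.98)   # log likelihoods of specifications (1) and (6)
print(round(lr, 2), round(chi2_sf_5df(lr), 4))
```

The same function returns a value close to 0.05 at the familiar 5% critical value of about 11.07, which is a convenient sanity check.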

In terms of the estimated parameters reported in Table 4 that are used in the simulations

of the AFNS0 model, we note the usual pattern that the level factor is the most persistent

and least volatile factor, the curvature is the most volatile and least persistent factor, and the

slope factor has dynamic properties in between those two extremes. Finally, the estimated

value of λ is close to 0.5, which is a typical value for this parameter.

Ten-year samples, σ_ε = 1 bp
Parameter            True         Mean    Std. dev.           5% 1st quartile       Median 3rd quartile          95%
κ^P_{11}          0.03943      0.50335      0.41496      0.07704      0.20025      0.39116      0.67401       1.2888
κ^P_{22}          0.43102      0.61752      0.28653      0.25449      0.42644      0.56905      0.74371       1.1264
κ^P_{23}         -0.69198     -0.76423      0.20026      -1.1174     -0.88497     -0.74995     -0.63269     -0.46515
κ^P_{33}          0.83341       1.2243      0.57056      0.53381      0.82011       1.1043       1.4970       2.2933
σ_{11}            0.00570      0.00571      0.00024      0.00533      0.00555      0.00570      0.00587      0.00611
σ_{22}            0.00888      0.00882      0.00058      0.00782      0.00842      0.00880      0.00924      0.00973
σ_{33}            0.02728      0.02747      0.00178      0.02454      0.02625      0.02752      0.02866      0.03038
θ^P_1             0.07242      0.07273      0.01802      0.04275      0.06020      0.07265      0.08519      0.10222
θ^P_2            -0.03173     -0.03199      0.01407     -0.05503     -0.04109     -0.03195     -0.02253     -0.00902
θ^P_3            -0.01873     -0.01903      0.00859     -0.03407     -0.02487     -0.01863     -0.01319     -0.00519
λ                 0.53650      0.53635      0.00322      0.53134      0.53415      0.53642      0.53838      0.54172
σ_ε               0.00010      0.00010      0.00000      0.00010      0.00010      0.00010      0.00010      0.00010

Ten-year samples, σ_ε = 10 bps
Parameter            True         Mean    Std. dev.           5% 1st quartile       Median 3rd quartile          95%
κ^P_{11}          0.03943      0.53931      0.47879      0.08383      0.21206      0.39589      0.70937       1.4538
κ^P_{22}          0.43102      0.62201      0.29042      0.24693      0.41999      0.57518      0.75535       1.1663
κ^P_{23}         -0.69198     -0.77180      0.20727      -1.1374     -0.89486     -0.75441     -0.63236     -0.46117
κ^P_{33}          0.83341       1.2275      0.58183      0.51673      0.81498       1.1118       1.4986       2.3486
σ_{11}            0.00570      0.00583      0.00067      0.00474      0.00539      0.00585      0.00627      0.00691
σ_{22}            0.00888      0.00871      0.00079      0.00749      0.00814      0.00870      0.00927      0.00996
σ_{33}            0.02728      0.02755      0.00263      0.02331      0.02580      0.02750      0.02936      0.03191
θ^P_1             0.07242      0.07281      0.01805      0.04235      0.06009      0.07300      0.08522      0.10207
θ^P_2            -0.03173     -0.03199      0.01412     -0.05512     -0.04133     -0.03202     -0.02258     -0.00862
θ^P_3            -0.01873     -0.01917      0.00868     -0.03430     -0.02478     -0.01885     -0.01323     -0.00548
λ                 0.53650      0.53714      0.02259      0.50028      0.52241      0.53714      0.55097      0.57385
σ_ε               0.00100      0.00100      0.00003      0.00096      0.00098      0.00100      0.00102      0.00104

Table 5: Summary Statistics of Estimated Parameters from Simulated Ten-Year Monthly Samples of the Preferred AFNS0 Model.
The table reports the summary statistics of the estimation results from N = 1,000 simulated data sets of the preferred AFNS0 model, each with a length of ten years and a uniform measurement error standard deviation of σ_ε = 1 basis point and σ_ε = 10 basis points, respectively.

7.1 Analysis of Ten-Year Monthly Samples

The summary statistics from the 1,000 estimations based on simulated ten-year monthly

samples of the preferred specification of the AFNS0 model are reported in Table 5. We note

that there is an upward bias in the absolute size of all four mean-reversion parameters, that

is, the three positive parameters in the diagonal of KP have means and medians well above

their true values, while the negative off-diagonal element, $\kappa^P_{23}$, has a mean and median that

are below its true value. Hence, there is notable finite-sample bias in the estimates of these

parameters. In particular, the near unit-root property of the Nelson-Siegel level factor is

causing the estimator significant difficulty. More than 95% of the estimates of κP11 are above

0.077 despite its true value of only 0.039. These results show that a near unit-root process

can come across as very persistent as well as rather quickly mean-reverting in samples of short

length such as the ten-year samples analyzed here. Figure 2 provides the visual representation
of the estimated mean-reversion parameters across the 1,000 samples. We note that they have
notably skewed distributions, partly as a consequence of the imposed stationarity.

[Figure 2 consists of four panels: (a) κ^P_{11}, (b) κ^P_{22}, (c) κ^P_{23}, and (d) κ^P_{33}. Horizontal axis: estimation number (0 to 1000). Vertical axis: parameter estimate.]

Figure 2: Estimated Mean-Reversion Parameters from Simulated Ten-Year Monthly Samples of the Preferred AFNS0 Model.
Illustration of the estimated mean-reversion parameters in the K^P matrix from N = 1,000 simulated data sets of the preferred AFNS0 model, each with a length of ten years sampled monthly and a uniform measurement error standard deviation of σ_ε = 10 basis points. The true value of each parameter is indicated with a horizontal solid grey line.
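The upward bias in the mean-reversion estimates discussed above has a well-known univariate analogue that is easy to reproduce. The sketch below is not the paper's experiment: it simulates a scalar Gaussian AR(1) process observed monthly for ten years, with the level factor's true values of $\kappa^P_{11}$ and $\sigma_{11}$ from Table 4, and estimates the mean-reversion rate by ordinary least squares rather than by the Kalman filter. The function name and the first-order mapping from the autoregressive coefficient to $\kappa$ are assumptions made for illustration.

```python
import numpy as np


def mean_reversion_estimates(kappa, sigma, years=10, n_sims=1000, seed=0):
    """OLS estimates of kappa from monthly samples of a scalar Gaussian AR(1)."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / 12.0
    phi = np.exp(-kappa * dt)                      # exact monthly AR(1) coefficient
    shock_sd = sigma * np.sqrt((1.0 - phi ** 2) / (2.0 * kappa))
    n_obs = 12 * years
    estimates = np.empty(n_sims)
    for i in range(n_sims):
        x = np.empty(n_obs + 1)
        x[0] = rng.normal(0.0, sigma / np.sqrt(2.0 * kappa))   # stationary draw
        shocks = shock_sd * rng.standard_normal(n_obs)
        for t in range(n_obs):
            x[t + 1] = phi * x[t] + shocks[t]
        lagged, current = x[:-1], x[1:]
        lagged_c = lagged - lagged.mean()
        phi_hat = lagged_c @ (current - current.mean()) / (lagged_c @ lagged_c)
        estimates[i] = (1.0 - phi_hat) / dt        # first-order mapping to kappa
    return estimates


est = mean_reversion_estimates(kappa=0.03943, sigma=0.00570)
print(est.mean(), np.median(est))
```

Even in this stripped-down setting, both the mean and the median of the estimates lie well above the true value, which is in line with the pattern for $\kappa^P_{11}$ in Table 5.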

Turning to the three volatility parameters in the Σ matrix, we note that they are well

determined with almost identical means and medians, both close to the true values, and the

standard deviations of their estimates are also small. Importantly, though, their accuracy

is sensitive to the quality of the data as a low value of σε decreases the dispersion of their


estimated values. This result applies to all three factors, and it suggests that the values of the

volatility parameters are determined to a large extent from their impact on the cross-sectional

fit of yields rather than from the time series properties of the state variables, which are the

same in the simulated data by construction and independent of the value of σε.

The mean parameters under the P -measure, θP , represent the opposite case. Due to the

flexibility of the essentially affine risk premium specification within the Gaussian models,

these parameters play no role for the Q-dynamics and, by implication, have no effect on the

cross-sectional fit of the model. As a consequence, their estimated values are purely derived

from the time series properties of the state variables and their distributions are independent

of the level of noise in the yield data. Furthermore, they are estimated without any detectable

bias, and the standard deviation of their estimated values is also relatively modest, but larger

the more persistent the factor in question is.

Focusing on the estimates of λ, Table 5 shows that this parameter is well determined

in the estimation with a small standard deviation. Its estimates have a 5th-to-95th percentile range of

(0.500, 0.574) for the case with noise error standard deviation of 10 basis points, and

an even narrower range of (0.531, 0.542) when we reduce the standard deviation of

the measurement noise to 1 basis point. Since λ only affects the risk-neutral Q-dynamics,

it is exclusively determined from the cross section of yields and therefore sensitive to the

quality of the data. Still, variation in the values of λ in the ranges above does not alter the

cross-sectional fit of the model by much. Thus, its statistical uncertainty is largely without

economic consequences.

Finally, the estimates of the measurement error standard deviation exhibit very little

variation across the simulated samples. However, as noted, their size affects the accuracy of

the three volatility parameters and λ. This supports the conjecture put forward by CDR

that the elements in the volatility matrix in the AFNS0 model are determined primarily in

order to deliver the best possible fit to the cross section of yields rather than matching the

actual volatility correlation structure among the three state variables. On the other hand,

the properties of the estimates of the elements in the mean-reversion matrix KP and the

mean vector θP are essentially unaffected by the size of σε as these parameters reflect the

time-series dynamics of the three state variables and their values have no consequences for

the bond yield function fitted to the cross section of observed yields.

In addition to studying the finite-sample properties of the estimated parameters, we are

also interested in knowing to what extent the parameter standard deviations estimated from

the optimized likelihood function in the Kalman filter are reliable in the sense that they reflect

the variation in the estimated parameters across the 1,000 simulated samples. In this exercise,

we hence use the empirical standard deviation of the 1,000 estimates of each parameter as

a proxy for the true, unobserved standard deviation of the estimated parameters.$^{23}$ Table 6
contains the summary statistics for the monthly ten-year samples.

$^{23}$ One potential caveat here is that the estimated parameters, the K^P parameters in particular, follow asymmetric distributions that are not necessarily well summarized by the standard deviation.

Ten-year samples, σ_ε = 1 bp
Parameter
std. dev.          "True"         Mean    Std. dev.           5% 1st quartile       Median 3rd quartile          95%
σ(κ^P_{11})       0.41496      0.32510      0.13282      0.15106      0.22730      0.30352      0.39944      0.57859
σ(κ^P_{22})       0.28653      0.25756      0.09782      0.14000      0.18904      0.23983      0.30301      0.44689
σ(κ^P_{23})       0.20026      0.20055      0.05054      0.12971      0.16252      0.19487      0.22862      0.28981
σ(κ^P_{33})       0.57056      0.55056      0.14534      0.34405      0.44860      0.53131      0.63680      0.80056
σ(σ_{11})         0.00024      0.00026      0.00003      0.00021      0.00023      0.00025      0.00028      0.00032
σ(σ_{22})         0.00058      0.00065      0.00008      0.00052      0.00059      0.00065      0.00070      0.00077
σ(σ_{33})         0.00178      0.00189      0.00022      0.00156      0.00174      0.00187      0.00203      0.00228
σ(θ^P_1)          0.01802      0.00586      0.00471      0.00140      0.00256      0.00434      0.00778      0.01526
σ(θ^P_2)          0.01407      0.01398      0.00860      0.00473      0.00812      0.01201      0.01742      0.03048
σ(θ^P_3)          0.00859      0.00878      0.00460      0.00399      0.00597      0.00780      0.01055      0.01641
σ(λ)              0.00322      0.00332      0.00076      0.00221      0.00277      0.00324      0.00376      0.00470
σ(σ_ε)            0.00000      0.00000      0.00000      0.00000      0.00000      0.00000      0.00000      0.00000

Ten-year samples, σ_ε = 10 bps
Parameter
std. dev.          "True"         Mean    Std. dev.           5% 1st quartile       Median 3rd quartile          95%
σ(κ^P_{11})       0.47879      0.36140      0.18019      0.15427      0.23479      0.32578      0.44125      0.69790
σ(κ^P_{22})       0.29042      0.26323      0.10357      0.13906      0.19128      0.24009      0.30903      0.46648
σ(κ^P_{23})       0.20727      0.20818      0.05662      0.13038      0.16588      0.20025      0.24281      0.31129
σ(κ^P_{33})       0.58183      0.58518      0.17083      0.35368      0.45967      0.56241      0.67907      0.89759
σ(σ_{11})         0.00067      0.00073      0.00008      0.00060      0.00067      0.00072      0.00078      0.00087
σ(σ_{22})         0.00079      0.00086      0.00010      0.00071      0.00079      0.00086      0.00092      0.00102
σ(σ_{33})         0.00263      0.00287      0.00035      0.00232      0.00262      0.00285      0.00309      0.00348
σ(θ^P_1)          0.01805      0.00585      0.00463      0.00142      0.00256      0.00430      0.00768      0.01525
σ(θ^P_2)          0.01412      0.01413      0.00891      0.00461      0.00816      0.01213      0.01735      0.02998
σ(θ^P_3)          0.00868      0.00895      0.00493      0.00407      0.00589      0.00784      0.01071      0.01666
σ(λ)              0.02259      0.02320      0.00525      0.01558      0.01944      0.02261      0.02644      0.03220
σ(σ_ε)            0.00003      0.00003      0.00000      0.00003      0.00003      0.00003      0.00003      0.00003

Table 6: Summary Statistics of Estimated Parameter Standard Deviations from Simulated Ten-Year Monthly Samples of the Preferred AFNS0 Model.
The table reports the summary statistics of the estimated parameter standard deviations from N = 1,000 simulated data sets of the preferred AFNS0 model, each with a length of ten years and a uniform measurement error standard deviation of σ_ε = 1 basis point and σ_ε = 10 basis points, respectively.

The parameter standard deviations we calculate from the optimized likelihood function are

reasonably accurate for the parameters without bias: $\sigma_{11}$, $\sigma_{22}$, $\sigma_{33}$, $\theta^P_2$, $\theta^P_3$, $\lambda$, and $\sigma_\varepsilon$. However, even for the parameters with a modest bias, $\kappa^P_{22}$, $\kappa^P_{23}$, and $\kappa^P_{33}$, the estimated parameter standard deviations are relatively close to, but slightly below, the actual variation in the estimated parameters. Finally, for $\kappa^P_{11}$ and $\theta^P_1$, there is a more severe downward bias in the

estimated parameter standard deviations relative to the actual variation in the parameter

estimates. Overall, the conclusion is that the standard deviations obtained from the Kalman

filter underestimate the true variation for the parameters with bias. This will make these

parameters look more significant than they actually are. This problem is particularly severe

for the estimated parameters in the mean-reversion matrix KP as their point estimates are

notably upward biased to begin with. This makes model selection and validation extremely


[Figure 3 plots omitted. Panels: (a) Correlation of Lt and St; (b) Correlation of Lt and Ct; (c) Correlation of St and Ct. In each panel the horizontal axis is the estimation number (0 to 1000) and the vertical axis is the estimated correlation coefficient (−1.0 to 1.0).]

Figure 3: Pairwise Correlations of Estimated Factor Paths from Simulated Ten-Year Monthly Samples of the Preferred AFNS0 Model.

Illustration of the correlations between the estimated paths of the three state variables in N = 1,000

simulated data sets of the preferred AFNS0 model, each with a length of ten years and a uniform

measurement error standard deviation of σε = 10 basis points. Horizontal solid grey lines indicate the

factor correlations in the true unconditional distribution.

treacherous when one or more of the state variables are highly persistent. Unfortunately,

this is not an issue that can be neglected since it is the specification of KP that determines

a model’s forecast performance and term premium decomposition as discussed in detail in

Bauer et al. (2012).
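The upward bias in estimated mean-reversion rates can be reproduced in a much simpler setting than the full AFNS0 model. The sketch below is not the Kalman filter estimation used in this paper: it assumes a single, directly observed factor that follows an Ornstein-Uhlenbeck process and estimates its mean-reversion rate by least squares. The function names and the parameter values (a true mean-reversion rate of 0.5, 500 replications) are illustrative assumptions only.

```python
import math
import random
import statistics

def simulate_ou(kappa, theta, sigma, dt, n_obs, rng):
    """Exact discretization of dX = kappa*(theta - X)dt + sigma dW, stationary start."""
    phi = math.exp(-kappa * dt)
    shock_sd = sigma * math.sqrt((1.0 - phi ** 2) / (2.0 * kappa))
    x = [theta + sigma / math.sqrt(2.0 * kappa) * rng.gauss(0.0, 1.0)]
    for _ in range(n_obs - 1):
        x.append(theta + phi * (x[-1] - theta) + shock_sd * rng.gauss(0.0, 1.0))
    return x

def estimate_kappa(x, dt):
    """Regress x[t+1] on x[t] with an intercept and map the slope back to kappa."""
    lagged, lead = x[:-1], x[1:]
    mean_lag, mean_lead = statistics.fmean(lagged), statistics.fmean(lead)
    cov = sum((a - mean_lag) * (b - mean_lead) for a, b in zip(lagged, lead))
    var = sum((a - mean_lag) ** 2 for a in lagged)
    phi_hat = max(cov / var, 1e-8)  # guard the logarithm against non-positive slopes
    return -math.log(phi_hat) / dt

rng = random.Random(0)
kappa_true, dt = 0.5, 1.0 / 12.0  # hypothetical true value, monthly observations
mean_kappa_hat = {}
for years in (10, 40):
    draws = [
        estimate_kappa(simulate_ou(kappa_true, 0.05, 0.01, dt, 12 * years, rng), dt)
        for _ in range(500)
    ]
    mean_kappa_hat[years] = statistics.fmean(draws)
    print(years, "years: mean estimated kappa =", round(mean_kappa_hat[years], 3))
```

Under these assumptions the average estimate exceeds the true value in both cases, and the gap shrinks as the sample is lengthened from ten to forty years, which is the same qualitative pattern as in the mean-reversion estimates discussed here.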

Figure 3 shows the correlations between the estimated factor paths across the 1,000 samples.
We note that, in short ten-year samples, factor path correlations are not a reliable guide
to “spotting” the appropriate dynamic relationship between the factors in multi-dimensional
models of the yield curve. Because the level factor exhibits so little mean reversion, almost
any level of correlation can be observed even though, within the simulated model, the level
factor is entirely independent of the two other factors. Furthermore, even for the slope and
curvature factors, which are strongly positively correlated within the simulated model, the
observed correlation can be low, and even negative, with non-trivial probability.
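To illustrate why sample correlations are so uninformative when one factor is highly persistent, the following sketch simulates two independent Ornstein-Uhlenbeck processes over ten years of monthly observations and records their sample correlation. It is a stand-alone illustration, not the AFNS0 simulation of this paper: the mean-reversion rates 0.04 and 0.43 are rounded from the true values of κP11 and κP22, and unit volatility is an assumption that does not affect correlations.

```python
import math
import random
import statistics

def ou_path(kappa, dt, n_obs, rng):
    """Zero-mean, unit-volatility Ornstein-Uhlenbeck path sampled every dt years."""
    phi = math.exp(-kappa * dt)
    shock_sd = math.sqrt((1.0 - phi ** 2) / (2.0 * kappa))
    x = [rng.gauss(0.0, 1.0) / math.sqrt(2.0 * kappa)]  # draw from the stationary law
    for _ in range(n_obs - 1):
        x.append(phi * x[-1] + shock_sd * rng.gauss(0.0, 1.0))
    return x

def correlation(a, b):
    """Pearson sample correlation of two equally long sequences."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

rng = random.Random(1)
dt, n_obs = 1.0 / 12.0, 120  # ten years of monthly data
corrs = [
    correlation(ou_path(0.04, dt, n_obs, rng), ou_path(0.43, dt, n_obs, rng))
    for _ in range(300)
]
share_large = sum(abs(c) > 0.3 for c in corrs) / len(corrs)
print("min, max correlation:", round(min(corrs), 2), round(max(corrs), 2))
print("share with |correlation| > 0.3:", round(share_large, 2))
```

Although the two simulated processes are independent by construction, a sizeable share of the ten-year samples shows correlations well away from zero in either direction.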

To end the analysis of the ten-year monthly samples, we analyze the accuracy of the

filtering of the state variables. Table 7 reports the mean absolute difference between the

simulated factor paths and the estimated factor paths from the Kalman filter. For the level

and the slope factor, their absolute filtered error is close to the size of σε that represents

the noise in the data. This might be due to the fact that they affect yields one-for-one at

their maximum loading in the yield function. For the curvature factor, its absolute filtered

error tends to be slightly more than three times larger than the size of σε since its maximum

loading in the yield function is barely 0.3.
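The loadings referred to above can be checked numerically. The sketch below evaluates the standard Nelson-Siegel slope and curvature loadings in the yield function at the true value λ = 0.5365 and searches a maturity grid for the maximum curvature loading. The grid of maturities (0.01 to 30 years) is an assumption chosen for illustration and need not match the maturities used in the simulated samples.

```python
import math

LAMBDA = 0.5365  # true value of lambda in the simulated model

def slope_loading(tau, lam=LAMBDA):
    """Nelson-Siegel slope loading at maturity tau (in years)."""
    x = lam * tau
    return (1.0 - math.exp(-x)) / x

def curvature_loading(tau, lam=LAMBDA):
    """Nelson-Siegel curvature loading at maturity tau (in years)."""
    x = lam * tau
    return (1.0 - math.exp(-x)) / x - math.exp(-x)

maturities = [0.01 * k for k in range(1, 3001)]  # illustrative grid, 0.01 to 30 years
max_curv, tau_star = max((curvature_loading(t), t) for t in maturities)

print("slope loading at the shortest maturity:", round(slope_loading(maturities[0]), 4))
print("maximum curvature loading:", round(max_curv, 4), "at maturity", round(tau_star, 2))
print("inverse of maximum curvature loading:", round(1.0 / max_curv, 2))
```

The slope loading approaches one at short maturities, while the curvature loading peaks just below 0.3, so its inverse is slightly above three. This matches the observation that the filtered error of the curvature factor is a little more than three times σε.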


Mean absolute fitted error, ten-year samples, σε = 1 bp
State variable  Mean    Std. dev.  5 percentile  1st quartile  Median  3rd quartile  95 percentile
Lt              2.16    0.77       1.48          1.66          1.87    2.39          3.79
St              2.01    0.81       1.30          1.47          1.69    2.27          3.78
Ct              4.89    0.48       4.21          4.55          4.83    5.13          5.77

Mean absolute fitted error, ten-year samples, σε = 10 bps
State variable  Mean    Std. dev.  5 percentile  1st quartile  Median  3rd quartile  95 percentile
Lt              11.77   1.59       9.56          10.70         11.52   12.64         14.51
St              11.42   1.66       9.37          10.31         11.13   12.17         14.57
Ct              34.66   3.25       29.87         32.48         34.42   36.56         40.35

Table 7: Summary Statistics of Mean Absolute Fitted Errors of the Filtered State Variables from Simulated Ten-Year Monthly Samples of the Preferred AFNS0 Model.

The table reports the summary statistics of the mean absolute fitted error of the three state variables

from N = 1,000 simulated data sets of the preferred AFNS0 model, each with a length of ten years

and a uniform measurement error standard deviation of σε = 1 basis point and σε = 10 basis points,

respectively. All numbers are measured in basis points.

7.2 Analysis of Forty-Year Monthly Samples

In this section, we analyze the results obtained for the forty-year monthly samples simulated

from the AFNS0 model.

For a start, Table 8 contains the summary statistics for the 1,000 estimated parameter sets
we obtain from these monthly forty-year samples. For the parameters determined primarily
from the cross section, i.e., λ, σ11, σ22, and σ33, we see a reduction of about 50% in their
dispersion when we quadruple the length of the sample. For the other unbiased parameters,
θP1, θP2, and θP3, we see a similar reduction in the dispersion for the two latter, while the
variation in the estimates of θP1 is reduced by only about 20%. This is tied to the fact that,
even with this sample length, κP11 is still estimated with notable upward bias, although the
bias is much less severe than in the ten-year samples. On the other hand, for the remaining
mean-reversion parameters with bias, κP22, κP23, and κP33, we see a significant reduction in
their bias. In addition, the uncertainty of their estimated values is reduced by a factor of 2.5,
which reflects the combined effect of increasing the sample length (which reduces the
uncertainty in itself) and the reduction in the finite-sample bias.
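The reductions in dispersion quoted above can be compared with the benchmark of an estimator whose standard deviation shrinks with the square root of the sample length, in which case quadrupling the sample would halve the dispersion. The sketch below does this arithmetic for the σε = 10 basis points case. The forty-year figures are the standard deviations in Table 8. The ten-year figures are taken from the first column of the visible rows of Table 6, under the assumption that those rows belong to the σε = 10 basis points panel. The variable names are illustrative.

```python
import math

# (ten-year, forty-year) standard deviations of the parameter estimates, sigma_eps = 10 bps.
# Ten-year values: first column of Table 6 (assumed to be the 10 bps panel).
# Forty-year values: "Std. dev." column of Table 8.
dispersion = {
    "sigma11": (0.00067, 0.00033),
    "sigma22": (0.00079, 0.00038),
    "sigma33": (0.00263, 0.00126),
    "theta1": (0.01805, 0.01499),
    "theta2": (0.01412, 0.00818),
    "theta3": (0.00868, 0.00495),
    "lambda": (0.02259, 0.01019),
}

benchmark_ratio = 1.0 / math.sqrt(4.0)  # square-root-of-sample-length benchmark
reduction = {}
for name, (ten_year, forty_year) in dispersion.items():
    reduction[name] = 1.0 - forty_year / ten_year
    print(f"{name:8s} reduction in dispersion: {100 * reduction[name]:5.1f}%")
print(f"benchmark reduction: {100 * (1.0 - benchmark_ratio):.1f}%")
```

With these inputs, the cross-sectional parameters come out close to the 50% benchmark, θP2 and θP3 somewhat below it, and θP1 far below it.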

For the parameters determined from the cross section of yields, we note that a ten-year
sample of high quality (σε = 1 basis point) tends to lead to more accurate estimates than
forty-year samples of relatively noisy data (σε = 10 basis points). Thus, whether a long,
noisier sample or a short, high-quality sample is more appropriate depends on the
parameters of interest. The accuracy of the parameters in KP and θP is determined by the
sample length and is largely independent of the data quality, while the accuracy of the
estimates of λ and the parameters in the Σ volatility matrix can be more sensitive to data
quality than to sample length.


Forty-year samples, σε = 1 bp
Parameter  True      Mean      Std. dev.  5%        1st quartile  Median    3rd quartile  95%
κP11       0.03943   0.15530   0.11662    0.03282   0.07563       0.12431   0.20568       0.38359
κP22       0.43102   0.46791   0.08887    0.34128   0.40697       0.45893   0.52220       0.63095
κP23       -0.69198  -0.70833  0.08404    -0.84889  -0.76527      -0.70567  -0.64882      -0.57729
κP33       0.83341   0.94312   0.22975    0.62266   0.77463       0.91457   1.0842        1.3699

σ11        0.00570   0.00571   0.00011    0.00553   0.00563       0.00571   0.00578       0.00589
σ22        0.00888   0.00887   0.00028    0.00840   0.00868       0.00887   0.00905       0.00932
σ33        0.02728   0.02733   0.00084    0.02602   0.02675       0.02732   0.02792       0.02871

θP1        0.07242   0.07299   0.01497    0.04628   0.06317       0.07379   0.08262       0.09583
θP2        -0.03173  -0.03166  0.00817    -0.04497  -0.03728      -0.03167  -0.02590      -0.01842
θP3        -0.01873  -0.01864  0.00492    -0.02653  -0.02184      -0.01865  -0.01540      -0.01053

λ          0.53650   0.53641   0.00141    0.53405   0.53546       0.53636   0.53735       0.53873

σε         0.00010   0.00010   0.00000    0.00010   0.00010       0.00010   0.00010       0.00010

Forty-year samples, σε = 10 bps
Parameter  True      Mean      Std. dev.  5%        1st quartile  Median    3rd quartile  95%
κP11       0.03943   0.15875   0.12341    0.03325   0.07462       0.12392   0.20531       0.40518
κP22       0.43102   0.46867   0.09180    0.33625   0.40536       0.45924   0.52424       0.63261
κP23       -0.69198  -0.71050  0.08795    -0.85659  -0.76772      -0.70791  -0.65208      -0.57646
κP33       0.83341   0.94851   0.24482    0.60052   0.77666       0.91701   1.0976        1.4252

σ11        0.00570   0.00574   0.00033    0.00520   0.00553       0.00574   0.00594       0.00626
σ22        0.00888   0.00885   0.00038    0.00821   0.00859       0.00885   0.00910       0.00949
σ33        0.02728   0.02734   0.00126    0.02522   0.02646       0.02738   0.02820       0.02938

θP1        0.07242   0.07300   0.01499    0.04632   0.06290       0.07406   0.08273       0.09618
θP2        -0.03173  -0.03166  0.00818    -0.04470  -0.03727      -0.03167  -0.02590      -0.01831
θP3        -0.01873  -0.01866  0.00495    -0.02662  -0.02193      -0.01864  -0.01548      -0.01042

λ          0.53650   0.53618   0.01019    0.51956   0.52931       0.53605   0.54286       0.55275

σε         0.00100   0.00100   0.00001    0.00098   0.00099       0.00100   0.00101       0.00102

Table 8: Summary Statistics of Estimated Parameters from Simulated Forty-Year Monthly Samples of the Preferred AFNS0 Model.

The table reports the summary statistics of the estimation results from N = 1,000 simulated data sets

of the preferred AFNS0 model, each with a length of forty years and a uniform measurement error

standard deviation of σε = 1 basis point and σε = 10 basis points, respectively.

Figure 4 shows the distribution of the estimated parameters in the KP mean-reversion

matrix across the 1,000 samples when the sample length is forty years and the noise has a

standard deviation of 10 basis points. Relative to the distribution from the ten-year samples

shown in Figure 2, we note the significant reduction in both the dispersion and skewness of

the estimates of each of these four parameters when the sample length is quadrupled.

Table 9 reports the summary statistics of the estimated parameter standard deviations we
obtain from the optimized likelihood function in the Kalman filter for the forty-year samples.
We note that the means and medians are close to each other and close to the standard
deviation of the parameter estimates that we use as a proxy for the true, but unobserved,
parameter uncertainty. The pair (κP11, θP1) remains the exception for which the estimated
standard deviations still significantly understate the actual variation in the estimated parameters.
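One way to make this comparison concrete is to divide the mean of the estimated standard deviations by the standard deviation of the parameter estimates across the 1,000 samples (the "True" column of Table 9). The sketch below does so for the σε = 10 basis points panel. The 0.9 cut-off used to flag understated uncertainty is an arbitrary threshold chosen for illustration, not a criterion used in this paper.

```python
# (proxy for true std. dev., mean of estimated std. dev.) from Table 9, sigma_eps = 10 bps
table9 = {
    "kappa11": (0.12341, 0.08677),
    "kappa22": (0.09180, 0.08995),
    "kappa23": (0.08795, 0.08794),
    "kappa33": (0.24482, 0.23593),
    "theta1": (0.01499, 0.00802),
    "theta2": (0.00818, 0.00800),
    "theta3": (0.00495, 0.00478),
    "lambda": (0.01019, 0.01050),
}

THRESHOLD = 0.9  # illustrative cut-off, an assumption of this sketch

ratios = {name: est / true for name, (true, est) in table9.items()}
flagged = [name for name, ratio in ratios.items() if ratio < THRESHOLD]

for name, ratio in ratios.items():
    print(f"{name:8s} estimated / true std. dev. = {ratio:.2f}")
print("understated:", flagged)
```

With these inputs only κP11 and θP1 fall below the threshold, at roughly 0.70 and 0.54 respectively, while the remaining ratios lie within a few percent of one.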

To end the analysis of the forty-year monthly samples, we analyze the accuracy of the


[Figure 4 plots omitted. Panels: (a) κP11; (b) κP22; (c) κP23; (d) κP33. In each panel the horizontal axis is the estimation number (0 to 1000) and the vertical axis is the parameter estimate.]

Figure 4: Estimated Mean-Reversion Parameters from Simulated Forty-Year Monthly Samples of the Preferred AFNS0 Model.

Illustration of the estimated mean-reversion parameters in the KP matrix from N = 1,000 simulated

data sets of the preferred AFNS0 model, each with a length of forty years and a uniform measurement

error standard deviation of σε = 10 basis points. The true value of each parameter is indicated with

a horizontal solid grey line.

filtering of the state variables. Table 10 reports the mean absolute difference between the
simulated factor paths and the estimated factor paths from the Kalman filter in this case.
Compared to the results for the ten-year monthly samples reported in Table 7, there is a
modest gain in the quality of the filtering from quadrupling the sample length. However,
as measured by the median of the absolute filtered errors, the difference is only about 0.5
basis points or less. Thus, for all practical purposes, the filtering accuracy is the same and not sensitive


Forty-year samples, σε = 1 bp
Parameter std. dev.  “True”    Mean      Std. dev.  5%        1st quartile  Median    3rd quartile  95%
σ(κP11)              0.11662   0.08415   0.03025    0.04295   0.06274       0.07986   0.10214       0.13916
σ(κP22)              0.08887   0.08785   0.01606    0.06462   0.07669       0.08584   0.09698       0.11745
σ(κP23)              0.08404   0.08482   0.01030    0.06927   0.07720       0.08390   0.09169       0.10208
σ(κP33)              0.22975   0.22606   0.03238    0.17799   0.20380       0.22351   0.24643       0.28205

σ(σ11)               0.00011   0.00012   0.00001    0.00011   0.00011       0.00012   0.00012       0.00013
σ(σ22)               0.00028   0.00030   0.00002    0.00027   0.00029       0.00030   0.00031       0.00033
σ(σ33)               0.00084   0.00087   0.00005    0.00078   0.00083       0.00087   0.00090       0.00096

σ(θP1)               0.01497   0.00802   0.00540    0.00228   0.00414       0.00663   0.01041       0.01921
σ(θP2)               0.00817   0.00798   0.00214    0.00491   0.00647       0.00769   0.00926       0.01174
σ(θP3)               0.00492   0.00476   0.00113    0.00313   0.00396       0.00461   0.00542       0.00680

σ(λ)                 0.00141   0.00141   0.00018    0.00115   0.00129       0.00140   0.00152       0.00172

σ(σε)                0.00000   0.00000   0.00000    0.00000   0.00000       0.00000   0.00000       0.00000

Forty-year samples, σε = 10 bps
Parameter std. dev.  “True”    Mean      Std. dev.  5%        1st quartile  Median    3rd quartile  95%
σ(κP11)              0.12341   0.08677   0.03343    0.04349   0.06199       0.08040   0.10587       0.14805
σ(κP22)              0.09180   0.08995   0.01677    0.06647   0.07849       0.08762   0.09911       0.12037
σ(κP23)              0.08795   0.08794   0.01140    0.07112   0.07970       0.08732   0.09499       0.10791
σ(κP33)              0.24482   0.23593   0.03729    0.18051   0.21062       0.23243   0.25698       0.30326

σ(σ11)               0.00033   0.00032   0.00002    0.00029   0.00031       0.00032   0.00034       0.00036
σ(σ22)               0.00038   0.00039   0.00002    0.00036   0.00038       0.00040   0.00041       0.00043
σ(σ33)               0.00126   0.00130   0.00008    0.00116   0.00124       0.00129   0.00135       0.00143

σ(θP1)               0.01499   0.00802   0.00542    0.00223   0.00414       0.00656   0.01042       0.01893
σ(θP2)               0.00818   0.00800   0.00218    0.00491   0.00649       0.00773   0.00931       0.01186
σ(θP3)               0.00495   0.00478   0.00115    0.00314   0.00394       0.00464   0.00545       0.00686

σ(λ)                 0.01019   0.01050   0.00127    0.00864   0.00961       0.01039   0.01130       0.01283

σ(σε)                0.00001   0.00001   0.00000    0.00001   0.00001       0.00001   0.00001       0.00001

Table 9: Summary Statistics of Estimated Parameter Standard Deviations from Simulated Forty-Year Monthly Samples of the Preferred AFNS0 Model.

The table reports the summary statistics of the estimated parameter standard deviations from N =

1,000 simulated data sets of the preferred AFNS0 model, each with a length of forty years and a uniform

measurement error standard deviation of σε = 1 basis point and σε = 10 basis points, respectively.

to the sample length.

7.3 Analysis of Weekly Samples

In this section, we analyze the estimation results we obtain with the exact same data analyzed
thus far, but sampled at a weekly frequency. Importantly, we emphasize that the simulated
factor paths are identical estimation by estimation; only the observation frequency has changed.
This should make the results as comparable as possible. The only other difference is that the
simulated measurement errors are not the same across the two exercises.

As before, we start with an analysis of the ten-year samples, the results for which are
reported in Table 11. In general, the mean and median of the 1,000 estimates of each
parameter are close to identical to those obtained with monthly data. Thus, in this sense, there
are limited benefits from increasing the data frequency. Still, we do see some reduction in


Mean absolute fitted error, forty-year samples, σε = 1 bp
State variable  Mean    Std. dev.  5 percentile  1st quartile  Median  3rd quartile  95 percentile
Lt              1.69    0.20       1.50          1.57          1.63    1.73          2.09
St              1.51    0.23       1.31          1.37          1.43    1.55          1.97
Ct              4.64    0.17       4.38          4.52          4.63    4.75          4.93

Mean absolute fitted error, forty-year samples, σε = 10 bps
State variable  Mean    Std. dev.  5 percentile  1st quartile  Median  3rd quartile  95 percentile
Lt              11.13   0.57       10.23         10.71         11.09   11.49         12.12
St              10.64   0.54       9.84          10.27         10.61   10.96         11.59
Ct              34.07   1.46       31.64         33.06         34.02   35.04         36.52

Table 10: Summary Statistics of Mean Absolute Fitted Errors of the Filtered State Variables from Simulated Forty-Year Monthly Samples of the Preferred AFNS0 Model.

The table reports the summary statistics of the mean absolute fitted error of the three state variables

from N = 1,000 simulated data sets of the preferred AFNS0 model, each with a length of forty years

and a uniform measurement error standard deviation of σε = 1 basis point a
