How Efficient is the Kalman Filter at
Estimating Affine Term Structure Models?†
Jens H. E. Christensen
Jose A. Lopez
and
Glenn D. Rudebusch
Federal Reserve Bank of San Francisco
101 Market Street, Mailstop 1130
San Francisco, CA 94105
Preliminary and incomplete draft. Comments are welcome.
Abstract
We perform a carefully orchestrated simulation study to analyze the bias of the Kalman
filter in estimating arbitrage-free Nelson-Siegel (AFNS) models with and without stochas-
tic volatility. For Gaussian AFNS models, we document significant finite-sample bias in
the estimated mean-reversion parameters. Since the Kalman filter is consistent and ef-
ficient for that model class, this exercise provides a measure of the finite-sample bias
that will affect any estimator. For AFNS models with stochastic volatility, significant
finite-sample upward estimation bias remains, but it is not materially larger than in the
Gaussian model. Hence, we recommend estimation based on the Kalman filter for both
types of AFNS models and corresponding affine term structure models in general.
JEL Classification: C13, C58, G12, G17.
Keywords: arbitrage-free Nelson-Siegel models, finite-sample bias, stochastic volatility
†We thank seminar participants at the Second Humboldt Copenhagen Conference on Financial Econometrics for comments on an earlier draft of this paper. The views in this paper are solely the responsibility of the authors and should not be interpreted as reflecting the views of the Federal Reserve Bank of San Francisco or the Board of Governors of the Federal Reserve System.
This version: August 24, 2015.
1 Introduction
Interest rate volatility is a topic of great research interest given its role in derivatives pricing
and portfolio risk management. However, as compared to the empirical results presented
in the extensive GARCH literature, the results of modeling interest rate volatility within
the more commonly used affine, arbitrage-free models of the term structure have been less
clear-cut, partly due to the difficulty in estimating their parameters.
Estimation of flexible affine term structure models is complicated and time consuming,
partly due to the fairly large number of parameters, and partly due to the latent nature of the
state variables in such models. The latter causes the estimation to be plagued by numerous
local maxima that are distinct in the sense that they are not invariant affine transformations1
of each other and therefore may have very different economic implications, see Duffee (2011)
and Kim and Orphanides (2012) for discussions of these issues.
To overcome those problems, Christensen et al. (2011, henceforth CDR) introduce the
affine arbitrage-free class of Nelson-Siegel term structure models (henceforth referred to as
AFNS models). These are affine term structure models that preserve the level, slope, and
curvature factor loading structure in the bond yield function known from the standard Nelson
and Siegel (1987) yield curve model. These models are easy to estimate because the role
of each factor is predetermined and does not vary for any admissible set of parameters.
Furthermore, in that model class, the state variables are Gaussian with constant volatility.
As a consequence, the models can be estimated with the standard Kalman filter, which
is equivalent to exact maximum likelihood estimation and therefore is both efficient and
consistent in the limit. However, despite its consistency and efficiency, the Kalman filter
remains subject to any unavoidable finite-sample bias.
In a recent paper, Christensen et al. (2014a, henceforth CLR) generalize the AFNS
model framework introduced in CDR by incorporating stochastic volatility into the state
variables. These models are also easy to estimate, again due to the imposed Nelson-Siegel
factor loading structure. CLR estimate their models using the standard Kalman filter and
report model fit on par with the original Gaussian AFNS model. Now, though, the Kalman
filter is no longer efficient and potentially inconsistent because it only approximates the true
probability distribution of the state variables by matching the first and second moment,
essentially treating the state variables as if they were Gaussian. Thus, in addition to any
finite-sample bias, there is potential for added bias arising from the fact that the Kalman filter
is only an approximation to the true likelihood function. Despite this concern, Kalman filter-
based estimation of affine term structure models with stochastic volatility is relatively common
in empirical term structure analysis,2 but the size of any bias in realistic three-factor settings
1 See Dai and Singleton (2000) for the definition of this concept.
2 For examples, see Duffee (1999), Driessen (2005), Feldhütter and Lando (2008), and Christensen et al. (2015).
has not been studied in detail in the existing term structure literature (to the best of our
knowledge). In this paper, we focus on the AFNS model classes with and without stochastic
volatility. This provides us with an ideal setting to study both the finite-sample bias and the
added bias from using the Kalman filter for estimation of affine non-Gaussian models. As an
alternative, Joslin et al. (2011) and Hamilton and Wu (2012) provide identification schemes
that facilitate the estimation of affine Gaussian models in that they avoid the filtering of the
unobserved latent factors.3 However, it is not obvious if or how those approaches extend to
affine non-Gaussian models. Thus, the AFNS-based identification of affine Gaussian models
provided by CDR and extended by CLR to affine non-Gaussian models remains an important
contribution without which the analysis in this paper would not have been feasible.4
Because interest rates are highly persistent, empirical autoregressive models, including
dynamic term structure models, suffer from substantial small-sample estimation bias. Specif-
ically, model estimates will generally be biased toward a dynamic system that displays much
less persistence than the true process (so estimates of the real-world mean-reversion matrix,
$K^P$, are upward biased). Furthermore, if the degree of interest rate persistence is underestimated,
future short rates would be expected to revert to their mean too quickly, causing
their expected longer-term averages to be too stable. Therefore, the bias in the estimated
dynamics distorts the decomposition of yields and contaminates estimates of long-maturity
term premiums.
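The mechanism described above can be illustrated with a minimal sketch, assuming a univariate AR(1) process as a stand-in for a single persistent yield factor. The persistence value, sample length, and volatility below are hypothetical illustration values, not estimates from this paper. The sketch simulates many ten-year monthly samples, estimates the autoregressive coefficient by OLS, and converts the average estimate into an implied continuous-time mean-reversion rate.

```python
import numpy as np

def simulate_ar1(phi, mu, sigma, n_obs, rng):
    """Simulate x_t = mu + phi * (x_{t-1} - mu) + sigma * eps_t, started at the mean."""
    x = np.empty(n_obs)
    x[0] = mu
    shocks = rng.standard_normal(n_obs)
    for t in range(1, n_obs):
        x[t] = mu + phi * (x[t - 1] - mu) + sigma * shocks[t]
    return x

def ols_ar1_slope(x):
    """OLS estimate of phi from a regression of x_t on a constant and x_{t-1}."""
    y, lag = x[1:], x[:-1]
    lag_c = lag - lag.mean()
    return float(lag_c @ (y - y.mean()) / (lag_c @ lag_c))

rng = np.random.default_rng(0)
true_phi = 0.99   # hypothetical monthly persistence (near unit root)
n_obs = 120       # ten years of monthly observations
dt = 1.0 / 12.0

estimates = np.array([
    ols_ar1_slope(simulate_ar1(true_phi, 0.05, 0.002, n_obs, rng))
    for _ in range(2000)
])
mean_phi_hat = estimates.mean()

# Mean-reversion rate implied by an AR(1) coefficient: kappa = -log(phi) / dt.
kappa_true = -np.log(true_phi) / dt
kappa_hat = -np.log(mean_phi_hat) / dt  # implied by the average estimate of phi

print(f"true phi {true_phi:.3f}, average estimate {mean_phi_hat:.3f}")
print(f"true kappa {kappa_true:.3f}, implied kappa {kappa_hat:.3f}")
```

In this toy setting the average estimated persistence falls below the true value, so the implied mean-reversion rate is too high, which is the direction of bias discussed in the text.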
To study this finite-sample problem in detail, we start out simulating and estimating
Gaussian AFNS models for which the Kalman filter is an efficient estimator as already noted.
We simulate short ten-year and long forty-year samples to study the finite-sample bias problem
directly. We allow for low and high noise to assess how data quality affects our conclusions.
Furthermore, for the benchmark Gaussian AFNS model, we also analyze samples at weekly
frequency in addition to the monthly frequency used throughout, but since this turns out to
matter little for our conclusions, we do not repeat this exercise for the models with stochastic
volatility. We then proceed to simulate and estimate AFNS models with stochastic volatility
in a similarly careful way.
Our findings can be summarized as follows.
In the Gaussian AFNS model, there is a significant finite-sample upward bias in the
estimates of the mean-reversion rate of the Nelson-Siegel level factor due to its near unit-root
property. In addition, there is a more modest finite-sample upward estimation bias in the
mean-reversion parameters for the slope and curvature factors, owing to their lower persistence.
Importantly, there is no finite-sample bias in the estimated mean parameters of any of the
factors. Furthermore, all parameters that relate to the model’s Q-dynamics used for pricing
3 Andreasen and Christensen (2015) offer an alternative way of estimating non-Gaussian term structure models.
4 The related literature includes Duan and Simonato (1995), Lund (1997), De Jong (2000), Duffee and Stanton (2004), and Duffee and Stanton (2008), among others.
and fitting the cross section of yields are well determined and without any measurable bias.
This property turns out to hold for non-Gaussian models as well. However, the accuracy of
the estimated Q-dynamics is affected by the amount of noise in the data. Finally, the data
frequency plays no role for these conclusions as both weekly and monthly simulated data
produce similar results. However, in the weekly samples, the parameter standard deviations
estimated from the optimized likelihood function in the Kalman filter tend to be too low. This
makes the upward biased mean-reversion parameters appear even more significant than they
are, which complicates model selection. Hence, we document one of the unusual situations
where more data do not necessarily lead to better inference. For selecting the appropriate
specification of the mean-reversion matrix, which matters for forecast performance, term
premium decompositions, etc., we therefore recommend relying on monthly rather than weekly
data.
We then proceed to simulate and estimate AFNS models with stochastic volatility gener-
ated by the level factor in one set of exercises, and with stochastic volatility generated by the
curvature factor in another set of exercises.
First, we find that the finite-sample upward bias in the estimated mean-reversion parame-
ters is not materially different in the models with stochastic volatility relative to the Gaussian
AFNS model. The intuition behind this result is that the time series properties of the three
state variables are primarily determined by the Nelson-Siegel factor loading structure, which
is almost identical for all AFNS models with and without stochastic volatility. For similar
reasons we also see little bias in the estimated mean parameters in these models.
Second, we analyze in detail the ability of the Kalman filter to estimate the volatility
sensitivity parameters that determine the degree to which the stochastic volatility factor
affects the volatility of the unconstrained factors in each model. For U.S. Treasury yields,
these sensitivity parameters are often estimated to be negligible (see CLR for an example) and
we report similar results. To assess whether this is a general weakness of the Kalman filter
when applied to models with stochastic volatility, we perform separate simulation experiments
with large values for the sensitivity parameters. Our results show that the Kalman filter is in
fact able to estimate them with some accuracy. Thus, when their estimated values are tiny
and insignificant, it is most likely because the data call for them to be so.
Third, in general, it is the case that the parameters that primarily affect the models’
fit to the cross section of yields tend to have small or no bias, but their accuracy varies
positively with the quality of the data. We note one exception though. In the AFNS model
with stochastic volatility generated by the curvature factor, the mean of the curvature factor
under the risk-neutral Q measure is not well identified. However, we show that this can be
solved at practically no cost by fixing it at a low value that is exactly high enough that the
curvature factor does not reach its zero lower bound.
Another key finding is that the Kalman filter is as efficient at filtering state variables in
non-Gaussian models as it is at filtering in Gaussian models, in particular under optimal con-
ditions with high-quality data. As a consequence, the fit of the AFNS models with stochastic
volatility is as good as, if not better than, the fit of the Gaussian AFNS model.
Finally, in light of the low interest rate environment in recent years, we emphasize that
our study has no bearing on how Kalman filter-based estimations perform when yields are near
their lower bound and exhibit asymmetric behavior for that reason. This is a task that we
leave for future research. Still, the results we report could serve as a useful benchmark even
for that kind of exercise.
The rest of the paper is structured as follows. Section 2 describes our sample of U.S.
Treasury yields and motivates our focus on the Nelson-Siegel yield curve model, while Section
3 briefly details the original Gaussian AFNS model of the term structure. Section 4 goes on
to describe the five classes of AFNS models with stochastic volatility dynamics introduced
in CLR. Section 5 details the estimation methodology, while Section 6 describes the simu-
lation study. Section 7 contains the results from the simulation exercises for the Gaussian
AFNS model, while Sections 8 and 9 contain the results for the AFNS models with stochastic
volatility generated by the level and curvature factor, respectively. Section 10 concludes the
paper.
2 Motivation for the Nelson-Siegel Model
In this section, we motivate our focus on the Nelson-Siegel yield curve model using principal
components analysis. Recall that principal components analysis decomposes the observed
data into a number of factors equal to the number of time series and ranks those factors
according to how much of the observed variation each factor explains.
The specific Treasury yields we analyze to obtain realistic parameter sets to be used
in our simulation exercises are zero-coupon yields constructed by the method described in
Gürkaynak et al. (2007) and briefly detailed here.5 For each business day a zero-coupon yield
curve of the Svensson (1995)-type
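\[
y_t(\tau) = \beta_0 + \frac{1 - e^{-\lambda_1 \tau}}{\lambda_1 \tau}\,\beta_1
+ \left[\frac{1 - e^{-\lambda_1 \tau}}{\lambda_1 \tau} - e^{-\lambda_1 \tau}\right]\beta_2
+ \left[\frac{1 - e^{-\lambda_2 \tau}}{\lambda_2 \tau} - e^{-\lambda_2 \tau}\right]\beta_3
\]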
is fitted to price a large pool of underlying off-the-run Treasury bonds. Thus, for each busi-
ness day, we have the fitted values of the four coefficients (β0(t), β1(t), β2(t), β3(t)) and two
parameters (λ1(t), λ2(t)). From this data set zero-coupon yields for any relevant maturity
can be calculated. As demonstrated by Gürkaynak et al. (2007), this discount function prices
the underlying pool of bonds extremely well. By implication, the zero-coupon yields derived
from this approach constitute a very good approximation to the true underlying Treasury
5 The Board of Governors of the Federal Reserve updates the data on its website at http://www.federalreserve.gov/pubs/feds/2006/index.html.
[Figure 1: line plot of the 10-year, 5-year, 1-year, and 3-month yields. Vertical axis: rate in percent (0 to 10). Horizontal axis: 1988 to 2008.]
Figure 1: Time Series of Treasury Yields.
Illustration of the weekly observed Treasury zero-coupon bond yields covering the period from December 4, 1987, to January 2, 2009. The yields shown have maturities of three months, one year, five years, and ten years.
Maturity     Mean    Std. dev.   Skewness   Kurtosis
in months    in %    in %
    3        4.52    2.02          0.03       2.41
    6        4.61    2.05         -0.01       2.40
   12        4.77    2.04         -0.04       2.41
   24        5.03    1.95         -0.03       2.43
   36        5.24    1.86          0.02       2.39
   60        5.58    1.72          0.15       2.25
   84        5.85    1.62          0.26       2.13
  120        6.16    1.52          0.36       2.05

Table 1: Summary Statistics of Treasury Yields.
Summary statistics for the sample of weekly observed Treasury zero-coupon bond yields covering the period from December 4, 1987, to January 2, 2009.
zero-coupon yield curve.6
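As a hedged illustration of how zero-coupon yields for any maturity follow from the six fitted values, the Svensson formula stated above can be evaluated as in the sketch below. The function name `svensson_yield` and all numerical parameter values are hypothetical and chosen only for illustration; they are not fitted values from the Gürkaynak et al. (2007) data set.

```python
import numpy as np

def svensson_yield(tau, beta0, beta1, beta2, beta3, lam1, lam2):
    """Svensson zero-coupon yield at maturity tau (in years), as written in the text."""
    tau = np.asarray(tau, dtype=float)
    x1, x2 = lam1 * tau, lam2 * tau
    slope = -np.expm1(-x1) / x1                    # (1 - exp(-lam1*tau)) / (lam1*tau)
    hump1 = slope - np.exp(-x1)                    # first curvature term
    hump2 = -np.expm1(-x2) / x2 - np.exp(-x2)      # second curvature term
    return beta0 + beta1 * slope + beta2 * hump1 + beta3 * hump2

# Hypothetical coefficients (yields in percent) for a single business day.
params = dict(beta0=5.0, beta1=-2.0, beta2=1.0, beta3=0.5, lam1=0.6, lam2=0.1)

# The eight maturities used in the paper: 3m, 6m, 1y, 2y, 3y, 5y, 7y, 10y.
maturities = np.array([0.25, 0.5, 1.0, 2.0, 3.0, 5.0, 7.0, 10.0])
curve = svensson_yield(maturities, **params)
for m, y in zip(maturities, curve):
    print(f"{m:5.2f} years: {y:.3f}%")
```

Two limits offer a quick sanity check on the functional form: the yield tends to beta0 + beta1 as the maturity goes to zero and to beta0 as the maturity grows without bound.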
To have the most active part of the maturity spectrum represented, we construct Treasury
zero-coupon bond yields with the following maturities: 3-month, 6-month, 1-year, 2-year, 3-
year, 5-year, 7-year, and 10-year. We use weekly data (Fridays) and limit our sample to the
6 D’Amico and King (2013) show that the Svensson functional form has had some difficulty at times in fitting the underlying bond prices since the peak of the financial crisis. This explains why we end our sample on January 2, 2009. Furthermore, we emphasize that we merely use the U.S. Treasury yields to obtain realistic parameter sets to be used in the model simulations. Hence, ultimately, the accuracy of the Svensson smoothed curve does not matter for our exercise and the conclusions we draw.
Maturity                  Loading on
in months     First P.C.   Second P.C.   Third P.C.
    3           -0.38        -0.44          0.52
    6           -0.39        -0.38          0.19
   12           -0.40        -0.25         -0.21
   24           -0.38        -0.03         -0.47
   36           -0.36         0.12         -0.42
   60           -0.33         0.33         -0.11
   84           -0.30         0.44          0.18
  120           -0.27         0.53          0.45
% explained     94.12         5.58          0.27

Table 2: Eigenvectors of the First Three Principal Components of Treasury Yields.
The loadings of yields of various maturities on the first three principal components are shown. The final row shows the proportion of all bond yield variability accounted for by each principal component. The data consist of weekly U.S. Treasury zero-coupon bond yields from December 4, 1987, to January 2, 2009.
period from December 4, 1987, to January 2, 2009. The summary statistics are provided in
Table 1, while Figure 1 illustrates the constructed time series of the three-month, one-year,
five-year, and ten-year Treasury zero-coupon yields.
Researchers have typically found that three factors are sufficient to model the time-
variation in the cross section of Treasury bond yields (e.g., Litterman and Scheinkman, 1991).
Indeed, for our weekly Treasury bond data, 99.97% of the total variation is accounted for by
three factors. Table 2 reports the eigenvectors that correspond to the first three principal
components of our data. The first principal component accounts for 94.1% of the variation in
the Treasury bond yields, and its loading across maturities is uniformly negative. Thus, like
a level factor, a shock to this component changes all yields in the same direction irrespective
of maturity. The second principal component accounts for 5.6% of the variation in these data
and has sizable negative loadings for the shorter maturities and sizable positive loadings for
the long maturities. Thus, like a slope factor, a shock to this component steepens or flattens
the yield curve. Finally, the third component, which accounts for only 0.3% of the variation,
has a U-shaped factor loading as a function of maturity, which is naturally interpreted as a
curvature factor.
In summary, three factors can explain 99.97% of the variation in this set of
Treasury bond yields, and they have properties consistent with an interpretation of level,
slope, and curvature as in the Nelson-Siegel model detailed in the following.
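The principal components calculation behind Table 2 can be sketched as follows. This is a minimal illustration on synthetic data, not a replication: the yield panel is generated from three hypothetical persistent factors with Nelson-Siegel-type loadings plus small noise, and the decay parameter, factor volatilities, and noise level are all assumed values. Only the last line uses numbers from the paper, namely the "% explained" row of Table 2.

```python
import numpy as np

def principal_components(yields):
    """Eigenvectors (as columns) and variance shares, sorted by explained variance."""
    cov = np.cov(yields, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvecs, eigvals / eigvals.sum()

rng = np.random.default_rng(1)
maturities = np.array([0.25, 0.5, 1.0, 2.0, 3.0, 5.0, 7.0, 10.0])  # in years
lam = 0.5                                                          # hypothetical decay rate
slope = (1.0 - np.exp(-lam * maturities)) / (lam * maturities)
loadings = np.column_stack(
    [np.ones_like(maturities), slope, slope - np.exp(-lam * maturities)]
)

# Synthetic panel: three persistent factors plus small measurement noise (in percent).
n_obs = 1000
factors = np.zeros((n_obs, 3))
innov_sd = np.array([0.10, 0.15, 0.30])
for t in range(1, n_obs):
    factors[t] = 0.98 * factors[t - 1] + innov_sd * rng.standard_normal(3)
yields = 5.0 + factors @ loadings.T + 0.01 * rng.standard_normal((n_obs, maturities.size))

eigvecs, shares = principal_components(yields)
top_three_share = shares[:3].sum()
print("variance shares of the first three components:", np.round(shares[:3], 4))

# Cross-check of the "% explained" row reported in Table 2.
table2_total = 94.12 + 5.58 + 0.27
print(f"Table 2 shares sum to {table2_total:.2f}%")
```

The sign of each eigenvector is arbitrary, which is why a uniformly negative first loading, as in Table 2, is still interpreted as a level factor.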
3 The AFNS Model with Constant Volatility
In this section, we briefly review the AFNS model with constant volatility, throughout referred
to as the AFNS0 specification.7,8 We start from a standard continuous-time affine arbitrage-free
structure (Duffie and Kan, 1996) that underlies all the models to be estimated in this paper.
To represent an affine diffusion process, define a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t), Q)$,
where the filtration $(\mathcal{F}_t) = \{\mathcal{F}_t : t \geq 0\}$ satisfies the usual conditions (Williams, 1997). The
state variables $X_t$ are assumed to be a Markov process defined on a set $M \subset \mathbb{R}^n$ that solves
the following stochastic differential equation (SDE)9
\[
dX_t = K^Q(t)\left[\theta^Q(t) - X_t\right]dt + \Sigma(t) D(X_t, t)\, dW_t^Q, \tag{1}
\]
where $W^Q$ is a standard Brownian motion in $\mathbb{R}^n$, the information of which is contained in
the filtration $(\mathcal{F}_t)$. The drift terms $\theta^Q : [0, T] \to \mathbb{R}^n$ and $K^Q : [0, T] \to \mathbb{R}^{n \times n}$ are bounded,
continuous functions.10 Similarly, the volatility matrix $\Sigma : [0, T] \to \mathbb{R}^{n \times n}$ is assumed to be a
bounded, continuous function, while $D : M \times [0, T] \to \mathbb{R}^{n \times n}$ is assumed to have the following
diagonal structure
\[
\begin{pmatrix}
\sqrt{\gamma^1(t) + \delta^1(t) X_t} & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & \sqrt{\gamma^n(t) + \delta^n(t) X_t}
\end{pmatrix},
\]
where
\[
\gamma(t) = \begin{pmatrix} \gamma^1(t) \\ \vdots \\ \gamma^n(t) \end{pmatrix}, \qquad
\delta(t) = \begin{pmatrix}
\delta^1_1(t) & \cdots & \delta^1_n(t) \\
\vdots & \ddots & \vdots \\
\delta^n_1(t) & \cdots & \delta^n_n(t)
\end{pmatrix},
\]
$\gamma : [0, T] \to \mathbb{R}^n$ and $\delta : [0, T] \to \mathbb{R}^{n \times n}$ are bounded, continuous functions, and $\delta^i(t)$ denotes
the $i$th row of the $\delta(t)$-matrix. Finally, the instantaneous risk-free rate is assumed to be an
affine function of the state variables
\[
r_t = \rho_0(t) + \rho_1(t)' X_t,
\]
7 Our nomenclature follows CLR and draws on Dai and Singleton (2000). Our AFNSn models are members of their An(3) class of models, which have three state variables and n square-root processes.
8 This model has been shown to exhibit both good in-sample fit and out-of-sample forecast accuracy for various yield curves. The empirical analysis conducted in CDR is based on unsmoothed Fama-Bliss data for nominal Treasury yields. Christensen et al. (2010) examine yields for nominal and real Treasuries as per Gürkaynak et al. (2007, 2010), while Christensen et al. (2014b) examine short-term LIBOR and highly-rated banks’ and financial firms’ corporate bond rates.
9 The affine property applies to bond prices; therefore, affine models only impose structure on the factor dynamics under the pricing measure.
10 Stationarity of the state variables is ensured if all the eigenvalues of $K^Q(t)$ are positive (if complex, the real component should be positive), see Ahn et al. (2002). However, stationarity is not a necessary requirement for the process to be well defined.
where $\rho_0 : [0, T] \to \mathbb{R}$ and $\rho_1 : [0, T] \to \mathbb{R}^n$ are bounded, continuous functions.
Duffie and Kan (1996) prove that zero-coupon bond prices in this framework are exponential-affine
functions of the state variables
\[
P(t, T) = E_t^Q\left[\exp\left(-\int_t^T r_u\, du\right)\right] = \exp\left(B(t, T)' X_t + A(t, T)\right),
\]
where $B(t, T)$ and $A(t, T)$ are the solutions to the following system of ordinary differential
equations (ODEs)
\[
\frac{dB(t, T)}{dt} = \rho_1 + (K^Q)' B(t, T) - \frac{1}{2} \sum_{j=1}^{n} \left(\Sigma' B(t, T) B(t, T)' \Sigma\right)_{j,j} (\delta^j)', \qquad B(T, T) = 0, \tag{2}
\]
\[
\frac{dA(t, T)}{dt} = \rho_0 - B(t, T)' K^Q \theta^Q - \frac{1}{2} \sum_{j=1}^{n} \left(\Sigma' B(t, T) B(t, T)' \Sigma\right)_{j,j} \gamma^j, \qquad A(T, T) = 0, \tag{3}
\]
and the possible time-dependence of the parameters is suppressed in the notation. These
pricing functions imply that the zero-coupon yields are given by
\[
y(t, T) = -\frac{1}{T - t} \log P(t, T) = -\frac{B(t, T)'}{T - t} X_t - \frac{A(t, T)}{T - t}.
\]
As per CDR, assume that the instantaneous risk-free rate is defined by
rt = Lt + St.
In addition, assume that the state variables Xt = (Lt, St, Ct) are described by the following
system of SDEs under the risk-neutral Q-measure
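\[
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
=
\begin{pmatrix} 0 & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix}
\left[
\begin{pmatrix} \theta_1^Q \\ \theta_2^Q \\ \theta_3^Q \end{pmatrix}
-
\begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix}
\right] dt
+ \Sigma
\begin{pmatrix} dW_t^{L,Q} \\ dW_t^{S,Q} \\ dW_t^{C,Q} \end{pmatrix},
\qquad \lambda > 0.
\]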
Then, zero-coupon bond yields are given by
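\[
y(t, T) = L_t + \left(\frac{1 - e^{-\lambda(T - t)}}{\lambda(T - t)}\right) S_t
+ \left(\frac{1 - e^{-\lambda(T - t)}}{\lambda(T - t)} - e^{-\lambda(T - t)}\right) C_t
- \frac{A(t, T)}{T - t}.
\]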
This result defines the class of AFNS0 models derived in CDR and the additional term in
the yield function is a so-called yield-adjustment term that represents convexity effects due
to Jensen’s inequality; see CDR for details. To complete the model, we need to specify the
risk premium structure that generates the connection to the dynamics under the real-world
P -measure. To that end, it is important to note that there are no restrictions on the dynamic
drift components under the empirical P -measure. Therefore, beyond the requirement of
constant volatility, we are free to choose the dynamics under the P -measure. To facilitate
the empirical implementation, we follow CDR and limit our focus to the essentially affine risk
premium introduced in Duffee (2002). In the Gaussian framework, this specification implies
that the risk premiums Γt depend linearly on the state variables; that is,
\[
\Gamma_t = \gamma^0 + \gamma^1 X_t,
\]
where $\gamma^0 \in \mathbb{R}^3$ and $\gamma^1 \in \mathbb{R}^{3 \times 3}$ contain unrestricted parameters. The relationship between
real-world yield curve dynamics under the P-measure and risk-neutral dynamics under the
Q-measure is given by
\[
dW_t^Q = dW_t^P + \Gamma_t\, dt.
\]
Thus, we can write the P-dynamics of the state variables as
\[
dX_t = K^P(\theta^P - X_t)\, dt + \Sigma\, dW_t^P,
\]
where both KP and θP are allowed to vary freely relative to their counterparts under the
Q-measure. Following CDR, we identify this class of models by fixing the means under the
Q-measure at zero, i.e., $\theta^Q = 0$ (see footnote 11). Furthermore, CDR show that Σ cannot be more than a
triangular matrix for the model to be identified. Thus, the maximally flexible specification of
the original AFNS model has Q-dynamics given by
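\[
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
=
\begin{pmatrix} 0 & 0 & 0 \\ 0 & -\lambda & \lambda \\ 0 & 0 & -\lambda \end{pmatrix}
\begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} dt
+
\begin{pmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & 0 \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix}
\begin{pmatrix} dW_t^{L,Q} \\ dW_t^{S,Q} \\ dW_t^{C,Q} \end{pmatrix},
\]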
while its P -dynamics are given by
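\[
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
=
\begin{pmatrix}
\kappa_{11}^P & \kappa_{12}^P & \kappa_{13}^P \\
\kappa_{21}^P & \kappa_{22}^P & \kappa_{23}^P \\
\kappa_{31}^P & \kappa_{32}^P & \kappa_{33}^P
\end{pmatrix}
\left[
\begin{pmatrix} \theta_1^P \\ \theta_2^P \\ \theta_3^P \end{pmatrix}
-
\begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix}
\right] dt
+
\begin{pmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & 0 \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix}
\begin{pmatrix} dW_t^{L,P} \\ dW_t^{S,P} \\ dW_t^{C,P} \end{pmatrix}.
\]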
The main limitation of the AFNS0 class of models is that it is characterized by a constant
volatility matrix Σ. CLR modify the AFNS0 model in a straightforward fashion in order to
incorporate stochastic volatility. The key assumption to preserving the desirable Nelson-Siegel
factor loading structure in the zero-coupon bond yield function is to maintain the KQ mean-
reversion matrix under the Q-measure. Furthermore, all model classes will be characterized
by an instantaneous risk-free rate defined as the sum of the first two factors
rt = Lt + St.
11 CDR demonstrate that this choice is without loss of generality.
The details of the AFNS models with stochastic volatility are briefly provided in the following
section.
4 Five AFNS Specifications with Stochastic Volatility
In this section, we present five AFNS specifications with stochastic volatility that vary de-
pending on whether they contain one, two, or three stochastic volatility factors and on the
identity of those factors. For each model class, we derive the maximally flexible specifica-
tion that can be obtained using the extended affine risk premium specification introduced in
Cheridito et al. (2007).
4.1 AFNS Models with One Stochastic Volatility Factor
There are two AFNS stochastic volatility specifications that allow just one factor to exhibit
stochastic volatility. The first, denoted as the AFNS1-L model, allows only the level factor
to exhibit stochastic volatility. The state variables in this specification follow this system of
stochastic differential equations under the risk-neutral Q-measure:12
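\[
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
=
\begin{pmatrix} \varepsilon & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix}
\left[
\begin{pmatrix} \theta_1^Q \\ \theta_2^Q \\ \theta_3^Q \end{pmatrix}
-
\begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix}
\right] dt
+
\begin{pmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & 0 \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix}
\begin{pmatrix}
\sqrt{L_t} & 0 & 0 \\
0 & \sqrt{1 + \beta_{21} L_t} & 0 \\
0 & 0 & \sqrt{1 + \beta_{31} L_t}
\end{pmatrix}
\begin{pmatrix} dW_t^{L,Q} \\ dW_t^{S,Q} \\ dW_t^{C,Q} \end{pmatrix},
\]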
where the level factor Lt is a square-root process with stochastic volatility that affects the
instantaneous volatility of the two other factors through the volatility sensitivity parameters,
β21 and β31.
For the factor loadings in the zero-coupon bond prices, B1(t, T ) is the solution to
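\[
\begin{aligned}
\frac{dB^1(t, T)}{dt} ={}& 1 + \varepsilon B^1(t, T) - \frac{1}{2}\sigma_{11}^2 B^1(t, T)^2 - \frac{1}{2}\sigma_{21}^2 B^2(t, T)^2 - \frac{1}{2}\sigma_{31}^2 B^3(t, T)^2 \\
& - \sigma_{21}\sigma_{11} B^1(t, T) B^2(t, T) - \sigma_{31}\sigma_{11} B^1(t, T) B^3(t, T) - \sigma_{21}\sigma_{31} B^2(t, T) B^3(t, T) \\
& - \frac{1}{2}\beta_{21}\left[\sigma_{22}^2 B^2(t, T)^2 + \sigma_{32}^2 B^3(t, T)^2 + 2\sigma_{22}\sigma_{32} B^2(t, T) B^3(t, T)\right] - \frac{1}{2}\beta_{31}\sigma_{33}^2 B^3(t, T)^2,
\end{aligned}
\]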
12 Note that we cannot set $\kappa_{11}^Q$ to zero as that would eliminate the drift of $L_t$ and cause this process to remain at zero once it hits zero, which it will P-a.s. Instead, we fix this parameter at a small, but positive, $\varepsilon = 10^{-6}$, to get close to the unit-root property imposed in the AFNS0 model.
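while $B^2(t, T)$ and $B^3(t, T)$ are given by
\[
B^2(t, T) = -\left(\frac{1 - e^{-\lambda(T - t)}}{\lambda}\right),
\]
\[
B^3(t, T) = (T - t) e^{-\lambda(T - t)} - \left(\frac{1 - e^{-\lambda(T - t)}}{\lambda}\right).
\]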
The last two factor loadings match exactly the factor loadings of the slope and curvature
factors in the Nelson-Siegel zero-coupon yield function, while the ODE for B1(t, T ) contains
quadratic elements related to the stochastic volatility of Lt. The A(t, T )-function in the
yield-adjustment term in this class of models must solve the following ODE:
\[
\frac{dA(t, T)}{dt} = -B(t, T)' K^Q \theta^Q - \frac{1}{2}\sigma_{22}^2 B^2(t, T)^2 - \frac{1}{2}\left(\sigma_{32}^2 + \sigma_{33}^2\right) B^3(t, T)^2 - \sigma_{22}\sigma_{32} B^2(t, T) B^3(t, T).
\]
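Because the ODE for $B^1(t,T)$ has no closed-form solution in general, it has to be solved numerically. The sketch below is one possible way to do so, assuming a classical fourth-order Runge-Kutta scheme in time to maturity $\tau = T - t$ (so that $dB^1/d\tau = -dB^1/dt$ with $B^1 = 0$ at $\tau = 0$). The solver choice, the function names, and all parameter values are assumptions made for illustration; the source does not state which numerical method is used. The three quadratic groups in the code are the squared elements of $\Sigma' B$, which expand to exactly the terms in the ODE above.

```python
import math

def b2(tau, lam):
    """Closed-form slope loading B^2 as a function of time to maturity."""
    return -(1.0 - math.exp(-lam * tau)) / lam

def b3(tau, lam):
    """Closed-form curvature loading B^3 as a function of time to maturity."""
    return tau * math.exp(-lam * tau) - (1.0 - math.exp(-lam * tau)) / lam

def db1_dtau(tau, b1, p):
    """Right-hand side of the B^1 ODE in tau = T - t, i.e. minus dB^1/dt."""
    v2, v3 = b2(tau, p["lam"]), b3(tau, p["lam"])
    q1 = p["s11"] * b1 + p["s21"] * v2 + p["s31"] * v3   # first element of Sigma'B
    q2 = p["s22"] * v2 + p["s32"] * v3                   # second element of Sigma'B
    q3 = p["s33"] * v3                                   # third element of Sigma'B
    db1_dt = (1.0 + p["eps"] * b1 - 0.5 * q1**2
              - 0.5 * p["beta21"] * q2**2 - 0.5 * p["beta31"] * q3**2)
    return -db1_dt

def solve_b1(tau_max, p, n_steps=1000):
    """Integrate B^1 from tau = 0 (where B^1 = 0) to tau_max with RK4."""
    h = tau_max / n_steps
    tau, b1 = 0.0, 0.0
    for _ in range(n_steps):
        k1 = db1_dtau(tau, b1, p)
        k2 = db1_dtau(tau + 0.5 * h, b1 + 0.5 * h * k1, p)
        k3 = db1_dtau(tau + 0.5 * h, b1 + 0.5 * h * k2, p)
        k4 = db1_dtau(tau + h, b1 + h * k3, p)
        b1 += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        tau += h
    return b1

# Hypothetical parameter values, for illustration only.
p_vol = dict(lam=0.5, eps=1e-6, s11=0.02, s21=0.01, s31=0.01,
             s22=0.02, s32=0.01, s33=0.03, beta21=5.0, beta31=5.0)
p_zero = dict(p_vol, s11=0.0, s21=0.0, s31=0.0, s22=0.0, s32=0.0, s33=0.0)

b1_zero = solve_b1(10.0, p_zero)   # without volatility, B^1 is close to -tau
b1_vol = solve_b1(10.0, p_vol)
print("level loading -B1/tau, no volatility:", -b1_zero / 10.0)
print("level loading -B1/tau, with volatility:", -b1_vol / 10.0)
```

With all volatility parameters set to zero and ε tiny, the ten-year level loading $-B^1/\tau$ is essentially one, as in the Nelson-Siegel model; the quadratic terms pull it slightly below one.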
To estimate this model, we specify the dynamics under the real-world P -measure as the
measure change $dW_t^Q = dW_t^P + \Gamma_t\, dt$. Note that we are limited to the essentially affine risk
premium structure introduced by Duffee (2002) for this particular model class.13 Given this
limitation, the maximally flexible affine P -dynamics are, in general, given by
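\[
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
=
\begin{pmatrix}
\kappa_{11}^P & 0 & 0 \\
\kappa_{21}^P & \kappa_{22}^P & \kappa_{23}^P \\
\kappa_{31}^P & \kappa_{32}^P & \kappa_{33}^P
\end{pmatrix}
\left[
\begin{pmatrix} \theta_1^P \\ \theta_2^P \\ \theta_3^P \end{pmatrix}
-
\begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix}
\right] dt
+
\begin{pmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & 0 \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix}
\begin{pmatrix}
\sqrt{L_t} & 0 & 0 \\
0 & \sqrt{1 + \beta_{21} L_t} & 0 \\
0 & 0 & \sqrt{1 + \beta_{31} L_t}
\end{pmatrix}
\begin{pmatrix} dW_t^{L,P} \\ dW_t^{S,P} \\ dW_t^{C,P} \end{pmatrix}.
\]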
For the first factor with stochastic volatility, there is a restriction on the mean parameter $\theta_1^P$
that we implement as14
\[
\theta_1^P = \frac{\varepsilon \cdot \theta_1^Q}{\kappa_{11}^P}.
\]
Furthermore, for this process to be well-defined under both probability measures, we require
that
\[
\kappa_{11}^P \theta_1^P > 0 \quad \text{and} \quad \varepsilon \cdot \theta_1^Q > 0.
\]
These two inequalities are satisfied provided $\kappa_{11}^P > 0$ and $\theta_1^Q > 0$. These restrictions ensure
13 We choose not to use the extended affine risk premium specification for this particular specification because of the restriction imposed on $\kappa_{11}^Q$ to obtain a level factor structure as similar as possible to the one in the Nelson-Siegel model. If we were to do so, we would expect the Feller condition for $L_t$ to be violated under the Q-measure as $L_t$ would approach a unit-root process (CLR observe such violations in the AFNS3 model to be detailed later despite imposing Feller conditions on all three state variables under both probability measures), but we stress that this is a self-imposed restriction based on the above concern, and not a theoretical necessity.
14 A similar approach is used in the other model classes with stochastic volatility generated by the level factor.
that the Lt-process will move into positive territory whenever it hits the zero lower bound.
Finally, we identify this class of models by fixing θQ2 = θQ3 = 0, that is, we eliminate the Q-
means of the unconstrained processes as in CDR. These restrictions allow the corresponding
means under the P -measure to be determined in the estimation.
The natural next AFNS one-factor stochastic volatility specification would allow the slope
factor to exhibit stochastic volatility. However, examination of the matrix
\[
K^Q = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix},
\]
shows that $S_t$ cannot be a square-root process with $C_t$ as an unconstrained process, if the
important off-diagonal element $\kappa_{23}^Q$ is to remain equal to $-\lambda$, which generates the unique
factor loading of the curvature factor in the AFNS model. Thus, there is no admissible
AFNS1-S model. Instead, we turn to the AFNS1-C model by allowing the curvature factor
to be a stochastic volatility factor. This approach preserves the properties of the level and
slope factors, allows the curvature factor to continue to serve as the stochastic mean of the
slope factor under the pricing measure, and designates the curvature factor to be the source
of stochastic volatility in the model.
For the AFNS1-C model, we assume that the state variables Xt are described under the
risk-neutral Q-measure as:
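\[
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
=
\begin{pmatrix} 0 & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix}
\left[
\begin{pmatrix} \theta_1^Q \\ \theta_2^Q \\ \theta_3^Q \end{pmatrix}
-
\begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix}
\right] dt
+
\begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ 0 & \sigma_{22} & \sigma_{23} \\ 0 & 0 & \sigma_{33} \end{pmatrix}
\begin{pmatrix}
\sqrt{1 + \beta_{13} C_t} & 0 & 0 \\
0 & \sqrt{1 + \beta_{23} C_t} & 0 \\
0 & 0 & \sqrt{C_t}
\end{pmatrix}
\begin{pmatrix} dW_t^{L,Q} \\ dW_t^{S,Q} \\ dW_t^{C,Q} \end{pmatrix}.
\]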
The curvature factor here is a square-root process that induces stochastic volatility in the
other two factors through the volatility sensitivity parameters, β13 and β23.
In this model class, the first two factor loadings are identical to those in the AFNS0 model,
while B3(t, T ) is the solution to:
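\[
\begin{aligned}
\frac{dB^3(t, T)}{dt} ={}& -\lambda B^2(t, T) + \lambda B^3(t, T) - \frac{1}{2}\sigma_{13}^2 B^1(t, T)^2 - \frac{1}{2}\sigma_{23}^2 B^2(t, T)^2 - \frac{1}{2}\sigma_{33}^2 B^3(t, T)^2 \\
& - \sigma_{13}\sigma_{23} B^1(t, T) B^2(t, T) - \sigma_{13}\sigma_{33} B^1(t, T) B^3(t, T) - \sigma_{23}\sigma_{33} B^2(t, T) B^3(t, T) \\
& - \frac{1}{2}\beta_{13}\sigma_{11}^2 B^1(t, T)^2 - \frac{1}{2}\beta_{23}\left[\sigma_{12}^2 B^1(t, T)^2 + \sigma_{22}^2 B^2(t, T)^2 + 2\sigma_{12}\sigma_{22} B^1(t, T) B^2(t, T)\right].
\end{aligned}
\]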
The A(t, T )-function in the yield-adjustment term in this class of models solves the ODE:
\[
\frac{dA(t, T)}{dt} = -B(t, T)' K^Q \theta^Q - \frac{1}{2}\left(\sigma_{11}^2 + \sigma_{12}^2\right) B^1(t, T)^2 - \frac{1}{2}\sigma_{22}^2 B^2(t, T)^2 - \sigma_{12}\sigma_{22} B^1(t, T) B^2(t, T).
\]
We estimate this model using the extended affine risk premium specification such that
the measure change is $dW_t^Q = dW_t^P + \Gamma_t\, dt$. The maximally flexible affine P-dynamics are,
in general, given by
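\[
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
=
\begin{pmatrix}
\kappa_{11}^P & \kappa_{12}^P & \kappa_{13}^P \\
\kappa_{21}^P & \kappa_{22}^P & \kappa_{23}^P \\
0 & 0 & \kappa_{33}^P
\end{pmatrix}
\left[
\begin{pmatrix} \theta_1^P \\ \theta_2^P \\ \theta_3^P \end{pmatrix}
-
\begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix}
\right] dt
+
\begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ 0 & \sigma_{22} & \sigma_{23} \\ 0 & 0 & \sigma_{33} \end{pmatrix}
\begin{pmatrix}
\sqrt{1 + \beta_{13} C_t} & 0 & 0 \\
0 & \sqrt{1 + \beta_{23} C_t} & 0 \\
0 & 0 & \sqrt{C_t}
\end{pmatrix}
\begin{pmatrix} dW_t^{L,P} \\ dW_t^{S,P} \\ dW_t^{C,P} \end{pmatrix}.
\]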
To keep the model arbitrage-free, Ct cannot be allowed to hit the zero lower bound. This
outcome is ensured by requiring that the parameters for the Ct-process satisfy the Feller
condition under both probability measures; i.e.,
\[
\kappa_{33}^P \theta_3^P > \frac{1}{2}\sigma_{33}^2 \quad \text{and} \quad \lambda \theta_3^Q > \frac{1}{2}\sigma_{33}^2.
\]
Finally, we identify this class of models by fixing θQ1 = θQ2 = 0, which allows the means
under the P -measure of the unconstrained factors to vary freely and be determined in the
estimation.
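The Feller condition for the curvature factor and the way $C_t$ scales the volatility of the other factors can be illustrated with the following sketch. It assumes an Euler scheme with full truncation for the square-root process, which is a common discretization choice but not necessarily the one used in the simulation study; the function names and parameter values are hypothetical.

```python
import numpy as np

def satisfies_feller(kappa, theta, sigma):
    """Feller condition kappa * theta > sigma^2 / 2 for a square-root process."""
    return kappa * theta > 0.5 * sigma**2

def simulate_square_root(kappa, theta, sigma, x0, dt, n_steps, rng):
    """Euler scheme with full truncation: drift and diffusion use max(x, 0)."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for t in range(n_steps):
        x_pos = max(x[t], 0.0)
        shock = rng.standard_normal()
        x[t + 1] = x[t] + kappa * (theta - x_pos) * dt + sigma * np.sqrt(x_pos * dt) * shock
    return x

# Hypothetical P-measure parameters for the curvature factor C_t.
kappa33, theta3, sigma33 = 0.5, 0.04, 0.1
print("Feller condition holds:", satisfies_feller(kappa33, theta3, sigma33))
print("Feller condition with sigma33 = 0.3:", satisfies_feller(kappa33, theta3, 0.3))

rng = np.random.default_rng(3)
path = simulate_square_root(kappa33, theta3, sigma33, x0=theta3,
                            dt=1.0 / 12.0, n_steps=120, rng=rng)

# Volatility scaling of the level factor in the AFNS1-C model: sqrt(1 + beta13 * C_t).
beta13 = 10.0  # hypothetical sensitivity parameter
level_vol_scale = np.sqrt(1.0 + beta13 * np.maximum(path, 0.0))
print("range of level volatility scaling:", level_vol_scale.min(), level_vol_scale.max())
```

Note that a discretized path can still touch or cross zero even when the continuous-time Feller condition holds, which is why the truncation inside the square root is needed in the sketch.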
4.2 AFNS Models with Two Stochastic Volatility Factors
Our third and fourth classes of stochastic volatility models allow for two stochastic volatility
factors. Although there are three potential specifications, the specification with just the level
and slope factors exhibiting stochastic volatility is not admissible because it does not permit
the important off-diagonal element $\kappa^Q_{23}$ to equal $-\lambda$, which is the unique characteristic of the
curvature factor in the original AFNS model. Instead, stochastic volatility is associated with
either level and curvature or slope and curvature. The first of these specifications, denoted
AFNS2-LC, has factor dynamics under the risk-neutral Q-measure given by¹⁵
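$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
= \begin{pmatrix} \varepsilon & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix}
\left[ \begin{pmatrix} \theta^Q_1 \\ \theta^Q_2 \\ \theta^Q_3 \end{pmatrix}
- \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt
+ \begin{pmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & \sigma_{23} \\ 0 & 0 & \sigma_{33} \end{pmatrix}
\begin{pmatrix} \sqrt{L_t} & 0 & 0 \\ 0 & \sqrt{1+\beta_{21}L_t+\beta_{23}C_t} & 0 \\ 0 & 0 & \sqrt{C_t} \end{pmatrix}
\begin{pmatrix} dW^{L,Q}_t \\ dW^{S,Q}_t \\ dW^{C,Q}_t \end{pmatrix}.
$$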
The level and curvature factors, Lt and Ct, exhibit stochastic volatility and induce time-
varying volatility in the slope factor, St, via the volatility sensitivity parameters, β21 and
β23.
The factor loadings in the zero-coupon bond price function are the unique solutions to
the following set of ODEs:
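$$
\begin{aligned}
\frac{dB^1(t,T)}{dt} &= 1 + \varepsilon B^1(t,T) - \frac{1}{2}\sigma_{11}^2 B^1(t,T)^2 - \frac{1}{2}\sigma_{21}^2 B^2(t,T)^2
- \sigma_{11}\sigma_{21} B^1(t,T)B^2(t,T) - \frac{1}{2}\beta_{21}\sigma_{22}^2 B^2(t,T)^2, \\
\frac{dB^2(t,T)}{dt} &= 1 + \lambda B^2(t,T), \\
\frac{dB^3(t,T)}{dt} &= -\lambda B^2(t,T) + \lambda B^3(t,T) - \frac{1}{2}\sigma_{33}^2 B^3(t,T)^2 - \frac{1}{2}\sigma_{23}^2 B^2(t,T)^2
- \sigma_{23}\sigma_{33} B^2(t,T)B^3(t,T) - \frac{1}{2}\beta_{23}\sigma_{22}^2 B^2(t,T)^2,
\end{aligned}
$$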
where we note that the solution to B2(t, T ) is simply
$$
B^2(t,T) = -\frac{1 - e^{-\lambda(T-t)}}{\lambda}.
$$
Hence, St preserves its role as a slope factor. The A(t, T )-function is the solution to:
$$
\frac{dA(t,T)}{dt} = -B(t,T)'K^Q\theta^Q - \frac{1}{2}\sigma_{22}^2 B^2(t,T)^2.
$$
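¹⁵ Note that, as before, we fix $\varepsilon = 10^{-6}$ to approximate the unit-root property imposed in the standard AFNS0 model.
Using the extended affine risk premium structure, the maximally flexible affine P-dynamics
are given by
$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
= \begin{pmatrix} \kappa^P_{11} & 0 & 0 \\ \kappa^P_{21} & \kappa^P_{22} & \kappa^P_{23} \\ \kappa^P_{31} & 0 & \kappa^P_{33} \end{pmatrix}
\left[ \begin{pmatrix} \theta^P_1 \\ \theta^P_2 \\ \theta^P_3 \end{pmatrix}
- \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt
+ \begin{pmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & \sigma_{23} \\ 0 & 0 & \sigma_{33} \end{pmatrix}
\begin{pmatrix} \sqrt{L_t} & 0 & 0 \\ 0 & \sqrt{1+\beta_{21}L_t+\beta_{23}C_t} & 0 \\ 0 & 0 & \sqrt{C_t} \end{pmatrix}
\begin{pmatrix} dW^{L,P}_t \\ dW^{S,P}_t \\ dW^{C,P}_t \end{pmatrix}.
$$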
For the level factor, the condition $\varepsilon \cdot \theta^Q_1 = \kappa^P_{11}\theta^P_1$ must be satisfied. Furthermore, to keep this
model class arbitrage free, $C_t$ cannot hit the zero-boundary, which is prevented by requiring
that the parameters for the $C_t$-process satisfy the Feller condition under both probability
measures; i.e.,¹⁶
$$
\kappa^P_{31}\theta^P_1 + \kappa^P_{33}\theta^P_3 > \frac{1}{2}\sigma_{33}^2 \quad \text{and} \quad \lambda\theta^Q_3 > \frac{1}{2}\sigma_{33}^2.
$$
Finally, to have a well-defined $C_t$-process, the effect of the level factor on the drift of the curvature
factor must be positive, which we impose with the $\kappa^P_{31} \le 0$ constraint. This condition
implies that the two square-root processes cannot be negatively correlated. To identify this
model class, we fix the $\theta^Q_2$ mean at zero.
The second AFNS specification with two volatility factors allows the slope and curvature
factors to be square-root processes while the level factor remains unconstrained. The factor
dynamics of this AFNS2-SC model under the Q-measure are:
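$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
= \begin{pmatrix} 0 & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix}
\left[ \begin{pmatrix} \theta^Q_1 \\ \theta^Q_2 \\ \theta^Q_3 \end{pmatrix}
- \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt
+ \begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ 0 & \sigma_{22} & 0 \\ 0 & 0 & \sigma_{33} \end{pmatrix}
\begin{pmatrix} \sqrt{1+\beta_{12}S_t+\beta_{13}C_t} & 0 & 0 \\ 0 & \sqrt{S_t} & 0 \\ 0 & 0 & \sqrt{C_t} \end{pmatrix}
\begin{pmatrix} dW^{L,Q}_t \\ dW^{S,Q}_t \\ dW^{C,Q}_t \end{pmatrix}.
$$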
Note that the square-root processes, $S_t$ and $C_t$, are positively correlated through the off-diagonal
element $\kappa^Q_{23} = -\lambda < 0$. Beyond generating their own stochastic volatility, these two
factors induce instantaneous volatility for $L_t$ via the volatility sensitivities, $\beta_{12}$ and $\beta_{13}$.
For the first factor loading in the zero-coupon bond price function, this structure implies
that
B1(t, T ) = −(T − t),
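which preserves the role of the level factor. The next two factor loadings are the unique
solutions to:
$$
\begin{aligned}
\frac{dB^2(t,T)}{dt} &= 1 + \lambda B^2(t,T) - \frac{1}{2}\sigma_{22}^2 B^2(t,T)^2 - \frac{1}{2}\sigma_{12}^2 B^1(t,T)^2
- \sigma_{12}\sigma_{22} B^1(t,T)B^2(t,T) - \frac{1}{2}\beta_{12}\sigma_{11}^2 B^1(t,T)^2, \\
\frac{dB^3(t,T)}{dt} &= -\lambda B^2(t,T) + \lambda B^3(t,T) - \frac{1}{2}\sigma_{33}^2 B^3(t,T)^2 - \frac{1}{2}\sigma_{13}^2 B^1(t,T)^2
- \sigma_{13}\sigma_{33} B^1(t,T)B^3(t,T) - \frac{1}{2}\beta_{13}\sigma_{11}^2 B^1(t,T)^2.
\end{aligned}
$$
¹⁶ For $L_t$, we just need to ensure that the process does not turn negative, which is achieved provided that $\varepsilon \cdot \theta^Q_1 > 0$ and $\kappa^P_{11}\theta^P_1 > 0$.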
The A(t, T )-function in the yield-adjustment term is the solution to
$$
\frac{dA(t,T)}{dt} = -B(t,T)'K^Q\theta^Q - \frac{1}{2}\sigma_{11}^2 B^1(t,T)^2.
$$
Using the extended affine risk premium specification, the maximally flexible affine P -dynamics
can be written as
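$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
= \begin{pmatrix} \kappa^P_{11} & \kappa^P_{12} & \kappa^P_{13} \\ 0 & \kappa^P_{22} & \kappa^P_{23} \\ 0 & \kappa^P_{32} & \kappa^P_{33} \end{pmatrix}
\left[ \begin{pmatrix} \theta^P_1 \\ \theta^P_2 \\ \theta^P_3 \end{pmatrix}
- \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt
+ \begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ 0 & \sigma_{22} & 0 \\ 0 & 0 & \sigma_{33} \end{pmatrix}
\begin{pmatrix} \sqrt{1+\beta_{12}S_t+\beta_{13}C_t} & 0 & 0 \\ 0 & \sqrt{S_t} & 0 \\ 0 & 0 & \sqrt{C_t} \end{pmatrix}
\begin{pmatrix} dW^{L,P}_t \\ dW^{S,P}_t \\ dW^{C,P}_t \end{pmatrix}.
$$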
To keep this class of models arbitrage-free, the slope and curvature factors, St and Ct, must
avoid hitting the zero-boundary. This outcome is ensured by imposing the Feller condition
on their parameters as follows:
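$$
\kappa^P_{22}\theta^P_2 + \kappa^P_{23}\theta^P_3 > \frac{1}{2}\sigma_{22}^2; \quad
\lambda\theta^Q_2 - \lambda\theta^Q_3 > \frac{1}{2}\sigma_{22}^2; \quad
\kappa^P_{33}\theta^P_3 + \kappa^P_{32}\theta^P_2 > \frac{1}{2}\sigma_{33}^2; \quad \text{and} \quad
\lambda\theta^Q_3 > \frac{1}{2}\sigma_{33}^2.
$$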
Furthermore, for St and Ct to be well defined, the sign of the effect they have on each other
must be positive, which we impose using the constraints $\kappa^P_{23} \le 0$ and $\kappa^P_{32} \le 0$. This implies
that the two square-root processes cannot be negatively correlated. Finally, we identify this
class of models by fixing $\theta^Q_1 = 0$, which allows $\theta^P_1$ to vary freely.
4.3 AFNS Models with Three Stochastic Volatility Factors
In the fifth and final AFNS3 specification, all three factors exhibit stochastic volatility. The
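dynamics of $X_t$ are described under the Q-measure as¹⁷
$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
= \begin{pmatrix} \varepsilon & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix}
\left[ \begin{pmatrix} \theta^Q_1 \\ \theta^Q_2 \\ \theta^Q_3 \end{pmatrix}
- \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt
+ \begin{pmatrix} \sigma_{11} & 0 & 0 \\ 0 & \sigma_{22} & 0 \\ 0 & 0 & \sigma_{33} \end{pmatrix}
\begin{pmatrix} \sqrt{L_t} & 0 & 0 \\ 0 & \sqrt{S_t} & 0 \\ 0 & 0 & \sqrt{C_t} \end{pmatrix}
\begin{pmatrix} dW^{L,Q}_t \\ dW^{S,Q}_t \\ dW^{C,Q}_t \end{pmatrix}.
$$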
In this model class, the factor loadings in the zero-coupon bond price function are given by
the unique solution to
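$$
\begin{aligned}
\frac{dB^1(t,T)}{dt} &= 1 + \varepsilon B^1(t,T) - \frac{1}{2}\sigma_{11}^2 B^1(t,T)^2, \\
\frac{dB^2(t,T)}{dt} &= 1 + \lambda B^2(t,T) - \frac{1}{2}\sigma_{22}^2 B^2(t,T)^2, \\
\frac{dB^3(t,T)}{dt} &= -\lambda B^2(t,T) + \lambda B^3(t,T) - \frac{1}{2}\sigma_{33}^2 B^3(t,T)^2,
\end{aligned}
$$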
while the A(t, T )-function in the yield-adjustment term is given by the solution to:
$$
\frac{dA(t,T)}{dt} = -B(t,T)'K^Q\theta^Q.
$$
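The AFNS3 loading ODEs generally have to be solved numerically. The sketch below integrates them with a classical fourth-order Runge-Kutta scheme in time to maturity, $\tau = T - t$, so that each right-hand side is the negative of the $d/dt$ expression above. It assumes the terminal conditions $B(T,T) = 0$ and $A(T,T) = 0$, which are not restated in this section; the function names and parameter values are hypothetical and used only for illustration.

```python
import math


def afns3_rhs(state, lam, eps, sig, theta_q):
    """Derivatives of (B1, B2, B3, A) with respect to tau = T - t.

    Each expression is minus the corresponding d/dt right-hand side in the text.
    """
    b1, b2, b3, _ = state
    s11, s22, s33 = sig
    # K^Q theta^Q for the AFNS3 mean-reversion matrix under Q.
    kq_theta = (eps * theta_q[0], lam * (theta_q[1] - theta_q[2]), lam * theta_q[2])
    db1 = -1.0 - eps * b1 + 0.5 * s11 ** 2 * b1 ** 2
    db2 = -1.0 - lam * b2 + 0.5 * s22 ** 2 * b2 ** 2
    db3 = lam * b2 - lam * b3 + 0.5 * s33 ** 2 * b3 ** 2
    da = b1 * kq_theta[0] + b2 * kq_theta[1] + b3 * kq_theta[2]
    return (db1, db2, db3, da)


def afns3_loadings(tau, lam, eps, sig, theta_q, steps=2000):
    """Integrate from maturity (assumed: all functions zero) out to tau with RK4."""
    state = (0.0, 0.0, 0.0, 0.0)
    h = tau / steps

    def shifted(base, slope, scale):
        return tuple(x + scale * k for x, k in zip(base, slope))

    for _ in range(steps):
        k1 = afns3_rhs(state, lam, eps, sig, theta_q)
        k2 = afns3_rhs(shifted(state, k1, 0.5 * h), lam, eps, sig, theta_q)
        k3 = afns3_rhs(shifted(state, k2, 0.5 * h), lam, eps, sig, theta_q)
        k4 = afns3_rhs(shifted(state, k3, h), lam, eps, sig, theta_q)
        state = tuple(
            x + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state


# Hypothetical parameter values, for illustration only.
b1, b2, b3, a = afns3_loadings(
    tau=10.0, lam=0.5, eps=1e-6, sig=(0.05, 0.1, 0.1), theta_q=(1.0, 0.08, 0.05)
)
print(b1, b2, b3, a)
```

Setting all volatility parameters to zero gives a convenient sanity check, since the ODEs then reduce to linear equations whose solutions are the familiar Nelson-Siegel-type loadings.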
Applying the extended affine risk premium specification, the maximally flexible affine P -
dynamics are given by
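$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
= \begin{pmatrix} \kappa^P_{11} & 0 & 0 \\ \kappa^P_{21} & \kappa^P_{22} & \kappa^P_{23} \\ \kappa^P_{31} & \kappa^P_{32} & \kappa^P_{33} \end{pmatrix}
\left[ \begin{pmatrix} \theta^P_1 \\ \theta^P_2 \\ \theta^P_3 \end{pmatrix}
- \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt
+ \begin{pmatrix} \sigma_{11} & 0 & 0 \\ 0 & \sigma_{22} & 0 \\ 0 & 0 & \sigma_{33} \end{pmatrix}
\begin{pmatrix} \sqrt{L_t} & 0 & 0 \\ 0 & \sqrt{S_t} & 0 \\ 0 & 0 & \sqrt{C_t} \end{pmatrix}
\begin{pmatrix} dW^{L,P}_t \\ dW^{S,P}_t \\ dW^{C,P}_t \end{pmatrix}.
$$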
For $L_t$, the constraint $\varepsilon \cdot \theta^Q_1 = \kappa^P_{11}\theta^P_1$ must be satisfied. The limited risk premium specification
due to the near unit-root property of $L_t$ also implies that $S_t$ and $C_t$ cannot impact the drift
of $L_t$ once $\kappa^Q_{12}$ and $\kappa^Q_{13}$ have been fixed at zero. We need these restrictions in order to match
the Nelson-Siegel factor loading structure as closely as possible.
¹⁷ Note that we again fix $\varepsilon = 10^{-6}$ to approximate the unit-root property imposed in the AFNS0 model.
To keep this model class arbitrage-free, $S_t$ and $C_t$ must not hit their zero lower bounds.
We ensure this by imposing the Feller condition on their parameters under both probability
measures, i.e.,¹⁸
$$
\kappa^P_{21}\theta^P_1 + \kappa^P_{22}\theta^P_2 + \kappa^P_{23}\theta^P_3 > \frac{1}{2}\sigma_{22}^2; \quad
\lambda\theta^Q_2 - \lambda\theta^Q_3 > \frac{1}{2}\sigma_{22}^2; \quad
\kappa^P_{31}\theta^P_1 + \kappa^P_{32}\theta^P_2 + \kappa^P_{33}\theta^P_3 > \frac{1}{2}\sigma_{33}^2; \quad \text{and} \quad
\lambda\theta^Q_3 > \frac{1}{2}\sigma_{33}^2.
$$
Furthermore, to have well-defined processes for $S_t$ and $C_t$, the sign of the effect that the factors
have on each of these two factors must be positive, which we impose with the restrictions
$\kappa^P_{21} \le 0$, $\kappa^P_{23} \le 0$, $\kappa^P_{31} \le 0$, and $\kappa^P_{32} \le 0$. Note that these restrictions imply that the three
square-root processes cannot be negatively correlated.
5 Estimation Methodology
The stochastic volatility models described in the previous section are estimated using the
Kalman filter algorithm. In affine term structure models, zero-coupon yields are affine func-
tions of the state variables such that
$$
y_t(\tau) = -\frac{1}{\tau}B(\tau)'X_t - \frac{1}{\tau}A(\tau) + \varepsilon_t(\tau),
$$
where εt(τ) represents i.i.d. Gaussian white noise measurement errors. The conditional mean
for multi-dimensional affine continuous-time diffusion processes is given by
$$
E^P[X_T | X_t] = \left(I - \exp(-K^P(T-t))\right)\theta^P + \exp(-K^P(T-t))X_t, \tag{4}
$$
where $\exp(-K^P(T-t))$ is a matrix exponential. In general, the conditional covariance matrix
for affine diffusion processes is given by
$$
V^P[X_T | X_t] = \int_t^T \exp(-K^P(T-s))\,\Sigma D(E^P[X_s|X_t])\,D(E^P[X_s|X_t])'\Sigma' \exp(-(K^P)'(T-s))\,ds. \tag{5}
$$
Stationarity of the system under the P -measure is ensured if the real components of all
the eigenvalues of KP are positive, and this condition is imposed in all estimations. For this
reason, we can start the Kalman filter at the unconditional mean and covariance matrix¹⁹
$$
\hat{X}_0 = \theta^P \quad \text{and} \quad \hat{\Sigma}_0 = \int_0^\infty e^{-K^P s}\,\Sigma D(\theta^P)D(\theta^P)'\Sigma'\, e^{-(K^P)'s}\,ds.
$$
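For a single square-root factor with no cross-factor terms, equations (4) and (5) collapse to scalar expressions that can be evaluated in closed form. The sketch below is a simplified one-factor illustration, not the multi-factor calculation used in the paper: it compares the closed-form conditional variance with a midpoint-rule evaluation of the scalar version of the integral in (5). The `sr_` function names and the parameter values are hypothetical.

```python
import math


def sr_cond_mean(x, kappa, theta, dt):
    """Scalar version of equation (4) for one square-root (sr) factor."""
    phi = math.exp(-kappa * dt)
    return (1.0 - phi) * theta + phi * x


def sr_cond_var(x, kappa, theta, sigma, dt):
    """Closed-form scalar version of equation (5), with D(x) D(x)' = x."""
    phi = math.exp(-kappa * dt)
    return (sigma ** 2 / kappa) * (x * (phi - phi ** 2) + 0.5 * theta * (1.0 - phi) ** 2)


def sr_cond_var_numeric(x, kappa, theta, sigma, dt, n=2000):
    """Midpoint-rule evaluation of the same integral, as a cross-check."""
    h = dt / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h                      # elapsed time s - t
        mean_u = sr_cond_mean(x, kappa, theta, u)
        total += math.exp(-2.0 * kappa * (dt - u)) * sigma ** 2 * mean_u * h
    return total


# Hypothetical parameter values, one-month horizon.
x0, kappa, theta, sigma, dt = 0.03, 0.8, 0.02, 0.1, 1.0 / 12.0
print(sr_cond_var(x0, kappa, theta, sigma, dt), sr_cond_var_numeric(x0, kappa, theta, sigma, dt))
```

As the horizon grows, the closed-form variance approaches $\theta\sigma^2/(2\kappa)$, the scalar counterpart of the unconditional covariance matrix used to start the filter.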
¹⁸ For $L_t$, we just need to ensure that the process does not become negative, which is assured if $\varepsilon \cdot \theta^Q_1 > 0$ and $\kappa^P_{11}\theta^P_1 > 0$.
¹⁹ In the estimation, we calculate the conditional and unconditional covariance matrices using the analytical solutions provided in Fisher and Gilles (1996).
However, the introduction of stochastic volatility implies that the factors are no longer
simply Gaussian. We choose to approximate the true probability distribution of the state
variables with the first and second moments and use the Kalman filter algorithm as if the
state variables were Gaussian.²⁰ Thus, the state equation is given by
$$
X_t = \left(I - \exp(-K^P\Delta t)\right)\theta^P + \exp(-K^P\Delta t)X_{t-1} + \eta_t, \quad \eta_t \sim N(0, V_{t-1}),
$$
where ∆t is the time between observations and Vt−1 is the conditional covariance matrix given
in equation (5). However, the discrete nature of the state equation can cause the square-root
processes to become negative despite the fact that the parameter sets are forced to satisfy
Feller conditions and other nonnegativity restrictions. Whenever this happens, we follow the
literature and simply truncate those processes at zero; see Duffee (1999) for an example.
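To make the mechanics concrete, the following sketch shows one predict/update step of a Kalman filter for a single square-root factor treated as if it were Gaussian. It is a deliberately simplified scalar illustration under stated assumptions: one factor, one observed yield with loading `b` and intercept `a` (corresponding to $-B(\tau)/\tau$ and $-A(\tau)/\tau$), and truncation applied to the filtered state. The source does not specify at which point of the recursion the truncation is applied, so that placement, the function name, and all inputs are assumptions.

```python
import math


def kalman_step(x_prev, p_prev, y, a, b, h, kappa, theta, sigma, dt):
    """One step for a scalar square-root factor under a Gaussian approximation.

    Observation equation (assumed scalar): y = a + b * x + noise, noise ~ N(0, h).
    """
    phi = math.exp(-kappa * dt)
    # Prediction: exact conditional mean and variance of the square-root factor.
    x_pred = (1.0 - phi) * theta + phi * x_prev
    v_prev = (sigma ** 2 / kappa) * (
        x_prev * (phi - phi ** 2) + 0.5 * theta * (1.0 - phi) ** 2
    )
    p_pred = phi ** 2 * p_prev + v_prev
    # Update with the observed yield.
    innov = y - (a + b * x_pred)
    f = b ** 2 * p_pred + h
    gain = p_pred * b / f
    x_filt = x_pred + gain * innov
    p_filt = (1.0 - gain * b) * p_pred
    # Assumed placement: truncate the filtered state at zero if it turns negative.
    x_filt = max(x_filt, 0.0)
    # Gaussian log-likelihood contribution of this observation.
    loglik = -0.5 * (math.log(2.0 * math.pi) + math.log(f) + innov ** 2 / f)
    return x_filt, p_filt, loglik


# Hypothetical inputs, for illustration only.
print(kalman_step(0.02, 1e-4, 0.025, 0.0, 1.0, 1e-6, 0.8, 0.02, 0.1, 1.0 / 12.0))
```

Summing the log-likelihood contributions over all dates gives the objective that a quasi-maximum likelihood routine would maximize.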
In the Kalman filter estimations, the error structure is given by
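$$
\begin{pmatrix} \eta_t \\ \varepsilon_t \end{pmatrix} \sim
N\left[ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} V_{t-1} & 0 \\ 0 & H \end{pmatrix} \right],
$$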
where H is assumed to be a diagonal matrix of the measurement error standard deviations,
σε, that are specific to each yield maturity when we perform estimations with the Treasury
yield data described in Section 2, while σε is assumed to be uniform for all yield maturities
in the simulated yield samples as discussed below. The linear least-squares optimality of the
Kalman filter requires that the white noise transition and measurement errors be orthogonal
to the initial state; i.e., $E[f_0\eta_t'] = 0$ and $E[f_0\varepsilon_t'] = 0$. Finally, the standard deviations of the
estimated parameters are calculated as
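$$
\Sigma(\hat{\psi}) = \frac{1}{T}\left[ \frac{1}{T}\sum_{t=1}^{T} \frac{\partial \log l_t(\hat{\psi})}{\partial \psi}\, \frac{\partial \log l_t(\hat{\psi})}{\partial \psi}' \right]^{-1},
$$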
where ψ̂ denotes the optimal parameter set.
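The covariance formula above is an outer-product-of-gradients estimator built from per-observation scores. The sketch below applies it to a toy problem, an i.i.d. Gaussian sample with parameters (mean, variance), rather than to the Kalman filter likelihood; the two-parameter restriction, the function names, and the data are assumptions made to keep the example short and checkable by hand.

```python
import math


def opg_std_errors(scores):
    """Standard errors from per-observation score vectors (two parameters).

    Computes (1/T) * [ (1/T) * sum_t g_t g_t' ]^{-1} and returns the square
    roots of its diagonal elements.
    """
    t = len(scores)
    m11 = sum(g[0] * g[0] for g in scores) / t
    m12 = sum(g[0] * g[1] for g in scores) / t
    m22 = sum(g[1] * g[1] for g in scores) / t
    det = m11 * m22 - m12 ** 2
    inv11, inv22 = m22 / det, m11 / det      # diagonal of the 2x2 inverse
    return math.sqrt(inv11 / t), math.sqrt(inv22 / t)


def gaussian_scores(data, mu, var):
    """Scores of log N(x; mu, var) with respect to (mu, var)."""
    return [
        ((x - mu) / var, -0.5 / var + 0.5 * (x - mu) ** 2 / var ** 2)
        for x in data
    ]


data = [-2.0, -1.0, 0.0, 1.0, 2.0]
mu_hat = sum(data) / len(data)
var_hat = sum((x - mu_hat) ** 2 for x in data) / len(data)
se_mu, se_var = opg_std_errors(gaussian_scores(data, mu_hat, var_hat))
print(se_mu, se_var)
```

For this symmetric sample the cross term vanishes, so the standard error of the mean reduces to the textbook value $\sqrt{\hat{\sigma}^2 / T}$.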
6 Simulation study
To study the efficiency of the Kalman filter in estimating affine term structure models with
and without stochastic volatility, we undertake a carefully orchestrated simulation study,
the details of which are provided below.
²⁰ A few notable examples of papers that follow this approach include Duffee (1999), Driessen (2005), Feldhütter and Lando (2008), and Christensen et al. (2015).
First, we search for a realistic parameter set for each AFNSi model class to use in the
simulations. From CDR it follows that neither maximally flexible models nor parsimonious
independent-factors models appear to reflect the true dynamics of the state variables: the
former perform poorly out of sample, and the latter are counterfactual in that the state variables
do appear to be correlated. For that reason we look for parsimonious specifications in between
these two extremes. For each model class, we go through a general-to-specific model selection
procedure using the Bayesian Information Criterion defined as
$$
BIC(k) = -2\log L + k\log T,
$$
where k is the number of estimated parameters, while T is the number of observations in the
data. As described in Section 2, our data sample contains T = 1,101 weekly observations.
Since CDR report limited gains in terms of forecasting performance from allowing for flexible
specifications of the volatility matrix Σ, we restrict this matrix to be diagonal throughout.
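The criterion is straightforward to evaluate. As a small sketch, the code below applies the definition to three of the (log likelihood, parameter count) pairs reported in Table 3 with T = 1,101; the function name `bic` and the dictionary layout are illustrative choices.

```python
import math


def bic(log_lik: float, k: int, n_obs: int) -> float:
    """Bayesian Information Criterion: -2 log L + k log T."""
    return -2.0 * log_lik + k * math.log(n_obs)


# (log L, k) pairs as reported in Table 3 for specifications (1), (6) and (7).
specs = {"(1)": (51042.41, 24), "(6)": (51035.98, 19), "(7)": (51015.27, 18)}
for name, (log_lik, k) in specs.items():
    print(name, round(bic(log_lik, k, 1101), 1))
```

The specification with the lowest value is preferred, which is how specification (6) is selected in Table 3.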
Based on the estimated parameters from the preferred specification for each model class,
we perform two sets of simulations. In the first, we simulate N = 1,000 sample paths for the
three state variables observed at a monthly frequency over a ten-year period. In the other,
we repeat this, but simulate over a forty-year period.²¹
In a second step, these simulated factor paths are converted into simulated zero-coupon
yields observed at a monthly frequency with the following eight maturities, 0.25, 0.5, 1, 2, 3,
5, 7, and 10 years. Finally, a Gaussian i.i.d. measurement error is added to each bond yield.
To study the role, if any, of the data quality, we consider two values for the measurement
error standard deviation, σε. In one simulated data sample, this standard deviation is fixed
uniformly at 1 basis point; in the other data sample it is fixed uniformly at 10 basis points,
which is at the upper end of the noise we observe in the Treasury yield data. In order to
make the results as comparable as possible across model classes, the simulated measurement
errors are kept the same; that is, the simulated measurement errors are the same for the
ten- and forty-year samples, respectively, independent of the model class being simulated and
independent of the size of the measurement error standard deviation.
We now turn to the details of the simulation of the factor paths. The continuous-time
P -dynamics are, in general, given by
$$
dX_t = K^P(\theta^P - X_t)\,dt + \Sigma D(X_t)\,dW^P_t.
$$
For both restricted square-root processes and unconstrained processes we approximate the
continuous-time process using the Euler approximation.²² To exemplify, for a restricted
square-root process,
$$
dX^i_t = \kappa^P_{ii}(\theta^P_i - X^i_t)\,dt + \kappa^P_{ij}(\theta^P_j - X^j_t)\,dt + \sigma_{ii}\sqrt{X^i_t}\,dW^{P,i}_t,
$$
the algorithm is
$$
X^i_t = X^i_{t-1} + \kappa^P_{ii}(\theta^P_i - X^i_{t-1})\Delta t + \kappa^P_{ij}(\theta^P_j - X^j_{t-1})\Delta t + \sigma_{ii}\sqrt{X^i_{t-1}}\sqrt{\Delta t}\, z^i_t, \quad z^i_t \sim N(0,1).
$$
²¹ For the Gaussian AFNS0 model class we also take out weekly observations from the simulated paths. The results presented later show that increasing the sampling frequency does not materially alter any of the results. For that reason we do not analyze weekly samples for the non-Gaussian AFNS model classes.
²² Thompson (2008) is an example.
We fix ∆t at a uniform value of 0.0001, which is equivalent to approximately 27 shocks per day
to each process through the Brownian motion. As Feller conditions and other non-negativity
requirements are imposed in the estimations performed with the observed Treasury yields, the
parameter sets used in the simulations satisfy all non-negativity requirements, so the “true”
underlying continuous-time process never becomes negative P -a.s. However, for the discretely
observed process above there is always a positive, but usually very small, probability that
the approximation will become negative. Whenever this happens, we truncate the simulated
square-root processes at 0 similar to what we do in the model estimations.
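The Euler scheme with truncation at zero can be sketched in a few lines. The version below simulates a single square-root factor and omits the cross-factor drift term $\kappa^P_{ij}(\theta^P_j - X^j_{t-1})\Delta t$ for brevity; that simplification, the function name, the seed handling, and the parameter values are assumptions made for illustration.

```python
import math
import random


def simulate_sqrt_euler(x0, kappa, theta, sigma, years, dt=0.0001, seed=0):
    """Euler path of a single square-root process, truncated at zero."""
    rng = random.Random(seed)
    n_steps = int(round(years / dt))
    sqrt_dt = math.sqrt(dt)
    x = x0
    path = [x0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        x = x + kappa * (theta - x) * dt + sigma * math.sqrt(x) * sqrt_dt * z
        x = max(x, 0.0)  # truncate at zero if the discretized step goes negative
        path.append(x)
    return path


# Hypothetical parameter values; one simulated year at the step size used in the text.
path = simulate_sqrt_euler(x0=0.02, kappa=0.8, theta=0.02, sigma=0.1, years=1.0)
steps_per_day = 1.0 / (0.0001 * 365.0)  # roughly 27 shocks per calendar day
print(len(path), min(path), round(steps_per_day, 1))
```

Monthly or weekly observations would then be obtained by sampling the fine path at the desired frequency.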
As for the starting point of the simulation algorithm, X0, we ideally want to draw it
from the unconditional joint distribution of the three state variables. However, with the
exception of the Gaussian AFNS0 model, we do not know the unconditional distribution
of Xt = (Lt, St, Ct). To overcome this problem, we take the estimated value of the three
state variables at the end of the observed Treasury yield sample and simulate the three state
variables according to the algorithm above for 100 years and repeat this 1,000 times. This
effectively gives us random draws from the joint unconditional distribution of Xt = (Lt, St, Ct).
These starting values are identical for both the ten- and forty-year simulated samples within
each model class, again in an attempt to make the results as comparable as possible.
In the final step, we use the 1,000 simulated samples from each exercise as input into a
corresponding number of Kalman filter estimations where we use the true parameters as the
starting point for each optimization. Since we are estimating the true model in each case,
this provides us with a clean read of the properties of the Kalman filter as an estimator, not
impacted by any errors related to model misspecification.
7 Results for the Gaussian AFNS0 Model
In this section, we describe our estimation results based on the simulated data of the Gaussian
AFNS0 model that serves as the benchmark in our analysis. For this model class, the Kalman
filter is a consistent and efficient estimator equivalent to exact maximum likelihood estimation.
This allows us to study whether there is any finite-sample bias in the estimated parameters.
Due to the efficiency of the Kalman filter, such finite-sample bias will affect any estimator.
Hence, these results provide an ideal background for understanding the bias in Kalman filter-
based estimations of non-Gaussian AFNS models with stochastic volatility.
To begin, the result of the model selection for the Gaussian AFNS0 model is reported in
Table 3. The statistics in the table show that the preferred specification according to the
Bayesian Information Criterion has P -dynamics given by
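$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
= \begin{pmatrix} \kappa^P_{11} & 0 & 0 \\ 0 & \kappa^P_{22} & \kappa^P_{23} \\ 0 & 0 & \kappa^P_{33} \end{pmatrix}
\left[ \begin{pmatrix} \theta^P_1 \\ \theta^P_2 \\ \theta^P_3 \end{pmatrix}
- \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt
+ \begin{pmatrix} \sigma_{11} & 0 & 0 \\ 0 & \sigma_{22} & 0 \\ 0 & 0 & \sigma_{33} \end{pmatrix}
\begin{pmatrix} dW^{L,P}_t \\ dW^{S,P}_t \\ dW^{C,P}_t \end{pmatrix}.
$$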
Alternative                              Goodness-of-fit statistics
specifications                           logL        k    p-value     BIC
(1) Unrestricted K^P                     51,042.41   24   n.a.        -101,916.7
(2) κ^P_12 = 0                           51,042.40   23   0.8875      -101,923.7
(3) κ^P_12 = κ^P_32 = 0                  51,042.40   22   0.8875      -101,930.7
(4) κ^P_12 = κ^P_32 = κ^P_31 = 0         51,042.23   21   0.5598      -101,937.4
(5) κ^P_31 = ... = κ^P_21 = 0            51,037.57   20   0.0023      -101,935.1
(6) κ^P_31 = ... = κ^P_13 = 0            51,035.98   19   0.0745      -101,938.9
(7) κ^P_31 = ... = κ^P_23 = 0            51,015.27   18   < 0.0001    -101,904.5

Table 3: Evaluation of Alternative Specifications of the AFNS0 Model.
There are seven alternative estimated specifications of the AFNS0 model with constant volatility. Each
specification is listed with its maximum log likelihood (logL), number of parameters (k), the p-value
from a likelihood ratio test of the hypothesis that it differs from the specification above with one more
free parameter, and the Bayesian information criterion (BIC).
K^P         K^P_{·,1}    K^P_{·,2}    K^P_{·,3}    θ^P          Σ
K^P_{1,·}   0.03943      0            0            0.07242      Σ_{1,·}   0.00570
            (0.07332)                              (0.01703)              (0.00009)
K^P_{2,·}   0            0.43102      -0.69198     -0.03173     Σ_{2,·}   0.00888
                         (0.11962)    (0.08121)    (0.01271)              (0.00020)
K^P_{3,·}   0            0            0.83341      -0.01873     Σ_{3,·}   0.02728
                                      (0.22767)    (0.00676)              (0.00047)

Table 4: Parameter Estimates for the Preferred AFNS0 Model.
The estimated parameters of the K^P-matrix, the θ^P-vector, and the Σ-matrix for the preferred AFNS0
model according to the Bayesian Information Criterion are shown. The Q-related parameter is λ =
0.53650 (0.00363). The numbers in parentheses are the estimated standard deviations of the parameter
estimates. The maximum log likelihood value is 51,035.98.
The estimated dynamic parameters for this specification are reported in Table 4. Relative
to the unrestricted model, the likelihood ratio test statistic for the five restrictions imposed jointly in the
preferred specification is
$$
LR_{BIC} = 2[51{,}042.41 - 51{,}035.98] = 12.86 \sim \chi^2(5).
$$
The probability of observing at least 12.86 with five degrees of freedom is 0.0247. Thus,
the five restrictions are not jointly supported by the data at the 5% level, but they are not
overwhelmingly rejected either.
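The tail probability quoted above can be reproduced without a statistics library. The sketch below uses the standard closed-form expression for the upper tail of a chi-square distribution with an odd number of degrees of freedom; the function name `chi2_sf_odd` is an illustrative choice, and in practice a library routine for the chi-square survival function would serve the same purpose.

```python
import math


def chi2_sf_odd(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with odd df."""
    if df < 1 or df % 2 != 1:
        raise ValueError("df must be a positive odd integer")
    tail = math.erfc(math.sqrt(x / 2.0))                       # df = 1 part
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)   # first series term
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)                                # next series term
    return tail + total


lr_stat = 2.0 * (51042.41 - 51035.98)   # log likelihoods from Table 3
p_value = chi2_sf_odd(lr_stat, df=5)
print(round(lr_stat, 2), round(p_value, 4))
```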
In terms of the estimated parameters reported in Table 4 that are used in the simulations
of the AFNS0 model, we note the usual pattern that the level factor is the most persistent
and least volatile factor, the curvature is the most volatile and least persistent factor, and the
slope factor has dynamic properties in between those two extremes. Finally, the estimated
value of λ is close to 0.5, which is a typical value for this parameter.
Ten-year samples, σε = 1 bp
Parameter True      Mean      Std. dev.  5%        1st quartile  Median    3rd quartile  95%
κ^P_11    0.03943   0.50335   0.41496    0.07704   0.20025       0.39116   0.67401       1.2888
κ^P_22    0.43102   0.61752   0.28653    0.25449   0.42644       0.56905   0.74371       1.1264
κ^P_23    -0.69198  -0.76423  0.20026    -1.1174   -0.88497      -0.74995  -0.63269      -0.46515
κ^P_33    0.83341   1.2243    0.57056    0.53381   0.82011       1.1043    1.4970        2.2933
σ_11      0.00570   0.00571   0.00024    0.00533   0.00555       0.00570   0.00587       0.00611
σ_22      0.00888   0.00882   0.00058    0.00782   0.00842       0.00880   0.00924       0.00973
σ_33      0.02728   0.02747   0.00178    0.02454   0.02625       0.02752   0.02866       0.03038
θ^P_1     0.07242   0.07273   0.01802    0.04275   0.06020       0.07265   0.08519       0.10222
θ^P_2     -0.03173  -0.03199  0.01407    -0.05503  -0.04109      -0.03195  -0.02253      -0.00902
θ^P_3     -0.01873  -0.01903  0.00859    -0.03407  -0.02487      -0.01863  -0.01319      -0.00519
λ         0.53650   0.53635   0.00322    0.53134   0.53415       0.53642   0.53838       0.54172
σε        0.00010   0.00010   0.00000    0.00010   0.00010       0.00010   0.00010       0.00010

Ten-year samples, σε = 10 bps
Parameter True      Mean      Std. dev.  5%        1st quartile  Median    3rd quartile  95%
κ^P_11    0.03943   0.53931   0.47879    0.08383   0.21206       0.39589   0.70937       1.4538
κ^P_22    0.43102   0.62201   0.29042    0.24693   0.41999       0.57518   0.75535       1.1663
κ^P_23    -0.69198  -0.77180  0.20727    -1.1374   -0.89486      -0.75441  -0.63236      -0.46117
κ^P_33    0.83341   1.2275    0.58183    0.51673   0.81498       1.1118    1.4986        2.3486
σ_11      0.00570   0.00583   0.00067    0.00474   0.00539       0.00585   0.00627       0.00691
σ_22      0.00888   0.00871   0.00079    0.00749   0.00814       0.00870   0.00927       0.00996
σ_33      0.02728   0.02755   0.00263    0.02331   0.02580       0.02750   0.02936       0.03191
θ^P_1     0.07242   0.07281   0.01805    0.04235   0.06009       0.07300   0.08522       0.10207
θ^P_2     -0.03173  -0.03199  0.01412    -0.05512  -0.04133      -0.03202  -0.02258      -0.00862
θ^P_3     -0.01873  -0.01917  0.00868    -0.03430  -0.02478      -0.01885  -0.01323      -0.00548
λ         0.53650   0.53714   0.02259    0.50028   0.52241       0.53714   0.55097       0.57385
σε        0.00100   0.00100   0.00003    0.00096   0.00098       0.00100   0.00102       0.00104

Table 5: Summary Statistics of Estimated Parameters from Simulated Ten-Year Monthly Samples of the Preferred AFNS0 Model.
The table reports the summary statistics of the estimation results from N = 1,000 simulated data
sets of the preferred AFNS0 model, each with a length of ten years and a uniform measurement error
standard deviation of σε = 1 basis point and σε = 10 basis points, respectively.
7.1 Analysis of Ten-Year Monthly Samples
The summary statistics from the 1,000 estimations based on simulated ten-year monthly
samples of the preferred specification of the AFNS0 model are reported in Table 5. We note
that there is an upward bias in the absolute size of all four mean-reversion parameters:
the three positive parameters on the diagonal of KP have means and medians well above
their true values, while the negative off-diagonal element, κP23, has a mean and median
below its true value. Hence, there is notable finite-sample bias in the estimates of these
parameters. In particular, the near unit-root property of the Nelson-Siegel level factor is
causing the estimator significant difficulty. More than 95% of the estimates of κP11 are above
0.077 despite its true value of only 0.039. These results show that a near unit-root process
can come across as very persistent as well as rather quickly mean-reverting in samples of short
length such as the ten-year samples analyzed here. Figure 2 provides the visual representation
of the estimated mean-reversion parameters across the 1,000 samples. We note that they have
notably skewed distributions, partly as a consequence of the imposed stationarity.

[Figure 2: four panels, (a) κ^P_11, (b) κ^P_22, (c) κ^P_23, and (d) κ^P_33. Horizontal axis: Estimation No. (0 to 1000); vertical axis: Parameter estimate.]
Figure 2: Estimated Mean-Reversion Parameters from Simulated Ten-Year Monthly Samples of the Preferred AFNS0 Model.
Illustration of the estimated mean-reversion parameters in the K^P matrix from N = 1,000 simulated
data sets of the preferred AFNS0 model, each with a length of ten years sampled monthly and a uniform
measurement error standard deviation of σε = 10 basis points. The true value of each parameter
is indicated with a horizontal solid grey line.
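The direction of this bias can be illustrated with a much simpler experiment than the one in the paper. The sketch below simulates a directly observed Gaussian mean-reverting factor at a monthly frequency over ten years and estimates the mean-reversion rate by least squares on the implied AR(1) regression. This swaps in OLS on an observed factor for the Kalman filter maximum likelihood estimation used in the paper, does not impose stationarity, and maps the AR(1) coefficient to a mean-reversion rate with the first-order rule $(1 - \hat{\phi})/\Delta t$; all function names are hypothetical, and the parameter values are borrowed from the level factor in Table 4.

```python
import math
import random


def simulate_ou_monthly(kappa, theta, sigma, n_obs, rng):
    """Exact monthly discretization of a Gaussian mean-reverting process."""
    dt = 1.0 / 12.0
    phi = math.exp(-kappa * dt)
    cond_sd = sigma * math.sqrt((1.0 - phi ** 2) / (2.0 * kappa))
    x = rng.gauss(theta, sigma / math.sqrt(2.0 * kappa))  # unconditional draw
    path = [x]
    for _ in range(n_obs - 1):
        x = theta + phi * (x - theta) + cond_sd * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def ols_kappa(path):
    """AR(1) slope by OLS with intercept, mapped to (1 - phi_hat) / dt."""
    dt = 1.0 / 12.0
    lagged, current = path[:-1], path[1:]
    n = len(lagged)
    mean_l = sum(lagged) / n
    mean_c = sum(current) / n
    cov = sum((a - mean_l) * (b - mean_c) for a, b in zip(lagged, current))
    var = sum((a - mean_l) ** 2 for a in lagged)
    return (1.0 - cov / var) / dt


def bias_experiment(kappa=0.03943, theta=0.07242, sigma=0.00570,
                    n_obs=120, n_rep=1000, seed=1):
    """Mean and median of the estimated mean-reversion rate across samples."""
    rng = random.Random(seed)
    estimates = sorted(
        ols_kappa(simulate_ou_monthly(kappa, theta, sigma, n_obs, rng))
        for _ in range(n_rep)
    )
    return sum(estimates) / n_rep, estimates[n_rep // 2]


mean_kappa, median_kappa = bias_experiment()
print(mean_kappa, median_kappa)
```

Even in this stripped-down setting, the average estimate lies well above the true value of 0.03943, which is consistent with the upward bias documented in Table 5.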
Turning to the three volatility parameters in the Σ matrix, we note that they are well
determined with almost identical means and medians, both close to the true values, and the
standard deviations of their estimates are also small. Importantly, though, their accuracy
is sensitive to the quality of the data as a low value of σε decreases the dispersion of their
estimated values. This result applies to all three factors, and it suggests that the values of the
volatility parameters are determined to a large extent from their impact on the cross-sectional
fit of yields rather than from the time series properties of the state variables, which are the
same in the simulated data by construction and independent of the value of σε.
The mean parameters under the P -measure, θP , represent the opposite case. Due to the
flexibility of the essentially affine risk premium specification within the Gaussian models,
these parameters play no role for the Q-dynamics and, by implication, have no effect on the
cross-sectional fit of the model. As a consequence, their estimated values are purely derived
from the time series properties of the state variables and their distributions are independent
of the level of noise in the yield data. Furthermore, they are estimated without any detectable
bias, and the standard deviation of their estimated values is also relatively modest, but larger
the more persistent the factor in question is.
Focusing on the estimates of λ, Table 5 shows that this parameter is well determined
in the estimation with a small standard deviation. The range from the 5th to the 95th percentile
of its estimates is (0.500, 0.574) for the case with a measurement error standard deviation of
10 basis points, and an even narrower range of (0.531, 0.542) when we reduce the standard deviation of
the measurement noise to 1 basis point. Since λ only affects the risk-neutral Q-dynamics,
it is exclusively determined from the cross section of yields and therefore sensitive to the
quality of the data. Still, variation in the values of λ in the ranges above does not alter the
cross-sectional fit of the model by much. Thus, its statistical uncertainty is largely without
economic consequences.
Finally, the estimates of the measurement error standard deviation exhibit very little
variation across the simulated samples. However, as noted, their size affects the accuracy of
the three volatility parameters and λ. This supports the conjecture put forward by CDR
that the elements in the volatility matrix in the AFNS0 model are determined primarily in
order to deliver the best possible fit to the cross section of yields rather than matching the
actual volatility correlation structure among the three state variables. On the other hand,
the properties of the estimates of the elements in the mean-reversion matrix KP and the
mean vector θP are essentially unaffected by the size of σε as these parameters reflect the
time-series dynamics of the three state variables and their values have no consequences for
the bond yield function fitted to the cross section of observed yields.
In addition to studying the finite-sample properties of the estimated parameters, we are
also interested in knowing to what extent the parameter standard deviations estimated from
the optimized likelihood function in the Kalman filter are reliable in the sense that they reflect
the variation in the estimated parameters across the 1,000 simulated samples. In this exercise,
we hence use the empirical standard deviation of the 1,000 estimates of each parameter as
a proxy for the true, unobserved standard deviation of the estimated parameters.²³ Table 6
contains the summary statistics for the monthly ten-year samples.
²³ One potential caveat here is that the estimated parameters, the K^P parameters in particular, follow asymmetric distributions that are not necessarily well summarized by the standard deviation.

Ten-year samples, σε = 1 bp
Parameter
std. dev.   "True"    Mean      Std. dev.  5%        1st quartile  Median    3rd quartile  95%
σ(κ^P_11)   0.41496   0.32510   0.13282    0.15106   0.22730       0.30352   0.39944       0.57859
σ(κ^P_22)   0.28653   0.25756   0.09782    0.14000   0.18904       0.23983   0.30301       0.44689
σ(κ^P_23)   0.20026   0.20055   0.05054    0.12971   0.16252       0.19487   0.22862       0.28981
σ(κ^P_33)   0.57056   0.55056   0.14534    0.34405   0.44860       0.53131   0.63680       0.80056
σ(σ_11)     0.00024   0.00026   0.00003    0.00021   0.00023       0.00025   0.00028       0.00032
σ(σ_22)     0.00058   0.00065   0.00008    0.00052   0.00059       0.00065   0.00070       0.00077
σ(σ_33)     0.00178   0.00189   0.00022    0.00156   0.00174       0.00187   0.00203       0.00228
σ(θ^P_1)    0.01802   0.00586   0.00471    0.00140   0.00256       0.00434   0.00778       0.01526
σ(θ^P_2)    0.01407   0.01398   0.00860    0.00473   0.00812       0.01201   0.01742       0.03048
σ(θ^P_3)    0.00859   0.00878   0.00460    0.00399   0.00597       0.00780   0.01055       0.01641
σ(λ)        0.00322   0.00332   0.00076    0.00221   0.00277       0.00324   0.00376       0.00470
σ(σε)       0.00000   0.00000   0.00000    0.00000   0.00000       0.00000   0.00000       0.00000

Ten-year samples, σε = 10 bps
Parameter
std. dev.   "True"    Mean      Std. dev.  5%        1st quartile  Median    3rd quartile  95%
σ(κ^P_11)   0.47879   0.36140   0.18019    0.15427   0.23479       0.32578   0.44125       0.69790
σ(κ^P_22)   0.29042   0.26323   0.10357    0.13906   0.19128       0.24009   0.30903       0.46648
σ(κ^P_23)   0.20727   0.20818   0.05662    0.13038   0.16588       0.20025   0.24281       0.31129
σ(κ^P_33)   0.58183   0.58518   0.17083    0.35368   0.45967       0.56241   0.67907       0.89759
σ(σ_11)     0.00067   0.00073   0.00008    0.00060   0.00067       0.00072   0.00078       0.00087
σ(σ_22)     0.00079   0.00086   0.00010    0.00071   0.00079       0.00086   0.00092       0.00102
σ(σ_33)     0.00263   0.00287   0.00035    0.00232   0.00262       0.00285   0.00309       0.00348
σ(θ^P_1)    0.01805   0.00585   0.00463    0.00142   0.00256       0.00430   0.00768       0.01525
σ(θ^P_2)    0.01412   0.01413   0.00891    0.00461   0.00816       0.01213   0.01735       0.02998
σ(θ^P_3)    0.00868   0.00895   0.00493    0.00407   0.00589       0.00784   0.01071       0.01666
σ(λ)        0.02259   0.02320   0.00525    0.01558   0.01944       0.02261   0.02644       0.03220
σ(σε)       0.00003   0.00003   0.00000    0.00003   0.00003       0.00003   0.00003       0.00003

Table 6: Summary Statistics of Estimated Parameter Standard Deviations from Simulated Ten-Year Monthly Samples of the Preferred AFNS0 Model.
The table reports the summary statistics of the estimated parameter standard deviations from N =
1,000 simulated data sets of the preferred AFNS0 model, each with a length of ten years and a uniform
measurement error standard deviation of σε = 1 basis point and σε = 10 basis points, respectively.
The parameter standard deviations we calculate from the optimized likelihood function are
reasonably accurate for the parameters without bias: $\sigma_{11}$, $\sigma_{22}$, $\sigma_{33}$, $\theta^P_2$, $\theta^P_3$, $\lambda$, and $\sigma_\varepsilon$.
Even for the parameters with a modest bias, $\kappa^P_{22}$, $\kappa^P_{23}$, and $\kappa^P_{33}$, the estimated parameter
standard deviations are relatively close to, but slightly below, the actual variation in the
estimated parameters. Finally, for $\kappa^P_{11}$ and $\theta^P_1$, there is a more severe downward bias in the
estimated parameter standard deviations relative to the actual variation in the parameter
estimates. Overall, the conclusion is that the standard deviations obtained from the Kalman
filter underestimate the true variation for the parameters with bias. This will make these
parameters look more significant than they actually are. This problem is particularly severe
for the estimated parameters in the mean-reversion matrix KP as their point estimates are
notably upward biased to begin with. This makes model selection and validation extremely
treacherous when one or more of the state variables are highly persistent. Unfortunately,
this is not an issue that can be neglected since it is the specification of KP that determines
a model's forecast performance and term premium decomposition as discussed in detail in
Bauer et al. (2012).

[Figure 3: three panels, (a) Correlation of Lt and St, (b) Correlation of Lt and Ct, and (c) Correlation of St and Ct. Horizontal axis: Estimation No. (0 to 1000); vertical axis: Estimated correlation coefficient (-1.0 to 1.0).]
Figure 3: Pairwise Correlations of Estimated Factor Paths from Simulated Ten-Year Monthly Samples of the Preferred AFNS0 Model.
Illustration of the correlations between the estimated paths of the three state variables in N = 1,000
simulated data sets of the preferred AFNS0 model, each with a length of ten years and a uniform
measurement error standard deviation of σε = 10 basis points. Horizontal solid grey lines indicate the
factor correlations in the true unconditional distribution.
Figure 3 shows the correlations between the estimated factor paths across the 1,000 samples.
We note that, in short ten-year samples, factor path correlations are not a reliable guide
to “spotting” the appropriate dynamic relationship between the factors in multi-dimensional
models of the yield curve. Because of the lack of mean reversion in the level factor, almost
any level of correlation can be observed, even though the level factor is entirely independent
of the two other factors within the simulated model. Furthermore, even for the slope and
curvature factors, which are strongly positively correlated within the simulated model, the
observed correlation can be low, and even negative, with non-trivial probability.
To end the analysis of the ten-year monthly samples, we analyze the accuracy of the
filtering of the state variables. Table 7 reports the mean absolute difference between the
simulated factor paths and the estimated factor paths from the Kalman filter. For the level
and slope factors, the absolute filtered error is close to the size of σε, which represents
the noise in the data. This might be due to the fact that they affect yields one-for-one at
their maximum loading in the yield function. For the curvature factor, the absolute filtered
error tends to be slightly more than three times the size of σε since its maximum
loading in the yield function is barely 0.3.
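
The loading argument can be checked numerically. The sketch below assumes the standard Nelson-Siegel factor loadings (which this section does not restate) and uses the true value of λ reported in Table 8; the function names and the maturity grid are illustrative choices, not part of the paper's code.

```python
import numpy as np

LAMBDA = 0.53650  # true value of lambda reported in Table 8


def slope_loading(maturity, lam=LAMBDA):
    """Standard Nelson-Siegel slope loading: (1 - exp(-lam*tau)) / (lam*tau)."""
    x = lam * np.asarray(maturity, dtype=float)
    return (1.0 - np.exp(-x)) / x


def curvature_loading(maturity, lam=LAMBDA):
    """Standard Nelson-Siegel curvature loading: slope loading minus exp(-lam*tau)."""
    x = lam * np.asarray(maturity, dtype=float)
    return (1.0 - np.exp(-x)) / x - np.exp(-x)


# Illustrative maturity grid in years (an assumption, not the paper's maturities).
maturities = np.linspace(0.01, 30.0, 30000)
curvature = curvature_loading(maturities)
peak = int(np.argmax(curvature))

print(f"slope loading at the shortest maturity: {slope_loading(maturities[0]):.4f}")
print(f"maximum curvature loading: {curvature[peak]:.4f} "
      f"at a maturity of {maturities[peak]:.2f} years")
print(f"reciprocal of the maximum curvature loading: {1.0 / curvature[peak]:.2f}")
```

Under these assumptions the slope loading approaches one at short maturities, while the curvature loading peaks just below 0.3, so its reciprocal is roughly 3.3. This is in line with the curvature error being slightly more than three times σε.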
Mean absolute fitted error, ten-year samples, σε = 1 bp
State variable  Mean   Std. dev.  5 percentile  1st quartile  Median  3rd quartile  95 percentile
Lt              2.16   0.77       1.48          1.66          1.87    2.39          3.79
St              2.01   0.81       1.30          1.47          1.69    2.27          3.78
Ct              4.89   0.48       4.21          4.55          4.83    5.13          5.77

Mean absolute fitted error, ten-year samples, σε = 10 bps
State variable  Mean   Std. dev.  5 percentile  1st quartile  Median  3rd quartile  95 percentile
Lt              11.77  1.59       9.56          10.70         11.52   12.64         14.51
St              11.42  1.66       9.37          10.31         11.13   12.17         14.57
Ct              34.66  3.25       29.87         32.48         34.42   36.56         40.35
Table 7: Summary Statistics of Mean Absolute Fitted Errors of the Filtered State Variables from Simulated Ten-Year Monthly Samples of the Preferred AFNS0 Model.
The table reports the summary statistics of the mean absolute fitted error of the three state variables
from N = 1,000 simulated data sets of the preferred AFNS0 model, each with a length of ten years
and a uniform measurement error standard deviation of σε = 1 basis point and σε = 10 basis points,
respectively. All numbers are measured in basis points.
7.2 Analysis of Forty-Year Monthly Samples
In this section, we analyze the results obtained for the forty-year monthly samples simulated
from the AFNS0 model.
For a start, Table 8 contains the summary statistics for the 1,000 estimated parameter sets
we obtain from these monthly forty-year samples. For the parameters determined primarily
from the cross section, i.e., λ, σ11, σ22, and σ33, we see a reduction of about 50% in their
dispersion when we quadruple the length of the sample. For the other unbiased parameters,
θP1, θP2, and θP3, we see a similar reduction in the dispersion for the latter two, while the
variation in the estimates of θP1 is reduced by only about 20%. This is tied to the fact that,
even with this sample length, κP11 is still estimated with notable upward bias although it is
much less severe than in the ten-year samples. On the other hand, for the remaining
mean-reversion parameters with bias, κP22, κP23, and κP33, we see a significant reduction in their bias.
In addition, the uncertainty of their estimated values is reduced by a factor of 2.5, which
reflects the combined effect of increasing the sample length (which reduces the uncertainty in
itself) and the reduction in the finite-sample bias.
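
The persistence-driven upward bias in κP11 can be illustrated outside the AFNS setting. The following is a minimal sketch and not the estimation procedure used in this paper: it replaces the Kalman filter with ordinary least squares on a directly observed univariate Gaussian AR(1) process, calibrated to the true values of κP11, θP1, and σ11 from Table 8. The function name, the seed, the starting value at θ, and the first-order mapping κ ≈ (1 − φ)/Δt are assumptions made for illustration.

```python
import numpy as np


def simulate_kappa_estimates(kappa, theta, sigma, years, n_sims, rng, dt=1.0 / 12.0):
    """Simulate monthly AR(1) paths and return one OLS-implied kappa per path."""
    n_obs = int(round(years / dt))
    phi = np.exp(-kappa * dt)  # exact AR(1) coefficient of the discretized process
    shock_sd = sigma * np.sqrt((1.0 - phi**2) / (2.0 * kappa))
    x = np.empty((n_sims, n_obs + 1))
    x[:, 0] = theta  # assumption: every path starts at its long-run mean
    shocks = shock_sd * rng.standard_normal((n_sims, n_obs))
    for t in range(n_obs):
        x[:, t + 1] = theta + phi * (x[:, t] - theta) + shocks[:, t]
    # OLS slope of x[t+1] on x[t] with an intercept, computed path by path.
    lag_dev = x[:, :-1] - x[:, :-1].mean(axis=1, keepdims=True)
    cur_dev = x[:, 1:] - x[:, 1:].mean(axis=1, keepdims=True)
    phi_hat = (lag_dev * cur_dev).sum(axis=1) / (lag_dev**2).sum(axis=1)
    return (1.0 - phi_hat) / dt  # first-order approximation to -log(phi_hat) / dt


rng = np.random.default_rng(0)
for years in (10, 40):
    estimates = simulate_kappa_estimates(0.03943, 0.07242, 0.00570, years, 1000, rng)
    print(f"{years}-year samples: mean kappa = {estimates.mean():.3f}, "
          f"std. dev. = {estimates.std():.3f}")
```

Under these assumptions, the average estimate should exceed the true value of 0.03943 for both sample lengths, with a smaller gap and less dispersion in the forty-year case. This matches the qualitative pattern described above, although the magnitudes need not agree with those from the full AFNS estimation.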
For the parameters determined from the cross section of yields, we note that a ten-year
sample of high quality (σε = 1 basis point) tends to lead to more accurate estimates than
forty-year samples of relatively noisy data (σε = 10 basis points). Thus, whether a long,
noisier sample or a short, high-quality sample is more appropriate really depends on the
parameters of interest. The accuracy of the parameters in KP and θP is determined by the
sample length and is largely independent of the data quality, while the accuracy of the estimates
of λ and the parameters in the Σ volatility matrix can be more sensitive to data quality than
to sample length.
Forty-year samples, σε = 1 bp
Parameter  True      Mean      Std. dev.  5%        1st quartile  Median    3rd quartile  95%
κP11       0.03943   0.15530   0.11662    0.03282   0.07563       0.12431   0.20568       0.38359
κP22       0.43102   0.46791   0.08887    0.34128   0.40697       0.45893   0.52220       0.63095
κP23       -0.69198  -0.70833  0.08404    -0.84889  -0.76527      -0.70567  -0.64882      -0.57729
κP33       0.83341   0.94312   0.22975    0.62266   0.77463       0.91457   1.0842        1.3699
σ11        0.00570   0.00571   0.00011    0.00553   0.00563       0.00571   0.00578       0.00589
σ22        0.00888   0.00887   0.00028    0.00840   0.00868       0.00887   0.00905       0.00932
σ33        0.02728   0.02733   0.00084    0.02602   0.02675       0.02732   0.02792       0.02871
θP1        0.07242   0.07299   0.01497    0.04628   0.06317       0.07379   0.08262       0.09583
θP2        -0.03173  -0.03166  0.00817    -0.04497  -0.03728      -0.03167  -0.02590      -0.01842
θP3        -0.01873  -0.01864  0.00492    -0.02653  -0.02184      -0.01865  -0.01540      -0.01053
λ          0.53650   0.53641   0.00141    0.53405   0.53546       0.53636   0.53735       0.53873
σε         0.00010   0.00010   0.00000    0.00010   0.00010       0.00010   0.00010       0.00010

Forty-year samples, σε = 10 bps
Parameter  True      Mean      Std. dev.  5%        1st quartile  Median    3rd quartile  95%
κP11       0.03943   0.15875   0.12341    0.03325   0.07462       0.12392   0.20531       0.40518
κP22       0.43102   0.46867   0.09180    0.33625   0.40536       0.45924   0.52424       0.63261
κP23       -0.69198  -0.71050  0.08795    -0.85659  -0.76772      -0.70791  -0.65208      -0.57646
κP33       0.83341   0.94851   0.24482    0.60052   0.77666       0.91701   1.0976        1.4252
σ11        0.00570   0.00574   0.00033    0.00520   0.00553       0.00574   0.00594       0.00626
σ22        0.00888   0.00885   0.00038    0.00821   0.00859       0.00885   0.00910       0.00949
σ33        0.02728   0.02734   0.00126    0.02522   0.02646       0.02738   0.02820       0.02938
θP1        0.07242   0.07300   0.01499    0.04632   0.06290       0.07406   0.08273       0.09618
θP2        -0.03173  -0.03166  0.00818    -0.04470  -0.03727      -0.03167  -0.02590      -0.01831
θP3        -0.01873  -0.01866  0.00495    -0.02662  -0.02193      -0.01864  -0.01548      -0.01042
λ          0.53650   0.53618   0.01019    0.51956   0.52931       0.53605   0.54286       0.55275
σε         0.00100   0.00100   0.00001    0.00098   0.00099       0.00100   0.00101       0.00102
Table 8: Summary Statistics of Estimated Parameters from Simulated Forty-Year Monthly Samples of the Preferred AFNS0 Model.
The table reports the summary statistics of the estimation results from N = 1,000 simulated data sets
of the preferred AFNS0 model, each with a length of forty years and a uniform measurement error
standard deviation of σε = 1 basis point and σε = 10 basis points, respectively.
Figure 4 shows the distribution of the estimated parameters in the KP mean-reversion
matrix across the 1,000 samples when the sample length is forty years and the noise has a
standard deviation of 10 basis points. Relative to the distribution from the ten-year samples
shown in Figure 2, we note the significant reduction in both the dispersion and skewness of
the estimates of each of these four parameters when the sample length is quadrupled.
Table 9 reports the summary statistics of the estimated parameter standard deviations we
obtain from the optimized likelihood function in the Kalman filter for the forty-year samples.
We note that the means and medians are close to each other and close to the standard
deviation of the parameter estimates that we use as a proxy for the true, but unobserved, parameter
uncertainty. The pair (κP11, θP1) remains the exception for which the estimated standard
deviations still significantly understate the actual variation in the estimated parameters.
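
As a quick arithmetic check of this statement, the snippet below divides the mean estimated standard deviation by the empirical (“True”) standard deviation for the KP and θP parameters, using the σε = 10 basis points panel of Table 9. The dictionary keys and the function name are illustrative labels only.

```python
# (empirical "True" std. dev., mean estimated std. dev.), Table 9, sigma_eps = 10 bps
TABLE_9_TEN_BPS = {
    "kappa_11": (0.12341, 0.08677),
    "kappa_22": (0.09180, 0.08995),
    "kappa_23": (0.08795, 0.08794),
    "kappa_33": (0.24482, 0.23593),
    "theta_1": (0.01499, 0.00802),
    "theta_2": (0.00818, 0.00800),
    "theta_3": (0.00495, 0.00478),
}


def std_dev_ratios(table):
    """Ratio of mean estimated to empirical std. dev.; below 1 means understated."""
    return {name: estimated / empirical for name, (empirical, estimated) in table.items()}


ratios = std_dev_ratios(TABLE_9_TEN_BPS)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

For κP11 and θP1 the ratios come out at roughly 0.7 and 0.5, respectively, while the remaining ratios are all above 0.95.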
To end the analysis of the forty-year monthly samples, we analyze the accuracy of the
[Figure 4 plots omitted. Panels: (a) κP11; (b) κP22; (c) κP23; (d) κP33. Horizontal axis: Estimation No. (0 to 1000). Vertical axis: Parameter estimate.]
Figure 4: Estimated Mean-Reversion Parameters from Simulated Forty-Year Monthly Samples of the Preferred AFNS0 Model.
Illustration of the estimated mean-reversion parameters in the KP matrix from N = 1,000 simulated
data sets of the preferred AFNS0 model, each with a length of forty years and a uniform measurement
error standard deviation of σε = 10 basis points. The true value of each parameter is indicated with
a horizontal solid grey line.
filtering of the state variables. Table 10 reports the mean absolute difference between the
simulated factor paths and the estimated factor paths from the Kalman filter in this case.
Compared to the results for the ten-year monthly samples reported in Table 7, there is a
modest gain in the quality of the filtering from quadrupling the sample length. However,
as measured by the median of the absolute filtered errors, the difference is at most about 0.5
basis points. Thus, for all practical purposes, the filtering accuracy is the same and not sensitive
Forty-year samples, σε = 1 bp
Parameter std. dev.  “True”   Mean     Std. dev.  5%       1st quartile  Median   3rd quartile  95%
σ(κP11)              0.11662  0.08415  0.03025    0.04295  0.06274       0.07986  0.10214       0.13916
σ(κP22)              0.08887  0.08785  0.01606    0.06462  0.07669       0.08584  0.09698       0.11745
σ(κP23)              0.08404  0.08482  0.01030    0.06927  0.07720       0.08390  0.09169       0.10208
σ(κP33)              0.22975  0.22606  0.03238    0.17799  0.20380       0.22351  0.24643       0.28205
σ(σ11)               0.00011  0.00012  0.00001    0.00011  0.00011       0.00012  0.00012       0.00013
σ(σ22)               0.00028  0.00030  0.00002    0.00027  0.00029       0.00030  0.00031       0.00033
σ(σ33)               0.00084  0.00087  0.00005    0.00078  0.00083       0.00087  0.00090       0.00096
σ(θP1)               0.01497  0.00802  0.00540    0.00228  0.00414       0.00663  0.01041       0.01921
σ(θP2)               0.00817  0.00798  0.00214    0.00491  0.00647       0.00769  0.00926       0.01174
σ(θP3)               0.00492  0.00476  0.00113    0.00313  0.00396       0.00461  0.00542       0.00680
σ(λ)                 0.00141  0.00141  0.00018    0.00115  0.00129       0.00140  0.00152       0.00172
σ(σε)                0.00000  0.00000  0.00000    0.00000  0.00000       0.00000  0.00000       0.00000

Forty-year samples, σε = 10 bps
Parameter std. dev.  “True”   Mean     Std. dev.  5%       1st quartile  Median   3rd quartile  95%
σ(κP11)              0.12341  0.08677  0.03343    0.04349  0.06199       0.08040  0.10587       0.14805
σ(κP22)              0.09180  0.08995  0.01677    0.06647  0.07849       0.08762  0.09911       0.12037
σ(κP23)              0.08795  0.08794  0.01140    0.07112  0.07970       0.08732  0.09499       0.10791
σ(κP33)              0.24482  0.23593  0.03729    0.18051  0.21062       0.23243  0.25698       0.30326
σ(σ11)               0.00033  0.00032  0.00002    0.00029  0.00031       0.00032  0.00034       0.00036
σ(σ22)               0.00038  0.00039  0.00002    0.00036  0.00038       0.00040  0.00041       0.00043
σ(σ33)               0.00126  0.00130  0.00008    0.00116  0.00124       0.00129  0.00135       0.00143
σ(θP1)               0.01499  0.00802  0.00542    0.00223  0.00414       0.00656  0.01042       0.01893
σ(θP2)               0.00818  0.00800  0.00218    0.00491  0.00649       0.00773  0.00931       0.01186
σ(θP3)               0.00495  0.00478  0.00115    0.00314  0.00394       0.00464  0.00545       0.00686
σ(λ)                 0.01019  0.01050  0.00127    0.00864  0.00961       0.01039  0.01130       0.01283
σ(σε)                0.00001  0.00001  0.00000    0.00001  0.00001       0.00001  0.00001       0.00001
Table 9: Summary Statistics of Estimated Parameter Standard Deviations from Simulated Forty-Year Monthly Samples on the Preferred AFNS0 Model.
The table reports the summary statistics of the estimated parameter standard deviations from N =
1,000 simulated data sets of the preferred AFNS0 model, each with a length of forty years and a uniform
measurement error standard deviation of σε = 1 basis point and σε = 10 basis points, respectively.
to the sample length.
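
The size of the filtering gain can be read directly off the medians in Tables 7 and 10. The short calculation below, with illustrative variable names, takes the ten-year median minus the forty-year median for each factor and noise level, in basis points.

```python
# Median mean-absolute fitted errors in basis points, keyed by sigma_eps in bps.
MEDIAN_TEN_YEAR = {  # Table 7
    1: {"L": 1.87, "S": 1.69, "C": 4.83},
    10: {"L": 11.52, "S": 11.13, "C": 34.42},
}
MEDIAN_FORTY_YEAR = {  # Table 10
    1: {"L": 1.63, "S": 1.43, "C": 4.63},
    10: {"L": 11.09, "S": 10.61, "C": 34.02},
}

# Gain from the longer sample: ten-year median minus forty-year median.
gains = {
    (noise, factor): round(MEDIAN_TEN_YEAR[noise][factor] - MEDIAN_FORTY_YEAR[noise][factor], 2)
    for noise in MEDIAN_TEN_YEAR
    for factor in MEDIAN_TEN_YEAR[noise]
}

for (noise, factor), gain in gains.items():
    print(f"sigma_eps = {noise} bp(s), factor {factor}: gain = {gain:.2f} bps")
print(f"largest gain: {max(gains.values()):.2f} bps")
```

All six differences are positive and none exceeds roughly half a basis point, which supports treating the filtering accuracy as insensitive to the sample length.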
7.3 Analysis of Weekly Samples
In this section, we analyze the estimation results we obtain with the exact same data analyzed
thus far, but sampled at a weekly frequency. Importantly, we emphasize that the simulated
factor paths are identical estimation-by-estimation; only the observation frequency has changed.
This should make the results as comparable as possible. Thus, only the simulated
measurement errors differ across the two exercises.
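
One way to picture this design is sketched below. It is an illustration under stated assumptions rather than the simulation code behind the paper: a single univariate Gaussian factor, calibrated to the level-factor values in Table 8, is simulated on a hypothetical fine grid of 156 steps per year so that both monthly and weekly observation dates fall on grid points, and the noisy observation of the factor itself stands in for the observed yields. All function and variable names are hypothetical.

```python
import numpy as np

STEPS_PER_YEAR = 156  # hypothetical fine grid, divisible by both 12 and 52


def simulate_fine_path(kappa, theta, sigma, years, rng):
    """Simulate one Gaussian mean-reverting factor path on the fine grid."""
    dt = 1.0 / STEPS_PER_YEAR
    phi = np.exp(-kappa * dt)
    shock_sd = sigma * np.sqrt((1.0 - phi**2) / (2.0 * kappa))
    path = np.empty(years * STEPS_PER_YEAR + 1)
    path[0] = theta  # assumption: start at the long-run mean
    for t in range(path.size - 1):
        path[t + 1] = theta + phi * (path[t] - theta) + shock_sd * rng.standard_normal()
    return path


def observe(path, obs_per_year, noise_sd, rng):
    """Sub-sample the same path and add freshly drawn measurement errors."""
    stride = STEPS_PER_YEAR // obs_per_year
    states = path[::stride]
    return states, states + noise_sd * rng.standard_normal(states.size)


rng = np.random.default_rng(0)
path = simulate_fine_path(0.03943, 0.07242, 0.00570, 10, rng)  # level-factor values, Table 8
monthly_states, monthly_obs = observe(path, 12, 0.0010, rng)  # sigma_eps = 10 bps
weekly_states, weekly_obs = observe(path, 52, 0.0010, rng)
print(monthly_states.size, weekly_states.size)
```

On the dates common to both grids, the underlying states coincide exactly, while the noisy observations differ because the measurement errors are drawn separately for each frequency.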
As before, we start with an analysis of the ten-year samples, the results for which are
reported in Table 11. In general, the mean and median of the 1,000 estimates of each
parameter are nearly identical to those obtained with monthly data. Thus, in this sense, there
are limited benefits from increasing the data frequency. Still, we do see some reduction in
Mean absolute fitted error, forty-year samples, σε = 1 bp
State variable  Mean   Std. dev.  5 percentile  1st quartile  Median  3rd quartile  95 percentile
Lt              1.69   0.20       1.50          1.57          1.63    1.73          2.09
St              1.51   0.23       1.31          1.37          1.43    1.55          1.97
Ct              4.64   0.17       4.38          4.52          4.63    4.75          4.93

Mean absolute fitted error, forty-year samples, σε = 10 bps
State variable  Mean   Std. dev.  5 percentile  1st quartile  Median  3rd quartile  95 percentile
Lt              11.13  0.57       10.23         10.71         11.09   11.49         12.12
St              10.64  0.54       9.84          10.27         10.61   10.96         11.59
Ct              34.07  1.46       31.64         33.06         34.02   35.04         36.52
Table 10: Summary Statistics of Mean Absolute Fitted Errors of the Filtered State Variables from Simulated Forty-Year Monthly Samples of the Preferred AFNS0 Model.
The table reports the summary statistics of the mean absolute fitted error of the three state variables
from N = 1,000 simulated data sets of the preferred AFNS0 model, each with a length of forty years
and a uniform measurement error standard deviation of σε = 1 basis point a