The Limits to Volatility Predictability: Quantifying
Forecast Accuracy Across Horizons∗
Xingyi Li† and Valeriy Zakamulin‡
This revision: December 18, 2017
Abstract
Volatility forecasting is crucial for portfolio management, risk management, and pric-
ing of derivative securities. Still, little is known about how far ahead one can forecast
volatility. First, in this paper we introduce the notions of the spot and forward predicted
volatilities and propose to describe the term structure of volatility predictability by the
spot and forward forecast accuracy curves. Then, by employing a few popular time-series
volatility models, we perform a comprehensive empirical study on the horizon of volatility
predictability. Our results suggest that, whereas the spot volatility can be predicted over
horizons that extend to 35 weeks, the horizon of the forward volatility predictability is
rather short and limited to approximately 7.5 weeks. Finally, we suggest a plausible expla-
nation for why standard models fail to provide sensible longer-horizon volatility forecasts.
We argue that volatility is less persistent and does not revert to its long-run mean as the
models assume. Our analysis reveals an important but currently overlooked stylized fact
about volatility: it cycles erratically over time and periods of high or low volatility follow
one another.
Key words: spot volatility, forward volatility, volatility forecasting, forecast accuracy,
term structure, out-of-sample forecasting, model comparison
JEL classification: C22, C53, C58, G17; EFM classification: 450
∗The authors are grateful to Jochen Jungeilges for his helpful comments and suggestions regarding the previous draft of this paper. Any remaining errors in the manuscript are the authors’ responsibility.
†School of Business and Law, University of Agder, Service Box 422, 4604 Kristiansand, Norway, Tel.: (+47) 38 14 13 38, E-mail: [email protected]
‡School of Business and Law, University of Agder, Service Box 422, 4604 Kristiansand, Norway, Tel.: (+47) 38 14 10 39, E-mail: [email protected]
1 Introduction
Volatility forecasting is crucial for portfolio management, risk management, and pricing of
financial derivatives. Specifically, the volatility of a financial asset is a primary input to the
optimal portfolio choice problem. Volatility forecasting is a mandatory risk-management ex-
ercise for many financial institutions and banks around the world. Volatility is the most vital
input variable in the valuation of derivative securities. For instance, to price an option one
needs to know the future volatility of the underlying asset till the option maturity.
Nowadays, there is a trade in derivatives that are written on volatility itself. Examples
of such derivatives are Forward Volatility Agreements (FVA). The FVA is a forward contract
on the future spot realized or implied volatility of a financial asset (examples are: individual
stock, stock market index, commodity, foreign currency, etc.). In particular, the FVA specifies
the realized or implied volatility for an interval starting at a future date. The value of the FVA
at maturity is the difference between the contractual volatility level, which is determined at
the contract inception date, and the volatility level observed at the settlement date. The key
motivation to trade FVAs is that they allow investors to hedge volatility risk and to speculate
on volatility levels.
In portfolio management and risk management, the volatility needs to be forecasted over
horizons ranging from 1 day to 1 month. In contrast, in the valuation of derivative securities
the volatility needs to be forecasted over much longer horizons. For example, on the Chicago
Board Options Exchange (CBOE) one can trade short-term options with a maximum of 12
months to maturity and long-term options (LEAPS) that have expiration dates up to 39 months
into the future. Therefore, the successful pricing of options requires accurate forecasting of
volatility over a relatively long-term period starting now. The successful use of an FVA contract
requires accurate forecasting of volatility over a period starting at some point in the future.
FVA contracts are typically traded in over-the-counter markets and have maturities ranging
from 1 to 24 months (see, for example, Corte, Kozhan, and Neuberger (2017)).
There is now an enormous body of research on the properties of volatility, volatility mod-
eling and forecasting. The following stylized facts about volatility have been identified and
described in the financial literature: persistence and mean-reversion. Persistence in volatility
makes it possible to forecast future volatility. In fact, it is well documented in financial econo-
metric literature that volatility is predictable (for a good review of this literature, see Poon
and Granger (2003)). But what is the horizon of volatility predictability?
There is no doubt that volatility is forecastable over short horizons up to 1 month into the
future. However, there is a controversy in the literature about how far ahead the volatility is
forecastable. The answer to this question seems to depend greatly on the methodology employed
to measure forecast accuracy. Specifically, in the vast majority of studies on volatility
predictability the researchers run a horse-race between several alternative forecasting models.
In these studies, the forecast accuracy is typically evaluated using measures based on either
(absolute or squared) forecast errors or percentage errors.1 Such studies report the ranking of
competing models and make recommendations to practitioners about which forecasting model
should be preferred. In some of these studies the forecast horizon is extended to 30-60 months
(examples of such studies are Cao and Tsay (1992), Alford and Boatsman (1995), Figlewski
(1997), and Green and Figlewski (1999)). The results of these studies seem to suggest that
volatility is forecastable over long-term horizons that extend to several years.
There are two issues with the above-mentioned studies. The first issue is that the results
are plagued by the fact that they are joint assessments of volatility forecastability and an
assumed model, and the results vary not only with the horizon, but also with the model.
To address this problem, Christoffersen and Diebold (2000) develop a model-free procedure
for measuring volatility predictability across horizons. They implement their procedure using
the data on four stock market indices and four exchange rates. In contrast to the previous
studies, Christoffersen and Diebold (2000) find that volatility forecastability decays quickly
with horizon and volatility is not predictable over horizons longer than 8 weeks.
The other more serious issue with the above-mentioned studies is that even though fore-
casting errors allow comparing alternative forecasting models, they do not allow measuring
predictive accuracy per se. For example, if the volatility over some forecast horizon is unpre-
dictable, all model forecasts are likely to be worthless. In this case using forecasting errors
to select the best model among the poor ones creates the illusion of predictability when none
is present. To overcome this problem, Galbraith (2003) proposes a procedure of determining
the horizon (coined the “content horizon”) beyond which forecasts from univariate time se-
ries models of stationary processes add nothing to the forecast implicit in the unconditional
mean. Using data on two exchange rates, Galbraith and Kisinbay (2005) find that the “content
horizon” in forecasting the volatility of exchange rates is limited to 6 weeks.
1Examples of such measures are Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE).
The present paper attempts to extend our knowledge about how far into the future useful
volatility forecasts can be made from time-series models. Our contributions are threefold. On
the methodological side, we introduce the notions of the spot and forward predicted volatili-
ties. Consequently, we suggest describing the term structure of volatility predictability by two
curves: the spot and forward forecast accuracy curves. In our terminology, the “spot predicted
volatility” is the volatility forecasted over a period starting now. Hence, the “spot forecast
accuracy curve” shows the forecast accuracy for the spot predicted volatility. The “forward
predicted volatility” is the volatility forecasted over a period starting at some point in the
future; the “forward forecast accuracy curve” depicts the forecast accuracy for the forward
predicted volatility.
The motivation for the notion of “forward predicted volatility” comes from two sources.
First, the spot predicted volatility can be decomposed into the spot predicted volatility over
the first period and the forward predicted volatilities over all subsequent periods. Therefore,
the standard spot forecast accuracy curve provides useful, but at the same time misleading
information about the model’s ability to forecast volatility across a specific horizon. This is
because it creates a wrong impression that the predictive accuracy is evenly spread over the
total horizon; this impression is far from the reality. Therefore, much more useful information
about the model’s ability to forecast volatility across various horizons can be obtained from the
forward forecast accuracy curve. Second, the forward forecast accuracy curve is very relevant
in the context of FVAs because it measures directly the accuracy of a forward volatility forecast
and, therefore, allows one to understand the limits to the forward volatility predictability.
On the empirical side, by employing three popular yet simple time-series volatility models,2
we perform a comprehensive study on the horizon of volatility predictability. Specifically, using
data on 23 individual stocks, 39 world stock indices, 16 bond indices, 17 exchange rates, and 8
commodities, we estimate the term structure of volatility predictability for all major financial
markets. As a result, in our study we use a relatively large and broad data set of diverse
financial assets as compared to most of the existing studies on volatility forecasting. In contrast
to the previous studies that employ only a few financial assets, we do not report the results for
each individual asset because each of these results might not be truly representative. Instead,
we perform a so-called “meta-analysis” that combines the results for all individual assets that
belong to the same asset class. A key benefit of this approach is that the aggregation of
information on individual assets leads to a higher statistical power and more robust forecast
accuracy estimates than can be obtained from the information on any individual asset.
2Each of these models captures different stylized facts about volatility.
Our empirical results are strikingly similar across the financial markets. Whereas the spot
volatility can be predicted over horizons that extend to 20-35 weeks, the horizon of the forward
volatility predictability is rather short and limited to 5-10 weeks. In other words, our results
indicate that, regardless of the financial market, the horizon of the spot (forward) volatility
predictability is limited to 8 (2) months ahead. Therefore, the
horizon of volatility predictability is much shorter than the longest maturity of traded LEAPS
(39 months) and FVA (24 months) contracts.
Our third contribution consists in suggesting a plausible explanation for why standard
volatility models fail to provide sensible longer-horizon volatility forecasts. We argue that the
volatility dynamics do not fully correspond to the assumptions embedded in standard models,
which assume that volatility is highly persistent and mean-reverting. To support our argu-
ments, we suggest looking from a different perspective on the volatility dynamics. Specifically,
we propose to consider the volatility dynamics as a process which is constantly “switching”
between high and low volatility states. The interchanging periods of high and low volatility
can be regarded as waves of irregular cycle length and varying amplitude. Put differently, the
important but currently overlooked stylized fact about volatility is its oscillating dynamics:
volatility cycles erratically over time and periods of high or low volatility follow one another.
Using a few distinct financial assets, we detect the periods of rising and falling volatility and
present their summary statistics. We find that the statistics are strikingly similar across vari-
ous asset classes. Most importantly, our analysis reveals that the volatility is sometimes rather
non-persistent: its value can change dramatically over the course of a single day. In addition, we
indicate that the volatility does not revert to its long-run mean as the models assume.
The rest of the paper is organized as follows. Section 2 introduces the basic notions and
terminology used in the paper. Section 3 describes the data. The empirical methodology is
outlined in Section 4 that covers the volatility forecasting models, how we measure the forecast
accuracy, and how we conduct statistical inference about estimated forecast accuracies. Section
5 presents the empirical results. In the subsequent Section 6 we discuss the results and present
our view on the volatility dynamics. Finally, Section 7 summarizes and concludes the paper.
2 Basic Notions and Terminology
2.1 Spot and Forward Predicted Volatilities
Our notions of spot and forward predicted volatilities build on the notions of spot and forward
implied volatilities,3 see Taleb (1997, Chapter 9), Egelkraut, Garcia, and Sherrick (2007),
Glasserman and Wu (2011), and Corte, Sarno, and Tsiakas (2011). Let $t$ denote the present
time and let $\{T_i\}$ be a set of forecast horizons such that $T_{i+1} > T_i$. Consider a time-series model
that uses historical returns for volatility forecasting. Depending on the method of forecasting,
such a model can predict the future volatility directly over the horizon of interest, $\hat{\sigma}_{t,t+T_i}$,
where the subscript $t,t+T_i$ means “the time $t$ forecast of volatility from time $t+1$ till time
$t+T_i$”. In other words, $\hat{\sigma}_{t,t+T_i}$ denotes the time $t$ predicted volatility over horizon $T_i$. We refer
to $\hat{\sigma}_{t,t+T_i}$ as “the spot predicted volatility”. When there is no dependency in the time series
of returns, using two spot predicted volatilities over horizons $T_i$ and $T_j$ (such that $T_j > T_i$) we
can deduce the predicted volatility over the period from $t+T_i$ to $t+T_j$:
$$\hat{\sigma}^2_{t+T_i,t+T_j} = \hat{\sigma}^2_{t,t+T_j} - \hat{\sigma}^2_{t,t+T_i}. \tag{2.1}$$
We refer to $\hat{\sigma}_{t+T_i,t+T_j}$ as “the forward predicted volatility”. Specifically, the forward pre-
dicted volatility represents the time $t$ forecasted volatility between two future dates $t+T_i$ and
$t+T_j$.
Alternatively, a time-series model can forecast the volatility over the horizon of interest by
performing a rolling one-period ahead forecast. That is, first the model predicts the next
period volatility $\hat{\sigma}_{t,t+T_1}$, then it predicts the volatility over the second period $\hat{\sigma}_{t+T_1,t+T_2}$, and
so on until the last period volatility $\hat{\sigma}_{t+T_{i-1},t+T_i}$. In this case the spot predicted volatility over
the total horizon $T_i$ is computed as
$$\hat{\sigma}^2_{t,t+T_i} = \hat{\sigma}^2_{t,t+T_1} + \hat{\sigma}^2_{t+T_1,t+T_2} + \ldots + \hat{\sigma}^2_{t+T_{i-1},t+T_i}. \tag{2.2}$$
3It is worth noting that the notions of spot and forward implied volatilities are built, in turn, on the notions of spot and forward interest rates.
The forward predicted volatility between two future dates $t+T_i$ and $t+T_j$ is computed as
$$\hat{\sigma}^2_{t+T_i,t+T_j} = \hat{\sigma}^2_{t+T_i,t+T_{i+1}} + \hat{\sigma}^2_{t+T_{i+1},t+T_{i+2}} + \ldots + \hat{\sigma}^2_{t+T_{j-1},t+T_j}. \tag{2.3}$$
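The relations (2.1)-(2.3) amount to adding and subtracting variances. The following minimal Python sketch illustrates this; the function names and the numerical values are hypothetical and serve only as an illustration, under the stated assumption of no dependency in the return series.

```python
import math


def forward_vol(spot_vol_short, spot_vol_long):
    """Forward predicted volatility implied by two spot predicted
    volatilities, as in equation (2.1): subtract the variances."""
    fwd_var = spot_vol_long ** 2 - spot_vol_short ** 2
    if fwd_var < 0:
        raise ValueError("longer-horizon spot variance must not be smaller")
    return math.sqrt(fwd_var)


def spot_vol_from_periods(period_vols):
    """Spot predicted volatility over the total horizon obtained from
    per-period predicted volatilities, as in equation (2.2): add the variances."""
    return math.sqrt(sum(v ** 2 for v in period_vols))


# Hypothetical per-period predicted volatilities (illustration only):
# first period (spot) and second period (forward).
period_vols = [0.03, 0.04]
spot_T2 = spot_vol_from_periods(period_vols)       # spot volatility over two periods
fwd_T1_T2 = forward_vol(period_vols[0], spot_T2)   # recovers the second-period volatility
```

With these inputs the two-period spot volatility is 0.05, and applying (2.1) recovers the second-period forward volatility of 0.04.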
2.2 Term Structure of Volatility Predictability
As a rule, the procedure for measuring the forecast accuracy across various horizons is per-
formed as follows. First, one predicts the volatility across a set of horizons $\{T_i\}$, $\hat{\sigma}_{t,t+T_i}$.
Second, one computes the realized volatility across the same set of horizons, $\sigma_{t,t+T_i}$. Finally,
one compares the predicted and realized volatilities. For this purpose one usually employs
some function $P_s(T_i)$ that computes the forecast accuracy for a given horizon $T_i$. The subscript
$s$ in this function emphasizes that the function uses the spot predicted volatilities. Typically,
the value of $P_s(T_i)$ is bounded from above by 1 (or 100%), meaning that if a model correctly
predicts the volatility, then the forecast accuracy equals 100%. Usually the value of $P_s(T_i)$ is
significantly less than 1 and it is known that the volatility forecastability decays with horizon.
Therefore the function $P_s(T_i)$ is decreasing. We refer to the function $P_s(T_i)$ as “the forecast
accuracy curve for the spot predicted volatility” or just “the spot forecast accuracy curve” for
short.
The spot forecast accuracy curve provides useful, but at the same time misleading infor-
mation about the model’s ability to forecast volatility across a specific horizon. Consider the
following motivating example. Suppose you measure the model’s forecast accuracy in pre-
dicting the volatility over two periods, till time $T_2$. You estimate that the forecast accuracy,
as measured by function $P_s(T_2)$, amounts to 50%. When you recall that the spot predicted
volatility $\hat{\sigma}_{t,t+T_2}$ can be decomposed into the spot predicted volatility over the first period
$\hat{\sigma}_{t,t+T_1}$ and the forward volatility over the second period $\hat{\sigma}_{t+T_1,t+T_2}$, the natural question to ask
is: how accurate is the volatility forecast for each component? The forecast accuracy of 50%
over two periods is a meaningful measure when both components, $\hat{\sigma}_{t,t+T_1}$ and $\hat{\sigma}_{t+T_1,t+T_2}$,
are forecasted with about the same accuracy. In reality it might be the case that the value
of 50% appears because the first component is forecasted with 100% accuracy, but the other
component is forecasted with 0% accuracy.
We argue that the standard procedure for measuring the forecast accuracy across horizons
creates a wrong impression that the predictive accuracy is evenly spread over the total horizon.
However, this impression is far from the reality. Therefore, much more useful information about
the model’s ability to forecast volatility across various horizons can be obtained if one compares
the forward predicted volatilities, $\hat{\sigma}_{t+T_{i-1},t+T_i}$, with the future realized volatilities over the same
periods, $\sigma_{t+T_{i-1},t+T_i}$. For this purpose one can employ exactly the same function, $P_f(T_i)$, but
in this case the function computes the volatility forecast accuracy for a set of (non-overlapping)
future periods $\{t+T_{i-1}, t+T_i\}$. The subscript $f$ in this function emphasizes that the function
uses the forward predicted volatilities. Function $P_f(T_i)$ has the same properties as the function
$P_s(T_i)$: the value of the function is bounded from above by 1, and the function is decreasing
with horizon. We refer to the function $P_f(T_i)$ as “the forecast accuracy curve for the forward
predicted volatility” or just “the forward forecast accuracy curve” for short.
By the “term structure of volatility predictability” we mean the two curves, $P_s(T_i)$ and
$P_f(T_i)$, that jointly provide information about the model’s ability to forecast volatility across
various horizons. By augmenting the standard spot forecast accuracy curve with the forward
forecast accuracy curve, one gets a better insight into how far into the future one can really
rely on the model’s volatility forecasts.
3 Data
In our study we use historical data for the four major asset classes; two of the asset classes also
include sub-asset classes. These asset classes (sub-classes) include stocks (individual stocks
and stock market indices), bonds (intermediate- and long-term bonds), currencies, and com-
modities. All data come at the daily frequency and cover the period from January 1995 to
December 2016. The data for the two first asset classes (and sub-classes) are downloaded from
Yahoo Finance.4 The data for the two other asset classes are obtained from the Federal Re-
serve Economic Data (FRED), a database maintained by the Research division of the Federal
Reserve Bank of St. Louis.5 Table 1 lists the components of each data set.
The data set of stocks consists of prices of 23 individual stocks. These stocks represent
either current or previous components of the Dow Jones Industrial Average (DJIA) that have
price data in the whole sample period.6 The dataset of stock market indices includes the prices
of 39 major world stock indices.
4https://finance.yahoo.com/.
5https://fred.stlouisfed.org/.
6The DJIA is an index of the prices of 30 large US corporations selected to represent a cross section of US industry. As of today, the components of the DJIA have changed 51 times since its beginning in 1896.
Table 1: Datasets: Asset classes, subclasses, and their components
Asset class | Symbols
Individual stocks | AXP, BA, CAT, CSCO, CVX, DD, DIS, GE, HD, INTC, JNJ, JPM, KO, MCD, MMM, MRK, MSFT, NKE, PFE, TRV, VZ, WMT, XOM
Stock market indices | ^AORD, ^ATX, ^BFX, ^BTK, ^BVSP, ^DJA, ^DJI, ^DJT, ^DJU, ^FCHI, ^GDAXI, ^GSPC, ^GSPTSE, ^HSI, ^IXIC, ^MID, ^MXX, ^MXY, ^N225, ^NBI, ^NDX, ^NYA, ^OEX, ^PSE, ^RUA, ^RUI, ^RUT, ^SML, ^SOX, ^SSMI, ^TA100, ^UTY, ^XAU, ^XCI, ^XII, ^XMI, ^XNG, ^XOI, ^XTC
Intermediate-term bonds | VFIIX, VFITX, VFICX, VWEHX, VCAIX, VWITX
Long-term bonds | VBLTX, VUSTX, VWESX, VCITX, VWAHX, VWLTX, VNJTX, VNYTX, VOHIX, VPAIX
Exchange rates | DEXBZUS, DEXCAUS, DEXDNUS, DEXHKUS, DEXINUS, DEXJPUS, DEXKOUS, DEXNOUS, DEXSDUS, DEXSFUS, DEXSIUS, DEXUSAL, DEXUSNZ, DEXUSUK, DTWEXB, DTWEXM, DTWEXO
Commodities | DCOILWTICO, DCOILBRENTEU, DGASUSGULF, DGASNYH, GOLDAMGBD228NLBM, GOLDPMGBD228NLBM, GOLDAMGBD229NLBM, GOLDPMGBD229NLBM
Notes: For stocks and bonds, Symbols refers to the symbols used in Yahoo Finance. For exchange rates and commodities, Symbols refers to the symbols used in the FRED database.
The two data sets of bonds consist of prices of 6 intermediate-term bond indices and
10 long-term bond indices respectively. All bond indices represent the most popular bond
index funds provided by the Vanguard Group.7 These intermediate-term (long-term) bond
indices offer a low-cost, diversified approach to bond investing, providing broad exposure to
US investment-grade bonds with maturities of about 6 (14) years.
The data set of currencies consists of 14 individual spot foreign exchange rates and 3
indices. Each of these indices represents a weighted average of the foreign exchange value of
the US dollar against a specific subset of the broad index currencies. Finally, the data set of
commodities includes daily spot prices on 8 commodities.
7https://investor.vanguard.com. The Vanguard Group is one of the world’s largest and most respected investment companies. It provides low-cost passive funds available for their clients to invest in.
4 Methodology
4.1 Volatility Forecasting Models
A number of stylized facts about the volatility of financial asset prices have emerged over
the years, and have been confirmed in numerous studies. Engle and Patton (2001) highlight the
following stylized facts about volatility:
• Even though the volatility is not constant over time, the volatility exhibits persistence.
Simply put, volatility persistence means that period-to-period changes in volatility are
relatively small. In technical terms, volatility persistence means that the volatility process
exhibits a significant positive autocorrelation.
• Volatility tends to cluster through time. The volatility clustering occurs because large
moves in the price process tend to be followed by large moves (of either sign), and
small moves tend to be followed by small moves. The volatility clustering is commonly
considered to be yet another manifestation of volatility persistence.
• Volatility is mean-reverting. This means that a period of high (low) volatility will even-
tually be followed by a period of low (high) volatility. Mean reversion in volatility
is generally interpreted as meaning that there is a normal level of volatility to which
volatility will eventually return.
To illustrate the above points, Figure 1 plots the historical volatility of the S&P 500 stock
price index over the period from January 1995 to December 2016. The graph in this plot
clearly illustrates that the volatility is persistent and mean-reverting. A prominent feature
of the dynamics of volatility is that it has many periodic peaks and the spikes in volatility
coincide with the periods of stock market turmoil. For instance, during the period of the
Global Financial Crisis of 2008, the volatility reached 75% (on annual basis).
Figure 1: Historical volatility of the S&P 500 index
[Figure: time-series plot. Horizontal axis: year. Vertical axis: S&P 500 volatility, %.]
Notes: The volatility is estimated using daily returns and the EWMA model with λ = 0.96. The volatility is expressed on annual basis in percentage points.
In our study we use three simple yet different models to forecast volatility. Each of these
models captures some of the stylized facts about volatility. Specifically, given the daily re-
turns of a financial asset as input, we implement the EWMA model, the GARCH model,
and the HAR model to produce T -day ahead volatility forecast, where T takes values in
{5, 10, 15, 20, . . . , 170, 175}. Our convention is that there are 5 trading days per week. Conse-
quently, we forecast volatility for 1-35 weeks in the future. We assume that the daily logarithmic
return process of any asset is given by
rt = µ+ σtεt, (4.1)
where µ is the daily long-run mean of rt, σt is the daily volatility, and εt is a white noise process
with zero mean and unit variance.
4.1.1 The EWMA Model
Our first volatility forecasting model is the well-known Exponentially Weighted Moving Av-
erage (EWMA) model popularized by the RiskMetrics™ group (Longerstaey and Spencer
(1996)). The one-step ahead forecasting equation in this model is given by
$$\hat{\sigma}^2_{t,t+1} = (1-\lambda) r_t^2 + \lambda \hat{\sigma}^2_{t-1,t}, \tag{4.2}$$
where λ is the so-called “decay factor”. We estimate the optimal decay factor for each asset by
minimizing the Mean Squared Error (MSE) of daily forecast.8 When the length of the forecast
horizon Ti is greater than one day, the multi-step ahead volatility forecast is performed using
the square root of time rule
$$\hat{\sigma}_{t,t+T_i} = \hat{\sigma}_{t,t+1}\sqrt{T_i}. \tag{4.3}$$
Observe that the EWMA model assumes that the volatility is highly persistent. That is, the
expected volatility over all future periods equals the forecasted volatility for the subsequent
period.
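The EWMA recursion (4.2) and the square root of time rule (4.3) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and seeding the recursion with the sample mean of squared returns is an assumption of the sketch, since the initialization is not specified here.

```python
import numpy as np


def ewma_variance(returns, lam, init_var=None):
    """One-day-ahead variance forecast from the recursion in equation (4.2).

    Assumption of this sketch: if no starting variance is supplied, the
    recursion is seeded with the sample mean of squared returns.
    """
    returns = np.asarray(returns, dtype=float)
    var = float(np.mean(returns ** 2)) if init_var is None else float(init_var)
    for r in returns:
        var = (1.0 - lam) * r ** 2 + lam * var
    return var


def ewma_forecast(returns, lam, horizon, init_var=None):
    """Volatility forecast over `horizon` days via the square root of
    time rule in equation (4.3)."""
    daily_vol = np.sqrt(ewma_variance(returns, lam, init_var))
    return float(daily_vol * np.sqrt(horizon))
```

Because the multi-day forecast is just the one-day forecast scaled by the square root of the horizon, the forecasted daily variance is the same for every future day, which is the sense in which the model treats volatility as fully persistent.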
4.1.2 The GARCH Model
We employ the most widely used Generalized AutoRegressive Conditional Heteroskedasticity
(GARCH) (1,1) model, proposed by Bollerslev (1986), as the second alternative volatility
forecasting model. In this model the latent daily volatility is assumed to evolve according to
the following process
$$\sigma^2_{t+1} = \omega + \alpha (r_t - \mu)^2 + \beta \sigma^2_t, \tag{4.4}$$
where the coefficients α, β, and ω are estimated using daily returns in (4.1) by the maximum
likelihood method. Observe that the EWMA model is a special case of the GARCH(1,1) model
where ω = 0 and α = 1−β. The volatility in the GARCH(1,1) model is not only persistent, but
also mean reverting. The persistence is measured by α+ β. When persistence equals 1, there
is no mean-reversion as in the EWMA model. In a stable GARCH(1,1) model, α+β < 1. The
higher the persistence, the slower the reversion to the mean is. The so-called half-life of mean
reversion, defined as ln(0.5)/ ln(α + β), measures the average time it takes volatility to move
halfway towards its long-term average. In stock markets, a typical estimate of the volatility
half-life amounts to 15 weeks (Engle and Patton (2001)).
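As a rough numerical check of the half-life formula, the sketch below converts between the persistence α + β and the half-life measured in trading days. The helper names are hypothetical; the conversion of 15 weeks into 75 days uses the convention of 5 trading days per week stated earlier.

```python
import math


def half_life(persistence):
    """Half-life of mean reversion, ln(0.5) / ln(alpha + beta), in days."""
    if not 0.0 < persistence < 1.0:
        raise ValueError("persistence must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(persistence)


def persistence_from_half_life(days):
    """Inverse mapping: the alpha + beta that yields a given half-life."""
    return 0.5 ** (1.0 / days)


# A half-life of 15 weeks corresponds to 75 trading days.
implied_persistence = persistence_from_half_life(15 * 5)
```

Under these assumptions a 15-week half-life corresponds to a daily persistence of roughly 0.991, that is, very close to the EWMA limit of 1.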
8We use $\sqrt{r_t^2} = |r_t|$ as the proxy of day $t$ realized volatility.
The one-step ahead volatility forecast for day $t+1$ is given by equation (4.4). The volatility
for day $t+2$ is forecasted using the fact that $E[(r_{t+1} - \mu)^2] = \sigma^2_{t+1}$. As a result, the multi-step
volatility forecast is based on rolling one-day ahead volatility forecasts from day $t+1$ to day
$t+T_i$:
$$\hat{\sigma}^2_{t+i-1,t+i} = \begin{cases} \omega + \alpha (r_t - \mu)^2 + \beta \sigma^2_t & \text{if } i = 1, \\ \omega + (\alpha + \beta)\,\hat{\sigma}^2_{t+i-2,t+i-1} & \text{if } i > 1. \end{cases} \tag{4.5}$$
The forecasted $T_i$-day volatility is computed as the square root of the sum of daily forecasted
variances
$$\hat{\sigma}_{t,t+T_i} = \sqrt{\sum_{j=1}^{T_i} \hat{\sigma}^2_{t+j-1,t+j}}. \tag{4.6}$$
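The rolling forecast in (4.5) and its aggregation in (4.6) can be sketched as follows. The function names and the parameter values in the usage lines are hypothetical; in practice ω, α, β, and µ would come from the maximum likelihood estimation described above, which is not reproduced in this sketch.

```python
import numpy as np


def garch_daily_variance_path(omega, alpha, beta, last_return, mu, last_var, horizon):
    """Daily variance forecasts for days t+1, ..., t+horizon, following (4.5)."""
    path = np.empty(horizon)
    # i = 1: the one-step forecast uses the last observed return and variance.
    path[0] = omega + alpha * (last_return - mu) ** 2 + beta * last_var
    # i > 1: the unknown squared shock is replaced by its expected value.
    for i in range(1, horizon):
        path[i] = omega + (alpha + beta) * path[i - 1]
    return path


def garch_spot_vol(omega, alpha, beta, last_return, mu, last_var, horizon):
    """Volatility forecast over the whole horizon, following (4.6)."""
    path = garch_daily_variance_path(omega, alpha, beta, last_return, mu, last_var, horizon)
    return float(np.sqrt(path.sum()))


# Hypothetical parameter values, for illustration only.
path = garch_daily_variance_path(0.1, 0.1, 0.8, last_return=1.0, mu=0.0,
                                 last_var=1.0, horizon=3)
```

In this illustration the starting variance equals the long-run level ω/(1 − α − β) = 1, so the forecasted daily variance stays at that level; starting from any other level, the path moves geometrically towards it at rate α + β.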
4.1.3 The HAR Model
The persistence and mean-reversion of volatility can be captured by a simple Auto-Regressive
(AR(1)) model for volatility
$$\sigma_{t+1} = \beta_0 + \beta_1 \sigma_t + \varepsilon_t, \tag{4.7}$$
where β0 and β1 are real constants and εt ∼ i.i.d.(0, σ2ε). For example, French, Schwert, and
Stambaugh (1987) use both the AR(1) and GARCH(1,1) models to predict the stock market
volatility and report that both the models have about the same forecast accuracy. Because
both the AR(1) and GARCH(1,1) models have the same forecast accuracy, but in the AR(1)
model the volatility can in principle be negative, the AR(1) model is virtually never used in
practical applications.
In the context of Realized Volatility9 (RV), Corsi (2009) introduces a Heterogeneous Auto-
Regressive model of Realized Volatility (HAR-RV) and shows that it has a superior forecast
accuracy compared to a set of alternative models. Inspired by Corsi, we adapt his HAR-RV
model for forecasting volatility using daily data only. This model is a simple auto-regressive
type model where the volatility is forecasted using several past volatilities realized over different
time horizons. When only daily data are available, we simply replace the realized volatilities
computed using past intraday returns with those computed using past daily returns. Therefore,
we label our model the HAR model; it is specified by
$$\sigma_{t,t+T} = \beta_{0,T} + \beta_{1,T}\sigma_t + \beta_{2,T}\sigma_{t-4,t} + \beta_{3,T}\sigma_{t-20,t} + \beta_{4,T}\sigma_{t-62,t} + \beta_{5,T}\sigma_{t-125,t} + \varepsilon_t, \tag{4.8}$$
where $\sigma_{t-\tau,t}$ is the realized volatility over $\tau + 1$ days from day $t - \tau + 1$ till day $t$ (by our
convention $\sigma_{t-1,t} = \sigma_t$)
$$\sigma_{t-\tau,t} = \sqrt{\sum_{i=0}^{\tau-1} r^2_{t-i}}, \quad \tau + 1 \in \{1, 5, 21, 63, 126\}. \tag{4.9}$$
9The “realized volatility” is the estimator of daily volatility computed using intraday returns.
The HAR model extends the AR(1) model (given by equation (4.7)) by adding several regres-
sors. The volatility in the HAR model is also persistent and mean-reverting.10 The inclusion
of lags of realized volatility aggregated over different time horizons improves the forecast ac-
curacy. One possible explanation of this fact can be found in the paper by Engle and Rangel
(2008). In particular, these authors convincingly demonstrate that the volatility dynamics
have two components: a high-frequency component that can be captured by a GARCH pro-
cess and a low-frequency component that can be explained by macroeconomic factors. Because
of the presence of a low-frequency component in volatility, the GARCH model alone is not able
to provide sensible longer-horizon volatility forecasts. In contrast, the HAR model allows, to
some extent, to capture the high- and low-frequency components in the volatility dynamics.
In the original HAR-RV model by Corsi (2009), σt−τ,t is estimated using intraday data. In
addition, in his original model Corsi uses only daily, weekly, and monthly realized volatilities as
regressors; the volatility is forecasted for one day only. Since in our empirical study we forecast
future volatility up to 35-week horizon, we augment the original model by two additional
regressors: 3-month and 6-month past realized volatilities. We assume that there are 21
trading days per month. Also note that our HAR model is used to perform a T -day ahead
volatility forecast where T ≥ 1. Therefore the βi coefficients are re-estimated (by OLS) for
each specific horizon T , hence we use notation βi,T .
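A minimal sketch of the horizon-specific OLS estimation is given below. It is not the authors' implementation: the function names are hypothetical, the regressors are built from trailing windows of 1, 5, 21, 63, and 126 days (the daily, weekly, monthly, 3-month, and 6-month lengths mentioned above), and the simulated returns serve only to make the example runnable.

```python
import numpy as np

# Trailing window lengths in trading days (assumed from the text above).
WINDOWS = (1, 5, 21, 63, 126)


def realized_vol(returns, end, n_days):
    """Square root of the sum of squared daily returns over the
    n_days ending at index `end` (inclusive)."""
    window = returns[end - n_days + 1: end + 1]
    return np.sqrt(np.sum(window ** 2))


def fit_har(returns, horizon):
    """OLS estimates of the six coefficients for one forecast horizon."""
    returns = np.asarray(returns, dtype=float)
    first = max(WINDOWS) - 1              # first day with a full 126-day history
    last = len(returns) - horizon - 1     # last day whose target is observable
    X, y = [], []
    for t in range(first, last + 1):
        X.append([1.0] + [realized_vol(returns, t, w) for w in WINDOWS])
        # Target: realized volatility over days t+1, ..., t+horizon.
        y.append(realized_vol(returns, t + horizon, horizon))
    beta, *_ = np.linalg.lstsq(np.asarray(X), np.asarray(y), rcond=None)
    return beta


def har_forecast(returns, beta):
    """Forecast made on the last available day with fitted coefficients."""
    returns = np.asarray(returns, dtype=float)
    t = len(returns) - 1
    x = np.array([1.0] + [realized_vol(returns, t, w) for w in WINDOWS])
    return float(x @ beta)


# Simulated daily returns, for illustration only.
rng = np.random.default_rng(0)
sim_returns = rng.normal(0.0, 0.01, size=600)
beta_5 = fit_har(sim_returns, horizon=5)       # coefficients for the 1-week horizon
forecast_5 = har_forecast(sim_returns, beta_5)
```

Calling `fit_har` once per horizon mirrors the re-estimation of the coefficients for each T; nothing in this unconstrained regression prevents a negative forecast.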
10Note that the unconstrained HAR model suffers from the same drawback as the AR(1) model: the forecasted volatility can in principle be negative. However, the main justification for using this model is that it provides a superior long-horizon forecast accuracy as compared to that of the GARCH model.
4.2 Measuring Forecast Accuracy
The standard procedure for assessing the forecast accuracy starts with evaluating the forecast
errors
$$e_{t,t+T} = \hat{\sigma}_{t,t+T} - \sigma_{t,t+T}, \tag{4.10}$$
which is the difference between the T -day forecasted volatility σ̂t,t+T and the realized volatility
σt,t+T . To measure the forecast accuracy, most often researchers use the MSE because it is
robust to the estimation error in the volatility proxy. The MSE is computed according to
MSE =1
M
M∑t=1
e2t,t+T , (4.11)
where M is the number of T -day ahead volatility forecasts in the out-of-sample period. Quite
obviously, the best forecasting model is that one which has the smallest MSE.
However, the MSE measure has several drawbacks. In the context of our empirical study,
two of them deserve mentioning. First, the MSE measures the absolute size of the squared
forecast errors and, therefore, it is scale-dependent (see Hyndman and Athanasopoulos (2013)).
As a result, the MSE can be used for comparing forecasting models on a single dataset only.
Second, even though the MSE allows comparing alternative forecasting models, it does not
allow measuring predictive accuracy per se. For example, if the volatility over some forecast
horizon is unpredictable, all model forecasts are likely to be worthless. In this case using
the MSE criterion (to select the best model among the poor ones) creates the illusion of
predictability when none is present.
To overcome the first drawback and obtain a scale-free measure of forecast accuracy, one
possibility is to compute the following ratio

$$\frac{\sum_{t=1}^{M} e_{t,t+T}^2}{\sum_{t=1}^{M} (\sigma_{t,t+T} - \bar{\sigma}_{t,t+T})^2}, \tag{4.12}$$

where $\bar{\sigma}_{t,t+T}$ is the mean value of the realized volatility in the out-of-sample period

$$\bar{\sigma}_{t,t+T} = \frac{1}{M} \sum_{t=1}^{M} \sigma_{t,t+T}. \tag{4.13}$$
The ratio given by (4.12) compares the sum of squared forecast errors with the sum of squared
variations of $\sigma_{t,t+T}$. Hence, this ratio measures the relative size of the forecast errors and
can be used for comparing forecasting models on several datasets. Still, this ratio retains the
second drawback because a particular value of this ratio is difficult to interpret.
Galbraith (2003) suggests measuring the forecast accuracy using the following function

$$C = 1 - \frac{\mathrm{MSE}}{\mathrm{MSE}_{BM}}, \tag{4.14}$$

where $\mathrm{MSE}_{BM}$ is the MSE of a selected Benchmark Model (BM), for example, the historical
mean model.11 Again, function $C$ can be used for comparing forecasting models on several
datasets, but it seems to retain the second drawback because a selected benchmark model may
be totally useless for forecasting.
To overcome both drawbacks, we employ the proportion of variance explained by the
forecasts (this measure is proposed by Blair, Poon, and Taylor (2001)):

$$P_s(T_i) = 1 - \frac{\sum_{t=1}^{M} e_{t,t+T_i}^2}{\sum_{t=1}^{M} (\sigma_{t,t+T_i} - \bar{\sigma}_{t,t+T_i})^2}. \tag{4.15}$$

Notice that this measure equals one minus the ratio given by (4.12), therefore it is a scale-free
measure as well. Observe in addition that the computation of $P$ is similar to the computation of
the out-of-sample R-squared ($R^2$) in the constrained linear regression model with zero intercept
and unit slope. Therefore, the computation of $P$ can be interpreted as

$$P = 1 - \frac{\mathrm{MSE}}{\mathrm{TSS}}, \tag{4.16}$$

where TSS denotes the Total Sum of Squares (with the MSE and the TSS taken on the same scale, that is, both as sums or both as averages over the $M$ forecasts, as in (4.15)). Consequently, the value of $P$ can be conveniently
reported in percentages.
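As an illustration of (4.15), the following minimal Python sketch computes the proportion of variance explained from two equally long lists of forecasted and realized volatilities. The function name `forecast_accuracy` and the toy numbers are hypothetical and are not taken from the study; they only show that a forecast equal to the out-of-sample mean gives a value of zero and that poor forecasts give a negative value.

```python
def forecast_accuracy(forecasts, realized):
    """Proportion of variance explained: 1 - sum(e^2) / sum((sigma - mean)^2)."""
    if len(forecasts) != len(realized) or len(realized) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_realized = sum(realized) / len(realized)
    # Sum of squared forecast errors (numerator of the ratio in (4.12)).
    sse = sum((f - r) ** 2 for f, r in zip(forecasts, realized))
    # Sum of squared deviations of realized volatility from its mean.
    tss = sum((r - mean_realized) ** 2 for r in realized)
    if tss == 0.0:
        raise ValueError("realized volatility has no variation")
    return 1.0 - sse / tss


# Toy (made-up) data, in percent.
realized = [10.0, 12.0, 15.0, 11.0, 9.0]
good = [10.5, 12.5, 14.0, 11.5, 9.5]
flat = [sum(realized) / len(realized)] * len(realized)  # out-of-sample mean
bad = [15.0, 9.0, 10.0, 16.0, 14.0]

p_good = forecast_accuracy(good, realized)
p_flat = forecast_accuracy(flat, realized)
p_bad = forecast_accuracy(bad, realized)
```

The same function can be applied to forward volatilities by passing the forward forecasts and the realized forward volatilities instead of the spot ones.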
It is worth noting that the smaller the respective MSE, the closer $P$ is to 100%. Therefore
this measure allows evaluating predictive accuracy per se. Given that $P$ is equivalent to an $R^2$
in the restricted model, it is likely to be smaller than the conventional $R^2$. The value of $P$ can even
be negative since the ratio given by (4.12) can be greater than 1. A negative $P$ indicates that
the forecast errors have a greater amount of variation than the actual volatility, which means
11In the historical mean model, the forecasted volatility equals the sample mean of the historical volatility.
that a forecasting model does not have any predictive power. The subscript s in this function
emphasizes that the function uses the spot predicted volatilities. Thus, in our definition, this
function produces the spot forecast accuracy curve.
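The forward forecast accuracy curve is defined by

$$P_f(T_i) = 1 - \frac{\sum_{t=1}^{M} e_{t+T_{i-1},t+T_i}^2}{\sum_{t=1}^{M} (\sigma_{t+T_{i-1},t+T_i} - \bar{\sigma}_{t+T_{i-1},t+T_i})^2}, \tag{4.17}$$

where

$$e_{t+T_{i-1},t+T_i} = \hat{\sigma}_{t+T_{i-1},t+T_i} - \sigma_{t+T_{i-1},t+T_i} \tag{4.18}$$

is the error in forecasting forward volatilities.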
4.3 Statistical Inference
We conduct inference about estimated forecast accuracies. Specifically, we test the following
null hypothesis for both the spot and forward forecast accuracy:

$$H_0 : P(T_i) \leq 0. \tag{4.19}$$

In words, the null hypothesis assumes the absence of predictive ability over a horizon of length
$T_i$. We illustrate the computation of the p-value of the hypothesis for the spot forecast accuracy. The computation of the p-value of the corresponding hypothesis for the forward forecast
accuracy is conducted along similar lines.
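We remind the reader that the spot forecast accuracy over a horizon of length $T_i$ is computed
as

$$P_s(T_i) = 1 - \frac{\mathrm{MSE}}{\mathrm{TSS}} = 1 - \frac{\sum_{t=1}^{M} e_{t,t+T_i}^2}{\sum_{t=1}^{M} \varepsilon_{t,t+T_i}^2}, \tag{4.20}$$

where

$$\varepsilon_{t,t+T_i} = \sigma_{t,t+T_i} - \bar{\sigma}_{t,t+T_i}. \tag{4.21}$$

Therefore, the null hypothesis can alternatively be formulated as

$$H_0 : \frac{\mathrm{MSE}}{\mathrm{TSS}} \geq 1. \tag{4.22}$$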
In words, under the null hypothesis the MSE is greater than or equal to the TSS. Consequently, we
reject the null hypothesis when the MSE is significantly below the TSS.
If the time series of $e_{t,t+T_i}$ and $\varepsilon_{t,t+T_i}$ are assumed to be Gaussian, serially uncorrelated,
and contemporaneously uncorrelated, then the ratio $\mathrm{MSE}/\mathrm{TSS}$ under the null hypothesis has the
usual $F$-distribution. However, in our case, the assumptions listed above are not met. First,
because we perform multi-step ahead volatility forecasting, the time series of $e_{t,t+T_i}$ and
$\varepsilon_{t,t+T_i}$ are serially correlated. Second, the time series of $e_{t,t+T_i}$ and $\varepsilon_{t,t+T_i}$ are contemporaneously correlated.12 Finally, the assumption of Gaussian errors also seems to be inappropriate.
Therefore, to compute the p-value of the null hypothesis we employ the block bootstrap approach.
Each bootstrap trial consists of two steps. First, using the original time series $\{e_{1,1+T_i}, e_{2,2+T_i}, \ldots, e_{M,M+T_i}\}$
and $\{\varepsilon_{1,1+T_i}, \varepsilon_{2,2+T_i}, \ldots, \varepsilon_{M,M+T_i}\}$ we construct two re-samples $\{e^*_{1,1+T_i}, e^*_{2,2+T_i}, \ldots, e^*_{M,M+T_i}\}$
and $\{\varepsilon^*_{1,1+T_i}, \varepsilon^*_{2,2+T_i}, \ldots, \varepsilon^*_{M,M+T_i}\}$ using the stationary block-bootstrap
method of Politis and Romano (1994). The optimal block length is selected automatically using
the method proposed by Politis and White (2004).13 The two re-sampled time series retain not
only the historical serial correlations, but also the historical contemporaneous correlation between
the original time series. The latter is achieved by ensuring that in the re-sampled data the pair
$\{e^*_{t,t+T_i}, \varepsilon^*_{t,t+T_i}\}$ corresponds to the pair of original observations $\{e_{\tau,\tau+T_i}, \varepsilon_{\tau,\tau+T_i}\}$ at some time
$\tau$. Second, using the bootstrapped data we compute $\mathrm{MSE}^*$, $\mathrm{TSS}^*$, and finally the fraction
$\mathrm{MSE}^*/\mathrm{TSS}^*$. We estimate the sampling distribution of the ratio $\mathrm{MSE}/\mathrm{TSS}$ by carrying out $N = 1000$
bootstrap trials in total. Finally, to estimate the significance level, we count how many times
the simulated value of the ratio $\mathrm{MSE}^*/\mathrm{TSS}^*$ is greater than or equal to 1. Denote this count
by $n$. The p-value of the predictive ability test (over a horizon of length $T_i$) is computed as
$p(T_i) = n/N$.
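The bootstrap test can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the exact procedure used in the study: the mean block length `mean_block` is a fixed, user-supplied value (the automatic block-length selection of Politis and White (2004) is not implemented), and the function names are hypothetical. The same resampled indices are applied to both series, so that each resampled pair corresponds to a pair of original observations.

```python
import random


def stationary_bootstrap_indices(m, mean_block, rng):
    """Index path of a stationary bootstrap: geometric block lengths, wrap-around."""
    p_new_block = 1.0 / mean_block
    idx = [rng.randrange(m)]
    while len(idx) < m:
        if rng.random() < p_new_block:
            idx.append(rng.randrange(m))       # start a new block at a random point
        else:
            idx.append((idx[-1] + 1) % m)      # continue the current block
    return idx


def bootstrap_p_value(e, eps, mean_block=10, n_trials=1000, seed=0):
    """Share of trials in which the resampled sum(e^2) / sum(eps^2) is >= 1."""
    if len(e) != len(eps):
        raise ValueError("e and eps must have the same length")
    rng = random.Random(seed)
    m = len(e)
    count = 0
    for _ in range(n_trials):
        idx = stationary_bootstrap_indices(m, mean_block, rng)
        # Same indices for both series keeps the pairs (e_t, eps_t) together.
        sse = sum(e[i] ** 2 for i in idx)
        tss = sum(eps[i] ** 2 for i in idx)
        if sse >= tss:                          # equivalent to ratio >= 1 when tss > 0
            count += 1
    return count / n_trials
```

Comparing the two sums directly, instead of forming the ratio, avoids a division by zero in degenerate re-samples.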
Even though we compute the forecast accuracy curves and corresponding p-values for each
individual financial asset, we do not report the results for each individual asset because each of
these results might not be truly representative. Instead, we perform a so-called “meta-analysis”
that combines the results for all individual assets that belong to the same asset class. A key
12 For the sake of illustration, suppose that at some time the volatility increases dramatically. Subsequently, as a result of this spike in volatility, both the forecast error and the difference between the current volatility and the long-run volatility increase.
13See also the subsequent correction of the method by Patton, Politis, and White (2009).
benefit of this approach is that the aggregation of information on individual assets leads to
a higher statistical power and more robust forecast accuracy estimates than it is possible to
obtain from the information on any individual asset.
The empirical results on the forecast accuracy curves are reported for each asset class by
averaging the forecast accuracy curves for individual assets. Specifically, denoting by $k$ the
number of individual assets that belong to the same asset class, we compute

$$P(T_i) = \frac{1}{k} \sum_{j=1}^{k} P_j(T_i), \tag{4.23}$$

where $P_j(T_i)$ denotes the (spot or forward) forecast accuracy over a horizon of length $T_i$ for asset
$j$. Similarly, for each asset class we combine the results of multiple tests of the null hypothesis
to ask whether there is evidence from the collection of individual tests that might reject the
null hypothesis. In other words, we combine k p-values for individual assets that belong to the
same asset class to test whether collectively they can reject a common null hypothesis of no
predictive ability.
When the p-values of individual tests are independent, Fisher’s method (Fisher (1925)) of
combining the probabilities is asymptotically optimal among essentially all methods of com-
bining the results of independent tests (Littell and Folks (1971)). The method is to compute
the following test statistic

$$\Psi(T_i) = \sum_{j=1}^{k} -2 \log p_j(T_i), \tag{4.24}$$

where $p_j(T_i)$ denotes the p-value of the volatility unpredictability hypothesis over a horizon of
length $T_i$ for asset $j$. Fisher demonstrated that for independent p-values the statistic $\Psi(T_i)$
follows a chi-squared distribution with $2k$ degrees of freedom, $\Psi(T_i) \sim \chi^2_{2k}$.
Brown (1975) extended Fisher's method to the dependent case where p-values are
correlated. In the dependent case, the statistic $\Psi(T_i)$ has the following mean and variance

$$E[\Psi(T_i)] = 2k, \qquad Var[\Psi(T_i)] = 4k + 2 \sum\sum_{m<j} Cov\left(-2 \log p_m(T_i), -2 \log p_j(T_i)\right), \tag{4.25}$$
where $Cov(x, y)$ represents the covariance between $x$ and $y$. Brown's method is based on
the assumption that the distribution of $\Psi(T_i)$ can be approximated by that of $c\chi^2_{2f}$ where
$c$ represents a re-scaling constant and $\chi^2_{2f}$ is a chi-squared distribution with $2f$ degrees of
freedom. Brown calculated $c$ and $f$ by equating the first two moments of $\Psi(T_i)$ and $c\chi^2_{2f}$
resulting in

$$f = \frac{E[\Psi(T_i)]^2}{Var[\Psi(T_i)]}, \qquad c = \frac{Var[\Psi(T_i)]}{2E[\Psi(T_i)]} = \frac{k}{f}. \tag{4.26}$$

The combined p-value is then given by

$$p(T_i) = 1.0 - \Phi_{2f}\left(\frac{\Psi(T_i)}{c}\right), \tag{4.27}$$

where $\Phi_{2f}$ is the cumulative distribution function of $\chi^2_{2f}$.
The covariances in (4.25) can be evaluated using either numerical integration or
Gaussian quadrature. We follow Brown's original method and use a Gaussian quadrature
that approximates the covariances by two quadratic functions of the correlation coefficient
$\rho(-2 \log p_m(T_i), -2 \log p_j(T_i))$. The problem is that we do not have data on the correlation
coefficients between two individual p-values. However, since any such correlation coefficient is
a function of the correlation between forecast accuracies of assets $m$ and $j$, we assume that
these correlations can be estimated using the time series of squared errors. That is, we assume
that

$$\rho\left(-2 \log p_m(T_i), -2 \log p_j(T_i)\right) = \rho\left((e^m_{t,t+T_i})^2, (e^j_{t,t+T_i})^2\right),$$

where $e^m_{t,t+T_i}$ and $e^j_{t,t+T_i}$ are forecast errors for assets $m$ and $j$ respectively. We find that the
time series of squared forecast errors exhibit significant positive correlations. For example, for
the data set of stocks, depending on the length of the forecast horizon $T_i$ the average correlation
coefficient varies from 0.25 to 0.55.
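The combination of p-values in (4.24)-(4.27) can be sketched in Python as below. Several elements are assumptions made for illustration only: the coefficients in `brown_cov` are the commonly quoted form of Brown's two quadratic approximations and should be checked against Brown (1975); `chi2_sf` is a simple hand-written series for the chi-squared upper tail, used here only to keep the sketch free of external libraries; and all function names are hypothetical. The individual p-values must be strictly positive.

```python
import math


def chi2_sf(x, df):
    """Upper tail of a chi-squared distribution (series for the incomplete gamma)."""
    a, z = df / 2.0, x / 2.0
    if z <= 0.0:
        return 1.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total and n < 10000:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return min(1.0, max(0.0, 1.0 - lower))


def brown_cov(rho):
    """Assumed quadratic approximation to Cov(-2 log p_m, -2 log p_j)."""
    if rho >= 0.0:
        return rho * (3.25 + 0.75 * rho)
    return rho * (3.27 + 0.71 * rho)


def combined_p_value(p_values, corr):
    """Combine k p-values; corr[m][j] is the assumed correlation for the pair (m, j)."""
    k = len(p_values)
    psi = sum(-2.0 * math.log(p) for p in p_values)            # (4.24)
    mean = 2.0 * k
    var = 4.0 * k + 2.0 * sum(brown_cov(corr[m][j])            # (4.25)
                              for j in range(k) for m in range(j))
    f = mean ** 2 / var                                        # (4.26)
    c = var / (2.0 * mean)                                     # equals k / f
    return chi2_sf(psi / c, 2.0 * f)                           # (4.27)
```

With zero correlations the sketch reduces to Fisher's method ($c = 1$, $f = k$); positive correlations lower the effective degrees of freedom and make the combined p-value more conservative.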
5 Empirical Results
We remind the reader that our total sample covers the period from January 1995 to December
2016. The period from January 1995 to December 1999 (5 years) is used as the initial in-sample period. Consequently, the out-of-sample period in our study runs from January 2000 to
December 2016 (17 years) and covers several alternating calm and turbulent periods. All
forecasts are obtained using an expanding window scheme. Given a fixed forecasting model
and a fixed forecast horizon, we perform out-of-sample volatility forecasting for every asset
in a selected data set. Specifically, first the parameters of a model are estimated using in-sample observations $[1, 2, \ldots, t]$. Then the future volatility is forecasted $T$ days ahead, $T \in
\{5, 10, 15, 20, \ldots, 170, 175\}$. After that, we expand the in-sample period by one day (it becomes
$[1, 2, \ldots, t + 1]$) and repeat the forecasting procedure. Since estimation of the parameters of
each forecasting model is rather time consuming, to speed up the forecasting process we re-estimate the model's parameters every 50 days only. At the end of this forecasting process, we
compute the spot and forward forecast accuracy for each asset in a data set, as well as the
corresponding p-values of the predictive ability test. Finally, we compute the average spot and
forward forecast accuracy over all assets in a data set, as well as the p-values of the combined
probability tests.
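The expanding-window scheme with periodic re-estimation can be sketched in Python as follows. The callables `fit` and `predict`, as well as the toy historical-mean model used in the example, are hypothetical placeholders and do not correspond to the EWMA, GARCH, or HAR implementations of the study.

```python
def expanding_window_forecasts(series, first_origin, horizon, fit, predict,
                               refit_every=50):
    """Return (origin, forecast) pairs from an expanding-window scheme.

    fit(history) returns model parameters; predict(params, history, horizon)
    returns the forecast. Both are user-supplied (hypothetical) callables.
    """
    forecasts = []
    params = None
    # The last usable origin still leaves `horizon` observations to evaluate.
    for step, t in enumerate(range(first_origin, len(series) - horizon + 1)):
        history = series[:t]                 # in-sample observations 1..t
        if step % refit_every == 0:
            params = fit(history)            # re-estimate only every `refit_every` days
        forecasts.append((t, predict(params, history, horizon)))
    return forecasts


# Toy example: a historical-mean "model" on made-up data.
def fit_mean(history):
    return sum(history) / len(history)


def predict_mean(params, history, horizon):
    return params


toy_series = [float(x) for x in range(1, 201)]
toy_forecasts = expanding_window_forecasts(toy_series, 100, 5, fit_mean, predict_mean)
```

Between re-estimations the parameters are held fixed, while the history passed to `predict` keeps growing by one observation per step.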
For each financial asset class and sub-class, Figures 2 - 7 plot the average spot and forward
forecast accuracy curves produced by three volatility forecasting models: EWMA, GARCH,
and HAR. The same figures plot the corresponding p-values of the combined probability test
for each forecasting model. Specifically, in each figure the top left panel plots the average
spot forecast accuracy curves, whereas the top right panel plots the average forward forecast
accuracy curves. The forecast accuracy is reported in percentages. The forecast horizon varies
from 1 to 35 (20) weeks for the spot (forward) forecast accuracy. The bottom left panel
plots the p-values of the combined probability test for the spot forecast accuracies, while the
bottom right panel plots the p-values of the combined probability test for the forward forecast
accuracies. The dashed horizontal lines in the bottom panels show the locations of the 5% and
10% significance levels. A p-value below the chosen significance level leads to a rejection of the
volatility unpredictability hypothesis.
Figure 2: Individual stocks.
[Plot: four panels, each plotted against the forecast horizon in weeks for the EWMA, GARCH, and HAR models: spot forecast accuracy (%), forward forecast accuracy (%), spot forecast p-value, and forward forecast p-value.]
Notes: The top left panel plots the average spot forecast accuracy curves, whereas the top right panel plots the average forward forecast accuracy curves. The bottom left panel plots the p-values of the combined probability test for the spot forecast accuracies, while the bottom right panel plots the p-values of the combined probability test for the forward forecast accuracies.
The main question of our study is how far ahead one can forecast volatility. Perhaps not
surprisingly, the answer to this question depends on whether the spot or forward forecast accu-
racy curve is used. If the spot forecast accuracy curve is used to gauge the limits to volatility
predictability, depending on the forecasting model, asset class, and chosen significance level,
the volatility can be predicted over horizons ranging from 5 weeks to 35 weeks. In contrast,
the forward forecast accuracy curves (together with the corresponding p-values) indicate that
volatility can be predicted only over horizons ranging from 3 to 12 weeks. On average, at the
5% significance level, the horizon of forward volatility predictability is limited to 7.5 weeks
only. This means, among other things, that the forward volatility is not forecastable when the
future period (over which the forward volatility is predicted) is defined to extend beyond the
first 2 months.
Figure 3: Stock market indices
[Plot: four panels, each plotted against the forecast horizon in weeks for the EWMA, GARCH, and HAR models: spot forecast accuracy (%), forward forecast accuracy (%), spot forecast p-value, and forward forecast p-value.]
Notes: The top left panel plots the average spot forecast accuracy curves, whereas the top right panel plots the average forward forecast accuracy curves. The bottom left panel plots the p-values of the combined probability test for the spot forecast accuracies, while the bottom right panel plots the p-values of the combined probability test for the forward forecast accuracies.
The empirical results for all financial asset classes and sub-classes share some similarities.
The first similarity is that, for all asset classes and regardless of the forecasting model, the
forecast accuracy curves have the same shape. Whereas the forward forecast accuracy curve is
a strictly monotonically decreasing and convex function, the spot forecast accuracy curve has
a hump-shaped form. Specifically, the spot forecast accuracy first increases as forecast horizon
increases, then decreases. The maximum is usually attained at a 4-week horizon. Therefore,
judging by the spot forecast accuracy, the volatility can be forecasted with the highest precision
over approximately a 1-month horizon. Then, the longer the forecast horizon, the worse the
forecast accuracy is. For both the spot and forward volatilities the p-values increase with
horizon; the p-values grow faster for the forward volatility than those for the spot volatility.
Figure 4: Long-term bond indices
[Plot: four panels, each plotted against the forecast horizon in weeks for the EWMA, GARCH, and HAR models: spot forecast accuracy (%), forward forecast accuracy (%), spot forecast p-value, and forward forecast p-value.]
Notes: The top left panel plots the average spot forecast accuracy curves, whereas the top right panel plots the average forward forecast accuracy curves. The bottom left panel plots the p-values of the combined probability test for the spot forecast accuracies, while the bottom right panel plots the p-values of the combined probability test for the forward forecast accuracies.
Figure 5: Intermediate-term bond indices
[Plot: four panels, each plotted against the forecast horizon in weeks for the EWMA, GARCH, and HAR models: spot forecast accuracy (%), forward forecast accuracy (%), spot forecast p-value, and forward forecast p-value.]
Notes: The top left panel plots the average spot forecast accuracy curves, whereas the top right panel plots the average forward forecast accuracy curves. The bottom left panel plots the p-values of the combined probability test for the spot forecast accuracies, while the bottom right panel plots the p-values of the combined probability test for the forward forecast accuracies.
The second similarity lies in the comparative ranking of alternative forecasting models.
Typically, the EWMA model produces the worst forecast accuracy, whereas the HAR model
produces the best one. The reader is reminded that the main difference between the EWMA
and GARCH models is that the GARCH model captures the mean reversion of volatility,
whereas the EWMA model does not. Apparently, accounting for mean reversion allows the
GARCH model to outperform the EWMA model. The HAR model includes the lags of real-
ized volatility aggregated over different time horizons. We conjecture that superior forecast
accuracy provided by the HAR model is explained by the fact that this model captures not
only the persistence and mean reversion of volatility, but also the high- and low-frequency
components in the volatility dynamics.
Figure 6: Currencies
[Plot: four panels, each plotted against the forecast horizon in weeks for the EWMA, GARCH, and HAR models: spot forecast accuracy (%), forward forecast accuracy (%), spot forecast p-value, and forward forecast p-value.]
Notes: The top left panel plots the average spot forecast accuracy curves, whereas the top right panel plots the average forward forecast accuracy curves. The bottom left panel plots the p-values of the combined probability test for the spot forecast accuracies, while the bottom right panel plots the p-values of the combined probability test for the forward forecast accuracies.
Qualitatively, for all financial asset classes and sub-classes the forecast accuracy curves
look similar. Yet there are small quantitative differences. In particular, the volatility can
be forecasted with the best accuracy in the stock and currency markets. In these markets,
the spot forecast accuracy amounts to approximately 45% (20%) over horizons ranging from 2
to 5 (20 to 25) weeks. The forward forecast accuracy decreases to zero over horizons ranging
from 10 to 15 weeks. In contrast, the volatility forecast accuracy is worse in the bond and
commodity markets. In these markets, the spot forecast accuracy amounts to about 30% (10%)
over horizons ranging from 2 to 5 (20 to 25) weeks. The forward forecast accuracy decreases
to zero over horizons ranging from 5 to 10 weeks.
Figure 7: Commodities
[Plot: four panels, each plotted against the forecast horizon in weeks for the EWMA, GARCH, and HAR models: spot forecast accuracy (%), forward forecast accuracy (%), spot forecast p-value, and forward forecast p-value.]
Notes: The top left panel plots the average spot forecast accuracy curves, whereas the top right panel plots the average forward forecast accuracy curves. The bottom left panel plots the p-values of the combined probability test for the spot forecast accuracies, while the bottom right panel plots the p-values of the combined probability test for the forward forecast accuracies.
6 Discussion
Why do all standard volatility models fail to provide sensible longer-horizon volatility forecasts?
In this section we argue that the volatility dynamics do not fully correspond to the assumptions
embedded in the standard volatility models, which effectively assume highly persistent and
mean-reverting dynamics. In order to motivate our point of view, consider again the plot of
the historical volatility of the S&P 500 index depicted in Figure 1. After having studied the
dynamics of volatility, one conclusion appears to have emerged, namely, that the volatility
is volatile. High volatility eventually gives way to low volatility and vice versa. On the one
hand, this observation confirms that the volatility is persistent, mean-reverting, and exhibits
volatility clustering. On the other hand, this observation suggests looking at the volatility
dynamics from a different perspective. Specifically, the volatility dynamics can be considered
as a process which is constantly “switching” between high and low volatility states. The
interchanging periods of high and low volatility can be regarded as waves of irregular cycle
length and varying amplitude. Put differently, the volatility exhibits oscillating dynamics: it
cycles erratically over time and periods of high or low volatility follow one another.
Considering the aforesaid, the new perspective on the volatility dynamics motivates us to
describe the evolution of volatility in terms of periods of rising and falling volatility. As a
starting point we postulate the existence of two distinct phases in the evolution of volatility,
specifically, periods of rising and falling volatility. Since a movement from a rising (falling)
volatility phase to a falling (rising) volatility phase involves a turning point, we need an algo-
rithm for dating of turning points in volatility cycles. For this purpose, we employ a simple
and well-known algorithm for detecting turning points between the bull and bear phases of a
financial market, namely, the algorithm of Lunde and Timmermann (2004). This algorithm is
motivated by the idea that, in order to qualify for a distinct bull or bear phase, the financial
asset price should change substantially from the previous peak or trough. For example, the
rise (fall) in the price should be greater than 20% from the previous local trough (peak) in
order to qualify for being a distinct bull (bear) market.
The algorithm of Lunde and Timmermann (2004) is based on imposing a minimum on the
price change since the last peak or trough. This dating rule is implemented in the following
manner. Let $\lambda_1$ be a scalar defining the threshold of the movement in volatility that triggers
a switch from a falling-volatility state to a rising-volatility state, and let $\lambda_2$ be the threshold
for shifts from a rising-volatility state to a falling-volatility state. Denote by $V_t$ the value of
volatility at time $t$ and suppose that a trough in volatility has been detected at time $t_0 < t$.
Therefore, the algorithm knows that a rising-volatility state begins from time $t_0 + 1$. The
algorithm first finds the maximum value of volatility on the time interval $[t_0, t]$

$$V^{\max}_{t_0,t} = \max\{V_{t_0}, V_{t_0+1}, \ldots, V_t\}$$

and then computes the relative decline in volatility, where the maximum value
serves as the reference value

$$\delta_t = \frac{V^{\max}_{t_0,t} - V_t}{V^{\max}_{t_0,t}}.$$

If $\delta_t > \lambda_2$, then a new peak is detected at time $t_{\mathrm{peak}}$ at which volatility attains its maximum on
$[t_0, t]$. The period $[t_0 + 1, t_{\mathrm{peak}}]$ is labeled as a rising-volatility state. A falling-volatility state
begins from $t_{\mathrm{peak}} + 1$.
If, on the other hand, a peak in volatility has been detected at time $t_0 < t$, then the
algorithm finds the minimum value of volatility on the time interval $[t_0, t]$

$$V^{\min}_{t_0,t} = \min\{V_{t_0}, V_{t_0+1}, \ldots, V_t\}$$

and computes the relative change in volatility from the minimum value

$$\delta_t = \frac{V_t - V^{\min}_{t_0,t}}{V^{\min}_{t_0,t}}.$$

If $\delta_t > \lambda_1$, then a new trough is detected at time $t_{\mathrm{trough}}$ at which volatility attains its minimum
on $[t_0, t]$. The period $[t_0 + 1, t_{\mathrm{trough}}]$ is labeled as a falling-volatility state. A rising-volatility
state begins from $t_{\mathrm{trough}} + 1$.
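The dating rule described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name is hypothetical, the series is assumed to be strictly positive, and the initial state (a trough at the first observation, so that the series starts in a rising-volatility state) is an assumption, since the text does not specify how the algorithm is initialized.

```python
def date_volatility_phases(vol, lam1=0.20, lam2=0.15, start_rising=True):
    """Return turning points as (kind, index) pairs, kind in {"peak", "trough"}.

    lam1: threshold for a switch from a falling to a rising state.
    lam2: threshold for a switch from a rising to a falling state.
    """
    turning_points = []
    rising = start_rising
    ext = 0  # index of the running maximum (rising) or minimum (falling)
    for t in range(1, len(vol)):
        if rising:
            if vol[t] > vol[ext]:
                ext = t                                   # new running maximum
            elif (vol[ext] - vol[t]) / vol[ext] > lam2:
                turning_points.append(("peak", ext))      # decline exceeds lam2
                rising, ext = False, t                    # vol[t] is the new running minimum
        else:
            if vol[t] < vol[ext]:
                ext = t                                   # new running minimum
            elif (vol[t] - vol[ext]) / vol[ext] > lam1:
                turning_points.append(("trough", ext))    # rise exceeds lam1
                rising, ext = True, t                     # vol[t] is the new running maximum
    return turning_points


# Made-up volatility path, in percent.
toy_vol = [10.0, 12.0, 15.0, 14.0, 12.0, 11.0, 13.0, 14.0, 16.0, 13.0]
toy_turning_points = date_volatility_phases(toy_vol)
```

At the moment a peak is confirmed, the current observation is necessarily the lowest value since that peak (otherwise the peak would have been confirmed earlier), which is why the running extreme can simply be reset to the current index; the same argument applies to troughs.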
The application of the dating algorithm requires making an arbitrary choice of two parameters $\{\lambda_1, \lambda_2\}$. It is unclear how to make an appropriate choice in our case because, to the best
of the authors' knowledge, no one has previously attempted to describe the evolution of volatility
in terms of periods of rising and falling volatility. Lunde and Timmermann (2004) report the
empirical results for several alternative sets of parameters. We select the most typical parameter values $\{\lambda_1 = 20\%, \lambda_2 = 15\%\}$. It should be noted that the results reported
below in this section are, to some extent, sensitive to changes in the parameters. Qualitatively,
however, our findings remain intact when the parameters of the algorithm are changed.
Figure 8 illustrates the results of turning points detection for four individual assets14 that
belong to different asset classes: the S&P500 stock market index, the long-term bond market
index, the US/UK exchange rate, and the Brent crude oil price. Shaded areas in each panel
indicate the periods of falling volatility. Once we establish turning points in volatility, it
is possible to summarize various characteristics of the movements between each phase. We
compute the duration of each phase, D, and report the minimum, average, median, and the
maximum duration. For instance, the duration of a rising-volatility phase is computed as
$$D = t_{\mathrm{peak}} - t_{\mathrm{trough}},$$

where $t_{\mathrm{trough}}$ denotes the date of a trough in volatility and $t_{\mathrm{peak}}$ denotes the date of the subsequent peak in volatility. We also compute the amplitude of each phase, $A$, and report the
minimum, average, median, and the maximum amplitude. The amplitudes of the rising- and
falling-volatility phases are computed as

$$A_{\mathrm{rising}} = \frac{V_{\mathrm{peak}} - V_{\mathrm{trough}}}{V_{\mathrm{trough}}}, \qquad A_{\mathrm{falling}} = \frac{V_{\mathrm{trough}} - V_{\mathrm{peak}}}{V_{\mathrm{peak}}}.$$

For a rising-volatility phase, $V_{\mathrm{trough}}$ denotes the value of volatility at a trough and $V_{\mathrm{peak}}$ denotes
the value of volatility at the subsequent peak. For a falling-volatility phase, $V_{\mathrm{peak}}$ denotes the
value of volatility at a peak and $V_{\mathrm{trough}}$ denotes the value of volatility at the subsequent trough.
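Given a list of dated turning points, the duration and amplitude of each phase can be computed as in the following Python sketch. The input format (a list of `(kind, index)` pairs with alternating peaks and troughs) and the function name are hypothetical; durations are returned in observations, and the duration of a falling-volatility phase is assumed to be defined analogously to that of a rising-volatility phase.

```python
def phase_statistics(vol, turning_points):
    """Return (phase, duration, amplitude) for consecutive turning points.

    turning_points: alternating ("peak", index) / ("trough", index) pairs.
    """
    stats = []
    for (_, i), (kind_end, j) in zip(turning_points, turning_points[1:]):
        # A phase that ends at a peak is rising; one that ends at a trough is falling.
        phase = "rising" if kind_end == "peak" else "falling"
        duration = j - i
        # Relative change from the starting turning point: this equals
        # (V_peak - V_trough) / V_trough for rising phases and
        # (V_trough - V_peak) / V_peak for falling phases.
        amplitude = (vol[j] - vol[i]) / vol[i]
        stats.append((phase, duration, amplitude))
    return stats


# Made-up volatility path and turning points, for illustration only.
toy_vol = [10.0, 12.0, 15.0, 14.0, 12.0, 11.0, 13.0, 14.0, 16.0, 13.0]
toy_points = [("peak", 2), ("trough", 5), ("peak", 8)]
toy_stats = phase_statistics(toy_vol, toy_points)
```

By construction, rising-phase amplitudes are positive and falling-phase amplitudes are negative, matching the sign convention of the amplitude formulas.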
The summary statistics of periods of rising and falling volatility are reported in Table 2.
The statistics in the table are strikingly similar across asset classes, not only qualitatively
but also quantitatively. It is clear that falling-volatility phases tend to
be longer than rising-volatility phases. Both the median and the average duration of
phases are rather stable across asset classes. Specifically, the average (median) durations
of the rising- and falling-volatility phases amount to 6.5 (5) and 13 (10) weeks respectively.
Consequently, the average (median) duration of a falling-volatility phase exceeds the average
(median) duration of a rising-volatility phase by a factor of 2. The minimum duration of a
14 For each asset, the historical volatility is estimated using daily returns and the EWMA model. It should be noted, however, that volatility is unobservable and thus must be estimated from the data. There are several alternative approaches to estimating the historical volatility. We use the EWMA model because it produces a much smoother volatility curve as compared to, for example, a conditional volatility from the fitted GARCH(1,1) model.
Figure 8: Periods of rising and falling volatility identified by the dating algorithm
[Plot: four panels showing volatility (%) over time for the S&P 500 index, the long-term bond index, the US/UK exchange rate, and the crude oil brent price.]
Notes: Shaded areas indicate the periods of falling volatility.
rising-volatility phase amounts to 1 day only (in bond and currency markets). This means
that volatility can increase by more than 20% over the course of a single day. The average
(median) full cycle length amounts to 21.5 (15) weeks. On average, over a rising-volatility
phase, volatility increases by approximately 60% from the previous trough; over a falling-
volatility phase it decreases by about 30% from the previous peak. The median peak amplitude
amounts to approximately 40%. Since the median peak amplitude is substantially less than
the average peak amplitude, the distribution of peaks in volatility is highly non-symmetrical.
Specifically, the distribution of peaks is skewed to the right.
The summary statistics of periods of rising and falling volatility, reported in Table 2, help
explain why all standard volatility models fail to provide sensible longer-horizon volatility
forecasts. First of all, the volatility is sometimes rather non-persistent. That is, its value can
change dramatically over the course of a single day. Second, the volatility does not revert to its
long-run mean as the models assume. There are two issues with the mean-reverting dynamics
incorporated in the GARCH and HAR models. The first issue is that a typical estimate of the
Table 2: Summary statistics of periods of rising and falling volatility

                      S&P 500           Bond index        US/UK rate        Crude oil price
Statistics            Rising  Falling   Rising  Falling   Rising  Falling   Rising  Falling
Number of phases          56       57       58       58       56       55       59       59
Minimum duration         0.6      2.6      0.2      3.0      0.2      2.2      1.0      2.4
Average duration         6.5     13.1      7.1     12.2      6.7     13.3      6.5     12.7
Median duration          4.4     10.4      5.8     10.6      5.7     11.2      5.6     10.0
Maximum duration        25.0     45.6     27.6     40.8     24.4     40.4     22.4     52.8
Minimum amplitude         20      -15       21      -16       20      -15       20      -16
Average amplitude         71      -36       49      -30       56      -32       58      -32
Median amplitude          47      -36       36      -28       38      -32       39      -30
Maximum amplitude        352      -78      177      -53      265      -58      338      -74

Notes: Duration is measured in weeks. The amplitude is measured in percentages.
volatility half-life amounts to 15 weeks (Engle and Patton (2001)). That is, if volatility is above
the long-run mean, the mean-reverting models assume that during the 15 subsequent weeks the
volatility will move halfway back towards its long-run mean. In fact, our results reveal that in
50% of cases the volatility goes through a full cycle over the course of the subsequent 15 weeks:
a period of increasing volatility is followed by a period of decreasing volatility or vice
versa. Interestingly, the estimated median cycle duration (15 weeks) is twice as long as the
estimated horizon of predictability of forward volatility (7.5 weeks). This observation suggests
that the horizon of volatility predictability is limited from above by the median half-cycle
duration.
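The half-life figure can be linked to the persistence of a mean-reverting model. As a sketch, assume a GARCH(1,1)-type forecast in which the gap between the forecast variance and its long-run level shrinks geometrically at rate $\phi$ per week. The symbols $\phi$, $\bar{\sigma}^2$, and $k$ are introduced here for illustration and are not the notation used elsewhere in the paper.

```latex
\begin{align*}
\hat{\sigma}^2_{t+h} - \bar{\sigma}^2 &= \phi^{h}\,\bigl(\sigma^2_{t} - \bar{\sigma}^2\bigr), \\
\phi^{k} = \tfrac{1}{2} \quad &\Longrightarrow \quad k = \frac{\ln(1/2)}{\ln \phi}, \\
k = 15 \text{ weeks} \quad &\Longrightarrow \quad \phi = 2^{-1/15} \approx 0.955 .
\end{align*}
```

Under this assumption, a share $2^{-7.5/15} = 2^{-1/2} \approx 0.71$ of the initial gap is still expected to remain at the 7.5-week horizon, whereas the observed median cycle is already half complete by then.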
The second issue is that no model assumes that if volatility is above (below) its long-run
mean, it can increase (decrease) even further. To make the discussion more concrete, consider
two specific examples that use data on the historical volatility of the S&P 500 index. Over
the period from 1995 to 2016, the long-run average volatility of the S&P 500 index was 16.6%
(on an annual basis). In the first example, consider the volatility forecast on September 19, 2008,
when the index volatility increased to 33%. The EWMA model assumes that volatility is
persistent and, therefore, forecasts a volatility of 33% at all future horizons.[15] The GARCH
model, on the other hand, assumes that volatility is persistent and mean-reverting. As a result,
the GARCH model forecasts that over the course of the following 15 weeks volatility will
gradually decrease to 25%. No model assumes that volatility can increase further. However,
volatility kept increasing and on December 2, 2008, it reached 74%. In the second example,
consider the volatility forecast on August 15, 2003, when volatility decreased to 15%. Whereas
the EWMA model predicts that volatility will stay at the same 15% level at all future horizons,
the GARCH model forecasts that volatility will gradually increase to its long-run mean. No
model assumes that volatility can decrease further. In reality, volatility kept falling and
decreased to 8%; it stayed below its long-run mean until August 6, 2007, that is, for 4 full years.

[15] For the sake of illustration, we assume that $r_t = 0$ in the EWMA model given by equation (4.2).
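The two S&P 500 examples can be reproduced numerically. The sketch below is illustrative only: it assumes that the mean-reverting forecast closes the gap to the long-run mean geometrically in the volatility level with a 15-week half-life (a fitted GARCH model reverts in variance, and its estimated parameters are not used here), and the function names are hypothetical.

```python
def ewma_forecast(current_vol: float, horizon_weeks: float) -> float:
    """EWMA-style forecast: the current level is carried forward unchanged."""
    return current_vol


def mean_reverting_forecast(current_vol: float, long_run_vol: float,
                            horizon_weeks: float,
                            half_life_weeks: float = 15.0) -> float:
    """Assumed mean-reverting forecast: the gap halves every half-life."""
    remaining_gap = 0.5 ** (horizon_weeks / half_life_weeks)
    return long_run_vol + (current_vol - long_run_vol) * remaining_gap


LONG_RUN_VOL = 16.6  # S&P 500 long-run average volatility, 1995-2016, % per year

# (forecast date, volatility on that date, subsequent extreme reported in the text)
examples = [
    ("2008-09-19", 33.0, 74.0),
    ("2003-08-15", 15.0, 8.0),
]

for date, current, realized in examples:
    flat = ewma_forecast(current, 15)
    reverting = mean_reverting_forecast(current, LONG_RUN_VOL, 15)
    # Both forecasts lie between the current level and the long-run mean.
    low, high = sorted((current, LONG_RUN_VOL))
    outside = not (low <= realized <= high)
    print(f"{date}: EWMA {flat:.1f}%, mean-reverting {reverting:.1f}%, "
          f"realized extreme {realized:.1f}% (outside forecast band: {outside})")
```

Under these assumptions the 15-week mean-reverting forecast is 24.8% in the first example, close to the 25% quoted above, and 15.8% in the second. By construction both forecasts stay between the current level and the long-run mean, whereas the realized extremes of 74% and 8% fall outside that band.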
7 Summary and Conclusions
Even though volatility forecasting is crucial for portfolio management, risk management, and
pricing of derivative securities, little is known about how far ahead one can forecast
volatility. Whereas the results reported in some papers seem to suggest that volatility is
forecastable over long-term horizons that extend to several years, a handful of studies
demonstrate that the forecast horizon is limited to 6-8 weeks. In this paper we aim to
fill this gap in the literature on the horizon of volatility predictability.
First, we suggest a novel approach to measuring forecast accuracy over various
horizons. Specifically, we propose to use not only the spot forecast accuracy curve, but also the
forward forecast accuracy curve. We argue that the forward forecast accuracy curve provides
useful information about a model’s ability to forecast volatility across various horizons, and
that the term structure of volatility predictability should be described by the spot and forward
forecast accuracy curves together. Both curves are highly relevant in practice because
financial markets trade long-term option contracts (LEAPS) with maturities of up to 39
months and contracts on forward volatility (FVA) with maturities of up to 24
months. Traders in these contracts are naturally interested in the horizon of
spot and forward volatility predictability.
Second, using three popular models (EWMA, GARCH, and HAR), we conduct the most
comprehensive evaluation of the horizon of volatility predictability in all major financial
markets. Our empirical results are strikingly similar across the different financial markets. We find
that, depending on the asset class, the horizon of spot volatility predictability is confined
to 20-35 weeks, whereas the horizon of forward volatility predictability is even shorter and
limited to 5-10 weeks. The longest horizon of volatility predictability is observed in the
stock and currency markets, whereas the shortest one is observed in the bond markets. In
the majority of cases, the HAR model provides the best forecast accuracy, while the EWMA
model provides the worst. Our results are thus consistent with academic studies such as
Christoffersen and Diebold (2000) and Galbraith and Kisinbay (2005), who find that volatility
is unforecastable beyond a relatively short-term horizon. In addition, our results suggest that
the horizon of volatility predictability is much shorter than the longest maturity of traded
LEAPS and FVA contracts.
Finally, we suggest a plausible explanation for why standard volatility models are not able
to provide sensible longer-horizon volatility forecasts. Our analysis reveals an important but
currently overlooked stylized fact about volatility: it cycles erratically over time and periods
of high or low volatility follow one another. For a few distinct financial assets, we provide
descriptive statistics of the periods of rising and falling volatility. We demonstrate
that the volatility dynamics do not fully correspond to the assumptions embedded in standard
models, which assume that volatility is highly persistent and mean-reverting. Specifically, we
find that volatility is sometimes rather non-persistent: its value can change dramatically
over the course of a single day. In addition, we show that volatility does not revert
to its long-run mean as the models assume. All this suggests the desirability of developing
volatility models that embed the new stylized fact about volatility dynamics. Such models could
substantially extend the horizon of volatility predictability.
References
Alford, A. W. and Boatsman, J. R. (1995). “Predicting Long-Term Stock Return Volatility:
Implications for Accounting and Valuation of Equity Derivatives”, Accounting Review,
70 (4), 599–618.
Blair, B. J., Poon, S.-H., and Taylor, S. J. (2001). “Forecasting S&P 100 Volatility: The
Incremental Information Content of Implied Volatilities and High-Frequency Index Returns”,
Journal of Econometrics, 105 (1), 5–26.
Bollerslev, T. (1986). “Generalized Autoregressive Conditional Heteroskedasticity”, Journal
of Econometrics, 31 (3), 307–327.
Brown, M. B. (1975). “400: A Method for Combining Non-Independent, One-Sided Tests of
Significance”, Biometrics, 31 (4), 987–992.
Cao, C. Q. and Tsay, R. S. (1992). “Nonlinear Time-Series Analysis of Stock Volatilities”,
Journal of Applied Econometrics, 7 (S1), S165–S185.
Christoffersen, P. F. and Diebold, F. X. (2000). “How Relevant is Volatility Forecasting for
Financial Risk Management?”, Review of Economics and Statistics, 82 (1), 12–22.
Corsi, F. (2009). “A Simple Approximate Long-Memory Model of Realized Volatility”, Journal
of Financial Econometrics, 7 (2), 174–196.
Corte, P. D., Kozhan, R., and Neuberger, A. (2017). “The Cross-Section of Currency Volatility
Premia”, Working paper, Imperial College Business School, Warwick Business School,
and Cass Business School.
Corte, P. D., Sarno, L., and Tsiakas, I. (2011). “Spot and Forward Volatility in Foreign
Exchange”, Journal of Financial Economics, 100 (3), 496–513.
Egelkraut, T. M., Garcia, P., and Sherrick, B. J. (2007). “The Term Structure of Implied
Forward Volatility: Recovery and Informational Content in the Corn Options Market”,
American Journal of Agricultural Economics, 89 (1), 1–11.
Engle, R. and Patton, A. (2001). “What Good is a Volatility Model?”, Quantitative Finance,
1 (2), 237–245.
Engle, R. F. and Rangel, J. G. (2008). “The Spline-GARCH Model for Low-Frequency
Volatility and Its Global Macroeconomic Causes”, Review of Financial Studies, 21 (3),
1187–1222.
Figlewski, S. (1997). “Forecasting Volatility”, Financial Markets, Institutions & Instruments,
6 (1), 1–88.
Fisher, R. A. (1925). Statistical Methods for Research Workers. Oliver and Boyd, Edinburgh.
French, K. R., Schwert, G., and Stambaugh, R. F. (1987). “Expected Stock Returns and
Volatility”, Journal of Financial Economics, 19 (1), 3–29.
Galbraith, J. W. (2003). “Content Horizons for Univariate Time-Series Forecasts”,
International Journal of Forecasting, 19 (1), 43–55.
Galbraith, J. W. and Kisinbay, T. (2005). “Content Horizons for Conditional Variance
Forecasts”, International Journal of Forecasting, 21 (2), 249–260.
Glasserman, P. and Wu, Q. (2011). “Forward and Future Implied Volatility”, International
Journal of Theoretical and Applied Finance, 14 (3), 407–432.
Green, T. C. and Figlewski, S. (1999). “Market Risk and Model Risk for a Financial Institution
Writing Options”, Journal of Finance, 54 (4), 1465–1499.
Hyndman, R. J. and Athanasopoulos, G. (2013). Forecasting: Principles and Practice. OTexts.
Littell, R. C. and Folks, J. L. (1971). “Asymptotic Optimality of Fisher’s Method of Combining
Independent Tests”, Journal of the American Statistical Association, 66 (336), 802–806.
Longerstaey, J. and Spencer, M. (1996). “RiskMetricsTM—Technical Document”, Tech. rep.,
Morgan Guaranty Trust Company of New York: New York.
Lunde, A. and Timmermann, A. (2004). “Duration Dependence in Stock Prices: An Analysis
of Bull and Bear Markets”, Journal of Business and Economic Statistics, 22 (3), 253–273.
Patton, A., Politis, D. N., and White, H. (2009). “Correction to “Automatic Block-Length
Selection for the Dependent Bootstrap” by D. Politis and H. White”, Econometric
Reviews, 28 (4), 372–375.
Politis, D. N. and White, H. (2004). “Automatic Block-Length Selection for the Dependent
Bootstrap”, Econometric Reviews, 23 (1), 53–70.
Politis, D. and Romano, J. (1994). “The Stationary Bootstrap”, Journal of the American
Statistical Association, 89 (428), 1303–1313.
Poon, S.-H. and Granger, C. W. J. (2003). “Forecasting Volatility in Financial Markets: A
Review”, Journal of Economic Literature, 41 (2), 478–539.
Taleb, N. N. (1997). Dynamic Hedging: Managing Vanilla and Exotic Options. New York:
John Wiley & Sons.