On Optimal Sample-Frequency and Model-Averaging Selection when
Predicting Realized Volatility
Joakim Gartmark*
Abstract
Predicting the volatility of financial assets based on realized volatility has grown popular in
the literature due to its strong predictive power. Theoretically, realized volatility has the
advantage of being free from measurement error since it accounts for intraday variation
that occurs at high frequencies in financial assets. In practice, however, as the
sample-frequency increases, more market microstructure noise may be absorbed, which can lead
to inaccurate predictions. Furthermore, predicting realized volatility with a single
model exposes the predictions to model uncertainty, which may lead to
understatement of the risk in the forecasting process and, as a result, to poor
predictions. Motivated by these issues, this paper investigates which sample-frequency
(1, 5 or 10 min) minimizes forecast error and which model-averaging process
(Mean forecast combination, Bayesian model-averaging or Dynamic model-averaging)
should be used to deal with model uncertainty. The results suggest that a 1-min
sample-frequency minimizes forecast errors and that Bayesian model-averaging performs better
than Dynamic model-averaging on the 1-day and 1-week horizons, while Dynamic
model-averaging performs slightly better on the 2-week horizon.
Keywords
Realized Volatility, Market Microstructure Noise, Sample-Frequency, Model Uncertainty,
Bayesian Model-Averaging, Dynamic Model-Averaging and Forecasting
Department of Economics
Master Thesis, 30 credits
Economics
Master of Science, Economics 120 credits
Spring Term 2017
Supervisor: Annika Alexius
*I would like to express my sincerest gratitude to Björn Hagströmer at Stockholm Business School, who helped
with the data used in this study.
Table of Contents
1. Introduction
2. Theoretical Background
2.1 Portfolio Optimization
2.2 The Process of the Stock Price
2.3 Measuring the Volatility of the Stock Price
2.4 Market Microstructure Noise
2.5 Model Uncertainty
3. Previous Research
3.1 Realized Volatility
3.2 Sample-Frequency and Market Microstructure Noise
3.3 Model-Averaging
4. Econometric Methodology
4.1 The Forecasting Process
4.2 Realized volatility
4.3 HAR-Models
4.4 GARCH-Models
4.5 Loss Functions
4.6 Model-Averaging
5. Data
5.1 Daily Realized Volatility
5.2 Weekly Realized Volatility
5.3 Two Weeks Realized Volatility
6. Results
6.1 Optimal Frequency
6.2 Model-Averaging
7. Conclusions
Bibliography
Appendix
1. Introduction
The failure to assess and anticipate financial risk on the credit markets
was one of the major reasons for the financial crisis in 2008. In order to avoid a
crisis of similar magnitude, it is vital to ensure that financial volatility is
modeled and predicted efficiently. Economists are interested in predicting
financial volatility for several reasons. First, expected future volatility determines
how an investor should balance a portfolio of risky and risk-free assets in
order to minimize risk subject to a given expected return. Second, it helps policy makers
to anticipate potential financial crises and counteract them by adjusting their policies
accordingly. Third, it is an important determinant in asset pricing in the sense
that larger risk should be compensated by larger return. Fourth, it has a large
influence on the pricing of financial derivatives and helps investors to
hedge risks on the market.
Historically, financial volatility has been measured and forecasted either
through daily standard deviation based on daily returns or through parametric
models such as GARCH and stochastic volatility models. Even though daily
standard deviation has the advantage of being observed, it still absorbs a lot of
noise since the measure consists of only one observation per trading day.
Furthermore, using GARCH or stochastic volatility models requires assumptions
regarding the distribution of volatility, while the volatility itself is never actually
observed. Thus, Andersen et al. (2001, b) proposed a new way to measure risk,
referred to as realized volatility, in which high-frequency data consisting of, for
example, 1-min, 5-min or 10-min prices of the financial assets are used. After
transforming the intraday prices into intraday returns, the realized volatility of
each day is calculated from the sum of all squared intraday returns in
that trading day. Compared to daily standard deviation and parametric models,
realized volatility has the advantage of providing a model-free measurement
that captures more of the volatility that occurs during one trading day.
Furthermore, empirical evidence shows that realized volatility significantly
improves forecast and portfolio performance (Andersen, et al., 2003) &
(Fleming, et al., 2003). However, modeling realized volatility based on intraday
returns has a potential drawback: it might absorb more market microstructure
noise and as a result bias the estimator. Market microstructure noise is all the
variation in the stock price, when observed at high frequencies, that is not
related to the true volatility. Examples of such variation are bid-ask bounce
effects, rounding errors due to price discreteness and recording errors. In
theory, the higher the frequency used to model volatility, the more market
microstructure noise might also be absorbed. At the same time, as more
observations are included, the efficiency of the estimator improves. Previous
research has concluded that selecting the frequency that minimizes forecast
error has economic value in terms of portfolio performance. However, a
single optimal frequency that consistently minimizes forecast errors has not yet
been established (Bandi & Russell, 2006) & (Potter, et al., 2008). A second issue
that has been investigated in the context of predicting financial volatility is how
to deal with model uncertainty. Model uncertainty means that a single model
might yield poor predictions during certain periods even though it performs well
on average. To deal with this issue, previous research has combined
several models to forecast future volatility, henceforth referred to as
model-averaging. Generally, there are three different approaches to model-averaging
realized volatility: Mean forecast combination, Bayesian model-averaging
and Dynamic model-averaging. Mean forecast combination weighs
all models equally regardless of each model’s forecast performance and has been
advocated due to its simplicity and because it performs at least as well as
approaches based on forecast performance (Smith & Wallis, 2009). Bayesian
model-averaging selects the weight of each model based on the model's average past
forecast performance, while Dynamic model-averaging selects the
weight of each model based only on the last observed forecast performance.
Thus, if the models' forecast performance for realized volatility is time-varying,
Dynamic model-averaging should outperform Bayesian model-averaging.
Previous research has concluded that, regardless of approach, forecasting
realized volatility based on model-averaging adds economic value in terms of
portfolio and forecast performance (Wang, Ma, Wei, & Wu, 2016). However,
despite previous efforts, research is still divided regarding which model-averaging
approach yields the best forecasts of realized volatility (Wang &
Nishiyama, 2015) & (Liu, et al., 2017).
Thus, based on the gaps in the literature mentioned above, the purpose of this paper is to
identify which sample-frequency (1) and model-averaging approach (2)
minimize forecast error of realized volatility.
(1) denotes this paper’s first purpose, in which 1-, 5- and 10-min frequencies
are examined. (2) denotes this paper’s second purpose, in which Mean forecast
combination, Bayesian and Dynamic model-averaging are examined. To the best
of the author’s knowledge in the context of forecasting realized volatility,
previous research has not yet evaluated optimal frequency based on forecast
performance of several models nor has it investigated differences in forecast
performance when the model-averaging process is restricted to only include the
historically best performing models. Thus, the findings of this paper will be the first of
their kind and provide new insight regarding optimal sample-frequency and
differences in forecast performance of Mean forecast combination, Bayesian and
Dynamic model-averaging when forecasting realized volatility. The data is based on
OMX30, an index representing the most traded stocks on the Nasdaq
Stockholm stock exchange. The Swedish stock exchange has been selected with
respect to its maturity and well-functioning capital markets. Since previous research
has mostly been conducted on the U.S. stock exchange, this study not only
contributes by filling gaps in the existing literature, but also complements
previous findings by providing results based on another dataset. The data
covers the period between 2007 and 2012, a period characterized by high
levels of volatility on the financial markets.
This paper will use some of the recently proposed ways to model and forecast
realized volatility in order to keep the results as up-to-date as possible. One of
the models that has arisen in the spirit of the realized measures is the heterogeneous
autoregressive (HAR) model, originally suggested by Corsi (2009). Due to its
simplicity and strong forecast performance, the HAR model has been frequently
used to predict future volatility. The HAR model follows a simple autoregressive
structure, in which the last day’s, week’s and month’s volatility are used to
predict future volatility. Based on this model, new models belonging to this
family have been developed, in which jumps and leverage effects are included.
Other authors have focused on the impact of volatility on realized volatility by
applying different models from the generalized autoregressive conditional
heteroskedasticity (GARCH) family and found strong evidence that this improves
forecast performance (Barndorff-Nielsen & Shephard, 2005) & (Corsi, et al.,
2008). Thus, based on 1-min, 5-min and 10-min data, single and combined
models of the HAR and GARCH families are applied to forecast realized
volatility 1 day, 1 week and 2 weeks ahead.
The remainder of this paper is divided into six further sections. Section 2 gives
the theoretical background related to this paper’s purpose. Section 3 highlights
what previous research has found within this area. Section 4 explains the
econometric methodology used to model realized volatility. Section 5 presents the
data used in this investigation in more detail. Section 6 presents the results of
this paper in text and in tables. Finally, section 7 gives conclusions and
proposals for future research.
2. Theoretical Background
This paper aims to identify which frequency and model-averaging approach
minimize forecast error of realized volatility. This section is divided into five
subsections, which aim to explain the theoretical background related to this
paper’s purpose. The first subsection explains how forecasting volatility accurately
adds economic value in terms of portfolio performance. It is not directly
related to the purpose of this paper, but serves as motivation
for why it is important to care about predicting realized volatility.
The second subsection describes the nature of a stock’s return, which in essence
is what determines the volatility of the stock; volatility is further explained in the
third subsection. The fourth and fifth subsections concern the parts directly related to
this paper’s purpose, explaining the theory behind market microstructure noise
and model uncertainty, respectively.
2.1 Portfolio Optimization
Previous research has established that predicting volatility accurately adds
economic value in terms of portfolio selection in the sense that it helps investors
to make better investment decisions1. The framework in this section is not
applied in this paper, but is described in order to motivate why
predicting realized volatility is important according to the existing literature.
In portfolio theory it is generally known that an investor selects the portfolio in
which the return is maximized subject to the risk. To do this, the investor
estimates the future risk on the market in order to ensure that the portfolio is
weighted according to her risk preferences. Fleming et al. (2001) suggest a
mean-variance framework that considers an investor with a short time horizon
(1 day, 1 week or 1 month) who aims to minimize variance subject to a certain
level of expected return. Furthermore, the theory assumes a constant expected
return based on, among others, Merton (1980), who showed that variation in
expected returns is hard to detect over short horizons. Thus, the only thing
changing over the short horizon is the predicted variance, and consequently the
investor follows a portfolio strategy based on volatility timing. To illustrate this
approach, consider $n$ risky assets, where $R_{t+1}$ denotes the $n \times 1$ vector of
asset returns. The expected return vector is then given by $\mu = E[R_{t+1}]$
and the variance-covariance matrix is given by
$\Sigma_t = E[(R_{t+1} - \mu)(R_{t+1} - \mu)']$. The investor aims to minimize the risk of her portfolio subject
to her expected target return, $\mu_p$, and selects the weight, $w_t$, of each asset
accordingly. This optimization problem is illustrated below:

$$\min_{w_t} \; w_t' \Sigma_t w_t \tag{1.1}$$

subject to

$$w_t' \mu + (1 - w_t' \mathbf{1}) R_f = \mu_p, \tag{1.2}$$

where $R_f$ denotes the expected return of the risk-free asset. Solving this problem
for $w_t$ then yields:

$$w_t = \frac{(\mu_p - R_f)\, \Sigma_t^{-1} (\mu - R_f \mathbf{1})}{(\mu - R_f \mathbf{1})' \, \Sigma_t^{-1} (\mu - R_f \mathbf{1})} \tag{1.3}$$

Since $\mu$ is assumed to be constant, the weight of each asset is determined
by the predicted values of the variance-covariance matrix, $\Sigma_t$, shown
below:

$$\Sigma_t = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2 \end{pmatrix} \tag{1.4}$$

where $\sigma_i^2$ denotes the predicted variance of risky asset $i$ and $\sigma_{ij}$ denotes the
covariance between risky assets $i$ and $j$. The investor predicts the variance and
covariance of each risky asset, shown in matrix (1.4), and then weighs each
asset according to equation (1.3). Fleming et al. (2003) substitute the standard
variance and covariance measures with realized variance and covariance and
find an improvement in portfolio performance. Thus, realized measures based on
intraday observations might help investors to make more accurate investment
decisions since forecasts of risk become more accurate. This paper focuses on
realized volatility, from which both realized variance and covariance can be
calculated (if assuming constant correlation). The next two sections give a
more detailed explanation of realized volatility and why, in theory, it is a good
proxy for financial risk.
1 See for instance (Fleming, Kirby, & Ostdiek, 2003). A more thorough discussion of previous
findings in terms of how predicting volatility adds economic value is given in section 3.
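As a rough numerical illustration of the volatility-timing idea behind equation (1.3), the sketch below computes the weights for two risky assets in plain Python. It assumes the standard mean-variance solution, in which the weights are proportional to the inverse variance-covariance matrix times the excess expected returns. The function name `min_variance_weights` and all numbers are hypothetical and are not taken from this paper's data.

```python
def min_variance_weights(mu, cov, rf, target):
    """Weights on two risky assets that minimize portfolio variance subject
    to a target expected return; the remainder goes to the risk-free asset."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]  # explicit 2x2 inverse
    excess = [mu[0] - rf, mu[1] - rf]
    # Sigma^{-1} (mu - rf * 1)
    z = [inv[0][0] * excess[0] + inv[0][1] * excess[1],
         inv[1][0] * excess[0] + inv[1][1] * excess[1]]
    # (mu - rf * 1)' Sigma^{-1} (mu - rf * 1)
    denom = excess[0] * z[0] + excess[1] * z[1]
    scale = (target - rf) / denom
    return [scale * z[0], scale * z[1]]


# Hypothetical inputs: constant expected returns, two covariance forecasts.
mu = [0.08, 0.12]
rf = 0.02
target = 0.10
calm = [[0.04, 0.01], [0.01, 0.09]]
stressed = [[0.04, 0.01], [0.01, 0.18]]  # predicted variance of asset 2 doubles

w_calm = min_variance_weights(mu, calm, rf, target)
w_stressed = min_variance_weights(mu, stressed, rf, target)
```

Because expected returns are held constant, the only input that changes between the two calls is the predicted covariance matrix, so the shift in weights away from the second asset is driven entirely by the variance forecast.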
2.2 The Process of the Stock Price
Since volatility is essentially variation in the stock price, it is important to
understand what drives this variation. This section explains the process of the
stock price and how it is modeled according to the existing literature.
There are many ways to describe the nature of the stock price, but common to
most models is that volatility plays an important part. When the change over
time of a variable is uncertain, it is common to consider the
variable to follow a stochastic process. This is the general underlying assumption
in empirical asset pricing when investigating the nature of the stock price. Following
Hull (2012), it is generally assumed that the price process
is a continuous-price and continuous-time stochastic process. A
continuous-price stochastic process means that the stock price can take any
value within a certain range, and a continuous-time stochastic process means that
a change in the stock price may occur at any time. In reality, however, time
and prices are observed discretely rather than continuously. Thus, these
assumptions are usually relaxed when forecasting prices of financial assets
(Corsi & Renó, 2012).
The most basic model assumes that changes in the log-price of a stock, $dp(t)$,
follow a standard Brownian motion that includes the stock’s expected return, $\mu(t)$,
in period $t$ and its volatility, $\sigma(t)$, times the Wiener process, $W(t)$. Since this paper
uses intraday returns to predict future volatility, it is important to add the
market microstructure noise that arises from the sampling frequency of the intraday
data, $\varepsilon(t)$. Market microstructure noise in high-frequency data can arise from, for
example, typographical errors or delayed quotes2. Unpredicted
announcement effects that cause volatility to change drastically, generally
referred to as jumps, $J(t)$, are also important to consider. In order to account for
jumps and market microstructure noise, the following equation is used to
explain variation in stock prices3:

$$dp(t) = \mu(t)\,dt + \sigma(t)\,dW(t) + \varepsilon(t) + J(t), \tag{2.1}$$

where $\mu(t)$ is the expected rate of return, $\sigma(t)$ is referred to as the integrated
volatility of the stock price, $W(t)$ is the Wiener process, $\varepsilon(t)$ is the mean-zero
random noise, independent of the Wiener process, arising from market
microstructure effects and $J(t)$ reflects the stochastic jump process.
2.3 Measuring the Volatility of the Stock Price
Since this paper forecasts realized volatility rather than daily standard deviation,
it is important to understand why research has moved towards this
measurement. This section explains the concept and nature of volatility, the
measurement error arising when daily standard deviation is used and how
realized volatility deals with this issue.
2 A more detailed explanation of market microstructure noise is given in section 2.4
3 The equation is a combination of Zhou (1996) and (Barndorff-Nielsen & Shephard, 2004a) to
illustrate the impact of jumps and microstructure noise separately
According to Hull (2012), stock volatility can be thought of as a measure of
uncertainty about the returns provided by the stock. Generally, one can say that
volatility reflects the variation of a stock’s price. Variation occurs due to new
information received by the market4. Investors consider this news and
reevaluate the price, and as a result movements occur in the stock price. In
general, larger expected volatility requires a larger expected return in order to
compensate for the risk.
Daily standard deviation, computed as the square root of the squared deviation of the
daily return from its past mean, has historically been used as a proxy for financial
risk due to its simplicity. However, this measure is considered
inefficient because it potentially suffers from measurement error. The reason is that
daily standard deviation is based on a single observation per trading
day, usually the closing price, and therefore absorbs a lot of noise because it does not
observe the intraday volatility (Andersen & Bollerslev, 1998). To demonstrate
this issue, consider the integrated volatility, $\sigma(t)$, in trading day $t$ as shown in
equation (2.1). To model this variable as accurately as possible, one seeks
to model the cumulative quadratic variation over all small periods within trading day $t$,
in other words the intraday movements, referred to as period $s$. The integrated
volatility from equation (2.1) in trading day $t$ can then be expressed as the
square root of the integral of all squared movements in period $s$ such that5:

$$\sigma(t) = \sqrt{\int_{t-1}^{t} \sigma^2(s)\,ds + \sum_{s=1}^{M} J^2(s)} + \varepsilon(s), \tag{3.1}$$

where $\sigma^2(s)$ is the variance process in period $s$, $J(s)$ denotes jumps in period $s$,
$\varepsilon(s)$ denotes the market microstructure noise in period $s$ and $M$ is the total number
of intraday movements in trading day $t$. Thus, if variation is large in trading day
$t$, daily standard deviation based on only one observation is theoretically a weak
proxy of the actual variation occurring in that trading day. This is one
of the main reasons why using intraday returns to predict volatility has
grown popular in the volatility forecasting literature. Realized volatility
requires intraday observations of the stock price, for example at a 1-min, 5-min
or 10-min frequency, and can be expressed as follows:

$$RV(t) = \sqrt{\sum_{j=1}^{M} r_{t,j}^2}, \tag{3.2}$$
4 See chapter 17 of (Elton, Gruber, Brown, & Goetzmann, 2013) for a discussion regarding efficient
markets and the efficient market hypothesis (EMH).
5 The integrated volatility equation is inspired by (Corsi, 2009) & (Andersen, Bollerslev, &
Diebold, 2007).
where $M$ denotes the number of intraday observations in trading day $t$, which grows
larger as the sample-frequency increases, $j$ indexes each intraday observation
and $r_{t,j}$ is the intraday log-return of the stock. Equation (3.2) is thus convenient
in the sense that variation is based on intraday returns and as a result captures
more of the actual variation occurring in trading day $t$. Previous papers have
established that as the sample frequency in equation (3.2) increases, realized volatility
converges to the quadratic variation process expressed in equation (3.1) such that6:

$$RV(t) \xrightarrow{\,M \to \infty\,} \sqrt{\int_{t-1}^{t} \sigma^2(s)\,ds + \sum_{s=1}^{M} J^2(s)} + \varepsilon(s) = \sigma(t) \tag{3.3}$$

The equation shows that as the number of intraday observations, $M$, goes to infinity (i.e. the
sampling interval shrinks towards zero), realized volatility, $RV(t)$, becomes an
efficient and unbiased estimator of the integrated volatility in period $t$. As will be
discussed further in section 3, recent research has moved towards
measuring volatility in this way due to its efficiency and because it is, theoretically,
free from measurement error. However, when using realized volatility as
a proxy for volatility, another bias arises due to the increased proportion of
absorbed market microstructure noise. This is further explained in section 2.4.
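The realized volatility measure described above, the square root of the sum of squared intraday log-returns within one trading day, can be sketched in a few lines of Python. This is a minimal illustration only: the function name `realized_volatility`, the `step` parameter used to mimic a coarser sample-frequency, and the price values are all hypothetical.

```python
import math


def realized_volatility(prices, step=1):
    """Square root of the sum of squared intraday log-returns,
    using every `step`-th price observed within one trading day."""
    sampled = prices[::step]
    returns = [math.log(b / a) for a, b in zip(sampled, sampled[1:])]
    return math.sqrt(sum(r * r for r in returns))


# Hypothetical 1-min prices for a short stretch of one trading day.
prices = [100.0, 100.2, 99.9, 100.1, 100.4, 100.3, 100.6]
rv_1min = realized_volatility(prices, step=1)  # uses all 6 one-minute returns
rv_2min = realized_volatility(prices, step=2)  # uses 3 two-minute returns
```

Sampling every second price discards the movements in between, which is why a coarser frequency generally captures less of the intraday variation.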
2.4 Market Microstructure Noise
This paper investigates two research questions, the first of which concerns
the bias that arises at high frequencies due to market microstructure noise. This
section explains how market microstructure noise is defined in this paper and
the trade-off that occurs when increasing the sample frequency.
Black (1986) distinguishes between the meanings of market microstructure noise
in the contexts of finance, econometrics and macroeconomics. He further
explains that the only thing market microstructure noise has in common across
these contexts is that it refers to something that the model observes, but that is
not related to the causality the model tries to explain. This paper concerns the
market microstructure noise that arises in financial data, which Black (1986)
describes as information concerning the movement of the stock price that is not
actual information. Though this definition might seem somewhat confusing, it makes
sense in the context of realized volatility. When observing stock
prices at high frequencies, there is an increased chance that a proportion of the
observed prices suffers from errors such as bid-ask bounce effects, rounding
errors due to price discreteness and recording errors. None of these errors is
related to the true movement of the stock price, but since they are observed they are still used
as information to explain the movements. As a result, estimates are based
on information that is not actual information, which produces noisy estimates
and might lead to inaccurate predictions (Hansen & Lunde, 2006). Thus, in this
paper, market microstructure noise is defined as observed movements of the
intraday stock price that have no explanatory power for the true volatility.
6 For a discussion of the convergence see (Andersen & Bollerslev, 1998) & (Barndorff-Nielsen &
Shephard, 2002, a)
In theory, increasing the sample frequency implies an increased risk of
observing more market microstructure noise, and as a result one might estimate
the microstructure volatility rather than the realized volatility (Awartani, et al.,
2009). However, the impact of market microstructure noise depends on the
specifics of the market: if the market contains a high proportion
of market microstructure noise relative to true variation, then realized volatility
should be measured at a lower frequency, since a high frequency would absorb
too much noise (Andersen, et al., 2011). Conversely, if the proportion of market
microstructure noise is small relative to the observed variation, a high frequency
is preferred due to the increased efficiency of the estimates. Thus, in order to
obtain reliable predictions it is important to choose the right sample-frequency.
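The trade-off described in this section can be made concrete with a small simulation. The sketch below is a stylized illustration under stated assumptions, not this paper's method: the efficient log-price is assumed to be a Gaussian random walk, the observed price adds independent Gaussian noise at every tick, and the noise level is deliberately exaggerated so that the effect is easy to see. All names and parameter values are hypothetical.

```python
import math
import random


def simulate_day(n_steps, sigma_step, noise_sd, rng):
    """Observed log-prices: a random-walk efficient price plus i.i.d. noise."""
    efficient = 0.0
    observed = []
    for _ in range(n_steps + 1):
        observed.append(efficient + rng.gauss(0.0, noise_sd))
        efficient += rng.gauss(0.0, sigma_step)
    return observed


def realized_volatility(log_prices, step):
    """Realized volatility from every `step`-th observed log-price."""
    sampled = log_prices[::step]
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(sampled, sampled[1:])))


rng = random.Random(0)  # fixed seed so the run is reproducible
n_steps, sigma_step, noise_sd = 4800, 0.0005, 0.002  # hypothetical values
day = simulate_day(n_steps, sigma_step, noise_sd, rng)

true_vol = sigma_step * math.sqrt(n_steps)  # volatility of the efficient price
signature = {step: realized_volatility(day, step) for step in (1, 5, 10)}
```

Under these assumptions each observed return carries the noise of two price observations, so the noise contribution grows with the number of returns used. Realized volatility at the finest sampling step therefore overstates `true_vol` the most, and the overstatement shrinks as the sampling step increases.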
2.5 Model Uncertainty
The second purpose of this paper concerns how to deal with model uncertainty
through different methods of model-averaging. For this reason it is important to
understand the concept of model uncertainty and how model-averaging deals
with this issue in the forecasting process. This section gives the theoretical
background on both.
Model uncertainty is a potential issue when predicting a variable’s future values
based on a single model. This can be illustrated with a simple example. Consider
two different models used for predicting the future value of a variable. The
basics of forecasting theory tell you to pick the model that produces the smallest
forecast error. However, this approach ignores the uncertainty of the single
model. An early paper that considers this issue points out two factors that arise
under this approach (Bates & Granger, 1969):
1. Each model is based on information that the other model has not
considered
2. Each model interprets the relation between the independent and the
dependent variable differently
The second factor is not necessarily an issue, in the sense that if the estimated
relation in one of the models is wrong, it is better to use only the correct model.
However, model uncertainty in forecasting arises because
the future performance of a single model is uncertain: since the model is selected
based on its average performance, there may be periods in which
other models perform better. For this reason, model-averaging has been
considered a “cure” for model uncertainty. Model-averaging combines several
models in order to adjust for model uncertainty. In macroeconomics,
model-averaging grew popular in the early 2000s for forecasting
inflation and real output growth, while the application of model-averaging in
the financial economics literature is still relatively rare and has only in recent
years been growing in popularity (Rapach, et al.,
2010). Thus, this is a relevant subject for further investigation in order to facilitate
prediction of financial variables based on combined models.
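To illustrate why combining models can reduce forecast error, the sketch below averages the forecasts of two hypothetical models, one that tends to forecast too high and one that tends to forecast too low. It shows an equal-weight combination and, as one simple performance-based alternative, weights proportional to the inverse of each model's mean squared error. The weights are computed in-sample for brevity; this is not the Bayesian or Dynamic model-averaging scheme used later in this paper, and all numbers are invented for illustration.

```python
def mse(forecasts, actual):
    """Mean squared forecast error."""
    return sum((f - a) ** 2 for f, a in zip(forecasts, actual)) / len(actual)


def combine(forecasts_a, forecasts_b, weight_a):
    """Weighted average of two forecast series; weights sum to one."""
    return [weight_a * fa + (1.0 - weight_a) * fb
            for fa, fb in zip(forecasts_a, forecasts_b)]


# Hypothetical realized values and forecasts from two models.
actual = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3]
model_a = [1.2, 1.3, 1.1, 1.2, 1.2, 1.4]  # tends to forecast too high
model_b = [0.9, 1.0, 0.8, 1.0, 0.7, 1.2]  # tends to forecast too low

mse_a, mse_b = mse(model_a, actual), mse(model_b, actual)

# Equal weights, regardless of past performance.
equal = combine(model_a, model_b, 0.5)

# Weights inversely proportional to each model's MSE (computed in-sample here).
w_a = (1.0 / mse_a) / (1.0 / mse_a + 1.0 / mse_b)
inverse_mse = combine(model_a, model_b, w_a)
```

In this constructed example the two models err in opposite directions, so their errors partly cancel and both combinations achieve a lower mean squared error than either single model. With real forecasts the gain depends on how correlated the models' errors are.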
3. Previous Research
This section is divided into three subsections, which highlight previous research
on realized volatility, on sample-frequency and market microstructure noise, and on
model-averaging. These are all related to the purpose of this
paper in the sense that realized volatility is being forecasted and the optimal
sample-frequency and model-averaging approach are being investigated. Since
most of the main findings in this research area are based on U.S. data,
this section presents these findings. However, it should be highlighted that
this study is based on data from the Swedish stock exchange, which is less liquid
than the U.S. stock exchange, but still very similar in terms of maturity.
3.1 Realized Volatility
Since realized volatility is being forecasted in this paper, it is vital to understand
why this measure is important and what previous papers have found regarding
it. This section gives insight into the findings of previous papers
regarding realized volatility and the empirical support for using it to forecast future
volatility compared to more conventional approaches.
In one of the first papers investigating realized volatility, Andersen et al. (2001,
a) examined the distribution of realized exchange rate volatility based on 5-min
frequency during 1986-1996. Their paper found strong evidence supporting
that realized volatility provides a variable free from measurement error and
might be more accurate than those based on parametric estimates of the error
term of daily returns. In their second paper the same year, the authors did a
12
similar investigation, based on 5-min data during 1993-1998, of stocks
included in Dow Jones Industrial Average (DJIA) (Andersen, et al.,2001, b). Also
on financial data, the authors found that realized volatility provides a model-
free measurement. Furthermore, the authors argued that realized volatility
should be preferred to parametric models, in which volatility is never actually
observed, since it allows observing actual volatility and is at least as accurate as
parametric models. Somewhat biased in the sense that some of the main findings
in this field of study is practically based on the same authors over several years,
but in their third paper concerning realized volatility, Andersen et al. (2003)
compare prediction performance of realized exchange rate volatility and daily
exchange rate volatility based on GARCH components. Their findings suggest
that basic AR models based on realized volatility outperform the forecasts of
those that only observe the daily exchange rate and then through parametric
models predict volatility. In another paper, McMillan & Speight (2004)
highlight the weak evidence that GARCH models provide better
forecasts than a simple autoregressive model when predicting future volatility
based on daily exchange rates or stock returns. The authors suggest that this is
due to the noise in measures based on only one daily observation.
Using intraday 30-min data for 17 different exchange rates to predict
realized volatility and applying different models from the GARCH family, the
authors find strong evidence suggesting that GARCH models perform better than
AR models when realized volatility is used. The authors then use this as an
argument to support the hypothesis that measuring volatility based on daily
returns suffers from measurement error. In another famous
paper, Corsi (2009) proposed the Heterogeneous Autoregressive model of the
realized volatility (HAR-RV). The model is very straightforward and basically
aggregates daily realized volatility into past day, past week and past month
components in an OLS regression. His results suggest that, despite the model’s
simplicity, it could still outperform the GARCH model and, as a result, the
HAR model, along with its extensions, has been used frequently ever since.
As pointed out above, the empirical evidence for realized volatility is supportive
and as a result research in this subject has become popular as well. This section
has highlighted some of the most important findings in this area.
3.2 Sample-Frequency and Market Microstructure Noise
Despite previous efforts, an optimal sample-frequency has not yet been
established; identifying it is this paper’s first purpose. As this
section will discuss further, there is strong evidence that choosing
the sample-frequency that minimizes forecast error has significant economic
value. Thus, it is vital to do further investigations in this area in order to provide
new insight of the sample-frequency’s impact on realized volatility forecasts.
As mentioned in section 2.4 when choosing sample frequency, a trade-off
occurs. Picking a high sample frequency might reduce the stochastic error of the
measurement, also referred to as increasing the efficiency of the coefficients, but
it might also introduce more market microstructure noise and as a result
provide biased estimators (Corsi, 2009).
Bandi & Russel (2006) use data from stocks included in S&P100 in an attempt to
establish the optimal frequency. The authors’ findings suggest that the optimal
frequency for realized variance varies from 0.4 min to 13.8 min. The authors
also conclude that there is significant economic gain from choosing the right
frequency when applying different frequencies on the framework proposed by
Fleming (2003). However, a range between 0.4 minutes and 13.8 minutes is not
very useful when choosing between 1-min, 5-min and 10-min frequencies,
especially if the performances of these frequencies are significantly different
from each other. Another paper concerning frequencies is Potter
et al. (2008), who use a similar framework to investigate the performance when
predicting the realized covariance matrix of S&P100 from 1-min to 130-min
frequency. Similar to Bandi & Russel (2006), Potter et al. (2008) conclude that
choosing the right frequency has economic value. However, they also suggest
that the optimal frequency ranges between 30 and 65 minutes, which is a
significantly lower frequency than proposed by Bandi & Russel (2006).
However, the results of Potter et al. (2008) are not as convincing as the paper
might imply. First, their results also imply a strong performance on 10-min
frequency, but the authors do not consider this as important since all other
results are in favor of lower frequencies. Second, the authors only use one model
to evaluate the performance of each frequency and do not consider the model
uncertainty that might arise in this context. This issue also holds for Bandi &
Russel (2006), who investigate performance depending on frequency based on
different stocks, but only on one model. For these reasons it is vital to statistically
test the performance of several single models on each frequency and then test
the performance of each model across frequencies in order to be
more statistically certain about the optimal frequency.
Despite previous attempts7 and the economic value it might add to portfolio
optimization, research has not yet been able to establish a “golden rule” when
selecting sample-frequency. Instead it has been common to use an ad hoc rule,
in which a sample-frequency between 5 and 30 minutes is selected (Shin &
Hwang, 2015). This paper aims to fill this gap by investigating the performance
of several single models on higher frequencies, 1-min, 5-min and 10-min, in an
attempt to see if there is a significant difference in forecast performance
depending on frequency.
7 See also (Bandi & Russel, 2008), (McAleer & Medeiros, 2008) and (Shin & Hwang, 2015) for
further discussions concerning market microstructure when predicting realized volatility.
3.3 Model-Averaging
This paper’s second purpose is to investigate differences between Mean forecast
combinations, Bayesian and Dynamic model-averaging when forecasting
realized volatility, with a larger focus on the two latter. This section gives
insight into what previous research has established on this subject.
Model-averaging in the context of predicting realized volatility has grown
popular in the last decade, and many papers still examine its
economic value. As pointed out in section 2.5, model-averaging combines
several models in order to deal with model uncertainty, which might arise when
using one single model to predict the future value of a variable. Ignoring model
uncertainty might lead to understatements of the risk in the forecasting and as a
result cause poor predictions (Hibon & Evgeniou, 2004).
One of the first papers testing model-averaging to predict future realized
variance was Liu & Maheu (2009), who used a Bayesian model-averaging
(BMA) approach, which combines several models based on each model’s
average historical predictive power. The authors use 5-min S&P data between
1997 and 2004 and apply linear HAR and AR models to forecast 1-day, 1-week
and 2-weeks ahead and find that BMA significantly improves the prediction
performance compared to single models on all horizons. For these findings the
authors give two reasons. First, there is not one single model that dominates in
performance across markets and horizons. Second, giving models more weight
during periods when performing well contributes to a decreased uncertainty
and consequently provides more reliable predictions. Another study predicts
volatility on stock indices on Chinese and Japanese markets using three new
models specifically invented for high-frequency data. This study confirms that
prediction performance improves when using BMA compared to single models
(Wang & Nishiyama, 2015). However, Wang et al. (2016) argue that BMA does
not consider structural breaks or the fact that models’ forecast performance is
time-varying and therefore suggest a dynamic model-averaging (DMA)
approach. A DMA approach, as the name implies, is dynamic in the sense that the
model selection is very flexible. Due to its flexibility it allows parameters to be
time-varying. Wang et al. (2016) use 5-min data from the S&P 500 index
during 1996-2013 and apply eight different models all derived from the HAR
family. The results show that DMA on average performs better than single
models but not significantly better than BMA. Furthermore, the authors run a
portfolio exercise using a similar framework as the one presented in section 2.1
and find that both BMA and DMA improve portfolio performance. In another
recent paper, Liu et al. (2017) compare performance of BMA and DMA when
forecasting the realized range volatility on S&P and crude oil based on models
from the HAR family. The authors find that the DMA approach is significantly
better than BMA and individual models to forecast future volatility. However,
their model-averaging only consists of five single models from the same family,
implying that the combined models will probably follow a somewhat similar
pattern. Thus adding models from two different families and expanding the
set of models included in the model-averaging process, as done in this paper,
might contribute to more heterogeneity in the forecasts and as a result provide
different findings concerning the performance of BMA and DMA. Furthermore,
previous research has not been very concerned about the impact of horizons
when model-averaging. Since time-variation of models’ forecast performance
might depend on the horizon, it also makes sense to consider this when
investigating model-averaging approaches.
Summing up, previous research has concluded that model-averaging adds
economic value in terms of portfolio performance and prediction accuracy. However,
previous research has not yet considered the impact of restricting the
number of models while averaging and how this is related to the relative
performance of BMA and DMA, nor has it considered how BMA and DMA performance might
depend on horizon. Thus, it might be useful to see how performance changes
when restricting model-averaging to only include a limited number of models
based on their performance. Furthermore, investigating the difference on each
horizon is of interest in order to see if this should be considered when selecting
between BMA and DMA. This paper fills the gap concerning whether the number of
models included in the model-averaging process matters and whether BMA and
DMA performance depends on the horizon.
4. Econometric Methodology
This section starts by explaining this study’s forecast procedure. The following
subsection gives an explanation regarding how realized volatility is specified in
this paper. Subsections 3 and 4 illustrate and explain the models used to forecast
realized volatility from the HAR and GARCH family, respectively. Subsection 5
describes the loss functions used to measure the forecast error of each model. In
the final subsection, the different selection approaches used for model-averaging are
illustrated and explained.
4.1 The Forecasting Process
As mentioned in previous sections, this paper aims to identify the optimal
frequency and differences in model-averaging approaches when predicting
realized volatility based on the OMXS30 index during 2007-2012. Thus the findings
of this paper will be based on the Swedish stock exchange rather than the U.S.
stock exchange. Since previous research has focused on the U.S.
market, this paper, besides providing new findings, also complements
previous findings by using a new dataset, the Swedish stock exchange. As
a first step, rolling window out-of-sample forecasts for 10 different models on
three horizons and on three different frequencies are executed. When running a
rolling out-of-sample forecast it is possible to choose between rolling recursively
or with a fixed window. In a recursive approach one adds new observations
after each rolling forecast, in other words the length of the sample expands after
each executed forecast. The primary problem with this approach is that the
forecasts are not directly comparable, since they
are based on estimation samples of different lengths. For this reason a fixed rolling window is
preferred, since it keeps the length of the sample equal for each forecast
and consequently the forecast observations are more comparable. The three
horizons that have been forecasted are 1-day, 1-week and 2-weeks. These
horizons have been chosen based on the mean-variance framework explained in
section 2.1, with an investor rebalancing the portfolio daily, weekly or every two weeks
based on the predicted volatility. For the 1-day horizon, the sample window has
been set to 300 observations, which is approximately one year and two months.
The length has been chosen because volatility is dynamic and
changes quickly, and for this reason volatility observations
older than one and a half years are considered irrelevant when predicting
volatility 1-day ahead. For the 1-week and 2-weeks horizon, however, the
window has been set to 100 observations, which is the minimum amount of
observations when running GARCH models in R using the “rugarch” package.
However since the 1-week and 2-weeks horizons are based on aggregated daily
realized volatility of 1-week and 2-weeks respectively, the window of the 1-
week horizon is almost two years and the window of the 2-weeks horizon is
almost 4 years. For the 1-day and 1-week horizons, 100 forecast observations
are obtained and for the 2-weeks horizon 50 forecast observations are obtained,
and thus for all forecast samples it is possible to assume a normal distribution8.
All single model forecasts are executed on 1-min, 5-min and 10-min frequency
data. In order to test for frequency performance, OLS-tests are then executed
based on the loss function of each model’s performance on each horizon. In total
each frequency has 30 different model forecast observations (10 single models
on three horizons), and the sample is thus assumed to be large and approximately
normally distributed, as $N \geq 30$.
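The difference between the fixed rolling window and the recursive approach can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the total sample length of 400 observations are hypothetical, chosen so that a 300-observation window yields 100 one-step forecasts as on the 1-day horizon.

```python
def rolling_windows(n_obs, window, horizon=1):
    """Yield (train_start, train_end, target) for a fixed-length window.

    The model is estimated on observations train_start..train_end-1 and
    used to forecast the observation with index `target`.
    """
    for end in range(window, n_obs - horizon + 1):
        yield end - window, end, end + horizon - 1


def expanding_windows(n_obs, first_window, horizon=1):
    """Same targets, but the estimation sample always starts at 0 and grows."""
    for end in range(first_window, n_obs - horizon + 1):
        yield 0, end, end + horizon - 1


# Hypothetical sample: 400 daily observations, 300-observation window.
rolling = list(rolling_windows(400, 300))
expanding = list(expanding_windows(400, 300))
```

In the rolling case every estimation sample has the same length, while in the recursive case the last forecast is based on a sample that is 99 observations longer than the first.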
Based on the established optimal frequency, model-averaging is further
investigated. As explained further in section 4.6, three kinds of model-averaging
selections are examined to see if these on average outperform single models. The
Bayesian and Dynamic model-averaging approaches are then further investigated to
see if prediction performance is improved when applying restrictions that only
include the best performing models in the averaging process. Based on these
results, each model restriction that on average performs best on each horizon
and loss function is identified. Finally the Bayesian and Dynamic model-
averaging forecasts are tested against each other to see if any difference in
forecasting can be established. In this procedure as well, OLS-tests are executed in
all steps to see if any significant difference between models’ forecast performance
can be verified.
As a final step two robustness checks have been executed. In the first one, the
whole process described above is run again based on an Allshare index for
Nasdaq Stockholm including all listed large, mid and small cap firms on this
stock exchange in the same period. In the second one, the first two and a half years
of the OMXS30 data are dropped in order to see if results change when excluding
the financial crisis in 2008. However, since this subsample only consists of 187
trading weeks, only the 1-day and 1-week horizons have been examined.
All presented results in this paper are based on Newey-West Heteroskedasticity-
Autocorrelation-Consistent standard errors.
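As an illustration of such a test, the following Python sketch regresses the difference between two loss series on a constant and computes a Newey-West standard error with a Bartlett kernel. The regression setup, the function name, the lag choice and the example numbers are assumptions made for illustration, not the exact specification used here.

```python
import math


def newey_west_mean_test(diff, lags):
    """Mean of `diff`, its Newey-West standard error and the t-statistic.

    Regressing `diff` on a constant gives the sample mean as the OLS
    estimate; the long-run variance uses Bartlett weights 1 - lag/(lags+1).
    """
    n = len(diff)
    mean = sum(diff) / n
    u = [d - mean for d in diff]
    long_run_var = sum(x * x for x in u) / n
    for lag in range(1, lags + 1):
        gamma = sum(u[t] * u[t - lag] for t in range(lag, n)) / n
        long_run_var += 2.0 * (1.0 - lag / (lags + 1.0)) * gamma
    se = math.sqrt(long_run_var / n)
    return mean, se, mean / se


# Hypothetical differences in loss between two forecasts.
loss_diff = [0.1, 0.3, 0.2, 0.4, 0.1, 0.3, 0.2, 0.4]
mean, se, t_stat = newey_west_mean_test(loss_diff, lags=2)
```

With `lags=0` the standard error reduces to the usual one based on the sample variance, so the lag length controls how much autocorrelation in the loss differences is allowed for.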
4.2 Realized volatility
As already shown in equation (3.2), realized volatility in this paper refers to the
square root of the sum of squared intraday log-returns. This measure is useful
since it is, in the absence of market microstructure noise, free from
measurement error and, as the sample-frequency goes to infinity, yields an unbiased
and efficient proxy of the integrated volatility.
8 According to the central limit theorem, the sample mean is approximately normally distributed as $N$ goes to infinity. As a rule of thumb, $N$ is considered large when $N \geq 30$.
Furthermore, it has empirical
support in the sense that it produces smaller forecast errors than the daily standard
deviation and measures based on parametric assumptions9. Below is the formula for
realized volatility, from here on denoted $RV_t$:
$$RV_t = \sqrt{\sum_{j=1}^{M} r_{t,j}^{2}}, \tag{4.3}$$
where $r_{t,j}^{2}$ is the squared intraday log-return in intraday period $j$ on trading day $t$
and $M$ denotes the number of observed intraday returns on day $t$. Thus if using 1-min
frequency, $M$ consists of approximately five and ten times more observations than at a
frequency of 5-min and 10-min, respectively.
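As a minimal sketch of the realized volatility measure in equation (4.3), the Python snippet below computes the square root of the sum of squared intraday log-returns from a list of prices and shows how a 1-min series can be thinned to a lower frequency. The function names and the price series are hypothetical and only serve as an illustration.

```python
import math


def realized_volatility(prices):
    """Square root of the sum of squared intraday log-returns for one day."""
    log_returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
    return math.sqrt(sum(r * r for r in log_returns))


def subsample(prices, step):
    """Keep every `step`-th price, e.g. step=5 turns 1-min into 5-min prices."""
    return prices[::step]


# Hypothetical 1-min prices for a short stretch of one trading day.
one_min_prices = [100.0, 100.2, 99.9, 100.1, 100.4, 100.3,
                  100.0, 100.2, 100.5, 100.4, 100.6]
rv_1min = realized_volatility(one_min_prices)
rv_5min = realized_volatility(subsample(one_min_prices, 5))
```

Thinning with a step of 5 keeps every fifth price, so the 5-min measure is built from roughly one fifth of the returns used by the 1-min measure.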
4.3 HAR-Models
Due to the increased interest in modeling volatility based on high-frequency
data, several new models have been developed. Among these is the popular HAR
model, which in essence is a basic linear model following an autoregressive
structure. Corsi (2009)10 published the Heterogeneous Autoregressive model
of the Realized Volatility (HAR-RV), which is convenient due to its simplicity and
long memory. The model assumes that markets are heterogeneous in
the sense that investors have different time horizons. He argues that the market
can be divided into three kinds of investors that might have an impact on
volatility: high-frequency traders with a 1-day horizon, portfolio managers with
a 1-week horizon and long-term investors with a horizon of 1 month or more.
He proposed the following model to predict stock volatility:
$$RV_t = \beta_0 + \beta_d RV^{(d)}_{t-1} + \beta_w RV^{(w)}_{t-1} + \beta_m RV^{(m)}_{t-1} + \varepsilon_t, \tag{5.1}$$
where $\beta_0$ is the intercept, $RV^{(d)}_{t-1}$ is the lagged daily realized volatility,
$RV^{(w)}_{t-1}$ is the lagged weekly realized volatility, $RV^{(m)}_{t-1}$ is the lagged monthly
realized volatility and $\varepsilon_t$ is the forecast error. In order to consider
the potential threat arising from jumps (a stochastic process arising from
announcements or other unpredictable events that have a significant impact on
the stock price), Andersen et al. (2007) suggested two models that include
jumps, in which the square root of the logarithmic standardized realized
bipower variation is calculated as below:
$$\sqrt{BPV_t} = \sqrt{\left(\sqrt{2/\pi}\right)^{-2} \sum_{j=1}^{M-1} |r_{t,j}|\,|r_{t,(j+1)}|}, \tag{5.3}$$
where $\sqrt{2/\pi}$ denotes the expected absolute value of a standard normally
distributed random variable and $\sum |r_{t,j}|\,|r_{t,(j+1)}|$ denotes the bipower
variation term, which is basically the sum of the absolute value of the intraday
return times the absolute value of the intraday return in the next period
$(j+1)$. Generally expressed, bipower variation captures the continuous part of the quadratic
variation in the stock return and is robust to jumps, so the difference between
realized volatility and bipower variation isolates the jump part.
9 See section 3.1 for more information regarding previous findings of realized volatility
10 This paper was known and applied already in 2004, however it was not published until 2009
According to Barndorff-Nielsen & Shephard (2004a), the jump component is
expressed as follows:
$$J_t = \max\left(RV_t - \sqrt{BPV_t},\ 0\right), \tag{5.4}$$
where the jump component, $J_t$, is truncated at zero so that it only consists of
nonnegative estimates. Thus by including the jump component from equation (5.4) in model (5.1), it
is possible to run the Heterogeneous Autoregressive model of the realized
volatility with jumps (HAR-RV-J):
$$RV_t = \beta_0 + \beta_d RV^{(d)}_{t-1} + \beta_w RV^{(w)}_{t-1} + \beta_m RV^{(m)}_{t-1} + \beta_{Jd} J^{(d)}_{t-1} + \beta_{Jw} J^{(w)}_{t-1} + \beta_{Jm} J^{(m)}_{t-1} + \varepsilon_t, \tag{5.5}$$
where $J^{(d)}_{t-1}$, $J^{(w)}_{t-1}$ and $J^{(m)}_{t-1}$ denote the jump component according to equation (5.4)
with 1-day, 1-week and 1-month lag, respectively. Furthermore, Andersen et
al. (2007) argued that realized volatility could be decomposed into a continuous
path component (CSP) and a jump component (CJ). The authors constructed these
components as shown below:
$$CJ_t = I(Z_t > \Phi_\alpha)\, J_t \tag{5.6}$$
$$CSP_t = I(Z_t \leq \Phi_\alpha)\, RV_t + I(Z_t > \Phi_\alpha)\, \sqrt{BPV_t}, \tag{5.7}$$
where $I(\cdot)$ denotes the indicator function and $\Phi_\alpha$ denotes the critical value
identifying the jump according to the standard normally distributed $Z_t$. Thus
$CJ_t$ in equation (5.6) identifies the significant jumps determined by the critical
value and $CSP_t$ in equation (5.7) is the part of realized volatility not consisting of
jumps. Including these components gives the Heterogeneous Autoregressive model
of the realized volatility with continuous jumps (HAR-RV-CJ), expressed as
below:
$$RV_t = \beta_0 + \beta_{Cd} CSP^{(d)}_{t-1} + \beta_{Cw} CSP^{(w)}_{t-1} + \beta_{Cm} CSP^{(m)}_{t-1} + \beta_{Jd} CJ^{(d)}_{t-1} + \beta_{Jw} CJ^{(w)}_{t-1} + \beta_{Jm} CJ^{(m)}_{t-1} + \varepsilon_t, \tag{5.8}$$
where $CSP$ and $CJ$ are used in the model with 1-day, 1-week and 1-month lag.
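The bipower variation in equation (5.3) and the truncated jump component in equation (5.4) can be illustrated with a short Python sketch. It assumes that the truncation is applied in volatility units, that is, to realized volatility minus the square root of bipower variation. The function names and both return series are hypothetical.

```python
import math

MU_1 = math.sqrt(2.0 / math.pi)  # E|Z| for a standard normal Z


def bipower_variation(returns):
    """Scaled sum of products of adjacent absolute intraday returns."""
    adjacent = sum(abs(a) * abs(b) for a, b in zip(returns, returns[1:]))
    return adjacent / MU_1 ** 2


def jump_component(returns):
    """max(RV - sqrt(BPV), 0): the jump measure truncated at zero."""
    rv = math.sqrt(sum(r * r for r in returns))
    return max(rv - math.sqrt(bipower_variation(returns)), 0.0)


# Hypothetical intraday returns: a calm day and a day with one large move.
smooth_day = [0.001, -0.001, 0.001, -0.001, 0.001, -0.001]
jump_day = [0.001, -0.001, 0.02, -0.001, 0.001, -0.001]
smooth_jump = jump_component(smooth_day)
large_jump = jump_component(jump_day)
```

Because a single large return is multiplied by its small neighbours in the bipower term, it raises realized volatility far more than bipower variation, which is why the second series produces a positive jump estimate.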
As a final step for the HAR family models, the “leverage effect” is considered,
which is a general concept in financial markets. Leverage effect implies that
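negative shocks in returns have a larger impact on volatility than positive
shocks. This is basically explained by the increased default risk that follows
a decrease in the stock price, since debt increases relative to equity.
In order to account for the leverage effect, Corsi & Renò (2012) proposed the
leveraged Heterogeneous Autoregressive model of the realized volatility with
jumps (LHAR-RV-J) and continuous jumps (LHAR-RV-CJ). The leverage
component can be modeled in the following way:
$$r_t^{-} = \min(r_t,\ 0), \tag{5.9}$$
where $r_t^{-}$ is the aggregated negative return in period $t$, based on the intraday
returns in period $t$. This component is added to models (5.5) and (5.8),
respectively:
$$RV_t = \beta_0 + \beta_d RV^{(d)}_{t-1} + \beta_w RV^{(w)}_{t-1} + \beta_m RV^{(m)}_{t-1} + \beta_{Jd} J^{(d)}_{t-1} + \beta_{Jw} J^{(w)}_{t-1} + \beta_{Jm} J^{(m)}_{t-1} + \lambda_d r^{-(d)}_{t-1} + \lambda_w r^{-(w)}_{t-1} + \lambda_m r^{-(m)}_{t-1} + \varepsilon_t \tag{5.10}$$
$$RV_t = \beta_0 + \beta_{Cd} CSP^{(d)}_{t-1} + \beta_{Cw} CSP^{(w)}_{t-1} + \beta_{Cm} CSP^{(m)}_{t-1} + \beta_{Jd} CJ^{(d)}_{t-1} + \beta_{Jw} CJ^{(w)}_{t-1} + \beta_{Jm} CJ^{(m)}_{t-1} + \lambda_d r^{-(d)}_{t-1} + \lambda_w r^{-(w)}_{t-1} + \lambda_m r^{-(m)}_{t-1} + \varepsilon_t, \tag{5.11}$$
where the leverage component enters with 1-day, 1-week and 1-month lag,
equation (5.10) is the LHAR-RV-J model and equation (5.11) is the
LHAR-RV-CJ model.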
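To make the structure of the basic HAR-RV regression in equation (5.1) concrete, the Python sketch below builds the daily, weekly and monthly regressors and estimates the coefficients by OLS. It assumes that the weekly and monthly components are averages over the last 5 and 22 trading days, which is a common convention and not confirmed here. The function names and the simulated series are hypothetical, and the extended models would simply add further columns.

```python
import random


def har_regressors(rv, week=5, month=22):
    """Build rows [1, daily, weekly mean, monthly mean] and next-day targets."""
    rows, targets = [], []
    for t in range(month - 1, len(rv) - 1):
        daily = rv[t]
        weekly = sum(rv[t - week + 1:t + 1]) / week
        monthly = sum(rv[t - month + 1:t + 1]) / month
        rows.append([1.0, daily, weekly, monthly])
        targets.append(rv[t + 1])
    return rows, targets


def ols(rows, targets):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(rows, targets))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


random.seed(1)
rv = [1.0 + 0.5 * random.random() for _ in range(200)]  # hypothetical daily RV
rows, targets = har_regressors(rv)
beta = ols(rows, targets)  # [intercept, daily, weekly, monthly]
latest = [1.0, rv[-1], sum(rv[-5:]) / 5, sum(rv[-22:]) / 22]
forecast = sum(b * x for b, x in zip(beta, latest))
```

The one-step forecast is the fitted linear combination of the most recent daily, weekly and monthly values.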
4.4 GARCH-Models
Previous research has confirmed that a simple Autoregressive (AR) model based
on past realized volatility to predict future realized volatility outperforms
stochastic volatility or GARCH models based on daily returns (Andersen et al.,
2003). Previous research has also considered the role of the volatility of realized
volatility and found strong results suggesting that this helps explain some of the
variation in realized volatility. For example, the results found by Barndorff-Nielsen
& Shephard (2005) suggest that realized volatility might suffer from
heteroskedastic errors because of time-varying volatility in the realized volatility
estimator. Based on these findings, among others, Corsi et al. (2008)
investigated this further by including a GARCH component in the HAR-RV
model explained in section 4.3. The results indicate that modeling the volatility
of realized volatility improves forecast accuracy. Thus, in the spirit of previous
research, this paper will apply a similar strategy and adopt AR models combined
with GARCH(1,1) components in order to forecast realized volatility.
Introducing the autoregressive conditional heteroskedasticity (ARCH) model,
Engle (1982) suggested a parametric approach to model the size of the errors
in the residual. To model the ARCH component one assumes a distribution of the
error term in the AR model, also referred to as the mean equation, and runs a
regression of the assumed variance, referred to as the variance equation, of the
error term based on past error terms. Four years later, Bollerslev (1986)
introduced the GARCH component, which includes past variance. A simple
AR(1)+GARCH(1,1), where $y_t$ denotes the realized volatility, can be shown as:
$$y_t = \mu + \phi_1 y_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim D(0, \sigma_t^2) \tag{6.1}$$
$$\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2, \tag{6.2}$$
where equation (6.1) is the mean equation and $\phi_1 y_{t-1}$ is the AR(1) term, and equation
(6.2) is the variance equation, in which $\alpha \varepsilon_{t-1}^2$ is the ARCH component and
$\beta \sigma_{t-1}^2$ is the GARCH component. Also note that $D(0, \sigma_t^2)$ refers to the assumed
distribution of the error term in the mean equation. Furthermore, all coefficients
in (6.2) take nonnegative values and, in order for stationarity to hold,
$\alpha + \beta < 1$. In many cases, $\alpha + \beta$ takes a value very close to one,
which yields a highly persistent volatility. In order to deal with this issue, Engle
& Bollerslev (1986) proposed the IGARCH model presented below:
$$\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + (1 - \alpha)\, \sigma_{t-1}^2, \tag{6.3}$$
where equation (6.3) presents the variance equation, in which the GARCH coefficient
is restricted to $(1 - \alpha)$ so that the ARCH and GARCH coefficients sum to one.
As mentioned in the previous section, negative and positive
returns might have an asymmetric impact on volatility. This might also be true when
speaking of negative and positive shocks to realized volatility. However, in this
case one might suspect that the opposite holds; in other words, an increase in
volatility might have a larger impact on the volatility compared to a decrease in
volatility. In order to consider this one can apply the eGARCH model, which has
proven useful in financial modeling (Nelson, 1991). The model is shown below:
$$\ln(\sigma_t^2) = \omega + \alpha \left( \frac{|\varepsilon_{t-1}|}{\sqrt{\sigma_{t-1}^2}} \right) + \gamma\, \frac{\varepsilon_{t-1}}{\sqrt{\sigma_{t-1}^2}} + \beta \ln(\sigma_{t-1}^2), \tag{6.4}$$
where $\gamma$ captures the asymmetric information, in which positive shocks have a
greater impact than negative shocks of equal magnitude. By transforming the
GARCH components into logarithms one allows the parameters to be negative,
while the conditional variance remains non-negative. Glosten et al. (1993)
proposed another way to model asymmetry in the ARCH component, referred to
as the GJR-GARCH model, explained below:
$$\sigma_t^2 = \omega + (\alpha + \gamma I_{t-1})\, \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2, \tag{6.5}$$
where $I_{t-1}$ is an indicator variable taking value 0 if $\varepsilon_{t-1} \geq 0$ and 1 if $\varepsilon_{t-1} < 0$.
Furthermore all coefficients are positive and the conditional variance is
nonnegative. Basically the model shows that positive shocks have an impact of
$\alpha$ on volatility while negative shocks have an impact of $\alpha + \gamma$ on volatility.
Thus if positive shocks have a larger impact on realized volatility than negative
shocks one should expect $\gamma < 0$. Finally, this paper applies an alternative
to equation (6.5) that has proven useful for modeling realized
volatility11, named TGARCH, originally proposed by Zakoian (1994). The
difference is that instead of modeling the variance equation using the variance, the
model takes the square root of (6.5). The model is expressed as:
$$\sqrt{\sigma_t^2} = \sqrt{\omega} + (\alpha + \gamma I_{t-1}) \sqrt{\varepsilon_{t-1}^2} + \beta \sqrt{\sigma_{t-1}^2}, \tag{6.6}$$
where $I_{t-1}$ is an indicator variable taking value 0 if $\varepsilon_{t-1} \geq 0$ and 1 if $\varepsilon_{t-1} < 0$.
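The variance recursion of the GJR-GARCH model in equation (6.5) can be sketched in Python as follows. Setting the asymmetry parameter to zero gives the GARCH(1,1) recursion in equation (6.2). This is an illustrative filter for given parameter values, not the estimation routine of the rugarch package; the function name, the parameter values and the residuals are hypothetical.

```python
def gjr_garch_variances(residuals, omega, alpha, gamma, beta, initial_variance):
    """Conditional variances from
    sigma2_t = omega + (alpha + gamma * I_{t-1}) * e_{t-1}^2 + beta * sigma2_{t-1},
    where I_{t-1} is 1 for a negative residual and 0 otherwise."""
    variances = [initial_variance]
    for e in residuals[:-1]:
        indicator = 1.0 if e < 0 else 0.0
        next_variance = (omega + (alpha + gamma * indicator) * e * e
                         + beta * variances[-1])
        variances.append(next_variance)
    return variances


# Hypothetical mean-equation residuals and parameter values.
residuals = [1.0, -1.0, 0.5]
garch = gjr_garch_variances(residuals, 0.1, 0.2, 0.0, 0.5, 1.0)
gjr = gjr_garch_variances(residuals, 0.1, 0.2, 0.1, 0.5, 1.0)
```

The two series only differ after the negative residual, which is where the indicator switches on the additional term.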
One of the first papers investigating the distribution of the volatility of realized
volatility was Corsi et al. (2008), in which they concluded that it is non-
Gaussian. The authors further assumed an inverse Gaussian distribution and
found that it improves prediction accuracy significantly. Since this distribution
has proven useful in other papers as well12, this paper has chosen to follow a
similar strategy and assumes an inverse Gaussian distribution of the residual in
the mean equation. Also worth mentioning is that Corsi et al. (2008) and some
other papers have used the HAR approach as the mean equation and then
modeled the variance equation based on the residual. Even though these results
have performed well for forecasting realized volatility, this paper has chosen to
use simple AR models when modeling the GARCH components. This is because
the second purpose of this paper is to evaluate model-averaging models and for
that reason it is useful to include models of different orders in order to get more
heterogeneity in predictions. It is, however, possible to argue against this
approach and thus previous explanation is presented as a precaution for these
arguments. Furthermore, the AR order of the mean equation has been chosen
based on the Bayesian information criterion (BIC) for the whole sample. The AR
model with consecutive lags and smallest BIC-value was chosen for each
sample-frequency. It is possible to argue against this approach as well, since this
paper deals with forecasts and thus using the full sample is somewhat contradictory
in the sense that rolling window forecasts are never based on the whole sample.
11 See (Degiannakis, 2008)
12 See also (Caporin & Velo, 2015), who adopt an inverse Gaussian distribution when predicting
realized measures based on different GARCH models
However, this paper has assumed that the impact on forecast performance if
selecting AR order based on another approach would be marginal and for that
reason this approach has been chosen due to its convenience. For all sample
frequencies, the AR(3) returned smallest BIC-value and has for this reason been
chosen as the mean equation for all of the AR+GARCH models. The GARCH
order has been set to (1,1) for all GARCH models. This is the standard order
according to the literature and has been used in previous studies when
forecasting realized volatility (Caporin & Velo, 2015; Corsi et al., 2008).
4.5 Loss Functions
This paper has used similar loss functions as previous studies in order to
determine forecast performance13. These functions involve the mean absolute error,
the mean squared error and the root mean squared error. The mean absolute error
weights errors of all sizes equally, while the mean squared error and the root mean
squared error assign more weight to larger forecast errors. Hence, the latter loss
functions give a good indication regarding the average size of the error as well.
Thus using three different loss functions improves the conditions for making an
accurate analysis concerning the forecast errors.
13 See for instance (Liu, Chiang, & Cheng, 2012), (Liu & Maheu, 2009) & (Wang & Nishiyama,
2015)
As mentioned previously, this paper investigates the optimal sample frequency
based on the performance on three different horizons. Since the units of realized
volatility are different on different sample frequencies, it is not possible to
compare loss functions directly. In order to deal with this issue each loss
function is measured relative to the actual outcome; in other words, it is the
relative forecast errors that are measured. Each loss function is described below:
$$MRAE = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{RV_t - \widehat{RV}_t}{RV_t} \right| \tag{7.1}$$
$$MRSE = \frac{1}{N} \sum_{t=1}^{N} \left( \frac{RV_t - \widehat{RV}_t}{RV_t} \right)^{2} \tag{7.2}$$
$$MRRSE = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left( \frac{RV_t - \widehat{RV}_t}{RV_t} \right)^{2}}, \tag{7.3}$$
where $RV_t$ is the actual observed outcome, $\widehat{RV}_t$ is the predicted
outcome and $N$ is the total number of observed predictions. Furthermore, MRAE
denotes the mean relative absolute error, MRSE denotes the mean relative
squared error and MRRSE denotes the mean root relative squared error. Thus, as
mentioned before, MRAE weighs all forecast errors equally, while MRSE and MRRSE
give more weight to large forecast errors. However, MRSE assigns more weight
to large forecast errors than MRRSE does.
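The three relative loss functions can be computed as in the Python sketch below. It assumes that MRRSE is the square root of the mean relative squared error; the function names and the example numbers are hypothetical.

```python
import math


def relative_errors(actual, predicted):
    """Forecast errors divided by the actual outcome."""
    return [(a - p) / a for a, p in zip(actual, predicted)]


def mrae(actual, predicted):
    errors = relative_errors(actual, predicted)
    return sum(abs(e) for e in errors) / len(errors)


def mrse(actual, predicted):
    errors = relative_errors(actual, predicted)
    return sum(e * e for e in errors) / len(errors)


def mrrse(actual, predicted):
    # Assumption: root of the mean relative squared error.
    return math.sqrt(mrse(actual, predicted))


# Hypothetical outcomes and forecasts.
actual = [2.0, 4.0]
predicted = [1.0, 5.0]
losses = (mrae(actual, predicted), mrse(actual, predicted), mrrse(actual, predicted))
```

Dividing by the actual outcome makes the losses unit-free, which is what allows a comparison across sample frequencies.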
4.6 Model-Averaging
As mentioned in section 3.3, model-averaging has a significant economic value
in terms of portfolio selection. This is mainly explained by its
way of dealing with model uncertainty. However, so far it is still unclear if there
is a difference in performance between Bayesian (BMA) and dynamic (DMA)
model-averaging. In order to gain some more insight in this area, three different
model-averaging approaches have been used. The first one is a benchmark
referred to as mean forecast combination (MFC), in which all models are
assigned equal weight without respect to previous prediction performance.
According to Smith & Wallis (2009), MFC performs as well as other model-
averaging approaches and thus should be preferred due to its simplicity.
However since it is likely that some models produce better predictions than
others one might assign the weights accordingly. The DMA approach is, as the
name implies, dynamic in the sense that the weight of each model is only based
on the last forecast performance observation. However, in the BMA approach a
larger sample of the loss function is used to calculate the average error. Thus,
weights of the BMA selection change slowly compared to the DMA. There are
several different BMA selections, some of which take the models’ covariance into
consideration. However, Smith & Wallis (2009) emphasized that weights based
on simple performance measurements tend to outperform weights based on
more complex approaches such as those including the covariance. Following their
suggestion, this paper uses a straightforward weighting process when applying
BMA and compares this to the results achieved from the non-performance-based
approach, MFC, and from the dynamic performance-based approach, DMA. The BMA
selection is expressed below, where $L_{i,t}$ represents the value of the loss function
for model $i$ in period $t$:
$$\bar{L}_{i,t} = \left( \frac{1}{n} \sum_{s=t-n+1}^{t} L_{i,s} \right) \tag{8.1}$$
$$w_{i,t+1} = \frac{\bar{L}_{i,t}^{-1}}{\sum_{k=1}^{K} \bar{L}_{k,t}^{-1}}, \tag{8.2}$$
where $\bar{L}_{i,t}$ denotes the average value of the loss function over the last $n$
periods, $K$ denotes the number of models and $w_{i,t+1}$ denotes the
weight of each model when forecasting one period ahead. The loss function’s
inverse is used in order to give models producing small forecast errors a greater
weight in the forecast. Basically, the average historical performance is used to
select the weight of each model. In this paper, the BMA selection follows a
rolling window approach based on the last 30 observations for the 1-day and
1-week horizons and 25 observations for the 2-weeks horizon14. For DMA, the
selection is exercised as below:
$$w_{i,t+1} = \frac{L_{i,t}^{-1}}{\sum_{k=1}^{K} L_{k,t}^{-1}}. \tag{8.3}$$
Note that in equation (8.3), the weighting is only based on the performance of
the last prediction instead of using an average of past predictions. If model
performance is time-varying, DMA should outperform BMA. However, if model
performance is less time-varying, BMA should outperform DMA.
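The three weighting schemes can be sketched in Python as follows, assuming inverse-loss weights as in equations (8.2) and (8.3). The function names and the loss histories are hypothetical; loss_history[i] holds the past loss values of model i, oldest first.

```python
def mfc_weights(n_models):
    """Mean forecast combination: equal weight for every model."""
    return [1.0 / n_models] * n_models


def inverse_loss_weights(losses):
    """Weights proportional to the inverse of each model's loss."""
    inverse = [1.0 / loss for loss in losses]
    total = sum(inverse)
    return [v / total for v in inverse]


def bma_weights(loss_history, window):
    """Weights from the average loss over the last `window` periods."""
    averages = [sum(h[-window:]) / len(h[-window:]) for h in loss_history]
    return inverse_loss_weights(averages)


def dma_weights(loss_history):
    """Weights from the most recent loss only."""
    return inverse_loss_weights([h[-1] for h in loss_history])


def combine(forecasts, weights):
    """Weighted average of the single-model forecasts."""
    return sum(f * w for f, w in zip(forecasts, weights))


# Hypothetical losses for two models over three periods.
loss_history = [[0.1, 0.1, 0.4], [0.2, 0.2, 0.2]]
bma = bma_weights(loss_history, window=3)
dma = dma_weights(loss_history)
```

In this example the first model performed well on average but poorly in the last period, so the average-based weights stay equal while the weights based on the last period shift towards the second model.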
As a final step, this paper considers how performance changes when the set of
averaged models is restricted by ranking each model on its loss function value.
In this way it is possible to identify the optimal restriction for BMA and DMA.
Once this is established for each horizon and loss function, BMA and DMA are
compared with each other to see whether the most accurate restriction of one
averaging approach performs better than that of the other. This approach has
not yet been used for realized volatility, but it is applied here because some
models included in the averaging process may never outperform the top models
and may thus reduce prediction accuracy rather than improve it. While this
approach gives insight into whether BMA is better than DMA, it is limited in
the sense that in practice the optimal number of models to include is not known
in advance. It nevertheless adds value. Firstly, it shows whether restricting
the average to a set of top-performing models, rather than using all of them,
is a useful way to reduce forecast error. Secondly, it makes it possible to
conclude whether BMA and DMA differ in forecast performance when both are
specified optimally. Finally, it generates observations of the performance
difference between BMA and DMA, which makes it possible to run an OLS test of
whether the difference is significant.
The restrictions are based on the average selection in equations (8.2) and
(8.3), starting with all models and then removing two models at a time, in five
steps, to see whether prediction performance improves. A final point to
highlight is that all model-averaging predictions on the 2-weeks horizon
consist of only 25 forecast observations. This casts doubt on whether the
normality assumption holds on this horizon and should be considered a
limitation of these results. On the 1-day and 1-week horizons, however, 70
forecast observations are obtained, and the sample is thus assumed to be large
and normally distributed.
14 Since the 2-weeks horizon only consists of 50 forecast observations,
Bayesian model-averaging is based on 25 past observations in order to ensure
that 25 forecast observations are obtained.
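The restriction step can be sketched as follows. This is an illustration only: `restricted_weights`, the model names and the loss values are hypothetical, and the ladder 10, 8, 6, 4, 2 is read from the description of starting with all ten models and removing two at a time.

```python
def restricted_weights(average_loss, k):
    """Keep the k models with the smallest loss and reweight them by inverse loss.

    average_loss maps a model name to its (strictly positive) loss value.
    """
    ranked = sorted(average_loss, key=average_loss.get)  # best model first
    kept = ranked[:k]
    inverse = {name: 1.0 / average_loss[name] for name in kept}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


# Number of models kept at each step, as read from the text.
RESTRICTIONS = [10, 8, 6, 4, 2]

# Hypothetical losses for three models, restricted to the top two.
losses = {"m1": 0.1, "m2": 0.2, "m3": 0.4}
print(restricted_weights(losses, k=2))  # m3 is dropped, m1 gets 2/3 of the weight
```

Passing the latest losses instead of the averaged ones gives the corresponding restricted DMA weights.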
5. Data
High-frequency data for OMX30 between 2nd January 2007 and 28th December 2012
have been retrieved from the Thomson Reuters Tick Database. OMX30 consists of
the 30 most traded stocks on the Nasdaq Stockholm stock exchange. The Swedish
stock exchange has been selected since it is considered to be a mature market,
so that the results can be compared with those based on U.S. data. OMX30 has
been chosen due to its liquidity and since it is a convincing proxy of the
market portfolio. The period has been chosen since it reflects a turbulent
period on the stock market, in which both the financial crisis and the EU debt
crisis occurred. The collected data consist of intraday stock prices at three
different frequencies, 1-, 5- and 10-min, where the final trade in each
interval gives the price for that interval. Furthermore, the data have been
thoroughly cleaned of observations not reflecting a true trading day. On days
before a holiday, the Swedish stock exchange is open from 9.00 am to 1.00 pm,
which does not reflect a whole trading day. For this reason these days have
been removed in order to get a similar number of intraday observations on all
days. The Swedish equity market has opening hours between 09:00 and 17:30, and
the data have been restricted to this time interval as well. The data consist
of 1482 trading days and 313 trading weeks. The observed intraday prices have
been transformed into logarithms, so all models are based on intraday log
returns. As mentioned previously in this paper, the 1-day, 1-week and 2-weeks
horizons are predicted, and this section provides further details regarding
each horizon.
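The step from intraday prices to a daily realized volatility figure can be sketched as below. The sketch assumes the standard definition, the square root of the sum of squared intraday log returns, which is not restated in this section. The function names and the price series are hypothetical, and the resampling helper assumes an evenly spaced price grid whose first observation is the start of the day.

```python
import math


def log_returns(prices):
    """Intraday log returns from a list of consecutive prices."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]


def realized_volatility(prices):
    """Square root of the sum of squared intraday log returns (assumed definition)."""
    return math.sqrt(sum(r * r for r in log_returns(prices)))


def resample(prices, step):
    """Keep every `step`-th price, e.g. step=5 turns a 1-min grid into a 5-min grid.

    Trailing observations that do not complete a full interval are dropped.
    """
    return prices[::step]


# Hypothetical 1-min prices for a short stretch of one trading day.
one_min = [100.0, 100.2, 99.9, 100.1, 100.4, 100.3, 100.0, 100.2, 100.5, 100.4, 100.6]
print(realized_volatility(one_min))               # 1-min sampling
print(realized_volatility(resample(one_min, 5)))  # 5-min sampling
```

Sampling less often uses fewer returns from the same day, which is why the summary statistics are reported separately for each frequency.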
5.1 Daily Realized Volatility
Table 1 reports the statistical properties of 1-day realized volatility,
separated by 1-min, 5-min and 10-min frequency. The table shows that the
frequencies are quite similar in terms of quartiles and medians. The
differences are more obvious in terms of min and max, where the 10-min
frequency has the smallest observed min and the largest observed max. Also
worth noticing is that the standard deviation is distinctly larger on the 5-
and 10-min frequencies than on the 1-min frequency, suggesting that the
variation of realized volatility is larger on lower frequencies. In terms of
skewness all three frequencies show a similar pattern, with values around 2,
indicating that the distributions are right skewed and not normally
distributed. The kurtosis is also remarkably larger than 3, revealing that the
data contain more outliers than would be the case if normally distributed.
Table 1. Daily Realized Volatility
SAMPLE-FREQUENCY      1-MIN       5-MIN       10-MIN
Min                   .003339     .003124     .002762
1st Quartile          .007709     .007572     .007250
Median                .010113     .010434     .010015
Mean                  .011393     .011812     .011595
3rd Quartile          .013361     .014158     .013934
Max                   .057814     .061586     .070780
Standard Deviation    .00569      .006228     .006473
Skewness              2.241143    2.242893    2.442556
Kurtosis              11.91032    12.10872    14.23351
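The skewness and kurtosis rows can be read against the normal benchmark of 0 and 3 with a small sketch like the one below. It uses simple moment-based estimators; which estimator the statistical software behind the table applied is not stated, so the function and the sample values are illustrative assumptions.

```python
def skewness_and_kurtosis(values):
    """Moment-based skewness and (non-excess) kurtosis; a normal sample gives 0 and 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical volatility-like sample with one large spike.
sample = [0.008, 0.009, 0.010, 0.011, 0.012, 0.060]
skew, kurt = skewness_and_kurtosis(sample)
print(skew > 0)  # True: the spike pulls the right tail out
```

A single crisis-type spike is enough to make a sample right skewed, which matches the pattern reported for all three frequencies.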
Graphs 1 to 6 illustrate the daily data as a time-series plot and a histogram
for each frequency. The distribution is similar for all frequencies. Graphs 1,
3 and 5 show that volatility peaked during 2008, 2009 and 2011, the periods in
which the mentioned crises occurred. Graphs 2, 4 and 6 reveal, in line with
expectations, that realized volatility is right skewed.
[Graphs 1 to 6: time-series plots and histograms of daily realized volatility
for each frequency; axis label: Realized Volatility]
5.2 Weekly Realized Volatility
Table 2 has the same structure as table 1, but shows the 1-week horizon of
realized volatility. Table 2 shows that 1-min data have a smaller range than
5-min and 10-min data. As in table 1, the standard deviation is also smaller
for 1-min data than for 5-min and 10-min data. Also worth mentioning is that,
even though skewness is still above zero and kurtosis is larger than 3, both
take smaller values than when measured on a daily basis. This implies that,
even though weekly realized volatility is still not normally distributed, it
seems to be closer to a normal distribution than daily realized volatility.
Table 2. Weekly Realized Volatility
SAMPLE-FREQUENCY      1-MIN        5-MIN        10-MIN
Min                   .007517      .006947      .006196
1st Quartile          .017346      .016984      .016530
Median                .022692      .022992      .022498
Mean                  .025036      .026025      .025375
3rd Quartile          .029297      .030693      .029998
Max                   .093881      .096257      .094299
Standard Deviation    .01189301    .01293827    .01268386
Skewness              1.885782     1.785964     1.807755
Kurtosis              8.545417     7.662383     7.766664
Graphs 7 to 12 have the same structure as the previous graphs for daily
realized volatility. However, since there are fewer observations, the plots are
less dense: they reflect only 313 weekly observations rather than 1482 daily
observations. The plots in graphs 7, 9 and 11 show a very similar pattern, in
which the previously mentioned crises stand out. The histograms in graphs 8, 10
and 12 seem to be less concentrated in the middle than those in graphs 2, 4 and
6, but are still right skewed.
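[Graphs 7 to 12: time-series plots and histograms of weekly realized volatility
for each frequency; axis label: Realized Volatility]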
5.3 Two Weeks Realized Volatility
Table 3 reports the statistical properties of the data on the 2-weeks horizon
of realized volatility. The data show a similar pattern as for weekly realized
volatility. In this case, however, the min is largest not on the 1-min
frequency but on the 5-min frequency. For standard deviation, skewness and
kurtosis the pattern is very similar: variation is larger on lower frequencies,
and all frequencies follow a right-skewed distribution with more outliers than
under a normal distribution.
Table 3. Two Weeks Realized Volatility
SAMPLE-FREQUENCY      1-MIN        5-MIN        10-MIN
Min                   .01226       .01367       .01254
1st Quartile          .02542       .02425       .02363
Median                .03196       .03328       .03179
Mean                  .03575       .03721       .03670
3rd Quartile          .04049       .04307       .04384
Max                   .1202        .12470       .1332
Standard Deviation    .01623894    .01762449    .01817101
Skewness              1.836134     1.742474     1.890126
Kurtosis              8.125699     7.329424     8.377635
Graphs 13, 15 and 17 are even less dense than in the previous subsections since
they consist of only 156 observations. The pattern, however, is very similar to
the daily and weekly horizons. Furthermore, the histograms in graphs 14, 16 and
18 are even less concentrated in the middle than for daily and weekly data, and
they also seem to contain a larger proportion of outliers, with values larger
than 0.12.
[Graphs 13 to 18: time-series plots and histograms of two-weeks realized
volatility for each frequency; axis label: Realized Volatility]
6. Results
This section consists of two subsections that present the results for the two
purposes of this study. The first subsection provides the results regarding the
optimal frequency, in which the performance of all models on 1-min, 5-min and
10-min data is tested. Since the 1-min frequency shows the strongest forecast
performance, the results in the second subsection, which compare different
model-averaging approaches, are based on this frequency. The section also gives
a brief explanation of the most important results of the robustness tests.
Except for the robustness tests15, all interpreted results are presented in
tables at the end of section 6.2.
6.1 Optimal Frequency
The first results are presented in table 4. These results show the performance
of each single model in terms of mean relative absolute error (MRAE), mean
relative square error (MRSE) and mean relative root square error (MRRSE) on the
1-day, 1-week and 2-weeks horizons. Table 4 is separated by frequency in
columns (2), (3) and (4), in which 1-, 5- and 10-min data, respectively, are
the underlying data used to forecast realized volatility. The bolded results,
except for the final row, show the best performing model for each horizon, loss
function and frequency. The last row, denoted average error, shows the total
average error on each frequency including all horizons and models. All forecast
errors presented in table 4 are significantly different from zero at the 1%
level.
Table 4 reveals that HAR models consistently yield smaller forecast errors than
GARCH models on the 1-week and 2-weeks horizons regardless of loss function and
frequency. Furthermore, on these horizons the forecast errors of the GARCH
models are much larger than those of the HAR models in terms of MRSE and MRRSE,
but only slightly larger in terms of MRAE. This indicates that the size of the
errors for the GARCH models is substantially larger than what MRAE reveals. For
this reason it is possible to suspect that models originating from the GARCH
family are less stable in terms of performance on the 1-week and 2-weeks
horizons. Also worth noticing is that the extensions of the original HAR-RV
model improve performance. Looking at the first five rows on each frequency,
adding new components to the original HAR-RV decreases the forecast errors on
almost all horizons. The LHAR-CJ-RV is the model that most frequently provides
the smallest forecast error, which supports the importance of including jump
components and considering the leverage effect when modeling volatility. The
best performer of the GARCH family, however, is not as consistent. On the 1-day
horizon, eGARCH shows signs of performing best among the GARCH models. On the
1-week and 2-weeks horizons, eGARCH and TGARCH alternate as the best performer,
but the fact that both these models consider the impact of small and large
volatility of realized volatility and perform well implies that this might add
value when forecasting realized volatility. Finally, the last row in table 4
shows that on average 1-min data provide smaller forecast errors than 5-min and
10-min data in terms of all loss functions. The final row also shows that 5-min
data yield smaller forecast errors than 10-min data in terms of all loss
functions.
15 Main findings of the robustness checks are shown in appendix tables 9, 10
and 11. Full results are available on request.
Table 5 illustrates the difference in forecast performance of each model when
based on 1-min, 5-min and 10-min frequency. In columns (2) and (3), the 1-min
frequency is tested against the 5-min and 10-min frequency, respectively, and
in column (4) the 5-min frequency is tested against the 10-min frequency. A
negative value implies that models based on the frequency being subtracted
yield a larger forecast error than models based on the frequency not being
subtracted, and a positive value indicates the opposite. The last row shows the
average difference in forecast error between the frequencies being tested.
* indicates significance at the 5% level, ** at the 2% level and *** at the 1%
level.
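As a rough consistency check, the HAR-RV entries on the 1-day horizon in column (2) of table 5 can be approximately recovered from the 1-min and 5-min entries in table 4. The check assumes that the trailing numbers in the legends of tables 4 and 5 are scaling multipliers (table 4: MRSE × 100 and MRRSE × 10; table 5: MRAE × 100, MRSE × 1000 and MRRSE × 100). Small gaps are expected because table 4 is rounded to three decimals.

```latex
\begin{align*}
\text{MRAE:}\quad  & (0.212 - 0.216) \times 100 = -0.4 \approx -0.382 \\
\text{MRSE:}\quad  & \frac{0.104 - 0.123}{100} \times 1000 = -0.19 \approx -0.197 \\
\text{MRRSE:}\quad & \frac{0.234 - 0.254}{10} \times 100 = -0.20 \approx -0.199
\end{align*}
```

In each case the 5-min error is subtracted from the 1-min error, so a negative sign means that the 1-min frequency produced the smaller error.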
Looking at column (2) in table 5, which gives the difference in forecast error
between the 1-min and 5-min frequency for each single model, no significant
difference is obtained on any horizon. However, in terms of MRAE, MRSE and
MRRSE the minus sign occurs 57%, 73% and 83% of the time, respectively. This
indicates that 1-min data yield a smaller average forecast error than 5-min
data in a majority of cases for all loss functions. The average difference in
forecast error shown in the last row reveals that, for all loss functions, the
average forecast errors are larger on the 5-min frequency. Furthermore, these
results are significant at the 5% level in terms of MRSE and MRRSE. Thus, the
results suggest that a 1-min sample-frequency yields less forecast error than a
5-min sample-frequency.
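The two summaries used in this paragraph, the share of negative differences and the significance of the average difference, can be sketched as follows. The test shown is a plain t-statistic for a zero mean, which is what an OLS regression of the differences on a constant would report with classical standard errors. That this matches the exact test behind table 5 is an assumption, and the function names are hypothetical. The example values are the ten 1-day MRSE entries of column (2) in table 5, not the full set of 30.

```python
import math


def share_negative(differences):
    """Fraction of differences with a minus sign."""
    return sum(1 for d in differences if d < 0) / len(differences)


def mean_t_statistic(differences):
    """t-statistic for H0: the mean difference is zero."""
    n = len(differences)
    mean = sum(differences) / n
    variance = sum((d - mean) ** 2 for d in differences) / (n - 1)  # sample variance
    return mean / math.sqrt(variance / n)


# 1-day MRSE differences, (1-min) - (5-min), as listed in table 5.
mrse_1day = [-0.197, -0.069, -0.117, -0.126, -0.142,
             0.015, 0.021, 0.011, -0.010, 0.007]
print(share_negative(mrse_1day))  # 0.6
print(mean_t_statistic(mrse_1day))
```

The t-statistic would then be compared with critical values for the chosen significance level, bearing in mind that the models share the same underlying data, so the differences are unlikely to be independent.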
Moving to column (3) in table 5, the forecast error of 10-min data is
subtracted from that of 1-min data. Also in this case a negative result
indicates that the forecast error of 10-min data is on average larger than that
of 1-min data. On the 1-day horizon, a majority of the results in terms of MRAE
and MRRSE are significant in favor of the 1-min frequency. Even though the
results in terms of MRSE on the 1-day horizon are not as convincing as those
for MRAE and MRRSE, they still contain significant results, with all values
being negative. For the 1-week and 2-weeks horizons no significant results are
found; it is, however, worth mentioning that only 3 out of 60 results are
positive, giving some support for 1-min data. Finally, the last row shows that
for all loss functions the 1-min frequency yields a significantly smaller
forecast error than the 10-min frequency.
In column (4) of table 5, 5-min data are tested against 10-min data. In this
case, performance is particularly in favor of 5-min data on the 1-day horizon,
since all loss functions yield negative results, a majority of which are
significant. Furthermore, across all loss functions on the 1-week and 2-weeks
horizons, only 5 out of 60 observations are positive. These findings suggest
that using 10-min data to forecast realized volatility yields larger prediction
errors than using 5-min data. Finally, the last row in this column supports
this further by showing that, for all loss functions, 5-min data significantly
outperform 10-min data.
To summarize the findings of tables 4 and 5: according to table 4 there seems
to be evidence supporting HAR models over AR+GARCH models on the 1-week and
2-weeks horizons. This difference, however, is never statistically tested.
Furthermore, a strong majority of the models including leverage effects and
jumps generate smaller forecast errors than those not considering them. Of
particular interest for the purpose of this paper are the results in table 5,
which yield strong evidence in favor of using 1-min or 5-min data rather than
10-min data. The results also give support for using 1-min data rather than
5-min data. These results are not as statistically convincing as those against
the 10-min frequency, but they still show a difference that is significant at
the 5% level in terms of MRSE and MRRSE. Thus, the empirical evidence is still
in strong favor of 1-min data. For this reason, the next section, which
concerns model-averaging, uses the 1-min frequency as the underlying data.
6.2 Model-Averaging
Moving over to the second purpose of this study, table 6 illustrates how single
models perform compared to averaged models based on all ten models and the
1-min frequency. Table 6 shows the difference in prediction error between
single models and combined models in terms of MRAE, MRSE and MRRSE. The results
in columns (2), (3) and (4) illustrate the difference in performance for single
models compared to Mean forecast combinations (MFC), Bayesian model-averaging
(BMA) and Dynamic model-averaging (DMA), respectively, on each horizon. A
negative sign implies that the combined model provides a larger forecast error
than the single model. The final row, named “AVERAGE DIFFERENCE”, shows the
average difference in forecast error between combined models and all single
models on all horizons.
Starting with the performance of Mean forecast combinations (MFC) in column (2)
of table 6, all signs are negative on all horizons, except for 6 observations
on the 1-day horizon, implying that MFC forecast errors are larger than those
of the single models in almost all cases. For the 1-week and 2-weeks horizons
almost all results are negative and significant at the 5% level or better, a
strong indicator that MFC does not provide accurate forecasts. The results in
the final row further confirm that MFC is on average significantly worse than
the individual models. Thus, from a statistical standpoint it is possible to
conclude that averaging models with equal weights, not based on any performance
indicator, does not deal with model uncertainty very efficiently, in the sense
that almost all single models perform better on average.
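For contrast with the performance-based schemes, equal weighting can be written in a single line. The function name and the forecast values below are hypothetical and serve only to show that no loss information enters the combination.

```python
def mean_forecast_combination(forecasts):
    """Equal-weight average of the individual model forecasts."""
    return sum(forecasts.values()) / len(forecasts)


# Hypothetical one-step-ahead forecasts from two models.
print(mean_forecast_combination({"HAR": 0.010, "GARCH": 0.014}))
```

Because every model receives the weight 1/N regardless of its track record, a model that is consistently far off pulls the combined forecast away from the better models by exactly as much as they pull it back.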
The results are more divided when moving from column (2) in table 6 to column
(3), which illustrates the difference in forecast errors between Bayesian
model-averaging (BMA) and single models. On the 1-day horizon, no significant
results are obtained, and in terms of MRAE 80% are positive, indicating support
for BMA. In terms of MRSE and MRRSE, however, only 50% are positive. The
results on the 1-week horizon are very distinctive. Except for HAR-RV in terms
of MRAE, all models from the HAR family are negative, and in terms of MRSE and
MRRSE significant at the 5% level or better. In contrast, the results for all
models from the GARCH family are positive, and in terms of MRSE and MRRSE
significant at the 5% level or better. For the 2-weeks horizon, BMA performs
better than 50% of the models in terms of all loss functions. Thus, in total
for all horizons, BMA provides better forecasts than at least 50% of the single
models for all loss functions. Furthermore, the average difference shown in the
last row indicates that BMA on average yields smaller forecast errors than
single models. Thus, even though these results are not significant, they still
support the view that BMA might help deal with model uncertainty.
Finally, column (4) of table 6 shows the difference in performance between
single models and Dynamic model-averaging (DMA). On the 1-day horizon, a
majority of the values are negative in terms of all loss functions, which
suggests that applying DMA for 1-day forecasts might not be very efficient.
However, none of the values on this horizon are significant. For the 1-week
horizon the results are very similar to those shown for BMA: the HAR family
outperforms DMA, which in turn outperforms the GARCH family. All of these
results are significant in terms of MRSE and MRRSE, while none are significant
in terms of MRAE. For the 2-weeks horizon, the results are in favor of DMA. In
terms of MRAE and MRSE, a majority of the results are positive but not
significant. In terms of MRRSE, DMA performs significantly better than 50% of
the single models and is outperformed by the other half. Also for DMA the final
row, indicating the average difference in forecast errors over all models and
horizons, shows positive results in terms of all loss functions. These results
are not significant but still imply that DMA might deal with model uncertainty.
Also worth noticing is that the value on the last row is larger for DMA than
for BMA for all loss functions, indicating that the error on average is smaller
for DMA as well.
The results from table 6 are twofold. First, using an averaging approach that
does not consider forecast performance, such as MFC, does not deal with model
uncertainty and gives significantly worse forecasts than any single model a
strong majority of the time. Second, BMA on average generates better
predictions than at least 50% of the single models on all horizons. The total
average difference in the last row is also positive, but not significant, for
both BMA and DMA, indicating weak evidence in favor of both approaches.
However, this table does not consider what restrictions should be set for DMA
and BMA in terms of the number of averaged models included, nor does it
consider which approach is better than the other. These results are illustrated
in tables 7 and 8.
Table 7 illustrates the difference in performance between full models (BMA 10
and DMA 10), in which all models are included, and restricted models, in which
only the top 8, 6, 4 and 2 models are included in the averaging process. The
table is separated to show explicitly the performance of the different
specifications of BMA and DMA. The performance of the different BMA models is
shown in the first 10 rows and that of the DMA models in the last 10 rows. Each
model has been evaluated and ranked on each horizon and loss function. Even
though the difference in most cases is not significant, it is still interesting
to identify which restriction on average produces the smallest forecast errors,
in order to see how the BMA and DMA versions of these models perform against
each other, as shown in table 8.
Table 7 reveals that on the 1-day horizon, column (2), the full model
consisting of all ten single models is top ranked in terms of all loss
functions according to both BMA and DMA. This implies that when forecasting 1
day ahead, more models rather than fewer add value in terms of forecast
performance. On the 1-week horizon, column (3) of table 7, restricting the
averaging process to only 4 models for BMA and 2 models for DMA provides the
smallest forecast error in terms of MRSE and MRRSE. For BMA these results are
significant at the 5% level or better for all models, except for the BMA model
including only two models in terms of MRSE. For DMA no results are significant.
In terms of MRAE, BMA 4 and DMA 10 are top ranked, with no significant results
in either case. Moving over to the 2-weeks horizon, column (4) of table 7, the
top-ranked specifications differ more between BMA and DMA. In terms of MRAE,
BMA 10 and DMA 8 are ranked as top performers. In terms of MRSE and MRRSE,
however, BMA 2 and DMA 10 are ranked as top performers. In terms of MRRSE, BMA
2 is significant at the 2% level in all cases.
In the final table, table 8, the results illustrate the difference in forecast
performance between all the different BMA and DMA combinations presented in
table 7. All results are divided into the 1-day, 1-week and 2-weeks horizons in
columns (2), (3) and (4), respectively. The final row, named “AVERAGE
DIFFERENCE”, illustrates the average difference between BMA and DMA based on
the different model restrictions. Note, however, that the final row is based on
only 25 observations, and its implications should for this reason be
interpreted with caution. In table 8, a negative value implies that the
specified BMA yields a smaller forecast error than the specified DMA, and a
positive value indicates the opposite. The bolded results illustrate the
difference between the top performers of BMA and DMA identified in table 7.
The results for 1-day forecasts presented in column (2) of table 8 show that,
for all loss functions, all values are negative except for a few observations
for MRRSE. Comparing the top performers, which are BMA 10 and DMA 10 for all
loss functions, the difference in performance is negative and significant at
the 5% level or better in all cases. In the final row, the average difference
based on the performance of the different model combinations is negative for
all loss functions, and significant at the 1% level in terms of MRAE and MRSE.
Thus, for 1 day ahead there is strong evidence that BMA provides significantly
better forecasts than DMA, both for the top performers and on average. These
findings suggest that model performance when forecasting 1 day ahead is less
time-varying, so that the benefits of DMA are less needed.
For 1 week ahead, the results presented in column (3) of table 8 also show that
BMA yields less forecast error than DMA. In this case, however, a majority of
the MRAE results are positive, but not significant. The bolded MRAE result,
which compares the top performers of each averaging approach, takes a very
small positive value, implying that DMA slightly outperforms BMA when top
performers are compared. However, when the top performers of each averaging
approach are compared in terms of MRSE and MRRSE, BMA outperforms DMA at the 5%
significance level or better. The results shown in the final row are also
strongly in favor of BMA in terms of all loss functions, and significant at the
2% level or better in terms of MRSE and MRRSE. Thus, according to the results
in column (3) of table 8, BMA has support on the 1-week horizon as well.
Finally, the results on the 2-weeks horizon in column (4) of table 8 differ
from those on the other two horizons. In terms of MRAE there is strong evidence
in favor of DMA: all values are positive and the final row is significant at
the 1% level in favor of DMA. In terms of MRSE and MRRSE, however, the evidence
for DMA is not as supportive, even though a majority of the results take a
positive value in terms of MRSE. Moreover, the top performers of BMA and DMA
still show that BMA outperforms DMA in terms of MRSE and MRRSE. The final row
shows that in terms of MRSE, DMA on average yields a smaller forecast error
than BMA, but these results are not significant. Furthermore, in terms of MRRSE
the value of the final row is negative but not significant, which makes it hard
to conclude anything with certainty regarding BMA and DMA forecast performance
on this horizon. Thus, for 2 weeks ahead there seems to be some support in
favor of DMA, but these results are weak and should be interpreted with
caution. One possible reason why DMA performs better relative to BMA on the
2-weeks horizon is that model performance on this horizon is more time-varying
than on the 1-day and 1-week horizons.
Summarizing this section on model-averaging, table 6 established that MFC does
not deal with model uncertainty and is on average significantly outperformed by
single models regardless of horizon. Furthermore, the results from table 6
indicate that BMA and DMA perform better than at least 50% of the single models
on all horizons in terms of all loss functions, giving some support to the view
that model-averaging based on historical performance deals with model
uncertainty. However, a majority of the results are not significant. Table 7
established that differences in performance between different combinations
exist and that the best combination is not always the one containing the most
models. However, the statistical evidence on how much forecast performance
depends on the number of models included is not particularly strong. Finally,
the results illustrated in table 8 are in favor of BMA on the 1-day and 1-week
horizons. For the 2-weeks horizon the evidence is more mixed: in terms of MRAE
there is significant evidence in favor of DMA, but in terms of MRSE and MRRSE
the evidence is not as convincing.
In order to check the robustness of the results, two robustness checks have
been carried out. In the first robustness check, the whole process was repeated
on an All-share index16. The results were strikingly similar to those presented
in tables 4-8, with some small differences. Compared to the results shown in
table 5, the forecast error for 1-min data was even more significantly smaller
than for 5-min and 10-min data, while the difference between 5-min and 10-min
data was not as significant, even though still in favor of 5-min data.
Furthermore, compared to the results in table 8, the robustness test was even
more favorable to Bayesian model-averaging on the 1-day and 1-week horizons,
while the 2-weeks horizon results were somewhat more in favor of DMA17. The
second robustness check was based on OMX30 in a period after the financial
crisis. The main difference for this period was that no significant difference
between the 1-min and 5-min sample-frequency could be established. However,
both 1-min and 5-min data outperform the 10-min frequency. Since the sample is
restricted, this robustness check could not be run on the 2-weeks horizon, but
for the 1-day and 1-week horizons the results were significant in favor of
BMA18. In summary, the robustness tests support the main findings of this paper
to a large extent.
Below are all the results, presented in the tables referred to in this section.
Tables 4 and 5 show the forecast error and the difference in forecast error of
models based on different sample-frequencies. Tables 6, 7 and 8 show how the
different model-averaging approaches perform compared to single models, and
then how different restrictions of Bayesian or Dynamic model-averaging perform
internally and against each other.
16 An index that includes all large, mid and small cap stocks listed on Nasdaq
Stockholm.
17 Main results of the robustness check on the All-share index are shown in
appendix column (1), tables 9, 10 and 11.
18 Main findings of the robustness check based on OMXS30 data after the
financial crisis are illustrated in appendix column (2), tables 9, 10 and 11.
Table 4. Single Model Forecast Error on 1-, 5- and 10-min frequency
MRAE = Mean Relative Absolute Error
MRSE = Mean Relative Square Error × 100
MRRSE = Mean Root Relative Square Error × 10
All results in the table are significantly different from zero with 1%

(1)                   1-Min (2)              5-Min (3)              10-Min (4)
                      MRAE   MRSE   MRRSE    MRAE   MRSE   MRRSE    MRAE   MRSE   MRRSE
1-Day Ahead (n=100)
HAR-RV                .212   .104   .234     .216   .123   .254     .249   .152   .287
HARJ-RV               .222   .108   .245     .21    .115   .247     .251   .144   .289
HARCJ-RV              .214   .101   .234     .206   .112   .242     .245   .138   .282
LHARJ-RV              .202   .093   .223     .198   .106   .234     .240   .130   .277
LHARCJ-RV             .200   .089   .22      .195   .103   .231     .235   .126   .271
AR(3)+GARCH(1,1)      .207   .098   .219     .223   .096   .234     .249   .115   .257
AR(3)+iGARCH(1,1)     .207   .098   .219     .224   .096   .235     .249   .119   .258
AR(3)+GJRGARCH(1,1)   .207   .097   .218     .226   .095   .235     .245   .113   .255
AR(3)+eGARCH(1,1)     .207   .094   .217     .227   .095   .235     .25    .114   .256
AR(3)+TGARCH(1,1)     .207   .096   .218     .227   .096   .236     .249   .113   .256
1-Week Ahead (n=100)
HAR-RV                .191   .063   .199     .194   .077   .209     .199   .082   .214
HARJ-RV               .188   .062   .196     .181   .065   .197     .189   .069   .203
HARCJ-RV              .183   .059   .191     .182   .066   .198     .188   .068   .201
LHARJ-RV              .179   .059   .191     .180   .064   .200     .186   .069   .204
LHARCJ-RV             .176   .057   .188     .179   .064   .199     .184   .069   .201
AR(3)+GARCH(1,1)      .199   .159   .298     .193   .149   .292     .202   .169   .305
AR(3)+iGARCH(1,1)     .205   .169   .304     .194   .148   .293     .202   .169   .305
AR(3)+GJRGARCH(1,1)   .193   .147   .288     .193   .148   .291     .191   .157   .285
AR(3)+eGARCH(1,1)     .195   .148   .287     .186   .146   .277     .196   .154   .288
AR(3)+TGARCH(1,1)     .192   .147   .287     .192   .151   .291     .199   .171   .298
2-Weeks Ahead (n=50)
HAR-RV                .189   .085   .202     .189   .081   .203     .201   .087   .215
HARJ-RV               .185   .08    .199     .187   .086   .206     .204   .091   .22
HARCJ-RV              .179   .073   .192     .181   .081   .198     .199   .087   .215
LHARJ-RV              .167   .073   .177     .173   .076   .189     .187   .082   .200
LHARCJ-RV             .162   .068   .172     .17    .072   .184     .184   .078   .197
AR(3)+GARCH(1,1)      .198   .213   .323     .194   .225   .322     .201   .232   .328
AR(3)+iGARCH(1,1)     .197   .216   .322     .204   .238   .337     .215   .251   .349
AR(3)+GJRGARCH(1,1)   .193   .21    .326     .197   .221   .332     .196   .222   .326
AR(3)+eGARCH(1,1)     .195   .218   .331     .201   .241   .341     .199   .225   .336
AR(3)+TGARCH(1,1)     .200   .218   .330     .194   .219   .326     .204   .231   .336
AVERAGE ERROR         .195   .117   .242     .197   .122   .249     .213   .134   .264

Note: Column (1) shows each single model used to forecast realized volatility,
separated by 1-day, 1-week and 2-weeks horizons. Columns (2), (3) and (4) show
the forecast error for each single model for 1-min, 5-min and 10-min
frequencies, respectively. The forecast error on each frequency is separated by
three different loss functions: MRAE, MRSE and MRRSE. Except for the last row,
all bolded results denote the best performing model in each frequency, loss
function and horizon. The last row shows the average error based on all models
and all horizons (n=30).
Table 5. Forecast Difference Between Different Frequencies
MRAE = Mean Relative Absolute Error 100
MRSE = Mean Relative Square Error 1000
MRRSE = Mean Root Relative Square Error 100
***1% significance, **2% significance and *5% significance
(1) (1-min) – (5-min) (2) (1-min) – (10-min) (3) (5-min) – (10-min) (4)
MRAE MRSE MRRSE MRAE MRSE MRRSE MRAE MRSE MRRSE
1-Day Ahead ( =100)
HAR-RV -.382 -.197 -.199 -3.66 -.48* -.537*** -3.278*** -.283*** -.338***
HARJ-RV 1.262 -.069 -.019 -2.884 -.363* -.443** -4.146*** -.294* -.424***
HARCJ-RV .812 -.117 -.082 -3.120 -.375 -.485** -3.932*** -.258 -.403**
LHARJ-RV .466 -.126 -.112 -3.771** -.37*** -.538*** -4.237*** -.246** -.426***
LHARCJ-RV .493 -.142 -.118 -3.509* -.372* -.512*** -4.002*** -.23* -.394***
AR(3)+GARCH(1,1) -1.591 .015 -.15 -4.151*** -.173 -.371** -2.560* -.188* -.221*
AR(3)+iGARCH(1,1) -1.635 .021 -.158 -4.171*** -.205 -.387*** -2.536* -.226*** -.229*
AR(3)+GJRGARCH(1,1) -1.924 .011 -.169 -3.76*** -.166 -.37*** -1.831 -.178* -.200*
AR(3)+eGARCH(1,1) -2.047 -.01 -.183 -4.305*** -.202 -.395*** -2.258 -.193* -.212
AR(3)+TGARCH(1,1) -2.022 .007 -.178 -4.252*** -.163 -.388*** -2.23 -.17* -.21*
1-Week Ahead ( =100)
HAR-RV -.293 -.132 -.103 -.768 -.184 -.159 -.475 -.052 -.056
HARJ-RV .729 -.027 -.005 -.102 -.071 -.068 -.832 -.045 -.063
HARCJ-RV .067 -.069 -.07 -.508 -.093 -.101 -.575 -.023 -.031
LHARJ-RV -.078 -.048 -.093 -.699 -.101 -.132 -.621 -.054 -.039
LHARCJ-RV -.336 -.068 -.115 -.807 -.113 -.135 -.47 -.045 -.02
AR(3)+GARCH(1,1) .615 .107 .062 -.272 -.097 -.071 -.887 -.204* -.133
AR(3)+iGARCH(1,1) 1.119 .208 .112 .327 -.005 -.006 -.792 -.213** -.118
AR(3)+GJRGARCH(1,1) -.05 -.008 -.033 .219 -.101 .03 .269 -.093 .063
AR(3)+eGARCH(1,1) .86 .02 .097 -.095 -.062 -.011 -.955 -.082 -.108
AR(3)+TGARCH(1,1) .008 -.042 -.036 -.657 -.238 -.109 -.665 -.196* -.073
2-Weeks Ahead ( =50)
HAR-RV .005 .038 -.012 -1.164 .02 -.131 -1.169 -.058 -.12
HARJ-RV -.203 -.053 -.069 -1.829 -.106 -.212 -1.63 -.053 -.143
HARCJ-RV -.185 -.071 -.055 -1.965 -.139 -.225 -1.781 -.053 -.17
LHARJ-RV -.652 -.028 -.113 -1.983 -.085 -.231 -1.331 -.058 -.118
LHARCJ-RV -.756 -.04 -.116 -2.150 -.100 -.244 -1.394 -.06 -.128
AR(3)+GARCH(1,1) .358 -.122 .006 -.3016 -.192 -.05 -.66 -.07 -.057
AR(3)+iGARCH(1,1) -.649 -.222 -.146 -1.733 -.353 -.263 -1.084 -.131 -.117
AR(3)+GJRGARCH(1,1) -.466 -.116 -.054 -.386 -.123 0 .08 -.007 .054
AR(3)+eGARCH(1,1) -.63 -.228 -.096 -.415 -.07 -.054 .215 .158 .043
AR(3)+TGARCH(1,1) .612 -.007 .039 -.335 -.122 -.062 -.947 -.115 -.101
AVERAGE DIFFERENCE -.216 -.05* -.072*** -1.773* -.175*** -.222*** -1.557* -.125*** -.15**
Note: Column (1) shows each single model used to forecast realized volatility, separated by 1-day, 1-week and 2-weeks horizons. Column (2) shows the difference in forecast error between 1-min and 5-min data, where negative values imply that 5-min data yields larger average forecast error. Column (3) shows the difference in forecast error between 1-min and 10-min data, where negative values imply that 10-min data yields larger average forecast error. Column (4) shows the difference in forecast error between 5-min and 10-min data, where negative values imply that 10-min data yields larger average forecast error. The final row “AVERAGE DIFFERENCE” shows the average difference between each frequency based on all models and horizons (n=30).
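The stars in Table 5 mark whether a difference in forecast error is statistically distinguishable from zero. The following is a minimal sketch of such a comparison, assuming a simple z-type statistic on per-period loss differences and a normal approximation; the test actually used in this paper may differ, and the function names and loss values are hypothetical. The star thresholds follow the legend of the table (1%, 2% and 5%).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def loss_difference(loss_a, loss_b):
    """Return (mean difference, z-statistic, two-sided p-value) for loss_a - loss_b.

    Assumes paired per-period losses and a normal approximation (an assumption,
    not necessarily the test used in the paper).
    """
    diffs = [a - b for a, b in zip(loss_a, loss_b)]
    d_bar = mean(diffs)
    z = d_bar / (stdev(diffs) / sqrt(len(diffs)))
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return d_bar, z, p


def stars(p):
    """Map a p-value to the star notation of the table legend."""
    if p < 0.01:
        return "***"
    if p < 0.02:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Hypothetical per-period losses for two sample frequencies.
loss_1min = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13]
loss_5min = [0.14, 0.15, 0.13, 0.16, 0.12, 0.17]

d_bar, z, p = loss_difference(loss_1min, loss_5min)
# A negative d_bar means the second series (here 5-min) has the larger average loss.
print(round(d_bar, 4), stars(p))
```

As in the table, a negative difference means that the second frequency yields the larger average forecast error.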
Table 6. Forecast Difference Between Individual and Averaged Models on 1-min Frequency
MRAE = Mean Relative Absolute Error 10
MRSE = Mean Relative Square Error 100
MRRSE = Mean Root Relative Square Error 10
MFC = Mean Forecast Combination
BMA = Bayesian Model-Averaging
DMA = Dynamic Model-Averaging
***1% significance, **2% significance and *5% significance
(1) (-)MFC (2) (-)BMA (3) (-)DMA (4)
MRAE MRSE MRRSE MRAE MRSE MRRSE MRAE MRSE MRRSE
1-Day Ahead ( =70)
HAR-RV -.126 .013 .012 .106 .017 .024 .013 .007 .013
HARJ-RV -.083 .015 .016 .148 .019 .028 .055 .008 .017
HARCJ-RV -.138 .013 .01 .094 .017 .021 .002 .007 .011
LHARJ-RV -.298 -.002 -.007 -.066 .002 .005 -.159 -.008 -.005
LHARCJ-RV -.306 -.003 -.008 -.074 .001 .004 -.166 -.009 -.006
AR(3)+GARCH(1,1) -.208 -.009 -.014 .024 -.005 -.002 -.069 -.015 -.013
AR(3)+iGARCH(1,1) -.203 -.008 -.013 .029 -.004 -.002 -.064 -.014 -.012
AR(3)+GJRGARCH(1,1) -.204* -.007 -.014 .028 -.003 -.002 -.064 -.013 -.012
AR(3)+eGARCH(1,1) -.2* -.007 -.013 .031 -.003 -.001 -.061 -.013 -.012
AR(3)+TGARCH(1,1) -.2 -.007 -.013 .032 -.003 -.002 -.061 -.013 -.012
1-Week Ahead ( =70)
HAR-RV -.565* -.162*** -.195*** .01 -.022* -.035*** .047 -.047* -.037*
HARJ-RV -.676** -.168*** -.206*** -.101 -.078*** -.045*** -.064 -.053** -.048***
HARCJ-RV -.719*** -.171*** -.209*** -.144 -.031*** -.049*** -.107 -.056** -.052***
LHARJ-RV -.741*** -.172*** -.209*** -.166 -.003*** -.049*** -.129 -.057** -.052***
LHARCJ-RV -.761*** -.173*** -.211*** -.186 -.033*** -.051*** -.15 -.058** -.054***
AR(3)+GARCH(1,1) -.434* -.057* -.085*** .142 .083** .075*** .179 .058*** .072***
AR(3)+iGARCH(1,1) -.366 -.047** -.08** .209 .093*** .08*** .245 .068*** .077***
AR(3)+GJRGARCH(1,1) -.527** -.072* -.101*** .048 .068* .06*** .085 .043*** .057***
AR(3)+eGARCH(1,1) -.504 -.078* -.102** .071 .062*** .058*** .108 .037*** .055***
AR(3)+TGARCH(1,1) -.546** -.08** -.104*** .029 .06*** .056*** .066 .035*** .053***
2-Weeks Ahead ( =25)
HAR-RV -.696 -.188*** -.263*** .157 -.051 -.034 .314 .008 -.004
HARJ-RV -.714* -.194*** -.264*** .139 -.057 -.035 .296 .002 -.005
HARCJ-RV -.804*** -.206*** -.274*** .049 -.07 -.045* .206 -.011 -.015
LHARJ-RV -.953*** -.202*** -.291*** -.100 -.066 -.061*** .057 -.007 -.032
LHARCJ-RV -1.04*** -.213*** -.3*** -.184 -.076 -.071*** -.027 -.017 -.041
AR(3)+GARCH(1,1) -.907* -.086 -.192** -.053 .050 .037 .103 .109 .067*
AR(3)+iGARCH(1,1) -.915* -.08 -.193** -.061 .057 .037 .095 .115 .066**
AR(3)+GJRGARCH(1,1) -.845* -.082 -.177** .009 .054 .053 .165 .113 .082*
AR(3)+eGARCH(1,1) -.825* -.071 -.173** .029 .066 .056 .185 .124 .085**
AR(3)+TGARCH(1,1) -.891* -.078 -.187** -.038 .058 .042 .119 .117 .071*
AVERAGE DIFFERENCE -.546** -.0861* -.129* .007 .007 .005 .041 .015 .01
Note: Column (1) shows each single model used to forecast realized volatility separated by 1-day, 1-week and 2-weeks horizons. Column (2) shows the difference in forecast error between single models and MFC, where negative values imply that MFC yields larger average forecast error. Column (3) shows the difference in forecast error between single models and BMA, where negative values imply that BMA yields larger average forecast error. Column (4) shows the difference in forecast error between single models and DMA, where negative values imply that DMA yields larger average forecast error. The last row “AVERAGE DIFFERENCE” shows average difference between single models and combined forecasts based on all models and horizons ( ). Also note that results are based on combined models consisting of all 10 single models on each horizon.
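The comparison in Table 6 rests on combining the individual forecasts into one. As a minimal sketch, assuming each model produces one forecast per period, a mean forecast combination is the equal-weight average of the model forecasts. The second function is only an illustrative stand-in for performance-based weighting (weights inversely proportional to each model's past mean squared error); it is not the BMA or DMA scheme used in this paper, and all names and numbers are hypothetical.

```python
def mean_forecast_combination(forecasts):
    """Equal-weight average of the model forecasts for one period."""
    return sum(forecasts) / len(forecasts)


def inverse_mse_weights(past_errors):
    """Weights proportional to 1 / MSE of each model's past forecast errors.

    Illustrative stand-in for performance-based weighting; not BMA or DMA.
    """
    inv_mse = []
    for errors in past_errors:
        mse = sum(e * e for e in errors) / len(errors)
        inv_mse.append(1.0 / mse)
    total = sum(inv_mse)
    return [w / total for w in inv_mse]


def weighted_combination(forecasts, weights):
    """Weighted sum of the model forecasts; weights are assumed to sum to one."""
    return sum(w * f for w, f in zip(weights, forecasts))


# Hypothetical forecasts of realized volatility from three models.
forecasts = [0.20, 0.24, 0.31]
# Hypothetical past forecast errors per model (the third model was least accurate).
past_errors = [[0.01, -0.02], [0.02, 0.02], [0.10, -0.08]]

mfc = mean_forecast_combination(forecasts)
weights = inverse_mse_weights(past_errors)
weighted = weighted_combination(forecasts, weights)
```

In this toy example the equal-weight average gives the least accurate model the same influence as the others, whereas the performance-based weights shrink its contribution.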
Table 7. Forecast Difference of Different Model-Averaging Restrictions on 1-min Frequency
MRAE = Mean Relative Absolute Error 100
MRSE = Mean Relative Square Error 1000
MRRSE = Mean Root Relative Square Error 100
BMA = Bayesian Model-Averaging
DMA = Dynamic Model-Averaging
***1% significance, **2% significance and *5% significance
(1) 1-Day (2) ( =70) 1-Week (3) ( =70) 2-Weeks (4) ( =25)
MRAE MRSE MRRSE MRAE MRSE MRRSE MRAE MRSE MRRSE
BAYESIAN
BMA 10 – BMA 8 -.385 -.035 -.091 .349 .092 .144* -.467 .405 .24
BMA 10 – BMA 6 -.612 -.027 -.113 .201 .021*** .327*** -.61 .549 .388**
BMA 10 – BMA 4 -.814 -.06 -.167 .366 .297*** .459*** -.494 .651 .514**
BMA 10 – BMA 2 -.44 -.052 -.076 -.107 .285*** .377*** -.249 .694 .661**
BMA 8 – BMA 6 -.227 .008 -.022 -.148 .122*** .183*** -.141 .143 .147
BMA 8 – BMA 4 -.429 -.025 -.076 .017 .205*** .315*** -.025 .246 .274*
BMA 8 – BMA 2 -.055 -.018 .015 -.456 .193*** .233*** .22 .289 .421***
BMA 6 – BMA 4 -.203 -.033 -.054 .165 .825*** .133*** .116 .102 .126
BMA 6 – BMA 2 .172 -.026 .037 -.307 .071* .05 .361 .145 .274***
BMA 4 – BMA 2 .375 .0076 .091 -.473 -.012 -.082* .245 .043 .147***
BEST PERFORMER BMA 10 BMA 10 BMA 10 BMA 4 BMA 4 BMA 4 BMA 10 BMA 2 BMA 2
DYNAMIC
DMA 10 – DMA 8 -.253 -.004 -.007 -.121 .0104 .009 .167 -.003 -.011
DMA 10 – DMA 6 -.51 -.013 -.019 -.32 .023 .013 .049 -.005 -.012
DMA 10 – DMA 4 -.795 -.024 -.041 -.189 .031 .036 .017 -.018 -.042
DMA 10 – DMA 2 -.74 -.031 -.03 -.431 .038 .052 -.008 -.044 -.087
DMA 8 – DMA 6 -.258 -.009 -.012 -.199 .013 .004 -.118 -.002 -.002
DMA 8 – DMA 4 -.542 -.02 -.034 -.068 .021 .027 -.15 -.014 -.031
DMA 8 – DMA 2 -.487 -.003* -.022 -.31 .028 .043 -.175 -.041* -.076
DMA 6 – DMA 4 -.284 -.011 -.023 .132 .008 .023 -.032 -.013 -.03
DMA 6 – DMA 2 -.23 -.019 -.011 -.11 .015 .039 -.057 -.04*** -.074
DMA 4 – DMA 2 .055 -.008 .012 -.242 .007 .016 -.023 -.027 -.045*
BEST PERFORMER DMA 10 DMA 10 DMA 10 DMA 10 DMA 2 DMA 2 DMA 8 DMA 10 DMA 10
Note: Column (1) shows different restrictions between BMA and DMA models, where the number after each name indicates the number of models included in the model-averaging process. The first ten rows show different restrictions for BMA models and the last ten rows show different restrictions for DMA models. Columns (2), (3) and (4) show the difference in forecast error for the 1-day, 1-week and 2-weeks horizons respectively, separated by loss function: MRAE, MRSE and MRRSE. The “BEST PERFORMER” rows show the models that on average yield the smallest forecast error for each loss function and horizon.
Table 8. Forecast Difference Between Different BMA and DMA Specifications on 1-min Frequency
MRAE = Mean Relative Absolute Error 10
MRSE = Mean Relative Square Error 1000
MRRSE = Mean Root Relative Square Error 100
BMA = Bayesian Model-Averaging
DMA = Dynamic Model-Averaging
***1% significance, **2% significance and *5% significance
(1) 1-Day (2) ( =70) 1-Week (3) ( =70) 2-Weeks (4) ( =25)
MRAE MRSE MRRSE MRAE MRSE MRRSE MRAE MRSE MRRSE
BAYESIAN – DYNAMIC
BMA 10 – DMA 10 -.093* -.1*** -.103* .037 -.248 -.029 .157* .588 .294***
BMA 10 – DMA 8 -.118*** -.11*** -.11** .025 -.237 -.02 .173 .584 .284
BMA 10 – DMA 6 -.144** -.11*** -.122* .005 -.225 -.016 .162 .582 .282
BMA 10 – DMA 4 -.172** -.12*** -.144* .018 -.216 .007 .158 .57 .252
BMA 10 – DMA 2 -.167** -.13*** -.133 -.006 -.209 .023 .156 .543 .208
BMA 8 – DMA 10 -.054 -.066 -.012 .002 -.34 -.173* .204** .182 .054
BMA 8 – DMA 8 -.079 -.07 -.02 -.01 -.329 -.164 .220*** .179 .043
BMA 8 – DMA 6 -.105 -.079 -.031 -.03 -.316 -.16 .209** .177 .041
BMA 8 – DMA 4 -.134 -.09 -.054 -.017 -.308 -.137 .205 .165 .012
BMA 8 – DMA 2 -.128 -.098 -.042 -.041 -.301 -.121 .203 .138 -.033
BMA 6 – DMA 10 -.031 -.075 .01 .017 -.462* -.356*** .218 .039 -.093
BMA 6 – DMA 8 -.057 -.078 .003 .005 -.452* -.346*** .234 .036 -.104
BMA 6 – DMA 6 -.082 -.087 -.009 -.015 -.439* -.342*** .223 .034 -.106
BMA 6 – DMA 4 -.111 -.098 -.031 -.002 -.431* -.319*** .22 .021 -.135
BMA 6 – DMA 2 -.105 -.106 -.019 -.026 -.424* -.304* .217 -.052 -.18
BMA 4 – DMA 10 -.011 -.041 .064 .0001 -.544* -.488*** .206 .063 -.22
BMA 4 – DMA 8 -.036 -.045 .057 -.012 -.534** -.479*** .223 -.067 -.231
BMA 4 – DMA 6 -.062 -.054 .045 -.0319 -.521* -.475*** .211 -.068 -.232
BMA 4 – DMA 4 -.091 -.065 .023 -.0188 -.513* -.452*** .208 -.081 -.262
BMA 4 – DMA 2 -.085 -.073 .034 -.043 -.506* -.436*** .205 -.107 -.307
BMA 2 – DMA 10 -.049 -.049 -.027 .047 -.533** -.406*** .182 -.106 -.367
BMA 2 – DMA 8 -.074 -.053 -.035 .035 -.522* -.397*** .198 -.11 -.378
BMA 2 – DMA 6 -.1 -.061 -.046 .015 -.51* -.392*** .187 -.112 -.379
BMA 2 – DMA 4 -.128 -.073 -.069 .029 -.501* -.37*** .183 -.124 -.409
BMA 2 – DMA 2 -.123 -.08 -.057 .004 -.494* -.354*** .181 -.151 -.454
AVERAGE DIFFERENCE -.094*** -.08*** -.033 -.001 -.4*** -.268** .198*** .114 -.097
Note: The table shows the difference in realized volatility forecasts between BMA and DMA when different models are included, where column (1) shows which restrictions of BMA and DMA are being tested. Columns (2), (3) and (4) show the difference in forecast error for the 1-day, 1-week and 2-weeks horizons respectively, separated by loss function: MRAE, MRSE and MRRSE. The bolded results reflect the difference between the top performing restricted models identified in table 7. The final row “AVERAGE DIFFERENCE” shows the average difference between BMA and DMA based on all different restrictions on each horizon ( ). A negative value implies that the specified DMA model yields a larger forecast error than the specified BMA model.
7. Conclusions
Previous research has found strong evidence that choosing the sample-
frequency and model-averaging approach that minimize forecast error of
realized volatility adds economic value in terms of portfolio selection. For these
reasons, the purpose of this paper has been twofold. First, it aimed to fill the gap
regarding the optimal sample-frequency on higher frequencies when
forecasting realized volatility. Second, it aimed to give new insight regarding the
optimal approach for model-averaging when forecasting realized volatility.
For this paper’s first purpose, the results support previous research in the sense
that selecting sample-frequency is important for forecast performance. This
study finds that selecting the 1-min frequency rather than 5- or 10-min yields
significantly better forecasts when measured in terms of MRSE and MRRSE.
However, when subsampling the data to only consist of the period after the
financial crisis, the difference between 1-min and 5-min is insignificant. This
implies that the impact of market microstructure noise might be different in
periods with high and low levels of volatility. However, the main findings of this
paper suggest that the aforementioned trade-off caused by market microstructure
noise is not as important as previously thought, at least not within the range of
frequencies investigated in this paper and for OMXS30. The results bring clarity
to the previous findings of Bandi & Russell (2006), who suggested that the
optimal frequency is somewhere between 0.4-min and 13.8-min. However, the
result of this paper does not consider the possibility that the optimal frequency is
outside the 1-min to 10-min range. As mentioned previously, the results of Potter
et al. (2008) indicate that the optimal frequency ranges between 30-min and
65-min. However, since most papers have chosen to stay within the 1-min to
10-min interval, this study supports choosing 1-min rather than 5-min or
10-min. Furthermore, the external validity of the results is hard to establish since
market microstructure noise might vary between markets. Thus, this paper’s
findings regarding optimal sample-frequency only hold for stock
exchanges that are similar to the Swedish stock exchange. Further research
should investigate how the impact of market microstructure noise depends on
the market. In other words, it should identify the characteristics of a market in
which the 1-min frequency performs better than the 5-min frequency, and so on.
The second purpose of this paper has been to give new insight regarding
model-averaging. This paper finds significant evidence indicating that Mean
forecast combination (MFC) neither deals with model uncertainty nor generates
more accurate forecasts than single models. This finding speaks against Smith & Wallis
(2009), who argued that MFC is as good as model-averaging approaches based
on previous performance. Even though the difference between
performance-based model-averaging approaches and MFC is never statistically
tested, the fact that MFC is significantly outperformed by single models while
performance-based model-averaging approaches are not indicates that MFC is less
accurate in the forecasting process. The average difference in table 6 suggests that
model-averaging based on historical forecast performance deals with model
uncertainty in the sense that it on average generates smaller forecast error than
single models. However, these results are not significant. This paper also finds
that using all models does not always produce the best model-averaging selection.
However, most of these results are not significant, implying that restricting
models likely has a small impact on forecast performance. Finally, and most
important for this paper’s second purpose, it is found that Bayesian
model-averaging (BMA) significantly outperforms Dynamic model-averaging (DMA)
on the 1-day and 1-week horizons, but on the 2-weeks horizon DMA is slightly
better in terms of MRSE and significantly better in terms of MRAE. These mixed
findings are not in line with Liu et al. (2017), who found that DMA significantly
outperforms BMA on all horizons. However, these findings open the door to the
possibility that the choice of weighting approach depends on the forecast horizon.
A possible explanation for why the horizon matters is that the models’
forecast performance for realized volatility is more time-varying on the 2-weeks
horizon than on the 1-day and 1-week horizons, so that forecasts based on DMA
are more beneficial on longer horizons. However, it could also be the case that
BMA merely performs as poorly as DMA on the longer horizon and that DMA does
not actually improve on the 2-weeks horizon. Further research in this area
should consider the option to optimize the model-averaging process further. It is
possible that in some periods of the time series only a few models are needed
while in other periods more models are needed. One suggestion would be to
consider a new kind of model-averaging approach, in which the number of
models included in the averaging process depends on a relative threshold value:
if a model exceeds this value, it is excluded.
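The suggested approach could be sketched as follows. This is only one possible reading of the proposal: it assumes that each model is scored by its recent mean loss, that the threshold is expressed relative to the best model's loss, and that the surviving models are averaged with equal weights. The function name, the `rel_threshold` parameter and the numbers are hypothetical.

```python
def threshold_average(forecasts, recent_losses, rel_threshold=1.5):
    """Average the forecasts of models whose recent loss is at most
    rel_threshold times the smallest recent loss.

    forecasts and recent_losses are dicts keyed by model name. With positive
    losses and rel_threshold >= 1 the best model is always included.
    Returns (combined forecast, sorted list of included model names).
    """
    best = min(recent_losses.values())
    included = sorted(
        name for name, loss in recent_losses.items()
        if loss <= rel_threshold * best
    )
    combined = sum(forecasts[name] for name in included) / len(included)
    return combined, included


# Hypothetical values for a single forecast origin.
forecasts = {"HAR-RV": 0.20, "LHARCJ-RV": 0.18, "AR(3)+GARCH(1,1)": 0.30}
recent_losses = {"HAR-RV": 0.060, "LHARCJ-RV": 0.050, "AR(3)+GARCH(1,1)": 0.150}

combined, included = threshold_average(forecasts, recent_losses)
```

Because the threshold is relative to the best model, the number of included models can differ from period to period, which is the property the suggestion is after.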
Furthermore, it should be highlighted that for the 2-weeks horizon, when
model-averaging, only 25 forecast observations have been obtained and
examined. This is somewhat fewer than the 30 observations that, according to a
common rule of thumb, are needed for a sample to be considered large enough to
assume an approximately normal distribution. This problem also holds for those
results presented in the final row of table 8, which tests the average difference
in forecast error between BMA and DMA across all restrictions, since they are
also based on 25 observations. These issues are potential limitations of this
paper’s results and should be kept in mind when interpreting them.
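To make the small-sample concern concrete, consider a test of a mean loss difference. The test statistic is not restated in this section, so the following is only a sketch under the assumption that a standard t-type statistic on the loss differences $d_i$ is used; the symbols are introduced here for illustration.

```latex
t = \frac{\bar{d}}{s_d / \sqrt{n}}, \qquad
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad
s_d^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(d_i - \bar{d}\right)^2 .
% With n = 25 and a two-sided 5% level:
%   Student-t critical value, 24 degrees of freedom: t_{0.975,\,24} \approx 2.064
%   Normal critical value:                           z_{0.975} \approx 1.960
```

With 25 observations the t critical value is roughly 5% larger than the normal one, so a borderline result that appears significant under a normal approximation may not be significant under the t distribution.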
Bibliography
Andersen, T. G., Bollerslev, T., & Diebold, F. X. (2007). Roughing It Up: Including Jump Components in the Measurement, Modeling, and Forecasting of Return Volatility. The Review of Economics and Statistics, 701-720.
Andersen, T. G., Bollerslev, T., & Meddahi, N. (2011). Realized Volatility Forecasting and Market Microstructure. Journal of Econometrics 160, 220-234.
Andersen, T. G., Bollerslev, T., Diebold, F. X., & Ebens, H. (2001, b). The Distribution of Realized Stock Return Volatility. Journal of Financial Economics 61, 43-76.
Andersen, T. G., Bollerslev, T., Diebold, F. X., & Labys, P. (2001, a). The Distribution of Realized Exchange Rate Volatility. Journal of The American Statistical Association, 41-55.
Andersen, T. G., Bollerslev, T., Diebold, F. X., & Labys, P. (2003). Modeling and Forecasting Realized Volatility. Econometrica, 579-625.
Andersen, T., & Bollerslev, T. (1998). Answering the Sceptics: Yes, Standard Volatility Models do Provide Accurate Forecasts. International Economic Review 39, 885-905.
Awartani, B., Corradi, V., & Distaso, W. (2009). Assessing Market Microstructure Effects via Realized Volatility Measures with an Application to the Dow Jones Industrial Averages Stocks. Journal of Business & Economic Statistics, Vol. 27, 251-265.
Bandi, F. M., & Russell, J. R. (2006). Separating Microstructure Noise from Volatility. Journal of Financial Economics 79, 655-692.
Bandi, F., & Russell, J. (2008). Microstructure Noise, Realized Variance, and Optimal Sampling. Review of Economic Studies 75, 339-369.
Barndorff-Nielsen, O. E., & Shephard, N. (2002, a). Econometric Analysis of Realized Volatility and Its Use in Estimating Stochastic Volatility Models. Journal of the Royal Statistical Society 64, 253-280.
Barndorff-Nielsen, O. E., & Shephard, N. (2004a). Power and Bipower Variation with Stochastic Volatility and Jumps. Journal of Financial Econometrics, 1-37.
Barndorff-Nielsen, O. E., & Shephard, N. (2005). How Accurate is the Asymptotic Approximation to the distribution of Realized Variance? Cambridge: Cambridge Press.
Bates, J. M., & Granger, C. W. (1969). The Combination of Forecasts. Operational Research Quarterly 20, 451-468.
Black, F. (1986). "Noise". The Journal of Finance, 529-543.
Bollerslev, T. (1986). Generalized Autoregressive Conditional Heteroskedasticity. Journal of Econometrics 31, 307-327.
Caporin, M., & Velo, G. G. (2015). Realized Range Volatility Forecasting: Dynamic Features and Predictive Variables. International Review of Economics and Finance, 98-112.
Corsi, F. (2009). A Simple Approximate Long-Memory Model of Realized Volatility. Journal of Financial Econometrics 159, 276-288.
Corsi, F., Mittnik, S., Pigorsch, C., & Pigorsch, U. (2008). The Volatility of Realized Volatility. Econometric Reviews 27, 46-78.
Corsi, F., & Renò, R. (2012). Discrete-Time Volatility Forecasting With Persistent Leverage Effect and the Link With Continuous-Time Volatility Modeling. Journal of Business and Economic Statistics 30:3, 368-380.
Degiannakis, S. (2008). ARFIMAX and ARFIMAX-TGARCH realized volatility modeling. Journal of Applied Statistics, 1169-1180.
Elton, E. J., Gruber, M. J., Brown, S. J., & Goetzmann, W. N. (2013). Modern Portfolio Theory and Investment Analysis. Wiley.
Engle, R. F. (1982). Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation. Econometrica, 987-1007.
Engle, R. F., & Bollerslev, T. (1986). Modeling the Persistence of Conditional Variance. Econometric Reviews, 1-50.
Fleming, J., Kirby, C., & Ostdiek, B. (2001). The Economic Value of Volatility Timing. The Journal of Finance, 329-352.
Fleming, J., Kirby, C., & Ostdiek, B. (2003). The Economic Value of Volatility Timing Using "Realized" Volatility. Journal of Financial Economics 67, 473-509.
Glosten, L. R., Jagannathan, R., & Runkle, D. E. (1993). On the Relation Between Expected Value and the Volatility of the Nominal Excess Return on Stocks. The Journal of Finance, 1779-1801.
Hansen, P. R., & Lunde, A. (2006). Realized Variance and Market Microstructure Noise. Journal of Business & Economic Statistics Vol. 24, 127-161.
Hibon, M., & Evgeniou, T. (2004). To Combine or not Combine: Selecting Among Forecasts and Their Combinations. International Journal of Forecasting 21, 15-24.
Hull, J. C. (2012). Options, Futures and other Derivatives. Pearson Education.
Liu, C., & Maheu, J. M. (2009). Forecasting Realized Volatility: A Bayesian Model-Averaging Approach. Journal of Applied Econometrics, 709-733.
Liu, H.-C., Chiang, S.-M., & Cheng, N. Y.-P. (2012). Forecasting the Volatility of S&P depositary receipts using GARCH-type models under intraday range-based and return-based proxy measures. International Review of Economics and Finance 22, 78-91.
Liu, J., Wei, Y., Ma, F., & Wahab, M. (2017). Forecasting the Realized Range-Based Volatility Using Dynamic Model Averaging Approach. Economic Modelling 6, 12-26.
McAleer, M., & Medeiros, M. (2008). Realized Volatility: a review. Econometric Review, 10-45.
McMillan, D. G., & Speight, A. E. (2004). Daily Volatility Forecasts: Reassessing the Performance of GARCH Models. Journal of Forecasting, 449-460.
Merton, R. C. (1980). On Estimating the Expected Return on the Market: an Exploratory Investigation. Journal of Financial Economics 8, 323-361.
Nelson, D. B. (1991). Conditional Heteroskedasticity in Asset Returns: A New Approach. Econometrica 59, 347-370.
Potter, M. D., Martens, M., & Dijk, D. v. (2008). Predicting the Daily Covariance Matrix for S&P 100 Stocks Using Intraday Data - But Which Frequency to Use? Econometric Reviews 27, 199-229.
Rapach, D. E., Strauss, J. K., & Zhou, G. (2010). Out-of-Sample Equity Premium Prediction: Combination Forecasts and Links to the Real Economy. The Review of Financial Studies vol. 23, 821-862.
Shin, D. W., & Hwang, E. (2015). A Lagrangian Multiplier Test for Market Microstructure Noise with Applications to Sampling Interval Determination for Realized Volatility. Economics Letters 129, 95-99.
Smith, J., & Wallis, K. (2009). A Simple Explanation of the Forecast Combination Puzzle. Oxford Bulletin of Economics and Statistics 71, 331-357.
Wang, C., & Nishiyama, Y. (2015). Volatility forecast of stock indices by model averaging using high-frequency data. International Review of Economics and Finance 40, 324-337.
Wang, Y., Ma, F., Wei, Y., & Wu, C. (2016). Forecasting Realized Volatility in a Changing World: A dynamic Model Averaging Approach. Journal of Banking & Finance 64, 136-149.
Zakoian, J.-M. (1994). Threshold Heteroskedastic Models. Journal of Economic Dynamics and Control 18, 931-955.
Zhou, B. (1996). High-Frequency Data and Volatility in Foreign-Exchange Rates. Journal of Business & Economic Statistics, 45-52.
Appendix
Table 9. Robust Test Average Difference in Forecast Error on Different Frequencies
MRAE = Mean Relative Absolute Error *5% Significance
MRSE = Mean Relative Square Error **2% Significance
MRRSE = Mean Root Relative Square Error ***1% Significance
All-Share index (1) OMXS30 Post Financial Crisis (2)
MRAE MRSE MRRSE MRAE MRSE MRRSE
(1-min) – (5-min) -4.66*** -.66*** -.6*** -.4048 .0134 -.0538
(1-min) – (10-min) -5.671*** -.77*** -.711*** -2.627*** -.1049* -.2448***
(5-min) – (10-min) -1.0142 -.102* -.114* -2.222*** -.118*** -.191***
Note: The results correspond to the results shown in the last row of table 5, in which the average difference in forecast performance of each single model based on different sample-frequencies is tested. The left column shows the results for an “All-Share index” on the Nasdaq Stockholm stock exchange between 2 January 2007 and 28 December 2012. The second column shows the results for OMXS30 after the financial crisis, with the sample starting on 1 June 2009. For this period on OMXS30 the 2-weeks horizon has been excluded due to few observations.
Table 10. Robust Test Average Difference Between Combined and Single Models
MRAE = Mean Relative Absolute Error *5% Significance
MRSE = Mean Relative Square Error **2% Significance
MRRSE = Mean Root Relative Square Error ***1% Significance
All-Share index (1) OMXS30 Post financial Crisis(2)
MRAE MRSE MRRSE MRAE MRSE MRRSE
MFC -.537** -.0637* -.1138* -.3675*** -.03 -.0639
BMA .0366 .0226 .0087 .0975 .0196 .0212
DMA .0615 .0167 .0119 -.0336 .0051 .0013
Note: The results correspond to the results shown in the last row of table 6, in which the average difference in forecast performance between each single model and combined model is shown. The combined models tested against single models are Mean Forecast Combination (MFC), Bayesian Model-Averaging (BMA) and Dynamic Model-Averaging (DMA). A negative value indicates that the tested model-averaging approach is on average outperformed by single models and a positive value indicates the opposite. The left column shows the results for an “All-Share index” on the Nasdaq Stockholm stock exchange between 2 January 2007 and 28 December 2012. The second column shows the results for OMXS30 after the financial crisis, with the sample starting on 1 June 2009. For this period on OMXS30 the 2-weeks horizon has been excluded due to few observations.
Table 11. Robust Test Average Difference Between Bayesian and Dynamic Model-Averaging
MRAE = Mean Relative Absolute Error *5% Significance
MRSE = Mean Relative Square Error **2% Significance
MRRSE = Mean Root Relative Square Error ***1% Significance
All-Share index (1) OMXS30 Post financial Crisis (2)
MRAE MRSE MRRSE MRAE MRSE MRRSE
1-day ahead -.032 -.08*** -.054*** -.1736*** -.146*** -.196***
1-week ahead -.0551*** -.308*** -.2662*** -.0313 -.116*** -.271***
2-weeks ahead .1789*** .0607 .0395 n/a n/a n/a
Note: The results correspond to the results shown in the last row of table 8, in which the average difference in forecast performance between Bayesian and Dynamic model-averaging is shown on the 1-day, 1-week and 2-weeks horizons. A negative value indicates that Bayesian outperforms Dynamic model-averaging and a positive value indicates the opposite. The left column shows the results for an “All-Share index” on the Nasdaq Stockholm stock exchange between 2 January 2007 and 28 December 2012. The second column shows the results for OMXS30 after the financial crisis, with the sample starting on 1 June 2009. For this period on OMXS30 the 2-weeks horizon has been excluded due to few observations.