On Optimal Sample-Frequency and Model-Averaging Selection when

Predicting Realized Volatility

Joakim Gartmark*

Abstract

Predicting the volatility of financial assets based on realized volatility has grown popular in the literature due to its strong predictive power. Theoretically, realized volatility has the advantage of being free from measurement error, since it accounts for intraday variation that occurs at high frequencies in financial assets. In practice, however, as the sample-frequency increases, more market microstructure noise may be absorbed, leading to inaccurate predictions. Furthermore, predictions of realized volatility based on a single model suffer from model uncertainty, which may lead to understatement of the risk in the forecasting process and, as a result, to poor predictions. Motivated by these issues, this paper investigates which sample-frequency (1, 5 or 10 minutes) minimizes forecast error, and which model-averaging process (Mean forecast combination, Bayesian model-averaging or Dynamic model-averaging) should be used to deal with model uncertainty. The results suggest that a 1-min sample-frequency minimizes forecast errors and that Bayesian model-averaging performs better than Dynamic model-averaging on the 1-day and 1-week horizons, while Dynamic model-averaging performs slightly better on the 2-week horizon.

Keywords

Realized Volatility, Market Microstructure Noise, Sample-Frequency, Model Uncertainty,

Bayesian Model-Averaging, Dynamic Model-Averaging and Forecasting

Department of Economics

Master Thesis, 30 credits

Economics

Master of Science, Economics 120 credits

Spring Term 2017

Supervisor: Annika Alexius

*I would like to send my sincerest gratitude to Björn Hagströmer at Stockholm Business School, who helped

with the data used in this study

Table of Contents

1. Introduction
2. Theoretical Background
2.1 Portfolio Optimization
2.2 The Process of the Stock Price
2.3 Measuring the Volatility of the Stock Price
2.4 Market Microstructure Noise
2.5 Model Uncertainty
3. Previous Research
3.1 Realized Volatility
3.2 Sample-Frequency and Market Microstructure Noise
3.3 Model-Averaging
4. Econometric Methodology
4.1 The Forecasting Process
4.2 Realized Volatility
4.3 HAR-Models
4.4 GARCH-Models
4.5 Loss Functions
4.6 Model-Averaging
5. Data
5.1 Daily Realized Volatility
5.2 Weekly Realized Volatility
5.3 Two Weeks Realized Volatility
6. Results
6.1 Optimal Frequency
6.2 Model-Averaging
7. Conclusions
Bibliography
Appendix

1. Introduction

The failure to assess and anticipate financial risk in the credit markets was one of the major reasons for the financial crisis of 2008. In order to avoid a crisis of similar magnitude, it is vital to ensure that financial volatility is modeled and predicted efficiently. Economists are interested in predicting financial volatility for several reasons. First, expected future volatility determines how an investor should balance a portfolio of risky and risk-free assets in order to minimize risk subject to expected return. Second, it helps policy makers anticipate potential financial crises and counteract them by adjusting their policies accordingly. Third, it is an important determinant in asset pricing, in the sense that larger risk should be compensated by larger return. Fourth, it has a large influence on the pricing of financial derivatives and helps investors hedge market risks.

Historically, financial volatility has been measured and forecasted either through daily standard deviation based on daily returns or through parametric models such as GARCH and stochastic volatility models. Even though daily standard deviation has the advantage of being observed, it still absorbs a lot of noise since the measure consists of only one observation per trading day. Furthermore, GARCH and stochastic volatility models require assumptions regarding the distribution of volatility, while the volatility itself is never actually observed. Thus, Andersen et al. (2001b) proposed a new way to measure risk, referred to as realized volatility, in which high-frequency data consisting of, for example, 1-min, 5-min or 10-min prices of the financial asset are used. After the intraday prices are transformed into intraday returns, the realized volatility of each day is calculated from the sum of all squared intraday returns in that trading day. Compared to daily standard deviation and parametric models, realized volatility has the advantage of providing a model-free measurement that captures more of the volatility that occurs during one trading day. Furthermore, empirical evidence shows that realized volatility significantly improves forecast and portfolio performance (Andersen et al., 2003; Fleming et al., 2003). However, modeling realized volatility based on intraday returns has a potential drawback: it might absorb more market microstructure noise and as a result bias the estimator. Market microstructure noise is all the variation in the stock price, when observed at high frequencies, that is not related to the true volatility. Examples of such variation are bid-ask bounce effects, rounding errors due to price discreteness and recording errors. In theory, the higher the frequency used to model volatility, the more market

microstructure noise might also be absorbed. At the same time, as more observations are included, the efficiency of the estimator improves. Previous research has concluded that selecting the frequency that minimizes forecast error has economic value in terms of portfolio performance. However, a single optimal frequency that consistently minimizes forecast errors has not yet been established (Bandi & Russell, 2006; Potter et al., 2008). A second issue that has been investigated in the context of predicting financial volatility is how to deal with model uncertainty. Model uncertainty means that a single model might yield poor predictions during certain periods even though it performs well on average. To deal with this issue, previous research has combined several models to forecast future volatility, hereafter referred to as model-averaging. Generally, there are three different approaches to model-averaging realized volatility: Mean forecast combination, Bayesian model-averaging and Dynamic model-averaging. Mean forecast combination weighs all models equally regardless of each model's forecast performance, and has been advocated for its simplicity and because it performs at least as well as approaches based on forecast performance (Smith & Wallis, 2009). Bayesian model-averaging selects the weight of each model based on its average past forecast performance, while Dynamic model-averaging selects the weight of each model based only on its last observed forecast performance. Thus, if the models' forecast performance for realized volatility is time-varying, Dynamic model-averaging should outperform Bayesian model-averaging. Previous research has concluded that, regardless of approach, forecasting realized volatility based on model-averaging adds economic value in terms of portfolio and forecast performance (Wang, Ma, Wei, & Wu, 2016). However, despite previous efforts, research is still divided regarding which model-averaging approach yields the best forecasts of realized volatility (Wang & Nishiyama, 2015; Liu et al., 2017).
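The difference between the three weighting schemes can be sketched in a few lines of Python. This is a simplified illustration and not the estimators used later in the paper: it assumes, purely for the example, that forecast performance is measured by squared forecast error and that weights are inversely proportional to that loss. All function names and numbers are hypothetical.

```python
def mean_weights(n_models):
    """Equal weights: every model counts the same, whatever its track record."""
    return [1.0 / n_models] * n_models


def inverse_loss_weights(losses):
    """Weights proportional to 1/loss, normalised to sum to one.

    Assumes every loss is strictly positive.
    """
    inverse = [1.0 / loss for loss in losses]
    total = sum(inverse)
    return [value / total for value in inverse]


def average_performance_weights(sq_errors):
    """Weights from each model's average past squared error (assumed scheme)."""
    averages = [sum(history) / len(history) for history in sq_errors]
    return inverse_loss_weights(averages)


def last_performance_weights(sq_errors):
    """Weights from each model's most recent squared error only (assumed scheme)."""
    latest = [history[-1] for history in sq_errors]
    return inverse_loss_weights(latest)


def combine(forecasts, weights):
    """Weighted average of the individual model forecasts."""
    return sum(f * w for f, w in zip(forecasts, weights))


# Toy history: model 0 was better on average, model 1 was better most recently.
history = [[0.1, 0.1, 0.4], [0.3, 0.3, 0.1]]
forecasts = [0.20, 0.30]

for name, weights in [
    ("mean", mean_weights(len(forecasts))),
    ("average past performance", average_performance_weights(history)),
    ("last observed performance", last_performance_weights(history)),
]:
    print(name, weights, combine(forecasts, weights))
```

In this toy history the average-based weights favour model 0 while the last-observation weights favour model 1, which is the situation in which time-varying performance would make the two approaches disagree.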

Thus, based on the gaps in the literature mentioned above, the purpose of this paper is to identify which sample-frequency (1) and which model-averaging approach (2) minimize the forecast error of realized volatility.

(1) denotes this paper's first purpose, in which 1-, 5- and 10-min frequencies are examined. (2) denotes this paper's second purpose, in which Mean forecast combination, Bayesian and Dynamic model-averaging are examined. To the best of the author's knowledge, in the context of forecasting realized volatility, previous research has not yet evaluated optimal frequency based on the forecast performance of several models, nor has it investigated differences in forecast

performance when the model-averaging process is restricted to include only the historically best performing models. Thus, the findings of this paper are the first of their kind and provide new insight regarding optimal sample-frequency and differences in forecast performance between Mean forecast combination, Bayesian and Dynamic model-averaging when forecasting realized volatility. The data is based on OMX30, an index representing the most traded stocks on the Nasdaq Stockholm stock exchange. The Swedish stock exchange has been selected for its maturity and well-functioning capital markets. Since previous research has mostly been conducted on U.S. stock exchanges, this study not only contributes by filling gaps in the existing literature, but also complements previous findings by providing results based on another dataset. The data covers the period between 2007 and 2012, a period characterized by high levels of volatility in the financial markets.

This paper uses some of the recently proposed ways to model and forecast realized volatility in order to keep the results as up to date as possible. One of the models that has arisen in the spirit of the realized measures is the heterogeneous autoregressive (HAR) model, originally suggested by Corsi (2009). Due to its simplicity and strong forecast performance, the HAR model has been frequently used to predict future volatility. The HAR model follows a simple autoregressive structure, in which the last day's, week's and month's volatility are used to predict future volatility. Based on this model, new models belonging to this family have been developed, in which jumps and leverage effects are included. Other authors have focused on the impact of volatility on realized volatility by applying different models from the generalized autoregressive conditional heteroskedasticity (GARCH) family, and found strong evidence that this improves forecast performance (Barndorff-Nielsen & Shephard, 2005; Corsi et al., 2008). Thus, based on 1-min, 5-min and 10-min data, single and combined models of the HAR and GARCH families have been applied to forecast realized volatility 1 day, 1 week and 2 weeks ahead.
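The autoregressive structure of the HAR model can be sketched as an ordinary least-squares regression of next-day realized volatility on the latest daily value and its weekly and monthly averages. The sketch below is only illustrative: the 5-day and 22-day windows are an assumed convention rather than something stated in this section, the function names are hypothetical, and the series is synthetic.

```python
import numpy as np


def har_design(rv, week=5, month=22):
    """Build HAR regressors from a 1-D array of daily realized volatility.

    Row i of X holds [1, daily value, weekly average, monthly average]
    observed at day t, and y[i] is the realized volatility at day t + 1.
    """
    rv = np.asarray(rv, dtype=float)
    rows, targets = [], []
    for t in range(month - 1, len(rv) - 1):
        daily = rv[t]
        weekly = rv[t - week + 1 : t + 1].mean()
        monthly = rv[t - month + 1 : t + 1].mean()
        rows.append([1.0, daily, weekly, monthly])
        targets.append(rv[t + 1])
    return np.array(rows), np.array(targets)


def fit_har(rv):
    """Estimate the four HAR coefficients by least squares."""
    X, y = har_design(rv)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def forecast_har(rv, beta, week=5, month=22):
    """One-day-ahead forecast from the most recent observations."""
    rv = np.asarray(rv, dtype=float)
    x = np.array([1.0, rv[-1], rv[-week:].mean(), rv[-month:].mean()])
    return float(x @ beta)


# Synthetic, persistent series used only to exercise the code (not real data).
rng = np.random.default_rng(0)
rv = np.empty(300)
rv[0] = 1.0
for t in range(1, 300):
    rv[t] = 0.2 + 0.8 * rv[t - 1] + 0.05 * rng.standard_normal()

beta = fit_har(rv)
print("coefficients:", beta)
print("next-day forecast:", forecast_har(rv, beta))
```

Extensions with jumps or leverage effects would add further columns to the design matrix in the same way.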

The remainder of this paper is divided into six further sections. Section 2 gives the theoretical background related to this paper's purpose. Section 3 highlights what previous research has found within this area. Section 4 explains the econometric methodology used to model realized volatility. Section 5 presents the data used in this investigation in more detail. Section 6 presents the results of this paper in text and in tables. Finally, section 7 gives conclusions and proposals for future research.

2. Theoretical Background

This paper aims to identify which sample-frequency and which model-averaging approach minimize the forecast error of realized volatility. This section is divided into five subsections, which explain the theoretical background related to this purpose. The first subsection explains how forecasting volatility accurately adds economic value in terms of portfolio performance. It is not directly related to the purpose of this paper, but serves to motivate why predicting realized volatility matters. The second subsection describes the nature of a stock's return, which in essence determines the volatility of the stock; volatility itself is explained further in the third subsection. The fourth and fifth subsections concern the parts directly related to this paper's purpose, explaining the theory behind market microstructure noise and model uncertainty, respectively.

2.1 Portfolio Optimization

Previous research has established that predicting volatility accurately adds economic value in terms of portfolio selection, in the sense that it helps investors make better investment decisions.¹ The framework in this section is not applied in this paper, but is described in order to motivate why predicting realized volatility is important according to the existing literature.

In portfolio theory it is generally assumed that an investor selects the portfolio that maximizes return subject to risk. To do this, the investor estimates future market risk in order to ensure that the portfolio is weighted according to the investor's risk preferences. Fleming et al. (2001) suggest a mean-variance framework that considers an investor with a short time horizon (1 day, 1 week or 1 month) who aims to minimize variance subject to a certain level of expected return. Furthermore, the framework assumes a constant expected return based on, among others, Merton (1980), who showed that variation in expected returns is hard to detect over short horizons. Thus, the only thing changing over the short horizon is the predicted variance, and consequently the investor follows a portfolio strategy based on volatility timing. To illustrate this approach, consider $n$ risky assets, where $R_{t+1}$ denotes the $n \times 1$ vector of asset returns. The vector of expected returns is then given by $\mu = E[R_{t+1}]$ and the variance-covariance matrix is given by $\Sigma_t = E_t[(R_{t+1} - \mu)(R_{t+1} - \mu)']$. The investor aims to minimize the risk of the portfolio subject to the expected target return, $\mu_p$, and selects the weight, $w_t$, of each asset accordingly. This optimization problem is illustrated below:

$$\min_{w_t} \; w_t' \Sigma_t w_t \tag{1.1}$$

subject to

$$w_t' \mu + (1 - w_t' \mathbf{1}) R_f = \mu_p, \tag{1.2}$$

where $R_f$ denotes the expected return of the risk-free asset. Solving this problem for $w_t$ then yields:

$$w_t = \frac{(\mu_p - R_f)\, \Sigma_t^{-1} (\mu - R_f \mathbf{1})}{(\mu - R_f \mathbf{1})' \, \Sigma_t^{-1} (\mu - R_f \mathbf{1})} \tag{1.3}$$

Since $\mu$ is assumed to be constant, the weight of each asset is determined by the predicted values of the variance-covariance matrix, $\Sigma_t$, shown below:

$$\Sigma_t = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2 \end{pmatrix} \tag{1.4}$$

where $\sigma_i^2$ denotes the predicted variance of risky asset $i$ and $\sigma_{ij}$ denotes the covariance between risky assets $i$ and $j$. The investor predicts the variance and covariance of each risky asset, shown in matrix (1.4), and then weighs each asset according to equation (1.3). Fleming et al. (2003) substitute the standard variance and covariance measures with realized variance and covariance and find an improvement in portfolio performance. Thus, realized measures based on intraday observations might help investors make more accurate investment decisions, since forecasts of risk become more accurate. This paper focuses on realized volatility, from which both realized variance and covariance can be calculated (assuming constant correlation). The next two sections give a more detailed explanation of realized volatility and why, in theory, it is a good proxy for financial risk.

¹ See for instance Fleming, Kirby, & Ostdiek (2003). A more thorough discussion of previous findings on how predicting volatility adds economic value is given in section 3.
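The volatility-timing idea behind equation (1.3) can be illustrated numerically. The sketch below uses the standard closed-form solution of the minimum-variance problem with a risk-free asset, w = (target - rf) * inv(Sigma) (mu - rf) / ((mu - rf)' inv(Sigma) (mu - rf)). The function name and all input numbers are hypothetical and chosen only to show how the weights react to a change in the predicted covariance matrix.

```python
import numpy as np


def min_variance_weights(mu, sigma, target, rf):
    """Risky-asset weights that minimise portfolio variance for a target return.

    mu:     expected returns of the risky assets
    sigma:  predicted variance-covariance matrix
    target: target expected portfolio return
    rf:     return on the risk-free asset
    The remainder, 1 - sum(weights), is held in the risk-free asset.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    excess = mu - rf                      # expected excess returns
    z = np.linalg.solve(sigma, excess)    # inv(Sigma) @ (mu - rf)
    return (target - rf) * z / (excess @ z)


# Hypothetical daily inputs for two risky assets.
mu = [0.0006, 0.0004]
rf = 0.0001
target = 0.0005

sigma_calm = np.array([[1.0e-4, 2.0e-5],
                       [2.0e-5, 0.5e-4]])
# Same matrix, except that the predicted variance of asset 1 quadruples.
sigma_stressed = np.array([[4.0e-4, 2.0e-5],
                           [2.0e-5, 0.5e-4]])

w_calm = min_variance_weights(mu, sigma_calm, target, rf)
w_stressed = min_variance_weights(mu, sigma_stressed, target, rf)
print("calm:    ", w_calm)
print("stressed:", w_stressed)
```

With expected returns held constant, the weight on the first asset falls when its predicted variance rises, which is the sense in which the investor times volatility. Scaling the whole covariance matrix by a constant leaves the weights unchanged, because the target return still has to be met.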

2.2 The Process of the Stock Price

Since volatility is essentially variation in the stock price, it is important to understand what drives this variation. This section explains the process of the stock price and how it is modeled in the existing literature.

There are many ways to describe the nature of the stock price, but common to most models is that volatility plays an important part. When the change over time in a variable is uncertain, it is common to consider the change in the

variable to follow a stochastic process. This is the general underlying assumption in empirical asset pricing when investigating the nature of the stock price. Following Hull (2012), it is generally assumed that the price process consists of a continuous-price and continuous-time stochastic process. A continuous-price stochastic process means that the stock price can take any value within a certain range, and a continuous-time stochastic process means that a change in the stock price might occur at any time. In reality, however, time and prices are generally observed discretely rather than continuously. Thus, these assumptions are usually relaxed when forecasting prices of financial assets (Corsi & Renò, 2012).

The most basic model assumes that changes in the log-price of a stock, $p(t)$, follow a standard Brownian motion that includes the stock's expected return, $\mu(t)$, in period $t$ and its volatility, $\sigma(t)$, times the Wiener process, $W(t)$. Since this paper uses intraday returns to predict future volatility, it is important to add the market microstructure noise that arises from the sampling frequency of the intraday data, $\varepsilon(t)$. Market microstructure noise in high-frequency data includes, for example, typographical errors and delayed quotes.² Unpredicted announcement effects that cause volatility to change drastically, generally referred to as jumps, $J(t)$, are also important to consider. In order to account for jumps and market microstructure noise, the following equation is used to explain variation in stock prices:³

$$dp(t) = \mu(t)\,dt + \sigma(t)\,dW(t) + \varepsilon(t) + J(t), \tag{2.1}$$

where $\mu(t)$ is the expected rate of return, $\sigma(t)$ is referred to as the integrated volatility of the stock price, $W(t)$ is the Wiener process, $\varepsilon(t)$ is the mean-zero random noise, independent of the Wiener process, arising due to market microstructure noise, and $J(t)$ reflects the stochastic jump process.

2.3 Measuring the Volatility of the Stock Price

Since this paper forecasts realized volatility rather than daily standard deviation, it is important to understand why research has moved towards this measurement. This section explains the concept and nature of volatility, the measurement error arising when daily standard deviation is used and how realized volatility deals with this issue.

² A more detailed explanation of market microstructure noise is given in section 2.4.

³ The equation is a combination of Zhou (1996) and Barndorff-Nielsen & Shephard (2004a), to illustrate the impact of jumps and microstructure noise separately.

According to Hull (2012), stock volatility can be thought of as a measure of uncertainty about the returns provided by the stock. In general, volatility reflects the variation in a stock's price. Variation occurs due to new information received by the market.⁴ Investors consider this news and reevaluate the price, and as a result the stock price moves. In general, larger expected volatility requires a larger expected return in order to compensate for the risk.

Daily standard deviation, computed as the square root of the squared deviation of the daily return from its past mean, has historically been used as a proxy for financial risk due to its simplicity. However, this measure has been considered inefficient since it potentially suffers from measurement error. This is because daily standard deviation is based on a single observation per trading day, usually the closing price, and as a result absorbs a lot of noise since it does not observe the intraday volatility (Andersen & Bollerslev, 1998). To demonstrate this issue, consider the integrated volatility, $\sigma(t)$, in trading day $t$ as shown in equation (2.1). In order to model this variable as accurately as possible, one seeks to model the cumulative quadratic variation over all small periods in trading day $t$, in other words the intraday movements, referred to as period $s$. The integrated volatility from equation (2.1) in trading day $t$ can then be expressed as the square root of the integral of all squared movements in period $s$ such that:⁵

$$\sigma(t) = \sqrt{\int_{t-1}^{t} \sigma^2(s)\,ds + \sum_{s=1}^{M} J^2(s) + \varepsilon(s)}, \tag{3.1}$$

where $\sigma^2(s)$ is the variance process in period $s$, $J(s)$ denotes jumps in period $s$, $\varepsilon(s)$ denotes the market microstructure noise in period $s$ and $M$ is the total number of intraday movements in trading day $t$. Thus, if variation is large in trading day $t$, daily standard deviation based on only one observation is theoretically a weak proxy for the actual variation occurring in that trading day. This is one of the main reasons why using intraday returns to predict volatility has grown popular in the volatility forecasting literature. Realized volatility requires intraday observations of the stock price, for example at a 1-min, 5-min or 10-min frequency, and can be expressed as follows:

$$RV(t) = \sqrt{\sum_{j=1}^{M} r_{t,j}^2}, \tag{3.2}$$

where $M$ denotes the number of intraday observations in trading day $t$, which grows larger as the sample-frequency increases, $j$ denotes each intraday observation and $r_{t,j}$ is the intraday log-return of the stock. Equation (3.2) is convenient in the sense that variation is based on intraday returns and as a result captures more of the actual variation occurring in trading day $t$. Previous papers have established that as the sample-frequency in equation (3.2) increases, realized volatility converges to the quadratic variation process expressed in equation (3.1) such that:⁶

$$RV(t) \xrightarrow{\; M \to \infty \;} \sqrt{\int_{t-1}^{t} \sigma^2(s)\,ds + \sum_{s=1}^{M} J^2(s) + \varepsilon(s)} = \sigma(t) \tag{3.3}$$

The equation shows that as the number of intraday observations, $M$, goes to infinity (i.e. the sampling interval approaches zero), realized volatility, $RV(t)$, becomes an efficient and unbiased estimator of the integrated volatility in period $t$. As will be discussed further in section 3, recent research has moved towards measuring volatility in this way due to its efficiency and because it is, theoretically, free from measurement error. However, when realized volatility is used as a proxy for volatility, another bias arises due to the increased proportion of absorbed market microstructure noise. This is further explained in section 2.4.

⁴ See chapter 17 of Elton, Gruber, Brown, & Goetzmann (2013) for a discussion regarding efficient markets and the efficient market hypothesis (EMH).

⁵ The integrated volatility equation is inspired by Corsi (2009) and Andersen, Bollerslev, & Diebold (2007).
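As a minimal sketch of equation (3.2), realized volatility for one trading day is the square root of the sum of squared intraday log-returns, and a lower sample-frequency is obtained by keeping every k-th price. The function name and the price series below are hypothetical and serve only as an illustration.

```python
import math


def realized_volatility(prices, step=1):
    """Square root of the sum of squared intraday log-returns.

    prices: intraday prices for one trading day on the finest grid (e.g. 1-min)
    step:   keep every `step`-th price, so step=5 turns 1-min data into 5-min data
    """
    sampled = prices[::step]
    returns = [math.log(b / a) for a, b in zip(sampled, sampled[1:])]
    return math.sqrt(sum(r * r for r in returns))


# Hypothetical 1-minute prices for a short stretch of one trading day.
prices = [100.0, 100.2, 100.1, 100.4, 100.3, 100.5,
          100.2, 100.6, 100.4, 100.7, 100.9]

for step in (1, 5, 10):
    print(f"{step}-min sampling:", realized_volatility(prices, step))
```

With eleven 1-minute prices, sampling every tenth price leaves a single return, so the measure collapses to the absolute open-to-close log-return, which mirrors the one-observation-per-day problem described above.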

2.4 Market Microstructure Noise

This paper investigates two research questions, of which the first concerns the bias that arises at high frequencies due to market microstructure noise. This section explains how market microstructure noise is defined in this paper and the trade-off that occurs when increasing the sample frequency.

Black (1986) distinguishes between the meaning of market microstructure noise in the contexts of finance, econometrics and macroeconomics. Further, he explains that the only thing market microstructure noise has in common across these contexts is that it refers to something that the model observes but that is not related to the causality the model tries to explain. This paper concerns the market microstructure noise that arises in financial data, which Black (1986) describes as information concerning the movement of the stock price that is not actual information. Though this definition might seem confusing, it makes sense in the context of realized volatility. When stock prices are observed at high frequencies, there is an increased chance that a proportion of the observed prices suffer from errors such as bid-ask bounce effects, rounding errors due to price discreteness and recording errors. None of these errors are related to the true movement of the stock price, but since they are observed they are still used as information to explain the movements. As a result, estimates are based on information that is not actual information, which produces noisy estimates and might lead to inaccurate predictions (Hansen & Lunde, 2006). Thus, in this paper, market microstructure noise is defined as observed movements of the intraday stock price that have no explanatory power for the true volatility.

⁶ For a discussion of the convergence see Andersen & Bollerslev (1998) and Barndorff-Nielsen & Shephard (2002a).

In theory, increasing the sample frequency implies an increased risk of

observing more market microstructure noise and as a result one might estimate

the microstructure volatility rather than the realized volatility (Awartani, et al.,

2009). However, the impact of the market microstructure noise depends on the

specifics of the market in the sense that if the market contains high proportions

of market microstructure noise relative to true variation, then realized volatility

should be measured on a lower frequency since a high frequency would absorb

too much noise (Andersen, et al., 2011). However, if the proportion of market

microstructure noise is small relative to the observed variation, a high frequency

is preferred due to the increased efficiency in the estimates. Thus in order to

maintain reliable predictions it is important to choose the right sample-

frequency.
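The trade-off described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the efficient log-price is a random walk, the noise is i.i.d. Gaussian, and the trading-day length, daily volatility and noise size are hypothetical values, not estimates from the data used in this paper.

```python
import math
import random

def realized_volatility(log_prices, step):
    """Square root of the sum of squared log returns, sampling every `step` observations."""
    sampled = log_prices[::step]
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(sampled, sampled[1:])))

def simulate_day(rng, n_minutes=480, daily_vol=0.01, noise_sd=0.0005):
    """Return (efficient, observed) 1-min log-price paths; all parameter values are assumptions."""
    sigma = daily_vol / math.sqrt(n_minutes)
    efficient = [0.0]
    for _ in range(n_minutes):
        efficient.append(efficient[-1] + rng.gauss(0.0, sigma))
    # Observed price = efficient price + i.i.d. microstructure noise (assumed form).
    observed = [p + rng.gauss(0.0, noise_sd) for p in efficient]
    return efficient, observed

rng = random.Random(42)
n_days = 200
avg_true = {1: 0.0, 5: 0.0, 10: 0.0}
avg_noisy = {1: 0.0, 5: 0.0, 10: 0.0}
for _ in range(n_days):
    efficient, observed = simulate_day(rng)
    for step in avg_true:
        avg_true[step] += realized_volatility(efficient, step) / n_days
        avg_noisy[step] += realized_volatility(observed, step) / n_days

for step in (1, 5, 10):
    print(f"{step:>2}-min  noise-free RV: {avg_true[step]:.4f}   noisy RV: {avg_noisy[step]:.4f}")
```

In this setup the noise-free measure is roughly the same at all three frequencies, while the noisy measure is inflated more the higher the sampling frequency, which is the reason the choice of frequency matters.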

2.5 Model Uncertainty

The second purpose of this paper concerns how to deal with model uncertainty

through different methods of model-averaging. For this reason it is important to

understand the concept of model uncertainty and how model-averaging deals

with this issue in the forecasting process. This section will give some theoretical

background regarding model uncertainty and explain how model-averaging

might deal with this issue.

Model uncertainty is a potential issue when predicting a variable’s future values based on a single model. This can be illustrated with a simple example. Consider two different models used for predicting the future value of a variable. Basic forecasting theory suggests picking the model that produces the smallest forecast error. However, this approach ignores the uncertainty of the single model. An early paper that considers this issue points out two factors that arise from this approach (Bates & Granger, 1969):

1. Each model is based on information that the other model has not considered

2. Each model interprets the relation between the independent and the dependent variable differently

The second one is not necessarily an issue in the sense that if the estimated relation in one of the models is wrong, it is better to only use the correct model. However, model uncertainty arises in forecasting because the future performance of a single model is uncertain: since the model is selected based on its average performance, there might be periods in which other models perform better. For this reason, model-averaging has been considered a “cure” for model uncertainty. Model-averaging combines several models in order to adjust for model uncertainty. In macroeconomics, model-averaging grew popular in the early 2000s for forecasting inflation and real output growth, while its application in the financial economics literature is still relatively rare and has only grown in popularity in recent years (Rapach, et al., 2010). Thus, this is a relevant subject to investigate further in order to facilitate prediction of financial variables based on combined models.
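The intuition behind combining forecasts can be sketched numerically. The example below is an illustration under assumptions, not the procedure used in this paper: two hypothetical models observe the target with independent errors, and their forecasts are combined with equal weights.

```python
import random

def mse(forecasts, actuals):
    """Mean squared forecast error."""
    return sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / len(actuals)

rng = random.Random(0)
n = 5000
actual = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Two hypothetical models: each forecast equals the target plus its own independent error.
forecast_a = [y + rng.gauss(0.0, 0.5) for y in actual]
forecast_b = [y + rng.gauss(0.0, 0.7) for y in actual]

# Equal-weight (mean) forecast combination.
combined = [0.5 * a + 0.5 * b for a, b in zip(forecast_a, forecast_b)]

print("MSE model A: ", round(mse(forecast_a, actual), 3))
print("MSE model B: ", round(mse(forecast_b, actual), 3))
print("MSE combined:", round(mse(combined, actual), 3))
```

Because the two error series are independent by construction, part of the error cancels in the average, so the combination beats even the better single model here. With strongly correlated errors the gain would be smaller.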

3. Previous Research

This section is divided into three subsections, which highlight previous research on realized volatility, on sample-frequency and market microstructure noise, and on model-averaging. These are all related to the purpose of this paper in the sense that realized volatility is being forecasted and the optimal sample-frequency and model-averaging approach are being investigated. Since most of the main findings in this field are based on U.S. data, this section presents these findings. However, it should be highlighted that this study is based on data from the Swedish stock exchange, which is less liquid than the U.S. stock exchange, but still very similar in terms of maturity stage.

3.1 Realized Volatility

Since realized volatility is being forecasted in this paper, it is vital to understand why this measure is important and what previous papers have found regarding it. This section gives insight into previous papers’ findings regarding realized volatility and the empirical support for using it to forecast future volatility compared to more conventional approaches.

In one of the first papers investigating realized volatility, Andersen et al. (2001a) examined the distribution of realized exchange rate volatility based on 5-min frequency during 1986-1996. Their paper found strong evidence that realized volatility provides a measure free from measurement error that might be more accurate than measures based on parametric estimates of the error term of daily returns. In their second paper the same year, the authors did a

similar investigation, based on 5-min data during 1993-1998, of stocks included in the Dow Jones Industrial Average (DJIA) (Andersen, et al., 2001b). For this financial data as well, the authors found that realized volatility provides a model-free measurement. Furthermore, the authors argued that realized volatility should be preferred to parametric models, in which volatility is never actually observed, since it allows observing actual volatility and is at least as accurate as parametric models. The literature is somewhat biased in the sense that several of the main findings in this field come from the same authors over several years. In their third paper concerning realized volatility, Andersen et al. (2003) compare the prediction performance of realized exchange rate volatility and daily exchange rate volatility based on GARCH components. Their findings suggest that basic AR models based on realized volatility outperform models that only observe the daily exchange rate and then predict volatility through parametric models. In another paper, McMillan & Speight (2004) highlight the weak evidence that GARCH models provide better forecasts than a simple autoregressive model when predicting future volatility based on daily exchange rates or stock returns. The authors suggest that this is due to all the noise being observed when volatility is based on only one daily observation. By using intraday 30-min data for 17 different exchange rates to predict realized volatility and applying different models from the GARCH family, the authors find strong evidence that GARCH models perform better than AR models when realized volatility is used. The authors then use this as an argument to support the hypothesis that measuring volatility based on daily returns suffers from measurement error. In another famous paper, Corsi (2009) proposed the Heterogeneous Autoregressive model of the realized volatility (HAR-RV). The model is very straightforward and essentially aggregates daily realized volatility over the past day, past week and past month in an OLS regression. His results suggest that, despite the model’s simplicity, it can still outperform the GARCH model, and as a result the HAR model, along with its extensions, has been used frequently ever since.
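The HAR-RV idea of regressing realized volatility on its past daily, weekly and monthly averages can be sketched as follows. This is a minimal illustration, assuming 5-day and 22-day windows for the week and month and a simulated series in place of real data; the helper names `solve`, `ols` and `har_design` are hypothetical and not taken from the paper.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def har_design(rv, week=5, month=22):
    """Regressors [1, past day, past-week mean, past-month mean] and the next-day target."""
    X, y = [], []
    for t in range(month, len(rv)):
        X.append([1.0, rv[t - 1], sum(rv[t - week:t]) / week, sum(rv[t - month:t]) / month])
        y.append(rv[t])
    return X, y

# Simulated persistent, positive series standing in for daily realized volatility.
rng = random.Random(1)
log_rv, rv = 0.0, []
for _ in range(600):
    log_rv = 0.9 * log_rv + rng.gauss(0.0, 0.2)
    rv.append(math.exp(log_rv))

X, y = har_design(rv)
beta = ols(X, y)
residuals = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]

# One-step-ahead forecast from the most recent day, week and month.
latest = [1.0, rv[-1], sum(rv[-5:]) / 5, sum(rv[-22:]) / 22]
forecast = sum(b * x for b, x in zip(beta, latest))
print("coefficients:", [round(b, 3) for b in beta])
print("next-day forecast:", round(forecast, 3))
```

The only difference from a plain AR model is how the lags are aggregated, which is why the model can be estimated with ordinary least squares.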

As pointed out above, the empirical evidence for realized volatility is supportive

and as a result research in this subject has become popular as well. This section

has highlighted some of the most important findings in this area.

3.2 Sample-Frequency and Market Microstructure Noise

Despite previous effort, an optimal sample-frequency has not yet been established, which is the objective of this paper’s first purpose. However, as this section will discuss further, there is strong evidence that choosing the sample-frequency that minimizes forecast error has significant economic

value. Thus, it is vital to do further investigations in this area in order to provide new insight into the sample-frequency’s impact on realized volatility forecasts.

As mentioned in section 2.4, a trade-off occurs when choosing sample frequency. Picking a high sample frequency might reduce the stochastic error of the measurement, also referred to as increasing the efficiency of the estimates, but it might also introduce more market microstructure noise and as a result provide biased estimators (Corsi, 2009).

Bandi & Russell (2006) use data from stocks included in the S&P 100 in an attempt to establish the optimal frequency. The authors’ findings suggest that the optimal frequency for realized variance varies from 0.4 min to 13.8 min. The authors also conclude that there is significant economic gain from choosing the right frequency when applying different frequencies to the framework proposed by Fleming et al. (2003). However, a range between 0.4 minutes and 13.8 minutes is not very useful when choosing between 1-min, 5-min and 10-min frequencies, especially if the performances of these frequencies are significantly different from each other. Another study of frequencies was conducted by de Pooter et al. (2008), who use a similar framework to investigate the performance when predicting the realized covariance matrix of the S&P 100 at frequencies from 1 min to 130 min. Similar to Bandi & Russell (2006), de Pooter et al. (2008) conclude that choosing the right frequency has economic value. However, they also suggest that the optimal frequency ranges between 30 and 65 minutes, which is a significantly lower frequency than proposed by Bandi & Russell (2006). However, the results of de Pooter et al. (2008) are not as convincing as the paper might imply. First, their results also imply a strong performance at the 10-min frequency, but the authors do not consider this important since all other results are in favor of lower frequencies. Second, the authors only use one model to evaluate the performance of each frequency and do not consider the model uncertainty that might arise in this context. This issue also holds for Bandi & Russell (2006), who investigate performance depending on frequency based on different stocks, but only on one model. For these reasons it is vital to statistically test the performance of several single models on each frequency, and then test the performance of each model across frequencies, in order to be more statistically certain about the optimal frequency.

Despite previous attempts7 and the economic value it might add to portfolio

optimization, research has not yet been able to establish a “golden rule” when

7 See also Bandi & Russell (2008), McAleer & Medeiros (2008) and Shin & Hwang (2015) for further discussions concerning market microstructure when predicting realized volatility.

selecting sample-frequency. Instead it has been common to use an ad hoc rule, in which a sample-frequency between 5 and 30 minutes is selected (Shin & Hwang, 2015). This paper aims to fill this gap by investigating the performance of several single models on higher frequencies, 1-min, 5-min and 10-min, in an attempt to see if there is a significant difference in forecast performance depending on frequency.

3.3 Model-Averaging

This paper’s second purpose is to investigate differences between Mean forecast combinations, Bayesian and Dynamic model-averaging when forecasting realized volatility, with a larger focus on the two latter. This section gives insight into what previous research has established on this subject.

Model-averaging in the context of predicting realized volatility has grown
popular in the last decade, and many papers still examine its economic value.
As pointed out in section 2.5, model-averaging combines

several models in order to deal with model uncertainty, which might arise when

using one single model to predict the future value of a variable. Ignoring model

uncertainty might lead to understatements of the risk in the forecasting and as a

result cause poor predictions (Hibon & Evgeniou, 2004).

One of the first papers testing model-averaging to predict future realized

variance was Liu & Maheu (2009), who used a Bayesian model-averaging

(BMA) approach, which combines several models based on each model’s

average historical predictive power. The authors use 5-min S&P data during

1997 and 2004 and apply linear HAR and AR models to forecast 1-day, 1-week

and 2-weeks ahead and find that BMA significantly improves the prediction

performance compared to single models on all horizons. For these findings the

authors give two reasons. First, there is not one single model that dominates in

performance across markets and horizons. Second, giving models more weight

during periods when performing well contributes to a decreased uncertainty

and consequently provides more reliable predictions. Another study predicts

volatility on stock indices on Chinese and Japanese markets using three new

models specifically invented for high-frequency data. This study confirms that

prediction performance improves when using BMA compared to single models

(Wang & Nishiyama, 2015). However, Wang et al. (2016) argue that BMA does not consider structural breaks or the fact that models’ forecast performance is time-varying and therefore suggest a dynamic model-averaging (DMA) approach. A DMA approach, as the name implies, is dynamic in the sense that the model selection is very flexible. Due to its flexibility it allows parameters to be

time-varying. Wang et al. (2016) use 5-min data from the S&P 500 index

during 1996-2013 and apply eight different models all derived from the HAR

family. The results show that DMA on average performs better than single

models but not significantly better than BMA. Furthermore, the authors run a portfolio exercise using a framework similar to the one presented in section 2.1 and find that both BMA and DMA improve portfolio performance. In another recent paper, Liu et al. (2017) compare the performance of BMA and DMA when forecasting the realized range volatility of the S&P and crude oil based on models from the HAR family. The authors find that the DMA approach is significantly better than BMA and individual models at forecasting future volatility. However, their model-averaging only consists of five single models from the same family, implying that the combined models will probably follow a somewhat similar pattern. Thus, adding models from two different families and expanding the set of models included in the model-averaging process, as done in this paper, might contribute to more heterogeneity in the forecasts and as a result provide different findings concerning the performance of BMA and DMA. Furthermore, previous research has not been very concerned with the impact of horizons when model-averaging. Since the time-variation of models’ forecast performance might depend on the horizon, it also makes sense to consider this when investigating the model-averaging approach.

Summing up, previous research has concluded that model-averaging adds economic value in terms of portfolio performance and prediction accuracy. However, previous research has not yet considered the effect of restricting the number of models while averaging and how this is related to the relative performance of BMA and DMA, nor has it considered how BMA and DMA might depend on horizon. Thus, it might be useful to see how performance changes when restricting model-averaging to only include a limited number of models based on their performance. Furthermore, investigating the difference on each horizon is of interest in order to see if this should be considered when selecting between BMA and DMA. This paper fills the gap concerning whether the number of models included in the model-averaging process is essential and whether BMA and DMA performance depends on the horizon.
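The difference between weighting models by their whole history and letting weights adapt over time can be sketched with a recursive weight update. The update rule below, with a forgetting factor and a Gaussian predictive likelihood with fixed standard deviation, is an assumption made for illustration and is not necessarily the specification used in this paper or in the cited studies. The error sequences are hypothetical.

```python
import math

def update_weights(weights, errors, forgetting=1.0, sd=1.0):
    """One step of recursive model weighting.

    forgetting = 1.0 keeps the full history (BMA-style weights);
    forgetting < 1.0 discounts older performance (DMA-style weights).
    """
    # Prediction step: flatten the previous weights by raising them to the forgetting factor.
    predicted = [w ** forgetting for w in weights]
    total = sum(predicted)
    predicted = [w / total for w in predicted]
    # Update step: reweight by each model's (assumed Gaussian) predictive likelihood.
    likelihood = [math.exp(-0.5 * (e / sd) ** 2) for e in errors]
    posterior = [w * l for w, l in zip(predicted, likelihood)]
    total = sum(posterior)
    return [w / total for w in posterior]

# Hypothetical forecast errors: model A is accurate for 80 periods, then model B takes over.
errors_by_period = [(0.1, 1.0)] * 80 + [(1.0, 0.1)] * 20

bma_weights = [0.5, 0.5]
dma_weights = [0.5, 0.5]
for errors in errors_by_period:
    bma_weights = update_weights(bma_weights, errors, forgetting=1.0)
    dma_weights = update_weights(dma_weights, errors, forgetting=0.9)

print("final weight on model B, full history:", round(bma_weights[1], 4))
print("final weight on model B, forgetting:  ", round(dma_weights[1], 4))
```

With the full history, model A still dominates at the end because it was better on average, while the forgetting version has already shifted most of the weight to model B. This is the kind of time-variation in performance that motivates DMA.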

4. Econometric Methodology

This section starts by explaining this study’s forecast procedure. The following

subsection gives an explanation regarding how realized volatility is specified in

this paper. Subsections 3 and 4 illustrate and explain the models used to forecast

realized volatility from the HAR and GARCH family, respectively. Subsection 5 describes the loss functions used to measure the forecast error of each model. In the final subsection, the different selection approaches used for model-averaging are illustrated and explained.

4.1 The Forecasting Process

As mentioned in previous sections, this paper aims to identify the optimal frequency and differences in model-averaging approaches when predicting realized volatility based on the OMXS30 index during 2007-2012. Thus, the findings of this paper are based on the Swedish stock exchange rather than the U.S. stock exchange. Since previous research has focused on the U.S. market, this paper, besides providing new findings, also complements previous findings since it is based on a new dataset. As a first step, rolling window out-of-sample forecasts for 10 different models on three horizons and on three different frequencies are executed. When running a rolling out-of-sample forecast it is possible to choose between rolling recursively or with a fixed window. In a recursive approach one adds new observations after each rolling forecast; in other words, the length of the sample expands after each executed forecast. The primary problem with this approach is that the forecasts are not directly comparable, since they are based on estimation samples of different lengths. For this reason a rolling window is preferred, since it keeps the length of the sample equal for each forecast and consequently makes the forecast observations more comparable. The three horizons that have been forecasted are 1-day, 1-week and 2-weeks. These horizons have been chosen based on the mean-variance framework explained in section 2.1 with an investor changing his portfolio daily, weekly or monthly based on the predicted volatility. For the 1-day horizon, the sample window has been set to 300 observations, which is approximately one year and two months. The length has been chosen because volatility is dynamic and changes quickly, and for this reason volatility observations more than one and a half years old are irrelevant when predicting volatility 1-day ahead. For the 1-week and 2-weeks horizons, however, the window has been set to 100 observations, which is the minimum number of observations when running GARCH models in R using the “rugarch” package. However, since the 1-week and 2-weeks horizons are based on aggregated daily realized volatility over 1 week and 2 weeks respectively, the window of the 1-week horizon is almost two years and the window of the 2-weeks horizon is almost 4 years. For the 1-day and 1-week horizons, 100 forecast observations

are obtained and for the 2-weeks horizon 50 forecast observations are obtained

and thus for all forecast samples it is possible to assume a normal distribution8.
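The difference between the fixed rolling window and the recursive (expanding) scheme can be sketched as follows. The AR(1) forecaster and the simulated series are stand-ins chosen for brevity, not the models or data of this paper; the window of 300 and the 100 forecasts mirror the 1-day setup described above.

```python
import random

def fit_ar1(series):
    """OLS estimates (intercept, slope) from regressing series[t] on series[t-1]."""
    x, y = series[:-1], series[1:]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    slope = cov / var
    return mean_y - slope * mean_x, slope

def out_of_sample(series, window, scheme="rolling"):
    """One-step-ahead forecasts; `scheme` is 'rolling' (fixed window) or 'recursive'."""
    forecasts, sample_sizes = [], []
    for end in range(window, len(series)):
        start = end - window if scheme == "rolling" else 0
        sample = series[start:end]           # estimation sample ends just before `end`
        intercept, slope = fit_ar1(sample)
        forecasts.append(intercept + slope * sample[-1])
        sample_sizes.append(len(sample))
    return forecasts, sample_sizes

# Simulated persistent series of 400 observations: 300 for the first window, 100 forecasts.
rng = random.Random(7)
value, series = 1.0, []
for _ in range(400):
    value = 0.2 + 0.8 * value + rng.gauss(0.0, 0.1)
    series.append(value)

rolling_fc, rolling_sizes = out_of_sample(series, window=300, scheme="rolling")
recursive_fc, recursive_sizes = out_of_sample(series, window=300, scheme="recursive")
print(len(rolling_fc), "forecasts; rolling sample sizes:", set(rolling_sizes))
print("recursive sample sizes grow from", recursive_sizes[0], "to", recursive_sizes[-1])
```

Every rolling forecast is estimated on exactly 300 observations, while the recursive samples grow with each step, which is the comparability issue raised above.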

All single model forecasts are executed on 1-min, 5-min and 10-min frequency data. In order to test for frequency performance, OLS-tests are then executed based on the loss function of each model’s performance on each horizon. In total each frequency has 30 different model forecast observations (10 single models on three horizons), and the sample is thus assumed to be large and normally distributed as $n \geq 30$.

Based on the established optimal frequency, model-averaging is further

investigated. As explained further in section 4.6, three kinds of model-averaging

selections are examined to see if these on average outperform single models. The Bayesian and Dynamic model-averaging approaches are then further investigated to see if prediction performance is improved when applying restrictions that only include the best performing models in the averaging process. Based on these results, the model restriction that on average performs best on each horizon and loss function is identified. Finally, the Bayesian and Dynamic model-averaging forecasts are tested against each other to see if any difference in forecasting can be established. In this procedure as well, OLS-tests are executed in all steps to see if any significant difference between models’ forecast performance can be verified.

As a final step two robustness checks have been executed. In the first one, the whole process described above is run again based on an Allshare index for Stockholm Nasdaq including all listed large, mid and small cap firms on this stock exchange in the same period. In the second one, the first two and a half years of the OMXS30 data are dropped in order to see if results change when excluding the financial crisis in 2008. However, since this subsample only consists of 187 trading weeks, only the 1-day and 1-week horizons have been examined.

All presented results in this paper are based on Newey-West Heteroskedasticity-

Autocorrelation-Consistent standard errors.
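One way to picture such a test is a regression of the loss difference between two forecasts on a constant, with a Newey-West standard error for the estimated mean. The sketch below makes several assumptions that are not taken from the paper: the Bartlett kernel, the rule-of-thumb lag length, and the hypothetical loss series. The exact form of the OLS-tests used in this paper may differ.

```python
import math

def newey_west_mean_test(d, lags=None):
    """Mean of d, its Newey-West (Bartlett kernel) standard error, and the t-statistic."""
    n = len(d)
    if lags is None:
        # Common rule of thumb for the truncation lag (an assumption, not from the paper).
        lags = int(4 * (n / 100.0) ** (2.0 / 9.0))
    mean = sum(d) / n
    dev = [x - mean for x in d]

    def autocov(lag):
        return sum(dev[t] * dev[t - lag] for t in range(lag, n)) / n

    long_run_var = autocov(0)
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)    # Bartlett weights decline linearly with the lag
        long_run_var += 2.0 * weight * autocov(lag)
    se = math.sqrt(long_run_var / n)
    return mean, se, mean / se

# Hypothetical squared-error losses of two competing forecasts over ten periods.
loss_a = [0.9, 1.1, 1.4, 0.8, 1.2, 1.0, 1.3, 0.7, 1.1, 1.2]
loss_b = [0.7, 1.0, 1.1, 0.9, 0.8, 0.9, 1.0, 0.6, 1.0, 0.9]
diff = [a - b for a, b in zip(loss_a, loss_b)]
mean, se, t_stat = newey_west_mean_test(diff)
print(f"mean loss difference {mean:.3f}, HAC s.e. {se:.3f}, t-statistic {t_stat:.2f}")
```

A positive and significant mean difference would indicate that the second forecast has the smaller loss on average.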

4.2 Realized volatility

As already shown in equation (3.2), realized volatility in this paper refers to the square root of the sum of squared intraday log-returns. This measure is useful since it is, in the absence of market microstructure noise, free from measurement error and, as sample-frequency goes to infinity, yields an unbiased and efficient proxy of the integrated volatility. Furthermore, it has empirical

8 According to the central limit theorem, as $n$ goes to infinity the sample mean is approximately normally distributed. As a rule of thumb, $n$ is considered large when $n \geq 30$.

support in the sense that it produces smaller forecast errors than daily standard deviation and measures based on parametric assumptions9. Below is the formula for realized volatility, from here on denoted $RV_t$:

$$RV_t = \sqrt{\sum_{j=1}^{M} r_{t,j}^2}, \tag{4.3}$$

where $r_{t,j}^2$ is the squared log intraday return in intraday period $j$ and trading day $t$, and $M$ denotes the number of observed intraday returns in day $t$. Thus, if using a 1-min frequency, $M$ consists of approximately five and ten times more observations than with a frequency of 5-min and 10-min, respectively.
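The definition above translates directly into code. The sketch below assumes a list of 1-min prices for one trading day and obtains lower frequencies by keeping every fifth or tenth price. The 510-minute trading day and the artificial price path are hypothetical.

```python
import math

def realized_volatility(prices, step=1):
    """Return (RV, M): square root of the sum of squared log returns, and the number of returns."""
    sampled = prices[::step]
    returns = [math.log(b / a) for a, b in zip(sampled, sampled[1:])]
    return math.sqrt(sum(r * r for r in returns)), len(returns)

# Artificial 1-min prices for a hypothetical 510-minute trading day (511 price points).
prices = [100.0 + math.sin(i / 10.0) for i in range(511)]

counts = {}
for step in (1, 5, 10):
    rv, m = realized_volatility(prices, step)
    counts[step] = m
    print(f"{step:>2}-min sampling: M = {m:>3}, RV = {rv:.5f}")
```

The number of intraday returns at 1-min sampling is five and ten times the number at 5-min and 10-min sampling.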

4.3 HAR-Models

Due to the increased interest in modeling volatility based on high-frequency data, several new models have been developed. Among these is the popular HAR model, which in essence is a basic linear model following an autoregressive structure. Corsi (2009)10 published the Heterogeneous Autoregressive model of the Realized Volatility (HAR-RV), which is convenient due to its simplicity and long memory. Basically the model assumes that markets are heterogeneous in terms of investors who have different time horizons. He argues that the market can be divided into three kinds of investors that might have an impact on volatility: high-frequency traders with a 1-day horizon, portfolio managers with a 1-week horizon and long-term investors with a horizon of 1 month or more. He proposed the following model to predict stock volatility:

$$RV_t = \beta_0 + \beta_d RV_{t-1}^{(d)} + \beta_w RV_{t-1}^{(w)} + \beta_m RV_{t-1}^{(m)} + \varepsilon_t, \tag{5.1}$$

where $\beta_0$ is the intercept, $RV_{t-1}^{(d)}$ is the lagged daily realized volatility in day $t-1$, $RV_{t-1}^{(w)}$ is the lagged weekly realized volatility over the past week, $RV_{t-1}^{(m)}$ is the lagged monthly realized volatility over the past month and $\varepsilon_t$ is the forecast error. In order to consider the potential threat arising from jumps (a stochastic process arising from announcements or other unpredictable events that have a significant impact on the stock price), Andersen et al. (2007) suggested two models that include jumps, in which the square root of the logarithmic standardized realized bipower variation is calculated as below:

$$\sqrt{BPV_t} = \sqrt{\left(\sqrt{2/\pi}\right)^{-2} \sum_{j=1}^{M-1} |r_{t,j}||r_{t,j+1}|}, \tag{5.3}$$

9 See section 3.1 for more information regarding previous findings of realized volatility.

10 This paper was known and applied already in 2004; however, it was not published until 2009.

where $\sqrt{2/\pi}$ denotes the expected absolute value of a standard normally distributed random variable and $\sum_{j=1}^{M-1} |r_{t,j}||r_{t,j+1}|$ denotes the bipower variation term, which is basically the sum of the absolute value of the intraday return times the absolute value of the intraday return in the next period ($j+1$). Generally expressed, bipower variation captures the continuous part of the quadratic variation in the stock return and is robust to jumps, so that its difference from realized volatility isolates the jump contribution. According to Barndorff-Nielsen & Shephard (2004a), the jump component is expressed as follows:

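$$J_t = \max\left(RV_t - \sqrt{BPV_t},\ 0\right), \tag{5.4}$$

where the jump component, $J_t$, is truncated at zero so that it only consists of nonnegative estimates. Thus, by including the jump component (5.4) in model (5.1), it is possible to run the Heterogeneous Autoregressive model of the realized volatility with jumps (HAR-RV-J):

$$RV_t = \beta_0 + \beta_d RV_{t-1}^{(d)} + \beta_w RV_{t-1}^{(w)} + \beta_m RV_{t-1}^{(m)} + \theta_d J_{t-1}^{(d)} + \theta_w J_{t-1}^{(w)} + \theta_m J_{t-1}^{(m)} + \varepsilon_t, \tag{5.5}$$

where $J_{t-1}^{(d)}$, $J_{t-1}^{(w)}$ and $J_{t-1}^{(m)}$ denote the jump component according to equation (5.4) with 1-day, 1-week and 1-month lags respectively. Furthermore, Andersen et al. (2007) argued that realized volatility could be decomposed into continuous sample path (CSP) and jump components (CJ). The authors constructed these components as shown below:

$$CJ_t = I(Z_t > \Phi_\alpha)\, J_t \tag{5.6}$$

$$CSP_t = I(Z_t \leq \Phi_\alpha)\, RV_t + I(Z_t > \Phi_\alpha)\, \sqrt{BPV_t}, \tag{5.7}$$

where $I(\cdot)$ denotes the indicator function and $\Phi_\alpha$ denotes the critical value identifying the jump according to the standard normally distributed test statistic $Z_t$. Thus $CJ_t$ in equation (5.6) identifies the significant jumps determined by its critical value, and $CSP_t$ in equation (5.7) is the remaining part of realized volatility not consisting of jumps. Using these components yields the Heterogeneous Autoregressive model of the realized volatility with continuous jumps (HAR-RV-CJ), expressed as below:

$$RV_t = \beta_0 + \beta_d CSP_{t-1}^{(d)} + \beta_w CSP_{t-1}^{(w)} + \beta_m CSP_{t-1}^{(m)} + \theta_d CJ_{t-1}^{(d)} + \theta_w CJ_{t-1}^{(w)} + \theta_m CJ_{t-1}^{(m)} + \varepsilon_t, \tag{5.8}$$

where $CSP$ and $CJ$ are used in the model with 1-day, 1-week and 1-month lags.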
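The bipower variation and the truncated jump component can be computed from a day's intraday returns as sketched below. The scaling by $\sqrt{2/\pi}$ and the truncation at zero follow the verbal definitions in the text, while the function names and the two return series are hypothetical. The jump test statistic needed for the CSP and CJ components is not covered here.

```python
import math

MU_1 = math.sqrt(2.0 / math.pi)  # expected absolute value of a standard normal variable

def realized_volatility(returns):
    """Square root of the sum of squared intraday returns."""
    return math.sqrt(sum(r * r for r in returns))

def sqrt_bipower_variation(returns):
    """Square root of the scaled sum of products of adjacent absolute returns."""
    products = sum(abs(a) * abs(b) for a, b in zip(returns, returns[1:]))
    return math.sqrt(MU_1 ** -2 * products)

def jump_component(returns):
    """Difference between realized volatility and root bipower variation, truncated at zero."""
    return max(realized_volatility(returns) - sqrt_bipower_variation(returns), 0.0)

# Hypothetical intraday returns: a calm day, and the same day with one large return.
smooth_day = [0.001] * 100
jump_day = [0.001] * 50 + [0.05] + [0.001] * 50

print("jump component, calm day:", jump_component(smooth_day))
print("jump component, jump day:", round(jump_component(jump_day), 4))
```

A single large return enters realized volatility squared, but enters bipower variation only through products with its small neighbours, so the difference between the two measures picks up the jump.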

As a final step for the HAR family models, the “leverage effect” is considered,

which is a general concept in financial markets. Leverage effect implies that

Page 21: On Optimal Sample-Frequency and Model-Averaging ...1145913/...However, modeling realized volatility based on intraday return has a potential drawback; it might absorb more market microstructure

20

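negative shocks in returns have a larger impact on volatility than positive shocks. This is basically explained by the increasing default risk that occurs when a decrease in the stock price raises debt relative to equity. In order to account for the leverage effect, Corsi & Renò (2012) proposed the leveraged Heterogeneous Autoregressive model of the realized volatility with jumps (LHAR-RV-J) and continuous jumps (LHAR-RV-CJ). The leverage components can then be modeled in the following way:

$$r_t^{-} = \min(r_t,\ 0), \tag{5.9}$$

where $r_t^{-}$ is the aggregated negative return in period $t$ and $r_t = \sum_{j=1}^{M} r_{t,j}$ is based on the intraday returns in intraday periods $j$. Thus, this component is added to model (5.5) and (5.8), respectively:

$$RV_t = \beta_0 + \beta_d RV_{t-1}^{(d)} + \beta_w RV_{t-1}^{(w)} + \beta_m RV_{t-1}^{(m)} + \theta_d J_{t-1}^{(d)} + \theta_w J_{t-1}^{(w)} + \theta_m J_{t-1}^{(m)} + \gamma_d r_{t-1}^{(d)-} + \gamma_w r_{t-1}^{(w)-} + \gamma_m r_{t-1}^{(m)-} + \varepsilon_t \tag{5.10}$$

$$RV_t = \beta_0 + \beta_d CSP_{t-1}^{(d)} + \beta_w CSP_{t-1}^{(w)} + \beta_m CSP_{t-1}^{(m)} + \theta_d CJ_{t-1}^{(d)} + \theta_w CJ_{t-1}^{(w)} + \theta_m CJ_{t-1}^{(m)} + \gamma_d r_{t-1}^{(d)-} + \gamma_w r_{t-1}^{(w)-} + \gamma_m r_{t-1}^{(m)-} + \varepsilon_t, \tag{5.11}$$

where the leverage component enters with 1-day, 1-week and 1-month lags, equation (5.10) is the LHAR-RV-J model and equation (5.11) is the LHAR-RV-CJ model.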

4.4 GARCH-Models

Previous research has confirmed that a simple Autoregressive (AR) model based on past realized volatility to predict future realized volatility outperforms stochastic volatility or GARCH models based on daily return (Andersen, et al., 2003). Previous research has also considered the role of the volatility of realized volatility and found strong results suggesting that this helps explain some of the variation in realized volatility. For example, the results found by Barndorff-Nielsen & Shephard (2005) suggest that realized volatility might suffer from heteroskedastic errors because of time-varying volatility in the realized volatility estimator. Based on these findings, among others, Corsi et al. (2008) investigated this further by including a GARCH component in the HAR-RV model explained in section 4.3. The results indicate that modeling the volatility of realized volatility improves forecast accuracy. Thus, in the spirit of previous research, this paper applies a similar strategy and adopts AR models combined with GARCH(1,1) components in order to forecast realized volatility.

Introducing the autoregressive conditional heteroskedasticity (ARCH) model,
Engle (1982) suggested a parametric approach to model the size of the errors

Page 22: On Optimal Sample-Frequency and Model-Averaging ...1145913/...However, modeling realized volatility based on intraday return has a potential drawback; it might absorb more market microstructure

21

in the residual. To model the ARCH component one assumes a distribution of the error term in the AR model, also referred to as the mean equation, and runs a regression of the assumed variance of the error term, referred to as the variance equation, on past error terms. Four years later, Bollerslev (1986) introduced the GARCH component, which includes past variance. A simple AR(1)+GARCH(1,1), where $RV_t$ denotes the realized volatility, can be shown as:

$$RV_t = \mu + \phi_1 RV_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim D(0, \sigma_t^2) \tag{6.1}$$

$$\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2, \tag{6.2}$$

where equation (6.1) is the mean equation and $\phi_1 RV_{t-1}$ is the AR(1) term, and equation (6.2) is the variance equation, in which $\alpha \varepsilon_{t-1}^2$ is the ARCH component and $\beta \sigma_{t-1}^2$ is the GARCH component. Also note that $D(0, \sigma_t^2)$ refers to the assumed distribution of the error term in the mean equation. Furthermore, all coefficients in (6.2) take nonnegative values, and in order for stationarity to hold $\alpha + \beta < 1$. In many cases, the sum of the ARCH and GARCH coefficients is estimated very close to one, which implies highly persistent volatility. In order to deal with this issue, Engle & Bollerslev (1986) proposed the IGARCH model presented below:

$$\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + (1 - \alpha)\sigma_{t-1}^2, \tag{6.3}$$

where equation (6.3) presents the variance equation, in which the GARCH coefficient is restricted to $(1 - \alpha)$ so that the ARCH and GARCH coefficients sum to one. As mentioned in the previous section, negative and positive returns might have an asymmetric impact on volatility. This might also be true for negative and positive shocks to realized volatility. However, in this case one might suspect that the opposite holds; in other words, an increase in volatility might have a larger impact on the volatility compared to a decrease in volatility. In order to consider this one can apply the eGARCH model, which has proven useful in financial modeling (Nelson, 1991). The model is shown below:

$$\ln(\sigma_t^2) = \omega + \alpha\left(\frac{|\varepsilon_{t-1}|}{\sigma_{t-1}}\right) + \gamma\left(\frac{\varepsilon_{t-1}}{\sigma_{t-1}}\right) + \beta \ln(\sigma_{t-1}^2), \tag{6.4}$$

where $\gamma$ captures the asymmetric information: for $\gamma > 0$, positive shocks have a greater impact than negative shocks of equal magnitude. By transforming the GARCH components into logarithms one allows the parameters to be negative, while the conditional variance remains non-negative. Glosten et al. (1993) proposed another way to model asymmetry in the ARCH component, referred to as the GJR-GARCH model, explained below:

Page 23: On Optimal Sample-Frequency and Model-Averaging ...1145913/...However, modeling realized volatility based on intraday return has a potential drawback; it might absorb more market microstructure

22

( )

, (6.5)

where $I_{t-1}$ is an indicator variable taking value 0 if $\varepsilon_{t-1} \geq 0$ and 1 if $\varepsilon_{t-1} < 0$. Furthermore, all coefficients are positive and the conditional variance is non-negative. In essence, the model shows that positive shocks have an impact of $\alpha$ on volatility, while negative shocks have an impact of $\alpha + \gamma$. Thus, if positive volatility has a larger impact on realized volatility than negative volatility, one should expect $\gamma < 0$. Finally, this paper applies an alternative to equation (6.5) that has proven useful for modeling realized volatility [11], named TGARCH and originally proposed by Zakoian (1994). The difference is that instead of modeling the variance equation in terms of the variance, the model takes the square root of (6.5). The model is expressed as:

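$$\sqrt{\sigma_t^2} = \omega + (\alpha + \gamma I_{t-1})\sqrt{\varepsilon_{t-1}^2} + \beta\sqrt{\sigma_{t-1}^2}, \quad (6.6)$$

where $I_{t-1}$ is an indicator variable taking value 0 if $\varepsilon_{t-1} \geq 0$ and 1 if $\varepsilon_{t-1} < 0$.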
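To make the asymmetry in equations (6.5) and (6.6) concrete, the following Python sketch iterates a GJR-type variance recursion and a TGARCH-type recursion on the standard-deviation scale for a given residual series. The textbook forms of both models are assumed. The parameter names (`omega`, `alpha`, `gamma`, `beta`), the starting values and the convention that the indicator equals 1 for negative residuals are illustrative assumptions, not the estimates or the software used in this paper.

```python
import numpy as np


def gjr_variance(resid, omega, alpha, gamma, beta, var0):
    """Iterate sigma2_t = omega + (alpha + gamma*I) * e_{t-1}^2 + beta * sigma2_{t-1}."""
    resid = np.asarray(resid, dtype=float)
    sigma2 = np.empty(len(resid) + 1)
    sigma2[0] = var0  # assumed starting value
    for t, e in enumerate(resid):
        indicator = 1.0 if e < 0 else 0.0  # assumed convention: 1 for negative shocks
        sigma2[t + 1] = omega + (alpha + gamma * indicator) * e**2 + beta * sigma2[t]
    return sigma2


def tgarch_std(resid, omega, alpha, gamma, beta, std0):
    """Iterate sigma_t = omega + (alpha + gamma*I) * |e_{t-1}| + beta * sigma_{t-1}."""
    resid = np.asarray(resid, dtype=float)
    sigma = np.empty(len(resid) + 1)
    sigma[0] = std0  # assumed starting value
    for t, e in enumerate(resid):
        indicator = 1.0 if e < 0 else 0.0
        sigma[t + 1] = omega + (alpha + gamma * indicator) * abs(e) + beta * sigma[t]
    return sigma


# Hypothetical parameter values, chosen only to show the asymmetric response.
params = dict(omega=0.1, alpha=0.2, gamma=0.1, beta=0.5)
after_positive = gjr_variance([1.0], var0=1.0, **params)[1]
after_negative = gjr_variance([-1.0], var0=1.0, **params)[1]
print(after_positive, after_negative)
```

With a positive `gamma` the negative shock raises the next-period variance by more than the positive shock of equal size; a negative `gamma` reverses the ordering.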

One of the first papers investigating the distribution of the volatility of realized volatility was Corsi et al. (2008), who concluded that it is non-Gaussian. The authors further assumed an inverse Gaussian distribution and found that it improves prediction accuracy significantly. Since this distribution has proven useful in other papers as well [12], this paper follows a similar strategy and assumes an inverse Gaussian distribution for the residual in the mean equation. It is also worth mentioning that Corsi et al. (2008) and some other papers have used the HAR approach as the mean equation and then modeled the variance equation based on the residual. Even though this approach has performed well for forecasting realized volatility, this paper uses simple AR models as the mean equation when modeling the GARCH components. The reason is that the second purpose of this paper is to evaluate model-averaging approaches, and for that purpose it is useful to include models of different orders in order to obtain more heterogeneity in the predictions. It is, however, possible to argue against this choice, and the explanation above is given in anticipation of such arguments. Furthermore, the AR order of the mean equation has been chosen based on the Bayesian information criterion (BIC) for the whole sample. The AR model with consecutive lags and the smallest BIC value was chosen for each sample frequency. It is possible to argue against this approach as well: since this paper deals with forecasts, using the full sample is somewhat contradictory in the sense that rolling window forecasts are never based on the whole sample. However, this paper assumes that the impact on forecast performance of selecting the AR order with another approach would be marginal, and this approach has therefore been chosen for its convenience. For all sample frequencies, the AR(3) model returned the smallest BIC value and has for this reason been chosen as the mean equation for all of the AR+GARCH models. The GARCH order has been set to (1,1) for all GARCH models. This is the standard order in the literature and has been used in previous studies forecasting realized volatility (Caporin & Velo, 2015; Corsi et al., 2008).

[11] See Degiannakis (2008).

[12] See also Caporin & Velo (2015), who adopt an inverse Gaussian distribution when predicting realized measures based on different GARCH models.
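The BIC-based choice of AR order described above can be sketched as follows. This is a minimal illustration, assuming OLS estimation with a constant, a Gaussian BIC up to additive constants, and a common estimation sample across candidate orders. The function names and the `max_lag` parameter are hypothetical, and the software used in this paper may normalize the criterion differently.

```python
import numpy as np


def ar_bic(y, p, max_lag):
    """BIC of an AR(p) with constant, fitted by OLS on observations max_lag..end."""
    y = np.asarray(y, dtype=float)
    target = y[max_lag:]
    # Regressors: constant, y_{t-1}, ..., y_{t-p}
    cols = [np.ones(len(target))]
    for lag in range(1, p + 1):
        cols.append(y[max_lag - lag:len(y) - lag])
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    n = len(target)
    sigma2 = resid @ resid / n
    # Gaussian BIC up to a constant: n*log(sigma2) + (number of parameters)*log(n)
    return n * np.log(sigma2) + (p + 1) * np.log(n)


def select_ar_order(y, max_lag=10):
    """Return the AR order with the smallest BIC among 1..max_lag, plus all BIC values."""
    bics = {p: ar_bic(y, p, max_lag) for p in range(1, max_lag + 1)}
    return min(bics, key=bics.get), bics
```

Using the same estimation sample for every candidate order keeps the BIC values comparable; only consecutive lags are considered, as in the text.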

4.5 Loss Functions

This paper uses loss functions similar to those in previous studies in order to determine forecast performance [13]. These functions are the mean absolute error, the mean squared error and the root mean squared error. The mean absolute error weights all errors equally regardless of size, while the mean squared error and the root mean squared error assign more weight to larger forecast errors. Hence, the latter loss functions also give a good indication of the average size of the error. Using three different loss functions thus improves the conditions for an accurate analysis of the forecast errors.

As mentioned previously, this paper investigates the optimal sample frequency based on performance on three different horizons. Since the units of realized volatility differ across sample frequencies, it is not possible to compare loss functions directly. To deal with this issue, each loss function is measured relative to the actual outcome; in other words, it is the relative forecast errors that are measured. Each loss function is described below:

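$$\text{MRAE} = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{RV_t - \widehat{RV}_t}{RV_t}\right| \quad (7.1)$$

$$\text{MRSE} = \frac{1}{n}\sum_{t=1}^{n}\left(\frac{RV_t - \widehat{RV}_t}{RV_t}\right)^2 \quad (7.2)$$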

∑ √(

)

, (7.3)

where $RV_t$ is the actual observed outcome, $\widehat{RV}_t$ is the predicted outcome and $n$ is the total number of observed predictions. Furthermore, MRAE denotes the mean relative absolute error, MRSE denotes the mean relative squared error and MRRSE denotes the mean root relative squared error. Thus, as mentioned before, MRAE weighs all forecast errors equally, while MRSE and MRRSE give more weight to large forecast errors. However, MRSE assigns more weight to large forecast errors than MRRSE does.

[13] See for instance Liu, Chiang, & Cheng (2012), Liu & Maheu (2009) and Wang & Nishiyama (2015).
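The relative loss functions can be illustrated with a short Python sketch. It assumes that the relative forecast error is (actual - predicted) / actual and that MRSE averages the square of that ratio. MRRSE is left out, since its exact form depends on equation (7.3). The function names are hypothetical.

```python
import numpy as np


def relative_errors(actual, predicted):
    """Forecast errors measured relative to the actual outcome (assumed form)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return (actual - predicted) / actual


def mrae(actual, predicted):
    """Mean relative absolute error."""
    return np.mean(np.abs(relative_errors(actual, predicted)))


def mrse(actual, predicted):
    """Mean relative squared error (assumed: mean of squared relative errors)."""
    return np.mean(relative_errors(actual, predicted) ** 2)


# Hypothetical realized-volatility outcomes and forecasts.
actual = [0.010, 0.020]
predicted = [0.011, 0.016]
print(mrae(actual, predicted), mrse(actual, predicted))
```

Because both measures are ratios, rescaling realized volatility leaves them unchanged, which is what allows a comparison across sample frequencies.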

4.6 Model-Averaging

As mentioned in section 3.3, model-averaging has significant economic value in terms of portfolio selection. This is mainly explained by its ability to deal with model uncertainty. However, it is still unclear whether there is a difference in performance between Bayesian (BMA) and dynamic (DMA) model-averaging. To gain more insight in this area, three different model-averaging approaches are used. The first is a benchmark referred to as mean forecast combination (MFC), in which all models are assigned equal weight regardless of previous prediction performance. According to Smith & Wallis (2009), MFC performs as well as other model-averaging approaches and should thus be preferred due to its simplicity. However, since it is likely that some models produce better predictions than others, one might assign the weights accordingly. The DMA approach is, as the name implies, dynamic in the sense that the weight of each model is based only on the most recent forecast performance observation. In the BMA approach, by contrast, a larger sample of the loss function is used to calculate the average error. Thus, the weights of the BMA selection change slowly compared to those of DMA. There are several different BMA selections, some of which take the covariance between models into consideration. However, Smith & Wallis (2009) emphasized that weights based on simple performance measures tend to outperform weights based on more complex approaches, such as those including the covariance. Following their suggestion, this paper uses a straightforward weighting process when applying BMA and compares it to the results achieved from the non-performance-based approach, MFC, and from the dynamic performance-based approach, DMA. The BMA

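selection is expressed below, where $L_{i,t}$ represents the value of the loss function for model $i$ in period $t$:

$$\bar{L}_{i,t} = \frac{1}{m}\left(\sum_{s=t-m+1}^{t} L_{i,s}\right) \quad (8.1)$$

$$w_{i,t+1} = \frac{\bar{L}_{i,t}^{-1}}{\sum_{j}\bar{L}_{j,t}^{-1}}, \quad (8.2)$$

where $\bar{L}_{i,t}$ denotes the average value of the loss function over the last $m$ observations and $w_{i,t+1}$ denotes the weight of each model when forecasting one period ahead. The loss function’s inverse is used in order to give models producing small forecast errors a greater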


weight in the forecast. In essence, the average historical performance is used to select the weight of each model. In this paper, the BMA selection follows a rolling window approach based on the last 30 observations for the 1-day and 1-week horizons and the last 25 observations for the 2-weeks horizon [14]. For DMA, the selection is carried out as below:

$$w_{i,t+1} = \frac{L_{i,t}^{-1}}{\sum_{j} L_{j,t}^{-1}}, \quad (8.3)$$

where, as before, $L_{i,t}$ is the value of the loss function for model $i$ in period $t$. Note that in equation (8.3), the weighting is based only on the performance of the last prediction instead of an average of past predictions. If model performance is time-varying, DMA should outperform BMA. However, if model performance is less time-varying, BMA should outperform DMA.
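The three weighting schemes described above can be compared in a short Python sketch. It follows the description in this section: MFC uses equal weights, BMA uses the inverse of the average loss over a rolling window, and DMA uses the inverse of the most recent loss. Normalizing the inverse losses so that the weights sum to one is an assumption, and all function names are hypothetical.

```python
import numpy as np


def inverse_loss_weights(losses):
    """Weights proportional to 1/loss, normalized to sum to one (assumed)."""
    inv = 1.0 / np.asarray(losses, dtype=float)
    return inv / inv.sum()


def mfc_weights(n_models):
    """Mean forecast combination: equal weight for every model."""
    return np.full(n_models, 1.0 / n_models)


def bma_weights(loss_history, window):
    """Weights from the average loss over the last `window` periods.

    loss_history has one row per period and one column per model.
    """
    loss_history = np.asarray(loss_history, dtype=float)
    return inverse_loss_weights(loss_history[-window:].mean(axis=0))


def dma_weights(loss_history):
    """Weights from the loss in the most recent period only."""
    return inverse_loss_weights(np.asarray(loss_history, dtype=float)[-1])


def combined_forecast(weights, forecasts):
    """Weighted average of the individual model forecasts."""
    return float(np.dot(weights, forecasts))


# Hypothetical losses for two models over two periods.
history = np.array([[0.1, 0.2],
                    [0.3, 0.2]])
print(bma_weights(history, window=2), dma_weights(history))
```

In this toy example both models have the same average loss, so the window-based weights are equal, while the weights based on the last period favor the second model, which performed better most recently.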

As a final step, this paper considers changes in performance when restricting the set of models by ranking each model in terms of its loss function value. In this way it is possible to identify the best restriction for BMA and DMA. Once this is established for each horizon and loss function, BMA and DMA are compared with each other in order to see whether the most accurate restriction of one averaging approach performs better than that of the other. This approach has not yet been used for realized volatility, but is applied here since it is possible that some models included in the averaging process never outperform the top models and thus reduce prediction accuracy rather than improve it. While this approach gives insight into whether BMA is better than DMA, it is still limited in the sense that in reality the optimal number of models to include is not known. Nevertheless, the approach adds value. Firstly, it provides insight into whether restricting the averaging to a set of top-performing models, rather than using all models, is a useful way to reduce forecast error. Secondly, it makes it possible to conclude whether there is a difference in forecast performance between BMA and DMA when both are specified optimally. Finally, it generates observations of the performance of BMA and DMA, which makes it possible to run an OLS test to see whether performance is significantly different.
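The restriction to the top-ranked models can be sketched as below. The sketch assumes that models are ranked by their loss (average loss for BMA, last-period loss for DMA), that only the `k` best receive inverse-loss weights, and that all other models receive zero weight. The function name and the example losses are hypothetical.

```python
import numpy as np


def restricted_weights(losses, k):
    """Inverse-loss weights over the k models with the smallest loss; zero elsewhere."""
    losses = np.asarray(losses, dtype=float)
    keep = np.argsort(losses)[:k]  # indices of the k best-ranked models
    weights = np.zeros_like(losses)
    inv = 1.0 / losses[keep]
    weights[keep] = inv / inv.sum()
    return weights


# Hypothetical losses for four models; keep the best two.
print(restricted_weights([0.4, 0.1, 0.2, 0.3], k=2))
```

Setting `k` equal to the number of models reproduces the unrestricted weighting, so the full and restricted specifications can be produced by the same function.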

The restrictions are based on the average selection in equations (8.2) and (8.3), starting with all ten models and then removing two models at a time, so that 10, 8, 6, 4 and 2 models are included in turn, in order to see whether prediction performance improves. A final point that should be highlighted is that all model-averaging predictions on the 2-weeks horizon consist of only 25 forecast observations. This casts doubt on whether the normality assumption holds on this horizon and should be considered a limitation of these results. On the 1-day and 1-week horizons, however, 70 forecast observations are obtained, and the sample is thus assumed to be large enough for the normality assumption to hold.

[14] Since the 2-weeks horizon consists of only 50 forecast observations, Bayesian model-averaging is based on 25 past observations in order to ensure that 25 forecast observations are obtained.

5. Data

High-frequency data for OMX30 between 2nd January 2007 and 28th December 2012 has been retrieved from the Thomson Reuters Tick Database. OMX30 consists of the 30 most traded stocks on the Nasdaq Stockholm stock exchange. The Swedish stock exchange has been selected since it is considered a mature market, so that the results can be compared to those based on U.S. data. OMX30 has been chosen due to its liquidity and since it is a convincing proxy of the market portfolio. The period has been chosen since it reflects a turbulent period on the stock market, in which both the financial crisis and the EU debt crisis occurred. The collected data consists of intraday stock prices at three different frequencies, 1-, 5- and 10-min, in which the final trade in each interval gives the price for that interval. Furthermore, the data has been thoroughly cleaned of observations not reflecting a full trading day. Opening hours on days before holidays are 09:00 to 13:00 on the Swedish stock exchange and do not reflect a whole trading period. For this reason these days have been removed in order to obtain a similar number of intraday observations on all days. The Swedish equity market is open between 09:00 and 17:30, and the data has been restricted to this time interval as well. The data consists of 1482 trading days and 313 trading weeks. The intraday observed prices have been transformed into logarithms. Thus all models are based on intraday log returns. As mentioned previously, 1-day, 1-week and 2-weeks horizons are predicted, and this section provides further details regarding each horizon.
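As an illustration of how a daily realized-volatility observation can be built from the intraday prices described above, the Python sketch below assumes the common definition of realized volatility as the square root of the sum of squared intraday log returns. The function name, the example prices and the sub-sampling of every fifth 1-min price to mimic a 5-min grid are assumptions for illustration only.

```python
import numpy as np


def realized_volatility(prices):
    """Square root of the sum of squared intraday log returns (assumed definition)."""
    log_prices = np.log(np.asarray(prices, dtype=float))
    log_returns = np.diff(log_prices)
    return np.sqrt(np.sum(log_returns ** 2))


# Hypothetical 1-min prices for part of a trading day.
prices_1min = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0]
rv_1min = realized_volatility(prices_1min)
rv_5min = realized_volatility(prices_1min[::5])  # keep every fifth price
print(rv_1min, rv_5min)
```

Sampling more sparsely uses fewer returns per day, which is why the choice of sample frequency can change the measured volatility.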

5.1 Daily Realized Volatility

Table 1 reports the statistical properties of realized volatility on the 1-day horizon for the 1-min, 5-min and 10-min frequencies. The table shows that the frequencies are quite similar in terms of quartiles and medians. In terms of min and max the differences are more obvious: the 10-min frequency has the smallest observed min and the largest observed max. It is also worth noting that the standard deviation is larger on the 5- and 10-min frequencies than on the 1-min frequency, suggesting that the variation of realized volatility is larger on lower frequencies. In terms of skewness all three frequencies show a similar pattern, with values a little above 2, indicating that the distributions are right skewed and not normal. The kurtosis is also considerably larger than 3, revealing that the data contains more outliers than would be the case under a normal distribution.

Table 1. Daily Realized Volatility

SAMPLE-FREQUENCY      1-MIN       5-MIN       10-MIN
Min                   .003339     .003124     .002762
1st Quartile          .007709     .007572     .007250
Median                .010113     .010434     .010015
Mean                  .011393     .011812     .011595
3rd Quartile          .013361     .014158     .013934
Max                   .057814     .061586     .070780
Standard Deviation    .00569      .006228     .006473
Skewness              2.241143    2.242893    2.442556
Kurtosis              11.91032    12.10872    14.23351
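The skewness and kurtosis figures in the table are compared against the normal benchmarks of 0 and 3. The Python sketch below shows how such moment-based statistics can be computed. It uses population (non-bias-corrected) moments, which is an assumption; the software used in this paper may apply a different correction, so small numerical differences are possible.

```python
import numpy as np


def skewness(x):
    """Third standardized moment; 0 for a symmetric distribution."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return np.mean(d ** 3) / np.mean(d ** 2) ** 1.5


def kurtosis(x):
    """Fourth standardized moment (not excess kurtosis); 3 for a normal distribution."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return np.mean(d ** 4) / np.mean(d ** 2) ** 2


# Hypothetical sample with one large value, giving positive skewness.
sample = [1.0, 2.0, 3.0, 4.0, 10.0]
print(skewness(sample), kurtosis(sample))
```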

Graphs 1 to 6 illustrate the daily data in a plot and a histogram for each frequency. The distribution is similar for all frequencies. Graphs 1, 3 and 5 show that volatility peaked during 2008, 2009 and 2011, the periods in which the crises mentioned above occurred. Graphs 2, 4 and 6 reveal, in line with expectations, that realized volatility is right skewed.

[Graphs 1 to 6: plots and histograms of daily realized volatility for each frequency. Axis label: Realized Volatility]


5.2 Weekly Realized Volatility

Table 2 has the same structure as table 1, but reports realized volatility on the 1-week horizon. Table 2 shows that 1-min data has a smaller range than 5-min and 10-min data. Similar to table 1, it also shows a smaller standard deviation for 1-min data than for 5-min and 10-min data. It is also worth mentioning that even though skewness is still above zero and kurtosis is larger than 3, both take smaller values than when measured on a daily basis. This implies that even though weekly realized volatility is still not normally distributed, it seems to be closer to a normal distribution than daily realized volatility.

Table 2. Weekly Realized Volatility

SAMPLE-FREQUENCY      1-MIN        5-MIN        10-MIN
Min                   .007517      .006947      .006196
1st Quartile          .017346      .016984      .016530
Median                .022692      .022992      .022498
Mean                  .025036      .026025      .025375
3rd Quartile          .029297      .030693      .029998
Max                   .093881      .096257      .094299
Standard Deviation    .01189301    .01293827    .01268386
Skewness              1.885782     1.785964     1.807755
Kurtosis              8.545417     7.662383     7.766664

Graphs 7 to 12 have the same structure as the previous graphs for daily realized volatility. However, since the number of observations is reduced, the plots are less dense: they reflect only 313 weekly observations rather than 1482 daily observations. The plots in graphs 7, 9 and 11 show a very similar pattern, in which the crises previously mentioned stand out. The histograms in graphs 8, 10 and 12, however, seem to be less concentrated in the middle compared to graphs 2, 4 and 6, but are still right skewed.

[Graphs 7 to 12: plots and histograms of weekly realized volatility for each frequency. Axis label: Realized Volatility]


5.3 Two Weeks Realized Volatility

Table 3 reports the statistical properties of realized volatility on the 2-weeks horizon. The data shows a pattern similar to that of weekly realized volatility. However, in this case the largest min is found on the 5-min frequency rather than on the 1-min frequency. For standard deviation, skewness and kurtosis the pattern is very similar: variation is larger on lower frequencies, and all frequencies follow a right-skewed distribution with more outliers than under a normal distribution.


Table 3. Two Weeks Realized Volatility

SAMPLE-FREQUENCY      1-MIN        5-MIN        10-MIN
Min                   .01226       .01367       .01254
1st Quartile          .02542       .02425       .02363
Median                .03196       .03328       .03179
Mean                  .03575       .03721       .03670
3rd Quartile          .04049       .04307       .04384
Max                   .1202        .12470       .1332
Standard Deviation    .01623894    .01762449    .01817101
Skewness              1.836134     1.742474     1.890126
Kurtosis              8.125699     7.329424     8.377635

Graphs 13, 15 and 17 are even less dense than those in the previous subsections since they consist of only 156 observations. The pattern, however, is very similar to that of the daily and weekly horizons. Furthermore, the histograms in graphs 14, 16 and 18 are even less concentrated in the middle than for daily and weekly data. They also seem to contain a larger proportion of outliers with values larger than 0.12.

[Graphs 13 to 18: plots and histograms of two-week realized volatility for each frequency. Axis label: Realized Volatility]


6. Results

This section consists of two subsections that present the results for the two purposes of this study. The first subsection provides the results regarding the optimal frequency, in which the performance of all models on 1-min, 5-min and 10-min data is tested. Since the 1-min frequency shows the strongest forecast performance, the results in the second subsection show differences between the model-averaging approaches based on this frequency. This section also briefly explains the most important results of the robustness tests. Except for the robustness tests [15], all interpreted results are presented in tables at the end of section 6.2.

6.1 Optimal Frequency

The first results are presented in table 4. These results show the performance of each single model in terms of mean relative absolute error (MRAE), mean relative squared error (MRSE) and mean root relative squared error (MRRSE) on the 1-day, 1-week and 2-weeks horizons. Table 4 is separated by frequency in columns (2), (3) and (4), in which 1-, 5- and 10-min data, respectively, are used to forecast realized volatility. The bolded results, except for those in the final row, show the best performing model for each horizon, loss function and frequency. The last row, denoted average error, shows the total average error for each frequency across all horizons and models. All forecast errors presented in table 4 are significantly different from zero at the 1% level.

Table 4 reveals that HAR models consistently yield smaller forecast errors than GARCH models on the 1-week and 2-weeks horizons regardless of loss function and frequency. Furthermore, the forecast errors of the GARCH models are much larger than those of the HAR models in terms of MRSE and MRRSE on these horizons, but only slightly larger in terms of MRAE. This indicates that the size of the errors for the GARCH models is substantially larger than what MRAE reveals. For this reason one may suspect that models from the GARCH family are less stable in terms of performance on the 1-week and 2-weeks horizons. It is also worth noting that the extensions of the original HAR-RV model improve performance. Looking at the first five rows for each frequency, adding new components to the original HAR-RV decreases the forecast errors on almost all horizons. The LHAR-CJ-RV is the model that most frequently provides the smallest forecast error, which supports the importance of including jump components and considering the leverage effect when modeling volatility. The best performer in the GARCH family, however, is not as consistent. On the 1-day horizon, eGARCH shows signs of performing best among the GARCH models. On the 1-week and 2-weeks horizons, eGARCH and TGARCH alternate as the best performer. The fact that both of these models account for the different impact of small and large volatility of realized volatility and perform well implies that this might add value when forecasting realized volatility. Finally, the last row in table 4 shows that on average 1-min data provides smaller forecast errors than 5-min and 10-min data in terms of all loss functions. The final row also shows that 5-min data yields smaller forecast errors than 10-min data in terms of all loss functions.

[15] The main findings of the robustness checks are shown in appendix tables 9, 10 and 11. Full results are available on request.

Table 5 shows the difference in forecast performance of each model between the 1-min, 5-min and 10-min frequencies. In columns (2) and (3) the 1-min frequency is tested against the 5-min and 10-min frequencies, respectively, and in column (4) the 5-min frequency is tested against the 10-min frequency. A negative value implies that models based on the frequency being subtracted yield a larger forecast error than models based on the frequency not being subtracted, and a positive value indicates the opposite. The last row shows the average difference in forecast error between the frequencies being tested. * indicates significance at the 5% level, ** at the 2% level and *** at the 1% level.
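The sign convention and the significance test for such loss differences can be sketched as follows. The paper describes the test only as an OLS test, so the sketch assumes a regression of the per-observation loss difference on a constant, which reduces to a t-statistic for the mean difference without any correction for autocorrelation. The function name and the example losses are hypothetical.

```python
import math

import numpy as np


def mean_difference_test(loss_a, loss_b):
    """Mean of (loss_a - loss_b) and its t-statistic from an OLS regression on a constant."""
    d = np.asarray(loss_a, dtype=float) - np.asarray(loss_b, dtype=float)
    n = len(d)
    mean = d.mean()
    # OLS on a constant: the standard error of the intercept is s / sqrt(n), s with n-1 degrees of freedom
    se = d.std(ddof=1) / math.sqrt(n)
    return mean, mean / se


# Hypothetical per-forecast losses, e.g. 1-min data (a) against 5-min data (b).
loss_a = [1.0, 2.0, 3.0, 4.0]
loss_b = [2.0, 2.0, 5.0, 5.0]
print(mean_difference_test(loss_a, loss_b))
```

A negative mean difference means that the second series, the one being subtracted, has the larger loss, matching the sign convention of the table.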

Looking at column (2) in table 5 and the difference in forecast error for each single model between the 1-min and 5-min frequencies, no significant difference is obtained on any horizon. However, in terms of MRAE, MRSE and MRRSE the minus sign occurs 57%, 73% and 83% of the time, respectively. This indicates that 1-min data yields a smaller average forecast error than 5-min data in the majority of cases for all loss functions. The average difference in forecast error shown in the last row reveals that, in terms of all loss functions, the average forecast errors are larger on the 5-min frequency. Furthermore, these results are significant at the 5% level in terms of MRSE and MRRSE. Thus, the results suggest that a 1-min sample frequency yields smaller forecast errors than a 5-min sample frequency.

Moving to column (3) in table 5, the forecast error of 10-min data is subtracted from that of 1-min data. Also in this case a negative result indicates that the forecast error of 10-min data is on average larger than that of 1-min data. On the 1-day horizon, a majority of the results in terms of MRAE and MRRSE are significant in favor of the 1-min frequency. Even though the results in terms of MRSE on the 1-day horizon are not as convincing as those for MRAE and MRRSE, they still contain significant results, with all values being negative. For the 1-week and 2-weeks horizons no significant results are found; it is, however, worth mentioning that only 3 out of 60 results are positive, giving some support for 1-min data. Finally, the last row shows that in terms of all loss functions, the 1-min frequency yields significantly smaller forecast errors than the 10-min frequency.

In column (4) of table 5, 5-min data is tested against 10-min data. In this case, performance is particularly in favor of 5-min data on the 1-day horizon, since all loss functions yield negative results, of which a majority are significant. Furthermore, across all loss functions on the 1-week and 2-weeks horizons, only 5 out of 60 observations are positive. These findings suggest that using 10-min data to forecast realized volatility yields larger prediction errors than using 5-min data. Finally, the last row in this column supports this further by showing that in terms of all loss functions, 5-min data significantly outperforms 10-min data.

To summarize the findings of tables 4 and 5: according to table 4 there seems to be evidence supporting HAR models over AR+GARCH models on the 1-week and 2-weeks horizons. This difference, however, is never statistically tested. Furthermore, a strong majority of the models including leverage effects and jumps generate smaller forecast errors than those that do not. Of particular interest for the purpose of this paper are the results in table 5, which yield strong evidence in favor of using 1-min or 5-min data rather than 10-min data. The results also give support for using 1-min data rather than 5-min data. These results are not as statistically convincing as those against the 10-min frequency, but they still show a difference that is significant at the 5% level in terms of MRSE and MRRSE. Thus, the empirical evidence is still strongly in favor of 1-min data. For this reason, the next section, which concerns model-averaging, uses the 1-min frequency as the underlying data.


6.2 Model-Averaging

Moving to the second purpose of this study, table 6 illustrates how single models perform compared to averaged models based on all ten models and the 1-min frequency. Table 6 shows the difference in prediction error between single models and combined models in terms of MRAE, MRSE and MRRSE. The results in columns (2), (3) and (4) show the difference in performance between single models and mean forecast combinations (MFC), Bayesian model-averaging (BMA) and dynamic model-averaging (DMA), respectively, on each horizon. A negative sign implies that the combined model provides a larger forecast error than the single model. The final row, named “AVERAGE DIFFERENCE”, shows the average difference in forecast error between combined models and all single models on all horizons.

Starting with the performance of mean forecast combinations (MFC) in column (2) of table 6, all signs are negative on all horizons, except for 6 observations on the 1-day horizon, implying that MFC forecast errors are larger than those of the individual models in almost all cases. For the 1-week and 2-weeks horizons almost all results are negative and significant at the 5% level or better, a strong indication that MFC does not provide accurate forecasts. The results in the final row further confirm that MFC on average performs significantly worse than the individual models. Thus, from a statistical standpoint it is possible to conclude that averaging models with equal weights, not based on any performance indicator, does not deal with model uncertainty very efficiently, in the sense that almost all single models perform better on average.

The results are more divided when moving from column (2) to column (3) in table 6, which shows the difference in forecast errors between Bayesian model-averaging (BMA) and single models. On the 1-day horizon, no significant results are obtained; in terms of MRAE, 80% of the results are positive, indicating support for BMA, whereas in terms of MRSE and MRRSE only 50% are positive. The results for the 1-week horizon are very distinctive. Except for HAR-RV in terms of MRAE, the results for all models from the HAR family are negative, and in terms of MRSE and MRRSE they are significant at the 5% level or better. The results for all models from the GARCH family, however, are positive and, in terms of MRSE and MRRSE, significant at the 5% level or better. For the 2-weeks horizon, BMA performs better than 50% of the models in terms of all loss functions. Thus, across all horizons, BMA provides better forecasts than at least 50% of the single models for all loss functions. Furthermore, the average difference shown in the last row indicates that BMA on average yields smaller forecast errors than single models. Thus, even though these results are not significant, they still suggest that BMA might help deal with model uncertainty.

Finally, column (4) of table 6 shows the difference in performance between single models and dynamic model-averaging (DMA). On the 1-day horizon, a majority of the values are negative in terms of all loss functions, which suggests that applying DMA for 1-day forecasts might not be very efficient. However, none of the values on this horizon are significant. For the 1-week horizon the results are very similar to those shown for BMA: the HAR family outperforms DMA, which in turn outperforms the GARCH family. All of these results are significant in terms of MRSE and MRRSE, while none are significant in terms of MRAE. For the 2-weeks horizon, the results are in favor of DMA. In terms of MRAE and MRSE, a majority of the results are positive but not significant. In terms of MRRSE, DMA performs significantly better than 50% of the single models and is outperformed by the other half. For DMA, too, the final row, indicating the average difference in forecast errors across all models and horizons, shows positive results in terms of all loss functions. These results are not significant, but still imply that DMA might deal with model uncertainty. It is also worth noting that the value in the last row is larger for DMA than for BMA for all loss functions, indicating that the error on average is smaller for DMA as well.

The results from table 6 are twofold. First, an averaging approach that does not consider forecast performance, such as MFC, does not deal with model uncertainty and gives significantly worse forecasts than the single models in a strong majority of cases. Second, BMA on average generates better predictions than at least 50% of the single models on all horizons. The total average difference in the last row is also positive, but not significant, for both approaches, indicating weak evidence in favor of both BMA and DMA. However, this table does not consider what restrictions should be set for DMA and BMA in terms of the number of models included in the averaging, nor does it consider which approach is better than the other. These results are shown in tables 7 and 8.

Table 7 illustrates the difference in performance between full models (BMA 10 and DMA 10), in which all models are included, and restricted models, in which only the top 8, 6, 4 or 2 models are included in the averaging process. The table is split to show the performance of the BMA and DMA specifications separately: the BMA specifications are shown in the first ten rows and the DMA specifications in the last ten rows. Each specification has been evaluated and ranked on each horizon and loss function. Even though the difference is not significant in most cases, it is still interesting to identify which restriction produces the smallest forecast error on average, in order to see how the corresponding BMA and DMA specifications perform against each other in table 8.
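As a rough illustration of what a restricted specification does, the sketch below ranks models by a past loss and averages only the top k. The past losses are the 1-day, 1-min MRAE values from table 4 for four of the models, used purely as example inputs; the forecasts are hypothetical, and the equal weighting of the selected models is an assumption made for brevity, not the BMA or DMA weighting used in the paper.

```python
def top_k_models(past_losses, k):
    """Names of the k models with the smallest past loss."""
    return sorted(past_losses, key=past_losses.get)[:k]


def restricted_combination(forecasts, past_losses, k):
    """Average the forecasts of the k best models (equal weights assumed)."""
    selected = top_k_models(past_losses, k)
    return sum(forecasts[name] for name in selected) / len(selected)


# Example inputs: past losses from table 4, forecasts hypothetical.
past_losses = {
    "HAR-RV": 0.212,
    "HARJ-RV": 0.222,
    "LHARCJ-RV": 0.200,
    "AR(3)+GARCH(1,1)": 0.207,
}
forecasts = {
    "HAR-RV": 0.011,
    "HARJ-RV": 0.013,
    "LHARCJ-RV": 0.010,
    "AR(3)+GARCH(1,1)": 0.012,
}

full = restricted_combination(forecasts, past_losses, k=4)
top_two = restricted_combination(forecasts, past_losses, k=2)
print(top_k_models(past_losses, 2), full, top_two)
```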

Table 7 reveals that on the 1-day horizon, column (2), the full model consisting of all ten single models is top ranked in terms of all loss functions for both BMA and DMA. This implies that when forecasting 1 day ahead, including more rather than fewer models adds value in terms of forecast performance. On the 1-week horizon, column (3), restricting the averaging process to 4 models for BMA and 2 models for DMA provides the smallest forecast error in terms of MRSE and MRRSE. For BMA these results are significant at the 5% level or better for all specifications except BMA 2 in terms of MRSE. For DMA no results are significant. In terms of MRAE, BMA 4 and DMA 10 are top ranked, with no significant results in either case. On the 2-weeks horizon, column (4), the top-ranked specifications differ more between BMA and DMA. In terms of MRAE, BMA 10 and DMA 8 are the top performers, whereas in terms of MRSE and MRRSE, BMA 2 and DMA 10 are the top performers. In terms of MRRSE, BMA 2 is significant at the 2% level in all cases.

The final table, table 8, shows the difference in forecast performance between all the BMA and DMA specifications presented in table 7. The results are divided into 1-day, 1-week and 2-weeks horizons in columns (2), (3) and (4), respectively. The final row, “AVERAGE DIFFERENCE”, shows the average difference between BMA and DMA across the different model restrictions. Note, however, that the final row is based on only 25 observations, so its implications should be treated with caution. In table 8, a negative value implies that the specified BMA yields a smaller forecast error than the specified DMA, and a positive value implies the opposite. The bolded results show the difference between the top performers of BMA and DMA identified in table 7.
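The layout and sign convention of table 8 can be sketched as follows. Five BMA specifications crossed with five DMA specifications give the 25 pairwise differences on which the final row is based. The loss values are hypothetical placeholders and the function names are illustrative only.

```python
def difference_table(bma_losses, dma_losses):
    """Loss of each BMA specification minus loss of each DMA specification.

    A negative entry means the BMA specification has the smaller forecast error.
    """
    return {
        (b, d): bma_losses[b] - dma_losses[d]
        for b in bma_losses
        for d in dma_losses
    }


def average_difference(table):
    """Mean of all pairwise differences (the final row of the table)."""
    return sum(table.values()) / len(table)


# Hypothetical average losses, keyed by the number of models included.
bma_losses = {10: 0.100, 8: 0.105, 6: 0.110, 4: 0.115, 2: 0.120}
dma_losses = {10: 0.118, 8: 0.119, 6: 0.121, 4: 0.123, 2: 0.124}

table = difference_table(bma_losses, dma_losses)
print(len(table), average_difference(table))
```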

The results for the 1-day forecasts in column (2) of table 8 show that all values are negative for all loss functions, except for a few observations for MRRSE. Comparing the top performers, which are BMA 10 and DMA 10 for all loss functions, the difference in performance is negative and significant at the 5% level or better in all cases. The final row shows that the average difference across the model combinations is negative for all loss functions, and significant at the 1% level in terms of MRAE and MRSE. Thus, for the 1-day horizon there is strong evidence that BMA provides significantly better forecasts than DMA, both for the top performers and on average. These findings suggest that forecast performance on the 1-day horizon is less time-varying, so that the benefits of DMA are less needed.

For the 1-week horizon, the results in column (3) of table 8 also show that BMA yields a smaller forecast error than DMA. In this case, however, a majority of the MRAE results are positive, although not significant. The bolded MRAE result, which compares the top performers of the two approaches, takes a very small positive value, implying that DMA slightly outperforms BMA when top performers are compared. When the top performers are compared in terms of MRSE and MRRSE, however, BMA outperforms DMA at the 5% significance level or better. The results in the final row also favor BMA in terms of all loss functions and are significant at the 2% level or better in terms of MRSE and MRRSE. Thus, according to column (3) of table 8, BMA has support on the 1-week horizon as well.

Finally, the results on the 2-weeks horizon in column (4) of table 8 differ from those on the other two horizons. In terms of MRAE there is strong evidence in favor of DMA: all values are positive and the final row is significant at the 1% level. In terms of MRSE and MRRSE, however, the evidence for DMA is weaker, even though a majority of the MRSE results are positive. Moreover, the comparison of the top performers still shows that BMA outperforms DMA in terms of MRSE and MRRSE. The final row shows that in terms of MRSE, DMA yields on average a smaller forecast error than BMA, but this result is not significant. In terms of MRRSE the value in the final row is negative but not significant, which makes it hard to conclude anything with certainty about the relative forecast performance of BMA and DMA on this horizon. Thus, for the 2-weeks horizon there seems to be some support in favor of DMA, but the evidence is weak and should be treated with caution. One possible reason why DMA performs better relative to BMA on the 2-weeks horizon is that forecast performance on this horizon is more time-varying than on the 1-day and 1-week horizons.

Summarizing the model-averaging results, table 6 established that MFC does not deal with model uncertainty and is on average significantly outperformed by single models regardless of horizon. Furthermore, the results from table 6 indicate that BMA and DMA perform better than at least 50% of the single models on all horizons in terms of all loss functions, which gives some support to the view that model-averaging based on historical performance deals with model uncertainty. However, a majority of these results are not significant. Table 7 established that the different specifications differ in performance and that the best one is not always the combination containing the most models. However, the statistical evidence on how much the number of included models affects forecast performance is not particularly strong. Finally, the results in table 8 favor BMA on the 1-day and 1-week horizons. For the 2-weeks horizon the evidence is more mixed: in terms of MRAE there is significant evidence in favor of DMA, but in terms of MRSE and MRRSE the evidence is not as convincing.

In order to check the robustness of the results, two robustness checks were carried out. In the first, the whole process was repeated on an All-share index¹⁶. The results were strikingly similar to those presented in tables 4-8, with some small differences. Compared to the results in table 5, the forecast error for 1-min data was even more significantly smaller than for 5-min and 10-min data, while the difference between 5-min and 10-min data was less significant, although still in favor of 5-min data. Furthermore, compared to the results in table 8, the robustness check was even more favorable to Bayesian model-averaging on the 1-day and 1-week horizons, while the 2-weeks horizon results were somewhat more in favor of DMA¹⁷. The second robustness check was based on OMXS30 in a period after the financial crisis. The main difference in this period was that no significant difference between the 1-min and 5-min sample-frequencies could be established. However, both 1-min and 5-min data outperform the 10-min frequency. Since the sample is restricted, this robustness check could not be run on the 2-weeks horizon, but for the 1-day and 1-week horizons the results were significantly in favor of BMA¹⁸. In summary, the robustness checks support the main findings of this paper to a large extent.

Below are all the results, presented in the tables referred to in this section. Tables 4 and 5 show the forecast error and the difference in forecast error of models based on different sample-frequencies. Tables 6, 7 and 8 show how different model-averaging approaches perform compared to single models and then how different restrictions of Bayesian and dynamic model-averaging perform internally and against each other.

16 An index that includes all large, mid and small cap stocks listed on Nasdaq Stockholm.

17 The main results of the robustness check on the All-share index are shown in the appendix, column (1) of tables 9, 10 and 11.

18 The main findings of the robustness check based on OMXS30 data after the financial crisis are shown in the appendix, column (2) of tables 9, 10 and 11.

Table 4. Single Model Forecast Error on 1-, 5- and 10-min frequency

MRAE = Mean Relative Absolute Error
MRSE = Mean Relative Square Error × 100
MRRSE = Mean Root Relative Square Error × 10
All results in the table are significantly different from zero at the 1% level.

(1) Model    1-Min (2): MRAE MRSE MRRSE    5-Min (3): MRAE MRSE MRRSE    10-Min (4): MRAE MRSE MRRSE

1-Day Ahead (n=100)

HAR-RV .212 .104 .234 .216 .123 .254 .249 .152 .287

HARJ-RV .222 .108 .245 .21 .115 .247 .251 .144 .289

HARCJ-RV .214 .101 .234 .206 .112 .242 .245 .138 .282

LHARJ-RV .202 .093 .223 .198 .106 .234 .240 .130 .277

LHARCJ-RV .200 .089 .22 .195 .103 .231 .235 .126 .271

AR(3)+GARCH(1,1) .207 .098 .219 .223 .096 .234 .249 .115 .257

AR(3)+iGARCH(1,1) .207 .098 .219 .224 .096 .235 .249 .119 .258

AR(3)+GJRGARCH(1,1) .207 .097 .218 .226 .095 .235 .245 .113 .255

AR(3)+eGARCH(1,1) .207 .094 .217 .227 .095 .235 .25 .114 .256

AR(3)+TGARCH(1,1) .207 .096 .218 .227 .096 .236 .249 .113 .256

1-Week Ahead (n=100)

HAR-RV .191 .063 .199 .194 .077 .209 .199 .082 .214

HARJ-RV .188 .062 .196 .181 .065 .197 .189 .069 .203

HARCJ-RV .183 .059 .191 .182 .066 .198 .188 .068 .201

LHARJ-RV .179 .059 .191 .180 .064 .200 .186 .069 .204

LHARCJ-RV .176 .057 .188 .179 .064 .199 .184 .069 .201

AR(3)+GARCH(1,1) .199 .159 .298 .193 .149 .292 .202 .169 .305

AR(3)+iGARCH(1,1) .205 .169 .304 .194 .148 .293 .202 .169 .305

AR(3)+GJRGARCH(1,1) .193 .147 .288 .193 .148 .291 .191 .157 .285

AR(3)+eGARCH(1,1) .195 .148 .287 .186 .146 .277 .196 .154 .288

AR(3)+TGARCH(1,1) .192 .147 .287 .192 .151 .291 .199 .171 .298

2-Weeks Ahead (n=50)

HAR-RV .189 .085 .202 .189 .081 .203 .201 .087 .215

HARJ-RV .185 .08 .199 .187 .086 .206 .204 .091 .22

HARCJ-RV .179 .073 .192 .181 .081 .198 .199 .087 .215

LHARJ-RV .167 .073 .177 .173 .076 .189 .187 .082 .200

LHARCJ-RV .162 .068 .172 .17 .072 .184 .184 .078 .197

AR(3)+GARCH(1,1) .198 .213 .323 .194 .225 .322 .201 .232 .328

AR(3)+iGARCH(1,1) .197 .216 .322 .204 .238 .337 .215 .251 .349

AR(3)+GJRGARCH(1,1) .193 .21 .326 .197 .221 .332 .196 .222 .326

AR(3)+eGARCH(1,1) .195 .218 .331 .201 .241 .341 .199 .225 .336

AR(3)+TGARCH(1,1) .200 .218 .330 .194 .219 .326 .204 .231 .336

AVERAGE ERROR .195 .117 .242 .197 .122 .249 .213 .134 .264

Note: Column (1) shows each single model used to forecast realized volatility separated by 1-day, 1-week and 2-weeks horizons. Columns (2), (3) and (4) show the forecast error for each single model for 1-min, 5-min and 10-min frequencies, respectively. The forecast error on each frequency is separated by three different loss functions; MRAE, MRSE and MRRSE. Except for the last row, all bolded results denote the best performing model in each frequency, loss function and horizon. The last row shows the average error based on all models and all horizons (n=30).
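For readers who want to reproduce loss figures of this kind, the following is a minimal sketch of two of the loss functions under one common reading of their names: the relative error is taken as (forecast − actual) / actual. This reading is an assumption, since the exact definitions (and that of MRRSE, which is not sketched here) are given in an earlier section of the paper, and the scaling factors in the table legend are not applied.

```python
def relative_errors(forecasts, actuals):
    """Forecast error relative to the realized value (assumed definition)."""
    return [(f - a) / a for f, a in zip(forecasts, actuals)]


def mrae(forecasts, actuals):
    """Mean relative absolute error."""
    errors = relative_errors(forecasts, actuals)
    return sum(abs(e) for e in errors) / len(errors)


def mrse(forecasts, actuals):
    """Mean relative square error."""
    errors = relative_errors(forecasts, actuals)
    return sum(e * e for e in errors) / len(errors)


# Hypothetical forecasts and realized values.
example_forecasts = [1.1, 0.9]
example_actuals = [1.0, 1.0]
print(mrae(example_forecasts, example_actuals), mrse(example_forecasts, example_actuals))
```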

Table 5. Forecast Difference Between Different Frequencies

MRAE = Mean Relative Absolute Error × 100
MRSE = Mean Relative Square Error × 1000
MRRSE = Mean Root Relative Square Error × 100
***1% significance, **2% significance and *5% significance

(1) Model    (1-min) – (5-min) (2): MRAE MRSE MRRSE    (1-min) – (10-min) (3): MRAE MRSE MRRSE    (5-min) – (10-min) (4): MRAE MRSE MRRSE

1-Day Ahead (n=100)
HAR-RV -.382 -.197 -.199 -3.66 -.48* -.537*** -3.278*** -.283*** -.338***
HARJ-RV 1.262 -.069 -.019 -2.884 -.363* -.443** -4.146*** -.294* -.424***
HARCJ-RV .812 -.117 -.082 -3.120 -.375 -.485** -3.932*** -.258 -.403**
LHARJ-RV .466 -.126 -.112 -3.771** -.37*** -.538*** -4.237*** -.246** -.426***
LHARCJ-RV .493 -.142 -.118 -3.509* -.372* -.512*** -4.002*** -.23* -.394***
AR(3)+GARCH(1,1) -1.591 .015 -.15 -4.151*** -.173 -.371** -2.560* -.188* -.221*
AR(3)+iGARCH(1,1) -1.635 .021 -.158 -4.171*** -.205 -.387*** -2.536* -.226*** -.229*
AR(3)+GJRGARCH(1,1) -1.924 .011 -.169 -3.76*** -.166 -.37*** -1.831 -.178* -.200*
AR(3)+eGARCH(1,1) -2.047 -.01 -.183 -4.305*** -.202 -.395*** -2.258 -.193* -.212
AR(3)+TGARCH(1,1) -2.022 .007 -.178 -4.252*** -.163 -.388*** -2.23 -.17* -.21*

1-Week Ahead (n=100)
HAR-RV -.293 -.132 -.103 -.768 -.184 -.159 -.475 -.052 -.056
HARJ-RV .729 -.027 -.005 -.102 -.071 -.068 -.832 -.045 -.063
HARCJ-RV .067 -.069 -.07 -.508 -.093 -.101 -.575 -.023 -.031
LHARJ-RV -.078 -.048 -.093 -.699 -.101 -.132 -.621 -.054 -.039
LHARCJ-RV -.336 -.068 -.115 -.807 -.113 -.135 -.47 -.045 -.02
AR(3)+GARCH(1,1) .615 .107 .062 -.272 -.097 -.071 -.887 -.204* -.133
AR(3)+iGARCH(1,1) 1.119 .208 .112 .327 -.005 -.006 -.792 -.213** -.118
AR(3)+GJRGARCH(1,1) -.05 -.008 -.033 .219 -.101 .03 .269 -.093 .063
AR(3)+eGARCH(1,1) .86 .02 .097 -.095 -.062 -.011 -.955 -.082 -.108
AR(3)+TGARCH(1,1) .008 -.042 -.036 -.657 -.238 -.109 -.665 -.196* -.073

2-Weeks Ahead (n=50)
HAR-RV .005 .038 -.012 -1.164 .02 -.131 -1.169 -.058 -.12
HARJ-RV -.203 -.053 -.069 -1.829 -.106 -.212 -1.63 -.053 -.143
HARCJ-RV -.185 -.071 -.055 -1.965 -.139 -.225 -1.781 -.053 -.17
LHARJ-RV -.652 -.028 -.113 -1.983 -.085 -.231 -1.331 -.058 -.118
LHARCJ-RV -.756 -.04 -.116 -2.150 -.100 -.244 -1.394 -.06 -.128
AR(3)+GARCH(1,1) .358 -.122 .006 -.3016 -.192 -.05 -.66 -.07 -.057
AR(3)+iGARCH(1,1) -.649 -.222 -.146 -1.733 -.353 -.263 -1.084 -.131 -.117
AR(3)+GJRGARCH(1,1) -.466 -.116 -.054 -.386 -.123 0 .08 -.007 .054
AR(3)+eGARCH(1,1) -.63 -.228 -.096 -.415 -.07 -.054 .215 .158 .043
AR(3)+TGARCH(1,1) .612 -.007 .039 -.335 -.122 -.062 -.947 -.115 -.101

AVERAGE DIFFERENCE -.216 -.05* -.072*** -1.773* -.175*** -.222*** -1.557* -.125*** -.15**

Note: Column (1) shows each single model used to forecast realized volatility separated by 1-day, 1-week and 2-weeks horizons. Column (2) shows the difference in forecast error between 1-min and 5-min data, where negative values imply that 5-min data yields larger average forecast error. Column (3) shows the difference in forecast error between 1-min and 10-min data, where negative values imply that 10-min data yields larger average forecast error. Column (4) shows the difference in forecast error between 5-min and 10-min data, where negative values imply that 10-min data yields larger average forecast error. The final row “AVERAGE DIFFERENCE” shows the average difference between each pair of frequencies based on all models and horizons (n=30).

Table 6. Forecast Difference Between Individual and Averaged Models on 1-min Frequency

MRAE = Mean Relative Absolute Error × 10
MRSE = Mean Relative Square Error × 100
MRRSE = Mean Root Relative Square Error × 10
MFC = Mean Forecast Combination
BMA = Bayesian Model-Averaging
DMA = Dynamic Model-Averaging
***1% significance, **2% significance and *5% significance

(1) Model    (-)MFC (2): MRAE MRSE MRRSE    (-)BMA (3): MRAE MRSE MRRSE    (-)DMA (4): MRAE MRSE MRRSE

1-Day Ahead (n=70)
HAR-RV -.126 .013 .012 .106 .017 .024 .013 .007 .013
HARJ-RV -.083 .015 .016 .148 .019 .028 .055 .008 .017
HARCJ-RV -.138 .013 .01 .094 .017 .021 .002 .007 .011
LHARJ-RV -.298 -.002 -.007 -.066 .002 .005 -.159 -.008 -.005
LHARCJ-RV -.306 -.003 -.008 -.074 .001 .004 -.166 -.009 -.006
AR(3)+GARCH(1,1) -.208 -.009 -.014 .024 -.005 -.002 -.069 -.015 -.013
AR(3)+iGARCH(1,1) -.203 -.008 -.013 .029 -.004 -.002 -.064 -.014 -.012
AR(3)+GJRGARCH(1,1) -.204* -.007 -.014 .028 -.003 -.002 -.064 -.013 -.012
AR(3)+eGARCH(1,1) -.2* -.007 -.013 .031 -.003 -.001 -.061 -.013 -.012
AR(3)+TGARCH(1,1) -.2 -.007 -.013 .032 -.003 -.002 -.061 -.013 -.012

1-Week Ahead (n=70)
HAR-RV -.565* -.162*** -.195*** .01 -.022* -.035*** .047 -.047* -.037*
HARJ-RV -.676** -.168*** -.206*** -.101 -078*** -.045*** -.064 -.053** -.048***
HARCJ-RV -.719*** -.171*** -.209*** -.144 -.031*** -.049*** -.107 -.056** -.052***
LHARJ-RV -.741*** -.172*** -.209*** -.166 -.003*** -.049*** -.129 -.057** -.052***
LHARCJ-RV -.761*** -.173*** -.211*** -.186 -.033*** -.051*** -.15 -.058** -.054***
AR(3)+GARCH(1,1) -.434* -.057* -.085*** .142 .083** .075*** .179 .058*** .072***
AR(3)+iGARCH(1,1) -.366 -.047** -.08** .209 .093*** .08*** .245 .068*** .077***
AR(3)+GJRGARCH(1,1) -.527** -.072* -.101*** .048 .068* .06*** .085 .043*** .057***
AR(3)+eGARCH(1,1) -.504 -.078* -.102** .071 .062*** .058*** .108 .037*** .055***
AR(3)+TGARCH(1,1) -.546** -.08** -.104*** .029 .06*** .056*** .066 .035*** .053***

2-Weeks Ahead (n=25)
HAR-RV -.696 -.188*** -.263*** .157 -.051 -.034 .314 .008 -.004
HARJ-RV -.714* -.194*** -.264*** .139 -.057 -.035 .296 .002 -.005
HARCJ-RV -.804*** -.206*** -.274*** .049 -.07 -.045* .206 -.011 -.015
LHARJ-RV -.953*** -.202*** -.291*** -.100 -.066 -.061*** .057 -.007 -.032
LHARCJ-RV -1.04*** -.213*** -.3*** -.184 -.076 -.071*** -.027 -.017 -.041
AR(3)+GARCH(1,1) -.907* -.086 -.192** -.053 .050 .037 .103 .109 .067*
AR(3)+iGARCH(1,1) -.915* -.08 -.193** -.061 .057 .037 .095 .115 .066**
AR(3)+GJRGARCH(1,1) -.845* -.082 -.177** .009 .054 .053 .165 .113 .082*
AR(3)+eGARCH(1,1) -.825* -.071 -.173** .029 .066 .056 .185 .124 .085**
AR(3)+TGARCH(1,1) -.891* -.078 -.187** -.038 .058 .042 .119 .117 .071*

AVERAGE DIFFERENCE -.546** -.0861* -.129* .007 .007 .005 .041 .015 .01

Note: Column (1) shows each single model used to forecast realized volatility separated by 1-day, 1-week and 2-weeks horizons. Column (2) shows the difference in forecast error between single models and MFC, where negative values imply that MFC yields larger average forecast error. Column (3) shows the difference in forecast error between single models and BMA, where negative values imply that BMA yields larger average forecast error. Column (4) shows the difference in forecast error between single models and DMA, where negative values imply that DMA yields larger average forecast error. The last row “AVERAGE DIFFERENCE” shows the average difference between single models and combined forecasts based on all models and horizons (n=30). Also note that the results are based on combined models consisting of all 10 single models on each horizon.

Table 7. Forecast Difference of Different Model-Averaging Restrictions on 1-min Frequency

MRAE = Mean Relative Absolute Error × 100
MRSE = Mean Relative Square Error × 1000
MRRSE = Mean Root Relative Square Error × 100
BMA = Bayesian Model-Averaging
DMA = Dynamic Model-Averaging
***1% significance, **2% significance and *5% significance

(1) Specification    1-Day (2) (n=70): MRAE MRSE MRRSE    1-Week (3) (n=70): MRAE MRSE MRRSE    2-Weeks (4) (n=25): MRAE MRSE MRRSE

BAYESIAN

BMA 10 – BMA 8 -.385 -.035 -.091 .349 .092 .144* -.467 .405 .24

BMA 10 – BMA 6 -.612 -.027 -.113 .201 .021*** .327*** -.61 .549 .388**

BMA 10 – BMA 4 -.814 -.06 -.167 .366 .297*** .459*** -.494 .651 .514**

BMA 10 – BMA 2 -.44 -.052 -.076 -.107 .285*** .377*** -.249 .694 .661**

BMA 8 – BMA 6 -.227 .008 -.022 -.148 .122*** .183*** -.141 .143 .147

BMA 8 – BMA 4 -.429 -.025 -.076 .017 .205*** .315*** -.025 .246 .274*

BMA 8 – BMA 2 -.055 -.018 .015 -.456 .193*** .233*** .22 .289 .421***

BMA 6 – BMA 4 -.203 -.033 -.054 .165 .825*** .133*** .116 .102 .126

BMA 6 – BMA 2 .172 -.026 .037 -.307 .071* .05 .361 .145 .274***

BMA 4 – BMA 2 .375 .0076 .091 -.473 -.012 -.082* .245 .043 .147***

BEST PERFORMER BMA 10 BMA 10 BMA 10 BMA 4 BMA 4 BMA 4 BMA 10 BMA 2 BMA 2

DYNAMIC

DMA 10 – DMA 8 -.253 -.004 -.007 -.121 .0104 .009 .167 -.003 -.011

DMA 10 – DMA 6 -.51 -.013 -.019 -.32 .023 .013 .049 -.005 -.012

DMA 10 – DMA 4 -.795 -.024 -.041 -.189 .031 .036 .017 -.018 -.042

DMA 10 – DMA 2 -.74 -.031 -.03 -.431 .038 .052 -.008 -.044 -.087

DMA 8 – DMA 6 -.258 -.009 -.012 -.199 .013 .004 -.118 -.002 -.002

DMA 8 – DMA 4 -.542 -.02 -.034 -.068 .021 .027 -.15 -.014 -.031

DMA 8 – DMA 2 -.487 -.003* -.022 -.31 .028 .043 -.175 -.041* -.076

DMA 6 – DMA 4 -.284 -.011 -.023 .132 .008 .023 -.032 -.013 -.03

DMA 6 – DMA 2 -.23 -.019 -.011 -.11 .015 .039 -.057 -.04*** -.074

DMA 4 – DMA 2 .055 -.008 .012 -.242 .007 .016 -.023 -.027 -.045*

BEST PERFORMER DMA 10 DMA 10 DMA 10 DMA 10 DMA 2 DMA 2 DMA 8 DMA 10 DMA 10

Note: Column (1) shows different restrictions of the BMA and DMA models, where the number after each name indicates the number of models included in the model-averaging process. The first ten rows show different restrictions for BMA models and the last ten rows show different restrictions for DMA models. Columns (2), (3) and (4) show the difference in forecast error for 1-day, 1-week and 2-weeks horizons, respectively, separated by loss function: MRAE, MRSE and MRRSE. The final row “BEST PERFORMER” shows the models that on average yield the smallest forecast error for each loss function and horizon.

Table 8. Forecast Difference Between Different BMA and DMA Specifications on 1-min Frequency

MRAE = Mean Relative Absolute Error × 10
MRSE = Mean Relative Square Error × 1000
MRRSE = Mean Root Relative Square Error × 100
BMA = Bayesian Model-Averaging
DMA = Dynamic Model-Averaging
***1% significance, **2% significance and *5% significance

(1) Specification    1-Day (2) (n=70): MRAE MRSE MRRSE    1-Week (3) (n=70): MRAE MRSE MRRSE    2-Weeks (4) (n=25): MRAE MRSE MRRSE

BAYESIAN – DYNAMIC

BMA 10 – DMA 10 -.093* -.1*** -.103* .037 -.248 -.029 .157* .588 .294***

BMA 10 – DMA 8 -.118*** -.11*** -.11** .025 -.237 -.02 .173 .584 .284

BMA 10 –DMA 6 -.144** -.11*** -.122* .005 -.225 -.016 .162 .582 .282

BMA 10 – DMA 4 -.172** -.12*** -.144* .018 -.216 .007 .158 .57 .252

BMA 10 – DMA 2 -.167** -.13*** -.133 -.006 -.209 .023 .156 .543 .208

BMA 8 – DMA 10 -.054 -.066 -.012 .002 -.34 -.173* .204** .182 .054

BMA 8 – DMA 8 -.079 -.07 -.02 -.01 -.329 -.164 .220*** .179 .043

BMA 8 – DMA 6 -.105 -.079 -.031 -.03 -.316 -.16 .209** .177 .041

BMA 8 – DMA 4 -.134 -.09 -.054 -.017 -.308 -.137 .205 .165 .012

BMA 8 – DMA 2 -.128 -.098 -.042 -.041 -.301 -.121 .203 .138 -.033

BMA 6 – DMA 10 -.031 -.075 .01 .017 -.462* -.356*** .218 .039 -.093

BMA 6 - DMA 8 -.057 -.078 .003 .005 -.452* -.346*** .234 .036 -.104

BMA 6 – DMA 6 -.082 -.087 -.009 -.015 -.439* -.342*** .223 .034 -.106

BMA 6 – DMA 4 -.111 -.098 -.031 -.002 -.431* -.319*** .22 .021 -.135

BMA 6 – DMA 2 -.105 -.106 -.019 -.026 -.424* -.304* .217 -.052 -.18

BMA 4 – DMA 10 -.011 -.041 .064 .0001 -.544* -.488*** .206 -.063 -.22

BMA 4 – DMA 8 -.036 -.045 .057 -.012 -.534** -.479*** .223 -.067 -.231

BMA 4 – DMA 6 -.062 -.054 .045 -.0319 -.521* -.475*** .211 -.068 -.232

BMA 4 – DMA 4 -.091 -.065 .023 -.0188 -.513* -.452*** .208 -.081 -.262

BMA 4 – DMA 2 -.085 -.073 .034 -.043 -.506* -.436*** .205 -.107 -.307

BMA 2 – DMA 10 -.049 -.049 -.027 .047 -.533** -.406*** .182 -.106 -.367

BMA 2 – DMA 8 -.074 -.053 -.035 .035 -.522* -.397*** .198 -.11 -.378

BMA 2 – DMA 6 -.1 -.061 -.046 .015 -.51* -.392*** .187 -.112 -.379

BMA 2 – DMA 4 -.128 -.073 -.069 .029 -.501* -.37*** .183 -.124 -.409

BMA 2 – DMA 2 -.123 -.08 -.057 .004 -.494* -.354*** .181 -.151 -.454

AVERAGE DIFFERENCE -.094*** -.08*** -.033 -.001 -.4*** -.268** .198*** .114 -.097

Note: The table shows the difference in realized volatility forecasts between BMA and DMA when different models are included, where column (1) shows which restrictions of BMA and DMA are being compared. Columns (2), (3) and (4) show the difference in forecast error for 1-day, 1-week and 2-weeks horizons, respectively, separated by loss function: MRAE, MRSE and MRRSE. The bolded results reflect the difference between the top performing restricted models identified in table 7. The final row “AVERAGE DIFFERENCE” shows the average difference between BMA and DMA based on all the different restrictions on each horizon (n=25). A negative value implies that the specified DMA model yields a larger forecast error than the specified BMA model.

7. Conclusions

Previous research has found strong evidence that choosing the sample-frequency and model-averaging approach that minimize the forecast error of realized volatility adds economic value in terms of portfolio selection. For this reason, the purpose of this paper has been twofold. First, it aimed to fill the gap regarding the optimal sample-frequency at higher frequencies when forecasting realized volatility. Second, it aimed to give new insight into the optimal approach to model-averaging when forecasting realized volatility.

For this paper’s first purpose, the results support previous research in the sense that the choice of sample-frequency is important for forecast performance. This study finds that selecting a 1-min frequency rather than 5- or 10-min yields significantly better forecasts when measured in terms of MRSE and MRRSE. However, when the sample is restricted to the period after the financial crisis, the difference between 1-min and 5-min is insignificant. This implies that the impact of market microstructure noise might differ between periods with high and low levels of volatility. Nevertheless, the main findings of this paper suggest that the trade-off caused by market microstructure noise is not as important as previously thought, at least not within the range of frequencies investigated in this paper and for OMXS30. The results bring clarity to the findings of Bandi & Russell (2006), who suggested that the optimal frequency lies somewhere between 0.4-min and 13.8-min. However, the results of this paper do not consider the possibility that the optimal frequency lies outside the 1-min to 10-min range. As mentioned previously, the results of Potter et al. (2008) indicate that the optimal frequency ranges between 30-min and 65-min. Still, since most papers have chosen to stay within the 1-min to 10-min interval, this study gives support for choosing 1-min rather than 5-min or 10-min. Furthermore, the external validity of the results is hard to establish, since market microstructure noise might vary between markets. Thus, this paper’s findings regarding the optimal sample-frequency hold only for stock exchanges that are similar to the Swedish stock exchange. Further research should investigate how the impact of market microstructure noise depends on the market, in other words, identify the characteristics of a market in which the 1-min frequency performs better than the 5-min frequency, and so on.
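The trade-off discussed above can be made concrete with a toy example. The sketch computes realized variance as the sum of squared log returns, sampled from a 1-min price grid at different step sizes. The price path is hypothetical and constructed so that it only bounces between two quotes, meaning that all measured variation is noise; the function name and the use of realized variance rather than its square root are choices made for this illustration.

```python
import math


def realized_variance(prices, step):
    """Sum of squared log returns using every `step`-th price.

    Observations after the last full step are dropped.
    """
    sampled = prices[::step]
    returns = [math.log(later / earlier) for earlier, later in zip(sampled, sampled[1:])]
    return sum(r * r for r in returns)


# Hypothetical 1-min prices over ten minutes, alternating between two quotes.
prices = [100.0 if minute % 2 == 0 else 100.2 for minute in range(11)]

rv_1min = realized_variance(prices, step=1)
rv_5min = realized_variance(prices, step=5)
rv_10min = realized_variance(prices, step=10)
print(rv_1min, rv_5min, rv_10min)
```

In this constructed case the measured variance grows as the sampling interval shrinks, even though the price does not drift at all. Whether such noise outweighs the information gained from finer sampling in real data is the empirical question the paper addresses.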

The second purpose of this paper has been to give new insight into model-averaging. This paper finds significant evidence that Mean forecast combination (MFC) neither deals with model uncertainty nor generates accurate forecasts compared to single models. This finding speaks against Smith & Wallis (2009), who argued that MFC is as good as model-averaging approaches based on previous performance. Even though the difference between performance-based model-averaging approaches and MFC is never tested statistically, the fact that MFC is significantly outperformed by single models while performance-based approaches are not indicates that MFC is less accurate in the forecasting process. The average difference in table 6 suggests that model-averaging based on historical forecast performance deals with model uncertainty in the sense that it on average generates a smaller forecast error than single models. However, these results are not significant. This paper also finds that using all models does not always produce the best model-averaging selection. However, most of these results are not significant, implying that restricting the number of models likely has a small impact on forecast performance. Finally, and most importantly for this paper’s second purpose, Bayesian model-averaging (BMA) significantly outperforms Dynamic model-averaging (DMA) on the 1-day and 1-week horizons, but on the 2-weeks horizon DMA is slightly better in terms of MRSE and significantly better in terms of MRAE. These mixed findings are not in line with Liu et al. (2017), who found that DMA significantly outperforms BMA on all horizons. However, they open the door to the possibility that the choice of weighting approach depends on the forecast horizon. A possible explanation for why the horizon matters is that the models’ forecast performance for realized volatility is more time-varying on the 2-weeks horizon than on the 1-day and 1-week horizons, so that forecasts based on DMA are more beneficial on longer horizons. However, it could also be the case that BMA simply performs as poorly as DMA on the longer horizon and that DMA does not actually improve on the 2-weeks horizon. Further research in this area should consider optimizing the model-averaging process further. It is possible that in some periods of the time series only a few models are needed, while in other periods more models are needed. One suggestion would be to consider a new kind of model-averaging approach in which the number of models included in the averaging process depends on a relative threshold value, such that a model is excluded if it exceeds this value.
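One possible reading of the suggested threshold rule is sketched below: a model is kept only if its past loss does not exceed the best model's loss multiplied by a relative threshold. This interpretation, the function name and the threshold values are assumptions made for illustration; the past losses are the 1-day, 1-min MRAE values from table 4 for four of the models.

```python
def select_by_relative_threshold(past_losses, relative_threshold):
    """Keep models whose past loss is at most relative_threshold times the best loss."""
    best = min(past_losses.values())
    return [
        name
        for name, loss in past_losses.items()
        if loss <= relative_threshold * best
    ]


# Example inputs taken from table 4 (1-day horizon, 1-min data, MRAE).
past_losses = {
    "HAR-RV": 0.212,
    "HARJ-RV": 0.222,
    "LHARCJ-RV": 0.200,
    "AR(3)+GARCH(1,1)": 0.207,
}

strict = select_by_relative_threshold(past_losses, 1.05)  # within 5% of the best
loose = select_by_relative_threshold(past_losses, 1.20)   # within 20% of the best
print(strict, loose)
```

With such a rule the number of averaged models is no longer fixed in advance; it shrinks when one or two models clearly dominate and grows when the models perform similarly.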

Furthermore, it should be highlighted that for the 2-weeks horizon only 25 forecast observations have been obtained and examined in the model-averaging analysis. This is somewhat fewer than the 30 observations needed to assume a normal distribution according to the rule of thumb, under which $n$ is considered large enough for approximate normality when $n \geq 30$. This problem also applies to the results in the final row of table 8, which tests the average difference in forecast error between BMA and DMA across all restrictions, since they are also based on 25 observations. These issues are potential limitations of this paper’s results and should be kept in mind when interpreting them.

Bibliography

Andersen, T. G., Bollerslev, T., & Diebold, F. X. (2007). Roughing It Up: Including Jump Components in the Measurement, Modeling, and Forecasting of Return Volatility. The Review of Economics and Statistics, 701-720.

Andersen, T. G., Bollerslev, T., & Meddahi, N. (2011). Realized Volatility Forecasting and Market Microstructure. Journal of Econometrics 160, 220-234.

Andersen, T. G., Bollerslev, T., Diebold, F. X., & Ebens, H. (2001, b). The Distribution of Realized Stock Return Volatility. Journal of Financial Economics 61, 43-76.

Andersen, T. G., Bollerslev, T., Diebold, F. X., & Labys, P. (2001, a). The Distribution of Realized Exchange Rate Volatility. Journal of The American Statistical Association, 41-55.

Andersen, T. G., Bollerslev, T., Diebold, F. X., & Labys, P. (2003). Modeling and Forecasting Realized Volatility. Econometrica, 579-625.

Andersen, T., & Bollerslev, T. (1998). Answering the Sceptics: Yes, Standard Volatility Models do Provide Accurate Forecasts. International Economic Review 39, 885-905.

Awartani, B., Corradi, V., & Distaso, W. (2009). Assessing Market Microstructure Effects via Realized Volatility Measures with an Application to the Dow Jones Industrial Average Stocks. Journal of Business & Economic Statistics, Vol. 27, 251-265.

Bandi, F. M., & Russell, J. R. (2006). Separating Microstructure Noise from Volatility. Journal of Financial Economics 79, 655-692.

Bandi, F., & Russell, J. (2008). Microstructure Noise, Realized Variance, and Optimal Sampling. Review of Economic Studies 75, 339-369.

Barndorff-Nielsen, O. E., & Shephard, N. (2002, a). Econometric Analysis of Realized Volatility and Its Use in Estimating Stochastic Volatility Models. Journal of the Royal Statistical Society 64, 253-280.

Barndorff-Nielsen, O. E., & Shephard, N. (2004a). Power and Bipower Variation with Stochastic Volatility and Jumps. Journal of Financial Econometrics, 1-37.

Barndorff-Nielsen, O. E., & Shephard, N. (2005). How Accurate is the Asymptotic Approximation to the distribution of Realized Variance? Cambridge: Cambridge Press.

Bates, J. M., & Granger, C. W. (1969). The Combination of Forecasts. Operational Research Quarterly 20, 451-468.

Black, F. (1986). Noise. The Journal of Finance, 529-543.

Bollerslev, T. (1986). Generalized Autoregressive Conditional Heteroskedasticity. Journal of Econometrics 31, 307-327.

Caporin, M., & Velo, G. G. (2015). Realized Range Volatility Forecasting: Dynamic Features and Predictive Variables. International Review of Economics and Finance, 98-112.

Corsi, F. (2009). A Simple Approximate Long-Memory Model of Realized Volatility. Journal of Financial Econometrics 159, 276-288.

Corsi, F., Mittnik, S., Pigorsch, C., & Pigorsch, U. (2008). The Volatility of Realized Volatility. Econometric Reviews 27, 46-78.

Corsi, F., & Renò, R. (2012). Discrete-Time Volatility Forecasting With Persistent Leverage Effect and the Link With Continuous-Time Volatility Modeling. Journal of Business & Economic Statistics 30:3, 368-380.

Degiannakis, S. (2008). ARFIMAX and ARFIMAX-TGARCH Realized Volatility Modeling. Journal of Applied Statistics, 1169-1180.

Elton, E. J., Gruber, M. J., Brown, S. J., & Goetzmann, W. N. (2013). Modern Portfolio Theory and Investment Analysis. Wiley.

Engle, R. F. (1982). Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation. Econometrica, 987-1007.

Engle, R. F., & Bollerslev, T. (1986). Modeling the Persistence of Conditional Variance. Econometric Reviews, 1-50.

Fleming, J., Kirby, C., & Ostdiek, B. (2001). The Economic Value of Volatility Timing. The Journal of Finance, 329-352.

Fleming, J., Kirby, C., & Ostdiek, B. (2003). The Economic Value of Volatility Timing Using "Realized" Volatility. Journal of Financial Economics 67, 473-509.

Glosten, L. R., Jagannathan, R., & Runkle, D. E. (1993). On the Relation Between the Expected Value and the Volatility of the Nominal Excess Return on Stocks. The Journal of Finance, 1779-1801.

Hansen, P. R., & Lunde, A. (2006). Realized Variance and Market Microstructure Noise. Journal of Business & Economic Statistics Vol. 24, 127-161.

Hibon, M., & Evgeniou, T. (2004). To Combine or not to Combine: Selecting Among Forecasts and Their Combinations. International Journal of Forecasting 21, 15-24.

Hull, J. C. (2012). Options, Futures and Other Derivatives. Pearson Education.

Liu, C., & Maheu, J. M. (2009). Forecasting Realized Volatility: A Bayesian Model-Averaging Approach. Journal of Applied Econometrics, 709-733.

Liu, H.-C., Chiang, S.-M., & Cheng, N. Y.-P. (2012). Forecasting the Volatility of S&P Depositary Receipts Using GARCH-Type Models under Intraday Range-Based and Return-Based Proxy Measures. International Review of Economics and Finance 22, 78-91.

Liu, J., Wei, Y., Ma, F., & Wahab, M. (2017). Forecasting the Realized Range-Based Volatility Using Dynamic Model Averaging Approach. Economic Modelling 6, 12-26.

McAleer, M., & Medeiros, M. (2008). Realized Volatility: A Review. Econometric Reviews, 10-45.

McMillan, D. G., & Speight, A. E. (2004). Daily Volatility Forecasts: Reassessing the Performance of GARCH Models. Journal of Forecasting, 449-460.

Merton, R. C. (1980). On Estimating the Expected Return on the Market: an Exploratory Investigation. Journal of Financial Economics 8, 323-361.

Nelson, D. B. (1991). Conditional Heteroskedasticity in Asset Returns: A New Approach. Econometrica 59, 347-370.

Potter, M. D., Martens, M., & Dijk, D. v. (2008). Predicting the Daily Covariance Matrix for S&P 100 Stocks Using Intraday Data - But Which Frequency to Use? Econometric Reviews 27, 199-229.

Rapach, D. E., Strauss, J. K., & Zhou, G. (2010). Out-of-Sample Equity Premium Prediction: Combination Forecasts and Links to the Real Economy. The Review of Financial Studies vol. 23, 821-862.

Shin, D. W., & Hwang, E. (2015). A Lagrangian Multiplier Test for Market Microstructure Noise with Applications to Sampling Interval Determination for Realized Volatility. Economics Letters 129, 95-99.

Smith, J., & Wallis, K. (2009). A Simple Explanation of the Forecast Combination Puzzle. Oxford Bulletin of Economics and Statistics 71, 331-357.

Wang, C., & Nishiyama, Y. (2015). Volatility forecast of stock indices by model averaging using high-frequency data. International Review of Economics and Finance 40, 324-337.

Wang, Y., Ma, F., Wei, Y., & Wu, C. (2016). Forecasting Realized Volatility in a Changing World: A Dynamic Model Averaging Approach. Journal of Banking & Finance 64, 136-149.

Zakoian, J.-M. (1994). Threshold Heteroskedastic Models. Journal of Economic Dynamics and Control 18, 931-955.

Zhou, B. (1996). High-Frequency Data and Volatility in Foreign-Exchange Rates. Journal of Business & Economic Statistics, 45-52.

Appendix

Table 9. Robust Test Average Difference in Forecast Error on Different Frequencies

MRAE = Mean Relative Absolute Error, MRSE = Mean Relative Square Error, MRRSE = Mean Root Relative Square Error

* 5% significance, ** 2% significance, *** 1% significance

                      All-Share index (1)                  OMXS30 Post Financial Crisis (2)
                      MRAE        MRSE       MRRSE         MRAE        MRSE       MRRSE
(1-min) – (5-min)     -4.66***    -.66***    -.6***        -.4048      .0134      -.0538
(1-min) – (10-min)    -5.671***   -.77***    -.711***      -2.627***   -.1049*    -.2448***
(5-min) – (10-min)    -1.0142     -.102*     -.114*        -2.222***   -.118***   -.191***

Note: The results correspond to those shown in the last row of table 5, in which the average difference in forecast performance of each single model across sample frequencies is tested. The left columns (1) show the results for an “All-Share index” on the Nasdaq Stockholm stock exchange between 2 January 2007 and 28 December 2012. The right columns (2) show the results for OMXS30 after the financial crisis, for which the sample starts on 1 June 2009. For this period on OMXS30 the 2-weeks horizon has been excluded due to the small number of observations.

Table 10. Robust Test Average Difference Between Combined and Single Models

MRAE = Mean Relative Absolute Error, MRSE = Mean Relative Square Error, MRRSE = Mean Root Relative Square Error

* 5% significance, ** 2% significance, *** 1% significance

        All-Share index (1)                OMXS30 Post Financial Crisis (2)
        MRAE       MRSE       MRRSE        MRAE        MRSE      MRRSE
MFC     -.537**    -.0637*    -.1138*      -.3675***   -.03      -.0639
BMA     .0366      .0226      .0087        .0975       .0196     .0212
DMA     .0615      .0167      .0119        -.0336      .0051     .0013

Note: The results correspond to those shown in the last row of table 6, in which the average difference in forecast performance between each single model and the combined models is shown. The combined models tested against single models are Mean Forecast Combination (MFC), Bayesian Model-Averaging (BMA) and Dynamic Model-Averaging (DMA). A negative value indicates that the tested model-averaging approach is on average outperformed by single models, and a positive value indicates the opposite. The left columns (1) show the results for an “All-Share index” on the Nasdaq Stockholm stock exchange between 2 January 2007 and 28 December 2012. The right columns (2) show the results for OMXS30 after the financial crisis, for which the sample starts on 1 June 2009. For this period on OMXS30 the 2-weeks horizon has been excluded due to the small number of observations.

Table 11. Robust Test Average Difference Between Bayesian and Dynamic Model-Averaging

MRAE = Mean Relative Absolute Error, MRSE = Mean Relative Square Error, MRRSE = Mean Root Relative Square Error

* 5% significance, ** 2% significance, *** 1% significance

                  All-Share index (1)                  OMXS30 Post Financial Crisis (2)
                  MRAE        MRSE       MRRSE         MRAE        MRSE       MRRSE
1-day ahead       -.032       -.08***    -.054***      -.1736***   -.146***   -.196***
1-week ahead      -.0551***   -.308***   -.2662***     -.0313      -.116***   -.271***
2-weeks ahead     .1789***    .0607      .0395         n/a         n/a        n/a

Note: The results correspond to those shown in the last row of table 8, in which the average difference in forecast performance between Bayesian and Dynamic model-averaging is shown on 1-day, 1-week and 2-weeks horizons. A negative value indicates that Bayesian outperforms Dynamic model-averaging, and a positive value indicates the opposite. The left columns (1) show the results for an “All-Share index” on the Nasdaq Stockholm stock exchange between 2 January 2007 and 28 December 2012. The right columns (2) show the results for OMXS30 after the financial crisis, for which the sample starts on 1 June 2009. For this period on OMXS30 the 2-weeks horizon has been excluded due to the small number of observations.