Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
Bagging Binary Predictors for Time Series
Tae-Hwy Lee∗
Department of EconomicsUniversity of California, Riverside
Riverside, CA [email protected]
Yang YangDepartment of Economics
University of California, RiversideRiverside, CA 92521
February 2004
Abstract
Bootstrap aggregating or Bagging, introduced by Breiman (1996a), has been proved to be effectiveto improve on unstable forecast. Theoretical and empirical works using classification, regression trees,variable selection in linear and non-linear regression have shown that bagging can generate substantialprediction gain. However, most of the existing literature on bagging have been limited to the crosssectional circumstances with symmetric cost functions. In this paper, we extend the application ofbagging to time series settings with asymmetric cost functions, particularly for predicting signs andquantiles. We link quantile predictions to binary predictions in a unified framwork. We find that baggingmay improve the accuracy of unstable predictions for time series data under certain conditions. Variousbagging forecast combinations are used such as equal weighted and Bayesian Model Averaging (BMA)weighted combinations. For demonstration, we present results from Monte Carlo experiments and fromempirical applications using monthly S&P500 and NASDAQ stock index returns.
Key Words: Asymmetric cost function, Bagging, Binary prediction, BMA, Forecast combination, Ma-jority voting, Quantile prediction, Time Series.
JEL Classification: C3, C5, G0.
∗Corresponding author. Tel: +1 (909) 827-1509. Fax: +1 (909)787-5685.
1 Introduction
In practice it is quite common that one forecast model performs well in certain periods while other models
perform better in other periods. It is difficult to find a forecast model that outperforms all competing models.
To improve forecasts over individual models, combined forecasts have been suggested. Bates and Granger
(1969), Granger, Deutsch, and Terasvirta (1994), Stock and Watson (1999), Knox, Stock and Watson (2000)
and Yang (2004) show that forecast combinations can improve forecast accuracy over a single model. Granger
and Jeon (2004), Aiolfi and Favero (2002), and Aiolfi and Timmermann (2004) emphasize that combined
forecasts may provide an insurance to diversify risk of forecast errors, analogous to investing on portfolios
rather than on individual securities.
In practice it is also common that a forecast model performs well in certain periods while it does not do
very well in other periods. It is difficult to find a forecast model that performs well in all prediction periods.
To improve forecasts of a model, a combination can be formed over a set of training sets. While usually
we have a single training set, it can be replicated via bootstrap and the combined forecasts trained over
the bootstrap-replicated training sets can be better. This is the idea of bootstrap aggregating (or bagging),
introduced by Breiman (1996a).
It is well known that, while financial returns {Yt}may not be predictable, their variance and signs are oftenshown to be predictable. Christofferson and Diebold (2003) show that the binary variableGt+1 ≡ 1(Yt+1 > 0)is predictable when some conditional moments are time varying. Hong and Lee (2003), Hong and Chung
(2003), Pesaran and Timmermann (2002a), Skouras (2002), among many others, find some evidence that
the directions of stock returns and foreign exchange rate changes are predictable.
In this paper we examine how bagging may improve binary predictions and quantile predictions for
time series under asymmetric cost functions. While there remains much work to be done in the literature,
many theoretical and numerical works have shown that bagging is effective to improve unstable forecasts.
However, most of the existing works have focused on cross sectional data and symmetric cost functions.
In this paper we find that bagging can improve the accuracy of binary and quantile predictions for time
series data under asymmetric cost functions. Monte Carlo experiments and empirical applications using the
S&P500 and NASDAQ monthly returns are presented to demonstrate it. We construct binary predictors
based on quantile predictors so that the binary and quantile predictions are linked. Various bagging forecast
combinations with equal weights and weights based on Bayesian Model Averaging (BMA) are considered.
As noticed by many researchers bagging may outperform the unbagged predictors, but this may not
always be the case. Bagging is a device to improve the accuracy of unstable predictors. A predictor is said to
1
be unstable if perturbing the training sample can cause significant changes in the predictor (Breiman 1996b).
The unstableness of predictors may come from many sources: discrete choice regressions (e.g. decision
trees and classification problem), non-smooth target functions (e.g. median or quantile function involving
indicator functions), small sample size, outliers or extreme values, and etc. Under these circumstances, the
model uncertainty and/or parameter estimation uncertainty may be so serious that the standard regression
and prediction may be unstable.
Several methods have been suggested to smooth out the instabilities. Combination or aggregation smooth-
ing (Bates and Granger 1969, Granger and Jeon 2004) have been adopted frequently in moment predictions.
A kernel smoothing has been proposed for median predictions (Horowitz 1992) and binary quantile pre-
dictions (Kordas 2002). Bootstrap aggregating or bagging (Breiman 1996a) is a method of smoothing the
instabilities by averaging the predictors over bootstrap predictors and thus lowering the sensitivity of the
predictors to training samples.
Why does bagging work? Bagging constructs a predictor by averaging predictors trained on bootstrap
samples. Bagging is effective thanks to the variance reduction stemming from averaging predictors and so it
is most effective if the volatility of the predictors is very high. This mechanism of bagging for smoothing is
illustrated by Friedman and Hall (2000), who decompose a predictor or an estimator into linear and higher
order parts by Taylor-expansion. They show that bagging reduces the variance for the higher order nonlinear
component by replacing it with an estimate of its expected value, while leaving the linear part unaffected.
Therefore, bagging works by moving the estimator towards its linear components and works most successfully
with highly nonlinear estimators such as decision trees and neural network. Buja and Stuetzle (2002) extend
the application of bagging regression to more general circumstance by expressing predictors as statistical
functionals (especially those can be written as U statistics). They use the von Mises expansion technique
based on Taylor expansion and prove that the leading effects of bagging on variance, squared bias, and mean
squared error (MSE) are of order n−2, where n is the estimation sample size. So, bagging can also be used on
estimators from non-smooth target functions, such as median predictors and quantile predictors. However,
the mechanism of bagging has not yet been fully understood. Grandvalet (2004) shows that bagging stabilizes
prediction by equalizing the influence of training samples, even when the usual variance reduction argument
is at best questionable. Bagging uses bootstrap sample randomly drawn with replacement and some extreme
values and outliers may not be included in some bootstrap samples and thus reduces the influence of such
values. See also Rao and Tibshirani (1997).
Alternatively, Buhlmann and Yu (2002) have shown that bagging transforms a hard-thresholding function
2
(e.g., indicator) into a soft-thresholding function (smooth function) and thus decreases the instabilities of
predictors. However, the relevance of this argument depends on how the bagging predictor is formed. As
discussed in Breimen (1996a), bagging predictors can be formed via voting instead of averaging. If the
target variable of interest is numerical (as for the quantile prediction in this paper), we can form the bagging
predictor by an average over bootstrap predictors. However, if the target variable takes only discrete values,
then a voting scheme over the discrete values is to be called for from the bootstrap predictors (as for
the binary prediction in this paper). Both averaging and voting can be weighted. Bagging transforms
a hard-thresholding function into a soft-thresholding function only for averaged-bagging, but not for the
voted-bagging because a voted-bagging predictor will remain as a binary hard-thresholding function.
Even so, the ability of the voted-bagging to stabilize classifier prediction has been proved to be very
successful in the machine learning literature (Bauer and Kohavi 1999, Kuncheva and Whitaker 2003, and
Evgeniou et al. 2004). The method for voting classification or voted-bagging is now an established research
area known under different names in the literature — combining classifiers, classifier ensembles, committees
of learners, a team of classifiers, consensus of learners, mixture of experts, etc. In this paper we attempt to
understand how the voted-bagging (bagging binary predictions) may work or fail to work. We find that the
voted-bagging works very well when the training sample is small and the predictor is unstable. It does not
necessarily mean that a better prediction can be achieved by stabilization of the prediction as discussed in
different but related context by Kuncheva and Whitaker (2003) and Dzeroski and Zenko (2004).
There are some limitations in the already substantial bagging literature. Most of the existing literature
on bagging are dealing with independent data. With the obvious dynamic structures in most economics and
finance variables, it is thus important to see how bagging works for time series. One concern of applying
bagging to time series is whether a bootstrap can provide a sound simulation sample. Fortunately, it has
been shown that the block bootstrap can provide consistent densities for moment estimators and quantile
estimators (Hahn 1997, Fitzenberger 1997, Horowitz 1998, Hardle et al. 2003, Politis 2003). Another
limitation of current bagging research is using symmetric cost functions (such as MSE) as a prediction
evaluation criterion. However, symmetric cost may not be a proper way to compare different predictors
(Granger 1999a, 1999b, 2002, Granger and Pesaran 2000), and it is widely accepted that asymmetric cost
functions are more relevant (Elliott et al. 2003a, 2003b). That is, our utility changes differently with positive
and negative forecast error, or with wrong-alert and fail-to-alert. In this paper, we analyze bagging for non-
smooth binary predictors formed via majority voting, for time series, under asymmetric cost functions.
The plan of this paper is as follows. Section 2 shows a binary prediction problem based on utility
3
maximization behavior of an economic agent. We introduce a way to form a binary predictor through a
quantile predictor. Section 3 shows how bagging provides superior predictive ability for time series binary
predictions, that is constructed based on quantile predictions. In Section 4 we examine how bagging works
for quantile predictions. Section 5 presents Monte Carlo experiments and Section 6 reports empirical results
on the binary and quantile predictions using the monthly returns in S&P500 and NASDAQ stock indexes.
Section 7 concludes.
2 Binary Prediction
Granger (2002) notes, based on Luce (1980, 1981), that a risk measure of financial return Yt+1 that has a
conditional predictive distribution PYt+1(y|Xt) = Pr(Yt+1 < y|Xt) may be written as
R(Xt) = A1
Z ∞0
|y −m|pdPYt+1(y|Xt) +A2
Z 0
−∞|y −m|pdPYt+1(y|Xt),
with A1, A2 both > 0 and some p > 0, where m is the predictor of y that minimizes the risk. Granger (2002)
emphasizes the risk measure starts with the whole conditional distribution PYt+1(y|Xt) with asymmetric
weights on each half-distribution.
Initially considering the symmetric case A1 = A2, one has the class of volatility measures
R(Xt) = E|y −m|p,
which includes the variance (p = 2) and the mean absolute deviation (p = 1). One problem raised here is
how to choose optimal Lp-norm in empirical works. In some studies (e.g. Ding, Granger, and Engle 1993,
Granger, Ding, and Spear 2000) the time series and distributional properties of these measures have been
studied empirically and the absolute deviations (p = 1) found to have particular properties such as long
memory. Granger (2002) consider the absolute return (p = 1) as a preferable measure given that returns
from the stock market are known to come from a distribution with particularly long tails. In particular,
Granger (2002) refers to a trio of papers (Nyguist 1983, Money et al. 1982, Harter 1977) who find that the
optimal p = 1 from Laplace and Cauchy distribution, p = 2 for Gaussian and p = ∞ (min/max estimator)
for a rectangular distribution. Granger (2002) also notes that in terms of the kurtosis κ, Harter (1977)
suggests to use p = 1 for κ > 3.8; p = 2 for 2.2 ≤ κ ≤ 3.8; and p = 3 for κ < 2.2. In finance, the kurtosis ofreturns can be thought of as being well over 4 and so the L1 norm with p = 1 is preferred. In this paper, we
follow Granger (2002) to consider binary and quantile predictions with A1 6= A2 and p = 1.Consider the conditional expected cost or conditional risk
R(Xt) = EY (c (et,1) |Xt) ≡ZRc (et,1) dPYt+1(y|Xt), (1)
4
where et,1 ≡ Gt+1 − Gt,1, Gt+1 ≡ 1(Yt+1 > 0), Gt,1 ≡ Gt,1(Xt) is the one-step binary predictor at t. The
unconditional risk is then
EY,X (c (et,1)) ≡ZRk
ZRc (et,1) dPYt+1(y|Xt)dPXt(x), (2)
where the joint distribution PYt+1,Xt(y,x) = PYt+1(y|Xt)PXt(x) and PXt(x) is the marginal distribution of
Xt.
We now consider the above risk function to discuss a binary prediction. To define the asymmetric risk
with A1 6= A2 and p = 1, we consider the binary decision problem of Granger and Pesaran (2000) with the
following 2× 2 payoff or utility matrix:Utility Gt+1 = 1 Gt+1 = 0Gt,1 = 1 u11 u01Gt,1 = 0 u10 u00
(3)
where uij is the utility when Gt,1 = j is predicted and Gt+1 = i is realized (i, j = 1, 2). Assume u11 > u10
and u00 > u01, and uij are constant over time. (u11−u10) > 0 is the utility gain from taking correct forecastwhen Gt,1 = 1, and (u00 − u01) > 0 is the utility gain from taking correct forecast when Gt,1 = 0. Denote
πt ≡ Pr(Gt+1 = 1|Xt). (4)
Under this set up, the expected utility of Gt,1 = 1 is u11πt+u01(1−πt), and the expected utility of Gt,1 = 0is u10πt + u00(1− πt). Hence, to maximize utility, the prediction Gt,1 = 1 will be made if
u11πt + u01(1− πt) > u10πt + u00(1− πt),
or
πt >(u00 − u01)
(u11 − u10) + (u00 − u01) ≡ 1− α.
Summarizing, we have the following result.
Proposition 1 (Granger and Pesaran, 2000). The optimal predictor that can maximize expected utility of
an economic agent is:
G†t,1 = 1(πt > 1− α). (5)
We can consider (1− α) as the utility-gain ratio from correct prediction when Gt+1 = 0, or equivalently
the opportunity cost (in the sense that you lose the gain) of a wrong-alert if the wrong prediction is made
when Gt+1 = 0. Similarly,
α ≡ (u11 − u10)(u11 − u10) + (u00 − u01) (6)
5
can be treated as the utility-gain ratio from correct prediction when Gt+1 = 1 is realized. To put it in
another way, α is the opportunity cost of a fail-to-alert from the wrong prediction when Gt+1 = 1 is realized.
We thus have the cost function c (et,1) = c (Gt+1 −Gt,1):
Cost Gt+1 = 1 Gt+1 = 0Gt,1 = 1 0 1− αGt,1 = 0 α 0
That is
c (et,1) =
α if1− α if0 if
et,1 = 1et,1 = −1et,1 = 0
,
which can be equivalently written as c (et,1) = ρα(et,1), where
ρα(z) ≡ [α− 1(z < 0)]z (7)
is the check function of Koenker and Basset (1978). Hence, summarizing, we have obtained the following
the result.
Proposition 2. The solution to the utility maximization problem in (3) is the same as the solution to the
cost minimization problem in (7). In other words, the optimal binary prediction obtained in Proposition 1
is the binary quantile that minimizes EY (ρα(et,1)|Xt), i.e., G†t,1 ≡ argminGt,1 EY (ρα(Gt+1 −Gt,1)|Xt), and
thus G†t,1 is the conditional α-quantile of Gt+1, denoted as
G†t,1 = QGt+1(α|Xt). (8)
Proof: The cost function may be written as
c (et,1) = αGt+1 + (1− α−Gt+1)Gt,1. (9)
and thus the conditional risk is
EY (c (et,1) |Xt) = απt + (1− α− πt)Gt,1. (10)
When πt > 1 − α, the minimizer of EY (c (et,1) |Xt) is G†t,1 = 1. When πt < 1 − α, the minimizer of
EY (c (et,1) |Xt) is G†t,1 = 0.
This is the maximum score problem of Manski (1975, 1985), Manski and Thompson (1989), and Kordas
(2002). Proposition 2 shows that the optimal binary predictor is the conditional α-quantile of Gt+1. In the
6
next proposition, we show that the optimal binary predictor is also the binary function of the conditional
α-quantile of Yt+1.
Proposition 3. The optimal binary predictor (under our cost function) is the indicator function of the
α-quantile of Yt+1 being positive:
G†t,1 = 1(QYt+1(α|Xt) > 0), (11)
where QYt+1(α|Xt) is the α-quantile function of Yt+1 conditional on Xt.
Proof: As noted by Powell (1986), for any monotonic function h(·), Qh(Y )(α|X) = h(QY (α|X)), whichfollows immediately from observing that Pr(Y < y|X) = Pr[h(Y ) < h(y)|X]. As the indicator function ismonotonic, QGt+1(α|Xt) = Q1(Yt+1>0)(α|Xt) = 1(QYt+1(α|Xt) > 0).
Remarks. Note that Q†Gt+1(α|Xt) = argminEY (ρα(et,1)|Xt) with et,1 ≡ Gt+1 − QGt+1(α|Xt), while
Q†Yt+1(α|Xt) = argminEY (ρα(ut,1)|Xt) with ut,1 ≡ Yt+1 − QYt+1(α|Xt). Proposition 2 indicates that the
binary prediction can be made from the binary quantile regression for Gt+1 in equation (8). Proposition 3
indicates that the binary prediction can also be made from a binary function of the quantile for Yt+1 in equa-
tion (11). In the next section where we consider bagging binary predictors, we use the latter. The reasons
why we choose to use Proposition 3 instead of Proposition 2 are as follows. First, Manski (1975, 1985) and
Kim and Pollard (1990) show that parameter estimators obtained from minimizingP
ρα (et,1) has a slower
n1/3 convergence rate and has a non-normal limiting distribution. This is unattractive for our subsequent
analysis for bagging in Section 4 where we will make assumptions of asymptotic normality and bootstrap
consistency. Instead, we estimate the binary regression model from minimizingP
ρα (ut,1) , which gives n1/2
consistency and asymptotic normality. See Kim and White (2001), Chernozhukov and Umantsev (2001),
and Komunjer (2003). Second, the asymptotic properties of quantile estimators are better established than
the maximum score estimator. The properties of quantile regression for both linear and non-linear models,
for both cross-section and time-series data have been well developed. However, only linear models have
been considered inside of the indicator function in the binary or maximum score literature. Third, perhaps
most importantly, the quantile-based binary regression models allows more structures than the maximum
score regression models. We may have much more information than just an indicator. In the maximum
score literature, Y is usually latent and unobservable. In our case, however, Y is not latent but observable.
Hence using more informative variable Y in the quantile regression tends to give better prediction than just
using the binary variable G. Finally, the quantile itself is a very interesting topic, particularly due to recent
7
booming industry of the financial risk management. The quantile prediction is not only used in generating
the binary prediction as shown in Proposition 3, but also the quantile itself is often the objective of inter-
ests in finance and economics, e.g., Value-at-Risk (VaR). For this reason, in addition to the bagging binary
predictors considered in Section 3, we also consider the bagging quantile predictors in Section 4.
3 Bagging Binary Prediction
To compute the binary predictor via Proposition 3, we consider a linear quantile regression function
QYt+1(α|Xt) = X0tβ(α), (12)
where Xt ∈ Rl contains the known function (e.g. polynomials) of Xt ∈ Rk and β(α) ∈ Θ ⊂ Rl. We
use a polynomial function (Chernozhukov and Umantsev, 2001) within a univariate model, with Xt = Yt,
Xt = (1 Yt Y2t )
0, β(α) = [β0(α) β1(α) β2(α)]0, and QYt+1(α|Xt) = β0(α) + β1(α)Yt + β2(α)Y
2t . Denote the
optimal binary predictor as
G†t,1(Xt) ≡ 1(Q†Yt+1(α|Xt) > 0) = 1(X0tβ†(α) > 0),
where β†(α) ≡ argminβ(α) EY (ρα (ut,1) |Xt).
We estimate β†(α) recursively using the “rolling” training sample. Suppose there are T + 1 (≡ R + P )observations. We use the most recent R observations available at time t, R ≤ t < T +1, as a training sampleDt ≡ {(Ys,Xs−1)}ts=t−R+1.We then generate P one-step-ahead forecasts for the remaining validation sample.For each time t in the prediction period, we use a rolling training sample Dt of size R to estimate model
parameters
βt(α|Dt) ≡ argminβ(α)
tXs=t−R+1
ρα(Ys −X0s−1β(α)), t = R, . . . , T.
We can then generate a sequence of one-step-ahead forecasts {QYt+1(α|Xt,Dt) = X0tβt(α|Dt)}Tt=R and
Gt,1(Xt,Dt) ≡ 1(QYt+1(α|Xt,Dt) > 0) = 1(X0tβt(α|Dt) > 0), t = R, . . . , T. (13)
For more discussion on the rolling estimation scheme and other alternative schemes, see West and McCracken
(1998, p. 819). Note that the notation, Gt,1(Xt,Dt), is to indicate the parameter estimation uncertainty inβt(α|Dt) due to the training of the unknown parameter β(α) using the training sample Dt.Although we have a single training set Dt at time t, an imitation of the training set can be constructed
from repeated bootstrap samples D∗t and form the predictor of the target variable (that is usually one-
dimensional). If the target variable is numerical as for the stock returns Y or for its quantiles QY (α|X),
8
the bagging procedure is to take an average of the forecasts from the replicated training samples. If the
target variable is a class as for the binary variable G, then a method of aggregating the bootstrap trained
forecasts is a voting. One of the most representative unstable predictor used in bagging literature is the
classifier predictor, with the binary forecast as a special case with only two classes. The aim of the paper
is to compare the two binary forecasts: unbagged predictor Gt,1 and the bagging predictor GBt,1 (defined
below).
The procedure of bagging for binary predictors can be conducted in the following steps:
1. Given a training set of data at time t, Dt ≡ {(Ys,Xs−1)}ts=t−R+1, where the Ys are the variable ofinterest and the Xs are vectors of predictor variables, construct the jth bootstrap sample D∗(j)t ≡{(Y ∗(j)s ,X
∗(j)s−1)}ts=t−R+1, j = 1, . . . , J, the bootstrap samples, according to the empirical distribution
of Dt.
2. Estimate β∗(j)t (α|D∗(j)t ) ≡ argminβ(α)
Pts=t−R+1 ρα(Y
∗(j)s −X∗(j)0s−1 β(α)), t = R, . . . , T.
3. Compute the bootstrap binary predictor from the jth bootstrapped sample, that is,
G∗(j)t,1 (Xt,D∗(j)t ) ≡ 1(Q∗(j)Yt+1
(α|Xt,D∗(j)t ) > 0) = 1(X0tβ∗(j)t (α|D∗(j)t ) > 0). (14)
Note that here we use Xt instead of X∗(j)t so that only the parameter β
∗(j)t (α|D∗(j)t ) is trained on the
bootstrap samples D∗(j)t , but the forecast is formed using the original predictor variables Xt.
4. Form a (weighted) average over all bootstrap training samples drawn from the same distribution:
ED∗G∗t,1(Xt,D∗t ) =JXj=1
wj,tG∗(j)t,1 (Xt,D∗(j)t )
withPJj=1 wj,t = 1. The expectation ED∗(·) is the average over the bootstrap training samples D∗t .
This is a key step in bagging to smooth out the instability of the predictor due to the parameter
estimation (training or learning). The weights wj,t is typically set equal to J−1, but can be computed
via a Bayesian approach (see Section 3.2). J is often chosen in the range of 50, depending on sample
size and on the computational cost to evaluate the predictor. See Breiman (1996a, Section 6.2). Our
Monte Carlo results and empirical results reported in Sections 5 and 6 suggests J = 50 is more than
sufficient, and even J = 20 is often good enough.
5. Finally, the bagging binary predictor GBt,1(Xt,D∗t ) can be constructed by the majority voting over theJ bootstrap predictors, i.e.,
GBt,1(Xt,D∗t ) ≡ 1(ED∗G∗t,1(Xt,D∗t ) > 1/2).
9
The notation, G∗(j)t,1 (Xt,D∗(j)t ), G∗t,1(Xt,D∗t ), and GBt,1(Xt,D∗t ), is to indicate the parameter estimation
uncertainty in β∗(j)t (α|D∗(j)t ) due to the training of the unknown parameter β(α) using the bootstrap training
sample D∗(j)t . For example, the binary predictor G∗(j)t,1 (Xt,D∗(j)t ) that is constructed using the parameter
value β∗(j)t (α|D∗(j)t ) trained by the sample D∗(j)t is the output of G†t,1(Xt) for each value ofXt in the predictor
space. Bagging is to average out (smooth out) the effect of the different training samples.
3.1 How bagging works
We now examine how bagging provides a superior predictor than an unbagged predictor for time series data.
We compare the unconditional risk of a bagging predictor and an unbagged predictor.
First, we need to find out the differences among different binary predictors. The minimum possible
expected cost is that of the optimal predictor G†t,1(Xt). Plugging G†t,1(Xt) into the conditional risk function
in (10), we have
EY (c(e†t,1)|Xt) = απt + (1− α− πt)G†t,1(Xt), (15)
where e†t,1 ≡ Gt+1 − G†t,1(Xt). As defined in equation (1), the conditional expectation EY (·|Xt) is taken
over PYt+1(y|Xt), the distribution conditional on the predictor variables Xt.
Once the unknown parameters are trained, the conditional risks of unbagged predictor and bagging
predictors can be written as
EY (c (et,1) |Xt) = απt + (1− α− πt)Gt,1(Xt,Dt), (16)
EY (c(eBt,1)|Xt) = απt + (1− α− πt)GBt,1(Xt,D∗t ), (17)
where et,1 ≡ Gt+1 − Gt,1(Xt,Dt) and eBt,1 ≡ Gt+1 − GBt,1(Xt,D∗t ).Now, taking (weighted) averages of (16) and (17) over the training samples Dt and D∗t , respectively,
yields:
EY,D(c (et,1) |Xt) = απt + (1− α− πt)EDGt,1(Xt,Dt)
= απt + (1− α− πt) Pr(Gt,1(Xt,Dt) = 1), (18)
and
EY,D∗(c(eBt,1)|Xt) = απt + (1− α− πt)ED∗GBt,1(Xt,D∗t )
= απt + (1− α− πt) Pr(GBt,1(Xt,D∗t ) = 1). (19)
Then, from (15), (18), and (19), it is easy to see the following result.
10
Proposition 4. If the conditional risk of the unbagged predictor (18) is so close to the minimum possible
conditional risk in (15), that is, EDGt,1(Xt,Dt) or Pr(Gt,1(Xt,Dt) = 1) is very close to G†t,1, then there isno room for improvement by bagging.
Comparing the predictability of unbagged predictor and bagging predictor can be done by comparing the
conditional risks in (18) and (19). If
(1− α− πt) Pr(Gt,1(Xt,Dt) = 1) > (1− α− πt) Pr(GBt,1(Xt,D∗t ) = 1), (20)
we say bagging works. Hence, we need to explore the properties of Pr(GBt,1(Xt,D∗t ) = 1) and Pr(Gt,1(Xt,Dt) =1). Recall Proposition 3 that we utilize quantile predictors to generate the binary predictors, i.e., G†t,1(Xt) =
1(Q†Yt+1(α|Xt) > 0), Gt,1(Xt,Dt) = 1(QYt+1(α|Xt,Dt) > 0), and G∗t,1(Xt,D∗t ) = 1(Q∗Yt+1(α|Xt,D∗t ) > 0).Let us further define some notation
ZR ≡ R1/2[QYt+1(α|Xt,Dt)−Q†Yt+1(α|Xt)]/σQ(α) ≡ −dR + d†R,
Z∗R ≡ R1/2[Q∗Yt+1(α|Xt,D∗t )− QYt+1(α|Xt,Dt)]/σQ(α) ≡ −d∗R + dR,
with d†R = −√RQ†Yt+1(α|Xt)/σQ(α), d
∗R ≡ −
√RQ∗Yt+1(α|Xt,D∗t )/σQ(α), dR ≡ −
√RQYt+1(α|Xt,Dt)/σQ(α),
and σ2Q(α) = var³R1/2[QYt+1(α|Xt,Dt) − Q†Yt+1(α|Xt)]
´.
Using Propositions 1 and 4, the condition in (20) under which bagging works can be written as follows:
Proposition 5. (a) When 1 − α − πt > 0 (equivalently when G†t,1 = 0, Q†Yt+1(α|Xt) < 0, and d†R > 0),
the bagging works if Pr(Gt,1(Xt,Dt) = 1) > Pr(GBt,1(Xt,D∗t ) = 1) > G†t,1 = 0. (b) When 1 − α − πt < 0
(equivalently when G†t,1 = 1, Q†Yt+1(α|Xt) > 0, and d†R < 0), the bagging works if Pr(Gt,1(Xt,Dt) = 1) <
Pr(GBt,1(Xt,D∗t ) = 1) < G†t,1 = 1.
We assume that {QYt+1(α|Xt,Dt) = X0tβt(α|Dt)} satisfies the following asymptotic properties — the
asymptotic normality (Komunjer 2003) and the bootstrap consistency (Fitzenberger 1997):
ZR →d N(0, 1),
supz∈R
|| Pr(Z∗R < z)− Φ(z) ||→p 0,
as R→∞ and 0 < σ2Q(α) <∞, where Φ(·) is the standard normal CDF. Therefore,
Pr(Gt,1(Xt,Dt) = 1) = Pr(QYt+1(α|Xt,Dt) > 0)
11
= Pr(dR < 0)
= Pr(ZR > d†R)
= 1− Pr(ZR < d†R) (21)
→ 1− Φ(d†R) as R→∞, (22)
where the second equality follows from the definition of dR, and the third equality follows from the definition
of ZR. As noted in Proposition 4, if Pr(Gt,1(Xt,Dt) = 1) is very close to G†t,1, there is no room for improve-
ment by bagging. According to (22), when d†R is close to 0, Φ(d†R) is most away from 1 or 0. That means
when the pseudo-true conditional α-quantile Q†Yt+1(α|Xt) is close to 0, there is largest room for bagging to
work. This is because when Q†Yt+1(α|Xt) = 0 the unbagged predictor Gt,1(Xt,Dt) = 1(QYt+1(α|Xt,Dt) > 0)will be most “unstable”. Hence, for economic agents or financial investors with the cost function parameter
α that gives Q†Yt+1(α|Xt) = 0, the bagging may work better. Recall that (1 − α) is the utility-gain ratio
from correct prediction when Gt+1 = 0 or equivalently the opportunity cost of a wrong-alert when Gt+1 = 0.
Also, α is the utility-gain ratio from correct prediction when Gt+1 = 1 is realized or the opportunity cost of
a fail-to-alert from the wrong prediction when Gt+1 = 1 is realized. For those investors with the conditional
α-quantile of the stock returns equal zero, the bagging binary prediction could work most effectively.
Now let us see how bagging works when there is some room for the unbagged predictors to be improved.
Bagging prediction is based on bootstrap procedure, so
ED∗G∗t,1(Xt,D∗t ) = Pr(G∗t,1(Xt,D∗t ) = 1)
= Pr(Q∗Yt+1(α|Xt,D∗t ) > 0)
= Pr(Z∗R > dR)
≡ 1− Φ∗(dR)
→ 1− Φ(d†R) as R→∞,
where the fourth line defines a notation Φ∗(·) to denote the bootstrap distribution of Z∗R. Therefore, we canrewrite Pr(GBt,1(Xt,D∗t ) = 1) as
Pr(GBt,1(Xt,D∗t ) = 1) = Pr³ED∗G∗t,1(Xt,D∗t ) > 1/2
´= Pr
³1− Φ∗(dR) > 1/2
´= Pr
³Φ∗(dR) < 1/2
´= Pr
³dR < Φ
∗−1(1/2)´
12
= Pr³d†R − ZR < Φ∗−1(1/2)
´= Pr
³ZR > d
†R − Φ∗−1(1/2)
´= 1− Pr
³ZR < d
†R − Φ∗−1(1/2)
´(23)
→ 1− Φ(d†R) as R→∞, (24)
where the last line is due to the asymptotic normality and the bootstrap consistency, i.e., Φ∗−1(1/2) →Φ−1(1/2) = 0. Hence, when R is large, the bagging would not work because (22) and (24) are the same
in the limit. When R is small, a difference between (21) and (23) is due to the median of the bootstrap
distribution Φ∗−1(1/2). So, bagging works either (a) if Q†Yt+1(α|Xt) < 0 and Φ∗−1(1/2) < 0, or (b) if
Q†Yt+1(α|Xt) > 0 and Φ∗−1(1/2) > 0. Summarizing:
Proposition 6. (a) When 1 − α − πt > 0 (equivalently when G†t,1 = 0, Q†Yt+1(α|Xt) < 0, d†R > 0) and
R is small, the bagging works if Φ∗−1(1/2) < 0. (b) When 1 − α − πt < 0 (equivalently when G†t,1 = 1,
Q†Yt+1(α|Xt) > 0, d†R < 0) and R is small, the bagging works if Φ
∗−1(1/2) > 0.
Proposition 7. When R is large, bagging does not work for the binary prediction.
Remark. The ability of the voted-bagging to stabilize classifier prediction has been proved to be very
successful in the machine learning literature (Bauer and Kohavi 1999, Kuncheva and Whitaker 2003, and
Evgeniou et al. 2004). However, it may not work, as also discussed by Kuncheva and Whitaker (2003)
and Dzeroski and Zenko (2004) in different but related contexts. Bagging predictors could be worse than
unbagged predictors. From the above analysis, we may see that bagging can push a predictor to its optimal
value under certain conditions, and also see that ability of bagging to improve predictability is limited. If
there is no predictability at all or the unbagged predictor is already very close to the optimal predictor, there
will be no room for improvement by bagging. We also note that bagging can be worse than the unbagged
predictors. Therefore, we cannot arbitrarily select a prediction model and expect bagging to work like a
magic and give a better prediction automatically. In case of classification (as in our binary predictors), the
bagging predictor is formed via voting, for which case Breiman (1996a, Section 4.2) shows that it is possible
that bagging could be worse than unbagged predictor. As we will see from the Monte Carlo experiment in
Section 5 and the empirical applications in Section 6, the bagging could be worse for the binary predictors
and the quantile predictions, although in general it works well particularly with a small R.
Remark. The “voted-bagging” does not transform the hard-thresholding decision into a soft-thresholding
13
decision. As Buhlmann and Yu (2002) have shown, “averaged-bagging” transforms a hard-thresholding func-
tion (e.g., indicator) into a soft-thresholding function (smooth function) and thus decreases the instabilities
of predictors. In the majority voting GBt,1(Xt,D∗t ) = 1(ED∗G∗t,1(Xt,D∗t ) > 1/2) the bagging predictor is stilla hard thresholding decision, or the indicator decision — e.g., the decision whether to take an umbrella (not
a fraction of an umbrella) by forecasting weather. Hence, the bagging binary prediction remains unstable
(particularly around the thresholding values). Hence, the explanation of Buhlmann and Yu (2002) does not
apply to the voted-bagging.
3.2 BMA-weighted bagging for binary prediction
To form the bagging binary predictor via a majority voting we need to find a proper weight function {wj,t}to calculate ED∗G∗t,1(Xt,D∗t ) =
PJj=1 wj,tG
∗(j)t,1 (Xt,D∗(j)t ). Usually, the bagging predictor uses the equally
weighted averaging (wj,t = J−1, j = 1, . . . , J.) over the bootstrapped predictions. However, wj,t can be
estimated depending on the performance of each bootstrapped predictor. There are several candidates that
we can borrow from estimated weighting forecast combination. One method is to estimate the weight by
regression (minimizing the cost function) as initially suggested by Granger et al. (1994). This weight scheme
assumes that predictors to be combined are independent and the number of predictors are small compared to
the sample size, so the regular regression method can be applied. However, the number of our bootstrapped
predictors are large and the predictors are closely related, the regression weight is not applicable here.
Another method for forecast combination is via the Bayesian averaging, which is what we use in this paper
for Monte Carlo experiments and empirical work. The idea is to use the Bayesian model averaging (BMA)
technique for bagging.1
The posterior predictive density of D∗t based on Dt can be decomposed as two parts — the predictionbased on the estimated model and the parameter estimation given the data:
P (D∗t |Dt) =JXj=1
PhD∗t | β
∗(j)t (α|D∗(j)t ),Dt
iPhβ∗(j)t (α|D∗(j)t ) | D∗(j)t ,Dt
iP (D∗(j)t |Dt)
=JXj=1
PhD∗t | β
∗(j)t (α|D∗(j)t ),Dt
iPhβ∗(j)t (α|D∗(j)t ) | Dt
i.
We then compute the BMA-weighted bootstrap average ED∗G∗t,1(Xt,D∗t ) as follows:
ED∗G∗t,1(Xt,D∗t ) =
ZG∗t,1(Xt,D∗t )dP (D∗t |Dt)
1BMA has been proved to be useful in forecasting financial returns (Avramov 2002) and in macroeconomic forecasting forinflation and output growth (Garratt et al. 2003), to deal with the parameter and model uncertainties. The out-of-sampleforecast performance is often shown to be superior to that of model selection criteria.
14
=JXj=1
Phβ∗(j)t (α|D∗(j)t ) | Dt
i ZG∗t,1(Xt,D∗t )dP
hD∗t | β
∗(j)t (α|D∗(j)t ),Dt
i
≡JXj=1
wj,tG∗(j)t,1 (Xt,D∗(j)t ),
where the last equality follows fromRG∗t,1(Xt,D∗t )dP
hD∗t | β
∗(j)t (α|D∗(j)t ),Dt
i= G
∗(j)t,1 (Xt,D∗(j)t ) and by
setting wj,t ≡ Phβ∗(j)t (α|D∗(j)t ) | Dt
i.
The posterior probability of β∗(j)t (α|D∗(j)t ) given Dt through the jth bootstrap data set D∗(j)t is calculated
by Bayes’ rule:
wj,t = Phβ∗(j)t (α|D∗(j)t ) | Dt
i=
PhDt | β∗(j)t (α|D∗(j)t )
iP (β
∗(j)t (α|D∗(j)t ))PJ
j=1 PhDt | β∗(j)t (α|D∗(j)t )
iP (β
∗(j)t (α|D∗(j)t ))
,
and now the problem is to estimate PhDt | β∗(j)t (α|D∗(j)t )
iand P (β
∗(j)t (α|D∗(j)t )).When we do not have any
information for the prior P (β∗(j)t (α|D∗(j)t )), we may just use some non-informative prior, P (β
∗(j)t (α|D∗(j)t ))
is the same for all j, so that
wj,t = Phβ∗(j)t (α|D∗(j)t ) | Dt
i=
PhDt | β∗(j)t (α|D∗(j)t )
iPJ
j=1 PhDt | β∗(j)t (α|D∗(j)t )
i . (25)
where PhDt | β∗(j)t (α|D∗(j)t )
iis usually estimated by a likelihood function. According to Komunjer (2003),
the exponential of quantile cost function can be treated as quasi-likelihood function, so PhDt | β∗(j)t (α|D∗(j)t )
ican be calculated by
PhDt | β∗(j)t (α|D∗(j)t )
i∝ exp
Ã−
kXi=1
c(e∗(j)t−i,1)
!, (26)
with using the k-most recent in-sample observations, where e∗(j)t,1 = Gt+1 − G∗(j)t,1 (Xt,D∗(j)t ). We select
k = 1, 5, and R in the simulations and empirical experiments. Intuitively, wj,t gives a large weight to the jth
bootstrap predictor at period t when it has forecasted well over the past k periods, and gives a small weight
to the jth bootstrap predictor at period t when it forecasted poorly over the past k periods.2 In Tables
1-10 (to be discussed in Section 5), EW and Wk denote the weighted-bagging with equal weights and with
BMA-weights using k-most recent in-sample observations.
4 Bagging Quantile Prediction
Another unstable predictor used in bagging literature is non-linear predictors. Quantile predictor is the min-
imizer of cost function ρα(·) as we defined in (7), and is a non-linear function of sample moments. According2The weighting scheme proposed in Yang (2004) and used in Zou and Yang (2004) may be regarded as a special case of the
above BMA-weighted forecasts. Zou and Yang (2004) have applied this method to choose ARIMA models and find that it hasa clear stability advantage in forecasting over some existing popular model selection criteria.
15
to Friedman and Hall (2000) and Buja and Stuetzle (2002), bagging can increase the prediction accuracy
for non-linear predictors. Knight and Bassett (2002) show that under i.i.d. and some other assumptions,
bagged non-parametric quantile estimators and linear regression quantile estimators can outperform their
corresponding standard unbagged predictors. Therefore, we will discuss the effect of bagging on quantile pre-
dictions, by comparing the unbagged predictor QYt+1(α|Xt,Dt) and the bagging predictor QBYt+1(α|Xt,D∗t )(defined below).
The procedure of bagging for quantile predictors can be conducted in the following steps:
1. Construct the jth bootstrap sample D∗(j)t , j = 1, . . . , J, the bootstrap samples, according to the
empirical distribution of Dt.
2. Estimate β∗(j)t (α|D∗(j)t ) = argminβ(α)
Pts=t−R+1 ρα(Y
∗(j)s − X∗(j)0s−1 β(α)), t = R, . . . , T.
3. Compute the bootstrap quantile predictor from the jth bootstrapped sample, that is,
Q∗(j)Yt+1
(α|Xt,D∗(j)t ) ≡ X0tβ∗(j)t (α|D∗(j)t ). (27)
4. Finally, the bagging predictor QBYt+1(α|Xt,D∗t ) can be constructed by averaging over the J bootstrappredictors, i.e.,
QBYt+1(α|Xt,D∗t ) ≡ ED∗Q∗Yt+1(α|Xt,D∗t ),
where ED∗Q∗Yt+1(α|Xt,D∗t ) =PJ
j=1 wj,tQ∗(j)Yt+1
(α|Xt,D∗(j)t ) withPJj=1 wj = 1.
The same problem as for the bagging binary prediction is how to find a proper weight function. Again,
we use the equal weight (wj = J−1, j = 1, . . . , J) and the BMA-weight. We compute the BMA-weighted
bootstrap average as follows:
ED∗Q∗Yt+1(α|Xt,D∗t ) =
ZQ∗Yt+1(α|Xt,D∗t )dP (D∗t |Dt)
=JXj=1
Phβ∗(j)t (α|D∗(j)t ) | Dt
i ZQ∗Yt+1(α|Xt,D∗t )dP
hD∗t | β
∗(j)t (α|D∗(j)t ),Dt
i
≡JXj=1
wj,tQ∗(j)Yt+1
(α|Xt,D∗(j)t ),
where the last equality follows fromRQ∗Yt+1(α|Xt,D∗t )dP
hD∗t | β
∗(j)t (α|D∗(j)t ),Dt
i= Q
∗(j)Yt+1
(α|Xt,D∗(j)t ) and
by setting wj,t ≡ Phβ∗(j)t (α|D∗(j)t ) | Dt
i. The BMA-weights wj,t has the formula (25), that is computed
using
PhDt | β∗(j)t (α|D∗(j)t )
i∝ exp
Ã−
kXi=1
ρα(u∗(j)t−i,1)
!, (28)
16
with using the k-most recent in-sample observations, where u∗(j)t,1 = Yt+1 − Q∗(j)Yt+1
(α|Xt,D∗(j)t ). We again
select k = 1, 5, and R in the simulations and empirical experiments in Sections 6-7.
Why does the bagging work for quantile prediction? There have been several papers that are useful to
answer this question. It is generally explained in two ways. First, bagging can driven an estimator towards
its linear approximation, which usually has a lower variance. Second, the bagging process can work by
smoothing the sample objective function, so the bagging predictor will converge to the global minimizer of
sample objective function.
Bagging for quantile falls into the category of bagging regression, which includes totally different mecha-
nism for the classification problem as we discussed in the previous section. Friedman and Hall (2000), Buja
and Stuetzle (2002), and Knight and Bassett (2002) all use a certain kind of Taylor-expansion to rewrite
interested estimators, and show that bagging predictors are first order equivalent to the standard unbagged
predictors, but will lower the non-linearities of the predictor. However, all of these works make the i.i.d. as-
sumption of the data. Friedman and Hall (2000) use a Taylor-expansion to decompose statistical estimators
into linear and higher order parts, or decompose the objective function that they optimize into quadratic
and higher order terms. To apply a Taylor-expansion, the estimator must be a smooth function of sample
moments, or the cost function is differentiable. Therefore, we can not analyze bagging quantile predictors.
Buja and Stuetzle (2002) extend the application of bagging regression to more general circumstance by ex-
pressing predictors as statistical functionals (especially those can be written as U -statistics). They show
how the quantile estimators can be written as U -statistics. Since a Taylor expansion is no longer applicable
now, they use the von Mises expansion technique (based on a Taylor expansion) to prove that the leading
effects of bagging on variance, squared bias, and MSE are of order R−2, where R is the estimation sample
size. So, bagging may works for non-smooth target functions, such as median predictors and quantile pre-
dictors. Knight and Bassett (2002) illustrate that if the quantile estimator is asymptotically normal, we can
decompose the quantile estimator into linear and non-linear part using the Bahadur-Kiefer representation.
They show that bagging quantile estimators are first-order equivalent to the standard unbagged quantile
predictor, however,bagging can reduce the non-linearity of the sample quantile.
The second explanation why bagging works for quantile predictions has to do with the unstableness of the
quantile prediction. One source of unstableness is the non-differentiable features in the objective function
(7). The sample objective function used in regression as an estimator of the objective function will be even
worse behaved because of the limitation of sample size. There may be several equivalent minimizers for
sample objective function, and the numerical searching method may stop at any of these minimizers, or even
17
a local minimizer depending on the beginning point of the search. So the quantile estimator may have a
very high volatility.
For the above reasons, we conjecture that bagging will also improve quantile prediction with time series
data. We show the performance of bagging for quantile predictions in the next two sections. These results
show that bagging would be very useful in improving the VaR forecasts in financial risk management.
5 Monte Carlo
For the regression forecast (that takes numerical values), Breiman (1996a, Section 4.1) shows that if we can
draw training samples from the true data generating process, the bagging predictor (based on averaging) will
always give a lower MSE than the unbagged predictor. He also shows that, in practice, as we bootstrap a
sample to generate replicated training samples, bagging can give improvement only if the model is unstable.
Although we have a different cost function than MSE, that is the check function in (7) and we allow the
time dependence in the data, bagging works in the similar way as in Breiman (1996) according to our Monte
Carlo and empirical experiments.
For the classification forecast (that takes finite number of discrete values), Breiman (1996a, Section 4.2)
shows that bagging may not always give a better prediction, and sometimes, bagging may even give a worse
prediction. Our numerical results in both this section and next section give a similar conclusion.
In this section, we use a set of Monte Carlo simulations to gain further insights of conditions under which
bagging works. From the asymmetric cost function given in (9), we can obtain the out-of-sample average
costs for unbagged and bagging binary predictors:
S1 ≡ P−1TXt=R
c(et,1) = P−1
TXt=R
ρα(et,1),
S2 ≡ P−1TXt=R
c(eBt,1) = P−1
TXt=R
ρα(eBt,1),
where et,1 ≡ Gt+1 − Gt,1 and eBt,1 ≡ Gt+1 − GBt,1. We consider the asymmetric cost parameter α =
0.1, 0.3, 0.5, 0.7, and 0.9. It will be said that bagging “works” if S1 > S2. To rule out the chance of pure luck
by a certain criterion, we compute the following four summary performance statistics from 100 Monte Carlo
replications (r = 1, . . . , 100): T1 ≡ 1100
P100r=1 S
ra, T2 ≡
³1100
P100r=1(S
ra − T1)
´1/2, T3 ≡ 1
100
P100r=1 1(S
r1 > S
r2),
and T4 ≡ 1100
P100r=1 1(S
r1 = S
r2), where a = 1 for the non-bagged predictor and a = 2 for the bagged predic-
tor. T1 measures the Monte Carlo mean of the out-of-sample cost, T2 measures the Monte Carlo standard
deviation of the out-of-sample cost, T3 measures the Monte Carlo frequency that the bagging works, and
(T3 + T4) measures the Monte Carlo frequency that the bagging is no worse than the unbagged predictors.
18
For quantile predictors, we use the check function for Y given in (7):
S1 ≡ P−1TXt=R
ρα(ut,1), and S2 ≡ P−1TXt=R
ρα(uBt,1),
where ut,1 ≡ yt+1 − QYt+1(α|Xt) and uBt,1 ≡ yt+1 − QBYt+1(α|Xt). The definition of four statistics T1, T2, T3,
and T4 for the quantile prediction are the same as for the binary prediction.
We generate the data from
Yt = ρYt−1 + εt,
εt = zt£(1− θ) + θε2t−1
¤1/2zt ∼ i.i.d. MWi
where the i.i.d. innovation zt is generated from the first eight mixture normal distributions of Marron and
Wand (1992, p. 717), each of which will be denoted as MWi (i = 1, . . . , 8). In Tables 1-2, we consider
the data generating processes for ARCH-MW1 with θ = 0.5, 0.9 (and ρ = 0), while in Tables 3-10, we
consider the data generating processes for AR-MWi (i = 1, . . . , 8) with ρ = 0.6 (and θ = 0). Therefore, our
data generating processes fall into two categories: the (mean-unpredictable) martingale-difference ARCH(1)
processes without AR structure and the mean-predictable AR(1) processes without ARCH structure.
To see some properties of these densities, we generated 10,000 pseudo-random numbers for which some
summary statistics are reported below. All the symmetricMWi innovations (i = 1, 4, 5, 6, 7) are standardized
so that they all have zero mean and unit variance. The asymmetric MWi innovations (i = 2, 3, 8) have non-
zero mean but still unit variance. The skewness of the innovation ranges from −0.7 (MW2) to 1.5 (MW3),
and the kurtosis ranges from 1.4 (MW7) to 25 (MW5).
Density Mean Median Max Min Std.dev. Skewness KurtosisMW1 Gaussian -0.013 -0.011 4.335 -4.252 0.998 -0.009 3.078MW2 Skewed unimodal 0.927 1.038 3.797 -4.084 1.007 -0.723 3.990MW3 Strongly skewed -1.925 -2.315 3.113 -3.307 1.027 1.471 4.813MW4 Kurtoic unimodal 0.019 0.006 4.212 -4.619 0.998 0.038 4.514MW5 Outlier 0.002 -0.006 9.471 -9.521 0.981 0.361 24.777MW6 Bimodal -0.002 0.004 3.125 -2.737 1.010 0.004 2.054MW7 Separated bimodal 0.017 0.274 2.167 -2.015 1.002 -0.346 1.378MW8 Skewed bimodal 0.326 0.370 3.328 -3.436 1.002 -0.330 2.405
For each series, 100 extra series is generated and then discarded to alleviate the effect of the starting values
in random number generation. We consider one fixed out-of-sample size P = 100 and a range of estimation
sample sizes R = 20, 50, 100, 200. Our bagging predictors are generated by voting or by averaging over
J = 50 bootstrap predictors, while the results with J = 20 basically tell the same story.
19
Our binary prediction are based on quantile predictors as suggested in Proposition 3: Gt,1(Xt,Dt) ≡1(QYt+1(α|Xt,Dt) > 0), and we are using a simple univariate linear quantile regression model QYt+1(α|Xt) =
β0(α)+β1(α)Yt+β2(α)Y2t , which is estimated using the interior-point algorithm used by Portnoy and Koenker
(1997).
To generate bootstrap samples, we use the block bootstrap for both Monte Carlo experiments in this
section and the empirical application in the next section. We choose the block size that minimizes the
in-sample average cost recursively and therefore we use a different block size at a each forecast period and
for each different cost function with different α’s.
All the Monte Carlo results are reported in Tables 1-10. According to all the tables, for both binary
predictions and quantile predictions, bagging works well when the sample size is small. The improvement
of bagging predictors over unbagged ones becomes less significant when the sample size increases. This is
true in terms of all three criteria, T1, T2, and T3, that is, bagging gives the largest reduction in the mean
cost, the largest reduction in the variance of the cost, and the highest frequency of out-performance, when
R = 20. Let us use Tables 1-2 as examples for now. The mean cost reduction (in terms of T1) for both binary
prediction and quantile prediction can be as much as about 20% for all α’s; the variance of cost (in terms of
T2) can be reduced by as much as about 20% for binary prediction and about 50% for quantile prediction;
and T3 is as high as 95% for the binary prediction and 100% for the quantile prediction. This result shows
that bagging can significantly mitigate the problem of the sample shortage.
This function of bagging can also be observed by the performance of bagging on tails. The scarce of
observations on tails usually will lead to the poor predictions, however, the degree of improvement of bagging
predictors are very significant for α = 0.1 and 0.9 according to our Monte Carlo results. We illustrate this
fact using Table 3 to 10 in case of R = 20. For binary prediction, the average cost reduction (in terms of
T1) for α = 0.1 and 0.9 is about 15%, however, the average cost reduction for α = 0.5 is only about 5%.
The average cost variance (in terms of T2) reduction for α = 0.1 and 0.9 is about 20%, however, the average
cost variance reduction for α = 0.5 is not significant. The frequency that bagging works (in terms of T3) for
α = 0.1 and 0.9 is about 80%, however, the frequency that bagging works for α = 0.5 is only about 60%. For
quantile prediction, the average cost reduction (in terms of T1) for α = 0.1 and 0.9 is about 20%, however,
the average cost reduction for α = 0.5 is only about 5%. The average cost variance reduction (in terms of
T2) for α = 0.1 and 0.9 is about 40%, however, the average cost variance reduction for α = 0.5 is about 20%.
The frequency that bagging works (in terms of T3) for α = 0.1 and 0.9 and α = 0.5 are similarly around
90%.
20
The advantage of bagging for quantile predictions (in terms of T1, T2, and T3) decreases gradually as the
sample size increases, but still exist for R = 200 though the advantage is not very significant. However, for
the binary prediction, this advantage disappears much faster, and will totally disappear when the sample
size exceeds 100 for most data generating processes (DGP). With a large sample size, the bagging predictor
will converge to the unbagged predictors as predicted from Proposition 7.
Another interesting phenomenon we observe is that bagging works better (in terms of T1, T2, and T3) for
both binary prediction and quantile when there is ARCH structure in the DGP. For the AR processes, the
mean is predictable, so the binary predictors perform pretty well even without bagging. Therefore, there is
not much room left for bagging to improve on. See Proposition 4. However, for the ARCH processes without
the AR term, the conditional mean is unpredictable. Although there is some binary predictability through
the time-varying conditional higher moments, the predictability is harder to explore than in the AR models.
As we can see from the tables, the mean cost (T1) of the unbagged binary predictor in Tables 3-10 is much
lower than the mean cost of the unbagged binary predictors in Tables 1-2. At the same time, the non-linear
structure in the ARCH models lead to the difficulty in the parameter estimation via numerical optimization,
which also leaves more room for bagging to work. Therefore, we can see that bagging improves more (in all
of T1, T2, and T3) in Tables 1-2 than in Tables 3-10 for both binary predictions and quantile predictions.
One more observation from the Monte Carlo simulation is that bagging works asymmetrically (in terms
of T1, T2, and T3) for asymmetric data. For MW distribution with asymmetric distributions like in Table
4 (MW2), 5 (MW3), and 10 (MW8), bagging works better if QYt+1(α|Xt) lies on the flatter tail for bothbinary and quantile prediction. For example, MW2 and MW8 have flatter left tail, therefore bagging works
better for a smaller α in Tables 4 and 10; while MW3 has flatter right tail, therefore bagging works better
for a larger α in Table 5.
In the previous section, we introduced several weights to form bagging predictors: equal weight and the
BMA-weights with lags k = 1, 5, and R, defined in (26) and (28). With less lags, we intend to focus on the
most recent performance of each bootstrap predictor. According to our simulation results, the BMA-weight
with k = 1 performs the best for our DGP’s, although all weights often work quite similarly.
6 Empirical Application
Our empirical application of the binary prediction is to time the market for the S&P500 and NASDAQ
indexes. Let Yt represents the log difference of a stock index at time t, and suppose we are interested in
predicting whether the stock index will rise or not in the next month.
21
The S&P 500 series, retrieved from finance.yahoo.com, is monthly data from October 1982 to February
2004 (T + 1 = 257). The NASDAQ series is also retrieved from finance.yahoo.com, monthly from October
1984 to February 2004 (T + 1 = 233). We split the series into two parts: one for in-sample estimation with
the size R = 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140 and 150, and another for out-of-sample
forecast validation with sample size P = 100 (fixed for all R’s). We choose the most recent P = 100 months
in the sample as a out-of-sample validation sample. We use a rolling-sample scheme, that is, the first forecast
is based on observations T −P −R+1 through T −P , the second density forecast is based on observationsT − P −R+ 2 through T − P + 1, and so on. The two series are summarized as follows:
In-sample period Out-of-sample period T + 1 PS&P 500 October 1982 ∼ October 1995 November 1995 ∼ February 2004 257 100Nasdaq October 1984 ∼ October 1995 November 1995 ∼ February 2004 233 100
Mean Median Max Min Std.dev. Skewness KurtosisS&P 500 0.008 0.013 0.124 -0.245 0.045 -1.066 6.996Nasdaq 0.009 0.019 0.199 -0.318 0.072 -1.017 5.921
We consider the asymmetric cost parameter α = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9 to represent
different possible preferences. With the cost function given in (9), if we predict the market to go down while
the market goes up, the cost is α; and if we predict the market to go up while the market goes down, the
cost is 1 − α. Therefore, if a person only buys and holds the stock, she tends to have a α smaller than 0.5
because missing a earning opportunity will not be as bad as losing money. However, if a person wants to
sell short, she tends to have a α larger than 0.5 because of the leverage effect. Since most investors belong
to the first category, the more predictability may be exploited for small α’s.
Figures 1-2 present the graphs of the out-of-sample average costs (S1 or S2) in vertical axis against the
training sample size R in the horizontal axis, for the nine values of α. There are lines for each α — the dark
solid line is for the cost S1 of the unbagged predictor, and the other four lines are for the cost S2 of the
bagging predictor with different weights (equal weight or the BMA weights).
Figure 1(a) and Figure 2(a) (for S&P500 and NASDAQ respectively) tell a similar story. We expect
that the prediction with smaller α’s will be harder than with larger α’s, which is confirmed by our empirical
results. For the binary prediction bagging works better with smaller α and with the smaller R. The bagging
predictors with different weighting schemes seem to work similarly.
For all α’s and for both binary and quantile predictions, the cost of the bagging predictors converge to
the optimal cost much faster than that of the unbagged predictors. We can see that the bagging predictors
converge to the stable level of the cost for R as small as 20 in most cases for both binary predictions and
quantile predictions. However, a larger R is needed for the unbagged predictors converge to that level of the
cost.
22
When the sample size is small, bagging can lower the cost to larger extent, and bagging predictors almost
never get worse than the unbagged predictors. When R = 20, bagging can lower the cost as great as 50% for
both binary predictions and quantile predictions! When R grows larger, unbagged predictors and bagging
predictors will converge to the same stable cost level.
7 Conclusions
We have examined how bagging works for binary prediction with an asymmetric cost function for time series.
We compute the binary predictor from the quantile predictor. Bagging binary predictors are constructed via
majority voting on the binary predictors trained on the bootstrapped training samples. We have shown the
conditions under which bagging works for the bagging binary predictors. Based on the asymmetric quantile
check functions, by treating it as a quasi likelihood, we have derived the various BMA-weights to form the
weighted bagging both for binary and quantile predictors. The simulation results and the empirical results
using two U.S. stock index monthly returns, not only confirm but also clearly demonstrate the conditions
that bagging may work. The main finding of the paper is that bagging basically works when the size of the
training sample is small and the predictor is unstable. Bagging does not work when the training sample size
is large, and so bagging seems to mainly smooth out the parameter estimation uncertainty due to a small
sample to improve the forecasting performance.
The advantage of bagging discussed and examined in this paper gives the bagging a very wide potential
application in areas where small sample is common. Bagging will be particularly relevant and useful in
practice when structural breaks are frequent so that simply using as many observations as possible is not
a wise choice for out-of-sample prediction, as emphasized in Pesaran and Timmermann (2001, 2002b) and
Paye and Timmermann (2003).
There is still much work to be done in this research. We have considered only a linear quantile model
in this paper. The bagging may be even more useful for the nonlinear quantile models as considered by
Komunjer (2003) and White (1992). Also, in this paper, we have considered only one-step ahead forecasts.
Bagging multi-step ahead forecasts for binary variables and quantiles is an interesting topic in our next
research agenda. More empirical work should be conducted to evaluate the gain from bagging in forecasting
many macroeconomic and financial time series in various frequencies.
23
References
Aiolfi, M. and A. Favero (2002), “Model Uncertainty, Thick Modelling, and the Predictability of Stock
Returns”, University of Bocconi.
Aiolfi, M. and A. Timmermann (2004), “Persistence in Forecasting Performance”, UCSD.
Avramov, K. (2002), “Stock Return Predictability and Model Uncertainty”, Journal of Financial Eco-
nomics, 64, 423-458.
Bates, J.M. and C.W.J. Granger (1969), “The Combination of Forecasts”, Operations Research Quarterly,
20, 451-468.
Bauer, E. and R. Kohavi (1999), “An Empirical Comparison of Voting Classification Algorithms: Bagging,
Boosting, and Variants”, Machine Learning, 36, 105-139.
Breiman, L. (1996a), “Bagging Predictors”, Machine Learning, 24, 123-140.
Breiman, L. (1996b), “Heuristics of Instability and Stabilization in Model Selection”, Annals of Statistics,
24(6), 2350—2383.
Buhlmann, P. and B. Yu (2002), “Analyzing Bagging”, Annals of Statistics, 30(4), 927-961.
Buja, Andreas and W. Stuetzle (2002), “Observations on Bagging”, University of Pennsylvania and Uni-
versity of Washington, Seattle.
Christofferson, P.F. and F.X. Diebold (2003), “Financial Asset Returns, Market Timing, and Volatility
Dynamics”, McGill University and University of Pennsylvania.
Chernozhukov, V. and Len Umantsev (2001), “Conditional Value-at-Risk: Aspects of Modeling and Esti-
mation”, Empirical Economics, 26, 271-292.
Ding, Z., C.W.J. Granger and R. Engle (1993), “A Long Memory Property of Stock Market Returns and
a New Model”, Journal of Empirical Finance, 1, 83-106.
Dzeroski, S. and B. Zenko (2004), “Is Combining Classifiers with Stacking Better than Selecting the Best
One?”, Machine Learning, 54, 255-273.
Elliott, G., I. Komunjer, and A. Timmermann (2003a), “Biases in Macroeconomic Forecasts: Irrationality
or Asymmetric Loss?”, UCSD and Caltech.
Elliott, G., I. Komunjer, and A. Timmermann (2003b), “Estimating Loss Function Parameters”, UCSD
and Caltech.
Evgeniou, T., M. Pontil, and A. Elisseeff (2004), “Leave One Out Error, Stability, and Generalization of
Voting Combinations of Classifiers”, forthcoming, Machine Learning.
Fitzenberger, B. (1997), “The Moving Blocks Bootstrap and Robust Inference for Linear Least Squares and
Quantile Regressions”, Journal of Econometrics, 82, 235-287.
Friedman, J.H. and P. Hall (2000), “On Bagging and Nonlinear Estimation”, Stanford University and
Australian National University.
24
Garratt, A., K. Lee, H.M. Pesaran, and Y. Shin (2003), “Forecast Uncertainties in Macroeconometric
Modelling: An Application to the UK Economy”, Journal of American Statistical Association, 98,
829-838.
Grandvalet, Y. (2004), “Bagging Equalizes Influence”, Machine Learning, in press.
Granger, C.W.J. (1999a), Empirical Modeling in Economics: Specification and Evaluation, Cambridge
University Press: London.
Granger, C.W.J. (1999b), “Outline of Forecast Theory Using Generalized Cost Functions”, Spanish Eco-
nomic Review 1, 161-173.
Granger, C.W.J. (2002), “Some Comments on Risk”, Journal of Applied Econometrics, 17, 447-456.
Granger, C.W.J., M. Deutsch, and T. Terasvirta (1994), “The Combination of Forecasts Using Changing
Weights”, International Journal of Forecasting, 10, 47-57.
Granger, C.W.J., Z. Ding and S. Spear (2000), “Stylized Facts on the Temporal and Distributional Prop-
erties of Absolute Returns: An Update”, Working paper, University of California, San Diego.
Granger, C.W.J. and Y. Jeon (2004), “Thick Modeling”, Economic Modelling, 21, 323-343.
Granger, C.W.J. and M.H. Pesaran (2000), “Economic and Statistical Seasures of Forecast Accuracy”,
Journal of Forecasting, 19, 537-560.
Hahn, J. (1997), “Bayesian Bootstrap of the Quantile Regression Estimator: A Large Sample Study”,
Internationl Economic Review, 38, 795-808.
Hardle, W., J. Horowitz and J.P. Kreiss (2003), “Bootstrap Methods for Time Series”, International Sta-
tistical Review, 71(2), 435-459.
Harter, H.L. (1977), “Nonuniqueness of Least Absolute Values Regression”, Communications in Statistics
- Theory and Methods, A6, 829-838.
Hong, Y. and J. Chung (2003), “Are the Directions of Stock Price Changes Predictable? Statistical Theory
and Evidence”, Cornell University.
Hong, Y. and T.-H. Lee (2003), “Inference on Predictability of Foreign Exchange Rates via Generalized
Spectrum and Nonlinear Time Series Models”, Review of Economics and Statistics, 85(4), November,
in press.
Horowitz, J.L.(1992), “A Smoothed Maximum Score Estimator for the Binary Response Model”, Econo-
metrica, 60(3), 505-531.
Horowitz, J.L.(1998), “Bootstrap Methods for Median Regression Models”, Econometrica, 66(6), 1327-1351.
Kim, J. and D. Pollard (1990), “Cube Root Asymptotics”, Annals of Statistics, 18, 191-219.
Kim, T.H. and H. White (2001), “Estimation, Inference, and Specification Testing for Possibly Misspecified
Quantile Regressions”, Advances in Econometrics, forthcoming.
Knight, K. and G.W. Bassett Jr. (2002), “Second Order Improvements of Sample Quantiles Using Sub-
samples”, University of Toronto and University of Illinois, Chicago.
25
Knox, T., J.H. Stock, and M.W. Watson (2000), “Empirical Bayes Forecasts of One Time Series Using
Many Predictors,” Working Paper, Department of Economics, Harvard University.
Koenker, R and G. Basset (1978), “Asymptotic Theory of Least Absolute Error Regression”, Journal of
the American Statistical Association, 73, 618-622.
Komunjer, I. (2003), “Quasi-Maximum Likelihood Estimation for Conditional Quantiles”, Caltech.
Kordas, G. (2002), “Smoothed Binary Regression Quantiles”, University of Pennsylvania.
Kuncheva, L.I. and C.J. Whitaker (2003), “Measures of Diversity in Classifier Ensembles and Their Rela-
tionship with the Ensemble Accuracy”, Machine Learning, 51, 181-207.
Luce, R.D. (1980), “Some Possible Measures of Risk”, Theory and Decision, 12, 217-228.
Luce, R.D. (1981), Corrections to “Some Possible Measures of Risk”, Theory and Decision, 13, 381.
Manski, C.F. (1975), “Maximum Score Estimation of the Stochastic Utility Model of Choice”, Journal of
Econometrics, 3(3), 205-228.
Manski, C.F. (1985), “Semiparametric Analysis of Discrete Response: Asymptotic Properties of the Maxi-
mum Score Estimator”, Journal of Econometrics, 27(3), 313-333.
Manski, C.F., and T. S. Thompson (1989), “Estimation of Best Predictors of Binary Response”, Journal
of Econometrics, 40, 97-123.
Marron, J.S. and M.P. Wand (1992), “Exact Mean Integrated Squared Error”, Annals of Statistics, 20,
712—736.
Money, A.H., J.F. Affleck-Graves, M.L. Hart, and G.D.I. Barr (1982), “The Linear Regression Model and
the Choice of p”, Communications in Statistics - Simulations and Computations, 11(1), 89-109.
Nyguist, H. (1983), “The Optimal Lp-norm Estimation in Linear Regression Models”, Communications in
Statistics - Theory and Methods, 12, 2511-2524.
Paye, B.S. and A. Timmermann (2003), “Instability of Return Predictability Models”, UCSD.
Pesaran, M.H. and A. Timmermann (2001), “How Costly Is It To Ignore Breaks When Forecasting the
Direction of a Time Series?”, forthcoming, International Journal of Forecasting.
Pesaran, M.H. and A. Timmermann (2002a), “Market Timing and Return Prediction Under Model Insta-
bility”, Journal of Empirical Finance, 9, 495-510.
Pesaran, M.H. and A. Timmermann (2002b), “Model Instability and Choice of Observation Window”,
UCSD and Cambridge.
Politis, D.N. (2003), “The Impact of Bootstrap Methods on Time Series Analysis”, forthcoming in Statistical
Science.
Powell, J.L. (1986), “Censored Regression Quantiles”, Journal of Econometrics, 32, 143-155.
Portnoy, S. and R. Koenker (1997), “The Gaussian Hare and the Laplacean Tortoise: Computability of l1vs l2 Regression Estimators”, Statistical Science, 12, 279-300.
26
Rao J.S. and Tibshirani R. (1997), “The Out-of-Bootstrap Method for Model Averaging and Selection”,
University of Toronto.
Skouras, S. (2002), “The Sign of a Mean Regression: Characterization, Estimation and Application”, Santa
Fe Institute.
Stock, J.H. and M.W. Watson (1999), “A Comparison of Linear and Nonlinear Univariate Models for
Forecasting Macroeconomic Time Series”, in Cointegration, Causality, and Forecasting, A Festschrift
in Honor of C.W.J. Granger, (eds.) R.F. Engle and H. White, Oxford University Press: London, pp.
1-44.
West, K.D. and M.W. McCracken (1998), “Regression-Based Tests of Predictive Ability”, International
Economic Review, 39(4), 817-840.
White, H. (1992), “Nonparametric Estimation of Conditional Quantiles Using Neural Networks”, Proceed-
ings of the Symposium on the Interface, New York: Springer-Verlag, 190-199.
Yang, Yuhong (2004), “Combining Forecasting Procedures: Some Theoretical Results”, Econometric The-
ory, 20, 176-222.
Zou, Hui and Yuhong Yang (2004), “Combining Time Series Models for Forecasting”, International Journal
of Forecasting, 20, 69-84.
27
Table 1. AR(0)-ARCH(1)-MW1(Gaussian): 111 +++ = ttt huy , 21 5.05.0 tt yh +=+
R=20 R =50 R =100 R =200 J=1 J =50 J =1 J =50 J =1 J =50 J =1 J =50
Binary EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR T1 6.89 5.71 5.56 5.60 5.63 5.41 5.11 5.08 5.09 5.09 5.10 5.05 5.05 5.05 5.05 5.04 5.04 5.04 5.04 5.04 T2 1.28 1.09 0.98 1.04 1.05 0.80 0.54 0.52 0.52 0.53 0.55 0.51 0.51 0.51 0.51 0.54 0.54 0.54 0.54 0.54 T3 0.80 0.80 0.81 0.81 0.35 0.36 0.36 0.34 0.08 0.08 0.08 0.08 0.00 0.00 0.00 0.00
α=.1
T4 0.08 0.09 0.07 0.07 0.55 0.56 0.56 0.56 0.90 0.90 0.90 0.90 1.00 1.00 1.00 1.00 T1 18.42 15.50 15.20 15.37 15.48 16.36 15.28 15.27 15.27 15.31 15.56 15.17 15.18 15.17 15.16 15.25 15.14 15.13 15.13 15.13 T2 2.44 2.28 2.13 2.23 2.19 1.94 1.69 1.62 1.65 1.71 1.74 1.50 1.50 1.51 1.51 1.66 1.65 1.64 1.64 1.65 T3 0.89 0.95 0.93 0.91 0.80 0.83 0.80 0.79 0.39 0.37 0.40 0.40 0.16 0.16 0.16 0.16
α=.3
T4 0.01 0.00 0.00 0.02 0.07 0.08 0.08 0.08 0.51 0.53 0.52 0.51 0.82 0.81 0.82 0.82 T1 25.56 21.26 21.24 21.38 21.48 25.17 22.63 22.67 22.65 22.69 25.04 23.22 23.22 23.20 23.27 24.93 23.23 23.33 23.17 23.24 T2 2.29 3.38 3.11 3.26 3.32 2.20 2.49 2.36 2.25 2.38 2.60 2.35 2.48 2.61 2.31 2.56 2.78 2.54 2.81 2.77 T3 0.93 0.97 0.94 0.93 0.81 0.82 0.82 0.82 0.80 0.80 0.77 0.80 0.70 0.72 0.72 0.65
α=.5
T4 0.04 0.00 0.02 0.03 0.09 0.06 0.08 0.09 0.07 0.03 0.09 0.06 0.07 0.03 0.06 0.14 T1 17.80 15.57 15.41 15.57 15.60 15.82 14.93 14.88 14.94 14.95 15.15 14.88 14.87 14.86 14.87 14.98 14.87 14.86 14.87 14.88 T2 2.19 2.08 2.09 2.07 2.11 1.96 1.63 1.61 1.68 1.63 1.68 1.60 1.58 1.58 1.60 1.70 1.63 1.63 1.63 1.62 T3 0.88 0.91 0.87 0.86 0.73 0.75 0.73 0.73 0.37 0.37 0.37 0.37 0.16 0.17 0.16 0.14
α=.7
T4 0.00 0.01 0.02 0.00 0.11 0.11 0.13 0.11 0.54 0.57 0.56 0.54 0.80 0.79 0.79 0.81 T1 6.94 5.89 5.83 5.86 5.92 5.22 4.96 4.97 4.97 4.96 5.03 4.95 4.95 4.95 4.96 4.96 4.96 4.96 4.96 4.96 T2 1.49 1.45 1.39 1.38 1.47 0.69 0.50 0.52 0.52 0.50 0.62 0.51 0.51 0.51 0.52 0.54 0.54 0.54 0.54 0.54 T3 0.77 0.78 0.79 0.77 0.29 0.28 0.30 0.29 0.09 0.09 0.09 0.09 0.00 0.00 0.00 0.00
α=.9
T4 0.07 0.09 0.05 0.07 0.62 0.64 0.62 0.62 0.89 0.89 0.89 0.89 0.99 0.99 0.99 0.99 Quantile
T1 25.46 19.20 18.30 18.60 18.93 19.27 17.13 17.07 17.10 17.13 17.83 16.77 16.75 16.77 16.78 16.73 16.28 16.29 16.28 16.28 T2 6.75 3.91 3.47 3.40 3.64 3.02 2.47 2.43 2.44 2.47 3.27 2.57 2.57 2.55 2.57 2.73 2.47 2.49 2.47 2.47 T3 0.96 0.98 0.98 0.97 0.97 0.97 0.97 0.97 0.93 0.92 0.93 0.93 0.88 0.88 0.88 0.88
α=.1
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 40.75 33.69 33.57 33.55 33.67 35.74 32.83 32.91 32.86 32.86 34.28 32.86 32.84 32.86 32.87 32.86 32.21 32.21 32.21 32.21 T2 7.03 5.22 5.19 5.02 5.10 4.83 4.02 4.09 4.03 4.03 5.12 4.70 4.64 4.71 4.70 4.51 4.37 4.38 4.37 4.37 T3 0.97 0.99 1.00 0.99 0.99 0.99 1.00 0.99 0.96 0.97 0.97 0.96 0.90 0.89 0.90 0.90
α=.3
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 45.31 38.12 38.25 38.02 38.15 40.62 37.32 37.56 37.38 37.35 38.87 37.23 37.29 37.24 37.24 37.53 36.80 36.81 36.80 36.80 T2 7.12 6.14 6.50 5.98 6.08 5.43 4.50 4.73 4.53 4.51 6.13 5.04 5.18 5.07 5.04 4.90 4.85 4.84 4.85 4.85 T3 1.00 1.00 1.00 1.00 0.99 0.98 0.99 0.99 0.99 0.99 0.99 0.99 0.93 0.92 0.94 0.93
α=.5
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 40.10 33.98 33.87 33.67 33.92 35.56 32.39 32.52 32.40 32.41 33.86 32.44 32.48 32.46 32.45 32.70 32.09 32.13 32.10 32.10 T2 6.32 5.93 6.10 5.44 5.75 5.12 3.76 3.91 3.77 3.77 5.53 4.74 4.82 4.78 4.75 4.42 4.26 4.31 4.27 4.26 T3 0.97 0.98 0.98 0.98 0.98 0.99 0.99 0.98 0.96 0.96 0.96 0.96 0.90 0.90 0.92 0.90
α=.7
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 24.64 19.88 18.96 19.10 19.45 19.19 16.74 16.72 16.72 16.74 17.58 16.58 16.57 16.57 16.58 16.86 16.34 16.36 16.35 16.34 T2 5.18 5.06 4.15 4.07 4.38 3.96 2.44 2.50 2.43 2.45 3.13 2.72 2.72 2.72 2.72 2.58 2.40 2.43 2.41 2.40 T3 0.91 0.97 0.97 0.96 0.98 0.98 0.98 0.98 0.92 0.91 0.92 0.92 0.88 0.87 0.88 0.88
α=.9
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Note: EW and Wk denote the weighted-bagging with equal weights (EW) and with BMA-weights using k most recent in-sample observations (k = 1, 5, and R).
Table 2. AR(0)-ARCH(1)-MW1(Gaussian): 111 +++ = ttt huy , 21 9.01.0 tt yh +=+
R=20 R =50 R =100 R =200 J=1 J =50 J =1 J =50 J =1 J =50 J =1 J =50
Binary EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR T1 7.14 5.61 5.45 5.55 5.59 5.41 5.06 5.05 5.05 5.06 5.13 5.05 5.05 5.05 5.05 5.04 5.04 5.04 5.04 5.04 T2 1.39 1.21 1.11 1.12 1.18 0.80 0.48 0.47 0.47 0.49 0.63 0.51 0.51 0.51 0.51 0.55 0.54 0.54 0.54 0.54 T3 0.87 0.91 0.88 0.87 0.39 0.40 0.39 0.39 0.10 0.10 0.10 0.10 0.01 0.01 0.01 0.01
α=.1
T4 0.04 0.05 0.05 0.05 0.52 0.51 0.53 0.51 0.88 0.88 0.88 0.88 0.97 0.97 0.97 0.97 T1 18.43 15.64 15.49 15.51 15.64 16.75 15.45 15.45 15.47 15.48 15.71 15.18 15.19 15.20 15.19 15.39 15.13 15.11 15.11 15.13 T2 2.10 2.23 2.09 2.06 2.15 2.04 1.82 1.73 1.78 1.83 1.77 1.61 1.57 1.60 1.60 1.72 1.62 1.61 1.61 1.62 T3 0.86 0.91 0.90 0.87 0.84 0.83 0.85 0.84 0.48 0.50 0.47 0.46 0.27 0.26 0.27 0.27
α=.3
T4 0.01 0.02 0.02 0.01 0.08 0.08 0.07 0.08 0.45 0.44 0.46 0.45 0.69 0.69 0.69 0.69 T1 25.45 20.92 20.92 21.06 21.07 24.99 22.61 22.89 22.63 22.73 24.95 23.19 23.24 23.22 23.29 24.83 23.74 23.85 23.70 23.70 T2 2.43 3.65 3.29 3.42 3.60 2.15 2.36 2.33 2.40 2.43 2.48 2.60 2.53 2.59 2.63 2.51 2.63 2.81 2.77 2.50 T3 0.95 0.97 0.96 0.95 0.85 0.80 0.83 0.85 0.76 0.73 0.74 0.76 0.69 0.67 0.65 0.70
α=.5
T4 0.01 0.02 0.01 0.02 0.05 0.04 0.06 0.07 0.06 0.10 0.05 0.06 0.11 0.09 0.11 0.08 T1 18.13 15.58 15.31 15.53 15.64 16.24 14.93 14.94 14.95 14.99 15.33 14.88 14.92 14.91 14.90 15.16 14.99 15.01 15.02 15.00 T2 2.35 2.14 1.95 2.06 2.13 2.04 1.74 1.68 1.72 1.81 1.86 1.69 1.72 1.68 1.68 1.77 1.73 1.73 1.76 1.72 T3 0.92 0.93 0.92 0.92 0.81 0.81 0.80 0.81 0.47 0.44 0.45 0.45 0.25 0.24 0.23 0.25
α=.7
T4 0.01 0.01 0.00 0.01 0.12 0.14 0.12 0.12 0.45 0.46 0.47 0.47 0.69 0.69 0.70 0.69 T1 7.25 5.69 5.64 5.69 5.74 5.32 4.99 4.98 4.98 5.00 5.03 4.95 4.94 4.95 4.95 5.02 4.98 4.97 4.98 4.98 T2 1.38 1.30 1.24 1.34 1.31 0.71 0.57 0.54 0.57 0.58 0.60 0.51 0.51 0.51 0.51 0.63 0.55 0.54 0.55 0.55 T3 0.89 0.89 0.88 0.88 0.39 0.40 0.39 0.39 0.08 0.09 0.08 0.08 0.03 0.03 0.03 0.03
α=.9
T4 0.06 0.07 0.06 0.06 0.50 0.52 0.53 0.51 0.91 0.90 0.91 0.91 0.97 0.97 0.97 0.97 Quantile
T1 19.36 13.51 12.56 12.72 13.12 13.87 11.72 11.77 11.73 11.75 12.17 11.03 11.00 11.05 11.03 11.03 10.55 10.58 10.57 10.56 T2 9.78 5.45 4.79 4.55 5.05 6.32 3.81 4.22 4.01 3.95 4.47 3.36 3.18 3.36 3.37 3.83 3.20 3.27 3.24 3.23 T3 0.94 0.98 0.96 0.95 0.97 0.97 0.98 0.97 0.91 0.91 0.91 0.91 0.80 0.81 0.79 0.80
α=.1
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 28.81 22.17 22.38 21.88 22.07 25.16 21.68 22.26 21.67 21.67 22.51 20.88 20.96 20.92 20.89 21.28 20.23 20.36 20.26 20.23 T2 11.08 7.32 8.73 6.75 7.00 9.95 6.29 8.58 6.31 6.23 7.49 5.88 6.02 5.96 5.90 6.92 5.52 5.83 5.62 5.54 T3 0.98 1.00 1.00 0.99 0.99 0.98 1.00 0.99 0.96 0.96 0.97 0.96 0.90 0.87 0.90 0.90
α=.3
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 31.95 24.13 24.92 24.03 24.02 27.75 24.01 24.68 24.08 23.99 25.77 23.37 23.75 23.43 23.38 24.14 23.02 23.13 23.06 23.03 T2 12.33 6.78 10.48 6.97 6.55 10.24 6.44 8.33 6.64 6.38 9.75 6.13 7.23 6.31 6.14 7.93 6.16 6.60 6.28 6.20 T3 0.99 0.99 1.00 1.00 0.98 0.99 0.99 0.99 0.96 0.96 0.96 0.96 0.95 0.94 0.95 0.95
α=.5
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 28.65 21.85 22.56 21.67 21.72 24.23 21.01 21.46 21.13 21.02 22.41 20.60 20.83 20.70 20.61 21.13 20.32 20.48 20.34 20.34 T2 9.96 6.62 9.87 6.63 6.39 8.51 5.52 7.12 5.88 5.53 7.97 5.85 6.51 6.08 5.87 6.87 5.76 6.16 5.80 5.84 T3 0.98 0.98 0.99 0.98 0.99 0.99 0.99 0.99 0.97 0.97 0.97 0.97 0.89 0.86 0.90 0.89
α=.7
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 18.22 14.12 12.85 12.98 13.33 13.66 11.28 11.26 11.22 11.27 11.97 10.84 10.86 10.87 10.85 11.29 10.78 10.73 10.78 10.80 T2 8.16 6.70 4.93 4.86 5.23 6.53 3.61 3.67 3.48 3.58 4.57 3.54 3.48 3.58 3.54 4.18 3.60 3.41 3.56 3.68 T3 0.93 0.97 0.98 0.96 0.92 0.93 0.92 0.92 0.84 0.85 0.85 0.84 0.90 0.91 0.90 0.90
α=.9
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Table 3. AR(1)-ARCH(0)-MW1(Gaussian): 11 6.0 ++ += ttt uyy
R=20 R =50 R =100 R =200 J=1 J =50 J =1 J =50 J =1 J =50 J =1 J =50
Binary EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR T1 6.52 5.66 5.36 5.46 5.55 4.97 4.78 4.78 4.78 4.78 5.09 4.95 4.93 4.95 4.94 4.91 4.94 4.94 4.94 4.94 T2 1.45 1.54 1.28 1.42 1.44 1.08 0.86 0.86 0.86 0.86 0.98 0.90 0.88 0.90 0.89 0.92 0.84 0.84 0.84 0.84 T3 0.72 0.78 0.77 0.76 0.43 0.43 0.43 0.43 0.31 0.34 0.31 0.32 0.18 0.19 0.18 0.18
α=.1
T4 0.01 0.03 0.02 0.02 0.15 0.15 0.15 0.15 0.28 0.24 0.28 0.28 0.21 0.21 0.21 0.21 T1 14.46 13.35 13.11 13.23 13.33 12.94 12.63 12.64 12.58 12.64 12.91 12.87 12.85 12.84 12.86 12.06 12.35 12.35 12.36 12.35 T2 2.41 2.24 2.14 2.13 2.22 2.57 2.24 2.28 2.21 2.25 1.90 2.00 1.97 1.99 1.99 2.28 2.21 2.22 2.20 2.20 T3 0.76 0.82 0.81 0.79 0.55 0.56 0.54 0.53 0.39 0.39 0.37 0.41 0.33 0.32 0.34 0.33
α=.3
T4 0.01 0.00 0.00 0.00 0.00 0.01 0.02 0.00 0.08 0.08 0.07 0.08 0.05 0.06 0.03 0.04 T1 17.44 16.26 15.91 16.23 16.32 15.32 15.24 15.05 15.20 15.22 15.79 15.80 15.71 15.76 15.80 14.76 14.75 14.76 14.76 14.73 T2 2.73 2.68 2.64 2.71 2.76 2.80 2.78 2.70 2.83 2.78 2.32 2.21 2.15 2.20 2.18 2.50 2.39 2.42 2.44 2.42 T3 0.69 0.73 0.68 0.66 0.46 0.57 0.50 0.51 0.36 0.40 0.40 0.35 0.36 0.30 0.35 0.36
α=.5
T4 0.10 0.13 0.11 0.13 0.18 0.16 0.17 0.16 0.27 0.25 0.23 0.29 0.33 0.38 0.35 0.36 T1 14.08 13.12 12.94 13.06 13.21 12.50 12.36 12.26 12.32 12.35 12.79 12.89 12.81 12.86 12.85 12.33 12.47 12.50 12.48 12.46 T2 2.46 2.40 2.25 2.41 2.45 2.38 2.14 2.24 2.25 2.26 2.47 2.36 2.28 2.29 2.31 2.05 2.06 2.10 2.07 2.05 T3 0.69 0.73 0.67 0.68 0.55 0.57 0.54 0.54 0.45 0.43 0.44 0.43 0.41 0.39 0.40 0.41
α=.7
T4 0.02 0.04 0.04 0.02 1.00 0.03 0.04 0.04 0.05 0.03 0.07 0.04 0.03 0.06 0.06 0.06 0.06 T1 6.22 5.18 5.00 5.11 5.21 5.12 4.92 4.90 4.91 4.92 4.96 4.87 4.88 4.87 4.88 5.01 4.91 4.90 4.91 4.91 T2 1.53 1.36 1.13 1.29 1.37 1.01 0.91 0.89 0.89 0.92 1.00 0.83 0.84 0.84 0.84 0.96 0.84 0.84 0.85 0.85 T3 0.73 0.77 0.74 0.73 0.45 0.48 0.47 0.44 0.36 0.34 0.35 0.34 0.32 0.32 0.32 0.32
α=.9
T4 0.02 0.03 0.02 0.02 0.14 0.12 0.12 0.15 0.11 0.13 0.11 0.12 0.20 0.22 0.20 0.22 Quantile
T1 26.34 22.24 19.98 20.76 21.29 20.01 18.74 18.50 18.61 18.70 18.52 18.02 17.94 17.98 18.02 17.90 17.84 17.83 17.84 17.84 T2 4.89 4.54 3.05 3.32 3.77 2.18 1.85 1.69 1.75 1.82 2.11 1.95 1.89 1.89 1.95 1.76 1.72 1.71 1.72 1.72 T3 0.86 0.99 0.94 0.92 0.86 0.93 0.90 0.87 0.75 0.78 0.76 0.75 0.50 0.50 0.50 0.50
α=.1
T4 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 42.30 38.84 37.17 37.77 38.31 37.52 36.62 36.29 36.46 36.58 35.85 35.60 35.49 35.56 35.60 35.29 35.26 35.21 35.24 35.26 T2 4.41 4.19 3.59 3.67 3.87 3.22 3.11 2.97 3.03 3.10 3.36 3.28 3.22 3.26 3.28 2.92 3.00 2.97 2.99 3.00 T3 0.90 0.99 0.97 0.94 0.82 0.85 0.84 0.83 0.60 0.67 0.60 0.60 0.57 0.57 0.56 0.57
α=.3
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 47.65 43.98 42.47 43.06 43.60 42.35 41.33 41.02 41.17 41.29 40.76 40.48 40.34 40.43 40.48 40.23 40.18 40.13 40.16 40.17 T2 4.56 4.54 3.82 4.00 4.22 3.27 3.23 3.06 3.13 3.21 3.52 3.48 3.45 3.47 3.47 3.11 3.15 3.13 3.15 3.15 T3 0.87 0.99 0.95 0.89 0.82 0.91 0.87 0.85 0.63 0.73 0.67 0.63 0.58 0.62 0.61 0.58
α=.5
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 42.57 39.18 37.46 38.08 38.68 36.74 35.99 35.68 35.83 35.96 35.43 35.26 35.15 35.21 35.25 34.93 34.92 34.88 34.90 34.92 T2 4.10 4.48 3.79 3.86 4.12 2.85 2.76 2.63 2.66 2.74 3.24 3.11 3.08 3.09 3.11 2.65 2.63 2.62 2.63 2.63 T3 0.84 0.98 0.95 0.90 0.68 0.77 0.73 0.71 0.68 0.71 0.70 0.68 0.55 0.58 0.56 0.55
α=.7
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 25.92 21.52 19.70 20.24 20.74 19.61 18.36 18.16 18.26 18.33 18.30 17.87 17.83 17.85 17.86 17.80 17.71 17.69 17.70 17.70 T2 4.29 4.07 2.83 3.04 3.42 2.26 1.89 1.77 1.83 1.87 2.11 1.88 1.88 1.88 1.88 1.60 1.43 1.42 1.43 1.43 T3 0.85 0.98 0.98 0.93 0.85 0.89 0.89 0.86 0.74 0.76 0.73 0.74 0.55 0.56 0.55 0.55
α=.9
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Table 4. AR(1)-ARCH(0)- MW2(Skewed Unimodal): 11 6.0 ++ += ttt uyy
R=20 R =50 R =100 R =200 J=1 J =50 J =1 J =50 J =1 J =50 J =1 J =50
Binary EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR T1 7.36 6.00 5.71 5.82 5.88 5.57 5.14 5.13 5.13 5.14 5.51 5.29 5.29 5.29 5.29 5.28 5.27 5.27 5.27 5.27 T2 1.77 1.69 1.27 1.45 1.53 1.12 0.84 0.83 0.83 0.85 1.12 0.97 0.97 0.97 0.97 0.95 0.88 0.88 0.88 0.88 T3 0.78 0.83 0.82 0.81 0.60 0.60 0.60 0.60 0.42 0.42 0.42 0.41 0.24 0.24 0.24 0.24
α=.1
T4 0.04 0.04 0.03 0.03 0.09 0.08 0.08 0.09 0.25 0.23 0.25 0.24 0.33 0.32 0.32 0.32 T1 15.14 14.50 14.05 14.32 14.50 13.68 13.48 13.43 13.43 13.46 13.16 13.17 13.10 13.18 13.19 12.88 12.98 12.95 12.98 12.98 T2 2.57 2.71 2.53 2.52 2.70 2.41 2.20 2.25 2.18 2.18 2.34 2.25 2.30 2.24 2.24 2.34 2.46 2.47 2.45 2.47 T3 0.66 0.68 0.66 0.66 0.52 0.55 0.54 0.52 0.51 0.54 0.51 0.48 0.48 0.49 0.48 0.48
α=.3
T4 0.01 0.02 0.01 0.01 0.03 0.05 0.05 0.03 0.03 0.03 0.03 0.03 0.02 0.03 0.02 0.02 T1 16.61 16.01 15.50 15.79 15.98 15.33 15.17 15.07 15.21 15.21 14.62 14.71 14.65 14.66 14.70 14.24 14.23 14.22 14.22 14.22 T2 2.84 2.55 2.36 2.45 2.48 3.07 2.83 2.80 2.86 2.83 3.00 3.16 3.06 3.14 3.21 2.46 2.41 2.41 2.41 2.43 T3 0.56 0.65 0.61 0.60 0.48 0.51 0.46 0.44 0.43 0.41 0.40 0.38 0.31 0.30 0.34 0.32
α=.5
T4 0.08 0.10 0.07 0.04 0.15 0.19 0.17 0.20 0.17 0.22 0.20 0.19 0.40 0.44 0.38 0.40 T1 12.80 12.04 11.85 11.96 12.09 11.41 11.40 11.40 11.37 11.38 10.83 10.85 10.83 10.80 10.83 11.20 11.31 11.27 11.28 11.29 T2 2.43 2.24 2.17 2.15 2.27 2.21 2.14 2.18 2.16 2.16 2.33 2.11 2.08 2.08 2.09 2.20 2.23 2.22 2.22 2.21 T3 0.64 0.71 0.67 0.63 0.50 0.46 0.48 0.50 0.50 0.52 0.52 0.49 0.46 0.45 0.46 0.44
α=.7
T4 0.01 0.00 0.02 0.01 0.04 0.07 0.06 0.05 0.05 0.06 0.06 0.06 0.04 0.06 0.06 0.05 T1 5.46 4.79 4.69 4.76 4.85 4.50 4.48 4.47 4.47 4.48 4.29 4.41 4.40 4.40 4.40 4.35 4.46 4.43 4.44 4.44 T2 1.62 1.46 1.33 1.42 1.48 1.11 0.94 0.96 0.96 0.96 0.94 0.92 0.92 0.91 0.93 0.93 0.79 0.80 0.80 0.80 T3 0.64 0.70 0.65 0.63 0.39 0.37 0.38 0.37 0.26 0.25 0.26 0.25 0.23 0.25 0.23 0.24
α=.9
T4 0.05 0.02 0.03 0.04 0.05 0.07 0.07 0.07 0.07 0.07 0.07 0.07 0.09 0.08 0.09 0.09 Quantile
T1 30.66 24.13 21.78 22.73 23.25 23.46 21.41 21.13 21.29 21.39 21.57 20.80 20.74 20.76 20.80 20.87 20.59 20.57 20.57 20.59 T2 5.98 5.05 3.56 3.92 4.26 3.20 2.64 2.55 2.58 2.62 3.09 2.71 2.66 2.67 2.71 2.32 2.23 2.22 2.22 2.23 T3 0.87 1.00 0.97 0.95 0.90 0.97 0.92 0.91 0.77 0.79 0.77 0.77 0.67 0.68 0.69 0.67
α=.1
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 43.96 40.60 38.92 39.29 39.95 38.52 37.42 37.11 37.25 37.40 36.49 36.21 36.07 36.14 36.20 36.01 35.92 35.87 35.89 35.91 T2 5.35 5.34 4.66 4.47 4.79 4.21 3.85 3.92 3.85 3.86 4.26 4.13 4.06 4.09 4.13 3.23 3.22 3.22 3.22 3.22 T3 0.82 0.97 0.91 0.89 0.79 0.89 0.84 0.79 0.62 0.67 0.64 0.62 0.64 0.67 0.67 0.64
α=.3
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 45.58 42.49 41.09 41.54 42.09 40.89 40.09 39.73 39.91 40.05 39.02 39.04 38.90 38.98 39.03 38.36 38.31 38.27 38.29 38.31 T2 5.07 4.85 4.22 4.30 4.55 3.86 3.56 3.55 3.54 3.55 4.01 3.98 3.91 3.94 3.97 3.31 3.23 3.23 3.23 3.23 T3 0.79 0.93 0.89 0.87 0.75 0.91 0.81 0.75 0.53 0.65 0.58 0.54 0.52 0.60 0.55 0.54
α=.5
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 38.32 35.59 34.23 34.63 35.17 33.77 33.11 32.77 32.96 33.08 32.35 32.25 32.16 32.21 32.25 31.69 31.74 31.70 31.73 31.74 T2 4.55 4.14 3.60 3.54 3.84 2.72 2.65 2.54 2.63 2.65 2.97 2.91 2.89 2.90 2.91 2.58 2.52 2.52 2.52 2.52 T3 0.82 0.96 0.91 0.84 0.70 0.83 0.78 0.70 0.56 0.62 0.61 0.56 0.50 0.51 0.49 0.50
α=.7
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 22.62 19.42 17.79 18.10 18.62 16.93 16.17 15.93 16.05 16.14 15.78 15.66 15.63 15.65 15.66 15.32 15.46 15.45 15.46 15.46 T2 4.49 3.66 2.74 2.79 3.12 1.96 1.62 1.46 1.53 1.60 1.54 1.53 1.52 1.52 1.53 1.35 1.27 1.26 1.27 1.27 T3 0.77 0.93 0.91 0.87 0.71 0.82 0.76 0.71 0.54 0.54 0.55 0.54 0.45 0.45 0.45 0.45
α=.9
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Table 5. AR(1)-ARCH(0)-MW3(Strongly Skewed): 11 6.0 ++ += ttt uyy
R=20 R =50 R =100 R =200 J=1 J =50 J =1 J =50 J =1 J =50 J =1 J =50
Binary EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR T1 3.32 3.38 3.30 3.32 3.33 2.97 3.26 3.28 3.26 3.26 3.12 3.43 3.45 3.43 3.43 3.00 3.23 3.25 3.23 3.23 T2 0.81 0.71 0.61 0.61 0.64 0.63 0.56 0.56 0.56 0.56 0.68 0.72 0.72 0.72 0.72 0.66 0.70 0.70 0.70 0.70 T3 0.34 0.36 0.37 0.37 0.17 0.18 0.17 0.17 0.13 0.13 0.13 0.13 0.06 0.06 0.06 0.06
α=.1
T4 0.06 0.06 0.06 0.07 0.08 0.05 0.08 0.08 0.08 0.06 0.08 0.08 0.11 0.10 0.11 0.11 T1 9.46 9.34 9.04 9.23 9.28 8.35 8.30 8.29 8.31 8.30 8.62 8.74 8.74 8.74 8.74 8.57 8.60 8.60 8.60 8.61 T2 2.03 2.11 1.83 1.94 1.99 1.55 1.58 1.56 1.58 1.58 1.59 1.60 1.61 1.59 1.61 1.70 1.78 1.78 1.78 1.79 T3 0.52 0.60 0.54 0.56 0.32 0.31 0.31 0.32 0.30 0.30 0.31 0.31 0.34 0.34 0.34 0.34
α=.3
T4 0.06 0.07 0.07 0.04 0.34 0.34 0.31 0.34 0.23 0.23 0.22 0.22 0.33 0.33 0.33 0.31 T1 14.59 14.09 13.64 13.85 14.00 12.85 12.72 12.61 12.66 12.70 12.99 13.06 13.02 13.04 13.02 13.13 13.01 13.04 13.01 13.00 T2 2.68 3.03 2.82 2.79 2.96 2.38 2.28 2.25 2.24 2.27 2.36 2.38 2.30 2.35 2.33 2.61 2.55 2.54 2.55 2.56 T3 0.57 0.67 0.66 0.57 0.45 0.52 0.51 0.45 0.32 0.32 0.30 0.31 0.39 0.41 0.39 0.39
α=.5
T4 0.12 0.14 0.06 0.14 0.26 0.17 0.20 0.26 0.34 0.38 0.35 0.37 0.38 0.35 0.36 0.37 T1 16.12 14.44 14.16 14.39 14.52 14.83 14.58 14.37 14.45 14.46 14.25 14.47 14.32 14.28 14.43 14.12 14.18 14.11 14.10 14.18 T2 2.62 2.61 2.48 2.53 2.56 2.65 2.37 2.28 2.26 2.36 2.32 2.28 2.22 2.20 2.21 2.38 2.43 2.35 2.38 2.42 T3 0.76 0.82 0.80 0.76 0.56 0.62 0.59 0.61 0.42 0.49 0.49 0.43 0.40 0.42 0.40 0.37
α=.7
T4 0.00 0.00 0.01 0.01 0.00 0.03 0.02 0.02 0.03 0.03 0.03 0.04 0.07 0.08 0.08 0.08 T1 8.80 6.88 6.86 6.90 6.99 6.45 5.98 5.98 5.98 5.99 5.90 5.72 5.72 5.71 5.72 5.83 5.79 5.78 5.78 5.78 T2 1.94 1.85 1.69 1.79 1.87 1.18 0.89 0.91 0.91 0.91 1.05 0.90 0.90 0.90 0.90 1.13 0.99 0.98 0.98 0.98 T3 0.85 0.84 0.85 0.84 0.51 0.52 0.52 0.52 0.30 0.29 0.30 0.30 0.14 0.14 0.14 0.14
α=.9
T4 0.01 0.04 0.02 0.02 0.16 0.18 0.19 0.17 0.40 0.40 0.40 0.40 0.66 0.66 0.66 0.66 Quantile
T1 12.96 13.34 11.83 12.13 12.30 9.52 10.31 10.23 10.26 10.28 9.39 10.16 10.12 10.14 10.15 9.19 9.74 9.72 9.73 9.74 T2 2.70 2.96 1.66 1.72 1.79 1.01 1.14 1.12 1.13 1.13 1.07 1.35 1.33 1.33 1.34 0.98 1.18 1.18 1.18 1.18 T3 0.46 0.66 0.62 0.61 0.05 0.08 0.06 0.06 0.07 0.08 0.08 0.08 0.03 0.03 0.03 0.03
α=.1
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 30.27 29.96 28.27 28.65 29.10 25.85 25.89 25.71 25.80 25.86 25.50 25.65 25.58 25.62 25.64 25.07 25.16 25.14 25.15 25.15 T2 4.04 4.34 3.46 3.31 3.54 2.47 2.54 2.47 2.51 2.53 2.90 2.94 2.91 2.92 2.93 2.68 2.65 2.65 2.65 2.65 T3 0.61 0.76 0.72 0.69 0.49 0.56 0.51 0.51 0.41 0.45 0.44 0.42 0.41 0.46 0.42 0.42
α=.3
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 43.05 40.51 39.27 39.58 40.01 37.70 37.19 36.97 37.06 37.15 37.07 36.96 36.88 36.92 36.96 36.25 36.29 36.25 36.27 36.29 T2 5.17 5.09 4.67 4.64 4.84 3.82 3.74 3.65 3.71 3.73 4.23 4.05 4.05 4.05 4.05 3.90 3.72 3.72 3.71 3.72 T3 0.79 0.89 0.84 0.82 0.64 0.76 0.68 0.65 0.56 0.64 0.58 0.57 0.49 0.51 0.49 0.49
α=.5
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 47.22 42.05 40.95 41.22 41.61 40.95 39.29 39.18 39.17 39.27 39.69 39.29 39.15 39.22 39.29 38.56 38.37 38.34 38.35 38.37 T2 5.69 5.70 5.34 5.27 5.44 4.65 4.52 4.47 4.45 4.51 4.72 4.62 4.55 4.59 4.62 3.99 3.77 3.78 3.77 3.77 T3 0.91 0.98 0.96 0.95 0.91 0.94 0.93 0.92 0.68 0.75 0.71 0.68 0.63 0.63 0.63 0.64
α=.7
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 35.58 27.37 25.20 25.70 26.31 26.28 23.70 23.47 23.59 23.70 24.36 23.09 22.98 23.00 23.08 23.36 22.87 22.84 22.87 22.87 T2 6.64 5.74 4.33 4.11 4.47 3.92 3.23 3.09 3.12 3.21 3.10 2.84 2.75 2.76 2.83 2.44 2.18 2.17 2.18 2.18 T3 0.91 0.98 0.97 0.96 0.94 0.99 0.97 0.96 0.89 0.94 0.90 0.89 0.80 0.81 0.80 0.80
α=.9
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Table 6. AR(1)-ARCH(0)-MW4(Kurtoic Unimodal): 11 6.0 ++ += ttt uyy
R=20 R =50 R =100 R =200 J=1 J =50 J =1 J =50 J =1 J =50 J =1 J =50
Binary EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR T1 6.51 5.46 5.12 5.24 5.31 5.26 4.74 4.77 4.75 4.75 4.98 4.83 4.84 4.83 4.84 4.90 4.95 4.95 4.95 4.95 T2 1.58 1.41 1.14 1.19 1.26 1.37 0.97 0.99 0.97 0.97 1.10 0.92 0.90 0.92 0.92 1.08 0.96 0.96 0.96 0.96 T3 0.72 0.79 0.75 0.77 0.58 0.57 0.58 0.58 0.38 0.36 0.38 0.37 0.22 0.22 0.22 0.22
α=.1
T4 0.04 0.04 0.06 0.03 0.11 0.11 0.11 0.11 0.18 0.18 0.18 0.18 0.16 0.16 0.16 0.16 T1 13.02 12.27 12.09 12.18 12.32 11.55 11.53 11.48 11.51 11.55 11.50 11.58 11.52 11.56 11.59 11.34 11.31 11.31 11.31 11.30 T2 2.23 2.11 1.92 2.09 2.13 2.23 2.25 2.21 2.21 2.25 2.15 2.06 2.06 2.05 2.06 2.13 2.22 2.22 2.23 2.21 T3 0.65 0.71 0.71 0.65 0.49 0.46 0.46 0.46 0.47 0.45 0.46 0.46 0.48 0.48 0.48 0.47
α=.3
T4 0.01 0.01 0.01 0.00 0.00 0.03 0.01 0.00 0.02 0.02 0.02 0.02 0.04 0.04 0.05 0.04 T1 14.51 14.33 13.97 14.19 14.32 12.61 12.78 12.58 12.64 12.74 12.64 12.69 12.66 12.62 12.70 12.30 12.27 12.28 12.27 12.28 T2 2.51 2.64 2.55 2.62 2.66 2.07 2.18 2.06 2.12 2.17 2.36 2.37 2.33 2.37 2.37 2.45 2.46 2.48 2.47 2.48 T3 0.49 0.60 0.53 0.49 0.29 0.38 0.35 0.32 0.30 0.26 0.31 0.27 0.25 0.24 0.25 0.25
α=.5
T4 0.08 0.09 0.06 0.07 0.30 0.24 0.27 0.25 0.36 0.47 0.43 0.42 0.61 0.58 0.61 0.59 T1 12.91 11.97 11.68 11.88 11.96 11.54 11.64 11.54 11.55 11.56 11.47 11.56 11.55 11.55 11.56 11.22 11.22 11.24 11.23 11.22 T2 2.37 2.38 2.18 2.36 2.42 2.14 1.98 1.95 1.96 1.97 2.17 2.09 2.12 2.08 2.11 2.33 2.25 2.25 2.27 2.28 T3 0.69 0.73 0.69 0.69 0.50 0.49 0.53 0.56 0.49 0.48 0.46 0.46 0.43 0.40 0.41 0.41
α=.7
T4 0.01 0.00 0.01 0.00 0.00 0.01 0.00 0.00 0.02 0.03 0.03 0.02 0.04 0.04 0.05 0.06 T1 6.55 5.24 5.24 5.26 5.33 5.21 4.89 4.89 4.89 4.89 5.00 4.94 4.94 4.95 4.94 4.83 4.89 4.88 4.89 4.89 T2 1.65 1.42 1.43 1.44 1.47 1.20 1.01 1.03 1.02 1.02 1.09 0.89 0.90 0.90 0.90 1.01 0.97 0.97 0.97 0.97 T3 0.81 0.79 0.80 0.80 0.55 0.55 0.55 0.54 0.33 0.31 0.31 0.31 0.21 0.22 0.21 0.22
α=.9
T4 0.02 0.02 0.02 0.02 0.04 0.06 0.06 0.04 0.11 0.15 0.15 0.15 0.22 0.21 0.22 0.21 Quantile
T1 29.60 23.20 20.80 21.53 22.14 22.02 19.98 19.79 19.88 19.97 20.06 19.35 19.26 19.31 19.34 19.26 19.14 19.13 19.14 19.14 T2 6.16 5.30 3.57 3.87 4.28 2.91 2.50 2.35 2.42 2.47 2.60 2.36 2.22 2.29 2.35 2.37 2.19 2.18 2.19 2.19 T3 0.85 0.99 0.98 0.95 0.90 0.93 0.90 0.90 0.82 0.83 0.83 0.83 0.59 0.56 0.59 0.59
α=.1
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 39.25 37.30 35.50 36.00 36.58 34.56 33.95 33.74 33.81 33.92 32.98 33.09 33.01 33.04 33.08 32.53 32.75 32.72 32.73 32.74 T2 4.76 5.00 4.43 4.39 4.63 4.46 4.11 4.04 4.06 4.11 3.93 3.68 3.66 3.68 3.68 3.80 3.74 3.72 3.73 3.74 T3 0.72 0.92 0.85 0.76 0.61 0.67 0.64 0.62 0.51 0.51 0.52 0.51 0.39 0.40 0.39 0.39
α=.3
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 40.39 40.21 38.38 38.94 39.50 35.15 35.75 35.39 35.57 35.69 34.41 34.70 34.62 34.66 34.69 34.14 34.29 34.28 34.29 34.29 T2 4.58 5.14 4.57 4.62 4.88 3.93 4.08 3.91 3.97 4.05 3.87 3.93 3.91 3.92 3.93 3.66 3.72 3.71 3.72 3.72 T3 0.52 0.79 0.73 0.68 0.27 0.43 0.33 0.29 0.32 0.36 0.33 0.32 0.37 0.40 0.38 0.37
α=.5
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 39.79 36.77 35.32 35.79 36.27 33.55 33.39 33.13 33.23 33.35 32.94 32.92 32.82 32.87 32.91 32.55 32.74 32.72 32.73 32.74 T2 5.66 5.01 4.58 4.66 4.82 4.16 4.15 4.04 4.07 4.13 4.14 3.91 3.87 3.90 3.91 3.56 3.42 3.42 3.42 3.42 T3 0.84 0.93 0.92 0.90 0.51 0.57 0.53 0.51 0.50 0.53 0.51 0.50 0.39 0.39 0.39 0.39
α=.7
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 29.40 23.19 21.07 21.66 22.21 21.69 19.62 19.47 19.52 19.61 19.97 19.22 19.16 19.18 19.22 19.31 19.11 19.08 19.10 19.11 T2 5.55 5.68 3.84 4.14 4.56 3.45 2.69 2.60 2.62 2.67 2.63 2.51 2.43 2.47 2.51 2.29 2.05 2.01 2.05 2.05 T3 0.92 0.99 0.97 0.95 0.92 0.93 0.93 0.92 0.77 0.81 0.80 0.77 0.58 0.60 0.58 0.58
α=.9
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Table 7. AR(1)-ARCH(0)-MW5(Outlier): 11 6.0 ++ += ttt uyy
R=20 R =50 R =100 R =200 J=1 J =50 J =1 J =50 J =1 J =50 J =1 J =50
Binary EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR T1 5.97 5.14 4.91 5.00 5.06 4.83 4.63 4.63 4.63 4.63 4.50 4.64 4.66 4.65 4.65 4.38 4.64 4.65 4.65 4.65 T2 1.43 1.43 1.23 1.34 1.41 1.28 0.92 0.92 0.92 0.93 1.05 0.91 0.90 0.91 0.91 1.09 0.93 0.94 0.94 0.94 T3 0.74 0.79 0.78 0.76 0.44 0.44 0.44 0.44 0.29 0.27 0.29 0.28 0.22 0.22 0.22 0.22
α=.1
T4 0.00 0.01 0.01 0.01 0.04 0.04 0.04 0.04 0.05 0.04 0.04 0.05 0.05 0.04 0.04 0.04 T1 12.29 11.85 11.53 11.72 11.85 10.93 10.83 10.78 10.82 10.81 10.61 10.55 10.53 10.51 10.56 10.43 10.50 10.48 10.48 10.51 T2 2.15 2.33 2.22 2.25 2.29 2.17 2.11 2.14 2.09 2.08 1.88 1.80 1.79 1.79 1.82 2.29 2.27 2.25 2.27 2.27 T3 0.60 0.63 0.65 0.60 0.54 0.52 0.52 0.54 0.46 0.47 0.48 0.45 0.41 0.42 0.41 0.41
α=.3
T4 0.03 0.04 0.01 0.05 0.01 0.05 0.03 0.01 0.05 0.06 0.05 0.05 0.05 0.06 0.05 0.04 T1 14.79 14.24 13.90 14.13 14.29 12.83 12.85 12.78 12.86 12.84 13.13 13.08 13.03 13.05 13.08 12.09 12.16 12.16 12.18 12.15 T2 2.78 2.71 2.65 2.75 2.76 2.72 2.62 2.63 2.62 2.61 2.56 2.57 2.54 2.49 2.55 2.65 2.70 2.68 2.70 2.68 T3 0.53 0.63 0.59 0.53 0.39 0.41 0.40 0.40 0.36 0.34 0.33 0.35 0.21 0.20 0.19 0.20
α=.5
T4 0.18 0.15 0.11 0.12 0.17 0.20 0.18 0.18 0.38 0.42 0.43 0.37 0.51 0.51 0.53 0.53 T1 12.40 11.98 11.62 11.84 11.99 10.48 10.56 10.49 10.55 10.57 10.82 10.77 10.80 10.75 10.80 10.17 10.12 10.11 10.12 10.12 T2 2.46 2.45 2.30 2.37 2.46 2.13 2.01 2.02 2.00 2.05 2.16 2.01 2.03 2.00 2.01 2.16 2.09 2.10 2.12 2.10 T3 0.61 0.67 0.62 0.59 0.43 0.45 0.42 0.42 0.42 0.41 0.45 0.41 0.43 0.44 0.43 0.43
α=.7
T4 0.02 0.03 0.03 0.01 0.04 0.05 0.06 0.05 0.05 0.08 0.07 0.05 0.11 0.11 0.11 0.13 T1 6.29 5.44 5.17 5.35 5.44 4.62 4.65 4.61 4.63 4.64 4.44 4.57 4.54 4.56 4.55 4.31 4.54 4.50 4.51 4.51 T2 1.61 1.72 1.42 1.63 1.68 1.13 0.97 0.94 0.95 0.96 0.90 0.90 0.90 0.90 0.90 0.95 0.91 0.90 0.90 0.90 T3 0.68 0.79 0.73 0.68 0.40 0.40 0.40 0.39 0.28 0.28 0.28 0.28 0.24 0.26 0.25 0.26
α=.9
T4 0.04 0.02 0.03 0.04 0.06 0.06 0.06 0.06 0.04 0.05 0.04 0.04 0.02 0.02 0.03 0.02 Quantile
T1 60.94 30.47 26.77 26.63 26.58 19.81 19.32 18.52 18.65 18.93 16.83 17.09 17.05 17.05 17.07 14.82 15.20 15.19 15.19 15.19 T2 197.22 19.90 39.66 23.42 15.44 7.80 7.04 6.28 6.34 6.65 6.24 6.02 6.05 6.01 6.01 4.98 4.72 4.73 4.72 4.72 T3 0.84 0.95 0.94 0.92 0.49 0.58 0.54 0.55 0.37 0.38 0.40 0.38 0.29 0.29 0.29 0.29
α=.1
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 33.53 31.55 30.28 28.57 29.27 23.56 23.92 23.39 23.47 23.74 22.24 22.56 22.44 22.49 22.54 20.74 21.02 20.98 21.00 21.01 T2 16.09 11.27 12.83 8.54 8.81 4.98 5.39 4.92 4.96 5.22 5.17 5.18 5.15 5.16 5.18 4.30 4.38 4.38 4.38 4.38 T3 0.52 0.68 0.71 0.62 0.34 0.48 0.44 0.40 0.36 0.37 0.37 0.36 0.37 0.40 0.38 0.37
α=.3
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 33.68 31.94 31.30 29.47 30.10 24.76 25.04 24.55 24.69 24.91 23.62 24.01 23.85 23.92 23.99 22.41 22.57 22.53 22.55 22.56 T2 13.75 11.28 13.54 9.06 9.31 4.80 4.93 4.70 4.71 4.82 4.76 4.94 4.85 4.90 4.93 3.94 3.99 3.98 3.98 3.98 T3 0.61 0.76 0.75 0.69 0.36 0.48 0.42 0.38 0.34 0.40 0.35 0.35 0.36 0.39 0.38 0.36
α=.5
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 34.17 32.33 31.50 29.14 29.77 23.22 23.08 23.06 22.87 22.94 21.72 22.12 21.94 22.02 22.10 20.68 20.99 20.95 20.97 20.99 T2 14.56 12.73 18.06 9.18 9.59 6.52 5.29 6.15 5.22 5.19 4.81 4.99 4.86 4.92 4.98 4.06 4.12 4.10 4.11 4.11 T3 0.57 0.72 0.71 0.64 0.36 0.42 0.42 0.38 0.29 0.36 0.33 0.29 0.31 0.34 0.33 0.32
α=.7
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 44.66 31.48 26.43 26.94 27.44 19.29 18.08 17.49 17.67 17.76 16.15 16.30 16.18 16.19 16.27 14.90 15.35 15.37 15.34 15.35 T2 31.82 16.88 13.36 12.67 12.49 8.66 7.00 6.46 6.59 6.76 5.89 5.25 5.16 5.15 5.24 4.60 4.33 4.40 4.34 4.33 T3 0.80 0.89 0.87 0.91 0.61 0.68 0.65 0.63 0.37 0.40 0.41 0.38 0.26 0.25 0.27 0.26
α=.9
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Table 8. AR(1)-ARCH(0)-MW6(Bimodal): 11 6.0 ++ += ttt uyy
R=20 R =50 R =100 R =200 J=1 J =50 J =1 J =50 J =1 J =50 J =1 J =50
Binary EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR T1 6.09 5.41 5.20 5.22 5.36 5.11 4.80 4.82 4.80 4.80 4.78 4.72 4.73 4.72 4.73 4.91 4.89 4.89 4.89 4.89 T2 1.69 1.79 1.42 1.50 1.70 1.01 0.78 0.81 0.78 0.78 0.86 0.73 0.74 0.73 0.74 0.82 0.75 0.75 0.75 0.75 T3 0.66 0.72 0.72 0.69 0.48 0.46 0.47 0.46 0.32 0.32 0.32 0.32 0.26 0.26 0.26 0.26
α=.1
T4 0.08 0.09 0.08 0.06 0.15 0.14 0.15 0.16 0.17 0.17 0.17 0.17 0.17 0.17 0.17 0.17 T1 14.68 13.74 13.46 13.71 13.75 13.24 12.99 12.98 12.96 12.96 12.79 12.73 12.72 12.73 12.73 13.03 13.04 13.06 13.05 13.07 T2 3.05 2.69 2.53 2.63 2.68 2.25 2.14 2.11 2.11 2.12 2.31 2.21 2.20 2.20 2.22 2.33 2.15 2.14 2.17 2.15 T3 0.73 0.77 0.71 0.70 0.57 0.58 0.57 0.58 0.47 0.47 0.47 0.47 0.44 0.45 0.45 0.43
α=.3
T4 0.01 0.01 0.00 0.02 0.02 0.02 0.01 0.02 0.06 0.07 0.06 0.06 0.04 0.04 0.04 0.04 T1 18.53 17.42 16.98 17.25 17.36 17.36 17.02 16.83 16.95 17.03 16.85 16.81 16.75 16.82 16.76 16.57 16.56 16.56 16.57 16.57 T2 2.70 2.91 2.65 2.84 2.94 2.93 2.77 2.70 2.68 2.76 2.83 2.73 2.66 2.66 2.71 2.67 2.66 2.65 2.68 2.65 T3 0.62 0.79 0.68 0.67 0.51 0.57 0.50 0.49 0.41 0.49 0.40 0.44 0.35 0.36 0.34 0.35
α=.5
T4 0.12 0.08 0.06 0.09 0.13 0.13 0.18 0.14 0.26 0.21 0.27 0.23 0.31 0.28 0.30 0.29 T1 14.96 13.82 13.51 13.66 13.87 13.40 13.01 13.06 13.02 13.02 13.11 13.05 13.01 13.05 13.02 12.98 12.95 12.94 12.96 12.93 T2 2.69 2.61 2.49 2.55 2.67 2.61 2.17 2.28 2.23 2.23 1.94 1.90 1.86 1.92 1.88 2.11 1.99 1.98 1.98 1.98 T3 0.74 0.78 0.73 0.71 0.61 0.61 0.58 0.61 0.50 0.51 0.50 0.51 0.44 0.44 0.45 0.44
α=.7
T4 0.00 0.03 0.02 0.01 0.04 0.03 0.05 0.04 0.02 0.03 0.06 0.06 0.05 0.05 0.05 0.05 T1 6.12 5.38 5.16 5.26 5.40 5.09 4.93 4.91 4.92 4.92 5.11 4.99 4.99 4.99 4.99 4.85 4.89 4.89 4.90 4.89 T2 1.56 1.43 1.22 1.30 1.46 1.07 0.87 0.87 0.87 0.88 0.89 0.79 0.80 0.80 0.80 0.81 0.74 0.74 0.74 0.74 T3 0.63 0.70 0.66 0.62 0.41 0.41 0.41 0.41 0.38 0.37 0.37 0.37 0.20 0.19 0.19 0.19
α=.9
T4 0.01 0.04 0.05 0.03 0.11 0.12 0.11 0.11 0.17 0.19 0.19 0.19 0.21 0.22 0.22 0.22 Quantile
T1 23.06 20.36 18.34 19.08 19.64 17.69 17.09 16.83 16.95 17.05 16.71 16.55 16.51 16.53 16.55 16.42 16.49 16.48 16.49 16.49 T2 2.83 3.42 2.03 2.35 2.85 1.64 1.45 1.32 1.37 1.43 1.29 1.19 1.18 1.17 1.19 1.25 1.24 1.22 1.23 1.24 T3 0.81 1.00 0.97 0.91 0.65 0.78 0.72 0.68 0.60 0.63 0.60 0.60 0.46 0.47 0.47 0.46
α=.1
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 43.08 39.30 38.02 38.60 39.07 37.76 37.10 36.62 36.88 37.06 36.88 36.75 36.60 36.69 36.74 36.37 36.29 36.24 36.27 36.29 T2 3.75 3.56 3.15 3.21 3.41 2.66 2.56 2.41 2.49 2.54 2.39 2.47 2.39 2.43 2.47 2.46 2.53 2.51 2.52 2.53 T3 0.88 0.96 0.94 0.89 0.65 0.88 0.76 0.68 0.62 0.69 0.64 0.62 0.57 0.62 0.60 0.57
α=.3
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 50.29 45.75 44.69 45.18 45.63 46.06 44.71 44.27 44.53 44.69 45.01 44.49 44.30 44.43 44.48 44.13 43.72 43.66 43.70 43.72 T2 3.80 3.96 3.70 3.66 3.83 2.97 2.95 2.78 2.87 2.95 3.01 3.22 3.15 3.19 3.21 3.01 3.17 3.12 3.15 3.17 T3 0.95 1.00 0.98 0.97 0.86 0.96 0.89 0.86 0.72 0.82 0.73 0.72 0.67 0.70 0.67 0.67
α=.5
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 42.71 38.98 37.61 38.17 38.67 38.04 37.31 36.92 37.12 37.27 37.15 36.82 36.65 36.76 36.81 36.02 35.92 35.86 35.90 35.92 T2 3.34 3.51 2.70 2.96 3.28 2.67 2.45 2.29 2.32 2.43 2.53 2.42 2.38 2.41 2.42 2.55 2.70 2.63 2.67 2.69 T3 0.86 0.99 0.96 0.89 0.72 0.87 0.78 0.74 0.68 0.77 0.71 0.69 0.61 0.60 0.61 0.61
α=.7
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 23.23 20.68 18.49 19.18 19.82 17.86 17.19 16.89 17.01 17.15 16.93 16.75 16.69 16.72 16.74 16.22 16.26 16.24 16.25 16.26 T2 3.15 3.88 2.10 2.44 3.06 1.71 1.65 1.39 1.49 1.61 1.36 1.31 1.30 1.30 1.31 1.34 1.28 1.27 1.27 1.28 T3 0.75 0.97 0.91 0.80 0.70 0.90 0.86 0.75 0.67 0.71 0.67 0.67 0.46 0.47 0.46 0.46
α=.9
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Table 9. AR(1)-ARCH(0)-MW7(Separated Bimodal): 11 6.0 ++ += ttt uyy
R=20 R =50 R =100 R =200 J=1 J =50 J =1 J =50 J =1 J =50 J =1 J =50
Binary EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR T1 5.51 5.15 5.00 5.06 5.12 4.97 4.78 4.77 4.77 4.78 4.76 4.78 4.78 4.78 4.77 4.84 4.83 4.83 4.84 4.84 T2 1.35 1.22 1.03 1.11 1.18 1.00 0.72 0.73 0.73 0.71 0.92 0.84 0.83 0.84 0.83 0.89 0.77 0.78 0.77 0.77 T3 0.57 0.61 0.58 0.56 0.39 0.39 0.39 0.39 0.25 0.25 0.25 0.26 0.25 0.26 0.25 0.25
α=.1
T4 0.07 0.08 0.08 0.07 0.12 0.11 0.12 0.11 0.20 0.19 0.20 0.20 0.19 0.19 0.19 0.19 T1 14.65 13.40 13.10 13.32 13.34 13.40 13.23 13.15 13.21 13.25 12.92 12.81 12.80 12.80 12.81 13.16 13.12 13.12 13.12 13.12 T2 2.66 2.65 2.37 2.51 2.55 2.29 2.05 2.04 2.04 2.06 2.27 2.18 2.18 2.14 2.15 2.26 2.26 2.25 2.24 2.25 T3 0.69 0.74 0.72 0.73 0.52 0.54 0.50 0.52 0.45 0.44 0.46 0.45 0.39 0.41 0.41 0.39
α=.3
T4 0.01 0.04 0.00 0.01 0.12 0.13 0.14 0.10 0.17 0.20 0.17 0.14 0.16 0.16 0.15 0.16 T1 20.69 18.36 17.94 18.39 18.50 20.03 18.92 18.83 19.00 18.97 18.49 18.03 17.96 17.94 18.01 18.74 18.64 18.57 18.58 18.62 T2 3.08 2.96 2.80 2.92 2.89 3.24 2.79 2.94 2.85 2.79 2.88 2.81 2.86 2.80 2.80 3.05 3.07 3.02 3.03 3.07 T3 0.82 0.83 0.81 0.80 0.72 0.76 0.69 0.69 0.55 0.59 0.55 0.57 0.50 0.45 0.46 0.49
α=.5
T4 0.04 0.09 0.05 0.04 0.08 0.09 0.11 0.12 0.21 0.20 0.24 0.19 0.19 0.21 0.25 0.16 T1 15.36 14.13 13.98 14.10 14.20 13.61 13.27 13.21 13.31 13.29 12.96 13.03 13.01 13.02 13.03 12.78 12.84 12.82 12.82 12.83 T2 2.49 2.46 2.29 2.38 2.46 1.98 1.77 1.76 1.86 1.83 2.26 2.14 2.12 2.12 2.14 2.23 2.22 2.24 2.23 2.22 T3 0.76 0.81 0.76 0.77 0.60 0.65 0.59 0.58 0.39 0.39 0.36 0.37 0.35 0.34 0.34 0.34
α=.7
T4 0.01 0.02 0.02 0.01 0.09 0.10 0.12 0.13 0.10 0.12 0.11 0.13 0.21 0.20 0.21 0.22 T1 5.73 5.32 5.27 5.26 5.35 4.92 4.82 4.84 4.84 4.84 4.89 4.88 4.86 4.87 4.87 4.75 4.78 4.77 4.77 4.77 T2 1.22 1.19 1.14 1.12 1.22 0.86 0.64 0.68 0.66 0.66 0.92 0.82 0.81 0.81 0.81 0.96 0.81 0.81 0.81 0.81 T3 0.59 0.61 0.60 0.58 0.33 0.31 0.31 0.31 0.26 0.26 0.26 0.26 0.20 0.20 0.20 0.20
α=.9
T4 0.04 0.06 0.05 0.05 0.15 0.17 0.16 0.16 0.18 0.19 0.18 0.18 0.19 0.19 0.19 0.19 Quantile
T1 18.15 17.67 15.95 16.59 17.01 14.96 15.16 14.89 15.01 15.12 14.20 14.39 14.35 14.37 14.38 14.04 14.17 14.16 14.17 14.17 T2 2.09 3.02 1.41 1.76 2.24 1.02 1.18 0.96 1.04 1.15 1.08 1.16 1.15 1.15 1.16 1.03 1.09 1.09 1.09 1.09 T3 0.64 0.91 0.78 0.70 0.43 0.52 0.48 0.46 0.32 0.38 0.32 0.33 0.39 0.40 0.39 0.39
α=.1
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 41.60 38.29 37.03 37.68 38.02 36.87 36.69 36.14 36.41 36.63 35.23 35.43 35.25 35.35 35.41 34.88 35.01 34.94 34.98 35.00 T2 3.85 3.32 2.82 2.96 3.16 2.37 2.43 2.18 2.25 2.40 2.68 2.71 2.63 2.67 2.70 2.37 2.38 2.35 2.36 2.38 T3 0.83 0.94 0.86 0.85 0.58 0.79 0.63 0.58 0.42 0.48 0.42 0.42 0.48 0.54 0.49 0.48
α=.3
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 53.50 46.87 46.05 46.62 46.89 50.09 47.14 46.80 47.05 47.17 48.55 47.13 46.95 47.06 47.13 48.11 47.32 47.25 47.30 47.33 T2 3.80 3.30 3.16 3.19 3.24 3.19 2.76 2.59 2.67 2.76 2.95 2.69 2.59 2.65 2.69 2.83 2.56 2.46 2.52 2.55 T3 0.99 1.00 1.00 0.99 0.93 1.00 0.99 0.94 0.87 0.88 0.86 0.87 0.70 0.72 0.70 0.70
α=.5
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 41.79 38.95 37.62 38.27 38.70 36.76 36.55 36.00 36.32 36.49 35.40 35.45 35.28 35.37 35.43 34.82 34.95 34.90 34.93 34.94 T2 3.80 3.04 2.63 2.70 2.88 2.77 2.74 2.44 2.56 2.73 2.61 2.59 2.48 2.53 2.58 2.29 2.24 2.24 2.24 2.24 T3 0.85 0.95 0.88 0.84 0.59 0.74 0.66 0.62 0.53 0.59 0.54 0.53 0.47 0.48 0.47 0.47
α=.7
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 18.57 18.04 16.35 17.06 17.50 14.78 15.01 14.72 14.86 14.96 14.24 14.46 14.42 14.44 14.46 14.06 14.24 14.23 14.23 14.24 T2 2.08 2.24 1.51 1.67 1.90 1.05 1.23 0.93 1.04 1.19 1.10 1.12 1.10 1.11 1.12 0.97 0.96 0.96 0.96 0.96 T3 0.61 0.91 0.73 0.70 0.42 0.56 0.50 0.46 0.37 0.42 0.39 0.38 0.31 0.33 0.31 0.31
α=.9
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Table 10. AR(1)-ARCH(0)-MW8(Skewed Bimodal): 11 6.0 ++ += ttt uyy
R=20 R =50 R =100 R =200 J=1 J =50 J =1 J =50 J =1 J =50 J =1 J =50
Binary EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR EW W1 W5 WR T1 6.90 5.79 5.57 5.63 5.71 5.19 4.96 4.97 4.96 4.97 4.93 4.91 4.91 4.91 4.92 5.17 5.06 5.06 5.06 5.06 T2 1.55 1.58 1.27 1.39 1.50 0.98 0.75 0.76 0.75 0.75 1.00 0.87 0.86 0.87 0.87 1.06 0.79 0.79 0.79 0.79 T3 0.81 0.86 0.85 0.84 0.39 0.39 0.39 0.39 0.24 0.23 0.24 0.23 0.25 0.24 0.25 0.25
α=.1
T4 0.01 0.03 0.01 0.01 0.16 0.16 0.16 0.15 0.20 0.21 0.20 0.21 0.31 0.31 0.31 0.31 T1 15.55 14.17 13.86 14.12 14.23 13.34 13.07 13.07 13.06 13.11 13.02 13.06 13.01 13.05 13.09 13.26 13.28 13.29 13.28 13.30 T2 2.36 2.41 2.27 2.32 2.44 2.19 2.04 2.01 1.99 2.03 2.29 2.34 2.27 2.31 2.35 2.06 1.96 1.95 1.98 1.97 T3 0.69 0.79 0.75 0.71 0.55 0.56 0.57 0.54 0.47 0.50 0.49 0.47 0.42 0.40 0.41 0.41
α=.3
T4 0.00 0.00 0.01 0.00 0.05 0.08 0.06 0.03 0.08 0.09 0.08 0.08 0.02 0.03 0.02 0.02 T1 17.82 16.73 16.24 16.57 16.79 16.50 16.30 16.14 16.21 16.27 16.42 16.35 16.24 16.19 16.29 15.87 15.84 15.80 15.83 15.81 T2 2.80 2.97 2.79 2.85 2.91 2.56 2.52 2.56 2.48 2.51 2.50 2.49 2.47 2.43 2.50 2.66 2.58 2.57 2.57 2.60 T3 0.68 0.75 0.72 0.66 0.45 0.56 0.50 0.47 0.44 0.48 0.48 0.43 0.36 0.39 0.38 0.37
α=.5
T4 0.04 0.07 0.09 0.07 0.16 0.18 0.18 0.18 0.20 0.24 0.20 0.23 0.31 0.28 0.26 0.30 T1 13.50 12.79 12.53 12.69 12.79 12.80 12.39 12.39 12.37 12.42 12.98 12.89 12.87 12.89 12.87 11.99 11.99 11.98 11.97 11.98 T2 2.49 2.43 2.35 2.33 2.46 2.09 1.90 1.94 1.92 1.96 2.19 2.11 2.13 2.13 2.08 2.07 1.99 2.00 1.98 1.99 T3 0.68 0.74 0.66 0.65 0.62 0.64 0.66 0.63 0.51 0.52 0.49 0.50 0.43 0.43 0.43 0.43
α=.7
T4 0.02 0.02 0.04 0.03 0.02 0.04 0.02 0.02 0.04 0.06 0.04 0.04 0.06 0.06 0.06 0.06 T1 5.56 4.89 4.71 4.85 4.93 4.69 4.61 4.58 4.59 4.59 4.84 4.91 4.88 4.89 4.88 4.56 4.63 4.63 4.62 4.62 T2 1.40 1.30 1.12 1.26 1.36 0.96 0.76 0.76 0.76 0.76 0.92 0.83 0.82 0.82 0.82 0.80 0.71 0.71 0.71 0.71 T3 0.70 0.74 0.67 0.67 0.33 0.35 0.34 0.35 0.24 0.24 0.24 0.24 0.27 0.26 0.27 0.27
α=.9
T4 0.02 0.03 0.03 0.02 0.19 0.20 0.20 0.19 0.08 0.08 0.08 0.08 0.07 0.07 0.07 0.07 Quantile
T1 26.33 21.96 20.11 20.78 21.27 20.14 19.02 18.76 18.90 19.00 18.95 18.60 18.50 18.54 18.59 18.63 18.55 18.53 18.53 18.55 T2 3.70 3.88 2.57 2.85 3.27 2.56 2.38 2.06 2.22 2.36 1.75 1.64 1.56 1.59 1.64 1.53 1.46 1.45 1.45 1.46 T3 0.90 1.00 0.96 0.92 0.82 0.90 0.87 0.83 0.68 0.73 0.71 0.68 0.56 0.58 0.58 0.56
α=.1
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 44.52 40.20 38.70 39.28 39.78 38.81 37.83 37.48 37.64 37.80 37.69 37.40 37.24 37.34 37.39 37.15 37.17 37.10 37.14 37.16 T2 4.26 4.30 3.58 3.66 3.96 3.63 3.56 3.41 3.47 3.55 2.72 2.72 2.62 2.68 2.71 2.58 2.64 2.60 2.63 2.64 T3 0.89 0.99 0.97 0.91 0.80 0.95 0.90 0.82 0.70 0.76 0.74 0.71 0.54 0.56 0.56 0.55
α=.3
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 49.39 44.93 43.65 44.15 44.66 44.52 43.23 42.90 43.07 43.21 43.18 42.66 42.50 42.59 42.66 42.49 42.44 42.36 42.41 42.44 T2 4.65 3.80 3.61 3.62 3.75 3.71 3.71 3.59 3.65 3.70 2.91 2.98 2.87 2.92 2.97 2.99 2.97 2.93 2.96 2.97 T3 0.93 0.99 0.97 0.95 0.83 0.95 0.90 0.85 0.75 0.80 0.78 0.75 0.58 0.62 0.60 0.58
α=.5
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 41.08 37.18 35.79 36.27 36.82 36.46 35.47 35.15 35.31 35.44 35.28 34.98 34.82 34.91 34.97 34.52 34.41 34.36 34.39 34.41 T2 4.40 4.01 3.51 3.46 3.78 2.68 2.72 2.59 2.65 2.70 2.56 2.41 2.34 2.38 2.41 2.56 2.56 2.50 2.53 2.56 T3 0.90 0.98 0.96 0.92 0.80 0.92 0.84 0.81 0.60 0.76 0.68 0.61 0.62 0.62 0.61 0.62
α=.7
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 T1 21.14 19.14 17.05 17.75 18.38 16.17 15.77 15.59 15.67 15.74 15.34 15.42 15.35 15.39 15.41 14.85 15.02 15.01 15.01 15.02 T2 3.31 3.72 2.27 2.53 3.08 1.56 1.42 1.34 1.37 1.41 1.28 1.34 1.31 1.32 1.34 1.20 1.20 1.19 1.20 1.20 T3 0.68 0.93 0.87 0.82 0.65 0.76 0.70 0.68 0.51 0.53 0.51 0.51 0.31 0.31 0.31 0.31
α=.9
T4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Figure 1(a): SP500 Binary Prediction
0 50 1 0 0 1 5 05
6
7
8
9
10α = 0 . 1
R
Co
st
0 5 0 1 0 0 15010
11
12
13
14
15
16α = 0 . 2
R
Co
st
0 50 1 0 0 1 5 01 2
1 4
1 6
1 8
2 0
2 2
2 4α = 0 . 3
R
Co
st
0 50 1 0 0 1 5 016
18
20
22
24
26
28α = 0 . 4
R
Cos
t
0 5 0 1 0 0 15010
12
14
16
18
20
22
24α = 0 . 5
R
Cos
t
0 50 1 0 0 1 5 01 3
1 4
1 5
1 6
1 7
1 8
1 9
2 0α = 0 . 6
R
Cos
t
0 50 1 0 0 1 5 011
12
13
14
15
16
17α = 0 . 7
R
Cos
t
0 5 0 1 0 0 1506
8
10
12
14
16α = 0 . 8
R
Cos
t
0 50 1 0 0 1 5 02
4
6
8
1 0
1 2
1 4α = 0 . 9
R
Cos
t
U n b a g g e dE W ( k = R )B M A ( k = 1 )B M A ( k = 5 )B M A ( k = R )
Figure 1(b): SP500 Quantile Prediction
0 50 100 15080
100
120
140
160
180α = 0 . 1
R
Cos
t
0 5 0 100 1501 2 0
1 4 0
1 6 0
1 8 0
2 0 0
2 2 0α = 0 . 2
R
Cos
t
0 50 1 0 0 1 5 01 4 0
1 6 0
1 8 0
2 0 0
2 2 0
2 4 0
2 6 0α =0 .3
R
Cos
t
0 50 100 150180
200
220
240
260
280α = 0 . 4
R
Co
st
0 5 0 100 1501 6 0
1 8 0
2 0 0
2 2 0
2 4 0
2 6 0
2 8 0
3 0 0α = 0 . 5
R
Co
st
0 50 1 0 0 1 5 01 6 0
1 8 0
2 0 0
2 2 0
2 4 0
2 6 0α =0 .6
R
Co
st
0 50 100 150140
160
180
200
220
240α = 0 . 7
R
Cos
t
0 5 0 100 1501 0 0
1 2 0
1 4 0
1 6 0
1 8 0
2 0 0
2 2 0α = 0 . 8
R
Cos
t
0 50 1 0 0 1 5 05 0
1 0 0
1 5 0
2 0 0
2 5 0α =0 .9
R
Cos
t
U n b a g g e dE W ( k = R )B M A ( k = 1 )B M A ( k = 5 )B M A ( k = R )
Figure 2(a): NASDAQ Binary Prediction
0 50 1 0 0 1 5 04
6
8
10
12α = 0 . 1
R
Co
st
0 5 0 1 0 0 15010
11
12
13
14
15
16α = 0 . 2
R
Co
st
0 50 1 0 0 1 5 01 0
1 2
1 4
1 6
1 8
2 0
2 2
2 4α = 0 . 3
R
Co
st
0 50 1 0 0 1 5 016
18
20
22
24
26α = 0 . 4
R
Cos
t
0 5 0 1 0 0 15018
19
20
21
22
23
α = 0 . 5
R
Cos
t
0 50 1 0 0 1 5 01 4
1 5
1 6
1 7
1 8
1 9
2 0
2 1α = 0 . 6
R
Cos
t
0 50 1 0 0 1 5 012
13
14
15
16α = 0 . 7
R
Cos
t
0 5 0 1 0 0 1508
9
10
11
12α = 0 . 8
R
Cos
t
0 50 1 0 0 1 5 04
5
6
7
8
9
1 0α = 0 . 9
R
Cos
t
U n b a g g e dE W ( k = R )B M A ( k = 1 )B M A ( k = 5 )B M A ( k = R )
Figure 2(b): NASDAQ Quantile Prediction
0 50 1 0 0 1 5 0100
150
200
250
300
350α =0 .1
R
Cos
t
0 5 0 100 1502 0 0
2 5 0
3 0 0
3 5 0
4 0 0
4 5 0α = 0 . 2
R
Cos
t
0 5 0 100 150250
300
350
400
450
500
550α = 0 . 3
R
Cos
t
0 50 1 0 0 1 5 0300
350
400
450
500
550α =0 .4
R
Cos
t
0 5 0 100 1503 0 0
3 5 0
4 0 0
4 5 0
5 0 0
5 5 0α = 0 . 5
R
Cos
t
0 5 0 100 150300
350
400
450
500α = 0 . 6
R
Cos
t
0 50 1 0 0 1 5 0250
300
350
400
450
500α =0 .7
R
Cos
t
0 5 0 100 1502 2 0
2 4 0
2 6 0
2 8 0
3 0 0
3 2 0
3 4 0
3 6 0α = 0 . 8
R
Cos
t
0 5 0 100 150100
150
200
250
300
350α = 0 . 9
R
Cos
t
UnbaggedE W ( k = R )B M A ( k = 1 )B M A ( k = 5 )B M A ( k = R )
0
200
400
600
800
1000
1200
-2.5 0.0 2.5
Series : MW1Sam ple 1 10000Observations 10000
Mean -0.012807Median -0.011404Maxim um 4.334614Minim um -4.252738Std. Dev. 0.998405Skewness -0.008630Kurtos is 3.078202
Jarque-Bera 2.672266Probability 0.262860
0
200
400
600
800
1000
1200
-3.75 -2.50 -1.25 0.00 1.25 2.50 3.75
Series : MW2Sam ple 1 10000Observations 10000
Mean 0.927466Median 1.038380Maxim um 3.796584Minim um -4.084973Std. Dev. 1.007481Skewness -0.722575Kurtos is 3.989728
Jarque-Bera 1278.342Probability 0.000000
0
400
800
1200
1600
2000
2400
-2.50 -1.25 0.00 1.25 2.50
Series : MW3Sam ple 1 10000Observations 10000
Mean -1.925271Median -2.315131Maxim um 3.113319Minim um -3.306660Std. Dev. 1.026836Skewness 1.471052Kurtos is 4.812800
Jarque-Bera 4975.924Probability 0.000000
0
400
800
1200
1600
2000
2400
-2.5 0.0 2.5
Series : MW4Sam ple 1 10000Observations 10000
Mean 0.019129Median 0.005695Maxim um 4.212732Minim um -4.619784Std. Dev. 0.997794Skewness 0.037861Kurtos is 4.514265
Jarque-Bera 957.8048Probability 0.000000
0
1000
2000
3000
4000
5000
*** -7.5 -5.0 -2.5 0.0 2.5 5.0 7.5
Series : MW5Sam ple 1 10000Observations 10000
Mean 0.002387Median -0.005512Maxim um 9.471090Minim um -9.521809Std. Dev. 0.981310Skewness 0.361266Kurtos is 24.77680
Jarque-Bera 197813.0Probability 0.000000
0
100
200
300
400
500
600
700
800
900
-2 -1 0 1 2 3
Series : MW6Sam ple 1 10000Observations 10000
Mean -0.001610Median 0.003977Maxim um 3.125011Minim um -2.737267Std. Dev. 1.009703Skewness 0.003909Kurtos is 2.054055
Jarque-Bera 372.8636Probability 0.000000
0
400
800
1200
1600
-2 -1 0 1 2
Series : MW7Sam ple 1 10000Observations 10000
Mean 0.016674Median 0.274751Maxim um 2.166670Minim um -2.015000Std. Dev. 1.002191Skewness -0.034592Kurtos is 1.377536
Jarque-Bera 1098.824Probability 0.000000
0
200
400
600
800
1000
1200
-2.50 -1.25 0.00 1.25 2.50
Series : MW8Sam ple 1 10000Observations 10000
Mean 0.325811Median 0.370494Maxim um 3.327683Minim um -3.435548Std. Dev. 1.001731Skewness -0.329832Kurtos is 2.405267
Jarque-Bera 328.6929Probability 0.000000
Note: This page is an appendix for the referees and is not intended for publication. The eight Marron-Wand (MW) mixture normal densities, used in Section 5, are presented in histograms using pseudo-randomly generated 10,000 series, together with some summary statistics.
0
5
10
15
20
25
30
35
-0.2 -0.1 0.0 0.1
Series: SPRETURNSample 1982:11 2004:02Observations 256
Mean 0.008413Median 0.011627Maximum 0.123780Minimum -0.245428Std. Dev. 0.044724Skewness -1.066149Kurtosis 6.995984
Jarque-Bera 218.8222Probability 0.000000
0
10
20
30
40
50
-0.3 -0.2 -0.1 0.0 0.1 0.2
Series: NASDAQRETURNSample 1984:11 2004:02Observations 232
Mean 0.009171Median 0.018550Maximum 0.198653Minimum -0.317919Std. Dev. 0.072081Skewness -1.016935Kurtosis 5.920758
Jarque-Bera 122.4521Probability 0.000000
Note: This page is an appendix for the referees and is not intended for publication. The histograms of the S&P500 and NASDAQ monthly stock index returns, used in Section 6, are presented, together with some summary statistics.