artigoCIFEr2012


  • 7/31/2019 artigoCIFEr2012


    Online estimation of stochastic volatility for asset returns

    Ivette Luna and Rosangela Ballini

    Abstract This paper suggests an adaptive fuzzy rule-based system applied as a financial time series model for volatility forecasting. The model is based on Takagi-Sugeno fuzzy systems, and it is built in two phases. In the first phase, the model uses the Subtractive Clustering algorithm to determine group structures in a reduced data set for initialization purposes. In the second phase, the system is modified dynamically via adding and pruning operators and a recursive learning algorithm based on the Expectation Maximization optimization technique. The online algorithm automatically determines the number of fuzzy rules necessary at each step, while one-step-ahead predictions are estimated and parameters are updated. The model is applied to forecasting financial time series volatility, considering daily values of the REAL/USD exchange rate. The suggested model is compared against generalized autoregressive conditional heteroskedasticity models. Experimental results show the adequacy of the adaptive fuzzy approach for volatility forecasting purposes.

    I. Introduction

    SINCE the 1990s, Value-at-Risk (VaR) has become a standard comprehensive risk measure that summarizes the overall market risk exposure through a single quantitative parameter [1]. Its quantification is of great importance in helping financial institutions control and manage the risk related to business activities, and in helping regulatory committees set margin requirements [2].

    Several approaches have been proposed in the literature for estimating VaR [3]. A comparison study considering several approaches is detailed in [4]. One of the most common approaches used for this purpose is the generalized autoregressive conditional heteroskedasticity (GARCH) family of models [5], a type of non-linear time series model that became a standard tool for estimating the volatility of financial market data.

    During the last decades, different non-linear strategies based on computational intelligence tools have been widely studied and applied to time series forecasting in different areas, and more recently their application has been extended to economic and financial problems. For example, the works detailed in [2], [4] and [6] suggest the use of different types of neural networks for forecasting the volatility of return time series. On the other hand, the proposal described in [7] suggests the use of a fuzzy inference system for the estimation of the Brazilian central bank's reaction function. Several works also suggest the use of hybrid approaches, combining econometric models and neural networks in order to harness the potential of each approach and obtain a better result than the individual ones [8], [9]. All these works intend, in some way, to deal with the non-linearities and uncertainties present in economic and financial systems. An excellent survey of the existence of nonlinearities in financial data is provided in [5].

    Ivette Luna and Rosangela Ballini are with the Department of Economic Theory, Institute of Economics, State University of Campinas, Sao Paulo - Brazil (email: {ivette,ballini}@eco.unicamp.br).

    This work was supported by XXXYYY

    An alternative for handling this situation, particularly the aspects connected with uncertainty and imprecision about theoretical relationships, is to apply the framework of fuzzy inference systems. Fuzzy inference systems have been successfully applied in fields such as automatic control, data classification, decision analysis, expert systems, and time series forecasting.

    More recently, a particular type of fuzzy model has been widely studied. These models are called adaptive fuzzy systems, and they emerged as an attempt to reduce the difficulty of choosing parameters when building time series models, at a low computational cost. They are able to design the input space partition automatically and to adapt to possible changes in the dynamics of the system. The papers presented in [10], [11], [12], [13], [14] and [15] are just some examples of this kind of model, which has shown high flexibility in capturing the characteristics of data.

    Therefore, we propose a method to estimate VaR using an adaptive fuzzy inference system (AdaFIS), which enhances predictive power by directly estimating volatility and using it to obtain a more accurate estimate of VaR.

    To verify the model performance, out-of-sample tests are performed, and the results are compared with those achieved by a traditional and well-established family of models for estimating return volatility, the GARCH models. The case study presented in this work analyses the exchange rate of Brazilian Reals (R$) per U.S. Dollar (REAL/USD) series. The results show the adaptive fuzzy models to be an alternative for estimating VaR.

    After this introduction, this paper proceeds as follows. The general structure of the AdaFIS is presented in Section ??. Section ?? presents in detail the adaptive learning method proposed. Empirical design and results analysis are shown in Section IV. Finally, some conclusions and further research are presented in Section ??.

    II. Model structure

    The online FIS is based on a first-order Takagi-Sugeno (TS) fuzzy system [16]. The general structure of this


    where $h_i^N$ is the posterior probability of the EM algorithm for the $N$th pattern. Carefully observing Eq. (5), and considering a total of $N$ input-output patterns, this equation can be rewritten as:

    $$\alpha_i^N = \frac{1}{N}\sum_{k=1}^{N} h_i^k = \frac{1}{N}\left(\sum_{k=1}^{N-1} h_i^k + h_i^N\right) = \frac{1}{N}\left[(N-1)\,\alpha_i^{N-1} + h_i^N\right] = \alpha_i^{N-1} + \frac{1}{N}\left[h_i^N - \alpha_i^{N-1}\right] \tag{6}$$

    where $\alpha_i^{N-1}$ is the estimated value of $\alpha_i$ considering just the first $N-1$ patterns. As can be observed, Equation (6) is a recursive estimate of Equation (5). Following this procedure for a generic number of iterations $k$ and a window size $T$, defined in this work by trial and error, recursive estimates for all the model parameters can be written as follows:

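    $$\alpha_i^{k+1} = \alpha_i^k + \frac{1}{T}\left[h_i^k - \alpha_i^k\right] \tag{7}$$

    $$c_i^{k+1} = c_i^k + \frac{1}{\eta_i^{k+1}}\left[x^k - c_i^k\right] \tag{8}$$

    $$V_i^{k+1} = V_i^k + \frac{1}{\eta_i^{k+1}}\left[(x^k - c_i^k)(x^k - c_i^k)^T - V_i^k\right] \tag{9}$$

    where:

    $$\frac{1}{\eta_i^{k+1}} = \frac{h_i^{k+1}}{\sum_{t=1}^{k+1} h_i^t} \tag{10}$$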
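The recursion in Equations (6) and (7) can be checked numerically. The following is a minimal Python sketch, not the authors' implementation: the function names and the sample posterior values are hypothetical. It shows that the recursive form of Equation (6) reproduces the batch mean of the posteriors, and that Equation (7) is the same update with the step 1/N replaced by a fixed 1/T.

```python
def recursive_mean(h_values):
    """Eq. (6): running estimate coef_N = coef_{N-1} + (h_N - coef_{N-1}) / N."""
    coef = 0.0
    for n, h in enumerate(h_values, start=1):
        coef += (h - coef) / n
    return coef


def windowed_update(coef, h, window):
    """Eq. (7): same update with a fixed step 1/T (T = window size)."""
    return coef + (h - coef) / window


# Hypothetical posterior probabilities of one rule over five patterns.
h_values = [0.2, 0.9, 0.4, 0.7, 0.1]
batch_mean = sum(h_values) / len(h_values)
recursive = recursive_mean(h_values)
one_step = windowed_update(0.5, 1.0, 10)
```

With a fixed window the estimate no longer converges to the global mean; it tracks recent posteriors instead, which is what lets the coefficient follow changes in the series.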

    An approximation of $\sum_{t=1}^{k+1} h_i^t$ can be constructed considering a window size $T$ and the recursive equation form inspired by the adaptive learning of the fuzzy system detailed in [20]. Let $S_i^{k+1} = \sum_{t=1}^{k+1} h_i^t$ and $S_i(x^{k+1}) = h_i^{k+1}$. Then $S_i^{k+1}$ can be estimated as:

    $$S_i^{k+1} \approx S_i(x^{k+1}) + \frac{T-1}{T}\,S_i^k \tag{11}$$

    which can be rewritten as:

    $$S_i^{k+1} \approx S_i^k + \left[S_i(x^{k+1}) - \frac{S_i^k}{T}\right] \tag{12}$$
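As a hedged illustration of Equations (8) to (12), the sketch below uses scalar inputs and hypothetical function names. It checks that the two forms of the windowed sum agree, shows that a constant posterior h drives the sum towards T times h, and applies the step size of Equation (10) to a one-dimensional center and variance.

```python
def s_update_eq11(s_prev, h_new, window):
    """Eq. (11): new posterior plus the discounted previous sum."""
    return h_new + (window - 1) / window * s_prev


def s_update_eq12(s_prev, h_new, window):
    """Eq. (12): algebraically the same update, written as a correction."""
    return s_prev + (h_new - s_prev / window)


def update_center_and_variance(center, variance, x, h_new, s_new):
    """Scalar version of Eqs. (8)-(9) with the step size h/S of Eq. (10)."""
    step = h_new / s_new
    new_center = center + step * (x - center)
    new_variance = variance + step * ((x - center) ** 2 - variance)
    return new_center, new_variance


a = s_update_eq11(5.0, 0.3, 28)
b = s_update_eq12(5.0, 0.3, 28)

# With a constant posterior of 0.5 and T = 28, the sum settles at T * 0.5.
s = 0.0
for _ in range(2000):
    s = s_update_eq12(s, 0.5, 28)

center, variance = update_center_and_variance(0.0, 1.0, 1.0, 0.5, s)
```

The fixed point at T times h is why the sum divided by T reads as a windowed mean of the posterior.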

    It is interesting to analyze Equation (12). $S_i^k/T$ can be interpreted as an estimate of the mean value of $\alpha_i$ over the window size $T$. Therefore, the higher its value, the more relevant its respective fuzzy rule will be for the next step. If $S_i(x^{k+1})$ takes low values over $T$, $S_i^{k+1}$ will decrease and the probability of pruning the associated $i$-th rule will increase.

    To estimate $\theta_i$, it is necessary to apply a weighted recursive least squares (RLS) algorithm, which considers a forgetting factor over time $f_{forget}$, as detailed in [21]. The equations of the RLS algorithm adapted to our problem are:

    $$\theta_i^{k+1} = \theta_i^k + C_i^{k+1}\,\phi^k\,h_i^k\,(y^k - y_i^k) \tag{13}$$

    where

    $$C_i^{k+1} = \frac{C_i^k}{f_{forget}^k + h_i^k\,(\phi^k)^T C_i^k\,\phi^k} \tag{14}$$

    is the covariance matrix associated with each $\theta_i$ during the online adaptation. The forgetting factor $f_{forget} \in (0, 1]$ was initially set in this paper to $f_{forget}^0 = 0.9$. To guarantee stability, $f_{forget}$ was slowly increased so that after a long time $f_{forget}^k \to 1.0$.

    Initial conditions for $\theta_i^0$, $i = 1, \dots, M$, were given by the values obtained through model initialization, while $C_i^0 = 10^4 I$, where $I$ is an identity matrix with dimensions $(p+1) \times (p+1)$.
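To make the role of the forgetting factor and of the posterior weight concrete, here is a minimal Python sketch of one weighted RLS step. It follows the common textbook form of weighted RLS with a forgetting factor, which may differ in detail from Equations (13) and (14). The function name, the toy linear data and the large initial covariance are assumptions made for illustration only.

```python
def rls_step(theta, C, phi, y, h, f_forget):
    """One weighted RLS update (textbook form) for a single local model.

    theta: parameter list, C: symmetric covariance (list of lists),
    phi: regressor [1, x_1, ..., x_p], y: target, h: posterior weight.
    """
    n = len(theta)
    C_phi = [sum(C[r][c] * phi[c] for c in range(n)) for r in range(n)]
    denom = f_forget + h * sum(phi[r] * C_phi[r] for r in range(n))
    gain = [h * v / denom for v in C_phi]
    error = y - sum(t * p for t, p in zip(theta, phi))
    theta_new = [t + g * error for t, g in zip(theta, gain)]
    # C stays symmetric because the gain is proportional to C_phi.
    C_new = [[(C[r][c] - gain[r] * C_phi[c]) / f_forget for c in range(n)]
             for r in range(n)]
    return theta_new, C_new


# Toy check: recover y = 2 + 3x from noise-free samples.
theta = [0.0, 0.0]
C = [[1e4, 0.0], [0.0, 1e4]]  # large initial covariance (assumed value)
for i in range(20):
    x = 0.1 * i
    theta, C = rls_step(theta, C, [1.0, x], 2.0 + 3.0 * x, h=1.0, f_forget=0.9)
```

A forgetting factor below one discounts old samples geometrically, so the local model can follow a drifting relationship. A posterior weight h near zero leaves the rule almost untouched by patterns it does not cover.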

    After the initialization phase, online adaptation was undertaken via structure modification based on adding and pruning operators, and via parameter updates using Equations (7)-(14).

    Adding: The criterion used to decide whether to generate a new fuzzy rule was based on the if-part criterion, which verifies whether some existing fuzzy rule clusters the input vector. Assuming a normal input data distribution and a given confidence level, we can construct a confidence interval $[c_i - z\sqrt{\mathrm{diag}(V_i)},\; c_i + z\sqrt{\mathrm{diag}(V_i)}]$, where $\mathrm{diag}(V_i)$ is the main diagonal of the covariance matrix $V_i$. In this paper, we use a confidence level of 72.86%, which requires a $z$ value of 1.1, obtained from the normal distribution table. This 72.86% is the central mass of the distribution, leaving a probability of 13.57% excluded in each tail. That is:

    $$\max_{i=1,\dots,M} P[\,i \mid x^k\,] > 0.1357 \tag{15}$$

    If this condition is not satisfied, there is no rule that can cluster this input vector; that is, the input pattern is not covered by any fuzzy partition. Hence, it is necessary to add a new rule to the structure, expanding the input space region detected during the initialization phase or before the current time instant $k$. In that case, a new rule is generated with the following initialization:

    $c_{M+1}^{k+1} = x^k$;

    $\sigma_{M+1}^{k+1} = 1.0$;

    $\theta_{M+1}^{k+1} = [\,y^k \;\; 0 \;\; \dots \;\; 0\,]_{1 \times (p+1)}$;

    $V_{M+1}^{k+1} = 10^{-4} I$, where $I$ is a $p \times p$ identity matrix;

    $\alpha_{M+1}^{k+1} = 10^{-5}$.

    Even though this value is too small to interfere with the dynamics of the current structure, all the $\alpha_i$, $i = 1, \dots, M+1$, are re-normalized so that the sum of these coefficients is always equal to one.
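The adding step can be sketched as follows, under stated assumptions. The coverage test is read here as a per-dimension interval check around each center, which is one literal interpretation of the confidence interval and not necessarily the probability test of Equation (15). All function names are hypothetical, and the new coefficient is taken to be a small value of 1e-5.

```python
import math


def central_coverage(z):
    """Probability mass of a standard normal inside [-z, z]."""
    return math.erf(z / math.sqrt(2.0))


def is_covered(x, centers, variances, z=1.1):
    """True if x lies inside c_i +/- z*sqrt(diag(V_i)) for some rule i."""
    for c, v in zip(centers, variances):
        if all(abs(xj - cj) <= z * math.sqrt(vj)
               for xj, cj, vj in zip(x, c, v)):
            return True
    return False


def add_rule_coefficient(coefficients, new_value=1e-5):
    """Append the new rule's coefficient and renormalize to sum to one."""
    extended = coefficients + [new_value]
    total = sum(extended)
    return [a / total for a in extended]


coverage = central_coverage(1.1)
inside = is_covered([0.5], centers=[[0.0]], variances=[[1.0]])
outside = is_covered([2.0], centers=[[0.0]], variances=[[1.0]])
coefficients = add_rule_coefficient([0.6, 0.4])
```

The coverage value confirms that z = 1.1 leaves roughly 13.6% of the mass in each tail, consistent with the threshold quoted in the text.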


    Pruning: As can be observed in Equation (5), $\alpha_i$ can be considered a measure of the importance of each fuzzy rule for the corresponding topology when compared to the other rules. This is because $\alpha_i$ is proportional to the sum of all posterior estimates of the membership functions $g_i^k$ over the whole data set. Hence, a threshold $\alpha_{min}$ is defined, so that every rule with $\alpha_i < \alpha_{min}$ at each iteration is pruned and eliminated from the current model structure. However, after a new rule is created, its corresponding $\alpha_i$ will have a small value. If the pruning operator were applied immediately, the new rule would be eliminated and there would be no time to verify its relevance for the model structure. This problem is resolved by the creation of a new index, called the index of permanence $\varrho$. Every time a new rule is created, its respective $\varrho_i$ is also created. As this rule is activated over time, this index is increased, that is:

    $$\varrho_i^{k+1} = \varrho_i^k + 1 \tag{16}$$

    Thus, a rule will be a candidate for pruning only if its $\alpha_i$ is very small and $\varrho_i^k > \gamma T$, where $\gamma > 0$ and $T$ is the same window size used during the sequential learning. This condition ensures that no new rule is pruned immediately after its creation, allowing it to adjust for a minimum period of time and avoiding useless and abrupt oscillations in the model structure.
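The pruning rule and the index of permanence can be sketched in a few lines of Python. This is an illustration under assumptions: the names `coefficients`, `permanence` and `factor` are hypothetical, and the index is incremented for every rule at every step, which simplifies the activation condition described above.

```python
def step_permanence(permanence):
    """Eq. (16): increase each rule's index of permanence by one."""
    return [age + 1 for age in permanence]


def rules_to_prune(coefficients, permanence, coef_min, window, factor=1.0):
    """Indices of rules whose coefficient is below the threshold and whose
    index of permanence exceeds factor * window."""
    return [i for i, (a, age) in enumerate(zip(coefficients, permanence))
            if a < coef_min and age > factor * window]


coefficients = [0.5, 0.0005, 0.0005]
permanence = [100, 100, 3]  # the third rule was created recently
pruned = rules_to_prune(coefficients, permanence, coef_min=0.001, window=28)
aged = step_permanence(permanence)
```

Only the second rule is selected: the third has an equally small coefficient but has not yet existed for longer than the window, so it is kept.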

    IV. Case study: Value at Risk

    A. Data

    Our case study is based on daily values of the commercial exchange rate of Brazilian Reals (R$) per U.S. Dollar (REAL/USD).

    The REAL/USD exchange rate time series covers the period from January 2000 to December 2010. Historical records are shown in Fig. 2-(a), while Fig. 2-(b) shows its returns. Summary statistics and the histogram of the REAL/USD returns are presented in Fig. 3, where a high kurtosis coefficient can be observed.

    The Jarque-Bera statistic reveals that the return series is non-normal at the 99% confidence level. The last thousand samples of the historical observations were used for validation purposes, whereas the earlier ones were used for model specification and optimization. We observe volatility clustering in the series, which is an indication of heteroscedasticity and, consequently, that the data are a candidate for GARCH modelling.
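For reference, the Jarque-Bera statistic mentioned above combines sample skewness and kurtosis. The sketch below is a minimal Python version using population moments and the asymptotic chi-square distribution with two degrees of freedom, whose survival function is exp(-x/2). The toy sample is hypothetical and far too short for the asymptotic p-value to be reliable; it only exercises the arithmetic.

```python
import math


def jarque_bera(sample):
    """Return the Jarque-Bera statistic and its asymptotic p-value."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # chi-square(2) survival function
    return jb, p_value


jb_stat, jb_p = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
```

Normality is rejected at the 1% level when the p-value falls below 0.01, that is, when the statistic exceeds about 9.21.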

    Fig. 2. (a) Daily US Dollar exchange rate from January 2000 to December 2010; (b) US Dollar exchange rate returns.

    B. Value-at-Risk

    As already mentioned in Section I, despite the several controversies reported in the literature, VaR is nowadays a standard tool for quantifying market risk, which is just one of the possible types of risk in financial markets [22], measuring the worst expected loss at a given confidence level.

    Returns are calculated according to:

    $$R_t = \ln P_t - \ln P_{t-1} \tag{17}$$

    where $P_t$ represents the asset price and $R_t$ is the return on a logarithmic scale at time instant $t$. At time $t$, we are interested in measuring the risk of a financial position for the next $h$ periods. If $\Delta V(h)$ denotes a random variable indicating the asset value change from time $t$ to time $t + h$, then we can define its cumulative distribution function (CDF) as $F_h(x)$. Therefore, the VaR for a long position over time horizon $h$ with probability $p$ is defined as

    $$\mathrm{Prob}[\Delta V(h) \le \mathrm{VaR}] = F_h(\mathrm{VaR}) = p \tag{18}$$

    In the case of a long position, which is the hypothesis adopted in the case study, the loss occurs when $\Delta V(h) < 0$. Thus, the VaR defined in (18) will assume a negative value. The interpretation of $p$ is that, over a large number of trading days, the holder will encounter a loss greater than or equal to VaR a fraction $p$ of the time over the time horizon $h$. As observed, for a long position, the left tail of the CDF is important. Since in practice this CDF is unknown, studies of VaR focus on the estimation of the CDF, with emphasis on the behavior of its tail.
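Equations (17) and (18) can be illustrated with a short Python sketch. The prices are hypothetical, and a zero-mean normal distribution with a daily standard deviation of 0.01 stands in for the unknown distribution of value changes. It uses `statistics.NormalDist` from the standard library.

```python
import math
from statistics import NormalDist


def log_returns(prices):
    """Eq. (17): R_t = ln P_t - ln P_{t-1}."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]


def var_from_cdf(dist, p):
    """Eq. (18): the value VaR such that F_h(VaR) = p."""
    return dist.inv_cdf(p)


prices = [2.00, 2.10, 2.05, 2.08]  # hypothetical exchange rates
returns = log_returns(prices)

value_change = NormalDist(mu=0.0, sigma=0.01)  # assumed distribution
var_5 = var_from_cdf(value_change, 0.05)
```

Log returns add up across periods, so their sum equals the log of the ratio between the last and the first price. The resulting VaR is negative, as expected for a long position.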

    Estimating the CDF of the series is far beyond the scope of this paper. Therefore, although we observed that the series analyzed in this paper is non-normal, we assume for the sake of simplicity that this unknown CDF is that of the normal distribution, since this is the most common assumption in the literature.

    Fig. 3. REAL/USD returns summary statistics.

    In order to define the VaR of our asset at a given confidence level $p$, we need a one-step-ahead forecast of the volatility of the returns ($h = 1$). To do so, we use the proposed model (AdaFIS), as well as the ARMA($p$, $q$)-GARCH($r$, $s$) model for benchmark purposes.

    C. ARMA-GARCH and AdaFIS approach

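    A general representation of an ARMA($p$, $q$)-GARCH($r$, $s$) model is given by:

    $$R_t = \phi_0 + \sum_{i=1}^{p} \phi_i R_{t-i} + a_t + \sum_{j=1}^{q} \theta_j a_{t-j}, \qquad a_t = \sigma_t \varepsilon_t \tag{19}$$

    $$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{r} \alpha_i a_{t-i}^2 + \sum_{j=1}^{s} \beta_j \sigma_{t-j}^2 \tag{20}$$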

    where the errors $\varepsilon_t$ follow a normal distribution $N(0, 1)$. Since these equations provide an estimate of the future return $R_t$ and of the conditional variance $\sigma_t^2$, they can be used for a one-step-ahead VaR forecast such that

    $$\mathrm{VaR}_p = R_t - z_p \sigma_t \tag{21}$$

    where $z_p$ is the critical value from the normal distribution table at the $p$% confidence level.
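A minimal Python sketch of the conditional variance recursion of Equation (20) and the VaR forecast of Equation (21) follows. The coefficient values are hypothetical, and the critical value is taken as the (1 - p) quantile of the standard normal, so that the forecast is negative for a long position.

```python
from statistics import NormalDist


def garch_variance(omega, alphas, betas, past_shocks, past_variances):
    """Eq. (20). past_shocks[0] is a_{t-1}; past_variances[0] is sigma^2_{t-1}."""
    arch_part = sum(a * s ** 2 for a, s in zip(alphas, past_shocks))
    garch_part = sum(b * v for b, v in zip(betas, past_variances))
    return omega + arch_part + garch_part


def var_forecast(mean_forecast, sigma, p=0.05):
    """Eq. (21): mean forecast minus z_p times the volatility forecast."""
    z_p = NormalDist().inv_cdf(1.0 - p)
    return mean_forecast - z_p * sigma


# Hypothetical GARCH(1,1) coefficients and state.
sigma2 = garch_variance(1e-6, [0.1], [0.85], [0.02], [1e-4])
var_5 = var_forecast(0.0, sigma2 ** 0.5)
var_ref = var_forecast(0.0, 0.01)
```

Setting the mean forecast to zero gives the simpler form in which VaR depends on the volatility forecast alone.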

    The order ($p$, $q$) of the ARMA part was specified so that the residuals of the returns were not autocorrelated (white noise). On the other hand, several configurations of the GARCH part ($r$, $s$) were evaluated, ranking each model according to the Schwarz and Akaike criteria, over a grid of [0, 4] for $r$ and $s$. The model with the lowest values of these criteria is selected as the most representative for the dataset. The most adequate configuration for the GARCH part was GARCH(1, 1).

    In the case of the AdaFIS approach, we aim to forecast the one-day-ahead volatility, that is, the absolute value of the return of the asset. After forecasting the volatility, VaR is computed as

    $$\mathrm{VaR}_p = -z_p \sigma_t \tag{22}$$

    To do so, the FIS uses the following representation:

    $$\sigma_{t+1}^2 = R_{t+1}^2 = f(R_t^2, R_{t-1}^2, \dots, R_{t-m}^2) + \varepsilon_t \tag{23}$$

    where the number of lags $m$ considered as input variables of the AdaFIS model was defined by analyzing the partial autocorrelation function of the squared returns, considering at most the first four autoregressive lags as possible inputs.
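The input-output patterns implied by Equation (23) can be assembled as in the sketch below. It follows the equation literally, so each input vector holds the current squared return and its m predecessors, and the target is the next squared return. The function name and the toy returns are hypothetical.

```python
def lagged_squared_inputs(returns, m):
    """Build (inputs, target) pairs: inputs are R^2_t, ..., R^2_{t-m},
    and the target is R^2_{t+1}."""
    squared = [r * r for r in returns]
    samples = []
    for t in range(m, len(squared) - 1):
        inputs = [squared[t - lag] for lag in range(m + 1)]
        samples.append((inputs, squared[t + 1]))
    return samples


pairs = lagged_squared_inputs([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], m=1)
```

In an online setting each pair would be presented once, in time order, so the model is always evaluated on a target it has not yet seen.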

    D. Performance metrics

    Based on the research detailed in [4], in order to verify the model performance, two loss functions are evaluated: the violation ratio and the average square magnitude function.

    The violation ratio is the percentage of occurrences of an actual loss greater than the estimated maximum loss in the VaR framework, and it is calculated as

    $$VR = \frac{1}{N} \sum_{k=1}^{N} H_k \tag{24}$$

    where $H_k = 1$ if $R_k < \mathrm{VaR}_k$ and $H_k = 0$ if $R_k \ge \mathrm{VaR}_k$, $\mathrm{VaR}_k$ is the one-step-ahead forecasted VaR for day $k$ considering a 5% confidence level, and $N$ is the number of observations in the out-of-sample period.

    The average square magnitude function suggested in [4] considers the size of possible defaults by measuring the average squared cost of exceptions, given by:

    $$E = \frac{1}{V} \sum_{i=1}^{V} D_i \tag{25}$$

    where $V$ is the number of exceptions of the respective model, $D_i = (R_i - \mathrm{VaR}_i)^2$ when $R_i < \mathrm{VaR}_i$ and $D_i = 0$ when $R_i \ge \mathrm{VaR}_i$.
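Both loss functions are simple to compute. The following Python sketch implements Equations (24) and (25) on hypothetical returns and VaR forecasts; the average in Equation (25) is taken over the exceptions only, and zero is returned when there are none.

```python
def violation_ratio(returns, var_forecasts):
    """Eq. (24): share of days on which the return falls below the VaR."""
    hits = [1 if r < v else 0 for r, v in zip(returns, var_forecasts)]
    return sum(hits) / len(hits)


def average_squared_magnitude(returns, var_forecasts):
    """Eq. (25): mean squared distance between return and VaR on exceptions."""
    costs = [(r - v) ** 2 for r, v in zip(returns, var_forecasts) if r < v]
    return sum(costs) / len(costs) if costs else 0.0


returns = [-0.03, 0.01, -0.01, -0.05]
var_forecasts = [-0.02, -0.02, -0.02, -0.02]
vr = violation_ratio(returns, var_forecasts)
e = average_squared_magnitude(returns, var_forecasts)
```

A violation ratio close to the nominal level indicates a well-calibrated VaR, while the second metric separates models with similar ratios by how far their exceptions overshoot.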

    E. Analysis results

    The econometric model was fitted using the software EViews. The final ARMA-GARCH model is given by the following equations:

    $$R_t = 0.141982\,R_{t-1} + a_t \tag{26}$$

    $$\sigma_t^2 = 3.56 \times 10^{-6} + 0.154037\,a_{t-1}^2 + 0.808768\,\sigma_{t-1}^2 \tag{27}$$
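The fitted coefficients can be plugged into a one-step forecast, as in the Python sketch below. Two points are assumptions: the intercept of the variance equation is read as 3.56e-6, since the minus sign of the exponent appears to have been lost, and the previous return and shock are hypothetical values. The unconditional variance uses the standard GARCH(1,1) expression, the intercept divided by one minus the sum of the ARCH and GARCH coefficients.

```python
import math

PHI_1 = 0.141982     # AR(1) coefficient, Eq. (26)
OMEGA = 3.56e-6      # assumption: "3.56E06" read as 3.56e-6
ALPHA_1 = 0.154037   # ARCH coefficient, Eq. (27)
BETA_1 = 0.808768    # GARCH coefficient, Eq. (27)


def one_step_forecast(r_prev, a_prev, sigma2_prev):
    """Mean and variance forecasts for the next day."""
    mean = PHI_1 * r_prev
    sigma2 = OMEGA + ALPHA_1 * a_prev ** 2 + BETA_1 * sigma2_prev
    return mean, sigma2


persistence = ALPHA_1 + BETA_1
unconditional_variance = OMEGA / (1.0 - persistence)
unconditional_vol = math.sqrt(unconditional_variance)

# Hypothetical previous-day return and shock.
mean_next, sigma2_next = one_step_forecast(
    r_prev=-0.01, a_prev=-0.012, sigma2_prev=unconditional_variance)
```

The persistence is below one, so the variance process is stationary, and the implied long-run daily volatility is a little under 1%.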

    The dataset used for fitting was the same as that used by the AdaFIS model during the initialization stage. After that, the model was run in online mode. Parameter $T$, which represents a window size over time, was set to 28. In addition, $\alpha_{min} = 0.001$ and $f_{forget} = 0.925$. Based on the autocorrelation and partial autocorrelation functions, the AdaFIS considered the first four lags of the squared returns for composing the input vector.

    Results considering the VR and E criteria are presented in Table I. Evaluating the VR loss function, we notice that the AdaFIS outperforms the GARCH model in this case study. However, the latter also gives an adequate violation ratio, taking into account the 5% confidence level adopted.

    Table I also shows the E loss function. When different models achieve similar or identical hit rates, this metric may help us to discriminate between them. In this case, it simply complements the results quantified by the VR metric. Given the confidence level, we can observe that both metrics are conclusive, since the E value achieved by the AdaFIS was the lowest. In general terms, we may argue that, for long positions, the AdaFIS has not only the best VR but also a small average magnitude for its violations. Fig. 4 presents the returns and VaR estimates for the case study using the ARMA-GARCH and AdaFIS approaches.

    TABLE I

    Loss functions. Case study: REAL/USD.

    Metric    GARCH    AdaFIS
    VR (%)    3.70     1.80
    E (%)     0.008    0.006

    Fig. 4. Return series and VaR: (a) US Dollar exchange rate - AdaFIS; (b) REAL/USD - AR(1)-GARCH(1,1).

    V. Conclusions

    The accuracy of VaR estimation is of great importance for the control and management of risk-bearing business activities.

    This paper presents an adaptive fuzzy inference system applied to volatility forecasting. The main contribution of this work is the development of a sequential learning procedure that estimates the model parameters in parallel with the definition of the model structure. An important feature of the proposed model is that it does not require retraining the entire model each time new data are added to the time series history, since its learning is carried out in an online fashion, providing compact structures through a fast learning procedure, which is a great advantage in terms of processing time and computational effort. The forecasting exercise against the GARCH benchmark is favourable to the proposal of this paper, showing the AdaFIS as a promising technique for time series modeling and forecasting. Further research includes the reduction of model parameters, as well as an adaptive criterion for dynamic input selection and multi-step-ahead predictions.

    We would also like to decrease the number of key parameters to set up, defining an automatic value for ki and $T$, as well as simplifying the definition given for $g_i^k$, increasing model transparency. Another possible extension of the work presented here is the construction of interval forecasts.

    References

    [1] X. Chen, "Neural network based models for value-at-risk analysis with applications in emerging markets," Ph.D. dissertation, Department of Management Sciences, City University of Hong Kong, 2009.

    [2] H. Lu, X. Yu, J. Zhu, X. Zhao, and N. Cheng, "Value-at-risk forecasting with combined neural network model," in ICNC10, 2010, pp. 746-750.

    [3] P. Jorion, Value-At-Risk. McGraw-Hill, 2001.


    [4] C. Dunis, J. Laws, and G. Sermpinis, "Modelling commodity value at risk with higher order neural networks," Applied Financial Economics, vol. 20, no. 7, pp. 585-600, 2010.

    [5] T. Bollerslev, "Generalized autoregressive conditional heteroskedasticity," Journal of Econometrics, vol. 31, no. 3, pp. 307-327, 1986.

    [6] S. A. Hamid and Z. Iqbal, "Using neural networks for forecasting volatility of S&P 500 index futures prices," Journal of Business Research, vol. 57, no. 10, pp. 1116-1125, 2004, selected papers from the third Retail Seminar of the SMA.

    [7] I. Luna, L. Maciel, R. L. F. da Silveira, and R. Ballini, "Estimating the Brazilian central bank's reaction function by fuzzy inference system," in IPMU (2), 2010, pp. 324-333.

    [8] R. Donaldson and M. Kamstra, "An artificial neural network-GARCH model for international stock return volatility," Journal of Empirical Finance, vol. 4, no. 1, pp. 17-46, 1997.

    [9] J. Dhar, P. Agrawal, V. Singhal, A. Singh, and R. K. Murmu, "Comparative study of volatility forecasting between ANN and hybrid models for Indian market," International Research Journal of Finance and Economics, no. 45, pp. 68-79, 2010.

    [10] M. J. Er and S. Wu, "A fast learning algorithm for parsimonious fuzzy neural systems," Fuzzy Sets and Systems, vol. 126, pp. 337-351, 2002.

    [11] P. Angelov and D. Filev, "Simpl_eTS: A Simplified Method for Learning Evolving Takagi-Sugeno Fuzzy Models," in Proceedings of The IEEE International Conference on Fuzzy Systems, 2005, pp. 1068-1073.

    [12] G. Leng, T. McGinnity, and G. Prasad, "An approach for on-line extraction of fuzzy rules using a self-organising fuzzy neural network," Fuzzy Sets and Systems, vol. 150, no. 2, pp. 211-243, 2005.

    [13] H.-J. Rong, N. Sundararajan, G.-B. Huang, and P. Saratchandran, "Sequential Adaptive Fuzzy Inference System (SAFIS) for nonlinear system identification and prediction," Fuzzy Sets and Systems, no. 157, pp. 1260-1275, 2006.

    [14] I. Luna, S. Soares, and R. Ballini, "An Adaptive Hybrid Model for Monthly Streamflow Forecasting," in Proceedings of The IEEE International Conference on Fuzzy Systems, 2007, pp. 1-6.

    [15] R. Ballini, A. R. R. Mendonca, and F. Gomide, "Evolving Fuzzy Modeling of Sovereign Bonds," Journal of Financial Decision Making, vol. 5, no. 2, December 2009.

    [16] T. Takagi and M. Sugeno, "Fuzzy Identification of Systems and Its Applications to Modeling and Control," IEEE Transactions on Systems, Man and Cybernetics, no. 1, pp. 116-132, January/February 1985.

    [17] S. Chiu, "A cluster estimation method with extension to fuzzy model identification," in Proceedings of The IEEE International Conference on Fuzzy Systems, vol. 2, June 1994, pp. 1240-1245.

    [18] P. P. Angelov and D. P. Filev, "An Approach to Online Identification of Takagi-Sugeno Fuzzy Models," IEEE Transactions on Systems, Man and Cybernetics-part B, vol. 34, no. 1, pp. 484-498, February 2004.

    [19] R. Jacobs, M. Jordan, S. Nowlan, and G. Hinton, "Adaptive Mixture of Local Experts," Neural Computation, vol. 3, no. 1, pp. 79-87, 1991.

    [20] L. Wang, Adaptive Fuzzy Systems and Control. Prentice Hall, 1994.

    [21] S. Haykin, Kalman Filtering and Neural Networks. John Wiley & Sons, Inc., 2001.

    [22] Y. Liu, "Value-at-risk model combination using artificial neural networks," Emory University Working Paper Series, 2005.