
[IEEE 2011 IEEE Workshop on Evolving and Adaptive Intelligent Systems (EAIS) - Part Of 17273 - 2011 Ssci - Paris, France (2011.04.11-2011.04.15)] 2011 IEEE Workshop on Evolving and


Evolving Fuzzy Systems for Pricing Fixed Income Options

Leandro Maciel and Fernando Gomide
School of Electrical and Computer Engineering
Department of Computer Engineering and Industrial Automation
University of Campinas
Campinas, Brazil 13.083-852
Emails: {maciel, gomide}@dca.fee.unicamp.br

Rosangela Ballini
Economics Institute
Department of Economics Theory
University of Campinas
Campinas, Brazil 13.083-857
Email: [email protected]

Abstract: In recent decades, option pricing has become an important topic in computational finance. The main issue is to obtain a model of option prices that reflects the price movements observed in the real world. In this paper we address option pricing using an evolving fuzzy system model and Brazilian interest rate option pricing data. Evolving models are particularly appropriate because they gradually develop the model structure and its parameters from a stream of data. Evolving fuzzy models therefore provide a higher level of system adaptation and learn the system dynamics continuously, an essential attribute in option price estimation. In particular, we emphasize the use of the evolving participatory learning method. The model suggested in this paper is compared against the traditional Black closed-form formula, artificial neural network structures, and alternative evolving fuzzy system approaches. The actual daily data used in the experiments cover the period from January 2003 to June 2008. We measure the forecast performance of all models using summary measures of forecast accuracy and statistical tests for competing models. The results show that the evolving fuzzy system model is effective, especially for out-of-the-money options.

Keywords: Evolving Fuzzy Systems; Option Pricing; Neural Networks; Interest Rate; Derivatives.

I. INTRODUCTION

One of the most important fields in financial engineering is asset pricing, particularly in derivatives markets. Option pricing offers an important information source for market players, since options provide mechanisms for hedging against large market fluctuations. The term structure of interest rates, or yield curve, is the basis for most investments, which is why pricing interest rate derivatives currently attracts the attention of researchers and practitioners.

In Brazilian derivatives markets, an important fixed income instrument is the One-Day Interbank Deposit Contract Index (IDI) option. IDI options are contracts that reflect the behavior of interest rates between the trade date and the option maturity. In US and European derivatives markets, standard interest rate options have as underlying asset a fixed income security with maturity greater than that of the option. Because of this peculiarity, IDI option prices, as well as the factors that affect them, behave differently from those that follow standard models. Therefore, these options demand a particular approach.

Recently, [1], [2], [3] employed models based on the Black-Scholes (BS) [4] closed-form formula and its derivations for IDI option pricing. Their results show that theoretical prices differ significantly from actual market prices. This fact reveals the limitations of these models: although option pricing occurs daily, prices computed with these models do not depend on information recorded in the past. Since the input variables are time series and models based on closed-form formulas cannot capture the temporal dependence of the series involved, models that describe the dynamical properties of the series are better suited for forecasting. These features suggest the need to develop more accurate option pricing models that overcome restrictive assumptions and take into account the information provided by the data set.

Since option pricing theory typically yields nonlinear relations between option prices and the variables that determine them, a highly flexible model is required to capture the empirical pricing mechanism. Computational intelligence (CI) methods based on Artificial Neural Networks (ANNs) and Fuzzy Set Theory are particularly suited for this purpose because of their ability to handle complex systems. Over the last decades, several researchers have applied ANNs to option pricing [5]–[11]. In general, their results show that neural network methodologies outperform traditional option pricing models.

The use of Fuzzy Set Theory in option pricing has become a new research field in financial engineering. For instance, [12] uses the BS formula with the risk-free interest rate, volatility, and asset price as fuzzy variables to price European options. This allows financial analysts to choose the European option price with an acceptance degree.

A fuzzy pricing model for currency options was addressed in [13]. There the option price is a fuzzy number. The authors show that the model helps financial investors to pick a currency option price with an acceptable degree for later use, and that the approach is useful for handling the imprecise nature of the problem.

Considering the main classes of stochastic volatility models (the endogenous and exogenous sources of risk), [14] generalized the Black-Scholes option valuation model in a fuzzy environment. The option price is a fuzzy number which, according to the authors, helps to reduce the price misspecification associated with uncertainty and vagueness. More recently, [15] introduced a fuzzy time series-based neural network (FTSNN), a hybrid approach composed of a fuzzy time series model and a neural network model, to price options. Their results show that FTSNN outperforms many existing methods in terms of distinct error measures.

978-1-4244-9979-3/11/$26.00 ©2011 IEEE

The purpose of this paper is to develop an evolving fuzzy rule-based model to price European IDI call options traded on the Securities, Commodities and Futures Exchange, BM&FBOVESPA, the Brazilian Stock Market Exchange.

The concept of evolving fuzzy systems introduces the idea of gradual self-organization and parameter learning in fuzzy rule-based models [23]. Evolving fuzzy systems use data streams to continuously adapt the structure and functionality of fuzzy rule-based models. The evolving mechanism ensures greater generality of the structural changes because rules are able to describe a number of data samples. Evolving fuzzy rule-based models include mechanisms for rule modification that replace a less informative rule by a more informative one [23]. Overall, evolution guarantees gradual change of the rule base structure while inheriting structural information. The idea of adapting the antecedent and consequent parameters of the rules is similar in the framework of evolving connectionist systems [24], evolving Takagi-Sugeno (eTS) and extended Takagi-Sugeno (xTS) models, and their variations [23], [25], [26]. In particular, the eTS model is a functional fuzzy model in the Takagi-Sugeno (TS) form whose rule base and parameters continually evolve by adding new rules with higher summarization power and by modifying existing rules and parameters to match current knowledge.

This paper suggests an evolving fuzzy participatory learning (ePL) [27] approach for option pricing modeling. The approach joins the concept of participatory learning (PL) [28] with the evolving fuzzy modeling idea [23], [25]. In evolving systems, the PL concept is viewed as an unsupervised clustering algorithm [29] and is a natural candidate for finding rule base structures in dynamic environments. Here we focus on functional fuzzy (TS) models. As in eTS, structure identification and self-organization in ePL mean estimating the focal points of the rules, that is, the antecedent parameters, except that ePL uses participatory learning fuzzy clustering instead of scattering, density, or information potential. With the antecedent parameters fixed, the remaining TS model parameters can be found using least squares methods [30]. Instead of using data potential as in eTS, evolving fuzzy participatory learning captures the rule base structure at each step using convex combinations of new data samples and the closest cluster center, the focal point of a rule. Afterwards, the rule base structure is updated and, as in eTS, the parameters of the rule consequents are computed using the recursive least squares algorithm. Evolving fuzzy systems are well suited for option pricing since these models consider temporal dependence among variables and are able to change their structure when necessary according to market fluctuations.

After this introduction, the paper proceeds as follows. Section II briefly reviews the idea of evolving fuzzy rule-based modeling and the basic eTS method. Next, Section III introduces the concept of participatory learning, the features of the evolving fuzzy participatory learning (ePL) method, and its computational details. Section IV explains IDI contracts, the Black model, and alternative neural network architectures. Section V compares the evolving fuzzy model against the Black model, neural networks, and other types of evolving fuzzy models. Finally, Section VI concludes the paper, summarizing its contributions and suggesting issues for further investigation.

II. EVOLVING FUZZY SYSTEMS

When learning models online, data are collected and processed continuously. New data may confirm and reinforce the current model if they are compatible with existing knowledge. Otherwise, new data may suggest changes and a need to review the current model. This is the case in recurrent models and in dynamic systems whose operating conditions change, in which faults occur, or whose process parameters change.

In evolving systems, a key question is how to modify the current model structure using the newest data sample. Evolving systems use incoming information to continuously develop their structure and functionality through online self-organization.

Fuzzy rule-based models whose rules are endowed with local models forming their consequents are commonly referred to as fuzzy functional models. The Takagi-Sugeno (TS) model is a typical example of a fuzzy functional model. A particularly important case is when the rule consequents are linear functions. The evolving Takagi-Sugeno (eTS) model and its variations [25] assume rule-based models whose fuzzy rules are as follows:

$$ \mathcal{R}_i:\ \text{IF } x \text{ is } \Gamma_i \text{ THEN } y_i = \gamma_{i0} + \sum_{j=1}^{p} \gamma_{ij} x_j, \qquad i = 1, \ldots, c^k $$

where
$\mathcal{R}_i$ is the $i$th fuzzy rule;
$c^k$ is the number of fuzzy rules at $k$, $k = 0, 1, \ldots$;
$x \in [0, 1]^p$ is the input data;
$y_i$ is the output of the $i$th rule;
$\Gamma_i$ is the vector of antecedent fuzzy sets;
$\gamma_{ij}$ are the parameters of the consequent.

The collection of the $i = 1, \ldots, c^k$ rules assembles a model as a combination of local linear models. The contribution of a local linear model to the overall output is proportional to the degree of firing of each rule. eTS uses antecedent fuzzy sets with Gaussian membership functions:

$$ \mu_i = e^{-r \| x^k - v_i \|^2} \tag{1} $$

where $r$ is a positive constant which defines the zone of influence of the $i$th local model and $v_i$ is the respective cluster center or focal point, $i = 1, \ldots, c^k$.

Online learning with eTS needs online clustering to find cluster centers, assumes gradual changes of the rule base, and uses recursive least squares to compute the consequent parameters. Each cluster defines a rule.


The TS model output at $k$ is found as the weighted average of the individual rule contributions as follows:

$$ y^k = \frac{\sum_{j=1}^{c^k} \mu_j y_j}{\sum_{j=1}^{c^k} \mu_j} \tag{2} $$

where $c^k$ is the number of rules after $k$ observations and $y_j$ is the output of the $j$th rule at $k$.
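The inference step defined by (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the two example rules, and the value of `r` are assumptions chosen only to make the example concrete.

```python
import math

def firing_degree(x, center, r):
    """Gaussian membership of eq. (1): exp(-r * ||x - v||^2)."""
    sq_dist = sum((xj - vj) ** 2 for xj, vj in zip(x, center))
    return math.exp(-r * sq_dist)

def ts_output(x, centers, consequents, r):
    """Weighted average of eq. (2) over all rules.

    consequents[i] = [gamma_i0, gamma_i1, ..., gamma_ip] (linear local model).
    """
    mu = [firing_degree(x, v, r) for v in centers]
    y_local = [g[0] + sum(gj * xj for gj, xj in zip(g[1:], x))
               for g in consequents]
    return sum(m * y for m, y in zip(mu, y_local)) / sum(mu)

# Two hypothetical rules on a one-dimensional input
centers = [[0.2], [0.8]]
consequents = [[0.0, 1.0], [1.0, 0.0]]   # y_1 = x, y_2 = 1
print(ts_output([0.5], centers, consequents, r=4.0))
```

In this example the input is equidistant from both centers, so the two rules fire equally and the output is approximately the plain average of the two local outputs, 0.75.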

Clustering starts with the first data point as the center of the first cluster. The procedure is a form of subtractive clustering, a variation of the Filev and Yager mountain clustering approach [31]. The capability of a point to become a cluster center is evaluated through its potential. Data potentials are calculated recursively, using a Cauchy function to measure the potential. If the potential of a new data point is higher than the potential of the current cluster centers, then the new data point becomes a new center and a new rule is created. If the potential of a new data point is higher than the potential of the current centers, but it is close to an existing center, then the new data point replaces the existing center. See [23] and [25] for more details. Current implementations of eTS adopt Cauchy functions to define the notion of data density evaluated around the last data point of a data stream, monitor the clustering step, and contain several mechanisms to improve model efficiency, such as online structure simplification.

III. EVOLVING FUZZY PARTICIPATORY LEARNING

Evolving fuzzy participatory learning (ePL) modeling adopts the same philosophy as eTS. After the initialization phase, data processing is performed at each step to verify whether a new cluster must be created, whether an old cluster should be modified to account for the new data, or whether redundant clusters must be eliminated. Cluster centers are the focal points of the rules. Each rule corresponds to a cluster. Parameters of the consequent functions are computed using the local recursive least squares algorithm. In this paper we assume, without loss of generality, linear consequent functions.

The main difference between ePL and eTS concerns the procedure used to update the rule base structure. Unlike eTS, ePL uses a fuzzy similarity measure to determine the proximity between new data and the existing rule base structure. The rule base structure is isomorphic to the cluster structure because each rule is associated with a cluster. Participatory learning assumes that model learning depends on what the system already knows about the model. Therefore, in ePL, the current model is part of the evolving process itself and influences the way in which new observations are used for self-organization. An essential property of participatory learning is that the impact of new data in causing self-organization or model revision depends on its compatibility with the current rule base structure or, equivalently, on its compatibility with the current cluster structure.

A. Participatory Learning

Let $v_i^k \in [0, 1]^p$ be a variable that encodes the $i$th ($i = 1, \ldots, c^k$) cluster center at the $k$th step. The aim of the participatory mechanism is to learn the value of $v_i^k$ using a stream of data $x^k \in [0, 1]^p$. In other words, each $x^k$, $k = 1, \ldots$, is used as a vehicle to learn about $v_i^k$. We say that the learning process is participatory if the contribution of each data point $x^k$ to the learning process depends upon its acceptance by the current estimate of $v_i^k$ as being valid. Implicit in this idea is that, to be useful and to contribute to the learning of $v_i^k$, observations $x^k$ must somehow be compatible with the current estimates of $v_i^k$.

In ePL, the objects of learning are cluster structures. Cluster structures are defined by cluster centers (or prototypes). More formally, given an initial cluster structure, a set of vectors $v_i^k \in [0, 1]^p$, $i = 1, \ldots, c^k$, is updated using a compatibility measure $\rho_i^k \in [0, 1]$ and an arousal index $a_i^k \in [0, 1]$. While $\rho_i^k$ measures how compatible a data point is with the current cluster structure, the arousal index $a_i^k$ acts as a critic that signals when the current cluster structure should be revised in light of new information contained in the data. Figure 1 summarizes the main constituents and functioning of the participatory learning approach.

[Figure: block diagram with the blocks Data, Learning Process, Arousal Mechanism, and Cluster structure, and the signals $x^k$, $\rho$, $a$, and $v_i^k$, $i = 1, \ldots, c^k$.]

Fig. 1. Participatory Learning.

Due to its unsupervised, self-organizing nature, the PL clustering procedure may at each step create a new cluster or modify existing ones. If the arousal index is greater than a threshold value $\tau \in [0, 1]$, then a new cluster is created. Otherwise the $i$th cluster center, the one most compatible with $x^k$, is adjusted as follows:

$$ v_i^{k+1} = v_i^k + G_i^k (x^k - v_i^k) \tag{3} $$

where

$$ G_i^k = \alpha \rho_i^k \tag{4} $$

$\alpha \in [0, 1]$ is the learning rate, and

$$ \rho_i^k = 1 - \frac{\| x^k - v_i^k \|}{p} \tag{5} $$

with $\| \cdot \|$ a norm, $p$ the dimension of the input space, and

$$ i = \arg\max_j \{ \rho_j^k \} \tag{6} $$

Notice that the updated $i$th cluster center is a convex combination of the new data sample $x^k$ and the closest cluster center.
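The update in (3) to (6) can be sketched as follows. This is a minimal illustration rather than the authors' code: the text only says that $\| \cdot \|$ is a norm, so the city-block (L1) norm used here is an assumption (it keeps $\rho_i^k$ in $[0, 1]$ for inputs in $[0, 1]^p$), and the function names and example values are hypothetical.

```python
def compatibility(x, center):
    """Eq. (5): 1 - ||x - v|| / p, using the L1 norm (an assumption)."""
    p = len(x)
    return 1.0 - sum(abs(xj - vj) for xj, vj in zip(x, center)) / p

def pl_update(x, centers, alpha):
    """Move the most compatible center toward x, eqs. (3), (4), and (6).

    Modifies `centers` in place and returns (winner index, its compatibility).
    """
    rho = [compatibility(x, v) for v in centers]
    i = max(range(len(centers)), key=lambda j: rho[j])        # eq. (6)
    gain = alpha * rho[i]                                      # eq. (4)
    centers[i] = [vj + gain * (xj - vj)                        # eq. (3)
                  for xj, vj in zip(x, centers[i])]
    return i, rho[i]

# Hypothetical two-cluster structure in [0, 1]^2
centers = [[0.0, 0.0], [1.0, 1.0]]
winner, rho = pl_update([0.2, 0.0], centers, alpha=0.5)
print(winner, rho, centers[winner])
```

Because the gain lies in $[0, 1]$, the updated center always stays on the segment between the old center and the new sample, which is the convex combination property noted above.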

Similarly to (3), the arousal index $a_i^k$ is updated as follows:

$$ a_i^{k+1} = a_i^k + \beta (1 - \rho_i^{k+1} - a_i^k) \tag{7} $$


The value of $\beta \in [0, 1]$ controls the rate of change of arousal: the closer $\beta$ is to one, the faster the system senses compatibility variations.

ePL takes the arousal mechanism into account by incorporating the arousal index (7) into (4) as follows:

$$ G_i^k = \alpha (\rho_i^k)^{1 - a_i^k} \tag{8} $$

When $a_i^k = 0$, we have $G_i^k = \alpha \rho_i^k$, which is the PL procedure with no arousal. Notice that if the arousal index increases, the similarity measure has a reduced effect. The arousal index can be interpreted as the complement of the confidence we have in the truth of the current belief, the rule base structure. The arousal mechanism monitors the performance of the system by observing the compatibility of the current model with the observations. Therefore learning is dynamic in the sense that (3) can be viewed as a belief revision strategy whose effective learning rate (8) depends on the compatibility between new data and the current cluster structure, and on model confidence as well.
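A small numerical sketch makes the effect of arousal on the effective learning rate concrete. The two functions below transcribe (7) and (8) directly; the function names and the numeric values are illustrative assumptions, not settings used in the paper.

```python
def arousal_update(a, rho, beta):
    """Eq. (7): a <- a + beta * (1 - rho - a)."""
    return a + beta * (1.0 - rho - a)

def effective_gain(alpha, rho, a):
    """Eq. (8): G = alpha * rho ** (1 - a)."""
    return alpha * rho ** (1.0 - a)

alpha, rho = 0.1, 0.25          # hypothetical learning rate and compatibility
for a in (0.0, 0.5, 1.0):
    # As arousal grows, rho ** (1 - a) moves toward 1 and the gain toward alpha
    print(a, effective_gain(alpha, rho, a))

# A low compatibility raises arousal on the next step
print(arousal_update(a=0.0, rho=0.25, beta=0.5))
```

With zero arousal the gain is $\alpha \rho$, as in (4); with arousal equal to one the compatibility no longer matters and the gain equals $\alpha$.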

Notice also that the learning rate is modulated by compatibility. In conventional learning models there are no participatory considerations, and the learning rate is usually set small to avoid undesirable oscillations due to spurious data values that are far from cluster centers. Small values of the learning rate, while protecting against the influence of noisy data, slow down learning. Participatory learning allows the use of higher values of the learning rate, and the compatibility index acts to lower the effective learning rate when large deviations occur. Conversely, when the compatibility is large, it increases the effective rate, which speeds up the learning process.

Clearly, whenever a cluster center is updated or a new cluster is added, the PL fuzzy clustering procedure should verify whether redundant clusters have been created. This is because updating a cluster center using (3) may push a given center closer to another one, so that a redundant cluster may be formed. Therefore a mechanism to exclude redundancy is needed. One mechanism is to verify whether distinct rules produce similar outputs. In PL clustering, a cluster center is declared redundant whenever its similarity with another center is greater than or equal to a threshold value $\lambda$. If this is the case, then we can either maintain the original cluster center or replace it by the average between the new data and the current cluster center. Similarly to (5), the compatibility index among cluster centers is computed as follows:

$$ \rho_{v_i}^k = 1 - \sum_{j=1}^{p} | v_i^k - v_j^k | \tag{9} $$

Therefore, if

$$ \rho_{v_i}^k \geq \lambda \tag{10} $$

then the cluster $i$ is declared redundant.

Detailed guidelines to choose appropriate values of $\alpha$, $\beta$, $\lambda$, and $\tau$ are given in [36], where it is shown that they should be chosen such that:

$$ 0 < \frac{\tau}{\beta} \leq 1 - \lambda \leq 1 $$

where $\tau \leq \beta$ and $\lambda \leq 1 - \tau$.
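The redundancy test of (9) and (10) and the parameter guideline can be sketched as below. The code reads the sum in (9) as running over the coordinates of two centers being compared, which is an interpretation of the notation; the function names and example values are hypothetical.

```python
def center_compatibility(v_i, v_j):
    """Eq. (9), read as 1 minus the sum of absolute coordinate differences."""
    return 1.0 - sum(abs(a - b) for a, b in zip(v_i, v_j))

def redundant_pairs(centers, lam):
    """Pairs (i, j), i < j, whose compatibility meets the threshold of eq. (10)."""
    return [(i, j)
            for i in range(len(centers))
            for j in range(i + 1, len(centers))
            if center_compatibility(centers[i], centers[j]) >= lam]

def satisfies_guideline(beta, lam, tau):
    """Check the guideline 0 < tau / beta <= 1 - lam <= 1."""
    return 0.0 < tau / beta <= 1.0 - lam <= 1.0

# Hypothetical centers: the first two nearly coincide
centers = [[0.10, 0.10], [0.15, 0.10], [0.90, 0.90]]
print(redundant_pairs(centers, lam=0.9))
print(satisfies_guideline(beta=0.1, lam=0.5, tau=0.04))
```

A check such as `satisfies_guideline` is cheap to run once before clustering starts, so that an inconsistent choice of thresholds is noticed early.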

B. Parameter estimation

After clustering, the fuzzy rule base is constructed using a procedure similar to that of eTS described in Section II. The cluster centers define the modal values of the Gaussian membership functions, while dispersions are chosen to achieve appropriate levels of rule overlapping. Moreover, for each cluster found, the corresponding rules have their linear consequent function parameters adjusted using the recursive least squares algorithm. The computational details are as follows.

Let $x^k = [x_1^k, x_2^k, \ldots, x_p^k] \in [0, 1]^p$ be the vector of observations and $y_i^k \in [0, 1]$ the output of the $i$th rule, $i = 1, \ldots, c^k$, at $k = 1, 2, \ldots$. Each local model is estimated by a linear equation:

$$ Y^k = X^k \gamma^k \tag{11} $$

where $\gamma^k \in \mathbb{R}^{p+1}$ is a vector of unknown parameters defined by

$$ (\gamma^k)^T = [\gamma_0^k \ \ \gamma_1^k \ \ \gamma_2^k \ \ \ldots \ \ \gamma_p^k] $$

and $X^k = [1 \ \ x^k] \in [0, 1]^{k \times (p+1)}$ is the matrix composed of the $k$th input vector $x^k$ and a constant term; and $Y^k = [y_i^k] \in [0, 1]^k$, $i = 1, \ldots, c^k$, is the output vector.

The model (11) gives a local description of the system, but the vector of parameters $\gamma$ is unknown. One way to estimate the values of $\gamma$ is to use the available data. Assume that

$$ Y^k = X^k \gamma^k + e^k \tag{12} $$

where $\gamma^k$ represents the parameters to be recursively computed and $e^k$ is the modeling error at $k$. The least squares algorithm chooses $\gamma^k$ to minimize the sum of squared errors

$$ J^k = J(\gamma^k) = (e^k)^T e^k \tag{13} $$

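Define $X^{k+1}$ and $Y^{k+1}$ as follows:

$$ X^{k+1} = \begin{pmatrix} X^k \\ 1 \ \ x^{k+1} \end{pmatrix}, \qquad Y^{k+1} = \begin{pmatrix} Y^k \\ y_i^{k+1} \end{pmatrix} \tag{14} $$

where $x^{k+1}$ is the current input data and $y_i^{k+1}$ is the corresponding model output. The vector of parameters that minimizes the functional $J^k$ at $k$ is [32]:

$$ \gamma^k = P^k b^k \tag{15} $$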

where $P^k = [(X^k)^T X^k]^{-1}$ and $b^k = (X^k)^T Y^k$. Using the matrix inversion lemma [32]:

$$ (A + BCD)^{-1} = A^{-1} - A^{-1} B (C^{-1} + D A^{-1} B)^{-1} D A^{-1} $$

and making

$$ A = (P^k)^{-1}, \quad C = I, \quad B = X^{k+1}, \quad D = (X^{k+1})^T $$

we get

$$ P^{k+1} = P^k \left[ I - \frac{X^{k+1} (X^{k+1})^T P^k}{1 + (X^{k+1})^T P^k X^{k+1}} \right] \tag{16} $$

where $I$ is the identity matrix. After simple mathematical transformations, the vector of parameters is computed recursively as follows:

$$ \gamma^{k+1} = \gamma^k + P^{k+1} X^{k+1} \left( Y^{k+1} - (X^{k+1})^T \gamma^k \right) \tag{17} $$

A detailed derivation can be found in [33]. For convergence proofs see, for example, [34]. Expression (17) is used to update the rule consequent parameters at each $k$.

The use of the recursive least squares algorithm depends on the initial values of the parameters $\gamma^0$ and on the initial values of the entries of matrix $P^0$. These initial values are chosen based on:

1) Existing knowledge about the system: a database is explored to find an initial rule base and, consequently, $\gamma^0$ and $P^0$.

2) No previous information available: a useful technique is to choose large values for the entries of matrix $P^0$ [35]. If the initial values of the consequent parameters are close to the exact values, then it is enough to choose small values for the entries of $P^0$. A standard choice of $P^0$ is

$$ P^0 = s I_m $$

where $I_m$ is the identity matrix of order $m$, and $m$ is the number of consequent parameters. The value of $s$ is usually chosen such that $s \in [100, 10000]$ if large values are required, while for small values $s \in [1, 10]$. More details can be found in [35].

In this paper, we use the first option, that is, we use a database to choose the initial rule base and its parameters.
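One recursive least squares step, as in (16) and (17), can be sketched in plain Python. Here `x` stands for the newest regressor $[1, x_1, \ldots, x_p]$ and `y` for the newest observed output; these names, the synthetic data, and the choice $s = 1000$ are assumptions made for illustration (the paper itself initializes from a database rather than from $P^0 = s I_m$).

```python
def rls_step(gamma, P, x, y):
    """One recursive least squares update, eqs. (16) and (17).

    gamma: current parameters; P: square matrix as a list of rows;
    x: regressor [1, x_1, ..., x_p]; y: newest observed output.
    """
    m = len(gamma)
    Px = [sum(P[r][c] * x[c] for c in range(m)) for r in range(m)]    # P x
    xP = [sum(x[r] * P[r][c] for r in range(m)) for c in range(m)]    # x^T P
    denom = 1.0 + sum(x[r] * Px[r] for r in range(m))                 # 1 + x^T P x
    P_new = [[P[r][c] - Px[r] * xP[c] / denom for c in range(m)]
             for r in range(m)]                                       # eq. (16)
    error = y - sum(x[c] * gamma[c] for c in range(m))                # y - x^T gamma
    P_new_x = [sum(P_new[r][c] * x[c] for c in range(m)) for r in range(m)]
    gamma_new = [gamma[r] + P_new_x[r] * error for r in range(m)]     # eq. (17)
    return gamma_new, P_new

# Synthetic data from y = 1 + 2x, starting from gamma^0 = 0 and P^0 = s * I
s = 1000.0
gamma, P = [0.0, 0.0], [[s, 0.0], [0.0, s]]
for xk in [0.0, 0.25, 0.5, 0.75, 1.0]:
    gamma, P = rls_step(gamma, P, [1.0, xk], 1.0 + 2.0 * xk)
print(gamma)
```

With a large $s$ the initial guess carries little weight, so after a handful of noise-free samples the estimate is already close to the generating parameters $[1, 2]$.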

IV. FIXED INCOME OPTIONS AND PRICING MODEL

A. IDI Contracts

The most important Brazilian interest rate is the One-Day Interbank Deposit Contract rate, or "CDI rate". It is computed as the average rate of all interbank overnight transactions in Brazil and is published daily by ANBIMA (Brazilian National Association of Investment Banks).

The underlying asset of IDI options is the One-Day Interbank Deposit Contract Index, or IDI index. This index is calculated by BM&FBOVESPA by accruing the daily CDI rate. The IDI index was set to 100,000 on January 2nd, 2003. The IDI index at $t$ is

$$ IDI_t = 100{,}000 \cdot \prod_{u=1}^{t} (1 + CDI_u) \tag{18} $$

where $u = 0$ refers to the date when the IDI was set to 100,000 and $CDI_u$ is the CDI rate of day $u$.

IDI options are European-style, cash-settled options that entitle the owner to receive the maximum between zero and the difference between the index and the strike price, with the sign of the difference depending on the option type, i.e., call or put.
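The accrual in (18) and the cash settlement at maturity can be sketched as follows. The function names and the sample daily rates are hypothetical; rates are assumed to be expressed as daily decimal fractions.

```python
def idi_index(cdi_rates, base=100_000.0):
    """Eq. (18): accrue daily CDI rates on the base value of 100,000."""
    value = base
    for rate in cdi_rates:
        value *= 1.0 + rate
    return value

def idi_payoff(idi_at_maturity, strike, kind="call"):
    """Cash settlement at maturity of a European IDI call or put."""
    diff = idi_at_maturity - strike
    return max(0.0, diff if kind == "call" else -diff)

# Two business days at a hypothetical daily CDI rate of 0.05%
index = idi_index([0.0005, 0.0005])
print(index)
print(idi_payoff(index, strike=100_000.0))
print(idi_payoff(index, strike=100_000.0, kind="put"))
```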

B. Black (1976) Model

The model proposed by Black is commonly used by players on BM&FBOVESPA. The model is based on [16] and gives the price of an IDI call option as follows:

$$ c_t = IDI_t \cdot N(d_1) - X \cdot P(t, T) \cdot N(d_2) \tag{19} $$

with

$$ d_1 = \frac{\ln\left( \frac{IDI_t}{X \cdot P(t, T)} \right) + \frac{\sigma^2 (T - t)^3}{6}}{\sigma \sqrt{\frac{(T - t)^3}{3}}} \quad \text{and} \quad d_2 = d_1 - \sigma \sqrt{\frac{(T - t)^3}{3}} $$

where $IDI_t$ is the value of the IDI index at time $t$, $P(t, T)$ is the price at time $t$ of a discount security paying \$1 at maturity $T$, $\sigma$ is the short-term interest rate volatility, $X$ is the strike price, and $N(\cdot)$ is the normal cumulative distribution function.
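The pricing formula (19) can be evaluated with the standard library alone. In this sketch, $d_2$ is computed as $d_1$ minus $\sigma \sqrt{(T-t)^3/3}$, the usual Black form in which the same volatility term appears in both quantities; this, the function names, the time unit of `tau` $= T - t$, and the numeric inputs are all assumptions for illustration.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def black_idi_call(idi, strike, discount, sigma, tau):
    """Eq. (19). discount = P(t, T); tau = T - t (time unit is an assumption)."""
    v = sigma * math.sqrt(tau ** 3 / 3.0)          # volatility term of d1 and d2
    d1 = (math.log(idi / (strike * discount)) + 0.5 * v * v) / v
    d2 = d1 - v
    return idi * norm_cdf(d1) - strike * discount * norm_cdf(d2)

# Hypothetical inputs, for illustration only
print(black_idi_call(idi=105.0, strike=100.0, discount=1.0, sigma=0.3, tau=1.0))
```

A quick sanity check on any implementation is that the price never falls below $IDI_t - X \cdot P(t, T)$ and that it increases with $\sigma$.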

In this model, the short-term interest rate volatility is estimated using a Generalized Autoregressive Conditional Heteroskedasticity process, GARCH(1,1), based on CDI rate returns and parametrized according to the Bayesian information criterion (BIC) [19]. Volatility is the only unobservable parameter in the Black model. The strike price, IDI index, and maturity form the data set.

C. Neural Networks Models

A number of empirical studies, including [17], show that ANNs are better than several time series models in finance and economics applications. However, [18] pointed out that both approaches have a similar degree of accuracy in some cases. For comparison purposes, we use feedforward and recurrent neural network models for IDI call option pricing.

In particular, we adopt a feedforward neural network (FFNN) and Elman and Jordan recurrent neural network structures (ERNN and JRNN, respectively) for IDI call option pricing.

The main issue in multilayer network models is how to find the optimal architecture, that is, the optimal number of hidden layers and neurons. As inputs we used all parameters that can influence the IDI call option price: the IDI index, the short-term interest rate volatility, the maturity, and the strike price.

The data set was partitioned into three sets: a training set, a validation set, and a test set, with 65%, 20%, and 15% of the total data, respectively. The backpropagation algorithm was used to train the multilayer network.
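The 65/20/15 partition can be sketched as below. The paper does not state how samples were assigned to each set; keeping the time order, so that the test set is the most recent block, is an assumption, and the function name is hypothetical.

```python
def chronological_split(samples, train=0.65, validation=0.20):
    """Split a time-ordered sequence into training, validation, and test blocks.

    The test block receives whatever remains after the first two blocks.
    """
    n = len(samples)
    n_train = round(n * train)
    n_val = round(n * validation)
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])

train_set, val_set, test_set = chronological_split(list(range(100)))
print(len(train_set), len(val_set), len(test_set))
```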

V. RESULTS AND DISCUSSION

The data consist of time series of IDI call options for different strikes and maturities. Puts were not considered in this work since their liquidity is very low in the Brazilian option market. The data cover the period from January 2nd, 2003 to June 5th, 2008, and were provided by BM&FBOVESPA. We selected the most liquid IDI calls on each day, with at least 1,000 contracts traded. The daily CDI rate, in turn, was collected for all business days covered by this work.

The models were adjusted considering the same inputs that determine the IDI call option price according to the Black model: the IDI index at day $t$ ($IDI_t$); the short-term interest rate volatility ($\sigma_t$), estimated by a GARCH(1,1) process considering CDI spot rate returns; the strike price ($X$); and the maturity ($T$). The output represents the IDI call price ($c_t$).

Table I summarizes the neural network structures adjusted to IDI call options. The numbers of hidden neurons were selected using the Bayesian Information Criterion (BIC) [19], based on the root mean squared error (RMSE). The structures obtained are not complex, and all types of networks are composed of three hidden layers.

TABLE I
FEEDFORWARD AND RECURRENT NEURAL NETWORK STRUCTURES.

         Number of neurons
Model    1st hidden layer    2nd hidden layer    3rd hidden layer
FFNN     3                   3                   2
ERNN     4                   5                   3
JRNN     5                   6                   4

The ePL model adopted the following values: $\beta = \tau = \alpha = 0.01$ and $\lambda = 0.11$. These are the only parameters of the ePL algorithm that need to be chosen by the user. ePL found 4 rules during the 3,512-day development period. During testing, ePL uses daily data and runs in online mode.

The eTS and xTS models were developed considering the first 3,512 days before the testing period. The eTS model used a cluster radius $r = 0.4$ and $P = 450$, and it adjusted 8 rules. The xTS model adopted $P^0 = 750$ to initialize the recursive least squares and found 6 rules. In the experiments reported here, we ran the eTS and xTS implementations given in [37].

A. Performance Measurement

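To determine the pricing accuracy of each model's estimates, we examine the Mean Absolute Percentage Error (MAPE), Maximum Percentage Error (MPE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Theil's inequality coefficient (TIC):

$$ MAPE = \frac{100}{N} \sum_{t=1}^{N} \frac{\left| c_t^{mkt} - c_t^{m} \right|}{c_t^{mkt}} \tag{20} $$

$$ MPE = \max_{t = 1, \ldots, N} \left\{ 100 \, \frac{\left| c_t^{mkt} - c_t^{m} \right|}{c_t^{mkt}} \right\} \tag{21} $$

$$ RMSE = \sqrt{ \frac{1}{N} \sum_{t=1}^{N} \left( c_t^{mkt} - c_t^{m} \right)^2 } \tag{22} $$

$$ MAE = \frac{1}{N} \sum_{t=1}^{N} \left| c_t^{mkt} - c_t^{m} \right| \tag{23} $$

$$ TIC = \frac{ \sqrt{ \frac{1}{N} \sum_{t=1}^{N} \left( c_t^{m} - c_t^{mkt} \right)^2 } }{ \sqrt{ \frac{1}{N} \sum_{t=1}^{N} \left( c_t^{m} \right)^2 } + \sqrt{ \frac{1}{N} \sum_{t=1}^{N} \left( c_t^{mkt} \right)^2 } } \tag{24} $$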

where $c^{m}$ represents the theoretical prices, $c^{mkt}$ the actual prices, and $N$ the sample size. For the TIC index, the best model in terms of accuracy is the one whose index is closest to zero.
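The five measures in (20) to (24) translate directly into code. The sketch below is illustrative: the function name, the dictionary keys, and the three sample prices are assumptions, and market prices are assumed to be strictly positive so that the percentage errors are defined.

```python
import math

def error_measures(market, model):
    """MAPE, MPE, RMSE, MAE, and TIC of eqs. (20)-(24).

    market: actual prices c^mkt (assumed > 0); model: theoretical prices c^m.
    """
    n = len(market)
    abs_err = [abs(mk - md) for mk, md in zip(market, model)]
    pct_err = [100.0 * e / mk for e, mk in zip(abs_err, market)]
    rmse = math.sqrt(sum(e * e for e in abs_err) / n)
    tic = rmse / (math.sqrt(sum(md * md for md in model) / n)
                  + math.sqrt(sum(mk * mk for mk in market) / n))
    return {"MAPE": sum(pct_err) / n, "MPE": max(pct_err),
            "RMSE": rmse, "MAE": sum(abs_err) / n, "TIC": tic}

# Hypothetical prices, for illustration only
print(error_measures(market=[1.0, 2.0, 4.0], model=[1.1, 1.8, 4.0]))
```

TIC is bounded between 0 and 1, which makes it convenient for comparing models across price series of different scale.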

Moreover, we computed the coefficient of determination $R^2$, which is an indicator of accuracy. This indicator is obtained from a simple regression of the actual market prices on the theoretical prices estimated by each model.

The results were evaluated according to the degree of moneyness ($M$), defined as the ratio between the underlying asset spot price and the present value of the strike price:

$$ M = \frac{IDI_t}{X \cdot P(t, T)} \tag{25} $$

IDI options were then classified according to the degree of moneyness as out-of-the-money ($M \leq 1 - \alpha$), at-the-money ($1 - \alpha < M \leq 1 + \alpha$), or in-the-money ($M > 1 + \alpha$), with $\alpha = 5\%$.
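The classification rule can be written as a short function. The names below are hypothetical, and `alpha` is given as a decimal fraction (0.05 for 5%).

```python
def moneyness(idi, strike, discount):
    """Eq. (25): M = IDI_t / (X * P(t, T))."""
    return idi / (strike * discount)

def moneyness_class(m, alpha=0.05):
    """Classify a call by its degree of moneyness with a band of width alpha."""
    if m <= 1.0 - alpha:
        return "out-of-the-money"
    if m <= 1.0 + alpha:
        return "at-the-money"
    return "in-the-money"

# Hypothetical contract, for illustration only
m = moneyness(idi=110.0, strike=100.0, discount=1.0)
print(m, moneyness_class(m))
```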

Although the measures above are useful indicators of forecast accuracy and have been employed extensively in empirical research, they do not reveal whether the forecast from one model is statistically superior to that of another. Therefore, it is necessary to employ other commonly used tests that help compare two or more competing models in terms of forecasting accuracy.

One of the parametric tests employed in this work is the one proposed by Ashley, Granger, and Schmalensee [20], hereafter the AGS test, which tests the statistical significance of the difference between the RMSEs associated with two competing model forecasts. This test is based on the equation

$$ d_t = \beta_1 + \beta_2 (s_t - s_{\text{mean}}) + \psi_t \tag{26} $$

where $d_t$ is the difference between the forecast errors obtained by two different models, $s_t$ is the sum of these forecast errors, $s_{\text{mean}}$ is the sample mean of $s_t$, and $\psi_t$ is a white noise process. The AGS test is a test of the equality of the mean squared errors (MSEs) of two competing models against the alternative hypothesis that the second model's MSE is lower (more accurate) than the first model's. The test is performed by jointly testing the significance of the parameters $\beta_1$ and $\beta_2$ in (26). The test statistic for the AGS test is calculated from the residuals obtained from estimating an unrestricted model represented by (26) and a model that restricts $\beta_1 = \beta_2 = 0$ in (26). The statistic for this test is computed by:

$$ TS = \frac{(SSE_R - SSE_{UR}) / (k - 1)}{SSE_{UR} / (n - k)} \tag{27} $$

where $SSE_R$ is the sum of the squared residuals from the restricted model, $SSE_{UR}$ is the sum of the squared residuals from the unrestricted model, $n$ is the number of observations, and $k$ is the number of variables in the regression model. In this test, we assume that the forecast errors are not contemporaneously correlated and are free from serial correlation. The test statistic $TS$ is distributed as Snedecor's F-distribution with 2 and $n - 2$ degrees of freedom under the assumption of normality.
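The statistic in (27) can be computed by fitting (26) with ordinary least squares. The sketch below assumes $k = 2$ regression parameters and uses made-up forecast errors; the function name is hypothetical, and comparing the result with an F critical value is left to a statistics table or library.

```python
def ags_statistic(e1, e2, k=2):
    """Test statistic of eq. (27) from the regression of eq. (26).

    e1, e2: forecast errors of the two competing models (same length).
    """
    n = len(e1)
    d = [a - b for a, b in zip(e1, e2)]
    s = [a + b for a, b in zip(e1, e2)]
    s_mean = sum(s) / n
    z = [si - s_mean for si in s]                    # centered regressor
    beta1 = sum(d) / n                               # OLS intercept (z has zero mean)
    beta2 = sum(zi * di for zi, di in zip(z, d)) / sum(zi * zi for zi in z)
    sse_ur = sum((di - beta1 - beta2 * zi) ** 2 for di, zi in zip(d, z))
    sse_r = sum(di * di for di in d)                 # restricted: beta1 = beta2 = 0
    return ((sse_r - sse_ur) / (k - 1)) / (sse_ur / (n - k))

# Hypothetical forecast errors, for illustration only
e1 = [0.5, -0.2, 0.3, 0.1, -0.4, 0.6]
e2 = [0.1, -0.1, 0.2, 0.0, -0.1, 0.2]
print(ags_statistic(e1, e2))
```

Because the regressor is centered, the least squares intercept is simply the mean of $d_t$, which keeps the computation free of matrix algebra.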

Another parametric test of equal forecast accuracy is theMorgan-Granger-Newbold (MGN) test proposed by [21]. Thistest is often employed when the assumption of contemporane-ous correlation of errors is relaxed. The test statistic for thistest can be computed using:

$$MGN = \frac{\rho_{sd}}{(1 - \rho_{sd}^{2})^{1/2}} \, (n - 1)^{1/2} \qquad (28)$$

where $\rho_{sd}$ is the estimated correlation coefficient between $s = e_1 + e_2$ and $d = e_1 - e_2$, with $e_1$ and $e_2$ the residuals of the two fitted models. In this case, the statistic follows a Student’s t-distribution with $n - 1$ degrees of freedom. If the forecasts are equally accurate, the correlation between $s$ and $d$ will be zero.
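As a rough illustration of (28), the following Python sketch computes the MGN statistic from two series of forecast errors. The function name `mgn_statistic` is hypothetical, and the use of the centred (Pearson) sample correlation for the estimated correlation coefficient is an assumption, since the text does not state which estimator is used. The p-value under the Student's t-distribution with n - 1 degrees of freedom is left to a statistics library.

```python
import math


def mgn_statistic(e1, e2):
    """Return (MGN, rho_sd) for two equal-length forecast-error series."""
    n = len(e1)
    if n != len(e2) or n < 3:
        raise ValueError("need two equal-length series with at least 3 points")

    s = [a + b for a, b in zip(e1, e2)]  # s = e1 + e2
    d = [a - b for a, b in zip(e1, e2)]  # d = e1 - e2
    s_bar = sum(s) / n
    d_bar = sum(d) / n

    # Centred sample correlation between s and d (assumed estimator)
    cov_sd = sum((si - s_bar) * (di - d_bar) for si, di in zip(s, d))
    var_s = sum((si - s_bar) ** 2 for si in s)
    var_d = sum((di - d_bar) ** 2 for di in d)
    if var_s == 0.0 or var_d == 0.0:
        raise ValueError("s or d is constant; correlation is undefined")
    rho = cov_sd / math.sqrt(var_s * var_d)
    if abs(rho) >= 1.0:
        raise ValueError("perfect correlation; statistic is undefined")

    # Equation (28): rho / sqrt(1 - rho^2) * sqrt(n - 1)
    mgn = rho / math.sqrt(1.0 - rho ** 2) * math.sqrt(n - 1)
    return mgn, rho


# Toy forecast errors, made up purely for illustration
errors_model_1 = [1.0, 2.0, 0.0, 3.0]
errors_model_2 = [0.0, 1.0, 1.0, 1.0]
mgn, rho = mgn_statistic(errors_model_1, errors_model_2)
print(f"rho_sd = {rho:.4f}, MGN = {mgn:.4f}")
```

Swapping the two error series flips the sign of d, and therefore the sign of both the correlation and the statistic, so the order of the arguments encodes which model is treated as the first one.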

B. Comparison Analysis

We evaluated the pricing performance of all estimated models using summary error measures and statistical tests of forecast accuracy. For the neural networks, only the test set was considered, since it reflects their out-of-sample forecasting capability. This restriction aims to avoid the overfitting problem in ANN models; for the Black closed-form formula and the evolving fuzzy models, the distinction between training and test sets does not apply. The test set consists of data from August 1st, 2007 to June 5th, 2008, a sample of 882 points.

Table II shows the results of the traditional error measures. According to these measures, the Black model performed poorly in comparison with the computational methods applied.

The evolving fuzzy models produced the lowest errors for all measures, which confirms their ability to handle temporally dependent data. According to the traditional forecast measures, the evolving fuzzy systems applied in this work therefore yield IDI call prices closer to actual prices than the closed-form formula and the ANN models.

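TABLE II. FORECAST EVALUATION ACCORDING TO ERROR MEASURES FOR IDI CALL OPTIONS.

| Models | MAPE | MPE | RMSE | MAE | TIC | R² |
|--------|-------|--------|------|------|------|------|
| Black | 47.34 | 549.10 | 0.27 | 0.23 | 0.21 | 0.01 |
| FFNN | 18.74 | 512.87 | 0.11 | 0.08 | 0.11 | 0.16 |
| ERNN | 15.42 | 500.53 | 0.10 | 0.07 | 0.09 | 0.18 |
| JRNN | 15.04 | 504.35 | 0.09 | 0.07 | 0.09 | 0.20 |
| eTS | 12.73 | 501.12 | 0.08 | 0.06 | 0.07 | 0.39 |
| xTS | 11.60 | 491.57 | 0.07 | 0.05 | 0.07 | 0.45 |
| ePL | 12.14 | 494.57 | 0.08 | 0.06 | 0.07 | 0.42 |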

Table III shows the traditional error measures according to the degree of moneyness. The evolving models again achieve the best results, which suggests that they are able to capture the mechanism of IDI option price movements. The Black model, in general, prices out-of-the-money IDI call options poorly: its errors are highest in that group (a MAPE of 68.12, against 44.31 and 45.76 for at-the-money and in-the-money options). In contrast, the neural network architectures and the evolving fuzzy systems show no significant differences across degrees of moneyness.

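TABLE III. FORECAST EVALUATION FOR IDI CALL OPTIONS, ACCORDING TO THE DEGREE OF MONEYNESS.

Out-of-the-Money

| Models | MAPE | MPE | RMSE | MAE | TIC | R² |
|--------|-------|--------|------|------|------|------|
| Black | 68.12 | 625.91 | 0.38 | 0.32 | 0.24 | 0.01 |
| FFNN | 17.61 | 509.11 | 0.14 | 0.11 | 0.09 | 0.17 |
| ERNN | 14.76 | 502.87 | 0.13 | 0.10 | 0.08 | 0.18 |
| JRNN | 15.32 | 478.76 | 0.12 | 0.09 | 0.09 | 0.20 |
| eTS | 11.93 | 453.09 | 0.07 | 0.05 | 0.06 | 0.39 |
| xTS | 10.86 | 462.12 | 0.06 | 0.06 | 0.07 | 0.43 |
| ePL | 11.04 | 449.87 | 0.09 | 0.08 | 0.07 | 0.44 |

At-the-Money

| Models | MAPE | MPE | RMSE | MAE | TIC | R² |
|--------|-------|--------|------|------|------|------|
| Black | 44.31 | 501.87 | 0.30 | 0.22 | 0.19 | 0.02 |
| FFNN | 15.64 | 514.02 | 0.12 | 0.09 | 0.12 | 0.19 |
| ERNN | 14.51 | 507.65 | 0.10 | 0.08 | 0.09 | 0.21 |
| JRNN | 15.23 | 487.01 | 0.09 | 0.07 | 0.09 | 0.21 |
| eTS | 12.21 | 444.39 | 0.07 | 0.05 | 0.07 | 0.40 |
| xTS | 10.92 | 450.92 | 0.06 | 0.04 | 0.06 | 0.44 |
| ePL | 10.43 | 449.44 | 0.09 | 0.06 | 0.07 | 0.44 |

In-the-Money

| Models | MAPE | MPE | RMSE | MAE | TIC | R² |
|--------|-------|--------|------|------|------|------|
| Black | 45.76 | 564.98 | 0.29 | 0.22 | 0.19 | 0.02 |
| FFNN | 17.71 | 507.12 | 0.12 | 0.11 | 0.10 | 0.18 |
| ERNN | 16.23 | 501.23 | 0.11 | 0.09 | 0.09 | 0.19 |
| JRNN | 15.99 | 489.00 | 0.10 | 0.10 | 0.08 | 0.19 |
| eTS | 11.76 | 451.92 | 0.07 | 0.06 | 0.07 | 0.37 |
| xTS | 11.89 | 448.76 | 0.07 | 0.05 | 0.07 | 0.44 |
| ePL | 11.06 | 451.22 | 0.06 | 0.04 | 0.08 | 0.43 |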

Although all the summary measures discussed above are useful measures of forecast accuracy that have been employed extensively by empirical researchers, they do not reveal whether the forecast from one model is statistically superior to that of another. The AGS test results in Table IV, which examine pairwise comparisons between the forecasts of two competing models, show statistically significant evidence that ePL, eTS, xTS and the three neural network architectures outperform the Black closed-form formula in pricing IDI call options. The p-values are shown in parentheses.

TABLE IV. AGS AND MGN TESTS EVALUATION FOR IDI CALL OPTIONS.

| Models | AGS Test | MGN Test |
|--------|----------|----------|
| Black vs. ERNN | 57.876 (0.0000) | 21.393 (0.0000) |
| Black vs. JRNN | 88.871 (0.0000) | 21.922 (0.0000) |
| Black vs. ePL | 77.273 (0.0000) | 21.161 (0.0000) |
| ERNN vs. JRNN | 2.183 (0.1133) | 1.602 (0.1094) |
| ERNN vs. ePL | 2.922 (0.0543) | 5.383 (0.0000) |
| JRNN vs. ePL | 3.102 (0.0454) | 4.416 (0.0000) |

The MGN tests, which also compare pairwise forecasts of competing models, showed statistically significant evidence that any of the neural network models and the evolving fuzzy systems employed is a candidate model for pricing IDI call options (Table IV).

VI. CONCLUSIONS

This paper has suggested evolving fuzzy participatory learning as a means to develop evolving fuzzy rule-based models for option pricing. Participatory learning appears in the evolving fuzzy participatory procedure as an unsupervised clustering mechanism. Clustering is a key step in evolving fuzzy modeling because it provides a mechanism to extract knowledge from data in the form of a group structure. Participatory clustering is dynamic and provides an efficient mechanism to implement self-organization. Self-organization, in turn, is essential to capture the time-varying behavior of complex systems, as is the case in option pricing. Computational experiments with Brazilian fixed income option price data show that fuzzy participatory modeling outperforms conventional, similar, and alternative modeling schemes. Further work shall include the modeling of human perception and the use of the model in actual decision-making instances.

ACKNOWLEDGMENT

This work was supported by the Brazilian National Research Council (CNPq) grants 305796/2005-4 and 304857/2006-8.

REFERENCES

[1] A. F. Junior, F. Greco, C. Lauro, G. Francisco, R. Rosenfeld, and R. Oliveira, “Application of Hull-White model to Brazilian IDI options,” Annals of the 3rd Brazilian Finance Meeting, Florianopolis, Brazil, 2003.

[2] C. I. R. Almeida and J. V. Vicente, “Pricing and hedging Brazilian fixed income options,” Annals of the 6th Brazilian Finance Meeting, Vitoria, Brazil, 2006.

[3] C. H. S. Barbedo, J. V. M. Vicente, and O. M. B. Lion, “Pricing Asian interest rate options with a three-factor HJM model,” Brazilian Central Bank Working Paper, no. 188, 2009.

[4] F. Black and M. Scholes, “The pricing of options and corporate liabilities,” Journal of Political Economy, vol. 81, no. 3, pp. 637–654, 1973.

[5] R. Gencay and M. Qi, “Pricing and hedging derivative securities with neural networks: Bayesian regularization, early stopping and bagging,” IEEE Transactions on Neural Networks, vol. 12, no. 4, pp. 726–734, 2001.

[6] J. Healy, M. Dixon, B. Read, and F. F. Cai, “A data-centric approach to understanding the pricing of financial options,” European Physical Journal, vol. 27, no. 2, pp. 219–227, 2002.

[7] J. Bennel and C. Sutcliffe, “Black-Scholes versus artificial neural networks in pricing FTSE-100 options,” Intelligent Systems in Accounting, Finance and Management, vol. 12, no. 4, pp. 243–260.

[8] P. C. Andreou, C. Charalambous, and S. H. Martzoukos, “Pricing and trading European options by combining artificial neural networks and parametric models with implied parameters,” European Journal of Operational Research, vol. 185, no. 3, pp. 1415–1433, 2008.

[9] X. Liang, H. Zhang, J. Xiao, and Y. Chen, “Improving option price forecasts with neural networks and support vector regressions,” Neurocomputing, vol. 72, pp. 3055–3065, 2009.

[10] Y. Wang, “Using neural network to forecast stock index option price: a new hybrid GARCH approach,” Quality & Quantity, vol. 43, no. 5, pp. 833–843, 2009.

[11] L. S. Maciel and R. Ballini, “Option pricing using fuzzy and neuro-fuzzy inference systems,” in 4th CSDA International Conference on Computational and Financial Economics, London, United Kingdom, 2010.

[12] H. Wu, “European option pricing under fuzzy environments,” International Journal of Intelligent Systems, vol. 20, no. 1, pp. 89–102, 2005.

[13] F. Liu, “Pricing currency options based on fuzzy techniques,” European Journal of Operational Research, vol. 193, no. 2, pp. 530–540, 2009.

[14] G. Figa-Talamanca and M. L. Guerra, “Fuzzy option value with stochastic volatility models,” in Ninth International Conference on Intelligent Systems Design and Applications, Pisa, Italy, 2009.

[15] Y. Leu, “A fuzzy time series-based neural network approach to option price forecasting,” Lecture Notes in Computer Science, vol. 5990, pp. 360–369, 2010.

[16] F. Black, “The pricing of commodity contracts,” Journal of Financial Economics, vol. 3, pp. 167–179, 1976.

[17] J. M. Zurada, B. P. Foster, T. J. Ward, and P. M. Barker, “Neural networks versus logit regression models for financial distress response variables,” The Journal of Applied Business Research, vol. 15, pp. 21–29, 1999.

[18] Z. R. Yang, M. B. Platt, and H. D. Platt, “Probabilistic neural network in bankruptcy predictions,” Journal of Business Research, vol. 44, pp. 67–74, 1999.

[19] G. Schwarz, “Estimating the dimension of a model,” Annals of Statistics, vol. 6, no. 2, pp. 461–468, 1978.

[20] R. Ashley, C. W. J. Granger, and R. Schmalensee, “Advertising and aggregate consumption: an analysis of causality,” Econometrica, vol. 48, no. 5, pp. 1149–1167, 1980.

[21] F. X. Diebold and R. S. Mariano, “Comparing predictive accuracy,” Journal of Business and Economic Statistics, vol. 13, pp. 253–263, 1995.

[22] E. L. Lehmann, “Nonparametric statistical methods based on ranks,” New Jersey: Prentice Hall, 1988.

[23] P. Angelov, Evolving Rule-Based Models: A Tool for Design of Flexible Adaptive Systems, ser. ISBN 3-7908-1457-1. Heidelberg, New York: Springer-Verlag, 2002.

[24] N. Kasabov and Q. Song, “DENFIS: Dynamic evolving neural-fuzzy inference system and its application for time series prediction,” IEEE Trans. on Fuzzy Systems, vol. 10, no. 2, pp. 144–154, 2002.

[25] P. Angelov and D. Filev, “An approach to on-line identification of evolving Takagi-Sugeno fuzzy models,” IEEE Trans. on Systems, Man and Cybernetics, part B, vol. 34, no. 1, pp. 484–498, 2004.

[26] P. Angelov and D. Filev, “Simpl_eTS: A simplified method for learning evolving Takagi-Sugeno fuzzy models,” Proceedings of the 2005 International Conference on Fuzzy Systems, FUZZ-IEEE, pp. 1068–1073, May 2005.

[27] E. Lima, M. Hell, R. Ballini, and F. Gomide, “Evolving fuzzy modeling using participatory learning,” in Evolving Intelligent Systems: Methodology and Applications, P. Angelov, D. Filev, and N. Kasabov, Eds., John Wiley and Sons, to appear.

[28] R. R. Yager, “A model of participatory learning,” IEEE Trans. on Systems, Man and Cybernetics, vol. 20, pp. 1229–1234, 1990.

[29] L. Silva, F. Gomide, and R. R. Yager, “Participatory learning in fuzzy clustering,” The 2005 IEEE International Conference on Fuzzy Systems FUZZ-IEEE, pp. 857–861, May 2005.

[30] S. L. Chiu, “Fuzzy model identification based on cluster estimation,” Journal of Intelligent and Fuzzy Systems, vol. 2, pp. 267–278, 1994.

[31] R. Yager and D. Filev, “Approximate clustering via the mountain method,” IEEE Trans. on Systems, Man and Cybernetics, vol. 24, no. 8, pp. 1279–1284, August 1994.

[32] P. C. Young, Recursive Estimation and Time-Series Analysis. Berlin: Springer-Verlag, 1984.

[33] K. J. Astrom and B. Wittenmark, Adaptive Control, 2nd ed. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 1994.

[34] C. R. Johnson, Lectures on Adaptive Parameter Estimation, 1st ed. Upper Saddle River, USA: Prentice-Hall, Inc., 1988.

[35] P. E. Wellstead and M. B. Zarrop, Self-tuning Systems: Control and Signal Processing, 1st ed. New York, NY: John Wiley & Sons, 1995.

[36] E. Lima, “Modelo takagi-sugeno evolutivo participativo,” Ph.D. dissertation, Faculty of Electrical and Computer Engineering, State University of Campinas, 2008.

[37] P. Angelov and X. Zhou, Online Identification of Takagi-Sugeno Fuzzy Models DEMO. http://194.80.34.45/eTS/, 2006.