
L. Wang, K. Chen, and Y.S. Ong (Eds.): ICNC 2005, LNCS 3610, pp. 1256-1265, 2005. © Springer-Verlag Berlin Heidelberg 2005

The Prediction of the Financial Time Series Based on Correlation Dimension

Chen Feng1, Guangrong Ji1, Wencang Zhao1,2, and Rui Nian1

1 College of Information Science and Engineering, Ocean University of China, Qingdao, 266003, China

2 College of Automation and Electronic Engineering, Qingdao University of Science & Technology, Qingdao, 266042, China

Abstract. In this paper we first analyze the chaotic characteristics of three financial time series (the Hang Seng Index (HSI), the Shanghai Stock Index, and the US gold price) based on phase space reconstruction. When feedforward neural networks are used to predict these time series, however, we find that the method lacks a criterion for selecting the training set. We therefore present a new method that uses the correlation dimension (CD) as the criterion. Experiments show that the method is effective.

1 Introduction

The prediction of financial time series has long interested researchers because of its importance for macro-economic adjustment and micro-economic management. To predict financial time series better, researchers have made great efforts to find the laws that govern them. In the past, financial time series were regarded as random walks and models were built accordingly, but some experiments showed that the resulting predictions were poor [1].

In recent years researchers have found that some financial time series are in fact chaotic rather than random. Reference [2] indicated that hourly data of four spot exchange rates (British Pound, Deutschmark, Japanese Yen and Swiss Franc) are chaotic; reference [3] pointed out that the American national debt time series has a chaotic attractor; reference [4] showed that some metal prices on the London market follow a mean process that is dynamically chaotic.

Many methods, such as the maximum Lyapunov exponent method [5] and the one-rank weighted local-region method [6], are used to predict chaotic time series. In the maximum Lyapunov exponent method, a tiny error in computing the maximum Lyapunov exponent leads to a large error in the prediction. The idea of the one-rank weighted local-region method is to use a linear model to recover the local chaotic system, but a linear model is always limited in how well it can mirror a nonlinear system. The prediction results that these methods give for economic time series are therefore not good enough.

At the same time, owing to the strong nonlinear mapping ability of neural networks, many kinds of neural networks such as BPNN [7], GRNN [8] and RNN [9] have been used to predict financial time series. In this paper we adopt the feedforward neural networks used in [10] as the training networks to predict the financial time series. With this kind of network, introduced in the third section, many classical chaotic systems such as the Lorenz system and the Henon map can be predicted very well.

In studying the method, however, we find that the choice of the training set is vague and lacks a criterion. In the fourth section we therefore put forward a new method for choosing the training set. Because the financial time series are chaotic, we choose the correlation dimension, a kind of fractal dimension that can depict chaotic characteristics, as the criterion for choosing the training set. Experiments show that the method is effective.

To predict a time series with feedforward neural networks, the phase space must first be reconstructed. In the second section we therefore introduce the delay coordinate method used to reconstruct the phase space, and compute the maximum Lyapunov exponents of the financial time series to show that the three series are chaotic. The third section presents the architecture of the neural networks. The fourth section briefly explains the definition of the correlation dimension and describes how to choose the training set according to it. The fifth section uses the three sets of economic data to demonstrate the effect of the new method. The last section gives the conclusions.

2 Phase Space Reconstruction

2.1 Theory Introduction

To recover the dynamic characteristics of the original financial systems, the phase space must first be reconstructed. Takens' theorem, which reveals the dynamic mechanism of some nonlinear systems, is the theoretical basis of phase space reconstruction.

Takens' theorem: Let $M$ be a $d$-dimensional manifold, let $\varphi: M \to M$ be a smooth diffeomorphism, and let $y: M \to \mathbb{R}$ have a continuous second-order derivative. Define $\Phi_{(\varphi, y)}: M \to \mathbb{R}^{2d+1}$ by

$$\Phi_{(\varphi, y)}(x) = \big(y(x),\ y(\varphi(x)),\ y(\varphi^{2}(x)),\ \ldots,\ y(\varphi^{2d}(x))\big) \tag{1}$$

where the function $\Phi_{(\varphi, y)}$ is an embedding from $M$ to $\mathbb{R}^{2d+1}$. The theorem indicates that a suitable embedding dimension can be found to recover the regular trajectory [11]. The delay coordinate method is used to reconstruct the phase space in this paper. An embedding dimension $m$ and a delay time $\tau$ are determined to create $N_m$ points, and every point $Y_i$ is an $m$-dimensional vector,

$$Y_1 = (x_1, x_{1+\tau}, \ldots, x_{1+(m-1)\tau}),\ \ldots,\ Y_i = (x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau}),\ \ldots,\ Y_{N_m} = (x_{N_m}, x_{N_m+\tau}, \ldots, x_N) \tag{2}$$

where $N_m = N - (m-1)\tau$. The embedding dimension $m$ and the delay time $\tau$ are important parameters because they determine the quality of the reconstructed phase space.
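The delay coordinate construction of Eq. (2) can be illustrated with a short Python sketch. The function name `delay_embed` and the toy series are assumptions made for illustration; they are not taken from the paper.

```python
def delay_embed(series, m, tau):
    """Return the delay vectors Y_i = (x_i, x_{i+tau}, ..., x_{i+(m-1)tau})."""
    if m < 1 or tau < 1:
        raise ValueError("m and tau must be positive integers")
    n_vectors = len(series) - (m - 1) * tau  # N_m = N - (m - 1) * tau
    if n_vectors < 1:
        raise ValueError("series is too short for this m and tau")
    return [
        tuple(series[i + k * tau] for k in range(m))
        for i in range(n_vectors)
    ]


# Toy series 1..10 with m = 3 and tau = 2 gives N_m = 10 - 2 * 2 = 6 vectors.
vectors = delay_embed(list(range(1, 11)), m=3, tau=2)
print(len(vectors))  # 6
print(vectors[0])    # (1, 3, 5)
print(vectors[-1])   # (6, 8, 10)
```

The last vector ends at the final sample $x_N$, which matches the form of $Y_{N_m}$ in Eq. (2).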

In this paper, we use the false nearest-neighbor method [12] to determine the embedding dimension $m$. The idea of the method is that, when the dimension is increased from $m$ to $m+1$, we check whether there are false neighbors among the near points of the point $Y_i$; if there are none, the geometrical structure of the attractor has been unfolded. When the dimension is $m$, suppose that the point $Y_i'$ is the nearest neighbor of the point $Y_i$, and denote the distance between these two points by $\|Y_i - Y_i'\|^{(m)}$. When the dimension is increased to $m+1$, their distance is denoted $\|Y_i - Y_i'\|^{(m+1)}$. If

$$\frac{\|Y_i - Y_i'\|^{(m+1)} - \|Y_i - Y_i'\|^{(m)}}{\|Y_i - Y_i'\|^{(m)}} > R_T, \qquad 10 \le R_T \le 50 \tag{3}$$

then the point $Y_i'$ is a false neighbor of the point $Y_i$, where $R_T$ is the threshold. We start at dimension 2 and increase the dimension by one each time. When either the proportion of false nearest neighbors is smaller than 5% or the number of false nearest neighbors no longer decreases as the dimension increases, the dimension $m$ is taken as the optimum.
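The false nearest-neighbor test can be sketched in Python as follows. This is a minimal brute-force illustration under stated assumptions, not the authors' implementation: the function names, the logistic-map test series, and the default `r_t=10.0` are chosen for illustration, and only the 5% stopping rule is implemented (the "no further decrease" rule is omitted).

```python
import math


def delay_embed(series, m, tau):
    """Delay vectors Y_i = (x_i, x_{i+tau}, ..., x_{i+(m-1)tau})."""
    n_vectors = len(series) - (m - 1) * tau
    return [tuple(series[i + k * tau] for k in range(m)) for i in range(n_vectors)]


def false_neighbor_fraction(series, m, tau, r_t=10.0):
    """Fraction of nearest neighbors in dimension m that are false by Eq. (3)."""
    low = delay_embed(series, m, tau)
    high = delay_embed(series, m + 1, tau)
    n = len(high)  # only points that exist in both embeddings
    false_count = 0
    tested = 0
    for i in range(n):
        # Brute-force search for the nearest neighbor of Y_i in dimension m.
        best_j, best_d = None, math.inf
        for j in range(n):
            if j == i:
                continue
            d = math.dist(low[i], low[j])
            if d < best_d:
                best_j, best_d = j, d
        if best_j is None or best_d == 0.0:
            continue  # skip degenerate pairs to avoid division by zero
        tested += 1
        d_high = math.dist(high[i], high[best_j])
        if (d_high - best_d) / best_d > r_t:
            false_count += 1
    return false_count / tested if tested else 0.0


def choose_embedding_dimension(series, tau, max_m=10, threshold=0.05, r_t=10.0):
    """Smallest m >= 2 whose false-neighbor fraction falls below the threshold."""
    for m in range(2, max_m + 1):
        if false_neighbor_fraction(series, m, tau, r_t) < threshold:
            return m
    return max_m  # fall back to the largest dimension tried


# Illustrative input: 200 iterates of the logistic map (not data from the paper).
x = 0.3
series = []
for _ in range(200):
    x = 4.0 * x * (1.0 - x)
    series.append(x)
print(choose_embedding_dimension(series, tau=1, max_m=5))
```

The search is quadratic in the number of points, which is acceptable for a few thousand samples but would normally be replaced by a spatial index for longer series.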

2.2 Financial Time Series Phase Space Reconstruction

In this paper, we choose the opening quotations of the Hang Seng Index (HSI) (4067 points from 31 December 1986 to 16 June 2003), the Shanghai Stock Index (2729 points from 19 December 1990 to 29 January 2001), and the US gold price (7277 points from 2 January 1975 to 8 August 2003) as the experimental data. The three time series are shown in Fig. 1.

Fig. 1. (a) Opening quotation of the Hang Seng Index (b) Opening quotation of the Shanghai Stock Index (c) Opening quotation of the US gold price.

From Fig. 1 we can observe that some local parts of the time series curves are similar to the whole. To show the complexity of the three sets of economic data, we compute their box dimensions [13]. The box dimension, which is commonly used to calculate the dimension of a continuous curve, is a kind of fractal dimension. The results are shown in Table 1.

According to the theory in [14], if the capital market follows a random walk, the box dimension should be 1.5. A time series whose box dimension is between 1 and 1.5 is called a long-range positive correlation fractal time series, which means that past increments are positively correlated with future increments. A time series whose box dimension is between 1.5 and 2 is called a long-range negative correlation fractal time series, which means that past increments are negatively correlated with future increments. From Table 1 we can observe that the box dimensions are all between 1 and 1.5, so the financial time series do not entirely follow a random walk, and there is long-range positive correlation in them.

Table 1. The box dimensions of the economic time series

| | HSI | Shanghai Stock Index | US gold price |
|---|---|---|---|
| Box dimension | 1.16016 | 1.16631 | 1.18816 |
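A box dimension of a sampled curve can be estimated by box counting. The Python sketch below is one simple variant, assuming the curve is rescaled to the unit square and covered by a $k \times k$ grid; it is not necessarily the procedure of [13], and the function names and grid sizes are illustrative assumptions.

```python
import math


def box_count(values, k):
    """Number of cells of a k-by-k grid touched by the sampled curve."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # a constant series occupies a single row
    rows_by_column = {}
    for i, v in enumerate(values):
        col = min(int(i / (n - 1) * k), k - 1)
        row = min(int((v - lo) / span * k), k - 1)
        low, high = rows_by_column.get(col, (row, row))
        rows_by_column[col] = (min(low, row), max(high, row))
    # Within each column the curve is assumed to cover every row it spans.
    return sum(high - low + 1 for low, high in rows_by_column.values())


def box_dimension(values, grid_sizes=(4, 8, 16, 32, 64)):
    """Least-squares slope of log(box count) against log(k)."""
    xs = [math.log(k) for k in grid_sizes]
    ys = [math.log(box_count(values, k)) for k in grid_sizes]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


# Sanity check on a straight line, whose box dimension should be 1.
line = [float(i) for i in range(1025)]
print(box_dimension(line))
```

The estimate is only meaningful when every grid column contains several samples, so the largest grid size should stay well below the series length.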

We reconstruct the phase space by calculating the embedding dimension $m$ and the delay time $\tau$ with the prediction error minimizing method [15].

At the same time, we choose three coordinates from each $m$-dimensional reconstructed phase space of the financial time series and plot them in Fig. 2.

Fig. 2. Three coordinates from the reconstructed phase space of the financial time series (a) the opening quotation of the Hang Seng Index: dimensions 1, 9 and 17 (b) the opening quotation of the Shanghai Stock Index: dimensions 1, 10 and 19 (c) the opening quotation of the US gold price: dimensions 1, 10 and 19

The maximum Lyapunov exponent $\lambda_{\max}$ is computed with the small data sets method [16] to show that these financial time series are chaotic. The Lyapunov exponent is a quantitative measure of the sensitive dependence on initial conditions; it characterizes the average divergence rate of two neighboring trajectories.

It is not necessary to calculate the whole Lyapunov spectrum, because a bounded time series with a positive maximum Lyapunov exponent indicates chaos. Moreover, the maximum Lyapunov exponent gives an estimate of the level of chaos in the underlying dynamical system. From Table 2 we can see that the maximum Lyapunov exponents are positive, so the financial time series are chaotic.

Chaotic systems are sensitive to initial values, so a chaotic time series has limited prediction potential. Since the maximum Lyapunov exponent characterizes the average divergence rate of neighboring orbits, its reciprocal $1/\lambda_{\max}$ determines the maximum predictable time. The results are all shown in Table 2.


Table 2. Chaotic analysis of the financial time series

| | Embedding dimension | Delay time | Maximum Lyapunov exponent | Maximum predictable time |
|---|---|---|---|---|
| HSI | 17 | 6 | 0.069 | 14 |
| Shanghai Stock Index | 19 | 4 | 0.029 | 30 |
| US gold price | 19 | 7 | 0.046 | 20 |

3 Feedforward Neural Networks

The architecture of the feedforward neural networks used in this paper is $m:2m:m:1$, where $m$ is the embedding dimension. The topology is shown in Fig. 3.

Fig. 3. Architecture of the feedforward neural networks

When the $m$-dimensional training set is fed into the networks, each hidden unit $j$ in the first hidden layer receives a net input

$$h_j = \sum_i w_{ji} x_i \tag{4}$$

and produces the output

$$V_j = \tanh(h_j) = \tanh\Big(\sum_i w_{ji} x_i\Big) \tag{5}$$

where $w_{ji}$ represents the connection weight between the $i$-th input unit and the $j$-th hidden unit in the first layer. Following the same procedure for the units in the next layers, the final output is then given by

$$z = \sum_l w_{sl} \tanh\Big(\sum_j w_{lj} \tanh\Big(\sum_i w_{ji} x_i\Big)\Big) \tag{6}$$

where the hyperbolic tangent activation function is chosen for all hidden units, and the linear function for the final output unit.
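The forward pass of Eqs. (4)-(6) for an $m:2m:m:1$ network can be sketched in plain Python. The function names, the random initialization range, and the absence of bias terms are assumptions made for illustration (Eqs. (4)-(6) show no bias terms); this is not the authors' code.

```python
import math
import random


def init_weights(m, seed=0):
    """Random weights for an m:2m:m:1 network without bias terms."""
    rng = random.Random(seed)
    sizes = [m, 2 * m, m, 1]
    # weights[k][j][i] connects unit i of layer k to unit j of layer k + 1.
    return [
        [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(weights, x):
    """Two tanh hidden layers followed by a single linear output unit."""
    w1, w2, w3 = weights
    v1 = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]  # Eq. (5)
    v2 = [math.tanh(sum(w * vj for w, vj in zip(row, v1))) for row in w2]
    return sum(w * vl for w, vl in zip(w3[0], v2))  # linear output, Eq. (6)


# m = 17 gives a 17:34:17:1 network.
weights = init_weights(17)
print([len(layer) for layer in weights])  # [34, 17, 1]
print(forward(weights, [0.1] * 17))
```

Because there are no bias terms, an all-zero input vector always maps to an output of zero, which is a quick sanity check on the wiring.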

The weights are determined by presenting the networks with the training set and comparing the output of the networks with the real value of the time series. The weight adjustment rule is

$$w_{qt}^{\text{new}} = w_{qt}^{\text{old}} + \Delta w_{qt} \tag{7}$$

where $\Delta w_{qt} = -\eta\, \partial E(w_{qt}) / \partial w_{qt}$, $E(w_{qt})$ is the mean square error function, $0 < \eta < 1$ is the learning rate, and …


4.2 Correlation Dimension

The G-P algorithm, presented by Grassberger and Procaccia, is adopted to calculate the correlation dimension [17].

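For a set of space points $\{Y_i\}$, define

$$C_N(r) = \frac{2}{N(N-1)} \sum_{1 \le i < j \le N} \theta\big(r - \|Y_i - Y_j\|\big),$$

where

$$\theta(x) = \begin{cases} 0, & x < 0 \\ 1, & x \ge 0 \end{cases}$$

is the Heaviside function.

For different values of $r$ we obtain different values of $C_N(r)$. To estimate the correlation dimension from the data, one plots $\log C_N(r)$ against $\log r$, where $N$ is the cardinality of the data set; the slope of the linear region of this plot is the estimate of the correlation dimension. $C_N(r)$ measures the fraction of the total number of pairs $(Y_i, Y_j)$ such that the distance between $Y_i$ and $Y_j$ is no longer than $r$.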
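The correlation sum and the slope estimate can be sketched in Python as follows. This is a brute-force illustration under stated assumptions: the function names are hypothetical, the radii are supplied by the caller, and the slope is fitted over all supplied radii rather than over an automatically detected linear region.

```python
import math


def correlation_sum(points, r):
    """C_N(r): fraction of pairs (Y_i, Y_j), i < j, with distance <= r."""
    n = len(points)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= r:
                close += 1
    return 2.0 * close / (n * (n - 1))


def correlation_dimension(points, radii):
    """Least-squares slope of log C_N(r) against log r over the given radii."""
    xs, ys = [], []
    for r in radii:
        c = correlation_sum(points, r)
        if c > 0.0:  # log is undefined when no pair lies within r
            xs.append(math.log(r))
            ys.append(math.log(c))
    if len(xs) < 2:
        raise ValueError("need at least two radii with C_N(r) > 0")
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


# Sanity check: points evenly spaced on a line should give a value close to 1.
line = [(float(i),) for i in range(200)]
print(correlation_dimension(line, [4.5, 8.5, 16.5, 32.5]))
```

In practice the radii should be restricted to the scaling region, since very small radii suffer from sparse pairs and very large radii saturate at $C_N(r) = 1$.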

5 Experiments

From the embedding dimensions in Table 2 we can determine the architecture of the neural networks: 19:38:19:1 for the Shanghai Stock Index, 17:34:17:1 for the HSI, and 19:38:19:1 for the US gold price.

Fig. 4. Fitting curves of the prediction sets' CD and the training sets' CD (a) the opening quotation of the Hang Seng Index (b) the opening quotation of the Shanghai Stock Index (c) the opening quotation of the US gold price

In each of the three phase spaces reconstructed from the financial time series, we choose 100 consecutive points as the prediction set. Using the method described in the fourth section, we determine the training set. The correlation dimensions of the prediction sets and training sets are listed in Table 3.

Fig. 4 shows the fitting curves of the prediction sets' CD and the training sets' CD. From Table 3 and Fig. 4 we can observe that, for every financial time series, the correlation dimension of the training set is close to that of the prediction set, and their fitting curves are parallel. In the next step we therefore use these three training sets to train the networks.
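The selection criterion can be illustrated with a small Python sketch. It assumes the simplest reading of the criterion: among several candidate training windows, keep the one whose correlation dimension is closest to that of the prediction set. The function name and the candidate CD values are hypothetical; only the prediction-set CD of 2.3979 for the Shanghai Stock Index is taken from Table 3.

```python
def pick_training_window(candidate_cds, prediction_cd):
    """Return (index, cd) of the candidate whose CD is closest to prediction_cd."""
    if not candidate_cds:
        raise ValueError("no candidate windows given")
    index = min(
        range(len(candidate_cds)),
        key=lambda k: abs(candidate_cds[k] - prediction_cd),
    )
    return index, candidate_cds[index]


# Hypothetical CDs of three candidate windows, compared with the
# prediction-set CD reported for the Shanghai Stock Index.
print(pick_training_window([1.62, 2.33, 2.95], 2.3979))  # (1, 2.33)
```

A fuller version would compute each candidate's CD from its delay vectors rather than take the values as given.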


Table 3. Comparison of the correlation dimensions of the prediction sets and the training sets

| | HSI | Shanghai Stock Index | US gold price |
|---|---|---|---|
| Prediction set CD | 1.9113 | 2.3979 | 2.7269 |
| Training set CD | 2.1078 | 2.3276 | 2.7936 |
| CD difference | 0.0965 | -0.0703 | 0.0667 |

By training the networks on each training set, we obtain the weights for each series in turn. The prediction set is then fed into the networks whose weights have been determined, and the predicted data are calculated. The predicted results and the real values for the three series are shown in Fig. 5.

Fig. 5. The prediction of the financial time series (a) the opening quotation of the Hang Seng Index from 2 April 2002 to 22 April 2002 (b) the opening quotation of the Shanghai Stock Index from 21 October 1996 to 11 November 1996 (c) the opening quotation of the US gold price from 8 September to 5 October 1995

We also calculate the mean absolute percentage error (MAPE), displayed in Table 4, to show the prediction effect.

$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{x_t - x_t'}{x_t} \right| \tag{9}$$

where $x_t$ is the real data and $x_t'$ is the predicted data.
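Eq. (9) translates directly into a few lines of Python. The function name `mape` and the toy numbers are illustrative assumptions, not data from the experiments.

```python
def mape(actual, predicted):
    """Mean absolute percentage error of Eq. (9), returned as a fraction."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs((x - p) / x) for x, p in zip(actual, predicted)) / len(actual)


# Toy example: relative errors of 10%, 5% and 0% average to 5%.
error = mape([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
print(f"{error:.1%}")  # 5.0%
```

The measure is undefined when a real value $x_t$ is zero, which does not occur for price or index series.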

Table 4. The MAPE between the real data and the predicted data

| | HSI | Shanghai Stock Index | US gold price |
|---|---|---|---|
| MAPE | 1.9% | 3.9% | 0.46% |

Every MAPE is below 5%, so the prediction accuracy is satisfactory.

6 Conclusions

From the experimental results, we find that the trend of the predicted data is on the whole identical with that of the real data, apart from a few exceptional points, and that the MAPEs between the real data and the predicted data are all small. This shows that the financial time series, as chaotic time series, can be predicted by the feedforward neural networks.

On the other hand, the experimental results also show that choosing the training set with the correlation dimension as the criterion is effective. When we predict chaotic financial time series with this method, the uncertainty in the choice of the training set is reduced.

Acknowledgements

The National 863 Natural Science Foundation of P. R. China (2001AA635010) fully supported this research.

References

1. Kim, S.H., Hyun, J.N.: Predictability of Interest Rates Using Data Mining Tools: A Comparative Analysis of Korea and the US. Expert Systems with Application, Vol. 13. (1997) 85-95

2. Cecen, A.A., Erkal, C.: Distinguishing between stochastic and deterministic behavior in high frequency foreign exchange rate returns: Can non-linear dynamics help forecasting? International Journal of Forecasting, Vol. 12. (1996) 465-473

3. Harrison, R.G., Yu, D., Oxley, L., Lu, W., George, D.: Non-linear noise reduction and detecting chaos: some evidence from the S&P Composite Price Index. Mathematics and Computers in Simulation, Vol. 48. (1999) 407-502

4. Catherine, K., Walter, C.L., Michel, T.: Noisy chaotic dynamics in commodity markets. Empirical Economics, Vol. 29. (2004) 489-502

5. Rosenstein, M.T., Collins, J.J., De Luca, C.J.: A practical method for calculating largest Lyapunov exponents in dynamical systems. Physica D, Vol. 65. (1993) 117-134

6. Lu, J.H., Zhang, S.C.: Application of adding-weight one-rank local-region method in electric power system short-term load forecast. Control Theory And Application, Vol. 19. (2002) 767-770

7. Oh, K.J., Han, I.: Using change-point detection to support artificial neural networks for interest rates forecasting. Expert Systems with Application, Vol. 19. (2000) 105-115

8. Leung, M.T., Chen, A., Daouk, H.: Forecasting exchange rates using general regression neural networks. Computer and Operations Research, Vol. 27. (2000) 1093-1110

9. Kermanshahi, B.: Recurrent neural network for forecasting next 10 years loads of nine Japanese utilities. Neurocomputing, Vol. 23. (1998) 125-133

10. Holger, K., Thomas, S.: Nonlinear Time Series Analysis. Beijing: Qinghua University Press (2000)

11. de Oliveira, K.A., Vannucci, A., da Silva, E.C.: Using artificial neural networks to forecast chaotic time series. Physica A, Vol. 284. (2000) 393-404

12. Kennel, M.B., Abarbanel, H.D.I.: Determining embedding dimension for phase space reconstruction using a geometric construction. Physical Review A, Vol. 151. (1990) 225-223

13. Buczkowski, S., Hildgen, P., Cartilier, L.: Measurements of fractal dimension by box-counting: a critical analysis of data scatter. Physica A, Vol. 252. (1998) 23-34

14. Heinz, O.P., Dietmar, S.E.: The science of fractal Image. New York: Springer Verlag New York Inc (1988) 71-94

15. Wang, H.Y., Zhu, M.: A prediction comparison between univariate and multivariate chaotic time series. Journal of Southeast University (English Edition), Vol. 19. (2003) 414-417

16. Zhang, J., Lam, K.C., Yan, W.J., Gao, H., Li, Y.: Time series prediction using Lyapunov exponents in embedding phase space. Computers and Electrical Engineering, Vol. 30. (2004) 1-15

17. Grassberger, P., Procaccia, I.: Measuring the strangeness of the strange attractors. Physica, 9D (1983) 189-208
