
Proceedings of International Joint Conference on Neural Networks, Montreal, Canada, July 31 - August 4, 2005

Wavelet Networks: An Alternative to Classical Neural Networks

Kamban Parasuraman and Amin Elshorbagy
Centre for Advanced Numerical Simulation (CANSIM)

Department of Civil and Geological Engineering
University of Saskatchewan

Saskatoon, SK S7N 5A9, Canada. E-mail: amin.elshorbagy@usask.ca

Abstract - Artificial Neural Networks (ANNs) are being widely used to predict and forecast highly nonlinear systems. Recently, Wavelet Networks (WNs) have been shown to be a promising alternative to traditional neural networks. In this study, the robustness of WNs and ANNs in modeling two distinct time series is investigated. The first series represents a chaotic system (Henon map) and the second a stochastic geophysical time series (streamflows): monthly streamflow values of the English River between Umferville and Sioux Lookout, ON, Canada. For the implementation of traditional ANNs, the weights and bias values are optimized using Genetic Algorithms (GAs); in WNs, the translation and dilation factors of the wavelets are optimized along with the weights and bias. GAs are used to optimize the network parameters in order to overcome the problem of convergence towards local optima. Results from the study indicate that WNs are more suitable for modeling short-time, high-frequency time series like the Henon map, whereas the performance of WNs is comparable with that of ANNs in modeling low-frequency time series like streamflows.

I. INTRODUCTION

Models play an important role in the field of hydrological sciences. They help capture and understand the high nonlinearity embedded in both spatial and temporal scales. Over the past years, different data-driven techniques have been adopted for water resources modeling. Key examples of such applications include linear time series models (Salas et al., 1985) [streamflows], pattern recognition techniques (Zhang and Berndtsson, 1991) [soil water], nonlinear time series analysis (Porporato and Ridolfi, 1997) [streamflows], and Artificial Neural Networks (ANNs) (Zhang and Govindaraju, 2000) [rainfall-runoff].

Of the above techniques, ANNs have been widely applied in the field of geophysics due to their global approximation property. Detailed reviews of ANNs and their applications in water resources can be found in Maier and Dandy (2000) and in the ASCE Task Committee on Application of Artificial Neural Networks in Hydrology (2000a, b).

Traditionally, the back-propagation (BP) algorithm is used to optimize the weights and biases of ANN models. The gradient search technique used in the BP algorithm has been shown to converge towards local optima. To overcome this problem, Jain and Srinivasulu (2004) made use of real-coded Genetic Algorithms (GAs) for developing an ANN rainfall-runoff model. They showed that the GA-trained ANN model predicted daily flows more accurately than the BP-trained ANN model. Furthermore, Jain and Srinivasulu (2004) demonstrated that training ANN models using real-coded GA significantly improved the estimation accuracy of low-magnitude flows.

Recently, wavelet networks (WNs) (Zhang and Benveniste, 1992) have been shown to be a promising alternative to ANNs in modeling complex nonlinear systems. A family of wavelets can be constructed by translating and dilating a mother wavelet; hence, in WNs, the translation and dilation factors need to be optimized along with the weights and bias. Most WN models use the back-propagation algorithm to optimize their parameters (Oussar et al., 1998). In this study, real-coded GAs are used to train both ANNs and WNs. It should also be noted that, though WNs have been used widely in different fields, little research has been conducted to evaluate their performance on hydrological time series.

The objective of the present study is to compare the performance of real-coded GA-trained ANNs and WNs on two different case studies. The first case study is an artificial chaotic time series (Henon map), generated by a nonlinear deterministic dynamic system; it represents a short-time, high-frequency signal. The second case study is an actual geophysical time series (streamflows): the monthly streamflow values of the English River between Umferville and Sioux Lookout, ON, Canada are considered. Monthly streamflow values represent a low-frequency signal. The rationale behind choosing these two case studies is to gauge the performance of ANNs and WNs in modeling two diverse time series. The paper begins with a brief introduction to WNs, followed by the application of ANNs and WNs to the two time series; important research conclusions are summarized at the end. For brevity, descriptions of ANNs and GAs are not provided here; information on ANNs and GAs can be found in Haykin (1999) and Goldberg (2000), respectively.

II. THEORY

Wavelet Networks

Wavelet networks are a special case of feedforward neural networks. The main difference between ANNs and WNs is that in ANNs the nonlinearities are approximated by superposition of sigmoidal functions, whereas in WNs they are approximated by superposition of wavelet functions (Oussar et al., 1998). Like ANNs, WNs have been shown to possess the universal approximation property. Though the origin of wavelet networks can be dated back to the work of Daugman (1988), the application of WNs gained momentum only after the work of Pati and Krishnaprasad (1991) and Zhang and Benveniste (1992).

[Figure: input-layer neurons x1, x2, ..., xn and a bias term feed a hidden layer of wavelets, which feeds the output layer (yk).]

Fig. 1. Structure of Wavelet Networks

The structure of WNs is shown in Figure 1. Similar to ANNs, a wavelet network consists of an input layer, a hidden layer, and an output layer. The wavelet network model shown in Figure 1 consists of 'n' input neurons (x1, x2, ..., xn) in the input layer; the number of input neurons 'n' is equal to the number of input variables. The input neurons are connected to the next layer of neurons, called the hidden layer neurons. The hidden layer neurons use wavelets as transformation functions and are therefore termed "wavelons". In this study, the first derivative of the Gaussian function (Equation 1) is used as the 'mother' wavelet. A family of wavelets \(\psi_j\) is then constructed by translating and dilating the mother wavelet \(\psi\) according to Equation 2. Let i, j, and k represent the indices of the input, hidden, and output layers, respectively.

\[ \psi(x) = -x \exp\left(-\frac{x^2}{2}\right) \tag{1} \]

\[ \psi_j(x) = \psi\left(\frac{x - t_j}{d_j}\right) \tag{2} \]

where \(t_j\) and \(d_j\) represent the translation and dilation factors of the wavelet. The output from the hidden wavelon, \(H_j\), is given by Equation (3). The outputs from the hidden wavelons are connected to the output layer neurons; the output layer usually consists of a linear output neuron. Mathematically, the output from the output layer neuron can be represented by Equation (4).

\[ H_j(x) = \prod_{i=1}^{n} \psi\left(\frac{x_i - t_{ij}}{d_{ij}}\right) \tag{3} \]

\[ y_k = \sum_{j=1}^{m} w_{jk} H_j(x) + b_k + \sum_{i=1}^{n} w_{ik}\, x_i \tag{4} \]

where m is the number of wavelons in the hidden layer.
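For concreteness, the following is a minimal Python sketch of the forward pass defined by Equations (1)-(4). The authors' implementation was in-house MatLab code that is not shown in the paper, so the function names, array shapes, and parameter values here are illustrative assumptions, not the original implementation.

```python
import numpy as np

def psi(u):
    """Mother wavelet of Eq. (1): first derivative of the Gaussian."""
    return -u * np.exp(-u ** 2 / 2.0)

def wn_forward(x, t, d, w_out, w_lin, b):
    """Output of a wavelet network for one input vector, per Eqs. (2)-(4).

    x     : (n,)   input vector
    t, d  : (m, n) translation and dilation factors for m wavelons
    w_out : (m,)   wavelon-to-output weights
    w_lin : (n,)   direct input-to-output weights
    b     : scalar output bias
    """
    H = np.prod(psi((x - t) / d), axis=1)  # wavelon outputs, Eq. (3)
    return w_out @ H + b + w_lin @ x       # linear output neuron, Eq. (4)

# Tiny demo with arbitrary parameter values
rng = np.random.default_rng(1)
n, m = 3, 4
y = wn_forward(rng.normal(size=n),
               rng.uniform(0.1, 1.4, (m, n)),  # translations
               rng.uniform(0.1, 1.4, (m, n)),  # dilations
               rng.normal(size=m), rng.normal(size=n), 0.0)
```

In WN training, t, d, w_out, w_lin, and b are all adjustable parameters; in a classical ANN only the weights and bias would be.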

Fig. 2. Wavelets with constant dilation (dj = 1) and varying translation (tj = 0.1, 0.5, 0.9, 1.3, 1.7)

In WN implementations, the translation and dilation factors can be fixed at assumed integer values or determined by a space-frequency analysis of the data (Cannon and Slotine, 1995). Alternatively, the factors can be trained along with the weights and bias of the WN (Oussar et al., 1998). In this study, the latter method is adopted. The first-derivative Gaussian mother wavelet with different translation and dilation factors is shown in Figures 2 and 3, respectively. From the figures, it can be seen that an array of wavelet functions can be obtained by translating and dilating the mother wavelet. This illustrates the flexibility of wavelet functions compared with the rigid sigmoidal transfer function widely adopted in ANNs.
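The two families shown in Figures 2 and 3 follow directly from Equations (1) and (2); a minimal sketch (the grid of x values is arbitrary):

```python
import numpy as np

def psi(u):
    # mother wavelet, Eq. (1)
    return -u * np.exp(-u ** 2 / 2.0)

x = np.linspace(-4.0, 6.0, 500)  # evaluation grid (arbitrary)

# Fig. 2: constant dilation d = 1, varying translation t
family_t = {t: psi((x - t) / 1.0) for t in (0.1, 0.5, 0.9, 1.3, 1.7)}

# Fig. 3: constant translation t = 1, varying dilation d
family_d = {d: psi((x - 1.0) / d) for d in (0.1, 0.5, 0.9, 1.3, 1.7)}
```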


Fig. 3. Wavelets with constant translation (tj = 1) and varying dilation (dj = 0.1, 0.5, 0.9, 1.3, 1.7)

III. APPLICATION

ANNs and WNs are applied to model both the Henon map and the streamflow time series, and real-coded GA is used to train both models. In the WNs, during GA implementation, the translation and dilation factors are initialized to values between 0.1 and 1.4; this initialization range was selected by trial and error and found to be optimal. Also, in addition to the primary mutation operator, which operates on the weights and bias, a second mutation operator is devised to operate on the translation and dilation factors. The second mutation operator is functionally similar to the first, but it returns mutated values between 0.1 and 1.4. Single-point crossover is adopted. The GA parameters are initialized as follows: initial population = 50, probability of crossover (Pc) = 0.6, probability of mutation (Pm) = 0.4, and number of generations = 1000. The values of Pc and Pm are chosen arbitrarily. The reason for choosing a higher value of Pm for the real-coded GA, compared to that of a binary-coded GA (usually Pm ≈ 0.1), is that mutation is used as a major operator in exploring the search space. To improve the performance of the GA, elitism (Michalewicz, 1996) is incorporated in the search.
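The paper specifies the GA settings (real coding, single-point crossover, elitism, and the two mutation operators) but not their exact algebra, and the original code was in-house MatLab. The following Python sketch shows one plausible arrangement under those settings; the chromosome layout, selection scheme, and mutation step size are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def evolve(fitness, n_wb, n_td, pop_size=50, p_c=0.6, p_m=0.4, n_gen=1000):
    """Minimal real-coded GA with elitism, single-point crossover, and two
    mutation operators. A chromosome holds n_wb weight/bias genes followed
    by n_td translation/dilation genes (the latter kept within [0.1, 1.4])."""
    n_genes = n_wb + n_td
    # Initialization: weights/bias from a normal; t/d uniform in [0.1, 1.4]
    pop = np.hstack([rng.normal(size=(pop_size, n_wb)),
                     rng.uniform(0.1, 1.4, size=(pop_size, n_td))])
    for _ in range(n_gen):
        err = np.array([fitness(ind) for ind in pop])   # lower is better
        elite = pop[err.argmin()].copy()                # elitism: keep the best
        # Fitness-proportionate selection on inverted error
        p = err.max() - err + 1e-12
        parents = pop[rng.choice(pop_size, size=pop_size, p=p / p.sum())]
        # Single-point crossover on consecutive pairs
        for a in range(0, pop_size - 1, 2):
            if rng.random() < p_c:
                cut = rng.integers(1, n_genes)
                tmp = parents[a, cut:].copy()
                parents[a, cut:] = parents[a + 1, cut:]
                parents[a + 1, cut:] = tmp
        # Mutation operator 1: Gaussian perturbation of weight/bias genes
        m1 = rng.random((pop_size, n_wb)) < p_m
        parents[:, :n_wb] += m1 * rng.normal(scale=0.1, size=(pop_size, n_wb))
        # Mutation operator 2: resample t/d genes uniformly within [0.1, 1.4]
        m2 = rng.random((pop_size, n_td)) < p_m
        parents[:, n_wb:] = np.where(
            m2, rng.uniform(0.1, 1.4, (pop_size, n_td)), parents[:, n_wb:])
        parents[0] = elite                              # reinsert the elite
        pop = parents
    err = np.array([fitness(ind) for ind in pop])
    return pop[err.argmin()]
```

With a fitness function that decodes a chromosome into a network and returns its training error, `evolve` returns the best chromosome found; decoding it back into weights, biases, translations, and dilations is the caller's job.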

Similar to the WNs, the ANNs adopted in this study consist of an input layer, a hidden layer, and an output layer; transformation in the hidden layer of the ANNs is achieved by sigmoidal functions. In both ANNs and WNs, the optimal number of hidden nodes, i.e., the number of hidden nodes that results in the minimum prediction error, is determined by trial and error. The performance of WNs and ANNs is usually evaluated in terms of some error measure. Though there are many error measures, only the MSE or its variants (root mean square error [RMSE] and normalized mean square error [NMSE]) have been widely used in the literature.

However, studies by Karunanithi et al. (1994), Armstrong and Collopy (1992), and Jain and Srinivasulu (2004) have demonstrated that relative errors are a better alternative for evaluating forecasting methods, as they provide a more balanced perspective of the goodness of fit. In this study, the mean relative error (MRE), given by Equation (5), is used as the performance indicator. In Equation (5), n represents the number of instances presented to the model, and \(y_i\) and \(\hat{y}_i\) represent the measured and computed counterparts of the variable. In-house code was developed in MatLab to implement both the neural networks and the wavelet networks. Since GAs are initialized randomly, both the ANN and WN models were simulated with ten different random seeds, and the best solution out of the ten runs was selected and is presented in this study.

\[ \mathrm{MRE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \tag{5} \]

A. Modeling Chaotic Process

The Henon map is a two-dimensional quadratic map that belongs to a class of problems called deterministic chaos (nonlinear deterministic dynamic systems). An important characteristic of a chaotic system is its sensitivity to initial conditions (Kantz and Schreiber, 1997). More information on the Henon map can be found in Hilborn (1994). The Henon map is simulated by Equations (6) and (7).

\[ X_t = 1 - 1.4\,X_{t-1}^{2} + Z_t \tag{6} \]

\[ Z_t = 0.3\,X_{t-1} \tag{7} \]

\[ X_0 = 0.2, \quad Z_0 = 0.2 \]

where \(X_t\) is an instance of the variable X at time t and, similarly, \(Z_t\) is an instance of the variable Z at time t. One thousand data points are generated using Equations (6) and (7). Of the 1000 generated data points, 800 are used for training the network and the remaining points are used for testing. \(X_t\) is the input to the model and \(X_{t+1}\) is the output.
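A minimal sketch of the series generation and train/test split described above, together with the MRE of Equation (5); the function names are illustrative, not taken from the authors' code:

```python
import numpy as np

def henon_series(n_points=1000, x0=0.2, z0=0.2):
    """Henon map of Eqs. (6)-(7) with the stated initial conditions."""
    x = np.empty(n_points)
    z = np.empty(n_points)
    x[0], z[0] = x0, z0
    for t in range(1, n_points):
        z[t] = 0.3 * x[t - 1]                    # Eq. (7)
        x[t] = 1.0 - 1.4 * x[t - 1] ** 2 + z[t]  # Eq. (6)
    return x

def mre_percent(y_true, y_pred):
    """Mean relative error of Eq. (5), expressed in percent."""
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

x = henon_series(1000)
# One-step-ahead pairs: X_t is the input, X_{t+1} the target;
# the first 800 pairs train the network, the rest test it.
inputs, targets = x[:-1], x[1:]
x_train, y_train = inputs[:800], targets[:800]
x_test, y_test = inputs[800:], targets[800:]
```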

TABLE I
Comparison of MRE (%) from Different Models - Henon Map

Method                        Training    Testing
Artificial Neural Networks       60          58
Wavelet Networks                 57          47


The optimal number of hidden nodes is found to be three for both the ANN and WN implementations, and both models are trained for 5000 epochs. The performance statistics of the ANN and WN models in terms of MRE are presented in Table I. It can be observed that WNs perform better than ANNs in terms of MRE during both training and testing. This demonstrates the better generalization property of the WN model in modeling high-frequency signals like the Henon map. WNs are also found to be computationally faster than ANNs: each run of the ANN and WN models took approximately 90 minutes and 30 minutes, respectively. Plots showing the performance of ANNs and WNs in modeling the chaotic time series are presented in Figs. 4 and 5, respectively. The solid line represents the true value of the chaotic time series and the dashed line its predicted counterpart. For better illustration, only the first 100 points are plotted.


Fig. 4. True and ANNs Predicted Values of Time Series Generated by Henon Map


Fig. 5. True and WNs Predicted Values of Time Series Generated by Henon Map

B. Modeling Streamflow

In this study, the performance of ANNs and WNs in modeling monthly streamflow values is investigated. Monthly streamflow values of the English River, Ontario, Canada are considered. The streamflow data at Umferville are treated as reference data, and the streamflow values at a downstream location (Sioux Lookout) are considered missing; the values at Sioux Lookout are estimated from the flow values at Umferville. Streamflow values from January 1924 to December 1981 are considered for modeling. For training the networks, flow values from January 1924 to August 1965 are used, and the remainder of the dataset is used for testing. The networks are trained for 5000 epochs. A minimal sketch of this data preparation is given below; the statistical performance of the different models in modeling the streamflows, in terms of MRE, is presented in Table II.
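The split is by calendar date rather than by a fraction of the record. A minimal sketch, assuming two aligned monthly series; the array names and the synthetic placeholder values are hypothetical (the real gauge records would be loaded from data files):

```python
import numpy as np

# Monthly time axis, January 1924 through December 1981 (696 months)
months = np.arange('1924-01', '1982-01', dtype='datetime64[M]')

# Hypothetical placeholders for the two aligned gauge records:
# q_umf   - English River flows at Umferville (model input)
# q_sioux - flows at Sioux Lookout, treated as missing and to be estimated
rng = np.random.default_rng(0)
q_umf = rng.gamma(shape=2.0, scale=50.0, size=months.size)
q_sioux = 1.2 * q_umf + rng.normal(0.0, 5.0, size=months.size)

# Training uses January 1924 - August 1965; the remainder is for testing
split = np.searchsorted(months, np.datetime64('1965-09'))
x_train, y_train = q_umf[:split], q_sioux[:split]
x_test, y_test = q_umf[split:], q_sioux[split:]
```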

TABLE II
Comparison of MRE (%) from Different Models - Streamflows

Method                        Training    Testing
Artificial Neural Networks       17          19
Wavelet Networks                 17          20


Fig. 6. Actual and computed streamflow values at Sioux Lookout using ANNs

From Table II, it can be seen that the ANN and WN models performed similarly during training. However, during testing, the performance of ANNs is slightly better than that of WNs. This study demonstrates that the performance of ANNs is comparable to, or slightly better than, that of WNs in modeling low-frequency signals like monthly streamflows. Plots showing the measured and predicted streamflow values using ANNs and WNs are shown in Figs. 6 and 7, respectively. The optimal number of hidden nodes is found to be two for both the ANN and WN models.


Fig. 7. Actual and computed streamflow values at Sioux Lookout using WNs

IV. CONCLUSIONS

In the present work, the performance of artificial neural networks (ANNs) and wavelet networks (WNs) in modeling two distinct time series is investigated. The first time series represents a chaotic system (Henon map) and the second a geophysical time series (streamflows). While the first time series can be considered a high-frequency signal, the latter can be considered a low-frequency signal. Results from the study indicate that, in modeling the Henon map, WNs perform better than ANNs; WNs are also shown to have a better generalization property than ANNs. However, in modeling streamflows, ANNs are found to perform slightly better than WNs. In general, WNs are more appropriate for modeling high-frequency signals like the Henon map. Moreover, WNs are computationally faster than ANNs. The performance of the models could be further improved by combining a local search technique with the GA, which helps ensure that no promising region of the search space remains unexplored. Based on this study, it is recommended that the choice between WNs and ANNs be based on the nature of the application and on the trade-off between prediction accuracy and computation time.

ACKNOWLEDGEMENT

The authors acknowledge the financial support of the Natural Sciences and Engineering Research Council (NSERC) of Canada through the Discovery Grants Program and the University of Saskatchewan through the Departmental Scholarship Program.

REFERENCES

[1]. A. Jain and S. Srinivasulu, "Development of effective and efficient rainfall-runoff models using integration of deterministic, real-coded genetic algorithms and artificial neural network techniques," Water Resources Research, vol. 40, 2004.

[2]. A. Porporato and L. Ridolfi, "Nonlinear analysis of river flow time sequences," Water Resources Research, vol. 33(6), pp. 1353-1367, 1997.

[3]. ASCE Task Committee on Artificial Neural Networks in Hydrology, "Artificial neural networks in hydrology, I, Preliminary concepts," Journal of Hydrologic Engineering, vol. 5(2), pp. 115-123, 2000a.

[4]. ASCE Task Committee on Artificial Neural Networks in Hydrology, "Artificial neural networks in hydrology, II, Hydrologic applications," Journal of Hydrologic Engineering, vol. 5(2), pp. 124-137, 2000b.

[5]. B. Zhang and S. Govindaraju, "Prediction of watershed runoff using Bayesian concepts and modular neural networks," Water Resources Research, vol. 36(3), pp. 753-762, 2000.

[6]. D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley-Longman, Reading, Mass., 2000.

[7]. H. Kantz and T. Schreiber, Nonlinear Time Series Analysis, Cambridge University Press, Cambridge, 1997.

[8]. H. R. Maier and G. C. Dandy, "Neural networks for the prediction and forecasting of water resources variables: A review of modeling issues and applications," Environmental Modeling and Software, vol. 15, pp. 101-124, 2000.

[9]. J. D. Salas, G. Q. Tabios III, and P. Bartolini, "Approaches to multivariate modelling of water resources time series," Water Resources Bulletin, vol. 21(4), pp. 683-708, 1985.

[10]. J. Daugman, "Complete discrete 2-D Gabor transforms by neural networks for image analysis and compression," IEEE Trans. Acoust., Speech, Signal Proc., vol. 36, pp. 1169-1179, 1988.

[11]. J. S. Armstrong and F. Collopy, "Error measures for generalizing about forecasting methods: Empirical comparisons," International Journal of Forecasting, vol. 8, pp. 69-80, 1992.

[12]. M. Cannon and J. J. E. Slotine, "Space-frequency localized basis function networks for nonlinear system estimation and control," Neurocomputing, vol. 9(3), pp. 293-342, 1995.

[13]. N. Karunanithi, W. J. Grenney, D. Whitley, and K. Bovee, "Neural networks for river flow prediction," Journal of Computing in Civil Engineering, ASCE, vol. 8(2), pp. 201-220, 1994.

[14]. Q. Zhang and A. Benveniste, "Wavelet networks," IEEE Transactions on Neural Networks, vol. 3(6), pp. 889-898, 1992.

[15]. R. C. Hilborn, Chaos and Nonlinear Dynamics: An Introduction for Scientists and Engineers, Oxford University Press, New York, 1994.

[16]. S. S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice Hall, NJ, 1999.

[17]. T. Zhang and R. Berndtsson, "Analysis of soil water dynamics in time and space by use of pattern recognition," Water Resources Research, vol. 27(7), pp. 1623-1636, 1991.

[18]. Y. C. Pati and P. S. Krishnaprasad, "Discrete affine wavelet transforms for analysis and synthesis of feedforward neural networks," Advances in Neural Information Processing Systems, vol. 3, pp. 743-749, 1991.

[19]. Y. Oussar, I. Rivals, L. Personnaz, and G. Dreyfus, "Training wavelet networks for nonlinear dynamic input-output modeling," Neurocomputing, vol. 20, pp. 173-188, 1998.

[20]. Z. Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs, Springer-Verlag, New York, 1996.
