Detection of nonlinear dynamics in short, noisy series, Nature 381, 215 (1996)

TETTERS TO NATURE

Detection of nonlinear dynamicsin short, noisy time seriesMauricio Barahona* & Ghi-Sang Poont*

* Physics Department and t Harvard-MlT Division of Health Sciences andTechnologr, Massachusetts Institute of Technolos/, 77 MassachusettsAvenue, Cambridge, Massachusetts 02139, USA

TnB accurate identification of deterministic dynamics in anexperimentally obtained time seriesr-s can lead to new insightsregarding underlying physical processes, or enable prediction, atleast on short timescales. But deterministic chaos arising from anonlinear dynamical system can easily be mistaken for randomnoisen. Avaitable methods to distinguish deterministic chaosfrom noise can be quite effective, but their performance dependson the availability oflong data sets, and is severely degraded bymeasurement noise. Moreover, such methods are often incapableof detecting chaos in the presence of strong periodicity, whichtends to hide underlying fractal structurese. Ilere we present acomputational procedure, based on a comparison of the predic-tion power of linear and nonlinear models of the Volterra-Wiener formlo, which is capable of robust and highly sensitivestatistical detection of deterministic dynamics, including chaoticdynamics, in experimental time series. This method is superior toother techniquesl-6'11'12 when applied to short time series, eithercontinuous or discrete, even when heavily contaminated withnoise, or in the presence of strong periodicity.

Consider the usual description of a dynamical system as a'black box' with input .x, and output y,, at time n :1, . . . ,Nin multiples of the sampling time r. The discrete Volterra seriesis then a Taylor-like po$nomial expansion of y, in terms of

+To whom corespondence should be addre$ed,

NATURE . VOL 381 ' 16 MAY 1996

xntxn-t,...txn_rc+1t where rc is the memory of the system. Toovercome the computational intractability of the Volterra series,Wiener introduced an orthogonal formulationl3 assuming that.r,is a gaussian white noise sequence over ar infinite time period.Korenberg's recasting of the expansionla, on which our approachis based, relaxes these restrictions thereby allowing its applicationto finite and arbitrary signals.

For a dynamical system, either strictly autonomous or reformu-lated as such, we propose a closedJoop version of the Volterraseries in which the outputy, feeds back as delayed input (that is,xn=!n_t).Within this ftamework, we analyse univariate timeseries by using a discrete Volterra-Wiener-Korenberg series ofdegree d.and memoryc as a model to calculate the predicted timeseries yld":

y|t" : ao I atln-t I az!n-z+ "' + a*!,-* * a*+tfi-r

* a r+z ln - r ! n - z+ ' . . + au t f i - * : f o^z *1n ) (1 )

m=0

where the functional basis {2.(n)} is composed of all thedistinct combinations of the embedding space coordinatesls(ln-t, !n-t, . . . ,!n-*) up to degree d, with a total dimension M :(rc+d)ll@lrcl). Thus, each model is parametrizedby rc and d,which correspond to the embedding dimension and the degree ofnonlinearity of the model, respectively. The coefficients a^ arerecursively estimated through a Gram-Schmidt procedureto fromlinear and nonlinear autocorrelations ofthe data series itself. Thecalculations can be readily performed on a workstation.

The short-term prediction power of a model is then measuredby the standard one-step-ahead prediction error

^r-- ;t2 - II=, (Yi""(* ,d) - Y^)'U\^,u) :__f,I'.-nf-

where y :UNDI:ry, and e(x,d)2 is in effect a normalizedvariance of the error residuals. We now search for the best

(2)

2L5

TETTERS TO NATURE

mod-el {rcoo,,doo,} that minimizes the following information criter-ion'o in accordance with the parsimony principle:

C(r) : log t(r) +rlN (3)

where r € l1,lu4 is the number of polynomial terms of thetruncated Volterra expansions from a certain pair {rc,d}.

The numerical procedure is as follows: for each data series,obtain the best linear model by searching for rcrh which minimizesC(r)withd: 1. Repeatwith increasing rc andd > 1, to obtain thebest nonlinear model. Likewise, obtain the best linear and non-linear models for surrogate randomized data sets with the sameautocorrelation (and power spectrum) as the original series3,5.This results in four competing models with error standard devia-tions eli*, rll.'r, rlf, and e$o.

From above, the presence ofnonlinear determinism is indicatedff dor, ) L. Further corroboration is obtained with the followingobjective statistical criteria: for models with gaussian residuals, astandard F-test will serve to reject, with a certain level of con-fidence, the hypothesis that nonlinear models are no better thanlinear models as one-step-ahead predictors. This gaussianassumption was verffied throughout our analysis by using aX'-testwith a99Vo crtoff. Alternatively, the results are confirmedby using the nonparametric Mann-Whitney rank-sum statistic,which does not depend on the gaussian assumption3. Under thisscheme, the relevance of nonlinear predictors is established whenthe best nonlinear model from the original data is significantlymore predictive than both (1) the best linear model from the dataseries, and (2) the best linear and nonlinear models obtained fromthe surrogate series, that is, elii* sy"., el'L, > e$* in the statisticalsense.

Two points are noteworthy. First, because surrogate data aregenerated by preserving only the linear autocorrelation functionof the data series (nonlinear autocorrelations are randomized).

the addition of nonlinear terms does not increase the predictionpower. As expected, eij, = e!'[, and surrogate data are always bestapproximated by a linear model.

Second, when dealing with continuous signals, the time delay rfor the embedding (or optimal sampling rate) is another freeparameter to be determinedlt. For detection purposes, the opti-mal time delay roo, may be chosen so as to maximize the differencebefween e!lj, and'ef;.,r. The value of too, is bounded by two timits. Onone hand, if r ) roo, (undersampling) all four models (linear andnonlinear, original or surrogate) will have similarly small predic-tion powers and nonlinearity cannot be detected. On the otherhand, for an oversampled data series with a small step size(z ( r.o,), the linear correlation of the embedding is so largethat linear models always prevail. Within the range of acceptabletime delays where rool lies, generally e!1,* = elf,t that is, whenoptimally sampled, thi: prediction power of the linear model of acontinuous signal derives mainly from its autocorrelation func-tion. This equivalence of linear models from original and surro-gate data holds for discrete maps as well. Consequently, and incontrast with other methods, surrogate data play only a conflrma-tory role in our procedure.

We have tested our algorithm with a wide variety of short timeseries, typically 1,000 points long. As a preliminary trial, weverifled the method with several linear signals which, in somecases, proved to be non-recognizable by some of the existingmethods (see discussions in refs 3 and 8). The examples included:white gaussian and coloured l//" noisestl an auloregressive l inearrandom process'; periodic sinusoidal and nonsinusoidal series(see Theiler in ref. 8); as well as a quasiperiodic signal from atoruswith two frequencies (PaluS in ref. 8). In all cases the methodyielded e'i, ( €a and the linear hypothesis could not be rejected.The same conclusionwas reached in the presence of noise orwhenan a posteioi nonlinear transformation was performed on theoriginal linear datas.

a 1 . s

1 . 0

0.5

0.0

Original ' 1 . s

1 . 0\*

0.0

I

FlG. la,Embeddingreconstructionofanecological models:xn:,tyr. rexp(-O.Oot()/k,1+x* r)) ilx: O.2xx ,exp(-0.07(yn_r t xi r) * 0.8y,,_rexp(-0.05(y/,_1 + 0.5xr_r)) with 2 : 118. Thetrajectory evolves in an attractor comprised of fractals in several disconnected domains and visitseach of them in a periodic manner as indicated by the arrows. This strong periodicity makesdetection of the chaotic component difficult. b, C(r) (see equation (3); r is the number of polynomialterms) for a set of nonlinear, linear and surrogate data models for a time series (1,000 points) fromthe same ecological model, no noise added. Linearand surrogate models with d - l and differing x(see text) are represented. The nonlinear models are complete (I) or truncated (!) Volterra serieswith fixed rc : 6 and varyingd. Optimal nonlinear, linear and.sunogate models are indicated by thearrows. Here F"n": (ei lr) ' / (ei ir) ' : 9,387>> F(0.01,915,915) :1.17 Alternatively, theMann-Whitney statistic is Z : -37 .13 << - 2.33, the value of the inverse normal cumulativedensity function for or : 0.01. Similarly, the information criterion C(r) indicates that dopt > 1. Fromthese three criteria we deduce the nonlinear component is significant. c, Attractor propagated froma model (rc :2, d: 8) obtained with the algorithm. lt captures the general shape and subfractalstructure of the attractor of the original non-Volterra model. d, Plot of 1,4"r" versus the percentage ofadded coloured noise forthe same ecological example. The noise percentage is defined as the ratioofthe standard deviations ofthe additive random series and the original data. The discontinuous lineindicates the 99% confidence interval. The nonlinear deterministic component can be reliablydetected even in the presence of - 50% conelated noise. Similar results (not shown) are obtainedfor the connected fractal with z1 : 140.

2L6

d t .oo

6

LL

o/o additive correlated noise

NATURE . VOL 381 . 16 MAY 1996

Y*-'rNumber of polynomial terms

TETTERS TO NATURE

b o.oo

Number of Polynomial terms

FlG.2 a, As Fig. 1b, but for an experimental time series (1,000 points) ofemission of an NH3 laserfrom an apparatus designed to produce Lorenz-likechaos (series A from ref. 8). In this case F*u - tO.23 >F(0.01, 968, 944) : I.16, and the nonparametric Mann-Whitney statis-tic is Z : -24.60 < 2.33. Both criteria (and C(r)) indicated that thenonlinear component is significant. b, As a, but for an experimental timeseries (1,874 points) of the intensity of a variable dwarf star (segments 1and 2 of series E from ref. 8) in which linear superposition of modes hadbeen presumed and was to be verified. Not only is doo. : l- but thehypothesis of nonlinearity may be rejected with a 94% confidence levelas 7f F". '" : 1.08 = F(0.06, 1855, 1783).

Table 1 summarizes some of the results for numerically gener-ated nonlinear examples. It is important to emphasize the diver-sity of the data sets. For example, most of the discrete series(except the H6non and logistic maps) have non-polynomial, non-Volterra functional forms. The detection algorithm workedequally well with continuous systems, including those that evolvearound several ghost centres (Lorenz, Duffing); high-dimensionalsystems (as in series D (ref. 8), with a dimension of 9); and chaoticseries from nonlinear delayed feedback mechanisms (Mackey-Glass equation) with implicit dimension of seven.

We also examined the robustness of our method in the presenceof noise by adding to all the nonlinear examples white and/orcoloured measurement noises-the latter with the same auto-correlation as the original series (Table 1). Even with such shortseries, nonlinearitywas detected under high levels ofnoise (- 40-

TABLE 1 Results of algorithm tests

Discrete systems

Logistic mapH6non maplkeda mapEcological modeleNon-chaotic tractal2l

Continuous systems o/o

RosslerDuffingLorenz ll (ref. 3)Series D (ref. 8)Mackey-Glasss

75405050,L{50

List of numerically generated nonlinear examples and approximatemaximum percentages of correlated noise that can be added to a1,000-points-long series before the nonlinear component is no longerdetected with a 99% cutoff. These maximum values increase for longerdata series.

75Vo) comparable to those achieved by other methods with muchlonger (-16.000 points) series3.

As an example of the power of the algorithm, Fig. L showsresults for an ecological model whose chaotic component,obscured by strong periodicity, has eluded detection by some ofthe existing methods even in the absence of noise'. In contrast, ourtechnique-without any reprocessing of the data-detected thepresence of nonlinearity even when the series was corrupted by- 50% additive correlated noise. Moreover, the algorithm isrobust against contamination of the signal with random spikes(Poisson impulses) of intensities comparable to the signal ampli-tude and density up to 0.02.

Dynamic noise-random noise recycled by the dynamics of thesystem-is a major source of variability in feedback systemsl8. Weaddressed this issue by numerically introducing white and/orcoloured dynamic noise in H6non and Ikeda maps with boundaryconstraints to keep the trajectories in the attractors. Encoura-gingly, the method detected nonlinearity up to similar levels ofnoise as for the purely additive case, that is, - 80% and - 70%respectively.

Finally, we tested our algorithm with two benchmark experi-mental series (Fig. 2); the results are in agreement with publishedconclusionso.

Some questions remain concerning the signiflcance of themodels so obtained. Although the choice of model selectioncriterion is not critical for the present purposes, the reliability ofthe resulting models as long-term predictors (for example, Fig. 1c)is highly dependent on the chosen criterion, the nature of the timeseries and the amount of noise present (M.B. and C.-S.P.,unpublished work). The procedure can be extended to non-po$nomial functional basesle or multivariate series. But theseapproaches may not be feasible for large-scale functional expan-sions unless efflcient algorithms (such as the recursive one pro-posed here become available. New directions for the--method,including iterative refinement of modelswith shadowing20 or othernoise-reduction techniques, may prove useful for modelling inthe presence of noise and full detection of chaos. Thedemonstrated robustness of the technique in a wide variety oftest examples suggests its possible applicability to general experi-mental data.

o/o

7070

25

n

Number of polynomial terms

Received 13 November; accepted 3 April 1996.

1. Sugiham, G. & l\4ay, R. M. Nature 344,734-74L(I99O).2. Casdagli, M. Physica D35, 335 356 (1989).3. Kennel, N/]. B. & lsabelle, S. Phys. Rev- 446, 3111-3tlg (1992\.4 .The i le r ,J . ,Eubank,S. ,Longt in ,A . ,Ga ldr ik ian ,B.&Famer ,J .D.Phys icaD58,77 94(1992) -5. Kaplan, D. T. & Glass, L. Physica D64,43I-454 (1993).6. cressberger, P. & Procaccia, l. Phlsrca D9, L89 208 (1983).7. Osborne, A. R. & Provenale, A. Physica D35,357 381 (1989).8. Weigend, A. S. & Gershenfeld, N. A. (eds) Iime Series P€diction (Santa Fe Inst. Studies in the

Sciences ofComplexity, Vol. XV, Addison-Wesley, Reading, MA, 1994).9. Cazelles, B. & Ferriere, R. H. Nature 355,25-26 Q992).

10. lvlamarelis, V. Z. (ed.) Advanced Methods of Phj6io/ogical System Modeling Vol. ll (Plenum,NewYork,1988).

11. crassberger, P. & Procaccia, L Phys. Rev. I2A,259t 2593 (1983).

NATURE . VOL 381 . 16 MAY 1996

12. Sano, l\4. & Sawada, Y. Phys. Rev. Left. 55' 1082-1085 (1985).13. Wiener, N. Nonlinear Problems in Random Theoty Wiley, New York, 1958).14. Korenberg, M. J. Ann. biomed. Eng. L6,123-742 {3gA8).15. Takens, F. in Dynamical Systems and Turbulence (eds Rand, D. A. & Young, L. S.) 336-381

(Lecture Notes in lvlathematics Vol. 898, Springer, Berlin, 1981).16. Akaike, H. I EEE Trans. Auto. Co ntrol 49, 7 tO 7 23 (!97 4).17. Fraser, A. M. &Swinney, H. L. Phia. Rev. A33' 1134-1140 (1986).18. Glass, L. & Malta, C. P. J. theor. Biol. 145'2I7-223 (1990)-19. Crutchfi eld, J. P. & MacNamara, B. S. Compiex SJAtems l, 417 452 1J947 ).20. Hammel, S. lM. Phys. Lett. A7.44,427-424 Q99O).21. Grebog, C., Ott, E., Pelikan, S. & Yorke, J. A. PhJAica DA3,261 26A 094q.

ACKNOWLEDGEMENTS. We thank L. Glas and A. L. Goldberger for helpful comments. M.B. wassupported by a Spanish N4Ec-Fulbright Fellowship. C-S.P. was supported in pan by grants from theNSF, oNR and NHLBI.

2L7

Documents

Detection of nonlinear dynamics in short, noisy series, Nature 381, 215 (1996)