
A nonlinear dynamic artificial neural network model of memory

Sylvain Chartier (a,c,*), Patrice Renaud (b,c), Mounir Boukadoum (d)

a School of Psychology, University of Ottawa, Montpetit 407B, 125 University Street, Ottawa, ON, Canada K1N 6N5
b Université du Québec en Outaouais, Canada
c Institut Philippe Pinel de Montréal, Canada
d Université du Québec à Montréal, Canada

    Abstract

Nonlinearity and dynamics in psychology are found in various domains such as neuroscience, cognitive science, human development, etc. However, the models that have been proposed are mostly of a computational nature and ignore dynamics. In those models that do include dynamic properties, only fixed points are used to store and retrieve information, leaving many principles of nonlinear dynamic systems (NDS) aside; for instance, chaos is often perceived as a nuisance. This paper considers a nonlinear dynamic artificial neural network (NDANN) that implements NDS principles while also complying with general neuroscience constraints. After a theoretical presentation, simulation results will show that the model can exhibit multi-valued, fixed-point, region-constrained attractors and aperiodic (including chaotic) behaviors. Because the capabilities of NDANN include the modeling of spatiotemporal chaotic activities, it may be an efficient tool to help bridge the gap between biological memory neural models and behavioral memory models.

Crown Copyright © 2007 Published by Elsevier Ltd. All rights reserved.

    PsycINFO classification: 4160 neural networks

    Keywords: Chaos theory; Cognitive science; Connectionism; Mathematical modeling; Neural networks


    doi:10.1016/j.newideapsych.2007.07.005

* Corresponding author. School of Psychology, University of Ottawa, Montpetit 407B, 125 University Street, Ottawa, ON, Canada K1N 6N5. E-mail address: [email protected] (S. Chartier).


    1. Introduction

In the past, cognition has been viewed as being of a computational nature. Computational

approaches most often ignore dynamic properties. However, a multi-domain buildup of evidence exists to challenge this static view and replace it with the dynamic system perspective (Clark, 1997; van Gelder, 1998). The application and development of the nonlinear dynamic system (NDS) approach has recently been receiving a lot of attention (Guastello,

    2000). For instance, NDS can be found in a wide range of domains that include

    neuroscience (Dafilis, Liley, & Cadusch, 2001; Freeman, 1987; Korn & Faure, 2003),

    psychology of perception and motor coordination (DeMaris, 2000; Renaud, Bouchard, &

Proulx, 2002; Renaud, Décarie, Gourd, Paquin, & Bouchard, 2003; Renaud, Singer, &

Proulx, 2001; Zanone & Kelso, 1997), cognitive sciences (Erlhagen & Schöner, 2002),

    human development (Haken, Kelso, & Bunz, 1985; Thelen & Smith, 1994), creativeness

    (Guastello, 1998) and social psychology (Nowak & Vallacher, 1998). NDS is a theoretical

    approach that helps bring several spatiotemporal scales within a unified framework. The

    purpose of NDS is twofold. First, it serves as a tool to analyze data (e.g., EEG rhythms,

    bimanual coordination, eye movements, etc.). Second, it is used to model the different

    domains under investigation (from neuroscience to creativeness). Time and change are the

    two variables behind the strength of the NDS approach. As a result, NDS is deeply

    challenging the way mental and behavioral phenomena have been studied since the

    inception of scientific psychology, and NDS is quickly becoming a common tool to probe

    and understand cognitive phenomena (e.g., memory, learning and thinking), thanks to its

    ability to account for their chronological dimension (Bertenthal, 2007).

The way that a system changes over time is linked to the interaction between its immediate and external surroundings. The interaction between the system and its

    environment is essential to self-organization and complex behavior (Beer, 1995), like the

    decision-making process that the system goes through when dealing with ambiguous

    information (Grossberg, 1988). If this interaction occurs under nonlinear dynamic

    assumptions, then a larger variety of behaviors can be exhibited (Kaplan & Glass, 1995).

    NDS principles are thus found in both the microworld (e.g., neural activities) and the

    macroworld (e.g., cognitive phenomena).

    In this work, a nonlinear dynamic artificial neural network (NDANN) is proposed that

    exhibits NDS properties. The model is thus positioned between neural activities and low-

level cognition (Fig. 1). Although it cannot by itself connect neuroscience and psychology, it is a step in the direction that both worlds could be connected through the

    NDS perspective. Artificial neural networks (ANN) have been around since the seminal

    papers of McCulloch and Pitts (1943) and Rosenblatt (1958). Although many kinds of

    ANN exist, cognitive psychologists usually think about the computational model that was

    conceived by McClelland and Rumelhart (1986). But while ANN and NDS have

    properties in common, not all ANN models fall under the NDS perspectives (van Gelder,

    1998). For example, a three-layer feedforward network (e.g. Aussem, 1999; Elman et al.,

1996; Mareschal & Thomas, 2006; Munakata & McClelland, 2003) has dynamic properties,

but is not included in the dynamic perspective of cognition; it is associated with a computational view of cognition instead (van Gelder, 1998).

ANN considered under the umbrella of NDS can be used to achieve two goals: fit

    human experimental data or implement general properties. The model presented in this

    paper falls under the second goal: it implements NDS properties while being loosely

    constrained by neuroscience. The model is therefore part of the neurodynamic class

    (Haykin, 1999) where, contrary to biological neural models, units represent a small

    population of neurons (Skarda & Freeman, 1987). In fact, that branch of ANN is the most

    popular when considering the microscopic and macroscopic behaviors found within the

    NDS perspective (Haykin, 1999). One notable model is the adaptive resonance theory

(ART) networks, which were proposed to solve the stability-plasticity dilemma (Carpenter & Grossberg, 1987; Grossberg, 1988). In fact, nonlinear learning principles can be traced back to the 1960s (Grossberg, 1967). Other models, based on the distribution of the memory trace over the network instead of on a specific unit, have been around since the

    seventies when recurrent associative memories (RAMs) were created and then generalized

into bidirectional associative memories (BAMs). According to Spencer and Schöner

    (2003), associative memories were developed around the stabilities of attractors with no

    understanding of the existing instabilities. This observation is true for earlier models (e.g.

    Anderson, Silverstein, Ritz, & Jones, 1977; Begin & Proulx, 1996; Hopfield, 1982; Storkey

    & Valabregue, 1999), but it is no longer so for newer ones (e.g. Adachi & Aihara, 1997;

Aihara, Takabe, & Toyoda, 1990; Lee, 2006; Tsuda, 2001). Such models have shown that they can account for classification and categorization, contrary to the claim of Prinz and Barsalou (2000). However, they are not devoid of problems. On the one hand, some are too simple (Anderson et al., 1977; Bégin & Proulx, 1996; Hopfield, 1982; Kosko, 1988); on the other, some are too complex (Adachi & Aihara, 1997; Du, Chen, Yuan, & Zhang, 2005;

    Imai, Osana, & Hagiwara, 2005; Lee, 2006; Lee & Farhat, 2001). A model that aims to

    express NDS properties present in both domains (depicted in Fig. 1) must be built upon

    dynamic biological neural models (Gerstner & Kistler, 2002) while remaining as simple as

    possible. The model should exhibit several properties of human cognition while still

    abiding by underlying neuroscience; it should reflect in particular the rapidly changing and

widely distributed neural activation patterns that involve numerous cortical and subcortical regions activated in different combinations and contexts (Büchel & Friston, 2000;

    Sporns, Chialvo, Kaiser, & Hilgetag, 2004). Therefore, information representation within

    the network needs to be distributed and the network coding must handle both bipolar

Fig. 1. Level of modeling. The proposed model is situated between neuroscience and cognitive science modeling: within the nonlinear dynamic system, the NDANN model lies between microscopic behaviors (neurons) and macroscopic behaviors (humans).


    (binary) and multi-valued stimuli (Chartier & Boukadoum, 2006a; Costantini, Casali, &

    Perfetti, 2003; Wang, Hwang, & Lee, 1996; Zhang & Chen, 2003; Zurada, Cloete, & van

der Poel, 1996). Multi-valued coding can then be interpreted as rate coding based on

    population averages, which is consistent with fast temporal information processing

(Gerstner & Kistler, 2002) of neural activities. Consequently, coding reflects not the unit in itself, as seen in localist neural networks, but rather the whole network (Adachi &

    Aihara, 1997; Werner, 2001).

    Furthermore, the static view of information representation by stable attractors (e.g.,

    Hopfield-type networks) is now challenged by the dynamics of neural activities

(Babloyantz & Lourenço, 1994; Dafilis et al., 2001; Korn & Faure, 2003) and behaviors,

    where spatial patterns could be stored in dynamic orbits (Tsuda, 2001). Those phenomena

    suggest that, in real neural systems, information is stored and retrieved via both stable and

    dynamic orbit (possibly chaotic) attractors. The network proposed in this paper exhibits

    both dynamic orbit properties and fixed points. Finally, memory association and recall is

    not an all-or-nothing process, but rather a progressive one (Lee, 2006). As a result, hard

discontinuities, such as the signum (sign) output function used in most Hopfield-type networks, as well as one-shot learning algorithms must be discarded (e.g., Grossberg,

    1988; Hopfield, 1982; Personnaz, Guyon, & Dreyfus, 1986; Zhao, Caceres, Damiance, &

    Szu, 2006).

    The rest of the paper is organized as follows: Section 2 presents the model and its

    properties obtained from analytic and numerical results. Section 3 shows the simulation

    results obtained with bipolar and multi-valued stimuli. The section also displays the

    simulation results obtained under chaotic network behavior. It is followed by a discussion

    and conclusion.

    2. Model description

    2.1. Architecture

    The model architecture is illustrated in Fig. 2, where x[0] and y[0] represent the initial

    input-states (stimuli); t is the number of iterations over the network; and W and V are

    weight matrices. The network is composed of two interconnected layers that, together,

    allow a recurrent flow of information that is processed bidirectionally. The W layer returns

information to the V layer and vice versa, a function that can be viewed as a kind of top-down/bottom-up process. Like any BAM, this network can be both an autoassociative and a

    heteroassociative memory (Kosko, 1988). As a result, it encompasses both unsupervised

    and supervised learning and is thus suitable for a general architecture under the NDS

    perspective. In this particular model, the two layers can be of different dimensions and,

    contrary to usual BAM designs, the weight matrix from one side is not necessarily the

    transpose of the other side. In addition, each unit in the network corresponds to a neural

population (Skarda & Freeman, 1987), not an individual neuron as in a biological neural network, nor a psychological concept.

Fig. 2. Network architecture.

    2.2. Transmission

    The output function used in our model is based on the classic Verhulst equation

    (Koronovskii, Trubetskov, & Khramov, 2000). This logistic growth is described by the

    ARTICLE IN PRESSS. Chartier et al. / New Ideas in Psychology ] (]]]]) ]]]]]]4

    Please cite this article as: Chartier, S., et al. A nonlinear dynamic artificial neural network model of memory.

    New Ideas in Psychology (2007), doi:10.1016/j.newideapsych.2007.07.005

    http://dx.doi.org/10.1016/j.newideapsych.2007.07.005http://dx.doi.org/10.1016/j.newideapsych.2007.07.005
  • 8/3/2019 Sylvain Chartier, Patrice Renaud and Mounir Boukadoum- A nonlinear dynamic artificial neural network model of m

    5/26

dynamic equation
$$\frac{dz}{dt} = R(1 - z)z = f(z), \qquad (1)$$

where R is a general parameter. Eq. (1) has two fixed points, z = 0 and z = 1; however, only z = 1 is stable. For that reason, Eq. (1) has only one attractor, and it must be modified if two attractors are desired. One way to accomplish that is to change the right-hand term to a cubic form. We then obtain
$$\frac{dz}{dt} = R(1 - z^2)z = f(z). \qquad (2)$$
This last equation has three fixed points, z = −1, 0 and 1, of which the two nonzero ones are stable.
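The stability of these fixed points can be verified by linearizing f around each of them (a step left implicit in the text):

$$f'(z) = R(1 - 3z^2) \;\Rightarrow\; f'(0) = R > 0 \ \text{(unstable)}, \qquad f'(\pm 1) = -2R < 0 \ \text{(stable)}.$$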

This continuous-time differential equation can be approximated by a finite difference equation, following Kaplan and Glass (1995). Let z(t) be a discrete variable for t = 0, Δ, 2Δ, … We have
$$\frac{dz}{dt} = \lim_{\Delta \to 0} \frac{z(t+1) - z(t)}{\Delta}. \qquad (3)$$
If we assume Δ to be small but finite, the following approximation can be made:
$$\frac{z(t+1) - z(t)}{\Delta} \approx f(z(t)) \;\Rightarrow\; z(t+1) = \Delta f(z(t)) + z(t). \qquad (4)$$
With
$$f(z(t)) = R(1 - z(t)^2)z(t), \qquad (5)$$
we obtain
$$z(t+1) = \Delta R(1 - z(t)^2)z(t) + z(t), \qquad (6)$$

where Δ is a small constant term. The last equation can be applied to each element of a vector z. If we make the following variable changes, δ = ΔR, y(t+1) = z(t+1), and a(t) = z(t) (or b(t) = z(t)), and rearrange the terms of the previous equation, the following output functions are obtained:
$$y(t+1) = (\delta + 1)a(t) - \delta a(t)^3, \qquad (7)$$
$$x(t+1) = (\delta + 1)b(t) - \delta b(t)^3. \qquad (8)$$

In the previous equations, y(t+1) and x(t+1) represent the network outputs at time t+1; a(t) and b(t) are the corresponding usual activations at time t (a(t) = Wx(t); b(t) = Vy(t)); and δ is a general output parameter. As an example, Fig. 3 illustrates the shape of the output function when δ = 0.2 in Eq. (7). The value of δ is crucial for determining the type of attractor in the network, as the network may converge to steady, cyclic or chaotic attractors. Fig. 4 illustrates five different attractors that the output function exhibits based on the δ value. All of the attractors have an initial input x(0) = 0.05 and W = 1; in this one-dimensional network, both x and W are scalars.
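As a concrete illustration (not part of the original paper), the following sketch iterates the one-unit map of Eq. (7) with W = 1 and x(0) = 0.05; the δ values are illustrative picks meant to span the regimes of Fig. 4, since the paper does not list them all explicitly:

```python
import numpy as np

def output(a, delta):
    """Output function of Eqs. (7)-(8): (delta + 1)*a - delta*a**3."""
    return (delta + 1.0) * a - delta * a**3

W = 1.0                                     # one-dimensional network: x and W are scalars
for delta in (0.2, 0.9, 1.2, 1.45, 1.65):   # illustrative values across Fig. 4's regimes
    x = 0.05                                # initial input x(0)
    states = []
    for _ in range(60):
        x = output(W * x, delta)            # a(t) = W x(t)
        states.append(x)
    print(f"delta = {delta}: last states {np.round(states[-4:], 3)}")
```

Under these settings the orbit should approach 1 monotonically at δ = 0.2, alternate toward it at 0.9, lock onto a 2-cycle at 1.2, wander chaotically while staying positive at 1.45, and cross zero at 1.65.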

Like any NDS, to guarantee that a given output converges to a fixed point x*(t), the derivative of the output function must be positive and less than one (Kaplan & Glass, 1995):
$$0 < \frac{dy(t+1)}{d(Wx(t))} = \delta + 1 - 3\delta (Wx(t))^2 < 1. \qquad (9)$$
This condition is satisfied when 0 < δ < 0.5 for bipolar stimuli; in that case, Wx*(t) = ±1.

If Eq. (7) is expanded, the relation between the input x(t) and the output y(t+1) is made explicit:
$$y(t+1) = (\delta + 1)a(t) - \delta a(t)^3 \;\Rightarrow\; y(t+1) = (\delta + 1)Wx(t) - \delta (Wx(t))^3. \qquad (10)$$

Fig. 3. Output function when δ = 0.2.

Fig. 4. Attractor type in relation to the δ value: (a) monotonic approach to a fixed point; (b) alternate approach to a fixed point; (c) period-2 oscillation; (d) positive-quadrant-constrained chaotic attractor; (e) chaotic attractor.


To proceed further, Eq. (10) needs to be reformulated in a continuous-time form, as shown by
$$\frac{dy}{dt} = (\delta + 1)Wx - \delta (Wx)^3 - x. \qquad (11)$$
For example, Fig. 5a illustrates the network's fixed points for a one-dimensional setting. As expected, the stable fixed points are ±1, while zero is unstable. Fig. 5b illustrates the vector field for a two-dimensional setting. In this case, the fixed points [±1, ±1]ᵀ are stable nodes, the fixed points [±1, 0]ᵀ and [0, ±1]ᵀ are saddle points that define the boundaries of the basins of attraction, and the fixed point [0, 0]ᵀ is an unstable node.

There exists also another way to visualize the dynamics of the network, using the idea of potential energy. The energy E(y) is defined by
$$\frac{dE}{dy} = -\frac{dy}{dt}. \qquad (12)$$
The negative sign indicates that the state vector moves downhill in the energy landscape. Using the chain rule, the time derivative of the energy is
$$\frac{dE}{dt} = \frac{dE}{dy}\frac{dy}{dt}. \qquad (13)$$
Substituting Eq. (12) into Eq. (13) yields
$$\frac{dE}{dt} = -\left(\frac{dE}{dy}\right)^2 \leq 0. \qquad (14)$$
Therefore, E(t) decreases along trajectories or, in other words, the state vector globally converges towards lower energy. Equilibria occur at the fixed points of the vector field, where local minima correspond to stable fixed points and local maxima correspond to unstable fixed points. Hence, we need to find E(y) such that
$$\frac{dE}{dy} = -\left[(\delta + 1)Wx - \delta (Wx)^3 - x\right]. \qquad (15)$$

Fig. 5. Phase portrait of the function dy/dt = (δ + 1)Wx − δ(Wx)³ − x for a one-dimensional network, and the corresponding vector field for a two-dimensional network.


The general solution is
$$E(y) = \frac{1}{2}y^{\mathrm{T}}x - y^{\mathrm{T}}Wx - \delta y^{\mathrm{T}}Wx + \frac{1}{2}\delta y^{\mathrm{T}}(Wx)^{3} + C, \qquad (16)$$
where C is an arbitrary constant (C = 0 for convenience). Similar reasoning, applied to Eq. (8), gives the energy function E(x):
$$E(x) = \frac{1}{2}x^{\mathrm{T}}y - x^{\mathrm{T}}Vy - \delta x^{\mathrm{T}}Vy + \frac{1}{2}\delta x^{\mathrm{T}}(Vy)^{3} + C. \qquad (17)$$

Fig. 6 illustrates the energy function for a one-dimensional network (x*(t) = ±1) and a two-dimensional network (x* = [1, 1]ᵀ, [1, −1]ᵀ, [−1, 1]ᵀ, [−1, −1]ᵀ). In both cases, the number of dimensions in the two layers is equal. It is easy to see that for a one-dimensional setting the network has a double-well potential, and for a two-dimensional setting it has a quadruple-well potential where the local minima correspond to the stable equilibria.

Fig. 6. Energy landscape for one-dimensional and two-dimensional bipolar stimuli, with the corresponding contour plot.
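The double-well picture can be checked numerically with a short sketch (illustrative, not from the paper) that integrates the one-unit version of Eq. (11) with Euler steps; trajectories starting on either side of zero should settle at −1 or +1:

```python
def dydt(x, delta=0.2, W=1.0):
    """Right-hand side of Eq. (11) for a single unit."""
    return (delta + 1.0) * W * x - delta * (W * x)**3 - x

dt = 0.05                                   # Euler step (illustrative)
for x0 in (-1.5, -0.3, 0.2, 1.5):
    x = x0
    for _ in range(2000):
        x += dt * dydt(x)                   # descend the double-well landscape
    print(f"x(0) = {x0:+.1f} -> x(final) = {x:+.3f}")
```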

By performing a Lyapunov analysis (Kaplan & Glass, 1995), the δ values that produce the various behaviors (fixed point, cyclic and chaotic) depicted in Fig. 4 can be found. The Lyapunov exponent for the case of a one-dimensional network is approximated by
$$\lambda \approx \frac{1}{T}\sum_{t=1}^{T} \log\left|\frac{dy(t+1)}{dx(t)}\right|, \qquad (18)$$
where T is the number of network iterations, set to 10,000 to establish the approximation. In that case, the derivative term is obtained from Eq. (10), so that λ is given by
$$\lambda \approx \frac{1}{T}\sum_{t=1}^{T} \log\left|1 + \delta - 3\delta x(t)^2\right|. \qquad (19)$$
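Eqs. (18) and (19) translate directly into code; the following sketch (one unit, W = 1, as in the text) should yield negative exponents for fixed points and cycles and positive ones in the chaotic regimes of Fig. 7:

```python
import math

def lyapunov(delta, T=10_000, x0=0.05):
    """Approximate the Lyapunov exponent of Eq. (19) for one unit (W = 1)."""
    x, acc = x0, 0.0
    for _ in range(T):
        acc += math.log(abs(1.0 + delta - 3.0 * delta * x * x))  # Eq. (19)
        x = (delta + 1.0) * x - delta * x**3                     # map of Eq. (7)
    return acc / T

for delta in (0.2, 0.9, 1.2, 1.45, 1.65):
    print(f"delta = {delta}: lambda ~ {lyapunov(delta):+.3f}")
```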

The bifurcation diagram can also be computed. When the two diagrams are compared, it is easy to see the link between δ and the type of attractor. Fig. 7 shows that the network exhibits a monotonic approach to steady states if the value of δ is between 0 and 0.5. The bifurcation diagram shows fixed points at −1 and 1. This result is not surprising, since the weight (W) was set to 1. Finally, the proposed output function embodies a mechanism that balances a positive part, (δ+1)aᵢ, against a negative part, −δaᵢ³. Thus, a unit's output remains unchanged once it reaches a value R such that (δ+1)aᵢ − δaᵢ³ = R, where R is a real-valued limit (e.g., 0.7). This mechanism enables the network to exhibit multi-valued attractor behavior (for a detailed example, see Chartier & Boukadoum, 2006a). Such properties contrast with the standard nonlinear output function, which can only exhibit bipolar attractor behavior (e.g., Anderson et al., 1977; Hopfield, 1982). It should be noted that the multi-valued attractor in this model is not simply a special coding strategy (Costantini et al., 2003; Muezzinoglu, Guzelis, & Zurada, 2003; Zhao et al., 2006) or a subdivision of the bipolar function into a staircase function (Wang et al., 1996; Zhang & Chen, 2003; Zurada et al., 1996). In those strategies, the experimenter must know in advance how many different real values form each stimulus, in order to modify the architecture or the output function accordingly. In NDANN there is no need for the experimenter to be involved, as the network autonomously self-adapts its attractors to any given real values.

Fig. 7. Bifurcation and Lyapunov exponent diagrams as a function of δ.


    2.3. Learning

The simplest form of weight modification is accomplished with a simple Hebbian rule, according to the equation
$$W = YX^{\mathrm{T}}. \qquad (20)$$

    In this expression, X and Y are matrices that represent the sets of bipolar pairs to be

    associated, and W is the weight matrix. Eq. (20) forces the use of a one-shot learning

    process since Hebbian association is strictly additive. A more natural learning process

    would make Eq. (20) incremental. But, then, the weight matrix would grow unbounded

    with the repetition of the input stimuli during learning. This property may be acceptable

    for orthogonal patterns, but it leads to disastrous results when the patterns are correlated.

    In that case, the weight matrix will be dominated by its first eigenvalue, and this will result

    in recalling the same pattern whatever the input. A compromise is to use a one-shot

    learning rule to limit the domination of the first eigenvalue, and to use a recurrent

    nonlinear output function to allow the network to filter out the different patterns during

recall. Kosko's (1988) BAM effectively used a signum output function to recall noisy

    patterns, despite the fact that the weight matrix developed by using Eq. (20) is not optimal.

The nonlinear output function usually used by the BAM network is expressed by the following equations:
$$y(t+1) = \mathrm{sgn}(Wx(t)) \qquad (21)$$
and
$$x(t+1) = \mathrm{sgn}(W^{\mathrm{T}}y(t)), \qquad (22)$$
where sgn is the signum function defined by
$$\mathrm{sgn}(z) = \begin{cases} 1 & \text{if } z > 0, \\ 0 & \text{if } z = 0, \\ -1 & \text{if } z < 0. \end{cases} \qquad (23)$$
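For reference, the classical scheme of Eqs. (20)-(23) fits in a few lines; the bipolar pairs below are made up for illustration:

```python
import numpy as np

X = np.array([[ 1,  1],
              [-1,  1],
              [ 1, -1],
              [-1, -1],
              [ 1,  1],
              [-1,  1]], dtype=float)        # columns: two orthogonal 6-D x patterns
Y = np.array([[ 1,  1],
              [-1,  1]], dtype=float)        # columns: the associated 2-D y patterns

W = Y @ X.T                                  # one-shot Hebbian learning, Eq. (20)

x = np.array([-1., -1., 1., -1., 1., -1.])   # first x pattern with one flipped bit
for _ in range(5):                           # recurrent recall, Eqs. (21)-(22)
    y = np.sign(W @ x)                       # Eq. (21)
    x = np.sign(W.T @ y)                     # Eq. (22)
print(x, y)                                  # converges to the first stored pair
```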

In short, by using the weight matrix defined by Eq. (20) and the output function defined by Eqs. (21) and (22), the network is able to recall Y from X and, by using the weight matrix transpose, to recall X from Y. These two processes taken together create a recurrent nonlinear dynamic network with the potential to accomplish binary association. However, the learning of the BAM network is performed offline, and the nonlinear output function of Eqs. (21) and (22) is not used during that stage. Moreover, the network is limited to bipolar/binary patterns and, as such, cannot learn multi-valued attractors. In addition, the network develops many spurious attractors and has limited storage capacity (Personnaz, Guyon, & Dreyfus, 1985). One approach to overcome these limitations uses a projection matrix based on least-mean-squared-error minimization (Kohonen, 1972; Personnaz et al., 1985):
$$W = Y(X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}. \qquad (24)$$
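With the same illustrative pairs, the projection rule of Eq. (24) yields exact linear recall when X has full column rank (a sketch, not the authors' code):

```python
import numpy as np

X = np.array([[1, -1, 1, -1, 1, -1],
              [1, 1, -1, -1, 1, 1]], dtype=float).T   # same toy x patterns as above
Y = np.array([[1, -1],
              [1, 1]], dtype=float).T                 # same toy y patterns

W_proj = Y @ np.linalg.inv(X.T @ X) @ X.T             # projection matrix, Eq. (24)
print(np.allclose(W_proj @ X, Y))                     # True: exact recall W x_i = y_i
```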

This solution increases the storage capacity and recall performance of the network, but its learning rule, based on a matrix-inverse principle, is not a local process. Several sophisticated approaches have also been proposed that modify the learning rule or the coding procedure, with the result of increasing both storage capacity and performance


(e.g., Arik, 2005; Shen & Cruz Jr., 2005). More complex learning rules, such as the backpropagation algorithm (McClelland & Rumelhart, 1986) or support vector machines (Cortes & Vapnik, 1995), could have been used. However, since the proposed model had to remain close to neuroscience, variations on Hebbian learning were favored (Gerstner & Kistler, 2002). Therefore, learning in NDANN is based on time-difference Hebbian association (Chartier & Boukadoum, 2006a; Chartier & Proulx, 2005; Kosko, 1990; Sutton, 1988). It is formally expressed by the following equations:
$$W(k+1) = W(k) + \eta\,(y(0) - y(t))(x(0) + x(t))^{\mathrm{T}}, \qquad (25)$$
$$V(k+1) = V(k) + \eta\,(x(0) - x(t))(y(0) + y(t))^{\mathrm{T}}, \qquad (26)$$

where η represents the learning parameter. In Eqs. (25) and (26), the weight updates follow this general procedure: first, the initial inputs x(0) and y(0) are fed to the network; then, those inputs are iterated t times through the network (Fig. 2). This results in the outputs x(t) and y(t), which are used for the weight updates. Therefore, the weights will self-stabilize when the feedback is the same as the initial inputs (y(t) = y(0) = y*(t) and x(t) = x(0) = x*(t)); in other words, when the network has developed fixed points. The way learning works in NDANN contrasts with ART models (Grossberg, 1988), where one-shot learning occurs only when the state of the system is at a fixed point (resonance). In NDANN, the learning causes the network to progressively develop a resonance state between the input and the output. Finally, since the learning explicitly incorporates the output (x(t) and y(t)), it occurs online; thus, the learning rule is dynamically linked to the network's output. This contrasts with most BAMs, where the learning is performed solely on the activation (offline). Learning convergence is a function of the value of the learning parameter η. If weight convergence is desired, η must be set according to the following condition (Chartier & Boukadoum, 2006b; Chartier & Proulx, 2005):
$$\eta < \frac{1}{2(1 - 2\delta)\,\mathrm{Max}(N, M)}, \qquad \delta \neq \frac{1}{2}, \qquad (27)$$
where N and M are the dimensions of the two layers.
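Since the extraction of Eqs. (25)-(27) lost their operators, the sketch below assumes the signs of the time-difference rule used in the cited papers (Chartier & Proulx, 2005; Chartier & Boukadoum, 2006a); it is an illustration, not the authors' code:

```python
import numpy as np

def eta_bound(delta, N, M):
    """Upper bound on eta from Eq. (27) (valid for delta < 1/2)."""
    return 1.0 / (2.0 * (1.0 - 2.0 * delta) * max(N, M))

def learn_trial(W, V, x0, y0, delta, eta, t=1):
    """One NDANN learning trial: iterate t cycles through Eqs. (7)-(8),
    then apply the time-difference Hebbian updates of Eqs. (25)-(26)."""
    f = lambda a: (delta + 1.0) * a - delta * a**3
    x, y = x0, y0
    for _ in range(t):
        y = f(W @ x)                            # Eq. (7): a(t) = W x(t)
        x = f(V @ y)                            # Eq. (8): b(t) = V y(t)
    W = W + eta * np.outer(y0 - y, x0 + x)      # Eq. (25), reconstructed signs
    V = V + eta * np.outer(x0 - x, y0 + y)      # Eq. (26), reconstructed signs
    return W, V
```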

    3. Simulations

Several simulations were performed to illustrate the various behaviors and properties of the model. They were divided into two sets. The first one was devised to show that NDANN can (1) produce the same behavior as that observed with a fixed-point associative memory, and (2) classify multi-valued stimuli, which, in turn, links it to biological rate-based models. The second set of simulations dealt with dynamic orbits, where the network state space is different from one iteration to another. These simulations show that the proposed network can represent behavioral variability without resorting to stochastic processes. In addition, the chaotic attractors can be bounded or not, depending on the desired context.

    3.1. Learning and recall: the fixed-point approach

The first simulation addresses the issue of iterative learning and convergence with multi-valued stimuli of different dimensions. It will show that the model can learn any kind of stimulus, in any situation, and without need for data preprocessing (stimulus normalization or orthogonalization). This feature is important since, sometimes, the entire task is done through preprocessing, thus making the ANN's role accessory.

In this simulation, the distance between the network output and the desired stimulus is obtained by a measure of error (Euclidean distance), with the number of stimuli to be learned varied from 2 to 6. Thus, a task of learning 6 associations should be more difficult than a task of learning 2, given the same amount of learning time (k = 25 learning trials). The stimuli are displayed in Fig. 8. The first stimulus set represents letters on a 7 × 7 grid, where a white pixel is assigned a value of −1 and a black pixel a value of +1. Each letter forms a 49-dimensional bipolar vector. The second stimulus set consists of 16 × 16 gray-level images, with each image forming a 256-dimensional real-valued vector with eight levels of gray. Therefore, the W weight matrix has 49 × 256 connections and the V weight matrix 256 × 49 connections. The network's task was to associate each image with its corresponding letter (the printer image with the letter P, the mailbox image with the letter M, etc.). The learning parameter η was set to 0.0025 and the output parameter δ to 0.01. Both values met the requirements for weight convergence and fixed-point development. Since the model's learning is online and iterative, the stimuli were not presented all at once. In order to save time, the number of iterations before each weight update was set to t = 1. The learning followed the general procedure:

0. Initialization of the weights to zero.
1. Random selection of a stimulus pair following a uniform distribution.
2. Iteration of the stimuli through the network according to Eqs. (7) and (8) (one cycle).
3. Weight update according to Eqs. (25) and (26).
4. Repetition of steps 1-3 until the desired number of learning trials is reached (k = 25); a code sketch of this loop is given below.
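As an illustrative end-to-end sketch of steps 0-4 (with random stand-ins for the Fig. 8 pairs, which are not reproduced here; the parameters are the ones quoted in the text):

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.choice([-1.0, 1.0], size=(49, 6))                  # stand-ins for the 6 letters
X = rng.choice(np.linspace(-1.0, 1.0, 8), size=(256, 6))   # stand-ins for the 6 images

eta, delta = 0.0025, 0.01                        # parameters from the text
f = lambda a: (delta + 1.0) * a - delta * a**3   # output function, Eqs. (7)-(8)

W, V = np.zeros((49, 256)), np.zeros((256, 49))  # step 0: weights at zero
for k in range(25):                              # k = 25 learning trials
    j = rng.integers(6)                          # step 1: random pair
    x0, y0 = X[:, j], Y[:, j]
    y1, x1 = f(W @ x0), f(V @ y0)                # step 2: one cycle (t = 1)
    W += eta * np.outer(y0 - y1, x0 + x1)        # step 3: Eq. (25)
    V += eta * np.outer(x0 - x1, y0 + y1)        #         Eq. (26)
print(np.linalg.norm(Y - f(W @ X), axis=0))      # per-pair error, still nonzero at k = 25
```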

Fig. 8. Stimuli associations used for the simulation.

Fig. 9 illustrates the obtained results as a function of task difficulty. Easy tasks (fewer associations) were better learned than more complex ones. In addition, as learning increased in complexity, more and more learning interference was observed, producing greater variability in the results. To evaluate the variability for each of the


association tasks, the learning procedure was repeated 400 times. The variability was evaluated using the standard deviation. For example, in the two-association task, the variability was given by the following average:
$$sd_{\mathrm{Aver}} = \sum_{k=1}^{25}\sum_{j=1}^{2} sd_{jk}, \qquad (28)$$
where k represents the learning trial, j the association pair, and $sd_{jk}$ is given by
$$sd_{jk} = \sqrt{\frac{\sum_{i=1}^{400}\left(x_{ijk} - \bar{x}_{jk}\right)^{2}}{399}}, \qquad (29)$$
with i representing the simulation trial and x the performance (1.0 − error). The average standard deviation for the two-association task was 0.11, while those for the four- and six-association tasks were 0.15 and 0.20, respectively. If the number of trials is not restrained and if there are fewer stimulus prototypes than network dimensions, then the network will be able to learn the desired associations perfectly (Chartier & Boukadoum, 2006a); after fewer than 200 learning trials, the network could learn all six desired associations perfectly (Fig. 10).

Fig. 9. Learning curves for 2, 4 and 6 associations during 25 learning trials: (a) example of a single simulation; (b) average over 400 simulations for the printer association.

Following learning convergence, recall tests were performed to see if the network could show pattern completion over missing parts and noise removal, properties that are generally attributed to the Gestalt theory of visual perception (Gordon, 1997). In other words, can the network develop fixed points? And is the recall process a progressive one (Lee, 2006)? Fig. 11 shows the recall output as a function of time. Using an incomplete input (the printer image), the network was able to iteratively recall the associated letter (P), while also reconstructing the missing parts. Contrary to signum output functions (e.g., Hopfield, 1982), it takes several iterations for a given stimulus to converge to a fixed point. The same behavior is observed if the initial stimulus is corrupted with noise (Fig. 12). For instance, the network effectively completes the pattern (Fig. 13) and removes the noise (Fig. 14) if the initial condition is a noisy letter instead of a prototype. For a comparison with other types of BAMs on recall performance, as well as on the number of spurious attractors, see Chartier and Boukadoum (2006a).

Fig. 10. Weight convergence as a function of the number of learning trials.

Fig. 11. Recall output after 63% of the printer image has been removed.

Fig. 12. Recall output after 30% of the mailbox pixels have been flipped.

Fig. 13. Recall output after 43% of the letter D pixels have been removed.

Fig. 14. Recall output after 20% of the letter L pixels have been flipped.


    3.2. Learning and recall: the dynamic orbit approach

This simulation was performed to see if the network could depart from the fixed-point perspective toward the dynamic-orbit perspective. In order to ensure that the stimuli were well stored, the δ value was set to 0.1 during training and to a chaos-leading value (1.45 or 1.65) during recall. From Fig. 7, it is easy to see that δ = 1.45 corresponds to a region of restrained chaotic behavior (Fig. 4d) and δ = 1.65 to unrestrained chaotic behavior (Fig. 4e). The first simulation consisted of learning two two-dimensional bipolar stimuli: [1, 1]ᵀ and [−1, 1]ᵀ. The learning parameter η was set to 0.01 and the number of learning trials to 100. Fig. 15 shows scatter plots of the output state vectors given the first stimulus (a) and the second (b) when δ = 1.45. As expected, the plots did not exhibit random behavior, but rather that of a quadratic map function. Moreover, the figure clearly shows that the chaotic output is constrained to its corresponding quadrant (++ for the first stimulus and −+ for the second). Fig. 16 displays the output variations within the basin of attraction for different random inputs. Contrary to regular BAMs, the attractors are not the corners of a hypercube, but regions clearly delimited within quadrants. Thus, by evaluating the signs of the N components, it is possible to know on which attractor the network sits. For a given value of δ, the amount of variability in the output is a function of the region's volume (see the bifurcation diagram in Fig. 7): the higher the δ value, the greater the variability. Thus, even if the network behavior is chaotic, no boundary crossing will occur while recalling an input. It is then still possible to exhibit pattern completion and reconstruction, as would be expected from a memory model.

Fig. 15. Scatter plot of x(t+1) as a function of x(t) for δ = 1.45.
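A toy version of this quadrant-constrained recall can be sketched as follows, assuming (as an illustration) autoassociative pairs y(0) = x(0), the reconstructed learning rule of Section 2.3, and a longer training run than the 100 trials reported in the text, for safe convergence:

```python
import numpy as np

rng = np.random.default_rng(1)
f = lambda a, d: (d + 1.0) * a - d * a**3       # output function

S = np.array([[1.0, 1.0], [-1.0, 1.0]])         # the two stored stimuli (rows)
W, V = np.zeros((2, 2)), np.zeros((2, 2))
for _ in range(1000):                           # eta = 0.01, delta = 0.1 during training
    x0 = S[rng.integers(2)]; y0 = x0            # autoassociation assumed: y(0) = x(0)
    y1, x1 = f(W @ x0, 0.1), f(V @ y0, 0.1)
    W += 0.01 * np.outer(y0 - y1, x0 + x1)
    V += 0.01 * np.outer(x0 - x1, y0 + y1)

x = np.array([0.8, 0.9])                        # noisy start near [1, 1]
signs = set()
for t in range(500):                            # chaotic recall at delta = 1.45
    x = f(V @ f(W @ x, 1.45), 1.45)             # one bidirectional cycle
    if t > 50:                                  # skip the transient
        signs.add(tuple(np.sign(x)))
print(signs)                                    # expected: {(1.0, 1.0)} only
```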

Sometimes, nonperiodic associative dynamics can be the desired behavior (Adachi & Aihara, 1997). This behavior is characterized by the network's ability to retrieve stored patterns and escape from them in the transient phase. To achieve this behavior, the output parameter is simply increased to a value equal to or greater than 1.6 (Fig. 7). For instance, if the value of δ is set to 1.65, then the output covers the entire quadrant (Figs. 17 and 18). For the sake of comparison, the next simulation used the same patterns as Adachi and Aihara (1997) and Lee (2006). As shown in Fig. 19, the four stored stimuli are 10 × 10 patterns that give 100-dimensional bipolar vectors. Therefore, the W weight matrix is composed of 100 × 100 connections and the V weight matrix of 100 × 100 connections. The learning parameter was set to 0.0025, and the output parameter was set to 1.45 (for the first simulation) and to 1.65 (for the second one). Since the dimension of the network is 100 and the weights are initialized at zero, the maximum squared error is 100. It took NDANN fewer than 100 learning trials to reach weight convergence (Fig. 20).

For the first simulation, the network's chaotic output must remain bounded within a region of the stimulus space. More precisely, each element of the output vector can vary only within its respective quadrant. For example, after convergence, if an element is negative, then all its successive states will be negative as well. Fig. 21 shows the network behavior given a noisy stimulus (30% of the pixels were flipped), when the transmission parameter is set to δ = 1.45. The network progressively recalled the input into its correct quadrant. Then, the output elements always varied within their converged quadrant, like the two-dimensional network behavior illustrated in Fig. 16, without crossing any axis. By assigning +1 to each positive vector element and −1 to each negative element, it is easy to establish to which particular stimulus the network converges. Thus, this behavior differs from the fixed-point approaches by exhibiting output variability, while sharing their convergence to a stable attractor in the quadrant-constrained sense.

If the value of δ is increased enough, then the network shows nonperiodic associative dynamics. For instance, Fig. 22 displays the network output given a noise-free stimulus. After a couple of iterations, the state vector leaves the attracted region and wanders from one stored stimulus region to another. This model is therefore able to reproduce the dynamics found in Adachi and Aihara (1997). Evaluation of the stimulus distribution was performed by generating random patterns where each element followed a uniform distribution, xᵢ ∈ [−1, 1]. Each random pattern was iterated through the network for 1000 cycles. The two steps were repeated with 50 different initial patterns. The proportions of the four stimuli were then 22%, 23%, 26% and 27% (SE = 0.21%). Because of this, the


observed distribution is statistically different from the theoretical uniform distribution. In addition, those memories are present in less than 15% of all the observed patterns. Thus, more than 85% of the time, the pattern is a transient memory circulating within the stimulus subspace.

Fig. 16. Output variations for the quadrant-restrained condition.

Fig. 17. Scatter plot of x(t+1) as a function of x(t) for δ = 1.65.

Fig. 18. Output variations for the nonperiodic associative memory condition.

Fig. 19. The four stimuli used for learning.

Fig. 20. Weight convergence as a function of the number of learning trials. Since the network dimension is 100, the maximum squared error is 100.

Fig. 21. Example of recall output from a noisy stimulus (30% of pixels flipped).


Finally, nonperiodic associative dynamics can be used as a search procedure for a given item. For example, if the triangle pattern is the target, then Fig. 23 shows that this pattern can effectively be retrieved if the transmission parameter is set to a lower value once the network's state is close to a match. More precisely, the figure shows that the transmission parameter δ was initially set to 1.6 to allow the network to be in a nonperiodic associative state. At time t = 52, the Euclidean distance between the network state and the stimulus was close enough (< √60) to allow lowering the transmission parameter to δ = 1.45, corresponding to a quadrant-constrained aperiodic state. After 10 iterations (t = 61), the transmission parameter δ was set to 0.4, corresponding to fixed-point behavior. The output then rapidly converged to the desired triangle pattern.
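A sketch of this three-stage schedule, assuming already-trained weight matrices W and V (the √60 threshold, the 10-iteration delay, and the δ values are the ones quoted above):

```python
import numpy as np

def f(a, delta):
    """Output function of Eqs. (7)-(8)."""
    return (delta + 1.0) * a - delta * a**3

def chaotic_search(W, V, x, target, max_iter=200):
    """Sketch of the Fig. 23 schedule: run nonperiodic dynamics (delta = 1.6),
    drop to the quadrant-constrained regime (delta = 1.45) once the state is
    within sqrt(60) of the target, then settle on the fixed point (delta = 0.4)."""
    delta, t_switch = 1.6, None
    for t in range(max_iter):
        x = f(V @ f(W @ x, delta), delta)       # one bidirectional cycle
        if delta == 1.6 and np.linalg.norm(x - target) < np.sqrt(60):
            delta, t_switch = 1.45, t           # first lowering (t = 52 in Fig. 23)
        elif t_switch is not None and t == t_switch + 10:
            delta = 0.4                         # second lowering (t = 61 in Fig. 23)
    return x
```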

Fig. 22. Nonperiodic associative dynamics behavior (δ = 1.65).

Fig. 23. Specific pattern retrieval as a function of time. The target pattern was the triangle symbol. The minimum-distance criterion was set to √60. The transmission parameter δ was lowered (to 1.45 and 0.4) at t = 52 and t = 61, respectively.


    4. Discussion and conclusion

In NDANN, the simple NDS principles (nonlinear change and time) were applied to the output and learning functions. Those principles were implemented in a recurrent neural network and kept as simple as possible. For the output function, a one-dimensional logistic map was employed, while for learning, a time-difference Hebbian association was used. Both algorithms were linked together and subject to an online exchange of information that enabled the model to exhibit behaviors under the NDS perspective (e.g., learning curves, pattern completion, noise tolerance, deterministic output variability). The model can easily be modified to account for more behaviors. For example, if the architecture changes, then the model can be used for multi-step pattern learning and one-to-many associations (Chartier & Boukadoum, 2006b). This temporal associative memory could be employed to model knowledge processing (Osana & Hagiwara, 2000) as well as periodic behavior (Yang, Lu, Harrison, & França, 2001). It has also recently been shown that by modifying only the architecture, while keeping both learning and transmission constant, the model can convert high-dimensional data into low-dimensional codes (Chartier, Giguère, Renaud, Proulx, & Lina, 2007). In this way, it could potentially be used for feature extraction and learning (Hinton & Salakhutdinov, 2006).

The results with iterative online learning allow the model to find an analogy in developmental psychology, where there is progressive learning through adoption and re-adoption as a function of the interaction between stimulus relationships (Lee, 2006). However, for simulation purposes, learning was made from the prototypes themselves. In a natural setting, the categories should be extracted using a set of exemplars instead of prototypes. In the real world, each stimulus that is experienced is different from the previous one. To cope with the possibility of an infinite number of stimuli, animals can group them into categories. Human cognition, in particular, allows adaptation to many environments. For instance, humans learn to discriminate almost all individual stimulations in some situations, whereas only a generic abstraction of the category is learned in others. In some contexts, exemplars are memorized, while in others, prototypes are abstracted and memorized. This noisy learning is possible with the incorporation of a vigilance procedure (Grossberg, 1988) into distributed associative memories (Chartier, Hélie, Proulx, & Boukadoum, 2006; Chartier & Proulx, 1999) or by using a PCA/BAM-type architecture (Giguère, Chartier, Proulx, & Lina, 2007). Moreover, cognitive models must consider base-rate learning, where the frequency of categories is employed to correctly identify an unknown item; in other contexts, however, the frequency of exemplars is used. The model could be easily modified to account for those different environmental biases, a property that winner-takes-all models, like ART, have difficulty coping with (Hélie, Chartier, & Proulx, 2006).

If δ is set high enough (e.g., δ = 1.45), the network behavior will be constrained

within a specific region. This allowed the network to exhibit variability, while still being able to show noise tolerance and pattern completion; this is an important property for a dynamic associative memory. Furthermore, if δ is set to a still higher value (e.g., δ = 1.65), then the network can display nonperiodic associative memory behavior. The state vector in that case is never trapped in any fixed point or region; instead, it moves in nonperiodic fashion from stored pattern to stored pattern. This memory-searching process is clearly different from that of fixed-point associative memory models (Adachi & Aihara, 1997) and helps in the understanding of instability. The network's behavioral variability results from its chaotic dynamics and is thus deterministic. The network behavior was always constrained to the stimulus subspace. Stochastic processes could also be implemented, which might then allow the network to explore high-dimensional space through chaotic itinerancy (Kaneko & Tsuda, 2003; Tsuda, 2001), as well as chaotic wandering from a high-dimensional to a low-dimensional attractor. This wandering could play a role in system evolvability and architecture development.

In conclusion, the present paper has shown that complex behaviors can arise from simple interactions among a network's topology, learning function and output function. The fact that the model can process multi-valued stimuli allows it to be built in conformity with the dynamics of biological neural models (Gerstner & Kistler, 2002). Furthermore, the network displayed various behaviors expected within the NDS approach in psychology. As a result, the model may bring neural activities and human behaviors closer together through the satisfaction of NDS properties.


    Acknowledgments

The authors are grateful to David Fung and two anonymous reviewers for their helpful comments on this article.

    References

Adachi, M., & Aihara, K. (1997). Associative dynamics in a chaotic neural network. Neural Networks, 10(1), 83–98.
Aihara, K., Takabe, T., & Toyoda, M. (1990). Chaotic neural networks. Physics Letters A, 144(6–7), 333–340.
Anderson, J. A., Silverstein, J. W., Ritz, S. A., & Jones, R. S. (1977). Distinctive features, categorical perception, and probability learning: Some applications of a neural model. Psychological Review, 84, 413–451.
Arik, S. (2005). Global asymptotic stability analysis of bidirectional associative memory neural networks with time delays. IEEE Transactions on Neural Networks, 16, 580–586.
Aussem, A. (1999). Dynamical recurrent neural networks towards prediction and modeling of dynamical systems. Neurocomputing, 28(1–3), 207–232.
Babloyantz, A., & Lourenço, C. (1994). Computation with chaos: A paradigm for cortical activity. Proceedings of the National Academy of Sciences, 91, 9027–9031.
Beer, R. D. (1995). A dynamical systems perspective on agent–environment interaction. Artificial Intelligence, 72(1–2), 173–215.
Begin, J., & Proulx, R. (1996). Categorization in unsupervised neural networks: The Eidos model. IEEE Transactions on Neural Networks, 7(1), 147–154.
Bertenthal, B. I. (2007). Dynamical systems: It's about time. In S. Boker & M. J. Wenger (Eds.), Analytic techniques for dynamical systems. Hillsdale, NJ: Erlbaum.
Büchel, C., & Friston, K. (2000). Assessing interactions among neuronal systems using functional neuroimaging. Neural Networks, 13(8–9), 871–882.
Carpenter, G. A., & Grossberg, S. (1987). A massively parallel architecture for a self-organizing neural pattern recognition machine. Computer Vision, Graphics, and Image Processing, 37, 54–115.
Chartier, S., & Boukadoum, M. (2006a). A bidirectional heteroassociative memory for binary and grey-level patterns. IEEE Transactions on Neural Networks, 17(2), 385–396.
Chartier, S., & Boukadoum, M. (2006b). A sequential dynamic heteroassociative memory for multistep pattern recognition and one-to-many association. IEEE Transactions on Neural Networks, 17(1), 59–68.
Chartier, S., Giguère, G., Renaud, P., Proulx, R., & Lina, J.-M. (2007). FEBAM: A feature-extracting bidirectional associative memory. Proceedings of the International Joint Conference on Neural Networks (IJCNN'07), Orlando, USA.
Chartier, S., Hélie, S., Proulx, R., & Boukadoum, M. (2006). Vigilance procedure generalization for recurrent associative memories. In R. Sun & N. Miyake (Eds.), Proceedings of the 28th Annual Conference of the Cognitive Science Society (p. 2458). Mahwah, NJ: Lawrence Erlbaum Associates.
Chartier, S., & Proulx, R. (1999). A self-scaling procedure in unsupervised correlational neural networks. Proceedings of the International Joint Conference on Neural Networks (IJCNN'99), Vol. 2, Washington, DC, USA, pp. 1092–1096.
Chartier, S., & Proulx, R. (2005). NDRAM: Nonlinear dynamic recurrent associative memory for learning bipolar and nonbipolar correlated patterns. IEEE Transactions on Neural Networks, 16(6), 1393–1400.
Clark, A. (1997). The dynamical challenge. Cognitive Science, 21(4), 461–481.
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273–297.
Costantini, G., Casali, D., & Perfetti, R. (2003). Neural associative memory storing gray-coded gray-scale images. IEEE Transactions on Neural Networks, 14(3), 703–707.
Dafilis, M. P., Liley, D. T. J., & Cadusch, P. J. (2001). Robust chaos in a model of the electroencephalogram: Implications for brain dynamics. Chaos, 11, 474–478.
DeMaris, D. (2000). Attention, depth perception, and chaos in the perception of ambiguous figures. In D. S. Levine & V. R. Brown (Eds.), Oscillations in neural systems (pp. 239–259). Lawrence Erlbaum Associates.
Du, S., Chen, Z., Yuan, Z., & Zhang, X. (2005). A kind of Lorenz attractor behavior and its application in association memory. International Journal of Innovative Computing, Information and Control, 1(1), 109–129.
Elman, J. L., Bates, E. A., Johnson, M. H., Karmiloff-Smith, A., Parisi, D., & Plunkett, K. (1996). Rethinking innateness: A connectionist perspective on development. Cambridge, MA: MIT Press.


Erlhagen, W., & Schöner, G. (2002). Dynamic field theory of movement preparation. Psychological Review, 109(3), 545–572.
Freeman, W. J. (1987). Simulation of chaotic EEG patterns with dynamic model of the olfactory system. Biological Cybernetics, 56(2–3), 139–150.
Gerstner, W., & Kistler, W. (2002). Spiking neuron models: Single neurons, populations, plasticity. Cambridge: Cambridge University Press.
Giguère, G., Chartier, S., Proulx, R., & Lina, J.-M. (2007). Creating perceptual features using a BAM-inspired architecture. In D. S. McNamara & J. G. Trafton (Eds.), Proceedings of the 29th Annual Cognitive Science Society (pp. 1025–1030). Austin, TX: Cognitive Science Society.
Gordon, I. E. (1997). Theories of visual perception (2nd ed.). Chichester, UK: Wiley.
Grossberg, S. (1967). Nonlinear difference-differential equations in prediction and learning theory. Proceedings of the National Academy of Sciences, 58, 1329–1334.
Grossberg, S. (1988). Nonlinear neural networks: Principles, mechanisms, and architectures. Neural Networks, 1(1), 17–61.
Guastello, S. J. (1998). Creative problem solving groups at the edge of chaos. Journal of Creative Behavior, 32(1), 38–57.
Guastello, S. J. (2000). Nonlinear dynamics in psychology. Discrete Dynamics in Nature and Society, 6(1), 11–29.
Haken, H., Kelso, J. A. S., & Bunz, H. (1985). A theoretical model of phase transitions in human hand movements. Biological Cybernetics, 51(5), 347–356.
Haykin, S. (1999). Neural networks: A comprehensive foundation. Englewood Cliffs, NJ: Prentice-Hall.
Hélie, S., Chartier, S., & Proulx, R. (2006). Are unsupervised neural networks ignorant? Sizing the effect of environmental distributions on unsupervised learning. Cognitive Systems Research, 7(4), 357–371.
Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786), 504–507.
Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences, 79(8), 2554–2558.
Imai, H., Osana, Y., & Hagiwara, M. (2005). Chaotic analog associative memory. Systems and Computers in Japan, 36(4), 82–90.
Kaneko, K., & Tsuda, I. (2003). Chaotic itinerancy. Chaos, 13(3), 926–936.
Kaplan, D., & Glass, L. (1995). Understanding nonlinear dynamics (1st ed.). New York: Springer.
Kohonen, T. (1972). Correlation matrix memories. IEEE Transactions on Computers, C-21, 353–359.
Korn, H., & Faure, P. (2003). Is there chaos in the brain? II. Experimental evidence and related models. Comptes Rendus Biologies, 326(9), 787–840.
Koronovskii, A. A., Trubetskov, D. I., & Khramov, A. E. (2000). Population dynamics as a process obeying the nonlinear diffusion equation. Doklady Earth Sciences, 372(4), 755–758.
Kosko, B. (1988). Bidirectional associative memories. IEEE Transactions on Systems, Man and Cybernetics, 18(1), 49–60.
Kosko, B. (1990). Unsupervised learning in noise. IEEE Transactions on Neural Networks, 1(1), 44–57.
Lee, R. S. T. (2006). Lee-associator: A chaotic auto-associative network for progressive memory recalling. Neural Networks, 19(5), 644–666.
Lee, G., & Farhat, N. H. (2001). The bifurcating neuron network 1. Neural Networks, 14(1), 115–131.
Mareschal, D., & Thomas, M. S. C. (2006). How computational models help explain the origins of reasoning. IEEE Computational Intelligence Magazine, 1(3), 32–40.
McClelland, J. L., & Rumelhart, D. E. (1986). Parallel distributed processing (Vol. 1). Cambridge, MA: MIT Press.
McCulloch, W. S., & Pitts, W. (1943). A logical calculus of ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, 115–133.
Muezzinoglu, M. K., Guzelis, C., & Zurada, J. M. (2003). A new design method for the complex-valued multistate Hopfield associative memory. IEEE Transactions on Neural Networks, 14(4), 891–899.
Munakata, Y., & McClelland, J. L. (2003). Connectionist models of development. Developmental Science, 6(4), 413–429.
Nowak, A., & Vallacher, R. R. (1998). Dynamical social psychology. New York: Guilford Press.
Osana, Y., & Hagiwara, M. (2000). Knowledge processing system using improved chaotic associative memory. Proceedings of the International Joint Conference on Neural Networks (IJCNN'00), 5, 579–584.
Personnaz, L., Guyon, I., & Dreyfus, G. (1985). Information storage and retrieval in spin-glass like neural networks. Journal de Physique Lettres, 46, L359–L365.


Personnaz, L., Guyon, I., & Dreyfus, G. (1986). Collective computational properties of neural networks: New learning mechanisms. Physical Review A, 34, 4217–4228.
Prinz, J. J., & Barsalou, L. W. (2000). Steering a course for embodied representation. In E. Dietrich & A. Markman (Eds.), Cognitive dynamics: Conceptual change in humans and machines (pp. 51–77). Cambridge: MIT Press.
Renaud, P., Bouchard, S., & Proulx, R. (2002). Behavioral avoidance dynamics in the presence of a virtual spider. IEEE Transactions on Information Technology in Biomedicine, 6(3), 235–243.
Renaud, P., Décarie, J., Gourd, S.-P., Paquin, L.-C., & Bouchard, S. (2003). Eye-tracking in immersive environments: A general methodology to analyze affordance-based interactions from oculomotor dynamics. CyberPsychology & Behavior, 6(5), 519–526.
Renaud, P., Singer, G., & Proulx, R. (2001). Head-tracking fractal dynamics in visually pursuing virtual objects. In W. Sulis & I. Trofimova (Eds.), Nonlinear dynamics in life and social sciences (pp. 333–346). Amsterdam: IOS Press.
Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65, 386–408.
Shen, D., & Cruz, J. B., Jr. (2005). Encoding strategy for maximum noise tolerance bidirectional associative memory. IEEE Transactions on Neural Networks, 16, 293–300.
Skarda, C. A., & Freeman, W. J. (1987). How brains make chaos in order to make sense of the world. Behavioral and Brain Sciences, 10, 161–195.
Spencer, J. P., & Schöner, G. (2003). Bridging the representational gap in the dynamic systems approach to development. Developmental Science, 6(4), 392–412.
Sporns, O., Chialvo, D. R., Kaiser, M., & Hilgetag, C. C. (2004). Organization, development and function of complex brain networks. Trends in Cognitive Sciences, 8(9), 418–425.
Storkey, A. J., & Valabregue, R. (1999). The basins of attraction of a new Hopfield learning rule. Neural Networks, 12(6), 869–876.
Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3, 9–44.
Thelen, E., & Smith, L. (1994). A dynamic systems approach to the development of cognition and action. Cambridge, MA: MIT Press.
Tsuda, I. (2001). Towards an interpretation of dynamic neural activity in terms of chaotic dynamical systems. Behavioral and Brain Sciences, 24(4), 793–847.
van Gelder, T. (1998). The dynamical hypothesis in cognitive science. Behavioral and Brain Sciences, 21(5), 615–665.
Wang, C.-C., Hwang, S.-M., & Lee, J.-P. (1996). Capacity analysis of the asymptotically stable multi-valued exponential bidirectional associative memory. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 26(5), 733–743.
Werner, G. (2001). Computation in nervous systems. Retrieved from http://www.ece.utexas.edu/~werner/Neural_computation.html.
Yang, Z., Lu, W., Harrison, R. G., & França, F. M. G. (2001). Modeling biological rhythmic patterns using asymmetric Hopfield neural networks. In Proceedings of the International Conference on Engineering Applications of Neural Networks (pp. 273–278).
Zanone, P. G., & Kelso, J. A. S. (1997). The coordination dynamics of learning and transfer: Collective and component levels. Journal of Experimental Psychology: Human Perception and Performance, 23(5), 1454–1480.
Zhang, D., & Chen, S. (2003). A novel multi-valued BAM model with improved error-correcting capability. Journal of Electronics, 20(3), 220–223.
Zhao, L., Caceres, J. C. G., Damiance, A. P. G., Jr., & Szu, H. (2006). Chaotic dynamics for multi-value content addressable memory. Neurocomputing, 69(13–15), 1628–1636.
Zurada, J. M., Cloete, I., & van der Poel, E. (1996). Generalized Hopfield networks for associative memories with multi-valued stable states. Neurocomputing, 13(2–4), 135–149.
