Recognition of Sequential Patterns by Nonmonotonic Neural
Networks
Masahiko Morita
Institute of Information Sciences and Electronics, University of Tsukuba, Tsukuba, Japan 305-8573
Satoshi Murakami
Doctoral Program in Engineering, University of Tsukuba, Tsukuba, Japan 305-8573
SUMMARY
A neural network model that recognizes sequential
patterns without expanding them into spatial patterns is
presented. This model forms trajectory attractors in the state
space of a fully recurrent network by a simple learning
algorithm using nonmonotonic dynamics. When a sequen-
tial pattern is input after learning, the network state is
attracted to the corresponding learned trajectory and the
incomplete part of the input pattern is restored in the input
part of the model; at the same time, the output part indicates
which sequential pattern is being input. In addition, this
model can recognize learned patterns correctly even if they
are temporally extended or contracted. © 1999 Scripta
Technica, Syst Comp Jpn, 30(4): 11–19, 1999
Key words: Sequential pattern; pattern recogni-
tion; nonmonotonic neural network; trajectory attractor;
learning algorithm.
1. Introduction
A number of models have been proposed concerning
pattern recognition by neural networks, but they basically
deal with static spatial patterns only. To recognize a sequen-
tial pattern, they have to expand it into a spatial pattern
using multilayer delay units or various time-delay filters [1,
2]. However, these methods of recognition have such prob-
lems as the temporal length of the pattern being limited by
the maximum delay time and difficulty in coping with
temporal expansion and contraction of the pattern.
Moreover, most of the conventional recognition mod-
els use a layered neural network without feedback, and
cannot utilize dynamics of the network, namely, dynamic
interaction between many neurons. It is true that the Elman
model [3], for example, has feedback connections in the
middle layer, but it is based on discrete-time dynamics
requiring synchronization between neurons and sampling
of the input pattern at discrete times, so that it does not make
the best use of the network dynamics.
Another feature of the conventional models is that
they use a complex learning algorithm like the back-propa-
gation algorithm and give priority to improving it. Espe-
cially when there are feedback connections, the learning
algorithm tends to be very complicated. As a result, the
computational cost increases explosively and learning does
not succeed easily, in addition to other problems.
On the other hand, the mechanism of sequential pat-
tern recognition in the human brain seems far different from
that of the above models, for the following reasons. First,
no delay circuit which can convert a long sequence into a
spatial pattern is found in the higher center of the brain. In
contrast, the brain has plenty of feedback connections
Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J81-D-II, No. 7, July 1998, pp. 1679–1688.
among neurons, which suggests that it actively uses the
network dynamics. Moreover, the actual neurons seem to
act neither synchronously with a clock nor according to a
highly complex learning rule.
Of course, we need not use the same mechanism as
the brain if our purpose is only practical application. How-
ever, it seems hardly possible to achieve a neural network
model as excellent as the brain within the conventional
framework. The purpose of this study is to construct a
neural network model that recognizes sequential patterns
based on a framework more similar to the brain than the
conventional one, and to show its ability and possibilities.
An interesting model from such a standpoint has been
proposed by Futami and Hoshimiya [4]. Their model, using
the state transition of a neural network with mutual connec-
tions, can recognize a sequential pattern without converting
it into a spatial pattern. It is also different from the Elman
model in that the neurons act time-continuously and the
transition of the network state is continuous. However, this
model is based on the local representation of information
or "grandmother-cell" type encoding [5]; that is, each neu-
ron represents only a specific part of a specific sequence.
Besides, the network cannot take every possible state, but
very limited states. Accordingly, this model does not effi-
ciently use neurons nor make the most of the merits of
parallel distributed processing.
The root cause of these problems is that the dynamics
of conventional neural networks is not appropriate for
controlling the network to make a smooth and stable state
transition between arbitrary states [6, 7]. It is therefore
necessary to use improved dynamics for the above purpose.
A very simple and effective method of improving
network dynamics is to use nonmonotonic dynamics, that
is, to change the monotonic output function of each neuron
into a nonmonotonic one [8, 9]. As reported recently [6],
the neural network with such dynamics (nonmonotonic
neural network) can form a trajectory attractor along a given
orbit in its state space, using a simple learning algorithm. It
is thus expected that the nonmonotonic neural network can
recognize sequential patterns based on a principle similar
to the brain.
The existing nonmonotonic neural network model,
however, is for memory, where stored sequential patterns
have to be mutually separate in the pattern space so that
confusion in recall may not occur. On the other hand, the
model for sequential pattern recognition should treat the
sequences such that the same spatial pattern appears repeat-
edly in different contexts; otherwise, it is nothing more than
a model of spatial pattern recognition. We thus need some
contrivance so that the network can learn trajectories having
intersection and overlap. This paper presents a concrete
method for that purpose.
Incidentally, neurons with nonmonotonic input–out-
put characteristics (nonmonotonic neurons) are not found
in the brain. In this respect, the nonmonotonic neural net-
work seems quite implausible as a model of the brain.
However, similar dynamics can be achieved by combining
excitatory and inhibitory neurons of monotonic charac-
teristics, and such a model [5, 10, 11] is supported by some
physiological findings. Nevertheless, the use of the non-
monotonic neuron has the advantages of simple architec-
ture, concise description, and ease of analysis. Since the
main purpose of this study is not to construct a biologically
plausible model but to present a new principle of recogni-
tion, we use the nonmonotonic neuron as a component of
the model here.
2. Principle
2.1. Structure and dynamics
This model has a simple structure composed of n
nonmonotonic neurons with fully recurrent connections.
These neurons are divided into three groups, input, middle,
and output parts, though all neurons obey the same dynam-
ics and learning rules. For convenience, we give serial
numbers to the neurons such that neurons 1 to k are the input
part, k + 1 to l are the middle part, and l + 1 to n are the
output part.
The dynamics of the network is expressed by

\tau \frac{du_i}{dt} = -u_i + \sum_{j=1}^{n} w_{ij}\, y_j + z_i ,    (1)

where ui is the potential of neuron i, wij is the synaptic weight from neuron j, zi is the external input, and τ is a time constant. The output yi is given by

y_i = f(u_i) ,    (2)

where f(u) is a nonmonotonic function as shown in Fig. 1. We use, as the nonmonotonic output function,

f(u) = \frac{1 - e^{-cu}}{1 + e^{-cu}} \cdot \frac{1 + \kappa\, e^{c'(|u| - h)}}{1 + e^{c'(|u| - h)}} ,    (3)

where c, c′, h, and κ are constants (we substitute c = 50, c′ = 10, h = 0.5, κ = −1 in the experiments described later).

Fig. 1. Nonmonotonic output function.

Since the polarity of ui is important in nonmonotonic neural networks, we consider xi ≡ sgn(ui) and treat the vector x = (x1, . . . , xn) as the network state, where sgn(u) = 1 for u > 0 and −1 for u ≤ 0.
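For concreteness, the dynamics above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the forward-Euler integration, the step size dt, and the helper names f, euler_step, and state are our own choices; the constants are the values quoted in the text.

```python
import numpy as np

# Constants of the nonmonotonic output function quoted in the text
c, c_prime, h, kappa = 50.0, 10.0, 0.5, -1.0

def f(u):
    """Nonmonotonic output function of Eq. (3), applied element-wise."""
    mono = np.tanh(c * u / 2.0)      # equals (1 - exp(-c*u)) / (1 + exp(-c*u))
    damp = (1.0 + kappa * np.exp(c_prime * (np.abs(u) - h))) \
         / (1.0 + np.exp(c_prime * (np.abs(u) - h)))   # suppresses output for large |u|
    return mono * damp

def euler_step(u, W, z, tau=1.0, dt=0.1):
    """One forward-Euler step of Eq. (1): tau * du_i/dt = -u_i + sum_j w_ij y_j + z_i."""
    return u + dt * (-u + W @ f(u) + z) / tau

def state(u):
    """Network state x = sgn(u), with sgn(u) = -1 for u <= 0 as in the text."""
    return np.where(u > 0, 1, -1)
```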
The network state x at an instant is represented by a point in the state space consisting of 2^n possible states. When x changes, it almost always moves to an adjacent point in the state space, because xi changes asynchronously. Consequently, x leaves a track with time, which we call the trajectory of x. Similarly, we call xin = (x1, . . . , xk), xmid = (xk+1, . . . , xl), and xout = (xl+1, . . . , xn) the states of the input, middle, and output parts, respectively, and consider the trajectories of xin, xmid, and xout in the state space of each part.
2.2. Learning
For simplicity of discussion, we assume for the present that k = l, or that the model has no middle part; the case of l > k will be treated in section 4, but the basic principle is the same.

Let s1(t), . . . , sm(t) be m sequential patterns to be recognized, where sμ = (s1μ, . . . , skμ). We assume that the elements siμ are ±1 and change asynchronously. Then we can consider m trajectories corresponding to sμ in the k-dimensional pattern space regarded in the same light as the state space of the input part. These trajectories may intersect or overlap with one another.

We perform learning so that the state xout of the output part becomes a target state Sμ = (sk+1μ, . . . , snμ) when the sequential pattern sμ is input into the input part. The learning algorithm is as follows.
First, we create a learning signal vector r = (r1, . . . , rn) with binary elements (ri = ±1). The learning signal rin corresponding to the input part is sμ, that is, ri = siμ for i ≤ k. The learning signal rout corresponding to the output part is a spatiotemporal pattern changing gradually from a static pattern O to Sμ, where we assume O = (−1, . . . , −1) without losing generality. Like x, r is an n-dimensional binary vector, and it is regarded as moving in the state space of the network from (sμ(0), O) ≡ (s1μ(0), . . . , skμ(0), −1, . . . , −1) to (sμ(T), Sμ), where T is the temporal length of sμ.

Next, we give an initial state x = (sμ(0), O) and input r in the form zi = λiri to the network while it acts according to Eq. (1). Here, λi denotes the input intensity of ri, which is a constant λin for the input part (i ≤ k) and a variable λout decreasing with the process of learning for the output part (i > k).
We simultaneously modify all synaptic weights wij according to

\tau' \frac{dw_{ij}}{dt} = -w_{ij} + \alpha\, r_i y_j ,    (4)

where τ′ denotes a time constant of learning (τ′ ≫ τ) and α is a learning coefficient. Since learning performance is better when α is a decreasing function of |ui| [6], we put α = α′xiyi, α′ being a positive constant.

While r is moving in the state space, x follows slightly behind, leaving its track as a gutter in the energy landscape of the network [6]. When r reaches the end, we keep r = (sμ(T), Sμ) and continue modifying wij until x comes close enough to r.

We apply this procedure to all μ, and repeat it over a number of cycles, gradually decreasing λout. If xout can reach a state near Sμ even when λout = 0, then the learning is completed.
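A schematic sketch of one learning pass may help fix ideas. It assumes the reconstructed rule (4) above, reuses f from the earlier sketch, and leaves to the caller both the preparation of the learning-signal trajectory r(t) (input part = sμ(t), output part moving gradually from O to Sμ) and the gradual decrease of λout over cycles; all function and parameter names are illustrative, not the authors' code.

```python
import numpy as np

def train_on_signal(u, W, r_traj, k, lam_in=0.2, lam_out=0.2,
                    alpha_p=2.0, tau=1.0, tau_p=5000.0, dt=0.1):
    """One learning pass of Sec. 2.2: the learning signal r(t) is applied as
    external input z_i = lambda_i * r_i while the weights follow Eq. (4)
    with alpha = alpha' * x_i * y_i.  r_traj has shape (steps, n)."""
    n = W.shape[0]
    lam = np.concatenate([np.full(k, lam_in), np.full(n - k, lam_out)])
    for r in r_traj:
        y = f(u)                                   # outputs, Eq. (2)
        x = np.where(u > 0, 1, -1)                 # network state
        u = u + dt * (-u + W @ y + lam * r) / tau  # Eq. (1) with z = lambda * r
        alpha = alpha_p * x * y                    # coefficient decreasing in |u_i|
        W = W + dt * (-W + np.outer(alpha * r, y)) / tau_p   # Eq. (4)
    return u, W
```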
2.3. Recognition
By the above learning, the trajectories of x, which are roughly the same as those of r, become attractors of the dynamical system formed by the nonmonotonic neural network [5, 6]. Accordingly, there exist m trajectory attractors in the state space after learning. Using this, the sequential patterns sμ are recognized in the following way.

Let us assume that s′ = (s1′, . . . , sk′) is an input pattern made by transforming (e.g., adding noise to) s1 and that si′ is 1, −1, or 0. We input it to the model in the form zi = λinsi′ (i ≤ k). To the output part, we give the initial state O and input nothing (zi = 0) thereafter.

When s′ is input in this way, x is attracted to the nearest trajectory attractor, which is thought to correspond to s1. Consequently, it is expected that the output state xout becomes nearly equal to S1 when we finish inputting s′.
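Recognition thus amounts to running the same dynamics with external input restricted to the input part. The sketch below is again illustrative rather than the authors' code; it assumes no middle part (k = l), an arbitrary scaling of the initial potentials, and reuses f from the sketch in section 2.1.

```python
import numpy as np

def recognize(W, s_prime_traj, k, lam_in=0.2, tau=1.0, dt=0.1):
    """Recognition of Sec. 2.3: the (possibly noisy or partial) pattern s'(t)
    drives only the input part; zero elements of s' contribute no input.
    The output part starts from O = (-1,...,-1) and is read out at the end.
    s_prime_traj has shape (steps, k)."""
    n = W.shape[0]
    u = 0.5 * np.concatenate([s_prime_traj[0], -np.ones(n - k)])  # initial potentials (illustrative)
    for s_p in s_prime_traj:
        z = np.zeros(n)
        z[:k] = lam_in * s_p                   # external input to the input part only
        u = u + dt * (-u + W @ f(u) + z) / tau
    return np.where(u[k:] > 0, 1, -1)          # x_out, to be compared with the targets S_mu
```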
3. Behavior of the Model
To confirm the above principle, computer simulations were performed using a network with 300 input and 200 output neurons (k = l = 300, n = 500) [12].

Four sequential patterns s1 = {ABC}40τ, s2 = {ABD}40τ, s3 = {DAC}40τ, and s4 = {DBC}40τ were used in the experiment, where A, B, C, and D are k-dimensional binary vectors selected at random; {ABC} represents the shortest path from A via B to C, and {ABC}T denotes a spatiotemporal pattern whose trajectory is {ABC} and whose temporal length is T. The target states S1, S2, S3, and S4 were selected at random, but S1 and S2 were selected such that they have a similarity of 0.5, where similarity is defined by the direction cosine between two vectors. The reason we make S1 and S2 similar is described below.
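The patterns used here can be generated mechanically: a shortest path {AB} flips the elements on which A and B differ, one at a time in random order, and the similarity is the direction cosine. The following sketch is our own reconstruction of that construction, not the authors' code; the number of time steps, the random seed, and the helper names are arbitrary.

```python
import numpy as np

def similarity(a, b):
    """Similarity = direction cosine between two vectors."""
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def shortest_path_pattern(A, B, steps, seed=0):
    """Spatiotemporal pattern {AB}: elements where A and B differ are flipped
    one by one in random order, so the pattern moves along the shortest path
    from A to B.  Returns an array of shape (steps, k)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(np.flatnonzero(A != B))
    traj = np.empty((steps, A.size))
    for t in range(steps):
        s = A.copy()
        n_flip = round(len(order) * t / (steps - 1))   # flip a growing prefix of 'order'
        s[order[:n_flip]] = B[order[:n_flip]]
        traj[t] = s
    return traj

# Example: a pattern shaped like s1 = {ABC}_T, i.e. {AB} followed by {BC}
k, steps = 300, 400
rng = np.random.default_rng(1)
A, B, C = (rng.choice([-1, 1], size=k) for _ in range(3))
s1 = np.vstack([shortest_path_pattern(A, B, steps // 2),
                shortest_path_pattern(B, C, steps // 2)])
print(similarity(s1[steps // 4], A), similarity(s1[steps // 4], B))  # both about 0.5 midway
```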
After finishing 10 cycles of learning, we input various patterns and examined the behavior of the model. The parameters were α′ = 2, τ′ = 5000τ, and λin = 0.2; λout was decreased by degrees from 0.2 to 0.
3.1. Recognition process
Figure 2 shows the process of recognition when part of s1 was input. Specifically, zi = 0.2si1 for half of the elements of the input part, chosen at random, and zi = 0 for the other half; at t > 40τ, zi = 0 for all i. Similarities (direction cosines) between xout and Sμ, denoted by dout(Sμ), are plotted in the top graph, and those between xin and A to D are plotted in the bottom one. The abscissa is time scaled by the time constant τ.
The similarity din(A) between xin and A increases very rapidly from the initial value of 0.5 to more than 0.9 and then decreases gradually. In this process, din(A) + din(B) is constant and nearly equal to 1, which indicates that xin is moving along the path {AB}. We also see that xin ≈ B at t = 20τ and xin ≈ C at t > 40τ, and that the whole of s1 is restored in the input part.
On the other hand, the similarity dout(S1) between xout and S1 (thick line) increases consistently with time and xout ≈ S1 at t > 40τ. This means that the model has correctly recognized the input pattern as s1. We should note that the trajectory {ABC} of s1 overlaps everywhere with other trajectories and thus s1 cannot be distinguished by the instantaneous input pattern at any moment.

We should also note that dout(S1) and dout(S2) rise in the same manner while t < 25τ and rapidly separate at t ≈ 30τ. This indicates that xout moves in the middle of trajectories {OS1} and {OS2} at first and then approaches {OS1} when xin approaches C after passing through B.
This process is schematically shown in Fig. 3, where the n-dimensional state space of the network is represented three-dimensionally. Panels (a) and (b) depict the same thing from different angles. The origin represents the initial state x = (A, O), meaning xin = A and xout = O. The thick line represents the trajectory of x and the broken lines represent the trajectories r1 and r2 of the learning signal for s1 and s2. The gray lines represent the projection onto the x1–x2 or x3–x4 plane, that is, the trajectories in the state space of the input or output part.
If we observe only the input part, the two trajectories
r1 and r2 overlap in their first half and then diverge in their
second. On the other hand, the trajectories in the output part
diverge at the starting point. Thus, as a whole, r1 and r2 are
separate but rather close in the first half.
Fig. 2. A process of recognition.

Fig. 3. Schematic of the recognition process.

Nonmonotonic neural networks have the property that when some attractors exist nearby, states lying between them are comparatively stable [6, 7]. In other words, the energy landscape of the network has a "flat" bottom between neighboring attractors because the attractive force is smaller near attractors (note that in the case of conventional neural networks, the energy landscape is "sharp" at attractors). Consequently, while xin is moving along the common path {AB}, xout moves in the middle of {OS1} and {OS2}.
As xin departs from B toward C, the distance from x to r2 increases rapidly, whereas that to r1 does not increase to such an extent, so that x cannot remain in between the two. As a result, x is attracted to r1 and xout approaches S1.

We can see from the above discussion why a spatiotemporal pattern {OSμ}T, rather than a static pattern Sμ, should be used for the learning signal rout. That is, if rout is a static pattern, r1 and r2 are far apart over the entire path and thus x is attracted to either of the two soon after starting from the origin; once x is attracted to r1, for example, x can hardly transfer to r2 even if xin goes along {BD} afterwards.
By similar reasoning, if the trajectories of s1 and s2
are identical or very similar in their first section, the corre-
sponding target states S1 and S2 should be similar so that the
distance between r1 and r2 is decreased. Then there is less
possibility that x is attracted to r1 or r2 before sufficient
information is given to the model, and even if it occurs, x
can transfer to the correct trajectory more easily.
3.2. Recognizing patterns with blank sections
The trajectory attractor formed by the above learning
not only has a strong surrounding flow that runs into it, but
also has a gentle flow that moves as fast as r along the
trajectory [6]. Accordingly, this model can recognize a
learned sequential pattern even if the input pattern has some
blank sections.
As an example, the first quarter of s2 (from A to the
midpoint between A and B) was input for 0 ≤ t ≤ 10τ and then the input was cut off. Figure 4 shows the behavior of the model in the same way as shown in Fig. 2, but the thick line represents dout(S2). We see that the movement of x is roughly the same as that in Fig. 2 for 10τ < t ≤ 20τ, although there is no external input. However, when xin passed through B and slightly approached C and D, x stopped. This is thought to be an equilibrium state in which the attracting forces from multiple attractors balance out.

Then the third quarter of s2 (from B to the midpoint between B and D) was input for 30τ ≤ t ≤ 40τ, and the input was cut off again. We see that x is attracted to r2 and finally comes close to (D, S2).

In this way, this model complements the blank sections of the input pattern and recognizes it correctly, provided that no other trajectory attractors exist near the blank sections.

Fig. 4. Behavior when the sequence with blank sections was input.
3.3. Recognizing patterns with temporal
extension and contraction
As described above, x basically moves at the same
pace as that in learning after being attracted to the learned
trajectory. However, when sμ is input at a different pace, xin follows the input pattern unless the pace is too fast; then xout keeps pace with xin and approaches Sμ. That is, the model can recognize a temporally extended or contracted pattern of sμ.

Figure 5a shows the recognition process when s3 = {DAC}40τ was input at one-fifth the pace, and Fig. 5b shows the case when s4 = {DBC}40τ was input at double the
pace. We see that the input patterns are correctly recog-
nized, though the transition of xin is slightly delayed in the
latter case.
In this connection, if the input pace is still faster, the
delay of xin increases so much that the model fails in
recognition. Also, as the input pattern contains more noise,
the model is less tolerant to temporal extension and con-
traction of the pattern.
4. Introduction of the Middle Part
In the example of the previous section, we treated the
case where the input spatiotemporal patterns have rather
simple trajectories, and thus we may associate xin (≈ sμ) directly with xout, which has a short straight trajectory from O to Sμ. If sμ are long and intricately intertwined, however, the above method does not work well because the difference in
trajectory structure between rin and rout is too large.
We can reduce the difference by increasing the di-
mension of the output part and making the trajectory of rout
long and curved. However, this causes another problem,
that we cannot know the result of recognition until xout
comes close enough to the end point Sμ of the trajectory of
rout, whereas in the above model, we can easily distinguish
(e.g., by a perceptron) the destination of xout when it is
attracted to one of the trajectories.
To solve this problem, we will introduce hidden
neurons in the middle part, leaving rout unchanged. That is,
we expect that we can associate xin with xout through the
middle part where xmid draws an intermediary trajectory.
4.1. Generating the learning signal for the
middle part
The structure and dynamics of the network with
hidden neurons were previously described. The point is
how to generate the learning signal rmid for the middle part.
We often use error signals obtained by the back-
propagation algorithm for training hidden neurons. This
method, however, has problems as described in section 1
and is unsuitable for the present model. It is also undesirable
to generate rmid off-line with a complex method. We thus
choose a method of supplementing the above model with a
network which generates rmid from rin and rout in real time.
Of course, the supplementary network should be as simple
as possible and not require complex learning.
The requirements for rmid are as follows: (1) its tra-
jectories may be curved, but should be shorter and less
intertwined than those of rin; (2) it should reflect the change
in rin and rout to some extent; (3) its starting point should be
near the initial state O′ = (−1, . . . , −1) of the middle part.
Since all of these properties lie in between those of rin and
rout, it is thought that a desired rmid is obtained by mixing
rin and rout using a randomly connected network.
Based on this idea, we constructed the model shown
in Fig. 6. The lower half of the figure is the supplementary
network used only in learning. This network consists of
ordinary binary neurons of the same number l − k as the middle part.

Fig. 5. Behavior when the sequences with temporal extension and contraction were input.

Fig. 6. Structure of the model with the middle part.

Each neuron i (i = k + 1, . . . , l) receives rin and rout through synaptic weights aij and bij, respectively, and its output ri is given to the corresponding hidden neuron as
the learning signal. It also has a self connection with a positive strength ρ, which prevents rmid from sharp fluctuations and makes the trajectory of rmid smooth. In mathematical terms,

r_i(t) = \operatorname{sgn}\!\left( \sum_{j=1}^{k} a_{ij}\, r_j(t) + \sum_{j=l+1}^{n} b_{ij}\, r_j(t) + \rho\, r_i(t - \Delta t) \right) ,    (5)

where k + 1 ≤ i ≤ l, Δt is a short update interval, and we put ri = −1 for t < 0.
The synaptic weights aij and bij are randomly determined, but the average of bij should be positive so that rmid may be close to O′ in the initial state when rout = O. In the following experiments, aij are normally distributed random numbers with mean 0 and variance 1/k, bij are those with mean 1/(n − l) and variance 1/(n − l), and ρ = 1; these values are determined by several trials, and are not necessarily optimal.
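Under this reading of the weight statistics (mean 1/(n − l) and variance 1/(n − l) for bij) and of the reconstructed update (5), the supplementary network can be sketched as follows; the closure-style interface and all names are illustrative, not the authors' implementation.

```python
import numpy as np

def make_rmid_generator(k, l, n, rho=1.0, seed=0):
    """Supplementary network of Sec. 4.1: l - k binary neurons mix r_in and
    r_out through fixed random weights a_ij, b_ij plus a self connection rho,
    producing the learning signal r_mid for the middle part."""
    rng = np.random.default_rng(seed)
    m = l - k                                         # number of middle-part neurons
    a = rng.normal(0.0, np.sqrt(1.0 / k), size=(m, k))             # mean 0, variance 1/k
    b = rng.normal(1.0 / (n - l), np.sqrt(1.0 / (n - l)), size=(m, n - l))
    r_mid = -np.ones(m)                               # r_i = -1 for t < 0

    def step(r_in, r_out):
        """One update of Eq. (5) from the current r_in (length k) and r_out (length n - l)."""
        nonlocal r_mid
        r_mid = np.where(a @ r_in + b @ r_out + rho * r_mid > 0, 1.0, -1.0)
        return r_mid

    return step
```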
4.2. Computer simulation
Computer simulations were performed on the model
with 400 input, 600 hidden, and 200 output neurons (k =
400, l = 1000, n = 1200).
We prepared 21 sequential patterns s1–s21 as shown in Table 1. These patterns were generated by connecting 400-dimensional binary vectors A–G in random order. On average, A–G appear 15 times each; also, it is calculated that a unit section such as {AB} and {AC} appears twice (actually, the frequency of appearance was distributed from 0 to 6). The temporal length T was set to 80τ for all sμ.
The target states Sμ of the output part were selected randomly from the vectors orthogonal to O, but if the trajectories of sμ and sν (μ ≠ ν) are identical in the first p quarters, the corresponding Sμ and Sν were selected such that their similarity is p/4.
Using rin = sμ, rout = {OSμ}T, and rmid generated in the above way, the model was trained 15 times for each sequential pattern. As learning proceeded, the input intensity λmid of rmid was decreased gradually from 0.2 to 0, as was λout. The parameters for learning are the same as those in section 3 except that τ′ = 40000τ.
Figure 7 shows the recognition process when {C′E′A′C′A′}T was input after learning; A′, C′, and E′ are vectors obtained by randomly flipping 100 components (the noise ratio is 50%) of A, C, and E, respectively. The top and bottom graphs indicate the time courses of dout(Sμ) and din(A)–din(G), respectively. The middle graph indicates the state transition of the middle part in a different way, where similarities between xmid(t) and rmidμ(t) (used in learning sμ) at time t are plotted.
We see that the model recognized the input pattern as s8 = {CEACA}T. In this process, xmid moves roughly along the trajectory of rmid8, departing gradually from those of the other rmidμ. It should be noted that s8 overlaps with s7 = {CEABD}T in their first half and thus x at first moves along an intermediate trajectory between r7 and r8.
In the same way, for 14 out of 21 sμ, patterns with a
noise ratio of 50% were correctly recognized. Recognition
failed for the other 7, but all of them were correctly recog-
nized when the noise ratio was 25%.
Fig. 7. A recognition process of the model with the middle part.

Table 1. Sequential patterns for the experiment
Fig. 8. Behavior when the input pattern has vague sections.

Figure 8 shows the behavior when a vague pattern {(AD)E′(AG)F′(AC)}T was input. Here, E′ and F′ are vectors containing 50% of random noise, and (AD) denotes a vector lying midway between A and D; (AG) and (AC) are also middle vectors. Note that (AD) is quite different from A′ and D′.
We see that the model recognized this pattern as
s3 = {AEGFA}T, which is overall the most similar. We also see
that G and A are restored in the input part when (AG) and
(AC) are being input, respectively, whereas an intermediate
pattern between A and D appears when (AD) is input
initially.
Generally, however, this kind of vague pattern is
difficult to recognize, and in many cases, xout does not reach
any target state, so that recognition fails. One of the causes
is that in this experiment, vectors A and B are nearly
orthogonal and vector (AB) is very distant from them. The
performance will thus be better if similar objects that can
be confused with each other are encoded into similar vec-
tors; however, such improvement remains for future study.
5. Conclusions
We have described a model that can recognize se-
quential patterns by the use of trajectory attractors formed
in a nonmonotonic neural network. Distinctive features of
this model are enumerated in the following.
1. The input sequence is not expanded into a spatial
pattern, so that no delay elements are necessary and the
length of the sequence is not restricted.
2. The network changes its state continuously,
based on fully distributed representation. This enables such
flexible recognition as shown above.
3. The network has simple architecture. The learn-
ing algorithm is also simple and does not require many
iterations of input.
4. The input sequential pattern is not only recog-
nized but restored, that is, defects of the pattern are repaired
in both spatial and temporal dimensions.
From these features, the model seems much closer to
the brain in its working principle than the existing models
of sequential pattern recognition, and thus we expect that it
has a great potential in brain modeling as well as techno-
logical application.
However, the model described in this paper is a basic
one and there remain many subjects for future study. For
example, we should proceed with experimental and theo-
retical analysis of the properties of the model. Also, this
model has much room for further development; for exam-
ple, we may possibly treat complex sequences more effi-
ciently by introducing hierarchical structure into the middle
part. Improvement of learning signal generation and appli-
cation to speech recognition are also future subjects.
Acknowledgments. The authors thank Mr.
Hidekazu Kimura (currently with NTT Data Co.) for pre-
liminary experiments in this study. Part of this study was
supported by a Grant-in-Aid for Scientific Research
(#08279105, #08780328 and #0978031) from the Ministry
of Education of Japan.
REFERENCES
1. Waibel A. Modular construction of time-delay networks for speech recognition. Neural Computation 1989;1:328–339.
2. Tank DW, Hopfield JJ. Neural computation by concentrating information in time. Proc Natl Acad Sci USA 1987;84:1896–1900.
3. Elman JL. Finding structure in time. Cognitive Sci 1990;14:179–211.
4. Futami R, Hoshimiya N. A neural sequence identification network (ANSIN) model. Trans IEICE 1988;J71-D:2181–2190.
5. Morita M. Neural network models of learning and memory. In: Toyama T, Sugie N, editors. Brain and computational theory. Asakura Publishing; 1997. p 54–69.
6. Morita M. Memory and learning of sequential patterns by nonmonotone neural networks. Neural Networks 1996;9:1477–1489.
7. Morita M. Associative memory of sequential patterns using nonmonotone dynamics. Trans IEICE 1995;J78-D-II:678–688.
8. Morita M, Yoshizawa S, Nakano K. Analysis and improvement of the dynamics of autocorrelation associative memory. Trans IEICE 1990;J73-D-II:232–242.
9. Morita M. Associative memory with nonmonotone dynamics. Neural Networks 1993;6:115–126.
10. Morita M. A neural network model of the dynamics of a short-term memory system in the temporal cortex. Trans IEICE 1991;J74-D-II:54–63.
11. Morita M. Computational study on the neural mechanism of sequential pattern memory. Cognitive Brain Res 1996;5:137–146.
12. Morita M, Murakami S. Recognition of spatiotemporal patterns by nonmonotone neural networks. Proc ICONIP'97, vol 1, p 6–9.
AUTHORS (from left to right)
Masahiko Morita graduated in 1986 from the Department of Mathematical Engineering and Information Physics, Faculty
of Engineering, University of Tokyo, where in 1991 he obtained a D.Eng. degree. He has been an assistant professor at the
University of Tsukuba since 1992. He is engaged in research on biological information processing and neural networks. He
received a Research Award and a Paper Award from the Japanese Neural Network Society in 1993 and 1994, respectively.
Satoshi Murakami graduated in 1995 from the College of Engineering Systems, University of Tsukuba. He is currently
in the Doctoral Program in Engineering there, studying neural information processing.