IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 25, NO. 9, SEPTEMBER 2014

Complex-Valued Recurrent Correlation Neural Networks

Marcos Eduardo Valle

Abstract— In this paper, we generalize the bipolar recurrent correlation neural networks (RCNNs) of Chiueh and Goodman for patterns whose components are in the complex unit circle. The novel networks, referred to as complex-valued RCNNs (CV-RCNNs), are characterized by a possibly nonlinear function, which is applied on the real part of the scalar product of the current state and the original patterns. We show that the CV-RCNNs always converge to a stationary state. Thus, they have potential application as associative memories. In this context, we provide sufficient conditions for the retrieval of a memorized vector. Furthermore, computational experiments concerning the reconstruction of corrupted grayscale images reveal that certain CV-RCNNs exhibit an excellent noise tolerance.

Index Terms— Complex-valued neural network, grayscale image retrieval, high-capacity memory, neural associative memories (AMs), noise tolerance, recurrent neural networks.

I. INTRODUCTION

ASSOCIATIVE memories (AMs) are mathematical constructs motivated by the human brain's ability to store and recall information [1]–[3]. Similar to the biological neural network, an AM should be able to retrieve memorized information from a possibly incomplete or corrupted item. In mathematical terms, an AM is designed for the storage of a finite set of vectors {u^1, u^2, ..., u^p}, called the fundamental memory set. Subsequently, the AM model is expected to retrieve a memorized vector u^ξ in response to the presentation of a noisy or incomplete version of u^ξ. Applications of AMs cover, for instance, pattern classification and recognition [4], [5], optimization [6], computer vision [7], prediction [8], and language understanding [9].

Manuscript received February 7, 2014; revised June 13, 2014; accepted July 15, 2014. Date of publication July 28, 2014; date of current version August 15, 2014. This work was supported in part by the Fundo de Apoio ao Ensino, à Pesquisa e à Extensão under Grant 519.292, in part by the Fundação de Amparo à Pesquisa do Estado de São Paulo under Grant 2013/12310-4, and in part by the Conselho Nacional de Desenvolvimento Científico e Tecnológico under Grant 304240/2011-7.

The author is with the Department of Applied Mathematics, University of Campinas, Campinas 13083-859, Brazil (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TNNLS.2014.2341013

The Hopfield neural network is one of the most widely known neural networks used to realize an AM [10]. The Hopfield network is implemented by a recursive single-layer neural network constituted of McCulloch-Pitts threshold neurons. Moreover, it has many attractive features, including ease of implementation in hardware, characterization in terms of an energy function, and a variety of applications [11], [12]. On the downside, the Hopfield network suffers from a low absolute storage capacity of approximately 0.15n items, where n is the length of the vectors [13]. A simple but significant improvement in the storage capacity of the Hopfield network is achieved by the recurrent correlation neural networks (RCNNs) introduced in [14].

In some sense, the RCNNs generalize the Hopfield network by adding a layer of nodes that compute the inner product between the current state and the fundamental memories. The activation function of the neurons in this layer characterizes the RCNN. For example, a variation of the original Hopfield network is obtained by considering a linear activation function. Furthermore, the storage capacity of certain RCNNs scales exponentially with the length of the vectors for some activation functions [14]. Besides the very high storage capacity, these RCNNs exhibit excellent error correction capabilities [14]. Some RCNNs are also closely related to certain types of Bayesian processes [15] and to kernel-based AMs such as the models of [16] and [17].

Like the Hopfield network, the original RCNNs are designed for bipolar vectors. However, many applications of AMs, including the retrieval of grayscale images in the presence of noise, require the storage and recall of multistate or complex-valued vectors [18].

As far as we know, the first significant generalization of the Hopfield network using complex numbers in the unit circle was proposed in [19]. In a few words, the authors of [19] considered a single-layer complex-valued dynamic neural network (CV-DNN) whose neurons evaluate the complex-signum function on the activation potential and whose synaptic weights are determined using a complex-valued Hebbian learning rule. However, similar to the Hopfield network, the implementation of the CV-DNN of [19] as an AM is hampered by its low storage capacity. Therefore, several researchers have developed improved CV-DNN models to overcome this limitation. For instance, Müezzinoğlu et al. [20] and Lee [21] proposed learning rules which improve the storage capacity of the CV-DNN based on the complex-signum function. Kuroe et al. [22], Kuroe and Taniguchi [23], as well as Tanaka and Aihara [24] proposed single-layer CV-DNNs in which the complex-signum function is replaced by other nonlinear activation functions. Suzuki et al. [25] employed strong bias terms to reduce the number of spurious memories, which are undesired fixed points of the network, of the CV-DNN of [19].

Apart from the contributions mentioned in the previous paragraph, we recently generalized the original bipolar RCNNs of [14] to complex-valued vectors [26]. In contrast to the single-layer CV-DNNs, the complex-valued RCNNs (CV-RCNNs) are two-layer dynamic networks. Furthermore, similar to the original bipolar model, preliminary computational experiments revealed that some CV-RCNNs exhibit a high storage capacity and an excellent noise tolerance. The CV-RCNNs, as well as their implementation as AMs, are further investigated in this paper.

2162-237X © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This paper is organized as follows. Section II presents some basic concepts on AMs. A brief review on the bipolar RCNNs is given in Section III. The CV-RCNNs are introduced in Section IV. Section V provides some computational experiments concerning the storage capacity and noise tolerance of the novel memories. This section also presents some results concerning the retrieval of grayscale images in the presence of noise. This paper finishes with the concluding remarks in Section VI and the Appendix, which contains the proofs of the main results.

II. SOME BASIC CONCEPTS ON RECURRENT NEURAL NETWORKS AND AMS

First of all, recall that a recurrent network can be used to define a sequence {x(t)} using the difference equation

$$x_j(t+1) = \psi_j\big(x(t)\big) \quad \forall j = 1, \ldots, n \tag{1}$$

where x(t) = [x_1(t), x_2(t), ..., x_n(t)]^T is the state vector at time t ∈ {0, 1, 2, ...} and ψ_j is a real or complex-valued function for any index j = 1, ..., n. A vector u is called a stationary state or a fixed point of the network if u_j = ψ_j(u) for all j = 1, ..., n.

A real-valued function E on the state space is called a Lyapunov or energy function of the recurrent network given by (1) if it is bounded from below and E(x(t+1)) < E(x(t)) whenever x(t+1) ≠ x(t). The existence of an energy function guarantees that the sequence {x(t)}_{t≥0} defined by (1) converges to a stationary state for any initial state x(0) [27]. Furthermore, for any input x, the network yields a mapping M defined by

$$M(x) = \lim_{t \to \infty} x(t) \tag{2}$$

where {x(t)}_{t≥0} is the sequence given by (1) with the initial state x(0) = x. It is not surprising that the mapping M may be used to realize an AM [10].

An AM is a system that allows for the storage and recall of a set of vectors {u^1, u^2, ..., u^p}, called the fundamental memory set [1], [2], [11]. Formally, a (deterministic) AM corresponds to a mapping M, called an associative mapping, such that M(u^ξ) = u^ξ for all ξ = 1, ..., p [3]. Furthermore, an AM should exhibit some noise tolerance or error correction capability.

In this paper, we are concerned with AM models described by recurrent neural networks, i.e., the associative mapping M is given by the limit (2) of the sequence defined by (1). In such situations, the fundamental memories u^1, u^2, ..., u^p must be fixed points of the network. For the purposes of this paper, however, the equality ψ(u^ξ) = u^ξ is too strict. Specifically, even if ψ(u^ξ) is sufficiently close to u^ξ, the equality ψ(u^ξ) = u^ξ may not be satisfied. Hence, we shall consider the following definition.

Fig. 1. Network topology of a recurrent correlation AM. (a) Blue: bipolar. (b) Red: complex-valued.

Definition 1: Given a fundamental memory set {u^1, u^2, ..., u^p} and a small τ > 0, we say that a recurrent neural network described by (1) realizes a τ-AM if, for any ξ ∈ {1, ..., p}, the inequality

$$\|\psi(u^\xi) - u^\xi\| \le \tau \tag{3}$$

holds true for an appropriate vector norm ‖·‖.

In other words, we replaced the identity ψ(u^ξ) = u^ξ by an inequality involving the norm of the difference between the fundamental memory u^ξ and ψ(u^ξ). We believe such a substitution is plausible since both hardware and software implementations of AMs are subject to noise or errors. Note that an AM, with the equality constraint ψ(u^ξ) = u^ξ, is a τ-AM for any τ > 0. Conversely, a τ-AM designed for the storage and recall of discrete-valued vectors, including bipolar or multistate vectors, is an AM if τ is sufficiently small.
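To make Definition 1 concrete, the short sketch below (our illustration, not code from the paper; the function name realizes_tau_am is hypothetical) checks inequality (3) with the ∞-norm for a candidate associative mapping psi, assuming the fundamental memories are complex NumPy arrays.

import numpy as np

def realizes_tau_am(psi, memories, tau):
    """Check Definition 1: ||psi(u^xi) - u^xi||_inf <= tau for every
    fundamental memory u^xi. `psi` maps a complex vector to a complex
    vector; `memories` is a list of complex NumPy arrays."""
    return all(np.max(np.abs(psi(u) - u)) <= tau for u in memories)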

III. BRIEF REVIEW ON THE BIPOLAR RCNNS

A bipolar RCNN is implemented by the fully connected two-layer recurrent neural network shown in Fig. 1(a) [12], [14]. The first layer computes the inner product between the current state and the fundamental memories, followed by a possibly nonlinear function f, which often emphasizes the induced local field. Throughout this paper, a subscript is used to indicate a particular function f. The second layer, composed of McCulloch-Pitts threshold neurons, yields the sign of a weighted sum of the fundamental memories as the next state of the RCNN.

Formally, let B denote the bipolar set {−1, +1} and consider a set U = {u^1, u^2, ..., u^p} ⊆ B^n, where each u^ξ = [u^ξ_1, u^ξ_2, ..., u^ξ_n]^T is an n-bit vector. Given an input x(0) = [x_1(0), x_2(0), ..., x_n(0)]^T ∈ B^n, the RCNN recursively defines the following sequence of n-bit vectors for t ≥ 0:

$$x_j(t+1) = \operatorname{sgn}\left( \sum_{\xi=1}^{p} w_\xi(t)\, u^\xi_j \right) \quad \forall j = 1, \ldots, n \tag{4}$$


where the dynamic weights w_ξ(t) are given by the following equation for some continuous and monotone nondecreasing function f : [−n, n] → R:

$$w_\xi(t) = f\big( \langle x(t), u^\xi \rangle \big) \quad \forall \xi = 1, \ldots, p. \tag{5}$$

We would like to point out that, as in the Hopfield network, the sgn function in (4) is evaluated as

$$\operatorname{sgn}\big(v_j(t)\big) = \begin{cases} +1, & v_j(t) > 0, \\ x_j(t), & v_j(t) = 0, \\ -1, & v_j(t) < 0, \end{cases} \tag{6}$$

where $v_j(t) = \sum_{\xi=1}^{p} w_\xi(t)\, u^\xi_j$ is the activation potential of the jth output neuron at iteration t.

Recall that the inner product between two possibly complex vectors x = [x_1, ..., x_n]^T ∈ C^n and y = [y_1, ..., y_n]^T ∈ C^n is given by

$$\langle x, y \rangle = y^* x = \sum_{j=1}^{n} \bar{y}_j x_j \tag{7}$$

where $\bar{y}_j$ denotes the complex conjugate of y_j and y^* denotes the conjugate transpose of y. In particular, the inner product of n-bit patterns and the Hamming distance d_H are related by ⟨x, y⟩ = n − 2 d_H(x, y) for all x, y ∈ B^n. Also, observe that −n ≤ ⟨x, y⟩ ≤ n for any n-bit vectors x, y. Thus, the domain of f is the closed interval [−n, n] ⊆ R.

Alternatively, (4) is written as follows in matrix-vector notation:

$$x(t+1) = \operatorname{sgn}\big( U w(t) \big), \qquad w(t) = f\big( U^T x(t) \big) \tag{8}$$

where U = [u^1, ..., u^p] ∈ B^{n×p} is the matrix whose columns correspond to the vectors in U, U^T denotes the transpose of U, and both sgn and f are evaluated in a component-wise manner. We would like to recall that the network given by (4) yields a convergent sequence {x(t)}_{t≥0} for any input x(0) in both synchronous and asynchronous update modes [14].
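As an illustration of (4)–(8), the following minimal sketch performs one synchronous step of a bipolar RCNN, assuming NumPy and a user-supplied weight function f; the helper name rcnn_step is ours, not from the paper.

import numpy as np

def rcnn_step(x, U, f):
    """One synchronous step of a bipolar RCNN, cf. (8).
    x : current bipolar state, shape (n,), entries in {-1, +1}
    U : matrix whose columns are the fundamental memories, shape (n, p)
    f : continuous nondecreasing function applied to the inner products."""
    w = f(U.T @ x)              # dynamic weights, cf. (5)
    v = U @ w                   # activation potentials of the output neurons
    x_next = np.sign(v)         # threshold neurons, cf. (4)
    x_next[v == 0] = x[v == 0]  # keep the previous state when the potential is zero, cf. (6)
    return x_next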

Example 1: Given a set U = {u^ξ : ξ = 1, ..., p} ⊆ B^n and an input x(0) ∈ B^n, the Hopfield network described by the recurrent equation

$$x_j(t+1) = \operatorname{sgn}\left( \sum_{k=1}^{n} m_{jk}\, x_k(t) \right) \quad \forall j = 1, \ldots, n \tag{9}$$

with synaptic weights given by

$$m_{jk} = \frac{1}{n} \sum_{\xi=1}^{p} u^\xi_j u^\xi_k \quad \forall j, k = 1, \ldots, n \tag{10}$$

is an RCNN. Indeed, combining (9) and (10), we obtain (4) and (5) with f_c(x) = x/n in place of f.

Example 2: The exponential correlation neural network (ECNN) is obtained by setting f equal to an exponential function:

$$f_e(x) = e^{\alpha x / n}, \quad \alpha > 0. \tag{11}$$

The ECNN seems to be the RCNN that is most suited for implementation using very large scale integration (VLSI) technology [14]. Furthermore, applications of the ECNN as an AM have been extensively explored in [14], [28], and [29]. In particular, Chiueh and Goodman [14] showed that the storage capacity of the ECNN scales exponentially with n, the length of the stored items. In other words, this model is able to store approximately c^n n-bit bipolar fundamental memories. Here, the base c > 1 depends on the coefficient α of the exponential function. Moreover, the storage capacity of the ECNN approaches the ultimate upper bound for the capacity of a memory model for bipolar patterns as α tends to +∞. In practice, however, the exponential storage capacity is very difficult to attain due to the limitation on the dynamic range of the exponentiation [14]. In fact, a typical VLSI implementation of the exponentiation circuit has a dynamic range of approximately 10^5 to 10^7. Also, the implementation of the ECNN on traditional von Neumann computers is limited by the floating-point representation of real numbers.

Example 3: The class of RCNNs also includes the high-order correlation neural network and the potential-function correlation neural network [14], [30]–[32]. These models are obtained by considering, respectively, f_h(x) = (1 + x/n)^q, where q > 1 is an integer, and f_p(x) = 1/(1 − x/n)^L, with L ≥ 1, in (5). Like the ECNN, the storage capacity of the potential-function correlation neural network scales exponentially with n when used to realize an AM [31]. In contrast, the high-order correlation neural network has a polynomial storage capacity when used to implement an AM model [32].

IV. COMPLEX-VALUED RCNNS

Loosely speaking, CV-RCNNs are obtained by replacing the McCulloch-Pitts neuron by the continuous-valued multivalued neuron of [33]. The latter computes a weighted sum of its inputs followed by the continuous-valued activation function σ : C \ {0} → S given by

$$\sigma(z) = \frac{z}{|z|} \tag{12}$$

where |z| denotes the modulus or absolute value of the complex number z and S = {z ∈ C : |z| = 1} is the unit circle in the complex plane. Alternatively, the continuous-valued activation function can be expressed as σ(z) = e^{i Arg(z)}, where Arg(z) ∈ [0, 2π) is the principal argument of z. Note that σ generalizes sgn : R \ {0} → B and, therefore, the CV-RCNNs also generalize the bipolar RCNNs.

Analogous to the bipolar RCNNs, a CV-RCNN is implemented by a two-layer recurrent network with the fully connected topology shown in Fig. 1(b). The first layer computes the real part of the inner product between the current state and the complex-valued vectors u^1, ..., u^p, followed by the possibly nonlinear function f. The second layer evaluates the continuous-valued activation function σ at a weighted sum of the complex-valued vectors u^1, ..., u^p.

In mathematical terms, let U = {u^1, ..., u^p} ⊆ S^n and let f : [−n, n] → R be a continuous and monotone nondecreasing function. Given a complex-valued input z(0) = [z_1(0), ..., z_n(0)]^T ∈ S^n, a CV-RCNN recursively defines the following sequence for t ≥ 0:

$$z_j(t+1) = \sigma\left( \sum_{\xi=1}^{p} w_\xi(t)\, u^\xi_j \right) \quad \forall j = 1, \ldots, n \tag{13}$$


where the weights w_1(t), ..., w_p(t) are given by

$$w_\xi(t) = f\big( \Re\{ \langle u^\xi, z(t) \rangle \} \big) \quad \forall \xi = 1, \ldots, p. \tag{14}$$

Here, ℜ{z} denotes the real part of the complex number z. Also, the function σ : C → S is evaluated as follows, where $v_j(t) = \sum_{\xi=1}^{p} w_\xi(t)\, u^\xi_j$ is the activation potential of the jth output neuron at iteration t:

$$\sigma\big(v_j(t)\big) = \begin{cases} v_j(t) / |v_j(t)|, & v_j(t) \neq 0, \\ z_j(t), & v_j(t) = 0. \end{cases} \tag{15}$$

In matrix-vector notation, (13) is written as

$$z(t+1) = \sigma\big( U w(t) \big), \qquad w(t) = f\big( \Re\{ U^* z(t) \} \big) \tag{16}$$

where U = [u^1, ..., u^p] ∈ S^{n×p} is the complex matrix whose columns are the vectors in U, U^* denotes the conjugate transpose of U, and the functions f, σ, and ℜ{·} are evaluated in a component-wise manner.
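The recursion (13)–(16) can be sketched in a few lines of NumPy, assuming synchronous updates; the names cv_rcnn_step and cv_rcnn_recall are our own, and the stopping rule mirrors the criterion adopted later in the experiments (‖z(t+1) − z(t)‖_2 ≤ 10^{-4} or t ≥ 100).

import numpy as np

def cv_rcnn_step(z, U, f):
    """One synchronous step of a CV-RCNN, cf. (16).
    z : current state, shape (n,), complex entries on the unit circle
    U : complex matrix whose columns are the fundamental memories, shape (n, p)
    f : continuous nondecreasing weight function."""
    w = f(np.real(np.conj(U).T @ z))    # weights, cf. (14)
    v = U @ w                           # activation potentials
    z_next = z.copy()
    nz = v != 0
    z_next[nz] = v[nz] / np.abs(v[nz])  # sigma(v) = v/|v|, cf. (15)
    return z_next

def cv_rcnn_recall(z0, U, f, tol=1e-4, max_iter=100):
    """Iterate the CV-RCNN until the state stabilizes (Theorem 1 below guarantees convergence)."""
    z = z0
    for _ in range(max_iter):
        z_new = cv_rcnn_step(z, U, f)
        if np.linalg.norm(z_new - z) <= tol:
            return z_new
        z = z_new
    return z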

Examples of CV-RCNNs include the following straightforward complex-valued generalizations of the bipolar RCNNs given in Examples 1–3 of Section III.

1) The correlation CV-RCNN, which is obtained by considering in (14) the function f_c(x) = x/n.

2) The high-order CV-RCNN, which is derived by setting f in (14) equal to f_h(x) = (1 + x/n)^q for some integer q > 1.

3) The potential-function CV-RCNN, which is obtained by considering in (14) the function

$$f_p(x) = \frac{1}{(1 + \varepsilon - x/n)^L} \tag{17}$$

for L ≥ 1 and ε > 0 small.

4) The exponential CV-RCNN, which is defined by (14) with f_e(x) = e^{αx/n} for some α > 0.

We would like to point out that we slightly modified the potential function f_p to avoid a division by zero when the CV-RCNN is fed with one of the fundamental memories. In the computational experiments, we set ε = (ε_mach)^{1/2}, where ε_mach denotes the machine floating-point relative accuracy. Also, note that the floating-point number system imposes, due to overflow, an upper bound on the parameters q, L, and α of the functions f_h, f_p, and f_e, respectively. Roughly, we must consider q ≤ 1024, L ≤ 39, and α ≤ 709 on a machine that supports IEEE floating-point arithmetic.
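The four weight functions above might be coded as follows (a sketch; the default values of q, L, and α are only examples, and ε defaults to the square root of the machine precision as described in the text).

import numpy as np

def f_correlation(x, n):
    return x / n                                    # f_c(x) = x/n

def f_high_order(x, n, q=10):
    return (1.0 + x / n) ** q                       # f_h(x) = (1 + x/n)^q

def f_potential(x, n, L=5, eps=np.sqrt(np.finfo(float).eps)):
    return 1.0 / (1.0 + eps - x / n) ** L           # f_p(x), cf. (17)

def f_exponential(x, n, alpha=10.0):
    return np.exp(alpha * x / n)                    # f_e(x) = e^{alpha x / n}

In the recall sketch given earlier, one would pass a one-argument function, e.g., f = lambda x: f_exponential(x, n, alpha=10).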

A. Implementation of CV-RCNNs as τ-AMs

In this section, we present some theoretical results concerning the implementation of τ-AMs using the CV-RCNNs. First, let us show that a CV-RCNN yields a convergent sequence {z(t)}_{t≥0} independently of the cardinality of the set U ⊆ S^n and of the input z(0) ∈ S^n.

Theorem 1: Let f : [−n, n] → R be a continuous and monotone nondecreasing function and let U = {u^1, ..., u^p} ⊆ S^n. For any input z(0) ∈ S^n, the sequence {z(t)}_{t≥0} defined by (13) and (14) is convergent in both synchronous and asynchronous update modes.

The proof of Theorem 1, which can be found in the Appendix, is very similar to the one provided by Chiueh and Goodman [14] for the bipolar RCNNs. Briefly, we show that the time evolution of (13) and (14) yields a minimum of the energy function

$$E(z) = -\sum_{\xi=1}^{p} F\big( \Re\{ \langle u^\xi, z \rangle \} \big) \quad \forall z \in S^n \tag{18}$$

where F is a certain primitive of the nonlinear function f.

Theorem 1 shows that CV-RCNNs have potential application as AMs or, more specifically, as τ-AMs. In other words, we can define ψ : S^n → S^n by ψ(z) = lim_{t→∞} z(t), where {z(t)}_{t≥0} is the sequence given by (13) and (14) with z(0) = z. The following provides sufficient conditions for ψ to be a τ-AM designed for the storage of the complex-valued vectors u^1, ..., u^p. Specifically, we first show that (3) holds with respect to the ∞-norm whenever a certain inequality is satisfied. Subsequently, we provide a sufficient condition for a CV-RCNN to implement a τ-AM with respect to the ∞-norm.

Theorem 2: Consider a small positive number τ < 1 and a set {u^1, ..., u^p} ⊆ S^n. Let z(1) ∈ S^n denote the complex-valued vector produced by a single-step CV-RCNN supplied with z(0) = u^η for some η ∈ {1, ..., p}. If

$$\sum_{\xi \neq \eta} \Big| f\big( \Re\{ \langle u^\xi, u^\eta \rangle \} \big) \Big| \le f(n)\, \frac{\tau^2}{1 + \tau^2} \tag{19}$$

then the inequality ‖z(1) − u^η‖_∞ ≤ τ holds true.

Remark 1: Results similar to Theorem 2 can be deduced by considering other vector norms such as the 2-norm. For instance, we have ‖z(1) − u^η‖_2 ≤ τ whenever

$$\sum_{\xi \neq \eta} \Big| f\big( \Re\{ \langle u^\xi, u^\eta \rangle \} \big) \Big| \le f(n)\, \frac{\tau^2}{n + \tau^2}. \tag{20}$$

Indeed, if (20) holds true, then ‖z(1) − u^η‖_∞ ≤ τ/√n and, therefore, ‖z(1) − u^η‖_2 ≤ √n ‖z(1) − u^η‖_∞ ≤ τ.

The condition (19) in Theorem 2 has the following intuitive interpretation. The output of the CV-RCNN is given by z(1) = σ(v), where the activation potential v is determined by the weighted sum

$$v = \sum_{\xi=1}^{p} w_\xi u^\xi = w_\eta u^\eta + \sum_{\xi \neq \eta} w_\xi u^\xi. \tag{21}$$

Thus, the desired output u^η is multiplied by w_η = f(n) while the other vectors are multiplied by w_ξ = f(ℜ{⟨u^ξ, u^η⟩}). Now, if w_η is sufficiently large compared with the weights w_ξ, for ξ ≠ η, then the error term e = Σ_{ξ≠η} w_ξ u^ξ may be ignored and the inequality ‖z − u^η‖_∞ ≤ τ holds true. Furthermore, since we often have ‖e‖_∞ > 0, the strict equality z = u^η hardly ever holds in practical situations. In other words, although the CV-RCNNs can be used for the storage of complex-valued vectors as τ-AMs, they rarely implement an AM (which is defined using an equality constraint).

Theorem 2 gives a condition for obtaining an output z sufficiently close to u^η when the latter is fed into a single-step CV-RCNN. If such a condition holds for all complex-valued vectors u^1, ..., u^p, then the CV-RCNN realizes a τ-AM designed for the storage of u^1, ..., u^p. Precisely, we have the following corollary of Theorem 2.

Corollary 1: Let τ < 1 be a small positive number and consider a set U = {u^1, ..., u^p} ⊆ S^n. If

$$\sum_{\xi \neq \eta} \Big| f\big( \Re\{ \langle u^\xi, u^\eta \rangle \} \big) \Big| \le f(n)\, \frac{\tau^2}{1 + \tau^2} \quad \forall \eta = 1, \ldots, p \tag{22}$$

then the CV-RCNN given by (13) and (14) realizes a τ-AM with respect to the ∞-norm.

Finally, the following corollary is obtained by considering an upper bound on the sum on the left-hand side of (22).

Corollary 2: Let τ < 1 be a small positive number and consider a set U = {u^1, ..., u^p} ⊆ S^n. Also, assume that the continuous nondecreasing function f : [−n, n] → R satisfies the inequality |f(x)| < f(|x|) for all x ∈ [−n, n] and let κ = max_{ξ≠η} {|ℜ{⟨u^ξ, u^η⟩}|}. If

$$(p - 1)\, f(\kappa) \le \frac{f(n)\, \tau^2}{1 + \tau^2} \tag{23}$$

then the CV-RCNN given by (13) and (14) realizes a τ-AM with respect to the ∞-norm.

Note that the functions f_c, f_e, f_h, and f_p satisfy the condition |f(x)| ≤ f(|x|) for all x ∈ [−n, n]. In other words, Corollary 2 can be applied to test whether one of the four CV-RCNNs listed previously is able to implement a τ-AM.
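As a sketch of how the test suggested by Corollary 2 might be carried out numerically (our own helper, not from the paper; f is a one-argument weight function such as lambda x: np.exp(10 * x / n)):

import numpy as np

def satisfies_corollary_2(U, f, tau):
    """Check condition (23) for the complex matrix U (columns are the
    fundamental memories) and weight function f."""
    n, p = U.shape
    if p < 2:
        return True                          # nothing to interfere with a single memory
    G = np.real(np.conj(U).T @ U)            # G[xi, eta] = Re{<u^xi, u^eta>}
    off = G[~np.eye(p, dtype=bool)]          # off-diagonal entries (xi != eta)
    kappa = np.max(np.abs(off))
    return (p - 1) * f(kappa) <= f(n) * tau**2 / (1 + tau**2)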

B. Storage of Independent and Uniformly Distributed Vectors

Let us conclude this section by investigating the case in which a CV-RCNN is designed for the storage of complex-valued vectors whose components are independent and uniformly distributed in the unit circle S. Specifically, we shall estimate a value κ so that the inequality |ℜ{⟨u^ξ, u^η⟩}| ≤ κ holds true with high probability for any ξ ≠ η. Hence, from Corollary 2, a CV-RCNN almost surely realizes a τ-AM whenever (23) remains valid. In the sequel, we assume that |f(x)| < f(|x|) holds for all x ∈ [−n, n].

First of all, note that

$$\langle u^\xi, u^\eta \rangle = \sum_{j=1}^{n} \bar{u}^\eta_j u^\xi_j = \sum_{j=1}^{n} e^{i(\phi^\xi_j - \phi^\eta_j)} \tag{24}$$

where φ^ξ_j and φ^η_j denote, respectively, the principal arguments of u^ξ_j and u^η_j. Therefore

$$\Re\{ \langle u^\xi, u^\eta \rangle \} = \sum_{j=1}^{n} \cos\big( \phi^\xi_j - \phi^\eta_j \big). \tag{25}$$

Now, if the components u^ξ_j are independent and uniformly distributed in S for all indexes j and ξ, then the mean and the variance of cos(φ^ξ_j − φ^η_j) are, respectively, 0 and 1/2. Moreover, by the central limit theorem, the random variable

$$c_{\xi\eta} = \frac{2}{\sqrt{n}} \sum_{j=1}^{n} \cos\big( \phi^\xi_j - \phi^\eta_j \big), \quad \text{for } \xi \neq \eta \tag{26}$$

has a standard normal distribution for n sufficiently large. Since Pr(|c_{ξη}| ≤ 4) ≥ 0.9999 or, equivalently, Pr(|ℜ{⟨u^ξ, u^η⟩}| ≤ 2√n) ≥ 0.9999, the value κ = 2√n is almost surely an upper bound for |ℜ{⟨u^ξ, u^η⟩}|. Thus, a CV-RCNN almost surely implements a τ-AM designed for the storage of independent and uniformly distributed complex-valued vectors u^1, ..., u^p if

$$\frac{f(2\sqrt{n})}{f(n)} \le \frac{\tau^2}{(1 + \tau^2)(p - 1)}. \tag{27}$$

Note that (27) can be used to estimate the absolute storage capacity of a CV-RCNN as well as to determine the value of a parameter so that the network is able to implement a τ-AM. For example, we conclude from (27) that an exponential CV-RCNN, with α = 25 in (11), has a high probability of implementing perfect recall of

$$p \le 1 + \frac{\tau^2}{1 + \tau^2}\, e^{\alpha(1 - 2/\sqrt{n})} = 486.16 \tag{28}$$

complex-valued vectors of length n = 100 with a tolerance τ = 10^{-3}. Similarly, suppose that we intend to store p = 12 uniformly distributed complex-valued patterns of length n = 100 in an exponential CV-RCNN with a tolerance τ = 10^{-3}. From (27), we conclude that an exponent

$$\alpha \ge \frac{\ln(p - 1) + \ln(1 + \tau^2) - 2\ln(\tau)}{1 - 2/\sqrt{n}} = 20.27 \tag{29}$$

can be chosen for the task. Indeed, for fixed n, p, and τ < 1, we can always estimate a value of the parameter α such that the exponential CV-RCNN realizes a τ-AM. As we shall observe in the computational experiments, however, the estimate derived from (27) is usually pessimistic. For instance, the parameter α = 10 is usually sufficient for the storage of p = 12 uniformly distributed patterns of length n = 100 with a tolerance τ = 10^{-3} in an exponential CV-RCNN.
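The estimates (28) and (29) follow directly from (27); the short computation below, assuming natural logarithms and NumPy, reproduces the two values quoted above.

import numpy as np

n, tau = 100, 1e-3

# Estimate (28): largest p for the exponential CV-RCNN with alpha = 25.
alpha = 25.0
p_max = 1 + tau**2 / (1 + tau**2) * np.exp(alpha * (1 - 2 / np.sqrt(n)))
print(p_max)        # approximately 486.16

# Estimate (29): smallest alpha for storing p = 12 patterns.
p = 12
alpha_min = (np.log(p - 1) + np.log(1 + tau**2) - 2 * np.log(tau)) / (1 - 2 / np.sqrt(n))
print(alpha_min)    # approximately 20.27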

Finally, we would like to point out that, from (27), the estimates of the largest number of uniformly distributed vectors that can be stored in the exponential, high-order, and potential-function CV-RCNNs approach, as n → ∞, respectively

$$1 + \frac{\tau^2}{1 + \tau^2}\, e^{\alpha}, \qquad 1 + \frac{\tau^2}{1 + \tau^2}\, 2^{q}, \qquad \text{and} \qquad 1 + \frac{\tau^2}{1 + \tau^2} \left( \frac{1 + \varepsilon}{\varepsilon} \right)^{L}$$

where ε > 0 is the small number used to avoid a division by zero in (17). Therefore, the estimates of the asymptotic storage capacity of these three CV-RCNNs are given by an exponential in the parameter of the function f. Furthermore, according to the estimate given by (27), these three memories may exhibit similar performance as τ-AMs by fine-tuning the parameters α, q, and L. The computational experiments in the following section confirm this remark.

V. COMPUTATIONAL EXPERIMENTS

In this section, we shall perform computational experiments with some CV-RCNNs as well as other AM models for the storage of complex- or real-valued patterns. Specifically, we first compare the four CV-RCNNs presented previously. Then, we turn our attention to the storage capacity and the properties of the patterns recalled by the exponential CV-RCNN. Finally, we perform some experiments concerning the storage and recall of grayscale images.


Fig. 2. Comparison of the one-step error correction capability of the four CV-RCNNs. (a) High-order CV-RCNN. (b) Potential-function CV-RCNN. (c) Exponential CV-RCNN. (d) Comparison between the four CV-RCNNs with the parameters q, L, and α all equal to 10.

A. Comparison Between the Four CV-RCNNs

Let us first investigate the effect of the parameters q, L, and α on the error correction capability of the high-order, potential-function, and exponential CV-RCNNs, respectively. Let us also investigate the error correction capability of the correlation CV-RCNN and compare the four memories. To this end, we performed the following steps 1000 times.

1) We generated p = 12 complex-valued vectors u^1, ..., u^12 of length n = 100 whose components are independent and uniformly distributed in S.

2) We also randomly generated s = [s_1, ..., s_n]^T ∈ B^n.

3) For δ ∈ {0, 0.05, 0.1, ..., 1.00}, we defined the input pattern z(0) = [z_1(0), ..., z_n(0)]^T ∈ S^n as

$$z_j(0) = u^1_j\, e^{i s_j \pi \delta / 2} \quad \forall j = 1, \ldots, n. \tag{30}$$

4) We computed the pattern z(1) obtained after one iteration of a certain CV-RCNN.

Briefly, the component z_j(0) of the input vector is obtained by rotating u^1_j by δπ/2 in either clockwise or counterclockwise direction (depending on s_j).
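A sketch of step 3), assuming NumPy; the helper rotated_input is ours and draws the bipolar sign vector s internally.

import numpy as np

rng = np.random.default_rng()

def rotated_input(u1, delta, rng=rng):
    """Rotate each component of u1 by delta*pi/2 clockwise or
    counterclockwise, depending on a random sign, cf. (30)."""
    n = u1.size
    s = rng.choice([-1.0, 1.0], size=n)        # bipolar vector s
    return u1 * np.exp(1j * s * np.pi * delta / 2)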

The outcome of this experiment is shown in Fig. 2. Precisely, for each value δ and for each step k from 1 to 1000, we defined z(0) using (30) and computed the input error e_in(δ; k) = ‖u^1 − z(0)‖_∞. Then, we determined the pattern z(1) by evaluating (16) with t = 0 and we computed the output error e_out(δ; k) = ‖u^1 − z(1)‖_∞. The horizontal and vertical axes in Fig. 2 contain, respectively, the following averages for different values of δ:

$$E_{in}(\delta) = \frac{1}{1000} \sum_{k=1}^{1000} e_{in}(\delta; k) \tag{31}$$

and

$$E_{out}(\delta) = \frac{1}{1000} \sum_{k=1}^{1000} e_{out}(\delta; k). \tag{32}$$

In other words, this figure shows the parametrized curves [E_in(δ), E_out(δ)] produced by the CV-RCNNs. Note that the inequality E_out < E_in holds true if the curve is below the identity line. Therefore, a certain memory exhibits some error correction capability if the corresponding curve in Fig. 2 is below the dotted line. Also, observe that E_out(0) is the average error produced by a CV-RCNN fed with u^1. Hence, a certain CV-RCNN usually implements perfect recall of u^1 with a tolerance τ if E_out(0) ≤ τ. Conversely, the network failed to realize a τ-AM in at least one step if E_out(0) > τ.


TABLE I. AVERAGE ERROR E_out(0) PRODUCED BY THE EXPONENTIAL CV-RCNN FED WITH THE FUNDAMENTAL MEMORY u^1

Fig. 3. Probability of recalling the fundamental memory that is closer to the input pattern in the Euclidean distance sense.

The plots in Fig. 2 allow us to make the following general observations regarding the four CV-RCNNs.

1) The high-order CV-RCNN, as well as the exponential CV-RCNN, failed to realize a τ-AM for small values of the parameters q and α, i.e., for q, α ≤ 5.

2) The error correction capability of the high-order and exponential CV-RCNNs has a similar dependence on the parameters q and α.

3) The potential-function CV-RCNN always yielded perfect recall of the fundamental memory u^1.

4) The correlation CV-RCNN failed to implement a τ-AM, i.e., the memory failed to retrieve the fundamental memory u^1 in many steps.

5) The high-order, potential-function, and exponential CV-RCNNs, besides giving perfect recall of undistorted patterns with a certain tolerance τ, exhibited similar error correction capability for large values of their parameters, i.e., for q, L, α ≥ 10.

Concluding, except for the correlation CV-RCNN, the other three CV-RCNNs exhibit a satisfactory storage capacity and error correction capability by fine-tuning the parameters L, q, and α. In view of this remark and considering the theoretical results available in the literature concerning the bipolar exponential RCNN, we shall focus on the exponential CV-RCNN in the following. In particular, Table I contains the average error E_out(0) produced by the exponential CV-RCNN fed with the fundamental memory u^1 ∈ S^100. Note that the network failed to retrieve the original vector u^1 with a tolerance τ = 10^{-3} for α = 5. The error rates in Table I suggest that the exponential CV-RCNN can implement a τ-AM, with τ = 10^{-3}, for α ≥ 10.

B. Some Remarks on the Recall Phase and Storage Capacity

In the computational experiment performed in the previous section, the exponential CV-RCNN exhibited a high probability of recalling one of the fundamental memories under the presentation of an input obtained by a component-wise rotation. In this section, we provide further insights on the storage capacity as well as on the nature of the recalled pattern.

Fig. 4. Estimation of the storage capacity of the exponential CV-RCNN.

Fig. 5. Original grayscale images of size 128 × 128 and 256 gray levels.

Fig. 6. Example of grayscale images obtained by single-step exponential CV-RCNNs supplied with a fundamental memory.

First, we counted the number of times the exponential CV-RCNN implements a τ-AM (with a tolerance τ = 10^{-3} and the norm ‖·‖_∞) designed for the storage of uniformly distributed vectors of length n = 100. The network with α = 10 realized a τ-AM for the storage of p = 12 vectors in 983 of 1000 simulations. Also, the exponential CV-RCNN succeeded in implementing a τ-AM in all 1000 simulations for α = 20 and 30. Recall that the estimate given by (29) is α ≥ 20.27. Similarly, the network with α = 20 and 30 realized a τ-AM for the storage of p = 486 vectors in 1000 simulations. On the downside, the exponential CV-RCNN with α = 10 failed to implement a τ-AM in all simulations. Recall that p = 486 was obtained in (28) as an estimate of the largest number of fundamental memories that can be stored in the exponential CV-RCNN with α = 25.

Fig. 7. Average PSNR produced by the input pattern and several AM models versus the standard deviation of Gaussian noise.

We also counted the number of times the memory recalls the fundamental memory that is closest to a randomly generated input vector in 1000 experiments. Here, we used the Euclidean distance because it is closely related to the inner product used to activate the neurons in the hidden layer. The outcome of this experiment for α ∈ {10, 20, 30} is shown in Fig. 3 when the number of stored items is p = 12 or p = 486. Note that the probability of recalling the fundamental memory that is the most similar to the input pattern, in the Euclidean distance sense, increases with the parameter α.

Finally, for each length n ∈ {2, 3, ..., 20}, the following steps have been performed 100 times to estimate the largest number p of fundamental memories that can be stored in the exponential CV-RCNN with respect to the ∞-norm and a tolerance τ = 10^{-3}; a sketch of the procedure is given after this list.

1) Initialize with an empty fundamental memory set U = ∅ and let p = 0.

2) While the network realizes a τ-AM and p ≤ p_max, do:
   a) increment p, i.e., p ← p + 1;
   b) generate a uniformly distributed complex-valued vector u^p of length n;
   c) add the complex-valued pattern to the fundamental memory set, i.e., U ← U ∪ {u^p}.

3) The storage capacity of the memory is estimated as p − 1.

Here, we assume that the memory always exhibits perfect recall of an empty fundamental memory set. Also, due to computational constraints, we limited the number of fundamental memories to p_max = 1500.
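An illustrative rendering of the procedure above, reusing the hypothetical helper cv_rcnn_recall sketched in Section IV; it is a sketch of the experimental protocol, not the code used in the paper.

import numpy as np

def estimate_capacity(n, f, tau=1e-3, p_max=1500, rng=np.random.default_rng()):
    """Incrementally add uniformly distributed memories until the
    CV-RCNN defined by the weight function f stops realizing a tau-AM."""
    memories = []
    while len(memories) < p_max:
        # draw a new vector with components uniformly distributed on S
        u = np.exp(1j * 2 * np.pi * rng.random(n))
        memories.append(u)
        U = np.column_stack(memories)
        # tau-AM test: feed each stored vector and check the inf-norm error
        ok = all(np.max(np.abs(cv_rcnn_recall(u_xi, U, f) - u_xi)) <= tau
                 for u_xi in memories)
        if not ok:
            break
    return len(memories) - 1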

The semilog plot in Fig. 4 shows the average of the estimated storage capacity of the exponential CV-RCNN with the parameters α = 10, 15, 20, 25, and 30. We also included in this semilog plot the straight lines corresponding to the exponentials of the form Ac^n obtained by ordinary least squares. The coefficient A and the base c for each value of the parameter α can be observed in Table II. Like the bipolar ECNN, the storage capacity of the exponential CV-RCNN visually scales exponentially with the length of the stored vectors. Furthermore, this experiment confirmed that the storage capacity is often greater than the estimate 1 + (τ²/(1 + τ²)) exp(α(1 − 2/√n)) obtained in Section IV-B.

Fig. 8. Average PSNR produced by the input pattern and several AM models versus the probability of salt-and-pepper noise.

TABLE II. COEFFICIENT AND BASE OF THE EXPONENTIALS Ac^n DEPICTED AS STRAIGHT LINES IN FIG. 4

C. Experiments With Grayscale Images

Let us now perform some experiments with the eight grayscale images shown in Fig. 5. These images have size 128 × 128 and K = 256 gray levels. For each of these images, we generated a complex-valued vector u^ξ ∈ S^{16384} using the standard row-scan method and applying r_K : [0, K) → S, defined below, in a component-wise manner:

$$r_K(x) = e^{2\pi x i / K}. \tag{33}$$

First, we synthesized exponential CV-RCNNs using the eight complex-valued patterns u^1, ..., u^8 and the parameter α ∈ {1, 3, 5, 10, 20, 30}. We observed that only the networks with α = 20 and 30 implemented a τ-AM with respect to the ∞-norm and τ = 10^{-3}. Nevertheless, the exponential CV-RCNNs with α = 10 also yielded satisfactory results concerning the storage of the eight grayscale images. Specifically, we converted the output of a complex-valued network into a multivalued vector by applying the function q_K : S → K, K = {0, 1, ..., K − 1}, given by

$$q_K(z) = \operatorname{floor}\!\left( \frac{\operatorname{Arg}(z)\, K}{2\pi} + \frac{1}{2} \right) \tag{34}$$

in a component-wise manner. Fig. 6 shows instances of images obtained by the single-step exponential CV-RCNNs supplied with a fundamental memory. Note that the images produced by the networks with α ≥ 10 are visually very similar to the original image. The minimum peak signal-to-noise ratio (PSNR) rates produced by a single-step CV-RCNN supplied with the fundamental memories for α = 1, 3, 5, 10, 20, and 30 are, respectively, 10.70, 15.48, 22.79, 332.47, 332.47, and 332.47. Observe that the PSNR rate produced by the network with α = 10 is as large as the PSNR rates produced by the CV-RCNN with α = 20 or 30. We would like to point out that we computed the PSNR between the vectors x ∈ K^n and y ∈ K^n using

$$\mathrm{PSNR} = 20 \log\!\left( \frac{\sqrt{n}\,(K - 1)}{\max\{\varepsilon_{mach}, \|x - y\|_2\}} \right) \tag{35}$$

where ε_mach, introduced to avoid a division by zero, denotes the machine precision. Hence, the largest PSNR rate is 403.35.

Fig. 9. Image corrupted by Gaussian noise with standard deviation 100 followed by the images retrieved by the memory models.
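A sketch of the conversions (33) and (34) and of the PSNR measure (35), assuming NumPy arrays of integer gray levels and base-10 logarithms (consistent with the largest rate of 403.35 quoted above); the final modulo in q_K, which wraps the boundary case Arg(z) close to 2π back to gray level 0, is our own addition.

import numpy as np

def r_K(x, K=256):
    """Map gray levels {0, ..., K-1} to the unit circle, cf. (33)."""
    return np.exp(2j * np.pi * x / K)

def q_K(z, K=256):
    """Map unit-circle values back to gray levels, cf. (34)."""
    arg = np.mod(np.angle(z), 2 * np.pi)           # Arg(z) in [0, 2*pi)
    return np.floor(arg * K / (2 * np.pi) + 0.5).astype(int) % K

def psnr(x, y, K=256):
    """PSNR between two gray-level vectors, cf. (35)."""
    n = x.size
    err = np.linalg.norm(x.astype(float) - y.astype(float))
    eps_mach = np.finfo(float).eps
    return 20 * np.log10(np.sqrt(n) * (K - 1) / max(eps_mach, err))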

Let us conclude this section by comparing the noise tolerance of the exponential and potential-function CV-RCNNs with other AMs that can be used for the storage and recall of grayscale images. Namely, let us confront the exponential and the potential-function CV-RCNNs with the CV-DNN of [19], [21], and [23] with the complex-valued activation function referred to as Model B, the CV-DNN with the complex-sigmoid function of [24], and the CV-DNN with strong bias terms of [25]. For simplicity, we use the surname of the first author to refer to a certain CV-DNN model. Also, the words exponential and potential refer, respectively, to the exponential and potential-function CV-RCNNs. Besides the complex-valued models, let us consider the following real-valued AMs: the optimal linear AM (OLAM) [2], the kernel AM (KAM) [34], and the subspace projection autoassociative memory (SPAM) based on the M-estimation method [35].

We probed the networks with images corrupted by Gaussian noise with zero mean and standard deviations ranging from 0 to 100. We also probed the memory models with images corrupted by salt-and-pepper noise with densities varying from 0 to 0.8. The output of the complex-valued recurrent networks was obtained by iterating until either ‖z(t+1) − z(t)‖_2 ≤ 10^{-4} or t ≥ 100. Then, we computed the PSNR rates averaged over 80 simulations for each variance of Gaussian noise or probability of salt-and-pepper noise. Specifically, each original image was distorted 10 times for a given noise variance or probability. The outcome of this experiment can be observed in Figs. 7 and 8. Here, we adopted the parameters α = 10 and L = 5 in the functions f_e and f_p of the CV-RCNNs, respectively. Moreover, as suggested in [24], we used the parameter ε = 0.2 in the complex-sigmoid activation. We adopted the same parameters considered by Suzuki et al. [25] in their computational experiments, including C = 200 as the strength of the bias term. The KAM as well as the SPAM have been designed according to [34] and [35], respectively.

Fig. 10. Image corrupted by salt-and-pepper noise with probability 0.4 followed by the images retrieved by the memory models.

Note that the two CV-RCNNs outperformed the other memory models concerning the removal of Gaussian noise as well as salt-and-pepper noise. We would like to point out that we also evaluated the noise tolerance of the exponential CV-RCNN with α = 20 and 30. However, these networks yielded PSNR rates very similar to the exponential CV-RCNN with α = 10. Thus, they have not been included in Figs. 7 and 8.

Finally, a visual interpretation of the noise tolerance of the 10 memory models is shown in Figs. 9 and 10. Precisely, Fig. 9 shows an image corrupted by Gaussian noise with standard deviation 100 followed by the images retrieved by the memory models. Similarly, Fig. 10 refers to the noise tolerance with respect to salt-and-pepper noise with density 0.4. In addition, Table III shows the PSNR produced by the images depicted in these two figures. Note that the CV-DNN models of [19], [23], and [25] failed to retrieve the original lena image due to the crosstalk inherent in the Hebbian learning. The CV-DNN of Lee yielded images similar to those of the CV-DNN of Tanaka and Aihara. Furthermore, these two CV-DNNs were not able to retrieve the original lena image corrupted by Gaussian noise in Fig. 9. Finally, the real-valued models produced images visually similar to the original ones except for the OLAM, which retrieved a darker airplane image in the presence of salt-and-pepper noise. The CV-RCNNs also yielded images visually similar to the undistorted images even in the presence of high-density noise. Notwithstanding, as shown in Table III, the CV-RCNNs yielded PSNR rates greater than those produced by the real-valued models.

TABLE III. PSNR PRODUCED BY THE IMAGES SHOWN IN FIGS. 9 AND 10

VI. CONCLUSION

In this paper, we generalized the bipolar RCNNs of [14] to the complex domain. Like the original bipolar RCNNs, the CV-RCNNs are implemented by the fully connected two-layer recurrent neural networks shown in Fig. 1. The first layer evaluates a continuous nondecreasing function f at the real part of the inner product of the input pattern and each fundamental memory. The second layer projects the components of a weighted sum of the fundamental memories onto the complex unit circle using the continuous-valued activation function σ.

We showed that a CV-RCNN always produces a convergent sequence. Therefore, CV-RCNNs can be used to implement AMs. In this context, we provided conditions for a CV-RCNN to realize a τ-AM with respect to the ∞-norm. We also concluded that a CV-RCNN almost surely implements a τ-AM for the storage of uniformly distributed vectors if the quotient f(2√n)/f(n) is bounded in terms of the tolerance τ and the number of fundamental vectors p. In view of this fact, we concluded that the asymptotic storage capacity of the exponential, high-order, and potential-function CV-RCNNs can be estimated by an exponential in the parameters α, q, and L.

Finally, computational experiments pointed out that the storage capacity of the exponential CV-RCNN scales exponentially with the length of the stored patterns. Also, given that this memory implements perfect recall, the probability of recalling the fundamental memory that is the most similar to the input pattern in the Euclidean distance sense increases with the value of the parameter α. In addition, experiments concerning the reconstruction of grayscale images revealed that the exponential CV-RCNN, as well as the potential-function CV-RCNN, exhibits an excellent tolerance with respect to both Gaussian and salt-and-pepper noise. In particular, the CV-RCNNs outperformed other memory models, including the CV-DNNs of [19], [21], [23], [24], and [25]. The CV-RCNNs also outperformed some real-valued AM models such as the OLAM [2], the KAM [34], and a certain SPAM [35].

APPENDIX

PROOF OF THE THEOREMS

Proof of Theorem 1: Let f : [−n, n] → R be a continuous and monotone nondecreasing function, and let F : [−n, n] → R be the primitive of f given by

$$F(x) = \int_{-n}^{x} f(t)\, dt \quad \forall x \in [-n, n].$$

By the mean value theorem, combined with the monotonicity of f, we deduce the following inequality [30]:

$$F(y) - F(x) \ge f(x)(y - x) \quad \forall x, y \in [-n, n]. \tag{36}$$

Now, consider the real-valued function E : S^n → R given by (18). Note that E is bounded below. Precisely, we have E(z) ≥ −p F(n) for all z ∈ S^n. Furthermore, if we denote z ≡ z(t) and z′ ≡ z(t + 1), for some t ≥ 0, then

$$\Delta E = E(z') - E(z) = -\sum_{\xi=1}^{p} \Big[ F\big( \Re\{ \langle u^\xi, z' \rangle \} \big) - F\big( \Re\{ \langle u^\xi, z \rangle \} \big) \Big].$$

The following shows that ΔE < 0 whenever z′ ≠ z. Hence, E is an energy function and, therefore, the sequence produced by a CV-RCNN always converges to a stationary state.

First, by (36) and recalling that w_ξ = f(ℜ{⟨u^ξ, z⟩}), we conclude that

$$\Delta E \le -\sum_{\xi=1}^{p} f\big( \Re\{ \langle u^\xi, z \rangle \} \big) \Big[ \Re\{ \langle u^\xi, z' \rangle \} - \Re\{ \langle u^\xi, z \rangle \} \Big] = -\sum_{\xi=1}^{p} w_\xi \Big[ \Re\{ \langle u^\xi, z' \rangle \} - \Re\{ \langle u^\xi, z \rangle \} \Big].$$

However, the weights w_ξ are real for all ξ, the function ℜ{·} is linear, and the inner product is sesquilinear. Thus, writing v = Σ_{ξ=1}^{p} w_ξ u^ξ and expressing the inner product as a sum, we have

$$\begin{aligned}
\Delta E &\le -\Re\Big\{ \Big\langle \sum_{\xi=1}^{p} w_\xi u^\xi,\, z' \Big\rangle \Big\} + \Re\Big\{ \Big\langle \sum_{\xi=1}^{p} w_\xi u^\xi,\, z \Big\rangle \Big\} \\
&= -\Re\{ \langle v, z' \rangle \} + \Re\{ \langle v, z \rangle \} \\
&= -\Re\Big\{ \sum_{j=1}^{n} \bar{z}'_j v_j \Big\} + \Re\Big\{ \sum_{j=1}^{n} \bar{z}_j v_j \Big\} \\
&= -\sum_{j=1}^{n} \Re\{ \bar{z}'_j v_j \} + \sum_{j=1}^{n} \Re\{ \bar{z}_j v_j \} \\
&= -\sum_{j=1}^{n} \big[ \Re\{ \bar{z}'_j v_j \} - \Re\{ \bar{z}_j v_j \} \big].
\end{aligned} \tag{37}$$

Let us now write the components of v ∈ C^n in polar form:

$$v_j = r_j e^{i\theta_j} \quad \forall j = 1, \ldots, n \tag{38}$$

where r_j ≥ 0 and θ_j ∈ [0, 2π) for any j. Since z′ = σ(v), the component z′_j has the same argument θ_j as v_j:

$$z'_j = e^{i\theta_j} \quad \forall j = 1, \ldots, n. \tag{39}$$

Similarly, the components of z ∈ S^n are given by

$$z_j = e^{i\vartheta_j} \quad \forall j = 1, \ldots, n \tag{40}$$

for some ϑ_j ∈ [0, 2π) for all j = 1, ..., n. Combining (37) with (38)–(40), we deduce

$$\begin{aligned}
\Delta E &\le -\sum_{j=1}^{n} \Big[ \Re\{ r_j \} - \Re\big\{ r_j e^{i(\theta_j - \vartheta_j)} \big\} \Big] \\
&= -\sum_{j=1}^{n} \big[ r_j - r_j \cos(\theta_j - \vartheta_j) \big] \\
&= -\sum_{j=1}^{n} r_j \big( 1 - \cos(\theta_j - \vartheta_j) \big).
\end{aligned}$$

Since r_j ≥ 0 and 1 − cos(θ_j − ϑ_j) ≥ 0 for all j, we have ΔE ≤ 0. In particular, ΔE = 0 if, and only if, either r_j = 0 or 1 − cos(θ_j − ϑ_j) = 0 holds true for every j. However, both cases imply z′_j = z_j for all j = 1, ..., n. Indeed, on the one hand

$$1 - \cos(\theta_j - \vartheta_j) = 0 \iff \theta_j - \vartheta_j = 0 \iff z'_j = z_j.$$

On the other hand, we have

$$r_j = 0 \iff v_j = 0 \iff z'_j = z_j.$$

Concluding, ΔE ≤ 0 with ΔE = 0 if and only if z′ = z.

Proof of Theorem 2: First of all, let φ^ξ_j denote the principal argument of u^ξ_j, that is, u^ξ_j = e^{iφ^ξ_j} for all ξ = 1, ..., p and j = 1, ..., n. Also, from (14) with z(0) = u^η, we deduce w_ξ = f(ℜ{⟨u^ξ, u^η⟩}) for all ξ = 1, ..., p. In particular, w_η = f(ℜ{⟨u^η, u^η⟩}) = f(n). Hence, the inequality (19) can be written as

$$\sum_{\xi \neq \eta} |w_\xi| \le w_\eta\, \frac{\tau^2}{1 + \tau^2} \tag{41}$$

or, equivalently, as follows, where the denominator is positive:

$$\frac{\sum_{\xi \neq \eta} |w_\xi|}{w_\eta - \sum_{\xi \neq \eta} |w_\xi|} \le \tau^2. \tag{42}$$

In addition, from (13), we conclude that z_j(1) = σ(v_j), where σ is defined by (15) and $v_j = \sum_{\xi=1}^{p} w_\xi u^\xi_j$ is the activation potential of the jth output neuron. If v_j = 0, the jth components of z and u^η coincide and, thus, |z_j − u^η_j|² = 0. In the case v_j ≠ 0, we can express the square of |z_j − u^η_j| as

$$|z_j - u^\eta_j|^2 = \sin^2 \vartheta_j + (1 - \cos \vartheta_j)^2 = 2(1 - \cos \vartheta_j) \tag{43}$$

where ϑ_j denotes the principal argument of the quotient

$$\frac{v_j}{u^\eta_j} = \sum_{\xi=1}^{p} w_\xi\, e^{i(\phi^\xi_j - \phi^\eta_j)}. \tag{44}$$

In the following, we determine an upper bound for 1 − cos ϑ_j. Note that the value t_j = tan ϑ_j satisfies

$$t_j = \frac{\Im\{ v_j / u^\eta_j \}}{\Re\{ v_j / u^\eta_j \}} = \frac{\sum_{\xi=1}^{p} w_\xi \sin\big( \phi^\xi_j - \phi^\eta_j \big)}{\sum_{\xi=1}^{p} w_\xi \cos\big( \phi^\xi_j - \phi^\eta_j \big)}. \tag{45}$$

Using the triangle inequality and the fact that |sin x| ≤ 1 for all x, we conclude that the absolute value of the numerator of the last fraction in (45) satisfies

$$\bigg| \sum_{\xi=1}^{p} w_\xi \sin\big( \phi^\xi_j - \phi^\eta_j \big) \bigg| = \bigg| \sum_{\xi \neq \eta} w_\xi \sin\big( \phi^\xi_j - \phi^\eta_j \big) \bigg| \le \sum_{\xi \neq \eta} |w_\xi|. \tag{46}$$

Furthermore, the absolute value of the denominator in (45) satisfies the inequalities

$$\bigg| \sum_{\xi=1}^{p} w_\xi \cos\big( \phi^\xi_j - \phi^\eta_j \big) \bigg| \ge w_\eta - \sum_{\xi \neq \eta} |w_\xi| \ge 0. \tag{47}$$

From (45) and the inequalities (46) and (47), we obtain

$$|t_j| \le \frac{\sum_{\xi \neq \eta} |w_\xi|}{w_\eta - \sum_{\xi \neq \eta} |w_\xi|}. \tag{48}$$

Moreover, since ϑ_j = tan^{-1} t_j, we have

$$1 - \cos \vartheta_j = 1 - \cos\big( \tan^{-1} t_j \big) = 1 - \frac{1}{\sqrt{1 + t_j^2}} \le \frac{|t_j|}{2}. \tag{49}$$

The last inequality can be observed by plotting both functions 1 − 1/(1 + t²)^{1/2} and |t|/2.
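Instead of plotting, the inequality 1 − 1/(1 + t²)^{1/2} ≤ |t|/2 invoked in (49) can also be verified numerically; a quick sanity check:

import numpy as np

t = np.linspace(-100, 100, 200_001)
lhs = 1 - 1 / np.sqrt(1 + t**2)
assert np.all(lhs <= np.abs(t) / 2)   # the bound holds for every sampled t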

Finally, combining (42), (43), (48), and (49), we obtain the inequalities

$$|z_j - u^\eta_j|^2 \le |t_j| \le \frac{\sum_{\xi \neq \eta} |w_\xi|}{w_\eta - \sum_{\xi \neq \eta} |w_\xi|} \le \tau^2 \tag{50}$$

which, evidently, implies ‖z − u^η‖_∞ = max_j {|z_j − u^η_j|} ≤ τ.

Proof of Corollary 2: From the hypotheses on f, |ℜ{⟨u^ξ, u^η⟩}| ≤ κ implies the inequalities

$$\big| f\big( \Re\{ \langle u^\xi, u^\eta \rangle \} \big) \big| \le f\big( |\Re\{ \langle u^\xi, u^\eta \rangle \}| \big) \le f(\kappa) \tag{51}$$

for all ξ ≠ η. Therefore, for any η ∈ {1, 2, ..., p}, we have

$$\sum_{\xi \neq \eta} \big| f\big( \Re\{ \langle u^\xi, u^\eta \rangle \} \big) \big| \le (p - 1)\, f(\kappa) \le \frac{f(n)\, \tau^2}{1 + \tau^2}. \tag{52}$$

Thus, (22) holds true and, therefore, the CV-RCNN implements a τ-AM with respect to the ∞-norm.

REFERENCES

[1] J. Austin, “Associative memory,” in Handbook of Neural Computation, E. Fiesler and R. Beale, Eds. London, U.K.: Oxford Univ. Press, 1997, pp. F1.4:1–F1.4:7.
[2] T. Kohonen, Self-Organization and Associative Memory, 2nd ed. New York, NY, USA: Springer-Verlag, 1987.
[3] M. H. Hassoun and P. B. Watta, “Associative memory networks,” in Handbook of Neural Computation, E. Fiesler and R. Beale, Eds. London, U.K.: Oxford Univ. Press, 1997, pp. C1.3:1–C1.3:14.
[4] E. Esmi, P. Sussner, M. E. Valle, F. Sakuray, and L. Barros, “Fuzzy associative memories based on subsethood and similarity measures with applications to speaker identification,” in Proc. Int. Conf. Hybrid Artif. Intell. Syst. (HAIS), 2012, pp. 479–490.
[5] D. Zhang and W. Zuo, “Computational intelligence-based biometric technologies,” IEEE Comput. Intell. Mag., vol. 2, no. 2, pp. 26–36, May 2007.
[6] J. J. Hopfield and D. W. Tank, “‘Neural’ computation of decisions in optimization problems,” Biol. Cybern., vol. 52, no. 3, pp. 141–152, Jul. 1985.
[7] P. Sussner, E. L. Esmi, I. Villaverde, and M. Graña, “The Kosko subsethood fuzzy associative memory (KS-FAM): Mathematical background and applications in computer vision,” J. Math. Imag. Vis., vol. 42, nos. 2–3, pp. 134–149, Feb. 2012.
[8] P. Sussner, R. Miyasaki, and M. E. Valle, “An introduction to parameterized IFAM models with applications in prediction,” in Proc. IFSA World Congr. EUSFLAT Conf., Lisbon, Portugal, Jul. 2009, pp. 247–252.
[9] H. Markert, U. Kaufmann, Z. K. Kayikci, and G. Palm, “Neural associative memories for the integration of language, vision and action in an autonomous agent,” Neural Netw., vol. 22, no. 2, pp. 134–143, Mar. 2009.
[10] J. J. Hopfield, “Neural networks and physical systems with emergent collective computational abilities,” Proc. Nat. Acad. Sci. USA, vol. 79, pp. 2554–2558, Apr. 1982.
[11] M. H. Hassoun, Ed., Associative Neural Memories: Theory and Implementation. Oxford, U.K.: Oxford Univ. Press, 1993.
[12] M. H. Hassoun, Fundamentals of Artificial Neural Networks. Cambridge, MA, USA: MIT Press, 1995.
[13] R. J. McEliece, E. C. Posner, E. R. Rodemich, and S. S. Venkatesh, “The capacity of the Hopfield associative memory,” IEEE Trans. Inf. Theory, vol. 33, no. 4, pp. 461–482, Jul. 1987.
[14] T.-D. Chiueh and R. M. Goodman, “Recurrent correlation associative memories,” IEEE Trans. Neural Netw., vol. 2, no. 2, pp. 275–284, Mar. 1991.
[15] E. R. Hancock and M. Pelillo, “A Bayesian interpretation for the exponential correlation associative memory,” Pattern Recognit. Lett., vol. 19, no. 2, pp. 149–159, Feb. 1998.
[16] R. Perfetti and E. Ricci, “Recurrent correlation associative memories: A feature space perspective,” IEEE Trans. Neural Netw., vol. 19, no. 2, pp. 333–345, Feb. 2008.
[17] C. García and J. A. Moreno, “The Hopfield associative memory network: Improving performance with the kernel ‘trick’,” in Proc. 9th IBERAMIA, vol. 3315, Nov. 2004, pp. 871–880.
[18] A. Hirose, Complex-Valued Neural Networks (Studies in Computational Intelligence), 2nd ed. Heidelberg, Germany: Springer-Verlag, 2012.
[19] S. Jankowski, A. Lozowski, and J. M. Zurada, “Complex-valued multistate neural associative memory,” IEEE Trans. Neural Netw., vol. 7, no. 6, pp. 1491–1496, Nov. 1996.
[20] M. K. Müezzinoğlu, C. Güzeliş, and J. M. Zurada, “A new design method for the complex-valued multistate Hopfield associative memory,” IEEE Trans. Neural Netw., vol. 14, no. 4, pp. 891–899, Jul. 2003.
[21] D.-L. Lee, “Improvements of complex-valued Hopfield associative memory by using generalized projection rules,” IEEE Trans. Neural Netw., vol. 17, no. 5, pp. 1341–1347, Sep. 2006.
[22] Y. Kuroe, M. Yoshid, and T. Mori, “On activation functions for complex-valued neural networks—Existence of energy functions,” in Artificial Neural Networks and Neural Information Processing—ICANN/ICONIP (Lecture Notes in Computer Science), vol. 2714, O. Kaynak, E. Alpaydin, E. Oja, and L. Xu, Eds. Berlin, Germany: Springer-Verlag, 2003, pp. 985–992.
[23] Y. Kuroe and Y. Taniguchi, “Models of self-correlation type complex-valued associative memories and their dynamics,” in Artificial Neural Networks: Biological Inspirations—ICANN (Lecture Notes in Computer Science), vol. 3696, W. Duch, J. Kacprzyk, E. Oja, and S. Zadrozny, Eds. Berlin, Germany: Springer-Verlag, 2005, pp. 185–192.
[24] G. Tanaka and K. Aihara, “Complex-valued multistate associative memory with nonlinear multilevel functions for gray-level image reconstruction,” IEEE Trans. Neural Netw., vol. 20, no. 9, pp. 1463–1473, Sep. 2009.
[25] Y. Suzuki, M. Kitahara, and M. Kobayashi, “Dynamic complex-valued associative memory with strong bias terms,” in Neural Information Processing (Lecture Notes in Computer Science), vol. 7062, B.-L. Lu, L. Zhang, and J. Kwok, Eds. Berlin, Germany: Springer-Verlag, 2011, pp. 509–518.
[26] M. E. Valle, “An introduction to complex-valued recurrent correlation neural networks,” in Proc. IEEE World Conf. Comput. Intell. (WCCI), Beijing, China, Jul. 2014.
[27] M. W. Hirsch, “Dynamical systems,” in Mathematical Perspectives on Neural Networks, P. Smolensky, M. Mozer, and D. Rumelhart, Eds. Mahwah, NJ, USA: Lawrence Erlbaum Associates, 1996.
[28] R. C. Wilson and E. R. Hancock, “Storage capacity of the exponential correlation associative memory,” Neural Process. Lett., vol. 13, no. 1, pp. 71–80, Feb. 2001.
[29] R. C. Wilson and E. R. Hancock, “A study of pattern recovery in recurrent correlation associative memories,” IEEE Trans. Neural Netw., vol. 14, no. 3, pp. 506–519, May 2003.
[30] T. Chiueh and R. Goodman, Recurrent Correlation Associative Memories and Their VLSI Implementation. Oxford, U.K.: Oxford Univ. Press, 1993, ch. 16, pp. 276–287.
[31] A. Dembo and O. Zeitouni, “General potential surfaces and neural networks,” Phys. Rev. A, vol. 37, no. 6, pp. 2134–2143, 1988.
[32] D. Psaltis and C. H. Park, “Nonlinear discriminant functions and associative memories,” Physica D, vol. 22, pp. 370–375, Aug. 1986.
[33] I. Aizenberg, C. Moraga, and D. Paliy, “A feedforward neural network based on multi-valued neurons,” in Computational Intelligence, Theory and Applications (Advances in Soft Computing), vol. 33, B. Reusch, Ed. Berlin, Germany: Springer-Verlag, 2005, pp. 599–612.
[34] B.-L. Zhang, H. Zhang, and S. S. Ge, “Face recognition by applying wavelet subband representation and kernel associative memory,” IEEE Trans. Neural Netw., vol. 15, no. 1, pp. 166–177, Jan. 2004.
[35] M. E. Valle, “A robust subspace projection autoassociative memory based on the M-estimation method,” IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 7, pp. 1372–1377, Jul. 2014.

Marcos Eduardo Valle received the Ph.D. degree in applied mathematics from the University of Campinas, Campinas, Brazil, in 2007.

He was with the University of Londrina, Londrina, Brazil, from 2008 to 2013. He is currently an Assistant Professor with the Department of Applied Mathematics, University of Campinas. His current research interests include fuzzy set theory, lattice theory, neural networks, mathematical morphology, and pattern recovery.