

Morphological bidirectional associative memories

G.X. Ritter a,*, J.L. Diaz-de-Leon b, P. Sussner c

a Department of Computer and Information Science and Engineering, University of Florida, CSE Building, Gainesville, FL 32611-6120, USA
b Instituto Politecnico Nacional de Mexico, C.P. 07738, Mexico City, Mexico
c State University of Campinas, 13083-970 Campinas, S.P., Brazil

Received 27 January 1997; received in revised form 6 April 1999; accepted 6 April 1999

Abstract

The theory of artificial neural networks has been successfully applied to a wide variety of pattern recognition problems. In this theory, the first step in computing the next state of a neuron or in performing the next layer neural network computation involves the linear operation of multiplying neural values by their synaptic strengths and adding the results. Thresholding usually follows the linear operation in order to provide for nonlinearity of the network. In this paper we discuss a novel class of artificial neural networks, called morphological neural networks, in which the operations of multiplication and addition are replaced by addition and maximum (or minimum), respectively. By taking the maximum (or minimum) of sums instead of the sum of products, morphological network computation is nonlinear before thresholding. As a consequence, the properties of morphological neural networks are drastically different from those of traditional neural network models. The main emphasis of the research presented here is on morphological bidirectional associative memories (MBAMs). In particular, we establish a mathematical theory for MBAMs and provide conditions that guarantee perfect bidirectional recall for corrupted patterns. Some examples that illustrate performance differences between the morphological model and the traditional semilinear model are also given. © 1999 Elsevier Science Ltd. All rights reserved.

Keywords: Associative memories; Bidirectional associative memories; Morphological neural networks; Morphological associative memories

1. Introduction

The concept of morphological neural networks grew out of the theory of image algebra (Ritter, 1991, 1994; Ritter & Wilson, 1996; Ritter, Wilson & Davidson, 1990). It was shown that a subalgebra of image algebra includes the mathematical formulations of currently popular neural network models (Ritter, 1991; Ritter, Li & Wilson, 1989), and first attempts at formulating useful morphological neural networks appeared in Davidson and Ritter (1990) and Ritter and Davidson (1990). Since then, only a few papers involving morphological neural networks have appeared. Davidson employed morphological neural networks in order to solve template identification and target classification problems (Davidson, 1992; Davidson & Hummer, 1993; Davidson & Srivastava, 1993; Davidson & Talukder, 1993). Suarez-Araujo applied morphological neural networks to compute homothetic auditory and visual invariances (Suarez-Araujo, 1997; Suarez-Araujo & Ritter, 1992).

Another interesting network, consisting of a hybrid morphological net and a classical feedforward network used for feature extraction, was described in Gader, Won and Khabou (1994), Gader, Miramonti, Won and Coffield (1995), Won and Gader (1995), and Won, Gader and Coffield (1997). These researchers devised multilayer morphological neural networks for very specialized applications. A more comprehensive and rigorous basis for computing with morphological neural networks appeared in Ritter and Sussner (1996), where it was shown that any conventional type of computational problem can be solved using morphological neural networks. An extensive foundation of morphological associative memories was given by Ritter, Sussner and Diaz-de-Leon (1998), where it was proved that morphological auto-associative memories have unlimited storage capacity and provide perfect recall for noncorrupted input. In this paper we restrict our attention to morphological bidirectional associative memories, or MBAMs, which constitute a class of morphological networks not previously discussed.

We need to remark that an entirely different model of a morphological network was presented by Wilson (1989). This particular model uses the usual operations of multiplication and summation at each node, which is fundamentally different from the models presented here.


* Corresponding author. Tel.: +1-352-392-1212; fax: +1-352-392-1220.

E-mail address: [email protected] (G.X. Ritter).

The model presented here is connected with the fundamental question concerning the difference between biological neural networks and artificial neural networks: Is the strength of the electric potential of a signal traveling along an axon the result of a multiplicative process and does the mechanism of the post-synaptic membrane of a neuron add the various potentials of electrical impulses, or is the strength of the electric potential an additive process and does the postsynaptic membrane only accept signals of a certain maximum strength? A positive answer to the latter query would provide a strong biological basis for morphological neural networks.

2. Computational basis for morphological neural networks

In recent years, lattice-based matrix operations have found widespread application in the engineering sciences. In these applications, the usual matrix operations of addition and multiplication are replaced by corresponding lattice operations. Lattice-induced matrix operations lead to an entirely different perspective on a class of nonlinear transformations. These ideas were applied to communications networks (Shimbel, 1954) and to machine scheduling (Cuninghame-Green, 1960, 1962; Giffler, 1960). Others have discussed their usefulness in applications to shortest path problems in graphs (Backhouse & Carré, 1975; Carré, 1971; Benzaken, 1968; Peteanu, 1967). Additional examples are given by Cuninghame-Green (1979), primarily in the field of operations research. Applications to image processing were first developed by Davidson (1989) and Ritter and Davidson (1987). This paper presents a continuation of these developments in that we take lattice-based operations as the basic computational model for artificial neural networks.

Artificial neural network models are specified by the network topology, node characteristics, and training or learning rules. The underlying algebraic system used in these models is the set of real numbers ℝ together with the operations of addition and multiplication and the laws governing these operations. This algebraic system, known as a ring, is commonly denoted by (ℝ, +, ×). The two basic equations governing the theory of computation in the standard neural network model are:

τ_i(t + 1) = Σ_{j=1}^{n} a_j(t) w_ij    (1)

and

a_i(t + 1) = f(τ_i(t + 1) − θ_i),    (2)

where a_j(t) denotes the value of the jth neuron at time t, n represents the number of neurons in the network, w_ij the synaptic connectivity value between the ith neuron and the jth neuron, τ_i(t + 1) the next total input effect on the ith neuron, θ_i a threshold, and f the next state function, which usually introduces a nonlinearity into the network. Although not all current network models can be precisely described by these two equations, they can nevertheless be viewed as variations of these.

Note that the computations represented by Eq. (1) are based on the operations of the algebraic structure (ℝ, +, ×). The basic computations occurring in the proposed morphological network are based on the semi-ring structures (ℝ_{−∞}, ∨, +) and (ℝ_{∞}, ∧, +′), where ℝ_{−∞} and ℝ_{∞} denote the extended real number systems ℝ_{−∞} = ℝ ∪ {−∞} and ℝ_{∞} = ℝ ∪ {∞}. The basic arithmetic and logic operations in the extended real number systems are as follows. The symbol + denotes the usual addition with the additional stipulation that a + (−∞) = (−∞) + a = −∞ for all a ∈ ℝ_{−∞}. The symbol +′ denotes the dual of addition and is defined by a +′ b = a + b whenever a, b ∈ ℝ, and a +′ ∞ = ∞ +′ a = ∞ for all a ∈ ℝ_{∞}. The symbols ∨ and ∧ denote the binary operations of maximum and minimum, respectively, with the additional stipulation that a ∨ (−∞) = (−∞) ∨ a = a for all a ∈ ℝ_{−∞} and a ∧ ∞ = ∞ ∧ a = a for all a ∈ ℝ_{∞}. Note that the symbol −∞ acts like a zero element in the system (ℝ_{−∞}, ∨, +) if one views ∨ as addition and + as multiplication. Similar comments hold for the symbol ∞ in the system (ℝ_{∞}, ∧, +′). Also, the role of the multiplicative identity in the structure (ℝ, +, ×) is played by the number 1, i.e. 1·a = a·1 = a for all a ∈ ℝ. In the structures (ℝ_{−∞}, ∨, +) and (ℝ_{∞}, ∧, +′), this role is played by the number 0 since 0 + a = a + 0 = a for all a ∈ ℝ.

Using the structure (ℝ_{−∞}, ∨, +), the two basic equations underlying the theory of computation in the morphological neural network model are given by:

τ_i(t + 1) = ⋁_{j=1}^{n} (a_j(t) + w_ij)    (3)

and

a_i(t + 1) = f(τ_i(t + 1) − θ_i).    (4)

Observe that Eqs. (2) and (4) are identical. Thus, the difference between the classical models and the morphological model is the computation of the next total input effect on the ith neuron, which is given by:

⋁_{j=1}^{n} (a_j(t) + w_ij) = (a_1(t) + w_i1) ∨ (a_2(t) + w_i2) ∨ … ∨ (a_n(t) + w_in).    (5)

Using the dual structure (ℝ_{∞}, ∧, +′) instead, Eq. (3) needs to be replaced by:

τ_i(t + 1) = ⋀_{j=1}^{n} (a_j(t) +′ w_ij).    (6)

Eqs. (3) and (6) represent the basic operations of a dilation and an erosion that form the foundation of mathematical morphology (Serra, 1982). Hence the reason for calling neural network models based on the operations defined by Eqs. (3) and (6) morphological neural networks.
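To make the distinction between Eq. (1) and Eqs. (3) and (6) concrete, here is a minimal NumPy sketch (ours, not from the paper) that computes the classical weighted-sum input alongside the max-of-sums and min-of-sums inputs for finite real weights; the function names are illustrative only.

```python
import numpy as np

def classical_input(W, a):
    """Eq. (1): tau_i = sum_j a_j * w_ij (ordinary matrix-vector product)."""
    return W @ a

def dilation_input(W, a):
    """Eq. (3): tau_i = max_j (a_j + w_ij), computed in the semi-ring (R_-inf, v, +)."""
    return np.max(W + a[None, :], axis=1)

def erosion_input(W, a):
    """Eq. (6): tau_i = min_j (a_j + w_ij), computed in the dual semi-ring (R_inf, ^, +')."""
    return np.min(W + a[None, :], axis=1)

# Small illustration with arbitrary weights and activations.
W = np.array([[0.0, 3.0, 3.0],
              [1.0, 1.0, 3.0],
              [0.0, 3.0, 4.0]])
a = np.array([0.0, -2.0, -4.0])
print(classical_input(W, a))  # linear combination of the a_j
print(dilation_input(W, a))   # nonlinear before any thresholding
print(erosion_input(W, a))
```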


Let v′ denote the transpose of the vector v. The total network computation resulting from Eq. (1) can be expressed in matrix form as:

τ(t + 1) = W × a(t),    (7)

where a(t) = (a_1(t), …, a_n(t))′, τ(t + 1) = (τ_1(t + 1), …, τ_n(t + 1))′, and W denotes the n × n synaptic weight matrix whose i,jth entry is w_ij.

Analogous to Eq. (7), the total morphological network computation resulting from Eq. (3) (or Eq. (6)) can also be expressed in matrix form. In order to do this, we need to define a matrix product in terms of the operations of the semi-ring (ℝ_{−∞}, ∨, +). For an m × p matrix A and a p × n matrix B with entries from ℝ_{−∞}, the matrix product C = A ⊽ B, also called the max product of A and B, is defined by:

c_ij = ⋁_{k=1}^{p} (a_ik + b_kj) = (a_i1 + b_1j) ∨ (a_i2 + b_2j) ∨ … ∨ (a_ip + b_pj).    (8)

The min product of A and B induced by (ℝ_{∞}, ∧, +′) is defined in a similar fashion. Specifically, the i,jth entry of C = A ⊼ B is given by:

c_ij = ⋀_{k=1}^{p} (a_ik +′ b_kj) = (a_i1 +′ b_1j) ∧ (a_i2 +′ b_2j) ∧ … ∧ (a_ip +′ b_pj).    (9)

The total network computation resulting from Eqs. (3) and (6) can now be expressed in the matrix forms:

τ(t + 1) = W ⊽ a(t)    (10)

and

τ(t + 1) = W ⊼ a(t),    (11)

respectively.
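The max and min products of Eqs. (8) and (9) can be written compactly with broadcasting; the following sketch assumes finite real entries (so the ±∞ conventions never arise) and uses function names of our own choosing.

```python
import numpy as np

def max_product(A, B):
    """Max product (Eq. 8): C[i, j] = max_k (A[i, k] + B[k, j])."""
    # Broadcast to shape (m, p, n), add, then take the maximum over k.
    return np.max(A[:, :, None] + B[None, :, :], axis=1)

def min_product(A, B):
    """Min product (Eq. 9): C[i, j] = min_k (A[i, k] + B[k, j])."""
    return np.min(A[:, :, None] + B[None, :, :], axis=1)

# Eqs. (10) and (11): total morphological network computation for a state vector a.
W = np.array([[0.0, 2.0], [-1.0, 3.0]])
a = np.array([[1.0], [4.0]])          # column vector stored as a 2x1 array
print(max_product(W, a).ravel())      # max-of-sums per neuron
print(min_product(W, a).ravel())      # min-of-sums per neuron
```

For column vectors, the broadcasting reduces to the expansion shown in Eq. (5).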

Some additional comments concerning lattice-based operations are pertinent when discussing morphological network computations. When using the lattice (ℝ_{−∞}, ∨, +), the maximum of two matrices replaces the usual matrix addition of linear algebra. Here the i,jth entry of the matrix C = A ∨ B is given by c_ij = a_ij ∨ b_ij. Similarly, the minimum of two matrices C = A ∧ B, which is used when employing the structure (ℝ_{∞}, ∧, +′), is defined by c_ij = a_ij ∧ b_ij. Also, given two m × n matrices A and B, we say that A is less than or equal to B, written A ≤ B, if and only if a_ij ≤ b_ij for each i = 1, …, m and j = 1, …, n.

The structures (ℝ_{−∞}, ∨, +) and (ℝ_{∞}, ∧, +′) can be combined into one cohesive algebraic system (ℝ_{±∞}, ∨, ∧, +, +′), called a bounded ℓ-group. Here ℝ_{±∞} = ℝ ∪ {−∞, ∞} and the operations of maximum and minimum extend intuitively to the appended symbols of infinity, namely:

∞ ∨ r = r ∨ ∞ = ∞  ∀ r ∈ ℝ_{±∞}    (12)
∞ ∧ r = r ∧ ∞ = r  ∀ r ∈ ℝ_{±∞}.

The dual additive operations + and +′ are identical when adding real numbers, i.e.

a +′ b = a + b  ∀ a, b ∈ ℝ.    (13)

However, they introduce an asymmetry between −∞ and +∞ when properly extended to the set ℝ_{±∞}. Specifically, we define:

a + (−∞) = (−∞) + a = −∞  ∀ a ∈ ℝ_{−∞}    (14)
a + ∞ = ∞ + a = ∞  ∀ a ∈ ℝ_{∞}
a +′ (−∞) = (−∞) +′ a = −∞  ∀ a ∈ ℝ_{−∞}
a +′ ∞ = ∞ +′ a = ∞  ∀ a ∈ ℝ_{∞}
(−∞) + ∞ = ∞ + (−∞) = −∞
(−∞) +′ ∞ = ∞ +′ (−∞) = ∞.

The algebraic structure of ℝ_{±∞} provides for an elegant duality between matrix operations. If r ∈ ℝ_{±∞}, then the additive conjugate of r is the unique element r* defined by:

r* = −r if r ∈ ℝ,  r* = −∞ if r = +∞,  r* = +∞ if r = −∞.    (15)

Here, −r is the inverse of r under the group operation +. Therefore, (r*)* = r. This gives the following relation for all r, u in ℝ_{±∞}:

r ∧ u = (r* ∨ u*)*.    (16)

If A = (a_ij)_{m×n} is an m × n matrix with a_ij ∈ ℝ_{±∞}, then the conjugate matrix A* of A is the n × m matrix A* = (b_ij)_{n×m} defined by b_ij = (a_ji)*, where (a_ji)* is the additive conjugate of a_ji as defined above. It follows that:

A ∧ B = (A* ∨ B*)*    (17)

and

A ⊼ B = (B* ⊽ A*)*    (18)

for appropriately sized matrices. This implies that a morphological neural net using the ⊽ operation (i.e. Eq. (10)) can always be reformulated in terms of the ⊼ operation (i.e. Eq. (11)), and vice versa, by using the duality relations expressed in Eqs. (17) and (18).
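As a quick numerical illustration of the duality relation in Eq. (18), the following sketch (ours) checks it on random finite matrices, reusing the product helpers sketched above.

```python
import numpy as np

def max_product(A, B):
    return np.max(A[:, :, None] + B[None, :, :], axis=1)

def min_product(A, B):
    return np.min(A[:, :, None] + B[None, :, :], axis=1)

def conjugate(A):
    """Eq. (15) applied entrywise to finite entries: A* is the negated transpose."""
    return -A.T

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(3, 4)).astype(float)
B = rng.integers(-5, 5, size=(4, 2)).astype(float)

lhs = min_product(A, B)                                    # A min-product B
rhs = conjugate(max_product(conjugate(B), conjugate(A)))   # (B* max-product A*)*
print(np.array_equal(lhs, rhs))                            # True, illustrating Eq. (18)
```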

Having defined the necessary mathematical tools, we are now in a position to discuss some basic properties of morphological neural networks and present some examples.


3. Storage of data pairs in associative memories

In classical neural network theory, for a given input vector or key x = (x_1, …, x_n)′, an associative memory M recalls a vector output signal f(y), where y = M × x. If f(y_i) = y_i ∀ i, then f(y) = y and the memory is called a linear associative memory; otherwise it is referred to as a semi-linear associative memory. Generally, the function f is a nonlinear threshold function, usually a step function or a sigmoidal function.

A basic question concerning associative memories is: what is the simplest way to store k vector pairs (x^1, y^1), …, (x^k, y^k), where x^ξ ∈ ℝ^n and y^ξ ∈ ℝ^m for ξ = 1, …, k, in an m × n memory M? The well-known answer to this question is to convert each association (x^ξ, y^ξ) into an m × n association matrix M_ξ = y^ξ × (x^ξ)′ and then combine all the association matrices pointwise by using matrix addition:

M = M_1 + M_2 + … + M_k = Σ_{ξ=1}^{k} y^ξ × (x^ξ)′.    (19)

The i,jth entry of M is, therefore, m_ij = Σ_{ξ=1}^{k} y_i^ξ x_j^ξ. This is the familiar storage method used in the theory of linear and semi-linear associative memories (Kohonen, 1972; Kohonen, 1984; Ritz, Anderson, Silverstein & Jones, 1977). If the input patterns x^1, …, x^k are orthonormal, that is:

(x^ξ)′ × x^γ = 1 if ξ = γ, and 0 if ξ ≠ γ,    (20)

then

M × x^γ = (y^1 × (x^1)′ + … + y^k × (x^k)′) × x^γ = y^γ.    (21)

Thus, we have perfect recall of the output patterns y^1, …, y^k. If the vectors x^1, …, x^k are not orthonormal (which is the case in most realistic situations), then the term

N = Σ_{ξ≠γ} y^ξ × ((x^ξ)′ × x^γ) ≠ 0    (22)

is called the noise term and contributes cross talk to the recalled pattern by additively modulating the signal term. Therefore, filtering processes using threshold functions become necessary in order to retrieve the desired output pattern.

Morphological associative memories, which are based on the lattice algebra described in the preceding section, are surprisingly similar to these classical associative memories. Each association (x^ξ, y^ξ) is converted into an m × n association matrix M_ξ = y^ξ ⊽ (−x^ξ)′ and all the association matrices are combined by using the pointwise maximum:

M = M_1 ∨ M_2 ∨ … ∨ M_k = ⋁_{ξ=1}^{k} y^ξ ⊽ (−x^ξ)′.    (23)

The i,jth entry of M is now given by m_ij = ⋁_{ξ=1}^{k} (y_i^ξ − x_j^ξ).

Example 1. Let

x^1 = (0, 0, 0)′,  y^1 = (0, 1, 0)′,
x^2 = (0, −2, −4)′,  y^2 = (−1, −1, 0)′,    (24)
x^3 = (0, −3, 0)′,  y^3 = (0, −2, 0)′.

According to Eq. (23), the memory M is given by:

M = ⋁_{ξ=1}^{3} (y^ξ ⊽ (−x^ξ)′)
  = [0 0 0; 1 1 1; 0 0 0] ∨ [−1 1 3; −1 1 3; 0 2 4] ∨ [0 3 0; −2 1 −2; 0 3 0]
  = [0 3 3; 1 1 3; 0 3 4],    (25)

where a matrix is written row by row, with rows separated by semicolons. It can be easily verified that M ⊼ x^ξ = y^ξ holds for ξ = 1, 2, 3. For example,

M ⊼ x^2 = [0 3 3; 1 1 3; 0 3 4] ⊼ (0, −2, −4)′ = (−1, −1, 0)′ = y^2.    (26)
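The memory of Example 1 can be reproduced directly from Eq. (23); the sketch below (ours) builds M as the pointwise maximum of the differences y_i^ξ − x_j^ξ and then performs the min-product recall, reusing the min_product helper sketched earlier.

```python
import numpy as np

def min_product(A, B):
    return np.min(A[:, :, None] + B[None, :, :], axis=1)

X = np.array([[0, 0, 0], [0, -2, -4], [0, -3, 0]], dtype=float)  # rows are x^1, x^2, x^3
Y = np.array([[0, 1, 0], [-1, -1, 0], [0, -2, 0]], dtype=float)  # rows are y^1, y^2, y^3

# Eq. (23): m_ij = max over patterns of (y_i - x_j).
M = np.max(Y.T[:, None, :] - X.T[None, :, :], axis=2)
print(M)  # [[0 3 3], [1 1 3], [0 3 4]], as in Eq. (25)

for x, y in zip(X, Y):
    recalled = min_product(M, x[:, None]).ravel()
    print(recalled, np.array_equal(recalled, y))
```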

The above example provides a perfect recall memory for the associated pairs (x^1, y^1), (x^2, y^2), and (x^3, y^3). This brings up the obvious question as to necessary and sufficient conditions for the assurance of perfect recall. Specifically, for what vector pairs (x^1, y^1), (x^2, y^2), …, (x^k, y^k) will the memory M given by Eq. (23) provide perfect recall? This question is answered by the following theorem:

Theorem 1. M ⊼ x^ξ = y^ξ ∀ ξ = 1, …, k if and only if for each row index i = 1, …, m and each ξ ∈ {1, …, k} there exist column indices j_i^ξ ∈ {1, …, n} such that:

m_{i j_i^ξ} = y_i^ξ − x_{j_i^ξ}^ξ  ∀ ξ = 1, …, k.    (27)

We use the notation j_i^ξ to denote the fact that the column index j depends on the particular indices ξ and i.


Example 2. The memory M given in Example 1 satisfies the conditions of Theorem 1. For example, for ξ = 2 we have:

m_13 = 3 = −1 + 4 = y_1^2 − x_3^2,    (28)
m_22 = 1 = −1 + 2 = y_2^2 − x_2^2,
m_33 = 4 = 0 + 4 = y_3^2 − x_3^2.

In view of this example and Theorem 1, it is obvious that Theorem 1 provides a simple algorithmic method for verifying whether or not a memory is a perfect recall memory. An extensive discussion of the performance capabilities of morphological associative memories can be found in Ritter et al. (1998).
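The algorithmic check suggested here is easy to code; the sketch below (our illustration, not the authors' implementation) tests, for the memory M of Eq. (23), whether every row i and pattern ξ admit a column j with m_ij = y_i^ξ − x_j^ξ.

```python
import numpy as np

def perfect_recall_condition(M, X, Y):
    """Theorem 1 check for the memory M of Eq. (23): for every row i and every
    stored pattern there must be a column j with M[i, j] == y[i] - x[j]."""
    for x, y in zip(X, Y):                 # X, Y hold the patterns as rows
        diff = y[:, None] - x[None, :]     # diff[i, j] = y_i - x_j
        if not np.all(np.any(np.isclose(M, diff), axis=1)):
            return False
    return True

# Data of Example 1.
X = np.array([[0, 0, 0], [0, -2, -4], [0, -3, 0]], dtype=float)
Y = np.array([[0, 1, 0], [-1, -1, 0], [0, -2, 0]], dtype=float)
M = np.max(Y.T[:, None, :] - X.T[None, :, :], axis=2)
print(perfect_recall_condition(M, X, Y))   # True for Example 1
```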

The following corollary is an easy consequence of Theorem 1 and provides for another interpretation of the theorem:

Corollary 1.1. M ⊼ x^ξ = y^ξ ∀ ξ = 1, …, k if and only if for each row index i = 1, …, m and each γ ∈ {1, …, k} there exists a column index j_i^γ ∈ {1, …, n} such that:

x_{j_i^γ}^γ = ⋀_{ξ=1}^{k} (x_{j_i^γ}^ξ − y_i^ξ) + y_i^γ.    (29)

Of course, due to the duality relationships mentioned in the previous section, the theorems discussed in this section could just as well have been cast in terms of the max product ⊽. More specifically, in the lattice algebra (ℝ_{−∞}, ∨, +) the equivalent associative memory W is given by:

W = W_1 ∧ W_2 ∧ … ∧ W_k = ⋀_{ξ=1}^{k} (y^ξ ⊼ (−x^ξ)′),    (30)

where

W_ξ = y^ξ ⊼ (−x^ξ)′ and w_ij = ⋀_{ξ=1}^{k} (y_i^ξ − x_j^ξ).    (31)

Example 3. Consider the associations (x^ξ, y^ξ), ξ = 1, 2, 3, given in Example 1. According to Eq. (30), the memory W is given by:

W = ⋀_{ξ=1}^{3} (y^ξ ⊼ (−x^ξ)′)
  = [0 0 0; 1 1 1; 0 0 0] ∧ [−1 1 3; −1 1 3; 0 2 4] ∧ [0 3 0; −2 1 −2; 0 3 0]
  = [−1 0 0; −2 1 −2; 0 0 0].    (32)

It can be easily verified that W ⊽ x^ξ = y^ξ holds for ξ = 1, 2, 3. For example,

W ⊽ x^1 = [−1 0 0; −2 1 −2; 0 0 0] ⊽ (0, 0, 0)′ = (0, 1, 0)′ = y^1.    (33)

Employing the memory W, we obtain the following analogues of Theorem 1 and its corollary:

Theorem 2. W ⊽ x^ξ = y^ξ ∀ ξ = 1, …, k if and only if for each row index i = 1, …, m there exist column indices j_i^ξ ∈ {1, …, n} such that:

w_{i j_i^ξ} = y_i^ξ − x_{j_i^ξ}^ξ  ∀ ξ = 1, …, k.    (34)

Corollary 2.1. W ⊽ x^ξ = y^ξ ∀ ξ = 1, …, k if and only if for each row index i = 1, …, m and each γ ∈ {1, …, k} there exists a column index j_i^γ ∈ {1, …, n} such that:

x_{j_i^γ}^γ = ⋁_{ξ=1}^{k} (x_{j_i^γ}^ξ − y_i^ξ) + y_i^γ.    (35)

It was shown by Ritter et al. (1998) that W is optimal in the following sense: given an arbitrary memory A = (a_ij)_{m×n} with the property that A ⊽ x^ξ = y^ξ for ξ = 1, …, k, then:

A ≤ W and W ⊽ x^ξ = y^ξ    (36)

for ξ = 1, …, k. This means that if in the matrix algebra based on the lattice algebra (ℝ_{−∞}, ∨, +) there exist perfect recall memories for the associations (x^1, y^1), (x^2, y^2), …, (x^k, y^k), then W is a perfect recall memory and W is the least upper bound of all possible perfect recall memories.

By interchanging maxima and minima, the same argument shows that if B = (b_ij)_{m×n} is any matrix with the property that B ⊼ x^ξ = y^ξ for ξ = 1, …, k, then:

M ≤ B and M ⊼ x^ξ = y^ξ    (37)

for ξ = 1, …, k. Hence, in the matrix algebra based on the lattice algebra (ℝ_{∞}, ∧, +′), M is a perfect recall memory and M is also the greatest lower bound of all possible perfect recall memories for the association pairs (x^1, y^1), (x^2, y^2), …, (x^k, y^k).

4. Bidirectional associative memories

Suppose we wish to synchronously feed back the output y^ξ_1 = f(M × x^ξ) to an associative memory in order to improve recall accuracy. In classical bidirectional associative memory (BAM) theory, as developed by Kosko and later generalized by Grossberg (Kosko, 1987a; Grossberg, 1988), the simplest feedback scheme is to pass y^ξ_1 backwards through M′, where M′ denotes the transpose of M. If the resultant pattern x^ξ_1 = f(M′ × y^ξ_1) is fed back through M, a new pattern vector y^ξ_2 results, which can be fed back through M′ to produce x^ξ_2, and so on. This back-and-forth flow will quickly resonate to a fixed data pair (x^ξ_F, y^ξ_F) (Kosko, 1987a). However, classical BAMs, being generalizations of the Hopfield net, have limitations that are similar to those of the Hopfield net. The number of associations that can be programmed into the memory and effectively recalled is very limited. The storage capacity for reliable recall has to be significantly less than min(m, n) (Kosko, 1987a,b; Ritter & Wilson, 1996). Also, BAMs have the tendency to converge to the wrong association pair if components of two association pairs have too many features in common. Thus, it is often likely that at convergence (x^ξ_F, y^ξ_F) ≠ (x^ξ, y^ξ). As some later examples will demonstrate, in many cases MBAMs allow for heavy overlap of features. Additionally, the notion of kernel patterns discussed in Section 5 may provide for large increases in storage capacity and robust recall. In fairness it must be pointed out that numerous advances and improvements in encoding–recalling capabilities for traditional BAMs have occurred since the publication of Kosko's original paper on the subject (Oh & Kothari, 1994; Shi, Zhao & Zhuang, 1998; Simpson, 1990; Wang, 1996; Wang, Cruz & Mulligan, 1990, 1991; Wang & Don, 1995; Xu, 1994; Zhuang, Huang & Chen, 1993). Thus, a rigorous comparative analysis of enhanced BAMs and MBAMs is needed to provide for a fair evaluation of the advantages and disadvantages of these two distinct approaches to bidirectional recall.

As in the case of associative memories, in appearance MBAMs are surprisingly similar to classical BAMs. However, there are drastic differences in behavior. In the feedback scheme, MBAMs employ the conjugate or dual memory M* of M, where M denotes the memory given by Eq. (23). The i,jth entry of M* is given by:

m*_ij = −m_ji = −⋁_{ξ=1}^{k} (y_j^ξ − x_i^ξ) = ⋀_{ξ=1}^{k} (x_i^ξ − y_j^ξ),    (38)

where i = 1, …, n and j = 1, …, m. The dual memory M* associates with each input pattern vector y^ξ the output pattern vector x^ξ provided that certain conditions are satisfied.
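In code, Eq. (38) amounts to negating the transpose of M; a brief sketch (ours) is:

```python
import numpy as np

def dual_memory(M):
    """Eq. (38): the conjugate memory M* with entries m*_ij = -m_ji."""
    return -M.T

M = np.array([[0, 3, 3], [1, 1, 3], [0, 3, 4]], dtype=float)  # memory of Example 1
print(dual_memory(M))  # [[0 -1 0], [-3 -1 -3], [-3 -3 -4]], the matrix of Example 4 below
```

Equivalently, m*_ij = ⋀_ξ (x_i^ξ − y_j^ξ), i.e. M* is the W-type memory that stores the reversed associations (y^ξ, x^ξ) (cf. Eq. (61)).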

Example 4. If M is as in Example 1, then:

M* = [0 −1 0; −3 −1 −3; −3 −3 −4]    (39)

and it can be easily shown that M* ⊽ y^ξ = x^ξ for ξ = 1, 2, 3.

In the previous section we noted that the memory W defined by Eq. (30) satisfies the least upper bound optimality criterion for perfect recall memories in the algebra based on the lattice structure (ℝ_{−∞}, ∨, +). By interchanging x^ξ with y^ξ and vice versa, we have the same criteria satisfied by M*. In particular, given an arbitrary memory A = (a_ij)_{n×m} with the property that A ⊽ y^ξ = x^ξ for ξ = 1, …, k, then:

A ≤ M* and M* ⊽ y^ξ = x^ξ    (40)

for ξ = 1, …, k. Thus, if in the matrix algebra based on the lattice algebra (ℝ_{−∞}, ∨, +) there exist perfect recall memories for the associate pairs (y^1, x^1), (y^2, x^2), …, (y^k, x^k), then M* is also a perfect recall memory and M* is the least upper bound of all possible perfect recall memories.

Theorem 1 and Corollary 1.1 have the following dual analogues:

Theorem 3. M* ⊽ y^ξ = x^ξ ∀ ξ = 1, …, k if and only if for each row index i = 1, …, n and each ξ ∈ {1, …, k} there exist column indices j_i^ξ ∈ {1, …, m} such that:

m*_{i j_i^ξ} = x_i^ξ − y_{j_i^ξ}^ξ.    (41)

Corollary 3.1. M* ⊽ y^ξ = x^ξ ∀ ξ = 1, …, k if and only if for each row index i = 1, …, n and each γ ∈ {1, …, k} there exists a column index j_i^γ ∈ {1, …, m} such that:

y_{j_i^γ}^γ = ⋁_{ξ=1}^{k} (y_{j_i^γ}^ξ − x_i^ξ) + x_i^γ.    (42)

Thus, while the memory M recalls the pattern y^ξ when presented with the pattern x^ξ, the dual memory M* recalls the pattern x^ξ when presented with the pattern y^ξ.

If the dual memories M and M* are applied in sequence, with M applied first, then they form an auto-associative memory for the patterns x^1, …, x^k. Similarly, an auto-associative memory for the patterns y^1, …, y^k is obtained by first passing these patterns through the memory M* and then feeding the output of M* through M. Perfect recall for these auto-associative memories is guaranteed as long as the conditions of Corollaries 1.1 and 3.1 are satisfied. This fact is expressed by the next theorem.

Theorem 4. If the following two conditions are satisfied:

1. for each γ ∈ {1, …, k} and for each i = 1, …, m there exists an index j_i^γ ∈ {1, …, n} such that:

x_{j_i^γ}^γ = ⋀_{ξ=1}^{k} (x_{j_i^γ}^ξ − y_i^ξ) + y_i^γ,    (43)

and

2. for each γ ∈ {1, …, k} and each j = 1, …, n there exists an index i_j^γ ∈ {1, …, m} such that:

y_{i_j^γ}^γ = ⋁_{ξ=1}^{k} (y_{i_j^γ}^ξ − x_j^ξ) + x_j^γ,    (44)

then

M* ⊽ (M ⊼ x^ξ) = x^ξ  ∀ ξ = 1, …, k    (45)

and

M ⊼ (M* ⊽ y^ξ) = y^ξ  ∀ ξ = 1, …, k.    (46)

Thus, for an associate pair (x^ξ, y^ξ), the pattern y^ξ can be recalled if x^ξ is fed through the memory M, and x^ξ can be recalled if y^ξ is fed back through the memory M*. Schematically, we have:

x^ξ → M → y^ξ    (47)
x^ξ ← M* ← y^ξ.

In this scheme, perfect recall, which is a one-step procedure without thresholding, is always achieved as long as the conditions of Theorem 4 are satisfied. The MBAM system described by Eq. (47) will, henceforth, be denoted by {… → M → …; … ← M* ← …}.
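A compact sketch of this one-step bidirectional scheme (our illustration; the helper names forward and backward are not from the paper) applies the min product with M in the forward direction and the max product with M* in the backward direction, shown here on the pair (x^2, y^2) of Example 1:

```python
import numpy as np

def max_product(A, B):
    return np.max(A[:, :, None] + B[None, :, :], axis=1)

def min_product(A, B):
    return np.min(A[:, :, None] + B[None, :, :], axis=1)

def forward(M, x):
    """x -> M -> y: one-step recall, no thresholding (Eq. 47, top row)."""
    return min_product(M, x[:, None]).ravel()

def backward(M, y):
    """x <- M* <- y: feed y back through the conjugate memory M* = -M'."""
    return max_product(-M.T, y[:, None]).ravel()

M = np.array([[0, 3, 3], [1, 1, 3], [0, 3, 4]], dtype=float)   # memory of Example 1
x2, y2 = np.array([0., -2., -4.]), np.array([-1., -1., 0.])
print(forward(M, x2))    # recalls y^2
print(backward(M, y2))   # recalls x^2
```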

As an example of a morphological BAM, consider the four associations (a,A), (b,B), (d,D), and (x,X) of letters from the alphabet. Representing each letter as an 18 × 18 Boolean image, we obtain four corresponding Boolean image pairs (p^ξ, P^ξ), ξ = 1, …, 4, shown in Fig. 1. We purposely framed the images in order to show the location of the pattern pixels in each image.

Using the standard row-scan method, each pattern image p can be converted into a Boolean pattern vector x = (x_1, …, x_324) by defining:

x_{18(i−1)+j} = 1 if p(i, j) = 1 (black pixel), and 0 if p(i, j) = 0 (white pixel).
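For completeness, a short NumPy version of this row-scan conversion (our sketch) is:

```python
import numpy as np

def row_scan(p):
    """Flatten an 18x18 Boolean image into a length-324 pattern vector,
    reading the image row by row (row-major order), as in the text's indexing."""
    return p.reshape(-1).astype(int)

p = np.zeros((18, 18), dtype=int)
p[2:7, 3] = 1                    # a few "on" (black) pixels for illustration
x = row_scan(p)
print(x.shape, x.sum())          # (324,) 5
```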

The four pattern pairs (x^ξ, y^ξ), ξ = 1, …, 4, generated in this fashion were used to construct both the classical semi-linear bidirectional associative memory as well as the morphological bidirectional associative memory. In this, as well as in all subsequent examples, the hard-limiter activation function was used in the BAM computation. As Fig. 2 shows, perfect association could not be achieved when using the BAM. In all cases the input to the net consisted of the lower case letters. For the association (a,A), the net stabilized after two iterations, but could not recover the capital letter A. Both associations (b,B) and (d,D) stabilized after six iterations to the output shown in Fig. 2. Only the association (x,X) converged to the correct two letters. As mentioned earlier, the number of associations that can be effectively programmed into the memories of a classical BAM is very limited. This limitation is especially apparent when different associations share too many "on" bits (Lippmann, 1987). In this particular example, the letters a, b, and d share a large number of pattern pixels (i.e., pixels with value p(i, j) = 1), and so do the letters B and D of the associations (b,B) and (d,D). In contrast, as shown in Fig. 3, the MBAM provides for perfect bidirectional recall.

One may erroneously infer from this example that an MBAM is superior to classical BAMs in recall capability. It is important to realize that the conditions of Theorem 4 are easily violated, in which case perfect recall may not be possible.

Fig. 1. The four associations used in constructing the semi-linear BAM and the MBAM.

Fig. 2. Convergence behavior of the semi-linear BAM.

Example 5. Consider the following associative pairs of pattern vectors:

x^1 = (1, 1, 1, 1)′,  y^1 = (1, −1, 1, −1)′,
x^2 = (1, 1, −1, −1)′,  y^2 = (1, −1, −1, 1)′.    (48)

Using Eqs. (23) and (38), the memories M and M* are given by:

M = [0 0 2 2; −2 −2 0 0; 0 0 0 0; 0 0 2 2] and M* = [0 2 0 0; 0 2 0 0; −2 0 0 −2; −2 0 0 −2],    (49)

respectively. Simple computation shows that:

M ⊼ x^1 = (1, −1, 1, 1)′ ≠ y^1, while M ⊼ x^2 = y^2, M* ⊽ y^1 = x^1, and M* ⊽ y^2 = x^2.    (50)

By starting with ξ = 1 and checking the row indices of M, it is easily verified that y_4^1 − x_j^1 = −2 for j = 1, …, 4. Thus, for i = 4 and ξ = 1 there does not exist a column index j such that m_4j = y_4^1 − x_j^1 (see Theorem 1). Equivalently, using Corollary 1.1, there does not exist a column index j such that x_j^1 = ⋀_{ξ=1}^{2} (x_j^ξ − y_4^ξ) + y_4^1. Therefore, condition 1 of Theorem 4 is not satisfied. However, since the pattern vectors are orthogonal, these associative pairs can be encoded and perfectly recalled using the traditional BAM. In Section 5 we provide a method for MBAMs that effectively encodes the associative pairs of this example.

We now turn our attention to noisy patterns. We say that a distorted version x̃^γ of the pattern x^γ has undergone an erosive change whenever x̃^γ ≤ x^γ and a dilative change whenever x̃^γ ≥ x^γ. Thus, for the above Boolean patterns, a change in pattern values from p(i, j) = 1 to p(i, j) = 0 represents an erosive change, while a change from p(i, j) = 0 to p(i, j) = 1 represents a dilative change. MBAMs are extremely robust in recalling patterns that are distorted due to dilative changes in the input patterns x^ξ. These changes can be random (systems noise, partial pattern occlusion, etc.) or non-random (processing effects such as skeletonizing, filtering, etc.). For example, corrupting the x^ξ patterns with 20% randomly generated dilative noise resulted in perfect bidirectional recall, as shown in Fig. 4. The limitations on the type of dilative distortion that can be effectively handled by the MBAM are given by the next theorems.
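The two noise types can be mimicked as follows; this is our own sketch, and it assumes that the corruption percentage means flipping the stated fraction of randomly chosen pixels in the given direction:

```python
import numpy as np

def dilative_noise(x, fraction, rng):
    """Randomly turn 'off' bits on: the result satisfies x_tilde >= x."""
    x_tilde = x.copy()
    off = np.flatnonzero(x == 0)
    flip = rng.choice(off, size=int(fraction * off.size), replace=False)
    x_tilde[flip] = 1
    return x_tilde

def erosive_noise(x, fraction, rng):
    """Randomly turn 'on' bits off: the result satisfies x_tilde <= x."""
    x_tilde = x.copy()
    on = np.flatnonzero(x == 1)
    flip = rng.choice(on, size=int(fraction * on.size), replace=False)
    x_tilde[flip] = 0
    return x_tilde

rng = np.random.default_rng(1)
x = rng.integers(0, 2, size=324)                  # a Boolean pattern vector
print(np.all(dilative_noise(x, 0.2, rng) >= x))   # True: dilative change
print(np.all(erosive_noise(x, 0.2, rng) <= x))    # True: erosive change
```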

Theorem 5. For γ = 1, …, k let x̃^γ denote the distorted version of the pattern x^γ. Then M ⊼ x̃^γ = y^γ if and only if the following hold:

x̃_j^γ ≥ x_j^γ ∧ ⋁_{i=1}^{m} (⋀_{ξ≠γ} (y_i^γ − y_i^ξ + x_j^ξ))  ∀ j = 1, …, n    (51)

and for each row index i ∈ {1, …, m} there exists a column index j_i ∈ {1, …, n} such that:

x̃_{j_i}^γ = x_{j_i}^γ ∧ (⋀_{ξ≠γ} (y_i^γ − y_i^ξ + x_{j_i}^ξ)).    (52)


Fig. 3. The perfect one-step convergence of the MBAM {… → M → …; … ← M* ← …}.

Fig. 4. One-step convergence of the MBAM {… → M → …; … ← M* ← …} using input patterns x^ξ corrupted by dilative noise.

Theorem 5 has the following analogue for the memory M*:

Theorem 6. For γ = 1, …, k let ỹ^γ denote the distorted version of the pattern y^γ. Then M* ⊽ ỹ^γ = x^γ if and only if the following hold:

ỹ_i^γ ≤ y_i^γ ∨ ⋀_{j=1}^{n} (⋁_{ξ≠γ} (x_j^γ − x_j^ξ + y_i^ξ))  ∀ i = 1, …, m    (53)

and for each row index j ∈ {1, …, n} there exists a column index i_j ∈ {1, …, m} such that:

ỹ_{i_j}^γ = y_{i_j}^γ ∨ (⋁_{ξ≠γ} (x_j^γ − x_j^ξ + y_{i_j}^ξ)).    (54)

It follows that the memory M* is fairly robust in the presence of erosive noise. The robustness is illustrated in Fig. 5. In this example, the input patterns y^γ were corrupted with 20% randomly generated erosive noise.

In realistic applications, noise is random and includes both dilative and erosive noise. The problem faced here is that the memory M is incapable of handling erosive noise, while the memory M* is extremely vulnerable to dilative changes. A consequence of Theorems 5 and 6 is that if x̃_i^γ < x_i^γ, then it is very likely that M ⊼ x̃^γ ≠ y^γ, and if y_i^γ < ỹ_i^γ, then we have a high probability that M* ⊽ ỹ^γ ≠ x^γ. As shown in Fig. 6, even minimal erosive noise corruption of the input patterns x^ξ can lead to convergence patterns that differ drastically from the pattern associations imprinted on the memory.

Reversing the roles of x^ξ and y^ξ, Theorem 6 can be restated in terms of the memory W as follows.

Theorem 7. For γ = 1, …, k let x̃^γ denote the distorted version of the pattern x^γ. Then W ⊽ x̃^γ = y^γ if and only if the following hold:

x̃_j^γ ≤ x_j^γ ∨ ⋀_{i=1}^{m} (⋁_{ξ≠γ} (y_i^γ − y_i^ξ + x_j^ξ))  ∀ j = 1, …, n    (55)

and for each row index i ∈ {1, …, m} there exists a column index j_i ∈ {1, …, n} such that:

x̃_{j_i}^γ = x_{j_i}^γ ∨ (⋁_{ξ≠γ} (y_i^γ − y_i^ξ + x_{j_i}^ξ)).    (56)

Thus, in analogy to M*, the memory W is also very robust in the presence of erosive noise but vulnerable to dilative noise. Hence, by replacing M with W (i.e. using the sequence W* ⊼ (W ⊽ x)), many of the problems caused by erosive noise in the input patterns x^ξ are eliminated. This can be ascertained by the example shown in Fig. 7. In this example, the input patterns were corrupted with 20% randomly generated erosive noise.

The major problem with MBAMs, as formulated thus far, should now be apparent: although the memories M, M*, W, and W* are capable of handling one type of noise, either erosive or dilative, no sequence of these memories is capable of effectively dealing with patterns containing random noise (i.e. patterns containing both erosive and dilative noise). However, the particular properties of these memories provide the underlying foundation for the construction of MBAMs that have larger storage capacity and are highly robust in the presence of random noise. These augmented memories will be discussed in the subsequent sections. Before we consider these augmented memories, it will be useful to take another look at the storage and recall limitations of the memories discussed thus far.

Although morphological associative memories can often successfully store and retrieve patterns with large feature overlap, the theorems listed earlier show that there are certain limitations on perfect recall for perfect input. What we do know, however, is that the following inequalities are always satisfied for ξ = 1, 2, …, k:

W ⊽ x^ξ ≤ y^ξ ≤ M ⊼ x^ξ and M* ⊽ y^ξ ≤ x^ξ ≤ W* ⊼ y^ξ.    (57)


Fig. 5. One-step convergence of the MBAM {… → M* → …; … ← M ← …} using input patterns y^ξ corrupted by erosive noise.

Fig. 6. Performance of the MBAM {… → M → …; … ← M* ← …} using input patterns which are minimally corrupted by erosive noise.

The validity of these inequalities can be easily established. For example, let M ⊼ x^ξ = z and pick an arbitrary index i ∈ {1, …, m}. Then:

z_i = ⋀_{j=1}^{n} (m_ij + x_j^ξ) = ⋀_{j=1}^{n} (⋁_{ζ=1}^{k} (y_i^ζ − x_j^ζ) + x_j^ξ) ≥ ⋀_{j=1}^{n} ((y_i^ξ − x_j^ξ) + x_j^ξ) = y_i^ξ.    (58)

Since i was arbitrary, we have that y^ξ ≤ z (i.e. y_i^ξ ≤ z_i ∀ i = 1, …, m). This shows that y^ξ ≤ M ⊼ x^ξ ∀ ξ = 1, …, k.
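These bounds are easy to observe numerically; the following sketch (ours) checks Eq. (57) on randomly generated real-valued pattern pairs:

```python
import numpy as np

def max_product(A, B):
    return np.max(A[:, :, None] + B[None, :, :], axis=1)

def min_product(A, B):
    return np.min(A[:, :, None] + B[None, :, :], axis=1)

rng = np.random.default_rng(2)
X = rng.normal(size=(5, 6))                      # five patterns x^xi in R^6 (rows)
Y = rng.normal(size=(5, 4))                      # five patterns y^xi in R^4 (rows)

D = Y.T[:, None, :] - X.T[None, :, :]            # D[i, j, xi] = y_i^xi - x_j^xi
M, W = np.max(D, axis=2), np.min(D, axis=2)      # Eqs. (23) and (30)

for x, y in zip(X, Y):
    lower = max_product(W, x[:, None]).ravel()   # W max-product x
    upper = min_product(M, x[:, None]).ravel()   # M min-product x
    assert np.all(lower <= y + 1e-12) and np.all(y <= upper + 1e-12)  # Eq. (57)
print("Eq. (57) holds for all generated pairs")
```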

As an example, suppose we increase the number of associations by adding the pattern pairs (e,E) and (r,R) to the memories M and M* as well as to the memories of the semi-linear BAM. The corresponding 18 × 18 Boolean image pairs representing these pattern pairs are shown in Fig. 8. As there is an obvious large overlap of pattern pixels among the letters B, D, E, and R as well as the letters a, b, d, and e, poor performance of the semi-linear BAM is expected. This expectation is confirmed by the performance shown in Fig. 9.

In contrast, as shown in Fig. 10, the MBAM {… → M → …; … ← M* ← …} exhibits almost perfect performance. However, the memory M derived from the six associations fails to satisfy the conditions of Theorem 1. In particular, for the association (x^4, y^4) corresponding to the pair (d,D), we obtain y^4 < M ⊼ x^4. Adding the additional associations to the MBAM memories W and W*, we again obtain near perfect performance. In this case, perfect recall fails for the associations (y^3, x^3) and (y^6, x^6) corresponding to the pattern pairs (E,e) and (R,r), respectively. As illustrated in Fig. 11, here we have x^3 < W* ⊼ y^3 and x^6 < W* ⊼ y^6.

5. Conditions for perfect bidirectional recall using distorted inputs

The fundamental idea of creating a morphological associative recall memory that is robust in the presence of random noise (i.e. both dilative and erosive random noise) is to replace the associative memory M with a sequence of two memories M_1 and W_1 that are defined in terms of subpatterns of the exemplar patterns x^ξ. Basically, the memory M_1 is constructed such that it can associate randomly corrupted versions of the exemplar x^ξ with a subpattern z^ξ consisting of a few select pattern values from x^ξ. The pattern z^ξ represents a special eroded version of x^ξ. In the Boolean case, ideally, z^ξ is a sparse representation of x^ξ having the property that z^ξ ≪ x^ξ and z^ξ ∧ x^γ = 0 whenever ξ ≠ γ. Here ≪ denotes "much smaller than" in the sense of containing mostly zero-valued pixels, and 0 denotes the zero vector. The morphological memory M_1 is defined as an auto-associative memory for the patterns z^ξ, ξ = 1, …, k, so that M_1 ⊼ z^ξ = z^ξ. Since M_1 is robust in the presence of dilative noise and z^ξ ≪ x^ξ, a corrupted version x̃^ξ of x^ξ containing both dilative and erosive noise will generally look like a dilated version of z^ξ to the recall memory M_1. Hence, for properly chosen z^ξ, we have M_1 ⊼ x̃^ξ = z^ξ. Furthermore, since x^ξ represents a version of z^ξ corrupted only by dilative noise, we also have M_1 ⊼ x^ξ = z^ξ. The memory W_1 is defined as an associative memory which associates with each input z^ξ the pattern y^ξ. Thus, under the correct conditions, we obtain W_1 ⊽ (M_1 ⊼ x^ξ) = W_1 ⊽ z^ξ = y^ξ as well as W_1 ⊽ (M_1 ⊼ x̃^ξ) = W_1 ⊽ z^ξ = y^ξ for randomly corrupted patterns x̃^ξ.

Fig. 7. Performance of the MBAM sequence {… → W → …; … ← W* ← …} using input patterns which are corrupted by erosive noise.

Fig. 8. The two pattern pairs (e,E) and (r,R) added to the BAM and MBAM.

Fig. 9. Convergence behavior of the semi-linear BAM for six pattern associations. In each case, the BAM was presented with the perfect exemplar pattern shown in the upper left-hand corner of each image sequence. The final output after stabilization consists of the two patterns on the bottom of each image sequence.

Fig. 10. Performance of the MBAM {… → M → …; … ← M* ← …} for six pattern associations.

We now make these notions more rigorous. Let X denote the n × k matrix X = (x^1, x^2, …, x^k) and Y the m × k matrix Y = (y^1, y^2, …, y^k). Since M_ξ = y^ξ ⊽ (−x^ξ)′ and W_ξ = y^ξ ⊼ (−x^ξ)′ for ξ = 1, 2, …, k, we have that m_ij^ξ = y_i^ξ − x_j^ξ = w_ij^ξ. Therefore, M_ξ = W_ξ for ξ = 1, 2, …, k. In fact, y^ξ ⊽ (−x^ξ)′ = y^ξ ⊼ (−x^ξ)′, and because of this equality we use the common matrix product notation y^ξ × (−x^ξ)′ to denote the product y^ξ ⊽ (−x^ξ)′ = y^ξ ⊼ (−x^ξ)′. It follows that:

(y^ξ × (−x^ξ)′) ⊼ x^ξ = (y^ξ × (−x^ξ)′) ⊽ x^ξ = y^ξ for ξ = 1, 2, …, k.    (59)

For an association (X, Y), we define:

M_XY = ⋁_{ξ=1}^{k} (y^ξ × (−x^ξ)′) and W_XY = ⋀_{ξ=1}^{k} (y^ξ × (−x^ξ)′).    (60)

The i,jth elements of M_XY and W_XY will be denoted by m_XY(i, j) and w_XY(i, j), respectively. Obviously, M = M_XY and W = W_XY, where M and W are the memories defined by Eqs. (23) and (30), respectively. Accordingly, we also have:

M* = ⋀_{ξ=1}^{k} (x^ξ × (−y^ξ)′) = W_YX and W* = ⋁_{ξ=1}^{k} (x^ξ × (−y^ξ)′) = M_YX.    (61)

A somewhat surprising fact, proven by Ritter et al. (1998), is that morphological auto-associative memories have perfect recall for all patterns. This is in contrast to the Hopfield net, which has very limited storage capacity and requires patterns with little overlap.

Theorem 8. M_XX ⊼ X = X and W_XX ⊽ X = X.

Observe that the theorem places no restriction on the type or number of patterns. These perfect recall memories are only affected by random noise.

Definition 1. Let Z = (z^1, z^2, …, z^k) be an n × k matrix. We say that Z is a kernel for (X, Y) if and only if the following two conditions are satisfied:

1. M_ZZ ⊼ X = Z;
2. W_ZY ⊽ Z = Y.
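Both kernel conditions can be tested mechanically; the sketch below (our own helpers, with names not taken from the paper) evaluates them for a candidate kernel matrix Z whose columns are the z^ξ.

```python
import numpy as np

def max_product(A, B):
    return np.max(A[:, :, None] + B[None, :, :], axis=1)

def min_product(A, B):
    return np.min(A[:, :, None] + B[None, :, :], axis=1)

def memory_max(OUT, IN):
    """M-type memory of Eq. (60): entries are max over patterns of out_i - in_j
    (patterns are stored as columns of OUT and IN)."""
    return np.max(OUT[:, None, :] - IN[None, :, :], axis=2)

def memory_min(OUT, IN):
    """W-type memory of Eq. (60): entries are min over patterns of out_i - in_j."""
    return np.min(OUT[:, None, :] - IN[None, :, :], axis=2)

def is_kernel(Z, X, Y):
    """Definition 1: M_ZZ min-product X == Z and W_ZY max-product Z == Y."""
    M_ZZ = memory_max(Z, Z)
    W_ZY = memory_min(Y, Z)
    cond1 = np.array_equal(min_product(M_ZZ, X), Z)
    cond2 = np.array_equal(max_product(W_ZY, Z), Y)
    return cond1 and cond2
```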

It follows that if Z is a kernel for (X, Y), then:

W_ZY ⊽ (M_ZZ ⊼ X) = W_ZY ⊽ Z = Y.    (62)

If, in addition, W_YX ⊽ Y = X, then:

W_YX ⊽ (W_ZY ⊽ (M_ZZ ⊼ X)) = X.    (63)

Eq. (63) defines the MBAM system

{… → M_ZZ → W_ZY → …; … ← W_YX ← …},    (64)

which, for a properly chosen kernel Z, will be robust in the presence of random noise in the input X. Of course, the associative memory W_YX = M* is vulnerable to dilative changes. This concern can be eliminated by choosing a kernel V for (Y, X) and replacing the memory W_YX by the sequence {… → M_VV → W_VX → …}. The new MBAM

{… → M_ZZ → W_ZY → …; … ← W_VX ← M_VV ← …}    (65)

will then be robust in the presence of random noise in either input X or Y.

Definition 1 does not provide a methodology for finding kernels, nor does it furnish direct insight into the question as to why kernels are useful in constructing MBAMs that are robust in the presence of random noise. The next theorem provides a partial answer to these questions.

Theorem 9. If Z is a kernel for (X, Y), then Z ≤ X.


Fig. 11. Performance of the MBAM {… → W → …; … ← W* ← …} for six pattern associations.

If Z = (z^1, z^2, …, z^k) is a kernel for (X, Y), then the column vectors z^ξ of Z are called kernel vectors for (X, Y). According to Theorem 9, z^ξ ≤ x^ξ for ξ = 1, 2, …, k. Thus, the kernel vectors represent eroded versions of the patterns x^1, …, x^k.

Example 6. The set

{ (0, 0, −2)′, (0, −2, −4)′, (0, −3, −2)′ }    (66)

is a set of kernel vectors for the pattern association (X, Y) given in Example 1. The set

{ (0, 0, −3)′, (0, −2, −4)′, (0, −3, −3)′ }    (67)

provides another example of a set of kernel vectors for the vector pairs given in Example 1. This shows that kernel vectors are not unique.

The definition of kernel vectors does not preclude that z^ξ = x^ξ for some ξ, or even that Z = X. Of course, if Z = X, then nothing will have been gained in the quest for creating MBAMs that are robust in the presence of random noise. For the same reason, cases where z^ξ = x^ξ for some ξ need to be avoided. This leads to the notion of proper kernels. If Z is a kernel for (X, Y) with the property that z^ξ ≠ x^ξ ∀ ξ, then Z is called a proper kernel for (X, Y). Note that the two kernels in Example 6 are not proper kernels since for both kernels we have z^2 = x^2. However, the notion of a proper kernel does not imply that z^ξ < x^ξ ∀ ξ. It only implies that for each ξ = 1, 2, …, k there exists an index i_ξ ∈ {1, …, n} such that z_{i_ξ}^ξ ≠ x_{i_ξ}^ξ. In other words, each kernel vector z^ξ has at least one coordinate that differs from the corresponding coordinate of x^ξ. According to Theorem 9, this coordinate must be strictly less than the corresponding coordinate of x^ξ.

Example 7. As an example of a proper kernel, consider the three 4 × 4 pattern images shown in the top row of Fig. 12. If we use the auto-association x^ξ = y^ξ, where x^ξ denotes the corresponding 16-dimensional pattern vector, then the three kernel images corresponding to the kernel vectors z^ξ, ξ = 1, 2, 3, are shown in the bottom row of the figure. A quick visual check shows that the kernel vectors are proper and that z^ξ ∧ x^γ = 0 whenever γ ≠ ξ.

If z^γ ≤ x̃^γ ≤ x^γ, then:

z^γ = M_ZZ ⊼ z^γ ≤ M_ZZ ⊼ x̃^γ ≤ M_ZZ ⊼ x^γ = z^γ    (68)

and, hence, M_ZZ ⊼ x̃^γ = z^γ. Thus, if Z is a kernel for (X, Y) and z^γ ≤ x̃^γ ≤ x^γ, then we are guaranteed that:

W_ZY ⊽ (M_ZZ ⊼ x̃^γ) = y^γ.    (69)

Therefore, an eroded version x̃^γ of x^γ satisfying the inequality z^γ ≤ x̃^γ will be correctly associated with the pattern y^γ. Since M_ZZ is very robust in the presence of dilative noise, it is not unreasonable to expect that a distorted version x̃^γ of x^γ containing both dilative and erosive noise while satisfying the inequality z^γ ≤ x̃^γ will also be correctly associated with the pattern y^γ. This expectation turns out to be true, and the next theorem provides conditions that guarantee perfect recall for corrupted input patterns x̃^γ.

Theorem 10. If Z is a kernel for (X, Y) and x̃^γ denotes a distorted version of x^γ such that z_j^γ ≤ x̃_j^γ ∀ j = 1, …, n and z_j^γ = x̃_j^γ whenever m_ZZ(i, j) = z_i^γ − z_j^γ, then M_ZZ ⊼ x̃^γ = z^γ.

Theorems 9 and 10 can be viewed as a guide for selecting kernel vectors. A first step is to erode the patterns x^ξ, saving only a few of the original pattern values. The Boolean case is of prime interest here as it assumes the traditional McCulloch–Pitts model in which neurons have only one of two values, 0 or 1, representing the off or on state, respectively. In this case one needs to turn most pattern values of a pattern vector x^ξ to zero, saving only a few unique pattern values in order to construct a kernel vector z^ξ such that z^ξ ∧ x^γ = 0 whenever γ ≠ ξ. This may not always be possible. However, when it is possible, it provides unique identification markers for the memory M_ZZ and the conditions stated in Definition 1 are automatically satisfied. In general, however, such unique features may not exist for some index ξ. In this case, once an initial set of kernel vector candidates has been selected, the two conditions of Definition 1 need to be checked in order to ensure correct recall.
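A simple heuristic consistent with this guideline (our own sketch, not the authors' procedure) keeps, for each Boolean pattern, a few "on" pixels that no other pattern shares, and then relies on the explicit checks of Definition 1 (e.g. the is_kernel helper sketched earlier) for the final decision:

```python
import numpy as np

def candidate_kernels(X_bool, keep=3):
    """For each Boolean pattern (column of X_bool), keep at most `keep` pixels that
    are 'on' in that pattern and 'off' in every other pattern; zero out the rest.
    Returns a candidate kernel matrix Z of the same shape as X_bool."""
    n, k = X_bool.shape
    Z = np.zeros_like(X_bool)
    for xi in range(k):
        others = np.delete(X_bool, xi, axis=1)
        unique = np.flatnonzero((X_bool[:, xi] == 1) & (others.sum(axis=1) == 0))
        Z[unique[:keep], xi] = 1          # a few unique identification markers
    return Z

# Toy usage: three 9-pixel Boolean patterns stored as columns.
X_bool = np.array([[1, 0, 0],
                   [1, 1, 0],
                   [0, 1, 0],
                   [0, 1, 1],
                   [0, 0, 1],
                   [1, 0, 1],
                   [1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1]])
Z = candidate_kernels(X_bool, keep=2)
print(Z.T)   # candidate kernel vectors; Definition 1 must still be verified
```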

Obviously, all comments concerning kernels for the association (X, Y) also apply to kernels for the association (Y, X). An additional important observation is that since the eroded patterns are sparse (i.e. in the Boolean case most pattern values will be zero) and M_ZZ as well as W_VV are perfect recall memories for any finite number of pattern vectors z^ξ or v^ξ, respectively, storage capacity increases dramatically.

Fig. 12. Three pattern images (top row) and three corresponding kernel images (bottom row).

As an example, we increased the number of associations to seven by adding the pattern pair (i,I), shown in Fig. 13, to the memories M and M*. The output of the MBAM {… → M → …; … ← M* ← …}, when presented with the input x^ξ, ξ = 1, …, k, is shown in Fig. 14. Comparing this output with the example shown in Fig. 10, performance deteriorated even further. Although the output patterns M ⊼ x^ξ closely resemble the desired output y^ξ, perfect performance was achieved only for the association (x,X). Somewhat surprising is the fact that the feedback through M* provided perfect recall of the patterns x^ξ even though the corresponding patterns y^ξ were corrupted by dilative cross-talk noise.

Following the outline for selecting kernel vectors discussed above, we obtained kernel vectors z^ξ, ξ = 1, 2, …, 7, whose corresponding 18 × 18 image representations are shown in Fig. 15. These kernel vectors were used in the construction of the memories M_ZZ and W_ZY.

Using the modified MBAM {… → M_ZZ → W_ZY → …; … ← W_YX ← …} and perfect input, perfect bidirectional association was achieved (Fig. 16). This supports the argument of increased storage capacity. As illustrated in Fig. 17, corrupting the input pattern vectors x^ξ, ξ = 1, 2, …, 7, with 20% random noise did not interfere with perfect recall.

Of course, if the corrupted input patterns violate the hypothesis of Theorem 10, then perfect recall may not be possible. However, since the non-zero values of the kernel vectors are extremely sparse, the chances of one of these values being turned off are greatly reduced.

For our final example we return to the problem encountered in Example 5. Choosing z^1 = (−1, 1, 1, 1)′ and z^2 = x^2, we obtain the memories:

M_ZZ = [0 0 2 2; 2 0 2 2; 2 0 0 0; 2 0 0 0] and W_ZY = [0 0 0 0; −2 −2 −2 −2; −2 −2 0 0; 0 −2 −2 −2].    (70)

It is easy to affirm that M_ZZ ⊼ x^ξ = z^ξ and W_ZY ⊽ z^ξ = y^ξ for ξ = 1, 2. Thus, Z = (z^1, z^2) is a kernel for the association (X, Y) given in Example 5. Also, since W_YX ⊽ y^ξ = x^ξ for ξ = 1, 2, the MBAM system {… → M_ZZ → W_ZY → …; … ← W_YX ← …} provides for correct bidirectional recall of the patterns.

6. Conclusion

This paper introduces neural network computing based on lattice algebra, which provides a new paradigm for artificial neural networks. The resulting morphological networks are radically different in behavior from traditional artificial neural network models. The main emphasis of this paper was on morphological bidirectional associative memories. In contrast to traditional bidirectional associative memories, morphological bidirectional associative memories converge in one step. Thus, convergence problems do not exist. In our experiments, morphological bidirectional associative memories were capable of recalling 34 grid patterns defined on grids of size 21 × 21. These patterns represented 34 capital and lower case letters from the alphabet, similar to those shown in this paper. Corrupting these patterns by random noise in terms of 15% bit reversal and presenting the corrupted patterns to the memory resulted in perfect recall. These examples indicate that for some pattern classes MBAMs have larger storage capacity and superior bidirectional recall compared to traditional BAM models. However, as Example 5 demonstrates, there are cases in which MBAMs fail but traditional BAMs do not. Additionally, even though MBAMs based on kernel vectors may provide a method for handling cases such as Example 5, current methods for selecting good sets of kernel vectors are more art than science.

We need to point out that the results presented here and in Ritter et al. (1998) represent only a first step in the exploration of morphological neural networks. The lattice algebraic approach to neural network theory is extremely new and a multitude of open questions await exploration. For example, the selection of kernel patterns needs further investigation. This is especially true when using grey-valued patterns. Selection of proper sets of kernel vectors for Boolean patterns is often, but not always, easy, while the selection of proper sets of grey-valued kernel patterns is usually fairly difficult. To get a better understanding of this difficulty, we challenge the reader to find a complete set of kernel vectors for the set of vector pairs defined in Example 1. The question of an optimal set of kernel vectors for a given set of patterns remains open. A rigorous mathematical comparison between the newer proposed BAM models and MBAMs using the kernel vector approach needs to be established. Of course, this represents only a small part of the challenge. Learning rules for both associative morphological memories and morphological perceptrons need to be formulated and investigated. The base of applications needs to be expanded and compared to traditional methods. It is our hope that these problems and this paper will generate sufficient interest and entice other researchers to join our effort in advancing the frontiers of this new endeavour.

Fig. 13. The additional pattern association used in constructing the MBAM {… → M → …; … ← M* ← …}.

Fig. 14. Performance of the MBAM {… → M → …; … ← M* ← …} for seven pattern associations.

Acknowledgements

This work has been supported in part by Air Force Grant#F08630-96-1-0004.

Appendix A. Proofs of theorems

We only prove theorems in one domain, namely those theorems involving the min product ⊼ and the memory M. The dual theorems employing the max product ⊽ and the memory M* are proven in an analogous fashion by replacing minimums with maximums, or vice versa, and replacing the max product by the min product.

Proof of Theorem 1. Suppose that M ⊼ x^ξ = y^ξ ∀ ξ = 1, …, k. Let i ∈ {1, …, m} and γ ∈ {1, …, k} be arbitrary. We need to show that there exists a column index j_0 ∈ {1, …, n} such that m_{i j_0} = y_i^γ − x_{j_0}^γ. Suppose to the contrary that m_ij ≠ y_i^γ − x_j^γ for all j = 1, …, n. Since M is the maximum of the matrices y^ξ × (−x^ξ)′, ξ = 1, …, k, we have y^γ × (−x^γ)′ ≤ M. Therefore, our assumption is equivalent to the statement that y_i^γ − x_j^γ < m_ij ∀ j = 1, …, n. But now a contradiction to the assumption arises immediately from the following consideration:

y_i^γ = (M ⊼ x^γ)_i = ⋀_{j=1}^{n} (m_ij + x_j^γ) > ⋀_{j=1}^{n} ((y_i^γ − x_j^γ) + x_j^γ) = y_i^γ.    (71)

In order to prove the converse, note that y^ξ × (−x^ξ)′ = M_ξ ≤ M ∀ ξ = 1, …, k. Thus, M_ξ ⊼ x^ξ ≤ M ⊼ x^ξ for all ξ = 1, …, k. But:

M_ξ ⊼ x^ξ = ( ⋀_{i=1}^{n} (y_1^ξ − x_i^ξ + x_i^ξ), …, ⋀_{i=1}^{n} (y_m^ξ − x_i^ξ + x_i^ξ) )′ = y^ξ    (72)

for all ξ = 1, …, k, and therefore y^ξ ≤ M ⊼ x^ξ ∀ ξ = 1, …, k. In order to show that we actually have equality, we show that we also have M ⊼ x^ξ ≤ y^ξ ∀ ξ = 1, …, k.

Let us again fix arbitrary indices i ∈ {1, …, m} and γ ∈ {1, …, k}, and evaluate M ⊼ x^γ at the index i:

(M ⊼ x^γ)_i = ⋀_{j=1}^{n} (m_ij + x_j^γ) ≤ m_{i j_i^γ} + x_{j_i^γ}^γ = (y_i^γ − x_{j_i^γ}^γ) + x_{j_i^γ}^γ = y_i^γ.    (73)

Therefore, M ⊼ x^γ ≤ y^γ, and since γ was chosen arbitrarily, the conclusion follows. □

Fig. 15. The seven pattern images (top row) associated with the pattern vectors x^ξ, ξ = 1, 2, …, 7, and the corresponding kernel images (bottom row) associated with the kernel vectors z^ξ.

Fig. 16. Performance of the modified MBAM {… → M_ZZ → W_ZY → …; … ← W_YX ← …} using seven pattern associations.

Fig. 17. Performance of the modified MBAM {… → M_ZZ → W_ZY → …; … ← W_YX ← …} using input patterns x^ξ corrupted by 20% random noise.

Proof of Corollary 1.1. According to Theorem 1, M ⊼ x^ξ = y^ξ ∀ ξ = 1, …, k if and only if for each row index i = 1, …, m and each γ ∈ {1, …, k} there exists a column index j_i^γ ∈ {1, …, n} such that:

y_i^γ − x_{j_i^γ}^γ = m_{i j_i^γ} = ⋁_{ξ=1}^{k} (y_i^ξ − x_{j_i^γ}^ξ).    (74)

This equation holds if and only if:

x_{j_i^γ}^γ − y_i^γ = −m_{i j_i^γ} = ⋀_{ξ=1}^{k} (x_{j_i^γ}^ξ − y_i^ξ)    (75)

or, equivalently, if and only if:

x_{j_i^γ}^γ = ⋀_{ξ=1}^{k} (x_{j_i^γ}^ξ − y_i^ξ) + y_i^γ.    (76)

□

Proof of Theorem 2. Theorem 2 follows by duality from Theorem 1. $\square$

Proof of Corollary 2.1. Corollary 2.1 follows by duality from Corollary 1.1. $\square$

Proof of Theorem 3. Theorem 3 follows directly from Theorem 2 by simply interchanging $x^{\xi}$ with $y^{\xi}$. $\square$

Proof of Corollary 3.1. Corollary 3.1 follows directly from Corollary 2.1 by simply interchanging $x^{\xi}$ with $y^{\xi}$. $\square$

Proof of Theorem 4. According to Corollary 1.1, the first condition implies that $M \boxwedge x^{\xi} = y^{\xi}$, $\xi = 1,\ldots,k$. Similarly, using Corollary 3.1, condition 2 of the theorem implies that $M^{\ast} \boxvee y^{\xi} = x^{\xi}$, $\xi = 1,\ldots,k$. Hence

$$M^{\ast} \boxvee (M \boxwedge x^{\xi}) = M^{\ast} \boxvee y^{\xi} = x^{\xi} \tag{77}$$

and

$$M \boxwedge (M^{\ast} \boxvee y^{\xi}) = M \boxwedge x^{\xi} = y^{\xi} \tag{78}$$

for every $\xi = 1,\ldots,k$. $\square$
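
The two-way recall of Theorem 4 can be illustrated with a short sketch. We assume here, consistent with the duality invoked in the proofs of Theorems 2 and 3, that the companion memory $M^{\ast}$ is the pointwise minimum of the matrices $x^{\xi} \times (-y^{\xi})'$ and is applied with the max product; this is our reading of the dual construction, not a definition taken from this appendix. The code checks conditions 1 and 2 of the theorem by brute force and, when they hold, confirms the fixed-point relations of Eqs. (77) and (78).

    import numpy as np

    def min_product(A, B):
        return (A[:, :, None] + B[None, :, :]).min(axis=1)

    def max_product(A, B):
        return (A[:, :, None] + B[None, :, :]).max(axis=1)

    rng = np.random.default_rng(2)
    k, n, m = 3, 6, 5
    X = rng.integers(0, 4, size=(k, n))
    Y = rng.integers(0, 4, size=(k, m))

    M      = (Y[:, :, None] - X[:, None, :]).max(axis=0)   # forward memory (m x n)
    M_star = (X[:, :, None] - Y[:, None, :]).min(axis=0)   # assumed dual memory (n x m)

    forward  = min_product(M, X.T)        # column g: M  min-product x^g
    backward = max_product(M_star, Y.T)   # column g: M* max-product y^g

    if np.array_equal(forward, Y.T) and np.array_equal(backward, X.T):
        # Conditions 1 and 2 hold, so (X, Y) is a stable two-way fixed point (Eqs. (77), (78)).
        assert np.array_equal(max_product(M_star, forward), X.T)
        assert np.array_equal(min_product(M, backward), Y.T)
    else:
        print("Conditions 1 and 2 of Theorem 4 do not hold for these random patterns.")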

Proof of Theorem 5. Suppose that $\tilde{x}^{\gamma}$ denotes a distorted version of $x^{\gamma}$ and that for $\gamma = 1,\ldots,k$, $M \boxwedge \tilde{x}^{\gamma} = y^{\gamma}$. Then

$$y_{i}^{\gamma} = (M \boxwedge \tilde{x}^{\gamma})_{i} = \bigwedge_{l=1}^{n}\,(m_{il} + \tilde{x}_{l}^{\gamma}) \le m_{ij} + \tilde{x}_{j}^{\gamma}, \quad \forall i = 1,\ldots,m \ \text{and} \ \forall j = 1,\ldots,n. \tag{79}$$

Therefore,

$$\tilde{x}_{j}^{\gamma} \ge y_{i}^{\gamma} - m_{ij}, \quad \forall i = 1,\ldots,m \ \text{and} \ \forall j = 1,\ldots,n \tag{80}$$

$$\Leftrightarrow \ \tilde{x}_{j}^{\gamma} \ge \bigvee_{i=1}^{m}\,(y_{i}^{\gamma} - m_{ij}), \quad \forall j = 1,\ldots,n$$
$$\Leftrightarrow \ \tilde{x}_{j}^{\gamma} \ge \bigvee_{i=1}^{m}\,\Bigl(y_{i}^{\gamma} - \bigvee_{\xi=1}^{k}\,(y_{i}^{\xi} - x_{j}^{\xi})\Bigr), \quad \forall j = 1,\ldots,n$$
$$\Leftrightarrow \ \tilde{x}_{j}^{\gamma} \ge \bigvee_{i=1}^{m}\,\Bigl(y_{i}^{\gamma} + \bigwedge_{\xi=1}^{k}\,(x_{j}^{\xi} - y_{i}^{\xi})\Bigr), \quad \forall j = 1,\ldots,n$$
$$\Leftrightarrow \ \tilde{x}_{j}^{\gamma} \ge \bigvee_{i=1}^{m}\,\Bigl(y_{i}^{\gamma} + \Bigl[\bigwedge_{\xi \ne \gamma}\,(x_{j}^{\xi} - y_{i}^{\xi})\Bigr] \wedge (x_{j}^{\gamma} - y_{i}^{\gamma})\Bigr), \quad \forall j = 1,\ldots,n$$
$$\Leftrightarrow \ \tilde{x}_{j}^{\gamma} \ge \bigvee_{i=1}^{m}\,\Bigl(\Bigl[\bigwedge_{\xi \ne \gamma}\,(y_{i}^{\gamma} - y_{i}^{\xi} + x_{j}^{\xi})\Bigr] \wedge x_{j}^{\gamma}\Bigr), \quad \forall j = 1,\ldots,n$$
$$\Leftrightarrow \ \tilde{x}_{j}^{\gamma} \ge x_{j}^{\gamma} \wedge \bigvee_{i=1}^{m}\,\Bigl(\bigwedge_{\xi \ne \gamma}\,(y_{i}^{\gamma} - y_{i}^{\xi} + x_{j}^{\xi})\Bigr), \quad \forall j = 1,\ldots,n.$$

This shows that the inequalities given by Eq. (52) are satisfied. It also follows that

$$\tilde{x}_{j}^{\gamma} \ge x_{j}^{\gamma} \wedge \Bigl(\bigwedge_{\xi \ne \gamma}\,(y_{i}^{\gamma} - y_{i}^{\xi} + x_{j}^{\xi})\Bigr), \quad \forall j = 1,\ldots,n \ \text{and} \ \forall i = 1,\ldots,m. \tag{81}$$

Suppose that, for some row index, the set of inequalities given by Eq. (81) contains no equality; i.e. assume that there exists a row index $i \in \{1,\ldots,m\}$ such that

$$\tilde{x}_{j}^{\gamma} > x_{j}^{\gamma} \wedge \Bigl[\bigwedge_{\xi \ne \gamma}\,(y_{i}^{\gamma} - y_{i}^{\xi} + x_{j}^{\xi})\Bigr], \quad \forall j = 1,\ldots,n. \tag{82}$$

Then

$$(M \boxwedge \tilde{x}^{\gamma})_{i} = \bigwedge_{j=1}^{n}\,(m_{ij} + \tilde{x}_{j}^{\gamma}) \tag{83}$$
$$> \bigwedge_{j=1}^{n}\,\Bigl[m_{ij} + \Bigl(x_{j}^{\gamma} \wedge \bigwedge_{\xi \ne \gamma}\,(y_{i}^{\gamma} - y_{i}^{\xi} + x_{j}^{\xi})\Bigr)\Bigr]$$
$$= \bigwedge_{j=1}^{n}\,\Bigl[m_{ij} + \bigwedge_{\xi=1}^{k}\,(y_{i}^{\gamma} - y_{i}^{\xi} + x_{j}^{\xi})\Bigr]$$
$$= \bigwedge_{j=1}^{n}\,\Bigl(m_{ij} + y_{i}^{\gamma} - \bigvee_{\xi=1}^{k}\,(y_{i}^{\xi} - x_{j}^{\xi})\Bigr)$$
$$= \bigwedge_{j=1}^{n}\,(m_{ij} + y_{i}^{\gamma} - m_{ij})$$
$$= y_{i}^{\gamma}.$$

Therefore, $(M \boxwedge \tilde{x}^{\gamma})_{i} > y_{i}^{\gamma}$, which contradicts the hypothesis


that $M \boxwedge \tilde{x}^{\gamma} = y^{\gamma}$. It follows that for each row index $i$ there must exist a column index $j_{i}$ satisfying Eq. (53).

We now prove the converse. Suppose that

$$\tilde{x}_{j}^{\gamma} \ge x_{j}^{\gamma} \wedge \bigvee_{i=1}^{m}\,\Bigl(\bigwedge_{\xi \ne \gamma}\,(y_{i}^{\gamma} - y_{i}^{\xi} + x_{j}^{\xi})\Bigr), \quad \forall j = 1,\ldots,n, \tag{84}$$

and that for each row index $i$ there exists a column index $j_{i}$ satisfying Eq. (53).

By the first part of our proof, inequality (84) holds if and only if

$$\tilde{x}_{j}^{\gamma} \ge y_{i}^{\gamma} - m_{ij}, \quad \forall i = 1,\ldots,m \ \text{and} \ \forall j = 1,\ldots,n \tag{85}$$

or, equivalently, if and only if

$$m_{ij} + \tilde{x}_{j}^{\gamma} \ge y_{i}^{\gamma}, \quad \forall j = 1,\ldots,n \ \text{and} \ \forall i = 1,\ldots,m \tag{86}$$
$$\Leftrightarrow \ \bigwedge_{j=1}^{n}\,(m_{ij} + \tilde{x}_{j}^{\gamma}) \ge y_{i}^{\gamma}, \quad \forall i = 1,\ldots,m$$
$$\Leftrightarrow \ (M \boxwedge \tilde{x}^{\gamma})_{i} \ge y_{i}^{\gamma}, \quad \forall i = 1,\ldots,m,$$

which implies that $M \boxwedge \tilde{x}^{\gamma} \ge y^{\gamma}$, $\forall \gamma = 1,\ldots,k$. Thus, if we can show that $M \boxwedge \tilde{x}^{\gamma} \le y^{\gamma}$, $\forall \gamma = 1,\ldots,k$, then we must have that $M \boxwedge \tilde{x}^{\gamma} = y^{\gamma}$, $\forall \gamma$. Let $\gamma \in \{1,\ldots,k\}$ and $i \in \{1,\ldots,m\}$ be arbitrarily chosen. Then

$$(M \boxwedge \tilde{x}^{\gamma})_{i} = \bigwedge_{j=1}^{n}\,(m_{ij} + \tilde{x}_{j}^{\gamma}) \tag{87}$$
$$\le m_{ij_{i}} + \tilde{x}_{j_{i}}^{\gamma}$$
$$= m_{ij_{i}} + \Bigl(x_{j_{i}}^{\gamma} \wedge \bigwedge_{\xi \ne \gamma}\,(y_{i}^{\gamma} - y_{i}^{\xi} + x_{j_{i}}^{\xi})\Bigr) \qquad \text{(by Eq. (53))}$$
$$= m_{ij_{i}} + \bigwedge_{\xi=1}^{k}\,(y_{i}^{\gamma} - y_{i}^{\xi} + x_{j_{i}}^{\xi})$$
$$= m_{ij_{i}} + y_{i}^{\gamma} - \bigvee_{\xi=1}^{k}\,(y_{i}^{\xi} - x_{j_{i}}^{\xi})$$
$$= m_{ij_{i}} + y_{i}^{\gamma} - m_{ij_{i}}$$
$$= y_{i}^{\gamma}.$$

This shows that $M \boxwedge \tilde{x}^{\gamma} \le y^{\gamma}$. $\square$
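
The conditions appearing in this proof are easy to test numerically. Writing $c_{ij} = y_{i}^{\gamma} - m_{ij}$, which by the definition of $M$ equals $x_{j}^{\gamma} \wedge \bigwedge_{\xi \ne \gamma}(y_{i}^{\gamma} - y_{i}^{\xi} + x_{j}^{\xi})$, the recall $M \boxwedge \tilde{x}^{\gamma} = y^{\gamma}$ holds exactly when $\tilde{x}^{\gamma}$ dominates the column maxima of $(c_{ij})$ and attains $c_{ij}$ in every row, mirroring Eqs. (52) and (53). The sketch below (our own code and naming; integer-valued patterns keep the equality tests exact) verifies this equivalence for a randomly corrupted input.

    import numpy as np

    def min_product(A, B):
        return (A[:, :, None] + B[None, :, :]).min(axis=1)

    rng = np.random.default_rng(3)
    k, n, m = 4, 8, 5
    X = rng.integers(0, 6, size=(k, n))
    Y = rng.integers(0, 6, size=(k, m))
    M = (Y[:, :, None] - X[:, None, :]).max(axis=0)

    g = 0
    x_tilde = X[g] + rng.integers(0, 3, size=n)     # a dilatively corrupted version of x^g

    C = Y[g][:, None] - M                           # C[i, j] = y_i^g - m_ij
    ineq_52 = np.all(x_tilde >= C.max(axis=0))      # x~_j >= max_i (y_i^g - m_ij) for every j
    eq_53   = all(np.any(x_tilde == C[i, :]) for i in range(m))

    recalled = min_product(M, x_tilde[:, None]).ravel()
    assert (ineq_52 and eq_53) == np.array_equal(recalled, Y[g])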

Proof of Theorem 6. Theorem 6 follows from Theorem 5 by duality. $\square$

Proof of Theorem 7. Theorem 7 follows from Theorem 5 by interchanging $y^{\gamma}$ with $x^{\gamma}$ and $\tilde{y}^{\gamma}$ with $\tilde{x}^{\gamma}$. $\square$

We dispense with the proof of Theorem 8 since it appeared in Ritter et al. (1998).

Proof of Theorem 9. Suppose that $Z$ is a kernel for $(X, Y)$. Let $\gamma \in \{1,\ldots,k\}$ and $i \in \{1,\ldots,n\}$ be arbitrarily chosen. Since $m_{ZZ}(i,i) = 0$, we have

$$z_{i}^{\gamma} = (M_{ZZ} \boxwedge x^{\gamma})_{i} = \bigwedge_{j=1}^{n}\,(m_{ZZ}(i,j) + x_{j}^{\gamma}) \le m_{ZZ}(i,i) + x_{i}^{\gamma} = x_{i}^{\gamma}. \tag{88}$$

Since $\gamma$ and $i$ were arbitrarily chosen, it follows that $z^{\gamma} \le x^{\gamma}$, $\forall \gamma$, and hence $Z \le X$. $\square$
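
The two facts this proof relies on can be verified mechanically: the diagonal of $M_{ZZ}$ is zero, and consequently $M_{ZZ} \boxwedge x \le x$ componentwise for any input $x$, so that $M_{ZZ} \boxwedge x^{\gamma} = z^{\gamma}$ forces $z^{\gamma} \le x^{\gamma}$. A minimal numpy sketch (our own naming, not the paper's notation):

    import numpy as np

    def min_product(A, B):
        return (A[:, :, None] + B[None, :, :]).min(axis=1)

    rng = np.random.default_rng(4)
    k, n = 3, 6
    Z = rng.integers(0, 5, size=(k, n))                   # candidate kernel patterns (rows)
    M_ZZ = (Z[:, :, None] - Z[:, None, :]).max(axis=0)    # autoassociative memory of Z

    assert np.all(np.diag(M_ZZ) == 0)                     # m_ZZ(i, i) = 0

    x = rng.integers(0, 9, size=n)                        # an arbitrary input vector
    z_hat = min_product(M_ZZ, x[:, None]).ravel()
    assert np.all(z_hat <= x)                             # M_ZZ min-product x <= x componentwise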

Proof of Theorem 10. Since $z_{j}^{\gamma} \le \tilde{x}_{j}^{\gamma}$, $\forall j = 1,\ldots,n$, it follows that

$$\tilde{x}_{j}^{\gamma} \ge z_{j}^{\gamma} \wedge \Bigl(\bigvee_{i=1}^{n}\,\bigwedge_{\xi \ne \gamma}\,(z_{i}^{\gamma} - z_{i}^{\xi} + z_{j}^{\xi})\Bigr), \quad \forall j = 1,\ldots,n. \tag{89}$$

According to Theorem 8, $M_{ZZ} \boxwedge z^{\xi} = z^{\xi}$, $\forall \xi = 1,\ldots,k$. Thus, by Corollary 1.1, for each row index $i = 1,\ldots,n$ and each $\alpha \in \{1,\ldots,k\}$, there exists a column index $j_{i\alpha} \in \{1,\ldots,n\}$ such that

$$z_{j_{i\alpha}}^{\alpha} = \bigwedge_{\xi=1}^{k}\,(z_{j_{i\alpha}}^{\xi} - z_{i}^{\xi} + z_{i}^{\alpha}). \tag{90}$$

Setting $\gamma = \alpha$ and $j = j_{i\alpha}$ we have

$$z_{j}^{\gamma} = \bigwedge_{\xi=1}^{k}\,(z_{j}^{\xi} - z_{i}^{\xi} + z_{i}^{\gamma}) \tag{91}$$

or, equivalently,

$$z_{i}^{\gamma} - z_{j}^{\gamma} = -\Bigl(\bigwedge_{\xi=1}^{k}\,(z_{j}^{\xi} - z_{i}^{\xi})\Bigr) = \bigvee_{\xi=1}^{k}\,(z_{i}^{\xi} - z_{j}^{\xi}) = m_{ZZ}(i,j). \tag{92}$$

From the hypothesis and Eq. (91) we now obtain

$$\tilde{x}_{j}^{\gamma} = z_{j}^{\gamma} = z_{j}^{\gamma} \wedge z_{j}^{\gamma} = z_{j}^{\gamma} \wedge \bigwedge_{\xi=1}^{k}\,(z_{j}^{\xi} - z_{i}^{\xi} + z_{i}^{\gamma}). \tag{93}$$

Therefore,

$$\tilde{x}_{j}^{\gamma} = z_{j}^{\gamma} \wedge \bigwedge_{\xi=1}^{k}\,(z_{i}^{\gamma} - z_{i}^{\xi} + z_{j}^{\xi}). \tag{94}$$

In view of Theorem 5, Eqs. (89) and (94) imply that $M_{ZZ} \boxwedge \tilde{x}^{\gamma} = z^{\gamma}$. $\square$
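
Since the proofs of Theorems 9 and 10 use only the property of kernels invoked at Eq. (88), namely $M_{ZZ} \boxwedge x^{\gamma} = z^{\gamma}$ for every $\gamma$, a simple checker for that property can be written as below (a sketch with our own naming; a complete kernel must also drive the second-stage memory $W_{ZY}$ of the modified MBAM of Figs. 16 and 17, which is defined in the body of the paper and is not reproduced here).

    import numpy as np

    def min_product(A, B):
        return (A[:, :, None] + B[None, :, :]).min(axis=1)

    def satisfies_eq_88(Z, X):
        # Checks M_ZZ min-product x^g = z^g for every g (rows of Z and X are the patterns).
        M_ZZ = (Z[:, :, None] - Z[:, None, :]).max(axis=0)
        return np.array_equal(min_product(M_ZZ, X.T), Z.T)

    # Sanity check: Z = X always satisfies this property (this is just Theorem 8);
    # finding a strictly smaller Z is the hard part stressed in the concluding remarks.
    X = np.array([[2, 5, 1, 4],
                  [3, 0, 2, 6],
                  [1, 4, 4, 0]])
    assert satisfies_eq_88(X, X)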

References

Backhouse, R. C., & Carré, B. (1975). Regular algebra applied to path-finding problems. Journal of the Institute of Mathematics and Its Applications, 15, 161–186.

Benzaken, C. (1968). Structures algébriques des cheminements. In G. Biorbi (Ed.), Network and switching theory. New York: Academic Press.

Carré, B. (1971). An algebra for network routing problems. Journal of the Institute of Mathematics and Its Applications, 15, 161–186.

Cuninghame-Green, R. (1960). Process synchronization in steelworks—a problem of feasibility. In Banbury and Maitland (Eds.), Proceedings of the Second International Conference on Operations Research (pp. 323–328). English University Press.


Cuninghame-Green, R. (1962). Describing industrial processes with interference and approximating their steady-state behaviour. Operations Research Quarterly, 13, 95–100.

Cuninghame-Green, R. (1979). Minimax algebra: Lecture notes in economics and mathematical systems, 166. New York: Springer.

Davidson, J. L. (1989). Lattice structures in the image algebra and applications to image processing. Doctoral dissertation. University of Florida, Gainesville, FL.

Davidson, J. L. (1992). Simulated annealing and morphological neural networks. Society of Photo-optical Instrumentation Engineers (SPIE) Proceedings: Image Algebra and Morphological Image Processing III, 1769, 119–127.

Davidson, J. L., & Hummer, F. (1993). Morphology neural network: an introduction with applications. Circuits, Systems, and Signal Processing, 12, 177–210.

Davidson, J. L., & Ritter, G. X. (1990). A theory of morphological neural networks. Society of Photo-optical Instrumentation Engineers (SPIE) Proceedings: Digital Optical Computing II, 1215, 378–388.

Davidson, J. L., & Srivastava, R. (1993). Fuzzy image algebra neural network for template identification. Proceedings of the Second Annual Midwest Electro-Technology Conference (pp. 68–71). Piscataway, NJ: IEEE.

Davidson, J. L., & Talukder, A. (1993). Template identification using simulated annealing in morphology neural networks. Proceedings of the Second Annual Midwest Electro-Technology Conference (pp. 64–67). Piscataway, NJ: IEEE.

Gader, P. D., Won, Y., & Khabou, M. (1994). Image algebra networks for pattern classification. Society of Photo-optical Instrumentation Engineers (SPIE) Proceedings: Image Algebra and Morphological Image Processing V, 2300, 157–168.

Gader, P. D., Miramonti, J. R., Won, Y., & Coffield, P. (1995). Segmentation free shared weight networks for automatic vehicle detection. Neural Networks, 8, 1457–1473.

Giffler, B. (1960). Mathematical solution of production planning and scheduling problems. IBM ASDD Division Tech. Rep.

Grossberg, S. (1988). Nonlinear neural networks: principles, mechanisms, and architectures. Neural Networks, 1, 17–61.

Kohonen, T. (1972). Correlation matrix memory. IEEE Transactions on Computers, C-21, 353–359.

Kohonen, T. (1984). Self-organization and associative memory. New York: Springer.

Kosko, B. (1987a). Adaptive bidirectional associative memories. Proceedings of the 16th IEEE Workshop on Applied Images and Pattern Recognition (pp. 1–49). Washington, DC: IEEE Computer Society Press.

Kosko, B. (1987b). Adaptive bidirectional associative memories. IEEE Transactions on Systems, Man, and Cybernetics, SMC-17, 124–136.

Lippmann, R. P. (1987). An introduction to computing with neural nets. IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-4, 4–22.

Oh, H., & Kothari, S. C. (1994). Adaptation and relaxation method for learning in bidirectional associative memory. IEEE Transactions on Neural Networks, 5, 573–583.

Peteanu, V. (1967). An algebra of the optimal path in networks. Mathematica, 9, 335–342.

Ritter, G. X. (1991). Recent developments in image algebra. In P. Hawkes (Ed.), Advances in Electronics and Electron Physics, 80. New York: Academic Press.

Ritter, G. X. (1994). Image algebra with applications. Unpublished manuscript available via anonymous ftp from ftp://ftp.cise.ufl.edu/pub/scr/ia/documents.

Ritter, G. X., & Davidson, J. L. (1987). The image algebra and lattice theory. Tech. Rep. TR 87-09, University of Florida, Gainesville, FL.

Ritter, G. X., Li, D., & Wilson, J. N. (1989). Image algebra and its relationship to neural networks. Society of Photo-optical Instrumentation Engineers (SPIE) Proceedings: Aerospace Pattern Recognition, 1098, 90.

Ritter, G. X., & Davidson, J. L. (1990). Recursion and feedback in image algebra. In Society of Photo-optical Instrumentation Engineers (SPIE) Proceedings of the 19th AIPR Workshop on Image Understanding in the 90s.

Ritter, G. X., & Sussner, P. (1996). An introduction to morphological neural networks. Proceedings of the 13th International Conference on Pattern Recognition (pp. 709–717). Los Altos, CA: IEEE Computer Society Press.

Ritter, G. X., Sussner, P., & Diaz-de-Leon, J. L. (1998). Morphological associative memories. IEEE Transactions on Neural Networks, 9, 281–293.

Ritter, G. X., & Wilson, J. N. (1996). Handbook of computer vision algorithms in image algebra. Boca Raton, FL: CRC Press.

Ritter, G. X., Wilson, J. N., & Davidson, J. L. (1990). Image algebra: an overview. Computer Vision, Graphics, and Image Processing, 49, 297–331.

Ritz, S. A., Anderson, J. A., Silverstein, J. W., & Jones, R. S. (1977). Distinctive features, categorical perception, and probability learning: some applications of a neural model. Psychological Review, 84, 413–415.

Serra, J. (1982). Image analysis and mathematical morphology. London: Academic Press.

Shi, H., Zhao, Y., & Zhuang, X. (1998). A general model for bidirectional associative memories. IEEE Transactions on Systems, Man, and Cybernetics, 28, 511–519.

Shimbel, A. (1954). Structure in communication nets. Proceedings of the Symposium on Information Networks (pp. 119–203). Brooklyn, NY: Polytechnic Institute of Brooklyn.

Simpson, P. K. (1990). Higher-ordered and intraconnected bidirectional associative memories. IEEE Transactions on Systems, Man, and Cybernetics, 20, 637–652.

Suarez-Araujo, C. P. (1997). Novel neural network models for computing homothetic invariances: An image algebra notation. Journal of Mathematical Imaging and Vision, 769–783.

Suarez-Araujo, C. P., & Ritter, G. X. (1992). Morphological neural networks and image algebra in artificial perception systems. Society of Photo-optical Instrumentation Engineers (SPIE) Proceedings: Image Algebra and Morphological Image Processing III, 1769, 128–142.

Wang, C. C., & Don, H. S. (1995). An analysis of high-capability discrete exponential BAM. IEEE Transactions on Neural Networks, 6, 492–496.

Wang, J. F., Cruz, J. B., & Mulligan, J. H. (1990). Two coding strategies for bidirectional associative memory. IEEE Transactions on Neural Networks, 1, 81–92.

Wang, Y. F., Cruz, J. B., & Mulligan, J. H. (1991). Guaranteed recall of all training pairs for bidirectional associative memory. IEEE Transactions on Neural Networks, 2, 559–567.

Wang, Z. (1996). A bidirectional associative memory based on optimal linear associative memory. IEEE Transactions on Computers, 45, 1171–1179.

Wilson, S. S. (1989). Morphological networks. Society of Photo-optical Instrumentation Engineers (SPIE) Proceedings: Visual Communication and Image Processing IV, 1189, 483–493.

Won, Y., & Gader, P. D. (1995). Morphological shared weight neural network for pattern classification and automatic target detection. Proceedings of the 1995 IEEE International Conference on Neural Networks (vol. 4, pp. 2134–2138). Piscataway, NJ: IEEE Press.

Won, Y., Gader, P. D., & Coffield, P. C. (1997). Morphological shared-weight networks with applications to automatic target recognition. IEEE Transactions on Neural Networks, 8, 1195–1203.

Xu, Z. B. (1994). Asymmetric bidirectional associative memories. IEEE Transactions on Systems, Man, and Cybernetics, 24, 1558–1564.

Zhuang, X., Huang, Y., & Chen, S. S. (1993). Better learning for bidirectional associative memory. Neural Networks, 8, 1131–1146.
