
1558 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, VOL. 24, NO. 10, OCTOBER 1994

representation to capture both structural information and the propagative nature of faults. An original inference strategy to locate failure sources is presented and compared with existing approaches to dealing with fault propagation. The inference strategy presented in this paper uses a systematic method of generating subdevices for testing as a way of reducing the candidate space for both single-fault and multiple-fault situations. It does not require information about fault propagation timing. The computation between successive testing operations was shown to be tractable.

As it exists now, the inference strategy presented in this paper does not address the situation in which a cycle exists in the digraph representing the device under diagnosis. A large majority of systems do not have propagative cycles. For those that do, there are two possible solutions: one is to remove the cycle from the system for the purpose of diagnosis, for example, by physically disconnecting part of the system; the second is to use a coarser-resolution model that incorporates the cycle in a single subdevice. The investigation of this problem represents a possible future research direction.


Asymmetric Bidirectional Associative Memories

Zong-Ben Xu, Yee Leung, and Xiang-Wei He

Abstract- Bidirectional Associative Memory (BAM) is a potentially promising model for heteroassociative memories. However, its applications are severely restricted to networks with logical symmetry of interconnections and pattern orthogonality or small pattern size. Although the restrictions on pattern orthogonality and pattern size can be relaxed to a certain extent, all previous efforts come at the cost of increased connection complexity. In this paper, a new modification of the BAM is made and a new model named the Asymmetric Bidirectional Associative Memory (ABAM) is proposed. This model can not only accommodate the logical asymmetry of interconnections but is also capable of storing a larger number of non-orthogonal patterns. Furthermore, all these properties of the ABAM are achieved without increasing the connection complexity of the network. Theoretical analysis and simulation results both demonstrate that the ABAM indeed outperforms the BAM and its existing variants in storage capacity, error-correction capability, and convergence.

I. INTRODUCTION

Associative memories are important neural network models which can be employed to model human thinking and machine intelligence by association. They have found applications in content-addressable memory, pattern recognition, expert systems, intelligent control, and optimization problems.

Associative memories can essentially be classified into autoassociative memories and heteroassociative memories. Autoassociative memories [1]-[4] are in general single-layered and fully interconnected neural networks which can store multiple stable states. Each neuron is connected to all neurons in the network with symmetric connection strengths. Given an input pattern, the network finds the best match from a set of known stored patterns. Though autoassociative memories are useful tools for various applications, they tend to produce spurious stable states and highly complex connections.

As an extension of autoassociative memories, heteroassociative memories have been developed in recent years. Holographic associative memory [5], adaptive resonance theory [6] and the bidirectional associative memory (BAM) [7]-[9] are typical examples. In place of unidirectional association, heteroassociative memories in general perform bidirectional association. They associate an input pattern with a different stored output pattern or a stored pattern pair.

Owing to its low connection complexity, guaranteed convergence and strong error-correction capability [10], the BAM has recently attracted particular attention in neural network research. Within the BAM framework, a number of improvements have been made on the encoding methods, performance and/or storage capacity of the BAM [11]-[14].

All these models are, however, severely restricted by two fundamental problems which have so far eluded the attention of neural network researchers: the logical symmetry of connections and pattern orthogonality in the BAM. Our inability

Manuscript received December 20, 1992; revised December 10, 1993. This work is supported by the RGC Earmarked Grant, UPGC, Hong Kong.

Z.-B. Xu is with the Institute for Computational and Applied Mathematics, Xi’an Jiaotong University, Xi’an, Shaanxi, 710049, P.R. China.

Y. Leung is with the Department of Geography and Center for Environmental Studies, The Chinese University of Hong Kong, Shatin, Hong Kong.

X.-W. He is with the Department of Information Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong.

IEEE Log Number 9403054.


0018-9472/94$04.00 © 1994 IEEE


to relax these limitations in turn hampers efficiency in pattern restorability, error-correction capability, and convergence. It further limits the use of the BAM for knowledge representation and inference.

The purpose of this paper is to propose a model of asymmetric bidirectional associative memories with asymmetric feedforward-feedback connections, pattern non-orthogonality, and relatively large capacity. To facilitate our discussion, the BAM and its recent variants are first briefly reviewed in Section II. The problems arising from logical symmetry and pattern orthogonality are examined in Section III. The proposed model and its properties are discussed in Section IV. Comparisons with the Hopfield network (HOP), BAM and intraconnected BAM (IBAM) are then made in Section V through simulation studies in terms of their restorability, error-correction capability, and convergence. The paper concludes with a brief summary of our work.

II. A NOTE ON BAM AND ITS VARIANTS

The basic structures of the BAM and its variants are summarized in this section. Their limitations are also highlighted.

A. BAM

Given the heteroassociative bipolar pattern pairs

{(X^(1), Y^(1)), (X^(2), Y^(2)), ..., (X^(M), Y^(M))},   (2.1)

where X^(i) ∈ {-1, +1}^N and Y^(i) ∈ {-1, +1}^P, for i = 1, 2, ..., M, are patterns in the two neuron populations X and Y respectively. Let Z^(i) = ((X^(i))^T, (Y^(i))^T)^T and {Z^(1), Z^(2), ..., Z^(M)} be the autoassociative patterns in {-1, +1}^L, where L = N + P. Then the BAM is a discrete-time system which can be represented as a weighted directed bipartite graph with N + P nodes (neurons) in X ∪ Y, where X and Y are independent sets of nodes (i.e., no two nodes within X or Y are connected by an edge). Let nodes (i) and (j) be fixed in X and Y respectively, and let

(a) m_ij be the weight attached to edge (i, j);
(b) θ_i be the threshold attached to node (i) and η_j be that attached to node (j).

Then Kosko's BAM model is uniquely defined by the N × P matrix M = (m_ij)_{N×P} and the L-dimensional vector (θ, η)^T = (θ_1, ..., θ_N, η_1, ..., η_P)^T.

In Kosko's model, the matrix M is obtained as

M = Σ_{i=1}^{M} X^(i) (Y^(i))^T.   (2.2)

As in the Hopfield network, every node (neuron) can be in one of two possible states, either +1 or -1. The states of node (i) in X and node (j) in Y at time t are denoted by X_i(t) and Y_j(t) respectively. The states of the neuron populations X and Y at time t are then the vectors X(t) = (X_1(t), X_2(t), ..., X_N(t))^T and Y(t) = (Y_1(t), Y_2(t), ..., Y_P(t))^T respectively. The states of X and Y at the next point in time, X(t + 1) and Y(t + 1), are determined by the following evolution equations:

X(t + 1) = sgn(M Y(t) - θ),   (2.3)
Y(t + 1) = sgn(M^T X(t) - η).   (2.4)

The BAM works in two modes: the block-wise serial mode or the synchronous mode. In the first mode, the network operates in the manner of a bidirectional information search. An initial input pattern X_0 (or Y_0) is first presented to the network, and the resulting output Y_1 (or X_1) is obtained by (2.4) (or (2.3)). The output Y_1 (or X_1) is subsequently fed to (2.3) (or (2.4)) to derive a new candidate X_2 (or Y_2). The iterative procedure continues until the network evolves to

a stable two-pattern reverberation. The stable reverberation, denoted as (X*, Y*), corresponds to a stable state of the BAM which satisfies

X* = sgn(M Y* - θ),   (2.5)
Y* = sgn(M^T X* - η).   (2.6)

On the other hand, operating the BAM in the synchronous mode requires an initial pattern pair (X_0, Y_0) (or (Y_0, X_0)) to be presented to the network. With respect to the initial pair, the network then evolves according to (2.3) and (2.4) with the simultaneous change of the patterns X and Y. It stops when a stable reverberation of the system is reached. Ideally, the stable reverberation vector is the prototype pattern pair nearest to (X_0, Y_0) (or (Y_0, X_0)).
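As a concrete illustration of the recall dynamics above, the following minimal NumPy sketch encodes two orthogonal pattern pairs with (2.2) and runs the block-wise serial iteration (2.3)-(2.4) with zero thresholds. The function names, the sgn(0) = +1 convention, and the toy pattern sizes are illustrative assumptions, not part of the original formulation:

```python
import numpy as np

def bam_encode(X, Y):
    # Kosko's correlation encoding (2.2): M = sum_i X^(i) (Y^(i))^T,
    # with the patterns stored as columns of X (N x M) and Y (P x M)
    return X @ Y.T

def sgn(v):
    # bipolar sign with the convention sgn(0) = +1
    return np.where(v >= 0, 1, -1)

def bam_recall(M, x0, max_iter=100):
    # block-wise serial recall: Y <- sgn(M^T X), then X <- sgn(M Y),
    # iterated until a stable reverberation (X*, Y*) is reached
    x = x0.copy()
    for _ in range(max_iter):
        y = sgn(M.T @ x)
        x_new = sgn(M @ y)
        if np.array_equal(x_new, x):
            break
        x = x_new
    return x, y

# two orthogonal bipolar pattern pairs (toy sizes N = 6, P = 4, M = 2)
X = np.array([[1, 1, 1, 1, -1, -1],
              [1, -1, 1, -1, 1, -1]]).T
Y = np.array([[1, 1, -1, -1],
              [1, -1, 1, -1]]).T
M = bam_encode(X, Y)
x_star, y_star = bam_recall(M, X[:, 0])   # recovers the first stored pair
```

For orthogonal pairs the first iteration already lands on a stored pair, which then satisfies the stability conditions (2.5)-(2.6).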

An interpretation of the BAM and a comparison with the Hopfield network have been made; details can be found in [10].

B. Intraconnected BAM (IBAM)

To increase the storage capacity and the retrieval efficiency of a BAM, the first-order BAM has been extended to the higher-order BAM by Kosko [8] and Simpson [11]. Through intrapopulation connections, the intraconnected bidirectional associative memory (IBAM) and its variants proposed by Simpson appear to be able to extend storage capacity and efficiency. The evolution equations of pattern X in the neuron population X and pattern Y in the neuron population Y are respectively

X(t + 1) = sgn(P_X · sgn(M Y(t))),   (2.7)
Y(t + 1) = sgn(P_Y · sgn(M^T X(t))),   (2.8)

where M is the same connection matrix as that defined in the BAM, and

P_X = Σ_{i=1}^{M} X^(i) (X^(i))^T,   (2.9)
P_Y = Σ_{i=1}^{M} Y^(i) (Y^(i))^T   (2.10)

are the intraconnection matrices of the X^(i)'s and Y^(i)'s respectively.

C. BAM With Multiple Trainings and Exponential Correlations

To guarantee recall of all training pairs in a BAM, a multiple-training strategy has been incorporated into the BAM [12], [13]. In place of the encoding scheme in (2.2), a correlation matrix M encoded as

M = Σ_{i=1}^{M} q_i X^(i) (Y^(i))^T   (2.11)

is used. The coefficients {q_i} can be determined by either the linear programming method or the sequential adjustment method. However, the existence of appropriate {q_i} is not guaranteed in the multiple-training BAM.

Based on the concept of correlation, a BAM with exponential correlation has also been proposed [14]. The corresponding evolution equations become

Y(t + 1) = sgn( Σ_{i=1}^{M} exp(γ (X^(i), X(t))) Y^(i) ),
X(t + 1) = sgn( Σ_{i=1}^{M} exp(γ (Y^(i), Y(t))) X^(i) ),

where γ is a parameter.


The present note is by no means an exhaustive account of the historical development of the BAM. It does, however, summarize the information necessary for the discussion to follow.

III. PROBLEMS OF LOGICAL SYMMETRY AND PATTERN ORTHOGONALITY IN BAM

Though the BAM is a relatively powerful model for heteroasso- ciative memories, its usefulness is severely restricted to situations in which:

(a) Connections of the network are symmetric; when interpreted in terms of rules in knowledge representation, this implies that the logical relation between patterns X^(i) and Y^(i) in {(X^(i), Y^(i)), i = 1, 2, ..., M} is a symmetric implication, i.e., the logical implication "X^(i) IF AND ONLY IF Y^(i)" must hold between X^(i) and Y^(i);

(b) Either {(X^(i), Y^(i)), i = 1, 2, ..., M} is an orthogonal set, or the number of patterns M is much smaller than the dimensions of the pattern spaces in which X^(i) and Y^(i) lie.

The restriction in (a) clearly excludes the possibility of applying the existing BAM to patterns among which asymmetric connections exist. A typical limitation is the difficulty in applying the BAM to develop connectionist expert systems [15], [16], in which asymmetric logical relations ordinarily hold.

It should be noticed that different modes of operation of the BAM correspond to different manners of making an inference.

(1) When the BAM is operating in the block-wise serial mode with resulting output (X*, Y*), we have the following equivalences:

(i) X_0 as input ⇔ forward implication;
(ii) Y_0 as input ⇔ backward implication.

(2) When the BAM is operating in the synchronous mode with resulting output (X*, Y*), we have the following equivalences:

(i) (X_0, Y_0) as input ⇔ combination implication;
(ii) (X_0, Y*) as input ⇔ causality check;
(iii) (X*, Y_0) as input ⇔ effectiveness check.

Therefore, symmetric logical implication in the existing BAM means that the following situation is prohibited:

If X^(i) then Y^(i) with confidence level m(X^(i), Y^(i)), and if Y^(i) then X^(i) with confidence level m(Y^(i), X^(i)), with m(X^(i), Y^(i)) ≠ m(Y^(i), X^(i)).

Obviously, such a restriction is too expensive to impose on associative memories.

The restriction in (b) is evidently not a desirable property of the BAM. It is apparent that in many problems neither must the pattern set {(X^(i), Y^(i)), i = 1, 2, ..., M} be orthogonal nor must the number of patterns be small. Though this restriction can be relaxed to a certain extent [11]-[14], all these efforts come at the cost of increased connection complexity.

To make the BAM more versatile, it is thus imperative to develop a new encoding scheme without the restriction of either pattern orthogonality or small pattern size. The encoding scheme should also cater for asymmetry in connections. The asymmetric bidirectional associative memory (ABAM) to be discussed in the following section possesses all these properties.


IV. ASYMMETRIC BIDIRECTIONAL ASSOCIATIVE MEMORIES (ABAM)

As many authors have demonstrated [10], [14], Kosko's BAM model can only handle patterns {(X^(i), Y^(i)), i = 1, 2, ..., M} which are either orthogonal or whose number is restricted by

M < 1 + min{ (N + P) / (N + P - 2 d_Z(BAM)), N / (N - 2 d_X(BAM)), P / (P - 2 d_Y(BAM)) },

where d_Z(BAM), d_X(BAM) and d_Y(BAM) are the distribution distances of the sets {Z^(1), Z^(2), ..., Z^(M)}, {X^(1), X^(2), ..., X^(M)}, and {Y^(1), Y^(2), ..., Y^(M)}, respectively.

To relax these serious limitations, we propose in this section an improved and generalized BAM model, namely the asymmetric bidirectional associative memory (ABAM).

The ABAM is described by the following evolution equations:

X(t + 1) = sgn(A Y(t) - θ),   (4.1)
Y(t + 1) = sgn(B X(t) - η),   (4.2)

or, equivalently,

Z(t + 1) = sgn(W Z(t) - δ)   (4.3)

with

W = [ 0  A
      B  0 ],   δ = ( θ
                      η ),

where

A = H_M(X, Y),   B = H_M(Y, X).   (4.4)

The sequence of matrices {H_k(Y, X)} is specified according to the following learning algorithm.

Learning Algorithm: Given the linearly independent patterns X and Y with

X = {X^(1), X^(2), ..., X^(M)},
Y = {Y^(1), Y^(2), ..., Y^(M)},

let

P_1(X) = X^(1) (X^(1))^T / ||X^(1)||^2,
P_{k+1}(X) = P_k(X) + E_k(X) (E_k(X))^T / ||E_k(X)||^2,   (4.5)

and

H_1(Y, X) = Y^(1) (X^(1))^T / ||X^(1)||^2,
H_{k+1}(Y, X) = H_k(Y, X) + η_k(Y, X) (E_k(X))^T / ||E_k(X)||^2,   (4.6)

with

E_k(X) = X^(k+1) - P_k(X) X^(k+1),   k = 1, 2, ..., M - 1,
η_k(Y, X) = Y^(k+1) - H_k(Y, X) X^(k+1),   k = 1, 2, ..., M - 1.

The suggested model (encoding scheme) attempts to construct directly an interpolation operator H_k(Y, X) which meets the following requirement:

H_k(Y, X) X^(i) = Y^(i),   i = 1, 2, ..., k.   (4.7)

It turns out that such interpolation operators do exist and, in fact, a particular choice of them is just the matrix H_k(Y, X) defined recursively by the Learning Algorithm. Thus, our encoding scheme ensures that each pattern pair (X^(i), Y^(i)) is a stable state of the ABAM in (4.1)-(4.2) provided the thresholds θ and η both take on zero as their values. This is exactly what we intended in our original motivation.
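As a concrete sketch, the recursion can be implemented directly, reading (4.5)-(4.6) as the Greville-type updates P_{k+1} = P_k + E_k E_k^T / ||E_k||^2 and H_{k+1} = H_k + η_k E_k^T / ||E_k||^2. This reading is an assumption reconstructed from the proof of Lemma 1; the function names and toy sizes are illustrative:

```python
import numpy as np

def abam_encode(X, Y):
    # Recursive Learning Algorithm sketch: columns of X (N x M) and Y (P x M)
    # are the pattern pairs; the columns of X must be linearly independent.
    # Returns H_M(Y, X) satisfying H_M(Y, X) X^(i) = Y^(i), cf. (4.7).
    x1 = X[:, [0]]
    P = x1 @ x1.T / float(x1.T @ x1)          # P_1(X)
    H = Y[:, [0]] @ x1.T / float(x1.T @ x1)   # H_1(Y, X)
    for k in range(1, X.shape[1]):
        xk = X[:, [k]]
        E = xk - P @ xk                       # E_k(X), nonzero by independence
        eta = Y[:, [k]] - H @ xk              # eta_k(Y, X)
        nrm = float(E.T @ E)
        P = P + E @ E.T / nrm                 # P_{k+1}(X)
        H = H + eta @ E.T / nrm               # H_{k+1}(Y, X)
    return H

rng = np.random.default_rng(0)
X = np.where(rng.standard_normal((8, 5)) >= 0, 1, -1).astype(float)
while np.linalg.matrix_rank(X) < 5:           # ensure linear independence
    X = np.where(rng.standard_normal((8, 5)) >= 0, 1, -1).astype(float)
Y = np.where(rng.standard_normal((6, 5)) >= 0, 1, -1).astype(float)
B = abam_encode(X, Y)                         # B = H_M(Y, X)
```

Each training pair is then interpolated exactly, B X^(i) = Y^(i), so every pair is a stable state at zero thresholds.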

The following lemma provides the first step toward justifying that the Learning Algorithm generates an operator H_k(Y, X) satisfying the interpolation requirement in (4.7).


Lemma 1: Let P_k(X) be defined as in the Learning Algorithm. Then P_k(X) is a best approximation projection from R^N onto X_k, where

X_k = span{X^(1), X^(2), ..., X^(k)}.

Proof: For the linear operator (matrix) P_k(X) to be a best approximation projection from R^N onto X_k, the following statements are equivalent [17]:

(P1) ||X - P_k(X)X|| ≤ ||X - X'|| for any X ∈ R^N and X' ∈ X_k;
(P2) (X - P_k(X)X, u) = 0 for any X ∈ R^N and u ∈ X_k;
(P3) P_k(X) is symmetric and such that (P_k(X))^2 = P_k(X).

So, we can prove the lemma by substantiating the validity of (P2). It should be noticed that the subspace X_k is spanned by the vectors X^(1), X^(2), ..., X^(k). This implies that (P2) is equivalent to the following:

(X - P_k(X)X, X^(i)) = 0, for all X ∈ R^N and i = 1, 2, ..., k.   (4.8)

Clearly, (4.8) is trivially true for k = 1, for

(X - P_1(X)X, X^(1)) = (X, X^(1)) - (P_1(X)X, X^(1)) = (X, X^(1)) - (X, X^(1)) = 0.

Assuming that (4.8) is true for k - 1, then by the construction of the Learning Algorithm we have

(X - P_k(X)X, X^(i)) = (X - P_{k-1}(X)X, X^(i)) - (E_{k-1}(X) (E_{k-1}(X))^T X, X^(i)) / ||E_{k-1}(X)||^2.   (4.9)

We now claim that (E_{k-1}(X))^T X^(k) ≠ 0 must hold in (4.9). Indeed, by the inductive assumption, P_{k-1}(X) is the best approximation projector of R^N onto X_{k-1}, and hence P := I - P_{k-1}(X) is the best approximation projector of R^N onto X⊥_{k-1}, the orthogonal complement subspace of X_{k-1}. From (P2) with P in place of P_k(X), it then follows that

(X - PX, PX - PX') = 0 and (X' - PX', PX - PX') = 0, for all X, X' ∈ R^N,

which yields

(X - X', PX - PX') = ||PX - PX'||^2,

or, equivalently,

(X - X', X - X' - P_{k-1}(X)(X - X')) = ||X - X' - P_{k-1}(X)(X - X')||^2.

Taking X = X^(k) and X' = 0 in this last identity gives

(E_{k-1}(X))^T X^(k) = (X^(k), X^(k) - P_{k-1}(X) X^(k)) = ||X^(k) - P_{k-1}(X) X^(k)||^2 = ||E_{k-1}(X)||^2.

Consequently, (E_{k-1}(X))^T X^(k) ≠ 0 if and only if ||E_{k-1}(X)|| ≠ 0. However, ||E_{k-1}(X)|| ≠ 0 has to hold, for otherwise X^(k) = P_{k-1}(X) X^(k) ∈ X_{k-1}, which would mean that {X^(1), X^(2), ..., X^(k)} is linearly dependent, impossible by our assumption. Thus our claim is justified.

Now, for any 1 ≤ i ≤ k - 1, (4.9) yields (X - P_k(X)X, X^(i)) = 0 directly by the inductive assumption (since (E_{k-1}(X))^T X^(i) = 0) and, for i = k, one has

(X - P_k(X)X, X^(k)) = (X - P_{k-1}(X)X, X^(k)) - (X, E_{k-1}(X))
 = (X - P_{k-1}(X)X, X^(k)) - (X, X^(k) - P_{k-1}(X) X^(k)) = 0,

where the last equality follows from the symmetry of the matrix (I - P_{k-1}(X)) (by (P3)). Thus, by the principle of induction, (4.8) holds for any integer k ∈ [1, M], which completes the proof.

We now apply Lemma 1 to establish the following.

Theorem 1: If the patterns X and Y are linearly independent, then every pattern pair (X^(i), Y^(i)) is a stable state of the ABAM in (4.1)-(4.2) with θ = η = 0.

Proof: We first justify the equalities in (4.7). They hold clearly for i = 1 by the definition of H_1(Y, X). If they hold for H_{k-1}(Y, X), then from the definition of H_k(Y, X) we have

H_k(Y, X) X^(i) = H_{k-1}(Y, X) X^(i) + η_{k-1}(Y, X) (E_{k-1}(X))^T X^(i) / ||E_{k-1}(X)||^2.   (4.10)

By Lemma 1 (P2), for any i ≤ k - 1,

(E_{k-1}(X))^T X^(i) = (X^(k) - P_{k-1}(X) X^(k), X^(i)) = 0,

and therefore H_k(Y, X) X^(i) = H_{k-1}(Y, X) X^(i) = Y^(i) for any i ≤ k - 1. For i = k, (4.10) becomes

H_k(Y, X) X^(k) = H_{k-1}(Y, X) X^(k) + η_{k-1}(Y, X)
 = H_{k-1}(Y, X) X^(k) + Y^(k) - H_{k-1}(Y, X) X^(k) = Y^(k).

Consequently, (4.7) is valid for i = k also. Thus, (4.7) is justified for any integer i ∈ [1, M].

From (4.7), it follows that

B X^(i) = H_M(Y, X) X^(i) = Y^(i), for all i = 1, 2, ..., M,

which yields Y^(i) = sgn(B X^(i)), implying that (4.2) is satisfied for any pair (X^(i), Y^(i)).

Observe that the positions of X^(i) and Y^(i) in the foregoing arguments are completely symmetric. The equalities

A Y^(i) = H_M(X, Y) Y^(i) = X^(i), for all i = 1, 2, ..., M,

then follow as well. This means that (4.1) is also satisfied for any pair (X^(i), Y^(i)). The proof of Theorem 1 is therefore complete.

Theorem 1 clearly implies that any M linearly independent pattern pairs can be totally recalled by our proposed ABAM model.


Recall that the storage capacity of a neural network is the maximum number of randomly generated patterns which can be stored in the network. Given a set of vectors X, the rank of X, r(X), is the maximum number of linearly independent vectors that exist in X. With these terms, Theorem 1 then obviously implies the following.

Theorem 2: The storage capacity of the ABAM is not less than the minimum of the ranks of the pattern populations X and Y.

Generally speaking, min{r(X), r(Y)} also gives the exact value of the storage capacity of the ABAM, as indicated in the subsequent simulation studies.
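The lower bound of Theorem 2 is easy to evaluate numerically; in the sketch below (sizes and seed are arbitrary illustrative choices), r(X) and r(Y) are computed for random bipolar pattern populations:

```python
import numpy as np

# Theorem 2: the ABAM stores at least min{r(X), r(Y)} patterns, where r(.)
# is the rank of the pattern population (patterns stored as columns).
rng = np.random.default_rng(1)
N, P, M = 60, 60, 50
X = np.where(rng.standard_normal((N, M)) >= 0, 1, -1)
Y = np.where(rng.standard_normal((P, M)) >= 0, 1, -1)
capacity_lower_bound = min(np.linalg.matrix_rank(X), np.linalg.matrix_rank(Y))
# random bipolar patterns are almost surely linearly independent when M <= N,
# so the bound is typically M itself
```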

Remark: The encoding scheme in (4.4) proposed in this section can be regarded as a direct generalization of Kosko's scheme in (2.2). To see this, suppose that {X^(1), X^(2), ..., X^(M)} and {Y^(1), Y^(2), ..., Y^(M)} are both normalized orthogonal patterns. Then, in this case, we have

P_1(X) = X^(1) (X^(1))^T and H_1(Y, X) = Y^(1) (X^(1))^T

in the Learning Algorithm. Furthermore, if

P_k(X) = Σ_{j=1}^{k} X^(j) (X^(j))^T and H_k(Y, X) = Σ_{j=1}^{k} Y^(j) (X^(j))^T   (4.11)

for some integer k ≥ 1, then by the definition of the Learning Algorithm,

E_k(X) = X^(k+1) - P_k(X) X^(k+1) = X^(k+1),
η_k(Y, X) = Y^(k+1) - H_k(Y, X) X^(k+1) = Y^(k+1),

and hence,

P_{k+1}(X) = Σ_{j=1}^{k+1} X^(j) (X^(j))^T and H_{k+1}(Y, X) = Σ_{j=1}^{k+1} Y^(j) (X^(j))^T.

That is, (4.11) is valid for k + 1 also. By induction, this in turn implies the validity of (4.11) for any k = 1, 2, ..., M. In particular, B = H_M(Y, X) = M^T and A = H_M(X, Y) = M, where M is defined in (2.2), then follow.

V. SIMULATIONS AND COMPARISONS

To evaluate the effectiveness of the ABAM, a series of simulation runs was made to compare the performance of the HOP (Hopfield network), BAM, IBAM, and ABAM in terms of their restorability, error-correction capability, and convergence.

A. Restorability

Based on the definition of restorability, we need to find from the simulation runs the number of stable states a network can restore given a set of randomly generated patterns {Z^(j)}_{j=1}^{M}, where Z^(j) = (Z_1^(j), Z_2^(j), ..., Z_L^(j)). The larger the number, the higher the restorability. In the simulation studies, we selected M = 100 and L = 120. Each pattern set (from a set of one pattern to a set of M patterns) was randomly tested 10 times. For each network model, a total of 50,500 patterns were tested.

For the HOP model, the number of stable states is the number of random patterns Z^(j), j = 1, 2, ..., M, which satisfy

Z^(j) = sgn(W Z^(j)),   (5.1)

where

W = Σ_{i=1}^{M} Z^(i) (Z^(i))^T

is the connection matrix in the HOP evolution equation:

Z(t + 1) = sgn(W Z(t) - θ).   (5.2)
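The stable-state count of (5.1) can be sketched as follows (the function name, seed, and the smaller toy M are illustrative assumptions):

```python
import numpy as np

def sgn(v):
    return np.where(v >= 0, 1, -1)

def hop_stable_count(Z):
    # count the encoded patterns that satisfy (5.1): Z^(j) = sgn(W Z^(j))
    # with W = sum_i Z^(i) (Z^(i))^T; patterns are the columns of Z (L x M)
    W = Z @ Z.T
    stable = sgn(W @ Z) == Z
    return int(np.all(stable, axis=0).sum())

rng = np.random.default_rng(2)
Z = np.where(rng.standard_normal((120, 10)) >= 0, 1, -1)   # L = 120, M = 10
count = hop_stable_count(Z)
```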

To test the BAM, IBAM and ABAM, we let Z^(j) = ((X^(j))^T, (Y^(j))^T)^T, where X^(j) and Y^(j) are 60-dimensional vectors. Then the numbers of stable states in the BAM, IBAM, and ABAM are the numbers of random patterns (the same patterns used in the HOP) Z^(j) = ((X^(j))^T, (Y^(j))^T)^T, j = 1, 2, ..., M, which respectively satisfy

X^(j) = sgn(M Y^(j)),   (5.3)
Y^(j) = sgn(M^T X^(j)),   (5.4)

where M is the connection matrix in (2.2); and

X^(j) = sgn(P_X · sgn(M Y^(j))),   (5.5)
Y^(j) = sgn(P_Y · sgn(M^T X^(j))),   (5.6)

where P_X and P_Y are the matrices in (2.9) and (2.10); and

X^(j) = sgn(H_M(X, Y) Y^(j)),   (5.7)
Y^(j) = sgn(H_M(Y, X) X^(j)),   (5.8)

where H_M is defined in (4.6).

The simulation results, as depicted in Fig. 1, demonstrate that the ABAM outperforms all other networks. In particular, it guarantees that any number of prototype patterns not larger than 60 are all stable states of the ABAM. It should be noticed that in the simulation runs the patterns X^(j) and Y^(j) are both 60-dimensional vectors. That is, the simulation runs support all claims of Theorems 1 and 2.

B. Error-Correction Capability

The error-correction capability of a network is determined by the network's capability to recover the original stable state when an error occurs in any of its component(s). In the simulations, we observe whether or not a stable state can be recovered when its components are put, one by one, in error (i.e., changed from positive to negative or vice versa in the bipolar pattern). The "number of tolerable error bits" is used to measure the maximum number of correctable errors. That is, it is the maximum number of components which can be permitted to be in error before a stable pattern becomes irrecoverable.

The above rule was applied to the simulation runs performed on the HOP, BAM, IBAM, and ABAM. The procedure was:

(a) Find all common stable states of the HOP, BAM, IBAM, and ABAM from the simulations in (A. Restorability).
(b) For each such stable state (the same in all networks), Z^(i) = (Z_k^(i), k = 1, 2, ..., N), set Z_k^(i) = -Z_k^(i) for each component k in Z^(i), and observe whether Z^(i) can be recovered.


Fig. 1. Comparison of restorability (horizontal axis: number of stored patterns).

Since only 21 stable patterns common to the HOP, BAM, IBAM, and ABAM were found (see Fig. 1), only these 21 stable states were tested for error-correction capability in these networks.

To test the HOP for error-correction capability, we let {Z^(i), i = 1, 2, ..., 21} be the stable states (common to all networks). Let Z^(i) = (Z_1^(i), Z_2^(i), ..., Z_N^(i)) be a specific stable state i, and Z_[1]^(i) be the state when Z_1^(i) in Z^(i) is set to -Z_1^(i), i.e., Z_[1]^(i) = (-Z_1^(i), Z_2^(i), ..., Z_N^(i)). Based on Z_[1]^(i), we tested whether the original state Z^(i) could be recovered by subjecting Z_[1]^(i) to the evolution equation in (5.2) with Z(0) = Z_[1]^(i). If the original stable state could not be recovered within 10 iterations, then the error was considered uncorrectable. If from Z_[1]^(i) we could recover Z^(i), then we also set Z_2^(i) in Z^(i) to -Z_2^(i) and denoted the new state as Z_[2]^(i) = (-Z_1^(i), -Z_2^(i), Z_3^(i), ..., Z_N^(i)). The state Z_[2]^(i) was then tested. This process was repeated until, at Z_[q]^(i), q ≤ N (N = 60), Z^(i) could no longer be recovered. The error-correction capability of the HOP is then q - 1 (i.e., only q - 1 components can have a change in sign).
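The cumulative bit-flip test can be sketched as follows, on a toy one-pattern Hopfield-style network (the function name, the flip order along the first components, and the toy sizes are illustrative assumptions):

```python
import numpy as np

def sgn(v):
    return np.where(v >= 0, 1, -1)

def tolerable_error_bits(W, z_star, max_iter=10):
    # flip components of the stable state z_star cumulatively (first q bits)
    # and count how many flips the update Z <- sgn(W Z) can still correct
    # within max_iter iterations; return q - 1 at the first failure
    n = len(z_star)
    for q in range(1, n + 1):
        z = z_star.copy()
        z[:q] = -z[:q]                       # first q components in error
        for _ in range(max_iter):
            z = sgn(W @ z)
            if np.array_equal(z, z_star):
                break
        else:
            return q - 1                     # q errors no longer correctable
    return n

z_star = np.array([1, -1, 1, 1, -1, -1, 1, -1])   # one stored pattern, N = 8
W = np.outer(z_star, z_star)                      # outer-product encoding
bits = tolerable_error_bits(W, z_star)            # 3 for this toy network
```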

To be able to compare with the BAM, the state variable Z(t) in the HOP was structured in the simulations as Z(t) = (X(t), Y(t))^T, where X(t) and Y(t) are 60-dimensional vectors, and the initial state was taken as Z(0) = (X_[j]^(i), 0)^T for i = 1, 2, ..., 21 and j = 1, 2, ..., q.

To test the BAM for error-correction capability, we again set the state variable (X(t), Y(t))^T = Z(t) and took the initial states as Z(0) = (X_[j]^(i), 0)^T for i = 1, 2, ..., 21 and j = 1, 2, ..., N. The same procedure as in the HOP experiment was employed, but in this case we applied the evolution equations (2.3)-(2.4) to test whether or not Z^(i) = (X^(i), Y^(i))^T could be restored.

The tests of the IBAM and ABAM were similar to that of the BAM. The simulation results are depicted in Fig. 2, which reveals that the error-correction capabilities of all networks are almost the same for a small number of patterns. However, as the number of patterns increases, the ABAM has a higher error-correction capability.

Fig. 2. Comparison of error-correction capability.

C. Convergence

Convergence of a network is determined by the probability of a randomly generated vector converging to a given set of prototype patterns. Therefore, the number of spurious stable states of a network can be determined correspondingly.

To compare across all networks, we required the prototype patterns {Z^(i), i = 1, 2, ..., M} used to encode the connection matrices to be simultaneously stable states of the HOP, BAM, IBAM, and ABAM. That is, any Z^(j) ∈ {Z^(i), i = 1, 2, ..., M} has to satisfy (5.1), (5.3) and (5.4), (5.5) and (5.6), as well as (5.7) and (5.8). Due to this imposed strict condition, a very large number of selections had to be made before the simulation runs could be performed.

After numerous selections, 11 common prototype patterns were finally selected for developing the HOP, BAM, IBAM and ABAM networks and for making the comparison study. For each n, n = 1, 2, ..., 11, 100 random vectors were generated to compute their probabilities of converging to the prototype patterns. Therefore, a cumulative total of 1100 random vectors was used for each network.

To test the HOP for convergence, the following procedure was carried out:

Given a random vector Z, we obtained

Z* = sgn(W Z*)

through the iterative model

Z_{t+1} = sgn(W Z_t),   t = 0, 1, 2, ...,   Z_0 = Z.

We then determined whether Z* was among the stored patterns {Z^(i), i = 1, 2, ..., 11}. If yes, then Z converged to a prototype pattern. If not, then Z converged to a spurious stable state. The trend of convergence (Fig. 3) was then obtained by computing cumulatively over the 100 random vectors.

To test the BAM for convergence, we let a random vector Z = {X, Y} be the initial state of the following iterative model

X_{t+1} = sgn(WY_t),

Y_{t+1} = sgn(W^T X_{t+1}), t = 0, 1, 2, ..., and obtained, through iterative computation,

X* = sgn(WY*)

Y* = sgn(W^T X*)

Fig. 3. Comparison of convergence (abscissa: number of stored patterns).

If Z* = (X*, Y*) was one of the stored prototype patterns, then convergence had taken place; if not, Z* had converged to a spurious stable state. The same procedure was applied to the IBAM and ABAM, and the results are depicted in Fig. 3.
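The bidirectional iteration used in this test can be sketched as follows (function and variable names are our own; the update order shown, X first and then Y from the updated X, is the usual Kosko scheme):

```python
import numpy as np

def sgn(v):
    # Bipolar sign with the convention sgn(0) = +1.
    return np.where(v >= 0, 1, -1)

def bam_fixed_point(W, x0, y0, max_iters=200):
    """Run the bidirectional iteration X_{t+1} = sgn(WY_t),
    Y_{t+1} = sgn(W^T X_{t+1}) from the probe (x0, y0) until it
    reaches a resonating pair (X*, Y*)."""
    x, y = x0, y0
    for _ in range(max_iters):
        x_new = sgn(W @ y)
        y_new = sgn(W.T @ x_new)
        if np.array_equal(x_new, x) and np.array_equal(y_new, y):
            break                 # X* = sgn(WY*), Y* = sgn(W^T X*)
        x, y = x_new, y_new
    return x, y
```

Checking whether the returned pair matches a stored (X^(i), Y^(i)) gives one Bernoulli trial of the convergence probability.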

In general, the ABAM converges better than the other networks, especially as the number of prototype patterns increases. In terms of spurious stable states, when the number of stored prototype patterns was 5, 100% convergence to the stored patterns was observed in the BAM, IBAM, and ABAM, while only 90% was recalled in the HOP. When there were 6 stored prototype patterns, however, a 20% convergence (i.e., 80% spurious stable states) was recorded in the ABAM, as compared with the 7 to 8% convergence (i.e., 92 to 93% spurious stable states) found in the other networks. Though the ABAM performs better, it is still not immune to the spurious-state problem that often crops up in neural networks with linear interconnections.

VI. CONCLUSION

In this paper we have introduced an asymmetric bidirectional associative memory network, the ABAM. This network is constructed within the framework of Kosko's BAM model but with an asymmetric encoding scheme for the connection matrix. Based on the application of a specific interpolation operator, the new encoding scheme provides not only a direct generalization of Kosko's scheme, but also a guaranteed recall of all training pairs (X^(i), Y^(i)) whenever the pattern populations {X^(1), X^(2), ..., X^(M)} and {Y^(1), Y^(2), ..., Y^(M)} are linearly independent. Therefore, the ABAM has been shown to be:

(a) the simplest asymmetric heteroassociative memory;
(b) the network possessing the largest storage capacity among all existing BAM-like networks with linear connections;
(c) the best performer, compared with the Hopfield network, the BAM, and its many variants, in terms of storage capacity, error-correcting capability, and convergence.

It is apparent that the ABAM is a more powerful and general model for associative memories. In particular, it has great potential in the construction of connectionist expert systems, especially when the connections (i.e., logical implications) are asymmetric, the patterns are non-orthogonal, or the number of patterns is relatively large.

REFERENCES

[1] M. A. Cohen and S. Grossberg, "Absolute stability of global pattern formation and parallel memory storage by competitive neural networks," IEEE Trans. Syst., Man, and Cybern., vol. 13, pp. 815-826, 1983.
[2] J. J. Hopfield, "Neural networks and physical systems with emergent collective computational ability," Proc. Nat. Acad. Sci. USA, vol. 79, pp. 2554-2558, 1982.
[3] ——, "Neurons with graded response have collective computational properties like those of two-state neurons," Proc. Nat. Acad. Sci. USA, vol. 81, pp. 3088-3092, 1984.
[4] J. A. Anderson and E. Rosenfeld (eds.), Neurocomputing: Foundations of Research. Cambridge: MIT Press, 1988.
[5] G. Dunning, E. Marom, Y. Owechko and B. Soffer, "Optical holographic associative memory using a phase conjugate resonator," Proceedings of the SPIE, vol. 625, pp. 205-213, 1986.
[6] G. Carpenter and S. Grossberg, "The ART of adaptive pattern recognition by a self-organizing neural network," Computer, pp. 77-87, March 1988.
[7] B. Kosko, "Bidirectional associative memories," IEEE Trans. Syst., Man, and Cybern., vol. 18, no. 1, pp. 49-60, 1988.
[8] ——, "Adaptive bidirectional associative memories," Appl. Opt., vol. 26, no. 23, pp. 4947-4960, 1987.
[9] C. C. Guest and R. TeKolste, "Design and devices for optical bidirectional associative memories," Appl. Opt., vol. 26, no. 23, pp. 5055-5059, 1987.
[10] Z. B. Xu, C. P. Kwong and X. W. He, "Hopfield network and BAM as heteroassociative memory: A critical comparison," submitted to IEEE Trans. Neural Networks.
[11] P. K. Simpson, "Higher-ordered and intraconnected bidirectional associative memories," IEEE Trans. Syst., Man, and Cybern., vol. 20, no. 3, pp. 637-652, 1990.
[12] Y. F. Wang, J. B. Cruz, Jr. and J. H. Mulligan, "Two coding strategies for bidirectional associative memory," IEEE Trans. Neural Networks, vol. 1, no. 1, pp. 81-92, 1990.
[13] ——, "Guaranteed recall of all training pairs for bidirectional associative memory," IEEE Trans. Neural Networks, vol. 2, no. 6, pp. 559-567, 1990.
[14] B. L. Zhang, B. Z. Xu and C. P. Kwong, "Performance analysis of bidirectional associative memory from the matched-filtering viewpoint and a new way for improvement," to appear in IEEE Trans. Neural Networks, 1992.
[15] L. Shastri, Semantic Networks: An Evidential Formalization and Its Connectionist Realization. London: Pitman, 1988.
[16] G. H. Hinton (ed.), Connectionist Symbol Processing. Amsterdam: Elsevier, 1990.
[17] C. R. Rao and S. K. Mitra, Generalized Inverse of Matrices and Its Applications. New York: John Wiley, 1971.