Evolving Neural Network Models

Yoshiaki Tsukamoto, Akira Namatame

Dept. of Computer Science, National Defense Academy
Yokosuka, Kanagawa 239
{tsuka,nama}@cc.nda.ac.jp

Abstract- Neural networks in nature are not designed but evolved, and they should learn their structure through interaction with their environment. This paper introduces the notion of an adaptive neural network model with reflection. We show how reflection can implement adaptive processes, and how adaptive mechanisms are actualized using the concept of reflection. Learning mechanisms must be understood in terms of their specific adaptive functions. We introduce an adaptive function which enables the network to adjust its internal structure by itself by modifying its adaptive function and the associated learning parameters. We then provide the model of emergent neural networks. We show that the emergent neural network model is especially suitable for constructing large scale and heterogeneous neural networks with composite and recursive architectures, where each component unit is modeled as another neural network. Using the emergent neural network model, we introduce the concepts of composition and recursion for integrating heterogeneous neural network modules which are trained individually.

I. INTRODUCTION

The ability to learn is the most important property of living systems. Evolution and learning are the two most fundamental adaptation processes, and their relationship is very complex. Studying the evolution and development processes of biological systems can reveal how structures are formed through interactions with the environment of the living systems in nature. These structural adaptation mechanisms in biological systems can suggest ways of building an adaptable structure, so that the network finally grows to a configuration suitable for the class of problems characterized by the training patterns. The initial step is to explore the aspects of this relationship by defining the process of evolution as the process by which the learning procedure is adjusted [4]. Taking this approach, we view evolution as a kind of adaptive process of the learning mechanism. Here, the learning process itself is the object of evolution. Current neural network models allow the networks to adjust their behavior by changing the interconnection weights associating neurons with each other, but the architecture of the network must be set up by system designers, and once the structure is designed, it has to remain fixed. This places a strong constraint on the applicability of neural network models. In most current neural network models, learning is done through modification of the synaptic weights of the neurons in the network [1][5][8]. This kind of learning is basically a parameter adaptation process. The structure of neural networks should be evolved and developed rather than pre-specified, and we need to develop a framework for such an adaptable process. We introduce an adaptive neural network model with reflection as a framework for an adaptive process evolving in a dynamic environment. Adaptation is viewed either as a modification of one's behavior or as a modification of one's environment. Learning mechanisms must be understood in terms of their specific adaptive functions. We introduce an adaptive function of a neural network, and a self-reflective or adaptive process of a neural network is modeled to adjust its internal structure to evolving environments by modifying its adaptive function and its associated learning parameter. We also investigate reflective learning among multiple neural network modules [7]. In a multiple-modules setting, two types of reflection may occur: each network module learns on its own by adjusting its adaptive function and its associated learning parameter, while at the same time, each network module mutually interacts and learns as a group to obtain the coordinated adaptive functions and learning parameters.

II. FORMULATION OF ADAPTIVE NEURAL NETWORKS

A. Definition of an adaptive function

The weighing-evidence scheme does not demand that every feature of an object be present; instead, it only weighs the evidence that the object is present. That is, a knowledge object becomes active whenever the weighted summation of the present features exceeds the threshold level. Numerical weights are assigned to each feature, and this is based on the theme of weighing evidence. We denote the set of objects by W = {O_i : i = 1, 2, ..., k}. The list of feature values is represented as d = (d_1, d_2, ..., d_n) ∈ D, where the variables d_1, d_2, ..., d_n take Boolean values. We denote training examples as the ordered pairs ⟨d_t, c_t⟩ where d_t ∈ D and c_t ∈ {0, 1}. The set of the ordered pairs (D, C) = {⟨d_t, c_t⟩ : t = 1, 2, ..., T} is termed a training set.

Definition 1 Let C+ and C− be the sets of the positive and negative examples of the concept C. The summations of the inner products of the Boolean vectors d_p, d_q ∈ D, denoted as G+(d_p) and G−(d_q), are defined as:



G+(d_p) = Σ_{d_q ∈ C+} ⟨d_p, d_q⟩,   G−(d_p) = Σ_{d_q ∈ C−} ⟨d_p, d_q⟩   (1)

where

⟨d_p, d_q⟩ = Σ_{i=1}^{n} d_pi d_qi   (2)

We define the similarity matrix of the training set D under the concept C by the T × 2 matrix defined as

T(D, C) = [G+(d_t)  G−(d_t)],  t = 1, 2, ..., T   (3)

In the next theorem, we provide the procedure for obtaining the activation function as the method of each object.

Theorem 1 [6] Suppose the similarity matrix T(D, C) is linearly separable. We define the connection weights w_i, i = 1, 2, ..., n, as

w_i = Σ_{d_p ∈ C+} d_pi − Σ_{d_q ∈ C−} d_qi   (4)

and the activation function with the above connection weights as

F(x) = Σ_{i=1}^{n} w_i x_i + θ   (5)

Then the activation function becomes the linear threshold function of a neuro-object, that is, it satisfies

(i) For a positive training example: d_p ∈ C+ → F(d_p) > 0

(ii) For a negative training example: d_h ∈ C− → F(d_h) < 0
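To make the construction concrete, the following Python sketch builds the connection weights and the activation function from a Boolean training set. It is a minimal illustration assuming the reconstructed form of Eq. (4), w_i = Σ_{d_p∈C+} d_pi − Σ_{d_q∈C−} d_qi, and a hand-chosen threshold θ; the function names and the toy data are ours, not the paper's.

```python
import numpy as np

def weighing_evidence_weights(D, c, theta=0.0):
    """Build connection weights from Boolean training vectors.

    D : (T, n) array of Boolean feature vectors d_t
    c : (T,)   array of concept values c_t in {0, 1}
    Assumes the reconstructed form of Eq. (4):
        w_i = sum_{d in C+} d_i - sum_{d in C-} d_i
    """
    D = np.asarray(D, dtype=float)
    c = np.asarray(c)
    pos = D[c == 1]            # positive examples C+
    neg = D[c == 0]            # negative examples C-
    w = pos.sum(axis=0) - neg.sum(axis=0)
    return w, theta

def activation(w, theta, x):
    """Linear threshold activation F(x) = sum_i w_i x_i + theta (Eq. (5))."""
    return float(np.dot(w, x) + theta)

# Toy training set: the object is 'present' when features 1 and 2 are both on.
D = [[1, 1, 0], [1, 1, 1], [1, 0, 0], [0, 0, 1]]
c = [1, 1, 0, 0]
w, theta = weighing_evidence_weights(D, c, theta=-2.0)
for d_t, c_t in zip(D, c):
    print(d_t, c_t, activation(w, theta, d_t) > 0)
```

On this toy set the resulting threshold unit answers positively exactly on the positive examples, illustrating the separation property stated in Theorem 1.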



Fig. 1. A self-learning process to a new learning environment.

A neuro-agent learns its way of responding to a message by a kind of weighing-evidence learning method. The weighing-evidence scheme does not demand that every feature be present; instead, it only weighs the evidence that a feature is present.

IV. A MODEL OF EVOLVING NEURAL NETWORKS

A. Composition of homogeneous network modules

In this section, we discuss cooperative learning among multiple homogeneous neural network modules, and show the coordination mechanism. We consider composite learning composed of a set of modules, each coupled with a different subset of the whole learning space T = (D, C). The training data set T = (D, C) is composed from the subsets T_i, i = 1, 2, ..., n, where T_i represents the i-th subset of T = (D, C). In this method of composition, however, the results of the distributed modules are under-specified with respect to the whole problem [2][3][8]. In the next theorem, we will show that the properties of each subset of the training set essentially determine the learnability of the whole training set. We denote the training set of each network module as T_i = (D_i, C_i), i = 1, 2, ..., n. The integrated training set is described as T = (D, C) = ∪_{i=1}^{n} T_i. The aggregate matrix of each network module is represented by T(D_i, C_i), where T(D_i, C_i) are the respective similarity matrices. At the composite phase, each network module trained with the initial training set T(D_i, C_i), i = 1, 2, ..., n, modifies its adaptive function for coordination as follows: each network module, i = 1, 2, ..., n, modifies its similarity matrix as follows:

where

I(D_i, D_j, C_j) = D_i D_j^T [C_j, 1 − C_j]   (7)

Theorem 2 If the similarity matrices T_i(D, C) are linearly separable on learning parameter L = (α, β, θ), then the whole training set (D, C) is also linearly separable on learning parameter L = (α, β, θ). The connection weights w_i, i = 1, 2, ..., m, are given as the summation of the weights w_i^j, j = 1, 2, ..., n, obtained from each subset (D_j, C_j), j = 1, 2, ..., n, of the training set (D, C).

w_i = Σ_{j=1}^{n} w_i^j   (9)
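A small sketch of the composition expressed by Eq. (9), assuming each module computes weighing-evidence weights (as in the sketch of Section II) on its own training subset and the composite weights are the element-wise sum of the module weights; all names and data here are illustrative.

```python
import numpy as np

def weighing_evidence_weights(D, c):
    # per-module training on its own subset (as sketched in Section II)
    D, c = np.asarray(D, dtype=float), np.asarray(c)
    return D[c == 1].sum(axis=0) - D[c == 0].sum(axis=0)

def compose_modules(subsets):
    """Composite weights as the summation of per-module weights (Eq. (9))."""
    return np.sum([weighing_evidence_weights(D_i, c_i) for D_i, c_i in subsets], axis=0)

# Two modules, each coupled with a different subset of the whole training set.
T1 = ([[1, 1, 0], [1, 0, 0]], [1, 0])
T2 = ([[1, 1, 1], [0, 0, 1]], [1, 0])
w = compose_modules([T1, T2])
print(w)  # [1. 2. 0.], equal to the weights trained on the union of T1 and T2
```

Because the weighing-evidence weights are additive over training examples, the sum of the module weights coincides with the weights obtained from the whole training set, which is the content of Theorem 2 in this toy setting.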

We now discuss composite learning in which some training subset may contain inconsistent training examples: the concept value C may be different for the same input pattern. That is, the training subsets (D_i, C_i), i = 1, 2, ..., n, may contain input patterns d_t ∈ D_i, d_s ∈ D_j, i ≠ j, such that d_t = d_s and c_t ≠ c_s. Since the target concept is determined by the combination of the decomposed input space, this kind of inconsistency may occur in the composition of the training subsets. The inconsistency problem is handled by generating appropriate intermediate concepts H = [H_1, H_2, ..., H_n] so that each training subset (D_i, C_i) becomes a consistent training set under the intermediate concept H_i, i = 1, 2, ..., n, by generating hidden units as follows.



Suppose there are some inconsistent training examples d_1, d_2, ..., d_m ∈ D_i such that they are the same input patterns, d_1 = d_2 = ... = d_m, with different concept values. In this case, we generate a hidden unit corresponding to those inconsistent input patterns such that it classifies those input patterns as positive training examples and all others as negative examples. Then the training subsets (D_i, H_i), i = 1, 2, ..., n, become consistent training sets. The concept C is then represented as the disjunctive form of those intermediate concepts.
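The generation of intermediate concepts can be sketched as follows. This is our own reading of the procedure: identical input patterns with conflicting concept values are grouped, and one Boolean intermediate concept H_k is created per conflicting pattern; the function name and the toy data are illustrative.

```python
import numpy as np

def intermediate_concepts(D, c):
    """Generate intermediate concepts for inconsistent training examples.

    For every input pattern that occurs with conflicting concept values,
    create one Boolean target column H_k that is 1 on the copies of that
    pattern and 0 elsewhere, so each (D, H_k) is a consistent training set.
    """
    D, c = np.asarray(D), np.asarray(c)
    positions = {}
    for t, d in enumerate(map(tuple, D)):
        positions.setdefault(d, []).append(t)
    H = []
    for d, idx in positions.items():
        if len(set(c[idx])) > 1:              # same pattern, different labels
            h = np.zeros(len(c), dtype=int)
            h[idx] = 1                        # positive only on this pattern
            H.append(h)
    return H

# Pattern (1, 0) appears twice with different concept values -> one hidden concept.
D = [[1, 0], [1, 0], [0, 1]]
c = [1, 0, 1]
print(intermediate_concepts(D, c))            # [array([1, 1, 0])]
```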

B. Composition of heterogeneous network modules

In this section, we discuss cooperative learning among multiple heterogeneous neural network modules, and show the coordination mechanism. We denote the training set of each network module as T_i = (D_i, C), i = 1, 2, ..., n. The integrated training set is described as T = (D, C), where D_i ⊂ D and ∪_{i=1}^{n} D_i = D. The aggregate matrix of each network module is represented by T(D_i, C), where T(D_i, C) are the respective similarity matrices. At the composite phase, each network module trained with the initial training set T(D_i, C), i = 1, 2, ..., n, modifies its adaptive function for coordination as follows: each network module, i = 1, 2, ..., n, modifies its similarity matrix as follows:

T_i(D, C) = Σ_{j=1}^{n} T(D_j, C)   (10)

Theorem 3 If the similarity matrices T_i(D, C) are linearly separable on learning parameter L = (α_i, θ_i), then the whole training set (D, C) is also linearly separable on learning parameter L = (α, θ = Σ_i θ_i). The connection weights w_i, i = 1, 2, ..., m, are given as the summation of the weights w_i^j, j = 1, 2, ..., n, obtained from each subset (D_j, C_j), j = 1, 2, ..., n, of the training set (D, C).
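The following sketch illustrates our reading of Theorem 3 for the heterogeneous case: each module sees only its own attribute subspace D_i of the shared patterns and the common concept C, the per-attribute weights are summed back into the full attribute space, and the composite threshold is the sum of the per-module thresholds θ_i. The attribute grouping and function name are assumptions for illustration.

```python
import numpy as np

def compose_heterogeneous(D, c, attribute_groups, thetas):
    """Heterogeneous composition: one module per attribute subspace.

    D                : (T, n) Boolean patterns over the whole attribute space
    c                : (T,) shared concept values
    attribute_groups : list of attribute-index lists, one subspace per module
    thetas           : per-module thresholds theta_i
    Returns composite weights over all n attributes and theta = sum_i theta_i.
    """
    D = np.asarray(D, dtype=float)
    c = np.asarray(c)
    w = np.zeros(D.shape[1])
    for idx in attribute_groups:
        D_i = D[:, idx]                                   # module's view D_i
        w_i = D_i[c == 1].sum(axis=0) - D_i[c == 0].sum(axis=0)
        w[idx] += w_i                                     # place back into the full space
    return w, float(np.sum(thetas))

D = [[1, 1, 0], [1, 1, 1], [1, 0, 0], [0, 0, 1]]
c = [1, 1, 0, 0]
w, theta = compose_heterogeneous(D, c, [[0, 1], [2]], thetas=[-1.0, -1.0])
print(w, theta)
```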

C. Constructing large scale neural networks

This section provides the abstract design model of neural networks. An abstract model is especially suitable for constructing large scale and heterogeneous neural networks. The large scale and heterogeneous neural networks are made up of many small scale networks which are trained individually. We consider a set of objects W = {O_i : i = 1, 2, ..., n}, and each object is characterized by n attributes A_1, A_2, ..., A_n with the domains Dom(A_i), i = 1, 2, ..., n.

Definition 2 Any subset C_i of the object space W, termed a class, is defined as

C_i = {A_1 : Dom(A_1), A_2 : Dom(A_2), ..., A_n : Dom(A_n)}   (13)

The set of these classes

C = {C_1, C_2, ..., C_k}   (14)

with the class-subclass relation such that for C_i, C_j ∈ C

C_i → C_j or C_j → C_i   (15)

is defined as a generalization hierarchy. A generalization hierarchy is a class-subclass hierarchy in which objects belonging to a subclass inherit the properties of their immediate superclass. Each object in W can be described by a list of bit vectors. The i-th attribute of each object is described by a vector ψ_i of length m_i, the j-th position of the vector ψ_i being either 1 or 0, indicating that the j-th value of the attribute A_i is or is not present, respectively. Each object, therefore, consists of a set of vectors (ψ_1, ψ_2, ..., ψ_n) of lengths m_i. Each object is classified into several classes C = C_1 × C_2 × ... × C_k. Each element {c_1, c_2, ..., c_k} of C is termed a class value, and c_i takes one if an object O_i belongs to the class C_i and zero otherwise. An object O_i is then represented as a vector of pairs of the form

where ψ_{A_i}(O_i) is termed the attribute characteristic function defined over the attribute subspace of A_i, and γ_{C_j}(O_i) is termed the class characteristic function defined over the class C_j. The attribute characteristic function defined over the attribute space A = A_1 × A_2 × ... × A_n is denoted as

ψ_A(O_i) = {ψ_{A_1}(O_i), ψ_{A_2}(O_i), ..., ψ_{A_n}(O_i)}   (17)

Similarly, the class characteristic function defined over the class space C = C_1 × C_2 × ... × C_k is defined as

γ_C(O_i) = {γ_{C_1}(O_i), γ_{C_2}(O_i), ..., γ_{C_k}(O_i)}   (18)

The set of these characteristic functions over the whole object space W = {O_i : i = 1, 2, ..., n} defines the training set.

We propose hierarchical learning to specify the class code ρ_C(O_i), which allows all ancestors and descendants of any class in a generalization hierarchy to be inferred simultaneously. With this hierarchical learning algorithm, newly generated classes can be self-organized into the pre-existing generalization hierarchy C. We assume the hierarchical relation property for a pair of classes C_i, C_j ∈ C. We define the indices for the elements of C that are obtained by the following hierarchical learning algorithm.

Step 1.1 Each element of C is locally coded. That is, we provide the local code r_i to C_i, the i-th element of C, where the i-th element of r_i is 1 and the others are zero.




Fig. 4. Evolving networks with the composite and recursive architecture.

Step 1.2 Each class C_j inherits the hierarchical code from its immediate ancestor class. That is, the class characteristic code of C_j after the inheritance is modified as follows:

r_j = r_j ⊕ r_sum   (21)

where

Step 1.3 For the immediate descendant class C_j of C_i, the class code of C_i is inherited as the class code of C_j as follows:

r_j = r_i ⊕ r_j   (22)

where ⊕ represents the bitwise OR. We have the following property for the class codes.

Lemma 1 For a pair of class codes r_i and r_j

r_i ⊕ r_j = r_j if C_i → C_j, and r_i ⊕ r_j ≠ r_j otherwise.   (23)
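The coding steps above can be written out as a short sketch. The parent map used to represent the generalization hierarchy and the function name are our own; the inheritance rule is the bitwise OR of Eqs. (21)-(22), and the final check illustrates Lemma 1.

```python
def hierarchical_class_codes(classes, parent):
    """Steps 1.1-1.3: local codes combined by bitwise OR along the hierarchy.

    classes : ordered list of class names C_1, ..., C_k
    parent  : dict mapping each class to its immediate superclass (or None)
    Returns a dict of class codes r_i, each a list of k bits.
    """
    k = len(classes)
    index = {c: i for i, c in enumerate(classes)}
    codes = {}
    for c in classes:                       # Step 1.1: local code with a single 1 bit
        r = [0] * k
        r[index[c]] = 1
        codes[c] = r
    for c in classes:                       # Steps 1.2-1.3: OR in every ancestor's code
        p = parent.get(c)
        while p is not None:
            codes[c] = [a | b for a, b in zip(codes[c], codes[p])]
            p = parent.get(p)
    return codes

# Animal -> Bird -> Penguin (class -> subclass)
classes = ["Animal", "Bird", "Penguin"]
parent = {"Animal": None, "Bird": "Animal", "Penguin": "Bird"}
codes = hierarchical_class_codes(classes, parent)
print(codes["Penguin"])                                  # [1, 1, 1]
r_i, r_j = codes["Animal"], codes["Penguin"]
print([a | b for a, b in zip(r_i, r_j)] == r_j)          # Lemma 1: True, since Animal -> Penguin
```

With this coding, testing whether one class is an ancestor of another reduces to a single bitwise OR and comparison, which is the property Lemma 1 states.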

The attribute characteristic function ψ_A(O_i) is obtained by applying the following equation:

Using the above coding scheme, we can generate the emergent neural networks with the composite and recursive architecture as shown in Figure 4.

V. CONCLUSION

This paper showed how an adaptive neural network model with reflection can implement adaptive processes. We introduced the concept of an adaptive function, and learning mechanisms were understood in terms of their specific adaptive functions. A neural network module was modeled to adjust its internal state to evolving training environments by modifying its adaptive function. We also investigated reflective learning among multiple network modules. In the multiple-modules setting, two types of reflection were considered: each network module mutually interacts and can learn as a group, while at the same time, each network module can also learn on its own by adjusting its adaptive function. The overall system is constructed from several interacting learning modules. We then discussed how an assembly of modules cooperates to learn as a function of their individual internal states. Many separate network modules combine their individually defined adaptive functions by coordinating learning parameters. We have also used the agent-oriented model as an architecturally neutral metaphor for describing massively parallel, distributed and cooperative neural network modules. An agent-oriented model for the specification of neural network models consists of cooperative distributed processing elements called autonomous neuro-agents with self reflection. Each neuro-agent encapsulates a specific set of knowledge obtained from a different training set. At the cooperative stage, neuro-agents put forward their learnt knowledge to obtain coordinated learning parameters.

REFERENCES

[1] Barnden, J. A. and Pollack, J. B.: High-Level Connectionist Models, Ablex Publishing (1991).

[2] Hrycej, T. (ed.): Modular Learning in Neural Networks, Wiley-Interscience (1992).

[3] Jacobs, R. A., Jordan, M. I. et al.: Adaptive Mixtures of Local Experts, Neural Computation, Vol. 3, pp. 79-87 (1991).

[4] Lee, T. (ed.): Structure Level Adaptation for Artificial Neural Networks, Kluwer Academic Publishers (1991).

[5] Nadal, J.: Study of a growth algorithm for a feed-forward network, International Journal of Neural Networks, Vol. 1, pp. 55-59 (1989).

[6] Namatame, A. and Tsukamoto, Y.: Structural Connectionist Learning with Complementary Coding, International Journal of Neural Systems, Vol. 3, No. 1, pp. 19-30 (1992).

[7] Sian, S. S.: The Role of Cooperation in Multi-agent Learning Systems, Cooperative Knowledge-Based Systems 1990 (Deen, S. M. (ed.)), Springer-Verlag, pp. 67-84 (1990).

[8] Sikora, R.: Learning Control Strategies for Chemical Processes, IEEE Expert, Vol. 7, No. 3, pp. 35-43 (1992).
