Evolution and Development of Neural Controllers



    796 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 9, NO. 5, SEPTEMBER 1998

Evolution and Development of Neural Controllers for Locomotion, Gradient-Following, and Obstacle-Avoidance in Artificial Insects

    Jerome Kodjabachian and Jean-Arcady Meyer

Abstract—This paper describes how the SGOCE paradigm has been used to evolve developmental programs capable of generating recurrent neural networks that control the behavior of simulated insects. This paradigm is characterized by an encoding scheme, by an evolutionary algorithm, by syntactic constraints, and by an incremental strategy that are described in turn. The additional use of an insect model equipped with six legs and two antennae made it possible to generate control modules that successively added gradient-following and obstacle-avoidance capacities to walking behavior. The advantages of this evolutionary approach, together with directions for future work, are discussed.

Index Terms—Animats, genetic programming, leaky integrators, recurrent neural networks, SGOCE paradigm.

    I. INTRODUCTION

SINCE the pioneering attempts of a few researchers [1]–[5] in the late 1980s, the automatic design of artificial neural networks using some variety of evolutionary algorithm has become a common occurrence (reviews in [6]–[9]), in particular in the application domains of evolutionary robotics (for a review, see [10]) and of animat design (for a review, see [11]). However, such an approach is not without raising specific problems [12], notably that of choosing how to genetically encode the neural networks produced by the evolutionary algorithm.

Indeed, it turns out that numerous encoding schemes that are currently used in such application domains, because they implement a direct so-called genotype-to-phenotype mapping, are hampered by a lack of scalability, according to which the size of the genetic description of a neural network grows as the square of the network's size. As a consequence, the evolutionary algorithm explores a genotypic space that grows bigger and bigger as the phenotypic solutions sought get more and more complex. Moreover, it also turns out that such encoding schemes are usually not able to generate modular architectures, i.e., they do not allow for repeated substructures that would help to encode complex control architectures in compact genotypes.

In [11], we have argued that it might be wise to tackle these problems in the same way that nature does, i.e., by using an indirect genotype-to-phenotype mapping that would insert a developmental process between the genotype and the phenotype of an animat interacting with its environment. In [13], we have implemented such a developmental process within the framework of an incremental evolutionary approach that made it possible to evolve one-dimensional (1-D) locomotion controllers in six-legged animats, i.e., in animats that walk along straight lines.

Manuscript received March 1, 1997; revised November 15, 1997. The authors are with the AnimatLab, Ecole Normale Supérieure, France. Publisher Item Identifier S 1045-9227(98)06181-5.

This paper reports on the extension of this approach to the automatic generation of neural networks controlling two-dimensional (2-D) locomotion and higher-level behaviors in simulated insects that live on a flat surface. It aims to contribute to the animat approach to cognitive science [14] and artificial life [15]. As such, it is heavily inspired by the work of Beer [16], who designed the nervous system of an artificial cockroach capable of walking, of avoiding obstacles, and of getting to an odorous food source, although the controllers that are used in the present work are evolved instead of being hand-coded. It is also heavily inspired by the work of Beer and Gallagher [17], who evolved the nervous system of a walking insect, although the neural networks that are evolved here are capable of controlling more than mere locomotion. Finally, it should be stressed that, although the methodology described herein has not yet been used to evolve controllers for real walking robots, such an application could be easily attempted along the tracks pioneered in [18] and [19].

The paper starts with a description of the simple geometry-oriented cellular encoding (SGOCE) evolutionary paradigm and of the Simulated Walking ANimat (SWAN) model of a hexapod animat that we are using. Experimental results on the evolution of simulated insects exhibiting a tripod walking rhythm and capable of both following an odor gradient and avoiding obstacles are then described. The paper ends with a discussion of the results and proposes directions for future work.

II. THE SGOCE EVOLUTIONARY PARADIGM

This paradigm is characterized by an encoding scheme that relates the animat's genotype and phenotype, by an evolutionary algorithm that generates developmental programs, by syntactic constraints that limit the complexity of the developmental programs generated, and by an incremental strategy that helps produce neural control architectures likely to exhibit increasing adaptive capacities.

    A. The Encoding Scheme

SGOCE is a geometry-oriented variation of Gruau's cellular encoding scheme [20]–[22] that is used to evolve simple developmental programs capable of generating complex recurrent neural networks [23], [24]. According to the SGOCE scheme, each cell in a developing network occupies a given position in a 2-D metric substrate and can get connected to other cells through efferent or afferent connections. In particular, such cells can get connected to sensory or motor neurons that have been positioned by the experimenter at initialization time in specific locations within the substrate. Moreover, during development, such cells may divide and produce new cells and new connections that expand the network. Ultimately, they may become fully functional neurons that participate in the behavioral control of a given animat, although they also may occasionally die and reduce the network's size.

Fig. 1. The three stages of the fitness evaluation procedure of a developmental program (Xi). First, the program is executed to yield an artificial neural network. Then the neural network is used to control the behavior of a simulated animat that has to solve a given task in an environment. Finally, the fitness of Program Xi is assessed, according to how well the task has been solved.

SGOCE developmental programs call upon subprograms that have a tree-like structure like those of genetic programming [25], [26]. Therefore, a population of such programs can evolve from generation to generation, a process during which individual instructions can be mutated within a given subprogram and subtrees belonging to two different subprograms can be exchanged.

At each generation, the fitness of each developmental program is assessed (Fig. 1). To this end, the experimenter must provide and position within the substrate a set of precursor cells, each characterized by a local frame that will be inherited by each neuron of its lineage and according to which the geometrical specifications of the developmental instructions will be interpreted. Likewise, the experimenter must provide and position a set of sensory cells and motoneurons that will be used by the animat to interact with its environment. Last, an overall description of the structure of the developmental program that will generate the animat's control architecture, possibly with the specification of a grammar that will constrain its evolvable subprograms, must be supplied (Fig. 2).

Fig. 2. Setup for the evolution of a neural network that will be used in Section III-A to control locomotion in a six-legged animat. The figure shows the state of the substrate before the development starts (positions of the sensory cells, motoneurons, and precursor cells within the substrate), as well as the structure of a developmental program that, in this example, calls upon seven subprograms. Subprograms SP0–SP5 are fixed (JP is a call instruction that forces a cell to start reading a new subprogram). Only subprogram SP6 needs to be evolved. Its structure is constrained by the GRAM-1 tree-grammar (to be described in Section II-C). Additional details are to be found in the text.

Each cell within the animat's control architecture is assumed to hold a copy of this developmental program. Therefore, the program's evaluation starts with the sequential execution of its instructions by each precursor cell and by each new cell occasionally created during the course of development (Fig. 3). At the end of this stage, a complete neural network is obtained, whose architecture will reflect the geometry and symmetries initially imposed by the experimenter, to a degree that depends on the side-effects of the developmental instructions that have been executed. Through its sensory cells and its motoneurons, this neural network is then connected to the sensors and actuators of the insect model to be described later. This, together with the use of an appropriate fitness function, makes it possible to assess the network's capacity to generate a specific behavior. Thus, from generation to generation, the reproduction of good controllers (and hence of good developmental programs) can be favored to the detriment of the reproduction of bad controllers and bad programs, according to standard genetic algorithm practice [27].
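The three-stage evaluation loop of Fig. 1 can be sketched as a plain function composition; `develop`, `simulate`, and `score` are hypothetical stand-ins for the paper's components, not actual SGOCE code:

```python
def evaluate(program, develop, simulate, score, T=1000):
    """Three-stage fitness evaluation of a developmental program (Fig. 1):
    (1) execute the program to develop a neural network, (2) let that
    network control a simulated animat for T cycles, (3) score the
    resulting behavior. All three callables are caller-supplied stand-ins."""
    network = develop(program)       # stage 1: genotype -> phenotype
    behavior = simulate(network, T)  # stage 2: run the animat
    return score(behavior)           # stage 3: task-specific fitness
```

Keeping the three stages as separate callables mirrors the paper's setup, where the same evolutionary loop is reused with different fitness functions in each incremental stage.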

In the present application, a small set of developmental instructions can be included in evolvable subprograms (Table I). A cell division instruction (DIVIDE) makes it possible for a given cell to generate a copy of itself. A direction parameter (α) and a distance parameter (r) associated with that instruction specify the position of the daughter cell to be created in the coordinates of the local frame attached to the mother cell.



Fig. 3. The developmental encoding scheme of SGOCE. The genotype that specifies the animat's nervous system is encoded as a program whose nodes are specific developmental instructions. In this example, the developmental program contains only one subprogram, which is read at a different position by each cell. The precursor cell first makes connections with the motoneuron M0 and the sensor S0 (Steps 1 and 2). Then it divides, giving birth to a new cell that gets the same connections as those of the mother cell (Step 3). Finally, the mother cell dies while the daughter cell makes a connection with the motoneuron M1 and changes the value of its time constant (Step 4).

Fig. 4. The effect of a sample developmental code. (a) When the upper cell executes the DIVIDE instruction, it divides. The position of the daughter cell in the mother cell's local frame is given by the parameters α and r of the DIVIDE instruction, which, respectively, set the angle and the distance at which the daughter cell is positioned. (b) Next, the mother cell reads the left subnode of the DIVIDE instruction while the daughter cell reads the right subnode. (c) As a consequence, a connection is grown from each of the two cells. The first two parameters of a GROW instruction determine a target point in the local frame of the corresponding cell. The connection is realized with the cell closest to the target point (a developing cell, an interneuron, a motoneuron, or a sensory cell), and its synaptic weight is given by the third parameter of the GROW instruction. Note that, in this specific example, the daughter cell being closest to its own target point, a recurrent connection is created on that cell. Finally, the two cells stop developing and become interneurons.

Then, the local frame associated with the daughter cell is centered on this cell's position and is oriented as the mother cell's frame (Fig. 4). Two instructions (GROW and DRAW) respectively create one new efferent and one new afferent connection. The cell to be connected to the current one is the closest to a target position that is specified by the instruction parameters, provided that the target position lies on the substrate (Fig. 4). No connection is created if the target is outside of the substrate's limits. Another instruction called GROW2 will be used in Sections III-B and III-C below. It is similar to instruction GROW but creates a connection from a cell in a given neural module to another cell in another module. The synaptic weight of a new connection is given by the third instruction parameter. Two additional instructions (SETTAU and SETBIAS) specify the values of a neuron's time constant τ and bias b. Finally, the instruction DIE causes a cell to die. The functional roles of the connection weights, time constants, and biases are explained below.
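A minimal sketch of how DIVIDE and GROW act on the 2-D substrate (local polar frames, nearest-cell connection, off-substrate targets rejected). The `Cell` class, the frame conventions, and the substrate bounds, taken as (0, 0)–(1536, 1536) from Table II's arbitrary units, are illustrative reconstructions, not the actual SGOCE implementation:

```python
import math

class Cell:
    """A developing cell: a position, a local frame orientation, and a
    list of efferent connections (target cell, synaptic weight)."""
    def __init__(self, x, y, angle=0.0):
        self.x, self.y, self.angle = x, y, angle
        self.out = []

def local_to_substrate(cell, alpha, r):
    """Convert polar coordinates (alpha, r) expressed in a cell's local
    frame into substrate coordinates."""
    a = cell.angle + alpha
    return cell.x + r * math.cos(a), cell.y + r * math.sin(a)

def divide(cell, alpha, r):
    """DIVIDE: create a daughter cell at (alpha, r) in the mother's local
    frame; the daughter's frame is centered on it and oriented like the
    mother's (as in Fig. 4)."""
    x, y = local_to_substrate(cell, alpha, r)
    return Cell(x, y, cell.angle)

def grow(cell, alpha, r, weight, cells, limits=(0.0, 0.0, 1536.0, 1536.0)):
    """GROW: connect the cell to whichever cell lies closest to the target
    point; no connection is made if the target falls off the substrate."""
    tx, ty = local_to_substrate(cell, alpha, r)
    x0, y0, x1, y1 = limits
    if not (x0 <= tx <= x1 and y0 <= ty <= y1):
        return None
    target = min(cells, key=lambda c: (c.x - tx) ** 2 + (c.y - ty) ** 2)
    cell.out.append((target, weight))
    return target
```

Note that, as in Fig. 4, a cell can end up connecting to itself when it is the cell nearest to its own target point, which is how recurrent connections arise.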

Fig. 5. The evolutionary algorithm. See text for explanation.

Neurons of intermediate complexity between abstract binary neurons and detailed biomimetic models are used in the present work.¹ Contrary to neurons used in traditional PDP applications [30], [31], such neurons exhibit an internal dynamics. However, instead of simulating each activity spike of a real neuron, the leaky-integrator model used here only monitors each neuron's average firing frequency. According to this model, the mean membrane potential m_i of a neuron is governed by the equation

τ_i (dm_i/dt) = −m_i + Σ_j w_{ij} x_j + I_i,   x_j = (1 + e^{−(m_j + b_j)})^{−1}   (1)

where x_i is the neuron's short-term average firing frequency, b_i is a uniform random variable (of amplitude ε = 0.05; see Table II) whose mean is the neuron's evolvable bias, and τ_i is an evolvable time constant associated with the passive properties of the neuron's membrane. I_i is the sum of the inputs that neuron i may receive from a given subset of sensors, and w_{ij} is the evolvable synaptic weight of a connection from neuron j to neuron i. This model has already been used in several applications involving continuous-time recurrent neural-network controllers [17], [21], [32]–[35]. It has the advantage of being a universal dynamics approximator [36], i.e., of being likely to approximate the trajectory of any smooth dynamic system, and it can be taught to exhibit particular desired temporal behaviors [37].

¹ Reviews of single neuron models are to be found in [28] and [29].
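As a quick sketch, the leaky-integrator dynamics of (1) can be simulated with a forward-Euler step; the step size, the interface, and the single-neuron usage below are illustrative, not the paper's integrator:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def ctrnn_step(m, w, bias, tau, I, dt=0.02, noise=0.0):
    """One forward-Euler step of tau_i dm_i/dt = -m_i + sum_j w[i][j]*x_j + I_i,
    with firing frequency x_j = sigmoid(m_j + b_j). `noise` is the half-width
    of the uniform noise added to each bias (the paper uses 0.05)."""
    n = len(m)
    x = [sigmoid(m[j] + bias[j] + random.uniform(-noise, noise)) for j in range(n)]
    m_next = [
        m[i] + dt * (-m[i] + sum(w[i][j] * x[j] for j in range(n)) + I[i]) / tau[i]
        for i in range(n)
    ]
    return m_next, x
```

With suitable mutually inhibitory weights, small circuits of such neurons can oscillate, which is what lets the evolved central pattern generators described in Section III-A drive rhythmic leg movements.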

    B. Evolutionary Algorithm

To slow down convergence and to favor the emergence of ecological niches, the SGOCE evolutionary paradigm resorts to a steady-state genetic algorithm that involves a population of randomly generated well-formed programs distributed over a circle and whose functioning is sketched in Fig. 5.

The following procedure is repeated until a given number of individuals have been generated and tested.

1) A position P is chosen on the circle.
2) A two-tournament selection scheme is applied, in which the best of two programs randomly selected from the neighborhood of P is kept.²
3) The selected program is allowed to reproduce, and three genetic operators possibly modify it. The recombination operator is applied with a given probability. It exchanges two subtrees between the program to be modified and another program selected from the neighborhood of P. Two types of mutation are used. The first mutation operator is applied with a given probability; it changes a randomly selected subtree into another randomly generated one. The second mutation operator is applied with probability one and modifies the values of a random number of parameters, implementing what Spencer called a "constant perturbation strategy" [38]. The number of parameters to be modified is drawn from a binomial distribution.
4) The fitness of the new program is assessed by collecting statistics while the behavior of the animat controlled by the corresponding artificial neural network is simulated over a given period of time.
5) A two-tournament antiselection scheme, in which the worse of two randomly chosen programs is selected, is used to decide which individual (in the neighborhood of P) will be replaced by the modified program.

² A program's probability p_s of being selected decreases with the distance d to P: p_s = max(R − d, 0)/R², with R = 4. Programs for which d is greater than or equal to R cannot be selected (p_s = 0).

Fig. 6. The GRAM-1 grammar.
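The selection-replacement event of steps 1)–5), together with the distance-weighted neighborhood of footnote 2, can be sketched as follows; the variation operators are omitted, so the offspring is a bare clone of the tournament winner:

```python
import random

def pick_neighbor(P, pop_size, R=4):
    """Sample an index near position P on the circle. The probability of
    picking an individual at distance d is p_s = max(R - d, 0) / R**2
    (footnote 2), which sums to 1 over the offsets -(R-1)..(R-1)."""
    offsets = list(range(-(R - 1), R))
    weights = [(R - abs(d)) / R ** 2 for d in offsets]
    return (P + random.choices(offsets, weights)[0]) % pop_size

def selection_replacement(pop, fitness):
    """One selection-replacement event of the steady-state GA (Fig. 5);
    reproduction and variation (step 3) are elided."""
    P = random.randrange(len(pop))
    # steps 1-2: two-tournament selection, keep the better of two neighbors
    a, b = pick_neighbor(P, len(pop)), pick_neighbor(P, len(pop))
    winner = pop[a] if fitness(pop[a]) >= fitness(pop[b]) else pop[b]
    # step 5: two-tournament antiselection, replace the worse of two neighbors
    a, b = pick_neighbor(P, len(pop)), pick_neighbor(P, len(pop))
    loser = a if fitness(pop[a]) <= fitness(pop[b]) else b
    pop[loser] = winner
```

Because both tournaments are confined to the neighborhood of P, good genetic material spreads only gradually around the circle, which is what slows convergence and lets niches form.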

    C. Syntactic Restrictions

In order to reduce the size of the genetic search space and the complexity of the generated networks, a context-free tree-grammar is used to constrain each evolvable subprogram to have the structure of a well-formed tree. This entails:

• starting the evolution from a population of randomly generated well-formed trees;
• constraining the recombination and first mutation operators to replace a subtree by another compatible³ subtree, thus always creating a valid grammatical construct.

Fig. 6 shows an example of a grammar (GRAM-1) used in Section III-A to constrain the structure of the subprograms that participate in the developmental process of locomotion controllers.

³ Two subtrees are said to be compatible if they are derived from the same grammatical variable. For instance, if grammar GRAM-1 (Fig. 6) is used to define the constraints on a given subprogram, the subtree SIMULT3(SETBIAS, DEFTAU, SIMULT4(GROW, DRAW, GROW, NOLINK)) in that subprogram may be replaced by the subtree DIE because both subtrees are derivations of the Neuron variable in GRAM-1.
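Grammar-constrained generation can be illustrated with a toy tree-grammar in the spirit of GRAM-1; the variable names and alternatives below are reconstructed from footnote 3's example and Table I, not the actual grammar of Fig. 6:

```python
import random

# Toy context-free tree-grammar: each variable maps to alternatives, and an
# alternative is an instruction name plus the grammatical variables its
# subtrees must derive from. (Illustrative, not the real GRAM-1.)
GRAMMAR = {
    "Neuron": [("DIE", []), ("SIMULT3", ["Bias", "Tau", "Conns"])],
    "Bias":   [("SETBIAS", []), ("DEFBIAS", [])],
    "Tau":    [("SETTAU", []), ("DEFTAU", [])],
    "Conns":  [("SIMULT4", ["Link", "Link", "Link", "Link"])],
    "Link":   [("GROW", []), ("DRAW", []), ("NOLINK", [])],
}

def generate(variable, grammar=GRAMMAR):
    """Draw a random well-formed tree derived from `variable`. Recombination
    may later exchange two subtrees only if they derive from the same
    variable, so every offspring stays grammatical."""
    instruction, subvariables = random.choice(grammar[variable])
    return (instruction, [generate(v, grammar) for v in subvariables])
```

Here both DIE and the SIMULT3(...) subtree of footnote 3 derive from Neuron, so they are compatible and may be swapped by the recombination operator.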



TABLE I
SET OF DEVELOPMENTAL INSTRUCTIONS THAT ARE USED TO MODIFY CELL PARAMETERS, CELL NUMBERS, OR INTERCELL CONNECTIONS

In this example, the set of terminal symbols consists of the developmental instructions listed in Table I and of additional structural instructions that have no side-effect on the developmental process. NOLINK is a no-operation instruction. DEFBIAS and DEFTAU leave the default values of the parameters b and τ unchanged. These instructions label nodes of arity zero. SIMULT3 and SIMULT4 are branching instructions that allow the subnodes of their corresponding nodes to be executed simultaneously. The introduction of such instructions makes it possible for the recombination operator to act upon whole interneuron descriptions or upon sets of grouped connections, and thus to hopefully exchange meaningful building blocks. Those instructions are associated with nodes of arity three and four, respectively.

As a consequence of the use of syntactic constraints that predefine the overall structure of a developmental program, the timing of the corresponding developmental process is constrained. In the GRAM-1 example, first divisions occur, then cells die or parameters are set, and finally connections are grown. No more than three successive divisions can occur and the number of connections created by any cell is limited to four. Thus, the final numbers of interneurons and connections created by a subprogram well-formed according to GRAM-1 cannot be greater than eight and 32, respectively.

    D. Incremental Methodology

The SGOCE paradigm resorts to an incremental approach that takes advantage of the geometrical nature of the developmental model. In a first evolutionary stage, locomotion controllers for a simulated insect are evolved. At the end of that stage, the developmental program corresponding to the best evolved controller is selected to be the locomotion module used thereafter (Section III-A). During further evolutionary stages, other neural modules are evolved that control higher-level behaviors. These modules may influence the locomotion module by creating intermodular connections.

For instance, Fig. 7 shows how the successive connection of two additional modules with a locomotion controller has been used in the experiments reported below to first generate a gradient-following behavior (Section III-B) and then to generate additional obstacle-avoidance capacities (Section III-C).

    E. The SWAN Model

Fig. 7. The SGOCE incremental approach. During a first evolutionary stage, Module 1 is evolved. This module receives proprioceptive information through sensory cells and influences actuators through motoneurons. In a second evolutionary stage, Module 2 is evolved. This module receives specific exteroceptive information through dedicated sensory cells and can influence the behavior of the animat by making connections with the neurons of the first module. Finally, in a third evolutionary stage, Module 3 is evolved. Like Module 2, it receives specific exteroceptive information and it influences Module 1 through intermodular connections. In the present work, no connections between Module 2 and Module 3 are allowed.

The experimental results to be described herein made use of the SWAN model of a simulated walking animat [39] that is inspired by the work of Beer and Gallagher [17]. According to this model, each of the six legs of the animat is equipped with two pairs of muscles that allow control of its angular position and of the height of its foot [Fig. 8].

For three of those muscles, a corresponding motoneuron specifies the value of the resting length parameter in a simple muscle model [Fig. 8(b)]. Furthermore, each leg is equipped with a sensor that measures the leg's angular position. Thus the available motors and sensors correspond to those of Beer and Gallagher's simulated insect [17]. However, a difference with Beer and Gallagher's scheme is that the foot status (up or down) is not determined by the state of the corresponding UP-motoneurons only. More realistically, instantaneous foot positions are determined by the dynamics of the physical model of the animat.

Additionally, depending upon the activity level of the PS- and RS-motoneurons, forces acting on the animat's body can be greater on one side than on the other. This entails leg displacements from a vertical plane and the triggering of return forces proportional to the angular displacement that are responsible for the animat's rotations [Fig. 8].

Last, the SWAN model allows for the monitoring of the animat's overall equilibrium. When the animat sets itself upright after having fallen, the weight of its body opposes the force exerted by the UP-muscles, thus lengthening the return to stability.

    III. EXPERIMENTAL RESULTS

The SGOCE methodology and the SWAN model have been used to evolve the developmental programs of neural networks that are able to control 2-D locomotion, gradient following, and obstacle avoidance in a six-legged animat. This has been possible thanks to a three-stage incremental approach, according to which an efficient locomotion controller was first generated and then connected to two additional controllers. The first one permitted the simulated insect to reach a given odor source, while the second provided the added capacity of avoiding obstacles while walking toward the odorous goal.

Fig. 8. The SWAN model. (a) According to the former version of this model [13], each leg has two degrees of freedom. Thanks to the antagonistic PS- (power strike) and RS- (return strike) muscles, a leg can rotate around an axis (Ay) orthogonal to the plane of the figure. The UP- and DOWN-muscles allow the position of the foot F to be translated along the (Ax) axis. The resting lengths of the tunable PS-, RS-, and UP-muscles depend on the activity levels u of the corresponding motoneurons. The resting length of the DOWN-muscle is supposed to be nontunable. (b) A muscle is modeled as a spring of fixed stiffness k and of (possibly variable) resting length l (after [40]). (c) In the extended model of the hexapod animat that has been used herein, each leg is afforded a third degree of freedom and can accordingly deviate from the vertical plane parallel to the body axis. In this case, a return force proportional to the deviation tends to bring the leg back to the vertical plane.
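The spring muscle of Fig. 8(b), with a fixed stiffness k and a resting length set by motoneuron activity, can be sketched as follows; the linear mapping from activity u to resting length, and the constants k, l_min, l_max, are assumptions for illustration, not the actual model of [39], [40]:

```python
def muscle_force(length, u, k=1.0, l_min=0.5, l_max=1.5):
    """Restoring force of a spring muscle of fixed stiffness k (Fig. 8(b)).
    The resting length is assumed to vary linearly with the motoneuron
    activity u in [0, 1]; k, l_min, and l_max are illustrative constants."""
    rest = l_min + u * (l_max - l_min)
    return -k * (length - rest)
```

At the resting length the force vanishes; stretching the muscle past it produces a restoring pull, and antagonistic pairs of such springs (PS against RS) set the leg's equilibrium angle.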

There is no principled way to decide in advance how many generations, which grammar, and which fitness function to use in order to favor the evolution of neural networks likely to control specific behaviors. Therefore, in every experiment described herein, the corresponding settings have been iteratively improved through preliminary experiments. Although other choices could have been made and could have led to substantially different results than those that are presented here, the number of fitness evaluations in each stage has been set to 100 000. Likewise, the three grammars that have been used were almost identical, except for the addition of an instruction allowing the creation of intermodular connections in stages 2 and 3. Also, the numbers of division cycles that were imposed by these grammars were respectively set to three, two, and zero. As for the fitness functions, they were tailored to the specific behaviors sought and will be detailed further. Finally, the evolvable parameters were set to vary in the intervals given in Table II, the genetic operators were applied with fixed probabilities, and parameter values for the animat's physical model were those given in [39].

    A. 2-D-Locomotion

Results concerning the automatic production of 1-D locomotion controllers have been reported at length elsewhere [13]. In particular, it has been shown that some of the neural networks that have been obtained were capable of generating a tripod gait because they called upon four central pattern generators that were responsible for the rhythmic movements of the middle and back legs. Moreover, suitable connections were responsible for the synchronization of each tripod, according to which the front and back legs on each side of the animat were moved in synchrony with the middle leg of the opposite side. Likewise, other connections were making for phase opposition in the rhythms of the two opposite tripods.

TABLE II
RANGES AND DEFAULT VALUES OF EVOLVABLE PARAMETERS. THE AMPLITUDE OF THE NOISE ADDED TO THE BIASES b IS ε = 0.05 FOR ALL NEURONS. THE VALUES OF PARAMETERS r AND r′ REFER TO A SUBSTRATE OF SIZE 1536 × 1536 IN ARBITRARY UNITS

Such results have been obtained with a former version of the SWAN model in which only two degrees of freedom were afforded to each animat's leg. The fitness function was the distance covered during the evaluation, augmented by a term encouraging any leg motion

(2)

where (x, y) was the position of the animat's center of mass at time t, T was the evaluation time set to 1000 cycles, and



Fig. 9. (a) A developmental subprogram obtained for the locomotion task after 100 000 selection-replacement events. Instruction parameters are not shown. (b) The corresponding developed locomotion network. Filled lines are excitatory connections; dotted lines are inhibitory connections. Fan-in connections arrive at the top and fan-out connections depart from the bottom of each neuron [13].

the angular position and the height of each leg at time t entered the leg-motion term [39]. Although no explicit selection pressure was introduced for not falling, falls were implicitly penalized because the stepping-upright process slowed down locomotion.

The extension of this approach to 2-D locomotion has been straightforward because it only entailed adding a third degree of freedom to the SWAN model, as described in Section II-E. In other words, simply modifying the SWAN model made it possible for a previously generated 1-D locomotion controller to generate tripod walking in a 2-D environment. Fig. 9 shows the best evolved subprogram (subsequently called LOCO1) that has been obtained after 100 000 selection-replacement events had been made by the genetic algorithm in a population of 200 programs. It also shows the corresponding neural network, which included 38 interneurons and 100 connections (after useless interneurons and connections had been pruned) and which will serve as Module 1 in the subsequent experiments described herein. Several 2-D trajectories generated by Module 1, together with an illustration of the tripod gait obtained, are shown in Fig. 10. The turning direction of the animat depends upon initial conditions but, after a transitory period, straight locomotion resumes.

It thus appears that using Module 1 and the extended SWAN model together affords the simulated insect the possibility of walking straight ahead and of changing direction, provided that some dissymmetry is imposed on the activity levels of motoneurons controlling leg and foot movements on each side of the animat. In the following sections, such a dissymmetry will be generated through new connections brought by additional neural controllers.

    B. Gradient-Following

    In this section, a new neural module is used to controlthe already evolved locomotion module in order to solve agoal-seeking task.

To this end, we evolved a gradient-following module that received information from two sensors, each measuring the intensity of an odor signal perceived at the tip of an antenna. This intensity decreased in inverse proportion to the square of the distance from an odorous source.
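A sketch of the antenna sensor model under the stated inverse-square law; the constants I0 and eps, and the helper `antenna_difference`, are illustrative additions, not from the paper:

```python
def odor_intensity(sensor, source, I0=1.0, eps=1e-9):
    """Odor intensity at a sensor position, falling off with the square of
    the distance to the source. I0 sets the source strength and eps avoids
    division by zero at the source; both are illustrative constants."""
    dx, dy = sensor[0] - source[0], sensor[1] - source[1]
    return I0 / (dx * dx + dy * dy + eps)

def antenna_difference(left, right, source):
    """Left-minus-right intensity difference that a gradient-following
    module could exploit to steer toward the source."""
    return odor_intensity(left, source) - odor_intensity(right, source)
```

A positive difference means the source lies toward the left antenna, which is the kind of dissymmetry the evolved module can convert into asymmetric motoneuron activity and hence into a turn.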

The gradient-following module stemmed from two precursor cells that read the same developmental subprogram and executed its instructions in a symmetric way (Fig. 11). It had the possibility of influencing the behavior of the locomotion module through intermodular connections created during development. A new developmental instruction (GROW2) was used to create a connection from a cell in the second module to a cell in the locomotion module. This instruction works like the instruction GROW except that the target cell is not sought in the coordinate frame of the current module but in that of the locomotion module. The GRAM-2 grammar (Fig. 12) defines the set of developmental subprograms that were used for Module 2. The corresponding subprograms could create at most four neurons and 16 connections. Because such subprograms were executed by both precursor cells, this resulted in a maximum of eight neurons and 32 connections in Module 2.

    To evaluate the tness of each program, a set of environments with different source positions was used. Ineach environment, the animats task was to reach the source of odor, considered as a goal. The animat always started from thesame position and was allowed to walk for a given timeset to 1000 cycles, or until it reached the goal. This event was

    considered to have occurred if the point situated betweenthe tips of the animats two antennae came close enough tothe source . The corresponding tness function was

    (3)

    (4)

where the time variable denoted the time at which the evaluation in each environment stopped.
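The exact expressions (3) and (4) are not legible in this copy; a fitness with the property described above, rewarding animats that quickly approach the source within each trial, could be sketched as follows (this is a guess at the general shape, not the published formula):

```python
def gradient_fitness(trials):
    """Hypothetical reconstruction of a goal-seeking fitness: each trial
    contributes the distance covered toward the source divided by the
    time the trial lasted, so fast approaches score highest.
    A trial is (initial_distance, final_distance, stop_time)."""
    return sum((d_start - d_end) / t_stop
               for d_start, d_end, t_stop in trials)
```

Under this shape, reaching the goal early scores better than reaching it at the 1000-cycle limit, and not approaching it at all scores zero.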

    Authorized licensed use limited to: AMERICAN UNIVERSITY OF SHARJAH. Downloaded on June 20,2010 at 13:02:41 UTC from IEEE Xplore. Restrictions apply.


    KODJABACHIAN AND MEYER: EVOLUTION AND DEVELOPMENT OF NEURAL CONTROLLERS 803

Fig. 10. (a) Tripod gait produced by an animat controlled by the controller shown in Fig. 9. The horizontal axis represents time. A dot is plotted when the corresponding leg is raised. Legs are numbered as in Fig. 2. (b) 100 trajectories generated by the controller when the extended SWAN model is used. All the trajectories start from the same point (0, 0). Because the initial leg configuration used is symmetric, differences in the random number sequences used to implement neuronal noise lead to large differences in the initial turning directions.

This function rewarded animats that quickly approached the source during the evaluation.

Five different experiments have been done, each starting with a different initial population and involving 100 000 fitness evaluations (20 000 selection/replacement events). Fig. 13 shows the evolution of the best and average performances of the population through generations. Individuals able to reach the goal in all five test environments consistently evolved. Such capacities proved to be general enough to allow the animat to reach the goal in eight generalization experiments, which sometimes required lengthening the evaluation time or adjusting the parameters of the animat's physical model (Fig. 14).

Fig. 15 describes how the gradient-following capacities are implemented in the control architecture of the animat whose behavior is shown in Fig. 14. This animat's Module 2 contains six interneurons and 22 connections. Subsequently, the evolved subprogram of this module will be called GRAD1.

    C. Obstacle-Avoidance

In this section, we seek to implement a minimal reactive navigation system that allows an animat both to follow an odor gradient and to avoid obstacles.

To this end, capitalizing on the evolved subprograms LOCO1 and GRAD1 previously generated, we let evolve a Module 3 that added obstacle-avoidance capacities to those of walking and gradient-following already secured. This module stemmed from two precursor cells that read the same developmental subprogram and executed its instructions in a symmetric way (Fig. 16). It was assumed to receive information from two sensors, each indicating whether an antenna got into contact with an obstacle, and it could influence the behavior of the locomotion module through intermodular connections created during the evolutionary process. To avoid parasitic interferences, no intermodular connections from Module 3 to Module 2 were allowed. The GRAM-3 grammar that was used is described in Fig. 17. It only permitted each precursor cell to grow at most four connections to neurons of Module 1.

To evaluate the fitness of each program, a set of different environments, each containing a source of odor (the goal) and several obstacles, has been used. In each environment, the behavior of the animat was simulated until a final time, set to 1500 cycles, was reached or until the animat reached the goal. When an obstacle was hit by the body of the animat, no further moves occurred until the end of the

    804 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 9, NO. 5, SEPTEMBER 1998

Fig. 11. Setup for the evolution of a goal-seeking controller. DRAW instructions in subprograms SP0 and SP1 create a default connection between a precursor cell of Module 2 and the associated sensory cell. These connections are copied to any daughter cell the precursor cells may have. WAIT instructions are necessary to synchronize the developments of the two modules because Module 1 goes through three division cycles while Module 2 goes through only two such cycles. Subprogram SP9 is evolved according to the GRAM-2 tree-grammar (specified in Fig. 12).

trial. The corresponding fitness function was

    (5)

    (6)

where a weighting factor was set empirically to 15, the stopping time was the time at which the evaluation in each environment ended, and a stability term was set to one if the animat was stable at that time and to zero otherwise. The first term in the function rewarded an individual according to the rate of decrease of its distance to the goal during the evaluation. The second term explicitly favored individuals that did not fall and will be discussed later on.
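Equations (5) and (6) are likewise illegible in this copy; a two-term fitness with the stated structure, a rate-of-approach term plus a stability bonus weighted by the empirical factor set to 15, might be sketched as follows (again a guessed shape, not the published formula):

```python
def obstacle_fitness(trials, weight=15.0):
    """Hypothetical two-term fitness: the first term rewards the rate of
    decrease of the distance to the goal; the second, weighted by an
    empirical factor (15 in the paper), rewards trials in which the
    animat was still stable (had not fallen) when evaluation stopped.
    A trial is (initial_distance, final_distance, stop_time, stable)."""
    return sum((d0 - d1) / t + weight * (1.0 if stable else 0.0)
               for d0, d1, t, stable in trials)
```

With a weight this large relative to the approach term, staying upright dominates the score, which is consistent with the role of the second term discussed below.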

Again, five different experiments have been done, each starting with a different initial population and involving 100 000 fitness evaluations (20 000 selection/replacement events). Fig. 18 shows the evolution of best and average performances as a function of generation number. In each experiment, individuals able to reach the source and to avoid obstacles in each of

    Fig. 12. The GRAM-2 grammar.

the five test environments of the learning set were obtained. Fig. 19 shows the behavior of one such individual in these environments (T1–T5). Besides being surrounded or not by a rectangular wall, these test environments only contained circular obstacles. Generalization experiments, where individuals were tested in new environments (Fig. 19, G1–G4), were


Fig. 13. Evolution of Module 2. Results of five experiments are shown: solid lines represent the evolution of the best-of-generation individual; dotted lines represent the average fitness of a generation. In all five experiments, individuals able to reach the goal in the five test positions were found after around 50 generations.

Fig. 14. Generalization experiments for the gradient-following task. An animat, which had been selected to reach a goal in five different test positions, was tested against eight other goal positions (cases a to h). When the animat occasionally missed the goal, it might nevertheless reach it later (as in case g). In case h, however, the neural controller entered a stable limit cycle according to which the animat turned around the goal forever. Halving the value of the legs' stiffness relative to the corresponding angle made it possible to correct the behavior shown in h and led to the behavior shown in i, where the steering angle of the animat was appropriately increased.

often successful, although some difficulties getting out of dead-ends or avoiding collisions with obstacles exhibiting sharp corners have been noticed. Fig. 20 shows that environments G1–G4 are of increasing difficulty. In G2, the fitness peak around 35 corresponds to trajectories that end in the dead-end between the right-most obstacle and the right wall. In G3, the animat occasionally hits an obstacle because the geometrical arrangement of its antennae prevents detecting


Fig. 15. Gradient-following mechanism for the animat of Fig. 14. The activities of the two odor sensors O0 and O1 are compared thanks to the reciprocal inhibitory connection between neurons G0 and H0. When the goal is on the right of the animat, interneuron H0 wins the competition against G0. If such is the case, a rotation toward the goal is instigated by the inhibition of motoneuron R4, which prevents the right hind leg from rising. As soon as the difference between the signals received by both antennae becomes small enough, straight locomotion resumes. The inhibitory connection from interneuron H0 to interneuron C1 in Module 1 seems to play no meaningful functional role.

sharp corners when they are approached from an inappropriate angle (fitness peak around 30). For the same reason, the animat almost always ends up hitting an obstacle in G4. Such results clearly demonstrate that evolution has greater difficulty generating efficient controllers the more dissimilar the tasks to be solved are from those that previous generations have been confronted with. Evidently, this is why nature has invented the processes of individual learning, which provide additional adaptive means for coping with unforeseen circumstances.

Fig. 21 describes how the obstacle-avoidance capacities are implemented in the control architecture of the animat whose behavior is shown in Fig. 19. The corresponding Module 3 contains two interneurons and six connections.
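The inhibitory motifs of Figs. 15 and 21 both resolve a left/right competition and suppress a leg motoneuron on the losing side. With the leaky-integrator neurons used throughout the paper, the reciprocal-inhibition comparison of Fig. 15 can be sketched as follows (all weights, biases, and time constants are assumed for illustration, not taken from the evolved controller):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def compare_antennae(left_odor, right_odor, w_inhib=-8.0, bias=-2.0,
                     tau=1.0, dt=0.1, steps=200):
    """Two leaky integrators coupled by reciprocal inhibition, in the
    spirit of interneurons G0/H0 in Fig. 15: the neuron driven by the
    stronger odor signal wins the competition and suppresses the other,
    whose output would in turn inhibit a leg motoneuron.  Euler
    integration; all parameters are assumed."""
    y_g = y_h = 0.0
    for _ in range(steps):
        out_g, out_h = sigmoid(y_g + bias), sigmoid(y_h + bias)
        y_g += dt / tau * (-y_g + left_odor + w_inhib * out_h)
        y_h += dt / tau * (-y_h + right_odor + w_inhib * out_g)
    return sigmoid(y_g + bias), sigmoid(y_h + bias)
```

When the two inputs become nearly equal, neither neuron dominates and straight locomotion can resume, matching the behavior described for Fig. 15.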

    IV. DISCUSSION

Insofar as the use of the SGOCE paradigm only requires that the experimenter provide a means of connecting the network's input and output neurons to the problem domain, together with a suitable fitness function, this paradigm should prove useful for the automatic design of neural networks in many application areas. Thanks to the possibility of using appropriate grammars, it is likely to reduce the complexity of the networks it generates. However, it should be stressed that recourse to developmental programs to evolve neural networks probably has numerous consequences that are yet to be fully understood and assessed. In particular, it will probably be very difficult and counter-intuitive to understand the role that mutations and crossovers may have depending upon where they occur within tree-like developmental programs. It is, for instance, well known that a mutation occurring in the part of a genotype that is expressed in an early developmental phase will have more extensive consequences on the final phenotype than a mutation occurring in a late phase. Because it is already very difficult to adapt the genetic operators of traditional evolutionary algorithms so that they can select and favor useful building blocks, it is possible that the acquisition of the corresponding empirical or theoretical knowledge will be much more difficult and lengthy for applications resorting to a developmental process.

Be that as it may, results that have been obtained here prove that the SGOCE evolutionary paradigm provides a convenient means of generating neural networks capable of controlling the behavior of an animat. In particular, such results go further than any previous attempt [17]–[19], [21], [38], [41], [42] at automatically designing the control architecture of simulated insects or real six-legged robots, attempts that have been limited to the evolution of mere straight locomotion


Fig. 16. Setup for the evolution of an obstacle-avoidance controller. DRAW instructions in subprograms SP0 to SP3 create a default connection between a precursor cell of Modules 2 or 3 and the associated sensory cell. These connections are copied to any daughter cell the precursor cells may have. WAIT instructions are added to synchronize the developments of the different modules because Module 1 goes through three division cycles, while Module 2 goes through only two such cycles, and Module 3 does not lead to any division. Subprogram SP12 is evolved according to the GRAM-3 tree-grammar specified in Fig. 17.

    Fig. 17. The GRAM-3 grammar.

controllers.4 As compared to the way these attempts were conducted, the efficiency of the SGOCE paradigm is probably due to the compact encoding it affords, thus tremendously reducing the size of the search space that other approaches are committed to explore. This paper also demonstrates that the incremental approach on which the SGOCE paradigm relies makes it possible to progressively enrich an animat's behavioral repertoire by capitalizing upon already functional neural networks whose inner workings are modulated by additional control modules. Such capacities are afforded by the geometry-oriented processes of axonal growth that have been added to Gruau's basic encoding scheme [21]. SGOCE's incremental approach is similar to that used by Brooks [43] for the hand design of so-called subsumption architectures. It is also similar to the methodology advocated by several authors [41], [44]–[47]. Last, it is commonly used by nature to build control hierarchies [48] that are responsible for the adaptive behavior of animals.

4 Jakobi's contribution [19] is an exception here. However, the corresponding research effort was performed after the work described in this paper.

Recourse to grammars constraining the structure of the developmental programs is not mandatory, although it helped in the present application to reduce the complexity of the evolved controllers and, thus, to reduce simulation time. In [49], a locomotion controller that has been evolved in the absence of syntactic constraints is presented: it exhibits 192 neurons and 2222 connections. However, it should be stressed that a potential drawback of using grammars is that the controllers thus generated may be too constrained. For instance, in the absence of additional experiments, one cannot dismiss the possibility that a less stringent grammar than GRAM-3, and a greater variety of environmental challenges than the five tests that have been used in Section III-C, might have permitted the inclusion of more neurons and connections into Module 3. In turn, this might have improved obstacle-avoidance behavior and helped to deal more efficiently with dead-ends or sharp corners. Such a remark suggests interesting future research directions, which would let the grammars coevolve with the developmental programs.

Likewise, although efficient overall behaviors have been obtained here while forbidding interconnections between Modules 2 and 3, it would certainly be interesting to seek how to optimize the animat's control architectures, for instance


Fig. 18. Evolution of Module 3. Results of five experiments are shown: solid lines represent the evolution of the best-of-generation individual; dotted lines represent the average fitness of a generation. In all five experiments, individuals able to occasionally reach the goal in the five test environments appeared in the first ten generations. However, the fitness of these individuals was very sensitive to intrinsic neuronal noise. As evolution proceeded, the corresponding behaviors became more and more robust with respect to neuronal noise.

Fig. 19. Experimental results obtained when gradient-following and obstacle-avoidance behaviors are evolved. Cases (T1–T5) show the animat's trajectory within the five test environments. Cases (G1–G4) show generalization results obtained in four new environments. The animat can sometimes deal with environmental configurations (G1 and G2) or with obstacle shapes (G3) never met during evolution. However, the geometrical arrangement of its antennae makes it difficult for it to avoid hitting sharp corners (G4).


Fig. 20. Histograms of the fitness scores obtained when each of the experiments shown in Fig. 19, G1–G4, was repeated 100 times.

thanks to using coevolving modules. In particular, in the absence of additional experiments, one may wonder whether the absence of suitable interconnections was not responsible for some antagonistic effects that Modules 2 and 3 had with respect to Module 1, antagonistic effects that were responsible for the animat's high falling rate in preliminary experiments. Although such effects have been cured by adding a second term penalizing falls in the fitness function of Section III-C, future work might reveal that proper interconnections, like those that have been hand-coded by Beer [16], are more suited to this end.

In the same manner, the solutions described in this paper were certainly heavily determined by the initial setups that have been used. In particular, according to their respective positions in the substrate, some cells had a better chance of getting connected to each other than did others. This, in particular, was the case with sensors, motoneurons, and precursor cells, whose initial positions were set by the experimenter and which were, therefore, more or less likely to be incorporated into the final, developed neural network. Here again, a possible way of combating the negative consequences of an experimenter's arbitrary choice is to let the morphology of


Fig. 21. Obstacle-avoidance mechanism for the animat of Fig. 19. When the right antenna contacts an obstacle, the corresponding sensory cell C0 sends an excitatory signal to interneuron J0. If the intensity of this signal is sufficient, the interneuron modulates the locomotion behavior so as to turn left, mainly by inhibiting motoneuron U1, thus preventing the left front leg from rising. As soon as the right antenna no longer detects the obstacle, straight locomotion resumes. The effect of the excitatory connection from interneuron J0 to interneuron D4 of Module 1 is more difficult to elucidate, although it has been observed that the presence of this connection enhances the obstacle-avoidance behavior.

the animat coevolve with its control architecture, an approach already explored by others [50], [51].

Finally, coevolution could be used to let the learning set coevolve with the animat population, so as to propose the most challenging environmental situations according to current population abilities. Such a possibility was first proposed by Hillis [52] and further explored in [53]–[55].

The present work also demonstrates that, among the different paradigms that have been used to evolve the control architecture of an animat (e.g., Lisp functions [25], [56], logic trees [57], [58], classifier systems [59]), recurrent artificial neural networks exhibit several specific and attractive features. Besides being universal dynamics approximators, as already mentioned, it turns out that they are low-level, nonspecific primitives that can be combined to give rise to several mechanisms known to be at work in the control architectures of animals [60]. Thus, inside the controllers that have been evolved in this work, it is possible to identify reflexes and feedback mechanisms, together with oscillators and central pattern generators. Other mechanisms, like chronometers and rudimentary memories, have also been observed in a previous work [13]. Furthermore, modular architectures like those that have been sought and generated herein are also known to be exhibited by the nervous systems of animals. Simple connections between such modules proved to be sufficient to control behaviors of increasing complexity, but the generality of this finding remains to be assessed. In particular, it will be enlightening to see how far such an approach could lead toward the discovery of more cognitive abilities than the simple stimulus-response pathways that have been evolved so far.
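The leaky-integrator primitive underlying all of these mechanisms is the continuous-time recurrent neuron analyzed by Beer [36]. One Euler integration step of a generic network of such units can be sketched as follows (a textbook formulation, not the paper's own code; all parameter values are left to the caller):

```python
import math

def ctrnn_step(y, tau, theta, w, ext, dt):
    """One Euler step of a leaky-integrator network:
        tau_i * dy_i/dt = -y_i + sum_j w[j][i] * sigma(y_j + theta_j) + ext_i
    where sigma is the logistic function.  Reflexes, oscillators, and
    central pattern generators all arise from particular choices of the
    weight matrix w."""
    sig = [1.0 / (1.0 + math.exp(-(yj + tj))) for yj, tj in zip(y, theta)]
    n = len(y)
    return [y[i] + dt / tau[i] *
            (-y[i] + sum(w[j][i] * sig[j] for j in range(n)) + ext[i])
            for i in range(n)]
```

An unconnected neuron, for instance, simply relaxes toward its external input with time constant tau, which is the leaky-integration behavior exploited throughout the evolved controllers.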

There are several research directions to be investigated in order to improve the behavior of the insects that have been synthesized here. In particular, it turns out that the locomotion controller that has been evolved as Module 1 is perfectly capable of generating backward locomotion. Therefore, recourse to appropriate fitness functions would probably lead to the generation of animats exhibiting improved obstacle-avoidance behavior that would, in particular, be able to escape from dead-ends.

It may likewise be hoped that the extension of the results described in [13], which led to the automatic discovery of a switch device, will make it possible to implement more complex memory mechanisms in the animat's control architecture. This, together with the use of additional sensors that would afford minimal visual capacities, might help improve the animat's navigation behavior if it could detect and memorize specific landmarks in its environment.

    Last, additional developmental instructions could be devisedthat would allow the synaptic weights of some connections to


be changed during an animat's lifetime, thanks to an individual learning process. In a similar manner, other developmental instructions could be devised that would allow the developmental pathway to be dynamically changed depending upon the specific interactions that the animat experiences with its environment. Besides its operational value, every step in such directions would contribute to theoretical biology and enable us to better understand the interactions between development, learning, and evolution, i.e., the three main adaptive processes exhibited by natural systems [11].

V. CONCLUSION

It has been shown here that the current implementation of the SGOCE evolutionary paradigm makes it possible to automatically design the control architecture of a simulated insect that is capable not only of quickly walking according to an efficient tripod gait, but also of following an odor gradient while avoiding obstacles. Such results provide marked improvements over the current state of the art in the automatic design of straight-locomotion controllers for artificial insects or real six-legged robots. They rely upon specific mechanisms implementing the developmental process of a recurrent dynamic neural network and upon an incremental strategy that amounts to fixing the architecture of functional subnetworks in a still-evolving higher-level control system. There are several ways of improving the corresponding mechanisms, in particular by letting several characteristics that have been arbitrarily set here by the experimenter evolve, or by devising new developmental instructions that would add individual learning capacities to the processes of development and evolution. Such research efforts might be as useful in an engineering perspective as in a contribution to a better understanding of the mechanisms underlying adaptation and cognition in natural systems.

    REFERENCES

    [1] W. B. Dress, Darwinian optimization of synthetic neural systems, inProc.IEEE 1stInt. Conf. NeuralNetworks. San Diego, CA: SOS, 1987.

    [2] A. Guha, S. Harp, and T. Samad, Genetic synthesis of neural networks,Honeywell Corporate Syst. Development Division, Tech. Rep. CSDD-88-I4852-CC-1, 1988.

    [3] D. Whitley, Applying genetic algorithms to neural net learning, Dept.Comput. Sci., Colorado State Univ., Tech. Rep. CS-88-128, 1988.

    [4] R. K. Belew, J. McInerney, and N. N. Schraudolph, Evolving networks:Using the genetic algorithm with connectionist learning, CSE, Univ.California San Diego, Tech. Rep. CS90-174, June 1990.

    [5] G. F. Miller, P. M. Todd, and S. U. Hedge, Designing neural networksusing genetic algorithms, in Proc. 3rd Int. Conf. Genetic Algorithms,1989.

    [6] J. Schaffer, D. Whitley, and L. Eschelman, Combinations of geneticalgorithms and neural networks: A survey of the state of the art, inCombinations of Genetic Algorithms and Neural Networks, D. Whitleyand J. Schaffer, Eds. Los Alamitos, CA: IEEE Comput. Soc. Press,1992.

    [7] I. Kuscu and C. Thorton, Design of articial neural networks usinggenetic algorithms: Review and prospect, Univ. Sussex, Brighton, U.K.,Cognitive Sci. Res. Paper 319, 1994.

    [8] K. Balakrishnan and V. Honavar, Evolutionary design of neural ar-chitecturesPreliminary taxonomy and guide to literature, ArticialIntell. Group, Iowa State Univ., Ames, Tech. Rep. CS TR#95-01, 1995.

    [9] J. Branke, Evolutionary algorithms for neural network design andtraining, in Proc. 1st Nordic Wkshp. Genetic Algorithms Applicat., T.Talander, Ed., 1995.

    [10] T. Gomi and A. Grifth, Evolutionary RoboticsAn overview, inProc. IEEE 3rd Int. Conf. Evolutionary Comput., 1996.

    [11] J. Kodjabachian and J.-A. Meyer, Evolution and development of control architectures in animats, Robot. Autonomous Syst., vol. 16, pp.161182, Dec. 1995.

    [12] M. Matari c and D. Cliff, Challenges in evolving controllers for physicalrobots, Robot. Autonomous Syst., vol. 19, pp. 6783, 1996.

    [13] J. Kodjabachian and J.-A. Meyer, Evolution and development of modular control architectures for 1-d locomotion in six-legged animats,Connection Scil., to be published.

    [14] J.-A. Meyer, The animat approach to cognitive science, in Compara-tive Approaches to Cognitive Science, H. Roitblat and J.-A. Meyer, Eds.

    Cambridge, MA: MIT Press, 1995.[15] , Articial life and the animat approach to articial intelligence,in Articial Intelligence, M. Boden, Ed. New York: Academic, 1996.

    [16] R. Beer, Intelligence as Adaptive Behavior: An Experiment in Compu-tational Neuroethology. San Diego, CA: Academic, 1990.

    [17] R. Beer and J. Gallagher, Evolving dynamical neural networks foradaptive behavior, Adaptive Behavior, vol. 1, no. 1, pp. 91122, 1992.

    [18] T. Gomi and K. Ide, Emergence of gaits of a legged robot bycollaboration through evolution, in Proc. Int. Symp. Articial Life and Robot. , 1997.

    [19] N. Jakobi, Running across the reality gap: Octopod locomotion evolvedin a Minimal Simulation, in EvoRobot98: Proc. 1st European Wkshp.Evolutionary Robot. , P. Husbands and J.-A. Meyer, Eds., 1998.

    [20] F. Gruau, Synth`ese de reseaux de neurones par codage cellulaire etalgorithmes g enetiques, Th ese duniversit e, ENS Lyon, Universite LyonI, Jan. 1994.

    [21] , Automatic denition of modular neural networks, AdaptiveBehavior, vol. 3, no. 2, pp. 151184, 1994.

    [22] , Articial cellular development in optimization and compila-tion, in Evolvable Hardware95, E. Sanchez and Tomassini, Eds. NewYork: Springer-Verlag, Lecture Notes in Comput. Sci., 1996.

    [23] R. Hecht-Nielsen, Ed., Neurocomputing . Reading, MA: AddisonWesley, 1990.

    [24] J. Hertz, A. Krogh, and R. Palmer, Introduction to the Theory of NeuralComputation. Reading, MA: Addison-Wesley, 1991.

    [25] J. Koza, Genetic Programming: On the Programming of Computers byMeans of Natural Selection. Cambridge, MA: MIT Press, 1992.

    [26] , Genetic Programming II: Automatic Discovery of ReusableSubprograms. Cambridge, MA: MIT Press, 1994.

    [27] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Ma-chine Learning. Reading, MA: Addison-Wesley, 1989.

    [28] I. Segev, Simple neuron models: Oversimple, complex, and reduced,Trends in Neurosci., vol. 15, no. 11, pp. 414421, 1992.

    [29] W. Softky and C. Koch, Single-cell models, in The Handbook of BrainTheory and Neural Networks, M. Arbib, Ed. Cambridge, MA: MIT

    Press, 1995.[30] J. L. McClelland and D. E. Rumelhart, Eds., Parallel Distributed Processing, vol. 1. Cambridge, MA: MIT Press, 1986.

    [31] D. E. Rumelhart and J. L. McClelland, Eds., Parallel Distributed Processing, vol. 2. Cambridge, MA: MIT Press, 1986.

    [32] D. Cliff, I. Harvey, and P. Husbands, Explorations in evolutionaryrobotics, Adaptive Behavior, vol. 2, no. 1, pp. 73110, 1993.

    [33] D. Cliff and G. F. Miller, Coevolution of pursuit and evasion ii:Simulation methods and results, in From Animals to Animats 4. Proc.4th Int. Conf. Simulation of Adaptive Behavior, P. Maes, M. J. Mataric,J.-A. Meyer, J. B. Pollack, and S. W. Wilson, Eds. Cambridge, MA:MIT Press., 1996.

    [34] W. T. Miller III, R. S. Sutton, and P. J. Werbos, Eds., Neural Networksfor Control . Cambridge, MA: MIT Press, 1991.

    [35] K. Warwick, G. Irwin, and K. Hunt, Eds., Neural networks for controland systems. London, U.K., Peter Peregrinus, 1992.

    [36] R. D. Beer, On the dynamics of small continuous-time recurrent neuralnetworks, Adaptive Behavior, vol. 3, no. 4, pp. 469510, 1995.

    [37] B. A. Pearlmutter, Gradient calculations for dynamic recurrent neu-ral networks: A survey, IEEE Trans. Neural Networks, vol. 6, pp.12121228, Sept. 1995.

    [38] G. Spencer, Automatic generation of programs for crawling andwalking, in Advances in Genetic Programming, K. E. K., Jr., Ed.Cambridge, MA: MIT Press, 1994, pp. 335353.

    [39] J. Kodjabachian, Simulating the dynamics of a six-legged animat,AnimatLab, ENS, Paris, Tech. Rep., 1996.

    [40] E. R. Kandel, J. H. Schwartz, and T. M. Jessell, Principles of neural sci-ence, in Muscles, Effectors of the Motor Systems, 3rd ed. EnglewoodCliffs, NJ: Prentice-Hall, 1991, ch. 36.

    [41] H. de Garis, Genetic programming: GenNets, articial nervous systems,articial embryos, Ph.D. dissertation, Universit e Libre de Bruxelles,Belgium, 1991.

    [42] M. A. Lewis, A. H. Fagg, and A. Solidum, Genetic programmingapproach to the construction of a neural network for control of a walking

    Authorized licensed use limited to: AMERICAN UNIVERSITY OF SHARJAH. Downloaded on June 20,2010 at 13:02:41 UTC from IEEE Xplore. Restrictions apply.

  • 8/7/2019 Evolution and Development of Neural Controllers

    17/17

    812 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 9, NO. 5, SEPTEMBER 1998

    robot, in IEEE Int. Conf. Robot. Automat., Nice, France, 1992, pp.26182623.

    [43] R. A. Brooks, A robot that walk: Emergent behavior form a carefullyevolved network, Neural Comput., vol. 1, no. 2, pp. 253262, 1989.

    [44] A. Wieland, Evolving neural network controllers for unstable sys-tems, in Proc. Int. Joint Conf. Neural Networks , 1991.

    [45] I. Harvey, P. Husbands, and D. Cliff, Seeing the light: Articialevolution, real vision, in From Animals to Animats 3: Proc. 3rd Int.Conf. Simulation of Adaptive Behavior, D. Cliff, P. Husbands, J.-A.Meyer, and S. W. Wilson, Eds. Cambridge, MA: MIT Press, 1994,

    pp. 392401.[46] N. Saravan and D. Fogel, Evolving neural control systems, IEEE Expert , vol. 10, pp. 2327, 1995.

    [47] F. Gomez and R. Mikkulainen, Incremental evolution of complexgeneral behavior, Adaptive Behavior , vol. 5, pp. 317342, 1997.

    [48] R. Dawkins, The Blind Watchmaker. Essex, U.K.: Longman, 1986.[49] J.-A. Meyer, From natural to articial life: Biomimetic mechanisms in

    animat design, Robot. Autonomous Syst., , vol. 22, pp. 321, 1997.[50] K. Sims, Evolving 3-D morphology and behavior by competition, in

    Proc. 4th Int. Wkshp. Articial Life, R. A. Brooks and P. Maes, Eds.Cambridge, MA: MIT Press, 1994.

    [51] F. Dellaert and R. D. Beer, A developmental model for the evolutionof complete autonomous agents, in From Animals to Animats 4: Proc.4th Int. Conf. Simulation of Adaptive Behavior, P. Maes, M. J. Mataric,J.-A. Meyer, J. B. Pollack, and S. W. Wilson, Eds. Cambridge, MA:MIT Press, 1996.

    [52] W. D. Hillis, "Coevolving parasites improve simulated evolution as an optimization procedure," in Artificial Life II, C. G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen, Eds. Reading, MA: Addison-Wesley, 1992, pp. 313–324.

    [53] J. Paredis, "Coevolutionary computation," Artificial Life, vol. 2, no. 4, pp. 355–376, 1995.

    [54] H. Juillé and J. B. Pollack, "Dynamics of coevolutionary learning," in From Animals to Animats 4: Proc. 4th Int. Conf. Simulation of Adaptive Behavior, P. Maes, M. J. Mataric, J.-A. Meyer, J. B. Pollack, and S. W. Wilson, Eds. Cambridge, MA: MIT Press, 1996, pp. 526–534.

    [55] C. D. Rosin and R. K. Belew, "New methods for competitive coevolution," Evolutionary Computation, vol. 5, pp. 1–29, 1997.

    [56] C. W. Reynolds, "Evolution of corridor following behavior in a noisy world," in From Animals to Animats 3: Proc. 3rd Int. Conf. Simulation of Adaptive Behavior, D. Cliff, P. Husbands, J.-A. Meyer, and S. W. Wilson, Eds. Cambridge, MA: MIT Press, 1994, pp. 402–410.

    [57] W.-P. Lee, J. Hallam, and H. H. Lund, "A hybrid GA/GP approach for coevolving controllers and robot bodies to achieve fitness-specified tasks," in Proc. 3rd IEEE Int. Conf. Evolutionary Comput., 1996.

    [58] W.-P. Lee, J. Hallam, and H. H. Lund, "Applying genetic programming to evolve behavior primitives and arbitrators for mobile robots," in Proc. 4th IEEE Int. Conf. Evolutionary Comput., 1997, to be published.

    [59] L. B. Booker, "Classifier systems that learn internal world models," Machine Learning, vol. 3, pp. 161–192, 1988.

    [60] C. R. Gallistel, The Organization of Action: A New Synthesis. Hillsdale, NJ: Lawrence Erlbaum, 1980.

    Jérôme Kodjabachian received the bachelor's degree from the École Nationale Supérieure des Télécommunications, Paris, France. He received the French Ph.D. degree in biomathematics from the University Paris 6.

    His research interests include the automatic design of control architectures for robots, both simulated and real.

    Jean-Arcady Meyer received the bachelor's degrees in human psychology and animal psychology from the École Nationale Supérieure de Chimie de Paris, France, and the Ph.D. degree in biology from the same university.

    He is currently Research Director at the CNRS and heads the AnimatLab of the École Normale Supérieure, Paris. His main scientific interests are the interaction of learning, development, and evolution in adaptive systems, both natural and artificial.

    Dr. Meyer is the Editor-in-Chief of the journal Adaptive Behavior.