
Evolutionary Learning of a Fuzzy Behavior Based Controller for a Nonholonomic Mobile Robot in a Class of Dynamic Environments

D. P. THRISHANTHA NANAYAKKARA
Faculty of Engineering Systems and Technology, Graduate School of Science and Engineering, Saga University, 1-Honjomachi, Saga 840-8502, Japan

KEIGO WATANABE and KAZUO KIGUCHI
Department of Advanced Systems Control Engineering, Graduate School of Science and Engineering, Saga University, 1-Honjomachi, Saga 840-8502, Japan; e-mail: watanabe@me.saga-u.ac.jp

KIYOTAKA IZUMI
Department of Mechanical Engineering, Faculty of Science and Engineering, Saga University, 1-Honjomachi, Saga 840-8502, Japan

(Received: 4 July 2000; in final form: 4 January 2001)

Abstract. This paper presents an approach for evolving optimum behaviors for a nonholonomic mobile robot in a class of dynamic environments. The behaviors are evolved with a new evolutionary algorithm that reflects some powerful features of the natural evolutionary process and is therefore flexible enough to deal with changes in the environment. Furthermore, a fuzzy set based multi-objective fitness evaluation function is adopted in the evolutionary algorithm. The multi-objective evaluation function is designed so that it can incorporate the complex linguistic features that a human observer would desire in the movements of the mobile robot. To illustrate the effectiveness of the proposed method, its simulation results are compared with those obtained using a conventional evolutionary algorithm.

Key words: nonholonomic mobile robot, fuzzy behavior based control, evolutionary algorithms, dynamic environments, fuzzy set based objective functions.

1. Introduction

Brooks (1986) first proposed the concept of a layered control system based on the subsumption architecture for mobile robots. A layered control system employs independent behavior groups that work coherently to control navigation, obstacle avoidance, movement velocity, etc. The important feature of this type of controller is that it is divided into layered behavior groups, with lower level behaviors taking care of more primitive activities and those at higher levels taking care of more complicated ones, as shown in Figure 1. This approach to controller construction can be compared with conventional ones, whose functional elements often operate in a hierarchical structure, as shown in Figure 2. In a layered control system, independent behavior groups are fused (or coordinated) by way of competition and cooperation. In addition, Maes (1992) proposed the concept of a behavior network, whose nodes represent behaviors out of which an artificial creature can select one in a given situation. Koza (1992) developed a Genetic Programming (GP) approach to evolving wall following behaviors in mobile robots. Recent literature also reports the layered control system hybridized with other techniques such as reinforcement learning and inductive learning (Mataric, 1994). The fuzzy behavior based controller for a nonholonomic mobile robot studied in this paper was proposed by Watanabe and Izumi (1998). Moreover, robot navigation in dynamic environments has been studied in recent literature (Ram et al., 1991; Bourbakis et al., 1997).

Figure 1. The layered control system concept. (Layers shown between sensors and actuators: Plan changes to the world, Build maps, Explore, Wander, Avoid objects.)

Figure 2. Hierarchical functional control concept.

The motivation behind the proposed method has two facets: one is to improve the adaptability of behavior based controllers so that they can deal with dynamic environments; the other is to bring the evaluation of behaviors closer to the way a human would evaluate the behaviors of an artificial creature. The fundamental objectives are to increase the efficiency of behavior based controllers in actual engineering applications in changing environments, and to evolve behaviors that reflect practical engineering objectives, which are often laid down in linguistic terms.

The adaptability of the behaviors is improved using a new Evolutionary Algorithm (EA) with some powerful features found in the natural evolutionary process to deal with dynamic environments. More specifically, the new evolutionary algorithm is characterized by two new features: one is a new chaotic mutation mechanism, controlled by the diversity of the population, which improves the convergence speed and, by maintaining the diversity of the population, the flexibility to deal with dynamic environments; the other is the use of historical knowledge of the evolutionary process to make strategic moves in the search space.

The fitness evaluation of behaviors in the new approach uses a fuzzy set based evaluation function to include the multiple linguistic requirements often desired in the behaviors of the mobile robot. This approach tries to imitate the human way of evaluating the behaviors of other human beings, creatures, or man-made machinery such as robots, where evaluations are often made with respect to linguistic standards desired in the behavior concerned. For example, in evaluating an obstacle avoidance behavior of a mobile robot, a human linguistic requirement could be that the robot should avoid the obstacle by going neither too close to it nor too far from it. Another requirement could be that the maximum torques given to the wheels should be neither too close to the saturation level nor so low that the capacity of the motors is underutilized. Multiple linguistic requirements of this kind can be embodied in the evaluation function more easily using fuzzy sets than using the crisp mathematics often found in the literature.

Simulation studies are carried out for a nonholonomic mobile robot navigating in a class of dynamic environments using the proposed method. The effectiveness of the proposed method is demonstrated by comparing its results with those obtained from a conventional Evolutionary Programming (EP) algorithm called Meta-EP (Bäck and Schwefel, 1993).

The rest of the paper is organized as follows. Section 2 explains the model of the nonholonomic mobile robot, the basic features of the behavior-based controller, and the definition of the class of dynamic environments considered in this study. Section 3 discusses the new features of the evolutionary algorithm tailored to deal with this problem. The new design method of the multi-objective fitness evaluation function based on fuzzy sets is elaborated in Section 4. Section 5 gives the results and a discussion of behavior based navigation in a class of dynamic environments. Finally, Section 6 gives some concluding remarks.

2. Behavior Based Navigation of a Nonholonomic Mobile Robot

2.1. THE MODEL OF THE MOBILE ROBOT

The schematic diagram of the mobile robot is shown in Figure 3, where the left side shows the robot parameters and variables and the right side shows the available sensory information; their definitions are given in Tables I and II.

Figure 3. Left: schematic diagram of the mobile robot; right: a sample situation of the mobile robot in a given environment.

Table I. The parameters of the robot

Parameter   Description
$I_v$       Moment of inertia around the c.g. of robot
$l$         Distance between the left or right wheel and the c.g. of the robot
$M$         Mass of robot
$I_w$       Moment of inertia of wheel
$c$         Viscous friction factor of wheel
$k$         Driving gain factor
$r$         Radius of wheel

Table II. Sensory information

Variable        Description
$v$             Velocity of robot
$\phi$          Azimuth of robot
$D$             Distance to the target
$d$             Distance to the obstacle
(illegible)     Angle between the direction of travel and the line connecting the obstacle and the c.g. of the robot
$\psi$          Angle between the direction of travel and the line connecting the target and the c.g. of the robot
$u_r$, $u_l$    Right and left driving input torques

Figure 4. The block diagram of the fuzzy behavior-based controller for the mobile robot. (Legible block labels: directional search FNN, velocity change FNN, azimuth change FNN, free behavior group, reactive behavior group.)

Let

\[ x(t) = [\, v \;\; \phi \;\; \dot{\phi} \,]^T \]

be the state variable vector and

\[ u(t) = [\, u_r \;\; u_l \,]^T \]

be the manipulated variable vector. Then the state space model for the mobile robot can be written as

\[ \dot{x}(t) = A x(t) + B u(t) \tag{1} \]

with

\[ A = \begin{bmatrix} \dfrac{-2c}{M r^2 + 2 I_w} & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & \dfrac{-2 c l^2}{I_v r^2 + 2 I_w l^2} \end{bmatrix}, \qquad B = \begin{bmatrix} \dfrac{k r}{M r^2 + 2 I_w} & \dfrac{k r}{M r^2 + 2 I_w} \\ 0 & 0 \\ \dfrac{k r l}{I_v r^2 + 2 I_w l^2} & \dfrac{-k r l}{I_v r^2 + 2 I_w l^2} \end{bmatrix}. \]

A mobile robot constructed in our laboratory was considered, whose physical parameters are $I_v = 0.6541$ [kg m$^2$], $M = 25.5$ [kg], $l = 0.165$ [m], $r = 0.05$ [m], $I_w = 0.4419 \times 10^{-3}$ [kg m$^2$], $k = 90$, and $c = 0.0419$ [kg m$^2$/s].
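As a rough illustration of how the state space model (1) behaves with the reported parameters, the following sketch integrates it with an explicit Euler step. The function names, the step size, and the choice of Euler integration are assumptions made for this example and are not taken from the paper; the sign convention (a larger right torque increases the azimuth rate) is also assumed.

```python
# Physical parameters reported for the laboratory robot.
I_V = 0.6541     # moment of inertia of the robot about its c.g. [kg m^2]
M = 25.5         # mass of the robot [kg]
L = 0.165        # distance between a wheel and the c.g. [m]
R = 0.05         # wheel radius [m]
I_W = 0.4419e-3  # moment of inertia of a wheel [kg m^2]
K = 90.0         # driving gain factor
C = 0.0419       # viscous friction factor of a wheel [kg m^2/s]


def model_coefficients():
    """Return the non-zero entries (a1, a2, b1, b2) of A and B in Eq. (1)."""
    den_v = M * R ** 2 + 2.0 * I_W
    den_phi = I_V * R ** 2 + 2.0 * I_W * L ** 2
    a1 = -2.0 * C / den_v
    a2 = -2.0 * C * L ** 2 / den_phi
    b1 = K * R / den_v
    b2 = K * R * L / den_phi
    return a1, a2, b1, b2


def euler_step(state, u, dt):
    """One explicit Euler step of x_dot = A x + B u, with x = (v, phi, phi_dot).

    Assumption: u = (u_r, u_l), and u_r > u_l gives a positive azimuth rate.
    """
    v, phi, phi_dot = state
    u_r, u_l = u
    a1, a2, b1, b2 = model_coefficients()
    v_dot = a1 * v + b1 * (u_r + u_l)
    phi_ddot = a2 * phi_dot + b2 * (u_r - u_l)
    return (v + dt * v_dot, phi + dt * phi_dot, phi_dot + dt * phi_ddot)
```

With equal torques on both wheels the azimuth stays constant and the velocity settles at $k r u / c$, which follows from setting $\dot{v} = 0$ in the first row of the model.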


2.2. THE FUZZY BEHAVIOR-BASED CONTROLLER

The basic block diagram of the fuzzy behavior-based controller is shown in Figure 4. The figure shows the behavior groups involved in the application, namely the objective behavior group, the free behavior group, and the reactive behavior group, and how they interact with the environment in the process of mobile robot navigation.

In Figure 4, six control signals are defined: the force to change the distance to the target and the torque to change the relative angle to the target; the force to change the velocity of the robot and the torque to change the azimuth of the robot; and the force to change the approaching distance to the obstacle and the torque to change the avoidance angle against the obstacle.

Behavioral fusion was implemented using a suppression unit S, which consists of cooperation or competition logic. Let $a$ and $b$ be the results of fuzzy reasoning from a lower and an upper behavior group, respectively. Then the fusion result $c$ is given by

\[ c = (1 - s)\,a + s\,b, \]

where

\[ s = \frac{|\mathrm{sat}(a) + \mathrm{sat}(b)|}{2}. \]

The saturation function $\mathrm{sat}(\cdot)$ is given by

\[ \mathrm{sat}(x) = \begin{cases} \mathrm{sgn}(x) & \text{if } |x| > \varepsilon, \\ x / \varepsilon & \text{if } |x| \le \varepsilon, \end{cases} \tag{2} \]

where $\varepsilon$ was set to 0.01. In this case, if $a$ and $b$ have the same sign, the output of the upper behavior group becomes the fusion result. Otherwise, the output of the lower behavior group becomes the fusion result.

The input-output relation of the fuzzy neural network (FNN) is given by a simplified fuzzy reasoning such that

\[ y = \sum_{k=1}^{N_r} p_k\, w_{bk}, \tag{3} \]

\[ p_k = \frac{\mu_k}{\sum_{j=1}^{N_r} \mu_j}, \tag{4} \]

\[ \mu_j = \exp\{\ln(0.5)\,(x - w_{cj})^2\, w_{dj}^2\}, \tag{5} \]

where $x$ is the input to the FNN, $y$ is the expected output of the FNN, $w_{cj}$ denotes the center value of the Gaussian-like fuzzy membership function associated with the $j$th rule, and $w_{dj}$ denotes the reciprocal value of the deviation from the center $w_{cj}$ at which the $j$th Gaussian function of the input on the support set has value 0.5. $w_{bk}$ is the constant in the conclusion of the $k$th rule, $p_k$ is the normalized confidence associated with the $k$th rule, and $N_r$ is the total number of fuzzy rules for one FNN. Figure 5 depicts the schematic diagram of one FNN.

Figure 5. The schematic diagram of the FNN.
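The suppression unit and the simplified fuzzy reasoning above can be sketched in a few lines. This is only an illustrative reading of Equations (2) to (5), assuming a single-input FNN and a linear saturation inside the band $|x| \le \varepsilon$; the function names and argument layout are hypothetical.

```python
import math

EPS = 0.01  # saturation threshold; the text sets epsilon to 0.01


def sat(x, eps=EPS):
    """Saturation of Eq. (2): sign outside the band, linear inside (assumed)."""
    if abs(x) > eps:
        return math.copysign(1.0, x)
    return x / eps


def fuse(a, b, eps=EPS):
    """Suppression unit: c = (1 - s) a + s b, with s = |sat(a) + sat(b)| / 2.

    a comes from the lower behavior group, b from the upper one.
    """
    s = abs(sat(a, eps) + sat(b, eps)) / 2.0
    return (1.0 - s) * a + s * b


def membership(x, w_c, w_d):
    """Gaussian-like membership of Eq. (5); equals 0.5 at |x - w_c| = 1 / w_d."""
    return math.exp(math.log(0.5) * (x - w_c) ** 2 * w_d ** 2)


def fnn_output(x, w_c, w_d, w_b):
    """Eqs. (3)-(4): normalized confidences weighted by rule conclusions.

    w_c, w_d and w_b are equal-length lists, one entry per fuzzy rule.
    Note: if x is far from every center, all memberships can underflow to 0.
    """
    mu = [membership(x, c, d) for c, d in zip(w_c, w_d)]
    total = sum(mu)
    return sum((m / total) * b for m, b in zip(mu, w_b))
```

For example, `fuse(0.3, 0.8)` returns the upper group's value 0.8 because both inputs are positive, whereas `fuse(-0.3, 0.8)` returns the lower group's value -0.3.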

2.3. THE CLASS OF DYNAMIC ENVIRONMENTS CONCERNED

DEFINITION. A class of dynamic environments $\mathcal{E}$ is defined as follows. Let $T_i = \{t_i - t_{i-1}\}$, $i = 1, \ldots, N_E$, denote a random time period bounded by times $t_{i-1}$ and $t_i$. Here $N_E$ denotes the maximum number of environments that can exist within a given time period. Let $E_{T_i}$ be a given environment in time period $T_i$. Then $E_{T_i} \in \mathcal{E}$ if $E_{T_i}$ remains static in $T_i$ and $E_{t_i}^{-} \neq E_{t_i}^{+}$, where

\[ E_{t_i}^{+} = \lim_{\delta \to 0} E_{t_i + \delta} \]

and

\[ E_{t_i}^{-} = \lim_{\delta \to 0} E_{t_i - \delta}. \]

EXAMPLE. Let us take a simple example of a mobile robot that repeatedly navigates to a target to do some work and returns to its original position. Suppose the path between the original position and the target is filled with obstacles that can move. The definition of the class of dynamic environments $\mathcal{E}$ means that these obstacles remain static for a time period starting at time $t_{i-1}$ and lasting until $t_i$. Then, at time $t_i$, the positions of the obstacles may change and remain static until $t_{i+1}$. If the environment changes in this manner, it falls into the class of dynamic environments denoted by $\mathcal{E}$.
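The example can be made concrete with a small piecewise-static environment generator. This is a hypothetical sketch: the workspace size, the uniform sampling of change times and obstacle positions, and the function names are all assumptions made for illustration, not details from the paper.

```python
import bisect
import random


def make_environment_schedule(n_env, horizon, n_obstacles, rng):
    """Return (change_times, layouts); layouts[i] holds from change_times[i] on.

    Change times are drawn at random inside (0, horizon) and sorted, so each
    layout stays static until the next change time (assumed sampling scheme).
    """
    change_times = [0.0] + sorted(rng.uniform(0.0, horizon) for _ in range(n_env - 1))
    layouts = [
        [(rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)) for _ in range(n_obstacles)]
        for _ in range(n_env)
    ]
    return change_times, layouts


def environment_at(t, change_times, layouts):
    """Obstacle layout in force at time t (static between change times)."""
    index = max(0, bisect.bisect_right(change_times, t) - 1)
    return layouts[index]
```

Querying `environment_at` at any two times inside the same period returns the same layout, which is the static-within-period property required by the definition.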

2.4. DEFINITION OF AN ADAPTABILITY MEASURE

To evaluate the adaptability of the algorithm to a new environment, the following indicator is used.

The indicator of adaptability $I_a(E_i)$ in a fitness maximizing environment $E_i$ is given by

\[ I_a(E_i) = \frac{V^*}{N_{V^*}}, \tag{6} \]

where $V^*$ is given by

\[ V^* = 0.995\,\{V_{\max} - V_0\} + V_0, \tag{7} \]

in which $V_{\max}$ is the maximum fitness achieved in the new environment and $V_0$ is the initial fitness in the new environment. $N_{V^*}$ is the number of generations at which $V^*$ is calculated.

If the algorithm adapts to a new environment efficiently, the number of generations $N_{V^*}$ is small and the value of $V^*$ is expected to be as high as possible. The value of $V^*$ is affected by two factors: one is the increment of fitness in the new environment; the other is the initial fitness in the new environment. If the algorithm keeps enough diversity, the value of $V_0$ is expected to be high. With the available diversity, achieving high increments of fitness is determined by the convergence ability of the algorithm. Therefore, the value of the indicator should be comparatively higher for an efficient algorithm than for an inefficient one. In the proposed method, the average of the histories of evolution over 100 trials is used to calculate the indicators. In this case, $\bar{I}_a(E_i)$ is calculated by

\[ \bar{I}_a(E_i) = \frac{\bar{V}^*}{\bar{N}_{V^*}}, \tag{8} \]

\[ \bar{V}^* = 0.995\,\{\bar{V}_{\max} - \bar{V}_0\} + \bar{V}_0, \tag{9} \]

where $\bar{V}_{\max}$ is the average of the maximum fitness achieved in the new environment, $\bar{V}_0$ is the average of the initial fitness in the new environment, and $\bar{N}_{V^*}$ is the average number of generations at which $\bar{V}^*$ is calculated.
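A minimal sketch of the indicator for a single fitness history follows. It assumes that the number of generations in the denominator is the 1-based generation at which the best fitness first reaches $V^*$; that reading, and the function name, are assumptions made for this example.

```python
def adaptability(history, fraction=0.995):
    """Return (indicator, v_star, n_star) for one best-fitness history.

    history[0] is taken as the initial fitness V_0 in the new environment.
    Assumption: n_star is the first (1-based) generation whose fitness
    reaches V* = fraction * (V_max - V_0) + V_0.
    """
    v0 = history[0]
    v_max = max(history)
    v_star = fraction * (v_max - v0) + v0
    n_star = next(i for i, v in enumerate(history, start=1) if v >= v_star)
    return v_star / n_star, v_star, n_star
```

A history that starts high and climbs quickly yields a large indicator, whereas a slow climb to the same maximum spreads the same $V^*$ over more generations and lowers it.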

3. The New Evolutionary Algorithm Adopted

Evolving the behaviors in the class of dynamic environments using a new evolutionary algorithm is studied in this paper. The new evolutionary algorithm has some new features that closely imitate powerful characteristics of the natural evolutionary process, giving it robustness in changing environments and effective information transfer from old generations to new ones. Adopting the new evolutionary algorithm therefore improves the performance of evolutionary behavior based navigation of mobile robots in changing environments. The new evolutionary algorithm is depicted in Figure 6. The basic principles of this algorithm were discussed in (Nanayakkara et al., 1999a, 1999b). The main differences between the new method and those presented in (Nanayakkara et al., 1999a, 1999b) are as follows: first, the new method addresses a practical problem, behavior based mobile robot navigation in changing environments, whereas the above papers presented only the results of mathematical function optimizations; second, the above two papers introduced the chaotic mutation mechanism and the strategic move mechanism based on historical knowledge independently, whereas this method uses both mechanisms in a composite algorithm; third, this paper presents an analytical result to support the idea that diversity based mutation can accelerate convergence; fourth, in the new method, a fuzzy set based practical objective function is presented to allow the linguistic terms of human observers to be incorporated into behavior evaluation; fifth, the definition of the class of dynamic environments is presented in this paper. Here, a brief elaboration is given on the new features of the algorithm that enable fast convergence and add robustness to deal with changing environments.

3.1. ADAPTIVE CHAOTIC MUTATION MECHANISM

3.1.1. Basic Algorithm

Mutation is performed on each individual as given by

\[ x'_i = x_i + \sigma_i N_i(0, 1), \tag{10} \]

where $x_i$ is the object variable of a randomly selected individual, $\sigma_i$ is the standard deviation, or so-called strategy parameter, and $N_i(0, 1)$ is a Gaussian random value generator with a normal distribution of zero mean and unit variance. In the proposed method, the standard deviation $\sigma_i$ is given by

\[ \sigma_i = \beta_i\,|\xi_k|, \tag{11} \]

\[ \beta_i = c\,|B_i|, \tag{12} \]

in which $B_i$ is the range of initialization of the object variable $x_i$, $c$ is set to 0.125, and $\xi_k$ is the output of a chaotic neuron in the $k$th generation. The input of the chaotic neuron is the relative diversity of the population in the $k$th generation, given by

\[ \zeta_k = \frac{D_k}{D_0}, \tag{13} \]

where $D_k$ is the diversity of the population of the $k$th generation, which is given by Equation (14) [not recoverable from the source], in which $x_{ij}$ is the $j$th element of the $i$th individual, $x_{bj}$ is the $j$th element of the best individual, $n$ denotes the number of elements in an individual, $p_t$ denotes the number of individuals, and $B_{uj}$ and $B_{lj}$ are the upper and lower bounds of the searching domain of the $j$th object variable. Note that the step size of mutation in the proposed method is directly controlled by the diversity of the population. The characteristics of the chaotic neuron are presented in Nanayakkara et al. (1999a, 1999b). The input-output relationship of the chaotic neuron is depicted in Figure 7.

Figure 6. The new evolutionary algorithm with chaotic mutation and strategic moves. (Legible flowchart labels: Initialize Population, Relative Diversity.)

Figure 7. Behavior of the output of the chaotic neuron with the input $\zeta_k$. (Horizontal axis: relative diversity, from 0 to 1.)
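The mutation step of Equations (10) to (12) can be sketched as follows. The dynamics of the chaotic neuron are given in the cited papers rather than here, so its output is simply passed in as a number `xi`; that parameter, the use of its absolute value, and the function names are assumptions made for this illustration.

```python
import random

C_SCALE = 0.125  # scaling constant c; the text sets it to 0.125


def mutation_sigmas(init_ranges, xi, c=C_SCALE):
    """Eqs. (11)-(12): sigma_i = beta_i * |xi| with beta_i = c * |B_i|.

    init_ranges holds the initialization range B_i of each object variable;
    xi stands in for the chaotic neuron output (not modelled here).
    """
    return [c * abs(b) * abs(xi) for b in init_ranges]


def mutate(x, init_ranges, xi, rng, c=C_SCALE):
    """Eq. (10): x'_i = x_i + sigma_i * N_i(0, 1) for every object variable."""
    sigmas = mutation_sigmas(init_ranges, xi, c)
    return [value + s * rng.gauss(0.0, 1.0) for value, s in zip(x, sigmas)]
```

Because the step size scales with `xi`, a neuron output near zero leaves an individual almost unchanged, while a larger output widens the search around it.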

3.1.2. How Diversity-Based Mutation Can Help Fast Convergence

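Here, we discuss the importance of the relative diversity based mutation. Consider the Gaussian distribution with zero expectation and variance $\sigma^2$, which is controlled by the chaotic neuron in the proposed method. Such a Gaussian distribution is given by

\[ \phi_{G(0,\sigma^2)}(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{x^2}{2\sigma^2}\right). \tag{15} \]

Let the distance from the current search point to the global optimum be $x^*$. Then the probability of generating a value in the neighborhood of $x^*$ by mutating an individual is given by

\[ P_{G(0,\sigma^2)}(|x - x^*| < \varepsilon) = \int_{x^* - \varepsilon}^{x^* + \varepsilon} \phi_{G(0,\sigma^2)}(x)\,dx, \tag{16} \]

where $\varepsilon > 0$ is the neighborhood size. According to the mean value theorem,

\[ \int_{x^* - \varepsilon}^{x^* + \varepsilon} \phi_{G(0,\sigma^2)}(x)\,dx = 2\varepsilon\,\phi_{G(0,\sigma^2)}(x^* - \varepsilon + \delta), \tag{17} \]

where $\delta$ exists such that $0 < \delta < 2\varepsilon$. Then, to investigate the behavior of $P_{G(0,\sigma^2)}(|x - x^*| < \varepsilon)$ with respect to the standard deviation of mutation $\sigma$, the partial derivative of $P_{G(0,\sigma^2)}(|x - x^*| < \varepsilon)$ is taken as

\[ \begin{aligned} \frac{\partial}{\partial\sigma}\left[P_{G(0,\sigma^2)}(|x - x^*| < \varepsilon)\right] &= \frac{\partial}{\partial\sigma}\left[\int_{x^* - \varepsilon}^{x^* + \varepsilon} \phi_{G(0,\sigma^2)}(x)\,dx\right] \\ &= \frac{\partial}{\partial\sigma}\left(2\varepsilon\,\phi_{G(0,\sigma^2)}(x^* - \varepsilon + \delta)\right) \\ &= 2\varepsilon\,\frac{\partial}{\partial\sigma}\left[\frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x^* - \varepsilon + \delta)^2}{2\sigma^2}\right)\right]. \end{aligned} \tag{18} \]

It follows from the above that $(\partial/\partial\sigma)\,P_{G(0,\sigma^2)}(|x - x^*| < \varepsilon) > 0$ if $\sigma < |x^* - \varepsilon + \delta|$, and $(\partial/\partial\sigma)\,P_{G(0,\sigma^2)}(|x - x^*| < \varepsilon) < 0$ if $\sigma > |x^* - \varepsilon + \delta|$.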
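The sign change of the derivative can be checked numerically. The sketch below evaluates the probability in Equation (16) exactly through the normal cumulative distribution function and scans a grid of step sizes; the grid, the chosen values of $x^*$ and $\varepsilon$, and the function names are assumptions made for this example.

```python
import math


def hit_probability(x_star, eps, sigma):
    """P(|x - x_star| < eps) for x ~ N(0, sigma^2), via the normal CDF."""
    def cdf(z):
        return 0.5 * (1.0 + math.erf(z / (sigma * math.sqrt(2.0))))
    return cdf(x_star + eps) - cdf(x_star - eps)


def best_sigma(x_star, eps, candidates):
    """Candidate step size that maximizes the hit probability."""
    return max(candidates, key=lambda s: hit_probability(x_star, eps, s))


if __name__ == "__main__":
    grid = [0.1 * i for i in range(1, 101)]
    # For a small neighborhood, the best step size lies close to x_star.
    print(best_sigma(2.0, 0.01, grid))
```

For a small neighborhood the probability rises while $\sigma$ is below the distance to the optimum and falls once $\sigma$ exceeds it, so the maximizing step size sits close to $x^*$.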

Since $\sigma$ is the search step size for Gaussian mutation, the above analysis shows that the probability of finding the global optimum increases with the step size as long as the step size is smaller than the distance from the current search point to the neighborhood of the global optimum, and decreases once the step size exceeds that distance. It follows that the step size of mutation has a strong relationship with the spread of the population around the global optimum. When the population is well spread, larger step sizes are desired. Conversely, when the population has converged close to the global optimum, smaller step sizes are desired. Since the relative diversity of the population is defined by $\zeta_k = D_k / D_0$, it is a good measure of the spread of the population around the global optimum. Coming back to the new chaotic mutation mechanism, Figure 7 clearly shows that the output of the chaotic neuron makes the standard deviation of mutation decrease as the diversity of the population decreases when the algorithm converges. This is in complete agreement with the above analytical result. Therefore, it can be concluded that the direct control of the standard deviation of mutation by the relative diversity of the population can increase the probability of finding the global optimum in a given environment. Moreover, this kind of decreasing standard deviation of mutation agrees with one of the fundamental properties of evolutionary computation for faster convergence according to Fogel (1994, 1991), Fogel and Ghozeil (1996), and Rechenberg (1994). Chellapilla and Fogel (1997) also suggest that the Gaussian mutation operator helps faster convergence.

3.1.3. How Chaotic Mutation Can Help to Deal with Changing Environments

We now discuss the rationale for using chaotic standard deviations in order to maintain further evolvability. First, let us draw attention to the natural process of evolution. It is remarkable how life on earth has come through many changes in the environment. In a particular environment, a particular species may be powerful. With a change in the environment, another species may emerge as powerful and the former may lose its position. Yet life goes on without being annihilated. The main reason behind the success of natural biological evolution in changing environments seems to be the bio-diversity on earth. It is important to note that the ability of a population to continue surviving in a changing environment comes from two main characteristics: one is the ability to converge quickly to the global optimum in the current environment; the other is the ability to maintain enough diversity to face a potential change in the environment. The interesting phenomenon in nature is that there is a powerful mechanism that keeps searching for better living beings while maintaining bio-diversity. The authors believe that the mutation of genetic information is the key to creating new species and new breeds of a given species, leading to a creative mechanism that maintains bio-diversity. In fact, Leung (1997), by performing a Markov chain analysis of GAs, has presented theoretical work concluding that the mutation operator plays the main role in maintaining diversity.

Coming back to the new mechanism of mutation, Figure 7 shows that when the relative diversity drops below a threshold, chaotic dynamics are invoked in the neuron, which lead to increased step sizes of mutation. This improves the diversity of the population. Note that in this method the diversity is improved after the population has converged to a certain extent. Since the relative diversity itself is the key variable that controls the output of the chaotic neuron, the new mechanism of chaotic mutation has a self-adaptive capability. Moreover, the diversity improvement is based on deterministic chaos, as opposed to adding random disturbances to the standard deviation of mutation. Random disturbances may achieve the same objective to some extent, yet they lack repeatability and consistency of results. This makes it difficult to compare results obtained for a given problem using the same algorithm. From this perspective, using deterministic chaos is superior to just adding random disturbances to the standard deviation of mutation. Therefore, the proposed type of self-adaptive mutation mechanism reflects well the natural phenomenon of maintaining bio-diversity.

Figure 8. The basic idea of making strategic moves in the search space for one variable. (Horizontal axis: number of generations; legend: history of optimization, range of prediction.)

From the above explanation, the most important points of the new mechanism of mutation are as follows: first, the relative diversity of the population directly controls the standard deviation of mutation; second, the chaotic properties of the neuron that controls the standard deviation of mutation are invoked when the relative diversity drops below a threshold due to convergence.

3.2. ADAPTIVE STRATEGIC MOVES BASED ON HISTORICAL KNOWLEDGE

3.2.1. Insights from Nature

In natural evolution, there are two kinds of information transfer from one generation to the next: one is genetic information transfer; the other is the transfer of experience or knowledge by teaching. Perhaps the latter plays an important role in human societies in the growth of technology and philosophy.

The new algorithm closely follows the latter way of information transfer by making strategic moves in the search space using the historical path taken by the best individual. The Hooke-Jeeves method makes this kind of move in the search space. Yet it merely performs a deterministic move, whereas the proposed method creates many individuals randomly in a directed region of the search space. Yuret (1994) exploits local knowledge in GAs based on the fitness landscape of the search space. Although the idea of making strategic moves in the search space by using historical knowledge is not new, effectively combining it with the other features of the evolutionary optimization algorithm makes it more powerful. The method of making adaptive strategic moves is explained in (Nanayakkara et al., 1999a, 1999b). The basic idea is depicted in Figure 8 for a single variable. It is clear from this figure that after making strategic moves, the next generation can search a wide range of solutions, induced by the history of the solutions.

4. Incorporation of Multiple Linguistic Objectives Using Fuzzy Set Based Evaluation Function

Before discussing the issues of evaluating a behavior, let us first clarify what a behavior is. A common definition in the literature is that a behavior is a control law for achieving a particular goal (Mataric, 1994). For example, in the mobile robot domain, avoiding obstacles is a control law that takes inputs from the mobile robot's sensors and uses them to generate appropriate actions, which enable the robot to avoid obstacles in the environment.

In a practical behavior based mobile robot controller, there can be many behaviors with various control objectives. The problem then arises as to how one would evaluate these behaviors. For example, coming back to the obstacle avoidance behavior, the important question is whether it is enough just to avoid an obstacle. A human observer would apparently like the robot to avoid the obstacle by taking a path that is neither too close to the obstacle nor too far from it. Hence there exist linguistic requirements that yield a behavior of higher quality than one that merely attains an abstract control goal. Therefore, the definition of a behavior adopted in this paper is stated as follows: A behavior is a control law for achieving a particular goal in such a way that the resulting motion abides by a set of qualitative standards that a human observer would desire to see in the robot motion. These qualitative standards are often defined in linguistic terms.

Therefore, in this case, the linguistic objectives include:
- Avoid obstacles so that the robot moves neither too close to them nor too far from them.
- Reach the target as closely as possible.
- Try to keep the maximum torque command given to the wheels around the rated optimum torque level. Never go close to the torque saturation level.
- Do not move too fast nor too slow. Try to keep the velocity within an optimum range.


Table III. The characteristics considered for the fitness function in the robot navigation problem

Variable   Characteristic                                  Desired (fuzzy)
$X_1$      Velocity, $v$ [m/s]                             $v_{\max}$ is about 0.7
$X_2$      Right wheel torque, $\tau_r$ [Nm]               $\tau_{r,\max}$ is about 0.02
$X_3$      Left wheel torque, $\tau_l$ [Nm]                $\tau_{l,\max}$ is about 0.02
$X_4$      Distance to the target, $D$ [m]                 $D_{\min}$ is about 0.0
$X_5$      Distance to the center of obstacle, $d$ [m]     $d_{\min}$ is about 1.1
$X_6$      Total distance travelled, $d_{\mathrm{sum}}$ [m]

Table IV. The parameter values of the Gaussian fuzzy sets for multi-objective evaluation

Function                                           $w_{ci}$   $w_{di}$
$f_1(v)$: velocity evaluation                      0.7        5.0
$f_2(\tau_r)$: right wheel torque evaluation       0.02       15.0
$f_3(\tau_l)$: left wheel torque evaluation        0.02       15.0
$f_4(D)$: target approach evaluation               0.0        10.0
$f_5(d)$: obstacle avoidance evaluation            1.1        2.0

Figure 9. The fuzzy sets of the functions $f_1$ to $f_5$. (Panels: velocity, $v$ [m/s]; right/left torque, $\tau_r$/$\tau_l$ [Nm]; distance to target, $D$ [m]; distance to obstacle, $d$ [m].)


Do not take wild paths; try to keep the total distance travelled within an optimum range.

Therefore the fitness function has six subobjectives, combined as

$$f = \prod_{i=1}^{6} f_i(X_i), \qquad (19)$$

where $X_i$ is the $i$th characteristic variable of a given set of behavior groups, and the evaluation functions $f_1$ to $f_5$ are Gaussian fuzzy sets of the following form:

$$f_i(X_i) = \exp\{\ln(0.5)\,(w_{ci} - X_i)^2\, w_{di}^2\}. \qquad (20)$$

The function $f_6$ is given by

$$f_6(X_6) = \begin{cases} 1 & \text{if } 5.5 < d_{\mathrm{sum}} < 8.5, \\ 0 & \text{otherwise,} \end{cases} \qquad (21)$$

where $d_{\mathrm{sum}}$ denotes the total distance traveled by the robot. The characteristic variables $X_1, \ldots, X_6$ are described in Table III. The parameters $w_{ci}$ and $w_{di}$ of the functions $f_1$ to $f_5$ are fixed as given in Table IV. Figure 9 shows the shape of each of those functions. Note that $f_2$ and $f_3$ are identical functions of their respective arguments $\tau_{r,\max}$ and $\tau_{l,\max}$. In the algorithm, the population contains 60 individuals, each encoding one set of behavior groups as shown in Figure 4.
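The evaluation described above can be sketched in Python as follows. The centre and width values are those listed in Table IV, and the distance window is the one quoted for the sixth function. Two points are assumptions of this sketch rather than statements of the source: the Gaussian set is read as $\exp\{\ln(0.5)(w_c - X)^2 w_d^2\}$, so that it falls to 0.5 at a distance $1/w_d$ from the centre, and the six sub-objective values are combined by a product. All function and variable names are hypothetical.

```python
import math

# (centre w_c, width parameter w_d) for f1..f5, as listed in Table IV.
GAUSSIAN_PARAMS = [
    (0.7, 5.0),    # f1: velocity
    (0.02, 15.0),  # f2: right wheel torque
    (0.02, 15.0),  # f3: left wheel torque
    (0.0, 10.0),   # f4: target approach
    (1.1, 2.0),    # f5: obstacle avoidance
]

def gaussian_set(x, w_c, w_d):
    """Gaussian fuzzy set: 1 at x = w_c and 0.5 at x = w_c +/- 1/w_d
    (assumed reading of the Gaussian form)."""
    return math.exp(math.log(0.5) * (w_c - x) ** 2 * w_d ** 2)

def distance_gate(d_total, low=5.5, high=8.5):
    """Crisp sixth sub-objective: 1 inside the accepted range, else 0."""
    return 1.0 if low < d_total < high else 0.0

def fitness(x):
    """x = (X1, ..., X6). Combining the six values by a product is an
    assumption of this sketch."""
    values = [gaussian_set(xi, w_c, w_d)
              for xi, (w_c, w_d) in zip(x, GAUSSIAN_PARAMS)]
    values.append(distance_gate(x[5]))
    return math.prod(values)

if __name__ == "__main__":
    print(fitness((0.7, 0.02, 0.02, 0.0, 1.1, 6.0)))
    print(fitness((0.5, 0.05, 0.05, 0.1, 0.8, 6.0)))
```

Under the product assumption the crisp distance term acts as a gate: a path whose total length falls outside the window receives zero fitness regardless of the other five terms.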

5. Results and Discussion

5.1. ENCODING FNN BASED BEHAVIORS IN THE INDIVIDUALS OF THE EVOLUTIONARY ALGORITHM

In the case of FNN based evolution of behaviors for a nonholonomic mobile robot, three behavior groups constitute the controller as shown in Figure 4. Each behavior group in the set of behavior groups has two FNNs in this application. Each FNN takes one variable as input and produces one consequent output for controlling the robot as shown in Figure 5. For example, in the objective behavior group, one FNN is for target searching. This FNN takes $D$ as input and, using $N_r$ fuzzy rules, produces an output in order to reduce the distance. Optimizing a set of behavior groups involves finding a set of training parameters such that the robot successfully achieves the set of multiple objectives. Therefore, all parameters of the FNNs in the behavior based controller are encoded as a real-valued vector of 54 parameters for $N_r = 3$ as shown in Figure 10, that is, $\{w_{cij}, w_{dij}, w_{bij}\}$ for $i = 1, \ldots, 6$ and $j = 1, \ldots, 3$, where the new subscript $i$ is introduced here to denote the consecutive number of the FNN. For the coding of parameters, the ranges specified in Tables V and VI are used. For all behavioral elements, the reciprocal value of the deviation is set between 1 and 10.
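The coding described above (six FNNs, three rules each, three parameters per rule, 54 genes in total) can be sketched as a flat real-valued vector with a decoding step. This is a minimal illustration, not the authors' implementation. The per-FNN ranges are passed in as arguments, the values in the usage lines are placeholders, and all names are hypothetical. Only the range of 1 to 10 for the reciprocal deviation is taken from the text.

```python
import random

N_FNN, N_RULES, N_PARAMS = 6, 3, 3  # 6 FNNs x 3 rules x (w_c, w_d, w_b) = 54

def random_individual(centre_ranges, consequent_ranges, rng):
    """Build one flat 54-gene vector. Ranges are (min, max) pairs per FNN."""
    genes = []
    for i in range(N_FNN):
        for _ in range(N_RULES):
            genes.append(rng.uniform(*centre_ranges[i]))      # centre w_c
            genes.append(rng.uniform(1.0, 10.0))              # reciprocal deviation w_d
            genes.append(rng.uniform(*consequent_ranges[i]))  # consequent w_b
    return genes

def decode(genes):
    """Regroup the flat vector so that fnn[i][j] = (w_c, w_d, w_b)."""
    if len(genes) != N_FNN * N_RULES * N_PARAMS:
        raise ValueError("expected 54 genes")
    size = N_RULES * N_PARAMS
    return [[tuple(genes[i * size + j * N_PARAMS: i * size + (j + 1) * N_PARAMS])
             for j in range(N_RULES)]
            for i in range(N_FNN)]

if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder ranges, for illustration only.
    centres = [(-1.0, 1.0)] * N_FNN
    consequents = [(-0.5, 0.5)] * N_FNN
    individual = random_individual(centres, consequents, rng)
    print(len(individual), decode(individual)[0][0])
```

Keeping the individual flat lets generic mutation and recombination operators treat it as an ordinary real vector, while `decode` recovers the per-FNN, per-rule structure when the controller is evaluated.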


Figure 10. The structure of the coding of fuzzy behavior groups. (Each FNN $i$ is coded as the string $w_{ci1}\, w_{di1}\, w_{bi1}, \ldots, w_{ci3}\, w_{di3}\, w_{bi3}$, and the FNN strings are arranged by behavior group.)

Table V. The maximum and minimum values of the central parameters

Input                     Minimum   Maximum   Unit
$v$                       -1        1         [m/s]
$D$                       0         10        [m]
$d$                       0         5         [m]
Three angular inputs      $-\pi$    $\pi$     [rad]

Table VI. The maximum and minimum values of the consequent parameters

Consequent                Minimum   Maximum   Unit
$v$, $D$ and $d$          -0.5      0.5       [N]
Three angular inputs      -0.1      0.1       [Nm]

5.2. CONDITIONS OF THE DYNAMIC ENVIRONMENT AND RESULTS

The mobile robot navigation was investigated in a dynamic environment described by Table VII, which consists of five environments, of which the first and fifth are the same. In this application, it is assumed that the environment belongs to the class of dynamic environments defined by 6, with the first period being fixed at 100 generations. In the first period, the mobile robot was navigated with the obstacle situated at (2, 2) and the target at (4, 4). The evolved behavior can be seen on the left hand side of Figure 11. After 100 generations, the obstacle was moved from (2, 2) to (1.5, 1.5). The target remained the same. Note that this


Table VII. The pattern of change taking place for the mobile robot navigation

Environment   Start (x, y)   Obstacle (x, y)   Target (x, y)
1             (0, 0)         (2.0, 2.0)        (4.0, 4.0)
2             (0, 0)         (1.5, 1.5)        (4.0, 4.0)
3             (0, 0)         (1.5, 1.5)        (4.0, 3.0)
4             (0, 0)         (2.0, 1.5)        (4.0, 3.0)
5             (0, 0)         (2.0, 2.0)        (4.0, 4.0)

Figure 11. Left: the path traveled by the mobile robot in the first environment; right: that in the second environment. (Horizontal axes: X direction [m].)

Figure 12. Left: the path traveled by the mobile robot in the third environment; right: that in the fourth environment. (Horizontal axes: X direction [m].)


Figure 13. The path traveled by the mobile robot back in the first environment. (Horizontal axis: X direction [m].)

Figure 14. The behaviors of the maximum fitness for mobile robot navigation in a dynamic environment. (Horizontal axis: number of generations.)

environment remained static for 9 more generations. The old behaviors were re-evolved to give the result shown on the right hand side of Figure 11. Then the target was moved from (4, 4) to (4, 3), while the obstacle remained at (1.5, 1.5). This change took place after 109 generations and remained static for 139 generations. The evolved path can be seen on the left hand side of Figure 12. After 148 generations, the obstacle was moved from (1.5, 1.5) to (2, 1.5). The target remained the same at (4, 3). This environment remained static for another 55 generations. The behaviors evolved from the old behaviors gave the result shown on the right hand side of Figure 12. Finally, the target and the obstacle were restored to their original positions. The final result is given in Figure 13. Note that at each change of the environment, the old behaviors were re-evolved to adapt to the new environment without resetting the algorithm, until the evolution was stopped after 500 generations. Figure 14 shows how the maximum fitness of the two methods behaves in the dynamic environment. In Figure 14, the "changes" drawn in dashed lines denote the times at which the obstacle or the target changes position. It is evident that the fitness of the best individual of the proposed method reaches higher values than that of the conventional method in less time. Another important point to be noticed from Figure 14 is that the fitness of the best individual of the proposed method does not drop as much as that of the conventional method when the environment changes. This is very important for optimizing the usage of computational resources. The comparison of the adaptation rates using the indicator of adaptability is depicted in Figure 15. The indicator of adaptability, as described earlier, measures the rate of convergence in order to evaluate the adaptability of the evolutionary algorithm in a given environment. According to Figure 15, the proposed method converges faster than the conventional one when given a new environment. That is, the proposed method is better able to converge quickly in new environments and to recover when the environment returns to its initial condition, which demonstrates its evolvability in dynamic environments. The difference is mainly due to the diversity enhancement provided by the chaotic mutation mechanism in the proposed method. The performance of the conventional method (Bäck and Schwefel, 1993) is quite comparable with that of the proposed method in the first environment, but it deteriorates as the environment changes.

Figure 15. The performance comparison between the proposed method and the conventional method using the indicator of adaptability. (Horizontal axis: number of changes.)
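The experimental protocol of this subsection, in which the same population keeps evolving while the environment switches between the rows of Table VII, can be sketched as follows. This is an illustrative skeleton only. The function names are hypothetical, the evolutionary step is supplied by the caller, and the change points used in the usage lines (100, 109, 148 and 203) are one reading of the generation counts quoted in the text, so they should be treated as an assumption.

```python
# Rows of Table VII: (start, obstacle, target) for environments 1 to 5.
ENVIRONMENTS = [
    ((0.0, 0.0), (2.0, 2.0), (4.0, 4.0)),
    ((0.0, 0.0), (1.5, 1.5), (4.0, 4.0)),
    ((0.0, 0.0), (1.5, 1.5), (4.0, 3.0)),
    ((0.0, 0.0), (2.0, 1.5), (4.0, 3.0)),
    ((0.0, 0.0), (2.0, 2.0), (4.0, 4.0)),
]

def environment_at(generation, change_points):
    """Return the environment active at `generation`; `change_points` are
    the generations at which the next environment takes over."""
    index = sum(1 for g in change_points if generation >= g)
    return ENVIRONMENTS[index]

def run(evolve_one_generation, population, change_points, n_generations=500):
    """Evolve without resetting: the population is carried across changes."""
    for generation in range(n_generations):
        env = environment_at(generation, change_points)
        population = evolve_one_generation(population, env)
    return population

if __name__ == "__main__":
    changes = (100, 109, 148, 203)  # assumed reading of the text
    # Stand-in "evolution" that only logs the obstacle position each generation.
    log = run(lambda pop, env: pop + [env[1]], [], changes)
    print(log[99], log[100], log[-1])
```

The point of the skeleton is that `population` is never re-initialized inside `run`, which mirrors the statement that the old behaviors were re-evolved without resetting the algorithm.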

6. Conclusions

The main issue addressed in this paper was the ability of an EA to adaptively modify the behaviors of a mobile robot in changing environments. Related to this objective, the most important features desired were the robustness and adaptation rate of the EA in changing environments. In the proposed method, a new EA with a chaotic mutation mechanism and a strategic move mechanism was applied. It was analytically shown that the diversity based mutation mechanism could enhance the speed of convergence. Furthermore, the chaotic mutation mechanism helped to control the diversity of the population, which ensures robustness of the process of evolution. The performance of the proposed method was quantitatively compared with that of a conventional method using an indicator of adaptability that measures the rate of convergence of the EA in a new environment. The value of the proposed indicator increases both with the rate of convergence and with the difference between the final maximum fitness achieved by the best individual and its initial fitness in the new environment. Figure 15 graphically shows the variation of the indicator of adaptability of both methods in each environment. It gives quantitative evidence that the proposed method has better adaptability in changing environments. The graphical view of the evolutionary history shown in Figure 14 also supports this conclusion. The practical benefit of this kind of evolutionary adaptation is that the behaviors can be globally optimized in changing working environments of the kind often encountered by industrial robots. Local search algorithms may find it very difficult to globally optimize the behaviors. Furthermore, they tend to take longer than EAs to converge to a global optimum. Therefore, the proposed approach can improve the working efficiency and robustness of the behaviors in uncertain working conditions.

Yet EAs have the inherent drawback of using extensive computational resources. The fast convergence property of the proposed algorithm helps to alleviate this problem. The convergence speed can be further increased by initializing the weights of the FNNs of the behavior based controller using an expert observer's experience. Furthermore, sustaining enough diversity in the population helps to keep many alternative solutions available in the new environment and helps to use the existing knowledge in further evolution. In practical situations, another problem is the effective evaluation of behaviors. In the proposed method, a fuzzy set based evaluation is performed so that human linguistic terms of evaluation can be easily incorporated. Therefore the evaluation of behaviors can easily be made very close to the evaluations of a human observer, which often leads to the evolution of meaningful behaviors.

One other limitation of the proposed method is that it needs some knowledge about the kinematics of the robot.

In the simulations, we used circular obstacles for simplicity. In practice, the obstacles can be irregular. Yet this does not pose a problem, because in the optimization process these irregular obstacles can be approximated by polygonal obstacles, with the center of gravity used in place of the center of a circular obstacle. The evaluation criterion for obstacle avoidance then has to be slightly modified to consider the dimensions of the long and short axes of the polygonal obstacle. The ability to modify the behavior evaluation of the proposed EA and to incorporate such new features in it plays an important role in this kind of practical problem. Therefore, we would like to introduce the proposed method as a practically efficient and flexible method to evolve globally optimum behaviors of mobile robots in changing environments.
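The suggestion above, replacing the center of a circular obstacle by the center of gravity of a polygonal approximation, can be illustrated with the standard shoelace formulas. The following Python sketch is an illustration under stated assumptions, not part of the reported simulations. The function names are hypothetical, and the axis-aligned extents are only a crude stand-in for the long and short axes mentioned in the text.

```python
def polygon_centroid(vertices):
    """Center of gravity of a simple polygon given as (x, y) vertices listed
    in order around the boundary (shoelace formula)."""
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross            # accumulates twice the signed area
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)

def axis_extents(vertices):
    """Hypothetical proxy for the long and short dimensions: the larger and
    smaller side of the axis-aligned bounding box."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    return max(width, height), min(width, height)

if __name__ == "__main__":
    obstacle = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
    print(polygon_centroid(obstacle), axis_extents(obstacle))
```

The centroid returned here would play the role of the obstacle center in the distance $d$, while the two extents indicate how the desired clearance might be scaled for an elongated obstacle.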

Furthermore, the proposed method is difficult to apply to continuously moving obstacles. Of course, the fuzzy behavior based controller will produce some control action for continuously moving obstacles, yet it may not be globally optimum. Evolution in continuously changing environments is left as future research and is outside the scope of this paper.

References

Brooks, R. A.: 1986, A robust layered control system for a mobile robot, IEEE J. Robotics Automat. 2(1), 14-23.

Bourbakis, N. G.: 1997, A traffic priority language for collision-free navigation of autonomous mobile robots in dynamic environments, IEEE Trans. Systems Man Cybernet. Part B 27(4), 573-587.

Bäck, T. and Schwefel, H. P.: 1993, An overview of evolutionary algorithms for parameter optimization, Evolutionary Computation 1(1), 1-23.

Chellapilla, K. and Fogel, D.: 1997, Two new mutation operators for enhanced search and optimization in evolutionary programming, in: Applications of Soft Computing, Proc. of SPIE, pp. 260-269.

Fogel, D. B.: 1994, An introduction to simulated evolutionary optimization, IEEE Trans. Neural Networks 5(1), 3-14.

Fogel, D. B.: 1997, The advantages of evolutionary computation, in: D. Lundh, B. Olsson, and A. Narayanan (eds), BioComputing and Emergent Computation, World Scientific, Singapore, pp. 1-11.

Fogel, D. B. and Ghozeil, A.: 1996, Using fitness distributions to design more efficient evolutionary computations, in: Proc. of IEEE Conference on Evolutionary Computation, pp. 11-19.

Fogel, D. B.: 1994, Evolutionary programming in perspective: The top-down view, in: J. M. Zurada, R. J. Marks II, and C. J. Robinson (eds), Computational Intelligence: Imitating Life, IEEE Press, Piscataway, pp. 135-146.

Koza, J. R.: 1992, Evolution of subsumption using genetic programming, in: Proc. of the First European Conference on Artificial Life, pp. 110-119.

Leung, Y.: 1997, Degree of population diversity - a perspective on premature convergence in genetic algorithms and its Markov chain analysis, IEEE Trans. Neural Networks 8(5), 1165-1175.

Mataric, M. J.: 1994, Interaction and intelligent behavior, Ph.D. thesis, Massachusetts Institute of Technology, USA.

Maes, P.: 1992, Learning behavior networks from experience, in: Proc. of the 1st European Conference on Artificial Life, pp. 48-57.

Nanayakkara, D. P. T., Watanabe, K., and Izumi, K.: 1999a, Evolving in dynamic environments through adaptive chaotic mutation, in: Proc. of the 4th Internat. Symposium on Artificial Life and Robotics, Vol. 2, Oita, Japan, pp. 520-523.

Nanayakkara, D. P. T., Watanabe, K., Kiguchi, K., and Izumi, K.: 1999b, Evolutionary optimization with strategic moves based on historical knowledge, in: Proc. of the 1st SOFT Kyushu Branch Annual Conference, Saga, Japan, pp. 21-28.

Rechenberg, I.: 1994, Evolutionary strategy, in: J. M. Zurada, R. J. Marks II, and C. J. Robinson (eds), Computational Intelligence: Imitating Life, IEEE Press, Piscataway, pp. 147-159.

Ram, A., Arkin, R. C., Moorman, K., and Clark, R. J.: 1997, Case-based reactive navigation: A method for on-line selection and adaptation of reactive robotic control parameters, IEEE Trans. Systems Man Cybernet. Part B 27(3), 376-393.

Watanabe, K. and Izumi, K.: 1998, A fuzzy-neural realization of behavior-based control systems for a mobile robot, in: L. C. Jain and T. Fukuda (eds), Soft Computing for Intelligent Robotic Systems, Physica-Verlag, Heidelberg, pp. 1-26.

Yuret, D.: 1994, From genetic algorithms to efficient optimization, A. I. Technical Report No. 1569, Artificial Intelligence Laboratory, MIT, USA.