13
Evolving Aggregation Behaviors in Multi-Robot Systems with Binary Sensors Melvin Gauci, Jianing Chen, Tony J. Dodd and Roderich Groß Abstract This paper investigates a non-traditional sensing trade-off in swarm robotics: one in which each robot has a relatively long sensing range, but pro- cesses a minimal amount of information. Aggregation is used as a case study, where randomly-placed robots are required to meet at a common location without using environmental cues. The binary sensor used only lets a robot know whether or not there is another robot in its direct line of sight. Simulation results with both a mem- oryless controller (reactive) and a controller with memory (recurrent) prove that this sensor is enough to achieve error-free aggregation, as long as a sufficient sensing range is provided. The recurrent controller gave better results in simulation, and a post-evaluation with it shows that it is able to aggregate at least 1000 robots into a single cluster consistently. Simulation results also show that, with the recurrent controller, false negative noise on the sensor can speed up the aggregation process. The system has been implemented on 20 physical e-puck robots, and systematic experiments have been performed with both controllers: on average, 86-89% of the robots aggregated into a single cluster within 10 minutes. 1 Introduction Many studies in swarm robotics have investigated systems where each robot only makes use of localized information. As one may expect, such a restriction often comes at the cost that each robot is required to extract and process a considerable amount of information about its immediate surroundings. For example, each robot may be required to estimate the relative positions of all other robots within some radius. This study aims to investigate an alternative sensing trade-off, which is be- lieved to be potentially useful in a number of ways. In this trade-off, a longer sensing range is allowed than is normally assumed in swarm robotics; however, each robot The authors are with the Department of Automatic Control and Systems Engineering, The Univer- sity of Sheffield, UK. e-mail: {m.gauci, j.n.chen, t.j.dodd, r.gross}@sheffield.ac.uk 1

Multi-Robot Systems

Embed Size (px)

DESCRIPTION

multirobot

Citation preview

Page 1: Multi-Robot Systems

Evolving Aggregation Behaviors in Multi-RobotSystems with Binary Sensors

Melvin Gauci, Jianing Chen, Tony J. Dodd and Roderich Groß

Abstract This paper investigates a non-traditional sensing trade-off in swarmrobotics: one in which each robot has a relatively long sensing range, but pro-cesses a minimal amount of information. Aggregation is used as a case study, whererandomly-placed robots are required to meet at a common location without usingenvironmental cues. The binary sensor used only lets a robot know whether or notthere is another robot in its direct line of sight. Simulation results with both a mem-oryless controller (reactive) and a controller with memory (recurrent) prove that thissensor is enough to achieve error-free aggregation, as long as a sufficient sensingrange is provided. The recurrent controller gave better results in simulation, and apost-evaluation with it shows that it is able to aggregate at least 1000 robots intoa single cluster consistently. Simulation results also show that, with the recurrentcontroller, false negative noise on the sensor can speed up the aggregation process.The system has been implemented on 20 physical e-puck robots, and systematicexperiments have been performed with both controllers: on average, 86-89% of therobots aggregated into a single cluster within 10 minutes.

1 Introduction

Many studies in swarm robotics have investigated systems where each robot onlymakes use of localized information. As one may expect, such a restriction oftencomes at the cost that each robot is required to extract and process a considerableamount of information about its immediate surroundings. For example, each robotmay be required to estimate the relative positions of all other robots within someradius. This study aims to investigate an alternative sensing trade-off, which is be-lieved to be potentially useful in a number of ways. In this trade-off, a longer sensingrange is allowed than is normally assumed in swarm robotics; however, each robot

The authors are with the Department of Automatic Control and Systems Engineering, The Univer-sity of Sheffield, UK. e-mail: {m.gauci, j.n.chen, t.j.dodd, r.gross}@sheffield.ac.uk

1

Page 2: Multi-Robot Systems

2 Melvin Gauci, Jianing Chen, Tony J. Dodd and Roderich Groß

only extracts and processes a minimal amount of information per control cycle. Oneadvantage of such a sensing scheme is that, as long as a suitable technology can beused to provide the necessary sensing range, the system is truly scalable, becausethe amount of information that each robot needs to extract and process does notincrease with the number of robots in the swarm. Furthermore, simpler sensing re-quirements are more likely to be implementable on smaller scale robots, paving theway for nano-scale systems. Finally, using a simple sensing method increases thechance that a controller that performs well in simulation also performs well on thephysical system.

The task of aggregation is used as a case study here. Trianni (2008) argues that“aggregation is of particular interest since it stands as a prerequisite for other formsof cooperation.” Self-organized aggregation is a widely-observed phenomenon innature. It occurs in a range of organisms, including bacteria, arthropods, fish andmammals (Camazine et al, 2001; Parrish and Edelstein-Keshet, 1999). In somecases, self-organized aggregation is aided by environmental heterogeneities, suchas areas providing shelter or thermal energy (see Halloy et al, 2007; Kernbach et al,2009, and references therein) However, aggregation can also occur in homogeneousenvironments (Deneubourg et al, 1990).

Jeanson et al (2005) investigated aggregation in cockroach larvae, and developeda model of their behavior. The cockroaches were reported to join and leave clus-ters with probabilities correlated to the sizes of the clusters. Garnier et al (2008)implemented this model as a probabilistic finite-state automaton on 20 Alice robotsto achieve aggregation in homogeneous environments. Correll and Martinoli (2011)analyzed a similar model and showed that “robots need a minimum combinationof communication range and locomotion speed in order to aggregate into a singlecluster when using probabilistic aggregation rules”. These probabilistic models ofaggregation require the agents to obtain estimates of the cluster size or the robotdensity. For example, Garnier et al (2008) used local infra-red communication toestimate the number of nearby robots.

Ando et al (1999) introduced a deterministic algorithm for achieving aggregationin a group of mobile agents with limited perception in homogeneous environments.Cortes et al (2006) adapted this algorithm and showed that it can be used to achieveaggregation in arbitrarily high dimensions. These algorithms require that the robotsinitially form a connected visibility graph, and are based on maintaining this graphin every time step. The robots are essentially required to measure the relative posi-tions (distances and angles) to all their neighbors. Gordon et al (2004) relaxed thisrequirement, such that the robots need only to measure the angles to their neighbors,and not the distances. Although the algorithm was theoretically proven to work,simulation results revealed that “the merging process [was] generally agonizinglyslow” (Gordon et al, 2004). Gennaro and Jadbabie (2006) developed a connectivity-maintaining algorithm based on every robot computing the Laplacian matrix of itsneighbors. Similar to the work of Ando et al (1999), the algorithm requires therobots to measure the distances and angles to their neighbors.

Dorigo et al (2004) addressed the problem of robotic aggregation by using anevolutionary robotics approach (Nolfi and Floreano, 2000). In their system, the

Page 3: Multi-Robot Systems

Evolving Aggregation Behaviors in Multi-Robot Systems with Binary Sensors 3

robots can emit a sound and can sense each other using proximity sensors and di-rectional microphones. A neural network controller was evolved and validated insimulation with up to 40 robots. Bahceci and Sahin (2005) used a similar setup andinvestigated the effects of several parameters, such as the number of robots, arenasize, and run time.

The problem of aggregating robots with limited information is challenging, be-cause the robots, if not properly controlled, may end up forming separate clusters.This work investigates whether a single bit of information is sufficient to achieveerror-free aggregation, and whether memory in the controller is a fundamental re-quirement.

2 Experimental Setup

2.1 Problem Formulation

N robots are placed in an unbounded, obstacle-free, homogeneous environment,with random positions and orientations. The objective is to bring the robots togetherat some location in the environment (i.e. aggregate them) via decentralized control.Each robot is equipped with a single binary sensor on some point of its body, whichallows it to know whether or not there is another robot in the direct line of sight ofthe sensor. Formally, the binary sensor gives a positive reading at time t, I(t) = 1, ifthere is a robot in its direct line of sight, and a negative reading, I(t) = 0, otherwise.The binary sensor does not provide the distance to the robot being perceived.

2.2 Robotic and Simulation Platforms

The robotic platform that has been used in this study is the e-puck robot (Mon-dada et al, 2009), shown in Fig. 1(a), which is a miniature, differential wheeledmobile robot that was developed for educational and research purposes. The e-puckis equipped with (among other sensors) a directional camera located at its front. Thecamera has been used in this study to realize the binary sensor in the physical imple-mentation (see Section 5). The simulations presented here were performed using theopen-source Enki library (Magnenat et al, 2011), which is used by WebotsTM in 2Dmode. Enki is capable of modeling the kinematics and the dynamics of rigid bodiesin two dimensions, and has a built-in model of the e-puck. In Enki, the body of ane-puck is modeled as a smooth disk of diameter 7.4cm and mass 152g. The speedsof the left and the right wheels can be set independently, and the maximum speedof the e-puck is set to 12.8cm/s, in both the forward and the reverse directions.In simulation, the binary sensor was realized by projecting a line from the robot’sfront and checking whether it intersects with another robot’s body. The length of the

Page 4: Multi-Robot Systems

4 Melvin Gauci, Jianing Chen, Tony J. Dodd and Roderich Groß

(a)

(b) (c)

Fig. 1: (a): An e-puck robot fitted with a black ‘skirt’ to allow for its visual detection by other robotsagainst a white background. The robot is around 7.4cm in diameter. (b): Pseudo-code representingany reactive controller. (c): A schematic diagram of the fully-recurrent neural network controller.

control cycle was set to ∆ = 0.1s, both in simulation and in the physical system. Insimulation, the physics was updated at a rate of 10 times per control cycle.

2.3 Controllers

Two controllers have been investigated: a reactive controller that does not have anymemory, and a recurrent controller with memory. A reactive controller maps all pos-sible sensor readings onto actuation values. For a differential-wheeled robot with asingle binary sensor, a reactive controller simply maps each of the two possible sen-sor readings onto a pair of speeds for the wheels of the robot. Thus, any reactivecontroller can be represented by 4 parameters (see Fig. 1 (b)): x =

(s0

l ,s0r ,s

1l ,s

1r),

x ∈ [−1.0,1.0]4, where s0l is the speed of the left wheel when I(t) = 0, and so

on, and −1.0 and 1.0 correspond to the maximum backward and forward speedsof a wheel, respectively. The recurrent controller is a fully-recurrent neural net-work (Williams and Zipser, 1989) with only two nodes, whose outputs determinethe speeds of the left and the right wheels. A schematic diagram of the network isshown in Fig. 1 (c). The network is defined by 8 unbounded real-valued parameters:x = (wI1,w11,w21,b1,wI2,w12,w22,b2), x ∈ R8. The internal states of the neurons,γ1 and γ2, are initialized to 0, and are updated according to:

γ(t+1)k = wIkI(t)+w1ksig

(γ(t)1

)+w2ksig

(γ(t)2

)+bk, k ∈ {1,2} .

where sig(·) is the standard sigmoidal function, given by sig(·) = 1/(

1+ e−(·))

.

The speeds of the left and the right wheels are calculated from the states as s(t)l =

2sig(

γ(t)1

)−1, s(t)r = 2sig

(γ(t)2

)−1, such that s(t)l ,s(t)r ∈ (−1.0,1.0).

Page 5: Multi-Robot Systems

Evolving Aggregation Behaviors in Multi-Robot Systems with Binary Sensors 5

2.4 Evolutionary Algorithm

The aim of the evolutionary algorithm is to synthesize controllers of the forms de-scribed in Section 2.3 that give a high aggregation performance. The algorithm usedhere is based on Classical Evolutionary Programming (Yao et al, 1999), and is suit-able for optimization in continuous, real-valued parameter spaces, S ⊆Rn. Its mainfeatures are (i) self-adaptation of mutation strengths, and (ii) a stochastic selectionmethod known as q-tournament selection. In this algorithm, an individual can beconsidered as a 2-tuple: a = (x,σ), where x ∈S is a candidate solution (i.e. here, aset of parameters for a controller) and σ ∈ (0,∞)n is a vector of mutation strengths.The i-th mutation strength in σ corresponds to the i-th element in x. Each generationg comprises a population of µ individuals, P(g) =

{a(g)1 ,a(g)2 , . . . ,a(g)µ

}. In gener-

ation g = 0, all objective parameters and mutation strengths in P(0) are initializedaccording to some distribution. Thereafter, in every generation g, every individualin P(g) generates a new individual by mutation, to create a mutated populationP ′(g) (see Eqs. (1) and (2) in Yao et al (1999)). The population for the next gener-ation, P(g+1), is selected by q-tournament selection from the combined populationP(g) ∪P ′(g). Each individual a(g)k in P(g) ∪P ′(g), k ∈ {1,2, . . . ,2µ}, competes

in a tournament q times. For each tournament, an individual a(g)χ , χ 6= k, is cho-

sen at random from P(g) ∪P ′(g), with replacement. The individuals a(g)k and a(g)χ

are evaluated using an identical, randomly-generated seed ψ (hence, an identicalinitial configuration). If a(g)k achieves a better fitness than its opponent, its score isincreased by one. Therefore, after q tournaments, each individual in P(g) ∪P ′(g)

obtains a score from the set {0,1, . . . ,q}. The µ individuals with the highest scoresare selected to constitute the new population, P(g+1) (individuals with an identicalscore have an equal chance of being selected).

2.5 Fitness Evaluation

The aggregation performance, or fitness, of a controller, defined by a vector ofparameters, x, is measured by running a simulation with a number of robots em-ploying that controller, and computing their performance as follows: Let p(t)

i ,i ∈ {1,2, . . . ,N} represent the positions of the robots at time step t. The averagedistance to their centroid is given by:

d(t) =1N

N

∑i=1‖p(t)

i − p(t)‖, (1)

where p(t) is the centroid of the robots’ positions at time step t, computed as p(t) =1N ∑

Ni=1 p(t)

i , and ‖ · ‖ denotes the Euclidean norm. Based on this metric, the fitnessof a controller is computed as:

Page 6: Multi-Robot Systems

6 Melvin Gauci, Jianing Chen, Tony J. Dodd and Roderich Groß

F (x,ψ) =T

∑t=1

1[d(t)] t

T, (2)

where T is the number of time steps for which the simulation is run. Since d(t) isplaced in the denominator, a larger value of F signifies a better fitness. This functionrewards not only the aggregation metric at the end of the simulation, but also thespeed of the aggregation. Through the t/T exponent applied to d(t), a large value ofd(t) is increasingly penalized as the simulation progresses. Note that the fitness F isthe outcome of a stochastic process using the seed ψ , which determines the initialplacement of the robots and the actuation noise.

3 Controller Synthesis and Selection

Two sets of 100 independent evolutionary runs were performed: one set with a re-active controller and one with a recurrent controller. The evolutions were run for1000 generations, with all object parameters (x) initialized to 0.0 and all mutationstrengths (σ ) initialized to 1.0. The population size was set to µ = 15 and the tourna-ment selection parameter was set to q = 5 (settings as used in Chellapilla and Fogel(2001)). N = 10 robots were used for the fitness evaluations of the controllers. Theirpositions were initialized randomly with a uniform distribution within a square ofsides 316.23cm, for an area of 10000cm2 per robot (on average), and their orienta-tions were initialized randomly with a uniform distribution in the range (−π,π].

Additionally, a grid search was carried out for the reactive controller. A resolutionof 21 settings per parameter was used with values between −1.0 and 1.0 in incre-ments of 0.1. Therefore, 214 = 194481 controllers were evaluated. Each controllerwas evaluated 100 times using Eq. 2 with different initial configurations of robots,with the set of configurations being identical for each controller. The evaluationswere done with N = 10 robots, and the initialization method was identical to thatused in the evolutionary runs, described above. The fitness of each controller wasrecorded as the mean fitness of the 100 evaluations. Such a search was not possibleto perform with the recurrent controller, because this has 8 unbounded parameters,which makes the search space prohibitively large.

Each evolution produced µ = 15 controllers after 1000 generations. In order toextract the best controller found by each evolution, each controller in the final gener-ation was evaluated 100 times with different initial configurations of robots, and thecontroller with the highest average fitness over the 100 evaluations was selected asthe resultant controller of the evolution. Each of the 100 runs for each of the 200 con-trollers was inspected visually, and it turns out that two distinct behaviors emerged,as shown in Fig. 2(a). The first behavior leads to a compact cluster, while the sec-ond behavior leads the robots to form a circle and maintain it. With the reactivecontroller, 81 evolutionary runs produced controllers leading to a circle formation,while 19 runs produced controllers leading to a compact cluster formation. With therecurrent controller, only one evolutionary run out of 100 produced a circle-forming

Page 7: Multi-Robot Systems

Evolving Aggregation Behaviors in Multi-Robot Systems with Binary Sensors 7

(a)

clus

ters

circ

les

clus

ters

circ

les

270

300

330

fitne

ss,F

,in

cm−

1

(b)

Fig. 2: (a) The cluster (top) and circle (bottom) configurations of robots. (b) This box plot (seeFootnote 1) shows the average fitnesses F of the 200 evolved controllers, grouped according to(i) the type of controller (reactive or recurrent) and (ii) the type of formation that the controllerleads to. The left (red) two boxes represent reactive controllers, while the right (blue) two boxesrepresent recurrent controllers. The horizontal green line shows the average fitness of the bestcontroller found by the grid search. Higher values indicate a better fitness. The expected value ofF for randomly-placed robots that do not move throughout the simulation is 185.562cm−1. Themaximum ‘possible’ value of F is 417.514cm−1, corresponding to robots that are in the mostcompact configuration possible (see Graham and Sloane, 1990) from start to finish.

controller. The best reactive controller found by the grid search leads to a compactcluster formation. Fig. 2(b) shows a box plot1 with the average fitnesses F of the 200evolved controllers, grouped according to (i) the type of controller (reactive or re-current) and (ii) the type of formation that the controller leads to. The circle-formingcontrollers achieved a significantly lower fitness than the compact cluster-formingcontrollers. This is because the fitness function in Eq. 2 was specifically designedto reward compactness. The circle-forming recurrent controller has a worse fitnessthan the worst circle-forming reactive controller, as the circle forms over a longertime. In terms of compact-cluster-forming controllers, it is clear from Fig. 2(b) thatthe recurrent controllers lead to significantly higher fitnesses than the reactive con-trollers. In fact, the worst compact cluster-forming recurrent controller evolved hasa higher fitness than the best reactive controller evolved. The fitness of the best re-

1 The boxplots presented here are all as follows. The line inside the box represents the medianof the data. The edges of the box represent the lower and the upper quartiles (25-th and 75-thpercentiles) of the data, while the whiskers represent the lowest and the highest data points thatare within 1.5 times the inter-quartile range from the lower and the upper quartiles, respectively.Circles represent outliers.

Page 8: Multi-Robot Systems

8 Melvin Gauci, Jianing Chen, Tony J. Dodd and Roderich Groß

active controller located by the grid search (green line in Fig. 2(b)) is lower than thefitness of the best reactive controller found by an evolution. This is because of thelimited resolution that had to be used in the grid search, whereas the evolutionaryalgorithm has a practically infinite resolution.

The circle forming controllers yield an interesting behavior. Preliminary experi-ments show that they scale well with the number of robots (as tested with 100 robotsin simulation), and are in principle also implementable on real robots. However, thenext sections will investigate the cluster-forming controllers, which outperform thecircle-forming controllers in terms of compactness. An investigation of the circleforming controllers will be left as future work.

4 Post-Evaluations with the Best Controller

During the controller synthesis stage, the controller evaluations were limited to N =10 robots, in order to keep the computation time within reasonable limits. The bestsynthesized (recurrent) controller was chosen for post-evaluations with 100 robots.In the following, 100 robots were initialized within a virtual square of sides 1000cm,such that the area per robot (on average) is 10000cm2, identical to that used forcontroller synthesis.

The effect of the sensing range: As the robots are initialized within a virtualsquare of sides 1000cm, a sensing range of

√10002 +10002 = 1414 cm can be

considered as practically unlimited. This will be denoted by δ∞. In order to inves-tigate the effect of the sensing range on aggregation performance, simulations withdifferent proportions of the sensing range δ to δ∞ were performed; 100 simulationsfor each δ/δ∞ = {0.0,0.1, . . . ,1.0}. For each simulation, the value of 1/d(t) (seeEq. 1) after 1800s was recorded (hence a higher value signifies more compactness).Fig. 3(a) shows a box plot for all the simulations. The performance is virtually un-affected as δ/δ∞ is reduced from 1.0 to 0.3. As δ/δ∞ is reduced to 0.2, the medianperformance drops instantly, almost to the value of robots that do not move duringthe simulation (lower magenta line in Fig. 3(a)).

The effect of sensing noise: Two types of noise on the binary sensor were con-sidered. With false positive noise, the binary sensor erroneously gives the wrongreading with probability p when no robot should be observed, while it always givesthe correct reading when a robot should be observed. Conversely, with false nega-tive noise, the binary sensor never indicates the presence of a robot when there isnone, but when there is a robot, it fails to detect it with probability p. False positivenoise was found to be detrimental to aggregation performance. However, false neg-ative noise was found to speed up the aggregation process. In order to investigatethis, 100 simulations were run for each of 10 values of p: {0.0,0.1, . . . ,0.9}. Eachsimulation was stopped when all the robots were aggregated in a single cluster2, and

2 Two robots are said to be neighbors at time step t if the distance between their peripheries isless than some threshold, τ . The value of τ used here is equal to the diameter of the robots, i.e.τ = 7.4cm. A depth-first search algorithm is used to find the cluster profile at time step t, where

Page 9: Multi-Robot Systems

Evolving Aggregation Behaviors in Multi-Robot Systems with Binary Sensors 9

0.0

0.2

0.4

0.6

0.8

1.0

0.00

0.02

0.04

sensing range factor, δ/δ∞

1/d(

t),i

ncm−

1 ,aft

er18

00s

(a)

0.0

0.2

0.4

0.6

0.8

200

800

level of noise, p

time

tofo

rma

sing

lecl

uste

r,in

s

(b)

Fig. 3: (a): This box plot shows how the aggregation performance with N = 100 robots is affectedby the range of the binary sensor. The horizontal axis shows the sensing range, δ , as a proportionof the maximum, effectively infinite, sensing range, δ∞, while the vertical axis shows a measure ofaggregation, 1/d(t), as defined in Eq. 1. Each box represents 100 values of 1/d(t) obtained from100 runs. The lower magenta line shows the value of 1/d(t) for robots that do not move. The uppermagenta line shows the maximum value of 1/d(t) obtainable with 100 robots, corresponding to theoptimal packing (see Graham and Sloane, 1990). (b): This box plot shows how the time taken byN = 100 robots to form a single cluster is affected by the level of false negative noise on the binarysensor. Each box represents 100 values of time obtained from 100 runs.

the time taken was recorded. Fig. 3 (b) shows a box plot of these times for false neg-ative noise. The plot shows a clear ‘bowl’ shape, and it is striking that the optimum(fastest aggregation) occurs somewhere between p = 0.6 and 0.7.

Scalability: In order to investigate the scalability of aggregation with binary sen-sors, the best synthesized controller was tested with increasing numbers of robots.100 simulations with different initial configurations of robots were performed foreach value of N ∈ {100,200, . . . ,1000}. In each trial, the robots formed a singlecluster, meaning that the controller is capable of achieving consistent, error freeaggregation with at least 1000 robots.

a cluster is defined as a set of robots that form a connected graph (i.e. every two robots within theset are either neighbors, or connected by a chain of neighboring robots).

Page 10: Multi-Robot Systems

10 Melvin Gauci, Jianing Chen, Tony J. Dodd and Roderich Groß

5 Physical Implementation and Experiments

A square arena of sides 250cm was used, having a white floor and enclosed by whitewalls. Its floor was marked with a grid of 9×9 points, spaced 25cm from each otherand from the walls. For each experiment, 20 out of these 81 points where chosenrandomly as the initial positions of the robots. Additionally, a random orientationfrom {North, South, East, West} was chosen for each robot.

The binary sensor has been implemented using the e-puck’s directional camera.The robots were fitted with black ‘skirts’ in order to make them distinguishableagainst the white arena. The middle column of pixels from the camera’s image issub-sampled so as to obtain 15 equally-spaced pixels from the bottom of the im-age to its half-way height. The gray value of these 15 pixels is compared against athreshold, which was empirically set to 2/3 of the maximum possible value (purewhite). If one or more of the pixels has a gray value below this threshold, the sensorgives a positive reading. The implemented sensor has been found to provide reliablereadings up to a range of around 150cm.

Two sets of 30 trials each were performed with N = 20 robots; one set with thebest-found reactive controller, and one set with the best-found recurrent controller.The robots were started by issuing an infra-red signal from a remote control, andstopped automatically after 10 minutes. Each trial was recorded by an overheadcamera, and all of the videos can be found in the online supplementary material(Gauci et al, 2012). Throughout the 60 trials, 9 robots had a mechanical or elec-trical problem, and stopped moving. Whenever this happened, an infra-red signalwas re-issued to the robot in case it might start again (normally-operating robotsignored this signal). Otherwise, the stationary robot was left in the arena for the restof the trial. No other mechanical or electrical problems were observed. The clusterconfiguration (see Footnote 2 for a definition) of the robots at the end of each trialwas examined. In the trials with the recurrent controller, the mean size (over the30 trials) of the largest cluster in the final configuration was 17.10, meaning that85.50% of the robots were aggregated in a single cluster. In the trials with the reac-tive controller, this value was 17.77, or 88.83%. These results are not significantlydifferent (Mann-Whitney test performed with a p = 0.05 threshold). However, withthe recurrent controller, substantially more robots became stuck at the arena wallsthan with the reactive controller (49 times against 15 times, out of a possible totalof 600 each). Fig. 4 shows the cluster sizes in the final configurations of the 30 trialswith the reactive controller.

6 Conclusion

This paper has investigated the usefulness of an alternative sensing trade-off forswarm robotic systems: one in which the robots have a longer sensing range thanis normally assumed, but process only a minimal amount of information. This ideahas been implemented to solve the problem of decentralized robot aggregation in

Page 11: Multi-Robot Systems

Evolving Aggregation Behaviors in Multi-Robot Systems with Binary Sensors 11

5

10

15

20

25

30

12

34

0

10

20

Trial Number

Cluster Number

Clu

ste

r S

ize

Fig. 4: This plot shows the final configurations of all the 30 trials performed with the reactivecontroller. The bar at the back shows the number of robots in the largest cluster, the bar in front ofit shows the number of robots in the second-largest cluster, and so on. A robot that is not within 7.4cm of any other robot is considered as a cluster in itself. Across all the trials, there were at most 4clusters in the final configuration.

a homogeneous environment using a sensor that provides each robot with a singlebit of information per control cycle. The simplicity of the sensor does not comeat the cost of a complex controller: in fact, even the simplest possible memorylesscontroller has been shown to be effective. To the best of the authors’ knowledge,this is the simplest sensor/controller combination capable of aggregating robots in asingle location. The system was implemented on 20 physical e-puck robots, and twosets of 30 trials were conducted, one set with a memoryless controller, and one setwith a recurrent controller. Both controllers led to a good aggregation performancewithin 10 minutes. In the future, it is intended to extend the work performed tomore complex environments (e.g. with obstacles), and to test the proposed sensingtrade-off on other, more challenging swarming tasks.

Acknowledgments

The research work disclosed in this publication is funded by the Marie Curie Eu-ropean Reintegration Grant within the 7th European Community Framework Pro-gramme (grant no. PERG07-GA-2010-267354). M. Gauci acknowledges supportby the Strategic Educational Pathways Scholarship (Malta). The scholarship is part-financed by the European Union – European Social Fund (ESF) under OperationalProgramme II – Cohesion Policy 2007–2013, “Empowering People for More Jobsand a Better Quality of Life”.

Page 12: Multi-Robot Systems

12 Melvin Gauci, Jianing Chen, Tony J. Dodd and Roderich Groß

References

Ando H, Oasa Y, Suzuki I, Yamashita M (1999) Distributed memoryless point con-vergence algorithm for mobile robots with limited visibility. IEEE Trans RoboticAutom 15(5):818–828

Bahceci E, Sahin E (2005) Evolving aggregation behaviors for swarm robotic sys-tems: a systematic case study. In: Proc. 2005 IEEE Swarm Intelligence Sympo-sium, pp 333–340

Camazine S, Franks NR, Sneyd J, Bonabeau E, Deneubourg JL, Theraulaz G (2001)Self-Organization in Biological Systems. Princeton University Press, Princeton,NJ

Chellapilla K, Fogel DB (2001) Evolving an expert checkers playing program with-out human expertise. IEEE Trans Evolut Comput 5(4):422–488

Correll N, Martinoli A (2011) Modeling and designing self-organized aggregationin a swarm of miniature robots. Int J Robot Res 30(5):615–626

Cortes J, Martinez S, Bullo F (2006) Robust rendezvous for mobile autonomousagents via proximity graphs in arbitrary dimensions. IEEE Trans Automat Contr51(8):1289–1298

Deneubourg JL, Gregoire JC, Fort EL (1990) Kinetics of larval gregarious behaviorin the bark beetle Dendroctonus micans (Coleoptera: Scolytidae). J Insect Behav3:169–182

Dorigo M, Trianni V, Sahin E, Groß R, Labella TH, Baldassarre G, Nolfi S,Deneubourg JL, Mondada F, Floreano D, Gambardella L (2004) Evolving self-organizing behaviors for a swarm-bot. Auton Robot 17:223–245

Garnier S, Jost C, Gautrais J, Asadpour M, Caprari G, Jeanson R, Grimal A, Ther-aulaz G (2008) The embodiment of cockroach aggregation behavior in a group ofmicro-robots. Artif Life 14(4):387–408

Gauci M, Chen J, Dodd TJ, Groß R (2012) Online supplementary material. URLhttp://naturalrobotics.group.shef.ac.uk/supp/2012-003/

Gennaro MD, Jadbabie A (2006) Decentralized control of connectivity for multi-agent systems. In: Proc. 45th IEEE Conf. Decision and Control, pp 3628–3633

Gordon N, Wagner IA, Bruckstein AM (2004) Gathering multiple robotic a(ge)ntswith limited sensing capabilities. In: Ant Colony Optimization and Swarm Intel-ligence, Proc. ANTS 2004, Springer Verlag, Berlin, Germany, Lecture Notes inComputer Science, vol 3172, pp 142–153

Graham RL, Sloane NJA (1990) Penny-packing and two-dimensional codes. DicreteComput Geom 5:1–11

Halloy J, Sempo G, Caprari G, Rivault C, Asadpour M, Tache F, Saıd I, Durier V,Canonge S, Ame JM, Detrain C, Correll N, Martinoli A, Mondada F, Siegwart R,Deneubourg JL (2007) Social integration of robots into groups of cockroaches tocontrol self-organized choices. Science 318:1155–1158

Jeanson R, Rivault C, Deneubourg JL, Blanco S, Fournier R, Jost C, Theraulaz G(2005) Self-organized aggregation in cockroaches. Anim Behav 69(1):169–180

Page 13: Multi-Robot Systems

Evolving Aggregation Behaviors in Multi-Robot Systems with Binary Sensors 13

Kernbach S, Thenius R, Kernbach O, Schmickl T (2009) Re-embodiment of hon-eybee aggregation behavior in an artificial micro-robotic system. Adapt Behav17(3):237–259

Magnenat S, Waibel M, Beyeler A (2011) Enki: The fast 2d robot simulator. URLhttp://home.gna.org/enki/

Mondada F, Bonani M, Raemy X, Pugh J, Canci C, Klaptocz A, Magnenat S, Zuf-ferey JC, Floreano D, Martinoli A (2009) The e-puck, a robot designed for educa-tion in engineering. In: Proceedings of the 9th Conference on Autonomous RobotSystems and Competitions, 1, pp 59–65

Nolfi S, Floreano D (2000) Evolutionary Robotics: The Biology, Intelligence, andTechnology of Self-Organizing Machines. The MIT Press, Cambridge, MA

Parrish JK, Edelstein-Keshet L (1999) Complexity, pattern, and evolutionary trade-offs in animal aggregation. Science 284(5411):99–101

Trianni V (2008) Evolutionary Swarm Robotics, Studies in Computational Intelli-gence, vol 108. Springer Verlag, Berlin, Germany

Williams RJ, Zipser D (1989) A learning algorithm for continually running fullyrecurrent neural networks. Neural Comput 1(2):270–280

Yao X, Liu Y, Lin G (1999) Evolutionary programming made faster. IEEE TransEvolut Comput 3(2):82–102