13
Robotics and Autonomous Systems 59 (2011) 530–542 Contents lists available at ScienceDirect Robotics and Autonomous Systems journal homepage: www.elsevier.com/locate/robot A multi-robot exploration algorithm based on a static Bluetooth communication chain Tamás Kovács , Attila Pásztor, Zoltán Istenes Kecskemét College, Department of Informatics, 6000 Kecskemét, Izsáki 10, Hungary Eötvös Loránd University, Faculty of Informatics, Department of Software Technology and Methodology, Budapest, Hungary article info Article history: Received 23 June 2010 Received in revised form 18 March 2011 Accepted 31 March 2011 Available online 12 April 2011 Keywords: Bluetooth scatternet Mobile robots Area exploration abstract In the first part of the present paper, a novel communication scheme of an autonomous robot team via Bluetooth radio is investigated. In the presented solution, an autonomous unit is equipped with two independent Bluetooth radios and so a relatively fast communication is possible in the team in a static (i.e. no ad hoc) networking topology. The performance of such a network was tested by implementing a linear graph topology by NXT robots. It was found that the reliability and the speed of such a communication scheme are satisfactory and give rise to applications in a robot team control task. In the second part of the paper, an area exploration method is presented based on the static linear communication network above. The method was tested by computer simulations for various obstacle configurations and densities. It was found that the proposed method performs better than the chosen reference methods in the case of zero or low obstacle density and when high (75% or 100%) exploration ratio is required. With a simple proof, we have shown that the proposed (fixed chain-like team) exploration method is optimal in the obstacle-free case under the constraint of the connectivity with the base station. © 2011 Elsevier B.V. All rights reserved. 1. Introduction Due to the considerable development in the field of team in- telligence and multi-robot cooperation there are a wide variety of problems that can be solved efficiently by a group of autonomous mobile robots. A typical and intensively investigated problem is the exploration of an unknown area with the purpose of either mapping or finding certain target object(s). The optimal solution strategies of such a task are mostly determined by the conditions imposed by the environment and the abilities of the robot team. Almost all of such exploration methods are based on the concept of ‘‘frontier’’, which is the boundary between explored and unexplored areas [1,2]. The basic idea is that the robots should be directed to or kept on this frontier, while the cost of the team- moves is minimal and the information gain (explored area) is maximal. In order to achieve this the published algorithms apply a utility function, which is increasing in information gain and decreasing in the cost or exploration time [1–10]. Simmons et al. [3] applied a centralized method, where a base station calculated the optimal team-move (with the highest utility) and controlled the robots. Here the idea that the robots should Corresponding author at: Kecskemét College, Department of Informatics, 6000 Kecskemét, Izsáki 10, Hungary. E-mail addresses: [email protected] (T. Kovács), [email protected] (A. Pásztor), [email protected] (Z. Istenes). stay apart from each other (dispersion) was built into the utility function. This dispersion is required in order to maximize the long- term information gain in such models. Burgard et al. [4] propose a similar centralized algorithm; however, in this method they also consider the case of limited communication range. If the originally continuous communication cluster of the robot team breaks up into more, smaller clusters so that the robots in the different clusters cannot communicate with each other, then each cluster continues the exploration algorithm individually. In the worst case, all of the robots are isolated and allocates the exploration tasks only for himself, so the method crosses over to a decentralized scheme, though the task allocation is far from the optimal. In their paper Sheng et al. [5] introduce a solution to the problem of separated communication clusters: the robots perform an individual exploration movement but they are compelled to get in (communication) contact with their fellows regularly so that the task allocation procedure can work regarding the whole team. Thus, for a considerable part of the working time the team members are outside the communication range of the other robots. Vazquez and Malcolm [6] proposed a decentralized method, where the robots individually perform their algorithms respon- sible for area exploration, communication maintaining and colli- sion avoiding. The prevailing activity is chosen according to the momentary situation: if the communication and collision avoid- ing is ensured then the exploration behavior is preferred. If an im- portant communication link is close to a failure (because of the growing distance of the partners), then the connection maintaining behavior is preferred; and the same is true for the danger of 0921-8890/$ – see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.robot.2011.03.008

A multi-robot exploration algorithm based on a static Bluetooth communication chain

Embed Size (px)

Citation preview

Page 1: A multi-robot exploration algorithm based on a static Bluetooth communication chain

Robotics and Autonomous Systems 59 (2011) 530–542

Contents lists available at ScienceDirect

Robotics and Autonomous Systems

journal homepage: www.elsevier.com/locate/robot

A multi-robot exploration algorithm based on a static Bluetooth communicationchainTamás Kovács ∗, Attila Pásztor, Zoltán IstenesKecskemét College, Department of Informatics, 6000 Kecskemét, Izsáki 10, HungaryEötvös Loránd University, Faculty of Informatics, Department of Software Technology and Methodology, Budapest, Hungary

a r t i c l e i n f o

Article history:Received 23 June 2010Received in revised form18 March 2011Accepted 31 March 2011Available online 12 April 2011

Keywords:Bluetooth scatternetMobile robotsArea exploration

a b s t r a c t

In the first part of the present paper, a novel communication scheme of an autonomous robot team viaBluetooth radio is investigated. In the presented solution, an autonomous unit is equipped with twoindependent Bluetooth radios and so a relatively fast communication is possible in the team in a static (i.e.no ad hoc) networking topology. The performance of such a network was tested by implementing a lineargraph topology by NXT robots. It was found that the reliability and the speed of such a communicationscheme are satisfactory and give rise to applications in a robot team control task. In the second part of thepaper, an area explorationmethod is presented based on the static linear communication network above.The method was tested by computer simulations for various obstacle configurations and densities. It wasfound that the proposedmethod performs better than the chosen referencemethods in the case of zero orlow obstacle density and when high (75% or 100%) exploration ratio is required. With a simple proof, wehave shown that the proposed (fixed chain-like team) exploration method is optimal in the obstacle-freecase under the constraint of the connectivity with the base station.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

Due to the considerable development in the field of team in-telligence and multi-robot cooperation there are a wide variety ofproblems that can be solved efficiently by a group of autonomousmobile robots. A typical and intensively investigated problem isthe exploration of an unknown area with the purpose of eithermapping or finding certain target object(s). The optimal solutionstrategies of such a task are mostly determined by the conditionsimposed by the environment and the abilities of the robot team.

Almost all of such exploration methods are based on theconcept of ‘‘frontier’’, which is the boundary between explored andunexplored areas [1,2]. The basic idea is that the robots should bedirected to or kept on this frontier, while the cost of the team-moves is minimal and the information gain (explored area) ismaximal. In order to achieve this the published algorithms applya utility function, which is increasing in information gain anddecreasing in the cost or exploration time [1–10].

Simmons et al. [3] applied a centralized method, where a basestation calculated the optimal team-move (with the highest utility)and controlled the robots. Here the idea that the robots should

∗ Corresponding author at: Kecskemét College, Department of Informatics, 6000Kecskemét, Izsáki 10, Hungary.

E-mail addresses: [email protected] (T. Kovács),[email protected] (A. Pásztor), [email protected] (Z. Istenes).

0921-8890/$ – see front matter© 2011 Elsevier B.V. All rights reserved.doi:10.1016/j.robot.2011.03.008

stay apart from each other (dispersion) was built into the utilityfunction. This dispersion is required in order tomaximize the long-term information gain in such models. Burgard et al. [4] propose asimilar centralized algorithm; however, in this method they alsoconsider the case of limited communication range. If the originallycontinuous communication cluster of the robot team breaks upinto more, smaller clusters so that the robots in the differentclusters cannot communicate with each other, then each clustercontinues the exploration algorithm individually. In theworst case,all of the robots are isolated and allocates the exploration tasksonly for himself, so the method crosses over to a decentralizedscheme, though the task allocation is far from the optimal.

In their paper Sheng et al. [5] introduce a solution to theproblem of separated communication clusters: the robots performan individual exploration movement but they are compelled toget in (communication) contact with their fellows regularly sothat the task allocation procedure can work regarding the wholeteam. Thus, for a considerable part of the working time the teammembers are outside the communication range of the other robots.

Vazquez and Malcolm [6] proposed a decentralized method,where the robots individually perform their algorithms respon-sible for area exploration, communication maintaining and colli-sion avoiding. The prevailing activity is chosen according to themomentary situation: if the communication and collision avoid-ing is ensured then the exploration behavior is preferred. If an im-portant communication link is close to a failure (because of thegrowing distance of the partners), then the connectionmaintainingbehavior is preferred; and the same is true for the danger of

Page 2: A multi-robot exploration algorithm based on a static Bluetooth communication chain

T. Kovács et al. / Robotics and Autonomous Systems 59 (2011) 530–542 531

collision. With this scheme, the algorithm guarantees the commu-nication connectivity at any time meanwhile the exploration, andthe exploration proceeds under the constraint of the connectivityand collision avoiding.

In a recent paper, Pei et al. [7] introduce a method, wherethe cost of the team-moves is given by the total migration timebetween the two consecutive team-positions. Their algorithm,being a central one, ensures the connectivity with a base stationat the end of each team-move; however, meanwhile the migrationphase the connectivity is not guaranteed.

The task allocation is a crucial part of these algorithms. Inthis phase, the central resource or the community of the robotsdecides the (believed) optimal target frontier position for eachrobot. In a decentralized system, this is based on the bids fromeach robot for the target positions. This bidding scheme is quitedifficult to organize when there is no continuous communicationbetween thewhole team (see [5]). In numerous papers (e.g. [8–10])this bidding is organized as a market auction, supposing that thecommunication under the bidding is continuous in thewhole team.The paper published byDahl et al. [11] treats the task allocation as ascheduling problem applied also in complex industrial productionprocesses. Their solution takes into consideration the case ofheterogeneous robot team, where the abilities of the robots aredifferent for the appointed tasks (like exploration for example).

Most of the cited methods take into consideration the problemof collision avoidance with the kin robots. On the level of thealgorithm, it is done simply by adding a term to the utility function,that punishes the configurations in which the robots come tooclose to each other. In addition to this, it is necessary to implementa local, independently working collision avoiding behavior on therobots that takes over if there is danger of collision [6]. In thepresent paper, the collision avoiding with ki robots is treated firstfrom the part of the proposed algorithm, and then from the part ofthe local method in a specific robot-simulator test. In our solution,we used only a few sensors, and no on-board or central camera.Much more advanced control and collision avoiding methods canbe achieved with the help of an on-board camera. The applicationof image processing methods in robotic intelligence makes itpossible to recognize, identify and locate the fellow robots andlandmark objects [12,13], or with a multiple camera system eventhe body poses of a humanoid can be recognized [14].

In the case when the robot team works under the connectivityconstraint of the network, the exploration algorithm has a doublerole: it has to find movements that are optimal from theexploration point of view, and, at the same time, it has to ensure thenetwork connectivity. There are applications [15–17] where thislatter role is trivial, since the supposed radio coverage of a robotis bigger than the working area (i.e. the area to be explored). Inthese cases the robot team is supposed to use a wireless sharedmedia communication system and the Media Access Control hasthemost crucial task. Numerous otherworks, however, investigatemulti-robot exploration problems when the radio coverage of anindividual robot is much smaller than the area to be explored.Generally, in these problems the working plane is divided intocells and a step (or movement action) of a robot is determinedby the source and the destination cells. This (coarser or finer)discretization of the explored area makes easier the mathematicalcharacterization of the problem. In these cases, the explorationalgorithms suppose that there is a Mobile Ad hoc Networking(MANET) system providing communication channels between anytwo robots of the team. Cui et al. [18] applied a so-called gradualexpansion algorithm to keep the robots in communication contactwhile completing their task. The basic idea of this algorithm isthat only one robot moves in one time-step and the destinationcell of this move must be a neighbor of a cell occupied by a kinrobot. In the step planning procedure, they applied a fuzzy decision

system to find the best discoverymove under the constraint above.Sheng et al. [5] and later Cheng and Dasgupta [19] treated similarproblems, but here the communication connectivity of the teamwas not enforced at each time-step, though the communication(if it was possible) played important role in the applied algorithm.So in this scene the communication network could be broken upinto smaller clusters or individual robots, but on this account theteam could move around more freely. Rooker and Birk [20] usedan exploration scheme where the communication network had tobe continuous at any moment; moreover, they investigated theimportant casewhen the teamhad to be in communication contactwith a fixed base station, so the exploration range was limited inspace. Here the whole robot team moved in one time-step, andthey used a utility functionmethod to find the best collectivemove.The collective moves that would break up of the communicationnetwork obtained such amount of negative utility points so thatthe team never chose these moves. Antonelli et al. [21] proposeda theoretical solution to a problem of maintaining a multi-hop wireless communication chain between two mobile targetsby mobile robots. However, their work did not deal with areaexploration, since the positions of the targets were known duringthe simulation task.

Nevertheless, the most of the published schemes assume thewireless communication granted and consider the specifications ofthe wireless system beyond the scope of their interest. In the caseswhen a specific wireless system is applied (either in computersimulation or in reality) the mobile robots at hand are equippedwith Wi-Fi (IEEE 802.11), ZigBee (IEEE 802.15.4) or Bluetooth(IEEE 802.15.1) radio system, since they render cheap and yetsatisfactory solutions.

The Bluetooth radio system, which is the applied technologyin the present work, is a cheap solution and compared to itsrelatively high data rate it is economic on power. Due to theseadvantages, the Bluetooth technology is the most commonly usedon small mobile devices, and therefore it, is a good candidate in thecase of mobile robots. More detailed pros and cons on Bluetoothcommunication in mobile robotics can be read in [22] or [23]. Themost serious limitation of a Bluetooth network is that it is notscalable, since a Bluetooth piconet can consists of a master and atmost seven slaves [24]. In order to overcome this limitation the so-called Bluetooth scatternet is invented shortly after the appearanceof the original Bluetooth standard [25,26]. The basic idea of forminga scatternet is that a slave disconnects from itsmaster and becomesonly a passive member (park or hold mode) in its original piconet,and then asks for and gets admission into another piconet as a slaveor a master. Thus, there can be communication between the twopiconets through this, so-called, bridge unit: if there is a packet oramessage directed to the other piconet the bridge takes it, changespiconet and passes the packet towards its destination. However,each of these bridging actions causes some delay and acts as abottleneck [27]. In addition to this, the position of the bridge unit ismore restricted than that of the others, since it must be in the radiocoverage in both piconets. Due to this shortcomings this bridge-based scatternet, although arbitrary scalable from theoretical pointof view, is limited to not too big networks and low data rates inpractice.

Sohrabi et al. [28] and later Leopold et al. [29] proposed a noveland simple solution that employed two independent Bluetoothradios in single autonomous host to form a large scale wirelesssensor network (dual-radio scatternet). In this scheme, the twoBluetooth radios are parts of two different piconets so that the hostpasses the information between its two radios. It is easy to seethat this solution is free from the limitations of the former bridge-based scatternet, although it has higher cost. The main advantageof this scheme over the usual scatternet is that a node does nothave to disconnect from one neighbor and reconnect to another

Page 3: A multi-robot exploration algorithm based on a static Bluetooth communication chain

532 T. Kovács et al. / Robotics and Autonomous Systems 59 (2011) 530–542

one regularly, in order to ensure the connectivity of the entirenetwork. This switching from one neighbor to the other consists ofa normal disconnection and a connectionprocess,which altogethertakes between 3 and 6 s [30,31]. Under this period the network isnot connected, though the packet finally reaches its destination.Since there are a number of bridge nodes in a big (conventional)scatternet, in almost any moment there is an unconnected partof the scatternet. The dual-radio-based scheme, however, ensuresthat every nodes in the network is reachable continuously (Forfurther details on the advantage of the dual-radio-based scatternetsee [28]).

In the present paper, we employed the dual-radio scatternetscheme described above to form communication network ofmobile robots. In order to test our scheme in reality we usedthe microcontroller-based NXT robot assembling set producedby LEGO. This set is based on Bluetooth communication andplanned mostly for educational purposes, however, there are alsonumerous research applications using NXT. The most remarkableadvantage of the dual-radio scatternet is that two communicatingrobots need not to break their Bluetooth connectionwhile they arein the vicinity of each others (in the case of Bluetooth I technologythismeans a distance atmost tenmeters). Thismeans that in orderto exploit the advantages of the communication system above therobots should not change their neighbors in the communicationnetwork while the completing the task, that is, the networkingshould work as not aMANET but a static network. In one hand, thishas restrictive consequences for the applied exploring algorithm,in other hand, we can exploit the advantages of the dual-radio-based system, and the continuously connected network enhancesthe reliability and robustness of the robot system. First, a realitycheck of the dual-radio-based communication system is performedwith a simple scenario that consists of a linear communicationchain formed by three to seven robots. An exploration algorithmfor this fixed linear communication topology is also proposed.

In the next chapter the technical details of the invented NXTscatternet system are given. In Section 3, the reality check of thelinear communication chain consisting of three to seven robots isintroduced. Section 4 details the proposed exploration algorithm,and in Section 5 a simple proof is given regarding the optimalityof the proposed algorithm in an obstacle-free area. Section 6introduces the results of the tests concerning the proposed andreference methods. Finally, we conclude in the last section.

2. The dual-radio-based scatternet

The most significant part of the NXT robot building set is themain brick that contains the central microcontroller, several IOports and the Bluetooth unit (CSR BlueCore 4 v 2.0) controlled by anARM7 chip. The radio coverage is guaranteed up to tenmeters. Thepoint-to-point Bluetooth communication is implemented usingthe Serial Port Profile (SPP) [32]. By means of the previous sectionour first goal is to create a communication unit equipped by twoindependent Bluetooth radios. The NXT is not a skeleton systemi.e. it is not possible to extend the original electronics of the centralbrick. That is why we invented an unusual solution, the basicidea of which is that two central bricks are connected to eachother through their RS485 ports. These ports regularly serves asa receiver of measured data from various sensors of NXT standard,however, they can not only receive but also send data via the NXTsensor cable. Using this capability of the RS485 ports, a fast half-duplex communication link can be established between the twobricks. Thus, the two bricks form a new autonomous unit that hastwo independent radios (each in its own host brick).

The point-to-point communication or on the Bluetooth eitheron the RS485 link must be controlled by program codes in theapplication layer. In the present project, we applied the C-like NXC

Fig. 1. An example of mapping a dual-radio-based Bluetooth scatternet (b) intoa directed graph (a). The master and slave units are marked by shaded or whiterectangles, respectively.

Fig. 2. The testing scenario. The symbols introduced in Fig. 1 are used.

programming language of theNXTmicrocontroller,which containsAPI functions to treat the ARM7 controller and the RS485 port. (TheJava-like Lejos or the assembler-like NBC programming tools arealso capable for the purpose.)

Each of the radios of the autonomous unit can be either masteror slave in the piconet it belongs to. Every unit with the two radiosis a connection point between the two piconets, so such units canform a large connected network. Let us represent this network bya directed graph, where the nodes correspond to the autonomousunits and the directed links denote the Bluetooth connections sothat the direction is from the master to the slave (see Fig. 1).

It is also possible to involve Bluetooth equipped cell phones intothis network, however the cell phonesmust be end-points (leafs) ofthe graph since a phonehas only oneBluetooth radio.With the helpof this multi-hop network, the cell phones can communicate witheach other so that in this scenario, the cell phones are the clientsand the mobile robots provide service for them.

Various self-assembly algorithms to create such networks aredetailed by Sohrabi et al. [28] for sensor nodes with fixed spatialpositions. The situation is much more difficult when the nodesare mobile robots or cell phones carried by their owners, sincethe topology is continuously changing. In that case a continuousmaintaining the network connectivity and routing information isnecessary. Working out such connectivity preserving and routingalgorithms is beyond the scope of the present work. Our aim isto implement a real communication capability test for a simpletopology consisting of several autonomous NXT robots assembledby the scheme above. This communication test is described in thenext section.

3. The communication tests and results

The test presented here is based upon variable number of robotsarranged in a linear graph topology shownby Fig. 2. In order to get apicture about the communication speed a simple ping applicationwas written. In this application, the first robot in the chain sentsmall (16 bit long) ping packets into the chain, while the robot onthe other end of the chain replied automatically the ping packets.

Page 4: A multi-robot exploration algorithm based on a static Bluetooth communication chain

T. Kovács et al. / Robotics and Autonomous Systems 59 (2011) 530–542 533

Fig. 3. The average reply times through the robot communication chain as afunction of the number of robots.

A ping procedure sent ten packets in one series. The next packetwas sent as soon as the acknowledgment arrived. The average replytimes weremeasured and recorded for various robot chain lengthsofN = 3, 4, 5, 6 and 7. For each chain-length, five ping procedureswere executed. The average reply times are shown in Fig. 3.

It can be seen that each robot–robot link rises the hopping timeby approximately 60ms, since the average reply time is the doubleof the average hopping time.

It is also important that the experiments were executed in aclosed room with solid walls the diameter of which was less thanten meters that is each Bluetooth radio was within the coverage ofall of the others. In spite of this closeness, the possible interferencedid not caused increasing latencywith the increasing chain-length.This is mostly due to that the ping packets were short and only onepacket were traveling on the chain at a time.

4. The area exploration algorithm

4.1. The cell grid and the radio coverage

The area to be explored is divided into non-overlapping equallysized square-shaped grid cells as it is usual in such problems. Inour case, the cell size depends on the Bluetooth communicationrange. The specific task of the team is to visit all of the cells in thearea and discover the possible obstacles. A practical applicationof this task, for example, is the case when the team has to findthe locations of several objects or connect to a freshly discoveredBluetooth equipped host (e.g. cell phone) in order to establish awireless multi-hop communication chain between the host andthe base station. The only restriction is that the robots have to keepup the linear fixed-topology communication chain between thebase station and the robot farthest from the base station (measuredalong the chain).

In such tasks one of the most important parameter is the radiusof the radio coverage of a robot. We denote this by rcov andsuppose that the radio contact of two robots within the coverageof each other is guaranteed. In several papers, however, moreadditional environments are defined beyond rcov [6,21,33]. One ofthese is a so-called ‘‘precaution’’ or ‘‘safety’’ environment (heredenoted by rsafe), the radius of which is smaller than rcov . The basicidea in this is that the algorithm tries to keep each robot in thesafety environment of its neighbors, which greatly increases therobustness of the connectivity of the team. In the present paper, theidea of the safety environment is also employed, though its rolewillbe a bit different. In our scheme, it can happen that a robot leavesthe safety environment of its neighbor if an unforeseen obstacleblocks a planned move. However, the algorithm guarantees thatno robot can leave the radius coverage of its neighbors under anycircumstances.

First, we give the actual parameter values of rcov and rsafe in cell-size units C , which is defined as the side of a square cell. Then,based on the radius of the actual Bluetooth coverage the specificvalue of C can be given. Here the two environment parameters arechosen to be:

rsafe =32

√2C and rcov =

√262

C . (1)

Let a cell-position (P) be given by the Cartesian coordinates ofthe center of the cell. Based on the radii given above we definethe environments ES(P) and EC (P) of a certain cell P as the setsof cells that are completely within the circle centered in P withthe radii rsafe and rcov , respectively, as it is demonstrated in Fig. 4.In addition to these the environment EN(P) of P is also definedas the union of the cell P and all of its side neighbors (seeFig. 4). These environments will be important in the explorationalgorithm. EC (P) is the set of cells where the wireless connection isguaranteed with a robot located in P. ES(P) is the set of cells wherethe communication neighbors of the robot in P are planned to beafter finishing a team-move. Finally, EN(P) is the set of cells wherea robot in P can move to during a team-move. This means thatduring a team-move a robot is allowed to move no farther thana neighboring cell. Note that P is element of EN(P), that is, a robotcan also stay in place in a team-move. Besides, the movement ofeach robot is controlled so that its destination point is located in orvery close to the center of the destination cell.

4.2. The communication chain

As it was mentioned in the introduction, here the simplesttopology was chosen for the communication network: a linearchain (without branches and loops). In addition to this, thetopology is static, that is, the established radio connections cannotbe broken and no new connections are established. Therefore thecommunication scheme is quite simple: at one end of the chainis the base station fixed in a center of a cell (this is the origin),and every robot has two communication neighbors except thelast one. If, for example, the base station has a broadcast message

Fig. 4. The three environments of a robot used in the algorithm. The cells of the environments are shaded.

Page 5: A multi-robot exploration algorithm based on a static Bluetooth communication chain

534 T. Kovács et al. / Robotics and Autonomous Systems 59 (2011) 530–542

for the team then it sends the message to its communicationneighbor robot, and the message gradually propagates to the lastrobot in the chain. If a chain member has a unicast message to aspecificmember then themessage passes through the stations onlybetween the source and the destination members.

4.3. The wall-like obstacles

The area to be explored is supposed to contain unknown wall-like (boundary-like) obstacles. The definition of such an obstacleis simple: if it is possible for a robot to move from a given cellto a neighboring cell then there is no obstacle on the boundarybetween the two cells, otherwise the common side of the two cellsis considered to be awall-like obstacle. Any segment of the cell gridcan be an obstacle (wall), since we do not have a priori knowledgeabout the situation or the density of the obstacles. A wall-likeobstacle can be identified by giving the positions of the two cellsthe boundary of which is blocked by the obstacle. We assumethat an obstacle is discovered by the robot team only if a robottries to move from one cell to a neighboring one. This assumptionis realistic for many types of small mobile robot because in ourscheme a robot is generally situated in the center of a cell i.e. afew meters from the cell boundary, and from this distance it isgenerally impossible to determine if the center of the neighboringcell is reachable or not. Even if there is an object that seems tobe an obstacle for the robot in a few meters distance, there canbe gateway(s) discoverable only by a systematic search along theblocking object. This definition of the obstacles implies the leastloss of generality.

4.4. The exploration algorithm

Let us introduce the following notations:

• The members of the robot team are denoted by Ri, where i =

1, . . . ,N . The communication neighbors of Ri are Ri−1 and Ri+1(if they exist), R1 represents the host in the base station, RN isthe robot farthest from the base station on the communicationchain.

• The cells occupied by the robots are given by the set of vectors{P1, . . . , PN}.WithP1 = (0, 0), which is the position of the basestation.

• In each team-position {P1, . . . , PN} the algorithm determinesthe next team-position {P′

1, . . . , P′N}. Therefore, an elementary

move of the robot Ri is given by the vector (P′i − Pi).

The applied area exploration algorithm has to guarantee theconnectedness of the linear communication chain and find thebest possible exploration move. The proposed algorithmic schemedivides the robot team into a leader and follower robots asfollows:

• The robot RN is appointed as the leader of the team. This meansthat in a step cycle the elementary move for RN is calculatedfirst. Similarly, to the other referred methods, the basic conceptof this calculation is the utilities of the possible moves, whichare determined by the relative position of RN and the frontiercells.

• The moves of the other robots (R1 . . . RN−1) are calculated toguarantee the connectedness. Let us call them follower robots.However, their task is to not only ensure communication, butalso explore new cells, which has not been visited before.This ‘‘secondary’’ task is done automatically meanwhile theirprimary follower movement. The exploration movement of RNis controlled by the possible moves of the follower robots: if anexploration move cannot be followed, then this move will beforbidden for RN .

The next position of the team is calculated in a recursive way.First, the set of possible new positions for RN is determined andarranged in a priority order. (Let us call this ordered set of positionsas Priority List of New Positions and denote it by PLNN ). Thenthe algorithm examines the first element of PLNN whether it ispossible for RN−1 to follow this step. This is done by determiningthe PLNN−1 for RN−1. If this is empty the move P′

N − PN is notpossible and the next element in PLNN is examined in the sameway. If PLNN−1 is not empty then the same is done, that is, the firstelement of PLNN−1 is examined whether it is acceptable for RN−1;if it is not then the procedure takes the next element in PLNN−1. Sothe samemethod is repeated at each level fromN down to 1,wherethe ith level corresponds to examinations of the elements in PLNi.This recursive method continues until an acceptable new positionis found in PLNN , which means that an acceptable new positionis found for each robot. This method also ensures that the bestpossible team-move is found, provided that each PLNi is orderedfrom an optimality point of view.

This recursive scheme can be well represented by a decisiontree with N − 1 hierarchy levels. Here the highest hierarchy nodeand its outgoing edges correspond to the leader robot and thepossible new positions in PLNN so that the edges are arranged inpriority order from left to right (see Fig. 5). The next hierarchy levelrepresents the possible new positions in the PLNN−1 and so on.Finally, the leaves of the tree correspond to (completely given) newpositions of the team, where the leftmost node is considered to bethe most optimal position for the team (provided that the betterposition have higher priority in each PLNi). However, not every newposition is possible; some of them are forbidden by obstacles orthe danger of breaking up the communication chain. The recursivealgorithm described above finds the leftmost possible (i.e. notforbidden) team-position. The determination of thewhole plannedteam-position can be given in a concise way by the followingrecursive function

Team_New_Position (i, [P′

i+1 · · · P′

N])

=

[P′

i, Team_New_Position (i − 1, [P′

i · · · P′

N])] or [] if i > 1[(0, 0)] or [], if i = 1 (2)

where [P′i+1 . . . P′

N] are the planned position of Ri+1 . . . RN , andP′

i is the first position in PLNi for which the Team_New_Position(i−1, [P′

i . . . P′N]) function call returns a nonempty list. If there is

no such an element in PLNi the function returns also an empty list.Thus, the function call Team_New_Position(i, [. . .]) always returnsa position list of length i, and this list increases by one newposition in each recursive cycle. It can be seen that the functioncall Team_New_Position(N − 1, [P′

N]) returns an entire plannedfollower team-position [P′

1 . . . P′N−1] if it exists for P′

N, or anempty list, otherwise.

The algorithm of the recursive function Team_New_Position()are detailed by Algorithm 1. It can be seen that there is yet anadditional element in this algorithm: in order to return a nonemptylist of new positions the condition

PLN(j) ∈ {P′2, . . . , P′

i−1, P′i+1, . . . , P′

N} \ {(0, 0)} (3)

must be also true. With other words this means that the possiblenew position PLN(j) can be selected only if it is not coincides withthe other possible new positions of the fellow robots except thebase station (0, 0). Since this condition is examined in all levels(i.e. for all values of the index i), the entire planned new team-position will have this property, that is, no new robot-position willcoincide with any other except in the base station. This propertyis very useful from collision avoiding point of view, because iteffectively keeps the robots in a cell-size distance from each other.To bemore specific, two robotsmay arrive into the same cell only ifone of them encountered a new obstacle and could not accomplish

Page 6: A multi-robot exploration algorithm based on a static Bluetooth communication chain

T. Kovács et al. / Robotics and Autonomous Systems 59 (2011) 530–542 535

Fig. 5. The decision tree of the robot team used in the recursive move planningmethod. The links represent the possible decisions about the new position, theleafs correspond to complete team-positions. The shaded leafs represent new team-positions forbidden by the connectivity constraint.

the planned move. However, no three or more robots can arrive inthe same cell. The only exception is the base station (0, 0), where amuch more advanced robot navigation system is supposed than inthe area outside (see for e.g. [15]).

The main exploration algorithm is given in Algorithm 2. Firstthe list PLNN is generated by the Priority_List_of_New_Positions(N)function call, and then the algorithm examines each possiblenew position in PLNN in priority order if it can be competedto an entire team-move. This is done by calling the functionTeam_New_Position (N − 1, [PLNN(j)]). The object called Mapcontains all information that the robot team gathered about theexplored area. The leader robot registers into its Map not onlythe real obstacles encountered by the team but also ‘‘virtualobstacles’’, which plays important role in the whole explorationalgorithm. These virtual obstacles are considered as real ones inthe Manhattan path finding procedure (see later) of the leaderrobot. The registration of a virtual obstacle is implied by twodifferent events: firstly, if RN finds move in its PLNN that wouldtake it outside the EC environment of RN−1 (denoted by EC (PN−1))then blocks that move by a freshly registered virtual obstacleand substitutes the proposed position by PN in PLNN , i.e. a nullmove for RN . Secondly, if the function call Team_New_Position(N −

1, [PLNN(j)]) returns an empty list for the element PLNN(j) of PLNNthen also a virtual obstacle is registered into Map between thepositions PN and PLNN(j). Actually, a virtual obstacle refers to thefact that the chain-like team cannot reach a specific cell with thehelp of a specific move. This cell, however, may be reached laterfrom a different direction. In the next team-move planning cyclethe possible new positions of RN blocked by virtual obstacles willnot be selected into PLNN . Without the application of the virtualobstacles, the team would keep trying to reach these presently‘‘unreachable’’ cells not realizing the hopeless situation. It shouldbe noted that the virtual obstacles blocks only RN , the other robotsare blocked by only the real obstacles. The real obstacles exploredbefore are taken into consideration when the robot Ri calculatesits PLNi. A robot may discover unexplored obstacles only when ateam-move is calculated and the ‘‘start move’’ procedure launchesat the individual robots. Therefore, it is not sure that a plannedteam-move can be realized. If a robot cannot enter into an adjacentcell because of a ‘‘real’’ freshly discovered obstacle then it stays inits present cell, say, in the center of the cell. If the function callPriority_List_of_New_Positions(N) does not yield any explorationmove (returns an empty list) in the beginning of the algorithm,then themove takes a backtrackmove (denoted byBacktrack Team-Move in the algorithm). This is necessary, because in this case the

leader robot is completely closed into an explored area by virtualor real obstacles, and only the backtracking solves the situation.Rooker and Birk also employ this method in [20]. If no backtrackis possible, i.e. we are back in the original position, then all virtualobstacles are deleted and the exploration continues.

At the end of the moving phase, the robots, one after other,broadcasts their new positions and the possibly discoveredobstacles. It is enough to give their new positions relative to theprevious one, so one of the five possible moves (the four directionsplus the null move) is given. This information, together with thepossible existence of the obstacle, can be encoded in four bits (orin one byte for the sake of simplicity); therefore, each robot sendsonly one byte in a step cycle.

A crucial element of the algorithms above is the Prior-ity_List_of_New_Positions() function, which constructs the PLNi ofthe individual robots and is different for RN and Ri (i < N). Thisprocedure is responsible for the exploration strategy if i = N andfor the follower strategy if i < N . It is detailed in Algorithm 3. Theprocedure starts out from the set of possible new positions (de-noted by P) that are enabled by the known real or virtual obsta-cles. Then it calculates the shortest obstacle avoiding Manhattanpath to an unexplored cell from each cell that are reachable by theenabledmoves. (Obviously, there are atmost four such cells). Moreprecisely, for each Pi,j ∈ P it calculates the path-length

dj = min{MDist(C, P′

i,j) | C ∈ Map(unexplored)}, (4)where theManhattan distance between two cell-positions A and Bis denoted by MDist(A, B), and Map(unexplored) denotes the set ofunexplored cells in Map. The Manhattan distance between A andB is defined as the shortest (obstacle avoiding) path on a squaregrid, which contains A and B as nodes. The minimal value in (4)is found by applying a flood-fill algorithm starting at P′

i,j, and theflood-fill algorithm stopswhen the first unexplored cell is reached.Thus, it is guaranteed that the closest unexplored cell is found.When the dj values are given, the priority order of the list of newpositions is determined so that the P′

i,j positions with smaller djvalues precedes the ones with higher dj values. This correspondsto the usual path planning strategy in case of the frontier-basedmethods, that is, the most preferred new position will be that fallsclosest to the frontier. If there is no path found to any unexploredcell then the function returns an empty set as PLN. In the casewheni = N , this is the primary point of view in lining up the list. If thereare more P′

N,j positions with the same dj value in PLNN , then themove with the higher scalar product (P′

N,j −PN) ·PN will precedesthe others, because this new position increases the distance fromthe base station more than the others. This is the secondary pointof view in lining up.

In the case when i < N , this function is responsible for findinggood chain maintaining moves. Therefore the list of enabledpossible new positions is determined so that Ri remains in theenvironment EC (Pi−1) and ES(P′

i+1), where Pi−1 is the presentknown position of Ri−1, and P′

i+1 is the proposed new positionof Ri+1. This means that Ri follows Ri+1 to remain in the narrower(safe) vicinity of its new position but at the same time remains inthe wider (radio coverage) vicinity of the present position of Ri−1.Corresponding to that, the set of possible new positions is given asES(P′

i+1) ∩EN(Pi) ∩EC (Pi−1). This ensures that the communicationchain will not break even if unknown obstacles block some movesin the team-move. In case of a follower robot the exploration issecondary compared to the optimal follower strategy. Therefore,the first lining up by increasing dj value is rearranged so thatthe positions that are element of EN(P′

i+1) precedes the others.The idea behind this is that the best thing for Ri is to be in theenvironment not only ES(P′

i+1) but also EN(P′i+1) (the set of Phigh),

so the robot Ri+1 will have a bigger variety of moves in the nextstep. If the position Pi is element of Phigh then the optimal newposition is givenwithout anymove, so best action is to stay in place.In this case this position is put into the first place in PLNi.

Page 7: A multi-robot exploration algorithm based on a static Bluetooth communication chain

536 T. Kovács et al. / Robotics and Autonomous Systems 59 (2011) 530–542

4.5. The stop condition

The exploration is stopped if the stop_condition in Algorithm1 became true. Obviously, this condition should be set to true ifall of the cells in Map has been explored. However, this cannotbe achieved for arbitrary obstacle configuration, even if the areais small compared to the possible coverage of the robot chain.Therefore, another stopping condition is needed too. The durationof the exploration is restricted by the limited power storagecapacity of the individual robots. The bigger part of the electricpower of the robots is consumed by the moves and the smallerpart by the Bluetooth radio usage. This latter part is approximatelythe same for any robot, and it is proportional to the explorationtime i.e. to the number of team-moves. Based on this, we supposethat starting with a fully charged accumulator any robot can makethe same limited number of moves, and let denote this number byMAX_MOVES. So the duration of the exploration is restricted by thecondition

max(Moves(Ri)) ≤ MAX_MOVES, (5)

where Moves(Ri) denotes the number of total moves done by Ri.For example the MAX_MOVES parameter for the Lego NXT Tribotis approximately 400.

4.6. Recovery in the case of robot-failure

The recovery strategy in case of robot-failure is treatedseparately in this subchapter. The main purpose is to make theteam able to continue the exploration after a member of a teamhas broken down. Under the term of ‘‘breakdown’’, we mean thatthe robot either cannot take part in the exploration movements orthe communication of the team. In any case, the communicationchain also fails, since if the robot does not follow the collectivemovement of the other robots, then, sooner or later, it will be out ofthe radio coverage of its neighbors. Therefore, such a breakdown isperceived by the failure of the communication chain, in any cases.Moreover, each robot is able to detect the state of the link with itsneighbors (in case of Bluetooth, ZigBee or Wi-Fi communication),and therefore, the team is aware of the identity of the failed robot.

Regarding these, the recovery method detailed by Algorithm4 is proposed. The basic idea is very simple: each robot beyondthe point of the failure moves to the position of its neighbor withlower index, and then the robots re-establish the communicationchain omitting the robot responsible for the failure. After thisthe exploration can be continued but with a shorter chain. Thisrecovery method works only if all of the robots aware of thepresent positions of the other (working) robots, which is ensuredby Algorithm 2.

A special case is when the leader robot RN breaks down. Sofar, we have not treated the question that where the algorithmiccalculations (of the team-moves) takes place physically. In fact,these calculations can be performed by any of the robots (includedR1, the base station), since it does not require a high computingcapacity. Considering this, the leader robot, RN , can be replacedby RN−1, in the case of its breakdown. Actually, if all of the robotsperform all of the algorithmic calculations, then there is no need ofbroadcasting the planned team-move, only the synchronization isnecessary so that the team members start move at one moment.

A simple state-flow chart of the main exploration process,including the backtracking and recovery moves, can be seen inFig. 6.

4.7. Feasibility of the algorithm

The implementation of the introduced algorithm to a robotteam does not require highly qualified robots. In general terms, the

Fig. 6. The main flow chart of the proposed exploration algorithm. The details ofthe main processes are given in Algorithms 1–4.

team members must be equipped with the following capabilities(beyond the communication capability detailed before):

• Navigation capability, with the accuracy of the order of 2.0E−3.(Similarly to the papers [34–36], here the accuracy is defined as⟨1L⟩/L, where ⟨1L⟩ is the mean value of the error of positionafter driving a path of length L.)

• Capability of refreshing the positions of the robots with accuratevalues, if they are in the base-station cell.

• Local obstacle detection capability. This means that the robot isable to detect an obstacle (even a fellow robot) in a distancecomparable to its size.

• Autonomously working, small obstacle and fellow robot avoidingalgorithm.

By themeans of the definition given in the first point, if an explorerrobot with navigation accuracy of 2.0E−3 travels through n cells(of size C), the error of its position will be (2.0E–3)nC . Thus, aftertraveling a path of 500 cells, the navigation uncertainty will bein the size of a cell. In the next chapter, we test the proposedalgorithm on a simulated area of size 15 × 15 cells. In this area,the length of longest exploration path is under 100 cells if theobstacle density is small. In the cases of higher obstacle densitiesthis data is 100–400 cells, but the robots visits more time the basestation, so the position is refreshed (according to the capabilitygiven in the second point). In a larger test area, the robot teamshould be directed back to the base station in order to refresh theirposition (or a more accurate navigation is needed). The navigationaccuracy of 2.0E−3 can be achieved easily by a simple odometrynavigation. Borenstein and Feng [35], introduced a method tocompensate systematic odometry errors by calibration. With asimulator of a differential-drive three-wheeled robot (tribot) theymeasured navigation accuracy of 1.0E−2 in the not-calibrated and1.0E−3 in the calibrated case. Papadopoulos and Misailidis [34]adopted this method for a real differential-drive tribot withdrive synchronization, and achieved navigational accuracy under1.0E−3. Pasztor et al. [36] measured the navigation accuracy ofthe NXT differential-drive (real) tribot with drive synchronizing;however, they used no calibration, but compass sensor in orderto control the direction in long straight paths and in rectangularturns. The accuracy with these robots was 1.0E−2 on averageand 2.0E−3 in the cases when selected hardware elements wereused. If the turning angle is much less accurately controllable by asimple odometry, which is the case of the NXT tribot, the use of thecompass sensor is necessary.

The local obstacle avoiding is also feasible by relatively simpletools. The local obstacle detection can be solved by touch,

Page 8: A multi-robot exploration algorithm based on a static Bluetooth communication chain

T. Kovács et al. / Robotics and Autonomous Systems 59 (2011) 530–542 537

ultrasonic or laser sensors, and after the obstacle detection a simplerectangular or more advanced obstacle avoiding method can beapplied (see for e.g. [37] or [38]).

5. The optimal algorithm in the obstacle-free case

In most of the papers, dealing with exploration strategies themain measure of effectiveness of the algorithm is the explorationtime with fixed number of robots [4–6,17,20]. Based on this, herewe call an algorithm ‘‘optimal’’, if it explores (visits) all of thereachable cells under the given conditions in a minimal time. Letus measure the exploration time in time units, where one time unitis the time duringwhich a robotmoves from a cell to a neighboringone.

We state that the algorithm proposed in the previous sectionwith the given conditions is optimal in the obstacle-free case. Tobe more specific:

If each robot must be in the ES environment of its neighbor(s) inevery step cycle; and if one of the robots is confined in the cell of thebase station meanwhile the exploration, then the algorithm given byAlgorithms 1–3 is optimal in the obstacle-free case. (The area is dividedinto cells by a square grid).

In order to prove this statement, first we show that if there isno obstacle in the area, then the exploration time cannot be lessthan 9(N − 1) − 1 time units with N robots. Let us call a cell as‘‘border cell’’, if it is reachable by the team and is side neighbor toan unreachable cell. In our case, there are 8(N−1) border cells andthey are situated along the sides of a 2N − 1 by 2N − 1 square. Itis easy to see that two or more robots cannot occupy border cell inthe same time-step, otherwise the condition of connectivity withthe base station cannot be satisfied. Thus, only one new bordercell can be explored in one time-step (i.e. during one time unit),whichmeans that the exploration of the set of border cells takes atleast 8(N − 1) time units. The shortest path to the closest bordercell from the base station is N − 2 cells long, therefore, at least8(N − 1) + N − 2 = 9(N − 1) − 1 time units are necessary toexplore all of the cells.

As a second step, we show that the exploration algorithm givenin Algorithms 1–3 executes the exploration in 9(N − 1) − 1 i.e.the minimal number of time units. In the first phase, the leaderrobot walks on the shortest path toward the closest border cell. Itis because in this phase the first element of PLNN will be alwaysthe new position that increases the distance from the base stationthe most i.e. has the bigger scalar product (P′

N,j −PN) · PN (seeAlgorithm 3). This phase requires N − 2 time units. In the secondphase, the leader robot walks through the set of border cells one byone, because in PLNN the first element (after unreachable positions)will be always the next border cell, being the closest reachableunexplored cell (see Algorithm 3). On the boundaries betweenthe border cells and the unreachable cells, the algorithm registersvirtual obstacles. While this movement of the leader, the followerrobots form a moving line of connection between the base stationand the leader robot, so they explore all of the cells between thebase station and the border cells. Since there are no obstacles in thearea, this movement pattern is achievable. The exploration pathsof the robots in the second phase lie on concentric squares withthe base station in their center, as it is shown in Fig. 7. The secondphase requires 8(N − 1) time units, which is the number of theborder cells. The whole exploration takes altogether 9(N − 1) − 1time units. �

6. Tests of the proposed algorithm

In this chapter, we give a quantitative test of the proposedrecursive chain algorithm and compare its performance with thealgorithm published by Rooker and Birk [20]. We will call their

Fig. 7. The theoretical paths of the robots R2 . . . RN in an obstacle-free areafollowing from the Algorithms 1–3.

method as ‘‘reference method’’ (or ‘‘reference algorithm’’) in thefollowings. As itwasmentioned in the introduction their algorithmused a utility function in order to choose the best team-movefrom a relatively large pool of possible (or impossible) team-moves, which was generated randomly. They used different poolpopulations from 50 up to 1000, and a test area with rectilinearwall-like obstacles that divided the area into large rooms(compared to the cell size). In order to give a correct quantitativecomparison we implemented the reference algorithm with thepool population of 1000, and tested both in the same test areas.

One of the main differences between the two algorithms isthat the reference algorithm uses a utility function that takesinto consideration the utilities of the individual robot-steps withequal weight, and the roles of the robots are equal, so we can saythat the exploration task and the utility function is decentralized.Besides, it assumes a continuously working MANET system whilethe proposed algorithm does not. Another considerable differenceis that the team step planning method of the reference methodinvolves a high computational load, while our method requiresmuch less computation. This is because the reference methodcomputes the utility for 1000 team configurations, while ourmethod does a similar computation only for the leader robot. Thecomputational load has importance here, because these algorithmsbased on central computation. If we have not a high-performancecomputer in the base station, then one of the robots (or all ofthem) must carry on the computation. If all of the robots carryon the computation, then the time and energy of broadcasting thecomputed team-move can be saved.

Besides, we applied an other reference algorithm, too. Thisis a special case of Rooker’s algorithm, since its method is thesame, basically, but here the best utility team-move is chosenfrom the pool of all of the possible team-moves. This pool contains4N−1 team configurations, which is 47

≈ 16,000 in the case of thepresent test with N = 8. Let us call this a ‘‘Full search’’ algorithm.This algorithm provides the best possible team-move with thegiven utility function (see [20]). Actually, this method has only atheoretical relevance, because the calculation of the team-movesincludes the evaluation of the utility function of about 16,000 teamconfigurations, which is 16 times longer than the simple referencealgorithm above.

6.1. The simulation areas

From the proposed algorithm it follows that the maximal sizeof the area can be explored is (2N +1)2, since the robots are forcedto be in the safety environment (ES) of each other. This restrictionis applied for all of the tested algorithms.

Page 9: A multi-robot exploration algorithm based on a static Bluetooth communication chain

538 T. Kovács et al. / Robotics and Autonomous Systems 59 (2011) 530–542

Fig. 8. A random (a) and the two building-like (b and c) obstacle configurations. The members of the team are denoted by numbered circles, the explored and unexploredcells are marked by gray or white color. The random obstacle configuration has obstacle density of ρ = 0.15.

Two kinds of obstacle configurationwere used in the simulationareas. One configuration type contained randomly distributedobstacleswith obstacle densityρ, that is, any cell-dividingwallwaschosen to be an obstaclewith probabilityρ without any correlationbetween the obstacles. The other configuration type was similarto that of the test area used by Rooker and Birk, which was abuilding-like environment with smaller or larger rooms, separatedby rectilinear walls and opened doors. Here we constructed twodifferent such a building-like obstacle configurations, which canbe seen in Fig. 8(b) and (c). As a comparison, a random obstacleconfiguration of the first type with obstacle density ρ = 0.15 isalso shown in Fig. 8(a). All simulation areas were chosen to be thesame size of (2N+1)by (2N+1) cells square bordered by obstacles.Therefore, both algorithms could reach all of the cells (unless therewere areas isolated by random obstacles).

6.2. Results and discussion

The algorithms were tested by computer simulation usingMATLAB for parameter N = 8, which corresponds 7 exploringrobots. Therefore, the simulation area contained 225 cells to beexplored. In the case of the randomobstacle configuration type, thevalue of ρ was chosen to be 0, 0.01, 0.02, 0.05, 0.1, 0.15 and 0.2. Thecaseρ = 0 corresponds to the obstacle-free configuration. For eachnon-zero values of ρ five different obstacle configurations weregenerated. The tests were performed with the limiting parameterMAX_MOVES = 400, which corresponds to the average measuredvalue in the case of theNXT Tribot.With the tests, wemeasured theexploration times needed to explore 50% and 75% of the total 225cells, whichmeans 112 and 169 cells. The 100% exploration cannotbe required, because a certain fraction of the 225 cells (0%–20%)is not reachable under the constraint of connectivity with thebase station. This is due to the randomly positioned obstacles. Theexploration times were measured in time units (time-steps) wedefined in Section 5. Then the mean value of the exploration timesthat belong to the same exploration ratio and obstacle density wascalculated and graphed as a function of ρ. The results of these testsare shown in Fig. 9(a) (50% exploration) and (b) (75% exploration).

It can be seen that in the case of 50% exploration, the proposedalgorithm performs definitely better than the reference algorithmfor an obstacle-free area or low obstacle density, and for higherobstacle densities the performances of the two methods aresimilar. The comparison with the Full search method is interestingtoo: at zero and 0.01 densities the two methods produces almostthe same exploration times and for higher densities the Full searchmethod performs better. In Section 5, we have shown that theproposed algorithm is optimal in the present conditions for theobstacle-free case (ρ = 0). Thus, the Full search method seems tobe also optimal in the obstacle-free case when only half of the cellsmust be explored. The situation is different when the algorithms

Fig. 9. The ten-based logarithm of the exploration times of the proposed and thereference methods on the random obstacle test areas as a function of the obstacledensity for 50% (a) and75% (b) exploration rate of the total 225 cells. The explorationtime is measured in time units defined in the beginning of Section 5.

have to accomplish 75%. In this case, for low densities (zero and0.01) the proposed algorithm performs better than the other two(being the optimal one), and beyond ρ = 0.01 the Full searchmethod achieves the best results. However, in the 75% case theproposed method performs definitely better than the referencemethod for all examined obstacle densities.

In an obstacle-free area, the robot team can explore all of the225 cells, so we can measure the exploration times for the case of100% exploration ratio. Table 1 shows these measured times in thefirst twodata column. The advantage of the proposedmethod is themost pronounced in this case. Its exploration time (62 time units)

Page 10: A multi-robot exploration algorithm based on a static Bluetooth communication chain

T. Kovács et al. / Robotics and Autonomous Systems 59 (2011) 530–542 539

Fig. 10. The paths of the individual robots measured by the robot-simulator (black lines) in an obstacle-free (a) and a building-like (b) test area. The obstacles are markedby shaded area; the position of the base station is marked by a thick circle. The linear sizes are given at the bottom of the figures in meters.

Table 1The exploration times and success rates of the reference, the Full search and the proposed method for the obstacle-free and building-like test areas. The exploration time isgiven if the success rate is 100%.

Obstacle-free area Building-like area with bigger rooms Building-like area with smaller roomsSuccess rate (%) Exploration time Success rate (%) Success rate (%)

Rookers’ algorithm 100 372 25 43Full search algorithm 100 117 67 84Proposed algorithm 100 62 75 77

is about half of that of the Full search method and one sixth of thatof the reference method. This exploration time corresponds to theminimal theoretical value of 9(N − 1) − 1 with N = 8, which wasgiven in Section 5. The table also shows the success rates of theexplorations in percents in the cases of the building-like obstacleconfigurations. In these simulations all three algorithms stoppedbefore the 100% exploration success, because one of the robotsreached theMAX_MOVES = 400 value (see Eq. (5)). In this case, wecan compare the success rates done before the algorithms stopped.The proposed and the Full search algorithms have similar successrates (about 70% and 80% in the cases of the two configurations),in the first case the former is better and in the second case thelatter. However, they are much better in this measurement thanthe reference algorithm having only 25% and 43% success rates.

Since the proposed algorithm is optimal in the obstacle-freecase, it is not surprising that it achieves the best explorationtime value in the case of zero or low obstacle density. However,the question arises that why does it perform better than thereference method also for non-zero obstacle densities when 75%or higher success ratio is required, considering that the referencemethod is not restricted by the fixed chain-like topology as theproposed method. The answer can be found in the decentralizedutility system of the exploration strategy of the reference method.This algorithm, similarly to other frontier- and utility-basedalgorithms [2–4,6], determines the utility functions regarding theindividual robots and takes them into consideration with equalweight. When there is a connectivity constraint with each otherand the base station, then the robots, following their own utility-based exploration purposes, often obstruct each other, so thewhole team-moveswith difficulty to an unexplored area. Consider,for example, the following situation: two robots, Ri and Rj, are atthe frontier, but on the opposite sides of the base station as farfrom each other as permitted by the connectivity constraint, andthe other robots are between them ensuring the connectivity. Inthis situation the team should move definitely into one direction,towards Ri or Rj, but the utility is approximately equal for both

direction, therefore the team is in a deadlock situation for a whilecaused by the decentralized utility system. In the referencemethodthere is a random factor, which helps the team out of this deadlocksooner or later. This effect is more pronounced if high explorationsuccess is required (75% or beyond it), because the algorithmmustreach also cells far from the base station. The proposed methodis free from this obstructive effect, because there is an appointedleader, RN , and the exploration utility of this robot overridesthat of the others. Thus, the leader determines the explorationstrategy alone; the other robots are responsible for ensuring theconnectivity in the first place. It seems that this kindof task divisionis useful in the exploration strategy.

6.3. Feasibility test by robot-simulator

The tests introduced above characterize the effectivenessproposed algorithm regarding the number of exploration stepsversus the explored area. However, the feasibility of the algorithmis also an important question, and we treated this at the end of theprevious section in a rather theoretical way. Besides, we tested theproposed method by a simulation implemented in the MicrosoftRobotic Developer Studio (version 2010), which provides a robot-simulator environment and a realistic package for the Lego NXTTribot as well [39]. The seven simulated NXT Tribots (the eighthrobot is the base station) were three-wheeled differential-driveentities with two (0–9 V) DC motors and rotation sensors on themotor shaft by which the motor rotation can be controlled withthe accuracy of 1°. The robots were equipped by a front and a reartouch sensor and a compass sensor. This equipment satisfies therequirements given in the last subsection in Section 4. The appliedtraveling and turning speeds were approximately 30 cm/s and180°/s, respectively. At these speed values, the average accuracyof the navigation was 2.0E−3, which can be really achieved byselected hardware elements [36] (or applying calibration). Thephysical dimensions of a simulated robot are 22 × 18 × 14 cm,its weight is approximately 450 g.

Page 11: A multi-robot exploration algorithm based on a static Bluetooth communication chain

540 T. Kovács et al. / Robotics and Autonomous Systems 59 (2011) 530–542

We tested the robot team in two environments: one obstacle-free and one with rectangular obstacles (see Fig. 10(b)). The areawith obstacles was bounded by four rectangular walls, so theyformed a 32 m long and 32 m wide room. The robots divided theircommonmap of area into 4 by 4meters square cells, and navigatedthemselves in this coordinate system by odometry and compasssensor. Due to the communication phase at the end of each team-move, all of the robots were aware of the positions of the otherrobots (with the navigational error), and the discovered obstacles.The specific purpose of the robots was simply to visit all of thecells that can be reached by the robots. This means that a cell wasconsidered explored if a robot entered into it and occupied thecenter of it.

Algorithm 1 The algorithm of the recursive functionTeam_New_Position (i, [P′

i+1 . . . P′

N])

PLN := Priority_List_of_New_Positions(i, P′

i+1);if i > 1 then {

for all positions in PLN do {[P′

1 . . . P′

i−1] := Team_New_Position (i − 1, [PLN(j), P′

i+1 . . .P′

N]);if [P′

1 . . . P′

i−1] = [] and PLN(j) /∈ {P′

2, . . . , P′

i−1, P′i+1, . . . ,

P′

N}\{(0, 0)}then return [P′

1 . . . P′

i−1 , PLN(j) ];}return [] (empty list) ; }

if i = 1 then return PLN (which is a list containing only (0, 0) or anempty list);

Algorithm 2 The main exploration algorithmwhile not Stop_Condition_Is_True do {PLNN := Priority_List_of_New_Positions(N);if PLNN = [] (empty list) then if Backtrack Team-Move is possiblethen do Backtrack Team-Move;

else delete virtual obstacle fromMap;for all position PLNN(j) in PLN do {

if PLNN(j) /∈ EC (PN−1) then {register a virtual obstacle between the cells PN and

PLNN(j) in Map;PLNN(j) :=PN (i.e. null move); }

[P′

1 . . . P′

N−1] := Team_New_Position (N − 1, [PLNN(j)]);if [P′

1 . . . P′

N−1] = [] then {P′

N = PLNN(j);Robot Ri starts move (P′

i − Pi) (i = 1, . . . ,N);Robot Ri broadcasts its new position and the discovered

obstacles;Robot Ri refreshes itsMap;break (for cycle); }

else register a virtual obstacle between the cells PN and PLNN(j)in Map;}}

The obstacle discovering and the kin robot collision avoidingwas solved by the same simplemethod: if the front touch sensor ofa robot sensed contact, then the robot moved backward 30 cm andtried to avoid that object on a rectangular path. If it could avoid theobject in such a way, then it was a kin robot, else it was consideredan extended obstacle. Thus, two robots could come into physicalcontact with each other, which slightly deflected the robots fromtheir original directions. Because of this, at the beginning of everymove cycle the directions of the robots were corrected with thehelp of the compass sensor. In the previous section, we discussedthat two robots can occupy the same cell in the same step cycle

only if one of the two robots encountered an extended obstacle onits way out of the cell, and no three or more robots can occupythe same cell. This makes the implementation of the collisionavoidingmuch easier. Note thatwe used only odometry navigation(with compass) and touch sensors, which is a simple solution,really.

Algorithm 3 The algorithm of the Priority_List_of_New_Positions(i)function (i = N case) and the algorithm of the Priority_List_of_New_Positions(i, P′

i+1) function (i < Ncase)Determine the set of enabled new positions:

P := EN (PN) in the case of i = N;P := ES(P′

i+1)∩ EN (Pi)∩ EC (Pi−1) in the case of i <1;Delete all positions from P that are blocked by known obstacle.Virtual obstacles apply only to the case of i = N;for all P′

i,j ∈ P do {dj := min{ MDist(C, P′

i,j)| C is unexplored};(i.e. find the shortest obstacle avoidingManhattan path from

P′

i,j to an unexplored cell;) }if there is no Manhattan path to unexplored cell then return [];Line up the P′

i,j elements into PLN by increasing dj value;if i = N then {

Line up the P′

N,j elements with the same dj value bydecreasing (P′

N,j−PN) · PN value; }if i < N then {

Phigh := P∩ EN (P′

i+1); (set of higher priority elements)Put the elements of Phigh in the first places in PLN;if Pi ∈ Phigh then put Pi in the first place in PLN; }

return PLN ;

Algorithm 4 The recovery algorithm after a link failureif both links of Rj fail then {

Ri moves to the position Pi−1 (the present position of Ri−1)for i = (j + 1), . . . ,N;

Establish the link between Rj−1 and Rj+1 (continue explo-ration without Rj); }if the link between Rj and Rj+1 fails then {

Ri moves to the position Pi−1 (the present position of Ri−1)for i = (j + 1), . . . ,N;

if 1 < j < (N − 1) then {Establish the link between Rj−1 and Rj+1 (continue

exploration without Rj);if it is not successful then { Establish the link between

Rj and Rj+2 ;(continue exploration without Rj+1); }

if j = 1 then {Establish the link between R1 and R3 (continue

exploration without R2);if it is not successful then R2 will be the base station

instead of R1 ; }if j = (N − 1) then {

Establish the link between RN−2 and RN (continueexploration without RN−1);

if it is not successful then RN−1 will be the leader robotinstead of RN ; }}

Fig. 10 shows the trajectories of the seven robots in the twoexplored areas. These trajectories on the obstacle-free area forma regular pattern with concentric squares, which was discussedin the previous section. In the other area, one can see clearly thetwo contact points with the obstacles, which is resulted from thecollision avoiding method.

Page 12: A multi-robot exploration algorithm based on a static Bluetooth communication chain

T. Kovács et al. / Robotics and Autonomous Systems 59 (2011) 530–542 541

7. Conclusions

Based on the results obtained from the tests of the implementedcommunication chain and the proposed algorithm the followingconclusions can be drawn:

• The dual-radio-based Bluetooth Scatternet proposed by Sohrabiet al. [28] and Leopold et al. [29] for ad hoc sensor networkis well applicable for a mobile robot team as a multi-hopcommunication network, and the total time for one hop in thecase of a few bytes message is approximately 60 ms, which istolerable.

• The proposed fixed chain-like organization of the robot teamin the exploration algorithm supports the dual-radio-basedscatternet and ensures the continuous radio connection withthe base station.

• With the help of a simple proof, we have shown that the pro-posed (fixed chain-like team) exploration method is optimal inthe obstacle-free case under the constraint of the connectivitywith the base station. We demonstrated it also experimentallyby comparing the proposedmethodwith utility-function-basedmethods.

• The proposedmethod produces better exploration times at lowobstacle densities and at 75% or 100% exploration ratios thana (decentralized utility function and MANET-based) referencemethod. The good behavior of the method even at non-zero ob-stacle density is due to that there is a ‘‘leader’’ in the team, andthe step utility of this leader robot overrides those of the otherrobots in the area exploration strategy. Therefore, this methodavoids the effect that the individual exploration purposes ofthe team members obstruct each other, which is characteristicfor the decentralized utility-function-based methods under theconstraint of connectivity. In the other hand, these latter meth-ods have other good properties in comparing to the proposedmethod. (They aremore robust and can be faster in an areawithmany random obstacles). Therefore, inventing a hybrid methodthat joins the good properties of the two kinds of method canbe a subject of future work.

• We demonstrated the feasibility of the proposed algorithm bya robot-simulator in an obstacle-free and a building-like envi-ronment. The collision avoiding strategy, which is built in thealgorithm, supports the specific implementation of the explo-ration in the simulator or in a real environment.

References

[1] C. Stachniss, W. Burgard, Exploring unknown environments with mobilerobots using coverage maps, in: International Joint Conference on ArtificialIntelligence, 2003, pp. 1127–1134.

[2] B. Yamauchi, Frontier-based exploration using multiple robots, in: Proceed-ings of the Second International Conference on Autonomous Agents, ACMPress, 1998, pp. 47–53.

[3] R. Simmons, D. Apfelbaum, W. Burgard, D. Fox, S. Thrun, H. Younes,Coordination for multi-robot exploration and mapping, in: Proceedings of theNational Conference on Artifcial Intelligence AAAI, 2000, pp. 852–858.

[4] W. Burgard, M. Moors, C. Stachniss, F. Schneider, Coordinated multi-robotexploration, IEEE Transactions on Robotics 21 (3) (2005) 376–378.

[5] W. Sheng, Q. Yang, J. Tan, N. Xi, Distributed multi-robot coordina-tion in area exploration, Robotics and Autonomous Systems 54 (2006)945–955.

[6] J. Vazquez, C. Malcolm, Distributed multirobot exploration maintaining amobile network, in: IEEE International Conference on Intelligent Systems,2004, pp. 113–118.

[7] Y. Pei, M.W. Mutka, N. Xi, Coordinated multi-robot real-time exploration withconnectivity and bandwidth awareness, in: IEEE International Conference onRobotics and Automation, ICRA, 2010, pp. 5460–5465.

[8] R. Zlot, A. Stenz, M. Dias, S. Thayer, Multi-robot exploration controlledby a market economy, in: IEEE International Conference on Robotics andAutomation, IROC, 2002, pp. 3016–3023.

[9] M. Berhault, H. Huang, P. Keskinocak, S. Koenig, W. Elmaghraby, P. Griffin,A. Kleywegt, Robot exploration with combinatorial auctions, in: IEEEInternational Conference on Intelligent Robots and Systems, IROS, 2003,pp. 1957–1962.

[10] M. Nanjanath, M. Gini, Dynamic task allocation for robots via auctions,in: International Conference on Robotics and Automation, ICRA, 2006,pp. 2781–2786.

[11] T.S. Dahl, M. Mataric, G.S. Sukhatme, Multi-robot task allocation throughvacancy chain scheduling, Robotics and Autonomous Systems 57 (6–7) (2009)674–687.

[12] S. Se, D.G. Lowe, J.J. Little, Vision-based global localization and mapping formobile robots, IEEE Transactions on Robotics 21 (3) (2005) 364–375.

[13] K. Bolla, T. Kovacs, G. Fazekas, Compact image processing based kinrecognition, distance measurement and identification method in a robotswarm, in: International Joint Conference on Computational Cybernetics andTechnical Informatics, ICCC-CONTI, 2010, pp. 419–424.

[14] S. Knoop, S. Vacek, R. Dillmann, Fusion of 2d and 3d sensor data forarticulated body tracking, Robotics and Autonomous Systems 57 (3) (2009)321–329.

[15] M.J.B. Krieger, J.B. Billeter, The call of duty: self-organised task allocation in apopulation of up to twelve mobile robots, Robotics and Autonomous Systems30 (2000) 65–84.

[16] S. Rutishauser, N. Correll, A. Martinoli, Collaborative coverage using a teamof networked miniature robots, Robotics and Autonomous Systems 57 (2009)517–525.

[17] A. Atyabi, S. Phon-Amnuaisuk, C.K. Ho, Navigating a robotic team in anuncharted 2D landscape, Applied Soft Computing 10 (2010) 149–169.

[18] X. Cui, T. Hardin, R.K. Ragade, A.S. Elmaghraby, A team-based Fuzzy-logiccontrol mobile sensor network for hazardous contaminants localization, in:IEEE International Conference on Mobile Ad-hoc and Sensor Systems, 2004,pp. 194–203.

[19] K. Cheng, P. Dasgupta, Dynamic area coverage using faulty multi-agent teams,in: IEEE/WIC/ACM International Conference on Intelligent Agent Technology,2007, pp. 17–23.

[20] M.N. Rooker, A. Birk, Multi-robot exploration under the constraints of wirelessnetworking, Control Engineering Practice 15 (2007) 435–445.

[21] G. Antonelli, F. Arrichiello, S. Chiaverini, R. Setola, A self-configuring MANETfor coverage area adaptation through kinematic control of a platoon ofmobile robots, in: IEEE/RS International Conference on Intelligent Robbots andSystems, 2005, pp. 1332–1337.

[22] J. Bae, R.M. Voyles, Wireless video sensor networks over Bluetooth for a teamof urban search and rescue robots, in: International Conference on WirelessNetworks, Las Vegas, 2006, pp. 409–415.

[23] U. Witkowski, M. El Habbal, S. Herbrechtsmeier, A. Tanoto, J. Penders,L. Alboul, V. Gazi, Ad-hoc network communication infrastructure for multi-robot systems in disaster scenarios, in: Proceedings of IARP/EURONWorkshopon Robotics for Risky Interventions and Environmental Surveillance, Spain,2008.

[24] J.C. Haartsen, The Bluetooth radio system, IEEE Personal Communications 7(2000) 28–36.

[25] R.M.Whitaker, L. Hodge, I. Chlamtac, Bluetooth scatternet formation: a survey,Ad Hoc Networks 3 (2005) 403–450.

[26] O. Al-Jarrah, O. Megdadi, Enhanced AODV routing protocol for Bluetoothscatternet, Computers and Electrical Engineering 35 (2009) 197–208.

[27] A. Pasztor, T. Kovacs, Z. Istenes, Team intelligence simulation with NXT robotsusing piconet and scatternet, in: Applied Computational Intelligence andInformatics, 2009, pp. 199–204.

[28] K. Sohrabi, W. Merrill, J. Elson, L. Girod, F. Newberg, W. Kaiser, Scalable self-assembly for ad hoc wireless sensor networks, IEEE Transactions on MobileComputing (2002) 317–331.

[29] M. Leopold, M.B. Dydensborg, P. Bonnet, Bluetooth and sensor networks: areality check, in: Proc. 1st ACM Conference on Embedded Networked SensorSystems, SenSys, 2003, pp. 103–113.

[30] E. Welsh, P. Murphy, J.P. Frantz, Using bluetooth for short-term ad hocconnections between moving vehicles: a feasibility study, in: IEEE VehicularTechnology Conference 1, 2002 pp. 414–418.

[31] U. Lee, S. Jung, D.-K. Cho, A. Chang, J. Choi, M. Gerla, Bluetooth-based P2PContent Distribution to Mobile Users, UCLA Technical Report: TR-070013,2008.

[32] A. Behrens, L. Atorf, T. Aach, Teaching practical engineering for freshmanstudents using the RWTH—Mindstorms NXT Toolbox for MATLAB, in: E.P.Leite (Ed.), Matlab — Modelling, Programming and Simulations, InTech,2010.

[33] V. de Silva, R. Ghrist, A. Muhammad, Coordinate-free coverage in sensornetworks with controlled boundaries via homology, International Journal ofRobotics Research 25 (2006) 1205–1222.

[34] E. Papadopoulos, M. Misailidis, On differential drive robot odometry withapplication to path planning, in: Proceedings of the European ControlConference, 2007, pp. 5492–5499.

[35] J. Borenstein, L. Feng, Measurement and correction of systematic odometryerrors in mobile robots, IEEE Transactions on Robotics and Automation 12 (5)(1996) 869–880.

Page 13: A multi-robot exploration algorithm based on a static Bluetooth communication chain

542 T. Kovács et al. / Robotics and Autonomous Systems 59 (2011) 530–542

[36] A. Pásztor, T. Kovács, Z. Istenes, Compass and odometry based navigation of amobile robot swarm equipped by Bluetooth communication, in: Proc. of IEEEInternational Joint Conferences on Computational Cybernetics and TechnicalInformatics, 2010, pp. 565–570.

[37] C. Pozna, F. Troester, R.-E. Precup, J.K. Tar, S. Preitl, On the design of an obstacleavoiding trajectory: method and simulation, Mathematics and Computers inSimulation 79 (7) (2009) 2211–2226.

[38] E. Rimon, D.E. Koditschek, Exact robot navigation using artificial potentialfunctions, IEEE Transactions on Robotics and Automation 8 (5) (1992)501–518.

[39] J. Jackson, Microsoft robotics studio: a technical introduction, IEEE Robotics &Automation Magazine 14 (4) (2007) 82–87.

Tamás Kovács received M.S. (1992) and Ph.D. (1997)in Statistical Physics at University of Debrecen. He iscurrently an Associate Professor in the Department ofInformatics at Kecskemét College.

Attila Pásztor graduated as a teacher of informatics fromEötvös Loránd University in 1993. Currently he is workingtowards his Ph.D. degree at the Ph.D. School of ComputerScience Eötvös Loránd University. His research interestincludes how can be use programmable robots in theeducation of programming and the swarm-intelligencesimulations with low-cost robots.

Zoltán Istenes received M.S. (1993) and Ph.D. (1996)in Computer Engineering at University of Nantes. Heis currently an Associate Professor at Eötvös LorándUniversity, Faculty of Informatics, Department of SoftwareTechnology.