
Cooperative Inchworm Localization with a Low Cost Team

Brian E. Nemsick, Austin D. Buchan, Anusha Nagabandi, Ronald S. Fearing, and Avideh Zakhor

Abstract— In this paper we address the problem of multi-robot localization with a heterogeneous team of low-cost mobile robots. The team consists of a single centralized observer with an IMU and monocular camera, and multiple picket robots with only IMUs and RGB LEDs. This team navigates a visually featureless environment as a unit, using only relative pose observations of the picket robots and IMU measurements to estimate the motion of the team. A team movement strategy, referred to as inchworm, is formulated; it uses pickets to move ahead of the observer and then act as temporary landmarks for the observer to follow. This cooperative approach employs a single Extended Kalman Filter to localize the entire heterogeneous multi-robot team, using a unique formulation of the measurement Jacobian to relate the pose of the observer to the poses of the pickets with respect to the global reference frame. An initial experiment with the inchworm strategy has shown localization within 0.27 meter error over a path length of 5 meters in an environment with irregular ground, partial occlusions and a ramp.

I. INTRODUCTION

The size of a robot can greatly affect what it can do and where it can go. Advantages of small robots include increased accessibility and a wider range of capabilities such as crawling through pipes, inspecting collapsed buildings, exploring congested or complex environments, and hiding in small or inconspicuous spaces. However, these benefits also bring along challenges in the form of limited energy availability, reduced sensing abilities, lower communication capability, limited computational resources, and tighter power constraints.

One way to overcome these limitations is to employ a heterogeneous team [1] of collaborative robots. This approach marks a design shift away from traditional SLAM ground robots, which have expensive sensors and powerful processors but less mobility in disaster environments. The goal is to have small, mobile, disposable robots with limited capabilities collaborate and share information to accomplish a larger task. Since each robot is expendable, reliability can be obtained in numbers, because even if a single robot fails, few (if any) capabilities are lost for the team. Hierarchical organization and the idea of heterogeneous teams allow for robots with different specializations, such as larger robots with higher computation power, smaller robots with increased maneuverability, robots with different sensor modalities, and so on. Another advantage of a team of less capable robots, as opposed to one extremely capable

This work is supported by the National Science Foundation under the National Robotics Initiative, Award 1427096.

The authors are with the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720 USA. {brian.nemsick, abuchan, nagaban2, ronf, avz}@eecs.berkeley.edu

robot, is that it allows sensing from multiple viewpoints, which is helpful for tasks such as surveillance, exploration, and monitoring. Furthermore, physically traversing an area conveys much more information than simply looking at it from a distance. For example, an expensive scanner can scan the rubble of a disaster site from the outside, but cannot enter and inspect the inside. Knowledge that cannot be gained without physical presence includes detection of slippery surfaces, hidden holes, and other obscured hazards; these can completely incapacitate robots despite their state-of-the-art SLAM algorithms, expensive cameras, and complex laser range finders. Instead, these same hazards can be detected through sacrifice of highly mobile disposable picket robots that scout the area [2].

Localization is a central problem in many robotic applications. It is particularly important in collaborative situations where, without position and orientation information of each robot, there is no global context in which to meaningfully share the information between the robots. In this paper, we focus on the localization problem and consider a heterogeneous multi-robot team consisting of two types of minimally equipped robots. A single, central, and more capable observer robot is equipped with a monocular camera and a 6-axis inertial measurement unit (IMU) consisting of a gyroscope and accelerometer. Multiple picket robots, which are expendable and less computationally capable, are equipped with no sensors other than 6-axis IMUs. A limited communication interface between the observer robot and individual picket robots is assumed. We present a method for using a single Extended Kalman Filter (EKF), which the observer uses to localize the entire multi-robot team, including itself, in six degrees of freedom (6-DOF) by fusing together all available measurements. These measurements include IMU readings from all robots, as well as relative pose estimates of the pickets, which are calculated using RGB LEDs that are mounted on the pickets. As the single-robot states become correlated, single-robot measurements improve the estimate of the multi-robot team state. To cooperatively move forward, we use a team movement strategy that we refer to as inchworm (Figure 1). This strategy contains incremental movements, where each increment consists of the pickets moving ahead to pave the way for the observer. We require that at least one robot is stationary at any given time, although it does not matter which one.

The datasets discussed in the results of this paper include one observer working together with two pickets to traverse a given area. Even with minimal sensors (only camera and IMU), the inchworm method is shown to work in dark environments with visual occlusions such as walls or


Fig. 1: (a) Leapfrog methods have one robot move forward while the other two remain stationary. By alternating which robot moves forward, the group moves forward as a unit. (b) Our inchworm method requires only one robot to be stationary at any given time. The robots cooperatively move forward, with the picket robots paving the way for the observer.

obstacles, instances when line of sight between the robots is lost, and non-planar settings without external visual features or landmarks.

II. RELATED WORK

Temporarily stationary robots and leapfrogging methods build on the ideas from [1] and have shown promise in 3-DOF environments in [3] [4]. The two or three stationary robot constraint can be relaxed with an inchworm technique (Figure 1). The inchworm approach extends leapfrog principles to an environment where two or three stationary robots may not be in line-of-sight. Previous approaches and variants of leapfrog techniques focused on team dynamics where multiple robots had relative pose measurements, but did not consider teams with only one robot capable of relative pose measurements. The inchworm approach relaxes the sensor constraints of the team to accommodate this team dynamic: only a single observer robot requires relative pose capabilities, leaving the picket robots with more flexibility/specialization.

Haldane et al. [2] use a heterogeneous team to detect slippery terrain by sending out a small picket robot and having it walk around the area of interest. The large robot is very complex and intelligent, so it can estimate its own pose very well. The large robot uses an AR tag on the picket robot for localization. Then, features of the picket's motion are used to train a terrain classifier capable of detecting slippery

terrain. This approach demonstrates the value of picket robots in a heterogeneous robot team for exploration or SLAM.

A follow-the-leader approach in [5] demonstrates a team composition similar to picket-observer. The leaders and children setup in [6] provides a relative localization scheme in 3-DOF; it assumes accurate localization of the leaders from an external source and performs localization of the children robots. This approach is extended in [7] to localize the leaders. The problem is subdivided into using multiple leaders to perform leader localization and then performing localization of the children. Maintaining sensor line-of-sight with multiple leaders is a difficult constraint in non-planar and irregular environments. We extend their approach to jointly solve the leader and children localization problem without requiring multiple leaders.

Alternatives to cooperative localization methods have been demonstrated with Khepera robots [8]. They are 5 cm in diameter, modular, and can support various sensors. They rely on dead reckoning, GPS, or fixed global cameras to perform localization. For unknown, low-light, and potentially indoor environments, GPS is not reliable and using/mounting a global camera may not be possible. Other small-scale cooperating robots similar to Khepera robots include those in FIRA and RoboCup competitions [9]. They address the localization and coordination problem with a centralized controller and global camera.

A more recent approach [10] uses range-only sensors with a team of aerial vehicles for SLAM and builds on the limited sensor approach of [11]. These drones are equipped with on-board computers and lasers. Our approach is to use inexpensive and disposable picket robots in a 6-DOF environment.

Limited communication capabilities between robots are addressed by [12] [13], and this is increasingly critical in multi-robot teams. Ants robots [14] are small and demonstrate localization capabilities with encoders under these constraints. In featureless, non-planar 6-DOF environments these encoders introduce drift from wheel slippage.

Odometry-based propagation has been successful in 3-DOF fusion architectures [1] [15]. In 6-DOF non-planar environments, wheel slippage causes systematic biases from encoders. Cell phone quality IMUs are a low-cost alternative to wheel encoders in 6-DOF environments because they provide a motion model even under slippage. Extensive work in IMU-based propagation in visual-inertial systems has been explored in [16] [17] [18] [19] [20] [21] [22].

Many algorithms and approaches exist for multi-robot localization. A least-squares optimization approach to localization with a multi-tier heterogeneous team was employed in [1], which used IR, GPS, and sonar modules to produce an estimate at each time step. Graph based approaches have also been used [23] [24], and the graph optimization algorithm in [23] relies on the locations of static landmarks and exploits the sparse nature of the graph. The use of cluster matching algorithms with IR sensors is presented by [25]. Maximum a posteriori approaches have also been used for multi-robot localization [26] and are robust to single-point failures.


Algorithm 1: Cooperative Multi-Robot EKF

Propagation: For each IMU measurement:
  • buffer previous IMU measurements received from other robots
  • propagate state and covariance for the team using the time-step, buffer, and new IMU measurement (cf. Section III-B).

Update: For each camera image:
  • identify RGB LEDs (cf. Section III-C).
  • estimate the relative pose between the visible picket robots and the observer frame with P3P and Gauss-Newton minimization (cf. Section III-C).
  • propagate the state and covariance for the team using the time-step and most recent IMU measurements.
  • perform the state and covariance update for the team (cf. Section III-D, III-E).

Maintain: At least one stationary robot.

Fig. 2: Block diagram of the general algorithm, which performs real-time cooperative localization of all robots in the robot team. Asynchronous sensor measurements from the robots are sent over WiFi, sorted into a measurement buffer, and then used in the EKF propagate and update steps. Currently, the functionality inside the dotted box is performed on an external laptop.
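To make the control flow of Algorithm 1 concrete, the following sketch shows one possible event-driven structure, assuming hypothetical callables for the propagation, LED detection, pose estimation, and update steps (Sections III-B to III-E); it illustrates only the ordering of operations, not the authors' implementation, and it collapses the constant-timer buffer flush into the camera callback for brevity.

```python
class CooperativeEkfNode:
    """Illustrative event loop for Algorithm 1 (hypothetical structure)."""

    def __init__(self, propagate, detect_leds, estimate_pose, update):
        # The four callables stand in for Sections III-B through III-E.
        self.propagate = propagate          # propagate(robot_id, imu_sample, stamp)
        self.detect_leds = detect_leds      # detect_leds(image) -> {robot_id: centroids}
        self.estimate_pose = estimate_pose  # estimate_pose(robot_id, centroids) -> (z, Q)
        self.update = update                # update(robot_id, z, Q)
        self.imu_buffer = []                # (stamp, robot_id, imu_sample), kept sorted

    def on_imu(self, stamp, robot_id, imu_sample):
        # Asynchronous IMU samples from every robot are buffered, not applied immediately.
        self.imu_buffer.append((stamp, robot_id, imu_sample))

    def on_image(self, stamp, image):
        # Propagate the whole team with every buffered IMU sample older than the image.
        self.imu_buffer.sort(key=lambda m: m[0])
        while self.imu_buffer and self.imu_buffer[0][0] <= stamp:
            t, robot_id, imu_sample = self.imu_buffer.pop(0)
            self.propagate(robot_id, imu_sample, t)
        # Detect LEDs, estimate relative picket poses, and apply the EKF update.
        for robot_id, centroids in self.detect_leds(image).items():
            z, Q = self.estimate_pose(robot_id, centroids)   # P3P + Gauss-Newton
            self.update(robot_id, z, Q)
```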

Existing EKF [27] [28] [15] and particle filter methods [29] [12] [30] demonstrate the capability of fusing measurements to provide accurate multi-robot localization.

While the noted previous works have extensively explored multi-robot localization, their experiments were conducted with access to significantly more capable robots, availability of GPS or beacons of known pose, 3-DOF settings with planar environmental assumptions and accurate wheel odometry, requirements of additional stationary robots, assumptions of light, and existence of landmarks or visual features. In this paper, RGB LEDs are used as features since they work in various lighting conditions, the perspective-3-point algorithm uses the known layout of features to estimate the 6-DOF pose, and Gauss-Newton minimization then refines these estimates using a rigid body assumption.

III. METHODS

The purpose of the cooperative multi-robot EKF is to localize the robot team's body frames with respect to a global reference frame. An outline of the algorithm performed on the observer robot is provided in Algorithm 1. IMU measurements from both types of robots are used to propagate the state and covariance of the team with the same motion model. RGB LEDs are placed in a known configuration on each picket robot and used as features to estimate the relative pose. The relative poses are subsequently used to update the state and covariance of the team.

A team movement strategy called inchworm is adopted, which alternates between the pickets moving ahead and then the observer moving (Figure 1). This movement strategy requires at least one stationary robot at any given point in time; this is used to significantly mitigate IMU dead-reckoning error and increase the robustness of the localization algorithm to temporary loss of line of sight. A stationary robot does not propagate its corresponding states or covariances. This enables the stationary robot to function as a temporary landmark and serves as a functional substitute for external features, because although external features are used in traditional visual odometry or visual SLAM systems, they may not be available in a low-light environment. An example benefit of a stationary picket robot is in cases of complete line of sight failure, where none of the picket robots are visible to the observer. Then, a single future re-observation of a stationary robot corrects the IMU dead-reckoning error of the entire team (for that stage of the movement, during which the stationary robot has remained stationary).

The following sections describe the EKF propagation and update steps in detail.

A. State Vector

The EKF state and accompanying error-state vector store the state of each single-robot in the multi-robot team. The state vector subcomponent with respect to the ith picket robot is:
\[
\mathbf{x}_i = [\,{}^B_G\mathbf{q}_i^T,\ {}^G\mathbf{p}_i^T,\ {}^G\mathbf{v}_i^T,\ \mathbf{b}_{i_g}^T,\ \mathbf{b}_{i_a}^T\,]^T \quad (16 \times 1) \tag{1}
\]
where ${}^B_G\mathbf{q}_i$ is the unit quaternion representation of the rotation from the global frame (G) to the body frame (B), ${}^G\mathbf{p}_i$, ${}^G\mathbf{v}_i$ are the body frame position and velocity with respect to the global frame, and $\mathbf{b}_{i_g}$, $\mathbf{b}_{i_a}$ are the gyroscope and accelerometer biases.

The corresponding error-state subcomponent with respect to the ith picket robot is:
\[
\tilde{\mathbf{x}}_i = [\,{}^G\tilde{\boldsymbol{\theta}}_i^T,\ {}^G\tilde{\mathbf{p}}_i^T,\ {}^G\tilde{\mathbf{v}}_i^T,\ \tilde{\mathbf{b}}_{i_g}^T,\ \tilde{\mathbf{b}}_{i_a}^T\,]^T \quad (15 \times 1) \tag{2}
\]
where ${}^G\tilde{\boldsymbol{\theta}}_i$ is the minimal representation from the error quaternion $\delta q \simeq [\tfrac{1}{2}{}^G\tilde{\boldsymbol{\theta}}^T,\ 1]^T$. The remaining states use an additive error model (e.g. $\tilde{p} = p - \hat{p}$), where $\hat{p}$ is the estimated position.


The observer robot is also a subcomponent of the EKF state and error-state vector:
\[
\mathbf{x}_O = [\,{}^O_G\mathbf{q}_O^T,\ {}^G\mathbf{p}_O^T,\ {}^G\mathbf{v}_O^T,\ \mathbf{b}_{O_g}^T,\ \mathbf{b}_{O_a}^T\,]^T, \qquad
\tilde{\mathbf{x}}_O = [\,{}^G\tilde{\boldsymbol{\theta}}_O^T,\ {}^G\tilde{\mathbf{p}}_O^T,\ {}^G\tilde{\mathbf{v}}_O^T,\ \tilde{\mathbf{b}}_{O_g}^T,\ \tilde{\mathbf{b}}_{O_a}^T\,]^T \tag{3}
\]
where O denotes the observer frame. The augmented EKF state vector and error-state vector with respect to the multi-robot team become:
\[
\mathbf{x} = [\,\mathbf{x}_O^T,\ \mathbf{x}_1^T,\ \mathbf{x}_2^T,\ \ldots,\ \mathbf{x}_n^T\,]^T, \qquad
\tilde{\mathbf{x}} = [\,\tilde{\mathbf{x}}_O^T,\ \tilde{\mathbf{x}}_1^T,\ \tilde{\mathbf{x}}_2^T,\ \ldots,\ \tilde{\mathbf{x}}_n^T\,]^T \tag{4}
\]

where n is the total number of picket robots.
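As a concrete illustration of how the per-robot 16-element state blocks of (1)-(4) and the corresponding 15-dimensional error-state blocks could be laid out, here is a small NumPy sketch; the slicing helpers and variable names are assumptions made for illustration, not part of the paper's implementation.

```python
import numpy as np

STATE_DIM = 16   # per robot: quaternion (4) + position (3) + velocity (3) + two biases (6)
ERR_DIM = 15     # per robot: minimal rotation (3) + position (3) + velocity (3) + biases (6)

def make_team_state(n_pickets):
    """Augmented state x = [x_O, x_1, ..., x_n] with identity quaternions, zeros elsewhere."""
    x = np.zeros((1 + n_pickets) * STATE_DIM)
    for r in range(1 + n_pickets):
        x[r * STATE_DIM + 3] = 1.0           # quaternion stored [x, y, z, w] = [0, 0, 0, 1]
    return x

def state_slice(r):
    """Index range of robot r (0 = observer) inside the augmented state vector."""
    return slice(r * STATE_DIM, (r + 1) * STATE_DIM)

def error_slice(r):
    """Index range of robot r inside the error-state vector / joint covariance."""
    return slice(r * ERR_DIM, (r + 1) * ERR_DIM)

# Example: one observer and two pickets -> 48-dim state, 45 x 45 error covariance.
x = make_team_state(n_pickets=2)
P = 1e-4 * np.eye(3 * ERR_DIM)
print(x[state_slice(1)].shape, P[error_slice(2), error_slice(2)].shape)  # (16,) (15, 15)
```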

B. IMU Propagation Model

The EKF propagation step occurs each time a new IMU measurement from any single-robot or a camera image is captured on the observer robot. The observer and picket robots utilize the same linearized motion model. Although the single-robots become correlated with each other, the state vector propagation is calculated independently for each robot. The correlation is maintained in the error-state and preserved under IMU propagation. The continuous dynamics of the IMU propagation model for a single-robot are [17] [18]:

\[
{}^B_G\dot{\mathbf{q}} = \tfrac{1}{2}\,\Omega(\boldsymbol{\omega})\,{}^B_G\mathbf{q}, \qquad {}^G\dot{\mathbf{p}} = {}^G\mathbf{v}, \qquad {}^G\dot{\mathbf{v}} = {}^G\mathbf{a}, \qquad \dot{\mathbf{b}}_g = \mathbf{n}_{wg}, \qquad \dot{\mathbf{b}}_a = \mathbf{n}_{wa} \tag{5}
\]
where $\mathbf{n}_{wg}$, $\mathbf{n}_{wa}$ are Gaussian white noise vectors, $\boldsymbol{\omega}$ is the body rotational velocity, and
\[
\lfloor\boldsymbol{\omega}\times\rfloor = \begin{bmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{bmatrix}, \qquad
\Omega(\boldsymbol{\omega}) = \begin{bmatrix} -\lfloor\boldsymbol{\omega}\times\rfloor & \boldsymbol{\omega} \\ -\boldsymbol{\omega}^T & 0 \end{bmatrix} \tag{6}
\]

The discrete-time model implementation is discussed in detail in [17] [18]. Stationary robots receive no state or covariance propagation.
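Below is a minimal sketch of one propagation step, using a simple Euler discretization of (5) rather than the discrete-time scheme of [17] [18]; the scalar-last, body-to-global quaternion convention and the helper names are assumptions made for readability and differ from the paper's global-to-body convention.

```python
import numpy as np
from scipy.spatial.transform import Rotation

GRAVITY = np.array([0.0, 0.0, -9.81])  # assumed gravity vector in the global frame (m/s^2)

def propagate_single_robot(q, p, v, b_g, b_a, gyro, accel, dt, stationary=False):
    """One Euler step of the continuous model (5) for a single robot.

    q: quaternion [x, y, z, w]; p, v: global position and velocity;
    b_g, b_a: gyroscope and accelerometer biases; gyro, accel: raw IMU sample.
    """
    if stationary:
        # Stationary robots receive no state (or covariance) propagation.
        return q, p, v, b_g, b_a
    w = gyro - b_g                                     # bias-corrected angular rate
    dq = Rotation.from_rotvec(w * dt)                  # incremental body rotation over dt
    q_new = (Rotation.from_quat(q) * dq).as_quat()     # integrate orientation
    a_global = Rotation.from_quat(q).apply(accel - b_a) + GRAVITY
    p_new = p + v * dt + 0.5 * a_global * dt ** 2
    v_new = v + a_global * dt
    return q_new, p_new, v_new, b_g, b_a
```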

As opposed to distributed robot teams, our team has a centralized observer and asynchronous communication. A sensor measurement buffer is used to handle measurements from each robot to account for latency and communication delay. Each time a measurement (IMU or relative pose) is received, it is inserted into the sorted buffer. On a constant timer, the buffer is emptied by propagating and updating the state of the system with the sensor measurements.

C. Relative Pose Estimation

Four (or more) RGB LEDs are placed in known configurations on the picket robots to allow relative pose estimation on board the observer. Each picket robot receives a unique configuration with LEDs of various colors. Color and intensity thresholds are used to find the LED centroids, and these LED detections are passed into separate pose estimators (one pose estimator for each robot).
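A small OpenCV sketch of this color- and intensity-thresholding step is shown below; the HSV gates, minimum blob area, and function name are illustrative assumptions rather than the thresholds actually used on the observer.

```python
import cv2
import numpy as np

# Hypothetical HSV gates (hue, saturation, value) for bright, saturated LED pixels.
COLOR_GATES = {
    "red":   (np.array([0, 120, 200]),   np.array([10, 255, 255])),
    "green": (np.array([45, 120, 200]),  np.array([75, 255, 255])),
    "blue":  (np.array([100, 120, 200]), np.array([130, 255, 255])),
}

def detect_led_centroids(bgr_image, min_area=3):
    """Return {color: [(u, v), ...]} pixel centroids of thresholded LED blobs."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    detections = {}
    for color, (lower, upper) in COLOR_GATES.items():
        mask = cv2.inRange(hsv, lower, upper)                  # color + intensity gate
        n, _, stats, centroids = cv2.connectedComponentsWithStats(mask)
        # Label 0 is the background; keep blobs above a small area threshold.
        detections[color] = [tuple(centroids[k]) for k in range(1, n)
                             if stats[k, cv2.CC_STAT_AREA] >= min_area]
    return detections
```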

Fig. 3: Coordinate frames for a two-robot team.

From the centroid detections, the approach from [19] is used to perform relative pose estimation. Pose correspondence is computed with the perspective-3-point (P3P) [21] [22] algorithm for each picket. Using different colors for the LEDs reduces the computational load by allowing the P3P correspondence search to consider fewer possible configurations. Gauss-Newton minimization refines the initial solution from the P3P algorithm by minimizing reprojection error [19]:

\[
P^{*} = \arg\min_{P} \sum_{\langle l, d\rangle \in C} \lVert \pi(l, P) - d \rVert^2
\]
where $P$ is the pose estimate, $l$ is the set of LED configurations, $d$ is the set of LED centroids, $C$ is the set of LED correspondences, and $\pi$ projects an LED from $\mathbb{R}^3$ into $\mathbb{R}^2$ (the camera image).

The pose estimate covariance ($\mathbf{Q}$) is calculated with the Jacobian ($\mathbf{J}$) from the Gauss-Newton minimization [19]:
\[
\mathbf{Q} = (\mathbf{J}^T \Sigma^{-1} \mathbf{J})^{-1} \quad \text{where } \Sigma = \mathbf{I}_{2\times 2}\ \text{pixels}^2 \tag{7}
\]
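For illustration, the refinement and covariance computation can be sketched as a nonlinear least-squares problem; the sketch below uses scipy's least_squares in place of a hand-written Gauss-Newton loop, and the camera intrinsics, LED layout, and function names are assumed values for the example only.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

# Assumed pinhole intrinsics and LED positions (meters) in the picket body frame.
K = np.array([[600.0, 0.0, 320.0], [0.0, 600.0, 240.0], [0.0, 0.0, 1.0]])
LEDS_BODY = np.array([[0.05, 0.05, 0.0], [-0.05, 0.05, 0.0],
                      [-0.05, -0.05, 0.0], [0.05, -0.05, 0.02]])

def reprojection_residuals(pose, centroids_px):
    """Pixel residuals pi(l, P) - d for pose = [rotation vector (3), translation (3)]."""
    R = Rotation.from_rotvec(pose[:3]).as_matrix()
    pts_cam = LEDS_BODY @ R.T + pose[3:]           # LEDs expressed in the camera frame
    proj = pts_cam @ K.T
    proj = proj[:, :2] / proj[:, 2:3]              # perspective division to pixels
    return (proj - centroids_px).ravel()

def refine_pose(pose_p3p, centroids_px):
    """Refine a P3P initial pose and return it with the covariance of Eq. (7), Sigma = I."""
    result = least_squares(reprojection_residuals, pose_p3p, args=(centroids_px,))
    J = result.jac                                  # reprojection Jacobian at the optimum
    Q = np.linalg.inv(J.T @ J)                      # (J^T Sigma^-1 J)^-1 with Sigma = I px^2
    return result.x, Q
```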

D. Camera Measurement Model

Building on previous work for the IMU propagation model and LED pose estimation, we derive a measurement model for our specific case from the relative pose measurements. An overview of the coordinate frames is shown in Figure 3. The camera transform is defined as $[\,{}^C_O\mathbf{q}^T,\ {}^C_O\mathbf{p}^T\,]^T$.

The raw pose measurements are in the camera frame:

\[
\mathbf{z}_i = [\,{}^B_C\mathbf{q}_{i_z}^T,\ {}^B_C\mathbf{p}_{i_z}^T\,]^T \quad (7 \times 1) \tag{8}
\]

A residual, $r$, and a measurement Jacobian, $H$, are required to complete the EKF update step. The relationship between the residual and measurement Jacobian is given by:
\[
r = z - \hat{z} \simeq H\tilde{x} + n \tag{9}
\]
where $n$ is noise. In order to calculate a residual, $r$, an estimate of the camera measurements, $\hat{z}_i$, is computed.

To calculate $\hat{z}_i$, the state vector estimate is updated with the EKF propagation step. The poses of the picket robots are then converted from the global frame to the camera frame with the state estimate:
\[
\hat{z}_i = \begin{bmatrix} {}^B_G\hat{\mathbf{q}}_i \otimes {}^O_G\hat{\mathbf{q}}_O^{-1} \otimes {}^C_O\hat{\mathbf{q}}^{-1} \\ {}^C_O\mathbf{R}\,{}^O_G\mathbf{R}\,({}^G\hat{\mathbf{p}}_i - {}^G\hat{\mathbf{p}}_O - {}^C_O\mathbf{p}) \end{bmatrix} \tag{10}
\]

where $\otimes$ represents quaternion multiplication. The single-robot residuals with respect to the visible picket robots are calculated as:


\[
r_i = \begin{bmatrix} 2\,\pi\!\left( \left({}^B_G\hat{\mathbf{q}}_i \otimes {}^O_G\hat{\mathbf{q}}_O^{-1} \otimes {}^C_O\hat{\mathbf{q}}^{-1}\right)^{-1} \otimes {}^B_C\mathbf{q}_{i_z} \right) \\ {}^B_C\mathbf{p}_{i_z} - {}^C_O\mathbf{R}\,{}^O_G\mathbf{R}\,({}^G\hat{\mathbf{p}}_i - {}^G\hat{\mathbf{p}}_O - {}^C_O\mathbf{p}) \end{bmatrix} \tag{11}
\]
where $\pi$ is defined as $\pi([q_x, q_y, q_z, q_w]^T) = [q_x, q_y, q_z]^T$ and utilized as a small angle approximation for the orientation difference between $z$ and $\hat{z}$.

The single-robot residuals must be approximated as a linear combination of the error-state and measurement Jacobian (9) for the covariance portion of the EKF update step.

The ith measurement Jacobian, $H_i$, is calculated by applying small angle approximations and taking the partial derivatives of the ith single-robot residual with respect to the error-state. The sparse non-zero entries are:
\[
r_i \simeq H_i\tilde{x}, \qquad
H_i = \begin{bmatrix} -{}^C_G\mathbf{R} & 0_3 & 0_3 & \cdots & {}^C_G\mathbf{R} & 0_3 & 0_3 & \cdots \\ {}^C_G\mathbf{R}\,\lfloor({}^G\hat{\mathbf{p}}_i - {}^G\hat{\mathbf{p}}_O - {}^C_O\mathbf{p})\times\rfloor & -{}^C_G\mathbf{R} & 0_3 & \cdots & 0_3 & {}^C_G\mathbf{R} & 0_3 & \cdots \end{bmatrix} \tag{12}
\]
\[
\tilde{x} = [\,{}^G\tilde{\boldsymbol{\theta}}_O\ \ {}^G\tilde{\mathbf{p}}_O\ \ {}^G\tilde{\mathbf{v}}_O\ \cdots\ {}^G\tilde{\boldsymbol{\theta}}_i\ \ {}^G\tilde{\mathbf{p}}_i\ \ {}^G\tilde{\mathbf{v}}_i\ \cdots\,]
\]
where ${}^C_G\mathbf{R} = {}^C_O\mathbf{R}\,{}^O_G\mathbf{R}$ and $\lfloor\cdot\times\rfloor$ is the skew operator from (6). The higher order and cross terms are dropped from $H_i$ to satisfy the linearity requirement of the EKF.

The states of all picket robots become correlated with the observer robot through the measurement Jacobian. This enables an individual pose estimate of a picket robot to improve the state estimate of each picket robot. The correlation is essential to localizing the observer robot because it is unable to observe itself directly with the camera. A single stationary robot has relatively low pose uncertainty in the global frame and can be used to correct IMU propagation error in hazardous terrain for the remaining moving robots.
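The block structure of (12) can be assembled directly; the sketch below fills a 6 x 15(n+1) matrix under the error-state ordering of (2), with the observer's block first. The function and argument names (and the 1-based picket index) are assumptions for the example.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix so that skew(v) @ u == np.cross(v, u), as in Eq. (6)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def measurement_jacobian(R_CG, p_i_hat, p_O_hat, p_CO, n_pickets, i, err_dim=15):
    """Sparse Jacobian H_i of Eq. (12) for picket i (1-based), observer at block 0.

    R_CG is the estimated camera-from-global rotation C_G R = C_O R * O_G R;
    p_i_hat, p_O_hat are estimated global positions and p_CO is the camera offset.
    """
    H = np.zeros((6, (n_pickets + 1) * err_dim))
    obs, pic = 0, i * err_dim                       # column offsets of observer and picket i
    # Orientation residual rows (first block row of Eq. 12).
    H[0:3, obs:obs + 3] = -R_CG                     # w.r.t. observer attitude error
    H[0:3, pic:pic + 3] = R_CG                      # w.r.t. picket attitude error
    # Position residual rows (second block row of Eq. 12).
    H[3:6, obs:obs + 3] = R_CG @ skew(p_i_hat - p_O_hat - p_CO)
    H[3:6, obs + 3:obs + 6] = -R_CG                 # w.r.t. observer position error
    H[3:6, pic + 3:pic + 6] = R_CG                  # w.r.t. picket position error
    return H
```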

E. EKF Update

The Kalman gain, $K$, and covariance update are calculated using the standard EKF equations:
\[
K = \Sigma_{k+1|k} H^T \left(H \Sigma_{k+1|k} H^T + Q\right)^{-1}, \qquad
\Sigma_{k+1|k+1} = (I - KH)\,\Sigma_{k+1|k} \tag{13}
\]

To utilize the standard equations, the overall measurement Jacobian is calculated by vertically stacking the single-robot measurement Jacobians from the camera measurement model:
\[
H = [\,H_1^T,\ H_2^T,\ \ldots,\ H_n^T\,]^T \tag{14}
\]

The corresponding overall observation noise is calculated by diagonalizing the uncorrelated relative pose estimate covariances (7):
\[
Q = \mathrm{diag}(Q_1, Q_2, \ldots, Q_n) \tag{15}
\]

The corrections for both the observer and picket robots are calculated with:
\[
\Delta x = K r \tag{16}
\]

For a single-robot, the correction has the form:

\[
\Delta x_i = [\,\delta\theta_i^T,\ \Delta p_i^T,\ \Delta v_i^T,\ \Delta b_{i_g}^T,\ \Delta b_{i_a}^T\,]^T, \qquad
\Delta x = [\,\Delta x_O^T,\ \Delta x_1^T,\ \ldots,\ \Delta x_n^T\,]^T \tag{17}
\]

The non-quaternion states from the multi-robot state vector utilize the standard additive correction (e.g. ${}^G p_{k+1|k+1} = {}^G p_{k+1|k} + \Delta p$). The orientation components of the correction are converted to quaternion corrections. Quaternion multiplication is applied to the quaternion states and corrections (e.g. $q_{k+1|k+1} = q_{k+1|k} \otimes \delta q$) to complete the state update for the multi-robot team.
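A compact sketch of Eqs. (13)-(17), stacking the per-picket Jacobians and pose covariances and applying additive and quaternion corrections, is given below; the data layout, scalar-last quaternion convention, and function name are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np
from scipy.linalg import block_diag
from scipy.spatial.transform import Rotation

ERR_DIM = 15  # error-state size per robot: [theta (3), p (3), v (3), b_g (3), b_a (3)]

def ekf_update(quats, rests, P, H_blocks, Q_blocks, residuals):
    """One team update. quats[r] is robot r's quaternion [x, y, z, w]; rests[r] is its
    12-vector [p, v, b_g, b_a]; H_blocks/Q_blocks/residuals come from Section III-D."""
    H = np.vstack(H_blocks)                       # Eq. (14): stack single-robot Jacobians
    Q = block_diag(*Q_blocks)                     # Eq. (15): uncorrelated pose covariances
    r = np.concatenate(residuals)
    S = H @ P @ H.T + Q
    K = P @ H.T @ np.linalg.inv(S)                # Eq. (13): Kalman gain
    P_new = (np.eye(P.shape[0]) - K @ H) @ P      # Eq. (13): covariance update
    dx = K @ r                                    # Eq. (16): correction for the whole team
    # Eq. (17): additive corrections for p, v, biases; multiplicative for the quaternions.
    for robot, q in enumerate(quats):
        dtheta = dx[robot * ERR_DIM: robot * ERR_DIM + 3]
        dq = Rotation.from_rotvec(dtheta)         # small-angle error quaternion
        quats[robot] = (Rotation.from_quat(q) * dq).as_quat()
        rests[robot] = rests[robot] + dx[robot * ERR_DIM + 3: (robot + 1) * ERR_DIM]
    return quats, rests, P_new
```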

IV. RESULTS

A. Experimental Approach

We apply the localization technique described above to data collected from a team of three small, low-cost, mobile robots. The Zumy robot¹ is a decimeter-scale tracked robot running ROS on board a Linux computing system with networking and vision processing capability. The observer Zumy supports a Microsoft Lifecam 3000 webcam (640x480 pixels, 30 Hz) and an InvenSense MPU-6050 MEMS IMU (30 Hz), and supports WiFi wireless communication. This robot is designed to be easily built from commercially available off-the-shelf parts for a total cost of ≈ $350.

The robotic team consists of one observer and two picket robots, shown in Figure 3. A Zumy robot with a camera serves as the observer, and to represent the inexpensive and less capable picket robots, we use Zumy robots without cameras. Each picket robot is outfitted with an LED "hat" so that it can be visually tracked by the observer robot. The robots are manually driven, and ground truth for the analysis is obtained using a VICON motion capture system.

B. Planar Base Case

The baseline experimental task was a cooperative u-turn in planar 3-DOF with one observer and two pickets. The robots were manually driven in the dark. Although the data set was recorded in a 3-DOF environment, the filter was not constrained as such.

In this planar u-turn dataset, inchworm increments were done at a relative pose depth of ≈ 2 m. An inchworm increment at a relative pose depth of 2 m means that the picket robots start 2 m away from the observer, and the observer moves between 1-1.5 m. According to experiments in [19], the expected position and orientation error from the pose estimator is ≈ 5 cm and 1-2 degrees, respectively. Our setup has a slightly lower baseline (a convex hull was used as a spherical radius whereas the baseline is 5 cm in the other dimension) and lower resolution, as well as less accurate LED centroids (RGB vs. infrared). A direct comparison between the system setup in [19] and our system setup is in Table I. Experimentally, at this relative pose depth, our position and orientation errors were approximately 6-8 cm and 3-5 degrees.

In Figure 4, we show the resulting trajectories of this planar u-turn set as compared to ground truth. We plot the results of using only one picket (discarding the measurements from the other), and then the results of using both pickets.

¹ https://wiki.eecs.berkeley.edu/biomimetics/Main/Zumy


Note that the observer trajectory is not as smooth as that of picket-1 or picket-2, because the motion of the observer has un-modeled vibration effects that cause motion blur and temporary changes to the "static" camera transform.

This data set consisted of 10 inchworm increments. End-position drift for using a single picket vs. using two pickets (Table II) was roughly equivalent, but the angular drift, which has large future error implications, was significantly less for the two picket setup. Although unconstrained to a plane, the angular drift was almost exclusively in yaw. Performing right or left turns with the robot team introduces the most rotational drift and is significantly more costly than moving the robot team forward or backward. The major source of rotational error came from latent biases and repeated poor pose estimates when only picket-1 was visible (picket-2 was experiencing temporary line of sight failure at that time) from a suboptimal orientation. Without external features or global correction, the yaw errors persist until the end of the experiment, but adding more picket robots helps to mitigate these effects.

The drift and angular error after 10 increments are consistent with that of a single pose measurement. A camera-only filtering approach was evaluated in Figure 5, but it performed significantly worse (twice as much yaw drift) than the IMU plus camera approach. This shows an improvement of the filter fusion-based approach compared to a purely vision-based approach. Relatedly, using only the IMU for dead-reckoning produced drifts of over 100 m for each of the 3 robots.

C. Environment with Non-planar Terrain and Ramp

The second experiment was conducted in an environment featuring non-planar terrain, obstacles, and occlusions. The robots were manually driven in the environment shown in Figure 6. The ground truth trajectories and the EKF pose estimates of the data set are shown in Figure 8.

TABLE I: Pose Tracker Comparison

                          Faessler et al. [19]    Our System
  Resolution (pixels)     752x480                 640x480
  Baseline Radius (cm)    10.9                    10.54
  LEDs/Robot              5                       5
  LED Type                Infrared                RGB
  ≈ error at 2 m depth    5 cm, 1-2 deg           6-8 cm, 3-5 deg

TABLE II: Planar Drift Analysis

                         Camera Only     Fusion        Fusion
                         Two Pickets     One Picket    Two Pickets
  Observer  x (cm)       -7.91           0.040         -4.81
            y (cm)       22.6            -4.50         11.0
            z (cm)       1.14            1.24          0.86
            Angle (deg)  20.5            8.9           3.0
  Picket-1  x (cm)       4.50            -6.32         -11.8
            y (cm)       15.1            -24.8         -5.68
            z (cm)       4.38            1.98          3.73
            Angle (deg)  17.8            10.9          4.7
  Picket-2  x (cm)       -2.40           -             4.20
            y (cm)       29.1            -             7.73
            z (cm)       4.45            -             5.25
            Angle (deg)  16.4            -             2.8

Fig. 4: A plot of the XY projection of the team's pose estimates (from the EKF) along with the ground truth trajectories. Shown for the base case of the planar u-turn, where only one picket is used in the top plot, and two pickets are used in the bottom plot.

Fig. 5: Camera only approach: A plot of the XY projection of the team's pose estimates (from the EKF) along with the ground truth trajectories, using a camera only approach. It performs notably worse (in yaw drift) than the IMU-camera fusion approach shown in Figure 4.

The 6-DOF non-planar data set consisted of 10 inchworm increments: 3 for the rock garden and 7 for the right turn and ramp. Temporary line of sight failure of both pickets occurred during the rock garden because the pose estimator failed to converge as the observer was moving on the rocks. Despite significant wheel slippage and added IMU noise (≈ 3x that of flat ground), the angular drift through the rock garden was < 1 degree in yaw (≈ 0 degrees of pitch and roll error) for each robot.


Fig. 6: 6-DOF environment for testing: Robot team (right), rock garden (right), green patch (bottom left) to represent a hole that picket-2 falls into, and ramp (top left).

Fig. 7: Starting position of the robot team with a view of the rock garden section. The origin is defined as the starting position of the observer.

After the rock garden, picket-2 was deliberately left behind to simulate a hole in the environment. After picket-2 was left behind, the observer and picket-1 experienced slippage (increased IMU noise) while performing a right turn to go onto the ramp. The combination of noisier IMU measurements and correlated pose estimate errors from only a single stationary picket adversely affected performance. On flat ground (the 3-DOF planar case) with one observer and one picket, a right turn experimentally produced a yaw drift of ≈ 3-5 degrees. This is substantially less than the 11 degrees of yaw drift of the observer and picket-1 during the turn and ramp. The principle of keeping a single robot stationary allows our method to function in various types of terrain; however, in 6-DOF with slippage and noisier IMU measurements, a performance hit in yaw is incurred. This effect can be optionally mitigated by maintaining additional stationary robots.

V. CONCLUSION

A heterogeneous team consisting of a single "observer" and multiple "picket" robots is able to navigate a visually featureless, unknown, non-planar environment as a unit, using only relative pose observations and IMU measurements to estimate the motion of the team. The IMU and camera fusion approach presented in this paper has clear advantages over a simpler camera-only approach that outweigh the cost of having asynchronous communication (see Figures 4, 5). Although calibration of an IMU adds many complications, it is a natural choice for environments in which wheel encoders are unreliable. A camera-only approach relies heavily on line of sight at all times, which is restrictive and potentially very hard to maintain. In addition, motion-model based approaches for EKF propagation allow for the rejection of errant camera pose estimates from faulty LED detections or P3P correspondence matching. Most importantly, with a camera-only approach, each increment consisting of picket and observer movement has an associated positional and rotational drift (drift in 6-DOF). The calibration of the IMU allows the fusion-based approach to use the stationary robots' gravity vectors to reduce and bound the pitch and roll drift (drift in 4-DOF).

Fig. 8: Ground truth trajectories of the multi-robot team are compared against the estimates of the EKF for the non-planar environment.

In the future, we will develop strategies for larger robot teams (n > 10), including yaw drift mitigation, autonomous control, and various multi-robot exploration strategies. Additionally, one of the benefits of having numerous expendable robots is that pickets can be left in place to act as permanent landmarks, which we plan to use to implement loop closure and reduce our drift.

ACKNOWLEDGMENT

The authors would like to thank the Biomimetic Millisystems Lab, specifically Andrew Chen, Andrew Pullin, and


James Lam Yi for their contributions to the Zumy platform development, as well as Justin Yim for helping to create the LED mounts.

REFERENCES

[1] R. Grabowski, L. E. Navarro-Serment, C. J. Paredis, and P. K. Khosla, "Heterogeneous teams of modular robots for mapping and exploration," Autonomous Robots, vol. 8, no. 3, pp. 293-308, 2000.
[2] D. W. Haldane, P. Fankhauser, R. Siegwart, and R. S. Fearing, "Detection of slippery terrain with a heterogeneous team of legged robots," in IEEE Int. Conf. on Robotics and Automation, 2014, pp. 4576-4581.
[3] L. E. Navarro-Serment, C. J. Paredis, P. K. Khosla, et al., "A beacon system for the localization of distributed robotic teams," in Int. Conf. on Field and Service Robotics, vol. 6, 1999.
[4] S. Tully, G. Kantor, and H. Choset, "Leap-frog path design for multi-robot cooperative localization," in Int. Conf. on Field and Service Robotics. Springer, 2010, pp. 307-317.
[5] K. E. Wenzel, A. Masselli, and A. Zell, "Visual tracking and following of a quadrocopter by another quadrocopter," in IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, 2012, pp. 4993-4998.
[6] T. R. Wanasinghe, G. K. Mann, and R. G. Gosine, "Distributed collaborative localization for a heterogeneous multi-robot system," in 27th IEEE Canadian Conf. on Electrical and Computer Engineering (CCECE), 2014, pp. 1-6.
[7] ——, "Distributed leader-assistive localization method for a heterogeneous multirobotic system," IEEE Transactions on Automation Science and Engineering, vol. 12, no. 3, pp. 795-809, 2015.
[8] F. Mondada, E. Franzi, and P. Ienne, "Mobile robot miniaturisation: A tool for investigation in control algorithms," Experimental Robotics III, Springer Berlin Heidelberg, 1994, pp. 501-513.
[9] M. Veloso, P. Stone, K. Han, and S. Achim, "The CMUnited-97 small robot team," Springer Berlin Heidelberg, 1998, pp. 242-256.
[10] F. Fabresse, F. Caballero, and A. Ollero, "Decentralized simultaneous localization and mapping for multiple aerial vehicles using range-only sensors," in IEEE Int. Conf. on Robotics and Automation, 2015, pp. 6408-6414.
[11] K. E. Bekris, M. Glick, and L. E. Kavraki, "Evaluation of algorithms for bearing-only SLAM," in IEEE Int. Conf. on Robotics and Automation, 2006, pp. 1937-1943.
[12] L. Carlone, M. K. Ng, J. Du, B. Bona, and M. Indri, "Rao-Blackwellized particle filters multi robot SLAM with unknown initial correspondences and limited communication," in IEEE Int. Conf. on Robotics and Automation, 2010, pp. 243-249.
[13] N. Trawny, S. I. Roumeliotis, and G. B. Giannakis, "Cooperative multi-robot localization under communication constraints," in IEEE Int. Conf. on Robotics and Automation, 2009, pp. 4394-4400.
[14] J. McLurkin, "Using cooperative robots for explosive ordnance disposal," Massachusetts Institute of Technology Artificial Intelligence Laboratory, 1996, pp. 1-10.
[15] R. Madhavan, K. Fregene, and L. E. Parker, "Distributed heterogeneous outdoor multi-robot localization," in IEEE Int. Conf. on Robotics and Automation, vol. 1, 2002, pp. 374-381.
[16] D. Strelow and S. Singh, "Motion estimation from image and inertial measurements," The International Journal of Robotics Research, vol. 23, no. 12, pp. 1157-1195, 2004.
[17] A. I. Mourikis and S. I. Roumeliotis, "A multi-state constraint Kalman filter for vision-aided inertial navigation," in IEEE Int. Conf. on Robotics and Automation, 2007, pp. 3565-3572.
[18] G. Huang, M. Kaess, and J. J. Leonard, "Towards consistent visual-inertial navigation," in IEEE Int. Conf. on Robotics and Automation, 2014, pp. 4926-4933.
[19] M. Faessler, E. Mueggler, K. Schwabe, and D. Scaramuzza, "A monocular pose estimation system based on infrared LEDs," in IEEE Int. Conf. on Robotics and Automation, 2014, pp. 907-913.
[20] A. Breitenmoser, L. Kneip, and R. Siegwart, "A monocular vision-based system for 6D relative robot localization," in IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, 2011, pp. 79-85.
[21] L. Quan and Z. Lan, "Linear n-point camera pose determination," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 8, pp. 774-780, 1999.
[22] L. Kneip, D. Scaramuzza, and R. Siegwart, "A novel parametrization of the perspective-three-point problem for a direct computation of absolute camera position and orientation," in IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2011, pp. 2969-2976.
[23] A. Ahmad, G. D. Tipaldi, P. Lima, and W. Burgard, "Cooperative robot localization and target tracking based on least squares minimization," in IEEE Int. Conf. on Robotics and Automation, 2013, pp. 5696-5701.
[24] V. Indelman, E. Nelson, N. Michael, and F. Dellaert, "Multi-robot pose graph localization and data association from unknown initial relative poses via expectation maximization," in IEEE Int. Conf. on Robotics and Automation, 2014, pp. 593-600.
[25] A. T. Rashid, M. Frasca, A. A. Ali, A. Rizzo, and L. Fortuna, "Multi-robot localization and orientation estimation using robotic cluster matching algorithm," Robotics and Autonomous Systems, vol. 63, pp. 108-121, 2015.
[26] E. D. Nerurkar, S. I. Roumeliotis, and A. Martinelli, "Distributed maximum a posteriori estimation for multi-robot cooperative localization," in IEEE Int. Conf. on Robotics and Automation, 2009, pp. 1402-1409.
[27] S. I. Roumeliotis and G. A. Bekey, "Distributed multirobot localization," IEEE Transactions on Robotics and Automation, vol. 18, no. 5, pp. 781-795, 2002.
[28] A. Martinelli, F. Pont, and R. Siegwart, "Multi-robot localization using relative observations," in IEEE Int. Conf. on Robotics and Automation, 2005, pp. 2797-2802.
[29] A. Howard, "Multi-robot simultaneous localization and mapping using particle filters," The International Journal of Robotics Research, vol. 25, no. 12, pp. 1243-1256, 2006.
[30] A. Prorok, A. Bahr, and A. Martinoli, "Low-cost collaborative localization for large-scale multi-robot systems," in IEEE Int. Conf. on Robotics and Automation, 2012, pp. 4236-4241.