[IEEE 2013 European Conference on Mobile Robots (ECMR) - Barcelona, Catalonia, Spain (2013.09.25-2013.09.27)] 2013 European Conference on Mobile Robots - Kullback-leibler divergence

Kullback-Leibler Divergence based Graph Pruning in Robotic Feature Mapping

Yue Wang, Rong Xiong, Qianshan Li and Shoudong Huang

Abstract— In pose feature graph simultaneous localization and mapping, the robot poses and feature positions are treated as graph nodes, and the odometry and observations are treated as edges. The size of the graph strongly influences the efficiency of graph optimization. Conventionally, the graph is kept small by discarding the current frame if it is not spatially far enough from the previous one or not informative enough. However, these approaches cannot discard already preserved frames when the robot revisits a previously explored area. We propose a measure derived from the Kullback-Leibler divergence to decide whether a frame should be discarded, yielding an online graph pruning algorithm for feature mapping in which the pruned frame can be any of the preserved frames. Experimental results on real-world datasets show that the proposed pruning algorithm effectively reduces the size of the graph while maintaining map accuracy.

I. INTRODUCTION

Maps are desired in many robotic systems since they enable the robot to localize itself in a complex environment, forming the foundation of higher-level tasks. Map building is solved by simultaneous localization and mapping (SLAM), for which various efficient approaches have been developed in the recent decade. Graph SLAM formulates the problem as a graph optimization to provide a globally consistent solution, which is usually an important final processing step for map building [1] [2] [3] [4]. Among these methods, the graph takes the form of either a pose graph or a pose feature graph. In this paper, we consider the pose feature graph, whose nodes are poses and features, while the edges are observations and, if available, odometry information.

Without pruning, the graph grows quickly, since its size is related to the length of the robot trajectory. As a result, the storage and computational requirements exceed the available capacity for large-scale problems. A fast-growing graph also has a negative effect on loop closure, which searches all acquired frames to find loop constraints. Thus, methods that relate the size of the graph to the size of the mapping area instead of the length of the trajectory are desired for long-term mapping.

This work is supported by the National Nature Science Foundation of China (Grant No. NSFC: 61075078), the Natural Science Foundation of Zhejiang Province (Grant No. LQ12F03009), and the Joint Centre for Robotics Research (JCRR) between Zhejiang University and the University of Technology, Sydney. Yue Wang, Rong Xiong and Qianshan Li are with the State Key Laboratory of Industrial Control and Technology, Zhejiang University, Hangzhou, P. R. China. Shoudong Huang is with the Center for Autonomous Systems, Faculty of Engineering and IT, University of Technology, Sydney, Australia. The authors are all with the JCRR. Rong Xiong is the corresponding author. [email protected]

In this paper, we solve the problem on the pose feature graph using an online pose pruning method. Specifically, a heuristic mechanism is proposed for greedy pruning based on the Kullback-Leibler (KL) divergence between the map generated from the graph containing all the poses and the map generated from a subset of the graph. Inspired by the theoretical analysis, we design a fast and easy implementation that seeks a solution by limiting the number of pruned frames to at most one at each step, so that the redundancy in the map can be reduced online. When a pose is pruned, some features in the map may become isolated and thus be discarded too; this can be controlled by adjusting a parameter in the proposed algorithm.

The remainder of the paper is organized as follows. In Section II, related works are reviewed and discussed. The main contribution of this paper is presented in Section III and its online implementation in Section IV. The experiments are reported in Section V, followed by the conclusion and future work in Section VI.

II. RELATED WORKS

The graph pruning problem was initially studied in the computer vision community under the name keyframe selection, motivated mainly by the intractable volume of video-rate frames. In recent years, inspired by long-term or lifelong mapping, the graph pruning problem has also been studied in the SLAM field. The main focus in SLAM is to create a graph whose size is related to the size of the mapping area instead of the trajectory length. More attention is therefore paid to revisits of explored areas than to consecutive frames, which differs from the original keyframe selection.

In computer vision, inter-frame redundancy is reduced in visual odometry by discarding the current frame when it is not far enough from the previous one [5], which cannot deal with revisits of an explored area. In [6], a decision measure based on the color and wavelet coefficients of a frame is proposed, making use of the texture information. In [7], the keyframe is selected based on the consideration that the features in the selected frame should cover the entire space. Snavely et al. proposed a pruning method that selects skeletal graphs from the full graph to reduce the number of poses [8]. However, the last two methods are both offline methods, requiring construction of the full graph in advance. Thormahlen et al. presented an online method that selects the keyframe with the lowest expected estimation error of initial camera motion and object structure, aiming at a good initial value for bundle adjustment [9]. Eade et al. proposed a complexity reduction method by marginalizing and suppressing edges incident to nodes of high degree [10].

In [5] [11] [12] [13], the lifelong mapping system developed by Willow Garage employs a skeleton pose graph method for pose reduction. Specifically, the skeleton is built by the frontend and then spatially close poses are marginalized out for loop closure efficiency. The long-term experiment shows that graph pruning is meaningful for fast loop closure detection. However, their skeleton map cannot prune previously preserved poses when a loop closure occurs.

In [14], an information-theoretic graph pruning method for SLAM on the grid occupancy map is proposed. The information gain of each scan is regarded as the measure of the pose. Their method deals with the pose graph, in which the map is decoupled from the optimization, leading to an unbiased Chow-Liu tree based approximation that preserves the structure when a pose is pruned. A large complexity reduction is shown in their experiment. Compared to this method, the measure of a pose in this paper is computed in a different way because of the map representation. In the pose feature graph, the features are coupled in the optimization. To preserve the diagonal structure of the information matrix, the edges after pruning are generated selectively.

In [15], a reduced pose graph is proposed to solve the graph pruning problem. Their method registers the newly acquired frame to a node selected from the existing nodes, so it is similar to extending the keyframe selection technique to the loop case to control the size of the graph. Long-term datasets are used in their experiments, showing the efficiency and accuracy of the approach. However, their algorithm still has the limitation of keyframe selection, that is, a previously preserved pose cannot be pruned. As we aim directly at reducing the KL divergence error, our method takes a different perspective from theirs.

III. A COMPUTABLE MEASURE FOR GRAPH PRUNING

In pose feature graph SLAM, the feature positions are denoted as a vector $L$ with each element $l_i$ being the $i$th feature position. The set of poses is denoted as a vector $X$ with each element $x_t$ being the pose at time $t$. Since an observation relates only one pose and one feature, the observation made from pose $x_t$ to feature $l_i$ is denoted as $z_{ti}$. All the observations form an observation set $Z$. If a pose set $X_{prune} \subset X$ is pruned, its corresponding observation set $Z_{prune}$ is also discarded, which reduces the accuracy of the final map. In the sequel, the subset of poses after pruning is denoted as $X^* = X \setminus X_{prune}$ and the subset of observations after pruning is denoted as $Z^* = Z \setminus Z_{prune}$.
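As an illustration of this notation, the observation set can be held as a mapping from pose index to the features seen from that pose. The sketch below is not the authors' implementation; the toy graph and the names `observations`, `prune_poses` and `observation_counts` are hypothetical and chosen only for this example.

```python
from collections import Counter

# Hypothetical toy graph: observations[t] is the set of feature ids i
# for which an observation z_ti exists (pose x_t sees feature l_i).
observations = {
    0: {"a", "b"},
    1: {"b", "c"},
    2: {"c"},
}

def prune_poses(observations, pruned):
    """Return Z*: drop every observation made from a pose in X_prune."""
    return {t: feats for t, feats in observations.items() if t not in pruned}

def observation_counts(observations):
    """Count how many poses observe each feature."""
    return Counter(f for feats in observations.values() for f in feats)

remaining = prune_poses(observations, pruned={0})
full_counts = observation_counts(observations)
sub_counts = observation_counts(remaining)
# Features that no longer have any observation after pruning pose 0.
lost = set(full_counts) - set(sub_counts)
```

Pruning pose 0 in this toy graph removes the only observation of feature `a`, which is exactly the situation in which a feature becomes isolated.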

A. A Measure based on Kullback-Leibler (KL) divergence

The map $L$ is a random vector. Its distribution is determined by the observations $Z$ recorded at poses $X$, or by the observations $Z^*$ recorded at poses $X^*$ if the poses in $X_{prune}$ are pruned. Measuring the difference between these two maps is therefore equivalent to measuring the difference between the two probability distributions. The KL divergence is a quasi-distance on probability distributions and is employed for such a task in many problems. In this paper, we use this measure for the graph pruning problem, that is

$$KL = \int p(L \mid X, Z) \log \frac{p(L \mid X, Z)}{p(L \mid X^*, Z^*)} \, dL \tag{1}$$

where $p(L \mid X, Z)$ is the probability density of the map estimated from all information so far, while $p(L \mid X^*, Z^*)$ is generated by the subsets $X^*$ and $Z^*$. The pruning problem is to find a subset $X^*$ with fewer poses than $X$, such that the map generated by $X^*$ and its corresponding $Z^*$ is similar to the map generated by $X$ and $Z$, i.e. with KL divergence close to zero. By analyzing the measure in (1), we identify the key factors that affect the KL divergence, which are used later as a heuristic mechanism for graph pruning.
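To make the measure in (1) concrete, the sketch below evaluates the KL divergence in closed form for two isotropic Gaussians. It is only an illustration under the assumption that both densities are Gaussian with covariance proportional to the identity; the function name `kl_isotropic_gaussians` is hypothetical and is not taken from the authors' code.

```python
import math

def kl_isotropic_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(p || q) in nats for p = N(mu_p, var_p * I) and q = N(mu_q, var_q * I)."""
    n = len(mu_p)
    sq_dist = sum((a - b) ** 2 for a, b in zip(mu_p, mu_q))
    return 0.5 * (n * var_p / var_q + sq_dist / var_q - n
                  + n * math.log(var_q / var_p))

# Identical densities: the divergence vanishes.
same = kl_isotropic_gaussians([0.0, 0.0], 1.0, [0.0, 0.0], 1.0)
# Same mean, but q has twice the variance of p (as if half the observations were lost).
wider = kl_isotropic_gaussians([0.0, 0.0], 0.25, [0.0, 0.0], 0.5)
```

The second call mimics a feature whose estimate becomes less certain after pruning: the means agree, yet the divergence is strictly positive.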

B. Decomposition of the Measure

Expanding the KL divergence given in (1), we have

$$KL = \int p(L \mid X, Z) \log p(L \mid X, Z) \, dL - \int p(L \mid X, Z) \log p(L \mid X^*, Z^*) \, dL \tag{2}$$

Since the first term is a constant, only the second term needs analysis, which is denoted as $Q$

$$Q \triangleq \int p(L \mid X, Z) \log p(L \mid X^*, Z^*) \, dL \tag{3}$$

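Applying the conditional independence of features gives

$$
\begin{aligned}
Q &= \int \prod_j p(l_j \mid X, Z) \log \prod_i p(l_i \mid X^*, Z^*) \, dL \\
&= \sum_i \int \prod_j p(l_j \mid X, Z) \log p(l_i \mid X^*, Z^*) \, dL \\
&= \sum_i \int p(l_i \mid X, Z) \log p(l_i \mid X^*, Z^*) \, dl_i
\end{aligned} \tag{4}
$$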

Now the only term that varies with the pruning is $\log p(l_i \mid X^*, Z^*)$, which is the log probability density of the $i$th feature generated from the subset.

In SLAM, the posterior probability density of a feature is commonly modeled as a Gaussian distribution

$$p(l_i \mid X^*, Z^*) \sim N(\mu_i, \Sigma_i) \tag{5}$$

where $\mu_i$ is the least squares solution given all the poses and observations in $X^*$ and $Z^*$, and $\Sigma_i$ is the covariance, equal to the $i$th $n \times n$ diagonal block of $(J^T \Sigma_Z^{-1} J)^{-1}$ [16]. Here $\Sigma_Z$ is the covariance matrix of all observations, $J$ is the Jacobian of the observations w.r.t. the features, and $n$ is the dimension of a feature.

It is possible that, for a particular feature, all observations of it are removed by the pose pruning, so that this feature cannot be estimated using $X^*$ and $Z^*$. In this case, we model the posterior as a uniform prior.


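Summarizing the results above, the complete model of the second factor of the integrand in (4) is

$$
\log p(l_i \mid X^*, Z^*) =
\begin{cases}
\log \gamma_i - \dfrac{C}{2} (l_i - \mu_i)^T \Sigma_i^{-1} (l_i - \mu_i), & l_i \in \Theta \\
\log p(l_i), & l_i \in \neg\Theta
\end{cases} \tag{6}
$$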

where

$$\gamma_i = \frac{1}{(2\pi)^{n/2} |\Sigma_i|^{1/2}} \tag{7}$$

is the normalizer of the density, $C = \log e$, $\Theta$ is the set of features with observations in $Z^*$, $\neg\Theta$ is the set of features whose observations are not in $Z^*$ ($L = \Theta + \neg\Theta$), and $p(l_i)$ is the uniform prior.

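Substituting (6) into (4) leads to

$$
\begin{aligned}
Q = {} & \sum_{l_i \in \Theta} \log \gamma_i - \sum_{l_i \in \Theta} \frac{C}{2} E\{(l_i - \mu_i)^T \Sigma_i^{-1} (l_i - \mu_i) \mid X, Z\} \\
& - \sum_{l_i \in \neg\Theta} \left( -\int \log p(l_i) \, p(l_i \mid X, Z) \, dl_i \right)
\end{aligned} \tag{8}
$$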

where $E\{\cdot\}$ is the expectation operator.

As $p(l_i)$ is a uniform distribution, the third term in (8) can be defined as

$$H \triangleq -\int \log p(l_i) \, p(l_i \mid X, Z) \, dl_i \tag{9}$$

It is a constant information entropy. It follows directly that maximizing $Q$ requires decreasing the size of $\neg\Theta$, meaning that as many features as possible should be kept during the pruning.

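The conditional expectation in (8) is computed as

$$E\{(l_i - \mu_i)^T \Sigma_i^{-1} (l_i - \mu_i) \mid X, Z\} = \mathrm{tr}\{\Sigma_i^{-1} \bar{\Sigma}_i\} + (\bar{\mu}_i - \mu_i)^T \Sigma_i^{-1} (\bar{\mu}_i - \mu_i) \tag{10}$$

where $\bar{\mu}_i$ and $\bar{\Sigma}_i$ are the mean and covariance of the Gaussian distribution $p(l_i \mid X, Z)$.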
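The identity in (10) is the standard expectation of a quadratic form under a Gaussian. A short derivation is sketched below, writing $\bar{\mu}_i$ and $\bar{\Sigma}_i$ for the mean and covariance of $p(l_i \mid X, Z)$ (the bar notation is a choice made for this sketch) and taking all expectations with respect to that density:

```latex
\begin{aligned}
E\{(l_i-\mu_i)^T \Sigma_i^{-1} (l_i-\mu_i)\}
 &= E\{\mathrm{tr}[\Sigma_i^{-1} (l_i-\mu_i)(l_i-\mu_i)^T]\} \\
 &= \mathrm{tr}\{\Sigma_i^{-1} E[(l_i-\mu_i)(l_i-\mu_i)^T]\} \\
 &= \mathrm{tr}\{\Sigma_i^{-1} [\bar{\Sigma}_i + (\bar{\mu}_i-\mu_i)(\bar{\mu}_i-\mu_i)^T]\} \\
 &= \mathrm{tr}\{\Sigma_i^{-1} \bar{\Sigma}_i\}
    + (\bar{\mu}_i-\mu_i)^T \Sigma_i^{-1} (\bar{\mu}_i-\mu_i)
\end{aligned}
```

The third line uses $E[(l_i-\mu_i)(l_i-\mu_i)^T] = \bar{\Sigma}_i + (\bar{\mu}_i-\mu_i)(\bar{\mu}_i-\mu_i)^T$, which follows from writing $l_i - \mu_i = (l_i - \bar{\mu}_i) + (\bar{\mu}_i - \mu_i)$.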

The first term in (8) is computed as

$$\sum_{l_i \in \Theta} \log \gamma_i = \tilde{C} |\Theta| - \frac{1}{2} \sum_{l_i \in \Theta} \log |\Sigma_i| \tag{11}$$

where $\tilde{C} = -\log (2\pi)^{n/2}$.

Now the measure is decomposed. The terms that need computation are (11) and (10), which are analyzed in the following.

C. Covariance Computation

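The second term in (11) involves the determinant of $\Sigma_i$, which is the $i$th $n \times n$ diagonal block of $(J^T \Sigma_Z^{-1} J)^{-1}$. Since the poses are assumed to be fixed, $J$ has the following structure

$$J = \begin{pmatrix} J_1 & 0 & \cdots \\ 0 & J_2 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} \tag{12}$$

where $J_i$ is

$$J_i = \frac{\partial \{z_{ti}\}}{\partial l_i} \tag{13}$$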

where $\{z_{ti}\}$ is the set of all observations of feature $l_i$. In SLAM, an observation is expressed as

$$z_{ti} = R_t^T (l_i - T_t) \tag{14}$$

where $R_t$ and $T_t$ are the rotation matrix and translation vector of the pose at time $t$, respectively. Its partial derivative is

$$\frac{\partial z_{ti}}{\partial l_i} = R_t^T \tag{15}$$

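We then have the following diagonal structure

$$J^T \Sigma_Z^{-1} J = \begin{pmatrix} J_1^T \Sigma_{Z_1}^{-1} J_1 & 0 & \cdots \\ 0 & J_2^T \Sigma_{Z_2}^{-1} J_2 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} \tag{16}$$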

where $\Sigma_{Z_i}$ is the block covariance of all observations generated by feature $l_i$. Now assuming $\Sigma_Z^{-1}$ to be spherical, i.e. $\Sigma_Z^{-1} = kI$, a closed form of $\Sigma_i$ is

$$\Sigma_i = \frac{1}{k O_i} I, \qquad |\Sigma_i| = \frac{1}{(k O_i)^n} \tag{17}$$

where $O_i$ is the number of observations of feature $l_i$ in $Z^*$. Equation (17) means that the uncertainty of a feature is inversely proportional to the number of observations: the more observations, the better the estimate, which is consistent with common sense.
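The closed form in (17) can be checked numerically. The sketch below assumes planar features ($n = 2$) and arbitrary pose headings; it accumulates $J_t^T (kI) J_t$ with $J_t = R_t^T$ from (15) and confirms that the result is $k O_i I$. The helper names `rotation` and `feature_information` and the numeric values are hypothetical.

```python
import math

def rotation(theta):
    """2-D rotation matrix R for heading theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def feature_information(headings, k):
    """Sum over observing poses of J_t^T (k I) J_t with J_t = R_t^T,
    which equals k * sum_t R_t R_t^T."""
    info = [[0.0, 0.0], [0.0, 0.0]]
    for theta in headings:
        r = rotation(theta)
        for a in range(2):
            for b in range(2):
                # (R R^T)[a][b] = sum_c R[a][c] * R[b][c]
                info[a][b] += k * sum(r[a][c] * r[b][c] for c in range(2))
    return info

headings = [0.1, 1.3, -2.0]    # three observing poses, so O_i = 3
k = 4.0
info = feature_information(headings, k)
sigma_diag = 1.0 / info[0][0]  # the information is a multiple of I, so invert a scalar
```

Because every $R_t R_t^T$ is the identity, the pose headings cancel and only the count of observations remains, which is why the later computation reduces to counting.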

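Substituting (17) into (11), we have

$$\sum_{l_i \in \Theta} \log \gamma_i = \tilde{C} |\Theta| + \frac{n}{2} \sum_{l_i \in \Theta} \log k O_i \tag{18}$$

where $\tilde{C} = -\frac{n}{2} \log(2\pi)$ is the constant of (11). Note that the computation now only requires counting the number of observations and features.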

D. Expectation Computation

The first term in (10) can be computed as follows

$$\mathrm{tr}\{\Sigma_i^{-1} \bar{\Sigma}_i\} = \frac{n O_i}{\bar{O}_i} \tag{19}$$

where $\bar{O}_i$ is the number of observations of feature $l_i$ in $Z$.

The second term in (10) is the difference in mean value caused by the pruned poses. Since the estimator of the feature is a linear least squares estimator given the poses and observations, the estimate should be unbiased, i.e. $\bar{\mu}_i = \mu_i$, making the second term equal to 0. However, a bias may exist in practice due to the very limited number of observations.

Now the expectation term has the form

$$E\{(l_i - \mu_i)^T \Sigma_i^{-1} (l_i - \mu_i) \mid X, Z\} = \frac{n O_i}{\bar{O}_i} \tag{20}$$

This agrees with the consistency of a linear least squares estimator: the covariance decreases as the number of observations increases, and the estimate converges to the true value.


E. Computable Measure

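Putting the derived results together, we have the computable measure $\hat{Q}$ as

$$\hat{Q} = \tilde{C} |\Theta| + \frac{n}{2} \sum_{l_i \in \Theta} \left( \log k O_i - \frac{C O_i}{\bar{O}_i} \right) - \sum_{l_i \in \neg\Theta} H \tag{21}$$

where $\tilde{C} = -\log (2\pi)^{n/2}$ and $C = \log e$. Its difference from $Q$, introduced by assuming the covariance matrix to be spherical, is denoted as $\epsilon$, so that $Q = \hat{Q} + \epsilon$.

Note that the measure $Q$ is computed using the subsets $Z^*$ and $X^*$. Denote the original measure computed using the full sets $Z$ and $X$ as $Q_{full}$ and the corresponding computable measure as $\hat{Q}_{full}$.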

Recalling (2), since the KL divergence between the PDF obtained from the full set and itself is 0, we have

$$Q_{full} = \int p(L \mid X, Z) \log p(L \mid X, Z) \, dL \tag{22}$$

It then leads to

$$KL = Q_{full} - Q = \hat{Q}_{full} - \hat{Q} + \epsilon_{full} - \epsilon \tag{23}$$

where $\epsilon_{full}$ is the difference in the case of the full set. Define

$$\Delta Q \triangleq \hat{Q}_{full} - \hat{Q} \tag{24}$$

It becomes a computable measure that approximates the KL divergence between the map generated by the full set and the map generated by the subset.
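The computable measure depends only on observation counts, which the sketch below illustrates by evaluating (21) with natural logarithms ($C = 1$) and forming $\Delta Q$ as the difference in (24). The function name `q_hat`, the toy counts, and the default values for $k$, the additive constant `c_tilde` and the entropy constant `h` are hypothetical; they stand in for constants that the text treats as fixed.

```python
import math

def q_hat(sub_counts, full_counts, n=2, k=1.0, c_tilde=0.0, h=1.0):
    """Computable measure of Eq. (21) with natural logs (C = 1).

    sub_counts[f]  : O_i, observations of feature f kept in Z*
    full_counts[f] : number of observations of feature f in the full set Z
    """
    kept = [f for f in full_counts if sub_counts.get(f, 0) > 0]
    lost = [f for f in full_counts if sub_counts.get(f, 0) == 0]
    total = c_tilde * len(kept) - h * len(lost)
    for f in kept:
        o, o_full = sub_counts[f], full_counts[f]
        total += 0.5 * n * (math.log(k * o) - o / o_full)
    return total

full_counts = {"a": 1, "b": 2, "c": 2}
sub_counts = {"b": 1, "c": 2}   # after pruning the pose that saw features a and b
delta_q = q_hat(full_counts, full_counts) - q_hat(sub_counts, full_counts)
```

Passing the full counts as both arguments gives $\hat{Q}_{full}$, so no separate function is needed for the full set.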

IV. ONLINE IMPLEMENTATION

In this section, we present the greedy pruning algorithm based on the theoretical results derived above. The greedy strategy is adopted for the purpose of online computation. The number of pruned poses is limited to at most one, which means that only one pose may be pruned at each time, $|X| - |X^*| \le 1$.

A. Online computation

Note that the set of features observed by the pruned pose consists of two parts, presented below with their notations:

∙ features that are also observed by other poses in the subset, denoted as $\Upsilon$, $\Upsilon \subset \Theta$

∙ features that are only observed by the pruned pose, denoted as $\Psi$, $\Psi = \neg\Theta$.

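The computable measure of the full pose set, $\hat{Q}_{full}$, is

$$\hat{Q}_{full} = \tilde{C} |\Theta + \Psi| + \frac{n}{2} \sum_{l_i \in \Theta + \Psi} \log k \bar{O}_i - \sum_{l_i \in \Theta + \Psi} \frac{nC}{2} \tag{25}$$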

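Then the $\Delta Q$ introduced above is expressed as follows

$$
\begin{aligned}
\Delta Q = {} & \tilde{C} |\Psi| + \sum_{l_i \in \Psi} H \\
& + \frac{n}{2} \left( \sum_{l_i \in \Theta + \Psi} \log k \bar{O}_i - \sum_{l_i \in \Theta} \log k O_i \right) \\
& + \sum_{l_i \in \Theta} \frac{n C O_i}{2 \bar{O}_i} - \sum_{l_i \in \Theta + \Psi} \frac{nC}{2}
\end{aligned} \tag{26}
$$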

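Since pruning one pose removes exactly one observation of each feature it sees, $\bar{O}_i = O_i + 1$ for $l_i \in \Upsilon$, $\bar{O}_i = 1$ for $l_i \in \Psi$, and $\bar{O}_i = O_i$ for the remaining features. For the terms in the above equation, we therefore have

$$\sum_{l_i \in \Theta + \Psi} \log k \bar{O}_i - \sum_{l_i \in \Theta} \log k O_i = \sum_{l_i \in \Upsilon} \log \frac{O_i + 1}{O_i} + |\Psi| \log k \tag{27}$$

and

$$\sum_{l_i \in \Theta} \frac{n C O_i}{2 \bar{O}_i} - \sum_{l_i \in \Theta + \Psi} \frac{nC}{2} = -\sum_{l_i \in \Upsilon} \frac{nC}{2 (O_i + 1)} - |\Psi| \frac{nC}{2} \tag{28}$$

Substituting (27) and (28) into (26), the final measure of the pruned pose $\Delta Q$ is given as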

$$\Delta Q = \alpha |\Psi| + \frac{n}{2} \sum_{l_i \in \Upsilon} \left( \log \frac{O_i + 1}{O_i} - \frac{1}{O_i + 1} \right) \tag{29}$$

where $\alpha$ is a constant merging all constants related to $|\Psi|$. For brevity, we set the base of the logarithm to $e$, so that $C = 1$. One can see that as the number of observations increases, the information carried by a single observation decreases, in accordance with intuition.
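A minimal sketch of (29) is given below, assuming natural logarithms, planar features ($n = 2$) and a graph stored as a mapping from pose id to the set of observed feature ids. The function name `delta_q` and the toy graph are hypothetical; the default `alpha=0.1` is the value reported in the experiments.

```python
import math

def delta_q(pose, observations, alpha=0.1, n=2):
    """Eq. (29): cost of pruning `pose`, with natural logarithms (C = 1)."""
    counts = {}
    for feats in observations.values():
        for f in feats:
            counts[f] = counts.get(f, 0) + 1
    total = 0.0
    for f in observations[pose]:
        o = counts[f] - 1      # O_i: observations left in Z* after pruning
        if o == 0:
            total += alpha     # feature in Psi: it is lost with the pose
        else:                  # feature in Upsilon: still seen by other poses
            total += 0.5 * n * (math.log((o + 1) / o) - 1.0 / (o + 1))
    return total

observations = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c"}}
costs = {pose: delta_q(pose, observations) for pose in observations}

def per_feature_term(o):
    """Contribution of one feature in Upsilon for n = 2, as a function of O_i."""
    return math.log((o + 1) / o) - 1.0 / (o + 1)
```

Evaluating `per_feature_term` for growing `o` shows the decreasing contribution of an additional observation noted above.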

In application, $\alpha$ in (29) is regarded as an algorithm parameter measuring the contribution of the features in $\Psi$. In theory, the probability densities of an observed feature and an unobserved feature are very distant in terms of KL divergence, leading to a very large $\alpha$. However, the loss of a feature in the map will not cause a big problem in some scenarios. In visual mapping especially, the number of features in a frame is typically in the hundreds. The value of $\alpha$ should therefore be application dependent, and we propose that it be set by the user to indicate how serious the loss of a feature is. In practice, the features given by the frontend are almost always related to more than one observation, which makes the result insensitive to $\alpha$.

As the difference between $X$ and $X^*$ is at most one pose, (29) becomes a very simple measure of the pruned pose's information. When a new frame arrives, the $\Delta Q$ of some poses requires updating since the number of observations may change. Note that the poses requiring a $\Delta Q$ update are only those that have observations of the features in $\Upsilon$, which reduces the computational time to constant time.

B. Edge reservation

In the implementation, when a pose is pruned, its edges connecting to its predecessor and successor (e.g. odometry data) are merged to keep part of the information contained in the pruned pose. Denoting the three ordered poses as $i$, $j$ and $k$, there are two edges $e_{ij}$ and $e_{jk}$. The pruning of pose $j$ leads to a merging of $e_{ij}$ and $e_{jk}$. The mean can be computed as

$$\mu_{ik} = f(e_{ij}, e_{jk}) = e_{ij} \oplus e_{jk} \tag{30}$$

where $\oplus$ is the standard motion operator [17]. Its covariance matrix can be computed through linearization

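$$\Sigma_{ik} = \frac{\partial f}{\partial e_{ij}} \Sigma_{ij} \frac{\partial f}{\partial e_{ij}}^T + \frac{\partial f}{\partial e_{jk}} \Sigma_{jk} \frac{\partial f}{\partial e_{jk}}^T \tag{31}$$

using the nonlinear error propagation.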
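The merging in (30) and (31) can be sketched for the planar case. The code below assumes each edge is a relative pose $(x, y, \theta)$ with a $3 \times 3$ covariance and uses the usual compounding Jacobians for such poses; the function names and numeric values are hypothetical, and angle wrapping is deliberately left out.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def merge_edges(e_ij, cov_ij, e_jk, cov_jk):
    """Compose two planar relative poses and propagate covariance to first order."""
    x1, y1, t1 = e_ij
    x2, y2, t2 = e_jk
    c, s = math.cos(t1), math.sin(t1)
    # Eq. (30): express e_jk in the frame of pose i (theta is not wrapped here).
    mean = (x1 + c * x2 - s * y2, y1 + s * x2 + c * y2, t1 + t2)
    # Jacobians of the composition w.r.t. e_ij and e_jk.
    j1 = [[1.0, 0.0, -s * x2 - c * y2],
          [0.0, 1.0, c * x2 - s * y2],
          [0.0, 0.0, 1.0]]
    j2 = [[c, -s, 0.0],
          [s, c, 0.0],
          [0.0, 0.0, 1.0]]
    # Eq. (31): sum of the two propagated covariances.
    p1 = matmul(matmul(j1, cov_ij), transpose(j1))
    p2 = matmul(matmul(j2, cov_jk), transpose(j2))
    cov = [[p1[r][q] + p2[r][q] for q in range(3)] for r in range(3)]
    return mean, cov

odo_cov = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.001]]
mean, cov = merge_edges((1.0, 0.0, math.pi / 2), odo_cov, (1.0, 0.0, 0.0), odo_cov)
```

In this example the heading uncertainty of the first edge leaks into the position uncertainty of the merged edge, which is the effect the Jacobian with respect to $e_{ij}$ captures.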


C. Algorithm

The discussion above can be summarized as follows. First, the computational complexity of updating the contribution $\Delta Q$ when a new frame arrives is roughly constant, as derived in Subsection IV-A, which enables online computation. Second, the information contained in the pruned pose can be partially preserved, as explained in Subsection IV-B. We now present two implementation schemes (Prune 1 and Prune 2) designed based on these facts:

∙ When a new pose arrives, add the corresponding edges to the graph

– Update the $\Delta Q$ of the poses that have measured the features in $\Upsilon$ of the new frame using (29).

– Set $\Delta Q_{sel}$ to the minimum $\Delta Q$ in the set excluding the new pose (Prune 1), or set $\Delta Q_{sel}$ to the $\Delta Q$ of the most recent pose (Prune 2).

– If $\Delta Q_{sel}$ is smaller than the $threshold$:

∗ Marginalize the pose using the edge merging introduced in Subsection IV-B.

∗ Update the $\Delta Q$ of the poses that have observed the features in $\Upsilon$ of the pruned pose using (29).

– Else ($\Delta Q_{sel}$ is larger than the $threshold$): no pruning.

∙ Finish

One can see that there are two choices for setting $\Delta Q_{sel}$. First, the pruned pose is selected from all the preserved poses, which is denoted as Prune 1. Second, the pruned pose is restricted to the most recent one, which we call Prune 2. Prune 2 is similar to a traditional keyframe selection technique. Both are tested in the experiments.
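The two schemes can be sketched as a small class. This is a simplified illustration, not the published code: it recomputes $\Delta Q$ on demand instead of updating it incrementally, omits the edge merging of Subsection IV-B, and interprets "the most recent pose" in Prune 2 as the newest preserved pose before the incoming one, which is an assumption. The class name `OnlinePruner` and its attributes are hypothetical; the defaults `alpha=0.1` and `threshold=0.2` are the values reported in the experiments.

```python
import math

class OnlinePruner:
    """Toy version of the online pruning loop; edge merging is left out."""

    def __init__(self, alpha=0.1, threshold=0.2, n=2, mode="prune1"):
        self.alpha = alpha          # cost of losing a feature (set Psi)
        self.threshold = threshold  # prune when the selected cost is below this
        self.n = n                  # feature dimension
        self.mode = mode            # "prune1" or "prune2"
        self.obs = {}               # pose id -> set of observed feature ids
        self.counts = {}            # feature id -> number of preserved observers

    def delta_q(self, pose):
        """Eq. (29) with natural logarithms, evaluated from the current counts."""
        total = 0.0
        for f in self.obs[pose]:
            o = self.counts[f] - 1  # observations left if `pose` were pruned
            if o == 0:
                total += self.alpha
            else:
                total += 0.5 * self.n * (math.log((o + 1) / o) - 1.0 / (o + 1))
        return total

    def add_pose(self, pose, features):
        """Insert a new pose, then prune at most one older pose.

        Returns the id of the pruned pose, or None if nothing was pruned."""
        candidates = list(self.obs)  # preserved poses, oldest first
        self.obs[pose] = set(features)
        for f in self.obs[pose]:
            self.counts[f] = self.counts.get(f, 0) + 1
        if not candidates:
            return None
        if self.mode == "prune1":
            selected = min(candidates, key=self.delta_q)  # search all preserved poses
        else:
            selected = candidates[-1]                     # assumed: newest older pose
        if self.delta_q(selected) < self.threshold:
            for f in self.obs.pop(selected):
                self.counts[f] -= 1
            return selected
        return None

pruner = OnlinePruner(mode="prune1")
pruned = [pruner.add_pose(t, {"a", "b"}) for t in range(3)]
```

In this toy run three poses observe the same two features; once the third arrives, the features are seen often enough that one redundant pose falls below the threshold and is removed.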

V. EXPERIMENTAL RESULTS

In this section, we present the performance of the proposed pruning algorithm in experiments. The two datasets employed are collected from the real world: the DLR dataset [18] and the Victoria Park dataset [19]. The optimal solution is computed by optimizing the full set without pruning using a Gauss-Newton algorithm implemented in Matlab. Prune 1 and Prune 2, introduced in Subsection IV-C, are both evaluated. The final subset after pruning is fed into the same Gauss-Newton optimizer to yield the final map. The computer used has an E7300 2.66 GHz CPU and 4 GB RAM.

The results for the two datasets using the full set, the subset obtained through Prune 1 and the subset obtained through Prune 2 are shown in Fig. 1 and Fig. 2. One can see that the features estimated using the subsets almost overlap those estimated using the full set.

The total number of preserved poses at each step is shown in Fig. 3. The number of poses increases more slowly on the Victoria Park dataset than on the DLR dataset. In the latter, the robot goes around the building once, while in the former the robot goes through the park many times, meaning that the number of loops in the Victoria Park dataset is much larger than that in the DLR dataset. Many poses in the Victoria Park dataset are therefore at similar places, indicating a higher redundancy, which explains the trend of the evolution curves. This result reflects that the size of the graph now relates to the size of the mapping area instead of the length of the trajectory. When the trajectory has many loops, Prune 1, which can discard any preserved pose, is more effective than Prune 2.

Fig. 1. The results of DLR (left) and Victoria Park (right) obtained using full set, subset of Prune 1 and subset of Prune 2 with the identity covariance. (Axes: X/m, Y/m; legend: Full, Prune 1, Prune 2.)

Fig. 2. The results of DLR (left) and Victoria Park (right) obtained using full set, subset of Prune 1 and subset of Prune 2 with the original covariance. (Axes: X/m, Y/m; legend: Full, Prune 1, Prune 2.)

Fig. 3. The running time for Prune 1 and the number of preserved poses vs. the number of frames for DLR (left) and Victoria Park (right). (Panels: time per frame/s vs. # frame; # reserved poses vs. # frame, with legend Full, Prune 1, Prune 2.)

The running time of Prune 1 at each time step is shown in Fig. 3. One can see that the time fluctuates around a constant value, as desired. The fluctuation is due to the different number of features a pose can observe at different time steps. The running time on the Victoria Park dataset is higher than that on the DLR dataset because a pose in the Victoria Park dataset usually observes more features.

The statistics of the two datasets are shown in Table I, including the KL divergence, the number of features, the number of poses, and the computational time for optimization. KL(IC) denotes the KL divergence between the results from the pruned set and the full set using the identity covariance, while KL(OC) denotes the same quantity using the original covariance. Time here means the time consumed for optimization with the identity covariance. Because we use the result with the identity covariance as the initial value for optimization with the original covariance, the optimization for the latter is much faster, and its time is not listed here. Since the proposed greedy algorithm is derived from the KL divergence, we use this measure to evaluate the performance. In the case of the identity covariance, we scale the obtained covariance by a number equal to the order of magnitude of the mean of the diagonal entries in the original covariance matrix to compute the KL divergence. One can see that Prune 1 has better performance in most cases with fewer preserved poses, which also leads to a shorter optimization time. These benefits are owed to the global search range of Prune 1. Besides, both pruning schemes reduce the optimization time compared to that using the whole dataset, which is the main advantage of the pruning. Though the poses and features in the Victoria Park dataset are fewer, the time consumed for optimization is longer, which is due to more observations. By choosing $\alpha = 0.1$ and $threshold = 0.2$ in our implementation, both Prune 1 and Prune 2 give zero loss in terms of the number of features. By adjusting $threshold$, the size of the graph can be controlled to achieve a balance between accuracy and optimization time.

Besides, the pruning algorithm is derived under the assumption of identity covariance, so the result in Fig. 1 is better. Nevertheless, the optimized result and computational time remain acceptable for the case of the original covariance. This is mainly because the noise in the observations is usually independent and identically distributed.

TABLE I

RESULTS OF THE DLR DATASET AND VICTORIA PARK DATASET

Dataset        Method    KL(IC)   KL(OC)  features  poses  time(s)
DLR            full set  -        -       549       3298   598
DLR            Prune 1   4.0087   3.0077  549       1072   53
DLR            Prune 2   6.8003   1.8406  549       1665   115
Victoria Park  full set  -        -       299       6898   9744
Victoria Park  Prune 1   10.506   1.9764  299       607    46
Victoria Park  Prune 2   12.3201  6.0689  299       1166   131
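The reductions implied by Table I can be read off with a few lines of arithmetic. The sketch below copies the pose counts and optimization times from the table and computes, for each pruning scheme, the fraction of poses kept and the ratio of full-set to pruned optimization time; the names `table` and `summarize` are hypothetical.

```python
# (poses, optimization time in seconds) taken from Table I.
table = {
    "DLR": {"full set": (3298, 598), "Prune 1": (1072, 53), "Prune 2": (1665, 115)},
    "Victoria Park": {"full set": (6898, 9744), "Prune 1": (607, 46), "Prune 2": (1166, 131)},
}

def summarize(table):
    """Map (dataset, scheme) to (fraction of poses kept, optimization speed-up)."""
    rows = {}
    for dataset, runs in table.items():
        full_poses, full_time = runs["full set"]
        for name, (poses, seconds) in runs.items():
            if name == "full set":
                continue
            rows[(dataset, name)] = (poses / full_poses, full_time / seconds)
    return rows

rows = summarize(table)
for (dataset, name), (kept, speedup) in sorted(rows.items()):
    print(f"{dataset:14s} {name}: {kept:.1%} of poses kept, {speedup:.1f}x faster")
```

The times compared here are those of the optimization with the identity covariance, as listed in the table.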

The code of the pruning algorithm and the datasets used in the experiment section are published at http://services.eng.uts.edu.au/~sdhuang/research.htm.

VI. CONCLUSION

In this paper, we propose an algorithm for online graph pruning in feature mapping. Starting from the similarity measure based on the KL divergence between two pdfs, i.e. maps, a very simple measure of the contribution of each pose is derived by applying the assumption that the observation covariance is spherical. We then present an online implementation of the pruning algorithm, whose computational burden at each time step fluctuates around a constant. Finally, the proposed algorithm is applied to real-world datasets and satisfactory performance is achieved. In the future, we want to further study the effect of the spherical covariance assumption, extend the method to the context of pose graphs and vision SLAM, and investigate the difference between keyframe selection techniques and optimization-driven techniques. Our final goal is to realize a long-term mapping and navigation system.

REFERENCES

[1] P. Newman, G. Sibley, M. Smith, M. Cummins, A. Harrison, C. Mei, I. Posner, R. Shade, D. Schroeter, L. Murphy, et al., “Navigating, recognizing and describing urban spaces with vision and lasers,” The International Journal of Robotics Research, vol. 28, no. 11-12, pp. 1406–1433, 2009.

[2] S. Thrun and M. Montemerlo, “The graph slam algorithm with applications to large-scale mapping of urban structures,” The International Journal of Robotics Research, vol. 25, no. 5-6, pp. 403–429, 2006.

[3] G. Grisetti, R. Kummerle, C. Stachniss, and W. Burgard, “A tutorial on graph-based slam,” Intelligent Transportation Systems Magazine, IEEE, vol. 2, no. 4, pp. 31–43, 2010.

[4] R. Kummerle, G. Grisetti, H. Strasdat, K. Konolige, and W. Burgard, “G2o: A general framework for graph optimization,” in Robotics and Automation (ICRA), 2011 IEEE International Conference on. IEEE, 2011, pp. 3607–3613.

[5] K. Konolige, M. Agrawal, and J. Sola, “Large-scale visual odometry for rough terrain,” Robotics Research, pp. 201–212, 2011.

[6] C. Dhawale and S. Jain, “A novel approach towards keyframe selection for video summarization,” Asian Journal of Information Technology, vol. 7, no. 4, pp. 133–137, 2008.

[7] Z. Dong, G. Zhang, J. Jia, and H. Bao, “Keyframe-based real-time camera tracking,” in Computer Vision, 2009 IEEE 12th International Conference on. IEEE, 2009, pp. 1538–1545.

[8] N. Snavely, S. Seitz, and R. Szeliski, “Skeletal graphs for efficient structure from motion,” in Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2008, pp. 1–8.

[9] T. Thormahlen, H. Broszio, and A. Weissenfeld, “Keyframe selection for camera motion and structure estimation from multiple views,” Computer Vision-ECCV 2004, pp. 523–535, 2004.

[10] E. Eade, P. Fong, and M. Munich, “Monocular graph slam with complexity reduction,” in Intelligent Robots and Systems (IROS), 2010 IEEE/RSJ International Conference on. IEEE, 2010, pp. 3017–3024.

[11] K. Konolige and M. Agrawal, “Frameslam: From bundle adjustment to real-time visual mapping,” Robotics, IEEE Transactions on, vol. 24, no. 5, pp. 1066–1077, 2008.

[12] K. Konolige, J. Bowman, J. Chen, P. Mihelich, M. Calonder, V. Lepetit, and P. Fua, “View-based maps,” The International Journal of Robotics Research, vol. 29, no. 8, pp. 941–957, 2010.

[13] K. Konolige and J. Bowman, “Towards lifelong visual maps,” in Intelligent Robots and Systems, 2009. IROS 2009. IEEE/RSJ International Conference on. IEEE, 2009, pp. 1156–1163.

[14] H. Kretzschmar and C. Stachniss, “Information-theoretic compression of pose graphs for laser-based slam,” The International Journal of Robotics Research, vol. 31, no. 11, pp. 1219–1230, 2012.

[15] H. Johannsson, M. Kaess, M. Fallon, and J. Leonard, “Temporally scalable visual SLAM using a reduced pose graph,” in IEEE Intl. Conf. on Robotics and Automation, ICRA, Karlsruhe, Germany, May 2013.

[16] R. Hartley and A. Zisserman, Multiple view geometry in computer vision. Cambridge Univ Press, 2000, vol. 2.

[17] R. Smith, M. Self, and P. Cheeseman, “Estimating uncertain spatial relationships in robotics,” Autonomous robot vehicles, vol. 1, pp. 167–193, 1990.

[18] J. Kurlbaum and U. Frese, “A benchmark data set for data association,” Univ. Bremen, Bremen, Germany, SFB/TR, vol. 8, pp. 017–02, 2009.

[19] J. Guivant and E. Nebot, “Optimization of the simultaneous localization and map-building algorithm for real-time implementation,” Robotics and Automation, IEEE Transactions on, vol. 17, no. 3, pp. 242–257, 2001.
