31
Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt 1 Dinesh Thakur 1 Nikolay Atanasov 2 Vijay Kumar 1 George Pappas 1 1 GRASP Laboratory University of Pennsylvania Philadelphia, PA, USA 2 Department of Electrical and Computer Engineering University of California San Diego San Diego, CA, USA May 17, 2018 Brent Schlotfeldt 1 , Dinesh Thakur 1 , Nikolay Atanasov Anytime Planning May 17, 2018 1 / 25

Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Anytime Planning for DecentralizedMulti-Robot Active Information Gathering

Brent Schlotfeldt1 Dinesh Thakur1 Nikolay Atanasov2

Vijay Kumar1 George Pappas1

1GRASP LaboratoryUniversity of PennsylvaniaPhiladelphia, PA, USA

2Department of Electrical and Computer EngineeringUniversity of California San Diego

San Diego, CA, USA

May 17, 2018

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 1 / 25

Page 2: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

What is the problem?

Suppose the robot at x0 wants to plan a path to xT to track a target yT

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 2 / 25

Page 3: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Overview

1 Active Information GatheringProblem StatementReduction to Open-Loop Control, and Optimal Solution(ε,δ)-Reduced Value IterationAnytime Reduced Value Iteration

2 Multiple RobotsCoordinate DescentDistributed EstimationApplication: Target Tracking

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 3 / 25

Page 4: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Problem Statement

Consider a team of n mobile robots, with sensor motion models:

xi,t+1 = fi (xi,t , ui,t), i ∈ 1, . . . , n (1)

where xi,t ∈ X , ui,t ∈ U , and U is a finite set.

The robots goal is to track a target yt with target motion model:

yt+1 = Ayt + wt , wt ∼ N (0,W ) t = 0, . . . ,T (2)

The robots are able to sense the target using on-board sensors obeying thefollowing sensor observation model:

zi,t = H(xi,t)yt + vt(xi,t), vt ∼ N (0,V (xi,t)) (3)

Note the Linear, Gaussian assumptions on the target and observations, butnot on the robot.

The set U of admissible controls is finite.

To simplify notation, we refer to the state of the team as xt at time t.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 4 / 25

Page 5: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Problem Statement

Consider a team of n mobile robots, with sensor motion models:

xi,t+1 = fi (xi,t , ui,t), i ∈ 1, . . . , n (1)

where xi,t ∈ X , ui,t ∈ U , and U is a finite set.The robots goal is to track a target yt with target motion model:

yt+1 = Ayt + wt , wt ∼ N (0,W ) t = 0, . . . ,T (2)

The robots are able to sense the target using on-board sensors obeying thefollowing sensor observation model:

zi,t = H(xi,t)yt + vt(xi,t), vt ∼ N (0,V (xi,t)) (3)

Note the Linear, Gaussian assumptions on the target and observations, butnot on the robot.

The set U of admissible controls is finite.

To simplify notation, we refer to the state of the team as xt at time t.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 4 / 25

Page 6: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Problem Statement

Consider a team of n mobile robots, with sensor motion models:

xi,t+1 = fi (xi,t , ui,t), i ∈ 1, . . . , n (1)

where xi,t ∈ X , ui,t ∈ U , and U is a finite set.The robots goal is to track a target yt with target motion model:

yt+1 = Ayt + wt , wt ∼ N (0,W ) t = 0, . . . ,T (2)

The robots are able to sense the target using on-board sensors obeying thefollowing sensor observation model:

zi,t = H(xi,t)yt + vt(xi,t), vt ∼ N (0,V (xi,t)) (3)

Note the Linear, Gaussian assumptions on the target and observations, butnot on the robot.

The set U of admissible controls is finite.

To simplify notation, we refer to the state of the team as xt at time t.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 4 / 25

Page 7: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Problem Statement

Consider a team of n mobile robots, with sensor motion models:

xi,t+1 = fi (xi,t , ui,t), i ∈ 1, . . . , n (1)

where xi,t ∈ X , ui,t ∈ U , and U is a finite set.The robots goal is to track a target yt with target motion model:

yt+1 = Ayt + wt , wt ∼ N (0,W ) t = 0, . . . ,T (2)

The robots are able to sense the target using on-board sensors obeying thefollowing sensor observation model:

zi,t = H(xi,t)yt + vt(xi,t), vt ∼ N (0,V (xi,t)) (3)

Note the Linear, Gaussian assumptions on the target and observations, butnot on the robot.

The set U of admissible controls is finite.

To simplify notation, we refer to the state of the team as xt at time t.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 4 / 25

Page 8: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Active Information Gathering

Problem (Active Information Acquisition)

Given an initial sensor pose x0 ∈ X , a prior distribution over the targetstate y0, and a finite planning horizon T , the task is to choose a sequenceof control actions: u0, . . . , uT−1 ∈ UT which minimize conditional entropyh between the target state: y1:T and the measurement set z1:T :

minu∈UT

J(n)T (u) :=

T∑t=1

h(yt |z1:t , x1:t)† (4)

s.t. xt+1 = f (xt , ut), t = 0, ...,T − 1

yt+1 = Ayt + wt t = 0, ...,T − 1

zt = H(xt)yt + vt t = 0, ...,T

(5)

† Note that many other information measures are possible, such as mutualinformation, minimum mean-squared error, and others.1

1Nikolay Atanasov et al. “Information acquisition with sensing robots: Algorithmsand error bounds”. In: 2014 IEEE International Conference on Robotics andAutomation (ICRA). 2014.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 5 / 25

Page 9: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Reduction to Open-Loop Control

Prior y0 is Gaussian N (y0,Σ0), =⇒ Kalman filter is optimal.

Objective proportional to log det, i.e. h(yt |z1:t , x1:t) ∼ log det(Σt)

Σt independent of realizations z1:t =⇒ open loop control optimal!2

Problem (Active Information Acquisition)

minµ∈UT

J(n)T (µ) :=

T∑t=1

log det(Σt)

s.t. xt+1 = f (xt , ut), t = 0, ...,T − 1

Σt+1 = ρpt+1(ρet (Σt), xt+1), t = 0, ...,T − 1

Update: ρet (Σ, x) := Σ− Kt(Σ, x)Ht(x)Σ

Kt(Σ, x) := ΣHt(x)T (Ht(x)ΣHt(x)T + Vt(x))−1

Predict: ρpt (Σ) := AtΣATt + Wt

2Jerome Le Ny and George J Pappas. “On trajectory optimization for active sensingin Gaussian process models”. In: 2009 IEEE Conference on Decision and Control(CDC). 2009.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 6 / 25

Page 10: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Optimal Solution

Deterministic optimal control problem can be solved with Forward ValueIteration, which builds a search tree from (x0, Σ0)

0 0,x

3u

Branching due to number of motion primitives

Depth due to length of planning horizon

2u1u

Full search has complexity O(‖U‖T ).

Can we do better?

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 7 / 25

Page 11: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

What if trajectories cross?

Want: Conditions on Σt to eliminate nodes if trajectories cross at level t ofthe tree:

2

0x

31 2

1x 31x

11x

12x

22x 3

2x42x

52x

1 2

32 1 3(1 )a a ±

There is a convex combination of Σ1, and Σ3 that dominates Σ2, i.e.Σ2 aΣ1 + (1− a)Σ3. It can be safely removed from the tree.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 8 / 25

Page 12: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

ε - Algebraic Redundancy

Want: Conditions on Σt to eliminate nodes if trajectories cross:3

Definition (ε-Algebraic Redundancy )

Let ε ≥ 0 and let ΣiKi=1 ⊂ Sn+ be a finite set. A matrix Σ ∈ Sn

+ isε-algebraically redundant with respect to Σi if there exist nonnegativeconstants αiKi=1 such that:

K∑i=1

αi = 1 and Σ + εIn K∑i=1

αiΣi

2 1 3(1 )a a ±

Example: Suppose aΣ1 + (1− a)Σ3 is a convexcombination of Σ1, and Σ2. Σ1 and Σ3 cannotstrictly eliminate Σ2, because it is not containedin aΣ1 + (1− a)Σ3. However if we relax theinequality by εI , we can remove Σ2.

3Michael P Vitus et al. “On efficient sensor scheduling for linear dynamical systems”.In: Automatica 48.10 (2012), pp. 2482–2493.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 9 / 25

Page 13: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

(ε, δ)- Reductions

Can we remove even more trajectories?

What if the trajectories do not exactly cross at time t?

Definition (Trajectory δ-Crossing)

Trajectories π1, π2 ∈ XT δ-cross at time t ∈ [1,T ] if dX (π1t , π2t ) ≤ δ for

δ ≥ 0.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 10 / 25

Page 14: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

(ε, δ) Reduced Value Iteration

Idea: Prune trajectories close in space and similar in informativeness

Algorithm 1 (ε, δ) Reduced Value Iteration

1: J0 ← 0, S0 ← (x0,Σ0, J0). St ← ∅ for t = 1, ...T2: for t = 1 : T do3: for all (x ,Σ, J) ∈ St−1 do4: for all u ∈ U do5: xt ← f (x , u),Σt ← ρet+1(ρpt (Σ, x , u), xt))6: Jt ← J + log det(Σt)7: St ← St ∪ (xt ,Σt , Jt)8: end for9: end for

10: Sort St in ascending order according to log det(·)11: S ′t ← St [1]12: for all (x ,Σ, J) ∈ St \ St [1] do13: % Find all nodes in S ′t , which δ-cross x:14: Q ← (Σ′)|(x ′,Σ′, J ′) ∈ S ′t , dX (x , x ′) ≤ δ15: if isempty(Q) or not(Σ is ε-alg-red wrt Q) then16: S ′t ← S ′t ∪ (x ,Σ, J)17: end if18: end for19: St ← S ′t20: end for21: return min J | (x ,Σ) ∈ ST

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 11 / 25

Expanding Nodes inSearch Tree =⇒

(ε,δ)-Pruning =⇒

Page 15: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Performance Guarantee

Search tree with (ε, δ)-pruning is (ε, δ)-suboptimal with cost Jε,δT suchthat the bound holds4

0 ≤ Jε,δT − J∗T ≤ (ζT − 1)[J∗T − log det(W )

]+ ε∆T (6)

where ζT :=t−1∏τ=1

(1 +τ∑

s=1

Lsf Lmδ) ≥ 1 (7)

and ∆T , Lsf and Lm are constants resulting from continuity assumptions.

Note the tradeoff in performance: (ε,δ) → (0, 0) =⇒ Jε,δT → J∗T .

If (ε,δ) → (∞,∞), we get a greedy policy with no guarantee.

This bound is monotone in ε, δ.

4Nikolay Atanasov et al. “Information acquisition with sensing robots: Algorithmsand error bounds”. In: 2014 IEEE International Conference on Robotics andAutomation (ICRA). 2014.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 12 / 25

Page 16: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Motivation for Anytime Planning

How do ε,δ affect execution time?

What if problem dimension changes? (e.g. more targets)

What if team changes? (e.g. more robots)

All of these can be addressed by modifying the search algorithm toiteratively construct the tree, adding progressively more trajectories untilexecution time runs out, always returning a feasible solution. This is calledan anytime algorithm.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 13 / 25

Page 17: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Anytime Reduced Value Iteration

Idea: First build a greedy search tree, then iteratively ImprovePath untiltime expires

To efficiently re-use computations, we mark all computed nodes St as openand closed, i.e. Ot , Ct at all levels t in the tree5

Algorithm 2 Anytime Reduced Value Iteration (x0, Σ0, TARVI )

1: J0 ← 0,S0 ← (x0,Σ0, J0). St ← ∅ for t = 1, ...T2: C0 ← ∅3: Ot ← St for t = 0, ...T4: S ← S0, ..ST, O ← O0, ...OT, C ← C0, ..CT5: ε←∞, δ ←∞6: S,O, C ← ImprovePath(S,O, C, ε, δ)7: Publish best solution from O[T ]8: while Time Elapsed ≤ TARVI do9: Decrease (ε, δ)

10: S,O, C ← ImprovePath(S,O, C, ε, δ)11: Publish best solution J from O[T ]12: end while

5Brent Schlotfeldt et al. “Anytime Planning for Decentralized Multirobot ActiveInformation Gathering”. In: IEEE Robotics and Automation Letters 3.2 (2018),pp. 1025–1032.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 14 / 25

Page 18: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Tree Interpretation

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 15 / 25

Page 19: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Improve Path

Algorithm 3 ImprovePath (S, O, C, ε, δ)

1: for t = 1 : T do2: for all (x ,Σ, J) ∈ Ot−1 \ Ct−1 do3: Ct−1 ← Ct−1 ∪ x ,Σ4: for all u ∈ U do5: xt ← f (x , u), Σt ← ρxt (Σ)6: Jt ← J + log det(Σt)7: St ← St ∪ (xt ,Σ, Jt))8: end for9: end for

10: Sort St in ascending order according tolog det(·)

11: Ot ← Ot ∪ St [1]12: for all (x ,Σ, J) ∈ St \ Ot do13: % Find all nodes in Ot , which δ-cross x:14: Q ← Σ′|(x ′,Σ′, J ′) ∈ Ot , dX (x , x ′) ≤ δ15: Ot ← Ot ∪ (x ,Σ, J)16: end for17: O[t] = Ot

18: end for19: return S,O, C

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 16 / 25

Expanding Open, but notClosed nodes =⇒

(ε,δ)-pruning by markingnodes Open =⇒

Page 20: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Main Result

The following holds regarding the performance of ARVI:6

Theorem (ARVI)

The following are satisfied by the ARVI algorithm:

(i) When ImprovePath(ε, δ) returns with finite (ε, δ), the returnedsolution is guaranteed to be (ε, δ)-sub-optimal.

(ii) The cost J(n)T of the returned solution decreases monotonically over

time.

(iii) Given infinite time, C = O ⊆ S, and O[T ] will contain the optimalsolution to the planning problem.

6Brent Schlotfeldt et al. “Anytime Planning for Decentralized Multirobot ActiveInformation Gathering”. In: IEEE Robotics and Automation Letters 3.2 (2018),pp. 1025–1032.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 17 / 25

Page 21: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

What about multiple robots?

Algorithm still scales exponentially in the number of robots.

Consider a coordinate descent approach7. Each robot solves theirown planning problem with fixed trajectories from prior robots.

uc1,0:(T−1) ∈ argminµ∈UT

1

J(1)T (µ) (8)

This continues:

uc2,0:(T−1) ∈ argminµ∈UT

2

J(2)T (uc1,0:(T−1), µ) (9)

...

7Nikolay Atanasov et al. “Decentralized active information acquisition: Theory andapplication to multi-robot SLAM”. . In: 2015 IEEE International Conference onRobotics and Automation (ICRA) (2015), pp. 4775–4782.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 18 / 25

Page 22: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Performance

Coordinate Descent reduces the planning complexity from exponentialto linear in number of robots.

Algorithm achieves no less than 50% of the optimal performance, inthe case of submodular objective functions8.

Results can also be extended to (approximately) sub-modularobjective functions.

8Nikolay Atanasov et al. “Decentralized active information acquisition: Theory andapplication to multi-robot SLAM”. . In: 2015 IEEE International Conference onRobotics and Automation (ICRA) (2015), pp. 4775–4782.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 19 / 25

Page 23: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

What about Estimation?

The robots can use a distributed information filter9, which uses the inversecovariance Ωi,t = Σ−1i,t and ωi,t = Σ−1i,t yi,t :

Update Step: ωi,t+1 =∑

j∈Ni∪i

κi,jωj,t + HTi V−1i zi (t)

Ωi,t+1 =∑

j∈Ni∪i

κi,jΩj,t + HTi V−1i Hi

yi (t) := Ω−1i,t ωi,t

Predict Step: Ωi,t+1 = (AΩ−1i,t+1AT + W )−1

ωi,t+1 = Ωi,t+1Ayi,t+1

(10)

9Nikolay Atanasov et al. “Joint estimation and localization in sensor networks”. In:Decision and Control (CDC), 2014 IEEE 53rd Annual Conference on. IEEE. 2014,pp. 6875–6882.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 20 / 25

Page 24: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Application: Target Tracking

We use a discretized unicycle model for f (x , u):x1t+1

x2t+1

θt+1

=

x1tx2tθt

+

ν sinc(ωτ2 ) cos(θt + ωτ2 )

ν sinc(ωτ2 ) sin(θt + ωτ2 )

τω

(11)

The target model is a double integrator:

yt+1 = A

[I2 τ I20 I2

]yt + wt , wt ∼ N

(0, q

[τ 3/3I2 τ 2/2I2τ 2/2I2 τ I2

])(12)

The robots have range and bearing sensors.

zt,m = h(xt , yt,m) + vt , vt ∼ N(0,V (xt , yt,m)

)h(x , y) =

[r(x , y)α(x , y)

]:=

[ √(y1 − x1)2 + (y2 − x2)2

tan−1((y2 − x2)(y1 − x1))− θ

](13)

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 21 / 25

Page 25: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Application: Target Tracking

We use a discretized unicycle model for f (x , u):x1t+1

x2t+1

θt+1

=

x1tx2tθt

+

ν sinc(ωτ2 ) cos(θt + ωτ2 )

ν sinc(ωτ2 ) sin(θt + ωτ2 )

τω

(11)

The target model is a double integrator:

yt+1 = A

[I2 τ I20 I2

]yt + wt , wt ∼ N

(0, q

[τ 3/3I2 τ 2/2I2τ 2/2I2 τ I2

])(12)

The robots have range and bearing sensors.

zt,m = h(xt , yt,m) + vt , vt ∼ N(0,V (xt , yt,m)

)h(x , y) =

[r(x , y)α(x , y)

]:=

[ √(y1 − x1)2 + (y2 − x2)2

tan−1((y2 − x2)(y1 − x1))− θ

](13)

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 21 / 25

Page 26: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Application: Target Tracking

We use a discretized unicycle model for f (x , u):x1t+1

x2t+1

θt+1

=

x1tx2tθt

+

ν sinc(ωτ2 ) cos(θt + ωτ2 )

ν sinc(ωτ2 ) sin(θt + ωτ2 )

τω

(11)

The target model is a double integrator:

yt+1 = A

[I2 τ I20 I2

]yt + wt , wt ∼ N

(0, q

[τ 3/3I2 τ 2/2I2τ 2/2I2 τ I2

])(12)

The robots have range and bearing sensors.

zt,m = h(xt , yt,m) + vt , vt ∼ N(0,V (xt , yt,m)

)h(x , y) =

[r(x , y)α(x , y)

]:=

[ √(y1 − x1)2 + (y2 − x2)2

tan−1((y2 − x2)(y1 − x1))− θ

](13)

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 21 / 25

Page 27: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Linearization, and MPC

Due to the non-linear observation model, we linearize about a predicted targettrajectory y and apply MPC:

∇yh(x , y) =1

r(x , y)

[(y1 − x1) (y2 − x2) 01x2

− sin(θ + α(x , y)) cos(θ + α(x , y)) 01x2

](14)

Algorithm 4 Anytime Multi-Robot Target Tracking

1: Input: Tmax , x0, ω0, Ω0, f ,U ,H,V ,A,W ,T ,TARVI

2: while t = 1 : Tmax do3: Send ωi ,t and Ωi ,t to neighboring robots.4: Receive measurements zi ,t and perform distributed update step with

any neighbor ωi ,t , Ωi ,t received.5: Predict a target trajectory of length T : yt , ...yT , and linearize obser-

vation model: Ht ← ∇yh(·, yt)6: Plan T -step trajectories with ARVI (Alg. 2) and coordinate descent.7: Apply first control input to move each robot.8: end while

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 22 / 25

Page 28: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Linearization, and MPC

Due to the non-linear observation model, we linearize about a predicted targettrajectory y and apply MPC:

∇yh(x , y) =1

r(x , y)

[(y1 − x1) (y2 − x2) 01x2

− sin(θ + α(x , y)) cos(θ + α(x , y)) 01x2

](14)

Algorithm 5 Anytime Multi-Robot Target Tracking

1: Input: Tmax , x0, ω0, Ω0, f ,U ,H,V ,A,W ,T ,TARVI

2: while t = 1 : Tmax do3: Send ωi ,t and Ωi ,t to neighboring robots.4: Receive measurements zi ,t and perform distributed update step with

any neighbor ωi ,t , Ωi ,t received.5: Predict a target trajectory of length T : yt , ...yT , and linearize obser-

vation model: Ht ← ∇yh(·, yt)6: Plan T -step trajectories with ARVI (Alg. 2) and coordinate descent.7: Apply first control input to move each robot.8: end while

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 22 / 25

Page 29: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Simulation Environment

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 23 / 25

Page 30: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Some Results

0 200 400 600 800 1000

Timestep

-2

-1

0

1

2

En

tro

py p

er

targ

et

(na

ts)

Average Target Entropy

0 m

1 m

5 m

10 m

15 m

m

(a)

0 200 400 600 800 1000

Timestep

0

2

4

6

8

10

Ave

rag

e t

arg

ets

de

tecte

d

Number of Targets

0 m

1 m

5 m

10 m

15 m

m

(b)

0 200 400 600 800 1000

Timestep

0

0.2

0.4

0.6

0.8

1

1.2

RM

SE

pe

r ta

rge

t (m

)

Average RMSE over time

0 m

1 m

5 m

10 m

15 m

m

(c)

0 100 200 300 400 500 600

Timestep

-3

-2

-1

0

1

2

3

4

En

tro

py p

er

targ

et

(na

ts)

Entropy per Target

TARVI

= 0 (greedy)

TARVI

= 1.5s

TARVI

= 3s

(d)

0 100 200 300 400 500 600

Timestep

0

5

10

15

20

25N

um

be

r o

f ta

rge

tsNumber of Targets Tracked

TARVI

= 0 (greedy)

TARVI

= 1.5s

TARVI

= 3s

(e)

0 100 200 300 400 500 600

Timestep

0

0.5

1

1.5

2

RM

SE

(m

)

Position RMSE per target

TARVI

= 0 (greedy)

TARVI

= 1.5s

TARVI

= 3s

(f)

Figure: Simulation results with 9 robots tracking 16 moving targets. The top rowsweeps over communication radius, and the bottom row sweeps planning time.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 24 / 25

Page 31: Anytime Planning for Decentralized Multi-Robot Active … · 2020-02-11 · Anytime Planning for Decentralized Multi-Robot Active Information Gathering Brent Schlotfeldt1 Dinesh Thakur1

Conclusion

Future Work:

Resilient Active Information Acquisition (IROS 2018)

Unknown Target Models

Quadrotor Motion Model, Camera observation Model, ContinuousActions, etc.

Brent Schlotfeldt1, Dinesh Thakur1, Nikolay Atanasov2, Vijay Kumar1,, George Pappas1 (Penn)Anytime Planning May 17, 2018 25 / 25