Optimal Min-max Pursuit Evasion on a Manhattan Grid

Krishna Kalyanam (Infoscitex Corp.), in collaboration with S. Darbha (TAMU), P. P. Khargonekar (UF, E-ARPA), M. Pachter (AFIT/ENG), P. Chandler and D. Casbeer (AFRL/RQCA)

AFRL/RQCA UAV Team Meeting, Oct 31, 2012


Page 1: Optimal Min-max Pursuit Evasion on a Manhattan Grid


Page 2: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Scenario

[Figure: scenario map showing the BASE, a valid intruder path, the UGS sensor range, the UGS communication range and the UAV communication range.]

RQCA Conf. Rm. 10/31/12

Page 3: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Pursuit-Evasion Framework

• Pursuer engaged in search and capture of intruder on a Manhattan road network
• Intersections in road instrumented with Unattended Ground Sensors (UGSs)
• Pursuer has a 2x speed advantage over the evader
• Pursuer has no on-board sensing capability
• Evader triggers UGS and the event is time-stamped and stored in the UGS
• Pursuer interrogates UGSs to get evader location information
• Capture occurs when pursuer and evader are co-located at a UGS location

Page 4: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Manhattan Grid (3 row corridor)

• All edges of the grid are of the same length
• Pursuer arrives at node (t/c/b, 0) with delay D > 0 (time steps) behind the evader
• Evader dynamics: move North, East or South, but cannot re-visit a node
• Pursuer actions: move North, East or South, or Loiter/Wait at current location
• Pursuer has a 2x speed advantage over the evader

[Figure: three-row grid with rows t, c, b and columns 0, 1, 2, ..., n; delay D marked at column 0.]
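The evader dynamics listed above can be sketched in a few lines. This is a minimal illustration, not the presentation's model: the (row, column) encoding with rows 0 = b, 1 = c, 2 = t, the names `evader_successors` and `evader_paths`, and the assumption that the evader makes exactly one move per time step are all introduced here.

```python
from typing import List, Tuple

Node = Tuple[int, int]  # (row, column); rows 0 = b, 1 = c, 2 = t (encoding assumed here)

ROWS = 3  # three-row corridor

# Evader moves from the slide: North, East or South (never West).
EVADER_MOVES = {"N": (1, 0), "E": (0, 1), "S": (-1, 0)}


def evader_successors(path: List[Node], n_cols: int) -> List[Node]:
    """Nodes the evader may step to next, given the path walked so far."""
    row, col = path[-1]
    visited = set(path)
    successors = []
    for d_row, d_col in EVADER_MOVES.values():
        r, c = row + d_row, col + d_col
        # Stay on the grid and never re-visit a node.
        if 0 <= r < ROWS and 0 <= c < n_cols and (r, c) not in visited:
            successors.append((r, c))
    return successors


def evader_paths(start: Node, steps: int, n_cols: int) -> List[List[Node]]:
    """All evader paths of exactly `steps` moves (one move per step is assumed)."""
    paths = [[start]]
    for _ in range(steps):
        paths = [p + [q] for p in paths for q in evader_successors(p, n_cols)]
    return paths
```

Starting on the center row the evader has three first moves, while from the bottom row it has only two, which is one way to see why the center and bottom/top rows are treated separately later in the deck.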

Page 5: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Governing Equations

Page 6: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Problem Framework

• Pose the problem as a Partially Observable Markov Decision Process (POMDP)
    - Unconventional POMDP, since observations give delayed intruder location information with random time delays!
• Use observations to compute the set of possible intruder locations
• Dual control problem
    - Pursuer's action, in addition to aiding capture, also affects the future uncertainty associated with the evader's location (exploration vs. exploitation)
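The step "use observations to compute the set of possible intruder locations" can be illustrated with a small reachability sketch. It rests on assumptions made here rather than in the slides: nodes are (row, column) pairs with rows 0 to 2, the evader makes exactly one move per time step, and the path the evader took before tripping the UGS is ignored, so the returned set is an over-approximation. The name `possible_locations` is hypothetical.

```python
from typing import Set, Tuple

Node = Tuple[int, int]  # (row, column), rows 0..2 (encoding assumed here)
ROWS = 3
MOVES = ((1, 0), (0, 1), (-1, 0))  # North, East, South


def possible_locations(ugs: Node, delay: int, n_cols: int) -> Set[Node]:
    """Over-approximate where the evader can be `delay` steps after tripping `ugs`."""
    # Each frontier entry is (current node, nodes visited since the UGS was tripped).
    frontier = [(ugs, frozenset([ugs]))]
    for _ in range(delay):
        next_frontier = []
        for (row, col), seen in frontier:
            for d_row, d_col in MOVES:
                node = (row + d_row, col + d_col)
                on_grid = 0 <= node[0] < ROWS and 0 <= node[1] < n_cols
                if on_grid and node not in seen:  # the evader cannot re-visit a node
                    next_frontier.append((node, seen | {node}))
        frontier = next_frontier
    return {node for node, _ in frontier}
```

The set grows with the observed delay, which is the uncertainty the pursuer's actions trade off against progress toward capture.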

Page 7: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Partial and delayed state information

Page 8: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Optimization Problem

[Figure: three-row grid with rows t, c, b and columns 0, 1, 2, ...; delay D marked at column 0.]

Page 9: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Bellman recursion
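As a reading aid, a min-max Bellman recursion over information states generically has the shape below. All symbols are assumptions introduced here, not the presentation's notation: $I$ is the pursuer's information state, $u$ a pursuer action (North, East, South or Loiter), $\mathcal{F}(I,u)$ the set of information states that can follow $I$ under $u$ for some evader behavior, and $T^*(I)$ the worst-case number of steps to capture.

```latex
T^*(I) \;=\; \min_{u \in \{N,\,E,\,S,\,L\}} \Big[\, 1 + \max_{I' \in \mathcal{F}(I,u)} T^*(I') \,\Big],
\qquad T^*(I) = 0 \ \text{ once capture has occurred.}
```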

Page 10: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Induction - Motivation

[Figure: single-row (c) and two-row (t, b) grids with columns 0, 1, 2, ..., D-1, D; nodes annotated D, D-1, D-2, ..., 1, 0; legend: pursuer, evader.]

• Single row: capture in exactly D steps. T(D) = 1 + T(D-1); T(1) = 1 => T(D) = D
• Two rows: capture in exactly D+2 steps. T(D) = 1 + T(D-1); T(1) = 3 => T(D) = D+2
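The two closed forms follow from unrolling the recurrence T(D) = 1 + T(D-1) from its base case. A minimal check, with the function name `steps_to_capture` chosen here for illustration:

```python
def steps_to_capture(delay: int, base: int) -> int:
    """Unroll T(D) = 1 + T(D-1) starting from T(1) = base."""
    if delay < 1:
        raise ValueError("delay must be at least 1")
    steps = base  # T(1)
    for _ in range(delay - 1):
        steps += 1  # each extra unit of delay costs one more step
    return steps


for d in range(1, 8):
    assert steps_to_capture(d, base=1) == d      # single row: T(D) = D
    assert steps_to_capture(d, base=3) == d + 2  # two rows:   T(D) = D + 2
```

Only the base case differs between the two grids; the per-step cost of closing one unit of delay is the same.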

Page 11: Optimal Min-max Pursuit Evasion on a Manhattan Grid

A Feasible Policy (upper bound)

[Figure: three-row grid with rows t, c, b and columns 0, 1, 2, ...; delay D marked at column 0.]

Page 12: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Bottom/Top row - delay 1

[Figure: pursuer and evader positions on the grid, columns 0 and 1.]

Page 13: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Bottom/Top row - delay 2

[Figure: grid diagram, columns 0 to 2.]

Page 14: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Center row - delay 1

[Figure: grid diagram, columns 0 to 3.]

Page 15: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Center row - delay 2

[Figure: grid diagram, columns 0 to 4.]

Page 16: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Bottom row - delay 3

Center row - delay 3

[Figure: three-row grid with rows t, c, b and columns 0, 1, 2, ...; delay D marked at column 0.]

Page 17: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Specification of the policy μ

bottom row:

Delay (D)   Sequence   Max Steps
1           ENLNL      5
2           EN^2L      6
3           EN^2       13
≥4          EN^2?      D+10

center row:

Delay (D)   Sequence   Max Steps
1           ENLS^2     11
2           ENS^2      12
3           ENSES      13
≥4          ??         D+10

Page 18: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Induction argument for D>=4

Basic step: Tμ(r,3)=13

Induction hypothesis:

Page 19: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Specification of the policy μ

bottom row:

Delay (D)   Sequence           Min-Max Steps
1           ENLNL              5
2           EN^2L              6
≥3          EN^2               D+10

center row:

Delay (D)   Sequence           Min-Max Steps
1           ENLS^2             11
2           ENS^2              12
3           ENSES              13
≥4          E^(D-3)NSE^2S      D+10
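The two tables can be encoded as a lookup, shown here as a minimal sketch. It assumes that the digits following a letter in the tabulated sequences are repetition counts (so EN2L reads E, N, N, L and ED-3NSE2S reads D-3 moves E followed by N, S, E, E, S); that reading, the row labels "b" and "c", and the function names are choices made here, not something the slide states.

```python
def min_max_steps(row: str, delay: int) -> int:
    """Worst-case steps to capture from the tables; row is "b" (bottom) or "c" (center)."""
    if delay < 1:
        raise ValueError("delay must be at least 1")
    if row == "c":
        return delay + 10  # 11, 12, 13, then D+10: the same formula for every delay
    if row == "b":
        return {1: 5, 2: 6}.get(delay, delay + 10)  # D+10 applies from D = 3 onward
    raise ValueError("row must be 'b' or 'c'")


def policy_sequence(row: str, delay: int) -> str:
    """Tabulated action sequence with repetition counts expanded (assumed reading)."""
    if delay < 1:
        raise ValueError("delay must be at least 1")
    if row == "b":
        return {1: "ENLNL", 2: "ENNL"}.get(delay, "ENN")
    if row == "c":
        fixed = {1: "ENLSS", 2: "ENSS", 3: "ENSES"}
        if delay in fixed:
            return fixed[delay]
        return "E" * (delay - 3) + "NSEES"  # (D-3) moves E, then N, S, E, E, S
    raise ValueError("row must be 'b' or 'c'")
```

For example, `policy_sequence("c", 5)` expands to `EENSEES` and `min_max_steps("c", 5)` gives 15.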

Page 20: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Center row, delay D>=4

[Figure: grid with columns 0, 1, ..., D-4, D-3, D-2, D-1; nodes annotated with time steps k = D, D+1, ..., 2D-4, 2D-2, 2D, 2D+2; delay D marked at column 0; (D-3) moves E.]

Page 21: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Center row, delay D>=4 (contd.)

[Figure: grid with columns 0, 1, ..., D-4, D-3, D-2, D-1; nodes annotated with time steps k = 0, 2, 4, ..., 2D-4, 2D-2, 2D and k = D, D+1, ..., 2D-4, 2D-2, 2D, 2D+2; delay D marked at column 0; (D-3) moves E.]

Page 22: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Center row, delay D>=4 (contd.)

[Figure: grid with columns 0, 1, ..., D-4, D-3, D-2, D-1; nodes annotated with time steps k = 0, D, D+1, ..., 2D-4, 2D-2, 2D, 2D+2; delay D marked at column 0.]

Page 23: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Center row, delay D>=4

Bottom row, delay D>=4

[Figure: grid with columns 0, 1, ..., D-2; nodes annotated with time steps k = 0, 4, D, D+1, D+2; delay D marked at column 0.]

Page 24: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Lower Bound on Steps to capture

[Figure: three-row grid with rows t, c, b and columns 0, 1, 2, ...; delay D marked at column 0.]

Page 25: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Lower bound on optimal time to capture

Page 26: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Optimal (min-max) Steps to Capture

Page 27: Optimal Min-max Pursuit Evasion on a Manhattan Grid

East is optimal at red UGS

sketch of proof:

Page 28: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Optimal trajectory

• There is an optimal trajectory, referred to as a "turnpike", which both the pursuer and the evader strive to reach and stay on for most of the encounter.
• Here, the turnpike is the center row of the symmetric 3 row grid.
• The pursuer, after initially going east, immediately heads towards the turnpike if not already on it.
• The evader initially heads to the turnpike, unless it is already on it, and stays there until the "end game", when it swerves off the turnpike to avoid immediate capture.
• The pursuer stays on the turnpike, monitoring the delays, until he observes delay 1. At this point, he also executes the "end game" maneuver and captures the evader in exactly 11 more steps.

Page 29: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Summary

Advantages
• Policy is dependent only on the delay at, and time elapsed since, the last red UGS (sufficient statistic?)
• Policy is optimal despite not relying on the entire information history of the pursuer

Disadvantages
• Policy is not in analytical form, i.e., a function from information state to action space (and so not extendable to other graphs)
• What is the intuition (exploration vs. exploitation, does separation exist?)

Extension(s)
• Can the policy be approximated by a feedback policy that minimizes a suitable norm of the error (distance to evader + size of uncertainty)?
• Capture can no longer be guaranteed (by a single pursuer) if the number of rows exceeds 3
• With 2 pursuers, capture can be guaranteed in D+4 steps on any number of rows (including infinity)!

Page 30: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Extras

Page 31: Optimal Min-max Pursuit Evasion on a Manhattan Grid

Center row, delay D>=4 (contd.)

[Figure: grid with columns 0, 1, ..., D-4, D-3, D-2, D-1; nodes annotated with time steps k = 0, D, D+1, ..., 2D-4, 2D-2, 2D, 2D+2; delay D marked at column 0.]

• Conservative bound: D-1+11 = D+10 (see extra slide)

Page 32: Optimal Min-max Pursuit Evasion on a Manhattan Grid

[Figure: grid with columns 0, 1, ..., D-4, D-3, D-2, D-1; nodes annotated with time steps k = 0, 2, 4, ..., 2D-4, 2D-2, 2D and k = D, D+1, ..., 2D-4, 2D-2, 2D; delay D marked at column 0.]

• Steps to capture: D-1+3 = D+2
• Conservative bound (per policy): D-1+11 = D+10