

Adaptive Sensor Selection in Sequential Decision Making

Vaibhav Srivastava, Kurt Plarre and Francesco Bullo

Center for Control, Dynamical Systems & Computation

University of California at Santa Barbara

http://motion.me.ucsb.edu/~vaibhav

14 December 2011

Conference on Decision & Control and European Control Conference

Vaibhav Srivastava (UCSB) Adaptive Sensor Selection 12-14-11 CDC-ECC 1 / 11

Motivation: Sensor selection

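Attention in Camera Sensor Network

Which camera to choose?

Sensors for UAV Surveillance

[Table: sensors (rows EO, SAR, FPR, IR, MTIR) against conditions (columns Cloud, Rain, Wind). Marks per row: EO X X X; SAR X; FPR X X; IR X X; MTIR X. The column positions of the marks are not recoverable from the extracted text.]

Which sensor to choose?

1 how to avoid operator overload?
2 how to select most informative sensors and focus attention?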


Relevant Literature

Human Decision Making

R. Bogacz, E. Brown, J. Moehlis, P. Holmes, and J. D. Cohen. The physics of optimal decision making: A formal analysis of performance in two-alternative forced choice tasks. Psychological Review, 113(4):700–765, 2006

Sequential Test of Hypothesis

A. Wald. Sequential tests of statistical hypotheses. The Annals of Mathematical Statistics, 16(2):117–186, 1945

C. W. Baum and V. V. Veeravalli. A sequential procedure for multihypothesis testing. IEEE Trans Information Theory, 40(6):1994–2007, 1994

Sensor Selection

D. Bajović, B. Sinopoli, and J. Xavier. Sensor selection for hypothesis testing in wireless sensor networks: a Kullback-Leibler based approach. In Proc CDC, pages 1659–1664, Shanghai, China, December 2009

S. Joshi and S. Boyd. Sensor selection via convex optimization. IEEE Trans Signal Processing, 57(2):451–462, 2009

Search

J. P. Hespanha, H. J. Kim, and S. S. Sastry. Multiple-agent probabilistic pursuit-evasion games. In Proc CDC, pages 2432–2437, Phoenix, AZ, USA, December 1999

T. H. Chung and J. W. Burdick. A decision-making framework for control strategies in probabilistic search. In Proc ICRA, pages 4386–4393, Roma, Italy, April 2007


Problem Setup

1 Binary decision-making tasks
2 N information sources
3 operator focuses attention to only one source at a time
4 collection + transmission + processing time of sensor s is a random variable $T_s > 0$

Problem: how to select sensors to minimize decision time?

Issues: non-i.i.d. data; the sensor selection problem is NP-hard in general

Two-phase approach

1 determine the optimal stationary sensor selection scheme
2 adapt the stationary scheme at each step


Stationary Scheme

two alternative hypotheses: $H_0, H_1$

given pdfs $f_s(y \mid H_k) = f_s^k(y)$, $k \in \{0, 1\}$

sensor selection probability N-tuple $q$

decision thresholds $\eta_0 < 0 < \eta_1$

SPRT with stationary sensor selection

At each time $t$

(a) Sample a sensor $s_t$ from $q$
(b) Compute log-likelihood ratio: $l_t \equiv \log\big(f_{s_t}^1(y_t) / f_{s_t}^0(y_t)\big)$
(c) $L_t := \sum_{\tau=1}^{t} l_\tau$
(d) $\eta_1 < L_t \implies$ say $H_1$; $L_t < \eta_0 \implies$ say $H_0$; $\eta_0 < L_t < \eta_1 \implies$ continue sampling

[Figure: SPRT evolutions]

Stationary scheme makes the sequence $\{(s_t, y_t)\}_{t \in \mathbb{N}}$ i.i.d.
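The SPRT steps (a) to (d) can be sketched in a few lines of Python. This is only an illustration under assumptions that are not on the slide: each sensor is taken to produce Gaussian observations with a hypothesis-dependent mean, and the names `sprt_stationary`, `mu0`, `mu1`, and `sigma` are hypothetical.

```python
import random

def sprt_stationary(q, mu0, mu1, sigma, eta0, eta1, true_h, rng):
    """Run one SPRT with sensors drawn i.i.d. from q.

    Returns (decision, number_of_samples). Assumption for this sketch only:
    sensor s observes N(mu0[s], sigma[s]^2) under H0 and
    N(mu1[s], sigma[s]^2) under H1.
    """
    sensors = list(range(len(q)))
    L, t = 0.0, 0
    while eta0 < L < eta1:                      # continue sampling between thresholds
        t += 1
        s = rng.choices(sensors, weights=q)[0]  # (a) sample sensor s_t from q
        mean = mu1[s] if true_h == 1 else mu0[s]
        y = rng.gauss(mean, sigma[s])           # observation y_t from sensor s_t
        # (b) Gaussian log-likelihood ratio log(f^1_s(y) / f^0_s(y))
        l = ((y - mu0[s]) ** 2 - (y - mu1[s]) ** 2) / (2.0 * sigma[s] ** 2)
        L += l                                  # (c) running sum L_t
    return (1 if L >= eta1 else 0), t           # (d) threshold test

rng = random.Random(0)
decision, steps = sprt_stationary([0.5, 0.5], [0.0, 0.0], [10.0, 10.0],
                                  [1.0, 1.0], -3.0, 3.0, true_h=1, rng=rng)
```

Because the sensor index is drawn independently from the fixed `q` at every step, the pairs (s_t, y_t) are i.i.d., which is what makes the standard SPRT analysis applicable.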


Optimal Stationary Scheme

KL divergence $D(f^1, f^0) \equiv \mathbb{E}_{f^1}\left[\log \dfrac{f^1(Y)}{f^0(Y)}\right]$

Stationary Decision Time

$$\mathbb{E}[T_d \mid H_k] = \text{const} \times \frac{\sum_{s=1}^{n} q_s T_s}{\sum_{s=1}^{n} q_s D(f_s^k, f_s^*)}$$

Linear fractional function

Optimal Sensor Selection Probability

Given prior probability $\pi_k$ of $H_k$

$$q^* = \operatorname{argmin}_q \{\pi_0\, \mathbb{E}[T_d \mid H_0] + \pi_1\, \mathbb{E}[T_d \mid H_1]\}$$

Sum of linear fractional functions

– A non-convex problem, but efficiently solvable
– Conditioned on a hypothesis, optimal policy is deterministic
– An optimal policy samples at most two sensors
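The claim that an optimal policy samples at most two sensors suggests a simple brute-force search: try every pair of sensors and every mixing weight on a grid. The sketch below does this. It is not the authors' solver, and it assumes the constant in the decision-time formula equals 1 for both hypotheses; `T`, `D0`, and `D1` are hypothetical inputs holding the processing times $T_s$ and the divergences $D(f_s^0, f_s^*)$ and $D(f_s^1, f_s^*)$, all strictly positive.

```python
from itertools import combinations

def objective(q, T, D0, D1, pi0, pi1):
    """pi0 * E[T_d | H0] + pi1 * E[T_d | H1], with const = 1 (assumption)."""
    num = sum(qs * ts for qs, ts in zip(q, T))
    d0 = sum(qs * d for qs, d in zip(q, D0))
    d1 = sum(qs * d for qs, d in zip(q, D1))
    return pi0 * num / d0 + pi1 * num / d1

def best_two_sensor_q(T, D0, D1, pi0, pi1, grid=1000):
    """Grid search over all sensor pairs; needs at least two sensors."""
    n = len(T)
    best_q, best_val = None, float("inf")
    for i, j in combinations(range(n), 2):
        for m in range(grid + 1):
            w = m / grid                 # weight on sensor i, rest on sensor j
            q = [0.0] * n
            q[i], q[j] = w, 1.0 - w      # w = 0 or 1 covers single-sensor policies
            val = objective(q, T, D0, D1, pi0, pi1)
            if val < best_val:
                best_q, best_val = q, val
    return best_q, best_val

q_star, value = best_two_sensor_q([1.0, 1.0, 1.0], [4.0, 1.0, 2.0],
                                  [1.0, 4.0, 2.0], 0.5, 0.5)
```

In this toy instance no single sensor is good under both hypotheses, so the search returns an even mix of the first two sensors rather than the "compromise" third sensor.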


Adaptive Policy

Adaptive Sensor Selection Probability

At each time $t + 1$

1: Determine posterior probabilities

$\pi_0(t) = 1/(1 + \exp(L_t))$ and $\pi_1(t) = 1 - \pi_0(t)$

2: Adapt the sensor selection probability

$q^*_{t+1} = \operatorname{argmin}_q \{\pi_0(t)\, \mathbb{E}[T_d \mid H_0] + \pi_1(t)\, \mathbb{E}[T_d \mid H_1]\}$

[Figures: posterior probabilities; sensor selection probabilities]
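One adaptation step can be sketched as follows. The posterior formula is the one on the slide (it corresponds to equal prior odds, or to a log-likelihood sum that already includes the log prior ratio). The re-optimization is a plain grid search over sensor pairs with the decision-time constant set to 1; both choices are assumptions of this sketch, and the names `posteriors`, `adapt_q`, `T`, `D0`, `D1` are hypothetical.

```python
import math
from itertools import combinations

def posteriors(L_t):
    """pi_0(t) = 1 / (1 + exp(L_t)), written to avoid overflow for large |L_t|."""
    if L_t >= 0:
        e = math.exp(-L_t)
        pi0 = e / (1.0 + e)
    else:
        pi0 = 1.0 / (1.0 + math.exp(L_t))
    return pi0, 1.0 - pi0

def adapt_q(L_t, T, D0, D1, grid=200):
    """Re-solve the stationary problem with the current posteriors as weights."""
    pi0, pi1 = posteriors(L_t)
    n = len(T)
    best_q, best_val = None, float("inf")
    for i, j in combinations(range(n), 2):      # needs at least two sensors
        for m in range(grid + 1):
            w = m / grid
            q = [0.0] * n
            q[i], q[j] = w, 1.0 - w
            num = sum(a * b for a, b in zip(q, T))
            d0 = sum(a * b for a, b in zip(q, D0))
            d1 = sum(a * b for a, b in zip(q, D1))
            val = pi0 * num / d0 + pi1 * num / d1
            if val < best_val:
                best_q, best_val = q, val
    return best_q

T, D0, D1 = [1.0, 1.0, 1.0], [4.0, 1.0, 2.0], [1.0, 4.0, 2.0]
q_undecided = adapt_q(0.0, T, D0, D1)    # evidence balanced between H0 and H1
q_leaning_h1 = adapt_q(20.0, T, D0, D1)  # evidence strongly favours H1
```

As the evidence accumulates toward one hypothesis, the selection probability shifts to the sensor that is fastest under that hypothesis, which is the behaviour the adaptive policy is designed to produce.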


Performance of Adaptive Policy

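Global Lower Bound

$$\mathbb{E}[T_d \mid H_k] \ge \min_{s \in \{1,\dots,n\}} \frac{\text{const} \times T_s}{D(f_s^k, f_s^*)} = \frac{\text{const} \times T_{s_k}}{D(f_{s_k}^k, f_{s_k}^*)}$$

Upper Bound for Adaptive Policy

$$\mathbb{E}[T_d \mid H_k] \le \min_q \max \{\mathbb{E}[T_d \mid H_0], \mathbb{E}[T_d \mid H_1]\}$$

Adaptive policy is asymptotically optimal provided each sensor is informative

[Figure: performance bounds]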
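Both bounds are easy to evaluate numerically for a small sensor set. The sketch below computes the lower bound and the index $s_k$ that attains it, and approximates the min-max upper bound by brute force on a grid over the probability simplex. It assumes const = 1 and strictly positive divergences; the function names and the inputs `T`, `D0`, `D1` are hypothetical.

```python
from itertools import product

def stationary_time(q, T, D):
    """E[T_d | H_k] for a stationary q, with const = 1 (assumption)."""
    return sum(a * b for a, b in zip(q, T)) / sum(a * b for a, b in zip(q, D))

def lower_bound(T, D):
    """min_s T_s / D_s and the index s_k of the sensor attaining it."""
    ratios = [t / d for t, d in zip(T, D)]
    s_k = min(range(len(T)), key=ratios.__getitem__)
    return ratios[s_k], s_k

def minimax_upper_bound(T, D0, D1, steps=50):
    """Grid approximation of min_q max{E[T_d|H0], E[T_d|H1]} (small n only)."""
    n = len(T)
    best = float("inf")
    for counts in product(range(steps + 1), repeat=n - 1):
        rest = steps - sum(counts)
        if rest < 0:
            continue                              # outside the simplex
        q = [c / steps for c in counts] + [rest / steps]
        worst = max(stationary_time(q, T, D0), stationary_time(q, T, D1))
        best = min(best, worst)
    return best

T, D0, D1 = [1.0, 1.0, 1.0], [4.0, 1.0, 2.0], [1.0, 4.0, 2.0]
lb0, s0 = lower_bound(T, D0)
lb1, s1 = lower_bound(T, D1)
ub = minimax_upper_bound(T, D0, D1)
```

The lower bound follows because a ratio of weighted sums is never smaller than the smallest individual ratio $T_s / D(f_s^k, f_s^*)$, so no stationary mix can beat the best single sensor for a known hypothesis.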


Control of Decision Time

Assumption: Identical processing time of the sensors

Reorder sensors in decreasing order of $D(f_s^0, f_s^1)$

Pick first $\zeta \le n$ sensors, $s_\ell^0$, $\ell \in \{1, \dots, \zeta\}$

Similarly, pick sensors $s_\ell^1$, $\ell \in \{1, \dots, \zeta\}$

Asymptotically optimal policy

Apply adaptive policy to sets $\Xi_k = \{s_\ell^k \mid \ell \in \{1, \dots, \zeta\}\}$, $k \in \{0, 1\}$

Asymptotic decision time: $\mathbb{E}[T_d \mid H_k] = \text{const} \Big/ \Big(\sum_{\ell=1}^{\zeta} D(f_{s_\ell}^0, f_{s_\ell}^1)\Big)$

Control of Decision Time

Given a desired feasible expected decision time, a randomized cardinality $\zeta$ of the set can be designed
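One way to read "randomized cardinality" is to mix between two adjacent set sizes so that the expected asymptotic decision time matches the target. The slide does not spell out the construction, so the sketch below is an interpretation rather than the authors' design; the names `asymptotic_time` and `randomized_cardinality` and the input list `D` of divergences are hypothetical.

```python
def asymptotic_time(D_sorted, zeta, const=1.0):
    """const / (sum of the zeta largest divergences), as in the slide's formula."""
    return const / sum(D_sorted[:zeta])

def randomized_cardinality(D, target, const=1.0):
    """Return (zeta_small, zeta_large, p): use zeta_small sensors with
    probability p and zeta_large otherwise, so the expected time equals target.
    Interpretation only; assumes all divergences are strictly positive."""
    D_sorted = sorted(D, reverse=True)          # decreasing divergence order
    times = [asymptotic_time(D_sorted, z, const) for z in range(1, len(D) + 1)]
    if not (times[-1] <= target <= times[0]):   # times decrease as zeta grows
        raise ValueError("target decision time is not feasible")
    for z in range(1, len(times)):
        slow, fast = times[z - 1], times[z]     # times for zeta = z and z + 1
        if fast <= target <= slow:
            p = (target - fast) / (slow - fast)
            return z, z + 1, p
    return len(times), len(times), 1.0          # single-sensor edge case

zeta_small, zeta_large, p = randomized_cardinality([1.0, 4.0, 2.0], target=0.2)
```

Larger sets give shorter decision times at the price of more sensors in use, so the mixing probability is the knob that trades one against the other.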


Application: Search in a Camera Network

Treasure at region $k$ with prior probability $\pi_k$

Search ≡ MSPRT

One camera at each region

(evidence | treasure at location $k$) $\sim f_k^1$; $f_k^0$ otherwise

Region selection probability M-tuple: $q$

Each sensor non-informative about other regions

[Figures: region selection probability; posterior probability of treasure]
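The observation model above leads to a one-line Bayes update of the treasure posterior after looking through one camera. The sketch assumes Gaussian evidence densities, which the slide does not specify, and the names `gauss_pdf`, `update_posterior`, `mu0`, `mu1`, and `sigma` are hypothetical.

```python
import math

def gauss_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) at y (assumed evidence model)."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def update_posterior(pi, s, y, mu0, mu1, sigma):
    """Bayes update of P(treasure at k) after camera s reports evidence y.

    Camera s follows f_s^1 if the treasure is in its own region and f_s^0
    otherwise, so it cannot tell the other regions apart.
    """
    like = [gauss_pdf(y, mu1[s] if k == s else mu0[s], sigma[s])
            for k in range(len(pi))]
    unnorm = [p * l for p, l in zip(pi, like)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

prior = [1.0 / 3.0] * 3
post = update_posterior(prior, s=0, y=1.0, mu0=[0.0] * 3, mu1=[1.0] * 3,
                        sigma=[1.0] * 3)
```

A reading that looks like "treasure present" raises the posterior of the observed region and lowers all other regions by the same factor, which reflects each camera being non-informative about regions other than its own.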


Conclusions & Future directions

Conclusions

Identification of most pertinent information sources

Max. cardinality of optimal source set = number of hypotheses

Adaptive source selection is asymptotically optimal

Communication-decision time trade-off

Application to decision theoretic search

Future Directions

Extension to dynamic hypotheses

Extension of GLR
