Adaptive Sensor Selection in Sequential Decision Making

Vaibhav Srivastava, Kurt Plarre and Francesco Bullo

Center for Control, Dynamical Systems & Computation

University of California at Santa Barbara

http://motion.me.ucsb.edu/∼vaibhav

14 December 2011

Conference on Decision & Control and European Control Conference

Vaibhav Srivastava (UCSB) Adaptive Sensor Selection 12-14-11 CDC-ECC 1 / 11

Motivation: Sensor selection

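Attention in Camera Sensor Network: Which camera to choose?

Sensors for UAV Surveillance: Which sensor to choose?

Table (columns: Cloud, Rain, Wind; the column position of each X is not recoverable from the extracted text):
EO: X X X
SAR: X
FPR: X X
IR: X X
MTIR: X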

1 how to avoid operator overload?

2 how to select most informative sensors and focus attention?


Relevant Literature

Human Decision Making

R. Bogacz, E. Brown, J. Moehlis, P. Holmes, and J. D. Cohen. The physics of optimal decision making: A formal analysis of performance in two-alternative forced choice tasks. Psychological Review, 113(4):700–765, 2006

Sequential Test of Hypothesis

A. Wald. Sequential tests of statistical hypotheses. Annals of Mathematical Statistics, 16(2):117–186, 1945

C. W. Baum and V. V. Veeravalli. A sequential procedure for multihypothesis testing. IEEE Trans Information Theory, 40(6):1994–2007, 1994

Sensor Selection

D. Bajovic, B. Sinopoli, and J. Xavier. Sensor selection for hypothesis testing in wireless sensor networks: a Kullback-Leibler based approach. In Proc CDC, pages 1659–1664, Shanghai, China, December 2009

S. Joshi and S. Boyd. Sensor selection via convex optimization. IEEE Trans Signal Processing, 57(2):451–462, 2009

Search

J. P. Hespanha, H. J. Kim, and S. S. Sastry. Multiple-agent probabilistic pursuit-evasion games. In Proc CDC, pages 2432–2437, Phoenix, AZ, USA, December 1999

T. H. Chung and J. W. Burdick. A decision-making framework for control strategies in probabilistic search. In Proc ICRA, pages 4386–4393, Roma, Italy, April 2007


Problem Setup

1 Binary decision making tasks

2 N information sources

3 operator focuses attention on only one source at a time

4 collection + transmission + processing time of sensor $s$ is a random variable $T_s > 0$

Problem: how to select sensors to minimize decision time?

Issues: Non-i.i.d. data, sensor selection problem is NP-hard, in general

Two phase approach

1 determine optimal stationary sensor selection scheme

2 adapt the stationary scheme at each step


Stationary Scheme

two alternative hypotheses: $H_0, H_1$

given pdfs $f_s(y \mid H_k) = f_s^k(y)$, $k \in \{0, 1\}$

sensor selection probability $N$-tuple $q$

decision thresholds $\eta_0 < 0 < \eta_1$

SPRT with stationary sensor selection

At each time t

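(a) Sample a sensor $s_t$ from $q$

(b) Compute the log-likelihood ratio: $l_t \equiv \log\big(f^1_{s_t}(y_t) / f^0_{s_t}(y_t)\big)$

(c) Update the cumulative sum: $L_t := \sum_{\tau=1}^{t} l_\tau$

(d) Decide:

$\eta_1 < L_t \implies$ say $H_1$

$L_t < \eta_0 \implies$ say $H_0$

$\eta_0 < L_t < \eta_1 \implies$ continue sampling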

Stationary Scheme

SPRT Evolutions

Stationary scheme makes the sequence $\{(s_t, y_t)\}_{t \in \mathbb{N}}$ i.i.d.
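The SPRT steps (a) to (d) can be sketched in a few lines of Python. This is only an illustrative simulation, not the authors' implementation: it assumes, purely for the example, that sensor $s$ returns Gaussian observations with mean `mu0[s]` under $H_0$ and `mu1[s]` under $H_1$ and standard deviation `sigma[s]`. The function name `sprt_stationary` and all parameter names are hypothetical.

```python
import numpy as np

def sprt_stationary(q, mu0, mu1, sigma, eta0, eta1, true_hyp, rng,
                    max_steps=100_000):
    """Run one SPRT with sensors drawn i.i.d. from the distribution q.

    Assumption (not from the slides): under H_k, sensor s observes
    y ~ Normal(mu_k[s], sigma[s]**2).
    Returns (decision, number_of_samples); decision is None if
    max_steps is reached without crossing a threshold.
    """
    q, mu0, mu1, sigma = (np.asarray(a, dtype=float)
                          for a in (q, mu0, mu1, sigma))
    mu_true = mu1 if true_hyp == 1 else mu0
    L = 0.0  # cumulative log-likelihood ratio L_t
    for t in range(1, max_steps + 1):
        s = rng.choice(len(q), p=q)            # (a) sample sensor s_t from q
        y = rng.normal(mu_true[s], sigma[s])   # observation y_t from sensor s_t
        # (b) Gaussian log-likelihood ratio log(f1(y) / f0(y))
        l = ((y - mu0[s]) ** 2 - (y - mu1[s]) ** 2) / (2.0 * sigma[s] ** 2)
        L += l                                 # (c) L_t = sum of l_tau
        if L > eta1:                           # (d) threshold test
            return 1, t
        if L < eta0:
            return 0, t
    return None, max_steps
```

A call such as `sprt_stationary([0.5, 0.5], [0, 0], [1, 2], [1, 1], -4.6, 4.6, 1, np.random.default_rng(0))` runs one trial with two hypothetical sensors. Because the sensor index and the observation are drawn afresh at every step, the pairs $(s_t, y_t)$ are i.i.d. by construction.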


Optimal Stationary Scheme

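KL divergence: $D(f^1, f^0) \equiv \mathbb{E}_{f^1}\left[\log \frac{f^1(Y)}{f^0(Y)}\right]$

Stationary Decision Time

$\mathbb{E}[T_d \mid H_k] = \text{const} \times \frac{\sum_{s=1}^{n} q_s T_s}{\sum_{s=1}^{n} q_s D(f_s^k, f_s^*)}$ (a linear fractional function of $q$)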

Optimal Sensor Selection Probability

Given prior probability πk of Hk

$q^* = \operatorname{argmin}_q \{\pi_0 \mathbb{E}[T_d \mid H_0] + \pi_1 \mathbb{E}[T_d \mid H_1]\}$

Sum of linear fractional functions

– A non-convex problem, but efficiently solvable

– Conditioned on a hypothesis, optimal policy is deterministic

– An optimal policy samples at most two sensors
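One way to illustrate the optimization is a brute-force search that relies on the property stated above, namely that an optimal policy samples at most two sensors. The sketch below is not the authors' algorithm; it simply grids the mixing weight over every pair of sensors. The names `stationary_cost` and `best_stationary_q`, the constants `c0` and `c1` (standing in for "const"), and the arrays `D0[s]` and `D1[s]` (assumed here to hold the per-sensor divergence appearing in the denominator under $H_0$ and $H_1$ respectively) are all assumptions made for the example. All divergences are assumed strictly positive.

```python
import itertools
import numpy as np

def stationary_cost(q, T, D0, D1, pi0, pi1, c0=1.0, c1=1.0):
    """pi0 * E[T_d | H0] + pi1 * E[T_d | H1] for a stationary policy q."""
    num = q @ T  # expected processing time per sample
    return pi0 * c0 * num / (q @ D0) + pi1 * c1 * num / (q @ D1)

def best_stationary_q(T, D0, D1, pi0, pi1, grid=1001):
    """Grid search over policies supported on at most two sensors."""
    T, D0, D1 = (np.asarray(a, dtype=float) for a in (T, D0, D1))
    n = len(T)
    weights = np.linspace(0.0, 1.0, grid)
    best_q, best_val = None, np.inf
    # Pairs (i, j) with i <= j; i == j covers single-sensor policies.
    for i, j in itertools.combinations_with_replacement(range(n), 2):
        for w in weights:
            q = np.zeros(n)
            q[i] += w
            q[j] += 1.0 - w
            val = stationary_cost(q, T, D0, D1, pi0, pi1)
            if val < best_val:
                best_q, best_val = q, val
    return best_q, best_val
```

The search costs on the order of $n^2$ times the grid size, which is adequate for a handful of sensors; the grid resolution limits the accuracy of the mixing weight.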


Adaptive Policy

Adaptive Sensor Selection Probability

At each time t + 1

1: Determine posterior probabilities

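$\pi_0(t) = 1/(1 + \exp(L_t))$ and $\pi_1(t) = 1 - \pi_0(t)$

2: Adapt the sensor selection probability

$q^*_{t+1} = \operatorname{argmin}_q \{\pi_0(t) \mathbb{E}[T_d \mid H_0] + \pi_1(t) \mathbb{E}[T_d \mid H_1]\}$

Figure panels: Posterior probabilities; Sensor selection probabilities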
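The posterior update in step 1 is a logistic function of the cumulative log-likelihood ratio. A minimal sketch follows, written to avoid overflow when $|L_t|$ is large; the function name `posteriors` is hypothetical. Note that the formula on the slide coincides with Bayes' rule when the two hypotheses start with equal prior odds.

```python
import math

def posteriors(L_t):
    """Return (pi_0(t), pi_1(t)) with pi_0(t) = 1 / (1 + exp(L_t))."""
    if L_t >= 0:
        # Equivalent form exp(-L_t) / (1 + exp(-L_t)); avoids overflow.
        e = math.exp(-L_t)
        pi0 = e / (1.0 + e)
    else:
        pi0 = 1.0 / (1.0 + math.exp(L_t))
    return pi0, 1.0 - pi0
```

In the adaptive policy these two values would replace the fixed priors in the stationary objective at every step, so the selection probabilities shift as evidence accumulates.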


Performance of Adaptive Policy

Global Lower Bound

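$\mathbb{E}[T_d \mid H_k] \ge \min_{s \in \{1, \dots, n\}} \frac{\text{const} \times T_s}{D(f_s^k, f_s^*)} = \frac{\text{const} \times T_{s_k}}{D(f_{s_k}^k, f_{s_k}^*)}$

Upper Bound for Adaptive Policy

$\mathbb{E}[T_d \mid H_k] \le \min_q \max \{\mathbb{E}[T_d \mid H_0], \mathbb{E}[T_d \mid H_1]\}$

Adaptive policy is asymptotically optimal, provided each sensor is informative

Figure: Performance bounds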
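The global lower bound amounts to picking, for each hypothesis, the sensor with the smallest ratio of processing time to divergence. A small sketch under stated assumptions: `T[s]` is the processing time of sensor $s$, `D_k[s]` stands for $D(f_s^k, f_s^*)$, and `const` is left as a free parameter because the slides do not specify it. The function name `global_lower_bound` is hypothetical.

```python
import numpy as np

def global_lower_bound(T, D_k, const=1.0):
    """Return (s_k, bound): the minimizing sensor index and the bound value."""
    T = np.asarray(T, dtype=float)
    D_k = np.asarray(D_k, dtype=float)
    ratios = const * T / D_k      # const * T_s / D(f_s^k, f_s^*) per sensor
    s_k = int(np.argmin(ratios))  # sensor attaining the minimum
    return s_k, float(ratios[s_k])
```

For example, with `T = [1, 2, 1.5]` and `D_k = [0.5, 2.0, 0.3]` the per-sensor ratios are 2, 1 and 5, so the bound is attained by the second sensor even though it is the slowest one.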





Control of Decision Time

Assumption: Identical processing time of the sensors

Reorder sensors in decreasing order of $D(f_s^0, f_s^1)$

Pick the first $\zeta \le n$ sensors, $s^0_\ell$, $\ell \in \{1, \dots, \zeta\}$

Similarly, pick sensors $s^1_\ell$, $\ell \in \{1, \dots, \zeta\}$

Asymptotically optimal policy

Apply adaptive policy to sets $\Xi_k = \{s^k_\ell \mid \ell \in \{1, \dots, \zeta\}\}$, $k \in \{0, 1\}$

Asymptotic decision time: $\mathbb{E}[T_d \mid H_k] = \text{const} / \big(\sum_{\ell=1}^{\zeta} D(f^0_{s_\ell}, f^1_{s_\ell})\big)$

Control of Decision Time

Given a desired feasible expected decision time, a randomized cardinality $\zeta$ of the set can be designed
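The effect of the cardinality $\zeta$ on the asymptotic decision time can be illustrated with a short sketch. It assumes identical processing times, as on the slide, and takes an array `D` of per-sensor divergences $D(f_s^0, f_s^1)$; the function names `top_sensors` and `asymptotic_decision_time` and the free parameter `const` are hypothetical. The randomization over $\zeta$ mentioned above is not implemented here.

```python
import numpy as np

def top_sensors(D, zeta):
    """Indices of the zeta sensors with the largest divergence."""
    order = np.argsort(-np.asarray(D, dtype=float), kind="stable")
    return order[:zeta]

def asymptotic_decision_time(D, zeta, const=1.0):
    """const / (sum of the zeta largest divergences)."""
    D = np.asarray(D, dtype=float)
    return const / D[top_sensors(D, zeta)].sum()
```

Since every added sensor contributes a positive divergence to the denominator, the asymptotic decision time decreases as $\zeta$ grows, which is the trade-off between the number of sensors used and the decision time.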


Application: Search in a Camera Network

Treasure at region $k$ with prior probability $\pi_k$

Search ≡ MSPRT

One camera at each region

(evidence | treasure at location $k$) $\sim f_k^1$; $f_k^0$ otherwise

Region selection probability M-tuple: q

Each sensor non-informative about other regions

Figure panels: Region selection probability; Posterior probability of treasure


Conclusions & Future directions

Conclusions

Identification of most pertinent information sources

Max. cardinality of optimal source set = No. of hypotheses

Adaptive source selection is asymptotically optimal

Communication-decision time trade-off

Application to decision theoretic search

Future Directions

Extension to dynamic hypothesis

Extension of GLR

