Where are we?
Final MURI Review Meeting
Alan S. Willsky, December 2, 2005


Page 1: Where are we?


Where are we?

Final MURI Review Meeting

Alan S. Willsky
December 2, 2005

Page 2: Where are we?


Sensor Localization and Calibration

• Fundamental problems in making sensor networks useful:
  • Localization and calibration
  • Organization (e.g., who does what?)
• This talk focuses on the former
• Part of this research done in collaboration with Prof. Randy Moses (OSU), funded under the Sensors CTA
• The results are also directly applicable to localizing targets rather than sensors
• This problem raises issues of much broader importance

Page 3: Where are we?


Problem Formulation

• Deposit sensors at random
• A few (or none!) have absolute location information
• Obtain relative measurements:
  • Time delay of arrival
  • Received signal strength
  • Direction of arrival…
• Find a consistent sensor geometry
  • And its uncertainty…
• Denote the location of sensor t by x_t, with prior p_t(x_t)
• Observe: the event o_tu = 1 with a probability that depends on the sensor locations
• If o_tu = 1, we also observe a noisy distance measurement between the two sensors (a simulation sketch follows)
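To make the measurement model concrete, here is a minimal simulation sketch. The exponential-quadratic detection profile, the Gaussian range noise, and all names (simulate_measurements, R, sigma) are illustrative assumptions, not the specific model used in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_measurements(X, R=0.3, sigma=0.02):
    """Simulate pairwise observations for sensor locations X (N x 2).

    o[t, u] = 1 with a probability that decays with distance (assumed
    exponential-quadratic profile); if observed, d[t, u] is the true
    distance plus Gaussian noise.
    """
    N = len(X)
    o = np.zeros((N, N), dtype=bool)
    d = np.full((N, N), np.nan)
    for t in range(N):
        for u in range(t + 1, N):
            dist = np.linalg.norm(X[t] - X[u])
            p_obs = np.exp(-dist**2 / (2 * R**2))   # P(o_tu = 1 | x_t, x_u)
            if rng.random() < p_obs:
                o[t, u] = o[u, t] = True
                d[t, u] = d[u, t] = dist + sigma * rng.normal()
    return o, d

X = rng.random((10, 2))          # 10 sensors deployed uniformly at random
o, d = simulate_measurements(X)
```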

Page 4: Where are we?


Our Objectives and Approach

• Go well beyond the state of the art
• Ensuring well-posedness and “unambiguous” localization
  • If I don’t hear you, I may be farther away…
• Dealing with non-Gaussianity of errors and of measurement errors
• Dealing with accidental (or intentional) outliers
• Distributed implementation
• Opening the door for investigations of tradeoffs between accuracy and communications

Page 5: Where are we?


Sensor Localization: Graphical model

• Associate each node in the graph with a random variable
• Use edges (graph separation) to describe conditional independence between variables
• Distribution: pairwise Markov random field (MRF)
• The sensor localization problem has a distribution with a natural interpretation as a pairwise MRF; its terms come from:
  • The likelihood of observing a measurement
  • If observed, the noise likelihood
  • Prior information
• It is easy to modify the edge potentials to include the possibility of outliers (see the sketch below)
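A hedged sketch of how such node and edge potentials could be assembled; the detection profile and Gaussian noise model mirror the simulation above and are assumptions for illustration, not the exact potentials from the talk.

```python
import numpy as np

def edge_potential(x_t, x_u, observed, d_tu=None, R=0.3, sigma=0.02):
    """Pairwise potential psi_tu(x_t, x_u) for the localization MRF.

    Combines the detection likelihood P(o_tu | x_t, x_u) with, if a
    distance was observed, the measurement-noise likelihood.
    """
    dist = np.linalg.norm(np.asarray(x_t) - np.asarray(x_u))
    p_obs = np.exp(-dist**2 / (2 * R**2))        # probability of observing this pair
    if observed:
        noise = np.exp(-(d_tu - dist)**2 / (2 * sigma**2))  # Gaussian noise likelihood
        return p_obs * noise
    return 1.0 - p_obs                           # "unobserved" edge: weak repulsion

def node_potential(x_t, prior):
    """Single-node potential: the prior p_t(x_t), e.g. anchor information."""
    return prior(x_t)
```

An outlier process could be folded in by mixing the Gaussian noise term with a broad (e.g., uniform) component inside edge_potential.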

Page 6: Where are we?


Sensor Localization: A fully connected graph?

• Unfortunately, all pairs of sensors are related!
• “Observed” edges are strong:
  • P(o=1) tells us two sensors are nearby
  • The distance measurement info is even stronger
• “Unobserved” edges are relatively weak:
  • P(o=0) tells us only that two sensors are probably far apart
• Approximate with a simplified graph

Page 7: Where are we?


Sensor Localization: Approximate Graph Formulation

• Approximate the full graph using only “local” edges; examine two cases:
  • “1-step”: keep only observed edges (o_tu = 1)
  • “2-step”: also keep edges with o_tv = o_vu = 1 but o_tu = 0
  • Can imagine continuing to “3-step”, etc…
• Notice the relationship to the communications graph:
  • “1-step” edges are feasible inter-sensor communications
  • “2-step” messages are single-hop forwarding
  • Easily distributed solution
• How many edges should we keep?
  • Experimentally, little improvement beyond two steps (see the sketch below)
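A small sketch of the graph-pruning rule, using the boolean observation matrix from the earlier simulation; the function name and the matrix-based neighbour test are illustrative assumptions.

```python
import numpy as np

def approximate_graph(o, n_step=2):
    """Keep only 'local' edges of the full graph.

    o is the boolean observation matrix (o[t, u] = True if a distance was
    measured).  '1-step' keeps observed edges only; '2-step' also keeps
    edges (t, u) that share an observed neighbour v (o_tv = o_vu = 1).
    """
    edges = set()
    N = len(o)
    for t in range(N):
        for u in range(t + 1, N):
            if o[t, u]:
                edges.add((t, u))                    # 1-step edge
            elif n_step >= 2 and np.any(o[t] & o[u]):
                edges.add((t, u))                    # 2-step edge via some v
    return edges
```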

Page 8: Where are we?


Distributed algorithms for the computation of location marginals

• This formulation demonstrates that this is a problem of inference on graphical models
• There are (many!) loops, so exact inference is an (NP-hard) challenge
• One approach is to use a suboptimal message-passing algorithm such as Belief Propagation (BP)
• Other aspects of our work have dealt with enhanced algorithms well beyond BP
• However, if we want to use BP or any of these other algorithms, we still have challenges:
  • BP messages are nice vectors if the variables are discrete or Gaussian
  • Neither is the case for sensor localization
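For contrast with what follows, here is a generic discrete-variable BP message update (textbook form, not code from this project), which makes concrete why discrete messages are "nice vectors": each message is just a finite array.

```python
import numpy as np

def bp_message(psi_ts, phi_t, incoming):
    """Compute the message m_{t->s}(x_s) for discrete variables.

    psi_ts:   pairwise potential, array of shape (|X_t|, |X_s|)
    phi_t:    local potential at node t, shape (|X_t|,)
    incoming: list of messages m_{u->t} from t's other neighbours
    """
    prod = phi_t.copy()
    for m in incoming:
        prod *= m                       # multiply incoming messages
    msg = psi_ts.T @ prod               # sum over x_t
    return msg / msg.sum()              # normalize
```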

Page 9: Where are we?


Sensor Localization: Uncertainty in Localization

[Figure: example network with prior info, true marginal uncertainties, and NBP-estimated marginals]

• Uncertainty has two primary forms:
  • “Observed” distance: ring-like likelihood function
  • “Unobserved” distance: repulsive relationship
• Uncertainty can be very non-Gaussian:
  • Multiple ring-like functions yield bimodalities, crescents, …
• “High” dimensional (2-3D):
  • Discretization is computationally impractical
  • Alternative: kernel density estimates (sample-based), as sketched below
  • Similar to (regularized) particle filtering…
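A sketch of the sample-based representation: draw samples consistent with a ring-like distance likelihood and summarize them with a kernel density estimate. The use of SciPy's gaussian_kde (and its default bandwidth rule) and the specific numbers are assumptions for illustration.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)

# Samples from a ring-like belief: "sensor u is roughly distance d from x_t"
x_t, d, sigma = np.array([0.0, 0.0]), 1.0, 0.05
theta = rng.uniform(0, 2 * np.pi, 500)
radius = d + sigma * rng.normal(size=500)
samples = x_t + np.c_[radius * np.cos(theta), radius * np.sin(theta)]

# Kernel density estimate of the (clearly non-Gaussian) belief
belief = gaussian_kde(samples.T)
print(belief(np.array([[1.0], [0.0]])))   # density is high near the ring
```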

Page 10: Where are we?


Nonparametric Inference for General Graphs

Problem: What is the product of two collections of particles?

• Belief Propagation: general graphs; discrete or Gaussian variables
• Particle Filters: Markov chains; general potentials
• Nonparametric BP: general graphs; general potentials

Page 11: Where are we?


Nonparametric BP

Stochastic update of kernel-based messages:

I. Message product: draw samples of x_t from the product of all incoming messages and the local observation potential

II. Message propagation: draw samples of x_s from the compatibility function psi_ts(x_t, x_s), fixing x_t to the values sampled in step I

• The samples form a new kernel density estimate of the outgoing message (determine new kernel bandwidths); a condensed sketch follows
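A condensed sketch of one such update, following the two steps above. The callable names, the deferred product sampler (see the next slide), and the rule-of-thumb bandwidth are all assumptions; the actual bandwidth selection rule is not specified here.

```python
import numpy as np

def nbp_message_update(sample_product, sample_pairwise, M):
    """One NBP message update m_{t->s}.

    sample_product():   draws one sample x_t from the product of the incoming
                        messages and the local observation potential (step I;
                        see the label-sampling sketch on the next slide).
    sample_pairwise(x): draws x_s from the compatibility psi_ts(x_t=x, x_s)
                        (step II).
    Returns M samples of x_s plus kernel bandwidths, i.e. a new
    kernel-density estimate of the outgoing message.
    """
    x_t = np.array([sample_product() for _ in range(M)])          # step I
    x_s = np.array([sample_pairwise(x) for x in x_t])             # step II
    # Rule-of-thumb (Silverman-style) bandwidth per dimension -- an assumption
    bandwidth = 1.06 * x_s.std(axis=0) * M ** (-1 / 5)
    return x_s, bandwidth
```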

Page 12: Where are we?


The computational challenge

• Computationally hard step: computing message products
  • Input: d messages, M kernels each
  • Output: the product contains M^d kernels
• How do we generate M samples from the product without explicitly computing it?
  • The key issue is the label sampling problem (which kernel to pick from each message)
  • Efficient solutions use importance sampling and multiresolution KD-trees (later); a sketch of the label-sampling idea follows
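One way to do the label sampling, sketched here as Gibbs sampling over kernel indices. A shared isotropic bandwidth, uniform kernel weights, and at least two incoming messages are simplifying assumptions, and the importance-sampling / KD-tree accelerations referred to above are omitted.

```python
import numpy as np

def sample_product_gibbs(means, bandwidth, n_sweeps=10, rng=None):
    """Draw one sample from the product of d Gaussian-mixture messages.

    means: list of d arrays, each of shape (M, dim) -- the kernel centers of
    each incoming message (d >= 2).  Rather than enumerating all M^d product
    components, Gibbs-sample the label vector (one kernel index per message).
    """
    rng = rng or np.random.default_rng()
    d = len(means)
    labels = [rng.integers(len(m)) for m in means]
    for _ in range(n_sweeps):
        for i in range(d):
            # Product of the other d-1 selected kernels is a Gaussian with
            # mean = average of their centers and variance bandwidth^2/(d-1).
            other = np.mean([means[j][labels[j]] for j in range(d) if j != i], axis=0)
            # Conditional weight of kernel j in message i given the others:
            # N(mu_ij; other, bandwidth^2 + bandwidth^2/(d-1))
            var = bandwidth**2 * d / (d - 1)
            dist2 = np.sum((means[i] - other) ** 2, axis=1)
            w = np.exp(-(dist2 - dist2.min()) / (2 * var))
            labels[i] = rng.choice(len(means[i]), p=w / w.sum())
    # Product of the d selected kernels: mean = average of centers, var = bw^2/d
    mu = np.mean([means[i][labels[i]] for i in range(d)], axis=0)
    return mu + (bandwidth / np.sqrt(d)) * rng.standard_normal(mu.shape)
```

The local observation potential is ignored here; it could be treated as one more mixture in the list, or folded in by importance weighting.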

Page 13: Where are we?


Sensor Localization: Example Networks (Small)

• 10-node graph
• Addition of “2-step” edges:
  • Reduces bimodality
  • Improves location estimates

[Figure: NBP (“2-step”) vs. joint MAP estimates]

Page 14: Where are we?


Sensor Localization: Example Networks (Large)

[Figure: “1-step” and “2-step” graphs; estimates from nonlinear least squares, NBP (“1-step”), and NBP (“2-step”)]

Page 15: Where are we?

Sensor Localization: Extension to Outlier Measurements

• Addition of an outlier process
• Robust noise estimation
• The dashed line has large error:
  • MAP: discard this measurement
  • NLLS: large distortion
  • NBP: easy to change the noise distribution

[Figure: nonlinear least-squares, NBP (“2-step”), and MAP estimates]

Page 16: Where are we?


Message Errors

• Effect of “distortion” to messages in BP
• Why distort BP messages?
  • Quantization effects
  • Communications constraints
  • Stochastic approximation
  • …
• Results in…
  • Convergence of loopy BP (zero distortion)
  • Distance between multiple fixed points
  • Error in beliefs due to message distortions (or errors in the potential functions)

Page 17: Where are we?


Message Approximation

• How different are two BP messages?
• Message “error” as a ratio of the two messages (or, a difference of log-messages)
• One (scalar) measure: the dynamic range
• Equivalent log form (see below)
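As a reconstruction (my reading of the associated message-error analysis, since the slide's equations are not in this transcript): for messages m and \hat m, the error ratio and its dynamic range can be written as

```latex
e(x) = \frac{m(x)}{\hat m(x)},
\qquad
d(e) = \sup_{x,y}\sqrt{\frac{m(x)\,\hat m(y)}{\hat m(x)\,m(y)}},
\qquad
\log d(e) = \tfrac{1}{2}\Bigl[\sup_x \log e(x) - \inf_x \log e(x)\Bigr].
```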

Page 18: Where are we?


Why the dynamic range? Relationship to L1 norm on log-messages:

Define

Then

Page 19: Where are we?


Properties of d(e)

• Triangle inequality
• Messages combine sub-additively
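My hedged reconstruction of these two properties (a paraphrase of the published message-error analysis, not the slide's exact equations):

```latex
\log d\!\left(\tfrac{m_1}{m_3}\right) \;\le\; \log d\!\left(\tfrac{m_1}{m_2}\right) + \log d\!\left(\tfrac{m_2}{m_3}\right)
\qquad \text{(triangle inequality)}

\log d\!\left(\tfrac{\prod_i m_i}{\prod_i \hat m_i}\right) \;\le\; \sum_i \log d\!\left(\tfrac{m_i}{\hat m_i}\right)
\qquad \text{(sub-additivity under the message product)}
```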

Page 20: Where are we?


Properties of d(e), cont’d

• Message errors contract under convolution with finite-strength potentials: given “incoming errors”, the message convolution produces an “outgoing error” that is bounded in terms of the incoming error and a measure of the potential strength (see the reconstruction below)

[Figure: outgoing error log d(e) vs. incoming error log d(E)]
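A hedged reconstruction of the contraction bound plotted in that figure, in the form I recall from the published analysis (here d(\psi) measures the potential strength and d(E) the combined incoming error):

```latex
d(\text{outgoing error}) \;\le\; \frac{d(\psi)^2\, d(E) + 1}{d(\psi)^2 + d(E)}
```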

Page 21: Where are we?


Results using this measure

• Best known convergence results for loopy BP
• The result also bounds the relative locations of multiple fixed points and provides conditions for uniqueness of the fixed point
• Bounds and stochastic approximations for the effects of (possibly intentional) message errors
• The basic results use worst-case analysis:
  • If there is significant structure in the model (e.g., some edges that are strong and some that are weak), this can be exploited to obtain tighter results

Page 22: Where are we?


Computation Trees (Weiss & Freeman ’99, along with many others)

• Tree-structured “unrolling” of the loopy graph
• Contains all length-N “forward” paths
• At the root, equivalence between N iterations of BP and N upward stages on the computation tree
• Iterative use of subadditivity and contraction leads both to bounds on error propagation and to conditions for BP convergence

Page 23: Where are we?


“Simple” convergence condition

• Use an inductive argument:
  • Let the “true” messages be any fixed point
  • Bound E_{i+1} using E_i
• Loopy BP converges if this recursion is a contraction (simple calculus)
• The bound is still meaningful for finite iterations

Page 24: Where are we?

Experimental comparisons

• Two example graphical models, (a) and (b)
• Compute:
  • Simon’s condition (uniqueness only)
  • The simple bound on fixed-point distance
  • Bounds using graph geometry

[Figure: comparison of the bounds as a function of potential strength for models (a) and (b)]

Page 25: Where are we?


Adding message distortions

• Suppose we distort each computed BP message
  • Adding a maximum error to each message changes the iteration
• Strict bound on the steady-state error
  • Note: the “error” is relative to the “exact” loopy BP solution
• Similar: errors in the potential functions
• Can also use the framework to estimate the error
  • Assume, e.g., that incoming message errors are uncorrelated
  • An estimate only, but it may be a better guess for quantization

Page 26: Where are we?


Experiments: Quantization Effects (I)

• Small (5x5) grid, binary random variables (positive/mixed correlation)
• Relatively weak potential functions:
  • Loopy BP is guaranteed to converge
  • The bound and the estimate behave similarly

[Figure: results vs. quantization error]

Page 27: Where are we?


Experiments: Quantization Effects (II)

• Increase the potential strength:
  • Loopy BP is no longer guaranteed to converge
  • The bound asymptotes as the quantization error goes to zero
  • The estimate (assuming uncorrelated errors) may still be useful

[Figure: results vs. quantization error]

Page 28: Where are we?


Communicating particle sets

• Problem: transmit N i.i.d. samples
• Sequence of samples:
  • Expected cost is ≈ N·R·H(p), where H(p) is the differential entropy and R is the resolution of the samples
• Set of samples:
  • Invariant to reordering, so we can reorder to reduce the transmission cost
  • Expected cost is ≈ N·R·H(p) − log(N!)
  • Entropy is reduced for any deterministic order: in 1-D, “sorted” order; in more than one dimension, exploit KD-trees
• Yields a direct tradeoff between message accuracy and the bits required for transmission
• Together with the message error analysis, we have an audit trail from bits to fusion performance (see the arithmetic sketch below)
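A quick arithmetic sketch of the reordering saving. Only the log(N!) term comes from the slide (counted here in bits); the per-sample cost is an arbitrary illustrative number.

```python
import math

N = 100                      # number of i.i.d. samples in the particle message
bits_per_sample = 32         # assumed cost per sample in arrival order (illustrative)

cost_sequence = N * bits_per_sample            # transmit as an ordered sequence
saving = math.log2(math.factorial(N))          # any fixed order (e.g. sorted) saves log2(N!)
cost_set = cost_sequence - saving              # transmit as an (unordered) set

print(f"sequence: {cost_sequence} bits, set: {cost_set:.0f} bits")
print(f"saving: {saving:.0f} bits total, about {saving / N:.1f} bits per sample")
```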

Page 29: Where are we?


Trading off error vs. communications

• Examine a simple hierarchical approximation (many other approximations are possible…)
• KD-trees:
  • The tree structure successively divides point sets, typically along some cardinal dimension
  • Cache statistics of the subsets for fast computation (for example, cache means and covariances)
• Can also be used for approximation…
  • Any cut through the tree is a density estimate
  • Easy to optimize over possible cuts:
    • Communications cost
    • Upper bound on the error (KL, max-log, etc.); see the sketch below
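A minimal sketch of a KD-tree that caches Gaussian statistics per node, with a fixed-depth cut read off as a mixture-of-Gaussians density estimate. The class and function names are illustrative, and the cut here is a fixed depth rather than an optimized accuracy/cost tradeoff.

```python
import numpy as np

class KDNode:
    """KD-tree node caching the mean and covariance of its point subset."""
    def __init__(self, points, depth=0, leaf_size=8):
        self.n = len(points)
        self.mean = points.mean(axis=0)
        self.cov = (np.cov(points.T) if self.n > 1
                    else np.zeros((points.shape[1], points.shape[1])))
        self.left = self.right = None
        if self.n > leaf_size:
            axis = depth % points.shape[1]          # split along a cardinal dimension
            order = points[:, axis].argsort()
            half = self.n // 2
            self.left = KDNode(points[order[:half]], depth + 1, leaf_size)
            self.right = KDNode(points[order[half:]], depth + 1, leaf_size)

def cut(node, depth):
    """A cut through the tree as a density estimate: return the Gaussian
    components (count, mean, covariance) at a fixed depth."""
    if depth == 0 or node.left is None:
        return [(node.n, node.mean, node.cov)]
    return cut(node.left, depth - 1) + cut(node.right, depth - 1)

points = np.random.default_rng(0).normal(size=(500, 2))
tree = KDNode(points)
components = cut(tree, depth=3)     # a mixture of up to 8 Gaussians summarizing the data
```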

Page 30: Where are we?


Examples – Sensor localization

• Many inter-related aspects; tested on a small (10-node) graph…
• Message schedule:
  • Outward “tree-like” pass
  • Typical “parallel” schedule
• Number of iterations (messages):
  • Typically very few are required (1-3)
  • Could be replaced by a message stopping criterion
• Message approximation / bit budget:
  • Most messages are (eventually) “simple”: unimodal, near-Gaussian
  • Early messages and poorly localized sensors may require more bits / components…

Page 31: Where are we?


Examples – Particle filtering

• Simple single-object tracking:
  • Diffusive dynamics (random walk)
  • Nonlinear (distance-based) observations
• Transmit the posterior at each time step, with a fixed bit budget for each transmission
• Compare: simple subsampling vs. KD-tree approximation

[Figure: KL divergence vs. time / number of transmissions, comparing exact communication, KD-tree approximation (200 and 1000 bits/msg), and subsampling (200 and 1000 bits/msg)]

Page 32: Where are we?


Contributions - I

• Sensor Localization:
  • Best-known well-posedness methodology
  • Easy incorporation of outliers
  • Distributed implementation
  • Explicit computation of (highly non-Gaussian) uncertainty
• Nonparametric Belief Propagation:
  • Finding broad applications, including elsewhere in sensor networks (e.g., multitarget tracking; later)

Page 33: Where are we?


Contributions - II

• Message Error Analysis for BP:
  • Best-known convergence conditions
  • Methodology for quantifying the effects of message errors
  • Provides a basis for investigating important questions:
    • This presentation
    • Effects of “message censoring” (later)
    • Communications cost of “sensor handoff” (later)

Page 34: Where are we?


Contributions - III

• Efficient communication of particle-based messages:
  • Near-optimal coding using KD-trees
  • Multiresolution framework for trading off communications cost against the impact of message errors

Page 35: Where are we?


The way forward

• Extension of the error analysis to other message-passing formalisms:
  • Other high-performance algorithms we have developed (and that are on the horizon for the future): TRP, ET, RCM, …
• Extension to include the cost of protocol bits, to allow closer-to-optimal message exploitation
• Extension to assess the value of additional memory at sensor nodes
• Extension to “team”-oriented criteria:
  • Some large message errors are OK if the information carried by the additional bits is of little incremental value to the receiver