Where are we?
Final MURI Review Meeting
Alan S. Willsky, December 2, 2005
Sensor Localization and Calibration
Fundamental problems in making sensor networks useful:
• Localization and calibration
• Organization (e.g., who does what?)
This talk focuses on the former. Part of this research was done in collaboration with Prof. Randy Moses (OSU), funded under the Sensors CTA. The results are also directly applicable to localizing targets rather than sensors. This problem raises issues of much broader importance.
Problem Formulation
Deposit sensors at random; a few (or none!) have absolute location information. Obtain relative measurements:
• Time delay of arrival
• Received signal strength
• Direction of arrival…
Find a consistent sensor geometry, and its uncertainty…
Denote the location of sensor t by x_t, with prior p(x_t). Observe: the detection event o_tu = 1 occurs with a probability P(o_tu = 1 | x_t, x_u) that depends on the inter-sensor distance; if we observe (o_tu = 1), we also obtain a noisy distance measurement d_tu = ||x_t - x_u|| + noise.
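As a concrete illustration of this measurement model, here is a minimal simulation sketch; the Gaussian-falloff detection probability, the Gaussian distance noise, and all parameter values are illustrative assumptions, not specifics from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_measurements(X, R=0.3, sigma=0.02):
    """Simulate pairwise observations for sensor locations X (N x 2).

    Assumed illustrative model: sensors t, u detect each other (o_tu = 1)
    with probability exp(-||x_t - x_u||^2 / (2 R^2)); when detected, the
    measured distance d_tu is the true distance plus Gaussian noise.
    """
    obs = {}
    N = len(X)
    for t in range(N):
        for u in range(t + 1, N):
            dist = np.linalg.norm(X[t] - X[u])
            if rng.random() < np.exp(-0.5 * dist**2 / R**2):
                obs[(t, u)] = dist + sigma * rng.standard_normal()
    return obs

X = rng.uniform(0.0, 1.0, size=(10, 2))   # deposit 10 sensors at random
measurements = simulate_measurements(X)   # {(t, u): noisy distance d_tu}
```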
Our Objectives and Approach
Go well beyond the state of the art:
• Ensuring well-posedness and “unambiguous” localization (if I don’t hear you, I may be farther away…)
• Dealing with non-Gaussianity, both of location uncertainties and of measurement errors
• Dealing with accidental (or intentional) outliers
• Distributed implementation
• Opening the door for investigations of tradeoffs between accuracy and communications
Sensor Localization: Graphical Model
Associate each node in the graph with a random variable, and use edges (graph separation) to describe conditional independence between variables. Distribution: pairwise Markov random field.
The sensor localization problem has distribution
p(x_1, …, x_N | {o_tu}, {d_tu}) ∝ ∏_t p(x_t) · ∏_(t,u) ψ_tu(x_t, x_u),
with a natural interpretation as a pairwise MRF; the terms come from:
• The likelihood of observing a measurement
• If observed, the noise likelihood
• Prior information
It is easy to modify the edge potentials to include the possibility of outliers.
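A sketch of what such edge potentials might look like follows; the functional forms (Gaussian detection falloff, Gaussian noise, a uniform outlier component with weight p_out) are assumptions for illustration, matching the simulated model above.

```python
import numpy as np

def observed_edge_potential(x_t, x_u, d_obs, R=0.3, sigma=0.02, p_out=0.05):
    """psi_tu(x_t, x_u) for a pair with o_tu = 1 and measured distance d_obs:
    (probability of detection) x (noise likelihood), with a broad uniform
    outlier component mixed in -- the "easy to modify" robustification."""
    d = np.linalg.norm(x_t - x_u)
    p_detect = np.exp(-0.5 * d**2 / R**2)
    noise_lik = np.exp(-0.5 * (d_obs - d) ** 2 / sigma**2)
    return p_detect * ((1.0 - p_out) * noise_lik + p_out)

def unobserved_edge_potential(x_t, x_u, R=0.3):
    """psi_tu for a pair with o_tu = 0: the probability of *not* detecting,
    which gently pushes the two sensors apart."""
    d = np.linalg.norm(x_t - x_u)
    return 1.0 - np.exp(-0.5 * d**2 / R**2)
```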
Sensor Localization: A Fully Connected Graph?
Unfortunately, all pairs of sensors are related!
• “Observed” edges are strong: P(o = 1) tells us two sensors are nearby, and the distance measurement info is even stronger.
• “Unobserved” edges are relatively weak: P(o = 0) tells us only that two sensors are probably far apart.
Approximate with a simplified graph.
Sensor Localization: Approximate Graph Formulation
Approximate the full graph using only “local” edges. Examine two cases:
• “1-step”: keep only observed edges (o_tu = 1)
• “2-step”: also keep edges with o_tv = o_vu = 1 but o_tu = 0
• Can imagine continuing to “3-step”, etc…
Notice the relationship to the communications graph: “1-step” edges are feasible inter-sensor communications, and “2-step” messages are single-hop forwarding, giving an easily distributed solution (sketched below).
How many edges should we keep? Experimentally, little improvement beyond two.
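A minimal sketch of the “1-step”/“2-step” graph construction, assuming the observed pairs are given as (t, u) tuples; the function name and interface are illustrative.

```python
def approximate_graph(observed_pairs, n_sensors, steps=2):
    """Build the approximate localization graph as a set of undirected edges.

    '1-step': keep only observed edges (o_tu = 1).
    '2-step': additionally connect t and u when o_tu = 0 but both were
    observed by a common neighbor v (o_tv = o_vu = 1).
    """
    nbrs = {t: set() for t in range(n_sensors)}
    for t, u in observed_pairs:
        nbrs[t].add(u)
        nbrs[u].add(t)
    edges = {tuple(sorted(p)) for p in observed_pairs}
    if steps >= 2:
        for v in range(n_sensors):
            for t in nbrs[v]:
                for u in nbrs[v]:
                    if t < u:
                        # 2-step edge via v; pairs already observed are
                        # simply absorbed by the set.
                        edges.add((t, u))
    return edges
```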
Distributed algorithms for the computation of location marginals
This formulation demonstrates that this is a problem of inference on graphical models. There are (many!) loops, so exact inference is an (NP-hard) challenge. One approach is to use a suboptimal message-passing algorithm such as belief propagation (BP); other aspects of our work have dealt with enhanced algorithms well beyond BP. However, if we want to use BP or any of these other algorithms, we still have challenges: BP messages are nice vectors if the variables are discrete or Gaussian, and neither is the case for sensor localization.
Sensor Localization: Uncertainty in Localization
[Figure: example network, showing prior info, true marginal uncertainties, and NBP-estimated marginals]
Uncertainty has two primary forms:
• “Observed” distance: ring-like likelihood function
• “Unobserved” distance: repulsive relationship
Uncertainty can be very non-Gaussian: multiple ring-like functions yield bimodalities, crescents, … It is also “high”-dimensional (2–3D), so discretization is computationally impractical. Alternative: kernel density estimates (sample-based), similar to (regularized) particle filtering…
Nonparametric Inference for General Graphs
Problem: what is the product of two collections of particles?
• Belief propagation: general graphs; discrete or Gaussian potentials
• Particle filters: Markov chains; general potentials
• Nonparametric BP: general graphs; general potentials
Nonparametric BP
Stochastic update of kernel-based messages:
I. Message product: draw samples of x_t from the product of all incoming messages and the local observation potential.
II. Message propagation: draw samples of x_u from the compatibility function ψ_tu(x_t, x_u), fixing x_t to the values sampled in step I.
The samples form a new kernel density estimate of the outgoing message (determine new kernel bandwidths).
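A schematic version of this two-step update, assuming the product of incoming messages has already been reduced to weighted samples and that a conditional sampler for the pairwise potential is available; both are assumptions, and `pairwise_sampler` is a hypothetical helper.

```python
import numpy as np

def nbp_message_update(xt_samples, xt_weights, pairwise_sampler, M, rng):
    """One NBP message update m_{t->u}.

    Step I  (message product): resample M values of x_t according to the
            weights of the incoming product / local potential
            (xt_weights assumed normalized).
    Step II (message propagation): for each sampled x_t, draw x_u from the
            compatibility psi_tu(x_t, .) via pairwise_sampler.
    The returned x_u samples define the new kernel density estimate of the
    outgoing message (bandwidths chosen separately, e.g. rule of thumb).
    """
    idx = rng.choice(len(xt_samples), size=M, p=xt_weights)              # step I
    x_u = np.array([pairwise_sampler(xt_samples[i], rng) for i in idx])  # step II
    return x_u
```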
The computational challenge
The computationally hard step is computing message products. Input: d messages with M kernels each; output: a product containing M^d kernels. How do we generate M samples from the product without explicitly computing it?
• The key issue is the label sampling problem (which kernel?)
• Efficient solutions use importance sampling and multiresolution KD-trees (later)
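For intuition, here is a plain importance-sampling sketch of the label-sampling idea in 1-D; the Gaussian-KDE message representation, the mixture proposal, and the resampling step are illustrative assumptions, and the talk’s efficient multiresolution KD-tree samplers are not shown.

```python
import numpy as np

def kde_eval(msg, x):
    """Evaluate a 1-D Gaussian kernel density (means, weights, bandwidth) at x."""
    means, w, h = msg
    return float(np.sum(w * np.exp(-0.5 * ((x - means) / h) ** 2)
                        / (h * np.sqrt(2 * np.pi))))

def sample_product(messages, M, rng):
    """Draw ~M samples from the product of d KDE messages without touching
    all M^d product kernels: propose a (message, kernel) label, then
    importance-weight by the product of all messages and resample."""
    xs, ws = [], []
    for _ in range(M):
        means, w, h = messages[rng.integers(len(messages))]   # proposal: one message
        x = means[rng.choice(len(means), p=w)] + h * rng.standard_normal()
        target = np.prod([kde_eval(m, x) for m in messages])
        proposal = np.mean([kde_eval(m, x) for m in messages])
        xs.append(x)
        ws.append(target / proposal)
    ws = np.array(ws) / np.sum(ws)
    return np.array(xs)[rng.choice(M, size=M, p=ws)]          # resample by weight
```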
Sensor Localization: Example Networks (Small)
10-node graph:
• Addition of “2-step” edges
• Reduces bimodality
• Improves location estimates
[Figure: NBP (“2-step”) vs. joint MAP estimates]
Sensor Localization: Example Networks (Large)
[Figure: “1-step” graph vs. “2-step” graph; estimates from nonlinear least-squares, NBP “1-step”, and NBP “2-step”]
Sensor Localization: Extension to Outlier Measurements
• Addition of an outlier process
• Robust noise estimation
• The dashed line has large error: MAP discards this measurement; NLLS suffers large distortion; NBP makes it easy to change the noise distribution
[Figure: nonlinear least-squares vs. NBP (“2-step”) vs. MAP estimates]
Message Errors
Effect of “distortion” to messages in BP. Why distort BP messages?
• Quantization effects
• Communications constraints
• Stochastic approximation
• …
Results in…
• Convergence of loopy BP (zero distortion)
• Distance between multiple fixed points
• Error in beliefs due to message distortions (or errors in the potential functions)
Message Approximation
How different are two BP messages? Define the message “error” as a ratio, e(x) = m(x) / m̂(x) (or, equivalently, as a difference of log-messages). One (scalar) measure is the dynamic range,
d(e) = sup_{x,y} sqrt( e(x) / e(y) ),
with the equivalent log-form
log d(e) = (1/2) sup_{x,y} | log e(x) − log e(y) |.
Why the dynamic range? It is closely related to a norm on log-messages: define the rescaled error α·e(x) and optimize over the scalar α; then
log d(e) = min_α sup_x | log( α e(x) ) |,
i.e., the dynamic range is the norm of the log-error after the best scalar normalization (BP messages are only defined up to scale).
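On a discretized message this measure is a one-liner; the sketch below follows the definitions reconstructed above.

```python
import numpy as np

def log_dynamic_range(m, m_hat):
    """log d(e) for positive message vectors m, m_hat.

    With e(x) = m(x)/m_hat(x) and d(e) = sup_{x,y} sqrt(e(x)/e(y)), we get
    log d(e) = (max_x log e - min_x log e) / 2.  The measure is invariant
    to rescaling either message, as BP messages are only defined up to
    normalization.
    """
    log_e = np.log(m) - np.log(m_hat)
    return 0.5 * (np.max(log_e) - np.min(log_e))
```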
Properties of d(e)
• Triangle inequality: log d(e1·e2) ≤ log d(e1) + log d(e2)
• Messages combine sub-additively: the error of a product of incoming messages is at most the product of the individual message errors
Properties of d(e), cont’d
Message errors contract under convolution with finite-strength potentials: given “incoming errors” with combined dynamic range d(E), the message convolution produces “outgoing error” bounded by
d(e_out) ≤ ( d(ψ)² d(E) + 1 ) / ( d(ψ)² + d(E) ),
where d(ψ) is a measure of the potential strength.
[Figure: outgoing log d(e) vs. incoming log d(E), illustrating the contraction]
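The contraction can be checked numerically on a random discrete model; this sketch uses the bound as reconstructed above, taking d(ψ) to be the dynamic range of the potential over all argument pairs (an assumption consistent with that reconstruction).

```python
import numpy as np

rng = np.random.default_rng(1)
K = 4
psi = np.exp(0.4 * rng.standard_normal((K, K)))   # finite-strength pairwise potential
m = rng.random(K) + 0.1                           # "true" incoming message
e = np.exp(0.5 * rng.standard_normal(K))          # multiplicative incoming error

def d(v):
    """Dynamic range of a positive array: sqrt(max / min)."""
    return np.sqrt(np.max(v) / np.min(v))

out_true = psi.T @ m                              # BP convolution of true message
out_err = psi.T @ (m * e)                         # ... of the perturbed message
lhs = d(out_err / out_true)                       # outgoing error
rhs = (d(psi) ** 2 * d(e) + 1) / (d(psi) ** 2 + d(e))  # contraction bound
print(f"outgoing error {lhs:.4f} <= bound {rhs:.4f}")
```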
Results using this measure
• Best known convergence results for loopy BP
• The result also bounds the relative locations of multiple fixed points and provides conditions for uniqueness of the fixed point
• Bounds and stochastic approximations for the effects of (possibly intentional) message errors
The basic results use worst-case analysis; if there is significant structure in the model (e.g., some edges that are strong and some that are weak), this can be exploited to obtain tighter results.
Computation Trees (Weiss & Freeman ’99, along with many others)
• Tree-structured “unrolling” of the loopy graph; contains all length-N “forward” paths
• At the root, there is an equivalence between N iterations of BP and N upward stages on the computation tree
• Iterative use of sub-additivity and contraction leads both to bounds on error propagation and to conditions for BP convergence
“Simple” convergence condition
Use an inductive argument: let the “true” messages be any fixed point, and bound the error E_(i+1) at iteration i+1 in terms of E_i. Simple calculus then shows that loopy BP converges whenever the net per-iteration error gain stays below one: roughly, (d_max − 1) · (d(ψ)² − 1) / (d(ψ)² + 1) < 1 for homogeneous potentials, where d_max is the maximum node degree. The bound is still meaningful for finite iterations.
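A tiny checker for the homogeneous form of this condition; the specific formula is as reconstructed above and should be treated as an approximation of the published result rather than the exact statement.

```python
def loopy_bp_converges(d_psi, max_degree):
    """Sufficient condition (homogeneous potentials, as reconstructed above):
    each outgoing error combines at most (max_degree - 1) incoming errors,
    each contracted by (d_psi^2 - 1)/(d_psi^2 + 1); BP converges if the
    net per-iteration gain is below 1."""
    gain = (max_degree - 1) * (d_psi**2 - 1.0) / (d_psi**2 + 1.0)
    return gain < 1.0

# Example: on a 4-connected grid (max degree 4), this guarantees
# convergence whenever (d_psi^2 - 1)/(d_psi^2 + 1) < 1/3, i.e. d_psi^2 < 2.
print(loopy_bp_converges(d_psi=1.3, max_degree=4))   # True
```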
Experimental comparisons
Two example graphical models, (a) and (b). Compute:
• Simon’s condition (uniqueness only)
• The simple bound on fixed-point distance
• Bounds using graph geometry
[Figure: computed bounds vs. potential strength for graphs (a) and (b)]
Adding message distortions
Suppose we distort each computed BP message: adding a maximum error to each message changes the iteration, and yields a strict bound on the steady-state error. Note: here “error” is measured against the “exact” loopy BP solution. Similar results hold for errors in the potential functions.
We can also use the framework to estimate the error, assuming e.g. that incoming message errors are uncorrelated. This is an estimate only, but it may be a better guess for quantization.
Experiments: Quantization Effects (I)
Small (5×5) grid, binary random variables (positive/mixed correlation), with relatively weak potential functions, so loopy BP is guaranteed to converge. The bound and the estimate behave similarly.
[Figure: results vs. quantization error]
Experiments: Quantization Effects (II)
Increase the potential strength, so loopy BP is no longer guaranteed to converge. The bound asymptotes as the quantization error goes to zero; the estimate (assuming uncorrelated errors) may still be useful.
[Figure: results vs. quantization error]
Communicating particle sets
Problem: transmit N iid samples.
• As a sequence of samples, the expected cost is ≈ N·R·H(p), where H(p) is the differential entropy and R the resolution of the samples.
• As a set of samples, the message is invariant to reordering, and we can reorder to reduce the transmission cost: the expected cost becomes ≈ N·R·H(p) − log(N!). Entropy is reduced for any deterministic order: in 1-D, sorted order; in more than one dimension, exploit KD-trees.
This yields a direct tradeoff between message accuracy and the bits required for transmission. Together with the message error analysis, we have an audit trail from bits to fusion performance.
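The log(N!) saving is easy to tabulate; in the sketch below, a fixed bits-per-sample cost stands in for the R·H(p) rate on the slide (an assumption for illustration).

```python
import math

def particle_set_bits(N, bits_per_sample):
    """Cost of sending N iid samples as an ordered sequence vs. as a set.

    A set is invariant to reordering, so transmitting in a deterministic
    (e.g. sorted) order removes the log2(N!) bits that encoded the order.
    """
    seq_bits = N * bits_per_sample
    saving = math.lgamma(N + 1) / math.log(2)   # log2(N!)
    return seq_bits, seq_bits - saving

for N in (10, 100, 1000):
    seq, as_set = particle_set_bits(N, bits_per_sample=32)
    print(f"N={N:4d}: sequence {seq:9.0f} bits, set {as_set:9.0f} bits")
```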
Trading off error vs. communications
Examine a simple hierarchical approximation (many other approximations are possible…): KD-trees.
• The tree structure successively divides the point set, typically along some cardinal dimension
• Cache statistics of the subsets for fast computation (example: cache means and covariances)
• The tree can also be used for approximation: any cut through the tree is a density estimate, and it is easy to optimize over possible cuts, trading communications cost against an upper bound on the error (KL, max-log, etc.)
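A minimal statistics-caching KD-tree in this spirit; the interface and leaf size are illustrative, and the cut optimization and error bounds are not shown.

```python
import numpy as np

class KDNode:
    """KD-tree node caching the statistics of its point subset.

    Each node stores the count, mean, and covariance of its points; any
    cut through the tree then yields a mixture-of-Gaussians density
    estimate whose size (and hence transmission cost) depends on the cut.
    """
    def __init__(self, pts, depth=0, leaf_size=16):
        self.n = len(pts)
        self.mean = pts.mean(axis=0)
        self.cov = (np.cov(pts.T) if self.n > 1
                    else np.zeros((pts.shape[1], pts.shape[1])))
        self.left = self.right = None
        if self.n > leaf_size:
            dim = depth % pts.shape[1]          # split along a cardinal dimension
            order = np.argsort(pts[:, dim])
            half = self.n // 2
            self.left = KDNode(pts[order[:half]], depth + 1, leaf_size)
            self.right = KDNode(pts[order[half:]], depth + 1, leaf_size)

rng = np.random.default_rng(0)
root = KDNode(rng.standard_normal((256, 2)))
print(root.n, root.left.n, root.right.n)        # 256 128 128
```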
Examples – Sensor localization
Many inter-related aspects, tested on a small (10-node) graph…
• Message schedule: an outward “tree-like” pass, then the typical “parallel” schedule
• Number of iterations (messages): typically very few (1–3) are required; this could be replaced by a message stopping criterion
• Message approximation / bit budget: most messages are (eventually) “simple” (unimodal, near-Gaussian); early messages and poorly localized sensors may require more bits / components…
Examples – Particle filtering
Simple single-object tracking: diffusive dynamics (random walk) with nonlinear (distance-based) observations. Transmit the posterior at each time step, with a fixed bit budget per transmission. Compare simple subsampling vs. the KD-tree approximation.
[Figure: KL divergence vs. time / # transmissions, comparing exact communication, KD-tree (200 and 1000 bits/msg), and subsampling (200 and 1000 bits/msg)]
Contributions - I
Sensor Localization:
• Best-known well-posedness methodology
• Easy incorporation of outliers
• Distributed implementation
• Explicit computation of (highly non-Gaussian) uncertainty
Nonparametric Belief Propagation:
• Finding broad applications, including elsewhere in sensor networks, e.g., multitarget tracking (later)
Contributions - II
Message Error Analysis for BP:
• Best-known convergence conditions
• A methodology for quantifying the effects of message errors
• Provides a basis for investigating important questions: this presentation, the effects of “message censoring” (later), and the communications cost of “sensor handoff” (later)
Contributions - III
Efficient Communication of Particle-Based Messages:
• Near-optimal coding using KD-trees
• A multiresolution framework for trading off communications cost against the impact of message errors
The way forward
• Extension of the error analysis to other message-passing formalisms: other high-performance algorithms we have developed (and that are on the horizon), e.g., TRP, ET, RCM, …
• Extension to include the costs of protocol bits, allowing closer-to-optimal message exploitation
• Extension to assess the value of additional memory at sensor nodes
• Extension to “team”-oriented criteria: some large message errors are OK if the information carried by additional bits is of little incremental value to the receiver