Analyzing wireless sensor network data under suppression
and failure in transmission
Alan E. Gelfand
Institute of Statistics and Decision Sciences, Duke University
(with G. Puggioni, J. Yang, A. Silberstein, K. Munagala)
Before we begin…
• From the perspective of a stochastic modeler
• In fact, a hierarchical modeler working in the Bayesian inference paradigm
• Neophyte with regard to sensor networks
• However, not much attention in the statistics community to issues involved in studying these networks
• No optimization!
Outline
• Our niche in the sensor networks world
• Global, local, modeling, computation, and analysis
• Local data collection
• Suppression or transmission; failure; redundancy
• Stochastic implications suggest focus on probabilistic modeling rather than on algorithms
• Fully model based approach implies full and exact inference with uncertainty
• Computational challenges (model fitting)
• Measuring information loss
• An example, some “experiments”
• Future work; getting closer to what we really want
Data Collection
• At the node; multiple sensors per node; local calibration using field data collection
• Collection at high temporal resolution (scales?)
• Cost of collection; periods of no collection
• Collection is cheap, transmission is expensive in terms of battery life
• Multivariate data collection
Network Communication
• Here a very simple version – “spokes on a wheel”; single-hop; nodes to gateway (and back); no node to node communication
Model Building Plan
• Single node, suppression only, failure only, both
• Two nodes, suppression only, failure only, both
• Network of nodes, spatial modeling, suppression only, failure only, both
Suppression
• Temporal suppression only here
• The basic idea: at high temporal resolution, at a given node, data expected to change little from time point to time point
• Transmission is expensive relative to collection so only transmit given a “consequential” change
• Suppression schemes? Based upon comparison with the previous observation? With the previously transmitted observation? With a “predicted” value at that time and location?
Suppression cont.
• For location s at time t, Y(s,t) is the collected value
• For continuous data and a specified ε, consider, say, |Y(s,t) - Y(s,t_prevtrans)| > ε or |Y(s,t) - Y_est(s,t)| > ε. NOT |Y(s,t) - Y(s,t-1)| > ε.
• Choice of ε? Anticipate a high rate of suppression. Much more “missing data” than in usual statistical analysis settings
• Again, no cross-node communication, so here suppression cannot be based on neighboring values (spatial suppression)
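The temporal suppression rule above (compare with the previously transmitted value, not the previous observation) can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name `suppress`, the variable `eps`, and the convention that the first reading is always sent are assumptions.

```python
def suppress(readings, eps):
    """Return the (t, value) pairs a node would transmit.

    Assumed rule: a reading is sent only if it differs from the most
    recently transmitted value by more than eps. The first reading is
    always sent (an assumption made for this sketch).
    """
    transmitted = []
    last_sent = None
    for t, y in enumerate(readings):
        if last_sent is None or abs(y - last_sent) > eps:
            transmitted.append((t, y))
            last_sent = y  # later readings are compared with this value
    return transmitted


readings = [10.0, 10.1, 10.2, 10.6, 10.7, 10.0]
sent = suppress(readings, eps=0.5)
```

Comparing with the last transmitted value means the gateway knows that every suppressed reading lies within `eps` of a value it holds; comparing with the previous observation would let a slow drift go unreported indefinitely.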
Transmission Failure
• Practical issue, what is a failure - bit errors, corrupted transmission
• Rate varies spatially, varies seasonally
• Will not be known - so models for failure
• Disentangling failure from suppression?
• Redundancy or error-correcting schemes - when transmitting, transmit both a value and the time, or do this for several previous transmission times (how many?)
• Another idea is to include acknowledgement from gateway; no acknowledgement implies retransmission.
• Suppression or observation after failure results from comparison with Y(s,t_failure).
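One way to picture the redundancy idea is a packet that carries the current (time, value) pair together with the previous k transmitted pairs, so that a later successful packet can recover an earlier failed one. The sketch below assumes this packet layout and takes the failure pattern as given; the name `deliver` and the `lost` flags are hypothetical.

```python
def deliver(sent, lost, k):
    """Return {time: value} as recovered at the gateway.

    sent: list of (t, value) pairs the node tried to transmit, in order.
    lost: list of booleans, True where that transmission failed.
    k:    number of previous transmitted pairs repeated in each packet
          (the redundancy depth; an assumed design for this sketch).
    """
    received = {}
    for i, was_lost in enumerate(lost):
        if was_lost:
            continue  # the whole packet is gone, including its history
        packet = sent[max(0, i - k): i + 1]  # current pair plus up to k earlier ones
        for t, y in packet:
            received[t] = y
    return received


sent = [(0, 1.0), (3, 1.6), (5, 1.0), (9, 2.0)]
lost = [False, True, False, True]
with_redundancy = deliver(sent, lost, k=1)
without_redundancy = deliver(sent, lost, k=0)
```

With k=1 the failed transmission at time 3 is recovered from the packet sent at time 5; the final failure at time 9 stays unresolved until a later packet arrives.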
Modelling
• Envision an overall process model which is a spatially dependent time series.
• Observed data is a noisy version of this
• In fact, we envision the familiar specification, [data | process, parameters] × [process | parameters] × [parameters], with dynamics at the second stage
• Dynamics can be driven by local autoregressive models (with drift), by local discretized continuous time models, by local differential equations
• They are connected up in space by spatially colored noise at the second stage and, more generally, by spatially associated model parameters
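As one concrete instance of the three-stage specification above, a local autoregressive model with drift and spatially colored noise might be written as follows. The symbols X (latent process), e (measurement error), d (drift), η (process noise), ρ, τ², σ², and the correlation function C with parameter φ are assumptions introduced for illustration; the slides do not fix this notation.

```latex
\begin{align*}
\text{[data} \mid \text{process, parameters]:}\quad
  & Y(s,t) = X(s,t) + e(s,t), \qquad e(s,t) \sim N(0, \tau^2) \\
\text{[process} \mid \text{parameters]:}\quad
  & X(s,t) = \rho(s)\, X(s,t-1) + d(s,t) + \eta(s,t), \\
  & \eta(\cdot,t) \sim \mathrm{GP}\bigl(0,\ \sigma^2 C(\cdot,\cdot;\phi)\bigr)
    \ \text{independently over } t \\
\text{[parameters]:}\quad
  & \text{priors on } \rho(\cdot),\ \tau^2,\ \sigma^2,\ \phi
\end{align*}
```

Here the spatial dependence enters through the correlated noise η and, if ρ(s) is itself given a spatial prior, through spatially associated parameters.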
Inference
• Global and local parameters
• Which model parameters vary spatially?
• Temporally evolving parameters reflecting seasonality
• Interest in reconstruction of the local time series (but not interested in piecewise interpolation schemes; want full model and full inference under the model)
• Again, full inference in terms of posterior distributions
• Global model fitting: offline activity at server, what temporal scale?
• With regard to local computation, communication of parameter estimates to nodes for local suppression?
Model fitting
• Offline computation
• Bayesian hierarchical spatio-temporal model
• Fitted using Gibbs sampling
• Currently, no local modeling; just comparison with previous transmission (failure or not)
Some details
Dynamic model version
An Example
• An AR(1) model
• Known drift (as in, say, precipitation input for a soil moisture model)
• Drift measured at the gateway but assumed applicable to all nodes
• Only parameters are autoregressive coefficient and process variance
• Suppression rate ε known, failure rate not modeled
• Experiments - using, not using (i) suppression-failure information; (ii) redundancy
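To get a feel for this setup, the AR(1) model with known drift can be simulated and the share of suppressed readings counted for a given threshold. This is a sketch under assumptions: Gaussian innovations, no measurement error, suppression against the last transmitted value, and illustrative parameter values that do not come from the slides.

```python
import random


def simulate_ar1(rho, sigma, drift, y0=0.0, seed=0):
    """Simulate Y_t = rho * Y_{t-1} + drift_t + noise_t, noise_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    y = [y0]
    for d in drift:
        y.append(rho * y[-1] + d + rng.gauss(0.0, sigma))
    return y


def suppressed_fraction(y, eps):
    """Fraction of readings not sent when comparing with the last sent value."""
    last_sent = y[0]  # the first reading is assumed to be transmitted
    n_suppressed = 0
    for value in y[1:]:
        if abs(value - last_sent) > eps:
            last_sent = value
        else:
            n_suppressed += 1
    return n_suppressed / len(y)


# Illustrative values only: 500 steps with a constant known drift.
series = simulate_ar1(rho=0.9, sigma=0.1, drift=[0.05] * 500)
frac = suppressed_fraction(series, eps=0.5)
```

Varying `eps` in such a simulation shows how quickly the suppressed share grows with the threshold.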
Single missing value, known endpoints, no other information
Single missing value, known endpoints, missing value is a known failure
Single missing value, known endpoints, missing value is a known suppression
String of five missing values, known endpoints, no other information on missing values
String of five missing values, known endpoints, all missing values known to be suppressions
Joint density, adjacent missing values, no other information
Joint density, adjacent missing values, missing values known to be suppressions
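For the simplest case listed above (a single missing value with known endpoints), the conditional distribution has a closed form if the AR(1) process is observed without measurement error and the innovations are Gaussian; both are assumptions of this sketch. With Y_{t-1} = a and Y_{t+1} = b, the missing Y_t is normal with mean (ρa + d_t + ρ(b - d_{t+1})) / (1 + ρ²) and variance σ² / (1 + ρ²). If the value is known to be a suppression and a was the last transmitted value, that normal is truncated to [a - ε, a + ε]. The function names below are hypothetical.

```python
import math


def cond_missing(a, b, rho, sigma, d_t=0.0, d_next=0.0):
    """Mean and variance of Y_t given Y_{t-1}=a and Y_{t+1}=b under AR(1) with drift."""
    var = sigma**2 / (1.0 + rho**2)
    mean = ((rho * a + d_t) + rho * (b - d_next)) / (1.0 + rho**2)
    return mean, var


def truncated_mean(mean, var, lo, hi):
    """Mean of a N(mean, var) distribution truncated to the interval [lo, hi]."""
    sd = math.sqrt(var)
    alpha, beta = (lo - mean) / sd, (hi - mean) / sd

    def pdf(z):
        return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

    def cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    return mean + sd * (pdf(alpha) - pdf(beta)) / (cdf(beta) - cdf(alpha))


a, b, eps = 0.0, 2.0, 0.5
mean, var = cond_missing(a, b, rho=1.0, sigma=1.0)
# Known suppression: the missing value must lie within eps of a.
supp_mean = truncated_mean(mean, var, a - eps, a + eps)
```

A known failure would instead restrict the value to the complement, |Y_t - a| > ε, since a transmission was attempted; without either piece of information the untruncated normal applies.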
Comments
• Anticipate high rate of suppression
• Failure should not “dominate” suppression or else we should not suppress
• Failure rate model – reflecting space and time
• We have not viewed lowering failure rate as an option
Information loss
• For process parameters:
  - Kullback-Leibler distance between full data posterior and “partial” data posterior
  - Kullback-Leibler distance comparing different ε’s
  - Length of fixed coverage credible interval
  - Coverage probability of a symmetric (about the point estimate) fixed length interval
• For sequence reconstruction: A predictive mean square error criterion
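A rough way to compute the Kullback-Leibler criterion from posterior draws is to approximate each marginal posterior by a normal distribution and use the closed-form divergence between two normals. The normal approximation is an assumption of this sketch (the slides do not say how the distance is evaluated), and the function names are hypothetical.

```python
import math
import statistics


def kl_normal(m1, s1, m2, s2):
    """KL( N(m1, s1^2) || N(m2, s2^2) ) for univariate normals."""
    return math.log(s2 / s1) + (s1**2 + (m1 - m2) ** 2) / (2.0 * s2**2) - 0.5


def kl_from_draws(full_draws, partial_draws):
    """KL from full-data to partial-data posterior, each approximated as normal."""
    m1, s1 = statistics.fmean(full_draws), statistics.stdev(full_draws)
    m2, s2 = statistics.fmean(partial_draws), statistics.stdev(partial_draws)
    return kl_normal(m1, s1, m2, s2)
```

For example, `kl_normal(0.0, 1.0, 1.0, 1.0)` gives 0.5: a unit shift in the posterior mean at equal spread. The same function can compare partial-data posteriors obtained under different thresholds.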
Cont.
• Priority on process parameter inference or on sequence reconstruction
• Cost vs. information loss trade-off
• Utility function with cost linear in transmission
• No “off-line” cost associated with computation, e.g., using or ignoring suppression/failure information
Future Work
• Parameters changing over time
• Node-to-node communication
• Multi-hop transmission
• Multivariate local data collection
• Local, non-network data collection for calibration, fusion
• Good approximations for handling high suppression rate and high failure rate settings
• All moving toward modeling for an environmental observation network