14
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian Estimation of a Diffusive Source in Wireless Sensor Networks Tong Zhao, Student Member, IEEE, and Arye Nehorai, Fellow, IEEE Abstract—We develop an efficient distributed sequential Bayesian estimation method for applications relating to diffusive sources—localizing a diffusive source, determining its space-time concentration distribution, and predicting its cloud envelope evolution using wireless sensor networks. Potential applications include security, environmental and industrial monitoring, as well as pollution control. We first derive the physical model of the substance dispersion by solving the diffusion equations under different environment scenarios and then integrate the physical model into the distributed processing technologies. We propose a distributed sequential Bayesian estimation method in which the state belief is transmitted in the wireless sensor networks and updated using the measurements from the new sensor node. We propose two belief representation methods: a Gaussian density approximation and a new LPG function (linear combination of polynomial Gaussian density functions) approximation. These approximations are suitable for the distributed processing in wireless sensor networks and are applicable to different sensor network situations. We implement the idea of information-driven sensor collaboration and select the next sensor node according to certain criterions, which provides an optimal subset and an optimal order of incorporating the measurements into our belief update, reduces response time, and saves energy consumption of the sensor network. Numerical examples demonstrate the effectiveness and efficiency of the proposed methods. Index Terms—Diffusive source, distributed estimation, sensor node scheduling, sequential Bayesian method, wireless sensor networks. I. INTRODUCTION R ECENTLY, wireless sensor networks have been the ob- ject of intensive interest in the research community [1], [2]. Inexpensive, smart nodes with multiple on-board sensors, networked through wireless lines as well as the Internet and deployed in large numbers, interact intelligently with the phys- ical world. These systems are aimed at various applications: environmental, medical, food-safety, and habitant monitoring; assessing the “health” of machines, aerospace vehicles, and civil engineering structures; energy management; inventory control; home and building automation; homeland security; and military initiatives. One important set of applications involves monitoring diffusion phenomena. In this paper, we Manuscript received November 24, 2005; revised June 7, 2006. This work was supported by the Department of Defense under the Air Force Office of Sci- entific Research MURI Grant FA9550-05-1-0443, AFOSR Grant FA9550-05-1- 0018, and by the National Science Foundation Grant CCR-0330342. The asso- ciate editor coordinating the review of this paper and approving it for publication was Dr. Lang Tong. The authors are with the Department of Electrical and Systems Engineering, Washington University, Saint Louis, MO 63130 USA (e-mail: tzhao3@ese. wustl.edu; [email protected]). Digital Object Identifier 10.1109/TSP.2006.889975 address the issue of developing efficient distributed parameter estimation methods and applying them to diffusive source monitoring, i.e., localizing a diffusive source, determining its space-time concentration distribution, and predicting its cloud envelope evolution, using wireless sensor networks. Examples of diffusive sources include biological agents, toxic chemicals, explosives, hazardous materials, pollutants, drugs, and temper- ature fields. In our previous paper, we proposed model-based integrated biochemical sensor array processing methods for detecting and estimating dispersion in realistic environments, in which the un- derlying physical and statistical models were coupled with in- formation processing to obtain the best performance [3]–[7]. However, these methods use centralized processing approaches, i.e., they assume all measured data from the sensors are trans- mitted to a fusion center for the processing; they cannot be di- rectly used in our current problem due to the strict power and resource constraints of the wireless sensor networks. In a typical wireless sensor network, each sensor node operates unattended with limited battery power and limited signal-processing capa- bility; each sensor node communicates wirelessly with other nodes in its radio communication range. These wireless commu- nications consume a significant part of the available energy, and, hence, dominate the life of a wireless sensor network. Therefore, in this paper the essential point is that we propose a distributed parameter estimation method based on a sequential Bayesian approach, which is suitable for wireless sensor networks, and apply it to the estimation of diffusive sources. In this method, we apply information-driven dynamic collaborative information processing between sensor nodes [8], [9]; hence, we decrease the total wireless communications while maintaining the pro- cessing performance above a required threshold. The proposed method works efficiently with low computation complexity and fits the limited processing capability at each node. We first derive the physical models for the spatial and temporal concentration distribution of the dispersed substance from a diffusion source. We obtain these models by solving the diffusion equations under different initial conditions and environmental scenarios. Our methods can be applied to dif- ferent source types, such as instantaneous sources, continuous sources, or distributed sources; they can also be applied to different environments, such as infinite medium, semiinfinite medium, multilayer medium, a cylinder, or a sphere. We also consider the effects of external forces, e.g., wind, temperature, and air turbulence. We then transform the physical models to parametric statis- tical measurement models and develop a distributed parameter estimation method based on the sequential Bayesian approach. The main idea is that the posterior density function, also known 1053-587X/$25.00 © 2007 IEEE

Electrical & Systems Engineering at WashU - IEEE ...nehorai/paper/04133054.pdfIEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Electrical & Systems Engineering at WashU - IEEE ...nehorai/paper/04133054.pdfIEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian

IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511

Distributed Sequential Bayesian Estimation of aDiffusive Source in Wireless Sensor Networks

Tong Zhao, Student Member, IEEE, and Arye Nehorai, Fellow, IEEE

Abstract—We develop an efficient distributed sequentialBayesian estimation method for applications relating to diffusivesources—localizing a diffusive source, determining its space-timeconcentration distribution, and predicting its cloud envelopeevolution using wireless sensor networks. Potential applicationsinclude security, environmental and industrial monitoring, aswell as pollution control. We first derive the physical model ofthe substance dispersion by solving the diffusion equations underdifferent environment scenarios and then integrate the physicalmodel into the distributed processing technologies. We propose adistributed sequential Bayesian estimation method in which thestate belief is transmitted in the wireless sensor networks andupdated using the measurements from the new sensor node. Wepropose two belief representation methods: a Gaussian densityapproximation and a new LPG function (linear combination ofpolynomial Gaussian density functions) approximation. Theseapproximations are suitable for the distributed processing inwireless sensor networks and are applicable to different sensornetwork situations. We implement the idea of information-drivensensor collaboration and select the next sensor node accordingto certain criterions, which provides an optimal subset and anoptimal order of incorporating the measurements into our beliefupdate, reduces response time, and saves energy consumptionof the sensor network. Numerical examples demonstrate theeffectiveness and efficiency of the proposed methods.

Index Terms—Diffusive source, distributed estimation, sensornode scheduling, sequential Bayesian method, wireless sensornetworks.

I. INTRODUCTION

RECENTLY, wireless sensor networks have been the ob-ject of intensive interest in the research community [1],

[2]. Inexpensive, smart nodes with multiple on-board sensors,networked through wireless lines as well as the Internet anddeployed in large numbers, interact intelligently with the phys-ical world. These systems are aimed at various applications:environmental, medical, food-safety, and habitant monitoring;assessing the “health” of machines, aerospace vehicles, andcivil engineering structures; energy management; inventorycontrol; home and building automation; homeland security;and military initiatives. One important set of applicationsinvolves monitoring diffusion phenomena. In this paper, we

Manuscript received November 24, 2005; revised June 7, 2006. This workwas supported by the Department of Defense under the Air Force Office of Sci-entific Research MURI Grant FA9550-05-1-0443, AFOSR Grant FA9550-05-1-0018, and by the National Science Foundation Grant CCR-0330342. The asso-ciate editor coordinating the review of this paper and approving it for publicationwas Dr. Lang Tong.

The authors are with the Department of Electrical and Systems Engineering,Washington University, Saint Louis, MO 63130 USA (e-mail: [email protected]; [email protected]).

Digital Object Identifier 10.1109/TSP.2006.889975

address the issue of developing efficient distributed parameterestimation methods and applying them to diffusive sourcemonitoring, i.e., localizing a diffusive source, determining itsspace-time concentration distribution, and predicting its cloudenvelope evolution, using wireless sensor networks. Examplesof diffusive sources include biological agents, toxic chemicals,explosives, hazardous materials, pollutants, drugs, and temper-ature fields.

In our previous paper, we proposed model-based integratedbiochemical sensor array processing methods for detecting andestimating dispersion in realistic environments, in which the un-derlying physical and statistical models were coupled with in-formation processing to obtain the best performance [3]–[7].However, these methods use centralized processing approaches,i.e., they assume all measured data from the sensors are trans-mitted to a fusion center for the processing; they cannot be di-rectly used in our current problem due to the strict power andresource constraints of the wireless sensor networks. In a typicalwireless sensor network, each sensor node operates unattendedwith limited battery power and limited signal-processing capa-bility; each sensor node communicates wirelessly with othernodes in its radio communication range. These wireless commu-nications consume a significant part of the available energy, and,hence, dominate the life of a wireless sensor network. Therefore,in this paper the essential point is that we propose a distributedparameter estimation method based on a sequential Bayesianapproach, which is suitable for wireless sensor networks, andapply it to the estimation of diffusive sources. In this method,we apply information-driven dynamic collaborative informationprocessing between sensor nodes [8], [9]; hence, we decreasethe total wireless communications while maintaining the pro-cessing performance above a required threshold. The proposedmethod works efficiently with low computation complexity andfits the limited processing capability at each node.

We first derive the physical models for the spatial andtemporal concentration distribution of the dispersed substancefrom a diffusion source. We obtain these models by solvingthe diffusion equations under different initial conditions andenvironmental scenarios. Our methods can be applied to dif-ferent source types, such as instantaneous sources, continuoussources, or distributed sources; they can also be applied todifferent environments, such as infinite medium, semiinfinitemedium, multilayer medium, a cylinder, or a sphere. We alsoconsider the effects of external forces, e.g., wind, temperature,and air turbulence.

We then transform the physical models to parametric statis-tical measurement models and develop a distributed parameterestimation method based on the sequential Bayesian approach.The main idea is that the posterior density function, also known

1053-587X/$25.00 © 2007 IEEE

Page 2: Electrical & Systems Engineering at WashU - IEEE ...nehorai/paper/04133054.pdfIEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian

1512 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007

as the belief, of the parameters of interest is updated incremen-tally when new measurements are obtained, until a desired per-formance threshold is satisfied. The estimates of the parametersare calculated from the obtained belief according to certain cri-teria. Here we assume that the related parameters are stationary.In contrast with the ordinary sequential Bayesian methods inwhich the belief is updated sequentially in the time domain [10],in our method the belief is transmitted in the sensor network andupdated incrementally in the space domain. Since the informa-tion is transmitted only between sensor nodes and their neigh-bors, a fusion center is not needed; hence, we realize a fullydistributed estimation.

The derived dispersion models yield statistical measurementmodels that are highly nonlinear and non-Gaussian, especiallywhen nuisance parameters appear. Therefore, existent recur-sive Bayesian estimation methods, such as the decentralizedKalman filters or Extended Kalman filters [11], [12] cannot bedirectly applied. Some possible alternative methods, such asgrid-based methods and sequential Monte Carlo (or particle fil-tering) methods [13], [14], are also inapplicable because theyrequire large data transmissions between sensor nodes. Eventhough some distributed particle filters are proposed [15], [16],they still have limitations, e.g., the effectiveness of these algo-rithms depends on some strict assumptions, and the computa-tion complexity of these algorithms are very high [17]. Sincedecreasing power consumption is always an important issue inwireless sensor networks, in our method we propose to representand transmit the belief parametrically, i.e., we approximate thebelief by a family of parameterizable probability distributions;hence, we need to transmit only a limited number of parame-ters between sensor nodes, which decreases the communicationrequirements dramatically. In this paper, we propose two para-metric belief representation methods: Gaussian density approx-imation and a new LPG function approximation. The distribu-tion family of LPG functions is a natural generalization of theGaussian density; hence, it provides a more accurate approxi-mation that leads to fewer data transmission hops while main-taining the desired estimation performance. LPG function ap-proximation approaches have the potential advantage to be usedin other nonlinear, non-Gaussian recursive Bayesian estimationproblems. Note that even though we focus on the estimation ofdiffusive sources, the proposed distributed processing methods,including belief approximation approaches and node selectionschemes also provide a general framework for other applicationsin wireless sensor networks.

In Section II, we derive the physical and statistical mea-surement models of the diffused substance distribution. InSection III, we present the proposed distributed sequentialBayesian estimation method, including the distributed estima-tion algorithms, belief representation methods, and the sensorselection strategies. In Section IV, numerical examples areused to demonstrate the performance of the proposed signalprocessing methods. Conclusions and directions for futurework are given in Section V.

II. PHYSICAL AND STATISTICAL MEASUREMENT MODELS

In this section, we first derive computational physical modelsdescribing the space-time substance dispersion mechanisms

under various environmental scenarios; then we transform theobtained dispersion model to a statistical measurement model.The following distributed parameter estimation methods aredeveloped based on this statistical model.

A. Physical Models for Dispersion

We model the transport of a substance from a diffusive sourceby solving the diffusion equation, which describes the disper-sion of particles from a region of high concentration to regionsof lower concentration due to random motion. Various initialand boundary conditions and environmental effects are consid-ered according to different scenarios. We first compute the con-centration for a stationary impulse point source to find the Greenfunction. In this step, we consider different types of boundaryconditions (permeable and impermeable boundaries) and envi-ronmental effects (external forces and turbulence). Then we ex-tend the results to a continuous source by integrating the sourcerelease rate with the Green function.

1) Stationary Impulse Source: Let denote the concen-tration of the diffused substance at a position andtime . For a source-free volume and space-invariant diffusivity

, when we ignore the effects of external forces, the concentra-tion of a dispersed substance follows the diffusion equation [18]

(1)

The appropriate boundary conditions and initial conditions areapplied to solve this differential equation. For the simplificationof presentation, we consider only the diffusion equation in the

dimension, i.e.,

(2)

This equation can be easily extended to the three-dimensioncases.

Initial conditions: In order to calculate the Green function,we assume there is an impulse point source of a unit mass releaserate stationary at , i.e., the initial condition is

(3)

where is the initial release time.Boundary conditions: For the boundary conditions, we use

a homogeneous semiinfinite medium as an example. The deriva-tion process is also applicable to other environments straight-forwardly. A homogeneous semiinfinite medium in the region

can be used to model the diffusion in air above the ground.For a generally permeable boundary condition, a reasonable as-sumption is that the rate of loss of the diffusing substance is pro-portional to the actual concentration in the surface at any time[18], i.e., the boundary condition at the surface is

(4)

where is a constant of proportionality.

Page 3: Electrical & Systems Engineering at WashU - IEEE ...nehorai/paper/04133054.pdfIEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian

ZHAO AND NEHORAI: DISTRIBUTED SEQUENTIAL BAYESIAN ESTIMATION 1513

Solutions: Solving the diffusion equation (2) under theabove boundary and initial conditions, we obtain

(5)

where “erfc” is the error-function complement. The first termon the right-hand side (RHS) of (5) is the solution for an imper-meable boundary, which can also be interpreted as a superposi-tion of contributions from the actual source and a mirror-imagesource of equal magnitude at ; the second term repre-sents the loss effect of the diffusing substance.

External forces: In the presence of external forces such aswind, flow, and gravity, the diffusion equation (2) takes the form

(6)

where can represent the wind speed in the direction [19].The second term on the RHS is called advection. We can reducethe above problem to the ordinary diffusion equation without theadvection term by applying the following transformation [19]

(7)

to the differential equation (6). Hence, we can apply the sameprocedures as aforementioned to solve this type of problem(please refer to [7] for more discussion about other environ-mental effects).

2) Continuous Sources: A solution for a continuous source,from which the diffusion substance is liberated continuously ata certain rate, is deduced from the corresponding Green functionby integrating it with the source release rate function.

Point source: Suppose that we have a stationary pointsource continuously releasing a substance at a mass rate .Let denote the Green’s function of the impulsesource case. Then the concentration of a continuous pointsource is obtained using the following integral:

(8)

Using the three-dimensional (3-D) solution in a homogeneoussemi-infinite medium under the impermeable boundary condi-tion as an example and supposing , we obtain theconcentration distribution

(9)

where is the source location; and.

Distributed source: These results can be extended also toa distributed source. Let denote the distributed mass re-lease rate, and let denote the area of the distributed source;then

(10)

B. Measurement Models

To model the measurements, we suppose a spatially dis-tributed wireless sensor network has been deployed. Eachsensor node in this network is located at a known positionand can measure the substance concentration from a diffusionsource. Assuming the physical models derived above representthe underlying dispersion mechanism, we obtain a measurementmodel for a sensor at a position and taking measurementsat time as

(11)where is the concentration of interests; is a bias term,representing the sensor’s response to foreign substances and as-sumed to be a unknown constant; and is the sensor’snoise, assumed to be Gaussian distributed, independent in timeand space (see [3] for a further discussion of this model). As-suming for simplicity that the source’s substance-releasing rateis time invariant, and denoting , ,and , we rewrite (11) as

(12)

where and . Here rep-resents the source and medium parameters. Assuming astationary source in a semiinfinite medium with impermeableboundary whose concentration distribution is as in (9), we have

.1) Multiple Sources: If there exist multiple sources, the con-

centration of interests can be written as

where is the number of sources, and includes theparameters related to the -th source. In this case, theform of the measurement model for is the same as (12),however, here and

.2) Nuisance Parameters: In the above parametric statistical

measurement model, the unknown parameters include , , and. However, in practice only parts of them are of interest; others

are nuisance parameters. For example, under the scenario ofa stationary source in a semiinfinite medium with imperme-able boundary, if we want to estimate the substance dispersion,

are the parameters of interest; whereas if wejust want to localize the diffusion source, only are pa-rameters of interest. For these parameters, some of them can be

Page 4: Electrical & Systems Engineering at WashU - IEEE ...nehorai/paper/04133054.pdfIEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian

1514 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007

measured or estimated through other methods, thus are removedfrom the unknown parameter list. For example, the diffusivity ,bias term , and noise variance can be measured during thecalibration, which is processed before the estimation phase.

For the remaining nuisance parameters, e.g., source releaserate , we can remove them by integration. We first assumean a priori probability density function (pdf) for , which isa uniform distribution between and , where and

are the possible lowest and highest values of . For a lo-calization problem, we denote the unknown parameter vectoras . Consequently, according to the measurementmodel in (12), the marginal pdf of given is

(13)

where ; is the standard normal cumulativedistribution function

(14)

III. DISTRIBUTED SEQUENTIAL BAYESIAN ESTIMATION

In this section, we first develop the distributed sequentialBayesian estimation algorithm and apply it to the estimationof a diffusion source using wireless sensor networks. In ouralgorithm, how to approximate the state belief appropriatelyis critical for efficient estimation and decreasing the commu-nication requirement. We then propose two parametric beliefrepresentation methods: Gaussian approximation and LPGfunction approximation. Finally, we discuss the sensor nodeselection strategies.

A. Distributed Sequential Bayesian Estimation Algorithm

1) Estimation Scheme: The proposed scheme of estimating adiffusion source using wireless sensor networks has two phases:the measuring phase and the estimation phase. In the measuringphase, when the alarm sensor detects a possible release from adiffusion source, it then activates the sensor nodes around thepotential target. The activated sensor nodes take the measure-ments of the substance concentration at specific time samples

, , and then return to sleeping status. The de-termination of the number of samples depends on how quicklythe task requires the sensor network to respond. Here we assumethe sensor network is synchronized such that the sampling times

are known at each sensor node. In the estimation phase, weprocess a distributed sequential (or incremental) Bayesian algo-rithm, activated by an initial sensor node, to estimate the diffu-sion source, e.g., to localize the source position. We will find thatthe estimation method itself provides a general framework forthe stationary parameter related applications in wireless sensornetworks. It is independent on the sensing models that are used.

List of Notations

• , the sensor node index; is the totalnumber of nodes that are awakened.

• , the time index; is the total number ofmeasurements in each sensor node.

• , the collection of all measure-ments at sensor node .

• , the measurement sequence up tosensor node .

2) Distributed Sequential Bayesian Estimation: In wirelesssensor networks, because of their energy limitation and failuretolerance properties, the usually centralized Bayesian estima-tion methods cannot be directly applied. In the following, wepropose a distributed sequential Bayesian estimation method.“Sequential” here means that the messages are transmitted inthe sensor network, and the state belief and Bayesian estimateare updated spatially according to the previous informationand the measurements at the current sensor node. To derivethe sequential Bayesian estimation [here we use the min-imum-mean-squared error (mmse) estimation], we extendthe basic Bayesian estimation such that it can incrementallycombine measurements over space.

Suppose the current sensor node obtains the belief passedfrom the sensor node as , and the measure-ments at the current sensor node are the . We assume thatthe following conditional independence assumption is satisfied:

A1: Conditioned on , the measurements at the current sensornode are independent of the measurements at the previoussensor node , i.e.,

(15)

Note that in our diffusive source estimation problem, we assumethe noise in the measurement model (12) is independent inspace, which guarantees the assumption A1 is satisfied. Then,we can update the belief to using by applying theBayesian rule as

(16)

At the first sensor node where , we have

(17)

where is the likelihood function of the current observa-tion at the sensor node

(18)

and is the prior pdf for the location which is obtainedfrom the prior information. In practice, if we do not have priorknowledge about we assume the prior pdf is Gaussian witha large variance. Therefore, the current mmse estimate at thesensor node can be calculated as

(19)

Page 5: Electrical & Systems Engineering at WashU - IEEE ...nehorai/paper/04133054.pdfIEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian

ZHAO AND NEHORAI: DISTRIBUTED SEQUENTIAL BAYESIAN ESTIMATION 1515

Fig. 1. Flowchart of the distributed sequential Bayesian estimation algorithm.

and the covariance matrix of the current estimate is

(20)

which is used as the performance measure.3) Algorithm: In (16) we observe that the current belief is a

product of the previous belief and the current likelihood function(up to a normalization coefficient), which provides us with asequential algorithm to update the belief. Hence, we obtain thedistributed sequential Bayesian estimation algorithm shown inFig. 1 and summarized as follows:

The initial sensor node processes the following steps:• Belief initialization: Calculate the initial belief ac-

cording to the prior parameter pdf and its own measure-ments, using (17).

• Sensor node selection: Select a sensor node from itsneighbor according to certain utility information criteria(the details will be discussed in Section III-C).

• Data transmission: Wake up the selected sensor node,transmit the current belief to it, and return to sleepingstatus.

The next sensor nodes perform the following steps.• Data receiving: Be activated by the previous sensor node

and receive the transmitted data.• Belief update: Calculate a new belief according to the re-

ceived previous belief and its own measurements, using(16).

• Belief quality test: Test the quality of the updated be-lief according to some measure such as the trace or de-terminant of the covariance matrix (20). If the belief is“good enough,” it will calculate the mmse estimate ofthe source parameters according to (19), and the estima-tion phase is terminated.

• Sensor node selection: Select a sensor node from itsneighbor according to certain utility information criteria(the details will be discussed in Section III-C).

• Data transmission: Wake up the selected sensor node,transmit the current belief to it, and return to sleepingstatus.

4) Discussion: In this algorithm the messages are transmittedonly between a sensor node and its neighbor; hence, we realize afully distributed estimation. However, power and resource con-straints in wireless sensor networks create some difficulties ofimplementation:

• In the algorithm, we need to transmit the current belief tothe next sensor node. Because of the nonlinear and non-Gaussian characteristics of the measurement model, wecannot obtain an analytical form of the belief density. Anddirectly transmitting the grid-samples of the belief wouldrequire significant power consumption. Therefore, we needto represent the belief in a proper way.

• In the steps of belief update, mmse calculation, and be-lief quality evaluation, high dimensional multiple integra-tion calculation is required. Due to the nonlinearity andnon-Gaussian nature of the measurement model, we canapply only numerical algorithms, which may lead to highcomputation cost and exceed the limited processing capa-bility of the sensor nodes.

B. Belief Approximation

To overcome the difficulties in the derived distributedestimation algorithm, and consider the fact that the powerconstraint is critical in wireless sensor networks, we proposeto approximate the state belief by a family of pa-rameterizable distributions. Therefore, in the belief updatewe will need to transmit only a finite number of parameters,which decreases the communication burden significantly, withconsequent decrease in power consumption and increase insensor network longevity. We propose two parametric dis-tribution approximation methods: a Gaussian approximationand a new approach—a linear combination of Gaussian mul-tiplied by polynomials (LPG) approximation. In both of thesemethods, we consider the accuracy of the approximation as

Page 6: Electrical & Systems Engineering at WashU - IEEE ...nehorai/paper/04133054.pdfIEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian

1516 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007

well as decreasing the computation complexity to fit the limitedprocessing capability of the sensor nodes.

1) Gaussian Approximation:a) Approximation algorithm: For the continuous posterior

distribution function, there exists an asymptotic posterior nor-mality property [20], which is presented as follows.

Asymptotic posterior normality. Consider as the pos-terior probability density function of a random variable ,

, given the observation , i.e., ; define, where reaches its local maximum

at , i.e.,

and . Let (c1), (c2), and (c3)represent the conditions “steepness,” “smoothness,” and “con-centration,” respectively (see [20, p. 289] for the definitions ofthese conditions). Then, given (c1) and (c2), (c3) is a necessaryand sufficient condition for to converge in distribution to astandard multivariate normal random variable , whose proba-bility density function is .

This proposition tells us that for large , the posterior func-tion becomes highly peaked and behaves like the multivariatenormal density kernel .Therefore, in our problem, for the belief that (c1) to(c3) are satisfied, we propose to approximate it by a multi-variate Gaussian distribution with mean and covari-ance , i.e.,

(21)

where

(22a)

(22b)

To calculate the mean and covariance using above equations re-quires multiple integrations which introduces significant com-putation cost. In order to decrease the processing capability de-mand, we apply the Laplace approximation method to calculatethe integration in (22a) and (22b).

b) Laplace’s method: Laplace’s method, a family ofasymptotic methods used to approximate integrals, is presentedas a potential candidate for the toolbox of techniques usedfor knowledge acquisition and probabilistic inference in beliefnetworks with continuous variables [20], [21]. This techniqueapproximates posterior moments and marginal posterior dis-tributions with reasonable accuracy in many cases. Let

be a scalar function of parameters of interest . Consider theexpectation of the posterior of the form

(23)

where is the observation vector. By ap-plying the Laplace method, we obtain the approximation for

as

(24)where

• the functions and are defined by

(25a)

(25b)

• and are, respectively, the minimizers for and, i.e.

(26a)

(26b)

• and are the inverse of the Hessian of andevaluated at and , respectively

(27)

In this case, the accuracy of the approximation can be shown as

(28)

c) Application to diffusive source localization: To lo-calize a diffusive source, the parameter vector of interest is

. Thus, by using the Gaussian approximation,in order to update the belief, we need only to transmit total 5scalars (2 for mean and 3 for covariance matrix) to the nextsensor node, which decreases the communications requiredbetween sensor nodes dramatically.

In order to apply the above Laplace approximation to our lo-calization problem, we rewrite the mean and covariance of theposterior in (22a) and (22b) as

(29a)

(29b)

Therefore, when we define

(30)

and replace by and by , we candirectly use the above Laplace’s method to approximate the pos-terior mean and variance in (22a) and (22b). Here, we approxi-mate using the expression (omitting ):

(31)

Page 7: Electrical & Systems Engineering at WashU - IEEE ...nehorai/paper/04133054.pdfIEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian

ZHAO AND NEHORAI: DISTRIBUTED SEQUENTIAL BAYESIAN ESTIMATION 1517

Note that the functions in this algorithm are related only to thelocal measurements ; hence, the algorithm can be completelyimplemented locally. No distributed processing is needed.

d) Discussion: This Gaussian approximation method hasthe advantage that its belief representation is very simple thatleads to the substantially low amount of wireless transmissionbetween sensor nodes and low computation complexity ateach sensor node. However, the approximation accuracy is notvery good, especially at the beginning of the process whenthe number of collected measurements is small. This lowaccuracy may cause the process of data transmission betweensensor nodes to be relatively long and lead to a high totalcommunication power consumption. Therefore, the Gaussianapproximation is suitable only for the cases when the beliefdensity has Gaussian-like distribution or the processing ca-pacity at the sensor node is very limited.

2) LPG Function Approximation: We propose to use a pa-rameterizable distribution family of LPG functions to approx-imate the belief. The PG functions represent Gaussian densityfunctions multiplied by polynomials, and LPG denotes a linearcombination of PG functions.

a) LPG functions: PG or LPG functions were first pre-sented by Paldi and Nehorai in [22] to enable exact analyticalsolutions for nonlinear filtering problems by approximating theposterior pdf using LPG functions. In our problem, using LPGfunctions to approximate the belief in wireless sensor networkshas these advantages.

• This approximation family is a natural generalization ofthe above Gaussian approximation in Section III-B-1; thus,it can provide higher approximation accuracy: this familyof LPG functions has the property that it can approximateany pdf with an arbitrarily small error, provided that thepolynomial degree is sufficiently large.

• Any moments for this family can be evaluated analytically:for PG functions the calculation of moments is equivalentto the calculation of the moments of the correspondingGaussian functions, for which the analytical results can bederived.

• The PG and LPG families are closed under multiplicationand integration operations. In some cases, this point en-ables the analytical computation of the posterior densityfunctions.

We also note that another typical nonlinear Bayesian estimationmethod using Gaussian sum [23], [24] is a special case of theimplementation of LPG function approximation.

b) Approximation procedure: The procedure of usingan LPG function to approximate the belief is performed bycalculating the optimal parameters for the LPG function(including the mean and covariance matrix of the Gaussiandensity, the polynomial coefficients, and the linear combinationcoefficients) to minimize the error between the real and theapproximated belief function. However, if we directly calculatethese optimal parameters using some numerical methods, e.g.,a least-squares (LS) method, the resulting computation costis very high due to the large number of unknown parameters.Therefore, we propose an efficient approximation algorithmin which we separate the whole approximation procedure intotwo phases: local approximation and global approximation.

TABLE ITHE MULTIVARIATE HERMITE POLYNOMIALS

That is, we first approximate the belief at several importantpeaks (peaks whose contribution to the density cannot beignored) using PG functions, then we create an optimal linearcombination of these local expansions to represent the globalbelief function.

Local approximations: In local approximations, first welocalize the important peaks (local maximums) of the beliefdensity. We can apply numerical methods, such as the efficientquasi-Newton methods, to find them. Another advantage ofusing the quasi-Newton methods is that they also give anestimate of the Hessian at the peak. We then use an orthogonalfamily of PG functions to approximate the density around eachpeak. For such a family we use the Hermite functions [25],which (up to normalization factors) are given by a Gaussiandensity multiplied by Hermite polynomials.

Denote as the local approximation of the beliefaround -th peak, and apply the orthogonal expansions in termsof multivariate Hermite polynomials to the local approxima-tions, then we have

(32)

where is a nonnegative integer vector,is the dimension of ; is the bounded degree of the poly-

nomials. In (32), the summation is over every combinationof whose summation is less or equal . In (32),

is the multinormal probability density function withmean vector and covariance matrix ; denotesthe multivariate Hermite polynomials defined by

(33)

where [25]. For a localization problem, .We derive the multivariate Hermite polynomials for the bivariatecase up to order four which are given in Table I. Here we denote

and .We observe that the approximation in (32) is represented by

the parameters , , and , which are to be determined. Forthe multivariate Gaussian density , we use the positionof the peak as its mean, denoted by , and let its associated

Page 8: Electrical & Systems Engineering at WashU - IEEE ...nehorai/paper/04133054.pdfIEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian

1518 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007

covariance matrix be equal to the minus of the inverse of theHessian (at the peak) of the logarithm of the belief

(34)

where and denotes the Hessian matrixat the peak . This is a locally best Gaussian approximation inthe sense that its Hessian at the peak is the same as that of thetrue density.

After we obtain the Gaussian function, we can calculate theexpansion coefficients using least squares methods. For effi-cient processing, we first create a mesh of points denselysampled around the peak. The distribution of these samples isdetermined by the estimated Hessian at the peak. Then we finda set of coefficients that minimizes the square of the errorsbetween the precise belief density function and localapproximation at these samples, i.e.,

(35)

Global approximations: After we calculate the local expan-sions for all the peaks, we merge them to obtain a global approx-imation of the belief based on an optimal linear combinations ofthese local expansions. Then the final global approximation is

(36)

where the combination coefficients are determined using LSmethods on a new global mesh of measurement points. Since theform of each local expansion is known, we can apply the linearLS to calculate the coefficients, with low computation cost.

c) Discussion: Validity of approximation: In the local ap-proximation, since we use only the truncated expansion in termsof Hermite polynomials, it is possible that the resulting seriesapproximation is not a valid density function, because somevalues of the approximation may be negative. One way to com-pensate for this disadvantage is to approximate the square root ofthe density function. Then the final approximation is its square(which is still a PG function, since PG functions are closed formultiplication) and is guaranteed to be nonnegative.

Computation complexity: Considering the limited signal pro-cessing capability at the sensor nodes, we should always try toreduce the processing complexity of the above method. An ef-ficient way is to reduce the total number of parameters (whichalso decreases the required communication amount) by keepingthe number of PG functions and the total degrees of their asso-ciated multivariate polynomials as small as possible while stillmaintaining an acceptable global density function approxima-tion accuracy. The possible approaches to solve this probleminclude: i) when meeting a predetermined quality measure oflocal approximation, try to reduce by one the total degree boundof its polynomial and repeat the approximation process until weget the minimal total degree bound that meets the approxima-tion quality measure; ii) if two or more local peaks are clusteredtogether, try to merge them into a single peak (at their weighted

center of gravity—according to their importance) and check ifthe minimal total degree needed for this single peak gives fewerparameters than the total number of parameters needed in theformer approximation (using a PG function of minimal total de-gree for each peak in the cluster); iii) try to ignore in our localapproximation the less important peaks one by one (startingfrom the weakest one) as long as we meet the quality measuretolerance.

Moment calculation: Since the obtained approximation is alinear combination of Gaussian density multiplied by polyno-mials, calculating the mmse estimation (which is the mean of theposterior function) and accuracy measure (which is the covari-ance matrix of the posterior function) is equivalent to the com-putation of the moments of corresponding multinormal densityfunctions. The moments of the multivariate normal can be cal-culated analytically [26] as follows. Supposeis an -dimensional normally distributed random variable withmean and covariance . We consider such a formof moment , where lie in .The need not all be distinct. Then we have

(37)

where the first summation sums over all combina-tions of indices and for which ; the secondsummation sums over all permutations

of , giving distinctterms allowing for the symmetry of .

Comparison with Gaussian approximation: When we com-pare LPG approximation with Gaussian approximation, weobserve that at a single sensor node, the LPG approximationuses more expensive computation to achieve a higher repre-sentation accuracy. In the Gaussian approximation where weapply Laplace’s method to calculate the mean and variance,the most computation power is spent on the two minimizationproblems in (26). Whereas in the LPG approximation, the mostcomputation power is used to localize modes (or importantpeaks) of the belief. The computation cost of each localizationis nearly the same as one minimization problem in the Gaussianapproximation. Hence, it depends on how many modes or howcomplex the belief is to determine how much more computationpower is required for the LPG approximation than the Gaussianapproximation in a single sensor node. However, accordingto the asymptotic posterior normality property, the numberof the modes in the belief, and, hence, also the computationcomplexity of the LPG approximation will decrease alongthe estimation process. We also consider that since the LPGapproximation provides more accurate representation, thecorresponding distributed estimation requires less number ofsensor nodes to achieve an acceptable performance. Therefore,the total required computation power for the LPG approxima-tion is not substantial higher than the Gaussian approximation.Another advantage of the LPG approximation is that since itneeds less number of sensor nodes, it saves the communicationpower and shortens the processing time.

Page 9: Electrical & Systems Engineering at WashU - IEEE ...nehorai/paper/04133054.pdfIEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian

ZHAO AND NEHORAI: DISTRIBUTED SEQUENTIAL BAYESIAN ESTIMATION 1519

C. Optimal Sensor Node Selection

The above distributed sequential Bayesian estimation algo-rithm can be summarized as: given the current belief, we wishto update the belief incrementally by incorporating the mea-surements of other nearby sensor node, and then estimate theparameters of interest based on the obtained belief. However,not all available sensor nodes in the network provide usefulinformation that improves the estimate; furthermore, some in-formation may be redundant. Therefore, it is useful to select anoptimal subset and an optimal order of incorporating these mea-surements into our belief update. Usually this selection processprovides a faster reduction in estimation uncertainty comparedwith blind or nearest-neighbor sensor selection schemes, andincurs a lower communication burden for meeting a given esti-mation performance requirement.

In our distributed estimation method, we propose the fol-lowing four different information utility measures, fitting dif-ferent scenarios, for selecting the next sensor node optimally.They are mutual information measure, posterior Cramér-Raobound, Mahalanobis distance, and covariance-based measure.

1) Mutual Information: To use the mutual information as aninformation utility measure, we select a sensor node which willmaximize the mutual information

(38)

where is the collection of sensors that are in the neighbor-hood of current sensor node , and represents the indexof the th selected node. This sensor node selection criterion isequivalent to selecting the node whose measurement , whenconditioned on the current measurement history , would pro-vide the greatest amount of new information about the target lo-cation .

2) Posterior Cramér–Rao Bounds: For random parameters,as in our distributed sequential Bayesian estimation, a lowerbound that is analogous to the Cramér-Rao bound (CRB) in anonrandom parameter estimation exists and is derived in [27],usually referred to as posterior CRB (PCRB).

Let be a vector of measurements, is an -dimensional esti-mated random parameter, is the joint probability densityof the pair , and let be an estimate of . Then,the PCRB on the mean-square estimation error satisfies

(39)

where is the Fisher information matrix with the elements

(40)

provided that the derivatives and expectations in (39) and (40)exist. The inequality in (39) means that the differenceis a nonnegative definite matrix. From this property we knowthat PCRB is a lower bound on the error covariance matrix;

hence, we can use PCRB as a measure of the information utility,which is equivalent to using the Fisher information matrix as ameasure.

According to the definition of PCRB in (39), in our estimationproblem we will select the next node by maximizing a informa-tion utility measure based on the following Fisher informationmatrix (usually, we will use its trace or determinant as the infor-mation utility function)

(41)

Upon applying the independence assumption (15), we canrewrite the joint pdf as

(42)

Inserting this equation into the Fisher information matrix (41),we obtain the following recursive equation for the sequence ofthe posterior Fisher information matrices

(43)

where

(44)

represents the information obtained from the measurements atthe next sensor node . Therefore, we can obtain our infor-mation utility measure based only on .

We note that both of the above two information utility mea-sures are expectations over all possible measurement values

and, hence, can be computed before is actuallyobserved. Therefore, it is unnecessary for the current sensor torequest the measurements from its neighboring nodes, therebysaving the energy of the sensor networks. However, to obtain theexpectation, we need to compute a multidimensional integra-tion numerically, which leads to high computation complexityat the current sensor node. Hence, these two measures aresuitable for the cases when the dimension of the measurementvector is low or sensor nodes have relatively high processingcapabilities.

3) Mahalanobis Distance: At the current sensor node , theutility function for selecting the next sensor node is definedas the negative of the Mahalanobis distance, i.e.,

(45)

where is the position of the next sensor node; is thecurrent estimate of the source position given by (22a) andis covariance matrix of the current posterior density function

, as shown in (22b). Intuitively, this criterion means that

Page 10: Electrical & Systems Engineering at WashU - IEEE ...nehorai/paper/04133054.pdfIEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian

1520 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007

the closer a sensor lies to the longer axis of the uncertainty ellip-soid, the more information it would provide for the estimation.

This utility function is easy to calculate and works well whenthe current belief can be well approximated by a Gaussian distri-bution and the sensor measurement is a function of the distancebetween sensors and sources only. For a complex environmentalscenario, for example, when there exist many obstacles in theenvironment, the above utility function may not be appropriate.

4) Covariance-Based Measure: In this case we derive utilitymeasures based on the covariance of the posterior distribution

, i.e., the covariance of the mmse estimate when weobtain the measurements from the next sensor , using theformula in (20). We note that to calculate this covariance ma-trix, we need to obtain measurements of the sensor nodes in theneighborhood of the current node. In order to avoid requiring themeasurements from each neighboring node, which leads to greatcommunication cost, we propose predicting the covariance ma-trix according to the current belief and the priorknowledge about the sensor positions and sensing model.

First we apply the sensing model in (11) to predict the mea-surements at the next sensor using the current mmse esti-mate as the source location. The next sensor positions and thesampling times are known and the bias term and noise variancecan be predetermined in a calibration phase. The other unknownparameters, e.g., , are estimated by maximizing the likelihoodfunction at the current sensor node. Then, we insert the predictedmeasurements in (13) to obtain the marginal probability densityof the measurement given . Hence, we can calculate thecovariance matrix using (20). Even though thisprediction is not very accurate, it carries decreased communica-tion requirements, and its computation complexity is very muchlower than the mutual information measure and the PCRB-basedmeasure.

IV. NUMERICAL EXAMPLES

In this section, we use numerical examples to illustrate theperformance of our proposed distributed sequential Bayesian es-timation algorithm. We will study the effects of the maximumcommunication range on estimation accuracy; compare the ef-ficiency of different belief representation methods, namely theGaussian approximation and the LPG function approximation;and consider the influence of the diffusion models as well assensor selection criteria on the performance of the estimationalgorithms.

In these examples, wireless sensor nodes are ran-domly deployed in a square area with dimension 100 100 m.We define a pair of neighboring sensor nodes whose distanceis less than the maximum communication range. For the dif-fusion model, we consider the environment as a homogeneoussemi-infinite medium with a impermeable boundary, which fitsa dispersion in air above the ground. We use a scenario of a sta-tionary continuous point source located at the boundary surface,i.e., . The other parameters , , and are taken to be 1kg/s, 40 , and 0 s, respectively.

A. Distributed Localization With Gaussian Approximation

In this numerical example, we implement the proposed dis-tributed sequential Bayesian estimation method in Section III-A

Fig. 2. Message transmission routing for localizing a diffusion source in asquare area with N = 300 sensor nodes randomly placed. “Star” denotes thesource position; “Square” denotes the initial sensor node; “Dot” denotes the de-ployed sensor nodes; “Circle” denotes the activated sensor nodes.

to localize a dispersion source whose position is .Here we approximate the belief by a Gaussian density function,and we use the Mahalanobis distance as the criterion to selectthe next sensor node. We assume the prior pdf of the source loca-tion is Gaussian, the mean of which is the position of the alarmsensor that first detects a possible release. We also assume forsimplicity that there is no external force in the environment. Thebias term in (11) is and the noise standard de-viation is . We take 10 temporal measure-ments at each sensor with the time sampling interval of 5 sec-onds. In Fig. 2 we show the message transmission route whenapplying our algorithm where the position of the initial sensornode is ; in Fig. 3 and Fig. 4 we describe

the estimation bias and the determinant of the covari-ance matrix with respect to the number of used sensor nodesunder different maximum sensor node communication ranges.

In Fig. 2 the solid line connects the sensor nodes that are se-lected (according to the sensor node selection criterion) to es-timate the source position. The direction of the message trans-mission is shown by the arrows. We observe that our algorithmprocesses well: even though the initial sensor node is locatedfar from the source, the messages still transmit to the area nearthe source and the data measured by the sensors in this area arecollected to make the estimation. According to the dispersionmodel in (9), the data measured by the sensor nodes close thesource have a high signal-to-noise ratio (SNR) and, thus, pro-vide an important benefit in the estimation. Therefore, our dis-tributed estimation algorithm provides a relatively optimal pathto collect the information, which decreases the communicationcost while maintaining the required estimation accuracy.

From Fig. 3 and Fig. 4 we find that the communication range(which is related to the wireless communication power) has animportant effect on the estimation process. As shown in Fig.4, when we set the determinant of the covariance

as a threshold of the estimation performance and whenwe use the communication , thenumber of sensor nodes that are passed is 14, fewer than in the

Page 11: Electrical & Systems Engineering at WashU - IEEE ...nehorai/paper/04133054.pdfIEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian

ZHAO AND NEHORAI: DISTRIBUTED SEQUENTIAL BAYESIAN ESTIMATION 1521

Fig. 3. Estimation bias versus number of sensor nodes used under differentcommunication ranges. (a) Communication range = 10 m. (b) Communica-tion range = 15 m.

Fig. 4. Log determinant of the estimation covariance versus number ofsensor nodes used under different communication ranges: (a) Communicationrange = 10 m. (b) Communication range = 15 m.

case when the communication range is 10 m. This result is rea-sonable since a long communication range means that the cur-rent sensor node can communicate with more other nodes inits neighborhood, and, thus, has advantages in finding a betterpath to collect the information; therefore, it leads to a lowernumber of used sensor nodes. However, the larger communi-cation range means a higher power consumption; but a decreasein the number of wireless transmission hops. Therefore, in prac-tice, we need to design this parameter carefully to find an op-timal trade-off setting.

B. Distributed Localization With LPG Function Approximation

In this section, we illustrate the approximation performanceof the LPG function and compare the estimation performancewhen we use the LPG approximation with the results whenwe use Gaussian approximation. In these examples, the dif-fusive source is located at . The bias term is

and the noise standard deviation is. We take five measurements at each sensor,

and the time sampling interval is fixed at 5 s. We still use theMahalanobis distance as the criterion to select the next sensornode.

Fig. 5. The real posterior density function (belief) when we collect the mea-surements from three sensor nodes.

Fig. 6. The approximated posterior density function (belief) when we collectthe measurements from three sensor nodes, using LPG function.

In the first example, we implement the approximation of thebelief density using LPG function. In Fig. 5 we show the truebelief density (the belief without any approximation) after wecollect the measurements of three sensor nodes. We find thatthe current belief includes two peaks, which is difficult to beapproximated using only the Gaussian or Gaussian-sum densityfunction. In Fig. 6, we draw the approximated belief densitywhen we use the LPG function with the Hermite polynomialsup to fifth-order to represent it. We observe that the approxima-tion result is close to the true one. If we want to obtain a moreaccurate belief representation, or the belief density is very com-plex, we may need to use higher order Hermite polynomials,which would lead to high processing cost at each sensor node.Thus, there is a tradeoff between the computation complexityand the approximation accuracy. Note that when we apply theLPG function approximation, the order of the Hermite polyno-mials that need to be used will decrease when we collect the datafrom more sensor nodes.

In the next example, we compare the estimation performanceof the proposed algorithm using the LPG function approxima-tion with the Gaussian approximation. Namely, we illustratethe estimation bias and the log determinant of the estimationcovariance matrix with respect to the number of used sensor

Page 12: Electrical & Systems Engineering at WashU - IEEE ...nehorai/paper/04133054.pdfIEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian

1522 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007

Fig. 7. Estimation bias versus number of sensor nodes used under differentbelief representation. (a) Gaussian density approximation. (b) LPG function ap-proximation, for the examples in Section IV-B.

Fig. 8. Log determinant of the estimation covariance versus number of sensornodes used under different belief representation. (a) Gaussian density approxi-mation. (b) LPG function approximation, for the examples in Section IV-B.

nodes in Fig. 7 and Fig. 8, respectively. We observe that thedistributed estimation using the LPG function approximationconverges to the true value much faster than the Gaussian ap-proximation, since the LPG function approximation providesus with a more accurate belief representation. In Fig. 8 we cansee that when we use the LPG approximation, we need to useonly seven sensor nodes to reach the performance threshold

, while for Gaussian approximation, we need20 nodes. Even though the LPG function approximation needsto transmit more parameters between two sensor nodes, it de-creases the data transmission hops and the number of sensorsthat are required to be activated; thus it can provide very lowtotal sensor network power consumption (especially when theform of the belief is very complex, e.g., highly nonlinear andnon-Gaussian) and a fast response speed, which is critical forsome security applications.

C. Sensor Node Selection

In Section III-C, we proposed four different sensor node se-lection criteria. For the nonlinear and non-Gaussian measure-ment model, the mutual information-based measure and the pos-terior CRB provide a better characterization of the usefulness of

Fig. 9. Estimation bias versus number of sensor nodes used for different sensornode selection measure. (a) Mahalanobis distance. (b) Covariance-based mea-sure, for the examples in Section IV-C.

Fig. 10. Log determinant of the estimation covariance versus number of sensornodes for different sensor node selection measure. (a) Mahalanobis distance.(b) Covariance-based measure, for the examples in Section IV-C.

the sensor data. However, the computational complexity to ob-tain these two measures is very high; hence, it requires signifi-cant processing capability of the sensor node, especially whenthe dimension of the measurement vector at each sensor node islarge. Therefore, in this section we compare the performances ofonly two other utility measures: the Mahalanobis distance andthe covariance-based measure.

In the examples in Section IV-A and IV-B, we used theMahalanobis distance as the utility information measure andfound that it worked well. However, in those examples weassumed a simple environment, i.e., a homogeneous semiinfi-nite medium without external force. In the following example,we assume there exists wind in the medium with the speed

. We implement the proposed dis-tributed estimation algorithm of Section III-A with two differentsensor selection criteria: the Mahalanobis distance and the co-variance-based measure. We compare their performance inFig. 9 and Fig. 10, and observe that under the current situation,the covariance-based measure performs better than the Ma-halanobis distance, i.e., when we apply the covariance-basedmeasure, the estimate of the source position approaches thetrue value much faster than using the Mahalanobis distance as a

Page 13: Electrical & Systems Engineering at WashU - IEEE ...nehorai/paper/04133054.pdfIEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian

ZHAO AND NEHORAI: DISTRIBUTED SEQUENTIAL BAYESIAN ESTIMATION 1523

measure. In addition, to reach a required performance threshold,we need to wake up fewer sensor nodes. As we presented inSection III-C-4, since we use only the prediction value of thecovariance, the sensor node selection is not optimal; however,we save the required communication amount, and at the sametime we incorporate the information of the dispersion modelin the selection of the next sensor node. Hence, this selectioncriterion provides better performance than the Mahalanobisdistance measure in a complex scenario.

V. CONCLUSION

In this paper, we addressed the problem of deriving efficientdistributed estimation methods and apply them to the estimationof a diffusive source in wireless sensor networks. To developour approaches, we first derived the substance dispersionmodel based on solving the diffusion equation under differentenvironmental effects, and then we integrated the obtainedphysical model into the signal processing technologies andproposed a distributed sequential Bayesian estimation method.Actually, except for the applications concerned with a diffusivesource, the proposed distributed estimation method provide ageneral framework for the other applications with stationaryparameters in wireless sensor networks. In our algorithm, wedynamically update the belief density function in the space do-main, which originates from the nonlinear recursive Bayesianfiltering, i.e., we transmit the state belief in the sensor networksand update the belief using the measurements from the newsensor node. In this way, we realize a distributed estimation.We also implemented the idea of information-driven sensorcollaboration for the sensor node selection, which provides anoptimal subset and an optimal order of incorporating the mea-surements into our belief update, reduces the response time andsaves on energy consumption of the sensor network. The beliefrepresentation is another important issue in our method becauseof the strict power and computation capability restrictions inwireless sensor networks. We propose two belief representationmethods: Gaussian density approximation and a new LPGfunction approximation. Both of these approximations aresuitable for wireless sensor networks and are applicable todifferent sensor network situations.

In future research, we will extend our proposed distributedprocessing methods to more complex environments (e.g., urbanenvironments). In these cases, we will need to use numericalmethods to solve the diffusion equation and obtain the physicalmodel. Theoretically, the proposed algorithm can be generalizedto include numerical solutions of the dispersion model. How-ever, the involvement of the numerical results may influence thecomputation complexity and required communication amount;therefore, we need to investigate it carefully. Another researchdirection is to extend our methods to applications involving bothstationary and dynamic state models in diffusion sources. Theapproaches in [2] provide some ideas on this topic. However, be-cause of the specificities of the investigated diffusion model andthe considered types of sensors (e.g., biochemical sensors), wemust study new and practically fitting methods. We also plan toinvestigate new belief representations and sensor node selectionmethods by considering the tradeoff between signal processing

performance, required communication amount, and computa-tion complexity.

ACKNOWLEDGMENT

The authors would like to thank Dr. A. Dogandzic andDr. E. Paldi for their helpful comments and discussions. Wealso thank anonymous reviewers for their careful checking andconstructive comments.

REFERENCES

[1] S. Kumar, F. Zhao, and D. Shepherd, Eds., “Special issue on collabora-tive signal and information processing in microsensor networks,” IEEESignal Process. Mag., vol. 19, no. 2, Mar. 2002.

[2] F. Zhao and L. J. Guibas, Wireless Sensor Networks: An InformationProcessing Approach. San Francisco, CA: Morgan Kaufmann, 2004.

[3] A. Nehorai, B. Porat, and E. Paldi, “Detection and localization ofvapor-emitting sources,” IEEE Trans. Signal Process., vol. SP-43, pp.243–253, Jan. 1995.

[4] B. Porat and A. Nehorai, “Localizing vapor-emitting sources bymoving sensors,” IEEE Trans. Signal Process., vol. SP-44, pp.1018–1021, Apr. 1996.

[5] A. Jeremic and A. Nehorai, “Design of chemical sensor arrays for mon-itoring disposal sites on the ocean floor,” IEEE J. Ocean. Eng., vol. 23,pp. 334–343, Oct. 1998.

[6] ——, “Landmine detection and localization using chemical sensorarray processing,” IEEE Trans. Signal Process., vol. SP-48, pp.1295–1265, May 2000.

[7] T. Zhao and A. Nehorai, “Detecting and estimating biochemical dis-persion of a moving source in a semi-infinite medium,” IEEE Trans.Signal Process., vol. 54, no. 6, pt. 1, pp. 2213–2225, Jun. 2006.

[8] F. Zhao, J. Shin, and J. Reich, “Information-driven dynamic sensorcollaboration,” IEEE Signal Process. Mag., vol. 19, no. 2, pp. 61–72,March 2002.

[9] M. Chu, H. Haussecker, and F. Zhao, “Scalable information-drivensensor querying and routing for ad hoc heterogeneous sensor net-works,” Int. J. High-Perform. Comput. Applicat., vol. 16, no. 3, pp.90–110, 2002.

[10] S. M. Kay, Fundamentals of Statistical Signal Processing: EstimationTheory. Englewood Cliffs, NJ: Prentice-Hall, 1993.

[11] A. Mutambara, Decentralized Estimation and Control for MultisensorSystems. Boca Raton, FL: CRC, 1998.

[12] B. D. Anderson and J. B. Moore, Optimal Filtering. EnglewoodCliffs, NJ: Prentice-Hall, 1979.

[13] R. S. Busy and K. D. Senne, “Digital synthesis of nonlinear filters,”Automatica, vol. 7, pp. 287–298, 1971.

[14] A. Doucet, N. de Freitas, and N. J. Gordon, Eds., Sequential MonteCarlo Methods in Practice, ser. Series: Statistics for Eng. Inf. Sci..New York: Springer-Verlag, 2001.

[15] M. Coates, “Distributed particle filters for sensor networks,” in 3rd Int.Symp. Inf. Process. Sensor Netw., Apr. 26–27, 2004, pp. 93–107.

[16] M. Rosencrantz, G. Gordon, and S. Thrun, “Decentralized sensor fu-sion with distributed particle filters,” in Prof. Conf. Uncertainty in Artif.Intell., Acapulco, Mexico, Aug. 2003.

[17] X. Sheng, Y.-H. Hu, and P. Ramanathan, “Distributed particle filterwith GMM approximation for multiple targets localization and trackingin wireless sensor network,” in Proc. 4th Int. Symp. Inf. Process. SensorNetw., Apr. 15, 2005, pp. 181–188.

[18] J. Crank, The Mathematics of Diffusion, 2nd ed. Oxford: OxfordUniv. Press, 1975.

[19] W. Jost, Diffusion in Solids, Liquids, Gases. New York: Academic,1952.

[20] J. M. Bernardo and A. F. M. Smith, Bayesian Theory. New York:Wiley, 1994.

[21] A. Azevedo-Filho and R. D. Shachter, “Laplace’s method approxi-mations for probabilistic inference in belief networks with continuousvariables,” in Proc. 10th Conf. Uncertainty in Artif. Intell., San Mateo,CA, 1994, pp. 28–36.

[22] E. Paldi and A. Nehorai, “Exact nonlinear filtering via Gauss transformeigenfunctions,” in Proc. Int. Conf. Acoust., Speech, Signal Process.,Apr. 1991, vol. 5, pp. 3389–3392.

Page 14: Electrical & Systems Engineering at WashU - IEEE ...nehorai/paper/04133054.pdfIEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian

1524 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007

[23] H. W. Sorenson and D. L. Alspach, “Recursive Bayesian estimationusing Gaussian sums,” Automatica, vol. 7, pp. 465–479, 1971.

[24] D. L. Alspach and H. W. Sorenson, “Nonlinear Bayesian estimationusing Gaussian sum approximations,” IEEE Trans. Autom. Control,vol. 17, no. 4, Aug. 1972.

[25] C. S. Withers, “A simple expression for the multivariate Hermite poly-nomials,” Stat. Probab. Lett., vol. 47, pp. 165–169, 2000.

[26] ——, “The moments of the multivariate normal,” Bull. Austral. Math.Soc., vol. 32, pp. 103–107, 1985.

[27] P. Tichavský, C. H. Muravchik, and A. Nehorai, “Posterior Cramér-Rao bounds for discrete-time nonlinear filtering,” IEEE Trans. SignalProcess., vol. 46, no. 5, May 1998.

Tong Zhao (S’02) received the B.Eng. and M.Sc. de-grees in electrical engineering from the University ofScience and Technology of China, Hefei, China, in1997 and 2000, respectively.

Currently, he is pursuing the D.Sc. degree with theDepartment of Electrical and Systems Engineering,Washington University, Saint Louis. His research in-terests are statistical signal processing, including de-tection and estimation theory, distributed signal pro-cessing, sequential Bayesian methods, Monte Carlomethods, etc., and their applications in radar, com-

munications, biochemical sensor, sensor arrays, and wireless sensor networks.

Arye Nehorai (S’80–M’83–SM’90–F’94) receivedthe B.Sc. and M.Sc. degrees in electrical engineeringfrom the Technion, Israel, and the Ph.D. degreein electrical engineering from Stanford University,Stanford, CA.

From 1985 to 1995, he was a faculty memberwith the Department of Electrical Engineering, YaleUniversity, New Haven, CT. In 1995, he joined theDepartment of Electrical Engineering and ComputerScience, The University of Illinois at Chicago (UIC),as a Full Professor. From 2000 to 2001, he was Chair

of the department’s Electrical and Computer Engineering (ECE) Division,which then became a new department. In 2001, he was named UniversityScholar of the University of Illinois. In 2006, he became Chairman of theDepartment of Electrical and Systems Engineering, Washington University,St. Louis.

Dr. Nehorai is the inaugural holder of the Eugene and Martha Lohman Profes-sorship and the Director of the Center for Sensor Signal and Information Pro-cessing (CSSIP) at WUSTL since 2006. He was Editor-in-Chief of the IEEETRANSACTIONS ON SIGNAL PROCESSING during 2000 to 2002. During 2003 to2005, he was Vice President (Publications) of the IEEE Signal Processing So-ciety, Chair of the Publications Board, member of the Board of Governors, andmember of the Executive Committee of this Society. He is the founding editor ofthe special columns on Leadership Reflections in the IEEE SIGNAL PROCESSING

MAGAZINE. He was corecipient of the IEEE SPS 1989 Senior Award for BestPaper with P. Stoica, coauthor of the 2003 Young Author Best Paper Award,and corecipient of the 2004 Magazine Paper Award with A. Dogandzic. He waselected Distinguished Lecturer of the IEEE SPS for the term 2004 to 2005. Heis the Principal Investigator of the new multidisciplinary university research ini-tiative (MURI) project entitled Adaptive Waveform Diversity for Full SpectralDominance. He has been a Fellow of the Royal Statistical Society since 1996.