2816 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 54, NO. 12, DECEMBER 2009

Distributed Kriged Kalman Filter for Spatial Estimation

Jorge Cortés, Senior Member, IEEE

Abstract—This paper considers robotic sensor networks performing spatially-distributed estimation tasks. A robotic sensor network is deployed in an environment of interest, and takes successive point measurements of a dynamic physical process modeled as a spatio-temporal random field. Taking a Bayesian perspective on the Kriging interpolation technique from geostatistics, we design the DISTRIBUTED KRIGED KALMAN FILTER for predictive inference of the random field and of its gradient. The proposed algorithm makes use of a novel distributed strategy to compute weighted least squares estimates when measurements are spatially correlated. This strategy results from the combination of the Jacobi overrelaxation method with dynamic average consensus algorithms. As an application of the proposed algorithm, we design a gradient ascent cooperative strategy and analyze its convergence properties in the absence of measurement errors via stochastic Lyapunov functions. We illustrate our results in simulation.

Index Terms—Cooperative control, distributed estimation, distributed Kriged Kalman filter, robotic sensor networks, spatial statistics.

I. INTRODUCTION

CONSIDER a robotic sensor network taking successive measurements of a dynamic physical process modeled as a spatio-temporal random field. Our objective is to design a distributed estimation algorithm that enables the network to obtain consistent and statistically sound representations of the spatial field. Arguably, the availability of such representations to the network agents is necessary to tackle other sensing tasks related with the physical process, such as optimal estimation, localization of critical points, or identification of areas of rapid variability. These tasks are relevant in multiple scenarios, including environmental monitoring, oceanographic exploration, and atmospheric research, when one might be interested in finding higher pollutant concentrations, areas of maximum salinity, or locations where algae are abundant.

1) Literature Review: In geostatistics, spatial processes modeled as random fields are estimated via Kriging interpolation techniques [1], [2]. Simple Kriging assumes that the mean of the random field is constant and known a priori. Universal Kriging, instead, considers the setup where the mean function is an unknown combination of known basis functions. For processes that evolve in time, [3], see also [4], develops a universal Kriging approach termed Kriged Kalman filter that combines the time and spatial components of the field. The work [5] presents an inferential framework for directional gradients of spatial fields based on point-referenced data.

Manuscript received May 08, 2007; revised January 12, 2009. First published November 10, 2009; current version published December 09, 2009. This paper was presented in part at the IEEE Conference on Decision and Control, 2007. This work was supported in part by National Science Foundation (NSF) CAREER Award ECS-0546871. Recommended by Associate Editor S. Dey.

The author is with the Department of Mechanical and Aerospace Engineering, University of California, San Diego, CA 92093 USA (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TAC.2009.2034192

In cooperative control, [6] proposes a decentralized information filter for parameter estimation based on all-to-all communication, and applies it to tracking, localization, and map building. [7] designs network coordination strategies to seek out local optima of a deterministic, static field using noisy measurements and all-to-all network communication. The field is represented by an affine function. Instead, [8] considers a dynamic field represented by an unknown linear combination of known functions whose coefficients evolve stochastically driven by white noise. [8] develops distributed optimal estimation techniques for networks with connected communication topology and sensor measurements corrupted by white noise. The measurements taken by individual network agents are uncorrelated. Objective analysis techniques are employed in [9] to find, in restricted parameterized families of curves, network trajectories that optimize the off-line, centralized estimation of an environmental field whose mean is a priori known and whose covariance is separable. During the evolution, individual agents do not communicate field measurements to other neighbors or possess a representation of the spatial field. Instead, the fusion of the data is performed at the end of the experiment. The works [10], [11] introduce distributed data fusion algorithms based on averaging consensus that work under the assumption that sensor measurements are uncorrelated. Dynamic consensus algorithms that allow the network to track the average of a given time-varying signal are studied in [12]–[14]. Other related works include [15], [16], where decentralized Kalman filtering procedures are developed that work under the assumption of all-to-all communication. Parallel and distributed algorithms for static networks are thoroughly studied in [17]. Finally, stability analysis tools for stochastic systems include the martingale convergence theorem [18] and stochastic Lyapunov functions [19].

2) Statement of Contributions: The contributions of this paper are the following: (i) the formulation of the spatio-temporal field estimation via Bayesian universal Kriging, and the incorporation of statistically sound gradient information of the spatial field; (ii) the synthesis of a distributed algorithm to compute weighted least squares estimates when sensor measurements are correlated. This algorithm combines the Jacobi overrelaxation method with dynamic average consensus algorithms; (iii) the design of the DISTRIBUTED KRIGED KALMAN FILTER for predictive inference of the spatial field and of its gradient; and (iv) building on the previous contributions, the synthesis of a distributed motion coordination strategy that makes individual robotic agents converge with probability one to the set of critical points of the random spatial field.

0018-9286/$26.00 © 2009 IEEE

3) Organization: The paper is organized as follows. Section II presents basic notions on random spatial fields. Section III introduces the models for the physical process and the robotic sensor network. Section IV describes the sequential estimation of the spatial field and of its gradient via Bayesian universal Kriging. Section V presents a distributed algorithm to compute weighted least squares estimates when sensor measurements are correlated. This algorithm is then used in Section VI to design a distributed implementation of the sequential estimation discussed in Section IV. Section VII proposes a cooperative strategy to localize critical points of spatial fields and analyzes its convergence properties. Section VIII presents our conclusions and ideas for future work.

Notation: Let $\mathbb{Z}$, $\mathbb{Z}_{>0}$, $\mathbb{Z}_{\ge 0}$, $\mathbb{R}$, $\mathbb{R}_{>0}$, and $\mathbb{R}_{\ge 0}$ denote, respectively, the set of integer, positive integer, non-negative integer, real, positive real, and non-negative real numbers. Let $\delta$ denote the Dirac delta function, defined by $\delta(x) = 0$ for $x \neq 0$, and $\int_{-\infty}^{\infty} \delta(x)\, dx = 1$. Vectors in Euclidean space are understood as column vectors. Let $\{e_1, \dots, e_d\}$ denote the canonical basis of $\mathbb{R}^d$. Given a matrix $A \in \mathbb{R}^{n \times m}$, let $\operatorname{row}_i(A)$ and $\operatorname{col}_j(A)$ denote the $i$th row and the $j$th column of $A$, respectively. For an undirected graph $G = (V, E)$ consisting of a set of vertices $V$ and a set of edges $E$, the neighbors of $v \in V$ in $G$ are denoted by $\mathcal{N}(v) = \{u \in V \mid (u, v) \in E\}$. Usually, we take $V = \{1, \dots, n\}$. The adjacency matrix of $G$ is the matrix $A(G)$ defined by $A(G)_{ij} = 1$ if $(i, j) \in E$, and $A(G)_{ij} = 0$ otherwise. We will often simply denote it by $A$. Throughout the paper, we use the math boldface font to emphasize the dependence of the corresponding quantity on the specific network configuration where it is evaluated. This allows us to write more concise expressions.

II. RANDOM SPATIAL FIELDS

In this section we review important notions on random spatial fields. The interested reader is referred to [1], [2] for more details. Let us start with some basic definitions. Let $Z$ be a random spatial field on $\mathbb{R}^d$, $d \in \mathbb{Z}_{>0}$, with positive definite covariance function

$$C(s, s') = \operatorname{Cov}(Z(s), Z(s')), \qquad s, s' \in \mathbb{R}^d.$$

The field is stationary on $\mathbb{R}^d$ if $C(s, s') = C(s - s')$, for $s, s' \in \mathbb{R}^d$, and isotropic on $\mathbb{R}^d$ if $C(s, s') = C(\|s - s'\|)$, for $s, s' \in \mathbb{R}^d$. Throughout the paper, we deal with stationary random fields. When modeling physical processes, it is common for a random field to be stationary over a strict subset of $\mathbb{R}^d$ instead of the whole Euclidean space. For instance, the assumption of stationarity is reasonable for a temperature field considered over a small enough region of the ocean. However, over larger spatial domains, other physical phenomena might cause smaller correlation ranges in particular areas that invalidate the stationarity assumption. The ensuing discussion is also valid for random fields that are stationary on an open subset of $\mathbb{R}^d$.

Predictive inference of a spatial field at arbitrary points given measurements at arbitrary locations can be done via the joint distribution. For concreteness, let $Z$, with mean $\mu$ and covariance function $C$, be a stationary Gaussian process. Given measurements $Z_{\mathcal S} = (Z(s_1), \dots, Z(s_n))^T$ of the spatial field at locations $\mathcal S = \{s_1, \dots, s_n\}$, and $s_0 \in \mathbb{R}^d$, define the following shorthand notation for convenience:

$$c = \big( C(s_0 - s_1), \dots, C(s_0 - s_n) \big)^T, \qquad \Sigma = \big( C(s_i - s_j) \big)_{i,j=1}^{n}.$$

Then $(Z(s_0), Z_{\mathcal S}^T)^T$ is distributed as the $(n+1)$-dimensional normal

$$\mathcal N \left( \mu\, \mathbf 1_{n+1},\; \begin{pmatrix} C(0) & c^T \\ c & \Sigma \end{pmatrix} \right).$$

Consequently, the conditional predictive distribution of the spatial field at $s_0$ given observations at $\mathcal S$ is the normal distribution

$$\mathcal N \left( \mu + c^T \Sigma^{-1} (Z_{\mathcal S} - \mu\, \mathbf 1_n),\; C(0) - c^T \Sigma^{-1} c \right). \tag{1}$$

The conditional mean in (1) is known in the geostatistics literature as the simple Kriging predictor, and the conditional variance in (1) is the corresponding mean-squared prediction error. In general, perfect observations of the spatial field are not available, and the mean and the covariance structure are only known up to a certain number of parameters. We will discuss the estimation problem in these more general terms in Section IV.
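As a numerical sketch of the simple Kriging formulas in (1), the following computes the predictor and the prediction variance; the squared-exponential covariance and the data are illustrative assumptions, not the paper's model.

```python
import numpy as np

def simple_krige(s0, S, zS, mu, C):
    """Simple Kriging at point s0 given observations zS at the rows of S.

    C is a stationary covariance function of the lag vector; mu is the
    known constant mean. Returns the mean and variance of the conditional
    normal distribution in (1).
    """
    n = S.shape[0]
    c = np.array([C(s0 - S[i]) for i in range(n)])            # c_i = C(s0 - s_i)
    Sigma = np.array([[C(S[i] - S[j]) for j in range(n)] for i in range(n)])
    w = np.linalg.solve(Sigma, c)                             # Sigma^{-1} c
    pred = mu + w @ (zS - mu)                                 # simple Kriging predictor
    var = C(np.zeros_like(s0)) - w @ c                        # mean-squared prediction error
    return pred, var

# Illustrative example with an assumed squared-exponential covariance
C = lambda h: np.exp(-np.dot(h, h))
S = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
zS = np.array([1.0, 2.0, 0.5])
pred, var = simple_krige(np.array([0.5, 0.5]), S, zS, mu=1.0, C=C)
```

Note the interpolation property of noiseless Kriging: at an observed location the predictor returns the observation and the variance vanishes.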

When considering dynamic processes, we restrict our attention to spatio-temporal random fields on $\mathbb{R}^d \times \mathbb{R}$ with separable covariance functions, i.e., of the form

$$\operatorname{Cov}(Z(s, t), Z(s', t')) = C_s(s, s')\, C_t(t, t'),$$

where $s, s' \in \mathbb{R}^d$ and $t, t' \in \mathbb{R}$. A stationary spatio-temporal random field of this form verifies that $C_s(s, s') = C_s(s - s')$ and $C_t(t, t') = C_t(t - t')$, for $s, s' \in \mathbb{R}^d$ and $t, t' \in \mathbb{R}$. Note that the above discussion is also valid for predictive inference of a spatio-temporal field at a fixed instant of time.

A. Gradient Random Spatial Fields

The discussion here follows [5] and can be easily extended for spatio-temporal fields. For concreteness, we restrict our attention to stationary random fields. Given a stationary random field $Z$ on $\mathbb{R}^d$ and a vector $u \in \mathbb{R}^d$, a directional gradient field on $\mathbb{R}^d$ is defined, for $s \in \mathbb{R}^d$,

$$\nabla_u Z(s) = \lim_{h \to 0} \frac{Z(s + h u) - Z(s)}{h},$$

if the limit, understood in the $L_2$ sense, exists. The random field $Z$ is mean square differentiable at $s$ if there exists a vector $\nabla Z(s) \in \mathbb{R}^d$ such that, for all $u \in \mathbb{R}^d$,

$$Z(s + u) = Z(s) + u^T \nabla Z(s) + r(s, u), \qquad \text{with } \frac{r(s, u)}{\|u\|} \to 0 \text{ in } L_2 \text{ as } \|u\| \to 0.$$


It follows that, if $Z$ is mean square differentiable at $s$, then $\nabla_u Z(s) = u^T \nabla Z(s)$, for all $u \in \mathbb{R}^d$ (where the equality should be understood in the $L_2$-sense). In particular,

$$\nabla Z(s) = \big( \nabla_{e_1} Z(s), \dots, \nabla_{e_d} Z(s) \big)^T.$$

Throughout the paper, we will deal with random fields that are mean square differentiable everywhere.

If $Z$ is a stationary Gaussian random field, the resulting joint ($(d+1)$-dimensional multivariate) Gaussian field $(Z, \nabla Z)$ on $\mathbb{R}^d$ has a valid cross-covariance function

$$\operatorname{Cov}(\nabla Z(s), Z(s')) = \nabla C(s - s'), \qquad \operatorname{Cov}(\nabla Z(s), \nabla Z(s')) = -\operatorname{H} C(s - s'), \tag{2}$$

where $\nabla C$ and $\operatorname{H} C$ denote, respectively, the gradient and the Hessian of the function $C$. This joint distribution allows predictive inference for the gradient at arbitrary points given measurements of the random field at arbitrary locations. For concreteness, let $Z$ be a stationary Gaussian process with mean $\mu$ and with $C$ continuously differentiable. Given measurements $Z_{\mathcal S} = (Z(s_1), \dots, Z(s_n))^T$ of the spatial field at locations $\mathcal S = \{s_1, \dots, s_n\}$, and $s_0 \in \mathbb{R}^d$, according to (2), $(\nabla Z(s_0)^T, Z_{\mathcal S}^T)^T$ is distributed as the $(d+n)$-dimensional normal distribution

$$\mathcal N \left( \begin{pmatrix} \mathbf 0_d \\ \mu\, \mathbf 1_n \end{pmatrix},\; \begin{pmatrix} -\operatorname{H} C(0) & \nabla c \\ (\nabla c)^T & \Sigma \end{pmatrix} \right),$$

where $\nabla c = \big( \nabla C(s_0 - s_1), \dots, \nabla C(s_0 - s_n) \big) \in \mathbb{R}^{d \times n}$. Consequently, the conditional predictive distribution for the gradient is the $d$-dimensional normal distribution

$$\mathcal N \left( \nabla c\, \Sigma^{-1} (Z_{\mathcal S} - \mu\, \mathbf 1_n),\; -\operatorname{H} C(0) - \nabla c\, \Sigma^{-1} (\nabla c)^T \right). \tag{3}$$

Section IV studies the conditional predictive distribution of the gradient when the mean of the spatial field is unknown, and perfect observations are not available.

A critical point of the spatial field $Z$ is a location $s^* \in \mathbb{R}^d$ such that $\nabla Z(s^*) = 0$. Note that a critical point satisfies $\nabla_u Z(s^*) = 0$ for all $u \in \mathbb{R}^d$, and hence corresponds to a maximum, a minimum, or a saddle point of $Z$.
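The conditional gradient distribution (3) can be sketched numerically in the same spirit as the field predictor; the squared-exponential covariance and its derivatives below are illustrative assumptions.

```python
import numpy as np

def gradient_krige(s0, S, zS, mu, C, gradC, hessC):
    """Conditional mean and covariance of the field gradient at s0, cf. (3).

    gradC and hessC are the gradient and Hessian of the covariance
    function; the field mean mu is assumed known (simple-Kriging setting).
    """
    n, d = S.shape
    Sigma = np.array([[C(S[i] - S[j]) for j in range(n)] for i in range(n)])
    Gc = np.column_stack([gradC(s0 - S[i]) for i in range(n)])   # d x n cross-covariance
    W = Gc @ np.linalg.inv(Sigma)
    mean = W @ (zS - mu)                      # conditional mean of the gradient
    cov = -hessC(np.zeros(d)) - W @ Gc.T      # conditional covariance
    return mean, cov

# Assumed squared-exponential covariance and its derivatives (illustrative)
C = lambda h: np.exp(-h @ h)
gradC = lambda h: -2.0 * h * np.exp(-h @ h)
hessC = lambda h: (4.0 * np.outer(h, h) - 2.0 * np.eye(h.size)) * np.exp(-h @ h)

S = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
zS = np.array([1.0, 2.0, 0.5])
mean, cov = gradient_krige(np.array([0.5, 0.5]), S, zS, 1.0, C, gradC, hessC)
```

In this example the estimated gradient points toward the larger measurement at $(1,0)$ and away from the smaller one at $(0,1)$.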

III. PROBLEM SET-UP

The objective of this paper is to design distributed estimation algorithms that enable a robotic sensor network to obtain consistent and statistically sound representations of a physical process of interest. In the following, we detail the specific models for the process and the robotic network.

A. Physical Process Model

We consider a dynamic physical process, i.e., a process that evolves in time, modeled as a spatio-temporal Gaussian random field of the form

$$Z(s, t) = \mu(s, t) + \varepsilon(s, t), \tag{4a}$$

$$\mu(s, t+1) = \int_{\mathbb{R}^d} h(s, r)\, \mu(r, t)\, dr + \eta(s, t), \tag{4b}$$

where $(s, t) \in \mathbb{R}^d \times \mathbb{Z}_{\ge 0}$ and $h : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ is continuously differentiable with respect to its first argument. Here, $\varepsilon$ captures small-scale variability of the physical process, and the evolution of the mean is determined by the interaction function $h$ and the stochastic component $\eta$. Both $\varepsilon$ and $\eta$ are stationary spatial fields that exhibit temporal variability but have no temporal dynamics associated with them. Formally, both are zero-mean Gaussian random fields with separable covariance structure

$$\operatorname{Cov}(\varepsilon(s, t), \varepsilon(s', t')) = C_\varepsilon(s - s')\, \delta(t - t'), \qquad \operatorname{Cov}(\eta(s, t), \eta(s', t')) = C_\eta(s - s')\, \delta(t - t'),$$

where $\delta$ denotes the Dirac delta function. Note that both $\varepsilon$ and $\eta$ are uncorrelated in time. We assume that the functions $C_\varepsilon$, $C_\eta$ have finite range. Without loss of generality, both ranges are considered equal, that is, there exists $r_{\mathrm{cov}} > 0$ such that

$$C_\varepsilon(s - s') = C_\eta(s - s') = 0, \qquad \text{whenever } \|s - s'\| \ge r_{\mathrm{cov}}. \tag{5}$$

The approach taken in [3], [4] to deal with the time evolution (4b) of the spatial field mean is to consider a truncated expansion. Specifically, if $\{\varphi_i : \mathbb{R}^d \to \mathbb{R}\}_{i \in \mathbb{Z}_{>0}}$ is a complete and orthonormal sequence of continuously differentiable functions, the mean admits a representation of the form

$$\mu(s, t) = \sum_{i=1}^{\infty} \beta_i(t)\, \varphi_i(s),$$

where, for each $i$, $\{\beta_i(t)\}_{t \in \mathbb{Z}_{\ge 0}}$ is a random time series. Likewise, $h$ admits a decomposition

$$h(s, r) = \sum_{i=1}^{\infty} h_i(r)\, \varphi_i(s).$$

The standard procedure is then to truncate the representation of $\mu$ and $h$ to, say, the first $p$ basis elements, and use the orthonormality of the basis to rewrite (4b) as

$$\beta(t+1) = H\, \beta(t) + \eta(t), \tag{6}$$

where, for simplicity, we use the notation

$$\beta(t) = (\beta_1(t), \dots, \beta_p(t))^T, \qquad H_{ij} = \int_{\mathbb{R}^d} h_i(r)\, \varphi_j(r)\, dr, \qquad \eta_i(t) = \int_{\mathbb{R}^d} \eta(r, t)\, \varphi_i(r)\, dr.$$

Alternatively, one can set up the problem by directly assuming that the mean of $Z$ in (4a) is a linear combination of known functions whose coefficients evolve in time according to (6).
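The truncated coefficient dynamics (6) form a linear-Gaussian recursion; the following sketch simulates one such recursion, with the interaction matrix $H$ and noise covariance $Q$ chosen arbitrarily for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 3                                # number of retained basis elements (assumed)
H = 0.9 * np.eye(p)                  # assumed stable interaction matrix
Q = 0.01 * np.eye(p)                 # assumed covariance of the projected noise

def step(beta):
    """One step of the mean-coefficient evolution beta(t+1) = H beta(t) + eta(t)."""
    eta = rng.multivariate_normal(np.zeros(p), Q)
    return H @ beta + eta

beta = np.ones(p)
for _ in range(50):                  # with a stable H, the coefficients stay bounded
    beta = step(beta)
```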

B. Network Model

Consider a network of $n$ agents evolving in $\mathbb{R}^d$ according to the first-order dynamics

$$\dot p_i(t) = u_i(t), \qquad i \in \{1, \dots, n\}.$$

The control action is bounded, $\|u_i\| \le u_{\max}$, so that an agent can move at most $u_{\max}$ in one second. Agents are


equipped with identical sensors, and can take point measurements at their location of the spatial field of interest at times $t \in \mathbb{Z}_{\ge 0}$. The measurement taken by agent $i$ located at $p_i(t)$ at time $t$ is corrupted by white noise according to

$$y_i(t) = Z(p_i(t), t) + v_i(t), \tag{7}$$

where $v_i(t) \sim \mathcal N(0, \sigma^2)$. Measurement errors are assumed to be independent. For simplicity, the variance is assumed to be the same for all agents, although the forthcoming discussion can be generalized to the case of different noise variances for each agent.

Each agent can communicate with other agents located within a distance $R > 0$ from its current position. As we will show later, each agent can construct a distributed representation of the spatial field and of its gradient in a ball of radius $r_{\mathrm{cov}}$. Therefore, we will make the assumption that the communication radius $R$ is sufficiently large with respect to the covariance range $r_{\mathrm{cov}}$.

The communication capabilities of the agents induce the network topology corresponding to the $R$-disk graph. At each network configuration $P = (p_1, \dots, p_n)$, the $R$-disk graph is an undirected graph with vertex set $\{1, \dots, n\}$ and edge set

$$E(P) = \{(i, j) \mid \|p_i - p_j\| \le R,\; i \neq j\}.$$

The $R$-disk graph is a particular example of the notion of proximity graph, see e.g., [20]. We assume that either the number of network agents $n$ is a priori known to everybody, or that agents run a consensus algorithm to determine it.
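A minimal sketch of constructing the disk-graph adjacency matrix directly from agent positions (the positions and radius below are illustrative):

```python
import numpy as np

def disk_graph_adjacency(P, R):
    """Adjacency matrix of the R-disk graph: distinct agents i, j are
    neighbors iff their positions (rows of P) are within distance R."""
    n = P.shape[0]
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(P[i] - P[j]) <= R:
                A[i, j] = A[j, i] = 1    # undirected graph: symmetric adjacency
    return A

P = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
A = disk_graph_adjacency(P, R=1.5)       # only the first two agents are linked
```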

Remark 3.1 (Distributed Computation): One can provide a formal notion of the concept of distributed computation of functions and vector fields, see, e.g., [20]. For simplicity, here we only use an informal version of this notion, where we characterize a computation as distributed over an undirected graph if each node can perform the computation using only information provided by its neighbors in the graph.

IV. SEQUENTIAL ESTIMATION OF THE SPATIAL FIELD AND OF ITS GRADIENT

In this section, we take a Bayesian perspective to incorporate previous knowledge into the estimation of the spatial field and of its gradient. The scheme followed here recovers the predictive distribution of the spatial field presented in [3], see also [4], and yields novel information regarding the predictive distribution of the gradient of the spatial field. We consider the spatial field estimation when measurements are taken at multiple time instants, or sequentially. The physical process model in Section III-A together with the data model in Section III-B give rise to the evolution, for $t \in \mathbb{Z}_{\ge 0}$,

$$y(t) = \Phi(t)\, \beta(t) + w(t), \tag{8a}$$

$$\operatorname{Cov}(w(t)) = \Sigma(t) = E(t) + \sigma^2 I_n, \tag{8b}$$

$$\beta(t+1) = H\, \beta(t) + \eta(t), \tag{8c}$$

where, for convenience, we have introduced the notation $y(t) = (y_1(t), \dots, y_n(t))^T$, $\Phi(t)_{ij} = \varphi_j(p_i(t))$, and

$$w_i(t) = \varepsilon(p_i(t), t) + v_i(t), \qquad E(t)_{ij} = C_\varepsilon(p_i(t) - p_j(t)).$$

Notice that the matrices $\Phi(t)$ and $\Sigma(t)$ driving the evolution of the parameter estimate change from one time instant to another only if agent positions change.

The natural Bayesian solution for making predictions about the spatial field at time $t \in \mathbb{Z}_{\ge 0}$ is to use the conditional distribution of the field given the data up to time $t$ and the parameter $\beta(t)$, but marginalizing over the posterior distribution of $\beta(t)$ given the data up to time $t$. This viewpoint also allows us to integrate into the picture prior information on the distribution of $\beta$. Therefore, we follow the next scheme: (i) Section IV-A computes the posterior distribution of the parameter given the data, (ii) Section IV-B computes the conditional distribution of the spatial field and its gradient given the data and the parameter, and (iii) Section IV-C merges (i) and (ii). The decomposition into the three steps will be conveniently used in Section VI to design a distributed implementation.

A. Sequential Parameter Estimation via Kalman Filtering

With the model (8), the parameter $\beta(t)$ can be optimally predicted via a Kalman filter. Here, instead of considering the usual Kalman filter recursion equations, we use the equivalent information filter formulation, see for instance [21]. We also note that alternative forms of the information filter, like the Joseph form [22], can be computed in a distributed fashion along the same lines described later in Sections V and VI.

Assume $\beta(0)$ is initially distributed according to a multivariate normal distribution $\mathcal N(\beta_0, \Sigma_0)$. Given $t \ge t'$, let $\hat\beta(t \mid t')$ denote the estimator of $\beta(t)$ at time $t$ with data collected up to time $t'$, and let $P(t \mid t')$ denote the associated mean-squared error. The usual Kalman filter equations are written in the variables $\hat\beta(t \mid t)$ and $P(t \mid t)$. Instead, we define

$$I(t \mid t') = P(t \mid t')^{-1}, \qquad q(t \mid t') = P(t \mid t')^{-1}\, \hat\beta(t \mid t'), \tag{9}$$

and write the information filter equations in the variables $q(t \mid t)$ and $I(t \mid t)$. Note that, initially,

$$I(0 \mid 0) = \Sigma_0^{-1}, \qquad q(0 \mid 0) = \Sigma_0^{-1} \beta_0.$$

The information filter equations have two steps. The first step corresponds to a prediction of the parameter at time $t+1$ given data up to time $t$, and the second step incorporates the measurements taken at time $t+1$ into the picture.

1) Prediction: Using (8c), the one-step-ahead prediction at time $t+1$ with data collected up to time $t$ is

$$q(t+1 \mid t) = I(t+1 \mid t)\, H\, I(t \mid t)^{-1}\, q(t \mid t), \tag{10a}$$

with information matrix

$$I(t+1 \mid t) = \big( H\, I(t \mid t)^{-1} H^T + Q \big)^{-1}, \tag{10b}$$

where $Q = \operatorname{Cov}(\eta(t))$.


2) Correction: Using (8a), the optimal prediction at time $t+1$ with data collected up to time $t+1$ can be recursively expressed as

$$q(t+1 \mid t+1) = q(t+1 \mid t) + \Phi(t+1)^T\, \Sigma(t+1)^{-1}\, y(t+1), \tag{11a}$$

with information matrix

$$I(t+1 \mid t+1) = I(t+1 \mid t) + \Phi(t+1)^T\, \Sigma(t+1)^{-1}\, \Phi(t+1), \tag{11b}$$

where $y(t+1) = (y_1(t+1), \dots, y_n(t+1))^T$ denotes the measurements taken by the network agents at time $t+1$, and $\Sigma(t+1) = E(t+1) + \sigma^2 I_n$, where $E(t+1)$ is the variance corresponding to the spatial field $\varepsilon$, and $\sigma^2 I_n$ is the variance corresponding to the sensor errors $v$.

The information filter equations provide an iterative way of computing $q(t \mid t)$ and $I(t \mid t)$ that is appropriate for a distributed implementation by the robotic network. We describe this in Section VI-A.
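A compact sketch of one prediction-correction cycle of the information filter in the variables of (9), under the standard linear-Gaussian form used here; the shapes and the numerical values in the example run are illustrative assumptions.

```python
import numpy as np

def info_filter_step(q, I, H, Q, Phi, Sigma, y):
    """One prediction/correction cycle of the information filter.

    State: information vector q = P^{-1} beta_hat and matrix I = P^{-1}.
    Prediction propagates through beta(t+1) = H beta(t) + eta, Cov(eta) = Q;
    correction incorporates y = Phi beta + w, Cov(w) = Sigma.
    """
    # Prediction: P_pred = H P H^T + Q, expressed in information form
    P = np.linalg.inv(I)
    beta = P @ q
    I_pred = np.linalg.inv(H @ P @ H.T + Q)
    q_pred = I_pred @ (H @ beta)
    # Correction: additive information updates
    S_inv = np.linalg.inv(Sigma)
    I_new = I_pred + Phi.T @ S_inv @ Phi
    q_new = q_pred + Phi.T @ S_inv @ y
    return q_new, I_new

# Illustrative one-step run: identity dynamics, direct noisy observation
q, I = np.zeros(2), np.eye(2)
q, I = info_filter_step(q, I, np.eye(2), 0.1 * np.eye(2),
                        np.eye(2), 0.1 * np.eye(2), np.array([1.0, 2.0]))
```

With an accurate measurement, the recovered estimate $\hat\beta = I^{-1} q$ moves close to the observation, and the information matrix remains positive definite.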

B. Sequential Simple Kriging

For $t \in \mathbb{Z}_{\ge 0}$, let $y_{0:t} = (y(0), \dots, y(t))$ denote the data available up to time $t$. For $s \in \mathbb{R}^d$, let

$$k(s) = \big( C_\varepsilon(s - p_1(t)), \dots, C_\varepsilon(s - p_n(t)) \big)^T, \qquad \nabla k(s) = \big( \nabla C_\varepsilon(s - p_1(t)), \dots, \nabla C_\varepsilon(s - p_n(t)) \big).$$

The covariance structure of the spatial field (cf. Section III-A) has some important consequences. On the one hand, the components of $k(s)$ and $\nabla k(s)$ can only be nonvanishing if $\|s - p_i(t)\| < r_{\mathrm{cov}}$. More importantly, the decorrelation in time of the spatial field and the sensor errors imply that only the observations collected at exactly time $t$ play a role in the construction of the conditional predictive distribution of $Z(s, t)$ and $\nabla Z(s, t)$ with observations collected up to time $t$. This is formalized in the following result.

Lemma 4.1 (Sequential Simple Kriging): For all $t \in \mathbb{Z}_{\ge 0}$ and all $s \in \mathbb{R}^d$, one has, conditionally on the data collected up to time $t$ and the parameter $\beta(t)$,

i) $Z(s, t)$ has the normal distribution with mean

$$\varphi(s)^T \beta(t) + k(s)^T \Sigma(t)^{-1} \big( y(t) - \Phi(t)\, \beta(t) \big)$$

and variance

$$C_\varepsilon(0) - k(s)^T \Sigma(t)^{-1} k(s);$$

ii) $\nabla Z(s, t)$ has the normal distribution with mean

$$J(s)\, \beta(t) + \nabla k(s)\, \Sigma(t)^{-1} \big( y(t) - \Phi(t)\, \beta(t) \big)$$

and variance

$$-\operatorname{H} C_\varepsilon(0) - \nabla k(s)\, \Sigma(t)^{-1}\, \nabla k(s)^T,$$

where, for brevity, we let $\varphi(s) = (\varphi_1(s), \dots, \varphi_p(s))^T$ and $J(s) = (\nabla \varphi_1(s), \dots, \nabla \varphi_p(s))$.

Proof: Given the evolution (8) and the covariance structure for $\varepsilon$ and $v$ detailed in Section III, we have that for $t \in \mathbb{Z}_{\ge 0}$ and $s \in \mathbb{R}^d$, the conditional distribution of $(Z(s, t), y(t)^T)^T$ given $\beta(t)$ is

$$\mathcal N \left( \begin{pmatrix} \varphi(s)^T \beta(t) \\ \Phi(t)\, \beta(t) \end{pmatrix},\; \begin{pmatrix} C_\varepsilon(0) & k(s)^T \\ k(s) & \Sigma(t) \end{pmatrix} \right),$$

where $\Sigma(t) = E(t) + \sigma^2 I_n$, $E(t)_{ij} = C_\varepsilon(p_i(t) - p_j(t))$, and $\Phi(t)_{ij} = \varphi_j(p_i(t))$. Then, fact (i) follows by conditioning this joint normal distribution on $y(t)$. Fact (ii) can be established analogously using (2).

In the absence of measurement errors, Lemma 4.1(i) corresponds to the simple Kriging predictor and variance of the spatio-temporal field in Section II.

C. Sequential Bayesian Universal Kriging

Finally, we construct the Bayesian universal Kriging predictor of the spatial field and of its gradient by putting together Sections IV-A and IV-B. Specifically, at time $t \in \mathbb{Z}_{\ge 0}$, for each $s \in \mathbb{R}^d$, the posterior predictive distributions of $Z(s, t)$ and $\nabla Z(s, t)$ given the data are obtained by marginalizing the conditional distributions in Lemma 4.1 over the posterior distribution $\mathcal N(\hat\beta(t \mid t), P(t \mid t))$ obtained with the combination of the information filter equations in Section IV-A and equation (9). Accordingly, we obtain the following result.

Lemma 4.2 (Sequential Bayesian Universal Kriging): For all $t \in \mathbb{Z}_{\ge 0}$ and all $s \in \mathbb{R}^d$, one has, conditionally on the data collected up to time $t$,

i) $Z(s, t)$ has the normal distribution with mean

$$\varphi(s)^T \hat\beta(t \mid t) + k(s)^T \Sigma(t)^{-1} \big( y(t) - \Phi(t)\, \hat\beta(t \mid t) \big)$$

and variance

$$C_\varepsilon(0) - k(s)^T \Sigma(t)^{-1} k(s) + g(s)^T P(t \mid t)\, g(s), \qquad g(s) = \varphi(s) - \Phi(t)^T \Sigma(t)^{-1} k(s);$$

ii) $\nabla Z(s, t)$ has the normal distribution with mean

$$J(s)\, \hat\beta(t \mid t) + \nabla k(s)\, \Sigma(t)^{-1} \big( y(t) - \Phi(t)\, \hat\beta(t \mid t) \big)$$

and variance

$$-\operatorname{H} C_\varepsilon(0) - \nabla k(s)\, \Sigma(t)^{-1}\, \nabla k(s)^T + G(s)\, P(t \mid t)\, G(s)^T, \qquad G(s) = J(s) - \nabla k(s)\, \Sigma(t)^{-1}\, \Phi(t).$$


The normal distribution in Lemma 4.2(i) corresponds to the spatial estimation obtained from the Kriged Kalman filter proposed in [3]. The normal distribution in Lemma 4.2(ii) gives us information about the gradient of the spatial field. Our next objective is to design a distributed coordination algorithm that allows network agents to compute these quantities.

V. DISTRIBUTED AVERAGE WEIGHTED LEAST SQUARES

This section presents distributed algorithms to compute average weighted least squares estimates. The capability to compute such estimates will be instrumental in Section VI to synthesize a distributed implementation of the estimation procedure described in Section IV.

Given a network of $n$ agents with interaction topology described by an undirected graph $G$, an invertible matrix $A \in \mathbb{R}^{n \times n}$, a matrix $B \in \mathbb{R}^{n \times p}$, and a vector $b \in \mathbb{R}^n$, we introduce here an algorithm to compute the quantity

$$\frac{1}{n}\, B^T A^{-1} b \tag{12}$$

that is distributed over $G$. The idea is to combine a Jacobi iteration and a dynamic consensus algorithm into a single procedure that we term the WEIGHTED LEAST SQUARES ALGORITHM. The reason behind this terminology is the following: consider a linear observation model (determined by $B$) of an unknown parameter, and let $b$ represent measured data with associated covariance $A$. Then, (12) corresponds to the average weighted least squares estimate of the parameter. Let us start by presenting the individual ingredients of the WEIGHTED LEAST SQUARES ALGORITHM.

A. Jacobi Overrelaxation Algorithm

Given an invertible matrix $A = (a_{ij}) \in \mathbb{R}^{n \times n}$ and a vector $b \in \mathbb{R}^n$, consider the linear system $A x = b$. The Jacobi overrelaxation (JOR) algorithm [17] is an iterative procedure to compute the unique solution $x^* = A^{-1} b$. It is formulated as the discrete-time dynamical system

$$x_i(k+1) = (1 - h)\, x_i(k) + \frac{h}{a_{ii}} \Big( b_i - \sum_{j \neq i} a_{ij}\, x_j(k) \Big),$$

for $i \in \{1, \dots, n\}$ and $k \in \mathbb{Z}_{\ge 0}$, with relaxation parameter $h \neq 0$ and arbitrary initial condition $x(0) \in \mathbb{R}^n$. The convergence properties of the JOR algorithm can be fully characterized in terms of the eigenvalues of the matrix describing the linear iteration, see [17]. Here, instead, we will use the following sufficient convergence criterion from [23, Theorem 2].

Lemma 5.1: For $A$ symmetric, positive definite and any $x(0) \in \mathbb{R}^n$, if $h \in (0, 2/n)$, the JOR algorithm linearly converges to the solution of $A x = b$ from any initial condition.

As long as (i) agent $i$ has access to $b_i$, and (ii) if $a_{ij} \neq 0$, then $i$, $j$ are neighbors in $G$, the JOR algorithm is amenable to distributed implementation in the following sense: agent $i$ can compute the $i$th component $x_i^*$ of the solution $x^* = A^{-1} b$ with information provided by its neighbors in $G$.
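The JOR iteration above is straightforward to implement; the following sketch runs it on a small symmetric positive definite system (the matrix, vector, and step size are illustrative choices within the sufficient convergence range).

```python
import numpy as np

def jor(A, b, h, iters=200, x0=None):
    """Jacobi overrelaxation: each component i is updated as
    x_i <- (1-h) x_i + (h/a_ii)(b_i - sum_{j != i} a_ij x_j),
    using only the components of neighbors j with a_ij != 0,
    so the iteration is distributed over the interaction graph."""
    n = len(b)
    x = np.zeros(n) if x0 is None else x0.astype(float).copy()
    D = np.diag(A)                                    # diagonal entries a_ii
    for _ in range(iters):
        x = (1.0 - h) * x + (h / D) * (b - A @ x + D * x)
    return x

# Symmetric positive definite example system
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
x = jor(A, b, h=0.5)                                  # converges to A^{-1} b
```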

Remark 5.2 (Robustness to Agents' Arrivals and Departures): In the scenario where the matrix $A$, the vector $b$, and the interaction topology $G$ are a function of agents' positions, the fact that the JOR algorithm converges from any initial condition implies that it is robust to a finite number of agents' arrivals and departures. In other words, if the number of agents after addition and deletion is $n'$, with corresponding $A'$ and $b'$, then as long as the hypotheses of Lemma 5.1 hold for $A'$ and $n'$, the convergence of the JOR algorithm to $(A')^{-1} b'$ is guaranteed.

B. Dynamic Average Consensus Algorithms

Dynamic average consensus filters [12]–[14] are distributed algorithms that allow the network to track the average of a given time-varying signal. Under suitable conditions on the evolution of the signal, one can guarantee asymptotic convergence. Here, we use a particular instance of the proportional-integral dynamic consensus estimators studied in [14] but formulated for higher-dimensional signals.

Let be a time-varying function,that we refer to as signal. Note that is a -dimensionalvector with each component , , being itselfa -dimensional vector. Consider the dynamical system

(13a)

(13b)

for , where and . Here,is the adjacency matrix of . If agent has access

to the -component of the signal , then this algorithm isdistributed over , i.e., agent can compute the evolution ofand with information provided by its neighboring agents inthe graph . As we see next, can be interpreted as the estimatethat agent possess of the average of the time-varying signal .It can be proved [14] that if is connected, for any , anyconstant signal , and any initialstates , the algorithm (13) satisfies

(14)

exponentially fast for all . For slowly-varying signals, the estimator guarantees small steady-state errors.

Remark 5.3: (Robustness to Agents’ Arrivals and Departures): When the signal and the interaction topology are functions of the agents’ positions, one can show [14] that the estimator (13) guarantees zero steady-state error under a finite number of agents’ arrivals and departures.
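For illustration, a continuous-time proportional-integral average consensus estimator of this flavor can be sketched as follows (the gains `gamma`, `kp`, `ki`, the specific sign conventions, and the forward-Euler discretization are illustrative assumptions in the spirit of the estimators of [14], not the exact form of (13)):

```python
import numpy as np

def pi_average_consensus(u, adj, gamma=1.0, kp=1.0, ki=1.0,
                         dt=0.01, steps=5000):
    """Proportional-integral dynamic average consensus (illustrative).

    Each agent i updates using only its own signal component u[i] and the
    states of its graph neighbors; for a constant signal on a connected
    graph, every x[i] converges to the network average of u."""
    n = len(u)
    L = np.diag(adj.sum(axis=1)) - adj   # Laplacian of the symmetric
                                         # adjacency matrix
    x = np.zeros(n)                      # local average estimates
    w = np.zeros(n)                      # integral (bias-correcting) states
    for _ in range(steps):
        dx = gamma * (u - x) - kp * (L @ x) - ki * (L @ w)
        dw = ki * (L @ x)
        x, w = x + dt * dx, w + dt * dw
    return x
```

The proportional term drives local agreement, while the integral states remove the steady-state bias that a purely proportional consensus filter would leave for nonuniform signals.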

C. The Weighted Least Squares Algorithm

Here, we combine the JOR algorithm and the dynamic average consensus algorithm to synthesize the WEIGHTED LEAST SQUARES ALGORITHM described in Table I.

Proposition 5.4: Consider the WEIGHTED LEAST SQUARES ALGORITHM described in Table I. For invertible, , and , define the output functions and by, respectively

where and are defined in Table I. Then,



TABLE I: WEIGHTED LEAST SQUARES ALGORITHM.

i) the WEIGHTED LEAST SQUARES ALGORITHM is distributed over , in the sense that agent can compute and with information provided by its neighboring agents in ;

ii) the function is such that

exponentially fast;

iii) if is connected, the function satisfies

exponentially fast, for all .

Proof: The WEIGHTED LEAST SQUARES ALGORITHM is distributed over by design. The statement on the limit of follows from Lemma 5.1 (the linear rate of convergence of the discrete-time algorithm translates into an exponential rate of convergence for the continuous-time function). Regarding the limit of , , note that, with the notation of Table I

where we have used that the JOR algorithm converges to the solution of . The result now follows from the convergence properties (14) of the dynamic average consensus algorithm.
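For concreteness, the combination underlying the WEIGHTED LEAST SQUARES ALGORITHM can be sketched as follows. This is an illustrative centralized prototype: JOR performs the solves against the correlated-measurement covariance (feasible distributedly because the covariance inherits the limited-range correlation sparsity), and the network-wide sums that the dynamic average consensus algorithm would provide to every agent are computed exactly for brevity. All names are illustrative.

```python
import numpy as np

def wls_centralized_reference(P, Sigma, y):
    """Centralized weighted least squares estimate
    theta = (P' Sigma^{-1} P)^{-1} P' Sigma^{-1} y, used only as the limit
    that the distributed combination tracks."""
    Si = np.linalg.inv(Sigma)
    return np.linalg.solve(P.T @ Si @ P, P.T @ Si @ y)

def wls_jor_consensus(P, Sigma, y, omega=None, iters=20000):
    """Sketch of the distributed strategy: JOR solves Sigma z = y and
    Sigma W = P componentwise, then each agent i contributes the local
    products P[i]' z[i] and P[i]' W[i], whose network-wide sums would be
    supplied to every agent by dynamic average consensus."""
    n = len(y)
    if omega is None:
        omega = 1.0 / n
    d = np.diag(Sigma)
    z = np.zeros(n)
    W = np.zeros_like(P)
    for _ in range(iters):
        z = z + omega * (y - Sigma @ z) / d
        W = W + omega * (P - Sigma @ W) / d[:, None]
    # consensus would give every agent these sums of local contributions
    return np.linalg.solve(P.T @ W, P.T @ z)
```

The key observation is that both the normal-equations matrix and the right-hand side decompose into sums of per-agent terms once the covariance solves are available, which is exactly what makes the consensus stage applicable.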

Remark 5.5: (Execution of the JOR, Dynamic Average Consensus, and Weighted Least Squares Algorithms): In the forthcoming discussion, we make use of the fast convergence properties of the JOR, dynamic average consensus, and WEIGHTED LEAST SQUARES algorithms, and use the exact asymptotic limits of these algorithms in our derivations. In practice, after a few iterations the values obtained by the execution of these algorithms are very close to the exact asymptotic limit, because the convergence rate of the JOR algorithm is linear and the convergence rates of the dynamic average consensus algorithms and of the WEIGHTED LEAST SQUARES ALGORITHM are exponential. Using this fact, it is not difficult to characterize the precise number of iterations needed to achieve a desired level of convergence.
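For instance, for a linearly convergent iteration with contraction factor `rho`, the number of iterations needed to reach a tolerance `tol` can be bounded as follows (an illustrative sketch; for JOR, `rho` would be the spectral radius of the iteration matrix, which the agents would need to bound):

```python
import math

def iterations_for_tolerance(rho, tol):
    """Smallest k such that rho**k <= tol, for a contraction factor
    rho in (0, 1) and a desired tolerance tol in (0, 1)."""
    return math.ceil(math.log(tol) / math.log(rho))
```

For example, with contraction factor 0.5, reaching a tolerance of 1e-6 requires 20 iterations.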

VI. DISTRIBUTED IMPLEMENTATION OF SEQUENTIAL FIELD ESTIMATION

In this section, we introduce the DISTRIBUTED KRIGED KALMAN FILTER. This algorithm allows each network agent to compute, at any time, the posterior predictive distribution of the spatial field and its gradient on a neighborhood of its current location, as obtained in Section IV, and only requires communication with neighboring agents in .

The algorithm is described in Table II. The underlying idea is that, instead of working directly with the posterior predictive distributions obtained in Section IV, each agent performs in a distributed way (i) the sequential parameter estimation described in Section IV-A and (ii) the sequential simple Kriging described in Section IV-B. From these two constructions, each agent can then compute the desired posterior predictive distributions. The implementation of both (i) and (ii) relies on the algorithms presented in Section V, and in particular on the WEIGHTED LEAST SQUARES ALGORITHM. It should be noted that, every time the algorithms in Section V are invoked in Table II, we use their exact asymptotic limit, cf. Remarks 5.5 and 6.2.

In the following, we explain in detail the algorithm steps outlined in Table II.

A. Distributed Sequential Parameter Estimation

Here, we describe the strategy that the network agents implement in order to compute the sequential parameter estimation described in Section IV-A.

1) Distributed Prediction: At time , assume and are known to all network agents from the previous iteration (initially, we set , ). According to (10), agent can compute the one-step-ahead prediction with information matrix (step 9 of Table II) if it has access to the matrices



TABLE II: DISTRIBUTED KRIGED KALMAN FILTER.

We break down this task into the computation of the matrices

(15)

Let us show how the network performs a distributed computation of the matrices in (15). This corresponds to step 3 of Table II. Specifically, if agent has access to the matrix for , define

by

(16)

where is determined by the execution of the dynamic average consensus algorithm (13) over with the constant signal

given by

According to Section V-B, if is connected, then



as exponentially fast.

At time , agent has access to

Moreover, the assumption (5) on the finite correlation range of guarantees that agent can compute

by knowing the positions of its neighbors in at time . Hence, agent has access to

by communicating with its -disk neighbors. Therefore, in order to compute the matrices in (15), the network executes three dynamic average consensus algorithms over to obtain

2) Distributed Correction: At time , assume and are known to all network agents from the distributed prediction computation. According to (11), agent can compute the prediction with information matrix (step 10 of Table II) if it has access to

(17a)

(17b)

Let us describe how the network performs a distributed computation of (17).

3) Distributed Computation of the Weighted Least Squares Estimate: This discussion refers to step 4 of Table II. The network computes the vector (17a) by invoking the WEIGHTED LEAST SQUARES ALGORITHM once. Specifically, agent has access to , and to . Therefore, assuming that is connected, the execution of the WEIGHTED LEAST SQUARES ALGORITHM with , , and guarantees, according to Proposition 5.4,

as , for all .

4) Distributed Computation of the Covariance Matrix of the Weighted Least Squares Estimate: This discussion refers to steps 5–8 of Table II. The network computes the matrix (17b) by invoking instances of the WEIGHTED LEAST SQUARES ALGORITHM (one instance per matrix column). Specifically, for each , agent has access to the component of the vector , and to . Moreover, the assumption (5) on the finite correlation range of the spatial field guarantees that agent can compute by knowing the positions of its neighbors in at time . Therefore, assuming that is connected, the execution of the WEIGHTED LEAST SQUARES ALGORITHM with , , and guarantees, according to Proposition 5.4,

as , for all . Hence, the execution of instances of the WEIGHTED LEAST SQUARES ALGORITHM allows agent to compute the time-dependent matrix defined by

with the property that

as , for all .

B. Distributed Sequential Simple Kriging

Here, we describe the strategy that the network agents implement in order to compute the sequential simple Kriging described in Section IV-B. This strategy makes use of the special covariance structure of the spatial field. The discussion refers to steps 11–12 of Table II. According to Lemma 4.1, to compute the means of the conditional distributions of the spatial field and its gradient at and , we look for the distributed calculation of

(18a)

(18b)

(18c)

Regarding (18a), note that the components of and can only be nonvanishing if agent is within -distance of , that is, . Therefore, any agent can compute all the nonvanishing components in and if the ball centered at of radius is contained in the area within communication range of agent , since in this case the agent has access to the locations of all other agents contained in . Noting that



we deduce that agent can compute (18a) in a distributed way for any .
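As a centralized point of comparison, the simple Kriging predictive mean that these distributed computations recover can be sketched as follows (illustrative names; in the paper the covariance solve is carried out distributedly via the JOR algorithm rather than by a direct linear solve):

```python
import numpy as np

def simple_kriging_mean(s0, sites, y, mean_fn, cov_fn):
    """Simple Kriging predictive mean at location s0: the known mean plus
    the covariance-weighted correction from the observed residuals,
    mu(s0) + c0' C^{-1} (y - mu). With a finite correlation range, c0 and
    the rows of C are sparse, which is what enables the distributed solve."""
    mu = np.array([mean_fn(s) for s in sites])          # prior means at sites
    C = np.array([[cov_fn(si, sj) for sj in sites] for si in sites])
    c0 = np.array([cov_fn(s0, si) for si in sites])     # covariances with s0
    return mean_fn(s0) + c0 @ np.linalg.solve(C, y - mu)
```

A useful sanity check is the interpolation property: with noise-free data, predicting at an observation site returns the observed value exactly.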

Regarding (18b) and (18c), note that, as a by-product of the executions of the WEIGHTED LEAST SQUARES ALGORITHM performed in the distributed sequential parameter estimation of Section VI-A, at time , agent has available the component of the functions

Let us define by

Note that agent has access to the row of . By Proposition 5.4, we have

Finally, note that agent has access to both and

for all such that and are neighbors in . Therefore, we deduce the following result.

Proposition 6.1: For all and all , and can be cast as in step 12 of Table II, and are therefore computable by agent over .

Proof: If , then agent can compute all the nonvanishing components of and , since they correspond to the identifiers of other agents that must be its neighbors in . Since agent also has access to the corresponding components of and , the result follows.

Remark 6.2: (Execution of the Distributed Kriged Kalman Filter): It is reasonable to assume that the order of magnitude of the time required by individual agents to communicate and compute is smaller than that required to move. Additionally, according to the robotic sensor network model in Section III-B, measurements are only taken at instants of time in . These considerations, together with the observations made in Remark 5.5, lead us to assume that the distributed computations described in Sections VI-A and VI-B run on a time scale which is much faster than the time scale . These observations justify the asymptotic limits taken in steps 3–8 and 11 of Table II. We are currently addressing the characterization of the communication requirements for the algorithm execution. However, it should be noted that, compared with other message-passing algorithms, the present approach has minimal memory requirements at each agent, provides all agents with the same global information, and handles evolving network interaction topologies without any modification.

Remark 6.3: (Robustness to Agents’ Arrivals and Departures): The requirement in the DISTRIBUTED KRIGED KALMAN FILTER that is connected along the evolution can be relaxed as follows. From Remarks 5.2 and 5.3, it is clear that both the dynamic average consensus algorithms and the WEIGHTED LEAST SQUARES ALGORITHM are robust to changing numbers of network agents. As long as each agent knows the exact number of agents in its connected component of , it can perform the distributed data fusion steps described in Table II. Regarding the parameter estimation, different connected components will use different measurements, and hence will have different mean and covariance estimates of the parameter. As currently stated, the algorithm is then robust to agent deletion, while it is not robust to the addition of new agents that may join the connected component with possibly different parameter mean and covariance estimates. Regarding simple Kriging, since the spatial correlation range of the random field is smaller than the communication radius, each agent computes the same estimate of the spatial field and of its gradient on a neighborhood around its current location. Hence, at this stage, the algorithm is robust to agents’ arrivals and departures.

VII. DISTRIBUTED GRADIENT ASCENT OF SPATIAL FIELDS

The distributed estimation algorithm developed in the previous section can be used in conjunction with the motion capabilities of the robotic agents to perform a number of coordination tasks. In this section, we illustrate these possibilities by designing a distributed gradient ascent coordination algorithm to find the maxima of a spatial field.

At any instant of time , the DISTRIBUTED KRIGED KALMAN FILTER described in Table II allows agent to compute the expected value of both the spatial field and its gradient in the neighborhood of its location. With the information provided by the filter, each agent can then implement a gradient ascent strategy of the form

(19)

Note that, because new measurements are taken at time instants in , the resulting trajectory of agent is continuous and piecewise differentiable. The next result characterizes the asymptotic convergence properties of the distributed gradient ascent strategy when no measurement errors are present.
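A single agent's update under a strategy of the form (19) can be sketched as follows (the saturation to a control authority bound `umax` and the Euler step are illustrative assumptions, consistent with the bounded control authority of the network model in Section III-B; the gradient estimate would come from the filter):

```python
import numpy as np

def gradient_ascent_step(pos, grad_estimate, umax=0.1, dt=0.05):
    """One saturated gradient-ascent step: the agent moves along its
    current estimate of the field gradient, with speed clipped to the
    control authority bound umax."""
    g = np.asarray(grad_estimate, dtype=float)
    speed = np.linalg.norm(g)
    if speed > umax:
        g = g * (umax / speed)   # respect the bounded control authority
    return np.asarray(pos, dtype=float) + dt * g
```

Between measurement instants the gradient estimate is held fixed, so iterating this step produces the continuous, piecewise differentiable trajectories described above.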

Proposition 7.1: Let be a spatial Gaussian random field with continuously differentiable mean function and with compact superlevel sets. Consider a robotic sensor network that measures with no error, that is, for in (7). Then, any network trajectory evolving under (19) that starts from

satisfies

as , with probability one.

Proof: Let . Note that is invariant. This is a consequence of the fact that, for any given set of measurements of the spatial field , the vector field in (19) is continuously differentiable, and hence no two trajectories intersect. Let , and define

By hypothesis, is compact. For each , agent is guaranteed to increase the expected value of the spatial field along the time interval by following the gradient flow (19). Equivalently

Fig. 1. Distributed gradient ascent cooperative strategy (19) implemented by a robotic sensor network consisting of 14 agents. We depict the contour plot of the posterior mean of the spatial field. The black disks depict the (randomly generated) initial network configuration, while the gray disks depict the network configuration after 36 seconds.

Because by hypothesis there are no measurement errors, we have . Therefore, the sequence , where

is a submartingale [18], that is

Using now [24, Corollary 2], we conclude the result.

We have implemented the gradient ascent (19) in Mathematica® to illustrate its performance. The DISTRIBUTED KRIGED KALMAN FILTER is implemented as a single centralized program. Agents evolve according to the robotic network model described in Section III-B, with communication radius , agent control authority bounded by , and sensor error variance . During the execution, each agent makes use of the expected value of the gradient of the spatial field computed in step 13 of the DISTRIBUTED KRIGED KALMAN FILTER to follow the gradient ascent direction as specified in (19). We illustrate the performance of the closed-loop system in Fig. 1 with a static (i.e., not evolving in time) spatial field with mean , zero-mean small-scale variability , and covariance structure determined by if and otherwise. In the simulation, the agents initially know .
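The compactly supported covariance structure used in the simulation can be mimicked as follows (the specific taper shown is an illustrative assumption; the essential feature is that the covariance vanishes beyond the correlation range `r`, which is smaller than the communication radius):

```python
import numpy as np

def compact_cov(h, r=1.0, sigma2=1.0):
    """Illustrative compactly supported covariance: nonzero only when the
    separation distance h is below the correlation range r, so that each
    agent's covariance row involves only its r-disk neighbors."""
    h = np.asarray(h, dtype=float)
    c = sigma2 * (1.0 - h / r) ** 2     # illustrative polynomial taper
    return np.where(h < r, c, 0.0)
```

This finite correlation range is precisely what gives the covariance matrix the sparsity pattern exploited by the distributed JOR-based solves.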

According to Proposition 7.1, individual agents converge asymptotically to the set of expected critical points of the spatial field. However, if two or more agents tend to the same point in , then the numerical implementation becomes problematic, because the Bayesian universal Kriging computations are, in general, ill-posed on configurations in where two or more points coincide. Our simulation did not exhibit this problem because we did not run it for a sufficiently long time. One way to resolve this issue is to specify a threshold on how close the individual agents need to get to the set of critical points. Once the agents converging to the same critical point satisfy the threshold, they can decide, for instance, to form a circle and deploy uniformly around it.

VIII. CONCLUSION

We have considered a scenario where a robotic sensor network takes successive measurements of a dynamic physical process of interest modeled as a spatio-temporal random field. We have introduced a statistical framework to estimate the distribution of the random field and of its gradient. Under the assumptions of uncorrelation in time and limited-range correlation in space, we have developed the DISTRIBUTED KRIGED KALMAN FILTER, which enables the network to compute the predictive mean functions of the random field and of its gradient. We have illustrated the usefulness of the proposed algorithm by synthesizing a motion coordination strategy that makes network agents find critical points of the field with probability one in the case of no measurement noise. Numerous avenues of research appear worth pursuing. Among them, we highlight the consideration of more general statistical assumptions on the spatio-temporal field (e.g., measurements correlated in time, or unknown parameters in the covariance structure that must also be estimated from the data), the quantification of the communication requirements of the proposed algorithm, the extension of the convergence results of the gradient ascent algorithm to the case of measurement errors, and the design of distributed algorithms that maximize the information content of retrieved data.

ACKNOWLEDGMENT

The author wishes to thank Dr. A. Kottas for enlightening conversations.

REFERENCES

[1] N. A. C. Cressie, Statistics for Spatial Data, revised ed. New York: Wiley, 1993.

[2] M. L. Stein, Interpolation of Spatial Data: Some Theory for Kriging, Springer Series in Statistics. New York: Springer, 1999.

[3] K. V. Mardia, C. Goodall, E. J. Redfern, and F. J. Alonso, “The Kriged Kalman filter,” Test, vol. 7, no. 2, pp. 217–285, 1998.

[4] N. A. C. Cressie and C. K. Wikle, “Space-time Kalman filter,” in Encyclopedia of Environmetrics, A. H. El-Shaarawi and W. W. Piegorsch, Eds. New York: Wiley, 2002, vol. 4, pp. 2045–2049.

[5] S. Banerjee, A. E. Gelfand, and C. F. Sirmans, “Directional rates of change under spatial process models,” J. Amer. Stat. Assoc., vol. 98, no. 464, pp. 946–954, 2003.

[6] S. Sukkarieh, E. Nettleton, J.-H. Kim, M. Ridley, A. Goktogan, and H. Durrant-Whyte, “The ANSER project: Data fusion across multiple uninhabited air vehicles,” Int. J. Robot. Res., vol. 22, no. 7–8, pp. 505–539, 2003.

[7] P. Ögren, E. Fiorelli, and N. E. Leonard, “Cooperative control of mobile sensor networks: Adaptive gradient climbing in a distributed environment,” IEEE Trans. Automat. Control, vol. 49, no. 8, pp. 1292–1302, Aug. 2004.

[8] K. M. Lynch, I. B. Schwartz, P. Yang, and R. A. Freeman, “Decentralized environmental modeling by mobile sensor networks,” IEEE Trans. Robot., vol. 24, no. 3, pp. 710–724, Jun. 2008.

[9] N. E. Leonard, D. Paley, F. Lekien, R. Sepulchre, D. M. Fratantoni, and R. Davis, “Collective motion, sensor networks and ocean sampling,” Proc. IEEE, vol. 95, no. 1, pp. 48–74, Jan. 2007.

[10] L. Xiao, S. Boyd, and S. Lall, “A scheme for robust distributed sensor fusion based on average consensus,” in Proc. Symp. Inform. Processing Sensor Networks, Los Angeles, CA, Apr. 2005, pp. 63–70.

[11] D. P. Spanos, R. Olfati-Saber, and R. M. Murray, “Distributed sensor fusion using dynamic consensus,” in Proc. IFAC World Congress, Jul. 2005, [CD ROM].

[12] D. P. Spanos, R. Olfati-Saber, and R. M. Murray, “Dynamic consensus for mobile networks,” in Proc. IFAC World Congress, Prague, Czech Republic, Jul. 2005, [CD ROM].

[13] R. Olfati-Saber and J. S. Shamma, “Consensus filters for sensor networks and distributed sensor fusion,” in Proc. IEEE Conf. Decision Control, Seville, Spain, 2005, pp. 6698–6703.

[14] R. A. Freeman, P. Yang, and K. M. Lynch, “Stability and convergence properties of dynamic average consensus estimators,” in Proc. IEEE Conf. Decision Control, San Diego, CA, 2006, pp. 398–403.

[15] B. S. Y. Rao, H. F. Durrant-Whyte, and J. S. Sheen, “A fully decentralized multi-sensor system for tracking and surveillance,” Int. J. Robot. Res., vol. 12, no. 1, pp. 20–44, 1993.

[16] J. L. Speyer, “Computation and transmission requirements for a decentralized linear-quadratic-Gaussian control problem,” IEEE Trans. Automat. Control, vol. AC-24, no. 2, pp. 266–269, Apr. 1979.

[17] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods. Belmont, MA: Athena Scientific, 1997.

[18] L. C. G. Rogers and D. Williams, Diffusions, Markov Processes and Martingales, 2nd ed. Cambridge, U.K.: Cambridge Univ. Press, 2000, vol. 1.

[19] H. J. Kushner and G. G. Yin, Stochastic Approximation and Recursive Algorithms and Applications, 2nd ed. New York: Springer, 2003, vol. 35.

[20] F. Bullo, J. Cortés, and S. Martínez, Distributed Control of Robotic Networks, Applied Mathematics Series. Princeton, NJ: Princeton Univ. Press, 2009.

[21] B. D. O. Anderson and J. B. Moore, Optimal Filtering. Englewood Cliffs, NJ: Prentice Hall, 1979.

[22] P. S. Maybeck, Stochastic Models, Estimation, and Control. New York: Academic, 1979, vol. 1.

[23] F. E. Udwadia, “Some convergence results related to the JOR iterative method for symmetric, positive-definite matrices,” Appl. Math. Computation, vol. 47, no. 1, pp. 37–45, 1992.

[24] H. J. Kushner, “On the stability of stochastic dynamical systems,” PNAS, vol. 53, pp. 8–12, 1965.

Jorge Cortés (A’02–M’04–SM’06) received the Licenciatura degree in mathematics from the Universidad de Zaragoza, Zaragoza, Spain, in 1997, and the Ph.D. degree in engineering mathematics from the Universidad Carlos III de Madrid, Madrid, Spain, in 2001.

He was a Postdoctoral Research Associate at the Systems, Signals and Control Department, University of Twente, Enschede, The Netherlands (from January to June 2002), and at the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign (from August 2002 to September 2004). From 2004 to 2007, he was an Assistant Professor with the Department of Applied Mathematics and Statistics, University of California, Santa Cruz. He is currently an Associate Professor in the Department of Mechanical and Aerospace Engineering, University of California, San Diego. He is the author of Geometric, Control and Numerical Aspects of Nonholonomic Systems (Berlin, Germany: Springer Verlag, 2002) and co-author of Distributed Control of Robotic Networks (Princeton, NJ: Princeton University Press, 2009). His research interests focus on mathematical control theory, distributed motion coordination for groups of autonomous agents, and geometric mechanics and geometric integration.

Dr. Cortés received the 2006 Spanish Society of Applied Mathematics Young Researcher Prize and the 2008 IEEE Control Systems Magazine Outstanding Paper Award. He is currently an Associate Editor for the IEEE TRANSACTIONS ON AUTOMATIC CONTROL, Systems and Control Letters, and the European Journal of Control.