Estimating a CBRN atmospheric Release in a Complex Environment Using Gaussian Processes Adrien Ickowicz 1 , François Septier 2 , Patrick Armand 3 1 University of Lille / LAGIS UMR CNRS 8219, France 2 Institut Mines-Télécom / Télécom Lille1 / LAGIS UMR CNRS 8219, France 3 French Alternative Energies and Atomic Energy Commission (CEA/DAM/DIF), France Special Session : CBRNE Threats Modelling, Detection and Tracking 15th International Conference on Information Fusion July 12 th , 2012 - Singapore

Fusion 2012



Page 2: Fusion 2012

We are here −→ •

1 Introduction

2 Problem Formulation

3 Bayesian Inference

4 Numerical Results

5 Conclusion and future works

Adrien Ickowicz, François Septier , Patrick Armand Introduction 1/17

Page 3: Fusion 2012

Introduction

The threat of pollution from the release (accidental or deliberate) of CBRN agents is high!

(C: Chemical, B: Biological, R: Radiological, N: Nuclear)

To reduce the extent of human exposure ⇒ rapid detection of, and early response to, a release are needed.

The capability to estimate the source term characteristics is thus a problem of great importance!

Many algorithms have been proposed, but they are either:

too complex for real-time processing (Monte Carlo methods, etc.), or

not accurate in a complex flow field.

Proposition: Reconstruct the pollutant concentration field through a novel Gaussian process formulation of the problem ⇒ estimate the source term characteristics from it.


Page 4: Fusion 2012

2 Problem Formulation

Page 5: Fusion 2012

Dispersion Model

Many atmospheric dispersion models exist :

Plume

Puff

Particles

Figure: Different types of atmospheric dispersion models: plume models (top), puff models (middle) and particle models (bottom).

⇒ The most advanced of these is the so-called Lagrangian particle dispersion model, because it can take into account possible inhomogeneities in the flow and turbulence fields. Thousands of individual particles (fluid elements) are traced, and their distribution yields an estimate of the concentration field.


Page 6: Fusion 2012

Dispersion Model : Lagrangian Particle Dispersion Model

In this model, each individual particle (fluid element) follows the SDE:

$$dX_t^i = v(X_t^i, t)\,dt + \sigma(X_t^i, t)\,dw_t \qquad (1)$$

where $w_t$ is a standard Brownian motion and $v(X_t^i, t)$ represents the flow field.

The initial conditions of this SDE are given by the characteristics of the source term (position $l_s$, time $t_s$ and mass $q_s$ of the release). Owing to the complexity of such a model, the mean concentration field is obtained by numerically generating $N_p$ fluid elements and then computing:

$$C_\theta(x_n, y_n, t) \approx \frac{q_s}{N_p} \sum_{i=1}^{N_p} K\left(x_n, y_n, t \mid X_t^i(\theta)\right) \qquad (2)$$

where $\{X_t^i(\theta)\}_{i=1}^{N_p}$ are the trajectories of the $N_p$ fluid particles generated using Eq. (1) with $\theta = \{l_s, t_s, q_s\}$ as initial conditions, and $K(x_n, y_n, t \mid X_t^i(\theta))$ is a smoothing kernel used to obtain a concentration field that is continuous in time and space.
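Eqs. (1) and (2) can be sketched numerically as follows. This is only a minimal illustration under stated assumptions, not the dispersion code used in the talk: the SDE is integrated with a plain Euler-Maruyama scheme, the diffusion coefficient is taken constant, the smoothing kernel K is assumed to be an isotropic 2-D Gaussian, and the flow field `uniform_wind`, the function names and all numerical values are hypothetical.

```python
import numpy as np

def simulate_particles(source_xy, n_particles, n_steps, dt, velocity, sigma, rng):
    """Euler-Maruyama integration of dX = v(X, t) dt + sigma dW for all particles."""
    positions = np.tile(np.asarray(source_xy, dtype=float), (n_particles, 1))
    t = 0.0
    for _ in range(n_steps):
        drift = velocity(positions, t) * dt
        noise = sigma * np.sqrt(dt) * rng.standard_normal(positions.shape)
        positions = positions + drift + noise
        t += dt
    return positions

def concentration(query_xy, positions, mass, bandwidth):
    """Eq. (2) with an isotropic 2-D Gaussian smoothing kernel (an assumed choice of K)."""
    diff = np.asarray(query_xy, dtype=float)[None, :] - positions
    sq_dist = np.sum(diff**2, axis=1)
    kernel = np.exp(-0.5 * sq_dist / bandwidth**2) / (2.0 * np.pi * bandwidth**2)
    return mass * kernel.mean()

def uniform_wind(positions, t):
    # Hypothetical flow field: constant wind of 1 unit/s along x.
    return np.tile(np.array([1.0, 0.0]), (positions.shape[0], 1))

rng = np.random.default_rng(0)
final_positions = simulate_particles((0.0, 0.0), 5000, 100, 0.1, uniform_wind, 0.5, rng)
c_centre = concentration((10.0, 0.0), final_positions, mass=1.0, bandwidth=1.0)
c_far = concentration((-20.0, 0.0), final_positions, mass=1.0, bandwidth=1.0)
```

With this toy wind the cloud centre drifts to roughly x = 10 after 10 s, so `c_centre` is much larger than `c_far`. A realistic urban flow field would replace `uniform_wind` with an interpolated velocity field.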


Page 7: Fusion 2012

Observation Model

In this work, we assume that noisy measurements from $N_c$ sensors are available, i.e. for $j = 1, \ldots, N_c$ at time $t$:

$$Y_t^j = C_\theta(x^j, y^j, t) + \epsilon_t^j \quad \text{where} \quad \epsilon_t^j \sim \mathcal{N}(0, \sigma^2)$$

Figure: Illustration of the dispersion of a pollutant agent at several time instants (t = t_s + 5 s, + 250 s, + 350 s, + 500 s) and noisy measurements collected at 5 sensor locations. Measurement plot axes: time [s] versus concentration recorded at the sensor (×10⁻³).
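The observation model can be illustrated with a few lines of code. This is a sketch under assumptions: the noise-free concentrations in `clean`, the noise level and the function name `observe` are made up for illustration and do not come from the simulation shown in the slides.

```python
import numpy as np

def observe(true_concentrations, noise_std, rng):
    """Return Y = C + eps with eps ~ N(0, noise_std**2), independent per sensor and time."""
    c = np.asarray(true_concentrations, dtype=float)
    return c + noise_std * rng.standard_normal(c.shape)

rng = np.random.default_rng(1)
# Hypothetical noise-free concentrations: 5 sensors (rows) at 4 time instants (columns).
clean = np.zeros((5, 4))
clean[2, 1:] = [1e-3, 4e-3, 2e-3]
noisy = observe(clean, noise_std=5e-4, rng=rng)
```

Because the noise is additive and Gaussian, individual readings can be negative even though the true concentration is not.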


Page 8: Fusion 2012

3 Bayesian Inference

Page 9: Fusion 2012

Introduction

Aim:

1 Reconstruct the concentration field from all noisy measurements.

How can we estimate a continuous function in a Bayesian framework?

We propose to use a Gaussian process as the prior distribution on this continuous concentration field of interest.

2 Estimate the source location and time of release from this reconstructed concentration field.

Time estimation using change-point detection in the expansion:

$$E_n(t) = \sum_{i=1}^{n} \frac{C(x_i, t)}{f(t)}\, x_i^2 - \left( \sum_{i=1}^{n} \frac{C(x_i, t)}{f(t)}\, x_i \right)^2$$

Estimation algorithm (flow diagram): observed concentration $C_{1:n}$ → learning the Gaussian process, $(\alpha, \beta) = \arg\max \|C_{1:n} - GP(f)\|^2$ → cloud reconstruction, $C(x, t) = f_*$ → source location, $x_s = \arg\max C(x, t)$. The diagram labels the quantities passed between stages as $\alpha, \beta$ and $t$.


Page 10: Fusion 2012

Introduction : Gaussian Process

A Gaussian process (GP) is a possibly infinite set of scalar random variables $f(x)$, indexed by $x \in \mathbb{R}^d$ and taking values in $\mathbb{R}$, such that for any finite set of inputs $X = \{x_1, \ldots, x_N\} \in \mathbb{R}^{N \times d}$, $f = \{f(x_1), \ldots, f(x_N)\}$ is distributed according to a multivariate Gaussian distribution. A GP is therefore completely defined by its first two moments:

Mean function: $m(x) = E[f(x)]$

Covariance (or kernel) function: $\kappa_{1:N} = E\left[ (f(x_{1:N}) - m(x_{1:N}))(f(x_{1:N}) - m(x_{1:N}))^T \right]$, parametrized by $\alpha, \beta$.

⇒ A GP specifies a distribution over functions

Figure: Illustration of GP for regression. Panels: data collected; GP regression with a Matérn kernel; GP regression with a squared exponential isotropic kernel. Axes: input x, output y.

⇒ Results depend strongly on the choice of GP kernel
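The statement that a GP specifies a distribution over functions can be made concrete by drawing a few functions from a zero-mean GP prior. The sketch below assumes a squared exponential kernel of the form (1/α) exp(−|a − b|²/β²) on 1-D inputs; the hyperparameter values, the grid and the jitter term are illustrative choices, not values from the talk.

```python
import numpy as np

def se_kernel(a, b, alpha=1.0, beta=0.5):
    """Squared exponential kernel (1/alpha) * exp(-|a - b|^2 / beta^2) for 1-D inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / beta**2) / alpha

rng = np.random.default_rng(2)
x = np.linspace(-1.0, 1.0, 50)
# A small diagonal jitter keeps the Cholesky factorization numerically stable.
cov = se_kernel(x, x) + 1e-6 * np.eye(x.size)
chol = np.linalg.cholesky(cov)
# Each column is one function drawn from the zero-mean GP prior, evaluated on x.
samples = chol @ rng.standard_normal((x.size, 3))
```

Changing `beta` or swapping the kernel changes how smooth the sampled functions are, which is one way to see why the regression result depends on the kernel.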


Page 11: Fusion 2012

Bayesian Reconstruction of the pollutant cloud

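Noisy measurements of the concentration are collected at the sensor locations and times, denoted $x$:

$$y = C(x) + \epsilon \quad \text{where} \quad \epsilon \sim \mathcal{N}(0, \sigma^2 I)$$

Using the GP formulation for the concentration field to be reconstructed, we obtain the following joint distribution:

$$\begin{pmatrix} Y \\ C(x_*) \end{pmatrix} \sim \mathcal{N}\left( 0, \begin{pmatrix} \kappa(x, x) + \sigma^2 I & \kappa(x, x_*) \\ \kappa(x_*, x) & \kappa(x_*, x_*) \end{pmatrix} \right)$$

where $x_*$ stands for the location (in time and space) at which we want to predict the concentration.

Using the rule for Gaussian conditionals, $p(C(x_*) \mid Y = y)$ is Gaussian with:

Mean: $\kappa(x_*, x) \left[ \kappa(x, x) + \sigma^2 I \right]^{-1} y$

Covariance: $\kappa(x_*, x_*) - \kappa(x_*, x) \left[ \kappa(x, x) + \sigma^2 I \right]^{-1} \kappa(x, x_*)$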
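The conditional mean and covariance translate directly into a few lines of linear algebra. The following is a minimal sketch on 1-D inputs, assuming a squared exponential kernel; the training data, hyperparameters and function names are hypothetical.

```python
import numpy as np

def se_kernel(a, b, alpha=1.0, beta=0.3):
    """Squared exponential kernel (1/alpha) * exp(-|a - b|^2 / beta^2) for 1-D inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / beta**2) / alpha

def gp_posterior(x_train, y_train, x_star, noise_var, kernel):
    """Mean and covariance of C(x_star) given noisy observations y_train at x_train."""
    k_xx = kernel(x_train, x_train) + noise_var * np.eye(x_train.size)
    k_sx = kernel(x_star, x_train)   # cross-covariance between x_star and x_train
    k_ss = kernel(x_star, x_star)
    # Solve linear systems rather than forming the explicit inverse.
    mean = k_sx @ np.linalg.solve(k_xx, y_train)
    cov = k_ss - k_sx @ np.linalg.solve(k_xx, k_sx.T)
    return mean, cov

x_train = np.array([-0.8, -0.3, 0.1, 0.6])
y_train = np.array([0.2, 1.0, 0.7, 0.1])
x_star = np.linspace(-1.0, 1.0, 21)
mean, cov = gp_posterior(x_train, y_train, x_star, noise_var=1e-4, kernel=se_kernel)
```

The diagonal of `cov` gives the predictive variance at each prediction point; it shrinks near the training inputs and grows away from them.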


Page 12: Fusion 2012

Bayesian Reconstruction of the pollutant cloud : GP Kernel Choice

The choice of the GP kernel is crucial for accurate predictions.

The most classical choice is the isotropic kernel with a squared exponential (SE):

$$\kappa_{iso}(x, x') = \frac{1}{\alpha} \exp\left( -\frac{\|x - x'\|^2}{\beta^2} \right)$$

⇒ SE decreases with the Euclidean distance between $x$ and $x'$

⇒ Not really appropriate for our concentration field, especially in the presence of a complex flow field.

Proposition: Use information from the flow field to design a kernel specific to this problem:

$$\kappa_{dyn}(x, x') = \frac{1}{\sigma(t, t')} \exp\left( -\frac{d_s(x, x')}{2\sigma(t, t')^2} \right)$$

where we have:

$$d_s(x, x') = \left( x' - s_{x,t}(t') \right)^2 + \left( x - s_{x',t'}(t) \right)^2$$

$$\sigma(t, t') = \alpha \times |t - t'|^{\beta} + 1$$

in which $s_{x_0,t_0}(t')$ is the solution at time $t'$ of the following ODE system:

$$\dot{x} = v(x, t) \ \text{(flow field)}, \qquad x(t_0) = x_0$$
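One possible way to evaluate the drift-dependent kernel is sketched below. It is an illustration under assumptions rather than the authors' implementation: the flow map $s_{x_0,t_0}(t')$ is approximated with a fixed-step classical Runge-Kutta (RK4) integrator, spatial positions and times are passed as separate arguments, and the flow field `uniform_wind` and the values of `alpha` and `beta` are hypothetical.

```python
import numpy as np

def flow_map(x0, t0, t1, velocity, n_steps=50):
    """Approximate s_{x0,t0}(t1): solve x' = v(x, t), x(t0) = x0 with classical RK4."""
    x = np.asarray(x0, dtype=float)
    h = (t1 - t0) / n_steps  # negative when integrating backwards in time
    t = t0
    for _ in range(n_steps):
        k1 = velocity(x, t)
        k2 = velocity(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = velocity(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = velocity(x + h * k3, t + h)
        x = x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t += h
    return x

def drift_kernel(x, t, x_prime, t_prime, velocity, alpha=0.1, beta=1.0):
    """Drift-dependent kernel between the space-time points (x, t) and (x_prime, t_prime)."""
    x = np.asarray(x, dtype=float)
    x_prime = np.asarray(x_prime, dtype=float)
    sigma = alpha * abs(t - t_prime) ** beta + 1.0
    advected = flow_map(x, t, t_prime, velocity)              # s_{x,t}(t')
    advected_prime = flow_map(x_prime, t_prime, t, velocity)  # s_{x',t'}(t)
    d_s = np.sum((x_prime - advected) ** 2) + np.sum((x - advected_prime) ** 2)
    return np.exp(-d_s / (2.0 * sigma**2)) / sigma

def uniform_wind(x, t):
    # Hypothetical flow field: constant wind of 2 units/s along x.
    return np.array([2.0, 0.0])

# Two points on the same streamline are strongly correlated...
k_along = drift_kernel([0.0, 0.0], 0.0, [10.0, 0.0], 5.0, uniform_wind)
# ...while a point at the same Euclidean distance across the wind is not.
k_across = drift_kernel([0.0, 0.0], 0.0, [0.0, 10.0], 5.0, uniform_wind)
```

In this toy case both second points lie 10 units from the origin, so an isotropic SE kernel would treat them identically, whereas the drift-dependent kernel favours the point that the flow actually reaches.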


Page 13: Fusion 2012

Bayesian Reconstruction of the pollutant cloud : GP Kernel Choice


Figure: Illustration of our proposed drift-dependent kernel (last column) versus the SE isotropic kernel (middle column). 3 sensors located at (109.59; 124.97), (96.97; 59.74) and (75; 35) give measurements every 30 s.


Page 14: Fusion 2012

Source Term estimation

With this GP formulation, we can reconstruct an approximation of the concentration field at any point in space and time. Logically, the source location in space and time would correspond to the location of the maximum of the reconstructed concentration field, i.e.:

$$(x_s, t_s) = \arg\max_{x,t} C(x, t)$$

However, there are two main issues:

with a GP, this maximum has a high probability of lying close in space and time to a sensor location;

this maximization is very complex, since the concentration has to be reconstructed in a high-dimensional space (different times and locations).

Proposition:

Estimate the time of release, $t_s$, via a cloud expansion analysis.

Estimate the source location by maximizing the GP-reconstructed concentration field at this estimated time:

$$x_s = \arg\max_{x} C(x, t_s)$$
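The second step of this proposition, maximizing the reconstructed field at a fixed estimated time, can be sketched as a simple grid search. The function `toy_field` below is a hypothetical stand-in for the GP posterior mean, and the grid resolution is an arbitrary choice; a real implementation would evaluate the GP prediction on the grid instead.

```python
import numpy as np

def locate_source(field, t_release, x_grid, y_grid):
    """Return the grid point that maximizes field(x, y, t_release)."""
    xx, yy = np.meshgrid(x_grid, y_grid, indexing="ij")
    values = field(xx, yy, t_release)
    i, j = np.unravel_index(np.argmax(values), values.shape)
    return x_grid[i], y_grid[j]

def toy_field(x, y, t):
    # Hypothetical reconstructed field: a puff centred at (115, 10) at t = 150 s
    # that drifts along x and spreads afterwards.
    spread = 5.0 + 0.1 * (t - 150.0)
    centre_x = 115.0 + 0.2 * (t - 150.0)
    return np.exp(-((x - centre_x) ** 2 + (y - 10.0) ** 2) / (2.0 * spread**2)) / spread**2

x_grid = np.linspace(0.0, 200.0, 201)
y_grid = np.linspace(0.0, 100.0, 101)
x_s, y_s = locate_source(toy_field, 150.0, x_grid, y_grid)
```

Fixing the time first reduces the search to the spatial grid only, which is the point of estimating the time of release separately. If the time estimate is too late, the maximum is found at the drifted cloud centre rather than at the source.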


Page 15: Fusion 2012

Source Term estimation : Time of Release

Time of Release Estimation via cloud expansion analysis

For the time of release estimation, we propose to analyse the spatial expansion of the cloud via this expression:

$$E(t) = \int_{\Omega} C(x, t) \left( x - \int_{\Omega} y\, C(y, t)\, dy \right)^2 dx$$

This expression is zero at the time of release (and before it, unless there is some non-zero background noise). However, the concentration field $C(x, t)$ is unknown, so one can use either the GP prediction or the noisy sensor measurements directly.

Figure: Evolution of the true expansion and of the one obtained from noisy measurements, for a source released at t = 150 s (with non-zero background noise).
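The expansion integral can be approximated on a grid, as in the sketch below. It is a 1-D illustration under assumptions: the integrals are replaced by Riemann sums on a uniform grid, and `toy_cloud` is a hypothetical unit-mass Gaussian cloud (so that the inner integral is its centroid) that appears at t = 150 s and then spreads.

```python
import numpy as np

def expansion(conc, x_grid):
    """Riemann-sum approximation of E = int C(x) (x - int y C(y) dy)^2 dx on a uniform grid."""
    dx = x_grid[1] - x_grid[0]
    centroid = np.sum(x_grid * conc) * dx
    return np.sum(conc * (x_grid - centroid) ** 2) * dx

def toy_cloud(x_grid, t, t_release=150.0):
    # Hypothetical unit-mass cloud: absent up to the release time, then a Gaussian
    # whose standard deviation grows like sqrt(t - t_release).
    if t <= t_release:
        return np.zeros_like(x_grid)
    std = np.sqrt(t - t_release)
    return np.exp(-0.5 * ((x_grid - 50.0) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

x_grid = np.linspace(-100.0, 200.0, 3001)
times = np.array([100.0, 150.0, 175.0, 250.0])
e_values = np.array([expansion(toy_cloud(x_grid, t), x_grid) for t in times])
```

For this toy cloud the expansion equals the variance of the Gaussian, so it stays at zero up to the release time and then grows linearly, which is the change that a change-point detector would look for.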


Page 16: Fusion 2012

4 Numerical Results

Page 17: Fusion 2012

Scenario

To study the performance of the proposed approach, we simulate a pollutant source located at (x_s = 115, y_s = 10) and released at time t_s = 150 s in a city with a complex flow field.


Page 18: Fusion 2012

Numerical Results : Time of Release

To estimate the time of release from the cloud expansion, we compare:

4 change-point detection algorithms:

Penalized maximum likelihood estimation (PML)

Leave-one-out empirical risk minimization (LOO)

V-fold empirical risk minimization (ERM-VF)

Birgé-Massard empirical risk minimization (ERM-BM)

direct minimization of the log-expansion (mind)

Method    Mean    Std dev.    Empirical 0.95 CI
mind      3:22    2:42        [0:54, 9:27]
LOO       5:32    6:08        [1:03, 15:00]
ERM-VF    5:22    3:57        [1:12, 15:00]
ERM-BM    5:34    6:00        [2:15, 15:00]
PML       2:57    1:42        [1:06, 6:03]

Table: Comparison between the change-point estimators with 5 sensors.
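To give a flavour of what a change-point estimator does with an expansion signal, the sketch below implements a generic least-squares detector for a single change in mean. It is a deliberately simple stand-in and is not one of the PML, LOO, ERM-VF, ERM-BM or mind estimators compared above; the function name and the synthetic signal are hypothetical.

```python
import numpy as np

def single_change_point(signal):
    """Index tau that minimizes the squared error of a two-segment piecewise-constant fit."""
    y = np.asarray(signal, dtype=float)
    best_tau, best_cost = None, np.inf
    for tau in range(1, y.size):
        left, right = y[:tau], y[tau:]
        cost = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if cost < best_cost:
            best_tau, best_cost = tau, cost
    return best_tau

rng = np.random.default_rng(3)
# Synthetic signal: background level for 30 samples, then a shifted level for 20 samples.
step = np.concatenate([np.zeros(30), np.ones(20)])
tau_clean = single_change_point(step)
tau_noisy = single_change_point(step + 0.05 * rng.standard_normal(step.size))
```

The returned index is the first sample of the second segment. With sensor readings taken at regular intervals, it would be converted to a time by multiplying by the sampling period.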


Page 19: Fusion 2012

Numerical Results : Time of Release


Figure: Value of the estimator of t_0 as a function of the number of sensors.

PML provides good estimates owing to its greater sensitivity to small changes in the signal (but it more often detects changes that are too early).


Page 20: Fusion 2012

Numerical Results : Position of Release

Performance of the position estimation method, which maximizes the GP reconstruction at the time estimated by the previous approaches (5, 20 and 50 sensors).

True position: x0 = 115, y0 = 10

Estimation with the isotropic kernel:

Method    Sensors    x0        y0       σ(x0)    σ(y0)
mind      5          68.97     62.58    42.82    38.96
mind      20         97.13     26.37    27.64    26.08
mind      50         104.47    21.60    28.94    19.47
PML       5          81.90     65.37    41.52    40.19
PML       20         85.40     47.80    40.77    43.51
PML       50         94.77     37.01    42.00    40.25
ERM-BM    5          69.54     54.78    44.13    36.08
ERM-BM    20         83.54     42.24    39.56    41.46
ERM-BM    50         96.66     34.70    39.13    38.79
LOO       5          69.21     54.24    44.02    35.83
LOO       20         82.61     47.91    42.35    43.58
LOO       50         96.28     36.74    40.95    39.23

Estimation with the drift-dependent kernel:

Method    Sensors    x0        y0       σ(x0)    σ(y0)
mind      5          108.94    12.21    42.00    17.05
mind      20         120.28    5.28     12.50    4.64
mind      50         114.51    6.48     6.37     3.07
PML       5          97.69     18.65    30.91    13.65
PML       20         110.37    9.53     7.32     6.30
PML       50         113.44    6.56     3.23     3.23
ERM-BM    5          84.03     24.33    38.24    16.83
ERM-BM    20         111.32    9.15     7.73     5.94
ERM-BM    50         113.36    6.64     3.26     3.26
LOO       5          84.23     24.52    37.09    16.51
LOO       20         110.28    9.81     6.54     6.28
LOO       50         113.31    6.65     3.26     3.28

⇒ Significant improvement with the proposed drift-dependent kernel!

Remark: Using the PML estimate does not give better performance, because an earlier time of release is quite often estimated.


Page 21: Fusion 2012

5 Conclusion and future works

Page 22: Fusion 2012

Conclusion and future works

Conclusion :

⋆ Development of a novel and faster methodology for source term estimation.

⋆ To improve the GP reconstruction, a drift-dependent kernel that takes the flow field into account has been proposed.

⇒ Significant improvement compared with more classical GP kernels.

⋆ An expression for the spatial cloud expansion has been derived and used to estimate the time of release.

Future works:

Improvement of the time-of-release estimation procedure.

Simulation with a real dataset.

Acknowledgments: Research sponsored by the CEA and GIS 3SGS.


Page 23: Fusion 2012

Thank you for your attention