Detection and Characterization of Chemical Vapor

Fugitive Emissions by Nonlinear Optimal Estimation:

Theory and Simulation

Christopher M. Gittins

Physical Sciences Inc., 20 New England Business Center, Andover, MA 01810, USA

[email protected]

This paper addresses detection and characterization of chemical vapor fugitive emissions

in a non-scattering atmosphere by processing of remotely-sensed long wavelength infrared

spectra. The analysis approach integrates a parameterized signal model based on the

radiative transfer equation with a statistical model for the infrared background. The

maximum likelihood model parameter values are defined as those which maximize a

Bayesian posterior probability and are estimated using a Gauss-Newton algorithm. For

algorithm performance evaluation we simulate observation of fugitive emissions by

augmenting plume-free measured spectra with synthetic plume signatures. As plumes

become optically-thick, the Gauss-Newton algorithm yields significantly more accurate

estimates of chemical vapor column density and significantly more favorable plume

detection statistics than clutter-matched-filter-based and adaptive-subspace-detector-based

plume characterization and detection.

OCIS codes: 010.5620 Radiative transfer

110.4234 Multispectral and hyperspectral imaging

150.1135 Machine vision algorithms

280.1120 Air pollution monitoring

280.4991 Passive remote sensing

300.6340 Spectroscopy, infrared.

1. Introduction

Long wavelength infrared (LWIR) spectrometry has been employed for several decades for

remote sensing of chemical vapor fugitive emissions. This paper addresses detection and

characterization of fugitive emissions in a non-scattering atmosphere by processing of LWIR

spectra collected in an open path configuration. The analysis approach employs a parameterized

signal model derived from the radiative transfer equation (RTE) and applies estimation theory to

determine the maximum likelihood parameter values. In this regard, it bears some similarity to

estimation methods developed for atmospheric profile retrieval using data collected by

atmospheric sounding spectrometers. In both cases, the at-sensor spectral radiance is a nonlinear

function of atmospheric temperature and constituent profiles and retrieval of the properties of

interest requires inverse solution of the RTE. Because inverse solution of the RTE is

mathematically ill-posed, the physical properties of interest must be inferred using estimation

theory. The use of estimation theory for atmospheric profile retrieval was demonstrated by

Rodgers in 1976 [1]. Variations on and refinements of Rodgers’ implementation have been

made by numerous authors in the decades since, see, e.g., [2-7]. The estimation algorithm

presented here has characteristics in common with methods described in those publications. The

RTE used here is well-suited for fugitive emissions detection from short range (<5 km) where

observations are made with a horizontal or near-horizontal line-of-sight to the area of interest.

That noted, the nonlinear estimation approach is generally-applicable and may be adapted to

address other observation scenarios by modifying the RTE and/or the statistical model for the

background.

We define the maximum likelihood model parameter values as those which maximize a

Bayesian posterior probability. The posterior probability distribution function (pdf) is the

product of a conditional pdf for the observed spectrum and a prior pdf for infrared background

spectra. Accounting for infrared background characteristics using a prior pdf rather than a

physics-based model is both statistically-justifiable, i.e., no information is lost in an information

theoretical sense, and a practical necessity in order to make the estimation problem

computationally tractable. The signal model for measured spectra is a nonlinear function of the

model parameters and we use a Gauss-Newton algorithm to determine the maximum likelihood

parameter values.

In addition to presenting the signal model and parameter estimation algorithm, we

address algorithm initialization and derive the uncertainties associated with the estimated

parameter values. For algorithm validation, we simulate observation of fugitive emissions by

augmenting plume-free measured spectra with synthetic plume signatures using a process which

preserves the noise characteristics of the original data. We compare the results obtained by

processing the simulated data with the Gauss-Newton algorithm with those obtained by

processing the data with an adaptive subspace detector (for plume detection) and a clutter-

matched filter (for estimation of path-integrated vapor concentration), both of which are derived

from a linear additive signal model which follows from an approximation to the underlying RTE

[8-13]. The purpose of simulation in this work is not to test the validity of the clear air

approximation but, postulating that the underlying atmospheric model is correct for the cases

evaluated, to assess the effect of using the approximate, linearized RTE rather than the exact

RTE as the basis for chemical vapor detection and characterization. Comparing results obtained

for simulated observations in a uniform, non-scattering atmosphere is the simplest test case.

Readers will note that the signal model used for nonlinear estimation is easily modified to

address observations in more complex atmospheres.

While nonlinear estimation and estimation based on the linear, additive signal model

produce essentially identical results when applied to optically-thin plumes, nonlinear estimation

provides significantly more accurate estimates of path-integrated chemical vapor concentration

(column density) and significantly more favorable detection statistics when vapor plumes are

optically-thick at one or more of the sensor observation wavelengths. Simulations indicate that

nonlinear estimation utilizing a Gauss-Newton algorithm can reduce column density estimation

error by an order of magnitude and reduce false alarm rates by several orders of magnitude

relative to the clutter-matched filter for a plume optical density as low as 1.0, i.e.,

$\text{transmission} = \exp(-1.0)$ at the wavelength of strongest absorption. As expected, the

performance of the Gauss-Newton algorithm relative to the linear estimators improves with

increasing plume optical density, i.e., as the optically-thin plume approximation breaks down.

2. Algorithm Formulation

2.1. Radiative Transfer Model

The radiative transfer model which underlies the analysis approach presented here follows from

the stratified atmosphere approximation: the atmosphere along the sensor’s line-of-sight is

modeled as n layers with each layer having uniform temperature, pressure, and chemical

composition [14]. The concept is illustrated in Figure 1. Under this model, the monochromatic

at-sensor spectral radiance at the interface of Layer i and Layer i+1 is [15]:

$$R_i(\lambda) = R_{i-1}(\lambda)\cdot\tau_i(\lambda) + L_i(\lambda)\cdot\left[1-\tau_i(\lambda)\right] \tag{1}$$

where $\tau_i$ is the transmission of Layer i and $L_i$ is the blackbody (Planck) function evaluated at the

thermodynamic temperature of Layer i, $T_i$. If $R_0$ arises from a solid background then it is the

surface-leaving radiance, i.e., $R_0$ includes reflected downwelling radiance as well as surface

emission, not simply the blackbody function evaluated at the surface temperature.
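
The layer recursion of Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration, assuming wavelengths in µm and radiance in W m⁻² sr⁻¹ µm⁻¹; the function names `planck` and `propagate` and the example layer values are hypothetical and are not taken from the analysis described here.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
KB = 1.380649e-23    # Boltzmann constant, J/K

def planck(wavelength_um, temp_k):
    """Blackbody spectral radiance in W m^-2 sr^-1 um^-1."""
    lam = wavelength_um * 1e-6
    x = H * C / (lam * KB * temp_k)
    return 2.0 * H * C**2 / lam**5 / math.expm1(x) * 1e-6

def propagate(r0, layers, wavelength_um):
    """Apply Eq. (1) layer by layer.

    layers is a list of (transmission, temperature_K) pairs ordered from
    the background toward the sensor; r0 is the radiance entering Layer 1.
    """
    r = r0
    for tau, temp in layers:
        r = r * tau + planck(wavelength_um, temp) * (1.0 - tau)
    return r

# Example (assumed values): a 300 K blackbody background viewed at 10 um
# through two cooler, partially transmitting layers.
r_sensor = propagate(planck(10.0, 300.0), [(0.9, 295.0), (0.8, 290.0)], 10.0)
```

Two limiting cases are easy to check with this sketch: a fully transmitting path returns the background radiance unchanged, and an opaque layer replaces it with that layer's blackbody radiance.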

When a vapor plume is present at Layer p, the at-sensor spectral radiance is:

$$R_s = (1-\tau_a)\cdot L_a + \tau_a\cdot\left[\tau_p\cdot R_b + (1-\tau_p)\cdot L_p\right] \tag{2}$$

where $\tau_a$ is the atmospheric transmission from the sensor along the line-of-sight to the plume, $L_a$

is the path-weighted blackbody radiance from the sensor to Layer p, $\tau_p$ is the plume

transmission, $R_b$ is the background spectral radiance incident on the plume layer, and $L_p$ is the

blackbody function evaluated at the plume temperature. (Note that the wavelength dependence

of the quantities in Eq. (2) has been suppressed for clarity.) The plume transmission follows

Beer’s law. In this work we consider a plume consisting of a single vapor and

$$\tau_p = \exp\left[-\kappa\cdot\rho\right] \tag{3}$$

where $\kappa$ is the absorption cross-section of the chemical vapor and $\rho$ is its column density.

Absorption cross-sections are temperature- and pressure-dependent; however, those

dependencies are weak near 300 K and 1 atm and we ignore them here.

Eq. (2) has formed the basis for multiple fugitive emissions detection and

characterization approaches, see, e.g., [8,16-18]. When the atmosphere is homogeneous along

the line-of-sight to the plume, $L_a$ is simply the blackbody function evaluated at the air

temperature. Methods for calculating $L_a$ along an inhomogeneous path are described by Clough,

et al. [19] and Rodgers [15]. In those treatments, $L_a$ corresponds to the blackbody function

evaluated at an effective temperature derived from the characteristics of the path along the

sensor’s line-of-sight. Although treating the atmosphere as a single uniform layer here may

appear to be a crude approximation likely to introduce significant error in estimated plume

characteristics, Sheen, et al.’s analysis of background and atmospheric variability effects [20]

suggests that this is generally not the case for short range detection: background variability is

generally the primary source of uncertainty and error.

Following Eqs. (1) and (2), the at-sensor radiance in the absence of the plume is:

$$R_s^{(0)} = (1-\tau_a)\cdot L_a + \tau_a\cdot R_b \tag{4}$$

and the change in at-sensor radiance due to the plume is:

$$R_s - R_s^{(0)} = (1-\tau_p)\cdot\left[\left(L_a - R_s^{(0)}\right) + \tau_a\cdot\left(L_p - L_a\right)\right] \tag{5}$$

Note that when the temperature of the plume is equal to the atmospheric temperature then

$L_p - L_a = 0$ and the change in at-sensor radiance is independent of the atmospheric transmission,

i.e., atmospheric compensation is not necessary to estimate plume properties.
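
The relationship between Eqs. (2), (4), and (5) can be checked numerically at a single wavelength. The sketch below is illustrative only: the function names and the radiance values are assumptions chosen for the example, and all quantities are scalars.

```python
def at_sensor_radiance(tau_a, l_a, tau_p, l_p, r_b):
    """Eq. (2): radiance at the sensor with a plume in the line of sight."""
    return (1.0 - tau_a) * l_a + tau_a * (tau_p * r_b + (1.0 - tau_p) * l_p)

def plume_free_radiance(tau_a, l_a, r_b):
    """Eq. (4): radiance at the sensor without the plume."""
    return (1.0 - tau_a) * l_a + tau_a * r_b

def radiance_change(tau_a, l_a, tau_p, l_p, r_b):
    """Eq. (5): change in at-sensor radiance caused by the plume."""
    r_s0 = plume_free_radiance(tau_a, l_a, r_b)
    return (1.0 - tau_p) * ((l_a - r_s0) + tau_a * (l_p - l_a))

# Assumed example values (arbitrary radiance units).
tau_a, l_a, tau_p, l_p, r_b = 0.8, 9.9, 0.6, 10.4, 8.5
direct = at_sensor_radiance(tau_a, l_a, tau_p, l_p, r_b) - plume_free_radiance(tau_a, l_a, r_b)
from_eq5 = radiance_change(tau_a, l_a, tau_p, l_p, r_b)

# With the plume at air temperature, only the measured plume-free radiance matters.
r_s0 = plume_free_radiance(tau_a, l_a, r_b)
isothermal = radiance_change(tau_a, l_a, tau_p, l_a, r_b)
```

In the isothermal case the result reduces to `(1 - tau_p) * (l_a - r_s0)`, so the atmospheric transmission enters only through the plume-free radiance that the sensor measures directly.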

For the remainder of this document we consider the case where the air temperature is

uniform and known and the plume temperature is equal to the air temperature. (In general,

atmospheric transmission effects may be ignored when the effective thermal contrast between the

plume and the air is much less than the effective thermal contrast between the air and the

background.) These constraints are not essential for the analysis approach to be valid; however,

they facilitate a more concise presentation of the analysis approach. We note in the text where

relaxing these constraints affects the details of the parameter estimation algorithm. In practice, if

it is necessary or desirable to presume a known and uniform plume temperature, there are

methods for making a reasonable estimate. The simplest approach is to presume that it is equal

to the local atmospheric temperature. Alternatively, depending upon the characteristics of the

measured radiance spectra, the atmospheric temperature in the vicinity of the plume may be

estimated from the spectral radiance at wavelengths where atmospheric water vapor (or other

atmospheric constituent which is reasonably well-mixed over the sensor’s line-of-sight) becomes

optically-thick over a range comparable to the estimated distance to the plume.

2.2. Sensor Signal Model

The equations in the preceding section apply to monochromatic radiation. In developing the

signal model for describing measured spectra it is necessary to address the effects of the sensor’s

spectral resolution. In the absence of measurement noise, the apparent spectral radiance at

acquisition wavelength $\lambda_s$ is:

$$x(\lambda_s) = \int R_s(\lambda)\cdot g(\lambda;\lambda_s)\cdot d\lambda \tag{6}$$

where $R_s(\lambda)$ is as per Eq. (4) or (5), and the instrument lineshape function, $g(\lambda;\lambda_s)$, is

normalized such that $\int g(\lambda;\lambda_s)\cdot d\lambda = 1$. Combining Eqs. (5) and (6), the change in

sensor-measured spectral radiance due to the plume is:

$$x_p(\lambda_s) - x_0(\lambda_s) = \int \left[1-\tau_p(\lambda)\right]\cdot\left[L_a(\lambda) - R_s^{(0)}(\lambda)\right]\cdot g(\lambda;\lambda_s)\cdot d\lambda \tag{7}$$

where $x_p$ and $x_0$ are the band-averaged spectral radiances as per Eq. (6) with and without the

plume present, respectively. In order to facilitate signal model parameterization and

computationally-efficient parameter estimation, we approximate Eq. (7) as:

$$x_p - x_0 = \left[1-\tau_e\right]\cdot\left[L_a - x_0\right] \tag{8}$$

where $L_a = L_a(\lambda_s)$ and the effective plume transmission is

$$\tau_e = \exp\left(-\alpha\cdot\rho_0\cdot\bar{\kappa}\right) \tag{9}$$

The quantity $\bar{\kappa}$ is the vapor absorption cross-section averaged over the instrument lineshape

$$\bar{\kappa}(\lambda_s) = \int \kappa(\lambda)\cdot g(\lambda;\lambda_s)\cdot d\lambda . \tag{10}$$

The quantity $\rho_0$ is a reference column density defined to make $\alpha$ a unitless quantity, $\alpha \equiv \rho/\rho_0$.

We choose $\rho_0 = 1/\max\{\kappa(\lambda)\}$ so that $\alpha$ corresponds to the plume optical density at the strongest

absorption feature in the high resolution spectrum. The approximation in Eq. (9) is addressed

further in Section 4. Parameterization of the plume-free background spectral radiance, $x_0$, is

addressed in the following Section.
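
To illustrate what the effective transmission of Eq. (9) approximates, the sketch below compares it with the exact lineshape-averaged transmission for a single band. Everything specific in it is an assumption made for illustration: the absorption feature is a hypothetical narrow Gaussian (not R-134a data), the lineshape is a Lorentzian with 0.08 µm FWHM renormalized on a finite grid, and the function names are invented for the example.

```python
import math

def lorentzian(lam, lam_s, fwhm):
    """Lorentzian lineshape centered at lam_s."""
    half = 0.5 * fwhm
    return half / (math.pi * ((lam - lam_s) ** 2 + half ** 2))

def band_average(values, grid, lam_s, fwhm):
    """Discrete lineshape-weighted mean on a uniform grid (cf. Eqs. (6), (10))."""
    weights = [lorentzian(lam, lam_s, fwhm) for lam in grid]
    total = sum(weights)  # renormalize the truncated lineshape to unit area
    return sum(w * v for w, v in zip(weights, values)) / total

def transmissions(alpha, kappa, grid, lam_s, fwhm):
    """Return (exact band-averaged transmission, effective transmission of Eq. (9))."""
    rho0 = 1.0 / max(kappa)                  # reference column density
    rho = alpha * rho0
    exact = band_average([math.exp(-rho * k) for k in kappa], grid, lam_s, fwhm)
    kappa_bar = band_average(kappa, grid, lam_s, fwhm)
    effective = math.exp(-alpha * rho0 * kappa_bar)
    return exact, effective

# Hypothetical absorption feature at 9.0 um on an 8-10 um grid.
grid = [8.0 + 0.002 * i for i in range(1001)]
kappa = [math.exp(-0.5 * ((lam - 9.0) / 0.02) ** 2) for lam in grid]

thin = transmissions(0.01, kappa, grid, 9.0, 0.08)
thick = transmissions(3.0, kappa, grid, 9.0, 0.08)
```

For this assumed feature the two quantities agree closely when the plume is optically thin and separate as the optical density grows, with the effective transmission never exceeding the exact band average (a consequence of the convexity of the exponential). How large the gap is in practice depends on how narrow the real absorption features are relative to the instrument lineshape.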

From this point forward, measured spectra are treated as k-dimensional vectors:

$$\tilde{\mathbf{x}} = \mathbf{x} + \mathbf{e} \tag{11}$$

where $\mathbf{x}$ denotes the vector of the noise-free spectral radiance from Eq. (6) and $\mathbf{e}$ denotes

measurement noise. (The tilde denotes a noisy measurement.) In developing the analysis

approach, we presume that the measurement noise is normally-distributed with zero mean and is

uncorrelated from band to band, $\mathbf{e} \sim N(\mathbf{0},\mathbf{D})$ where $\mathbf{D}$ is a diagonal matrix. The diagonal

elements of $\mathbf{D}$ are the variances, $\sigma_i^2$, in each band resulting from sensor noise and

spectral clutter. We do not presume that the noise variance is equal in all bands.

2.3. Infrared Background Model

There are many reasonable approaches to modeling and parameterization of infrared

backgrounds. We use a factor analysis-based model [21] because it facilitates definition of a

Gaussian prior pdf for model parameter values. This in turn simplifies implementation of the

parameter estimation algorithm. The model is summarized in Appendix A and is similar to the

Probabilistic Principal Components model described by Tipping and Bishop [22]. Briefly, the

model presumes that, in the absence of a vapor plume, spectra may be described by a linear

mixing model:

$$\mathbf{x} = \boldsymbol{\mu} + \mathbf{B}\boldsymbol{\beta} \tag{12}$$

where $\boldsymbol{\mu}$ is the mean background spectrum, $\mathbf{B}$ is the k × m dimensional matrix whose columns

are the basis vectors used to span the data space and $\boldsymbol{\beta}$ is an m × 1 vector of weight coefficients.

The key detail on the implementation of Eq. (12) is that the $\boldsymbol{\beta}$ vectors for the sample are

presumed to be uncorrelated, have zero mean and unit standard deviation:

$$\boldsymbol{\beta} \sim N(\mathbf{0},\mathbf{I}_m) \tag{13}$$

where $\mathbf{I}_m$ is the m×m identity matrix. Eq. (13) is exact for Principal Components Analysis

(PCA) applied to multivariate normal distributed data. The B matrix follows from a

regularization approximation of the calculated sample covariance matrix and is calculated using

eigenvalues and eigenvectors from a PCA of the data. Calculation of the B matrix and estimation

of the model order, m, are described in Appendix A.
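
As a short worked step, Eqs. (11)-(13) fix the first two moments of measured plume-free spectra. The derivation below assumes that the weights $\boldsymbol{\beta}$ and the noise $\mathbf{e}$ are statistically independent; that independence is an assumption made here for illustration, since the text states only their individual distributions.

```latex
\begin{aligned}
E[\tilde{\mathbf{x}}] &= \boldsymbol{\mu} + \mathbf{B}\,E[\boldsymbol{\beta}] + E[\mathbf{e}] = \boldsymbol{\mu} \\
\mathrm{Cov}[\tilde{\mathbf{x}}] &= \mathbf{B}\,\mathrm{Cov}[\boldsymbol{\beta}]\,\mathbf{B}^T + \mathrm{Cov}[\mathbf{e}]
  = \mathbf{B}\mathbf{B}^T + \mathbf{D}
\end{aligned}
```

Under that assumption the background model amounts to a low-rank-plus-diagonal covariance, which has the same form as the regularized sample covariance matrix used with the clutter-matched filter in Section 2.5.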

2.4. Model Parameter Estimation

Following Eqs. (8), (9), and (12), the signal model parameters are the chemical vapor optical

density, the weight coefficients for the basis vectors for the background (i.e., the elements of the

β vector), and the plume/atmospheric temperature. For each spectrum, the parameters define a

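vector $\boldsymbol{\theta}$, $\boldsymbol{\theta} = [\alpha, \boldsymbol{\beta}, T_a]$. We take a Bayesian approach to estimating the maximum likelihood

parameter values. Given an observation, $\tilde{\mathbf{x}}$, the probability that the parameter values are $\boldsymbol{\theta}$ is

$$p(\boldsymbol{\theta}|\tilde{\mathbf{x}}) = \frac{p(\tilde{\mathbf{x}}|\boldsymbol{\theta})\,p(\boldsymbol{\theta})}{p(\tilde{\mathbf{x}})} \tag{14}$$

where $p(\tilde{\mathbf{x}}|\boldsymbol{\theta})$ is the conditional probability of observing $\tilde{\mathbf{x}}$ given $\boldsymbol{\theta}$, $p(\boldsymbol{\theta})$ is the prior

probability of the parameter values being $\boldsymbol{\theta}$, and $p(\tilde{\mathbf{x}})$ is the prior probability of observing $\tilde{\mathbf{x}}$.

The maximum likelihood parameter values are those which maximize $p(\boldsymbol{\theta}|\tilde{\mathbf{x}})$ or, equivalently,

those which minimize $-\ln p(\boldsymbol{\theta}|\tilde{\mathbf{x}})$. The maximum likelihood model parameters are denoted $\hat{\boldsymbol{\theta}}$.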

Following Eq. (8), the model function for x is:

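$$\mathbf{f}(\boldsymbol{\theta}) = \boldsymbol{\tau}_e \circ \mathbf{x}_0 + \left[\mathbf{1} - \boldsymbol{\tau}_e\right] \circ \mathbf{L}_a \tag{15}$$

where the vector $\mathbf{x}_0$ is the estimated noise-free background spectrum given by Eq. (12), $\mathbf{1}$ is a

k-element vector of ones, $\boldsymbol{\tau}_e$ is the plume transmission calculated using Eqs. (9) and (10), $\mathbf{L}_a$ is the

blackbody function evaluated at the plume temperature, and $\circ$ denotes the element-by-element

(Hadamard) product of the two vectors.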

In order to facilitate computationally-efficient parameter estimation, we postulate that

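$p(\tilde{\mathbf{x}}|\boldsymbol{\theta})$ follows a multivariate normal distribution:

$$-\ln p(\tilde{\mathbf{x}}|\boldsymbol{\theta}) = \tfrac{1}{2}\left[\tilde{\mathbf{x}} - \mathbf{f}(\boldsymbol{\theta})\right]^T \mathbf{D}^{-1} \left[\tilde{\mathbf{x}} - \mathbf{f}(\boldsymbol{\theta})\right] + c_{x|\theta} \tag{16}$$

where $\mathbf{D}^{-1}$ is a noise-whitening matrix and $c_{x|\theta}$ is a constant; $\mathbf{D} = \mathrm{diag}\{\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2\}$, where $\sigma_i$

is the standard error of the measured spectral radiance in band i. To deal with the $p(\boldsymbol{\theta})$ term in

Eq. (14) we postulate that $\boldsymbol{\theta}$ can be partitioned into a subset of parameters where the prior follows a

multivariate normal distribution, i.e., the parameter values are constrained, and a subset where

the prior is uniform, i.e., the parameter values are unconstrained. Following this presumption

$$-\ln p(\boldsymbol{\theta}) = \tfrac{1}{2}\left[\boldsymbol{\theta}_c - \boldsymbol{\theta}_a\right]^T \mathbf{S}_\theta \left[\boldsymbol{\theta}_c - \boldsymbol{\theta}_a\right] + c_\theta \tag{17}$$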

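where $\boldsymbol{\theta}_c$ is the subset of $\boldsymbol{\theta}$ which are constrained and $c_\theta$ is a constant. Specification of $\boldsymbol{\theta}_c$ is

described below. The vector $\boldsymbol{\theta}_a$ is the a priori estimate of $\boldsymbol{\theta}_c$ and $\mathbf{S}_\theta$ is a regularization matrix

which penalizes deviations about $\boldsymbol{\theta}_a$. For the purpose of estimating the maximum likelihood

model parameter values, it is not necessary to know the details of $p(\tilde{\mathbf{x}})$ because it is simply a

normalization factor and is independent of $\boldsymbol{\theta}$.

Following Eqs. (14), (16), and (17), the model parameter values which maximize $p(\boldsymbol{\theta}|\tilde{\mathbf{x}})$

are also those which minimize the cost function:

$$C = \left[\tilde{\mathbf{x}} - \mathbf{f}(\boldsymbol{\theta})\right]^T \mathbf{D}^{-1} \left[\tilde{\mathbf{x}} - \mathbf{f}(\boldsymbol{\theta})\right] + \left[\boldsymbol{\theta}_c - \boldsymbol{\theta}_a\right]^T \mathbf{S}_\theta \left[\boldsymbol{\theta}_c - \boldsymbol{\theta}_a\right] \tag{18}$$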

The first term on the righthand side of Eq. (18) penalizes deviations between the measured

spectrum and the model spectrum. The second term on the righthand side of Eq. (18) penalizes

deviations of the constrained model parameters from their nominal values. As stated above, in

order to demonstrate the nonlinear estimation approach without making the mathematics

unnecessarily complicated, we consider parameter estimation when the plume temperature is

known and is equal to the effective temperature of the atmosphere. Also, we presume that if a

chemical vapor cloud is present in the scene that its location and optical density are a priori

unknown and that optical density values are independent of background characteristics.

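Following these presumptions, $\alpha$ is treated as an unconstrained parameter, $\boldsymbol{\theta}_c = \boldsymbol{\beta}$, and the second

term on the righthand side of Eq. (18) reduces to

$$\left[\boldsymbol{\theta}_c - \boldsymbol{\theta}_a\right]^T \mathbf{S}_\theta \left[\boldsymbol{\theta}_c - \boldsymbol{\theta}_a\right] = \boldsymbol{\beta}^T\boldsymbol{\beta} \tag{19}$$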

With respect to Eqs.(17)-(19), while α cannot be <0 or exceed the value corresponding to the

atmospheric pressure and, in principle, should be constrained, in practice it is more

computationally-efficient to leave α unconstrained and then reject physically implausible values

after the parameter estimation algorithm terminates.
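
A minimal sketch of the cost function of Eqs. (18) and (19), with the model spectrum of Eq. (15), is given below. It uses plain Python lists; the function names, the argument `var` (the diagonal of D, i.e., per-band noise variances), the vector `s` (taken here as the reference column density times the band-averaged cross-section, so that the effective transmission is exp(-alpha·s)), and the numerical values are all assumptions made for the example.

```python
import math

def model_spectrum(alpha, beta, mu, B, s, L_a):
    """Eq. (15): f = tau_e * x0 + (1 - tau_e) * L_a, with x0 = mu + B beta (Eq. 12)."""
    k = len(mu)
    x0 = [mu[i] + sum(B[i][j] * beta[j] for j in range(len(beta))) for i in range(k)]
    tau_e = [math.exp(-alpha * s[i]) for i in range(k)]
    return [tau_e[i] * x0[i] + (1.0 - tau_e[i]) * L_a[i] for i in range(k)]

def cost(x_meas, alpha, beta, mu, B, s, L_a, var):
    """Eqs. (18)-(19): noise-whitened misfit plus the beta^T beta penalty."""
    f = model_spectrum(alpha, beta, mu, B, s, L_a)
    misfit = sum((x_meas[i] - f[i]) ** 2 / var[i] for i in range(len(f)))
    return misfit + sum(b * b for b in beta)

# Assumed three-band example with one background basis vector (m = 1).
mu = [8.0, 8.5, 9.0]
B = [[0.1], [0.2], [0.1]]
s = [0.0, 1.0, 0.2]
L_a = [9.0, 9.5, 10.0]
var = [0.01, 0.01, 0.01]

x_sim = model_spectrum(0.5, [1.0], mu, B, s, L_a)       # noise-free spectrum with a plume
c_true = cost(x_sim, 0.5, [1.0], mu, B, s, L_a, var)    # only the prior term remains
c_null = cost(x_sim, 0.0, [1.0], mu, B, s, L_a, var)    # plume ignored: misfit grows
```

Evaluated at the parameters used to generate the spectrum, the misfit term vanishes and only the penalty on the background weights contributes; forcing the optical density to zero leaves a residual in the absorbing bands.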

The maximum likelihood model parameter values following from Eq.(18) may be

estimated using a Gauss-Newton algorithm after expressing the cost function in quadratic form

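$$C = \mathbf{r}^T\mathbf{r} \tag{20}$$

where $\mathbf{r}$ is a p-dimensional vector. The Gauss-Newton algorithm updates parameter values

iteratively as

$$\boldsymbol{\theta}_{i+1} = \boldsymbol{\theta}_i - \left(\mathbf{J}_i^T\mathbf{J}_i\right)^{-1}\mathbf{J}_i^T\mathbf{r}_i \tag{21}$$

where i indicates iteration number and $\mathbf{J}$ is the Jacobian of the $\mathbf{r}$ vector, $\partial\mathbf{r}/\partial\boldsymbol{\theta}$. Eq. (21) is a

general result. The dimensionality of the $\mathbf{r}$ vector and the Jacobian depends upon the details of

the signal model and the cost function. Following Eqs. (18) and (19), the $\mathbf{r}$ vector in this case

consists of (k+m) elements,

$$\mathbf{r} = \left[\mathbf{D}^{-1/2}\left(\tilde{\mathbf{x}} - \mathbf{f}(\boldsymbol{\theta})\right);\ \boldsymbol{\beta}\right] \tag{22}$$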

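The Jacobian of $\mathbf{r}$ is a (k + m) × (m + 1) matrix which is the concatenation of its partial

derivative with respect to the $\boldsymbol{\beta}$ parameters and its partial derivative with respect to the $\alpha$

parameter:

$$\mathbf{J} = \left[\frac{\partial\mathbf{r}}{\partial\boldsymbol{\beta}}\ ;\ \frac{\partial\mathbf{r}}{\partial\alpha}\right] \tag{23}$$

where $\partial\mathbf{r}/\partial\boldsymbol{\beta} \equiv [\partial\mathbf{r}/\partial\beta_1; \partial\mathbf{r}/\partial\beta_2; \ldots; \partial\mathbf{r}/\partial\beta_m]$ is a (k + m) × m matrix:

$$\frac{\partial\mathbf{r}}{\partial\boldsymbol{\beta}} = \begin{bmatrix} -\mathbf{D}^{-1/2}\cdot\mathrm{diag}\{\boldsymbol{\tau}_e\}\cdot\mathbf{B} \\ \mathbf{I}_m \end{bmatrix} \tag{24}$$

and $\partial\mathbf{r}/\partial\alpha$ is a (k+m)×1 column vector. The quantity $\mathrm{diag}\{\boldsymbol{\tau}_e\}$ is a k×k matrix with $\boldsymbol{\tau}_e$ on the

diagonal and zeros on the off-diagonal; $\mathrm{diag}\{\boldsymbol{\tau}_e\}\cdot\mathbf{B}$ is a k × m matrix and $\mathbf{I}_m$ is the m × m

identity matrix. The $\partial\mathbf{r}/\partial\alpha$ vector is:

$$\frac{\partial\mathbf{r}}{\partial\alpha} = \begin{bmatrix} -\mathbf{D}^{-1/2}\cdot\mathrm{diag}\{\boldsymbol{\tau}_e\}\cdot\left(\mathbf{s}\circ\boldsymbol{\delta}_0\right) \\ \mathbf{0}_m \end{bmatrix} \tag{25}$$

where $\mathbf{s} = \rho_0\cdot\bar{\boldsymbol{\kappa}}$, $\boldsymbol{\delta}_0 = \mathbf{L}_a - (\boldsymbol{\mu} + \mathbf{B}\boldsymbol{\beta})$ is the radiance contrast between the air and the estimated

background spectrum (a k×1 column vector) and $\mathbf{0}_m$ is an m×1 vector of zeros. Eq. (25) follows from

$\partial\mathbf{f}/\partial\alpha = \boldsymbol{\tau}_e\circ\mathbf{s}\circ\boldsymbol{\delta}_0$, obtained by differentiating Eq. (15) with respect to $\alpha$.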

The formulation of Eqs. (23)-(25) ensures that $\mathbf{J}_i^T\mathbf{J}_i$ is invertible under virtually all

physically plausible detection scenarios and thereby makes Eq. (21) extremely stable. (The $\mathbf{I}_m$

matrix in Eq. (24) is the principal source of stability.) When the atmospheric temperature is

treated as a fixed parameter, the only scenario where $\mathbf{J}_i^T\mathbf{J}_i$ is guaranteed not to be invertible, and

therefore Eq. (21) cannot produce an accurate estimate of plume column density, is when the

plume is opaque at all wavelengths, i.e., $\boldsymbol{\tau}_e = \mathbf{0}$. The uncertainty associated with the estimated $\alpha$

value tends to infinity as $\boldsymbol{\delta}_0 \to \mathbf{0}$; however, this does not prevent the algorithm from

converging.
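
The Gauss-Newton update of Eq. (21) can be sketched for the smallest non-trivial case, a single background basis vector (m = 1), so that the normal equations are 2 × 2 and can be inverted by hand. This is an illustrative sketch under stated assumptions, not the implementation used for the results reported here: the function name, the zero starting point, the fixed iteration count, and the synthetic five-band data are all hypothetical, `sigma` holds the per-band noise standard deviations (square roots of the diagonal of D), and the derivative with respect to α is obtained by differentiating Eq. (15).

```python
import math

def gauss_newton(x_meas, mu, b, s, L_a, sigma, n_iter=20):
    """Minimize the cost of Eq. (18) for one basis vector b (m = 1).

    Returns (beta, alpha, var_alpha); var_alpha is the alpha-alpha element of
    (J^T J)^-1 taken from the last evaluated Jacobian.
    """
    k = len(x_meas)
    beta, alpha = 0.0, 0.0                       # simple starting point (assumption)
    for _ in range(n_iter):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for i in range(k):
            tau = math.exp(-alpha * s[i])
            x0 = mu[i] + b[i] * beta
            f = tau * x0 + (1.0 - tau) * L_a[i]              # Eq. (15)
            r = (x_meas[i] - f) / sigma[i]                   # top block of Eq. (22)
            j_beta = -tau * b[i] / sigma[i]                  # top block of Eq. (24)
            j_alpha = -tau * s[i] * (L_a[i] - x0) / sigma[i]  # d r / d alpha
            a11 += j_beta * j_beta
            a12 += j_beta * j_alpha
            a22 += j_alpha * j_alpha
            g1 += j_beta * r
            g2 += j_alpha * r
        a11 += 1.0      # identity block of Eq. (24), contributed by the prior on beta
        g1 += beta      # bottom block of Eq. (22): residual equals beta
        det = a11 * a22 - a12 * a12
        beta -= (a22 * g1 - a12 * g2) / det      # Eq. (21) with the 2x2 inverse written out
        alpha -= (a11 * g2 - a12 * g1) / det
    return beta, alpha, a11 / det

# Synthetic noise-free test spectrum (assumed values): beta = 0.5, alpha = 1.0.
mu = [8.0, 8.2, 8.4, 8.6, 8.8]
b = [0.3, 0.3, 0.3, 0.3, 0.3]
s = [0.0, 0.2, 1.0, 0.3, 0.0]
L_a = [9.5, 9.6, 9.7, 9.8, 9.9]
sigma = [0.01] * 5
x_sim = [math.exp(-s[i]) * (mu[i] + 0.5 * b[i]) + (1.0 - math.exp(-s[i])) * L_a[i]
         for i in range(5)]

beta_hat, alpha_hat, var_alpha = gauss_newton(x_sim, mu, b, s, L_a, sigma)
```

For a general m the same structure applies with matrix operations in place of the hand-written 2 × 2 inverse. The recovered background weight is pulled slightly toward zero by the prior term, which is expected from the form of Eq. (19).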

Although Eqs. (22)-(25) follow from the presumption that the atmospheric

temperature is known, the framework above permits it to be treated as an estimated parameter.

For example, it may be treated as a constrained parameter by augmenting the $\mathbf{r}$ vector with a

term $(T_a - T_0)/\sigma_T$, where $T_0$ is an a priori estimated air temperature and $\sigma_T$ is the uncertainty in

the estimated temperature. In principle, $T_a$ may also be treated as an unconstrained parameter;

however, in the absence of a constraint, $\partial\mathbf{r}/\partial T_a \to \mathbf{0}$ as $\alpha \to 0$. This results in $\mathbf{J}^T\mathbf{J}$ being

ill-conditioned as $\alpha \to 0$ (and non-invertible for $\alpha = 0$) thereby rendering application of the

Gauss-Newton algorithm problematic (or impossible). Treating $T_a$ as a free parameter is only

effective when the vapor cloud is optically-thick, in which case the measured spectral radiance at

the wavelength(s) of peak optical density provides a reasonable measure of the plume

temperature.

2.5. Algorithm Initialization

The Gauss-Newton algorithm requires an initial guess at the maximum likelihood model

parameters. We make our initial guess by applying several approximations to Eq. (15) in order

to create a linear additive signal model. The linear additive model facilitates direct calculation of

the maximum likelihood model parameters by matrix algebra. The first step in developing the

approximate model is to presume an optically-thin vapor plume, $\alpha \ll 1$, and approximate

Eq. (15) as

$$\mathbf{x}_p = \mathbf{x}_0 + \alpha\,\mathbf{s}\circ\left(\mathbf{L}_a - \mathbf{x}_0\right) \tag{26}$$

By further presuming that the plume is viewed against a blackbody background, Eq. (26)

simplifies to:

$$\mathbf{x}_p = \mathbf{x}_0 + \alpha'\,\mathbf{s}' \tag{27}$$

where the vector $\mathbf{s}'$ is

$$\mathbf{s}' = \mathbf{s}\circ\left[\frac{d\mathbf{L}_a}{dT}\right]_{T_a}\Delta T_0 \tag{28}$$

and

$$\alpha' = \alpha\cdot\frac{\Delta T_{eff}}{\Delta T_0} \tag{29}$$

The quantity $[d\mathbf{L}_a/dT]_{T_a}$ is the derivative of the blackbody function with respect to temperature

evaluated at the air temperature, $\Delta T_{eff}$ is the effective thermal contrast between the air temperature

and the radiometric temperature of the background, and $\Delta T_0$ is a reference thermal contrast,

nominally 1 K.
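
The size of the error introduced by the optically-thin approximation can be illustrated for a single band by comparing the plume-induced radiance change of the nonlinear model with its linearized form. In the sketch below the function names are invented for the example, `s` and `contrast` (the air-minus-background radiance contrast) are set to one for simplicity, and the chosen optical densities are arbitrary.

```python
import math

def exact_change(alpha, s, contrast):
    """Nonlinear model: (1 - exp(-alpha * s)) * (L_a - x_0) in one band."""
    return (1.0 - math.exp(-alpha * s)) * contrast

def linear_change(alpha, s, contrast):
    """Optically-thin approximation of Eq. (26): alpha * s * (L_a - x_0)."""
    return alpha * s * contrast

def relative_error(alpha, s=1.0, contrast=1.0):
    """Fractional overestimate of the plume signal by the linear model."""
    exact = exact_change(alpha, s, contrast)
    return (linear_change(alpha, s, contrast) - exact) / exact

for alpha in (0.01, 0.1, 1.0, 3.0):
    print(f"alpha = {alpha:4.2f}: linear model overestimates the signal by "
          f"{100.0 * relative_error(alpha):6.1f} %")
```

At an optical density of 1.0 the linearized signal is already about 58 % larger than the nonlinear one in the strongest band, so a linear estimator fitted to an optically-thick plume would be expected to underestimate the column density.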

Combining Eqs. (12) and (27) we obtain the approximate signal model:

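$$\mathbf{g}(\boldsymbol{\theta}') = \alpha'\,\mathbf{s}' + \mathbf{B}\boldsymbol{\beta}' + \boldsymbol{\mu} \tag{30}$$

By replacing $\mathbf{f}(\boldsymbol{\theta})$ of Eq. (15) with $\mathbf{g}(\boldsymbol{\theta}')$ above and maintaining the constraint in Eq. (19), then

applying the criterion that $\nabla_{\theta'} C = 0$ at the minimum of the cost function, we obtain a linear

system of equations which may be solved directly for the maximum likelihood values of $\alpha'$ and

$\boldsymbol{\beta}'$, $\hat{\alpha}'$ and $\hat{\boldsymbol{\beta}}'$: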

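$$\begin{bmatrix} \hat{\alpha}' \\ \hat{\boldsymbol{\beta}}' \end{bmatrix} = \begin{bmatrix} \mathbf{s}'^T\mathbf{D}^{-1}\mathbf{s}' & \mathbf{s}'^T\mathbf{D}^{-1}\mathbf{B} \\ \mathbf{B}^T\mathbf{D}^{-1}\mathbf{s}' & \boldsymbol{\Lambda}_m \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{s}'^T\mathbf{D}^{-1}(\tilde{\mathbf{x}} - \boldsymbol{\mu}) \\ \mathbf{B}^T\mathbf{D}^{-1}(\tilde{\mathbf{x}} - \boldsymbol{\mu}) \end{bmatrix} \tag{31}$$

The matrix to be inverted consists of four sub-blocks: $\mathbf{s}'^T\mathbf{D}^{-1}\mathbf{s}'$ is a 1 × 1 sub-block, $\mathbf{s}'^T\mathbf{D}^{-1}\mathbf{B}$ is

a 1 × m sub-block, $\mathbf{B}^T\mathbf{D}^{-1}\mathbf{s}' = (\mathbf{s}'^T\mathbf{D}^{-1}\mathbf{B})^T$ is an m × 1 sub-block, and the $\boldsymbol{\Lambda}_m$ sub-block is an

m × m diagonal matrix whose non-zero elements are the leading m eigenvalues of the noise-whitened

sample covariance matrix, $\mathbf{D}^{-1/2}\boldsymbol{\Sigma}\mathbf{D}^{-1/2}$. The system of equations in Eq. (31) yields the

clutter-matched filter result [9,12,13]:

$$\hat{\alpha}' = \frac{\mathbf{s}'^T\hat{\boldsymbol{\Sigma}}^{-1}(\tilde{\mathbf{x}} - \boldsymbol{\mu})}{\mathbf{s}'^T\hat{\boldsymbol{\Sigma}}^{-1}\mathbf{s}'} \tag{32}$$

where $\hat{\boldsymbol{\Sigma}} = \mathbf{B}\mathbf{B}^T + \mathbf{D}$ is the regularized sample covariance matrix. Using the relation

$\alpha = \alpha'\cdot(\Delta T_0/\Delta T_{eff})$ enables comparison of column densities estimated using the linearized

model in Eq. (30) with the results obtained presuming the RTE-based signal model in Eq. (15).

Calculation of $\Delta T_{eff}$ is addressed below.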

2.6. Uncertainty Analysis

It is instructive to compare the Cramer-Rao lower bound (CRLB) on the uncertainty in the

column density estimated using the Gauss-Newton solver with that associated with the linear

model estimate. The CRLB on the uncertainties in parameters determined using the

Gauss-Newton algorithm may be determined from the elements of the inverse of the Fisher

information matrix [23]:

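$$\left[\sigma(\hat{\theta}_i)\right]^2 \geq \left[I(\boldsymbol{\theta})^{-1}\right]_{ii} \tag{33}$$

where $\sigma(a)$ is the 1σ standard deviation of the quantity a. Following Eqs. (14), (18) and (19), the

elements of the Fisher information matrix are:

$$\left[I(\hat{\boldsymbol{\theta}})\right]_{ij} = -E\left\{\frac{\partial^2 \ln p(\boldsymbol{\theta}|\tilde{\mathbf{x}})}{\partial\theta_i\,\partial\theta_j}\right\} \approx \left[\mathbf{J}^T\mathbf{J}\right]_{ij} \tag{34}$$

Combining Eqs. (33) and (34),

$$\left[\sigma(\hat{\alpha})\right]^2 = \left[\left(\mathbf{J}^T\mathbf{J}\right)^{-1}\right]_{ii} \tag{35}$$

The index i in Eq. (35) corresponds to the index of $\alpha$ in the parameter vector. The matrix $(\mathbf{J}^T\mathbf{J})^{-1}$ is

calculated with each iteration of the Gauss-Newton algorithm so $\sigma(\hat{\alpha})$ may be determined for

each parameter estimated with no additional computational expense.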

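For the linear model given by Eq. (30), there exists a closed-form expression for $[\sigma(\alpha)]^2$:

$$\begin{aligned} \left[\sigma_L(\hat{\alpha})\right]^2 &= \left[\frac{\partial\alpha}{\partial\alpha'}\right]^2\left[\sigma(\hat{\alpha}')\right]^2 + \left[\frac{\partial\alpha}{\partial(\Delta T)}\right]^2\left[\sigma(\Delta T)\right]^2 \\ &= \left(\frac{\Delta T_0}{\Delta T_{eff}}\right)^2\left(\left[\sigma(\hat{\alpha}')\right]^2 + \left[\hat{\alpha}'\,\frac{\sigma(\Delta T)}{\Delta T_{eff}}\right]^2\right) \end{aligned} \tag{36}$$

where $[\partial\alpha/\partial\alpha']$ and $[\partial\alpha/\partial(\Delta T)]$ are from Eq. (29). The subscript L in Eq. (36) is to distinguish

the uncertainty derived from the linear model from that estimated using the Gauss-Newton

solver. The quantity $\sigma(\hat{\alpha}')$ is the CRLB associated with the Gaussian probability distribution

function which follows from the cost function defined using the linear model:

$$\left[\sigma(\hat{\alpha}')\right]^2 \geq \left[\mathbf{s}'^T\mathbf{D}^{-1}\mathbf{s}'\right]^{-1} \tag{37}$$

When there is no uncertainty in the thermal contrast between the air and the background, i.e.,

when the plume and the air temperature are known precisely, $\sigma(\Delta T) = 0$ and Eq. (36) simplifies

to:

$$\left[\sigma_L(\hat{\alpha})\right]^2 = \left(\frac{\Delta T_0}{\Delta T_{eff}}\right)^2\left[\mathbf{s}'^T\mathbf{D}^{-1}\mathbf{s}'\right]^{-1} \tag{38}$$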

Note that the uncertainty in the estimated column density tends to infinity as $\Delta T_{eff} \to 0$. The

effective thermal contrast in Eq. (38) is that which minimizes the sum-squared deviation between

$\mathbf{s}\circ(\mathbf{L}_a - \mathbf{x}_0)$ in Eq. (26) and $\mathbf{s}'$ in Eq. (27):

$$\Delta T_{eff} = \Delta T_0\cdot\frac{\mathbf{s}'^T\left[\mathbf{s}\circ(\mathbf{L}_a - \mathbf{x}_0)\right]}{\mathbf{s}'^T\mathbf{s}'} \tag{39}$$

where $\mathbf{s}'$ is as per Eq. (28). (Note that $\mathbf{s}'$ is proportional to $\Delta T_0$; the $\Delta T_0$ terms cancel in the

numerator and denominator and $\Delta T_{eff}$ is independent of $\Delta T_0$.) The effective thermal contrast

goes to zero as $(\mathbf{L}_a - \mathbf{x}_0) \to \mathbf{0}$.
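
The effective thermal contrast of Eq. (39) can be evaluated with a short script. The sketch below is an illustration under assumptions: the blackbody derivative in Eq. (28) is approximated by a central finite difference, the four wavelengths and the signature vector `s` are invented values, the background is taken to be a 298 K blackbody, and the function names are hypothetical.

```python
import math

H, C, KB = 6.62607015e-34, 2.99792458e8, 1.380649e-23  # SI units

def planck(wavelength_um, temp_k):
    """Blackbody spectral radiance in W m^-2 sr^-1 um^-1."""
    lam = wavelength_um * 1e-6
    return 2.0 * H * C**2 / lam**5 / math.expm1(H * C / (lam * KB * temp_k)) * 1e-6

def effective_contrast(wavelengths_um, s, x0, t_air, dt0=1.0, h=0.01):
    """Eq. (39): least-squares thermal contrast between s o (L_a - x0) and s'."""
    num = den = 0.0
    for lam, s_i, x0_i in zip(wavelengths_um, s, x0):
        l_a = planck(lam, t_air)
        # Central-difference estimate of dL/dT at the air temperature.
        dl_dt = (planck(lam, t_air + h) - planck(lam, t_air - h)) / (2.0 * h)
        s_prime = s_i * dl_dt * dt0              # Eq. (28)
        num += s_prime * s_i * (l_a - x0_i)
        den += s_prime * s_prime
    return dt0 * num / den

# Assumed example: air at 300 K viewed against a 298 K blackbody background.
wl = [8.0, 9.0, 10.0, 11.0]
s = [0.1, 1.0, 0.4, 0.2]
x0 = [planck(lam, 298.0) for lam in wl]
dt_eff = effective_contrast(wl, s, x0, 300.0)
```

For this blackbody background the result comes out close to the 2 K difference between the air and background temperatures, and it does not change when a different reference contrast `dt0` is supplied, in line with the cancellation noted in the text.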

2.7. Detection Decision Formulation

For some standoff detection applications, making the correct “plume absent”/“plume present”

detection decision can be more important than accurate estimation of the chemical vapor column

density. The cost function in Eq. (18) facilitates detection decisions on the basis of a statistical

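F-test. The F value associated with the measured spectrum, $\tilde{\mathbf{x}}$, is

$$F(\tilde{\mathbf{x}}) = (k-1)\cdot\left[\frac{C(\tilde{\mathbf{x}},\hat{\boldsymbol{\theta}}_0)}{C(\tilde{\mathbf{x}},\hat{\boldsymbol{\theta}})} - 1\right] \tag{40}$$

where $C(\tilde{\mathbf{x}},\hat{\boldsymbol{\theta}})$ is the cost function evaluated using the model parameters estimated with the

Gauss-Newton algorithm and allowing all model parameters to vary and $C(\tilde{\mathbf{x}},\hat{\boldsymbol{\theta}}_0)$ is the cost function

evaluated with $\alpha$ fixed at zero. In contrast to $C(\tilde{\mathbf{x}},\hat{\boldsymbol{\theta}})$, calculation of $C(\tilde{\mathbf{x}},\hat{\boldsymbol{\theta}}_0)$ does not require an

iterative computation. Under the background model defined in Section 2.3

$$C(\tilde{\mathbf{x}};\hat{\boldsymbol{\theta}}_0) = \left[\tilde{\mathbf{x}} - \boldsymbol{\mu} - \mathbf{B}\hat{\boldsymbol{\beta}}_0\right]^T\mathbf{D}^{-1}\left[\tilde{\mathbf{x}} - \boldsymbol{\mu} - \mathbf{B}\hat{\boldsymbol{\beta}}_0\right] + \hat{\boldsymbol{\beta}}_0^T\hat{\boldsymbol{\beta}}_0 \tag{41}$$

and the $\hat{\boldsymbol{\beta}}_0$ vector is calculated using Eq. (A.9) of Appendix A. “Plume present” is decided when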

the F value exceeds a user-specified threshold.

For the linear signal model given by Eq. (30), the analogue to the F test is an adaptive

subspace detector, the Adaptive Cosine Estimator (ACE) [24,25]. The ACE value associated

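with the measured spectrum $\tilde{\mathbf{x}}$ is

$$D_{ACE}(\tilde{\mathbf{x}}) = \frac{\left[\mathbf{s}'^T\hat{\boldsymbol{\Sigma}}^{-1}(\tilde{\mathbf{x}} - \boldsymbol{\mu})\right]^2}{\left[\mathbf{s}'^T\hat{\boldsymbol{\Sigma}}^{-1}\mathbf{s}'\right]\left[(\tilde{\mathbf{x}} - \boldsymbol{\mu})^T\hat{\boldsymbol{\Sigma}}^{-1}(\tilde{\mathbf{x}} - \boldsymbol{\mu})\right]} \tag{42}$$

The ACE statistic can be regarded as cosine-squared of the angle between the test spectrum and

the reference spectrum in noise-whitened, mean-subtracted signal space. The F value calculated

in Eq. (40) is analogous to cotangent-squared of the spectral angle. The ACE value calculated in

Eq. (42) may be converted into an equivalent F value for direct comparison with the value

calculated in Eq. (40), $F_{ACE} = (k-1)\cdot D_{ACE}/(1 - D_{ACE})$.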
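
The two detection statistics can be placed on a common scale with a few lines of code. The sketch below is illustrative: the function names, the threshold, and the example cost values are assumptions, and k = 20 is used only because it matches the number of bands in the test data of Section 3.

```python
def f_value(cost_null, cost_full, k):
    """Eq. (40): F = (k - 1) * [C(x, theta_0) / C(x, theta) - 1]."""
    return (k - 1) * (cost_null / cost_full - 1.0)

def ace_to_f(d_ace, k):
    """Equivalent F value for an ACE statistic: (k - 1) * D / (1 - D)."""
    return (k - 1) * d_ace / (1.0 - d_ace)

def plume_present(f, threshold):
    """Declare a detection when the F value exceeds a user-specified threshold."""
    return f > threshold

# Assumed example: the full model halves the cost relative to the plume-free fit.
k = 20
f_gn = f_value(cost_null=10.0, cost_full=5.0, k=k)
f_ace = ace_to_f(0.5, k)
decision = plume_present(f_gn, threshold=10.0)
```

The two expressions coincide when the ACE value equals one minus the ratio of the full-model cost to the plume-free cost, which is one way to read the cosine-squared and cotangent-squared analogy.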

3. Test Data

For algorithm performance evaluation we simulated observations of fugitive emissions by

augmenting plume-free measured spectra with synthetic plume signatures. The plume-free

spectra were collected using an Adaptive Infrared Imaging Spectroradiometer-Wide Area

Detector (AIRIS-WAD) [26,27]. The AIRIS-WAD sensor is an imaging Fabry-Perot

spectrometer comprised of a 256×256 pixel LWIR focal plane array (FPA) which views the far

field through a rapidly tunable LWIR etalon. The sensor’s optical system is configured to

provide a 32 deg × 32 deg field-of-regard (2.2 mrad per pixel IFOV). Spectra are recorded

band-sequentially and consist of measurements at twenty (20) user-specified wavelengths in the 8 to

11 µm spectral region. The instrument’s spectral resolution is ~0.08 µm, nominally 8 cm⁻¹

FWHM at ν = 1000 cm⁻¹, and the lineshape is well-described by a Lorentzian function. The

sensor is equipped with an internal blackbody source to facilitate real-time radiometric

calibration of the sensor data.

Figure 2 shows a broadband IR representation of a datacube collected by the sensor. The

broadband IR representation was generated by summing all twenty narrowband images in the

datacube. Lighter pixels indicate higher radiance values. The scene is composed of low brush

and compacted sand in the bottom half of the image and sky in the upper half. Figure 3 shows

the average spectral radiance from the boxes marked “sky”, “horizon”, and “ground” in Figure 2.

The spectrum of the ground region is very similar to the spectrum of a 302 K blackbody. The

“sky” spectrum is consistent with a low slant angle view to space and exhibits the characteristic

ozone emission feature near 9.5 µm.

We augmented AIRIS-WAD data with synthetic 1,1,1,2-tetrafluoroethane (R-134a)

spectra. R-134a is a freon widely-used in refrigeration systems and as a propellant for domestic

and industrial applications. Data was augmented using the equation

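$$\mathbf{x}_p = [\mathbf{1} - \boldsymbol{\tau}_p] \circ \mathbf{L}_a + \boldsymbol{\tau}_p \circ \hat{\mathbf{x}}_0 + \hat{\mathbf{e}} \qquad (43)$$

where $\boldsymbol{\tau}_p$ is the instrument-lineshape-averaged plume transmission, $\hat{\mathbf{x}}_0$ is the estimated noise-free

background spectrum of the pixel as calculated using Eq. (A.10) of Appendix A, and $\hat{\mathbf{e}}$ is

defined as the difference between the measured background spectrum and the estimated noise-free

background spectrum, $\hat{\mathbf{e}} \equiv \tilde{\mathbf{x}} - \hat{\mathbf{x}}_0$. The elements of $\boldsymbol{\tau}_p$ are:

$$\tau_p(\lambda_s) = \int \exp[-\rho \cdot \kappa(\lambda)] \cdot g(\lambda, \lambda_s) \cdot d\lambda \qquad (44)$$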

The plume transmission at high resolution was calculated using an R-134a reference spectrum

from the Pacific Northwest National Laboratories IR Spectral Database [28]. The instrument

lineshape function used to evaluate Eq. (44) was Lorentzian with 0.08 µm FWHM, consistent

with the experimentally-measured resolution function of the AIRIS-WAD sensor.

A useful characteristic of Eq. (43) is that it preserves the noise in the original data as

the plume becomes optically thick and thereby provides more realistic spectra for testing estimation

algorithms than fully synthetic data with added Gaussian noise. As the plume transmission goes

to zero, $\mathbf{x}_p \to \mathbf{L}_a + \hat{\mathbf{e}}$, i.e., as the plume becomes opaque the pixel spectrum becomes a noisy

blackbody spectrum rather than a noise-free blackbody spectrum. Conversely, as the plume

column density goes to zero its effective transmission goes to unity and $\mathbf{x}_p = \hat{\mathbf{x}}_0 + \hat{\mathbf{e}} = \tilde{\mathbf{x}}$, i.e., the output

spectrum is equal to the original data if no plume is present.
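
As an illustration, the augmentation of Eqs. (43) and (44) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code used to generate the test data: the function names are hypothetical, the absorption band is a toy Gaussian scaled to OD=1.0 at 197 ppmv-m rather than the PNNL R-134a spectrum, the lineshape integral is approximated by a normalized sum on a uniform wavelength grid, and the "measured" spectrum is a synthetic blackbody plus noise.

```python
import numpy as np

def planck_radiance(wavelength_um, temperature_k):
    """Blackbody spectral radiance in W/(m^2 sr um)."""
    h, c, kb = 6.62607015e-34, 2.99792458e8, 1.380649e-23
    lam = wavelength_um * 1e-6  # wavelength in metres
    per_metre = 2.0 * h * c**2 / lam**5 / np.expm1(h * c / (lam * kb * temperature_k))
    return per_metre * 1e-6  # per metre -> per micrometre

def lorentzian_ils(lam_grid, lam_center, fwhm):
    """Lorentzian lineshape g(lam, lam_s), normalized to unit sum on the grid."""
    hwhm = 0.5 * fwhm
    g = hwhm / np.pi / ((lam_grid - lam_center) ** 2 + hwhm**2)
    return g / g.sum()

def band_transmission(rho, kappa_hi, lam_grid, band_centers, fwhm=0.08):
    """Eq. (44): lineshape-averaged plume transmission at each sensor band."""
    tau_hi = np.exp(-rho * kappa_hi)  # high-resolution transmission
    return np.array([np.sum(tau_hi * lorentzian_ils(lam_grid, lc, fwhm))
                     for lc in band_centers])

def augment(x_meas, x0_hat, tau_p, l_air):
    """Eq. (43): x_p = [1 - tau_p] * L_a + tau_p * x0_hat + e_hat (element-wise)."""
    e_hat = x_meas - x0_hat  # residual that carries the measured noise
    return (1.0 - tau_p) * l_air + tau_p * x0_hat + e_hat

# Toy inputs (all assumptions, chosen only to exercise the functions).
lam_grid = np.linspace(7.5, 11.5, 4001)
band_centers = np.linspace(8.0, 11.0, 20)
kappa_hi = (1.0 / 197.0) * np.exp(-0.5 * ((lam_grid - 8.42) / 0.05) ** 2)  # per ppmv-m
rng = np.random.default_rng(0)
l_air = planck_radiance(band_centers, 298.0)
x0_hat = planck_radiance(band_centers, 302.0)
x_meas = x0_hat + 0.01 * rng.standard_normal(band_centers.size)

tau_p = band_transmission(197.0, kappa_hi, lam_grid, band_centers)
x_p = augment(x_meas, x0_hat, tau_p, l_air)
```

Because the residual is added back unchanged, setting the transmission to unity returns the measured spectrum and setting it to zero returns a noisy blackbody at the air temperature, which are the two limits discussed above.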

The AIRIS-WAD datacube was segmented into four 64×256 pixel horizontal quadrants

for processing. The motivation for segmenting the data was twofold: 1) The AIRIS-WAD FPA

has four separate readouts, each with somewhat different noise characteristics and 2) division

into four quadrants has the effect of partitioning the data into spectrally-similar subsets. The

latter effect improves the fidelity of the infrared background model. The covariance matrix of

each quadrant was calculated using a statistically-robust, Huber-type M-estimator [29]. The

method used to determine the background subspace dimensionality is described in Appendix A.

For the datacube depicted in Figure 2, the dimensionality of the quadrants ranged from m=4 to

m=6. We exercised Eq. (43) to create synthetic datacubes with column density ranging from 0 to

591 ppmv-m (0 to 2.48 g/m²). For reference, 197 ppmv-m corresponds to OD=1.0 (base e) at

8.42 µm, the wavelength of strongest absorption in the R-134a spectrum, and the synthetic

plumes varied from OD=0.0 to OD=3.0. Figure 4 shows the calculated transmission for

20 ppmv-m, 197 ppmv-m, and 591 ppmv-m R-134a plumes, i.e., plumes with peak optical

densities of 0.10, 1.0, and 3.0. The plume temperature was set equal to the local air temperature,

298.0 K, for this simulation. Synthetic plumes were added to 64 pixel (horizontal) × 5 pixel

(vertical) regions in the scene and the column density was the same at each pixel where the

plume signature was added.

4. Results and Discussion

4.1. General

In this Section we compare the results obtained by applying the Gauss-Newton solver for

detection and column density estimation with those obtained using the linear model given by

Eq. (30). Prior to applying the algorithms to the plume-augmented AIRIS-WAD data we

verified that both the Gauss-Newton and clutter-matched filter/ACE algorithms were properly

implemented by processing purely synthetic data with additive Gaussian noise. The plume-

augmented AIRIS-WAD data was processed on a quadrant-by-quadrant basis and the Huber-type

M-estimator was used to calculate the background covariance matrix of each quadrant. The

motivation for using the M-estimator is that using a standard covariance calculation, i.e., giving

equal weight to all spectra in a sample, results in an erroneous estimate of the background

covariance when a plume is present in the sample. The M-estimator de-weights the contribution

of statistically-anomalous spectra. Pixels where the plume signature is statistically-significant

generally constitute statistical anomalies so the M-estimator generally provides a more accurate

estimate of the true background covariance matrix.
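
The de-weighting idea can be illustrated with the fixed-point iteration commonly used for the scatter estimator of Ref. [29]. The sketch below is an assumption-laden illustration rather than the implementation used in this work: the function name `tyler_scatter` is hypothetical, the data are centered on the band-wise median, and the result is normalized to a trace equal to the number of bands, so only the shape of the covariance is recovered and its overall scale must be set separately.

```python
import numpy as np

def tyler_scatter(samples, n_iter=100, tol=1e-8):
    """Fixed-point iteration for a Tyler-type M-estimator of scatter.

    samples: (n, k) array of spectra. Returns a (k, k) matrix with trace k.
    """
    n, k = samples.shape
    centered = samples - np.median(samples, axis=0)  # robust centering (assumption)
    sigma = np.eye(k)
    for _ in range(n_iter):
        # Squared Mahalanobis-type distance of every spectrum from the center.
        d2 = np.einsum("ij,jk,ik->i", centered, np.linalg.inv(sigma), centered)
        weights = 1.0 / np.maximum(d2, 1e-12)  # anomalous spectra get small weights
        new_sigma = (k / n) * (centered * weights[:, None]).T @ centered
        new_sigma *= k / np.trace(new_sigma)   # fix the arbitrary scale
        converged = np.linalg.norm(new_sigma - sigma) < tol * np.linalg.norm(sigma)
        sigma = new_sigma
        if converged:
            break
    return sigma

# Toy sample: 500 four-band "spectra", 5% of which carry a strong anomaly in band 0.
rng = np.random.default_rng(0)
spectra = rng.standard_normal((500, 4)) * np.sqrt([1.0, 2.0, 3.0, 4.0])
spectra[:25, 0] += 50.0
sigma_robust = tyler_scatter(spectra)
```

Each spectrum contributes to the estimate in inverse proportion to its squared distance from the bulk of the data, so a small population of plume pixels has bounded influence on the result.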

It is instructive to evaluate the scene in Figure 2 and identify regions where the

conditions are favorable for detection and, conversely, where they are not favorable for

detection. Figure 5 shows the effective thermal contrast as a function of elevation. The effective

thermal contrast was calculated for each pixel in the scene using Eq. (39) and Figure 5 shows the

median for each row. The effective thermal contrast is ~0 K near the horizon (Row ~ 128) and

one therefore expects detection statistics to be unfavorable in that region. The effective thermal

contrast increases with elevation angle above the horizon and one expects detection statistics to

be favorable above Row ~150 where $\Delta T_{\mathrm{eff}}$ > 5 K. (The downward deviations in the vicinity of

Rows 170 and 200 are due to clouds at those elevations.) These qualitative characterizations are

addressed quantitatively in Figure 6, which shows uncertainty in the estimated column density as

a function of elevation angle. The uncertainty was calculated for each pixel using Eq. (38) and

Figure 6 shows the median for each row. We note that the modest discontinuities at Rows 64

and 192 are associated with boundaries between data quadrants. The discontinuity at Row 128 is

due primarily to the transition from ground to low-angle sky background. The fact that it is also a

quadrant boundary is a coincidence and is a minor contribution to the observed discontinuity.

In the interest of comparing algorithm performance in favorable and unfavorable

detection regions, we present results obtained by processing data with synthetic plumes added to

the regions shown in Figure 7. The effective thermal contrast in Region 1 is 2.6 ± 0.5 K and the

effective contrast in Region 2 is 5.9 ± 0.6 K. (The variation is the 1σ standard deviation in ∆T

over the plume region, not the uncertainty in ∆T.)

4.2. Column Density Estimation

In order to evaluate the accuracy and precision of the Gauss-Newton and linear model estimates

of peak optical density, we calculate the median of the estimated values in the plume region,

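$\mathrm{median}\{\hat{\alpha}_i\}$, and the normalized median absolute deviation, a statistically-robust analogue of the

sample standard deviation,

$$\hat{\sigma}_{\hat{\alpha}} = \frac{\mathrm{median}\{|\hat{\alpha}_i - \mathrm{median}\{\hat{\alpha}_i\}|\}}{0.6745} \qquad (45)$$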

The $\mathrm{median}\{\hat{\alpha}_i\}$ and $\hat{\sigma}_{\hat{\alpha}}$ are very nearly equal to the sample mean and standard deviation when

sample values are normally distributed; however, unlike the mean and standard deviation,

$\mathrm{median}\{\hat{\alpha}_i\}$ and $\hat{\sigma}_{\hat{\alpha}}$ are insensitive to low occurrence, highly anomalous values in the sample.

We report $\mathrm{median}\{\hat{\alpha}_i\}$ and $\hat{\sigma}_{\hat{\alpha}}$ rather than the mean and standard deviation because we found

that $\mathrm{median}\{\hat{\alpha}_i\} \pm 2\hat{\sigma}_{\hat{\alpha}}$ generally yields a more accurate estimate of the range which incorporates

95% of the sample values than does $\mathrm{mean}\{\hat{\alpha}_i\} \pm 2\sigma$. For the results we evaluated, we observed

that the mean and median $\hat{\alpha}$ values were in excellent agreement and that $\hat{\sigma}_{\hat{\alpha}}$ was typically 10-15%

lower than the standard deviation.
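
The robustness of these statistics is easy to demonstrate numerically. The following sketch implements Eq. (45) directly; the function name `normalized_mad` and the sample values are illustrative assumptions, not data from this study.

```python
import numpy as np

def normalized_mad(values):
    """Normalized median absolute deviation, Eq. (45)."""
    values = np.asarray(values, dtype=float)
    med = np.median(values)
    return np.median(np.abs(values - med)) / 0.6745

# Five estimates of peak OD, one of which is grossly anomalous.
alpha_hat = np.array([0.9, 1.0, 1.1, 1.0, 25.0])
robust_center = np.median(alpha_hat)       # unaffected by the single outlier
robust_spread = normalized_mad(alpha_hat)  # about 0.1 / 0.6745
classical_spread = np.std(alpha_hat)       # dominated by the single outlier

# For normally distributed values the statistic approaches the standard deviation.
gaussian_spread = normalized_mad(np.random.default_rng(0).standard_normal(100_000))
```

The single anomalous value inflates the sample standard deviation by roughly two orders of magnitude but leaves the median and the normalized median absolute deviation essentially unchanged.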

Figures 8 and 9 show the median R-134a optical densities estimated using the

Gauss-Newton solver and the clutter-matched filter. Figure 8 depicts the optical densities

estimated in Region 1 and Figure 9 depicts the optical densities estimated in Region 2. The error

bars in each figure correspond to $\pm 1\hat{\sigma}_{\hat{\alpha}}$. Optical densities may be converted to column densities

by multiplying by $\rho_0$, 197 ppmv-m for R-134a. The solid symbols indicate the median OD

estimated using the Gauss-Newton algorithm. The crossed open symbols indicate the median

OD estimated using the clutter-matched filter. The black dashed line indicates perfect agreement

between the actual and estimated OD values. As expected, the uncertainty in estimated OD is

smaller in Region 2 where the thermal contrast is greater. The variation in estimated α values

calculated using Eq. (45) is in reasonable agreement with the predictions of Eqs. (33) and (38).

Eqs. (33) and (38) generally overestimated $\hat{\sigma}_{\hat{\alpha}}$ by 30-70%. Results of simulations run using

purely synthetic data with additive Gaussian noise suggest that the discrepancy is due to the fact

that the noise in the test data is not precisely normally-distributed.

The Gauss-Newton algorithm provides a more accurate estimate of column density than

the linear model in all cases. As expected, the accuracy of the clutter-matched filter estimate

degrades with increasing optical density. The systematic deviation of the Gauss-Newton-

estimated optical densities from the “Ideal” line is due entirely to the approximation in Eq. (10).

When the data is fit using the effective absorption spectrum for the appropriate $\rho$,

$\boldsymbol{\kappa}_e(\rho) = -\ln[\boldsymbol{\tau}_p(\rho)]/\rho$, rather than $\boldsymbol{\kappa}$, then all $\mathrm{median}\{\hat{\alpha}_i\}$ values fall on the “Ideal” line.

Figure 10 shows $\kappa(\lambda)$, the effective reference spectrum for OD=0.0, as well as $\kappa_e(\lambda)$ for

OD=1.0 (197 ppmv-m) and OD=3.0 (591 ppmv-m). The effective peak cross-section decreases

by 9% from OD=0.0 to OD=1.0 and decreases by 29% from OD=0.0 to OD=3.0. The fact that

the Gauss-Newton algorithm systematically underestimates α values as the OD increases is

consistent with the observed reduction in maximum effective absorption cross-section. The

observed systematic underestimation of the OD using the Gauss-Newton algorithm suggests the

true OD may be recovered by applying an OD-dependent correction factor to the estimated

value. In principle, a correction factor could also be applied to the clutter-matched filter OD

estimates; however, the fact that those OD values appear to approach a maximum value suggests

that applying a correction factor would be problematic.
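
The origin of the OD-dependent bias can be reproduced with a toy calculation of the effective absorption coefficient, $\kappa_e(\rho) = -\ln[\tau_p(\rho)]/\rho$. The sketch below is illustrative only: it assumes a single Gaussian absorption line that is narrower than a 0.08 µm FWHM Lorentzian lineshape, with $\rho$ expressed in units where the high-resolution peak optical density equals $\rho$. It does not use the R-134a reference spectrum and is not expected to reproduce the 9% and 29% reductions quoted above.

```python
import numpy as np

lam = np.linspace(8.0, 9.0, 2001)                  # wavelength grid, um
kappa = np.exp(-0.5 * ((lam - 8.42) / 0.02) ** 2)  # toy line, unit peak (assumption)
hwhm = 0.04                                        # half of the 0.08 um FWHM
ils = hwhm / np.pi / ((lam - 8.42) ** 2 + hwhm**2) # Lorentzian lineshape at 8.42 um
ils /= ils.sum()                                   # normalize on the grid

def kappa_eff(rho):
    """Effective band-averaged absorption coefficient at column density rho."""
    tau_band = np.sum(np.exp(-rho * kappa) * ils)  # lineshape-averaged transmission
    return -np.log(tau_band) / rho

kappa_thin = np.sum(kappa * ils)  # optically-thin limit of kappa_eff
ratio_od1 = kappa_eff(1.0) / kappa_thin
ratio_od3 = kappa_eff(3.0) / kappa_thin
```

Because the exponential is averaged over the lineshape before the logarithm is taken, the effective coefficient can only decrease as the column density grows, which is the direction of the bias seen in the Gauss-Newton estimates.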

4.3. Plume Detection Statistics

The standard performance metric for detection applications is the receiver operating characteristic

(ROC) curve. Traditionally, a ROC curve is a plot of the probability of detection, PD, versus the

probability of false alarm, PFA, where the (PD, PFA) points which constitute the curve are

calculated by varying the detection threshold from lowest to highest value. We use the terms

“detection rate” (DR) and “false alarm rate” (FAR) here rather than PD and PFA because our

results are data-derived rather than based on theoretical calculations of PD and PFA. We construct

ROC curves as follows:

1. The $n_p$ F-values (or ACE values) in the plume region are put in descending order, so that $i$ plume-region values are greater than or equal to $F_i$:

$F_1, F_2, \ldots, F_{n_p}$.

2. The DR for each F-value (or ACE value) is calculated as $(i - 1/2)/n_p$, where $i$ is the

F-value’s index, $i = 1, 2, \ldots, n_p$. The range of DR is $(2n_p)^{-1} \le DR \le 1 - (2n_p)^{-1}$.

3. The FAR corresponding to each DR is calculated by determining the number of F-values

(or ACE values) in the off-plume region which are greater than or equal to $F_i$. The

number of values greater than or equal to $F_i$ is $n_{FA,i}$. For $n_{FA,i} \ge 1$, the false alarm rate is calculated as

$FAR_i = (n_{FA,i} - 1/2)/n_b$, where $n_b$ is the number of pixels in the off-plume region. If

$n_{FA,i} = 0$, then we consider the (DR, FAR) point not to exist, so

$(2n_b)^{-1} \le FAR \le 1 - (2n_b)^{-1}$.

Subtracting ½ from $i$ and $n_{FA,i}$ when calculating DR and FAR is a convention for probability

plotting which facilitates comparison with model distribution functions.
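
The three steps can be sketched compactly in Python. The sketch is illustrative only: the function name `empirical_roc` is hypothetical, and it sorts the plume-region scores from highest to lowest so that $(i - 1/2)/n_p$ is the fraction of plume pixels at or above the threshold $F_i$.

```python
import numpy as np

def empirical_roc(plume_scores, background_scores):
    """Return (DR, FAR) arrays for thresholds set at each plume-region score."""
    f = np.sort(np.asarray(plume_scores, dtype=float))[::-1]  # highest score first
    bg = np.sort(np.asarray(background_scores, dtype=float))  # ascending, for searchsorted
    n_p, n_b = f.size, bg.size
    dr = (np.arange(1, n_p + 1) - 0.5) / n_p
    # Number of off-plume scores greater than or equal to each threshold F_i.
    n_fa = n_b - np.searchsorted(bg, f, side="left")
    keep = n_fa >= 1  # points with no false alarms are treated as non-existent
    far = (n_fa[keep] - 0.5) / n_b
    return dr[keep], far

# Small worked example with three plume pixels and four off-plume pixels.
dr, far = empirical_roc([3.0, 5.0, 4.0], [1.0, 2.0, 3.5, 6.0])
```

For this example the thresholds are 5, 4, and 3, which are met or exceeded by 1, 1, and 2 off-plume values respectively, so the detection rates are 1/6, 1/2, and 5/6 and the false alarm rates are 0.125, 0.125, and 0.375.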

Figures 11-14 show the ROC curves calculated for Regions 1 and 2 augmented with

OD=0.1, 0.3, 1.0, and 2.0 plumes. The solid symbols correspond to points resulting from

application of the Gauss-Newton algorithm. The crossed open symbols correspond to points

resulting from application of the ACE algorithm. As the plume is optically-thin at OD=0.1 and 0.3

the two algorithms generate nearly identical ROC curves. We note that the ROC curves in

Figure 11, OD=0.1, are all indicative of unfavorable plume detection statistics. Setting the

detection threshold to achieve an 80% detection rate in Region 2 would result in ~20% false

positive rate in the rest of the scene while setting the detection threshold to achieve an 80% detection

rate in Region 1 would result in ~50% false positive rate. The ROC curves in Figure 12,

OD=0.3, show somewhat more favorable detection statistics, particularly in Region 2. Setting

the detection threshold to achieve an 80% detection rate in Region 2 would result in ~0.3% false

positive rate in the rest of the scene and setting the detection threshold to achieve an 80% detection rate

in Region 1 would result in ~30% false positive rate. Some separation between the

Gauss-Newton and ACE curves is observed for OD=0.3 but the differences are modest. For both

the OD=0.1 and OD=0.3 plumes the curves for Region 2 are significantly more favorable than

those for Region 1 because the effective thermal contrast is >2x larger in Region 2 than in Region 1.

The ROC curves in Figure 13, OD=1.0, show significantly better detection statistics for

the Gauss-Newton algorithm than the ACE algorithm in both Region 1 and Region 2. The

$\exp[-\alpha\mathbf{s}] \approx \mathbf{1} - \alpha\mathbf{s}$ approximation for the plume transmission is significantly less accurate than at

OD=0.1 and 0.3, so it is expected that the nonlinear estimator will outperform ACE-based

detection. As was true at lower OD values, the detection statistics are significantly more

favorable in the region of higher thermal contrast. In Region 2, >95% detection rate is achieved

with a false positive rate of 1×10⁻⁵ using the Gauss-Newton algorithm while the false positive rate is

approximately four orders of magnitude higher for the same detection rate in Region 1.

Similarly, for ACE-based detection, the false alarm rate for a detection rate of 80% is

approximately 350x lower in Region 2 than it is in Region 1. Comparing the ROC curves for

Gauss-Newton detection in Region 1 with ACE-based detection in Region 2, while the

degradation introduced by the $\exp[-\alpha\mathbf{s}] \approx \mathbf{1} - \alpha\mathbf{s}$ approximation is apparent, it is less significant

than the effect of enhanced thermal contrast in going from Region 1 to Region 2. Although the

linear model which underlies the ACE detector is not precisely matched to the data, the ROC

curve obtained by applying ACE to Region 2 is still more favorable than the ROC curve obtained

by applying the Gauss-Newton algorithm to Region 1.
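
The size of the linearization error at the wavelength of strongest absorption is simple arithmetic and helps explain the trend across Figures 11-14. The short sketch below tabulates the exact and linearized transmission at the peak optical densities considered here; the variable names are illustrative.

```python
import numpy as np

peak_od = np.array([0.1, 0.3, 1.0, 2.0])
exact = np.exp(-peak_od)    # exp[-alpha * s] evaluated at the band where s = 1
linear = 1.0 - peak_od      # optically-thin approximation at the same band
abs_error = exact - linear  # transmission error introduced by the linear model

for od, t, lin, err in zip(peak_od, exact, linear, abs_error):
    print(f"OD={od:3.1f}  exp(-OD)={t:6.3f}  1-OD={lin:6.3f}  error={err:6.3f}")
```

The error in transmission is about 0.005 at OD=0.1 and 0.041 at OD=0.3, grows to about 0.37 at OD=1.0, and at OD=2.0 the linear model predicts a non-physical transmission of -1.0 where the exact value is about 0.135.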

The ROC curves in Figure 14, OD=2.0, show an even greater improvement in detection

statistics for the Gauss-Newton algorithm relative to the ACE algorithm as $\exp[-\alpha\mathbf{s}] \approx \mathbf{1} - \alpha\mathbf{s}$ is

a poor approximation for the plume transmission near the wavelengths of strongest absorption.

Comparing the ROC curve for Gauss-Newton detection in Region 1 with ACE-based detection

in Region 2, the degradation introduced by the $\exp[-\alpha\mathbf{s}] \approx \mathbf{1} - \alpha\mathbf{s}$ approximation is more

significant than the effect of enhanced thermal contrast in going from Region 1 to Region 2.

With the OD increased to 2.0, the detection statistics in Region 1 become relatively favorable

and for a detection rate of 80% the Gauss-Newton algorithm reduces the false positive rate by a

factor of ~15 relative to ACE-based detection. The reduction in false alarm rate is even more

pronounced in Region 2. One can also examine the differences in detection rate for a fixed false

alarm rate. For FAR=1×10⁻⁴, use of the Gauss-Newton algorithm increases the detection rate

from ~10% to nearly 80% in Region 1. For FAR=1×10⁻⁵, use of the Gauss-Newton algorithm

increases the detection rate from ~70% to ~99% in Region 2.

The degradation of the ACE ROC curves with increasing OD can be understood by

examining the deviation between the test data and the model spectra calculated using the

maximum likelihood model parameters. Figure 15 shows spectra of pixels from Region 1 along

with model spectra calculated using the maximum likelihood values of the model parameters.

The original AIRIS-WAD spectrum is denoted by crossed open squares and the dashed line

shows the corresponding model spectrum. (The difference between the best fits to the plume-

free spectrum calculated using the Gauss-Newton and linear models is not observable on the

scale of the graph.) For comparison, the solid squares denote the pixel spectrum after

augmentation with an OD=3.0 R-134a plume. The solid line shows the spectrum calculated

using the maximum likelihood parameters estimated using the Gauss-Newton algorithm and the

dashed line shows the spectrum calculated using the maximum likelihood parameters estimated

assuming the linear model, Eq. (30). While the differences appear modest in comparison to the

range of spectral radiance values, the deviations between the data and the model spectra are

systematic and readily discernable.

Figure 16 shows the root-mean-squared (rms) residuals between the data and the model

spectra in Region 1. For reference, the open diamonds show the rms deviation between the data

and the model for no plume signature added to the region. The solid squares indicate the rms

residuals for the linear model applied to Region 1 data augmented with an OD=3.0 (591 ppmv-m)

plume. The rms deviation is ~3x greater between 8.4 and 9.0 µm and ~6x greater near 10.3 µm

than it is for the plume-free spectra. The wavelengths where the rms deviation increases the

most correspond to wavelengths of strongest absorption, indicating a shortcoming in the fit

model. For comparison, the crossed open squares indicate the rms deviation for the Gauss-

Newton algorithm applied to the same data. While the rms deviation is slightly larger between

8.4 and 9.0 µm and near 10.3 µm as a result of the difference between $\boldsymbol{\kappa}$ used for fitting and

$\boldsymbol{\kappa}_e(\rho)$ used to create the test data, the residuals are in good agreement with rms deviations

calculated for the plume-free data. This is the expected result when the signal model is

consistent with the data.

4.4. Algorithm Convergence

As the Gauss-Newton algorithm is iterative, it is necessary to define a termination criterion. Our

termination criterion is the fractional change in the cost function given by Eq. (19). The

algorithm terminates when the fractional change falls below a user-specified value, ε:

$$0 \le 1 - \frac{C_{i+1}}{C_i} \le \varepsilon \qquad (46)$$

The results presented in the preceding section were obtained using ε=0.01. In the event that the

cost function increases from the i-th to (i+1)-th iteration the parameter values are restored to

those from the i-th iteration and the algorithm terminates. Figure 17 shows the number of iterations

to convergence for Region 2 augmented with OD=1.0, 2.0, and 3.0 plumes as well as the number

of iterations observed when no plume was added. For reference, the solid bars show the number

of iterations required for convergence in the plume-free region of the scene. Iteration zero

corresponds to the initial guess at the maximum likelihood model parameters. The algorithm

terminated after one iteration for ~55% of the plume-free pixels, i.e., the parameters were

updated but the cost function did not change significantly from the initial guess, and the

algorithm terminated after the second iteration for ~45% of the plume-free pixels. When applied

to the original Region 2 data, the algorithm terminated after one iteration approximately half of

the time and after the second iteration the other half of the time, in good agreement with the

plume-free region of the scene. The number of iterations required for convergence increases as

the plume OD increases. For spectra augmented with an OD=1.0 plume, all pixels required at

least two iterations to converge; ~80% of the pixels required two iterations and ~20% required

three iterations. When the plume OD is increased to 2.0, the algorithm required three iterations to

converge for almost all pixels. A similar result was obtained for OD=3.0.
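
The termination logic can be illustrated with a generic Gauss-Newton loop. The sketch below is a simplified stand-in rather than the estimator developed in this paper: it assumes an unweighted sum-of-squares cost in place of the cost function of Eq. (19), a hypothetical one-parameter model $\exp[-\alpha s]$, and hypothetical function names. Only the stopping rule of Eq. (46) and the restore-and-terminate behavior are taken from the text.

```python
import numpy as np

def gauss_newton(residual_fn, jacobian_fn, theta0, eps=0.01, max_iter=20):
    """Minimize r(theta)^T r(theta); stop when 1 - C_{i+1}/C_i <= eps."""
    theta = np.asarray(theta0, dtype=float)
    cost = float(np.sum(residual_fn(theta) ** 2))
    for iteration in range(1, max_iter + 1):
        r = residual_fn(theta)
        jac = jacobian_fn(theta)
        step, *_ = np.linalg.lstsq(jac, -r, rcond=None)  # Gauss-Newton update
        trial = theta + step
        trial_cost = float(np.sum(residual_fn(trial) ** 2))
        if trial_cost > cost:
            return theta, cost, iteration  # keep the previous parameters and stop
        frac_change = 1.0 - trial_cost / cost if cost > 0.0 else 0.0
        theta, cost = trial, trial_cost
        if frac_change <= eps:
            break
    return theta, cost, iteration

# Toy problem: recover alpha from noisy samples of exp(-alpha * s).
rng = np.random.default_rng(1)
s = np.linspace(0.1, 1.0, 20)
alpha_true = 2.0
data = np.exp(-alpha_true * s) + 0.01 * rng.standard_normal(s.size)

def residual(theta):
    return np.exp(-theta[0] * s) - data

def jacobian(theta):
    return (-s * np.exp(-theta[0] * s))[:, None]

initial_cost = float(np.sum(residual(np.array([0.5])) ** 2))
theta_hat, final_cost, n_iter = gauss_newton(residual, jacobian, [0.5], eps=0.01)
```

As in the results above, the iteration count returned by such a loop depends on how far the initial guess lies from the solution: the further the model must move, the more iterations pass before the fractional change in cost drops below the threshold.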

Reducing ε to 0.0001 increased the number of iterations (from 3 to 4 for the

OD=3.0 plume); however, there was no significant effect on the estimated column density values.

The observed differences at algorithm termination were on the order of 0.1% of the estimated

values. For comparison, the uncertainties in the estimated values were typically several orders of

magnitude larger than the differences observed by making the convergence threshold more

stringent.

5. Summary and Conclusions

We have presented a nonlinear optimal estimation method for detecting and characterizing

chemical vapor fugitive emissions in a non-scattering atmosphere using passively-sensed LWIR

spectra. The method integrates a parameterized signal model based on the RTE with a

parameterized representation of covariance of the infrared background to create a probability-

based cost function. The maximum likelihood model parameters are defined as those which

minimize the cost function and are estimated using a Gauss-Newton algorithm. The algorithm

formulation presented here presumes that the plume and air are in thermal equilibrium and that

the air temperature is known; however, the algorithm may be easily modified to handle

scenarios where the air temperature is not known.

For algorithm performance evaluation we simulated observation of fugitive emissions by

augmenting plume-free spectra measured by an AIRIS-WAD sensor with synthetic

R-134a plume signatures. The peak optical density of the synthetic plumes varied from

OD=0.0 to OD=3.0. Results obtained by processing the simulated data indicate that the

nonlinear estimator provides significantly more accurate estimates of chemical vapor column

density and significantly more favorable detection statistics than matched-filter-based estimation

when the vapor plume is optically-thick at one or more of the sensor observation wavelengths.

This is because the signal model used for nonlinear estimation is based on the full clear air RTE,

not an approximation which follows from the presumption of an optically-thin plume as do the

clutter-matched filter and adaptive subspace detector. Finite instrument resolution introduced

systematic error in column densities estimated using the Gauss-Newton algorithm but the effect

was only significant for optical densities >>1.0 and the error is much smaller than that associated

with clutter-matched filter estimates. For example, the Gauss-Newton algorithm underestimated

column density by 17% at OD=3.0 whereas the clutter-matched filter underestimated it by ~70%.

We note that while the nonlinear estimator provides significantly better results for optically-thick

plumes, it produces the same result as a clutter-matched filter/adaptive subspace detector as the

plume optical density approaches zero.

The uncertainty in the column density estimated using the Gauss-Newton algorithm may

be calculated from the Fisher information matrix which follows from the cost function. We

observe that the uncertainties derived from the Fisher information matrix are typically 30-70%

larger than the standard deviation of column densities estimated by processing the simulated

sensor data. The discrepancy appears to be the result of non-Gaussian noise in the originally-

measured plume-free spectra. In future implementations of the algorithm we plan to enable

estimation of air temperature, plume temperature, and atmospheric transmission effects as well

as implement robust estimators to mitigate the effect of non-Gaussian noise on estimates of

maximum likelihood model parameters.

6. Acknowledgement

This work was supported in part through contract no. HDTRA1-07-C-0067 with the Defense

Threat Reduction Agency.

7. References

1. C. D. Rodgers, “Retrieval of atmospheric temperature and composition from remote

measurements of thermal radiation,” Rev. Geophys. Space Phys., 14, (4), 609-624 (1976).

2. J. R. Eyre, “Inversion of cloudy satellite sounding radiances by nonlinear optimal estimation.

I: Theory and simulation for TOVS,” Q. J. R. Meteorol. Soc., 115, 1001-1026 (1989).

3. W. L. Smith, H. M. Woolf, and H. E. Revercomb, “Linear simultaneous solution for

temperature and absorbing constituent profiles from radiance spectra,” Appl. Opt., 30, (9),

1117-1123 (1991).

4. X. L. Ma, T. J. Schmit, and W. L. Smith, “A nonlinear physical retrieval algorithm – its

application to the GOES-8/9 Sounder,” J. Appl. Meteorol., 38, 501-513 (1999).

5. X.L. Ma, Z. Wan, C.C. Moeller, W.P. Menzel, L.E. Gumley, and Y. Zhang, “Retrieval of

geophysical parameters from moderate resolution imaging spectroradiometer thermal

infrared data: evaluation of a two-step physical algorithm,” Appl. Opt., 39, (20), 3537-3550

(2000).

6. T. Steck and T. von Clarmann, “Constrained profile retrieval applied to the observation mode

of the Michelson interferometer for passive atmospheric sounding,” Appl. Opt., 40, (21),

3559-3571 (2001).

7. S. W. Seeman, J. Li, W. P. Menzel, and L. E. Gumley, “Operational retrieval of atmospheric

temperature, moisture, and ozone from MODIS infrared radiances,” J. Appl. Meteorol., 42,

1072-1091 (2003).

8. A. Hayden, E. Niple, and B. Boyce, “Determination of trace-gas amounts in plumes by the

use of orthogonal digital filtering of thermal-emission spectra,” Appl. Opt., 35, (30), 6090-

6098 (1996).

9. C. C. Funk, J. Theiler, D. A. Roberts, and C. C. Borel, “Clustering to improve matched filter

detection of weak gas plumes in hyperspectral thermal imagery,” IEEE Trans. Geosci.

Remote Sensing, 39, (7), 1410-1419 (2001).

10. N. B. Gallagher, B. M. Wise, and D. M. Sheen, “Estimation of trace concentration-pathlength

in plumes for remote sensing applications from hyperspectral images,” Analytica Chimica

Acta, 490, 139-152 (2003).

11. E. M. O’Donnell, D. W. Messinger, C. Salvaggio, and J. R. Schott, “Identification and

detection of gaseous effluents from hyperspectral imagery using invariant algorithms,” in

Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery X,

Proc. SPIE 5425, 573-582 (2004).

12. D. Manolakis and F. M. D’Amico, “A taxonomy of algorithms for chemical vapor detection

with hyperspectral imaging spectroscopy,” in Chemical and Biological Sensing VI, P.J.

Gardner, ed., Proc. SPIE, 5795, 125-133 (2005).

13. A. Vallières, A. Villemaire, M. Chamberland, L. Belhumeur, V. Farley, J. Giroux, and J.-F. Legault,

“Algorithms for chemical detection, identification and quantification for thermal

hyperspectral imagers,” in Chemical and Biological Standoff Detection III, J. O. Jensen and

J.-M. Thériault, eds., Proc. SPIE, 5995, 59950G-1 (2005).

14. R. M. Goody and Y. L. Yung, Atmospheric Radiation: Theoretical Basis (Oxford University

Press, 1989), Ch. 2, p. 46.

15. C. D. Rodgers, Inverse Methods for Atmospheric Sounding: Theory and Practice (World

Scientific, 2000), Ch. 2, p. 30.

16. M. L. Polak, J. L. Hall, and K. C. Herr, “Passive Fourier-transform infrared spectroscopy of

chemical plumes: an algorithm for quantitative interpretation and real-time background

removal,” Appl. Opt., 34, (24), 5406-5412 (1995).

17. D. Flanigan, “Prediction of the limits of detection of hazardous vapors by passive infrared

with the use of MODTRAN,” Appl. Opt., 35, (30), 6090-6098 (1996).

18. R. Harrig, “Passive remote sensing of pollutant clouds by Fourier-transform infrared

spectroscopy: signal-to-noise ratio as a function of spectral resolution,” Appl. Opt., 43, (23),

4603-4610 (2004).

19. S. A. Clough, M. J. Iacono, and J.-L. Moncet, “Line-by-line calculations of atmospheric

fluxes and cooling rates,” J. Geophys. Res., 97, (D14), 15761-15785 (1992).

20. D. M. Sheen, N. B. Gallagher, S. W. Sharpe, K. K. Anderson, and J. F. Shultz, “Impact of

background and atmospheric variability on infrared hyperspectral chemical detection

sensitivity,” in Algorithms and Technologies for Multispectral, Hyperspectral, and

Ultraspectral Imagery IX, Sylvia S. Shen, Paul E. Lewis, Eds., Proc. SPIE, 5093, 218-229

(2003).

21. R. A. Johnson and D. W. Wichern, Applied Multivariate Statistical Analysis, Fifth Ed.,

(Prentice Hall, 2002), Ch. 9, pp. 477-532.

22. M. E. Tipping and C. M. Bishop, “Probabilistic Principal Components Analysis,” J. R.

Statist. Soc. B, 61, (3), 611-622 (1999).

23. S. Kay, Fundamentals of Statistical Signal Processing: Volume 1, Estimation Theory

(Prentice-Hall, 1993), p. 40.

24. S. Kraut and L. L. Scharf, “The CFAR adaptive subspace detector is a scale invariant

GLRT,” IEEE Trans. Sig. Proc., 47, (9), 2538-2541 (1999).

25. S. Kraut, L. L. Scharf, and L. T. McWhorter, “Adaptive Subspace Detectors,” IEEE Trans.

Sig. Proc., 49, (1), 1-16 (2001).

26. W. J. Marinelli, C. M. Gittins, B. R. Cosofret, T. E. Ustun, and J. O. Jensen, “Development

of the AIRIS-WAD multispectral sensor for airborne standoff chemical agent and toxic

industrial chemical detection,” Proc. of the Meetings of the Mil. Sens. Symp. Specialty

Groups on Passive Sensors; Camouflage, Concealment, and Deception; Detectors; and

Materials, Charleston, SC, Feb. 2005. Available through Defense Technical Information

Center (DTIC), document ref. no. ADA444225.

27. W. J. Marinelli, C. M. Gittins, A. H. Gelb, and B. D. Green, “Tunable Fabry-Perot etalon-based

long-wavelength infrared imaging spectrometer,” Appl. Opt., 38, (12), 2594-2604

(1999).

28. S. W. Sharpe, T. J. Johnson, R. L. Sams, P. M. Chu, G. C. Roderick, and P. A. Johnson,

“Gas-phase databases for quantitative infrared spectrometry,” Appl. Spectrosc., 58, 1452-

1461 (2004); DOE/PNNL Infrared Spectral Library Release 7.4, May 2004.

29. D. E. Tyler, “A distribution-free M-estimator of multivariate scatter,” The Annals of

Statistics, 15, (1), 234-251 (1987).

30. H. Cox and R. Pitre, “Robust DMR and multi-rate adaptive beamforming,” in Proc. Asilomar

Conf. Signals, Syst., Comput., Pacific Grove, CA, pp. 920–924, Nov. 1997.

31. M. Wax and T. Kailath, “Detection of signals by information theoretic criteria,” IEEE Trans.

on Acoustics, Speech, and Sig. Proc., 33, (2), pp. 387-392 (1985).

Appendix A. Infrared Background Model

In this work infrared background spectra, i.e., spectra corresponding to sensor views

where no chemical vapor is present, are accounted for using a linear mixing model:

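$$\mathbf{x} = \boldsymbol{\mu} + \mathbf{B}\boldsymbol{\beta} \qquad (A.1)$$

where $\boldsymbol{\mu}$ is the mean background spectrum, $\mathbf{B}$ is the $k \times m$ dimensional matrix whose columns

are the basis vectors used to span the data space and $\boldsymbol{\beta}$ is an $m \times 1$ vector of weight coefficients;

$m < k$. Measured spectra are presumed to be subject to additive Gaussian noise:

$$\tilde{\mathbf{x}} = \mathbf{x} + \mathbf{e} \qquad (A.2)$$

where $\mathbf{e} \sim N(\mathbf{0}, \mathbf{D})$; $\mathbf{D} = \mathrm{diag}\{\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2\}$. The tilde denotes a noisy measurement and $\sigma_i$ is

the 1σ standard deviation due to sensor noise and scene clutter. Estimation of $\mathbf{D}$ is described

below.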

The B matrix follows from a regularization approximation of the sample covariance

matrix. We summarize the method here and refer the reader to Refs. [22] and [30] for additional

detail. Regularization is implemented in two steps. The first step is to calculate the Principal

Components decomposition of the noise-whitened sample covariance matrix:

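$$\mathbf{D}^{-1/2} \boldsymbol{\Sigma} \mathbf{D}^{-1/2} = \mathbf{U} \boldsymbol{\Lambda} \mathbf{U}^T \qquad (A.3)$$

where $\mathbf{U}$ is the $k \times k$ matrix of eigenvectors and $\boldsymbol{\Lambda}$ is the diagonal matrix of eigenvalues,

$\boldsymbol{\Lambda} = \mathrm{diag}\{\lambda_1, \lambda_2, \ldots, \lambda_k\}$. In this work we estimate the $\mathbf{D}$ matrix as $\mathbf{D} = [\mathrm{diag}\{\boldsymbol{\Sigma}^{-1}\}]^{-1}$ and use the robust

estimation method described by Tyler [29] to calculate $\boldsymbol{\Sigma}$. The second step follows from the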

presumption that only the first m Principal Components and eigenvalues correspond to signal

and the higher order components correspond to noise. This leads to an estimated sample

covariance matrix

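$$\hat{\boldsymbol{\Sigma}}_m = \mathbf{D}^{1/2} [\mathbf{U}_m (\boldsymbol{\Lambda}_m - \varepsilon \mathbf{I}_m) \mathbf{U}_m^T + \varepsilon \mathbf{I}] \mathbf{D}^{1/2} \qquad (A.4)$$

where $\mathbf{U}_m$ is the $k \times m$ matrix whose columns are the leading eigenvectors of $\mathbf{D}^{-1/2} \boldsymbol{\Sigma} \mathbf{D}^{-1/2}$,

$\boldsymbol{\Lambda}_m = \mathrm{diag}\{\lambda_1, \lambda_2, \ldots, \lambda_m\}$, $\mathbf{I}_m$ is the $m \times m$ identity matrix and

$$\varepsilon = \frac{1}{k - m} \sum_{i=m+1}^{k} \lambda_i \qquad (A.5)$$

Following Eqs. (A.4) and (A.5) the $\mathbf{B}$ matrix used in Eq. (A.1) is

$$\mathbf{B} = \mathbf{D}^{1/2} \mathbf{U}_m (\boldsymbol{\Lambda}_m - \varepsilon \mathbf{I}_m)^{1/2} \qquad (A.6)$$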

As noted in Section 2.3, the β coefficients for the sample are presumed to be

uncorrelated, have zero mean and unit standard deviation,

$$\boldsymbol{\beta} \sim N(\mathbf{0}, \mathbf{I}_m) \qquad (A.7)$$

as would be the case when applying PCA to multivariate normal distributed data. The maximum

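likelihood $\boldsymbol{\beta}$ vector for an observed spectrum $\tilde{\mathbf{x}}$ minimizes the cost function:

$$C(\boldsymbol{\beta}) = \varepsilon^{-1} (\tilde{\mathbf{x}} - \boldsymbol{\mu} - \mathbf{B}\boldsymbol{\beta})^T \mathbf{D}^{-1} (\tilde{\mathbf{x}} - \boldsymbol{\mu} - \mathbf{B}\boldsymbol{\beta}) + \boldsymbol{\beta}^T \boldsymbol{\beta} \qquad (A.8)$$

where the first term on the right-hand side corresponds to the deviation between the data and the

model and the $\boldsymbol{\beta}^T \boldsymbol{\beta}$ term follows from the prior distribution given by Eq. (A.7). Following the

definition of $\mathbf{B}$ in Eq. (A.6), the maximum likelihood $\boldsymbol{\beta}$ vector is

$$\hat{\boldsymbol{\beta}} = \boldsymbol{\Lambda}_m^{-1} (\boldsymbol{\Lambda}_m - \varepsilon \mathbf{I}_m)^{1/2} \mathbf{U}_m^T \mathbf{D}^{-1/2} (\tilde{\mathbf{x}} - \boldsymbol{\mu}) \qquad (A.9)$$

The corresponding maximum likelihood noise-free background spectrum is

$$\hat{\mathbf{x}} = \mathbf{D}^{1/2} \mathbf{U}_m [\mathbf{I}_m - \varepsilon \boldsymbol{\Lambda}_m^{-1}] \mathbf{U}_m^T \mathbf{D}^{-1/2} (\tilde{\mathbf{x}} - \boldsymbol{\mu}) + \boldsymbol{\mu} \qquad (A.10)$$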

Note that as $m \to k$, $\varepsilon\boldsymbol{\Lambda}_m^{-1} \to \mathbf{0}$ and $\hat{\mathbf{x}} \to \tilde{\mathbf{x}}$, i.e., if all Principal Components are deemed significant, then the maximum likelihood spectrum is equal to the input spectrum. Conversely, for a data set dominated by noise, $\varepsilon\boldsymbol{\Lambda}_m^{-1} \to \mathbf{I}_m$ and $\hat{\mathbf{x}} \to \boldsymbol{\mu}$, i.e., when the data are dominated by noise, the maximum likelihood spectrum is equal to the sample mean.
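The step from the cost function to the closed-form estimate can be made explicit. With the data term weighted by $(\varepsilon\mathbf{D})^{-1}$, setting the gradient to zero gives $(\mathbf{B}^T\mathbf{D}^{-1}\mathbf{B} + \varepsilon\mathbf{I}_m)\hat{\boldsymbol{\beta}} = \mathbf{B}^T\mathbf{D}^{-1}(\tilde{\mathbf{x}} - \boldsymbol{\mu})$. Because the columns of $\mathbf{U}_m$ are orthonormal, Eq. (A.6) gives $\mathbf{B}^T\mathbf{D}^{-1}\mathbf{B} = \boldsymbol{\Lambda}_m - \varepsilon\mathbf{I}_m$, so the left-hand matrix reduces to $\boldsymbol{\Lambda}_m$ and Eq. (A.9) follows. A minimal NumPy sketch of Eqs. (A.9) and (A.10) is given below; the function name `ml_background` and its argument layout (diagonal matrices passed as vectors) are assumptions made for illustration.

```python
import numpy as np

def ml_background(x_meas, mu, d, U_m, lam_m, eps):
    """Sketch of Eqs. (A.9)-(A.10).

    x_meas : measured spectrum, length k
    mu     : mean background spectrum, length k
    d      : diagonal of D (per-band variances), length k
    U_m    : k x m matrix of leading eigenvectors
    lam_m  : the m leading eigenvalues
    eps    : mean of the trailing eigenvalues, Eq. (A.5)
    """
    d_sqrt = np.sqrt(d)
    # Whitened, mean-removed spectrum projected onto the leading eigenvectors
    z = U_m.T @ ((x_meas - mu) / d_sqrt)
    # Eq. (A.9): beta_hat = Lambda_m^-1 (Lambda_m - eps I_m)^1/2 z
    beta_hat = np.sqrt(lam_m - eps) / lam_m * z
    # Eq. (A.10): x_hat = D^1/2 U_m (I_m - eps Lambda_m^-1) z + mu
    x_hat = d_sqrt * (U_m @ ((1.0 - eps / lam_m) * z)) + mu
    return beta_hat, x_hat
```

The shrinkage factor `1 - eps / lam_m` makes the two limits visible: it approaches one when the retained eigenvalues are much larger than `eps` and approaches zero when they are comparable to it.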

Determination of m, the dimensionality of the background subspace, is an information

theory problem. The key criterion for any parameterized background model to be effective for

chemical plume detection and characterization is that the dimensionality of the signal subspace

must be much less than the number of bands in the measured spectra. We attempted to

determine the number of statistically-significant signals in the data using information theoretic

criteria; specifically, the Akaike Information Criterion (AIC) and Minimum Description Length

(MDL) criteria [31]. (We note that the significance criterion in Appendix A of Ref. [22] is

equivalent to the MDL significance criterion.) The AIC and MDL criteria are appealing in that

each is a function of the calculated eigenvalues and is thereby simple to evaluate. Unfortunately, when we applied the AIC and MDL criteria to multiple datacubes, neither criterion provided stable, plausible estimates of the number of statistically-significant PCs; $m \approx k$ was a typical result even though the deviation between the model and the data changed little after the first several basis vectors.
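For readers unfamiliar with eigenvalue-based order selection, a sketch of the criteria is given below. It assumes the widely used Wax-Kailath form, in which each criterion combines the log ratio of the geometric to arithmetic mean of the trailing eigenvalues with a penalty on the number of free parameters; the exact variant in Ref. [31] is not reproduced here, and the penalty count `m * (2 * k - m)` and the function name `aic_mdl_order` are assumptions.

```python
import numpy as np

def aic_mdl_order(eigvals, n_spectra):
    """Return (m_aic, m_mdl), the orders minimizing AIC and MDL.

    eigvals   : eigenvalues of the (whitened) sample covariance, all positive
    n_spectra : number of spectra used to form the covariance
    """
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    k = lam.size
    aic = np.empty(k)
    mdl = np.empty(k)
    for m in range(k):                      # candidate orders m = 0 .. k-1
        tail = lam[m:]
        # log(geometric mean / arithmetic mean) of trailing eigenvalues, <= 0
        log_ratio = np.mean(np.log(tail)) - np.log(np.mean(tail))
        n_free = m * (2 * k - m)            # assumed free-parameter count
        fit = -n_spectra * (k - m) * log_ratio
        aic[m] = 2.0 * fit + 2.0 * n_free
        mdl[m] = fit + 0.5 * n_free * np.log(n_spectra)
    return int(np.argmin(aic)), int(np.argmin(mdl))
```

When the trailing eigenvalues are exactly equal, the fit term vanishes and the penalty selects the smallest order with a flat tail. Real eigenvalue spectra rarely have a flat tail, which is consistent with the unstable estimates described above.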

For this work we determined which basis vectors were statistically-significant by

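applying an F-test. For a model spectrum calculated using (m-1) basis vectors, the F-test for statistical significance of the m-th basis vector is:

$$F(\mathbf{x}; m) = (k-m-1)\,\frac{\left(\hat{\mathbf{x}}_m - \hat{\mathbf{x}}_{m-1}\right)^T\mathbf{D}^{-1}\left(\hat{\mathbf{x}}_m - \hat{\mathbf{x}}_{m-1}\right)}{\left(\mathbf{x} - \hat{\mathbf{x}}_m\right)^T\mathbf{D}^{-1}\left(\mathbf{x} - \hat{\mathbf{x}}_m\right)} \tag{A.11}$$

where $\hat{\mathbf{x}}_{m-1}$ is the maximum likelihood spectrum calculated using (m-1) principal components and $\hat{\mathbf{x}}_m$ is the maximum likelihood spectrum calculated using m principal components. For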

normally-distributed noise, the calculated F-values follow an F-distribution with 1 and $k-m-1$ degrees of freedom, $F(\mathbf{x}; m) \sim F_{1,\,k-m-1}$. In this work we decide statistical significance based on the cumulative distribution of F-values. When ≥5% of the calculated F-values exceeded the F-value corresponding to 95% significance, we considered the principal component statistically

significant. We make no claim that this test is optimal for determining the model order but note

that: 1) it produced the correct results using synthetic test data sets containing Gaussian additive

noise and 2) it produces seemingly reasonable results using real datacubes where noise is not

precisely Gaussian. It would be far more computationally-efficient to use an eigenvalue-based

method such as the AIC or MDL for determining model order; however, we were not able to

identify one which performed reliably with the data of interest.
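The F-test procedure can be sketched as follows. This is an illustration under stated assumptions rather than the code used in this work: it takes one reading of Eq. (A.11), namely the D-weighted change in the fit from adding the m-th basis vector divided by the D-weighted residual of the m-vector fit and scaled by k-m-1, and it applies the pass criterion as worded in the Figure A.1 caption. The function names and the `f_crit` argument are hypothetical; the critical value would be obtained separately, for example from the 95th percentile of an F-distribution with 1 and k-m-1 degrees of freedom.

```python
import numpy as np

def f_values(X, Xhat_prev, Xhat_m, d, m):
    """F statistic per spectrum (rows of X), one reading of Eq. (A.11).

    X         : n x k measured spectra
    Xhat_prev : n x k fits using m-1 basis vectors
    Xhat_m    : n x k fits using m basis vectors
    d         : diagonal of D (per-band variances), length k
    """
    k = X.shape[1]
    # D-weighted squared change in the fit when the m-th vector is added
    num = np.sum((Xhat_m - Xhat_prev) ** 2 / d, axis=1)
    # D-weighted squared residual of the m-vector fit
    den = np.sum((X - Xhat_m) ** 2 / d, axis=1)
    return (k - m - 1) * num / den

def component_is_significant(F, f_crit, min_fraction=0.05):
    """True if at least `min_fraction` of the F values exceed `f_crit`."""
    return bool(np.mean(np.asarray(F) > f_crit) >= min_fraction)
```

The model order is then the largest m for which `component_is_significant` returns True when m is increased one basis vector at a time.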

As noted in Section 3, datacubes were processed on a quadrant-by-quadrant basis.

Figure A.1 shows the fraction of spectra passing the F-test as a function of m for the quadrant of

the datacube in which the synthetic spectra were embedded (upper middle). For that quadrant

m=6. Figure A.2 shows the rms residuals in each band as a function of m for m=4-7 for the

original (plume-free) datacube. Note that basis vectors have decreasing effect with increasing m.


8. Figure Captions

Figure 1. Stratified atmosphere model. Each layer defined to have uniform temperature (Ti),

pressure, and chemical composition; layer transmission is $\tau_i$. Chemical vapor plume of

interest is Layer p.

Figure 2. Gray scale representation of AIRIS-WAD datacube. Lighter pixels indicate higher

average radiance values; average radiance calculated over all twenty spectral bands acquired

by the sensor. Representative “sky,” “horizon,” and “ground” regions are indicated by the

white boxes and black box, respectively.

Figure 3. Radiance spectra corresponding to the “sky,” “horizon,” and “ground” regions in

Figure 2. Spectra shown are the average of all pixels in the identified region.

Figure 4. Calculated transmission spectra of 20, 197, and 591 ppmv-m R-134a plumes. High

resolution spectra have peak optical density of 0.1, 1.0, and 3.0 (base e), respectively. The

thick lines indicate the spectra calculated using Beer’s law and the R-134a spectrum from the

PNNL database. The thin dotted lines indicate the effective transmission which results from

convolving the high resolution spectrum with a 0.08 µm FWHM Lorentzian lineshape

function. The lower resolution spectra were used to augment AIRIS-WAD data.

Figure 5. Effective thermal contrast between the local air temperature and the effective

radiometric temperature of the background. Calculated median $\Delta T_{\mathrm{eff}}$ for each row in scene

depicted in Figure 2.

Figure 6. Uncertainty in the estimated column density as a function of elevation angle. Plot

shows median for each row in scene depicted in Figure 2; uncertainty calculated using Eq. (36).

Figure 7. Locations where synthetic R-134a plumes were added to AIRIS-WAD data. The

effective thermal contrast in Region 1 is 2.6 ± 0.5 K and the effective contrast in Region 2 is

5.9 ± 0.6 K.

Figure 8. R-134a optical densities estimated in Region 1 of Figure 7. Black circles indicate

median OD estimated by the Gauss-Newton algorithm. Open circles indicate median OD

estimated using the linear signal model given by Eq. (31). The error bars correspond to

±1σ variation in estimated column density calculated using Eq. (49).

Figure 9. R-134a optical densities estimated in Region 2 of Figure 7. Black circles indicate

median OD estimated by the Gauss-Newton algorithm. Open circles indicate median OD

estimated using the linear signal model given by Eq. (31). The error bars correspond to

±1σ variation in estimated column density calculated using Eq. (49).

Figure 10. Effective R-134a absorption cross-sections for OD=0.0, OD=1.0, and OD=3.0. The

OD=0.0 spectrum is used for estimation of plume OD with the Gauss-Newton algorithm and

linear estimator.

Figure 11. ROC curves for OD=0.1 R-134a plumes added to Regions 1 and 2: ■ = Gauss-

Newton solver applied to Region 2, = ACE applied to Region 2, ● = Gauss-Newton solver

applied to Region 1, = ACE applied to Region 1.

Figure 12. ROC curves for OD=0.3 R-134a plumes added to Regions 1 and 2: ■ = Gauss-

Newton solver applied to Region 2, = ACE applied to Region 2, ● = Gauss-Newton solver

applied to Region 1, = ACE applied to Region 1.

Figure 13. ROC curves for OD=1.0 R-134a plumes added to Regions 1 and 2: ■ = Gauss-

Newton solver applied to Region 2, = ACE applied to Region 2, ● = Gauss-Newton solver

applied to Region 1, = ACE applied to Region 1.


Figure 14. ROC curves for OD=2.0 R-134a plumes added to Regions 1 and 2: ■ = Gauss-

Newton solver applied to Region 2, = ACE applied to Region 2, ● = Gauss-Newton solver

applied to Region 1, = ACE applied to Region 1.

Figure 15. Representative spectra from Region 1 and best fits to data using linear and nonlinear

models: = original spectrum, ■ = original spectrum augmented with OD=3.0 R-134a

plume.

Figure 16. Rms fit residuals for Region 1: ◊ = fit to original data, = fit to data augmented with

OD=3.0 R-134a plume using nonlinear estimator, ■ = fit to data augmented with OD=3.0 R-

134a plume using linear model.

Figure 17. Number of iterations required for Gauss-Newton algorithm to converge, convergence

threshold = 0.01: ■=plume-free pixels, □ = Region 2 with no plume added, = Region 2

with OD=1.0 plume added, = Region 2 with OD=2.0 plume added, = Region 2 with

OD=3.0 plume added.

Figure A.1. Fraction of spectra passing F-test, Eq. (A.11), as function of number of basis

functions used to model data. Pass criterion is that the F values for ≥5% of the spectra exceed the F

value for 95% significance. Data correspond to upper middle quadrant of scene in Fig. 2.

Figure A.2. RMS residuals between model and data as function of m for m=4-7. Six basis

vectors were deemed to be statistically-significant using the F-test criterion.


9. Figures

Fig. 1.

Fig. 2.

Fig. 3.

Fig. 4.

Fig. 5.

Fig. 6.

Fig. 7.

Fig. 8.

Fig. 9.

Fig. 10.

Fig. 11.

Fig. 12.

Fig. 13.

Fig. 14.

Fig. 15.

Fig. 16.

Fig. 17.

Fig. A-1.

Fig. A-2.