Detection and Characterization of Chemical Vapor
Fugitive Emissions by Nonlinear Optimal Estimation:
Theory and Simulation
Christopher M. Gittins
Physical Sciences Inc., 20 New England Business Center, Andover, MA 01810, USA
This paper addresses detection and characterization of chemical vapor fugitive emissions
in a non-scattering atmosphere by processing of remotely-sensed long wavelength infrared
spectra. The analysis approach integrates a parameterized signal model based on the
radiative transfer equation with a statistical model for the infrared background. The
maximum likelihood model parameter values are defined as those which maximize a
Bayesian posterior probability and are estimated using a Gauss-Newton algorithm. For
algorithm performance evaluation we simulate observation of fugitive emissions by
augmenting plume-free measured spectra with synthetic plume signatures. As plumes
become optically-thick, the Gauss-Newton algorithm yields significantly more accurate
estimates of chemical vapor column density and significantly more favorable plume
detection statistics than clutter-matched-filter-based and adaptive-subspace-detector-based
plume characterization and detection.
OCIS codes: 010.5620 Radiative transfer
110.4234 Multispectral and hyperspectral imaging
150.1135 Machine vision algorithms
280.1120 Air pollution monitoring
280.4991 Passive remote sensing
300.6340 Spectroscopy, infrared.
1. Introduction
Long wavelength infrared (LWIR) spectrometry has been employed for several decades for
remote sensing of chemical vapor fugitive emissions. This paper addresses detection and
characterization of fugitive emissions in a non-scattering atmosphere by processing of LWIR
spectra collected in an open path configuration. The analysis approach employs a parameterized
signal model derived from the radiative transfer equation (RTE) and applies estimation theory to
determine the maximum likelihood parameter values. In this regard, it bears some similarity to
estimation methods developed for atmospheric profile retrieval using data collected by
atmospheric sounding spectrometers. In both cases, the at-sensor spectral radiance is a nonlinear
function of atmospheric temperature and constituent profiles and retrieval of the properties of
interest requires inverse solution of the RTE. Because inverse solution of the RTE is
mathematically ill-posed, the physical properties of interest must be inferred using estimation
theory. The use of estimation theory for atmospheric profile retrieval was demonstrated by
Rodgers in 1976 [1]. Variations on and refinements of Rodgers’ implementation have been
made by numerous authors in the decades since, see, e.g., [2-7]. The estimation algorithm
presented here has characteristics in common with methods described in those publications. The
RTE used here is well-suited for fugitive emissions detection from short range (<5 km) where
observations are made with a horizontal or near-horizontal line-of-sight to the area of interest.
That noted, the nonlinear estimation approach is generally-applicable and may be adapted to
address other observation scenarios by modifying the RTE and/or the statistical model for the
background.
We define the maximum likelihood model parameter values as those which maximize a
Bayesian posterior probability. The posterior probability distribution function (pdf) is the
product of a conditional pdf for the observed spectrum and a prior pdf for infrared background
spectra. Accounting for infrared background characteristics using a prior pdf rather than a
physics-based model is both statistically-justifiable, i.e., no information is lost in an information
theoretical sense, and a practical necessity in order to make the estimation problem
computationally tractable. The signal model for measured spectra is a nonlinear function of the
model parameters and we use a Gauss-Newton algorithm to determine the maximum likelihood
parameter values.
In addition to presenting the signal model and parameter estimation algorithm, we
address algorithm initialization and derive the uncertainties associated with the estimated
parameter values. For algorithm validation, we simulate observation of fugitive emissions by
augmenting plume-free measured spectra with synthetic plume signatures using a process which
preserves the noise characteristics of the original data. We compare the results obtained by
processing the simulated data with the Gauss-Newton algorithm with those obtained by
processing the data with an adaptive subspace detector (for plume detection) and a clutter-
matched filter (for estimation of path-integrated vapor concentration), both of which are derived
from a linear additive signal model which follows from an approximation to the underlying RTE
[8-13]. The purpose of simulation in this work is not to test the validity of the clear air
approximation but, postulating that the underlying atmospheric model is correct for the cases
evaluated, to assess the effect of using the approximate, linearized RTE rather than the exact
RTE as the basis for chemical vapor detection and characterization. Comparing results obtained
for simulated observations in a uniform, non-scattering atmosphere is the simplest test case.
Readers will note that the signal model used for nonlinear estimation is easily modified to
address observations in more complex atmospheres.
While nonlinear estimation and estimation based on the linear, additive signal model
produce essentially identical results when applied to optically-thin plumes, nonlinear estimation
provides significantly more accurate estimates of path-integrated chemical vapor concentration
(column density) and significantly more favorable detection statistics when vapor plumes are
optically-thick at one or more of the sensor observation wavelengths. Simulations indicate that
nonlinear estimation utilizing a Gauss-Newton algorithm can reduce column density estimation
error by an order of magnitude and reduce false alarm rates by several orders of magnitude
relative to the clutter-matched filter for a plume optical density as low as 1.0, i.e.,
$\text{transmission} = \exp(-1.0)$ at the wavelength of strongest absorption. As expected, the
performance of the Gauss-Newton algorithm relative to the linear estimators improves with
increasing plume optical density, i.e., as the optically-thin plume approximation breaks down.
2. Algorithm Formulation
2.1. Radiative Transfer Model
The radiative transfer model which underlies the analysis approach presented here follows from
the stratified atmosphere approximation: the atmosphere along the sensor’s line-of-sight is
modeled as n layers with each layer having uniform temperature, pressure, and chemical
composition [14]. The concept is illustrated in Figure 1. Under this model, the monochromatic
at-sensor spectral radiance at the interface of Layer i and Layer i+1 is [15]:
$$R_i(\lambda) = \tau_i(\lambda) \cdot R_{i-1}(\lambda) + [1 - \tau_i(\lambda)] \cdot L_i(\lambda)$$ (1)
where $\tau_i$ is the transmission of Layer i and $L_i$ is the blackbody (Planck) function evaluated at the
thermodynamic temperature of Layer i, $T_i$. If $R_0$ arises from a solid background then it is the
surface leaving radiance, i.e., $R_0$ includes reflected downwelling radiance as well as surface
emission, not simply the blackbody function evaluated at the surface temperature.
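The layer-by-layer recursion of Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this work: the function names `planck_radiance` and `propagate_layers`, the radiance units, and the example layer values are assumptions introduced here.

```python
import math

def planck_radiance(wavelength_um, temperature_k):
    """Blackbody spectral radiance in W / (m^2 sr um)."""
    h = 6.62607015e-34    # Planck constant, J s
    c = 2.99792458e8      # speed of light, m/s
    kb = 1.380649e-23     # Boltzmann constant, J/K
    lam = wavelength_um * 1e-6  # wavelength in metres
    per_metre = (2.0 * h * c**2 / lam**5) / math.expm1(h * c / (lam * kb * temperature_k))
    return per_metre * 1e-6     # convert "per metre" to "per micrometre"

def propagate_layers(r0, layers, wavelength_um):
    """Apply Eq. (1) layer by layer.

    layers is a list of (transmission, temperature in K) pairs ordered from the
    background towards the sensor; r0 is the radiance entering the first layer.
    """
    r = r0
    for tau, temp_k in layers:
        r = tau * r + (1.0 - tau) * planck_radiance(wavelength_um, temp_k)
    return r

# Hypothetical two-layer path viewed against a 302 K blackbody at 10 um
layers = [(0.9, 295.0), (0.8, 298.0)]
r_sensor = propagate_layers(planck_radiance(10.0, 302.0), layers, 10.0)
```

Two limiting cases are useful sanity checks: fully transparent layers return the background radiance unchanged, and an opaque final layer returns the blackbody radiance at that layer's temperature.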
When a vapor plume is present at Layer p, the at-sensor spectral radiance is:
$$R_s = (1 - \tau_a) \cdot L_a + \tau_a \cdot [\tau_p \cdot R_b + (1 - \tau_p) \cdot L_p]$$ (2)
where $\tau_a$ is the atmospheric transmission from the sensor along the line-of-sight to the plume, $L_a$
is the path-weighted blackbody radiance from the sensor to Layer p, $\tau_p$ is the plume
transmission, $R_b$ is the background spectral radiance incident on the plume layer, and $L_p$ is the
blackbody function evaluated at the plume temperature. (Note that the wavelength dependence
of the quantities in Eq. (2) has been suppressed for clarity.) The plume transmission follows
Beer’s law. In this work we consider a plume consisting of a single vapor and
$$\tau_p = \exp[-\kappa \cdot \rho]$$ (3)
where κ is the absorption cross-section of the chemical vapor and ρ is its column density.
Absorption cross-sections are temperature- and pressure-dependent; however, those
dependencies are weak near 300 K and 1 atm and we ignore them here.
Eq. (2) has formed the basis for multiple fugitive emissions detection and
characterization approaches, see, e.g., [8,16-18]. When the atmosphere is homogeneous along
the line-of-sight to the plume, $L_a$ is simply the blackbody function evaluated at the air
temperature. Methods for calculating $L_a$ along an inhomogeneous path are described by Clough,
et al. [19] and Rodgers [15]. In those treatments, $L_a$ corresponds to the blackbody function
evaluated at an effective temperature derived from the characteristics of the path along the
sensor’s line-of-sight. Although treating the atmosphere as a single uniform layer here may
appear to be a crude approximation likely to introduce significant error in estimated plume
characteristics, Sheen, et al.’s analysis of background and atmospheric variability effects [20]
suggests that this is generally not the case for short range detection: background variability is
generally the primary source of uncertainty and error.
Following Eqs. (1) and (2), the at-sensor radiance in the absence of the plume is:
$$R_s^0 = (1 - \tau_a) \cdot L_a + \tau_a \cdot R_b$$ (4)
and the change in at-sensor radiance due to the plume is:
$$R_s - R_s^0 = (1 - \tau_p) \cdot [(L_a - R_s^0) + \tau_a \cdot (L_p - L_a)]$$ (5)
Note that when the temperature of the plume is equal to the atmospheric temperature then
$L_p - L_a = 0$ and the change in at-sensor radiance is independent of the atmospheric transmission,
i.e., atmospheric compensation is not necessary to estimate plume properties.
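As a numerical check on Eqs. (2), (4), and (5), the sketch below evaluates the three expressions for a single wavelength. The function names and the radiance values are hypothetical and serve only to illustrate the algebra.

```python
def plume_radiance(tau_a, tau_p, l_a, l_p, r_b):
    """At-sensor radiance with a plume present, Eq. (2)."""
    return (1.0 - tau_a) * l_a + tau_a * (tau_p * r_b + (1.0 - tau_p) * l_p)

def clear_radiance(tau_a, l_a, r_b):
    """At-sensor radiance without the plume, Eq. (4)."""
    return (1.0 - tau_a) * l_a + tau_a * r_b

def plume_contrast(tau_a, tau_p, l_a, l_p, r_b):
    """Change in at-sensor radiance due to the plume, Eq. (5)."""
    r_s0 = clear_radiance(tau_a, l_a, r_b)
    return (1.0 - tau_p) * ((l_a - r_s0) + tau_a * (l_p - l_a))

# Hypothetical values (arbitrary radiance units)
delta = plume_radiance(0.6, 0.4, 9.0, 9.5, 7.0) - clear_radiance(0.6, 9.0, 7.0)
```

When the plume and air temperatures are equal, `plume_contrast` reduces to `(1 - tau_p) * (l_a - r_s0)`: given the observed plume-free radiance, the atmospheric transmission no longer appears explicitly.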
For the remainder of this document we consider the case where the air temperature is
uniform and known and the plume temperature is equal to the air temperature. (In general,
atmospheric transmission effects may be ignored when the effective thermal contrast between the
plume and the air is much less than the effective thermal contrast between the air and the
background.) These constraints are not essential for the analysis approach to be valid; however,
they facilitate a more concise presentation of the analysis approach. We note in the text where
relaxing these constraints affects the details of the parameter estimation algorithm. In practice, if
it is necessary or desirable to presume a known and uniform plume temperature, there are
methods for making a reasonable estimate. The simplest approach is to presume that it is equal
to the local atmospheric temperature. Alternatively, depending upon the characteristics of the
measured radiance spectra, the atmospheric temperature in the vicinity of the plume may be
estimated from the spectral radiance at wavelengths where atmospheric water vapor (or other
atmospheric constituent which is reasonably well-mixed over the sensor’s line-of-sight) becomes
optically-thick over a range comparable to the estimated distance to the plume.
2.2. Sensor Signal Model
The equations in the preceding section apply to monochromatic radiation. In developing the
signal model for describing measured spectra it is necessary to address the effects of the sensor’s
spectral resolution. In the absence of measurement noise, the apparent spectral radiance at
acquisition wavelength $\lambda_s$ is:
$$x(\lambda_s) = \int R(\lambda) \cdot g(\lambda; \lambda_s) \, d\lambda$$ (6)
where $R(\lambda)$ is as per Eq. (2) or (4), and the instrument lineshape function, $g(\lambda; \lambda_s)$, is
normalized such that $\int g(\lambda; \lambda_s) \, d\lambda = 1$. Combining Eqs. (5) and (6), the change in
sensor-measured spectral radiance due to the plume is:
$$x_p(\lambda_s) - x_0(\lambda_s) = \int [1 - \tau_p(\lambda)] \cdot [L_a(\lambda) - R_s^0(\lambda)] \cdot g(\lambda; \lambda_s) \, d\lambda$$ (7)
where $x_p$ and $x_0$ are the band-averaged spectral radiances as per Eq. (6) with and without the
plume present, respectively. In order to facilitate signal model parameterization and
computationally-efficient parameter estimation, we approximate Eq. (7) as:
$$x_p - x_0 = [1 - \tau_e] \cdot [L_a - x_0]$$ (8)
where $L_a = L_a(\lambda_s)$ and the effective plume transmission is
$$\tau_e = \exp(-\alpha \cdot \rho_0 \cdot \bar{\kappa})$$ (9)
The quantity $\bar{\kappa}$ is the vapor absorption cross-section averaged over the instrument lineshape
$$\bar{\kappa}(\lambda_s) = \int \kappa(\lambda) \cdot g(\lambda; \lambda_s) \, d\lambda$$ (10)
The quantity $\rho_0$ is a reference column density defined to make $\alpha$ a unitless quantity, $\alpha \equiv \rho / \rho_0$.
We choose $\rho_0 = 1 / \max\{\kappa(\lambda)\}$ so that $\alpha$ corresponds to the plume optical density at the strongest
absorption feature in the high resolution spectrum. The approximation in Eq. (9) is addressed
further in Section 4. Parameterization of the plume-free background spectral radiance, $x_0$, is
addressed in the following Section.
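The difference between the effective transmission of Eq. (9) and a lineshape-averaged Beer's law transmission can be explored numerically. The sketch below is illustrative only: the single Gaussian absorption feature, the wavelength grid, and all function names are assumptions made here, not the R-134a cross-section or the sensor lineshape data used in this work.

```python
import math

def lorentzian_weights(grid, center, fwhm):
    """Discrete Lorentzian lineshape g(lambda; lambda_s), normalized to sum to 1."""
    hw = 0.5 * fwhm
    raw = [hw / ((lam - center) ** 2 + hw ** 2) for lam in grid]
    total = sum(raw)
    return [w / total for w in raw]

def band_average(values, weights):
    """Discrete analogue of the lineshape integral in Eqs. (6) and (10)."""
    return sum(v * w for v, w in zip(values, weights))

# Hypothetical high-resolution cross-section: one Gaussian feature (arbitrary units)
grid = [8.0 + 0.001 * i for i in range(1001)]             # 8.0 to 9.0 um
kappa = [math.exp(-0.5 * ((lam - 8.42) / 0.03) ** 2) for lam in grid]
rho0 = 1.0 / max(kappa)                                     # reference column density
g = lorentzian_weights(grid, center=8.42, fwhm=0.08)
kappa_bar = band_average(kappa, g)                          # Eq. (10)

def tau_effective(alpha):
    """Effective transmission, Eq. (9)."""
    return math.exp(-alpha * rho0 * kappa_bar)

def tau_band_exact(alpha):
    """Beer's law averaged over the lineshape (average of the exponential)."""
    return band_average([math.exp(-alpha * rho0 * k) for k in kappa], g)

gap_thin = tau_band_exact(0.1) - tau_effective(0.1)
gap_thick = tau_band_exact(3.0) - tau_effective(3.0)
```

Because the exponential is convex, the averaged transmission is never smaller than the exponential of the averaged cross-section (Jensen's inequality). In this toy case the gap is negligible for an optically-thin plume and grows as the plume becomes optically thick.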
From this point forward, measured spectra are treated as k-dimensional vectors:
$$\tilde{\mathbf{x}} = \mathbf{x} + \mathbf{e}$$ (11)
where $\mathbf{x}$ denotes the vector of the noise-free spectral radiance from Eq. (6) and $\mathbf{e}$ denotes
measurement noise. (The tilde denotes a noisy measurement.) In developing the analysis
approach, we presume that the measurement noise is normally-distributed with zero mean and is
uncorrelated from band to band, $\mathbf{e} \sim N(\mathbf{0}, \mathbf{D})$, where $\mathbf{D}$ is a diagonal matrix. The diagonal
elements of $\mathbf{D}$ are the noise variances in each band, i.e., the squares of the 1σ standard
deviations resulting from sensor noise and spectral clutter. We do not presume that the noise
variance is equal in all bands.
2.3. Infrared Background Model
There are many reasonable approaches to modeling and parameterization of infrared
backgrounds. We use a factor analysis-based model [21] because it facilitates definition of a
Gaussian prior pdf for model parameter values. This in turn simplifies implementation of the
parameter estimation algorithm. The model is summarized in Appendix A and is similar to the
Probabilistic Principal Components model described by Tipping and Bishop [22]. Briefly, the
model presumes that, in the absence of a vapor plume, spectra may be described by a linear
mixing model:
$$\mathbf{x} = \boldsymbol{\mu} + \mathbf{B}\boldsymbol{\beta}$$ (12)
where $\boldsymbol{\mu}$ is the mean background spectrum, $\mathbf{B}$ is the k × m dimensional matrix whose columns
are the basis vectors used to span the data space and $\boldsymbol{\beta}$ is an m × 1 vector of weight coefficients.
The key detail in the implementation of Eq. (12) is that the $\boldsymbol{\beta}$ vectors for the sample are
presumed to be uncorrelated, have zero mean and unit standard deviation:
$$\boldsymbol{\beta} \sim N(\mathbf{0}, \mathbf{I}_m)$$ (13)
where $\mathbf{I}_m$ is the m×m identity matrix. Eq. (13) is exact for Principal Components Analysis
(PCA) applied to multivariate normal distributed data. The B matrix follows from a
regularization approximation of the calculated sample covariance matrix and is calculated using
eigenvalues and eigenvectors from a PCA of the data. Calculation of the B matrix and estimation
of the model order, m, are described in Appendix A.
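A short derivation shows the covariance that Eqs. (11)-(13) imply for measured plume-free spectra. It is a sketch that assumes the weight coefficients and the measurement noise are statistically independent; under that assumption it reproduces the form of the regularized covariance matrix used in Section 2.5.

```latex
\begin{aligned}
\mathrm{E}[\tilde{\mathbf{x}}]
  &= \boldsymbol{\mu} + \mathbf{B}\,\mathrm{E}[\boldsymbol{\beta}] + \mathrm{E}[\mathbf{e}]
   = \boldsymbol{\mu}, \\
\mathrm{Cov}[\tilde{\mathbf{x}}]
  &= \mathbf{B}\,\mathrm{E}[\boldsymbol{\beta}\boldsymbol{\beta}^{T}]\,\mathbf{B}^{T}
     + \mathrm{E}[\mathbf{e}\mathbf{e}^{T}]
   = \mathbf{B}\mathbf{B}^{T} + \mathbf{D}.
\end{aligned}
```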
2.4. Model Parameter Estimation
Following Eqs. (8), (9), and (12), the signal model parameters are the chemical vapor optical
density, the weight coefficients for the basis vectors for the background (i.e., the elements of the
β vector), and the plume/atmospheric temperature. For each spectrum, the parameters define a
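vector $\boldsymbol{\theta}$, $\boldsymbol{\theta} = [\alpha, \boldsymbol{\beta}, T_a]$. We take a Bayesian approach to estimating the maximum likelihood
parameter values. Given an observation, $\tilde{\mathbf{x}}$, the probability that the parameter values are $\boldsymbol{\theta}$ is
$$p(\boldsymbol{\theta} | \tilde{\mathbf{x}}) = \frac{p(\tilde{\mathbf{x}} | \boldsymbol{\theta}) \, p(\boldsymbol{\theta})}{p(\tilde{\mathbf{x}})}$$ (14)
where $p(\tilde{\mathbf{x}} | \boldsymbol{\theta})$ is the conditional probability of observing $\tilde{\mathbf{x}}$ given $\boldsymbol{\theta}$, $p(\boldsymbol{\theta})$ is the prior
probability of the parameter values being $\boldsymbol{\theta}$, and $p(\tilde{\mathbf{x}})$ is the prior probability of observing $\tilde{\mathbf{x}}$.
The maximum likelihood parameter values are those which maximize $p(\boldsymbol{\theta} | \tilde{\mathbf{x}})$ or, equivalently,
those which minimize $-\ln p(\boldsymbol{\theta} | \tilde{\mathbf{x}})$. The maximum likelihood model parameters are denoted $\hat{\boldsymbol{\theta}}$.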
Following Eq. (8), the model function for x is:
$$\mathbf{f}(\boldsymbol{\theta}) = \boldsymbol{\tau}_e \circ \mathbf{x}_0 + [\mathbf{1} - \boldsymbol{\tau}_e] \circ \mathbf{L}_a$$ (15)
where the vector $\mathbf{x}_0$ is the estimated noise-free background spectrum given by Eq. (12), $\mathbf{1}$ is a
k-element vector of ones, $\boldsymbol{\tau}_e$ is the plume transmission calculated using Eqs. (9) and (10), $\mathbf{L}_a$ is the
blackbody function evaluated at the plume temperature, and ∘ denotes the element-by-element
(Hadamard) product of the two vectors.
In order to facilitate computationally-efficient parameter estimation, we postulate that
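$p(\tilde{\mathbf{x}} | \boldsymbol{\theta})$ follows a multivariate normal distribution:
$$-\ln p(\tilde{\mathbf{x}} | \boldsymbol{\theta}) = \tfrac{1}{2} [\tilde{\mathbf{x}} - \mathbf{f}(\boldsymbol{\theta})]^T \mathbf{D}^{-1} [\tilde{\mathbf{x}} - \mathbf{f}(\boldsymbol{\theta})] + c_{x|\theta}$$ (16)
where $\mathbf{D}^{-1}$ is a noise-whitening matrix and $c_{x|\theta}$ is a constant; $\mathbf{D} = \mathrm{diag}\{\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2\}$, where $\sigma_i$
is the standard error of the measured spectral radiance in band i. To deal with the $p(\boldsymbol{\theta})$ term in
Eq. (14) we postulate that $\boldsymbol{\theta}$ can be partitioned into a subset of parameters where the prior follows a
multivariate normal distribution, i.e., the parameter values are constrained, and a subset where
the prior is uniform, i.e., the parameter values are unconstrained. Following this presumption
$$-\ln p(\boldsymbol{\theta}) = \tfrac{1}{2} [\boldsymbol{\theta}_c - \boldsymbol{\theta}_a]^T \mathbf{S}_\theta [\boldsymbol{\theta}_c - \boldsymbol{\theta}_a] + c_\theta$$ (17)
where $\boldsymbol{\theta}_c$ is the subset of $\boldsymbol{\theta}$ which are constrained and $c_\theta$ is a constant. Specification of $\boldsymbol{\theta}_c$ is
described below. The vector $\boldsymbol{\theta}_a$ is the a priori estimate of $\boldsymbol{\theta}_c$ and $\mathbf{S}_\theta$ is a regularization matrix
which penalizes deviations about $\boldsymbol{\theta}_a$. For the purpose of estimating the maximum likelihood
model parameter values, it is not necessary to know the details of $p(\tilde{\mathbf{x}})$ because it is simply a
normalization factor and is independent of $\boldsymbol{\theta}$.
Following Eqs. (14), (16), and (17), the model parameter values which maximize $p(\boldsymbol{\theta} | \tilde{\mathbf{x}})$
are also those which minimize the cost function:
$$C = [\tilde{\mathbf{x}} - \mathbf{f}(\boldsymbol{\theta})]^T \mathbf{D}^{-1} [\tilde{\mathbf{x}} - \mathbf{f}(\boldsymbol{\theta})] + [\boldsymbol{\theta}_c - \boldsymbol{\theta}_a]^T \mathbf{S}_\theta [\boldsymbol{\theta}_c - \boldsymbol{\theta}_a]$$ (18)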
The first term on the righthand side of Eq. (18) penalizes deviations between the measured
spectrum and the model spectrum. The second term on the righthand side of Eq. (18) penalizes
deviations of the constrained model parameters from their nominal values. As stated above, in
order to demonstrate the nonlinear estimation approach without making the mathematics
unnecessarily complicated, we consider parameter estimation when the plume temperature is
known and is equal to the effective temperature of the atmosphere. Also, we presume that if a
chemical vapor cloud is present in the scene that its location and optical density are a priori
unknown and that optical density values are independent of background characteristics.
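Following these presumptions, α is treated as an unconstrained parameter, $\boldsymbol{\theta}_c = \boldsymbol{\beta}$, and, with
$\boldsymbol{\theta}_a = \mathbf{0}$ and $\mathbf{S}_\theta = \mathbf{I}_m$ as per Eq. (13), the second term on the righthand side of Eq. (18) reduces to
$$[\boldsymbol{\theta}_c - \boldsymbol{\theta}_a]^T \mathbf{S}_\theta [\boldsymbol{\theta}_c - \boldsymbol{\theta}_a] = \boldsymbol{\beta}^T \boldsymbol{\beta}$$ (19)
With respect to Eqs. (17)-(19), while α cannot be negative or exceed the value corresponding to the
atmospheric pressure and, in principle, should be constrained, in practice it is more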
computationally-efficient to leave α unconstrained and then reject physically implausible values
after the parameter estimation algorithm terminates.
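The step from the posterior probability to the cost function can be written out explicitly. The lines below are a sketch that simply substitutes Eqs. (16) and (17) into the negative logarithm of Eq. (14); the grouping of the constant terms is a presentational choice made here.

```latex
\begin{aligned}
-\ln p(\boldsymbol{\theta} | \tilde{\mathbf{x}})
  &= -\ln p(\tilde{\mathbf{x}} | \boldsymbol{\theta}) - \ln p(\boldsymbol{\theta}) + \ln p(\tilde{\mathbf{x}}) \\
  &= \tfrac{1}{2}\, C + \left[ c_{x|\theta} + c_{\theta} + \ln p(\tilde{\mathbf{x}}) \right].
\end{aligned}
```

The bracketed term does not depend on the parameter values, so maximizing the posterior probability and minimizing C select the same parameter vector.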
The maximum likelihood model parameter values following from Eq.(18) may be
estimated using a Gauss-Newton algorithm after expressing the cost function in quadratic form
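$$C = \mathbf{r}^T \mathbf{r}$$ (20)
where $\mathbf{r}$ is a p-dimensional vector. The Gauss-Newton algorithm updates parameter values
iteratively as
$$\boldsymbol{\theta}_{i+1} = \boldsymbol{\theta}_i - (\mathbf{J}_i^T \mathbf{J}_i)^{-1} \mathbf{J}_i^T \mathbf{r}_i$$ (21)
where i indicates iteration number and $\mathbf{J}$ is the Jacobian of the $\mathbf{r}$ vector, $\partial \mathbf{r} / \partial \boldsymbol{\theta}$. Eq. (21) is a
general result. The dimensionality of the $\mathbf{r}$ vector and the Jacobian depends upon the details of
the signal model and the cost function. Following Eqs. (18) and (19), the $\mathbf{r}$ vector in this case
consists of (k+m) elements,
$$\mathbf{r} = \begin{bmatrix} \mathbf{D}^{-1/2} [\tilde{\mathbf{x}} - \mathbf{f}(\boldsymbol{\theta})] \\ \boldsymbol{\beta} \end{bmatrix}$$ (22)
The Jacobian of $\mathbf{r}$ is a (k + m) × (m + 1) matrix which is the concatenation of its partial
derivative with respect to the $\boldsymbol{\beta}$ parameters and its partial derivative with respect to the α
parameter:
$$\mathbf{J} = \begin{bmatrix} \dfrac{\partial \mathbf{r}}{\partial \boldsymbol{\beta}} & \dfrac{\partial \mathbf{r}}{\partial \alpha} \end{bmatrix}$$ (23)
where $\partial \mathbf{r} / \partial \boldsymbol{\beta} \equiv [\partial \mathbf{r} / \partial \beta_1, \partial \mathbf{r} / \partial \beta_2, \ldots, \partial \mathbf{r} / \partial \beta_m]$ is a (k + m) × m matrix:
$$\frac{\partial \mathbf{r}}{\partial \boldsymbol{\beta}} = \begin{bmatrix} -\mathbf{D}^{-1/2} \cdot \mathrm{diag}\{\boldsymbol{\tau}_e\} \cdot \mathbf{B} \\ \mathbf{I}_m \end{bmatrix}$$ (24)
and $\partial \mathbf{r} / \partial \alpha$ is a (k+m)×1 column vector. The quantity $\mathrm{diag}\{\boldsymbol{\tau}_e\}$ is a k×k matrix with $\boldsymbol{\tau}_e$ on the
diagonal and zeros on the off-diagonal; $\mathrm{diag}\{\boldsymbol{\tau}_e\} \cdot \mathbf{B}$ is a k × m matrix and $\mathbf{I}_m$ is the m × m
identity matrix. The $\partial \mathbf{r} / \partial \alpha$ vector is:
$$\frac{\partial \mathbf{r}}{\partial \alpha} = \begin{bmatrix} -\mathbf{D}^{-1/2} \cdot \mathrm{diag}\{\boldsymbol{\tau}_e\} \cdot (\mathbf{s} \circ \boldsymbol{\delta}_0) \\ \mathbf{0}_m \end{bmatrix}$$ (25)
where $\mathbf{s} = \rho_0 \cdot \bar{\boldsymbol{\kappa}}$, $\boldsymbol{\delta}_0 = \mathbf{L}_a - (\boldsymbol{\mu} + \mathbf{B}\boldsymbol{\beta})$ is the radiance contrast between the air and the estimated
background spectrum (a k×1 column vector) and $\mathbf{0}_m$ is an m×1 vector of zeros.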
The formulation of Eqs. (23)-(25) ensures that $\mathbf{J}_i^T \mathbf{J}_i$ is invertible under virtually all
physically plausible detection scenarios and thereby makes Eq. (21) extremely stable. (The $\mathbf{I}_m$
matrix in Eq. (24) is the principal source of stability.) When the atmospheric temperature is
treated as a fixed parameter, the only scenario where $\mathbf{J}_i^T \mathbf{J}_i$ is guaranteed not to be invertible, and
therefore Eq. (21) cannot produce an accurate estimate of plume column density, is when the
plume is opaque at all wavelengths, i.e., $\boldsymbol{\tau}_e = \mathbf{0}$. The uncertainty associated with the estimated α
value tends to infinity as $\boldsymbol{\delta}_0 \to \mathbf{0}$; however, this does not prevent the algorithm from
converging.
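The Gauss-Newton iteration of Eq. (21), with the residual vector of Eq. (22) and a Jacobian obtained by differentiating it, can be sketched in pure Python for a small problem. This is an illustrative sketch, not the implementation used in this work: the 4-band, single-basis-vector data, the helper names (`solve`, `residual_and_jacobian`, `gauss_newton`), the fixed iteration count, and the parameter ordering [β, α] are all assumptions made here. The air temperature is treated as known.

```python
import math

def solve(a, b):
    """Solve a x = b for a small dense system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x

def residual_and_jacobian(theta, x_meas, mu, B, s, L_a, sigma):
    """Return r (Eq. 22) and its Jacobian, with theta ordered as [beta_1..beta_m, alpha]."""
    k, m = len(mu), len(B[0])
    beta, alpha = theta[:m], theta[m]
    r, J = [], []
    for j in range(k):
        x0 = mu[j] + sum(B[j][i] * beta[i] for i in range(m))  # background, Eq. (12)
        tau = math.exp(-alpha * s[j])                           # effective transmission, Eq. (9)
        f = tau * x0 + (1.0 - tau) * L_a[j]                     # model spectrum, Eq. (15)
        r.append((x_meas[j] - f) / sigma[j])                    # noise-whitened residual
        d_beta = [-tau * B[j][i] / sigma[j] for i in range(m)]  # derivative w.r.t. beta
        d_alpha = -tau * s[j] * (L_a[j] - x0) / sigma[j]        # derivative w.r.t. alpha
        J.append(d_beta + [d_alpha])
    for i in range(m):                                          # prior rows: r = beta
        r.append(beta[i])
        J.append([1.0 if c == i else 0.0 for c in range(m + 1)])
    return r, J

def gauss_newton(theta, args, iterations=20):
    """Iterate Eq. (21) a fixed number of times (no convergence test in this sketch)."""
    n = len(theta)
    for _ in range(iterations):
        r, J = residual_and_jacobian(theta, *args)
        JtJ = [[sum(row[a] * row[b] for row in J) for b in range(n)] for a in range(n)]
        Jtr = [sum(row[a] * ri for row, ri in zip(J, r)) for a in range(n)]
        theta = [t - d for t, d in zip(theta, solve(JtJ, Jtr))]
    return theta

# Hypothetical example; the "measurement" is noise-free with alpha = 1 and beta = 0.
mu = [8.0, 8.2, 8.4, 8.6]
B = [[0.3], [0.2], [0.1], [0.25]]
s = [0.1, 1.0, 0.6, 0.05]      # rho_0 * kappa_bar per band
L_a = [9.0, 9.1, 9.2, 9.3]
sigma = [0.02] * 4             # 1-sigma noise per band (square roots of the diagonal of D)
x_meas = [math.exp(-si) * m_ + (1.0 - math.exp(-si)) * la for si, m_, la in zip(s, mu, L_a)]
args = (x_meas, mu, B, s, L_a, sigma)
theta_hat = gauss_newton([0.0, 0.5], args)   # start at beta = 0, alpha = 0.5
```

A production implementation would add a convergence criterion and a check for physically implausible α values after termination, as discussed in the text. Comparing the analytic Jacobian against central finite differences of the residual is a convenient way to verify the signs of the derivative terms.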
Although Eqs. (22)-(25) follow from the presumption that the atmospheric
temperature is known, the framework above permits it to be treated as an estimated parameter.
For example, it may be treated as a constrained parameter by augmenting the $\mathbf{r}$ vector with a
term $(T_a - T_0) / \sigma_T$, where $T_0$ is an a priori estimated air temperature and $\sigma_T$ is the uncertainty in
the estimated temperature. In principle, $T_a$ may also be treated as an unconstrained parameter;
however, in the absence of a constraint, $\partial \mathbf{r} / \partial T_a \to \mathbf{0}$ as $\alpha \to 0$. This results in $\mathbf{J}^T \mathbf{J}$ being
ill-conditioned as $\alpha \to 0$ (and non-invertible for $\alpha = 0$), thereby rendering application of the
Gauss-Newton algorithm problematic (or impossible). Treating $T_a$ as a free parameter is only
effective when the vapor cloud is optically-thick, in which case the measured spectral radiance at
the wavelength(s) of peak optical density provides a reasonable measure of the plume
temperature.
2.5. Algorithm Initialization
The Gauss-Newton algorithm requires an initial guess at the maximum likelihood model
parameters. We make our initial guess by applying several approximations to Eq. (15) in order
to create a linear additive signal model. The linear additive model facilitates direct calculation of
the maximum likelihood model parameters by matrix algebra. The first step in developing the
approximate model is to presume an optically-thin vapor plume, $\alpha \ll 1$, and approximate
Eq. (15) as
$$\mathbf{x}_p = \mathbf{x}_0 + \alpha \, \mathbf{s} \circ (\mathbf{L}_a - \mathbf{x}_0)$$ (26)
By further presuming that the plume is viewed against a blackbody background, Eq. (26)
simplifies to:
$$\mathbf{x}_p = \mathbf{x}_0 + \alpha' \mathbf{s}'$$ (27)
where the vector $\mathbf{s}'$ is
$$\mathbf{s}' = \mathbf{s} \circ \left[ \frac{d\mathbf{L}}{dT} \right]_{T_a} \Delta T_0$$ (28)
and
$$\alpha' = \alpha \cdot \frac{\Delta T_{eff}}{\Delta T_0}$$ (29)
The quantity $[d\mathbf{L}/dT]_{T_a}$ is the derivative of the blackbody function with respect to temperature
evaluated at the air temperature, $\Delta T_{eff}$ is the effective thermal contrast between the air temperature
and the radiometric temperature of the background, and $\Delta T_0$ is a reference thermal contrast,
nominally 1 K.
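The two approximations behind Eqs. (26)-(29) can be made explicit. The sketch below assumes a blackbody background at radiometric temperature $T_b$ (a symbol introduced here for illustration) and a first-order expansion of the Planck function about the air temperature.

```latex
\begin{aligned}
\mathbf{1} - \boldsymbol{\tau}_e &= \mathbf{1} - \exp(-\alpha\,\mathbf{s}) \approx \alpha\,\mathbf{s}
  && (\alpha \ll 1), \\
\mathbf{L}_a - \mathbf{x}_0 &\approx \left[\frac{d\mathbf{L}}{dT}\right]_{T_a} (T_a - T_b)
  = \left[\frac{d\mathbf{L}}{dT}\right]_{T_a} \Delta T_{eff}, \\
\alpha\,\mathbf{s} \circ (\mathbf{L}_a - \mathbf{x}_0)
  &\approx \alpha\,\frac{\Delta T_{eff}}{\Delta T_0}\;
     \mathbf{s} \circ \left[\frac{d\mathbf{L}}{dT}\right]_{T_a} \Delta T_0
   = \alpha'\,\mathbf{s}'.
\end{aligned}
```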
Combining Eqs. (12) and (27) we obtain the approximate signal model:
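$$\mathbf{g}(\boldsymbol{\theta}') = \alpha' \mathbf{s}' + \mathbf{B}\boldsymbol{\beta}' + \boldsymbol{\mu}$$ (30)
By replacing $\mathbf{f}(\boldsymbol{\theta})$ in Eq. (15) with $\mathbf{g}(\boldsymbol{\theta}')$ above and maintaining the constraint in Eq. (19), then
applying the criterion that $\nabla_{\theta'} C = 0$ at the minimum of the cost function, we obtain a linear
system of equations which may be solved directly for the maximum likelihood values of $\alpha'$ and
$\boldsymbol{\beta}'$, $\hat{\alpha}'$ and $\hat{\boldsymbol{\beta}}'$:
$$\begin{bmatrix} \hat{\alpha}' \\ \hat{\boldsymbol{\beta}}' \end{bmatrix} = \begin{bmatrix} \mathbf{s}'^T \mathbf{D}^{-1} \mathbf{s}' & \mathbf{s}'^T \mathbf{D}^{-1} \mathbf{B} \\ \mathbf{B}^T \mathbf{D}^{-1} \mathbf{s}' & \boldsymbol{\Lambda}_m \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{s}'^T \mathbf{D}^{-1} (\tilde{\mathbf{x}} - \boldsymbol{\mu}) \\ \mathbf{B}^T \mathbf{D}^{-1} (\tilde{\mathbf{x}} - \boldsymbol{\mu}) \end{bmatrix}$$ (31)
The matrix to be inverted consists of four sub-blocks: $\mathbf{s}'^T \mathbf{D}^{-1} \mathbf{s}'$ is a 1 × 1 sub-block, $\mathbf{s}'^T \mathbf{D}^{-1} \mathbf{B}$ is
a 1 × m sub-block, $\mathbf{B}^T \mathbf{D}^{-1} \mathbf{s}' = (\mathbf{s}'^T \mathbf{D}^{-1} \mathbf{B})^T$ is an m × 1 sub-block, and the $\boldsymbol{\Lambda}_m$ sub-block is an
m × m diagonal matrix whose non-zero elements are the leading m eigenvalues of the
noise-whitened sample covariance matrix, $\mathbf{D}^{-1/2} \boldsymbol{\Sigma} \mathbf{D}^{-1/2}$. The system of equations in Eq. (31) yields the
clutter-matched filter result [9,12,13]:
$$\hat{\alpha}' = \frac{\mathbf{s}'^T \hat{\boldsymbol{\Sigma}}^{-1} (\tilde{\mathbf{x}} - \boldsymbol{\mu})}{\mathbf{s}'^T \hat{\boldsymbol{\Sigma}}^{-1} \mathbf{s}'}$$ (32)
when $\hat{\boldsymbol{\Sigma}} = \mathbf{B}\mathbf{B}^T + \mathbf{D}$ is the regularized sample covariance matrix. Using the relation
$\alpha = \alpha' \cdot (\Delta T_0 / \Delta T_{eff})$ enables comparison of column densities estimated using the linearized
model in Eq. (30) with the results obtained presuming the RTE-based signal model in Eq. (15).
Calculation of $\Delta T_{eff}$ is addressed below.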
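The clutter-matched filter estimate of Eq. (32) and its conversion back to optical density can be sketched as follows. The code is illustrative only: the 3-band data, the helper names, and the use of a small dense linear solver are assumptions made here, and the regularized covariance is formed as the sum of the outer product of the basis matrix and the diagonal noise matrix.

```python
def solve(a, b):
    """Solve a x = b for a small dense system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x

def clutter_matched_filter(x_meas, mu, B, d_var, s_prime):
    """Eq. (32) with Sigma_hat = B B^T + D; d_var holds the diagonal of D (variances)."""
    k, m = len(mu), len(B[0])
    sigma_hat = [[sum(B[i][c] * B[j][c] for c in range(m)) + (d_var[i] if i == j else 0.0)
                  for j in range(k)] for i in range(k)]
    w = solve(sigma_hat, s_prime)  # w = Sigma_hat^{-1} s' (Sigma_hat is symmetric)
    num = sum(wi * (xi - mi) for wi, xi, mi in zip(w, x_meas, mu))
    den = sum(wi * si for wi, si in zip(w, s_prime))
    return num / den

def alpha_from_linear(alpha_prime, dT_eff, dT0=1.0):
    """Invert Eq. (29): alpha = alpha' * dT0 / dT_eff."""
    return alpha_prime * dT0 / dT_eff

# Hypothetical 3-band example: the measurement is the mean plus 0.7 times the signature
mu = [8.0, 8.2, 8.4]
B = [[0.3], [0.2], [0.1]]
d_var = [0.0004, 0.0004, 0.0004]
s_prime = [0.02, 0.15, 0.09]
x_meas = [m_ + 0.7 * sp for m_, sp in zip(mu, s_prime)]
alpha_prime_hat = clutter_matched_filter(x_meas, mu, B, d_var, s_prime)
```

When the measurement differs from the mean by an exact multiple of the signature, the filter returns that multiple regardless of the covariance, which makes a convenient unit test.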
2.6. Uncertainty Analysis
It is instructive to compare the Cramer-Rao lower bound (CRLB) on the uncertainty in the
column density estimated using the Gauss-Newton solver with that associated with the linear
model estimate. The CRLB on the uncertainties in parameters determined using the
Gauss-Newton algorithm may be determined from the elements of the inverse of the Fisher
information matrix [23]:
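$$[\sigma(\hat{\theta}_i)]^2 \geq \left[ \mathbf{I}(\hat{\boldsymbol{\theta}})^{-1} \right]_{ii}$$ (33)
where $\sigma(a)$ is the 1σ standard deviation of the quantity a. Following Eqs. (14), (18) and (19), the
elements of the Fisher information matrix are:
$$\left[ \mathbf{I}(\hat{\boldsymbol{\theta}}) \right]_{ij} = -E\left\{ \frac{\partial^2 \ln p(\boldsymbol{\theta} | \tilde{\mathbf{x}})}{\partial \theta_i \, \partial \theta_j} \right\} \approx \left[ \mathbf{J}^T \mathbf{J} \right]_{ij}$$ (34)
Combining Eqs. (33) and (34),
$$[\sigma(\hat{\alpha})]^2 = \left[ (\mathbf{J}^T \mathbf{J})^{-1} \right]_{ii}$$ (35)
The index i in Eq. (35) corresponds to the index of α in the parameter vector. The matrix $(\mathbf{J}^T \mathbf{J})^{-1}$ is
calculated with each iteration of the Gauss-Newton algorithm so $\sigma(\hat{\alpha})$ may be determined for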
each parameter estimated with no additional computational expense.
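For a two-parameter problem (one background coefficient plus α), the bound in Eq. (35) can be evaluated in closed form. The sketch below is a hypothetical illustration: the function name and the numerical values of the matrix are assumptions made here.

```python
import math

def crlb_sigmas_2x2(jtj):
    """1-sigma lower bounds from the diagonal of the inverse of a 2x2 J^T J (Eq. 35)."""
    (a, b), (c, d) = jtj
    det = a * d - b * c
    inv_diag = (d / det, a / det)   # diagonal elements of the inverse matrix
    return [math.sqrt(v) for v in inv_diag]

# Parameter order assumed here: [beta_1, alpha]
uncorrelated = crlb_sigmas_2x2([[4.0, 0.0], [0.0, 25.0]])
correlated = crlb_sigmas_2x2([[4.0, 3.0], [3.0, 25.0]])
```

The off-diagonal term couples the background coefficient to α, so the bound on α is larger in the correlated case than in the uncorrelated one.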
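For the linear model given by Eq. (30), there exists a closed-form expression for $[\sigma(\hat{\alpha})]^2$:
$$[\sigma_L(\hat{\alpha})]^2 = \left[ \frac{\partial \alpha}{\partial \alpha'} \right]^2 [\sigma(\hat{\alpha}')]^2 + \left[ \frac{\partial \alpha}{\partial (\Delta T)} \right]^2 [\sigma(\Delta T)]^2$$
$$= \left( \frac{\Delta T_0}{\Delta T} \right)^2 \left( [\sigma(\hat{\alpha}')]^2 + \left[ \frac{\hat{\alpha}'}{\Delta T} \right]^2 [\sigma(\Delta T)]^2 \right)$$ (36)
where $[\partial \alpha / \partial \alpha']$ and $[\partial \alpha / \partial (\Delta T)]$ are from Eq. (29). The subscript L in Eq. (36) is to distinguish
the uncertainty derived from the linear model from that estimated using the Gauss-Newton
solver. The quantity $\sigma(\hat{\alpha}')$ is the CRLB associated with the Gaussian probability distribution
function which follows from the cost function defined using the linear model:
$$[\sigma(\hat{\alpha}')]^2 \geq \left[ \mathbf{s}'^T \mathbf{D}^{-1} \mathbf{s}' \right]^{-1}$$ (37)
When there is no uncertainty in the thermal contrast between the air and the background, i.e.,
when the plume and the air temperature are known precisely, $\sigma(\Delta T) = 0$ and Eq. (36) simplifies
to:
$$[\sigma_L(\hat{\alpha})]^2 = \left( \frac{\Delta T_0}{\Delta T_{eff}} \right)^2 \left[ \mathbf{s}'^T \mathbf{D}^{-1} \mathbf{s}' \right]^{-1}$$ (38)
Note that the uncertainty in the estimated column density tends to infinity as $\Delta T_{eff} \to 0$. The
effective thermal contrast in Eq. (39) is that which minimizes the sum-squared deviation between
$\mathbf{s} \circ (\mathbf{L}_a - \mathbf{x}_0)$ in Eq. (26) and $\mathbf{s}'$ in Eq. (27):
$$\Delta T_{eff} = \Delta T_0 \cdot \frac{(\mathbf{s}')^T [\mathbf{s} \circ (\mathbf{L}_a - \mathbf{x}_0)]}{(\mathbf{s}')^T \mathbf{s}'}$$ (39)
where $\mathbf{s}'$ is as per Eq. (28). (Note that $\mathbf{s}'$ is proportional to $\Delta T_0$; the $\Delta T_0$ terms cancel in the
numerator and denominator and $\Delta T_{eff}$ is independent of $\Delta T_0$.) The effective thermal contrast
goes to zero as $(\mathbf{L}_a - \mathbf{x}_0) \to \mathbf{0}$.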
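The effective thermal contrast of Eq. (39) and the propagated uncertainty of Eq. (36) are simple to evaluate. The sketch below uses hypothetical three-band values and function names introduced here; the background is constructed to be exactly 3 K colder than the air in the linearized sense, so the expected effective contrast is 3 K.

```python
import math

def effective_thermal_contrast(s, s_prime, L_a, x0, dT0=1.0):
    """Eq. (39): least-squares scale between s o (L_a - x0) and s', multiplied by dT0."""
    target = [si * (la - xi) for si, la, xi in zip(s, L_a, x0)]
    num = sum(sp * t for sp, t in zip(s_prime, target))
    den = sum(sp * sp for sp in s_prime)
    return dT0 * num / den

def sigma_alpha_linear(alpha_p, sigma_alpha_p, dT, sigma_dT, dT0=1.0):
    """Eq. (36): first-order error propagation through alpha = alpha' * dT0 / dT."""
    ratio = dT0 / dT
    return abs(ratio) * math.sqrt(sigma_alpha_p ** 2 + (alpha_p * sigma_dT / dT) ** 2)

# Hypothetical per-band values
s = [0.1, 1.0, 0.6]
dLdT = [0.15, 0.16, 0.17]                           # dL/dT at the air temperature
L_a = [9.0, 9.1, 9.2]
x0 = [la - 3.0 * d for la, d in zip(L_a, dLdT)]     # background 3 K colder (linearized)
s_prime = [si * d * 1.0 for si, d in zip(s, dLdT)]  # Eq. (28) with dT0 = 1 K
dT_eff = effective_thermal_contrast(s, s_prime, L_a, x0)
```

Setting the thermal contrast uncertainty to zero in `sigma_alpha_linear` recovers the scaling of Eq. (38), and rescaling the reference contrast leaves `dT_eff` unchanged, as noted in the text.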
2.7. Detection Decision Formulation
For some standoff detection applications, making the correct “plume absent”/“plume present”
detection decision can be more important than accurate estimation of the chemical vapor column
density. The cost function in Eq. (18) facilitates detection decisions on the basis of a statistical
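F-test. The F value associated with the measured spectrum, $\tilde{\mathbf{x}}$, is
$$F(\tilde{\mathbf{x}}) = (k - 1) \cdot \left[ \frac{C(\tilde{\mathbf{x}}, \hat{\boldsymbol{\theta}}_0)}{C(\tilde{\mathbf{x}}, \hat{\boldsymbol{\theta}})} - 1 \right]$$ (40)
where $C(\tilde{\mathbf{x}}, \hat{\boldsymbol{\theta}})$ is the cost function evaluated using the model parameters estimated with the
Gauss-Newton algorithm and allowing all model parameters to vary and $C(\tilde{\mathbf{x}}, \hat{\boldsymbol{\theta}}_0)$ is the cost function
evaluated with α fixed at zero. In contrast to $C(\tilde{\mathbf{x}}, \hat{\boldsymbol{\theta}})$, calculation of $C(\tilde{\mathbf{x}}, \hat{\boldsymbol{\theta}}_0)$ does not require an
iterative computation. Under the background model defined in Section 2.3
$$C(\tilde{\mathbf{x}}; \hat{\boldsymbol{\theta}}_0) = [\tilde{\mathbf{x}} - \boldsymbol{\mu} - \mathbf{B}\hat{\boldsymbol{\beta}}_0]^T \mathbf{D}^{-1} [\tilde{\mathbf{x}} - \boldsymbol{\mu} - \mathbf{B}\hat{\boldsymbol{\beta}}_0] + \hat{\boldsymbol{\beta}}_0^T \hat{\boldsymbol{\beta}}_0$$ (41)
and the $\hat{\boldsymbol{\beta}}_0$ vector is calculated using Eq. (A.9) of Appendix A. “Plume present” is decided when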
the F value exceeds a user-specified threshold.
For the linear signal model given by Eq. (30), the analogue to the F test is an adaptive
subspace detector, the Adaptive Cosine Estimator (ACE) [24,25]. The ACE value associated
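with the measured spectrum $\tilde{\mathbf{x}}$ is
$$D_{ACE}(\tilde{\mathbf{x}}) = \frac{\left[ \mathbf{s}'^T \hat{\boldsymbol{\Sigma}}^{-1} (\tilde{\mathbf{x}} - \boldsymbol{\mu}) \right]^2}{\left[ \mathbf{s}'^T \hat{\boldsymbol{\Sigma}}^{-1} \mathbf{s}' \right] \left[ (\tilde{\mathbf{x}} - \boldsymbol{\mu})^T \hat{\boldsymbol{\Sigma}}^{-1} (\tilde{\mathbf{x}} - \boldsymbol{\mu}) \right]}$$ (42)
The ACE statistic can be regarded as cosine-squared of the angle between the test spectrum and
the reference spectrum in noise-whitened, mean-subtracted signal space. The F value calculated
in Eq. (40) is analogous to cotangent-squared of the spectral angle. The ACE value calculated in
Eq. (42) may be converted into an equivalent F value for direct comparison with the value
calculated in Eq. (40), $F_{ACE} = (k - 1) \cdot D_{ACE} / (1 - D_{ACE})$.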
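The two detection statistics can be placed on a common scale with a pair of one-line functions. The sketch below is illustrative; the function names and the example values are assumptions introduced here.

```python
def f_statistic(cost_null, cost_full, k):
    """Eq. (40): F value from the cost with alpha fixed at zero and the cost with alpha free."""
    return (k - 1) * (cost_null / cost_full - 1.0)

def ace_to_f(d_ace, k):
    """Convert an ACE value (cosine-squared) to an equivalent F value, (k - 1) * cot^2."""
    return (k - 1) * d_ace / (1.0 - d_ace)

# Hypothetical example for a 20-band spectrum
f_gn = f_statistic(cost_null=40.0, cost_full=20.0, k=20)
f_ace = ace_to_f(0.5, k=20)
```

A "plume present" decision would then compare either value against the same user-specified threshold.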
3. Test Data
For algorithm performance evaluation we simulated observations of fugitive emissions by
augmenting plume-free measured spectra with synthetic plume signatures. The plume-free
spectra were collected using an Adaptive Infrared Imaging Spectroradiometer-Wide Area
Detector (AIRIS-WAD) [26,27]. The AIRIS-WAD sensor is an imaging Fabry-Perot
spectrometer comprising a 256×256 pixel LWIR focal plane array (FPA) which views the far
field through a rapidly tunable LWIR etalon. The sensor’s optical system is configured to
provide a 32 deg × 32 deg field-of-regard (2.2 mrad per pixel IFOV). Spectra are recorded
band-sequentially and consist of measurements at twenty (20) user-specified wavelengths in the 8 to
11 µm spectral region. The instrument’s spectral resolution is ~0.08 µm, nominally 8 cm⁻¹
FWHM at ν = 1000 cm⁻¹, and the lineshape is well-described by a Lorentzian function. The
sensor is equipped with an internal blackbody source to facilitate real-time radiometric
calibration of the sensor data.
Figure 2 shows a broadband IR representation of a datacube collected by the sensor. The
broadband IR representation was generated by summing all twenty narrowband images in the
datacube. Lighter pixels indicate higher radiance values. The scene is composed of low brush
and compacted sand in the bottom half of the image and sky in the upper half. Figure 3 shows
the average spectral radiance from the boxes marked “sky”, “horizon”, and “ground” in Figure 2.
The spectrum of the ground region is very similar to the spectrum of a 302 K blackbody. The
“sky” spectrum is consistent with a low slant angle view to space and exhibits the characteristic
ozone emission feature near 9.5 µm.
We augmented AIRIS-WAD data with synthetic 1,1,1,2-tetrafluoroethane (R-134a)
spectra. R-134a is a freon widely-used in refrigeration systems and as a propellant for domestic
and industrial applications. Data were augmented using the equation
$$\mathbf{x}_p = [\mathbf{1} - \boldsymbol{\tau}_p] \circ \mathbf{L}_a + \boldsymbol{\tau}_p \circ \hat{\mathbf{x}}_0 + \hat{\mathbf{e}}$$ (43)
where $\boldsymbol{\tau}_p$ is the instrument-lineshape-averaged plume transmission, $\hat{\mathbf{x}}_0$ is the estimated noise-free
background spectrum of the pixel as calculated using Eq. (A-10) of Appendix A, and $\hat{\mathbf{e}}$ is
defined as the difference between the measured background spectrum and the estimated
noise-free background spectrum, $\hat{\mathbf{e}} \equiv \tilde{\mathbf{x}} - \hat{\mathbf{x}}_0$. The elements of $\boldsymbol{\tau}_p$ are:
$$\tau_p(\lambda_s) = \int \exp[-\rho \cdot \kappa(\lambda)] \cdot g(\lambda; \lambda_s) \, d\lambda$$ (44)
The plume transmission at high resolution was calculated using an R-134a reference spectrum
from the Pacific Northwest National Laboratory IR Spectral Database [28]. The instrument
lineshape function used to evaluate Eq. (44) was Lorentzian with 0.08 µm FWHM, consistent
with the experimentally-measured resolution function of the AIRIS-WAD sensor.
A useful characteristic of Eq. (43) is that it preserves the noise in the original data as
the plume becomes optically thick and thereby provides more realistic spectra for testing estimation
algorithms than fully synthetic data with added Gaussian noise. As the plume transmission goes
to zero, $\mathbf{x}_p \to \mathbf{L}_a + \hat{\mathbf{e}}$, i.e., as the plume becomes opaque the pixel spectrum becomes a noisy
blackbody spectrum rather than a noise-free blackbody spectrum. Conversely, as the plume
column density goes to zero its effective transmission goes to unity and $\mathbf{x}_p = \hat{\mathbf{x}}_0 + \hat{\mathbf{e}} = \tilde{\mathbf{x}}$, i.e., the output
spectrum is equal to the original data if no plume is present.
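The augmentation step of Eq. (43) amounts to a few element-wise operations per pixel. The sketch below is a minimal illustration with hypothetical three-band values; the function name and inputs are assumptions made here, and the estimated noise-free background is taken as given rather than computed from the background model.

```python
def augment_with_plume(x_meas, x0_hat, tau_p, L_a):
    """Eq. (43): add a synthetic plume while keeping the measured noise residual."""
    e_hat = [xm - x0 for xm, x0 in zip(x_meas, x0_hat)]   # residual, x_meas - x0_hat
    return [(1.0 - t) * la + t * x0 + e
            for t, la, x0, e in zip(tau_p, L_a, x0_hat, e_hat)]

# Hypothetical three-band pixel
x_meas = [8.1, 8.3, 8.2]       # measured plume-free spectrum
x0_hat = [8.0, 8.25, 8.3]      # estimated noise-free background
L_a = [9.0, 9.1, 9.2]          # blackbody radiance at the plume/air temperature
x_plume = augment_with_plume(x_meas, x0_hat, [0.9, 0.4, 0.6], L_a)
```

The two limits described in the text follow directly: unit transmission reproduces the measured spectrum, and zero transmission yields the blackbody radiance plus the original noise residual.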
The AIRIS-WAD datacube was segmented into four 64×256 pixel horizontal quadrants
for processing. The motivation for segmenting the data was twofold: 1) The AIRIS-WAD FPA
has four separate readouts, each with somewhat different noise characteristics and 2) division
into four quadrants has the effect of partitioning the data into spectrally-similar subsets. The
latter effect improves the fidelity of the infrared background model. The covariance matrix of
each quadrant was calculated using a statistically-robust, Huber-type M-estimator [29]. The
method used to determine the background subspace dimensionality is described in Appendix A.
For the datacube depicted in Figure 2, the dimensionality of the quadrants ranged from m=4 to
m=6. We exercised Eq. (43) to create synthetic datacubes with column density ranging from 0 to
591 ppmv-m (0 to 2.48 g/m²). For reference, 197 ppmv-m corresponds to OD=1.0 (base e) at
8.42 µm, the wavelength of strongest absorption in the R-134a spectrum, and the synthetic
plumes varied from OD=0.0 to OD=3.0. Figure 4 shows the calculated transmission for
20 ppmv-m, 197 ppmv-m, and 591 ppmv-m R-134a plumes, i.e., plumes with peak optical
densities of 0.10, 1.0, and 3.0. The plume temperature was set equal to the local air temperature,
298.0 K, for this simulation. Synthetic plumes were added to 64 pixel (horizontal) × 5 pixel
(vertical) regions in the scene and the column density was the same at each pixel where the
plume signature was added.
4. Results and Discussion
4.1. General
In this Section we compare the results obtained by applying the Gauss-Newton solver for
detection and column density estimation with those obtained using the linear model given by
Eq. (30). Prior to applying the algorithms to the plume-augmented AIRIS-WAD data we
verified that both the Gauss-Newton and clutter-matched filter/ACE algorithms were properly
implemented by processing purely synthetic data with additive Gaussian noise. The plume-
augmented AIRIS-WAD data was processed on a quadrant-by-quadrant basis and the Huber-type
M-estimator was used to calculate the background covariance matrix of each quadrant. The
motivation for using the M-estimator is that using a standard covariance calculation, i.e., giving
equal weight to all spectra in a sample, results in an erroneous estimate of the background
covariance when a plume is present in the sample. The M-estimator de-weights the contribution
of statistically-anomalous spectra. Pixels where the plume signature is statistically-significant
generally constitute statistical anomalies so the M-estimator generally provides a more accurate
estimate of the true background covariance matrix.
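The de-weighting idea can be sketched in Python as an iteratively reweighted mean and covariance. This is only a generic Huber-type scheme written for illustration; it is not necessarily the estimator of Ref. [29] or the implementation used in this work. The function name, the cutoff `c2` (a squared Mahalanobis distance beyond which spectra are de-weighted), and the normalization by the sum of weights are all assumptions.

```python
import numpy as np

def huber_type_covariance(X, c2, max_iter=50, tol=1e-8):
    """Iteratively reweighted mean/covariance with Huber-type weights.

    X  : (n_spectra, n_bands) array of spectra
    c2 : hypothetical tuning parameter; spectra whose squared Mahalanobis
         distance exceeds c2 receive a weight smaller than one
    """
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)                # start from the standard estimate
    for _ in range(max_iter):
        R = X - mu
        # Squared Mahalanobis distance of every spectrum from the current mean
        d2 = np.einsum("ij,jk,ik->i", R, np.linalg.inv(cov), R)
        w_cov = np.minimum(1.0, c2 / np.maximum(d2, 1e-12))
        w_mu = np.sqrt(w_cov)
        mu_new = (w_mu[:, None] * X).sum(axis=0) / w_mu.sum()
        R = X - mu_new
        cov_new = (w_cov[:, None] * R).T @ R / w_cov.sum()
        converged = np.max(np.abs(cov_new - cov)) <= tol * np.max(np.abs(cov))
        mu, cov = mu_new, cov_new
        if converged:
            break
    return mu, cov
```

Spectra close to the bulk of the data keep unit weight, so for a plume-free sample the result stays close to the standard covariance; anomalous spectra contribute progressively less as their distance from the bulk grows.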
It is instructive to evaluate the scene in Figure 2 and identify regions where the
conditions are favorable for detection and, conversely, where they are not favorable for
detection. Figure 5 shows the effective thermal contrast as a function of elevation. The effective
thermal contrast was calculated for each pixel in the scene using Eq. (39) and Figure 5 shows the
median for each row. The effective thermal contrast is ~0 K near the horizon (Row ~ 128) and
one therefore expects detection statistics to be unfavorable in that region. The effective thermal
contrast increases with elevation angle above the horizon and one expects detection statistics to
be favorable above Row ~150 where $\Delta T_{\mathrm{eff}}$ > 5 K. (The downward deviations in the vicinity of
Rows 170 and 200 are due to clouds at those elevations.) These qualitative characterizations are
addressed quantitatively in Figure 6, which shows uncertainty in the estimated column density as
a function of elevation angle. The uncertainty was calculated for each pixel using Eq. (38) and
Figure 6 shows the median for each row. We note that the modest discontinuities at Rows 64
and 192 are associated with boundaries between data quadrants. The discontinuity at Row 128 is
due primarily to the transition from ground to low-angle sky background. The fact that it is also a
quadrant boundary is a coincidence and is a minor contribution to the observed discontinuity.
In the interest of comparing algorithm performance in favorable and unfavorable
detection regions, we present results obtained by processing data with synthetic plumes added to
the regions shown in Figure 7. The effective thermal contrast in Region 1 is 2.6 ± 0.5 K and the
effective contrast in Region 2 is 5.9 ± 0.6 K. (The variation is the 1σ standard deviation in ∆T
over the plume region, not the uncertainty in ∆T.)
4.2. Column Density Estimation
In order to evaluate the accuracy and precision of the Gauss-Newton and linear model estimates
of peak optical density, we calculate the median of the estimated values in the plume region,
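$\mathrm{median}\{\hat{\alpha}_i\}$, and the normalized median absolute deviation, a statistically-robust analogue of the
sample standard deviation,
$$\hat{\sigma}_{\hat{\alpha}} = \frac{\mathrm{median}\{|\hat{\alpha}_i - \mathrm{median}\{\hat{\alpha}_i\}|\}}{0.6745} \qquad (45)$$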
The $\mathrm{median}\{\hat{\alpha}_i\}$ and $\hat{\sigma}_{\hat{\alpha}}$ are very nearly equal to the sample mean and standard deviation when
sample values are normally distributed; however, unlike the mean and standard deviation,
$\mathrm{median}\{\hat{\alpha}_i\}$ and $\hat{\sigma}_{\hat{\alpha}}$ are insensitive to low-occurrence, highly anomalous values in the sample.
We report $\mathrm{median}\{\hat{\alpha}_i\}$ and $\hat{\sigma}_{\hat{\alpha}}$ rather than the mean and standard deviation because we found
that $\mathrm{median}\{\hat{\alpha}_i\} \pm 2\hat{\sigma}_{\hat{\alpha}}$ generally yields a more accurate estimate of the range which incorporates
95% of the sample values than does $\mathrm{mean}\{\hat{\alpha}_i\} \pm 2\sigma$. For the results we evaluated, we observed
that the mean and median α values were in excellent agreement and that $\hat{\sigma}_{\hat{\alpha}}$ was typically 10-15%
lower than the standard deviation.
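The robust spread statistic of Eq. (45) is simple to compute. The short Python sketch below uses only the standard library; the function name `nmad` is a hypothetical label chosen for illustration.

```python
from statistics import median

def nmad(values):
    """Normalized median absolute deviation, following the form of Eq. (45)."""
    values = list(values)
    center = median(values)
    # Median of absolute deviations from the median, scaled by 0.6745 so that
    # the result approximates the standard deviation for normal data.
    return median(abs(v - center) for v in values) / 0.6745
```

For a sample such as `[1, 2, 3, 4, 100]` the single anomalous value has no influence on the result, whereas it would dominate the sample standard deviation.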
Figures 8 and 9 show the median R-134a optical densities estimated using the
Gauss-Newton solver and the clutter-matched filter. Figure 8 depicts the optical densities
estimated in Region 1 and Figure 9 depicts the optical densities estimated in Region 2. The error
bars in each figure correspond to $\pm 1\hat{\sigma}_{\hat{\alpha}}$. Optical densities may be converted to column densities
by multiplying by $\rho_0$, 197 ppmv-m for R-134a. The solid symbols indicate the median OD
estimated using the Gauss-Newton algorithm. The crossed open symbols indicate the median
OD estimated using the clutter-matched filter. The black dashed line indicates perfect agreement
between the actual and estimated OD values. As expected, the uncertainty in estimated OD is
smaller in Region 2 where the thermal contrast is greater. The variation in estimated α values
calculated using Eq. (45) is in reasonable agreement with the predictions of Eqs. (33) and (38).
Eqs. (33) and (38) generally overestimated $\hat{\sigma}_{\hat{\alpha}}$ by 30-70%. Results of simulations run using
purely synthetic data with additive Gaussian noise suggest that the discrepancy is due to the fact
that the noise in the test data is not precisely normally-distributed.
The Gauss-Newton algorithm provides a more accurate estimate of column density than
the linear model in all cases. As expected, the accuracy of the clutter-matched filter estimate
degrades with increasing optical density. The systematic deviation of the Gauss-Newton-
estimated optical densities from the “Ideal” line is due entirely to the approximation in Eq. (10).
When the data is fit using the effective absorption spectrum for the appropriate $\rho$,
$\boldsymbol{\kappa}_e(\rho) = -\ln[\boldsymbol{\tau}_p(\rho)]/\rho$, rather than $\boldsymbol{\kappa}$, then the $\mathrm{median}\{\hat{\alpha}_i\}$ values all fall on the “Ideal” line.
Figure 10 shows $\kappa(\lambda)$, the effective reference spectrum for OD=0.0, as well as $\kappa_e(\lambda)$ for
OD=1.0 (197 ppmv-m) and OD=3.0 (591 ppmv-m). The effective peak cross-section decreases
by 9% from OD=0.0 to OD=1.0 and decreases by 29% from OD=0.0 to OD=3.0. The fact that
the Gauss-Newton algorithm systematically underestimates α values as the OD increases is
consistent with the observed reduction in maximum effective absorption cross-section. The
observed systematic underestimation of the OD using the Gauss-Newton algorithm suggests the
true OD may be recovered by applying an OD-dependent correction factor to the estimated
value. In principle, a correction factor could also be applied to the clutter-matched filter OD
estimates; however, the fact that those OD values appear to approach a maximum value suggests
that applying a correction factor would be problematic.
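The decrease of the effective absorption cross-section with column density can be reproduced qualitatively with a short Python sketch. It evaluates a discretized version of Eq. (44) at one band center and then forms $\kappa_e(\rho) = -\ln[\tau_p(\rho)]/\rho$. The single Gaussian absorption line, the wavelength grid, and the function names are hypothetical stand-ins chosen for illustration; they are not the R-134a reference spectrum or the code used in this work, so the printed values will not match the percentages quoted above.

```python
import math

def lorentzian(offset, fwhm):
    """Area-normalized Lorentzian lineshape at a given wavelength offset."""
    half_width = 0.5 * fwhm
    return (half_width / math.pi) / (offset * offset + half_width * half_width)

def effective_cross_section(wavelengths, kappa, band_center, fwhm, rho):
    """Lineshape-average exp(-rho*kappa) on a uniform grid, then invert for kappa_e."""
    weights = [lorentzian(w - band_center, fwhm) for w in wavelengths]
    norm = sum(weights)                          # renormalize on the finite grid
    tau = sum(g * math.exp(-rho * k) for g, k in zip(weights, kappa)) / norm
    return -math.log(tau) / rho

# Hypothetical high-resolution spectrum: one narrow line with unit peak at 8.42 um,
# so that rho is numerically equal to the peak optical density.
wavelengths = [8.0 + 0.002 * i for i in range(421)]
kappa = [math.exp(-0.5 * ((w - 8.42) / 0.03) ** 2) for w in wavelengths]

for rho in (0.01, 1.0, 3.0):
    print(rho, effective_cross_section(wavelengths, kappa, 8.42, 0.08, rho))
```

Because the exponential is averaged over the lineshape before the logarithm is taken, the effective cross-section falls as the column density rises, which is the behavior shown in Figure 10.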
4.3. Plume Detection Statistics
The standard performance metric for detection applications is the receiver operating characteristic
(ROC) curve. Traditionally, a ROC curve is a plot of the probability of detection, $P_D$, versus the
probability of false alarm, $P_{FA}$, where the ($P_D$, $P_{FA}$) points which constitute the curve are
calculated by varying the detection threshold from lowest to highest value. We use the terms
“detection rate” (DR) and “false alarm rate” (FAR) here rather than $P_D$ and $P_{FA}$ because our
results are data-derived rather than based on theoretical calculations of $P_D$ and $P_{FA}$. We construct
ROC curves as follows:
1. The $n_p$ F-values (or ACE values) in the plume region are put in ascending order:
$F_1, F_2, \ldots, F_{n_p}$.
2. The DR for each F-value (or ACE value) is calculated as $(i - 1/2)/n_p$, where i is the
F-value’s index, i = 1, 2,…, $n_p$. The range of DR is $(2n_p)^{-1} \le DR \le 1 - (2n_p)^{-1}$.
3. The FAR corresponding to each DR is calculated by determining the number of F-values
(or ACE values) in the off-plume region which are greater than or equal to $F_i$. The
number of values exceeding $F_i$ is $n_{FA,i}$. For $n_{FA,i} \ge 1$, the false alarm rate is calculated as
$FAR_i = (n_{FA,i} - 1/2)/n_b$, where $n_b$ is the number of pixels in the off-plume region. If
$n_{FA,i} = 0$, then we consider the (DR, FAR) point not to exist, so
$(2n_b)^{-1} \le FAR \le 1 - (2n_b)^{-1}$.
Subtracting ½ from i and $n_{FA,i}$ when calculating DR and FAR is a convention for probability
plotting which facilitates comparison with model distribution functions.
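The three steps above can be sketched in a few lines of Python. The function name is hypothetical and the sketch makes one explicit assumption: the rank i counts plume-region values from the largest downward (rank 1 is the largest value), so that DR and FAR increase together as the threshold is lowered. Points with no off-plume exceedances are omitted, as in step 3.

```python
def roc_points(plume_values, background_values):
    """Return a list of (DR, FAR) pairs from detector outputs.

    plume_values      : F-values (or ACE values) from the plume region
    background_values : F-values (or ACE values) from the off-plume region
    """
    ranked = sorted(plume_values, reverse=True)   # assumption: rank 1 = largest value
    n_p, n_b = len(ranked), len(background_values)
    points = []
    for i, threshold in enumerate(ranked, start=1):
        # Off-plume values that would be declared detections at this threshold
        n_fa = sum(1 for v in background_values if v >= threshold)
        if n_fa >= 1:                             # skip points with zero false alarms
            points.append(((i - 0.5) / n_p, (n_fa - 0.5) / n_b))
    return points
```

For example, `roc_points([5, 3, 1], [4, 2, 0, 0])` yields two points, because the highest threshold produces no false alarms and is therefore dropped.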
Figures 11-14 show the ROC curves calculated for Regions 1 and 2 augmented with
OD=0.1, 0.3, 1.0, and 2.0 plumes. The solid symbols correspond to points resulting from
application of the Gauss-Newton algorithm. The crossed open symbols correspond to points
resulting from application of the ACE algorithm. As the plume is optically-thin at OD=0.1 and 0.3
the two algorithms generate nearly identical ROC curves. We note that the ROC curves in
Figure 11, OD=0.1, are all indicative of unfavorable plume detection statistics. Setting the
detection threshold to achieve an 80% detection rate in Region 2 would result in a ~20% false
positive rate in the rest of the scene, while setting the detection threshold to achieve an 80% detection
rate in Region 1 would result in a ~50% false positive rate. The ROC curves in Figure 12,
OD=0.3, show somewhat more favorable detection statistics, particularly in Region 2. Setting
the detection threshold to achieve an 80% detection rate in Region 2 would result in a ~0.3% false
positive rate in the rest of the scene and setting the detection threshold to achieve an 80% detection rate
in Region 1 would result in a ~30% false positive rate. Some separation between the
Gauss-Newton and ACE curves is observed for OD=0.3 but the differences are modest. For both
the OD=0.1 and OD=0.3 plumes the curves for Region 2 are significantly more favorable than
those for Region 1 because the effective thermal contrast is >2x larger in Region 2 than in Region 1.
The ROC curves in Figure 13, OD=1.0, show significantly better detection statistics for
the Gauss-Newton algorithm than the ACE algorithm in both Region 1 and Region 2. The
$\exp[-\alpha\mathbf{s}] \approx \mathbf{1} - \alpha\mathbf{s}$ approximation for the plume transmission is significantly less accurate than at
OD=0.1 and 0.3, so it is expected that the nonlinear estimator will outperform ACE-based
detection. As was true at lower OD values, the detection statistics are significantly more
favorable in the region of higher thermal contrast. In Region 2, >95% detection rate is achieved
with a false positive rate of $1\cdot10^{-5}$ using the Gauss-Newton algorithm while the false positive rate is
approximately four orders of magnitude higher for the same detection rate in Region 1.
Similarly, for ACE-based detection, the false alarm rate for a detection rate of 80% is
approximately 350x lower in Region 2 than it is in Region 1. Comparing the ROC curves for
Gauss-Newton detection in Region 1 with ACE-based detection in Region 2, while the
degradation introduced by the $\exp[-\alpha\mathbf{s}] \approx \mathbf{1} - \alpha\mathbf{s}$ approximation is apparent, it is less significant
than the effect of enhanced thermal contrast in going from Region 1 to Region 2. Although the
linear model which underlies the ACE detector is not precisely matched to the data, the ROC
curve obtained by applying ACE to Region 2 is still more favorable than the ROC curve obtained
by applying the Gauss-Newton algorithm to Region 1.
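The size of the linearization error at the band of strongest absorption can be checked directly. The following Python sketch (function name hypothetical) compares the exact transmission with its first-order approximation for a given peak optical density.

```python
import math

def transmission_linearization(od):
    """Return (exact, linear, error) for the transmission at optical density od."""
    exact = math.exp(-od)          # transmission from the full exponential
    linear = 1.0 - od              # optically-thin (first-order) approximation
    return exact, linear, exact - linear

for od in (0.1, 0.3, 1.0, 2.0):
    print(od, transmission_linearization(od))
```

At OD=0.1 the two expressions differ by less than 0.005 in transmission, while at OD=1.0 the linear form predicts zero transmission and at OD=2.0 it predicts a negative, unphysical value.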
The ROC curves in Figure 14, OD=2.0, show an even greater improvement in detection
statistics for the Gauss-Newton algorithm relative to the ACE algorithm as $\exp[-\alpha\mathbf{s}] \approx \mathbf{1} - \alpha\mathbf{s}$ is
a poor approximation for the plume transmission near the wavelengths of strongest absorption.
Comparing the ROC curve for Gauss-Newton detection in Region 1 with ACE-based detection
in Region 2, the degradation introduced by the $\exp[-\alpha\mathbf{s}] \approx \mathbf{1} - \alpha\mathbf{s}$ approximation is more
significant than the effect of enhanced thermal contrast in going from Region 1 to Region 2.
With the OD increased to 2.0, the detection statistics in Region 1 become relatively favorable
and for a detection rate of 80% the Gauss-Newton algorithm reduces the false positive rate by a
factor of ~15 relative to ACE-based detection. The reduction in false alarm rate is even more
pronounced in Region 2. One can also examine the differences in detection rate for a fixed false
alarm rate. For FAR=$1\cdot10^{-4}$, use of the Gauss-Newton algorithm increases the detection rate
from ~10% to nearly 80% in Region 1. For FAR=$1\cdot10^{-5}$, use of the Gauss-Newton algorithm
increases the detection rate from ~70% to ~99% in Region 2.
The degradation of the ACE ROC curves with increasing OD can be understood by
examining the deviation between the test data and the model spectra calculated using the
maximum likelihood model parameters. Figure 15 shows spectra of pixels from Region 1 along
with model spectra calculated using the maximum likelihood values of the model parameters.
The original AIRIS-WAD spectrum is denoted by crossed open squares and the dashed line
shows the corresponding model spectrum. (The difference between the best fits to the plume-
free spectrum calculated using the Gauss-Newton and linear models is not observable on the
scale of the graph.) For comparison, the solid squares denote the pixel spectrum after
augmentation with an OD=3.0 R-134a plume. The solid line shows the spectrum calculated
using the maximum likelihood parameters estimated using the Gauss-Newton algorithm and the
dashed line shows the spectrum calculated using the maximum likelihood parameters estimated
assuming the linear model, Eq. (30). While the differences appear modest in comparison to the
range of spectral radiance values, the deviations between the data and the model spectra are
systematic and readily discernable.
Figure 16 shows the root-mean-squared (rms) residuals between the data and the model
spectra in Region 1. For reference, the open diamonds show the rms deviation between the data
and the model for no plume signature added to the region. The solid squares indicate the rms
residuals for the linear model applied to Region 1 data augmented with an OD=3.0 (591 ppmv-m)
plume. The rms deviation is ~3x greater between 8.4 and 9.0 µm and ~6x greater near 10.3 µm
than it is for the plume-free spectra. The wavelengths where the rms deviation increases the
most correspond to wavelengths of strongest absorption, indicating a shortcoming in the fit
model. For comparison, the crossed open squares indicate the rms deviation for the Gauss-
Newton algorithm applied to the same data. While the rms deviation is slightly larger between
8.4 and 9.0 µm and near 10.3 µm as a result of the difference between $\boldsymbol{\kappa}$ used for fitting and
$\boldsymbol{\kappa}_e(\rho)$ used to create the test data, the residuals are in good agreement with rms deviations
calculated for the plume-free data. This is the expected result when the signal model is
consistent with the data.
4.4. Algorithm Convergence
As the Gauss-Newton algorithm is iterative, it is necessary to define a termination criterion. Our
termination criterion is the fractional change in the cost function given by Eq. (19). The
algorithm terminates when the fractional change falls below a user-specified value, ε:
$$0 \le 1 - \frac{C_{i+1}}{C_i} \le \varepsilon \qquad (46)$$
The results presented in the preceding section were obtained using ε=0.01. In the event that the
cost function increases from the i-th to (i+1)-th iteration the parameter values are restored to
those from the i-th iteration and the algorithm terminates. Figure 17 shows the number of iterations
to convergence for Region 2 augmented with OD=1.0, 2.0, and 3.0 plumes as well as the number
of iterations observed when no plume was added. For reference, the solid bars show the number
of iterations required for convergence in the plume-free region of the scene. Iteration zero
corresponds to the initial guess at the maximum likelihood model parameters. The algorithm
terminated after one iteration for ~55% of the plume-free pixels, i.e., the parameters were
updated but the cost function did not change significantly from the initial guess, and the
algorithm terminated after the second iteration for ~45% of the plume-free pixels. When applied
to the original Region 2 data, the algorithm terminated after one iteration approximately half of
the time and after the second iteration the other half of the time, in good agreement with the
plume-free region of the scene. The number of iterations required for convergence increases as
the plume OD increases. For spectra augmented with OD=1.0 plumes, all pixels required at
least two iterations to converge; ~80% of the pixels required two iterations and ~20% required
three iterations. When the plume OD is increased to 2.0, the algorithm required three iterations to
converge for almost all pixels. A similar result was obtained for OD=3.0.
Reducing ε to 0.0001 increased the number of iterations (from 3 to 4 for the
OD=3.0 plume); however, there was no significant effect on the estimated column density values.
The observed differences at algorithm termination were on the order of 0.1% of the estimated
values. For comparison, the uncertainties in the estimated values were typically several orders of
magnitude larger than the differences observed by making the convergence threshold more
stringent.
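The termination logic of this section can be sketched in Python for a one-parameter toy problem. The sketch fits α in the model $\exp[-\alpha\mathbf{s}]$ by Gauss-Newton iteration and stops according to Eq. (46), restoring the previous value if the cost increases. The plain sum of squared residuals stands in for the cost function of Eq. (19), and the function name, starting value, and iteration cap are assumptions made for illustration.

```python
import math

def fit_optical_density(y, s, alpha0=0.0, eps=0.01, max_iter=50):
    """Gauss-Newton fit of alpha in exp(-alpha*s); returns (alpha, iterations)."""
    def cost(a):
        return sum((yi - math.exp(-a * si)) ** 2 for yi, si in zip(y, s))

    alpha, c_old = alpha0, cost(alpha0)          # iteration zero: the initial guess
    for iteration in range(1, max_iter + 1):
        if c_old == 0.0:                         # already a perfect fit
            return alpha, iteration - 1
        model = [math.exp(-alpha * si) for si in s]
        jac = [-si * mi for si, mi in zip(s, model)]       # d(model)/d(alpha)
        resid = [yi - mi for yi, mi in zip(y, model)]
        step = sum(j * r for j, r in zip(jac, resid)) / sum(j * j for j in jac)
        c_new = cost(alpha + step)
        if c_new > c_old:                        # cost increased: keep previous alpha
            return alpha, iteration
        alpha += step
        fractional_change = 1.0 - c_new / c_old  # quantity bounded in Eq. (46)
        c_old = c_new
        if fractional_change <= eps:
            return alpha, iteration
    return alpha, max_iter
```

Starting from the optically-thin guess α=0, this toy problem typically terminates after a handful of iterations, with later iterations changing the cost by far less than the threshold.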
5. Summary and Conclusions
We have presented a nonlinear optimal estimation method for detecting and characterizing
chemical vapor fugitive emissions in a non-scattering atmosphere using passively-sensed LWIR
spectra. The method integrates a parameterized signal model based on the RTE with a
parameterized representation of covariance of the infrared background to create a probability-
based cost function. The maximum likelihood model parameters are defined as those which
minimize the cost function and are estimated using a Gauss-Newton algorithm. The algorithm
formulation presented here presumes that the plume and air are in thermal equilibrium and that
the air temperature is known; however, the algorithm may be easily modified to handle
scenarios where the air temperature is not known.
For algorithm performance evaluation we simulated observation of fugitive emissions by
augmenting plume-free spectra measured by an AIRIS-WAD sensor with synthetic
R-134a plume signatures. The peak optical density of the synthetic plumes varied from
OD=0.0 to OD=3.0. Results obtained by processing the simulated data indicate that the
nonlinear estimator provides significantly more accurate estimates of chemical vapor column
density and significantly more favorable detection statistics than matched-filter-based estimation
when the vapor plume is optically-thick at one or more of the sensor observation wavelengths.
This is because the signal model used for nonlinear estimation is based on the full clear air RTE,
not an approximation which follows from the presumption of an optically-thin plume as do the
clutter-matched filter and adaptive subspace detector. Finite instrument resolution introduced
systematic error in column densities estimated using the Gauss-Newton algorithm but the effect
was only significant for optical densities >>1.0 and the error is much smaller than that associated
with clutter-matched filter estimates. For example, the Gauss-Newton algorithm underestimated
column density by 17% at OD=3.0 whereas the clutter-matched filter underestimated it by ~70%.
We note that while the nonlinear estimator provides significantly better results for optically-thick
plumes, it produces the same result as a clutter-matched filter/adaptive subspace detector as the
plume optical density approaches zero.
The uncertainty in the column density estimated using the Gauss-Newton algorithm may
be calculated from the Fisher information matrix which follows from the cost function. We
observe that the uncertainties derived from the Fisher information matrix are typically 30-70%
larger than the standard deviation of column densities estimated by processing the simulated
sensor data. The discrepancy appears to be the result of non-Gaussian noise in the originally-
measured plume-free spectra. In future implementations of the algorithm we plan to enable
estimation of air temperature, plume temperature, and atmospheric transmission effects as well
as implement robust estimators to mitigate the effect of non-Gaussian noise on estimates of
maximum likelihood model parameters.
6. Acknowledgement
This work was supported in part through contract no. HDTRA1-07-C-0067 with the Defense
Threat Reduction Agency.
7. References
1. C. D. Rodgers, “Retrieval of atmospheric temperature and composition from remote
measurements of thermal radiation,” Rev. Geophys. Space Phys., 14, (4), 609-624 (1976).
2. J. R. Eyre, “Inversion of cloudy satellite sounding radiances by nonlinear optimal estimation.
I: Theory and simulation for TOVS,” Q. J. R. Meteorol. Soc., 115, 1001-1026 (1989).
3. W. L. Smith, H. M. Woolf, and H. E. Revercomb, “Linear simultaneous solution for
temperature and absorbing constituent profiles from radiance spectra,” Appl. Opt., 30, (9),
1117-1123 (1991).
4. X. L. Ma, T. J. Schmit, and W. L. Smith, “A nonlinear physical retrieval algorithm – its
application to the GOES-8/9 Sounder,” J. Appl. Meteorol., 38, 501-513 (1999).
5. X.L. Ma, Z. Wan, C.C. Moeller, W.P. Menzel, L.E. Gumley, and Y. Zhang, “Retrieval of
geophysical parameters from moderate resolution imaging spectroradiometer thermal
infrared data: evaluation of a two-step physical algorithm,” Appl. Opt., 39, (20), 3537-3550
(2000).
6. T. Steck and T. von Clarmann, “Constrained profile retrieval applied to the observation mode
of the Michelson interferometer for passive atmospheric sounding,” Appl. Opt., 40, (21),
3559-3571 (2001).
7. S. W. Seeman, J. Li, W. P. Menzel, and L. E. Gumley, “Operational retrieval of atmospheric
temperature, moisture, and ozone from MODIS infrared radiances,” J. Appl. Meteorol., 42,
1072-1091 (2003).
8. A. Hayden, E. Niple, and B. Boyce, “Determination of trace-gas amounts in plumes by the
use of orthogonal digital filtering of thermal-emission spectra,” Appl. Opt., 35, (30), 6090-
6098 (1996).
9. C. C. Funk, J. Theiler, D. A. Roberts, and C. C. Borel, “Clustering to improve matched filter
detection of weak gas plumes in hyperspectral thermal imagery,” IEEE Trans. Geosci.
Remote Sensing, 39, (7), 1410-1419 (2001).
10. N. B. Gallagher, B. M. Wise, and D. M. Sheen, “Estimation of trace concentration-pathlength
in plumes for remote sensing applications from hyperspectral images,” Analytica Chimica
Acta, 490, 139-152 (2003).
11. E. M. O’Donnell, D. W. Messinger, C. Salvaggio, and J. R. Schott, “Identification and
detection of gaseous effluents from hyperspectral imagery using invariant algorithms,” in
Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery X,
Proc. SPIE 5425, 573-582 (2004).
12. D. Manolakis and F. M. D’Amico, “A taxonomy of algorithms for chemical vapor detection
with hyperspectral imaging spectroscopy,” in Chemical and Biological Sensing VI, P.J.
Gardner, ed., Proc. SPIE, 5795, 125-133 (2005).
13. A.Vallières, A.Villemaire, M.Chamberland, L.Belhumeur, V.Farley, J.Giroux, and J.-
F.Legault, “Algorithms for chemical detection, identification and quantification for thermal
hyperspectral imagers,” in Chemical and Biological Standoff Detection III, J.O.Jensen and
J.-M.Thériault, eds., Proc. SPIE, 5995, 59950G-1 (2005).
14. R. M. Goody and Y. L. Yung, Atmospheric Radiation: Theoretical Basis (Oxford University
Press, 1989), Ch.2, pp. 46
15. C. D. Rodgers, Inverse Methods for Atmospheric Sounding: Theory and Practice, (World
Scientific, 2000) Ch. 2, pp. 30.
16. M. L. Polak, J. L. Hall, and K. C. Herr, “Passive Fourier-transform infrared spectroscopy of
chemical plumes: an algorithm for quantitative interpretation and real-time background
removal,” Appl. Opt., 34, (24), 5406-5412 (1995).
17. D. Flanigan, “Prediction of the limits of detection of hazardous vapors by passive infrared
with the use of MODTRAN,” Appl. Opt., 35, (30), 6090-6098 (1996).
18. R. Harrig, “Passive remote sensing of pollutant clouds by Fourier-transform infrared
spectroscopy: signal-to-noise ratio as a function of spectral resolution,” Appl. Opt., 43, (23),
4603-4610 (2004).
19. S. A. Clough, M. J. Iacono, and J.-L. Moncet, “Line-by-line calculations of atmospheric
fluxes and cooling rates,” J. Geophys. Res., 97, (D14), 15761-15785 (1992).
20. D. M. Sheen, N. B. Gallagher, S. W. Sharpe, K. K. Anderson, and J. F. Shultz, “Impact of
background and atmospheric variability on infrared hyperspectral chemical detection
sensitivity,” in Algorithms and Technologies for Multispectral, Hyperspectral, and
Ultraspectral Imagery IX, Sylvia S. Shen, Paul E. Lewis, Eds., Proc. SPIE, 5093, 218-229
(2003).
21. R. A. Johnson and D. W. Wichern, Applied Multivariate Statistical Analysis, Fifth Ed.,
(Prentice Hall, 2002), Ch. 9, pp. 477-532.
22. M. E. Tipping and C. M. Bishop, “Probabilistic Principal Components Analysis,” J. R.
Statist. Soc. B, 61, part 3, 611-622, 1999.
23. S. Kay, Fundamentals of Statistical Signal Processing: Volume 1, Estimation Theory,
(Prentice-Hall, 1993) p.40.
24. S. Kraut and L. L. Scharf, “The CFAR adaptive subspace detector is a scale invariant
GLRT,” IEEE Trans. Sig. Proc., 47, (9), 2538-2541 (1999).
25. S. Kraut, L. L. Scharf, and L. T. McWhorter, “Adaptive Subspace Detectors,” IEEE Trans.
Sig. Proc., 49, (1), 1-16 (2001).
26. W. J. Marinelli, C. M. Gittins, B. R. Cosofret, T. E. Ustun, and J. O. Jensen, “Development
of the AIRIS-WAD multispectral sensor for airborne standoff chemical agent and toxic
industrial chemical detection,” Proc. of the Meetings of the Mil. Sens. Symp. Specialty
Groups on Passive Sensors; Camouflage, Concealment, and Deception; Detectors; and
Materials, Charleston, SC, Feb. 2005. Available through Defense Technical Information
Center (DTIC), document ref. no. ADA444225.
27. W. J. Marinelli, C. M. Gittins, A. H. Gelb, and B. D.Green, “Tunable Fabry-Perot etalon-
based long-wavelength infrared imaging spectrometer,” Appl. Opt., 38, (12), 2594-2604
(1999).
28. S. W. Sharpe, T. J. Johnson, R. L. Sams, P. M. Chu, G. C. Roderick, and P. A. Johnson,
“Gas-phase databases for quantitative infrared spectrometry,” Appl. Spectrosc., 58, 1452-
1461 (2004); DOE/PNNL Infrared Spectral Library Release 7.4, May 2004.
29. D. E. Tyler, “A distribution-free M-estimator of multivariate scatter,” The Annals of
Statistics, 15, (1), 234-251 (1987).
30. H. Cox and R. Pitre, “Robust DMR and multi-rate adaptive beamforming,” in Proc. Asilomar
Conf. Signals, Syst., Comput., Pacific Grove, CA, pp. 920–924, Nov. 1997.
31. M. Wax and T. Kailath, “Detection of signals by information theoretic criteria,” IEEE Trans.
on Acoustics, Speech, and Sig. Proc., 33, (2), pp. 387-392 (1985).
Appendix A. Infrared Background Model
In this work infrared background spectra, i.e., spectra corresponding to sensor views
where no chemical vapor is present, are accounted for using a linear mixing model:
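$$\mathbf{x} = \boldsymbol{\mu} + \mathbf{B}\boldsymbol{\beta} \qquad (A.1)$$
where $\boldsymbol{\mu}$ is the mean background spectrum, $\mathbf{B}$ is the k × m dimensional matrix whose columns
are the basis vectors used to span the data space and $\boldsymbol{\beta}$ is an m × 1 vector of weight coefficients;
$m < k$. Measured spectra are presumed to be subject to additive Gaussian noise:
$$\tilde{\mathbf{x}} = \mathbf{x} + \mathbf{e} \qquad (A.2)$$
where $\mathbf{e} \sim N(\mathbf{0}, \mathbf{D})$; $\mathbf{D} = \mathrm{diag}\{\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2\}$. The tilde denotes a noisy measurement and $\sigma_i$ is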
the 1σ standard deviation due to sensor noise and scene clutter. Estimation of D is described
below.
The B matrix follows from a regularization approximation of the sample covariance
matrix. We summarize the method here and refer the reader to Refs. [22] and [30] for additional
detail. Regularization is implemented in two steps. The first step is to calculate the Principal
Components decomposition of the noise-whitened sample covariance matrix:
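$$\mathbf{D}^{-1/2} \boldsymbol{\Sigma} \mathbf{D}^{-1/2} = \mathbf{U} \boldsymbol{\Lambda} \mathbf{U}^T \qquad (A.3)$$
where $\mathbf{U}$ is the k×k matrix of eigenvectors and $\boldsymbol{\Lambda}$ is the diagonal matrix of eigenvalues,
$\boldsymbol{\Lambda} = \mathrm{diag}\{\lambda_1, \lambda_2, \ldots, \lambda_k\}$. In this work we estimate the $\mathbf{D}$ matrix as $\mathbf{D} = [\mathrm{diag}\{\boldsymbol{\Sigma}^{-1}\}]^{-1}$ and use the robust
estimation method described by Tyler [29] to calculate $\boldsymbol{\Sigma}$. The second step follows from the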
presumption that only the first m Principal Components and eigenvalues correspond to signal
and the higher order components correspond to noise. This leads to an estimated sample
covariance matrix
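$$\hat{\boldsymbol{\Sigma}}_m = \mathbf{D}^{1/2} [\mathbf{U}_m (\boldsymbol{\Lambda}_m - \varepsilon \mathbf{I}_m) \mathbf{U}_m^T + \varepsilon \mathbf{I}] \mathbf{D}^{1/2} \qquad (A.4)$$
where $\mathbf{U}_m$ is the k×m matrix whose columns are the leading eigenvectors of $\mathbf{D}^{-1/2} \boldsymbol{\Sigma} \mathbf{D}^{-1/2}$,
$\boldsymbol{\Lambda}_m = \mathrm{diag}\{\lambda_1, \lambda_2, \ldots, \lambda_m\}$, $\mathbf{I}_m$ is the m×m identity matrix and
$$\varepsilon = \frac{1}{k-m} \sum_{i=m+1}^{k} \lambda_i \qquad (A.5)$$
Following Eqs. (A.4) and (A.5) the $\mathbf{B}$ matrix used in Eq. (A.1) is
$$\mathbf{B} = \mathbf{D}^{1/2} \mathbf{U}_m (\boldsymbol{\Lambda}_m - \varepsilon \mathbf{I}_m)^{1/2} \qquad (A.6)$$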
As noted in Section 2.3, the β coefficients for the sample are presumed to be
uncorrelated, have zero mean and unit standard deviation,
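$$\boldsymbol{\beta} \sim N(\mathbf{0}, \mathbf{I}_m) \qquad (A.7)$$
as would be the case when applying PCA to multivariate normal distributed data. The maximum
likelihood $\boldsymbol{\beta}$ vector for an observed spectrum $\tilde{\mathbf{x}}$ minimizes the cost function:
$$C = (\tilde{\mathbf{x}} - \mathbf{B}\boldsymbol{\beta})^T (\varepsilon\mathbf{D})^{-1} (\tilde{\mathbf{x}} - \mathbf{B}\boldsymbol{\beta}) + \boldsymbol{\beta}^T\boldsymbol{\beta} \qquad (A.8)$$
where the first term on the righthand side corresponds to the deviation between the data and the
model and the $\boldsymbol{\beta}^T\boldsymbol{\beta}$ term follows from the prior distribution given by Eq. (A.7). Following the
definition of $\mathbf{B}$ in (A.6), the maximum likelihood $\boldsymbol{\beta}$ vector is
$$\hat{\boldsymbol{\beta}} = \boldsymbol{\Lambda}_m^{-1} (\boldsymbol{\Lambda}_m - \varepsilon\mathbf{I}_m)^{1/2} \mathbf{U}_m^T \mathbf{D}^{-1/2} (\tilde{\mathbf{x}} - \boldsymbol{\mu}) \qquad (A.9)$$
The corresponding maximum likelihood noise-free background spectrum is
$$\hat{\mathbf{x}} = \mathbf{D}^{1/2} \mathbf{U}_m [\mathbf{I}_m - \varepsilon\boldsymbol{\Lambda}_m^{-1}] \mathbf{U}_m^T \mathbf{D}^{-1/2} (\tilde{\mathbf{x}} - \boldsymbol{\mu}) + \boldsymbol{\mu} \qquad (A.10)$$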
Note that as $m \to k$, $\varepsilon\boldsymbol{\Lambda}_m^{-1} \to \mathbf{0}$ and $\hat{\mathbf{x}} \to \tilde{\mathbf{x}}$, i.e., if all Principal Components are deemed
significant, then the maximum likelihood spectrum is equal to the input spectrum. Conversely,
for a data set dominated by noise $\varepsilon\boldsymbol{\Lambda}_m^{-1} \to \mathbf{I}_m$ and $\hat{\mathbf{x}} \to \boldsymbol{\mu}$, i.e., when the data is dominated by
noise then the maximum likelihood spectrum is equal to the sample mean.
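The chain from the whitened eigen-decomposition of Eq. (A.3) to the noise-free estimate of Eq. (A.10) can be sketched in Python with NumPy. This is an illustrative outline only: it substitutes the ordinary sample covariance for the robust estimate of Ref. [29], and the function names and return values are hypothetical.

```python
import numpy as np

def background_model(X, m):
    """Build the m-dimensional background model from spectra X (n_spectra, k)."""
    mu = X.mean(axis=0)
    sigma = np.cov(X, rowvar=False)              # assumption: ordinary sample covariance
    d = 1.0 / np.diag(np.linalg.inv(sigma))      # diagonal of D = [diag{Sigma^-1}]^-1
    d_sqrt = np.sqrt(d)
    whitened = sigma / np.outer(d_sqrt, d_sqrt)  # D^-1/2 Sigma D^-1/2
    lam, U = np.linalg.eigh(whitened)            # eigh returns ascending eigenvalues
    order = np.argsort(lam)[::-1]                # reorder so leading components come first
    lam, U = lam[order], U[:, order]
    eps = lam[m:].mean()                         # Eq. (A.5): mean of trailing eigenvalues
    U_m, lam_m = U[:, :m], lam[:m]
    B = d_sqrt[:, None] * U_m * np.sqrt(lam_m - eps)   # Eq. (A.6)
    return mu, d_sqrt, U_m, lam_m, eps, B

def estimate_background(x_meas, mu, d_sqrt, U_m, lam_m, eps):
    """Return (beta_hat, x_hat) following the forms of Eqs. (A.9) and (A.10)."""
    z = U_m.T @ ((x_meas - mu) / d_sqrt)         # whitened, mean-removed projection
    beta_hat = np.sqrt(lam_m - eps) / lam_m * z
    x_hat = mu + d_sqrt * (U_m @ ((1.0 - eps / lam_m) * z))
    return beta_hat, x_hat
```

A useful consistency check is that the mean plus `B @ beta_hat` reproduces `x_hat`, which follows from substituting Eq. (A.9) into Eq. (A.1).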
Determination of m, the dimensionality of the background subspace, is an information
theory problem. The key criterion for any parameterized background model to be effective for
chemical plume detection and characterization is that the dimensionality of the signal subspace
must be much less than the number of bands in the measured spectra. We attempted to
determine the number of statistically-significant signals in the data using information theoretic
criteria; specifically, the Akaike Information Criterion (AIC) and Minimum Description Length
(MDL) criteria [31]. (We note that the significance criterion in Appendix A of Ref. [22] is
equivalent to the MDL significance criterion.) The AIC and MDL criteria are appealing in that
each is a function of the calculated eigenvalues and are thereby simple to evaluate.
Unfortunately, when we applied the AIC and MDL criteria to multiple datacubes we observed that
neither criterion provided stable, plausible estimates of the number of statistically-significant
PCs; $m \approx k$ was a typical result even though the deviation between the model and
the data changed little after the first several basis vectors.
For this work we determined which basis vectors were statistically-significant by
applying an F-test. For a model spectrum calculated using (m-1) basis vector, the F-test for
statistical significance of the m-th basis vector is:
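$F(\mathbf{x};m) = (k-m-1)\left[\dfrac{\left(\mathbf{x}-\hat{\mathbf{x}}_{m-1}\right)^{T}\mathbf{D}^{-1}\left(\mathbf{x}-\hat{\mathbf{x}}_{m-1}\right)}{\left(\mathbf{x}-\hat{\mathbf{x}}_{m}\right)^{T}\mathbf{D}^{-1}\left(\mathbf{x}-\hat{\mathbf{x}}_{m}\right)} - 1\right]$ (A.11)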
where $\hat{\mathbf{x}}_{m-1}$ is the maximum likelihood spectrum calculated using (m-1) principal components
and $\hat{\mathbf{x}}_{m}$ is the maximum likelihood spectrum calculated using m principal components. For
normally-distributed noise, the calculated F-values follow an F-distribution with 1 and k-m-1 degrees
of freedom, $F(\mathbf{x};m) \sim F_{1,\,k-m-1}$. In this work we decide statistical significance based on the
cumulative distribution of F-values. When ≥5% of the calculated F-values exceeded the F-value
corresponding to 95% significance we considered the principal component statistically
significant. We make no claim that this test is optimal for determining the model order but note
that: 1) it produced the correct results using synthetic test data sets containing Gaussian additive
noise and 2) it produces seemingly reasonable results using real datacubes where noise is not
precisely Gaussian. It would be far more computationally-efficient to use an eigenvalue-based
method such as the AIC or MDL for determining model order; however, we were not able to
identify one which performed reliably with the data of interest.
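The F-test procedure can be sketched in a few lines. The sketch assumes that Eq. (A.11) has the form F(x; m) = (k − m − 1) [χ²_{m−1} / χ²_m − 1], where χ²_j = (x − x̂_j)^T D^{-1} (x − x̂_j), and that D is diagonal. The pass rule follows the Figure A.1 caption: a basis vector passes when at least 5% of the spectra have F values above the 95% critical value. The function names and the toy spectra are hypothetical. The critical value is passed in as a number; it could be obtained from, e.g., `scipy.stats.f.ppf(0.95, 1, k - m - 1)`.

```python
import numpy as np

def f_statistic(x, xhat_prev, xhat_curr, d, m):
    """F(x; m) in the assumed form of Eq. (A.11), with D taken as diagonal.

    x         : (..., k) measured spectra
    xhat_prev : (..., k) model spectra built from m-1 basis vectors
    xhat_curr : (..., k) model spectra built from m basis vectors
    d         : (k,) diagonal of D
    """
    k = x.shape[-1]
    chi2_prev = np.sum((x - xhat_prev) ** 2 / d, axis=-1)
    chi2_curr = np.sum((x - xhat_curr) ** 2 / d, axis=-1)
    return (k - m - 1) * (chi2_prev / chi2_curr - 1.0)

def basis_vector_significant(f_values, f_crit, min_fraction=0.05):
    """Return (passes, fraction of F values above the critical value)."""
    frac = float(np.mean(np.asarray(f_values) > f_crit))
    return frac >= min_fraction, frac

# Toy single-spectrum check: adding the m-th vector halves the weighted residual.
k, m = 20, 6
x = np.zeros(k)
xhat_prev = np.zeros(k)
xhat_prev[0] = np.sqrt(2.0)          # weighted residual chi2 = 2 with m-1 vectors
xhat_curr = np.zeros(k)
xhat_curr[0] = 1.0                   # weighted residual chi2 = 1 with m vectors
F = f_statistic(x, xhat_prev, xhat_curr, np.ones(k), m)

# Approximate tabulated 95% critical value for F(1, 13), i.e. k - m - 1 = 13.
passes, frac = basis_vector_significant([13.0, 1.0, 0.5, 0.2], f_crit=4.67)
```

In practice `f_statistic` would be evaluated for every spectrum in a quadrant at each candidate m, and m would be increased until `basis_vector_significant` first returns a failing result.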
As noted in Section 3, datacubes were processed on a quadrant-by-quadrant basis.
Figure A.1 shows the fraction of spectra passing the F-test as a function of m for the quadrant of
the datacube in which the synthetic spectra were embedded (upper middle). For that quadrant
m=6. Figure A.2 shows the rms residuals in each band as a function of m for m=4-7 for the
original (plume-free) datacube. Note that basis vectors have decreasing effect with increasing m.
8. Figure Captions
Figure 1. Stratified atmosphere model. Each layer defined to have uniform temperature ($T_i$),
pressure, and chemical composition; layer transmission is $\tau_i$. Chemical vapor plume of
interest is Layer p.
Figure 2. Gray scale representation of AIRIS-WAD datacube. Lighter pixels indicate higher
average radiance values; average radiance calculated over all twenty spectral bands acquired
by the sensor. Representative “sky,” “horizon,” and “ground” regions are indicated by the
white boxes and black box, respectively.
Figure 3. Radiance spectra corresponding to the “sky,” “horizon,” and “ground” regions in
Figure 2. Spectra shown are the average of all pixels in the identified region.
Figure 4. Calculated transmission spectra of 20, 197, and 591 ppmv-m R-134a plumes. High
resolution spectra have peak optical density of 0.1, 1.0, and 3.0 (base e), respectively. The
thick lines indicate the spectra calculated using Beer’s law and the R-134a spectrum from the
PNNL database. The thin dotted lines indicate the effective transmission which results from
convolving the high resolution spectrum with a 0.08 µm FWHM Lorentzian lineshape
function. The lower resolution spectra were used to augment AIRIS-WAD data.
Figure 5. Effective thermal contrast between the local air temperature and the effective
radiometric temperature of the background. Calculated median $\Delta T_{\mathrm{eff}}$ for each row in scene
depicted in Figure 2.
Figure 6. Uncertainty in the estimated column density as a function of elevation angle. Plot
shows median for each row in scene depicted in Figure 2; uncertainty calculated using Eq. (36).
Figure 7. Locations where synthetic R-134a plumes were added to AIRIS-WAD data. The
effective thermal contrast in Region 1 is 2.6 ± 0.5 K and the effective contrast in Region 2 is
5.9 ± 0.6 K.
Figure 8. R-134a optical densities estimated in Region 1 of Figure 7. Black circles indicate
median OD estimated by the Gauss-Newton algorithm. Open circles indicate median OD
estimated using the linear signal model given by Eq. (31). The error bars correspond to
$\pm 1\sigma$ variation in estimated column density calculated using Eq. (49).
Figure 9. R-134a optical densities estimated in Region 2 of Figure 7. Black circles indicate
median OD estimated by the Gauss-Newton algorithm. Open circles indicate median OD
estimated using the linear signal model given by Eq. (31). The error bars correspond to
$\pm 1\sigma$ variation in estimated column density calculated using Eq. (49).
Figure 10. Effective R-134a absorption cross-sections for OD=0.0, OD=1.0, and OD=3.0. The
OD=0.0 spectrum is used for estimation of plume OD with the Gauss-Newton algorithm and
linear estimator.
Figure 11. ROC curves for OD=0.1 R-134a plumes added to Regions 1 and 2: ■ = Gauss-
Newton solver applied to Region 2, = ACE applied to Region 2, ● = Gauss-Newton solver
applied to Region 1, = ACE applied to Region 1.
Figure 12. ROC curves for OD=0.3 R-134a plumes added to Regions 1 and 2: ■ = Gauss-
Newton solver applied to Region 2, = ACE applied to Region 2, ● = Gauss-Newton solver
applied to Region 1, = ACE applied to Region 1.
Figure 13. ROC curves for OD=1.0 R-134a plumes added to Regions 1 and 2: ■ = Gauss-
Newton solver applied to Region 2, = ACE applied to Region 2, ● = Gauss-Newton solver
applied to Region 1, = ACE applied to Region 1.
Figure 14. ROC curves for OD=2.0 R-134a plumes added to Regions 1 and 2: ■ = Gauss-
Newton solver applied to Region 2, = ACE applied to Region 2, ● = Gauss-Newton solver
applied to Region 1, = ACE applied to Region 1.
Figure 15. Representative spectra from Region 1 and best fits to data using linear and nonlinear
models: = original spectrum, ■ = original spectrum augmented with OD=3.0 R-134a
plume.
Figure 16. Rms fit residuals for Region 1: ◊ = fit to original data, = fit to data augmented with
OD=3.0 R-134a plume using nonlinear estimator, ■ = fit to data augmented with OD=3.0 R-
134a plume using linear model.
Figure 17. Number of iterations required for Gauss-Newton algorithm to converge, convergence
threshold = 0.01: ■=plume-free pixels, □ = Region 2 with no plume added, = Region 2
with OD=1.0 plume added, = Region 2 with OD=2.0 plume added, = Region 2 with
OD=3.0 plume added.
Figure A.1. Fraction of spectra passing F-test, Eq. (A.11), as function of number of basis
functions used to model data. Pass criterion is that the F values for ≥5% of the spectra exceed the F
value for 95% significance. Data correspond to upper middle quadrant of scene in Fig. 2.
Figure A.2. RMS residuals between model and data as function of m for m=4-7. Six basis
vectors were deemed to be statistically-significant using the F-test criterion.