1
Information Content
Tristan L’Ecuyer
2
Historical Perspective
Information theory has its roots in telecommunications and specifically in addressing the engineering problem of transmitting signals over noisy channels.
Papers in 1924 and 1928 by Harry Nyquist and Ralph Hartley, respectively, introduced the notion of information as a measurable quantity representing the ability of a receiver to distinguish different sequences of symbols.
The formal theory begins with Shannon (1948), the first to establish the connection between information content and entropy.
Since this seminal work, information theory has grown into a broad and deep mathematical field with applications in data communication, data compression, error-correction, and cryptographic algorithms (codes and ciphers).
Claude Shannon (1948), “A Mathematical Theory of Communication”, Bell System Technical Journal 27, pp. 379-423 and 623-656.
3
Link to Remote Sensing
Shannon (1948): “The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point.”
Similarly, the fundamental goal of remote sensing is to use measurements to reproduce a set of geophysical parameters, the “message”, that are defined or “selected” in the atmosphere, from a remote point of observation (e.g. a satellite).
Information theory makes it possible to examine the capacity of transmission channels (usually in bits), accounting for noise, signal gaps, and other forms of signal degradation.
Likewise in remote sensing we can use information theory to examine the “capacity” of a combination of measurements to convey information about the geophysical parameters of interest accounting for “noise” due to measurement error and model error.
4
Corrupting the Message: Noise and Non-uniqueness
Measurement and model error as well as the character of the forward model all introduce non-uniqueness in the solution.
[Figure: three panels (Linear Model, Quadratic Model, Cubic Model) showing how a measurement uncertainty ∆y maps onto the solution uncertainty ∆x, with “Unwanted Solutions” marked.]
5
Forward Model Errors (∆y)
Uncertainty due to unknown “influence parameters” that impact forward model calculations but are not directly retrieved often represents the largest source of retrieval error
Errors in these parameters introduce non-uniqueness in the solution space by broadening the effective measurement PDF
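Forward Problem:
$$y = F(x, b) + \varepsilon$$
where b are the “influence” parameters and ε combines forward model errors and measurement error.
Inverse Problem:
$$x = F^{-1}(y, b) + \varepsilon$$
where ε now combines errors in inversion and uncertainty in the “influence” parameters.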
6
Error Propagation in Inversion
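Bivariate PDF of (simulated minus observed) measurements. Its width is dictated by measurement error and uncertainty in forward model assumptions.
[Figure: measurement-space PDF with axes R(0.64 μm) and R(2.13 μm), the observation marked “Obs.”, and widths labeled σTB and σ∆TB.]
The error in the retrieved product follows from the width of the posterior distribution obtained by applying Bayes' theorem.
[Figure: posterior PDF with axes τ and Reff, the solution marked “Soln”, and widths labeled στ and σReff.]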
7
Visible Ice Cloud Retrievals
[Figure: 2.13 μm reflectance versus 0.66 μm reflectance, with curves for τ = 2, 10, 20, 30, 50 and for 8, 12, 24, and 48 μm.]
τ = 45±5; Re = 11±2
τ = 18±2; Re = 19±2
Due to assumptions: τ = 16-50; Re = 9-21
The Nakajima and King (1990) technique is based on a conservative-scattering visible channel for optical depth and an absorbing near-IR channel for Reff.
Influence parameters are crystal habit, particle size distribution, and surface albedo.
8
CloudSat Snowfall Retrievals
Snowfall retrievals relate reflectivity, Z, to snowfall rate, S.
This relationship depends on snow crystal shape, density, size distribution, and fall speed.
Since few, if any, of these factors can be retrieved from reflectivity alone, they all broaden the Z-S relationship and lead to uncertainty in the retrieved snowfall rate.
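The broadening described above can be illustrated with a minimal Python sketch. It assumes a power law Z = coef * S**expo, a common form that the slide does not state explicitly, and the coefficient pairs are hypothetical placeholders rather than values from the presentation.

```python
import math

def snowfall_rate(dbz, coef, expo):
    """Invert an assumed power law Z = coef * S**expo for S (mm/h).

    dbz is reflectivity in dBZ; Z in linear units is 10**(dbz / 10).
    """
    z_linear = 10.0 ** (dbz / 10.0)
    return (z_linear / coef) ** (1.0 / expo)

# Hypothetical (coef, expo) pairs standing in for different assumed
# crystal shapes and size distributions.
ASSUMED_RELATIONS = [(50.0, 1.5), (100.0, 1.5), (200.0, 1.5)]

def retrieved_range(dbz):
    """Return (min, max) snowfall rate over all assumed relations."""
    rates = [snowfall_rate(dbz, c, p) for c, p in ASSUMED_RELATIONS]
    return min(rates), max(rates)

low, high = retrieved_range(20.0)
print(f"20 dBZ maps to S between {low:.2f} and {high:.2f} mm/h")
```

A single reflectivity therefore maps to a range of snowfall rates whose width is set by how far apart the assumed relations are.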
9
Impacts of Crystal Shape (2-7 dBZ)
[Figure: reflectivity (dBZe) versus snowfall rate (mm h-1) for hexagonal columns and 4-arm, 6-arm, and 8-arm rosettes.]
10
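Impacts of PSD (3-6 dBZ)
[Figure, panel “Sensitivity to ν”: reflectivity (dBZe) versus snowfall rate (mm h-1) for ν = 0, ν = 1, and ν = 2.]
[Figure, panel “Sensitivity to PSD Shape”: reflectivity (dBZe) versus snowfall rate (mm h-1) for Sekhon/Srivastava, a & b = -10%, and a & b = +10%.]
$$N(D) = N_0 D^{\nu} e^{-\Lambda D}, \qquad \Lambda = \alpha S^{\beta}, \qquad N_0 = a S^{b}$$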
11
Implications for Retrieval
Given a “perfect” forward model, 1 dB measurement errors lead to errors in retrieved snowfall rate of less than 10%.
PSD and snow crystal shape, however, spread the range of allowable solutions in the absence of additional constraints.
[Figure: two panels, “Ideal Case” and “Reality”, each showing reflectivity versus snowfall rate (mm h-1).]
12
Quantitative Retrieval Metrics
Four useful metrics for assessing how well formulated a retrieval problem is:
– Sx: the error covariance matrix, a diagnostic of retrieval performance that measures the uncertainty in the products
– A: the averaging kernel, which describes, among other things, the amount of information that comes from the measurements as opposed to a priori information
– Degrees of freedom
– Information content
All require accurate specification of uncertainties in all inputs including errors due to forward model assumptions, measurements, and any mathematical approximations required to map geophysical parameters into measurement space.
13
Degrees of Freedom
The cost function can be used to define two very useful measures of the quality of a retrieval: the number of degrees of freedom for signal and for noise, denoted ds and dn, respectively:
$$\Phi = \underbrace{(x - x_a)^T S_a^{-1} (x - x_a)}_{d_s} + \underbrace{(y - Kx)^T S_y^{-1} (y - Kx)}_{d_n}$$
where Sa is the covariance matrix describing the prior state space, Sy is the covariance matrix describing measurement and forward model errors, and K represents the Jacobian of the measurements with respect to the parameters of interest.
ds specifies the number of observations that are actually used to constrain retrieval parameters, while dn is the corresponding number that are lost due to noise.
Clive Rodgers (2000), “Inverse Methods for Atmospheric Sounding: Theory and Practice”, World Scientific, 238 pp.
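As a minimal sketch of how the two terms of the cost function are evaluated, the following Python example uses NumPy with a linear forward model y = Kx. All matrices and vectors are hypothetical values chosen for illustration only.

```python
import numpy as np

def cost_function(x, x_a, y, K, S_a, S_y):
    """Return the prior and measurement terms of the cost function.

    Assumes a linear forward model y = K x.
    """
    dx = x - x_a
    dy = y - K @ x
    # solve(S, v) computes S^-1 v without forming the inverse explicitly.
    prior_term = float(dx @ np.linalg.solve(S_a, dx))
    measurement_term = float(dy @ np.linalg.solve(S_y, dy))
    return prior_term, measurement_term

# Hypothetical two-parameter, two-measurement example.
K = np.array([[1.0, 0.5], [0.2, 1.0]])
S_a = np.diag([1.0, 1.0])
S_y = np.diag([0.01, 0.01])
x_a = np.array([1.0, 1.0])
x_true = np.array([2.0, 2.0])
y = K @ x_true  # noise-free synthetic measurement

prior_term, measurement_term = cost_function(x_true, x_a, y, K, S_a, S_y)
print(prior_term, measurement_term)
```

With a noise-free measurement evaluated at the true state, the measurement term vanishes and only the departure from the prior contributes.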
14
Degrees of Freedom
Using the expression for the state vector that minimizes the cost function, it is relatively straightforward to show that
$$d_s = \mathrm{Tr}\left[\left(K^T S_y^{-1} K + S_a^{-1}\right)^{-1} K^T S_y^{-1} K\right] = \mathrm{Tr}[A]$$
$$d_n = \mathrm{Tr}\left[S_y \left(K S_a K^T + S_y\right)^{-1}\right] = \mathrm{Tr}[I_m] - \mathrm{Tr}[A]$$
where Im is the m × m identity matrix and A is the averaging kernel.
NOTE: Even if the number of retrieval parameters is equal to or less than the number of measurements, a retrieval can still be under-constrained if noise and redundancy are such that the number of degrees of freedom for signal is less than the number of parameters to be retrieved.
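The degrees of freedom can be computed directly from K, Sa, and Sy. The NumPy sketch below uses a hypothetical case with three measurements and two parameters, where two of the measurements are nearly redundant, to illustrate the note above.

```python
import numpy as np

def degrees_of_freedom(K, S_a, S_y):
    """Return (d_s, d_n, A) for a linear Gaussian retrieval."""
    kt_sy_k = K.T @ np.linalg.inv(S_y) @ K
    # Averaging kernel A = (K^T S_y^-1 K + S_a^-1)^-1 K^T S_y^-1 K
    A = np.linalg.solve(kt_sy_k + np.linalg.inv(S_a), kt_sy_k)
    d_s = float(np.trace(A))
    # d_n = Tr[S_y (K S_a K^T + S_y)^-1]
    d_n = float(np.trace(S_y @ np.linalg.inv(K @ S_a @ K.T + S_y)))
    return d_s, d_n, A

# Hypothetical case: three measurements, two retrieved parameters.
# The second and third measurements are nearly redundant.
K = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.01]])
S_a = np.diag([1.0, 1.0])
S_y = np.diag([0.25, 0.25, 0.25])

d_s, d_n, A = degrees_of_freedom(K, S_a, S_y)
print(f"d_s = {d_s:.3f}, d_n = {d_n:.3f}, d_s + d_n = {d_s + d_n:.3f}")
```

Here d_s + d_n equals the number of measurements, and d_s stays below the number of parameters even though there are more measurements than unknowns.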
15
Entropy-based Information Content
The Gibbs entropy is the logarithm of the number of discrete internal states of a thermodynamic system:
$$S(P) = -k \sum_i p_i \ln p_i$$
where pi is the probability of the system being in state i and k is the Boltzmann constant.
The information theory analogue has k = 1 and the pi representing the probabilities of all possible combinations of retrieval parameters.
More generally, for a continuous distribution (e.g. Gaussian):
$$S[P(x)] = -\int P(x) \log_2 P(x)\, dx$$
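A minimal Python sketch of the discrete entropy with k = 1 follows. It uses a base-2 logarithm so that the result is in bits, which is a choice made here for consistency with the continuous form; the two probability lists are hypothetical.

```python
import math

def shannon_entropy_bits(probabilities):
    """Entropy -sum(p_i * log2(p_i)); zero-probability states contribute 0."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probabilities if p > 0.0)

uniform = [0.25] * 4                # four equally likely states
peaked = [0.97, 0.01, 0.01, 0.01]   # one state strongly favored

print(shannon_entropy_bits(uniform))
print(shannon_entropy_bits(peaked))
```

The uniform case has the larger entropy: the less that is known about which state the system occupies, the higher the entropy.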
16
Entropy of a Gaussian Distribution
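For the Gaussian distributions typically used in optimal estimation,
$$P(x) = \frac{1}{(2\pi)^{1/2}\sigma} \exp\left[-\frac{(x-\bar{x})^2}{2\sigma^2}\right]$$
we have:
$$S[P(x)] = \frac{1}{(2\pi)^{1/2}\sigma} \int \exp\left[-\frac{(x-\bar{x})^2}{2\sigma^2}\right] \left\{ \log_2\left[(2\pi)^{1/2}\sigma\right] + \frac{(x-\bar{x})^2}{2\sigma^2} \log_2 e \right\} dx$$
$$S[P(x)] = \log_2\left[\sigma (2\pi e)^{1/2}\right]$$
For an m-variable Gaussian distribution:
$$S[P(\mathbf{x})] = m \log_2\left[(2\pi e)^{1/2}\right] + \tfrac{1}{2} \log_2 \left|S_y\right|$$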
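The closed-form entropy of a one-dimensional Gaussian can be checked numerically. The sketch below integrates -P log2 P with a simple midpoint rule and compares the result with log2[σ(2πe)^(1/2)]; the integration range, step count, and σ value are arbitrary choices made for illustration.

```python
import math

def gaussian_pdf(x, mean, sigma):
    """One-dimensional Gaussian probability density."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma**2)) / (
        math.sqrt(2.0 * math.pi) * sigma
    )

def entropy_numeric_bits(mean, sigma, half_width=10.0, steps=20000):
    """Midpoint-rule estimate of -integral(P * log2 P) over mean +/- half_width * sigma."""
    lower = mean - half_width * sigma
    dx = 2.0 * half_width * sigma / steps
    total = 0.0
    for i in range(steps):
        p = gaussian_pdf(lower + (i + 0.5) * dx, mean, sigma)
        if p > 0.0:
            total -= p * math.log2(p) * dx
    return total

def entropy_closed_form_bits(sigma):
    """Closed form: log2[sigma * (2 * pi * e)**0.5]."""
    return math.log2(sigma * math.sqrt(2.0 * math.pi * math.e))

sigma = 1.5
numeric = entropy_numeric_bits(0.0, sigma)
closed = entropy_closed_form_bits(sigma)
print(numeric, closed)
```

The entropy depends only on σ, not on the mean, which is why the information content defined later depends only on covariances.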
17
Information Content of a Retrieval
The information content of an observing system is defined as the difference in entropy between an a priori set of possible solutions, S(P1), and the subset of these solutions that also satisfy the measurements, S(P2):
$$H = S(P_1) - S(P_2)$$
If Gaussian distributions are assumed for the prior and posterior state spaces, as in the optimal estimation (O. E.) approach, this can be written:
$$H = \tfrac{1}{2} \log_2 \left|S_1 S_2^{-1}\right| = \tfrac{1}{2} \log_2 \left|S_a \left(K^T S_y^{-1} K + S_a^{-1}\right)\right|$$
since, after minimizing the cost function, the covariance of the posterior state space is:
$$S_x = \left(S_a^{-1} + K^T S_y^{-1} K\right)^{-1}$$
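A minimal NumPy sketch of this calculation follows. It forms the posterior covariance and evaluates H from log-determinants; the matrices are hypothetical, and the use of `slogdet` is an implementation choice for numerical stability rather than part of the original formulation.

```python
import numpy as np

def information_content_bits(K, S_a, S_y):
    """Return (H, S_x) with H = 0.5 * log2(|S_a| / |S_x|)."""
    S_x = np.linalg.inv(np.linalg.inv(S_a) + K.T @ np.linalg.inv(S_y) @ K)
    # slogdet returns (sign, natural log of |det|); working with logs avoids
    # overflow or underflow for very large or very small determinants.
    _, logdet_a = np.linalg.slogdet(S_a)
    _, logdet_x = np.linalg.slogdet(S_x)
    H = 0.5 * (logdet_a - logdet_x) / np.log(2.0)
    return float(H), S_x

# Hypothetical two-parameter, two-measurement example.
K = np.array([[1.0, 0.0], [0.0, 2.0]])
S_a = np.diag([1.0, 1.0])
S_y = np.diag([1.0 / 3.0, 1.0 / 3.0])

H, S_x = information_content_bits(K, S_a, S_y)
print(f"H = {H:.3f} bits")
```

Setting K to zero (measurements with no sensitivity) leaves the posterior equal to the prior and gives H = 0.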
18
Interpretation
Qualitatively, information content describes the factor by which knowledge of a quantity is improved by making a measurement.
Using Gaussian statistics we see that the information content provides a measure of how much the ‘volume of uncertainty’ represented by the a priori state space is reduced after measurements are made:
$$H = \tfrac{1}{2} \log_2 \left|S_a S_x^{-1}\right|$$
Essentially this is a generalization of the scalar concept of ‘signal-to-noise’ ratio.
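The link to signal-to-noise can be made explicit with a scalar worked example. The symbols below are assumptions introduced for illustration: one parameter with prior variance σ_a², one measurement with error variance σ_y², and a sensitivity k = ∂y/∂x.

```latex
\sigma_x^2 = \left(\frac{1}{\sigma_a^2} + \frac{k^2}{\sigma_y^2}\right)^{-1},
\qquad
H = \frac{1}{2}\log_2\frac{\sigma_a^2}{\sigma_x^2}
  = \frac{1}{2}\log_2\left(1 + \frac{k^2\sigma_a^2}{\sigma_y^2}\right)
```

Here k σ_a is the spread in the measurement produced by prior variability in x (the signal) and σ_y is the noise, so H grows with the logarithm of their ratio.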
19
Measuring Stick Analogy
Information content measures the resolution with which the observing system can resolve the solution space.
This is analogous to the divisions on a measuring stick: the higher the information content, the finer the scale that can be resolved.
[Figure: a measuring stick spanning the full range of a priori solutions, divided at four successively finer scales.]
A: Biggest scale = 2 divisions, H = 1
B: Next finer scale = 4 divisions, H = 2
C: Finer still = 8 divisions, H = 3
D: Finest scale = 16 divisions, H = 4
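The analogy amounts to H = log2(number of divisions). A short Python sketch of that relation and its inverse follows; the function names are illustrative only.

```python
import math

def information_content_from_divisions(divisions):
    """H = log2(number of equally sized divisions that can be resolved)."""
    return math.log2(divisions)

def divisions_from_information_content(h_bits):
    """Inverse relation: each additional bit doubles the number of divisions."""
    return 2.0 ** h_bits

for label, divisions in [("A", 2), ("B", 4), ("C", 8), ("D", 16)]:
    print(label, divisions, information_content_from_divisions(divisions))
```

Equivalently, narrowing the range of solutions by a factor of 4 corresponds to H = 2.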
20
Liquid Cloud Retrievals
Blue: a priori state space
Green: state space that also matches the MODIS visible channel (0.64 μm)
Red: state space that matches both the 0.64 and 2.13 μm channels
Yellow: state space that matches all 17 MODIS channels
[Figure: panels of LWP (g m-3) versus Re (μm): Prior State Space; 0.64 μm (H = 1.20); 0.64 & 2.13 μm (H = 2.51); 17 Channels (H = 3.53).]
21
Snowfall Retrieval Revisited
With a 140 GHz brightness temperature accurate to ±5 K as a constraint, the range of solutions is significantly narrowed by up to a factor of 4 implying an information content of ~2.
[Figure: two panels, “Radar Only” and “Radar + Radiometer”, each showing reflectivity versus snowfall rate (mm h-1).]
22
Return to Polynomial Functions
$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} a_1 & a_2 \\ a_2 & a_3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}^{N} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$$
X1 = X2 = 2; X1a = X2a = 1

σy = 10%, σa = 100%
Order, N   X1      X2      Error (%)   ds      H
1          1.984   1.988   18          1.933   1.45
2          1.996   1.998   9           1.985   2.19
5          1.999   2.000   3           1.998   3.16

σy = 10%, σa = 10%
Order, N   X1      X2      Error (%)   ds      H
1          1.401   1.432   8           0.568   0.07
2          1.682   1.771   7           1.099   0.21
5          1.927   1.976   3           1.784   0.83

σy = 25%, σa = 100%
Order, N   X1      X2      Error (%)   ds      H
1          1.909   1.929   41          1.659   0.65
2          1.976   1.986   21          1.911   1.29
5          1.996   1.998   8           1.987   2.25
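The trend in these tables (higher polynomial order gives more degrees of freedom for signal and more information) can be reproduced qualitatively with a one-parameter simplification. The sketch below assumes y = coef * x**N with no offset, linearizes about the true state, and converts the fractional errors to absolute ones using the true measurement and the prior state. These are assumptions made here, so the numbers will not match the two-parameter results above.

```python
import math

def scalar_diagnostics(order, x_true=2.0, x_prior=1.0,
                       frac_sigma_y=0.10, frac_sigma_a=1.00, coef=1.0):
    """Return (d_s, H) for y = coef * x**order linearized about x_true.

    Fractional errors are converted to absolute standard deviations using
    the true measurement and the prior state, respectively.
    """
    y_true = coef * x_true**order
    jacobian = order * coef * x_true ** (order - 1)  # dy/dx at x_true
    sigma_y = frac_sigma_y * y_true
    sigma_a = frac_sigma_a * x_prior
    snr_sq = (jacobian * sigma_a / sigma_y) ** 2
    d_s = snr_sq / (1.0 + snr_sq)
    h_bits = 0.5 * math.log2(1.0 + snr_sq)
    return d_s, h_bits

for order in (1, 2, 5):
    d_s, h_bits = scalar_diagnostics(order)
    print(f"N = {order}: d_s = {d_s:.3f}, H = {h_bits:.2f} bits")
```

For a pure power law the fractional sensitivity dy/y equals N dx/x, so a fixed fractional measurement error constrains x more tightly as N increases.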
23
Application: MODIS Cloud Retrievals
The concept of information content provides a useful tool for analyzing the properties of observing systems within the constraints of realistic error assumptions.
As an example, consider the problem of assessing the information content of the channels on the MODIS instrument for retrieving cloud microphysical properties.
Application of information theory requires the following steps:
– Characterize the expected uncertainty in modeled radiances due to assumed temperature, humidity, ice crystal shape/density, particle size distribution, etc. (i.e. evaluate Sy);
– Determine the sensitivity of each radiance to the microphysical properties of interest (i.e. compute K);
– Establish error bounds provided by any available a priori information (e.g. cloud height from CloudSat);
– Evaluate diagnostics such as Sx, A, ds, and H.
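The first two steps above can be sketched in Python with NumPy. Everything in this example is hypothetical: the toy forward model, the state and influence-parameter values, and the error magnitudes. It also assumes one common way of building Sy, namely adding instrument noise to the influence-parameter uncertainty mapped into measurement space; the presentation does not specify how Sy was constructed.

```python
import numpy as np

def toy_forward_model(x, b):
    """Hypothetical forward model: two 'radiances' from state x and influence b."""
    tau, r_eff = x
    return np.array([
        1.0 - np.exp(-0.1 * tau * b[0]),
        np.exp(-0.05 * r_eff) * (1.0 - np.exp(-0.2 * tau)),
    ])

def finite_difference_jacobian(func, x, rel_step=1e-4):
    """Approximate d func / d x column by column with central differences."""
    x = np.asarray(x, dtype=float)
    columns = []
    for j in range(x.size):
        step = rel_step * max(abs(x[j]), 1.0)
        hi, lo = x.copy(), x.copy()
        hi[j] += step
        lo[j] -= step
        columns.append((func(hi) - func(lo)) / (2.0 * step))
    return np.column_stack(columns)

x0 = np.array([10.0, 20.0])           # assumed state: optical depth, radius (um)
b0 = np.array([1.0])                  # assumed influence parameter
S_b = np.array([[0.1**2]])            # assumed uncertainty in b
S_eps = np.diag([0.02**2, 0.02**2])   # assumed instrument noise

# Sensitivity to the retrieved state (K) and to the influence parameter (K_b).
K = finite_difference_jacobian(lambda x: toy_forward_model(x, b0), x0)
K_b = finite_difference_jacobian(lambda b: toy_forward_model(x0, b), b0)
# Total measurement-space covariance: noise plus mapped influence uncertainty.
S_y = S_eps + K_b @ S_b @ K_b.T
print(K)
print(S_y)
```

In this toy case only the first channel depends on the influence parameter, so only its variance is inflated beyond the instrument noise.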
1. L’Ecuyer et al. (2006), J. Appl. Meteor. 45, 20-41.
2. Cooper et al. (2006), J. Appl. Meteor. 45, 42-62.
24
Error Analyses
Fractional errors reveal a strong scene-dependence that varies from channel to channel.
LW channels are typically better at lower optical depths while SW channels improve at higher values.
25
Sensitivity Analyses
The sensitivity matrices also illustrate a strong scene dependence that varies from channel to channel.
The SW channels have the best sensitivity to number concentration in optically thick clouds and effective radius in thin clouds.
LW channels exhibit the most sensitivity to cloud height for thick clouds and to number concentration for clouds with optical depths between 0.5 and 4.
[Figure: sensitivity matrices for the 0.646 μm, 2.130 μm, and 11.00 μm channels.]
26
Information Content
Information content is related to the ratio of the sensitivity to the uncertainty, i.e. the signal-to-noise ratio.
[Figure: H and ds, each shown in panels labeled 14 km, 11 km, and 9 km.]
27
The Importance of Uncertainties
Rigorous specification of forward model uncertainties is critical for an accurate assessment of the information content of any set of measurements.
[Figure: panels comparing “Uniform 10% Errors” with “Rigorous Errors”, each labeled 11 km.]
28
The Role of A Priori
Information content measures the amount by which the state space is reduced relative to prior information.
As prior information improves, the information content of the measurements decreases.
The presence of cloud height information from CloudSat, for example, constrains the a priori state space and reduces the information content of the MODIS observations.
[Figure: panels comparing “Without CloudSat” and “With CloudSat”, each labeled 11 km.]
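The dependence on the prior can be illustrated with a one-parameter Python sketch. It assumes a single measurement with sensitivity `jacobian` and Gaussian errors, and all numbers are hypothetical.

```python
import math

def scalar_information_content(jacobian, sigma_a, sigma_y):
    """H = 0.5 * log2(1 + (jacobian * sigma_a / sigma_y)**2) for one parameter."""
    return 0.5 * math.log2(1.0 + (jacobian * sigma_a / sigma_y) ** 2)

def posterior_sigma(jacobian, sigma_a, sigma_y):
    """Posterior standard deviation for one parameter and one measurement."""
    return (1.0 / sigma_a**2 + jacobian**2 / sigma_y**2) ** -0.5

# Hypothetical numbers: same measurement, progressively better prior.
for sigma_a in (4.0, 2.0, 1.0, 0.5):
    h = scalar_information_content(1.0, sigma_a, 1.0)
    s = posterior_sigma(1.0, sigma_a, 1.0)
    print(f"sigma_a = {sigma_a}: H = {h:.2f} bits, posterior sigma = {s:.2f}")
```

A tighter prior lowers H even though the final uncertainty also shrinks: the measurement contributes less because more was already known.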