Verifying Satellite Precipitation Estimates for Weather and Hydrological Applications Beth Ebert Bureau of Meteorology Research Centre Melbourne, Australia

1st IPWG Workshop, 23-27 September 2002, Madrid

val.i.date tr.v. 1. To declare or make legally valid. 2. To mark with an indication of official sanction. 3. To substantiate; verify.

ver.i.fy tr.v. 1. To prove the truth of by the presentation of evidence or testimony; substantiate. 2. To determine or test the truth or accuracy of, as by comparison, investigation, or reference: "Findings are not accepted by scientists unless they can be verified" (Norman L. Munn)

The American Heritage Dictionary of the English Language. William Morris, editor, Houghton Mifflin, Boston, 1969.

Satellite precipitation estimates -- what do we especially want to get right?

Climatologists - mean bias

NWP data assimilation (physical initialization) - rain location and type

Hydrologists - rain volume

Forecasters and emergency managers - rain location and maximum intensity

Everyone needs error estimates!

Short-term precipitation estimates

• High spatial and temporal resolution desirable

• Dynamic range required

• Motion may be important for nowcasts

• Can live with some bias in the estimates if it's not too great

• Verification data need not be quite as accurate as for climate verification

• Land-based rainfall generally of greater interest than ocean-based

Some truths about "truth" data

• No existing measurement system adequately captures the high spatial and temporal variability of rainfall.

• Errors in validation data artificially inflate errors in satellite precipitation estimates

Rain gauge observations

Advantages:

• True rain measurements

Disadvantages:

• May be unrepresentative of areal value

• Verification results biased toward regions with high gauge density

• Most obs made once daily

Radar data

Advantages:

• Excellent spatial and temporal resolution

Disadvantages:

• Beamfilling, attenuation, overshoot, clutter, etc.

• Limited spatial extent

TRMM PR

Rain gauge analyses

Advantages:

• Grid-scale quantities

• Overcomes uneven distribution of raingauges

Disadvantages:

• Smoothes actual rainfall values

Stream flow measurements

Advantages:

• Integrates rainfall over a catchment

• Many accurate measurements available

• Hydrologists want it

Disadvantages:

• Depends on soil conditions, hydrological model

• Time delay between rain and outflow

• Blurs spatial distribution

[Figure: discharge (m3/hr) versus time, estimated and observed]

Verification strategy for satellite precipitation estimates

Use (gauge-corrected) radar data for local instantaneous or very short-term estimates

Use gauge or radar-gauge analysis for larger spatial and/or temporal estimates

Focus on methods, not results

• What scores and methods can we use to verify precipitation estimates?

• What do they tell us about the quality of precipitation estimates?

• What are some of the advantages and disadvantages of these methods?

• Will focus on spatial verification

Does the satellite estimate look right?

• Is the rain in the correct place?

• Does it have the correct mean value?

• Does it have the correct maximum value?

• Does it have the correct size?

• Does it have the correct shape?

• Does it have the correct spatial variability?

Spatial verification methods

"Standard" methods:

• Visual ("eyeball") verification

• Continuous statistics

• Categorical statistics

• Joint distributions

"Scientific" or "diagnostic" methods:

• Scale decomposition methods

• Entity-based methods

Step 1: Visual ("eyeball") verification

Visually compare maps of satellite estimates and observations

Advantage: "A picture tells a thousand words…"

Disadvantages: Labor intensive, not quantitative, subjective

Verifies this attribute? Location, Size, Shape, Mean value, Maximum value, Spatial variability

Rozumalski, 2000

Continuous verification statistics

Measure the correspondence between the values of the estimates and observations

Examples:

• mean error (bias)

• mean absolute error

• root mean squared error

• skill score

• linear error in probability space (LEPS)

• correlation coefficient

Advantages: Simple, familiar

Disadvantage: Not very revealing as to what's going wrong in the forecast

Mean absolute error

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |F_i - O_i|$$

Measures: Average magnitude of forecast error

Root mean square error

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (F_i - O_i)^2}$$

Measures: Error magnitude, with large errors having a greater impact than in the MAE


Mean error (bias)

$$\mathrm{Mean\ Error} = \frac{1}{N} \sum_{i=1}^{N} (F_i - O_i)$$

Measures: Average difference between forecast and observed values
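The three error statistics above (mean absolute error, root mean square error, and mean error) can be sketched in a few lines. This is a minimal illustration, assuming the estimates F and observations O have already been matched box by box; the function names and the five sample values are hypothetical.

```python
import math

def mean_error(est, obs):
    """Mean error (bias): average of F_i - O_i."""
    return sum(f - o for f, o in zip(est, obs)) / len(obs)

def mean_absolute_error(est, obs):
    """MAE: average magnitude of F_i - O_i."""
    return sum(abs(f - o) for f, o in zip(est, obs)) / len(obs)

def root_mean_square_error(est, obs):
    """RMSE: square root of the average squared difference."""
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(est, obs)) / len(obs))

# Hypothetical 24-hr rainfall (mm/d) at five matched grid boxes.
estimated = [0.0, 2.0, 5.0, 10.0, 1.0]
observed = [0.0, 1.0, 8.0, 4.0, 1.0]

print("Mean error:", mean_error(estimated, observed))
print("MAE:", mean_absolute_error(estimated, observed))
print("RMSE:", root_mean_square_error(estimated, observed))
```

In this sample the single 6 mm/d overestimate pushes the RMSE well above the MAE, which is the sensitivity to large errors noted above.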

Time series of error statistics

24-hr rainfall from NRL Experimental Geostationary algorithm validated against Australian operational daily rain gauge analysis

0.25° grid boxes, tropics only

Linear error in probability space (LEPS)

Measures: Probability error - does not penalise going out on a limb when it is justified.

$$\mathrm{LEPS} = \frac{1}{N} \sum_{i=1}^{N} |\mathrm{CDF}_o(F_i) - \mathrm{CDF}_o(O_i)|$$

Verifies this attribute? Location, Size, Shape, Mean value, Maximum value, Spatial variability

[Figure: cumulative probability of observations (CDFo) versus value, with the error marked as the difference in CDFo between Oi and Fi]
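A rough sketch of the LEPS calculation follows. It assumes CDFo is approximated by the empirical cumulative distribution of a sample of observations; the sample, the function names, and the three matched values are hypothetical.

```python
from bisect import bisect_right

def empirical_cdf(sample):
    """Return a function giving the fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n

def leps(est, obs, cdf_obs):
    """Mean absolute difference between CDFo(F_i) and CDFo(O_i)."""
    return sum(abs(cdf_obs(f) - cdf_obs(o)) for f, o in zip(est, obs)) / len(obs)

# Hypothetical sample of observed daily rainfall (mm/d) used to build CDFo.
observed_sample = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
cdf_o = empirical_cdf(observed_sample)

# Hypothetical matched estimates and observations (mm/d).
estimated = [0.0, 5.0, 50.0]
observed = [0.0, 10.0, 20.0]

print("LEPS:", leps(estimated, observed, cdf_o))
```

Here the 30 mm/d error in the heavy-rain pair costs no more than the 5 mm/d error in the moderate pair, because both pairs sit equally far apart in cumulative probability.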

Correlation coefficient

$$r = \frac{\sum (F - \bar{F})(O - \bar{O})}{\sqrt{\sum (F - \bar{F})^2}\,\sqrt{\sum (O - \bar{O})^2}}$$

Measures: Correspondence between estimated spatial distribution and observed spatial distribution, independent of mean bias
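The independence from mean bias can be illustrated with a small sketch. The function name and the sample values are hypothetical; the "estimate" is simply the observations doubled, so it has a large bias but an identical spatial pattern.

```python
import math

def correlation(est, obs):
    """Linear correlation coefficient between estimates and observations."""
    n = len(obs)
    f_mean = sum(est) / n
    o_mean = sum(obs) / n
    covariance = sum((f - f_mean) * (o - o_mean) for f, o in zip(est, obs))
    var_f = sum((f - f_mean) ** 2 for f in est)
    var_o = sum((o - o_mean) ** 2 for o in obs)
    return covariance / math.sqrt(var_f * var_o)

# Hypothetical observations (mm/d) and an estimate that doubles every value.
observed = [0.0, 1.0, 4.0, 8.0]
doubled = [2.0 * o for o in observed]

print("r:", correlation(doubled, observed))
print("Mean error:", sum(f - o for f, o in zip(doubled, observed)) / len(observed))
```

The correlation is perfect even though the mean error is large, so r should be read alongside a bias measure rather than on its own.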


Danger...

Rozumalski, 2000

AutoEstimator validated against Stage III

8x8 km grid boxes

Skill score

$$\mathrm{Skill\ score} = \frac{\mathrm{score}_{estimate} - \mathrm{score}_{reference}}{\mathrm{score}_{perfect} - \mathrm{score}_{reference}}$$

Measures: Improvement over a reference estimate. When MSE is the score used in the above expression then the resulting statistic is called the reduction of variance.

The reference estimate is usually one of the following:

(a) random chance

(b) climatology

(c) persistence

but it could be another estimation algorithm.
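As a minimal sketch of the skill score with MSE as the underlying score, the example below uses a constant "climatology" as the reference estimate. The function names, the sample values, and the choice of the sample mean as the climatological value are all assumptions made for illustration.

```python
def mse(est, obs):
    """Mean squared error between estimates and observations."""
    return sum((f - o) ** 2 for f, o in zip(est, obs)) / len(obs)

def skill_score(score_estimate, score_reference, score_perfect):
    """Improvement of the estimate over the reference, scaled by the best possible improvement."""
    return (score_estimate - score_reference) / (score_perfect - score_reference)

# Hypothetical matched values (mm/d).
observed = [0.0, 1.0, 8.0, 4.0, 1.0]
estimated = [0.0, 2.0, 6.0, 5.0, 1.0]

# Assumed reference: a constant climatological value, here the sample mean.
climatology = [sum(observed) / len(observed)] * len(observed)

# A perfect estimate has MSE = 0.
reduction_of_variance = skill_score(
    mse(estimated, observed), mse(climatology, observed), 0.0
)
print("Reduction of variance:", reduction_of_variance)
```

A value of 1 means a perfect estimate, 0 means no improvement over the reference, and negative values mean the estimate is worse than the reference.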


Cross-validation - useful when observations are included in the estimates

$$\mathrm{score} = \mathrm{score}(Y_i^*, O_i), \quad i = 1, \ldots, N$$

where Yi* is the estimate at point i computed with Oi excluded from the analysis

Measures: Expected accuracy at the scale of the observations. The score is usually bias, MAE, RMS, correlation, etc.
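A leave-one-out sketch of the cross-validation idea is shown below. The analysis scheme is an assumption: simple inverse-distance weighting stands in for whatever blended estimate is being verified, and the gauge coordinates and amounts are hypothetical.

```python
import math

def idw_estimate(target, gauges, power=2.0):
    """Inverse-distance-weighted rainfall at target (x, y) from (x, y, rain) gauges."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, rain in gauges:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return rain
        weight = distance ** -power
        weighted_sum += weight * rain
        weight_total += weight
    return weighted_sum / weight_total

def leave_one_out(gauges):
    """Return (Y_i*, O_i) pairs, where Y_i* is computed with gauge i withheld."""
    pairs = []
    for i, (x, y, rain) in enumerate(gauges):
        others = gauges[:i] + gauges[i + 1:]
        pairs.append((idw_estimate((x, y), others), rain))
    return pairs

# Hypothetical gauges: (x, y, 24-hr rainfall in mm).
gauges = [(0.0, 0.0, 2.0), (1.0, 0.0, 4.0), (0.0, 1.0, 6.0), (1.0, 1.0, 8.0)]

pairs = leave_one_out(gauges)
cross_validated_mae = sum(abs(y - o) for y, o in pairs) / len(pairs)
print("Cross-validated MAE:", cross_validated_mae)
```

Any of the continuous scores (bias, MAE, RMS, correlation) can be computed from the returned pairs in the same way.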


Categorical statistics

Measure the correspondence between estimated and observed occurrence of events

Examples:

• bias score

• probability of detection

• false alarm ratio

• threat score

• equitable threat score

• odds ratio

• Hanssen and Kuipers score

• Heidke skill score

Advantages: Simple, familiar

Disadvantage: Not very revealing

                    Estimated
                    yes             no
Observed   yes      hits            misses
           no       false alarms    correct negatives

[Figure: estimated and observed rain areas, with regions labeled false alarms, hits, misses, and correct negatives]

Categorical statistics

Bias score

$$\mathrm{BIAS} = \frac{\mathrm{hits} + \mathrm{false\ alarms}}{\mathrm{hits} + \mathrm{misses}}$$

Measures: Ratio of estimated area (frequency) to observed area (frequency)


Probability of Detection

$$\mathrm{POD} = \frac{\mathrm{hits}}{\mathrm{hits} + \mathrm{misses}}$$


False Alarm Ratio

$$\mathrm{FAR} = \frac{\mathrm{false\ alarms}}{\mathrm{hits} + \mathrm{false\ alarms}}$$

Threat score (critical success index)

$$\mathrm{TS} = \mathrm{CSI} = \frac{\mathrm{hits}}{\mathrm{hits} + \mathrm{misses} + \mathrm{false\ alarms}}$$

Equitable threat score

$$\mathrm{ETS} = \frac{\mathrm{hits} - \mathrm{hits}_{random}}{\mathrm{hits} + \mathrm{misses} + \mathrm{false\ alarms} - \mathrm{hits}_{random}}$$

Odds ratio

$$\mathrm{OR} = \frac{\mathrm{hits} \times \mathrm{correct\ negatives}}{\mathrm{misses} \times \mathrm{false\ alarms}}$$

Hanssen and Kuipers discriminant (true skill statistic)

Measures: Ability of the estimation method to separate the "yes" cases from the "no" cases.

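$$\mathrm{HK} = \frac{\mathrm{hits}}{\mathrm{hits} + \mathrm{misses}} - \frac{\mathrm{false\ alarms}}{\mathrm{false\ alarms} + \mathrm{correct\ negatives}}$$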


Heidke skill score

Measures: Fraction of correct yes/no detections after eliminating those which would be correct due purely to random chance

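$$\mathrm{HSS} = \frac{\mathrm{hits} + \mathrm{correct\ negatives} - \mathrm{correct}_{random}}{N - \mathrm{correct}_{random}}$$

$$\mathrm{correct}_{random} = \frac{1}{N}\left[(\mathrm{Obs}_{yes} \times \mathrm{Est}_{yes}) + (\mathrm{Obs}_{no} \times \mathrm{Est}_{no})\right]$$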
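The categorical scores above can all be computed from the four contingency table counts, as in the sketch below. The function names, the rain threshold convention (rain is "yes" when the value is at or above the threshold), and the sample counts are assumptions. The slides do not define hits_random; the usual definition, (observed yes × estimated yes) / N, is assumed here.

```python
def contingency_counts(est, obs, threshold):
    """Count hits, misses, false alarms and correct negatives for one rain threshold."""
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(est, obs):
        if o >= threshold:
            if f >= threshold:
                hits += 1
            else:
                misses += 1
        elif f >= threshold:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives

def categorical_scores(hits, misses, false_alarms, correct_negatives):
    """Return the yes/no verification scores as a dictionary."""
    n = hits + misses + false_alarms + correct_negatives
    obs_yes, obs_no = hits + misses, false_alarms + correct_negatives
    est_yes, est_no = hits + false_alarms, misses + correct_negatives
    hits_random = obs_yes * est_yes / n  # assumed standard definition
    correct_random = (obs_yes * est_yes + obs_no * est_no) / n
    return {
        "BIAS": est_yes / obs_yes,
        "POD": hits / obs_yes,
        "FAR": false_alarms / est_yes,
        "TS": hits / (hits + misses + false_alarms),
        "ETS": (hits - hits_random) / (hits + misses + false_alarms - hits_random),
        "OR": (hits * correct_negatives) / (misses * false_alarms),
        "HK": hits / obs_yes - false_alarms / obs_no,
        "HSS": (hits + correct_negatives - correct_random) / (n - correct_random),
    }

# Hypothetical counts for a single rain threshold.
scores = categorical_scores(hits=50, misses=25, false_alarms=50, correct_negatives=875)
for name, value in scores.items():
    print(name, round(value, 3))
```

The sketch does not guard against empty categories (for example no observed rain), which would cause a division by zero; a real implementation would need to handle those cases.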

Categorical verification of daily satellite precipitation estimates from GPCP 1DD algorithm during summer 2000-01 over Australia

Rain threshold varies from light to heavy

North (tropics) Southeast (mid-latitudes)

Real-time verification example: 24-hr rainfall from NRL Experimental Geostationary algorithm

Real-time verification example: 24-hr rainfall from NRL Experimental blended microwave algorithm

Distributions oriented view


Advantage: Much more complete picture of forecast performance

Disadvantage: Lots of numbers

                         Estimated category
                     1      2      …      K      total
Observed      1      n11    n12    …      n1K    No1
category      2      n21    n22    …      n2K    No2
              …      …      …      …      …      …
              K      nK1    nK2    …      nKK    NoK
              total  Ne1    Ne2    …      NeK    N

Rows: OBSERVED (mm/d). Columns: PREDICTED (mm/d). Each bin is labeled by its lower bound; the bin edges are 0, 0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200 mm/d.

      |   0.0   0.1   0.2   0.5     1     2     5    10    20    50   100 | total
  0.0 |  4134   130   267   136   111    83    28    23    18     6     0 |  4936
  0.1 |   206    25    45    42    30    15     3     4     4     1     0 |   375
  0.2 |   281    17    52    36    25    29    12     6     3     3     0 |   464
  0.5 |   260     6    34    17    17    31    16    20     6     3     1 |   411
    1 |   229    13    41    28    28    61    20    26    29     4     1 |   480
    2 |   259    22    77    50    51    55    53    43    38     6     1 |   655
    5 |   182    21    59    37    48    76    66    68    80    15     0 |   652
   10 |   104    21    27    47    54   106   112   127   134    27     5 |   764
   20 |    42     6    19    13    41    96   125   158   325   127     9 |   961
   50 |     7     1     0     0     0     1     7     8    46    45    13 |   128
  100 |     0     0     0     0     0     0     0     0     0     8     1 |     9
total |  5704   262   621   406   405   553   442   483   683   245    31 |  9835

24-hr rainfall from NRL Experimental Geostationary algorithm validated against Australian operational daily rain gauge analysis on 21 Jan 2002
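A joint distribution table like the one above can be built by binning each matched pair of values, as in this sketch. The bin edges are taken from the table; the left-closed bin convention, the clamping of values at or above 200 mm/d into the top bin, the function names, and the four sample pairs are assumptions.

```python
from bisect import bisect_right

# Bin edges (mm/d) as they appear in the table above.
BIN_EDGES = [0.0, 0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0, 200.0]

def category(value, edges=BIN_EDGES):
    """Index k of the bin with edges[k] <= value < edges[k + 1] (assumed convention)."""
    k = bisect_right(edges, value) - 1
    return min(max(k, 0), len(edges) - 2)  # clamp out-of-range values

def joint_table(est, obs, edges=BIN_EDGES):
    """Counts n[observed category][estimated category]."""
    size = len(edges) - 1
    table = [[0] * size for _ in range(size)]
    for f, o in zip(est, obs):
        table[category(o, edges)][category(f, edges)] += 1
    return table

# Hypothetical matched values (mm/d).
estimated = [0.0, 0.3, 12.0, 3.0]
observed = [0.0, 0.05, 25.0, 3.0]

table = joint_table(estimated, observed)
row_totals = [sum(row) for row in table]          # No_k
column_totals = [sum(col) for col in zip(*table)]  # Ne_k
print(row_totals, column_totals)
```

The row and column totals are the marginal distributions of the observed and estimated values, respectively.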

Scatterplot

Shows: Joint distribution of estimated and observed values

NRL geo 20020121

R=0.63

Probability distribution function

Shows: Marginal distributions of estimated and observed values

[Figure: probability distribution functions of the estimate (geo) and the analysis (anal), NRL geo 20020121]

Heidke skill score (K distinct categories)

Measures: Skill of the estimation method in predicting the correct category, relative to that of random chance


$$\mathrm{HSS} = \frac{\sum_{k=1}^{K} P(F_k, O_k) - \sum_{k=1}^{K} P(F_k)\,P(O_k)}{1 - \sum_{k=1}^{K} P(F_k)\,P(O_k)}$$
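The multi-category Heidke skill score can be computed directly from a K x K table of counts, as sketched below. The table layout (rows observed, columns estimated), the function name, and the 3 x 3 sample counts are assumptions for illustration.

```python
def heidke_skill_score(table):
    """HSS for a K x K table of counts, table[observed category][estimated category]."""
    k = len(table)
    n = sum(sum(row) for row in table)
    # Sum over k of P(F_k, O_k): fraction of cases on the diagonal.
    p_correct = sum(table[i][i] for i in range(k)) / n
    # Marginal probabilities P(O_k) and P(F_k).
    p_obs = [sum(table[i]) / n for i in range(k)]
    p_est = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Sum over k of P(F_k) P(O_k): fraction correct expected by chance.
    p_chance = sum(pf * po for pf, po in zip(p_est, p_obs))
    return (p_correct - p_chance) / (1.0 - p_chance)

# Hypothetical counts for three rain categories (light, moderate, heavy).
counts = [
    [40, 10, 0],
    [10, 20, 5],
    [0, 5, 10],
]
print("HSS:", heidke_skill_score(counts))
```

With K = 2 this reduces to the yes/no Heidke skill score given earlier.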

Scale decomposition methods

Measure the correspondence between the estimates and observations at different spatial scales

Examples:

• 2D Fourier decomposition

• wavelet decomposition

• upscaling

Advantages: Scales on which largest errors occur can be isolated, can filter noisy data

Disadvantages: Less intuitive, can be mathematically tricky

Discrete wavelet transforms


Concept: Decompose fields into scales representing different detail levels. Test whether the forecast resembles the observations at each scale.

Measures, for each scale:

• % of total MSE

• linear correlation

• RMSE

• categorical verification scores

• others...

Casati and Stephenson (2002) technique

Step 1: "Recalibrate" forecast using histogram matching

$$\mathrm{error}_{total} = \mathrm{error}_{bias} + \mathrm{error}_{recalibrated}$$

Step 2: Threshold the observations and recalibrated forecast to get binary images

Step 3: Subtract to get error (difference) image

Step 4: Discrete wavelet decomposition of error to scales of resolution × 2^n

Step 5: Compute verification statistics on error field at discrete scales. Repeat for different rain thresholds.

[Figure label: Odds ratio]
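Steps 2 to 5 can be sketched on a tiny grid as follows. This is not the authors' code: it assumes a square grid whose side is a power of two, skips the Step 1 recalibration, and uses Haar-type block averaging as the discrete wavelet decomposition, with MSE as the statistic at each scale. All names and sample fields are hypothetical.

```python
def block_mean(field, size):
    """Replace every size x size block of a square field by its mean."""
    n = len(field)
    out = [[0.0] * n for _ in range(n)]
    for bi in range(0, n, size):
        for bj in range(0, n, size):
            cells = [(i, j) for i in range(bi, bi + size) for j in range(bj, bj + size)]
            mean = sum(field[i][j] for i, j in cells) / len(cells)
            for i, j in cells:
                out[i][j] = mean
    return out

def mean_square(field):
    """Mean of the squared values of a square field."""
    n = len(field)
    return sum(v * v for row in field for v in row) / (n * n)

def mse_by_scale(est, obs, threshold):
    """Return the binary error image and its MSE split by scale (in grid lengths)."""
    # Steps 2 and 3: threshold both fields and subtract the binary images.
    error = [[float(f >= threshold) - float(o >= threshold) for f, o in zip(er, orow)]
             for er, orow in zip(est, obs)]
    n = len(error)
    # Step 4: detail at each scale = smoother field minus next coarser field.
    result, smooth, size = {}, error, 1
    while size < n:
        coarser = block_mean(error, 2 * size)
        detail = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(smooth, coarser)]
        result[size] = mean_square(detail)  # Step 5: statistic at this scale
        smooth, size = coarser, 2 * size
    result[n] = mean_square(smooth)  # domain-mean component
    return error, result

# Hypothetical 4 x 4 fields (mm): the estimated rain area is displaced.
estimate = [[0, 0, 0, 0], [0, 5, 5, 0], [0, 5, 5, 0], [0, 0, 0, 0]]
observed = [[5, 5, 0, 0], [5, 5, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

error, scales = mse_by_scale(estimate, observed, threshold=1.0)
print(scales, "total:", mean_square(error))
```

Because the scale components are orthogonal, the per-scale MSE values add up to the total MSE of the binary error image, which is what allows each scale to be reported as a percentage of the total.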

Multiscale statistical organization

Zepeda-Arce et al. (J. Geophys. Res., 2000)

Concept: Observed precipitation patterns have multi-scale spatial and spatio-temporal organization. Test whether the satellite estimate reproduces this organization.

Method: Start with fine scale, average to coarser scale


Measures:

• TS vs. scale

• depth vs. area

• spatial scaling parameter

• dynamic scaling exponent

[Figure panels: threat score vs. scale (km); depth (mm) vs. area (km2); std. dev. vs. scale (km); each comparing obs and fcst]
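The "start with fine scale, average to coarser scale" method and the TS vs. scale measure can be sketched as below. The block-averaging scheme, the fixed rain threshold, the function names, and the sample fields are assumptions; the other measures listed above are not covered.

```python
def upscale(field, factor):
    """Average non-overlapping factor x factor blocks of a 2D field."""
    n_rows, n_cols = len(field) // factor, len(field[0]) // factor
    return [
        [
            sum(field[r * factor + i][c * factor + j]
                for i in range(factor) for j in range(factor)) / factor ** 2
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]

def threat_score(est, obs, threshold):
    """TS = hits / (hits + misses + false alarms) for 2D fields."""
    hits = misses = false_alarms = 0
    for est_row, obs_row in zip(est, obs):
        for f, o in zip(est_row, obs_row):
            if f >= threshold and o >= threshold:
                hits += 1
            elif o >= threshold:
                misses += 1
            elif f >= threshold:
                false_alarms += 1
    total = hits + misses + false_alarms
    return hits / total if total else float("nan")

# Hypothetical 4 x 4 fields (mm): one rain cell, displaced by one box in the estimate.
observed = [[8, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
estimate = [[0, 0, 0, 0], [0, 8, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

ts_by_scale = {
    factor: threat_score(upscale(estimate, factor), upscale(observed, factor), 1.0)
    for factor in (1, 2)
}
print(ts_by_scale)
```

In this sample the small displacement gives no skill at the original grid scale but a perfect score once both fields are averaged to twice the grid length, which is the kind of scale dependence the method is meant to expose.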

Upscaling verification of IR power law rainrate

16 September 2002, Melbourne

[Figure: IR and radar rain rate panels (mm hr-1)]

GMSRA validated against rain gauge analyses at different spatial scales

(Ba and Gruber, 2001)

Entity-based methods

Use pattern matching to associate forecast and observed entities ("blobs"). Verify the properties of the entities.

Examples:

• CRA (contiguous rain area) verification


Advantages: Intuitive, quantifies "eyeball" verification

Disadvantage: May fail if forecast does not sufficiently resemble observations

CRA (entity) verification

Ebert and McBride (J. Hydrology, Dec 2000)

Concept: Verify the properties of the forecast (estimated) entities against observed entities

Method: Pattern matching to determine location error, error decomposition, event verification


Measures:

• location error

• size error

• error in mean, max values

• pattern error

Determine the location error using pattern matching:

• Horizontally translate the estimated blob until the total squared error between the estimate and the observations is minimized in the shaded region. Other possibilities: maximum correlation, maximum overlap

• The displacement is the vector difference between the original and final locations of the estimate.
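The translation search can be sketched as a brute-force loop over candidate shifts. This is only an illustration: it minimizes the total squared error over the whole grid rather than over a CRA region, treats values shifted in from outside the grid as zero, and searches whole grid boxes only. The function names and sample fields are hypothetical.

```python
def squared_error(est, obs, dx, dy):
    """Total squared error after shifting est by dx columns and dy rows."""
    rows, cols = len(obs), len(obs[0])
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            si, sj = i - dy, j - dx  # source box that lands on (i, j)
            inside = 0 <= si < rows and 0 <= sj < cols
            f = est[si][sj] if inside else 0.0  # assumed zero outside the grid
            total += (f - obs[i][j]) ** 2
    return total

def best_shift(est, obs, max_shift):
    """Displacement (dx, dy), in grid boxes, that minimizes the squared error."""
    candidates = [
        (dx, dy)
        for dy in range(-max_shift, max_shift + 1)
        for dx in range(-max_shift, max_shift + 1)
    ]
    return min(candidates, key=lambda s: squared_error(est, obs, s[0], s[1]))

# Hypothetical 5 x 5 fields (mm): same blob, displaced in the estimate.
observed = [
    [0, 0, 0, 0, 0],
    [0, 0, 4, 4, 0],
    [0, 0, 4, 4, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
estimate = [
    [4, 4, 0, 0, 0],
    [4, 4, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

dx, dy = best_shift(estimate, observed, max_shift=2)
print("Shift estimate by", dx, "columns and", dy, "rows")
```

Swapping the squared-error criterion for correlation or overlap, the other possibilities mentioned above, only requires changing the function passed as the key and using max instead of min.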

Observed Estimated

CRA error decomposition

The total mean squared error (MSE) can be written as:

$$\mathrm{MSE}_{total} = \mathrm{MSE}_{displacement} + \mathrm{MSE}_{volume} + \mathrm{MSE}_{pattern}$$

The difference between the mean square error before and after translation is the contribution to total error due to displacement,

$$\mathrm{MSE}_{displacement} = \mathrm{MSE}_{total} - \mathrm{MSE}_{shifted}$$

The error component due to volume represents the bias in mean intensity,

$$\mathrm{MSE}_{volume} = (\bar{F} - \bar{X})^2$$

where $\bar{F}$ and $\bar{X}$ are the CRA mean estimated and observed values after the shift.

The pattern error accounts for differences in the fine structure of the estimated and observed fields,

$$\mathrm{MSE}_{pattern} = \mathrm{MSE}_{shifted} - \mathrm{MSE}_{volume}$$
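The decomposition can be checked numerically with a short sketch. It assumes the estimate has already been shifted to its best-match position and that all three fields are given as flat lists over the same CRA grid boxes; the function names and sample values are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def mse(a, b):
    """Mean squared difference between two equally long lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def cra_decomposition(est, est_shifted, obs):
    """Split the total MSE into displacement, volume and pattern components."""
    mse_total = mse(est, obs)
    mse_shifted = mse(est_shifted, obs)
    mse_displacement = mse_total - mse_shifted
    mse_volume = (mean(est_shifted) - mean(obs)) ** 2
    mse_pattern = mse_shifted - mse_volume
    return {
        "total": mse_total,
        "displacement": mse_displacement,
        "volume": mse_volume,
        "pattern": mse_pattern,
    }

# Hypothetical values (mm) over four CRA grid boxes.
observed = [0.0, 0.0, 4.0, 8.0]
estimate = [6.0, 10.0, 0.0, 0.0]          # rain in the wrong place
estimate_shifted = [0.0, 0.0, 6.0, 10.0]  # after translation

components = cra_decomposition(estimate, estimate_shifted, observed)
print(components)
```

In this sample nearly all of the error comes from displacement, with small and equal contributions from volume and pattern.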

24-hr rainfall from NRL Experimental Geostationary algorithm validated against Australian operational daily rain gauge analysis

Diagnosis of systematic errors

Displacement (km)

NRL Experimental Geostationary algorithm

289 CRAs

April 2001-March 2002

Diagnosis of systematic errors

[Figure axes: Estimate, Analyzed]

NRL Experimental Geostationary algorithm

289 CRAs

April 2001-March 2002

Tropical Rain Potential (TRaP) verification?

TRaP 24 h rain from 20001208_16

Which methods verify which attributes?

Methods: Visual ("eyeball"), Continuous statistics, Categorical statistics, Joint distribution, Scale decomposition, Entity-based

Attributes: Location, Size, Shape, Mean value, Maximum value, Spatial variability

Conclusions

• The most effective diagnostic verification method is still visual ("eyeball") verification.

• Categorical statistics based on yes-no discrimination are probably the least informative of all of the verification methods, although they remain very useful for quantitative algorithm intercomparison.

• The newer diagnostic verification methods (scale decomposition, entity-based) give a more complete and informative diagnosis of algorithm performance.

• Need methods to deal with observational uncertainty

__________________

UNDER CONSTRUCTION

__________________

http://www.bom.gov.au/bmrc/wefor/staff/eee/verif/verif_web_page.shtml
