
Page 1: Fuzzy Verification toolbox: definitions and results

Federal Department of Home Affairs FDHA
Federal Office of Meteorology and Climatology MeteoSwiss

Fuzzy Verification toolbox: definitions and results

Felix Ament, MeteoSwiss, Switzerland

Page 2: Fuzzy Verification toolbox: definitions and results

Motivation for new scores

Which rain forecast would you rather use?

[Figure: observed 24h rain near Sydney on 21 Mar 2004, compared with a mesoscale model forecast (5 km, RMS=13.0) and a global model forecast (100 km, RMS=4.6). The mesoscale forecast looks more realistic, yet has the larger RMS.]

Page 3: Fuzzy Verification toolbox: definitions and results

Fine-scale verification: Fuzzy Methods

[Schematic: observation and forecast event points on a grid; statistics are evaluated over boxes whose size (scale) and intensity threshold are varied.]

"… do not evaluate a point by point match!"

General Recipe:

• (Choose a threshold to define event and non-event)

• Define the scales of interest

• Consider statistics at these scales for verification

The resulting score depends on spatial scale and intensity.
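To make the recipe concrete, the sketch below (not the toolbox code; the function names and synthetic fields are our own illustration) averages two fields over boxes of a chosen scale, defines events by a threshold, and reports how often forecast and observed events agree at that scale.

```python
import numpy as np

def box_average(field, scale):
    """Average a 2-D field over non-overlapping scale x scale boxes."""
    ny, nx = field.shape
    ny, nx = ny - ny % scale, nx - nx % scale       # trim to a multiple of scale
    boxes = field[:ny, :nx].reshape(ny // scale, scale, nx // scale, scale)
    return boxes.mean(axis=(1, 3))

def fraction_correct(obs, fcst, threshold, scale):
    """Upscale both fields, define events by a threshold, score the agreement."""
    o = box_average(obs, scale) >= threshold
    f = box_average(fcst, scale) >= threshold
    return float(np.mean(o == f))

# A displaced forecast tends to score better as the scale coarsens.
rng = np.random.default_rng(0)
obs = rng.gamma(2.0, 1.0, size=(64, 64))            # synthetic "rain" field
fcst = np.roll(obs, 4, axis=1)                      # same field, shifted east
for scale in (1, 2, 4, 8, 16):
    print(scale, fraction_correct(obs, fcst, threshold=3.0, scale=scale))
```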

Page 4: Fuzzy Verification toolbox: definitions and results


A Fuzzy Verification Toolbox

Ebert, E.E., 2007: Fuzzy verification of high-resolution gridded forecasts: A review and proposed framework. Meteorol. Appl., submitted.

Toolbox available at http://www.bom.gov.au/bmrc/wefor/staff/eee/fuzzy_verification.zip

Fuzzy method | Decision model for useful forecast
Upscaling (Zepeda-Arce et al. 2000; Weygandt et al. 2004) | Resembles obs when averaged to coarser scales
Anywhere in window (Damrath 2004), 50% coverage | Predicts event over minimum fraction of region
Fuzzy logic (Damrath 2004), Joint probability (Ebert 2002) | More correct than incorrect
Multi-event contingency table (Atger 2001) | Predicts at least one event close to observed event
Intensity-scale (Casati et al. 2004) | Lower error than random arrangement of obs
Fractions skill score (Roberts and Lean 2005) | Similar frequency of forecast and observed events
Practically perfect hindcast (Brooks et al. 1998) | Resembles forecast based on perfect knowledge of observations
Pragmatic (Theis et al. 2005) | Can distinguish events and non-events
CSRR (Germann and Zawadzki 2004) | High probability of matching observed value
Area-related RMSE (Rezacova et al. 2005) | Similar intensity distribution as observed
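As one worked example from the table, here is a minimal sketch of the fractions skill score of Roberts and Lean (2005), assuming a SciPy-style moving-average filter for the neighbourhood fractions (an illustration, not the toolbox implementation):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fss(obs, fcst, threshold, scale):
    """Fractions skill score: 1 = perfect agreement of event fractions,
    0 = no overlap between forecast and observed event fractions."""
    p_obs = uniform_filter((obs >= threshold).astype(float), size=scale)
    p_fcst = uniform_filter((fcst >= threshold).astype(float), size=scale)
    mse = np.mean((p_fcst - p_obs) ** 2)
    mse_ref = np.mean(p_fcst ** 2) + np.mean(p_obs ** 2)  # largest possible MSE
    return 1.0 - mse / mse_ref if mse_ref > 0 else np.nan
```

In line with its decision model above, FSS rewards a similar frequency of forecast and observed events in each neighbourhood rather than a point-by-point match.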

Page 5: Fuzzy Verification toolbox: definitions and results

Applying fuzzy scores

Fuzzy scores provide a wealth of information, but

• the results can appear contradictory

• their interpretation is sometimes difficult

• they contain too many numbers

[Figure: example score matrices, colour-coded from poor to good.]

Page 6: Fuzzy Verification toolbox: definitions and results

Application versus testbed

Application: the scores are known, but the forecast error is unknown.

Testbed: the forecast error is known, but the scores are unknown.

Page 7: Fuzzy Verification toolbox: definitions and results

A Fuzzy Verification testbed

[Schematic: the testbed pipeline, with an example perturbed field.]

A Perturbation Generator turns a virtual truth (radar data, model data, or a synthetic field) into realizations of virtual erroneous model forecasts. The Fuzzy Verification Toolbox turns these into realizations of verification results, which the Analyzer assesses for

• sensitivity (mean)

• [reliability (STD)]

Two ingredients:

1. Reference fields: hourly radar-derived rain fields, August 2005 flood event, 19 time stamps (Frei et al., 2005)

2. Perturbations: next slide
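A minimal sketch of the testbed loop under these assumptions (the function names and the RMSE stand-in score are hypothetical; the real testbed used the 19 radar reference fields and the toolbox scores):

```python
import numpy as np

def run_testbed(truth_fields, perturb, score):
    """Score a known perturbation against each virtual truth and summarize
    the verification results: sensitivity (mean) and reliability (STD)."""
    results = np.array([score(truth, perturb(truth)) for truth in truth_fields])
    return results.mean(), results.std()

# usage with synthetic truths, a 10-point shift, and RMSE as a stand-in score
rng = np.random.default_rng(1)
truths = [rng.gamma(2.0, 1.0, size=(64, 64)) for _ in range(19)]  # 19 time stamps
xshift = lambda field: np.roll(field, 10, axis=1)
rmse = lambda obs, fcst: float(np.sqrt(np.mean((fcst - obs) ** 2)))
print(run_testbed(truths, xshift, rmse))
```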

Page 8: Fuzzy Verification toolbox: definitions and results

Perturbations

Perturbation | Type of forecast error | Algorithm
PERFECT | No error, a perfect forecast! | -
XSHIFT | Horizontal translation | Horizontal translation (10 grid points)
BROWNIAN | No small-scale skill | Random exchange of neighboring points (Brownian motion)
LS_NOISE | Wrong large-scale forcing | Multiplication with a disturbance factor generated by large-scale 2-D Gaussian kernels
SMOOTH | High horizontal diffusion (or coarse-scale model) | Moving-window arithmetic average
DRIZZLE | Overestimation of low-intensity precipitation | Moving-window filter setting each point below the window average to the mean value
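A sketch of how such perturbation generators can be written (parameter values are illustrative; the table above only summarizes the actual algorithms):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, uniform_filter

def xshift(field, n=10):
    """XSHIFT: horizontal translation by n grid points (wraps at the edge)."""
    return np.roll(field, n, axis=1)

def brownian(field, n_swaps=50000, rng=None):
    """BROWNIAN: random exchange of neighboring points destroys small-scale skill."""
    rng = rng or np.random.default_rng()
    out = field.copy()
    ny, nx = out.shape
    for _ in range(n_swaps):
        j, i = int(rng.integers(ny - 1)), int(rng.integers(nx - 1))
        dj, di = (0, 1) if rng.random() < 0.5 else (1, 0)   # pick a neighbor
        out[j, i], out[j + dj, i + di] = out[j + dj, i + di], out[j, i]
    return out

def ls_noise(field, sigma=20.0, amp=0.5, rng=None):
    """LS_NOISE: multiply by a disturbance factor built from white noise
    smoothed with a large-scale 2-D Gaussian kernel."""
    rng = rng or np.random.default_rng()
    noise = gaussian_filter(rng.standard_normal(field.shape), sigma=sigma)
    noise *= amp / max(noise.std(), 1e-12)       # normalize disturbance amplitude
    return np.maximum(field * (1.0 + noise), 0.0)  # keep precipitation non-negative

def smooth(field, size=5):
    """SMOOTH: moving-window arithmetic average (high horizontal diffusion)."""
    return uniform_filter(field, size=size)

def drizzle(field, size=5):
    """DRIZZLE: raise each point below the local window mean to that mean,
    overestimating low-intensity precipitation."""
    return np.maximum(field, uniform_filter(field, size=size))
```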

Page 9: Fuzzy Verification toolbox: definitions and results

Perfect forecast

All scores should take their perfect value!

• But, in fact, 5 out of 12 do not!

Page 10: Fuzzy Verification toolbox: definitions and results

Effect of "Leaking" Scores

Problem: some methods assume no skill at scales below the window size!

An example: the joint probability method. [Schematic: observation and forecast fields in one evaluation window, each covering half of the points: pobs = 0.5, pforecast = 0.5.]

Assuming random ordering within the window, the joint probabilities become:

             obs: yes   obs: no
fcst: yes       0.25      0.25
fcst: no        0.25      0.25

so even a perfect forecast gets a Brier score BS = 0.25 + 0.25 = 0.5 instead of 0: not perfect!
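The arithmetic behind this example as a tiny sketch: under the random-ordering assumption the binary Brier score equals the mismatch probability, so even a perfectly matched window is penalized.

```python
# Window coverages from the example above: a perfect forecast, p_obs = p_fcst.
p_obs, p_fcst = 0.5, 0.5
# Assuming random ordering within the window, forecast and observation are
# treated as independent, and the Brier score is the mismatch probability:
bs = p_fcst * (1 - p_obs) + (1 - p_fcst) * p_obs
print(bs)  # 0.5 rather than the perfect value 0.0
```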

Page 11: Fuzzy Verification toolbox: definitions and results

Expected response to perturbations

[Table: expected sensitivity of each score to XSHIFT, BROWNIAN, LS_NOISE, SMOOTH, and DRIZZLE; a response is expected (= 0.0) for some score/perturbation pairs and not expected (= 1.0) for others.]

Contrast := mean(sensitivity where a response is expected) - mean(sensitivity where none is expected)

Summary in terms of contrast:

[Figure: contrast, from low to high, as a function of intensity and of spatial scale (coarse to fine).]

Page 12: Fuzzy Verification toolbox: definitions and results

Summary real

[Figure: contrast and STD of every score (Upscaling, Anywhere in window, 50% coverage, Fuzzy logic, Joint probability, Multi-event contingency table, Intensity scale, Fractions skill score, Pragmatic approach, Practically perfect hindcast, CSRR, Area-related RMSE) for the perturbations XSHIFT, BROWNIAN, LS_NOISE, SMOOTH, and DRIZZLE; higher contrast and lower STD are good; leaking scores are marked.]

• Leaking scores show an overall poor performance

• "Intensity scale" and "Practically perfect hindcast" perform well in general, but …

• Many scores have problems detecting large-scale noise (LS_NOISE); "Upscaling" and "50% coverage" are beneficial in this respect

Page 13: Fuzzy Verification toolbox: definitions and results

Spatial detection versus filtering

Horizontal translation (XSHIFT) with variable displacement Δx

[Figure: score responses for Δx = 25 km, Δx = 10 km, and Δx = 5 km.]

• The "Intensity scale" method can detect the spatial scale of the perturbation

• All other methods, like the "Fractions skill score", just filter out small-scale errors

Page 14: Fuzzy Verification toolbox: definitions and results

Redundancy of scores

Correlation (%) between all scores, over all thresholds and window sizes, averaged over all types of perturbation:

Groups of scores:

• UP, YN, MC, FB, PP

• FZ, JP

• FB, PP, (IS)
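A sketch of this redundancy computation (the data layout and names are our assumption: one result vector per score and perturbation type, spanning all threshold/window settings):

```python
import numpy as np

def score_correlations(scores_by_pert):
    """Correlation (%) between scores for each perturbation type, then
    averaged over perturbation types. scores_by_pert[pert][score] is a
    vector of results over all (threshold, window) settings."""
    perts = list(scores_by_pert)
    names = sorted(scores_by_pert[perts[0]])        # common score names
    mats = [np.corrcoef(np.stack([scores_by_pert[p][n] for n in names]))
            for p in perts]
    return names, 100.0 * np.mean(mats, axis=0)
```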

Page 15: Fuzzy Verification toolbox: definitions and results

August 2005 flood event

Precipitation sum 18.8.-23.8.2005:

[Figure: four precipitation-sum maps, with area means of 106.2 mm, 43.2 mm, 73.1 mm, and 62.8 mm.]

(Hourly radar data calibrated using rain gauges; Frei et al., 2005)

Page 16: Fuzzy Verification toolbox: definitions and results

Fuzzy Verification of the August 2005 flood

Based on 3-hourly accumulations during the flood period (18.8.-23.8.2005)

[Figure: score matrices for COSMO-7 and COSMO-2 as a function of intensity threshold (mm/3h) and scale (in 7 km grid points), colour-coded from bad to good.]

Page 17: Fuzzy Verification toolbox: definitions and results

Fuzzy Verification of the August 2005 flood

Difference of Fuzzy Scores

[Figure: score differences as a function of intensity threshold (mm/3h) and scale (in 7 km grid points), showing where COSMO-7 is better, where COSMO-2 is better, and where the result is neutral.]

Page 18: Fuzzy Verification toolbox: definitions and results

D-PHASE: Demonstration of Probabilistic Hydrological and Atmospheric Simulation of flood Events in the Alpine region

• The operational phase (June until November 2007) is running

• 33 atmospheric models take part …

• … and store their output in a common format in one data archive

Standard verification: see poster.

Let's apply the fuzzy toolbox:

• Models: COSMO-2, -7, -DE, -EU

• Period: August 2007

• Lead times: most recent forecast, starting at forecast hour +03

• Observations: Swiss radar data aggregated onto each model grid

• To be verified: 3h accumulation of precipitation

Page 19: Fuzzy Verification toolbox: definitions and results

D-PHASE: August 2007

Intensity Scale score (preliminary), 3h accumulation

[Figure: score matrices for COSMO-7, COSMO-2, COSMO-DE, and COSMO-EU.]

Page 20: Fuzzy Verification toolbox: definitions and results

Conclusions

• Fuzzy verification scores are a promising framework for the verification of high-resolution precipitation forecasts.

• The testbed is a useful tool to evaluate the wealth of scores (not necessarily fuzzy ones):

• Not all scores indicate a perfect forecast with perfect score values (leaking scores).

• The "intensity scale" method is able to detect the specific scale of a spatial error.

• MeteoSwiss goes for the Upscaling, Intensity scale, and Fractions skill score (and Practically perfect hindcast) methods.

• The first long-term application, for D-PHASE, has just started.

Page 21: Fuzzy Verification toolbox: definitions and results

Summary ideal

[Backup figure: same layout as "Summary real": contrast and STD of every score (Upscaling, Anywhere in window, 50% coverage, Fuzzy logic, Joint probability, Multi-event contingency table, Intensity scale, Fractions skill score, Pragmatic approach, Practically perfect hindcast, CSRR, Area-related RMSE) for the perturbations XSHIFT, BROWNIAN, LS_NOISE, SMOOTH, and DRIZZLE; leaking scores marked.]

Page 22: Fuzzy Verification toolbox: definitions and results


D-PHASE: August 2007

Page 23: Fuzzy Verification toolbox: definitions and results


D-PHASE: August 2007

Page 24: Fuzzy Verification toolbox: definitions and results


D-PHASE: August 2007 – cosmoch7

Page 25: Fuzzy Verification toolbox: definitions and results


D-PHASE: August 2007 – cosmoch2

Page 26: Fuzzy Verification toolbox: definitions and results


D-PHASE: August 2007 - LME

Page 27: Fuzzy Verification toolbox: definitions and results


D-PHASE: August 2007 - LMK

Page 28: Fuzzy Verification toolbox: definitions and results

August 2005 flood event

Precipitation sum 18.8.-23.8.2005:

[Backup figure: four precipitation-sum maps, with area means of 73.1 mm, 62.8 mm, 106.2 mm, and 43.2 mm.]

Page 29: Fuzzy Verification toolbox: definitions and results


August 2005 flood event

Fuzzy Verification (hourly accumulations):

[Figure: score matrices for COSMO-7 and COSMO-2.]

Page 30: Fuzzy Verification toolbox: definitions and results

August 2005 flood event

Fuzzy Verification, COSMO-2 – COSMO-7:

• Surprisingly small differences

• However, COSMO-2 seems to be slightly better at …