
Bill Kuo (1), Louisa Nance (1), Barb Brown (1) and Zoltan Toth (2)

Developmental Testbed Center

(1) National Center for Atmospheric Research
(2) Earth System Research Laboratory

Highlights of DTC Model Testing and Evaluation Results*

*Contribution from all DTC Staff

Objectives of the DTC*

Advance NWP research by providing the research community an environment that is functionally similar to that used in operations to test and evaluate the components of the NWP systems supported by the DTC;

Reduce the average time required to implement promising codes emerging from the research community by performing initial extensive testing to demonstrate the potential of new science and technologies for possible use in operations;

Sustain scientific interoperability of the community modeling system;

Manage and support the common baseline of end-to-end community software to users, including dynamic cores, physics and data assimilation codes, pre- and post-processors and codes that support ensemble forecasting systems; and

Establish, maintain and support a community statistical verification system for use by the broad NWP community.

*DTC is jointly sponsored by NOAA, Air Force, NSF, & NCAR


Model Evaluation Tools (MET)

State-of-the-art tools
  Traditional and advanced (e.g., spatial)
  Database and display system

Supported for the community
  Tutorials, email help, etc.

Ensemble tools
  Brier score + decompositions
  ROC, Reliability
  Ensemble quantiles
  Rank histogram
  CRPS (continuous ranked probability score)

See Verification Methods session Friday, 10:30 and P57 - Fowler

http://www.dtcenter.org/met/users/
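The ensemble measures listed on this slide can be illustrated with a minimal Python sketch. This is not MET's implementation; the function names and the toy inputs are hypothetical, and ties between an observation and an ensemble member are not randomized as an operational tool might do.

```python
from bisect import bisect_left


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def rank_histogram(ensembles, observations):
    """Count how often the observation lands at each rank among the sorted members.

    Rank r means r members are strictly below the observation, so an
    ensemble of M members yields M + 1 bins.
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = bisect_left(sorted(members), obs)
        counts[rank] += 1
    return counts


def crps_ensemble(members, obs):
    """Empirical CRPS for one case: E|X - y| - 0.5 * E|X - X'|."""
    m = len(members)
    spread_to_obs = sum(abs(x - obs) for x in members) / m
    member_spread = sum(abs(x - y) for x in members for y in members) / (2 * m * m)
    return spread_to_obs - member_spread


if __name__ == "__main__":
    # Hypothetical toy data: two cases, three-member ensemble.
    print(brier_score([0.5, 0.5], [1, 0]))
    print(rank_histogram([[1, 2, 3], [1, 2, 3]], [0.5, 2.5]))
    print(crps_ensemble([0.0, 2.0], 1.0))
```

A flat rank histogram suggests the observations are statistically indistinguishable from the members, while lower Brier score and CRPS values indicate better probabilistic forecasts.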


DTC Test & Evaluation Activities

Mesoscale Modeling (WRF)
  Assess performance of select configurations for new WRF releases
  Test physics in a functionally similar operational environment (P58 - Wolff, P59 - Harrold)
  Test SREF member configurations
  QPF verification of high resolution models
  Assess performance of microphysics schemes (HMT)

Hurricanes (HWRF)
  Test HWRF configured from WRF repository for use at EMC (2.2 – Bernardet, P84 - Bao)
  Test HWRF physics options & assess their impact on rapid intensification (10.7 - Biswas)
  Perform diagnostic studies to examine the strengths & weaknesses of HWRF (HFIP)

Data Assimilation
  Test GSI baseline (comparison w/ WRF-Var) (9.6 - Shao)
  Test regional EnKF systems (P8 – Newman)

Ensembles
  Test bias-correction & down-scaling schemes for SREF
  Verification of storm-scale ensemble systems for severe weather & QPF (HWT)
  Demonstration of real-time QPF verification for mesoscale ensemble system (HMT) (P55 – Jensen, P56 – Tollerud)

Data Assimilation/Ensembles/Hurricanes
  Assess the impact of GSI-hybrid DA on HWRF forecasts (HFIP) (P7 - Zhou)


WRF Innovation T&E

Inter-comparison T&E allows for a quantitative assessment of forecast performance between an operational baseline and a community contributed scheme

P59 - Harrold

[Figure panels: QNSE vs AFWA OC; RRTMG vs AFWA OC]

WRF Innovation T&E

Inter-comparison T&E allows for a quantitative assessment of forecast performance between:
  an operational baseline and a community contributed scheme
  two different versions of WRF using the same physics scheme

[Figure panel: AFWA OC, V3.3.1 vs V3.1.1]
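One common way to make such an inter-comparison quantitative is to difference the two configurations case by case and put a confidence interval on the mean difference. The slides do not say which statistical method was used, so the following is only a sketch under that assumption, using a percentile bootstrap; the function name, the per-case scores and the resampling settings are all hypothetical.

```python
import random
import statistics


def paired_bootstrap_ci(scores_a, scores_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean paired difference.

    scores_a and scores_b hold one score per case (e.g. an error measure)
    for the same cases run with two configurations.
    Returns (mean_difference, lower_bound, upper_bound).
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = []
    for _ in range(n_resamples):
        sample = rng.choices(diffs, k=len(diffs))  # resample cases with replacement
        means.append(statistics.fmean(sample))
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(diffs), lower, upper


if __name__ == "__main__":
    # Hypothetical per-case errors for a baseline and a candidate scheme.
    baseline = [2.1, 1.8, 2.5, 2.0, 2.3, 1.9, 2.2, 2.4]
    candidate = [1.9, 1.7, 2.2, 1.9, 2.0, 1.8, 2.0, 2.1]
    mean_diff, lower, upper = paired_bootstrap_ci(baseline, candidate)
    # If the interval excludes zero, the difference is treated as significant
    # at the chosen level.
    print(mean_diff, lower, upper)
```

Pairing by case removes the case-to-case variability that both configurations share, which is why differencing is preferred over comparing two separate averages.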

Comparison of V3.1.1 and V3.3.1

From: Wei Wang and Ming Chen

WRF Member Testing for NCEP’s SREF

New membership: NMMB(7), NMME(7) & ARW(7)

Transition from 32/35 km to 16/17 km

Tested performance of 5 WRF configurations for ~50 cases distributed over a year

Candidate configuration: NMM-GFS replaced w/ NMM-NCAR

Cursory timing tests – ARW adaptive time step


Physics Parameterization  ARW-NCAR        ARW-RR          ARW-NAM         NMM-NAM         NMM-GFS
Microphysics              WSM3            Thompson        Ferrier         Ferrier         Ferrier
Surface Layer             M-O Similarity  Eta Similarity  Eta Similarity  Eta Similarity  GFS
PBL                       YSU             MYJ             MYJ             MYJ             GFS
Convection                Kain-Fritsch    Grell-3D        BMJ             BMJ             SAS
LSM                       Noah            RUC             Noah            Noah            Noah
Radiation                 RRTM/Dudhia     RRTM/Goddard    GFDL/GFDL       GFDL/GFDL       GFDL/GFDL

Mesoscale Model Evaluation Testbed (MMET) – P58 – Jamie Wolff et al.

Outcome of NWP Workshop on Model Physics with an Emphasis on Short-Range Weather Prediction, held at EMC 26-28 July 2011

Mechanism to assist research community with initial stage of testing and allow for efficient demonstration of merits of a new development

Common framework for testing; allow for direct comparisons between different techniques

Model input and observational datasets provided to utilize for testing

Baseline results for select operational models established by the DTC

Hosted by the DTC; served through Repository for Archiving, Managing and Accessing Diverse DAta (RAMADDA)

http://dtcenter.org/repository

MMET Cases

Initial solicitation of cases from DTC Science Advisory Board Members and Physics Workshop Participants – great response and enthusiasm towards endeavor

Target cases during initial year:

20090228 – Mid-Atlantic snow storm where North American Mesoscale (NAM) model produced high QPF shifted too far north

20090311 – High dew point predictions by NAM over the upper Midwest and in areas of snow

20091007 – High-Resolution Window (HIRESW) runs underperformed compared to coarser NAM model

20091217 – “Snowapocalypse ‘09”: NAM produced high QPF over mid-Atlantic, lack of cessation of precipitation associated with decreasing cloud top over eastern North Carolina

20100428-0504 – Historic Tennessee flooding associated with an atmospheric river event

20110404 – Record-breaking severe report day

20110518-26 – Extended period of severe weather outbreak covering much of the Midwest and into the eastern states later in the period

20111128 – Cutoff low over SW US; NAM had difficulties throughout the winter breaking down cutoff lows and progressing them eastward

20120203-05 – Snow storm over Colorado, Nebraska, etc.; NAM predicted too little precipitation in the warm sector and too much snow north of front (persistent bias)


Research System: ESRL/GSD and HMT Ensemble Modeling System

WRF model 9-member ensemble; ARW and NMM cores

Outer domain 9km; Nested domain 3 km

Hybrid members: Multi physics packages, two model cores, and different GFS initial conditions

Outer domain runs to 5 day lead time; Nest to 12 hr; DTC evaluated first 72 hours

Comparisons made with current operational systems (GFS, SREF, NAM, HRRR, etc)

Evaluation focus on QPF, with addition of state variables in 2011-2012

HMT-West typically runs December – March; DTC has evaluated approximately 3.5 months of data for past 3 seasons

Innovations from HMT-West

See P55 - Jensen et al.

Model Comparisons from 2010-2011 HMT-West

[Figure: Gilbert Skill Score (or ETS) vs. forecast lead time (6 to 72 hours) for HRRR (3 km), HMT-Ens Mean (9 km), NMM-B parallel (4 km), NAM (12 km) and GFS (0.5 deg)]

6hr Accum Precip > 1” – Meso- and fine-scale models tended to have higher median Gilbert Skill Scores over GFS for extreme precipitation events. Differences appear statistically significant at hours 18-30 and 66
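For readers unfamiliar with the score plotted here, the Gilbert Skill Score (Equitable Threat Score) can be computed from a 2x2 contingency table of threshold exceedances. The sketch below uses the standard definition; the function names and the sample precipitation values are hypothetical, and the 25.4 mm threshold simply restates 1 inch.

```python
def contingency_counts(forecast, observed, threshold):
    """Return (hits, misses, false_alarms, correct_negatives) for exceedance of threshold."""
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecast, observed):
        f_yes, o_yes = f > threshold, o > threshold
        if f_yes and o_yes:
            hits += 1
        elif o_yes:
            misses += 1
        elif f_yes:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def gilbert_skill_score(hits, misses, false_alarms, correct_negatives):
    """GSS = (hits - hits_random) / (hits + misses + false_alarms - hits_random)."""
    total = hits + misses + false_alarms + correct_negatives
    # Hits expected by chance given the forecast and observed event frequencies.
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denom = hits + misses + false_alarms - hits_random
    if denom == 0:
        return float("nan")  # undefined, e.g. when the event never occurs or is never forecast
    return (hits - hits_random) / denom


if __name__ == "__main__":
    # Hypothetical 6 h accumulations in mm at four points; 1 inch = 25.4 mm.
    fcst = [30.0, 10.0, 28.0, 5.0]
    obs = [27.0, 26.0, 3.0, 1.0]
    counts = contingency_counts(fcst, obs, 25.4)
    print(counts, gilbert_skill_score(*counts))  # (1, 1, 1, 1) 0.0
```

A score of 1 is a perfect forecast and 0 indicates no skill beyond chance, which is why rare events such as accumulations above 1 inch tend to produce low and noisy values.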

Including Parallel Runs in Testbed Evaluations: DTC testing of NMM-B parallel runs provided additional confidence (beyond EMC routine pre-implementation testing) and helped push forward an Oct. 2011 implementation of NMM-B core.


Beyond higher resolution: Different initialization sources (AFWA) and methods (HMT and AFWA) may prove useful for the next-generation ensemble system. Select innovations from HMT-West will be tested by DTC during the coming year.

Model Comparisons from HMT-West 2012

Gilbert Skill Score – Ability to forecast given amount

6hr Accum Precip > 1” - All scores are low, partially due to sample size, but SREF (32km) shows very little skill whereas HMT & AFWA (3 & 4km) ensembles can score as high as 0.3

Area Under ROC – Ability to discriminate between event/non-event

Prob(6hr Accum Precip) > 1” - All scores are low at 6hr lead time – There are differences in the median AFWA and SREF values at 12 hr leads that may be significant
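The area under the ROC curve can be sketched as follows: turn each ensemble into an exceedance probability, sweep a decision threshold over those probabilities, and integrate hit rate against false alarm rate. This is an illustrative sketch rather than the procedure used for the plots; the function names and toy data are hypothetical, and it assumes the sample contains at least one event and one non-event.

```python
def exceedance_probability(members, threshold):
    """Fraction of ensemble members exceeding the threshold."""
    return sum(1 for x in members if x > threshold) / len(members)


def roc_points(probs, outcomes):
    """(false alarm rate, hit rate) pairs from sweeping the probability threshold downward."""
    n_event = sum(outcomes)
    n_non_event = len(outcomes) - n_event
    points = [(0.0, 0.0)]
    for t in sorted(set(probs), reverse=True):
        hits = sum(1 for p, o in zip(probs, outcomes) if p >= t and o == 1)
        false_alarms = sum(1 for p, o in zip(probs, outcomes) if p >= t and o == 0)
        points.append((false_alarms / n_non_event, hits / n_event))
    return points  # the lowest threshold always yields the point (1.0, 1.0)


def roc_area(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


if __name__ == "__main__":
    # Hypothetical probabilities for four cases; 1 marks an observed event.
    probs = [0.9, 0.8, 0.2, 0.1]
    outcomes = [1, 1, 0, 0]
    print(roc_area(roc_points(probs, outcomes)))  # 1.0
```

An area of 0.5 corresponds to no ability to discriminate events from non-events, and 1.0 to perfect discrimination.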

[Figure: two panels, Gilbert Skill Score (or ETS) and Area Under ROC Curve; ensembles labeled (21 member), (10 member) and (9 member); annotations: No Skill, Better, Optimal]

See P55 - Jensen et al.

HFIP GSI-Hybrid Data Assimilation Test: P7 Zhou

[Figure legend: No DA, GSI 3DVAR, GSI-Hybrid, Best Track]

GSI Hybrid using global ensemble improved Bret track forecast

Summary & Outlook

The DTC is a community facility with a mission to:

Accelerate the transition of new NWP technology into operations

Maintain and support community modeling systems for research and operational NWP communities

Facilitate the interaction between research and operational NWP

The DTC seeks input from the community through:

Participation in DTC Testing and Evaluation activities (e.g., MMET): Funding is available for off-cycle visitor proposals

Suggestions for new DTC T&E activities

Defining future direction of the DTC through the DTC Science Advisory Board (Cliff Mass is the chair of DTC SAB)


THANK YOU!

http://www.dtcenter.org/
