
Model Evaluation Tools Version 3.0 (METv3.0)

User's Guide

Developmental Testbed Center
Boulder, Colorado, USA

September 2010


Contents

Foreword: A note to MET users
New for MET v3.0
Model Evaluation Tools (MET) terms of use
Acknowledgments

Chapter 1  Overview of MET
  1.1  Purpose and organization of the User's Guide
  1.2  The Developmental Testbed Center (DTC)
  1.3  MET goals and design philosophy
  1.4  MET components
  1.5  Future development plans
  1.6  Code support

Chapter 2  Software Installation/Getting Started
  2.1  Introduction
  2.2  Supported architectures
  2.3  Programming languages
  2.4  Required compilers and scripting languages
  2.5  Required libraries and optional utilities
  2.6  Installation of required libraries
  2.7  Installation of optional utilities
  2.8  MET directory structure
  2.9  Building the MET package
  2.10 Sample test cases

Chapter 3  MET Data I/O and Re-Formatting
  3.1  Input data formats
  3.2  Intermediate data formats
  3.3  Output data formats
  3.4  Data format summary
  3.5  PB2NC tool
    3.5.1  pb2nc usage
    3.5.2  pb2nc configuration file
    3.5.3  pb2nc output
  3.6  ASCII2NC tool
    3.6.1  ascii2nc usage
  3.7  Pcp-Combine tool
    3.7.1  pcp_combine usage
    3.7.2  pcp_combine output
  3.8  Gen-Poly-Mask tool
    3.8.1  gen_poly_mask usage
  3.9  Ensemble-Stat tool
    3.9.1  ensemble_stat usage
    3.9.2  ensemble_stat output

Chapter 4  The Point-Stat Tool
  4.1  Introduction
  4.2  Scientific and statistical aspects
    4.2.1  Interpolation/matching methods
    4.2.2  Statistical measures
    4.2.3  Confidence intervals
  4.3  Practical information
    4.3.1  point_stat usage
    4.3.2  point_stat configuration file
    4.3.3  point_stat output

Chapter 5  The Grid-Stat Tool
  5.1  Introduction
  5.2  Scientific and statistical aspects
    5.2.1  Statistical measures
    5.2.2  Statistical confidence intervals
    5.2.3  Neighborhood methods
  5.3  Practical information
    5.3.1  grid_stat usage
    5.3.2  grid_stat configuration file
    5.3.3  grid_stat output

Chapter 6  The MODE Tool
  6.1  Introduction
  6.2  Scientific and statistical aspects
    6.2.1  Resolving objects
    6.2.2  Attributes
    6.2.3  Fuzzy logic
    6.2.4  Summary statistics
  6.3  Practical information
    6.3.1  mode usage
    6.3.2  mode configuration file
    6.3.3  mode output

Chapter 7  The Wavelet-Stat Tool
  7.1  Introduction
  7.2  Scientific and statistical aspects
    7.2.1  The method
    7.2.2  The spatial domain constraints
    7.2.3  Aggregation of statistics on multiple spatial cases
  7.3  Practical information
    7.3.1  wavelet_stat usage
    7.3.2  wavelet_stat configuration file
    7.3.3  wavelet_stat output

Chapter 8  The Stat-Analysis Tool
  8.1  Introduction
  8.2  Scientific and statistical aspects
    8.2.1  Filter STAT lines
    8.2.2  Summary statistics for columns
    8.2.3  Aggregated values from multiple STAT lines
    8.2.4  Aggregate STAT lines and produce aggregated statistics
    8.2.5  GO Index
    8.2.6  Verifying wind direction
  8.3  Practical information
    8.3.1  stat_analysis usage
    8.3.2  stat_analysis configuration file
    8.3.3  Stat-Analysis tool output

Chapter 9  The MODE-Analysis Tool
  9.1  Introduction
  9.2  Scientific and statistical aspects
  9.3  Practical information
    9.3.1  mode_analysis usage
    9.3.2  mode_analysis configuration file
    9.3.3  MODE-Analysis tool output

Chapter 10  Scripting
  10.1  Example scripts for running MET tools
  10.2  Example scripts for use with MODE output files

Chapter 11  Plotting and Graphics Support
  11.1  Grid-Stat tool examples
  11.2  MODE tool examples

References

Appendix A  How do I ... ?
  A.1  Frequently Asked Questions
  A.2  Troubleshooting
  A.3  Where to get help
  A.4  How to contribute code

Appendix B  Map Projections, Grids, and Polylines
  B.1  Map projections
  B.2  Grids
  B.3  Polylines

Appendix C  Verification Measures
  C.1  MET verification measures for categorical (dichotomous) variables
  C.2  MET verification measures for continuous variables
  C.3  MET verification measures for probabilistic forecasts
  C.4  MET verification measures for ensemble forecasts
  C.5  MET verification measures for neighborhood methods

Appendix D  Confidence Intervals


    Foreword: A note to MET users

This user's guide is provided as an aid to users of the Model Evaluation Tools (MET). MET is a set of verification tools developed by the Developmental Testbed Center (DTC) for use by the numerical weather prediction community, and especially by users and developers of the Weather Research and Forecasting (WRF) model, to help them assess and evaluate the performance of numerical weather predictions.

It is important to note that MET is an evolving software package. Previous releases of MET (METv1.0 and METv2.0) occurred in 2008 and 2009, respectively. This documentation describes the 3.0 release of fall 2010, which includes new capabilities and enhancements as well as corrections to some errors and system issues. Intermediate releases may include bug fixes. In the future, MET will also be able to accept new modules contributed by the community. A protocol will be established to determine the maturity of contributed verification methods and to coordinate the inclusion of new modules in future versions.

This user's guide was prepared by the developers of MET, including John Halley Gotway, Randy Bullock, Paul Oldenburg, Tara Jensen, Lacey Holland, Barbara Brown, Tressa Fowler, David Ahijevych, and Eric Gilleland.

    New for MET v3.0

MET v3.0 incorporates some major new features. The new Ensemble-Stat tool preprocesses sets of forecasts into ensemble forecasts, including the ensemble mean, spread, and probability. When observations are included, it also derives ensemble statistics such as the rank histogram and the continuous ranked probability score. When accumulating statistics over time, users can now adjust the confidence intervals to account for serial correlation. To assist our WRF-ARW users, MET can now read the NetCDF output from the pinterp postprocessor. Some bugs and typos have also been corrected.

Multi-category (e.g., 3x3) contingency tables are now supported, along with the appropriate skill scores. A new tools directory is also included in the software. One tool plots point observations from the MET preprocessing tools. Users wishing to perform cloud verification using the Air Force Weather Agency's World Wide Merged Cloud Analysis (WWMCA) product will find two new tools for their use: one creates plots of the WWMCA total cloud amount, while the other interpolates the WWMCA onto any grid or map projection, allowing WWMCA to be matched to the cloud forecast of interest.


    TERMS OF USE

    IMPORTANT!

    USE OF THIS SOFTWARE IS SUBJECT TO THE FOLLOWING TERMS AND CONDITIONS:

1. License. Subject to these terms and conditions, University Corporation for Atmospheric Research (UCAR) grants you a non-exclusive, royalty-free license to use, create derivative works, publish, distribute, disseminate, transfer, modify, revise and copy the Model Evaluation Tools (MET) software, in both object and source code (the "Software").

    You shall not sell, license or transfer for a fee the Software, or any work that in any manner contains the Software.

2. Disclaimer of Warranty on Software. Use of the Software is at your sole risk. The Software is provided "AS IS" and without warranty of any kind, and UCAR EXPRESSLY DISCLAIMS ALL WARRANTIES AND/OR CONDITIONS OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTIES OR CONDITIONS OF TITLE, NON-INFRINGEMENT OF A THIRD PARTY'S INTELLECTUAL PROPERTY, MERCHANTABILITY OR SATISFACTORY QUALITY AND FITNESS FOR A PARTICULAR PURPOSE. THE PARTIES EXPRESSLY DISCLAIM THAT THE UNIFORM COMPUTER INFORMATION TRANSACTIONS ACT (UCITA) APPLIES TO OR GOVERNS THIS AGREEMENT. No oral or written information or advice given by UCAR or a UCAR authorized representative shall create a warranty or in any way increase the scope of this warranty. Should the Software prove defective, you (and neither UCAR nor any UCAR representative) assume the cost of all necessary correction.

3. Limitation of Liability. UNDER NO CIRCUMSTANCES, INCLUDING NEGLIGENCE, SHALL UCAR BE LIABLE FOR ANY DIRECT, INCIDENTAL, SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES INCLUDING LOST REVENUE, PROFIT OR DATA, WHETHER IN AN ACTION IN CONTRACT OR TORT ARISING OUT OF OR RELATING TO THE USE OF OR INABILITY TO USE THE SOFTWARE, EVEN IF UCAR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

4. Compliance with Law. All Software and any technical data delivered under this Agreement are subject to U.S. export control laws and may be subject to export or import regulations in other countries. You agree to comply strictly with all applicable laws and regulations in connection with use and distribution of the Software, including export control laws, and you acknowledge that you have responsibility to obtain any required license to export, re-export, or import as may be required.

5. No Endorsement/No Support. The names UCAR/NCAR, National Center for Atmospheric Research and the University Corporation for Atmospheric Research may not be used in any advertising or publicity to endorse or promote any products or commercial entity unless specific written permission is obtained from UCAR. The Software is provided without any support or maintenance, and without any obligation to provide you with modifications, improvements, enhancements, or updates of the Software.

6. Controlling Law and Severability. This Agreement shall be governed by the laws of the United States and the State of Colorado. If for any reason a court of competent jurisdiction finds any provision, or portion thereof, to be unenforceable, the remainder of this Agreement shall continue in full force and effect. This Agreement shall not be governed by the United Nations Convention on Contracts for the International Sale of Goods, the application of which is hereby expressly excluded.

7. Termination. Your rights under this Agreement will terminate automatically without notice from UCAR if you fail to comply with any term(s) of this Agreement. You may terminate this Agreement at any time by destroying the Software and any related documentation and any complete or partial copies thereof. Upon termination, all rights granted under this Agreement shall terminate. The following provisions shall survive termination: Sections 2, 3, 6 and 9.

8. Complete Agreement. This Agreement constitutes the entire agreement between the parties with respect to the use of the Software and supersedes all prior or contemporaneous understandings regarding such subject matter. No amendment to or modification of this Agreement will be binding unless in writing and signed by UCAR.

9. Notices and Additional Terms. Copyright in Software is held by UCAR. You must include, with each copy of the Software and associated documentation, a copy of this Agreement and the following notice:

"The source of this material is the Research Applications Laboratory at the National Center for Atmospheric Research, a program of the University Corporation for Atmospheric Research (UCAR) pursuant to a Cooperative Agreement with the National Science Foundation; ©2007 University Corporation for Atmospheric Research. All Rights Reserved."

    The following notice shall be displayed on any scholarly works associated with, related to or derived from the Software:

    "Model Evaluation Tools (MET) was developed at the National Center for Atmospheric Research (NCAR) through a grant from the United States Air Force Weather Agency (AFWA). NCAR is sponsored by the United States National Science Foundation."

    By using or downloading the Software, you agree to be bound by the terms and conditions of this Agreement.


Acknowledgments

We thank the U.S. Air Force Weather Agency for their support of this work. Thanks also go to the staff at the Developmental Testbed Center for their help, advice, and many types of support. We are grateful to the individuals who participated in MET planning workshops in February 2007, April 2008, and August 2009; the ideas generated at those workshops will help MET grow in future years. Finally, we would like to specifically thank the verification advisory group (Mike Baldwin, Matthew Sittel, Elizabeth Ebert, Geoff DiMego, Chris Davis, and Jason Knievel) for their guidance and other contributions. The DTC is sponsored by the National Oceanic and Atmospheric Administration (NOAA), AFWA, and the National Science Foundation (NSF). NCAR is sponsored by the National Science Foundation (NSF).


    Chapter 1 Overview of MET

1.1 Purpose and organization of the User's Guide

The goal of this User's Guide is to provide basic information to users of the Model Evaluation Tools (MET), enabling them to apply MET to their own datasets and evaluation studies. MET has been designed specifically for application to the Weather Research and Forecasting (WRF) model (see http://www.wrf-model.org/index.php for more information about WRF). However, MET may also be used to evaluate forecasts from other models or applications if certain file format definitions (described in this document) are followed.

The User's Guide is organized as follows. Chapter 1 provides an overview of MET and its components. Chapter 2 contains basic information about how to get started with MET, including system requirements; required software (and how to obtain it); how to download MET; and information about compilers, libraries, and how to build the code. Chapter 3 focuses on the data needed to run MET, including formats for forecasts, observations, and output; this chapter also documents the new Ensemble-Stat preprocessing tool. Chapters 4 through 7 focus on the main modules contained in the current version of MET: the Point-Stat, Grid-Stat, MODE, and Wavelet-Stat tools. These chapters include an introduction to the statistical verification methodologies used by each tool, followed by a section of practical information, such as how to set up configuration files and the form of the output. Chapters 8 and 9 focus on the analysis modules, Stat-Analysis and MODE-Analysis, which aggregate the output statistics from the other tools across multiple cases. Finally, Chapters 10 and 11 provide additional tools and information for scripting MET runs and plotting MET results. The appendices provide further useful information, including answers to some typical questions (Appendix A: How do I ... ?) and links and information about map projections, grids, and polylines (Appendix B). Appendices C and D provide more information about the verification measures and confidence intervals provided by MET.

Sample code that can be used to perform analyses on MET output and to create particular types of plots of verification results is posted on the MET website (http://www.dtcenter.org/met/users/). Note that the MET development group also accepts contributed analysis and plotting scripts, which may be posted on the MET website for use by the community.

The remainder of this chapter describes the context for MET development and the design principles used in developing MET, and provides an overview of the MET package and its specific modules.

1.2 The Developmental Testbed Center (DTC)

MET has been developed, and will be maintained and enhanced, by the Developmental Testbed Center (DTC; http://www.dtcenter.org/). The main goal of the DTC is to serve as a bridge between operations and research, facilitating the activities of these two important components of the numerical weather prediction (NWP) community. The DTC provides an environment that is functionally equivalent to the operational environment, in which the research community can test model enhancements; the operational community benefits from DTC testing and evaluation of models before new models are implemented operationally. MET serves both the research and operational communities in this way: it offers capabilities for researchers to test their own enhancements to models, and it provides a capability for the DTC to evaluate the strengths and weaknesses of advances in NWP prior to operational implementation. The MET package will also be available to DTC visitors and to the WRF modeling community for testing and evaluation of new model capabilities, applications in new environments, and so on.

1.3 MET goals and design philosophy

The primary goal of MET development is to provide a state-of-the-art verification package to the NWP community. By "state-of-the-art" we mean that MET will incorporate newly developed and advanced verification methodologies, including new methods for diagnostic and spatial verification and new techniques provided by the verification and modeling communities. MET also utilizes and replicates the capabilities of existing systems for verification of NWP forecasts. For example, the MET package replicates existing NCEP operational verification capabilities (e.g., I/O, methods, statistics, data types). MET development takes into account the needs of the NWP community, including operational centers and the research and development community. Some of the MET capabilities include traditional verification approaches for standard surface and upper-air variables (e.g., Equitable Threat Score, Mean Squared Error), confidence intervals for verification measures, and spatial forecast verification methods. In the future, MET will include additional state-of-the-art and newly developed methodologies.

The MET package has been designed to be modular and adaptable. For example, individual modules can be applied without running the entire set of tools. New tools can easily be added to the MET package because of this modular design. In addition, the tools can readily be incorporated into a larger system that may include a database as well as more sophisticated input/output and user interfaces. Currently, the MET package is a set of tools that can easily be applied by any user on their own computer platform.

The MET code and documentation are maintained by the DTC in Boulder, Colorado. The MET package is freely available to the modeling, verification, and operational communities, including universities, governments, the private sector, and operational modeling and prediction centers.


1.4 MET components

The major components of the MET package are represented in Figure 1-1. The main stages represented are input, reformatting, intermediate output, statistical analyses, and output and aggregation/analysis. Each of these stages is described further in later chapters. For example, the input and output formats are discussed in Chapter 2 as well as in the chapters associated with each of the statistics modules. MET input files are represented on the far left. Note that forecast model output is currently expected to be in GRIB1 format; GRIB2 and other formats will be incorporated in future releases of MET.

Figure 1-1. Basic representation of current MET structure and modules. Green areas represent software and modules included in MET, and gray areas represent input and output files.

The reformatting stage of MET consists of the Gen-Poly-Mask, PB2NC, ASCII2NC, Pcp-Combine, and Ensemble-Stat tools. The PB2NC tool is used to create NetCDF files from input PrepBufr files containing point observations. Likewise, the ASCII2NC tool is used to create NetCDF files from input ASCII point observations. These NetCDF files are then used in the statistical analysis step. The Gen-Poly-Mask and Pcp-Combine tools are optional. The Gen-Poly-Mask tool creates a bitmapped masking area from a user-specified polygon, i.e., a text file containing a series of latitude/longitude points. This mask can then be used to efficiently limit verification to the interior of a user-specified region. The Pcp-Combine tool accumulates precipitation amounts into the time interval selected by the user, for cases in which the user would like to verify over a different time interval than is included in the forecast or observational dataset. The Ensemble-Stat tool combines many forecasts into an ensemble mean or probability forecast. Additionally, if observations are included, ensemble rank histogram information is produced.

The four main statistical analysis components of the current version of MET are Point-Stat, Grid-Stat, MODE, and Wavelet-Stat. The Point-Stat tool is used for grid-to-point verification, that is, verification of a gridded forecast field against point-based observations (i.e., surface observing stations, ACARS, rawinsondes, and other observation types that can be described as point observations). In addition to traditional forecast verification scores for both continuous and categorical variables, confidence intervals are also produced, using parametric and non-parametric methods. Confidence intervals take into account the uncertainty in verification statistics due to sampling variability and limitations in sample size, and thus provide more meaningful information about forecast performance. For example, confidence intervals allow credible comparisons of performance between two models when only a limited number of model runs is available.

Sometimes it may be useful to verify a forecast against gridded fields (e.g., Stage IV precipitation analyses). The Grid-Stat tool produces traditional verification statistics when a gridded field is used as the observational dataset. Like the Point-Stat tool, the Grid-Stat tool also produces confidence intervals. The Grid-Stat tool now also includes new neighborhood spatial methods, such as the Fractional Skill Score (Roberts and Lean 2008). These methods are discussed in Ebert (2008).

The MODE (Method for Object-based Diagnostic Evaluation) tool also uses gridded fields as observational datasets. However, unlike the Grid-Stat tool, which applies traditional forecast verification techniques, MODE applies the object-based spatial verification technique described in Davis et al. (2006a,b) and Brown et al. (2007). This technique was developed in response to the "double penalty" problem in forecast verification: a forecast missed by even a small distance is effectively penalized twice by standard categorical verification scores, once for missing the event and a second time for producing a false alarm of the event elsewhere. As an alternative, MODE defines objects in both the forecast and observation fields. The objects in the two fields are then matched and compared to one another. Applying this technique also provides diagnostic verification information that is difficult or even impossible to obtain using traditional verification measures. For example, the MODE tool can provide information about errors in location, size, and intensity.

The Wavelet-Stat tool decomposes two-dimensional forecasts and observations according to the Intensity-Scale verification technique described by Casati et al. (2004). Among the many types of spatial verification approaches, the Intensity-Scale technique belongs to the scale-decomposition (or scale-separation) approaches. The spatial scale components are obtained by applying a wavelet transformation to the forecast and observation fields. The resulting scale decomposition measures the error, bias, and skill of the forecast on each spatial scale. Information is provided on the scale dependency of the error and skill, on the no-skill-to-skill transition scale, and on the ability of the forecast to reproduce the observed scale structure. The Wavelet-Stat tool is primarily used for precipitation fields; however, it can be applied to other variables, such as cloud fraction.

Results from the statistical analysis stage are output in ASCII, NetCDF, and PostScript formats. The Point-Stat, Grid-Stat, and Wavelet-Stat tools create STAT (statistics) files, which are tabular ASCII files ending with a .stat suffix. In earlier versions of MET, this output format was called VSDB (Verification System DataBase). VSDB, which was developed by the National Centers for Environmental Prediction (NCEP), is a specialized ASCII format that can be easily read and used by graphics and analysis software. The STAT output format of the Point-Stat, Grid-Stat, and Wavelet-Stat tools is an extension of the VSDB format developed by NCEP: additional columns of data and output line types have been added to store statistics not produced by the NCEP version.

The Stat-Analysis and MODE-Analysis tools aggregate the output statistics from the previous steps across multiple cases. The Stat-Analysis tool reads the STAT output of Point-Stat, Grid-Stat, and Wavelet-Stat and can be used to filter the STAT data and produce aggregated continuous and categorical statistics. The MODE-Analysis tool reads the ASCII output of the MODE tool and can be used to produce summary information about object location, size, and intensity (as well as other object characteristics) across one or more cases.
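As a concrete illustration of how these modules chain together, the following Bourne-shell sketch reformats point observations and then verifies a forecast against them. The file names, configuration file names, and output directory are hypothetical placeholders; the exact command-line arguments for each tool are documented in Chapters 3 and 4.

    #!/bin/sh
    # Hypothetical MET workflow: reformat observations, then verify.
    # All file names below are illustrative placeholders.

    # Reformat a PrepBufr point observation file into the intermediate
    # NetCDF format used by the statistics tools (see Section 3.5).
    pb2nc prepbufr.20100101 obs_20100101.nc PB2NCConfig

    # Verify a GRIB1 forecast against the reformatted point observations
    # (see Chapter 4 for the point_stat usage and configuration file).
    point_stat wrfprs_d01_24.grb obs_20100101.nc PointStatConfig \
        -outdir out/point_stat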

1.5 Future development plans

MET is an evolving verification software package. New capabilities are planned in controlled, successive version releases. Bug fixes and user-identified problems will be addressed as they are found and posted to the known issues section of the MET Users web page (www.dtcenter.org/met/users/support). Plans are also in place to incorporate many new capabilities and options in future releases of MET. Some of the planned additions are listed below.

Additional statistical capabilities
  o Additional spatial forecast verification methods
  o Hurricane track verification
  o Enhanced support for wind direction verification

Support for other input formats
  o Support for gridded data in GRIB2
  o Support for gridded data in NetCDF, CF convention

Additional analysis capabilities and plotting routines
  o Post to the MET website sample analysis and plotting routines that may include:
    - Boxplots
    - Discrimination plots
    - Reliability diagrams
    - Scatter/density plots
    - Color-fill/contour maps of statistics
    - Height series
    - Histograms

Other capabilities
  o Analysis tools for model comparisons
  o Graphical user interface (GUI)
  o Autoconf configurability
  o Database and display system for the statistical output of MET

1.6 Code support

MET support is provided through a MET-help e-mail address: [email protected]. We will endeavor to respond to requests for help in a timely fashion. In addition, information about MET, and about tools that can be used with MET, is provided on the MET Users web page (http://www.dtcenter.org/met/users/).

We welcome comments and suggestions for improvements to MET, especially information regarding errors. Comments may be submitted using the MET Feedback form available on the MET website. In addition, comments on this document would be greatly appreciated. While we cannot promise to incorporate all suggested changes, we will certainly take all suggestions into consideration.

The MET package is a living set of tools. Our goal is to continually enhance it and add to its capabilities. Because our time, resources, and talents are limited, we welcome contributed code for future versions of MET. These contributions may represent new verification methodologies, new analysis tools, or new plotting functions. For more information on contributing code to MET, please contact [email protected].


    Chapter 2 Software Installation/Getting Started

2.1 Introduction

This chapter describes how to install the MET package. MET has been developed and tested on Linux and IBM operating systems. Support for additional platforms and compilers will be added in future releases. The MET package requires four external libraries to be available on the user's computer prior to installation. Required and recommended libraries, how to install MET, the MET directory structure, and sample test cases are described in the following sections.

2.2 Supported architectures

The MET package was developed on Debian Linux using the GNU compilers and the Portland Group (PGI) compilers. The MET package has also been built on several other Linux distributions using either the GNU or PGI compilers, and has been ported to IBM machines using the IBM compilers. Other machines will be added to this list in future releases as they are tested. In particular, the goal is to support those architectures supported by the WRF model itself.

Table 2-1. Hardware and compiler configurations tested for the MET package.

    Vendor   Hardware       OS      Compiler
    DELL     XEON           Linux   GNU / PGI / Intel
    IBM      Power Series   AIX     IBM

The MET package runs on a single processor, and there are currently no plans to run it across multiple processors in the future. Therefore, none of the utilities necessary for running WRF on multiple processors are necessary for running MET.

2.3 Programming languages

The MET package is written primarily in C/C++ in order to be compatible with an extensive verification code base in C/C++ already in existence. In addition, the object-based MODE verification tool relies heavily on the object-oriented aspects of C++. Knowledge of C/C++ is not necessary to use the MET package. The MET package has been designed to be highly configurable through the use of ASCII configuration files, enabling a great deal of flexibility without the need for source code modifications.


NCEP's BUFRLIB is written entirely in Fortran. The portion of MET that handles the interface to the BUFRLIB for reading PrepBufr point observation files is also written in Fortran.

The MET package is intended to be a tool for the modeling community to use and adapt. As users make upgrades and improvements to the tools, they are encouraged to offer those upgrades to the broader community by providing feedback to the developers.

2.4 Required compilers and scripting languages

The MET package was developed and tested using the GNU g++/gfortran compilers and the Portland Group (PGI) pgCC/pgf77 compilers. The MET package has also been ported to IBM machines using the IBM xlC/xlf90 compilers. As additional compilers are successfully tested, they will be added to the list of supported platforms/compilers.

The GNU make utility is used in building all executables and is therefore required.

The MET package consists of a group of command-line utilities that are compiled separately. The user may choose to run any subset of these utilities to employ the type of verification methods desired. New tools developed and added to the toolkit will be included as command-line utilities. In order to control the desired flow through MET, users are encouraged to run the tools via a script (see Chapter 10 for some examples, and the sketch below). Some sample scripts are provided in the distribution; these examples are written in the Bourne shell. However, users are free to adapt these sample scripts to any scripting language desired.
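As a minimal sketch of such a driver script, the fragment below runs two tools in sequence in the style of the Bourne-shell samples shipped in the scripts/ directory. The installation path, file names, and configuration files are hypothetical; see Chapter 10 for complete, working examples.

    #!/bin/sh
    # Hypothetical driver script; all paths and file names are placeholders.
    MET_BIN=/path/to/METv3.0/bin

    # Sum 1-hourly precipitation accumulations into a 12-hour total
    # (see Section 3.7 for the pcp_combine usage statement).
    $MET_BIN/pcp_combine -sum 20100101_000000 1 20100102_000000 12 apcp_12.nc

    # Verify a GRIB1 forecast against the combined analysis
    # (see Chapter 5 for the grid_stat usage and configuration file).
    $MET_BIN/grid_stat fcst_apcp_12.grb apcp_12.nc GridStatConfig \
        -outdir out/grid_stat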

2.5 Required libraries and optional utilities

Four external libraries are required for compiling/building MET and should be downloaded and installed before attempting to install MET:

1. NCEP's BUFRLIB is used by MET to decode point-based observation datasets in PrepBufr format. BUFRLIB is distributed and supported by NCEP and is freely available for download from NCEP's website at http://www.nco.ncep.noaa.gov/sib/decoders/BUFRLIB. BUFRLIB requires C and Fortran-77 compilers that should be from the same family of compilers used when building MET.

2. Several tools within MET use Unidata's NetCDF libraries for writing output NetCDF files. NetCDF libraries are distributed and supported by Unidata and are freely available for download from Unidata's website at http://www.unidata.ucar.edu/software/netcdf. The same family of compilers used to build NetCDF should be used when building MET. MET is compatible with most NetCDF version 3 releases, but it is not compatible with NetCDF version 4.


3. The GNU Scientific Library (GSL) is used by MET when computing confidence intervals. GSL is distributed and supported by the GNU Software Foundation and is freely available for download from the GNU website at http://www.gnu.org/software/gsl.

4. The F2C (or G2C) library may be required, depending on which Fortran compiler is used to compile MET. It is not necessary when using the GNU gfortran or PGI pgf77 compilers, but it is required for the GNU g77 compiler. The F2C (or G2C) library is used by MET to enable the PB2NC tool, written in C++, to communicate with the BUFRLIB, written in Fortran. If F2C (or G2C) is not already installed on your system, it may be downloaded from the Netlib website at http://www.netlib.org/f2c. Download the file libf2c.zip and refer to the README file for installation instructions.

Two additional utilities are strongly recommended for use with MET:

1. The WRF Post-Processor is recommended for post-processing the raw model output prior to verifying the model forecasts with MET. The WRF Post-Processor is freely available for download from the downloads section of the WRF-NMM user's website at http://www.dtcenter.org/wrf-nmm/users. MET requires input data in GRIB1 format on a standard, de-staggered grid and on pressure or regular levels in the vertical; the WRF Post-Processor outputs model data in this format. However, the WRF Post-Processor is not strictly required, as long as the user can produce GRIB input data on a standard, de-staggered grid on pressure or regular levels in the vertical. Two-dimensional fields (e.g., precipitation amount) are also accepted by some modules.

2. The copygb utility is recommended for re-gridding model and observation datasets in GRIB format to a common verification grid. This utility is highly recommended when using the Grid-Stat, Wavelet-Stat, or MODE tools: prior to running MET, the model output and verifying gridded observations must be placed on the same grid. The copygb utility is distributed as part of the WRF Post-Processor and is available from other sources as well. However, the copygb utility is not strictly required, as long as users can ensure that their model and gridded observation datasets reside on a common grid.

2.6 Installation of required libraries

As described in Section 2.5, three libraries are required for building MET (the F2C/G2C library is additionally required only for certain compilers; see Section 2.9):

1. NCEP's BUFRLIB is used by MET to decode point-based observation datasets in PrepBufr format. Once you have downloaded and unpacked the BUFRLIB tarball, refer to the README_BUFRLIB file. When compiling the library using the GNU C and Fortran compilers, users are strongly encouraged to use the -DUNDERSCORE and -fno-second-underscore options. Also, MET expects the BUFRLIB archive file to be named libbufr.a. Therefore, compiling the BUFRLIB using the GNU compilers consists of the following three steps:

gcc -c -DUNDERSCORE *.c
gfortran -c -DUNDERSCORE -fno-second-underscore *.f *.F
ar crv libbufr.a *.o

Alternatively, compiling the BUFRLIB using the PGI C and Fortran-77 compilers consists of the following three steps:

pgcc -c -DUNDERSCORE *.c
pgf77 -c -DUNDERSCORE -Mnosecond_underscore *.f *.F
ar crv libbufr.a *.o

Compiling the BUFRLIB using the IBM C and Fortran compilers consists of the following three steps:

xlc -c -DUNDERSCORE *.c
xlf -c -qextname *.f *.F
ar crv libbufr.a *.o

2. Unidata's NetCDF libraries are used by several tools within MET for writing output NetCDF files. The same family of compilers used to build NetCDF should be used when building MET. Users may also find utilities built for NetCDF, such as ncdump and ncview, useful for viewing the contents of NetCDF files (see the example after this list). Detailed installation instructions are available from Unidata at http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-install/.

3. The GNU Scientific Library (GSL) is used by MET for random sampling and for normal and binomial distribution computations when estimating confidence intervals. Precompiled binary packages are available for most GNU/Linux distributions and may be installed with root access (see the example after this list). When installing GSL from a precompiled package on Debian Linux, the developer's version of GSL must be used; otherwise, use the GSL version available from the GNU website (http://www.gnu.org/software/gsl/). MET requires access to the GSL source headers and library archive file at build time.
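For example, the following commands illustrate both points above. The ncdump invocation is standard, and the file name is a placeholder; the Debian package name shown for GSL is an assumption that varies by distribution and release, so check your package manager for the exact name of the GSL development package.

    # Inspect the header of a NetCDF file produced by a MET tool
    # (ncdump is distributed with the NetCDF utilities).
    ncdump -h out/pcp_combine/apcp_12.nc

    # Install the GSL development package on a Debian-type system.
    # The package name (libgsl0-dev here) is illustrative and varies by
    # distribution; it must include the headers and the library archive.
    sudo apt-get install libgsl0-dev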

2.7 Installation of optional utilities

As described in Section 2.5, two additional utilities are strongly recommended for use with MET.

1. The WRF Post-Processor is recommended for post-processing the raw model output prior to verifying the data with MET. The WRF Post-Processor may be used on output from both the ARW and NMM cores. Please refer to the online documentation for instructions on how to install and use the WRF Post-Processor. Installation instructions for the WRF Post-Processor can be found in Chapter 2 of the WRF-NMM User's Guide or online at http://www.dtcenter.org/wrf-nmm/users/docs/user_guide/WPS/.

2. The copygb utility is recommended for re-gridding model and observation datasets in GRIB format to a common verification grid. The copygb utility is distributed as part of the WRF Post-Processor and is available from other sources as well. Please refer to the WRF Post-Processor documentation mentioned above for information on availability and installation.


2.8 MET directory structure

Once you have downloaded the MET tarball and unzipped and unpacked its contents, the top-level MET directory structure follows this outline:

METv3.0/
    Makefile
    Makefile_gnu
    Makefile_ibm
    Makefile_intel
    Makefile_pgi
    README
    bin/
    data/
        colortables/
        config/
        map/
        poly/
        ps/
        sample_fcst/
        sample_obs/
    doc/
        MET_Users_Guide.pdf
    lib/
    out/
    scripts/
        config/
    src/
        ascii2nc/
        ensemble_stat/
        gen_poly_mask/
        grid_stat/
        mode/
        mode_analysis/
        pb2nc/
        pcp_combine/
        point_stat/
        stat_analysis/
        wavelet_stat/
    tools/

The top-level MET directory consists of a README file, Makefiles, and several subdirectories. The top-level Makefiles control how the entire toolkit is built by calling sub-makes for each of the internal libraries and applications. These top-level Makefiles will be modified in Section 2.9.


When MET has been successfully built, the bin/ directory will contain an executable for each module of MET (grid_stat, mode, mode_analysis, ensemble_stat, pb2nc, ascii2nc, gen_poly_mask, pcp_combine, point_stat, stat_analysis, and wavelet_stat).

The data/ directory contains several configuration and static data files used by MET. The colortables/, map/, and ps/ subdirectories contain data used in creating PostScript plots for the MODE tool. The poly/ subdirectory contains predefined lat/lon polyline regions for use in selecting regions over which to verify; the polylines defined there correspond to verification regions used by NCEP, as described in Appendix B. The config/ subdirectory contains default configuration files for each MET tool that accepts one. Users may copy these configuration files to another location and modify them for their own use. The sample_fcst/ and sample_obs/ subdirectories contain sample data used by the test scripts provided in the scripts/ directory.

The doc/ directory contains documentation for MET, including the MET User's Guide.

The lib/ directory contains the source code for several internal libraries used by the MET tools.

The out/ directory will be populated with sample output from the test cases described in the next section.

The src/ directory contains the source code for each of the tools in MET.

The scripts/ directory contains test scripts to be run after MET has been successfully built, as well as sample configuration files located in its config/ subdirectory. The output from the test scripts in this directory is written to the out/ directory. Users are encouraged to copy sample configuration files to another location and modify them for their own use.

2.9 Building the MET package

Building the MET package consists of three main steps: (1) installing the required libraries, (2) configuring the top-level Makefile, and (3) executing the build. A sketch of a configured Makefile excerpt appears at the end of this section.

1. Install the required libraries.

Please refer to Section 2.6 for instructions on installing the required libraries.

2. Configure the top-level Makefile.

Once you have downloaded the MET tarball, unzip and unpack its contents (refer to Section 2.8).


Make a copy of the Makefile most similar to your OS and compiler. For example, if compiling on Linux using the GNU compilers:

    cp Makefile_gnu Makefile

Edit the top-level Makefile as follows:
  o Set MAKE to the full path for the GNU make utility.
  o Set CXX to the full path for your C++ compiler.
  o Set FC to the full path for your Fortran compiler.
  o Set NETCDF_BASE to the location where NetCDF is installed, if it is not installed in a standard location. The NetCDF directory should contain include/ and lib/ subdirectories.
  o Set BUFR_BASE to the location where BUFRLIB is installed, if it is not installed in a standard location.
  o Set GSL_BASE to the location where the GNU Scientific Library is installed, if it is not installed in a standard location. The GSL directory should contain include/gsl/ and lib/ subdirectories.
  o If required for your compiler, set F2C_BASE to the location where the F2C or G2C library is installed, if it is not installed in a standard location.
  o If required for your compiler, set F2C_LIBNAME to either lf2c or lg2c to indicate which library is to be used.
  o The additional parameters in the Makefile, such as compiler flags and additional libraries, may be set as needed to configure the build for your system.

3. Execute the build.

Execute the GNU make command, typically by typing make, to build the MET package. Note that on IBM machines the GNU make command may be named gmake:

    make >& make.log &

Execute the following tail command to monitor the progress of the make:

    tail -f make.log

When the make has completed, use CNTL-C to end the tail command, and examine the contents of the make.log file. Look for the following message, which likely indicates that the build was successful:

    *** Finished Making the Model Evaluation Tools Project ***

Several compilation warnings may occur and are expected. If any errors occur, please refer to Appendix A on troubleshooting for common problems.
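To recap step 2, the excerpt below shows what the edited variables in a copied Makefile might look like. All of the paths are hypothetical and depend entirely on where the libraries were installed on your system; the variable names are the ones listed above.

    # Hypothetical excerpt of an edited top-level Makefile; every path
    # below is a placeholder to be replaced with your local install paths.
    MAKE        = /usr/bin/make
    CXX         = /usr/bin/g++
    FC          = /usr/bin/gfortran
    NETCDF_BASE = /usr/local/netcdf      # contains include/ and lib/
    BUFR_BASE   = /usr/local/bufrlib     # contains libbufr.a
    GSL_BASE    = /usr/local/gsl         # contains include/gsl/ and lib/
    F2C_BASE    = /usr/local/f2c         # only if required by your compiler
    F2C_LIBNAME = lf2c                   # or lg2c, depending on the library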

2.10 Sample test cases

Once the MET package has been built successfully, the user is encouraged to run the sample test scripts provided. Change directories into the scripts/ directory. The scripts/ directory contains a test Bourne-shell script for each of the tools in MET; the test_all.sh script runs the other test scripts in the proper order. Execute the following commands:

Run the script:

    ./test_all.sh >& test_all.log &

Monitor the progress of the script:

    tail -f test_all.log

When the test script has completed, use CNTL-C to end the tail command.

NOTE: All of these test scripts should take less than 10 minutes to run on most machines.

Examine the contents of the test_all.log file. Look for the following message, which indicates that the test script completed:

    *** Finished Testing the Model Evaluation Tools Project ***

If any warnings or errors occur, please refer to Appendix A on troubleshooting for common problems.

The output from this test script is written to the top-level out/ directory, organized by the names of the MET tools.


Chapter 3 MET Data I/O and Re-Formatting

Both the input and output file formats are described in this chapter. Sections 3.1 and 3.2 are primarily concerned with re-formatting input files into the intermediate files required by some MET modules. These steps are represented by the first three columns of the MET flowchart depicted in Figure 1-1. Output data formats and the software modules used to reformat the data are described in later sections.

3.1 Input data formats

The MET package can handle gridded input data in GRIB version 1 format (i.e., the same format produced by the WRF Post-Processor). Point observation files may be supplied in either PrepBufr or ASCII format. Note that MET does not require the WRF Post-Processor to be used, but it does require that the input GRIB data be on a standard, de-staggered grid on pressure or regular levels in the vertical. While the Grid-Stat, Wavelet-Stat, and MODE tools can be run on a gridded field at virtually any level, the Point-Stat tool can only be used to verify forecasts at the surface or on pressure levels.

When comparing two gridded fields with the Grid-Stat, Wavelet-Stat, or MODE tools, the input model and observation datasets must already have been placed on the same grid. The copygb utility is recommended for re-gridding GRIB files (see the example below). To preserve the characteristics of the observations, it is generally preferable to re-grid the model data to the observation grid, rather than vice versa.

Input point observation files in PrepBufr format are available through NCEP. The PrepBufr observation files contain a wide variety of point-based observation types in a single file in a standard format. However, some users may wish to use observations not included in the standard PrepBufr files. For this reason, prior to performing the verification step in the Point-Stat tool, the PrepBufr file is reformatted with the PB2NC tool. In this step, the user can select various ways of stratifying the observation data spatially, temporally, and by type; the remaining observations are reformatted into an intermediate NetCDF file. The ASCII2NC tool may be used to convert ASCII point observations that are not available in the PrepBufr files into this NetCDF format for use by the Point-Stat verification tool.
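As an illustration of the re-gridding step, one common pattern is to interpolate a model GRIB file to the grid of the verifying analysis. The invocation below is a sketch only: the -xg212 option (copy all fields, interpolate to NCEP grid 212) and the file names are assumptions, so consult the copygb documentation distributed with the WRF Post-Processor for the authoritative usage.

    # Hypothetical example: re-grid a model GRIB1 file to NCEP grid 212
    # so that it matches a gridded analysis on that same grid.
    copygb -xg212 wrfprs_d01_24.grb wrfprs_d01_24_g212.grb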

3.2 Intermediate data formats

MET uses NetCDF as an intermediate file format. The Ensemble-Stat, WWMCA, Pcp-Combine, Gen-Poly-Mask, PB2NC, and ASCII2NC tools write intermediate files in NetCDF format.


The Pcp-Combine tool operates in three different modes (see the sketch below). It may be used to sum accumulated precipitation from several GRIB files into a single NetCDF file containing the desired accumulation period, or to add or subtract the accumulated precipitation in two GRIB files directly. The command-line arguments for the Pcp-Combine tool vary depending on the mode in which it is run. The user may choose to: (1) combine the model accumulations to match the observation accumulation period, (2) combine the observation accumulations to match the model accumulation period, or (3) combine both the model and observation accumulations to some new period desired for verification. In performing this summation, the user may not specify an accumulation interval smaller than the accumulation period in the GRIB files. However, if the input model and observation GRIB files already contain accumulated precipitation with the same desired accumulation period, then pcp_combine need not be run. Each time the Pcp-Combine tool is called, a NetCDF file is written containing the requested accumulation period.

The Gen-Poly-Mask tool is used to define a bitmapped masking region that can be used by the Ensemble-Stat, Grid-Stat, Point-Stat, and MODE tools as a verification subdomain. It is generally more efficient to use the NetCDF output of gen_poly_mask to define a masking region than to use a complex polyline directly in the other MET tools. However, the NetCDF output can only be applied to datasets on a common domain; it must be regenerated for each domain used.

The PB2NC tool is used to reformat the input PrepBufr files containing point observations. This tool stratifies the observations as requested in a configuration file and writes out the remaining observations in NetCDF format. The NetCDF output of the PB2NC tool is used as input to the verification step performed in the Point-Stat tool.

The ASCII2NC tool simply reformats ASCII point observations into the NetCDF format needed by the Point-Stat tool. The output NetCDF file from the ASCII2NC tool has a format identical to that of the output from the PB2NC tool.
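The sketch below shows the subtract mode of Pcp-Combine and a basic Gen-Poly-Mask invocation. The argument order shown is an assumption based on the usage statements in Sections 3.7.1 and 3.8.1, and all file names are placeholders; consult those sections before running the commands.

    # Hypothetical example: subtract a 24-h accumulation from a 36-h
    # accumulation from the same model run, leaving the 12-h accumulation
    # between forecast hours 24 and 36 (see Section 3.7.1).
    pcp_combine -subtract wrfprs_f36.grb 36 wrfprs_f24.grb 24 apcp_24_to_36.nc

    # Hypothetical example: convert a lat/lon polyline into a bitmapped
    # NetCDF masking region on the grid of a sample data file (Section 3.8.1).
    gen_poly_mask sample_fcst.grb CONUS.poly CONUS_mask.nc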

3.3 Output data formats

The MET package currently produces output in four basic file formats: STAT files, ASCII files, NetCDF files, and PostScript plots.

The STAT format consists of tabular ASCII data that can be easily read by many analysis tools and software packages. MET produces STAT output for the Grid-Stat, Point-Stat, and Wavelet-Stat tools. STAT is a specialized ASCII format containing one record on each line. However, a single STAT file may contain multiple line types. Several header columns at the beginning of each line remain the same for each line type, while the remaining columns after the header change for each line type.


STAT files can be difficult for a human to read, as the quantities represented in many columns of data change from line to line. For this reason, ASCII output is also available as an alternative for the Grid-Stat, Point-Stat, and Wavelet-Stat tools. The ASCII files contain exactly the same output as the STAT files, but each STAT line type is grouped into a single ASCII file with a column header row, making the output more human-readable. The configuration files control which line types are output and whether or not the optional ASCII files are generated.

The MODE tool creates two ASCII output files as well, although they are not in STAT format. The first ASCII file contains contingency table counts and statistics comparing the model and observation fields. The second ASCII file contains all of the attributes for the single objects and pairs of objects. Each line in this file contains the same number of columns, and those columns not applicable to a given line type contain fill data.

The Ensemble-Tool, Grid-Stat, Wavelet-Stat, and MODE tools generate gridded NetCDF output. The MODE tool creates a NetCDF file containing four gridded fields for the objects identified in the forecast and observation: the simple and cluster object fields. The Ensemble-Tool creates a NetCDF file containing the ensemble forecast values, statistics, and, if requested, matched observations for each verification region and variable type/level requested in the configuration file. In addition, when rank histogram information is requested, the NetCDF file contains the observation rank values. The Grid-Stat tool creates a NetCDF file containing the matched forecast/observation pairs and the forecast minus observation difference fields for each verification region and variable type/level requested in the configuration file. The Wavelet-Stat tool creates a NetCDF file summarizing the wavelet decomposition of the forecast and observation fields for each variable type/level, raw threshold, and tile masking region chosen. The generation of these files is controlled by configuration files or command line switches. As discussed in the previous section, the Pcp-Combine and Gen-Poly-Mask tools create gridded NetCDF output as well, while the PB2NC and ASCII2NC tools create intermediate NetCDF files containing point observations.

The MODE and Wavelet-Stat tools produce PostScript plots summarizing the features-based approach used in the verification. The PostScript plots are generated using internal libraries and do not depend on an external plotting package. The MODE plots contain several summary pages at the beginning, but the total number of pages will depend on the merging options chosen. Additional pages will be created if merging is performed using the double thresholding or fuzzy engine merging techniques for the forecast and observation fields. The number of pages in the Wavelet-Stat plots depends on the number of masking tiles used and the dimension of those tiles. The first summary page is followed by plots for the wavelet decomposition of the forecast and observation fields. The generation of these PostScript output files can be disabled using command line options.


3.4 Data format summary

The following is a summary of the input and output formats for each of the tools currently in MET. The output listed is the maximum number of possible output files. Generally, the type of output files generated can be controlled by the configuration files and/or the command line options:

1. PB2NC Tool
   Input: One PrepBufr point observation file and one configuration file.
   Output: One NetCDF file containing the observations that have been retained.

2. ASCII2NC Tool
   Input: One ASCII point observation file that has been formatted as expected.
   Output: One NetCDF file containing the reformatted observations.

3. Pcp-Combine Tool
   Input: Two or more gridded model or observation files in GRIB1 format containing accumulated precipitation to be combined to create a new accumulation interval.
   Output: One NetCDF file containing the summed accumulation interval.

4. Gen-Poly-Mask Tool
   Input: One gridded model or observation file in GRIB1 format and one ASCII file defining a Lat/Lon masking polyline.
   Output: One NetCDF file containing a bitmap for the masking region defined by the polyline over the domain of the gridded input file.

5. Ensemble Stat Tool
   Input: An arbitrary number of gridded model files in GRIB1 format and one or more optional files containing observations. The observations may be in either NetCDF or GRIB1 format. Point and gridded observations are both accepted.
   Output: One NetCDF file containing requested ensemble forecast information and, where applicable, rank histogram information.

6. Point-Stat Tool
   Input: One model file either in GRIB1 format or in the NetCDF format output from the Pcp-Combine tool, at least one point observation file in NetCDF format (as the output of the PB2NC or ASCII2NC tool), and one configuration file.
   Output: One STAT file containing all of the requested line types, and several ASCII files for each line type requested.

7. Grid-Stat Tool
   Input: One model file and one observation file either in GRIB1 format or in the NetCDF format output from the Pcp-Combine tool, and one configuration file.
   Output: One STAT file containing all of the requested line types, several ASCII files for each line type requested, and one NetCDF file containing the matched pair data and difference field for each verification region and variable type/level being verified.

8. MODE Tool
   Input: One model file and one observation file either in GRIB1 format or in the NetCDF format output from the Pcp-Combine tool, and one or two configuration files.
   Output: One ASCII file containing contingency table counts and statistics, one ASCII file containing single and pair object attribute values, one NetCDF file containing object indices for the gridded simple and cluster object fields, and one PostScript plot containing a summary of the features-based verification performed.

9. Wavelet-Stat Tool
   Input: One model file and one gridded observation file either in GRIB1 format or in the NetCDF format output from the Pcp-Combine tool, and one configuration file.
   Output: One STAT file containing the ISC line type, one ASCII file containing intensity-scale information and statistics, one NetCDF file containing information about the wavelet decomposition of forecast and observed fields and their differences, and one PostScript file containing plots and summaries of the intensity-scale verification.

10. Stat-Analysis Tool
    Input: One or more STAT files output from the Point-Stat and/or Grid-Stat tools and, optionally, one configuration file containing specifications for the analysis job(s) to be run on the STAT data.
    Output: ASCII output of the analysis jobs will be printed to the screen unless redirected to a file using the -out option.

11. MODE-Analysis Tool
    Input: One or more MODE object statistics files from the MODE tool and, optionally, one configuration file containing specifications for the analysis job(s) to be run on the object data.
    Output: ASCII output of the analysis jobs will be printed to the screen unless redirected to a file using the -out option.

3.5 PB2NC tool

This section describes how to configure and run the PB2NC tool. The PB2NC tool is used to stratify the contents of an input PrepBufr point observation file and reformat it into NetCDF format for use by the Point-Stat tool. The PB2NC tool must be run on the input PrepBufr point observation file prior to performing verification using the Point-Stat tool.

Please note that in earlier versions of the PB2NC tool, users were required to run their PrepBufr files through the cwordsh tool to perform Fortran-blocking on their PrepBufr files prior to running them through PB2NC. That step is no longer required, since the Fortran-blocking is now done internally.

3.5.1 pb2nc usage

The usage statement for the PB2NC tool is shown below:

Usage: pb2nc
       prepbufr_file
       netcdf_file
       config_file
       [-pbfile prepbufr_file]
       [-valid_beg time]
       [-valid_end time]
       [-nmsg n]
       [-dump path]
       [-v level]

pb2nc has three required arguments and can take up to six optional ones.

Required arguments for pb2nc

    1. The prepbufr_file argument indicates the name of the PrepBufr file to be processed.

    2. The netcdf_file argument indicates the name given to the output NetCDF file.


    3. The config_file argument indicates the name of the configuration file to be used. The contents of the configuration file are discussed below.

    Optional arguments for pb2nc

1. The -pbfile prepbufr_file option may be used to pass additional PrepBufr files to the PB2NC tool.

2. The -valid_beg time option, in YYYYMMDD[_HH[MMSS]] format, sets the beginning of the retention time window.

3. The -valid_end time option, in YYYYMMDD[_HH[MMSS]] format, sets the end of the retention time window.

4. The -nmsg num_messages option may be used for testing purposes. This argument indicates that only the first num_messages PrepBufr messages should be processed rather than the whole file. This option is provided to speed up testing, because running the PB2NC tool can take a few minutes for each file. Most users will not need this option.

5. The -dump path option may be used to dump the entire contents of the PrepBufr file to several ASCII files written to the directory specified by path. The user may use this option to view a human-readable version of the input PrepBufr file, although writing the contents to ASCII files can be slow.

6. The -v level option indicates the desired level of verbosity. The value of level will override the default setting of 1. Setting the verbosity to 0 will make the tool run with no log messages, while increasing the verbosity above 1 will increase the amount of logging.

An example of the pb2nc calling sequence is shown below:

pb2nc sample_pb.blk sample_pb.nc PB2NCConfig

In this example, the PB2NC tool will process the input sample_pb.blk file, applying the configuration specified in the PB2NCConfig file, and write the output to a file named sample_pb.nc.
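As a further illustration, a hypothetical call that restricts the retention time window and increases the logging might look like the following (the file names here are invented, not from the MET distribution):

pb2nc ndas.20070401.t12z.prepbufr ndas_20070401_12.nc PB2NCConfig -valid_beg 20070401_11 -valid_end 20070401_13 -v 2

Here, only PrepBufr messages with valid times between 11Z and 13Z on 2007/04/01 would be retained, in addition to any filtering requested in the configuration file.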

3.5.2 pb2nc configuration file

The default configuration file for the PB2NC tool named PB2NCConfig_default can be found in the data/config directory in the MET distribution. The version used for the example run in Chapter 2 is available in scripts/config. It is recommended that users make a copy of these files prior to modifying their contents. Each configuration file contains many comments describing its contents.

When editing configuration files, environment variables may be used for setting the configurable parameters if convenient. The configuration file parser expands any environment variables to their full value before proceeding. Within the configuration file, environment variables must be specified in the form ${VAR_NAME}. For example, using an environment variable to set the message_type (see below) parameter to use ADPUPA and ADPSFC message types might consist of the following:

In a C-Shell:

setenv MSG_TYP 'ADPUPA, ADPSFC'

In the configuration file:

message_type[] = [ ${MSG_TYP} ];

The example script for running MODE included in section 10.2 provides another example of using environment variables in configuration files. The contents of the default pb2nc configuration file found in data/config are described in the subsections below.

message_type[] = [];

Each PrepBufr message is tagged with one of eighteen message types as listed in the configuration file. The message_type refers to the type of observation from which the observation value (or report) was derived. The user may specify a comma-separated list of message types to be retained. Providing an empty list indicates that all message types should be retained.
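For example, to retain only upper-air and surface land reports, the list might be set as follows (a minimal sketch; the quoting convention is assumed to follow the distributed default configuration file):

message_type[] = [ "ADPUPA", "ADPSFC" ];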

station_id[] = [];

Each PrepBufr message has a station identification string associated with it. The user may specify a comma-separated list of station IDs to be retained. Providing an empty list indicates that messages from all station IDs will be retained.

beg_ds = -5400;
end_ds = 5400;

Each PrepBufr file has an observation time associated with it. Every PrepBufr message within the file has a time offset defined relative to that file's observation time. The beg_ds and end_ds variables define a time window around the file's observation time for PrepBufr messages that should be retained. beg_ds indicates how many seconds relative to the file's observation time to begin retaining observations to be used for verification (the negative sign indicates this window begins prior to the time assigned to the PrepBufr file). end_ds indicates how many seconds after the file's time to stop retaining observations for verification. The time window shown above is +/- 1.5 hours (+/- 5400 seconds) around the file observation time.
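For instance, narrowing the retention window to +/- 30 minutes (+/- 1800 seconds) around the file observation time would be requested as:

beg_ds = -1800;
end_ds = 1800;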

mask_grid = "";
mask_poly = "";

The mask_grid and mask_poly variables are used to define a spatial masking region for retaining observations. mask_grid may be set to one of the pre-defined NCEP grids, which are specified as GNNN, where NNN is the three-digit designation for the grid. mask_poly may be set to a pre-defined or a user-created file consisting of a name for the polygon followed by a series of lat/lon points used to define a masking region. If a masking region is specified, only observations falling inside the region will be retained. Refer to Appendix B for a list of the grids available for mask_grid and pre-defined polylines for mask_poly.
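As an illustration, retaining only observations that fall within NCEP grid 212 could be requested as follows (the grid number here is just an example; see Appendix B for the full list):

mask_grid = "G212";
mask_poly = "";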

beg_elev = -1000;
end_elev = 100000;

The beg_elev and end_elev variables are used to stratify the elevation (in meters) of the observations to be retained. The range shown above is set to -1000 to 100000 meters, which essentially retains every observation.

pb_report_type[] = [];
in_report_type[] = [];
instrument_type[] = [];

The pb_report_type, in_report_type, and instrument_type variables are used to specify comma-separated lists of PrepBufr report types, input report types, and instrument types to be retained, respectively. If left empty, all PrepBufr report types, input report types, and instrument types will be retained.

beg_level = 1;
end_level = 255;

The beg_level and end_level variables are used to stratify the model level of observations to be retained. The range shown above is 1 to 255, which is the current maximum possible level.

obs_grib_code[] = [SPFH, TMP, HGT, UGRD, VGRD];

Each PrepBufr message will likely contain multiple observation variables. The obs_grib_code variable is used to specify which observation variables are to be retained or derived; either the GRIB code itself or the corresponding abbreviation may be used. The following GRIB codes may be derived: DPT, WIND, RH, MIXR, and PRMSL for dewpoint, wind speed, relative humidity, mixing ratio, and pressure reduced to MSL, respectively. The list of GRIB codes shown above indicates that specific humidity, temperature, height, and the u and v components of the wind are to be retained.
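To additionally derive dewpoint and wind speed from the retained variables, the list could be extended as follows (a sketch, using the same list style as the excerpt above):

obs_grib_code[] = [SPFH, TMP, HGT, UGRD, VGRD, DPT, WIND];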

quality_mark_thresh = 2;

Each observation has a quality mark value associated with it. The quality_mark_thresh is used to stratify out which quality marks will be retained. The value shown above indicates that only observations with quality marks less than or equal to 2 will be retained.

event_stack_flag = 1;

A PrepBufr message may contain duplicate observations with different quality mark values. The event_stack_flag indicates whether to use the observations at the top of the event stack (observation values have had more quality control processing applied) or the bottom of the event stack (observation values have had no quality control processing applied). The flag value of 1 listed above indicates that the observations with the most quality control processing applied should be used.

level_category[] = [];

The level_category variable is used to specify a comma-separated list of PrepBufr data level categories to retain. An empty list indicates that all level categories should be retained. Accepted values and their meanings are described in Table 3-1 below. These represent the same categories available from http://www.emc.ncep.noaa.gov/mmb/data_processing/prepbufr.doc/table_1.htm

Table 3-1. Values for the level_category option.

Level category value   Description
0                      Surface level
1                      Mandatory level
2                      Significant temperature level
3                      Winds-by-pressure level
4                      Winds-by-height level
5                      Tropopause level
6                      Reports on a single level
7                      Auxiliary levels generated via interpolation from spanning levels

tmp_dir = "/tmp";

The tmp_dir parameter indicates where temporary files should be written.

output_prefix = "";

This option specifies a string to be used in the output file name. It can be useful for keeping results for different models or variables from overwriting each other.

version = "V3.0";

The version indicates the version of the pb2nc configuration file used. Future versions of MET may include changes to pb2nc and the pb2nc configuration file. This value should not be modified.

3.5.3 pb2nc output

Each NetCDF file generated by the PB2NC tool contains the dimensions and variables shown in the following tables.

Table 3-2. NetCDF file dimensions for pb2nc output.

NetCDF Dimension   Description
mxstr              Maximum string length (16)
hdr_arr_len        Number of entries in each PrepBufr message header array (3)
obs_arr_len        Number of entries in each PrepBufr observation array (5)
nobs               Number of PrepBufr observations in the file (UNLIMITED)
nhdr               Number of PrepBufr messages in the file (variable)

Table 3-3. NetCDF variables in pb2nc output.

NetCDF Variable   Dimension            Description
obs_arr           nobs, obs_arr_len    Array of floats containing values for each observation, including: the reference to the entry in hdr_arr with which this observation is associated; the GRIB code corresponding to this observation type; the pressure level in hPa or accumulation interval; the height in meters above sea level; and the observation value.
hdr_typ           nhdr, mxstr          Text string containing the message type for each PrepBufr message.
hdr_sid           nhdr, mxstr          Text string containing the station id for each PrepBufr message.
hdr_vld           nhdr, mxstr          Text string containing the observation valid time for each PrepBufr message in YYYYMMDD_HHMMSS format.
hdr_arr           nhdr, hdr_arr_len    Array of floats containing values for each PrepBufr message, including: latitude in degrees north; longitude in degrees east; and elevation in meters above sea level.
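The contents of a PB2NC output file can be inspected with the standard NetCDF utilities. For example, assuming an output file named sample_pb.nc, the following command prints the header, listing the dimensions and variables described in Tables 3-2 and 3-3:

ncdump -h sample_pb.nc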

3.6 ASCII2NC tool

This section describes how to run the ASCII2NC tool. The ASCII2NC tool is used to reformat ASCII point observations into the NetCDF format expected by the Point-Stat tool. For those users wishing to verify against point observations that are not available in PrepBufr format, the ASCII2NC tool provides a way of incorporating those observations into MET. Since the ASCII2NC tool simply performs a reformatting step, no configuration file is needed.

The initial version of the ASCII2NC tool supports a single input ASCII point observation format consisting of 10 columns of data for each observation value. The ASCII2NC tool may be enhanced in future releases of MET to support additional ASCII point observation formats directly, based on community input and resource availability.

The input ASCII point observation format consists of one row of data per observation value. Each row of data consists of 10 columns, as shown in the following table.

ascii2nc ASCII Point Observation Format

Column   Name                Description
1        Message_Type        Text string containing the observation message type as described in the previous section on the PB2NC tool.
2        Station_ID          Text string containing the station id.
3        Valid_Time          Text string containing the observation valid time in YYYYMMDD_HHMMSS format.
4        Lat                 Latitude in degrees north of the observing location.
5        Lon                 Longitude in degrees east of the observing location.
6        Elevation           Elevation in meters above sea level of the observing location.
7        Grib_Code           Integer GRIB code value corresponding to this observation type.
8        Level               Pressure level in hPa or accumulation interval in hours for the observation value.
9        Height              Height in meters above sea level of the observation value.
10       Observation_Value   Observation value in units consistent with the GRIB code definition.
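For illustration, a single upper-air temperature observation in this format might look like the following hypothetical row (the station id and values are invented; GRIB code 11 corresponds to temperature):

ADPUPA 72469 20070331_120000 39.77 -104.88 1611.0 11 500 5572.0 252.55

This row describes a 252.55 K temperature observation at the 500 hPa level (at a height of 5572 m above sea level) from an upper-air site located at 39.77 N, 104.88 W.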

3.6.1 ascii2nc usage

Once the ASCII point observations have been formatted as expected, the ASCII file is ready to be processed by the ASCII2NC tool. The usage statement for the ASCII2NC tool is shown below:

Usage: ascii2nc
       ascii_file
       netcdf_file
       [-format ASCII_format]
       [-v level]

ascii2nc has two required arguments and can take up to two optional ones.

Required arguments for ascii2nc

    1. The ascii_file argument indicates the name of the ASCII point observation file to be processed.

    2. The netcdf_file argument indicates the name given to the output NetCDF file.

    Optional arguments for ascii2nc


1. The -format ASCII_format option will be used in future releases of MET to define the ASCII point observation format contained in the ASCII point observation file. Since the ASCII2NC tool currently only reads one point observation format, users will not need to specify this argument.

2. The -v level option indicates the desired level of verbosity. The value of level will override the default setting of 1. Setting the verbosity to 0 will make the tool run with no log messages, while increasing the verbosity above 1 will increase the amount of logging.

An example of the ascii2nc calling sequence is shown below:

ascii2nc sample_ascii_obs.txt sample_ascii_obs.nc

In this example, the ASCII2NC tool will reformat the input sample_ascii_obs.txt file into NetCDF format and write the output to a file named sample_ascii_obs.nc.

3.7 Pcp-Combine tool

This section contains a description of running the Pcp-Combine tool. The Pcp-Combine tool is used (if needed) to modify the precipitation accumulation intervals from two or more GRIB files into a single NetCDF file containing the desired accumulation interval, for input to the MET statistics tools. Use of Pcp-Combine on a single file will result in that file being written out in NetCDF format, with no changes to its content.

The GRIB files being combined must have already been placed on the grid on which the user would like to verify. The copygb utility is recommended for re-gridding GRIB files. In addition, the Pcp-Combine tool will only operate on files with the same initialization time, unless it is directed to ignore the initialization time.

3.7.1 pcp_combine usage

The usage statement for the Pcp-Combine tool is shown below:

Usage: pcp_combine
       [[-sum] sum_args] | [-add add_args] | [-subtract subtract_args]
       [-gc code]
       [-ptv number]
       [-v level]

The arguments to pcp_combine vary depending on the mode in which it is run. Listed below are the arguments for the sum command:


SUM_ARGS:
       init_time
       in_accum
       valid_time
       out_accum
       out_file
       [-pcpdir path]
       [-pcprx reg_exp]

Listed below are the arguments for the add command:

ADD_ARGS:
       in_file1
       accum1
       [in_file2 accum2 in_file3 accum3 . . . ]
       out_file

Listed below are the arguments for the subtract command:

SUBTRACT_ARGS:
       in_file1
       accum1
       in_file2
       accum2
       out_file

Required arguments for pcp_combine

1. The Pcp-Combine tool must be run with exactly one of the -sum, -add, or -subtract command line arguments with the corresponding additional arguments.

    Optional arguments for pcp_combine

    1. The -gc code option may be used to override the default GRIB code value of 61 for accumulated precipitation.

    2. The -ptv number option may be used to specify which GRIB parameter table version number should be used for interpreting the meaning of GRIB codes.

    3. The -v level option indicates the desired level of verbosity. The contents of level will override the default setting of 1. Setting the verbosity to 0 will make the tool run with no log messages, while increasing the verbosity above 1 will increase the amount of logging.

    Required arguments for the pcp_combine sum command


1. The init_time argument, provided in YYYYMMDD[_HH[MMSS]] format, indicates the initialization time for model data to be summed. Only files found with this initialization time will be processed. If combining observation files, Stage II or Stage IV data for example, the initialization time is not applicable. Providing a string of all zeros (00000000_000000) indicates that all files, regardless of initialization time, should be processed.

2. The in_accum argument, provided in HH[MMSS] format, indicates the accumulation interval of the model or observation GRIB files to be processed. This value must be specified, since a model output file may contain multiple accumulation periods for precipitation in a single file. The argument indicates which accumulation period to extract.

3. The valid_time argument, in YYYYMMDD[_HH[MMSS]] format, indicates the desired valid time to which the accumulated precipitation is to be summed.

4. The out_accum argument, in HH[MMSS] format, indicates the desired total accumulation period to be summed.

5. The out_file argument indicates the name for the NetCDF file to be written.

    Optional arguments for pcp_combine sum command

    1. The -pcpdir path option indicates the directories in which the input GRIB files reside. The contents of path will override the default setting.

2. The -pcprx reg_exp option indicates the regular expression to be used in matching files in the precipitation directory specified. The contents of reg_exp will override the default setting, which matches all file names. If the precipitation directory contains a large number of files, the user may specify that only a subset of those files be processed using a regular expression, which will speed up the run time.

    Required arguments for the pcp_combine add command

    1. The in_file1 argument indicates the first GRIB file to be processed.

    2. The in_accum1 argument, provided in HH format, indicates the accumulation interval to be extracted from the first GRIB file.

An arbitrary number of additional files and accumulations can be provided. All of them will be added, and the total will be placed in the output file, as shown in the example below.
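As an illustration, adding two consecutive 3-hourly accumulations into a 6-hour total might look like the following (the file names here are hypothetical):

pcp_combine -add fcst_06h.grb 3 fcst_09h.grb 3 sample_fcst_6h.nc

The two 3-hour accumulations are summed and written to the NetCDF file sample_fcst_6h.nc.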


Required arguments for the pcp_combine subtract command

1. The in_file1 argument indicates the first GRIB file to be processed.

2. The in_accum1 argument, provided in HH format, indicates the accumulation interval to be extracted from the first GRIB file.

3. The in_file2 argument indicates the second GRIB file to be processed.

4. The in_accum2 argument, provided in HH format, indicates the accumulation interval to be extracted from the second GRIB file. This accumulation will be subtracted from the first.
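For example, a 12-hour accumulation covering forecast hours 12 to 24 could be computed by subtracting a 12-hour accumulation from a 24-hour accumulation taken from the same model run (the file names here are hypothetical):

pcp_combine -subtract fcst_24h.grb 24 fcst_12h.grb 12 fcst_12_to_24.nc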

An example of the pcp_combine calling sequence is presented below:

Example 1:

pcp_combine -sum 20050807_000000 3 20050808_000000 24 sample_fcst.nc -pcpdir ../data/sample_fcst/2005080700

In Example 1, the Pcp-Combine tool will sum the values in model files initialized at 2005/08/07 00Z and containing 3-hourly accumulation intervals of precipitation. The requested valid time is 2005/08/08 00Z with a requested total accumulation interval of 24 hours. The output file is to be named sample_fcst.nc, and the Pcp-Combine tool is to search the directory indicated for the input GRIB files. The Pcp-Combine tool will search for 8 files containing 3-hourly accumulation intervals which meet the criteria specified. It will write out a single NetCDF file containing that 24 hours of accumulation.

A second example of the pcp_combine calling sequence is presented below:

Example 2:

pcp_combine -sum 00000000_000000 1 20050808_000000 24 sample_obs.nc -pcpdir ../data/sample_obs/ST2ml

Example 2 shows an example of using the Pcp-Combine tool to sum observation data. The init_time has been set to all zeros to indicate that, when searching through the files in the precipitation directory, the initialization time should be ignored. The