16
A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University of Illinois, Urbana-Champaign

A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

Embed Size (px)

Citation preview

Page 1: A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

A High performance I/O Module: the HDF5 WRF I/O module

 Muqun Yang, Robert E. McGrath, Mike Folk

National Center for Supercomputing ApplicationsUniversity of Illinois, Urbana-Champaign

Page 2: A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

Part of NSF-funded Alliance MEAD Expedition project

• The MEAD expedition will develop and adapt a cyberinfrastructure that will enable simulation, data mining, and visualization of atmospheric phenomena

Page 3: A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

Goals

• Provide a new IO module for WRF

• Evaluate the IO performance of parallel HDF5 with real scientific application

Page 4: A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

What is HDF?

• Format and software for scientific data• Stores images, multidimensional arrays, tables,

etc.• Emphasis on storage and high performance I/O• Free and commercial software support• Emphasis on standards• Users from many engineering and scientific fields

Page 5: A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

HDF4 vs HDF5

• HDF4 - Based on original 1988 version of HDF– Backwardly compatible with all earlier versions– 6 basic objects

• raster image, multidimensional array (SDS), palette, group (Vgroup), table (Vdata), annotation

• HDF5– New format & library - not compatible with HDF4– 2 basic objects

Page 6: A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

HDF supporters and users• NASA Earth Science Data & Info System

– Migrate from HDF4 to HDF5– HDF-EOS library on HDF5

– Aura and NPOESS will use HDF5

• DOE Advanced Simulation Computing(ASC)– Open standard exchange format and high performance I/O library– DOE tri-lab ASC applications

• And many other scientific applications

Check URL at http://hdf.ncsa.uiuc.edu/users5.html

Page 7: A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

HDF5 Features

• Dataset larger than 2 Giga-Byte• Any number of Objects per file• Alternative storage layout• Parallel IO (MPI-IO)

Page 8: A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

compressed

extendable

Metadata for Fred

Dataset “Fred”

File AFile A

File BFile B

Data for FredData for Fred

Special Storage Options

chunked

compressed

extendable

Split file

Better subsettingAccess time;extendable

Improves storage efficiency, Transmission speed

Arrays can be extendedin any direction

Metadata in one file,raw data in another

Page 9: A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

Now WRF-HDF5 module work

Page 10: A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

/

WRF_DATA Dim_group

TKE RHO_U

ETC…

TimeTop_bot

Lat Lon

Dimensional scale table

Schematic HDF5 File Structure of WRF Output

HDF5 dataset(WRF field) HDF5 group(WRF dataset)

Solid line: HDF5 datasets or sub-groups (the arrow points to) that are members of the HDF5 parent group.Dash line: the association of one HDF5 object to another HDF5 object;in terms of HDF5, object reference.

Page 11: A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

Follow WRF Standard

• Standard WRF Configuration (Similar as NetCDF)• Use WRF common APIs• Application can select an IO module

Facts about WRF-HDF5 module

Page 12: A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

A complete WRF-HDF5 sequential module

• Have both reader and writer

• Has been tested on NCSA Linux cluster,

PC linux, NCAR IBM SP(blackforest), NCSA IBM P690

• Available for friendly users

Page 13: A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

Parallel HDF5 module

We are in the process of developing Parallel HDF5 module now!

• Another WRF-IO module• Read and write the same HDF5 file as the

sequential module• Use parallel IO

Page 14: A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

Future work

http://hdf.ncsa.uiuc.edu/apps/WRF-ROMS/Email: [email protected]

As part of MEAD, WRF and the Regional Ocean Modeling System (ROMS) are being coupled and HDF5 I/O is being implemented in both models.

Page 15: A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

Acknowledgements:

The WRF-HDF5 module is being developed as part of the NSF funded MEAD at NCSA. The MEAD (http://www.ncsa.uiuc.edu/Expeditions/MEAD/) is one of six expeditions that will developAnd adapt a cyberinfrastructure that will enable simulation, data mining, And visualization of atmospheric phenomena.

Page 16: A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University

Some facts related to WRF-HDF5 module

WRF-HDF5

module

Sequential Computing

Sequential

IO

Sequential

HDF5

Parallel

Computing

Sequential

IO

Sequential

HDF5

Parallel

Computing

Parallel

IO

Parallel

HDF5