CSC 7600 Lecture 23: Parallel File I/O 3, Spring 2011
HIGH PERFORMANCE COMPUTING: MODELS, METHODS, & MEANS
PARALLEL FILE I/O 3
LIBRARIES 2
Prof. Thomas Sterling
Prof. Hartmut Kaiser
Department of Computer Science
Louisiana State University
April 12th, 2011
Puzzle of the Day
I thought the following C program is perfectly valid (after reading about the comma operator in C). But there is a mistake in the following program, can you identify it?
#include <stdio.h>
int main() {
    int a = 1, 2;
    printf("a : %d\n", a);
    return 0;
}
Puzzle of the Day (Solution)
#include <stdio.h>
int main() {
    int a = (1, 2);
    printf("a : %d\n", a);
    return 0;
}
Inside a declaration, the comma separates declarators instead of acting as the comma operator, so int a = 1, 2; is read as the declarator a = 1 followed by the invalid declarator 2, and the program does not compile. As an operator, the comma has the lowest precedence, even lower than assignment. Parenthesizing the initializer, as above, forces a comma expression, and a is initialized to 2.
Topics
• Introduction
• Scientific I/O Interface: netCDF
• Scientific Data Package: HDF5
• Application domain specific libraries
• SMP programming support: CILK, TBB
NetCDF: Introduction
• Stands for Network Common Data Form
• Portable format to represent scientific data
• Developed at the Unidata Program Center in Boulder, Colorado, with many contributions from user community
• Project page hosted by the Unidata program at University Corporation for Atmospheric Research (UCAR): http://www.unidata.ucar.edu/software/netcdf/
• Provides a set of interfaces for array-oriented data access and a collection of data access libraries for C, Fortran (77 and 90), C++, Java, Perl, Python, and other languages
• Available on UNIX and Windows platforms
• Features simple programming interface
• Supports large data files (and 64-bit offsets)
• Open source, freely available
• Commonly used file extension is “.nc” (changed from “.cdf” to avoid confusion with other formats)
• Current stable release is version 4.1.2 (released on March 29, 2011)
• Used extensively by a number of climate modeling, land and atmosphere, marine, naval data storage, satellite data processing, theoretical physics centers, geological institutes, commercial analysis, universities, as well as other research institutions in over 30 countries
NetCDF Rationale
• To facilitate the use of common datasets by distinct applications
• Permit datasets to be transported between or shared by dissimilar computers transparently, i.e., without translation (automatic handling of different data types, endian-ness, etc.)
• Reduce the programming effort usually spent interpreting formats
• Reduce errors arising from misinterpreting data and ancillary data
• Facilitate using output from one application as input to another
• Establish an interface standard which simplifies the inclusion of new software into already existing application set (originally: Unidata system)
• However: not another DBMS!
Key Properties of NetCDF Format
• Self-describing
– A netCDF file includes information about the data it contains
• Portable
– Files are accessible by computers that use different ways of representing and storing of integers, floating-point numbers and characters
• Direct-access
– Enabling an efficient access to small subsets of a large dataset without the need to read through all preceding data
• Appendable
– Additional data may be appended to a properly structured netCDF file without copying the dataset or redefining its structure
• Sharable
– One writer and multiple readers may simultaneously access the same netCDF file
• Archivable
– Access to all earlier forms of netCDF data will be supported by current and future versions of the software
NetCDF Dataset Building Blocks
• Data in netCDF are represented as n-dimensional arrays, with n being 0, 1, 2, … (scalars are 0-dimensional arrays)
• Array elements are of the same data type
• Three basic entities:
– Dimension: has name and length; one dimension per array may be UNLIMITED for unbounded arrays
– Variable: identifies array of values of the same type (byte, character, short, int, float, or double)
• In addition, coordinate variables may be named identically to dimensions, and by convention define physical coordinate set corresponding to that dimension
– Attribute: provides additional information about a variable, or global properties of a dataset
• There are established conventions for attribute names, e.g., units, long_name, valid_range, etc.
• Multiple attributes per dataset are allowed
• The only kind of data structures supported by netCDF classic are collections of named arrays with attached vector attributes
Common Data Form Language (CDL)
• NetCDF uses CDL to provide a way to describe data model
• CDL represents the information stored in binary netCDF files in a human-readable form, e.g.:
netcdf example_1 { // example of CDL notation for a netCDF dataset
dimensions: // dimension names and lengths are declared first
    lat = 5, lon = 10, level = 4, time = unlimited;
variables: // variable types, names, shapes, attributes
    float temp(time,level,lat,lon);
        temp:long_name = "temperature";
        temp:units = "celsius";
    int lat(lat), lon(lon), level(level);
        lat:units = "degrees_north";
        lon:units = "degrees_east";
        level:units = "millibars";
    short time(time);
        time:units = "hours since 1996-1-1";
    // global attributes
    :source = "Fictional Model Output";
data: // optional data assignments
    level = 1000, 850, 700, 500;
    lat = 20, 30, 40, 50, 60;
    lon = -160,-140,-118,-96,-84,-52,-45,-35,-25,-15;
    time = 12;
}
NetCDF Utilities
• ncgen
– takes input in CDL format and creates a netCDF file, or a C or Fortran program that creates a netCDF dataset
ncgen [-b] [-o netcdf-file] [-c] [-f] [-k kind] [-x] [input-file]
• ncdump
– generates the CDL text representation of a netCDF dataset on standard output, optionally including some or all variable data
– Output from ncdump is an acceptable input to ncgen
ncdump [-c|-h] [-v var1,…] [-b lang] [-f lang] [-l len] [-p fdig[,ddig]] [-n name] [-k] [input-file]
NetCDF API: Create a Dataset
Function: nc_create()
int nc_create(const char *path, int cmode, int *id);
Description: Creates a new dataset returning its id that can be used in subsequent calls. The file name for the dataset is specified in path. The cmode argument determines creation mode, and may contain zero or more of the following flags or’d: NC_NOCLOBBER (to avoid overwriting existing files), NC_SHARE (limits buffering in scenarios where one or more other processes concurrently read the file being updated by a single writer process), NC_64BIT_OFFSET (create a file with 64-bit offsets). The default zero value is aliased to NC_CLOBBER, i.e. no overwrite protection for existing files. On success NC_NOERR is returned.
#include <netcdf.h>
...
int status;
int ncid;
...
status = nc_create("foo.nc", NC_NOCLOBBER, &ncid);
if (status != NC_NOERR) handle_error(status);
NetCDF API: Open a Dataset
Function: nc_open()
int nc_open(const char *path, int omode, int *id);
Description: Opens an existing dataset stored in a file identified by path, returning its id. The omode argument may contain zero or more of the following flags or’d: NC_WRITE (to open in read/write mode), NC_SHARE (same meaning as for nc_create). The default (zero) is aliased to NC_NOWRITE, which opens the file in read-only mode without sharing. On success NC_NOERR is returned.
#include <netcdf.h>
...
int status;
int ncid;
...
status = nc_open("foo.nc", 0, &ncid);
if (status != NC_NOERR) handle_error(status);
NetCDF API: Create a Dimension
Function: nc_def_dim()
int nc_def_dim(int id, const char *name, size_t len, int *dimid);
Description: Adds a new dimension to an open dataset identified by id. The dimension name is pointed to by name, and its length, a positive integer or constant NC_UNLIMITED, is passed in len. On success NC_NOERR is returned and dimension id is stored in *dimid.
#include <netcdf.h>
...
int status, ncid, latid, recid;
...
status = nc_create("foo.nc", NC_NOCLOBBER, &ncid);
if (status != NC_NOERR) handle_error(status);
...
status = nc_def_dim(ncid, "lat", 18L, &latid);
if (status != NC_NOERR) handle_error(status);
status = nc_def_dim(ncid, "rec", NC_UNLIMITED, &recid);
if (status != NC_NOERR) handle_error(status);
NetCDF API: Create a Variable
Function: nc_def_var()
int nc_def_var(int id, const char *name, nc_type xtype, int ndims, const int dimids[], int *varid);
Description: Adds a new variable with name pointed to by name to an open dataset identified by id. The new variable id is stored in *varid. xtype defines the external data type, and must be one of: NC_BYTE, NC_CHAR, NC_SHORT, NC_INT, NC_FLOAT, or NC_DOUBLE. The arguments ndims and dimids specify respectively the number of dimensions and their ids. On success NC_NOERR is returned.
#include <netcdf.h>
int status, ncid;               /* error status and dataset ID */
int lat_dim, lon_dim, time_dim; /* dimension IDs */
int rh_id, rh_dimids[3];        /* variable ID and shape */
...
status = nc_create("foo.nc", NC_NOCLOBBER, &ncid);
if (status != NC_NOERR) handle_error(status);
/* define dimensions */
status = nc_def_dim(ncid, "lat", 5L, &lat_dim);
if (status != NC_NOERR) handle_error(status);
status = nc_def_dim(ncid, "lon", 10L, &lon_dim);
if (status != NC_NOERR) handle_error(status);
status = nc_def_dim(ncid, "time", NC_UNLIMITED, &time_dim);
if (status != NC_NOERR) handle_error(status);
/* define variable */
rh_dimids[0] = time_dim;
rh_dimids[1] = lat_dim;
rh_dimids[2] = lon_dim;
status = nc_def_var(ncid, "rh", NC_DOUBLE, 3, rh_dimids, &rh_id);
if (status != NC_NOERR) handle_error(status);
NetCDF API: Leave Define Mode
Function: nc_enddef()
int nc_enddef(int id);
Description: Finalizes define mode and commits to disk changes made to the dataset. Returns NC_NOERR on success.
#include <netcdf.h>
...
int status;
int ncid;
...
status = nc_create("foo.nc", NC_NOCLOBBER, &ncid);
if (status != NC_NOERR) handle_error(status);
...
/* create dimensions, variables, attributes */
...
status = nc_enddef(ncid); /* leave define mode */
if (status != NC_NOERR) handle_error(status);
NetCDF API: Querying Variable Information
Function: nc_inq_varid(), nc_inq_var*()
int nc_inq_varid(int id, const char *name, int *varid);
int nc_inq_var(int id, int varid, char *name, nc_type *xtype, int *ndims, int dimids[], int *natts);
int nc_inq_varname(int id, int varid, char *name);
int nc_inq_vartype(int id, int varid, nc_type *xtype);
int nc_inq_varndims(int id, int varid, int *ndims);
int nc_inq_vardimid(int id, int varid, int dimids[]);
int nc_inq_varnatts(int id, int varid, int *natts);
Description: The first function returns in *varid the ID of the variable identified by name in dataset id.
The second function returns information about the variable identified by varid, including its name (null terminated, in the area pointed to by name), type (in *xtype), number of dimensions (in *ndims), dimension IDs (in dimids[]), and number of attributes (in *natts). The buffer to store the variable name has to be allocated by the user and should be at least NC_MAX_NAME+1 characters long if the name size is not known in advance. NC_NOERR is returned on success in both calls.
The remaining functions retrieve individual pieces of information, all of which nc_inq_var() returns in a single call.
NetCDF API: Variable Information (Example)
#include <netcdf.h>
...
int status;                     /* error status */
int ncid;                       /* netCDF ID */
int rh_id;                      /* variable ID */
nc_type rh_type;                /* variable type */
int rh_ndims;                   /* number of dims */
int rh_dimids[NC_MAX_VAR_DIMS]; /* dimension ids */
int rh_natts;                   /* number of attributes */
...
status = nc_open("foo.nc", NC_NOWRITE, &ncid);
if (status != NC_NOERR) handle_error(status);
...
status = nc_inq_varid(ncid, "rh", &rh_id);
if (status != NC_NOERR) handle_error(status);
/* we don't need name, since we already know it */
status = nc_inq_var(ncid, rh_id, 0, &rh_type, &rh_ndims, rh_dimids, &rh_natts);
if (status != NC_NOERR) handle_error(status);
NetCDF API: Read a Variable
Function: nc_get_var_type()
int nc_get_var_text (int id, int varid, char *ptr);
int nc_get_var_uchar (int id, int varid, unsigned char *ptr);
int nc_get_var_schar (int id, int varid, signed char *ptr);
int nc_get_var_short (int id, int varid, short *ptr);
int nc_get_var_int (int id, int varid, int *ptr);
int nc_get_var_long (int id, int varid, long *ptr);
int nc_get_var_float (int id, int varid, float *ptr);
int nc_get_var_double(int id, int varid, double *ptr);
Description: Reads all the values from a netCDF variable referred to by varid of an open dataset with handle id. The dataset must be in data mode. The values of multidimensional arrays are read into consecutive memory locations with the last dimension varying fastest, starting at location pointed to by ptr. Because the library writes into this buffer, ptr is not const-qualified. Type conversion will occur if the type of data differs from the netCDF variable type. Returns NC_NOERR on success.
NetCDF API: Read a Variable (Example)
#include <netcdf.h>
...
#define TIMES 3
#define LATS 5
#define LONS 10
int status;                      /* error status */
int ncid;                        /* netCDF ID */
int rh_id;                       /* variable ID */
double rh_vals[TIMES*LATS*LONS]; /* array to hold values */
...
status = nc_open("foo.nc", NC_NOWRITE, &ncid);
if (status != NC_NOERR) handle_error(status);
...
status = nc_inq_varid(ncid, "rh", &rh_id);
if (status != NC_NOERR) handle_error(status);
...
/* read values from netCDF variable */
status = nc_get_var_double(ncid, rh_id, rh_vals);
if (status != NC_NOERR) handle_error(status);
NetCDF API: Write a Variable
Function: nc_put_var_type()
int nc_put_var_text (int id, int varid, const char *ptr);
int nc_put_var_uchar (int id, int varid, const unsigned char *ptr);
int nc_put_var_schar (int id, int varid, const signed char *ptr);
int nc_put_var_short (int id, int varid, const short *ptr);
int nc_put_var_int (int id, int varid, const int *ptr);
int nc_put_var_long (int id, int varid, const long *ptr);
int nc_put_var_float (int id, int varid, const float *ptr);
int nc_put_var_double(int id, int varid, const double *ptr);
Description: Writes all values of a possibly multidimensional variable referred to by varid to an open dataset with handle id. The location of the block of data values to be written is pointed to by ptr. The values may be implicitly converted to the external data type specified in variable definition. Returns NC_NOERR on success.
NetCDF API: Write a Variable (Example)
#include <netcdf.h>
...
#define TIMES 3
#define LATS 5
#define LONS 10
int status;                      /* error status */
int ncid;                        /* netCDF ID */
int rh_id;                       /* variable ID */
double rh_vals[TIMES*LATS*LONS]; /* array to hold values */
int i;
...
status = nc_open("foo.nc", NC_WRITE, &ncid);
if (status != NC_NOERR) handle_error(status);
...
status = nc_inq_varid(ncid, "rh", &rh_id);
if (status != NC_NOERR) handle_error(status);
...
for (i = 0; i < TIMES*LATS*LONS; i++)
    rh_vals[i] = 0.5;
/* write values into netCDF variable */
status = nc_put_var_double(ncid, rh_id, rh_vals);
if (status != NC_NOERR) handle_error(status);
NetCDF API: Close a Dataset
Function: nc_close()
int nc_close(int id);
Description: Closes an open dataset referred to by id. If the dataset is in define mode, nc_enddef() will be called implicitly. After close, the id value may be reassigned to another newly opened or created dataset. NC_NOERR is returned on success.
#include <netcdf.h>
...
int status;
int ncid;
...
status = nc_create("foo.nc", NC_NOCLOBBER, &ncid);
if (status != NC_NOERR) handle_error(status);
...
/* create dimensions, variables, attributes */
...
status = nc_close(ncid); /* close netCDF dataset */
if (status != NC_NOERR) handle_error(status);
Parallel NetCDF
Possible usage scenarios on parallel computers
– Serial netCDF to access single files from a single process
– Multiple files accessed concurrently and independently through serial netCDF API
– Parallel netCDF API to access single files cooperatively or collectively
Source: http://www-unix.mcs.anl.gov/parallel-netcdf/pnetcdf-sc2003.pdf
PnetCDF Implementation
• Available from: http://trac.mcs.anl.gov/projects/parallel-netcdf
• Library layer between user space and file system space
• Processes parallel I/O requests from compute nodes, optimizes them, and passes them down to the MPI-IO library
• Advantages:
– Optimized for the netCDF file format
– Regular and predictable data patterns in netCDF compatible with MPI-IO interface
– Low overhead of header I/O (local header copies viable)
– Well defined metadata creation phase
– No need for collective I/O when accessing individual objects
• Disadvantages:
– No hierarchical data layout
– Additions of data and header extensions are costly after file creation due to linear layout order
– No support for combining of multiple files in memory (like HDF5 software mounting)
– NetCDF source required for installation
Source: http://www-unix.mcs.anl.gov/parallel-netcdf/pnetcdf-sc2003.pdf
PnetCDF Sample Calling Sequence
Source: http://www-unix.mcs.anl.gov/parallel-netcdf/sc03_present.pdf
Topics
• Scientific I/O Interface: netCDF
• Scientific Data Package: HDF5
• Application domain specific libraries
• SMP programming support: CILK, TBB
Introduction to HDF5
• Acronym for Hierarchical Data Format, a portable, freely distributable, and well supported library, file format, and set of utilities to manipulate it
• Explicitly designed for use with scientific data and applications
• Initial HDF version was created at NCSA/University of Illinois at Urbana-Champaign in 1988
• First revision in widespread use was HDF4
• Main HDF features include:
– Versatility: supports different data models and associated metadata
– Self-describing: allows an application to interpret the structure and contents of a file without any extraneous information
– Flexibility: permits mixing and grouping various objects together in one file in a user-defined hierarchy
– Extensibility: accommodates new data models, added both by the users and developers
– Portability: can be shared across different platforms without preprocessing or modifications
• HDF5 is the most recent incarnation of the format, adding support for new type and data models, parallel I/O and streaming, and removing a number of existing restrictions (maximal file size, number of objects per file, flexibility of type use, storage management configurability, etc.), as well as improving the performance
HDF5 File Layout
• Major object classes: groups and datasets
• Namespace resembles file system directory hierarchy (groups ≡ directories, datasets ≡ files)
• Alias creation supported through links (both soft and hard)
• Mounting of sub-hierarchies is possible
Figure panels: user’s view; low-level organization
HDF5 API & Tools
Library functionality grouped by function name prefix
• H5: general purpose functions
• H5A: attribute interface
• H5D: dataset manipulation
• H5E: error handling
• H5F: file interface
• H5G: group creation and access
• H5I: object identifiers
• H5P: property lists
• H5R: references
• H5S: dataspace definition
• H5T: datatype manipulation
• H5Z: inline data filters and compression
Command-line utilities
• h5cc, h5c++, h5fc: C, C++ and Fortran compiler wrappers
• h5redeploy: updates compiler tools after installation in new location
• h5ls, h5dump: lists hierarchy and contents of a HDF5 file
• h5diff: compares two HDF5 files
• h5repack, h5repart: rearranges or repartitions a file
• h5toh4, h4toh5: converts between HDF5 and HDF4 formats
• h5import: imports data into HDF5 file
• gif2h5, h52gif: converts image data between gif and HDF5 formats
Basic HDF5 Concepts
• Group
– Structure containing zero or more HDF5 objects (possibly other groups)
– Provides a mechanism for mapping a name (path) to an object
– “Root” group is a logical container of all other objects in a file
• Dataset
– A named array of data elements (possibly multi-dimensional)
– Specifies the representation of the dataset the way it will be stored in HDF5 file through associated datatype and dataspace parameters
• Dataspace
– Defines dimensionality of a dataset (rank and dimension sizes)
– Determines the effective subset of data to be stored or retrieved in subsequent file operations (aka selection)
• Datatype
– Describes atomically accessed element of a dataset
– Permits construction of derived (compound) types, such as arrays, records, enumerations
– Influences conversion of numeric values between different platforms or implementations
• Attribute
– A small, user-defined structure attached to a group, dataset or named datatype, providing additional information
HDF5 Spatial Subset Examples
Source: http://hdf.ncsa.uiuc.edu/HDF5/RD100-2002/All_About_HDF5.pdf
HDF5 Virtual File Layer
Source: http://hdf.ncsa.uiuc.edu/HDF5/RD100-2002/All_About_HDF5.pdf
• Developed to cope with large number of available storage subsystem variations
• Permits custom file driver implementations and related optimizations
Overview of Data Storage Options
Source: http://hdf.ncsa.uiuc.edu/HDF5/RD100-2002/All_About_HDF5.pdf
Simultaneous Spatial and Type Transformation Example
Source: http://hdf.ncsa.uiuc.edu/HDF5/RD100-2002/All_About_HDF5.pdf
Simple HDF5 Code Example
/* Writing and reading an existing dataset. */
#include "hdf5.h"
#define FILE "dset.h5"

int main() {
    hid_t file_id, dataset_id; /* identifiers */
    herr_t status;
    int i, j, dset_data[4][6];

    /* Initialize the dataset. */
    for (i = 0; i < 4; i++)
        for (j = 0; j < 6; j++)
            dset_data[i][j] = i * 6 + j + 1;

    /* Open an existing file. */
    file_id = H5Fopen(FILE, H5F_ACC_RDWR, H5P_DEFAULT);
    /* Open an existing dataset (HDF5 1.8+ call; the 1.6 API used H5Dopen(file_id, "/dset")). */
    dataset_id = H5Dopen2(file_id, "/dset", H5P_DEFAULT);
    /* Write the dataset. */
    status = H5Dwrite(dataset_id, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, dset_data);
    /* Read the dataset back. */
    status = H5Dread(dataset_id, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, dset_data);

    /* Close the dataset. */
    status = H5Dclose(dataset_id);
    /* Close the file. */
    status = H5Fclose(file_id);
    return 0;
}
Parallel HDF5
• Relies on MPI-IO as the file layer driver
• Uses MPI for internal communications
• Most of the functionality controlled through property lists (requires minimal HDF5 interface changes)
• Supports both individual and collective file access
• Three raw data storage layouts: contiguous, chunking and compact
• Enables additional optimizations through derived MPI datatypes (esp. for regular collective accesses)
• Limitations
– Chunked storage with overlapping chunks (results non-deterministic)
– Read-only compression
– Writes with variable length datatypes not supported
Topics
• Scientific I/O Interface: netCDF
• Scientific Data Package: HDF5
• Application domain specific libraries
• SMP programming support: CILK, TBB
Application domains
• Linear algebra
– BLAS, ATLAS, LAPACK, ScaLAPACK, Slatec, pim
• Ordinary and Partial Differential Equations
– PETSc
• Mesh Manipulation and Load Balancing
– METIS, ParMETIS, CHACO, JOSTLE, PARTY
• Graph Manipulation
– Boost.Graph library
• Vector/Signal/Image Processing
– VSIPL, PESSL (IBM's Parallel Engineering and Scientific Subroutine Library)
• General Parallelization
– MPI, pthreads
• Other Domain Specific Libraries
– NAMD, NWChem, Fluent, Gaussian, LS-DYNA
Application Domain Overview
• Linear Algebra Libraries
– Provide optimized methods for constructing sets of linear equations, performing operations on them (matrix-matrix products, matrix-vector products) and solving them (factoring, forward & backward substitution).
– Commonly used libraries include BLAS, ATLAS, LAPACK, ScaLAPACK, PLAPACK
• PDE Solvers:
– General-purpose, parallel numerical PDE libraries
– Usual toolsets include manipulation of sparse data structures, iterative linear system solvers, preconditioners, nonlinear solvers and time-stepping methods.
– Commonly used libraries for solving PDEs include SAMRAI, PETSc, PARASOL, Overture, among others.
Application Domain Overview
• Mesh manipulation and Load Balancing
– These libraries help in partitioning meshes in roughly equal sizes across processors, thereby balancing the workload while minimizing size of separators and communication costs.
– Commonly used libraries for this purpose include METIS, ParMetis, Chaco, JOSTLE among others.
• Other packages:
– FFTW: features highly optimized Fourier transform package including both real and complex multidimensional transforms in sequential, multithreaded, and parallel versions.
– NAMD: molecular dynamics library available for Unix/Linux, Windows, OS X
– Fluent: computational fluid dynamics package, used for such applications as environment control systems, propulsion, reactor modeling etc.
CBLAS Matrix Multiplication
#include <stdio.h>
#include <cblas.h>

#define N (sizeof(x)/sizeof(x[0]))
#define M ((sizeof(A)/sizeof(A[0]))/(N))

int main(int argc, char **argv)
{
    unsigned int i;
    /* 4 x 5 matrix stored row by row */
    double A[] = {1.0, 0.1, 0.01, 0.0, 0.0,
                  10., 1.0, 0.1, 0.01, 0.0,
                  0.0, 10., 1.0, 0.1, 0.01,
                  0.0, 0.0, 10., 1.0, 0.1};
    double x[] = {1, 2, 3, 4, 5}, y[M];

    /* y = 2.0*A*x + 0.0*y */
    cblas_dgemv(CblasRowMajor, CblasNoTrans, M, N, 2.0, A, N, x, 1, 0.0, y, 1);

    /* print out the result */
    for (i = 0; i < M; i++)
        printf("y[%u] = %6.2f\n", i, y[i]);
    return 0;
}
uBLAS Matrix Multiplication 1
#include <iostream>
#include <boost/numeric/ublas/matrix.hpp>   // ublas::matrix, prod
#include <boost/numeric/ublas/vector.hpp>   // ublas::vector

using namespace boost::numeric;

int main(int argc, char **argv)
{
    // dense 4 x 5 matrix, filled element by element
    ublas::matrix<double> A (4, 5);
    A(0, 0) = 1.0;  A(0, 1) = 0.1;  A(0, 2) = 0.01; A(0, 3) = 0.0;  A(0, 4) = 0.0;
    A(1, 0) = 10.0; A(1, 1) = 1.0;  A(1, 2) = 0.1;  A(1, 3) = 0.01; A(1, 4) = 0.0;
    A(2, 0) = 0.0;  A(2, 1) = 10.0; A(2, 2) = 1.0;  A(2, 3) = 0.1;  A(2, 4) = 0.01;
    A(3, 0) = 0.0;  A(3, 1) = 0.0;  A(3, 2) = 10.0; A(3, 3) = 1.0;  A(3, 4) = 0.1;

    ublas::vector<double> x(5);
    x[0] = 1; x[1] = 2; x[2] = 3; x[3] = 4; x[4] = 5;

    // y = 2.0*A*x
    ublas::vector<double> y = prod(2*A, x);

    for (unsigned int i = 0; i < y.size(); ++i)
        std::cout << "y[" << i << "] = " << y[i] << std::endl;
    return 0;
}
uBLAS Matrix Multiplication 2
#include <iostream>
#include <boost/numeric/ublas/banded.hpp>   // ublas::banded_matrix
#include <boost/numeric/ublas/vector.hpp>   // ublas::vector

using namespace boost::numeric;

int main(int argc, char **argv)
{
    // 4 x 5 banded matrix with 1 subdiagonal and 2 superdiagonals;
    // only the entries inside the band are stored and assigned
    ublas::banded_matrix<double> A (4, 5, 1, 2);
    A(0, 0) = 1.0;  A(0, 1) = 0.1;  A(0, 2) = 0.01;
    A(1, 0) = 10.0; A(1, 1) = 1.0;  A(1, 2) = 0.1;  A(1, 3) = 0.01;
    A(2, 1) = 10.0; A(2, 2) = 1.0;  A(2, 3) = 0.1;  A(2, 4) = 0.01;
    A(3, 2) = 10.0; A(3, 3) = 1.0;  A(3, 4) = 0.1;

    ublas::vector<double> x(5);
    x[0] = 1; x[1] = 2; x[2] = 3; x[3] = 4; x[4] = 5;

    // y = 2.0*A*x
    ublas::vector<double> y = prod(2*A, x);

    for (unsigned int i = 0; i < y.size(); ++i)
        std::cout << "y[" << i << "] = " << y[i] << std::endl;
    return 0;
}
Newmat Matrix Multiplication
#include <iostream>
#include <newmatap.h>   // need matrix applications

using namespace NEWMAT;

int main(int argc, char **argv)
{
    Matrix A (4, 5);
    double d[] = {
         1.0,  0.1,  0.01, 0.0,  0.0,
        10.0,  1.0,  0.1,  0.01, 0.0,
         0.0, 10.0,  1.0,  0.1,  0.01,
         0.0,  0.0, 10.0,  1.0,  0.1,
    };
    A << d;

    ColumnVector x(5);
    double xv[] = { 1.0, 2.0, 3.0, 4.0, 5.0 };
    x << xv;

    // y = 2.0*A*x
    ColumnVector y = 2 * A * x;

    for (int i = 1; i <= y.Nrows(); ++i)
        std::cout << "y[" << i << "] = " << y(i) << std::endl;
    return 0;
}
Blitz++
• Blitz++ is a C++ class library for scientific computing which provides performance on par with Fortran 77/90.
• It uses template meta-programming techniques to achieve high performance
• The current versions provide dense arrays and vectors, random number generators, and small vectors and matrices.
• Blitz++ is distributed freely under an open source license
Blitz++ Example
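• Transforms a vector y using a principal rotation about the third axis:

\[
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
=
\begin{pmatrix}
\cos\alpha & -\sin\alpha & 0 \\
\sin\alpha & \cos\alpha & 0 \\
0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}
\]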
Blitz++ Example
// Blitz++ code
void transform(double alpha, TinyVector<double, 3>& x, const TinyVector<double, 3>& y)
{
double cosa = cos(alpha), sina = sin(alpha);
// Create the principal rotation matrix C_3(alpha)
TinyMatrix<double, 3, 3> C;
C = cosa, -sina, 0.0,
sina, cosa, 0.0,
0.0, 0.0, 1.0;
x = C * y;
}
Blitz++ Example (cont.)
// low-level implementation
void transform2(double alpha, double* x, double* y)
{
    double c = cos(alpha), s = sin(alpha);
    x[0] = c * y[0] - s * y[1];
    x[1] = s * y[0] + c * y[1];
    x[2] = y[2];
}
Blitz++ Example: Results
Outlook
Functional specification with a Domain Specific Embedded Language (DSEL)
equation = sum<vertex_edge> [ sumf<edge_vertex>(0.0, _e) [ pot * orient(_e, _1) ] * A / d * eps ] - V * rho
Elliptic PDE discretized by Finite Volume
References: [1]
Demo
• NetCDF Demo
• HDF5 Demo
• LAPACK Demo
Topics
• Scientific I/O Interface: netCDF
• Scientific Data Package: HDF5
• Application domain specific libraries
• SMP programming support: CILK, TBB
CILK
• Extends the C language with some keywords
• Programmer is responsible for exposing the parallelism
• Runtime environment (automatically) takes care of
– Load balancing
– Synchronization
– Communication protocols
CILK
• Basic parallelism in CILK is achieved by using the following keywords
– Spawn indicates that the procedure call it precedes can safely run in parallel with other executing code
  • Note that the scheduler is not obligated to run this procedure in parallel
– Sync indicates that execution of the current procedure cannot proceed until all previously spawned procedures have completed and returned their results
CILK
• Work stealing
– Each processor maintains a work deque (double-ended queue) of ready threads, and it manipulates the bottom of the deque like a stack.
– When a processor runs out of work, it steals a thread from the top of a random victim’s deque.
Topics
• Scientific I/O Interface: netCDF
• Scientific Data Package: HDF5
• CILK
• TBB
TBB
• TBB
– C++ template library which takes advantage of multi-core processors
– Developed by Intel Corporation
– Stable release ver. 3.0, May 4 2010
– Supports Microsoft Windows, Mac OS X, Linux using various compilers (Visual C++, Intel C++ Compiler, GNU Compiler Collection)
– For additional information
  • http://threadingbuildingblocks.org/
  • http://en.wikipedia.org/wiki/Intel_Threading_Building_Blocks
TBB
• TBB Library
– Allows a programmer to avoid complications arising from use of native threading packages, such as POSIX threads, Windows threads, etc.
– Abstracts access to multiple processors by treating operations as “tasks”
  • “Tasks” are allocated to individual cores dynamically by a run-time engine
– Provides (automated) efficient use of cache
– Relies on generic programming
– Has a higher-level API compared to MPI or POSIX threads (which leads to fewer bugs and easier code development)
TBB
• Notable Features
– Recursive splitting
  • TBB relies on breaking problems up recursively as required to get to the right level of parallel tasks
– Task stealing (instead of global task queue)
  • Scalability improves due to task stealing
– Algorithms
  • Easy to use due to generic programming approach
TBB Example
#include <iostream>
#include <string>
#include "tbb/task_scheduler_init.h"
#include "tbb/parallel_for.h"
#include "tbb/blocked_range.h"
using namespace tbb;
using namespace std;
static const size_t N = 23;
…
int main() {
    task_scheduler_init init;
    // build a long test string from Fibonacci-style concatenation
    string str[N] = { string("a"), string("b") };
    for (size_t i = 2; i < N; ++i) str[i] = str[i-1]+str[i-2];
    string &to_scan = str[N-1];
    size_t *max = new size_t[to_scan.size()];
    size_t *pos = new size_t[to_scan.size()];
    // split the index range recursively (grain size 100) and scan the pieces in parallel
    parallel_for(blocked_range<size_t>(0, to_scan.size(), 100),
                 SubStringFinder( to_scan, max, pos ) );
    for (size_t i = 0; i < to_scan.size(); ++i)
        cout << " " << max[i] << "(" << pos[i] << ")" << endl;
    delete[] pos;
    delete[] max;
    return 0;
}
Summary – Material for the Test
• NetCDF – 5-25
• HDF5 – 26-37