NetCDF
Ed Hartnett, Unidata/UCAR, [email protected]
Unidata
• Unidata - helps universities acquire, display, and analyze Earth-system data.
• UCAR – University Corporation for Atmospheric Research - a nonprofit consortium of 66 universities.
SDSC Presentation, July 2005
• Intro to NetCDF Classic
• Intro to NetCDF-4
What is NetCDF?
• A conceptual data model for scientific data.
• A set of APIs in C, F77, F90, Java, etc. to create and manipulate data files.
• Some portable binary formats.
• Useful for storing arrays of data and accompanying metadata.
History of NetCDF
• 1988: netCDF developed at Unidata
• 1991: netCDF 2.0 released
• 1996: netCDF 3.0 released
• 2004: netCDF 3.6.0 released
• 2005: netCDF 4.0 beta released
Getting netCDF
• Download latest release from the netCDF web page: http://www.unidata.ucar.edu/content/software/netcdf
• Builds and installs on most platforms with no configuration necessary.
• For a list of platforms that netCDF versions have been built on, and the output of building and testing netCDF, see the web site.
NetCDF Portability
• NetCDF is tested on a wide variety of platforms, including Linux, AIX, SunOS, MacOS, IRIX, OSF1, Cygwin, and Windows.
• We test with native compilers when we can get them.
• 64-bit builds are supported with some configuration effort.
What Comes with NetCDF
• NetCDF comes with 4 language APIs: C, C++, Fortran 77, and Fortran 90.
• Tools ncgen and ncdump.
• Tests.
• Documentation.
NetCDF Java API
• The netCDF Java API is entirely separate from the C API.
• You don’t need to install the C API for the Java API to work.
• Java API contains many exciting features, such as remote access and more advanced coordinate systems.
Tools to work with NetCDF Data
• The netCDF core library provides basic data access.
• ncgen and ncdump provide some helpful command line functionality.
• Many additional tools are available, see: http://www.unidata.ucar.edu/packages/netcdf/software.html
CDL – Common Data Language
• Grammar defined for displaying information about netCDF files.
• Can be used to create files without programming.
• Can be used to create a reading program in Fortran or C.
• Used by ncgen/ncdump utilities.
Example of CDL
   netcdf foo {    // example netCDF specification in CDL
   dimensions:
      lat = 10, lon = 5, time = unlimited;
   variables:
      int lat(lat), lon(lon), time(time);
      float z(time,lat,lon), t(time,lat,lon);
      double p(time,lat,lon);
      int rh(time,lat,lon);
      lat:units = "degrees_north";
      lon:units = "degrees_east";
   data:
      lat = 0, 10, 20, 30, 40, 50, 60, 70, 80, 90;
      lon = -140, -118, -96, -84, -52;
   }
Software Architecture of NetCDF-3
[Diagram components: V3 C API, V2 C API, F77 API, F90 API, C++ API, ncdump, ncgen; V2 C tests, V3 C tests, F77 tests]
• Fortran, C++ and V2 APIs are all built on the C API.
• Other language APIs (perl, python, MatLab, etc.) use the C API.
NetCDF Documentation
• Unidata distributes a NetCDF Users Guide which describes the data model in detail.
• A language-specific guide is provided for C, C++, Fortran 77, and Fortran 90 users.
• All documentation can be found at: http://my.unidata.ucar.edu/content/software/netcdf/docs
NetCDF Jargon
• “Variable” – a multi-dimensional array of data, of any of 6 types (char, byte, short, int, float, or double).
• “Dimension” – information about an axis: its name and length.
• “Attribute” – a 1D array of metadata.
More NetCDF Jargon
• “Coordinate Variable” – a 1D variable with the same name as a dimension, which stores values for each dimension value.
• “Unlimited Dimension” – a dimension which has no maximum size. Data can always be extended along the unlimited dimension.
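To make the coordinate variable idea concrete, here is a minimal sketch in plain C that does not call the netCDF library. It assumes the coordinate values have already been read into memory as an int array (as with the lat values 0, 10, ..., 90 in the CDL example). The helper name `find_coord_index` is hypothetical and only illustrates how a coordinate value maps to an index along its dimension.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of netCDF): given the values of a
 * coordinate variable, return the index along the dimension whose
 * coordinate equals `value`, or -1 if no entry matches. */
long find_coord_index(const int *coord, size_t len, int value)
{
    assert(coord != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (coord[i] == value)
            return (long)i;
    }
    return -1;
}
```

The returned index is what a caller would then use as the start position along that dimension when reading a data variable.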
The NetCDF Classic Data Model
• The netCDF Classic Data Model contains dimensions, variables, and attributes.
• At most one dimension may be unlimited.
• The Classic Data Model is embodied by netCDF versions 1 through 3.6.0.
• NetCDF is moving towards a new, richer data model: the Common Data Model.
NetCDF Example
• Suppose a user wants to store temperature and pressure values on a 2D latitude/longitude grid.
• In addition to the data, the user wants to store information about the lat/lon grid.
• The user may have additional data to store, for example the units of the data values.
NetCDF Model Example
[Diagram: Variables: temperature, pressure. Dimensions: latitude, longitude. Attributes: Units: C (temperature), Units: mb (pressure). Coordinate Variables: latitude, longitude.]
Important NetCDF Functions
• nc_create and nc_open to create and open files.
• nc_enddef, nc_close.
• nc_def_dim, nc_def_var, nc_put_att_*, to define dimensions, variables, and attributes.
• nc_inq, nc_inq_var, nc_inq_dim, nc_get_att_* to learn about dims, vars, and atts.
• nc_put_vara_*, nc_get_vara_* to write and read data.
C Functions to Define Metadata
   /* Create the file. */
   if ((retval = nc_create(FILE_NAME, NC_CLOBBER, &ncid)))
      return retval;

   /* Define the dimensions. */
   if ((retval = nc_def_dim(ncid, LAT_NAME, LAT_LEN, &lat_dimid)))
      return retval;
   if ((retval = nc_def_dim(ncid, LON_NAME, LON_LEN, &lon_dimid)))
      return retval;

   /* Define the variables. */
   dimids[0] = lat_dimid;
   dimids[1] = lon_dimid;
   if ((retval = nc_def_var(ncid, PRES_NAME, NC_FLOAT, NDIMS, dimids, &pres_varid)))
      return retval;
   if ((retval = nc_def_var(ncid, TEMP_NAME, NC_FLOAT, NDIMS, dimids, &temp_varid)))
      return retval;

   /* End define mode. */
   if ((retval = nc_enddef(ncid)))
      return retval;
C Functions to Write Data
   /* Write the data. */
   if ((retval = nc_put_var_float(ncid, pres_varid, pres_out)))
      return retval;
   if ((retval = nc_put_var_float(ncid, temp_varid, temp_out)))
      return retval;

   /* Close the file. */
   if ((retval = nc_close(ncid)))
      return retval;
C Example – Getting Data
   /* Open the file read-only. */
   if ((retval = nc_open(FILE_NAME, NC_NOWRITE, &ncid)))
      return retval;

   /* Read the data. Variable IDs 0 and 1 follow the order in which
      pressure and temperature were defined. */
   if ((retval = nc_get_var_float(ncid, 0, pres_in)))
      return retval;
   if ((retval = nc_get_var_float(ncid, 1, temp_in)))
      return retval;

   /* Do something useful with the data… */

   /* Close the file. */
   if ((retval = nc_close(ncid)))
      return retval;
Data Reading and Writing Functions
• There are 5 ways to read/write data of each type.
• var1 – reads/writes a single value.
• var – reads/writes entire variable at once.
• vara – reads/writes an array subset.
• vars – reads/writes an array by slices.
• varm – reads/writes a mapped array.
• Ex.: nc_put_vars_short writes shorts by slices.
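The difference between a vara-style subset and a vars-style strided access can be sketched in plain C without calling the netCDF library. The sketch assumes a 2D variable held in memory in row-major order. The helper name `copy_strided_2d` and its parameters are hypothetical, chosen to mirror the start, count, and stride arguments of those function families. With every stride set to 1 it selects a contiguous subset, as vara does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of netCDF): copy a strided subset of a
 * row-major 2D array of shorts into a packed output buffer.
 *   start[d]  - index of the first selected element along dimension d
 *   count[d]  - number of elements selected along dimension d
 *   stride[d] - step between selected elements along dimension d
 * Returns the number of values copied into dst. */
size_t copy_strided_2d(const short *src, const size_t shape[2],
                       const size_t start[2], const size_t count[2],
                       const size_t stride[2], short *dst)
{
    size_t n = 0;
    for (size_t i = 0; i < count[0]; i++) {
        size_t row = start[0] + i * stride[0];
        assert(row < shape[0]);   /* selection must stay in bounds */
        for (size_t j = 0; j < count[1]; j++) {
            size_t col = start[1] + j * stride[1];
            assert(col < shape[1]);
            dst[n++] = src[row * shape[1] + col];
        }
    }
    return n;
}
```

For a 3 x 4 array holding the values 0 through 11, start {0, 1}, count {2, 2}, and stride {2, 2} select rows 0 and 2 and columns 1 and 3, giving 1, 3, 9, 11.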
Attributes
• Attributes are 1-D arrays of any of the 6 netCDF types.
• Read/write them with functions like: nc_get_att_float and nc_put_att_int.
• Attributes may be attached to a variable, or may be global to the file.
NetCDF File Formats
• Starting with 3.6.0, netCDF supports two binary data formats.
• NetCDF Classic Format is the format that has been in use for netCDF files from the beginning.
• NetCDF 64-bit Offset Format was introduced in 3.6.0 and allows much larger files.
• Use classic format unless you need the large files.
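One way to tell the two formats apart is to look at the first four bytes of a file. The sketch below is plain C and does not call the netCDF library. It relies on the format specification, in which both formats begin with the characters 'C' 'D' 'F' followed by a version byte, 1 for classic and 2 for 64-bit offset. The enum and the function name `classify_header` are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch; not a netCDF library type. */
enum sketch_format {
    FORMAT_UNKNOWN = 0,
    FORMAT_CLASSIC = 1,
    FORMAT_64BIT_OFFSET = 2
};

/* Classify a file from its leading bytes: "CDF" plus a version byte,
 * where 1 means classic format and 2 means 64-bit offset format.
 * Anything else is reported as unknown. */
enum sketch_format classify_header(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len < 4 || memcmp(buf, "CDF", 3) != 0)
        return FORMAT_UNKNOWN;
    switch (buf[3]) {
    case 1:
        return FORMAT_CLASSIC;
    case 2:
        return FORMAT_64BIT_OFFSET;
    default:
        return FORMAT_UNKNOWN;
    }
}
```

A caller would read the first four bytes of the file with ordinary file I/O and pass them in. The library itself makes this distinction when opening a file, so application code rarely needs to.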
NetCDF-3 Summary
• NetCDF is a software library and some binary data formats, useful for scientific data, developed at Unidata.
• NetCDF organizes data into variables, with dimensions and attributes.
• NetCDF has proven to be reliable, simple to use, and very popular.
Why Add to NetCDF-3?
• Increasingly complex data sets call for greater organization.
• Size limits, unthinkably huge in 1988, are routinely reached in 2005.
• Parallel I/O is required for advanced Earth science applications.
• Interoperability with HDF5.
NetCDF-4
• NetCDF-4 aims to provide the netCDF API as a front end for HDF5.
• Funded by NASA, executed at Unidata and NCSA.
• Includes reliable netCDF-3 code, and is fully backward compatible.
NetCDF-4 Organizations
• Unidata/UCAR
• NCSA – The National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
• NASA – NetCDF-4 was funded by NASA award number AIST-02-0071.
New Features of NetCDF-4
• Multiple unlimited dimensions.
• Groups to organize data.
• New types, including compound types and variable length arrays.
• Parallel I/O.
The Common Data Model
• NetCDF-4, scheduled for beta-release this Summer, will conform to the Common Data Model.
• Developed by John Caron at Unidata, with the cooperation of HDF, OpenDAP, netCDF, and other software teams, CDM unites different models into a common framework.
• CDM is a superset of the NetCDF Classic Data Model.
The NetCDF-4 Data Model
• NetCDF-4 implements the Common Data Model.
• Adds groups; each group can contain variables, attributes, dimensions, and other groups.
• Dimensions are scoped so that variables in different groups can share dimensions.
• Compound types allow users to define new types, composed of other atomic or user-defined types.
• New integer and string types.
Software Architecture of NetCDF-4
[Diagram components: V4 C API, V3 C API, HDF5, V2 C API, F77 API, F90 API, C++ API, ncdump, ncgen; V2 C tests, V3 C tests, F77 tests]
NetCDF-4 Release Status
• Latest alpha release includes all netCDF-4 features – depends on latest HDF5 development snapshot.
• Beta release – due out in August, replaces artificial netCDF-4 constructs, and depends on a yet-to-be-released version of HDF5.
• Promotion from beta to full release will happen sometime in 2006.
Building NetCDF-4
• NetCDF-4 requires that HDF5 version 1.8.3 be installed. This is not released yet.
• The latest HDF5 development release works with the latest netCDF alpha release.
• To build netCDF-4, specify --enable-netcdf-4 at configure.
When to Use NetCDF-4 Format
• The new netCDF-4 features (groups, new types, parallel I/O) are only available for netCDF-4 format files.
• When you need HDF5 files.
• When portability is less important, until netCDF-4 becomes widespread.
Versions and Formats
[Timeline figure: Classic Format, 64-Bit Offset Format, and NetCDF-4 Format shown against the release history: 1988 netCDF developed by Glenn Davis; 1991 netCDF 2.0 released; 1996 netCDF 3.0 released; 2004 netCDF 3.6.0 released; 2005 netCDF 4.0 beta released]
NetCDF-4 Feature Review
• Multiple unlimited dimensions.
• How to use groups.
• Using compound types.
• Other new types.
• Variable length arrays.
• Parallel I/O.
• HDF5 Interoperability.
Multiple Unlimited Dimensions
• Unlimited dimensions are automatically expanded as new data are written.
• NetCDF-4 allows multiple unlimited dimensions.
Working with Groups
• Define a group, then use it as a container for the classic data model.
• Groups can be used to organize sets of data.
An Example of Groups
[Diagram: groups labeled Model_Run_1a, Model_Run_1, and Model_Run_2; each holds rh, lat, lon, and temp, with units attributes and a history attribute]
New Functions to Use Groups
• Open/create returns ncid of root group.
• Create a new group with nc_def_grp.
   nc_def_grp(int parent_ncid, char *name, int *new_ncid);
• Learn about groups with nc_inq_grps.
   nc_inq_grps(int ncid, int *numgrps, int *ncids);
C Example Using Groups
   if (nc_create(FILE_NAME, NC_NETCDF4, &ncid)) ERR;
   if (nc_def_grp(ncid, DYNASTY, &tudor_id)) ERR;
   if (nc_def_dim(tudor_id, DIM1_NAME, NC_UNLIMITED, &dimid)) ERR;
   if (nc_def_grp(tudor_id, HENRY_VII, &henry_vii_id)) ERR;
   if (nc_def_var(henry_vii_id, VAR1_NAME, NC_INT, 1, &dimid, &varid)) ERR;
   if (nc_put_vara_int(henry_vii_id, varid, start, count, data_out)) ERR;
   if (nc_close(ncid)) ERR;
Create Complex Types
• Like C structs, compound types can be assembled into a user defined type.
• Compound types can be nested – that is, they can contain other compound types.
• New functions are needed to create new types.
• V2 API functions are used to read/write complex types.
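Because a compound type mirrors a C struct, each field is described by its byte offset within the struct, and that offset should come from the compiler, since padding between members makes hand-computed offsets unreliable. The sketch below is plain C and does not call the netCDF library. The struct `service_record`, the descriptor `field_desc`, and the function `describe_service_record` are hypothetical names used only to show how such offsets can be gathered with the standard `offsetof` macro.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type for this sketch. */
struct service_record {
    int battles;
    double rating;
    short dates;
};

/* Hypothetical description of one compound field. */
struct field_desc {
    const char *name;   /* field name */
    size_t offset;      /* byte offset within the struct */
    size_t size;        /* size of the field in bytes */
};

/* Fill in one descriptor per struct member, in declaration order.
 * Returns the number of fields described. */
size_t describe_service_record(struct field_desc out[3])
{
    assert(out != NULL);
    out[0] = (struct field_desc){"battles",
        offsetof(struct service_record, battles), sizeof(int)};
    out[1] = (struct field_desc){"rating",
        offsetof(struct service_record, rating), sizeof(double)};
    out[2] = (struct field_desc){"dates",
        offsetof(struct service_record, dates), sizeof(short)};
    return 3;
}
```

The total size passed when defining the type would likewise be `sizeof(struct service_record)`, which can be larger than the sum of the field sizes.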
C Example of Compound Types
   /* Create a file with a compound type. Write a little data. */
   if (nc_create(FILE_NAME, NC_NETCDF4, &ncid)) ERR;
   if (nc_def_compound(ncid, sizeof(struct s1), SVC_REC, &typeid)) ERR;
   if (nc_insert_compound(ncid, typeid, BATTLES_WITH_KLINGONS,
                          HOFFSET(struct s1, i1), NC_INT)) ERR;
   if (nc_insert_compound(ncid, typeid, DATES_WITH_ALIENS,
                          HOFFSET(struct s1, i2), NC_INT)) ERR;
   if (nc_def_dim(ncid, STARDATE, DIM_LEN, &dimid)) ERR;
   /* Use the dimension ID just defined for the 1D variable. */
   if (nc_def_var(ncid, SERVICE_RECORD, typeid, 1, &dimid, &varid)) ERR;
   if (nc_put_var(ncid, varid, data)) ERR;
   if (nc_close(ncid)) ERR;
New Ints, Opaque, String Types
• Opaque types are bit-blobs of fixed size.
• String types allow multi-dimensional arrays of strings.
• New integer types: UBYTE, USHORT, UINT, UINT64, INT64.
Variable Length Arrays
• Variable length arrays allow the efficient storage of arrays of variable size.
• For example: an array of soundings, each with a different number of elements.
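The storage saving can be illustrated in plain C without calling the netCDF library. The sketch assumes each sounding is held as a length plus a pointer to its values. The type `sketch_vlen` and both function names are hypothetical, and this layout is an assumption about how such an element might be represented in C. It compares the number of values actually stored with the number a fixed-size 2D array would need if every sounding were padded to the longest one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variable length element: a count and a pointer to data. */
typedef struct {
    size_t len;   /* number of values in this sounding */
    float *p;     /* the values themselves */
} sketch_vlen;

/* Number of values actually held across all soundings. */
size_t total_values(const sketch_vlen *soundings, size_t nsoundings)
{
    size_t total = 0;
    assert(soundings != NULL || nsoundings == 0);
    for (size_t i = 0; i < nsoundings; i++)
        total += soundings[i].len;
    return total;
}

/* Number of values a rectangular array would need, with every
 * sounding padded out to the length of the longest one. */
size_t padded_values(const sketch_vlen *soundings, size_t nsoundings)
{
    size_t longest = 0;
    assert(soundings != NULL || nsoundings == 0);
    for (size_t i = 0; i < nsoundings; i++) {
        if (soundings[i].len > longest)
            longest = soundings[i].len;
    }
    return longest * nsoundings;
}
```

For three soundings of 2, 5, and 1 values, the variable length form holds 8 values where a padded rectangular array would hold 15.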
Parallel I/O with NetCDF-4
• Must use configure option --enable-parallel when building netCDF.
• Depends on HDF5 parallel features, which require MPI.
• Must create or open file with nc_create_par or nc_open_par.
• All metadata operations are collective.
• Adding a new record is collective.
• Variable reads/writes are independent by default, but can be changed to do collective operations.
HDF5 Interoperability
• NetCDF-4 can interoperate with HDF5 with a SUBSET of HDF5 features.
• Will not work with HDF5 files that have looping groups, references, and types not found in netCDF-4.
• HDF5 file must use new dimension scale API to store shared dimension info.
• If an HDF5 file follows the Common Data Model, NetCDF-4 can interoperate on the same file.
Future Plans for NetCDF
• NetCDF 4.0 release in 2006.
• Beta for next major version of netCDF in Summer, 2006.
• Full compatibility with Common Data Model.
• Remote access, including remote subsetting of data.
• XML-based representation of netCDF metadata.
• Full Fortran 90 support, but limited F77 support.
For Further Information
• netCDF mailing list: [email protected]
• email Ed: [email protected]
• netCDF web site: www.unidata.ucar.edu