Unidata Technologies Relevant to GO-ESSP: An Update

Russ Rew, 2008-09-17




Since June 2007 (Paris meeting)

  • NetCDF: releases of C- and Java-based software
  • CDM: Unidata’s Common Data Model
  • NcML: NetCDF Markup Language
  • SDCI: netCDF/OPeNDAP integration
  • TDS: THREDDS Data Server
  • Unidata conventions for observational data
  • Libcf: on hold before netCDF-4 release, development starting up again
  • Other projects: GALEON, RAMADDA, IDV, …


NetCDF-4, June 2008

  • Backward-compatible with netCDF-3 data and API
  • Default build configuration just installs netCDF-3
  • If configured to use an HDF5 library, installs netCDF-4 with enhanced data model, format, and API
  • Performance enhancements with HDF5-based format:
      • Compression (per-variable)
      • Chunking (multi-dimensional tiling)
      • Efficient changes to file schema
      • Elimination of unneeded endianness conversions
      • Ample variable sizes
      • Parallel I/O
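
As a rough illustration of why chunking (multi-dimensional tiling) matters for performance, the following sketch counts how many chunks a read request intersects under two different chunk shapes. It is not the netCDF-4 or HDF5 implementation; the function name `chunks_touched`, the variable layout, and the chunk shapes are all hypothetical, and every requested extent is assumed to be at least 1.

```python
from itertools import product


def chunks_touched(start, count, chunk_shape):
    """Return the indices of the chunks that a hyperslab read intersects.

    start, count: per-dimension origin and extent of the requested slab.
    chunk_shape: per-dimension chunk (tile) lengths.
    """
    ranges = []
    for s, c, ch in zip(start, count, chunk_shape):
        first = s // ch            # chunk holding the first requested element
        last = (s + c - 1) // ch   # chunk holding the last requested element
        ranges.append(range(first, last + 1))
    return list(product(*ranges))


# Hypothetical variable temp(time=365, lat=180, lon=360):
# read one time step over the whole lat/lon grid.
slab = dict(start=(100, 0, 0), count=(1, 180, 360))

# Chunks shaped like horizontal tiles versus chunks shaped like time series.
tiled = chunks_touched(chunk_shape=(1, 90, 180), **slab)
columns = chunks_touched(chunk_shape=(365, 10, 10), **slab)

print(len(tiled), len(columns))
```

With these assumed shapes the map-style read touches 4 chunks in the first layout and 648 in the second, which is the kind of trade-off a per-variable chunk shape lets a data producer tune.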


Enhanced data model for netCDF

A file has a top-level unnamed group. Each group may contain one or more named subgroups, user-defined types, variables, dimensions, and attributes. Variables also have attributes. Variables may share dimensions, indicating a common grid. One or more dimensions may be of unlimited length.

[Diagram: classes of the enhanced data model]

  File
      location: Filename
      create( ), open( ), …

  Group
      name: String

  Dimension
      name: String
      length: int
      isUnlimited( )

  Attribute
      name: String
      type: DataType
      values: 1D array

  Variable
      name: String
      shape: Dimension[ ]
      type: DataType
      array: read( ), …

Variables and attributes have one of twelve primitive data types or one of four user-defined types.

  DataType
      PrimitiveType: char, byte, short, int, int64, float, double, unsigned byte, unsigned short, unsigned int, unsigned int64, string
      UserDefinedType (typename: String): Compound, VariableLength, Enum, Opaque
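
The containment relationships in the diagram can be sketched as plain Python data classes. This is only an illustrative model of the structure described on the slide, not the netCDF C or Java API; the class fields beyond those named in the diagram, and the example names `temp`, `rh`, and `model_run_1`, are assumptions. User-defined types are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dimension:
    name: str
    length: int
    unlimited: bool = False

    def is_unlimited(self) -> bool:
        return self.unlimited


@dataclass
class Variable:
    name: str
    dtype: str                  # name of a primitive or user-defined type
    shape: List[Dimension]      # shared Dimension objects indicate a common grid
    attributes: Dict[str, object] = field(default_factory=dict)


@dataclass
class Group:
    name: str
    dimensions: Dict[str, Dimension] = field(default_factory=dict)
    variables: Dict[str, Variable] = field(default_factory=dict)
    attributes: Dict[str, object] = field(default_factory=dict)
    groups: Dict[str, "Group"] = field(default_factory=dict)


root = Group(name="")  # the top-level unnamed group
time = Dimension("time", 0, unlimited=True)
lat = Dimension("lat", 180)
root.dimensions.update(time=time, lat=lat)

# Two variables that share dimensions, and so share a grid.
root.variables["temp"] = Variable("temp", "float", [time, lat], {"units": "K"})
root.variables["rh"] = Variable("rh", "float", [time, lat])

# A named subgroup provides a nested name scope.
root.groups["model_run_1"] = Group(name="model_run_1")
```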


Enhanced netCDF-4 data model

  • Data access level of Unidata’s Common Data Model
  • New primitive types: strings, unsigned, 64-bit integers
  • Adds user-defined types
      • Compound structures
      • Variable-length types
      • May be nested
  • Adds named groups for nested name scopes, hierarchical organization of data, factoring out common metadata, sequences of analysis steps, …
  • Supports multiple unlimited dimensions


Change to permitted netCDF names, even in netCDF-3

  • Most special characters now allowed in names
      • Example use: attributes with ":" for namespace prefixes
  • International names for variables, dimensions, attributes (and groups, types, compound members, enum symbols)
  • Unicode permitted
  • Supported by ncdump and ncgen utilities

Example (CDL):

    :Conventions = "CF-2.0, iso=http://iso.org";
    var:units = "m";
    var:iso\:attrunit = "metres";


International use of CF Conventions

    variables:
        float pressure(pressure) ;
            pressure:standard_name = "air_pressure" ;
        float pressão(pressão) ;
            pressão:standard_name = "air_pressure" ;
        float presión(presión) ;
            presión:standard_name = "air_pressure" ;
        float ciśnienie(ciśnienie) ;
            ciśnienie:standard_name = "air_pressure" ;
        float налягане(налягане) ;
            налягане:standard_name = "air_pressure" ;
        float 压力(压力) ;
            压力:standard_name = "air_pressure" ;
        float πίεση(πίεση) ;
            πίεση:standard_name = "air_pressure" ;
        float давление(давление) ;
            давление:standard_name = "air_pressure" ;
        float 압력(압력) ;
            압력:standard_name = "air_pressure" ;


First generic netCDF-4 application: ncdump

  • Full support for netCDF-4 data model
      • Handles unbounded number of types
      • Evidence that developing such generic applications is practical
      • Still has limitations, e.g. no NcML output for new features yet
  • Extended CDL can represent all features of data model
      • Nested groups
      • User-defined type definitions
          • Compound types
          • Variable length types
          • Enumerations with symbols
          • Opaque types
      • Optional explicit attribute types
      • Syntax for data constants of each user-defined type
  • By end of 2008:
      • Performance parameters: chunking, compression, format variant
      • CDL to binary or C program with new ncgen


Another recent feature of ncdump

  • Support for CF climate calendars

        variable double time;
            time:calendar = "julian";
            time:units = "days since 1858-11-17 00:00:00";
        …

  • Default:

        time = 53602, 53602.25, 53602.5, 53602.75;

  • With new "-t" option, ISO-8601 notation:

        time = "2005-08-19", "2005-08-19 06", "2005-08-19 12", "2005-08-19 18";

  • Uses LLNL’s cdtime library instead of udunits
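
The conversion shown on this slide can be reproduced with a short, self-contained sketch. This is not the cdtime library or ncdump code; it is a plain Python illustration using the standard integer formulas for converting between Julian-calendar dates and Julian Day Numbers. The function names are hypothetical, and the fractional part of each value is assumed to be a whole number of hours.

```python
def julian_to_jdn(year, month, day):
    """Julian Day Number of a date expressed in the Julian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def jdn_to_julian(jdn):
    """Inverse of julian_to_jdn: (year, month, day) in the Julian calendar."""
    c = jdn + 32082
    d = (4 * c + 3) // 1461
    e = c - (1461 * d) // 4
    m = (5 * e + 2) // 153
    day = e - (153 * m + 2) // 5 + 1
    month = m + 3 - 12 * (m // 10)
    year = d - 4800 + m // 10
    return year, month, day


def decode(value, epoch=(1858, 11, 17)):
    """Format a 'days since <epoch>' value, both in the Julian calendar."""
    whole = int(value // 1)
    hours = round((value - whole) * 24)   # assumes whole hours
    y, mo, d = jdn_to_julian(julian_to_jdn(*epoch) + whole)
    text = f"{y:04d}-{mo:02d}-{d:02d}"
    return text if hours == 0 else f"{text} {hours:02d}"


decoded = [decode(t) for t in (53602, 53602.25, 53602.5, 53602.75)]
print(decoded)
# ['2005-08-19', '2005-08-19 06', '2005-08-19 12', '2005-08-19 18']
```

The calendar attribute matters here: counting the same 53602 days from 1858-11-17 in the Gregorian calendar lands on 2005-08-20, one day later than the Julian-calendar result shown on the slide.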


Unidata’s Common Data Model

[Diagram: the three layers of the Common Data Model]

  • Scientific Feature Types: Grid, Point, Radial, Trajectory, Swath, Station Profile
  • Coordinate Systems
  • Data Access: netCDF-3, HDF5, OPeNDAP, BUFR, GRIB1, GRIB2, NEXRAD, NIDS, McIDAS, GEMPAK, GINI, DMSP, HDF4, HDF-EOS, DORADE, GTOPO, ASCII


Unidata’s Common Data Model

  • Three layers: data access, coordinate systems, feature types
  • Implemented in netCDF-Java 2.2.22 and 4.0-alpha
  • IO Service Provider plugins for GRIB, netCDF-4, HDF4, HDF-EOS, BUFR, GEMPAK Grids, CADIS, McIDAS Area, others …
  • GRIB now includes thin grids, Gaussian grids
  • CDM feature types, point feature types
  • Draft proposed CF conventions for point data
  • Collaborating with OPeNDAP.org, HDF Group


NcML (NetCDF Markup Language)

  • More than just an XML version of netCDF data: can be used to add or modify metadata for existing files
  • Supports several kinds of aggregations of multiple CDM files into a single virtual netCDF dataset
      • Union of variables in files
      • Aggregation on an existing dimension
      • Aggregation on a new dimension
      • Dynamic aggregations of files in a directory
  • New kinds of aggregation: forecast model run collection, tiled aggregation
  • Even C and Fortran clients can access data through NcML on server, using OPeNDAP protocol
  • See Rich Signell’s presentation
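
To show what an aggregation on an existing dimension looks like in practice, the sketch below builds a small NcML document with Python's standard XML library. The `joinExisting` aggregation type and the NcML 2.2 namespace are part of NcML; the helper name `join_existing` and the file names are hypothetical.

```python
import xml.etree.ElementTree as ET

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"


def join_existing(dim_name, locations):
    """Build an NcML document that joins files along an existing dimension."""
    ET.register_namespace("", NCML_NS)  # emit NcML as the default namespace
    root = ET.Element(f"{{{NCML_NS}}}netcdf")
    agg = ET.SubElement(
        root,
        f"{{{NCML_NS}}}aggregation",
        {"dimName": dim_name, "type": "joinExisting"},
    )
    for location in locations:
        # Each nested <netcdf> element names one file in the aggregation.
        ET.SubElement(agg, f"{{{NCML_NS}}}netcdf", {"location": location})
    return ET.tostring(root, encoding="unicode")


# Hypothetical monthly files that share a "time" dimension.
ncml = join_existing("time", ["jan.nc", "feb.nc"])
print(ncml)
```

A server or netCDF-Java client that reads this document would present the listed files as one virtual dataset whose time dimension spans all of them.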


NetCDF and OPeNDAP Integration

  • Two-year project funded by NSF SDCI
  • Goal to improve OPeNDAP and netCDF integration by
      • Enhancing Unidata's netCDF C library to directly support OPeNDAP protocol for remote access to netCDF data
      • Extending OPeNDAP protocol to support elements of the Unidata Common Data Model on server as well as client
  • Client support for OPeNDAP available in current netCDF snapshot releases
  • See Dennis Heimbigner’s presentation


THREDDS Data Server (TDS)

  • Prototype netCDF Subset Service
      • Subsets CDM datasets using earth coordinates and date ranges (not array indices)
      • Subsets by variables may be requested
      • Grids or station data
      • Returns choice of netCDF binary file, XML, CSV, or ASCII
  • Authorization/authentication capabilities
      • Restrict dataset access
      • Can use pluggable authorization, e.g. CAS, CAMS
  • Support for runtime configuration (avoids shutdown)
  • Generation of CF-compliant data, if possible
  • See Ethan Davis’ presentation on TDS, OGC WCS, and CF
  • See Jon Blower’s presentation on new TDS support for Web Map Service
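
The idea of subsetting by earth coordinates rather than array indices can be sketched as a translation from a bounding box to index ranges. This is not TDS code and says nothing about the service's actual request syntax; it is a minimal illustration that assumes ascending one-dimensional coordinate arrays and ignores longitude wrap-around. The function names and the sample grid are hypothetical.

```python
from bisect import bisect_left, bisect_right


def index_range(coords, lo, hi):
    """Half-open index range [i, j) of ascending coords that lie in [lo, hi]."""
    return bisect_left(coords, lo), bisect_right(coords, hi)


def subset_grid(lats, lons, data, south, north, west, east):
    """Cut a lat/lon bounding box out of a 2-D grid stored as nested lists."""
    i0, i1 = index_range(lats, south, north)
    j0, j1 = index_range(lons, west, east)
    return lats[i0:i1], lons[j0:j1], [row[j0:j1] for row in data[i0:i1]]


# Illustrative 10-degree global grid with made-up values.
lats = list(range(-90, 91, 10))
lons = list(range(-180, 181, 10))
data = [[lat * 1000 + lon for lon in lons] for lat in lats]

sub_lats, sub_lons, sub = subset_grid(lats, lons, data, 30, 50, -110, -90)
print(sub_lats, sub_lons)
```

The client only states the box (30 to 50 north, 110 to 90 west); the index arithmetic stays on the server side, which is what frees users from knowing how each dataset is laid out.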


Other Unidata Technologies and Projects

  • RAMADDA: a database approach to managing metadata and data repositories
  • GALEON / OGC / WCS 1.1 / NcML-G: GIS and web services using CF-netCDF
  • IDV: Integrated Data Viewer for analysis and visualization of data integrated from diverse sources
  • Proposals for developing CF satellite data product conventions
  • NASA ESDS submission for netCDF classic format standard
  • Next-generation LDM: event-driven Internet distribution of near real-time data by subscription


From NSF Panel review of Unidata’s 5-year proposal

The Panel members are unanimous in their judgment that the Unidata program has been a success, and in recommending that Unidata be supported over the next five years. … the panel recommends that the UPC play a strong leadership role to help shape future cyberinfrastructure technologies for geosciences.

So, look out for netCDF-5 …