Luca Cinquini for the Earth System Grid
NIEeS Workshop, Cambridge (UK), Sep 2002
METADATA DEVELOPMENT for the EARTH SYSTEM GRID
Luca Cinquini (SCD/NCAR)
for the Earth System Grid collaboration
www.earthsystemgrid.org
Metadata-centric view of ESG services
[Diagram: METADATA SERVICES, shown together with the following labels]
ESG services: User Authentication and Authorization; Data Transport; System Monitoring and Control; Data Search & Discovery; Data Analysis & Visualization; Data Browsing
Metadata types: Access and Authorization Metadata; Location Metadata; Logging Metadata; Content Metadata; Annotation & History Metadata; Aggregation Metadata; Cataloguing Metadata
ESG Metadata Services
• Goal: services responsible for the creation, management and utilization of metadata associated with geophysical data
• Functionality:
  – Metadata extraction (automatically, from files in different formats and according to various possible metadata standards)
  – Metadata conversion (from one standard to another)
  – Metadata aggregation (associated with data collections)
  – Metadata annotation (manually by humans)
  – Metadata validation (basic quality control of metadata)
  – Registration (population of metadata holdings)
  – Harvesting (combination of metadata from different repositories)
  – Metadata browsing and display (for humans)
  – Search and discovery of data through metadata
  – Metadata query (by agents or clients for data analysis and visualization)
ESG Metadata Services Architecture
Three-layer architecture:
• Metadata Holdings: physical metadata content, stored in a system of relational and/or XML native databases
• Core Metadata Services: modules and libraries that mediate all access to the Metadata Holdings (insert, update, delete, query) and expose an API that hides the specific implementation of the databases and query languages
• High Level Metadata Services: system of applications that make use of the Core Metadata Services to fulfill a specific atomic functionality; they will be invoked by external clients
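The layering described above can be sketched in Python. This is only an illustration under stated assumptions: every class and function name here (MetadataHoldings, InMemoryHoldings, CoreMetadataService, find_datasets_by_variable) is hypothetical and not part of any ESG API, and an in-memory dictionary stands in for the relational or XML native databases.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional

Record = Dict[str, str]


class MetadataHoldings(ABC):
    """Layer 1 (hypothetical interface): physical storage of metadata records."""

    @abstractmethod
    def put(self, key: str, record: Record) -> None: ...

    @abstractmethod
    def get(self, key: str) -> Optional[Record]: ...

    @abstractmethod
    def remove(self, key: str) -> None: ...

    @abstractmethod
    def scan(self) -> Dict[str, Record]: ...


class InMemoryHoldings(MetadataHoldings):
    """Stand-in for a relational or XML native database (assumption)."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def put(self, key: str, record: Record) -> None:
        self._records[key] = dict(record)

    def get(self, key: str) -> Optional[Record]:
        record = self._records.get(key)
        return dict(record) if record is not None else None

    def remove(self, key: str) -> None:
        self._records.pop(key, None)

    def scan(self) -> Dict[str, Record]:
        return {key: dict(rec) for key, rec in self._records.items()}


class CoreMetadataService:
    """Layer 2: mediates insert, update, delete and query; hides the backend."""

    def __init__(self, holdings: MetadataHoldings) -> None:
        self._holdings = holdings

    def insert(self, key: str, record: Record) -> None:
        if self._holdings.get(key) is not None:
            raise KeyError(f"record already exists: {key}")
        self._holdings.put(key, record)

    def update(self, key: str, changes: Record) -> None:
        current = self._holdings.get(key)
        if current is None:
            raise KeyError(f"no such record: {key}")
        self._holdings.put(key, {**current, **changes})

    def delete(self, key: str) -> None:
        self._holdings.remove(key)

    def query(self, **criteria: str) -> List[str]:
        # Return the keys of all records whose fields match every criterion.
        return sorted(
            key
            for key, rec in self._holdings.scan().items()
            if all(rec.get(field) == value for field, value in criteria.items())
        )


def find_datasets_by_variable(core: CoreMetadataService, variable: str) -> List[str]:
    """Layer 3: one atomic high-level function built only on the core API."""
    return core.query(variable=variable)
```

Because the high-level function only talks to CoreMetadataService, swapping InMemoryHoldings for a different backend would not change client code, which is the point of the middle layer.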
[Diagram: ESG Metadata Services architecture, with the following labels]
ESG clients API & user interfaces: Publishing; Search & Discovery; Administration; Browsing & Display; Analysis & Visualization
High Level Metadata Services: Metadata Extraction; Metadata Display; Metadata Browsing; Metadata Search, Query & Discovery; Metadata Annotation; Metadata Validation; Metadata Aggregation; Metadata Conversion; Metadata & Data Registration
Core Metadata Services: Metadata Access (update, insert, delete, query); Service Translation Library
Metadata Holdings: Replica Location Services; Metadata Cataloguing Services; XML DB; THREDDS catalogs
ESG Metadata Services Current Development
Currently developing or evaluating the following technologies:
• Replica Location Services: database to manage and index multiple copies of the same data stored at different centers
• Metadata Cataloguing Services: relational database to store scientific metadata (developed for high energy physics and geophysical data)
• XML native databases (Apache Xindice)
• THREDDS (by Unidata): system for hierarchical cataloguing of datasets and associated metadata (http://www.unidata.ucar.edu/projects/THREDDS)
• NcML (Netcdf Markup Language): XML language for encoding of metadata associated with data in netcdf format (and more…)
ESG Metadata Policy
• Premise: the geophysical sciences are too broad and complex to impose a single, all-encompassing metadata standard that captures the relevant information for all datasets, projects, instruments and scientists
• ESG will not mandate use of any metadata schema or convention
• Allow data providers and scientists to use their metadata of choice; provide technologies and tools to store and access metadata through common services (MCS, XML DB, THREDDS catalogs)
• Encourage development and reuse of a limited set of domain-specific standards (climate data, radar data, airborne instrumentation etc.), encoding in XML (according to community-developed schemas), and interoperability and combination of schemas (XML namespaces, RDF, ontologies)
Netcdf Markup Language (NcML)
Work in progress, collaboration between ESG, Unidata and the University of Florence
• Definition: XML representation for data following the netcdf model
• Features:
  – Express metadata associated with data in netcdf format
  – Definition of coordinates and coordinate systems (capturing netcdf conventions)
  – Aggregation/subsetting
  – Definition of new data, restructuring of existing data (virtual datasets)
  – Interoperability with openGIS and ISO
  – Also, possibly extend the model to other data formats (HDF, Grib etc.)
• Strategy: develop a system of XML schemas, each covering a specific domain (advantages: more flexible, maintainable and extensible). Keep it simple!
NcML: schemas architecture
[Diagram: NcML schemas, with the following boxes]
• Netcdf core (generic netcdf data)
• Netcdf Coordinate Systems (netcdf conventions for coord, coord systems)
• Netcdf (virtual) dataset (operations on data)
• Netcdf Geo Coordinate Systems (geo-referenced coord systems)
• openGIS-ISO Reference Coordinate Systems
• Other schemas for openGIS-ISO
NcML: core schema
• For XML encoding of metadata (and data) of any generic netcdf file
• Objects: Netcdf, Dimension, Variable, Attribute
• Beta version reference implementation as Java library (http://www.scd.ucar.edu/vets/luca/netcdf/extract_metadata.htm)
Example: two-dimensional latitude, longitude coordinate variables (CDL)
dimensions:
  xc = 128;
  yc = 64;
  lev = 18;
variables:
  float T(lev,yc,xc);
    T:long_name = "temperature";
    T:units = "K";
    T:coordinates = "lon lat";
  float xc(xc);
    xc:long_name = "x-coordinate in Cartesian system";
    xc:units = "m";
  float yc(yc);
    yc:long_name = "y-coordinate in Cartesian system";
    yc:units = "m";
  float lev(lev);
    lev:long_name = "altitude levels";
    lev:units = "km";
  float lon(yc,xc);
    lon:long_name = "longitude";
    lon:units = "degrees_east";
  float lat(yc,xc);
    lat:long_name = "latitude";
    lat:units = "degrees_north";
NcML core schema
<?xml version="1.0" encoding="UTF-8"?>
<nc:netcdf xmlns:nc="http://www.ucar.edu/schemas/netcdf"
    uri="http://www.scd.ucar.edu/vets/luca/netcdf/example.nc"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.ucar.edu/schemas/netcdf http://www.ucar.edu/schemas/netcdf.xsd">
  <nc:dimension length="128" name="xc"/>
  <nc:dimension length="64" name="yc"/>
  <nc:dimension length="18" name="lev"/>
  <nc:variable name="xc" shape="xc" type="float">
    <nc:attribute name="long_name" type="string" value="x cartesian coord"/>
    <nc:attribute name="units" type="string" value="m"/>
  </nc:variable>
  <nc:variable name="yc" shape="yc" type="float">
    <nc:attribute name="long_name" type="string" value="y cartesian coord"/>
    <nc:attribute name="units" type="string" value="m"/>
  </nc:variable>
  <nc:variable name="lev" shape="lev" type="float">
    <nc:attribute name="long_name" type="string" value="altitude levels"/>
    <nc:attribute name="units" type="string" value="km"/>
  </nc:variable>
  <nc:variable name="lon" shape="yc xc" type="float">
    <nc:attribute name="units" type="string" value="degrees_east"/>
  </nc:variable>
  <nc:variable name="lat" shape="yc xc" type="float">
    <nc:attribute name="units" type="string" value="degrees_north"/>
  </nc:variable>
  <nc:variable name="T" shape="lev yc xc" type="float">
    <nc:attribute name="long_name" type="string" value="temperature"/>
    <nc:attribute name="units" type="string" value="K"/>
    <nc:attribute name="coordinates" type="string" value="lat lon"/>
  </nc:variable>
</nc:netcdf>
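A document in this form can be read with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree, not the Java reference library mentioned earlier. The function name summarize and the abridged NCML string are assumptions made for illustration; the namespace URI is the one used in the example.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the example document.
NS = {"nc": "http://www.ucar.edu/schemas/netcdf"}

# Abridged copy of the example (only some variables are kept).
NCML = """<nc:netcdf xmlns:nc="http://www.ucar.edu/schemas/netcdf">
  <nc:dimension length="128" name="xc"/>
  <nc:dimension length="64" name="yc"/>
  <nc:dimension length="18" name="lev"/>
  <nc:variable name="lon" shape="yc xc" type="float">
    <nc:attribute name="units" type="string" value="degrees_east"/>
  </nc:variable>
  <nc:variable name="T" shape="lev yc xc" type="float">
    <nc:attribute name="long_name" type="string" value="temperature"/>
    <nc:attribute name="units" type="string" value="K"/>
  </nc:variable>
</nc:netcdf>"""


def summarize(ncml_text):
    """Return (dimensions, variables) extracted from an NcML core document."""
    root = ET.fromstring(ncml_text)
    dims = {
        d.get("name"): int(d.get("length"))
        for d in root.findall("nc:dimension", NS)
    }
    variables = {}
    for var in root.findall("nc:variable", NS):
        dim_names = var.get("shape", "").split()
        # Basic validation: every name in the shape must be a declared dimension.
        missing = [name for name in dim_names if name not in dims]
        if missing:
            raise ValueError(f"{var.get('name')}: undeclared dimensions {missing}")
        variables[var.get("name")] = {
            "type": var.get("type"),
            "shape": tuple(dims[name] for name in dim_names),
            "attributes": {
                a.get("name"): a.get("value")
                for a in var.findall("nc:attribute", NS)
            },
        }
    return dims, variables


dims, variables = summarize(NCML)
print(variables["T"]["shape"])
```

Resolving each shape against the declared dimensions doubles as a simple form of the metadata validation listed among the ESG metadata services.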
NcML: coordinate systems schema
Generalization and unification of netcdf conventions for coordinates and coordinate systems
Coordinate Systems extension to NcML
<nc:coordinateVariable name="xc" shape="xc" type="float">
  <nc:attribute name="long_name" type="string" value="x cartesian coord"/>
  <nc:attribute name="units" type="string" value="m"/>
</nc:coordinateVariable>
<nc:coordinateVariable name="yc" shape="yc" type="float">
  <nc:attribute name="long_name" type="string" value="y cartesian coord"/>
  <nc:attribute name="units" type="string" value="m"/>
</nc:coordinateVariable>
<nc:coordinateVariable name="lev" shape="lev" type="float">
  <nc:attribute name="long_name" type="string" value="altitude levels"/>
  <nc:attribute name="units" type="string" value="km"/>
</nc:coordinateVariable>
<nc:coordinateVariable name="lon" shape="yc xc" type="float">
  <nc:attribute name="units" type="string" value="degrees_east"/>
</nc:coordinateVariable>
<nc:coordinateVariable name="lat" shape="yc xc" type="float">
  <nc:attribute name="units" type="string" value="degrees_north"/>
</nc:coordinateVariable>
Coordinate Systems extension to NcML (continued)
<nc:coordinateSystem name="implicit">
  <nc:coordinateAxis ref="xc"/>
  <nc:coordinateAxis ref="yc"/>
  <nc:coordinateAxis ref="lev"/>
</nc:coordinateSystem>
<nc:coordinateSystem name="geo">
  <nc:coordinateAxis ref="lon"/>
  <nc:coordinateAxis ref="lat"/>
  <nc:coordinateAxis ref="lev"/>
</nc:coordinateSystem>
<nc:variable name="T" shape="lev yc xc" type="float" coordinateSystems="implicit geo">
  <nc:attribute name="long_name" type="string" value="temperature"/>
  <nc:attribute name="units" type="string" value="K"/>
  <nc:attribute name="coordinates" type="string" value="lat lon"/>
</nc:variable>
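A client can follow the coordinateSystems attribute of a variable to find the axes of each system it belongs to. The sketch below assumes that both "implicit" and "geo" are declared as nc:coordinateSystem elements and that the fragment is wrapped in an nc:netcdf root; the function name axes_for is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"nc": "http://www.ucar.edu/schemas/netcdf"}

# Fragment from the slides, wrapped in a root element so that it parses.
FRAGMENT = """
<nc:netcdf xmlns:nc="http://www.ucar.edu/schemas/netcdf">
  <nc:coordinateSystem name="implicit">
    <nc:coordinateAxis ref="xc"/>
    <nc:coordinateAxis ref="yc"/>
    <nc:coordinateAxis ref="lev"/>
  </nc:coordinateSystem>
  <nc:coordinateSystem name="geo">
    <nc:coordinateAxis ref="lon"/>
    <nc:coordinateAxis ref="lat"/>
    <nc:coordinateAxis ref="lev"/>
  </nc:coordinateSystem>
  <nc:variable name="T" shape="lev yc xc" type="float"
               coordinateSystems="implicit geo"/>
</nc:netcdf>
"""


def axes_for(root, variable_name):
    """Map each coordinate system of a variable to its list of axis names."""
    systems = {
        cs.get("name"): [ax.get("ref") for ax in cs.findall("nc:coordinateAxis", NS)]
        for cs in root.findall("nc:coordinateSystem", NS)
    }
    for var in root.findall("nc:variable", NS):
        if var.get("name") == variable_name:
            names = var.get("coordinateSystems", "").split()
            # A KeyError here means the variable names an undeclared system.
            return {name: systems[name] for name in names}
    raise KeyError(f"no such variable: {variable_name}")


print(axes_for(ET.fromstring(FRAGMENT), "T"))
```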
Coordinate Systems extension to NcML (continued)
<nc:variable name="ps" shape="lev yc xc" type="float" coordinateSystems="implicit geo">
  <nc:attribute name="long_name" type="string" value="pressure"/>
  <nc:attribute name="units" type="string" value="Pa"/>
  <nc:attribute name="coordinates" type="string" value="lat lon"/>
</nc:variable>
<nc:coordinateSystem name="pressure">
  <nc:coordinateAxis ref="lon"/>
  <nc:coordinateAxis ref="lat"/>
  <nc:coordinateAxis ref="pressure"/>
</nc:coordinateSystem>
<nc:variable name="T" shape="lev yc xc" type="float" coordinateSystems="implicit geo pressure">
  <nc:attribute name="long_name" type="string" value="temperature"/>
  <nc:attribute name="units" type="string" value="K"/>
  <nc:attribute name="coordinates" type="string" value="lat lon"/>
</nc:variable>
Aggregation in NcML
• XML naturally suited to represent aggregation of netcdf data
• Rules for representing an aggregation hierarchy:
  – Allow netcdf nodes to contain other netcdf nodes
  – Factor out (i.e. in the parent netcdf node) all common structure between two nodes
  – Structure defined in a netcdf node overrides that defined in a parent netcdf node
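The inheritance and override rules above can be sketched as a small recursive resolver. This is an illustration only: the function names and the choice to key structure by (element name, name attribute) are assumptions, not part of the NcML schemas.

```python
import xml.etree.ElementTree as ET

DOC = """
<nc:netcdf xmlns:nc="http://www.ucar.edu/schemas/netcdf">
  <nc:dimension name="lat" length="64"/>
  <nc:dimension name="time" length="6"/>
  <nc:netcdf uri="file1.nc">
    <nc:dimension name="time" length="3"/>
  </nc:netcdf>
  <nc:netcdf uri="file2.nc"/>
</nc:netcdf>
"""


def local_name(elem):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return elem.tag.split("}", 1)[-1]


def resolve(node, inherited=None):
    """Return {leaf uri: effective structure} for an aggregation hierarchy.

    Structure is a dict keyed by (element name, name attribute). A node starts
    from its parent's structure, and its own definitions override inherited ones.
    """
    structure = dict(inherited or {})
    nested = []
    for child in node:
        if local_name(child) == "netcdf":
            nested.append(child)  # rule 1: netcdf nodes may contain netcdf nodes
        else:
            # rule 3: a definition in this node overrides the parent's
            structure[(local_name(child), child.get("name"))] = dict(child.attrib)
    if not nested:
        return {node.get("uri"): structure}
    resolved = {}
    for child in nested:
        # rule 2: structure factored out in the parent is passed down
        resolved.update(resolve(child, structure))
    return resolved


effective = resolve(ET.fromstring(DOC))
print(effective["file1.nc"][("dimension", "time")]["length"])
```

In this example file1.nc redefines the time dimension and so sees its own length, while file2.nc defines nothing and inherits everything from the parent.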
NcML aggregation over existing coordinate (time)
<nc:netcdf>
  <nc:dimension name="lat" length="64" />
  <nc:dimension name="lon" length="128" />
  <nc:dimension name="time" length="6" />
  <nc:variable name="temperature" shape="lat lon time" />
  <nc:variable name="humidity" shape="lat lon time" />
  <nc:netcdf uri="file1.nc">
    <nc:dimension name="time" length="3" />
    <nc:coordinateVariable name="time" shape="time">
      <nc:values separator=" ">10 20 30</nc:values>
    </nc:coordinateVariable>
  </nc:netcdf>
  <nc:netcdf uri="file2.nc">
    <nc:dimension name="time" length="3" />
    <nc:coordinateVariable name="time" shape="time">
      <nc:values separator=" ">40 50 60</nc:values>
    </nc:coordinateVariable>
  </nc:netcdf>
</nc:netcdf>
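One way a client might interpret this document is to concatenate the time values of the nested files and check the result against the length declared in the parent. The sketch below is an assumption about how such a reader could work, not a description of an existing tool; aggregate_coordinate is a hypothetical name, and the namespace declaration is added so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

NS = {"nc": "http://www.ucar.edu/schemas/netcdf"}

DOC = """
<nc:netcdf xmlns:nc="http://www.ucar.edu/schemas/netcdf">
  <nc:dimension name="time" length="6"/>
  <nc:netcdf uri="file1.nc">
    <nc:dimension name="time" length="3"/>
    <nc:coordinateVariable name="time" shape="time">
      <nc:values separator=" ">10 20 30</nc:values>
    </nc:coordinateVariable>
  </nc:netcdf>
  <nc:netcdf uri="file2.nc">
    <nc:dimension name="time" length="3"/>
    <nc:coordinateVariable name="time" shape="time">
      <nc:values separator=" ">40 50 60</nc:values>
    </nc:coordinateVariable>
  </nc:netcdf>
</nc:netcdf>
"""


def aggregate_coordinate(root, name):
    """Concatenate a coordinate over nested files; return (values, sources)."""
    declared = None
    for dim in root.findall("nc:dimension", NS):
        if dim.get("name") == name:
            declared = int(dim.get("length"))
    values, sources = [], []
    for child in root.findall("nc:netcdf", NS):
        for cv in child.findall("nc:coordinateVariable", NS):
            if cv.get("name") != name:
                continue
            node = cv.find("nc:values", NS)
            separator = node.get("separator", " ")
            chunk = [float(v) for v in node.text.split(separator) if v]
            values.extend(chunk)
            # Remember which file each coordinate value comes from.
            sources.extend([child.get("uri")] * len(chunk))
    if declared is not None and len(values) != declared:
        raise ValueError(f"{name}: declared length {declared}, found {len(values)}")
    return values, sources


values, sources = aggregate_coordinate(ET.fromstring(DOC), "time")
print(values)
```

The sources list records which file holds each time step, which is the information a data access layer would need to route a request for a given time range.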
NcML aggregation over variables
<nc:netcdf>
  <nc:dimension name="lat" length="64" />
  <nc:dimension name="lon" length="128" />
  <nc:netcdf uri="file1.nc">
    <nc:variable name="temperature" shape="lat lon" />
  </nc:netcdf>
  <nc:netcdf uri="file2.nc">
    <nc:variable name="humidity" shape="lat lon" />
  </nc:netcdf>
</nc:netcdf>
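For aggregation over variables, a reader mainly needs to know which file provides each variable. A minimal sketch, assuming each variable appears in exactly one nested node (variable_sources is a hypothetical name, and the namespace declaration is added so that the fragment parses on its own):

```python
import xml.etree.ElementTree as ET

NS = {"nc": "http://www.ucar.edu/schemas/netcdf"}

DOC = """
<nc:netcdf xmlns:nc="http://www.ucar.edu/schemas/netcdf">
  <nc:dimension name="lat" length="64"/>
  <nc:dimension name="lon" length="128"/>
  <nc:netcdf uri="file1.nc">
    <nc:variable name="temperature" shape="lat lon"/>
  </nc:netcdf>
  <nc:netcdf uri="file2.nc">
    <nc:variable name="humidity" shape="lat lon"/>
  </nc:netcdf>
</nc:netcdf>
"""


def variable_sources(root):
    """Map each variable name to the uri of the nested node that defines it."""
    sources = {}
    for child in root.findall("nc:netcdf", NS):
        for var in child.findall("nc:variable", NS):
            name = var.get("name")
            if name in sources:
                # Assumption of this sketch: one providing file per variable.
                raise ValueError(f"variable defined twice: {name}")
            sources[name] = child.get("uri")
    return sources


print(variable_sources(ET.fromstring(DOC)))
```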
NcML double aggregation
<nc:netcdf>
  <nc:dimension name="lat" length="64" />
  <nc:dimension name="lon" length="128" />
  <nc:dimension name="time" length="6" />
  <nc:netcdf uri="temp/">
    <nc:variable name="temperature" shape="lat lon time" />
    <nc:netcdf uri="file1.nc">
      <nc:dimension name="time" length="3" />
      <nc:coordinateVariable name="time" shape="time">
        <nc:values separator=" ">10 20 30</nc:values>
      </nc:coordinateVariable>
    </nc:netcdf>
    <nc:netcdf uri="file2.nc">
      <nc:dimension name="time" length="3" />
      <nc:coordinateVariable name="time" shape="time">
        <nc:values separator=" ">40 50 60</nc:values>
      </nc:coordinateVariable>
    </nc:netcdf>
  </nc:netcdf>
  <nc:netcdf uri="humid/">
    <nc:variable name="humidity" shape="lat lon time" />
    <nc:netcdf uri="file1.nc">
      <nc:dimension name="time" length="3" />
      <nc:coordinateVariable name="time" shape="time">
        <nc:values separator=" ">10 20 30</nc:values>
      </nc:coordinateVariable>
    </nc:netcdf>
    <nc:netcdf uri="file2.nc">
      <nc:dimension name="time" length="3" />
      <nc:coordinateVariable name="time" shape="time">
        <nc:values separator=" ">40 50 60</nc:values>
      </nc:coordinateVariable>
    </nc:netcdf>
  </nc:netcdf>
</nc:netcdf>
Other NcML planned development
• Subsetting of data
• Compute derived data
• Extensions for interoperability with openGIS and ISO standards:
  – Establish a bond between Atmospheric Research and Geo-spatial communities
  – Allow import of NcML data into GIS tools, export of GIS data in netcdf format
Conclusions
• ESG is very active in the research and development of metadata schemas, services and technologies
• We are very interested in collaborating with other projects and institutions on the definition and adoption of metadata standards for the geosciences, and in working on interoperability technologies among standards