GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

World Data Center Climate: Status and Portal Integration. Michael Lautenschlager, Hannes Thiemann and Frank Toussaint, ICSU World Data Center Climate, Model and Data / Max-Planck-Institute for Meteorology, Hamburg, Germany. GO-ESSP at LLNL Livermore, June 19th – 21st, 2006.

M.Lautenschlager (WDCC / MPI-M) / 15.06.06 / 1

GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

World Data Center Climate: Status and Portal Integration

WDCC Home: www.wdcc-climate.de / WDCC Contact: data@dkrz.de

Michael Lautenschlager, Hannes Thiemann and Frank Toussaint

ICSU World Data Center Climate, Model and Data / Max-Planck-Institute for Meteorology

Hamburg, Germany

Content:

WDCC Status

CERA Concept

Portal Integration

WDCC Content

ERA40

IPCC

CEOP

BALTEX

HOAPS

CARIBIC

WOCE

ERA15/40

NCEP

GEBCO

COSMOS

Simulations @ MPI, GKSS,…

Data from Earth System Modelling and Related Observations

EH5/MPI-OM IPCC-AR4

Start: Approved in January 2003
Maintenance: Model and Data (M&D/MPI-M) and German Climate Computing Centre (DKRZ)

June 2006: 590 experiments / 79,000 data sets

Data Export from WDC Climate

Corresponds to 2 – 10 TB/month

Geographical Distribution of WDCC Users

Total number of registered users: 750 (May 2006)

Data Import into WDC Climate

6 * 10**9 BLOBs

ECHAM5/MPI-OM IPCC AR4 Scenarios (ca. 110 TB)

CERA (Climate and Environmental data Retrieval and Archiving) Concept: Semantic Data Management

(I) Data catalogue and pointer to Unix files
• Enable search and identification of data
• Allow for data access as they are (coarse granularity raw data files)

(II) Application-oriented data storage in BLOB tables
• Time series of individual variables are stored as BLOB entries in DB tables (fine granularity data products)
• Allow for fast and selective data access
• Storage in standard data format (GRIB, NetCDF/CF)
• Allow for application of standard data processing routines (PINGOs, CDOs)

WDCC Data Topology

Level 1 interface: metadata entries (XML, ASCII) + data files

Level 2 interface: separate files containing BLOB table data in an application-adapted structure (time series of single variables)

[Figure: WDCC data topology. Labels: Experiment Description; Pointer to Unix Files; Dataset 1 Description ... Dataset n Description; BLOB Data Table; BLOB Data Table]

A BLOB DB table corresponds to a scalable, virtual file at the operating system level.
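
The two-level topology can be pictured with a small relational sketch. What follows is only an illustration using Python's built-in sqlite3 module, not the actual CERA schema: the table names (experiment, dataset, blob_data), the column names, the file path and the sample values are all hypothetical. It assumes one BLOB row per time step of a single variable, so that a time range can be read selectively.

```python
import sqlite3
import struct


def build_demo_db():
    """Create an in-memory database with a hypothetical two-level layout."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE experiment (
            exp_id INTEGER PRIMARY KEY,
            name TEXT,
            raw_file_pointer TEXT      -- level 1: pointer to a Unix file
        );
        CREATE TABLE dataset (
            ds_id INTEGER PRIMARY KEY,
            exp_id INTEGER REFERENCES experiment(exp_id),
            variable TEXT
        );
        CREATE TABLE blob_data (       -- level 2: one row per time step
            ds_id INTEGER REFERENCES dataset(ds_id),
            time_step INTEGER,
            record BLOB,
            PRIMARY KEY (ds_id, time_step)
        );
    """)
    # Hypothetical sample entries, for illustration only.
    con.execute(
        "INSERT INTO experiment VALUES (1, 'demo_run', '/archive/demo_run.raw')"
    )
    con.execute("INSERT INTO dataset VALUES (1, 1, 'T2M')")
    for t in range(10):
        # Stand-in for a 2D gridded record: four packed 32-bit floats.
        payload = struct.pack("<4f", *([float(t)] * 4))
        con.execute("INSERT INTO blob_data VALUES (1, ?, ?)", (t, payload))
    con.commit()
    return con


def read_time_range(con, variable, first, last):
    """Selective access: fetch only the requested time steps of one variable."""
    rows = con.execute(
        """SELECT b.time_step, b.record
           FROM blob_data b JOIN dataset d ON d.ds_id = b.ds_id
           WHERE d.variable = ? AND b.time_step BETWEEN ? AND ?
           ORDER BY b.time_step""",
        (variable, first, last),
    ).fetchall()
    return [(t, struct.unpack("<4f", rec)) for t, rec in rows]


con = build_demo_db()
subset = read_time_range(con, "T2M", 3, 5)
```

Here the query touches only three BLOB rows of one variable, which is the kind of fine-granularity access the level 2 interface is meant to allow, while the experiment row still points at the coarse raw file.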

CERA Data Model

[Figure: CERA data model. Blocks: Entry, Reference, Status, Distribution, Contact, Coverage, Parameter, Spatial Reference, Local Adm., Data Access, Data Org]

Data matrix of model experiment

[Figure: data matrix. One axis: model variables (2D variables: T2M, Precip, SLP, ...; 3D variables: Temp, Water vapour, ...). Other axis: model run time (T1, T2, T3, ..., Tn, ..., Tend). Further label: raw data file in DKRZ archive]

2D: small BLOBs (180 KB)

3D: large BLOBs (3 MB)

Raw data file: direct model output (1.3 – 16.2 GB)

Each column is one BLOB table in CERA-DB
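
The reorganization implied by the data matrix, from time-ordered raw output to one time series per variable, can be sketched in a few lines. This is only an illustration: it assumes that each raw file holds all variables for one time interval, which the slide suggests but does not state, and the function name, the record contents and the input layout are hypothetical.

```python
from collections import defaultdict


def transpose_to_columns(raw_files):
    """Reorganize time-ordered raw output into one time series per variable.

    raw_files: list of (time_label, {variable_name: record}) pairs, one pair
               per raw model output file (hypothetical layout).
    Returns {variable_name: [(time_label, record), ...]} in input order.
    """
    columns = defaultdict(list)
    for time_label, fields in raw_files:
        for variable, record in fields.items():
            columns[variable].append((time_label, record))
    return dict(columns)


# Placeholder byte strings stand in for the gridded records.
raw_files = [
    ("T1", {"T2M": b"t2m-1", "Precip": b"pr-1", "SLP": b"slp-1"}),
    ("T2", {"T2M": b"t2m-2", "Precip": b"pr-2", "SLP": b"slp-2"}),
    ("T3", {"T2M": b"t2m-3", "Precip": b"pr-3", "SLP": b"slp-3"}),
]
columns = transpose_to_columns(raw_files)
```

Each key of `columns` then plays the role of one BLOB table: reading a single variable over time no longer requires opening every raw file.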

Climate Model Data Structures

Application related data structure (2-D) / original data structure (4-D)

Preferred DB-storage structure for web-based access:
• single variable
• single level
• time series of 2D gridded data records
• Formats: GRIB-1 – NetCDF/CF (- GRIB-2)

DKRZ Architecture

[Figure: DKRZ architecture diagram. Recoverable label: TX7: Intel Itanium-2 with Linux]

Portal Integration

Two strategies:

One-way integration: discovery and use metadata are integrated in a central data portal in one step.
Example: C3Grid data catalogue (refer to presentation from Heinrich Widmann)

Two-way integration: discovery metadata are integrated in a central data portal; use metadata are extracted from the remote archive when they are needed for data download and processing.
Examples: Primary data publication in TIB library catalogue (STD-DOI); WDCC integration in NDG (NERC Data Grid)
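
The difference between the two strategies can be made concrete with a toy model. The sketch below is purely illustrative and describes none of the named systems: the classes Portal and RemoteArchive, their methods, and the sample metadata are hypothetical. The only point is where the use metadata live and when they are fetched.

```python
class RemoteArchive:
    """Stand-in for a data archive that keeps the use metadata itself."""

    def __init__(self, use_metadata):
        self._use_metadata = use_metadata
        self.requests = 0  # counts on-demand lookups from the portal

    def get_use_metadata(self, dataset_id):
        self.requests += 1
        return self._use_metadata[dataset_id]


class Portal:
    """Stand-in for a central data portal."""

    def __init__(self):
        self.discovery = {}  # dataset_id -> discovery metadata
        self.use = {}        # dataset_id -> use metadata (one-way only)
        self.archives = {}   # dataset_id -> remote archive (two-way only)

    def integrate_one_way(self, dataset_id, discovery, use):
        # Both kinds of metadata are copied to the portal in one step.
        self.discovery[dataset_id] = discovery
        self.use[dataset_id] = use

    def integrate_two_way(self, dataset_id, discovery, archive):
        # Only discovery metadata are copied; use metadata stay remote.
        self.discovery[dataset_id] = discovery
        self.archives[dataset_id] = archive

    def use_metadata_for(self, dataset_id):
        if dataset_id in self.use:
            return self.use[dataset_id]
        # Two-way case: ask the remote archive only when needed.
        return self.archives[dataset_id].get_use_metadata(dataset_id)


archive = RemoteArchive({"ds2": {"format": "GRIB"}})
portal = Portal()
portal.integrate_one_way("ds1", {"title": "A"}, {"format": "NetCDF"})
portal.integrate_two_way("ds2", {"title": "B"}, archive)
one_way = portal.use_metadata_for("ds1")
two_way = portal.use_metadata_for("ds2")
```

Both datasets are discoverable in the portal, but only the two-way dataset triggers a request to the archive at download time.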

Primary data publication (STD-DOI)
URL: http://www.std-doi.de/

Data Review

Primary Data Publication Process

ISO 690-2: Metadata for citation of electronic media

Example: Publ.-DOI from WDCC

DOI, URN

Publ.-DOI

830 GB

The data retrieval procedure is given at the end (user identification is required)

Ident.-DOI

WDCC Metadata and OAI-PMH

Open Archives Initiative Protocol for Metadata Harvesting

WDCC OAI server at: http://uranus.dkrz.de:8080/oai/provider

(Software: dlese (www.dlese.org) + apache-tomcat 5.5.12 + Java 1.5)

- 35 IPCC experiments with more than 11000 datasets
  Metadata format: ISO 19115
  C3Grid (http://gsphere.awi.de:8080/gridsphere/gridsphere)

- 40 STD-DOI experiments with more than 1700 datasets
  Metadata format: DIF
  GO-ESSP (NDG, http://ndg.badc.rl.ac.uk/)
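
For readers unfamiliar with OAI-PMH, a harvesting request is an HTTP query with a `verb` parameter plus verb-specific arguments. The sketch below only builds such request URLs against the provider address given above and sends nothing. The helper name `oai_request_url` is hypothetical. The example uses the `oai_dc` metadata prefix, which the protocol requires every repository to support; the prefixes this server used for DIF and ISO 19115 are not given in the slides and would have to be looked up with a `ListMetadataFormats` request.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL (hypothetical helper, no network access)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    # Put the verb first, then the remaining arguments in a stable order.
    query = [("verb", verb)] + sorted(arguments.items())
    return base_url + "?" + urlencode(query)


BASE = "http://uranus.dkrz.de:8080/oai/provider"
identify_url = oai_request_url(BASE, "Identify")
formats_url = oai_request_url(BASE, "ListMetadataFormats")
records_url = oai_request_url(BASE, "ListRecords", metadataPrefix="oai_dc")
```

A harvesting client would typically issue `Identify` and `ListMetadataFormats` first, then page through `ListRecords` with the prefix it wants.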

NDG

[Figure: OAI harvesting between data providers and NDG. Labels: DIF XMLs WDCC; OAI Server WDCC (Software: dlese); DIF XMLs Provider 2; OAI Server 2; OAI Server n; OAI Harvesting (Pull or Notification); OAI Client NDG (dlese); Catalog NDG record 1...n; Discovery Portal NDG; Process; Delivery]

URL: http://glue.badc.rl.ac.uk/discovery/
Keyword: ECHAM4