M.Lautenschlager (WDCC / MPI-M) / 15.06.06 / 1 GO-ESSP at LLNL Livermore, June 19th – 21st, 2006 World Data Center Climate: Status and Portal Integration WDCC Home: www.wdcc-climate.de / WDCC Contact: [email protected] Michael Lautenschlager, Hannes Thiemann and Frank Toussaint ICSU World Data Center Climate Model and Data / Max-Planck-Institute for Meteorology Hamburg, Germany

Page 1: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

M.Lautenschlager (WDCC / MPI-M) / 15.06.06 / 1

GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

World Data Center Climate: Status and Portal Integration

WDCC Home: www.wdcc-climate.de / WDCC Contact: [email protected]

Michael Lautenschlager, Hannes Thiemann and Frank Toussaint

ICSU World Data Center Climate
Model and Data / Max-Planck-Institute for Meteorology

Hamburg, Germany

Page 2: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

Content:

WDCC Status

CERA Concept

Portal Integration

Page 3: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

WDCC Content

Data from Earth System Modelling and Related Observations

Data collections shown on the slide:

  • ERA40
  • IPCC
  • CEOP
  • BALTEX
  • HOAPS
  • CARIBIC
  • WOCE
  • ERA15/40
  • NCEP
  • GEBCO
  • COSMOS
  • EH5/MPI-OM IPCC-AR4
  • Simulations @ MPI, GKSS, …

Start: Approved in January 2003
Maintenance: Model and Data (M&D/MPI-M) and German Climate Computing Centre (DKRZ)

June 2006: 590 experiments / 79,000 data sets

Page 4: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

Data Export from WDC Climate

Corresponds to 2 – 10 TB/month

Page 5: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

Geographical Distribution of WDCC Users

Total number of registered users: 750 (May 2006)

Page 6: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

Data Import into WDC Climate

6 * 10**9 BLOBs

ECHAM5/MPI-OM IPCC AR4 Scenarios (ca. 110 TB)

Page 7: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

CERA 1) Concept: Semantic Data Management

(I) Data catalogue and pointer to Unix files
  • Enable search and identification of data
  • Allow for data access as they are (coarse-granularity raw data files)

(II) Application-oriented data storage in BLOB tables
  • Time series of individual variables are stored as BLOB entries in DB tables (fine-granularity data products)
  • Allow for fast and selective data access
  • Storage in standard data formats (GRIB, NetCDF/CF)
  • Allow for application of standard data processing routines (PINGOs, CDOs)

1) Climate and Environmental data Retrieval and Archiving
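The two storage levels can be illustrated with a small sketch. It is not the CERA implementation: SQLite stands in for the database, packed 32-bit floats stand in for GRIB or NetCDF/CF records, and the table names, the dataset name and the file path are all hypothetical. It only shows the idea of a catalogue entry pointing at a raw file next to a per-variable BLOB table that can be read selectively by time step.

```python
import sqlite3
import struct


def pack_record(values):
    """Pack one flattened 2D field as little-endian 32-bit floats (stand-in for GRIB)."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_record(blob):
    """Inverse of pack_record."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


conn = sqlite3.connect(":memory:")

# Level (I): catalogue entry with a pointer to a raw Unix file (hypothetical path).
conn.execute("CREATE TABLE catalogue (dataset TEXT PRIMARY KEY, raw_file TEXT)")
conn.execute(
    "INSERT INTO catalogue VALUES (?, ?)",
    ("demo_T2M", "/archive/demo/run_0001.grb"),
)

# Level (II): one BLOB table per variable, one row per time step.
conn.execute("CREATE TABLE blob_t2m (time_step INTEGER PRIMARY KEY, record BLOB)")
for step in range(5):
    field = [280.0 + step + 0.5 * i for i in range(4)]  # toy 2x2 grid
    conn.execute("INSERT INTO blob_t2m VALUES (?, ?)", (step, pack_record(field)))


def read_steps(connection, first, last):
    """Selective access: fetch only the requested range of time steps."""
    rows = connection.execute(
        "SELECT time_step, record FROM blob_t2m "
        "WHERE time_step BETWEEN ? AND ? ORDER BY time_step",
        (first, last),
    )
    return [(t, unpack_record(b)) for t, b in rows]


selected = read_steps(conn, 2, 3)
```

Reading two time steps touches only two rows of one table, which is the sense in which the fine-granularity level allows fast and selective access.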

Page 8: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

WDCC Data Topology

Experiment Description
  • Pointer to Unix files
  • Dataset 1 Description: BLOB Data Table
  • Dataset n Description: BLOB Data Table

Level 1 interface: metadata entries (XML, ASCII) + data files

Level 2 interface: separate files containing BLOB table data in application-adapted structure (time series of single variables)

A BLOB DB table corresponds to a scalable, virtual file at the operating system level.

Page 9: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

CERA Data Model

Entry

Reference

Status

Distribution

Contact

Coverage

Parameter

Spatial Reference

Local Adm.

Data Access

Data Org

Page 10: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

Data matrix of model experiment

Axes: model variables (columns) × model run time (rows)

Columns: 2D variables (T2M, Precip, SLP, . .) and 3D variables (Temp, Water vapour, . .)

Rows: T1, T2, T3, ..., Tn, ..., Tend

Each column is one BLOB table in CERA-DB
  • 2D: small BLOBs (180 KB)
  • 3D: large BLOBs (3 MB)

Raw data file in DKRZ archive: direct model output (1.3 – 16.2 GB)
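The data matrix amounts to a regrouping step: raw model output arrives ordered by run time, while the database keeps one time series per variable. A minimal sketch of that regrouping follows, assuming each raw file holds all variables for one time step. The `raw_files` structure, its field values and the function name are made up for illustration and do not describe the actual file layout.

```python
from collections import defaultdict

# Hypothetical raw output: one entry per time step, each holding every variable.
raw_files = [
    {"time": "T1", "fields": {"T2M": [1.0, 2.0], "SLP": [1010.0, 1011.0]}},
    {"time": "T2", "fields": {"T2M": [1.5, 2.5], "SLP": [1009.0, 1012.0]}},
    {"time": "T3", "fields": {"T2M": [2.0, 3.0], "SLP": [1008.0, 1013.0]}},
]


def to_columns(files):
    """Regroup time-ordered raw files into one time series per variable."""
    columns = defaultdict(list)
    for raw in files:
        for name, field in raw["fields"].items():
            # Each appended pair plays the role of one BLOB in that variable's table.
            columns[name].append((raw["time"], field))
    return dict(columns)


columns = to_columns(raw_files)
```

After the regrouping, a request for one variable reads a single column instead of opening every raw file.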

Page 11: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

Climate Model Data Structures

Original data structure (4-D)

Application-related data structure (2-D)

Preferred DB-storage structure for web-based access:
  • single variable
  • single level
  • time series of 2D gridded data records
  • Formats: GRIB-1 – NetCDF/CF (- GRIB-2)

Page 12: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

DKRZ Architecture

TX7: Intel Itanium-2 with Linux

Page 13: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

Portal Integration

Two strategies:

  • One-way integration: discovery and use metadata are integrated in a central data portal in one step.
    Example: C3Grid data catalogue (refer to the presentation from Heinrich Widmann)

  • Two-way integration: discovery metadata are integrated in a central data portal; use metadata are extracted from the remote archive when they are needed for data download and processing.
    Examples: primary data publication in the TIB library catalogue (STD-DOI); WDCC integration in NDG (NERC Data Grid)
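The difference between the two strategies is where the use metadata live at the moment a user asks for them. The sketch below models that under simplifying assumptions: the `Portal` and `PortalEntry` classes, the dataset identifiers and the metadata keys are all hypothetical, and a plain dictionary stands in for the remote archive.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Metadata = Dict[str, str]


@dataclass
class PortalEntry:
    discovery: Metadata              # what the portal needs for search
    use: Optional[Metadata] = None   # None: ask the remote archive when needed


class Portal:
    def __init__(self, fetch_use_from_archive: Callable[[str], Metadata]):
        self.entries: Dict[str, PortalEntry] = {}
        self.fetch_use_from_archive = fetch_use_from_archive

    def integrate_one_way(self, dataset_id, discovery, use):
        """Discovery and use metadata are copied into the portal in one step."""
        self.entries[dataset_id] = PortalEntry(discovery, use)

    def integrate_two_way(self, dataset_id, discovery):
        """Only discovery metadata are copied; use metadata stay at the archive."""
        self.entries[dataset_id] = PortalEntry(discovery)

    def search(self, keyword):
        return sorted(
            dataset_id
            for dataset_id, entry in self.entries.items()
            if keyword.lower() in entry.discovery.get("title", "").lower()
        )

    def use_metadata(self, dataset_id):
        entry = self.entries[dataset_id]
        if entry.use is not None:
            return entry.use
        # Two-way case: go back to the archive at download or processing time.
        return self.fetch_use_from_archive(dataset_id)


# A dictionary stands in for the remote archive.
archive = {"ds-b": {"format": "GRIB", "access": "login required"}}
portal = Portal(fetch_use_from_archive=archive.__getitem__)
portal.integrate_one_way(
    "ds-a", {"title": "Demo run A"}, {"format": "NetCDF/CF", "access": "open"}
)
portal.integrate_two_way("ds-b", {"title": "Demo run B"})
```

Both datasets are equally discoverable through `search`; only the second one needs the archive to be reachable when `use_metadata` is called.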

Page 14: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

Primary data publication (STD-DOI)
URL: http://www.std-doi.de/

Data Review

Primary Data Publication Process

ISO 690-2: Metadata for citation of electronic media

Page 15: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

Example: Publ.-DOI from WDCC

Page 16: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

DOI / URN

Page 17: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

Publ.-DOI

Page 18: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

830 GB

Page 19: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

The data retrieval procedure is given at the end (user identification is required)

Ident.-DOI

Page 20: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

WDCC Metadata and OAI-PMH

Open Archives Initiative Protocol for Metadata Harvesting

Page 21: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

WDCC OAI server at: http://uranus.dkrz.de:8080/oai/provider

(Software: dlese (www.dlese.org) + apache-tomcat 5.5.12 + Java 1.5)

- 35 IPCC experiments with more than 11000 datasets
  Metadata format: ISO 19115
  C3Grid (http://gsphere.awi.de:8080/gridsphere/gridsphere)

- 40 STD-DOI experiments with more than 1700 datasets
  Metadata format: DIF
  GO-ESSP (NDG, http://ndg.badc.rl.ac.uk/)
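An OAI-PMH provider is queried with plain HTTP requests that carry a `verb` parameter. The sketch below only builds such request URLs against the provider address given on the slide and makes no network call. The verbs `Identify`, `ListMetadataFormats` and `ListRecords` and the `metadataPrefix` argument come from the OAI-PMH specification; the prefix value `dif` is an assumption, since the slide does not say under which prefix the server exposes its DIF records (a `ListMetadataFormats` request would reveal that).

```python
from urllib.parse import urlencode

BASE_URL = "http://uranus.dkrz.de:8080/oai/provider"  # provider address from the slide


def oai_request(base_url, verb, **params):
    """Build an OAI-PMH request URL; arguments set to None are left out."""
    query = {"verb": verb}
    query.update({key: value for key, value in params.items() if value is not None})
    return base_url + "?" + urlencode(query)


identify_url = oai_request(BASE_URL, "Identify")
formats_url = oai_request(BASE_URL, "ListMetadataFormats")
# "dif" is a hypothetical metadataPrefix value, not confirmed by the slide.
records_url = oai_request(BASE_URL, "ListRecords", metadataPrefix="dif")
```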

Page 22: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

NDG

[Diagram: OAI harvesting (pull or notification)]

Providers:
  • DIF XMLs (WDCC), served by OAI Server WDCC (software: dlese)
  • DIF XMLs (provider 2), served by OAI Server 2
  • ... OAI Server n

Harvester: OAI Client NDG (dlese)

Catalog NDG: record 1...n

Discovery Portal NDG

Process

Delivery
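On the harvesting side, a pull client repeatedly issues `ListRecords` requests and follows the `resumptionToken` until the provider returns an empty one. The sketch below shows that loop with the standard library only. The two response pages are hand-written samples shaped after the OAI-PMH 2.0 schema, with hypothetical identifiers and empty metadata elements; they are not output of the WDCC or NDG servers, and a dictionary lookup replaces the HTTP fetch.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample pages with hypothetical identifiers.
PAGE_1 = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:dataset-1</identifier></header><metadata/></record>
    <record><header><identifier>oai:example.org:dataset-2</identifier></header><metadata/></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

PAGE_2 = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:dataset-3</identifier></header><metadata/></record>
    <resumptionToken/>
  </ListRecords>
</OAI-PMH>"""


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) for one response page."""
    root = ET.fromstring(xml_text)
    ids = [e.text for e in root.findall(".//oai:header/oai:identifier", OAI_NS)]
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=OAI_NS)
    return ids, (token or None)


def harvest(fetch_page):
    """Pull all pages; fetch_page(token) returns the XML text for that token."""
    ids, token = [], None
    while True:
        page_ids, token = parse_list_records(fetch_page(token))
        ids.extend(page_ids)
        if token is None:  # an empty resumptionToken marks the last page
            return ids


pages = {None: PAGE_1, "page-2": PAGE_2}
harvested = harvest(pages.get)
```

In a real client, `fetch_page` would issue the HTTP request and the harvested records would then be written into the portal catalogue.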

Page 23: GO-ESSP at LLNL Livermore, June 19th – 21st, 2006

URL: http://glue.badc.rl.ac.uk/discovery/
Keyword: ECHAM4