Data Discovery and Basic Processing within the German Collaborative Climate Community Data and Processing Grid (C3Grid) Project. Heinrich Widmann and Stephan Kindermann, Model and Data / DKRZ / Max-Planck-Institute for Meteorology, Hamburg, Germany. GO-ESSP at LLNL.
H. Widmann (M&D) Data Discovery and Processing within C3Grid GO-ESSP/LLNL / June, 19th 2006 / 1
Data Discovery and Basic Processing within the German Collaborative Climate Community Data and Processing Grid (C3Grid) Project
Heinrich Widmann and Stephan Kindermann
Model and Data / DKRZ / Max-Planck-Institute for Meteorology
Hamburg, Germany
GO-ESSP at LLNL
Livermore, June 19th – 21st, 2006
C3Grid Home: www.c3grid.de
Overview
• C3Grid Background
• Data Analysis Workflows
• C3Grid Architecture and Interfaces
• Data Discovery and Metadata in C3Grid
• Data Information Service with Lucene
• Data Access and Preprocessing
• Summary
C3Grid Background
• C3Grid
– Status: month 10 of 36 (phase 1)
– is the earth system science community grid within the German D-Grid initiative
– D-Grid includes five further community grid projects (AstroGrid, HEP-Grid, InGrid, MediGrid, TextGrid)
– is a community-driven grid
• The goal is to develop a grid infrastructure appropriate for typical climate analysis workflows
• Stepwise introduction and integration
C3Grid Data Analysis Workflow Requirements
Requirements
• Metadata
• Discovery
• Data access (+ preprocessing)
• Security
• Scheduling
• Complex processing
Grid technologies
• ISO 19115 / ISO 19139
• OAI-PMH + Lucene
• Community webservice
• Shibboleth
• Globus Toolkit 4
• WS-GRAM
C3Grid Architecture and Interfaces
[Figure: architecture diagram with two highlighted areas: Data Discovery; Data Access and Basic Processing]
C3Grid Data Discovery and Data Access
[Figure: data discovery and data access paths. Recoverable labels:]
• Data providers: World Data Centers (Climate, Mare, RSAT), DWD, PIK, IFM-Geomar, ..
• Provider-side storage: DB, Files; Prop. Xml, Prop. Rel.
• ISO 19115 / 19139 metadata: Discovery, Use
• Web server / OAI provider → OAI-PMH → OAI harvester → C3 Metadata catalog
• Portal: Discovery, Workflow composition
• Scheduling, Data Management Service
• Data request to the Data Access Web Service:
– OIDs
– time/space constraints
– processing constraints
• Preprocessing → data in workspaces (resource provider)
• Job submission via WS-GRAM; analysis job
• Grid Infrastructure Metadata
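The harvesting path in the figure (OAI provider, OAI-PMH, OAI harvester) comes down to HTTP requests that carry a `verb` parameter. The following is a minimal Python sketch of how a harvester might build a `ListRecords` request. The base URL and the `iso19139` metadata prefix are assumptions made for illustration and do not come from the slides.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (nothing is sent over the network)."""
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument: it replaces
        # metadataPrefix and the date range on follow-up requests.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            params["from"] = from_date  # incremental harvest since this date
    return base_url + "?" + urlencode(params)


# Hypothetical provider endpoint, for illustration only.
url = list_records_url("http://provider.example.org/oai", "iso19139", from_date="2006-01-01")
print(url)
```

A harvester would repeat the request with the returned resumption token until the provider stops sending one.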
C3 ISO 19139 Metadata “Profile”
<MD_Metadata … "http://www.isotc211.org/xxx">
<fileIdentifier ../>
<resourceConstraints ../>
<extent … spatial+temporal bounding box .. />
<contentInfo ..>
<attributeDescription ../>
<distributionInfo ..>
<DS_Series>
<composed_of>
<composed_of>
</MD_Metadata>
<MD_Metadata …. >
<MD_Metadata …. >
Data items:
• Gridded data
• Raw experiment data: 3D multi-variable files
• Post-processed experiment data: 2D single-variable time series
[Figure labels: Metadata Database, Archive Database, “implicit” Metadata, Metadata, Post-processing]
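The profile above pairs a series-level record with the dataset records it is composed of. A minimal Python sketch of that two-level structure follows. The class names, field names and identifiers are hypothetical simplifications of the ISO 19139 elements named on the slide, not the C3Grid schema itself.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DatasetRecord:
    """Simplified stand-in for one MD_Metadata record."""
    file_identifier: str
    bbox: Tuple[float, float, float, float]  # (west, south, east, north)
    time_range: Tuple[str, str]              # (start, end) as ISO 8601 dates
    variables: List[str] = field(default_factory=list)


@dataclass
class SeriesRecord:
    """Simplified stand-in for a DS_Series that is composed of datasets."""
    file_identifier: str
    composed_of: List[DatasetRecord] = field(default_factory=list)

    def extent(self):
        """Return the bounding box that covers all member datasets."""
        boxes = [d.bbox for d in self.composed_of]
        return (min(b[0] for b in boxes), min(b[1] for b in boxes),
                max(b[2] for b in boxes), max(b[3] for b in boxes))


# Toy records with made-up identifiers.
experiment = SeriesRecord("EXP:1", [
    DatasetRecord("EXP:1:a", (-30.0, 30.0, 10.0, 60.0), ("1990-01-01", "1999-12-31"), ["air_temperature"]),
    DatasetRecord("EXP:1:b", (0.0, 40.0, 60.0, 75.0), ("1995-01-01", "2005-12-31"), ["precipitation_flux"]),
])
print(experiment.extent())
```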
C3Grid Data Information Service with Lucene
[Figure: Data Information Service (DIS) architecture. Recoverable labels:]
• Sources: CERA, Pangaea, Archive; Webserver; OAI-PMH
• DIS harvesting backend
• Cache for ISO 19139 documents (<MD_Metadata>...</MD_Metadata>)
• Indexing of selected fields
• Apache Lucene full-text index (inverted index)
• Web service frontend: Apache Axis + Servlet Container
• Portal
Inverted index:
Field        Term            Document
identifier   ABC:123         2
identifier   XYZ:223         6
identifier   MI6:007         12
abstract     region          2,23,112
abstract     pressure        3,23
abstract     humid           4,33,215,6,4
min_lat      030.43          1
min_lat      -023.23         2
local        file://path/    4
[T. Langhammer, ZIB, Berlin]
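The inverted index on this slide maps a (field, term) pair to the documents that contain it. The sketch below illustrates that data structure in plain Python. It does not use the Lucene API, and the documents and field choices are toy assumptions.

```python
from collections import defaultdict


def build_inverted_index(documents, keyword_fields, text_fields):
    """Map (field, term) to the set of document ids containing that term."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for name in keyword_fields:
            # Keyword fields (e.g. identifiers) are indexed as one whole term.
            if name in doc:
                index[(name, doc[name])].add(doc_id)
        for name in text_fields:
            # Text fields are lower-cased and split on whitespace.
            for term in doc.get(name, "").lower().split():
                index[(name, term)].add(doc_id)
    return index


def search(index, field_name, term):
    """Return the sorted ids of documents matching one term in one field."""
    return sorted(index.get((field_name, term), set()))


# Toy documents standing in for cached metadata records.
docs = {
    1: {"identifier": "ABC:123", "abstract": "Surface pressure in the region"},
    2: {"identifier": "XYZ:223", "abstract": "Humid region"},
}
index = build_inverted_index(docs, keyword_fields=["identifier"], text_fields=["abstract"])
print(search(index, "abstract", "region"))
print(search(index, "identifier", "XYZ:223"))
```

Only the selected fields are indexed; the full record would still be served from the document cache.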
C3Grid Portal – Simple search
C3Grid Portal – Advanced search
C3Grid Data Access and Preprocessing
• Data access interface
– Community-specific webservice (WSDL)
– Solutions of the individual institutes will be adapted to support the webservice
• e.g. triggering of local data processing tools
– Support for database- and file-based storage types
– More detailed use metadata will be provided with the data during the extraction process
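One way to picture how different institute solutions can sit behind a common webservice is an adapter per storage type. The Python sketch below assumes such a design. All class and function names are hypothetical, and the backends only compute target paths instead of moving data.

```python
from abc import ABC, abstractmethod


class DataAccessBackend(ABC):
    """Hypothetical site-side adapter behind the common data access webservice."""

    @abstractmethod
    def extract(self, oid, target_dir):
        """Provide the data for `oid` under `target_dir` and return its path."""


class FileBackend(DataAccessBackend):
    def __init__(self, archive_paths):
        self.archive_paths = archive_paths  # oid -> path in the local file archive

    def extract(self, oid, target_dir):
        source = self.archive_paths[oid]
        # A real site would copy or preprocess the file here.
        return target_dir.rstrip("/") + "/" + source.rsplit("/", 1)[-1]


class DatabaseBackend(DataAccessBackend):
    def __init__(self, table_by_oid):
        self.table_by_oid = table_by_oid  # oid -> table name in the local database

    def extract(self, oid, target_dir):
        # A real site would run a query and write the result to a file.
        return target_dir.rstrip("/") + "/" + self.table_by_oid[oid] + ".nc"


def stage(backend, oids, target_dir):
    """Stage every requested object through whichever backend the site uses."""
    return [backend.extract(oid, target_dir) for oid in oids]


print(stage(FileBackend({"ABC:123": "/archive/exp1/tas.grb"}), ["ABC:123"], "/workspace/job1/"))
```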
C3Grid Data Access/Preprocessing Interface
Stage file webservice request contains:
• ObjectList of OIDs requested
• CFList of standard names
• Space constraints
• Time constraints
• Target directory
• File format, e.g. netCDF or GRIB
• …
[Figure: SOAP-XML StageFileRequest → Data Access Web service. Recoverable labels: Constraints → necessary processing; CF standard names → local variable names; Access to data in DB and Files; CDO processing; data]
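The request fields listed above can be sketched as a small data structure, together with the translation from CF standard names to local variable names and the resulting CDO call. This is a sketch under assumptions: the field names, the mapping table and the file names are hypothetical, and the command is only assembled, not executed. The operators used (`sellonlatbox`, `seldate`, `selname`) and the `-f` option are standard CDO features.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StageFileRequest:
    object_list: List[str]                   # OIDs requested
    cf_list: List[str]                       # CF standard names
    bbox: Tuple[float, float, float, float]  # (lon_min, lon_max, lat_min, lat_max)
    time_range: Tuple[str, str]              # (start, end) dates
    target_dir: str
    file_format: str = "nc"                  # CDO format code: "nc" netCDF, "grb" GRIB


# Hypothetical site-specific table: CF standard name -> local variable name.
CF_TO_LOCAL = {"air_temperature": "t2m", "precipitation_flux": "precip"}


def cdo_command(request, infile, outfile, name_map=CF_TO_LOCAL):
    """Assemble a chained CDO call as an argument list (not executed here)."""
    local_names = [name_map[cf] for cf in request.cf_list]
    lon_min, lon_max, lat_min, lat_max = request.bbox
    start, end = request.time_range
    return [
        "cdo", "-f", request.file_format,
        f"-sellonlatbox,{lon_min},{lon_max},{lat_min},{lat_max}",
        f"-seldate,{start},{end}",
        f"-selname,{','.join(local_names)}",
        infile, outfile,
    ]


req = StageFileRequest(["ABC:123"], ["air_temperature"], (-30.0, 60.0, 30.0, 75.0),
                       ("2000-01-01", "2000-12-31"), "/workspace/job1")
print(" ".join(cdo_command(req, "in.grb", req.target_dir + "/out.nc")))
```

CDO applies chained operators from right to left, so the variable selection runs first and the spatial cut last, which keeps the intermediate data small.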
Summary
• Grid development is application driven
• Discovery is based on
– ISO 19115/19139 based metadata catalog
– Hierarchical, two-level metadata scheme
– Text-based search in the catalog
• Data access is implemented by
– Proprietary C3Grid data access interface (webservice)
• Part of the use metadata is provided along with the data extraction
The end
C3Grid Architecture
[Figure: C3Grid architecture diagram. Recoverable labels:]
• User: User Interface, API (Web Services), GUI, Monitoring, Job Submission
• DMS (global), Workflow Scheduler, Resource Information Service, DIS
• Functions: Staging, Search, Harvesting, Task Execution, Matchmaking
• Distributed Grid Infrastructure
– GT4 based
– new Metadata-Service
• Site C3Grid Components: DMS (local), OAI / WS, Pre-Proc, Grid Workspace, Resource Scheduler, File Management, Archive Interface, Data Transfer Service
• Distributed Processing Resources, Available Resources
• Distributed Data Archives, DBMS/File
• Data: Base Data & Meta Data, MetaData, JobData