PV 2007 CONFERENCE, Germany, Oberpfaffenhofen, October 9 - 11, 2007
The ISDC concept for long-term sustainability
of geoscience data and information
B. Ritschel, [email protected]
ISDC Team (V. Mende, H. Palm¹, Ch. Bruhns², R. Kopischke², S. Freiberg³, L. Gericke³), [email protected]
¹ left the group, ² administrator, ³ university student
The Electronic Geophysical Year
Information in time and space
Problem: Digital information sustainability
ISDC portal homepage: isdc.gfz-potsdam.de
ISDC collaboration projects (1)
• CHAMP and GRACE satellite missions: Orbit/Gravity + Magnetic/Electric Field + Atmosphere data
• GGP (Global Geodynamic Project): Superconducting Gravimeter + auxiliary data
• GNSS (Galileo testbed phase 1 project): GFZ Potsdam GPS ground station data
• GPS-PDR (Potsdam-Dresden-Reprocessing): GPS Orbit + Earth rotation parameter data
ISDC collaboration projects (2)
• GGSP (Galileo Geodetic Service Provider): GPS time series, orbit, ERP, SLR, auxiliary data
• TerraSAR-X: Orbit + Atmosphere/Ionosphere data
• ICGEM (International Centre for Global Earth Models): Global Earth gravity models
• GGOS (Global Geodetic Observing System)
Number and volume of data
• 288 different product types from different geoscience domains
– 92 + 20* product types for public use
– 4 product types with extended rights for specific science team members
– 180 product types for internal use only
• 10.2 terabytes of data
• 15.9 million products
• 1576 national and international users and user groups
• 19 data providers (scientific groups), GFZ + NASA
*TerraSAR-X (coming soon)
ISDC portal – Product Types: isdc.gfz-potsdam.de/product_types
ISDC user - country graph
1576 registered and active users and user groups
ISDC user development graph
User development (2007-10-05)
1596 registered users and user groups
Product data flow
Number of data files per time (2007-10-05)
ISDC architecture schema
http://gcmd.nasa.gov/User/difguide/difman.html
ISDC Metadata Standard = Parent DIF (V. 9.0) + Extended Child DIF(s)*
*in preparation
Metadata
Producer
Data pump
User
DIF metadata fields (extract)
Required fields
• Entry ID • Entry Title • Parameters (Science Keywords) • ISO Topic Category • Data Center • Summary • Metadata_Name • Metadata_Version
• Personnel • Data Set Citation • Instrument • Platform • Temporal Coverage • Paleo-Temporal Coverage • Data Set Progress • Spatial Coverage • Location • Data Resolution • Project • Keyword (Ancillary Keyword) • Quality • Access Constraints • Use Constraints • Data Set Language • Originating Center • Distribution • Multimedia Sample • Reference • Discipline • Related URL • Parent DIF • …
Science keyword vocabulary: 5-level hierarchical classification
GCMD's Science Keywords and Associated Directory Keywords
Example of the structure of science keywords:
EARTH SCIENCE >
Solid Earth >
Geodetics/Gravity >
Satellite Orbits
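The hierarchical keyword path above can be split into named levels for catalog indexing. The following is a minimal sketch in Python, not ISDC code; the level names in `LEVEL_NAMES` and the helper `parse_science_keyword` are assumptions chosen for illustration, since the slide shows only the path itself.

```python
from typing import Dict, List

# Assumed level names (hypothetical for this sketch); the slide only
# states that the vocabulary is a 5-level hierarchical classification.
LEVEL_NAMES: List[str] = [
    "Category", "Topic", "Term", "Variable_Level_1", "Variable_Level_2",
]


def parse_science_keyword(path: str) -> Dict[str, str]:
    """Split an 'A > B > C' keyword path into named hierarchy levels."""
    parts = [p.strip() for p in path.split(">") if p.strip()]
    if len(parts) > len(LEVEL_NAMES):
        raise ValueError(f"more than {len(LEVEL_NAMES)} levels: {path!r}")
    # zip stops at the shorter sequence, so unused deeper levels are omitted.
    return dict(zip(LEVEL_NAMES, parts))


keyword = parse_science_keyword(
    "EARTH SCIENCE > Solid Earth > Geodetics/Gravity > Satellite Orbits"
)
```

The example path fills four of the five levels, so the deepest level is simply absent from the result.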
ISDC and GCMD¹ DIF²
• Product type independent base DIF V.9 XML schema
• Product type dependent child DIF XML schemata
• Product type referencing parent DIF V.9 XML document
• Product (data file) referencing child DIF XML documents (containing skinny DIF + data file describing data)
Generating GCMD DIF standard-compliant metadata documents
¹ NASA’s Global Change Master Directory, ² Directory Interchange Format
ISDC Parent DIF V.9 XML document
CH-OG-3-RSO XML schema: base-dif.xsd
-<DIF: xmlns: ... “http://isdc.gfz-potsdam.de/xsd/base-dif.xsd”>
...
ISDC Child DIF V.9 XML document
CH-OG-3-RSO+CTS-CHA_2000_219_10
<Parent_DIF>CH-OG-3-RSO</Parent_DIF>
+<Data_Parameters> XML schema: CH-OG-3-RSO.xsd
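The parent-child relation shown on this slide could be sketched as follows. This is only an illustration using the Python standard library, not the ISDC implementation: the use of `Entry_ID` for the child identifier, the helper `build_child_dif`, and the `example_parameter` element are assumptions, since the slide shows only the `Parent_DIF` and `Data_Parameters` elements.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def build_child_dif(entry_id: str, parent_id: str,
                    parameters: Dict[str, str]) -> ET.Element:
    """Build a minimal child DIF element that references its parent DIF."""
    root = ET.Element("DIF")
    # Assumption: the child identifier is stored in Entry_ID.
    ET.SubElement(root, "Entry_ID").text = entry_id
    # The product (data file) points back to its product type document.
    ET.SubElement(root, "Parent_DIF").text = parent_id
    # Product-specific values live under Data_Parameters; the real element
    # names would come from the product type schema (CH-OG-3-RSO.xsd).
    data = ET.SubElement(root, "Data_Parameters")
    for name, value in parameters.items():
        ET.SubElement(data, name).text = value
    return root


child = build_child_dif(
    "CH-OG-3-RSO+CTS-CHA_2000_219_10",
    "CH-OG-3-RSO",
    {"example_parameter": "example value"},  # hypothetical parameter
)
xml_text = ET.tostring(child, encoding="unicode")
```

Keeping only a reference to the parent in each child document avoids repeating the product type description for every data file.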
CH-OG-3-RSO Data Parameters
Mapping of standards
DIF <=> ISO
XSL Transformation
DIF XML metadata file (DIF Version 9.0 XSD)
ISO 19115 XML metadata file (ISO 19115/19139 XSD)
XSD: XML Schema Definition
XSL: Extensible Stylesheet Language
Source: http://en.wikipedia.org/wiki/Xslt
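To illustrate the kind of field mapping such a transformation performs, here is a small sketch. It deliberately uses plain Python with the standard library instead of XSLT, and it maps only three fields. The chosen correspondences (`Entry_ID` to `fileIdentifier`, `Entry_Title` to the citation title, `Summary` to `abstract`) and the un-namespaced sample DIF document are assumptions for illustration, not the ISDC stylesheet.

```python
import xml.etree.ElementTree as ET

# Namespaces of the ISO 19139 XML encoding of ISO 19115 metadata.
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"


def _char_string(parent: ET.Element, tag: str, text: str) -> None:
    """Append <gmd:tag><gco:CharacterString>text</...></...> to parent."""
    wrapper = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    ET.SubElement(wrapper, f"{{{GCO}}}CharacterString").text = text


def dif_to_iso(dif: ET.Element) -> ET.Element:
    """Map three DIF fields onto an ISO 19139 skeleton (assumed mapping)."""
    iso = ET.Element(f"{{{GMD}}}MD_Metadata")
    _char_string(iso, "fileIdentifier", dif.findtext("Entry_ID", default=""))
    info = ET.SubElement(iso, f"{{{GMD}}}identificationInfo")
    ident = ET.SubElement(info, f"{{{GMD}}}MD_DataIdentification")
    citation_wrapper = ET.SubElement(ident, f"{{{GMD}}}citation")
    citation = ET.SubElement(citation_wrapper, f"{{{GMD}}}CI_Citation")
    _char_string(citation, "title", dif.findtext("Entry_Title", default=""))
    _char_string(ident, "abstract", dif.findtext("Summary", default=""))
    return iso


# Hypothetical, heavily shortened DIF document used only for this sketch.
dif = ET.fromstring(
    "<DIF><Entry_ID>CH-OG-3-RSO</Entry_ID>"
    "<Entry_Title>Example title</Entry_Title>"
    "<Summary>Example summary</Summary></DIF>"
)
iso = dif_to_iso(dif)
ns = {"gmd": GMD, "gco": GCO}
file_id = iso.findtext("gmd:fileIdentifier/gco:CharacterString", namespaces=ns)
```

A production mapping would cover many more fields and would have to be validated against the ISO 19115/19139 XSD.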
ISDC main system components deployment
ISDC storage management structure (in realization)
Data lifecycle management (1)
• ISDC product philosophy (product = metadata + data)
• Providing the input of data via dedicated FTP directories for data providers (GFZ internal + external)
• Ensuring the sustainability of data by
– Long-term archiving (storage of original and 1 copy)
– Online Product Archive (OPA)
• Filling and maintaining the ISDC product catalog using product type and product related metadata
Data lifecycle management (2)
• GUI and API for product retrieval
– Product type dependent retrieval forms
– Product browser
– Product request list file for bulk requests
• Providing the output of data via dedicated FTP directories for users (GFZ internal + external)
• Personalization (selecting favorite product types, subscribing to Really Simple Syndication [RSS] feeds)
• User management, user forum, monitoring components
Data lifecycle management (3)
Missing tasks:
• Harmonization of data
• Tailoring of data
• Merging of data
• Aggregation of data
• Removing of data
– Prediction data
– Semi-finished products
– Back up files
=> Enhancement of data interoperability
=> Providing data for other scientific domains
=> Keeping the operational status
Science-driven data review process is necessary!!!
Service Oriented Architecture
Improving the interoperability of the ISDC portal system by using Service Oriented Architecture (SOA) techniques
Interoperability via OGC CSW
Networking data Sensor Web concept*
*OGC® Sensor Web Enablement: Overview And High Level Architecture.
+ virtual sensors (database, data archive)*
*extended by the author
Mashup geoscientific data
Katrina Hurricane Tracking and Google Maps
Science video portal
www.scivee.tv
Techniques for Using Web 2.0*
• Dramatically lower the experience barrier
• Collect user contributions
• Enable formation of communities
• Become an open platform
• Provide self-evolving customer relationship management (CRM)
Differences in the way data providers and users interact
*Dion Hinchcliffe’s Web 2.0 Blog
Connecting different worlds
Committee driven developments
• Metadata/Service standards
• Catalog Web services
• Data standards
• Data/Application services
• SOA approach
Community driven developments
• Mashups
• Social software
• Networks
• (corporate) Blogs
• Wikis
• Chats/Messenger
• Social navigation
• Tagging
Integration of sustainable Web techniques from both worlds
Web 3.0*
Semantic web, Web 2.0
*W. Wahlster (DFKI), acatech Symposium, Berlin, 31 May 07
ISDC activities (1)
• Preparing the TerraSAR-X data management
• Improving metadata (management) interoperability
– Using and developing the Directory Interchange Format Standard Version 9.x
– Changing from ASCII-based DIF to XML-based DIF documents
– Introduction of specific ISDC parent - child principle
– Using XML database for parent DIF XML documents
• Providing thematic catalog product search using ISDC catalog and user generated (Web 2.0) metadata ontologies
ISDC activities (2)
• Developing interoperable catalog and data services for distributing and networking metadata and data
– Catalog Web Services OGC C-WS and OAI-PMH
– Sensor Web Enablement (SWE)
– Virtual Observatory (VO object oriented ontology methods, OWL)
– Open data access protocol (OPeNDAP)
– Evaluating Earth Science Mark-up Language (ESML)
• Providing information about the usage of data via user driven activities like tagging and social navigation data
– Object oriented approach (relations between product types)
– Different types of classification (project, scientific domain, application)
– Networking different semantic layers based on metadata created by data providers and users (Web 2.0 techniques)
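As an illustration of how a client might query an OGC catalog service, the sketch below builds a key-value-pair `GetRecords` request URL. It assumes CSW version 2.0.2 and a placeholder endpoint (`https://catalog.example.org/csw`); neither the endpoint nor the helper `csw_get_records_url` comes from the ISDC system, and a real service may require further parameters.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def csw_get_records_url(endpoint: str, max_records: int = 10,
                        start_position: int = 1) -> str:
    """Build a CSW GetRecords request URL in key-value-pair encoding."""
    params = {
        "service": "CSW",
        "version": "2.0.2",          # assumed CSW version
        "request": "GetRecords",
        "typeNames": "csw:Record",   # query the common record type
        "resultType": "results",     # return records, not only a hit count
        "elementSetName": "summary",
        "maxRecords": max_records,
        "startPosition": start_position,
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint used only for this example.
url = csw_get_records_url("https://catalog.example.org/csw", max_records=5)
query = parse_qs(urlsplit(url).query)
```

Paging through a large catalog would repeat the request with an increasing `startPosition`.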
ISDC activities (3)
• Developing a service for publication of data via unique identifier (e.g. DOI, URN)
• ISDC has become part of the CEOS International Directory Network (IDN), the gateway to Earth science data and information maintained by NASA's GCMD
• Implementing framework S/W and preparing ISDC DIF metadata XSLT for ISO 19115 compliant CWS
• Providing information and access to data related e-print publications using OAI-PMH harvesting
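OAI-PMH harvesting, as mentioned in the last item, works by issuing simple HTTP requests against a repository. The sketch below only builds a `ListRecords` request URL; the repository address `https://repository.example.org/oai` and the helper `oai_list_records_url` are hypothetical, and the sketch does not show how ISDC actually harvests e-print records.

```python
from typing import Optional
from urllib.parse import urlencode


def oai_list_records_url(base_url: str,
                         metadata_prefix: str = "oai_dc",
                         from_date: Optional[str] = None,
                         resumption_token: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument: it must not
        # be combined with metadataPrefix or date arguments.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # harvest only records changed since
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository address used only for this example.
first_page = oai_list_records_url(
    "https://repository.example.org/oai", from_date="2007-10-01"
)
next_page = oai_list_records_url(
    "https://repository.example.org/oai", resumption_token="token123"
)
```

A harvester would fetch the first URL, read the resumption token from the response if one is present, and continue until no token is returned.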
ISDC activities (4)
• Integration of science application services
– Spatial visualization of retrieval result sets on maps
– Visualization of data products (e.g. profile data)
• Design of ISDC portal (version 3.x) using
• Active role in Global Geodetic Observing System project
– System design
– Software development
– ISDC as active data and service provider
– ISDC is part of GEOSS
Questions and Challenges
• How to improve interoperability concerning metadata and services?
– Different metadata standards (DIF, ISO, Dublin Core, …)
– OGC WCS standard but different metadata profiles (ISO 19115:profile xyz)
– Web 2.0 community is providing new techniques …
• How to make data and data products available for other domains (science and non-science)?
– Lack of information about processing the data (input data and models, processing software, constraints, original domain for the product)
– Lack of information about applications and domains where data are used
– Product tailoring (inter-domain knowledge is necessary)
Challenges and Tasks
• Providing sufficient funding for everything that is necessary to guarantee the long-term sustainability of data
• Improving awareness and understanding of ESSI concepts at the administrative and senior management levels
• Helping scientists to take their responsibility for making data available to all interested communities
– Understanding metadata and Web services concepts
– Describing the process of product generation in a way that scientists from other domains can understand, so that they can use these data for their own purposes
– Providing data in different kinds and formats (tailored products)
– Overcoming personal egoism in keeping data and just publishing results (most difficult task)
Einsteinturm (1921)
[email protected] Thank you for your attention.
ESSI Goals of GGOS
• Promote the data and products of the services and become the collective voice for IAG;
• Collect and archive, through interoperable** services, geodetic observations, products, and models and ensure their consistency, reliability and accessibility;
• Identify a consistent set of geodetic products generated by the services and establish the requirements concerning the products’ accuracy, time resolution, and consistency;
**added by the author
ESSI challenges of GGOS
• Approx. 1000 different geodetic product types (covering all geodetic techniques and levels of processing)
• > 100,000,000 data sets, > 100 TB of data (distributed all over the world)
• Completely heterogeneous picture concerning the management of data by the different data providers (single scientist <=> world data center)
• Different data policies related to the access of data
• No common understanding about the meaning, the importance and the realization of IT-based geoscientific infrastructure