The eGY Legacy: A framework for e-Discovery and e-Access (eDeA) to Scientific Data. Vladimir Papitashvili, AOSS, University of Michigan. The eGY General Meeting, Boulder, Colorado, March 13-14, 2007


Page 1

The eGY Legacy: A framework for e-Discovery and e-Access (eDeA) to Scientific Data

Vladimir Papitashvili, AOSS, University of Michigan

The eGY General Meeting, Boulder, Colorado, March 13-14, 2007

Page 2

A Legacy of IGY - World Data Centers System

20th Century Paradigm of Sharing Data: Data were to be submitted to Data Centers

Data submissions to World Data Centers were and remain voluntary.

World Data Centers require significant and continuous support (financial & manpower) for data acquisition and storage.

Many types of collected scientific data are often not suitable for World Data Centers; e.g., the quality of geomagnetic variation data does not satisfy the WDC criteria, set mainly for the standard magnetic observatory data.

“Push Data” Concept

Although at present the World Data Centers provide most of their data online, they still constitute a quasi-centralized system of data collection, storage, and dissemination.

Courtesy of the RAND Corporation

Page 3

21st Century Paradigm: Data are published, visualized, and shared via World Wide Web

Sharing data via multiple Virtual Observatories allows data providers to achieve greater visibility among the scientific and user communities.

This eliminates the need for ‘voluntary’ data submission to World Data Centers: the centers can “pull data” from the data providers' Web sites.

A fabric of interconnected data nodes (providers and secondary archives) is a new vision for distributed, self-populating data repositories.

“Pull Data” Concept
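
A minimal sketch of the “pull” idea, assuming each provider exposes a catalog of its datasets that a data center can read. The provider names, dataset identifiers, and the function `pull_new_datasets` are hypothetical and serve only as illustration; in-memory dictionaries stand in for provider Web sites and the center's archive.

```python
def pull_new_datasets(provider_catalogs, archive):
    """Copy into `archive` every dataset a provider lists that is not yet held.

    Returns the (provider, dataset id) pairs that were pulled.
    """
    pulled = []
    for provider, catalog in provider_catalogs.items():
        for dataset_id, payload in catalog.items():
            if dataset_id not in archive:
                archive[dataset_id] = payload
                pulled.append((provider, dataset_id))
    return pulled


# Hypothetical provider catalogs: dataset id -> published content.
provider_catalogs = {
    "provider-a": {"mag-001": b"placeholder", "mag-002": b"placeholder"},
    "provider-b": {"ion-007": b"placeholder"},
}

# The center already holds one dataset and pulls the rest.
archive = {"mag-001": b"placeholder"}
pulled = pull_new_datasets(provider_catalogs, archive)
```

In this sketch the provider takes no action beyond publishing; the center decides what to copy, which is the reverse of the “push” model.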

Courtesy of the RAND Corporation

Integrated into this “Data Fabric” cyberinfrastructure, the World Data Centers will play an even more important role: as clearinghouses, they would need to watch the ever-evolving “Data Fabric” and preserve at least 2-3 copies of a particular dataset across the global data network.
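
The clearinghouse role could be sketched as a periodic check of how many nodes hold each dataset. This is an illustration only: the inventory, node names, and the function `under_replicated` are hypothetical, and the threshold of 2 is taken as the lower bound of the “2-3 copies” figure.

```python
MIN_COPIES = 2  # assumed lower bound of the "2-3 copies" target

# Hypothetical inventory: dataset id -> nodes currently holding a copy.
holdings = {
    "mag-001": {"provider-a", "wdc-1", "wdc-2"},
    "ion-007": {"provider-b"},
}


def under_replicated(holdings, min_copies=MIN_COPIES):
    """Return the ids of datasets held by fewer than `min_copies` nodes."""
    return sorted(
        dataset_id
        for dataset_id, nodes in holdings.items()
        if len(nodes) < min_copies
    )


needs_copy = under_replicated(holdings)
```

A clearinghouse could then pull the flagged datasets to additional nodes until the target is met.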

Page 4

• Google and many other search engines help find INFORMATION about scientific data in cyberspace (“data discovery”); this is mainly based on keyword search.

• What is needed? - Geo-SML descriptors to list actual data sets on the World Wide Web.

• These descriptors would allow Google (and others) to search actual SCIENTIFIC DATA on the Web, creating “look-up” tables for real e-Access to these data (eDeA).

Wikipedia: Service Modeling Language (SML) is an XML-based specification by leading information technology companies that defines a consistent way to express how computer networks, applications, servers and other IT resources are described or modeled so businesses can more easily manage the services that are built on these resources.
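
To make the descriptor idea in the bullets above concrete, here is a rough sketch of how a crawler might turn XML descriptors into a “look-up” table. The slides do not define a Geo-SML schema, so every element name, attribute, and URL below is invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical descriptors: the element and attribute names are assumptions,
# not taken from any published Geo-SML specification.
DESCRIPTORS = """
<descriptors>
  <dataset id="mag-001">
    <discipline>geomagnetism</discipline>
    <parameter>magnetic field variation</parameter>
    <url>http://example.org/data/mag-001.dat</url>
  </dataset>
  <dataset id="ion-007">
    <discipline>ionosphere</discipline>
    <parameter>total electron content</parameter>
    <url>http://example.org/data/ion-007.dat</url>
  </dataset>
</descriptors>
"""


def build_lookup_table(xml_text):
    """Map each discipline to a list of (dataset id, data URL) pairs."""
    table = defaultdict(list)
    root = ET.fromstring(xml_text)
    for dataset in root.findall("dataset"):
        discipline = dataset.findtext("discipline")
        table[discipline].append((dataset.get("id"), dataset.findtext("url")))
    return dict(table)


lookup = build_lookup_table(DESCRIPTORS)
```

A query for a discipline then yields direct links to the data files rather than pages that merely mention them, which is the distinction the slide draws between discovery and access.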

Page 5

Thus, a major legacy of eGY could be a common framework (Geo-SML descriptors and appropriate cyberinfrastructure) developed for scientific data representing various geoscience disciplines.