Data Sharing in IPY: Policy, Practice, and Services

Mark A. Parsons

Co-Chair, IPY Data Policy and Management Subcommittee

Manager, IPY Data and Information Service

Knowledge is power.

– Francis Bacon

The restriction of knowledge to an elite group destroys the spirit of society and leads to its intellectual impoverishment.

– Albert Einstein

International Data Policies

• World Meteorological Organization (WMO)

• WMO World Data Centers

• Intergovernmental Oceanographic Commission (IOC) of UNESCO

• World Climate Program

• Committee on Earth Observation Satellites (CEOS)

• International Earth Observing System (IEOS)

• Houston Economic Summit of the Group of Seven Most Industrialized Nations

• Organization for Economic Co-operation and Development (OECD)

• Inter-American Institute for Global Change Research

• Agenda 21, UN Conference on the Environment and Development (UNCED)

• Framework Convention on Climate Change

• International Council for Science (ICSU)

• ICSU World Data Centers (WDC)

• International Geosphere-Biosphere Program (IGBP)

• Global Climate Observing System (GCOS)

• Second World Climate Conference (SWCC)

• Scientific Committee on Solar-Terrestrial Physics (SCOSTEP)

• International Social Science Council (ISSC)

• International Union of Radio Science (URSI)

• World Ocean Circulation Experiment (WOCE)

• International Polar Year (IPY)

• Global Earth Observing System of Systems (GEOSS)

• etc.

Some IPY Objectives

•IPY has an interdisciplinary emphasis, with active inclusion of the social sciences.

•IPY will link researchers across different fields to address questions and issues lying beyond the scope of individual disciplines.

•IPY will strengthen international coordination of research and enhance international collaboration and cooperation.

•IPY will leave a legacy of observing sites, facilities and networks, as well as individual data and data systems to support ongoing polar research and monitoring.

Data used by IPY

Data generated by IPY

Special Cases:

• Human subjects

• Intellectual property of LTK

• Where data release may cause harm

http://www.ipy.org/Subcommittees/final_ipy_data_policy.pdf

IPY Data Policy—What are IPY Data?

Data

The IPY Joint Committee requires that IPY data, including operational data delivered in real time, are made available fully, freely, openly, and on the shortest feasible timescale. The only exceptions to this policy of full, free, and open access are:

• where human subjects are involved, confidentiality must be protected

• where local and traditional knowledge is concerned, rights of the knowledge holders shall not be compromised

• where data release may cause harm, specific aspects of the data may need to be kept protected

—IPY Data Policy

IPY will set a new standard in scientific cooperation as rapid and unrestricted data exchange becomes an accepted and enabling factor in daily research.

—IPY Science Plan

Metadata

All IPY data must be accompanied by a full set of metadata that completely document and describe the data. In accordance with the ISO standard Reference Model for an Open Archival Information System (OAIS) (CCSDS 2002), complete metadata may be defined as all the information necessary for data to be independently understood by users and to ensure proper stewardship of the data.

Regardless of any data access restrictions or delays in delivery of the data itself, all IPY projects must promptly provide basic descriptive metadata of collected data in an internationally recognized, standard format to an appropriate catalog or registry.

—IPY Data Policy

IPY Metadata Profile (and crosswalk)

•“All data registries and repositories collecting data and metadata from IPY projects are required to collect and share sufficient information to adhere to the IPY Metadata Profile”

•Basic who, what, where, and when in FGDC, DIF, or THREDDS (ISO is coming, but could use some help), plus some information on metadata provenance.

•Controlled vocabulary from GCMD for some fields.

•The “bare minimum of information necessary to allow simple discovery across disciplines and to ensure we can track the heritage of the metadata in a broadly distributed data management environment.”

•Details available at ipydis.org
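As one way to picture a crosswalk, the sketch below renames fields from one metadata layout to another using a lookup table and reports anything it cannot map. The field names on both sides are hypothetical placeholders, not the actual FGDC, DIF, or THREDDS element names, and the sample record is made up; the real profile and crosswalk are the ones documented at ipydis.org.

```python
# Minimal crosswalk sketch. All field names are hypothetical placeholders,
# not real FGDC, DIF, or THREDDS element names.
CROSSWALK = {
    "dataset_title": "title",
    "abstract": "summary",
    "start_date": "temporal_start",
    "end_date": "temporal_end",
    "bounding_box": "spatial_coverage",
}


def apply_crosswalk(record, crosswalk=CROSSWALK):
    """Return (mapped, unmapped): renamed fields, plus source fields with no mapping."""
    mapped = {}
    unmapped = []
    for key, value in record.items():
        if key in crosswalk:
            mapped[crosswalk[key]] = value
        else:
            unmapped.append(key)
    return mapped, sorted(unmapped)


# Made-up example record in the hypothetical source layout.
source = {
    "dataset_title": "Sea ice thickness",
    "abstract": "Example record.",
    "instrument": "sonar",
}
mapped, unmapped = apply_crosswalk(source)
```

Reporting the unmapped fields, rather than silently dropping them, is a design choice here: it makes visible what a given crosswalk loses.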

Attribution

...users of IPY data must formally acknowledge data authors (contributors) and sources. Where possible, this acknowledgment should take the form of a formal citation, such as when citing a book or journal article. Journals should require the formal citation of data used in articles they publish.
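A formal data citation can be assembled from a few metadata fields, much like a book citation. The sketch below is only an illustration: the author-year layout, the function name, and the sample values (including the example.org address) are assumptions, not a format prescribed by the IPY Data Policy.

```python
def format_data_citation(authors, year, title, archive, locator):
    """Build a citation string in an assumed author-year layout (not an IPY-mandated one)."""
    if not authors:
        raise ValueError("a data citation needs at least one author or contributor")
    if len(authors) == 1:
        names = authors[0]
    elif len(authors) == 2:
        names = f"{authors[0]} and {authors[1]}"
    else:
        names = ", ".join(authors[:-1]) + ", and " + authors[-1]
    # Avoid a doubled period when the last author ends with an initial.
    names = names.rstrip(".")
    return f"{names}. {year}. {title}. {archive}. {locator}"


# Made-up values for illustration only.
citation = format_data_citation(
    ["Doe, J.", "Roe, R."],
    2008,
    "Example snow depth transects",
    "Example Polar Data Center",
    "http://example.org/data/123",
)
```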

“Publish or Perish”

“Preserve or Perish”

Preservation

All IPY data must be archived in their simplest, useful form and be accompanied by a complete metadata description. An IPY Data and Information Service (IPYDIS) should help projects identify appropriate long-term archives and data centers, but it is the responsibility of individual IPY projects to make arrangements with long-term archives to ensure the preservation of their data. It must be recognized that data preservation and access should not be afterthoughts and need to be considered while data collection plans are developed.


A striking proportion of project difficulties stem from people in both customer and supplier organisations failing to implement known best practice.

– Oxford University/Computer Weekly survey of public and private sector IT projects

Science and Data Management

•Many have stated the need to involve scientists in data management, but…

•It is also important to involve data managers in conducting science.

•Field Experiments:

•~20% increase in data quality (Parsons, et al. 2004)

•70% of experiment cost is data collection (Longley, et al. 2001)

•Observing systems

•Define/clarify roles for data centers and investigators

•QC (from file verification to scientific assessment)

•Metadata and documentation development

•Formatting, gridding, packaging (e.g. sharing protocols)

Documentation

•Register basic discovery metadata in a portal

•Use existing standards, e.g.

•FGDC metadata standard

•OAIS Reference Model

•Develop “Data Stories”

•Describe uncertainty

•Challenge your assumptions

“We must not … start from any and every accepted opinion, but only from those we have defined — those accepted by our judges or by those whose authority they recognize.”

—Aristotle c. 350 BC


The Data

Formats:

• Negotiate common formats and conventions

• ASCII is useful but not really a precise format

• avoid proprietary formats

• some suggestions: netCDF is popular for some, OGC (WMS/WCS/WFS) compatibility is nice

• Archives and users may have different needs

Access:

• Integrate with many systems to allow increased user discovery (register with the IPYDIS)

• Use open source software when possible, use open standards everywhere.

Preservation

• Open Archival Information System (OAIS) Reference Model

• Attribute and provide info for attribution readily through all gateways

Other practices to consider

•Annual meetings/sessions focused on data and initial scientific results

•Reports made available online, e.g., addressing data as part of a general annual project progress report

•“Updates from field” (e.g., tied to education & outreach)

•Search/request facility for data in process (e.g., field catalog), to facilitate cross-disciplinary data discovery

•Establish data tracking systems for projects, disciplines, countries

•Build data sharing and data integration partnerships and then extend (I told two friends and they told two friends and …)

http://ipydis.org

http://ipycoord.met.no

http://gcmd.nasa.gov/portals/ipy

http://www.ipy-ice-portal.org

ELOKA: The Exchange for Local Observations and Knowledge of the Arctic works to provide data management and user support to facilitate the collection, preservation, exchange, and use of local observations and knowledge of the Arctic.

http://nsidc.org/eloka PI: Shari Gearheard

18 May 2006

http://arcticportal.org

Data Coordinators

• Assist on compliance with standards, identification of archives, development of the union catalogue, and other data management requirements for IPY.

• Visibly track the data flow for IPY.

• In collaboration with the IPO, develop a data registry that will continue throughout the IPY.

• Survey the planned projects and the data they intend to collect and identify existing archives, portals, experts, and significant gaps in the IPY data infrastructure.

•Mark Parsons—Overall, US

•Øystein Godøy—Operational Data, Norway

•Canadian Coordinator—Overall, Canada

•National coordinators in Netherlands, China, UK

Data Committee

• Develops data policy

• Develops data strategy

• Determines data flow structure (consideration of procedures, real-time requirements, transmission and archival)

• Advises JC

• Defines requirements and recommends actions for IPYDIS (what do we need?)

• Mark Parsons, USA; co-chair

• Taco de Bruin, Netherlands; co-chair

• Nathan Bindoff, Australia (represented by Kim Finney)

• Joan Eamer, Norway/UNEP

• Eberhard Fahrbach, Germany; JC liaison

• Hannes Grobe, Germany

• Ray Harris, UK/GEOSS

• Ellsworth LeDrew, Canada

• Xin Li, China

• Håkan Olsson, Sweden

• Alexander Sterin, Russia/WMO

• Vladimir Papitashvili, USA/eGY

• Birger Poppel, Greenland


[Figure: diagram of catalogs 1 through 9, with catalogs 3 and 6 labeled as mirrors]

The “Union” Catalog

courtesy P. Pulsifer
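A union catalog over several distributed catalogs and mirrors can be pictured as a merge keyed on a shared entry ID. The sketch below is one possible approach under stated assumptions, not the IPYDIS implementation: the record keys (`entry_id`, `last_revision`), the ISO-format date strings, the `harvested_from` provenance field, and the sample records are all hypothetical.

```python
def build_union_catalog(catalogs):
    """Merge records from several catalogs, keeping the newest revision per entry ID.

    `catalogs` maps a catalog name to a list of record dicts. Each record is
    assumed to carry an "entry_id" and an ISO-format "last_revision" date string,
    so plain string comparison orders revisions correctly.
    """
    union = {}
    for catalog_name, records in catalogs.items():
        for record in records:
            entry_id = record["entry_id"]
            current = union.get(entry_id)
            if current is None or record["last_revision"] > current["last_revision"]:
                # Remember where the kept copy came from (metadata heritage).
                union[entry_id] = {**record, "harvested_from": catalog_name}
    return union


# Made-up records: the same entry appears in a catalog and in a mirror.
catalogs = {
    "catalog 1": [
        {"entry_id": "ipy-001", "title": "Example A", "last_revision": "2007-03-01"},
    ],
    "catalog 3 (mirror)": [
        {"entry_id": "ipy-001", "title": "Example A", "last_revision": "2007-05-15"},
        {"entry_id": "ipy-002", "title": "Example B", "last_revision": "2007-04-10"},
    ],
}
union = build_union_catalog(catalogs)
```

Keeping the source catalog name with each merged record is what lets the metadata's heritage be tracked once records are copied between catalogs.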

Social Network

Darlene Fichter


We want your feedback!

Mark A. Parsons, parsonsm@nsidc.org

Taco de Bruin, bruin@nioz.nl

Ellsworth LeDrew, ells@watleo.uwaterloo.ca

http://ipydis.org | ipydis@ipydis.org

IPY metadata profile elements

• Entry ID (controlled)

• Data set title

• Data set progress

• Data set summary

• Data set citation information including Online Resource

• Parameters

• Locations

• ISO topic categories

• Temporal coverage

• Spatial coverage

• Data center contact information

• Access restrictions

• Use constraints

• Data Set Language

• Metadata contact information

• Metadata authority

• Metadata version

• Last revision

• IPY flag

• IPY Project ID

Recommended
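A registry could screen incoming records against the element list above with a simple completeness check. The sketch below is a minimal illustration: the snake_case key names are assumed stand-ins for the profile elements, the sample record is made up, and the check does not distinguish required elements from recommended ones, since that split is not spelled out here.

```python
# Assumed snake_case stand-ins for the profile elements listed above.
PROFILE_ELEMENTS = [
    "entry_id", "title", "progress", "summary", "citation",
    "parameters", "locations", "iso_topic_categories",
    "temporal_coverage", "spatial_coverage", "data_center_contact",
    "access_restrictions", "use_constraints", "language",
    "metadata_contact", "metadata_authority", "metadata_version",
    "last_revision", "ipy_flag", "ipy_project_id",
]


def missing_elements(record, elements=PROFILE_ELEMENTS):
    """List profile elements that are absent or empty in a metadata record."""
    # Explicit False or 0 values count as present; only None and empty
    # strings, lists, or dicts count as missing.
    return [name for name in elements if record.get(name) in (None, "", [], {})]


# Made-up, deliberately incomplete record.
record = {"entry_id": "ipy-001", "title": "Example", "summary": "", "ipy_flag": True}
missing = missing_elements(record)
```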