Eric Guilyardi and the Metafor team: Common Metadata for Climate Modelling Digital Repositories. Metafor Dissemination Workshop, Abingdon, 14 March 2011

Eric Guilyardi and the Metafor team

Common Metadata for Climate Modelling Digital Repositories

Metafor Dissemination Workshop, Abingdon, 14 March 2011

Outline
• Why Metafor
• Metafor goals
• How we are doing it
• What we have done
  – Common Information Model (CIM)
  – Controlled vocabulary
  – CMIP5 support and questionnaire
• Next steps

Why Metafor

Current issues (before Metafor):
• Finding climate model data is hard
• Understanding data is harder (esp. for non-experts)
• Discriminating between two simulations/models is not easy
• Documentation is “patchy” and specific to modelling centres
• Documentation currently revolves around (at best) the runtime, but not the scientific detail and relevance of the model components
• Little or no documentation of the “simulation context”, i.e. the whys and wherefores and issues associated with any particular simulation

Why Metafor

Example: the CMIP5 documentation challenge
• 20 modelling centres
• 40+ models
• 60 numerical experiments
• 90,000 years of simulation
• 2 million output datasets
• Data to be available from “core-nodes” and “modelling-nodes” in a global federation (petabytes of data)
• Users need to find datasets, and discriminate between models and between simulation characteristics

Why Metafor

Discovery, Documentation, Definition

Why Metafor

Discovery, Documentation, Definition & examples

[Figure labels: CF-NETCDF, IS-ENES Metrics, Metafor-CIM, GEOSS, Journal Paper]

Why Metafor

“The main objective of Metafor is to develop a Common Information Model (CIM) to describe climate data and the models that produce it in a standard way, and to ensure the wide adoption of the CIM”

Requirements for success:
• Gather top field experts
• Engage with similar existing activities
• Work towards community adoption
• Capture wider community needs
• Ensure post-project governance

Facts and Figures

INFRA-2007-1.2.1 Scientific Digital Repositories

• 12 partners
• EU contribution of 2.2M€
• Started March 2008, duration 3.5 years

Partners:
– NCAS, University of Reading, UK (Coordinator)
– BADC, Science and Technology Facilities Council, UK
– CERFACS, France
– Models and Data, Max Planck Institute for Meteorology, Germany
– Institute Pierre-Simon Laplace, CNRS, France
– University of Manchester, UK
– Met Office, UK
– Administratia Nationala de Meteorologie, Romania
– Météo France, CNRM, France
– CLIMPACT, France
– CICS, Princeton University, USA
– University of Cantabria, Spain

METAFOR Year 1 meeting in Abingdon, Feb. 2009

Where Metafor came from
• PRISM project (FP5 2001-2004), ENES
• PRISM Sustained Initiative (PSI)
  – Code coupling and I/O
  – Integration and modelling environments
  – Data processing and management
  – Meta-data standards (key!)
  – Computing issues

[Slide annotations: Metafor, IS-ENES, ENES coordination]

Metafor organisation

Metafor activities and work packages (WP) map onto the I3 structure.

Project management, training and dissemination are organised in WP1 and WP7.

The CIM

• The CIM builds on existing metadata standards used internationally in climate research (CF, NMM, Curator, FLUME, ISO standards, etc.), plus some new elements
• The CIM defines a general structure over which a specific Controlled Vocabulary (CV) can be applied
• A CV consists of the terms (and their relationships) used to build the content of CIM instances
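
As an illustration only, the split between a general structure and a Controlled Vocabulary applied over it might be pictured as below. This is a minimal sketch, not the CIM itself: the class names, the property name `lateral_mixing` and its terms are all invented for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ControlledVocabulary:
    """Allowed terms per property name (hypothetical, not the Metafor CV)."""
    terms: dict[str, set[str]]

    def allows(self, prop: str, value: str) -> bool:
        # A value is acceptable only if it is a listed term for that property.
        return value in self.terms.get(prop, set())


@dataclass
class ComponentDescription:
    """A generic structure: a named component carrying property/value pairs."""
    name: str
    properties: dict[str, str] = field(default_factory=dict)


def validate(desc: ComponentDescription, cv: ControlledVocabulary) -> list[str]:
    """Return the property names whose values are not terms of the CV."""
    return [p for p, v in desc.properties.items() if not cv.allows(p, v)]


# Invented example vocabulary and two invented descriptions.
ocean_cv = ControlledVocabulary(
    terms={"lateral_mixing": {"isopycnal", "horizontal", "none"}}
)
good = ComponentDescription("ocean", {"lateral_mixing": "isopycnal"})
bad = ComponentDescription("ocean", {"lateral_mixing": "sideways"})

print(validate(good, ocean_cv))  # []
print(validate(bad, ocean_cv))   # ['lateral_mixing']
```

The structure (`ComponentDescription`) stays the same whichever vocabulary is plugged in; only the set of permitted terms changes.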

The CIM “metamodel”

Lots of people talk about climate models and data;

Some people even agree about those things (“the yolk”);

We have a formal way of describing that (UML, CONCIM);

That UML is constrained to follow a particular meta-model...

...so that it can be transformed into something usable (XSD, APPCIM) for particular users;

Metadata instances conform to an APPCIM

The CONCIM

Climate Modelling = an activity using software to produce data on a grid, to be archived in a repository.

The CONCIM

[Diagram of the CONCIM packages: Software, Data, Activity, Grid]

CIM v 1.4 available on the Metafor website at: http://metaforclimate.eu/trac/browser/CIM
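
To make the four packages concrete, here is a toy sketch of how Activity, Software, Data and Grid could relate, following the one-line definition of climate modelling given for the CONCIM. The field names and example values are assumptions made for illustration; they are not taken from the CIM v1.4 UML.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Software:
    name: str


@dataclass
class Grid:
    name: str


@dataclass
class Data:
    name: str
    grid: Grid        # the grid the data is produced on
    repository: str   # where the data is archived


@dataclass
class Activity:
    name: str
    uses: Software
    produces: list[Data]


def describe(activity: Activity) -> str:
    """Spell out the activity in the shape of the conceptual definition."""
    outputs = ", ".join(
        f"{d.name} on {d.grid.name} (archived in {d.repository})"
        for d in activity.produces
    )
    return f"{activity.name} uses {activity.uses.name} to produce {outputs}"


# All values below are placeholders, not real models or datasets.
simulation = Activity(
    name="example-simulation",
    uses=Software("example-model"),
    produces=[Data("example-output", Grid("example-grid"), "example-repository")],
)
print(describe(simulation))
```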

The CONCIM to APPCIM

Conceptual Model: Climate Modelling = an activity using software to produce data on a grid, to be archived in a repository.

[Diagram labels: Conceptual Model (UML); Application Models (XSD, RDF); XML instances @ PCMDI, @ IPSL, @ BADC; annotations “e.g. CIM” and “e.g. CMIP5”]

An essential aim of Metafor is that the conceptual model is not changed by the manner in which it is used or applied.

More in Allyn’s presentation
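
The idea that one conceptual description can feed several application forms without being altered can be sketched as follows. This is only an analogy using the Python standard library: the record layout, element names and triple format are invented here and do not reproduce the actual CIM XSD or RDF.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# A hypothetical "conceptual" record; keys and values are placeholders.
record = {"type": "software", "name": "example-model", "component": "ocean"}


def to_xml(rec: dict[str, str]) -> str:
    """Render the record as a small XML fragment (one application form)."""
    root = ET.Element(rec["type"])
    for key, value in rec.items():
        if key != "type":
            ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


def to_triples(rec: dict[str, str]) -> list[tuple[str, str, str]]:
    """Render the same record as subject/predicate/object triples."""
    subject = rec["name"]
    return [(subject, key, value) for key, value in rec.items() if key != "name"]


before = dict(record)
xml_form = to_xml(record)
triple_form = to_triples(record)

print(xml_form)
print(triple_form)
print(record == before)  # the conceptual record itself is untouched
```

Both renderings read from the same record and neither writes back to it, which is the property the slide asks of the conceptual model.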

The Query and other CIM Tool(s)

building a search interface for CIM instances

building a CIM instance viewer

building a CIM instance comparer

More in Mark’s presentation
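
As a rough idea of what an instance comparer has to do, the sketch below diffs two flattened metadata records. It is not the Metafor tool: the flat key/value representation, the key names and the values are assumptions chosen to keep the example short.

```python
from __future__ import annotations

from typing import Optional


def compare(
    a: dict[str, str], b: dict[str, str]
) -> dict[str, tuple[Optional[str], Optional[str]]]:
    """Return the properties that differ, as (value in a, value in b).

    A property missing from one record is reported with None on that side.
    """
    diff: dict[str, tuple[Optional[str], Optional[str]]] = {}
    for key in sorted(a.keys() | b.keys()):
        if a.get(key) != b.get(key):
            diff[key] = (a.get(key), b.get(key))
    return diff


# Two invented model descriptions, flattened to "component.property" keys.
model_a = {"ocean.lateral_mixing": "isopycnal", "atmosphere.levels": "38"}
model_b = {
    "ocean.lateral_mixing": "horizontal",
    "atmosphere.levels": "38",
    "sea_ice.present": "yes",
}

differences = compare(model_a, model_b)
for prop, (left, right) in differences.items():
    print(f"{prop}: {left} vs {right}")
```

Because both records draw their values from the same controlled vocabulary, a plain equality test per property is enough to show where two models differ.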

Using the CIM to support CMIP5

• The CMIP5 experimental archives will be ~500TB of model run data

• We need to be able to capture all the details of these experiments (and the component models and platforms used) to allow users of the archive to differentiate between the experiments and the models.

• To do this, Metafor has been tasked by WGCM/CMIP to define, collect and provide the CMIP5 model metadata.

Controlled Vocabulary

Novelty in the community: a Software CV

Developing a community-consistent way of describing the scientific properties of model subcomponents

570 controlled questions, hundreds of choices!

Creating the CV mindmap

Ocean Lateral physics CV

More in Sébastien’s presentation

CMIP5 metadata collection

CMIP5 Questionnaire @BADC

More in Gerry’s presentation

ESG Federation Architecture for CMIP5

How it all fits together

Creating the CV mindmap

CMIP5 Questionnaire @BADC

CIM+DOI database

Portals for user access (search & other tools)

User perspective and access to data via ESGF

Upcoming challenges

• CMIP5 questionnaire support
  – Deployed since November 2010
  – Growing use (20 groups)
  – Feedback on CIM concepts
• Provide first CIM tools and services (WP4/5/6)
  – Articulate developments with IS-ENES and ESG
• Community outreach and buy-in
  – Develop CIM for other uses (Charlotte’s presentation)
• Set up long-term governance for CIM and CV
  – “standards committee” under WCRP

A few quotes...

“The open standard developed in Metafor will play a catalytic role in the way next generation climate data repositories, such as IPCC AR5, are organised, preserved and accessed” - METAFOR project proposal

“METAFOR is now a major international focal point for earth system modelling metadata definition” - Karl Taylor, PCMDI

“The right people for the right project at the right time!”

“There is evidence of excellent teamwork” - EU review

More on: http://metaforclimate.eu