EUDAT: Towards a European Collaborative Data Infrastructure
Damien Lecarpentier – CSC, IT Center for Science, Finland
ISC’11, Hamburg, 20 June 2011



Page 2: Outline of the talk

EUDAT concept

EUDAT consortium

EUDAT service approach

Expected benefits and challenges of a CDI

Page 3: EUDAT Key facts and objectives

Initiative funded through FP7 e-Infrastructure Call 9 (WP11), INFRA-2011-1.2.2: Data infrastructure for e-Science (November 2010)
  • Call 9 objective: “Establish a persistent and robust service infrastructure for scientific data in Europe that responds to the need of data-intensive Science of 2020”
  • Budget: 43 M€

EUDAT selected for funding (three-year project)
  • Official starting date: 1 October 2011
  • Biggest budget of the call: 9.3 M€ EC grant
  • Total budget: 16.3 M€

Consortium
  • 23 partners representing 13 countries
  • 15 user communities from a wide range of disciplines (Biomed, Earth Science, Climate, SSH, etc.)

Targets
  • EUDAT objective: “To deliver a Collaborative Data Infrastructure (CDI) with the capacity and capability for meeting researchers’ needs in a flexible and sustainable way, across geographical and disciplinary boundaries.”

  • The infrastructure must be collaborative
  • The infrastructure must be driven by researchers’ needs
  • The infrastructure must be sustainable yet flexible
  • The infrastructure must be pan-European
  • The infrastructure must be multi-disciplinary

Page 4: The current data infrastructure landscape: challenges and opportunities

  • Long history of data management in Europe: several existing data infrastructures dealing with established and growing user communities (e.g., ESO, ESA, EBI, CERN)
  • New Research Infrastructures are emerging and are also trying to build data infrastructure solutions to meet their needs (CLARIN, EPOS, ELIXIR, ESS, etc.)
  • A large number of projects providing excellent data services (EURO-VO, GENESI-DR, Geo-Seas, HELIO, IMPACT, METAFOR, PESI, SEALS, etc.)
  • However, most of these infrastructures and initiatives primarily address the needs of a specific discipline and user community

Challenges
  • Compatibility, interoperability, and cross-disciplinary research
  • Data growth in volume and complexity (the so-called “data tsunami”): strong impact on costs, threatening the sustainability of the infrastructure

Opportunities
  • Potential synergies do exist: although disciplines have different ambitions, they have common basic needs and requirements that could be matched with generic pan-European services supporting multiple communities and ensuring greater interoperability.

Strategy needed at pan-European level

Page 5: Towards a Collaborative Data Infrastructure

[Figure not reproduced. Source: HLEG report, p. 31]

EUDAT will focus on building this generic data infrastructure layer and offer a trusted domain for long-term data preservation, accompanied by related services to store, identify, authenticate and mine these data.

This needs to be done in close collaboration with the communities:
  • Core services must match the requirements of the communities
  • Community services can also be incorporated into the common data service infrastructure when they are of use to other communities

Page 7: The EUDAT Communities

Page 9: The EUDAT Communities (by field)

EUDAT targets all scientific disciplines (discipline neutral):
  • To enable the capture and identification of cross-discipline requirements
  • To involve the scientists of all the communities in the shaping of the infrastructure and its services

  • Biological and Medical Science: VPH, ELIXIR, BBMRI, ECRIN
  • Environmental Science: ENES, EPOS, Lifewatch, EMSO, IAGOS-ERI, ICOS
  • Social Sciences and Humanities: CLARIN
  • Physical Sciences and Engineering: WLCG, ISIS
  • Material Science: ESS…
  • Energy: EUFORIA…

Page 10: EUDAT Services Activities – Iterative Design

EUDAT’s Services activity is concerned with identifying the types of data services needed by the European research communities, delivering them through a federated data infrastructure, and supporting their users.

1. Capturing Communities Requirements (WP4)
  • Services to be deployed must be based on user communities’ needs
  • Strong engagement and collaboration with user communities (EUDAT communities and beyond) to capture requirements

2. Building the services (WP5)
  • User requirements must be matched with available technologies
  • Need to identify:
      • available technologies and tools to develop the required services (technology appraisal)
      • gaps and market failures that should be addressed by EUDAT research activities
  • Services must be designed, built and tested in a pre-production test bed environment and made available to WP4 for evaluation by their users

3. Deploying the services and operating the federated infrastructure (WP6)
  • Services must be deployed on the EUDAT infrastructure and made available to users, with interfaces for cross-site, cross-community operation
  • Reliability, 24h/7d availability and accessibility of the shared services, with operational security, data integrity and compliance with stakeholder requirements and policies

Page 11: EUDAT core services

Core services are building blocks of EUDAT’s Common Data Infrastructure, mainly included in the bottom layer of data services.

Fundamental Core Services
  • Long-term preservation
  • Persistent identifier service
  • Data access and upload
  • Workspaces
  • Web execution and workflow services
  • Single Sign On (federated AAI)
  • Monitoring and accounting services
  • Network services

Extended Core Services (community-supported)
  • Joint metadata service
  • Joint data mining service

No need to match the needs of all at the same time; addressing a group of communities can be very valuable, too.

Page 12: Service Model Approach and Generic Collaboration

Generic Service Model
  • Fundamental Core Services meet strongly overlapping service requirements
  • Extended Core Services are mainly community-supported; community requirements typically overlap between some disciplines

Collaboration between Teams
  • Fundamental Core Services are operated and supported by an Operations Team which collaborates across the participating centres
  • Extended Core Services and other joint multi-disciplinary services must be community-supported; the requirements overlap between a specific subset of disciplines

Page 13: EUDAT Timeline

[Timeline figure, 2012 to 2015]
  • Phases: User requirements, Service design, Service deployment
  • Milestones: EUDAT Kick-Off; 1st, 2nd, 3rd and 4th User Forums; first services available; cross-community services; full core services deployed; Sustainability Plan

Page 14: Expected benefits of a Collaborative Data Infrastructure

Enabling multi-disciplinary data intensive research and collaboration
  • Development of common services supporting research communities
      • Support to existing scientific communities’ infrastructures
      • Support to smaller communities through access to sophisticated services
  • Inter-disciplinary collaboration and exploitation of synergies between communities
      • Communities from different disciplines working together to build services
      • Data sharing between disciplines
  • Collaboration with other large-scale infrastructures
      • European e-Infrastructures: Géant, PRACE, EGI, etc.
      • Global initiatives in the US, Japan, Australia, etc.

Ensuring wide access to and preservation of data in a sustainable way
  • A robust generic infrastructure capable of handling the scale and complexity of data that will be generated over the next 10-20 years
  • Greater access to existing data and better management of data for the future
  • Increased security by managing multiple copies in geographically distant locations
  • Put Europe in a competitive position for important data repositories of world-wide relevance
  • Economies of scale and cost-efficiency: shared resources and work are less costly

Page 15: Challenges and Opportunities

Delivering high level multi-disciplinary data services
  • Achieving a high level of interoperability in the context of diversity of data, research disciplines and practices
  • Need to strongly involve the different communities in the design and evaluation of services
  • EUDAT as a platform to discuss interoperability issues (along with other initiatives, e.g. DAITF)

Building trust among stakeholders
  • Trust between service providers and users but also between the researchers and disciplines themselves
  • Trust in the EUDAT infrastructure, the data deposited and collected, data integrity

Ensuring the sustainability of the infrastructure
  • Providing a framework and a plan to ensure the continuity of services beyond the immediate funding window, through the setting up of a sustainable entity
  • Funding and business models
  • Partnerships (new communities, industry, etc.) and governance models

Page 16: The beginning of a long journey…

“Do the difficult things while they are easy and do the great things while they are small. A journey of a thousand miles must begin with a single step.”

Lao Tzu

Page 17: How to get in touch with EUDAT?

Kimmo Koski, CSC - IT Center for Science
EUDAT Project Coordinator
[email protected]

Peter Wittenburg, Max Planck Institute for Psycholinguistics at Nijmegen (MPI-PL)
EUDAT Scientific Coordinator
[email protected]

Damien Lecarpentier, CSC - IT Center for Science
EUDAT Project Manager
[email protected]

EUDAT@ISC’11

BoF session on “e-Infrastructure for science in Europe”, on Tuesday 21 June, 14:30-15:15, Hall B

Partners’ booths at ISC: CSC #146, BSC #114, DKRZ #140, EPCC #152

THANK YOU!