www.egi.eu EGI-Engage is co-funded by the Horizon 2020 Framework Programme
of the European Union under grant number 654142
EGI technical platforms for advanced computing
Tiziana Ferrari Technical Director, EGI Foundation
Outline
• Introduction to EGI
• Services for distributed computing, data management and AAI
• New requirements, new challenges
• Towards the European Open Science Cloud
Astronomical Data Analysis Software and Systems 2016 3
EGI Foundation
EGI Membership
• Major national e-Infrastructures: 22 NGIs
• EIROs: CERN and EMBL-EBI
• EGI Foundation
• (ERICs)
https://www.egi.eu/about/egi-foundation/
International Partnerships
• Africa and Arabia: Council for Scientific and Industrial Research, South Africa
• India: Centre for Development of Advanced Comp.
• China: Inst. of HEP, Chinese Academy of Sciences
• Latin America: Universidade Federal do Rio de Janeiro
• Ukraine: Ukrainian National Grid
• USA
• Canada
• Asia Pacific Region: Academia Sinica at Taiwan
EGI Federation, 2016 QR3
The largest distributed compute e-Infra worldwide
• 23 cloud providers, +300 data centres
• +250,000 instantiated VMs/year
• 1.7 million jobs/day
• 2.6 billion CPU hours/year, +26%
• >48,000 users, +25%
Serving researchers and innovators
Communities served, by size of individual groups:
• ESFRIs, FET flagships: WLCG, CTA, ELIXIR, EPOS, EISCAT_3D, BBMRI, CLARIN, LOFAR, EMSO, ELI, LifeWatch, ICOS, CORBEL, ENVRIplus …
• Multinational communities (VRE projects): WeNMR, DRIHM, VERCE, MuG, AgINFRA, CMMST, LSGC, SuperSites Exploitation, Environmental sci., neuGRID …
• ‘Long tail’: PeachNote, CEBA Galaxy eLab, Semiconductor design, Main-belt comets, Quantum physics studies, Virtual imaging (LS), Bovine tuberculosis spread, Convergent evol. in genomes, Geography evolution, Seafloor seismic waves, 3D liver maps with MRI, Metabolic rate modelling, Genome alignment, Tapeworms infection on fish …
• Industry, SMEs: Agroknow, CloudEO, CloudSME, Ecohydros, gnubila, Sinergise, SixSq, TEISS, Terradue, Ubercloud …
Supporting international research communities and thematic services
Example: Structural Biology
Distribution of users (2016, QR3)
➢ 2700 users
➢ 81 countries
(credits: A. Bonvin, WeNMR)
[Figure: Installed compute capacity trends 2011-2016]
Astronomy/Astrophysics/Astro-particle physics projects and RIs in EGI
ARGO, AUGER, CTA, KM3NeT, LHCb, LOFAR, Large Synoptic Survey Telescope/LSST, PAMELA, ESA Planck Mission, XENON etc.
2010-2016: > 25 M jobs
Services Catalogue
http://go.egi.eu/ServiceCatalogue
Cloud Compute
Run virtual machines on-demand with complete control over the computing resources
• On-demand provisioning
• Full control over computing resources
• Standard interface to deploy on multiple service providers
Benefits
• Execute compute- and data-intensive workloads, including GPGPU computing in the cloud
• Host long-running services
EGI Federated Cloud
• System of cloud infrastructures
• Standard user interfaces
– Clouds and their interconnections are based on open standards, open technologies
– Based on OCCI/OGF and OpenStack
• Harmonised operational behaviour
• Value proposition: distributed cloud computing for analysis of distributed large datasets
Benefits
[Figure: federated cloud sites running OpenStack, OpenNebula and Synnefo]
Harmonised operation
• Cloud registry
• Information system
• Virt. Machine marketpl.
• Usage accounting
• Access control
Uniform user interfaces
VM and block storage management:
• […] – on every site
• OpenStack Nova – on OS sites
Object storage management (optional):
• CDMI – on any site
• OpenStack SWIFT – on OS sites
EGI Federated Cloud
EGI Federated Cloud is a collaboration of communities developing, innovating, operating and using cloud federations for research and education.
Today:
• 23 providers from 14 NGIs
– 15 OpenStack
– 6 OpenNebula
– 1 Synnefo
• ~7,000 cores in total
Porting of LOFAR calibration pipeline 1/2
Credits: Susana Sánchez Expósito - CSIC
Porting of LOFAR calibration pipeline 2/2
Outcomes
• The computing capabilities fulfil the requirements from the use case
• The memory and CPU needs depend on the specific pipeline step, and the EGI Federated Cloud allows virtual machines to be configured with different capabilities.
• A better storage solution is needed
– The user data are too large to be stored in the VM images. They should be stored in volumes that are easily mountable from several VMs and synchronized across different cloud providers.
• COMPSs facilitates the porting and deployment of the application
Cloud Container Compute
Run Docker containers within isolated user-space with no overhead
• On-demand provisioning
• Lightweight environment for maximized performance
• Standard interface to deploy on multiple service providers
Benefits
• Reduce time to production by removing friction between development and operations environments
• Interoperable and transparent
High-Throughput Compute
Analyze large datasets by executing large numbers (thousands) of computational tasks
• Access to high-quality computing resources
• Integrated monitoring and accounting tools to provide information about the availability and resource consumption
• Workload and data management tools to manage all computational tasks
Benefits
• Large amounts of processing capacity over long periods of time
• Faster results for your research
• Shared resources among users, enabling collaborative research
CTA: Monte Carlo simulations and Analysis
• Use of EGI High Throughput Compute, Online Storage, Archive Storage
• Use of DIRAC for Data Cataloguing and Workload Management
CTA resource pool in EGI:
• 20 sites with approx. 8000 CPU cores
• Disk: 1.3 PB in 6 sites (Online Storage)
• Tape: 400 TB in 3 sites (Archive Storage)
DIRAC for CTA:
• CTA-specific extension of DIRAC
• File Catalogue ~21 million files
• Computing tasks
– 8 million so far; 2.6 PB processed
– Data transformation tasks vs. User tasks
[Figure: file catalogue interface showing metadata selection, catalog browsing and query result]
Online storage
Store, share and access your files and their metadata on a global scale
• Assign global identifiers to files
• Access highly-scalable storage from anywhere
• Control the data you share
• Organise your data using a flexible hierarchical structure
Benefits
• Highly scalable storage system accessible from anywhere
• Easily share data
• Access through different interfaces
Archive storage
Back up your data for the long term and future use in a secure environment
• Store data for long-term retention
• Store large amounts of data
• Free up your online storage
Benefits
• Stores large amounts of data
• Long-term retention
• Reliable and interoperable
Data Transfer
Transfer large sets of data from one place to another
• Move research data fast
• Specialized analytics of on-going transfers
• User interface to manage transfer and network resources
Benefits
• Ideal for very large files
• Able to handle large amounts of files
• Transfer process with automatic retry
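The "automatic retry" benefit can be illustrated with a minimal sketch. This is not the EGI Data Transfer service's own logic; it only shows the general retry-with-backoff idea. The function name `transfer_with_retry`, its parameters, and the choice of `OSError` as the retryable failure are assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def transfer_with_retry(
    transfer: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `transfer` until it succeeds or `max_attempts` is reached.

    Hypothetical helper: `transfer` stands for any callable that moves
    one file and raises OSError on a transient failure. The wait doubles
    after each failed attempt (exponential backoff).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transfer()
        except OSError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a caller would normally leave it at the default.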
EGI AAI New Architecture
[Figure components: EGI Services, EGI CheckIn, IdP, Attribute Authority]
Mandatory Attributes
• EGI UID
• First name, last name
• email
• affiliation
Why an IdP/SP Proxy?
• Service Providers (SPs) can have one statically configured IdP
• No need to run an IdP Discovery Service on each EGI SP
• Connected SPs get harmonised user identifiers and accompanying attribute sets from one or more AAs that can be interpreted in a uniform way for authZ purposes
• External IdPs only deal with a single EGI SP proxy
EGI services will not have to deal with the complexity of multiple IdPs/Federations/Attribute Authorities/technologies.
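The harmonisation argument can be sketched in a few lines. The sketch assumes two hypothetical identity providers (`idp-a`, `idp-b`) that release the same user information under different attribute names; the proxy maps both onto one mandatory attribute set, so a connected service sees a single shape. The mapping table, class and function names are illustrative assumptions, not the EGI CheckIn implementation.

```python
from dataclasses import dataclass

# Hypothetical per-IdP attribute names, mapped onto the mandatory set
# (UID, first name, last name, email, affiliation).
ATTRIBUTE_MAP = {
    "idp-a": {
        "uid": "eduPersonPrincipalName",
        "first_name": "givenName",
        "last_name": "sn",
        "email": "mail",
        "affiliation": "eduPersonScopedAffiliation",
    },
    "idp-b": {
        "uid": "user_id",
        "first_name": "given",
        "last_name": "family",
        "email": "email_address",
        "affiliation": "org_role",
    },
}


@dataclass(frozen=True)
class HarmonisedUser:
    uid: str
    first_name: str
    last_name: str
    email: str
    affiliation: str


def harmonise(idp: str, raw: dict) -> HarmonisedUser:
    """Translate one IdP's raw attributes into the uniform set."""
    mapping = ATTRIBUTE_MAP[idp]
    missing = [src for src in mapping.values() if src not in raw]
    if missing:
        raise ValueError(f"{idp} did not release: {', '.join(missing)}")
    return HarmonisedUser(**{target: raw[src] for target, src in mapping.items()})
```

A service behind the proxy would then write its authorisation rules against `HarmonisedUser` only, regardless of which IdP the user came from.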
Open Data Platform
• Manage entire data life cycle from raw data to preservation
• Combine efficient computation services with open data managed by federated infrastructures
– No local staging of data for processing
• Share public datasets for download or reuse
• Make public datasets discoverable
Open Data Platform interfaces
GUI
• Web based
• Easy data management and sharing, access control
• Publication of data items
REST
• Advanced data and collection management API for integration with community tools and portals
CDMI
• Standard data management operations
• Advanced metadata queries
• Integration with future […]
POSIX
• Enable direct mounting of spaces in the local file system without full data transfer
OAI-PMH
• OAI Data Provider interface
• Dublin Core metadata by default
• More complex metadata can be […]
HTTP
• Direct download of open data from URLs
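As an illustration of the OAI-PMH interface listed above, the following sketch builds a harvesting request. `ListRecords`, `metadataPrefix` and the `oai_dc` prefix (Dublin Core) come from the OAI-PMH protocol itself; the base URL and the function name are placeholders, since the slides do not give the platform's actual endpoint.

```python
from typing import Optional
from urllib.parse import urlencode


def oai_list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    set_spec: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL.

    `base_url` is a placeholder for a data provider endpoint.
    `oai_dc` asks for Dublin Core, the default metadata format.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # optional selective harvesting
    return f"{base_url}?{urlencode(params)}"


# Example with a placeholder endpoint:
print(oai_list_records_url("https://data.example.org/oai"))
```

A harvester would fetch this URL over HTTP and parse the XML response; that part is omitted here because it depends on the provider.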
Open Data Platform workflow
[Figure: data flow between private resources and public services for data discovery]
Diagram elements:
• Private Resources
• Dataset
• Share of Dataset
• Copy of Data-set-1.1
• Public Services For Data Discovery
Actions shown:
• Create a share from folder or file with Metadata in DC
• Register identifier (e.g. DOI)
• Expose share metadata over OAI-PMH
• Discover dataset
• View dataset preview in browser using DOI
• Discover dataset and mount share
• Direct mount via POSIX
• Mount open data set directly on desktop or VM (in read only mode)
• Copy dataset
• Lazy Replication
EGI role towards the European Open Science Cloud
• An Open Science Service Exchange as partnership of public/commercial organizations and initiatives responsible for
– Provisioning of a wide set of services to researchers and innovators → consolidation of national e-Infrastructures, open standards, technical and business process integration among the suppliers (e-Infrastructures, Research Infrastructures etc.)
– Platform integration for community-specific capabilities with coordinated outreach
– Aggregation of demand for economies of scale, technical requirements translations, cross-border access via brokering and procurement, end-to-end operations
– Development of human capacity
– A “Digital innovation hub” to support innovation with industry/SMEs
www.egi.eu
Thank you for your attention.
Questions?
This work by Parties of the EGI-Engage Consortium is licensed under a Creative Commons Attribution 4.0 International License.
Acknowledgements: This presentation used icons made by Freepik from www.flaticon.com