The Euclid Challenges

Pierre Dubath
SDC-CH, Department of Astronomy, University of Geneva

IAU Symposium 325, Astroinformatics
Sorrento (Italy), October 20-24, 2016

Euclid Consortium · Oct. 20th, 2016 · Euclid data processing challenges

List of Collaborators

Nikolaos Apostolakos, Andrea Bonchi, Andrey Belikov, Massimo Brescia, Peter Capak, Jean Coupon, Christophe Dabin, Hubert Degaudenzi, Shantanu Desai, Florian Dubath, Adriano Fontana, Sotiria Fotopoulou, Marco Frailis, Audrey Galametz, Catherine Grenet, John Hoar, Mark Holliman, Ben Hoyle, Olivier Ilbert, Martin Kuemmel, Clotilde Laigle, Giuseppe Longo, Henry Joy McCracken, Martin Melchior, Yannick Mellier, Joe Mohr, Nicolas Morisset, Stéphane Paltani, Roser Pello, Stefano Pilo, Gianluca Polenta, Maurice Poncet, Roberto Saglia, Mara Salvato, Marc Sauvage, Marc Schefer, Marco Scodeggio, Stella Seitz, Santiago Serrano, Marco Soldati, Andrea Tramacere, Rees Williams, Andrea Zacchei, etc.

Outline

1. Science and mission overview
2. Instruments and data analysis
3. Software development
4. Software integration and operation preparation
5. Swiss Science Data Center (SDC-CH) major tasks

● This presentation
    ● targets a non-Euclid audience
    ● focuses on data processing aspects of the Euclid mission

An Ever-Expanding Universe?

Nobel Prize in Physics 2011

Discovery of the accelerated expansion of the Universe through observations of distant supernovae

Perlmutter, Schmidt and Riess

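The Euclid mission main goal

● What is the nature of dark matter and dark energy?

[Figure: composition of the Universe, with shares of 68%, 27% and 5% (dark energy, dark matter and ordinary matter, respectively, in the standard cosmological model)]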

The Euclid mission

● ESA medium-class cosmology mission selected in 2011

● Soyuz launch from Kourou to L2 in 2020 and 6-year mission

● Survey of 15'000 square degrees: optical and NIR images and NIR spectra → shape and distance measurements of billions of galaxies

● Constraints on cosmology models from different types of measurements (or probes):
    – Gravitational (strong and weak) lensing
    – Baryonic Acoustic Oscillation (BAO)
    – Integrated Sachs-Wolfe (ISW) effect (galaxy clusters)
    – Redshift-space distortions (Kaiser effect)
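To put the survey area in perspective, the short sketch below converts 15'000 square degrees into a fraction of the full sky. The only input not taken from the slide is the standard full-sky area of 4π steradians; the function name is illustrative.

```python
import math

# Full sky in square degrees: 4*pi steradians times (180/pi)^2 deg^2 per steradian.
FULL_SKY_DEG2 = 4 * math.pi * (180 / math.pi) ** 2

# Survey area quoted on the slide.
SURVEY_AREA_DEG2 = 15_000


def sky_fraction(area_deg2: float) -> float:
    """Return the fraction of the celestial sphere covered by area_deg2."""
    return area_deg2 / FULL_SKY_DEG2


print(f"Full sky: {FULL_SKY_DEG2:.0f} deg^2")
print(f"Survey fraction: {sky_fraction(SURVEY_AREA_DEG2):.1%}")
```

Under this simple geometric estimate the survey covers a bit more than a third of the sky.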

Weak lensing illustration

Masses bend light paths!
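As a rough illustration of how strongly a mass bends a light path, the sketch below evaluates the standard general-relativistic deflection angle for a point mass, α = 4GM/(c²b), for a ray grazing the Sun. The formula and the rounded constants are textbook values added here for illustration; they do not appear on the slide.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg
R_SUN = 6.957e8    # solar radius, m


def deflection_angle_rad(mass_kg: float, impact_parameter_m: float) -> float:
    """Point-mass light deflection angle, alpha = 4 G M / (c^2 b), in radians."""
    return 4 * G * mass_kg / (C ** 2 * impact_parameter_m)


def rad_to_arcsec(angle_rad: float) -> float:
    """Convert an angle from radians to arcseconds."""
    return math.degrees(angle_rad) * 3600


alpha = deflection_angle_rad(M_SUN, R_SUN)
print(f"Deflection at the solar limb: {rad_to_arcsec(alpha):.2f} arcsec")
```

The result is about 1.75 arcseconds. Weak lensing by large-scale structure produces far subtler distortions, which is why shapes of very many galaxies have to be measured.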

3D dark matter map, COSMOS field

NASA, ESA, R. Massey (California Institute of Technology).

Baryonic Acoustic Oscillation

Strong Lensing

Euclid will lead to the detection of very large numbers of strong lenses at cluster and galaxy scales

Beautiful images... but only Euclid legacy science!

The Euclid Spacecraft

1.2 m Korsch telescope with a silicon carbide primary mirror

… and ground photometry for PHZ!

Dark Energy Survey (DES), Kilo-Degree Survey (KiDS), LSST (?), Javalambre/Spain, Subaru/Japan (?), CFHT/Canada

Processing budget

                                2021  2022  2023  2024  2025  2026  2027
Storage (PB)                      15    30    50    60    75    90    90
Computing (kilo cores / year)    2.5     5   8.5    12    16    20    21

Numbers from Christophe Dabin @ tk1
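The year-on-year growth implied by these figures can be read off with a few lines of Python. The numbers are copied from the table; the helper name is illustrative, and treating the rows as cumulative totals per year is an assumption.

```python
YEARS = [2021, 2022, 2023, 2024, 2025, 2026, 2027]
STORAGE_PB = [15, 30, 50, 60, 75, 90, 90]
COMPUTING_KCORES = [2.5, 5, 8.5, 12, 16, 20, 21]


def yearly_increase(values):
    """Return the difference between each value and the one before it."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Pair each increase with the year in which it is reached.
for year, d_storage, d_compute in zip(
    YEARS[1:], yearly_increase(STORAGE_PB), yearly_increase(COMPUTING_KCORES)
):
    print(f"{year}: +{d_storage} PB storage, +{d_compute} kilo cores")
```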

Processing functional break down

[Diagram. Labels: Ground Station, MOC, SOC, OPS, LE1, SIM, VIS, NIR, SIR, EXT, MER, SPE, SHE, PHZ, LE3; Level 1, Level 2, Level 3, Level E, Level S]

● SIM : simulated data
● VIS : visible calibrated frames
● NIR : near IR calibrated frames
● SIR : calibrated 1-D spectra
● EXT : calibrated ground frames
● MER : catalog with consistent photometry and spectroscopy
● SPE : spectroscopic redshifts
● PHZ : photometric redshifts
● SHE : shape measurements
● LE3 : high-level processing
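A breakdown like this can be thought of as a dependency graph from which a valid execution order is derived. The sketch below is only an illustration: the processing function names come from the slide, but the dependency edges are assumptions made for the example and are not read off the diagram.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Map each processing function to the functions whose output it needs.
# ASSUMPTION: these edges are illustrative, not the official Euclid data flow.
DEPENDS_ON = {
    "VIS": set(),
    "NIR": set(),
    "SIR": set(),
    "EXT": set(),
    "MER": {"VIS", "NIR", "EXT"},
    "SPE": {"SIR", "MER"},
    "PHZ": {"MER"},
    "SHE": {"VIS", "MER"},
    "LE3": {"SPE", "PHZ", "SHE"},
}


def processing_order(depends_on):
    """Return one ordering in which every function runs after its inputs."""
    return list(TopologicalSorter(depends_on).static_order())


print(processing_order(DEPENDS_ON))
```

With these assumed edges, MER always runs after the calibrated frames are available and LE3 always comes last.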

Euclid data flow

Euclid SGS organization

SDC task : Software development and data processing

OU task : Algorithms specification & validation

Software Development

● C++ and Python languages

● One reference platform
    – Linux from the Red Hat family (currently CentOS 7)
    – Set of common libraries (EDEN)

● Software development on a virtual machine (LODEEN)

● RPM packaging

● XML-based common data model

● A common building and packaging framework
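To give a flavour of what an XML-based data model means in practice, here is a minimal sketch that reads a small product description with Python's standard library. The XML layout, element names and file name are invented for illustration and are not the Euclid Common Data Model schema.

```python
import xml.etree.ElementTree as ET

# HYPOTHETICAL product description; every element and attribute name here
# is an assumption made for the example.
EXAMPLE_XML = """\
<DataProduct type="PhotometricRedshiftCatalog">
  <Header>
    <ProductId>EXAMPLE-0001</ProductId>
    <CreationDate>2016-10-20</CreationDate>
  </Header>
  <DataFile format="FITS">phz_catalog_example.fits</DataFile>
</DataProduct>
"""


def read_product(xml_text):
    """Extract the product type, header fields and file list from the XML."""
    root = ET.fromstring(xml_text)
    return {
        "type": root.get("type"),
        "product_id": root.findtext("Header/ProductId"),
        "creation_date": root.findtext("Header/CreationDate"),
        "files": [(f.get("format"), f.text) for f in root.findall("DataFile")],
    }


print(read_product(EXAMPLE_XML))
```

The point of such a model is that every pipeline component exchanges metadata in one agreed structure, while bulk data stays in separate files referenced from the XML.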

Elements framework

Elements is a CMake-based building and packaging framework (capitalizing on CERN expertise) featuring:

● a standard source code structure
● easy software building according to CMakeLists.txt instructions
● automated RPM packaging (make rpm)
● basic services, such as program option handling and logging
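The "basic services" item can be illustrated with a plain Python analogue. The sketch below uses only the standard library (argparse and logging) to show the kind of option handling and logging boilerplate that a framework factors out; it is not the Elements API, and the option and logger names are made up for the example.

```python
import argparse
import logging


def parse_options(argv=None):
    """Parse the command-line options of a hypothetical pipeline step."""
    parser = argparse.ArgumentParser(description="Example pipeline step")
    parser.add_argument("--input-catalog", required=True,
                        help="path to the input catalog")
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
                        help="verbosity of the log output")
    return parser.parse_args(argv)


def main(argv=None):
    """Configure logging from the options, then run the (empty) step."""
    options = parse_options(argv)
    logging.basicConfig(
        level=getattr(logging, options.log_level),
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )
    logger = logging.getLogger("example_step")
    logger.info("Processing %s", options.input_catalog)
    return 0
```

Calling `main(["--input-catalog", "catalog.fits"])` would log a single INFO line and return 0. Having one shared implementation of this pattern keeps every executable in a large collaboration consistent.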

Projects (Elements Framework)

Distributed data processing

● 10+ SDCs involved

● Central metadata database

● Data-centric approach: software runs where the required data has been shipped

● In each SDC
    ● Distributed processing management tools
    ● Computing infrastructure for
        – processing
        – storage

[Diagram: a central Euclid Metadata Data Base (Euclid Archive System) connected to six SDC nodes and the SOC, each with processing and a local archive; the links are labelled data products, metadata updates and metadata queries]
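The data-centric idea can be sketched as a small placement rule: look up where the required products already reside and run the job at the site that needs the least data shipped to it. Everything below is hypothetical (the product names, the location table standing in for the metadata database, and the tie-breaking rule); it is not how the Euclid system actually assigns work.

```python
from collections import defaultdict

# HYPOTHETICAL stand-in for the central metadata database: product -> SDC.
DATA_LOCATION = {
    "vis_frame_001": "SDC-CH",
    "vis_frame_002": "SDC-CH",
    "nir_frame_001": "SDC-FR",
}


def choose_sdc(required_products, data_location):
    """Pick the SDC that already holds most of the required products."""
    counts = defaultdict(int)
    for product in required_products:
        counts[data_location[product]] += 1
    # Highest count wins; ties are broken alphabetically for determinism.
    return min(counts, key=lambda sdc: (-counts[sdc], sdc))


def products_to_ship(required_products, data_location, target_sdc):
    """List the products that still have to be copied to the target SDC."""
    return [p for p in required_products if data_location[p] != target_sdc]


needed = ["vis_frame_001", "vis_frame_002", "nir_frame_001"]
target = choose_sdc(needed, DATA_LOCATION)
print(target, products_to_ship(needed, DATA_LOCATION, target))
```

In this toy case the job goes to the site holding two of the three inputs, and only one product has to move.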

[Diagram: Euclid Archive System and Distributed Processing Infrastructure]

● Euclid Archive System: science archive, metadata storage
● SDC x, SDC y, SDC z, …: data storage (file, XML), computing infrastructure, Infrastructure Abstraction Layer (IAL)
● Processing control (Processing Order Definition)
● Software continuous integration and deployment (CernVM FS)
● Monitoring (Icinga)

Infrastructure Abstraction Layer

[Diagram. Component labels: SDC, Processing Order Definition, Metadata Data Base, IAL, IAL Host, Meta Scheduler, Pipeline Run Server, Submission Host (HPC), DRM, Queuing System, Compute Nodes, WorkSpace, Data Storage]

Annotations on the diagram:

● Polls Processing Orders
● Fetches inputs from EAS and prepares workspace
● Ingests outputs into EAS
● Creates and traverses data flow graph
● Submits and monitors HPC jobs
● Contains all inputs, outputs, intermediary data for pipeline runs
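The sequence described by these annotations (fetch inputs, prepare a workspace, traverse the data flow graph, ingest outputs) can be sketched in a few lines. This is a toy model under stated assumptions: the archive is a plain dictionary standing in for the EAS, tasks run in-process instead of being submitted as HPC jobs, and every name is hypothetical rather than taken from the IAL code.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def run_processing_order(order, archive, run_task):
    """Execute one processing order against a dictionary-based archive.

    order    -- dict with 'inputs', 'outputs' and 'graph' (task -> prerequisites)
    archive  -- dict standing in for the archive system
    run_task -- callable(task_name, workspace) returning the task's result
    """
    # 1. Fetch the inputs from the archive and prepare the workspace.
    workspace = {name: archive[name] for name in order["inputs"]}

    # 2. Traverse the data flow graph so each task runs after its prerequisites.
    for task in TopologicalSorter(order["graph"]).static_order():
        workspace[task] = run_task(task, workspace)

    # 3. Ingest the declared outputs back into the archive.
    for name in order["outputs"]:
        archive[name] = workspace[name]
    return workspace
```

A usage example with two chained tasks:

```python
archive = {"raw": 2}
order = {
    "inputs": ["raw"],
    "graph": {"calibrate": set(), "detect": {"calibrate"}},
    "outputs": ["detect"],
}


def run_task(task, workspace):
    if task == "calibrate":
        return workspace["raw"] * 10
    return workspace["calibrate"] + 1


run_processing_order(order, archive, run_task)
print(archive)
```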

Challenge-driven development

● Iterative development through the planning of a number of incremental integration tests

● Series of challenges for different aspects of the system
    ● weak lensing (GREAT)
    ● infrastructure
    ● “science”
    ● photometric redshifts

● Consolidation of the interfaces (Common Data Model)

Infrastructure Challenge 6

[Diagram: Euclid Archive System and distributed processing components]

● Euclid Archive System: science archive, metadata storage
● SDC x, SDC y, SDC z, …: data storage (file, XML), computing infrastructure, Infrastructure Abstraction Layer (IAL)
● Processing control (COORS) (Processing Orders)
● Software continuous integration and deployment (CernVM FS)
● Monitoring (Icinga)

Preliminary versions of (almost) all components involving almost all SDCs!

Science challenges 2 and 3

[Diagram: processing functional break down (Ground Station, MOC, SOC, OPS, LE1, SIM, VIS, NIR, SIR, EXT, MER, SPE, SHE, PHZ, LE3; Level 1, Level 2, Level 3, Level E, Level S) with the two challenges marked]

● Science 2 challenge (spring 2016)
● Science 3 challenge (spring 2017)

SDC-CH major tasks

● Develop and provide the Elements building and packaging framework to the collaboration

● Photometric redshift-related software development
    ● Phosphoros : template fitting algorithm implementation
    ● PHZ pipeline combining template fitting and machine learning algorithms

● Strong lens detection

● Contribution to algorithm exploration
    – (Paraficz et al. 2016 https://arxiv.org/abs/1605.04309)
    – (Tramacere et al. 2016 https://arxiv.org/abs/1609.06728)

● Development of a new (SExtractor) framework in C++
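For readers unfamiliar with template fitting, the sketch below shows the generic idea behind photometric redshift estimation: compare observed fluxes with model fluxes on a grid of templates and redshifts, and keep the combination with the lowest chi-square. It is a textbook-style toy with made-up numbers and hypothetical function names, not the Phosphoros implementation.

```python
def best_scale_and_chi2(observed, errors, model):
    """Return the analytic best-fit amplitude and chi-square for one model."""
    weighted_fm = sum(f * m / e ** 2 for f, e, m in zip(observed, errors, model))
    weighted_mm = sum(m * m / e ** 2 for e, m in zip(errors, model))
    scale = weighted_fm / weighted_mm
    chi2 = sum(((f - scale * m) / e) ** 2
               for f, e, m in zip(observed, errors, model))
    return scale, chi2


def fit_photometric_redshift(observed, errors, model_grid):
    """Scan a grid {(template, z): model fluxes} and return the best fit."""
    best = None
    for (template, z), model in model_grid.items():
        scale, chi2 = best_scale_and_chi2(observed, errors, model)
        if best is None or chi2 < best["chi2"]:
            best = {"template": template, "z": z, "scale": scale, "chi2": chi2}
    return best


# Toy grid: fluxes in three bands for two templates at two redshifts.
MODEL_GRID = {
    ("red", 0.5): [1.0, 2.0, 4.0],
    ("red", 1.0): [1.0, 3.0, 3.0],
    ("blue", 0.5): [3.0, 2.0, 1.0],
}

result = fit_photometric_redshift([2.0, 6.0, 6.0], [0.1, 0.1, 0.1], MODEL_GRID)
print(result["template"], result["z"])
```

In this toy case the observed fluxes are exactly twice the ("red", 1.0) model, so that grid point is selected. A real implementation would also return the full redshift probability distribution, not just the best grid point.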

Phosphoros challenge results

SExtractor++

● A new modular and extensible SExtractor framework

● For the astronomical and the Euclid communities

● Long-term maintenance and evolution perspectives

● Modern software design
    ● API based on interfaces
    ● Single responsibility principle
    ● Design patterns
    ● Boost plugin system for adding algorithm steps

● Collaboration between Emmanuel Bertin and the Euclid community
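The design ideas listed here (an interface-based API plus a plugin mechanism for adding algorithm steps) can be illustrated with a small Python analogue. SExtractor++ itself is written in C++ and uses a Boost-based plugin system; the class names, registry and steps below are hypothetical and only mirror the pattern.

```python
from abc import ABC, abstractmethod


class AlgorithmStep(ABC):
    """Interface that every processing step must implement."""

    @abstractmethod
    def process(self, source: dict) -> dict:
        """Return a copy of the source enriched with new measurements."""


# Registry mapping a step name to the class implementing it.
_REGISTRY = {}


def register_step(name):
    """Class decorator that makes a step available under the given name."""
    def decorator(cls):
        _REGISTRY[name] = cls
        return cls
    return decorator


@register_step("pixel_count")
class PixelCount(AlgorithmStep):
    def process(self, source):
        return {**source, "pixel_count": len(source["pixels"])}


@register_step("total_flux")
class TotalFlux(AlgorithmStep):
    def process(self, source):
        return {**source, "total_flux": sum(source["pixels"])}


def run_steps(step_names, source):
    """Apply the named steps in order, looking each one up in the registry."""
    for name in step_names:
        source = _REGISTRY[name]().process(source)
    return source


print(run_steps(["pixel_count", "total_flux"], {"pixels": [1.0, 2.0, 3.5]}))
```

Each step has a single responsibility, and a new measurement is added by registering one more class without touching the driver code.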

SExtractor++ status

● Framework ready

● Simplified aperture photometry: SExtractor comparison!

● Multi-frame model fitting

[Figure: side-by-side comparison of SExtractor 2.23.1 and SExtractor++]
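As a reminder of what aperture photometry computes in its simplest form, the sketch below sums the pixel values whose centres fall inside a circular aperture. It deliberately ignores background subtraction and partial-pixel weighting, and it is not the algorithm used by SExtractor or SExtractor++; the function name and test image are made up.

```python
def aperture_flux(image, x_center, y_center, radius):
    """Sum the pixels whose centres lie within radius of the given centre.

    image is a list of rows; pixel (x, y) is image[y][x].
    """
    total = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if (x - x_center) ** 2 + (y - y_center) ** 2 <= radius ** 2:
                total += value
    return total


# 5x5 image of ones with a bright central pixel.
IMAGE = [[1.0] * 5 for _ in range(5)]
IMAGE[2][2] = 10.0

print(aperture_flux(IMAGE, 2, 2, 1))
```

With a radius of one pixel the aperture contains the central pixel and its four direct neighbours, giving a flux of 14.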

Conclusions

● Euclid challenges: science goals, hardware development, algorithm determination, software development, etc.

● Challenge-driven development: best approach for building up software systems through large collaborations?

● Possible extra benefits for the astronomical community:
    ● The “Elements” building and packaging framework
    ● Part of the “Infrastructure Abstraction Layer” (IAL)
    ● Science tools, such as Phosphoros and SExtractor

Thanks for your attention!