Oct. 20th, 2016 Euclid data processing challenges 1
Euclid Consortium
The Euclid Challenges
Pierre Dubath
SDC-CH
Department of Astronomy
University of Geneva
IAU Symposium 325
Astroinformatics
Sorrento (Italy), October 20-24, 2016
List of Collaborators
Nikolaos Apostolakos, Andrea Bonchi, Andrey Belikov, Massimo Brescia, Peter Capak, Jean Coupon, Christophe Dabin, Hubert Degaudenzi, Shantanu Desai, Florian Dubath, Adriano Fontana, Sotiria Fotopoulou, Marco Frailis, Audrey Galametz, Catherine Grenet, John Hoar, Mark Holliman, Ben Hoyle, Olivier Ilbert, Martin Kuemmel, Clotilde Laigle, Giuseppe Longo, Henry Joy McCracken, Martin Melchior, Yannick Mellier, Joe Mohr, Nicolas Morisset, Stéphane Paltani, Roser Pello, Stefano Pilo, Gianluca Polenta, Maurice Poncet, Roberto Saglia, Mara Salvato, Marc Sauvage, Marc Schefer, Marco Scodeggio, Stella Seitz, Santiago Serrano, Marco Soldati, Andrea Tramacere, Rees Williams, Andrea Zacchei, etc.
Outline
1. Science and mission overview
2. Instruments and data analysis
3. Software development
4. Software integration and operation preparation
5. Swiss Science Data Center (SDC-CH) major tasks
● This presentation targets a non-Euclid audience
● Focus on data processing aspects of the Euclid mission
An Ever-Expanding Universe?
Nobel Prize in Physics 2011
Discovery of the accelerated expansion of the Universe through observations of distant supernovae
Perlmutter, Schmidt and Riess
The Euclid mission main goal
● What is the nature of dark matter and dark energy?
[Figure: composition of the Universe: 68% dark energy, 27% dark matter, 5% ordinary matter]
The Euclid mission
● ESA medium-class cosmology mission selected in 2011
● Soyuz launch from Kourou to L2 in 2020, followed by a 6-year mission
● Survey of 15,000 square degrees: optical and NIR images and NIR spectra → shape and distance measurements of billions of galaxies
● Constraints on cosmology models from different types of measurements (or probes):
  – Gravitational (strong and weak) lensing
  – Baryonic Acoustic Oscillations (BAO)
  – Integrated Sachs-Wolfe (ISW) effect (galaxy clusters)
  – Redshift-space distortions (Kaiser effect)
Weak lensing illustration
Masses bend light paths!
3D dark matter map, COSMOS field
NASA, ESA, R. Massey (California Institute of Technology).
Baryonic Acoustic Oscillation
Strong Lensing
Euclid will lead to the detection of very large numbers of strong lenses at cluster and galaxy scales
Beautiful images... but only Euclid legacy science!
The Euclid Spacecraft
1.2 m Korsch, silicon carbide primary mirror
… and ground photometry for PHZ!
Dark Energy Survey (DES), Kilo-Degree Survey (KiDS), LSST (?), Javalambre/Spain, Subaru/Japan (?), CFHT/Canada
Processing budget
Year                             2021  2022  2023  2024  2025  2026  2027
Storage (PB)                       15    30    50    60    75    90    90
Computing (kilo cores / year)     2.5     5   8.5    12    16    20    21
Numbers from Christophe Dabin @ tk1
Processing functional breakdown
[Diagram labels: Ground Station, MOC, SOC, OPS, LE1, SIM, VIS, NIR, SIR, EXT, MER, SPE, SHE, PHZ, LE3; data levels: Level 1, Level 2, Level 3, Level E, Level S]
● SIM: simulated data
● VIS: visible calibrated frames
● NIR: near IR calibrated frames
● SIR: calibrated 1-D spectra
● EXT: calibrated ground frames
● MER: catalog with consistent photometry and spectroscopy
● SPE: spectroscopic redshifts
● PHZ: photometric redshifts
● SHE: shape measurements
● LE3: high-level processing
Euclid data flow
Euclid Science Ground Segment (SGS) organization
Organisation Unit (OU) task: algorithm specification & validation
Science Data Center (SDC) task: software development and data processing
Software Development
● C++ and Python languages
● One reference platform
  – Linux from the Red Hat family (currently CentOS7)
  – Set of common libraries (EDEN)
● Software development on a virtual machine (LODEEN)
● RPM packaging
● XML-based common data model
● A common building and packaging framework
Elements framework
Elements is a CMake-based building and packaging framework (capitalizing on CERN expertise) featuring:
● a standard source code structure
● easy software building according to CMakeLists.txt instructions
● automated RPM packaging (make rpm)
● basic services, such as program option handling and logging
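To illustrate the two basic services named in the last bullet, here is a minimal Python sketch built only on the standard library (argparse and logging). It does not use the Elements API; the program name, option names and defaults are hypothetical.

```python
import argparse
import logging


def parse_options(argv):
    """Parse command-line options (hypothetical options, for illustration only)."""
    parser = argparse.ArgumentParser(prog="example_program")
    parser.add_argument("--input-catalog", required=True,
                        help="path to the input catalog")
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
                        help="verbosity of the log output")
    return parser.parse_args(argv)


def make_logger(name, level_name):
    """Return a logger that writes timestamped messages to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(getattr(logging, level_name))
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def main(argv):
    """Parse the options, set up logging and report what would be processed."""
    options = parse_options(argv)
    logger = make_logger("example_program", options.log_level)
    logger.info("Reading catalog %s", options.input_catalog)
    return options
```

Calling `main(["--input-catalog", "catalog.fits"])` logs one INFO line and returns the parsed options. A framework that provides these services centrally spares every program from repeating this setup.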
Projects (Elements Framework)
Distributed data processing
● 10+ SDCs involved
● Central metadata database
● Data-centric approach: software runs where the required data has been shipped
● In each SDC
  ● Distributed processing management tools
  ● Computing infrastructure for
    – processing
    – storage
[Diagram: the Euclid Metadata Data Base (Euclid Archive System) linked to six SDCs and the SOC, each with processing and a local archive; links labelled Data Products, Metadata Updates and Metadata Queries]
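The data-centric approach on this slide can be sketched as a lookup: given which SDC holds which data products, a job is sent to an SDC that already stores all of its inputs. The SDC names, product identifiers and tie-breaking rule below are hypothetical and do not describe the actual Euclid tooling.

```python
def choose_sdc(required_products, holdings):
    """Return the name of an SDC that already stores every required product.

    holdings maps an SDC name to the set of product identifiers in its local
    archive. Returns None when no single SDC holds all the inputs.
    """
    required = set(required_products)
    candidates = [sdc for sdc, products in holdings.items()
                  if required <= products]
    # Hypothetical tie-break: alphabetical order, to keep the result deterministic.
    return min(candidates) if candidates else None


holdings = {
    "SDC-x": {"vis_frame_001", "nir_frame_001"},
    "SDC-y": {"vis_frame_002"},
    "SDC-z": {"vis_frame_001", "nir_frame_001", "ext_frame_001"},
}

print(choose_sdc(["vis_frame_001", "ext_frame_001"], holdings))  # SDC-z
print(choose_sdc(["vis_frame_002", "nir_frame_001"], holdings))  # None
```

The second call returns None because no single SDC holds both inputs. In that case the missing product has to be shipped before the job can run.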
Distributed Processing Infrastructure
[Diagram labels: Euclid Archive System (Science Archive, Meta-Data Storage); SDC x, SDC y, SDC z, …; Data Storage (File, XML); Computing Infrastructure; Infrastructure Abstraction Layer (IAL); Processing Control (Processing Order Definition); Software Continuous Integration and Deployment (CernVM FS); Monitoring (Icinga)]
Infrastructure Abstraction Layer
● Creates and traverses the data flow graph; submits and monitors HPC jobs
● Polls Processing Orders; fetches inputs from EAS and prepares the workspace; ingests outputs into EAS
● Workspace: contains all inputs, outputs and intermediary data for pipeline runs
[Diagram labels: Meta Scheduler, Pipeline Run Server, IAL DRM, WorkSpace, Submission Host, HPC, IAL Host, Data Storage, Metadata Data Base, Queuing System, Compute Nodes, SDC, Processing Order Definition]
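Creating and traversing a data flow graph can be sketched with Python's standard `graphlib` module (Python 3.9+). This is not the IAL implementation. The dependencies between processing functions are assumed for illustration only, and `run_task` is a hypothetical stand-in for submitting an HPC job.

```python
from graphlib import TopologicalSorter


def run_pipeline(dependencies, run_task):
    """Run every task once all the tasks it depends on have finished.

    dependencies maps a task name to the set of tasks whose outputs it needs.
    run_task is called once per task; a real system would submit a job here.
    Returns the tasks in the order they were run.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    order = []
    while sorter.is_active():
        # Tasks returned together have no unmet dependencies,
        # so they could run in parallel.
        for task in sorted(sorter.get_ready()):
            run_task(task)
            order.append(task)
            sorter.done(task)
    return order


# Assumed dependencies between processing functions, for illustration only.
dependencies = {
    "MER": {"VIS", "NIR", "EXT"},
    "PHZ": {"MER"},
    "SHE": {"VIS", "MER"},
    "SPE": {"SIR"},
    "LE3": {"PHZ", "SHE", "SPE"},
}

order = run_pipeline(dependencies, run_task=lambda task: None)
print(order)
```

Every task appears in the printed order after all the tasks it depends on, with LE3 last.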
Challenge-driven development
● Iterative development through the planning of a number of incremental integration tests
● Series of challenges for different aspects of the system
  ● weak lensing (GREAT)
  ● infrastructure
  ● “science”
  ● photometric redshifts
● Consolidation of the interfaces (Common Data Model)
Infrastructure Challenge 6
[Diagram labels: Euclid Archive System (Science Archive, Meta-Data Storage); SDC x, SDC y, SDC z, …; Data Storage (File, XML); Computing Infrastructure; Infrastructure Abstraction Layer (IAL); Processing Control (COORS) (Processing Orders); Software Continuous Integration and Deployment (CernVM FS); Monitoring (Icinga)]
Preliminary versions of (almost) all components, involving almost all SDCs!
Science challenges 2 and 3
[Diagram: processing functional breakdown (Level 1, Level 2, Level 3, Level E, Level S; SIM, VIS, NIR, SIR, EXT, MER, SPE, SHE, PHZ, LE3) annotated with the scope of the two challenges]
Science 2 challenge (spring 2016)
Science 3 challenge (spring 2017)
SDC-CH major tasks
● Develop and provide the Elements building and packaging framework to the collaboration
● Photometric redshift-related software development
  ● Phosphoros: template fitting algorithm implementation
  ● PHZ pipeline combining template fitting and machine learning algorithms
● Strong lens detection
  ● Contribution to algorithm exploration
    – (Paraficz et al. 2016 https://arxiv.org/abs/1605.04309)
    – (Tramacere et al. 2016 https://arxiv.org/abs/1609.06728)
● Development of a new (SExtractor) framework in C++
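The idea behind template fitting for photometric redshifts can be sketched generically as follows: each template, shifted to each trial redshift, predicts fluxes in the observed filters, and the entry with the lowest chi-square after an optimal rescaling wins. This is a textbook chi-square fit, not the Phosphoros implementation. The template names, grid and flux values are made up.

```python
def fit_template(fluxes, errors, model):
    """Chi-square of the best-scaled model against the observed fluxes.

    The scale a that minimises chi2 = sum(((f - a*m) / s)**2) has the
    closed form a = sum(f*m/s**2) / sum(m*m/s**2).
    """
    num = sum(f * m / s**2 for f, m, s in zip(fluxes, model, errors))
    den = sum(m * m / s**2 for m, s in zip(model, errors))
    scale = num / den
    chi2 = sum(((f - scale * m) / s) ** 2
               for f, m, s in zip(fluxes, model, errors))
    return chi2, scale


def best_redshift(fluxes, errors, model_grid):
    """Return (redshift, template, chi2) of the best-fitting grid entry.

    model_grid maps (template_name, redshift) to the model fluxes in the
    same filters as the observation.
    """
    best = None
    for (template, z), model in model_grid.items():
        chi2, _ = fit_template(fluxes, errors, model)
        if best is None or chi2 < best[2]:
            best = (z, template, chi2)
    return best


# Toy grid with made-up fluxes in three filters (not real templates).
model_grid = {
    ("elliptical", 0.5): [1.0, 2.0, 4.0],
    ("elliptical", 1.0): [0.5, 1.5, 4.5],
    ("spiral", 0.5): [2.0, 2.5, 3.0],
    ("spiral", 1.0): [1.5, 2.5, 3.5],
}
observed = [2.0, 6.0, 18.0]  # exactly 4 x the ("elliptical", 1.0) model
errors = [0.1, 0.2, 0.3]

z, template, chi2 = best_redshift(observed, errors, model_grid)
print(z, template)
```

Because the observed fluxes are an exact multiple of one grid entry, that entry is recovered with a chi-square of essentially zero. Real data would give a chi-square distribution over the grid, from which a redshift probability can be derived.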
Phosphoros challenge results
SExtractor++
● A new modular and extensible SExtractor framework
● For the astronomical and the Euclid communities
● Long term maintenance and evolution perspectives
● Modern software design
  ● API based on interfaces
  ● Single responsibility principle
  ● Design patterns
  ● Boost plugin system for adding algorithm steps
● Collaboration between Emmanuel Bertin and the Euclid community
SExtractor++ status
● Framework ready
● Simplified aperture photometry: SExtractor comparison!
● Multi-frame model fitting
[Figure panels: SExtractor 2.23.1, SExtractor++]
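For readers unfamiliar with aperture photometry, the basic measurement is the sum of the background-subtracted pixel values inside a circle around the source. The sketch below is a deliberately simple version that ignores partial pixel coverage. It is not the SExtractor or SExtractor++ algorithm, and the function name and toy image are hypothetical.

```python
import math


def aperture_flux(image, x_center, y_center, radius, background=0.0):
    """Sum background-subtracted pixel values inside a circular aperture.

    image is a list of rows; a pixel is counted when its centre lies within
    radius of (x_center, y_center). Partial pixel coverage is ignored.
    """
    total = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if math.hypot(x - x_center, y - y_center) <= radius:
                total += value - background
    return total


# 5 x 5 toy image: constant background of 1 with one bright pixel in the middle.
image = [[1.0] * 5 for _ in range(5)]
image[2][2] = 11.0

print(aperture_flux(image, x_center=2, y_center=2, radius=1.0, background=1.0))  # 10.0
```

With a radius of one pixel the aperture covers the central pixel and its four direct neighbours, so only the source's excess over the background contributes to the result.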
Conclusions
● Euclid challenges: science goals, hardware development, algorithm determination, software development, etc.
● Challenge-driven development: best approach for building up software systems through large collaborations?
● Possible extra benefits for the astronomical community:
  ● The “Elements” building and packaging framework
  ● Part of the “Infrastructure Abstraction Layer” (IAL)
  ● Science tools, such as Phosphoros and SExtractor
Thanks for your attention!