Are we there yet? Experiences developing and commissioning the HPC System for ASKAP Telescope
CSIRO ASTRONOMY AND SPACE SCIENCE
Juan Carlos (JC) Guzman | Head of ATNF Software and Computing
Perth HPC Advisory Council Conference – 31 July – 1 August 2017
We acknowledge the Wajarri Yamatji people as the traditional owners of the Observatory site and the Noongar people as the traditional owners of the land where this meeting is being held
Outline
Overview of ASKAP
ASKAP Computing System history, challenges and future
Lessons learned
Australian SKA Pathfinder - overview
• 36-antenna multi-beam interferometer in a radio-quiet zone
• Frequency range: 700 MHz – 1.8 GHz, baselines from 23 m to 6 km
• Survey instrument – pushing wide instantaneous field of view
• 2nd generation phased-array feed (PAF) receiver + flexible beamformer
• 3-axis mount (whole antenna can rotate) – can fix or rotate beam pattern
• Automatic processing eventually – necessary for the full instrument
• Early science with 12 antennas started in October 2016
• Most reported science was with BETA (6-antenna array with Mk I PAF)
• 18 antennas have already been integrated into the array
Phased Array Feed – 188 single-pol receivers
Wide field of view
Murchison Radio Observatory (MRO)
• 126 km²
• 32 km roads and tracks
• 16000 km optic fibre
• >8000 fibres
• Control Building
• Power station
• Under construction
MRO power station
ASKAP – system architecture
[Figure: system architecture diagram. Recoverable labels: x36; combined data rate ~21 Tb/s; ~2.5 GB/s]
Indirect imaging of the sky
Synthesis telescopes measure correlations between received voltages for each pair of antennas
Three different types of images are required
Continuum image
• Very accurate image
• Need multiple iterations
• Hard to parallelize
Spectral line cube
• 16200 independent images
• Each at slightly different frequency
• Embarrassingly parallel task
• One iteration may be sufficient
Transient image
• Very coarse image
• Made every 5 seconds
We need to make images in near real time, ideally all three types in parallel
To the first order (narrow FOV), the measurement equation is a 2D Fourier Transform
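Written out, the narrow-field relation mentioned above is the standard two-dimensional Fourier relationship between sky brightness and measured visibilities. The symbols are conventional notation and are not taken from the slides: V is the visibility, I the sky brightness, (u, v) the baseline coordinates in wavelengths and (l, m) the direction cosines on the sky. Primary-beam and w-term effects are left out, which is what the narrow-FOV assumption amounts to.

```latex
V(u, v) = \iint I(l, m)\, e^{-2\pi i (u l + v m)} \,\mathrm{d}l \,\mathrm{d}m
```

Because the antennas sample V only at scattered (u, v) points, the image is recovered by an inverse transform of those samples rather than measured directly, hence "indirect" imaging.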
90% Computation Cost
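Gridding is the step that places each irregularly sampled visibility onto a regular uv grid so that the image can be formed with an FFT. The sketch below is a deliberately simplified illustration using nearest-neighbour gridding with NumPy. It is not the ASKAPsoft gridder, which is not shown in these slides, and the function names, the `cell` parameter and the test data are all hypothetical.

```python
import numpy as np

def grid_visibilities(u, v, vis, n_pix, cell):
    """Nearest-neighbour gridding onto an n_pix x n_pix uv grid.

    u, v : baseline coordinates in wavelengths
    vis  : complex visibilities, one per (u, v) sample
    cell : uv cell size in wavelengths (hypothetical parameter)
    """
    grid = np.zeros((n_pix, n_pix), dtype=complex)
    # Map each sample to its nearest cell, with (u, v) = (0, 0) at the grid centre.
    iu = np.rint(u / cell).astype(int) + n_pix // 2
    iv = np.rint(v / cell).astype(int) + n_pix // 2
    inside = (iu >= 0) & (iu < n_pix) & (iv >= 0) & (iv < n_pix)
    # np.add.at accumulates correctly when several samples hit the same cell.
    np.add.at(grid, (iv[inside], iu[inside]), vis[inside])
    return grid

def dirty_image(grid):
    """Inverse FFT of a centre-origin uv grid, returned as a centre-origin image."""
    return np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(grid))).real

# Made-up test data: a unit point source at the phase centre has vis = 1 everywhere.
rng = np.random.default_rng(0)
u = rng.uniform(-20, 20, 200)
v = rng.uniform(-20, 20, 200)
vis = np.ones(200, dtype=complex)

# Add the conjugate samples at (-u, -v) so the sky image is real.
grid = grid_visibilities(np.concatenate([u, -u]),
                         np.concatenate([v, -v]),
                         np.concatenate([vis, vis.conj()]),
                         n_pix=64, cell=1.0)
image = dirty_image(grid)
peak = np.unravel_index(np.argmax(image), image.shape)
```

Production gridders convolve each sample with a kernel spanning many cells instead of dropping it into one cell, so the per-sample work is much larger than in this sketch.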
ASKAP Key Computing Requirements
• 90% Computational Cost in gridding/degridding
• https://www.skatelescope.org/uploaded/59116_132_Memo_Humphreys.pdf
• Developed stand-alone benchmarking gridding code for testing on multiple platforms
• 10,000 cores (80% efficiency), 4 GB/core, 200 TFLOPs Peak
• Data Ingest from Correlator ~2.8 GB/s = ~10 TB/h (Raw Visibilities)
• Processing of raw visibilities (calibration & imaging) needs to keep up
• Cannot afford to keep raw visibilities
• Multiple science products after observations ~5 PB/year
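The figures above can be checked with a few lines of arithmetic. This sketch converts the quoted ingest rate into hourly and yearly volumes and compares the result with the ~5 PB/year of science products. Decimal units (1 TB = 10^12 bytes) and round-the-clock observing are assumptions made here for illustration, not statements from the slides.

```python
GB, TB, PB = 1e9, 1e12, 1e15   # decimal units (assumption)

ingest_rate = 2.8 * GB          # bytes per second, from the slide
science_products_per_year = 5   # PB/year, from the slide

tb_per_hour = ingest_rate * 3600 / TB
# Upper bound: assumes the telescope observes continuously all year.
raw_pb_per_year = ingest_rate * 3600 * 24 * 365 / PB

total_memory_tb = 10_000 * 4 * GB / TB   # 10,000 cores at 4 GB/core

print(f"Ingest: {tb_per_hour:.2f} TB/h")
print(f"Raw visibilities if kept: {raw_pb_per_year:.1f} PB/year")
print(f"Ratio raw/products: {raw_pb_per_year / science_products_per_year:.1f}x")
print(f"Aggregate memory: {total_memory_tb:.0f} TB")
```

Under these assumptions the raw visibilities would be well over ten times the volume of the archived science products, which is consistent with the statement that they cannot be kept.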
The Pawsey High Performance Computing Centre for SKA Science
• AUD$80M super-computing centre
• 25% resources to support operational requirements of storage and processing of data from ASKAP and MWA
• Construction completed April 2013
ASKAP Central Processor @ Pawsey Centre
Ingest cluster
• 16 nodes, 2 sockets per node
• 8-core CPUs, 64 GB of RAM per node
Central Processor (Galaxy)
472 x Cray XC30 Compute Nodes
• 200 TFlop/s Peak
• 64 GB of RAM per node
• 2 sockets per node, 10 cores each
Shared storage
Cray Sonexion Lustre Storage
• 1.3 PB usable
• 480 x 4 TB Disk Drives
• Peak I/O performance: 30 Gb/s
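As a rough cross-check, the Galaxy figures above can be turned into totals and set against the 10,000-core, 4 GB/core requirement quoted earlier. The sketch only multiplies out the slide's numbers. It reads the per-node RAM as gigabytes and uses decimal units, both of which are assumptions.

```python
nodes = 472
sockets_per_node = 2
cores_per_socket = 10
ram_per_node_gb = 64        # read as gigabytes (assumption)
peak_tflops = 200

drives = 480
drive_tb = 4
usable_pb = 1.3

total_cores = nodes * sockets_per_node * cores_per_socket
ram_per_core_gb = ram_per_node_gb / (sockets_per_node * cores_per_socket)
total_ram_tb = nodes * ram_per_node_gb / 1000
peak_gflops_per_core = peak_tflops * 1000 / total_cores

raw_pb = drives * drive_tb / 1000
usable_fraction = usable_pb / raw_pb

print(f"Cores: {total_cores} (requirement: 10,000)")
print(f"RAM per core: {ram_per_core_gb:.1f} GB (requirement: 4 GB)")
print(f"Total RAM: {total_ram_tb:.1f} TB")
print(f"Peak per core: {peak_gflops_per_core:.1f} GFLOP/s")
print(f"Raw disk: {raw_pb:.2f} PB, usable fraction: {usable_fraction:.0%}")
```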
ASKAPsoft
• In development since 2007
• Extensive re-use of core libraries
• Re-written Synthesis (parallel) code C++/MPI
• Assumptions
  • Instrument stable (relatively easy to calibrate)
  • Good global sky model
  • Imaging model adequate
• Automated calibration and imaging (pipeline)
• ASKAP is one of the pathfinders in this domain (streaming + batch)
• Treat processing software as a part of the telescope
• Requires paradigm shift in the science community
• Commissioning requires different things to the full telescope
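The spectral-line case is embarrassingly parallel because each channel image is independent, so the channels can simply be divided among workers. The sketch below shows one way to split channels into near-equal contiguous blocks. It is an illustration only and not the distribution scheme ASKAPsoft uses, which the slides do not describe. The function name is hypothetical, and pairing the 16200 channel images with 472 workers (one per Galaxy node) is an arbitrary choice for the example.

```python
def partition_channels(n_channels, n_workers):
    """Split channel indices into n_workers contiguous, near-equal ranges."""
    base, extra = divmod(n_channels, n_workers)
    ranges = []
    start = 0
    for rank in range(n_workers):
        # The first `extra` workers take one additional channel each.
        size = base + (1 if rank < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges

# Hypothetical pairing: 16200 channel images over 472 workers.
assignments = partition_channels(16200, 472)
sizes = [len(r) for r in assignments]
print(min(sizes), max(sizes))
```

The continuum case does not split this cleanly, since all workers contribute to one image and must combine results on every iteration.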
[Figure: ASKAP science processing data-flow diagram. Recoverable labels: Ingest Pipeline; Calibration Pipeline; Small-N (e.g. Continuum) Imager Pipeline; Large-N (e.g. Spectral Line) Imager Pipeline; Transient Detector Pipeline; UV Data, 16416 channels (18.5 kHz); UV Data, 304 channels (1 MHz); Calibration Solution, ~30 channels (10 MHz); Imager (cimager); ccalibrator; Transient Imager (cfimager); Source Finder/Identifier; Source Catalog; Transient Finder/Identifier; Transient Detections; Image Cube; Images; Services: Calibration Data Service, Sky Model Service, Light Curve Service]
Source: ASKAP Science Processing, ASKAP-SW-0020, Version 2.0, 20/12/2011, Project: ASKAP. Prepared by: Tim Cornwell, Ben Humphreys, Emil Lenc, Maxim Voronkov, Matthew Whiting. Reviewed by: Ilana Feain. Review reference: Redmine issue 3280. Approved by: Ilana Feain, 20/12/2011. Keywords: computing, science, processing
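The channel counts on the diagram are consistent with averaging 54 adjacent fine channels into one coarse channel: 16416 / 304 = 54, and 54 x 18.5 kHz is about 1 MHz. The NumPy sketch below shows that kind of frequency averaging. Whether the real pipeline averages exactly this way is an assumption, and the function name and array layout (channels on the last axis) are hypothetical.

```python
import numpy as np

def average_channels(vis, factor):
    """Average groups of `factor` adjacent channels along the last axis."""
    n_chan = vis.shape[-1]
    if n_chan % factor != 0:
        raise ValueError("channel count must be a multiple of the averaging factor")
    grouped = vis.reshape(vis.shape[:-1] + (n_chan // factor, factor))
    return grouped.mean(axis=-1)

FINE_CHANNELS = 16416      # from the diagram (18.5 kHz each)
COARSE_CHANNELS = 304      # from the diagram (1 MHz each)
factor = FINE_CHANNELS // COARSE_CHANNELS
coarse_width_khz = factor * 18.5

# Made-up data: 3 rows stand in for baselines; real data would have many more.
fine = np.ones((3, FINE_CHANNELS), dtype=complex)
coarse = average_channels(fine, factor)
print(factor, coarse.shape, coarse_width_khz)
```

Averaging cuts the data volume of the continuum path by the same factor, which is one reason the coarse and fine streams can be handled by different pipelines.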
ASKAPsoft for Commissioning & Early Science
• Smaller datasets!
  • 1 TB/hr (ASKAP-12) vs 10 TB/hr (ASKAP-36)
  • Larger natural resolution (maximum baseline = 2.18 km)
• Able to do manual processing – still hard (many beams, large cubes), but tractable
  • Processing team will run pipelines manually upon completion of observation
  • Needed to understand and learn about the instrument!!
• Some features not available
  • Processing is not automated
  • No Sky Model available, nor calibration service applied in ingest
  • Transient pipeline not yet developed
Results: ASKAPsoft: First 36 beam image
Image credit: Wasim Raja
• Continuum image with 9 antennas at 939.5 MHz
• Processing resembles an early-science experiment
• Each beam calibrated separately
• Individual deconvolution of different beams
• Only ASKAPsoft used
Results: NGC 7232 WALLABY Early Science
Credit: Juan Madrid – 14 Sep 2016
ASKAP Computing Project
• Team of 7 people distributed between Perth & Sydney
• External Reviews: Preliminary Design (2009), Critical Design (2010) and Production Readiness (2016)
• Iterative software development process with ~2-month cycles
• Continuous Integration Tool (Jenkins)
• Confluence & JIRA
• Subversion, soon to be moved to git
Issues
• 1.3 PB fast storage (Lustre filesystem) aka /scratch2
  • Multiple users doing manual processing needed during commissioning and Early Science
  • Shared with MWA users
  • Shortage of space and non-deterministic performance affected the data ingest software (ingest pipeline)
  • Underestimated the scratch space needed by the Early Science program
• New 1.9 PB filesystem in May 2017
  • Procured by Pawsey
  • 1 PB dedicated to ASKAP real-time and 0.9 PB to MWA
• Still have a shortage of 0.5 – 1 PB to support the Early Science program, depending on the fate of /scratch2
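To put the storage pressure in perspective, the sketch below estimates how long each filesystem would take to fill at the raw data rates quoted earlier in the talk (1 TB/hr for ASKAP-12, 10 TB/hr for ASKAP-36). It assumes continuous observing, no deletion and no other users, so the results are illustrative lower bounds on lifetime rather than operational figures. The function name is hypothetical.

```python
def hours_to_fill(capacity_pb, rate_tb_per_hour):
    """Hours until a filesystem of capacity_pb fills at a constant write rate."""
    return capacity_pb * 1000 / rate_tb_per_hour

filesystems = {
    "/scratch2 (1.3 PB, shared)": 1.3,
    "new filesystem, ASKAP share (1 PB)": 1.0,
}
rates = {"ASKAP-12": 1.0, "ASKAP-36": 10.0}   # TB/hr, from the earlier slide

for fs_name, capacity_pb in filesystems.items():
    for array, rate in rates.items():
        hours = hours_to_fill(capacity_pb, rate)
        print(f"{fs_name}, {array}: {hours:.0f} h ({hours / 24:.1f} days)")
```

Even with the whole filesystem available, the full-array rate leaves only days of headroom under these assumptions, which is why processing has to keep pace with ingest.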
Issues
• Need stable instrument to validate our assumptions
  • PAF beams stable (relatively easy to calibrate)
  • Good Global Sky Model (continuum)
  • Imaging performance adequate (high dynamic range)
• Early Science and Commissioning are a different use case from the full (automated) pipeline -> Scope Creep
• Under-estimated effort on software integration, verification and support
• Sharing resources with ASKAP Commissioning and SKA pre-construction
Next steps
• Software Development for basic modes for full ASKAP
  • Scaling testing and debugging
  • Real-time services development and integration (calibration)
  • Automated continuum and spectral line pipelines
• Additional Science Pipelines
  • Full Polarisation Calibration
  • “Postage Stamps” – small regions (10" spatial resolution)
  • Transient and Zoom-mode pipeline
• Upgrade of the Galaxy platform in 12 – 24 months (TBD)
  • Testing, profiling in Athena (benchmarking & data challenges)
  • Evaluating GPU code
  • Updating ASKAP Computing Requirements
Next steps – Towards SKA1 in Australia
• SKA1_LOW ~100 times larger than ASKAP
• Joint ICRAR/CSIRO SKA Science Data Processing Project (named Rialto)
• Continue our involvement in SDP consortium towards CDR
• Next generation of calibration and imaging processing software as a prototype for SKA1_LOW, ASKAP and MWA
• Re-use of ASKAPsoft and DALiuGE Execution Framework
Lessons learned (for SKA1)
• ASKAP operations model does not follow the traditional HPC (batch) user/support model
• Build a strong relationship with service providers: Service Agreements, co-location
• Dedicated resources at all levels for Radio Astronomy: People, Software, Hardware
• Commissioning of telescopes takes a long time and significant resources, and is different to full operations of the telescope
• Support for the transition period was underestimated
• Isolate fast shared storage (Lustre filesystem) from the “traditional” HPC user model and include more storage if you can
Are we there yet?
ASKAPsoft is already working!
Still lots of work to do, many challenges ahead and more to learn!
When is software really finished? ... Never?
CSIRO Astronomy and Space Science
Juan Carlos Guzman
Head of ATNF Software and Computing
t +61 8 6436 8569
E [email protected]
w www.csiro.au/cass
Thank you