HPC and Exascale for the Square Kilometer Array Telescope
www.skatelescope.org
Bill Boas, Cray, Business Development, SKA
[email protected] 510.375.8840
HPC Advisory Council, Stanford




Table of Contents – Exascale and SKA

● Overall Project Summary

● The Technology Opportunity

● Funding, Governance, Timeline and Structure

● The Countries participating

● Board of Directors

● Project Structure Pre-Construction Phase – SKA1

● Self-Funded Consortia under contract to SKAO

● Consortia's Deliverables in 2016

● Acquisition and Construction SKA1 2017-2020

● SKA2 Scale Up and Schedule

● RARIC in US

● Known IBM Activities

Cray Inc. – Jan 2014 2


Overall Project Summary

● SKA is a very large, global, exascale radio astronomy observatory, a decade in the making with decades more ahead: a 24/7 enterprise conceived and driven by astronomers

● 2/3 installed in Africa, 1/3 installed in Australia, Project Office in UK, 12 member countries now, USA backed out in 2010

● Three Phases going forward from now
  ● 2013-16 SKA1 (10%) Pre-Construction: requirements, architecture design, specification; 12 Consortia awarded this phase
  ● 2017-18 Issue tenders, award and construct SKA1, including precursors
  ● 2019-2021 Design and specify SKA2 (90%); SKA1 operational and for 50 years thereafter
  ● 2022-24 Issue SKA2 tenders, award, construct, integrate into operations

● Budgets in Euros
  ● Pre-Construction $90M in-kind by members, 5M for computing prototypes in Architecture Lab at Cambridge University
  ● SKA1 650M cap from Project Office; SKA2 ~5000M

Cray Inc. – March 2013


Technology Opportunity – Architect, Design and Integrate Real-Time Data Handling, HPC and Big Data


# 2013 estimate by SKA South Africa

| | MeerKAT Pre-Cursor 2014-15 | SKA Phase 1 2017-19 | SKA Phase 2 Est. 2020-24 |
|---|---|---|---|
| Data into CSP | 2 Tbps | 50 Tbps | up to 5 Pbps |
| Data into SDP | 0.4 Tbps | 20 Tbps | up to 500 Tbps |
| Into Storage | 35 Gbps | 300+ Gbps | up to 2 Tbps |
| Computing load | 200 TFlops | 30+ PFlops | 3+ EFlops |
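To give a feel for the "Into Storage" row, the sketch below converts the sustained rates into daily volumes. It is only back-of-envelope arithmetic under stated assumptions: the rates are taken at face value (300 Gbps as a lower bound, 2 Tbps as an upper bound), the link is assumed to run flat out for 24 hours, and the function and dictionary names are illustrative, not part of any SKA software.

```python
SECONDS_PER_DAY = 86_400

# "Into Storage" rates from the estimate table, in bits per second.
STORAGE_RATE_BPS = {
    "MeerKAT Pre-Cursor": 35e9,  # 35 Gbps
    "SKA Phase 1": 300e9,        # 300+ Gbps, treated as a lower bound
    "SKA Phase 2": 2e12,         # up to 2 Tbps, treated as an upper bound
}


def bytes_per_day(bits_per_second):
    """Bytes written in one day at a constant bit rate."""
    return bits_per_second / 8 * SECONDS_PER_DAY


def daily_volumes_tb():
    """Daily volume per phase in terabytes (1 TB = 1e12 bytes)."""
    return {
        phase: bytes_per_day(rate) / 1e12
        for phase, rate in STORAGE_RATE_BPS.items()
    }


if __name__ == "__main__":
    for phase, terabytes in daily_volumes_tb().items():
        print(f"{phase}: {terabytes:,.0f} TB/day")
```

Under these assumptions the three phases come to roughly 378 TB, 3.24 PB and 21.6 PB per day.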

[Diagram: A No-Stop* Data Streaming, Analysis, Storage and Distribution Architecture]

Block labels: Incoming Signals from Dishes and Arrays; Switch; Correlator; Beamformer; Switch; Science Processor; Science Archive; Researchers Worldwide

Stage captions: Analyse Signals to extract Data from Noise; Process Data to Create Visibilities; Archive Visibilities for Distribution; Open Skies, Merit Based Distribution To Researchers

SKA: One observatory in South Africa – 2/3; One observatory in Australia – 1/3

* - No-Stop means the data never stops incoming; it must be either handled or dropped in the bit bucket

* Does not mean “Non-Stop”: h/w, s/w fail-over and silent error detection are not required

# - more information in further slides.
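The "No-Stop" footnote can be illustrated with a small sketch: a bounded buffer that never blocks the producer and simply counts what it has to discard. This is a toy model under stated assumptions, not the SKA design; `NoStopBuffer`, `simulate` and the capacity and consumption parameters are hypothetical names chosen for illustration.

```python
from collections import deque


class NoStopBuffer:
    """Bounded buffer: when full, new samples are dropped rather than blocking."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.dropped = 0

    def offer(self, sample):
        """Accept a sample if there is room; otherwise send it to the bit bucket."""
        if len(self.queue) >= self.capacity:
            self.dropped += 1
            return False
        self.queue.append(sample)
        return True

    def take(self):
        """Remove and return the oldest sample, or None if the buffer is empty."""
        return self.queue.popleft() if self.queue else None


def simulate(n_samples, capacity, consume_every):
    """One sample arrives per tick; one is consumed every `consume_every` ticks."""
    buf = NoStopBuffer(capacity)
    handled = 0
    for tick in range(n_samples):
        buf.offer(tick)
        if tick % consume_every == 0 and buf.take() is not None:
            handled += 1
    return handled, buf.dropped, len(buf.queue)


if __name__ == "__main__":
    handled, dropped, queued = simulate(n_samples=100, capacity=4, consume_every=2)
    print(f"handled={handled} dropped={dropped} still queued={queued}")
```

Every sample ends up handled, still queued, or dropped, so the processing chain has to be sized for the sustained input rate; buffering only absorbs short bursts.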


SKA Organization and Governance Overview


Structure Diagram and Current Funding


(in kind) means member countries have funded their institutions, companies and individuals within them (listed further in slides) to work on the SKA Pre-Construction Phase


SKA Governance

● SKA Board of Directors made siting decision in May 2012:
  ● SKA1-low – Australia
  ● SKA1-survey – Australia
  ● SKA1-mid – South Africa

● SKA Office is a UK Company Limited by Guarantee, i.e. non-profit
  ● Expedient solution to enable SKA project to proceed; long-term governance structure under review

● SKA Board has set a cost cap for SKA1 – 650 million Euro
  ● Imposes discipline on the design process
  ● Each Consortium given cost for Pre-Construction Phase

● SKAO has Element Engineers in each Consortium
  ● Role is to guide the Work Packages of each Consortium

● Board Members coordinate with each country's government
  ● Seeking construction funding

● Design guided by scientific and engineering assessments
  ● Re-baselining in ~1 year
  ● To incorporate precursors – MeerKAT in South Africa, ASKAP in Australia
  ● Re-use as much existing infrastructure as possible in SKA1
  ● Pathfinder – LOFAR in Europe, maybe?


The Countries participating in SKA in 2014

● Australia (DIISRTE)

● Canada (NRC-Herzberg)

● China (MOST)

● Germany (BMBF)

● Italy (INAF)

● Netherlands (NWO)

● New Zealand (MED)

● South Africa (DST)

● Sweden (Chalmers)

● UK (STFC)

● India (Tata/DAE) – anticipate joining


The Project Structure

● Led by SKA Office
  ● Management
  ● Science
  ● System Design and System Engineering
  ● Maintenance & Support and Operations

● Carried out by Work Package Consortia
  ● Dish Array
  ● Aperture Arrays
  ● Signal and Data Transport (including synchronization and timing)
  ● Central Signal Processor
  ● Science Data Processor
  ● Telescope Manager
  ● Infrastructure, including power, etc.
  ● Assembly, Integration and Verification

● Advanced Instrumentation Programs
  ● Mid Frequency Aperture Array
  ● Wide Band Single Pixel Feeds


Pre-Construction Design and Engineering Structure

● Design of SKA to be by multinational global consortia

● Consortia as contractors to the central office

● Consortia leaders chosen on merit and peer acceptance

● SKA Office holds the design authority for the project.

● SKA Office will run system engineering, receive and review designs from consortia, monitor progress, analyze, assess merit and approve

● SKA Office issued a *baseline conceptual design to serve as starting point for design, based on previous work and CoDRs.

● 10 consortia formed to undertake the design.


* http://www.skatelescope.org/wp-content/uploads/2012/07/SKA-TEL-SKO-DD-001-1_BaselineDesign1.pdf


Project Timeline


Pre-Construction Phase Consortia and Leaders

● Dish Array – Mark McKinnon, CSIRO, Australia

● Low Frequency Aperture Arrays - Jan Geralt Bij de Vaate, ASTRON, Netherlands

● Mid Frequency Aperture Arrays - Jan Geralt Bij de Vaate, ASTRON, Netherlands

● Signal and Data Transport – Keith Grainge, Univ. Manchester, UK

● Central Signal Processor – David Stephens, MDA/NRC, Canada

● Science Data Processor – Paul Alexander, Univ. Cambridge UK

● Telescope Manager – Yashwant Gupta, NCRA, India

● Infrastructure, including power, etc. -

● Assembly, Integration and Verification – Richard Lord, SKA South Africa


DISH Consortium Members

● Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia

● RPC Technologies, Australia

● National Research Council, Canada

● Joint Laboratory for Radio Astronomy Technology (JLRAT), China

● Max Planck Institute for Radio Astronomy (MPIfR), Germany

● Vertex Antennentechnik, Germany

● IAF Fraunhofer, Germany

● National Institute of Astrophysics (INAF), Italy

● European Industrial Engineering (EIE), Italy

● Società Aerospaziale Mediterranea (SAM), Italy

● SKA South Africa, South Africa

● EM Software and Systems (EMSS), South Africa

● Spain University Group, Spain

● Chalmers University/Onsala Space Observatory, Sweden

● Omnisys Instruments AB, Sweden


Initial technical Solution

http://www.skatelescope.org/wp-content/uploads/2013/09/SKA-TEL_DSH_MGT-CSIRO-TS-004-1_DishTechSol.pdf


SKA Dishes


The Consortium is responsible for the design and verification of the antenna structure, optics, feed suites, receivers, and all supporting systems and infrastructure for SKA1-mid and SKA1-survey.

The Consortium is sub-divided into work elements, summarised below. The “optics” are how the dishes are described, and whilst the tolerances are not as tight as for their optical counterparts, they still have to be built to a level of precision unsurpassed in the field of radio astronomy.

The main challenge is the mass production of several thousand 15 m wide telescopes, all with identical performance characteristics, all built with new design ideas, and built to last and tolerate the harsh conditions of the deserts in which they will operate. Combine with that the overriding element of cost, and getting the very best price to performance ratio, and the dish element of the SKA is a formidable technical and engineering challenge.

The task of the Dish Structure work element is to deliver the construction-ready design for the structure element of the SKA1-mid and SKA1-survey dishes. Three prototype antennas are being built within the Consortium: DVA-1 in Canada, DVA-C in China, and MeerKAT-1 in South Africa.


Status of a Candidate Dish

● The Dish Verification Antenna (DVA-1)

● The DVA-1 project in Canada is progressing on many fronts. The foundation is complete and now sits beneath a 3 m pile of regolith to post-load the soil. Trenching for power and data is also complete.

● To improve surface accuracy, the primary and secondary molds have been faired. Results on the secondary are extremely good, with an error of ~0.1 mm rms from the design shape.

● Measurements of the primary are underway and are expected to be < 0.5 mm rms. Fabrication of the secondary reflector will begin in mid-April, followed by a large-scale infusion test panel for the primary reflector. Layup and infusion of the primary reflector should be complete by July.

● Steady progress is being made on the telescope pedestal by Matt Fleming’s team at Minex Engineering in California. With the major pieces complete, work is focused on integrating parts and measuring performance.

● Other subcontractors such as Profile Composites, FormaShape, and Vectorworks Marine are respectively fabricating sub-components such as carbon feed legs, composite backing pieces, and dish rim connectors. Integration of DVA-1 assemblies will begin in early summer, with testing expected to begin in the fall.


Low Frequency Aperture Arrays

● International Centre for Radio Astronomy Research (ICRAR), Australia

● Key Lab of Aperture Array and Space Application (KLAASA), China

● German Long Wave Consortium (GLOW), Germany

● National Institute for Astrophysics (INAF), Italy

● University of Malta, Malta

● Netherlands Institute for Radio Astronomy (ASTRON), The Netherlands

● Joint Institute for VLBI in Europe (JIVE), The Netherlands

● University of Cambridge, UK

● University of Manchester, UK

● University of Oxford, UK

● Massachusetts Institute of Technology (MIT), USA


Mid Aperture Array Consortium Members

● Key Lab of Aperture Array and Space Application (KLAASA), China

● University of Bordeaux, France

● Paris/Nançay Observatory, France

● University of Malta, Malta

● Netherlands Institute for Radio Astronomy (ASTRON), Netherlands

● Instituto de Telecomunicações (IT), Portugal

● SKA South Africa, South Africa

● University of Cambridge, UK

● University of Manchester, UK

● University of Oxford, UK


Aperture Arrays


• An aperture array is a large number of small, fixed antenna elements coupled to appropriate receiver systems which can be arranged in a regular or random pattern on the ground.

• A signal “beam” is formed and steered by combining all the received signals after appropriate time delays have been introduced to align the phases of the signals coming from a particular direction.

• Innovative, efficient and low cost, aperture array antennas provide a large field of view and are capable of observing more than one part of the sky at once.

• By simultaneously using different sets of timing delays, this “beam forming” can be repeated many times to create multiple independent beams, yielding an enormous total field of view.

• The ability to configure numerous beams will permit the system to look at multiple regions of the sky simultaneously, massively increasing the telescope survey speed.

• The number of useful beams produced, or total field of view, is limited by the available computing and communications capabilities.

• The SKA from the outset is challenging academia, industry and technologies with concepts and designs that, at this time, have not been developed and do not exist.

[Figures: Low Frequency in SKA1; Mid Frequency added in SKA2; Concept of Beamforming]
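The delay-and-sum idea described in the bullets above can be sketched in a few lines. This is a simplified narrowband model under stated assumptions: a one-dimensional line of elements, a single unit-amplitude plane wave, ideal receivers, and time delays applied as phase rotations at one frequency. The function names and the 16-element, 1 m spacing, 150 MHz example are illustrative choices, not SKA parameters.

```python
import cmath
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def geometric_delays(positions_m, angle_rad):
    """Arrival delay (s) at each element for a plane wave from angle_rad off zenith."""
    return [x * math.sin(angle_rad) / SPEED_OF_LIGHT for x in positions_m]


def beam_power(positions_m, freq_hz, source_angle_rad, look_angle_rad):
    """Normalized power of a delay-and-sum beam steered to look_angle_rad.

    The source delays model what arrives; the applied delays are the
    compensation chosen by the beamformer. Power is 1.0 when they match.
    """
    arrival = geometric_delays(positions_m, source_angle_rad)
    applied = geometric_delays(positions_m, look_angle_rad)
    total = sum(
        cmath.exp(-2j * math.pi * freq_hz * (t_arr - t_app))
        for t_arr, t_app in zip(arrival, applied)
    )
    return abs(total / len(positions_m)) ** 2


def multi_beam(positions_m, freq_hz, source_angle_rad, look_angles_rad):
    """Form several beams from the same signals by using different delay sets."""
    return [
        beam_power(positions_m, freq_hz, source_angle_rad, look)
        for look in look_angles_rad
    ]


if __name__ == "__main__":
    positions = [float(i) for i in range(16)]  # 16 elements, 1 m apart
    looks = [0.0, 0.25, 0.5]                   # beam directions in radians
    for look, power in zip(looks, multi_beam(positions, 150e6, 0.0, looks)):
        print(f"beam at {look:.2f} rad: relative power {power:.4f}")
```

Each extra beam costs another pass over all element signals, which is why the number of useful beams is bounded by the available computing and communications capacity.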


Signal and Data Transport Consortium Members

● Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia

● Australia Academic and Research Network (AARNet), Australia

● Tsinghua University/ Peking University, China

● National Centre for Radio Astrophysics (NCRA) / Tata Consulting, India

● Netherlands Institute for Radio Astronomy (ASTRON), The Netherlands

● Joint Institute for VLBI in Europe (JIVE), The Netherlands

● Instituto de Telecomunicações (IT), Portugal

● SKA South Africa, Nelson Mandela Metropolitan University (NMMU), South Africa

● University of Granada, Spain

● University of Manchester, UK

● National Physical Laboratory (NPL), UK

● DANTE, UK


SADT Overview


Central Signal Processor Consortium

Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia

International Centre for Radio Astronomy Research (ICRAR), Australia

Swinburne University of Technology, Australia

CISCO, Australia

National Research Council of Canada (NRC), Canada

Canadian Institute of Theoretical Astrophysics (CITA), Canada

MDA Systems Ltd, Canada

Key Lab of Aperture Array and Space Application (KLAASA), China

Max Planck Institute for Radio Astronomy (MPIfR), Germany

National Centre for Radio Astrophysics (NCRA), India

National Institute for Astrophysics (INAF), Italy

SELEX Electronic Systems, Italy

University of Malta, Malta

Netherlands Institute for Radio Astronomy (ASTRON), The Netherlands

Joint Institute for VLBI in Europe (JIVE), The Netherlands

Netherlands eScience Center (NLeSC), The Netherlands

AUT University, New Zealand

Massey University, New Zealand

University of Auckland, New Zealand

Compucon New Zealand, New Zealand

Open Parallel Ltd, New Zealand

SKA South Africa, South Africa

Reutech Radar Systems (A Division of Reutech Limited), South Africa

Ingeniería de Sistemas para la Defensa de España (ISDEFE), Spain

Universidad Politécnica de Madrid (UPM), Spain

IBM Zurich, Switzerland

Science and Technology Facilities Council (STFC), UK

University of Manchester, UK

University of Oxford, UK

Adaptative Array Systems Limited, UK

NVIDIA, USA

NASA JPL, USA


Correlators Approaches Summary


| Sub Element | Lead Institute or Person | Approaches | Technologies |
|---|---|---|---|
| LOW-AA correlator | Oxford / Zarb-Adami | “Hardware-based”, Custom PowerMX, Uniboard-2, or Redback, or other TBD. | Primarily FPGAs. |
| LOW-AA correlator | Curtin / Steve Ord | “Software-based” programming methods. Results could impact Stage 2 investigations in all correlator sub-elements | COTS NVIDIA GPUs as a starting point, using all COTS equipment. (SMART research) |
| MID-DISH correlator/BF | NRC+SKA-SA / Carlson+Kapp | PowerMX custom platform, with standards generation for use/contribution elsewhere. | FPGAs are the baseline. Will consider ASICs and SMART research |
| SUR-DISH correlator | NZ Alliance / Ensor | “Hardware-based”. Multi-facetted; will consider PowerMX, Redback, and others…whatever works best. | FPGAs, multi-core CPUs, ASICs, possibly mixed-and-matched on PowerMX boards or other platforms. |
| PSS Engine | UManchester / Stappers | Currently studied for COTS using existing software algorithms/methods | Primarily GPUs, but will consider FPGA and even ASIC accelerators to save cost and power |
| PST Engine | Swinburne / van Straten | COTS using existing software algorithms, methods, and research. | COTS GPUs. Cost and power here are negligible. |


Signal Processing Functions


Science Data Processor

Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia

International Centre for Radio Astronomy Research (ICRAR), Australia

iVEC, Australia

University of Melbourne, Australia

Canadian Consortium, University of Calgary, Canada

Canadian Astronomy Data Centre (CADC), Canada

IBM Canada Limited, Canada

Calgary Scientific Inc., Canada

Rackforce, Canada

University of Alberta, Canada

University of British Columbia, Canada

McGill University, Canada

University of Chile, Chile

Regensburg University, Germany

Netherlands Institute for Radio Astronomy (ASTRON), The Netherlands

Institute for Radio Astronomy and Space Research (IRASR), New Zealand

COMPUCON New Zealand

GreenButton, New Zealand


AUT University, New Zealand

Massey University, New Zealand

University of Auckland, New Zealand

University of Otago, New Zealand

Callaghan Innovation, New Zealand

Open Parallel, New Zealand

Victoria University of Wellington, New Zealand

Instituto de Telecomunicações (IT), Portugal

University of Évora, Portugal

Council for Scientific and Industrial Research (CSIR), South Africa

Instituto de Astrofísica de Andalucía, Spain

Barcelona Supercomputing Center, Spain

Fundación Centro de Supercomputación (FCSCL), Spain

Science and Technology Facilities Council (STFC), UK

University of Manchester, UK

University of Cambridge, UK

Oxford University, UK

University College London, UK

University of Southampton, UK

Google, USA


SDP Computing Tasks

The tasks are combined in iterative loops that typically involve refining estimates of array parameters, such as complex gains, while concurrently creating images that converge towards the transformed observed data in these steps.

● Removing data that has been corrupted by interference or faults in the system. This can include interference from mobile phones or any other Earth-based radio signal, errors in the signal transport, or problems with the hardware.

● Calibrating each antenna’s signal to remove the effects of instrumental variation and variations in the line-of-sight propagation of the radio signal. With so many radio telescopes, this requires huge amounts of processing power to manage.

● Transforming the data onto a rectangular grid in what radio astronomers call the “u-v plane”. This is akin to interpolating a few randomly scattered altitude measurements onto a regular map grid to estimate the altitudes at all grid intersections.

● A mathematical calculation called a Fourier transformation to convert the data into a representation of the object’s image in the sky.

● A further calculation called “deconvolution of the point-response function of the array” to remove the radio equivalent of the spikes around bright stars in an optical image.

These steps must be done for thousands of separate frequencies: although the SKA radio telescopes work over low, mid and high frequency ranges, these are indeed ranges, and thousands of individual frequencies must be analysed within each one. The SKA’s computers are required to do all of this in real time. Buffer memory is required to store interim processing results while the processing loops are being executed. The science processing facility will also have large data storage sub-systems. The end results of the converged image processing form the basis of the final astronomical images that are distributed to astronomers and physicists around the world.
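The gridding and Fourier transform steps listed above can be illustrated with a toy example. It is a sketch under strong assumptions: one frequency channel, nearest-cell gridding, u-v coordinates already expressed in grid cells, and a direct (slow) inverse DFT standing in for the FFT a real pipeline would use. Flagging, calibration and deconvolution are left out, and all function names are hypothetical.

```python
import cmath
import math


def point_source_visibility(u, v, x0, y0, n):
    """Visibility of a unit point source at pixel (x0, y0) on an n x n image."""
    return cmath.exp(-2j * math.pi * (u * x0 + v * y0) / n)


def grid_visibilities(samples, n):
    """Nearest-cell gridding of (u, v, visibility) samples onto an n x n u-v grid."""
    grid = [[0j] * n for _ in range(n)]
    for u, v, vis in samples:
        grid[int(round(v)) % n][int(round(u)) % n] += vis
    return grid


def dirty_image(grid):
    """Direct inverse 2-D DFT of the gridded visibilities (real part only)."""
    n = len(grid)
    image = [[0.0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            acc = 0j
            for v in range(n):
                for u in range(n):
                    if grid[v][u] != 0:
                        phase = 2j * math.pi * (u * x + v * y) / n
                        acc += grid[v][u] * cmath.exp(phase)
            image[y][x] = acc.real
    return image


def peak(image):
    """Return (x, y) of the brightest pixel."""
    n = len(image)
    return max(
        ((x, y) for y in range(n) for x in range(n)),
        key=lambda xy: image[xy[1]][xy[0]],
    )


if __name__ == "__main__":
    n = 8
    uv_points = [(1, 0), (0, 1), (2, 3), (-1, 2), (3, -2)]  # sparse u-v coverage
    samples = [(u, v, point_source_visibility(u, v, 3, 5, n)) for u, v in uv_points]
    print("brightest pixel:", peak(dirty_image(grid_visibilities(samples, n))))
```

With only a handful of u-v samples the source is recovered at the right pixel but is surrounded by sidelobes; removing that point-response pattern is the job of the deconvolution step.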


SDP Design Approach

● Adopt an incremental and iterative design approach to the system engineering and prototyping

● Horizontal prototyping aims to provide a system-wide prototype; vertical prototyping provides performance and functionality of individual components

● Open Architecture Lab approach based on Lawrence Livermore Hyperion, with emphasis on determining an appropriate scalable element

● Emphasis on system-level components of the open source software stack

● Create an evaluation and prototyping testbed: Petascale I/O technology scaling for SKA1 and future capacity to SKA2 - processor, memory, networking, storage, visualization, etc.

● Design for future technology refresh, expansion, and upgrades


Overall SDP Block Diagram


Multi-Tasking and Multifunction Processing



Estimated SDP Sizing


Telescope Manager

● Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia

● National Research Council of Canada (NRC), Canada

● GTD GmbH, Germany

● National Centre for Radio Astrophysics (NCRA), India

● Tata Research, Development and Design Centre (TRDDC), India

● National Institute for Astrophysics (INAF), Italy

● Instituto de Telecomunicações (IT), Portugal

● Geo-Space Sciences Research Center (CICGE-FCUP ), Portugal

● SKA South Africa, South Africa

● Science and Technology Facilities Council (STFC), UK


Telescope Manager Description

● This element includes all hardware and software necessary to control the telescope and associated infrastructure. The TM includes the co-ordination of the systems at observatory level and the software necessary for scheduling the telescope operations. It also includes the central monitoring of key performance metrics and the provision of central co-ordination of safety signals.

● The TM provides physical and software access to, and at, remote locations for transmission of diagnostic data and local control.

● When its design and development are complete, the TM will be responsible for monitoring the entire telescope, including the engineering and operational status of its component parts.

● The TM is also responsible for enabling control of various sub-systems and their associated components, as well as providing and supporting online and physical access.

● The TM will send control signals when needed, detect and manage faults if they arise, control associated infrastructure, and coordinate the handling of safety signals.

● The TM is also responsible for coordinating observations, including telescope operations, operator infrastructure, metadata collection, archiving of collected monitor and control data, and much more.

● The TM also links to a number of other work package elements through interfaces and provides the backbone for the functioning of the telescope arrays.

● In summary, the TM is responsible for the management of all astronomical observations, the management of all the telescope hardware and software systems that perform the observations, and facilitating communication across the primary stakeholders, in addition to ensuring safety.


Infrastructure, Power, etc.

● Two Consortia – Australia and South Africa separate

● INFRA-AU
  ● Aurecon, Australia
  ● Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia
  ● Woodhead, Australia
  ● Radio Quiet Zone (RQZ) Solutions, Australia
  ● Rider Levett Bucknall, Australia

● INFRA-SA
  ● SKA South Africa – Tracy Cheetham


Assembly Integration Verification

● Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia

● Netherlands Institute for Radio Astronomy (ASTRON), The Netherlands

● SKA South Africa, South Africa


Acquisition and Construction SKA1 2017-2020

● 2016 Consortia deliverables to SKAO are documents to enable preparation of necessary tenders

● 2017 an organization to enable the procurements should be formed by the Board and set up to issue the tenders

● 2017 expect that some teams (not current consortia) will be encouraged to respond to tenders

● 2017 response to tenders and selection of successful bidding teams

● 2018 construction of SKA1 begins

● 2020 SKA1 operations begin


SKA2 Scale Up and Schedule

● Design and Engineering completed by 2020 based on experience and learnings from SKA1

● 2021 tenders issued

● 2022 build out begins

● 2025 SKA2 complete and operations begin

● SKA intended to be operational until 2075


Radio Astronomy Related Industry Consortium (RARIC)

● Missions
  ● Prototype, develop, document and advocate RA as an area of science exploration as valuable to US industry at extreme scale
  ● Guidance from astronomers collaborating with industry to create working groups and carry out projects that advance technologies for RA
  ● Add to industry and government understanding of RA’s economic and societal value
  ● Provide results of work to SKAO and other RA projects of extreme scale

● Initial Voting Members
  ● Industry – Cray, DDN, IBM, Intel*, Mellanox, Nvidia*, Xyratex
  ● Academic – Berkeley, Cornell*, Illinois/NCSA, JPL/Caltech, UNC/Renci

● Proposed Working Groups
  ● Data Management – Paul Grun Co-Chair with Caltech
  ● Software Correlation – Co-Chairs are NCSA and Berkeley
  ● Foundational Technologies – Bill Boas Co-Chair with Alan Benner
  ● Transient Streaming Analytics – proposed by JPL

Cray Inc. – March 2013 37

* Has not yet agreed to join and assign people


SKA’s Analysis of HPC Roadmap by 2017

Cray Inc. – Jan 2014 38

Historical Progress of Top500

• By 2017 SKA goals should be more than realistic for FLOPs

• May even be able to use a Top100 system

• Integer performance and number of threads may be a challenge

• As may be the data streaming aspect

Memory Bandwidth

• Arithmetic intensity is lower, so FLOPs is not a good measure

• Expected to fall short by O(10) by 2017

Data Rates

• Correlator preliminary projections lead to 700 40GbE or 280 100GbE links

• Beamformer is different; need is higher by O(10)

Power

• Projections for Exascale of 20 MW, not including cooling in desert

• Cost in Euro may be in the range 45-100M per year!!
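The Data Rates and Power figures above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it assumes the 20 MW draw is constant all year and that the whole 45-100M Euro range is electricity, neither of which the slide states. The function names are made up for this example.

```python
# Figures copied from the slide.
LINKS_40GBE = 700
LINKS_100GBE = 280
POWER_MW = 20.0
COST_EUR_PER_YEAR = (45e6, 100e6)

# Assumption (not from the slide): full power draw, 24 hours a day, 365 days.
HOURS_PER_YEAR = 24 * 365


def aggregate_tbps(n_links, gbps_per_link):
    """Aggregate line rate in Tb/s for n_links links of the given speed."""
    return n_links * gbps_per_link / 1000.0


def implied_price_eur_per_kwh(cost_eur, power_mw):
    """Electricity price implied by an annual cost at constant power draw."""
    kwh_per_year = power_mw * 1000.0 * HOURS_PER_YEAR
    return cost_eur / kwh_per_year


if __name__ == "__main__":
    # Both link options describe the same aggregate correlator bandwidth.
    print("40GbE option :", aggregate_tbps(LINKS_40GBE, 40), "Tb/s")
    print("100GbE option:", aggregate_tbps(LINKS_100GBE, 100), "Tb/s")
    for cost in COST_EUR_PER_YEAR:
        price = implied_price_eur_per_kwh(cost, POWER_MW)
        print(f"{cost / 1e6:.0f}M Euro/year -> {price:.2f} Euro/kWh")
```

Both link counts work out to the same 28 Tb/s aggregate, so the two options are alternative ways of carrying one projected data rate. Under the stated assumptions the cost range corresponds to roughly 0.26 to 0.57 Euro per kWh.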

Metric | 2018 | 2009 vs 2018
Rmax | 1 EFlop | O(1000)
Energy requirement | 20 MW | O(10)
Energy/Flop | 20 pJ/Flop | -O(100)
System memory | 32 - 64 PB | O(100)
Memory/Flop | 0.03 B/Flop | -O(10)
Node performance | 1 - 15 TFlop | O(10) - O(100)
Node interconnect b/w | 200 - 400 GB/s | O(100)
Memory bw/node | 2 - 4 TB/s | O(100)
Memory bw/Flop | 0.002 B/s/Flop | -O(100)
Concurrency | O(10^9) | O(10,000)
MTTI | O(1 day) | -O(10)
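Two of the derived rows in the table follow directly from the headline rows, and the node figures show why FLOPs alone can mislead for low arithmetic intensity work. The sketch below recomputes Energy/Flop and Memory/Flop, then applies the standard roofline bound (attainable rate = min(peak, intensity × memory bandwidth)) to the upper-end node figures. The intensity of 0.5 flop/byte is a hypothetical value chosen for illustration and is not taken from the slides.

```python
# Values copied from the 2018 column of the table above.
RMAX_FLOPS = 1e18                # 1 EFlop
POWER_W = 20e6                   # 20 MW
MEMORY_BYTES = (32e15, 64e15)    # 32 - 64 PB
NODE_PEAK_FLOPS = 15e12          # upper end of 1 - 15 TFlop per node
NODE_MEM_BW = 4e12               # upper end of 2 - 4 TB/s per node

# Hypothetical arithmetic intensity in flop per byte; NOT from the slides.
EXAMPLE_INTENSITY = 0.5


def energy_per_flop_pj(power_w, flops_per_s):
    """Energy per floating-point operation, in picojoules."""
    return power_w / flops_per_s * 1e12


def memory_per_flop(memory_bytes, flops_per_s):
    """Bytes of system memory per flop/s of peak performance."""
    return memory_bytes / flops_per_s


def ridge_point(peak_flops, mem_bw):
    """Intensity (flop/byte) below which a kernel is bandwidth-bound."""
    return peak_flops / mem_bw


def attainable_flops(intensity, peak_flops, mem_bw):
    """Roofline bound: the lesser of peak compute and intensity * bandwidth."""
    return min(peak_flops, intensity * mem_bw)


if __name__ == "__main__":
    print("Energy/Flop (pJ):", energy_per_flop_pj(POWER_W, RMAX_FLOPS))
    for mem in MEMORY_BYTES:
        print("Memory/Flop (B/Flop):", memory_per_flop(mem, RMAX_FLOPS))
    print("Ridge point (flop/byte):", ridge_point(NODE_PEAK_FLOPS, NODE_MEM_BW))
    rate = attainable_flops(EXAMPLE_INTENSITY, NODE_PEAK_FLOPS, NODE_MEM_BW)
    print("Fraction of node peak:", rate / NODE_PEAK_FLOPS)
```

With these inputs the table's 20 pJ/Flop and 0.03 B/Flop (lower end of the memory range) are reproduced, and a kernel at the hypothetical intensity would be limited by memory bandwidth to well under the node's peak.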


Software and Applications – major needs

● Operating Systems and Middleware

  ● Non-Stop Data Streaming at Extreme Scale

  ● Huge increase in concurrency

  ● Significant increase in I/O efficiency

  ● Consistency across heterogeneous hardware

  ● Synchronicity across nodes and cores to run in lock-step

  ● No “failures” (h/w or s/w) may impact continuous operation

● Compilers/Libraries/Development Tools

  ● Consistent across Hardware and Instruction Set Architectures

  ● Extreme levels of parallelism across all APIs

  ● Failure Handling does not interrupt Non-Stop Data Streaming at Extreme Scale

  ● Programmer knowledge of data transformations and locality of data within streams

  ● Knowledge of flops, bytes and joules in real time

● Algorithms and Applications

  ● Fortunately nearly all codes today are “home grown” in RA

  ● To achieve the necessary scale nearly all current RA codes need re-writing

  ● Number of channels and computing resources are roughly in balance, but not at extreme scale

● Co-Design

  ● Close collaboration between RA and Industry NECESSARY for all of the above

Cray Inc. – Jan 2014 39


Thank You

Q & A

[email protected]

510-375-8840

Cray Inc. – Jan 2014 40


Cray Inc. – Jan 2014 41

Metric | 2018 | 2009 vs 2018
Rmax | 1 EFlop | O(1000)
Energy requirement | 20 MW | O(10)
Energy/Flop | 20 pJ/Flop | -O(100)
System memory | 32 - 64 PB | O(100)
Memory/Flop | 0.03 B/Flop | -O(10)
Node performance | 1 - 15 TFlop | O(10) - O(100)
Node interconnect b/w | 200 - 400 GB/s | O(100)
Memory bw/node | 2 - 4 TB/s | O(100)
Memory bw/Flop | 0.002 B/s/Flop | -O(100)
Concurrency | O(10^9) | O(10,000)
MTTI | O(1 day) | -O(10)

Table 1 - Projected supercomputer specifications, compared to two current top ranking supercomputers.


Overall Project Summary

● SKA is a very large, global, exascale radio astronomy observatory, a decade in the making with decades more ahead; a 24/7 enterprise, conceived and driven by astronomers

● 2/3 installed in Africa, 1/3 installed in Australia, Project Office in UK, 12 member countries now, USA backed out in 2010

● Three Phases going forward from now

  ● 2013-16 SKA1 (10%) Pre-Construction: requirements, architecture, design, specification; 12 Consortia awarded this phase

  ● 2017-18 Issue tenders, award and construct SKA1, including precursors

  ● 2019-2021 Design and specify SKA2 (90%); SKA1 operational, and for 50 years thereafter

  ● 2022-24 Issue SKA2 tenders, award, construct, integrate into operations

● Budgets in Euros

  ● Pre-Construction 90M in kind by members, 5M for computing prototypes in Architecture Lab at Cambridge University

  ● SKA1 650M cap from Project Office; SKA2 ~5000M

Cray Inc. – March 2013 42


Q & A

Thank you

[email protected]

Cray Inc. – Jan 2014 43