
  • ASKAP Computing - ASKAIC Technical Update “Towards SKA - ASKAP computing & the development of SKA systems”

    Ben Humphreys, ASKAP Computing Project Engineer, 10th July 2009

    Image Credit: Swinburne Astronomy Productions

  • Outline

    •  ASKAP Computing & Software Architecture Overview

    •  ASKAP Central Processor Overview
       •  Design
       •  Middleware
       •  Hardware

    •  ASKAP Data Storage

    •  Challenges for ASKAP Computing

  • ASKAP Computing & Software Architecture Overview

  • Architecture Goals

    •  The structure of the system
    •  Top-level decomposition of the software system
    •  Definition of interactions between components

    •  Loosely coupled components
    •  Hardware platform independence
    •  Language independence
    •  Operating system independence
    •  Implementation independence
    •  Location and server transparency
    •  Asynchronous communication


  • Logical View


  • Evaluation of Middleware

    •  Evaluated:
       •  Apache Tuscany (SCA/SOA)
       •  ActiveMQ / JMS (Message Oriented Middleware)
       •  ICE (Distributed Objects)


  • Evaluation of Middleware

    •  Requirements:
       •  Support for Linux
       •  Language bindings for Java, C++ and Python
       •  Support for request/response style communication (synchronous and asynchronous)
       •  Support for publish/subscribe style communication
       •  Promotes loose coupling
       •  Support for fault tolerance (replication, etc.)

    •  Desirable:
       •  Mature
       •  Open source


  • Evaluation of Middleware

    •  Both ActiveMQ and ICE were found to meet our requirements

    •  We have selected ICE over ActiveMQ/JMS, primarily because of its interface definition language (sketched below):
       •  Avoids having to define our own interface definition language
       •  Avoids having to build our own bindings between each language and the interface definitions

    •  Also, many of our interfaces appear to be best suited to an object-oriented model
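    A minimal, purely illustrative sketch of that point (the interface name, operation and endpoint below are hypothetical, not taken from ASKAPsoft): a single Slice definition is compiled by slice2cpp / slice2java / slice2py, and each language gets type-checked proxies without hand-written bindings.

```cpp
// Hypothetical Slice definition (IngestService.ice), reproduced as a comment.
// slice2cpp generates the C++ proxy/skeleton classes used below.
//
//   module askap {
//       interface IngestService {
//           void startObservation(string schedulingBlockId);
//       };
//   };

#include <Ice/Ice.h>
#include <IngestService.h>   // generated by slice2cpp from the Slice above

int main(int argc, char* argv[])
{
    Ice::CommunicatorPtr ic = Ice::initialize(argc, argv);

    // Resolve a proxy from a stringified identity and endpoint (illustrative values).
    Ice::ObjectPrx base = ic->stringToProxy("IngestService:tcp -h cp-frontend -p 10000");

    // checkedCast verifies the remote object actually implements IngestService.
    askap::IngestServicePrx svc = askap::IngestServicePrx::checkedCast(base);

    svc->startObservation("SB1234");   // an ordinary, type-checked remote call

    ic->destroy();
    return 0;
}
```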


  • Evaluation of Middleware: ICE

    •  The Internet Communications Engine (ICE) is an object-oriented middleware with support for C++, .NET, Java, Python, Ruby, and PHP

    •  ICE supports heterogeneous environments: client and server can be written in different programming languages, can run on different operating systems and machine architectures, and can communicate using a variety of networking technologies

    •  Open Source: available under the GPL and a commercial license

    •  Put simply, ICE is CORBA done right!

    •  Long history of CORBA use in similar systems
       •  The Advanced Technology Solar Telescope (ATST) employs ICE


  • Other Software Decisions

    •  Central Processor decisions will be discussed later

    •  Telescope Operating System (aka Monitoring and Control): the Experimental Physics and Industrial Control System (EPICS)
       •  Originally written jointly by Los Alamos National Laboratory and Argonne National Laboratory
       •  First released in 1990; used in more than 100 projects
       •  Decision made after extensive evaluation of:
          •  PVSS-II
          •  EPICS
          •  ALMA Common Software (ACS)
          •  TANGO
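    For flavour, reading a single process variable over EPICS Channel Access from C++ looks roughly like this (the PV name is made up, and error handling is reduced to the SEVCHK convenience macro):

```cpp
#include <cadef.h>    // EPICS Channel Access client API (link against libca)
#include <cstdio>

int main()
{
    // Create a CA client context and connect to one (hypothetical) PV.
    SEVCHK(ca_context_create(ca_disable_preemptive_callback), "ca_context_create");

    chid chan;
    SEVCHK(ca_create_channel("askap:ant01:windSpeed", NULL, NULL, 0, &chan), "ca_create_channel");
    SEVCHK(ca_pend_io(5.0), "connect timeout");   // wait for the connection to complete

    // Fetch the current value as a double.
    double value = 0.0;
    SEVCHK(ca_get(DBR_DOUBLE, chan, &value), "ca_get");
    SEVCHK(ca_pend_io(5.0), "get timeout");

    std::printf("askap:ant01:windSpeed = %f\n", value);

    ca_clear_channel(chan);
    ca_context_destroy();
    return 0;
}
```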


  • Other Software Decisions

    •  Logging: Log4j, Log4cxx, Log4py

    •  Monitoring: MoniCA (existing ATNF package)

    •  Data Services:
       •  MySQL or PostgreSQL
       •  SQLAlchemy
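    As a small illustration of the Log4cxx choice on the C++ side (the logger name is hypothetical):

```cpp
#include <log4cxx/logger.h>
#include <log4cxx/basicconfigurator.h>

int main()
{
    // Default console appender and layout; real deployments would use a configuration file.
    log4cxx::BasicConfigurator::configure();

    log4cxx::LoggerPtr log = log4cxx::Logger::getLogger("askap.cp.ingest");
    LOG4CXX_INFO(log, "Ingest pipeline started");
    LOG4CXX_WARN(log, "Flagged " << 42 << " suspect samples");   // stream-style message building
    return 0;
}
```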


  • Pending Software Decisions

    •  Observation Preparation Tool: investigating the possibility of reuse
       •  Gemini Observing Tool
       •  James Clerk Maxwell Telescope (JCMT) Observing Tool
       •  ALMA Observing Tool

    •  Alarm Management System: ongoing evaluation
       •  EPICS Alarm Handler
       •  DESY (German Electron Synchrotron) Alarm Management System

    •  Operator Displays: rich client or web based?



  • ASKAP Central Processor Overview


  • Key Requirements

    •  Process data from observing to archive with minimal human decision making

    •  Calibrate automatically

    •  Imaging:
       •  Fully automated processing, random field
       •  Fully automated processing, Galactic plane
       •  Fully automated processing, HI

    •  Form science-oriented catalogues automatically


  • Traditional Approach to Data Reduction

    •  The traditional, interactive approach to data reduction typically involves multiple passes over the same data

    •  We don't want to sit in front of a computer all day, so ideally reducing, say, an 8-12 hour observation would take an hour or two

    •  For ASKAP this approach would require ~500 GB/s of I/O
       •  For SKA, perhaps a few PB/s
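    As a back-of-the-envelope check (illustrative numbers, not from the slides): if an interactive reduction re-reads the full raw dataset N times within a reduction window T_reduce, the sustained I/O rate needed is roughly

    $$ R_{\mathrm{I/O}} \approx \frac{N \, R_{\mathrm{ingest}} \, T_{\mathrm{obs}}}{T_{\mathrm{reduce}}} $$

    so an ingest rate of a few GB/s, a 12 hour observation, a 1-2 hour reduction window and of order ten passes already puts the requirement in the hundreds of GB/s, consistent with the ~500 GB/s quoted above.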


  • Streaming Approach to Data Reduction

    •  A streaming approach is possible for single-pass, automated data reduction

    •  No temporary storage is required for raw data, but processing must keep up with the input data rate

    •  However, there are downsides:
       •  Certain optimizations are impossible: the data must generally be processed in the order it is received
       •  Failure handling becomes much more difficult


  • Buffered Approach to Data Reduction

    •  Buffering the dataset at a certain stage allows:
       •  Optimizations in ordering before the compute-intensive stage
       •  Better failure handling

    •  Still single-pass, automated data reduction: processing must keep up with the input data rate, but can cope with some variability in processing time

    •  For ASKAP this approach would require ~10 GB/s of I/O
       •  For SKA, perhaps 20-50 TB/s


  • Processing models

    •  Streamed:
       •  Visibility data is processed immediately
       •  Few optimization options
       •  Needs enough memory to hold the images: over 100 TB for 16384 channels
       •  Needs time to finish off at the end of the observation
       •  Difficult to handle faults without losing channels

    •  Buffered:
       •  Data is stored to disk for part or all of the observation
       •  When enough data is ready, processing starts
       •  High latency
       •  Reduced memory requirement (16-32 TB for 16384 channels)
       •  Many choices to optimize the sequence of processing, and hence performance
       •  Simplifies fault handling
       •  Impossible if the ingest rate exceeds disk I/O capability


  • Central Processor Design

    •  The Central Processor is composed of a Frontend and a Backend

    •  Frontend (Data Conditioner), responsible for:
       •  Ingesting visibilities from the correlator
       •  Ingesting metadata from the Telescope Operating System
       •  Annotating the visibility stream with metadata
       •  Flagging suspect data (e.g. RFI-impacted data)
       •  Calculation of calibration parameters
       •  Applying calibration to the observed visibilities
       •  Writing out a measurement set or similar

    •  Backend, responsible for:
       •  Processing calibrated visibilities into science products:
          •  Image cubes
          •  Source catalogues
          •  Transient detections


  • Central Processor Design: Physical View (omits control plane)

  • Central Processor Design: Logical View

  • Central Processor Design: Process View / Data Flow

  • Central Processor Design

    •  Front-End:
       •  Based on the “Pipes & Filters” design pattern (a minimal sketch follows this slide)

    •  Back-End is more complex:
       •  The spectral line imager is generally a linear processing pipeline
       •  The continuum imager is iterative and involves a large reduction step
       •  Transient detection may be more event/decision oriented

    •  Design for coarse-grained parallelism:
       •  For most tasks, each channel can be handled independently
       •  For many tasks, each baseline, polarization, or time sample can be handled independently
       •  This doesn't preclude fine-grained parallelism where required
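    A minimal sketch of the “Pipes & Filters” idea referenced above, purely illustrative (the class and method names are not the ASKAPsoft API): each filter transforms a chunk of visibilities in place, and the pipeline chains the filters in the order data flows from the correlator.

```cpp
#include <memory>
#include <utility>
#include <vector>

// One integration's worth of visibilities plus annotations (metadata, flags, ...).
struct VisChunk { /* visibilities, (u,v,w) coordinates, flags, calibration, ... */ };

// A "filter": one self-contained stage, e.g. flagging or applying calibration.
class Filter {
public:
    virtual ~Filter() {}
    virtual void process(VisChunk& chunk) = 0;
};

// The "pipe": stages run in order on each chunk as it streams through.
class Pipeline {
public:
    void addStage(std::unique_ptr<Filter> stage) { itsStages.push_back(std::move(stage)); }

    void run(VisChunk& chunk) {
        for (std::size_t i = 0; i < itsStages.size(); ++i) {
            itsStages[i]->process(chunk);
        }
    }

private:
    std::vector<std::unique_ptr<Filter>> itsStages;
};
```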


  • Central Processor Middleware

    •  Selection of middleware can limit hardware options due to platform and operating system support restrictions

    •  Extensive evaluation of middleware was carried out

    •  Many potential middleware packages/frameworks were identified:
       •  MPI
       •  CORBA (omniORB, TAO)
       •  IBM InfoSphere Streams (System S)
       •  Data Distribution Service (DDS)
       •  ICE


  • Central Processor Middleware: Front-End

    •  We have selected ICE for the frontend of the Central Processor

    •  ICE runs on most major Unix platforms:
       •  Full support from ZeroC: Linux, Solaris, Mac OS X
       •  Builds and runs on: HP-UX, FreeBSD

    •  Prototypes of both the front-end (ingest pipeline) and the imager have been developed using ICE


  • Central Processor Middleware: Back-End

    •  The “Backend” of the Central Processor is where the substantial compute resources in ASKAP reside

    •  We have selected MPI for the backend of the Central Processor (a channel-parallel sketch follows this slide):
       •  MPI is supported on practically all HPC platforms
       •  We believe this decision gives us the widest range of hardware options
       •  The MPI implementation of the ASKAPsoft simulator, imager and source finder has been used for over two years

    •  Our design also requires a resource manager for the backend:
       •  Must be DRMAA compliant
       •  Currently using Torque/PBS
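    A hedged sketch of the coarse-grained, channel-parallel decomposition the back-end relies on: each MPI rank images an independent subset of the spectral channels (imageChannel() is a placeholder for the real per-channel work).

```cpp
#include <mpi.h>
#include <cstdio>

// Placeholder for the per-channel work: gridding, FFT, deconvolution, etc.
void imageChannel(int channel) { /* ... */ }

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);

    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int nChannels = 16384;
    // Round-robin assignment: channel c is handled by rank (c % size),
    // so each channel is processed independently and in parallel.
    for (int c = rank; c < nChannels; c += size) {
        imageChannel(c);
    }

    MPI_Barrier(MPI_COMM_WORLD);   // wait for all workers before merging products
    if (rank == 0) {
        std::printf("Imaged %d channels on %d ranks\n", nChannels, size);
    }

    MPI_Finalize();
    return 0;
}
```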


  • Model for synthesis data processing costs

    •  Key result from investigations over the last two years

    •  Under continual refinement

    •  Key parameters:
       •  Convolution support
       •  Cost per million points per second
       •  Baseline length

    •  Gridding only: calibration, deconvolution, and source finding are not yet included


  • Evaluation of Hardware

    •  Built a small benchmark from the core gridding/degridding algorithm; about 90% of our computing requirements relate to this algorithm

    •  Distributed the benchmark very widely

    •  Benchmarked on systems from:
       •  Intel (Harpertown and Nehalem CPUs)
       •  AMD (Opteron 2000 series CPUs)
       •  NEC (SX-8R & SX-9R)
       •  SGI (Altix 4700 Itanium & Altix XE)
       •  IBM (BlueGene/P)
       •  NVIDIA (Tesla C870 GPU & GeForce GTX 260)
       •  Cray (XT5 & X2)


  • Evaluation of Hardware


    Other numbers are confidential

  • Evaluation of Hardware

    •  Special processors considered for convolutional resampling:
       •  Co-processors
       •  FPGAs
       •  Cell processors
       •  GPGPUs

    •  Field Programmable Gate Arrays (FPGA):
       •  Performance appeared promising for small grid sizes: roughly a 50x speedup relative to a CPU
       •  Limited memory makes large grid sizes challenging
       •  Long development cycle, and requires a very specialised skill set


  • Evaluation of Hardware

    •  Cell Processor:
       •  The STI Cell processor has an HPC variant sold by IBM in the QS22 blade
       •  Difficult programming model
       •  The very small (256 KB) local memory is problematic for our imaging algorithms; the memory bus becomes a significant bottleneck
       •  See the paper by Varbanescu et al.

    •  General Purpose Graphics Processing Unit (GPGPU):
       •  General purpose GPUs are available from NVIDIA and AMD
       •  We have ported our gridding benchmark, with some promising results
       •  The software development effort is larger than for a regular CPU and may cancel out any cost savings


  • Evaluation of Hardware

    •  We have identified a metric for price/performance: price per million grid points per second
       •  i.e. how much it costs to acquire a computer that performs at a certain level

    •  We have identified a metric for power/performance: watts per million grid points per second
       •  i.e. how much power is required to perform at a certain level

    •  The final hardware decision will take many other factors into account: reliability, maintainability, quality, maintenance costs, integration/packaging, etc.
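    Written out explicitly (the notation is ours; the slides give only the verbal definitions), and with lower being better in both cases:

    $$ M_{\mathrm{price}} = \frac{\text{system cost [\$]}}{\text{gridding rate [Mpoints/s]}} \qquad\qquad M_{\mathrm{power}} = \frac{\text{power draw [W]}}{\text{gridding rate [Mpoints/s]}} $$

    For example (numbers purely illustrative), a $2M system sustaining 10,000 Mpoints/s scores $200 per Mpoint/s.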


  • Hardware Needs

    •  We are somewhat reliant on Moore’s Law: building the Central Processor with hardware available today would be too costly

    •  Hardware options must be kept open as long as possible so we are not railroaded into a particular platform/technology

    •  Discussions with vendors and testing of current hardware indicate that next generation systems (2011-2012) will match our requirements and budget


  • Hardware Needs for BETA

    •  Approximate requirements for an indicative Intel/AMD cluster:
       •  3-6 TFlop/s
       •  256-512 cores (as of late 2008 / early 2009)
       •  1-2 TB memory
       •  Good memory bandwidth: > 5 GB/s per core
       •  50 TB persistent storage (1 GB/s I/O rate), plus a backup solution
       •  Modest network interconnect:
          •  Single 1 GbE for compute nodes
          •  Single 10 GbE for the ingest and output nodes


  • Hardware Needs for ASKAP

    •  Approximate requirements for an indicative Intel/AMD cluster:
       •  100 TFlop/s
       •  ~8000 cores (as of late 2008 / early 2009); ~10000 if we assume a more realistic 80% efficiency
       •  16-150 TB memory (depending on processing model)
       •  Good memory bandwidth: > 5 GB/s per core
       •  1 PB persistent storage (10 GB/s I/O rate), plus a backup solution
       •  Modest network interconnect:
          •  1 GbE for compute nodes (but would likely use 10 GbE or InfiniBand)
          •  2-4 x 10 GbE for the ingest and output nodes


  • Central Processor Development Environment

    •  Hardware: x86 64-bit
       •  Intel Core 2 / Core i7
       •  AMD Opteron

    •  Operating System:
       •  Linux (Debian Etch, Fedora 8)
       •  Mac OS X 10.5

    •  Middleware:
       •  OpenMPI 1.3
       •  ICE 3.3

    •  Other 3rd party packages:
       •  LAPACK
       •  FFTW
       •  BLAS
       •  WCSLib
       •  Boost
       •  LOFAR Software
       •  Casacore
       •  Duchamp


    •  Optimized math libraries can be used; ASKAPsoft has been trialed with:
       •  Intel MKL
       •  AMD Core Math Library
       •  IBM ESSL
       •  ATLAS

  • ASKAPsoft Codebase Status

    •  ASKAPsoft in development since 2006

    •  Approximately 110,000 source lines of code (SLOC):
       •  C++      53,375 (49%)
       •  ANSI C   30,616 (28%)
       •  Python   13,232 (12%)
       •  Fortran  11,272 (10%)
       •  sh        1,105 (1%)

    •  Depends upon over 50 third party packages


  • Central Processor Scalability Testing

    •  Currently performing scalability testing at the National Computational Infrastructure (NCI) National Facility

    •  SGI Altix XE cluster system:
       •  156 x SGI Altix XE 320 nodes
       •  1248 cores (312 x 3.0 GHz Intel Harpertown CPUs)
       •  DDR InfiniBand interconnect
       •  18 x quad-core SGI Altix XE 210 servers for the Lustre filesystem

    •  Migrating to the new Sun Constellation system in late 2009:
       •  Hosted by the NCI National Facility at the ANU
       •  1500 Sun Blade modules (12,000 cores)
       •  500 TB Lustre filesystem


  • Central Processor Scaling Timeline


  • Central Processor I/O Scaling



  • ASKAP Data Storage

    •  ASKAP Telescope/Instrument Online System
       •  Location: Geraldton & Boolardy
       •  Provides the storage required for the ASKAP instrument to operate

    •  ASKAP Science Data Archive Facility
       •  Location: probably Perth
       •  Where astronomers go to access ASKAP data products


  • ASKAP Data Storage: Online System (Geraldton/Boolardy)

    •  RDBMS (MySQL or PostgreSQL), probably < 10 TB. What do we store there?
       •  Configuration
       •  Scheduling Blocks
       •  Source Catalogues
       •  Calibration Parameters
       •  Monitoring Archives
       •  Logging & Alarms

    •  Parallel Filesystem (Lustre, GPFS, PVFS or pNFS), minimum 1 PB. What do we store there?
       •  Raw Datasets (Visibilities and Metadata)
       •  Images
       •  Image Cubes


  • ASKAP Science Data Archive Facility

    •  Where astronomers go to access ASKAP data products
       •  Located separately from the Processing Centre (probably Perth)
       •  One-way data path from the Online Data Store to the ASDAF, save for acknowledgement that data has been received

    •  Data sent to the Archive:
       •  Images, cubes, visibilities (continuum only), plus their metadata
       •  Transient images & time series
       •  Source catalogues

    •  Capabilities of the Archive:
       •  Limited to queries and downloads
       •  Reprocessing capabilities (e.g. stacking cubes) are out of scope
       •  Standard VO-style queries, plus more specialised queries on ASKAP-specific metadata
       •  Normal downloading is hard (large images!), so provide a “take-away” capability


  • ASKAP Science Data Archive Facility: Size of potential data products

    •  Continuum Visibility Data       ~ 370 TB/year
    •  Spectral Line Visibility Data   ~ 23 PB/year
    •  Transient Visibility Data       ~ 23 TB/year
    •  Continuum Images                ~ 256 TB/year
    •  Transient Images                ~ 8.4 PB/year
    •  Spectral Line Images            ~ 4 PB per all-sky survey
    •  Continuum Catalogue             ~ 60 GB
    •  Transient Catalogue             ?
    •  Spectral Line Catalogue         ?
    •  Spectral Line Stacks            ?


    Source: Cornwell, T.J. “Cost estimates for the ASKAP Science Archive”, ASKAP-SW-0016, 2008

  • ASKAP Science Data Archive Facility

    •  Currently in negotiations with ICRAR for development of the ASKAP Science Data Archive Facility

    •  Hardware/software needs:
       •  Fast online storage
       •  Cheap offline storage
       •  Hierarchical Storage Management (HSM)
       •  RDBMS / Hadoop / SciDB

    •  Significant software development and innovation required!
       •  Managing large datasets
       •  Virtual Observatory (IVOA) interface



  • Challenges for ASKAP Computing

    •  All the usual:
       •  Developing parallel/distributed software
       •  Debugging parallel/distributed software
       •  Reliability
       •  Power (both logistics and running costs)
       •  Acquisition, maintenance & software development costs

    •  Plus a few that are slightly more specific to our needs:
       •  Batch vs streaming
       •  Flop/s vs memory sub-system performance


  • Challenges for ASKAP Computing: Batch vs Streaming

    •  The vast majority of HPC facilities run batch jobs, usually modeling/simulations, not real-time processing of streaming data

    •  The tools to harness the potential of HPC for real-time data acquisition and analysis are still in their infancy, or in most cases don't exist

    •  The ASKAP Central Processor leverages two software frameworks:
       •  ICE, for handling the input data streams
       •  MPI, for harnessing the power of an HPC system

    •  Ideally, one software framework would suit the end-to-end requirements


  • Challenges for ASKAP Computing: Flop/s vs Memory sub-system performance

    •  Our imaging algorithms are more data intensive than computationally intensive. The typical operation (sketched below) is:
       •  Load spectral sample (α)
       •  Load convolution function value (x)
       •  Load grid point (y)
       •  Compute y (accumulate α·x onto y)
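    A stripped-down, illustrative version of that inner loop (not the ASKAPsoft implementation): for each visibility sample, a support-sized patch of the convolution function is scaled by the sample and accumulated onto the uv grid, i.e. one multiply-add for roughly every three memory accesses.

```cpp
#include <complex>
#include <vector>

// Accumulate one visibility sample onto the grid using a convolution-function
// patch of size support x support anchored at grid position (u0, v0).
void gridSample(const std::complex<float>& sample,                  // alpha: the visibility
                const std::vector<std::complex<float> >& cFunc,     // x: convolution function
                int cWidth,
                std::vector<std::complex<float> >& grid,            // y: the uv grid
                int gridWidth, int u0, int v0, int support)
{
    for (int dv = 0; dv < support; ++dv) {
        for (int du = 0; du < support; ++du) {
            const std::complex<float> w = cFunc[dv * cWidth + du];    // load x
            grid[(v0 + dv) * gridWidth + (u0 + du)] += sample * w;    // load y, y += alpha*x, store y
        }
    }
}
```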

  • Challenges for ASKAP Computing: Flop/s vs Memory sub-system performance

    •  Locality optimizations are hard because of:
       •  the large memory requirements of the images and convolution function, plus a quasi-random access pattern
       •  the high input data rate and the potential inability to buffer and reorder input data


  • Challenges for ASKAP Computing: Flop/s vs Memory sub-system performance

    •  Bridging the processor-memory gap: there has been good recent progress
       •  DDR3
       •  The move towards on-chip memory controllers
       •  More channels to memory

    •  We can't let the gap widen any further

    •  Locality awareness will always be important

    •  Software advances are critical:
       •  New implementations of algorithms
       •  New algorithms
       •  New approaches to data processing


  • Challenges for ASKAP Computing: Flop/s vs Memory sub-system performance

    •  The NVIDIA GPU architecture is very good in this respect:
       •  NVIDIA GeForce GTX 285 memory bandwidth: 159 GB/s
       •  Typical x86 CPU memory bandwidth: 15-35 GB/s

    •  Memory stalls don't leave the GPU core(s) idle: other threads can be scheduled while one thread is stalled on a load or store. However, many (1000s of) threads are needed to effectively hide memory latency
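    A rough illustration of why bandwidth, rather than peak Flop/s, sets the limit (our numbers, assuming single-precision complex data): each grid-point update y ← y + α·x costs about 8 flops but moves roughly 24 bytes (load x, load y, store y), so the sustainable gridding rate is about

    $$ R \approx \frac{\text{memory bandwidth}}{24\ \text{bytes per point}} \approx 1.3\ \text{Gpoints/s at 30 GB/s} \quad\text{vs}\quad \approx 6.6\ \text{Gpoints/s at 159 GB/s} $$

    which in both cases is well below what the raw arithmetic throughput of the device would allow.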


  • Contact Us

    Phone: 1300 363 400 or +61 3 9545 2176
    Email: [email protected]
    Web: www.csiro.au

    Thank you

    Australia Telescope National Facility
    Ben Humphreys, ASKAP Computing Project Engineer
    Phone: 02 9372 4211
    Email: [email protected]
    Web: http://www.atnf.csiro.au/projects/askap/
