Astronomical “Big Data” Analysis and...

Preview:

Citation preview

Amr H. Hassan

Centre for Astrophysics and Supercomputing, Swinburne University of Technology

With :

Christopher Fluke (Swinburne), David Barnes (Monash), Virginia Kilborn (Swinburne)

Astronomical “Big Data” Analysis and Visualization

http://www.gizmodo.com.au csironewsblog.com

Astronomy In the “Big Data” Era

© http://science.psu.edu Photography by Paul Bourke and Jonathan Knispel. Supported by WASP (UWA), iVEC, ICRAR, and CSIRO

Australian SKA Pathfinder (ASKAP) The Large Synoptic Survey Telescope (LSST)

3 to 12 TB/ Day 30 TB/ Day

400 TFLOP/S

Swinburne gStar GPU-Supercomputer Square Kilometre Array (SKA)

30 to 360 TB/ Day

© skatelescope.org

Astronomy In the “Big Data” Era

http

://ww

w.w

ired.c

om

/wire

de

nte

rpris

e

LSST – 2018

• 15 TB / night

• Image size ~ 6 to 10 GB

http

://ww

w.b

igdata

byte

s.c

om

ASKAP – 2014

• 3 to 12 TB / day

• Data unit ~ 1TB

Image credit – Swinburne Astronomy Productions / CSIRO

The Australian Square Kilometre Array Pathfinder

Big Data - Case Study

ASKAP

ASKAP data

ASKAP central processing

Simulated Data

ASDAF

ASDAF

Science Community

WALLABY Science Processing

Quality Control

Source Finding

(2)

Spectral Stacking

Source Parameterization

Data Management

Science Analysis

Publications Multi-wavelength data

Radio Astronomy – Computer Assisted Data Analysis

58 TB

Source : WALLABY ASKAP Review - PIs: B. S. Koribalski & L. Staveley-Smith

©zeis

s.m

agn

et.fs

u.e

du

Radio Astronomy – Computer Assisted Data Analysis

DEC

DEC

ASKAP

ASKAP data

ASKAP central processing

Simulated Data

ASDAF

ASDAF

Science Community

WALLABY Science Processing

Quality Control

Source Finding

(2)

Spectral Stacking

Source Parameterization

Data Management

Science Analysis

Publications Multi-wavelength data

Radio Astronomy – Computer Assisted Data Analysis

58 TB

Source : WALLABY ASKAP Review - PIs: B. S. Koribalski & L. Staveley-Smith

ASKAP Cube Dimensions 6144 x 6144 x 16384

10 fps ≈ 30 Minutes

ASKAP Cube Dimensions into 6x6 Grid ≈ 36 x 1024 x 1024 x 16384

10 fps → 36 x 27.3 Minutes ≈ 16.4 Hours

Each Cube 64 GB each

Radio Astronomy – Computer Assisted Data Analysis

Radio Astronomy – Computer Assisted Data Analysis

Radio Astronomy – Computer Assisted Data Analysis

(2σ)

(3σ) (4σ) (7σ)

Computer Assisted Data Analysis Sigma-Clipping Transfer Function

Computer Assisted Data Analysis 3D Spectrum Extraction

Computer Assisted Data Analysis Integerating Source Finder Output

Computer Assisted Data Analysis Other operations

http://www.getmemedia.com

8000×8000 pixel volume rendering of the HIPASS dataset on the CSIRO Optiportal at Marsfield,

NSW. The Southern Sky cube was generated by Russell Jurek (ATNF) from 387 HIPASS cubes.

Credit: Christopher Fluke

Computer Assisted Data Analysis Next Step – Better data interaction

Performance Analysis and Benchmarks gStar

50 standard SGI C3108-TY11 nodes that

each contain:

• 2 six-core Westmere processors at

2.66 GHz

• 48 GB RAM

• 2 NVIDIA Tesla C2070 GPUs (each

with 6 GB RAM).

• 1.7 petabytes of usable disk space

served by a Lustre file system.

Performance Analysis and Benchmarks Datasets

Dataset Name Dimensions (Data Points)

File Size Number of Points

HIPASS Cube 1721 x 1721 x 1024 11.3 GB 3 Billion

8X HIPASS Cube 3442 x 3442 x 2048 90.4 GB 24 Billion

27X HIPASS Cube 5163 x 5163 x 3072 305.1 GB 81 Billion

48X HIPASS Cube 6884 x 6884 x 3072 542.33 GB 145 Billion

Performance Analysis and Benchmarks Datasets

Performance Analysis and Benchmarks Data Analysis Performance - 96 GPUs

Dataset Name File Loading Median (s)

Mean/Std (s)

Histogram (s)

48X HIPASS Cube ~ 9 Minutes 44 1.745 4

27X HIPASS ~ 5.3 Minutes 22 1.2 3.9

8X HIPASS ~ 8.3 Minutes 7.8 0.5 1.6

HIPASS ~ 10 Seconds 2 0.4 0.12

Computer Assisted Data Analysis Volume Rendering – Performance

Computer Assisted Data Analysis Volume Rendering Performance with 96 GPUs

http://fresh-flow.co.uk

Recommended