CASA Algorithms R&D S. Bhatnagar NRAO, Socorro



Outline

● Broad areas of work

1. Processing for wide-field wide-band imaging

Full-beam, Mosaic, wide-band, full-polarization

Wide-band continuum and spectral-line imaging

2. Related High Performance Computing (HPC)

Multi-threading, Cluster computing, GPU, ...

3. Pipeline processing (imaging)

4. Establish cost-performance equation

● Relatively low-FTE effort

    ● 1 – 1.5 FTEs spread across 3 – 4 scientists

Workload

● Determine the landscape

    ● Characterize the problem, survey existing solutions, estimate the parameter space, etc.

● Develop a path with scientifically useful intermediate stops

● R&D for solution (start simple), stabilize the implementation, scientific testing, characterize the algorithm, etc.

● Publish papers in appropriate refereed journals

● Integrate with production software

    ● Overheads from issues related to new code in the CASA software system

● Usable HPC is necessary, so the group is also involved with the HPC effort(s)

● Write documentation, maintain code, even user support...

Current relevant activities

● Algorithms for wide-band wide-field imaging

    ● Frequency dependence of the sky-brightness distribution

    ● Instrumental: time- and frequency-dependent PB, heterogeneous PBs

● Scientific testing, characterization of limits

– Some results & details in later slides

● Full-polarization imaging

– In-beam polarization (full-Mueller (?) imaging)

    ● Numerical characterization of the problem

– Extend WB PB corrections to full-pol

– Wide-band full-pol. MT-MFS or Cube imaging

    ● RM synthesis
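
RM synthesis is only named on the slide. As a rough illustration, and assuming the standard discrete form F(phi) = (1/N) sum_k P(lambda_k^2) exp(-2i phi (lambda_k^2 - lambda_0^2)) with uniform weights, a minimal Python sketch might look like the following. The function name, the toy rotation measure of 100 rad/m^2 and the channel layout are all assumptions made for the example, not anything taken from CASA.

```python
import numpy as np

C = 299792458.0  # speed of light in m/s

def rm_synthesis(freqs_hz, pol, phi):
    """Discrete Faraday dispersion function F(phi) for complex P = Q + iU.

    Assumes uniform channel weights; lambda0^2 is the mean of lambda^2.
    """
    lam2 = (C / np.asarray(freqs_hz)) ** 2
    lam2_0 = lam2.mean()
    # One row per trial Faraday depth, one column per channel.
    phase = np.exp(-2j * np.outer(phi, lam2 - lam2_0))
    return (phase * pol).mean(axis=1)

# Toy data: one Faraday-thin source with RM = 100 rad/m^2 (made-up value).
freqs = np.linspace(1.0e9, 2.1e9, 64)
pol = np.exp(2j * 100.0 * (C / freqs) ** 2)
phi = np.arange(-500.0, 501.0, 1.0)
fdf = rm_synthesis(freqs, pol, phi)
peak_phi = phi[np.argmax(np.abs(fdf))]
```

The amplitude of `fdf` peaks at the injected Faraday depth; the width of that peak is set by the span of lambda^2 covered by the band.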

Current relevant activities

● Related HPC activities

    ● Large data + high computing load

– Data parallelism on Cluster + Multi-threading

– GPU computing

● Determine balance between computing resources and imaging performance

– Needs scientific testers with domain expertise in advanced algorithms

– Establish cost-performance equation

    ● Important for development going forward

    ● Crucial for usable and reliable pipeline processing

    ● Of great interest for the larger RA community and algorithms R&D

Current relevant activities

● Pipeline processing

    ● Develop heuristics to determine an optimal path through the imager parameter space

– Needs understanding and characterization of limits, estimate of cost-performance equations, scaling laws, etc.

● Software development [Details in presentations later]

    ● Re-factor imager framework

    ● Integrate with existing parallelization framework, re-integrate with new parallelization framework when it is ready

    ● Test for correctness, performance

– Many overheads + inherently time-consuming

Some terminology / definitions

● “Wide-band”: Frequency dependent effects are significant

– Fractional bandwidth used for imaging > ~20%

– High Spectral Index sources

● Wide-field imaging: Imaging FoV requires PB or W-term corrections

– Imaging beyond the 50% point of the PB at a reference frequency

    ● Single pointing wide-band imaging at lower EVLA bands

    ● Mosaicking (by definition!) at any of the EVLA or ALMA bands

– Imaging when $\frac{B\lambda}{f D^2} > 1$ (error due to the W-term is significant)
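
The two numerical criteria above (fractional bandwidth above roughly 20%, and the W-term ratio exceeding 1) are simple enough to evaluate directly. The sketch below assumes that fractional bandwidth means bandwidth divided by the band-centre frequency, and it treats `f` as a dimensionless factor defaulting to 1 because the slide does not define it. The baseline length, dish diameter and function names are illustrative assumptions.

```python
C = 299792458.0  # speed of light in m/s

def fractional_bandwidth(nu_lo_hz, nu_hi_hz):
    """Bandwidth divided by the band-centre frequency (assumed definition)."""
    return (nu_hi_hz - nu_lo_hz) / (0.5 * (nu_lo_hz + nu_hi_hz))

def w_term_ratio(baseline_m, freq_hz, dish_diameter_m, f=1.0):
    """B * lambda / (f * D^2); values above 1 flag a significant W-term error."""
    wavelength_m = C / freq_hz
    return baseline_m * wavelength_m / (f * dish_diameter_m ** 2)

# The 1.0-2.1 GHz band quoted elsewhere in the talk for 3C286.
frac_bw = fractional_bandwidth(1.0e9, 2.1e9)
is_wide_band = frac_bw > 0.2

# Illustrative numbers only: a 36 km baseline and a 25 m dish at 1.5 GHz.
ratio = w_term_ratio(baseline_m=36.0e3, freq_hz=1.5e9, dish_diameter_m=25.0)
needs_w_correction = ratio > 1.0
```

With these inputs the band is well above the 20% threshold and the W-term ratio is well above 1, so both corrections would be flagged.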

Some terminology / definitions

● MT-MFS: Multi-term Multi-Frequency Synthesis algorithm

– To account for the frequency dependence of the sky brightness distribution

– Important for fractional bandwidth of > ~20% and dynamic range (DR) > ~10^3

● A-Projection: Algorithm to correct for Direction-dependent (DD) effects (PB effects) as a function of time and polarization

– Useful for sensitive spectral-line imaging

– a.k.a “Narrow Band A-Projection” or “NB A-Projection”

● WB A-Projection: Algorithm to also account for frequency dependent DD effects (frequency dependent PB)

– PB corrections beyond 50% point in single-pointing imaging

– For accurate mosaic imaging at DR in the range of a few x 10^3

– Probably at even lower DR for full-pol imaging at any of the ALMA or EVLA bands

Wide-band imaging

● MT-MFS for frequency dependence of sky brightness [Rau & Cornwell, A&A, 2010]

$$S(\nu, \vec{l}) \propto S(\nu_0, \vec{l}) \left(\frac{\nu}{\nu_0}\right)^{\alpha(\nu, \vec{l})}$$

• 3C286, BW=1.0-2.1 GHz

    ● No wide-band modeling of the sky emission

    ● DR: 1600

• 3C286, BW=1.0-2.1 GHz

    ● With MS-MFS (frequency-dependent model for the sky emission)

    ● DR: >110,000
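
To make the power-law model concrete, the sketch below expands a single-pixel spectrum as a polynomial in (nu - nu0)/nu0, which is the kind of multi-term frequency model MT-MFS relies on, and recovers the spectral index from the ratio of the first two terms. It is a per-pixel curve fit on noiseless made-up data, not the deconvolution algorithm itself; the reference frequency, flux and spectral index are assumed values.

```python
import numpy as np
from numpy.polynomial import polynomial as P

nu0 = 1.5e9                                 # reference frequency (assumed)
nu = np.linspace(1.0e9, 2.1e9, 32)          # channel frequencies
s0, alpha_true = 1.0, -0.7                  # made-up flux and spectral index
spectrum = s0 * (nu / nu0) ** alpha_true    # S(nu) = S(nu0) (nu/nu0)^alpha

# Fit a cubic in x = (nu - nu0)/nu0; coefficients come back low order first.
x = (nu - nu0) / nu0
coeffs = P.polyfit(x, spectrum, 3)

# For a pure power law the Taylor terms are I0 = S(nu0) and I1 = alpha * S(nu0).
alpha_est = coeffs[1] / coeffs[0]
```

A single-term model would be equivalent to keeping only `coeffs[0]`, which ignores the spectral slope across the band; that is the "no wide-band modeling" case on the slide.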

PB effects: Characterization

● A-Projection for in-beam time and pol effects

● WB A-Projection for frequency dependence

$$I^{Continuum}(\vec{l}, Pol) = \iint I(\vec{l}, \nu)\, PB(\vec{l}, \nu, t, Pol)\, d\nu\, dt$$

[Figure panels: time-dependent DD effects; pol-dependent DD effects]

Time- and polarization-dependence

● Effects of time and polarization dependence of the PB

[Figure: Stokes-I and Stokes-V images. Annotations: errors due to PB squint + rotation + pointing errors; purely instrumental Stokes-V artifacts; due to avg. PB]

PB Effects: Frequency dependence

● WB A-Projection for in-beam frequency dependence

$$I^{Continuum}(\vec{l}, Pol) = \iint I(\vec{l}, \nu)\, PB(\vec{l}, \nu, t, Pol)\, d\nu\, dt$$

$$I^{Spectral}(\nu, \vec{l}, Pol) = \int I(\nu, \vec{l})\, PB(\vec{l}, \nu, t, Pol)\, dt$$

[Figure: PB frequency dependence (blue curve)]
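
The two integrals differ only in which axes are integrated over. A discretised sketch for a single direction and polarization, using normalised sums (means) in place of the integrals, is shown below. The toy PB model, which falls linearly with frequency and wobbles slightly with time, and the sky spectrum are invented purely for illustration.

```python
import numpy as np

n_time, n_chan = 8, 16
nu = np.linspace(1.0e9, 2.1e9, n_chan)
t = np.linspace(0.0, 1.0, n_time)

sky = (nu / 1.5e9) ** -0.7                 # I(l, nu) at one direction l (made up)

# Toy PB(l, nu, t) at that direction: falls with frequency, wobbles with time.
freq_part = 0.9 - 0.3 * (nu - nu[0]) / (nu[-1] - nu[0])
time_part = 1.0 + 0.02 * np.sin(2.0 * np.pi * t)
pb = time_part[:, None] * freq_part[None, :]        # shape (n_time, n_chan)

weighted = sky[None, :] * pb

# I^Spectral(nu, l): integrate over time only, one value per channel.
i_spectral = weighted.mean(axis=0)

# I^Continuum(l): integrate over both time and frequency.
i_continuum = weighted.mean()
```

The continuum value mixes the sky spectrum with the frequency dependence of the PB, which is why the wide-band case needs a frequency-dependent correction while the spectral case only has to deal with the time dependence per channel.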

Wide-band wide-field imaging: Characterization

● Effect of instrumental frequency dependence

[Figure annotations: pulsar, Sp. Ndx -3.0; artificially steep spectral index]
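
The artificial steepening can be estimated with a simple model. Assuming a Gaussian PB whose FWHM scales inversely with frequency (an assumption, since the slides do not give a PB model), the extra spectral index imposed on a source at a fixed offset is d ln(PB)/d ln(nu) = -8 ln(2) (offset/FWHM)^2. The sketch below checks that expression numerically at the half-power point; the function names are hypothetical.

```python
import numpy as np

def pb_gain(offset_in_fwhm0, nu, nu0):
    """Gaussian PB power gain; FWHM scales as nu0/nu (assumed model)."""
    fwhm = nu0 / nu                      # FWHM in units of its value at nu0
    return np.exp(-4.0 * np.log(2.0) * (offset_in_fwhm0 / fwhm) ** 2)

def pb_spectral_index(offset_in_fwhm0):
    """d ln(PB) / d ln(nu) at nu0 for the Gaussian model above."""
    return -8.0 * np.log(2.0) * offset_in_fwhm0 ** 2

# Numerical check at the half-power point (offset = FWHM / 2).
nu0 = 1.5e9
lo, hi = nu0 * 0.999, nu0 * 1.001
numeric = (np.log(pb_gain(0.5, hi, nu0)) - np.log(pb_gain(0.5, lo, nu0))) / np.log(hi / lo)
analytic = pb_spectral_index(0.5)        # equals -2 ln 2, about -1.39
```

Under this model a source at the half-power point appears steeper by about 1.4 in spectral index if the frequency dependence of the PB is left uncorrected.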

Wide-band wide-field imaging: Performance evaluation

● Combined MT-MFS and WB A-Projection algorithm

[Figure: comparison panels labelled MFS + Standard Imaging; MT-MFS + NB A-Projection; MT-MFS + WB A-Projection; MT-MFS + Standard Imaging; WB A-Projection. Ap.J., 2013]

Wide-band wide-field imaging: Performance evaluation

● Characterize performance, limits [Rau & Bhatnagar, in prep.]

● Heterogeneous PB correction [Kundert & Rau, Masters thesis]

– Time-, shape-dependence, in-beam effects important at DR > 10^4

– Size-dependent functions sufficient for ALMA for now (usable already)

– Size-dependent full-pol support for ALMA may be required next

[Figure panel labels: MT-MFS + WB A-Projection (2 uJy rms); Cube + NB A-Projection (3 uJy rms); MT-MFS; Cube. Brightest source: 100 mJy. Further annotations: 4 uJy rms, peak res: 20 uJy; 6 uJy rms*, peak res: 15 uJy]
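
The figure annotations translate directly into dynamic range if DR is taken as peak brightness divided by image rms, which is the usual definition but is an assumption here because the slides do not spell it out. A minimal sketch using the 100 mJy peak and the smallest and largest rms values quoted:

```python
def dynamic_range(peak_jy, rms_jy):
    """Peak brightness divided by the image rms noise (assumed definition)."""
    return peak_jy / rms_jy

peak = 100e-3                           # brightest source: 100 mJy
dr_2ujy = dynamic_range(peak, 2e-6)     # rms of 2 uJy
dr_6ujy = dynamic_range(peak, 6e-6)     # rms of 6 uJy
```

These come out at 5 x 10^4 and roughly 1.7 x 10^4, which places the test in the DR regime where the slides say PB effects start to matter.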

Wide-band Mosaic Imaging

● Characterize performance, limits [Rau & Bhatnagar, in prep.]

[Figure panels: Intensity: Reconstructed / True; Alpha: Reconstructed - True]

Wide-band Mosaic Imaging

● Characterize performance, limits [Rau & Bhatnagar, in prep.]

[Figure: RMS: 0.3 uJy. Panels: Intensity: Reconstructed / True; Alpha: Reconstructed - True]

Full polarization imaging: Work in progress

● Extend WB A-Projection to full-polarization (full-Mueller?) [PhD Thesis project of P. Jagannathan]

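$$\begin{bmatrix} I^{o}_{I} \\ I^{o}_{Q} \\ I^{o}_{U} \\ I^{o}_{V} \end{bmatrix} = \Big[\ \text{4 x 4 matrix, shown as an image in the original}\ \Big] \begin{bmatrix} I^{obs}_{I} \\ I^{obs}_{Q} \\ I^{obs}_{U} \\ I^{obs}_{V} \end{bmatrix}$$

The direction-dependent Mueller matrix (in Stokes basis)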
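
As a toy illustration of what a direction-dependent Mueller matrix does, the sketch below applies a made-up 4 x 4 matrix to an unpolarized source and then undoes it by solving the linear system. The matrix values are invented, and the convention used here (observed = Mueller x true) is an assumption, since the matrix entries on the slide were an image and are not recoverable.

```python
import numpy as np

# Hypothetical direction-dependent Mueller matrix at one direction in the beam.
# Diagonal terms are the per-Stokes gain; off-diagonal terms are leakage.
mueller = np.array([
    [0.90, 0.01, 0.00, 0.02],
    [0.01, 0.90, 0.00, 0.00],
    [0.00, 0.00, 0.90, 0.01],
    [0.02, 0.00, 0.01, 0.90],
])

stokes_true = np.array([1.0, 0.0, 0.0, 0.0])      # unpolarized 1 Jy source
stokes_obs = mueller @ stokes_true                # apparent I, Q, U, V

# Correcting the observed vector means solving the 4x4 system per direction.
stokes_corrected = np.linalg.solve(mueller, stokes_obs)
```

Even though the source is unpolarized, the observed vector has non-zero Q and V from the off-diagonal terms, which is the in-beam leakage that a full-Mueller correction would need to remove.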

Parameter Space for HPC

● In terms of algorithm design

    ● Move towards higher compute-to-I/O ratio

● Minimize memory footprint

– Remain inside the green box

[Figure: parameter-space diagram with axes Computing, I/O and Memory. Annotations: compute-to-I/O ratio; more memory per FLOP; less memory per FLOP]

Related HPC

● Large data volumes (few 100 GB to few TB), higher computing load, higher memory footprint

● Distributed major-cycle on compute cluster + Lustre FS [EVLA Memo #132, 133, 2009]

– Favorable compute-to-i/o ratio

– Good scaling: 60 – 70% efficiency

● Memory footprint an issue beyond a certain scale. Solutions:

– Multi-threaded gridding [Golap]

    ● A single instance of gridder utilizing all available cores

– Optimal W-Projection planes [Golap]

    ● Determine number of w-planes from the data rather than FoV

● Scientific testing in progress

● Frequency resolution for wide-band PB correction

● Rotation with PA: interpolation vs. caching

● Oversampling
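
The 60 – 70% scaling efficiency quoted above can be turned into an expected speed-up with one line of arithmetic. The sketch below assumes the usual definition of parallel efficiency (speed-up divided by number of processes); the process count of 40 is borrowed from the mosaic test quoted elsewhere in the talk purely as an example, and the function names are hypothetical.

```python
def parallel_efficiency(t_serial, t_parallel, n_proc):
    """Speed-up divided by the number of processes."""
    return (t_serial / t_parallel) / n_proc

def expected_speedup(efficiency, n_proc):
    """Speed-up implied by a given efficiency."""
    return efficiency * n_proc

low = expected_speedup(0.60, 40)    # 60% efficiency on 40 processes
high = expected_speedup(0.70, 40)   # 70% efficiency on 40 processes
```

Under these assumptions a 40-process run would finish roughly 24 to 28 times faster than a single process.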

Testing: MT-MFS + WB AWP + Mosaic + HPC Result

● 80-pointing EVLA WB mosaic @L-Band

● MTMFS + WB A-P using ~40 processes

[Rau & Bhatnagar, (work in progress)]

[Figure: Stokes-I]

Testing: MT-MFS + WB AWP + Mosaic + HPC Result

● 80-pointing EVLA WB mosaic @L-Band

● MTMFS + WB A-P using ~40 processes

● Unresolved issues

    ● Numerical noise with wide-band and “large” number of pointings

● TODO

    ● Evaluate solutions for WB OTFM

[Rau & Bhatnagar, (work in progress)]

[Figure: intensity-weighted Sp. Ndx.]

Imaging pipeline

● The imager can be configured into a large number of states with vastly varying computing cost and imaging performance

● Staged development: Heuristics to determine imager parameters

    ● Minimize computing costs and maximize imaging performance

● Auto-flagging: Existing algorithms (tfcrop, rflag) are a good start, but need heuristics to use in a pipeline
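
A first-stage heuristic of the kind described above could start from the rough thresholds quoted in this talk: MT-MFS for fractional bandwidth above ~20% and DR above ~10^3, and A-Projection when imaging beyond the half-power point or mosaicking. The sketch below is only an illustration of that decision logic. The function name, argument names and returned labels are hypothetical and are not CASA parameter values, and the cut-offs are order-of-magnitude guides rather than tuned numbers.

```python
def choose_imaging_mode(frac_bw, target_dr, beyond_half_power=False, mosaic=False):
    """Pick algorithm names from the rough thresholds quoted in the talk.

    All cut-offs are order-of-magnitude guides, not tuned values.
    """
    wide_band = frac_bw > 0.2 and target_dr > 1e3
    deconvolver = "MT-MFS" if wide_band else "standard"

    needs_pb = beyond_half_power or mosaic
    if needs_pb and wide_band:
        gridder = "WB A-Projection"
    elif needs_pb:
        gridder = "NB A-Projection"
    else:
        gridder = "standard"
    return {"deconvolver": deconvolver, "gridder": gridder}

mode = choose_imaging_mode(frac_bw=0.7, target_dr=1e4, mosaic=True)
```

A real pipeline heuristic would also need the cost side of the cost-performance equation, so that the cheaper configuration is chosen whenever it meets the science requirement.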

Other, longer-term activities

● Asp-Clean type deconvolution algorithm [L. Zhang’s PhD thesis]

    ● Positively impacts memory footprint and Spectral Index imaging performance

    ● CompSens ideas built-in

– Evaluate other similar ideas

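[Figure: residuals for four deconvolution algorithms, shown in the image domain (I^d - B I^M) and in the visibility domain (V^True - V^Model)]

Algorithm   Clean   MEM   MS-Clean   Asp-Clean
Niter       ~60K    50    ~15K       ~1K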

Other, longer-term activities

● GPU computing

    ● Collaboration with NVIDIA Dev. Tech. Division + Univ. group

    ● Efficient for OTF convolution function computation and multi-scale computations

    ● Not yet clear if useful for gridding (the dominant cost)

● PB measurements/modeling
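
Gridding is described above as the dominant cost. The sketch below shows a bare-bones convolutional gridder in NumPy to make the access pattern visible: every visibility is scattered onto a small patch of the grid, so the work scales with the number of visibilities times the kernel area. This is a toy under stated assumptions (integer pixel coordinates, a small real separable kernel, random sample positions) and is not the CASA gridder.

```python
import numpy as np

def grid_visibilities(u_pix, v_pix, vis, grid_size, kernel):
    """Add each visibility to a complex grid, spread by a small square kernel.

    u_pix, v_pix: integer grid coordinates of each sample (already scaled).
    kernel: (2s+1, 2s+1) real array of convolution weights.
    """
    grid = np.zeros((grid_size, grid_size), dtype=complex)
    support = kernel.shape[0] // 2
    for du in range(-support, support + 1):
        for dv in range(-support, support + 1):
            weight = kernel[dv + support, du + support]
            # np.add.at accumulates correctly when samples share a cell.
            np.add.at(grid, (v_pix + dv, u_pix + du), weight * vis)
    return grid

rng = np.random.default_rng(0)
n_vis, grid_size = 1000, 64
u = rng.integers(8, grid_size - 8, n_vis)
v = rng.integers(8, grid_size - 8, n_vis)
vis = np.ones(n_vis, dtype=complex)

kernel = np.outer(np.hanning(5), np.hanning(5))
kernel /= kernel.sum()                      # unit-sum kernel conserves flux
grid = grid_visibilities(u, v, vis, grid_size, kernel)
```

The scattered, read-modify-write updates to shared grid cells are what make this step awkward to parallelise, which is consistent with the open question on the slide about whether GPUs help with gridding.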

[Figure labels: Total; FFT; Multi-scale image computation]

Summary

● Pace of work is resource limited

● Resources with the right skill set are crucial

● Following the 2009 memo plan and EVLA/ALMA requirements

● Wide-band continuum imaging requires MT-MFS for DR > a few x 10^3

● PB effects important @ DR > a few x 10^3 – 10^4; @ a few x 10^3 for full-pol.

● MT-MFS + WB A-P required for mosaic and spectral index mapping

● Work in progress

● Test combined WB imaging algorithm (including mosaicking) in production code

● Test deployment on HPC platforms (necessary for practical usability)

● Characterizing effects of in-beam polarization for full-pol imaging

● Towards developing wide-field full-band full-pol imaging capability

● Research deconvolution algorithms with smaller memory footprint