26
V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow Survey’ V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline Processing Team April 1 st , 2009 CALIM 09, Socorro

V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

Pre-Processing and Calibration for ‘Million Source Shallow Survey’

V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline Processing Team

April 1st, 2009 CALIM 09, Socorro

Page 2: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

2

3

MSSS (MS3) and LOFAR processing chain 1

Outline

4

- Ongoing development/Open Issues

Preprocessing (DP3) - Role + Capabilities

Calibration (BBS) – Role + Capabilities

5 - Ongoing development/Open issues

DP3 - Default Pre-Processing Pipeline BBS - Black Board Selfcal System MS3 - Million Source Shallow Survey (60MHz and 150MHz)

Page 3: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

MSSS - Specifications

Sky Coverage (Ω) 2π str 2π str

FOV 120 deg sqr 20 deg sqr

Observation time 45 min x 4 15 min x 4

Max Baseline 10 km 10 km

PSF 82.5’’ 33’’

Integration time 1 s 1 s

Freq resln, BW 0.76 KHz, 8MHz 0.76 KHz, 8MHz

Total data size 407 Tbyte 2.3 Pbyte

Point source sensitivity(σ) I 5 mJy 0.5 mJy

N(S>30σ ) 2.7x105 1.05x105

LBA (60MHz) HBA (150MHz) *To be carried out with LOFAR 20(13+7)

Page 4: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

Processing Chain - LOFAR

OFFLINE CLUSTER

Page 5: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

DP3, BBS in Automatic Offline Processing

CEP (BLUE GENE/P)

Data stream from Stations

SB0 SB1 SB2 subbands spread along frequency axis SubBand N

Correlated Data Copied to Offline Storage/Cluster

DP3 (Default Pre-Processing Pipeline)!!

SB0 SB1 SB2 SB N

BBS (Black Board System) DISTRIBUTED + Multi-Threaded Processing

GDS

SB0 SB1 SB2 SB N

MWI (Master Worker Imager – CImager) DISTRIBUTED PROCESSING

Data -> Preprocessed, Flagged and Compressed

Data ->Calibrated

Images, Sources …..

Y Y Y Y

Page 6: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

DP3 – Aims/Objectives

Carry out all Pre-Calibration common default steps in an efficient way.

1.  Flagging – Pre-Flags and Algorithm based 2.  Phase Correction due to diff clocks at stns 3.  Sub-Band Pass correction 4.  Global Band Pass correction 5.  Compression along Frequency Axis 6.  Compression along Time Axis

7.  Combining Different Sub Bands into a single Mset.

I N T E G R A T E D

D I S T R I B U T E D

Software

Parameter Set File, Cluster Description file, …

•  Runs in a Distributed (Purely Data) way on different compute nodes Each Measurement Set processed on one processor •  Reads the Measurement set only once for all steps •  Single Parset file for all the subbands

Source Code (C++)

Page 7: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

DP3 – Capabilities 1. Flagging

•  Generic code to accommodate - different flagging algorithms - multiple cascade flaggers •  Currently used - MAD flagger (Hampel filter) (Two dimensional – freq & time) - Mirroring on the ends- avoids edge effects -  Can run on data column desired * - Multiple flaggers implemented

ALGORITHM BASED APRIOR Knowledge Based

REGION CUM CRITERIA APPROACH (under progress!!) REGION: - Time, Freq, baseline, Correlation, Target, Direction CRITERIA: - A condition which Pixel in the region should satisfy

•  Satisfactorily results on actual data •  May be just good enough for MSSS (Surely not for Transients) •  Later Hierarchical system of flaggers

Advantage: Provides region specific suitable flagging Ex. Threshold a function of correlation, baseline etc.

R1- C1, C2, …. R2 – C1, C6… R1

R2

R3 R

Emphasis on using all instrument specific information regarding RFI environment

Page 8: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

Pre-Flagging -> Region cum criteria

•  All operations expressed for entire observation expressed as region cum criteria statements •  Regions can overlap and have multiple criteria •  example R1 C1&C2, C1|| C2 Use as much information as possible: Station beam calibration meta data, Satellite bands, Transmitter, ionosphere information, Bootstrap based on user inputs

Pre-Flagging refers to flagging bad data based on aprior information/estimates before RFI detection Algorithms take charge - Estimate must not use neighborhood visibilities Region – Time range, Freq range, Correlation, Baseline, Target source, Direction (can be many) Time range – Sidereal time, local time, UTC, Integration number, time since start Freq Range – MHz, Subband number, Channel number Baseline – (ant1,ant2), (baseline length, Direction) Direction – any direction in sky, Zenith baseline length for example can be an expression (u>0λ & u<30λ, direction1)

Criteria - a condition visibility should satisfy but should not involve neighborhood visibilities example ALL, amp>0.8, amp> 0.8 f(ν) || amp<0.1f(ν)

Page 9: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

DP3 – example: Flagger

Tim

e (4

hour

s)

Tim

e(4

hour

s)

Frequency -> Frequency ->

Before Flag

After Flag

MS9315 (40MHzOct 24,2008)

CS1_us0 and CS1_us1 (XX corrleation)

Integ 30s

No absolute Threshold Flagging

Page 10: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

DP3 example

One channel A single Baseline

Typical RFI Cases which can be handled with ease

Page 11: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

DP3 - Flagging Example

Time index

Am

plitu

de (X

X)

Page 12: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

DP3 – Capabilities

•  Correct for phase differences introduced due to different clocks at stations •  Predetermined Table of corrections to be applied •  At this stage exact algorithm is under investigation

•  Bandpass shape within each Sub-band due to Polyphase filter bank •  Predetermined table as a function of frequency •  Already implemented at CEP (Blue Gene)

•  In case we need improvements, correction may be implemented.

•  Global Bandpass correction due to Antennae response to •  Pre-determined table as a function of frequency •  Presently Estimation using BBS global solver under investigation

Implementation to have multiple tables of corrections in one step

2. Correcting for clock phases, and cable delays

3. Correcting for sub-band shape

4. Correcting for Global Bandpass

Page 13: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

DP3: Data Compression 5. Compression along frequency axis

6. Compression along time axis

•  Implemented together, any one may be selected or switched off •  Compressed pixel = mean ( unflagged values ) •  Weight column appropriately modified depending on number of pixels flagged •  Time and Frequency of compressed (Averaged pixel) ** - MASK ? (Long baseline work?) •  Multiple stage compression allowed

Performance of DP3

•  >97% CPU usage most of the time (Data sets of ~ 5Gbyte)

0 0 1 1 1 1 1 1 1 1 1 0 0 1 1

Time

Freq

Page 14: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

Flag Category – Multiple Bits •  Instead of single flag for each visibility, Use multiple bits to store flags by flag category •  Each bit tagged by User/KSP, Parset/algorithm log, dependencies of other flag bits, time.. •  log warning when conflicting conditions •  An intermediate file produced which expresses the parset in terms of condensed statements.

-  For being able to select desired combination of flag categories based on their performance -  Comparative study of different flagging algorithms -  Flagging during calibration ex. based on gain solutions can be stored -  Flagging based on residual data can be stored -  Each KSP may have different strategy for flags, so all can be accommodated -  Minimal increased disk space (few bits compared to 8 bytes per correlation visibility) -  One can combine algorithms of more than one user -  For EoR it may be critical to have this information

Data can reside at one place and users can have only model and corrected data..?

A few advantages

A1 A2 A3 A4 0 1 0 0 1 1 0 1 0 1 0 2 0 1 1 1

Page 15: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

Inferences, issues

•  Complete DP3 Pipeline has been tested, works successfully.

•  Computational Speed improved – steps integrated in one.

•  Integrated in the offline Pipeline.

•  Code Generic – support Multiple flaggers and Multiple levels of compression.

•  Flags stored bitwise based on flag category

•  Time and frequency centroid of averaged visibilities in compressed MS under discussion.

•  Performance?

•  Regularly used from ?

•  RFI for Transients..

•  Does not support flagging using data across different subbands/days together.

Page 16: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

LOFAR Calibration - BBS

●  Pool of independent processes operate on shared memory ●  A central control process examines the black board and decides what is

to be done next depending on the current state. ●  Software to do matrix self-calibration

●  Emphasis on performance, batch-mode operation, and distributed processing of large data volumes. Well integrated with SAS/MAC

Commonly used reduction packages for aperture synthesis data — AIPS : VLA, WSRT, GMRT, ATCA, VLBI,… — Miriad : VLA, ATCA, WSRT,… — NEWSTAR : WSRT — AIPS++ : WSRT, VLA, … and Now CASA For LOFAR, with all it novel /complicated aspects, we need to do much better. Two packages have been, and continue to be, developed: — MeqTrees is being used to develop/simulate our understanding — BBS will be implementing efficiently/optimally what we have learned

BBS (Based on Black Board Design Pattern for Distributed computing)

Page 17: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

BBS

Sky model parameters

Processing strategy

Instrument model parameters

Calibrated visibility data

Observed visibility data

Environment

Page 18: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

BBS Capabilities ●  Processing strategy

–  List of processing steps

●  Processing steps have the following attributes: –  Data selection

–  Operation to perform

–  Sky model

–  Instrument model

●  Supported operations –  Simulation

–  Subtraction, Addition

–  Correction

–  Parameter fitting

–  Easy to add other operations

Page 19: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

BBS - Capabilities ●  Supported parameter types

–  Constants

–  Polynomials of frequency and/or time

●  Supported source models –  Point source

–  Elliptical Gaussian

●  Supported model components –  Bandpass

–  (Directional) Gain

–  Basic SPAM

–  Dipole beam ●  Analytical model (S. Yatawatta) ●  Semi-analytical model (J.P. Hamaker) Dipole beam

Page 20: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

Processing strategy ●  Data will be read only once to minimize disk I/O ●  Processing strategy

–  List of processing steps to perform –  Hierarchical specification

Peel0 Peel1

Solve Subtract

Root

Solve Subtract

Root.Steps = [Peel0, Peel1] Peel0.Model.Sources = [CasA] Peel0.Steps = [Solve, Subtract] Peel1.Model.Sources = [CygA] Peel1.Steps = [Solve, Subtract]

Page 21: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

Data distribution ●  The visibility data of an observation is stored in a

distributed fashion –  Individual subbands are distributed over the cluster

●  This is only an issue when fitting parameters –  Parameter fitting combines visibility data from a certain

domain in frequency and time –  All other operations are embarrassingly parallel

●  BBS supports parameter fitting across subbands using separate solver processes

Page 22: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

Calibration group 0

Calibration process 0

Sol

ver

Calibration process 1

Sol

ver

Calibration process 2

Sol

ver

Calibration group 1

Calibration process 3

Sol

ver

Calibration process 4 S

olve

r

Calibration group 2

Calibration process 5

Sol

ver

Calibration process 6

Sol

ver

Solver process 1

Merge equations Solve

Solver process 0

Merge equations Solve

Solver process 2

Merge equations Solve

Equations

Model parameters

Equations

Model parameters

Equations

Model parameters

SB0

SB1

SB2

SB3

SB4

SB5

SB6

Parameter fitting

Use of Storage node CPUs?

Page 23: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

Inspecting the output

●  Parameter values are stored in a (distributed) parameter database

●  Python interface (Parm Facade) is available

-Enables easy plotting, analysis of solutions-speed up commissioning

Time Time

A M P L I T U D E

P h a s e

Page 24: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

LBA image using BBS Average { 24hrs, 36* Sub Bands: 0-35} 38-62 MHz

>500 sources

Band width(5 MHz)

Res ~ 0.50

Rms ~ 1 Jy

Initial deep all sky wide field (full hem

isphere centered on North C

elestial Pole)

18h

12h

Sun

Tycho

Taurus

00h

06h

CygA (residual) 1% level

Page 25: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

Open Issues ●  High quality images with test stations routinely obtained ●  Source positions agree - <1% of HPBW(0.50) on different days (avg) - <3% of HPBW with NVSS Technical - Control framework general enough - Scaling, Multi-Threading, Performance - Scheduling/Failing nodes?

Algorithms ●  Simultaneous solution v. peeling ●  Solver robustness ●  Alternative solver algorithms

–  Models ●  Beam ●  Ionosphere, Mosaicing

Page 26: V.N. Pandey(Kapteyn Institute/ASTRON) for LOFAR Offline ... · V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro Pre-Processing and Calibration for ‘Million Source Shallow

V.N.Pandey et. al. DPPP and BBS for MSSS CALIM 2009, Socorro

Thank you