FABRIC
A pilot study of distributed correlation
Huib Jan van Langevelde
Ruud Oerlemans
Sergei Pogrebenko
and many other JIVErs…
NGC Groningen, 29 June 2006 (huib, 23/6/06)
Aim of the project
• Research the possibility of distributed correlation
  • Using the Grid for getting the CPU cycles
  • Can it be employed for the next generation VLBI correlation?
• Exercise the advantages of software correlation
  • Using floating point accuracy and special filtering
• Explore (push) the boundaries of the Grid paradigm
  • “Real time” applications, data transfer limitations
• To lead to a modest size demo
  • With some possible real applications:
    • Monitoring EVN network performance
    • Continuously available eVLBI network with few telescopes
      • Monitoring transient sources
      • Astrometry, possibly of spectral line sources
    • Special correlator modes: spacecraft navigation, pulsar gating
    • Test bed for broadband eVLBI research
Something to try on the roadmap for the next generation correlator,
even if you do not believe it is the solution…
SCARIe FABRIC
• EC funded project EXPReS (03/2006)
  • To turn eVLBI into an operational system
  • Plus: Joint Research Activity: FABRIC
    • Future Arrays of Broadband Radio-telescopes on Internet Computing
    • One work-package on 4Gb/s data acquisition and transport (Jodrell Bank, Metsahovi, Onsala, Bonn, ASTRON)
    • One work-package on distributed correlation (JIVE, PNSC Poznan)
• Dutch NWO funded project SCARIe (10/2006)
  • Software Correlator Architecture Research and Implementation for eVLBI
  • Use Dutch Grid with configurable high connectivity
  • Software correlation with data originating from JIVE
• Complementary projects with matching funding
  • International and national expertise from other partners
    • Poznan Supercomputer centre
    • SARA and University of Amsterdam
  • Total of 9 man year at JIVE, plus some matching from staff
    • plus similar amount at partners
Previous experience on Software correlation
• Builds on previous experience at JIVE
• regular and automated network performance tests
• Using Japanese software correlator from NICT
•Huygens extreme narrow band correlation
• Home grown superFX with sub-Hz resolution
Basic idea
• Use the Grid for correlation
• CPU cycles on compute nodes
• The Net could be crossbar switch?
• Correlation will be asynchronous
• Based on floating point arithmetic
• Portable code, standard environment

typical VLBI problems

description               N telescopes  N subbands  data-rate [Mb/s]  N spect/prod  Tflops
1 Gb/s full array         16            16          1024              16            83.89
typical eVLBI continuum   8             8           128               16            2.62
typical spectral line     10            2           16                512           16.38
FABRIC demo               4             2           16                32            0.16
future VLBI               32            32          4096              256           21474.84

Rough estimate based on XF correlation
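
The Tflops figures in the table are reproduced by a simple scaling, 2 × 10⁻⁵ × N_telescopes² × data rate [Mb/s] × N spect/prod. That relation is inferred here from the tabulated numbers; the slide does not state it, so the sketch below (with the hypothetical helper `estimate_tflops`) is best read as a consistency check on the table rather than as the authors' cost model.

```python
# Consistency check on the "typical VLBI problems" table.
# ASSUMPTION: the scaling 2e-5 * N_tel^2 * rate * N_spec is inferred from the
# tabulated values; it is not given explicitly in the slides.
# Note that the number of subbands does not enter this inferred relation.


def estimate_tflops(n_telescopes, data_rate_mbps, n_spectral_points):
    """Rough cost estimate in Tflops using the inferred scaling."""
    return 2e-5 * n_telescopes ** 2 * data_rate_mbps * n_spectral_points


# (description, N telescopes, N subbands, data rate [Mb/s], N spect/prod, Tflops on slide)
TABLE = [
    ("1 Gb/s full array", 16, 16, 1024, 16, 83.89),
    ("typical eVLBI continuum", 8, 8, 128, 16, 2.62),
    ("typical spectral line", 10, 2, 16, 512, 16.38),
    ("FABRIC demo", 4, 2, 16, 32, 0.16),
    ("future VLBI", 32, 32, 4096, 256, 21474.84),
]


def check_table(rows=TABLE):
    """Return (description, slide value, recomputed value) for every row."""
    result = []
    for description, n_tel, _n_sub, rate, n_spec, slide_value in rows:
        recomputed = round(estimate_tflops(n_tel, rate, n_spec), 2)
        result.append((description, slide_value, recomputed))
    return result


if __name__ == "__main__":
    for description, slide_value, recomputed in check_table():
        print(f"{description:26s} slide={slide_value:10.2f} recomputed={recomputed:10.2f}")
```

The quadratic dependence on the number of telescopes is what makes the "future VLBI" row so much more demanding than the FABRIC demo.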
Work packages
• Grid resource allocation
  • Grid workflow management
    • Tool to allocate correlator resources and schedule correlation
    • Data flow from telescopes to appropriate correlator resources
  • Expertise from the Poznan group in Virtual Laboratories
    • Will this application fit on Grid?
    • As it is very data intensive
    • And time-critical if not real-time
• Software correlation
  • correlator algorithm design
    • High precision correlation on standard computing
    • Scalable to cluster computers
    • Portable for grid computers and interfaced to standard middleware
    • Interactive visualization and output definition
  • Collect & merge data in EVN archive
    • Standard format and proprietary rights
Workflow Management
• Must interact with normal VLBI schedules
• Divide data, route to compute nodes, setup correlation
• Dynamic resource allocation, keep up with incoming data!

Effort from Poznan, based on their Virtual Lab.
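
The "keep up with incoming data" requirement can be pictured with a greedy scheduler that hands each newly completed data chunk to whichever compute node becomes free first. The sketch below is purely illustrative: the node names, relative speeds and the `schedule_chunks` helper are hypothetical and are not part of the Poznan Virtual Lab software.

```python
import heapq


def schedule_chunks(n_chunks, chunk_seconds, node_speeds):
    """Greedily assign consecutive data chunks to compute nodes.

    node_speeds maps a (hypothetical) node name to its speed relative to
    real time: 1.0 means one second of data is processed per second.
    Returns (assignments, lag), where lag is the time between the end of
    the observation and the completion of the last chunk.
    """
    # Min-heap of (time at which the node becomes free, node name).
    free_at = [(0.0, name) for name in sorted(node_speeds)]
    heapq.heapify(free_at)

    assignments = []
    finish = 0.0
    for i in range(n_chunks):
        arrival = (i + 1) * chunk_seconds      # chunk i is complete at this time
        t_free, name = heapq.heappop(free_at)  # node that becomes free first
        start = max(arrival, t_free)
        end = start + chunk_seconds / node_speeds[name]
        assignments.append((i, name, start, end))
        finish = max(finish, end)
        heapq.heappush(free_at, (end, name))

    lag = finish - n_chunks * chunk_seconds
    return assignments, lag


if __name__ == "__main__":
    # Two nodes at half real-time speed keep up; one such node falls behind.
    print(schedule_chunks(4, 10.0, {"node-a": 0.5, "node-b": 0.5})[1])
    print(schedule_chunks(4, 10.0, {"node-a": 0.5})[1])
```

If the lag stays near the processing time of a single chunk, the allocation keeps up; a lag that grows with the number of chunks means more nodes are needed.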
Topology
• Slice in time
  • Every node gets an interval
  • A “new correlator” for every time slice
  • Employ cluster computers at nodes
  • Minimizes total data transport
  • Bottleneck at compute node
  • Probably good connectivity at Grid nodes anyway
  • Scales perfectly
    • Easily estimated how many nodes are needed
    • Works with heterogeneous nodes
  • But leaves sorting to compute nodes
    • Memory access may limit effectiveness
• Slice in baseline
  • Assign a (or a range of) products to a certain node
    • E.g. two data streams meet in some place
  • Transport bottleneck at sources (telescopes)
    • Maybe curable with multicast transport mechanism which forks at network nodes
    • Some advantage when local nodes at telescopes
  • Does not scale very simply
    • Simple schemes for ½N² nodes
    • Need to re-sort output
  • But reduces the compute problem
    • Using the network as the cross-bar switch
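
The two slicing schemes can be contrasted in a few lines. In this illustrative sketch, `slice_in_time` deals consecutive time intervals to nodes round-robin, while `slice_in_baseline` enumerates the N(N+1)/2 auto and cross products (roughly the ½N² quoted above) and spreads them over nodes. The function names, the telescope and node labels, and the round-robin policy are assumptions made for the example and are not taken from the slides.

```python
from itertools import combinations_with_replacement


def slice_in_time(n_intervals, nodes):
    """Map each time interval to a node, round-robin.

    Every node receives the data of all telescopes for its own intervals.
    """
    return {i: nodes[i % len(nodes)] for i in range(n_intervals)}


def slice_in_baseline(telescopes, nodes):
    """Map each product (auto- or cross-correlation) to a node, round-robin."""
    products = list(combinations_with_replacement(telescopes, 2))
    return {pair: nodes[i % len(nodes)] for i, pair in enumerate(products)}


def streams_per_telescope(assignment):
    """Count the distinct nodes each telescope must feed under baseline slicing."""
    targets = {}
    for (tel_a, tel_b), node in assignment.items():
        targets.setdefault(tel_a, set()).add(node)
        targets.setdefault(tel_b, set()).add(node)
    return {tel: len(node_set) for tel, node_set in targets.items()}


if __name__ == "__main__":
    telescopes = ["T1", "T2", "T3", "T4"]        # hypothetical labels
    nodes = [f"node{i}" for i in range(10)]      # hypothetical labels

    print(slice_in_time(5, nodes[:2]))
    by_baseline = slice_in_baseline(telescopes, nodes)
    print(len(by_baseline), "products")
    print(streams_per_telescope(by_baseline))
```

With one product per node, every telescope has to deliver its stream to as many nodes as it has products, which is the transport bottleneck at the sources that multicast is meant to relieve.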
Broadband software correlation
Data flow, per station (Station 1, Station 2, … Station N):
• Raw data BW=16 MHz, Mk4 format on Mk5 disk
• From Mk5 to linux disk
• Raw data 16 MHz, Mk4 format on linux disk
• Channel extraction
• Extracted data
• Delay corrections (using pre-calculated delay tables)
• Delay corrected data
• Correlation, SFXC
• Data Product

EVN Mk4 equivalents: DIM, TRM, CRM; DCM, DMM, FR; SU; Correlator Chip
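
The chain of delay correction followed by correlation can be illustrated with a toy FX-style correlator: remove a known delay, Fourier transform each block, then multiply one spectrum by the complex conjugate of the other and accumulate. This is a minimal sketch and not the SFXC code. It assumes integer-sample delays only, uses a naive DFT to stay dependency-free, and all function names are hypothetical.

```python
import cmath
import random


def dft(block):
    """Naive O(n^2) discrete Fourier transform of one block of samples."""
    n = len(block)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(block))
        for k in range(n)
    ]


def delay_correct(samples, delay):
    """Remove an integer-sample delay by dropping the first `delay` samples."""
    return samples[delay:]


def fx_correlate(a, b, n_fft):
    """Average the cross-power spectrum of streams a and b over n_fft-sample blocks."""
    n_blocks = min(len(a), len(b)) // n_fft
    if n_blocks == 0:
        raise ValueError("not enough samples for a single block")
    acc = [0j] * n_fft
    for i in range(n_blocks):
        spec_a = dft(a[i * n_fft:(i + 1) * n_fft])   # F step
        spec_b = dft(b[i * n_fft:(i + 1) * n_fft])
        for k in range(n_fft):                       # X step
            acc[k] += spec_a[k] * spec_b[k].conjugate()
    return [value / n_blocks for value in acc]


if __name__ == "__main__":
    random.seed(0)
    noise = [random.gauss(0.0, 1.0) for _ in range(64)]
    station1 = noise                  # reference station
    station2 = [0.0] * 3 + noise      # same signal arriving 3 samples later
    spectrum = fx_correlate(delay_correct(station1, 0),
                            delay_correct(station2, 3), 8)
    for k, value in enumerate(spectrum):
        print(k, round(value.real, 3), round(value.imag, 3))
```

Once the delay has been removed the two streams line up, so the cross spectrum is real and non-negative in every channel; a residual delay would show up as a phase slope across the channels.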
Better SNR than Mk4 hardware
Software correlation
• Working on benchmarking
  • Single core processors so far
  • Different CPU’s available
  • Already quite efficient
  • More work on memory performance
• Must deploy on cluster computers
• And then on Grid
• Organize the output to be used for astronomy
[Figure: SFX correlator, measuring CPU on single core, auto and cross correlations. CPU time (s) versus number of stations; curves: jop32, pcint, cedar.]
[Figure: SFX correlator, CPU contributions. CPU time (s) versus number of stations; curves: cedar, FFT only, I/O only, FFT Auto.]
Huygens, software correlation
• Experience with software correlation from Huygens
• Carrier signal from Titan lander
• Recorded on Mk5 disk system
  • Saved Doppler data experiment
• Requires extreme narrow band correlation
• And solar system model
• May reveal 3D trajectory at 1km accuracy
Goal of the project
• Develop: methods for high data rate e-VLBI using distributed correlation
  • High data rate eVLBI data acquisition and transport
    • Develop a scalable prototype for broadband data acquisition
      • Prototype acquisition system
    • Establish a transportation protocol for broadband e-VLBI
      • Build into prototype, establish interface with the normal system
    • Interface e-VLBI public networks with LOFAR and e-MERLIN dedicated networks
      • Correlate wide band Onsala data on eMERLIN
      • Demonstrate LOFAR connectivity
  • Distributed correlation
    • Setup data distribution over Grid
      • Workflow management tool
    • Develop a software correlator
      • Run a modest distributed eVLBI experiment
2 major components
Part 1: Scalable connectivity
• 1.1. Data Acquisition
  • 1.1.1. Data acquisition architecture (MRO)
    • Scalable data acquisition system, off-the-shelf components, new version of PC-EVN?
  • 1.1.2. Data acquisition prototype (MRO)
    • Prototype for 4Gb/s?
  • 1.1.3. Data acquisition control (MPI)
    • Control data acquisition, interface for protocol, distributed computing
• 1.2. Broadband Datapath
  • 1.2.1. Broadband protocols (JBO)
    • IP protocols, lambda switching, multicasting
  • 1.2.2. Broadband data processor interface (JBO)
    • Data from public network to eMERLIN correlator
  • 1.2.3. Integrate and test (OSO)
    • 10 Gb/s test environment for OSO-eMERLIN (and LOFAR?)
  • 1.2.4. Public to dedicated interface (ASTRON)
    • LOFAR transport over public network, LO & timing
Components (part 2)
Part 2: Distributed correlation
• 2.1. Grid resource allocation
  • 2.1.1. Grid VLBI collaboration (PNSC)
    • Establish relevant tools for eVLBI
  • 2.1.2. Grid workflow management (PNSC)
    • Tool to allocate correlator resources and schedule correlation
  • 2.1.3. Grid routing (PNSC)
    • Data flow from telescopes to appropriate correlator resources
• 2.2. Software correlation
  • 2.2.1. Correlator algorithm design
    • High precision correlation on standard computing
  • 2.2.2. Correlator computational core
  • 2.2.3. Scaled up version for clusters
  • 2.2.4. Distributed version, middleware
    • Deploy on Grid computing
  • 2.2.5. Interactive visualization
  • 2.2.6. Output definition
    • Output data from individual correlators
  • 2.2.7. Output merge
    • Collect data in EVN archive
On distributed computing
• What are the Grid resources?
  • Calibrate the required amount of computing
• Dynamical allocation possible?
• Interaction with observing schedule
• Topology of network
  • Slice data in frequency, time or differently?
• Interface for routing data
  • Multicast implementation on acquisition module
Distributed correlation
• Correlator model centrally generated?
  • Or calculate at every node
• Plan for merging data back together
• How to get uvw coordinates in data
• Monitor progress centrally
Current eVLBI practice
Diagram blocks:
• observing schedule in VEX format
• user correlator parameters
• earth orientation parameters
• correlator control including model calculation
• field system controls antenna and acquisition
• BBC & samplers
• Mk4 formatter
• Mk5 recorder
• Mk5 playback
• Mk4 data in Mk5 prop form over TCP/IP
• output data
FABRIC components
FABRIC = The GRID
Diagram blocks:
• observing schedule in VEX format
• user correlator parameters
• earth orientation parameters
• GRID resources data
• resource allocation and routing
• correlator control including model calculation
• field system controls antenna and acquisition
• DBBC
• VSI
• VSIe?? on??
• PC-EVN #2
• output data