MWA Data Capture and Archiving

Dave Pallot
MWA Conference
Melbourne, Australia
7th December 2011



Talking Points

• Data Capture and Archive System
  – Systems overview
  – Correlator data capture
  – RTS data capture
  – On-site data operations
  – The Next Generation Archiving System (NGAS)
  – Archive details


Data Capture and Archive System

Capture the data products from the MWA (MRO) and transport them to the petabyte storage facility at the Pawsey Centre (Perth) for later retrieval, processing and analysis.


Data Capture and Archive System

• Correlator:
  – 24x GPU-X
  – ~32 MB/s (0.5 sec, 40 kHz, 32-bit)
• On-site Storage:
  – ~48 TB of transportable storage
• Pawsey:
  – 15 PB reserved for MWA
  – 96 GPU nodes for data processing
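As a rough sanity check on these figures, the sketch below converts the ~32 MB/s correlator output into a daily volume and estimates how long the ~48 TB on-site array would last. It assumes that 32 MB/s is the aggregate rate, that the telescope records continuously, and that units are decimal (1 TB = 10^6 MB). The function names are illustrative only and do not come from the MWA software.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_tb(rate_mb_per_s: float) -> float:
    """Volume written per day, in TB, at a sustained rate in MB/s (decimal units)."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1e6


def days_until_full(capacity_tb: float, rate_mb_per_s: float) -> float:
    """Days of continuous recording that fit in the given capacity."""
    return capacity_tb / daily_volume_tb(rate_mb_per_s)


if __name__ == "__main__":
    # Assumed inputs taken from the slide: ~32 MB/s and ~48 TB.
    print(f"{daily_volume_tb(32):.2f} TB per day")
    print(f"{days_until_full(48, 32):.1f} days to fill 48 TB")
```

Under these assumptions the array holds a little over two weeks of continuous output, which gives a feel for how often disks would need to be moved to Perth.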


System Flow

1. Monitor & Control tells correlator to capture data.

2. Correlator dumps visibility data to configurable storage location.

3. Monitor & Control tells correlator to stop data capture.

4. Visibility files are produced, collected and transported to Pawsey for archiving (NGAS).

5. Observations, with their visibilities, are accessed and images are produced.


Correlator Data Capture

• Data capture modes:
  – Save all.
  – Save all on trigger.

Save All Mode

  – Dump all visibility data to a single data file per machine for the fixed duration of a single observation.
  – Size of each visibility file is dependent on the output block size and the duration of the observation.


Correlator Data Capture cont.

Save All on Trigger Mode

• Stream data to a circular disk buffer and only produce a visibility data file (flush the buffers) when triggered, i.e. when something interesting happens.
  – Telescope continuously on.
• Trigger is activated by an expert who is external to the capture process.
  – Architecture allows automatic detection and triggering via various pipelines.
• If there is no trigger then no visibility data is flushed to file.
• Once triggered, the observation has ended.
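The trigger behaviour described on this slide can be sketched as a bounded ring of data blocks that is written out only on demand. This is a minimal in-memory illustration, not the correlator's implementation: the real buffer lives on disk, and the class name `TriggeredCapture`, the block granularity and the `out` file-like argument are all assumptions made for the example.

```python
import io
from collections import deque


class TriggeredCapture:
    """Keep only the most recent blocks; write them out only when triggered."""

    def __init__(self, max_blocks: int) -> None:
        # A deque with maxlen drops the oldest block once the ring is full.
        self._ring = deque(maxlen=max_blocks)
        self.ended = False

    def push(self, block: bytes) -> None:
        """Add one block of visibility data to the ring."""
        if self.ended:
            raise RuntimeError("observation has already ended")
        self._ring.append(block)

    def trigger(self, out) -> int:
        """Flush buffered blocks, oldest first, to a binary file-like object.

        Returns the number of bytes flushed and ends the observation.
        """
        written = 0
        for block in self._ring:
            out.write(block)
            written += len(block)
        self._ring.clear()
        self.ended = True
        return written


if __name__ == "__main__":
    capture = TriggeredCapture(max_blocks=3)
    for i in range(5):  # blocks 0 and 1 are overwritten before the trigger
        capture.push(bytes([i]))
    sink = io.BytesIO()
    print(capture.trigger(sink), sink.getvalue())
```

If `trigger` is never called, nothing is written, which matches the "no trigger, no file" rule above.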


Correlator Data Capture cont.

– Circular buffer size of possibly hundreds of GB on disk.
  • Example: 100 GB / 32 MB/s ≅ 52 mins
– Circular buffer size can be configured.
  • Must be defragmented into a contiguous block to get maximum I/O performance.
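The 52-minute figure follows directly from dividing buffer size by data rate. The sketch below reproduces it and inverts the relation to size a buffer for a desired look-back time. It assumes decimal units (1 GB = 1000 MB), which is what makes the slide's number come out; the function names are illustrative.

```python
def buffer_minutes(buffer_gb: float, rate_mb_per_s: float) -> float:
    """Minutes of data a circular buffer holds at a sustained rate."""
    return buffer_gb * 1000 / rate_mb_per_s / 60


def buffer_gb_for(minutes: float, rate_mb_per_s: float) -> float:
    """Buffer size in GB needed to hold the given number of minutes."""
    return minutes * 60 * rate_mb_per_s / 1000


if __name__ == "__main__":
    print(f"{buffer_minutes(100, 32):.1f} min in a 100 GB buffer")
    print(f"{buffer_gb_for(120, 32):.1f} GB for a two-hour look-back")
```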


Correlator Data Capture cont.

• In both modes:
  – One visibility file per machine per observation is produced.
    • Total of 24 files per observation.
  – Same data format and filenames.
    • No special treatment of data files once they are produced.
    • Special treatment of data buffers, but that is hidden.
• Files will have unique identifiers in the file name to link them to the meta-data in our databases.
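One way to picture the link between file names and database records is a name built from an observation identifier plus the machine number, which can be parsed back into the same pair. The slides do not specify the naming scheme, so the pattern `<obs_id>_vis<NN>.dat`, the integer `obs_id` and all function names below are hypothetical.

```python
from typing import List, Tuple

N_MACHINES = 24  # one visibility file per correlator machine (from the slides)


def visibility_filename(obs_id: int, machine: int) -> str:
    """Build a hypothetical file name carrying the observation ID and machine."""
    if not 1 <= machine <= N_MACHINES:
        raise ValueError(f"machine must be in 1..{N_MACHINES}")
    return f"{obs_id}_vis{machine:02d}.dat"


def parse_visibility_filename(name: str) -> Tuple[int, int]:
    """Recover (obs_id, machine) from a name built by visibility_filename."""
    stem, _, _ext = name.rpartition(".")
    obs_id, _, machine = stem.partition("_vis")
    return int(obs_id), int(machine)


def files_for_observation(obs_id: int) -> List[str]:
    """All file names expected for one observation, in machine order."""
    return [visibility_filename(obs_id, m) for m in range(1, N_MACHINES + 1)]


if __name__ == "__main__":
    names = files_for_observation(1000)
    print(len(names), names[0], parse_visibility_filename(names[6]))
```

Because the identifier is recoverable from the name alone, a file found on a transport disk can be matched to its meta-data without opening it.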


RTS Data Capture

• Accumulate and generate images on the GPU?
  – Avoid accessing visibilities from disk storage
  – Performance reasons (concurrent disk access)
• Images dumped to a separate location from the visibilities.
  – Visibilities can be purged if the RTS images are bad.
  – Will not be transported as they can be reconstructed.
• Requires more discussion
  – Mitch (RTS Group), M&C group, Curtin.


On-site Data Operations

• Facility to process images from archiving node on-site
  – Tools to access visibilities from local storage.
  – Images/processing will be done outside of the MWA data pipeline.
• Ability to “flag” bad data
  – Can be purged before transportation.
  – Who makes that decision?


Data Transport

• Data transport from MRO to Perth?
  – Transportable disk array
    • 48 TB of storage
    • Interim measure
  – 10 Gb NBN
    • Fiber link from MWA to Pawsey
    • Termination location and timeframe are uncertain
• Transportation and archive coordination – NGAS
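To compare the two transport options, the sketch below estimates how long a full 48 TB array would take to send over a 10 Gb/s link, and what that link rate is in MB/s for comparison with the ~32 MB/s correlator output. It assumes the link runs at full line rate with no protocol overhead and uses decimal units, so real transfers would be slower; the function names are illustrative.

```python
def link_rate_mb_per_s(link_gbit_per_s: float) -> float:
    """Convert a link speed in Gb/s to MB/s (8 bits per byte, decimal units)."""
    return link_gbit_per_s * 1000 / 8


def transfer_hours(capacity_tb: float, link_gbit_per_s: float) -> float:
    """Hours needed to move capacity_tb over the link at full line rate."""
    bits = capacity_tb * 1e12 * 8
    seconds = bits / (link_gbit_per_s * 1e9)
    return seconds / 3600


if __name__ == "__main__":
    print(f"{link_rate_mb_per_s(10):.0f} MB/s link rate")
    print(f"{transfer_hours(48, 10):.1f} h to move 48 TB")
```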


Next Generation Archiving System (NGAS)

• Distributed storage software solution.
• Operates transparently across physically and logically separated locations
  – Reliable communications (HTTP interface)
  – Supports archive replication and mirroring.
  – Access to data on-site and through the archive.
• Scalable as it can co-ordinate multiple petabytes of storage.
• Lots of tools.
• Proven architecture for archiving large data sets.
  – National Radio Astronomy Observatory (NRAO)
  – Atacama Large Millimeter/submillimeter Array (ALMA)
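Because NGAS is driven over HTTP, a client request is essentially a URL naming a command plus query parameters. The sketch below only builds such URLs and performs no network access. The command names `RETRIEVE` and `STATUS`, the `file_id` parameter and port 7777 are assumptions based on general NGAS usage and should be checked against the documentation of the deployed version; the host name and file name are placeholders.

```python
from urllib.parse import urlencode


def ngas_url(host: str, port: int, command: str, **params: str) -> str:
    """Build an HTTP URL for a command on an (assumed) NGAS server."""
    base = f"http://{host}:{port}/{command}"
    query = urlencode(params)
    return f"{base}?{query}" if query else base


def retrieve_url(host: str, port: int, file_id: str) -> str:
    """URL that would request one archived file by its identifier (assumed API)."""
    return ngas_url(host, port, "RETRIEVE", file_id=file_id)


if __name__ == "__main__":
    # Placeholder host and file name, for illustration only.
    print(ngas_url("ngas.example.org", 7777, "STATUS"))
    print(retrieve_url("ngas.example.org", 7777, "obs1_vis01.dat"))
```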


Archive

• Standard features you would expect from an archive.
  – Performance/usage trends, retrieval, store, etc.
• Features specific to MWA
  – Sky maps, temperature plots, etc.
  – Will evolve over time
• Comprehensive meta-data search tool
  – RA/DEC, Source, Gains, Freq, Date/Time, temperatures, etc.
• Pawsey supercomputer node.
  – Generate images from a composite set of visibilities.
  – Fully configurable pipeline plug-in architecture to archive.
  – Reduce I/O, storage & processing constraints for single users.
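A meta-data search of the kind listed above amounts to filtering observation records on fields such as RA/DEC and frequency. The sketch below shows the idea with an in-memory SQLite table. The table layout, column names and the three demo rows are invented for illustration and do not reflect the actual MWA database; RA wrap-around at 0/360 degrees is ignored for brevity.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with made-up observation records."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE observation ("
        "obs_id INTEGER PRIMARY KEY, source TEXT, "
        "ra_deg REAL, dec_deg REAL, freq_mhz REAL)"
    )
    db.executemany(
        "INSERT INTO observation VALUES (?, ?, ?, ?, ?)",
        [
            (1, "field_a", 10.0, -27.0, 150.0),  # invented demo rows
            (2, "field_b", 60.0, -30.0, 150.0),
            (3, "field_a", 10.5, -26.5, 180.0),
        ],
    )
    return db


def search(db, ra_min, ra_max, dec_min, dec_max, freq_mhz=None):
    """Return obs_ids inside an RA/DEC box, optionally at one frequency."""
    sql = (
        "SELECT obs_id FROM observation "
        "WHERE ra_deg BETWEEN ? AND ? AND dec_deg BETWEEN ? AND ?"
    )
    params = [ra_min, ra_max, dec_min, dec_max]
    if freq_mhz is not None:
        sql += " AND freq_mhz = ?"
        params.append(freq_mhz)
    sql += " ORDER BY obs_id"
    return [row[0] for row in db.execute(sql, params)]


if __name__ == "__main__":
    db = build_demo_db()
    print(search(db, 9, 11, -28, -26))
    print(search(db, 9, 11, -28, -26, freq_mhz=150.0))
```

The returned identifiers are what would then be used to pull the matching visibility files from the archive.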


Current state of play

• Raised a PO for 48 TB transportable storage array and controllers.
  – Arrive in the new year.
• Data capture modes ready for first “Quarter T” roll-out.
  – May-June 2012
• First cut of archive subsystems (NGAS)
  – Implementation, benchmarking, commissioning, interfaces.
  – April 2012


Thank You