The Palomar Transient Factory or

Adventures in High Fidelity Rapid Turnaround Data Processing at IPAC

Jason Surace

Russ Laher, Frank Masci, Wei Mi (did the IPAC work)

Branimir Sesar, Eran Ofek, David Levitan (students & post-docs)

Vandana Desai, Carl Grillmair, Steve Groom, Eugean Hacopians, George Helou, Ed Jackson, Lisa Storrie-Lombardi, Lin Yan (IPAC Team)

Eric Bellm (Project Scientist), Shri Kulkarni (PI)


What was/is PTF/iPTF?

• PTF is a robotic synoptic sky survey system designed to study transient (time-domain) phenomena.

• Surveys 1000-3000 square degrees a night, predominantly at R-band to a depth of 20.5.

• Primarily aimed at supernova science.

• But also can study variable stars, exoplanets, asteroids, etc.

• And produces an imaging sky survey like SDSS over larger area.

• PTF ran 4 years on-sky starting in 2009, now “iPTF” for another 3. Early foray into the next big theme in astronomy.

• Total budget ~$3M.

Surace 2014


Surace 2011

Former CFHT 12k Camera -> PTF Camera

Eliminated nitrogen dewar; camera now mechanically cryo-cooled. New field flattener, etc. 7.8 square degree active area.


The Venerable 48-inch Telescope


PTF camera installed in late 2008; Operations started 2009

Fully robotic operation. Automatically opens, takes calibrations, science data, and adapts to weather closures. Human intervention used to guide science programs.


Infrared Processing and Analysis Center

Surace 2009

IPAC is NASA’s multi-mission science center and data archive center for IR/submm astronomy. Specifically, we handle processing, archiving, and/or control for numerous missions including: IRAS, ISO, Spitzer, GALEX, Herschel, Planck, and WISE, as well as 2MASS, KI, and PTI. Also the seat of the Spitzer Science Center, NExSci, NED, NStED, and IRSA. Approximately 150 employees in two buildings on the CIT campus.


R-band Holdings

1292 nights, 3.1 million images

47 billion source apparitions (epochal detections)


g-band Holdings

241 nights, 500 thousand images


H-alpha Holdings

99 nights, 125 thousand images


Data flow (diagram)

Sites: P48; Caltech/Cahill; NERSC (Image Subtraction and Transient Detection/RB Pipeline); IPAC.

IPAC pipelines: Ingest; Realtime Image Subtraction Pipeline; Photometric Pipeline; Reference Pipeline; Lightcurve Pipeline; Moving Object Pipeline.

Products: Transient Candidates; Lightcurves; Reference Catalogs; Epochal Images and Catalogs; Reference Images; SSOs.


IPAC Infrastructure

• Data transmission from Palomar via microwave link to SDSC.

• ~1TB of data every 4-5 days.

• 24 drones with 240 cores. Mixed Sun and Dell blade units running RHE.

• Roughly 0.5 PB spinning disk in Nexsan storage units.

• Associated network equipment.

• Database and file servers.

• Archive servers.

• Tape backup.

IPAC Morrisroe Computer Center


Cluster/Parallelization Architecture

• PTF data are observed on a fixed system of spatial tiles on the sky. Vastly simplifies data organization and processing. PTF fields and CCD combinations are the basic unit to parallelize processing over multiple cluster nodes. Each node processes a CCD at a time.

• “Virtual Pipeline Operator” on a master control node oversees job coordination and staging.

• Multi-tiered local scratch disk, “sandbox” (working area) and archive disk structure; inherited architecture from previous projects driven by issues with very large file counts and I/O heavy processes.

• Disk system shared with the archive because of budget constraints.
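
A minimal sketch of the field-and-CCD parallelization described above, written in Python purely for illustration (the production wrappers are Perl, and the Virtual Pipeline Operator itself is not shown). The function names, the thread pool standing in for cluster nodes, and the 12-CCD numbering are assumptions, not details from the talk.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def process_unit(unit):
    """Hypothetical stand-in for running one pipeline on one (field, CCD) unit."""
    field_id, ccd_id = unit
    # A real worker would stage the CCD image to local scratch and run the pipeline here.
    return {"field": field_id, "ccd": ccd_id, "status": "done"}


def run_night(field_ids, ccd_ids, n_workers=4):
    """Fan every (field, CCD) combination out to a pool of workers."""
    units = list(product(field_ids, ccd_ids))
    # Threads stand in for cluster nodes; each worker handles one CCD at a time.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(process_unit, units))


# Illustrative field IDs and CCD numbering only.
results = run_night(field_ids=[5257, 5258], ccd_ids=range(12))
```

Because the tiles are fixed on the sky, the (field, CCD) pair is a stable key, so the same fan-out could plausibly be reused by every pipeline stage.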


Software Structure

• Individual modules written predominantly in C, but also Fortran, Python, MATLAB, and IDL.

• Connected with Perl wrapper infrastructure into discrete pipelines.

• Postgres database used for tracking dataflow, data quality, etc. Relational database not used in the operations system for catalog storage; not needed, and flat file access is more efficient.

• Heavy use of community software: SExtractor, SWarp, SCAMP, Astrometry.net, DAOPHOT, HOTPANTS. Cheaper not to re-invent the wheel.

• Software replaced as needed by new code development.

• Highly agile development program: unknown and changing science requirements, small team, and no separate development system due to budget constraints!

• Continuous refinement process. There’s a trap with big data development on a new instrument.


Realtime Pipeline

• Realtime – data is processed as received, turnaround in 20 minutes. Needed for same-night followup.

• Astrometrically and photometrically calibrated.

• Image subtraction against a reference image library constructed from all the data to-date. In-house software.

• “Streak detection” for fast-moving objects; moving object pipeline constructs solar system object tracklets.

• Transient candidate detection and extraction via PSF-fitting and aperture extraction.

• Machine-learning “scores” candidates.

• Image subtractions and candidate catalogs are pushed to an external gateway where they are picked up by the solar system, ToO, and extragalactic marshals.
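
The core idea of subtracting a reference and flagging what is left can be sketched in a few lines. This is a toy illustration only: it is neither HOTPANTS nor the in-house algorithm, and it assumes the two images are already aligned, PSF-matched, flux-scaled, and background-subtracted. The function name and the 5-sigma threshold are assumptions.

```python
import numpy as np


def subtract_and_detect(science, reference, nsigma=5.0):
    """Difference two matched images and return candidate pixel positions."""
    diff = np.asarray(science, dtype=float) - np.asarray(reference, dtype=float)
    # Robust noise estimate from the median absolute deviation of the difference.
    sigma = 1.4826 * np.median(np.abs(diff - np.median(diff)))
    # Keep only significant positive residuals (new or brightened sources).
    candidates = np.argwhere(diff > nsigma * sigma)
    return diff, candidates
```

A real pipeline would then measure each candidate (PSF-fit and aperture) and pass the measurements to the machine-learning scoring step.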


Realtime Image Subtraction and Transient Detection


Originally the community “HOTPANTS” package, now replaced with a more sophisticated in-house image subtraction algorithm.


Photometric Pipeline

• This pipeline processes data in the traditional manner.

• Starts up at the end of the night, after all the data has been received.

• Calibration is derived from the entire night’s worth of data. Specifically, the bias and flat-fields are derived from the data themselves.

• Photometric calibration is derived from extracted photometry from all sources, fitting color, extinction, time and large-scale spatial variations vs. the SDSS. Typically reach an accuracy of a few %.

• Astrometric calibration is done individually at the CCD level, against a combined SDSS and UCAC4 catalog. Typically good to 0.15”.

• Outputs from this pipeline are calibrated single-CCD FITS images and single-CCD catalog FITS binary tables (both aperture and PSF-fit). These are archived through IRSA. Available 1-3 days after observation.
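
As a rough illustration of the photometric calibration fit, the sketch below solves for a zero point, a color term, and an airmass (extinction) coefficient by linear least squares against reference magnitudes. It is a simplified stand-in: the time and large-scale spatial terms mentioned above are omitted, and the model form, sign convention, and all names are assumptions rather than the pipeline's actual parameterization.

```python
import numpy as np


def fit_photometric_solution(m_inst, m_ref, color, airmass):
    """Fit m_ref - m_inst = zp + c * color + k * airmass by least squares."""
    m_inst = np.asarray(m_inst, dtype=float)
    m_ref = np.asarray(m_ref, dtype=float)
    color = np.asarray(color, dtype=float)
    airmass = np.asarray(airmass, dtype=float)
    # Design matrix: constant term, color term, airmass term.
    design = np.column_stack([np.ones(len(color)), color, airmass])
    coeffs, *_ = np.linalg.lstsq(design, m_ref - m_inst, rcond=None)
    zero_point, color_term, extinction = coeffs
    return zero_point, color_term, extinction
```

Fitting one night at a time means the coefficients absorb that night's atmosphere, which is consistent with deriving calibration from the entire night's data.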


Photometric Pipeline Output

Single R-band thumbnail image of Arp 220, 8 arcminutes across.

Aperture extractions catalog (sextractor-based) overlaid. All observations and detections of everything are saved in the archive.

Products are a reduced image, bit-encoded data quality mask, and catalogs. All products are FITS.


Reference Image Pipeline

• Once enough individual observations accumulate, the “reference image” pipeline is triggered.

• This pipeline coadds the existing data, after selecting “best frames”, e.g. best seeing, photometric conditions, astrometry, etc.

• Coaddition is done based on CCD id, PTF tile, and filter.

• These images are the reference of the static sky, at a level deeper than the individual observations.

• “Reference Catalogs” are extracted from these images.

• This concept is important, because these are both the underlying basis of the image subtractions, and also the basis of the light-curve pipeline.

• Like PTF coverage, the depth of these is variable, but is currently 5<n<50.

• Resulting products are FITS images and FITS binary tables.
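
A minimal sketch of the "select best frames, then coadd" step for one CCD, tile, and filter. It assumes the frames are already registered to a common pixel grid and flux-scaled, ranks them by seeing alone, and uses a plain median stack; the real selection also weighs photometric conditions and astrometry, and the combination method is not stated in the talk. Reading n as the number of frames per stack, and using 5 and 50 as bounds, is an interpretation of the slide.

```python
import numpy as np


def build_reference(frames, seeing, min_frames=5, max_frames=50):
    """Median-combine the best-seeing frames, or return None if too few exist."""
    frames = np.asarray(frames, dtype=float)   # shape (n_frames, ny, nx)
    seeing = np.asarray(seeing, dtype=float)   # one seeing value per frame
    if len(frames) < min_frames:
        return None                            # not enough data to trigger a stack
    best = np.argsort(seeing)[:max_frames]     # smallest seeing first
    return np.median(frames[best], axis=0)
```

The median rejects transient outliers (cosmic rays, moving objects, the transients themselves), which suits an image meant to represent the static sky.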


Reference Images

Single Image 60 sec @R

Field 5257, Chip 7, Stack of 34


Deep Sky Coadds aka “Reference Images”

* Results not typical. Near Galactic Center.


Deep Coadds


Light Curve Pipeline

• Each night, all detected sources from the photometric pipeline are matched against the reference catalog (better than a generic catalog-matching approach).

• All sources ever seen for a given CCD, PTF tile, and filter combination are loaded and analyzed.

• Least variable sources used as anchors for the calibration.

• Image-by-image correction factors computed for that image as a whole and stored as a lookup table.

• Application of these secondary correction factors improves overall relative calibration to near-millimag levels for bright sources (that part is important).

• Triggers less frequently (planned weekly updates).

• Highest level of our products.
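
The anchor-based relative calibration can be sketched as follows. This is a simplified illustration under stated assumptions: the input is a matrix of already-matched magnitudes (one row per image, one column per reference-catalog source, NaN where undetected), a single additive offset is fitted per image, and the anchor count is arbitrary. None of the names come from the pipeline.

```python
import numpy as np


def relative_calibration(mags, n_anchors=20):
    """Return per-image offsets and corrected magnitudes using stable anchors."""
    mags = np.asarray(mags, dtype=float)                 # shape (n_images, n_sources)
    resid = mags - np.nanmedian(mags, axis=0)            # each source vs. its own median
    prelim = np.nanmedian(resid, axis=1, keepdims=True)  # first-pass image offsets
    scatter = np.nanstd(resid - prelim, axis=0)          # per-source variability
    anchors = np.argsort(scatter)[:n_anchors]            # least variable sources
    offsets = np.nanmedian(resid[:, anchors], axis=1)    # lookup table: one per image
    corrected = mags - offsets[:, None]
    return offsets, corrected
```

Storing only one offset per image keeps the correction cheap to apply to every source on that image, variable or not.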


From Van Eyken

Binary star light curves taken from PTF processed images in Orion.


Example Light Curves

Something a little different, these are relatively faint asteroid light curves from Chang et al. 2014.


PTF Archive at IRSA


Data products can be searched and retrieved via sophisticated GUI tools and also through an application program interface that allows integration of the archive into other, 3rd party software.


PTF Archive at IRSA


IRSA is looking to hire a UI software developer; see the Caltech website https://jobs.caltech.edu/postings/2254 or ask Steve Groom at this meeting.


PTF “Marshals”

• PTF “Science Marshals” sit on top of the data archive.

• Marshals are like interactive science twikis.

• Marshals are predominantly written by science users for their science collaborations, with coordinated interaction between them and the ops/archive system.

• The ops system produces science products (e.g. data), the archive produces access to science products, the marshals help turn the science products into science results (e.g. papers).

• They can be used to classify data, listen for alerts, lay down new observations for robotic followup, coordinate collaborators, etc.


iPTF Extragalactic Marshal


iPTF Extragalactic Marshal


NEA “Streaker” Marshal


NEA “Streaker” Marshal


GRB Target of Opportunity (ToO) Marshal

iPTF ToO Marshal iPhone App

GRBs and (should they ever be detected) gravitational waves can only be localized to tens to a few hundred square degrees.

PTF and ZTF can survey these areas in tens of minutes as targets of opportunity to localize fading electromagnetic counterparts.

The marshal receives alerts from Fermi and Swift, automatically lays down proposed ToO observations, and alerts a user by phone to activate the followup.
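
A back-of-the-envelope check of the "tens of minutes" claim, as a sketch only. It counts the pointings needed to tile a localization region with a given field of view, ignoring overlaps, chip gaps, and the region's shape. The 7.8 square degree field is the PTF active area quoted earlier; the 100-second cycle (a 60-second exposure plus readout and slew) and the 100 square degree region are assumed round numbers.

```python
import math


def tiling_estimate(region_sq_deg, fov_sq_deg, cycle_s):
    """Return (pointings, minutes) to tile a region, ignoring overlaps and shape."""
    pointings = math.ceil(region_sq_deg / fov_sq_deg)
    return pointings, pointings * cycle_s / 60.0


# Hypothetical 100 sq deg error region tiled with the 7.8 sq deg PTF camera.
ptf_pointings, ptf_minutes = tiling_estimate(100.0, 7.8, 100.0)
```

Under these assumptions a 100 square degree region takes on the order of a dozen pointings, which is consistent with a single pass in tens of minutes.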


Zwicky Transient Facility

More or less what PTF was, but an order of magnitude more of it.

ZTF was awarded full funding through NSF-MSIP (Mid-Scale Innovations Program).

ZTF now a roughly 50:50 public:private partnership.

Total Budget ~$17M


Wafer-Scale CCDs


e2v CCD231-C6 6k x 6k form factor with 15 micron pixels. A little under 4 inches on a side.

Focal plane readout time <10 seconds! 16 CCDs, 4 readouts each. And they are cheap.

30-second cadence means 1.2 GB raw data every 45 seconds. ~16x current data rate from PTF.

5 CCDs in-hand, remaining 11 now ordered.
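
The 1.2 GB figure can be roughly reproduced from the CCD count and format. This is a sketch under two assumptions not stated on the slide: that "6k" means 6144 pixels per side and that raw pixels are stored as 16-bit integers. Headers, overscan, and compression are ignored.

```python
def raw_exposure_bytes(n_ccds=16, pixels_per_side=6144, bytes_per_pixel=2):
    """Raw size of one full focal-plane exposure, ignoring headers and overscan."""
    return n_ccds * pixels_per_side ** 2 * bytes_per_pixel


exposure_gb = raw_exposure_bytes() / 1e9           # decimal gigabytes per exposure
mean_rate_mb_s = raw_exposure_bytes() / 45 / 1e6   # one exposure every 45 seconds
```

With these assumptions the exposure size comes out close to the quoted 1.2 GB, and the mean rate lands a little under 30 MB/s.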


ZTF camera FOV is 50 square degrees.

Largest camera on >1m telescope by area in the world.

Or, to make it a little clearer, here’s Orion.

The white box is the ZTF imaging area.

The moon is in the upper right corner of the white box.


And to Process All This?


IPAC is the data processing and archive center for all aspects of ZTF.

Continuous raw data flow of 30MB/s.

0.5-1 PB/yr of data products.

Drone farm of 128 computers.

Replication of proven PTF design in subunits similar to PTF data load (camera quadrants).

Page 38: Jason Surace Russ  Laher , Frank  Masci , Wei  Mi  (did the IPAC work)

Surace 2014

Page 39: Jason Surace Russ  Laher , Frank  Masci , Wei  Mi  (did the IPAC work)

Surace 2014

Transient Science Summer Schools


Schedule

• Early 2014 – PTF data for selected high cadence fields (M81, Beehive, Orion, Kepler, Stripe 82, Cass-A).

• 2015 – Complete PTF Archive release.

• 2016 – Rolling Releases of iPTF Archive, including deep reference images and light curves.

• 2017 – ZTF First Light (Jan), commissioning of camera, building of new reference images.

• 2018 – First ZTF data release (images, catalogs, light curves, transient candidates).

• 2019 – Release of transient alerts.

• 2020 – NSF funded period ends. Project continues with private partners.


http://ptf.caltech.edu
