
Analysis Framework for the Prompt Discovery of Compact Binary Mergers in Gravitational-wave Data

Cody Messick,1, 2, a Kent Blackburn,3 Patrick Brady,4 Patrick Brockill,4 Kipp Cannon,5, 6 Romain Cariou,7 Sarah Caudill,4 Sydney J. Chamberlin,1, 2 Jolien D. E. Creighton,4 Ryan Everett,1, 2 Chad Hanna,1, 8, 2 Drew Keppel,9 Ryan N. Lang,4 Tjonnie G. F. Li,10 Duncan Meacher,1, 2 Alex Nielsen,9 Chris Pankow,11 Stephen Privitera,12 Hong Qi,4 Surabhi Sachdev,3 Laleh Sadeghian,4 Leo Singer,13 E. Gareth Thomas,14 Leslie Wade,15 Madeline Wade,15 Alan Weinstein,3 and Karsten Wiesner9

1 Department of Physics, The Pennsylvania State University, University Park, PA 16802, USA
2 Institute for Gravitation and the Cosmos, The Pennsylvania State University, University Park, PA 16802, USA
3 LIGO Laboratory, California Institute of Technology, MS 100-36, Pasadena, California 91125, USA
4 Leonard E. Parker Center for Gravitation, Cosmology, and Astrophysics, University of Wisconsin-Milwaukee, Milwaukee, WI 53201, USA
5 Canadian Institute for Theoretical Astrophysics, 60 St. George Street, University of Toronto, Toronto, Ontario, M5S 3H8, Canada
6 RESCEU, University of Tokyo, Tokyo, 113-0033, Japan
7 Département de physique, École Normale Supérieure de Cachan, Cachan, France
8 Department of Astronomy and Astrophysics, The Pennsylvania State University, University Park, PA 16802, USA
9 Albert-Einstein-Institut, Max-Planck-Institut für Gravitationsphysik, D-30167 Hannover, Germany
10 Department of Physics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, China
11 Center for Interdisciplinary Exploration and Research in Astrophysics (CIERA) and Department of Physics and Astronomy, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208, USA
12 Albert-Einstein-Institut, Max-Planck-Institut für Gravitationsphysik, D-14476 Potsdam-Golm, Germany
13 NASA/Goddard Space Flight Center, Greenbelt, MD 20771, USA
14 University of Birmingham, Birmingham, B15 2TT, UK
15 Department of Physics, Hayes Hall, Kenyon College, Gambier, Ohio 43022, USA

(Dated: January 12, 2017)

We describe a stream-based analysis pipeline to detect gravitational waves from the merger of binary neutron stars, binary black holes, and neutron-star–black-hole binaries within ∼1 minute of the arrival of the merger signal at Earth. Such low-latency detection is crucial for the prompt response by electromagnetic facilities in order to observe any fading electromagnetic counterparts that might be produced by mergers involving at least one neutron star. Even for systems expected not to produce counterparts, low-latency analysis of the data is useful for deciding when not to point telescopes, and as feedback to observatory operations. Analysts using this pipeline were the first to identify GW151226, the second gravitational-wave event ever detected. The pipeline also operates in an offline mode, in which it incorporates more refined information about data quality and employs acausal methods that are inapplicable to the online mode. The pipeline’s offline mode was used in the detection of the first two gravitational-wave events, GW150914 and GW151226, as well as the identification of a third candidate, LVT151012.

I. INTRODUCTION

The field of gravitational-wave astronomy has come to life in a spectacular way, with the first detections of gravitational waves on September 14, 2015 [1] and December 26, 2015 [2] by the two detectors of the Laser Interferometer Gravitational-wave Observatory (LIGO) [3]. These detectors are currently undergoing further commissioning and will reach design sensitivity in the next few years. Additionally, they will be joined by a network of gravitational-wave observatories that include Advanced Virgo [4], KAGRA [5], and a third LIGO observatory in India [6]. We expect this network to bring more observations of binary black hole mergers [7], as well as binary neutron star (BNS) and neutron-star–black-hole (NSBH) mergers [8].

a [email protected]

As we enter the era of gravitational-wave astronomy, the need for low-latency analyses becomes critical. Gravitational waves from BNS and NSBH mergers are expected to be paired with electromagnetic emission and neutrinos [9–11]. Gravitational-wave-triggered electromagnetic observations may lead to the detection of prompt short gamma-ray bursts and high-energy neutrinos within seconds, followed by X-ray, optical, and radio afterglows days to years later. Multimessenger observations will aid in our understanding of astrophysical processes and increase our search sensitivity [9, 12]. Additionally, even in the absence of a counterpart, the rapid identification of gravitational waves has a number of benefits. Low-latency detection allows us to provide feedback to commissioners when search sensitivity drops unexpectedly, helping to return the detector to its nominal state [13]. Furthermore, upon identification of a candidate, we can submit timely requests to minimize detector changes in order to gather enough data to reliably estimate the search background and perform followup calibration measurements.

arXiv:1604.04324v3 [astro-ph.IM] 10 Jan 2017

In this work, we present the GstLAL-based inspiral pipeline, a gravitational-wave search pipeline based on the GstLAL library [14], and derived from GStreamer [15] and the LIGO Algorithm Library [16]. The pipeline can operate in a low-latency mode to ascertain whether a gravitational-wave signal is present in data, provide point estimates for the binary parameters, and estimate event significance. Analysts running the low-latency mode of this pipeline were the first to identify the second gravitational-wave event detected, GW151226 [2]. The pipeline can also operate in an “offline” configuration that can be used to process archival gravitational-wave data with additional background statistics and data quality information. The offline configuration was used in the detection of GW150914, LVT151012 [17], and GW151226 [2].

The GstLAL-based inspiral pipeline expands on the parameter space covered by previous low-latency searches [18–21]. In addition, it extends many of the techniques used in prior searches for compact binary coalescences [22, 23] to operate in a fully parallel, stream-based mode that allows for the identification of candidate gravitational-wave events within seconds of recording the data. The key differences include: (1) time-domain [24] rather than frequency-domain [25] matched filtering, (2) time-domain rather than frequency-domain [26] signal consistency tests to reject non-stationary noise transients, (3) a multidimensional likelihood ratio ranking statistic to robustly identify gravitational-wave candidates in a way that automatically adjusts to the properties of the noise [27], and (4) a background estimation technique that relies on tracking noise distributions to allow rapid evaluation of significance of identified candidates [28]. For a discussion of performance differences between time-domain and frequency-domain matched filtering, the reader is referred to Ref. [24].

This paper is organized as follows: In Sec. II, we discuss inputs to the low-latency and offline analyses, the online acquisition of data, measurement of the power spectral density (PSD), and whitening and conditioning of the data for matched filtering. We also present the basic offline and low-latency workflows in Fig. 1 and Fig. 2, respectively. In Sec. III, we discuss the matched-filter algorithm and our procedure for producing a list of ranked candidate events. In Sec. IV, we explain the significance calculation for identified candidate events and the procedure for responding to significant events via alerts to our observing partners. Differences between the offline and low-latency operation modes will be highlighted when relevant.

[Figure 1: flowchart of the offline workflow. Node labels: H, L, ...; Disk; PSD Estimation ($t \in \{t_0\}, \ldots, \{t_N\}$); Median PSD Estimation ($t \in \{t_0 \ldots t_N\}$); SVD decomposition ($\theta \in \{\theta_0\}, \ldots, \{\theta_N\}$); Filter / coincidence, Signal based vetoes ($\rho_H, \xi^2_H, \rho_L, \xi^2_L$) for each pair of $t$ and $\theta$; Rank: $L(\rho, \xi^2, \theta)$; Cluster: Keep events in N s window; Significance: $P(L, \theta|n)$ and $P(L|n)$; False-Alarm Probability and Rate Estimation; Candidate Event Follow-up; GW Candidate Event Database.]

FIG. 1. Diagram of the offline search mode of the GstLAL-based inspiral pipeline. First, data is transferred from each observatory (H, L, ...) to a central computing cluster (Sec. II A). Next, data is read from disk and the PSD is estimated (Sec. II B) in chunks of time $t_0, t_1, \ldots, t_N$ for each observatory. The median over the entire analysis time of each observatory PSD estimate is computed. The input template bank, which is generated upstream of the analysis, is split into regions of similar parameters $\theta_0, \theta_1, \ldots, \theta_N$ (Sec. II D) and then decomposed into a set of orthonormal filters weighted by the median PSD for each observatory. The data is filtered to produce a series of triggers characterized by SNR, $\rho$, signal consistency check, $\xi^2$ (Secs. III A, III B, and III C), and coalescence time. Coincident triggers between detectors are identified and promoted to the status of events (Sec. III D). Events are ranked according to their relative probability of arising from signal versus noise (Sec. III E). The data is then reduced to the most highly ranked event in 8 second windows (Sec. III F). In parallel, triggers not found in coincidence are used to construct the probability of obtaining a given event from noise, $P(\Lambda|n)$. Finally, the event significance and False-Alarm Rate are estimated (Sec. IV A). Note that the arrows drawn between nodes in this diagram do not imply that the output of one node is the input of the next node; they simply indicate the order in which tasks are performed.


[Figure 2: flowchart of the low-latency workflow. Node labels: H, L, ...; Data Broadcast; Injection Broadcast; Background Estimation; Filter / coincidence, Signal based vetoes / Rank ($\rho_H, \xi^2_H, \rho_L, \xi^2_L$) for $\theta \in \{\theta_0\}, \{\theta_1\}, \ldots, \{\theta_N\}$; Gravitational-Wave Candidate Event Database; BAYESTAR Rapid Sky Localization; RAVEN GRB Coincidence; lalinference Parameter Estimation; DQ and Candidate Follow-up; Approval Processor; GCN. Region labels: GW Observatories; O(10) s latency, Central Computing Cluster; O(30) s latency, Resources spread over the LIGO Data Grid; Outside world.]

FIG. 2. Diagram of the low-latency search mode of the GstLAL-based inspiral pipeline. First, data is received over a network connection from each observatory to a data broadcaster in a central computing facility. The data is then broadcast over the entire cluster with an efficient multicast protocol. The online analysis uses precomputed bank decompositions for each observatory from reference PSDs as input to jobs that combine the filtering, vetoing, coincidence, ranking, and significance estimation steps from the offline pipeline. Unlike the offline case, the online analysis workflow cannot be described as a directed acyclic graph, and in fact, data from each filtering job is exchanged bi-directionally and asynchronously with a process that constantly evaluates the global background estimates for the entire analysis. Events that are identified by any one filtering job, and subsequently pass a predetermined significance threshold, are sent to the Gravitational-Wave Candidate Event Database (GraceDB) [29] within a matter of seconds of the data being recorded at the observatories.

II. MATCHED FILTERING INPUT

Matched filtering algorithms for compact binary mergers have traditionally filtered the data $d(t)$ against a set of complex template waveforms $\{h^c_i(t)\}$ in the frequency domain using the relation

\begin{equation}
z_i(t) = x_i(t) + i\,y_i(t) = 4 \int_0^\infty df\, \frac{\tilde{h}^{c*}_i(f)\,\tilde{d}(f)}{S_n(f)}\, e^{2\pi i f t}, \tag{1}
\end{equation}

where $z_i(t)$ is the complex SNR using the $i$th template, $x_i(t)$ is the matched filter response to a gravitational-wave signal with orbital coalescence phase $\phi_0$ (the real part of the template in the time domain), $y_i(t)$ is the matched filter response to the same signal with orbital coalescence phase $\phi_0 + \pi/4$ (the imaginary part of the template in the time domain), and $S_n(f)$ is the single-sided noise PSD. The templates are normalized such that

\begin{equation}
1 = 4 \int_0^\infty df\, \frac{\left|\tilde{h}^c_i(f)\right|^2}{S_n(f)}. \tag{2}
\end{equation}

Defining the SNR, $\rho(t)$, as the modulus of the complex SNR, $z(t)$, allows one to search efficiently over the unknown coalescence phase, while its expression in the frequency domain allows one to efficiently implement matched filtering using FFT routines.

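The GstLAL-based inspiral pipeline, however, performs matched filtering in the time domain with real templates $\{h_i(t)\}$. The matched filter output is thus the real-valued $x_i(t)$ instead of the complex-valued $z_i(t)$. We can recast Eq. (1) in the time domain using the convolution theorem, which gives

\begin{align}
x_i(t) &= 2 \int_{-\infty}^{\infty} df\, \frac{\tilde{h}^*_i(f)\,\tilde{d}(f)}{S_n(|f|)}\, e^{2\pi i f t} \tag{3a} \\
&= 2 \int_{-\infty}^{\infty} d\tau\, \hat{h}_i(\tau)\,\hat{d}(t+\tau), \tag{3b}
\end{align}

where

\begin{equation}
\hat{d}(\tau) = \int_{-\infty}^{\infty} df\, \frac{\tilde{d}(f)}{\sqrt{S_n(|f|)}}\, e^{2\pi i f \tau} \tag{4}
\end{equation}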

is the whitened data; the whitened template $\hat{h}_i(\tau)$ is defined similarly. As a consequence of using real templates, Eq. (3) returns the matched filter response to a single coalescence phase, while Eq. (1) returns the response to two phases. Thus, Eq. (3) must be evaluated a second time using the template corresponding to the $\pi/4$-shifted phase in order to compute the SNR. To account for using twice as many templates, the template index, $i$, used on real templates is related to the index used on complex templates via

\begin{align}
h_{2i}(t) &= \mathrm{Re}\left[h^c_i(t)\right], \tag{5a} \\
h_{2i+1}(t) &= \mathrm{Im}\left[h^c_i(t)\right]. \tag{5b}
\end{align}

A different template normalization is also used, specifically

\begin{equation}
1 = 4 \int_0^\infty df\, \frac{\left|\tilde{h}_i(f)\right|^2}{S_n(f)}. \tag{6}
\end{equation}
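As a rough illustration of Eqs. (3b), (5), and (6), the sketch below filters an already-whitened data stream against a pair of real quadrature templates and combines the two outputs into an SNR time series. It is a toy under stated assumptions, not the pipeline's implementation: a Gaussian-windowed sinusoid stands in for a binary waveform, the data contain no noise, and the helper names `normalize` and `quadrature_snr` are hypothetical. For a whitened real template, Eq. (6) reduces through Parseval's theorem to $2\,\Delta t \sum_k \hat{h}[k]^2 = 1$, which is the normalization used here.

```python
import numpy as np

def normalize(h_hat, dt):
    """Scale a whitened real template so that 2*dt*sum(h_hat**2) == 1 (discrete Eq. 6)."""
    return h_hat / np.sqrt(2.0 * dt * np.sum(h_hat ** 2))

def quadrature_snr(d_hat, h_even, h_odd, dt):
    """Evaluate Eq. (3b) for both quadratures and return rho = sqrt(x_even^2 + x_odd^2)."""
    # np.correlate(a, v, "valid")[n] = sum_m a[n + m] * v[m], the discrete form of Eq. (3b)
    x_even = 2.0 * dt * np.correlate(d_hat, h_even, mode="valid")
    x_odd = 2.0 * dt * np.correlate(d_hat, h_odd, mode="valid")
    return np.sqrt(x_even ** 2 + x_odd ** 2)

# Toy whitened templates (assumption: not a physical waveform).
fs = 1024.0
dt = 1.0 / fs
M = 512
t = (np.arange(M) - (M - 1) / 2.0) * dt          # symmetric time grid
envelope = np.exp(-0.5 * (t / 0.05) ** 2)
h_even = normalize(envelope * np.cos(2 * np.pi * 100.0 * t), dt)   # plays the role of h_{2i}
h_odd = normalize(envelope * np.sin(2 * np.pi * 100.0 * t), dt)    # plays the role of h_{2i+1}

# Noise-free whitened "data": a signal of amplitude 8 with an arbitrary phase.
amplitude, phase, n0 = 8.0, 0.7, 1000
d_hat = np.zeros(4096)
d_hat[n0:n0 + M] = amplitude * (np.cos(phase) * h_even + np.sin(phase) * h_odd)

rho = quadrature_snr(d_hat, h_even, h_odd, dt)
peak_index = int(np.argmax(rho))
peak_snr = float(rho[peak_index])
```

Because the two quadratures are orthogonal on this symmetric grid, the recovered peak SNR equals the injected amplitude regardless of the chosen phase, which is the point of evaluating Eq. (3) twice.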

In this section, we discuss the inputs to the time-domain matched filtering calculation expressed in Eq. (3). We begin by discussing the low-latency distribution of the data itself. We then describe our method for estimating the PSD, which we use to construct the whitened data $\hat{d}(t)$ and whitened templates $\hat{h}_i(t)$. In describing our construction of the whitened data stream, we also describe the removal of loud noise transients and dealing with data dropouts in the low-latency broadcast. Finally, we describe the construction of the whitened template filters, which involves a number of computational enhancements to reduce the cost of filtering in the time domain.

A. Data Acquisition

Gravitational-wave strain data acquired at the LIGO sites is digitized at a sample rate of 16384 Hz and bundled into IGWD frames, a custom LIGO file format described in Ref. [30], on a four-second cadence. Information about the state of the instrument and data quality is distilled from a host of auxiliary environmental and instrumental-control-system channels into a single channel, referred to as the state-vector channel. The four-second frames containing the gravitational-wave strain and state-vector channels are delivered for low-latency processing at computing clusters across the LIGO Data Grid within ∼12 seconds of the data being acquired.

Searches for compact binary coalescences require using hundreds or thousands of compute nodes in parallel to process all the possible template waveforms. Low-latency data must be made available to all of these nodes as soon as it arrives, so an efficient multicast protocol is used to broadcast the data in low latency to the entire cluster. The nature of the low-latency transmission causes some small data loss within the tolerances acceptable to the pipeline, with efforts underway to reduce these losses.

B. PSD

Abstractly, we define the (one-sided) noise power spectral density $S_n(f)$ as

\begin{equation}
\langle \tilde{n}(f)\,\tilde{n}^*(f') \rangle = \frac{1}{2} S_n(f)\,\delta(f - f'), \quad f > 0 \tag{7}
\end{equation}

where $\langle \cdots \rangle$ denotes an ensemble average over realizations of the detector noise $n(t)$, which is assumed to be stationary and Gaussian. In practice, we cannot use Eq. (7) to calculate the PSD for a variety of reasons. To begin with, our knowledge of the detector noise comes exclusively from the observed data, which may contain signal in addition to noise. Furthermore, real data may contain brief departures from stationarity (commonly called “glitches”), which we do not want to contribute to the PSD estimate. Finally, the PSD can drift slowly over time scales shorter than the duration of a typical detector lock segment, and we want to track these changes.

For low-latency applications, we also require a PSD estimate that converges quickly using only data in the past, so that we obtain an accurate estimate of the PSD as soon as possible after the data begin to flow. In this subsection, we discuss the PSD estimation algorithm and how the result is used to whiten the data and template bank. We also present the results of a study done on the convergence of an estimated PSD to its known spectrum.

1. Estimation and Whitening

We use a median and a running geometric mean to meet these requirements for each analyzed segment of data. The median estimate operates on medium time scales and is robust against shorter time-scale fluctuations in the noise, while the running geometric mean tracks longer time-scale changes in the PSD, averaging the PSD estimates with the most recent estimates weighted more strongly. The time scales of the median and geometric mean are set, respectively, by the tunable parameters $n_{\rm med}$ and $n_{\rm avg}$.

The PSD calculation begins by partitioning the strain time series into blocks of length $N$ points with each block overlapping the previous by $N/2 + Z$ points, where $N$ and $Z$ are even-valued integers. Each block of data, denoted $d_j[k]$, is windowed and Fourier transformed,

\begin{equation}
\tilde{d}_j[\ell] = \sqrt{\frac{N}{\sum_{k=0}^{N-1} w[k]^2}}\; \Delta t \sum_{k=0}^{N-1} d_j[k]\, w[k]\, e^{-2\pi i \ell k / N}, \tag{8}
\end{equation}

where $k \in [0, N-1]$ is the time index, $\ell \in [0, N/2]$ is the frequency index, $\Delta t$ is the time sample step, and

\begin{equation}
w[k] =
\begin{cases}
0, & 0 \le k < Z \\
\sin^2 \dfrac{\pi (k - Z)}{N - 2Z}, & Z \le k < N - Z \\
0, & N - Z \le k \le N - 1
\end{cases}
\tag{9}
\end{equation}

is a zero-padded Hann window function. The DC and Nyquist terms of Eq. (8), $\tilde{d}[0]$ and $\tilde{d}[N/2]$, are set to zero, and the zero-padded Hann window is defined such that the sequence of overlapping window functions sums to unity everywhere. The squared magnitude of Eq. (8) is proportional to the instantaneous PSD and has a frequency resolution of $\Delta f = \frac{1}{N \Delta t}$.

The median of the most recent $n_{\rm med}$ instances of the instantaneous PSD, $S^{\rm med}_j[\ell]$, is determined for each frequency bin $\ell$. Mathematically,

\begin{equation}
S^{\rm med}_j[\ell] = \mathrm{median}\left\{ 2 \Delta f \left|\tilde{d}_k[\ell]\right|^2 \right\}_{k = j - n_{\rm med}}^{k = j}. \tag{10}
\end{equation}

The median is relatively insensitive to short time-scale fluctuations, which must occur over a time scale of $\frac{1}{2} n_{\rm med} (N/2 - Z) \Delta t$ in order to affect the median.
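The window of Eq. (9) and the transform of Eq. (8) can be checked numerically. The sketch below is illustrative only: the block sizes are small toy values rather than the pipeline's settings, and the function names are hypothetical. It verifies two properties stated in the text: windows hopped by $N/2 - Z$ samples (that is, overlapping by $N/2 + Z$) sum to unity, and $2\Delta f |\tilde{d}_j[\ell]|^2$ behaves as a one-sided PSD, which for white noise of unit variance should average to $2\Delta t$.

```python
import numpy as np

def zero_padded_hann(N, Z):
    """Window of Eq. (9): Z zeros at each end, a Hann window of length N - 2Z in between."""
    k = np.arange(N)
    w = np.zeros(N)
    core = (k >= Z) & (k < N - Z)
    w[core] = np.sin(np.pi * (k[core] - Z) / (N - 2 * Z)) ** 2
    return w

def instantaneous_psds(strain, N, Z, dt):
    """Apply Eq. (8) to overlapping blocks and return 2*df*|d~_j[l]|^2 for each block j."""
    w = zero_padded_hann(N, Z)
    hop = N // 2 - Z                          # consecutive blocks overlap by N/2 + Z samples
    norm = np.sqrt(N / np.sum(w ** 2)) * dt
    df = 1.0 / (N * dt)
    psds = []
    for start in range(0, len(strain) - N + 1, hop):
        d_tilde = norm * np.fft.rfft(strain[start:start + N] * w)
        d_tilde[0] = d_tilde[-1] = 0.0        # DC and Nyquist terms are set to zero
        psds.append(2.0 * df * np.abs(d_tilde) ** 2)
    return np.array(psds)

N, Z, fs = 256, 64, 256.0                     # toy values, not the pipeline's settings
dt = 1.0 / fs

# Check 1: overlapping windows sum to unity away from the ends of the sequence.
w = zero_padded_hann(N, Z)
hop = N // 2 - Z
total = np.zeros(hop * 9 + N)
for j in range(10):
    total[j * hop:j * hop + N] += w
interior = total[N:-N]

# Check 2: unit-variance white noise has a one-sided PSD of 2*dt.
rng = np.random.default_rng(1)
psds = instantaneous_psds(rng.normal(size=200 * N), N, Z, dt)
mean_psd = float(psds[:, 1:-1].mean())        # exclude the zeroed DC and Nyquist bins
expected_psd = 2.0 * dt
```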

The median is used to estimate the geometric mean of the last $n_{\rm med}$ samples for each frequency bin. Assuming the noise is a stationary, Gaussian process allows us to assume that the measured frequency bins of the estimated PSD are $\chi^2$-distributed random variables. The geometric mean of a $\chi^2$-distributed random variable is equal to the median divided by a proportionality constant $\beta$. The logarithm of the running geometric mean of median estimated PSDs, $\log S_j[\ell]$, is computed from one part $\log\left(S^{\rm med}_j[\ell]/\beta\right)$ and $(n_{\rm avg} - 1)$ parts $\log S_{j-1}[\ell]$. Mathematically,

\begin{equation}
S_j[\ell] = \exp\left[ \frac{n_{\rm avg} - 1}{n_{\rm avg}} \log S_{j-1}[\ell] + \frac{1}{n_{\rm avg}} \log \frac{S^{\rm med}_j[\ell]}{\beta} \right]. \tag{11}
\end{equation}

Changes to the PSD must occur over a time scale of at least $n_{\rm avg} (N/2 - Z) \Delta t$ to be fully accounted for by Eq. (11).

To whiten the data and the templates, the arithmetic mean is estimated from the geometric mean. The arithmetic mean of a $\chi^2$-distributed random variable is equal to the geometric mean multiplied by $\exp(\gamma)$, where $\gamma$ is Euler’s constant. If the noise assumptions are violated, then the true arithmetic mean of the spectrum will differ from the measured spectrum by some unknown factor. This estimated arithmetic mean is referred to as $S_n(f)$ in the continuum limit (see e.g. Eq. (7)).
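The update of Eqs. (10) and (11), followed by the $\exp(\gamma)$ correction, can be sketched as follows. The text does not give a value for $\beta$, so the one used here is an assumption made only for this example: each instantaneous PSD bin is treated as $\chi^2$ with two degrees of freedom (an exponential variate), for which the median is $\ln 2$ times the mean and the geometric mean is $\exp(-\gamma)$ times the mean, giving $\beta = \ln 2 \cdot \exp(\gamma) \approx 1.23$ in the limit of a well-sampled median. A median over only $n_{\rm med} = 7$ samples is slightly biased relative to that limit, so the toy estimate lands a few percent below the true value; a production value of $\beta$ would presumably absorb this. The function names are hypothetical.

```python
import math
from collections import deque

import numpy as np

EULER_GAMMA = 0.5772156649015329
# Assumption for this sketch: large-sample beta for a chi-squared variable with 2 d.o.f.
BETA = math.log(2.0) * math.exp(EULER_GAMMA)

def update_log_geometric_mean(log_S_prev, recent_psds, n_avg, beta=BETA):
    """One step of Eq. (11); recent_psds has shape (n_med, n_bins) as in Eq. (10)."""
    S_med = np.median(recent_psds, axis=0)
    return ((n_avg - 1) * log_S_prev + np.log(S_med / beta)) / n_avg

def arithmetic_mean_psd(log_S):
    """Convert the running geometric mean into an arithmetic-mean estimate via exp(gamma)."""
    return np.exp(log_S + EULER_GAMMA)

# Toy input: exponentially distributed instantaneous PSDs with a known mean in every bin.
rng = np.random.default_rng(0)
true_psd, n_med, n_avg, n_bins = 3.0, 7, 64, 512
recent = deque(maxlen=n_med)                  # holds the most recent n_med instantaneous PSDs
log_S = None
for _ in range(2000):
    recent.append(rng.exponential(true_psd, size=n_bins))
    if len(recent) < n_med:
        continue                              # wait until the median history is full
    block = np.array(list(recent))
    if log_S is None:
        log_S = np.log(np.median(block, axis=0) / BETA)   # seed the running average
    else:
        log_S = update_log_geometric_mean(log_S, block, n_avg)

# Ratio of the estimated arithmetic-mean PSD (averaged over bins) to the true value.
ratio = float(np.mean(arithmetic_mean_psd(log_S)) / true_psd)
```

Working with the logarithm keeps the update a simple weighted average and makes a single loud block contribute at most a $1/n_{\rm avg}$ share to the running estimate.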

The low-latency operating mode must whiten the data and update the running geometric mean of the PSD at the same time. The whitening process is done after the running geometric mean has been updated and is performed by dividing each frequency bin of Eq. (8) by the square root of the corresponding frequency bin in the estimated arithmetic mean of the PSD. Mathematically,

\begin{align}
\tilde{\hat{d}}_j[\ell] &= \frac{\tilde{d}_j[\ell]}{\sqrt{S_j[\ell] \exp(\gamma)}}, \tag{12} \\
\hat{d}_j[k] &= 2 \Delta t \sqrt{\sum_{m=0}^{N-1} w[m]^2}\; \Delta f \sum_{\ell=0}^{N/2} \tilde{\hat{d}}_j[\ell]\, e^{2\pi i \ell k / N}. \tag{13}
\end{align}

The extra terms in the inverse Fourier transform are necessary for unity variance.

The low-latency analysis typically uses $N = f_s (8\,\mathrm{s})$ and $Z = f_s (2\,\mathrm{s})$, where $f_s = 1/\Delta t$ is the sampling frequency, resulting in 1/4 Hz frequency resolution. This introduces four seconds of latency into the analysis. Unlike the low-latency case, the offline analysis begins with a known list of data segments. The PSD of each segment is estimated using $N = f_s (32\,\mathrm{s})$ and $Z = 0$; the final result of the running average is written to disk as a “reference PSD.” The median of the reference PSDs is used to whiten the template bank before matched filtering. However, the data segments are whitened in a procedure similar to the low-latency analysis, using $N = f_s (32\,\mathrm{s})$ and $Z = f_s (8\,\mathrm{s})$. At the time of writing, the typical values used are $n_{\rm med} = 7$ and $n_{\rm avg} = 64$ for both modes of operation. The only procedural difference between the offline and low-latency whitening steps is that the offline analysis seeds the running average with the segment’s reference PSD.
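The settings quoted above translate into concrete time scales through the expressions $\frac{1}{2} n_{\rm med}(N/2 - Z)\Delta t$ and $n_{\rm avg}(N/2 - Z)\Delta t$ given with Eqs. (10) and (11). The short calculation below is a worked example, with a hypothetical helper name. It also prints the bin spacing $1/(N\Delta t)$ alongside the length of the nonzero part of the window; one possible reading, offered here only as an interpretation, is that the quoted 1/4 Hz corresponds to the 4 s nonzero portion of the 8 s block rather than to the 1/8 Hz bin spacing.

```python
def psd_time_scales(block_s, pad_s, n_med, n_avg):
    """Time scales implied by a block of N*dt = block_s seconds padded by Z*dt = pad_s seconds."""
    hop_s = block_s / 2.0 - pad_s                  # (N/2 - Z) * dt, the stride between blocks
    return {
        "bin_spacing_hz": 1.0 / block_s,           # 1 / (N * dt)
        "nonzero_window_s": block_s - 2.0 * pad_s, # length of the Hann part of the window
        "hop_s": hop_s,
        "median_time_scale_s": 0.5 * n_med * hop_s,
        "average_time_scale_s": n_avg * hop_s,
    }

# Low-latency whitening: N = fs * 8 s, Z = fs * 2 s, n_med = 7, n_avg = 64.
low_latency = psd_time_scales(8.0, 2.0, 7, 64)
# Offline data whitening: N = fs * 32 s, Z = fs * 8 s, same n_med and n_avg.
offline = psd_time_scales(32.0, 8.0, 7, 64)

for name, scales in (("low-latency", low_latency), ("offline", offline)):
    print(name, scales)
```

With these inputs the low-latency median and running-average time scales come out to 7 s and 128 s, and the offline ones to 28 s and 512 s.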

2. Convergence

For low-latency applications, we require a PSD estimate that converges quickly using only data in the past, so that we obtain an accurate estimate of the PSD as soon as possible after the data begin to flow. To quantify the convergence, we create noise with a known power spectrum and compute a quantity that is proportional to the expected SNR for a given PSD in the absence of noise (commonly referred to as the “optimal SNR”). In the stationary phase approximation, binary waveforms in the frequency domain, $\tilde{h}(f)$, are proportional to $f^{-7/6}$ [31], thus

\begin{equation}
\rho \propto \int_{f_1}^{f_2} df\, \frac{f^{-7/3}}{S_n(f)}. \tag{14}
\end{equation}

We choose $f_1 = 10$ Hz and $f_2 = 2048$ Hz. Specifically, we compare the quantity computed using the measured spectrum $S_n(f)$, which we denote simply as $\rho$, to the SNR computed from the known spectrum $\bar{S}_n(f)$, which we denote as $\bar{\rho}$. Fig. 3 shows the fractional change of $\rho$ with respect to $\bar{\rho}$ as a function of time,

\begin{equation}
\frac{\delta\rho}{\rho}(t) = \frac{\rho(t) - \bar{\rho}}{\rho(t)}. \tag{15}
\end{equation}

We find that convergence happens quickly relative to the length of the data. The approximation of the true PSD does not affect the measured SNR after tens of seconds.
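The convergence metric of Eqs. (14) and (15) is straightforward to evaluate for any pair of spectra. The sketch below uses a made-up smooth PSD purely as a stand-in (it is not a LIGO noise curve), a simple trapezoidal rule for the integral, and hypothetical function names. Since only the ratio matters, the proportionality constant in Eq. (14) is dropped.

```python
import numpy as np

def snr_proxy(freqs, psd, f1=10.0, f2=2048.0):
    """Trapezoidal estimate of the integral in Eq. (14) between f1 and f2."""
    band = (freqs >= f1) & (freqs <= f2)
    f = freqs[band]
    integrand = f ** (-7.0 / 3.0) / psd[band]
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(f)))

def fractional_snr_change(freqs, measured_psd, known_psd):
    """Eq. (15): (rho - rho_known) / rho, with rho computed from the measured PSD."""
    rho = snr_proxy(freqs, measured_psd)
    rho_known = snr_proxy(freqs, known_psd)
    return (rho - rho_known) / rho

# Toy spectrum (assumption): arbitrary smooth shape with a minimum near 100-200 Hz.
freqs = np.arange(8.0, 2048.25, 0.25)
known_psd = 1e-46 * (1.0 + (50.0 / freqs) ** 4 + (freqs / 500.0) ** 2)

# A measured PSD that is uniformly 10% too large lowers the proxy by the same factor.
measured_psd = 1.1 * known_psd
change = fractional_snr_change(freqs, measured_psd, known_psd)
```

For a uniform 10% overestimate of the PSD, the measured quantity is the known one divided by 1.1, so Eq. (15) evaluates to $1 - 1.1 = -0.1$, independent of the shape of the spectrum.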

[Figure 3: log-log plot of $|\delta\rho/\rho|$ (vertical axis, $10^{-3}$ to $10^{0}$) versus time in seconds (horizontal axis, $10^{0}$ to $10^{4}$).]

FIG. 3. PSD convergence properties. Estimating the PSD is a critical part of ensuring that events are detected and assigned the appropriate significance. This figure illustrates the convergence properties of the PSD estimation in terms of the impact on SNR. Within 20 s the PSD will have an O(10%) impact on SNR, and within 200 s the impact drops to O(1%), where it remains.


C. Data Conditioning

Matched filtering is optimal under the condition that the noise, $n(t)$, is Gaussian. Our implementation of matched filtering also assumes stationarity over time scales at least as long as the compact binary waveform. Although non-stationarity on long time scales can be handled by tracking the PSD, short noise transients, commonly referred to as “glitches,” can cause high-SNR matched filter outputs that mimic signal detections. Glitches are handled by either removing them from the data or using signal consistency checks to vet the matched filter output. Sec. III C provides more detail on the latter.

The GstLAL-based inspiral pipeline removes glitches from the data in two ways. In some cases, the matched filter outputs of glitches have considerably higher amplitude than any expected output from a compact binary signal and can thus be safely removed from the data through a process called gating. Once the data has been whitened, it has unit variance. If a momentary excursion greater than some number of standard deviations, $\sigma$, is observed in the whitened data, then the gating process zeros the excursion in the whitened data with a 0.25 s padding on each side. An example of this is shown in Fig. 4. When gating the strain data, care must be taken to choose a threshold that will not discard real gravitational-wave signals. The threshold is chosen by testing with simulated gravitational-wave signals.
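The gating step described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the pipeline's code: the function name `gate` is hypothetical, the data are synthetic unit-variance noise, and the 50σ threshold is borrowed from the example of Fig. 4 and is not a recommended setting.

```python
import numpy as np

def gate(whitened, fs, threshold, pad_s=0.25):
    """Zero every sample whose magnitude exceeds threshold, plus pad_s seconds on each side."""
    gated = whitened.copy()
    pad = int(round(pad_s * fs))
    for idx in np.flatnonzero(np.abs(whitened) > threshold):
        gated[max(0, idx - pad):idx + pad + 1] = 0.0
    return gated

# Synthetic whitened data (unit variance) with one loud single-sample glitch.
fs = 2048.0
rng = np.random.default_rng(2)
whitened = rng.normal(size=8192)
whitened[4096] = 80.0

gated = gate(whitened, fs, threshold=50.0)
n_zeroed = int(np.count_nonzero(gated == 0.0))   # 2 * 512 + 1 samples for this example
```

Samples outside the padded window are left untouched, so quiet data on either side of the glitch still contribute to the matched filter output.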

The choice of 0.25 s padding is conservative for LIGO PSDs, where the whitening filter in the time domain can be approximated as a narrow sinc function. An Initial LIGO PSD and the time-domain representation of its corresponding whitening filter, estimated from data taken during Initial LIGO’s sixth science run [32, 33] (referred to as S6), are shown in Fig. 5. A fraction of ∼0.98 of the whitening filter’s squared magnitude is contained within ±10 ms of the filter’s peak; thus we expect no significant spectral leakage when gating glitches with 0.25 s padding.

In many cases, auxiliary information is available through environmental and instrumental monitors that can ascertain times of clear coupling between local transient noise sources [34, 35]; in cases where data quality is known to be poor, vetoes are applied after the strain data is whitened. Since whitened data is, by definition, uncorrelated between adjacent samples for stationary Gaussian processes, vetoes are applied by simply replacing the whitened data during vetoed times with zeros.

D. Template Bank Decomposition

In order to detect any compact binaries within a region of the mass parameter space, we filter the data against a bank of template signals. As the true binary parameter space is continuous, actual signals may not exactly match any one template from the bank; such signals incur a loss of SNR. The parameters of the templates in the bank

FIG. 4. Data conditioning. In this two-second block of LIGO S6 data, three noise transients (“glitches”) are visible. The glitch at time zero surpassed the threshold of 50 standard deviations (σ), triggering the gate to veto a ±0.25 second window around the glitch by replacing the data with zeros (black). The gray trace shows what the data looked like prior to gating. [Plot: whitened $d(t)$ versus time from glitch, $t$ (s), with the ±50σ thresholds marked.]

FIG. 5. Top: Time-domain representation of the whitening filter computed from a PSD estimated in an analysis of one week of S6 data. A fraction ∼0.98 of the filter’s squared magnitude is contained within ±10 ms of the peak. Bottom: The PSD used to compute the whitening filter. [Top panel: whitener kernel versus time (s). Bottom panel: PSD (strain$^2$/Hz) versus frequency (Hz).]

are chosen to minimize this loss of SNR using as few templates as possible [36–38]. Techniques for efficiently covering the binary parameter space with templates have been extensively developed [39–45]. We assume here that such a template bank has already been constructed, and describe how the bank is decomposed to more efficiently filter the data.

The standard methods for template bank construction naturally lead to banks of highly redundant templates. In the frequency domain, filtering directly with


the physical templates has the advantage of admitting computationally efficient searches over the unknown signal coalescence phase and time; this advantage is lost in the time domain. The GstLAL-based inspiral pipeline therefore does not directly filter the data against the physical template waveforms themselves. Rather, it employs the LLOID method [24] (see also Sec. III A), which combines singular value decomposition (SVD) [46–48] with near-critical sampling to construct a reduced set of orthonormal filters with far fewer samples.

In order to prepare the templates for the LLOID decomposition, the template bank is first split into partially overlapping “split-banks” of templates with similar time-frequency evolution based on the template parameters, as depicted in Fig. 6. Templates corresponding to binary black hole systems with circular orbits and component spins parallel to the orbital angular momentum can be characterized by the component masses $m_i$ and the dimensionless spin parameters $\chi_i = \vec{S}_i \cdot \hat{L}/m_i^2$ for $i = 1, 2$, where $\vec{S}_i$ are the spin vectors and $\hat{L}$ is the orbital angular momentum unit vector. Circularized binary neutron star templates with aligned spins can also be characterized by $m_i$ and $\chi_i$, although not as accurately due to neutron-star-specific effects such as tidal disruption. The templates for these systems are binned in a two-dimensional space, first by an effective spin parameter $\chi_{\mathrm{eff}}$,

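$$\chi_{\mathrm{eff}} \equiv \frac{m_1\chi_1 + m_2\chi_2}{m_1 + m_2}, \tag{16}$$

and then by chirp mass $\mathcal{M}$,

$$\mathcal{M} = \frac{(m_1 m_2)^{3/5}}{(m_1 + m_2)^{1/5}}. \tag{17}$$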
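As a quick numerical check of Eqs. (16) and (17), the two binning quantities can be evaluated directly. The function names below are illustrative only; masses are in solar masses and spins are the dimensionless aligned components.

```python
def chi_eff(m1, m2, chi1, chi2):
    """Effective spin, Eq. (16): mass-weighted mean of the aligned spins."""
    return (m1 * chi1 + m2 * chi2) / (m1 + m2)

def chirp_mass(m1, m2):
    """Chirp mass, Eq. (17), in the same units as m1 and m2."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2

mc_bns = chirp_mass(1.4, 1.4)          # equal-mass binary neutron star
x_bbh = chi_eff(10.0, 5.0, 0.5, -0.1)  # unequal-mass system with mixed spins
```

For the equal-mass case the chirp mass reduces to $m/2^{1/5}$, so a 1.4–1.4 system has a chirp mass of roughly 1.22 in the same units.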

Each split-bank contains $2N_T$ real templates, where $N_T$ is typically $O(100)$. The factor of 2 is a result of using two orthogonal real-valued templates in place of one complex-valued template (Sec. II). The input templates in adjacent $\mathcal{M}$ bins are overlapped in order to mitigate boundary effects from the SVD. Overlapping regions are clipped after reconstruction such that the output has no redundant template waveforms (see footnote 1). The waveforms are then whitened using reference PSDs, as described in Sec. II B, and each split-bank is decomposed via the LLOID method, described below and in Fig. 7.

Each split-bank is divided into various time slices after prepending the templates with zeros such that every template has the same number of sample points; this allows us to efficiently sample different regions of our waveforms with the appropriate Nyquist frequency instead of oversampling the low-frequency regions of the waveform with

Footnote 1: Split-banks that contain the lowest and highest $\mathcal{M}$ templates in a given $\chi$ bin are padded with duplicate templates from within the split-bank in order to keep the clipping uniform between split-banks.

FIG. 6. An illustration of how the physical parameter space is tiled into regions in which the LLOID decomposition is done. The physical parameter space is projected onto the $(\mathcal{M}, \chi)$ plane. Tiles of equal template number, $2N_T$, are constructed and overlapped in the $\mathcal{M}$ direction by $O(10\%)$. Above a specified chirp mass, $\mathcal{M}_c$, waveforms that use the full inspiral-merger-ringdown description are used. Below $\mathcal{M}_c$, waveforms that model only the inspiral phase are used. Tiles of similar chirp mass are then grouped together to define a one-dimensional family of similar parameters, θ, used in the evaluation of the likelihood-ratio ranking statistic (Sec. III E). [Diagram: tiles in the $(\mathcal{M}, \chi_{\mathrm{eff}})$ plane with $\mathcal{M}_c$ marked; annotations: “$2N_T$ templates in each tile,” “Adjacent tiles overlap,” “$N_B$ tiles with similar $\mathcal{M}$ are grouped together to define θ.”]

the sampling frequency required for the high-frequency regions. The SVD is then performed on each time slice of each split-bank and truncated such that we retain only the most important basis waveforms returned by the SVD algorithm, as measured by the match between the original templates and the reconstructed waveforms [24].
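The truncation described above can be illustrated with a toy bank. The sketch below is not the LLOID code: the function `truncated_svd_basis`, the 0.999 tolerance, and the chirp-like test signals are all assumptions made for illustration, and the “match” is taken here to be the plain inner product between each unit-norm template and its reconstruction, with no noise weighting or time slicing.

```python
import numpy as np

def truncated_svd_basis(templates, min_match=0.999):
    """Return (basis, coeffs) using the fewest SVD basis vectors such that
    every unit-norm template has inner product >= min_match with its
    reconstruction coeffs @ basis."""
    H = np.asarray(templates, dtype=float)
    H = H / np.linalg.norm(H, axis=1, keepdims=True)
    # H = v @ diag(sigma) @ u, with the rows of u orthonormal.
    v, sigma, u = np.linalg.svd(H, full_matrices=False)
    for n in range(1, sigma.size + 1):
        coeffs = v[:, :n] * sigma[:n]
        recon = coeffs @ u[:n]
        match = np.sum(recon * H, axis=1)
        if match.min() >= min_match:
            break
    return u[:n], coeffs

# Toy "bank": 40 slowly varying chirp-like signals (purely illustrative).
t = np.linspace(0.0, 1.0, 1024, endpoint=False)
bank = [np.sin(2 * np.pi * (30.0 + 0.02 * k) * t**2) for k in range(40)]
basis, coeffs = truncated_svd_basis(bank)
```

Because neighbouring templates are highly correlated, the number of rows in `basis` comes out smaller than the number of input templates, which is the redundancy the decomposition exploits.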

In addition to being used for the LLOID decomposition, split-banks are binned by the lowest chirp mass in each split-bank to construct bins of similar templates. These are referred to as θ bins, and define a binning of the likelihood-ratio detection statistic defined in Sec. III E.

III. EVENT IDENTIFICATION STAGE

Borrowing the language commonly used in particle experiments, the GstLAL-based inspiral pipeline identifies “triggers” from individual interferometer data streams. Triggers that arrive in coincidence are elevated to the “event” classification and ranked by the likelihood-ratio ranking statistic. In this section, we discuss how a list of triggers is generated by the matched-filtering algorithm and how coincidences are identified and ranked as events.


FIG. 7. An example of the LLOID decomposition [24]. In this example, $N_T = 195$ binary inspiral waveforms (390 including the two possible phases) with a chirp mass between 0.87 and 0.88 are first “whitened” by dividing them by a realistic noise amplitude spectral density from aLIGO. The line features in the spectrum are responsible for the amplitude modulation of the waveforms. The waveforms, which are prepended with zeros when necessary so that all of the templates in a given decomposition have the same number of sample points, were decomposed into 30 time slices at sample rates ranging between 128 Hz and 2048 Hz (only the last 10 slices are shown). A basis filter set from the waveforms in each time slice was constructed using the SVD [46]. Only 6-10 basis waveforms per slice were needed to reconstruct both phases of the 195 input waveforms to an accuracy of better than 99.9%. [Diagram: Templates 25, 26, and 175 divided into time slices 10 through 1, with sample rates of 256 Hz, 512 Hz, and 2048 Hz indicated; insets show Basis 1, Basis 3, and Basis 6 of the SVD bases for time slices 9 and 4.]

A. Matched filtering and the LLOID method

As discussed in Sec. II D, groups of templates are partitioned into time slices as part of the LLOID decomposition [24]. Specifically, any split-bank $H$ can be written as a collection of matrices $H^s$,

$$H = \{H^s\}, \tag{18}$$

where each $H^s$ contains time slice $s$ of all $2N_T$ templates in the split-bank,

$$H^s = \{h_i^s(t) : i \in [0, 2N_T - 1]\}. \tag{19}$$

The index $s$ is chosen to be largest at the start of the template waveforms, decreasing until $s = 0$ for the last slice (as seen in Fig. 7). Each slice of the split-bank, $H^s$, is decomposed via the SVD to provide basis functions $u$. These basis functions can be used to reconstruct any $h_i$ to a predetermined tolerance, i.e.,

$$h_i^s(t) \approx \sum_{\nu=0}^{N-1} v^s_{i\nu}\,\sigma^s_\nu\,u^s_\nu(t), \tag{20}$$

where $u^s = \{u^s_\nu(t)\}$ is a matrix composed of $N$ basis vectors, $v^s = \{v^s_{i\nu}\}$ is a reconstruction matrix, and $\vec{\sigma} = \{\sigma^s_\nu\}$ is a vector of singular values whose magnitudes are directly proportional to how important a corresponding basis vector is to the reconstruction process [46]. Now instead of evaluating Eq. (3b) $2N_T$ times for each slice of $2N_T$ templates, we can evaluate

$$U^s_\nu(t) = 2\int_{-\infty}^{\infty} d\tau\, u^s_\nu(\tau)\, d^s(t + \tau) \tag{21}$$

$N < 2N_T$ times for each slice, where $d^s(t)$ is sampled at the same rate as $u^s_\nu(t)$. The matched-filter output time series is calculated for each time slice $H^s$, then upsampled via sinc interpolation and added to the output of other time slices (in order of decreasing $s$) to obtain the output of Eq. (3b). The matched-filter output accumulated up through slice $s$ is defined recursively for each template in a given split-bank as

$$x_i^s(t) = \overbrace{\left(H^{\uparrow} x_i^{s+1}\right)(t)}^{\text{Previous } x_i} + \underbrace{\sum_{\nu=0}^{N-1} v^s_{i\nu}\,\sigma^s_\nu\,U^s_\nu(t)}_{\text{Current } x_i}, \tag{22}$$

where $H^{\uparrow}$ acts on a time series sampled at $f^{s+1}$ and upsamples it to $f^s$. Recall that the GstLAL-based inspiral pipeline uses two real-valued templates in place of one complex-valued template (Sec. II), thus the computed SNR is the quadrature sum of matched-filter outputs from waveforms which differ only in coalescence phase by π/4,

$$\rho_j(t) = \sqrt{x_{2j}(t)^2 + x_{2j+1}(t)^2}, \quad j \in [0, N_T - 1]. \tag{23}$$

Note that there are half as many SNRs as there are templates, which is a result of using real-valued templates in place of complex-valued templates (Sec. II). Evaluating Eq. (22) saves a factor of $10^4$ in computational cost over a direct time-domain convolution of the template waveforms for typical aLIGO search parameters [24].
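The bookkeeping of Eqs. (21)–(23) can be checked on a single slice with random stand-in templates. The sketch below is an illustration under simplifying assumptions, not the pipeline's filtering code: there is only one time slice (so the upsampled “previous” term of Eq. (22) is absent), the SVD is not truncated, the templates are unnormalized random vectors, and the name `snr_from_basis` is hypothetical.

```python
import numpy as np

def snr_from_basis(data, basis, coeffs):
    """Single-slice version of Eqs. (21)-(23), with no resampling.

    basis:  (N, L) orthonormal filters u_nu
    coeffs: (2*N_T, N) reconstruction weights v_{i,nu} * sigma_nu
    """
    # Eq. (21): correlate the data with each basis filter.
    U = np.array([2.0 * np.correlate(data, u, mode="valid") for u in basis])
    # Eq. (22) without the "previous" term: rebuild each template's output.
    x = coeffs @ U
    # Eq. (23): quadrature sum over each pair of real templates.
    return np.sqrt(x[0::2] ** 2 + x[1::2] ** 2)

rng = np.random.default_rng(1)
templates = rng.standard_normal((4, 64))  # two pairs of stand-in templates
v, sigma, u = np.linalg.svd(templates, full_matrices=False)
data = rng.standard_normal(512)
rho = snr_from_basis(data, u, v * sigma)

# Direct filtering with the templates themselves, for comparison.
direct = np.array([2.0 * np.correlate(data, h, mode="valid") for h in templates])
rho_direct = np.sqrt(direct[0::2] ** 2 + direct[1::2] ** 2)
```

With the full (untruncated) basis the two routes agree to numerical precision, since filtering is linear; the computational saving in practice comes from keeping only $N < 2N_T$ basis filters per slice.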

B. Triggers

The raw SNR time series, typically sampled at 2 kHz, is discretized into “triggers” before being stored. The discretization is done by maximizing the SNR over time in one-second windows and recording the peak if it crosses a predetermined threshold. With the typical threshold of SNR = 4, there is likely to be at least one trigger in every one-second interval for every template for every detector dataset analyzed. Although the number of


triggers can easily be hundreds of thousands per second (due to modern template banks containing hundreds of thousands of templates [17]), storing them is a marked improvement over storing the raw SNR time series, which is over three orders of magnitude more voluminous. However, we do not discard the raw SNR time series information immediately, because it is needed for the next stage of the pipeline (Sec. III C). For each trigger, we record the parameters of the template, the trigger time, the SNR, and the coalescence phase. The trigger time is computed via sub-sample interpolation to nanosecond precision; while low-SNR triggers suffer from poor timing resolution, high-SNR triggers can be resolved to better resolution than that of the sample rate [9, 49, 50]. Triggers are identified in parallel across each template in a given θ bin.
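The discretization into triggers can be sketched as a per-window maximization. This is an illustrative simplification: `find_triggers` is a hypothetical name, the windows are taken to be non-overlapping one-second blocks, and the peak time is reported at sample resolution rather than with the sub-sample interpolation described above.

```python
import numpy as np

def find_triggers(snr, sample_rate, threshold=4.0):
    """Return (time_s, snr) for the peak of each one-second window
    whose maximum is at or above `threshold`."""
    snr = np.asarray(snr, dtype=float)
    n = int(sample_rate)
    triggers = []
    for start in range(0, snr.size, n):
        window = snr[start:start + n]
        k = int(np.argmax(window))
        if window[k] >= threshold:
            triggers.append(((start + k) / sample_rate, float(window[k])))
    return triggers

# Three seconds of a flat SNR series with one peak per second.
snr = np.full(3 * 2048, 1.0)
snr[100] = 5.0           # above threshold
snr[2048 + 7] = 3.5      # below threshold, so no trigger in this second
snr[2 * 2048 + 1000] = 9.0
triggers = find_triggers(snr, 2048)
```

Only the first and third seconds yield triggers here, since the peak in the second window stays below the threshold of 4.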

C. Signal-based vetoes

Detector data often contain glitches that are not removed during the data conditioning stage (Sec. II C). Therefore, ranking triggers solely by SNR is not sufficient to separate noise from transient signals. Fortunately, we can exploit consistency checks to improve our ability to discriminate spurious glitches from true gravitational-wave events. Requiring multiple-detector coincidence (Sec. III D) is one powerful check, but here we discuss a separate check on waveform consistency for a single detector’s matched-filter output. This waveform consistency check determines how similar the SNR time series of the data is to the SNR time series expected from a real signal.

Under the assumption that the signal in the data exactly matches the matched-filter template up to a constant, it is possible to predict the local matched-filter SNR by computing the template autocorrelation function and scaling it to the known SNR. However, the known SNR is a result of the matched-filter response from two identical but out-of-phase templates; thus, instead of scaling the autocorrelation function to the SNR, a complex SNR series is constructed from the two matched-filter outputs,

$$z_j(t) = x_{2j}(t) + i\,x_{2j+1}(t), \quad j \in [0, N_T - 1]. \tag{24}$$

These are compared to the complex autocorrelation function,

$$R_j(t) = \int_{-\infty}^{\infty} df\, \frac{|h_{2j}(f)|^2 + |h_{2j+1}(f)|^2}{S_n(|f|)}\, e^{2\pi i f t}, \tag{25}$$

where $t = 0$ is chosen to be the peak time, $t_p$. By convention, each real template is normalized such that its autocorrelation is $\frac{1}{2}$ at the peak time, thus $R_j(0) = 1$. We compute a signal consistency test value, $\xi^2$, as a function of time given the complex SNR time series $z_j(t)$, a trigger’s peak complex SNR $z_j(0)$, and the autocorrelation function time series $R_j(t)$ as

$$\xi_j^2(t) = |z_j(t) - z_j(0)R_j(t)|^2. \tag{26}$$

If the gravitational-wave strain data contain only noise (i.e., $d(f) = n(f)$), then (see Appendix A for derivation)

$$\langle \xi_j^2(t) \rangle = 2 - 2|R_j(t)|^2. \tag{27}$$

In practice, a value of $\xi^2$ is computed for each trigger by integrating $\xi^2(t)$ in a window of time around the trigger and normalizing it using Eq. (27). The integral takes the form

$$\xi_j^2 = \frac{\int_{-\delta t}^{\delta t} dt\, |z_j(t) - z_j(0)R_j(t)|^2}{\int_{-\delta t}^{\delta t} dt\, \left(2 - 2|R_j(t)|^2\right)}, \tag{28}$$

where $\delta t$ is a tunable parameter that defines the size of the window around the peak time over which to perform the integration. Typically, $\delta t$ is calculated in terms of an odd-valued autocorrelation length (ACL), specified as a number of samples such that $\delta t = (\mathrm{ACL} - 1)\Delta t_s/2$, where $\Delta t_s = f_s^{-1}$ is the sampling time step. A suitable value for ACL was found to be 351 samples when filtering is conducted at a 2048 Hz sample rate, resulting in $\delta t \sim 85.4$ ms; this value was found using Monte Carlo simulations in real data.
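A discrete version of Eq. (28) is straightforward to write down. The sketch below is illustrative only: `xi_squared` is a hypothetical name, the autocorrelation is a made-up Gaussian-envelope series rather than one computed from a template via Eq. (25), and the integrals are replaced by sums over the ACL window (the sample spacing cancels in the ratio).

```python
import numpy as np

def xi_squared(z, R, peak, acl=351):
    """Discrete Eq. (28).

    z:    complex SNR time series
    R:    complex autocorrelation, with lag 0 at index len(R) // 2 and R = 1 there
    peak: index of the trigger in z
    acl:  odd autocorrelation length in samples
    """
    half = (acl - 1) // 2
    c = len(R) // 2
    zw = z[peak - half: peak + half + 1]
    Rw = R[c - half: c + half + 1]
    num = np.sum(np.abs(zw - z[peak] * Rw) ** 2)
    den = np.sum(2.0 - 2.0 * np.abs(Rw) ** 2)
    return num / den

# Window half-width implied by ACL = 351 at 2048 Hz.
delta_t = (351 - 1) / 2 / 2048.0

# Made-up autocorrelation and a complex SNR series that matches it exactly.
lags = np.arange(-200, 201)
R = np.exp(-(lags / 20.0) ** 2) * np.exp(0.1j * lags)
z = np.zeros(1001, dtype=complex)
z[300:701] = 12.0 * np.exp(0.7j) * R
xi_clean = xi_squared(z, R, peak=500)

# The same series with unit-variance complex noise added.
rng = np.random.default_rng(3)
noise = rng.standard_normal(1001) + 1j * rng.standard_normal(1001)
xi_noisy = xi_squared(z + noise, R, peak=500)
```

When the SNR series is exactly the scaled autocorrelation the statistic vanishes, and any departure from the predicted shape makes it positive.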

Fig. 8 plots the SNR and scaled autocorrelation for a template that recovered a simulated signal in Initial LIGO data. Eq. (28) performs, on a trigger-by-trigger basis, the subtraction of the measured SNR time series from the predicted series shown in this figure.

We note that $\xi^2$ differs from the traditional time-frequency $\chi^2$ test in [26], and it is not in fact a $\chi^2$-distributed number in Gaussian noise. However, the statistics of the $\xi^2$ test are recorded for both noise and simulated signals and can therefore be used in the likelihood-ratio test described in Sec. III E.

D. Coincidence

Demanding that two triggers are found in temporal coincidence between the LIGO sites is a powerful technique to suppress the background of the search. For a single-detector trigger, we define the time of an event to coincide with the peak of its SNR time series. Given a trigger in one detector, we check for corresponding triggers in the other detector within an appropriate time window, which takes into account the maximum gravitational-wave travel time between detectors and statistical fluctuations in the measured event time due to detector noise. For the two LIGO detectors, the time window is typically ±15 ms. We further require that the mass and spin template parameters are the same for the two triggers. This exact match requirement potentially results in a small loss of SNR for real signals, since the loudest trigger in each detector will in general not have the exact same template parameters due to independent noise in


FIG. 8. Ingredients in the autocorrelation-based least-squares test as described in Eq. (26). The two panels show the SNR time series near a simulated signal in Initial LIGO data (black) along with the predicted SNR computed from the template autocorrelation. Subtracting these two time series and integrating their squared magnitude provides a signal consistency test, $\xi^2$, at the time of a given trigger that can be used to reject non-stationary noise transients. [Panels: H1 $\rho(t)$ and L1 $\rho(t)$ versus time from peak (s), each showing the measured $\rho(t)$ and the predicted $\rho(t)$.]

the detectors. However, taking into account such fluctuations requires detailed knowledge of the metric on the signal manifold [51], which may not be easily available. Furthermore, the exact match restriction suppresses the noise and drastically simplifies the pipeline.
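The coincidence test described above (same template, peak times within the window) can be sketched as follows. The tuple layout `(time_s, template_id, snr)` and the function name `find_coincidences` are assumptions made for illustration; the ±15 ms default is the typical two-detector value quoted in the text.

```python
def find_coincidences(h1_triggers, l1_triggers, window=0.015):
    """Pair H1 and L1 triggers that share a template and whose peak times
    differ by at most `window` seconds.

    Each trigger is a tuple (time_s, template_id, snr).
    """
    # Index the L1 triggers by template so only exact matches are compared.
    by_template = {}
    for trig in l1_triggers:
        by_template.setdefault(trig[1], []).append(trig)
    pairs = []
    for ht in h1_triggers:
        for lt in by_template.get(ht[1], []):
            if abs(ht[0] - lt[0]) <= window:
                pairs.append((ht, lt))
    return pairs

h1 = [(100.000, 7, 8.1), (200.000, 7, 5.0), (300.000, 3, 6.0)]
l1 = [(100.008, 7, 6.2), (200.040, 7, 5.5), (300.002, 4, 6.1)]
pairs = find_coincidences(h1, l1)
```

Of the three candidate pairs, only the first survives: the second is 40 ms apart and the third was recovered by different templates in the two detectors.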

E. Event Ranking

Each trigger from each detector has independently computed $\rho$, $\xi^2$, and $t_p$ values. After coincidences are formed, it is necessary to rank the coincident events from least likely to be a signal to most likely to be a signal and to assign a significance to each. The GstLAL-based inspiral pipeline uses the likelihood-ratio statistic described in [27] to rank coincident events by their SNR, $\xi^2$, the instantaneous sensitivity of each detector (expressed as the horizon distance, $\{D_{\mathrm{H1}}, D_{\mathrm{L1}}\}$), and the detectors involved in the coincidence (expressed as the set $\{\mathrm{H1}, \mathrm{L1}\}$). For the case where only the aLIGO observatories H1 and L1 are participating, the likelihood ratio of an event found in coincidence is defined as

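$$\begin{aligned}
\mathcal{L}\left(\{D_{\mathrm{H1}}, D_{\mathrm{L1}}\}, \{\mathrm{H1}, \mathrm{L1}\}, \rho_{\mathrm{H1}}, \xi^2_{\mathrm{H1}}, \rho_{\mathrm{L1}}, \xi^2_{\mathrm{L1}}, \theta\right)
&= \mathcal{L}\left(\{D_{\mathrm{H1}}, D_{\mathrm{L1}}\}, \{\mathrm{H1}, \mathrm{L1}\}, \rho_{\mathrm{H1}}, \xi^2_{\mathrm{H1}}, \rho_{\mathrm{L1}}, \xi^2_{\mathrm{L1}} \mid \theta\right)\mathcal{L}(\theta) \\
&= \frac{P\left(\{D_{\mathrm{H1}}, D_{\mathrm{L1}}\}, \{\mathrm{H1}, \mathrm{L1}\}, \rho_{\mathrm{H1}}, \xi^2_{\mathrm{H1}}, \rho_{\mathrm{L1}}, \xi^2_{\mathrm{L1}} \mid \theta, \mathrm{signal}\right)}{P\left(\{D_{\mathrm{H1}}, D_{\mathrm{L1}}\}, \{\mathrm{H1}, \mathrm{L1}\}, \rho_{\mathrm{H1}}, \xi^2_{\mathrm{H1}}, \rho_{\mathrm{L1}}, \xi^2_{\mathrm{L1}} \mid \theta, \mathrm{noise}\right)}\,\mathcal{L}(\theta),
\end{aligned} \tag{29}$$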

where θ is a label corresponding to the template bank bin being matched-filtered (Sec. II D). The numerator and denominator are factored into products of several terms in [27], assuming that the noise distributions for each interferometer are independent of each other. The computation of each term in the factored numerator and denominator is discussed in detail in [27]; in this paper, we will give only a short summary of the denominator. The denominator is factored such that

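$$P\left(\{D_{\mathrm{H1}}, D_{\mathrm{L1}}\}, \{\mathrm{H1}, \mathrm{L1}\}, \rho_{\mathrm{H1}}, \xi^2_{\mathrm{H1}}, \rho_{\mathrm{L1}}, \xi^2_{\mathrm{L1}} \mid \theta, \mathrm{noise}\right) \propto \prod_{\mathrm{inst} \in \{\mathrm{H1}, \mathrm{L1}\}} P\left(\rho_{\mathrm{inst}}, \xi^2_{\mathrm{inst}} \mid \theta, \mathrm{noise}\right). \tag{30}$$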

The detection statistics $\rho$ and $\xi^2$ from non-coincident triggers are used to populate histograms for each detector, which are then normalized and smoothed by a Gaussian smoothing kernel to approximate $P(\rho_{\mathrm{inst}}, \xi^2_{\mathrm{inst}} \mid \theta, \mathrm{noise})$. Running in the low-latency operation mode requires a burn-in period until the analysis collects enough non-coincident triggers to construct an accurate estimate of the $(\rho, \xi^2)$ PDFs. Neither operation mode tracks time


dependence of these PDFs; instead, the PDFs are constructed from cumulative histograms. Future work may add time dependence.
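The histogram-then-smooth estimate of the single-detector noise PDF can be sketched with NumPy alone. Everything specific here is an assumption for illustration: the function name `noise_pdf`, the linear binning, the kernel width of 1.5 bins, and the synthetic $(\rho, \xi^2)$ samples, none of which are taken from the pipeline.

```python
import numpy as np

def noise_pdf(rho, xi2, bins=50, sigma_bins=1.5):
    """Histogram (rho, xi2) samples, smooth with a separable Gaussian
    kernel, and normalize so the result integrates to one."""
    hist, rho_edges, xi_edges = np.histogram2d(rho, xi2, bins=bins)
    # One-dimensional Gaussian kernel, truncated at 4 sigma.
    half = int(np.ceil(4 * sigma_bins))
    k = np.exp(-0.5 * (np.arange(-half, half + 1) / sigma_bins) ** 2)
    k /= k.sum()
    # Apply the kernel along each axis in turn.
    smooth = np.apply_along_axis(np.convolve, 0, hist, k, mode="same")
    smooth = np.apply_along_axis(np.convolve, 1, smooth, k, mode="same")
    area = np.outer(np.diff(rho_edges), np.diff(xi_edges))
    pdf = smooth / np.sum(smooth * area)
    return pdf, rho_edges, xi_edges

# Synthetic stand-ins for non-coincident triggers.
rng = np.random.default_rng(2)
rho = 4.0 + rng.exponential(1.0, 5000)
xi2 = rng.chisquare(2, 5000) / 2.0
pdf, rho_edges, xi_edges = noise_pdf(rho, xi2)
```

Because the histogram simply accumulates counts, feeding it more triggers over time reproduces the cumulative, time-independent behaviour described above.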

Rather than collecting non-coincident $(\rho, \xi^2)$ statistics from individual templates, we group linearly dependent templates together to avoid the computational cost and complexity of tracking each template separately. Furthermore, it has been observed that groups of linearly dependent templates produce similar PDFs, thus coarse graining the parameter space allows one to approximate these PDFs for collections of templates. Therefore, the θ in the likelihood ratio is a label that identifies a specific template bank bin. Exactly how templates are grouped together into background bins is left as a tuning decision for the user, but typically $O(1000)$ templates from each detector are grouped together.

Examples of the $\rho$ and $\xi^2$ distributions, estimated from an analysis of one week of S6 data, are shown in Fig. 9. The analysis considered data recorded between September 14, 2010, 23:58:48 UTC and September 21, 2010, 23:58:48 UTC. These boundaries were chosen to include the blind injection performed on September 16, 2010, at 06:42:23 UTC, often referred to as the “Big Dog.” The warm colormap corresponds to the natural logarithm of the estimated noise probability density function. The cool colormap corresponds to a PDF generated by adding the coincident triggers to the single-detector triggers before smoothing and normalizing. The cool-colormap distribution was then masked to only show regions which deviate from the background estimate. The location of the Big Dog parameters is marked with a black X.

Examples of two other likelihood-ratio components from the Big Dog analysis are shown in Fig. 10. The top plot shows the joint SNR PDF, which is used in the numerator of the likelihood ratio to enforce amplitude consistency [27]; the bottom plot shows the signal hypothesis model of the $(\rho, \xi^2)$ plane. The semi-analytic models used to generate these plots are described in [27].

F. Event Clustering

Signals can produce several high-likelihood events at the same time in different templates; we wish to ensure that we only consider the most likely event associated with a signal. In the offline analysis, we use a clustering algorithm that picks out the maximum likelihood-ratio event globally across the input template bank within a ±4-second window. To reduce latency, the online analysis does not cluster events globally. Instead, the online analysis keeps the maximum likelihood-ratio event in each θ bin within a ±1-second window.
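One simple way to realize this kind of clustering is to keep an event only if nothing louder lies within the window around it. The sketch below is an assumption about how such a rule could be written, not the pipeline's algorithm: `cluster_events` is a hypothetical name, events are reduced to `(time_s, lnL)` pairs, ties are all kept, and the quadratic loop is chosen for clarity rather than speed.

```python
def cluster_events(events, window=4.0):
    """Keep an event only if no other event within +/- `window` seconds
    has a larger log likelihood ratio.

    Each event is a tuple (time_s, lnL).
    """
    kept = []
    for t, lnl in events:
        neighbours = [other for s, other in events if abs(s - t) <= window]
        if lnl >= max(neighbours):
            kept.append((t, lnl))
    return kept

events = [(10.0, 5.0), (11.0, 12.0), (13.5, 7.0), (30.0, 6.0)]
clustered = cluster_events(events)             # offline-style +/- 4 s window
clustered_1s = cluster_events(events, window=1.0)  # online-style window, one theta bin
```

With the ±4 s window the three events near 10–14 s collapse to the single loudest one, while the narrower ±1 s window lets the event at 13.5 s survive as well.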

FIG. 9. PDFs used in the likelihood ratio calculation, generated by histogramming, then smoothing, and normalizing the triggers. The plots shown are from an analysis of S6 data beginning on September 14, 2010, at 23:58:48 UTC and ending on September 21, 2010, at 23:58:48 UTC, which includes the blind injection known as the “Big Dog.” The warm colormap corresponds to the natural logarithm of the marginalized probability density function estimated from non-coincident triggers only; the cool-colormap region was computed by adding coincident triggers to the histograms before smoothing and normalizing. Regions of the cool-colormap model consistent with the warm-colormap model were then masked. The location of the Big Dog in the $(\rho, \xi^2)$ plane is marked with a black X.

IV. EVENT PROCESSING AND SENSITIVITY ESTIMATION

The result of the pipeline components described in Sec. III is a list of events ranked from most to least likely to be a gravitational-wave signal. In this section, we discuss how the significance is estimated, the procedure in the case that a sufficiently significant candidate is identified, and how simulated waveforms are used to characterize the sensitivity of the analysis to gravitational


FIG. 10. Instances of two of the distributions included in the calculation of the likelihood-ratio numerator, generated from an analysis of S6 data beginning on September 14, 2010, at 23:58:48 UTC and ending on September 21, 2010, at 23:58:48 UTC. Top: The joint SNR PDF used to enforce amplitude consistency across observatories. The location of the measured Big Dog parameters is marked with a black X. Bottom: The $(\rho, \xi^2)$ signal distribution used in the numerator of the likelihood ratio. The locations of the measured Big Dog parameters are marked with a black X for Hanford and a black + for Livingston.

waves.

A. Event Significance Estimation

Most coincident events are noise, thus the p-value, the probability that noise would produce an event with a ranking statistic at least as large as the one under consideration, is the standard tool used to identify candidate gravitational-wave events. The p-value has conventionally been evaluated by performing time slides, where a set of time-shifts that are much larger than

the gravitational-wave travel time (tens of milliseconds) between gravitational-wave detectors is introduced into one or more datasets and the coincidence and event-ranking procedure is repeated in the same way as it is done without the time-shifts [52]. Instead of performing time slides, the GstLAL-based inspiral pipeline uses triggers not found in coincidence to compute a kernel density estimate of the probability density of noise-like events in each background bin, $P(\ln\mathcal{L} \mid \theta, \mathrm{noise})$ [27, 28].

The background bins are then marginalized over to obtain $P(\ln\mathcal{L} \mid \mathrm{noise})$ and the complementary cumulative distribution,

$$C(\ln\mathcal{L}^* \mid \mathrm{noise}) = \int_{\ln\mathcal{L}^*}^{\infty} d\ln\mathcal{L}\; P(\ln\mathcal{L} \mid \mathrm{noise}). \tag{31}$$

The p-value we seek describes the probability that a population of $M$ independent coincident noise-like events contains at least one event with a log likelihood ratio greater than or equal to some threshold $\ln\mathcal{L}^*$. This can be written as the complement of the binomial distribution [28],

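$$\begin{aligned}
P(\ln\mathcal{L} \ge \ln\mathcal{L}^* \mid \mathrm{noise}_1, \ldots, \mathrm{noise}_M)
&= 1 - \binom{M}{0}\left(1 - e^{-C(\ln\mathcal{L}^* \mid \mathrm{noise})}\right)^{0} \left(e^{-C(\ln\mathcal{L}^* \mid \mathrm{noise})}\right)^{M} \\
&= 1 - e^{-M C(\ln\mathcal{L}^* \mid \mathrm{noise})},
\end{aligned} \tag{32}$$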

where $e^{-C(\ln\mathcal{L}^* \mid \mathrm{noise})}$ is the probability that a Poisson process with mean rate $C(\ln\mathcal{L}^* \mid \mathrm{noise})$ will yield an event with log likelihood ratio less than $\ln\mathcal{L}^*$. The binomial coefficient and the term that follows it are both unity and are written explicitly only for pedagogical reasons.

When calculating an event’s significance during an experiment of undetermined length, such as the low-latency processing of data during a science run, it is convenient to express the significance in terms of how often the noise is expected to yield an event with a log likelihood ratio $\ge \ln\mathcal{L}^*$. This is referred to as the false-alarm rate (FAR) [28]; for an experiment of length $T$, we define this as

$$\mathrm{FAR} = \frac{C(\ln\mathcal{L}^* \mid \mathrm{noise})}{T}. \tag{33}$$

The time used in the calculation of the FAR is the total elapsed observing time regardless of instrument state for the low-latency configuration and the total time where at least two detectors are operating for the offline analysis operation. The offline configuration definition is historically what has been used; however, the online definition leads to intuitive false-alarm rates for sharing low-latency events with external observing partners.
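Eqs. (32) and (33) translate directly into code once $C(\ln\mathcal{L}^* \mid \mathrm{noise})$ is available. The sketch below is a plain transcription of the two formulas as printed; the function names and the numerical inputs are hypothetical, and `math.expm1` is used only to keep precision when the exponent is tiny.

```python
import math

def p_value(ccdf, n_events):
    """Eq. (32): 1 - exp(-M * C) for M events and complementary CDF value C."""
    return -math.expm1(-n_events * ccdf)

def false_alarm_rate(ccdf, duration_s):
    """Eq. (33): C / T, in Hz when T is given in seconds."""
    return ccdf / duration_s

week = 7 * 86400.0                  # one week of observing time, in seconds
p = p_value(1e-12, 1000)            # illustrative inputs, not pipeline values
far = false_alarm_rate(1.0, week)
```

For small $M C$ the p-value is very nearly $M C$ itself, which is why the two significance measures track each other closely for loud events.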

The procedure to estimate the background distribution described thus far does not account for the clustering described in Sec. III F. Events with low $\ln\mathcal{L}$ are


more common than those with high $\ln\mathcal{L}$, thus the clustering process removes low-$\ln\mathcal{L}$ events preferentially. The normalization of the background model is determined by the observed events above a log-likelihood-ratio threshold chosen to be safely out of the region affected by clustering. This is acceptable because low-$\ln\mathcal{L}$ events are, by construction, the least likely to contain a signal. Consequently, we only consider events well above this threshold as viable candidates. Work is currently underway to create a background model that accounts for clustering.

A plot of the significance results from the Big Dog run discussed in Sec. III E is shown in Fig. 11. The Big Dog was found with a p-value of $5.4 \times 10^{-9}$ (5.7σ), which corresponded to a FAR of $1.1 \times 10^{-14}$ Hz (1 per $\sim 2.7 \times 10^{6}$ years); Table I lists the recovered parameters of the Big Dog.
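The correspondence between the quoted p-value and its Gaussian-equivalent significance can be checked with the standard library. The sketch assumes the one-sided convention, $p = \tfrac{1}{2}\,\mathrm{erfc}(z/\sqrt{2})$, which is an assumption on our part since the text does not state which convention was used; `p_to_sigma` is a hypothetical helper that inverts the relation by bisection.

```python
import math

def p_to_sigma(p):
    """One-sided Gaussian significance z such that 0.5 * erfc(z / sqrt(2)) = p."""
    lo, hi = 0.0, 40.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # The tail probability decreases with z, so move toward the root.
        if 0.5 * math.erfc(mid / math.sqrt(2.0)) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

z = p_to_sigma(5.4e-9)
```

Under this convention the p-value of $5.4 \times 10^{-9}$ maps to a significance that rounds to the quoted 5.7σ.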

FIG. 11. Number of observed events as a function of log likelihood ratio in an analysis of S6 data beginning on September 14, 2010, at 23:58:48 UTC and ending on September 21, 2010, at 23:58:48 UTC. The Big Dog injection, found with a false-alarm probability of $5.4 \times 10^{-9}$ (5.7σ), is marked on the observed distribution (green). The black line represents the predicted number of events when observed events are included in the background model, while the blue line is the predicted number when the observed events are not included in the background model. [Plot: number of events $\ge \ln\mathcal{L}$ versus $\ln\mathcal{L}$, with the 2σ to 5σ significance levels marked; legend: Search Result, Expected (BG Includes Coincident Trigs), Expected; the Big Dog is labeled.]

B. Generating Alerts

When operating in a low-latency analysis configuration, one of the primary goals of the GstLAL-based inspiral pipeline is to identify candidate events and upload them to the Gravitational-wave Candidate Event Database (GraceDB [29]) as quickly as possible in order to issue alerts to observing partners [12].

Events that pass a given FAR threshold are identified within ∼1 minute of the gravitational-wave signals arriving at Earth. The basic parameters of the event are transmitted to GraceDB, including the GPS time of the

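| Quantity | Value |
| --- | --- |
| p-value | $5.4 \times 10^{-9}$ (5.7σ) |
| FAR (Hz) | $1.1 \times 10^{-14}$ |
| $\log\mathcal{L}$ | 39.6 |
| $\rho_{\mathrm{H}}$ | 14.7 |
| $\xi^2_{\mathrm{H}}$ | 1.4 |
| $\rho_{\mathrm{L}}$ | 9.4 |
| $\xi^2_{\mathrm{L}}$ | 1.5 |
| $\mathcal{M}$ ($M_\odot$) | 4.7 |

TABLE I. The result of the analysis of S6 data beginning on September 14, 2010, at 23:58:48 UTC and ending on September 21, 2010, at 23:58:48 UTC. Only the parameters found for the recovered Big Dog injection are shown. The Big Dog was the most significant event found in this analysis period.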

event, the SNR and $\xi^2$ values for the triggers in each detector, and the parameters of the best-fit template (for example, mass and spin values, the significance estimate, etc.). Furthermore, the instantaneous estimate of the PSD is uploaded to GraceDB, as well as the histogram data used in computing the p-value.

An event upload automatically initiates several automated and human follow-up activities to aid rapid communication with observing partners [53]. First, a rapid sky localization routine known as BAYESTAR [9, 50, 54] uses the event information and the PSD to estimate the event’s sky position within minutes. At the same time, deeper parameter estimation analysis begins in order to provide updated position reconstruction, as well as the full posterior probability distributions of the binary parameters [55], on a timescale that ranges from hours to days.

In addition to parameter estimation, data-quality information is also mined to provide rapid feedback to analysts. Time-frequency spectrograms are automatically generated to indicate the stationarity of noise near an event [56]. Furthermore, low-latency mining of LIGO’s auxiliary channels provides additional information about the state of the detector and environment when an alert is first generated [16, 57, 58].

The suitability of the low-latency pipeline for generating data for external alerts has been studied extensively in [9, 54].

C. Software injections

Simulated gravitational waveforms known as “software injections” are used to assess the pipeline response to real gravitational-wave signals. The LIGO strain data is duplicated and simulated compact binary waveforms are digitally added to the duplicated data streams. In low latency, the new data with software injections added are broadcast to the LIGO Data Grid in parallel to the normal dataset so that a simultaneous run can measure the instantaneous sensitivity of the low-latency analysis to compact binary sources. In the offline mode, strain


data is read from disk, software injections are added, and the new data is written back to disk before the offline inspiral pipeline processes the data.

Injections are considered ‘found’ if a coincident event with the correct template parameters is found with a FAR of at most one per 30 days at the time of the injection, and ‘missed’ otherwise. The volume of space the pipeline is sensitive to, $V$, is approximated as a sphere and computed via

$$V = 4\pi \int_{0}^{\infty} dr\, \epsilon(r)\, r^2, \tag{34}$$

where $\epsilon(r)$ is an efficiency parameter given by the ratio of found to total injections modeled to be a distance $r$ away. We define our estimated range, the average furthest distance a signal can originate from and still be detected, as

$$R = \left( \frac{3V}{4\pi} \right)^{1/3}. \tag{35}$$

It is important to note that the range depends on the parameters of the compact binary system. For example, the range for a $1.4$–$1.4\,M_\odot$ binary neutron star system will be different from that of a $10$–$10\,M_\odot$ binary black hole system; thus different injection sets must be used to determine the pipeline’s sensitivity to different regions of the compact binary parameter space. Eq. (35) is compared to the analytically computed SenseMon range, an estimate of pipeline sensitivity calculated from the PSD [59]. Comparing the sensitivity estimated from the PSD to the sensitivity estimated from injections provides additional confidence in the sensitivity estimates.
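As a rough illustration of Eqs. (34) and (35), the sketch below estimates the sensitive volume from a list of injection distances and found/missed flags. The distance binning, the midpoint-rule integration, and the names `sensitive_volume`, `estimated_range`, `r_max`, and `n_bins` are assumptions made for this example; the text does not specify how the pipeline evaluates the integral.

```python
import math


def sensitive_volume(distances, found, r_max, n_bins=20):
    """Approximate V = 4*pi * integral of eps(r) r^2 dr with a binned efficiency."""
    dr = r_max / n_bins
    total = [0] * n_bins
    hits = [0] * n_bins
    for r, is_found in zip(distances, found):
        if not 0.0 <= r < r_max:
            continue  # ignore injections outside the integration range
        b = min(int(r / dr), n_bins - 1)
        total[b] += 1
        hits[b] += bool(is_found)
    volume = 0.0
    for b in range(n_bins):
        if total[b] == 0:
            continue  # no injections in this bin: treated as zero efficiency
        eps = hits[b] / total[b]  # found / total injections at this distance
        r_mid = (b + 0.5) * dr
        volume += 4.0 * math.pi * eps * r_mid**2 * dr
    return volume


def estimated_range(volume):
    """Radius of the sphere whose volume equals `volume`, as in Eq. (35)."""
    return (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)


# Toy injection set: everything closer than 100 (arbitrary units) is found.
toy_distances = [0.5 + k for k in range(200)]
toy_found = [r < 100.0 for r in toy_distances]
V_toy = sensitive_volume(toy_distances, toy_found, r_max=200.0)
R_toy = estimated_range(V_toy)  # close to 100 for this idealized efficiency
```

For a step-function efficiency the recovered range reduces to the step location, which makes a convenient sanity check before applying the same code to a realistic injection set.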

Typically, injections are added at a much higher rate than the expected gravitational-wave signal rate. However, their cadence is chosen such that they do not bias the PSD estimate described in Sec. II B. In practice, injections are typically added about once per minute so that it is possible to evaluate the average response to certain signal types over the entire experiment duration.

V. CONCLUSION

The GstLAL-based inspiral pipeline is a stream-based pipeline that allows for time-domain compact binary searches capable of identifying and uploading candidate gravitational-wave signals within seconds. This provides rapid feedback to the gravitational-wave detector control rooms and enables prompt event alerts for electromagnetic followup by observing partners. The analysis techniques were designed for second- and third-generation gravitational-wave detectors and have been demonstrated to be applicable even to the computationally challenging case of the future Einstein Telescope [60].

GstLAL and all related software are available for public use and licensed under the GPL [14].

VI. ACKNOWLEDGEMENTS

The authors wish to thank B. Sathyaprakash, the LIGO Scientific Collaboration, and the Compact Binary Coalescence working group for many useful discussions.

We gratefully acknowledge the support of the Eberly Research Funds of Penn State and the National Science Foundation through PHY-0757058, NSF-0923409, PHY-1104371, PHY-1454389, and PHY-1307429. This document has LIGO document number P1600009.

Appendix A: Expectation Value of Signal Consistency Test Value in Noise

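Expanding Eq. (26) and taking the ensemble average, we find

$$
\begin{aligned}
\langle \xi_j^2(t) \rangle &= \langle |z_j(t) - z_j(0) R_j(t)|^2 \rangle \\
&= \langle |z_j(t)|^2 \rangle - 2\,\mathrm{Re}\left[ \langle z_j^*(t) z_j(0) \rangle R_j(t) \right] + \langle |z_j(0)|^2 \rangle |R_j(t)|^2 .
\end{aligned}
\tag{A1}
$$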

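Starting with Eq. (3a),

$$
\begin{aligned}
\langle |z_j(t)|^2 \rangle &= \left\langle \left| 2 \int_{-\infty}^{\infty} df\, \frac{n(f) \left( h_{2j}^*(f) + i h_{2j+1}^*(f) \right)}{S_n(|f|)}\, e^{2\pi i t f} \right|^2 \right\rangle \\
&= 4 \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} df_1\, df_2 \left( \langle n(f_1) n^*(f_2) \rangle\, e^{2\pi i t (f_1 - f_2)}\, \frac{\left( h_{2j}^*(f_1) + i h_{2j+1}^*(f_1) \right) \left( h_{2j}(f_2) - i h_{2j+1}(f_2) \right)}{S_n(|f_1|)\, S_n(|f_2|)} \right) \\
&= 2 \int_{-\infty}^{\infty} df\, \frac{|h_{2j}(f)|^2 + |h_{2j+1}(f)|^2}{S_n(|f|)}, \\
\langle |z_j(t)|^2 \rangle &= \langle |z_j(0)|^2 \rangle = 2,
\end{aligned}
\tag{A2}
$$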

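where Eq. (7) was used in the last step. Computing $\langle z_j^*(t) z_j(0) \rangle$ follows the same steps, except that the $z_j(0)$ term does not have a complex exponential to cancel the complex exponential accompanying $z_j^*(t)$; thus

$$\langle z_j^*(t) z_j(0) \rangle = 2 R_j^*(t). \tag{A3}$$

Substituting Eqs. (A2) and (A3) into Eq. (A1) gives

$$\langle \xi_j^2(t) \rangle = 2 - 2 |R_j(t)|^2. \tag{A4}$$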
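The result in Eq. (A4) can be checked numerically with a small Monte Carlo sketch. It does not filter any data: it simply draws pairs of complex Gaussian variables constructed (by assumption) to have the second moments stated in Eqs. (A2) and (A3), and then averages $|z_j(t) - z_j(0) R_j(t)|^2$. The function name `mean_xi_squared`, the sample count, and the test value of $R_j(t)$ are hypothetical.

```python
import math
import random


def mean_xi_squared(R, n_samples=200_000, seed=1):
    """Monte Carlo average of |z_t - z_0 R|^2 for a given complex R with |R| <= 1."""
    rng = random.Random(seed)

    def complex_gaussian():
        # Real and imaginary parts each have unit variance, so <|g|^2> = 2.
        return complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))

    scale = math.sqrt(1.0 - abs(R) ** 2)
    acc = 0.0
    for _ in range(n_samples):
        z0 = complex_gaussian()
        # Chosen so that <|z_t|^2> = 2 and <z_t^* z_0> = 2 R^*.
        zt = R * z0 + scale * complex_gaussian()
        acc += abs(zt - z0 * R) ** 2
    return acc / n_samples


R_example = 0.6 + 0.3j
estimate = mean_xi_squared(R_example)
expected = 2.0 - 2.0 * abs(R_example) ** 2  # right-hand side of Eq. (A4)
print(f"Monte Carlo: {estimate:.3f}, Eq. (A4): {expected:.3f}")
```

The two numbers should agree to within the statistical error of the average, which shrinks as the square root of the number of samples.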

[1] B. Abbott, R. Abbott, T. Abbott, M. Abernathy, F. Acernese, K. Ackley, C. Adams, T. Adams, P. Addesso, R. Adhikari, et al., Physical Review Letters 116, 061102 (2016).

[2] B. Abbott, R. Abbott, T. Abbott, M. Abernathy, F. Acernese, K. Ackley, C. Adams, T. Adams, P. Addesso, R. Adhikari, et al., Physical Review Letters 116, 241103 (2016).

[3] J. Aasi, B. Abbott, R. Abbott, T. Abbott, M. Abernathy, K. Ackley, C. Adams, T. Adams, P. Addesso, R. Adhikari, et al., Classical and Quantum Gravity 32, 074001 (2015).

[4] F. Acernese, M. Agathos, K. Agatsuma, D. Aisa, N. Allemandou, A. Allocca, J. Amarni, P. Astone, G. Balestri, G. Ballardin, et al., Classical and Quantum Gravity 32, 024001 (2015).

[5] Y. Aso, Y. Michimura, K. Somiya, M. Ando, O. Miyakawa, T. Sekiguchi, D. Tatsumi, and H. Yamamoto, Physical Review D 88, 043007 (2013).

[6] B. Iyer et al., “LIGO-India, Proposal of the Consortium for Indian Initiative in Gravitational-wave Observations (IndIGO),” (2011), LIGO-DCC-M1100296.

[7] LIGO RATES PAPER, TO BE ADDED BEFORE SUBMISSION, In review.

[8] J. Abadie, B. Abbott, R. Abbott, M. Abernathy, T. Accadia, F. Acernese, C. Adams, R. Adhikari, P. Ajith, B. Allen, et al., Classical and Quantum Gravity 27, 173001 (2010).

[9] L. P. Singer, L. R. Price, B. Farr, A. L. Urban, C. Pankow, S. Vitale, J. Veitch, W. M. Farr, C. Hanna, K. Cannon, et al., The Astrophysical Journal 795, 105 (2014).

[10] B. P. Abbott et al. (LIGO Scientific, Virgo), (2016), arXiv:1602.08492 [astro-ph.HE].

[11] S. Adrián-Martínez et al. (ANTARES, IceCube, LIGO Scientific, Virgo), (2016), arXiv:1602.05411 [astro-ph.HE].

[12] “Identification and follow up of electromagnetic counterparts of gravitational wave candidate events,” http://www.ligo.org/scientists/GWEMalerts.php, accessed: 2016-01-15.

[13] B. P. Abbott et al. (LIGO Scientific, Virgo), (2016), arXiv:1602.03844 [gr-qc].

[14] “Gstlal,” https://www.lsc-group.phys.uwm.edu/daswg/projects/gstlal.html, accessed: 2015-07-01.

[15] “Gstreamer,” https://gstreamer.freedesktop.org.

[16] “Lalsuite,” https://www.lsc-group.phys.uwm.edu/daswg/projects/lalsuite.html, accessed: 2015-07-01.

[17] B. Abbott, R. Abbott, T. Abbott, M. Abernathy, F. Acernese, K. Ackley, C. Adams, T. Adams, P. Addesso, R. Adhikari, et al., Physical Review D 93, 122003 (2016).

[18] J. Abadie, B. Abbott, R. Abbott, T. Abbott, M. Abernathy, T. Accadia, F. Acernese, C. Adams, R. Adhikari, C. Affeldt, et al., Astronomy & Astrophysics 541, A155 (2012).

[19] S. Klimenko et al., Classical and Quantum Gravity 25, 114029 (2008).

[20] R. Lynch, S. Vitale, R. Essick, E. Katsavounidis, and F. Robinet, (2015), arXiv:1511.05955 [gr-qc].

[21] B. P. Abbott et al. (Virgo, LIGO Scientific), (2016), arXiv:1602.03843 [gr-qc].

[22] D. Buskulic, L. S. Collaboration, V. Collaboration, et al., Classical and Quantum Gravity 27, 194013 (2010).

[23] S. Babak, R. Biswas, P. Brady, D. A. Brown, K. Cannon, C. D. Capano, J. H. Clayton, T. Cokelaer, J. D. Creighton, T. Dent, et al., Physical Review D 87, 024033 (2013).

[24] K. Cannon, R. Cariou, A. Chapman, M. Crispin-Ortuzar, N. Fotopoulos, M. Frei, C. Hanna, E. Kara, D. Keppel, L. Liao, et al., The Astrophysical Journal 748, 136 (2012).

[25] B. Allen, W. G. Anderson, P. R. Brady, D. A. Brown, and J. D. Creighton, Physical Review D 85, 122006 (2012).

[26] B. Allen, Physical Review D 71, 062001 (2005).

[27] K. Cannon, C. Hanna, and J. Peoples, arXiv preprint arXiv:1504.04632 (2015).

[28] K. Cannon, C. Hanna, and D. Keppel, Physical Review D 88, 024025 (2013).

[29] “Gravitational wave candidate event database,” https://www.lsc-group.phys.uwm.edu/daswg/projects/gracedb.html, accessed: 2016-01-15.

[30] S. Anderson et al., “Specification of a Common Data Frame Format for Interferometric Gravitational Wave Detectors,” (2009), LIGO-DCC-T970130.

[31] S. Droz, D. J. Knapp, E. Poisson, and B. J. Owen, Physical Review D 59, 124016 (1999).

[32] J. Abadie, B. Abbott, R. Abbott, T. Abbott, M. Abernathy, T. Accadia, F. Acernese, C. Adams, R. Adhikari, C. Affeldt, et al., Physical Review D 85, 082002 (2012).

[33] J. Abadie, B. Abbott, R. Abbott, T. Abbott, M. Abernathy, T. Accadia, F. Acernese, C. Adams, R. Adhikari, C. Affeldt, et al., The Astrophysical Journal 760, 12 (2012).

[34] J. Slutsky, L. Blackburn, D. Brown, L. Cadonati, J. Cain, M. Cavaglia, S. Chatterji, N. Christensen, M. Coughlin, S. Desai, et al., Classical and Quantum Gravity 27, 165023 (2010).

[35] N. Christensen, L. S. Collaboration, V. Collaboration, et al., Classical and Quantum Gravity 27, 194010 (2010).

[36] B. J. Owen, Physical Review D 53, 6749 (1996).

[37] B. J. Owen and B. Sathyaprakash, Physical Review D 60, 022002 (1999).

[38] T. A. Apostolatos, Physical Review D 52, 605 (1995).

[39] T. Cokelaer, Physical Review D 76, 102004 (2007), arXiv:0706.4437 [gr-qc].

[40] B. Abbott, R. Abbott, R. Adhikari, J. Agresti, P. Ajith, B. Allen, R. Amin, S. Anderson, W. Anderson, M. Arain, et al., Physical Review D 78, 042002 (2008).

[41] I. W. Harry, A. H. Nitz, D. A. Brown, A. P. Lundgren, E. Ochsner, and D. Keppel, Physical Review D 89, 024010 (2014).

[42] S. Babak, Classical and Quantum Gravity 25, 195011 (2008).

[43] I. W. Harry, B. Allen, and B. Sathyaprakash, Physical Review D 80, 104014 (2009).

[44] G. M. Manca and M. Vallisneri, Physical Review D 81, 024004 (2010).

[45] S. Privitera, S. R. Mohapatra, P. Ajith, K. Cannon, N. Fotopoulos, M. A. Frei, C. Hanna, A. J. Weinstein, and J. T. Whelan, Physical Review D 89, 024003 (2014).

[46] K. Cannon, A. Chapman, C. Hanna, D. Keppel, A. C. Searle, and A. J. Weinstein, Physical Review D 82, 044025 (2010).

[47] K. Cannon, C. Hanna, and D. Keppel, Physical Review D 84, 084003 (2011).

[48] K. Cannon, C. Hanna, and D. Keppel, Physical Review D 85, 081504 (2012).

[49] S. Fairhurst, New Journal of Physics 11, 123006 (2009).

[50] L. P. Singer and L. R. Price, Physical Review D 93, 024013 (2016).

[51] C. Robinson, B. Sathyaprakash, and A. S. Sengupta, Physical Review D 78, 062002 (2008).

[52] C. D. Capano, Searching for gravitational waves from compact binary coalescence using LIGO and Virgo data, Ph.D. thesis, Syracuse University (2011).

[53] https://www.lsc-group.phys.uwm.edu/daswg/projects/lvalert.html, accessed: 2016-01-15.

[54] C. P. Berry, I. Mandel, H. Middleton, L. P. Singer, A. L. Urban, A. Vecchio, S. Vitale, K. Cannon, B. Farr, W. M. Farr, et al., arXiv preprint arXiv:1411.6934 (2014).

[55] J. Veitch, V. Raymond, B. Farr, W. Farr, P. Graff, S. Vitale, B. Aylott, K. Blackburn, N. Christensen, M. Coughlin, et al., Physical Review D 91, 042003 (2015).

[56] “Ligo data viewer,” https://www.lsc-group.phys.uwm.edu/daswg/projects/ligodv.html, accessed: 2016-01-15.

[57] R. Biswas, L. Blackburn, J. Cao, R. Essick, K. A. Hodge, E. Katsavounidis, K. Kim, Y.-M. Kim, E.-O. Le Bigot, C.-H. Lee, et al., Physical Review D 88, 062003 (2013).

[58] R. Essick, L. Blackburn, and E. Katsavounidis, Classical and Quantum Gravity 30, 155010 (2013).

[59] J. Abadie, B. Abbott, R. Abbott, M. Abernathy, T. Accadia, F. Acernese, C. Adams, R. Adhikari, P. Ajith, B. Allen, et al., arXiv preprint arXiv:1003.2481 (2010).

[60] D. Meacher, K. Cannon, C. Hanna, T. Regimbau, and B. Sathyaprakash, Physical Review D 93, 024018 (2016).