Parallel Sequential Multi-Sensor Change-Point Detection

Parallel Sequential Multi-SensorChange-Point Detection

Yao Xie∗, David Siegmund†

∗ Duke University, † Stanford UniversityJoint Statistical Meeting 2012

October 1, 2012

Outline

I Motivating applicationsI ModelI Mixture procedure and parallel procedureI Performance evaluation

Motivation: volcano monitoring

Change-point detection using multiple sensors

Anormaly detectionHarvard Sensor Networks Lab

Solar flare detection

I Video sequences, each pixel is a “sensor”I Very high-dimensional: # sensors = 232× 292 = 67744I Goal: online detection of small and transient solar flare

t = 100t = 226

Source: NASA

Challenges

I High-dimensional dataI Online detectionI Fundamental question:

How to quickly detect temporal change of spatial data

0 10 20 30 40 50−10010

Change-point occurs at time κ

Formulation

Preprocessed data/model residuals:

{yn,t}n=1,·,N,t=1,2,···,

H0 : yn,t = wn,t , n = 1, · · · ,N, t = 1,2, · · ·

{yn,t = wn,t , n ∈ S, t = 1, · · · , k ;yn,t = µn + wn,t , n ∈ Sc , t = k + 1, · · ·

wn,t ∼ N (0,1)

I Unknown parameters1. µn: changepoint amplitude2. S: subset of sensors affected3. k : changepoint time

Simple procedure

I The simple approach is to use total energy:

N∑n=1

y2n,t ≥ b

I Slightly more sophisticated, form a CUSUM statistic

Tsimple = inf

maxt−w≤k≤t

t∑i=k+1

N∑n=1

y2n,i ≥ b.

w : window length

I These simple approaches have been widely used inengineering fields, network anomaly detection, videosurveillance

Multichannel changepoint detection

I Likelihood ratio statistic

max1≤k≤t

∑n∈S

maxµn

t∑i=k+1

{yn,iµn −µ2

I S has 2N possibilities, N = # sensorsI One simplification:

Tmc = inf

t : max1≤k≤t

N∑n=1

maxµn

t∑i=k+1

(yn,iµn −µ2

Related to [Tartakovsky, Veeravalli 08]

Sparsity

t = 226

I Number of sensors affected by thechangepoint is small

I Fraction of affected sensors:

p = |S|/N � 1

I Model this by assuming each sensoraffected with probability p0

10 / 1

Mixture procedure

Tmix(p0,b)

{t : max

0≤k<t

N∑n=1

1− p0 + p0 exp[(U+n,k ,t)

2/2])≥ b

Sn,t =t∑

yn,i ,

Un,k ,t = (t − k)−1/2(Sn,t − Sn,k ).

11 / 1

Performance metrics

I Average run length (ARL)

E∞{T}

Average period of making false alarms when there is nochangepoint.

I Expected detection delay

max1≤k≤t

Ek{T − k |T > k}

Number of samples needed before claiming a detection whenthere is changepoint.

12 / 1

ARL of mixture procedure

Closed form expression for ARL of mixture procedure?

Table : Threshold for Tmix(p0,b), ARL ≈ 5000, N = 100, m1 = 200.

Procedure Monte Carlo ARLTmix(1,53.5) 4978

Tmix(0.1,19.5) 5000

? “Sequential multisensor changepoint detection”, Xie, Siegmund, 2012, submitted to Annuals of Stats.

13 / 1

Detection delay

I Performance of mixture procedure is better

0 0.05 0.1 0.15 0.2 0.25 0.30

GLRMixture, p

0 = 0.1

Modified TV

14 / 1

Sensitivity to pI Mix procedure: good enough?I We do not know true pI When p0 6= p, N is large: p0N very different from pN

0 0.05 0.1 0.15 0.2 0.25 0.3 0.350

p0 = p

p0 = 0.1

N = 100

15 / 1

Parallel Procedure

I How to be more robust to uncertainty in pI Use two mixture procedures

Tparallel , min{Tmix(p1,b1),Tmix(p2,b2)}.

I Choose p1 and p2:p ∈ [p1,p2]

16 / 1

ARL of Parellel

I No closed form ARL of ParallelI Choose b1 and b2: ARL of two procedures equalI Use a very conservative lower bound

I P∞{Tmix(pi ,bi) ≤ 1000} ≈ 0.05, i = 1,2I By Bonferroni inequality:

P∞{min[Tmix(p1,b1),Tmix(p2,b2)] ≤ 1000} ≤ 0.1,

So E∞{Tparallel} ≥ 10000

17 / 1

Example

Tparallel = min{Tmix(0.02,21.2),Tmix(0.33,87.7)}

Table : Thresholds m1 = 200, N = 400.

Procedure b Monte Carlo ARLTsimple 14.34 10000

Tmc 0.77 10000Tmix(0.1) 44.7 10000

Tmix(0.02) 21.2 20000Tmix(0.33) 87.7 20000

Tparallel 10000

18 / 1

Detection Delay: Parallel-1

Table : Comparison of Parallel and Single Procedures

p µn = µ FI Tsimple Tmc Tmix (0.1) Tparallel0.1 0.7 19.6 187.0 5.7 6.5 6.4

0.005 1.0 2.0 445.0 48.5 27.1 22.90.005 0.7 1.0 523.0 94.7 54.5 45.80.25 0.3 7.5 197.5 8.4 12.0 10.5

FI = Fisher Information, µ2Np/σ2

19 / 1

Parallel Procedure - 2

I Drawback of Parallel-1: no closed form ARLI Parallel-2: linearly combine the statistics of Tmix(pi ,bi):

S(pi)Sparallel = c1S(p1) + c2S(p2)

I Parallel-2

Tparallel,2 = inf{t : S(parallel) > b}

20 / 1

Summary

I Exploit sparsity in high-dimensional change-point detectionI Parallel procedure further boost performanceI Future work: how to extract detection statistic from high

dimensional dataOngoing work:

21 / 1

ARL of Parallel-2

I Close-form approximation for the average run-lengthN →∞ and b →∞ with b/N fixed, m1 = o(br ) for somepositive integer r , and define θ by ψ̇(θ) = b/N

E∞{T2} ∼ f (N, θ, p0)/

∫ [2Nγ(θ)/m0]1/2

[2Nγ(θ)/m1]1/2yν2(y)dy .

g(x , c1, c2,p1,p2) = c1 log(1− p1 + p1 exp[(x+)2/2]) +c2 log(1− p2 + p2 exp[(x+)2/2]),ψ(θ) = logE{exp[θg(U, c1, c2,p1,p2)]}

22 / 1

Table : Thresholds N = 400, m1 = 200.

Procedure b Monte Carlo ARLTmix(0.1) 44.7 10000Sparallel 46.24 10000

Sparallel = S(0.02) + 0.3S(0.33)

23 / 1

Table : Delay, N = 400, m1 = 200

Procedure p µ Expected Detection DelayTmix (0.1, b) 0.1 0.7 6.48

0.005 1.0 27.080.005 0.7 54.490.25 0.3 11.96

Sparallel 0.2 0.7 4.480.005 1.0 26.400.005 0.7 49.790.25 0.3 13.81

Improvement not as significant as Parallel procedure.

24 / 1

Parallel Sequential Multi-Sensor Change-Point Detection

Documents

Parallel Mining of Closed Sequential Patterns

Adaptive Sequential Posterior Simulators for Massively Parallel

Distributed and Parallel High Utility Sequential Pattern ...individual.utoronto.ca/_zihayatm/Papers/IEEEBigData.pdfand distributed high utility sequential pattern mining algo-rithm

Sequential Adaptive Sensor Management – A. Hero Sequential: only one sensor deployed at a time Adaptive: next sensor selection based on present and past

Dryad: Distributed Data-Parallel Programs from Sequential

Dryad: Distributed Data-Parallel Programs from Sequential … · Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks Michael Isard Microsoft Research, Silicon

Shared Memory Programming with OpenMP · Sequential code •In OpenMP, all code outside parallel regions, or inside MASTER and SINGLE directives is sequential. •Time spent in sequential

Chapter 1 Sequential Anomaly Detection Using Wireless ...web.eecs.utk.edu/~leparker/publications/AnomalyDetection_WSN... · Sequential Anomaly Detection Using Wireless Sensor Networks

Algorithms Sequential and Parallel—A Unified Approach

Sequential and Parallel Abstract Machines for Optimal ... · Sequential and Parallel Abstract Machines for Optimal Reduction Marco Pedicini (Roma Tre University) in collaboration

Communication-optimal parallel and sequential QR and LU ... · 9.3 Householder QR . . . . . . . . . . . . . . . . . . . . . . . . . . .42 9.3.1 Parallel Householder QR . . . . .

Dryad: Distributed Data-Parallel Programs from Sequential Building

Solving Sequential Problems in Parallel · 2016-11-23 · Solving Sequential Problems in Parallel, Rev. 0 Freescale Semiconductor 3 RSA Public-key Cryptography In public-key cryptography,

Scheduling and ordering issues in Sequential Task Flow ...solhar.gforge.inria.fr/lib/...issues_stf_parallel_multifrontal_method.pdf · Sequential Task Flow parallel multifrontal methods

Communication-optimal parallel and sequential QR and LU ......Communication-optimal parallel and sequential QR and LU factorizations James Demmel, Laura Grigori, Mark Hoemmen, and

Communication-avoiding parallel and sequential QR factorizations

Processing Sequential Sensor Data

SEQUENTIAL AND PARALLEL ALGORITHMS FOR CAUSAL EXPLANATION WITH

Sequential and Parallel Algorithms for the Eneralized Maximum Subarray Problem

A Unified Approach to Sequential and Parallel Algorithms · Web viewAlgorithms Sequential and Parallel: A Unified Approach Russ Miller Laurence Boxer Prentice-Hall, Inc., New Jersey,