Cluster-Centric Anomaly Detection and Characterization in Spatial Time Series Dr. Hesam Izakian October 2014




Outline

Spatial time series
Problem formulation
Anomaly detection in spatial time series: questions
Overall scheme of the proposed method
  o Time series segmentation
  o Spatial time series clustering
  o Assigning anomaly scores to clusters
  o Visualizing the propagation of anomalies
An outbreak detection scenario
Application
Conclusions


Spatial time series

Structure of data
  o A set of spatial coordinates
  o One or more time series for each point

Examples
  o Daily average temperature at different climate stations
  o Stock market indexes in different countries
  o Number of absent students in different schools
  o Number of emergency department visits in different hospitals
  o Measured signals in different parts of the brain


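Problem formulation

There are N spatial time series

$$\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N, \quad \mathbf{x}_i \in \mathbb{R}^{n+r}$$

$$\mathbf{x}_i = [\mathbf{x}_i(s) \mid \mathbf{x}_i(t)]^T$$

$$\mathbf{x}_i(s) = (x_{i1}(s), x_{i2}(s), \dots, x_{ir}(s)), \quad \text{usually } r = 2$$

$$\mathbf{x}_i(t) = (x_{i1}(t), x_{i2}(t), \dots, x_{in}(t))$$

Objective: find a spatial neighborhood of data, made up of $p$ of the $N$ points,

$$\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_p$$

in a time interval

$$t_q, t_{q+1}, \dots, t_{q+l}, \quad q > 0, \quad l < n, \quad q + l \le n$$

containing a high level of unexpected changes.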


Anomaly detection in spatial time series: questions

Spatial neighborhood of data
  o Size of neighborhood
  o Overlapping neighborhoods

Unexpected changes (anomalies)
  o What kinds of changes are expected or not expected?
  o How should the level of unexpected changes be evaluated?

Anomaly visualization

Anomaly characterization
  o What was the source of the anomaly?
  o How does the anomaly propagate over time?


Overall scheme of the proposed method

Revealing the structure of data in various time intervals
Comparing the revealed structures

[Figure: processing pipeline. Spatial time series data -> sliding window -> windows $W_1, W_2, \dots, W_K$ -> spatial time series clustering -> partition matrices $U_1, U_2, \dots, U_K$ -> anomaly scores $s_1, s_2, \dots, s_K$ and fuzzy relations $R_{1,2}, R_{2,3}, \dots, R_{K-1,K}$]


Time series part segmentation

Sliding window
  o Spatio-temporal subsequences
  o Local view of time series part
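The segmentation step can be sketched as follows. This is a minimal illustration, not the author's implementation: the function name `spatio_temporal_subsequences`, the array layout (one row per spatial point), and the sample values (5 points, 100 time steps, window length 20, step 10) are all assumptions made for the example.

```python
import numpy as np

def spatio_temporal_subsequences(coords, series, length, step):
    """Cut every time series into windows and prepend its coordinates.

    coords: (N, r) spatial coordinates; series: (N, n) time series.
    Returns a list W_1..W_K of arrays with shape (N, r + length).
    """
    n = series.shape[1]
    windows = []
    for start in range(0, n - length + 1, step):
        segment = series[:, start:start + length]    # local view of the time series part
        windows.append(np.hstack([coords, segment])) # [spatial part | temporal part]
    return windows

# Illustrative data (assumed): 5 points, 100 time steps.
rng = np.random.default_rng(0)
coords = rng.random((5, 2))
series = rng.random((5, 100))
W = spatio_temporal_subsequences(coords, series, length=20, step=10)
print(len(W), W[0].shape)
```

Each element of `W` holds one window for all points, so it can be clustered on its own.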



Fuzzy C-Means clustering: visual illustration

[Figure: ten data points forming two clusters, A and B]

Binary (0/1) membership of the ten points, one row per cluster:

1    1    1    1    1    0    0    0    0    0
0    0    0    0    0    1    1    1    1    1

Fuzzy membership of the ten points, one row per cluster:

0.91 0.96 1.00 0.95 0.70 0.30 0.05 0.00 0.04 0.09
0.09 0.04 0.00 0.05 0.30 0.70 0.95 1.00 0.96 0.91


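Fuzzy C-Means clustering…

Partitions N data

$$\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N, \quad \mathbf{x}_i \in \mathbb{R}^d$$

Into $c$ clusters, $1 < c < N$

Result: cluster prototypes and a partition matrix

$$\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_c, \quad \mathbf{v}_i \in \mathbb{R}^d$$

$$U = \begin{bmatrix} u_{11} & \cdots & u_{1N} \\ \vdots & \ddots & \vdots \\ u_{c1} & \cdots & u_{cN} \end{bmatrix}, \quad u_{ik} \in [0, 1], \quad \sum_{i=1}^{c} u_{ik} = 1 \ \ \forall k$$

(row $i$ corresponds to prototype $\mathbf{v}_i$, column $k$ to data point $\mathbf{x}_k$)

Objective function:

$$J = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \|\mathbf{v}_i - \mathbf{x}_k\|^2$$

Minimization:

$$\mathbf{v}_i = \frac{\sum_{k=1}^{N} u_{ik}^{m} \, \mathbf{x}_k}{\sum_{k=1}^{N} u_{ik}^{m}}$$

$$u_{ik} = \frac{1}{\sum_{j=1}^{c} \left( \dfrac{\|\mathbf{v}_i - \mathbf{x}_k\|}{\|\mathbf{v}_j - \mathbf{x}_k\|} \right)^{2/(m-1)}}$$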
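The alternating prototype and membership updates of standard Fuzzy C-Means can be sketched in NumPy as below. This is an illustrative sketch only: the function name `fcm`, the random initialisation, the stopping tolerance, the default m = 2 and the toy data are assumptions, not details taken from the slides.

```python
import numpy as np

def fcm(X, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard Fuzzy C-Means. X: (N, d). Returns U: (c, N) and V: (c, d)."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0, keepdims=True)            # each column sums to 1
    for _ in range(iters):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)             # prototype update
        dist = np.linalg.norm(V[:, None, :] - X[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)              # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)             # membership update
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V

# Toy data (assumed): two well-separated groups of five points each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (5, 2)), rng.normal(10.0, 0.1, (5, 2))])
U, V = fcm(X, c=2)
labels = U.argmax(axis=0)
print(np.round(U, 2))
```

The membership update is written as a normalised inverse distance, which is algebraically the same as the ratio form of the update rule.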


Spatial time series clustering

Reveals available structure within data
  o In the form of partition matrices

Challenges
  o Different sources: spatial part vs. temporal part
  o Different dimensionality in each part
  o Different structure within each part


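Spatial time series clustering…

In spatial time series, we define

$$d_\lambda^2(\mathbf{v}_i, \mathbf{x}_k) = \|\mathbf{v}_i(s) - \mathbf{x}_k(s)\|^2 + \lambda \, \|\mathbf{v}_i(t) - \mathbf{x}_k(t)\|^2, \quad \lambda \ge 0$$

Adopted FCM objective function

$$J = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, d_\lambda^2(\mathbf{v}_i, \mathbf{x}_k)$$

Characteristics
  o When λ = 0: only the spatial part of the data is used in clustering
  o A higher value of λ: a higher impact of the time series part in clustering
  o Optimal value of λ: optimal impact of each part in clustering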
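A small sketch of the λ-weighted distance and the resulting objective may make the role of λ concrete. The function names, the convention that the first `r` columns hold the spatial part, and the numbers in the example are assumptions for illustration.

```python
import numpy as np

def augmented_sq_dist(V, X, r, lam):
    """Squared distance: spatial part plus lam times temporal part.

    V: (c, r + n) prototypes, X: (N, r + n) data; the first r columns are
    assumed to be the spatial coordinates. Returns an array of shape (c, N).
    """
    ds = ((V[:, None, :r] - X[None, :, :r]) ** 2).sum(axis=2)
    dt = ((V[:, None, r:] - X[None, :, r:]) ** 2).sum(axis=2)
    return ds + lam * dt

def objective(U, V, X, r, lam, m=2.0):
    """FCM objective J evaluated with the augmented distance."""
    return float(((U ** m) * augmented_sq_dist(V, X, r, lam)).sum())

# One prototype and one data point with r = 2 spatial coordinates.
V = np.array([[0.0, 0.0, 1.0, 1.0]])
X = np.array([[3.0, 4.0, 1.0, 3.0]])
d_spatial_only = augmented_sq_dist(V, X, r=2, lam=0.0)   # spatial part only
d_both = augmented_sq_dist(V, X, r=2, lam=2.0)           # temporal part weighted by 2
print(d_spatial_only, d_both)
```

With λ = 0 the temporal columns drop out entirely, which matches the first characteristic listed above.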


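Spatial time series clustering: optimal value of λ

Each data point is reconstructed from the prototypes and its membership degrees:

$$\hat{\mathbf{x}}_k = \frac{\sum_{i=1}^{c} u_{ik}^{m} \, \mathbf{v}_i}{\sum_{i=1}^{c} u_{ik}^{m}}$$

The reconstruction error for a given λ is

$$E(\lambda) = \sum_{k=1}^{N} \|\mathbf{x}_k - \hat{\mathbf{x}}_k\|^2$$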
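The reconstruction error can be computed directly from a partition matrix and its prototypes, as in the sketch below. The helper names and the tiny hand-checkable example are assumptions. The outer search over λ is not shown: it would rerun clustering for each candidate value and keep the one with the smallest error.

```python
import numpy as np

def reconstruct(U, V, m=2.0):
    """Rebuild each point as a membership-weighted mean of the prototypes.

    U: (c, N) partition matrix, V: (c, d) prototypes. Returns (N, d).
    """
    Um = U ** m
    return (Um.T @ V) / Um.sum(axis=0)[:, None]

def reconstruction_error(X, U, V, m=2.0):
    """Sum of squared differences between data and reconstructions."""
    return float(((X - reconstruct(U, V, m)) ** 2).sum())

# Hand-checkable example (assumed values): two prototypes, two points.
V = np.array([[0.0, 0.0], [2.0, 2.0]])
U = np.array([[1.0, 0.5], [0.0, 0.5]])
X = np.array([[0.0, 0.0], [1.0, 2.0]])
X_hat = reconstruct(U, V)
E = reconstruction_error(X, U, V)
print(X_hat, E)
```

In this example the first point is rebuilt exactly and the second is rebuilt as the midpoint of the two prototypes.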



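Assigning anomaly scores to clusters in different time windows

Assign an anomaly score to each single subsequence based on historical data

$$f_k, \quad k = 1, 2, \dots, N$$

Aggregating anomaly scores inside revealed clusters (window $W_j$)

$$s_j(\mathbf{v}_i) = \frac{\sum_{k=1}^{N} u_{ik} \, f_k}{\sum_{k=1}^{N} u_{ik}}, \quad i = 1, 2, \dots, c$$

[Figure: window $W_2$ with its partition matrix $U_2$ and prototypes $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_c$]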
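The aggregation step is a membership-weighted average, which the sketch below shows for one window. The function name and the sample numbers are assumptions, and the per-subsequence scores `f` are taken as given because the slides do not specify how they are derived from historical data.

```python
import numpy as np

def cluster_anomaly_scores(U, f):
    """Membership-weighted mean of subsequence scores for each cluster.

    U: (c, N) partition matrix of one window; f: (N,) subsequence scores.
    Returns an array of c cluster scores.
    """
    return (U @ f) / U.sum(axis=1)

# Assumed example: three subsequences, two clusters.
U = np.array([[1.0, 0.5, 0.0],
              [0.0, 0.5, 1.0]])
f = np.array([0.2, 0.4, 0.8])
scores = cluster_anomaly_scores(U, f)
print(scores)
```

A subsequence with a high score raises the score of every cluster in proportion to its membership in that cluster.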



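Visualizing the propagation of anomalies: fuzzy relations

Objective: quantifying relations between clusters

Window $W_1$ with partition matrix $U_1$ ($c_1$ clusters):

$$U_1 = \begin{bmatrix} ux_{1,1} & \cdots & ux_{1,k} & \cdots & ux_{1,N} \\ ux_{2,1} & \cdots & ux_{2,k} & \cdots & ux_{2,N} \\ \vdots & & \vdots & & \vdots \\ ux_{c_1,1} & \cdots & ux_{c_1,k} & \cdots & ux_{c_1,N} \end{bmatrix}, \quad \mathbf{ux}_k = [ux_{1,k}, ux_{2,k}, \dots, ux_{c_1,k}]^T$$

Window $W_2$ with partition matrix $U_2$ ($c_2$ clusters):

$$U_2 = \begin{bmatrix} uy_{1,1} & \cdots & uy_{1,k} & \cdots & uy_{1,N} \\ uy_{2,1} & \cdots & uy_{2,k} & \cdots & uy_{2,N} \\ \vdots & & \vdots & & \vdots \\ uy_{c_2,1} & \cdots & uy_{c_2,k} & \cdots & uy_{c_2,N} \end{bmatrix}, \quad \mathbf{uy}_k = [uy_{1,k}, uy_{2,k}, \dots, uy_{c_2,k}]^T$$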


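Visualizing the propagation of anomalies…

Objective function to construct the relation

$$R = [r_{i,j}], \quad i = 1, 2, \dots, c_1, \quad j = 1, 2, \dots, c_2, \quad r_{i,j} \in [0, 1]$$

$$Q = \sum_{k=1}^{N} \|\mathbf{ux}_k \circ R - \mathbf{uy}_k\|^2 = \sum_{k=1}^{N} \sum_{j=1}^{c_2} \left( \max_{i = 1, 2, \dots, c_1} \left( ux_{i,k} \; t \; r_{i,j} \right) - uy_{j,k} \right)^2$$

where $t$ denotes a t-norm.

Optimization

$$r_{s,t}(iter + 1) = r_{s,t}(iter) - \alpha \, \frac{\partial Q}{\partial r_{s,t}(iter)}$$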
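One way to sketch the gradient-based construction of the relation is shown below. Several choices here are assumptions rather than details from the slides: the product is used as the t-norm, every entry of R starts at 0.5, the learning rate is fixed, and R is clipped to [0, 1] after each step. The example data are built so that the true relation is known.

```python
import numpy as np

def compose_max_product(UX, R):
    """Max-product composition. UX: (c1, N), R: (c1, c2). Returns (c2, N)."""
    return (UX[:, None, :] * R[:, :, None]).max(axis=0)

def learn_relation(UX, UY, lr=0.1, iters=200, r0=0.5):
    """Gradient descent on Q = sum_k sum_j (max_i(ux_ik * r_ij) - uy_jk)^2."""
    c1, N = UX.shape
    c2 = UY.shape[0]
    R = np.full((c1, c2), r0)
    for _ in range(iters):
        prod = UX[:, None, :] * R[:, :, None]    # (c1, c2, N)
        winner = prod.argmax(axis=0)             # index i attaining the max
        err = prod.max(axis=0) - UY              # (c2, N)
        grad = np.zeros_like(R)
        for j in range(c2):
            for k in range(N):
                s = winner[j, k]                 # only the winning entry gets gradient
                grad[s, j] += 2.0 * err[j, k] * UX[s, k]
        R = np.clip(R - lr * grad, 0.0, 1.0)     # keep r_ij inside [0, 1]
    return R

# Assumed example: crisp memberships, so the true relation is recoverable.
UX = np.eye(2)
R_true = np.array([[0.2, 0.8], [0.9, 0.1]])
UY = compose_max_product(UX, R_true)
R = learn_relation(UX, UY)
print(np.round(R, 3))
```

Because of the max operator only one entry per (j, k) pair receives a gradient contribution, so this is a subgradient step wherever the maximum is tied.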


Example

An outbreak
  o In the southern part of Alberta
  o Using NAADSM for 100 days


Example…

A sliding window is used
  o Length: 20
  o Movement: 10

[Figure: generated spatio-temporal subsequences]



Application

Implemented for Agriculture and Rural Development (Government of Alberta)
Using KNIME (Konstanz Information Miner)
Animal health surveillance in Alberta
Anomaly detection
Data visualization


Conclusions

A framework for anomaly detection and characterization in spatial time series is developed

A sliding window is used to generate a set of spatio-temporal subsequences

Clustering is used to discover the available structure within the spatio-temporal subsequences

An anomaly score is assigned to each revealed spatio-temporal cluster

A fuzzy relation technique is proposed to quantify the relations between clusters in successive time steps


Thank you