
Anomaly Detection on ITS Data via View Association

Junaidillah Fadlil, Hsing-Kuo Pao and Yuh-Jye Lee

National Taiwan University of Science & Technology (Taiwan Tech)

KDD 2013 Workshop on Outlier Detection and Description, August 11th, Chicago

Outline

Problem and motivation: anomaly detection on ITS

Related work

Datasets & Ground Truth

Detection by view association

Results of batch learning and online learning

Discussion

Conclusion

Problem and Motivation

Anomaly detection in sensor-deployed intelligent transportation systems (ITS)

Anomalous events in transportation systems include:

Traffic accidents

Emergency car passing

Harsh weather conditions, etc.

Focusing on traffic accidents in this study

Plan to automatically create a police report for the anomalous events

Related Work…

Chawla et al. modeled the road structure as a directed graph and then used PCA to detect anomalies. [Chawla et al., 2012]

Blum & Mitchell developed the co-training algorithm, which uses different views to label data for semi-supervised learning. [Blum & Mitchell, 1998]

Fu et al. proposed a hierarchical clustering method to classify vehicle motion trajectories in real traffic video. [Fu et al., 2005]

Piciarelli et al. proposed building a tree of clusters in an online fashion from trajectory data for behavior analysis. [Piciarelli et al., 2006]

Agovic et al. employed a manifold embedding method to examine anomalous cargo. [Agovic et al., 2009]

Nevertheless, anomaly detection is important in the big-data era!

Datasets and Ground Truth

Datasets

Mobile Century Project Data (Century Data)

Collected on February 8, 2008

Including both PeMS and GPS data

http://traffic.berkeley.edu/

Short-term but with more variety

California DOT Website Data (Caltrans Data)

Data recorded since 1993

Mainly PeMS data

http://pems.dot.ca.gov/

Long-term but with fewer data types

Focusing on the data of December 14, 2007 for now

Datasets (cont’d)

PeMS Data (Loop detector)

Computing the temporal mean speed (TMS) for every 5 (or 30) minutes

We associate each influence area with a detector station

GPS Data (Trajectory)

Stored by mobile devices

Each mobile phone records position (latitude & longitude) and velocity every 3 seconds

1388 trajectories collected
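
As a rough illustration of the temporal mean speed (TMS) computation, the sketch below averages speed readings inside fixed windows. The record layout (a timestamp in seconds paired with a speed) and the name `temporal_mean_speed` are assumptions made for this example; the slides do not specify the PeMS record format.

```python
from collections import defaultdict

def temporal_mean_speed(readings, window_minutes=5):
    """Average speed per fixed time window.

    readings: iterable of (timestamp_seconds, speed) pairs (hypothetical layout).
    Returns {window_start_seconds: mean_speed}.
    """
    window = window_minutes * 60
    sums = defaultdict(float)
    counts = defaultdict(int)
    for t, speed in readings:
        start = (int(t) // window) * window  # start of the window containing t
        sums[start] += speed
        counts[start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}

# Made-up readings: three in the first 5-minute window, two in the second.
readings = [(0, 60.0), (120, 62.0), (299, 58.0), (300, 30.0), (450, 34.0)]
tms = temporal_mean_speed(readings, window_minutes=5)
```

Passing `window_minutes=30` would give the coarser interval mentioned for the Caltrans data.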

Ground Truth

Detecting anomalies = accidents

Accident in Mobile Century Data

Time: 10:34 AM, February 8, 2008

Location: postmile 26.641

Consequence: a traffic congestion of 34 mins

Accident in Caltrans Data

Time: 1 PM, December 14, 2007

Location: postmile 26.641

Consequence: a traffic congestion of 38 mins

More anomaly data is needed!

Also focusing more on other types of anomalies

Detection Scheme

[Diagram: each view (view m, view n) passes through Feature Extraction, then Data Representation (Isomap), then Clustering (Hierarchical Clustering), and produces a Report; the per-view Reports are combined into the Final Report]

Views and Feature Extraction

View | Data Source | Features
Flow | Century PeMS | 1. mean of flow; 2. std. of flow; 3. skewness of flow; 4. mean of Δflow; 5. std. of Δflow; 6. skewness of Δflow
Speed | Century GPS | 1. mean of speed; 2. std. of speed; 3. skewness of speed; 4. mean of Δspeed; 5. std. of Δspeed; 6. skewness of Δspeed
Duration | Century GPS | 1. mean of duration; 2. std. of duration; 3. skewness of duration; 4. total duration
Flow | Caltrans PeMS | similar to Century PeMS
Speed | Caltrans PeMS | similar to Century GPS
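
The six flow (or speed) features listed above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes Δ means the difference between consecutive samples and uses population (not sample) standard deviation and skewness, neither of which is stated in the slides. The function names are hypothetical.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def std(xs):
    # Population standard deviation (an assumption; the slides do not say which).
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def skewness(xs):
    # Third standardized moment; defined as 0 for a constant series.
    m, s = mean(xs), std(xs)
    if s == 0:
        return 0.0
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)

def six_features(series):
    """mean/std/skewness of the series and of its consecutive differences."""
    deltas = [b - a for a, b in zip(series, series[1:])]
    return [mean(series), std(series), skewness(series),
            mean(deltas), std(deltas), skewness(deltas)]

# Made-up flow readings with one sudden jump at the end.
flow = [10.0, 12.0, 11.0, 13.0, 40.0]
features = six_features(flow)
```

Each time window of a view would yield one such feature vector, which then becomes a point for the data representation step.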

Data Representation

Manifold learning approach

Utilizing Isomap in this work:

Construct the neighborhood graph using kNN

Compute the shortest path between each pair of points, e.g., by Dijkstra's algorithm

Apply the multidimensional scaling (MDS) method for visualization
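
The first two Isomap steps above (kNN graph, then pairwise shortest paths) can be sketched in plain Python. This is an illustrative sketch only: the helper names are hypothetical, the toy example uses k = 1 rather than the k used in the experiments, and the final MDS step is omitted here because it needs an eigendecomposition that is usually delegated to a numerical library.

```python
import heapq
import math

def knn_graph(points, k):
    """Undirected kNN graph as an adjacency dict {i: {j: euclidean_distance}}."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted((math.dist(points[i], points[j]), j)
                       for j in range(n) if j != i)
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph symmetric
    return graph

def dijkstra(graph, source):
    """Shortest-path (geodesic) distance from source to every node."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def geodesic_distances(points, k):
    """Matrix of graph shortest-path distances; inf if two points are disconnected."""
    graph = knn_graph(points, k)
    n = len(points)
    rows = []
    for i in range(n):
        d = dijkstra(graph, i)
        rows.append([d[j] for j in range(n)])
    return rows

# Three points on an L shape: the geodesic from the first to the last point
# follows the corner (length 2) instead of the straight line (length sqrt(2)).
D = geodesic_distances([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)], k=1)
```

The resulting matrix `D` is what MDS would take as input to produce the low-dimensional coordinates.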

Data Clustering

Given: projected data in low-dimensional space

A hierarchical clustering (singular distance) approach

Splitting data into two clusters: normal & anomalous groups!

Applying the 90-10 (or x% and (100 – x)%) principle in the clustering process (90% normal data and 10% anomalous data)
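
One possible reading of this clustering step is sketched below. It is not the authors' implementation: it assumes single-linkage agglomerative clustering (one interpretation of "singular distance"), treats the smaller of the two final clusters as anomalous, and applies the 90-10 principle as a strict upper bound on the anomalous fraction. All function names are hypothetical.

```python
import math

def single_linkage_two_clusters(points):
    """Naive agglomerative clustering with single linkage, stopped at two clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 2:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    return clusters

def label_anomalies(points, max_anomaly_fraction=0.10):
    """Return indices of the smaller cluster if it respects the fraction bound."""
    normal, anomalous = sorted(single_linkage_two_clusters(points),
                               key=len, reverse=True)
    if len(anomalous) / len(points) > max_anomaly_fraction:
        return set()  # assumption: a split that violates the principle is rejected
    return set(anomalous)

# Nine made-up points close together plus one far away.
points = [(0.1 * i, 0.0) for i in range(9)] + [(5.0, 5.0)]
anomalies = label_anomalies(points)
```

Whether the bound should be strict like this or softer is an open design choice; relaxing `max_anomaly_fraction` is the simplest way to experiment with it.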

Report Generation

Based on the result of data clustering

One “raw” report generated (labeled) for each single view

Report association done by intersecting two or more reports, one from each view, to generate the final report

Compared to ground truth for evaluation

More complicated view association mechanisms can be applied

By view association, we may be able to automatically decide the parameter set that is necessary for unsupervised learning methods such as anomaly detection!

View association can be done in different data representation spaces!
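
The intersection-based association described above can be sketched with Python sets. The representation of a report as a set of flagged time-slot indices is an assumption for illustration, and the slot numbers below are made up, not results from the study.

```python
from functools import reduce

def associate_reports(*reports):
    """Intersect per-view reports; each report is a set of flagged time slots."""
    if not reports:
        return set()
    return reduce(set.intersection, (set(r) for r in reports))

# Hypothetical per-view "raw" reports (time-slot indices flagged as anomalous).
flow_report = {3, 7, 12}
speed_report = {7, 12, 20}
duration_report = {12, 31}

final_two = associate_reports(flow_report, speed_report)
final_three = associate_reports(flow_report, speed_report, duration_report)
```

Intersection keeps only the slots that every view agrees on, so adding views can only shrink the final report; a voting rule would be one example of a more permissive association mechanism.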

Batch mode detection

Experimental Settings

k_iso = 5 (k for kNN in Isomap)

Intrinsic dimensionality = 2

Combined views (experiments via Century Data):

- Flow & speed

- Speed & duration

- Flow & duration

- Flow, speed and duration

Dataset | View | Data Interval | Station ID (postmile)
Mobile Century Project Data | Century PeMS flow | 5 mins, 1 station | 400488 (24.007), 401561 (24.477), 400611 (24.917), 400284 (25.767), 400041 (26.027), 400165 (26.641)
Mobile Century Project Data | Century GPS speed | 1 station |
Mobile Century Project Data | Century GPS duration | 1 station |
Caltrans PeMS | Caltrans PeMS flow, speed | 30 mins, 1 station | 400165 (26.641)

Via the Mobile Century Project Data

Our final report indicates an accident near station 400165 (26.641), which matches the ground truth

Ground truth: 10:34 AM, postmile 26.641, a duration of 34 mins

Flow and speed view combination gives us the best result

Result from all view combinations:

Flow and speed : 10:35 AM

Flow and duration : 10:50 AM

Speed and duration : 10:50 AM

Flow, speed and duration : 10:50 AM
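
To make the comparison concrete, the small computation below measures how far each reported time is from the 10:34 AM ground truth. It assumes the listed times are the detection times of each view combination; the dictionary keys are labels chosen for this example.

```python
from datetime import datetime

ground_truth = datetime.strptime("10:34", "%H:%M")

# Times taken from the list above (all AM).
detected = {
    "flow & speed": "10:35",
    "flow & duration": "10:50",
    "speed & duration": "10:50",
    "flow, speed & duration": "10:50",
}

# Offset of each detection from the ground truth, in whole minutes.
delay_minutes = {
    views: int((datetime.strptime(t, "%H:%M") - ground_truth).total_seconds() // 60)
    for views, t in detected.items()
}
best = min(delay_minutes, key=delay_minutes.get)
```

Under that assumption the flow and speed combination is 1 minute off, while every combination involving the duration view is 16 minutes off.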

Via the Mobile Century Project Data (cont’d)

[Figure: two-dimensional embeddings of each view, with points labeled normal or anomaly. Century GPS panels: Speed view (GPS), Duration view (GPS). Century PeMS panels: Flow view (PeMS) for stations 400041, 400284, 400488, 401561, 400611 and 400165.]

Via the Caltrans PeMS Data

The speed and flow view combination also gives the best result!

Our final report records that the accident happened at 1:00 PM, which matches the ground truth

The anomalous points are well separated from the normal points in the low-dimensional representation

[Figure: low-dimensional embeddings of the Speed view and the Flow view, with points labeled Normal or Anomaly.]

Online mode detection

Experimental Settings (Online detection)

Online detection via Caltrans PeMS

Focusing on the accident on December 14, 2007

Using speed view to detect anomalous events

Using the previous 10 days’ data

Only including data at the same time of day as the test data

Using previous days’ data without accidents

Choosing the previous days’ data for training:

Only using working days’ data

Including weekend days’ data (1-day weekend and 2-day weekend)

Detecting on an hourly basis
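
The selection of training days can be sketched as below. This is one hedged interpretation: it reads "previous 10 days" as the 10 most recent days that pass the filters, treats Monday to Friday as working days, and uses a hypothetical function name and parameters. Restricting each chosen day to the same hour as the test data would be a further filter on that day's records and is not shown.

```python
from datetime import date, timedelta

def training_days(test_day, n_days=10, accident_days=(), working_days_only=True):
    """Pick the most recent n_days before test_day that satisfy the filters."""
    chosen = []
    day = test_day - timedelta(days=1)
    while len(chosen) < n_days:
        is_working_day = day.weekday() < 5  # Monday=0 ... Friday=4
        if day not in accident_days and (is_working_day or not working_days_only):
            chosen.append(day)
        day -= timedelta(days=1)
    return chosen

# Test day from the slides: Friday, December 14, 2007.
days = training_days(date(2007, 12, 14))
days_with_weekends = training_days(date(2007, 12, 14), working_days_only=False)
```

With `working_days_only=True` the window reaches back to November 30, 2007; allowing weekends keeps it within the ten preceding calendar days.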

Online Detection via the Caltrans PeMS Data

Using previous days’ data to detect anomalies in the current moment

Whether the training data come from working days or weekend days influences the detection results

[Figure: three embeddings (working days, one-day weekend, two-day weekend) with points labeled Normal or Anomaly. The “red squares” are the predicted anomalies; the “red crosses” are the real anomalies.]

Final Report

Working days | One-day weekend | Two-day weekend
12/14/2007 13:00 | 12/9/2007 13:00 | 12/8/2007 13:00, 12/9/2007 13:00

Discussion

Need to have rigorous criteria to judge “qualified” views for association

Views provide trustworthy information

Cannot allow contextual anomalies across views

Can allow contextual anomalies within a view

Using Isomap mainly for data representation and visualization

View association can be applied with or without Isomap

Deriving a better x% & (100 – x)% principle

A strict or soft principle?

Give more weight to the principle or to the data?

Conclusion

Proposed an ITS anomaly detector for traffic analysis

Detecting anomalies based on view association

Can automatically generate an anomaly report

Claimed benefits compared to other detectors:

The proposed method needs little parameter tuning

Can detect different types of anomalies given different training data

The method can work efficiently if implemented in parallel

Acknowledgement

The research was supported by

Taiwan National Science Council under Grant NSC101-2218-E-011-009

Taiwan National Science Council, National Taiwan University and Intel Corporation under Grants NSC100-2911-I-002-001 and NTU101R7501-1 (Intel-NTU Connected Context Computing Center - http://ccc.ntu.edu.tw/)

Q & A