A Novel Approach to Novelty Detection: Voronoi Tessellation
[email protected] Science Division
NASA Ames Research Center
Collaborator: Nikunj Oza, NASA-Ames Research Center, ICPureSense, Inc.
Machine Learning Seminar
The Basic Ideas
Nonparametric Density Estimation
Voronoi Tessellation
  Voronoi Cells as Point Surrogates
  1/Area of cell ~ local point density
  Cell geometry ~ local density gradient
Tessellate training points plus 1 test point
If the Voronoi cell assigned to the test point is an “edge” cell, the test point is an outlier; otherwise it is “normal”
Modes of Operation (1)
Static training data and test data
Modes of Operation (2)
Training data = all past data
Test data = one new data point
Modes of Operation (3)
Training data = past data of fixed size
Test data = one new data point
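The third mode (a fixed-size window of past data plus one new test point) might look roughly like the sketch below. The synthetic `stream`, the `window_size` of 50, and all variable names are assumptions made for illustration; only the edge-cell test itself follows the “infinite vertex” check used in the talk's MATLAB code.

```matlab
% Sketch of mode (3): slide a fixed-size training window over a stream
% and test each new point for an unbounded ("edge") Voronoi cell.
rng(0);                                  % reproducible illustrative data
stream = [randn(200, 2); 8 8];           % hypothetical stream; last point lies far away
window_size = 50;                        % assumed size of the training window
n = size(stream, 1);
is_outlier = false(n, 1);

for t = window_size + 1 : n
    window = stream(t - window_size : t - 1, :);   % most recent past points
    [~, cells] = voronoin([window; stream(t, :)]); % tessellate window plus test point
    % Vertex #1 returned by voronoin is the point at infinity, so a cell
    % that references it is unbounded, i.e. an edge cell.
    is_outlier(t) = any(cells{end} == 1);
end
```

Mode (2) would follow from the same loop by taking `window = stream(1 : t - 1, :)`, at the cost of a tessellation that grows with every new point.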
Voronoi Tessellation of data in any dimension
Construct Voronoi cells to represent local photon density:
density ~ 1 / cell area
Voronoi cells also represent local photon density gradients
…
The Voronoi cells are a local representation of the data …
Selecting the smallest Voronoi cells yields the regions of highest photon density …
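As a rough illustration of the density ~ 1 / cell area idea, the sketch below computes the area of every bounded cell and picks the points with the smallest cells. The random `pts`, the choice of 10 densest points, and the variable names are assumptions, not part of the talk; `convhulln` is used for the area because Voronoi cells are convex and its second output is the enclosed area in 2-D (volume in higher dimensions).

```matlab
% Sketch: estimate local density as 1 / (Voronoi cell area).
rng(1);
pts = rand(200, 2);                      % hypothetical 2-D data, rows are points
[V, C] = voronoin(pts);                  % V(1,:) is the vertex at infinity

cell_area = inf(size(pts, 1), 1);        % edge cells keep infinite area
for i = 1:numel(C)
    idx = C{i};
    if all(idx ~= 1)                     % bounded cell only
        [~, cell_area(i)] = convhulln(V(idx, :));
    end
end

density = 1 ./ cell_area;                % edge cells get density 0
[~, order] = sort(cell_area);            % smallest cells first
densest = pts(order(1:10), :);           % points in the highest-density regions
```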
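MATLAB code
% Test the abnormal data: tessellate the training set plus one test point at a time.
% Assumes rows are points: train_data is num_use-by-d, test_data is num_test-by-d.
for id = 1:num_test
    data = [train_data; test_data(id, :)];
    [vertices, v_cells] = voronoin(data);
    vertices_last = v_cells{num_use + 1};    % vertex indices of the test point's cell
    if any(vertices_last == 1)               % "infinite vertex" = #1
        count_correct = count_correct + 1;   % edge cell: flagged as an outlier
    else
        count_error = count_error + 1;
    end
end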
Biomed dataset: Cox, Johnson and Kafadar (1982), Exposition of statistical graphics technology, ASA Proceedings of the Statistical Computing Section, p. 55-6
                          67 Abnormal Inputs     27 Normal Inputs
                          Correct    Wrong       Correct    Wrong
-----------------------------------------------------------------
Kernel Classifier         57         10          25         2
Grow When Required net    56         11          25         2
Voronoi - mean            57.2       9.8         17.6       9.3
Voronoi - best            60         7           25         2
Curse of Dimensionality?
Computation time for Voronoi Tessellation is roughly linear in number of data points.
But … much steeper function of the dimensionality.
In the “ball bearing” data set (following example) the dimensionality of the raw data is 32. I used singular value decomposition to reduce the dimensionality.
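A reduction of this kind could be sketched as follows. The slides do not say how the SVD was applied, so mean-centering, the random stand-in data `raw`, and the choice `k = 4` are all assumptions here.

```matlab
% Sketch: reduce 32-D data to k dimensions with a truncated SVD.
rng(2);
raw = randn(300, 32);                    % hypothetical stand-in for the raw data, rows are points
k = 4;                                   % assumed number of dimensions to keep

mu = mean(raw, 1);
centered = raw - repmat(mu, size(raw, 1), 1);
[U, S, W] = svd(centered, 'econ');       % centered = U * S * W'
reduced = centered * W(:, 1:k);          % n-by-k coordinates in the leading singular directions

% A new test point has to be projected with the same mean and basis.
x_new = randn(1, 32);
x_reduced = (x_new - mu) * W(:, 1:k);
```

The reduced points can then be tessellated in `k` dimensions instead of 32, which is where the saving in computation time would come from.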
Curse of Dimensionality?
Ball-bearing data
EPSRC Structural Integrity & Damage Assessment
www.brunel.ac.uk/research/cnca/sida/html/data.htm
                            Normal    Broken    Damaged    Basket       ½ runs
                            (New)     Ring      Basket     destroyed    loosely
--------------------------------------------------------------------------------
Linear programming kernel   1.3%      0%        46.7%      71.7%        74.5%
Grow When Required net      37.8%     40.3%     43.8%      4.6%         4.9%
LPDD*                       0%?       0%        8.3%
Voronoi 3                   1.6%      0%        30.7%      30.7%        35.7%
Voronoi 4                   6.4%      0%        12.1%      16.2%        19.9%
Voronoi 5                   13.5%     0.7%      25.5%      28.9%        34.2%
Novelty Detection in Time Series
Multivariate Time Series
For a single time series, use embedding:
  captures the dynamical behavior of the process
  increases the dimensionality
X(t_n) → { X(t_n), X(t_{n+1}), X(t_{n+2}), …, X(t_{n+k-1}) }
Online Novelty Detection on Temporal Sequences, Junshui Ma and Simon Perkins, SIGKDD 2003
Better:
X(t_n) → { X(t_n), X(t_{n+m}), X(t_{n+2m}), …, X(t_{n+(k-1)m}) }
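The embedding with a lag of m samples between coordinates could be built along these lines. The sine-wave series and the values `k = 3`, `m = 5` are purely illustrative assumptions; setting `m = 1` gives the consecutive-sample version.

```matlab
% Sketch: time-delay embedding of a scalar series.
% Row n of emb is [x(n), x(n+m), x(n+2m), ..., x(n+(k-1)m)].
x = sin(0.1 * (1:200))';                 % hypothetical single time series
k = 3;                                   % embedding dimension (assumed)
m = 5;                                   % lag between coordinates (assumed)

n_vec = numel(x) - (k - 1) * m;          % number of complete embedded vectors
emb = zeros(n_vec, k);
for j = 1:k
    emb(:, j) = x((1:n_vec) + (j - 1) * m);
end
```

Each row of `emb` is then a point in k dimensions that can be passed to the Voronoi edge-cell test like any other data point.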