Benchmarking Anomaly-based Detection Systems

Ashish Gupta

Network Security

May 2004

Overview

• The motivation for this paper
  – Waldo example
• The approach
• Structure in data
• Generating the data and anomalies
• Injecting anomalies
• Results
  – Training and testing: the method
  – Scoring
  – Presentation
  – The ROC curves: somewhat obvious

Motivation

Does anomaly detection depend on the regularity/randomness of the data?

Where’s Waldo !

The aim

• Hypothesis:
  – Differences in data regularity affect anomaly detection
  – Different environments → different regularity
• Regularity
  – Highly redundant or random?
  – Example of the environment's effect

010101010101010101010101
or
0100011000101000100100101

Consequences

One IDS: different false alarm rates

Need a custom system/training for each environment?

Temporal effects: regularity may vary over time?

Structure in data

Measuring randomness

010101010101010101010101
or
0100011000101000100100101

Measuring Randomness

Relative Entropy + Sequential Dependence

Conditional Relative Entropy
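To make the idea of measuring randomness concrete, here is a minimal sketch that estimates the conditional entropy of a symbol given its predecessor and scales it by the alphabet size. The function names and the normalization are assumptions for illustration; the exact conditional relative entropy formula used in the benchmarked paper may differ.

```python
from collections import Counter
from math import log2


def conditional_entropy(seq):
    """Estimate H(next symbol | previous symbol) in bits from adjacent pairs."""
    pairs = Counter(zip(seq, seq[1:]))
    prev = Counter(seq[:-1])
    total = sum(pairs.values())
    h = 0.0
    for (a, b), n_ab in pairs.items():
        p_ab = n_ab / total            # joint probability of the pair (a, b)
        p_b_given_a = n_ab / prev[a]   # probability that b follows a
        h -= p_ab * log2(p_b_given_a)
    return h


def regularity_index(seq):
    """Hypothetical index: conditional entropy divided by log2(alphabet size).

    0 means the next symbol is fully predictable; values near 1 mean the
    sequence looks random with respect to one-step dependence.
    """
    k = len(set(seq))
    if k < 2 or len(seq) < 2:
        return 0.0
    return conditional_entropy(seq) / log2(k)


# The two example sequences from the slides.
regular = "010101010101010101010101"
irregular = "0100011000101000100100101"
print(regularity_index(regular), regularity_index(irregular))
```

The strictly alternating sequence scores 0 because each symbol determines the next, while the second sequence scores above 0 because both 0 and 1 are followed by more than one symbol.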

The benchmark datasets

• Three types:
  – Training data (the background data)
  – Anomalies
  – Testing data (background + anomalies)
• Generating the sequences
  – 5 sets, each set 11 files (for increasing regularity)
  – Each set has a different alphabet size
  – Alphabet size decides complexity
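One way to picture a family of background files with graded regularity is sketched below. This is not the generator used in the paper: the cycle-plus-random-jump scheme, the `noise` parameter, and the function name are all assumptions chosen only to show how 11 regularity levels per alphabet could be produced.

```python
import random


def generate_background(alphabet, length, noise, seed=0):
    """Hypothetical generator: follow a fixed cycle through the alphabet,
    but with probability `noise` jump to a uniformly random symbol instead."""
    rng = random.Random(seed)
    k = len(alphabet)
    idx = 0
    out = []
    for _ in range(length):
        out.append(alphabet[idx])
        if rng.random() < noise:
            idx = rng.randrange(k)   # random jump breaks the pattern
        else:
            idx = (idx + 1) % k      # deterministic next symbol
    return "".join(out)


# 11 noise levels for one alphabet: 0.0 is fully regular, 1.0 is fully random.
noise_levels = [i / 10 for i in range(11)]
dataset = {noise: generate_background("ABCD", 1000, noise) for noise in noise_levels}
```

Repeating this for five alphabets of different sizes would give the 5 x 11 layout described above, with the alphabet size controlling how many distinct n-grams can occur.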

Anomaly Generation

• What's a surprise?
  – Different from the expected probability
• Types:
  – Juxta-positional: different arrangements of data
    • 001001001001001001111
  – Temporal
    • Unexpected periodicities
  – Other types?

Types in this paper

• Foreign symbol
  – AAABABBBABABCBBABABBA
• Foreign n-gram
  – AAABABAABAABAAABBBBA
• Rare n-gram
  – AABBBABBBABBBABBBABBBABBAA
• Injecting anomalies
  – Make sure not more than 0.24%
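The three anomaly types can be told apart by comparing a candidate n-gram with the training data. The sketch below assumes the usual reading of the terms: a foreign symbol never appears in training, a foreign n-gram uses known symbols in an unseen arrangement, and a rare n-gram was seen but infrequently. The `rare_below` cutoff and the function names are hypothetical.

```python
from collections import Counter


def ngrams(seq, n):
    """All overlapping n-grams of seq, in order."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def classify_ngram(gram, training, rare_below=0.05):
    """Label gram relative to the training sequence (assumed definitions)."""
    counts = Counter(ngrams(training, len(gram)))
    known_symbols = set(training)
    if any(sym not in known_symbols for sym in gram):
        return "foreign symbol"
    if counts[gram] == 0:
        return "foreign n-gram"
    if counts[gram] / sum(counts.values()) < rare_below:
        return "rare n-gram"
    return "common"


# Mostly alternating background with a single "AA" inserted in the middle.
training = "AB" * 20 + "AA" + "AB" * 20
for gram in ("ABC", "BBB", "AAA", "ABA"):
    print(gram, classify_ngram(gram, training))
```

Here "ABC" contains a symbol absent from training, "BBB" never occurs, "AAA" occurs once in 80 trigrams, and "ABA" makes up a large share of the background.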

The experiments

The Hypothesis is true

• The hypothesis:
  – Nature of “normal” background noise affects signal detection
• The anomaly detector
  – To detect anomalous subsequences
  – Learning phase → n-gram probability table
  – Unexpected event → anomaly!
  – Anomaly threshold decides level of surprise
• Example of anomaly detection

  AAA  0.12
  AAB  0.13
  ABA  0.20
  BAA  0.17
  BBB  0.15
  BBA  0.12
  AAC  ANOMALY!
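The detector described above can be sketched as a lookup into a learned n-gram probability table. The surprise score used here (one minus the n-gram's training probability) is an assumption made for illustration, not necessarily the scoring rule of the detector in the paper; the table values are the ones shown on the slide.

```python
from collections import Counter


def train(seq, n=3):
    """Learning phase: relative frequency of every n-gram in the training data."""
    grams = [seq[i:i + n] for i in range(len(seq) - n + 1)]
    counts = Counter(grams)
    return {g: c / len(grams) for g, c in counts.items()}


def surprise(gram, table):
    """Assumed score: 1.0 for an n-gram never seen in training, lower for frequent ones."""
    return 1.0 - table.get(gram, 0.0)


def detect(seq, table, n=3, threshold=1.0):
    """Return (position, n-gram) for every n-gram whose surprise reaches the threshold."""
    alarms = []
    for i in range(len(seq) - n + 1):
        gram = seq[i:i + n]
        if surprise(gram, table) >= threshold:
            alarms.append((i, gram))
    return alarms


# Probability table from the slide; "AAC" is absent, so it is maximally surprising.
table = {"AAA": 0.12, "AAB": 0.13, "ABA": 0.20, "BAA": 0.17, "BBB": 0.15, "BBA": 0.12}
print(detect("AAABAAC", table))
```

With the threshold at 1.0 only never-seen n-grams raise an alarm; lowering it makes the detector flag rare n-grams as well, at the cost of more false alarms.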

Scoring

• Event outcomes
  – Hits
  – Misses
  – False alarms
• Threshold
  – Decides level of surprise
  – 0 completely unsurprising, 1 astonishing
  – Need to calibrate
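The three event outcomes can be counted by comparing where the detector raised alarms with where anomalies were actually injected. This is a minimal sketch assuming both are given as collections of positions; the function name and the exact-position matching rule are illustrative choices.

```python
def score(alarm_positions, anomaly_positions):
    """Count hits, misses and false alarms from two collections of positions."""
    alarms = set(alarm_positions)
    truth = set(anomaly_positions)
    return {
        "hits": len(alarms & truth),          # anomalies that were flagged
        "misses": len(truth - alarms),        # anomalies that went unflagged
        "false_alarms": len(alarms - truth),  # flags with no anomaly behind them
    }


result = score(alarm_positions=[4, 10, 17], anomaly_positions=[4, 17, 30])
print(result)
```

In this example the alarms at 4 and 17 are hits, the anomaly at 30 is a miss, and the alarm at 10 is a false alarm.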

Presentation of results

• Presents two aspects:
  – % correct detections
  – % false detections
• Detector operates through a range of sensitivities
  – Higher sensitivity?
  – Need the right sensitivity
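Sweeping the detector through a range of sensitivities yields the points of an ROC curve: one (false alarm rate, hit rate) pair per threshold. The sketch below assumes each event already has a surprise score in [0, 1] and a true label, and that both anomalous and normal events are present; the sample scores are made up for illustration.

```python
def roc_points(scores, labels, thresholds):
    """Return one (false_alarm_rate, hit_rate) pair per threshold.

    labels[i] is True when event i is a real anomaly. Assumes at least one
    anomalous and one normal event, so neither rate divides by zero.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        flagged = [s >= t for s in scores]
        hits = sum(f and l for f, l in zip(flagged, labels))
        false_alarms = sum(f and not l for f, l in zip(flagged, labels))
        points.append((false_alarms / negatives, hits / positives))
    return points


# Illustrative values only, not results from the paper.
scores = [0.1, 0.4, 0.35, 0.8, 0.95]
labels = [False, False, True, False, True]
points = roc_points(scores, labels, thresholds=[0.0, 0.5, 0.9, 1.01])
print(points)
```

A threshold of 0 flags everything (all hits, all false alarms), and a threshold above 1 flags nothing; the useful operating points lie in between.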

Interpretation

• Nothing overlaps → regularity affects detection!
• What does this mean?
• Detection metrics are data dependent
• Cannot say:
  – "My XYZ product will flag 75% of anomalies with a 10% false hit rate!"
  – "Sir, are you sure?"

Real world data

• Regularity index for system calls for different users

• Is this surprising?
• What about network traffic?

Conclusions

Data Structure → Anomaly Detection Effectiveness

Evaluation is data dependent

Conclusions

Change in regularity

Different system

Change the parameters

Quirks ?

• Assumes rather naïve detection systems
  – “Simple retraining will not suffice”
• An intelligent detection system can take this into account
• What is really an anomaly?
  – If data is highly irregular, won't randomness produce some anomalies by itself?
• Anomaly is a relative term
  – Here anomalies are generated independently
