Data Mining: Introduction

Intrusion Detection

Outline

Intrusion detection and computer

security

Current intrusion detection approaches

Data Mining Approaches for Intrusion

Detection

Summary

Intrusion Detection and Computer Security

Computer security goals:

Confidentiality, integrity, and availability

Intrusion is a set of actions aimed to

compromise these security goals

Intrusion prevention (authentication,

encryption, etc.) alone is not sufficient

Intrusion detection is needed

Intrusion Examples Intrusions: Any set of actions that threaten the

integrity, availability, or confidentiality of a network resource

Examples Denial of service (DoS): attempts to starve a host of

resources needed to function correctly Scan: reconnaissance on the network or a particular host Worms and viruses: replicating on other hosts Compromises: obtain privileged access to a host by

known vulnerabilities

Intrusion Detection Intrusion detection: The process of monitoring

and analyzing the events occurring in a computer and/or network system in order to detect signs of security problems

Primary assumption: User and program activities can be monitored and modeled

Steps Monitoring and analyzing traffic

Identifying abnormal activities

Assessing severity and raising alarm

Monitoring and Analyzing Traffic

TCPdump and Windump

Provide insight into the traffic activity on a network

ftp://ftp.ee.lbl.gov/tcpdump.tar.Z

http://netgroupserv.polito.it/windump

Ethereal

GUI to interpret all layers of the packet

Goals of Intrusion Detection System (IDS) Detect wide variety of intrusions

Previously known and unknown attacks Suggests need to learn/adapt to new attacks or changes in

behavior

Detect intrusions in timely fashion May need to be real-time, especially when system

responds to intrusion Problem: analyzing commands may impact response

time of system May suffice to report intrusion occurred a few minutes or

hours ago

Goals of Intrusion Detect. System (IDS) (2)

Present analysis in simple, easy-to-understand format

Be accurate

Minimize false positives, false negatives

False positive: An event, incorrectly identified by the IDS

as being an intrusion when none has occurred

False negative: An event that the IDS fails to identify as

an intrusion when one has in fact occurred

Minimize time spent verifying attacks, looking for them

IDS Architecture Sensors (agent)

to collect data and forward info to the analyzer network packets log files system call traces

Analyzers (detector) To receive input from one or more sensors or from other

analyzers To determine if an intrusion has occurred

User interface To enable a user to view output from the system or

control the behavior of the system

IDS Architecture

Sensor 1

Sensor n

Sensor 2 Network

Sensor events

Classifier Human analyst

Clustering

ANALYSER

Signature-Based Intrusion Detection Human analysts investigate suspicious traffic

Extract signatures Features of known intrusions

Use pre-defined signatures to discover malicious packets

Examples LaBrea Tarpit by Tom Liston

Snort and Snort rules Marty Roesch

Snort by Marty Roesch

An open source free network intrusion detection system Signature-based, use a combination of rules and

preprocessors On many platforms, including UNIX and Windows www.snort.org

Preprocessors IP defragmentation, port-scan detection, web traffic

normalization, TCP stream reassembly, … Can analyze streams, not only a single packet at a time

Problems in Signature-Based Intrusion Detection Systems

Many false positives: prone to generating alerts

when there is no problem in fact

Signatures are not specific enough

A packet is not examined in context with those that

precede it or those that follow

Cannot detect unknown intrusions

Rely on signatures extracted by human experts

Misuse vs. Anomaly Detection Misuse detection: use patterns of well-known

attacks to identify intrusions Classification based on known intrusions

E.g., three consecutive login failures: password

guessing.

Anomaly detection: use deviation from normal

usage patterns to identify intrusions Any significant deviations from the expected behavior

are reported as possible attacks

Misuse vs. Anomaly Detection

Misuse Detection Anomaly Detection

Definition matching the sequence of “signature actions” of known intrusion scenarios

using statistical measure on system features

Shortcoming Has to hand-coded known pattern. Unable to detect any future intrusion

Rely upon in selecting the system features. Has to study sequential interrelation between transactions

Example STAT [HLMS90] IDES [LTG+92]

Host-based vs. Network-based According to data sources Host-based detection: the data is collected from

an individual host Directly monitor the host data files and OS processes Can determine exactly which host resources are the

targets of a particular attack

Network-based detection: the data is traffic across the network A set of traffic sensors within the network Can easily harder against attacks and hide from the

attackers

OUTLINE

Intrusion detection and computer security

Current intrusion detection approaches

Data Mining Approaches for Intrusion Detection

Summary

Current Intrusion Detection Approaches—Misuse Detection Misuse detection :

Record the specific patterns of intrusions Monitor current audit trails (event sequences) and

pattern matching Report the matched events as intrusions Representation models: expert rules, Colored Petri Net,

and state transition diagrams, etc.

Misuse Detection Example

Expert systems: use a set of rules to describe attacks IDES, ComputerWatch, NIDX, P-BEST, ISOA

Signature analysis: capture features of attacks in audit trail Haystack, NetRanger, RealSecure, MuSig

State-transition analysis: use state-transition diagrams STAT,USTAT and NetSTAT

Other approaches Colored petri nets, e.g., IDIOT Case-based reasoning, e.g., AUTOGUARD

Current Intrusion Detection Approaches—Anomaly Detection

Anomaly detection: Establishing the normal behavior profiles

Observing and comparing current activities with the

(normal) profiles

Reporting significant deviations as intrusions

Statistical measures as behavior profiles: ordinal and

categorical (binary and linear)

Anomaly Detection Example Statistical methods: multivariate, temporal

analysis IDES, NIDES, EMERALD

Expert systems ComputerWatch, Wisdom & Sense

Problems of Current Intrusion Detection Approaches

Main problems: manual and ad-hoc Misuse detection:

Known intrusion patterns have to be hand-coded Unable to detect any new intrusions (that have no

matched patterns recorded in the system) Anomaly detection:

Selecting the right set of system features to be measured is ad hoc and based on experience

Unable to capture sequential interrelation between events

OUTLINE

Intrusion detection and computer security

Current intrusion detection approaches Data Mining Approaches for Intrusion

Detection Summary

Why Can Data Mining Help? Data mining: applying specific

algorithms to extract patterns from data

Normal and intrusive activities leave evidence in audit data

From the data-centric point view, intrusion detection is a data analysis process

Why Can Data Mining Help?

Successful applications in related domains, e.g., fraud detection, fault/alarm management

Learn from traffic data Supervised learning: learn precise models

from past intrusions Unsupervised learning: identify suspicious

activities Maintain or update models on dynamic

data

Frequent Patterns Patterns that occur frequently in a database

Mining Frequent patterns – finding regularities

Process of Mining Frequent patterns for intrusion detection Phase I: mine a repository of normal frequent itemsets

for attack-free data

Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile

Frequent Pattern Mining in MINDS MINDS: a IDS using data mining

techniques University of Minnesota

Summarizing attacks using association rules {Src IP=206.163.27.95, Dest Port=139,

Bytes[150, 200)} {ATTACK}

Patterns About Alerts

Ning et al. CCS’02 Find correlated alerts – the frequent

patterns of alerts Attack scenarios – the logical connections

between alerts A hyper-alerts correlation graph approach

Use the correlation of intrusion alerts to identify high level attacks

Associate rules Used for link analysis

E.g.: If the number of failed login attempts

(num_failed_login_attempts) and the network service on the destination (service) are features, an example of rule is:

num_failed_login_attempts = 6, service = FTP => attack = DoS [1, 0.28 ]

Sequential Pattern Analysis

Models sequence patterns (Temporal) order is important in many situations

Time-series databases and sequence databases

Frequent patterns (frequent) sequential patterns

Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets

Classification: A Two-Step Process Model construction: describe a set of

predetermined classes Training dataset: tuples for model construction

Each tuple/sample belongs to a predefined class

Classification rules, decision trees, or math formulae

Model application: classify unseen objects Estimate accuracy of the model using an independent

test set Acceptable accuracy apply the model to classify data

tuples with unknown class labels

Classification Methods Basic Algorithm ID3 Neural networks Bayesian classification

Naïve Bayesian classification Bayesian belief network

Support vector machines

Classification for Intrusion Detection Misuse detection

Classification based on known intrusions

Example: Sinclair et al. “An application of

machine learning to network intrusion detection” Use decision trees and ID3 on host session data

Use genetic algorithms to generate rules If <pattern> then <alert>

HIDE “A hierarchical network intrusion detection

system using statistical processing and neural network classification” by Zheng et al.

Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds

the statistical model Statistical processor maintains a model for normal

activities and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports

Intrusion Detection by NN and SVM S. Mukkamala et al., IEEE IJCNN May 2002 Discover useful patterns or features that

describe user behavior on a system Use the set of relevant features to build

classifiers SVMs have great potential to be used in place of

NNs due to its scalability and faster training and running time

NNs are especially suited for multi-category classification

Clustering Group data into clusters What is a good clustering

High intra-class similarity and low inter-class similarity Depending on the similarity measure

The ability to discover some or all of the hidden patterns

Clustering Approaches K-means Hierarchical Clustering Density-based methods Grid-based methods Model-based

Clustering for Intrusion Detection Anomaly detection

Any significant deviations from the expected behavior

are reported as possible attacks

Build clusters as models for normal activities

“A scalable clustering for intrusion signature

recognition” by Ye and Li Use description of clusters as signatures of intrusions

Alert Correlation

F. Cuppens and A. Miege, in IEEE S&P’02

Use clustering and merging functions to

recognize alerts that correspond to the same

occurrence of an attack

Create a new alert that merge data contained in these

various alerts

Generate global and synthetic alerts to reduce

the number of alerts further

Mining Data Streams

Continuous arrival data in multiple, rapid, time-

varying, possibly unpredictable and unbounded

streams

Many applications

Financial applications, network monitoring, security,

telecommunications data management, web application,

manufacturing, sensor networks, etc.

Mining Data Streams for Intrusion Detection Maintaining profiles of normal activities

The profiles of normal activities may drift Identifying novel attacks

Identifying clusters and outliers in traffic data streams

A Systematic Framework—J.Stolfo et al.

Build good models: select appropriate features of audit data to build

intrusion detection models

Build better models: architect a hierarchical detector system that combines

multiple detection models

Build updated models: dynamically update and deploy new detection system as

needed

A Systematic Framework

Support for the feature selection and model

construction:

Apply data mining algorithms to find consistent inter-

and intra- audit record (event) patterns

Use the features and time windows in the discovered

patterns to build detection models

A support environment to semi-automate this process

A Systematic Framework Combining multiple detection models:

Each (base) detector model monitors one aspect of the system They can employ different techniques and be independent of

each other The learned (meta) detector combines evidence from a number

of base detectors

An intelligent agent-based architecture: learning agents: continuously compute (learn) the detection

models detection agents: use the (updated) models to detect intrusions

A Systematic Framework

Building Classifiers for Intrusion Detection—J.Stolfo et al.

Experiments in constructing classification models for anomaly detection

Two experiments: sendmail system call data network tcpdump data

Use meta classifier to combine multiple classification models

Classification Models on sendmail The data: sequence of system calls made by

sendmail.

Classification models (rules): describe the

“normal” patterns of the system call sequences.

The rule set is the normal profile of sendmail

Detection: calculate the deviation from the profile large number/high scores of “violations” to the rules in a

new trace suggests an exploit

Classification Models on sendmail The sendmail data:

Each trace has two columns: the process ids and the system call numbers

Normal traces: sendmail and sendmail daemon Abnormal traces: sunsendmailcap, syslog-

remote, syslog-remote, decode, sm5x and sm56a attacks

Classification Models on sendmail Lessons learned:

Normal behavior can be established and used to detect anomalous usage

Need to collect near “complete” normal data in order to build the “normal” model

But how do we know when to stop collecting? Need tools to guide the audit data gathering

process

Classification Models on tcpdump The tcpdump data (part of a public data

visualization contest): Packets of incoming, out-going, and internal

broadcast traffic

One trace of normal network traffic

Three traces of network intrusions

Data Preprocessing

Extract the “connection” level features: Record connection attempts Watch how connection is terminated

Each record has: start time and duration participating hosts and ports (applications) statistics (e.g., # of bytes) flag: normal or a connection/termination error protocol: TCP or UDP

Divide connections into 3 types: incoming, out-going, and inter-lan

Building Classifier for Each Type of Connections

Use the destination service (port) as the class label

Training data: 80% of the normal connections Testing data: 20% of the normal connections and

connections in the 3 intrusion traces Apply RIPPER to learn rules

Lessons Learned

Data preprocessing requires extensive domain knowledge

Adding temporal features improves classification accuracy

Need tools to guide (temporal) feature selection

Meta Classifier that Combines Evidence from Multiple Detection Models

Build base classifiers that each model one aspect

of the system

The meta learning task:

each record has a collection of evidence from base

classifiers, and a class label “normal”or “abnormal” on

the state of the system

Apply a learning algorithm to produce the meta

classifier

Documents

Data Mining: Introduction