Nabs: A System for Detecting Resource Abuses via Characterization of Flow Content Type
Kulesh Shanmugasundaram, Mehdi Kharrazi, and Nasir Memon
ISIS Lab
Motivation:
• How to stop network abuse given a set of policies?
  • No encrypted traffic on the network, besides SSH and HTTPS
  • No outbound video or audio streams
• Types of abusers:
  • Malicious outsider looking for free resources to host illegal activities
  • Malicious insider running a P2P hub
  • Ill-informed user running an application proxy
Current solutions:
• Block ports with a firewall
  • Defeated by tunneling everything through an open port
• Use an IDS
  • No signature is available for a content type
• Check the packet header
  • The packet containing the header needs to be captured
  • Not all data types have headers, e.g. text, encrypted data
  • The header could be changed
Classification:
• A multi-class SVM is used
• Two main ideas behind support vector machines:
  1. Map the data into another (typically higher-dimensional) space using kernels
  2. Maximize the separating margin
• For each data segment size:
  • 40% of the data is used for training
  • The resulting classifier is tested on the remaining, unseen data
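A minimal sketch of the train/test protocol described above, not the authors' implementation. To stay dependency-free it swaps in a linear one-vs-rest SVM trained by stochastic sub-gradient descent on the regularised hinge loss, so the margin-maximisation idea is shown but the kernel mapping is not. The two-dimensional synthetic features, the cluster centres, and all function names are assumptions made for illustration only.

```python
import random


def train_binary_svm(X, y, lam=0.01, lr=0.01, epochs=100, seed=0):
    """Linear SVM via stochastic sub-gradient descent on the hinge loss.

    X is a list of feature lists, y a list of +1/-1 labels.
    The last weight acts as the bias (a constant 1.0 is appended to x).
    """
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            xi = X[i] + [1.0]
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, xi))
            # Regularisation term: shrink the weights (favours a wide margin).
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge-loss term: only points inside the margin push the weights.
            if margin < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, xi)]
    return w


def train_one_vs_rest(X, labels):
    """One binary SVM per class; each separates that class from the rest."""
    return {
        c: train_binary_svm(X, [1 if l == c else -1 for l in labels])
        for c in sorted(set(labels))
    }


def predict(models, x):
    """Pick the class whose binary SVM gives the highest score."""
    xi = x + [1.0]
    return max(models, key=lambda c: sum(wj * xj for wj, xj in zip(models[c], xi)))


def make_synthetic(n_per_class=50, seed=1):
    """Hypothetical 2-D feature vectors, one Gaussian cluster per class."""
    rng = random.Random(seed)
    centres = {"raw": (0.0, 0.0), "compressed": (4.0, 0.0), "encrypted": (2.0, 4.0)}
    data = [
        ([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)], name)
        for name, (cx, cy) in centres.items()
        for _ in range(n_per_class)
    ]
    rng.shuffle(data)
    return data


def evaluate(train_fraction=0.4):
    """Train on 40% of the data, test on the remaining unseen 60%."""
    data = make_synthetic()
    cut = int(len(data) * train_fraction)
    train, test = data[:cut], data[cut:]
    models = train_one_vs_rest([x for x, _ in train], [l for _, l in train])
    correct = sum(predict(models, x) == l for x, l in test)
    return correct / len(test), len(train), len(test)


if __name__ == "__main__":
    accuracy, n_train, n_test = evaluate()
    print(f"trained on {n_train}, tested on {n_test}, accuracy {accuracy:.2f}")
```

In practice a kernel SVM from an established library would replace `train_binary_svm`; the split logic in `evaluate` stays the same.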
System Design
Nabs:
• Use statistical properties of the payload to classify packets as belonging to one of a set of possible content types.
Identifying statistics:
• Time domain statistics
  • Mean, variance, auto-correlation, entropy
• Frequency domain statistics
  • Power, mean, variance, and skewness of different frequency bands
• Higher order statistics (used to characterize non-linearity)
  • Mean and power of bicoherence magnitude
  • Power of bicoherence phase
  • Skewness and kurtosis
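As a rough illustration of the time-domain and frequency-domain statistics listed above, the sketch below computes them for a payload given as `bytes`. It is not taken from Nabs: the function names, the lag of 1 for auto-correlation, the choice of four equal-width bands, and the naive DFT are all assumptions. The higher-order (bicoherence) statistics are left out.

```python
import cmath
import math
from collections import Counter


def time_domain_stats(payload: bytes, lag: int = 1) -> dict:
    """Mean, variance, lag-`lag` auto-correlation and byte entropy (bits)."""
    n = len(payload)
    mean = sum(payload) / n
    var = sum((b - mean) ** 2 for b in payload) / n
    if var == 0:
        autocorr = 0.0  # constant payload: auto-correlation is undefined
    else:
        autocorr = sum(
            (payload[i] - mean) * (payload[i + lag] - mean) for i in range(n - lag)
        ) / (n * var)
    counts = Counter(payload)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"mean": mean, "variance": var, "autocorr": autocorr, "entropy": entropy}


def band_powers(payload: bytes, n_bands: int = 4) -> list:
    """Power in `n_bands` equal-width frequency bands (band count is assumed).

    Uses a naive O(n^2) DFT for clarity; a 16384-byte segment would call
    for an FFT instead.
    """
    n = len(payload)
    mean = sum(payload) / n
    spectrum = []
    for k in range(1, n // 2 + 1):  # skip k = 0 (the mean has been removed)
        coeff = sum(
            (payload[t] - mean) * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    width = len(spectrum) // n_bands
    return [sum(spectrum[i * width:(i + 1) * width]) for i in range(n_bands)]


if __name__ == "__main__":
    segment = bytes(range(256))
    print(time_domain_stats(segment))
    print(band_powers(segment[:64]))
```

Intuitively, encrypted and well-compressed payloads should sit close to the 8-bit entropy ceiling, while plain text falls well below it.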
Feature Selection:
• Some of the 25 features will have little information gain
• Fewer features mean a faster and less complex system
• SFFS (Sequential Forward Floating Selection) is used to identify the more important features
  • Selected features: entropy, power in the first frequency band, mean, variance, and the mean and variance in the fourth frequency band
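The floating search can be sketched as follows. This is a generic outline of SFFS under stated assumptions, not the authors' code: `score` is a hypothetical callback that rates a feature subset (for example, cross-validated classifier accuracy), and the toy weights at the bottom are made up.

```python
def sffs(n_features, score, k):
    """Sequential Forward Floating Selection.

    n_features: total number of candidate features (indexed 0..n_features-1)
    score:      callable taking a list of feature indices, higher is better
    k:          desired subset size
    """
    if not 0 < k <= n_features:
        raise ValueError("k must be between 1 and n_features")
    selected = []
    best = {}  # subset size -> (score, subset) of the best subset seen so far
    while len(selected) < k:
        # Forward step: add the single feature that helps the most.
        candidates = [f for f in range(n_features) if f not in selected]
        f_best = max(candidates, key=lambda f: score(selected + [f]))
        selected = selected + [f_best]
        s = score(selected)
        if len(selected) not in best or s > best[len(selected)][0]:
            best[len(selected)] = (s, list(selected))
        # Floating step: drop features while that beats the best smaller subset.
        while len(selected) > 2:
            f_worst = max(
                selected, key=lambda f: score([g for g in selected if g != f])
            )
            reduced = [g for g in selected if g != f_worst]
            s_reduced = score(reduced)
            if s_reduced > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (s_reduced, list(reduced))
            else:
                break
    return best[k][1]


if __name__ == "__main__":
    # Toy example: each feature has a made-up, independent usefulness weight.
    weights = [0.1, 0.9, 0.3, 0.8, 0.05]
    chosen = sffs(len(weights), lambda subset: sum(weights[f] for f in subset), 2)
    print(sorted(chosen))
```

With independent weights the search reduces to picking the top-scoring features; the backward "floating" step matters when features are redundant and an earlier pick becomes unnecessary.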
Deployment:
• Monitored the Poly network for two weeks
  • 600 flows processed per second on average
  • Flow characterization takes about 945 µs
• Detected abuses:
  • Unauthorized source of encrypted traffic
    • 9 hosts found to be sources of encrypted traffic
    • WASTE, a P2P application that encrypts its connections, was being used
  • Unauthorized source of multimedia content
    • 16 hosts with heavy multimedia traffic detected
    • Further investigation revealed them to be proxy servers
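A back-of-the-envelope check of the deployment throughput figures (600 flows per second, about 945 µs per flow), assuming flows are characterized one at a time on a single core. That assumption is not stated in the slides, and the function names are illustrative.

```python
FLOWS_PER_SEC = 600        # average load reported for the deployment
CHARACTERIZATION_US = 945  # approximate time to characterize one flow


def utilisation(flows_per_sec, per_flow_us):
    """Fraction of each second spent characterizing flows (single core assumed)."""
    return flows_per_sec * per_flow_us / 1_000_000


def max_flows_per_sec(per_flow_us):
    """Upper bound on sustainable flows per second under the same assumption."""
    return 1_000_000 / per_flow_us


if __name__ == "__main__":
    print(f"utilisation: {utilisation(FLOWS_PER_SEC, CHARACTERIZATION_US):.3f}")
    print(f"ceiling: {max_flows_per_sec(CHARACTERIZATION_US):.0f} flows/s")
```

Under that assumption the average load takes roughly 57% of one core, leaving headroom up to a little over 1000 flows per second.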
Dataset:
• Raw: TXT, BMP, WAV
• Compressed: ZIP, JPEG, MP3, MPEG
• Encrypted: AES encrypted files
• 1000 files collected from each category using a P2P network
• 16384 bytes of data sampled from each file, at a random location
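The sampling step can be sketched as below: read one 16384-byte segment starting at a uniformly random offset. The function name and the choice to reject files shorter than the segment size are assumptions; the slides do not say how short files were handled.

```python
import os
import random

SAMPLE_SIZE = 16384  # bytes per sample, as stated in the dataset description


def sample_segment(path, size=SAMPLE_SIZE, rng=None):
    """Return (offset, data) for `size` bytes read at a random file offset."""
    rng = rng or random.Random()
    file_size = os.path.getsize(path)
    if file_size < size:
        # Assumption: files too short to yield a full segment are rejected.
        raise ValueError(f"{path} is smaller than {size} bytes")
    offset = rng.randint(0, file_size - size)  # inclusive on both ends
    with open(path, "rb") as fh:
        fh.seek(offset)
        return offset, fh.read(size)
```

Passing a seeded `random.Random` as `rng` makes the chosen offsets reproducible across runs.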
[Figure: Confusion matrix]
[Figure: Accuracy before and after feature selection]
[Figure: Scatter plot of statistics from 4 data categories; axes: avg. entropy, avg. skewness, avg. power in frequency band]