Unsupervised Network Intrusion Detection Systems: Detecting the Unknown without Knowledge

Computer Communications 35 (2012) 772–783

Contents lists available at SciVerse ScienceDirect

Computer Communications

journal homepage: www.elsevier .com/ locate/comcom

Unsupervised Network Intrusion Detection Systems: Detecting the Unknownwithout Knowledge

Pedro Casas ⇑, Johan Mazel, Philippe OwezarskiCNRS; LAAS; 7 avenue du colonel Roche, F-31077 Toulouse Cedex 4, FranceUniversité de Toulouse; UPS, INSA, INP, ISAE; UT1, UTM, LAAS; F-31077 Toulouse Cedex 4, France

a r t i c l e i n f o a b s t r a c t

Article history:Received 31 August 2011Received in revised form 10 January 2012Accepted 20 January 2012Available online 28 January 2012

Keywords:NIDSUnsupervised Machine LearningSub-Space ClusteringEvidence AccumulationOutliers detection

0140-3664/$ - see front matter � 2012 Elsevier B.V. Adoi:10.1016/j.comcom.2012.01.016

⇑ Corresponding author at: CNRS, LAAS, 7 avenueToulouse Cedex 4, France. Tel.: +33 (0)5 61 33 68 05;

E-mail addresses: [email protected] (P. Casas), jmazel@(P. Owezarski).

Traditional Network Intrusion Detection Systems (NIDSs) rely on either specialized signatures of previ-ously seen attacks, or on expensive and difficult to produce labeled traffic datasets for user-profiling tohunt out network attacks. Despite being opposite in nature, both approaches share a common downside:they require the knowledge provided by an external agent, either in terms of signatures or as normal-operation profiles. In this paper we present UNIDS, an Unsupervised Network Intrusion Detection Systemcapable of detecting unknown network attacks without using any kind of signatures, labeled traffic, ortraining. UNIDS uses a novel unsupervised outliers detection approach based on Sub-Space Clusteringand Multiple Evidence Accumulation techniques to pin-point different kinds of network intrusions andattacks such as DoS/DDoS, probing attacks, propagation of worms, buffer overflows, illegal access to net-work resources, etc. We evaluate UNIDS in three different traffic datasets, including the well-knownKDD99 dataset as well as real traffic traces from two operational networks. We particularly show theability of UNIDS to detect unknown attacks, comparing its performance against traditional misuse-detec-tion-based NIDSs. In addition, we also evidence the supremacy of our outliers detection approach withrespect to different previously used unsupervised detection techniques. Finally, we show that the algo-rithms used by UNIDS are highly adapted for parallel computation, which permits to drastically reducethe overall analysis time of the system.

� 2012 Elsevier B.V. All rights reserved.

1. Introduction

The detection of network attacks is a paramount task for net-work operators in today’s Internet. Botnets, Malwares, DistributedDenial of Service attacks (DDoS), buffer overflow attacks, network-scanning activities, and spreading worms or viruses are examplesof the different threats that daily compromise the integrity andnormal operation of the network. The principal challenge in auto-matically detecting network attacks is that these are a moving andever-growing target [1,2]. Network Intrusion Detection Systems(NIDSs) are the war-horses of network security. Two different ap-proaches are by far dominant in the research literature andcommercial IDS security devices: signatures-based or misusedetection (detect what I know) and anomaly detection (detectwhat it is different from what I know).

Misuse detection is the de facto approach used in most IDSs.When an attack is discovered, generally after its occurrence during

ll rights reserved.

du colonel Roche, F-31077fax: +33 (0)5 61 33 64 11.laas.fr (J. Mazel), [email protected]

a diagnosis phase, the associated malicious pattern is coded as asignature by human experts, which is then used to detect a newoccurrence of the same attack. To avoid costly and time-consuminghuman intervention, signatures can also be constructed by super-vised machine-learning techniques, using instances of the discov-ered attack to build a detection model for it. Misuse detectionsystems are highly effective to detect those attacks which theyare programmed to alert on. However, they cannot defend the net-work against new attacks, simply because they cannot recognizethose attacks which do not match their lists of signatures. Indeed,networks protected by misused detection systems suffer from longperiods of vulnerability between the diagnosis of a new attack andthe construction of the new signature.

On the other hand, anomaly detection uses instances of normal-operation traffic to build normal-operation profiles, detectinganomalies as activities that deviate from this baseline. Such meth-ods can detect new kinds of network attacks not seen before.Nevertheless, they require training to construct profiles, which istime-consuming and depends on the availability of anomaly-freetraffic instances. In addition, it is not easy to maintain an accurateand up-to-date normal-operation profile, which induces high false-alarm rates.

http://dx.doi.org/10.1016/j.comcom.2012.01.016

mailto:[email protected]



http://dx.doi.org/10.1016/j.comcom.2012.01.016

http://www.sciencedirect.com/science/journal/01403664

http://www.elsevier.com/locate/comcom

P. Casas et al. / Computer Communications 35 (2012) 772–783 773

Despite being opposite in nature, both misuse detection andanomaly detection share a common downside: they require theknowledge provided by an external agent to achieve their goal,either in terms of attack-signatures or as normal-operation pro-files. As such, current network security looks more like a reactivecountermeasure than a proactive prevention mechanism. Overthe past years we have, however, witnessed an increased interestwithin the network security community in shifting away fromreactive defense towards more proactive security systems [3].Our thesis behind this work is that reactive, knowledge-based ap-proaches are not sufficient to tackle the network security problem,and that a holistic solution should also include proactive, knowl-edge-independent analysis techniques.

Armed with these ideas in mind, we present an UnsupervisedNetwork Intrusion Detection System (UNIDS) capable of detectingnetwork attacks without relying on signatures, training, or labeledtraffic instances of any kind. Based on the observation that networkattacks, and particularly the most difficult ones to detect, are con-tained in a small fraction of traffic instances with respect to nor-mal-operation traffic [6] (we show that this hypothesis canalways be verified by using traffic aggregation), their unsuperviseddetection consists in identifying outliers, i.e. instances that areremarkably different from the majority. UNIDS relies on robustclustering techniques to blindly extract the traffic instances thatcompose an attack. This unsupervised security system runs inthree consecutive steps, analyzing packets captured in contiguoustime slots of fixed length. Fig. 1 depicts a modular, high-leveldescription of this system:

The first step consists in detecting an anomalous time slot inwhich the clustering analysis will be performed. For doing so, cap-tured packets are first aggregated into multi-resolution traffic flows.Different time-series are then built on top of these flows, and anygeneric change-detection algorithm based on time-series analysisis finally used to flag an anomalous change. In this paper we shalluse a standard change-detection algorithm [4] on three very sim-ple and traditionally used volume metrics, consisting of # bytes,# packets, and # flows per time slot. The choice of volume metricsis based on [11], but change-detection can be performed on anyother traffic metric sensitive to anomalies. The algorithm basicallyflags an anomaly when the derivative of any of these metrics ex-ceeds a detection threshold, dynamically computed from the vari-ance of previous anomaly-free measurements. The reader shouldbear in mind that this change-detection step is not a critical partof UNIDS, but that it is merely used to limit the frequency of usage

Fig. 1. High-level description of the Unsupervised

of the clustering step, which is certainly more expensive in termsof computational resources.

The second step takes as input all the flows in the time slotflagged as anomalous. At this step, outlying flows are identifiedusing a robust multi-clustering algorithm, based on a combinationof Sub-Space Clustering (SSC) [18], Density-based Clustering [23],and Evidence Accumulation Clustering (EAC) [22] techniques. Theknowledge provided by this clustering algorithm is used to rankthe degree of abnormality of all the identified outlying flows, build-ing an outliers ranking.

In the third step, the top-ranked outlying flows are flagged asanomalies, using a simple thresholding detection approach.

As we show through out the paper, the main contribution ofUNIDS relies on its ability to detect unknown attacks in a com-pletely unsupervised fashion, avoiding the need for signatures,training, or labeled traffic flows. This paper represents a continua-tion of our previous work on unsupervised anomaly detection [5].In particular, we show that UNIDS can be used to detect unknownnetwork attacks of very different nature, outperforming traditionalmisuse-detection-based systems; as such, we provide more evi-dence and solid results on the quality and relevance of our propos-als for unsupervised anomaly detection. In addition, we show thatthe computational time involved in the unsupervised traffic analy-sis can be drastically reduced w.r.t the system presented in [5], bysimply taking advantage of the parallel structure of the multi-clustering algorithm used in the core of UNIDS.

The remainder of the paper is organized as follows. Section 2presents a brief state of the art in the network intrusion and anom-aly detection fields, describing our main contributions. Section 3describes the multi-resolution traffic aggregation and change-detection procedures used in the first step of the UNIDS systemto identify an anomalous time-slot. Section 4 describes the coreof UNIDS, presenting an in depth description of the different clus-tering techniques used to construct the outliers ranking. Section 5presents the validation of UNIDS in real traffic traces obtained fromtwo networking datasets: the public MAWI traffic repository of theWIDE project [25], and the METROSEC project dataset [27]. In thissection we also compare the performance of UNIDS against previ-ous proposals for unsupervised detection of attacks available in theliterature. Section 6 evaluates the ability of UNIDS to detect un-known attacks in the well-known KDD99 network attacks dataset,comparing its performance with that obtained by an extensivelyinvestigated misuse NIDS based on decision trees. Implementationrelated issues of UNIDS, including evaluation of computational

Network Intrusion Detection System (UNIDS).

774 P. Casas et al. / Computer Communications 35 (2012) 772–783

time and scalability, are discussed in Section 7. Finally, Section 8concludes the paper.

2. Related work & contributions

The problem of network attacks and intrusions detection hasbeen extensively studied in the last decade. Most NIDS are basedon misuse detection techniques, being Bro [28] and SNORT [29]the two most celebrated open-source systems in the literature.Most of the different techniques traditionally applied in the NIDSfield are summarized in [7,8]. A particularly studied approach formisuse detection that we shall include in this paper for perfor-mance-comparison consists in the use of decision trees [9,10]; insimple words, a tree-like graph is constructed from a training setof labeled attack/normal operation flows and then used to classifythe nature of new incoming traffic flows.

Our approach falls within the unsupervised attacks detectiondomain. The vast majority of the unsupervised detection schemesproposed in the literature are based on clustering and outliersdetection, being [19–21] some relevant examples. In [19], authorsuse a single-linkage hierarchical clustering method to cluster datafrom the KDD99 dataset, based on the standard Euclidean distancefor inter-patterns similarity. Reference [20] reports improved re-sults in the same dataset, using three different clustering algo-rithms: Fixed-Width clustering, an optimized version of k-NN,and one class SVM. Ref. [21] presents a combined density-grid-based clustering algorithm to improve computational complexity,obtaining similar detection results. PCA-based anomaly detectionrepresents another well-known unsupervised technique, used in[12,13] to detect network wide traffic anomalies in highly aggre-gated traffic flows, and more recently in [14] for anomaly detectionin single router traffic metrics.

The Unsupervised Network Intrusion Detection System pre-sented in this paper presents several advantages with respect tocurrent state of the art. First and most important, it works in acompletely unsupervised fashion, which means that it can be di-rectly plugged-in to any monitoring system and start to work fromscratch, without any kind of calibration and/or training step. Sec-ondly, it uses a robust density-based clustering technique to avoidgeneral clustering problems such as sensitivity to initialization,specification of number of clusters, detection of particular clustershapes, or structure-masking by irrelevant features. Thirdly, it per-forms clustering in very-low-dimensional spaces, avoiding sparsityproblems when working with high-dimensional data [17]. Finally,we shall evidence that UNIDS clearly outperforms some previouslyproposed methods for unsupervised detection in both artificial andreal network traffic.

Fig. 2. Low-intensity anomalies might be hidden inside highly aggregated traffic,but are visible at finer-grained aggregations.

3. Multi-resolution flows & change-detection

UNIDS performs unsupervised detection of network attacks onpacket-level traffic, captured at a single monitoring point in con-secutive time slots of fixed length DT , and aggregated in IP flows(standard 5-tuples). UNIDS could be extended to work on net-work-wide traffic captured at multiple vantage points, but onlyin the case of accurate time-synchronization between measuringprobes, and having access to the aggregated packet traces fromall monitoring points. In any case, this paper considers only trafficcaptured at a single monitoring point for the evaluations. IP flowsare additionally aggregated at different flow-resolution levels, using9 different aggregation keys li, please refer to Fig. 1. These include,from coarser to finer-grained resolution: traffic per Time Slot(l1 : tpTS), source Network Prefixes (l2;3;4 : IPsrc=8; =16; =24), destina-tion Network Prefixes (l5;6;7 : IPdst=8; =16; =24), source IPs (l8 : IPsrc),and destination IPs (l9 : IPdst). The 7 coarsest-grained resolutions

are used for change-detection, while the remaining 2 are exclu-sively used in the clustering step.

To detect an anomalous time slot, time-series Zlit are con-

structed for the simple aforementioned traffic metrics includingnumber of bytes, packets, and flows per time slot, using aggrega-tion keys i ¼ 1; . . . ;7. Any generic change-detection algorithmFð:Þ based on time-series analysis is then applied to Zli

t . At eachnew time slot, Fð:Þ analyses the different time-series associatedwith each aggregation key, going from coarser (l1) to finer resolu-tion (l7). Time slot t0 is flagged as anomalous if FðZli

t0Þ triggers an

alarm for any of the traffic metrics at any of the 7 aggregation lev-els. Tracking anomalies at multiple aggregation levels providesadditional reliability to the change-detection algorithm, and per-mits to detect both single source–destination and distributedanomalies of very different intensities.

Fig. 2 shows how a low intensity DDoS attack might be dwarfedby highly-aggregated traffic. The time-series associated with thenumber of packets, Zt ¼ #pktst , does not present a perceptiblechange Dð#pktstÞ at tpTS aggregation (left); however, the attackcan be easily detected using a finer-grained resolution, e.g., atthe victim’s network (IPdst=24 aggregation, on the right).

4. Unsupervised NIDS through clustering

The unsupervised detection step takes as input all the IP flowsin the flagged time slot. At this step UNIDS ranks the degree ofabnormality of each flow, using clustering and outliers analysistechniques. For doing so, IP flows are analyzed at two different res-olutions, using either IPsrc or IPdst aggregation key. Traffic anom-alies can be roughly grouped in two different classes, depending ontheir spatial structure and number of impacted IP flows: 1-to-Nanomalies and N-to-1 anomalies. 1-to-N anomalies involve manyIP flows from the same source towards different destinations;examples include network scans and spreading worms/virus. Onthe other hand, N-to-1 anomalies involve IP flows from differentsources towards a single destination; examples include DDoS at-tacks and flash-crowds. 1-to-1 anomalies are a particular case ofthese classes, while N-to-N anomalies can be treated as multipleN-to-1 or 1-to-N instances. Using IPsrc key permits to highlight1-to-N anomalies, while N-to-1 anomalies are more easily detectedwith IPdst key. The choice of both keys for clustering analysisensures that even highly distributed anomalies, which may

Fig. 3. Sub-Space Clustering: 2-dimensional sub-spaces X1;X2, and X3 are obtainfrom a 3-dimensional feature space X by simple projection. Units in the graph areirrelevant.


possibly involve a large number of IP flows, can be represented asoutliers.

Without loss of generality, let Y ¼ fy1; . . . ; yng be the set of naggregated-flows (at IPsrc or IPdst) in the flagged slot. Each flowyi 2 Y is described by a set of m traffic attributes or features, likenumber of sources and ports, or packet rate. Let xi 2 Rm be the vec-tor of features describing flow yi, and X ¼ fx1; . . . ;xng 2 Rn�m thecomplete matrix of features, referred to as the feature space.

UNIDS is based on clustering techniques applied to X. Theobjective of clustering is to partition a set of unlabeled samplesinto homogeneous groups of similar characteristics or clusters,based on some measure of similarity defined for the underlyingsamples’ space. Very different notions of similarity or distance areused in the clustering domain, ranging from simple Euclidean dis-tance to more complex notions such as statistical-based similarity.Samples that do not belong to any of the generated clusters areclassified as outliers. Our particular goal is to identify those outliersthat are remarkably different from the rest of the samples, addi-tionally ranking how much different these are. The most appropri-ate approach to find outliers is, ironically, to properly identifyclusters. After all, an outlier is a sample that does not belong toany cluster. Unfortunately, even if hundreds of clustering algo-rithms exist [17], it is very difficult to find a single one that canhandle all types of cluster shapes and sizes. Different clusteringalgorithms produce different partitions of data, and even the sameclustering algorithm provides different results when using differ-ent initializations and/or different algorithm parameters. This isin fact one of the major drawbacks in current cluster analysis tech-niques: the lack of robustness.

To avoid such a limitation, we have developed a divide & con-quer clustering approach, using the notions of clustering ensembleand multiple clusterings combination. These ideas are well-knownwithin the machine-learning community, but the application ofthese techniques for anomaly and intrusions detection is noveland appealing: why not taking advantage of the information pro-vided by multiple partitions of X to improve clustering robustnessand identification of outliers? A clustering ensemble P consists of aset of multiple partitions Pi produced for the same data. Each par-tition provides an independent evidence of data structure, whichcan be combined to construct a new measure of similarity that bet-ter reflects natural groupings and outliers. There are different waysto produce a clustering ensemble. For example, multiple partitionscan be generated by using different clustering algorithms, or byapplying the same clustering algorithm with different settingparameters or initializations. We particularly use Sub-Space Clus-tering (SSC) [18] to produce multiple data partitions, doing den-sity-based clustering in N different sub-spaces Xi of the originalspace, please refer to Fig. 1.

4.1. Clustering ensemble and Sub-Space Clustering

Each of the N sub-spaces Xi � X is obtained by projection of Xinto k features out of the m attributes, resulting in N k-dimensionalsub-spaces. To deeply explore the complete feature space, thenumber of sub-spaces N that are analyzed corresponds to the num-ber of k combinations obtained from m. Fig. 3 explains this ap-proach; in the example, a 3-dimensional feature space X isprojected into N ¼ 3 2-dimensional sub-spaces X1;X2, and X3,which are then independently partioned via density-based cluster-ing. Each partition Pi is obtained by applying DBSCAN [23] to sub-space Xi. DBSCAN is a powerful clustering algorithm that discoversclusters of arbitrary shapes and sizes [17], relying on a density-based notion of clusters: clusters are high-density regions of thespace, separated by low-density areas. This algorithm perfectly fitsour unsupervised traffic analysis, because it is not necessary to

specify a priori difficult to set parameters such as the number ofclusters to identify.

The results obtained by applying DBSCAN to sub-space Xi aretwofold: a set of pðiÞ clusters fCi

1;Ci2; . . . ;Ci

pðiÞg and a set of qðiÞ out-liers foi

1; oi2; . . . ; oi

qðiÞg. To set the number of dimensions k of eachsub-space, we take a very useful property of monotonicity in clus-tering sets, known as the downward closure property: if a collec-tion of elements is a cluster in a k-dimensional space, then it isalso part of a cluster in any ðk� 1Þ projections of this space. Thisdirectly implies that, if there exists any interesting evidence ofdensity in X, it will certainly be present in its lowest-dimensionalsub-spaces. Using small values for k provides several advantages:firstly, doing clustering in low-dimensional spaces is more efficientand faster than clustering in bigger dimensions. Secondly, density-based clustering algorithms such as DBSCAN provide better resultsin low-dimensional spaces [17], because high-dimensional spacesare usually sparse, making it difficult to distinguish between highand low density regions. Finally, clustering multiple low-dimen-sional sub-spaces provides a finer-grained analysis, which im-proves the ability of UNIDS to detect attacks of very differentcharacteristics. We shall therefore use k ¼ 2 for Sub-Space Cluster-ing, which gives N ¼ mðm� 1Þ=2 partitions.

Algorithm 1: Ranking Outliers via EA4RO

1: Initialization:2: Set dissimilarity vector D to a null n� 1 vector3: Set smallest cluster-size nmin ¼ a � n4: for i ¼ 1 : N do5: Set density neighborhood di for DBSCAN6: Pi ¼ DBSCAN Xi; di;nminð Þ7: Update DðjÞ;8 outlier oi

j 2 Pi:

8: wi nðn�nmaxi

Þþ�

9: DðjÞ DðjÞ þ dMðoij;C

imaxÞwi

10: end for11: Rank flows: Drank = sort (D)12: Set detection threshold: Th = find slope break (Drank)

4.2. Ranking Outliers using Evidence Accumulation

Having produced the N partitions, the question now is how touse the multiple identified clusters and outliers to detect networkattacks and other anomalies. A possible answer is provided in [22],where authors introduced the idea of Evidence Accumulation Clus-tering (EAC). EAC uses the clustering results of multiple partitions


Pi to produce a new inter-samples similarity measure that betterreflects their natural groupings.

UNIDS implements a particular algorithm for Evidence Accumu-lation, which we shall refer to as Evidence Accumulation for Rank-ing Outliers (EA4RO): instead of producing a similarity measurebetween the n different aggregated flows described in X, EA4ROconstructs a dissimilarity vector D 2 Rn in which it accumulatesthe distance between the different outliers oi

j found in each sub-space i ¼ 1; . . . ;N and the centroid of the corresponding sub-space-biggest-cluster Ci

max. The idea behind EA4RO is to clearlyhighlight those aggregated flows that are remarkably differentfrom the normal-operation traffic at each of the differentsub-spaces, statistically represented by Ci

max. The reader shouldremember here that even highly distributed attacks composed ofthousands of IP flows can be represented as outliers using trafficaggregation.

Algorithm 1 presents a pseudo-code for EA4RO. The differentparameters used by EA4RO are automatically set by the algorithmitself. The first two parameters are used by the density-based clus-tering algorithm: nmin specifies the minimum number of flows thatcan be classified as a cluster, while di indicates the maximumneighborhood distance of a sample to identify dense regions. nmin

is set at the initialization of the algorithm, simply as a fraction aof the total number of flows n to analyze (we take a ¼ 5% of n).di is set as a fraction of the average distance between flows insub-space Xi (we take a fraction 1/10), which is estimated from10% of the flows, randomly selected. This permits to speed-upcomputations.

The weighting factor wi is used as an outlier-boosting parame-ter, as it gives more relevance to those outliers that are ‘‘less prob-able’’. Figs. 4(a) and (b) explain this idea: wi takes bigger valueswhen the size nmaxi

of cluster Cimax is closer to the total number

of flows n, implying that this outlier is particularly different fromthe majority.

Finally, instead of using a simple Euclidean distance as a mea-sure of dissimilarity, we compute the Mahalanobis distance dM be-tween outliers and the centroid of the biggest cluster. TheMahalanobis distance takes into account the correlation betweensamples, dividing the standard Euclidean distance by the varianceof the samples. This permits to boost the degree of abnormality ofan outlier when the variance of the samples is smaller. Figs. 4(c)and (d) permit to evidence the advantage of using such a measure.The Euclidean distance between the centroid of the cluster and theoutlier is the same in both figures. However, variance in Fig. 4(c) isbigger than in Fig. 4(d), which intuitively should reduce the abnor-mality degree of the outlier. Such a notion is correctly captured bythe Mahalanobis distance, which is smaller in Fig. 4(c).

In the last part of EA4RO, flows are ranked according to the dis-similarity obtained in D, and the anomaly detection threshold Th isset. The computation of Th is simply achieved by finding the valuefor which the slope of the sorted dissimilarity values in Drank pre-sents a major change. In the evaluation section we explain howto perform this computation with an example of real traffic analy-sis. Detection is finally done as a binary thresholding operation onDrank: if DrankðiÞ > Th, UNIDS flags an anomaly in flow yi.

5. UNIDS performance detection in real traffic

We evaluate the ability of UNIDS to detect different attacks inreal traffic traces from the public MAWI repository of the WIDEproject [25]. The WIDE operational network provides interconnec-tion between different research institutions in Japan, as well asconnection to different commercial ISPs and universities in theUS. The traffic repository consists of 15 min-long raw packet tracesdaily collected for the last ten years. The traces we shall work with

consist of traffic from one of the trans-pacific links between Japanand the US. MAWI traces are not labeled, but some previous workon anomaly detection has been done on them [16,24]. In particular,[24] detects network attacks using a signature-based approach,while [16] detects both attacks and anomalous flows using non-Gaussian modeling. We shall therefore refer to the combinationof results obtained in both works as our ground truth for MAWItraffic.

We shall also test the true positive and false positive rates ob-tained with UNIDS in the detection of flooding attacks in traffictraces from the METROSEC project [27]. These traces consist of realtraffic collected on the French RENATER network, containing simu-lated attacks performed with well-known DDoS attack tools. Traceswere collected between 2004 and 2006, and contain DDoS attacksthat range from very low intensity (i.e., less than 4% of the overalltraffic volume) to massive attacks (i.e., more than 80% of the over-all traffic volume). In addition, we compare the performance ofUNIDS against some previous methods for unsupervised anomalydetection presented in Section 2.

5.1. Features selection for detection of attacks

The selection of the m features used in X to describe the aggre-gated flows in Y is a key issue to any intrusion detection system,but it becomes critical and challenging in the case of unsuperviseddetection, because there is no additional information to select themost relevant set. In general terms, using different traffic featurespermits to detect different types of anomalies. In this paper weshall limit our study to detect well-known attacks, using a set ofstandard traffic features widely used in the literature. However,as shown in Section 6, UNIDS can be extended to detect differenttypes of anomalies, considering different sets of traffic features.In fact, more features can be added to any standard list to improvedetection results. The main advantage of UNIDS is that we have de-vised an algorithm to highlight outliers respect to any set of fea-tures, and this is why we claim that our algorithm is highlyapplicable.

In this section we shall use the following list of m ¼ 9 traffic fea-tures: number of source/destination IP addresses and ports(nSrcs;nDsts;nSrcPorts;nDstPorts), ratio of number of sources tonumber of destinations, packet rate (nPkts= sec), fraction of ICMPpackets (nICMP=nPkts) and SYN packets (nSYN=nPkts), and aver-age packet size (avgPktsSize). According to previous work on signa-ture-based anomaly characterization [24], such simple trafficdescriptors permit to describe standard network attacks such asDoS, DDoS, scans, and spreading worms/virus. In Section 6 weshow how UNIDS can efficiently detect other kinds of attacks suchas buffer overflows, password guessing attacks, and unauthorizedaccess to network resources when considering other types of trafficdescriptors.

Table 1 describes the impacts of different types of attacks on theselected traffic features. All the thresholds used in the descriptionare introduced to better explain the evidence of an attack in someof these features. DoS/DDoS attacks are characterized by manysmall packets sent from one or more source IPs towards a single des-tination IP. These attacks generally use particular packets such asTCP SYN or ICMP echo-reply/echo-request, or host unreachablepackets. Port and network scans involve small packets from onesource IP to several ports in one or more destination IPs, and areusually performed with SYN packets. Spreading worms differ fromnetwork scans in that they are directed towards a small specificgroup of ports for which there is a known vulnerability to exploit(e.g. Blaster on TCP port 135,Slammer on UDP port 1434,Sasser onTCP port 455), and they generally use slightly bigger packets. Someof these attacks can use other types of traffic, such as FIN, PUSH, URGTCP packets or small UDP datagrams. The aforementioned

Fig. 4. Evidence Accumulation for Ranking Outliers. Weighting factor wi and Mahalanobis distance dM both permit to boost the degree of abnormality of an outlying flow.Units in these graphs are irrelevant.

Table 1Features used by UNIDS in the detection of DoS, DDoS, network/port scans, and spreading worms. For each type of attack, we describe its impact on the selected traffic features.

Type of Attack Class Agg-Key Impact on traffic features

DoS (ICMP/SYN) 1-to-1 IPdst nSrcs ¼ nDsts ¼ 1; nPkts= sec > k1; avgPktsSize < k2; nICMP=nPkts > k3; nSYN=nPkts > k4

DDoS (ICMP/SYN) N-to-1 IPdst nDsts ¼ 1; nSrcs > a1; nPkts= sec > a2; avgPktsSize < a3; nICMP=nPkts > a4; ðnSYN=nPkts > a5

Port scan 1-to-1 IPsrc nSrcs ¼ nDsts ¼ 1; nDstPorts > b1; avgPktsSize < b2; nSYN=nPkts > b3.Network scan 1-to-N IPsrc nSrcs ¼ 1; nDsts > d1; nDstPorts > d2; avgPktsSize < d3; nSYN=nPkts > d4.Spreading worms 1-to-N IPsrc nSrcs ¼ 1; nDsts > g1; nDstPorts < g2; avgPktsSize < g3; nSYN=nPkts > g4.


particularities permit to differentiate these kinds of distributed at-tacks from other distributed yet legitimate traffic (e.g., P2P traffic).Section 5.2 shows practical evidence on this claim.

5.2. Detecting attacks in MAWI traffic

We begin by analyzing the performance of UNIDS to detect net-work attacks and other types of anomalies in one of the traces pre-viously analyzed in [16]. IP flows are aggregated with IPsrc key, i.e.,each aggregated flow corresponds to many IP flows coming fromthe same IP address. Fig. 5(a) shows the ordered dissimilarity val-ues in Drank obtained by the EA4RO algorithm, along with their cor-responding manual classification. The first two most dissimilarflows correspond to a highly distributed SYN network scan (morethan 500 destination hosts) and an ICMP spoofed flooding attackdirected to a small number of victims (ICMP redirect traffic

towards port 0). The following two flows correspond to unusuallarge rates of DNS traffic and HTTP requests; from there on, flowscorrespond to normal-operation traffic. The ICMP flooding attackand the two unusual flows are also detected in [16]; the SYN scanwas missed by their method, but it was correctly detected withaccurate signatures [24]. Setting the detection threshold accordingto the previously discussed approach results in Th1 . Indeed, if wefocus on the shape of the ranked dissimilarity in Fig. 5(a), we canclearly appreciate a major change in the slope after the 5th rankedflow. Note however that both attacks can be easily detected andisolated from the anomalous but yet legitimate traffic without falsealarms, using for example the threshold Th2 on Drank.

Figs. 5(b) and (c) depict the corresponding four anomalousaggregated flows in two of the N partitions produced by Sub-SpaceClustering and the EA4RO algorithm. Besides showing typical char-acteristics of the attacks, such as a large value of nPkts= sec or a

Fig. 5. Detection and analysis of network attacks in WIDE.


100% of ICMP/SYN packets, both figures permit to appreciate thatthe detected attacks do not necessarily represent the largest ele-phant flows in the time slot. This emphasizes the ability of UNIDSto detect attacks of low intensity, even of lower intensity than nor-mal traffic.

5.3. Detecting attacks with ground truth

Fig. 6 depicts the True Positives Rate (TPR) as a function of theFalse Positives Rates (FTR) in the detection of different attacks inMAWI and METROSEC. Fig. 6(a) corresponds to the detection of36 anomalies in MAWI traffic, using IPsrc as key. These anomaliesinclude network and port scans, worm scanning activities (Sasserand Dabber variants), and some anomalous flows consisting onvery high volumes of NNTP traffic. Fig. 6(b) also corresponds to

Fig. 6. True Positives Rate vs false a

anomalies in MAWI traffic, but using IPdst as key. In this case, thereare 9 anomalies, including different kinds of flooding DoS/DDoS at-tacks. Finally, Fig. 6(c) corresponds to the detection of 9 DDoS at-tacks in the METROSEC dataset. From these, 5 correspond tomassive attacks (more than 70% of traffic), 1 to a high intensity at-tack (about 40%), 2 are low intensity attacks (about 10%), and 1 is avery-low intensity attack (about 4%). The detection is performedusing traffic aggregated with IPdst key. In the three evaluation sce-narios, the ROC plot is obtained by comparing the sorted dissimi-larities in Drank to a variable detection threshold.

We compare the performance of UNIDS against three previouslyused approaches for unsupervised detection of network attacksand anomalies: DBSCAN-based, k-means-based, and PCA-basedoutliers detection. The first two consist in applying either DBSCANor k-means to the complete feature space X, identify the largest

larms in MAWI and METROSEC.


cluster Cmax, and compute the Mahalanobis distance of all the flowslying outside Cmax to its centroid. The ROC is finally obtained bycomparing the sorted distances to a variable detection threshold.These approaches are similar to those used in previous work[19–21]. In the PCA-based approach, PCA and the sub-space meth-ods [12,13] are applied to the complete matrix X, and the attacksare detected by comparing the residuals to a variable threshold.Both the k-means and the PCA-based approaches require fine tun-ing: in k-means, we repeat the clustering for different values ofclusters k, and take the average results. In the case of PCA we pres-ent the best performance obtained for each evaluation scenario.

Obtained results permit to evidence the great advantage ofusing the EA4RO algorithm for outliers detection with respect toprevious approaches. In particular, all the approaches used inthe comparison generally fail to detect all the attacks with a rea-sonable false alarm rate. Both the DBSCAN-based and the k-means-based algorithms get confused by masking features whenanalyzing the complete feature space X. The PCA approach showsto be not sensitive enough to discriminate different kinds of attacksof very different intensities, using the same representation for nor-mal-operation traffic. This limitation has already been detected inprevious work [15].

6. UNIDS vs NIDS in KDD99

In this section we evaluate the ability of UNIDS to detect net-work attacks in the well-known and widely used KDD99 networkattacks dataset [26]. The main interest of this evaluation is to showthat UNIDS can detect other kinds of attacks when consideringother types of traffic descriptors. As we did in Section 5, we shallcompare UNIDS against the previous methods used for unsuper-vised detection, namely DBSCAN-based, k-means-based, and PCA-based detection. In addition, we compare UNIDS performance withthat obtained by an extensively investigated misuse NIDS based ondecision trees. Decision trees permit to construct comprehensivesignatures for network attacks in the form of multiple filtering-rules, using a graph structure.

The KDD99 dataset contains a wide variety of intrusions simu-lated in a military network environment. Traffic consists of packetsaggregated into connections, being a connection a flow of TCPpackets between a source and a destination IP address. Simulatedattacks include DoS attacks, unauthorized access from a remotemachine – R2L attacks (e.g. password guessing), unauthorized ac-cess to super-user privileges – U2R attacks (e.g. buffer overflows),and probing attacks (e.g. port scanning). Each connection or flow isdescribed by a set of m ¼ 41 features (e.g. number of bytes, TCPflags, failed remote-login attempts, etc.) and a label indicatingeither the name of the attack or if the flow corresponds to nor-mal-operation traffic.

In order to compare the performance of UNIDS against a deci-sion-tree-based NIDS we take two different data sub-sets, the firstused for training issues and the second one for testing. The testingdataset corresponds to instances of different attacks that are notpresent in the training dataset, which by the way will permit usto evidence the paramount advantage of doing unsuperviseddetection when new attacks arise. DoS and probing attacks inKDD99 are represented by a large number of flows, in some caseseven more flows than those corresponding to normal-operationtraffic. While this was not an issue for UNIDS when evaluated inprevious section with real traffic traces, KDD99 flows provided at[26] are already pre-processed and can not be aggregated becausethe corresponding IP addresses are not available. To avoid this lim-itation in the already processed KDD99 flows, we shall select only asmall fraction of flows for both DoS and probing attacks in ourtraining and testing datasets. In any case, the reader should

remember that we have already proved that UNIDS is able to de-tect both DoS and probing attacks in real traffic traces when usingflows aggregation. The training dataset has 950 normal flows and255 attacks, while the testing dataset consists of 950 normal flowsand 162 attacks.

Let us first evaluate the True Positives and False Positives ratesobtained by UNIDS in both training and testing datasets. Fig. 7 de-picts the corresponding ROC curves. Fig. 7(a) shows the results ob-tained by applying UNIDS to the training dataset, while Fig. 7(b)shows the results obtained when analyzing the testing dataset. Inboth cases we can appreciate that UNIDS is able to detect a largefraction of attacks (more than 90%) with very low false positiverates (less than 1% and 3.5% respectively). Fig. 7 additionally com-pares the obtained detection performance against the three previ-ous approaches used for unsupervised detection. In both cases weappreciate once again the outperforming ability of UNIDS w.r.t.these approaches, which fail as before to detect as many attacksas UNIDS with a reasonable false alarms rate.

Let us now compare UNIDS against a largely studied misuse-based NIDS built through decision trees. We shall build a differentdecision tree for each of the four different categories of attacks(DoS, probe, R2L, and U2R), using the training dataset and standardC4.5 decision trees [9,10]. To train each of the trees for each differ-ent category of attacks, we consider that the flows belonging to therest of the categories of attacks as well as the normal operationflows correspond to the ‘‘negative’’ class (there is no attack of thecorresponding category). For example, let us suppose that we wantto build a decision tree to detect R2L attacks; in that case, all theflows in the training dataset which belong to the R2L category be-long to the ‘‘positive’’ class (there is a R2L attack), while the nor-mal-operation flows as well as the DoS, probe, and U2R flowscompose the negative class. In order to avoid over-fitting problemswhen training the decisions trees, we have decided to use early-stopping learning. For this reason, not all the attacks present inthe training dataset are detected by the decision-trees-based NIDS.

Fig. 8 presents the detection accuracy (number of correctly de-tected attacks) obtained either with UNIDS or with the NIDS previ-ously described in both the training and testing datasets. Resultsare individually presented for each of the four categories of attacks.As expected, results obtained by both systems are very similar inthe training dataset: in the case of UNIDS, we have already shownin Fig. 7 that more than 90% of the attacks can be correctly de-tected; in the case of the NIDS, it is quite obvious that using thesystem to detect the attacks which has been programmed to alerton shall provide proper results. What it is interesting to appreciateis what happens with both systems when we try to detect un-known attacks. Fig. 8(b) evidences the limitations of the NIDS todetect unknown attacks, and more importantly, the paramountadvantage of using our UNIDS for detecting new previously unseenattacks. We refer the reader to Fig. 9 to appreciate the detectionaccuracy obtained with both systems for the different attacks oneach of the four different attacks categories available in theKDD99 dataset.

7. Computational Time and Parallelization

The last issue that we analyze is the Computational Time (CT) ofUNIDS. The EA4RO algorithm performs multiple clusterings inN ¼ mðm� 1Þ=2 low-dimensional sub-spaces Xi � X. This multiplecomputation imposes scalability issues for detection of attacks inhigh-speed networks. Two key features of the algorithm areexploited to reduce scalability problems in number of features mand the number of aggregated flows n to analyze. Firstly, clusteringis performed in very low dimensional sub-spaces, Xi 2 R2, which isfaster than clustering in high-dimensional spaces [17]. Secondly,

Fig. 7. True Positives Rate vs false alarms in KDD99.

Fig. 8. UNIDS vs Misuse-based NIDS in KDD99.


each sub-space can be clustered independently of the other sub-spaces, which is perfectly adapted for parallel computing architec-tures. Parallelization can be achieved in different ways: using a sin-gle multi-processor and multi-core machine, using networkprocessor cards and/or GPU (Graphic Processor Unit) capabilities,using a distributed group of machines, or combining these tech-niques. We shall use the term ’’slice’’ as a reference to a single com-putational entity.

Fig. 10 depicts the CT of the EA4RO algorithm, both (a) as afunction of the number of features m used to describe trafficflows and (b) as a function of the number of flows n to analyze.Fig. 10 (a) compares the CT obtained when clustering the com-plete feature space X, referred to as CTðXÞ, against the CT ob-tained with EA4RO, varying m from 2 to 29 features. Weanalyze a large number of aggregated flows, n ¼ 104, and usetwo different number of slices, M ¼ 40 and M ¼ 100. The analysisis done with traffic from the WIDE network, combining differenttraces to attain the desired number of flows. To estimate the CT of

EA4RO for a given value of m and M, we proceed as follows: first,we separately cluster each of the N sub-spaces Xi, and take theworst-case of the obtained clustering time as a representativemeasure of the CT in a single sub-space, i.e., CTðXSSCwcÞ ¼max i

CTðXiÞ. Then, if N 6 M, we have enough slices to completely par-allelize the algorithm, and the total CT corresponds to the worst-case, CTðXSSCwcÞ. On the contrary, if N > M, some slices have tocluster various sub-spaces, one after the other, and the total CTbecomes ðN%M þ 1Þ times the worst-case CTðXSSCwcÞ, where % rep-resents integer division. The first interesting observation fromFig. 10 (a) regards the increase of CTðXÞ when m increases, goingfrom about 8 s for m ¼ 2 to more than 200 s for m ¼ 29. As wesaid before, clustering in low-dimensional spaces is faster, whichreduces the overhead of multiple clusterings computation. Thesecond paramount observation is about parallelization: if thealgorithm is implemented in a parallel computing architecture,it can probably be used to analyze large volumes of traffic usingmany traffic descriptors in a reasonable time scale as compared

Fig. 9. UNIDS vs Misuse-based NIDS – R2L, DoS, U2R, and probing attacks.


with the duration of time-slots. The use of the term probablyemphasizes the fact that the simulation results presented in

Fig. 10 do not take into account the additional overheads entailedin the deployment of the algorithms on a real parallel architec-

Fig. 10. Computational Time as a function of n of features and n of flows to analyze. The number of aggregated flows in (a) is n = 10,000. The number of features and slices in(b) is m ¼ 20 and M ¼ 190 respectively.


ture, both at the hardware and at the system level (e.g., architec-ture of sampling components or pre-processing units and soforth).

Fig. 10 (b) compares CTðXÞ against CTðXSSCwcÞ for an increasingnumber of flows n to analyze, using m ¼ 20 traffic features andM ¼ N ¼ 190 slices (i.e., a completely parallelized implementationof the EA4RO algorithm). As before, we can appreciate the differ-ence in CT when clustering the complete feature space vs. usinglow-dimensional sub-spaces: the difference is more than one orderof magnitude, independently of the number of flows to analyze.Regarding the volume of traffic that can be analyzed with this100% parallel configuration, the EA4RO algorithm can analyze upto 50000 flows with a reasonable CT, about 4 min in this experi-ence. In the evaluations with real traffic traces presented inSection 5, the number of aggregated flows in a time slot ofDT ¼ 20 s rounds the 2500 flows, which represents a value ofCTðXSSCwcÞ � 0:4 s. For the m ¼ 9 features that we have used(N ¼ 36), and even without doing parallelization, the total CT isN � CTðXSSCwcÞ � 14:4 s.

8. Conclusions

The Unsupervised Network Intrusion Detection System that wehave presented has many interesting advantages w.r.t. previousproposals. It uses exclusively unlabeled data to detect network at-tacks, without assuming any kind of signature, particular model, orcanonical data distribution. This allows to detect new previouslyunseen network attacks, outperforming current misuse-basedNIDS.

Despite using ordinary clustering techniques to identify net-work attacks, our approach avoids the lack of robustness of generalclustering methods, by combining the notions of Sub-Space Clus-tering, Density-based Clustering, and Multiple Evidence Accumula-tion to rank the degree of abnormality of traffic flows. We haveverified the effectiveness of the proposed UNIDS system to detectboth simulated and real network attacks of very different nature(DoS, DDoS, scans, worms, buffer overflows, etc.) in multiple traffictraces, all in a completely blind fashion, without assuming any par-ticular traffic model, clustering parameters, or even clustersstructure.

Additionally, we have shown detection results that outperformtraditional approaches for intrusion detection and unsupervisedoutliers detection, providing a stronger evidence of the accuracyof this novel UNIDS to detect unknown network attacks.

Finally, and contrary to previous work on clustering and outliersanalysis for detection of network attacks, we have conducted anevaluation on the computational time of the core algorithms.Results suggest that the application of UNIDS for unsupervisedintrusions detection can be drastically improved regarding compu-tational time of the analysis if the algorithms are deployed on aparallel computation architecture. Unfortunately, we do not haveat the time of writing this paper an implementation of UNIDS ona real parallel computation architecture, which limits the practicalapplicability of drawn conclusions on computational time.

Acknowledgments

This work has been done in the framework of the European pro-ject ECODE, funded by the European Commission under grant FP7-ICT-2007-2/223936.

References

[1] S. Hansman, R. Hunt, A taxonomy of network and computer attacks, in:Computers and Security, vol. 24, no. 1, pp. 31–43, 2005.

[2] Symantec Enterprise Security, Symantec internet security threat report, Trendsfor 2010, Symantec reports, vol. 16, 2011.

[3] N. Wyler, Aggressive Network Self-Defense, Syngress-Elsevier, 2005. ISBN 978-1-931836-20-3.

[4] G. Cormode, S. Muthukrishnan, What is new: finding significant differences innetwork data streams, in: IEEE Transactions on Networking, vol. 13 no. 6, 2005,pp. 1219–1232.

[5] P. Casas, J. Mazel, P. Owezarski, UNADA: Unsupervised network anomalydetection using sub-space outliers ranking, in: IFIP Networking, 2011.

[6] G. Androulidakis, V. Chatzigiannakis, S. Papavassiliou, Network anomalydetection and classification via opportunistic sampling, IEEE Network 23 (1)(2009).

[7] R. Kemmerer, G. Vigna, Intrusion detection: a brief history and overview, IEEESecurity & Privacy Magazine 35 (4) (2002).

[8] A. Sperotto, G. Schaffrath, R. Sadre, C. Morariu, A. Pras, B. Stiller, An Overview ofIP Flow-Based Intrusion Detection, IEEE Communications Surveys & Tutorials12 (3) (2010).

[9] C. Kruegel, T. Toth, Using decision trees to improve signature-based intrusiondetection, in: Proceedings RAID, 2003.

[10] J. Lee, H. Lee, S. Sohn, J. Ryu, T. Chung, Effective value of decision tree withKDD99 intrusion detection datasets for IDS, in: Proceedings ICACT, 2008.

[11] A. Lakhina, M. Crovella, C. Diot, Characterization of network-wide anomalies intraffic flows, in: Proceedings ACM IMC, 2004.

[12] A. Lakhina, M. Crovella, C. Diot, Diagnosing network-wide traffic anomalies, in:Proceedings ACM SIGCOMM, 2004.

[13] A. Lakhina, M. Crovella, C. Diot, Mining anomalies using traffic featuredistributions, in: Proceedings ACM SIGCOMM, 2005.

[14] D. Brauckhoff, K. Salamatian, M. May, Applying PCA for traffic anomalydetection: problems and solutions, in: Proceedings IEEE INFOCOM, 2009.

[15] H. Ringberg, A. Soule, J. Rexford, C. Diot, Sensitivity of PCA for traffic anomalydetection, in: Proceedings ACM SIGMETRICS, 2007.


[16] G. Dewaele, K. Fukuda, P. Borgnat, P. Abry, K. Cho, Extracting hidden anomaliesusing sketch and non gaussian multi-resolution statistical detectionprocedures, in: Proceedings ACM LSAD Workshop, 2007.

[17] A.K. Jain, Data clustering: 50 years beyond K-means, Pattern RecognitionLetters 31 (8) (2010) 651–666.

[18] L. Parsons, E. Haque, H. Liu, Subspace clustering for high dimensional data: areview, ACM SIGKDD Expl. Newsletter 6 (1) (2004) 90–105.

[19] L. Portnoy, E. Eskin, S. Stolfo, Intrusion detection with unlabeled data usingclustering, in: Proceedings ACM DMSA Workshop, 2001.

[20] E. Eskin, A. Arnold, M. Prerau, L. Portnoy, S. Stolfo, A geometric frameworkfor unsupervised anomaly detection: detecting intrusions in unlabeleddata, Applications of Data Mining in Computer Security, Kluwer Publisher,2002.

[21] K. Leung, C. Leckie, Unsupervised Anomaly detection in network intrusiondetection using clustering, in: Proceedings ACSC, 2005.

[22] A. Fred, A.K. Jain, Combining multiple clusterings using evidenceaccumulation, IEEE Transactions on Pattern Analysis and MachineIntelligence 27 (6) (2005).

[23] M. Ester, P. Kriegel, J. Sander, X. Xu, A density-based algorithm for discoveringclusters in large spatial databases with noise, in: Proceedings ACM SIGKDD,1996.

[24] G. Fernandes, P. Owezarski, Automated classification of network trafficanomalies, in: Proceedings SecureComm, 2009.

[25] K. Cho, K. Mitsuya, A. Kato, Traffic data repository at the WIDE Project, in:USENIX Annual Technical Conference, 2000.

[26] The KDD Cup 1999 Dataset, at <kdd.ics.uci.edu/databases/kddcup99>.[27] METROlogy for SECurity and QoS, at <laas.fr/METROSEC>.[28] Bro intrusion detection system, at <www.bro-ids.org>.[29] SNORT: an Open source network intrusion prevention and detection system, at

<www.snort.org>.

http://kdd.ics.uci.edu/databases/kddcup99

http://laas.fr/METROSEC

http://www.bro-ids.org

http://www.snort.org

Documents

Unsupervised Network Intrusion Detection Systems: Detecting the Unknown without Knowledge