
IEEE 9th IFIP/IEEE International Symposium on Integrated Network Management (IM 2005), Nice, France, 15-19 May 2005



Anomaly Detection for Internet Worms

Yousof Al-Hammadi† and Christopher Leckie‡
ARC Special Research Centre for Ultra-Broadband Information Networks (CUBIN)
National ICT Australia
†Department of Electrical and Electronic Engineering
‡Department of Computer Science and Software Engineering
The University of Melbourne, Victoria 3010, Australia
Email: [email protected], [email protected]

Abstract

Internet worms have become a major threat to the Internet due to their ability to rapidly compromise large numbers of computers. In response to this threat, there is a growing demand for effective techniques to detect the presence of worms and to reduce the worms' spread. Furthermore, existing approaches for anomaly detection of new worms suffer from scalability problems. In this paper, we present an approach for detecting worms based on similar patterns of connection activity. We then investigate how to improve the computational efficiency of worm detection by presenting a Greedy algorithm, which minimizes the amount of traffic processing needed to detect worms, thus increasing the scalability of the system. Our evaluation shows that the Greedy algorithm not only achieved high detection accuracy and reduced the amount of processing time to detect worms, but also achieved reasonable worm traffic detection in the early stages of an outbreak.

Keywords: security, Internet worms, anomaly detection, network intrusion detection

1. Introduction

The Internet is persistently threatened by many types of attacks, such as viruses and worms. A worm is a self-propagating program that infects other hosts based on a known vulnerability in network hosts [1]. In contrast, a virus is a piece of code attached to another executable program, which requires human action to propagate. A major challenge in networking is how to detect new worms and viruses in the early stages of propagation in a computationally efficient manner. The impact of worms and viruses on the Internet includes delays due to congestion, extensive waste of network bandwidth, and corruption of users' computers and data. Furthermore, viruses and worms can carry software [2] that enables attackers to gain access to the personal information of users [3]. In addition, recent worms [4] are capable of launching distributed denial-of-service (DDoS) attacks against other hosts.

Recently, there have been several studies into how to model worm spread in networks [5] [6] [7] [8] based on classical epidemic models. These models have helped to understand the rate of propagation of different types of worms. However, the problem of detecting and reacting to worms remains unsolved.

0-7803-9087-3/05/$20.00 ©2005 IEEE


One approach for detecting known worms is signature-based detection [9] [3] [10]. However, these techniques are unable to detect 0-day attacks, i.e., the early outbreak of new attacks, due to the unavailability of signatures for new worms. An alternative approach is to use anomaly detection [2], which tries to detect abnormal patterns of behavior based on a model of normal network behavior. Anomaly detection techniques have the potential to detect new worms, but they are prone to producing false alarms. An open research problem is how to reduce the computational complexity of anomaly detection while still ensuring reasonable detection accuracy.

In this paper, we present an approach to detecting worms based on analyzing unusual patterns of similar connection activity in networks. In contrast to earlier approaches [2], our aim is to improve the computational efficiency of worm detection by presenting a Greedy connection-based anomaly detection algorithm. The key advantages of using our Greedy algorithm are that it minimizes the amount of traffic processing required to detect a worm, and provides a scalable solution under attack traffic loads. In addition, no prior knowledge of traffic patterns is needed by our algorithm.

There are two key contributions in this paper. First, we develop a Greedy algorithm that can detect abnormal connection patterns in a computationally efficient manner, and thus detect new types of worm attacks for which no signature is known. Second, we evaluate our Greedy algorithm using traffic provided by DARPA [11], and demonstrate that it reduces the amount of packet processing while maintaining detection accuracy in comparison to other anomaly detection schemes [2]. We also demonstrate that our algorithm provides reasonable detection time compared to other approaches.

This paper is organized as follows. We discuss related work in Section 2. In Section 3, we describe the problem of worm detection. We present the details of our Greedy algorithm in Section 4. Our evaluation and results are presented in Section 5. We summarize and conclude in Section 6.

2. Previous Work

During the past 20 years, thousands of different worms have been developed [12]. Some of these worms have caused huge disruption to global networks. The most notable worms include the Morris, Code Red and Code Red II, Nimda, Slapper, and Sapphire/Slammer worms, and more recently, SoBig.F, MSBlast, and Mydoom. Since the first worm was released in 1988 (the Morris worm), Internet worm detection has been a significant research problem. In this section, we summarize previous approaches to worm detection. Although there has been a considerable amount of research on worm modeling [5] [13] [6] [7] [8], our focus is on the relevant research in worm detection.

Intrusion Detection Systems (IDS) for detecting worms in networks can be classified into two general categories: signature-based detection and anomaly detection. Signature-based detection is based on defining malicious patterns that the system has to detect [2] [9]. It suffers from the problem that it requires that a signature of each attack be known.

In contrast, anomaly detection differs by constructing a profile of normal behaviors or activities on the network, and then looking for activities that do not fit the normal profile.

Session Three: Traffic Monitoring


Since not all abnormal activities in the network are suspicious, anomaly detection has the problem of raising false alarms when it encounters normal traffic that it has not seen before [2]. However, anomaly detection has the important advantage that it can be used to detect new worms for which there is no known signature.

A honey-pot is an entire network or server that is not meant to provide any service to legitimate users [14]. The only traffic that a honey-pot should receive is probe traffic or other malicious traffic. Attack signatures can then be obtained from captured traffic and used to block attacks on real systems. The disadvantage of a honey-pot is that it cannot detect suspicious traffic without receiving activity directed against it. Therefore, we need the ability to monitor connections between hosts, which is the basis for our model.

Moore et al. [15] discuss potential solutions for mitigating the threat of worms through the use of source address filtering and content-based filtering. They show that content-based filtering has a significant effect in slowing down the spread of worms.

Motivated by Moore's approach [15] of obtaining worm signatures for use in content-based detection, Kim and Karp [16] developed an approach called Autograph. Autograph automatically generates signatures for Internet worms without prior knowledge of the worm's payload. However, classifying flows as normal or suspicious, and extracting signatures of variable length, are time-consuming tasks. Other approaches [4] [17] monitor the scanning pattern of worms by analyzing the traffic on the link to the monitored network. However, if a worm is designed to target local hosts, the local network may be completely infected before the worm can be detected.

The Graphical Intrusion Detection System (GrIDS) detects co-ordinated worm behavior in large-scale networks by collecting data about the connections that have been made between the hosts in a network. GrIDS then aggregates this connection data into an activity graph, which reveals the structure of network activity [2] [4] [18]. After building the activity graph of host connections, it searches for predefined patterns of intrusion in order to detect possible worms. GrIDS requires data exchange between different hosts, which consumes a large amount of bandwidth. In addition, GrIDS does not consider connection attempts to non-existing services and non-existing hosts, which are a common byproduct of worm activity.

Another connection-based approach was developed by Toth and Kruegel [2] to improve the effectiveness of GrIDS. They showed that it is possible to use anomaly detection techniques to detect worms based on their abnormal connection patterns. We refer to their algorithm as the TK algorithm. If the anomaly score of traffic exceeds a trigger value, a worm is detected in the network. There are several limitations of the TK algorithm. It does not consider the time difference between outgoing connections and incoming connections to the same node. This can lead to erroneous associations between connections, which can result in false positives. In addition, their model suffers from scalability issues, because their algorithm analyzes all possible sequences of connections in the network. We are motivated to solve the problems faced by the TK algorithm in order to improve detection accuracy, limit the number of false alarms, and provide a scalable system.



3. Problem Definition

Our goal is to detect worm spread by monitoring the connections in the network. It is essential that we not only detect worm attacks with high accuracy and a low false positive rate, but also detect new worms quickly, before a large number of hosts have been infected. This raises two key challenges. The first challenge for detection systems is how to focus on suspicious traffic while ignoring normal traffic. This has the effect of reducing the number of false alarms generated. The second challenge is how to detect abnormal traffic patterns in a computationally efficient manner, especially under high traffic loads.

Our problem definition is based on the model of detecting worm traffic described by Toth and Kruegel [2], which we refer to as the TK algorithm. Our aim is to extend their approach by developing an efficient detection algorithm that uses a subset of the connections leading to a particular host, to reduce the amount of processing required while still maintaining detection accuracy.

Inputs – The input to our detection algorithm is the sequence of all TCP connections generated to and from all hosts within a monitored subnetwork. In keeping with the model in [2], each connection is defined as a 6-tuple (time, srcip, srcport, dstip, dstport, payload), where time is the start time of the connection, srcip and dstip are the source and destination IP addresses of the connection, srcport and dstport are the source and destination ports of the connection, and payload is a summary of the payload sent to the destination. After passing connections in this form to our model, our algorithm constructs possible sequences of connections that could have led to the incoming connections. These sequences of connections are referred to as chains. A chain is a sequence of connections {C1, C2, . . . , Cl} that must satisfy two conditions:

• the destination of connection Ci−1 is the source of connection Ci, where 1 < i ≤ l
• the time of connection Ci < the time of connection Ci+x, where 0 < x ≤ l − i
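As a concrete sketch, the two chain conditions above can be expressed as a simple validity check. The Conn structure and field names here are illustrative assumptions, not the paper's implementation (the paper's connections also carry srcport, dstport, and payload):

```python
from collections import namedtuple

# Illustrative connection record, trimmed to the fields the two conditions use.
Conn = namedtuple("Conn", ["time", "src", "dst"])

def is_valid_chain(chain):
    """Check the two chain conditions: linked endpoints and increasing times."""
    for i in range(1, len(chain)):
        if chain[i - 1].dst != chain[i].src:    # destination of C(i-1) must be source of Ci
            return False
        if chain[i - 1].time >= chain[i].time:  # connection times must strictly increase
            return False
    return True

chain = [Conn(1.0, "A", "B"), Conn(2.0, "B", "C"), Conn(3.0, "C", "D")]
print(is_valid_chain(chain))                  # True
print(is_valid_chain(list(reversed(chain))))  # False
```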

Our Greedy algorithm analyzes each chain to find similarities, such as repeated destination addresses and destination port numbers, between the outgoing connection from a node and all previous connections that end at that node, in order to detect abnormal behavior. In contrast to the TK algorithm, our Greedy algorithm avoids generating duplicate chains with similar anomaly score values. The advantage of this restriction is to reduce the number of chains that are considered as normal behavior.

Outputs – Our algorithm raises an alarm if similar activities are noticed; otherwise, no output is generated, indicating normal traffic.

Assumptions – In this paper, we make the following assumptions in our analysis. Our first assumption is that we consider only TCP connections, which are most commonly used by worms. Although we focus on TCP traffic, our approach can be used to analyze UDP traffic by treating each UDP packet as a connection. For efficiency, repeated connections between endpoints within a time window can be discarded.

We assume that we can extract at least part of the payload in the TCP connections. If this is not possible, our algorithm can still detect similarities based on destination port numbers. Furthermore, we assume that the system monitors all internal and outbound network traffic.

We assume that we can monitor the connections that are being made within a subnetwork [2]. Our approach is suited to monitoring connections that are collected from a switching hub. Given the potentially large number of connections that can occur within a subnetwork, it is important that we ignore connections that frequently appear under normal conditions. In order to ignore these normal connections, we maintain an Ignore List, which is described in Section 5.2.

Figure 1: Basic Model of the Greedy Algorithm

Our scheme is designed to detect fast-spreading worms, which are by far the most common form of worm attack. Our timing windows can be easily adjusted, either deterministically or randomly, making it difficult for attackers to evade the scheme by using a slow worm. Indeed, a key advantage of our approach is that it would force attackers to slow the propagation rate of their worm.

4. A Greedy Algorithm for Worm Detection

Worm propagation results in a large number of similar TCP connections in a short period of time [2]. In order to detect worm traffic, the detection system needs to examine and compare connections within a network. This type of pairwise comparison of connections for similarity is very computationally expensive. As a result, we require an algorithm that is able to detect a worm by selectively examining important subsets of connections. We have developed a Greedy search algorithm to achieve these goals.

4.1 Basic Model

The overall architecture of our detection model is shown in Figure 1. We follow the same basic architecture as used by the TK algorithm [2]. There are three main steps in this architecture: (1) filtering well-known traffic, (2) constructing chains of connections, and (3) evaluating an anomaly score for each chain. Let us now describe each of these steps, along with our key contribution in each step.

The first step is to filter out well-known traffic (i.e., well-known services) using an Ignore List. Although an Ignore List was proposed in [2], little detail was provided. In Section 5, we describe how to generate and use such an Ignore List.

The second step is creating chains of connections. Every time a new connection is made, we need to update the list of possible chains that terminate at the destination of the new connection, based on the existing chains that terminate at the source of the new connection. Whereas the TK algorithm looks at all possible chains, our Greedy algorithm considers the time difference between the new connection and the last connection of the chain. If the time difference is too long (i.e., longer than a threshold Tw), the Greedy algorithm does not add the new connection to that chain.

For example, imagine that we have a network consisting of three nodes in series (A, B, and C). A chain is produced at node C if there is a connection Ct1 from node A to node B established at time t1 and, after that, a new connection Ct2 from node B to node C established at time t2. The Greedy algorithm first checks the time difference between the two connections. If (t2 − t1) < Tw, the Greedy algorithm creates a chain of both connections Ct1 and Ct2 at node C. However, if a new connection is established from node A to node B at time t3, a new chain is not created at node C, because the new connection from A to B occurred after the connection from B to C. We need to ensure that the causality of connections in each chain is consistent.
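The timing-window check in this example can be sketched as follows. The dictionary representation, field names, and the value of Tw are illustrative assumptions, not the authors' implementation:

```python
Tw = 10.0  # maximum timing window in seconds (an assumed value)

def extend_chain(chain, new_conn):
    """Extend a chain with new_conn only if it is causally and temporally valid."""
    last = chain[-1]
    if last["dst"] != new_conn["src"]:
        return None  # endpoints do not link up
    if not (0 <= new_conn["time"] - last["time"] < Tw):
        return None  # violates causality, or the gap exceeds Tw
    return chain + [new_conn]

c_t1 = {"src": "A", "dst": "B", "time": 1.0}
c_t2 = {"src": "B", "dst": "C", "time": 5.0}
c_t3 = {"src": "A", "dst": "B", "time": 7.0}

chain_at_C = extend_chain([c_t1], c_t2)        # valid: A->B then B->C within Tw
print(chain_at_C is not None)                  # True
# c_t3 (A->B, established after B->C) cannot extend the chain ending at C:
print(extend_chain(chain_at_C, c_t3) is None)  # True
```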

The third step is analyzing each chain and updating its anomaly score (see Algorithm 1 for a formal description of our algorithm). Before analyzing a chain, the Greedy algorithm checks the length of the chain. If the length is too long (i.e., longer than a threshold Lc), the Greedy algorithm only processes the last Lc connections of the chain. Then, for each chain, the Greedy algorithm identifies suspicious traffic by matching the destination port numbers and the packet contents between the outgoing connection and the last Lc connections in that chain. If the last connection matches previous connections in the same chain, then potential worm traffic is detected. Moreover, the Greedy algorithm restricts the number of incoming connections to the same node that have similar anomaly scores. If this number exceeds a predefined value nMatches, further connections are automatically ignored. The anomaly score is updated every time there is a match. The anomaly score is defined as follows [2]:

anomaly score = repeatcount*R + nehost*H + neservice*S (1)

where repeatcount is the number of similar connections in each chain, and nehost and neservice are the number of connections attempted to non-existing hosts and non-existing services, respectively. The three parameters, R, H, and S, are used as configurable factors to weight the anomaly score. We have used the parameter settings from [2]: R = 0.3, H = 0.2 and S = 0.05. The final output is an alarm if the anomaly score exceeds some predefined threshold, indicating the presence of a worm in the network.
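As a minimal sketch, Equation (1) with the weights from [2] can be computed as follows (the function name and the example counts are ours):

```python
# Weights from [2]: R weights repeated connections, H connections to
# non-existing hosts, S connections to non-existing services.
R, H, S = 0.3, 0.2, 0.05

def anomaly_score(repeatcount, nehost, neservice):
    """Anomaly score of a chain, per Equation (1)."""
    return repeatcount * R + nehost * H + neservice * S

# Example: 3 repeated similar connections, 1 connection to a non-existing
# host and 2 to non-existing services push the score past a threshold of 1.
score = anomaly_score(3, 1, 2)
print(round(score, 2))  # 1.2
print(score > 1)        # True
```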

4.2 Expected benefits of Greedy Algorithm

Our Greedy algorithm includes two important improvements over the TK algorithm [2]: the limit Lc on the length of a chain, and the time window Tw between the outgoing connection from a node and all incoming connections to that node. The benefit of using Lc is that it prevents the algorithm from processing large numbers of connections. Since most worms spread rapidly, there is little value in analyzing connections that were established a long time ago. In addition, processing long chains is time consuming and increases complexity. Using Tw prevents connections that were established long ago from becoming part of the chain. Thus, we expect to reduce the false alarms generated by the algorithm. In addition, for each chain, we use nMatches, which allows chains with higher anomaly scores to be part of the detection process by ignoring chains with low anomaly scores.



A formal description of our Greedy algorithm is shown in Algorithm 1 below. The details of Steps 1 and 2 of this algorithm are shown below.

Algorithm 1: Greedy Anomaly Detection Algorithm

Initialize parameters:
• const nMatches {maximum number of chains that can terminate at each node}
• const Tw {maximum timing window}
• const Lc {maximum chain length}
• AnomalyScore = 0 {total anomaly score}

Step 1. Accept new connection and build chain: CreateChain(connection)
• Check the time difference between the new connection and the last connection of the chain

Step 2. Calculate anomaly score for each chain: CalculateAnomaly(chain)
• Check the length of each chain
• Prune chains that have low anomaly scores

Step 3. Report possible worm traffic
• if AnomalyScore > Threshold then possible worm traffic is detected

Go to Step 1.

Step 1: Create Chain

accept new connection Ci
Chi = CreateChain(Ci) {
    initialize Chi.connections = {}
    initialize j = i − 1
    while j > 0 do
        if Ci.src == Cj.dst && (Ci.time − Cj.time) < Tw then
            Chi.connections = Chj.connections ∪ Chi.connections
        end if
        j−−
    end while
    Chi.connections = append(Ci, Chi.connections)    {append Ci to chain Chi}
    return Chi.connections
}
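The CreateChain step above can be rendered in Python roughly as follows. This is a hypothetical sketch, not the authors' code: connections are dicts with src, dst and time fields, a single merged chain is kept per connection for simplicity, and the value of Tw is assumed:

```python
Tw = 10.0  # maximum timing window (an assumed value)

def create_chain(connections, chains, i):
    """Build the chain terminating at connection i from earlier chains (Step 1)."""
    ci = connections[i]
    chain = []
    for j in range(i - 1, -1, -1):  # scan earlier connections, newest first
        cj = connections[j]
        if ci["src"] == cj["dst"] and ci["time"] - cj["time"] < Tw:
            chain = chains[j] + chain  # merge in the chain ending at cj's destination
    return chain + [ci]  # append the new connection itself

conns = [
    {"src": "A", "dst": "B", "time": 1.0},
    {"src": "B", "dst": "C", "time": 2.0},
]
chains = []
for i in range(len(conns)):
    chains.append(create_chain(conns, chains, i))
print(len(chains[1]))  # 2: the chain {A->B, B->C} terminates at C
```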

Step 2: Calculate Anomaly

AnomalyValuei = CalculateAnomaly(Chi) {
    initialize Chi.AS = 0    {anomaly score for chain Chi}
    initialize Largest = 0; match = 0; ChainLen = 0
    initialize len = Chi.length() − 1    {index of the last connection in the chain}
    initialize LastElement = Chi[len]    {last connection in the chain}
    while len ≥ 0 && ChainLen < Lc do
        −−len
        if LastElement.src != Chi[len].dst then
            if LastElement.port == Chi[len].port && LastElement.content == Chi[len].content then
                Chi.AS++
            else
                if match < nMatches then
                    match++
                    if Chain[Chi[len]].AS ≥ Largest then
                        Largest = Chain[Chi[len]].AS
                        if LastElement.port == Chi[len].port && LastElement.content == Chi[len].content then
                            Chi.AS++
                        end if
                    else
                        drop Chain[Chi[len]]
                        update Chi = CreateChain(Ch ∀i ≠ len)
                        update len = len − x    {x is the number of connections deleted from the chain}
                    end if
                else
                    drop Chain[Chi[len]]
                    update Chi = CreateChain(Ch ∀i ≠ len)
                    update len = len − x    {x is the number of connections deleted from the chain}
                end if
            end if
        end if
        ChainLen++
    end while
    return max(Chi.AS, Largest)
}

5. Evaluation

The goal of our evaluation is to compare the accuracy and efficiency of the TK and Greedy algorithms. Accuracy represents the ability of the algorithm to distinguish normal traffic from attack traffic, while efficiency represents the computation time needed to process the network traffic while still achieving reasonable detection accuracy. We also examined the ability of each algorithm to detect worm traffic in the early stage of an outbreak. Our framework for evaluation consists of three parts: (1) Traffic Generation, (2) Ignore List Construction, and (3) Worm Detector.

5.1 Traffic Generation

In this stage, a mixture of both normal and attack traffic is used for both algorithms. The MIT Lincoln Lab and DARPA have collected data sets for evaluating Intrusion Detection Systems [11]. The DARPA 1999 data sets were used as the input traffic in our evaluation. These data sets are needed to evaluate the accuracy and efficiency of our algorithm, since they contain attack-free traffic as well as attack traffic. Unfortunately, the DARPA 1999 data sets do not include worm traffic. Because we did not have traces containing actual worm traffic, we generated worm traffic using a worm simulator. Our worm simulator starts with an infected host that randomly accesses other vulnerable hosts and tries to infect them. Once an infection is made, all infected hosts scan for other vulnerable uninfected hosts. Our worm traffic is simulated between the hosts that appear in the inside network in the DARPA data sets, and was randomly injected into the DARPA 1999 traces to produce the final test traffic. Although the DARPA traces are not always ideal, they are a widely accepted test data set, and the use of this public-domain trace helps provide a benchmark for testing.
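A minimal sketch of such a random-scanning simulator is shown below. The host addresses, step count, fixed seed, and the simplifying assumption that every probed host is vulnerable are ours, not details of the authors' tool:

```python
import random

def simulate_worm(hosts, seed=0, steps=5):
    """Each time step, every currently infected host probes one random host."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    infected = {hosts[0]}              # start from a single infected host
    trace = []                         # (time, src, dst) connection records
    for t in range(steps):
        for src in sorted(infected):   # snapshot of currently infected hosts
            dst = rng.choice(hosts)
            trace.append((t, src, dst))
            infected.add(dst)          # assume every probed host is vulnerable
    return infected, trace

hosts = ["10.0.0.%d" % i for i in range(1, 21)]
infected, trace = simulate_worm(hosts)
print(len(trace) >= 5)  # True: at least one probe per time step
```

A trace produced this way can then be interleaved by timestamp with background traffic to form a combined test input.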

5.2 Ignore List

The importance of the Ignore List comes from its ability to list well-known hosts. Listing well-known hosts prevents them from being passed to the detector, thereby eliminating many chains that would otherwise have to be considered as part of a worm. Although the paper by Toth and Kruegel [2] discussed the importance of the Ignore List, no procedure was given for its generation. Consequently, we needed to develop the procedure described below.

Our approach is suited to monitoring connections that are collected from a switching hub. A factor in the practicality of our approach is the use of an Ignore List. The Ignore List contains the most frequent host and port combinations that appear in normal traffic, and is used to ignore connections to well-known hosts and ports. Therefore, only connections that do not match the Ignore List need to be logged. One method for identifying well-known traffic is to generate a frequency table of the most frequently seen connections. The frequency table counts the number of times each destination host and destination port have been accessed. This provides statistics about hosts as well as the services running on them.

After generating the frequency table, a threshold is set such that if a particular host has a frequency value larger than the threshold, its destination IP address and port number are added to the Ignore List. As a result, any connection whose destination address and port number match an element in the Ignore List is considered normal.
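The frequency-table procedure above can be sketched as follows; the threshold value and data layout are illustrative assumptions:

```python
from collections import Counter

def build_ignore_list(connections, threshold=3):
    """Count (dstip, dstport) frequencies; pairs above threshold are 'well known'."""
    freq = Counter((c["dstip"], c["dstport"]) for c in connections)
    return {pair for pair, count in freq.items() if count > threshold}

conns = [{"dstip": "10.0.0.5", "dstport": 80}] * 5 + \
        [{"dstip": "10.0.0.9", "dstport": 4444}]
ignore = build_ignore_list(conns)
print(("10.0.0.5", 80) in ignore)    # True: a frequently accessed server is ignored
print(("10.0.0.9", 4444) in ignore)  # False: a rare connection is passed to the detector
```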



5.3 List of Tests Conducted

We have investigated the accuracy of the TK and Greedy algorithms using normal traffic, represented by the DARPA 1999 traces [11], and simulated worm attack traffic.

We also investigated the efficiency of the TK and Greedy algorithms in terms of time complexity. The TK algorithm potentially suffers from scalability problems due to the need to generate and process large numbers of chains in order to detect worm traffic. Constructing all possible chains consumes considerable computation time. Our aim is to investigate whether the Greedy algorithm can significantly reduce the amount of connection processing while maintaining detection accuracy.

The last investigation tests the sensitivity of each algorithm. Our aim is to test how many worm connections need to be seen by each algorithm before the worm is detected.

5.4 Results

Detection Accuracy

The first set of results compares the accuracy of the TK and Greedy algorithms in terms of their ability to distinguish between normal traffic and worm traffic. We used the first 50 connections in each packet trace to learn the Ignore List, and applied the remaining connections to the worm detector stage after filtering by the Ignore List. The results are shown in Figure 2, where a threshold of 1 was applied to the anomaly score in order to detect a worm. The x-axis represents the packet trace used to test each algorithm, while the y-axis represents the value of the anomaly score on that packet trace. We found that both algorithms achieved 100% detection accuracy for normal traffic and attack traffic, as shown in Figures 2(a) and 2(b). We notice from Figure 2(a) that if the trigger threshold were set to 0.5, the TK algorithm would produce one false positive. In contrast, the Greedy algorithm has no false alarms. This is because the TK algorithm either found a coincidental match in a long chain, or did not take account of the time difference between the outgoing connection from a node and the incoming connections to that node. We demonstrated that the Greedy algorithm is able to reduce the number of false positive alarms on the test data compared to the TK algorithm.

Computational Efficiency

The second part of our evaluation concerns the efficiency of both the TK and Greedy algorithms. Since computation time is a major issue in implementing both algorithms, we evaluated both in terms of the number of chains that are generated for each new connection. The larger the number of chains in the algorithm, the longer the algorithm takes to search for patterns of similar connections. Using the Greedy algorithm, we were able to significantly reduce the computation time required for detection, while still maintaining accuracy. To calculate the number of chains created every time a new connection occurs, we have developed the following equation:

chain[dest] = max(chain[src] + chain[dest], chain[dest] + 1)    (2)

where dest is the destination host, and src is the source host. Since the TK algorithm is so computationally intensive, it was an important aim of the Greedy algorithm to reduce computation time by introducing the timing window Tw and the chain length Lc.
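Equation (2) can be read as a per-host update applied on each new connection. The following is a minimal Python sketch of that reading; the data structures and function name are illustrative, not the authors' implementation:

```python
from collections import defaultdict

def count_chains(connections):
    """Estimate the number of connection chains per host using Eq. (2).

    `connections` is a sequence of (src, dest) pairs processed in arrival
    order. chain[h] holds the number of chains currently ending at host h.
    """
    chain = defaultdict(int)
    for src, dest in connections:
        # Eq. (2): the new connection src -> dest either extends every chain
        # ending at src (added to the chains already ending at dest), or it
        # simply contributes one new single-hop chain ending at dest.
        chain[dest] = max(chain[src] + chain[dest], chain[dest] + 1)
    # Report only hosts with at least one chain ending at them.
    return {h: n for h, n in chain.items() if n}

# Two sources converge on C, then C connects onward: two chains end at D.
print(count_chains([("A", "C"), ("B", "C"), ("C", "D")]))
```

Note how the chain count at a host compounds whenever an upstream host already terminates several chains, which is why the count grows so quickly under worm-like traffic.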

We have focused on an empirical evaluation of complexity. An analytical complexity analysis requires a model for calculating the number of chains in a random graph; this is a challenging combinatorial analysis problem that is beyond the scope of this paper.

Session Three: Traffic Monitoring, p. 142. [2005 9th IFIP/IEEE International Symposium on Integrated Network Management (IM 2005), Nice, France, 15-19 May 2005]

(a) The TK algorithm (b) The Greedy algorithm

Figure 2: Accuracy for normal and attack traffic

(a) normal traffic (b) attack traffic

Figure 3: Efficiency of the TK algorithm with and without the time window Tw

We have evaluated both algorithms using normal and attack traffic, in order to calculate the total number of chains that need to be generated as the number of connections increases. Figure 3(a) shows how the number of chains generated by the TK algorithm grows as the number of connections increases under normal traffic conditions. In contrast, Figure 3(b) shows the number of chains generated by the TK algorithm for attack traffic. The original TK algorithm corresponds to the curve with Tw = ∞. Note that the TK algorithm generates many more chains for attack traffic than for normal traffic, as shown in Figure 3(b). We observed that the TK algorithm suffers from scalability problems for both normal and attack traffic, since it generates a large number of chains every time a new connection occurs.

Figure 3 also shows the effect of introducing the time window Tw into the TK algorithm, which is the first contribution of the Greedy algorithm.

(a) normal traffic (b) attack traffic

Figure 4: Efficiency of the Greedy algorithm

From Figures 3(a) and 3(b), we observe that as we decrease Tw and Lc, the number of chains being processed decreases significantly as well. Therefore, the computational time required to generate chains is significantly reduced by the Tw threshold on the maximum time between connections in a chain. However, the number of chains generated is still extremely large. We therefore introduce the complete Greedy algorithm, which is designed to provide a scalable solution to the problem of chain generation.
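A minimal sketch of how the Tw and Lc cutoffs can bound chain growth is shown below. The per-chain bookkeeping (a list of timestamp/length pairs per host) and the function name are illustrative assumptions, not the authors' implementation:

```python
from collections import defaultdict

# Illustrative thresholds (the paper's evaluation uses Tw = 100 and Lc = 100).
TW = 100   # maximum allowed time gap between consecutive connections in a chain
LC = 100   # maximum chain length before a chain is discarded

def extend_chains(chains, src, dest, now):
    """Extend chains ending at `src` to `dest`, applying the Tw and Lc cutoffs.

    `chains` maps host -> list of (last_time, length) tuples, one per chain
    ending at that host. Chains that are too old (gap > TW) or too long
    (length >= LC) are pruned instead of extended, which bounds the number
    of chains kept per connection.
    """
    kept = [(t, n) for (t, n) in chains[src] if now - t <= TW and n < LC]
    # Every surviving chain is extended by one hop; the connection itself
    # also starts a new single-hop chain ending at dest.
    chains[dest].extend((now, n + 1) for (t, n) in kept)
    chains[dest].append((now, 1))
    return chains

chains = defaultdict(list)
extend_chains(chains, "A", "B", now=10)
extend_chains(chains, "B", "C", now=50)    # gap of 40 <= TW: the A->B chain extends
extend_chains(chains, "C", "D", now=500)   # gap of 450 > TW: only a fresh chain starts
```

The pruning step is what distinguishes this from the unbounded TK-style accumulation: stale and over-long chains are dropped rather than carried forward.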

Figure 4(a) shows the number of chains generated by the Greedy algorithm under normal traffic conditions using Tw = 100, Lc = 100, and nMatches = 4. In contrast, Figure 4(b) shows the number of chains generated by the Greedy algorithm for attack traffic, using the same values of Tw, Lc, and nMatches as for normal traffic.
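As a rough check on the scale of the improvement, the approximate chain counts read from Figures 3 and 4 can be compared directly:

```python
# Approximate chain counts read from Figures 3 and 4 for the original TK
# algorithm versus the Greedy algorithm (Tw = 100, Lc = 100, nMatches = 4).
counts = {
    "normal": {"TK": 15_000, "Greedy": 800},
    "attack": {"TK": 600_000, "Greedy": 8_000},
}

for traffic, c in counts.items():
    factor = c["TK"] / c["Greedy"]
    print(f"{traffic}: about {factor:.0f}x fewer chains with Greedy")
```

That is, the Greedy algorithm processes roughly an order of magnitude fewer chains on normal traffic and nearly two orders of magnitude fewer on attack traffic.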

Figure 4 also shows the effect of the time window Tw and chain length Lc on the Greedy algorithm. From Figures 4(a) and 4(b), we observe that by introducing Tw and Lc, we dramatically decrease the number of chains being processed. In comparison to Figure 3, this represents a significant decrease in the number of chains. For instance, on normal traffic the Greedy algorithm produces only 800 chains, compared to 15,000 chains produced by the original TK algorithm. Similarly, under attack traffic the Greedy algorithm produces only 8,000 chains, compared to 600,000 chains produced by the original TK algorithm. Therefore, the computational time required to generate chains is dramatically reduced by introducing the Tw and Lc thresholds, while detection accuracy is maintained.

Early Worm Detection

While the Greedy algorithm is able to increase computational efficiency while maintaining detection accuracy, we also need to determine whether it can achieve reasonable detection accuracy in the early stages of worm infection. To test this, we measured the number of connections that needed to be seen by each algorithm before it was able to detect the worm. Figure 5 shows this detection delay for the Greedy algorithm with nMatches = 4 in comparison to the TK algorithm.

Figure 5: Detection time for both the TK and Greedy algorithms

From Figure 5, we note that both algorithms were able to detect the presence of the worm using the same number of connections on days D3 and D4. On the other hand, the TK algorithm was much faster in detecting the presence of the worm on day D1. If we restrict the number of connections that can access the same destination host to 1 (i.e., nMatches = 1), we find that the TK algorithm performs better than the Greedy algorithm in detecting the presence of the worm. As we increase the value of nMatches, the detection delay of the Greedy algorithm approaches that of the TK algorithm.

6. Conclusion and Future Work

In this paper, we have presented an algorithm that can detect Internet worms by analyzing connection patterns. The algorithm uses anomaly detection methods to identify suspicious connections, based on similarities in the destination port number and the packet payload. The main contribution of this paper is an algorithm that is able to: (1) generate fewer false alarms; (2) provide higher efficiency by minimizing the amount of traffic processing needed to detect the worm, while maintaining detection accuracy; and (3) detect worms early in an outbreak.

Both the TK and Greedy algorithms were able to achieve high detection accuracy. In contrast to the TK algorithm, our Greedy algorithm used the timing window between the incoming connections to a node and the outgoing connections from the same node to reduce false positives, since worms normally spread through the network in a very short time period. In addition, we used the chain length as a factor to limit the number of chains generated and to discard old connections. Overall, the Greedy algorithm was able to reduce the computational time while maintaining the detection accuracy. The results are promising in terms of detecting worms in the network.

For future work, we will evaluate the accuracy and robustness of the Greedy algorithm when only samples of network traffic are taken. In addition, peer-to-peer traffic will be evaluated rather than depending only on worm traffic. Peer-to-peer traffic tends to differ from worm traffic in that it has a smaller branching factor and does not exhibit the random probing shown by worms. We expect that P2P traffic will follow more persistent patterns, which can be learned by the Ignore List.


Acknowledgment

We thank the MIT Lincoln Laboratory and DARPA for providing the data traces that we used for our evaluation. This research is supported in part by the ARC Special Research Centre for Ultra-Broadband Information Networks (CUBIN), an affiliated program of National ICT Australia (NICTA). The authors would like to thank Etisalat College of Engineering and the Emirates Telecommunication Corporation (Etisalat), United Arab Emirates, for providing financial support for this work.

References

[1] E. Spafford. The internet worm program: An analysis. Technical Report CSD-TR-823, Purdue University, November 1988.

[2] T. Toth and C. Kruegel. Connection-history based anomaly detection. In IEEE Workshop on Information Assurance and Security, 2002.

[3] N. Banglamung. Combination of misuse and anomaly network intrusion. Kaleton Internet Research Paper, March 2002.

[4] F. Buchholz et al. Digging for worms, fishing for answers. In 18th Annual Computer Security Applications Conference (ACSAC'02), 2002.

[5] V. Paxson, S. Staniford, and N. Weaver. How to 0wn the internet in your spare time. In 11th USENIX Security Symposium (Security'02), 2002.

[6] J. O. Kephart and S. R. White. Directed graph epidemiological models of computer viruses. In 1991 IEEE Computer Society Symposium on Research in Security and Privacy, 1991.

[7] Z. Chen et al. Modeling the spread of active worms. In IEEE INFOCOM, 2003.

[8] D. Towsley, C. C. Zou, and W. Gong. Code red worm propagation modeling and analysis. In 9th ACM Conference on Computer and Communications Security (CCS), 2002.

[9] S. Kumar and E. H. Spafford. A pattern matching model for misuse intrusion detection. In 17th National Computer Security Conference, 1994.

[10] J. W. Lockwood et al. Application of hardware accelerated extensible network nodes for internet worm and virus protection. In International Working Conference on Active Networks (IWAN), 2003.

[11] D. J. Fried, E. Tran, S. Boswell, M. A. Zissman, J. W. Haines, and R. P. Lippmann. 1999 DARPA intrusion detection system evaluation: Design and procedures. MIT Lincoln Laboratory Technical Report, 2001.

[12] N. Joukov and T. Chiueh. Internet worms as internet-wide threat. Technical Report, Department of Computer Science, Stony Brook University, 2003.

[13] N. Weaver. Potential strategies for high speed active worms: A worst case analysis. Technical report, UC Berkeley, March 2002.

[14] C. C. Zou et al. Monitoring and early warning for internet worms. In 10th ACM Conference on Computer and Communications Security (CCS'03), 2003.

[15] D. Moore et al. Internet quarantine: Requirements for containing self-propagating code. In IEEE INFOCOM, 2003.

[16] H. Kim and B. Karp. Autograph: Toward automated, distributed worm signature detection. In 13th USENIX Security Symposium, 2004.

[17] X. Chen and J. Heidemann. Detecting early worm propagation through packet matching. Technical Report ISI-TR-2004-585, University of Southern California, 2004.

[18] S. Staniford-Chen et al. GrIDS: A graph-based intrusion detection system for large networks. In 19th National Information Systems Security Conference, 1996.
