Detection of application layer ddos attack using hidden semi markov model (2009) (synopsis)

Detection of Application Layer DDOS Attack

using Hidden Semi Markov Model

(Synopsis)

Abstract:

The recent tide of Distributed Denial of Service attacks

against high-profile web sites demonstrates how damaging

the DDoS attacks are and how defenseless the Internet is

under such attacks. The services of these web sites were

unavailable for hours or even days as a result of the attacks.

In such an attack, the adversary simultaneously sends a large

volume of traffic to a victim host or network. The victim is

overwhelmed by so much traffic that it can provide little or

no service to its legitimate clients. Because burst traffic and high

volume are common characteristics of both App-DDoS attacks

and flash crowds, it is not easy for current techniques to

distinguish them merely by statistical characteristics of

traffic. Therefore, App-DDoS attacks may be stealthier and

more dangerous for the popular Websites than the general

Net-DDoS attacks when they mimic the normal flash crowd.

This project proposes a scheme to capture the spatial-

temporal patterns of a normal flash crowd event and to

implement App-DDoS attack detection. Since the traffic

characteristics of low layers are not enough to distinguish

the App-DDoS attacks from the normal flash crowd event,

the objective of this project is to find an effective method to

identify whether the surge in traffic is caused by App-DDoS

attackers or by normal Web surfers. This project defines the

Access Matrix (AM) to capture spatial-temporal patterns of

a normal flash crowd and to monitor App-DDoS attacks during

a flash crowd event. A hidden semi-Markov model is used to

describe the dynamics of AM and to achieve a numerical and

automatic detection. Principal component analysis and

independent component analysis are used to handle the

multidimensional data for the hidden semi-Markov model, and

finally the monitoring architecture is validated with real flash

crowd traffic.

Introduction

Any attack on the Internet today can be highly

devastating. Distributed Denial of Service (DDoS) attacks are

among the most malicious Internet attacks; they overwhelm

a victim system with data so that the victim’s response

is slowed or stopped entirely. There have been many

instances where DDoS attacks have caused damages worth

billions of dollars. Defending against DDoS attacks has hence

become a major priority in the Internet community. The

attacker’s objective is to interrupt or reduce the quality of

service experienced by legitimate users. Many attacks have

innocent counterparts (e.g., someone sends me a very large

e-mail attachment, which blocks my access to other

messages).

Basic Concepts:

Flash crowd: A sudden, large surge in traffic to a

particular Web site.

Denial of Service (DoS): An explicit attempt to prevent

legitimate users of a service from using that service.

Attack Types:

1) Bandwidth consumption

i) Attackers have more bandwidth than the victim, e.g. a

T3 (45 Mbps) attacks a T1 (1.544 Mbps).

ii) Attackers amplify their bandwidth by engaging other

computers to attack a victim with higher bandwidth,

e.g. 100 hosts at 56 Kbps each attack a T1.

2) Resource starvation: consumes system resources such as

CPU, memory, and disk space on the victim machine through

flooding.

Smurf, Fraggle, SYN flood: In Smurf and Fraggle attacks, the

attacker sends a sustained stream of packets to the

broadcast address of the amplifying network, with the

source address forged to read the victim’s IP address.

Since the traffic is sent to the broadcast address, all

hosts in the amplifying LAN answer to the victim’s IP

address. In a SYN flood, if the attacker sends a few SYN

packets every 10 seconds, the victim never clears its

connection queue and stops responding.
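
The bandwidth example in case ii) can be checked with simple arithmetic. The sketch below assumes each of the 100 attacking hosts fully uses a 56 Kbps link and ignores protocol overhead; the variable names are illustrative only.

```python
# Aggregate bandwidth of many slow attackers versus one T1 victim link.
T1_KBPS = 1544        # T1 capacity (1.544 Mbps), as stated in the text
HOST_KBPS = 56        # one attacking host on a 56 Kbps link
N_ATTACKERS = 100     # number of attacking hosts in the example

aggregate_kbps = N_ATTACKERS * HOST_KBPS       # total offered attack traffic
oversubscription = aggregate_kbps / T1_KBPS    # multiple of the T1 capacity

print(f"aggregate attack traffic: {aggregate_kbps / 1000} Mbps")
print(f"T1 capacity exceeded by a factor of {oversubscription:.2f}")
```

Under these assumptions the attackers offer about 5.6 Mbps in total, several times what the T1 link can carry, even though no single attacker comes close to the victim's bandwidth.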

Hidden semi Markov Model:

We apply the hidden semi-Markov model (HSMM)

to characterize legitimate request patterns to a Web

server and to detect DDoS (distributed denial of

service) attacks on it. Measurements of real workload

often indicate that a significant amount of variability is

present in the traffic observed over a wide range of

time scales, exhibiting self-similar or long-range

dependent characteristics. Major advantages of using an

HSMM are its efficiency in estimating the model

parameters to account for an observed sequence and

the ability of the estimated parameters to capture various

statistical properties of the workload, including self-

similarity, long-range and short-range dependence.

Therefore, use of this HSMM is effective in better

understanding the nature of Web workload and in

detecting the anomalous behavior that a DDoS attack

may present.
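
To make the idea of scoring a request sequence with an HSMM concrete, the following is a minimal sketch of the forward algorithm for an explicit-duration HSMM with discrete observations. It only evaluates the likelihood of a sequence under given parameters and does not cover parameter estimation. The function names, the toy parameters, and the simplification that the last state visit ends exactly at the final observation are assumptions made for illustration, not details taken from the project.

```python
import math

def hsmm_likelihood(obs, pi, trans, dur, emit):
    """Likelihood of `obs` under an explicit-duration HSMM.

    pi[j]        initial probability of state j
    trans[i][j]  transition probability i -> j (self-transitions are skipped)
    dur[j][d-1]  probability that a visit to state j lasts d steps
    emit[j][o]   probability that state j emits symbol o
    Assumes the last state visit ends exactly at the final observation.
    """
    n_states, max_dur, T = len(pi), len(dur[0]), len(obs)
    # alpha[t][j]: probability of o_1..o_t with a visit to state j ending at t
    alpha = [[0.0] * n_states for _ in range(T + 1)]
    for t in range(1, T + 1):
        for j in range(n_states):
            total, seg = 0.0, 1.0
            for d in range(1, min(max_dur, t) + 1):
                seg *= emit[j][obs[t - d]]   # extend the visit one step back
                start = t - d                # time at which the previous visit ended
                if start == 0:
                    enter = pi[j]
                else:
                    enter = sum(alpha[start][i] * trans[i][j]
                                for i in range(n_states) if i != j)
                total += enter * dur[j][d - 1] * seg
            alpha[t][j] = total
    return sum(alpha[T])

def avg_log_likelihood(obs, pi, trans, dur, emit):
    """Per-observation log-likelihood; lower values are more anomalous."""
    p = hsmm_likelihood(obs, pi, trans, dur, emit)
    return math.log(p) / len(obs) if p > 0 else float("-inf")

# Toy model: state 0 emits symbol 0 for 1 or 2 steps,
# state 1 emits symbol 1 for exactly 1 step.
PI = [1.0, 0.0]
TRANS = [[0.0, 1.0], [1.0, 0.0]]
DUR = [[0.5, 0.5], [1.0, 0.0]]
EMIT = [[1.0, 0.0], [0.0, 1.0]]

normal = avg_log_likelihood([0, 0, 1], PI, TRANS, DUR, EMIT)
odd = avg_log_likelihood([1, 1], PI, TRANS, DUR, EMIT)  # impossible under the toy model
print(normal, odd)
```

A detector built this way would compare the per-observation log-likelihood of a client's sequence with a threshold learned from normal traffic. For long sequences the raw probabilities underflow, so a practical version would use scaling or log-space arithmetic.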

Existing System:

At present, most systems are vulnerable to DoS attacks.

DoS attacks are of particular interest and concern to the

Internet community because they seek to render target

systems inoperable and/or target networks inaccessible.

"Traditional" DoS attacks, however, typically generate a

large amount of traffic from a given host or subnet and it is

possible for a site to detect such an attack in progress and

defend itself. Distributed DoS attacks are a much

more nefarious extension of DoS attacks because they are

designed as a coordinated attack from many sources

simultaneously against one or more targets. Some attack

detection mechanisms are as follows:

1) Signature detection:

Signature detection (also known as misuse detection)

looks for patterns that signal well-known

attacks.

2) Anomaly detection:

Anomaly detection identifies behavior that is out of

the ordinary.

PHAD (packet header anomaly detector):

PHAD extends the four attributes normally used in

network anomaly detection systems (source and destination

IP addresses, source and destination port numbers) with

further packet header fields. Transport header (TCP, UDP)

fields are tested as appropriate for each

protocol. In testing, we discovered that many attacks could

be detected because of unusual values in these fields. In

addition to IP address anomalies, we found that some

attacks generate unusually small packet sizes or unusual

combinations of TCP flags (e.g. urgent data, missing

acknowledgements, reserved flags).
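
The idea of flagging unusual header values can be illustrated with a deliberately simplified sketch: record every value seen for each header field in attack-free traffic, then report the fields of a new packet whose values were never seen. This is not PHAD's actual scoring function; the field names and the dictionary representation of a packet are assumptions for illustration.

```python
from collections import defaultdict

def train(packets):
    """Record every value seen for each header field in attack-free traffic."""
    seen = defaultdict(set)
    for pkt in packets:
        for name, value in pkt.items():
            seen[name].add(value)
    return seen

def novel_fields(pkt, seen):
    """Return the header fields whose value never appeared in training."""
    return sorted(name for name, value in pkt.items()
                  if value not in seen.get(name, set()))

# Hypothetical training packets and one suspicious packet.
training = [
    {"ip_len": 60, "tcp_flags": "SYN"},
    {"ip_len": 1500, "tcp_flags": "ACK"},
]
model = train(training)
suspect = {"ip_len": 20, "tcp_flags": "SYN|FIN|URG"}
print(novel_fields(suspect, model))
```

A packet with many novel fields would be reported as anomalous, which matches the observation that attacks often show unusual packet sizes or flag combinations.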

ALAD (application layer anomaly detector):

Instead of modeling single packets, as in PHAD, we

model incoming TCP connections to the well known server

ports (0-1023). Although this misses a few attacks that

exploit IP, UDP or higher numbered ports (such as X servers),

it does (or should) catch most attacks against servers, which

usually use TCP. Attackers keep trying to

establish connections to servers with a huge number of

requests, which generates a flash crowd in the

network and causes resource starvation.

Time-To-Live (TTL)

Here each router marks packets with dynamic

probability. Specifically, each router marks a packet with a

probability proportional to the distance it has to travel. As

such, a packet that has to traverse long distances is marked

with higher probability, compared with a packet with shorter

distances to traverse. This modification ensures that a

packet is marked with much higher probability compared to

existing mechanisms, which greatly reduces the effectiveness of

spoofed marks. It can reduce the number of false positives

by 90%. The scheme has two properties:

1) Every legitimate packet is marked at least once

by an intermediate router before it reaches the destination

(victim).

2) There is an upper bound on the probability that a

spoofed (illegitimate) packet reaches the destination without

being marked. This upper bound is a function of the distance

between the sender (attacker) and the destination (victim).

Attackers may set the TTL to a high value, but routers detect the

spoofed packets and reduce the TTL based on the distance to the

destination.
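
Property 2 above says the chance that a packet arrives unmarked depends on the distance travelled. A minimal sketch of that relationship follows, assuming each router decides independently whether to mark a packet. The per-hop probabilities are hypothetical inputs and are not the marking function used by the scheme.

```python
import math

def prob_unmarked(mark_probs):
    """Probability that a packet passes every router without being marked,
    assuming independent marking decisions with the given per-hop probabilities."""
    return math.prod(1.0 - p for p in mark_probs)

# Hypothetical constant per-hop marking probability of 0.2.
for hops in (1, 5, 10, 20):
    print(hops, prob_unmarked([0.2] * hops))
```

Under this assumption the probability of arriving unmarked shrinks with every additional hop, which is why the bound can be written as a function of the distance between sender and destination.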

Disadvantages:

1. The existing attack detection mechanisms rely only on

the request rate of a particular

user and on flash crowd events in the network.

2. Other existing defense methods are based on

schemes

that are not effective for DDoS attack

detection.

They may annoy users and introduce additional

service delays.

3. Though anomaly detection can detect novel attacks,

it has the disadvantage that it is not capable of

discerning intent. It can only signal that some event is

unusual, but not necessarily hostile, thus generating

false alarms.

Proposed System:

The goal of the proposed system is to add new

attack detection on top of the existing system. We

propose an attack detection

mechanism based on document popularity, using an

Access Matrix that defines the temporal patterns.

A pattern indicates the website links that are visited in some sequence

along a path. We use a sequence anomaly detector based on a

hidden semi-Markov model to detect App-DDoS attacks.

Advantages:

1. The basic idea behind the proposed system is to isolate

and protect legitimate traffic from huge volumes of

DDoS traffic when an attack occurs.

2. Our first step is to distinguish packets that contain

genuine source IP addresses from those that contain

spoofed addresses. This is done by redirecting a client

to a new IP address and port number (to receive web

service) through a standard HTTP redirect message.

3. The proposed system uses an advanced detection

technique in addition to the existing techniques to detect

App-DDoS attacks.

4. The proposed system uses an Access Matrix to maintain

the access sequence of every user.

Modules

The following are the modules obtained by the detailed design of

the proposed system.

1) MAC Generator

2) MAC Verifier

3) IP Handler

4) Query Handler

5) Access Matrix

6) Hidden semi Markov Model

Module 1:

MAC Generator

This module distinguishes packets that

contain genuine source IP addresses from those that contain

spoofed addresses. Once the very first TCP SYN packet of a

client gets through, the proposed system immediately

redirects the client to a pseudo-IP address (still belonging to

the website) and port number pair, through a standard HTTP

URL redirect message. Certain bits from this IP address and

the port number pair will serve as the Message

Authentication Code (MAC) for the client’s IP address. A MAC is

a symmetric authentication scheme that allows a party A,

which shares a secret key k with another party B, to authenticate a

message M sent to B with a signature MAC(M, k). The signature has the

property that, with overwhelming probability, no one can

forge it without knowing the secret key k.

Module 2

MAC Verifier

This module guards against attackers using either

genuine or spoofed addresses. Since a legitimate

client uses its real IP address to communicate with the

server, it will receive the HTTP redirect message (hence the

MAC). So, all its future packets will have the correct MAC

inside their destination IP addresses and thus be protected.

The DDoS traffic with spoofed IP addresses, on the other

hand, will be filtered because the attackers will not receive

the MAC sent to them. So, this technique effectively

separates legitimate traffic from DDoS traffic with spoofed IP

addresses.
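
Modules 1 and 2 together can be sketched as follows. This is a toy illustration: it assumes HMAC-SHA256 as the MAC, uses the last octet of the pseudo-IP address plus the port number to carry the MAC bits, and places the pseudo-addresses in a documentation address range. The key, the address prefix, and the function names are all hypothetical and not specified by the project.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-change-me"   # hypothetical server-side secret k
PREFIX = "203.0.113"                 # hypothetical address block owned by the site

def redirect_target(client_ip, key=SECRET_KEY):
    """MAC generator: derive the pseudo-IP and port a client is redirected to."""
    digest = hmac.new(key, client_ip.encode(), hashlib.sha256).digest()
    host_octet = digest[0]                                   # 8 MAC bits in the address
    port = 1024 + int.from_bytes(digest[1:3], "big") % (65536 - 1024)
    return f"{PREFIX}.{host_octet}", port

def verify(client_ip, dest_ip, dest_port, key=SECRET_KEY):
    """MAC verifier: accept a packet only if its destination matches the MAC."""
    expected_ip, expected_port = redirect_target(client_ip, key)
    expected = f"{expected_ip}:{expected_port}".encode()
    received = f"{dest_ip}:{dest_port}".encode()
    return hmac.compare_digest(expected, received)

ip, port = redirect_target("198.51.100.7")
print(ip, port, verify("198.51.100.7", ip, port))
```

A client that never receives the redirect, because it spoofed its source address, has to guess the address and port pair, so most of its packets fail the check. A real deployment would need more MAC bits than this toy uses and would have to avoid reserved host addresses.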

Module 3:

Attacker Prevention (IP Handler Mechanism)

If the server finds that the request rate from an IP address is

higher than the limit, the IP address is moved to the blocked state

and no further responses are provided. Each time a

new request arrives, the server looks up its IP address and checks

whether it is in the blocked state or the normal state.

If it is in the blocked state, service is refused; otherwise

the request is handled and an immediate response is given to

the normal user.
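
The blocked/normal state check described above can be sketched with a sliding-window request counter. The class name, the default limit, and the window length are assumptions for illustration; the synopsis does not state the actual limit. Timestamps are passed in explicitly so the behavior is easy to follow.

```python
from collections import defaultdict, deque

class IPHandler:
    """Tracks request rates per IP and moves offenders to a blocked state."""

    def __init__(self, max_requests=5, window_seconds=1.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self.recent = defaultdict(deque)   # ip -> timestamps of recent requests
        self.blocked = set()               # IPs in the blocked state

    def allow(self, ip, now):
        """Return True if the request should be served, False if refused."""
        if ip in self.blocked:
            return False
        stamps = self.recent[ip]
        stamps.append(now)
        # Drop timestamps that fell out of the sliding window.
        while stamps and now - stamps[0] > self.window:
            stamps.popleft()
        if len(stamps) > self.max_requests:
            self.blocked.add(ip)           # rate above the limit: block the IP
            return False
        return True

handler = IPHandler(max_requests=3, window_seconds=1.0)
burst = [handler.allow("10.0.0.1", t * 0.1) for t in range(5)]
print(burst, handler.allow("10.0.0.2", 0.5))
```

In a running server the `now` argument would typically come from a monotonic clock, and blocked entries would probably expire after some time rather than stay blocked forever.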

Module 4:

Query Handler:

Attackers will try to attack popular websites

by sending queries in the URL path. If the queries are

executed, unexpected results can follow for the

websites. For example, modify and delete queries can cause

serious problems for popular sites. This module checks

the URL path and redirects the request if it contains

unwanted queries.
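
A minimal sketch of such a check, assuming that "unwanted queries" means SQL keywords that modify or delete data appearing in the URL's query string. The keyword list, the redirect page, and the function name are hypothetical.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Hypothetical list of keywords treated as unwanted in a URL query string.
SUSPICIOUS = re.compile(r"\b(update|delete|insert|drop|alter)\b", re.IGNORECASE)

def route_request(url, safe_page="/error.jsp"):
    """Return the URL to serve: the original one if its query string looks
    clean, otherwise the page the request is redirected to."""
    query = urlsplit(url).query
    # parse_qsl decodes percent-escapes and '+' before the keywords are matched.
    for name, value in parse_qsl(query, keep_blank_values=True):
        if SUSPICIOUS.search(name) or SUSPICIOUS.search(value):
            return safe_page
    return url

print(route_request("/shop/item.jsp?id=42"))
print(route_request("/shop/item.jsp?id=1%3B+DELETE+FROM+orders"))
```

Keyword filtering of this kind is easy to evade and can reject harmless requests, so it is best seen as a first screen in front of properly parameterized database queries.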

Module 5: Access Matrix:

This Access Matrix module stores the

online shopping site’s list of access path sequences

in a separate table. The necessary information, such as the

user’s ID, IP address, port number, access time, and most recent

access path sequence, is stored in another

separate table for future reference.
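
The two tables described above can be sketched in memory as follows. This is an illustrative data structure only: the class names, the time-slot length, and the path length kept per client are assumptions, and the project itself stores this information in database tables.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field

@dataclass
class ClientRecord:
    user_id: str
    ip: str
    port: int
    last_access: float = 0.0
    recent_path: list = field(default_factory=list)

class AccessMatrix:
    def __init__(self, slot_seconds=60.0, path_len=5):
        self.slot_seconds = slot_seconds
        self.path_len = path_len
        self.counts = defaultdict(Counter)   # time slot -> document -> hits
        self.clients = {}                    # (ip, port) -> ClientRecord

    def record(self, user_id, ip, port, document, timestamp):
        """Log one request in both the per-slot counts and the client table."""
        slot = int(timestamp // self.slot_seconds)
        self.counts[slot][document] += 1
        rec = self.clients.setdefault((ip, port), ClientRecord(user_id, ip, port))
        rec.last_access = timestamp
        rec.recent_path = (rec.recent_path + [document])[-self.path_len:]
        return rec

    def popularity(self, slot):
        """Fraction of the slot's requests that went to each document."""
        hits = self.counts.get(slot, Counter())
        total = sum(hits.values())
        return {doc: n / total for doc, n in hits.items()} if total else {}

am = AccessMatrix(slot_seconds=60.0, path_len=2)
am.record("u1", "10.0.0.5", 40000, "/home", 1.0)
am.record("u1", "10.0.0.5", 40000, "/cart", 2.0)
am.record("u1", "10.0.0.5", 40000, "/home", 3.0)
am.record("u2", "10.0.0.6", 40001, "/home", 4.0)
print(am.popularity(0), am.clients[("10.0.0.5", 40000)].recent_path)
```

The per-slot popularity vector is the kind of observation a sequence model could consume, while the client table keeps the recent access path used for per-client checks.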

Module 6:

Hidden semi-Markov model:

This module checks the client’s

access path sequence against the access matrix

table to identify attackers. If the access path sequence

differs, the IP address is recorded in a separate

table as an attacker.
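
A literal reading of this module's description can be sketched without the statistical model: learn which page-to-page steps occur in stored normal paths, then record a client as an attacker when its path contains steps never seen before. This is a simplified stand-in for HSMM-based scoring; the function names and sample paths are hypothetical.

```python
def learn_transitions(normal_paths):
    """Collect every page-to-page step that occurs in the stored normal paths."""
    allowed = set()
    for path in normal_paths:
        allowed.update(zip(path, path[1:]))
    return allowed

def check_client(ip, path, allowed, attackers):
    """Return the steps of `path` not seen in normal traffic; if there are any,
    record the IP in the attacker table."""
    unseen = [step for step in zip(path, path[1:]) if step not in allowed]
    if unseen:
        attackers.add(ip)
    return unseen

allowed = learn_transitions([
    ["/home", "/list", "/item", "/cart"],
    ["/home", "/item"],
])
attackers = set()
ok = check_client("10.0.0.5", ["/home", "/list", "/item"], allowed, attackers)
bad = check_client("10.0.0.9", ["/cart", "/cart", "/cart"], allowed, attackers)
print(ok, bad, attackers)
```

An exact-match rule like this flags any unusual but legitimate browsing, which is one reason a probabilistic score with a threshold is a more forgiving design.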

SYSTEM REQUIREMENTS

The following software tools are required to

implement the system, which is tested using unit testing

applications.

SOFTWARE SPECIFICATION

Operating System : Windows 2000/XP

Front End : JSP

Back End : SQL Server 2000

Web Server : TOMCAT 5.5

HARDWARE SPECIFICATION

Processor : Pentium IV 500MHz.

Monitor : SVGA

RAM : 128 MB SDRAM

Secondary Storage : 40GB HDD

Floppy Drive : 1.44MB