Detection of Application Layer DDoS Attack
using Hidden Semi-Markov Model
(Synopsis)
Abstract:
The recent tide of Distributed Denial of Service attacks
against high-profile web sites demonstrates how damaging
the DDoS attacks are and how defenseless the Internet is
under such attacks. The services of these web sites were
unavailable for hours or even days as a result of the attacks.
In such an attack, adversaries simultaneously send a large
volume of traffic to a victim host or network. The victim is
overwhelmed by so much traffic that it can provide little or
no service to its legitimate clients. Because bursty traffic
and high volume are common to both App-DDoS attacks and
flash crowds, it is not easy for current techniques to
distinguish them merely by the statistical characteristics of
traffic. Therefore, App-DDoS attacks may be stealthier and
more dangerous for the popular Websites than the general
Net-DDoS attacks when they mimic the normal flash crowd.
This project proposes a scheme to capture the spatial-temporal
patterns of a normal flash crowd event and to
implement App-DDoS attack detection. Since the traffic
characteristics of low layers are not enough to distinguish
the App-DDoS attacks from the normal flash crowd event,
the objective of this project is to find an effective method to
identify whether the surge in traffic is caused by App-DDoS
attackers or by normal Web surfers. This project defines the
Access Matrix (AM) to capture the spatial-temporal patterns of
a normal flash crowd and to monitor App-DDoS attacks during
a flash crowd event. A hidden semi-Markov model is used to
describe the dynamics of the AM and to achieve numerical,
automatic detection. Principal component analysis and
independent component analysis are used to handle the
multidimensional data for the hidden semi-Markov model.
Finally, the monitoring architecture is validated with real
flash crowd traffic.
Introduction
Any attack on the Internet today can be highly
devastating. Distributed Denial of Service (DDoS) attacks are
among the most malicious Internet attacks: they overwhelm
a victim system with data so that the victim’s response
is slowed or stopped entirely. There have been many
instances where DDoS attacks have caused damages worth
billions of dollars. Defending against DDoS attacks has hence
become a major priority in the Internet community. The
attacker’s objective is to interrupt or reduce the quality of
service experienced by legitimate users. Many attacks have
innocent counterparts (e.g., someone sends me a very large
e-mail attachment, which blocks my access to other
messages).
Basic Concepts:
Flash crowd: a sudden, large surge in traffic to a
particular Web site.
Denial of Service (DoS): an explicit attempt to prevent
legitimate users of a service from using that service.
Attack Types:
1) Bandwidth consumption
i) Attackers have more bandwidth than the victim, e.g. a
T3 (45 Mbps) attacks a T1 (1.544 Mbps).
ii) Attackers amplify their bandwidth by engaging other
computers to attack a victim with higher bandwidth,
e.g. 100 hosts on 56 Kbps links (5.6 Mbps combined) attack a T1.
2) Resource starvation: consumes system resources such as
CPU, memory, and disk space on the victim machine through
flooding.
Smurf, Fraggle, SYN flood: In a Smurf or Fraggle attack, the
attacker sends a sustained stream of packets to the broadcast
address of the amplifying network, with the source address
forged to read the victim’s IP address. Since the traffic was
sent to the broadcast address, all hosts in the amplifying LAN
answer to the victim’s IP address. In a SYN flood, if a few
SYN packets are sent by the attacker every 10 seconds, the
victim will never clear its connection queue and stops
responding.
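The SYN flood claim above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the backlog size (8 half-open connections) and the timeout (75 seconds) are hypothetical values chosen for illustration, and the function name simulate_syn_backlog is not part of the proposed system.

```python
from collections import deque

def simulate_syn_backlog(backlog_size, timeout_s, syns_per_burst,
                         burst_interval_s, duration_s):
    """Return the fraction of seconds during which the backlog is full.

    Each spoofed SYN occupies one backlog slot until its timeout expires,
    because the forged source never completes the handshake.
    """
    pending = deque()  # expiry times of half-open connections, oldest first
    full_seconds = 0
    for t in range(duration_s):
        # Free the slots whose half-open connection has timed out.
        while pending and pending[0] <= t:
            pending.popleft()
        # The attacker sends a small burst of SYNs at a fixed interval.
        if t % burst_interval_s == 0:
            for _ in range(syns_per_burst):
                if len(pending) < backlog_size:
                    pending.append(t + timeout_s)
        # While the backlog is full, a legitimate SYN would be dropped.
        if len(pending) >= backlog_size:
            full_seconds += 1
    return full_seconds / duration_s

# Assumed values: 8 slots, 75 s timeout, 2 SYNs every 10 s, 10 minutes.
fraction_full = simulate_syn_backlog(8, 75, 2, 10, 600)
print(f"backlog full {fraction_full:.0%} of the time")
```

Under these assumptions, two packets every 10 seconds are enough to keep the queue full for most of the run, because entries expire more slowly than the attacker replaces them.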
Hidden semi Markov Model:
We apply the hidden semi-Markov model (HSMM)
to characterize legitimate request patterns to a Web
server and to detect DDoS (distributed denial of
service) attacks on it. Measurements of real workload
often indicate that a significant amount of variability is
present in the traffic observed over a wide range of
time scales, exhibiting self-similar or long-range
dependent characteristics. The major advantages of using an
HSMM are its efficiency in estimating the model
parameters that account for an observed sequence, and
the ability of the estimated parameters to capture various
statistical properties of the workload, including
self-similarity and long-range and short-range dependence.
Therefore, use of this HSMM is effective in better
understanding the nature of Web workload and in
detecting the anomalous behavior that a DDoS attack
may present.
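As a rough illustration of how an HSMM can score a request sequence, the sketch below computes the likelihood of an observation sequence with a forward pass over an explicit-duration HSMM. It is not the project's implementation. The two states, the three page symbols (0 = home, 1 = product, 2 = cart) and every probability are hypothetical, and the last state visit is assumed to end with the sequence.

```python
import math

def hsmm_likelihood(obs, pi, A, D, B):
    """Likelihood of `obs` under an explicit-duration HSMM (forward pass).

    pi[j]   : probability that the first segment is in state j
    A[i][j] : probability of moving from state i to state j (no self-loops)
    D[j][d] : probability that a visit to state j lasts d observations
    B[j][o] : probability that state j emits observation symbol o
    Simplification: the last segment ends exactly at the end of `obs`.
    """
    n, T = len(pi), len(obs)
    # alpha[t][j]: probability of obs[:t] with a segment of state j ending at t
    alpha = [[0.0] * n for _ in range(T + 1)]
    for t in range(1, T + 1):
        for j in range(n):
            total = 0.0
            for d, p_dur in D[j].items():
                start = t - d
                if start < 0:
                    continue  # segment would begin before the sequence
                if start == 0:
                    enter = pi[j]
                else:
                    enter = sum(alpha[start][i] * A[i][j] for i in range(n))
                emit = math.prod(B[j][o] for o in obs[start:t])
                total += enter * p_dur * emit
            alpha[t][j] = total
    return sum(alpha[T])

def avg_log_likelihood(obs, model):
    """Per-observation log-likelihood, usable as a normality score."""
    likelihood = hsmm_likelihood(obs, *model)
    return math.log(likelihood) / len(obs) if likelihood > 0 else float("-inf")

# Hypothetical model: state 0 = browsing, state 1 = purchasing.
PI = [1.0, 0.0]
A = [[0.0, 1.0], [1.0, 0.0]]
D = [{1: 0.2, 2: 0.8}, {1: 0.9, 2: 0.1}]
B = [[0.5, 0.45, 0.05], [0.05, 0.05, 0.9]]
MODEL = (PI, A, D, B)

normal_score = avg_log_likelihood([0, 1, 2, 0, 1], MODEL)
attack_score = avg_log_likelihood([2, 2, 2, 2, 2], MODEL)
print(normal_score, attack_score)
```

A sequence that resembles the modeled browsing behavior scores higher than a repetitive one, so a detector could compare the score against a threshold learned from normal traffic.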
Existing System:
At present, most systems are vulnerable to DoS attacks.
DoS attacks are of particular interest and concern to the
Internet community because they seek to render target
systems inoperable and/or target networks inaccessible.
"Traditional" DoS attacks, however, typically generate a
large amount of traffic from a given host or subnet, so it is
possible for a site to detect such an attack in progress and
defend itself. Distributed DoS attacks are a much
more nefarious extension of DoS attacks because they are
designed as a coordinated attack from many sources
simultaneously against one or more targets. Some existing
attack detection mechanisms are as follows:
1) Signature detection:
Signature detection (also known as misuse detection)
looks for patterns that signal well-known attacks.
2) Anomaly detection:
Anomaly detection identifies activity that is out of the
ordinary.
PHAD (packet header anomaly detector):
PHAD extends the four attributes normally used in
network anomaly detection systems (source and destination
IP addresses, source and destination port numbers). Transport
header (TCP, UDP) fields are tested as appropriate for each
protocol. In testing, many attacks could
be detected because of unusual values in these fields. In
addition to IP address anomalies, some
attacks generate unusually small packet sizes or unusual
combinations of TCP flags (e.g. urgent data, missing
acknowledgements, reserved flags).
ALAD (application layer anomaly detector):
Instead of modeling single packets, as in PHAD, ALAD
models incoming TCP connections to the well-known server
ports (0-1023). Although this misses a few attacks that
exploit IP, UDP or higher-numbered ports (such as X servers),
it should catch most attacks against servers, which
usually use TCP. Attackers keep trying to
establish connections to servers with a huge number of
requests, which produces flash-crowd-like traffic in the
network and resource starvation.
Time-To-Live (TTL)
Here each router marks packets with a dynamic
probability. Specifically, each router marks a packet with a
probability proportional to the distance it has to travel. As
such, a packet that has to traverse a long distance is marked
with higher probability than a packet with a shorter
distance to traverse. This modification ensures that a
packet is marked with much higher probability than in
existing mechanisms, which greatly reduces the effectiveness of
spoofed marks. It can reduce the number of false positives
by 90%. The scheme has two properties:
1) Every legitimate packet is marked at least once
by an intermediate router before it reaches the destination
(victim).
2) There is an upper bound on the probability that a
spoofed (illegitimate) packet reaches the destination without
being marked. This upper bound is a function of the distance
between the sender (attacker) and the destination (victim).
Attackers may set the TTL to a high value, but routers will
detect the spoofed values and reduce the TTL based on the
distance to the destination.
Disadvantages:
1. The existing attack detection mechanisms use only
the request rate of a particular
user and the flash crowd event in the network.
2. Other existing defense methods are based on schemes that
are not effective for DDoS attack detection.
They may annoy users and introduce additional
service delays.
3. Though anomaly detection can detect novel attacks,
it has the disadvantage that it is not capable of
discerning intent. It can only signal that some event is
unusual, but not necessarily hostile, thus generating
false alarms.
Proposed System:
The goal of the proposed system is to add new
attack detection to the existing system. We
propose an attack detection
mechanism based on document popularity, using an
Access Matrix that defines the temporal patterns.
A pattern indicates the sequence of website links that a
user follows. We use a sequence anomaly detector based on a
hidden semi-Markov model to detect App-DDoS attacks.
Advantages:
1. The basic idea behind the proposed system is to isolate
and protect legitimate traffic from huge volumes of
DDoS traffic when an attack occurs.
2. Our first step is to distinguish packets that contain
genuine source IP addresses from those that contain
spoofed addresses. This is done by redirecting a client
to a new IP address and port number (to receive web
service) through a standard HTTP redirect message.
3. The proposed system uses advanced detection
techniques in addition to the existing techniques to detect
App-DDoS attacks.
4. The proposed system uses the Access Matrix to maintain
the access sequence of every user.
Modules
The following are the modules obtained by the detailed design of
the proposed system.
1) MAC Generator
2) MAC verifier
3) IP handler
4) Query Handler
5) Access Matrix
6) Hidden semi Markov Model
Module 1:
MAC Generator
This module distinguishes packets that
contain genuine source IP addresses from those that contain
spoofed addresses. Once the very first TCP SYN packet of a
client gets through, the proposed system immediately
redirects the client to a pseudo-IP address (still belonging to
the website) and port number pair, through a standard HTTP
URL redirect message. Certain bits from this IP address and
port number pair serve as the Message
Authentication Code (MAC) for the client’s IP address. A MAC is
a symmetric authentication scheme that allows a party A,
which shares a secret key k with another party B, to
authenticate a message M sent to B with a signature
MAC(M, k). The signature has the
property that, with overwhelming probability, no one can
forge it without knowing the secret key k.
Module 2
MAC Verifier
This module separates clients that use genuine
addresses from attackers that use spoofed addresses. Since a
legitimate client uses its real IP address to communicate with the
server, it will receive the HTTP redirect message (and hence the
MAC). All its future packets will therefore carry the correct MAC
in their destination IP address and port, and thus be protected.
DDoS traffic with spoofed IP addresses, on the other
hand, will be filtered because the attackers will not receive
the MAC sent to the spoofed addresses. This technique thus
effectively separates legitimate traffic from DDoS traffic with
spoofed IP addresses.
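Modules 1 and 2 together can be sketched as follows. This is a minimal illustration, not the project's code: the secret key, the 203.0.113.x documentation prefix standing in for the website's address range, the 24-bit MAC length, and the split of the MAC into one address octet and a 16-bit port are all assumptions. A real deployment would also have to avoid reserved host octets and port numbers.

```python
import hashlib
import hmac
import ipaddress

SECRET_KEY = b"demo-secret-key"   # hypothetical key known only to the website
SITE_PREFIX = "203.0.113."        # hypothetical address block of the website

def compute_mac(client_ip, key=SECRET_KEY):
    """24-bit MAC of the client's IP address (truncated HMAC-SHA256)."""
    packed = ipaddress.ip_address(client_ip).packed
    digest = hmac.new(key, packed, hashlib.sha256).digest()
    return int.from_bytes(digest[:3], "big")

def redirect_target(client_ip):
    """Module 1: pseudo-IP and port that carry the MAC bits."""
    mac = compute_mac(client_ip)
    host_octet = mac >> 16   # top 8 bits become the last address octet
    port = mac & 0xFFFF      # low 16 bits become the port number
    return f"{SITE_PREFIX}{host_octet}", port

def verify(client_ip, dest_ip, dest_port):
    """Module 2: accept a packet only if its destination carries the MAC."""
    expected_ip, expected_port = redirect_target(client_ip)
    return hmac.compare_digest(f"{dest_ip}:{dest_port}",
                               f"{expected_ip}:{expected_port}")

# A client with a genuine address receives the redirect and passes the check.
pseudo_ip, pseudo_port = redirect_target("198.51.100.7")
print(verify("198.51.100.7", pseudo_ip, pseudo_port))
```

A sender that spoofs its source address never sees the redirect, so it would have to guess the 24 MAC bits for the address it forges.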
Module 3:
Attacker Prevention (IP Handler Mechanism)
If the server finds that the request rate from an IP address is
higher than the limit, the IP address is moved to the blocked
state and no further responses are provided. Each time a
new request arrives, the server obtains its IP address and checks
whether it is in the blocked state or the normal state.
If it is in the blocked state, the service is not provided;
otherwise the request is handled and an immediate response is
given to the normal user.
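The blocking logic described above might look like the following sketch. The class name IPHandler, the sliding-window rate measure and the example limit (3 requests per second) are assumptions made for illustration; the synopsis does not specify how the request rate is measured.

```python
from collections import defaultdict, deque

class IPHandler:
    """Tracks per-IP request times and blocks IPs that exceed a rate limit."""

    def __init__(self, max_requests, window_s):
        self.max_requests = max_requests   # allowed requests per window
        self.window_s = window_s           # window length in seconds
        self.recent = defaultdict(deque)   # ip -> recent request times
        self.blocked = set()               # IPs in the blocked state

    def handle(self, ip, now):
        """Return True if the request is served, False if it is refused."""
        if ip in self.blocked:
            return False
        times = self.recent[ip]
        times.append(now)
        # Forget requests that have fallen out of the sliding window.
        while times and times[0] <= now - self.window_s:
            times.popleft()
        if len(times) > self.max_requests:
            self.blocked.add(ip)           # move the IP to the blocked state
            return False
        return True

handler = IPHandler(max_requests=3, window_s=1.0)
fast = [handler.handle("10.0.0.1", t * 0.1) for t in range(5)]
slow = [handler.handle("10.0.0.2", float(t)) for t in range(4)]
print(fast, slow)
```

In this sketch a blocked IP stays blocked; an unblocking policy, such as a timed expiry, would be a separate design decision.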
Module 4:
Query Handler:
Attackers try to attack popular websites
by sending queries in the URL path. If the queries are
executed, the website may produce unexpected results.
For example, modify and delete queries can cause
serious problems for popular sites. This module checks
the URL path and redirects the request if it contains
unwanted queries.
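A simple form of this check can be sketched as below. The keyword list, the function names and the /error.jsp redirect page are hypothetical. Keyword matching is only a coarse filter that can also flag harmless URLs, so it complements rather than replaces parameterized database queries on the server.

```python
import re
from urllib.parse import parse_qsl, unquote_plus, urlsplit

# Hypothetical list of query keywords that should never appear in a URL.
UNWANTED = re.compile(r"\b(delete|update|insert|drop)\b", re.IGNORECASE)

def is_suspicious(url):
    """Return True if the decoded path or query contains an unwanted keyword."""
    parts = urlsplit(url)
    candidates = [unquote_plus(parts.path)]
    # parse_qsl decodes percent-escapes and '+' in parameter names and values.
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        candidates.extend([name, value])
    return any(UNWANTED.search(text) for text in candidates)

def route(url, safe_page="/error.jsp"):
    """Redirect suspicious requests to a safe page; pass others through."""
    return safe_page if is_suspicious(url) else url

print(route("/shop/item.jsp?id=42"))
print(route("/shop/item.jsp?id=42;DELETE%20FROM%20orders"))
```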
Module 5: Access Matrix:
In this module, the online shopping site’s list of
access path sequences is stored
in a separate table. The necessary information, such as the
user’s ID, IP address, port number, access time and recent
access path sequence, is stored in another
separate table for future reference.
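One possible in-memory form of the Access Matrix is sketched below: rows are documents, columns are time units, and each entry counts the requests for that document in that time unit. This layout, the popularity normalization and the function names are assumptions for illustration; the synopsis itself only describes database tables.

```python
def build_access_matrix(requests, documents, unit_s, n_units):
    """Count requests per document (row) and time unit (column).

    requests: iterable of (timestamp_in_seconds, document) pairs.
    """
    index = {doc: i for i, doc in enumerate(documents)}
    matrix = [[0] * n_units for _ in documents]
    for timestamp, doc in requests:
        col = int(timestamp // unit_s)
        if doc in index and 0 <= col < n_units:
            matrix[index[doc]][col] += 1
    return matrix

def popularity(matrix):
    """Share of each time unit's requests that went to each document."""
    n_units = len(matrix[0])
    totals = [sum(row[j] for row in matrix) for j in range(n_units)]
    return [[row[j] / totals[j] if totals[j] else 0.0
             for j in range(n_units)] for row in matrix]

# Hypothetical request log: (seconds since start, requested page).
log = [(0, "/home"), (1, "/home"), (2, "/cart"), (12, "/cart")]
am = build_access_matrix(log, ["/home", "/cart"], unit_s=10, n_units=2)
pop = popularity(am)
print(am, pop)
```

Each column of the popularity matrix then serves as one multidimensional observation of how requests are spread over documents in that time unit.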
Module 6:
Hidden semi-Markov model:
In this module, the client’s
access path sequence is checked against the access matrix
table to identify attackers. If the access path sequence
differs, the IP address is recorded in a separate
table as an attacker.
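The comparison step can be illustrated with a deliberately simplified stand-in for the HSMM: the sketch learns the page-to-page transitions seen in normal sessions and flags an IP address when too many of its transitions were never seen. The function names, the example pages and the 0.5 threshold are hypothetical, and a set of known transitions is a much weaker model than a trained HSMM.

```python
def learn_transitions(normal_sessions):
    """Collect every page-to-page step observed in normal access paths."""
    seen = set()
    for path in normal_sessions:
        seen.update(zip(path, path[1:]))
    return seen

def deviation(path, seen):
    """Fraction of steps in `path` that never occur in normal sessions."""
    steps = list(zip(path, path[1:]))
    if not steps:
        return 0.0
    return sum(step not in seen for step in steps) / len(steps)

def flag_attackers(sessions_by_ip, seen, threshold=0.5):
    """Return the IPs whose access paths deviate beyond the threshold."""
    return {ip for ip, path in sessions_by_ip.items()
            if deviation(path, seen) > threshold}

normal = [["/home", "/product", "/cart", "/checkout"],
          ["/home", "/product", "/home"]]
known = learn_transitions(normal)
sessions = {
    "10.0.0.1": ["/home", "/product", "/cart"],
    "10.0.0.9": ["/checkout", "/checkout", "/checkout", "/home"],
}
attackers = flag_attackers(sessions, known)
print(attackers)
```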
SYSTEM REQUIREMENTS
The following software tools are required to
implement the system, which is tested using unit testing
applications.
SOFTWARE SPECIFICATION
Operating System : Windows 2000/XP
Front End : JSP
Back End : SQL Server 2000
Web Server : TOMCAT 5.5
HARDWARE SPECIFICATION
Processor : Pentium IV 500MHz.
Monitor : SVGA
RAM : 128 MB SDRAM
Secondary Storage : 40GB HDD
Floppy Drive : 1.44MB